[Original] Java GC
Young generation and old generation. 1 eden and 2 survivor spaces. Minor GC: mark and copy, from eden and one survivor t
[Original] HBase Filters, Counters & Coprocessors
- Scan, setCaching(rows), setBatch(cells)
- Filter -> FilterBase. setFilter(filter) method on Get and Scan
- CompareFilte
[Original] HBase Region Split
- Split Policy (ConstantSizeRegionSplitPolicy, IncreasingToUpperBoundRegionSplitPolicy (default), KeyPrefixRegionSplitPo
[Original] Submit Spark code to YARN
- configure Eclipse, add scala-ide plugin and m2e-scala plugin (http://alchim31.free.fr/m2e-scala/update-site/)
- config
[Original] JVM Troubleshooting
- jps, top and jstack. jps finds Java process info such as class name, parameters of main, JVM arguments and pid: jps -m -l
- top to
[Original] Spark - Running on Cluster
- package Spark app (Maven)
<plugin> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-shade-plugin<
[Original] Spark Troubleshooting and Performance Tuning
- Spark master server - more memory
export SPARK_DAEMON_MEMORY=5g
spark.ui.retainedJobs 500 # defaults are all 1000
spark.ui.retain
[Original] Spark - Tuning and Debugging Spark
- submit application (a SparkConf object cannot be changed after SparkContext creation)
method 1
bin/spark-submit \
  --class c
[Original] HBase Concept
- Data Model: sparse, distributed, persistent multidimensional sorted map, (row:string, column:string, time:int64) -> stri
[Original] Bloom filter
- space-efficient lookup for a fixed number of static elements.
- answers are "may have" or "definitely does not have"
n: number of elements
k: nu
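The "may have / definitely does not have" behaviour can be illustrated with a minimal sketch. The excerpt is cut off, so the details here are assumptions rather than the post's own method: `m` is taken to be the number of bits and `k` the number of hash probes per element (the usual convention), and the class name `BloomFilterSketch` and its double-hashing scheme built on `String.hashCode()` are purely illustrative.

```java
import java.util.BitSet;

/** Illustrative Bloom filter: m bits, k probes per element (names are assumptions). */
final class BloomFilterSketch {
    private final BitSet bits;
    private final int m; // number of bits in the filter
    private final int k; // number of hash probes per element

    BloomFilterSketch(int m, int k) {
        if (m <= 0 || k <= 0) {
            throw new IllegalArgumentException("m and k must be positive");
        }
        this.m = m;
        this.k = k;
        this.bits = new BitSet(m);
    }

    // Derive the i-th probe position from two base hashes (double hashing).
    private int index(String s, int i) {
        int h1 = s.hashCode();
        int h2 = ((h1 * 0x9E3779B9) >>> 16) | 1; // odd second hash, illustrative only
        return Math.floorMod(h1 + i * h2, m);
    }

    /** Sets the k bits selected for this element. */
    void add(String s) {
        for (int i = 0; i < k; i++) {
            bits.set(index(s, i));
        }
    }

    /** false means "definitely not present"; true means "may be present". */
    boolean mightContain(String s) {
        for (int i = 0; i < k; i++) {
            if (!bits.get(index(s, i))) {
                return false;
            }
        }
        return true;
    }
}
```

An element that was added always reports `true`, because its bits are never cleared. A `true` for an element that was never added is a false positive, which becomes more likely as more elements share the same `m` bits.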