Under normal circumstances, Hive discovers the tables that need compaction through the findPotentialCompactions method in CompactionTxnHandler. As shown below, it scans COMPLETED_TXN_COMPONENTS on one hand, and TXNS joined with TXN_COMPONENTS on the other, collecting the tables touched by committed transactions and by aborted transactions respectively.
/**
 * This will look through the completed_txn_components table and look for partitions or tables
 * that may be ready for compaction. Also, look through txns and txn_components tables for
 * aborted transactions that we should add to the list.
 * @param maxAborted Maximum number of aborted queries to allow before marking this as a
 *                   potential compaction.
 * @return list of CompactionInfo structs. These will not have id, type,
 * or runAs set since these are only potential compactions not actual ones.
 */
public Set<CompactionInfo> findPotentialCompactions(int maxAborted) throws MetaException {
  Connection dbConn = null;
  Set<CompactionInfo> response = new HashSet<CompactionInfo>();
  Statement stmt = null;
  try {
    try {
      dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED);
      stmt = dbConn.createStatement();
      // Check for completed transactions
      String s = "select distinct ctc_database, ctc_table, " +
          "ctc_partition from COMPLETED_TXN_COMPONENTS";
      LOG.debug("Going to execute query <" + s + ">");
      ResultSet rs = stmt.executeQuery(s);
      while (rs.next()) {
        CompactionInfo info = new CompactionInfo();
        info.dbname = rs.getString(1);
        info.tableName = rs.getString(2);
        info.partName = rs.getString(3);
        response.add(info);
      }
      // Check for aborted txns
      s = "select tc_database, tc_table, tc_partition " +
          "from TXNS, TXN_COMPONENTS " +
          "where txn_id = tc_txnid and txn_state = '" + TXN_ABORTED + "' " +
          "group by tc_database, tc_table, tc_partition " +
          "having count(*) > " + maxAborted;
      LOG.debug("Going to execute query <" + s + ">");
      rs = stmt.executeQuery(s);
      while (rs.next()) {
        CompactionInfo info = new CompactionInfo();
        info.dbname = rs.getString(1);
        info.tableName = rs.getString(2);
        info.partName = rs.getString(3);
        info.tooManyAborts = true;
        response.add(info);
      }
      LOG.debug("Going to rollback");
      dbConn.rollback();
    } catch (SQLException e) {
      LOG.error("Unable to connect to transaction database " + e.getMessage());
      checkRetryable(dbConn, e, "findPotentialCompactions(maxAborted:" + maxAborted + ")");
    } finally {
      closeDbConn(dbConn);
      closeStmt(stmt);
    }
    return response;
  } catch (RetryException e) {
    return findPotentialCompactions(maxAborted);
  }
}
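The second query above is the interesting one: a (database, table, partition) key only becomes a compaction candidate once its aborted-transaction count exceeds maxAborted. The grouping logic can be sketched in plain Java against in-memory data; AbortedTxnSketch and its string keys are illustrative stand-ins, not anything from the Hive codebase.

```java
import java.util.*;

// Minimal sketch of the aborted-txn query above, using an in-memory list
// instead of the TXNS/TXN_COMPONENTS tables. Each entry is one
// "db.table/partition" key for an aborted-transaction component; a key
// qualifies only when seen more than maxAborted times, mirroring the SQL
// "having count(*) > maxAborted".
public class AbortedTxnSketch {
  public static Set<String> candidates(List<String> abortedComponents, int maxAborted) {
    Map<String, Integer> counts = new HashMap<>();
    for (String key : abortedComponents) {
      counts.merge(key, 1, Integer::sum);   // count aborted txns per key
    }
    Set<String> result = new HashSet<>();
    for (Map.Entry<String, Integer> e : counts.entrySet()) {
      if (e.getValue() > maxAborted) result.add(e.getKey());   // strict ">"
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> rows = Arrays.asList(
        "db1.t1/p=1", "db1.t1/p=1", "db1.t1/p=1",
        "db1.t2/p=1");
    // With maxAborted = 2, only db1.t1/p=1 crosses the threshold.
    System.out.println(candidates(rows, 2)); // [db1.t1/p=1]
  }
}
```

Note that the comparison is strict: a partition with exactly maxAborted aborted transactions is not yet a candidate.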
There is also a method here that builds the list of valid transaction ids; it records the minimum transaction id (txn-id) among the transactions still in the open state:
/**
 * Transform a {@link org.apache.hadoop.hive.metastore.api.GetOpenTxnsInfoResponse} to a
 * {@link org.apache.hadoop.hive.common.ValidTxnList}. This assumes that the caller intends to
 * compact the files, and thus treats only open transactions as invalid.
 * @param txns txn list from the metastore
 * @return a valid txn list.
 */
public static ValidTxnList createValidCompactTxnList(GetOpenTxnsInfoResponse txns) {
  long highWater = txns.getTxn_high_water_mark();
  long minOpenTxn = Long.MAX_VALUE;
  long[] exceptions = new long[txns.getOpen_txnsSize()];
  int i = 0;
  for (TxnInfo txn : txns.getOpen_txns()) {
    if (txn.getState() == TxnState.OPEN) minOpenTxn = Math.min(minOpenTxn, txn.getId());
    exceptions[i++] = txn.getId();
  }
  return new ValidCompactorTxnList(exceptions, minOpenTxn, highWater);
}
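The key detail in the loop above is that both open and aborted transactions land in the exceptions array, but only OPEN ones lower the min-open watermark. A minimal sketch of that computation, with transactions modeled as parallel (id, isOpen) arrays instead of TxnInfo objects (MinOpenTxnSketch is a hypothetical name, not Hive code):

```java
// Sketch of the min-open-txn computation in createValidCompactTxnList,
// with the metastore's TxnInfo list replaced by parallel arrays.
public class MinOpenTxnSketch {
  public static long minOpenTxn(long[] ids, boolean[] open) {
    long min = Long.MAX_VALUE;
    for (int i = 0; i < ids.length; i++) {
      // Aborted txns are exceptions too, but only OPEN ones move the
      // min-open watermark.
      if (open[i]) min = Math.min(min, ids[i]);
    }
    return min;   // Long.MAX_VALUE means "no open transactions"
  }

  public static void main(String[] args) {
    // Txns 7 and 9 are open, txn 3 is aborted: the watermark is 7.
    System.out.println(minOpenTxn(new long[]{7, 3, 9},
                                  new boolean[]{true, false, true})); // 7
  }
}
```

This watermark is what later decides whether a delta directory's transaction range is safe to compact, as shown further below.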
Then, during initialization and before submitting a compaction task, the Initiator checks whether a given table or partition meets the compaction conditions and decides the compaction type (major or minor). The key method in this path, shown below, determines the compaction type; it is driven by two configurable parameters:
This parameter sets the trigger condition for a minor compaction:

HIVE_COMPACTOR_DELTA_NUM_THRESHOLD("hive.compactor.delta.num.threshold", 10,
    "Number of delta directories in a table or partition that will trigger a minor\n" +
    "compaction."),

This parameter sets the trigger condition for a major compaction:

HIVE_COMPACTOR_DELTA_PCT_THRESHOLD("hive.compactor.delta.pct.threshold", 0.1f,
    "Percentage (fractional) size of the delta files relative to the base that will trigger\n" +
    "a major compaction. (1.0 = 100%, so the default 0.1 = 10%.)"),
private CompactionType determineCompactionType(CompactionInfo ci, ValidTxnList txns,
                                               StorageDescriptor sd)
    throws IOException, InterruptedException {
  boolean noBase = false;
  Path location = new Path(sd.getLocation());
  FileSystem fs = location.getFileSystem(conf);
  AcidUtils.Directory dir = AcidUtils.getAcidState(location, conf, txns);
  Path base = dir.getBaseDirectory();
  long baseSize = 0;
  FileStatus stat = null;
  if (base != null) {
    stat = fs.getFileStatus(base);
    if (!stat.isDir()) {
      LOG.error("Was assuming base " + base.toString() + " is directory, but it's a file!");
      return null;
    }
    baseSize = sumDirSize(fs, base);
  }
  List<FileStatus> originals = dir.getOriginalFiles();
  for (FileStatus origStat : originals) {
    baseSize += origStat.getLen();
  }
  long deltaSize = 0;
  List<AcidUtils.ParsedDelta> deltas = dir.getCurrentDirectories();
  for (AcidUtils.ParsedDelta delta : deltas) {
    stat = fs.getFileStatus(delta.getPath());
    if (!stat.isDir()) {
      LOG.error("Was assuming delta " + delta.getPath().toString() + " is a directory, " +
          "but it's a file!");
      return null;
    }
    deltaSize += sumDirSize(fs, delta.getPath());
  }
  if (baseSize == 0 && deltaSize > 0) {
    noBase = true;
  } else {
    float deltaPctThreshold = HiveConf.getFloatVar(conf,
        HiveConf.ConfVars.HIVE_COMPACTOR_DELTA_PCT_THRESHOLD);
    boolean bigEnough = (float) deltaSize / (float) baseSize > deltaPctThreshold;
    if (LOG.isDebugEnabled()) {
      StringBuffer msg = new StringBuffer("delta size: ");
      msg.append(deltaSize);
      msg.append(" base size: ");
      msg.append(baseSize);
      msg.append(" threshold: ");
      msg.append(deltaPctThreshold);
      msg.append(" will major compact: ");
      msg.append(bigEnough);
      LOG.debug(msg);
    }
    if (bigEnough) return CompactionType.MAJOR;
  }
  int deltaNumThreshold = HiveConf.getIntVar(conf,
      HiveConf.ConfVars.HIVE_COMPACTOR_DELTA_NUM_THRESHOLD);
  boolean enough = deltas.size() > deltaNumThreshold;
  if (enough) {
    LOG.debug("Found " + deltas.size() + " delta files, threshold is " + deltaNumThreshold +
        (enough ? "" : "not") + " and no base, requesting " + (noBase ? "major" : "minor") +
        " compaction");
    // If there's no base file, do a major compaction
    return noBase ? CompactionType.MAJOR : CompactionType.MINOR;
  }
  return null;
}
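To make the decision logic concrete, here is a worked sketch of the same threshold checks with hard-coded default thresholds (0.1 for delta.pct, 10 for delta.num) instead of HiveConf lookups, and plain sizes/counts instead of filesystem scans; CompactionTypeSketch and its decide method are illustrative, not part of Hive.

```java
// Worked sketch of the threshold logic in determineCompactionType,
// assuming the default values of hive.compactor.delta.pct.threshold (0.1)
// and hive.compactor.delta.num.threshold (10).
public class CompactionTypeSketch {
  enum Type { MAJOR, MINOR }

  public static Type decide(long baseSize, long deltaSize, int deltaCount) {
    float pctThreshold = 0.1f;   // delta-to-base size ratio for major
    int numThreshold = 10;       // delta directory count for minor
    boolean noBase = (baseSize == 0 && deltaSize > 0);
    if (!noBase && baseSize > 0
        && (float) deltaSize / (float) baseSize > pctThreshold) {
      return Type.MAJOR;                        // deltas exceed 10% of base
    }
    if (deltaCount > numThreshold) {
      return noBase ? Type.MAJOR : Type.MINOR;  // too many delta directories
    }
    return null;                                // nothing to compact yet
  }

  public static void main(String[] args) {
    System.out.println(decide(1000, 200, 3)); // MAJOR: 200/1000 = 20% > 10%
    System.out.println(decide(1000, 50, 12)); // MINOR: 12 deltas > 10
    System.out.println(decide(0, 50, 12));    // MAJOR: deltas but no base
  }
}
```

The last case mirrors the `noBase` branch above: with many deltas and no base file at all, a major compaction is requested directly.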
One subtlety here is that the directories being scanned must fall within the valid transaction range. Recall the valid txn list obtained above: the call

AcidUtils.Directory dir = AcidUtils.getAcidState(location, conf, txns);

checks whether the transactions that produced each directory have finished normally. Ultimately this ends up calling ValidCompactorTxnList's isTxnRangeValid method, shown below:
@Override
public RangeResponse isTxnRangeValid(long minTxnId, long maxTxnId) {
  if (highWatermark < minTxnId) {
    return RangeResponse.NONE;
  } else if (minOpenTxn < 0) {
    return highWatermark >= maxTxnId ? RangeResponse.ALL : RangeResponse.NONE;
  } else {
    return minOpenTxn > maxTxnId ? RangeResponse.ALL : RangeResponse.NONE;
  }
}
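The same decision can be sketched as a free-standing function to show exactly how one stale open transaction blocks a delta. TxnRangeSketch is an illustrative name; the parameters highWatermark and minOpenTxn are passed in explicitly rather than read from the ValidCompactorTxnList's fields, with minOpenTxn < 0 modeling "no open transactions".

```java
// Sketch of ValidCompactorTxnList.isTxnRangeValid as a standalone function,
// showing how a single lingering open transaction invalidates a delta range.
public class TxnRangeSketch {
  enum RangeResponse { NONE, ALL }

  public static RangeResponse isTxnRangeValid(long highWatermark, long minOpenTxn,
                                              long minTxnId, long maxTxnId) {
    if (highWatermark < minTxnId) {
      return RangeResponse.NONE;   // range lies entirely above the high-water mark
    } else if (minOpenTxn < 0) {
      // No open txns: the range is valid iff it is fully below the HWM.
      return highWatermark >= maxTxnId ? RangeResponse.ALL : RangeResponse.NONE;
    } else {
      // Any open txn at or below maxTxnId invalidates the whole range.
      return minOpenTxn > maxTxnId ? RangeResponse.ALL : RangeResponse.NONE;
    }
  }

  public static void main(String[] args) {
    // No open txns: a delta covering txns 5..10 under HWM 20 is compactable.
    System.out.println(isTxnRangeValid(20, -1, 5, 10)); // ALL
    // A stale open txn 7 makes the very same delta non-compactable.
    System.out.println(isTxnRangeValid(20, 7, 5, 10));  // NONE
  }
}
```

The second call is precisely the failure mode described next: as long as txn 7 stays in the open state, every delta whose range reaches it or beyond is skipped.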
Therefore, when the TXNS table contains transactions that ended abnormally (for example, the Thrift service crashed) but whose state was left as 'o' (open), compaction requests for the affected tables can no longer be submitted. Normally, a lingering 'o' entry is flipped to 'a' (aborted) once a later transaction commit detects it has timed out; from then on the hive.compactor.abortedtxn.threshold parameter takes over and triggers a compaction once the number of aborted transactions reaches the threshold.

Strangely, in our environment some of these abnormal 'o' entries are never flipped to 'a', which blocks the compaction pipeline, lets small files pile up in the table, and hurts read and write performance. We have not yet found the root cause; the only workaround so far is to manually delete the stale, expired 'o'-state transactions from the TXNS table so that compaction can proceed.