Background
For certain reasons I recently had to migrate some Hive data whose location was on OSS (Alibaba Cloud Object Storage Service) over to COSN (Tencent Cloud Object Storage). The sync has been running incrementally, and before the cutover the data on both sides needs to be compared. I planned two comparison methods. The first is to compare the disk space used under the corresponding OSS and COSN paths, i.e. run `hadoop fs -du -s -h <path>` and check whether the sizes under each table's location are consistent. The second is to run a count directly against each Hive table; a table's location can be obtained from the metadata exposed by Hive Server2 or Spark Thrift Server.
Generating the comparison script
Using the partition, location, database, and table information obtained from Hive Server2 or Spark Thrift Server, I generated shell statements that compute both the row counts and the disk usage. The row-count statements all end up in one file, count.sh:
count=`spark-sql --master yarn --executor-memory 8G --executor-cores 4 --num-executors 2 --driver-memory 4G -e "select count(1) from bi_ods.table1 "`
echo bi_ods.table1:$count >>/Users/scx/work/git/company/utils/count.log
count=`spark-sql --master yarn --executor-memory 8G --executor-cores 4 --num-executors 2 --driver-memory 4G -e "select count(1) from bi_ods.table2 "`
echo bi_ods.table2:$count >>/Users/scx/work/git/company/utils/count.log
count=`spark-sql --master yarn --executor-memory 8G --executor-cores 4 --num-executors 2 --driver-memory 4G -e "select count(1) from bi_ods.table3 where dt >= 20191020 and dt <= 20191027"`
echo bi_ods.table3:$count >>/Users/scx/work/git/company/utils/count.log
...
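The generation step itself can be sketched as follows. This is only an illustrative sketch: `TableInfo`, `genPair`, and the hard-coded spark-sql options are hypothetical stand-ins for whatever the real metadata program produces.

```java
public class CountScriptGen {

    // Hypothetical holder for metadata fetched from Hive Server2 / Spark Thrift Server
    static class TableInfo {
        final String db, table, partitionFilter; // partitionFilter is null for unpartitioned tables
        TableInfo(String db, String table, String partitionFilter) {
            this.db = db; this.table = table; this.partitionFilter = partitionFilter;
        }
    }

    static final String SPARK_SQL =
            "spark-sql --master yarn --executor-memory 8G --executor-cores 4 "
          + "--num-executors 2 --driver-memory 4G";

    // Emits the two-line pair (count statement + redirect) for one table,
    // matching the format of count.sh above.
    static String genPair(TableInfo t, String logPath) {
        String where = t.partitionFilter == null ? "" : " where " + t.partitionFilter;
        return "count=`" + SPARK_SQL + " -e \"select count(1) from "
             + t.db + "." + t.table + where + "\"`\n"
             + "echo " + t.db + "." + t.table + ":$count >>" + logPath;
    }

    public static void main(String[] args) {
        TableInfo t = new TableInfo("bi_ods", "table3", "dt >= 20191020 and dt <= 20191027");
        System.out.println(genPair(t, "/tmp/count.log"));
    }
}
```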
As the statements show, unpartitioned tables get a plain count, while for partitioned tables only recent data is compared (the date range is configurable).
Running this shell script redirects the final results into count.log, each line split on `:` — the first half is the table name, the second half the row count. A small program can then parse out the tables and counts and compare them against the count results produced on the Tencent Cloud side.
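The parsing side can be sketched as below; `CountLogParser` and its method names are made up for illustration, and the split is on the first `:` only since table names contain dots but no colons.

```java
import java.util.*;

public class CountLogParser {

    // Parses "db.table:123"-style lines from count.log into a map,
    // splitting on the first ':' only. Malformed lines are skipped.
    static Map<String, Long> parse(List<String> lines) {
        Map<String, Long> result = new LinkedHashMap<>();
        for (String line : lines) {
            int idx = line.indexOf(':');
            if (idx < 0) continue; // skip lines without a table:count separator
            String table = line.substring(0, idx);
            String count = line.substring(idx + 1).trim();
            result.put(table, Long.parseLong(count));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Long> oss = parse(Arrays.asList("bi_ods.table1:100", "bi_ods.table2:200"));
        Map<String, Long> cosn = parse(Arrays.asList("bi_ods.table1:100", "bi_ods.table2:199"));
        // Report tables whose counts differ between the two clusters
        oss.forEach((t, c) -> {
            if (!c.equals(cosn.get(t)))
                System.out.println("MISMATCH " + t + ": " + c + " vs " + cosn.get(t));
        });
    }
}
```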
Optimizing execution
Now that count.sh is generated, how do we run it? Just bash count.sh? That works, but only if you have very few tables, because it executes serially: the next count statement cannot start until the previous table's count has finished. I have nearly 2,000 tables, and running them this way would take close to 30 hours. Far too slow.
Is there a way to speed this up? Of course. Java has classes for launching shell programs, such as ProcessBuilder and Runtime. So we can parse count.sh, extract each table's count command together with its redirect command, and use a thread pool to run each spark-sql count statement concurrently, redirecting the results into count.log.
Enough talk, here is the code:
public class BashExecute {

    static Random random = new Random();

    /**
     * Random seed bound
     */
    static int sed = Integer.MAX_VALUE;

    /**
     * Number of tasks already finished
     */
    static int finshTask = 0;

    /**
     * Environment variables
     */
    static Map<String, String> customPro = new HashMap<>();

    static {
        Properties properties = System.getProperties();
        for (Map.Entry<Object, Object> entry : properties.entrySet()) {
            customPro.put(String.valueOf(entry.getKey()), String.valueOf(entry.getValue()));
        }
        customPro.putAll(System.getenv());
    }

    public static void execute(String cmds, long num) throws IOException {
        File tmpFile = createTmpFile(cmds);
        /*
        // File for the script's normal output
        File logFile = new File(HiveApp.workDir + "/success/" + num + ".log");
        if (!logFile.exists()) {
            logFile.createNewFile();
        }
        // File for the script's error output
        File errFile = new File(HiveApp.workDir + "/error/" + num + "_err.log");
        if (!errFile.exists()) {
            errFile.createNewFile();
        }
        */
        ProcessBuilder builder = new ProcessBuilder("bash", tmpFile.getAbsolutePath())
                .directory(new File(HiveApp.workDir));
        builder.environment().putAll(customPro);
        Process process = null;
        try {
            process = builder.start();
            /*
            // One thread reads the normal log output
            new StreamThread(logFile, process.getInputStream()).start();
            // One thread reads the error log output
            new StreamThread(errFile, process.getErrorStream()).start();
            */
            int exitCode = process.waitFor();
            if (exitCode != 0) {
                System.out.println("Task failed: " + cmds);
            }
        } catch (IOException | InterruptedException e) {
            e.printStackTrace();
        } finally {
            if (process != null) {
                process.destroy();
                process = null;
            }
            tmpFile.delete();
        }
    }

    /**
     * Creates a temporary script file for the commands to execute
     *
     * @param cmds shell script content
     * @return the created file
     */
    private static File createTmpFile(String cmds) throws IOException {
        File file = new File("/tmp/" + System.currentTimeMillis() + random.nextInt(sed) + ".sh");
        if (!file.exists()) {
            if (!file.createNewFile()) {
                throw new RuntimeException("Failed to create temp file " + file.getAbsolutePath());
            }
        }
        BufferedWriter writer = new BufferedWriter(new FileWriter(file));
        writer.write(cmds);
        writer.flush();
        writer.close();
        return file;
    }

    /**
     * nohup java -classpath utils-1.0-SNAPSHOT-jar-with-dependencies.jar com.sucx.app.BashExecute /home/hadoop/sucx/count.sh 70 > exe.log &
     *
     * @param args args[0] path to count.sh, args[1] concurrency
     * @throws IOException
     */
    public static void main(String[] args) throws IOException, InterruptedException {
        List<String> cmdList;
        // Default parallelism
        int poolSize = 15;
        if (args.length == 0) {
            cmdList = getCmd(HiveApp.countCmd);
        } else {
            // Path to count.sh
            cmdList = getCmd(args[0]);
            if (args.length > 1) {
                poolSize = Integer.parseInt(args[1]);
            }
        }
        int allTask = cmdList.size();
        System.out.println("Total tasks: " + allTask);
        ExecutorService service = Executors.newFixedThreadPool(poolSize);
        CountDownLatch latch = new CountDownLatch(allTask);
        for (String cmd : cmdList) {
            service.execute(() -> {
                try {
                    execute(cmd, allTask - latch.getCount());
                } catch (IOException e) {
                    e.printStackTrace();
                } finally {
                    synchronized (BashExecute.class) {
                        finshTask++;
                        // Print current progress
                        System.out.println(LocalDateTime.now() + " finished: " + finshTask + "/" + allTask + ", " + (finshTask * 100 / allTask) + "%");
                    }
                    latch.countDown();
                }
            });
        }
        latch.await();
        System.out.println("All tasks finished");
        service.shutdown();
    }

    /**
     * Reads count.sh and extracts each task's count statement plus its redirect statement
     *
     * @param path path to count.sh
     * @return the list of per-table commands
     */
    private static List<String> getCmd(String path) throws IOException {
        File file = new File(path);
        if (!file.exists() || !file.isFile()) {
            throw new RuntimeException("File does not exist");
        }
        BufferedReader reader = new BufferedReader(new FileReader(file));
        String line;
        List<String> cmds = new ArrayList<>();
        while ((line = reader.readLine()) != null) {
            cmds.add("#!/bin/bash" + "\n" + line + "\n" + reader.readLine());
        }
        reader.close();
        return cmds;
    }

    /**
     * Thread that reads a process log stream
     */
    /*
    static class StreamThread extends Thread {

        private BufferedWriter writer;
        private InputStream inputStream;

        public StreamThread(File file, InputStream inputStream) throws IOException {
            this.writer = new BufferedWriter(new FileWriter(file));
            this.inputStream = inputStream;
        }

        @Override
        public void run() {
            try (BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream, "utf-8"))) {
                String line;
                int lineNum = 0;
                while ((line = reader.readLine()) != null) {
                    writer.write(line);
                    writer.newLine();
                    if (lineNum++ == 10) {
                        writer.flush();
                        lineNum = 0;
                    }
                }
            } catch (Exception ignored) {
            }
        }
    }
    */
}
The code is simple. The main method accepts two arguments: the first is the absolute path to count.sh, the second is the number of concurrent threads n.
- Parse the contents of count.sh, treating every two lines of shell as a pair (the first line is the count statement, the second redirects its result), and return them as a List named cmdList.
- Create a fixed thread pool whose core and maximum pool sizes are both n.
- Set a CountDownLatch to the size of cmdList, iterate over cmdList submitting each shell command to the pool, print the current progress (synchronized, because of the finshTask++ update), and count the latch down.
- CountDownLatch await blocks until every task has completed, then the completion message is printed and the pool is shut down.
In the execute method that runs each script, a temporary file is created for the script's contents, ProcessBuilder executes that file, and we simply wait for the script to finish.
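In isolation, the pool-plus-latch pattern used here looks like the stripped-down sketch below, with a no-op counter standing in for each spark-sql job:

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

public class PoolLatchDemo {

    // A fixed pool runs every task; a CountDownLatch lets the caller
    // block until all of them have finished.
    static int runAll(int tasks, int poolSize) throws InterruptedException {
        ExecutorService service = Executors.newFixedThreadPool(poolSize);
        CountDownLatch latch = new CountDownLatch(tasks);
        AtomicInteger finished = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            service.execute(() -> {
                try {
                    finished.incrementAndGet(); // stands in for one spark-sql count job
                } finally {
                    latch.countDown();          // always counted down, even on failure
                }
            });
        }
        latch.await();      // blocks until every task has counted down
        service.shutdown();
        return finished.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("finished: " + runAll(50, 8));
    }
}
```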
The hang problem
With the code written, I happily uploaded the jar to both the Alibaba Cloud and Tencent Cloud clusters and ran:
nohup java -classpath utils-1.0-SNAPSHOT-jar-with-dependencies.jar com.sucx.app.BashExecute /home/hadoop/sucx/count.sh 70 > exe.log &
Then I watched the live log with tail -f exe.log. The Tencent Cloud run hummed along nicely, which was quite satisfying.
On Alibaba Cloud, however, something seemed wrong: after a long wait nothing moved,
and YARN eventually showed the jobs as failed.
The YARN logs all showed the same error, and yet each statement ran fine when executed by itself on the command line:
19/11/05 10:32:13 WARN executor.Executor: Issue communicating with driver in heartbeater
org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [10 seconds]. This timeout is controlled by spark.executor.heartbeatInterval
at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:47)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:62)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:58)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76)
at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:92)
at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$reportHeartBeat(Executor.scala:785)
at org.apache.spark.executor.Executor$$anon$2$$anonfun$run$1.apply$mcV$sp(Executor.scala:814)
at org.apache.spark.executor.Executor$$anon$2$$anonfun$run$1.apply(Executor.scala:814)
at org.apache.spark.executor.Executor$$anon$2$$anonfun$run$1.apply(Executor.scala:814)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1991)
at org.apache.spark.executor.Executor$$anon$2.run(Executor.scala:814)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10 seconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:201)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
... 14 more
But the Java process still seemed stuck:
"process reaper" #22 daemon prio=10 os_prio=0 tid=0x00007fcefc001680 nid=0x540a runnable [0x00007fcf4d38a000]
java.lang.Thread.State: RUNNABLE
at java.lang.UNIXProcess.waitForProcessExit(Native Method)
at java.lang.UNIXProcess.lambda$initStreams$3(UNIXProcess.java:289)
at java.lang.UNIXProcess$$Lambda$10/889583863.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"process reaper" #21 daemon prio=10 os_prio=0 tid=0x00007fcf08007ab0 nid=0x5407 runnable [0x00007fcf4d3c3000]
java.lang.Thread.State: RUNNABLE
at java.lang.UNIXProcess.waitForProcessExit(Native Method)
at java.lang.UNIXProcess.lambda$initStreams$3(UNIXProcess.java:289)
at java.lang.UNIXProcess$$Lambda$10/889583863.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"pool-1-thread-10" #19 prio=5 os_prio=0 tid=0x00007fcfa418bb20 nid=0x53f8 in Object.wait() [0x00007fcf4d4c4000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000006750be3a8> (a java.lang.UNIXProcess)
at java.lang.Object.wait(Object.java:502)
at java.lang.UNIXProcess.waitFor(UNIXProcess.java:395)
- locked <0x00000006750be3a8> (a java.lang.UNIXProcess)
at com.sucx.app.BashExecute.execute(BashExecute.java:75)
at com.sucx.app.BashExecute.lambda$main$0(BashExecute.java:136)
at com.sucx.app.BashExecute$$Lambda$1/1406718218.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
The spark-sql jobs had clearly failed, yet the process that launched them had not exited. Why? Puzzling.
So let's pick a thread to analyze. From the dump above I chose the process reaper thread's stack:
"process reaper" #21 daemon prio=10 os_prio=0 tid=0x00007fcf08007ab0 nid=0x5407 runnable [0x00007fcf4d3c3000]
java.lang.Thread.State: RUNNABLE
at java.lang.UNIXProcess.waitForProcessExit(Native Method)
at java.lang.UNIXProcess.lambda$initStreams$3(UNIXProcess.java:289)
at java.lang.UNIXProcess$$Lambda$10/889583863.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Looking at the JDK source, java.lang.UNIXProcess.waitForProcessExit is a native method, so we skip it:
private native int waitForProcessExit(int pid);
Moving up the call chain:
void initStreams(int[] fds) throws IOException {
    switch (platform) {
        case LINUX:
        case BSD:
            stdin = (fds[0] == -1) ?
                    ProcessBuilder.NullOutputStream.INSTANCE :
                    new ProcessPipeOutputStream(fds[0]);

            stdout = (fds[1] == -1) ?
                    ProcessBuilder.NullInputStream.INSTANCE :
                    new ProcessPipeInputStream(fds[1]);

            stderr = (fds[2] == -1) ?
                    ProcessBuilder.NullInputStream.INSTANCE :
                    new ProcessPipeInputStream(fds[2]);

            processReaperExecutor.execute(() -> {
                int exitcode = waitForProcessExit(pid);

                synchronized (this) {
                    this.exitcode = exitcode;
                    this.hasExited = true;
                    this.notifyAll();
                }

                if (stdout instanceof ProcessPipeInputStream)
                    ((ProcessPipeInputStream) stdout).processExited();

                if (stderr instanceof ProcessPipeInputStream)
                    ((ProcessPipeInputStream) stderr).processExited();

                if (stdin instanceof ProcessPipeOutputStream)
                    ((ProcessPipeOutputStream) stdin).processExited();
            });
            break;
        /* remaining cases omitted */
From initStreams we can see that this method asynchronously watches for the child process to exit and then closes its input and output streams. Nothing obviously wrong here, so let's keep moving up:
UNIXProcess(final byte[] prog,
            final byte[] argBlock, final int argc,
            final byte[] envBlock, final int envc,
            final byte[] dir,
            final int[] fds,
            final boolean redirectErrorStream)
        throws IOException {

    pid = forkAndExec(launchMechanism.ordinal() + 1,
                      helperpath,
                      prog,
                      argBlock, argc,
                      envBlock, envc,
                      dir,
                      fds,
                      redirectErrorStream);

    try {
        doPrivileged((PrivilegedExceptionAction<Void>) () -> {
            initStreams(fds);
            return null;
        });
    } catch (PrivilegedActionException ex) {
        throw (IOException) ex.getException();
    }
}
Here we can see that our program is launched by the forkAndExec method, which returns the child's process id (pid). Let's look inside:
/**
* Creates a process. Depending on the {@code mode} flag, this is done by
* one of the following mechanisms:
* <pre>
* 1 - fork(2) and exec(2)
* 2 - posix_spawn(3P)
* 3 - vfork(2) and exec(2)
*
* (4 - clone(2) and exec(2) - obsolete and currently disabled in native code)
* </pre>
* @param fds an array of three file descriptors.
* Indexes 0, 1, and 2 correspond to standard input,
* standard output and standard error, respectively. On
* input, a value of -1 means to create a pipe to connect
* child and parent processes. On output, a value which
* is not -1 is the parent pipe fd corresponding to the
* pipe which has been created. An element of this array
* is -1 on input if and only if it is <em>not</em> -1 on
* output.
* @return the pid of the subprocess
*/
private native int forkAndExec(int mode, byte[] helperpath,
byte[] prog,
byte[] argBlock, int argc,
byte[] envBlock, int envc,
byte[] dir,
int[] fds,
boolean redirectErrorStream)
throws IOException;
This is also a native method, but the comment above it must not be overlooked, especially the part about the fds parameter. Roughly:
fds is an array of three file descriptors. Indexes 0, 1, and 2 correspond to standard input, standard output, and standard error, respectively. On input, a value of -1 means a pipe will be created to connect the child and parent processes; on output, a non-(-1) value is the parent-side fd of the pipe that was created. An element of this array is -1 on input if and only if it is not -1 on output.
Reading this, it suddenly hit me: where has all the output of the spark-sql jobs launched via ProcessBuilder been going? I never redirected those logs anywhere.
Moving one level higher:
static Process start(String[] cmdarray,
                     java.util.Map<String,String> environment,
                     String dir,
                     ProcessBuilder.Redirect[] redirects,
                     boolean redirectErrorStream)
        throws IOException {
    /* some code omitted */
    try {
        if (redirects == null) {
            // No redirects configured: all three streams default to pipes
            // connected to the current Java process
            std_fds = new int[] { -1, -1, -1 };
        } else {
            std_fds = new int[3];

            if (redirects[0] == Redirect.PIPE)
                std_fds[0] = -1;
            else if (redirects[0] == Redirect.INHERIT)
                std_fds[0] = 0;
            else {
                f0 = new FileInputStream(redirects[0].file());
                std_fds[0] = fdAccess.get(f0.getFD());
            }

            if (redirects[1] == Redirect.PIPE)
                std_fds[1] = -1;
            else if (redirects[1] == Redirect.INHERIT)
                std_fds[1] = 1;
            else {
                f1 = new FileOutputStream(redirects[1].file(),
                                          redirects[1].append());
                std_fds[1] = fdAccess.get(f1.getFD());
            }

            if (redirects[2] == Redirect.PIPE)
                std_fds[2] = -1;
            else if (redirects[2] == Redirect.INHERIT)
                std_fds[2] = 2;
            else {
                f2 = new FileOutputStream(redirects[2].file(),
                                          redirects[2].append());
                std_fds[2] = fdAccess.get(f2.getFD());
            }
        }

        return new UNIXProcess
                (toCString(cmdarray[0]),
                 argBlock, args.length,
                 envBlock, envc[0],
                 toCString(dir),
                 std_fds,
                 redirectErrorStream);
    } finally {
        // In theory, close() can throw IOException
        // (although it is rather unlikely to happen here)
        try { if (f0 != null) f0.close(); }
        finally {
            try { if (f1 != null) f1.close(); }
            finally { if (f2 != null) f2.close(); }
        }
    }
}
So if we do not redirect the input and output streams, the descriptors all default to -1, and -1 means a pipe is created connecting the child process to the parent.
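This pipe default is easy to confirm from the public ProcessBuilder API itself: unless a redirect is configured explicitly, all three streams report Redirect.PIPE, which corresponds exactly to the std_fds = {-1, -1, -1} branch above.

```java
import java.lang.ProcessBuilder.Redirect;

public class DefaultRedirects {
    public static void main(String[] args) {
        // The command is never started here; we only inspect the builder's defaults.
        ProcessBuilder pb = new ProcessBuilder("bash", "some-script.sh");
        // With no explicit redirects configured, all three streams are pipes
        // back to the current JVM, i.e. the -1 case in the JDK source above.
        System.out.println(pb.redirectInput());   // PIPE
        System.out.println(pb.redirectOutput());  // PIPE
        System.out.println(pb.redirectError());   // PIPE
    }
}
```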
I remembered that the output of ulimit -a includes a pipe size limit, i.e. the size of the pipe buffer. Could the tasks be hanging because that buffer filled up?
Running it on the Alibaba Cloud machine shows a size of 8 × 512 bytes = 4 KB:
[hadoop@emr-header-1 ~]$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 63471
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 65535
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 32767
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
And on the Tencent Cloud machine:
[hadoop@172 ~]$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 256991
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 100001
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 256991
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
Both machines report the same thing: 4 KB.
A web search turned up the following: on stock 2.6-series Linux kernels, the pipe buffer is actually 64 KB. Even though ulimit -a reports a pipe size of 8 × 512-byte blocks (4 KB), that is not the real capacity: the kernel dynamically allocates up to 16 "buffer entries", which multiplies out to 64 KB. These limits are hard-coded.
On the Tencent Cloud machine the buffer-entry count is indeed 16, which works out to exactly 64 KB:
[hadoop@172 kernels]$ cat /usr/src/kernels/3.10.0-514.21.1.el7.x86_64/include/linux/pipe_fs_i.h | grep PIPE_DEF_BUFFERS
#define PIPE_DEF_BUFFERS 16
On the Alibaba Cloud machine the kernel headers are not installed, so the entry count cannot be checked:
[hadoop@emr-header-1 kernels]$ ls /usr/src/kernels/
[hadoop@emr-header-1 kernels]$
So there we can only assume the default.
By now you can surely see what hung our tasks: no process was reading from the pipe buffer, so once it filled up the child could no longer write and blocked. The fix is either to redirect the output streams to files when launching the process with ProcessBuilder, or to drain the output streams continuously from dedicated threads.
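A minimal sketch of the file-redirect fix is below. It assumes a Linux machine with bash and seq available; `RedirectFix` and `runRedirected` are made-up names, and seq simply stands in for any chatty child process whose output would otherwise overflow the 64 KB pipe buffer.

```java
import java.io.File;
import java.io.IOException;

public class RedirectFix {

    // Sends the child's stdout and stderr to files, so its output can
    // never accumulate in (and fill) the parent-child pipe buffer.
    static int runRedirected(String command, File out, File err)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder("bash", "-c", command)
                .redirectOutput(ProcessBuilder.Redirect.to(out))
                .redirectError(ProcessBuilder.Redirect.to(err));
        Process process = builder.start();
        return process.waitFor(); // safe: nothing is buffered in a pipe
    }

    public static void main(String[] args) throws Exception {
        File out = File.createTempFile("demo", ".log");
        File err = File.createTempFile("demo", ".err");
        // seq prints several hundred KB here, far beyond the 64 KB pipe
        // buffer, yet waitFor() still returns because nothing goes through a pipe
        int code = runRedirected("seq 1 100000", out, err);
        System.out.println("exit=" + code + " bytes=" + out.length());
    }
}
```

The alternative fix is the commented-out StreamThread approach in the code above: keep Redirect.PIPE but start a thread per stream that reads it until EOF.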