1) Test HDFS write performance
Test content: write 10 files of 128 MB each to the HDFS cluster
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.2-tests.jar TestDFSIO -write -nrFiles 10 -fileSize 128MB
Output log:
20/04/16 04:07:54 INFO fs.TestDFSIO: TestDFSIO.1.8
20/04/16 04:07:54 INFO fs.TestDFSIO: nrFiles = 10
20/04/16 04:07:54 INFO fs.TestDFSIO: nrBytes (MB) = 128.0
20/04/16 04:07:54 INFO fs.TestDFSIO: bufferSize = 1000000
20/04/16 04:07:54 INFO fs.TestDFSIO: baseDir = /benchmarks/TestDFSIO
20/04/16 04:07:56 INFO fs.TestDFSIO: creating control file: 134217728 bytes, 10 files
20/04/16 04:07:57 INFO fs.TestDFSIO: created control files for: 10 files
20/04/16 04:07:57 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.187.203:8032
20/04/16 04:07:58 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.187.203:8032
20/04/16 04:07:58 INFO mapred.FileInputFormat: Total input paths to process : 10
20/04/16 04:07:59 INFO mapreduce.JobSubmitter: number of splits:10
20/04/16 04:08:00 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1587019021463_0001
20/04/16 04:08:01 INFO impl.YarnClientImpl: Submitted application application_1587019021463_0001
20/04/16 04:08:02 INFO mapreduce.Job: The url to track the job: http://hadoop103:8088/proxy/application_1587019021463_0001/
20/04/16 04:08:02 INFO mapreduce.Job: Running job: job_1587019021463_0001
20/04/16 04:08:16 INFO mapreduce.Job: Job job_1587019021463_0001 running in uber mode : false
20/04/16 04:08:16 INFO mapreduce.Job: map 0% reduce 0%
20/04/16 04:09:13 INFO mapreduce.Job: map 7% reduce 0%
20/04/16 04:09:47 INFO mapreduce.Job: map 17% reduce 0%
20/04/16 04:10:04 INFO mapreduce.Job: map 17% reduce 3%
20/04/16 04:10:31 INFO mapreduce.Job: map 23% reduce 3%
20/04/16 04:10:42 INFO mapreduce.Job: map 27% reduce 3%
20/04/16 04:10:57 INFO mapreduce.Job: map 33% reduce 7%
20/04/16 04:10:59 INFO mapreduce.Job: map 37% reduce 7%
20/04/16 04:11:08 INFO mapreduce.Job: map 40% reduce 10%
20/04/16 04:11:12 INFO mapreduce.Job: map 40% reduce 13%
20/04/16 04:11:24 INFO mapreduce.Job: map 47% reduce 13%
20/04/16 04:11:27 INFO mapreduce.Job: map 53% reduce 13%
20/04/16 04:11:52 INFO mapreduce.Job: map 57% reduce 13%
20/04/16 04:11:57 INFO mapreduce.Job: map 57% reduce 17%
20/04/16 04:12:00 INFO mapreduce.Job: map 67% reduce 17%
20/04/16 04:12:03 INFO mapreduce.Job: map 67% reduce 20%
20/04/16 04:12:13 INFO mapreduce.Job: map 70% reduce 20%
20/04/16 04:12:15 INFO mapreduce.Job: map 70% reduce 23%
20/04/16 04:12:17 INFO mapreduce.Job: map 77% reduce 23%
20/04/16 04:12:20 INFO mapreduce.Job: Task Id : attempt_1587019021463_0001_m_000003_1, Status : FAILED
Error: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /benchmarks/TestDFSIO/io_data/test_io_3 (inode 16853): File does not exist. Holder DFSClient_attempt_1587019021463_0001_m_000003_1_-167239712_1 does not have any open files.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3428)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:3518)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:3485)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.complete(NameNodeRpcServer.java:786)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.complete(ClientNamenodeProtocolServerSideTranslatorPB.java:536)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
at org.apache.hadoop.ipc.Client.call(Client.java:1475)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy13.complete(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.complete(ClientNamenodeProtocolTranslatorPB.java:462)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy14.complete(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:2290)
at org.apache.hadoop.hdfs.DFSOutputStream.closeImpl(DFSOutputStream.java:2272)
at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:2236)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
at org.apache.hadoop.fs.IOMapperBase.map(IOMapperBase.java:136)
at org.apache.hadoop.fs.IOMapperBase.map(IOMapperBase.java:37)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
(Note: this failure is benign. The attempt ID ends in _1, i.e. a retried/speculative duplicate of map task 3; the LeaseExpiredException and exit code 143 indicate the ApplicationMaster killed this duplicate attempt after another attempt had already finished the file. As the log shows below, the job still completes successfully.)
20/04/16 04:12:22 INFO mapreduce.Job: map 70% reduce 23%
20/04/16 04:12:24 INFO mapreduce.Job: map 77% reduce 23%
20/04/16 04:12:25 INFO mapreduce.Job: map 80% reduce 23%
20/04/16 04:12:27 INFO mapreduce.Job: map 80% reduce 27%
20/04/16 04:12:28 INFO mapreduce.Job: map 93% reduce 27%
20/04/16 04:12:33 INFO mapreduce.Job: map 97% reduce 27%
20/04/16 04:12:36 INFO mapreduce.Job: map 97% reduce 30%
20/04/16 04:12:38 INFO mapreduce.Job: map 100% reduce 30%
20/04/16 04:12:39 INFO mapreduce.Job: map 100% reduce 100%
20/04/16 04:12:40 INFO mapreduce.Job: Job job_1587019021463_0001 completed successfully
20/04/16 04:12:40 INFO mapreduce.Job: Counters: 52
File System Counters
FILE: Number of bytes read=859
FILE: Number of bytes written=1304876
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=2350
HDFS: Number of bytes written=1342177357
HDFS: Number of read operations=43
HDFS: Number of large read operations=0
HDFS: Number of write operations=12
Job Counters
Failed map tasks=1
Killed map tasks=6
Launched map tasks=17
Launched reduce tasks=1
Data-local map tasks=8
Rack-local map tasks=9
Total time spent by all maps in occupied slots (ms)=2522912
Total time spent by all reduces in occupied slots (ms)=168342
Total time spent by all map tasks (ms)=2522912
Total time spent by all reduce tasks (ms)=168342
Total vcore-milliseconds taken by all map tasks=2522912
Total vcore-milliseconds taken by all reduce tasks=168342
Total megabyte-milliseconds taken by all map tasks=2583461888
Total megabyte-milliseconds taken by all reduce tasks=172382208
Map-Reduce Framework
Map input records=10
Map output records=50
Map output bytes=753
Map output materialized bytes=913
Input split bytes=1230
Combine input records=0
Combine output records=0
Reduce input groups=5
Reduce shuffle bytes=913
Reduce input records=50
Reduce output records=5
Spilled Records=100
Shuffled Maps =10
Failed Shuffles=0
Merged Map outputs=10
GC time elapsed (ms)=29508
CPU time spent (ms)=270680
Physical memory (bytes) snapshot=2984280064
Virtual memory (bytes) snapshot=23289643008
Total committed heap usage (bytes)=2072510464
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=1120
File Output Format Counters
Bytes Written=77
20/04/16 04:12:40 INFO fs.TestDFSIO: ----- TestDFSIO ----- : write
20/04/16 04:12:40 INFO fs.TestDFSIO: Date & time: Thu Apr 16 04:12:40 EDT 2020
20/04/16 04:12:40 INFO fs.TestDFSIO: Number of files: 10
20/04/16 04:12:40 INFO fs.TestDFSIO: Total MBytes processed: 1280.0
20/04/16 04:12:40 INFO fs.TestDFSIO: Throughput mb/sec: 5.3630535886370305
20/04/16 04:12:40 INFO fs.TestDFSIO: Average IO rate mb/sec: 8.655435562133789
20/04/16 04:12:40 INFO fs.TestDFSIO: IO rate std deviation: 4.845843661707895
20/04/16 04:12:40 INFO fs.TestDFSIO: Test exec time sec: 283.202
20/04/16 04:12:40 INFO fs.TestDFSIO:
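The summary lines above follow a regular "name: value" format, so they can be extracted programmatically. A minimal sketch (the sample lines are copied from the write log above; the regex is an assumption based on that line format, not part of TestDFSIO itself):

```python
import re

# Match lines like "... fs.TestDFSIO: Throughput mb/sec: 5.36..."
# and capture the metric name and its numeric value.
METRIC_RE = re.compile(r"fs\.TestDFSIO:\s+([A-Za-z &/]+?):\s+([\d.]+)\s*$")

def parse_testdfsio(lines):
    """Return a dict of metric name -> float for TestDFSIO summary lines."""
    metrics = {}
    for line in lines:
        m = METRIC_RE.search(line)
        if m:
            metrics[m.group(1).strip()] = float(m.group(2))
    return metrics

# Sample lines taken verbatim from the write log above.
log = [
    "20/04/16 04:12:40 INFO fs.TestDFSIO: Number of files: 10",
    "20/04/16 04:12:40 INFO fs.TestDFSIO: Total MBytes processed: 1280.0",
    "20/04/16 04:12:40 INFO fs.TestDFSIO: Throughput mb/sec: 5.3630535886370305",
    "20/04/16 04:12:40 INFO fs.TestDFSIO: Average IO rate mb/sec: 8.655435562133789",
    "20/04/16 04:12:40 INFO fs.TestDFSIO: Test exec time sec: 283.202",
]

metrics = parse_testdfsio(log)
print(metrics["Throughput mb/sec"])  # 5.3630535886370305
```

This is handy when running the benchmark repeatedly and collecting results for comparison.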
2) Test HDFS read performance
Test content: read the 10 files of 128 MB each from the HDFS cluster
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.2-tests.jar TestDFSIO -read -nrFiles 10 -fileSize 128MB
Log output:
20/04/16 04:23:14 INFO fs.TestDFSIO: TestDFSIO.1.8
20/04/16 04:23:14 INFO fs.TestDFSIO: nrFiles = 10
20/04/16 04:23:14 INFO fs.TestDFSIO: nrBytes (MB) = 128.0
20/04/16 04:23:14 INFO fs.TestDFSIO: bufferSize = 1000000
20/04/16 04:23:14 INFO fs.TestDFSIO: baseDir = /benchmarks/TestDFSIO
20/04/16 04:23:14 INFO fs.TestDFSIO: creating control file: 134217728 bytes, 10 files
20/04/16 04:23:15 INFO fs.TestDFSIO: created control files for: 10 files
20/04/16 04:23:15 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.187.203:8032
20/04/16 04:23:15 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.187.203:8032
20/04/16 04:23:15 INFO mapred.FileInputFormat: Total input paths to process : 10
20/04/16 04:23:15 INFO mapreduce.JobSubmitter: number of splits:10
20/04/16 04:23:15 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1587019021463_0002
20/04/16 04:23:16 INFO impl.YarnClientImpl: Submitted application application_1587019021463_0002
20/04/16 04:23:16 INFO mapreduce.Job: The url to track the job: http://hadoop103:8088/proxy/application_1587019021463_0002/
20/04/16 04:23:16 INFO mapreduce.Job: Running job: job_1587019021463_0002
20/04/16 04:23:22 INFO mapreduce.Job: Job job_1587019021463_0002 running in uber mode : false
20/04/16 04:23:22 INFO mapreduce.Job: map 0% reduce 0%
20/04/16 04:23:58 INFO mapreduce.Job: map 13% reduce 0%
20/04/16 04:24:02 INFO mapreduce.Job: map 40% reduce 0%
20/04/16 04:24:11 INFO mapreduce.Job: map 40% reduce 13%
20/04/16 04:24:23 INFO mapreduce.Job: map 47% reduce 13%
20/04/16 04:24:38 INFO mapreduce.Job: map 57% reduce 13%
20/04/16 04:24:41 INFO mapreduce.Job: map 57% reduce 17%
20/04/16 04:24:47 INFO mapreduce.Job: map 60% reduce 17%
20/04/16 04:24:48 INFO mapreduce.Job: map 70% reduce 17%
20/04/16 04:24:50 INFO mapreduce.Job: map 70% reduce 23%
20/04/16 04:24:51 INFO mapreduce.Job: map 80% reduce 23%
20/04/16 04:24:52 INFO mapreduce.Job: map 90% reduce 23%
20/04/16 04:24:53 INFO mapreduce.Job: map 100% reduce 30%
20/04/16 04:24:54 INFO mapreduce.Job: map 100% reduce 100%
20/04/16 04:24:54 INFO mapreduce.Job: Job job_1587019021463_0002 completed successfully
20/04/16 04:24:54 INFO mapreduce.Job: Counters: 51
File System Counters
FILE: Number of bytes read=850
FILE: Number of bytes written=1304836
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=1342179630
HDFS: Number of bytes written=81
HDFS: Number of read operations=53
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Killed map tasks=3
Launched map tasks=13
Launched reduce tasks=1
Data-local map tasks=6
Rack-local map tasks=7
Total time spent by all maps in occupied slots (ms)=663929
Total time spent by all reduces in occupied slots (ms)=51227
Total time spent by all map tasks (ms)=663929
Total time spent by all reduce tasks (ms)=51227
Total vcore-milliseconds taken by all map tasks=663929
Total vcore-milliseconds taken by all reduce tasks=51227
Total megabyte-milliseconds taken by all map tasks=679863296
Total megabyte-milliseconds taken by all reduce tasks=52456448
Map-Reduce Framework
Map input records=10
Map output records=50
Map output bytes=744
Map output materialized bytes=904
Input split bytes=1230
Combine input records=0
Combine output records=0
Reduce input groups=5
Reduce shuffle bytes=904
Reduce input records=50
Reduce output records=5
Spilled Records=100
Shuffled Maps =10
Failed Shuffles=0
Merged Map outputs=10
GC time elapsed (ms)=7163
CPU time spent (ms)=39650
Physical memory (bytes) snapshot=2840920064
Virtual memory (bytes) snapshot=23231057920
Total committed heap usage (bytes)=1963458560
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=1120
File Output Format Counters
Bytes Written=81
20/04/16 04:24:54 INFO fs.TestDFSIO: ----- TestDFSIO ----- : read
20/04/16 04:24:54 INFO fs.TestDFSIO: Date & time: Thu Apr 16 04:24:54 EDT 2020
20/04/16 04:24:54 INFO fs.TestDFSIO: Number of files: 10
20/04/16 04:24:54 INFO fs.TestDFSIO: Total MBytes processed: 1280.0
20/04/16 04:24:54 INFO fs.TestDFSIO: Throughput mb/sec: 22.55188695866662
20/04/16 04:24:54 INFO fs.TestDFSIO: Average IO rate mb/sec: 104.15641021728516
20/04/16 04:24:54 INFO fs.TestDFSIO: IO rate std deviation: 201.10983284488756
20/04/16 04:24:54 INFO fs.TestDFSIO: Test exec time sec: 99.53
20/04/16 04:24:54 INFO fs.TestDFSIO:
The two key results are Throughput mb/sec (the overall throughput) and Average IO rate mb/sec (the mean of the per-file IO rates).
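The two numbers measure different things, which is why they can diverge (e.g. 22.55 vs 104.16 in the read run above): Throughput is an aggregate figure (total MB divided by total IO time across all files), while Average IO rate is the mean of each file's individual rate, so a few very fast files pull it well above the aggregate. A toy illustration of the difference (the file sizes and times here are made up, not taken from this run):

```python
# Toy per-file results: (MB processed, seconds of IO time).
# Illustrative values only -- one slow file and one fast file.
files = [(128, 64.0), (128, 8.0)]

# Aggregate throughput: total MB divided by total IO time.
throughput = sum(mb for mb, _ in files) / sum(t for _, t in files)

# Average IO rate: mean of each file's individual MB/s rate.
rates = [mb / t for mb, t in files]
avg_io_rate = sum(rates) / len(rates)

print(throughput)   # 256 / 72  ~= 3.56 MB/s
print(avg_io_rate)  # (2 + 16) / 2 = 9.0 MB/s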
3) Delete the data generated by the test
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.2-tests.jar TestDFSIO -clean
4) Benchmark MapReduce with the Sort program (note: this test has high resource requirements)
(1) Use RandomWriter to generate random data. Each node runs 10 map tasks, and each map produces roughly 1 GB of random binary data
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar randomwriter random-data
(2) Run the Sort program
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar sort random-data sorted-data
(3) Verify that the data is actually sorted
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar testmapredsort -sortInput random-data -sortOutput sorted-data
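Why step 4 is resource-hungry: the Sort input size scales with the cluster. A back-of-the-envelope sketch, assuming the figures stated in (1) (10 maps per node, ~1 GB per map) and HDFS's default replication factor of 3:

```python
GB = 1 << 30  # bytes in a gigabyte

def randomwriter_output_bytes(num_nodes, maps_per_node=10, bytes_per_map=GB):
    """Approximate raw output of the randomwriter example job,
    assuming 10 maps per node producing ~1 GB each."""
    return num_nodes * maps_per_node * bytes_per_map

# A 3-node cluster generates roughly 30 GB of raw data; with the default
# HDFS replication factor of 3, that occupies about 90 GB of disk, and
# the Sort job then reads, shuffles, and rewrites all of it.
three_node = randomwriter_output_bytes(3)
print(three_node // GB)  # 30
```

Remember to remove random-data and sorted-data afterwards, just as TestDFSIO data is cleaned up in step 3.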