Hadoop Performance Benchmarking

1) Testing HDFS write performance

Test: write 10 files of 128 MB each to the HDFS cluster

hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.2-tests.jar TestDFSIO -write -nrFiles 10 -fileSize 128MB
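Before reading the log, a quick back-of-the-envelope check of the data volume this command generates (plain shell arithmetic; the 3x figure assumes HDFS's default replication factor of 3, which this post does not state explicitly):

```shell
# Logical vs. physical data volume for "-nrFiles 10 -fileSize 128MB".
NR_FILES=10
FILE_SIZE_MB=128
TOTAL_MB=$((NR_FILES * FILE_SIZE_MB))
echo "logical data written: ${TOTAL_MB} MB"          # 1280 MB
# Assuming the default HDFS replication factor of 3:
echo "physical data on disk: $((TOTAL_MB * 3)) MB"   # 3840 MB
```

The 1280 MB figure reappears at the bottom of the log as "Total MBytes processed".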

Output log:

20/04/16 04:07:54 INFO fs.TestDFSIO: TestDFSIO.1.8
20/04/16 04:07:54 INFO fs.TestDFSIO: nrFiles = 10
20/04/16 04:07:54 INFO fs.TestDFSIO: nrBytes (MB) = 128.0
20/04/16 04:07:54 INFO fs.TestDFSIO: bufferSize = 1000000
20/04/16 04:07:54 INFO fs.TestDFSIO: baseDir = /benchmarks/TestDFSIO
20/04/16 04:07:56 INFO fs.TestDFSIO: creating control file: 134217728 bytes, 10 files
20/04/16 04:07:57 INFO fs.TestDFSIO: created control files for: 10 files
20/04/16 04:07:57 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.187.203:8032
20/04/16 04:07:58 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.187.203:8032
20/04/16 04:07:58 INFO mapred.FileInputFormat: Total input paths to process : 10
20/04/16 04:07:59 INFO mapreduce.JobSubmitter: number of splits:10
20/04/16 04:08:00 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1587019021463_0001
20/04/16 04:08:01 INFO impl.YarnClientImpl: Submitted application application_1587019021463_0001
20/04/16 04:08:02 INFO mapreduce.Job: The url to track the job: http://hadoop103:8088/proxy/application_1587019021463_0001/
20/04/16 04:08:02 INFO mapreduce.Job: Running job: job_1587019021463_0001
20/04/16 04:08:16 INFO mapreduce.Job: Job job_1587019021463_0001 running in uber mode : false
20/04/16 04:08:16 INFO mapreduce.Job:  map 0% reduce 0%
20/04/16 04:09:13 INFO mapreduce.Job:  map 7% reduce 0%
20/04/16 04:09:47 INFO mapreduce.Job:  map 17% reduce 0%
20/04/16 04:10:04 INFO mapreduce.Job:  map 17% reduce 3%
20/04/16 04:10:31 INFO mapreduce.Job:  map 23% reduce 3%
20/04/16 04:10:42 INFO mapreduce.Job:  map 27% reduce 3%
20/04/16 04:10:57 INFO mapreduce.Job:  map 33% reduce 7%
20/04/16 04:10:59 INFO mapreduce.Job:  map 37% reduce 7%
20/04/16 04:11:08 INFO mapreduce.Job:  map 40% reduce 10%
20/04/16 04:11:12 INFO mapreduce.Job:  map 40% reduce 13%
20/04/16 04:11:24 INFO mapreduce.Job:  map 47% reduce 13%
20/04/16 04:11:27 INFO mapreduce.Job:  map 53% reduce 13%
20/04/16 04:11:52 INFO mapreduce.Job:  map 57% reduce 13%
20/04/16 04:11:57 INFO mapreduce.Job:  map 57% reduce 17%
20/04/16 04:12:00 INFO mapreduce.Job:  map 67% reduce 17%
20/04/16 04:12:03 INFO mapreduce.Job:  map 67% reduce 20%
20/04/16 04:12:13 INFO mapreduce.Job:  map 70% reduce 20%
20/04/16 04:12:15 INFO mapreduce.Job:  map 70% reduce 23%
20/04/16 04:12:17 INFO mapreduce.Job:  map 77% reduce 23%
20/04/16 04:12:20 INFO mapreduce.Job: Task Id : attempt_1587019021463_0001_m_000003_1, Status : FAILED
Error: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /benchmarks/TestDFSIO/io_data/test_io_3 (inode 16853): File does not exist. Holder DFSClient_attempt_1587019021463_0001_m_000003_1_-167239712_1 does not have any open files.
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3428)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:3518)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:3485)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.complete(NameNodeRpcServer.java:786)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.complete(ClientNamenodeProtocolServerSideTranslatorPB.java:536)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)

	at org.apache.hadoop.ipc.Client.call(Client.java:1475)
	at org.apache.hadoop.ipc.Client.call(Client.java:1412)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
	at com.sun.proxy.$Proxy13.complete(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.complete(ClientNamenodeProtocolTranslatorPB.java:462)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
	at com.sun.proxy.$Proxy14.complete(Unknown Source)
	at org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:2290)
	at org.apache.hadoop.hdfs.DFSOutputStream.closeImpl(DFSOutputStream.java:2272)
	at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:2236)
	at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
	at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
	at org.apache.hadoop.fs.IOMapperBase.map(IOMapperBase.java:136)
	at org.apache.hadoop.fs.IOMapperBase.map(IOMapperBase.java:37)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

(The failed attempt above is most likely a side effect of speculative execution: two attempts of the same map wrote to the same output file, the faster one closed it first, and the slower attempt then hit LeaseExpiredException and was killed by the ApplicationMaster, hence exit code 143. The job itself still completes successfully, as the rest of the log shows.)

20/04/16 04:12:22 INFO mapreduce.Job:  map 70% reduce 23%
20/04/16 04:12:24 INFO mapreduce.Job:  map 77% reduce 23%
20/04/16 04:12:25 INFO mapreduce.Job:  map 80% reduce 23%
20/04/16 04:12:27 INFO mapreduce.Job:  map 80% reduce 27%
20/04/16 04:12:28 INFO mapreduce.Job:  map 93% reduce 27%
20/04/16 04:12:33 INFO mapreduce.Job:  map 97% reduce 27%
20/04/16 04:12:36 INFO mapreduce.Job:  map 97% reduce 30%
20/04/16 04:12:38 INFO mapreduce.Job:  map 100% reduce 30%
20/04/16 04:12:39 INFO mapreduce.Job:  map 100% reduce 100%
20/04/16 04:12:40 INFO mapreduce.Job: Job job_1587019021463_0001 completed successfully
20/04/16 04:12:40 INFO mapreduce.Job: Counters: 52
	File System Counters
		FILE: Number of bytes read=859
		FILE: Number of bytes written=1304876
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=2350
		HDFS: Number of bytes written=1342177357
		HDFS: Number of read operations=43
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=12
	Job Counters 
		Failed map tasks=1
		Killed map tasks=6
		Launched map tasks=17
		Launched reduce tasks=1
		Data-local map tasks=8
		Rack-local map tasks=9
		Total time spent by all maps in occupied slots (ms)=2522912
		Total time spent by all reduces in occupied slots (ms)=168342
		Total time spent by all map tasks (ms)=2522912
		Total time spent by all reduce tasks (ms)=168342
		Total vcore-milliseconds taken by all map tasks=2522912
		Total vcore-milliseconds taken by all reduce tasks=168342
		Total megabyte-milliseconds taken by all map tasks=2583461888
		Total megabyte-milliseconds taken by all reduce tasks=172382208
	Map-Reduce Framework
		Map input records=10
		Map output records=50
		Map output bytes=753
		Map output materialized bytes=913
		Input split bytes=1230
		Combine input records=0
		Combine output records=0
		Reduce input groups=5
		Reduce shuffle bytes=913
		Reduce input records=50
		Reduce output records=5
		Spilled Records=100
		Shuffled Maps =10
		Failed Shuffles=0
		Merged Map outputs=10
		GC time elapsed (ms)=29508
		CPU time spent (ms)=270680
		Physical memory (bytes) snapshot=2984280064
		Virtual memory (bytes) snapshot=23289643008
		Total committed heap usage (bytes)=2072510464
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters 
		Bytes Read=1120
	File Output Format Counters 
		Bytes Written=77
20/04/16 04:12:40 INFO fs.TestDFSIO: ----- TestDFSIO ----- : write
20/04/16 04:12:40 INFO fs.TestDFSIO:            Date & time: Thu Apr 16 04:12:40 EDT 2020
20/04/16 04:12:40 INFO fs.TestDFSIO:        Number of files: 10
20/04/16 04:12:40 INFO fs.TestDFSIO: Total MBytes processed: 1280.0
20/04/16 04:12:40 INFO fs.TestDFSIO:      Throughput mb/sec: 5.3630535886370305
20/04/16 04:12:40 INFO fs.TestDFSIO: Average IO rate mb/sec: 8.655435562133789
20/04/16 04:12:40 INFO fs.TestDFSIO:  IO rate std deviation: 4.845843661707895
20/04/16 04:12:40 INFO fs.TestDFSIO:     Test exec time sec: 283.202
20/04/16 04:12:40 INFO fs.TestDFSIO: 

 

2) Testing HDFS read performance

Test: read the 10 files of 128 MB each back from the HDFS cluster

hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.2-tests.jar TestDFSIO -read -nrFiles 10 -fileSize 128MB
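A quick sanity check against the job counters that follow (a sketch, on the assumption of 3-way replication): a read touches only one replica of each block, so "HDFS: Number of bytes read" should sit just above the logical 1280 MB rather than three times that.

```shell
# Counter value copied from the job output below; convert bytes to MB.
BYTES_READ=1342179630   # "HDFS: Number of bytes read"
echo "$((BYTES_READ / 1048576)) MB"   # 1280 MB: one replica per file
```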

Output log:
20/04/16 04:23:14 INFO fs.TestDFSIO: TestDFSIO.1.8
20/04/16 04:23:14 INFO fs.TestDFSIO: nrFiles = 10
20/04/16 04:23:14 INFO fs.TestDFSIO: nrBytes (MB) = 128.0
20/04/16 04:23:14 INFO fs.TestDFSIO: bufferSize = 1000000
20/04/16 04:23:14 INFO fs.TestDFSIO: baseDir = /benchmarks/TestDFSIO
20/04/16 04:23:14 INFO fs.TestDFSIO: creating control file: 134217728 bytes, 10 files
20/04/16 04:23:15 INFO fs.TestDFSIO: created control files for: 10 files
20/04/16 04:23:15 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.187.203:8032
20/04/16 04:23:15 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.187.203:8032
20/04/16 04:23:15 INFO mapred.FileInputFormat: Total input paths to process : 10
20/04/16 04:23:15 INFO mapreduce.JobSubmitter: number of splits:10
20/04/16 04:23:15 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1587019021463_0002
20/04/16 04:23:16 INFO impl.YarnClientImpl: Submitted application application_1587019021463_0002
20/04/16 04:23:16 INFO mapreduce.Job: The url to track the job: http://hadoop103:8088/proxy/application_1587019021463_0002/
20/04/16 04:23:16 INFO mapreduce.Job: Running job: job_1587019021463_0002
20/04/16 04:23:22 INFO mapreduce.Job: Job job_1587019021463_0002 running in uber mode : false
20/04/16 04:23:22 INFO mapreduce.Job:  map 0% reduce 0%
20/04/16 04:23:58 INFO mapreduce.Job:  map 13% reduce 0%
20/04/16 04:24:02 INFO mapreduce.Job:  map 40% reduce 0%
20/04/16 04:24:11 INFO mapreduce.Job:  map 40% reduce 13%
20/04/16 04:24:23 INFO mapreduce.Job:  map 47% reduce 13%
20/04/16 04:24:38 INFO mapreduce.Job:  map 57% reduce 13%
20/04/16 04:24:41 INFO mapreduce.Job:  map 57% reduce 17%
20/04/16 04:24:47 INFO mapreduce.Job:  map 60% reduce 17%
20/04/16 04:24:48 INFO mapreduce.Job:  map 70% reduce 17%
20/04/16 04:24:50 INFO mapreduce.Job:  map 70% reduce 23%
20/04/16 04:24:51 INFO mapreduce.Job:  map 80% reduce 23%
20/04/16 04:24:52 INFO mapreduce.Job:  map 90% reduce 23%
20/04/16 04:24:53 INFO mapreduce.Job:  map 100% reduce 30%
20/04/16 04:24:54 INFO mapreduce.Job:  map 100% reduce 100%
20/04/16 04:24:54 INFO mapreduce.Job: Job job_1587019021463_0002 completed successfully
20/04/16 04:24:54 INFO mapreduce.Job: Counters: 51
	File System Counters
		FILE: Number of bytes read=850
		FILE: Number of bytes written=1304836
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=1342179630
		HDFS: Number of bytes written=81
		HDFS: Number of read operations=53
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=2
	Job Counters 
		Killed map tasks=3
		Launched map tasks=13
		Launched reduce tasks=1
		Data-local map tasks=6
		Rack-local map tasks=7
		Total time spent by all maps in occupied slots (ms)=663929
		Total time spent by all reduces in occupied slots (ms)=51227
		Total time spent by all map tasks (ms)=663929
		Total time spent by all reduce tasks (ms)=51227
		Total vcore-milliseconds taken by all map tasks=663929
		Total vcore-milliseconds taken by all reduce tasks=51227
		Total megabyte-milliseconds taken by all map tasks=679863296
		Total megabyte-milliseconds taken by all reduce tasks=52456448
	Map-Reduce Framework
		Map input records=10
		Map output records=50
		Map output bytes=744
		Map output materialized bytes=904
		Input split bytes=1230
		Combine input records=0
		Combine output records=0
		Reduce input groups=5
		Reduce shuffle bytes=904
		Reduce input records=50
		Reduce output records=5
		Spilled Records=100
		Shuffled Maps =10
		Failed Shuffles=0
		Merged Map outputs=10
		GC time elapsed (ms)=7163
		CPU time spent (ms)=39650
		Physical memory (bytes) snapshot=2840920064
		Virtual memory (bytes) snapshot=23231057920
		Total committed heap usage (bytes)=1963458560
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters 
		Bytes Read=1120
	File Output Format Counters 
		Bytes Written=81
20/04/16 04:24:54 INFO fs.TestDFSIO: ----- TestDFSIO ----- : read
20/04/16 04:24:54 INFO fs.TestDFSIO:            Date & time: Thu Apr 16 04:24:54 EDT 2020
20/04/16 04:24:54 INFO fs.TestDFSIO:        Number of files: 10
20/04/16 04:24:54 INFO fs.TestDFSIO: Total MBytes processed: 1280.0
20/04/16 04:24:54 INFO fs.TestDFSIO:      Throughput mb/sec: 22.55188695866662
20/04/16 04:24:54 INFO fs.TestDFSIO: Average IO rate mb/sec: 104.15641021728516
20/04/16 04:24:54 INFO fs.TestDFSIO:  IO rate std deviation: 201.10983284488756
20/04/16 04:24:54 INFO fs.TestDFSIO:     Test exec time sec: 99.53
20/04/16 04:24:54 INFO fs.TestDFSIO: 

Throughput mb/sec: aggregate throughput, i.e. total MBytes processed divided by the combined IO time of all map tasks.
Average IO rate mb/sec: the arithmetic mean of the per-file IO rates; with uneven task speeds it can sit well above the aggregate throughput, as it does in the logs above.
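
The two numbers answer different questions and can diverge. A minimal sketch with made-up per-file figures (not the real task data) shows why:

```shell
# Illustrative only: hypothetical per-file "MB seconds" pairs.
# Throughput pools all files into one ratio; the average IO rate
# averages each file's own rate, so slow outliers weigh differently.
printf '128 10\n128 40\n' | awk '
  { total_mb += $1; total_sec += $2; rate_sum += $1 / $2; n++ }
  END {
    printf "Throughput mb/sec:      %.2f\n", total_mb / total_sec
    printf "Average IO rate mb/sec: %.2f\n", rate_sum / n
  }'
```

With one fast file (12.8 mb/s) and one slow file (3.2 mb/s), the pooled throughput is 5.12 mb/s while the per-file average is 8.00 mb/s, the same qualitative gap seen in the write results above.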

 

3) Delete the data generated by the tests

hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.2-tests.jar TestDFSIO -clean

4) Evaluating MapReduce with the Sort program (note: requires substantial cluster resources)

(1) Use RandomWriter to generate random data: each node runs 10 map tasks, and each map produces roughly 1 GB of random binary data

hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar randomwriter random-data
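Since the output size scales with the cluster, a rough estimate helps before launching (NODES=3 is an assumption for illustration; substitute your own DataNode count):

```shell
# Estimate the size of random-data under RandomWriter's defaults
# (10 maps per node, ~1 GB per map). NODES is a hypothetical value.
NODES=3
MAPS_PER_NODE=10
GB_PER_MAP=1
echo "estimated random-data size: $((NODES * MAPS_PER_NODE * GB_PER_MAP)) GB"
```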

(2) Run the Sort program

hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar sort random-data sorted-data

(3) Verify that the output is actually sorted

hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar testmapredsort -sortInput random-data -sortOutput sorted-data

 
