Hadoop run modes
- Local mode
By default, Hadoop is configured to run in a non-distributed mode, as a single Java process. This is useful for debugging.
- Official Grep example
The following example copies the unpacked conf directory to use as input and then finds and displays every match of the given regular expression. Output is written to the given output directory.
$ mkdir input
$ cp etc/hadoop/*.xml input
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar grep input output 'dfs[a-z.]+'
$ cat output/*
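To see what the regular expression 'dfs[a-z.]+' picks up before involving Hadoop at all, a rough local cross-check with standard tools can help. The function name local_grep_counts is hypothetical (it is not part of Hadoop), and the "count, then match, most frequent first" layout is only an assumption about what the example job prints, so treat this as a sketch rather than a replica of the job.

```shell
#!/bin/sh
# Hypothetical helper, not part of Hadoop: count every match of an extended
# regular expression across local files, most frequent first.
# Note: grep -o and -h are widely available (GNU, BSD) but not strictly POSIX.

# local_grep_counts PATTERN FILE...
local_grep_counts() {
    pattern=$1
    shift
    # -o: print only the matched text, -h: omit file names, -E: extended regex
    grep -ohE -e "$pattern" -- "$@" \
        | LC_ALL=C sort \
        | uniq -c \
        | sort -k1,1nr \
        | awk '{ printf "%s\t%s\n", $1, $2 }'
}

# Example (run from the Hadoop directory after the cp step above):
#   local_grep_counts 'dfs[a-z.]+' input/*.xml
```

If the counts from this sketch and from `cat output/*` disagree, the input directory is the first thing to check.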
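Hands-on steps:
- Prepare the input
- Run the provided grep example
- Check the output. (Do not create the output directory by hand; the job creates it while it runs. If the directory already exists, the job fails with org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory file:/opt/module/hadoop-2.7.2/output already exists.) The presence of a _SUCCESS file means the job completed successfully.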
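The two checks described above (the output directory must not exist beforehand, and _SUCCESS marks a finished job) can be wrapped in small shell helpers. The names check_output_dir and job_succeeded are hypothetical and not part of Hadoop; this is a minimal sketch of one way to guard a run, not a required step.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Hadoop.

# check_output_dir DIR
# Succeeds when DIR does not exist yet, so the job can create it itself.
# Fails with a message when DIR is already present, which is the situation
# that triggers FileAlreadyExistsException.
check_output_dir() {
    if [ -e "$1" ]; then
        echo "refusing to run: '$1' already exists" >&2
        return 1
    fi
    return 0
}

# job_succeeded DIR
# Succeeds when DIR contains the _SUCCESS marker file.
job_succeeded() {
    [ -f "$1/_SUCCESS" ]
}

# Example (run from the Hadoop directory):
#   check_output_dir output \
#     && bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar grep input output 'dfs[a-z.]+' \
#     && job_succeeded output \
#     && cat output/*
```

Failing early is a deliberate choice here: deleting a leftover output directory automatically could discard results from an earlier run.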
- Official WordCount example (counts the occurrences of each word)
Hands-on steps:
- Prepare the input
[root@localhost hadoop-2.7.2]# mkdir wcinput
[root@localhost hadoop-2.7.2]# cd wcinput/
[root@localhost wcinput]# touch wc.input
[root@localhost wcinput]# vim wc.input
[root@localhost wcinput]# cat wc.input
Baidu Alibaba ByteDance zhangsan lisi wangwu wangwu Bcxtm Bcxtm Bcxtm
- Run the provided wordcount example
[root@localhost hadoop-2.7.2]# bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount wcinput/ wcoutput
- Check the output
[root@localhost hadoop-2.7.2]# cd wcoutput/
[root@localhost wcoutput]# ll
total 4
-rw-r--r-- 1 root root 65 Jul 5 10:40 part-r-00000
-rw-r--r-- 1 root root 0 Jul 5 10:40 _SUCCESS
[root@localhost wcoutput]# cat part-r-00000
Alibaba 1
Baidu 1
Bcxtm 3
ByteDance 1
lisi 1
wangwu 2
zhangsan 1
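For a file as small as wc.input, the counts in part-r-00000 can be cross-checked locally with standard tools. The function name count_words is hypothetical, and the sketch assumes the job splits words on whitespace and sorts its keys in byte order; under those assumptions it should produce the same word and count pairs as the output above.

```shell
#!/bin/sh
# Hypothetical helper, not part of Hadoop: a local stand-in for the
# wordcount example, useful only for sanity-checking small inputs.

# count_words FILE
# Prints "word<TAB>count" lines, sorted by word in byte order.
count_words() {
    # Put each whitespace-separated word on its own line, drop empty lines,
    # then sort, count duplicates, and print the word before its count.
    tr -s '[:space:]' '\n' < "$1" \
        | grep -v '^$' \
        | LC_ALL=C sort \
        | uniq -c \
        | awk '{ printf "%s\t%s\n", $2, $1 }'
}

# Example (run from the Hadoop directory):
#   count_words wcinput/wc.input
```

Because the sketch splits on any whitespace, it gives the same counts whether the words in wc.input sit on one line or several.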