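This post reviews the second half of Chapter 3 of Hadoop: The Definitive Guide.

The code comes from the book with only minor changes, mainly so that it is easier for me to recall later.

How to develop a Hadoop project in Eclipse is already covered in my earlier word count post: Implementing Hadoop WordCount in Eclipse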

FileSystem's listStatus method reads the metadata of files and directories, and FileUtil.stat2Paths then converts the resulting FileStatus array into a Path array.

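Here the paths are set differently from my earlier posts (although this is actually how the book does it): they are passed as program arguments in the Eclipse Run Configuration. The first argument is also used as the URI from which the FileSystem is obtained.
hdfs://localhost:9000/ and hdfs://localhost:9000/user/wyh/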

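    import java.net.URI;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.FileUtil;
    import org.apache.hadoop.fs.Path;

    public class ListStatus {

      public static void main(String[] args) throws Exception {
        // The first argument doubles as the URI that selects the file system.
        String uri = args[0];
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(uri), conf);

        // Every argument, including the first, is a path to list.
        Path[] paths = new Path[args.length];
        for (int i = 0; i < paths.length; i++) {
          paths[i] = new Path(args[i]);
        }

        // listStatus(Path[]) returns the entries of all given paths in one array.
        FileStatus[] status = fs.listStatus(paths);
        Path[] listedPaths = FileUtil.stat2Paths(status);
        for (Path p : listedPaths) {
          System.out.println(p);
        }
      }
    }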
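To see what the two run-configuration arguments actually contain, here is a small sketch that uses only java.net.URI and no Hadoop classes. The class name UriArgsDemo and its helper methods are made up for illustration. As far as I understand, FileSystem.get picks the file system from the scheme and authority of the URI, so the point of the sketch is that both arguments share the same scheme (hdfs) and authority (localhost:9000) and differ only in the path.

```java
import java.net.URI;

// Hypothetical helper, not part of Hadoop: it only inspects the argument strings.
final class UriArgsDemo {

  // Returns "scheme|host|port|path" for one program argument.
  static String describe(String arg) {
    URI uri = URI.create(arg);
    return uri.getScheme() + "|" + uri.getHost() + "|" + uri.getPort() + "|" + uri.getPath();
  }

  // True when two arguments share the same scheme and authority (host:port).
  static boolean sameFileSystem(String a, String b) {
    URI ua = URI.create(a);
    URI ub = URI.create(b);
    return ua.getScheme().equals(ub.getScheme())
        && ua.getAuthority().equals(ub.getAuthority());
  }

  public static void main(String[] args) {
    String[] demoArgs = {"hdfs://localhost:9000/", "hdfs://localhost:9000/user/wyh/"};
    for (String arg : demoArgs) {
      System.out.println(describe(arg));
    }
    System.out.println(sameFileSystem(demoArgs[0], demoArgs[1]));
  }
}
```

Running it prints hdfs|localhost|9000|/ and hdfs|localhost|9000|/user/wyh/, followed by true, which is consistent with a single FileSystem obtained from the first argument being able to list both paths.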

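The book provides a RegexExcludePathFilter class that implements the PathFilter interface and excludes paths matching a regular expression. I found it interesting, but the book does not include a program that actually exercises the exclusion, so I wrote a TestFilter class here to try it out.

First put 2016.12.01.txt and 2016.12.02.txt into HDFS. With the RegexExcludePathFilter argument removed, both files are listed; run as written, only the path ending in 2016.12.02.txt is printed.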

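    import java.net.URI;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.FileUtil;
    import org.apache.hadoop.fs.Path;

    // Requires the book's RegexExcludePathFilter class in the same package.
    public class TestFilter {
      public static void main(String[] args) throws Exception {
        String uri = "hdfs://localhost:9000/";
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(uri), conf);

        // The glob selects both files; the filter then drops 2016.12.01.txt.
        // Remove the second argument to list both files.
        FileStatus[] status = fs.globStatus(
            new Path("hdfs://localhost:9000/user/wyh/2016.*.*"),
            new RegexExcludePathFilter("hdfs://localhost:9000/user/wyh/2016.12.01.*"));
        Path[] listedPaths = FileUtil.stat2Paths(status);
        for (Path p : listedPaths) {
          System.out.println(p);
        }
      }
    }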
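The glob-then-exclude behaviour can also be tried without a cluster. The sketch below is a local stand-in, not Hadoop code: it swaps in java.nio's glob PathMatcher for globStatus and a plain java.util.regex pattern for the filter, and it assumes the filter rejects a path when the regex matches the whole string, which is how I remember the book's RegexExcludePathFilter working. The class name GlobExcludeDemo, the method select, and the file notes.txt are made up for illustration.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Local stand-in for globStatus(pattern, filter); no Hadoop classes are used.
final class GlobExcludeDemo {

  // Keeps names that match the glob and do NOT fully match the exclude regex.
  static List<String> select(List<String> names, String glob, String excludeRegex) {
    PathMatcher globMatcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
    Pattern exclude = Pattern.compile(excludeRegex);
    List<String> kept = new ArrayList<>();
    for (String name : names) {
      boolean inGlob = globMatcher.matches(Paths.get(name));
      boolean excluded = exclude.matcher(name).matches(); // whole-string match
      if (inGlob && !excluded) {
        kept.add(name);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    List<String> names = Arrays.asList("2016.12.01.txt", "2016.12.02.txt", "notes.txt");
    // The dots in the regex are escaped so they match only a literal '.'.
    System.out.println(select(names, "2016.*.*", "2016\\.12\\.01.*"));
  }
}
```

This prints [2016.12.02.txt]. Two details are worth keeping in mind when going back to TestFilter: a whole-string match means the regex has to cover the full path string, which is why the filter there is given the complete hdfs:// path, and an unescaped dot in a regex matches any character, so 2016.12.01.* is slightly looser than it looks.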

Yuhua Wang

