Two Ways to Read Large Files

Original

2018-09-01 23:50

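When reading a large file, loading the entire contents into memory with ordinary IO can easily trigger an OutOfMemoryError; even when it does not, it consumes a very large amount of memory. Below are two ways to read a large file line by line.

1. Reading a large file with java.util.Scanner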

import java.io.FileInputStream;
import java.io.IOException;
import java.util.Scanner;

public class ScannerTest {

    public static void main(String[] args) {
        new ScannerTest().read();
    }

    public void read() {
        // try-with-resources closes the scanner and the underlying stream, even on error
        try (FileInputStream inputStream = new FileInputStream("D:/test.txt");
             Scanner scanner = new Scanner(inputStream, "utf-8")) {
            while (scanner.hasNextLine()) {
                String line = scanner.nextLine();
                System.out.println(line);
            }
            // Scanner swallows IOExceptions from the underlying stream; surface them here
            if (scanner.ioException() != null) {
                throw scanner.ioException();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

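Advantage: this iterates over every line of the file and lets you process each one without keeping a reference to it afterwards. Only the current line (plus the scanner's internal buffer) is held in memory at any time, which prevents memory overflow.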
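
To illustrate "process each line without keeping a reference to it", here is a minimal sketch that computes two running statistics while holding only one line at a time. The class name LineStats and the method scan are hypothetical names chosen for this example; they are not part of the original post.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

public class LineStats {

    /** Returns {lineCount, longestLineLength}; only the current line is referenced. */
    public static long[] scan(InputStream in) throws IOException {
        long count = 0;
        long longest = 0;
        try (Scanner scanner = new Scanner(in, "UTF-8")) {
            while (scanner.hasNextLine()) {
                String line = scanner.nextLine();
                count++;
                longest = Math.max(longest, line.length());
                // 'line' becomes unreachable on the next iteration and can be collected
            }
            // Scanner hides read errors; rethrow if one occurred
            if (scanner.ioException() != null) {
                throw scanner.ioException();
            }
        }
        return new long[] {count, longest};
    }

    public static void main(String[] args) throws IOException {
        // A small in-memory stream stands in for a large file in this sketch
        InputStream in = new ByteArrayInputStream(
                "a\nbcd\nef\n".getBytes(StandardCharsets.UTF_8));
        long[] stats = scan(in);
        System.out.println(stats[0] + " lines, longest has " + stats[1] + " characters");
    }
}
```

In real use the argument would be a FileInputStream for the large file; memory use should then depend on the longest line rather than on the file size.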

2. Using the Apache Commons IO library

import java.io.File;
import java.io.IOException;
import org.apache.commons.io.FileUtils;
import org.apache.commons.io.LineIterator;

public class ApacheTest {
    public static void main(String[] args) {
        File file = new File("D:/test.txt");
        // LineIterator is Closeable as of Commons IO 2.6, so try-with-resources can close it
        try (LineIterator iterator = FileUtils.lineIterator(file, "utf-8")) {
            while (iterator.hasNext()) {
                String line = iterator.next();
                System.out.println(line);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

Both approaches prevent memory overflow. Files can also be read with NIO; I have not looked into that approach yet and will add it later.
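
For reference, one possible NIO-based version is sketched below. It is not the author's method. It assumes Java 8 or later and uses java.nio.file.Files.lines, which reads lines lazily as the stream is consumed. The class name NioTest and the method countLines are hypothetical, and the path is the same placeholder used in the examples above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class NioTest {

    /** Counts the lines of a file without loading the whole file into memory. */
    public static long countLines(Path path) throws IOException {
        // The stream holds an open file handle, so it must be closed
        try (Stream<String> lines = Files.lines(path, StandardCharsets.UTF_8)) {
            return lines.count();
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder path; pass a different one as the first argument
        Path path = Paths.get(args.length > 0 ? args[0] : "D:/test.txt");
        try (Stream<String> lines = Files.lines(path, StandardCharsets.UTF_8)) {
            lines.forEach(System.out::println);
        }
    }
}
```

Note that a read error that occurs while the stream is being consumed is reported as an UncheckedIOException rather than an IOException.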
