Python File Operations

Reading Files with Built-in Functions

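When reading a file, we usually use read(), readline(), or readlines().
read() loads the entire file into memory at once and returns it as a single string.
readline() reads one line per call.
readlines() also reads the entire file at once, but returns a list with one element per line.
Because read() and readlines() load the whole file into memory at once, a large file, especially one larger than the available memory, can exhaust memory. These two methods are therefore only suitable for small files.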

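def test():
    # read() loads the entire file into one string
    with open("/tmp/test.log", "r") as f:
        print(f.read())
    # readlines() loads the entire file into a list of lines
    with open("/tmp/test.log", "r") as f:
        for line in f.readlines():
            print(line, end="")  # each line already ends with a newline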
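
The snippet above does not show readline(). As a minimal sketch, readline() can be called in a loop until it returns an empty string, which signals the end of the file. The function name count_lines is a hypothetical helper used only for illustration.

```python
def count_lines(file_path):
    """Count the lines in a file by calling readline() repeatedly."""
    count = 0
    with open(file_path, "r") as f:
        while True:
            line = f.readline()  # one line, including its trailing newline
            if not line:         # an empty string means end of file
                break
            count += 1
    return count
```

Only one line is held in memory at a time, so this pattern also works for files that are too large for read() or readlines().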

In practice you may need to read large files of 10 GB or more, such as log files. This calls for a different approach: reading the file with a generator.

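The idea is to split the file into small blocks and process one block at a time, so that the memory for each block can be released once it has been processed. Here yield is used to build a custom iterable, namely a generator: any function that contains yield is a generator function, and calling it returns a generator.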
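
To make the behaviour of yield concrete, here is a small sketch that is unrelated to files. The function count_up_to is a hypothetical example: calling it runs none of its body, and values are only produced when they are requested.

```python
def count_up_to(limit):
    """Hypothetical generator function: yields 1..limit one value at a time."""
    n = 1
    while n <= limit:
        yield n  # execution pauses here until the next value is requested
        n += 1


gen = count_up_to(3)  # nothing in the body has run yet; gen is a generator object
first = next(gen)     # runs the body up to the first yield
rest = list(gen)      # consumes the remaining values
```

After this runs, first is 1 and rest is [2, 3]. The same laziness is what keeps memory use low when a generator yields blocks of a large file.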

def read_in_block(file_path):
    BLOCK_SIZE = 64 * 10000  # number of characters read per call
    with open(file_path, "r") as f:
        while True:
            block = f.read(BLOCK_SIZE)  # read a fixed-size block into memory
            if block:
                yield block
            else:
                return  # an empty string means end of file, so stop

def test():
    file_path = "/tmp/test.log"
    for block in read_in_block(file_path):
        print(block, end="")
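
One thing to keep in mind when reading in fixed-size blocks is that a block boundary can fall in the middle of a line. If whole lines are needed, the incomplete tail of each block can be carried over to the next one. The following is only a sketch of that idea; iter_lines_from_blocks and its block_size parameter are hypothetical names, not part of the original code.

```python
def iter_lines_from_blocks(file_path, block_size=64 * 1024):
    """Yield complete lines while reading the file in fixed-size blocks."""
    leftover = ""
    with open(file_path, "r") as f:
        while True:
            block = f.read(block_size)
            if not block:  # end of file
                break
            parts = (leftover + block).split("\n")
            leftover = parts.pop()  # the last piece may be an incomplete line
            for line in parts:
                yield line
    if leftover:
        yield leftover  # final line that had no trailing newline
```

A deliberately small block_size, such as 4, is handy for checking that lines spanning several blocks are reassembled correctly.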

Using the Iterable File Object Returned by the Built-in open()

def test():
    with open("/tmp/test.log") as f:
        for line in f:
            print(line, end="")  # each line already ends with a newline

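The for line in f idiom treats the file object f as an iterable, and Python handles I/O buffering and memory management automatically. This is the more Pythonic and more concise approach.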
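
As an illustration of how this idiom might be used on a large log file, the sketch below counts the lines that contain a keyword while holding only one line in memory at a time. The function name count_matching_lines and the keyword argument are assumptions made for this example.

```python
def count_matching_lines(file_path, keyword):
    """Count the lines that contain keyword, reading one line at a time."""
    count = 0
    with open(file_path, "r") as f:
        for line in f:  # the file object yields lines lazily
            if keyword in line:
                count += 1
    return count
```

For example, count_matching_lines("/tmp/test.log", "ERROR") would return the number of lines in that file containing the text ERROR.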

Using Pandas

import pandas as pd

def read_data(file_name):
    '''
    file_name: path to the CSV file
    '''
    chunk_size = 1000  # 1,000 rows per chunk
    chunks = []
    # Passing an open file object works even when the path contains Chinese characters
    with open(file_name, 'rb') as inputfile:
        reader = pd.read_csv(inputfile, iterator=True)
        while True:
            try:
                chunks.append(reader.get_chunk(chunk_size))
            except StopIteration:
                print("Iteration is stopped.")
                break
    # Combine all chunks into a single DataFrame
    data = pd.concat(chunks, ignore_index=True)
    return data
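
Note that concatenating every chunk still ends with the whole dataset in memory. When only an aggregate is needed, each chunk can be processed and then discarded instead. The sketch below assumes pandas is installed and uses the chunksize parameter of pd.read_csv, which makes it return an iterator of DataFrames; count_rows_in_chunks is a hypothetical helper name.

```python
import pandas as pd


def count_rows_in_chunks(file_name, chunk_size=1000):
    """Count CSV data rows chunk by chunk, without keeping the chunks."""
    total_rows = 0
    # With chunksize set, read_csv yields one DataFrame per chunk
    for chunk in pd.read_csv(file_name, chunksize=chunk_size):
        total_rows += len(chunk)
    return total_rows
```

The same loop structure works for other per-chunk work, such as filtering rows or accumulating sums, as long as the result kept between iterations stays small.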
