Python Text Processing

Original

2020-02-24 18:19

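1. Opening and closing files

fh = open(filename, mode)  # open a file; whether it is created or truncated depends on mode
fp = file(filename, mode)  # Python 2 only: equivalent to open(); removed in Python 3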

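The mode options are:

r   open for reading only (the default); the file must exist
w   open for writing; existing content is cleared, and the file is created if missing
a   open for appending; existing content is not overwritten
r+  open for reading and writing without clearing the content; the file must exist
w+  open for reading and writing; existing content is cleared
a+  open for reading and appending; content is kept and writes always go to the end of the file
b   binary mode, combined with one of the modes above (for example rb or wb)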
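To make the difference between the modes concrete, here is a minimal sketch that runs each of them against a temporary file. The helper name mode_demo and the file contents are made up for illustration; only open(), tempfile and os from the standard library are used.

```python
import os
import tempfile

def mode_demo():
    """Return the file content observed after using several open() modes."""
    results = {}
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with open(path, "w") as fh:      # 'w' clears the file, then writes
            fh.write("one\n")
        with open(path, "a") as fh:      # 'a' keeps the content, writes at the end
            fh.write("two\n")
        with open(path, "r") as fh:
            results["after_w_then_a"] = fh.read()
        with open(path, "r+") as fh:     # 'r+' does not clear; writing starts at offset 0
            fh.write("ONE")
        with open(path) as fh:
            results["after_r_plus"] = fh.read()
        with open(path, "w+") as fh:     # 'w+' clears first, then allows reading back
            fh.write("fresh\n")
            fh.seek(0)
            results["after_w_plus"] = fh.read()
    finally:
        os.remove(path)
    return results

mode_results = mode_demo()
print(mode_results)
```

Note that r+ overwrites in place: only the first three characters change, and the rest of the file is left alone.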

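2. Reading files

Python provides three methods for reading text:

read()
readline()
readlines()

Each method accepts an optional argument that limits how much data is read per call, but they are usually called without one.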
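As a small illustration of that optional size argument, the sketch below uses io.StringIO as an in-memory stand-in for an open text file. That substitution is an assumption made so the example runs without a file on disk; file objects returned by open() in text mode accept the same arguments.

```python
import io

# In-memory stand-in for a two-line text file.
buf = io.StringIO("firstLine 1\nsecondLine 2\n")

head = buf.read(5)          # at most 5 characters
partial = buf.readline(3)   # at most 3 characters of the current line
rest_of_line = buf.readline()   # the remainder of that line, newline included
remaining = buf.readlines()     # every line that is left, as a list

print(repr(head), repr(partial), repr(rest_of_line), remaining)
```

Each call continues from where the previous one stopped, so the four results together cover the whole content exactly once.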

Suppose a file new.txt exists with the following content:

firstLine 1
secondLine 2

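read()
read() reads the whole file at once and is typically used to load the content into a single string. It is unnecessary for sequential, line-oriented processing, and it is not feasible when the file is larger than available memory.

fh = open('new.txt')
text = fh.read()
print(text)
fh.close()

Result: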

firstLine 1
secondLine 2

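readline()
readline() reads one line per call and keeps that line's trailing newline. It is usually much slower than readlines(), so use it only when there is not enough memory to read the whole file at once.

fh = open('new.txt')
print(fh.readline())
print(fh.readline())
fh.close()

Result (print adds its own newline after the one already in each line, hence the blank lines):

firstLine 1

secondLine 2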

readlines()
readlines() also reads the whole file at once, but unlike read() it splits the content into a list of lines, which can be processed with Python's for ... in ... loop.

fh = open('new.txt')
for line in fh.readlines():
    print(line)
fh.close()

Result:

firstLine 1

secondLine 2
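The blank lines in that output come from the newline each line still carries. A common alternative, sketched here under the assumption that line-by-line processing is all that is needed, is to iterate over the file object directly inside a with block and strip the newline. The helper name read_stripped_lines and the temporary file are illustrative only.

```python
import os
import tempfile

def read_stripped_lines(path):
    """Return the lines of a text file without their trailing newlines."""
    lines = []
    with open(path) as fh:          # the file is closed automatically, even on error
        for line in fh:             # the file object yields one line at a time
            lines.append(line.rstrip("\n"))
    return lines

# Build a throwaway copy of new.txt so the sketch runs anywhere.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as fh:
    fh.write("firstLine 1\nsecondLine 2\n")

demo_lines = read_stripped_lines(demo_path)
os.remove(demo_path)
print(demo_lines)
```

Iterating this way never holds more than one line at a time in the loop, so it also works for files that do not fit in memory.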

3. Writing files

write(str)
Takes a single string and writes the whole string in one call.

fh = open('new.txt', 'a+')
fh.write('thirdLine 3\n')
fh.close()

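writelines(list)
Takes a sequence of strings, such as a list, and writes each item in turn. No newlines are added, so each string must carry its own.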

fh = open('new.txt', 'a+')
fh.writelines(["fourthLine 4\n", "fifthLine 5\n"])
fh.close()
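Because writelines() does not add line endings, a small wrapper can be convenient. The following is one possible sketch, not a standard API: append_lines is a hypothetical helper, and a temporary file stands in for new.txt.

```python
import os
import tempfile

def append_lines(path, lines):
    """Append each string in lines to path, adding the newline writelines() omits."""
    with open(path, "a") as fh:
        fh.writelines(line + "\n" for line in lines)

fd, demo_path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

append_lines(demo_path, ["thirdLine 3"])
append_lines(demo_path, ["fourthLine 4", "fifthLine 5"])

with open(demo_path) as fh:
    written = fh.read()
os.remove(demo_path)
print(written)
```

writelines() accepts any iterable of strings, which is why a generator expression works here as well as a list.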

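4. Python string handling

String length: len(s)

s = 'python string'
len(s)  # 13

Case conversion
- All uppercase: s.upper()
- All lowercase: s.lower()
- Capitalize the first letter of every word: s.title() (s.capitalize() capitalizes only the first letter of the string)
- Swap upper and lower case: s.swapcase()
String search
- Find a substring, returning its index or -1 if it is absent: s.find('st')
- Search between given start and end positions: s.find('st', start, end)
String replacement
- Replace old with new: s.replace('old', 'new')
Stripping whitespace or specified characters
- Strip whitespace on both sides: s.strip()
- Strip whitespace on the left: s.lstrip()
- Strip whitespace on the right: s.rstrip()
- Strip any of the given characters from both sides: s.strip('p')

s = ' python string '
print(s.strip())
print(len(s.strip()))
print(s.lstrip())
print(len(s.lstrip()))
print(s.rstrip())
print(len(s.rstrip()))
print(s.strip(' py'))
print(len(s.strip(' py')))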

Result:

python string
13
python string 
14
 python string
14
thon string
11
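The case, search, and replace methods listed above can be tried in the same way. This is a minimal sketch on the sample value 'python string'; the variable names are arbitrary.

```python
s = "python string"

upper = s.upper()                        # 'PYTHON STRING'
titled = s.title()                       # 'Python String'
capitalized = s.capitalize()             # 'Python string'
swapped = titled.swapcase()              # 'pYTHON sTRING'

first_st = s.find("st")                  # 7: index where 'st' starts
missing = s.find("xyz")                  # -1: substring not present
bounded = s.find("st", 0, 5)             # -1: 'st' does not occur within s[0:5]

replaced = s.replace("string", "text")   # 'python text'
print(upper, titled, capitalized, swapped, first_st, missing, bounded, replaced)
```

All of these methods return new values; strings are immutable, so s itself is never modified.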

Splitting a string into a list
- Split on whitespace: s.split()
- Split on a given separator: s.split('.')

s = 'a bc def'
print(s.split())

s = 'blog.csdn.net/endlch'
print(s.split('.'))

Result:

['a', 'bc', 'def']
['blog', 'csdn', 'net/endlch']
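Splitting is often the first step in parsing lines read from a file. As a hedged sketch, assuming every line has the "label number" shape used in new.txt, a line can be broken into a label and an integer like this. parse_line is a hypothetical helper, not part of the original article.

```python
def parse_line(line):
    """Split a 'label number' line into (label, int(number))."""
    label, number = line.split()     # split() with no argument also drops the trailing newline
    return label, int(number)

parsed = [parse_line(line) for line in ["firstLine 1\n", "secondLine 2\n"]]

# str.join is the inverse of split for a fixed separator.
parts = "blog.csdn.net".split(".")
rejoined = ".".join(parts)
print(parsed, parts, rejoined)
```

A line that does not contain exactly two fields raises ValueError on the unpacking, which is usually preferable to silently accepting malformed input.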

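5. The map function

map applies the given function to every element of the given sequence. Usage:

map(function, sequence)

Here function is a function and sequence is one or more sequences. In Python 2 the result is a list; in Python 3 it is a lazy iterator, so wrap it in list() to see the values.

Example:

list(map(lambda x: x ** 2, [1, 2, 3, 4]))

Result: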

[1, 4, 9, 16]

It can also add two lists element by element:

list(map(lambda x, y: x + y, [1, 2, 3], [4, 5, 6]))

Result:

[5, 7, 9]

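Strings can be converted to floats in the same way. Numbers read from a file always arrive as strings, and map can turn them into numbers:

list(map(float, ['1', '2', '3', '4']))

Result:

[1.0, 2.0, 3.0, 4.0]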
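Putting the file-reading and map ideas together, the sketch below totals each numeric column of some whitespace-separated text. The function name column_sums and the sample data are assumptions for illustration, and io.StringIO again stands in for an open file.

```python
import io

def column_sums(lines):
    """Sum each whitespace-separated numeric column across all non-blank lines."""
    rows = [list(map(float, line.split())) for line in lines if line.strip()]
    # zip(*rows) regroups the values by column; map(sum, ...) totals each column.
    return list(map(sum, zip(*rows)))

sample = io.StringIO("1 2 3\n4 5 6\n")   # stand-in for an open text file
totals = column_sums(sample)
print(totals)
```

The list() calls are what make this behave the same on Python 3, where map and zip are lazy.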

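6. The filter function

filter performs a filtering operation on the given sequence.

filter(function or None, sequence)

Here function is a function and sequence is a sequence. For a list argument, Python 2 returns a list; Python 3 returns a lazy iterator.

filter calls function on every element of sequence and returns only the elements for which the call is true.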

def is_odd(x):
    return x & 1 != 0  # the lowest bit is set only for odd numbers
list(filter(is_odd, [1, 2, 3, 4, 5, 6, 7, 8]))

Output:

[1, 3, 5, 7]
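The signature above also allows None in place of a function, in which case elements that are themselves false (such as empty strings) are dropped. A short sketch, with made-up sample lines, shows this next to the equivalent list comprehension:

```python
lines = ["firstLine 1\n", "\n", "secondLine 2\n", "   \n"]

stripped = [line.strip() for line in lines]
non_blank = list(filter(None, stripped))          # empty strings are false, so they are removed

evens = list(filter(lambda x: x % 2 == 0, range(1, 9)))
evens_comprehension = [x for x in range(1, 9) if x % 2 == 0]   # same result without filter

print(non_blank, evens, evens_comprehension)
```

Many Python programmers prefer the comprehension form when the predicate is a short expression; filter reads well when a named function already exists.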

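7. The reduce function

reduce combines the elements of a sequence cumulatively into a single value.

reduce(function, sequence)

Here function is a function of two arguments and sequence is a sequence. The return value is a single value, a number in the example below.

Example:

from functools import reduce  # required in Python 3, where reduce is no longer a built-in
reduce(lambda x, y: x + y, [1, 2, 3, 4, 5, 6])

Result:

21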
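reduce folds from the left, so the call above evaluates ((((1 + 2) + 3) + 4) + 5) + 6. The sketch below records each intermediate total to make that visible. running_totals is a hypothetical helper, and the third argument to reduce is the optional initial value.

```python
from functools import reduce

def running_totals(values):
    """Return the final sum and every intermediate total reduce produces."""
    steps = []

    def add(acc, x):
        result = acc + x
        steps.append(result)      # remember each partial sum
        return result

    total = reduce(add, values, 0)   # 0 is the initial accumulator
    return total, steps

total, steps = running_totals([1, 2, 3, 4, 5, 6])
print(total, steps)
```

Supplying an initial value also makes reduce safe on an empty sequence, where it would otherwise raise TypeError.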

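8. Converting a numpy matrix to a list

The type of a value can be checked with type().

Some Python functions only accept a list as input, while much numpy code works with the matrix type, so matrix data has to be converted to a list.

import numpy

a = numpy.matrix([1, 2, 3])

b = list(numpy.array(a).reshape(-1))      # list of numpy scalars
c = numpy.array(a).reshape(-1).tolist()   # list of plain Python ints
d = numpy.array(a)[0].tolist()            # first row as a list of plain Python ints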

Here reshape changes the dimensions of the matrix, and -1 means that dimension is inferred automatically.

Example:

A = numpy.matrix('1, 2; 3, 4')
B = A.reshape(1, 4)
C = A.reshape(1, -1)

Result:

A = [[1, 2]
     [3, 4]]
B = [[1, 2, 3, 4]]
C = [[1, 2, 3, 4]]
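The conversions can be checked with type(), as mentioned at the start of this section. This is a minimal sketch assuming numpy is installed; the variable names are arbitrary.

```python
import numpy

a = numpy.matrix('1, 2; 3, 4')

flat = numpy.asarray(a).reshape(-1).tolist()   # flattened: a list of plain Python ints
nested = a.tolist()                            # keeps the 2-D structure as a list of lists
row_vector = a.reshape(1, -1)                  # -1 lets numpy infer the 4 columns

# Names of the types involved: the matrix, the converted list, and one of its elements.
kinds = (type(a).__name__, type(flat).__name__, type(flat[0]).__name__)
print(flat, nested, row_vector.shape, kinds)
```

tolist() is what converts the elements to built-in Python numbers; wrapping an array in list() only converts the outer container and leaves numpy scalars inside.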

