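"Python Data Analysis and Presentation": NumPy Study Notes 02

File I/O in NumPy

I. Reading and Writing Data as CSV Files

CSV (Comma-Separated Values)

CSV is a common file format for storing batches of data.

1) Saving to CSV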

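np.savetxt(fname, X, fmt='%.18e', delimiter=' ')

• fname : a file name or file handle; a name ending in .gz is written as a gzip-compressed file

• X : the array to write to the file

• fmt : the format used for each value, for example %d, %.2f, %.18e

• delimiter : the string separating columns; the default is a single space

Example: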

In [2]: import numpy as np

In [3]: a=np.arange(100).reshape(5,20)

In [4]: np.savetxt('a.csv',a,fmt='%d',delimiter=',')

Open the saved file in a text editor to view the data:

0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,……,98,99

Example:

In [6]: a=np.arange(100).reshape(5,20)

In [7]: np.savetxt('a.csv',a,fmt='%.1f',delimiter=',')

Open the saved file in a text editor to view the data:

0.0,1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,……,97.0,98.0,99.0

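2) Reading from CSV

np.loadtxt(fname, dtype=float, delimiter=None, unpack=False)

• fname : a file, file name, or generator; may be a .gz or .bz2 compressed file

• dtype : data type of the resulting array, optional

• delimiter : the string separating values; the default is any whitespace

• unpack : if True, the result is transposed so that each column can be assigned to a separate variable

Example: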

In [8]: b=np.loadtxt('a.csv',delimiter=',')

In [9]: b
Out[9]:
array([[ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12.,
        13., 14., 15., 16., 17., 18., 19.],
       [20., 21., 22., 23., 24., 25., 26., 27., 28., 29., 30., 31., 32.,
        33., 34., 35., 36., 37., 38., 39.],
       [40., 41., 42., 43., 44., 45., 46., 47., 48., 49., 50., 51., 52.,
        53., 54., 55., 56., 57., 58., 59.],
       [60., 61., 62., 63., 64., 65., 66., 67., 68., 69., 70., 71., 72.,
        73., 74., 75., 76., 77., 78., 79.],
       [80., 81., 82., 83., 84., 85., 86., 87., 88., 89., 90., 91., 92.,
        93., 94., 95., 96., 97., 98., 99.]])

In [10]: b=np.loadtxt('a.csv',dtype=int,delimiter=',')

In [11]: b
Out[11]:
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15,
        16, 17, 18, 19],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35,
        36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55,
        56, 57, 58, 59],
       [60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75,
        76, 77, 78, 79],
       [80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95,
        96, 97, 98, 99]])
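The unpack parameter is not exercised in the session above. The following is a minimal sketch of how it might be used; the file name xy.csv and the small table are made up for illustration, and the file is written to a temporary directory.

```python
import os
import tempfile

import numpy as np

# A small 3x2 table written to a temporary CSV file (hypothetical name).
table = np.array([[1, 10], [2, 20], [3, 30]])
path = os.path.join(tempfile.mkdtemp(), "xy.csv")
np.savetxt(path, table, fmt="%d", delimiter=",")

# unpack=True transposes the result, so each column lands in its own variable.
x, y = np.loadtxt(path, delimiter=",", unpack=True)
print(x)  # values of the first column
print(y)  # values of the second column
```

Without unpack=True the same call returns a single 3x2 array.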

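3) Limitations of CSV files

CSV can effectively store only one- and two-dimensional arrays, so np.savetxt() and np.loadtxt() can only write and read one- and two-dimensional arrays.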
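To make the limitation concrete, here is a small sketch. It assumes a reasonably recent NumPy, where np.savetxt raises ValueError for arrays with more than two dimensions. The reshape workaround is only one possible approach, and the original shape has to be remembered separately.

```python
import os
import tempfile

import numpy as np

a3d = np.arange(24).reshape(2, 3, 4)
path = os.path.join(tempfile.mkdtemp(), "a3d.csv")  # hypothetical file name

# Saving the 3-D array directly is rejected.
try:
    np.savetxt(path, a3d, fmt="%d", delimiter=",")
    saved_directly = True
except ValueError:
    saved_directly = False

# Workaround: collapse the leading axes into rows, then restore the shape by hand.
np.savetxt(path, a3d.reshape(-1, a3d.shape[-1]), fmt="%d", delimiter=",")
restored = np.loadtxt(path, dtype=int, delimiter=",").reshape(a3d.shape)

print(saved_directly)
print(np.array_equal(restored, a3d))
```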

II. Reading and Writing Multidimensional Data

1) Saving multidimensional data

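a.tofile(fid, sep='', format='%s')

• fid : an open file object or a file name string

• sep : the separator between items for text output; if it is an empty string, the file is written in binary

• format : the format string used for text output

Example: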

In [12]: a=np.arange(100).reshape(5,10,2)

In [13]: a.tofile("b.dat",sep=",",format='%d')

Open b.dat in PyCharm:

0,1,2,3,4,5,6,7,8,9,10,11,12,13,……,96,97,98,99

Example:

In [14]: a=np.arange(100).reshape(5,10,2)

In [15]: a.tofile("b.dat",format='%d')

Open b.dat in Notepad (the file is binary, so the content is not readable as text):

……a b c

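2) Reading multidimensional data

np.fromfile(file, dtype=float, count=-1, sep='')

• file : an open file object or a file name string

• dtype : the data type of the elements to read

• count : the number of elements to read; -1 reads the whole file

• sep : the separator between items for text input; if it is an empty string, the file is read as binary

Example: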

In [19]: c=np.fromfile("b.dat",dtype=int,sep=",")

In [20]: c
Out[20]:
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
       ……
       85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99])
In [21]: c=np.fromfile("b.dat",dtype=int,sep=",").reshape(5,10,2)

In [22]: c
Out[22]:
array([[[ 0,  1],
        [ 2,  3],
        [ 4,  5],
        [ 6,  7],
        [ 8,  9],
        ……
        ……
        [96, 97],
        [98, 99]]])

Example:

In [23]: a=np.arange(100).reshape(5,10,2)

In [24]: a.tofile("b.dat",format='%d')

In [25]: c=np.fromfile("b.dat",dtype=int).reshape(5,10,2)

In [26]: c
Out[26]:
array([[[ 0,  1],
        [ 2,  3],
        [ 4,  5],
        ……
        [94, 95],
        [96, 97],
        [98, 99]]])

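3) Notes

This method requires that, when reading, you already know the shape and element type of the array that was written. a.tofile() and np.fromfile() therefore have to be used together. The extra information can be stored in a separate metadata file.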
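One way the metadata file could look is sketched below. The JSON sidecar layout (keys shape and dtype) and the file names are assumptions made for this illustration, not a NumPy convention.

```python
import json
import os
import tempfile

import numpy as np

a = np.arange(100).reshape(5, 10, 2)
workdir = tempfile.mkdtemp()
data_path = os.path.join(workdir, "b.dat")
meta_path = os.path.join(workdir, "b.json")  # hypothetical sidecar file

# Write the raw binary data plus the information tofile() does not keep.
a.tofile(data_path)
with open(meta_path, "w") as f:
    json.dump({"shape": a.shape, "dtype": str(a.dtype)}, f)

# Read the metadata first, then use it to interpret the raw bytes.
with open(meta_path) as f:
    meta = json.load(f)
c = np.fromfile(data_path, dtype=np.dtype(meta["dtype"])).reshape(meta["shape"])

print(np.array_equal(a, c))
```

Storing the dtype as a string keeps the sidecar readable, but it does not record byte order for files moved between machines.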

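III. NumPy's Convenient File I/O

np.save(fname, array) or np.savez(fname, array)

• fname : file name; np.save uses the .npy extension and np.savez the .npz archive extension (np.savez stores the archive uncompressed, np.savez_compressed writes a compressed one)

• array : the array variable

np.load(fname)

• fname : file name, with the .npy or .npz extension

Example: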

In [27]: a=np.arange(100).reshape(5,10,2)

In [28]: np.save("a.npy",a)

In [29]: b=np.load("a.npy")

In [30]: b
Out[30]:
array([[[ 0,  1],
        [ 2,  3],
        [ 4,  5],
        ……
        [94, 95],
        [96, 97],
        [98, 99]]])
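The session above only covers np.save. As a complement, this sketch stores two arrays in one .npz archive with np.savez and reads them back by name. The keyword names a and w and the file name are arbitrary choices for the example.

```python
import os
import tempfile

import numpy as np

a = np.arange(100).reshape(5, 10, 2)
w = np.linspace(0.0, 1.0, 5)
path = os.path.join(tempfile.mkdtemp(), "arrays.npz")  # hypothetical file name

# Keyword arguments become the names under which the arrays are stored.
np.savez(path, a=a, w=w)

# np.load on an .npz file returns a dict-like object; shape and dtype are preserved.
with np.load(path) as data:
    names = sorted(data.files)
    a2 = data["a"]
    w2 = data["w"]

print(names)
print(a2.shape, a2.dtype == a.dtype)
```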

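NumPy Functions

I. NumPy's Random Number Functions

NumPy's random sublibrary

np.random random number functions (1)

Function	Description
rand(d0,d1,..,dn)	Creates an array of shape (d0, ..., dn) of random floats, uniformly distributed over [0,1)
randn(d0,d1,..,dn)	Creates an array of shape (d0, ..., dn) of random floats drawn from the standard normal distribution
randint(low[,high,size])	Creates a random integer or an integer array of shape size, with values in [low, high)
seed(s)	Seeds the random number generator; s is the given seed value

Example: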

In [31]: a=np.random.rand(3,4,5)

In [32]: a
Out[32]:
array([[[8.83299431e-03, 8.05387769e-01, 3.86462185e-01, 6.46913067e-02,
         9.93829691e-01],
        [4.53347896e-01, 4.22900051e-01, 3.92576772e-01, 8.58568265e-01,
         1.20287066e-01],
        ……
        [5.00080272e-01, 4.62143545e-01, 4.03942908e-01, 4.26192622e-01,
         6.77575819e-01]]])

In [33]: sn=np.random.randn(3,4,5)

In [34]: sn
Out[34]:
array([[[-2.57248464e+00,  1.03484560e+00,  8.59184836e-01,
         -4.11266639e-01,  8.70097584e-01],
        [-9.22467511e-01,  6.15670993e-01,  1.17973012e-02,
         -2.00623178e+00, -7.09408579e-01],
        ……
        [-2.85435832e-01, -6.54918725e-01,  5.90400730e-01,
          2.68995787e-01,  4.09511541e-04]]])

In [35]: b=np.random.randint(100,200,(3,4))

In [36]: b
Out[36]:
array([[126, 199, 127, 175],
       [163, 136, 106, 192],
       [164, 193, 110, 113]])

In [37]: np.random.seed(10)

In [38]: np.random.randint(100,200,(3,4))
Out[38]:
array([[109, 115, 164, 128],
       [189, 193, 129, 108],
       [173, 100, 140, 136]])
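The point of seed is reproducibility: resetting the same seed replays the same sequence. A short sketch using the same calls as above:

```python
import numpy as np

np.random.seed(10)
first = np.random.randint(100, 200, (3, 4))

np.random.seed(10)  # reset the generator to the same seed
second = np.random.randint(100, 200, (3, 4))

# Both draws come from the same starting state, so they are identical.
print(np.array_equal(first, second))  # True
```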

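np.random random number functions (2)

Function	Description
shuffle(a)	Randomly reorders array a along its first axis, modifying a in place
permutation(a)	Returns a new array shuffled along the first axis of a, leaving a unchanged
choice(a[,size,replace,p])	Draws elements from the one-dimensional array a with probabilities p to form a new array of shape size; replace says whether an element may be drawn more than once, default True

Example: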

In [39]: b=np.random.randint(100,200,(3,4))

In [40]: b
Out[40]:
array([[116, 111, 154, 188],
       [162, 133, 172, 178],
       [149, 151, 154, 177]])

In [43]: np.random.shuffle(b)

In [44]: b
Out[44]:
array([[149, 151, 154, 177],
       [116, 111, 154, 188],
       [162, 133, 172, 178]])

In [47]: np.random.permutation(b)
Out[47]:
array([[149, 151, 154, 177],
       [162, 133, 172, 178],
       [116, 111, 154, 188]])

In [50]: b=np.random.randint(100,200,(8,))

In [51]: b
Out[51]: array([130, 189, 112, 165, 131, 157, 136, 127])

In [52]: np.random.choice(b,(3,2))
Out[52]:
array([[112, 157],
       [157, 136],
       [127, 136]])

In [53]: np.random.choice(b,(3,2),replace=False)
Out[53]:
array([[136, 157],
       [189, 127],
       [112, 131]])

In [54]: np.random.choice(b,(3,2),p=b/np.sum(b))
Out[54]:
array([[112, 165],
       [130, 136],
       [165, 189]])
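The differences between the three functions can be checked directly. This is a minimal sketch with an arbitrary seed and a small made-up array, not a rerun of the session above.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only to make the run repeatable
b = np.arange(12).reshape(3, 4)
original = b.copy()

# permutation returns a new array and leaves b as it was.
p = np.random.permutation(b)
unchanged = np.array_equal(b, original)

# shuffle reorders the rows of b itself and returns None.
result = np.random.shuffle(b)

# Only whole rows move: the set of rows is the same before and after.
same_rows = sorted(b.tolist()) == sorted(original.tolist())

# With replace=False every drawn element is distinct.
picked = np.random.choice(np.arange(8), (3, 2), replace=False)

print(unchanged, result, same_rows)
print(picked)
```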

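np.random random number functions (3)

Function	Description
uniform(low,high,size)	Creates an array drawn from a uniform distribution; low is the start value, high the end value, size the shape
normal(loc,scale,size)	Creates an array drawn from a normal distribution; loc is the mean, scale the standard deviation, size the shape
poisson(lam,size)	Creates an array drawn from a Poisson distribution; lam is the rate at which random events occur, size the shape

Example: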

In [56]: u
Out[56]:
array([[7.31734625, 1.38782465, 7.66880049, 8.31989768],
       [3.09778055, 5.9758229 , 8.7239246 , 9.83020867],
       [4.67403279, 8.75744495, 2.96068699, 1.31291053]])

In [57]: u=np.random.normal(10,5,(3,4))

In [58]: u
Out[58]:
array([[12.22662292, 14.79969824,  3.2682308 , 20.09449687],
       [15.41188173,  1.07141088,  8.63831508,  8.54815806],
       [18.28073583,  3.82715648,  4.71157638, 10.69206992]])
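For reference, a sketch that calls all three distribution functions from the table. The parameter values (bounds 0 and 10, mean 10, standard deviation 5, rate 2.0) and the seed are illustrative choices and are not claimed to be the ones that produced the outputs above.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed

u = np.random.uniform(0, 10, (3, 4))   # floats in [0, 10)
n = np.random.normal(10, 5, (3, 4))    # mean 10, standard deviation 5
k = np.random.poisson(2.0, (3, 4))     # non-negative integer counts, rate 2.0

print(u.shape, n.shape, k.shape)
print(k.dtype)
```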

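II. NumPy's Statistical Functions

Statistical functions provided directly by NumPy, called as np.*

NumPy statistical functions (1)

Function	Description (axis=None is a standard parameter of the statistical functions)
sum(a, axis=None)	Sum of the elements of array a along the given axis; axis is an int or a tuple of ints
mean(a, axis=None)	Arithmetic mean of the elements of array a along the given axis; axis is an int or a tuple of ints
average(a,axis=None,weights=None)	Weighted average of the elements of array a along the given axis
std(a, axis=None)	Standard deviation of the elements of array a along the given axis
var(a, axis=None)	Variance of the elements of array a along the given axis

Example: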

In [59]: a=np.arange(15).reshape(3,5)

In [60]: a
Out[60]:
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [61]: np.sum(a)
Out[61]: 105

In [62]: np.mean(a,axis=1)
Out[62]: array([ 2.,  7., 12.])

In [63]: np.mean(a,axis=0)
Out[63]: array([5., 6., 7., 8., 9.])

In [65]: np.average(a,axis=0,weights=[10,5,1])
Out[65]: array([2.1875, 3.1875, 4.1875, 5.1875, 6.1875])

In [66]: np.std(a)
Out[66]: 4.320493798938574

In [67]: np.var(a)
Out[67]: 18.666666666666668
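The weighted average in In [65] can be reproduced by hand, which shows what weights does along axis 0: each row is multiplied by its weight, the rows are summed, and the result is divided by the sum of the weights. A small check, also confirming that std is the square root of var:

```python
import numpy as np

a = np.arange(15).reshape(3, 5)
w = np.array([10, 5, 1])

weighted = np.average(a, axis=0, weights=w)

# Same computation written out: one weight per row, normalised by the weight sum.
by_hand = (a * w[:, None]).sum(axis=0) / w.sum()

print(weighted)
print(np.allclose(weighted, by_hand))
print(np.isclose(np.std(a) ** 2, np.var(a)))
```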

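NumPy statistical functions (2)

Function	Description
min(a) max(a)	Minimum and maximum of the elements of array a
argmin(a) argmax(a)	Index of the minimum and maximum element of array a in the flattened (one-dimensional) array
unravel_index(index, shape)	Converts the flat index into a multidimensional index for the given shape
ptp(a)	Difference between the maximum and minimum of the elements of array a
median(a)	Median of the elements of array a

Example: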

In [68]: b=np.arange(15,0,-1).reshape(3,5)

In [69]: b
Out[69]:
array([[15, 14, 13, 12, 11],
       [10,  9,  8,  7,  6],
       [ 5,  4,  3,  2,  1]])

In [70]: np.max(b)
Out[70]: 15

In [71]: np.argmax(b)
Out[71]: 0

In [72]: np.unravel_index(np.argmax(b),b.shape)
Out[72]: (0, 0)

In [73]: np.ptp(b)
Out[73]: 14

In [74]: np.median(b)
Out[74]: 8.0
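The session shows argmax, whose flat index happens to be 0. The same pattern with argmin makes the conversion from a flat index to a (row, column) pair more visible. A short sketch on the same array b:

```python
import numpy as np

b = np.arange(15, 0, -1).reshape(3, 5)

flat_min = int(np.argmin(b))                    # index into the flattened array
row, col = np.unravel_index(flat_min, b.shape)  # matching 2-D position
row, col = int(row), int(col)

print(flat_min, (row, col), b[row, col])
```

The smallest value 1 is the last element, so the flat index is 14 and the position is the last row and last column.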

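III. NumPy's Gradient Function

Function	Description
np.gradient(f)	Computes the gradient of the elements of array f; when f is multidimensional, returns the gradient along each dimension

Gradient: the rate of change between consecutive values, i.e. the slope. Given the Y values a, b, c at three consecutive X coordinates with spacing 1, the gradient at the interior point b is (c-a)/2. At the two ends a one-sided difference is used, so the gradient at a is (b-a)/1 and the gradient at c is (c-b)/1.

Example: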

In [75]: a=np.random.randint(0,20,(5))

In [76]: a
Out[76]: array([16, 18, 16, 14, 19])

In [77]: np.gradient(a)
Out[77]: array([ 2. ,  0. , -2. ,  1.5,  5. ])

In [78]: b=np.random.randint(0,20,(5))

In [79]: b
Out[79]: array([15, 19, 17, 18, 14])

In [80]: np.gradient(b)
Out[80]: array([ 4. ,  1. , -0.5, -1.5, -4. ])

In [81]: c=np.random.randint(0,50,(3,5))

In [82]: c
Out[82]:
array([[48, 42, 17, 32, 17],
       [41, 16, 41, 26, 12],
       [30, 17, 17, 16,  0]])

In [84]: np.gradient(c)
Out[84]:
[array([[ -7. , -26. ,  24. ,  -6. ,  -5. ],
        [ -9. , -12.5,   0. ,  -8. ,  -8.5],
        [-11. ,   1. , -24. , -10. , -12. ]]),
 array([[ -6. , -15.5,  -5. ,   0. , -15. ],
        [-25. ,   0. ,   5. , -14.5, -14. ],
        [-13. ,  -6.5,  -0.5,  -8.5, -16. ]])]
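The rule described above can be written out by hand and compared with np.gradient. The helper gradient_1d is a hypothetical name introduced for this sketch; it assumes unit spacing and at least two elements, which matches the default behaviour used in the session.

```python
import numpy as np

def gradient_1d(y):
    """Unit-spacing gradient: central differences inside, one-sided at the ends."""
    y = np.asarray(y, dtype=float)
    g = np.empty_like(y)
    g[0] = y[1] - y[0]                # first element: (b - a) / 1
    g[-1] = y[-1] - y[-2]             # last element:  (c - b) / 1
    g[1:-1] = (y[2:] - y[:-2]) / 2    # interior:      (c - a) / 2
    return g

a = np.array([16, 18, 16, 14, 19])    # the array from In [76]
manual = gradient_1d(a)

print(manual)
print(np.allclose(manual, np.gradient(a)))
```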
