I. ndarray basics (2-2)

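ndarray.ndim: the number of axes (dimensions) of the array. In the Python world, the number of dimensions is referred to as rank.
ndarray.shape: the dimensions of the array. This is a tuple of integers giving the size of the array along each dimension. For a matrix with n rows and m columns, shape is (n, m); for example, a matrix with 2 rows and 3 columns has shape (2, 3). The length of this tuple is therefore the rank, i.e. the number of dimensions, ndim.
ndarray.size: the total number of elements in the array, equal to the product of the elements of shape.
ndarray.dtype: an object describing the type of the elements in the array. A dtype can be created or specified using standard Python types; NumPy also provides its own data types.
ndarray.itemsize: the size in bytes of each element of the array. For example, an array of float64 elements has itemsize 8 (=64/8), and an array of int32 elements has itemsize 4 (=32/8).
ndarray.data: the buffer containing the actual elements of the array. Normally we do not need this attribute, because we access the elements through indexing.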
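
As a quick check of the attributes listed above, the following minimal sketch builds a small array and prints each one. The name `a` and its contents are illustrative and do not come from the original notes.

```python
import numpy as np

# A 2-row, 3-column array of 64-bit floats (illustrative data).
a = np.arange(6, dtype=np.float64).reshape(2, 3)

print(a.ndim)      # 2: two axes
print(a.shape)     # (2, 3): 2 rows, 3 columns
print(a.size)      # 6: 2 * 3 elements
print(a.dtype)     # float64
print(a.itemsize)  # 8: 64 bits / 8 bits per byte
```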

slice

reshape

x = np.arange(0, 9)     >> array([0, 1, 2, 3, 4, 5, 6, 7, 8])
y = x.reshape(3, 3)
   
>>
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])


3, back to the original 1-D shape
y.reshape(9)         >> array([0, 1, 2, 3, 4, 5, 6, 7, 8])

4, modify
flattened = y.reshape(9)
flattened[0] = 1000  >> array([1000,    1,    2,    3,    4,    5,    6,    7,    8])
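
One point worth checking when modifying a reshaped array: for a contiguous array like this one, reshape returns a view rather than a copy, so writing through `flattened` also changes `x` and `y`. The sketch below illustrates this; the name `independent` is hypothetical and only shows how an explicit copy avoids the effect.

```python
import numpy as np

x = np.arange(0, 9)
y = x.reshape(3, 3)          # 3x3 view of x
flattened = y.reshape(9)     # 1-D view of the same data

flattened[0] = 1000          # write through the view

print(np.shares_memory(x, flattened))  # True: no copy was made
print(x[0], y[0, 0])                   # 1000 1000

independent = y.reshape(9).copy()      # explicit copy, detached from x
independent[1] = -1
print(x[1])                            # 1: x is unchanged
```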

5, concatenate
np.hstack((a, b))    # a: the 3x3 array above; b: a 3x1 column [[b1], [b2], [b3]]

>>
array([[0, 1, 2, b1],
       [3, 4, 5, b2],
       [6, 7, 8, b3]])
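
A concrete version of the concatenation above might look like the following sketch. The values in `b` and `row` are made up for illustration; `np.vstack` is shown alongside as the row-wise counterpart.

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
b = np.array([[10], [20], [30]])     # hypothetical 3x1 column
row = np.array([[10, 20, 30]])       # hypothetical 1x3 row

h = np.hstack((a, b))                # append b as a fourth column
v = np.vstack((a, row))              # append row as a fourth row

print(h)
print(h.shape)   # (3, 4)
print(v.shape)   # (4, 3)
```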

Universal functions

1, mean, standard deviation, variance

a = np.arange(1,10)
a.mean(), a.std(), a.var( )   >>(5.0, 2.5819888974716112, 6.666666666666667)


2, sum, product
a.sum(), a.prod()             >>(45, 362880)

3, cumulative sum, cumulative product
a.cumsum(), a.cumprod()

>>
(array([ 1,  3,  6, 10, 15, 21, 28, 36, 45], dtype=int32),
 array([     1,      2,      6,     24,    120,    720,   5040,  40320,
        362880], dtype=int32))

4, all and some: all(), any()
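
The notes give no example for all and any, so here is a minimal sketch using the same array `a` as above. The comparison thresholds are chosen only for illustration.

```python
import numpy as np

a = np.arange(1, 10)   # 1, 2, ..., 9

print((a > 0).all())   # True: every element is positive
print((a > 5).all())   # False: 1..5 fail the test
print((a > 5).any())   # True: 6..9 pass the test
print((a > 9).any())   # False: no element exceeds 9
```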

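II. Series

One of the two basic data structures in pandas; it can be thought of as a one-dimensional labeled array.

A Series can hold data of any type (integers, strings, floats, Python objects, etc.).
Within a single Series the data share one dtype (homogeneous); mixed values are stored with dtype object.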

import numpy as np
import pandas as pd

Creating a Series

Here data can be a:

- list
- array
- dictionary

1, create
s = pd.Series(data, index=index)
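
To make the three kinds of data concrete, the sketch below creates one Series from each. The labels and values are hypothetical.

```python
import numpy as np
import pandas as pd

labels = ['a', 'b', 'c']   # hypothetical index labels

from_list = pd.Series([1, 2, 3], index=labels)
from_array = pd.Series(np.array([1.5, 2.5, 3.5]), index=labels)
from_dict = pd.Series({'a': 1, 'b': 2, 'c': 3})   # dict keys become the index

print(from_list)
print(from_array.dtype)         # float64
print(list(from_dict.index))    # ['a', 'b', 'c']
```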

--------------------------------------------
2, set the name with name=
price = pd.Series([15280,45888,15692,55689,28410,27566],name="price")
        
price   >>
0    15280
1    45888
2    15692
3    55689
4    28410
5    27566
Name: price, dtype: int64
----------------------------------------------
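3, from a dict (temp), sorted alphabetically by label

temp = {'Mon': 33, 'Tue': 19, 'Wed': 15, 'Thu': 89, 'Fri': 11, 'Sat': -5, 'Sun': 9}
pd.Series(temp).sort_index()   # older pandas sorted dict keys automatically; newer versions keep insertion order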

>>
Fri    11
Mon    33
Sat    -5
Sun     9
Thu    89
Tue    19
Wed    15
dtype: int64
-------------------------------------------
4, view the first few values with head(2), the last few with tail(3)

price.head(2)

>>
0    15280
1    45888
Name: price, dtype: int64

price.tail(3)   >> the last three values
----------------------------------
5, list all attributes and methods
print(dir(price))

index

1, access data with []
2, string labels

price = pd.Series([15280, 45888, 15692], index=['wh', 'sh', 'hz'], dtype=float)
>>
wh    15280.0
sh    45888.0
hz    15692.0
dtype: float64
--------------------------
3, look up data: show the first column (labels), show the second column (values)
price.index[0]    >> 'wh'

price.loc['wh']   >> 15280.0
-----------------
4, slice by position with iloc[]

price.iloc[0:3]
----------------------------
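
The difference between label-based and position-based access is easy to mix up, so the sketch below puts the two side by side on a small Series with string labels. One detail worth noting is that a position slice excludes its end, while a label slice includes it.

```python
import pandas as pd

price = pd.Series([15280, 45888, 15692], index=['wh', 'sh', 'hz'], dtype=float)

print(price['sh'])           # 45888.0, label lookup with []
print(price.loc['sh'])       # 45888.0, explicit label lookup
print(price.iloc[1])         # 45888.0, explicit positional lookup

print(price.iloc[0:2])       # positions 0 and 1; the end position is excluded
print(price.loc['wh':'sh'])  # labels 'wh' through 'sh'; the end label is included
```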

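5, modify and delete values in a Series
price

-----(modify)
price.iloc[0] = new_value   # new_value: the value to assign at position 0
price['cd'] = 9500          # assigning to a new label adds an entry

-----(add)
price = pd.concat([price, pd.Series([10000], index=['name'])])   # Series.append was removed in pandas 2.0

>> after adding
wh      15280.0
sh      45888.0
name    10000.0

-----(delete)
del price['name']

>> after deleting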
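
A runnable walk-through of modify, add, and delete could look like this sketch. The replacement value 15000 and the label 'name' are made up for illustration; `pd.concat` is used because `Series.append` is not available in recent pandas versions.

```python
import pandas as pd

price = pd.Series([15280, 45888, 15692], index=['wh', 'sh', 'hz'], dtype=float)

price.iloc[0] = 15000                   # modify by position (15000 is a made-up value)
price['cd'] = 9500                      # a new label adds an entry

extra = pd.Series([10000], index=['name'], dtype=float)
price = pd.concat([price, extra])       # concat returns a new Series

print(list(price.index))                # ['wh', 'sh', 'hz', 'cd', 'name']

del price['name']                       # delete in place by label
price = price.drop('cd')                # drop returns a new Series without 'cd'

print(list(price.index))                # ['wh', 'sh', 'hz']
```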

Statistics

1, minimum, maximum, median

price.min()
price.max()
price.median()

2, show all summary statistics
price.describe()
>>
count        6.000000
mean     29461.666667
std      20585.747940
min       9500.000000
25%      15383.000000
50%      22051.000000
75%      41518.500000
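
The sketch below applies these statistics to the six-value price Series defined earlier in these notes. Its numbers therefore differ from the describe() output shown above, which came from a modified Series.

```python
import pandas as pd

price = pd.Series([15280, 45888, 15692, 55689, 28410, 27566], name="price")

print(price.min())      # 15280
print(price.max())      # 55689
print(price.median())   # 27988.0, the mean of the two middle values

summary = price.describe()      # a Series indexed by statistic name
print(summary['count'])         # 6.0
print(summary['50%'])           # 27988.0, same as the median
```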

III. DataFrame (2-4)

import numpy as np
import pandas as pd

Creating

1, create from np.array, dict

df1 = pd.DataFrame(np.array([[10, 20], [30, 40]]))
>>

    0   1
0  10  20
1  30  40

df2 = pd.DataFrame([pd.Series(np.arange(1, 8)), pd.Series(np.arange(11, 18))])
>>

    0   1   2   3   4   5   6
0   1   2   3   4   5   6   7
1  11  12  13  14  15  16  17
--------------------------------------
2, set row and column names

df3 = pd.DataFrame(np.array([[10, 20], [30, 40]]), index=['a', 'b'], columns=['c1', 'c2'])
>>

   c1  c2
a  10  20
b  30  40
---------------------------------
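
The heading of item 1 mentions dict as a source, but only arrays are shown. A minimal sketch of the dict route, with made-up data that mirrors df3, might be:

```python
import pandas as pd

# Hypothetical data: each dict key becomes a column name.
data = {'c1': [10, 30], 'c2': [20, 40]}
df = pd.DataFrame(data, index=['a', 'b'])

print(df)
print(df.shape)            # (2, 2)
print(list(df.columns))    # ['c1', 'c2']
```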
3, inspect with info()

df3.info()  >> summary of df3: index, columns, dtypes, non-null counts, memory usage
-----------------------------------
4, read a CSV file

eu12 = pd.read_csv('../data/Eueo2012.csv', index_col='Team')
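
Since the CSV file itself is not part of these notes, the sketch below reads a tiny in-memory stand-in to show what index_col does. The team names and numbers are invented.

```python
import io
import pandas as pd

# Made-up stand-in for a CSV file on disk.
csv_text = """Team,Goals,Shots on target
Alpha,4,13
Beta,2,9
"""

df = pd.read_csv(io.StringIO(csv_text), index_col='Team')

print(df)
print(df.index.name)              # Team: the column now serves as the row index
print(df.loc['Alpha', 'Goals'])   # 4
```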

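Viewing and selecting columns with []

1, select columns 1 and 2 by position
eu12.iloc[:, [1, 2]].head()

----------------------------
2, select "column name 1" and "column name 2"
eu12[['Shots on target','Shots off target']].head()

-----------------------
3, select rows: .loc[] by index label, .iloc[] by location (.ix[], which guessed between the two, was removed in pandas 1.0)

eu12.iloc[[2, 4]]   >> the rows at positions 2 and 4

eu12.loc[['name1', 'name2']]

>>
       Goals  Shots on target  Shots off target    ...    ...
name1      4               10                10  50.0%  20.0%
name2      3               22                24  37.9%   6.5%

------------------------------
4, exact lookup: .at[] and .iat[] get the value at a single row and column

eu12.at['row_name2', 'col_name2']   >> 3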
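
The four accessors can be compared on a small table. This is a sketch with a hypothetical DataFrame standing in for eu12; the row labels and the 'Shots' column are invented.

```python
import pandas as pd

# Hypothetical table standing in for eu12.
df = pd.DataFrame(
    {'Goals': [4, 3, 5], 'Shots': [10, 22, 17]},
    index=['name1', 'name2', 'name3'],
)

print(df.loc[['name1', 'name2']])   # rows by label
print(df.iloc[[0, 2]])              # rows by position
print(df.at['name2', 'Goals'])      # 3, one cell by row label and column label
print(df.iat[1, 0])                 # 3, the same cell by integer position
```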

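Modifying and deleting columns

1, rename
eu12_goals.rename(columns={'old_name': 'new_name'})   # returns a new DataFrame

--------------------
2, insert
eu12_goals.insert(3,'on_target_percent',eu12_goals.on_target_ratio*100)

>> general form: eu12_goals.insert(position, 'column_name', values)

--------------------
3, delete

del eu12_goals['column_name']

eu12_goals_cp.pop('column_name')   # removes the column and returns it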
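
The column operations above can be tried end to end on a toy table. In this sketch the data and the column names 'goals' and 'ratio' are illustrative; note which calls work in place and which return a new object.

```python
import pandas as pd

# Hypothetical data; the column names are illustrative only.
df = pd.DataFrame({'goals': [4, 3], 'ratio': [0.5, 0.25]}, index=['t1', 't2'])

df = df.rename(columns={'ratio': 'on_target_ratio'})         # returns a new DataFrame
df.insert(1, 'on_target_percent', df.on_target_ratio * 100)  # in place, at position 1
print(list(df.columns))   # ['goals', 'on_target_percent', 'on_target_ratio']
columns_after_insert = list(df.columns)

popped = df.pop('on_target_ratio')    # removes the column and returns it as a Series
del df['on_target_percent']           # removes the column, returns nothing
print(list(df.columns))   # ['goals']
```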

----------------------

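Adding rows

1, append the rows of one DataFrame to another
df3 = pd.concat([df1, df2])   # DataFrame.append was removed in pandas 2.0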
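
When stacking rows, the original row labels are kept unless told otherwise. The sketch below, using two made-up frames with matching columns, shows the default and the ignore_index=True variant.

```python
import pandas as pd

df1 = pd.DataFrame([[10, 20], [30, 40]], columns=['c1', 'c2'])
df2 = pd.DataFrame([[50, 60]], columns=['c1', 'c2'])

stacked = pd.concat([df1, df2])                        # keeps original row labels
renumbered = pd.concat([df1, df2], ignore_index=True)  # assigns fresh row labels

print(list(stacked.index))      # [0, 1, 0]
print(list(renumbered.index))   # [0, 1, 2]
```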

---------------
2, rows within rows: multi-index
   (1) define the two index levels
arrays = [['one','one','one','two','two','two'],[1,2,3,1,2,3]]
   (2) randn(rows, columns); zip(*arrays) pairs the two lists element by element
df = pd.DataFrame(np.random.randn(6,2),index=pd.MultiIndex.from_tuples(list(zip(*arrays))),columns=['A','B'])

>>

              A         B
one 1 -1.424150 -0.030356
    2  0.389071  1.209949
    3  0.780984  2.399635
two 1 -0.893666  0.703441
    2 -0.110211 -0.916821
    3 -1.176147  0.235822
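
Once a multi-index exists, rows can be selected by the outer label alone or by an (outer, inner) pair. The sketch below uses deterministic values instead of random ones so the result is predictable; the level names 'outer' and 'inner' are illustrative, and `MultiIndex.from_arrays` is used as an alternative to the zip-based construction above.

```python
import numpy as np
import pandas as pd

arrays = [['one', 'one', 'one', 'two', 'two', 'two'], [1, 2, 3, 1, 2, 3]]
index = pd.MultiIndex.from_arrays(arrays, names=['outer', 'inner'])  # level names are illustrative

df = pd.DataFrame(np.arange(12).reshape(6, 2), index=index, columns=['A', 'B'])

print(df.loc['one'])            # the three rows under outer label 'one'
print(df.loc[('two', 2), 'B'])  # 9, a single cell addressed by (outer, inner)
```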

Summary statistics (mean, sum, ...)

1, mean

  mean of each column :>> one_mon_hist.mean()
  mean of each row    :>> one_mon_hist.mean(axis=1) # row

2, median
one_mon_hist.median()

3, variance
one_mon_hist.var()

4, minimum
one_mon_hist[['col1', 'col2']].min()

5, maximum
one_mon_hist[['col1', 'col2']].max()

6, find the row that holds the minimum
one_mon_hist[['col1', 'col2']].idxmin()

7, find the row that holds the maximum
one_mon_hist[['col1', 'col2']].idxmax()
---------------------------------------------------
8, show all summary statistics
one_mon_hist.describe()
>>

            MSFT        AAPL
count  22.000000   22.000000
mean   47.493182  112.411364
std     0.933077    2.388772
min    45.160000  106.750000
25%    46.967500  111.660000
50%    47.625000  112.530000
75%    48.125000  114.087500
max    48.840000  115.930000
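
To see what each of these calls returns, the sketch below runs them on a three-row table small enough to verify by hand. The frame `hist`, its column names, and its values are made up and only stand in for one_mon_hist.

```python
import pandas as pd

# Made-up values; column names and row labels are illustrative only.
hist = pd.DataFrame(
    {'X': [1.0, 3.0, 2.0], 'Y': [10.0, 20.0, 60.0]},
    index=['d1', 'd2', 'd3'],
)

print(hist.mean())           # per column: X 2.0, Y 30.0
print(hist.mean(axis=1))     # per row: d1 5.5, d2 11.5, d3 31.0
print(hist.median())         # per column: X 2.0, Y 20.0
print(hist.var())            # sample variance (ddof=1): X 1.0, Y 700.0
print(hist.idxmin())         # row label of each column's minimum: X d1, Y d1
print(hist.idxmax())         # row label of each column's maximum: X d2, Y d3
```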
