Python Pandas: A Summary of Series Usage

Original

ckSpark

2020-07-05 17:43

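Series: a labeled array

This article summarizes the characteristics and usage of Series, the one-dimensional data type in the Pandas package.

2.1 How to create a Series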

# Import the Pandas package
import pandas as pd

# Create a Series
# 2.1.1 From a list
listSer=pd.Series([10,20,30,40])
print(listSer)

# 2.1.2 From a dict (the keys become the index; name='數值' means "value")
dictSer=pd.Series({'a':10,'b':40,'c':5,'d':90,'e':35,'f':40},name='數值')
print(dictSer)

# 2.1.3 From a NumPy array, with an explicit index
import numpy as np
arrySer=pd.Series(np.arange(10,15),index=['a','b','c','d','e'])
print(arrySer)

[output]
0    10
1    20
2    30
3    40
dtype: int64
a    10
b    40
c     5
d    90
e    35
f    40
Name: 數值, dtype: int64
a    10
b    11
c    12
d    13
e    14
dtype: int64
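
One more creation pattern is worth a quick sketch: when a dict is passed together with an explicit index, pandas picks values by key, and any label missing from the dict becomes NaN. The names `scores`, `picked`, and `const` are illustrative and do not come from the examples above.

```python
import pandas as pd

scores = {'a': 10, 'b': 40, 'c': 5}

# With a dict plus index=, entries are selected and ordered by key.
# 'z' is not a key in the dict, so its value becomes NaN
# and the dtype is promoted to float64.
picked = pd.Series(scores, index=['c', 'a', 'z'])
print(picked)

# A scalar is broadcast to every label in the index.
const = pd.Series(0, index=['x', 'y', 'z'])
print(const)
```

This is why a Series built from a dict can end up with a float dtype even though every value in the dict is an integer.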

2.2 Index and the name attribute

A Series consists of two parts: index and values.

# index
print(arrySer.index)
# values
print(arrySer.values)

[output]
Index(['a', 'b', 'c', 'd', 'e'], dtype='object')
[10 11 12 13 14]
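
The section title mentions the name attribute, so here is a minimal sketch of reading and setting it. The variable `ser` and the names 'counts' and 'totals' are made up for illustration.

```python
import numpy as np
import pandas as pd

ser = pd.Series(np.arange(10, 15), index=['a', 'b', 'c', 'd', 'e'])

print(ser.name)                 # None: no name was given at creation

ser.name = 'counts'             # assign a name in place
renamed = ser.rename('totals')  # rename() with a scalar returns a new Series

print(ser.name, renamed.name)
print(ser.index.tolist())       # index labels as a plain Python list
print(ser.to_numpy())           # underlying values as a NumPy array
```

The name shows up in the printed footer of a Series, as in the `Name: 數值` line of the dict example.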

2.3 Getting data

# Get data by position (iloc)
dictSer[0:1] # equivalent to dictSer.iloc[0:1]
>>>
a    10
Name: 數值, dtype: int64

# Get data by index label (loc)
dictSer[['a','b']]  # equivalent to dictSer.loc[['a','b']]
>>>
a    10
b    40
Name: 數值, dtype: int64

# Get values with boolean indexing
dictSer[dictSer.values<=10] # entries whose value is no greater than 10
>>>
a    10
c     5
Name: 數值, dtype: int64

dictSer[dictSer.index!='a']  # entries whose index label is not 'a'
>>>
b    40
c     5
d    90
e    35
f    40
Name: 數值, dtype: int64
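
The bracket shortcuts above rely on pandas guessing whether you mean position or label. A minimal sketch of the explicit forms follows; `ser` is a shortened stand-in for `dictSer`. Note that a position slice excludes its end, while a label slice includes it.

```python
import pandas as pd

ser = pd.Series({'a': 10, 'b': 40, 'c': 5, 'd': 90})

by_position = ser.iloc[0:2]   # positions 0 and 1; the end position is excluded
by_label = ser.loc['a':'c']   # labels 'a' through 'c'; the end label is included

first = ser.iloc[0]           # single value by position
value_b = ser.loc['b']        # single value by label

small = ser[ser <= 10]        # boolean mask built from the Series itself

print(by_position)
print(by_label)
print(first, value_b)
print(small)
```

Writing `.iloc` or `.loc` explicitly tends to be safer when the index itself holds integers, since a bare `ser[0]` is then ambiguous to a reader.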

2.4 Basic operations

Descriptive statistics

dictSer.describe() 
>>>
count     6.000000
mean     36.666667
std      30.276504
min       5.000000
25%      16.250000
50%      37.500000
75%      40.000000
max      90.000000
Name: 數值, dtype: float64

dictSer.mean() # mean
dictSer.median() # median
dictSer.sum() # sum
dictSer.std() # standard deviation
dictSer.mode() # mode (most frequent value)
dictSer.value_counts() # count of each distinct value
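
The calls above are listed without their results, so here is a small sketch of what a few of them return, using the same values as `dictSer` (without the name). Two details are easy to miss: `mode()` returns a Series because ties are possible, and `value_counts()` uses the distinct values as its index.

```python
import pandas as pd

ser = pd.Series({'a': 10, 'b': 40, 'c': 5, 'd': 90, 'e': 35, 'f': 40})

print(ser.sum())      # 220
print(ser.median())   # 37.5

modes = ser.mode()            # a Series, since several values can tie
counts = ser.value_counts()   # distinct values as index, most frequent first

print(modes.tolist())         # [40]
print(counts.loc[40])         # 2, because 40 appears twice
```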

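Mathematical operations

dictSer/2 # divide each value by 2
dictSer//2 # floor-divide each value by 2
dictSer%2 # remainder after dividing by 2
dictSer**2 # square each value
np.sqrt(dictSer) # square root
np.log(dictSer) # natural logarithm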
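
These operations are applied element by element and keep the index. A minimal sketch with a shortened stand-in Series (`ser` is an illustrative name):

```python
import numpy as np
import pandas as pd

ser = pd.Series({'a': 10, 'b': 40, 'c': 5})

halves = ser / 2        # true division, result is floating point
floors = ser // 2       # floor division, result stays integer
remainders = ser % 2    # remainder of each value
roots = np.sqrt(ser)    # a NumPy ufunc returns a Series with the same index

print(halves.tolist())
print(floors.tolist())
print(remainders.tolist())
print(roots.index.tolist())
```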

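Index-aligned arithmetic

Arithmetic between two Series is aligned on index labels; a label that appears in only one of them yields NaN.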

dictSer2=pd.Series({'a':10,'b':20,'d':23,'g':90,'h':35,'i':40},name='數值')
dictSer3=dictSer+dictSer2
dictSer3
>>>
a     20.0
b     60.0
c      NaN
d    113.0
e      NaN
f      NaN
g      NaN
h      NaN
i      NaN
Name: 數值, dtype: float64
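
If NaN is not what you want for labels that exist on only one side, the `add` method accepts a `fill_value`. A minimal sketch, where `s1` and `s2` are shortened stand-ins for the two Series above:

```python
import pandas as pd

s1 = pd.Series({'a': 10, 'b': 40, 'c': 5})
s2 = pd.Series({'a': 10, 'b': 20, 'g': 90})

plain = s1 + s2                      # 'c' and 'g' exist on one side only -> NaN
filled = s1.add(s2, fill_value=0)    # the missing side is treated as 0

print(plain)
print(filled)
```

Whether treating a missing label as 0 is appropriate depends on the data; it is a choice, not a default.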

2.5 Handling missing values

# Find null / non-null values
dictSer3[dictSer3.notnull()] # non-null values
>>>
a     20.0
b     60.0
d    113.0
Name: 數值, dtype: float64

dictSer3[dictSer3.isnull()]  # null values
>>>
c   NaN
e   NaN
f   NaN
g   NaN
h   NaN
i   NaN
Name: 數值, dtype: float64

# Fill null values
dictSer3=dictSer3.fillna(dictSer3.mean()) # fill missing values with the mean
dictSer3
>>>
a     20.000000
b     60.000000
c     64.333333
d    113.000000
e     64.333333
f     64.333333
g     64.333333
h     64.333333
i     64.333333
Name: 數值, dtype: float64
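
Filling with the mean is only one option. As a rough sketch of a few alternatives, using a small made-up Series `ser` rather than `dictSer3`:

```python
import numpy as np
import pandas as pd

ser = pd.Series([20.0, np.nan, 113.0, np.nan], index=['a', 'c', 'd', 'e'])

n_missing = ser.isna().sum()   # number of missing entries
dropped = ser.dropna()         # remove missing entries instead of filling them
zero_filled = ser.fillna(0)    # fill with a constant
forward = ser.ffill()          # carry the previous non-missing value forward

print(n_missing)
print(dropped)
print(zero_filled)
print(forward)
```

All of these return a new Series and leave `ser` unchanged, which is why the example above reassigns the result of `fillna`.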

2.6 Deleting values

dictSer3=dictSer3.drop('b')
print(dictSer3)
>>>
a     20.000000
c     64.333333
d    113.000000
e     64.333333
f     64.333333
g     64.333333
h     64.333333
i     64.333333
Name: 數值, dtype: float64
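
A short sketch of a few `drop` variations, on a made-up Series `ser`. The label 'zzz' is deliberately absent to show the `errors='ignore'` option.

```python
import pandas as pd

ser = pd.Series({'a': 20.0, 'b': 60.0, 'c': 64.0, 'd': 113.0})

without_b = ser.drop('b')                 # drop a single label
without_ac = ser.drop(['a', 'c'])         # drop several labels at once
safe = ser.drop('zzz', errors='ignore')   # a missing label is skipped, not an error

print(without_b.index.tolist())
print(without_ac.index.tolist())
print('b' in ser.index)                   # True: drop() did not modify ser
```

Because `drop` returns a new Series, the example above assigns the result back to `dictSer3` to keep the change.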

If you have any questions about the code or data in this article, feel free to leave a comment or send a private message.
