Pandas in Detail, Part 1: The Series Object


2020-02-22 18:10

Conventions:

import pandas as pd
from pandas import Series,DataFrame
import numpy as np

Series

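I. Series Attributes and Methods

Series is the most basic object in pandas. It resembles a one-dimensional array in which every value carries an index label; when no index is given, pandas assigns integer labels starting at 0: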

se1=Series([4,7,-2,8])
se1

Output:

0    4
1    7
2   -2
3    8
dtype: int64

Usually we want to supply our own index:

se2=Series([4,7,-2,8],index=['b','c','a','d'])
se2

Output:

b    4
c    7
a   -2
d    8
dtype: int64

The two attributes values and index return the data and the index of a Series:

se1.values

Output:

array([ 4,  7, -2,  8], dtype=int64)

se1.index

Output:

RangeIndex(start=0, stop=4, step=1)
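As a small sketch of the two attributes on a labeled Series (the names s, vals and idx are illustrative and not from the article; to_numpy() is used here as the newer spelling of values):

```python
import pandas as pd

s = pd.Series([4, 7, -2, 8], index=['b', 'c', 'a', 'd'])

vals = s.to_numpy()   # the underlying data as a NumPy array
idx = list(s.index)   # the labels as a plain Python list

print(vals)
print(idx)
```

The data and the labels are stored separately, which is why each can be read on its own.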

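A Series also behaves much like a dictionary that maps labels to values. The in operator tests labels; unlike a dict, however, iterating over a Series yields its values: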

'b' in se2

Output:

True

list(se2)

Output:

[4, 7, -2, 8]

# iteritems() was removed in pandas 2.0; items() works in old and new versions
list(se2.items())

Output:

[('b', 4), ('c', 7), ('a', -2), ('d', 8)]
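The dictionary-like behaviour can be sketched a little further. This is only an illustration with made-up variable names; get, items and to_dict are existing Series methods:

```python
import pandas as pd

s = pd.Series([4, 7, -2, 8], index=['b', 'c', 'a', 'd'])

has_b = 'b' in s          # membership tests labels, not values
has_4 = 4 in s            # 4 is a value, not a label
missing = s.get('z', 0)   # like dict.get: returns the default for an unknown label
pairs = list(s.items())   # (label, value) pairs
as_dict = s.to_dict()     # back to a plain dict

print(has_b, has_4, missing)
print(as_dict)
```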

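A dict can also be converted to a Series (the variable here is named dict, which shadows the Python built-in; a different name is safer in real code):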

dict={"red":100,"black":400,"green":300,"pink":900}

se3=Series(dict)
se3

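Output (from an older pandas that sorted dict keys; pandas 0.23+ on Python 3.6+ keeps the dict's insertion order instead):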

black    400
green    300
pink     900
red      100
dtype: int64

Both the Series itself and its index have a name attribute:

se3

Output:

black    400
green    300
pink     900
red      100
dtype: int64

se3.name="values"
se3.index.name="color"
se3

Output:

color
black    400
green    300
pink     900
red      100
Name: values, dtype: int64
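One place where the two names become visible is when a Series is turned into a DataFrame. The following is a sketch under that assumption (the article itself does not do this step); reset_index is an existing Series method:

```python
import pandas as pd

s = pd.Series({"red": 100, "black": 400}, name="values")
s.index.name = "color"

# The index name and the Series name become the column names.
df = s.reset_index()
cols = list(df.columns)

print(cols)
```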

The pandas functions isnull and notnull detect missing data:

pd.isnull(se3)

Output:

color
black    False
green    False
pink     False
red      False
Name: values, dtype: bool

pd.notnull(se3)

Output:

color
black    True
green    True
pink     True
red      True
Name: values, dtype: bool

Or call the isnull method directly on the Series object (se3):

se3.isnull()

Output:

color
black    False
green    False
pink     False
red      False
Name: values, dtype: bool
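Since se3 holds no missing values, every check above is False. A minimal sketch with an actual NaN (the data is invented for illustration) shows what the mask is typically used for:

```python
import numpy as np
import pandas as pd

s = pd.Series({"red": 100, "black": np.nan, "green": 300})

mask = s.isna()               # isna and isnull are aliases
n_missing = int(mask.sum())   # True counts as 1
cleaned = s.dropna()          # drop the missing entries
filled = s.fillna(0)          # or replace them with a default

print(n_missing)
print(cleaned)
print(filled)
```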

II. Accessing Series Elements

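Subscripting a Series works with either a position or a label. Recent pandas versions deprecate treating integer keys as positions when the index is not integer-typed; se2.iloc[1] is the explicit positional form:

print("By position:  ",se2[1])
print("By label:     ",se2['c'])

Output:

By position:   7
By label:      7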

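A Series supports both positional and label slicing. Note that a label slice includes its end label, whereas a positional slice excludes its end position: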

se2

Output:

b    4
c    7
a   -2
d    8
dtype: int64

se2[1:3]

Output:

c    7
a   -2
dtype: int64

se2['b':'a']

Output:

b    4
c    7
a   -2
dtype: int64
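The same two slices can be written with the explicit iloc and loc accessors, which removes any doubt about whether a key is a position or a label. A minimal sketch (variable names are illustrative):

```python
import pandas as pd

s = pd.Series([4, 7, -2, 8], index=['b', 'c', 'a', 'd'])

by_pos = s.iloc[1:3]        # positions 1 and 2; position 3 is excluded
by_label = s.loc['b':'a']   # labels 'b' through 'a'; the end label is included

print(by_pos)
print(by_label)
```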

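As with an ndarray, a list or array of positions selects elements (se2.iloc[[1,3,2]] is the explicit positional spelling); a list or array of labels works the same way: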

se2[[1,3,2]]

Output:

c    7
d    8
a   -2
dtype: int64

se2[['a','b','c']]

Output:

a   -2
b    4
c    7
dtype: int64

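An explicit index can also be passed together with a dict. The result follows the order of that index, and labels missing from the dict get NaN (which is why the dtype becomes float64):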

se4=Series(dict,index=["red","yellow","green","white","black","pink"])
se4

Output:

red       100.0
yellow      NaN
green     300.0
white       NaN
black     400.0
pink      900.0
dtype: float64

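Assigning to the index attribute replaces the labels in place. The values keep their positions, so this relabels the data rather than reordering it: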

se4.index=["red","yellow","black","pink","green","white"]
se4

Output:

red       100.0
yellow      NaN
black     300.0
pink        NaN
green     400.0
white     900.0
dtype: float64
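To make the difference concrete, here is a sketch that contrasts index assignment with reindex, the method that moves values along with their labels. The data and the labels x, y, z are invented for illustration:

```python
import pandas as pd

s = pd.Series({"red": 100, "black": 400, "green": 300})

# reindex: each value follows its label; the unknown label 'white' becomes NaN
reordered = s.reindex(["green", "red", "white"])

# index assignment: values stay where they are, only the labels change
relabeled = s.copy()
relabeled.index = ["x", "y", "z"]

print(reordered)
print(relabeled)
```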

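III. Series Operations

A Series supports NumPy-style array operations (boolean-mask filtering, scalar multiplication, math functions), and the index stays attached to the values: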

se2

Output:

b    4
c    7
a   -2
d    8
dtype: int64

se2[se2>0]

Output:

b    4
c    7
d    8
dtype: int64

se2*2

Output:

b     8
c    14
a    -4
d    16
dtype: int64

np.exp(se2)

Output:

b      54.598150
c    1096.633158
a       0.135335
d    2980.957987
dtype: float64
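These operations can be combined. The sketch below (illustrative names only) builds a compound boolean mask and applies a NumPy function to the filtered result:

```python
import numpy as np
import pandas as pd

s = pd.Series([4, 7, -2, 8], index=['b', 'c', 'a', 'd'])

mask = (s > 0) & (s < 8)    # element-wise AND needs & and parentheses
picked = s[mask]
roots = np.sqrt(s[s > 0])   # the ufunc returns a Series with the same labels

print(picked)
print(roots)
```

Python's and keyword does not work here because it asks for a single truth value of the whole Series; & compares element by element.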

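Arithmetic operators work between two Series. Elements are aligned by label before the operation, so only labels present in both operands produce a value; a label missing from either side yields NaN. Here se2 and se3 share no labels, so every result is NaN: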

se2+se3

Output:

a       NaN
b       NaN
black   NaN
c       NaN
d       NaN
green   NaN
pink    NaN
red     NaN
dtype: float64
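When NaN for unmatched labels is not what you want, the add method accepts a fill_value. A minimal sketch with two partially overlapping Series (the data is invented for illustration):

```python
import pandas as pd

a = pd.Series([1, 2, 3], index=['x', 'y', 'z'])
b = pd.Series([10, 20], index=['y', 'w'])

plain = a + b                     # only 'y' exists on both sides
filled = a.add(b, fill_value=0)   # a label missing on one side counts as 0

print(plain)
print(filled)
```

With fill_value=0 each label keeps the value from whichever side has it, and only the shared label 'y' is actually summed.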

Thanks for reading.
I hope this write-up is useful to you.
Let's keep learning together!

yungeisme
