The Three Core Objects in pandas

1. The pandas Series object

A pandas Series is a one-dimensional array of indexed data. It can be created from a list or an array:

import pandas as pd

data=pd.Series([0.25,0.5,0.75,1.0])
print(data)
# output:0    0.25
# 1    0.50
# 2    0.75
# 3    1.00
# dtype: float64
As the output shows, a Series binds a sequence of values to a sequence of index labels. Both are accessible through the values and index attributes.

data.values
#output:[0.25 0.5 0.75 1. ]

data.index
#output:RangeIndex(start=0, stop=4, step=1)
Values can also be retrieved by index label using Python's square-bracket notation:

data[1]
#output:0.5
data[1:3]
#output:1    0.50
# 2    0.75
# dtype: float64
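1.1 Series as a generalized NumPy array

From the examples above, a Series looks largely interchangeable with a one-dimensional NumPy array. The essential difference lies in the index: a NumPy array uses an implicitly defined integer index to access values, whereas a pandas Series uses an explicitly defined index associated with the values.

This explicit index gives the Series additional capabilities. For example, the index no longer has to be an integer; it can consist of values of any desired type.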

data=pd.Series([0.25,0.5,0.75,1.0],index=['a','b','c','d'])
print(data)
print(data['a'])
#output:a    0.25
# b    0.50
# c    0.75
# d    1.00
# dtype: float64
# 0.25
Note: the index may also be non-contiguous or out of order.

data=pd.Series([0.25,0.5,0.75,1.0],index=[2,5,3,7])
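
To make the difference between explicit labels and implicit positions concrete, here is a minimal sketch that uses the loc and iloc accessors on the non-contiguous index above. The variable names by_label and by_position are chosen here for illustration only.

```python
import pandas as pd

# Same non-contiguous integer index as in the example above
data = pd.Series([0.25, 0.5, 0.75, 1.0], index=[2, 5, 3, 7])

# loc looks up the explicit label 3, which sits at position 2
by_label = data.loc[3]
# iloc looks up the implicit position 3, which carries the label 7
by_position = data.iloc[3]

print(by_label)     # 0.75
print(by_position)  # 1.0
```

Using loc and iloc explicitly avoids ambiguity when the labels are themselves integers.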

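1.2 Series as a specialized dictionary

A Series can be viewed as a specialized Python dictionary. A dictionary maps arbitrary keys to arbitrary values, whereas a Series maps typed keys to typed values.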

import pandas as pd
population_dict={'California':38332521,'Texas':26448193,'New York':19651127,'Florida':19552860,'Illinois':12882135}
population=pd.Series(population_dict)
print(population)
#output:California    38332521
# Texas         26448193
# New York      19651127
# Florida       19552860
# Illinois      12882135
# dtype: int64
The array-style operations shown earlier, such as slicing and item access, still apply to this Series.
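
The dictionary-like and array-like behaviours can be sketched together as follows. This is only an illustration built on the population data above; the name subset is introduced here and is not part of the original example.

```python
import pandas as pd

population = pd.Series({'California': 38332521, 'Texas': 26448193,
                        'New York': 19651127, 'Florida': 19552860,
                        'Illinois': 12882135})

# Dictionary-style lookup by key
print(population['California'])   # 38332521

# Membership is tested against the index, as with dictionary keys
print('Texas' in population)      # True

# Array-style slicing also works with labels; the end label is included
subset = population['California':'New York']
print(list(subset.index))         # ['California', 'Texas', 'New York']
```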

1.3 Constructing Series objects

The general form is: pd.Series(data, index=index)

a) data can be a list or a NumPy array, in which case index defaults to an integer sequence:

z=pd.Series([2,4,6])
print(z)
#output:0    2
# 1    4
# 2    6
# dtype: int64
b) data can also be a scalar, which is repeated to fill every index label:

z=pd.Series(5,index=[100,200,300])
print(z)
#output:100    5
# 200    5
# 300    5
# dtype: int64
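c) data can also be a dictionary, in which case index defaults to the dictionary keys (in insertion order in recent pandas versions; older versions sorted them). An explicit index can be passed to select and order the entries: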

z=pd.Series({2:'a',1:'b',3:'c'},index=[3,2])
print(z)
#output:3 c
# 2 a
# dtype: object

# The result has only two entries because the Series keeps only the keys listed explicitly in index.
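
As a further sketch of how the index argument interacts with a dictionary, the snippet below also requests a label (4) that is not a key of the dictionary. The names letters, default_index and with_missing are illustrative, and the key order shown assumes a recent pandas version.

```python
import pandas as pd

letters = {2: 'a', 1: 'b', 3: 'c'}

# Without an explicit index, the dictionary keys become the index
default_index = pd.Series(letters)
print(list(default_index.index))       # [2, 1, 3] with recent pandas

# A label that is not a dictionary key is filled with a missing value
with_missing = pd.Series(letters, index=[3, 2, 4])
print(with_missing.isna().tolist())    # [False, False, True]
```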
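2. The pandas DataFrame object

2.1 DataFrame as a generalized NumPy array

A DataFrame can be viewed as a two-dimensional array with both flexible row indices and flexible column names. Both its rows and its columns can be accessed by index.

Equivalently, a DataFrame can be viewed as a sequence of aligned Series objects, that is, Series that share the same index.

# Build a basic DataFrame from Series objects.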
import pandas as pd

population_dict={'California':38332521,'Texas':26448193,'New York':19651127,'Florida':19552860,'Illinois':12882135}
population=pd.Series(population_dict)
area_dict={'California':423967,'Texas':170312,'New York':141297,'Florida':170312,'Illinois':149995}
area=pd.Series(area_dict)
# Use a dictionary to build a two-dimensional object containing this information:
states=pd.DataFrame({'population':population,'area':area})
print(states)

#output:            population    area
# California    38332521 423967
# Texas         26448193 170312
# New York      19651127 141297
# Florida       19552860 170312
# Illinois      12882135 149995

# Row index labels (the index attribute)
states.index
#output:Index(['California', 'Texas', 'New York', 'Florida', 'Illinois'], dtype='object')
# The result is an Index object

# Column index labels (the columns attribute)
states.columns
#output:Index(['population', 'area'], dtype='object')
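
The idea of aligned Series can be illustrated with two Series whose indexes only partly overlap. The sketch below uses made-up labels and values (x, y, z) rather than the state data; pandas aligns the rows on the union of the labels and fills the gaps with NaN.

```python
import pandas as pd

# Hypothetical data: the two Series share only the label 'y'
a = pd.Series({'x': 1, 'y': 2})
b = pd.Series({'y': 20, 'z': 30})

df = pd.DataFrame({'a': a, 'b': b})
print(df)

# Rows are aligned on the union of both indexes
print(list(df.index))          # ['x', 'y', 'z']
# Two cells had no source value and are therefore missing
print(int(df.isna().sum().sum()))   # 2
```

Because NaN is a floating-point value, columns that receive it are stored as floats even when the inputs were integers.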
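2.2 DataFrame as a specialized dictionary

A DataFrame can also be viewed as a specialized dictionary. Where a dictionary maps a key to a value, a DataFrame maps a column name to a Series of column data.

# Retrieve all data in the area column by its column name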
print(states['area'])
#output:California    423967
# Texas         170312
# New York      141297
# Florida       170312
# Illinois      149995
# Name: area, dtype: int64
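
A few more dictionary-like behaviours are sketched below on a two-row subset of the data. The density column is an addition made here for illustration and does not appear in the original example.

```python
import pandas as pd

states = pd.DataFrame({
    'population': {'California': 38332521, 'New York': 19651127},
    'area': {'California': 423967, 'New York': 141297},
})

# Membership and iteration refer to column names, not row labels
print('area' in states)         # True
print('California' in states)   # False
print(list(states))             # ['population', 'area']

# Assigning to a new key adds a column, as with a dictionary
states['density'] = states['population'] / states['area']
print(list(states.columns))     # ['population', 'area', 'density']
```

Note that states['area'] selects a column, whereas indexing a two-dimensional NumPy array with a single integer selects a row.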
2.3 Constructing DataFrame objects

a) From a single Series object

# A DataFrame is a collection of Series objects; a single Series yields a single-column DataFrame
pd.DataFrame(population,columns=['populations'])
#output              populations
# California     38332521
# Texas          26448193
# New York       19651127
# Florida        19552860
# Illinois       12882135
b) From a list of dictionaries

Any list of dictionaries can be turned into a DataFrame. A simple list comprehension creates some data:

data=[{'a':i,'b':2*i}for i in range(3)]
z=pd.DataFrame(data,index=list('ABC'))
print(z)
#output: a b
# A 0 0
# B 1 2
# C 2 4
When some keys are missing from a dictionary, pandas fills the gaps with NaN ("not a number") values:

data=[{'a':1,'b':2},{'b':3,'c':4}]
z=pd.DataFrame(data)
print(z)
#output: a b c
# 0 1.0 2 NaN
# 1 NaN 3 4.0
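c) From a dictionary of Series objects

See the construction of states in section 2.1.

d) From a two-dimensional NumPy array

Given a two-dimensional array, a DataFrame can be created with any specified row and column labels. If they are omitted, both default to integer indices: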

import numpy as np

data=np.random.rand(3,2)
z=pd.DataFrame(data,columns=['foo','bar'],index=['a','b','c'])
print(z)
# The values are random, so they differ on every run.
#output: foo bar
# a 0.679849 0.791610
# b 0.438278 0.331297
# c 0.998745 0.861642
e) From a NumPy structured array

First, an example of a structured dtype from the official NumPy documentation:

Structured type, two fields: the first field contains an unsigned int, the second an int32:
>>> np.dtype([('f1', np.uint), ('f2', np.int32)])
#output:dtype([('f1', '<u4'), ('f2', '<i4')])
# The width of np.uint is platform-dependent; where it is 64-bit, the first field is '<u8'.
A DataFrame can be created from a structured array:

A=np.zeros(3,dtype=[('A','i8'),('B','f8')])
print(A)
#output:[(0, 0.) (0, 0.) (0, 0.)]
z=pd.DataFrame(A)
print(z)
#output: A B
# 0 0 0.0
# 1 0 0.0
# 2 0 0.0
3. The pandas Index object

Start by creating a simple Index object:

import pandas as pd

index=pd.Index([2,3,5,7,11])
print(index)

#output:Index([2, 3, 5, 7, 11], dtype='int64')
# pandas versions before 2.0 print Int64Index([2, 3, 5, 7, 11], dtype='int64')
3.1 Index as an immutable array

An Index behaves like an array in many ways. Values can be retrieved with standard Python indexing notation, and also by slicing:

index[1]
#output:3

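# Retrieve values by slicing
# A step of 2 takes every other element, starting from the first: 2, 5, 11
print(index[::2])
However, the values in an Index cannot be modified. An assignment such as index[0]=1 raises a TypeError; this is what immutable means.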
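
The immutability can be checked with a short sketch like the one below. The variable error_message is introduced here for illustration, and the exact wording of the error may vary between pandas versions.

```python
import pandas as pd

index = pd.Index([2, 3, 5, 7, 11])

error_message = None
try:
    index[0] = 1          # item assignment is not supported on an Index
except TypeError as exc:
    error_message = str(exc)

print(error_message)
print(index[0])           # 2, the Index is unchanged

# An Index also exposes the familiar array attributes
print(index.size, index.shape, index.ndim, index.dtype)   # 5 (5,) 1 int64
```

Immutability makes it safe to share one Index between several Series and DataFrame objects without side effects from accidental modification.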

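3.2 Index as an ordered set

Index objects follow many of the conventions of the set data structure in Python's standard library, including union, intersection, and difference.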

indA=pd.Index([1,3,5,7,9])
indB=pd.Index([2,3,5,7,11])

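# Intersection
print(indA.intersection(indB))
# Union
print(indA.union(indB))
# Symmetric difference
print(indA.symmetric_difference(indB))
# Older pandas versions also accepted the operators &, | and ^ for these;
# in pandas 2.0 and later those operators act element-wise instead.

#output:Index([3, 5, 7], dtype='int64')
# Index([1, 2, 3, 5, 7, 9, 11], dtype='int64')
# Index([1, 2, 9, 11], dtype='int64')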
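
The difference operation mentioned at the start of this section can be sketched the same way. The names only_in_a and only_in_b are illustrative.

```python
import pandas as pd

indA = pd.Index([1, 3, 5, 7, 9])
indB = pd.Index([2, 3, 5, 7, 11])

# Elements of indA that do not appear in indB, and the reverse
only_in_a = indA.difference(indB)
only_in_b = indB.difference(indA)

print(list(only_in_a))   # [1, 9]
print(list(only_in_b))   # [2, 11]
```

Unlike union and intersection, difference is not symmetric: swapping the operands changes the result.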

https://blog.csdn.net/sir_TI/article/details/83478146
