[Data Analysis and Visualization] Sorting Series and DataFrame

import numpy as np
import pandas as pd
from pandas import Series, DataFrame

Sorting a Series

s1 = Series(np.random.rand(10))
s1
0    0.324583
1    0.528829
2    0.922022
3    0.050265
4    0.069271
5    0.447179
6    0.595703
7    0.518557
8    0.695466
9    0.685736
dtype: float64
s1.values
array([0.32458288, 0.52882927, 0.92202246, 0.05026548, 0.06927059,
       0.44717888, 0.59570299, 0.51855686, 0.69546586, 0.68573564])
s1.index
RangeIndex(start=0, stop=10, step=1)
# Sort by value; ascending by default, pass ascending=False for descending order
s2 = s1.sort_values()
s2
3    0.050265
4    0.069271
0    0.324583
5    0.447179
7    0.518557
1    0.528829
6    0.595703
9    0.685736
8    0.695466
2    0.922022
dtype: float64
# Sort by index
s2.sort_index()
0    0.324583
1    0.528829
2    0.922022
3    0.050265
4    0.069271
5    0.447179
6    0.595703
7    0.518557
8    0.695466
9    0.685736
dtype: float64
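The descending case mentioned in the comment above is not shown in the output, so here is a minimal sketch of it. The values are fixed, made-up numbers (not the random ones from this post) so that the result is reproducible.

```python
import pandas as pd

# Fixed, illustrative values (an assumption; the post uses np.random.rand).
s = pd.Series([0.32, 0.53, 0.92, 0.05, 0.07])

# sort_values returns a new Series; each value keeps its original index label.
desc = s.sort_values(ascending=False)
print(desc.index.tolist())    # [2, 1, 0, 4, 3]

# sort_index reorders by label, which brings back the original order here.
restored = desc.sort_index()
print(restored.tolist())      # [0.32, 0.53, 0.92, 0.05, 0.07]
```

Neither call modifies `s` in place; both return a new object, which is why the post assigns the result to `s2`.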

Sorting a DataFrame

df1 = DataFrame(np.random.randn(40).reshape(8,5),columns=['A','B','C','D','E'])
df1
          A         B         C         D         E
0  1.069063  0.266594 -0.129437 -0.361949 -1.491594
1  1.520675  1.673761  0.310567 -1.535689  0.388416
2  1.828228  0.221382 -0.092250 -0.111522 -1.187931
3 -1.049244 -0.093515  0.175138  0.627553 -0.357136
4  0.572511 -0.871314  1.142248 -0.489059  0.677733
5  0.088234 -0.786141 -0.222611  0.087407 -0.221874
6  2.199338  0.191928  0.278917 -0.388502  0.611719
7  1.260192 -0.001860  0.144536 -0.312155  1.664181
# Sort a single column: only that column is returned (as a Series), not the full table
df1['A'].sort_values()
3   -1.049244
5    0.088234
4    0.572511
0    1.069063
7    1.260192
1    1.520675
2    1.828228
6    2.199338
Name: A, dtype: float64
# Sort the whole DataFrame by a given column: all columns are returned
df2 = df1.sort_values('A')
df2
          A         B         C         D         E
3 -1.049244 -0.093515  0.175138  0.627553 -0.357136
5  0.088234 -0.786141 -0.222611  0.087407 -0.221874
4  0.572511 -0.871314  1.142248 -0.489059  0.677733
0  1.069063  0.266594 -0.129437 -0.361949 -1.491594
7  1.260192 -0.001860  0.144536 -0.312155  1.664181
1  1.520675  1.673761  0.310567 -1.535689  0.388416
2  1.828228  0.221382 -0.092250 -0.111522 -1.187931
6  2.199338  0.191928  0.278917 -0.388502  0.611719
df2.sort_index()
          A         B         C         D         E
0  1.069063  0.266594 -0.129437 -0.361949 -1.491594
1  1.520675  1.673761  0.310567 -1.535689  0.388416
2  1.828228  0.221382 -0.092250 -0.111522 -1.187931
3 -1.049244 -0.093515  0.175138  0.627553 -0.357136
4  0.572511 -0.871314  1.142248 -0.489059  0.677733
5  0.088234 -0.786141 -0.222611  0.087407 -0.221874
6  2.199338  0.191928  0.278917 -0.388502  0.611719
7  1.260192 -0.001860  0.144536 -0.312155  1.664181
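The cells above sort by one column only. As a rough sketch of two related options, the example below sorts by several columns with a different direction for each, and renumbers the rows after sorting. The data is a small hand-made frame (not from this post), and `ignore_index=True` assumes pandas 1.0 or later.

```python
import pandas as pd

# Small hand-made frame (illustrative data, not from the original post).
df = pd.DataFrame({
    "A": [2, 1, 2, 1],
    "B": [0.5, 0.1, 0.9, 0.7],
})

# Sorting one column returns a Series; the other columns are not included.
col_sorted = df["A"].sort_values()
print(type(col_sorted).__name__)     # Series

# Sort the frame by several keys: A ascending, then B descending within each A.
by_two = df.sort_values(["A", "B"], ascending=[True, False])
print(by_two.index.tolist())         # [3, 1, 2, 0]

# ignore_index=True renumbers the rows 0..n-1 instead of keeping the old labels.
renumbered = df.sort_values("B", ignore_index=True)
print(renumbered.index.tolist())     # [0, 1, 2, 3]
```

Keeping the old labels (the default) is what makes `sort_index()` able to undo a sort, as `df2.sort_index()` does above; with `ignore_index=True` that information is discarded.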

Read a CSV file, sort by movie rating in descending order, and write a new CSV

# Load the data
csv_input = '/Users/bennyrhys/Desktop/數據分析可視化-數據集/homework/movie_metadata.csv'
pd.read_csv(csv_input).head()
color director_name num_critic_for_reviews duration director_facebook_likes actor_3_facebook_likes actor_2_name actor_1_facebook_likes gross genres ... num_user_for_reviews language country content_rating budget title_year actor_2_facebook_likes imdb_score aspect_ratio movie_facebook_likes
0 Color James Cameron 723.0 178.0 0.0 855.0 Joel David Moore 1000.0 760505847.0 Action|Adventure|Fantasy|Sci-Fi ... 3054.0 English USA PG-13 237000000.0 2009.0 936.0 7.9 1.78 33000
1 Color Gore Verbinski 302.0 169.0 563.0 1000.0 Orlando Bloom 40000.0 309404152.0 Action|Adventure|Fantasy ... 1238.0 English USA PG-13 300000000.0 2007.0 5000.0 7.1 2.35 0
2 Color Sam Mendes 602.0 148.0 0.0 161.0 Rory Kinnear 11000.0 200074175.0 Action|Adventure|Thriller ... 994.0 English UK PG-13 245000000.0 2015.0 393.0 6.8 2.35 85000
3 Color Christopher Nolan 813.0 164.0 22000.0 23000.0 Christian Bale 27000.0 448130642.0 Action|Thriller ... 2701.0 English USA PG-13 250000000.0 2012.0 23000.0 8.5 2.35 164000
4 NaN Doug Walker NaN NaN 131.0 NaN Rob Walker 131.0 NaN Documentary ... NaN NaN NaN NaN NaN NaN 12.0 7.1 NaN 0

5 rows × 28 columns

pd.read_csv(csv_input)[['movie_title','imdb_score']].sort_values('imdb_score',ascending=False).head()
                   movie_title  imdb_score
2765          Towering Inferno         9.5
1937  The Shawshank Redemption         9.3
3466             The Godfather         9.2
4409      Kickboxer: Vengeance         9.1
2824                   Dekalog         9.1
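When only the top few rows are needed, `DataFrame.nlargest` is an alternative to `sort_values(...).head()`. The sketch below compares the two on a tiny frame with placeholder titles and made-up scores (not rows from movie_metadata.csv).

```python
import pandas as pd

# Placeholder titles and scores (assumptions for illustration only).
scores = pd.DataFrame({
    "movie_title": ["a", "b", "c", "d"],
    "imdb_score": [7.1, 9.3, 8.5, 9.2],
})

# Full sort, then take the first two rows.
top2_sorted = scores.sort_values("imdb_score", ascending=False).head(2)

# Ask directly for the two rows with the largest imdb_score.
top2_nlargest = scores.nlargest(2, "imdb_score")

print(top2_sorted["movie_title"].tolist())     # ['b', 'd']
print(top2_nlargest["movie_title"].tolist())   # ['b', 'd']
```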
# Sort and write a new CSV in a single line
pd.read_csv(csv_input)[['movie_title','imdb_score']].sort_values('imdb_score',ascending=False).to_csv('imdb.csv')
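By default `to_csv` also writes the row labels as an unnamed first column. If those labels are not wanted in the exported file, passing `index=False` is one option. The sketch below uses three stand-in rows and an in-memory buffer, so it runs without the dataset and without creating a file; with the real data the frame would come from `pd.read_csv(csv_input)`.

```python
import io
import pandas as pd

# Stand-in rows (titles and scores as printed in this post, typed in by hand).
movies = pd.DataFrame({
    "movie_title": ["The Godfather", "Towering Inferno", "The Shawshank Redemption"],
    "imdb_score": [9.2, 9.5, 9.3],
})

ranked = movies.sort_values("imdb_score", ascending=False)

# Write to an in-memory buffer instead of a file so the sketch has no side effects.
buf = io.StringIO()
ranked.to_csv(buf, index=False)   # index=False omits the row-label column
print(buf.getvalue())
```

The first line of the output is then `movie_title,imdb_score`, with no leading comma.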
!ls
02file.ipynb
4-1 DataFrame的簡單數學計算.ipynb
4-2 Series和DataFrame的排序.ipynb
4-3 重命名Dataframe的index.ipynb
7B4349AB-7282-428F-A780-CB538E0517A3.dmp
Applications
Creative Cloud Files
Desktop
Documents
Downloads
Hadoop_VM
Java.gitignore
Library
Movies
Music
NumPy-排序.ipynb
Numpy-3.4數組讀寫.ipynb
Numpy1.ipynb
Pandas.ipynb
Pictures
Postman
PromotionRes
Public
Untitled Folder
Untitled Folder 1
Untitled.ipynb
Untitled1.ipynb
Virtual Machines.localized
WeChatProjects
ap.plist
apps.plist
bt.plist
eclipse-workspace
history.plist
iCloud 雲盤(歸檔)
imdb.csv
install
nadarray.ipynb
opt
sell
vue-demo01
vue-sell-cube
vue-selll
輸出1.spv
數據分析-分組 聚合 可視化.ipynb
班級成績.ipynb
!more imdb.csv
,movie_title,imdb_score
2765,Towering Inferno             ,9.5
1937,The Shawshank Redemption ,9.3
3466,The Godfather ,9.2
4409,Kickboxer: Vengeance ,9.1
2824,Dekalog             ,9.1
3207,Dekalog             ,9.1
66,The Dark Knight ,9.0
2837,The Godfather: Part II ,9.0
3481,Fargo             ,9.0
339,The Lord of the Rings: The Return of the King ,8.9
4822,12 Angry Men ,8.9
4498,"The Good, the Bad and the Ugly ",8.9
3355,Pulp Fiction ,8.9
1874,Schindler's List ,8.9
683,Fight Club ,8.8
836,Forrest Gump ,8.8
270,The Lord of the Rings: The Fellowship of the Ring ,8.8
2051,Star Wars: Episode V - The Empire Strikes Back ,8.8
97,Inception ,8.8
1842,It's Always Sunny in Philadelphia             ,8.8
459,Daredevil             ,8.8
1620,Friday Night Lights             ,8.7