How can I output the differences between two Pandas DataFrames side by side?

Original

2020-06-28 14:18

I am trying to highlight exactly what changed between two dataframes.

Suppose I have two Python Pandas dataframes:

"StudentRoster Jan-1":
id   Name   score                    isEnrolled           Comment
111  Jack   2.17                     True                 He was late to class
112  Nick   1.11                     False                Graduated
113  Zoe    4.12                     True       

"StudentRoster Jan-2":
id   Name   score                    isEnrolled           Comment
111  Jack   2.17                     True                 He was late to class
112  Nick   1.21                     False                Graduated
113  Zoe    4.12                     False                On vacation

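My goal is to output an HTML table that:

Identifies the rows that have changed (the values could be int, float, boolean, or string)
Outputs the rows with their unchanged, old, and new values (ideally as an HTML table), so the reader can clearly see what changed between the two dataframes:

"StudentRoster Difference Jan-1 - Jan-2":
id   Name   score                    isEnrolled           Comment
112  Nick   was 1.11 | now 1.21      False                Graduated
113  Zoe    4.12                     was True | now False was "" | now "On vacation"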

The first part is similar to Constantine's answer: you can get a boolean Series marking which rows have changed*:

In [21]: ne = (df1 != df2).any(axis=1)

In [22]: ne
Out[22]:
0    False
1     True
2     True
dtype: bool
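The row mask above can be reproduced as a standalone script. This is a minimal sketch: the frames are rebuilt by hand from the rosters in the question, the empty Comment for Zoe is assumed to be an empty string, and `axis=1` is passed by keyword because recent pandas releases no longer accept it positionally.

```python
import pandas as pd

# Rosters from the question; the empty Comment is assumed to be "".
df1 = pd.DataFrame({
    "id": [111, 112, 113],
    "Name": ["Jack", "Nick", "Zoe"],
    "score": [2.17, 1.11, 4.12],
    "isEnrolled": [True, False, True],
    "Comment": ["He was late to class", "Graduated", ""],
})
df2 = pd.DataFrame({
    "id": [111, 112, 113],
    "Name": ["Jack", "Nick", "Zoe"],
    "score": [2.17, 1.21, 4.12],
    "isEnrolled": [True, False, False],
    "Comment": ["He was late to class", "Graduated", "On vacation"],
})

# Element-wise comparison, then collapse each row to a single flag:
# True means at least one cell in that row differs.
ne = (df1 != df2).any(axis=1)
print(ne)

# The mask can be used directly to pull the changed rows from either frame.
print(df2[ne])
```

Note that `NaN != NaN` evaluates to True, so cells that are missing in both frames would be flagged as changed unless they are filled first.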

Then we can see which entries have changed:

In [23]: ne_stacked = (df1 != df2).stack()

In [24]: changed = ne_stacked[ne_stacked]

In [25]: changed.index.names = ['id', 'col']

In [26]: changed
Out[26]:
id  col
1   score         True
2   isEnrolled    True
    Comment       True
dtype: bool

Here the first level is the row index and the second level is the column that changed.

In [27]: difference_locations = np.where(df1 != df2)  # assumes: import numpy as np
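The stacking step can be sketched end to end as follows. The two frames are trimmed to the columns that matter here and are an assumption of this example; the level names `id` and `col` follow the session above, although the first level is really the row label of the default index.

```python
import pandas as pd

# Reduced versions of the two rosters (an assumption of this sketch).
df1 = pd.DataFrame({
    "Name": ["Jack", "Nick", "Zoe"],
    "score": [2.17, 1.11, 4.12],
    "isEnrolled": [True, False, True],
    "Comment": ["He was late to class", "Graduated", ""],
})
df2 = pd.DataFrame({
    "Name": ["Jack", "Nick", "Zoe"],
    "score": [2.17, 1.21, 4.12],
    "isEnrolled": [True, False, False],
    "Comment": ["He was late to class", "Graduated", "On vacation"],
})

# stack() turns the boolean frame into a Series keyed by (row label, column).
ne_stacked = (df1 != df2).stack()

# Boolean indexing with the Series itself keeps only the True entries.
changed = ne_stacked[ne_stacked]
changed.index.names = ["id", "col"]
print(changed)

# The index is an ordinary MultiIndex, so the changed cells can be listed.
changed_cells = list(changed.index)
print(changed_cells)
```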

In [28]: changed_from = df1.values[difference_locations]

In [29]: changed_to = df2.values[difference_locations]

In [30]: pd.DataFrame({'from': changed_from, 'to': changed_to}, index=changed.index)
Out[30]:
               from           to
id col
1  score       1.11         1.21
2  isEnrolled  True        False
   Comment     None  On vacation
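Putting the pieces together, the following sketch builds the from/to table and then a "was | now" report in the shape the question asks for. It is one possible approach rather than part of the original answer: the `show` helper, the `report` frame, and the use of an empty string for the missing Comment are assumptions of this example.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    "id": [111, 112, 113],
    "Name": ["Jack", "Nick", "Zoe"],
    "score": [2.17, 1.11, 4.12],
    "isEnrolled": [True, False, True],
    "Comment": ["He was late to class", "Graduated", ""],
})
df2 = pd.DataFrame({
    "id": [111, 112, 113],
    "Name": ["Jack", "Nick", "Zoe"],
    "score": [2.17, 1.21, 4.12],
    "isEnrolled": [True, False, False],
    "Comment": ["He was late to class", "Graduated", "On vacation"],
})

mask = df1 != df2
# Positional (row, column) coordinates of every differing cell.
rows, cols = np.where(mask.to_numpy())

# Long-format table of old and new values, indexed by (row label, column).
changes = pd.DataFrame(
    {"from": df1.to_numpy()[rows, cols], "to": df2.to_numpy()[rows, cols]},
    index=pd.MultiIndex.from_arrays(
        [df1.index[rows], df1.columns[cols]], names=["id", "col"]
    ),
)
print(changes)


def show(value):
    """Hypothetical helper: render empty strings visibly as ""."""
    return '""' if isinstance(value, str) and value == "" else str(value)


# Wide report: start from the new values of the changed rows, then overwrite
# each changed cell with a 'was X | now Y' string.
report = df2.loc[mask.any(axis=1)].astype(object)
for r, c in zip(rows, cols):
    old, new = df1.iat[r, c], df2.iat[r, c]
    report.loc[df1.index[r], df1.columns[c]] = f"was {show(old)} | now {show(new)}"

html = report.to_html(index=False)
print(html)
```

Casting the report to `object` first lets numeric and boolean columns hold the descriptive strings. `to_html` returns a plain `<table>` string that can be written to a file or embedded in a page.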

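* Note: it is important that df1 and df2 share the same index here. To overcome this ambiguity, you can make sure you only compare the shared labels, using df1.index & df2.index (in recent pandas versions, df1.index.intersection(df2.index)).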
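Restricting both frames to their shared labels could look like the sketch below. The two small frames, indexed by student id with one student dropped and one added, are invented for illustration, and `Index.intersection` is used for the set operation.

```python
import pandas as pd

# Hypothetical rosters indexed by student id; 111 left and 114 joined.
df1 = pd.DataFrame(
    {"score": [2.17, 1.11, 4.12]},
    index=pd.Index([111, 112, 113], name="id"),
)
df2 = pd.DataFrame(
    {"score": [1.21, 4.12, 3.50]},
    index=pd.Index([112, 113, 114], name="id"),
)

# Labels present in both frames, in df1's order.
common = df1.index.intersection(df2.index)

# Selecting the same labels from both sides gives identically labelled
# frames, which the element-wise != comparison requires.
a, b = df1.loc[common], df2.loc[common]
ne = (a != b).any(axis=1)
print(ne)

# Labels found on only one side are removals or additions, not edits.
removed = df1.index.difference(df2.index)
added = df2.index.difference(df1.index)
print(list(removed), list(added))
```

Reporting additions and removals separately keeps them from being mistaken for changed values.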
