python: finding the DataFrame rows whose column contains a given substring & iterating over a DataFrame & a function returning None & locating a specific cell

Problem description:

6.) Proving Afzal Wrong

We have detoured from the original aim of this question for long enough. Compare the popularity of dance music genres and pop music genres across the dataset using appropriate visualisation/s. Make the assumption that the popularity of a genre is defined by the average popularity column entry across all songs in the appropriate genres.

Hint/s

  • Dance sub-genres can be considered: edm, dance pop, trap music, big room, brostep
  • Pop sub-genres can be considered anything with pop in the name

 

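Original dataset:

For this problem I got stuck on how to find the rows whose Genre column contains the substring 'pop', such as rows 0, 2 and 3. Solving it also involves a few basic Python tips, so I am noting them down here.

After searching through many Python functions I still had not found one that does the replacement in one step, so it looked like I would have to iterate.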
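As a side note, the row lookup described above can also be done without a loop: pandas provides the vectorized string method `Series.str.contains`. The following is only a minimal sketch; the frame `songs` and its sample rows are made up for illustration and are not the real dataset.

```python
import pandas as pd

# Hypothetical sample rows; the real dataset is not reproduced here.
songs = pd.DataFrame({
    "Genre": ["canadian pop", "edm", "dance pop", "big room", "pop", "trap music"],
})

# Boolean mask: True where the Genre string contains the substring "pop".
# regex=False treats "pop" as a literal string; na=False counts missing genres as non-matches.
is_pop = songs["Genre"].str.contains("pop", regex=False, na=False)

# Keep only the matching rows and show their row labels.
pop_rows = songs[is_pop]
print(pop_rows.index.tolist())
```

The same mask can be passed to `.loc` to assign a label to the matching rows in one statement.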

import re  # regular expression module
import pandas as pd


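# songData is assumed to be the DataFrame already loaded from the original dataset
songData['dance_or_pop'] = ''  # new column to store the music type
dance_genres = ['edm', 'brostep', 'dance pop', 'trap music', 'big room']
# label the dance sub-genres from the hints; .loc on the frame itself avoids chained assignment
songData.loc[songData['Genre'].isin(dance_genres), 'dance_or_pop'] = 'dance'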

pop_song = re.compile('.*pop.*')  # regular expression: any string containing "pop"

for i in songData.index:  # iterate over the DataFrame by row label
    item = songData.loc[i, 'Genre']  # .loc[row, column] reads the content of one specific cell
    if pop_song.match(item) is not None:  # match() returns None when nothing matches
        songData.loc[i, 'dance_or_pop'] = 'pop'  # note: this also relabels 'dance pop' as 'pop'

songData

When finished, it looks like this:
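With the label column in place, the comparison the question asks for comes down to averaging popularity per label. The sketch below is an assumption-laden illustration, not the original solution: the sample rows are invented, and `Popularity` is an assumed column name that may differ in the real dataset.

```python
import pandas as pd

# Hypothetical sample; "Popularity" is an assumed column name.
songs = pd.DataFrame({
    "Genre": ["edm", "big room", "canadian pop", "pop", "dance pop", "latin"],
    "Popularity": [80, 70, 90, 60, 78, 50],
})

dance_genres = ["edm", "dance pop", "trap music", "big room", "brostep"]

# Label rows the same way as the code above, but without an explicit loop.
songs["dance_or_pop"] = ""
songs.loc[songs["Genre"].isin(dance_genres), "dance_or_pop"] = "dance"
# Applied second, so "dance pop" ends up labelled "pop", as in the loop above.
songs.loc[songs["Genre"].str.contains("pop", regex=False), "dance_or_pop"] = "pop"

# Drop unlabelled rows, then average the popularity within each label.
labelled = songs[songs["dance_or_pop"] != ""]
avg_popularity = labelled.groupby("dance_or_pop")["Popularity"].mean()
print(avg_popularity)
```

If matplotlib is installed, `avg_popularity.plot(kind="bar")` would be one way to turn the two averages into the requested visualisation.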
