python: finding DataFrame rows whose column contains a given substring & iterating over a DataFrame & functions that return None & reading a specific cell

Problem description:

6.) Proving Afzal Wrong

We have detoured from the original aim of this question for long enough. Compare the popularity of dance music genres and pop music genres across the dataset using appropriate visualisation/s. Make the assumption that the popularity of a genre is defined by the average popularity column entry across all songs in the appropriate genres.

Hint/s

  • Dance sub-genres can be considered: edm, dance pop, trap music, big room, brostep
  • Pop sub-genres can be considered anything with pop in the name
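
The definition in the problem statement amounts to a group-wise mean of the popularity column. A minimal sketch on made-up data follows; the column names `group` and `Popularity` and all values are assumptions for illustration, not taken from the real dataset.

```python
import pandas as pd

# Toy data: column names and values are assumptions for illustration only.
toy = pd.DataFrame({
    "group": ["dance", "dance", "pop", "pop"],
    "Popularity": [80, 70, 90, 70],
})

# Popularity of a genre group = mean of the popularity column over its songs.
mean_popularity = toy.groupby("group")["Popularity"].mean()
print(mean_popularity)

# A bar chart is one simple way to compare the two means (requires matplotlib):
# mean_popularity.plot(kind="bar")
```

Once every song carries a dance/pop label, the same `groupby(...).mean()` call should give the two numbers to compare.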

 

Original dataset:

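For this problem I got stuck on how to find the rows whose Genre column contains the substring 'pop', such as rows 0, 2 and 3. Solving it also involved a few basic Python tips, so I am writing them down here.

After searching through a lot of Python functions, I still had not found one that does the replacement in a single call, so I ended up iterating over the rows.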

import re  # regular-expression module
import pandas as pd


songData['dance_or_pop'] = ''  # new column to store the music category
dance_genres = ['edm', 'brostep', 'dance pop', 'trap music', 'big room']
# label the dance sub-genres from the hint; a single .loc call avoids chained assignment
songData.loc[songData['Genre'].isin(dance_genres), 'dance_or_pop'] = 'dance'

pop_song = re.compile('.*pop.*')  # regular expression: any string that contains 'pop'

for i in songData.index:  # iterate over the DataFrame by row label
    item = songData.loc[i, 'Genre']  # how to read one specific cell: .loc[row, column]
    if pop_song.match(item) is not None:  # a function that finds nothing returns None
        songData.loc[i, 'dance_or_pop'] = 'pop'  # note: 'dance pop' rows also match and become 'pop'

songData
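
For comparison, pandas also has vectorized string methods, so the loop above is not strictly required. Below is a minimal sketch on a made-up frame. Only the column names `Genre` and `dance_or_pop` come from the code above; the rows are invented. It uses `Series.str.contains` with `regex=False` for a literal substring test and `na=False` so that missing genres count as non-matches. The pop label is applied last to mirror the loop, which relabels 'dance pop' as 'pop'.

```python
import pandas as pd

# Made-up rows for illustration; the real songData is loaded elsewhere.
toy = pd.DataFrame(
    {"Genre": ["canadian pop", "edm", "dance pop", "big room", "latin", None]}
)

dance_genres = ["edm", "dance pop", "trap music", "big room", "brostep"]
toy["dance_or_pop"] = ""
toy.loc[toy["Genre"].isin(dance_genres), "dance_or_pop"] = "dance"

# Boolean mask: True where Genre contains the literal substring "pop".
has_pop = toy["Genre"].str.contains("pop", regex=False, na=False)
toy.loc[has_pop, "dance_or_pop"] = "pop"
print(toy)
```

If 'dance pop' should stay in the dance group instead, swapping the order of the two `.loc` assignments would be one way to do it.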

The finished result looks like this:
