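1. Where the source code is wrong
This might just be a printing problem in the pirated copy I bought, but it is more likely an error in the source code itself.
Source code: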
from csv import DictReader
import pprint

path = 'I:\\360下載\\data-wrangling\\data\\unicef\\mn.csv'
data = DictReader(open(path, 'rb'))  # binary mode: this is what triggers the error below
data_row = [d for d in data]


def combine_data_dict(data_rows):
    '''Merge the data.'''
    data_dict = {}
    for row in data_rows:
        key = '%s-%s' % (row.get('HH1'), row.get('HH2'))
        # Use the combined key from the row as the dictionary key (creating it
        # if it does not exist yet) and store the row in that key's list.
        if key in data_dict.keys():
            data_dict[key].append(row)
        else:
            data_dict[key] = [row]
    return data_dict


mn_dict = combine_data_dict(data_row)
print(len(mn_dict))
This raises an error:
Traceback (most recent call last):
File "J:/PyCharm項目/學習進行中/《python數據處理》/基礎/數據清洗/數據合併.py", line 12, in <module>
data_row = [d for d in data]
File "J:/PyCharm項目/學習進行中/《python數據處理》/基礎/數據清洗/數據合併.py", line 12, in <listcomp>
data_row = [d for d in data]
File "D:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\csv.py", line 111, in __next__
self.fieldnames
File "D:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\csv.py", line 98, in fieldnames
self._fieldnames = next(self.reader)
_csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)
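In plain terms: the iterator should return strings, not bytes. On Python 3 the csv module expects a file opened in text mode, but 'rb' opens it in binary mode.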
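To see the difference in isolation, here is a minimal sketch that reproduces the error on a small temporary file instead of mn.csv. The file contents are made up for the demonstration, and the names demo_path, binary_error and text_rows are my own. It assumes Python 3.

```python
import csv
import os
import tempfile

# Write a tiny CSV to a temporary file. The column names HH1/HH2 mirror the
# ones used in the post, but the values are invented for this demonstration.
with tempfile.NamedTemporaryFile('w', suffix='.csv', newline='',
                                 delete=False) as tmp:
    tmp.write('HH1,HH2\n1,17\n1,20\n')
    demo_path = tmp.name

# Binary mode: the reader is handed bytes and raises csv.Error on Python 3.
binary_error = None
with open(demo_path, 'rb') as f:
    try:
        list(csv.DictReader(f))
    except csv.Error as exc:
        binary_error = str(exc)

# Text mode: the reader is handed str and parses the rows normally.
# newline='' is what the csv documentation recommends when opening files.
with open(demo_path, 'r', newline='') as f:
    text_rows = list(csv.DictReader(f))

os.remove(demo_path)

print(binary_error)    # the "strings, not bytes" message
print(len(text_rows))  # number of data rows read in text mode
```

The exact wording of the error message varies slightly between Python versions, so the sketch prints it rather than comparing it to a fixed string.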
2. The fix
from csv import DictReader
import pprint

path = 'I:\\360下載\\data-wrangling\\data\\unicef\\mn.csv'
data = DictReader(open(path, 'r'))  # text mode instead of 'rb'
data_row = [d for d in data]


def combine_data_dict(data_rows):
    '''Merge the data.'''
    data_dict = {}
    for row in data_rows:
        key = '%s-%s' % (row.get('HH1'), row.get('HH2'))
        # Use the combined key from the row as the dictionary key (creating it
        # if it does not exist yet) and store the row in that key's list.
        if key in data_dict.keys():
            data_dict[key].append(row)
        else:
            data_dict[key] = [row]
    return data_dict


mn_dict = combine_data_dict(data_row)
print(len(mn_dict))
The correct output is: