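1.6 Mapping Keys to Multiple Values in a Dictionary
Problem
Implement a dictionary structure in which one key can map to multiple values.
Solution
1. Lists and sets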
>>> d = {'a': [1, 2, 3], 'b': [4, 5]}
>>> d
{'a': [1, 2, 3], 'b': [4, 5]}
>>> d['b']
[4, 5]
>>> e = {'a': {1, 2, 2, 3}, 'b': {2, 3, 4, 4}}
>>> e
{'a': {1, 2, 3}, 'b': {2, 3, 4}}
>>> d['b'][1]
5
2. Using defaultdict
>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> d
defaultdict(<class 'list'>, {})
>>> d['a'].append(1)
>>> d['a'].append(2)
>>> d['a'].append(5)
>>> d
defaultdict(<class 'list'>, {'a': [1, 2, 5]})
>>> d2 = defaultdict(set)
>>> d2['a'].add(1)
>>> d2['a'].add(1)
>>> d2['a'].add(2)
>>> d2
defaultdict(<class 'set'>, {'a': {1, 2}})
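The two sessions above can be combined into one small sketch that groups (key, value) pairs. The sample data in pairs is made up for illustration; a list keeps duplicates and insertion order, while a set drops duplicates.

```python
from collections import defaultdict

# Hypothetical sample data: (key, value) pairs with repeated keys.
pairs = [('a', 1), ('b', 4), ('a', 2), ('a', 2), ('b', 5)]

ordered = defaultdict(list)  # keeps duplicates and insertion order
unique = defaultdict(set)    # drops duplicates

for key, value in pairs:
    ordered[key].append(value)
    unique[key].add(value)

print(dict(ordered))  # {'a': [1, 2, 2], 'b': [4, 5]}
print(dict(unique))   # {'a': {1, 2}, 'b': {4, 5}}
```

Choose the list form when order or repeated values matter, and the set form when only membership matters.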
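Background
set
- Python's set closely matches the mathematical notion of a set: an unordered collection of unique elements. The usual mathematical operations are supported, including union, intersection, difference, and symmetric difference.
- Sets support x in set, len(set), and for x in set. Because a set does not record element positions, sequence-style operations such as indexing and slicing are not supported.
- There are two set types: the regular set and frozenset. A set is mutable: elements can be added and removed with add() and remove(). Because it is mutable, a set is not hashable and cannot be used as a dictionary key. A frozenset is immutable and hashable, so it can be used as a dictionary key.
- To build a set, write {'jack', 'tim'} or s = set(['jack', 'tim']).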
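As a small illustration of the hashability point, the sketch below uses a frozenset as a dictionary key and shows that a regular set is rejected. The label 'team-1' is a made-up value, not something from the original notes.

```python
members = frozenset(['jack', 'tim'])

# A frozenset is hashable, so it can serve as a dictionary key.
team_names = {members: 'team-1'}
# Element order does not matter when looking the key up again.
print(team_names[frozenset(['tim', 'jack'])])  # team-1

# A regular set is mutable and unhashable, so this raises TypeError.
error = None
try:
    {set(['jack', 'tim']): 'team-1'}
except TypeError as exc:
    error = str(exc)
print(error)
```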
class set([iterable])
class frozenset([iterable])
Return a new set or frozenset object whose elements are taken from iterable. The elements of a set must be hashable. To represent sets of sets, the inner sets must be frozenset objects. If iterable is not specified, a new empty set is returned.
- Instances of set and frozenset provide the following operations:
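- len(s): return the number of elements in s.
- x in s / x not in s: membership tests.
- isdisjoint(other): return True if the two sets have no elements in common, False otherwise.
- s1 <= s2: s1 is a subset of s2.
- s1 < s2: s1 is a proper subset of s2.
- issubset(other): test whether every element in the set is in other.
- issuperset(other): test whether every element in other is in the set.
- union(other, …): return a new set with elements from the set and all others.
- set | other | …
- intersection(other, …): return a new set with elements common to the set and all others.
- set & other & …
- difference(other, …): return a new set with elements in the set that are not in the others.
- set - other - …
- symmetric_difference(other): return a new set with elements in either the set or other but not both.
- set ^ other
- copy(): return a shallow copy of the set.
- When an operator mixes a set and a frozenset, the result has the type of the first operand: set | frozenset returns a set, and frozenset | set returns a frozenset.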
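The operations listed above can be tried on two small sets. This is only an illustrative sketch; the values in a and b are arbitrary.

```python
a = {1, 2, 3}
b = {3, 4}

print(a | b)  # union: {1, 2, 3, 4}
print(a & b)  # intersection: {3}
print(a - b)  # difference: {1, 2}
print(a ^ b)  # symmetric difference: {1, 2, 4}

print({1, 2} <= a)           # True: subset
print(a < a)                 # False: a set is not a proper subset of itself
print(a.isdisjoint({7, 8}))  # True: no common elements

# With mixed operands the result takes the type of the first operand.
print(type(a | frozenset(b)).__name__)  # set
print(type(frozenset(b) | a).__name__)  # frozenset
```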
- Methods available on set but not on frozenset:
update(other, ...)
set |= other | ...
# Update the set, adding elements from all others.
intersection_update(other, ...)
set &= other & ...
# Update the set, keeping only elements found in it and all others.
difference_update(other, ...)
set -= other | ...
# Update the set, removing elements found in others.
symmetric_difference_update(other)
set ^= other
# Update the set, keeping only elements found in either set, but not in both.
add(elem)
# Add element elem to the set.
remove(elem)
# Remove element elem from the set. Raises KeyError if elem is not contained in the set.
discard(elem)
# Remove element elem from the set if it is present.
pop()
# Remove and return an arbitrary element from the set. Raises KeyError if the set is empty.
clear()
# Remove all elements from the set.
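A short sketch of the in-place methods, mainly to contrast remove() with discard() and to show that frozenset lacks the mutating methods. The numbers are arbitrary sample values.

```python
s = {1, 2, 3}

s.update([3, 4], (5,))                  # any iterables: s is now {1, 2, 3, 4, 5}
s.intersection_update({2, 3, 4, 5, 6})  # s is now {2, 3, 4, 5}
s.difference_update({5})                # s is now {2, 3, 4}
s.symmetric_difference_update({4, 9})   # s is now {2, 3, 9}

s.discard(100)  # missing element: silently ignored
try:
    s.remove(100)  # missing element: raises KeyError
    remove_raised = False
except KeyError:
    remove_raised = True

print(s)              # {9, 2, 3} or another order; sets are unordered
print(remove_raised)  # True

# frozenset does not provide the mutating methods.
print(hasattr(frozenset(), 'add'))  # False
```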
defaultdict
With a regular dictionary, accessing a key that does not exist raises KeyError, but we often want every key to have a default value.
strings = ('puppy', 'kitten', 'puppy', 'puppy', 'weasel', 'puppy', 'kitten', 'puppy')
counts = {}
for kw in strings:
    counts[kw] += 1  # raises KeyError: 'puppy' is not in counts yet
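This code is meant to count how many times each string occurs, but it does not run in Python: the first counts[kw] += 1 raises KeyError because the key is not in the dictionary yet. One fix is to check for the key first: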
strings = ('puppy', 'kitten', 'puppy', 'puppy', 'weasel', 'puppy', 'kitten', 'puppy')
counts = {}
for kw in strings:
    if kw not in counts:
        counts[kw] = 1
    else:
        counts[kw] += 1
Alternatively, use dict.setdefault():
strings = ('puppy', 'kitten', 'puppy', 'puppy', 'weasel', 'puppy', 'kitten', 'puppy')
counts = {}
for kw in strings:
    counts.setdefault(kw, 0)
    counts[kw] += 1
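setdefault() takes two arguments: a key and a default value. If the key is missing, it inserts the key with the default; in either case it returns the value stored under the key. That allows a more compact form, although it is harder to read: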
strings = ('puppy', 'kitten', 'puppy', 'puppy', 'weasel', 'puppy', 'kitten', 'puppy')
counts = {}
for kw in strings:
    counts[kw] = counts.setdefault(kw, 0) + 1
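Because setdefault() returns the stored value, the same idea also covers the multi-value problem this section started with. A minimal sketch with a plain dict follows; the data in pairs is invented for illustration.

```python
# Hypothetical sample data: (key, value) pairs with repeated keys.
pairs = [('a', 1), ('a', 2), ('b', 4)]

d = {}
for key, value in pairs:
    # setdefault returns the list stored under key,
    # inserting a new empty list first if the key is missing.
    d.setdefault(key, []).append(value)

print(d)  # {'a': [1, 2], 'b': [4]}
```

One drawback of this form is that a new empty list is created on every call, even when the key already exists.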
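Python provides a dedicated class for dictionaries with default values: collections.defaultdict
- Initialization: defaultdict() takes a type as its first argument. The type is called with no arguments to create each default value.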
>>> from collections import defaultdict
>>> dd = defaultdict(list)
>>> dd
defaultdict(<class 'list'>, {})
- When a key that does not exist is accessed, a default value is created and stored automatically.
>>> dd['tim']
[]
>>> dd['jerry'] = 26
>>> dd
defaultdict(<class 'list'>, {'tim': [], 'jerry': 26})
>>> dd['jerry'] = 27
>>> dd
defaultdict(<class 'list'>, {'tim': [], 'jerry': 27})
>>> dd['tim'].append(22)
>>> dd
defaultdict(<class 'list'>, {'tim': [22], 'jerry': 27})
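The default value is only created when the key is accessed through dict[key], that is, dict.__getitem__(key). Other lookups, such as get() and in, do not create it.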
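A quick check of that rule, as a sketch: get() and in leave the dictionary untouched, and only subscript access creates the entry. The key 'tim' is reused from the session above.

```python
from collections import defaultdict

dd = defaultdict(list)

print(dd.get('tim'))  # None: get() does not call the default factory
print('tim' in dd)    # False: membership tests do not create the key
print(len(dd))        # 0

dd['tim']             # subscript access creates the default value
print(len(dd))        # 1
print(dd['tim'])      # []
```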
- Besides a type, defaultdict also accepts any callable that takes no arguments.
>>> from collections import defaultdict
>>> def zero():
...     return 0
...
>>> dd = defaultdict(zero)
>>> dd
defaultdict(<function zero at 0xb7ed2684>, {})
>>> dd['foo']
0
>>> dd
defaultdict(<function zero at 0xb7ed2684>, {'foo': 0})
Now the original word-counting problem can be solved:
from collections import defaultdict
strings = ('puppy', 'kitten', 'puppy', 'puppy', 'weasel', 'puppy', 'kitten', 'puppy')
counts = defaultdict(lambda: 0)  # a lambda defines the simple zero-argument function; defaultdict(int) works as well
for s in strings:
    counts[s] += 1
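- How is defaultdict implemented?
The key is the __missing__() method. When __getitem__() cannot find the key, it calls __missing__() to obtain the default value, and defaultdict's __missing__() also stores that value under the key.
In any subclass of dict, d[key] calls __missing__() when the key does not exist and returns whatever that method returns. dict itself does not implement this method.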
>>> class Missing(dict):
...     def __missing__(self, key):
...         return 'missing'
...
>>> d = Missing()
>>> d['1']
'missing'
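This example shows that __missing__() is indeed called. Note that Missing only returns the value and does not store the key.
The class can be taken one step further to behave like defaultdict: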
>>> class Defaulting(dict):
...     def __missing__(self, key):
...         self[key] = 'default'
...         return 'default'
...
>>> dd = Defaulting()
>>> dd['1']
'default'
>>> dd
{'1': 'default'}
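Going one step beyond Defaulting, the sketch below accepts a factory callable, which is roughly how defaultdict behaves from the outside. SimpleDefaultDict is a hypothetical name chosen for illustration, and this is a simplified stand-in, not the actual implementation of collections.defaultdict.

```python
class SimpleDefaultDict(dict):
    """Hypothetical, simplified stand-in for collections.defaultdict."""

    def __init__(self, default_factory=None, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.default_factory = default_factory

    def __missing__(self, key):
        # Called by dict.__getitem__ only when key is absent.
        if self.default_factory is None:
            raise KeyError(key)
        value = self.default_factory()
        self[key] = value  # store it so later lookups find the same object
        return value


d = SimpleDefaultDict(list)
d['a'].append(1)
d['a'].append(2)
d['b'].append(4)
print(d)  # {'a': [1, 2], 'b': [4]}
```

Storing the value before returning it is what makes d['a'].append(1) work: the list that is appended to is the same object kept in the dictionary.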