Python Notes: Quick Reference
http://yanghuangblog.com/index.php/archives/7/
[BLOG]
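Table of contents
- Python Notes: Quick Reference
- Resources
- Syntax differences
- reserved words
- `from import as`
- `def return`
- `pass`
- `if elif else`
- `is` `is not` `in`
- `and or`
- `while` `for in` `continue break`
- `try except`
- `class`
- `del`
- `with as`
- Keywords still to be organized
- Operators and other rules
- Variables, built-in functions and classes, advanced built-in data structures
- Built-in Functions
- `type()`
- `dir()`
- `input([prompt])`
- `print(*objects, sep=' ', end='\n', file=sys.stdout, flush=False)`
- `int()` `float()` `str()` `list()` `tuple()`
- `max()` `min()` `len()` `sum()`
- `open()`
- `range()`: starts at 0 by default and excludes stop, like C array indexing
- Others
- string
- file
- list
- Iterating a list with for in is read-only
- For lists, + concatenates and * repeats
- list: append adds a single item, extend adds a list
- sort(*, key=None, reverse=False)
- list: pop by index, remove by value, del with array-style indexing
- Lists and strings (list split join)
- dict
- Tuple
- Braces, square brackets, parentheses
- Common libraries: standard library (installed by default, but must be imported)
- random
- Regular expressions
- socket
- urllib
- xml.etree.ElementTree
- json
- sqlite3
- time
- Common libraries: must be installed manually before importing
- class
- Questions
- todo
- Jupyter notebook
- numpy
- Getting the number of rows and columns of a matrix (2-D case)
- Summary of common NumPy methods
- Creating matrices (using ndarray objects)
- Slicing matrices
- Merging matrices
- Creating matrices with functions
- **fromstring**: getting the ASCII codes of characters
- Matrix operations
- Getting matrix information (e.g. the mean)
- References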
Resources
Online book
docs
Dash lookup
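Syntax differences
Sample code and reference code
A function can return multiple values.
If some of them are not needed, use _ as a placeholder, for example: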
parameters_values, _ = dictionary_to_vector(parameters)
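A minimal sketch of the same idea with a self-contained function. `min_max` is a hypothetical helper made up for this note; it is not the `dictionary_to_vector` used above.

```python
def min_max(values):
    """Return the smallest and largest item as a tuple."""
    return min(values), max(values)

# Unpack both return values
lowest, highest = min_max([3, 1, 4, 1, 5])

# Use _ for the value that is not needed
_, only_highest = min_max([3, 1, 4, 1, 5])

print(lowest, highest)   # 1 5
print(only_highest)      # 5
```

The function really returns one tuple; the assignment on the left unpacks it.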
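Line continuation
A Python statement normally fits on one line. When a line is too long and has to be split, there are two options:
1. End the line with the continuation character " \" (a space followed by a backslash):
test = 'item_one' \
       'item_two' \
       'item_three'
Result: 'item_oneitem_twoitem_three'
2. Wrap the expression in brackets; inside (), {} or [] no continuation character is needed:
test2 = ('csdn '
         'cssdn')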
reserved words
The reserved words in the language where humans talk to Python include the following:
from import as
def return
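def thing():
    pass  # please implement this
    return
pass
pass is typically used as a placeholder statement or to stub out a program; it performs no operation.
During the design phase, pass also often serves as a TODO marker, a reminder to fill in the implementation later.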
if elif else
is
is not
in
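What is the difference between is and == in Python?
is checks whether two names refer to the same object (same id); == checks whether the values are equal.
In Python, everything is an object. (Important enough to say three times.)
Every object has three attributes: id, type, and value.
id is the object's identity; the built-in function id() returns it (in CPython this is the object's memory address).
type is the object's type; the built-in function type() returns it.
value is the object's value.
Further notes:
is only compares identities, so it is usually at least as fast as ==, which may call a user-defined __eq__ method. The two answer different questions, though, so use is only when identity is what you mean (for example, x is None).
in is a membership test used in conditions; it returns True or False.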
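A short sketch of the identity versus equality distinction described above, using lists because two separately built lists are always distinct objects:

```python
a = [1, 2, 3]
b = [1, 2, 3]   # equal value, but a separate object
c = a           # a second name for the same object

print(a == b)   # True: the values are equal
print(a is b)   # False: the ids differ
print(a is c)   # True: same object, same id
print(id(a) == id(c))   # True
print(2 in a)   # True: membership test
```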
and or
Combine several conditions.
while
for in
continue break
>>> list(range(10))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
for i in range(10):
    pass
for i in [5,4,3,2,1]:
    pass
for countdown in 5, 4, 3, 2, 1, "hey!":
    print(countdown)
while True:
    pass
try except
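This differs quite a bit from other languages.
By default, an unhandled exception prints a traceback and stops the program.
With try plus a block, as soon as one statement in the block fails, the rest of the block is skipped and execution jumps straight to the matching except clause.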
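A minimal sketch of that control flow. `to_int` is a hypothetical helper written for this note; the point is that the statement after the failing line inside try never runs.

```python
def to_int(text):
    """Convert text to int, returning None when it is not a number."""
    try:
        value = int(text)          # may raise ValueError
        print('converted', text)   # skipped when the line above fails
    except ValueError:
        return None
    return value

print(to_int('42'))    # 42
print(to_int('abc'))   # None
```

Catching a specific exception such as ValueError, rather than using a bare except, keeps unrelated errors visible.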
class
del
There is a way to remove an item from a list given its index instead of its value: the del statement.
>>> a = [-1, 1, 66.25, 333, 333, 1234.5]
>>> del a[0]
>>> a
[1, 66.25, 333, 333, 1234.5]
>>> del a[2:4]
>>> a
[1, 66.25, 1234.5]
>>> del a[:]
>>> a
[]
>>> del a
with as
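Without the with statement, the code looks like this:
file = open("/tmp/foo.txt")
data = file.read()
file.close()
There are two problems here. First, it is easy to forget to close the file handle. Second, an exception while reading the data is not handled at all. A more robust version that copes with exceptions:
file = open("/tmp/foo.txt")
try:
    data = file.read()
finally:
    file.close()
This works, but it is verbose. This is where with shines: besides the cleaner syntax, with also deals properly with exceptions raised inside its block. The with version:
with open("/tmp/foo.txt") as file:
    data = file.read()
How it works:
The object that the with expression evaluates to must have an __enter__() method and an __exit__() method. After the expression following with is evaluated, __enter__() is called on the resulting object, and its return value is bound to the name after as. When the block has finished, __exit__() is called on the same object. If the block raises an exception, __exit__() still runs and receives the exception's type, value, and stack trace; a ZeroDivisionError raised inside the block, for example, is handed to __exit__(), which can print or handle it. When writing a library, cleanup work such as releasing resources or closing files belongs in __exit__().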
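The protocol can be sketched with a small class. `Sample` is a hypothetical context manager made up for illustration; it records which hooks ran and suppresses the exception by returning True from `__exit__`.

```python
class Sample:
    """Hypothetical context manager used only for illustration."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append('enter')
        return self                      # bound to the name after `as`

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs on normal exit and also when the block raises.
        self.events.append(('exit', exc_type))
        return True                      # a true value suppresses the exception


with Sample() as sample:
    1 / 0                                # raises ZeroDivisionError

print(sample.events)
```

Returning False (or None) from `__exit__` instead would let the ZeroDivisionError propagate after the cleanup has run.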
Keywords still to be organized
global yield
assert
raise
finally lambda nonlocal
Operators and other rules
Operators
**
Exponentiation (power)
/
returns a float; //
is floor division and returns an int for int operands
>>> minute = 59
>>> minute/60
0.9833333333333333
>>> minute = 59
>>> minute//60
0
string +
Concatenation
>>> first = '100'
>>> second = '150'
>>> print(first + second)
100150
tab
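Python does not allow mixing tabs and spaces for indentation, so use spaces everywhere and configure the IDE accordingly.
Use spaces, not tabs.
Sublime Text 3 user settings: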
{
"color_scheme": "Packages/Color Scheme - Default/Solarized (Light).tmTheme",
"ignored_packages":
[
"Vintage"
],
"tab_size": 4,
"translate_tabs_to_spaces": true,
}
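Code blocks
A block usually starts after a ":" and ends when the indentation returns to the previous level.
if 4 > 3:
    print("111")
print("222")
No semicolons, no braces.
vowels.sort(reverse=True): a keyword argument can be passed directly by name, without filling in the parameters before it.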
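Variables, built-in functions and classes, advanced built-in data structures
Static languages, dynamic languages, scripting languages, glue code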
Built-in Functions
type()
class type(object)
dir()
dir lists all attributes and methods of a class: dir(class)
input([prompt])
print(*objects, sep=' ', end='\n', file=sys.stdout, flush=False)
a = np.array([[1,2,3,4]])
print(str(a.shape) + "123")
print(a.shape + "123")
(1, 4)123
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-19-40d703a12efd> in <module>()
     16 a = np.array([[1,2,3,4]])
     17 print(str(a.shape) + "123")
---> 18 print(a.shape + "123")
TypeError: can only concatenate tuple (not "str") to tuple
Printing a value on its own does not need str(); to join it to a string with +, convert it with str() first.
int()
float()
str()
list()
tuple()
class int(x=0)
class int(x, base=10)
class float([x])
>>> float('+1.23')
1.23
>>> float('   -12345\n')
-12345.0
>>> float('1e-003')
0.001
>>> float('+1E6')
1000000.0
>>> float()
0.0
max()
min()
len()
sum()
open()
open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)
Open file and return a corresponding file object. If the file cannot be opened, an OSError is raised.
range()
Starts at 0 by default and excludes stop, like C array indexing.
class range(stop)
class range(start, stop[, step])
start: The value of the start parameter (or 0 if the parameter was not supplied)
stop: The value of the stop parameter
step: The value of the step parameter (or 1 if the parameter was not supplied)
>>> list(range(10))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> list(range(1, 11))
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> list(range(0, 30, 5))
[0, 5, 10, 15, 20, 25]
Others
abs() delattr() hash() memoryview() set()
all() dict() help() setattr()
any() hex() next() slice()
ascii() divmod() id() object() sorted()
bin() enumerate() oct() staticmethod()
bool() eval()
breakpoint() exec() isinstance() ord()
bytearray() filter() issubclass() pow() super()
bytes() iter()
callable() format() property()
chr() frozenset() vars()
classmethod() getattr() locals() repr() zip()
compile() globals() map() reversed() __import__()
complex() hasattr() round()
None
Without None (in C, for example), finding a minimum needs an extra flag to record whether the variable is still in its initial, unset state.
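A small sketch of None acting as the "nothing seen yet" marker, so no separate flag is required. `smallest` is a hypothetical function name chosen for this note.

```python
def smallest(values):
    """Return the smallest item, or None for an empty sequence."""
    result = None                  # None marks "nothing seen yet"
    for value in values:
        if result is None or value < result:
            result = value
    return result

print(smallest([9, 41, 12, 3, 74, 15]))   # 3
print(smallest([]))                        # None
```

The check uses `is None` rather than `== None`, in line with the identity comparison discussed earlier.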
string
[0:4]
covers indices 0, 1, 2, 3 and excludes 4 (slicing)
>>> fruit = 'banana'
>>> fruit[:3]
'ban'
>>> fruit[3:]
'ana'
>>> fruit = 'banana'
>>> len(fruit)
6
The expression fruit[-1]
yields the last letter, fruit[-2]
yields the second to last, and so on.
>>> 'a' in 'banana'
True
>>> 'seed' in 'banana'
False
Strings are immutable: the contents of a string are read-only and cannot be changed.
>>> greeting = 'Hello, world!'
>>> greeting[0] = 'J'
TypeError: 'str' object does not support item assignment
string methods
find
str.find(sub[, start[, end]])
Return the lowest index in the string where substring sub is found within the slice s[start:end].
Return -1 if sub is not found.
>>> line = ' Here we go '
>>> line.strip()
'Here we go'
>>> line = 'Have a nice day'
>>> line.startswith('h')
False
>>> line.lower()
'have a nice day'
>>> line.lower().startswith('h')
True
strip and lower return a new string; they do not change the original string.
Format operator
>>> camels = 42
>>> 'I have spotted %d camels.' % camels
'I have spotted 42 camels.'
>>> 'In %d years I have spotted %g %s.' % (3, 0.1, 'camels')
'In 3 years I have spotted 0.1 camels.'
file
Reading: a plain for in loop is enough, because the file handle behaves as a sequence of lines.
file.read() reads all the data at once.
rstrip also strips the trailing newline.
fname = input('Enter the file name: ')
try:
    fhand = open(fname)
except:
    print('File cannot be opened:', fname)
    exit()
count = 0
for line in fhand:
    if line.startswith('Subject:'):
        count = count + 1
print('There were', count, 'subject lines in', fname)
# Code: http://www.py4e.com/code3/search7.py
list
Lists are mutable
Iterating a list with for in is read-only: assigning to the loop variable does not change the list.
huangyangdeMacBook-Pro:python_test yang$ cat test.py
tmp = [1,2,3]
print(tmp)
for item in tmp:
    print(item)
    item = 4
print(tmp)
for i in range(len(tmp)):
    print(tmp[i])
    tmp[i] = 4
print(tmp)
huangyangdeMacBook-Pro:python_test yang$ python3 test.py
[1, 2, 3]
1
2
3
[1, 2, 3]
1
2
3
[4, 4, 4]
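For lists, + concatenates and * repeats.
list: append adds a single item, extend adds every item of another list.
sort(*, key=None, reverse=False)
Most list methods, sort included, modify the list in place and return None.
list: pop removes by index, remove removes by value, del uses array-style indexing.
del is a statement, not a built-in function.
x = t.pop(1)
t.remove('b')
del t[1]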
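The list operations above, put together in one runnable sketch (the variable names are arbitrary):

```python
t = ['a', 'b'] + ['c']        # + concatenates -> ['a', 'b', 'c']
zeros = [0] * 3               # * repeats      -> [0, 0, 0]

t.append('d')                 # adds one item
t.extend(['e', 'f'])          # adds each item of another list
# t is now ['a', 'b', 'c', 'd', 'e', 'f']

popped = t.pop(1)             # removes by index and returns 'b'
t.remove('e')                 # removes the first item equal to 'e'
del t[0]                      # removes by index, returns nothing
# t is now ['c', 'd', 'f']

result = t.sort(reverse=True) # sorts in place; result is None
print(t, zeros, popped, result)
```

Because sort returns None, writing `t = t.sort()` throws the list away; use `sorted(t)` when a new list is wanted.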
Lists and strings (list split join)
For example, list('abc') returns ['a', 'b', 'c'] and list( (1, 2, 3) ) returns [1, 2, 3]. If no argument is given, the constructor creates a new empty list, [].
>>> s = 'pining for the fjords'
>>> t = s.split()
>>> print(t)
['pining', 'for', 'the', 'fjords']
>>> print(t[2])
the
list: str.split(delimiter) and delimiter.join(list)
str.split(sep=None, maxsplit=-1)
>>> '1 2 3'.split()
['1', '2', '3']
>>> '1,2,3'.split(',')
['1', '2', '3']
>>> '1,2,3'.split(',', maxsplit=1)
['1', '2,3']
>>> '1,2,,3,'.split(',')
['1', '2', '', '3', '']
>>> t = ['pining', 'for', 'the', 'fjords']
>>> delimiter = ' '
>>> delimiter.join(t)
'pining for the fjords'
Two identical string literals defined one after another may share one id (CPython can reuse immutable strings), while two lists defined one after another always have different ids.
dict
>>> eng2sp = {'one': 'uno', 'two': 'dos', 'three': 'tres'}
>>> print(eng2sp)
{'one': 'uno', 'two': 'dos', 'three': 'tres'}
>>> print(eng2sp['two'])
dos
>>> len(eng2sp)
3
>>> print(eng2sp['four'])
KeyError: 'four'
>>> 'one' in eng2sp
True
>>> 'uno' in eng2sp
False
dict: in only searches the keys; to search the values, get them with the values() method first.
>>> d = {"one": 1, "two": 2, "three": 3, "four": 4}
>>> d
{'one': 1, 'two': 2, 'three': 3, 'four': 4}
>>> list(d)
['one', 'two', 'three', 'four']
>>> list(d.keys())
['one', 'two', 'three', 'four']
>>> list(d.values())
[1, 2, 3, 4]
>>> list(d.items())
[('one', 1), ('two', 2), ('three', 3), ('four', 4)]
Reverse lookup in a dict
Looking up a key from its value is a little more involved. One of the usual ways:
A. Combine keys(), values(), and index()
>>> list(student.keys())[list(student.values()).index('1004')]
Result: '小明'
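A runnable sketch of the reverse lookup. The `student` dict below is hypothetical sample data (the original does not show its contents), and the two alternatives after approach A are additions of mine, not necessarily the other methods the original had in mind.

```python
# Hypothetical data: name -> student id
student = {'Xiao Ming': '1004', 'Xiao Hong': '1005'}

# A. keys() and values() come back in matching order, so index() lines them up
name = list(student.keys())[list(student.values()).index('1004')]

# Alternative: scan the items and stop at the first match (None if absent)
name_scan = next((k for k, v in student.items() if v == '1004'), None)

# Alternative: build an inverted dict (only safe when the values are unique)
by_id = {v: k for k, v in student.items()}

print(name, name_scan, by_id['1004'])
```

Approach A raises ValueError when the value is missing, while the scan returns None.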
get(key[, default])
Return the value for key if key is in the dictionary, else default. If default is not given, it defaults to None, so that this method never raises a KeyError.
Common mistake
parameters['W' + str(l)]
parameters[ W + str(l)] #err! W without quotes is an undefined name
Tuple
Tuples are immutable
A tuple is a sequence of values much like a list. The values stored in a tuple can be any type, and they are indexed by integers. The important difference is that tuples are immutable. Tuples are also comparable and hashable so we can sort lists of them and use tuples as key values in Python dictionaries.
>>> t = 'a', 'b', 'c', 'd', 'e'
>>> type(t)
<class 'tuple'>
>>> t = ['a', 'b', 'c', 'd', 'e']
>>> type(t)
<class 'list'>
>>> t = ('a', 'b', 'c', 'd', 'e')
>>> type(t)
<class 'tuple'>
>>> t1 = ('a',)
>>> type(t1)
<class 'tuple'>
>>> t2 = ('a')
>>> type(t2)
<class 'str'>
You can’t modify the elements of a tuple, but you can replace one tuple with another:
>>> t = ('a', 'b', 'c', 'd', 'e')
>>> t = ('A',) + t[1:]
>>> t
('A', 'b', 'c', 'd', 'e')
>>> t = ('a', 'b', 'c', 'd', 'e')
>>> t = ('A',t[1:])
>>> t
('A', ('b', 'c', 'd', 'e'))
Tuple assignment
>>> m = [ 'have', 'fun' ]
>>> x, y = m
>>> x
'have'
>>> y
'fun'
>>> m = [ 'have', 'fun' ]
>>> (x, y) = m
>>> x
'have'
>>> y
'fun'
A particularly clever application of tuple assignment allows us to swap the values of two variables in a single statement:
>>> a, b = b, a
>>> addr = 'monty@python.org'
>>> uname, domain = addr.split('@')
Dictionaries and tuples
>>> d = {'a':10, 'b':1, 'c':22}
>>> t = list(d.items())
>>> t
[('a', 10), ('b', 1), ('c', 22)]
>>> t.sort()
>>> t
[('a', 10), ('b', 1), ('c', 22)]
for key, val in list(d.items()):
    print(val, key)
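Braces, square brackets, parentheses
A dict is defined with {}
A list is defined with []
A tuple is defined with ()
But all three are indexed with [] when used.
Similar to declarations in C:
d = {} # declare a dict d
d = [] # declare a list d
d = () # declare a tuple d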
Common libraries: standard library (installed by default, but must be imported)
random
import random
for i in range(2):
    x = random.random()
    print(x)
0.11132867921152356
0.5950949227890241
>>> random.randint(5, 10)
5
>>> random.randint(5, 10)
9
>>> t = [1, 2, 3]
>>> random.choice(t)
2
>>> random.choice(t)
3
Regular expressions
^
Matches the beginning of the line.
$
Matches the end of the line.
.
Matches any character (a wildcard).
\s
Matches a whitespace character.
\S
Matches a non-whitespace character (opposite of \s).
*
Applies to the immediately preceding character(s) and indicates to match zero or more times.
*?
Applies to the immediately preceding character(s) and indicates to match zero or more times in “non-greedy mode”.
+
Applies to the immediately preceding character(s) and indicates to match one or more times.
+?
Applies to the immediately preceding character(s) and indicates to match one or more times in “non-greedy mode”.
?
Applies to the immediately preceding character(s) and indicates to match zero or one time.
??
Applies to the immediately preceding character(s) and indicates to match zero or one time in “non-greedy mode”.
[aeiou]
Matches a single character as long as that character is in the specified set. In this example, it would match “a”, “e”, “i”, “o”, or “u”, but no other characters.
[a-z0-9]
You can specify ranges of characters using the minus sign. This example is a single character that must be a lowercase letter or a digit.
[^A-Za-z]
When the first character in the set notation is a caret, it inverts the logic. This example matches a single character that is anything other than an uppercase or lowercase letter.
( )
When parentheses are added to a regular expression, they are ignored for the purpose of matching, but allow you to extract a particular subset of the matched string rather than the whole string when using findall().
\b
Matches the empty string, but only at the start or end of a word.
\B
Matches the empty string, but not at the start or end of a word.
\d
Matches any decimal digit; equivalent to the set [0-9].
\D
Matches any non-digit character; equivalent to the set [^0-9].
re.search
# Search for lines that start with 'X' followed by any non
# whitespace characters and ':'
# followed by a space and any number.
# The number can include a decimal.
import re
hand = open('mbox-short.txt')
for line in hand:
    line = line.rstrip()
    if re.search(r'^X\S*: [0-9.]+', line):
        print(line)
# Code: http://www.py4e.com/code3/re10.py
When we run the program, we see the data nicely filtered to show only the lines we are looking for.
X-DSPAM-Confidence: 0.8475
X-DSPAM-Probability: 0.0000
X-DSPAM-Confidence: 0.6178
X-DSPAM-Probability: 0.0000
re.findall and extracting
# Search for lines that start with 'X' followed by any
# non whitespace characters and ':' followed by a space
# and any number. The number can include a decimal.
# Then print the number if it is greater than zero.
import re
hand = open('mbox-short.txt')
for line in hand:
    line = line.rstrip()
    x = re.findall(r'^X\S*: ([0-9.]+)', line)
    if len(x) > 0:
        print(x)
# Code: http://www.py4e.com/code3/re11.py
Instead of calling search(), we add parentheses around the part of the regular expression that represents the floating-point number to indicate we only want findall() to give us back the floating-point number portion of the matching string. The output from this program is as follows:
['0.8475']
['0.0000']
['0.6178']
['0.0000']
['0.6961']
['0.0000']
..
socket
import socket
mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mysock.connect(('data.pr4e.org', 80))
cmd = 'GET http://data.pr4e.org/romeo.txt HTTP/1.0\r\n\r\n'.encode()
mysock.send(cmd)
while True:
    data = mysock.recv(512)
    if len(data) < 1:
        break
    print(data.decode(),end='')
mysock.close()
# Code: http://www.py4e.com/code3/socket1.py
urllib
import urllib.request
fhand = urllib.request.urlopen('http://data.pr4e.org/romeo.txt')
for line in fhand:
    print(line.decode().strip())
# Code: http://www.py4e.com/code3/urllib1.py
Two ways to use urllib.request.urlopen
Loop with for and handle each line inside the loop, decoding a line when its data is needed:
import urllib.request, urllib.parse, urllib.error
fhand = urllib.request.urlopen('http://data.pr4e.org/romeo.txt')
counts = dict()
for line in fhand:
    words = line.decode().split()
    for word in words:
        counts[word] = counts.get(word, 0) + 1
print(counts)
# Code: http://www.py4e.com/code3/urlwords.py
Read everything at once as bytes and hand it to another library to process in one go:
# Search for link values within URL input
import urllib.request, urllib.parse, urllib.error
import re
url = input('Enter - ')
html = urllib.request.urlopen(url).read()
links = re.findall(b'href="(http://.*?)"', html)
for link in links:
    print(link.decode())
# Code: http://www.py4e.com/code3/urlregex.py
HTTP error 403 in Python 3 Web Scraping
from urllib.request import Request, urlopen
req = Request('http://www.cmegroup.com/trading/products/#sortField=oi&sortAsc=false&venues=3&page=1&cleared=1&group=1', headers={'User-Agent': 'Mozilla/5.0'})
webpage = urlopen(req).read()
xml.etree.ElementTree
xml: results = tree.findall('comments/comment/count'). The path leaves out the root element's own name: start from the first-level child and write down to the element you want.
find
import xml.etree.ElementTree as ET
data = '''
<person>
  <name>Chuck</name>
  <phone type="intl">
    +1 734 303 4456
  </phone>
  <email hide="yes" />
</person>'''
tree = ET.fromstring(data)
print('Name:', tree.find('name').text)
print('Attr:', tree.find('email').get('hide'))
# Code: http://www.py4e.com/code3/xml1.py
findall
import xml.etree.ElementTree as ET
input = '''
<stuff>
  <users>
    <user x="2">
      <id>001</id>
      <name>Chuck</name>
    </user>
    <user x="7">
      <id>009</id>
      <name>Brent</name>
    </user>
  </users>
</stuff>'''
stuff = ET.fromstring(input)
lst = stuff.findall('users/user')
print('User count:', len(lst))
for item in lst:
    print('Name', item.find('name').text)
    print('Id', item.find('id').text)
    print('Attribute', item.get('x'))
# Code: http://www.py4e.com/code3/xml2.py
json
json.loads returns a dict or a list.
json.loads
import json
data = '''
[
  { "id" : "001",
    "x" : "2",
    "name" : "Chuck"
  } ,
  { "id" : "009",
    "x" : "7",
    "name" : "Brent"
  }
]'''
info = json.loads(data)
print('User count:', len(info))
for item in info:
    print('Name', item['name'])
    print('Id', item['id'])
    print('Attribute', item['x'])
# Code: http://www.py4e.com/code3/json2.py
BeautifulSoup does not need decode; it accepts the raw bytes:
url = input('Enter - ')
html = urllib.request.urlopen(url, context=ctx).read()
soup = BeautifulSoup(html, 'html.parser')
# Retrieve all of the anchor tags
tags = soup('a')
for tag in tags:
    print(tag.get('href', None))
For json, decode first and then parse:
uh = urllib.request.urlopen(url)
data = uh.read().decode()
try:
    js = json.loads(data)
except:
    js = None
if not js or 'status' not in js or js['status'] != 'OK':
    print('==== Failure To Retrieve ====')
    print(data)
    quit()
# print(json.dumps(js, indent=1))
place_id = js["results"][0]["place_id"]
print(place_id)
sqlite3
CREATE TABLE
import sqlite3
conn = sqlite3.connect('music.sqlite')
cur = conn.cursor()
cur.execute('DROP TABLE IF EXISTS Tracks')
cur.execute('CREATE TABLE Tracks (title TEXT, plays INTEGER)')
conn.close()
# Code: http://www.py4e.com/code3/db1.py
INSERT INTO & SELECT
import sqlite3
conn = sqlite3.connect('music.sqlite')
cur = conn.cursor()
cur.execute('INSERT INTO Tracks (title, plays) VALUES (?, ?)',
    ('Thunderstruck', 20))
cur.execute('INSERT INTO Tracks (title, plays) VALUES (?, ?)',
    ('My Way', 15))
conn.commit()
print('Tracks:')
cur.execute('SELECT title, plays FROM Tracks')
for row in cur:
    print(row)
cur.execute('DELETE FROM Tracks WHERE plays < 100')
conn.commit()
cur.close()
# Code: http://www.py4e.com/code3/db2.py
Programming with multiple tables
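Introduction to INTEGER PRIMARY KEY
Using INTEGER PRIMARY KEY AUTOINCREMENT and rowid/INTEGER PRIMARY KEY in SQLite
When designing tables in SQLite, each table usually has its own integer id as the primary key, and that key is available right after an insert.
SQLite already adds a rowid to each ordinary table internally, and this rowid can be used as an implicit column,
but it is maintained by the SQLite engine. Before 3.0 the rowid was a 32-bit integer; from 3.0 on it is a 64-bit integer. This internal rowid can serve as the id primary key of each table.
INSERT OR IGNORE (SQLite's spelling of insert ignore) means: if the table already contains a conflicting row, the new data is skipped.
A join without condition is a cross join. A cross join repeats each row for the left hand table for each row in the right hand table:
fetchone() returns None when there is no result.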
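A minimal sketch that exercises the points above with an in-memory database, so it leaves no file behind. The `Artist` table and its columns are hypothetical names chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(':memory:')       # throwaway in-memory database
cur = conn.cursor()

# Hypothetical table: id is an alias for the internal rowid
cur.execute('CREATE TABLE Artist (id INTEGER PRIMARY KEY, name TEXT UNIQUE)')

cur.execute('INSERT OR IGNORE INTO Artist (name) VALUES (?)', ('AC/DC',))
first_id = cur.lastrowid                 # key of the row just inserted

# Same name again: the UNIQUE constraint makes SQLite skip this row
cur.execute('INSERT OR IGNORE INTO Artist (name) VALUES (?)', ('AC/DC',))

cur.execute('SELECT COUNT(*) FROM Artist')
row_count = cur.fetchone()[0]

cur.execute('SELECT id FROM Artist WHERE name = ?', ('Nobody',))
missing = cur.fetchone()                 # no matching row -> None

print(first_id, row_count, missing)
conn.close()
```

Checking `fetchone()` against None before indexing into it avoids a TypeError when the query matches nothing.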
time
import time
x1 = [9, 2, 5, 0, 0, 7, 5, 0, 0, 0, 9, 2, 5, 0, 0]
x2 = [9, 2, 2, 9, 0, 9, 2, 5, 0, 0, 9, 2, 5, 0, 0]
### CLASSIC DOT PRODUCT OF VECTORS IMPLEMENTATION ###
tic = time.process_time()
dot = 0
for i in range(len(x1)):
    dot += x1[i]*x2[i]
toc = time.process_time()
print("dot = " + str(dot) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")
Common libraries: must be installed manually before importing
BeautifulSoup
# To run this, you can install BeautifulSoup
# https://pypi.python.org/pypi/beautifulsoup4
# Or download the file
# http://www.py4e.com/code3/bs4.zip
# and unzip it in the same directory as this file
import urllib.request, urllib.parse, urllib.error
from bs4 import BeautifulSoup
import ssl
# Ignore SSL certificate errors
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE
url = input('Enter - ')
html = urllib.request.urlopen(url, context=ctx).read()
soup = BeautifulSoup(html, 'html.parser')
# Retrieve all of the anchor tags
tags = soup('a')
for tag in tags:
    print(tag.get('href', None))
# Code: http://www.py4e.com/code3/urllinks.py
tags = soup('a') returns every anchor tag, not only the last one: calling the soup object is shorthand for soup.find_all('a').
Installing bs4
Option 1:
sudo easy_install beautifulsoup4
This may fail with a TLS version problem; if so, use option 2.
Option 2:
curl 'https://bootstrap.pypa.io/get-pip.py' > get-pip.py
sudo python3 get-pip.py
sudo pip install bs4
class
class PartyAnimal:
    x = 0
    def __init__(self):
        print('I am constructed')
    def party(self) :
        self.x = self.x + 1
        print('So far',self.x)
    def __del__(self):
        print('I am destructed', self.x)
an = PartyAnimal()
an.party()
PartyAnimal.party(an)
The two calls above are equivalent.
Equivalent call styles
numpy.reshape(a, newshape, order='C')
a = np.arange(6).reshape((3, 2))
a = np.reshape(np.arange(6), (3, 2))
Questions
>>> print("hyhy", "1111")
hyhy 1111
>>> print("hyhy" + "1111")
hyhy1111
How can the space be avoided? Pass sep='' to print:
- print(…)
- print(value, …, sep=' ', end='\n', file=sys.stdout, flush=False) # '\n' is a newline
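A quick sketch of the `sep` and `end` parameters. The output is captured in an `io.StringIO` buffer only so that it can be inspected; writing straight to the terminal works the same way.

```python
import io

buffer = io.StringIO()                          # capture output for inspection
print("hyhy", "1111", sep='', file=buffer)      # no separator between values
print("hyhy", "1111", sep='-', end='!\n', file=buffer)
captured = buffer.getvalue()

print(captured, end='')
# hyhy1111
# hyhy-1111!
```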
todo
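Markdown related
When moving a directory, images are external links by default; how can local images be moved along with it?
Naming conventions
OK
Good reference code
How to build up a large collection of reference code
Which operations must be wrapped in try because they will otherwise certainly end in a traceback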
Jupyter notebook
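Install with Anaconda
Anaconda (official website) is a distribution that makes it easy to obtain and manage packages and to manage environments in one place. It bundles conda, Python, and more than 180 scientific packages with their dependencies.
You can pick a download from Anaconda's official download page.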
numpy
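Official documentation: https://docs.scipy.org/doc/numpy-1.10.1/index.html
np.copy
Plain assignment does not copy an array. Like a reference in C++, both names refer to the same data, so later changes also affect the original; use np.copy to get an independent copy.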
parameters_values, _ = dictionary_to_vector(parameters)
thetaplus = np.copy(parameters_values)
thetaplus[i][0] = thetaplus[i][0] + epsilon
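A self-contained sketch of the difference between assignment and np.copy. The array values are made up; they stand in for `parameters_values` above.

```python
import numpy as np

original = np.array([[1.0], [2.0], [3.0]])

alias = original             # no copy: both names share the same data
alias[0][0] = 99.0           # also changes `original`

independent = np.copy(original)
independent[1][0] = -1.0     # `original` is unaffected

print(original.ravel())      # [99.  2.  3.]
print(independent.ravel())   # [99. -1.  3.]
```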
axis
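Getting the number of rows and columns of a matrix (2-D case)
Before iterating over a matrix, you usually get its row and column counts first. The length of each dimension of an ndarray is available through its shape attribute.
import numpy as np
a = np.array([[1,2,3,4,5],[6,7,8,9,10]])
print(a.shape)       # returns a tuple: (2, 5)
print(a.shape[0])    # number of rows: 2
print(a.shape[1])    # number of columns: 5
hy:
In terms of the data structure, axes 0, 1, 2 run from the outermost level inward: the outermost is 0, then 1, then 2.
As for rows and columns, the data is organized row by row, so rows come before columns.
a = np.array([
    [[1,2,3,4,5],[6,7,8,9,10]],
    [[1,2,3,4,5],[6,7,8,9,10]],
])
print(a.shape)
(2, 2, 5) # shape[0] is the outermost (third) dimension, of size 2; shape[1] and shape[2] are the rows and columns
v = image.reshape(image.shape[0]*image.shape[1]*image.shape[2], 1)
When reshape gets two arguments, they are the row and column counts. After reshape, the shape of image itself is unchanged; only v has the new shape.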
https://blog.csdn.net/taotao223/article/details/79187823
axis
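A 2-D array is simpler still: shape (3, 4) is an array with three rows and four columns.
sum(axis=0) collapses the row axis: the numbers in each column are added together.
To summarize: for sum, axis=n means that axis n is collapsed and disappears from the result. cumsum is different: it accumulates along the axis and keeps the shape unchanged.
A very common usage:
x_sum = np.sum(x_exp, axis = 1, keepdims = True)
np.sum
Sum: sum()
sum() adds up matrix elements over the rows, over the columns, or over the whole matrix.
import numpy as np
a = np.array([[1,2,3],[4,5,6]])
print(a.sum())           # sum of the whole matrix
# result 21
print(a.sum(axis=0))     # sum along axis 0 (down the rows): one total per column
# result [5 7 9]
print(a.sum(axis=1))     # sum along axis 1 (across the columns): one total per row
# result [ 6 15]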
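A short sketch of how axis and keepdims change the shape of the result, and why `keepdims=True` is handy for the `x_sum` pattern above. The array values are arbitrary.

```python
import numpy as np

x = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])                # shape (2, 3)

col_totals = x.sum(axis=0)                     # shape (3,): axis 0 collapsed
row_totals = x.sum(axis=1)                     # shape (2,): axis 1 collapsed
row_totals_2d = x.sum(axis=1, keepdims=True)   # shape (2, 1): axis kept

# keepdims lets the totals broadcast against x, e.g. to normalize each row
normalized = x / row_totals_2d
running = x.cumsum(axis=1)                     # shape stays (2, 3)

print(col_totals.shape, row_totals.shape, row_totals_2d.shape)
print(normalized.sum(axis=1))
print(running)
```

Without keepdims, `x / row_totals` would fail here, because a (2,) array cannot broadcast against shape (2, 3).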
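Summary of common NumPy methods
This section lists the commonly used methods of the numpy module.
Creating matrices (using ndarray objects)
In Python's numpy module, the ndarray object it provides is what is normally used.
Creating an ndarray is simple: pass a list as the argument.
import numpy as np
# create a 1-D ndarray
a = np.array([1,2,3,4,5])
# create a 2-D ndarray
a2 = np.array([[1,2,3,4,5],[6,7,8,9,10]])
# higher dimensions follow the same pattern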
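Slicing matrices
Slicing by rows and columns
Matrices are sliced like lists, using [] (square brackets).
import numpy as np
a = np.array([[1,2,3,4,5],[6,7,8,9,10]])
print(a[0:1])      # first row, returns [[1 2 3 4 5]]
print(a[1,2:5])    # second row, third to fifth columns, returns [ 8  9 10]
print(a[1,:])      # second row, returns [ 6  7  8  9 10]
print(a[1:,2:])    # rows after the first, from the third column on, returns [[ 8  9 10]]
Slicing by condition
Conditional slicing passes a boolean expression on the array itself inside [] (square brackets).
import numpy as np
a = np.array([[1,2,3,4,5],[6,7,8,9,10]])
b = a[a>6]   # elements of a greater than 6; the result is a 1-D array
print(b)     # returns [ 7  8  9 10]
# The boolean expression first produces a boolean matrix, which is then passed to [] to do the selection
print(a>6)
# returns
[[False False False False False]
 [False  True  True  True  True]]
A frequent use of conditional slicing is setting the elements that meet a condition to a given value.
For example, set every element greater than 6 to 0.
import numpy as np
a = np.array([[1,2,3,4,5],[6,7,8,9,10]])
print(a)
# initial matrix
[[ 1  2  3  4  5]
 [ 6  7  8  9 10]]
a[a>6] = 0
print(a)
# matrix after zeroing the elements greater than 6
[[1 2 3 4 5]
 [6 0 0 0 0]]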
Merging matrices
Matrices can be merged with numpy's hstack and vstack.
import numpy as np
a1 = np.array([[1,2],[3,4]])
a2 = np.array([[5,6],[7,8]])
# Note: the arrays must be passed together as a list or a tuple
print(np.hstack([a1,a2]))
# horizontal merge, result:
[[1 2 5 6]
 [3 4 7 8]]
print(np.vstack((a1,a2)))
# vertical merge, result:
[[1 2]
 [3 4]
 [5 6]
 [7 8]]
# Matrices can also be merged with concatenate.
np.concatenate( (a1,a2), axis=0 ) # same as np.vstack( (a1,a2) )
np.concatenate( (a1,a2), axis=1 ) # same as np.hstack( (a1,a2) )
Creating matrices with functions
numpy ships with functions that create ndarray objects, which makes it easy to build common or regular matrices.
arange
import numpy as np
a = np.arange(10)        # from 0 up to 10 (exclusive), step 1
print(a)
# returns [0 1 2 3 4 5 6 7 8 9]
a1 = np.arange(5,10)     # from 5 up to 10 (exclusive), step 1
print(a1)
# returns [5 6 7 8 9]
a2 = np.arange(5,20,2)   # from 5 up to 20 (exclusive), step 2
print(a2)
# returns [ 5  7  9 11 13 15 17 19]
linspace
linspace() is very similar to MATLAB's linspace: it creates a given number of evenly spaced values, i.e. an arithmetic sequence.
import numpy as np
a = np.linspace(0,10,7)  # arithmetic sequence of 7 numbers from 0 to 10 inclusive
print(a)
# result
[  0.           1.66666667   3.33333333   5.           6.66666667   8.33333333  10.        ]
logspace
linspace generates an arithmetic sequence, while logspace generates a geometric sequence.
The example below generates a geometric sequence of 5 numbers whose first term is 10^0 and last term is 10^4.
import numpy as np
a = np.logspace(0,4,5)
print(a)
# result
[  1.00000000e+00   1.00000000e+01   1.00000000e+02   1.00000000e+03   1.00000000e+04]
ones, zeros, eye, empty
ones creates an all-ones matrix
zeros creates an all-zeros matrix
eye creates an identity matrix
empty creates an uninitialized matrix (it actually contains arbitrary values)
import numpy as np
a_ones = np.ones((3,4))    # 3x4 all-ones matrix
print(a_ones)
# result
[[ 1.  1.  1.  1.]
 [ 1.  1.  1.  1.]
 [ 1.  1.  1.  1.]]
a_zeros = np.zeros((3,4))  # 3x4 all-zeros matrix
print(a_zeros)
# result
[[ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]]
a_eye = np.eye(3)          # 3x3 identity matrix
print(a_eye)
# result
[[ 1.  0.  0.]
 [ 0.  1.  0.]
 [ 0.  0.  1.]]
a_empty = np.empty((3,4))  # 3x4 uninitialized matrix
print(a_empty)
# result (arbitrary values)
[[  9.25283328e+086,               nan,   6.04075076e-309,   1.53957654e-306],
 [  3.60081101e+228,   8.59109220e+115,   5.83022290e+252,   7.29515154e-315],
 [  8.73990008e+245,  -1.12621655e-279,   8.06565391e-273,   8.35428692e-308]]
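fromstring: getting the ASCII codes of characters
fromstring() converts a string into an ndarray, which is useful for turning a string into numbers: it yields the sequence of ASCII codes. Binary-mode np.fromstring is deprecated in newer NumPy releases; np.frombuffer on the encoded bytes is the replacement and is used below.
import numpy as np
a = "abcdef"
b = np.frombuffer(a.encode('ascii'), dtype=np.int8)  # one character is 8 bits, so dtype is np.int8
print(b)
# returns [ 97  98  99 100 101 102]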
fromfunction
fromfunction() builds each element of a matrix from its row and column index.
For example, create a matrix in which every element is the sum of its row index and column index.
import numpy as np
def func(i,j):
    return i+j
a = np.fromfunction(func,(5,6))
# the first argument is the function; the second is a list or tuple giving the matrix size
print(a)
# returns
[[ 0.  1.  2.  3.  4.  5.]
 [ 1.  2.  3.  4.  5.  6.]
 [ 2.  3.  4.  5.  6.  7.]
 [ 3.  4.  5.  6.  7.  8.]
 [ 4.  5.  6.  7.  8.  9.]]
# note that row and column indices both start at 0
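Matrix operations
Common matrix operators
The ndarray object overloads many operators; they apply the operation to corresponding elements of two matrices.
For example: + - * / % **
Common matrix functions
numpy likewise defines many functions that apply to every element of a matrix.
The table below assumes numpy has been imported with import numpy as np and that a is an ndarray.
np.sin(a) sine of every element of a, sin(x)
np.cos(a) cosine of every element of a, cos(x)
np.tan(a) tangent of every element of a, tan(x)
np.arcsin(a) inverse sine of every element of a, arcsin(x)
np.arccos(a) inverse cosine of every element of a, arccos(x)
np.arctan(a) inverse tangent of every element of a, arctan(x)
np.exp(a) exponential of every element of a, e^x
np.sqrt(a) square root of every element of a, √x
hy:
abs: absolute value
square: square
For example:
import numpy as np
a = np.array([[1,2,3],[4,5,6]])
print(np.sin(a))
# result
[[ 0.84147098  0.90929743  0.14112001]
 [-0.7568025  -0.95892427 -0.2794155 ]]
print(np.arcsin(a))
# result
# RuntimeWarning: invalid value encountered in arcsin
[[ 1.57079633         nan         nan]
 [        nan         nan         nan]]
When an element lies outside the function's domain, a RuntimeWarning is issued and the result is nan (not a number).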
Matrix multiplication (dot product)
Matrix multiplication requires the number of columns of the first matrix to equal the number of rows of the second.
The function for matrix multiplication is dot.
For example:
import numpy as np
a1 = np.array([[1,2,3],[4,5,6]])      # a1 is a 2x3 matrix
a2 = np.array([[1,2],[3,4],[5,6]])    # a2 is a 3x2 matrix
print(a1.shape[1]==a2.shape[0])       # True, the condition for multiplication holds
print(a1.dot(a2))
# a1.dot(a2) corresponds to a1*a2 in MATLAB
# while a1*a2 in Python corresponds to a1.*a2 in MATLAB
# result
[[22 28]
 [49 64]]
Matrix transpose A^T
import numpy as np
a = np.array([[1,2,3],[4,5,6]])
print(a.transpose())
# result
[[1 4]
 [2 5]
 [3 6]]
# a simpler way to transpose is a.T
a = np.array([[1,2,3],[4,5,6]])
print(a.T)
# result
[[1 4]
 [2 5]
 [3 6]]
Matrix inverse a^-1
To invert a matrix, import numpy.linalg and use its inv function.
The matrix must be square (same number of rows and columns) and also non-singular.
import numpy as np
import numpy.linalg as lg
a = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(lg.inv(a))
# result (this matrix is singular, so these huge values are floating-point noise, not a true inverse; depending on the platform, inv may raise LinAlgError instead)
[[ -4.50359963e+15   9.00719925e+15  -4.50359963e+15]
 [  9.00719925e+15  -1.80143985e+16   9.00719925e+15]
 [ -4.50359963e+15   9.00719925e+15  -4.50359963e+15]]
a = np.eye(3)      # 3x3 identity matrix
print(lg.inv(a))   # the inverse of the identity matrix is itself
# result
[[ 1.  0.  0.]
 [ 0.  1.  0.]
 [ 0.  0.  1.]]
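Getting matrix information (e.g. the mean)
Maximum and minimum
max and min return the largest and smallest element of the whole matrix, of each row, or of each column.
For example:
import numpy as np
a = np.array([[1,2,3],[4,5,6]])
print(a.max())   # maximum of the whole matrix, result: 6
print(a.min())   # result: 1
# The axis keyword argument gives the maximum (minimum) per row or per column
# axis=0: maximum (minimum) along the row direction, i.e. of each column
# axis=1: maximum (minimum) along the column direction, i.e. of each row
# for example
print(a.max(axis=0))   # result: [4 5 6]
print(a.max(axis=1))   # result: [3 6]
# The position of the maximum element is given by argmax (argmin for the minimum)
print(a.argmax(axis=1))   # result: [2 2]
Mean: mean()
mean() returns the average of the elements, again for the whole matrix, per row, or per column.
import numpy as np
a = np.array([[1,2,3],[4,5,6]])
print(a.mean())   # result: 3.5
# The axis keyword argument selects the direction along which the mean is taken
print(a.mean(axis=0))   # result: [ 2.5  3.5  4.5]
print(a.mean(axis=1))   # result: [ 2.  5.]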
Variance: var()
The variance function is var(). It is equivalent to mean(abs(x - x.mean())**2), where x is the matrix.
import numpy as np
a = np.array([[1,2,3],[4,5,6]])
print(a.var())         # result: 2.91666666667
print(a.var(axis=0))   # result: [ 2.25  2.25  2.25]
print(a.var(axis=1))   # result: [ 0.66666667  0.66666667]
Standard deviation: std()
The standard deviation function is std().
std() is equivalent to sqrt(mean(abs(x - x.mean())**2)), or to sqrt(x.var()).
import numpy as np
a = np.array([[1,2,3],[4,5,6]])
print(a.std())         # result: 1.70782512766
print(a.std(axis=0))   # result: [ 1.5  1.5  1.5]
print(a.std(axis=1))   # result: [ 0.81649658  0.81649658]
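Median: median()
The median is the middle value of a sequence sorted by size; with an even number of values it is the average of the two middle values.
For example, the sequence [5,2,6,4,2] sorts to [2,2,4,5,6]; the middle value is 4, so the median is 4.
The sequence [5,2,6,4,3,2] sorts to [2,2,3,4,5,6]; with an even count the two middle values are 3 and 4, so the median is 3.5.
The function is median(), called as numpy.median(x,[axis]); axis selects the axis, and the default axis=None takes the median of all values.
import numpy as np
x = np.array([[1,2,3],[4,5,6]])
print(np.median(x))          # median of all values
# result 3.5
print(np.median(x,axis=0))   # median along the first axis
# result [ 2.5  3.5  4.5]
print(np.median(x,axis=1))   # median along the second axis
# result [ 2.  5.]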
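Cumulative sum: cumsum()
The cumulative sum at a position is the sum of all elements up to and including that position.
For example, the cumulative sum of [1,2,3,4,5] is [1,3,6,10,15]: the first element is 1, the second is 1+2=3, ..., the fifth is 1+2+3+4+5=15.
cumsum() computes the cumulative sum over the rows, over the columns, or over the whole matrix.
import numpy as np
a = np.array([[1,2,3],[4,5,6]])
print(a.cumsum())         # cumulative sum over the whole (flattened) matrix
# result [ 1  3  6 10 15 21]
print(a.cumsum(axis=0))   # cumulative sum along axis 0 (down the rows)
# result
[[1 2 3]
 [5 7 9]]
print(a.cumsum(axis=1))   # cumulative sum along axis 1 (across the columns)
# result
[[ 1  3  6]
 [ 4  9 15]]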
References
Adapted from: smallpi
See also: array and matrix in numpy
me
Matrix norm: numpy.linalg.norm
https://blog.csdn.net/bitcarmanlee/article/details/51945271
https://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.linalg.norm.html#numpy.linalg.norm
The Frobenius norm is given by [R41]: ||A||_F = [sum_{i,j} abs(a_{i,j})^2]^(1/2)
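Signature of norm():
def norm(x, ord=None, axis=None, keepdims=False):
Matrix or vector norm.
This function is able to return one of eight different matrix norms, or one of an infinite number of vector norms (described below), depending on the value of the ``ord`` parameter.
hy: keepdims preserves the number of dimensions so that the result can be combined with the original x; otherwise the normed axes are dropped (a per-row norm of a 2-D x comes back as a 1-D array).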
keepdims : bool, optional
If this is set to True, the axes which are normed over are left in the result as dimensions with size one. With this option the result will broadcast correctly against the original x.
A more detailed description of the computation:
The following norms can be calculated:
=====  ============================  ==========================
ord    norm for matrices             norm for vectors
=====  ============================  ==========================
None   Frobenius norm                2-norm
'fro'  Frobenius norm                --
'nuc'  nuclear norm                  --
inf    max(sum(abs(x), axis=1))      max(abs(x))
-inf   min(sum(abs(x), axis=1))      min(abs(x))
0      --                            sum(x != 0)
1      max(sum(abs(x), axis=0))      as below
-1     min(sum(abs(x), axis=0))      as below
2      2-norm (largest sing. value)  as below
-2     smallest singular value       as below
other  --                            sum(abs(x)**ord)**(1./ord)
=====  ============================  ==========================
With this table, the options should be clear.
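A brief sketch of norm with and without keepdims, using a small made-up matrix whose row norms are easy to check by hand (3-4-5 and 6-8-10 triangles):

```python
import numpy as np

x = np.array([[3.0, 4.0],
              [6.0, 8.0]])

fro = np.linalg.norm(x)                                   # Frobenius norm
row_norms = np.linalg.norm(x, axis=1)                     # shape (2,)
row_norms_2d = np.linalg.norm(x, axis=1, keepdims=True)   # shape (2, 1)

unit_rows = x / row_norms_2d          # broadcasts row by row

print(fro)            # sqrt(9 + 16 + 36 + 64) = sqrt(125)
print(row_norms)      # [ 5. 10.]
print(unit_rows)      # every row now has length 1
```

With `axis=1` the input is treated as a stack of row vectors, so the default `ord=None` gives the 2-norm of each row.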
reshape
https://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.reshape.html#numpy.reshape
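numpy.reshape(a, newshape, order='C')
What -1 does: one dimension (often the last) can be given as -1 so that NumPy computes it from the array size instead of you working it out by hand. If newshape is a single integer n, the result is a 1-D array of shape (n,), which is not the same as the 2-D shape (1, n).
newshape is a tuple. With a single value the parentheses can be left out, but it is safer to always write them, even for one value.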
a = np.arange(6).reshape((3, 2))
a = np.arange(6).reshape(3, 2)
a = np.reshape(np.arange(6), (3, 2))
a = np.reshape(np.arange(6), 3, 2) #err!
W1 = np.random.randn(n_h, n_x)*0.01
W1 = np.random.randn((n_h, n_x))*0.01 #err! randn takes separate integers, not a tuple; remember this one as a special case
b1 = np.zeros((n_h,1)) # the standard form
b1 = np.zeros(n_h,1) #err!
A trick when you want to flatten a matrix X of shape (a,b,c,d) to a matrix X_flatten of shape (b*c*d, a) is to use:
X_flatten = X.reshape(X.shape[0], -1).T # X.T is the transpose of X
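A sketch of the flatten trick and of the -1 placeholder on a small made-up array. The shape (4, 2, 3, 3) is hypothetical, standing for 4 images of 2 x 3 pixels with 3 channels.

```python
import numpy as np

# Hypothetical batch: 4 images of 2 x 3 pixels with 3 channels
X = np.arange(4 * 2 * 3 * 3).reshape(4, 2, 3, 3)

X_flatten = X.reshape(X.shape[0], -1).T   # -1 -> 2*3*3 = 18, then transpose
flat = np.arange(6).reshape(6)            # 1-D, shape (6,)
row = np.arange(6).reshape(1, -1)         # 2-D, shape (1, 6)

print(X_flatten.shape)                    # (18, 4)
print(flat.shape, row.shape)
# Column 0 of X_flatten holds every value of image 0, in order
print(np.array_equal(X_flatten[:, 0], X[0].ravel()))
```

Reshaping to `(X.shape[0], -1)` first and transposing afterwards keeps each image's values together in one column; reshaping directly to `(-1, X.shape[0])` would interleave them.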
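NumPy multiplication (three kinds: dot, multiply, *)
dot is standard matrix multiplication.
multiply is element-wise multiplication.
*
has more than one meaning (element-wise for ndarray, matrix multiplication for np.matrix objects), so prefer the explicit functions.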
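The three kinds side by side on two small made-up 2x2 arrays:

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[5, 6],
              [7, 8]])

matrix_product = np.dot(a, b)        # rows of a times columns of b
elementwise = np.multiply(a, b)      # a[i, j] * b[i, j]
star = a * b                         # same as np.multiply for ndarrays

print(matrix_product)   # [[19 22] [43 50]]
print(elementwise)      # [[ 5 12] [21 32]]
print(np.array_equal(star, elementwise))   # True
```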
Math functions
>>> np.log10(100)
2.0
>>> np.log(np.e)
1.0
>>> np.log2(4)
2.0
square: square
sqrt: square root
numpy.random.randn
Return a sample (or samples) from the “standard normal” distribution.
The returned array is filled with random floats sampled from a univariate “normal” (Gaussian) distribution of mean 0 and variance 1.
For random samples from N(mu, sigma^2), use:
sigma * np.random.randn(...) + mu
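A small sketch of the scaling formula. The values of `mu` and `sigma` are arbitrary, and the seed is fixed only to make the run repeatable.

```python
import numpy as np

np.random.seed(0)                 # fixed seed so the run is repeatable
mu, sigma = 3.0, 2.5              # hypothetical target mean and std dev

standard = np.random.randn(2, 4)                  # shape given as separate ints
samples = sigma * np.random.randn(100_000) + mu   # approx. N(mu, sigma**2)

print(standard.shape)             # (2, 4)
print(samples.mean(), samples.std())
```

The sample mean and standard deviation come out close to `mu` and `sigma`, not exactly equal to them, since the samples are random.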