Python Notes: Quick Reference

http://yanghuangblog.com/index.php/archives/7/

Resources

Online book

https://www.py4e.com/html3/

docs

https://docs.python.org/3/

Dash lookup

Syntax differences

Sample code and reference code


Functions can return multiple values

If some of them are not needed, use _ as a placeholder, for example:

parameters_values, _ = dictionary_to_vector(parameters)
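As a rough illustration of the pattern above, here is a minimal sketch. min_max is a hypothetical helper made up for this example; it is not part of any library.

```python
def min_max(values):
    """Return the smallest and largest item as a tuple (hypothetical helper)."""
    return min(values), max(values)

# Unpack both return values
lowest, highest = min_max([3, 1, 4, 1, 5])

# Keep only the first value; _ is a conventional throwaway name
lowest_only, _ = min_max([3, 1, 4, 1, 5])
print(lowest, highest, lowest_only)
```

The underscore is an ordinary variable name; using it only signals to the reader that the value is intentionally ignored.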

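Line continuation

Python code normally puts one statement on one line. When a statement is too long for a single line, there are two ways to continue it:

1. End the line with the continuation character \ (conventionally preceded by a space):

test = 'item_one' \
       'item_two' \
       'item_three'

Result: 'item_oneitem_twoitem_three'

2. Wrap the expression in brackets; inside (), {} or [] no continuation character is needed:

test2 = ('csdn '
         'cssdn')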

reserved words

The reserved words in the language where humans talk to Python include the following:

from import as

def return

def thing():
    pass  # please implement this
    pass
    return

pass

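pass is generally used as a placeholder, or to create a stub; it performs no operation.

During software design, pass is also often used as a TODO marker, a reminder to fill in the implementation later.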

if elif else

is is not in

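What is the difference between is and == in Python?

is checks whether two names refer to the same object (same id); == checks whether the values are equal.

In Python, everything is an object. (Important enough to say three times.)

Every object has three properties: id, type, and value.

id is the object's identity (in CPython, its memory address); the built-in function id() returns it.

type is the object's type; the built-in function type() returns it.

value is the object's value.

Further notes:

In the cases where is and == give the same result, is is usually faster than ==, because it only compares identities. Reserve is for identity checks such as x is None, and use == to compare values.

in is used in logical tests and returns True or False.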
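A small sketch of the distinction between identity and equality; the variable names are made up for the illustration.

```python
a = [1, 2, 3]
b = [1, 2, 3]   # a separate list with the same contents
c = a           # a second name for the same object

same_value = (a == b)       # True: the values are equal
same_object = (a is b)      # False: two distinct list objects
alias = (a is c)            # True: c refers to the object a refers to

print(same_value, same_object, alias)
print(id(a) == id(c), type(a))
```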

and or

Combine multiple conditions.

while for in continue break

>>> list(range(10))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

for i in range(10):
    pass

for i in [5,4,3,2,1]:
    pass

for countdown in 5, 4, 3, 2, 1, "hey!":
    print(countdown)

while True:
    pass

try except

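This differs quite a bit from other languages.

By default, an uncaught exception prints a traceback and stops the program.

try + block: if one statement in the try block fails, the statements after it are skipped and execution jumps straight to the except block.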
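The control flow described above can be sketched as follows. to_int is a hypothetical helper, and the steps list exists only to show which lines ran.

```python
def to_int(text):
    """Convert text to int; return None when conversion fails (hypothetical helper)."""
    steps = []
    try:
        steps.append('before')
        number = int(text)        # raises ValueError for non-numeric text
        steps.append('after')     # skipped when the line above fails
    except ValueError:
        steps.append('except')
        number = None
    return number, steps

print(to_int('42'))
print(to_int('abc'))
```

Catching a specific exception such as ValueError, rather than using a bare except, keeps unrelated errors visible.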

class

del

There is a way to remove an item from a list given its index instead of its value: the del statement.

>>> a = [-1, 1, 66.25, 333, 333, 1234.5]
>>> del a[0]
>>> a
[1, 66.25, 333, 333, 1234.5]
>>> del a[2:4]
>>> a
[1, 66.25, 1234.5]
>>> del a[:]
>>> a
[]

>>> del a

with as

Without the with statement, the code looks like this:

file = open("/tmp/foo.txt")
data = file.read()
file.close()

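There are two problems here. First, it is easy to forget to close the file handle. Second, if reading the file raises an exception, nothing handles it. Here is a hardened version that closes the file even when an exception occurs: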

file = open("/tmp/foo.txt")
try:
    data = file.read()
finally:
    file.close()

This code works, but it is verbose. This is where with shines: besides the more elegant syntax, with also deals with exceptions raised inside its context. Here is the with version:

with open("/tmp/foo.txt") as file:
    data = file.read()

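How it works:

The object that the with expression evaluates to must have an __enter__() method and an __exit__() method.

After the expression following with is evaluated, the __enter__() method of the resulting object is called,
and its return value is assigned to the variable after as.
When the whole code block under with has finished, the __exit__() method of that object is called.

If the block under with raises an exception, __exit__() is still executed,
and the exception's type, value, and stack trace are passed to it.
For example, a ZeroDivisionError raised inside the block can be inspected or printed from within __exit__().
When writing a library, resource cleanup such as closing files can go in the __exit__ method.

Reference: https://www.cnblogs.com/DswCnblog/p/6126588.html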
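A minimal sketch of the protocol described above. Sample is a hypothetical class written only for this illustration; it records the exception type that __exit__ receives.

```python
class Sample:
    """Hypothetical context manager that records what __exit__ receives."""

    def __init__(self):
        self.seen = None

    def __enter__(self):
        return self                     # bound to the name after "as"

    def __exit__(self, exc_type, exc_value, traceback):
        self.seen = exc_type            # None when the block finished normally
        return False                    # do not suppress the exception

manager = Sample()
try:
    with manager as sample:
        1 / 0                           # raises ZeroDivisionError inside the block
except ZeroDivisionError as err:
    caught = err

print(manager.seen, caught)
```

Returning False from __exit__ lets the exception propagate after cleanup; returning True would swallow it.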

Keywords still to be organized

global yield

assert
raise
finally lambda nonlocal

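Operators and other rules

Operators

** exponentiation (power)

/ returns a float; // is floor division (an int when both operands are ints)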

>>> minute = 59
>>> minute/60
0.9833333333333333

>>> minute = 59
>>> minute//60
0 

string concatenation with +

>>> first = '100'
>>> second = '150'
>>> print(first + second)
100150

tab

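Python 3 does not allow mixing tabs and spaces for indentation, so use spaces everywhere and configure the IDE accordingly.

Use spaces, not tabs.

Sublime Text 3 user settings: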

{
	"color_scheme": "Packages/Color Scheme - Default/Solarized (Light).tmTheme",
	"ignored_packages":
	[
		"Vintage"
	],

	"tab_size": 4,
	"translate_tabs_to_spaces": true,
}

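Code blocks

A block usually starts with ":" and ends when the indentation returns to the previous level.

if 4 > 3:
    print("111")
    print("222")

No semicolons, no braces.

vowels.sort(reverse=True): a later parameter can be passed directly by keyword, skipping the ones before it.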
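A short sketch of passing a later parameter by keyword. greet is a hypothetical function introduced only to show the same idea outside list.sort().

```python
vowels = ['e', 'a', 'u', 'o', 'i']

# reverse is passed by keyword; key keeps its default of None
vowels.sort(reverse=True)
print(vowels)

def greet(name, greeting='Hello', punctuation='!'):
    """Hypothetical function with two optional parameters."""
    return greeting + ', ' + name + punctuation

# Set only the third parameter; the second keeps its default
message = greet('Python', punctuation='?')
print(message)
```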

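Variables, built-in functions and classes, advanced built-in data structures

Static languages, dynamic languages, scripting languages, glue code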

Built-in Functions

type()

class type(object)

dir()

dir lists all attributes and methods of a class: dir(class)

input([prompt])

print(*objects, sep=' ', end='\n', file=sys.stdout, flush=False)

a = np.array([[1,2,3,4]])
print(str(a.shape) + "123")
print(a.shape + "123")

(1, 4)123
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-19-40d703a12efd> in <module>()
     16 a = np.array([[1,2,3,4]])
     17 print(str(a.shape) + "123")
---> 18 print(a.shape + "123")

TypeError: can only concatenate tuple (not "str") to tuple

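When printing a value on its own, str() is not needed; to combine it with a string using + or similar, convert it with str() first.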

int() float() str() list() tuple()

class int(x=0)

class int(x, base=10)

class float([x])

>>> float('+1.23')
1.23
>>> float('   -12345\n')
-12345.0
>>> float('1e-003')
0.001
>>> float('+1E6')
1000000.0
>>> float()
0.0

max() min() len() sum()

open()

open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)

Open file and return a corresponding file object. If the file cannot be opened, an OSError is raised.

range() starts at 0 by default and excludes stop, similar to C array indexing

class range(stop)

class range(start, stop[, step])

  • start

    The value of the start parameter (or 0 if the parameter was not supplied)

  • stop

    The value of the stop parameter

  • step

    The value of the step parameter (or 1 if the parameter was not supplied)

>>> list(range(10))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> list(range(1, 11))
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> list(range(0, 30, 5))
[0, 5, 10, 15, 20, 25]

Others

abs() delattr() hash() memoryview() set()
all() dict() help() setattr()
any() hex() next() slice()
ascii() divmod() id() object() sorted()
bin() enumerate() oct() staticmethod()
bool() eval()
breakpoint() exec() isinstance() ord()
bytearray() filter() issubclass() pow() super()
bytes() iter()
callable() format() property()
chr() frozenset() vars()
classmethod() getattr() locals() repr() zip()
compile() globals() map() reversed() __import__()
complex() hasattr() round()

None

Without None (in C, for example), finding a minimum requires an extra flag to record whether the variable is still in its initial state.
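A minimal sketch of the minimum search that note refers to, using None as the "nothing seen yet" marker instead of a separate flag. find_smallest is a hypothetical name.

```python
def find_smallest(values):
    """Return the smallest item, or None for an empty sequence."""
    smallest = None                      # None marks "nothing seen yet"
    for value in values:
        if smallest is None or value < smallest:
            smallest = value
    return smallest

print(find_smallest([9, 41, 12, 3, 74, 15]))
print(find_smallest([]))
```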

string

Slicing: [0:4] refers to indices 0, 1, 2, 3; index 4 is not included.

>>> fruit = 'banana'
>>> fruit[:3]
'ban'
>>> fruit[3:]
'ana'
>>> fruit = 'banana'
>>> len(fruit)
6

The expression fruit[-1] yields the last letter, fruit[-2] yields the second to last, and so on.

>>> 'a' in 'banana'
True
>>> 'seed' in 'banana'
False

Strings are immutable: their contents are read-only and cannot be changed.

>>> greeting = 'Hello, world!'
>>> greeting[0] = 'J'
TypeError: 'str' object does not support item assignment

string methods

find

str.find(sub[, start[, end]])

Return the lowest index in the string where substring sub is found within the slice s[start:end].

Return -1 if sub is not found.

>>> line = '  Here we go  '
>>> line.strip()
'Here we go'

>>> line = 'Have a nice day'
>>> line.startswith('h')
False
>>> line.lower()
'have a nice day'
>>> line.lower().startswith('h')
True

strip() and lower() return a new string; they do not modify the original string.

Format operator

>>> camels = 42
>>> 'I have spotted %d camels.' % camels
'I have spotted 42 camels.'

>>> 'In %d years I have spotted %g %s.' % (3, 0.1, 'camels')
'In 3 years I have spotted 0.1 camels.'

file

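To read a file, iterate over it directly with for ... in; the file handle is a sequence of lines.

file.read() reads all the data at once.

rstrip() also strips the trailing newline.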

fname = input('Enter the file name: ')
try:
    fhand = open(fname)
except:
    print('File cannot be opened:', fname)
    exit()
count = 0
for line in fhand:
    if line.startswith('Subject:'):
        count = count + 1
print('There were', count, 'subject lines in', fname)

# Code: http://www.py4e.com/code3/search7.py
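The three reading styles noted in this section can be compared side by side. This sketch writes its own temporary sample file, so the file contents and variable names are assumptions made for the illustration.

```python
import os
import tempfile

# Write a small sample file so the sketch does not depend on existing data
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('Subject: first\nFrom: someone\nSubject: second\n')
    path = tmp.name

with open(path) as fhand:
    raw_lines = [line for line in fhand]             # each line keeps its '\n'

with open(path) as fhand:
    clean_lines = [line.rstrip() for line in fhand]  # trailing newline removed

with open(path) as fhand:
    whole_text = fhand.read()                        # entire file as one string

os.remove(path)
print(raw_lines[0] == 'Subject: first\n', clean_lines[0], whole_text.count('\n'))
```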

list

Lists are mutable

Iterating with for item in list is effectively read-only: assigning to the loop variable does not change the list.

huangyangdeMacBook-Pro:python_test yang$ cat test.py
tmp = [1,2,3]

print(tmp)

for item in tmp:
    print(item)
    item = 4

print(tmp)

for i in range(len(tmp)):
    print(tmp[i])
    tmp[i] = 4

print(tmp)
huangyangdeMacBook-Pro:python_test yang$ python3 test.py
[1, 2, 3]
1
2
3
[1, 2, 3]
1
2
3
[4, 4, 4]

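For lists, + concatenates and * repeats.

list: append adds a single item; extend adds all the items of another list.

sort(*, key=None, reverse=False)

they modify the list and return None.

list: pop removes by index, remove removes by value, del uses array-style indexing.

del is a statement (a keyword), not a built-in function.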

x = t.pop(1)
t.remove('b')
del t[1]
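A compact sketch of the list operations listed above, run on made-up sample data.

```python
t = ['a', 'b', 'c', 'd', 'e']

x = t.pop(1)        # remove by index and return the removed item
t.remove('d')       # remove the first item equal to 'd'; returns None
del t[0]            # remove by index (or slice) without returning anything

u = [1, 2] + [3]    # + concatenates into a new list
v = [0] * 3         # * repeats
u.append([4, 5])    # append adds one item (here, a nested list)
v.extend([4, 5])    # extend adds each item of the argument
print(x, t, u, v)
```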

Lists and strings (list split join)

For example, list('abc') returns ['a', 'b', 'c'] and list( (1, 2, 3) ) returns [1, 2, 3]. If no argument is given, the constructor creates a new empty list, [].

>>> s = 'pining for the fjords'
>>> t = s.split()
>>> print(t)
['pining', 'for', 'the', 'fjords']
>>> print(t[2])
the

list, str.split(delimiter) delimiter.join(list)

str.split(sep=None, maxsplit=-1)

>>> '1 2 3'.split()
['1', '2', '3']
>>> '1,2,3'.split(',')
['1', '2', '3']
>>> '1,2,3'.split(',', maxsplit=1)
['1', '2,3']
>>> '1,2,,3,'.split(',')
['1', '2', '', '3', '']
>>> t = ['pining', 'for', 'the', 'fjords']
>>> delimiter = ' '
>>> delimiter.join(t)
'pining for the fjords'

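Two identical string literals defined one after another may share the same id (CPython can reuse the string object), but two lists defined one after another always have different ids.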

dict

>>> eng2sp = {'one': 'uno', 'two': 'dos', 'three': 'tres'}
>>> print(eng2sp)
{'one': 'uno', 'two': 'dos', 'three': 'tres'}
>>> print(eng2sp['two'])
dos
>>> len(eng2sp)
3
>>> print(eng2sp['four'])
KeyError: 'four'
>>> 'one' in eng2sp
True
>>> 'uno' in eng2sp
False

dict: in searches only the keys; to search the values, get them first with the values() method.

>>> d = {"one": 1, "two": 2, "three": 3, "four": 4}
>>> d
{'one': 1, 'two': 2, 'three': 3, 'four': 4}
>>> list(d)
['one', 'two', 'three', 'four']
>>> list(d.keys())
['one', 'two', 'three', 'four']
>>> list(d.values())
[1, 2, 3, 4]
>>> list(d.items())
[('one', 1), ('two', 2), ('three', 3), ('four', 4)]

Reverse lookup in a dict

Looking up a key from its value is a little more involved. Generally it can be done in three ways, of which the first is:

A. Combine keys(), values(), and index()

>>> list(student.keys())[list(student.values()).index('1004')]

Result: '小明'
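A runnable sketch of approach A, plus one alternative. The student dictionary here is hypothetical sample data (name to student number), and the generator-based variant is an assumption about what another approach could look like, not something taken from the original notes.

```python
# Hypothetical sample data: name -> student number
student = {'Ann': '1001', 'Bob': '1002', 'Cid': '1004'}

# Approach A from the notes: position of the value, then the key at that position
names = list(student.keys())
numbers = list(student.values())
name_a = names[numbers.index('1004')]

# Alternative sketch: scan the items, falling back to None when nothing matches
name_b = next((k for k, v in student.items() if v == '1004'), None)
missing = next((k for k, v in student.items() if v == '9999'), None)
print(name_a, name_b, missing)
```

Approach A raises ValueError when the value is absent, while the scan with a default returns None; which behavior is preferable depends on the caller.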

get(key[, default])

Return the value for key if key is in the dictionary, else default. If default is not given, it defaults to None, so that this method never raises a KeyError.

Common mistakes

parameters['W' + str(l)]
parameters[ W + str(l)]   # error: W without quotes is an undefined name, not the string 'W'

Tuple

Tuples are immutable

A tuple is a sequence of values much like a list. The values stored in a tuple can be any type, and they are indexed by integers. The important difference is that tuples are immutable. Tuples are also comparable and hashable so we can sort lists of them and use tuples as key values in Python dictionaries.

>>> t = 'a', 'b', 'c', 'd', 'e'
>>> type(t)
<class 'tuple'>
>>> t = ['a', 'b', 'c', 'd', 'e']
>>> type(t)
<class 'list'>
>>> t = ('a', 'b', 'c', 'd', 'e')
>>> type(t)
<class 'tuple'>

>>> t1 = ('a',)
>>> type(t1)
<class 'tuple'>
>>> t2 = ('a')
>>> type(t2)
<class 'str'>

You can’t modify the elements of a tuple, but you can replace one tuple with another:

>>> t = ('a', 'b', 'c', 'd', 'e')
>>> t = ('A',) + t[1:]
>>> t
('A', 'b', 'c', 'd', 'e')


>>> t = ('a', 'b', 'c', 'd', 'e')
>>> t = ('A',t[1:])
>>> t
('A', ('b', 'c', 'd', 'e'))

Tuple assignment

>>> m = [ 'have', 'fun' ]
>>> x, y = m
>>> x
'have'
>>> y
'fun'

>>> m = [ 'have', 'fun' ]
>>> (x, y) = m
>>> x
'have'
>>> y
'fun'

A particularly clever application of tuple assignment allows us to swap the values of two variables in a single statement:

>>> a, b = b, a

>>> addr = 'user@example.com'  # placeholder address
>>> uname, domain = addr.split('@')

Dictionaries and tuples

>>> d = {'a':10, 'b':1, 'c':22}
>>> t = list(d.items())
>>> t
[('b', 1), ('a', 10), ('c', 22)]
>>> t.sort()
>>> t
[('a', 10), ('b', 1), ('c', 22)]


for key, val in list(d.items()):
    print(val, key)

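Braces, brackets, parentheses

A dict is defined with {}

A list is defined with []

A tuple is defined with ()

But all three are accessed with []

Similar to declarations in C:

d = {} # declare a dict d
d = [] # declare a list d
d = () # declare a tuple d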

Common libraries: standard library (installed by default, but must be imported)

random

import random

for i in range(2):
    x = random.random()
    print(x)
    
0.11132867921152356
0.5950949227890241

>>> random.randint(5, 10)
5
>>> random.randint(5, 10)
9

>>> t = [1, 2, 3]
>>> random.choice(t)
2
>>> random.choice(t)
3

Regular expressions

^ Matches the beginning of the line.

$ Matches the end of the line.

. Matches any character (a wildcard).

\s Matches a whitespace character.

\S Matches a non-whitespace character (opposite of \s).

* Applies to the immediately preceding character(s) and indicates to match zero or more times.

*? Applies to the immediately preceding character(s) and indicates to match zero or more times in “non-greedy mode”.

+ Applies to the immediately preceding character(s) and indicates to match one or more times.

+? Applies to the immediately preceding character(s) and indicates to match one or more times in “non-greedy mode”.

? Applies to the immediately preceding character(s) and indicates to match zero or one time.

?? Applies to the immediately preceding character(s) and indicates to match zero or one time in “non-greedy mode”.

[aeiou] Matches a single character as long as that character is in the specified set. In this example, it would match “a”, “e”, “i”, “o”, or “u”, but no other characters.

[a-z0-9] You can specify ranges of characters using the minus sign. This example is a single character that must be a lowercase letter or a digit.

[^A-Za-z] When the first character in the set notation is a caret, it inverts the logic. This example matches a single character that is anything other than an uppercase or lowercase letter.

( ) When parentheses are added to a regular expression, they are ignored for the purpose of matching, but allow you to extract a particular subset of the matched string rather than the whole string when using findall().

\b Matches the empty string, but only at the start or end of a word.

\B Matches the empty string, but not at the start or end of a word.

\d Matches any decimal digit; equivalent to the set [0-9].

\D Matches any non-digit character; equivalent to the set [^0-9].
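A few of the patterns above in action. The sample text is made up for the illustration; the point is the difference between greedy and non-greedy matching and what parentheses do with findall().

```python
import re

text = 'From: <a@x.org> and <b@y.org>'

greedy = re.findall(r'<.+>', text)        # + grabs as much as possible
lazy = re.findall(r'<.+?>', text)         # +? stops at the first possible '>'
captured = re.findall(r'<(\S+?)@', text)  # ( ) returns only the grouped part
digits = re.findall(r'\d+', 'x=12, y=345')
print(greedy, lazy, captured, digits)
```

Raw strings (the r prefix) keep backslashes such as \S and \d from being treated as string escapes.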

re.search

# Search for lines that start with 'X' followed by any non
# whitespace characters and ':'
# followed by a space and any number.
# The number can include a decimal.
import re
hand = open('mbox-short.txt')
for line in hand:
    line = line.rstrip()
    if re.search(r'^X\S*: [0-9.]+', line):
        print(line)

# Code: http://www.py4e.com/code3/re10.py

When we run the program, we see the data nicely filtered to show only the lines we are looking for.

X-DSPAM-Confidence: 0.8475
X-DSPAM-Probability: 0.0000
X-DSPAM-Confidence: 0.6178
X-DSPAM-Probability: 0.0000

re.findall and extracting

# Search for lines that start with 'X' followed by any
# non whitespace characters and ':' followed by a space
# and any number. The number can include a decimal.
# Then print the number if it is greater than zero.
import re
hand = open('mbox-short.txt')
for line in hand:
    line = line.rstrip()
    x = re.findall(r'^X\S*: ([0-9.]+)', line)
    if len(x) > 0:
        print(x)

# Code: http://www.py4e.com/code3/re11.py

Instead of calling search(), we add parentheses around the part of the regular expression that represents the floating-point number to indicate we only want findall() to give us back the floating-point number portion of the matching string.

The output from this program is as follows:

['0.8475']
['0.0000']
['0.6178']
['0.0000']
['0.6961']
['0.0000']
..

socket

import socket

mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mysock.connect(('data.pr4e.org', 80))
cmd = 'GET http://data.pr4e.org/romeo.txt HTTP/1.0\r\n\r\n'.encode()
mysock.send(cmd)

while True:
    data = mysock.recv(512)
    if len(data) < 1:
        break
    print(data.decode(),end='')

mysock.close()

# Code: http://www.py4e.com/code3/socket1.py

urllib

import urllib.request

fhand = urllib.request.urlopen('http://data.pr4e.org/romeo.txt')
for line in fhand:
    print(line.decode().strip())

# Code: http://www.py4e.com/code3/urllib1.py

Two ways to use urllib.request.urlopen

Loop over the response with for and handle each line inside the loop, decoding each line when its data is needed:
import urllib.request, urllib.parse, urllib.error

fhand = urllib.request.urlopen('http://data.pr4e.org/romeo.txt')

counts = dict()
for line in fhand:
    words = line.decode().split()
    for word in words:
        counts[word] = counts.get(word, 0) + 1
print(counts)

# Code: http://www.py4e.com/code3/urlwords.py

Read all the bytes at once, then hand them to another library to process in one pass:
# Search for link values within the page at the given URL
import urllib.request, urllib.parse, urllib.error
import re

url = input('Enter - ')
html = urllib.request.urlopen(url).read()
links = re.findall(b'href="(http://.*?)"', html)
for link in links:
    print(link.decode())

# Code: http://www.py4e.com/code3/urlregex.py

HTTP error 403 in Python 3 Web Scraping

from urllib.request import Request, urlopen

req = Request('http://www.cmegroup.com/trading/products/#sortField=oi&sortAsc=false&venues=3&page=1&cleared=1&group=1', headers={'User-Agent': 'Mozilla/5.0'})
webpage = urlopen(req).read()

xml.etree.ElementTree

XML: results = tree.findall("comments/comment/count"): leave out the root element's own name; write the path starting from its first-level child down to the element you want.

find

import xml.etree.ElementTree as ET

data = '''
<person>
  <name>Chuck</name>
  <phone type="intl">
    +1 734 303 4456
  </phone>
  <email hide="yes" />
</person>'''

tree = ET.fromstring(data)
print('Name:', tree.find('name').text)
print('Attr:', tree.find('email').get('hide'))

# Code: http://www.py4e.com/code3/xml1.py

findall

import xml.etree.ElementTree as ET

input = '''
<stuff>
  <users>
    <user x="2">
      <id>001</id>
      <name>Chuck</name>
    </user>
    <user x="7">
      <id>009</id>
      <name>Brent</name>
    </user>
  </users>
</stuff>'''

stuff = ET.fromstring(input)
lst = stuff.findall('users/user')
print('User count:', len(lst))

for item in lst:
    print('Name', item.find('name').text)
    print('Id', item.find('id').text)
    print('Attribute', item.get('x'))

# Code: http://www.py4e.com/code3/xml2.py

json

json.loads returns a dict or a list

json.loads

import json

data = '''
[
  { "id" : "001",
    "x" : "2",
    "name" : "Chuck"
  } ,
  { "id" : "009",
    "x" : "7",
    "name" : "Brent"
  }
]'''

info = json.loads(data)
print('User count:', len(info))

for item in info:
    print('Name', item['name'])
    print('Id', item['id'])
    print('Attribute', item['x'])

# Code: http://www.py4e.com/code3/json2.py

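BeautifulSoup does not need decode(); it accepts the raw bytes (ctx is the SSL context from the full BeautifulSoup example further down)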

url = input('Enter - ')
html = urllib.request.urlopen(url, context=ctx).read()
soup = BeautifulSoup(html, 'html.parser')

# Retrieve all of the anchor tags
tags = soup('a')
for tag in tags:
    print(tag.get('href', None))

JSON data is decoded to str first, then parsed

uh = urllib.request.urlopen(url)
data = uh.read().decode()

try:
    js = json.loads(data)
except:
    js = None

if not js or 'status' not in js or js['status'] != 'OK':
    print('==== Failure To Retrieve ====')
    print(data)
    quit()

# print(json.dumps(js, indent=1))

place_id = js["results"][0]["place_id"]

print(place_id)

sqlite3

CREATE TABLE

import sqlite3

conn = sqlite3.connect('music.sqlite')
cur = conn.cursor()

cur.execute('DROP TABLE IF EXISTS Tracks')
cur.execute('CREATE TABLE Tracks (title TEXT, plays INTEGER)')

conn.close()

# Code: http://www.py4e.com/code3/db1.py

INSERT INTO & SELECT

import sqlite3

conn = sqlite3.connect('music.sqlite')
cur = conn.cursor()

cur.execute('INSERT INTO Tracks (title, plays) VALUES (?, ?)',
    ('Thunderstruck', 20))
cur.execute('INSERT INTO Tracks (title, plays) VALUES (?, ?)',
    ('My Way', 15))
conn.commit()

print('Tracks:')
cur.execute('SELECT title, plays FROM Tracks')
for row in cur:
     print(row)

cur.execute('DELETE FROM Tracks WHERE plays < 100')
conn.commit()

cur.close()

# Code: http://www.py4e.com/code3/db2.py

Programming with multiple tables

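About INTEGER PRIMARY KEY

Using INTEGER PRIMARY KEY AUTOINCREMENT and rowid/INTEGER PRIMARY KEY in SQLite:
When designing tables in SQLite, each table gets its own integer id as the primary key, and the key is available immediately after an insert.
SQLite already adds a rowid to every table internally. The rowid can be used like an implicit column,
but it is maintained by the SQLite engine. Before version 3.0 the rowid was a 32-bit integer; from 3.0 on it is a 64-bit integer. This internal rowid can serve as each table's id primary key.

INSERT OR IGNORE means that if an identical record already exists, the new row is ignored.
A join without condition is a cross join. A cross join repeats each row for the left hand table for each row in the right hand table.
fetchone() returns None when there is no result.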
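The points above can be tried in a throwaway in-memory database. The Artist table and its columns are hypothetical names chosen for this sketch.

```python
import sqlite3

conn = sqlite3.connect(':memory:')      # throwaway in-memory database
cur = conn.cursor()

# Hypothetical table: id aliases the internal rowid; name must be unique
cur.execute('CREATE TABLE Artist (id INTEGER PRIMARY KEY, name TEXT UNIQUE)')

cur.execute('INSERT OR IGNORE INTO Artist (name) VALUES (?)', ('AC/DC',))
first_id = cur.lastrowid                # key generated by the insert
cur.execute('INSERT OR IGNORE INTO Artist (name) VALUES (?)', ('AC/DC',))  # duplicate, skipped

cur.execute('SELECT COUNT(*) FROM Artist')
row_count = cur.fetchone()[0]

cur.execute('SELECT id, rowid FROM Artist WHERE name = ?', ('AC/DC',))
found = cur.fetchone()                  # (id, rowid) tuple

cur.execute('SELECT id FROM Artist WHERE name = ?', ('Nobody',))
not_found = cur.fetchone()              # None: no matching row

conn.close()
print(first_id, row_count, found, not_found)
```

Because id is declared INTEGER PRIMARY KEY, it is the same number as rowid, which is why both columns come back equal.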

time

import time

x1 = [9, 2, 5, 0, 0, 7, 5, 0, 0, 0, 9, 2, 5, 0, 0]
x2 = [9, 2, 2, 9, 0, 9, 2, 5, 0, 0, 9, 2, 5, 0, 0]

### CLASSIC DOT PRODUCT OF VECTORS IMPLEMENTATION ###
tic = time.process_time()
dot = 0
for i in range(len(x1)):
    dot+= x1[i]*x2[i]
toc = time.process_time()
print ("dot = " + str(dot) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")

Common libraries: third-party (must be installed manually before import)

BeautifulSoup

# To run this, you can install BeautifulSoup
# https://pypi.python.org/pypi/beautifulsoup4

# Or download the file
# http://www.py4e.com/code3/bs4.zip
# and unzip it in the same directory as this file

import urllib.request, urllib.parse, urllib.error
from bs4 import BeautifulSoup
import ssl

# Ignore SSL certificate errors
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

url = input('Enter - ')
html = urllib.request.urlopen(url, context=ctx).read()
soup = BeautifulSoup(html, 'html.parser')

# Retrieve all of the anchor tags
tags = soup('a')
for tag in tags:
    print(tag.get('href', None))

# Code: http://www.py4e.com/code3/urllinks.py

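tags = soup('a') returns every anchor tag in the document, not just the last one; it is shorthand for soup.find_all('a').

Installing bs4

Method 1:

sudo easy_install beautifulsoup4

This may fail because of a TLS version problem; in that case use method 2.

Method 2:

curl 'https://bootstrap.pypa.io/get-pip.py' > get-pip.py

sudo python3 get-pip.py

sudo pip install bs4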

class

class PartyAnimal:
   x = 0

   def __init__(self):
     print('I am constructed')

   def party(self) :
     self.x = self.x + 1
     print('So far',self.x)

   def __del__(self):
     print('I am destructed', self.x)


an = PartyAnimal()
an.party()
PartyAnimal.party(an)

The two calls above are equivalent.

Equivalent call forms

numpy.reshape(a, newshape, order='C')

a = np.arange(6).reshape((3, 2))

a = np.reshape(np.arange(6), (3, 2))

Questions

print

>>> print("hyhy", "1111")
hyhy 1111
>>> print("hyhy" + "1111")
hyhy1111

How can the space be avoided? The separator is controlled by the sep parameter of print:

  1. print(...)
  2. print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False) # '\n' is a newline
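A small sketch of sep and end. The output is written to an in-memory buffer only so the exact characters can be inspected; that buffer is an addition for this illustration.

```python
import io

buffer = io.StringIO()

# sep='' joins the arguments with nothing in between
print('hyhy', '1111', sep='', file=buffer)
# end='' suppresses the trailing newline
print('a', 'b', sep='-', end='', file=buffer)

output = buffer.getvalue()
print(repr(output))
```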

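todo

Markdown-related

When moving a directory, images are external links by default; if the images are local, how can they be moved together with it?

Naming conventions

OK

Good reference code

How to build up a large collection of reference code

Which operations must be wrapped in try, or else a traceback is guaranteed

Jupyter notebook

Jupyter Notebook: introduction, installation, and usage tutorial

Installing with Anaconda

Anaconda (official website) is a distribution that makes it easy to obtain and manage packages and to manage environments in one place. It bundles conda, Python, and more than 180 scientific packages with their dependencies.

It can be downloaded from Anaconda's official download page.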

numpy

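Official docs: https://docs.scipy.org/doc/numpy-1.10.1/index.html

np.copy

Plain assignment behaves like a C++ reference: later modifications also change the original value, so use np.copy to get an independent copy.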

parameters_values, _ = dictionary_to_vector(parameters)

thetaplus = np.copy(parameters_values)       
thetaplus[i][0] = thetaplus[i][0] + epsilon     
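A minimal sketch of the difference between assignment and np.copy, using a small made-up column vector in place of parameters_values.

```python
import numpy as np

original = np.array([[1.0], [2.0], [3.0]])

alias = original                 # no copy: both names share one array
alias[0][0] = 10.0               # also changes original

independent = np.copy(original)  # separate data
independent[1][0] = 20.0         # original is unaffected

print(original.ravel(), independent.ravel())
```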

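axis

Getting the number of rows and columns of a matrix (2-D case)

Before iterating over a matrix, one usually gets its number of rows and columns first. The length of each dimension of an ndarray is available through its shape attribute.

import numpy as np
a = np.array([[1,2,3,4,5],[6,7,8,9,10]])

print(a.shape)          # returns a tuple (2, 5)
print(a.shape[0])       # number of rows, returns 2
print(a.shape[1])       # number of columns, returns 5


hy:

In terms of the data structure, axes 0, 1, 2 run from the outside in: the outermost level is axis 0, the next is axis 1, then axis 2.

Rows and columns are stored row by row, so rows come before columns.

a = np.array([
    [[1,2,3,4,5],[6,7,8,9,10]],
    [[1,2,3,4,5],[6,7,8,9,10]],
    ])

print(a.shape)

(2, 2, 5)
# shape[0] is the outermost (third) dimension, of size 2; shape[1] and shape[2] are the rows and columns

v = image.reshape(image.shape[0]*image.shape[1]*image.shape[2], 1)

When reshape is given two arguments, they are the numbers of rows and columns. reshape does not change the shape of image itself; only v has the new shape.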


https://blog.csdn.net/taotao223/article/details/79187823

axis

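For a 2-D array it is simpler: shape (3, 4) is an array with three rows and four columns.

sum(axis=0): the rows are collapsed, and the numbers in each column are added together.

To sum up, axis=n means dimension n is the one that disappears from the result. That holds for sum; cumsum is different, because it accumulates and keeps the shape unchanged.

This form is used a lot: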

x_sum = np.sum(x_exp, axis = 1, keepdims = True)
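A sketch of why keepdims=True is convenient here, assuming x_exp holds exponentiated scores as in a softmax. The input values are made up.

```python
import numpy as np

x_exp = np.exp(np.array([[1.0, 2.0, 3.0],
                         [1.0, 1.0, 1.0]]))

flat_sum = np.sum(x_exp, axis=1)                 # shape (2,)
x_sum = np.sum(x_exp, axis=1, keepdims=True)     # shape (2, 1)

# (2, 3) / (2, 1) broadcasts row by row, so every row is normalized
row_normalized = x_exp / x_sum
print(flat_sum.shape, x_sum.shape, row_normalized.sum(axis=1))
```

With the (2, 1) shape, the division lines up each row with its own sum; the flat (2,) result would not broadcast against a (2, 3) array in the intended way.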

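np.sum

Sum: sum()

The function for summing a matrix is sum(); it can sum along rows, along columns, or over the whole matrix.

import numpy as np

a = np.array([[1,2,3],[4,5,6]])

print(a.sum())           # sum over the whole matrix
# result 21

print(a.sum(axis=0)) # sum along axis 0 (down the rows): one total per column
# result [5 7 9]

print(a.sum(axis=1)) # sum along axis 1 (across the columns): one total per row
# result [ 6 15]

Summary of common NumPy methods

This part lists commonly used methods of the numpy module.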

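Creating matrices (with ndarray objects)

With Python's numpy module, one generally works with the ndarray object it provides.
Creating an ndarray is simple: pass a list as the argument.

import numpy as np

# create a 1-D ndarray
a = np.array([1,2,3,4,5])

# create a 2-D ndarray
a2 = np.array([[1,2,3,4,5],[6,7,8,9,10]])

# higher-dimensional arrays are created the same way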

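Slicing matrices

Slicing by row and column

Matrices are sliced the same way as lists, with [] (square brackets).

import numpy as np
a = np.array([[1,2,3,4,5],[6,7,8,9,10]])

print(a[0:1])       # first row, returns [[1 2 3 4 5]]
print(a[1,2:5])     # second row, third to fifth columns, returns [ 8  9 10]

print(a[1,:])       # second row, returns [ 6  7  8  9 10]
print(a[1:,2:])     # rows after the first, columns from index 2 on, returns [[ 8  9 10]]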

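Slicing by condition

Conditional slicing passes a boolean expression on the array itself inside [] (square brackets).

import numpy as np

a = np.array([[1,2,3,4,5],[6,7,8,9,10]])
b = a[a>6]      # elements of a greater than 6; the result is a 1-D array
print(b)        # returns [ 7  8  9 10]

# The boolean expression first produces a boolean matrix, which is then passed to [] to do the selection
print(a>6)


# returns
[[False False False False False]
 [False  True  True  True  True]]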

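Conditional slicing is most often used to set the elements that satisfy a condition to a specific value.
For example, set every element greater than 6 to 0.

import numpy as np

a = np.array([[1,2,3,4,5],[6,7,8,9,10]])
print(a)


# initial matrix
[[ 1  2  3  4  5]
 [ 6  7  8  9 10]]

a[a>6] = 0
print(a)


# matrix after zeroing the elements greater than 6
[[1 2 3 4 5]
 [6 0 0 0 0]]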

Joining matrices

Matrices can be joined with numpy's hstack and vstack functions.

import numpy as np

a1 = np.array([[1,2],[3,4]])
a2 = np.array([[5,6],[7,8]])

# Note: the arrays must be passed together as a list or a tuple
print(np.hstack([a1,a2]))

# horizontal join, result:
[[1 2 5 6]
 [3 4 7 8]]

print(np.vstack((a1,a2)))

# vertical join, result:
[[1 2]
 [3 4]
 [5 6]
 [7 8]]

# Matrices can also be joined with the concatenate function.

np.concatenate( (a1,a2), axis=0 )       # equivalent to np.vstack( (a1,a2) )
np.concatenate( (a1,a2), axis=1 )       # equivalent to np.hstack( (a1,a2) )

Creating matrices with functions

The numpy module comes with functions that create ndarray objects, handy for common or regularly patterned matrices.

arange

import numpy as np

a = np.arange(10)       # from 0 up to 10 (exclusive), step 1
print(a)                # returns [0 1 2 3 4 5 6 7 8 9]

a1 = np.arange(5,10)    # from 5 up to 10 (exclusive), step 1
print(a1)               # returns [5 6 7 8 9]

a2 = np.arange(5,20,2)  # from 5 up to 20 (exclusive), step 2
print(a2)               # returns [ 5  7  9 11 13 15 17 19]

linspace

linspace() is much like MATLAB's linspace: it creates a given number of evenly spaced values, i.e. an arithmetic sequence.

import numpy as np

a = np.linspace(0,10,7) # arithmetic sequence of 7 numbers from 0 to 10 inclusive
print(a)


# result
[  0.           1.66666667   3.33333333   5.         6.66666667  8.33333333  10.        ]

logspace

linspace generates an arithmetic sequence, while logspace generates a geometric sequence.
The example below generates a geometric sequence of 5 numbers from 10^0 to 10^4.

import numpy as np

a = np.logspace(0,4,5)
print(a)


# result
[  1.00000000e+00   1.00000000e+01   1.00000000e+02   1.00000000e+03
   1.00000000e+04]

ones, zeros, eye, empty

ones creates a matrix of all ones
zeros creates a matrix of all zeros
eye creates an identity matrix
empty creates an uninitialized matrix (it still holds whatever values were in memory)

import numpy as np

a_ones = np.ones((3,4))         # 3*4 matrix of ones
print(a_ones)

# result
[[ 1.  1.  1.  1.]
 [ 1.  1.  1.  1.]
 [ 1.  1.  1.  1.]]


a_zeros = np.zeros((3,4))       # 3*4 matrix of zeros
print(a_zeros)

# result
[[ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]]


a_eye = np.eye(3)               # identity matrix of order 3
print(a_eye)

# result
[[ 1.  0.  0.]
 [ 0.  1.  0.]
 [ 0.  0.  1.]]


a_empty = np.empty((3,4))       # 3*4 uninitialized matrix
print(a_empty)
# result
[[  9.25283328e+086,               nan,   6.04075076e-309,    1.53957654e-306],
  [  3.60081101e+228,   8.59109220e+115,   5.83022290e+252,     7.29515154e-315],
   [  8.73990008e+245,  -1.12621655e-279,   8.06565391e-273,     8.35428692e-308]]

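fromstring: getting the ASCII codes of characters

fromstring() converts a string to an ndarray. It is useful when a string needs to be turned into numbers, giving the sequence of ASCII codes of its characters. (Newer NumPy releases deprecate this binary use of fromstring in favor of np.frombuffer.)

import numpy as np

a = "abcdef"
b = np.fromstring(a,dtype=np.int8)      # one character is 8 bits, so dtype is np.int8
print(b)                                # returns [ 97  98  99 100 101 102]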

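fromfunction

fromfunction() generates the elements of a matrix from their row and column indices.
For example, create a matrix in which every element is the sum of its row index and column index.

import numpy as np

def func(i,j):
    return i+j

a = np.fromfunction(func,(5,6))
# the first argument is the function; the second is a list or tuple giving the shape of the matrix
print(a)


# returns
[[ 0.  1.  2.  3.  4.  5.]
 [ 1.  2.  3.  4.  5.  6.]
 [ 2.  3.  4.  5.  6.  7.]
 [ 3.  4.  5.  6.  7.  8.]
 [ 4.  5.  6.  7.  8.  9.]]
# note that both row and column indices start from 0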

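Matrix operations

Common matrix operators

numpy's ndarray overloads many operators; they perform the operation between corresponding elements of the matrices.
For example: + - * / % **

Common matrix functions

Likewise, numpy defines many functions that are applied to each element of a matrix.
The list below assumes numpy has been imported as import numpy as np

a is an ndarray.

  • np.sin(a) sine of each element of a, sin(x)

  • np.cos(a) cosine of each element of a, cos(x)

  • np.tan(a) tangent of each element of a, tan(x)

  • np.arcsin(a) arcsine of each element of a, arcsin(x)

  • np.arccos(a) arccosine of each element of a, arccos(x)

  • np.arctan(a) arctangent of each element of a, arctan(x)

  • np.exp(a) exponential of each element of a, e^x

  • np.sqrt(a) square root of each element of a, √x

    hy:

abs absolute value

square square

For example: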

import numpy as np

a = np.array([[1,2,3],[4,5,6]])
print(np.sin(a))

# result
[[ 0.84147098  0.90929743  0.14112001]
 [-0.7568025  -0.95892427 -0.2794155 ]]

print(np.arcsin(a))

# result
# RuntimeWarning: invalid value encountered in arcsin
[[ 1.57079633         nan         nan]
 [        nan         nan         nan]]

When an element of the matrix is outside the function's domain, a RuntimeWarning is produced and the result is nan (not a number).

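Matrix multiplication (dot)

Matrix multiplication requires that the number of columns of the first matrix equal the number of rows of the second.
The function for matrix multiplication is dot.
For example:

import numpy as np

a1 = np.array([[1,2,3],[4,5,6]])        # a1 is a 2*3 matrix
a2 = np.array([[1,2],[3,4],[5,6]])      # a2 is a 3*2 matrix

print(a1.shape[1]==a2.shape[0])         # True, the condition for matrix multiplication holds
print(a1.dot(a2))

# a1.dot(a2) corresponds to a1*a2 in MATLAB
# while a1*a2 in Python corresponds to a1.*a2 in MATLAB

# result
[[22 28]
 [49 64]]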

Matrix transpose A^T

import numpy as np

a = np.array([[1,2,3],[4,5,6]])

print(a.transpose())

# result
[[1 4]
 [2 5]
 [3 6]]


### An even simpler way to transpose a matrix is a.T
a = np.array([[1,2,3],[4,5,6]])
print(a.T)

# result
[[1 4]
 [2 5]
 [3 6]]

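Matrix inverse A^-1

To invert a matrix, import numpy.linalg and use its inv function.
The matrix must be square (same number of rows and columns) and non-singular.

import numpy as np
import numpy.linalg as lg

a = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(lg.inv(a))

# result (this matrix is singular, determinant 0, so the huge values are floating-point artifacts rather than a true inverse; some NumPy builds raise LinAlgError instead)
[[ -4.50359963e+15   9.00719925e+15  -4.50359963e+15]
 [  9.00719925e+15  -1.80143985e+16   9.00719925e+15]
 [ -4.50359963e+15   9.00719925e+15  -4.50359963e+15]]

a = np.eye(3)               # identity matrix of order 3
print(lg.inv(a))            # the inverse of the identity matrix is itself

# result
[[ 1.  0.  0.]
 [ 0.  1.  0.]
 [ 0.  0.  1.]]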

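Getting matrix statistics (such as the mean)

Maximum and minimum

max and min return the largest and smallest elements; they work on the whole matrix, per row, or per column.
For example

import numpy as np

a = np.array([[1,2,3],[4,5,6]])
print(a.max())              # maximum of the whole matrix, result: 6
print(a.min())              # result: 1

# The axis keyword argument selects row-wise or column-wise maxima (minima)
# axis=0: maximum (minimum) along the row direction, i.e. one value per column
# axis=1: maximum (minimum) along the column direction, i.e. one value per row
# For example

print(a.max(axis=0))
# result: [4 5 6]

print(a.max(axis=1))
# result: [3 6]

# To get the position of the maximum element, use argmax (argmin for the minimum)
print(a.argmax(axis=1))
# result: [2 2]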

Mean: mean()

mean() returns the average of the elements. As before, it works on the whole matrix, per row, or per column.

import numpy as np

a = np.array([[1,2,3],[4,5,6]])
print(a.mean())             # result: 3.5

# As before, the axis keyword selects the direction along which the mean is taken
print(a.mean(axis=0))       # result [ 2.5  3.5  4.5]
print(a.mean(axis=1))       # result [ 2.  5.]

Variance: var()

The variance function is var(). It is equivalent to mean(abs(x - x.mean())**2), where x is the matrix.

import numpy as np

a = np.array([[1,2,3],[4,5,6]])
print(a.var())              # result 2.91666666667

print(a.var(axis=0))        # result [ 2.25  2.25  2.25]
print(a.var(axis=1))        # result [ 0.66666667  0.66666667]

Standard deviation: std()

The standard deviation function is std().
std() is equivalent to sqrt(mean(abs(x - x.mean())**2)), or to sqrt(x.var()).

import numpy as np

a = np.array([[1,2,3],[4,5,6]])
print(a.std())              # result 1.70782512766

print(a.std(axis=0))        # result [ 1.5  1.5  1.5]
print(a.std(axis=1))        # result [ 0.81649658  0.81649658]

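Median: median()

The median is the value in the middle after sorting the sequence; with an even number of values, it is the average of the two middle values.

For example, the sequence [5,2,6,4,2] sorts to [2,2,4,5,6]; the middle value is 4, so the median is 4.

The sequence [5,2,6,4,3,2] sorts to [2,2,3,4,5,6]; with an even count the two middle values are 3 and 4, so the median is 3.5.

The median function is median(), called as numpy.median(x,[axis]). axis selects the axis; the default axis=None takes the median over all values.

import numpy as np
x = np.array([[1,2,3],[4,5,6]])

print(np.median(x))         # median over all values
# result
3.5

print(np.median(x,axis=0))  # median along the first axis
# result
[ 2.5  3.5  4.5]

print(np.median(x,axis=1))  # median along the second axis
# result
[ 2.  5.]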

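Cumulative sum: cumsum()

The cumulative sum at a position is the sum of all elements up to and including that position.

For example, the cumulative sum of [1,2,3,4,5] is [1,3,6,10,15]: the first element is 1, the second is 1+2=3, ..., the fifth is 1+2+3+4+5=15.

The function for the cumulative sum of a matrix is cumsum(); it works along rows, along columns, or over the whole matrix.

import numpy as np

a = np.array([[1,2,3],[4,5,6]])

print(a.cumsum())               # cumulative sum over the whole (flattened) matrix
# result [ 1  3  6 10 15 21]

print(a.cumsum(axis=0))         # cumulative sum along axis 0 (down the rows)
# result
[[1 2 3]
 [5 7 9]]

print(a.cumsum(axis=1))         # cumulative sum along axis 1 (across the columns)
# result
[[ 1  3  6]
 [ 4  9 15]]

References

Adapted from: smallpi
See also: array and matrix in numpy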

me

Matrix norms: numpy.linalg.norm

https://blog.csdn.net/bitcarmanlee/article/details/51945271

https://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.linalg.norm.html#numpy.linalg.norm

The Frobenius norm is given by [R41]:

||A||_F = [\sum_{i,j} abs(a_{i,j})^2]^{1/2}

norm() prototype:

def norm(x, ord=None, axis=None, keepdims=False):
    """Matrix or vector norm.

    This function is able to return one of eight different matrix norms,
    or one of an infinite number of vector norms (described below), depending
    on the value of the ``ord`` parameter.
    """

hy: keepdims preserves the number of dimensions, which makes it easy to compute against the original x; without it, the normed axes are dropped from the result.

keepdims : bool, optional

If this is set to True, the axes which are normed over are left in the result as dimensions with size one. With this option the result will broadcast correctly against the original x.

A more detailed description of the computation:

    The following norms can be calculated:

    =====  ============================  ==========================
    ord    norm for matrices             norm for vectors
    =====  ============================  ==========================
    None   Frobenius norm                2-norm
    'fro'  Frobenius norm                --
    'nuc'  nuclear norm                  --
    inf    max(sum(abs(x), axis=1))      max(abs(x))
    -inf   min(sum(abs(x), axis=1))      min(abs(x))
    0      --                            sum(x != 0)
    1      max(sum(abs(x), axis=0))      as below
    -1     min(sum(abs(x), axis=0))      as below
    2      2-norm (largest sing. value)  as below
    -2     smallest singular value       as below
    other  --                            sum(abs(x)**ord)**(1./ord)
    =====  ============================  ==========================

The table above should make this clear.

reshape

https://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.reshape.html#numpy.reshape

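numpy.reshape(a, newshape, order='C')

The role of -1: one dimension (typically the last) can be given as -1, and NumPy computes it automatically instead of requiring the exact number.

If newshape is an integer n, the result is a 1-D array of length n; so in newshape, n is equivalent to (n,), not (1, n).

newshape is a tuple. With a single value the parentheses can be omitted, but normally keep them; to avoid mistakes, it is a good habit to write them even with only one value.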

a = np.arange(6).reshape((3, 2))
a = np.arange(6).reshape(3, 2)
a = np.reshape(np.arange(6), (3, 2))  
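a = np.reshape(np.arange(6), 3, 2)  # error: the third positional argument is order, not a dimension

    W1 = np.random.randn(n_h, n_x)*0.01
    W1 = np.random.randn((n_h, n_x))*0.01 # error: randn takes separate dimensions, not a tuple; remember this special case

    b1 = np.zeros((n_h,1))  # the standard form
    b1 = np.zeros(n_h,1) # error: the shape must be a single tuple

A trick when you want to flatten a matrix X of shape (a,b,c,d) to a matrix X_flatten of shape (b*c*d, a) is to use: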

X_flatten = X.reshape(X.shape[0], -1).T      # X.T is the transpose of X
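A sketch of the flatten trick and of -1 in general. The array X below is hypothetical sample data standing in for a batch of 4 images of 2x3 pixels with 3 channels.

```python
import numpy as np

# Hypothetical batch with shape (a, b, c, d) = (4, 2, 3, 3)
X = np.arange(4 * 2 * 3 * 3).reshape(4, 2, 3, 3)

X_flatten = X.reshape(X.shape[0], -1).T     # -1 lets NumPy infer b*c*d = 18

row_vector = np.arange(6).reshape(1, -1)    # shape (1, 6)
one_dim = np.arange(6).reshape(6)           # shape (6,), a 1-D array
print(X_flatten.shape, row_vector.shape, one_dim.shape)
```

Each column of X_flatten holds one flattened example, which is the layout the (b*c*d, a) shape describes.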

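NumPy multiplication (three kinds: dot, multiply, *)

dot is standard matrix multiplication

multiply is element-wise multiplication

* depends on the type: element-wise for ndarray, matrix multiplication for np.matrix, so it is best avoided where the type is unclear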

https://www.jianshu.com/p/fd2999f41d84
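A small sketch contrasting the operations on plain ndarray inputs; the two matrices are made-up sample values.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

matrix_product = np.dot(a, b)        # rows of a times columns of b
elementwise = np.multiply(a, b)      # a[i, j] * b[i, j]
star = a * b                         # for ndarray, same as np.multiply

print(matrix_product)
print(elementwise)
```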

數學計算

>>> np.log10(100)
2.0
>>> np.log(np.e)
1.0
>>> np.log2(4)
2.0

square  square of each element
sqrt  square root


numpy.random.randn

Return a sample (or samples) from the “standard normal” distribution.

filled with random floats sampled from a univariate “normal” (Gaussian) distribution of mean 0 and variance 1

For random samples from N(\mu, \sigma^2), use:

sigma * np.random.randn(...) + mu
