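Python Basics

I. Installation, Compilation, and Running

Installing Python is straightforward: download the installer from the official site, http://www.python.org/. Ubuntu usually ships with Python preinstalled; if it is missing, run apt-get install python as root. On Windows, download and run the .msi installer. Python programs are executed by an interpreter, and after installation you will see two front ends to it: IDLE (Python GUI) and Python (command line). The former has a graphical interface; the latter is the same as running python at a command prompt. Once the interpreter starts it shows the prompt >>>. Any statement you type after the prompt is executed immediately, much like in Matlab.

Matlab has .m script files, and Python likewise has script files with the .py suffix. A .py file can be run directly by the interpreter, or compiled to bytecode first. Note that precompiled bytecode only shortens load time, because the interpreter skips the parse-and-compile step; the code itself does not execute any faster.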

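For example, suppose we want to print helloWorld.

Method 1: type it directly in the interpreter: >>> print 'helloWorld'

Method 2: put the statement in a file, for example hello.py. There are three ways to run the file:

1) In a terminal: python hello.py

2) Compile it to a .pyc file first: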

import py_compile

py_compile.compile("hello.py")

Then in a terminal: python hello.pyc

3) In a terminal:

python -O -m py_compile hello.py

python hello.pyo

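Running from a .pyc or .pyo file lets the interpreter skip recompilation, so the program starts sooner, although it does not run faster once loaded. This is why Python caches the bytecode of imported modules, which tend to be loaded many times.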
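
The compile step can also be driven from a script. The following is a minimal sketch, assuming Python 3 (where .pyo files no longer exist and bytecode normally lands in a __pycache__ directory); the temporary directory and file names are made up for the demo.

```python
import os
import py_compile
import tempfile

# Throwaway directory and file names, made up for this demo.
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, 'hello.py')
compiled = os.path.join(workdir, 'hello.pyc')

with open(source, 'w') as handle:
    handle.write("print('helloWorld')\n")

# cfile picks the output path; without it Python 3 writes into __pycache__.
result = py_compile.compile(source, cfile=compiled)

print(result, os.path.exists(compiled))
```

The resulting file can then be run with python hello.pyc, as in method 2 above.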

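II. Variables, Operators, and Expressions

There is little to say here; anyone with a background in another programming language will have no trouble, and the syntax is quite close to Matlab's. The details are as follows:

One thing to watch: in Python 2, 5/2 evaluates to 2; only 5.0/2 evaluates to 2.5.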

###################################  

### compute #######  

# raw_input() reads a line from the keyboard and returns it as a string,

# so we convert it to int before doing arithmetic

# Supported operators include:

# and or not in is < <= != == | ^ & << + - / % ~ **

print 'Please input a number:'  

number = int(raw_input())   

number += 1  

print number**2 # ** is exponentiation (in Python, ^ is bitwise XOR)

print number and 1  

print number or 1  

print not number  

5/2 # is 2 in Python 2 (integer division)

5.0/2 # is 2.5, because one operand is a float
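
The division behavior described above changed in Python 3. The sketch below assumes Python 3, where / always performs true division and // performs floor division; the variable names are only for illustration.

```python
# Python 3 semantics: "/" is true division, "//" is floor division.
true_quotient = 5 / 2        # 2.5
floor_quotient = 5 // 2      # 2
negative_floor = -5 // 2     # -3, floor division rounds toward negative infinity
quotient, remainder = divmod(5, 2)   # (2, 1)

print(true_quotient, floor_quotient, negative_floor, quotient, remainder)
```

In Python 2 the same true-division behavior is available through from __future__ import division.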

III. Data Types

1. Numbers

The usual int, long, float, and complex types are all supported, and Python picks the type of a variable from the value you assign to it:

###################################  

### type of value #######  

# int, long, float  

# there is no need to declare the type of a variable; Python

# infers it from the value you assign

num = 1   # stored as int type

num = 1111111111111   # promoted to long automatically when the value exceeds sys.maxint

num = 1.0   # stored as float type  

num = 12L # L stands for long type  

num = 1 + 12j # j stands for complex type  

num = '1' # string type

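2. Strings

Single quotes, double quotes, and triple quotes can all delimit a string. Triple quotes let a string span several lines and keep its layout. A string is a sequence type, so it supports index access and slicing much as Matlab does, except that indices start at 0.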

###################################  

### type of string #######  

num = "1" # string type  

num = "Let's go" # string type  

num = "He's \"old\"" # string type  

mail = "Xiaoyi: \n hello \n I am you!"  

mail = """Xiaoyi: 

    hello 

    I am you! 

    """ # special string format  

string = 'xiaoyi' # get value by index  

copy = string[0] + string[1] + string[2:6] # note: [2:6] covers indices 2 to 5, i.e. the half-open range [2, 6)

copy = string[:4] # from the start (index 0) up to index 3

copy = string[2:] # to end  

copy = string[::1] # step is 1, from start to end  

copy = string[::2] # step is 2  

copy = string[-1] # means 'i', the last one  

copy = string[-2:-5:-1] # means 'yoa'; a negative step walks backwards, so start must lie to the right of stop

memAddr = id(num) # id(num) returns the identity of num (its memory address in CPython)

type(num) # get the type of num
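
The slicing rules above can be checked with a few lines. This is a small sketch, assuming Python 3 print syntax; word is the same sample string used above and the other names are illustrative.

```python
word = 'xiaoyi'  # same sample string as above

first_four = word[:4]            # indices 0..3
tail = word[2:]                  # index 2 to the end
every_other = word[::2]          # indices 0, 2, 4
reversed_word = word[::-1]       # whole string, backwards
backwards_part = word[-2:-5:-1]  # indices 4, 3, 2

print(first_four, tail, every_other, reversed_word, backwards_part)
```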

3. Tuples

A tuple is defined with (). It is like an array that can hold values of different types. Elements are accessed by index, but note that they cannot be modified.

###################################  

### sequence type #######  

## elements are accessed by index or by slice

## includes: string, tuple (similar to an array, struct or cell in Matlab), list

# basic operations on sequence types

firstName = 'Zou'

lastName = 'Xiaoyi'

len(firstName) # the length

name = firstName + lastName # concatenate two strings

firstName * 3 # repeat firstName 3 times

'Z' in firstName # membership test, returns True

string = '123'  

max(string)  

min(string)  

cmp(firstName, lastName) # returns 1, -1 or 0 (Python 2 only)

## tuple (similar to an array, struct or cell in Matlab)

## define this type using ()

user = ("xiaoyi", 25, "male")  

name = user[0]  

age = user[1]  

gender = user[2]  

t1 = () # empty tuple  

t2 = (2, ) # a tuple with a single element needs a trailing comma

user[1] = 26 # error!! raises TypeError: tuple elements cannot be changed

name, age, gender = user # unpack the three elements into separate names

a, b, c = (1, 2, 3)

4. Lists

A list is defined with []. It does the same job as a tuple, except that its elements can be modified. list is a class, and the methods it defines can be used to operate on a list.

## list type (the elements can be modified)  

## define this type using []  

userList = ["xiaoyi", 25, "male"]  

name = userList[0]  

age = userList[1]  

gender = userList[2]  

userList[3] = 88888 # error! IndexError: assignment cannot grow a list, unlike in Matlab

userList.append(8888) # add a new element

"male" in userList # search

userList[2] = 'female' # modify an element (the list object itself stays the same)

userList.remove(8888) # remove an element by value

userList.remove(userList[2]) # remove an element by value

del userList[1] # del is a statement; it removes the element at index 1

## help(list.append)  

################################  

######## object and class ######  

## object = property + method  

## everything in Python is an object; list is a class, so defining

## the list "userList" creates an object of that class, and we use

## its methods to operate on the elements

5. Dictionaries

A dictionary is defined with {}. Its strength is that it stores key-value pairs, much like a struct. The dict class also provides methods for creating and manipulating dictionaries.

################################  

######## dictionary type ######  

## define this type using {}  

item = ['name', 'age', 'gender']  

value = ['xiaoyi', '25', 'male']  

zip(item, value) # zip() pairs the items up in a new list:

# [('name', 'xiaoyi'), ('age', '25'), ('gender', 'male')]

# but a list of pairs cannot be looked up by name;

# a dictionary expresses this relationship directly

# as a key-value mapping

# dic = {key1: value1, key2: value2, ...}; values can be any type, keys must be hashable

dic = {'name': 'xiaoyi', 'age': 25, 'gender': 'male'}  

dic = {1: 'zou', 'name': 'xiaoyi', 'age': 25, 'gender': 'male'} # keys may be of mixed types

# access a value with dic[key1]; the key acts as the index

print dic['name']  

print dic[1]  

# other ways to create a dictionary

fdict = dict([['x', 1], ['y', 2]]) # factory function: takes ONE sequence of key-value pairs

ddict = {}.fromkeys(('x', 'y'), -1) # built-in method: every key gets the same value (None if omitted)

# iterate with a for loop

for key in dic:

    print key

    print dic[key]

# add a key-value pair: a dictionary is unordered,

# so we can simply assign to a new key:

dic['tel'] = 88888    

# look up, update or delete elements

dic.get(1) # get the value of key 1

dic.get(1, 'error') # return a user-defined default if the dictionary does not contain the key

dic.keys()

dic.values()

dic.has_key('name') # Python 2 only; the same as: 'name' in dic

del dic[1] # delete this key

dic.pop('tel') # return the value and delete this key

dic.clear() # empty the dictionary

del dic # delete the dictionary itself; the name dic is gone after this

# dictionaries have many more methods; use help(dict) to see them
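
The zip() output shown earlier can be passed straight to dict(). The following sketch assumes Python 3 (print as a function, has_key replaced by the in operator); the field names mirror the example above, while the 'email' key is made up to show the default value.

```python
fields = ['name', 'age', 'gender']
values = ['xiaoyi', 25, 'male']

# dict() accepts any iterable of (key, value) pairs, so zip() output works directly.
user = dict(zip(fields, values))

user['tel'] = 88888                 # add a new key
age = user.get('age')               # 25
missing = user.get('email', 'n/a')  # default instead of a KeyError
removed = user.pop('tel')           # returns 88888 and deletes the key
has_name = 'name' in user           # replaces has_key() from Python 2

for key, value in user.items():
    print(key, ':', value)
```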

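IV. Flow Control

Here Python differs sharply from most other languages: it uses indented blocks to express program logic, where most languages use braces. For example:

if age < 21:

    print("You cannot buy alcohol.")

    print("But you can buy chewing gum.")

print("This line is outside the if block.")

This is equivalent to the following C code:

if (age < 21)

{

    printf("You cannot buy alcohol.\n");

    printf("But you can buy chewing gum.\n");

}

printf("This line is outside the if block.\n");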

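As you can see, Python uses indentation to mark where a block begins and ends (the off-side rule) rather than braces or keywords. Increasing the indentation opens a block (note the colon on the preceding line), and decreasing it closes the block. PEP 8 recommends four spaces per indentation level. The interpreter itself accepts any width, including tabs, as long as every line in a block is indented consistently, but other widths do not follow the style guide.

So that our code fits in well with other people's, it is best to follow the convention and indent with four spaces. Either way, use only spaces or only tabs; never mix the two.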

1. if-else

if-else tests a condition and runs the code that belongs to the branch that holds.

################################  

######## procedure control #####  

## if else  

if expression: # a boolean test; do not forget the colon

    statement(s) # indented by four spaces

if expression:

statement(s) # error!!!! the body must be indented

if 1<2:

    print 'ok, ' # indented by four spaces

    print 'yeah' # same indentation, so same block

if True: # note the capital T in True

    print 'true'  

def fun():  

    return 1  

if fun():  

    print 'ok'  

else:  

    print 'no'  

con = int(raw_input('please input a number:'))  

if con < 2:  

    print 'small'  

elif con > 3:  

    print 'big'  

else:  

    print 'middle'  

if 1 < 2:  

    if 2 < 3:  

        print 'yeah'  

    else:  

        print 'no'    

    print 'out'  

else:  

    print 'bad'  

if 1 < 2 and 2 < 3 or not 0: # and, or, not can be combined

    print 'yeah'

2. for

for runs a block of code repeatedly. It can also iterate over the sequence types described above.

################################  

######## procedure control #####  

## for  

for iterating_val in sequence:  

    statements(s)  

# sequence type can be string, tuple or list  

for i in "abcd":  

    print i  

for i in [1, 2, 3, 4]:  

    print i  

# range(start, end, step), if not set step, default is 1,   

# if not set start, default is 0, should be noted that it is [start, end), not [start, end]  

range(5) # [0, 1, 2, 3, 4]  

range(1, 5) # [1, 2, 3, 4]  

range(1, 10, 2) # [1, 3, 5, 7, 9]  

for i in range(1, 100, 1):   

    print i  

# iterate over a sequence by index

fruits = ['apple', 'banana', 'mango']

for index in range(len(fruits)):

    print 'current fruit: ', fruits[index]

# iterate over a dictionary

dic = {1: 111, 2: 222, 5: 555}  

for x in dic:  

    print x, ': ', dic[x]  

dic.items() # returns a list of (key, value) pairs, e.g. [(1, 111), (2, 222), (5, 555)]

for key,value in dic.items(): # tuple unpacking, as in: a,b=[1,2]

    print key, ': ', value

else: # runs once the loop finishes without break

    print 'ending'  

################################  

import time  

# we also can use: break, continue to control process  

for x in range(1, 11):  

    print x  

    time.sleep(1) # sleep 1s  

    if x == 3:  

        pass # do nothing  

    if x == 2:  

        continue  

    if x == 6:  

        break  

    if x == 7:    

        exit() # exit the whole program  

    print '#'*50

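3. while

while is also a loop. It first checks the condition that follows it; if the expression is true it runs the indented block after the colon, then tests the condition again, and repeats until the condition is false. The indented block after the colon is the loop body.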

################################  

######## procedure control #####  

## while  

while expression:  

    statement(s)  

while True:  

    print 'hello'  

    x = raw_input('please input something, q for quit:')  

    if x == 'q':  

        break  

else: # runs only when the condition becomes false; never here, since this loop ends via break

    print 'ending'

4. switch

Python does not provide a switch statement, but one is easy to build from a dictionary and functions. For example:

#############################  

## switch ####  

## Python has no switch statement,

## but we can implement one with a dictionary and functions

## cal.py ##  

#!/usr/local/python  

from __future__ import division  

# with this import, 5/2 = 2.5 and 6/2 = 3.0

def add(x, y):  

    return x + y  

def sub(x, y):  

    return x - y  

def mul(x, y):  

    return x * y  

def div(x, y):  

    return x / y  

operator = {"+": add, "-": sub, "*": mul, "/": div}  

operator["+"](1, 2) # the same as add(1, 2)  

operator["%"](1, 2) # error: KeyError, there is no key "%"

operator.get("+")(1, 2) # the same as add(1, 2); get() returns None for a missing key instead of raising KeyError

def cal(x, o, y):  

    print operator.get(o)(x, y)  

cal(2, "+", 3)  

# this approach is more compact than a long if-elif chain
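
A switch statement usually has a default branch, which the dictionary version can imitate with the second argument of get(). The sketch below assumes Python 3 and uses the standard-library operator module in place of the hand-written add/sub/mul/div; the names OPERATIONS, unsupported, and calculate are hypothetical.

```python
import operator as op  # standard-library functions for +, -, *, /

OPERATIONS = {'+': op.add, '-': op.sub, '*': op.mul, '/': op.truediv}

def unsupported(x, y):
    # Hypothetical fallback, playing the role of a switch "default" branch.
    raise ValueError('unsupported operator')

def calculate(x, symbol, y):
    # dict.get returns the fallback when the symbol is not a key.
    handler = OPERATIONS.get(symbol, unsupported)
    return handler(x, y)

print(calculate(2, '+', 3))
print(calculate(5, '/', 2))
```

With the fallback in place, an unknown symbol raises a descriptive ValueError rather than a KeyError or an attempt to call None.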

V. Functions

1. User-Defined Functions

In Python, functions are created with the def statement:

################################  

######## function #####   

def functionName(parameters): # no parameters is ok  

    bodyOfFunction  

def add(a, b):  

    return a+b # a function without a return statement returns None

a = 100  

b = 200  

sum = add(a, b)  

##### function.py #####  

#!/usr/bin/python  

#coding:utf8  # lets the source file contain Chinese text

def add(a = 1, b = 2): # default parameters

    return a+b  # can return any type of data

# the following calls are all ok

add()

add(2)

add(b = 1)

add(3, 4)  

###### global and local variables #####

## global variable: defined outside any function; it can be read

##              anywhere, even inside functions

## local variable: defined inside a function, and can only be used

##              in that function

## a local variable hides a global one with the same name

val = 100 # global value  

def fun():  

    print val # here will access the val = 100  

print val # here will access the val = 100, too  

def fun():  

    a = 100 # local value  

    print a  

print a # NameError: the local a is not visible here

def fun():

    global a # declare a as a global variable

    a = 100

    print a

print a # still a NameError, because fun() has not been called yet

fun()  

print a # here can access the a = 100  

############################  

## other types of parameters  

def fun(x):  

    print x  

# the following calls are all ok

fun(10) # int  

fun('hello') # string  

fun(('x', 2, 3))  # tuple  

fun([1, 2, 3])    # list  

fun({1: 1, 2: 2}) # dictionary  

## tuple  

def fun(x, y):  

    print "%s : %s" % (x,y) # %s stands for string  

fun('Zou', 'xiaoyi')  

tu = ('Zou', 'xiaoyi')  

fun(*tu)    # unpack a tuple into positional arguments

## dictionary  

def fun(name = "name", age = 0):  

    print "name: %s" % name  

    print "age: %s" % age

dic = {'name': "xiaoyi", 'age': 25} # the keys must be strings matching the parameter names of fun()

fun(**dic) # unpack a dictionary into keyword arguments

fun(age = 25, name = 'xiaoyi') # the result is the same

## the advantage of a dictionary is that each value is passed by name

#############################  

## extra (variable-length) arguments ####

## collected in a tuple

def fun(x, *args): # extra positional arguments are stored in args as a tuple

    print x

    print args

# the following calls are ok

fun(10)  

fun(10, 12, 24) # x = 10, args = (12, 24)  

## collected in a dictionary

def fun(x, **args): # extra keyword arguments are stored in args as a dictionary

    print x

    print args

# the following calls are ok

fun(10)  

fun(x = 10, y = 12, z = 15) # x = 10, args = {'y': 12, 'z': 15}  

# mix of tuple and dictionary  

def fun(x, *args, **kwargs):  

    print x  

    print args  

    print kwargs  

fun(1, 2, 3, 4, y = 10, z = 12) # x = 1, args = (2, 3, 4), kwargs = {'y': 10, 'z': 12}
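
Collecting arguments with *args and **kwargs in a signature and unpacking with * and ** at a call site are mirror images of each other. A short sketch, assuming Python 3; describe and the variable names are illustrative only.

```python
def describe(x, *args, **kwargs):
    # args collects extra positional arguments in a tuple,
    # kwargs collects extra keyword arguments in a dict.
    return x, args, kwargs

direct = describe(1, 2, 3, y=10)

# Unpacking at the call site is the mirror image of collecting in the signature.
positional = (1, 2, 3)
named = {'y': 10}
unpacked = describe(*positional, **named)

print(direct)
print(unpacked)
```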

2. Lambda Functions

A lambda defines a single-line, anonymous function. Its convenience shows in the following examples:

#############################  

## lambda function ####  

## define a quick single-line function

fun = lambda x,y : x*y # fun is a function object

fun(2, 3)  

# like  

def fun(x, y):  

    return x*y  

## recursion

# 5! = 5*4*3*2*1

def recursion(n):

    if n > 1:

        return n * recursion(n-1)

    return 1 # base case; without it the function would return None

def mul(x, y):

    return x * y

numList = range(1, 6) # [1, 2, 3, 4, 5]

reduce(mul, numList) # 5! = 120

reduce(lambda x,y : x*y, numList) # 5! = 120; a lambda saves defining a named function

### list comprehension

numList = [1, 2, 6, 7]  

filter(lambda x : x % 2 == 0, numList)  

print [x for x in numList if x % 2 == 0] # the same as above  

map(lambda x : x * 2 + 10, numList)  

print [x * 2 + 10 for x in numList] # the same as above
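
For readers on Python 3, the same ideas need two small adjustments: reduce lives in functools, and filter() and map() return lazy iterators. A sketch under that assumption; factorial is an illustrative helper, not part of the original examples.

```python
from functools import reduce  # reduce moved to functools in Python 3

def factorial(n):
    # Multiply 1 * 2 * ... * n; the initial value 1 makes factorial(0) == 1.
    return reduce(lambda x, y: x * y, range(1, n + 1), 1)

numbers = [1, 2, 6, 7]
# In Python 3, filter() and map() return lazy iterators, so wrap them in list().
evens = list(filter(lambda x: x % 2 == 0, numbers))
scaled = list(map(lambda x: x * 2 + 10, numbers))

print(factorial(5), evens, scaled)
```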

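3. Python Built-in Functions

Python has many built-in functions, which are available without any import. The standard library adds many more in modules, most of which are .py files in the Python installation directory. Knowing what is available makes programming far more efficient and saves repeating work that is already done. Only some of the commonly used ones are listed below: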

###################################  

## built-in function of python ####  

## if you do not know how to use one, call help() on it

abs, max, min, len, divmod, pow, round, callable,  

isinstance, cmp, range, xrange, type, id, int()  

list(), tuple(), hex(), oct(), chr(), ord(), long()  

callable # test whether an object can be called; returns True if it can

# or check that a name really refers to a function

isinstance # test the type of an object

numList = [1, 2]  

if type(numList) == type([]):  

    print "It is a list"  

if isinstance(numList, list): # the same as above, returns True

    print "It is a list"

for i in range(1, 10001): # builds a 10000-element list first, which costs memory

    pass

for i in xrange(1, 10001): # produces the numbers on demand, no list is built

    pass

## some basic functions about string  

str = 'hello world'  

str.capitalize() # 'Hello world', only the first letter is capitalized

str.replace("hello", "good") # 'good world'  

ip = "192.168.1.123"  

ip.split('.') # return ['192', '168', '1', '123']  

help(str.split)  

import string  

str = 'hello world'  

string.replace(str, "hello", "good") # 'good world'  

## some basic functions about sequence  

len, max, min  

# filter(function or None, sequence)

def fun(x):

    if x > 5:

        return True

numList = [1, 2, 6, 7]

filter(fun, numList) # returns [6, 7]: elements for which fun returns True are kept, the rest are dropped

filter(lambda x : x % 2 == 0, numList)  

# zip()  

name = ["me", "you"]  

age = [25, 26]  

tel = ["123", "234"]  

zip(name, age, tel) # return a list: [('me', 25, '123'), ('you', 26, '234')]  

# map()  

map(None, name, age, tel) # also return a list: [('me', 25, '123'), ('you', 26, '234')]  

test = ["hello1", "hello2", "hello3"]  

zip(name, age, tel, test) # return [('me', 25, '123', 'hello1'), ('you', 26, '234', 'hello2')]  

map(None, name, age, tel, test) # return [('me', 25, '123', 'hello1'), ('you', 26, '234', 'hello2'), (None, None, None, 'hello3')]  

a = [1, 3, 5]  

b = [2, 4, 6]  

def mul(x, y):  

    return x*y  

map(mul, a, b) # return [2, 12, 30]  

# reduce()  

reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) # computes ((((1+2)+3)+4)+5) = 15

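VI. Packages and Modules

1. Modules

In Python every .py script defines a module, so we can put functions that implement some feature in one .py file and use them from other .py scripts. There are three ways to import a module, as follows: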

###################################  

## package and module ####  

## a .py file defines a module that other scripts can use;

## the module name is the file name without the .py suffix,

## and that is the name we import in another script

## e.g., for items.py: import items

## the standard library consists of many such .py files we can import

# vi cal.py  

def add(x, y):  

    return x + y  

def sub(x, y):  

    return x - y  

def mul(x, y):  

    return x * y  

def div(x, y):  

    return x / y  

print "Your answer is: ", add(3, 5) # top-level code: also runs when the module is imported

if __name__ == "__main__": # true only when cal.py is run directly

    r = add(1, 3)  

    print r  

# vi test.py  

import cal # runs the top-level code of cal.py once

# so this statement in cal.py is executed:

# print "Your answer is: ", add(3, 5)

# and it prints "Your answer is: 8"

# but when importing cal.py we only want its functions;

# code under the __name__ == "__main__" guard is skipped on import, so r = add(1, 3) does not run

result = cal.add(1, 2)  

print result  

# or  

import cal as c  

result = c.add(1, 2)  

# or  

from cal import add  

result = add(1, 2)

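2. Packages

Each .py file provides some functionality. Sometimes several .py files are needed for a larger feature, or we want to keep files with related functionality in one place for convenience. Modules can be organized into a package by directory. The steps to create a package are:

# 1. Create a folder named after the package

# 2. Create an empty __init__.py file in that folder

# 3. Put .py scripts, compiled extensions, and subpackages in the folder as needed

# 4. import pack.m1, pack.m2, pack.m3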

#### package

## the steps above as shell commands:

mkdir calSet

cd calSet

touch __init__.py

cp ../cal.py .

cd ..

# vi test.py

import calSet.cal  

result = calSet.cal.add(1, 2)  

print result

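VII. Regular Expressions

A regular expression (often abbreviated regex, regexp, or RE in code) is a concept from computer science. It uses a single string to describe and match a set of strings that follow some syntactic rule. Many text editors use regular expressions to search for and replace text that fits a pattern.

Python provides a powerful regular expression engine in the re module, which we can use for string operations. Import it with import re.

Regular expressions have many rules; used flexibly, they make string matching very efficient. For the full set of rules, consult other references.

1. Metacharacters

There are many; some commonly used metacharacters work as follows: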

##############################  

## regular expressions (RE)

## re module in python

import re

rule = r'abc' # the r prefix marks a raw string; rule is the pattern to look for in a given string

re.findall(rule, "aaaaabcaaaaaabcaa") # return ['abc', 'abc']  

# [] defines a character set: [abc] matches any one of a, b or c

rule = r"t[io]p"   

re.findall(rule, "tip tep twp top") # return ['tip', 'top']  

# ^ inside [] means the complement: [^io] matches any character except i and o

rule = r"t[^io]p"   

re.findall(rule, "tip tep twp top") # return ['tep', 'twp']  

# ^ outside [] anchors the match to the start of the line; it matches nowhere else

rule = r"^hello"  

re.findall(rule, "hello tep twp hello") # return ['hello']  

re.findall(rule, "tep twp hello") # return []  

# $ anchors the match to the end of the line

rule = r"hello$"  

re.findall(rule, "hello tep twp hello") # return ['hello']  

re.findall(rule, "hello tep twp") # return []  

# - denotes a range

rule = r"x[0123456789]x" # the same as  

rule = r"x[0-9]x"  

re.findall(rule, "x1x x4x xxx") # return ['x1x', 'x4x']  

rule = r"x[a-zA-Z]x"  

# \ is the escape character

rule = r"\^hello"  

re.findall(rule, "hello twp ^hello") # return ['^hello']  

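# \d matches a digit; equivalent to [0-9]

# \D matches a non-digit; equivalent to [^0-9]

# \n matches a newline; equivalent to \x0a

# \r matches a carriage return; equivalent to \x0d

# \s matches any whitespace character (space, tab, form feed, ...); equivalent to [ \f\n\r\t\v]

# \S matches any non-whitespace character; equivalent to [^ \f\n\r\t\v]

# \t matches a tab; equivalent to \x09

# \w matches any word character, underscore included; equivalent to [A-Za-z0-9_]

# \W matches any non-word character; equivalent to [^A-Za-z0-9_]

# {} gives a repeat count

# for example, to check for a Guangzhou phone number: 020- followed by eight digits,

# any of the following three patterns works

rule = r"^020-\d\d\d\d\d\d\d\d$"

rule = r"^020-\d{8}$" # {8} repeats the preceding rule 8 times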

rule = r"^020-[0-9]{8}$"  

re.findall(rule, "020-23546813") # return ['020-23546813']  

# * repeats the preceding character zero or more times

rule = r"ab*"  

re.findall(rule, "a") # return ['a']  

re.findall(rule, "ab") # return ['ab']  

# + repeats the preceding character one or more times

rule = r"ab+"  

re.findall(rule, "a") # return []  

re.findall(rule, "ab") # return ['ab']  

re.findall(rule, "abb") # return ['abb']  

# ? makes the preceding character optional

rule = r"^020-?\d{8}$"

re.findall(rule, "02023546813") # return ['02023546813']

re.findall(rule, "020-23546813") # return ['020-23546813']

re.findall(rule, "020--23546813") # return []

# ? after a quantifier makes the match non-greedy

rule = r"ab+?"

re.findall(rule, "abbbbbbb") # return ['ab']

# {} can also give a range of repeat counts

rule = r"a{1,3}"  

re.findall(rule, "a") # return ['a']  

re.findall(rule, "aa") # return ['aa']  

re.findall(rule, "aaa") # return ['aaa']  

re.findall(rule, "aaaa") # return ['aaa', 'a']  

## compile re string  

rule = r"\d{3,4}-?\d{8}"  

re.findall(rule, "020-23546813")  

# compile the pattern when it is reused many times;

# re.compile returns a pattern object

p_tel = re.compile(rule)

p_tel.findall("020-23546813")

# the flag re.I makes the match case-insensitive

name_re = re.compile(r"xiaoyi", re.I)  

name_re.findall("Xiaoyi")  

name_re.findall("XiaoYi")  

name_re.findall("xiAOyi")
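
The phone-number pattern above can be wrapped in a small validation helper. This is a sketch assuming Python 3; GUANGZHOU_PHONE and is_guangzhou_number are hypothetical names, and the pattern is the one from the text.

```python
import re

# Pattern taken from the text: area code 020, optional hyphen, eight digits.
GUANGZHOU_PHONE = re.compile(r"^020-?\d{8}$")

def is_guangzhou_number(text):
    # match() returns a match object or None; bool() turns that into True/False.
    return bool(GUANGZHOU_PHONE.match(text))

print(is_guangzhou_number("020-23546813"))
print(is_guangzhou_number("02023546813"))
print(is_guangzhou_number("020--23546813"))
```

Compiling the pattern once at module level means every call reuses the same pattern object.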

2. Common Functions

A compiled pattern object also supports many operations, for example:

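# the pattern object has several methods we can use

# match looks only at the start of the string; it returns a match object on success and None otherwise

obj = name_re.match('Xiaoyi, Zou')

# search scans the whole string (any position) and returns a match object on success

obj = name_re.search('Zou, Xiaoyi')

# the result can then be used to test whether the string contains the pattern

if obj:

    pass

# findall returns a list of all matches

name_re.findall("Xiaoyi")

# finditer returns an iterator over the matches

name_re.finditer("Xiaoyi")

# substitution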

rs = r"z..x"  

re.sub(rs, 'python', 'zoux ni ziox me') # return 'python ni python me'  

re.subn(rs, 'python', 'zoux ni ziox me') # return ('python ni python me', 2), contain a number  

# splitting

str = "123+345-32*78"  

re.split(r'[\+\-\*]', str) # return ['123', '345', '32', '78']  

# list the attributes and functions of the re module, then call help on them

dir(re)

##### flags passed when compiling or matching add further behavior

# multi-line matching

str = """ 

    hello xiaoyi 

    xiaoyi hello 

    hello zou 

    xiaoyi hello 

    """  

re.findall(r'xiaoyi', str, re.M)

3. Groups

Groups serve two purposes. A group is defined with (), and a rule inside a group applies only within that group.

# () grouping

email = r"\w{3}@\w+(\.com|\.cn|\.org)"    

re.match(email, "[email protected]")  

re.match(email, "[email protected]")  

re.match(email, "[email protected]")

In addition, when a pattern contains a group, findall returns the text matched by the group rather than the whole match.

str = """ 

    idk hello name=zou yes ok d 

    hello name=xiaoyi yes no dksl 

    dfi lkasf dfkdf hello name=zouxy yes d 

    """  

r1 = r"hello name=.+ yes"  

re.findall(r1, str) # return ['hello name=zou yes', 'hello name=xiaoyi yes', 'hello name=zouxy yes']  

r2 = r"hello name=(.+) yes"  

re.findall(r2, str) # return ['zou', 'xiaoyi', 'zouxy']  

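# as shown, the whole pattern must match, but only the text inside the () group is returned;

# this is what lets us write a crawler that extracts just the data we want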
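
Groups behave slightly differently depending on how many there are and how the result is read. The sketch below assumes Python 3 and reuses part of the sample text above; the named group user is an illustrative addition, not something the original examples use.

```python
import re

text = """
    idk hello name=zou yes ok d
    hello name=xiaoyi yes no dksl
"""

# One group: findall returns just the captured text.
names = re.findall(r"hello name=(\w+) yes", text)

# Two groups: findall returns a tuple per match.
pairs = re.findall(r"(hello) name=(\w+)", text)

# A named group can be read back by name from a match object.
first = re.search(r"name=(?P<user>\w+)", text)
first_user = first.group("user")

print(names, pairs, first_user)
```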

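4. A Small Example: a Crawler

This example uses regular expressions and the group-return behavior described above to build a small crawler. It downloads every image with a .jpg suffix from a given web page.

## a small crawler

## downloads all images from a Tieba thread or a personal space page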

## getJpg.py  

#!/usr/bin/python  

import re  

import urllib  

# Get the source code of a website  

def getHtml(url):  

    print 'Getting html source code...'  

    page = urllib.urlopen(url) # Python 2 API; urllib has no open() function

    html = page.read()  

    return html  

# Inspect the page source to see what the image addresses look like,

# and derive the regular expression from what they have in common

def getImageAddrList(html):

    print 'Getting the addresses of all images...'

    rule = r"src=\"(.+?\.jpg)\" pic_ext" # non-greedy, so each match stops at the first .jpg

    imReg = re.compile(rule)

    imList = imReg.findall(html)

    return imList  

def getImage(imList):  

    print 'Downloading...'  

    name = 1

    for imgurl in imList:  

        urllib.urlretrieve(imgurl, '%s.jpg' % name)  

        name += 1  

    print 'Got ', len(imList), ' images!'  

## main  

htmlAddr = "http://tieba.baidu.com/p/2510089409"  

html = getHtml(htmlAddr)  

imList = getImageAddrList(html)  

getImage(imList)

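VIII. Deep Copy and Shallow Copy

When copying data in Python there are two kinds of copy to tell apart:

A shallow copy copies the references held by an object (only the outer, parent object is duplicated); a deep copy duplicates the contents as well. The differences in detail:

##############################

### memory operation

## shallow copy: copies the references (only the parent object is duplicated)

## deep copy: copies the contents as well

a = [1, 2, 3]

b = a # id(a) == id(b): two names for one object, like a reference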

a.append(4) # a = [1, 2, 3, 4], and b also change to = [1, 2, 3, 4]  

import copy  

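a = [1, 2, ['a', 'b']] # a nested list

c = copy.copy(a)  # id(c) != id(a)

a.append('d') # a = [1, 2, ['a', 'b'], 'd'] but c is unchanged

# this is only a shallow copy: just the parent object is duplicated,

# so id(a[2]) == id(c[2]): appending to a does not affect c,

# but changing the nested list in place shows up in c, because both lists hold the same inner object

a[2].append('d') # changes c[2], too

a[1] = 3 # does NOT change c: this rebinds a[1] only, c[1] is still 2

# deep copy

d = copy.deepcopy(a) # copies everything, nested objects included;

# from here on a and d are completely independent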
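
The three cases (alias, shallow copy, deep copy) can be compared side by side. A minimal sketch, assuming Python 3 print syntax; the variable names are illustrative.

```python
import copy

original = [1, 2, ['a', 'b']]
alias = original                  # same object, no copy at all
shallow = copy.copy(original)     # new outer list, shared inner list
deep = copy.deepcopy(original)    # new outer list and new inner list

original[2].append('c')           # mutate the shared inner list in place
original[0] = 99                  # rebind a slot of the outer list

print(alias)    # [99, 2, ['a', 'b', 'c']]
print(shallow)  # [1, 2, ['a', 'b', 'c']]
print(deep)     # [1, 2, ['a', 'b']]
```

Only in-place changes to a shared nested object are visible through the shallow copy; rebinding a slot of the outer list is not.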

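IX. Files and Directories

1. Reading and Writing Files

File handling in Python differs little from other languages. Files are accessed through open or the file class, and Python offers many methods that move file contents to and from types such as list. Details below: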

########################  

## file and directory  

# file_handler = open(filename, mode)  

# mode works as in other programming languages

## read  

# method 1  

fin = open('./test.txt')  

fin.read()  

fin.close()  

# method 2, class file  

fin = file('./test.txt')  

fin.read()  

fin.close()  

## write  

fin = open('./test.txt', 'r+') # r, r+, w, w+, a, a+, b, U  

fin.write('hello')  

fin.close()  

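### methods of file objects

## help(file)

for i in open('test.txt'):

    print i

line = fin.readline() # read one line at a time

lines = fin.readlines() # read all remaining lines into a list, one element per line

fin.next() # return the current line and advance to the next

# write several lines from a list

fin.writelines(lines)

# move the file pointer: seek(offset, whence)

fin.seek(0, 0) # whence 0: offset from the start of the file

fin.seek(0, 1) # whence 1: offset from the current position

fin.seek(-1, 2) # whence 2: offset from the end of the file

# flush pending writes

fin.flush() # written data is normally buffered until close(); flush() writes it out immediately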
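
Forgetting close() is a common source of lost writes, and the with statement avoids it. The sketch below assumes Python 3 and writes to a throwaway temporary directory; the path and variable names are made up for the demo.

```python
import os
import tempfile

# Hypothetical file name inside a throwaway temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'test.txt')

# "with" closes the file automatically, even if an exception is raised.
with open(path, 'w') as fout:
    fout.writelines(['hello\n', 'world\n'])

with open(path) as fin:
    first_line = fin.readline()     # one line, newline included
    remaining = fin.readlines()     # the rest, as a list

print(repr(first_line), remaining)
```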

2. The os Module

The os module provides many operating-system operations, such as working with directories. Import it with import os before use.

#########################  

## OS module  

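## needed for directory operations

import os

os.mkdir('xiaoyi') # mkdir

os.makedirs('a/b/c', mode = 0o755) # create nested directories; mode is an octal permission value

os.listdir('.') # ls: names of the files and folders directly under the given directory (subdirectories are not descended into)

os.chdir('xiaoyi') # cd

os.getcwd() # pwd

os.chdir('..')

os.rmdir('xiaoyi') # rmdir: removes an empty directory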

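3. Directory Traversal

Directory traversal underlies many common tools, such as antivirus software, junk-file cleaners, and file search tools, because they all have to scan every file under a directory, including those in subdirectories. Two ways to traverse a directory are shown here:

1) Recursion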

#!/usr/bin/python  

#coding:utf8  

import os  

def dirList(path):  

    fileList = os.listdir(path)  

    allFile = []  

    for fileName in fileList:  

        # filePath = path + '/' + fileName # the same as below

        filePath = os.path.join(path, fileName)

        if os.path.isdir(filePath):

            allFile.extend(dirList(filePath)) # keep the files found in the subdirectory

        allFile.append(filePath)

    return allFile

2) The os.walk Function

# os.walk returns a generator that yields a 3-tuple (directory path, subdirectory names, file names) for each directory

gen = os.walk('/')

for path, dirs, filelist in os.walk('/'):

    for filename in filelist:

        print os.path.join(path, filename)
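
Walking / can take a long time, so the idea is easier to try on a small tree. The following sketch assumes Python 3 and builds a throwaway directory with tempfile; list_files and the file names are hypothetical.

```python
import os
import tempfile

def list_files(root):
    # Collect the path of every file under root, subdirectories included.
    found = []
    for path, dirs, files in os.walk(root):
        for filename in files:
            found.append(os.path.join(path, filename))
    return sorted(found)

# Build a small throwaway tree (names are made up for the demo).
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, 'a', 'b'))
for relative in ('top.txt', os.path.join('a', 'b', 'deep.txt')):
    with open(os.path.join(root, relative), 'w') as handle:
        handle.write('x')

all_files = list_files(root)
print(all_files)
```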

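X. Exception Handling

An exception signals an error, and an unhandled exception stops the program. The exception mechanism gives developers a way to detect an error at run time, recover from it, and then continue executing.

###################################

### exception handling

# try runs a block of code; if something goes wrong an exception is raised,

# except catches it, and the code we write there handles it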

try:  

    fin = open("abc.txt")  

    print hello # hello is not defined, so this raises NameError

    ### your normal processing code here

except IOError, msg:

    print "No such file!"

    ### your code to handle this error  

except NameError, msg:  

    print msg  

    ### your code to handle this error  

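finally: # this block runs whether or not an exception occurred

    print 'ok'

# raising an exception yourself; the type should be one of Python's exception classes (or a subclass)

filename = "hello"

if filename == "hello":

    raise TypeError("Nothing!!")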
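
The order in which try, except, and finally run is easiest to see by recording it. This is a sketch assuming Python 3, where the syntax is except X as name rather than except X, name; safe_divide, check_name, and events are illustrative names.

```python
events = []  # records which blocks ran, in order

def safe_divide(x, y):
    try:
        result = x / y
    except ZeroDivisionError as error:   # Python 3 spelling of "except X, msg"
        events.append('error: %s' % error)
        result = None
    finally:                             # runs whether or not an exception occurred
        events.append('done')
    return result

def check_name(name):
    # Raise a built-in exception type when the argument is unusable.
    if not isinstance(name, str):
        raise TypeError('name must be a string')
    return name

ok = safe_divide(6, 3)
failed = safe_divide(1, 0)
print(ok, failed, events)
```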
