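I. Install the modules
pip install xlrd pymysql
II. The Excel files
The files used in this example come from the Ministry of Education of the People's Republic of China: http://www.moe.gov.cn/jyb_xxgk/s5743/s5744/201906/t20190617_386200.html
Download them directly from that page.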
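III. Database design
Design a simple database based on the contents of the spreadsheets.
1. Adult colleges table: adult_school (5 columns)
2. Regular colleges table: ordinart_school (7 columns)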
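The table definitions themselves are not reproduced here, so the following is only a rough sketch of what they might look like. The table names and the column counts (5 and 7) come from this article, and the serial number is used as the key. Every column name and type below is a hypothetical placeholder, and create_table_sql() is a helper invented for this illustration.

```python
# Hypothetical column layouts: only the table names, the column counts
# (5 and 7) and the serial-number key are taken from the article.
ADULT_SCHOOL_COLUMNS = [
    ("id", "INT PRIMARY KEY"),       # serial number, used as the key
    ("name", "VARCHAR(100)"),        # placeholder column
    ("code", "VARCHAR(20)"),         # placeholder column
    ("department", "VARCHAR(100)"),  # placeholder column
    ("remark", "VARCHAR(255)"),      # placeholder column
]

ORDINART_SCHOOL_COLUMNS = [
    ("id", "INT PRIMARY KEY"),       # serial number, used as the key
    ("name", "VARCHAR(100)"),        # placeholder column
    ("code", "VARCHAR(20)"),         # placeholder column
    ("department", "VARCHAR(100)"),  # placeholder column
    ("location", "VARCHAR(50)"),     # placeholder column
    ("level", "VARCHAR(20)"),        # placeholder column
    ("remark", "VARCHAR(255)"),      # placeholder column
]


def create_table_sql(table, columns):
    """Build a CREATE TABLE statement from (name, type) pairs."""
    body = ",\n    ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return f"CREATE TABLE IF NOT EXISTS {table} (\n    {body}\n)"


adult_ddl = create_table_sql("adult_school", ADULT_SCHOOL_COLUMNS)
ordinart_ddl = create_table_sql("ordinart_school", ORDINART_SCHOOL_COLUMNS)
```

Whatever names you choose, the number of columns has to match the number of values in each row that is inserted later.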
IV. The code
- Import the modules
import xlrd
import pymysql
- Define open_file(), which reads an Excel file and extracts the data rows
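def open_file(path):
    try:
        workbook = xlrd.open_workbook(path)
        sheet = workbook.sheets()[0]  # first worksheet
        rownum = sheet.nrows
        # print(rownum)
        data = []
        # The data starts at the fifth row (index 4)
        for value in range(4, rownum):
            values = sheet.row_values(value)
            # print(type(values[0]))
            # Keep only rows whose first cell is a number (the serial number)
            if type(values[0]) == float:
                data.append(values)
                # print(values)
        return data
    except Exception as e:
        # On failure the error is printed and the function returns None
        print(e)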
workbook = xlrd.open_workbook(path): opens the file for reading
sheet = workbook.sheets()[0]: gets a worksheet by its index
rownum = sheet.nrows: gets the number of rows in the worksheet
values = sheet.row_values(value): gets the values of all cells in one row
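To make the role of nrows and row_values() a little more concrete, here is a minimal sketch of the row loop as a standalone helper. It assumes only that the sheet object exposes nrows and row_values(index), as the xlrd sheet used in open_file() does. The names iter_data_rows and FakeSheet, and the sample rows, are made up for this illustration.

```python
def iter_data_rows(sheet, first_data_row=4):
    """Yield the cell values of each row, skipping the header rows.

    `sheet` is any object with an `nrows` attribute and a
    `row_values(index)` method, like the sheet used in open_file().
    """
    for index in range(first_data_row, sheet.nrows):
        yield sheet.row_values(index)


class FakeSheet:
    """Hypothetical in-memory stand-in for a worksheet (demo only)."""

    def __init__(self, rows):
        self._rows = rows
        self.nrows = len(rows)

    def row_values(self, index):
        return self._rows[index]


# Four header rows followed by three rows of made-up content.
sheet = FakeSheet([
    ["title"], [""], ["column headers"], [""],
    [1.0, "School A"], ["Region heading"], [2.0, "School B"],
])
rows = list(iter_data_rows(sheet))
```

With a real workbook you would pass workbook.sheets()[0] instead of the FakeSheet instance.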
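In both Excel files, the data we need starts at the fifth row.
The serial-number column is used as the key in the table, but in some rows that column holds a city name instead, so those rows must be filtered out.
Loop over the rows and print the type of the first cell in each one: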
for value in range(4, rownum):
    values = sheet.row_values(value)
    print(type(values[0]))
Result: the first cell of each row is either a float or a str, so every row whose first cell is a str must be filtered out. Add a check:
if type(values[0]) == float:
    data.append(values)
The rows that match the table format are collected in the data list, which is then returned.
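The filtering step can also be tried without any Excel file. The sketch below is one possible variant, not the code used in this article: it uses isinstance() for the check and, as an extra assumption about what the key column should hold, converts the serial number from float (xlrd returns numeric cells as floats) to int. The function name keep_data_rows and the sample rows are hypothetical.

```python
def keep_data_rows(rows):
    """Keep rows whose first cell is a float and store that cell as an int."""
    kept = []
    for row in rows:
        # Heading rows carry text in the first cell; data rows carry a float.
        if row and isinstance(row[0], float):
            kept.append([int(row[0])] + list(row[1:]))
    return kept


# Made-up rows shaped like the output of sheet.row_values().
sample_rows = [
    ["Region heading", "", ""],
    [1.0, "School A", "code-1"],
    [2.0, "School B", "code-2"],
]
filtered = keep_data_rows(sample_rows)
```

Here filtered holds only the two school rows, with serial numbers 1 and 2 stored as integers.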
- Define in_sql(), which writes the data to the database
def in_sql(data, name):
    connection = pymysql.connect(
        host='localhost',     # database host
        user='root',          # database user
        password='19981216',  # database password (use your own)
        db='school',          # database name
        # charset = 'utf8 -- UTF-8 Unicode'
    )
    # print(data)
    cursor = connection.cursor()
    print(name)
    # One %s placeholder per column: 5 for adult_school, 7 otherwise
    if name == 'adult_school':
        sql = 'insert into ' + name + ' values(%s,%s,%s,%s,%s)'
    else:
        sql = 'insert into ' + name + ' values(%s,%s,%s,%s,%s,%s,%s)'
    print(sql)
    cursor.executemany(sql, data)  # insert all rows in one call
    connection.commit()
    print(cursor.rowcount)         # number of rows written
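in_sql() takes two parameters, data and name. data is the value returned by open_file(), and name is the name of the target table, which is needed because the two Excel files have different layouts.
A check on name picks the INSERT statement with the right number of placeholders for the target table: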
if name == 'adult_school':
    sql = 'insert into ' + name + ' values(%s,%s,%s,%s,%s)'
else:
    sql = 'insert into ' + name + ' values(%s,%s,%s,%s,%s,%s,%s)'
print(sql)
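As an alternative to the if/else above, the statement could be generated from a column count. This is only a sketch of that idea: TABLE_COLUMNS and build_insert_sql are names invented here, and the counts 5 and 7 are read off the two INSERT statements above. Looking the table up in a fixed dictionary also rejects any table name that was not expected.

```python
# Column counts per table, taken from the two INSERT statements above.
TABLE_COLUMNS = {"adult_school": 5, "ordinart_school": 7}


def build_insert_sql(table):
    """Return an INSERT statement with one %s placeholder per column."""
    if table not in TABLE_COLUMNS:
        raise ValueError(f"unknown table: {table!r}")
    placeholders = ",".join(["%s"] * TABLE_COLUMNS[table])
    return f"insert into {table} values({placeholders})"


adult_sql = build_insert_sql("adult_school")
ordinart_sql = build_insert_sql("ordinart_school")
```

The two strings produced here are identical to the ones built by the if/else, so they can be passed to cursor.executemany() in the same way.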
- The main block:
if __name__ == '__main__':
    adult_school = open_file('./W020190617630075984660.xls')     # adult colleges, 5 columns
    ordinart_school = open_file('./W020190617630075964590.xls')  # regular colleges, 7 columns
    in_sql(adult_school, 'adult_school')
    in_sql(ordinart_school, 'ordinart_school')
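One thing to keep in mind: open_file() prints the error and returns None when reading fails, and that None is then passed straight to in_sql(). A more defensive write step might look like the sketch below. It assumes a connection object with cursor(), commit() and rollback(), which a pymysql connection provides. The function name write_rows is hypothetical and is not part of the code in this article.

```python
def write_rows(connection, sql, rows):
    """Insert all rows in one transaction and return the affected row count.

    `connection` is assumed to offer cursor(), commit() and rollback(),
    as a pymysql connection does.
    """
    if not rows:
        # Nothing to write, e.g. open_file() returned None or an empty list.
        return 0
    cursor = connection.cursor()
    try:
        cursor.executemany(sql, rows)
        connection.commit()
        return cursor.rowcount
    except Exception:
        # Undo the partial insert so the table is left unchanged.
        connection.rollback()
        raise
    finally:
        cursor.close()
```

The caller still owns the connection and should call connection.close() once both tables have been written.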
Complete code:
import xlrd
import pymysql
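
def open_file(path):
    try:
        workbook = xlrd.open_workbook(path)
        sheet = workbook.sheets()[0]  # first worksheet
        rownum = sheet.nrows
        # print(rownum)
        data = []
        # The data starts at the fifth row (index 4)
        for value in range(4, rownum):
            values = sheet.row_values(value)
            # print(type(values[0]))
            # Keep only rows whose first cell is a number (the serial number)
            if type(values[0]) == float:
                data.append(values)
                # print(values)
        return data
    except Exception as e:
        # On failure the error is printed and the function returns None
        print(e)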

def in_sql(data, name):
    connection = pymysql.connect(
        host='localhost',     # database host
        user='root',          # database user
        password='19981216',  # database password (use your own)
        db='school',          # database name
        # charset = 'utf8 -- UTF-8 Unicode'
    )
    # print(data)
    cursor = connection.cursor()
    print(name)
    # One %s placeholder per column: 5 for adult_school, 7 otherwise
    if name == 'adult_school':
        sql = 'insert into ' + name + ' values(%s,%s,%s,%s,%s)'
    else:
        sql = 'insert into ' + name + ' values(%s,%s,%s,%s,%s,%s,%s)'
    print(sql)
    cursor.executemany(sql, data)  # insert all rows in one call
    connection.commit()
    print(cursor.rowcount)         # number of rows written

if __name__ == '__main__':
    adult_school = open_file('./W020190617630075984660.xls')     # adult colleges, 5 columns
    ordinart_school = open_file('./W020190617630075964590.xls')  # regular colleges, 7 columns
    in_sql(adult_school, 'adult_school')
    in_sql(ordinart_school, 'ordinart_school')
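The listing above keeps the database password in the source file. If you would rather not do that, one option is to read the connection settings from environment variables. This is only a sketch: the variable names (SCHOOL_DB_HOST and so on) and the function db_config_from_env are made up for this example, and the defaults mirror the values used in in_sql().

```python
import os


def db_config_from_env(environ=os.environ):
    """Collect connection settings from environment variables.

    The variable names are hypothetical; the defaults mirror in_sql().
    """
    return {
        "host": environ.get("SCHOOL_DB_HOST", "localhost"),
        "user": environ.get("SCHOOL_DB_USER", "root"),
        # Required: raises KeyError if the variable is not set.
        "password": environ["SCHOOL_DB_PASSWORD"],
        "database": environ.get("SCHOOL_DB_NAME", "school"),
    }


# Demo with an explicit mapping instead of the real environment.
config = db_config_from_env({"SCHOOL_DB_PASSWORD": "example-password"})
```

The resulting dictionary could then be unpacked into the connect call, for example pymysql.connect(**db_config_from_env()).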