Import a CSV file
LOAD DATA INFILE '/var/lib/mysql-files/jjdb_fkdb_all_dropdup_20w.csv'
into table `jjdb_fkdb_all_dropdup_20w` character set utf8
fields terminated by ',' optionally enclosed by '"' escaped by '"'
lines terminated by '\r\n';
Note: importing a CSV file places strict demands on the file's structure and content, so the failure rate is high.
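The quoting rules in the statement above can be reproduced from Python with the standard csv module; a minimal sketch (the column names and rows are made up for illustration):

```python
import csv
import io

# Sketch: write rows in the shape the LOAD DATA options above expect:
# fields terminated by ',', enclosed by '"', embedded quotes escaped by
# doubling them (csv's default doublequote=True matches ESCAPED BY '"'),
# and lines terminated by '\r\n'.
rows = [["id", "note"], ["1", 'says "hi", twice']]
buf = io.StringIO()
writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\r\n")
writer.writerows(rows)
print(buf.getvalue())
```

Writing the file in exactly this shape removes most of the structure mismatches that make the CSV import fail.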
Import a TXT file
1. Save the data as a tab-separated txt file:
df.to_csv('jjdb_fkdb_all_dropdup.txt',sep='\t',index=False)
2. Log in to the database:
mysql -uroot -p
use mysql57
3. Load the data into MySQL. If the column order in the txt file differs from the column order in the table, the import will fail, so the command below lists the column names in the table's order (LOAD DATA's default field terminator is a tab, which matches the export above):
LOAD DATA INFILE '/var/lib/mysql-files/jjdb_fkdb_all_dropdup_20w.txt'
INTO TABLE jjdb_fkdb_all_dropdup_20w IGNORE 1 LINES (xzqhdm,jjdbh,jjdwdm,jjybh,jjyxm,jjtbh,jjtip,jjsj,bjdh,jjlyh,bjrxm,bjrxbdm,lxdh,lxdz,jqdz,gxdwdm,jqlxdm,zddwxzb,zddwyzb,jqztdm,gxsjc,gljqbh,tfhm,rksjc,bjnr,bjlxmc,labels,sjdbh,jjdbh_f,cjdbh,fkdbh,fksj,fkdwip,fktbh,fkybh,fkdwdm,fkyxm,jwqdm,jwqmc,cjrxm,sjcjsj,ddxcsj,jqlbdm,jqxldm,jqfssj,jqjssj,jqdjdm,hzdjdm,afcslxdm,qhjzlbdm,cjqk,sfphxsaj,sfcczaaj,sfjjjf,cdclqk,cdryqk,hzyydm,cljgdm,cljg,rksjc_f,gxsjc_f,jqztdm_f,id);
Import succeeded:
Query OK, 6331041 rows affected, 65535 warnings (2 min 30.16 sec)
Records: 6331041 Deleted: 0 Skipped: 0 Warnings: 48877809
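Typing sixty-odd column names by hand is error-prone; one way is to generate the list from the same DataFrame that produced the txt file, so the order always matches the export. A sketch with stand-in column names:

```python
import pandas as pd

# Sketch: build the (col1,col2,...) list for LOAD DATA from the DataFrame
# header. The three column names here are stand-ins for the real ~60 columns.
df = pd.DataFrame(columns=["xzqhdm", "jjdbh", "id"])
col_list = "(" + ",".join(df.columns) + ")"
print(col_list)
```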
Notes:
1. If error 1261 appears during the import:
ERROR 1261 (01000): Row 404 doesn't contain data for all columns
then clear sql_mode:
show variables like "sql_mode";
set sql_mode='';
2. If error 1062 appears during the import:
ERROR 1062 (23000): Duplicate entry '20' for key 'PRIMARY'
one workaround is to drop the table's primary key in Navicat; but since duplicates are then no longer rejected, the row count after the import will be larger than expected.
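Alternatively, the duplicates can be removed on the pandas side before the export, so the primary key can stay in place; a sketch with made-up data, assuming `id` is the key column:

```python
import pandas as pd

# Sketch: drop duplicate key values before exporting, so ERROR 1062 does
# not occur during LOAD DATA. "id" is a stand-in for the real key column.
df = pd.DataFrame({"id": [20, 20, 21], "v": ["a", "b", "c"]})
dedup = df.drop_duplicates(subset="id", keep="first")
print(len(dedup))
```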
Export a CSV file
Export a CSV file from table jjdb_fkdb_all_dropdup_copy (reading the exported file with pandas may fail):
select * into outfile '/var/lib/mysql-files/all_vs_dup.csv' fields terminated by '\t' lines terminated by '\n' from jjdb_fkdb_all_dropdup_copy;
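A common cause of those read failures: despite the .csv name the file is tab-separated, and SELECT ... INTO OUTFILE writes NULL as \N and escapes special characters with a backslash by default. A sketch of reading it back, with inline sample data standing in for the real file:

```python
import io
import pandas as pd

# Sketch: the exported file is tab-separated and uses \N for NULL; passing
# escapechar="\\" as well would additionally undo MySQL's backslash escaping.
raw = "1\tfoo\t\\N\n2\tbar\tx\n"  # stand-in for the exported file
df = pd.read_csv(io.StringIO(raw), sep="\t", header=None, na_values="\\N")
print(df.head())
```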
Read data from the database with pandas
import pandas as pd
import pymysql
# SQL command
sql_cmd = "SELECT * FROM table_name"
# open a DBAPI connection with pymysql
con = pymysql.connect(host='172.**.**.**', user='****', password='**database password here**', database='**database name here**', charset='utf8', use_unicode=True)
df = pd.read_sql(sql_cmd, con)
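The same call shape can be tried without a MySQL server by substituting an in-memory SQLite connection; a runnable stand-in demo:

```python
import sqlite3
import pandas as pd

# Stand-in demo: an in-memory SQLite table replaces the MySQL table so
# pd.read_sql can run without a server; with the pymysql connection above
# the call is the same, though recent pandas versions prefer a SQLAlchemy
# engine for non-SQLite connections.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE table_name (id INTEGER, name TEXT)")
con.executemany("INSERT INTO table_name VALUES (?, ?)", [(1, "a"), (2, "b")])
df = pd.read_sql("SELECT * FROM table_name", con)
print(df.shape)
con.close()
```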