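This happens when the column is of type int or float: an empty string in the file is stored as the default value 0, whereas a varchar column keeps it as an empty string.
There are generally two fixes. The first is to replace every empty string with \N: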
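import re

# Append mode, as in the original script: delete test_new before re-running.
with open("F:\\factor_db\\test_new", 'a') as g, open("F:\\factor_db\\test_old.txt", 'r') as f:
    for line in f:
        # Distinct runs of consecutive tabs in this line, longest first
        b = sorted(set(re.findall('\t+', line)), key=len, reverse=True)
        # A tab right before the newline means the last field is empty
        line = line.replace('\t\n', '\t\\N\n')
        # A run of n tabs encloses n-1 empty fields; put \N into each of them.
        # Runs of length 1 are left unchanged, and a line without tabs is written as-is.
        for i in b:
            line = line.replace(i, '\t\\N' * (len(i) - 1) + '\t')
        g.write(line)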
test_old.txt is the text file exported directly from MySQL, and test_new is the processed output. Then load it directly:
load data infile 'F:/factor_db/test_new' into table t_stock_factor_barra fields terminated by '\t' lines terminated by '\r\n' ignore 1 lines
The empty fields are then stored in the database as NULL.
The second approach is the recommended one:
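The same replacement can also be sketched by splitting each line on the tab separator instead of counting tab runs. This is only an illustrative alternative, not the script above: the function name `empty_fields_to_null` and its parameters are hypothetical. Unlike the script, it also converts an empty first field, and like the script it turns empty varchar fields into NULL as well.

```python
def empty_fields_to_null(line, sep="\t", null="\\N"):
    """Replace every empty field in one delimited line with the NULL marker."""
    # Separate the line terminator so it is not treated as field content
    body = line.rstrip("\r\n")
    ending = line[len(body):]
    # Leave completely blank lines untouched
    if body == "":
        return line
    # An empty string between two separators is an empty field
    fields = [field if field != "" else null for field in body.split(sep)]
    return sep.join(fields) + ending


if __name__ == "__main__":
    # repr() makes the tabs and the \N markers visible
    print(repr(empty_fields_to_null("a\t\t\tb\t\n")))
```

Applied line by line to test_old.txt, this should produce a file that can be loaded with the same load data infile statement.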
load data infile 'F:/factor_db/barra_test' into table t_stock_factor_barra fields terminated by '\t' lines terminated by '\r\n' ignore 1 lines
(`full_insID`,`date`,@`beta`,@`book_to_price_ratio`,@`earnings_yield`,@`growth`,@`leverage`,@`liquidity`,@`momentum`,@`non_linear_size`,@`residual_volatility`,@`size`)
set
`beta` = NULLif(@beta,''),
`book_to_price_ratio` = NULLif(@book_to_price_ratio,''),
`earnings_yield` = NULLif(@earnings_yield,''),
`growth` = NULLif(@growth,''),
`leverage` = NULLif(@leverage,''),
`liquidity` = NULLif(@liquidity,''),
`momentum` = NULLif(@momentum,''),
`non_linear_size` = NULLif(@non_linear_size,''),
`residual_volatility` = NULLif(@residual_volatility,''),
`size` = NULLif(@size,'')
;
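F:/factor_db/barra_test is the file to import.
(`full_insID`,`date`,@`beta`,@`book_to_price_ratio`,@`earnings_yield`,@`growth`,@`leverage`,@`liquidity`,@`momentum`,@`non_linear_size`,@`residual_volatility`,@`size`)
are the column names of the database table. They must be written in the same order as the fields appear in the file.
The names prefixed with @ are the columns that may contain an empty string and should be stored as NULL. Each of them is first read into a user variable, and the matching line `column_name` = NULLif(@column_name,'') after set does the conversion.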
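Since the column list and the set clause repeat the same names, the statement can be generated rather than typed by hand. The following is a minimal sketch under assumptions: `build_load_statement` and its parameters are hypothetical names, and the separators and ignore 1 lines are copied from the statement above. It only builds the SQL text and does not connect to a database.

```python
def build_load_statement(path, table, columns, nullable):
    """Build a LOAD DATA INFILE statement that stores '' as NULL.

    columns  -- column names in the order the fields appear in the file
    nullable -- the subset of columns whose empty strings should become NULL
    """
    nullable = set(nullable)
    # Nullable columns are read into user variables (@`name`) first
    targets = ",".join(
        "@`{0}`".format(c) if c in nullable else "`{0}`".format(c)
        for c in columns
    )
    # One NULLIF assignment per nullable column, kept in file order
    assignments = ",\n".join(
        "`{0}` = NULLIF(@{0},'')".format(c) for c in columns if c in nullable
    )
    statement = (
        "load data infile '{path}' into table {table} "
        "fields terminated by '\\t' lines terminated by '\\r\\n' ignore 1 lines\n"
        "({targets})"
    ).format(path=path, table=table, targets=targets)
    # Omit the set clause entirely when no column needs the conversion
    if assignments:
        statement += "\nset\n" + assignments
    return statement + ";"


if __name__ == "__main__":
    print(build_load_statement(
        "F:/factor_db/barra_test",
        "t_stock_factor_barra",
        ["full_insID", "date", "beta", "size"],
        ["beta", "size"],
    ))
```

The names are inserted into the SQL text without any escaping, so this sketch is only suitable for column names you control.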