Data source: sample data shared by the World Bank
http://www.ibm.com/developerworks/cn/data/library/bd-hivetool/
Reference:
http://blog.chinaunix.net/uid-27126319-id-3502468.html
ORACLE-SQLLOAD: importing external data, explained in detail
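1. The Linux host must have the Oracle client installed, and the connection must have been tested successfully.
2. Create the /opt/load_test/ directory and upload the file to it. Use the head command to look at the first few lines and check which delimiter and data types are used.
head Finance_inequality_and_the_poor_data_6005.csv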
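The delimiter and column count seen with head can be double-checked over the whole file. A minimal sketch, assuming a plain comma-separated file in which no field contains a quoted comma; `field_counts` is a hypothetical helper name, not part of any tool:

```shell
# Hypothetical helper: print each distinct field count and how many lines have it.
# Assumes "," is the delimiter and that no field contains a quoted comma.
field_counts() {
    awk -F',' '{ n[NF]++ } END { for (k in n) print k, n[k] }' "$1" | sort -n
}

# Usage (path taken from step 2). A single output line means every row has the same width.
csv=/opt/load_test/Finance_inequality_and_the_poor_data_6005.csv
if [ -f "$csv" ]; then
    field_counts "$csv"
fi
```

The table created in step 3 has 17 columns, so 17 fields per line is what one would hope to see for the original file.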
3. Connect with sqlplus /nolog and create the table.
CREATE TABLE CEA.LOAD_TEST (
COUNTRYCODE VARCHAR2(16) NULL,
YEARID NUMBER(4,0) NULL,
LOGINITIALGINI NUMBER(20,10) NULL,
GROWTHEINGINI NUMBER(20,10) NULL,
SPAN NUMBER(4,0) NULL,
LOGINITIALGDPPERCAPITAL NUMBER(20,10) NULL,
GROWTHGDPPERCAPITAL NUMBER(20,10) NULL,
PRIVCREAVG NUMBER(20,10) NULL,
LOGPRIVATECREDIT NUMBER(20,10) NULL,
INFLATION NUMBER(20,10) NULL,
LOGTRADE NUMBER(20,10) NULL,
GR_LTRADE NUMBER(20,10) NULL,
GR_SCHOOL NUMBER(20,10) NULL,
LOGSCHOOLING NUMBER(20,10) NULL,
LOGCOMMERCIALCENTRALBANK NUMBER(20,10) NULL,
LOGINITIALLOWESTINCOMSHARE NUMBER(20,10) NULL,
GROWTHINLOWESTINCOMESHARE NUMBER(20,10) NULL
)
/
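4. Switch to the oracle user (its environment variables are different), prepare the ctl file, then run sqlldr.
Run it after su - oracle. On the first run no data was inserted; the cause was that the lines did not end with a delimiter.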
[oracle@client ~]$ /data/oracle/product/11.2.0/db_1/bin/sqlldr cea/cea control=/opt/load_test/load.ctl log=/opt/load_test/log.txt \
> bad=/opt/load_test/bad.txt data=/opt/load_test/Finance_inequality_and_the_poor_data_6005.csv
SQL*Loader: Release 11.2.0.1.0 - Production on Tue Dec 8 17:22:01 2015
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
Commit point reached - logical record count 58
[oracle@client ~]$ pwd
/home/oracle
[oracle@client ~]$
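The contents of load.ctl are not shown in these notes. The sketch below is reconstructed from the settings that the SQL*Loader log reports (APPEND, comma-terminated fields, the 17 columns of CEA.LOAD_TEST), so it is an assumption rather than the file that was actually used; `write_ctl` is a hypothetical helper name.

```shell
# Hypothetical helper: write a SQL*Loader control file to the given path.
# The clauses mirror what the log reports; the real load.ctl may differ.
write_ctl() {
    cat > "$1" <<'EOF'
-- Assumed control file, reconstructed from the SQL*Loader log.
LOAD DATA
INFILE '/opt/load_test/Finance_inequality_and_the_poor_data_6005.csv'
APPEND
INTO TABLE CEA.LOAD_TEST
FIELDS TERMINATED BY ','
(
  COUNTRYCODE,
  YEARID,
  LOGINITIALGINI,
  GROWTHEINGINI,
  SPAN,
  LOGINITIALGDPPERCAPITAL,
  GROWTHGDPPERCAPITAL,
  PRIVCREAVG,
  LOGPRIVATECREDIT,
  INFLATION,
  LOGTRADE,
  GR_LTRADE,
  GR_SCHOOL,
  LOGSCHOOLING,
  LOGCOMMERCIALCENTRALBANK,
  LOGINITIALLOWESTINCOMSHARE,
  GROWTHINLOWESTINCOMESHARE
)
EOF
}

# Usage: write_ctl /opt/load_test/load.ctl
```

Because the sqlldr command passes data= on the command line, that file takes the place of the INFILE entry, so the INFILE path here mostly serves as documentation.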
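5. Each line turned out to be missing a trailing delimiter. Use awk to append one to every line and save the result as a separate file.
# awk appends a "," delimiter to the end of every line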
awk 'BEGIN{FS=",";OFS=","}{print $0","}' Finance_inequality_and_the_poor_data_6005.csv > Finance_inequality_and_the_poor_data_6005.txt
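Before rerunning sqlldr, the converted file can be checked. A minimal sketch with hypothetical helper names: the first function counts lines that still lack a trailing comma, and the second counts lines that end in a carriage return. Windows-style line endings are one common reason a final numeric field gets rejected; these notes do not say whether that applied to this file.

```shell
# Hypothetical helper: number of lines that do NOT end with ",".
lines_missing_trailing_comma() {
    awk '!/,$/ { n++ } END { print n + 0 }' "$1"
}

# Hypothetical helper: number of lines that end with a carriage return (CRLF files).
lines_ending_with_cr() {
    awk '/\r$/ { n++ } END { print n + 0 }' "$1"
}

# Usage (output file name taken from the awk command in step 5).
txt=/opt/load_test/Finance_inequality_and_the_poor_data_6005.txt
if [ -f "$txt" ]; then
    lines_missing_trailing_comma "$txt"   # 0 means every line ends with ","
fi
```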
After running it again, 72 rows were inserted successfully.
[oracle@client load_test]$ /data/oracle/product/11.2.0/db_1/bin/sqlldr cea/cea control=/opt/load_test/load.ctl log=/opt/load_test/log.txt \
> bad=/opt/load_test/bad.txt data=/opt/load_test/Finance_inequality_and_the_poor_data_6005.csv
SQL*Loader: Release 11.2.0.1.0 - Production on Tue Dec 8 17:59:11 2015
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
Commit point reached - logical record count 58
Commit point reached - logical record count 72
######################## Check the output log
[oracle@client load_test]$ cat log.txt
SQL*Loader: Release 11.2.0.1.0 - Production on Tue Dec 8 17:59:11 2015
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
Control File: /opt/load_test/load.ctl
Data File: /opt/load_test/Finance_inequality_and_the_poor_data_6005.csv
Bad File: /opt/load_test/bad.txt
Discard File: none specified
(Allow all discards)
Number to load: ALL
Number to skip: 0
Errors allowed: 50
Bind array: 64 rows, maximum of 256000 bytes
Continuation: none specified
Path used: Conventional
Table CEA.LOAD_TEST, loaded from every logical record.
Insert option in effect for this table: APPEND
Column Name                    Position   Len   Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
COUNTRYCODE                         FIRST     *   ,       CHARACTER
YEARID                               NEXT     *   ,       CHARACTER
LOGINITIALGINI                       NEXT     *   ,       CHARACTER
GROWTHEINGINI                        NEXT     *   ,       CHARACTER
SPAN                                 NEXT     *   ,       CHARACTER
LOGINITIALGDPPERCAPITAL              NEXT     *   ,       CHARACTER
GROWTHGDPPERCAPITAL                  NEXT     *   ,       CHARACTER
PRIVCREAVG                           NEXT     *   ,       CHARACTER
LOGPRIVATECREDIT                     NEXT     *   ,       CHARACTER
INFLATION                            NEXT     *   ,       CHARACTER
LOGTRADE                             NEXT     *   ,       CHARACTER
GR_LTRADE                            NEXT     *   ,       CHARACTER
GR_SCHOOL                            NEXT     *   ,       CHARACTER
LOGSCHOOLING                         NEXT     *   ,       CHARACTER
LOGCOMMERCIALCENTRALBANK             NEXT     *   ,       CHARACTER
LOGINITIALLOWESTINCOMSHARE           NEXT     *   ,       CHARACTER
GROWTHINLOWESTINCOMESHARE            NEXT     *   ,       CHARACTER
value used for ROWS parameter changed from 64 to 58
Table CEA.LOAD_TEST:
72 Rows successfully loaded.
0 Rows not loaded due to data errors.
0 Rows not loaded because all WHEN clauses were failed.
0 Rows not loaded because all fields were null.
Space allocated for bind array: 254388 bytes(58 rows)
Read buffer bytes: 1048576
Total logical records skipped: 0
Total logical records read: 72
Total logical records rejected: 0
Total logical records discarded: 0
Run began on Tue Dec 08 17:59:11 2015
Run ended on Tue Dec 08 17:59:11 2015
Elapsed time was: 00:00:00.08
CPU time was: 00:00:00.03
[oracle@client load_test]$
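For scripted loads, the counts can be pulled out of log.txt rather than read by eye. A minimal sketch, assuming the log wording shown above; the function names are hypothetical:

```shell
# Hypothetical helper: number of rows SQL*Loader reports as loaded.
loaded_rows() {
    awk '/Rows? successfully loaded/ { print $1 }' "$1"
}

# Hypothetical helper: number of logical records SQL*Loader reports as rejected.
rejected_records() {
    awk -F':' '/Total logical records rejected/ { gsub(/[^0-9]/, "", $2); print $2 }' "$1"
}

# Usage (log path taken from the sqlldr command line).
log=/opt/load_test/log.txt
if [ -f "$log" ]; then
    echo "loaded=$(loaded_rows "$log") rejected=$(rejected_records "$log")"
fi
```

A nonzero rejected count would be the cue to open the bad file (bad.txt in this run) and inspect the offending records.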