Reposted from: http://wp0140502.javaeye.com/blog/580816
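1. Features of SQL*Loader
Oracle ships with many tools for data migration, backup, and recovery, and each has its own characteristics. For example, exp and imp export and import the data in a database. They work well for backup and restore, so they are used mainly for logical backup and recovery of a running database. They are fast, simple, and convenient. They also have drawbacks: exporting and importing between databases of different versions always runs into one problem or another, which is perhaps a compatibility issue between Oracle's own products.
SQL*Loader does not have this problem. It loads data stored in text format into an Oracle database smoothly, which makes it a convenient, general-purpose tool for migrating data between different databases. Its drawbacks are that it is relatively slow, and that types such as BLOB are somewhat troublesome to handle.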
2. SQL*Loader help
Type "sqlldr" on the command line to see the parameters.
Valid keywords:
userid -- ORACLE username/password
control -- Control file name
log -- Log file name
bad -- Bad file name
data -- Data file name
discard -- Discard file name
discardmax -- Number of discards to allow (default: all)
skip -- Number of logical records to skip (default: 0)
load -- Number of logical records to load (default: all)
errors -- Number of errors to allow (default: 50)
rows -- Number of rows in conventional path bind array or between direct path data saves
(default: conventional path 64, direct path all)
bindsize -- Size of conventional path bind array in bytes (default: 256000)
silent -- Suppress messages during run (header,feedback,errors,discards,partitions)
direct -- use direct path (default: FALSE)
parfile -- parameter file: name of file that contains parameter specifications
parallel -- do parallel load (default: FALSE)
file -- File to allocate extents from
skip_unusable_indexes -- disallow/allow unusable indexes or index partitions (default: FALSE)
skip_index_maintenance -- do not maintain indexes, mark affected indexes as unusable (default: FALSE)
readsize -- Size of Read buffer (default: 1048576)
external_table -- use external table for load; NOT_USED, GENERATE_ONLY, EXECUTE (default: NOT_USED)
columnarrayrows -- Number of rows for direct path column array (default: 5000)
streamsize -- Size of direct path stream buffer in bytes (default: 256000)
multithreading -- use multithreading in direct path
resumable -- enable or disable resumable for current session (default: FALSE)
resumable_name -- text string to help identify resumable statement
resumable_timeout -- wait time (in seconds) for RESUMABLE (default: 7200)
date_cache -- size (in entries) of date conversion cache (default: 1000)
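PLEASE NOTE: Command-line parameters may be specified either by position or by keyword. An example of the former is 'sqlload scott/tiger foo'; an example of the latter is 'sqlldr control=foo userid=scott/tiger'. Parameters given by position must come before, never after, parameters given by keyword. For example, 'sqlldr scott/tiger control=foo logfile=log' is allowed, but 'sqlldr scott/tiger control=foo log' is not, even though the position of the parameter 'log' is correct.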
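One way to sidestep the positional-versus-keyword rule is to pass every parameter by keyword. The following is only a sketch: build_sqlldr_command is a hypothetical helper, not part of any Oracle tool, and it uses only keywords that appear in the help listing above (userid, control, log, errors). It builds the argument list and does not run sqlldr.

```python
def build_sqlldr_command(userid, control, log, **extra):
    """Return a sqlldr argument list that uses keyword parameters only."""
    args = ["sqlldr", f"userid={userid}", f"control={control}", f"log={log}"]
    # Optional keywords (for example errors, rows, direct) are appended
    # in sorted order so the resulting command line is deterministic.
    for key in sorted(extra):
        args.append(f"{key}={extra[key]}")
    return args


cmd = build_sqlldr_command(
    "scott/tiger",
    "/home/ftp/ZBroadDetail.ctl",
    "/home/ftp/ZBroadDetail.log",
    errors=100,
)
print(" ".join(cmd))
```

Because no argument relies on its position, the order of the keywords no longer matters to sqlldr. The list form can also be handed to subprocess.run if the load is driven from a script.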
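For business reasons, a text file of about 1 GB (roughly 4.5 million records) has to be loaded into the database every day. Worse, the work is only starting now, so several months of past files have to be loaded as well; two months alone come to more than 60 GB. The traditional approach of reading a file, splitting each line into fields, and inserting the rows one at a time is simply not feasible; it would be a nightmare. In an early attempt, loading a little over 2 million records took about 3 hours, and that was already a good result: with indexes and other constraints on the table, it would be slower still.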
3. The ctl control file: ZBroadDetail.ctl
OPTIONS (errors=100000000,skip=3566018)
LOAD DATA
INFILE "/home/ftp/ZBroadDetail.20091103"
append
INTO TABLE RADIUSDETAIL
Fields terminated by "::"
trailing nullcols
(
NET_ACCT "LTRIM(:NET_ACCT)",
START_TIME DATE "YYYYMMDD HH24MISS",
STOP_TIME DATE "YYYYMMDD HH24MISS",
FRAME_IP,
MAC_ADDRESS,
NAS_IP,
NAS_PORT,
NODE_ID,
INPUT_OCTETS,
OUTPUT_OCTETS,
INPUT_PACKETS,
OUTPUT_PACKETS
)
4. Table creation statement:
-- Create table
create table RADIUSDETAIL
(
NET_ACCT VARCHAR2(22) not null,
START_TIME DATE not null,
STOP_TIME DATE not null,
FRAME_IP VARCHAR2(16),
MAC_ADDRESS VARCHAR2(40),
NAS_IP VARCHAR2(16) not null,
NAS_PORT VARCHAR2(16) not null,
NODE_ID VARCHAR2(10) not null,
INPUT_OCTETS NUMBER(10) not null,
OUTPUT_OCTETS NUMBER(10) not null,
INPUT_PACKETS NUMBER(10) not null,
OUTPUT_PACKETS NUMBER(10) not null
)
partition by range (START_TIME)
(
partition RADIUSDETAIL_200909 values less than (TO_DATE(' 2009-10-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN'))
tablespace TS_RADIUS_200909
pctfree 10
initrans 1
maxtrans 255
storage
(
initial 64K
minextents 1
maxextents unlimited
),
partition RADIUSDETAIL_200910 values less than (TO_DATE(' 2009-11-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN'))
tablespace TS_RADIUS_200910
pctfree 10
initrans 1
maxtrans 255
storage
(
initial 64K
minextents 1
maxextents unlimited
),
partition RADIUSDETAIL_200911 values less than (TO_DATE(' 2009-12-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN'))
tablespace TS_RADIUS_200911
pctfree 10
initrans 1
maxtrans 255
storage
(
initial 64K
minextents 1
maxextents unlimited
),
partition RADIUSDETAIL_200912 values less than (TO_DATE(' 2010-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN'))
tablespace TS_RADIUS_200912
pctfree 10
initrans 1
maxtrans 255
storage
(
initial 64K
minextents 1
maxextents unlimited
)
);
-- Create/Recreate indexes
create bitmap index BIT_NET_ACCT on RADIUSDETAIL (NET_ACCT);
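The range partitions above decide where each loaded row lands, based on START_TIME. As a rough illustration, the sketch below mirrors the four VALUES LESS THAN bounds from the CREATE TABLE statement; partition_for is a hypothetical helper written for this example, not an Oracle API.

```python
from bisect import bisect_right
from datetime import datetime

# Upper bounds copied from the CREATE TABLE statement (VALUES LESS THAN ...).
PARTITIONS = [
    (datetime(2009, 10, 1), "RADIUSDETAIL_200909"),
    (datetime(2009, 11, 1), "RADIUSDETAIL_200910"),
    (datetime(2009, 12, 1), "RADIUSDETAIL_200911"),
    (datetime(2010, 1, 1), "RADIUSDETAIL_200912"),
]
BOUNDS = [bound for bound, _ in PARTITIONS]


def partition_for(start_time):
    """Return the partition name for start_time, or None if no partition fits."""
    # A row belongs to the first partition whose bound is strictly greater
    # than its key, which is the index that bisect_right returns.
    index = bisect_right(BOUNDS, start_time)
    return PARTITIONS[index][1] if index < len(PARTITIONS) else None


print(partition_for(datetime(2009, 11, 2, 21, 52, 28)))
print(partition_for(datetime(2010, 1, 1)))
```

The second call returns None because the table has no partition for 2010 or later. Oracle rejects such rows with ORA-14400, so a new partition has to exist before files from a later month are loaded.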
5. Data file:
554chenhy1::20091102 215228::20091103 000000::114.101.219.208::00:e0:b0:eb:58:3a::61.190.207.16::318767313::0009::89658839::712520776::483771::647601
0558886954::20091102 205212::20091103 000000::117.67.74.67::00:1b:fc:e7:12:20::61.190.200.115::85985994::0014::892854637::127751439::1219981::521422
5552776874::20091102 201133::20091103 000000::114.102.37.7::00:24:e8:09:f9:74::61.190.209.34::18879689::0011::594629766::111886810::614080::295082
561shuang::20091102 204030::20091103 000000::117.57.101.140::00:e0:a0:0f:93:be::61.190.222.23::68158771::0013::61740866::142548707::267982::297281
ads2529670::20091102 225827::20091103 000000::61.190.178.49::00:14:78:49:8e:95::61.190.210.22::35677785::0007::54688807::22344225::71865::77718
05614084872::20091102 225628::20091103 000000::117.69.193.248::00:25:86:8d:a9:7d::61.190.222.3::1090522556::0013::64602789::225039189::331101::346673
561niu77::20091102 211033::20091103 000000::61.191.115.72::00:e0:4c:08:f2:19::61.190.222.17::119551483::0013::32124693::44826012::79114::84360
552g3185402::20091102 204703::20091103 000000::220.178.137.31::00:1d:60:75:55:05::61.190.197.14::587204619::0003::193906940::28375706::261873::261391
564la3386690::20091102 193042::20091103 000000::117.68.80.122::00:25:86:95:ed:7f::61.190.218.14::35661047::0006::16133795::18722492::49295::48160
la561nm8780::20091102 185300::20091103 000000::117.57.94.66::00:25:86:38:07:df::61.190.222.24::102770417::0013::267256413::441666832::667088::692132
5514491492::20091102 192433::20091103 000000::124.73.1.135::00:30:18:c0:da:f9::61.133.137.97::385877693::0001::1640961::13927897::23358::24829
tl5852400::20091102 143341::20091103 000000::220.179.210.208::00:25:56:13:0d:f7::61.190.214.13::68163238::0012::1402826332::146399442::1216008::1013590
05596740939::20091102 234445::20091103 000000::220.179.106.230::00:24:8c:ea:06:6e::61.190.210.8::18904236::000706::19357649::27115887::35378::40429
55475l0vh::20091102 231541::20091103 000000::60.175.72.130::00:1d:60:29:15:79::61.190.207.14::1358957163::0009::9155440::72286818::57366::65709
18956423361::20091102 212834::20091103 000000::114.107.193.11::00:21:85:1f:61:85::61.190.218.13::33572914::0006::229545990::2247512519::1785262::2111780
bl5227271::20091102 231433::20091103 000000::60.174.121.67::00:1d:60:af:64:f5::61.190.226.6::307737662::0017::11418629::45599106::38284::52599
xc2829984::20091102 094159::20091103 000000::124.112.199.24::00:26:18:50:5f:ac::61.190.212.27::69211251::0008::9466980::28853080::93675::91051
05613918815::20091102 203458::20091103 000000::117.57.132.99::00:17:31:19:0b:77::61.190.222.2::1090522559::0013::106679965::347836935::565336::455583
564sc8345115::20091102 184324::20091103 000000::114.104.121.17::00:14:78:e0:3d:9d::61.190.218.142::35656861::000605::19638379::378257763::199378::296845
565aeb662234::20091101 071019::20091103 000000::114.105.20.20::00:21:27:8e:ce:37::61.190.220.37::35657321::001603::3815128047::3083888650::8618842::10227795
05613052533::20091102 185506::20091103 000000::220.179.181.14::00:21:97:44:01:5d::61.190.222.17::119551343::0013::22912037::28588344::120217::124821
dza7013360::20091102 182120::20091103 000000::60.174.10.2::00:1d:92:d6:d9:35::61.190.216.17::52434445::001503::430876369::373789670::834249::569508
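To make the mapping between a data line and the control file concrete, here is a minimal Python sketch of what the field list asks SQL*Loader to do: split on "::", left-trim NET_ACCT, parse the two dates with the mask "YYYYMMDD HH24MISS", and treat missing trailing fields as NULL. It only illustrates the mapping and is not how SQL*Loader is implemented. parse_record and the constants are names made up for this example, and the choice of which columns to convert to int is taken from the NUMBER columns in the table definition.

```python
from datetime import datetime

# Column order taken from the control file's field list.
COLUMNS = [
    "NET_ACCT", "START_TIME", "STOP_TIME", "FRAME_IP", "MAC_ADDRESS",
    "NAS_IP", "NAS_PORT", "NODE_ID", "INPUT_OCTETS", "OUTPUT_OCTETS",
    "INPUT_PACKETS", "OUTPUT_PACKETS",
]
DATE_COLUMNS = {"START_TIME", "STOP_TIME"}
NUMBER_COLUMNS = {"INPUT_OCTETS", "OUTPUT_OCTETS", "INPUT_PACKETS", "OUTPUT_PACKETS"}
DATE_FORMAT = "%Y%m%d %H%M%S"  # Python equivalent of "YYYYMMDD HH24MISS"


def parse_record(line):
    """Split one data line into a dict keyed by column name."""
    fields = line.rstrip("\n").split("::")  # Fields terminated by "::"
    # trailing nullcols: missing trailing fields become NULL (None here).
    fields += [None] * (len(COLUMNS) - len(fields))
    row = {}
    for name, value in zip(COLUMNS, fields):
        if value is None or value == "":
            row[name] = None
        elif name == "NET_ACCT":
            row[name] = value.lstrip(" ")  # LTRIM(:NET_ACCT)
        elif name in DATE_COLUMNS:
            row[name] = datetime.strptime(value, DATE_FORMAT)
        elif name in NUMBER_COLUMNS:
            row[name] = int(value)
        else:
            row[name] = value
    return row


sample = ("554chenhy1::20091102 215228::20091103 000000::114.101.219.208::"
          "00:e0:b0:eb:58:3a::61.190.207.16::318767313::0009::"
          "89658839::712520776::483771::647601")
row = parse_record(sample)
print(row["NET_ACCT"], row["START_TIME"], row["MAC_ADDRESS"])
```

Note that the MAC address contains single colons only, so it never collides with the two-character "::" terminator.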
6. Load commands:
Non-parallel load:
sqlldr scott/scott@XE control=/home/ftp/ZBroadDetail.ctl log=/home/ftp/ZBroadDetail.log
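In this mode you can set rows in the control file (in the OPTIONS clause); it is the number of rows per commit.
Parallel load (in effect the table is not locked, so you can open several command lines at once and load several files into the same table):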
sqlldr kdck/kdck@orcl control=/home/ftp/ZBroadDetail.ctl log=/home/ftp/ZBroadDetail.log parallel=true
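When loading in this mode, rows cannot be specified in the control file.
(If the table has no indexes, adding direct=true makes the load faster still.)
At present, with no indexes on the table, 20 million records load in 10 minutes, which is very efficient.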
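The figures quoted in this post can be put side by side with a little arithmetic. The sketch below uses only numbers stated above (about 2 million rows in 3 hours for row-by-row inserts, 20 million rows in 10 minutes with sqlldr, and roughly 4.5 million rows per 1 GB file). The 60 GB backlog estimate assumes that the row density and both rates stay constant, so treat the results as rough estimates.

```python
# Throughput of the row-by-row approach: about 2 million rows in 3 hours.
row_by_row_rate = 2_000_000 / (3 * 3600)   # rows per second

# Throughput reported for sqlldr without indexes: 20 million rows in 10 minutes.
sqlldr_rate = 20_000_000 / (10 * 60)       # rows per second

speedup = sqlldr_rate / row_by_row_rate

# Backlog: about 60 GB at roughly 4.5 million rows per GB (an extrapolation).
backlog_rows = 60 * 4_500_000
row_by_row_days = backlog_rows / row_by_row_rate / 86400
sqlldr_hours = backlog_rows / sqlldr_rate / 3600

print(f"row-by-row: {row_by_row_rate:.0f} rows/s")
print(f"sqlldr:     {sqlldr_rate:.0f} rows/s")
print(f"speedup:    about {speedup:.0f}x")
print(f"60 GB backlog: about {row_by_row_days:.1f} days vs {sqlldr_hours:.2f} hours")
```

Under these assumptions the backlog drops from more than two weeks of loading to a little over two hours.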