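The Sqoop job tool

        The sqoop job tool saves a frequently used command as a named job. A saved job can then be run repeatedly, for example on a schedule, which makes it a good fit for incrementally importing new data with Sqoop.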
 
    Syntax:
            $ sqoop job (generic-args) (job-args) [-- [subtool-name] (subtool-args)]
            $ sqoop-job (generic-args) (job-args) [-- [subtool-name] (subtool-args)]
        
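        As the example below shows, the part inside [ ] is another Sqoop tool (such as import) together with its own arguments. Of the two ( ) groups, (generic-args) are the generic Hadoop arguments and (job-args) are the job arguments described in this section.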
 
Arguments for sqoop job:

Argument            Description
--create <job-id>   Define a new saved job with the specified job-id (name). A second Sqoop command line, separated by a --, should be specified; this defines the saved job.
--delete <job-id>   Delete a saved job.
--exec <job-id>     Given a job defined with --create, run the saved job. When running a job, arguments supplied after a -- override the ones set when the job was created.
--show <job-id>     Show the parameters for a saved job.
--list              List all saved jobs.
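
The job arguments above can be strung together into a typical lifecycle. The following is a minimal sketch rather than an official recipe: run is a hypothetical dry-run helper that only prints each command unless RUN=1 is set, job_lifecycle is a made-up name, and someuser is a placeholder account. The --exec line uses the override form described in the table, with the overriding arguments placed after a bare --.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the command unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Walk through the saved-job actions for one job name.
job_lifecycle() {
    job="$1"
    run sqoop job --list                 # list all saved jobs
    run sqoop job --show "$job"          # show the saved parameters
    # Override saved arguments for this run only: they go after a bare --
    run sqoop job --exec "$job" -- --username someuser -P
    run sqoop job --delete "$job"        # remove the saved job
}

job_lifecycle testdata_nodes
```

With RUN unset, the script only prints the four commands, so it can be read and checked before anything touches the metastore.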
 
 
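Creating a job for automatic incremental import:
        The job below imports the nodes table.
        Sqoop can import all of the data into Hive, but UPDATE and DELETE operations on the source data (MySQL) cannot be synchronized to Hive this way: in append mode only rows whose check column exceeds the last imported value are picked up.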
 
sqoop job  --create testdata_nodes  -- import  --connect jdbc:mysql://192.168.10.80:33060/testdata  --username root --password lovelsl --table nodes  --hive-import  --hive-table testdata.nodes --null-string '\\N' --null-non-string '\\N'  --incremental append  --check-column id  --last-value 415
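
To make the --incremental append options concrete: in append mode only rows whose check column (here id) is greater than the last value are imported, and a saved job records the new maximum for its next run. The shell sketch below imitates that selection on a small, invented list of id,name rows. It is a model of the behavior under those assumptions, not how Sqoop is implemented; select_new_rows and next_last_value are made-up helper names.

```shell
#!/bin/sh
# Model of append-mode selection: keep rows with id > last_value.
# Input: CSV lines "id,name" on stdin. Output: the selected rows.
select_new_rows() {
    last_value="$1"
    awk -F, -v last="$last_value" '$1 + 0 > last + 0'
}

# New last value: the largest id seen, or the old value if nothing is new.
next_last_value() {
    last_value="$1"
    awk -F, -v last="$last_value" '
        BEGIN { max = last + 0 }
        $1 + 0 > max { max = $1 + 0 }
        END { print max }'
}

# Invented sample data: ids up to 415 were already imported, 416 and 417 are new.
rows='414,alpha
415,beta
416,gamma
417,delta'

printf '%s\n' "$rows" | select_new_rows 415
printf '%s\n' "$rows" | next_last_value 415
```

A row such as 415,beta that is later updated or deleted in MySQL still has an id at or below the last value, so the next run never looks at it. That is why such changes do not reach Hive in append mode.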
 
 
[root@localhost ~]# sqoop job --create testdata_nodes  -- import  --connect jdbc:mysql://192.168.10.80:33060/testdata  --username root --password lovelsl --table nodes  --hive-import  --hive-table testdata.nodes --null-string '\\N' --null-non-string '\\N'  --incremental append  --check-column id  --last-value 415
Warning: /lovelsl/sqoop/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /lovelsl/sqoop/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /lovelsl/sqoop/sqoop/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
19/07/25 21:23:05 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
19/07/25 21:23:07 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
19/07/25 21:23:07 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
19/07/25 21:23:07 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
[root@localhost ~]#
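
The WARN line in the output above flags the --password option as insecure. One alternative is Sqoop's --password-file option. The sketch below is illustrative only: write_password_file and print_job_command are made-up helper names, /root/.sqoop_testdata.pwd is a hypothetical location, and the NULL-handling options of the original command are left out for brevity. Whether a bare path resolves to the local file system or to HDFS depends on the Hadoop configuration, so the file:// prefix is used here to point at a local file explicitly.

```shell
#!/bin/sh
# Write a password file readable only by its owner.
# printf '%s' avoids a trailing newline, which would otherwise become part of the password.
write_password_file() {
    file="$1"
    password="$2"
    rm -f "$file"
    ( umask 077; printf '%s' "$password" > "$file" )
    chmod 400 "$file"
}

# Print (not run) a job definition that reads the password from that file.
print_job_command() {
    file="$1"
    echo "sqoop job --create testdata_nodes -- import" \
         "--connect jdbc:mysql://192.168.10.80:33060/testdata" \
         "--username root --password-file file://$file" \
         "--table nodes --hive-import --hive-table testdata.nodes" \
         "--incremental append --check-column id --last-value 415"
}

# Hypothetical location; call write_password_file with the real password first.
print_job_command /root/.sqoop_testdata.pwd
```

The command is printed rather than executed so that the resulting job definition can be reviewed before it is saved.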
 
 
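Running the job
        Note that by default the database password is requested on every run. This can be avoided by setting sqoop.metastore.client.record.password to true in conf/sqoop-site.xml.
The configuration is: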
    <property>
        <name>sqoop.metastore.client.record.password</name>
        <value>true</value>
    </property>
Run the command:
        sqoop job --exec testdata_nodes
 
[root@localhost ~]# sqoop job --exec testdata_nodes
Warning: /lovelsl/sqoop/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /lovelsl/sqoop/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /lovelsl/sqoop/sqoop/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
19/07/26 00:32:11 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
Enter password:
........
 
 
Deleting the job
        sqoop job  --delete testdata_nodes
 
 
Scheduling the job:
    
 
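Scheduled task setup on CentOS 7
[root@localhost shell]# cat cron.sh
#!/bin/sh

#
# Provides the scheduled entry point for sqoop jobs
#

# Run sqoop_job.sh every day at 12:30 from root's own crontab.
# /etc/crontab is not used: its entries need an extra user field.
# Any earlier copy of the entry is dropped first, so re-running is safe.
(crontab -l 2>/dev/null | grep -v -F "/lovelsl/dev/shell/sqoop_job.sh"; echo "30 12 * * * /lovelsl/dev/shell/sqoop_job.sh") | crontab -

# Enable crond at boot and make sure it is running now
systemctl enable crond
systemctl start crond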
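
For readers unfamiliar with crontab syntax, the entry used in cron.sh consists of five time fields followed by the command. The helper below is a small illustrative sketch (describe_cron is a made-up name) that splits such an entry and labels each field.

```shell
#!/bin/sh
# Split a user-crontab entry into its five time fields and the command.
# The body runs in a subshell so that 'set -f' (no globbing, keeps * literal)
# does not leak into the calling shell.
describe_cron() (
    set -f
    set -- $1
    echo "minute=$1 hour=$2 day-of-month=$3 month=$4 day-of-week=$5"
    shift 5
    echo "command=$*"
)

describe_cron "30 12 * * * /lovelsl/dev/shell/sqoop_job.sh"
```

Read this way, the entry runs sqoop_job.sh at 12:30 every day.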
 
Sqoop incremental import job on CentOS 7

[root@localhost shell]# cat sqoop_job.sh
#!/bin/sh

#
# Lists all sqoop jobs that need to be started
#

sqoop job --exec testdata_nodes
 
[root@localhost shell]#
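
Because cron starts commands with a minimal environment, a script that works in an interactive shell may fail to find sqoop when it runs on a schedule. The following variant of sqoop_job.sh is a sketch under stated assumptions: the Sqoop location is inferred from the warning paths in the output above (/lovelsl/sqoop/sqoop), while the log file location, the JOBS list, the SQOOP_CMD override, and the run_jobs function are hypothetical names introduced here.

```shell
#!/bin/sh
# Assumed install location, inferred from the warning messages; adjust as needed.
SQOOP_HOME="${SQOOP_HOME:-/lovelsl/sqoop/sqoop}"
PATH="$SQOOP_HOME/bin:$PATH"
export SQOOP_HOME PATH

SQOOP_CMD="${SQOOP_CMD:-sqoop}"              # hypothetical override, handy for dry runs
LOG_FILE="${LOG_FILE:-/tmp/sqoop_job.log}"   # hypothetical log location
JOBS="${JOBS:-testdata_nodes}"               # space-separated saved job names

# Run every saved job, log the outcome, and return non-zero if any job failed.
run_jobs() {
    failed=0
    for job in $JOBS; do
        echo "$(date '+%F %T') start $job" >> "$LOG_FILE"
        if $SQOOP_CMD job --exec "$job" >> "$LOG_FILE" 2>&1; then
            echo "$(date '+%F %T') ok $job" >> "$LOG_FILE"
        else
            echo "$(date '+%F %T') FAILED $job" >> "$LOG_FILE"
            failed=1
        fi
    done
    return "$failed"
}

# Only attempt the jobs when the sqoop command can actually be found.
if command -v "$SQOOP_CMD" >/dev/null 2>&1; then
    run_jobs
else
    echo "$(date '+%F %T') $SQOOP_CMD not found on PATH" >> "$LOG_FILE"
fi
```

Writing each outcome to a log file is a deliberate choice here: cron jobs have no terminal, so without a log a failed import would go unnoticed.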