Importing the accounts table from the loudacre database into HDFS
sqoop import \
  --connect jdbc:mysql://localhost/loudacre \
  --username training --password training \
  --table accounts \
  --target-dir /loudacre/accounts \
  --null-non-string '\\N'
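After the import completes, it is worth confirming what Sqoop actually wrote. A quick check, assuming default settings (each map task writes one part file; the part-m-00000 name below assumes at least one mapper ran):

```shell
# List the files Sqoop created under the target directory
hdfs dfs -ls /loudacre/accounts

# Peek at the first few imported records
hdfs dfs -cat /loudacre/accounts/part-m-00000 | head -n 5
```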
Incremental import into HDFS
--check-column <column name>: the column to check for newly added rows
--last-value <largest_column_num>: the last value of that column from the previous import; only rows with larger values are imported
sqoop import \
  --connect jdbc:mysql://localhost/loudacre \
  --username training --password training \
  --incremental append \
  --null-non-string '\\N' \
  --table accounts \
  --target-dir /loudacre/accounts \
  --check-column acct_num \
  --last-value <largest_acct_num>
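Tracking --last-value by hand is error-prone. Sqoop can store it for you in a saved job, which records the largest value seen after each successful run. A sketch (the job name import-accounts is arbitrary; note the space after the bare "--" separator before import):

```shell
# Create a saved incremental job; Sqoop persists last-value between runs
sqoop job --create import-accounts -- import \
  --connect jdbc:mysql://localhost/loudacre \
  --username training --password training \
  --table accounts \
  --target-dir /loudacre/accounts \
  --incremental append \
  --check-column acct_num \
  --last-value 0

# Each execution imports only rows newer than the stored last-value
sqoop job --exec import-accounts
```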
Importing into HDFS with a specified field delimiter
--fields-terminated-by <char>: sets the field delimiter; the default is ",". Here we use the tab character "\t".
sqoop import \
  --connect jdbc:mysql://localhost/loudacre \
  --username training --password training \
  --table webpage \
  --target-dir /loudacre/webpage \
  --fields-terminated-by "\t"
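To confirm the delimiter took effect, inspect one of the output files (the part-m-00000 name is the usual default, but may vary):

```shell
# Fields should now be separated by tabs instead of commas
hdfs dfs -cat /loudacre/webpage/part-m-00000 | head -n 2
```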
Importing only rows that match a condition into HDFS
Use --where <where clause> to specify the condition for the import.
sqoop import \
  --connect jdbc:mysql://localhost/loudacre \
  --username training --password training \
  --table accounts \
  --where "state = 'CA' and acct_close_dt IS NULL" \
  --target-dir /loudacre/accounts-active \
  --null-non-string '\\N'
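--where still imports every column of a single table. When you need a projection or a join, Sqoop also accepts a free-form query via --query; the literal $CONDITIONS token must appear in the WHERE clause so Sqoop can inject its split predicates, and --split-by is required. A sketch (the column list and target dir are illustrative):

```shell
# \$CONDITIONS keeps the shell from expanding the token before Sqoop sees it
sqoop import \
  --connect jdbc:mysql://localhost/loudacre \
  --username training --password training \
  --query "SELECT acct_num, first_name, last_name FROM accounts WHERE state = 'CA' AND \$CONDITIONS" \
  --split-by acct_num \
  --target-dir /loudacre/accounts-ca
```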
Importing MySQL data into Hive
Use --hive-import to import a table directly into Hive.
sqoop import \
  --connect jdbc:mysql://localhost/loudacre \
  --username training --password training \
  --fields-terminated-by '\t' \
  --table device \
  --hive-import
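Once the import finishes, the table appears in Hive's default database. A quick check from the shell (assuming the hive CLI is on the PATH):

```shell
# Show the schema Sqoop generated and count the imported rows
hive -e "DESCRIBE device; SELECT COUNT(*) FROM device;"
```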
Importing into HDFS in Avro format
Use --as-avrodatafile to store the imported data as Avro.
sqoop import \
  --connect jdbc:mysql://localhost/loudacre \
  --username training --password training \
  --table accounts \
  --target-dir /loudacre/accounts-avro \
  --null-non-string '\N' \
  --as-avrodatafile
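Avro files are binary, so hdfs dfs -cat output is not readable. If the avro-tools utility is available (its location and invocation vary by distribution), you can inspect a part file after copying it out of HDFS; the part-m-00000.avro name is an assumption:

```shell
# Copy one data file locally, then dump its schema and a few records
hdfs dfs -get /loudacre/accounts-avro/part-m-00000.avro .
avro-tools getschema part-m-00000.avro
avro-tools tojson part-m-00000.avro | head -n 3
```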
Importing into HDFS in Parquet format
Use --as-parquetfile to store the imported data as Parquet.
sqoop import \
  --connect jdbc:mysql://localhost/loudacre \
  --username training --password training \
  --table accounts \
  --target-dir /loudacre/accounts-parquet \
  --null-non-string '\N' \
  --as-parquetfile
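Parquet is likewise a binary columnar format. With the parquet-tools utility (availability depends on your distribution, and Sqoop generates the file names under the target directory, so check the listing first):

```shell
# File names are generated; list them, then inspect one
hdfs dfs -ls /loudacre/accounts-parquet
hdfs dfs -get /loudacre/accounts-parquet/<generated-name>.parquet .
parquet-tools schema <generated-name>.parquet
```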