16. sqoop-create-hive-table
The create-hive-table tool populates a Hive metastore with a definition for a table based on a database table previously imported to HDFS, or one planned to be imported. This effectively performs the "--hive-import" step of sqoop-import without running the preceding import.
(Uses of create-hive-table: 1. define a table in Hive that maps onto a table already imported into HDFS; 2. create a table in Hive that can later be populated by an import.)
If data was already loaded to HDFS, you can use this tool to finish the pipeline of importing the data to Hive. You can also create Hive tables with this tool; data then can be imported and populated into the target after a preprocessing step run by the user.
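For instance, a minimal sketch of that pipeline might look like the following (the HDFS path /user/hadoop/employees is a placeholder; the connect string and table names follow the examples used elsewhere in this guide): import the raw data to HDFS, create the matching Hive table definition, then load the files into the table from the Hive command line.

$ sqoop import --connect jdbc:mysql://db.example.com/corp --table employees \
    --target-dir /user/hadoop/employees
$ sqoop create-hive-table --connect jdbc:mysql://db.example.com/corp \
    --table employees --hive-table emps
$ hive -e "LOAD DATA INPATH '/user/hadoop/employees' INTO TABLE emps"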
$ sqoop create-hive-table (generic-args) (create-hive-table-args)
$ sqoop-create-hive-table (generic-args) (create-hive-table-args)
Although the Hadoop generic arguments must precede any create-hive-table arguments, the create-hive-table arguments can be entered in any order with respect to one another.
Table 34. Common arguments
Argument | Description |
---|---|
--connect <jdbc-uri> | Specify JDBC connect string |
--connection-manager <class-name> | Specify connection manager class to use |
--driver <class-name> | Manually specify JDBC driver class to use |
--hadoop-mapred-home <dir> | Override $HADOOP_MAPRED_HOME |
--help | Print usage instructions |
-P | Read password from console |
--password <password> | Set authentication password |
--username <username> | Set authentication username |
--verbose | Print more information while working |
--connection-param-file <filename> | Optional properties file that provides connection parameters |
Table 35. Hive arguments:
Argument | Description |
---|---|
--hive-home <dir> | Override $HIVE_HOME |
--hive-overwrite | Overwrite existing data in the Hive table. |
--create-hive-table | If set, then the job will fail if the target Hive table exists. By default this property is false. |
--hive-table <table-name> | Sets the table name to use when importing to Hive. |
--table | The database table to read the definition from. |
Table 36. Output line formatting arguments:
Argument | Description |
---|---|
--enclosed-by <char> | Sets a required field enclosing character |
--escaped-by <char> | Sets the escape character |
--fields-terminated-by <char> | Sets the field separator character |
--lines-terminated-by <char> | Sets the end-of-line character |
--mysql-delimiters | Uses MySQL’s default delimiter set: fields: , lines: \n escaped-by: \ optionally-enclosed-by: ' |
--optionally-enclosed-by <char> | Sets a field enclosing character |
Do not use enclosed-by or escaped-by delimiters with output formatting arguments used to import to Hive. Hive cannot currently parse them.
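The field and line terminators above can, however, be combined with create-hive-table. A hedged example (the delimiter values are purely illustrative):

$ sqoop create-hive-table --connect jdbc:mysql://db.example.com/corp \
    --table employees --hive-table emps \
    --fields-terminated-by '\t' --lines-terminated-by '\n'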
Define in Hive a table named emps with a definition based on a database table named employees (creating the table requires the source table's structure):
$ sqoop create-hive-table --connect jdbc:mysql://db.example.com/corp \
    --table employees --hive-table emps
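Assuming a working Hive installation pointed at the same metastore, you can then confirm the new definition from the command line, for example:

$ hive -e "DESCRIBE emps"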
sqoop-eval: runs a SQL query against the database and returns the results, so you can check that a query returns what you expect.
The eval tool allows users to quickly run simple SQL queries against a database; results are printed to the console. This allows users to preview their import queries to ensure they import the data they expect.
$ sqoop eval (generic-args) (eval-args)
$ sqoop-eval (generic-args) (eval-args)
Although the Hadoop generic arguments must precede any eval arguments, the eval arguments can be entered in any order with respect to one another.
Table 37. Common arguments
Argument | Description |
---|---|
--connect <jdbc-uri> | Specify JDBC connect string |
--connection-manager <class-name> | Specify connection manager class to use |
--driver <class-name> | Manually specify JDBC driver class to use |
--hadoop-mapred-home <dir> | Override $HADOOP_MAPRED_HOME |
--help | Print usage instructions |
-P | Read password from console |
--password <password> | Set authentication password |
--username <username> | Set authentication username |
--verbose | Print more information while working |
--connection-param-file <filename> | Optional properties file that provides connection parameters |
Select ten records from the employees table:
$ sqoop eval --connect jdbc:mysql://db.example.com/corp \
    --query "SELECT * FROM employees LIMIT 10"
Insert a row into the foo table:
$ sqoop eval --connect jdbc:mysql://db.example.com/corp \
    -e "INSERT INTO foo VALUES(42, 'bar')"
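In the same spirit, a hypothetical pre-import sanity check (the WHERE clause and column name are illustrative, not part of the employees schema shown above) might verify how many rows a free-form import query would pull:

$ sqoop eval --connect jdbc:mysql://db.example.com/corp \
    --query "SELECT COUNT(*) FROM employees WHERE start_date > '2010-01-01'"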
sqoop-list-databases
$ sqoop list-databases (generic-args) (list-databases-args)
$ sqoop-list-databases (generic-args) (list-databases-args)
Although the Hadoop generic arguments must precede any list-databases arguments, the list-databases arguments can be entered in any order with respect to one another.
Table 39. Common arguments
Argument | Description |
---|---|
--connect <jdbc-uri> | Specify JDBC connect string |
--connection-manager <class-name> | Specify connection manager class to use |
--driver <class-name> | Manually specify JDBC driver class to use |
--hadoop-mapred-home <dir> | Override $HADOOP_MAPRED_HOME |
--help | Print usage instructions |
-P | Read password from console |
--password <password> | Set authentication password |
--username <username> | Set authentication username |
--verbose | Print more information while working |
--connection-param-file <filename> | Optional properties file that provides connection parameters |
List database schemas available on a MySQL server:
$ sqoop list-databases --connect jdbc:mysql://database.example.com/
information_schema
employees
Note: This only works with HSQLDB, MySQL and Oracle. When using with Oracle, it is necessary that the user connecting to the database has DBA privileges.
sqoop-list-tables
$ sqoop list-tables (generic-args) (list-tables-args)
$ sqoop-list-tables (generic-args) (list-tables-args)
Although the Hadoop generic arguments must precede any list-tables arguments, the list-tables arguments can be entered in any order with respect to one another.
Table 40. Common arguments
Argument | Description |
---|---|
--connect <jdbc-uri> | Specify JDBC connect string |
--connection-manager <class-name> | Specify connection manager class to use |
--driver <class-name> | Manually specify JDBC driver class to use |
--hadoop-mapred-home <dir> | Override $HADOOP_MAPRED_HOME |
--help | Print usage instructions |
-P | Read password from console |
--password <password> | Set authentication password |
--username <username> | Set authentication username |
--verbose | Print more information while working |
--connection-param-file <filename> | Optional properties file that provides connection parameters |
List tables available in the "corp" database:
$ sqoop list-tables --connect jdbc:mysql://database.example.com/corp
employees
payroll_checks
job_descriptions
office_supplies
In the case of PostgreSQL, the list-tables command with only the common arguments fetches tables from the "public" schema. To list the tables of a particular custom schema, pass the --schema argument after the "--" separator, as in this example:
$ sqoop list-tables --connect jdbc:postgresql://localhost/corp --username name -P -- --schema payrolldept
employees
expenses
sqoop-help
$ sqoop help [tool-name]
$ sqoop-help [tool-name]
If no tool name is provided (for example, the user runs sqoop help), then the available tools are listed. With a tool name (for example, sqoop help import), the usage instructions for that specific tool are presented on the console.
List available tools:
$ sqoop help
usage: sqoop COMMAND [ARGS]

Available commands:
  codegen            Generate code to interact with database records
  create-hive-table  Import a table definition into Hive
  eval               Evaluate a SQL statement and display the results
  export             Export an HDFS directory to a database table
  ...

See 'sqoop help COMMAND' for information on a specific command.
Display usage instructions for the import tool:
$ bin/sqoop help import
usage: sqoop import [GENERIC-ARGS] [TOOL-ARGS]

Common arguments:
   --connect <jdbc-uri>                 Specify JDBC connect string
   --connection-manager <class-name>    Specify connection manager class to use
   --driver <class-name>                Manually specify JDBC driver class to use
   --hadoop-mapred-home <dir>           Override $HADOOP_MAPRED_HOME
   --help                               Print usage instructions
   -P                                   Read password from console
   --password <password>                Set authentication password
   --username <username>                Set authentication username
   --verbose                            Print more information while working
   --hadoop-home <dir>                  Deprecated. Override $HADOOP_HOME

Import control arguments:
   --as-avrodatafile    Imports data to Avro Data Files
   --as-sequencefile    Imports data to SequenceFiles
   --as-textfile        Imports data as plain text (default)
...