A note before the translation
This is my first attempt at translating a technical document, so there are surely errors and awkward passages; please point them out and I will correct them right away.
I think reading an English document involves three stages: 1) understanding the English itself, 2) understanding what the text is saying, 3) putting it into practice. This translation aims for the second stage.
Although the translation is only about 90% complete, the main features are all covered clearly, which should satisfy most practical needs.
Once you have worked through this document, you should not need the rest of the Sqoop material.
Note: passages marked in red indicate open questions or points that need attention.
5. Basic Usage
6. Sqoop Tools
Sqoop is a collection of related tools. To use Sqoop, you specify the tool you want to use and the arguments that control the tool.
If Sqoop is compiled from its own source, you can run Sqoop without a formal installation process by running the bin/sqoop program. Users of a packaged deployment of Sqoop (such as an RPM shipped with Apache Bigtop) will see this program installed as /usr/bin/sqoop. The remainder of this documentation will refer to this program as sqoop. For example:
$ sqoop tool-name [tool-arguments]
Note: the $ character represents the shell prompt; it is not part of the command you type.
Sqoop ships with a help tool. To display a list of all available tools, type the following command:
$ sqoop help
usage: sqoop COMMAND [ARGS]

Available commands:
  codegen            Generate code to interact with database records
  create-hive-table  Import a table definition into Hive
  eval               Evaluate a SQL statement and display the results
  export             Export an HDFS directory to a database table
  help               List available commands
  import             Import a table from a database to HDFS
  import-all-tables  Import tables from a database to HDFS
  list-databases     List available databases on a server
  list-tables        List available tables in a database
  version            Display version information

See 'sqoop help COMMAND' for information on a specific command.
You can display help for a specific tool by entering: sqoop help (tool-name); for example, sqoop help import.
You can also add the --help argument to any command: sqoop import --help.
In addition to typing the sqoop (toolname) syntax, you can use alias scripts that specify the sqoop-(toolname) syntax. For example, the scripts sqoop-import, sqoop-export, etc. each select a specific tool.
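Each alias script is a thin wrapper around the main program. A hypothetical sketch of what such a wrapper amounts to (the packaged scripts' actual contents may differ):

```shell
# Hypothetical equivalent of the packaged sqoop-import script:
# forward every argument to "sqoop import".
sqoop_import() {
  sqoop import "$@"
}
```

In other words, sqoop-import --help and sqoop import --help behave the same.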
You invoke Sqoop through the program launch capability provided by Hadoop. The sqoop command-line program is a wrapper which runs the bin/hadoop script shipped with Hadoop. If you have multiple installations of Hadoop present on your machine, you can select the Hadoop installation by setting the $HADOOP_COMMON_HOME and $HADOOP_MAPRED_HOME environment variables.
For example:
$ HADOOP_COMMON_HOME=/path/to/some/hadoop \
  HADOOP_MAPRED_HOME=/path/to/some/hadoop-mapreduce \
  sqoop import --arguments...
or:
$ export HADOOP_COMMON_HOME=/some/path/to/hadoop
$ export HADOOP_MAPRED_HOME=/some/path/to/hadoop-mapreduce
$ sqoop import --arguments...
If either of these variables is not set, Sqoop will fall back to $HADOOP_HOME. If that is not set either, Sqoop will use the default installation locations for Apache Bigtop, /usr/lib/hadoop and /usr/lib/hadoop-mapreduce, respectively.
The active Hadoop configuration is loaded from $HADOOP_HOME/conf/, unless the $HADOOP_CONF_DIR environment variable is set.
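The lookup order above can be sketched in shell (a simplified assumption about the launcher's behavior, not the actual contents of bin/sqoop):

```shell
# Simplified sketch of the fallback order: the specific variable wins,
# then $HADOOP_HOME, then the Apache Bigtop default locations.
resolve_hadoop_homes() {
  common="${HADOOP_COMMON_HOME:-${HADOOP_HOME:-/usr/lib/hadoop}}"
  mapred="${HADOOP_MAPRED_HOME:-${HADOOP_HOME:-/usr/lib/hadoop-mapreduce}}"
  echo "$common $mapred"
}
```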
To control the operation of each Sqoop tool, you use generic and specific arguments.
For example:
$ sqoop help import
usage: sqoop import [GENERIC-ARGS] [TOOL-ARGS]

Common arguments:
   --connect <jdbc-uri>            Specify JDBC connect string
   --connection-manager <class-name>  Specify connection manager class to use
   --driver <class-name>           Manually specify JDBC driver class to use
   --hadoop-mapred-home <dir>+     Override $HADOOP_MAPRED_HOME
   --help                          Print usage instructions
   -P                              Read password from console
   --password <password>           Set authentication password
   --username <username>           Set authentication username
   --verbose                       Print more information while working
   --hadoop-home <dir>+            Deprecated. Override $HADOOP_HOME

[...]

Generic Hadoop command-line arguments:
(must precede any tool-specific arguments)
Generic options supported are
   -conf <configuration file>     specify an application configuration file
   -D <property=value>            use value for given property
   -fs <local|namenode:port>      specify a namenode
   -jt <local|jobtracker:port>    specify a job tracker
   -files <comma separated list of files>    specify comma separated files to be copied to the map reduce cluster
   -libjars <comma separated list of jars>    specify comma separated jar files to include in the classpath
   -archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines.

The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
You must supply the generic arguments -conf, -D, and so on after the tool name but before any tool-specific arguments (such as --connect). Note that generic Hadoop arguments are preceded by a single dash character (-), whereas tool-specific arguments start with two dashes (--), unless they are single character arguments such as -P.
The -conf, -D, -fs and -jt arguments control the configuration and Hadoop server settings. For example, -D mapred.job.name=<job_name> can be used to set the name of the MR job that Sqoop launches; if not specified, the name defaults to the jar name for the job, which is derived from the used table name.
The -files, -libjars, and -archives arguments are not typically used with Sqoop, but they are included as part of Hadoop’s internal argument-parsing system.
When using Sqoop, the command line options that do not change from invocation to invocation can be put in an options file for convenience. An options file is a text file where each line identifies an option in the order that it appears otherwise on the command line. Option files allow specifying a single option on multiple lines by using the back-slash character at the end of intermediate lines. Also supported are comments within option files that begin with the hash character. Comments must be specified on a new line and may not be mixed with option text. All comments and empty lines are ignored when option files are expanded. Unless options appear as quoted strings, any leading or trailing spaces are ignored. Quoted strings if used must not extend beyond the line on which they are specified.
Option files can be specified anywhere in the command line as long as the options within them follow the otherwise prescribed rules of options ordering. For instance, regardless of where the options are loaded from, they must follow the ordering such that generic options appear first, tool specific options next, finally followed by options that are intended to be passed to child programs.
To specify an options file, simply create an options file in a convenient location and pass it to the command line via the --options-file argument.
Whenever an options file is specified, it is expanded on the command line before the tool is invoked. You can specify more than one options file within the same invocation if needed.
For example, the following Sqoop invocation for import can be specified alternatively as shown below:
$ sqoop import --connect jdbc:mysql://localhost/db --username foo --table TEST
$ sqoop --options-file /users/homer/work/import.txt --table TEST
where the options file /users/homer/work/import.txt contains the following:
import
--connect
jdbc:mysql://localhost/db
--username
foo
The options file can have empty lines and comments for readability purposes. So the above example would work exactly the same if the options file /users/homer/work/import.txt contained the following:
#
# Options file for Sqoop import
#

# Specifies the tool being invoked
import

# Connect parameter and value
--connect
jdbc:mysql://localhost/db

# Username parameter and value
--username
foo

#
# Remaining options should be specified in the command line.
#
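The expansion rules described above (lines starting with # are comments, blank lines are skipped, leading and trailing whitespace is trimmed) can be sketched as a small shell function. This is an illustrative model, not Sqoop's actual implementation, and it ignores the quoted-string and backslash-continuation cases:

```shell
# Illustrative model of options-file expansion: print one option per
# line, dropping '#' comment lines and blank lines, trimming whitespace.
expand_options_file() {
  while IFS= read -r line; do
    trimmed="$(printf '%s' "$line" | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')"
    case "$trimmed" in
      ''|'#'*) ;;                    # skip blank lines and comments
      *) printf '%s\n' "$trimmed" ;;
    esac
  done < "$1"
}
```

Run against the example file above, this would emit the tool name followed by each option and value on its own line, ready to be spliced into the command line.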