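4. The HBase Shell Interface
The HBase shell is the command-line interface to an HBase cluster. You can use it to connect to local or remote servers and interact with
them. The shell provides both client and administrative operations.
4.1 Basics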
-----------------------------------------------------------------------------------------------------------------------------------------
The first step in trying out the shell is to start it:
$ $HBASE_HOME/bin/hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 1.0.0, r6c98bff7b719efdb16f71606f3b7d8229445eb81, \
Sat Feb 14 19:49:22 PST 2015
hbase(main):001:0>
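The HBase shell is based on JRuby, an implementation of Ruby that runs on the Java Virtual Machine. More precisely, it uses the Interactive
Ruby Shell (IRB), in which you enter Ruby commands and get an immediate response. HBase ships with Ruby scripts that extend IRB with special
commands based on the Java API. The shell inherits IRB's built-in command history and completion support, as well as all Ruby commands.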
NOTE:
-------------------------------------------------------------------------------------------------------------------------------------
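There is no need to install Ruby on your machine, because HBase ships with the JAR files required to run the JRuby shell. The supplied
script starts the shell on top of Java.
Once the shell has started, type help to display the help text: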
hbase(main):001:0> help
HBase Shell, version 1.0.0,
r6c98bff7b719efdb16f71606f3b7d8229445eb81, \
Sat Feb 14 19:49:22 PST 2015
Type 'help "COMMAND"', (e.g. 'help "get"' -- the quotes are necessary) \
for help on a specific command.
Commands are grouped. Type 'help "COMMAND_GROUP"', (e.g. 'help "general"') \
for help on a command group.
COMMAND GROUPS:
Group name: general
Commands: status, table_help, version, whoami
Group name: ddl
Commands: alter, alter_async, alter_status, create, describe, disable, \
disable_all, drop, drop_all, enable, enable_all, exists, get_table, \
is_disabled, is_enabled, list, show_filters
...
SHELL USAGE:
Quote all names in HBase Shell such as table and column names. Commas delimit
command parameters. Type <RETURN> after entering a command to run it.
Dictionaries of configuration used in the creation and alteration of tables
are Ruby Hashes. They look like this:
...
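As the help text describes, you can request help for a specific command by appending it to the help call, or follow help with a group name
to print the help for every command in that group. The command or group name must be enclosed in quotes.
To leave the shell, type exit or quit: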
hbase(main):002:0> exit
$
The shell also accepts command-line options, which are listed when you add -h or --help on the command line:
$ $HBASE_HOME/bin/hbase shell -h
Usage: shell [OPTIONS] [SCRIPTFILE [ARGUMENTS]]
--format=OPTION Formatter for outputting results. Valid options are: console, html. (Default: console)
-d | --debug Set DEBUG log levels.
-h | --help This help
Debugging:
-------------------------------------------------------------------------------------------------------------------------------------
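Adding the -d or --debug switch to the shell start command enables debug mode, which switches the log level to DEBUG and lets the shell
print any backtrace information, similar to stack traces in Java.
Once inside the shell, you can use the debug command to toggle debug mode: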
hbase(main):001:0> debug
Debug mode is ON
hbase(main):002:0> debug
Debug mode is OFF
You can check the debug mode with the debug? command:
hbase(main):003:0> debug?
Debug mode is OFF
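Without debug mode, the shell's log level is ERROR, and no backtrace information is printed on the console.
One option switches the output format of the shell. As of this writing, only console is available, although the CLI help states that html
is supported as well. Setting any format other than console yields an error message.
The shell start script automatically uses the configuration directory located under the same $HBASE_HOME directory. You can override this
location to use other settings, most importantly to connect to a different cluster. Create a separate directory that contains an
hbase-site.xml file, set its hbase.zookeeper.quorum property to point to the other cluster, and start the shell like this: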
$ HBASE_CONF_DIR="/<your-other-config-dir>/" bin/hbase shell
Note that you must specify an entire directory, not just the hbase-site.xml file.
4.2 Commands
-----------------------------------------------------------------------------------------------------------------------------------------
Commands are grouped into categories that reflect their semantic relationships. When entering commands, follow these guidelines:
● Quote Names
-------------------------------------------------------------------------------------------------------------------------------------
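Commands that require a table or column name expect the name to be enclosed in single or double quotes. Single quotes are generally
recommended.
● Quote Values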
-------------------------------------------------------------------------------------------------------------------------------------
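The shell supports the input and output of binary values in hexadecimal or octal notation. You must enclose them in double quotes, or the
shell will interpret them as literal text.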
hbase> get 't1', "key\x00\x6c\x65\x6f\x6e"
hbase> get 't1', "key\000\154\141\165\162\141"
hbase> put 't1', "test\xef\xff", 'f1:', "\x01\x33\x70"
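Note the mixed quoting above. You must use the correct quotes, or you will not get the expected result. Text in single quotes is treated
as a literal, whereas text in double quotes is interpolated, that is, the octal or hexadecimal escapes are converted into bytes.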
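The difference between the two quoting styles can be checked in plain Ruby, outside the shell. The following is a minimal sketch; the constant names are made up for illustration, and the octal string spells the same key as the hexadecimal one (it is not the octal example shown above).

```ruby
# Single quotes keep each backslash sequence as literal characters;
# double quotes convert each escape into a single byte.
LITERAL = 'key\x00\x6c\x65\x6f\x6e'
ESCAPED = "key\x00\x6c\x65\x6f\x6e"
OCTAL   = "key\000\154\145\157\156" # the same bytes written in octal

puts LITERAL.bytesize  # 23: every backslash, 'x', and digit is its own byte
puts ESCAPED.bytesize  # 8: "key", one NUL byte, then "leon"
puts ESCAPED == OCTAL  # true
```

A row key written with single quotes would therefore address a different, much longer key than the one intended.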
● Comma Delimiters for Parameters
-------------------------------------------------------------------------------------------------------------------------------------
Separate command parameters with commas. For example:
hbase(main):001:0> get 'testtable', 'row-1', 'colfam1:qual1'
● Ruby Hashes for Properties
-------------------------------------------------------------------------------------------------------------------------------------
Some commands require map-like properties made up of key/value pairs. These are written as Ruby hashes:
{'key1' => 'value1', 'key2' => 'value2', ...}
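The key/value pairs are enclosed in curly braces, and each key is separated from its value by "=>". Keys are usually predefined constants
such as NAME, VERSIONS, or COMPRESSION, and do not need to be quoted. For example: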
hbase(main):001:0> create 'testtable', { NAME => 'colfam1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true }
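To see why the keys need no quotes, it helps to look at the hash as ordinary Ruby. This sketch assumes that the shell predefines constants such as NAME and VERSIONS as plain strings; they are declared by hand here only so that the snippet runs outside the shell, and FAMILY is a made-up name.

```ruby
# Assumption: inside the HBase shell these constants already exist.
# They are defined here only to make the sketch self-contained.
NAME       = 'NAME'
VERSIONS   = 'VERSIONS'
TTL        = 'TTL'
BLOCKCACHE = 'BLOCKCACHE'

# The same hash literal that is passed to create above.
FAMILY = { NAME => 'colfam1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true }

puts FAMILY['NAME']        # colfam1
puts FAMILY[TTL] / 86_400  # 30 (the TTL of 2592000 seconds expressed in days)
```

Under that assumption, an unquoted key is simply a Ruby constant that evaluates to the property name.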
Restricting Output
-------------------------------------------------------------------------------------------------------------------------------------
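The get command has an optional parameter that limits the printed length of values. This is useful when a row has many columns with values
of varying length. To get a quick overview of the actual columns, you can suppress the full printing of long values, which would otherwise
quickly make the console unreadable.
The following example inserts a very long value and then retrieves it with the MAXLENGTH parameter limiting the printed length: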
hbase(main):001:0> put 'testtable','rowlong','colfam1:qual1','abcdefghijklmnopqrstuvwxyzabcdefghi \
jklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcde \
...
xyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz'
hbase(main):018:0> get 'testtable', 'rowlong', MAXLENGTH => 60
COLUMN CELL
colfam1:qual1 timestamp=1306424577316, value=abcdefghijklmnopqrstuvwxyzabc
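MAXLENGTH is counted from the start of the line, which means it includes the column name. Set it to the width of your console, or slightly
less, so that each column fits on one line.
For any command, you can get detailed help by typing help '<command>', for example: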
hbase(main):001:0> help 'status'
Show cluster status. Can be 'summary', 'simple', 'detailed', or 'replication'. The default is 'summary'. Examples:
hbase> status
hbase> status 'simple'
hbase> status 'summary'
hbase> status 'detailed'
hbase> status 'replication'
hbase> status 'replication', 'source'
hbase> status 'replication', 'sink'
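Most commands have a direct counterpart in the client API or the administrative API. The following sections briefly describe each command
and the API functionality behind it. They are grouped by purpose and listed in the order of the commands within each group: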
Command Groups in HBase Shell
+-------------------+---------------------------------------------------------------------------------------------------
| Group | Description
+-------------------+---------------------------------------------------------------------------------------------------
| general | Comprises general commands that do not fit into any other category, for example status.
+-------------------+---------------------------------------------------------------------------------------------------
| configuration | Some configuration properties can be changed at runtime, and reloaded with these commands
+-------------------+---------------------------------------------------------------------------------------------------
| ddl | Contains all commands for data-definition tasks, such as creating a table
+-------------------+---------------------------------------------------------------------------------------------------
| namespace | Similar to the former, but for namespace related operations.
+-------------------+---------------------------------------------------------------------------------------------------
| dml | Has all the data-manipulation commands, which are used to insert or delete data, for example.
+-------------------+---------------------------------------------------------------------------------------------------
| snapshots | Tables can be saved using snapshots, which are created, deleted, restored, etc.
+-------------------+---------------------------------------------------------------------------------------------------
| tools | There are tools supplied with the shell that can help run expert-level, cluster wide operations.
+-------------------+---------------------------------------------------------------------------------------------------
| replication | All replication related commands are within this group, for example, adding a peer cluster
+-------------------+---------------------------------------------------------------------------------------------------
| security | The contained commands handle security related tasks
+-------------------+---------------------------------------------------------------------------------------------------
| visibility labels | These commands handle cell label related functionality, such as adding or listing labels
+-------------------+---------------------------------------------------------------------------------------------------
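You can use any group name to get help, with the help '<groupname>' syntax, which is the same as for commands. For example, typing
help 'ddl' prints the full help text for the data definition commands.
■ General Commands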
-----------------------------------------------------------------------------------------------------------------------------------------
The general commands are listed in the following table:
General Shell Commands
+---------------+---------------------------------------------------------------------------------------------------------------
| Command | Description
+---------------+---------------------------------------------------------------------------------------------------------------
| status | Returns various levels of information contained in the ClusterStatus class. See the help to get the simple,
| | summary, and detailed status information
+---------------+---------------------------------------------------------------------------------------------------------------
| version | Returns the current version, repository revision, and compilation date of your HBase cluster.
| | See ClusterStatus.getHBaseVersion()
+---------------+---------------------------------------------------------------------------------------------------------------
| table_help | Prints a help text explaining the usage of table references in the Ruby shell
+---------------+---------------------------------------------------------------------------------------------------------------
| whoami | Shows the current OS user and group membership known to HBase about the shell user
+---------------+---------------------------------------------------------------------------------------------------------------
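Running status without any qualifier is the same as running status 'summary': both print the number of live and dead servers, as well as
the average load. The average load is the average number of regions held by each server. status 'simple' prints details about the live
and dead servers: their server names and, for live servers, high-level statistics similar to those shown in the web UI, such as the number
of requests, heap details, disk information, and memstore information. Finally, the detailed version of the status command additionally
prints, for every region, information about the server it currently resides on.
Another group of general commands relates to updating the server configuration at runtime: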
Configuration Commands
+-------------------+-----------------------------------------------------------------------------------------------------------
| Commands | Description
+-------------------+-----------------------------------------------------------------------------------------------------------
| update_config | Update the configuration for a particular server. The name must be given as a valid server name
+-------------------+-----------------------------------------------------------------------------------------------------------
| update_all_config | Updates all region servers
+-------------------+-----------------------------------------------------------------------------------------------------------
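You can first use the status command to retrieve the list of servers, and then call the update command with a server name from that list.
Note that the returned name needs some reformatting: the components of a server name are separated by commas, not by colons or spaces, as
the following example shows: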
hbase(main):001:0> status 'simple'
1 live servers
127.0.0.1:62801 1431177060772
...
Aggregate load: 0, regions: 4
hbase(main):002:0> update_config '127.0.0.1,62801,1431177060772'
0 row(s) in 0.1290 seconds
hbase(main):003:0> update_all_config
0 row(s) in 0.0560 seconds
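The reformatting step shown above can be scripted. The helper below is a hypothetical sketch, not part of the shell; it assumes the "host:port startcode" layout printed by status 'simple' in the example and does not handle IPv6 addresses.

```ruby
# Hypothetical helper: convert "host:port startcode" (as printed by
# status 'simple') into the comma-separated "host,port,startcode" form.
def to_server_name(status_line)
  host_port, start_code = status_line.strip.split(/\s+/, 2)
  host, port = host_port.split(':', 2)
  [host, port, start_code].join(',')
end

puts to_server_name('127.0.0.1:62801 1431177060772')
# 127.0.0.1,62801,1431177060772
```

The resulting string is what the example passes to update_config.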
■ Namespace and Data Definition Commands
-----------------------------------------------------------------------------------------------------------------------------------------
The namespace group of commands provides the shell functionality for creating, altering, and dropping namespaces.
Namespace Shell Commands:
+-----------------------+-------------------------------------------------------------------------------------------------------
| Commands | Description
+-----------------------+-------------------------------------------------------------------------------------------------------
| create_namespace | Creates a namespace with the provided name.
+-----------------------+-------------------------------------------------------------------------------------------------------
| drop_namespace | Removes the namespace, which must be empty, that is, it must not contain any tables.
+-----------------------+-------------------------------------------------------------------------------------------------------
| alter_namespace | Changes the namespace details by altering its configuration properties
+-----------------------+-------------------------------------------------------------------------------------------------------
| describe_namespace | Prints the details of an existing namespace
+-----------------------+-------------------------------------------------------------------------------------------------------
| list_namespace | Lists all known namespaces
+-----------------------+-------------------------------------------------------------------------------------------------------
| list_namespace_tables | Lists all tables contained in the given namespace
+-----------------------+-------------------------------------------------------------------------------------------------------
The data definition commands are listed in the following table. Most of them stem from the administrative API:
Data Definition Shell Commands
+---------------+---------------------------------------------------------------------------------------------------------------
| Commands | Description
+---------------+---------------------------------------------------------------------------------------------------------------
| alter | Modifies an existing table schema using modifyTable().
+---------------+---------------------------------------------------------------------------------------------------------------
| alter_async | Same as above, but returns immediately without waiting for the changes to take effect
+---------------+---------------------------------------------------------------------------------------------------------------
| alter_status | Can be used to query how many regions have the changes applied to them. Use this after making asynchronous
| | alterations.
+---------------+---------------------------------------------------------------------------------------------------------------
| create | Creates a new table. See the createTable() call
+---------------+---------------------------------------------------------------------------------------------------------------
| describe | Prints the HTableDescriptor. A shortcut for this command is desc
+---------------+---------------------------------------------------------------------------------------------------------------
| disable | Disables a table. See the disableTable() method.
+---------------+---------------------------------------------------------------------------------------------------------------
| disable_all | Uses a regular expression to disable all matching tables in a single command
+---------------+---------------------------------------------------------------------------------------------------------------
| drop | Drops a table. See the deleteTable() method
+---------------+---------------------------------------------------------------------------------------------------------------
| drop_all | Drops all matching tables. The parameter is a regular expression
+---------------+---------------------------------------------------------------------------------------------------------------
| enable | Enables a table. See the enableTable() call
+---------------+---------------------------------------------------------------------------------------------------------------
| enable_all | Using a regular expression to enable all matching tables
+---------------+---------------------------------------------------------------------------------------------------------------
| exists | Checks if a table exists. It uses the tableExists() call
+---------------+---------------------------------------------------------------------------------------------------------------
| is_disabled | Checks if a table is disabled. See the isTableDisabled() method
+---------------+---------------------------------------------------------------------------------------------------------------
| is_enabled | Checks if a table is enabled. See the isTableEnabled() method
+---------------+---------------------------------------------------------------------------------------------------------------
| list | Returns a list of all user tables. Uses the listTables() method
+---------------+---------------------------------------------------------------------------------------------------------------
| show_filters | Lists all known filter classes.
+---------------+---------------------------------------------------------------------------------------------------------------
| get_table | Returns a table reference that can be used in scripting
+---------------+---------------------------------------------------------------------------------------------------------------
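Commands ending in _all accept a regular expression and apply the command to all matching tables. For example, assuming the system contains
a single table named test: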
hbase(main):001:0> drop_all '.*'
test
Drop the above 1 tables (y/n)?
y
1 tables successfully dropped
hbase(main):002:0> drop_all '.*'
No tables matched the regex .*
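Because '.*' matches every table, it is worth previewing a pattern before handing it to a destructive command. The sketch below models the selection in plain Ruby. It assumes that the pattern must match the whole table name, and the helper and table names are hypothetical; the shell itself evaluates the pattern with Java regular expressions, which differ from Ruby's in some details.

```ruby
# Hypothetical helper; assumes the pattern has to match the entire name.
def matching_tables(tables, regex)
  pattern = Regexp.new("\\A(?:#{regex})\\z")
  tables.select { |name| pattern.match?(name) }
end

TABLES = %w[test testtable users]

p matching_tables(TABLES, '.*')      # ["test", "testtable", "users"]
p matching_tables(TABLES, 'test.*')  # ["test", "testtable"]
p matching_tables(TABLES, 'test')    # ["test"]
```

The shell also asks for confirmation, as shown above, which gives a second chance to review the list of matched tables.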
■ Data Manipulation Commands
-----------------------------------------------------------------------------------------------------------------------------------------
The data manipulation commands are listed in the following table. Most of them are provided by the client API:
Data Manipulation Shell Commands
+-------------------+-----------------------------------------------------------------------------------------------------------
| Commands | Description
+-------------------+-----------------------------------------------------------------------------------------------------------
| put | Stores a cell. Uses the Put class
+-------------------+-----------------------------------------------------------------------------------------------------------
| get | Retrieves a cell. See the Get class
+-------------------+-----------------------------------------------------------------------------------------------------------
| delete | Deletes a cell
+-------------------+-----------------------------------------------------------------------------------------------------------
| deleteall | Similar to delete but does not require a column. Deletes an entire family or row
+-------------------+-----------------------------------------------------------------------------------------------------------
| append | Allows to append data to cells
+-------------------+-----------------------------------------------------------------------------------------------------------
| incr | Increments a counter. Uses the Increment class
+-------------------+-----------------------------------------------------------------------------------------------------------
| get_counter | Retrieves a counter value. Same as the get command but converts the raw counter value into a readable number
+-------------------+-----------------------------------------------------------------------------------------------------------
| scan | Scans a range of rows. Relies on the Scan class
+-------------------+-----------------------------------------------------------------------------------------------------------
| count | Counts the rows in a table. Uses a Scan internally
+-------------------+-----------------------------------------------------------------------------------------------------------
| truncate | Truncates a table, which is the same as executing the disable and drop commands, followed by a create,
| | using the same schema
+-------------------+-----------------------------------------------------------------------------------------------------------
| truncate_preserve | Same as the previous command, but retains the regions with their start and end keys.
+-------------------+-----------------------------------------------------------------------------------------------------------
Many of the commands have extended optional parameters. See the help in the shell for details.
Formatting Binary Data
-------------------------------------------------------------------------------------------------------------------------------------
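When printing cell values during a get operation, the shell implicitly converts binary data using Bytes.toStringBinary(). You can change
this behavior on a per-column basis by specifying a different formatting method. The method must accept a byte[] array and return a
printable representation of the value. It is defined as part of the column name, which is given as an optional parameter to the get call: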
<column family>[:<column qualifier>[:format method]]
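For a get call, you can omit any column details. If you do add them, they can be a column family, or a column family and a column
qualifier. The optional third part is the format method, which refers either to a method of the Bytes class or to a method of a custom
class. Because it implies both a column family and a column qualifier, a format method can only be specified for one specific column, not
for an entire column family or an entire row.
The following table lists the two options with examples: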
Possible Format Methods
+---------------+-------------------------------+--------------------------------------------------------------------------------
| Method | Examples | Description
+---------------+-------------------------------+--------------------------------------------------------------------------------
| Bytes Method | toInt, toLong | Refers to a known method from the Bytes class.
+---------------+-------------------------------+--------------------------------------------------------------------------------
| CustomMethod | c(CustomFormatClass).format | Specifies a custom class and method converting byte[] to text.
+---------------+-------------------------------+--------------------------------------------------------------------------------
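The Bytes Method is simply a shortcut for naming the Bytes class explicitly. For example,
colfam:qual:c(org.apache.hadoop.hbase.util.Bytes).toInt is the same as colfam:qual:toInt. The following example uses several commands to
demonstrate this: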
hbase(main):001:0> create 'testtable', 'colfam1'
0 row(s) in 0.2020 seconds
=> Hbase::Table - testtable
hbase(main):002:0> incr 'testtable', 'row-1', 'colfam1:cnt1'
0 row(s) in 0.0580 seconds
hbase(main):003:0> get_counter 'testtable', 'row-1', 'colfam1:cnt1', 1
COUNTER VALUE = 1
hbase(main):004:0> get 'testtable', 'row-1', 'colfam1:cnt1'
COLUMN CELL
colfam1:cnt1 timestamp=..., value=\x00\x00\x00\x00\x00\x00\x00\x01
1 row(s) in 0.0150 seconds
hbase(main):005:0> get 'testtable', 'row-1', { COLUMN => 'colfam1:cnt1' }
COLUMN CELL
colfam1:cnt1 timestamp=..., value=\x00\x00\x00\x00\x00\x00\x00\x01
1 row(s) in 0.0160 seconds
hbase(main):006:0> get 'testtable', 'row-1', { COLUMN => ['colfam1:cnt1:toLong'] }
COLUMN CELL
colfam1:cnt1 timestamp=..., value=1
1 row(s) in 0.0050 seconds
hbase(main):007:0> get 'testtable', 'row-1', 'colfam1:cnt1:toLong'
COLUMN CELL
colfam1:cnt1 timestamp=..., value=1
1 row(s) in 0.0060 seconds
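What the toLong format method does to the eight raw bytes can be reproduced in plain Ruby. This is only an illustration of the conversion, not the shell's implementation; to_long and RAW are made-up names, and the sketch relies on Bytes.toLong reading a big-endian 64-bit value.

```ruby
# The counter cell as the shell prints it by default: eight raw bytes.
RAW = "\x00\x00\x00\x00\x00\x00\x00\x01".b

# 'q>' reads a signed 64-bit integer in big-endian byte order.
def to_long(bytes)
  bytes.unpack('q>').first
end

puts to_long(RAW)               # 1
puts to_long([42].pack('q>'))   # 42
```

This is why the unformatted get prints \x00...\x01 while the toLong variant prints value=1.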
■ Snapshot Commands
-----------------------------------------------------------------------------------------------------------------------------------------
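These commands reflect the functionality of the administrative API. You can take a snapshot of a table for later restoring or cloning,
list all available snapshots, and so on.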
Snapshot Shell Commands
+-----------------------+-------------------------------------------------------------------------------------------------------
| Command | Description
+-----------------------+-------------------------------------------------------------------------------------------------------
| snapshot | Creates a snapshot. Use the SKIP_FLUSH => true option to not flush the table before the snapshot.
+-----------------------+-------------------------------------------------------------------------------------------------------
| clone_snapshot | Clones an existing snapshot into a new table
+-----------------------+-------------------------------------------------------------------------------------------------------
| restore_snapshot | Restores a snapshot under the same table name as it was created
+-----------------------+-------------------------------------------------------------------------------------------------------
| delete_snapshot | Deletes a specific snapshot. The given name must match the name of a previously created snapshot
+-----------------------+-------------------------------------------------------------------------------------------------------
| delete_all_snapshot | Deletes all snapshots using a regular expression to match any number of names
+-----------------------+-------------------------------------------------------------------------------------------------------
| list_snapshots | Lists all snapshots that have been created so far
+-----------------------+-------------------------------------------------------------------------------------------------------
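When creating a snapshot you can specify a mode, similar to the mode of the API call: you can either force a flush of the table's in-memory
data (the default behavior), or snapshot only the files that are already on disk. In the following example, snapshot1 is taken with
SKIP_FLUSH => true before any data has been flushed, so restoring it yields an empty table, whereas restoring snapshot2 brings back all
676 rows: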
hbase(main):001:0> create 'testtable', 'colfam1'
0 row(s) in 0.4950 seconds
=> Hbase::Table - testtable
hbase(main):002:0> for i in 'a'..'z' do \
for j in 'a'..'z' do put 'testtable', "row-#{i}#{j}", "colfam1:#{j}", \
"#{j}" end end
0 row(s) in 0.0830 seconds
0 row(s) in 0.0070 seconds
...
hbase(main):003:0> count 'testtable'
676 row(s) in 0.1620 seconds
=> 676
hbase(main):004:0> snapshot 'testtable', 'snapshot1', { SKIP_FLUSH => true }
0 row(s) in 0.4300 seconds
hbase(main):005:0> snapshot 'testtable', 'snapshot2'
0 row(s) in 0.3180 seconds
hbase(main):006:0> list_snapshots
SNAPSHOT TABLE + CREATION TIME
snapshot1 testtable (Sun May 10 20:05:11 +0200 2015)
snapshot2 testtable (Sun May 10 20:05:18 +0200 2015)
2 row(s) in 0.0560 seconds
=> ["snapshot1", "snapshot2"]
hbase(main):007:0> disable 'testtable'
0 row(s) in 1.2010 seconds
hbase(main):008:0> restore_snapshot 'snapshot1'
0 row(s) in 0.3430 seconds
hbase(main):009:0> enable 'testtable'
0 row(s) in 0.1920 seconds
hbase(main):010:0> count 'testtable'
0 row(s) in 0.0130 seconds
=> 0
hbase(main):011:0> disable 'testtable'
0 row(s) in 1.1920 seconds
hbase(main):012:0> restore_snapshot 'snapshot2'
0 row(s) in 0.4710 seconds
hbase(main):013:0> enable 'testtable'
0 row(s) in 0.3850 seconds
hbase(main):014:0> count 'testtable'
676 row(s) in 0.1670 seconds
=> 676
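The two row counts in the transcript above (0 after restoring snapshot1, 676 after restoring snapshot2) follow from what each snapshot was able to see. The toy model below is not HBase code; ToyTable and its methods are invented for illustration and only mimic the idea that writes sit in memory until they are flushed to files.

```ruby
# Toy model, not HBase code: writes land in an in-memory store first and
# reach the on-disk files only when the table is flushed.
class ToyTable
  def initialize
    @memstore = {}
    @files = {}
  end

  def put(row, value)
    @memstore[row] = value
  end

  def flush
    @files.merge!(@memstore)
    @memstore.clear
  end

  # A snapshot captures only what is in the files at this moment.
  def snapshot(skip_flush: false)
    flush unless skip_flush
    @files.dup
  end

  def restore(snapshot)
    @memstore.clear
    @files = snapshot.dup
  end

  def count
    @files.merge(@memstore).size
  end
end

table = ToyTable.new
('a'..'z').each { |i| ('a'..'z').each { |j| table.put("row-#{i}#{j}", j) } }

snapshot1 = table.snapshot(skip_flush: true) # nothing has been flushed yet
snapshot2 = table.snapshot                   # flushes first, then captures

table.restore(snapshot1)
puts table.count # 0
table.restore(snapshot2)
puts table.count # 676
```

In this model, skipping the flush is only harmless when the data of interest has already been written to disk.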
■ Tool Commands
-----------------------------------------------------------------------------------------------------------------------------------------
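The tools commands are listed in the following table. All of them are provided by the administrative API. Many of these commands are
low-level, which means they may perform destructive actions, so be sure to read the shell help for each command carefully to understand
its impact.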
Tools Shell Commands
+-----------------------+-------------------------------------------------------------------------------------------------------
| Command | Description
+-----------------------+-------------------------------------------------------------------------------------------------------
| assign | Assigns a region to a server
+-----------------------+-------------------------------------------------------------------------------------------------------
| balance_switch | Toggles the balancer switch
+-----------------------+-------------------------------------------------------------------------------------------------------
| balancer | Starts the balancer
+-----------------------+-------------------------------------------------------------------------------------------------------
| close_region | Closes a region. Uses the closeRegion() method
+-----------------------+-------------------------------------------------------------------------------------------------------
| compact | Starts the asynchronous compaction of a region or table. Uses compact()
+-----------------------+-------------------------------------------------------------------------------------------------------
| compact_rs | Compacts all regions of a given region server. The optional boolean flag decides between major and minor
| | compactions
+-----------------------+-------------------------------------------------------------------------------------------------------
| flush | Starts the asynchronous flush of a region or table. Uses flush()
+-----------------------+-------------------------------------------------------------------------------------------------------
| major_compact | Starts the asynchronous major compaction of a region or table. Uses majorCompact()
+-----------------------+-------------------------------------------------------------------------------------------------------
| move | Moves a region to a different server. See the move() call
+-----------------------+-------------------------------------------------------------------------------------------------------
| split | Splits a region or table. See the split() call
+-----------------------+-------------------------------------------------------------------------------------------------------
| merge_region | Merges two regions, specified as hashed names. The optional boolean flag allows merging of
| | non-subsequent regions
+-----------------------+-------------------------------------------------------------------------------------------------------
| unassign | Unassigns a region. See the unassign() call
+-----------------------+-------------------------------------------------------------------------------------------------------
| wal_roll | Rolls the WAL, which means close the current and open a new one
+-----------------------+-------------------------------------------------------------------------------------------------------
| catalogjanitor_run | Runs the system catalog janitor process, which operates in the background and cleans out obsolete files
+-----------------------+-------------------------------------------------------------------------------------------------------
| catalogjanitor_switch | Toggles the system catalog janitor process, either enabling or disabling it
+-----------------------+-------------------------------------------------------------------------------------------------------
|catalogjanitor_enabled | Returns the status of the catalog janitor background process
+-----------------------+-------------------------------------------------------------------------------------------------------
| zk_dump | Dumps the ZooKeeper details pertaining to HBase. This is a special function offered by an internal class.
| | The web-based UI of the HBase Master exposes the same information
+-----------------------+-------------------------------------------------------------------------------------------------------
| trace | Starts or stops a trace, using the HTrace framework
+-----------------------+-------------------------------------------------------------------------------------------------------
■ Replication Commands
-----------------------------------------------------------------------------------------------------------------------------------------
Replication Shell Commands
+-----------------------+-------------------------------------------------------------------------------------------------------
| Command | Description
+-----------------------+-------------------------------------------------------------------------------------------------------
| add_peer | Adds a replication peer
+-----------------------+-------------------------------------------------------------------------------------------------------
| remove_peer | Removes a replication peer
+-----------------------+-------------------------------------------------------------------------------------------------------
| enable_peer | Enables a replication peer
+-----------------------+-------------------------------------------------------------------------------------------------------
| disable_peer | Disables a replication peer
+-----------------------+-------------------------------------------------------------------------------------------------------
| list_peers | List all previously added peers
+-----------------------+-------------------------------------------------------------------------------------------------------
| list_replicated_tables| Lists all tables and column families that have replication enabled on the current cluster
+-----------------------+-------------------------------------------------------------------------------------------------------
| set_peer_tableCFs | Sets specific column families that should be replicated to the given peer
+-----------------------+-------------------------------------------------------------------------------------------------------
| append_peer_tableCFs | Adds the given column families to the specified peer’s list of replicated column families
+-----------------------+-------------------------------------------------------------------------------------------------------
| remove_peer_tableCFs | Removes the given list of column families from the list of replicated families for the given peer
+-----------------------+-------------------------------------------------------------------------------------------------------
| show_peer_tableCFs | Lists the currently replicated column families for the given peer
+-----------------------+-------------------------------------------------------------------------------------------------------
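Most of these commands take a peer ID and apply their functionality to that peer's configuration. You can add a peer and remove it again later,
enable or disable replication for an existing peer, and list all known peers or all replicated tables.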
■ Security Commands
-----------------------------------------------------------------------------------------------------------------------------------------
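This group of commands falls into two parts: the access control list (ACL) commands and the visibility label commands. The ACL commands let you
grant, revoke, and list user permissions. Note that these commands are only available when the AccessController coprocessor is enabled.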
Security Shell Commands
+-------------------+-----------------------------------------------------------------------------------------------------------
| Command | Description
+-------------------+-----------------------------------------------------------------------------------------------------------
| grant | Grant the named access rights to the given user
+-------------------+-----------------------------------------------------------------------------------------------------------
| revoke | Revoke the previously granted rights of a given user
+-------------------+-----------------------------------------------------------------------------------------------------------
| user_permission | Lists the current permissions of a user. The optional regular expression filters the list
+-------------------+-----------------------------------------------------------------------------------------------------------
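The second set of security-related commands deals with cell-level visibility labels. Again, they need extra configuration to work, in this case
enabling the additional VisibilityController coprocessor in the server processes.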
Visibility Label Shell Commands
+---------------+---------------------------------------------------------------------------------------------------------------
| Command | Description
+---------------+---------------------------------------------------------------------------------------------------------------
| add_labels | Adds a list of visibility labels to the system
+---------------+---------------------------------------------------------------------------------------------------------------
| list_labels | Lists all previously defined labels. An optional regular expression can be used to filter the list
+---------------+---------------------------------------------------------------------------------------------------------------
| set_auths | Assigns the given list of labels to the provided user ID.
+---------------+---------------------------------------------------------------------------------------------------------------
| get_auths | Returns the list of assigned labels for the given user
+---------------+---------------------------------------------------------------------------------------------------------------
| clear_auths | Removes all or only the specified list of labels from the named user.
+---------------+---------------------------------------------------------------------------------------------------------------
| set_visibility| Adds a visibility expression to one or more cells
+---------------+---------------------------------------------------------------------------------------------------------------
4.3 Scripting
-----------------------------------------------------------------------------------------------------------------------------------------
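Inside the shell you can run commands interactively and get immediate feedback. Sometimes, though, you only want to send a single command, perhaps
from a script invoked by a scheduling or maintenance system such as cron or at. For that you can pipe the command into the shell: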
$ echo "status" | bin/hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 1.0.0, r6c98bff7b719efdb16f71606f3b7d8229445eb81, \
Sat Feb 14 19:49:22 PST 2015
status
1 servers, 2 dead, 3.0000 average load
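Once the command has completed, the shell closes and returns control to the caller. Finally, you can hand in an entire script to be executed by the
shell at startup: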
$ cat ~/hbase-shell-status.rb
status
$ bin/hbase shell ~/hbase-shell-status.rb
1 servers, 2 dead, 3.0000 average load
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 1.0.0, r6c98bff7b719efdb16f71606f3b7d8229445eb81, Sat Feb
14 19:49:22 PST 2015
hbase(main):001:0> exit
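Once the script has completed, you can continue working in the shell or exit it as usual. There is also the option of executing a script with the raw
JRuby interpreter, which runs it directly as a Java application. The hbase script sets up the class path so that any required Java classes can be
used. The following example simply retrieves the list of tables from the remote cluster: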
$ cat ~/hbase-shell-status-2.rb
include Java
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.HBaseAdmin
import org.apache.hadoop.hbase.client.ConnectionFactory
conf = HBaseConfiguration.create
connection = ConnectionFactory.createConnection(conf)
admin = connection.getAdmin
tables = admin.listTables
tables.each { |table| puts table.getNameAsString() }
$ bin/hbase org.jruby.Main ~/hbase-shell-status-2.rb
testtable
Since the HBase shell is based on JRuby's IRB, you can use IRB's built-in features, such as command completion and command history. To enable or
configure them, create an .irbrc file in your home directory; it is read when the shell starts:
$ cat ~/.irbrc
require 'irb/ext/save-history'
IRB.conf[:SAVE_HISTORY] = 100
IRB.conf[:HISTORY_FILE] = "#{ENV['HOME']}/.irb-save-history"
Kernel.at_exit do
IRB.conf[:AT_EXIT].each do |i|
i.call
end
end
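This enables the command history, which saves the commands you have executed in the shell. Command completion is already enabled by the HBase scripts.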
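A further advantage of the interactive interpreter is that you can call HBase classes and functions directly to do things that would otherwise
require writing a Java application. The following example converts the binary output of a Bytes.toBytes() call back into an integer value: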
hbase(main):001:0> org.apache.hadoop.hbase.util.Bytes.toInt("\x00\x01\x06[".to_java_bytes)
=> 67163
Note how the shell prints the first three unprintable bytes as hexadecimal escapes, while the fourth byte, "[", is printed as a character.
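To see where 67163 comes from without a cluster at hand, the same four bytes can be decoded in plain Ruby. This is a minimal sketch that assumes
Bytes.toInt reads a 4-byte big-endian signed integer, which is consistent with the result above. The helper name bytes_to_int is made up for
illustration and is not part of the shell.

```ruby
# Hypothetical helper (not an HBase API): decode 4 bytes as a big-endian
# signed 32-bit integer, assumed here to mirror what Bytes.toInt does.
def bytes_to_int(str)
  raise ArgumentError, "need exactly 4 bytes" unless str.bytesize == 4
  unsigned = str.unpack("N").first                  # 32-bit big-endian, unsigned
  unsigned >= 2**31 ? unsigned - 2**32 : unsigned   # fold into the signed range
end

# Bytes 0x00 0x01 0x06 0x5B ("[" is 0x5B):
# (0x01 << 16) + (0x06 << 8) + 0x5B = 65536 + 1536 + 91
puts bytes_to_int("\x00\x01\x06[")   # prints 67163
```

The arithmetic in the comment also explains the display: only 0x5B maps to a printable character, so only that byte appears as "[".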
Another example converts a date into a Unix epoch value (in milliseconds) and back into a human-readable date:
hbase(main):002:0> java.text.SimpleDateFormat.new("yyyy/MM/dd HH:mm:ss").parse("2015/05/12 20:56:29").getTime
=> 1431456989000
hbase(main):002:0> java.util.Date.new(1431456989000).toString
=> "Tue May 12 20:56:29 CEST 2015"
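The same round trip can be checked in plain Ruby, outside the shell. This is only a sketch: the two helper names are hypothetical, and the UTC
offset is passed explicitly because the output above was produced on a machine set to CEST (UTC+02:00), which plain Ruby cannot know.

```ruby
# Hypothetical helper: wall-clock time at a given UTC offset -> epoch milliseconds.
def to_epoch_millis(year, month, day, hour, min, sec, utc_offset)
  Time.new(year, month, day, hour, min, sec, utc_offset).to_i * 1000
end

# Hypothetical helper: epoch milliseconds -> formatted wall-clock time at the offset.
def from_epoch_millis(millis, utc_offset)
  Time.at(millis / 1000).localtime(utc_offset).strftime("%Y/%m/%d %H:%M:%S")
end

millis = to_epoch_millis(2015, 5, 12, 20, 56, 29, "+02:00")
puts millis                                  # prints 1431456989000
puts from_epoch_millis(millis, "+02:00")     # prints 2015/05/12 20:56:29
```

Run with a different offset, the second call prints a different wall-clock time for the same epoch value, which is why the shell output depends on
the time zone of the machine it runs on.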
You can also add many cells in a loop, for example to fill a table with test data:
hbase(main):003:0> for i in 'a'..'z' do for j in 'a'..'z' do \
put 'testtable', "row-#{i}#{j}", "colfam1:#{j}", "#{j}" end end
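To get a feel for what the nested loop writes before running it against a cluster, the argument lists can be built in plain Ruby. This is a sketch
only: test_cells is a hypothetical helper, and nothing here talks to HBase.

```ruby
# Hypothetical helper: build the [table, row, column, value] arguments that
# the nested loop above passes to put, without contacting a cluster.
def test_cells
  ('a'..'z').flat_map do |i|
    ('a'..'z').map { |j| ["testtable", "row-#{i}#{j}", "colfam1:#{j}", j] }
  end
end

cells = test_cells
puts cells.size            # prints 676 (26 * 26 rows, one cell each)
puts cells.first.inspect   # prints ["testtable", "row-aa", "colfam1:a", "a"]
puts cells.last.inspect    # prints ["testtable", "row-zz", "colfam1:z", "z"]
```

Each row receives a single cell whose qualifier and value are the second letter of the row key, so the table ends up with 676 rows spread over 26
distinct qualifiers.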
A more elaborate loop that fills a row with counter values might look like this:
hbase(main):004:0> require 'date';
import java.lang.Long
import org.apache.hadoop.hbase.util.Bytes
(Date.new(2011, 01, 01)..Date.today).each { |x| put "testtable", "daily", \
"colfam1:" + x.strftime("%Y%m%d"), Bytes.toBytes(Long.new(rand * 4000).longValue).to_a.pack("CCCCCCCC") }
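The value expression in that loop is dense, so here is a plain-Ruby sketch of what it appears to produce: one qualifier per day and an 8-byte
value per cell. It assumes that Bytes.toBytes(long) yields the 8-byte big-endian encoding of the number; counter_bytes and daily_counters are
hypothetical helpers, not shell commands.

```ruby
require 'date'

# Hypothetical helper: encode an integer as 8 big-endian bytes, the layout
# assumed here for Bytes.toBytes(long) and therefore for counter cells.
def counter_bytes(value)
  [value].pack("q>")
end

# Hypothetical helper: one [qualifier, value] pair per day in the range,
# using the same qualifier format and random range as the loop above.
def daily_counters(first_day, last_day)
  (first_day..last_day).map do |day|
    ["colfam1:" + day.strftime("%Y%m%d"), counter_bytes(rand(4000))]
  end
end

daily_counters(Date.new(2011, 1, 1), Date.new(2011, 1, 3)).each do |qualifier, value|
  puts "#{qualifier} #{value.unpack('C*').inspect}"
end
```

Storing the value as a fixed 8-byte number rather than as text is what lets the cell be treated as a counter later on; a value written with the
plain put command would be stored as a string instead.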
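The JRuby code of the shell wraps many of the Java classes, such as Table and Admin, in its own versions, which makes their functionality easier to
access. You can use these classes when writing more complex scripts. Run the table_help command to see the built-in help text on how to use the
shell's wrapper classes, and table references in particular. At this point you may wonder why the shell sometimes responds with a hash rocket (also
called a fat comma, that is, =>) when you run certain commands such as create: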
hbase(main):005:0> create 'testtable', 'colfam1'
0 row(s) in 0.1740 seconds
=> Hbase::Table - testtable
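The create command actually returns a reference to an Hbase::Table instance, here one that points to the newly created testtable. You can store
this reference in a variable and use the shell's double-tab feature to list all the functions it exposes: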
NOTE:
---------------------------------------------------------------------------------------------------------------------------------
Before running the following steps, delete the test table created earlier: run disable 'testtable' first, then drop 'testtable'.
hbase(main):006:0> tbl = create 'testtable', 'colfam1'
0 row(s) in 0.1520 seconds
=> Hbase::Table - testtable
hbase(main):006:0> tbl. TAB TAB
...
tbl.append tbl.close
tbl.delete
tbl.deleteall tbl.describe
tbl.disable
...
tbl.help tbl.incr
tbl.name
tbl.put tbl.snapshot
tbl.table
...
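As you can see, the Ruby table class (here held in the variable tbl) exposes all the shell commands under the same names. For example, the put
command is really a shortcut for the table.put method. table.help prints the same text as table_help, and table.table is a reference to the
underlying Java Table instance. You can use the latter to reach the native API when nothing else will do.
Another way to obtain the same Ruby table reference is the get_table command, which is useful when the table already exists: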
hbase(main):006:0> tbl = get_table 'testtable'
0 row(s) in 0.0120 seconds
=> Hbase::Table - testtable
Once you have the reference, you can invoke any command through the matching method without having to enter the table name again:
hbase(main):007:0> tbl.put 'row-1', 'colfam1:qual1', 'val1'
0 row(s) in 0.0050 seconds
This inserts the given value into the named row and column of the test table. You can read the data back in the same way:
hbase(main):008:0> tbl.get 'row-1'
COLUMN CELL
colfam1:qual1 timestamp=1431506646925, value=val1
1 row(s) in 0.0390 seconds
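You can also read data with tbl.scan and so on. All table-related commands, that is, those taking a table name as the first argument, can also be
used with the table reference syntax. Type tbl.help '<command>' to see the shell's built-in help for a command, which usually includes examples of
the reference syntax as well.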
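General administrative operations are also available directly on a table, for example enable, disable, flush, and drop, by typing tbl.enable,
tbl.flush, and so on. Note that after you drop a table its reference becomes useless; using it further is undefined and not recommended.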
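A final example deals with custom serialization and formatting. Suppose you have stored Java objects in a table and want to recreate the instances
to print a textual representation of the stored objects. As seen earlier, you can supply a custom format method when retrieving a column with the
get command. In addition, HBase ships with the Apache Commons Lang artifacts, whose SerializationUtils class has static serialize() and
deserialize() methods that handle any Java object implementing the Serializable interface. The following example goes deeper into the shell
environment, because it has to create its own Put instance. This is necessary because the put command provided by the shell assumes the value is a
string. For the example to work, we need access to the native Put class methods: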
hbase(main):001:0> import org.apache.commons.lang.SerializationUtils
=> Java::OrgApacheCommonsLang::SerializationUtils
hbase(main):002:0> t = create 'testtable', 'colfam1'
0 row(s) in 0.1480 seconds
hbase(main):003:0> p = org.apache.hadoop.hbase.client.Put.new("row-1000".to_java_bytes)
=> #<Java::OrgApacheHadoopHbaseClient::Put:0x6d6bc0eb>
hbase(main):004:0> p.addColumn("colfam1".to_java_bytes,
"qual1".to_java_bytes, SerializationUtils.serialize(java.util.ArrayList.new([1,2,3])))
=> #<Java::OrgApacheHadoopHbaseClient::Put:0x6d6bc0eb>
hbase(main):005:0> t.table.put(p)
hbase(main):006:0> scan 'testtable'
ROW COLUMN+CELL
row-1000 column=colfam1:qual1, timestamp=1431353253936, \
value=\xAC\xED\x00\x05sr\x00\x13java.util.ArrayListx\x81\xD2\x1D
\x99...
\x03sr\x00\x0Ejava.lang.Long;\x8B\xE4\x90\xCC\x8F#\xDF
\x02\x00\x01J...
\x10java.lang.Number\x86\xAC\x95\x1D\x0B\x94\xE0\x8B
\x02\x00\x00xp...
1 row(s) in 0.0340 seconds
hbase(main):007:0> get 'testtable', 'row-1000', 'colfam1:qual1:c(SerializationUtils).deserialize'
COLUMN CELL
colfam1:qual1 timestamp=1431353253936, value=[1, 2, 3]
1 row(s) in 0.0360 seconds
hbase(main):008:0> p.addColumn("colfam1".to_java_bytes, "qual1".to_java_bytes, SerializationUtils.serialize( \
java.util.ArrayList.new(["one", "two", "three"])))
=> #<Java::OrgApacheHadoopHbaseClient::Put:0x6d6bc0eb>
hbase(main):009:0> t.table.put(p)
hbase(main):010:0> scan 'testtable'
ROW COLUMN+CELL
row-1000 column=colfam1:qual1, timestamp=1431353620544, \
value=\xAC\xED\x00\x05sr\x00\x13java.util.ArrayListx\x81\xD2\x1D\x99 \
\xC7a\x9D\x03\x00\x01I\x00\x04sizexp\x00\x00\x00\x03w
\x04\x00\x00\x00 \
\x03t\x00\x03onet\x00\x03twot\x00\x05threex
1 row(s) in 0.4470 seconds
hbase(main):011:0> get 'testtable', 'row-1000', 'colfam1:qual1:c(SerializationUtils).deserialize'
COLUMN CELL
colfam1:qual1 timestamp=1431353620544, value=[one, two, three]
1 row(s) in 0.0190 seconds
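The example first imports the org.apache.commons.lang.SerializationUtils class (already on the HBase shell class path), then creates a test table
and a custom Put instance. The Put instance is set twice, once with a serialized list of numbers and once with a serialized list of strings. After
each one, the put method of the wrapped Table instance is called and the table is scanned to verify the serialized content.
After each serialization, the get command is called with a custom format method pointing at deserialize(). It parses the raw bytes back into a
Java object, which is then printed. Since the shell applies a toString() call, you see the original content of the list, for example
[one, two, three]. This confirms that the serialized Java object was recreated directly in the shell.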
Reference:
Lars George, HBase: The Definitive Guide, 2nd Edition, Early Release (July 2015).