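First, a quick list of what this article covers:
1. Install a ZooKeeper cluster
2. Install a Storm cluster: one Storm nimbus, with Storm UI enabled on the nimbus, and two supervisors
3. Run a small example to verify the installation
Note:
In this example the nimbus machine's IP is 10.1.110.24 and the two supervisors are 10.1.110.21 and 10.1.110.22. The rest of the article uses these literal IPs rather than placeholders; substitute your own.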
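Preparation
Note added 2014-07-29:
Storm has since moved to Apache, and the newly released apache-storm-0.9.1-incubating makes Netty the default transport (see the release notes for other changes).
The messaging parameters that were needed to enable Netty can therefore be omitted.
Installing ZooKeeper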
mkdir -p /var/tmp/zkdata
cd /var/tmp/zkdata
echo 1 > myid
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/var/tmp/zkdata
# the port at which the clients will connect
clientPort=2181
server.1=10.1.110.21:2888:3888
server.2=10.1.110.22:2888:3888
server.3=10.1.110.24:2888:3888
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
#zookeeper
export ZOOKEEPER=/opt/zookeeper-3.4.5
PATH=$PATH:$ZOOKEEPER/bin
Make the change take effect.
6. Start ZooKeeper
cd /opt/zookeeper-3.4.5/bin
./zkServer.sh start
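Repeat the steps above on the other two machines, making sure each machine's myid matches the number in its server.N line.
Installing Storm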
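Since the myid on each machine has to equal the N of that machine's server.N line, it can be looked up from the configuration instead of being typed by hand. The following is only a minimal sketch under that assumption; `myid_for` is a hypothetical helper, not something shipped with ZooKeeper, and the demo runs against a scratch file holding the three server lines from this article.

```shell
#!/bin/sh
# myid_for CONFIG HOST: print the N of the "server.N=HOST:..." line for HOST.
# Hypothetical helper for illustration; it is not shipped with ZooKeeper.
myid_for() {
    awk -F= -v host="$2" '
        $1 ~ /^server\.[0-9]+$/ {
            split($2, addr, ":")          # addr[1] is the host before the ports
            if (addr[1] == host) {
                sub(/^server\./, "", $1)  # keep only the numeric id
                print $1
                exit
            }
        }' "$1"
}

# Demo against a scratch copy of the three server lines used in this article.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
server.1=10.1.110.21:2888:3888
server.2=10.1.110.22:2888:3888
server.3=10.1.110.24:2888:3888
EOF
myid_for "$cfg" 10.1.110.22    # prints 2
rm -f "$cfg"
```

On a real node the output would be redirected into /var/tmp/zkdata/myid, with the first argument pointing at wherever your ZooKeeper configuration file actually lives.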
PATH=$PATH:/opt/storm-0.9.0.1/bin
mkdir -p /var/tmp/storm/workdir/
Repeat the steps above on the other machines in the Storm cluster, then configure:
########### These MUST be filled in for a storm configuration
storm.zookeeper.servers:
    - "10.1.110.21"
    - "10.1.110.22"
    - "10.1.110.24"
#
# nimbus.host: "nimbus"
#
#
# ##### These may optionally be filled in:
#
## List of custom serializations
# topology.kryo.register:
# - org.mycompany.MyType
# - org.mycompany.MyType2: org.mycompany.MyType2Serializer
#
## List of custom kryo decorators
# topology.kryo.decorators:
# - org.mycompany.MyDecorator
#
## Locations of the drpc servers
# drpc.servers:
# - "server1"
# - "server2"
storm.local.dir: "/var/tmp/storm/workdir"
########### These MUST be filled in for a storm configuration
storm.zookeeper.servers:
    - "10.1.110.21"
    - "10.1.110.22"
    - "10.1.110.24"
nimbus.host: "10.1.110.24"
#
#
# ##### These may optionally be filled in:
#
## List of custom serializations
# topology.kryo.register:
# - org.mycompany.MyType
# - org.mycompany.MyType2: org.mycompany.MyType2Serializer
#
## List of custom kryo decorators
# topology.kryo.decorators:
# - org.mycompany.MyDecorator
#
## Locations of the drpc servers
# drpc.servers:
# - "server1"
# - "server2"
supervisor.slots.ports:
    - 6700
    - 6701
    - 6702
storm.local.dir: "/var/tmp/storm/workdir"
Starting the cluster
./storm nimbus
./storm ui
3. Start the supervisors
./storm supervisor
ui.port=8089
Deploying a topology
<dependency>
<groupId>org.twitter4j</groupId>
<artifactId>twitter4j-core</artifactId>
<!--<version>2.2.6-SNAPSHOT</version>-->
<version>[2.2,)</version>
</dependency>
<dependency>
<groupId>org.twitter4j</groupId>
<artifactId>twitter4j-stream</artifactId>
<!--<version>2.2.6-SNAPSHOT</version>-->
<version>[2.2,)</version>
</dependency>
mvn clean install -Dmaven.test.skip
3. Copy storm-starter-0.0.1-SNAPSHOT-jar-with-dependencies.jar from the storm-starter/target directory to the nimbus server
./storm jar storm-starter-0.0.1-SNAPSHOT-jar-with-dependencies.jar storm.starter.WordCountTopology test
./storm list
769 [main] INFO backtype.storm.thrift - Connecting to Nimbus at localhost:6627
Topology_name Status Num_tasks Num_workers Uptime_secs
-------------------------------------------------------------------
test ACTIVE 28 3 20
6. Shut down the topology
Common problems
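The listing from step 5 can also be checked from a script, for example to confirm that the topology is ACTIVE after submission or no longer listed once it has been shut down. This is a minimal sketch that assumes the column layout of the sample output above; `topology_status` is a hypothetical helper, not a Storm command.

```shell
#!/bin/sh
# topology_status NAME: read `storm list` output on stdin and print NAME's Status.
# Exits non-zero when NAME is not listed. Hypothetical helper, not a Storm command.
topology_status() {
    awk -v name="$1" '
        $1 == name { print $2; found = 1 }
        END { exit !found }'
}

# Demo with the sample listing from this article.
topology_status test <<'EOF'
769 [main] INFO backtype.storm.thrift - Connecting to Nimbus at localhost:6627
Topology_name Status Num_tasks Num_workers Uptime_secs
-------------------------------------------------------------------
test ACTIVE 28 3 20
EOF
```

Against a live cluster the input would come from a pipe, as in `./storm list | topology_status test`. The log line and the header row are skipped simply because their first field is not the topology name.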
hostname <new-name>
b. vi /etc/sysconfig/network
Set HOSTNAME=<new-name>
vi /etc/hosts
Map the machine's IPv4 address to <new-name> and add entries for the other nodes, for example:
10.1.110.24 nimbus
10.1.110.22 supervisor-22
10.1.110.21 supervisor-21
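Because every node needs an entry for every other node, it helps to check a hosts file for missing names before starting the cluster. The sketch below is an illustration only: `missing_hosts` is a hypothetical helper, and the demo uses a scratch file rather than the real /etc/hosts.

```shell
#!/bin/sh
# missing_hosts FILE NAME...: print each NAME that has no entry in a
# hosts-format FILE. Hypothetical helper for illustration.
missing_hosts() {
    file=$1
    shift
    for name in "$@"; do
        # Skip comment lines, then look for the name among the fields after the IP.
        awk -v n="$name" '
            /^[ \t]*#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }' "$file" || echo "$name"
    done
}

# Demo: a scratch hosts file that lacks the supervisor-21 entry.
hosts=$(mktemp)
cat > "$hosts" <<'EOF'
10.1.110.24 nimbus
10.1.110.22 supervisor-22
EOF
missing_hosts "$hosts" nimbus supervisor-21 supervisor-22    # prints supervisor-21
rm -f "$hosts"
```

Run against /etc/hosts on each machine, empty output means all of the listed node names are present.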
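Note:
This parameter applies to the workers that get launched, and each worker is one JVM process. If two workers start on the same node, one of them is bound to fail because of a port conflict!
When a topology is shut down, the worker's process is not closed automatically, so the port stays occupied. On the next start a new process may be allocated, which then tries to bind the same port and fails as well!
So avoid remote debugging where possible: debug in local mode first, and only then upload to the cluster.
After each debugging session, kill that supervisor so that the next deployment does not fail.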
6. The topology summary in Storm UI is empty, even though a topology is definitely running?
apache-storm-0.9.2-incubating introduced a newer jquery.tablesorter.min.js, which contains the following code:
ts.addParser({
id: "currency",
is: function (s) {
return /^[£$€?.]/.test(s);
}, format: function (s) {
return $.tablesorter.formatFloat(s.replace(new RegExp(/[£$€]/g), ""));
}, type: "numeric"
});
The special characters in the regular expressions on the return lines are mishandled under GBK and GB2312, so the script fails to execute.
<meta charset="UTF-8">
Setting the page encoding to UTF-8 this way fixes the problem.
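If the page has to be patched on several machines, the tag can be added from a script. The following is a rough sketch, not the author's procedure: `ensure_utf8_meta` is a hypothetical helper, it assumes the page has a lowercase `<head>` tag on its own line, and the article does not say which HTML file was edited, so the demo works on a scratch file.

```shell
#!/bin/sh
# ensure_utf8_meta PAGE: add <meta charset="UTF-8"> right after the first
# <head> tag unless PAGE already declares UTF-8. Hypothetical helper.
ensure_utf8_meta() {
    page=$1
    if grep -qi 'charset="\{0,1\}utf-8' "$page"; then
        return 0    # UTF-8 is already declared; leave the file alone
    fi
    tmp=$(mktemp) || return 1
    if awk '
        { print }
        !done && /<head[ >]/ { print "<meta charset=\"UTF-8\">"; done = 1 }
    ' "$page" > "$tmp"; then
        cat "$tmp" > "$page"    # overwrite in place, keeping file permissions
    fi
    rm -f "$tmp"
}

# Demo on a scratch page.
page=$(mktemp)
printf '<html>\n<head>\n<title>Storm UI</title>\n</head>\n</html>\n' > "$page"
ensure_utf8_meta "$page"
cat "$page"
rm -f "$page"
```

Running the function a second time on the same file changes nothing, so it is safe to repeat on every node.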