Storm Notes: The Topology Startup Process (Part 2)

Part 1 covered how a topology is submitted to Nimbus.

nimbus

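Nimbus is arguably the core of Storm. It has two main responsibilities:

  • Allocating resources to a topology's tasks
  • Receiving user commands and acting on them, such as submitting, killing, or activating a topology

Nimbus itself is built on the Thrift framework and uses Thrift's THsHaServer, the half-sync/half-async server model: a single dedicated thread handles network IO, and a separate thread pool processes the messages, which raises the number of messages Nimbus can process concurrently.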
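To make the half-sync/half-async idea concrete, here is a minimal Python sketch of the pattern. It is not Thrift's implementation: the `handle` and `serve` functions and the in-memory queue standing in for the network are assumptions made purely for illustration. The point it shows is that the IO thread only hands requests over and never runs a handler itself.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


def handle(request):
    # Hypothetical stand-in for a service method body such as submitTopology.
    return request.upper()


def serve(requests, workers=4):
    """Feed requests through one IO thread into a worker pool.

    Returns the responses in the order the requests were received.
    """
    inbox = queue.Queue()  # plays the role of the network socket
    futures = []

    def io_loop(pool):
        # The IO thread only dispatches; handlers run on pool threads.
        while True:
            req = inbox.get()
            if req is None:  # sentinel: no more input
                break
            futures.append(pool.submit(handle, req))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        io_thread = threading.Thread(target=io_loop, args=(pool,))
        io_thread.start()
        for r in requests:
            inbox.put(r)
        inbox.put(None)
        io_thread.join()
        return [f.result() for f in futures]
```

Because dispatching is cheap, a slow handler ties up one pool thread rather than the thread that reads from the network.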


The service interface is defined in the storm.thrift file. Here is an excerpt:

service Nimbus {
  void submitTopology(1: string name, 2: string uploadedJarLocation, 3: string jsonConf, 4: StormTopology topology) throws (1: AlreadyAliveException e, 2: InvalidTopologyException ite);
  void submitTopologyWithOpts(1: string name, 2: string uploadedJarLocation, 3: string jsonConf, 4: StormTopology topology, 5: SubmitOptions options) throws (1: AlreadyAliveException e, 2: InvalidTopologyException ite);
  void killTopology(1: string name) throws (1: NotAliveException e);
  void killTopologyWithOpts(1: string name, 2: KillOptions options) throws (1: NotAliveException e);
  void activate(1: string name) throws (1: NotAliveException e);
  void deactivate(1: string name) throws (1: NotAliveException e);
  void rebalance(1: string name, 2: RebalanceOptions options) throws (1: NotAliveException e, 2: InvalidTopologyException ite);

  // need to add functions for asking about status of storms, what nodes they're running on, looking at task logs

  string beginFileUpload();
  void uploadChunk(1: string location, 2: binary chunk);
  void finishFileUpload(1: string location);
  
  string beginFileDownload(1: string file);
  //can stop downloading chunks when receive 0-length byte array back
  binary downloadChunk(1: string id);

  // returns json
  string getNimbusConf();
  // stats functions
  ClusterSummary getClusterInfo();
  TopologyInfo getTopologyInfo(1: string id) throws (1: NotAliveException e);
  //returns json
  string getTopologyConf(1: string id) throws (1: NotAliveException e);
  StormTopology getTopology(1: string id) throws (1: NotAliveException e);
  StormTopology getUserTopology(1: string id) throws (1: NotAliveException e);
}
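The three upload calls in the interface above suggest a simple chunked protocol: ask for a location, send the bytes piece by piece, then signal completion. The Python sketch below illustrates that flow against an in-memory stand-in. `FakeNimbus`, `upload`, and the `chunk_size` parameter are hypothetical names introduced here; a real client would go through the generated Thrift client instead.

```python
import uuid


class FakeNimbus:
    """In-memory stand-in (hypothetical) for the three upload calls."""

    def __init__(self):
        self.files = {}

    def beginFileUpload(self):
        # Hand back an opaque location the client uses for later calls.
        location = "upload-" + uuid.uuid4().hex
        self.files[location] = bytearray()
        return location

    def uploadChunk(self, location, chunk):
        self.files[location].extend(chunk)

    def finishFileUpload(self, location):
        # Freeze the buffer once the client says it is done.
        self.files[location] = bytes(self.files[location])


def upload(client, data, chunk_size=4):
    """Send data in chunk_size pieces and return the upload location."""
    location = client.beginFileUpload()
    for start in range(0, len(data), chunk_size):
        client.uploadChunk(location, data[start:start + chunk_size])
    client.finishFileUpload(location)
    return location
```

The location returned by the upload is presumably what the client then passes as uploadedJarLocation when it calls submitTopology.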

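Running the command  nohup ${STORM_HOME}/bin/storm nimbus &  starts the Nimbus service. The code path is as follows.

In the storm Python script, the nimbus command launches the backtype.storm.daemon.nimbus class by default: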

def nimbus(klass="backtype.storm.daemon.nimbus"):
    """Syntax: [storm nimbus]

    Launches the nimbus daemon. This command should be run under 
    supervision with a tool like daemontools or monit. 

    See Setting up a Storm cluster for more information.
    (http://storm.incubator.apache.org/documentation/Setting-up-a-Storm-cluster)
    """
    cppaths = [CLUSTER_CONF_DIR]
    jvmopts = parse_args(confvalue("nimbus.childopts", cppaths)) + [
        "-Dlogfile.name=nimbus.log",
        "-Dlogback.configurationFile=" + STORM_DIR + "/logback/cluster.xml",
    ]
    exec_storm_class(
        klass, 
        jvmtype="-server", 
        extrajars=cppaths, 
        jvmopts=jvmopts)

Next, nimbus.clj runs. Two functions matter most here: launch-server! (the startup entry point of Nimbus) and service-handler (where the handling logic is actually defined).


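Once Nimbus is running, it exposes a number of services: topology submission, information for the UI, topology kill, rebalance, and so on. Part 1 described submitting a topology to Nimbus; the handling logic for all of these services lives in service-handler. The excerpt below is the part of service-handler that handles topology submission: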

(reify Nimbus$Iface
      (^void submitTopologyWithOpts
        [this ^String storm-name ^String uploadedJarLocation ^String serializedConf ^StormTopology topology
         ^SubmitOptions submitOptions]
        (try
          (assert (not-nil? submitOptions))
          (validate-topology-name! storm-name)
          (check-storm-active! nimbus storm-name false)
          (let [topo-conf (from-json serializedConf)]
            (try
              (validate-configs-with-schemas topo-conf)
              (catch IllegalArgumentException ex
                (throw (InvalidTopologyException. (.getMessage ex)))))
            (.validate ^backtype.storm.nimbus.ITopologyValidator (:validator nimbus)
                       storm-name
                       topo-conf
                       topology))
          (swap! (:submitted-count nimbus) inc)
          (let [storm-id (str storm-name "-" @(:submitted-count nimbus) "-" (current-time-secs))
                storm-conf (normalize-conf
                            conf
                            (-> serializedConf
                                from-json
                                (assoc STORM-ID storm-id)
                              (assoc TOPOLOGY-NAME storm-name))
                            topology)
                total-storm-conf (merge conf storm-conf)
                topology (normalize-topology total-storm-conf topology)
                storm-cluster-state (:storm-cluster-state nimbus)]
            (system-topology! total-storm-conf topology) ;; this validates the structure of the topology
            (log-message "Received topology submission for " storm-name " with conf " storm-conf)
            ;; lock protects against multiple topologies being submitted at once and
            ;; cleanup thread killing topology in b/w assignment and starting the topology
            (locking (:submit-lock nimbus)
              (setup-storm-code conf storm-id uploadedJarLocation storm-conf topology)
              (.setup-heartbeats! storm-cluster-state storm-id)
              (let [thrift-status->kw-status {TopologyInitialStatus/INACTIVE :inactive
                                              TopologyInitialStatus/ACTIVE :active}]
                (start-storm nimbus storm-name storm-id (thrift-status->kw-status (.get_initial_status submitOptions))))
              (mk-assignments nimbus)))
          (catch Throwable e
            (log-warn-error e "Topology submission exception. (topology name='" storm-name "')")
            (throw e))))
      
      (^void submitTopology
        [this ^String storm-name ^String uploadedJarLocation ^String serializedConf ^StormTopology topology]
        (.submitTopologyWithOpts this storm-name uploadedJarLocation serializedConf topology
                                 (SubmitOptions. TopologyInitialStatus/ACTIVE)))
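Submission first checks that the topology's DAG is validly connected and that no topology with the same name is already running. Nimbus then allocates resources and schedules tasks in mk-assignments and writes the resulting assignment to ZooKeeper. A watcher notices the new data and notifies the supervisor, which reads the assignment and starts new workers. Each worker is a separate JVM process. Once started, a worker launches tasks according to the task count the user configured. Each task runs in a thread; more precisely, since Storm 0.8 the thread belongs to an executor, which by default runs exactly one task.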
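The storm-id built in the listing above has the form name-count-timestamp, and the final steps run under a submit lock. The Python sketch below mirrors only those two details. `SubmissionSketch` and its fields are hypothetical, and unlike the Clojure code it also performs the name check under the lock to keep the example short.

```python
import itertools
import threading
import time


class SubmissionSketch:
    """Simplified model of how a submission gets its storm-id."""

    def __init__(self):
        self._count = itertools.count(1)  # plays the role of submitted-count
        self._lock = threading.Lock()     # plays the role of submit-lock
        self.active = {}                  # topology name -> storm-id

    def submit(self, name, now_secs=None):
        if now_secs is None:
            now_secs = int(time.time())
        with self._lock:
            # Simplification: the name check also happens under the lock here.
            if name in self.active:
                raise ValueError("topology already alive: " + name)
            storm_id = "%s-%d-%d" % (name, next(self._count), now_secs)
            self.active[name] = storm_id
            return storm_id
```

The counter keeps ids unique when two topologies are submitted within the same second, and the timestamp keeps them distinct across restarts of the process that holds the counter.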


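In executor.clj, the mk-threads :spout and mk-threads :bolt methods are what start the tasks. Each task corresponds to a spout or bolt component, and this is also where the spout's open and nextTuple methods and the bolt's prepare and execute methods get called. Combined with what Part 1 described, the call order is as follows.

Spout method call order:

declareOutputFields -> open -> nextTuple -> fail/ack or other

Bolt method call order:

declareOutputFields -> prepare -> execute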


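Note that in a spout, fail, ack, and nextTuple are called sequentially on the same thread, so avoid high-latency operations inside nextTuple: while nextTuple is blocked, pending ack and fail callbacks cannot run.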
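A small Python sketch makes the single-thread constraint easier to see. It is a model, not Storm's executor code: `CountingSpout`, `run_spout_loop`, and the exact order of steps inside one loop iteration are assumptions chosen for illustration. What it does reflect from the text is that one loop on one thread serves both the callbacks and nextTuple.

```python
class CountingSpout:
    """Hypothetical spout that records the order of calls it receives."""

    def __init__(self):
        self.calls = []

    def open(self):
        self.calls.append("open")

    def next_tuple(self):
        self.calls.append("nextTuple")

    def ack(self, msg_id):
        self.calls.append("ack:%s" % msg_id)

    def fail(self, msg_id):
        self.calls.append("fail:%s" % msg_id)


def run_spout_loop(spout, pending_events, iterations):
    """Drive the spout from a single thread for a fixed number of iterations."""
    spout.open()
    for _ in range(iterations):
        # Deliver ack/fail events that arrived since the last iteration...
        while pending_events:
            kind, msg_id = pending_events.pop(0)
            if kind == "ack":
                spout.ack(msg_id)
            else:
                spout.fail(msg_id)
        # ...then ask for the next tuple on the same thread.
        spout.next_tuple()
    return spout.calls
```

If next_tuple slept for a second in this loop, every queued ack and fail would wait that second as well, which is why long-running work belongs outside nextTuple.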


At this point, the topology is up and can start doing real work.






