【Solr Startup Internals】

What does a Solr cluster actually do when it starts up? A lot of things, period. The startup flow is roughly as follows:

1. The entry point: web.xml. Solr is, at bottom, a web application, so it must be deployed into a servlet container such as Jetty or Tomcat.

2. The SolrRequestFilter declared in web.xml is implemented by org.apache.solr.servlet.SolrDispatchFilter:

  <!-- Any path (name) registered in solrconfig.xml will be sent to that filter -->
  <filter>
    <filter-name>SolrRequestFilter</filter-name>
    <filter-class>org.apache.solr.servlet.SolrDispatchFilter</filter-class>
    <!--
    Exclude patterns is a list of directories that would be short circuited by the 
    SolrDispatchFilter. It includes all Admin UI related static content.
    NOTE: It is NOT a pattern but only matches the start of the HTTP ServletPath.
    -->
    <init-param>
      <param-name>excludePatterns</param-name>
      <param-value>/css/.+,/js/.+,/img/.+,/tpl/.+</param-value>
    </init-param>
  </filter>

  <filter-mapping>
    <!--
      NOTE: When using multicore, /admin JSP URLs with a core specified
      such as /solr/coreName/admin/stats.jsp get forwarded by a
      RequestDispatcher to /solr/admin/stats.jsp with the specified core
      put into request scope keyed as "org.apache.solr.SolrCore".

      It is unnecessary, and potentially problematic, to have the SolrDispatchFilter
      configured to also filter on forwards.  Do not configure
      this dispatcher as <dispatcher>FORWARD</dispatcher>.
    -->
    <filter-name>SolrRequestFilter</filter-name>
    <url-pattern>/*</url-pattern>
  </filter-mapping>

3. [SolrDispatchFilter] Being a servlet Filter, it must implement the three lifecycle methods init(), doFilter(), and destroy().

package javax.servlet;

import java.io.IOException;

public interface Filter {
    void init(FilterConfig filterConfig) throws ServletException;

    void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
        throws IOException, ServletException;

    void destroy();
}
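
For orientation, here is a minimal, hypothetical filter skeleton showing how an excludePatterns-style short circuit could look. The class and exact matching semantics are illustrative, not Solr's actual implementation (note the web.xml comment above, which says the real match is anchored at the start of the servlet path):

package example;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;

public class ExcludeAwareFilter implements Filter {
  private final List<Pattern> excludePatterns = new ArrayList<>();

  @Override
  public void init(FilterConfig config) throws ServletException {
    // Parse the comma-separated excludePatterns init-param from web.xml.
    String raw = config.getInitParameter("excludePatterns");
    if (raw != null) {
      for (String p : raw.split(",")) {
        excludePatterns.add(Pattern.compile(p));
      }
    }
  }

  @Override
  public void doFilter(ServletRequest req, ServletResponse resp, FilterChain chain)
      throws IOException, ServletException {
    String path = ((HttpServletRequest) req).getServletPath();
    for (Pattern p : excludePatterns) {
      if (p.matcher(path).lookingAt()) { // anchored at the start of the path
        // Static admin-UI content: short-circuit and let the container serve it.
        chain.doFilter(req, resp);
        return;
      }
    }
    // ... real request dispatching would happen here ...
    chain.doFilter(req, resp);
  }

  @Override
  public void destroy() {}
}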

4. [SolrDispatchFilter] In init(), the filter creates the CoreContainer (cores) and kicks off loading those cores. Note that load() is the single most important method for loading Solr cores.

  /**
   * Override this to change CoreContainer initialization
   * @return a CoreContainer to hold this server's cores
   */
  protected CoreContainer createCoreContainer(Path solrHome, Properties extraProperties) {
    NodeConfig nodeConfig = loadNodeConfig(solrHome, extraProperties); // read the solr.xml configuration (from ZK in SolrCloud mode) into nodeConfig
    cores = new CoreContainer(nodeConfig, extraProperties, true);
    cores.load();
    return cores;
  }
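
In SolrCloud mode, loadNodeConfig can pull solr.xml out of ZooKeeper, as the comment above notes. A bare-bones sketch of that read with the plain ZooKeeper client; the connect string, chroot, and timeout below are illustrative:

import java.nio.charset.StandardCharsets;
import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class ZkSolrXmlReader {
  public static byte[] readSolrXml(String zkHost) throws Exception {
    CountDownLatch connected = new CountDownLatch(1);
    ZooKeeper zk = new ZooKeeper(zkHost, 15000, event -> {
      if (event.getState() == Watcher.Event.KeeperState.SyncConnected) {
        connected.countDown();
      }
    });
    connected.await();
    try {
      // Solr keeps solr.xml at /solr.xml inside its ZK chroot.
      return zk.getData("/solr.xml", false, null);
    } finally {
      zk.close();
    }
  }

  public static void main(String[] args) throws Exception {
    byte[] data = readSolrXml("localhost:2181/solr"); // chroot is illustrative
    System.out.println(new String(data, StandardCharsets.UTF_8));
  }
}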

5. [SolrDispatchFilter] In a real deployment you can, after createCoreContainer has attempted to load the cores, start a separate thread that retries the cores whose load failed, giving them a second chance to come up.
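
A minimal sketch of such a retry thread, under the assumption that we track the failed core names and have some reloadCore callback. Both names are hypothetical, not Solr API:

import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;

public class FailedCoreRetrier implements Runnable {
  private final List<String> failedCores;      // mutable list of cores whose load failed
  private final Predicate<String> reloadCore;  // returns true once the core loads

  public FailedCoreRetrier(List<String> failedCores, Predicate<String> reloadCore) {
    this.failedCores = failedCores;
    this.reloadCore = reloadCore;
  }

  @Override
  public void run() {
    while (!failedCores.isEmpty() && !Thread.currentThread().isInterrupted()) {
      // Retry every failed core; drop the ones that come back successfully.
      failedCores.removeIf(reloadCore);
      try {
        TimeUnit.SECONDS.sleep(30); // back off between retry rounds
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
  }
}

// Usage (tryLoad is a hypothetical stand-in for your core-loading logic):
// new Thread(new FailedCoreRetrier(failed, name -> tryLoad(name)), "core-retrier").start();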

6. [CoreContainer] load() first calls initZooKeeper, which sets up the ZooKeeper connection and yields a ZkController instance.

7. [CoreContainer] coresLocator.discover(this) walks the Solr home directory to find the cores that need loading.

    // This is where the cores actually start loading
    // setup executor to load cores in parallel
    ExecutorService coreLoadExecutor = ExecutorUtil.newMDCAwareFixedThreadPool(
        cfg.getCoreLoadThreadCount(isZooKeeperAware()),
        new DefaultSolrThreadFactory("coreLoadExecutor") );
    final List<Future<SolrCore>> futures = new ArrayList<>();
    try {
      // Walk Solr home and discover the cores that need loading
      List<CoreDescriptor> cds = coresLocator.discover(this);
      if (isZooKeeperAware()) {
        //sort the cores if it is in SolrCloud. In standalone node the order does not matter
        CoreSorter coreComparator = new CoreSorter().init(this);
        cds = new ArrayList<>(cds);//make a copy
        Collections.sort(cds, coreComparator::compare); // sort the cores in the list
      }
      checkForDuplicateCoreNames(cds);

Here, hdfs is a single-shard, single-replica collection; during discovery the log prints:
2018-12-10 09:52:16,581 | INFO | localhost-startStop-1 | Looking for core definitions underneath /srv/BigData/solr/solrserveradmin | org.apache.solr.core.CorePropertiesLocator.discover(CorePropertiesLocator.java:125)
2018-12-10 09:52:16,610 | INFO | localhost-startStop-1 | Found 1 core definitions | org.apache.solr.core.CorePropertiesLocator.discover(CorePropertiesLocator.java:158)

The on-disk state inside the Solr instance looks like this:
host1:~ # ll /srv/BigData/solr/solrserveradmin/hdfs_shard1_replica1/
total 4
-rw------- 1 omm wheel 190 Dec 10 09:49 core.properties
host1:~ # cat /srv/BigData/solr/solrserveradmin/hdfs_shard1_replica1/core.properties
#Written by CorePropertiesLocator
#Mon Dec 10 09:49:09 CST 2018
numShards=1
collection.configName=confWithHDFS
name=hdfs_shard1_replica1
shard=shard1
collection=hdfs
coreNodeName=core_node1
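
Conceptually, discover() is just a directory walk looking for core.properties files like the one above. A sketch of the idea (not the actual CorePropertiesLocator code):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CoreDiscovery {
  /** Find every core.properties below coreRoot; each one marks a core definition. */
  public static List<Path> discover(Path coreRoot) throws IOException {
    try (Stream<Path> paths = Files.walk(coreRoot)) {
      return paths
          .filter(p -> p.getFileName().toString().equals("core.properties"))
          .collect(Collectors.toList());
    }
  }

  public static void main(String[] args) throws IOException {
    // e.g. /srv/BigData/solr/solrserveradmin from the listing above
    for (Path p : discover(Paths.get(args[0]))) {
      System.out.println("Found core definition: " + p);
    }
  }
}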

8. [CoreContainer] A thread pool then loads the cores in parallel (8 concurrent threads in SolrCloud mode).

      // Start loading the cores: iterate over every core we discovered
      for (final CoreDescriptor cd : cds) {
        if (cd.isTransient() || !cd.isLoadOnStartup()) {
          solrCores.putDynamicDescriptor(cd.getName(), cd);
        } else if (asyncSolrCoreLoad) {
          solrCores.markCoreAsLoading(cd);
        }
        if (cd.isLoadOnStartup()) {
          futures.add(coreLoadExecutor.submit(() -> {
            SolrCore core;
            try {
              if (zkSys.getZkController() != null) {
                zkSys.getZkController().throwErrorIfReplicaReplaced(cd);
              }
              
              // Create the core from its CoreDescriptor; false means: don't register it in ZK yet. More on this below.
              core = create(cd, false);
            } finally {
              if (asyncSolrCoreLoad) {
                solrCores.markCoreAsNotLoading(cd);
              }
            }
            try {
              // HERE the core is registered in ZK! This is what really brings the shard into
              // the cluster, and it does three things:
              // 1. shard leader election
              // 2. replay the transaction log (tlog) to restore state and keep data consistent
              // 3. recover data if needed (the replica catches up from the leader); recovery runs in the background
              zkSys.registerInZk(core, true);
            } catch (RuntimeException e) {
              SolrException.log(log, "Error registering SolrCore", e);
            }
            return core;
          }));
        }
      }
      // Done submitting core loads!

      // Start the background thread
      backgroundCloser = new CloserThread(this, solrCores, cfg);
      backgroundCloser.start();

    } finally {
      if (asyncSolrCoreLoad && futures != null) {

        coreContainerWorkExecutor.submit((Runnable) () -> {
          try {
            for (Future<SolrCore> future : futures) {
              try {
                future.get();
              } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
              } catch (ExecutionException e) {
                log.error("Error waiting for SolrCore to be created", e);
              }
            }
          } finally {
            ExecutorUtil.shutdownAndAwaitTermination(coreLoadExecutor);
          }
        });
      } else {
        ExecutorUtil.shutdownAndAwaitTermination(coreLoadExecutor);
      }
    }

    if (isZooKeeperAware()) {
      zkSys.getZkController().checkOverseerDesignate();
    }
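
About newMDCAwareFixedThreadPool above: the point of an MDC-aware pool is that the submitting thread's SLF4J MDC (which Solr uses to stamp log lines with, e.g., the core name) travels into the worker threads. A minimal sketch of the idea, assuming SLF4J on the classpath; this is not Solr's actual ExecutorUtil:

import java.util.Map;
import org.slf4j.MDC;

public class MdcAwareTasks {
  /** Wrap a task so it runs with the submitter's MDC logging context. */
  public static Runnable withMdc(Runnable task) {
    Map<String, String> context = MDC.getCopyOfContextMap(); // captured at submit time
    return () -> {
      if (context != null) {
        MDC.setContextMap(context);
      }
      try {
        task.run();
      } finally {
        MDC.clear(); // never leak context into pooled threads
      }
    };
  }
}

// Usage (loadOneCore is a hypothetical stand-in):
// coreLoadExecutor.submit(MdcAwareTasks.withMdc(() -> loadOneCore(cd)));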

9. [CoreContainer] Now let's look at the create and registerInZk methods in detail.

First, the create method.
Its javadoc reads "Creates a new core based on a CoreDescriptor", which is short and to the point.

  1. Inside create(), preRegister() first sets the core's state to DOWN.
  2. Because publishState is passed in as false, the core is not registered in ZK yet (registering in ZK triggers shard leader election).
  /**
   * Creates a new core based on a CoreDescriptor.
   *
   * @param dcore        a core descriptor
   * @param publishState publish core state to the cluster if true
   *
   * @return the newly created core
   */
  private SolrCore create(CoreDescriptor dcore, boolean publishState) {

    if (isShutDown) {
      throw new SolrException(ErrorCode.SERVICE_UNAVAILABLE, "Solr has been shutdown.");
    }

    SolrCore core = null;
    try {
      MDCLoggingContext.setCore(core);
      SolrIdentifierValidator.validateCoreName(dcore.getName());
      if (zkSys.getZkController() != null) {
        // 1. publish this core's info, including its DOWN state, to the Overseer job queue at /overseer/queue in ZK
        // 2. tell zkStateReader to watch state.json of the collection this core belongs to
        zkSys.getZkController().preRegister(dcore);
      }

      ConfigSet coreConfig = coreConfigService.getConfig(dcore);
      log.info("Creating SolrCore '{}' using configuration from {}", dcore.getName(), coreConfig.getName());
      core = new SolrCore(dcore, coreConfig);
      MDCLoggingContext.setCore(core);
      
      // always kick off recovery if we are in non-Cloud mode
      if (!isZooKeeperAware() && core.getUpdateHandler().getUpdateLog() != null) {
        core.getUpdateHandler().getUpdateLog().recoverFromLog();
      }

      // Note: create() was passed publishState=false, so the call below will NOT go on to invoke zkSys.registerInZk().
      registerCore(dcore.getName(), core, publishState);

      return core;
    } catch (Exception e) {
      coreInitFailures.put(dcore.getName(), new CoreLoadFailure(dcore, e));
      log.error("Error creating core [{}]: {}", dcore.getName(), e.getMessage(), e);
      final SolrException solrException = new SolrException(ErrorCode.SERVER_ERROR, "Unable to create core [" + dcore.getName() + "]", e);
      if(core != null && !core.isClosed())
        IOUtils.closeQuietly(core);
      throw solrException;
    } catch (Throwable t) {
      SolrException e = new SolrException(ErrorCode.SERVER_ERROR, "JVM Error creating core [" + dcore.getName() + "]: " + t.getMessage(), t);
      log.error("Error creating core [{}]: {}", dcore.getName(), t.getMessage(), t);
      coreInitFailures.put(dcore.getName(), new CoreLoadFailure(dcore, e));
      if(core != null && !core.isClosed())
        IOUtils.closeQuietly(core);
      throw t;
    } finally {
      MDCLoggingContext.clear();
    }

  }
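
As an aside, preRegister's "publish DOWN state" boils down to enqueuing a state message for the Overseer under /overseer/queue. A bare-bones illustration with the raw ZooKeeper client; the payload here is simplified (real Overseer messages carry more fields), and the qn- prefix is how Solr's distributed queue names its items:

import java.nio.charset.StandardCharsets;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class OverseerQueuePublisher {
  /** Enqueue a (simplified) state message under /overseer/queue. */
  public static String publishDown(ZooKeeper zk, String coreName) throws Exception {
    String json = "{\"operation\":\"state\",\"core\":\"" + coreName + "\",\"state\":\"down\"}";
    // Sequential znodes give the Overseer a FIFO queue to consume.
    return zk.create("/overseer/queue/qn-", json.getBytes(StandardCharsets.UTF_8),
        ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT_SEQUENTIAL);
  }
}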

10. [CoreContainer->ZkContainer->ZkController] registerInZk() calls zkController.register(core.getName(), core.getCoreDescriptor()) on a newly started background thread, so it does not hold up Solr startup.

Concretely, it does the following three things:

  1. Shard leader election! I recently read up on the election mechanism, and it is simple: the candidate whose election node has the smallest sequence number (equivalently, the smallest mzxid) becomes the leader. A standalone sketch of this pattern follows the register() code below.
  2. Next, data is replayed from the update log (ulog/tlog) to restore the pre-shutdown state.
  3. Finally, decide whether the replica needs data recovery.
  /**
   * Register shard with ZooKeeper.
   *
   * @return the shardId for the SolrCore
   */
  public String register(String coreName, final CoreDescriptor desc) throws Exception {
    return register(coreName, desc, false, false);
  }


  /**
   * Register shard with ZooKeeper.
   *
   * @return the shardId for the SolrCore
   */
  public String register(String coreName, final CoreDescriptor desc, boolean recoverReloadedCores, boolean afterExpiration) throws Exception {
    try (SolrCore core = cc.getCore(desc.getName())) {
      MDCLoggingContext.setCore(core);
    }
    try {
      // pre register has published our down state
      final String baseUrl = getBaseUrl();
      
      final CloudDescriptor cloudDesc = desc.getCloudDescriptor();
      final String collection = cloudDesc.getCollectionName();
      
      final String coreZkNodeName = desc.getCloudDescriptor().getCoreNodeName();
      assert coreZkNodeName != null : "we should have a coreNodeName by now";
      
      String shardId = cloudDesc.getShardId();
      Map<String,Object> props = new HashMap<>();
      // we only put a subset of props into the leader node
      props.put(ZkStateReader.BASE_URL_PROP, baseUrl);
      props.put(ZkStateReader.CORE_NAME_PROP, coreName);
      props.put(ZkStateReader.NODE_NAME_PROP, getNodeName());
      
      if (log.isInfoEnabled()) {
        log.info("Register replica - core:" + coreName + " address:" + baseUrl + " collection:"
            + cloudDesc.getCollectionName() + " shard:" + shardId);
      }
      
      ZkNodeProps leaderProps = new ZkNodeProps(props);
      
      // 1. Shard leader election! The mechanism is simple: the candidate with the smallest
      //    election node (lowest sequence number / mzxid) becomes the leader.
      try {
        // If we're a preferred leader, insert ourselves at the head of the queue
        boolean joinAtHead = false;
        Replica replica = zkStateReader.getClusterState().getReplica(desc.getCloudDescriptor().getCollectionName(),
            coreZkNodeName);
        if (replica != null) {
          joinAtHead = replica.getBool(SliceMutator.PREFERRED_LEADER_PROP, false);
        }
        joinElection(desc, afterExpiration, joinAtHead);
      } catch (InterruptedException e) {
        // Restore the interrupted status
        Thread.currentThread().interrupt();
        throw new ZooKeeperException(SolrException.ErrorCode.SERVER_ERROR, "", e);
      } catch (KeeperException | IOException e) {
        throw new ZooKeeperException(SolrException.ErrorCode.SERVER_ERROR, "", e);
      }
      
      // in this case, we want to wait for the leader as long as the leader might
      // wait for a vote, at least - but also long enough that a large cluster has
      // time to get its act together
      
      String leaderUrl = getLeader(cloudDesc, leaderVoteWait + 600000);
      
      String ourUrl = ZkCoreNodeProps.getCoreUrl(baseUrl, coreName);
      log.info("We are " + ourUrl + " and leader is " + leaderUrl);
      boolean isLeader = leaderUrl.equals(ourUrl);
      
      
      try (SolrCore core = cc.getCore(desc.getName())) {
        CoreDescriptor cd = core.getCoreDescriptor();
        if (SharedFsReplicationUtil.isZkAwareAndSharedFsReplication(cd) && !isLeader) {
          // with shared fs replication we don't init the update log until now because we need to make it read only
          // if we don't become the leader
          DelayedInitSolrCore.initIndexReaderFactory(core);
          core.getUpdateHandler().setupUlog(core, null);
          core.getSearcher(false, false, null, true);

          // the leader does this in ShardLeaderElectionContext#runLeaderProcess
        }

        // 2. Next, replay data from the tlog to restore the pre-shutdown state
        // recover from local transaction log and wait for it to complete before
        // going active
        // TODO: should this be moved to another thread? To recoveryStrat?
        // TODO: should this actually be done earlier, before (or as part of)
        // leader election perhaps?
        
        UpdateLog ulog = core.getUpdateHandler().getUpdateLog();
        
        // we will call register again after zk expiration and on reload 
        if (!afterExpiration && !core.isReloaded() && ulog != null && !SharedFsReplicationUtil.isZkAwareAndSharedFsReplication(cd)) {
          // disable recovery in case shard is in construction state (for shard splits)
          Slice slice = getClusterState().getSlice(collection, shardId);
          if (slice.getState() != Slice.State.CONSTRUCTION || !isLeader) {
            Future<UpdateLog.RecoveryInfo> recoveryFuture = core.getUpdateHandler().getUpdateLog().recoverFromLog();
            if (recoveryFuture != null) {
              log.info("Replaying tlog for " + ourUrl + " during startup... NOTE: This can take a while.");
              recoveryFuture.get(); // NOTE: this could potentially block for
              // minutes or more!
              // TODO: public as recovering in the mean time?
              // TODO: in the future we could do peersync in parallel with recoverFromLog
            } else {
              log.info("No LogReplay needed for core=" + core.getName() + " baseURL=" + baseUrl);
            }
          }
        }
        // 3. Does this replica need data recovery?
        // a. If it is the leader, no recovery is needed: the replica is published as Replica.State.ACTIVE right away.
        // b. If it is not the leader, check whether recovery is needed. If so, a new thread recovers data from the
        //    leader until both hold the same version; meanwhile the replica is in Replica.State.RECOVERING.
        // c. Once recovery completes, the replica is published as Replica.State.ACTIVE.
        boolean didRecovery = checkRecovery(coreName, desc, recoverReloadedCores, isLeader, cloudDesc, collection,
            coreZkNodeName, shardId, leaderProps, core, cc, afterExpiration);
        if (!didRecovery) {
          // publish the replica as ACTIVE
          publish(desc, Replica.State.ACTIVE);
        }
        
        core.getCoreDescriptor().getCloudDescriptor().setHasRegistered(true);
      }
      
      // make sure we have an update cluster state right away
      zkStateReader.forceUpdateCollection(collection);
      return shardId;
    } finally {
      MDCLoggingContext.clear();
    }
  }
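
To make item 1 above concrete, here is the election pattern in isolation: each candidate creates an ephemeral sequential znode under a per-shard election path, and whoever holds the smallest sequence number is the leader. A self-contained sketch with the raw ZooKeeper client; paths are illustrative, and Solr's actual LeaderElector is more involved:

import java.util.Collections;
import java.util.List;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class SimpleLeaderElection {
  /** Join the election; returns true if this candidate is currently the leader. */
  public static boolean joinAndCheck(ZooKeeper zk, String electionPath) throws Exception {
    // Ephemeral: our candidacy vanishes if our session dies.
    // Sequential: ZK appends a monotonically increasing counter to the name.
    String me = zk.create(electionPath + "/n_", new byte[0],
        ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);

    List<String> children = zk.getChildren(electionPath, false);
    Collections.sort(children); // fixed-width suffix, so string sort == sequence sort
    // Leader == the candidate with the smallest sequence number.
    // Non-leaders would normally watch the node just ahead of them and re-check when it goes away.
    return me.endsWith(children.get(0));
  }
}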

11. [ZkController] checkRecovery decides whether a recovery pass is needed.

  /**
   * Returns whether or not a recovery was started
   */
  private boolean checkRecovery(String coreName, final CoreDescriptor desc,
                                boolean recoverReloadedCores, final boolean isLeader,
                                final CloudDescriptor cloudDesc, final String collection,
                                final String shardZkNodeName, String shardId, ZkNodeProps leaderProps,
                                SolrCore core, CoreContainer cc, boolean afterExpiration) {
    if (SKIP_AUTO_RECOVERY) {
      log.warn("Skipping recovery according to sys prop solrcloud.skip.autorecovery");
      return false;
    }
    boolean doRecovery = true;
    
    // leaders don't recover, shared fs replication replicas don't recover 
    CoreDescriptor cd = core.getCoreDescriptor();
    if (!isLeader && !SharedFsReplicationUtil.isZkAwareAndSharedFsReplication(cd)) {
      log.info("I am not the leader");
      if (!afterExpiration && core.isReloaded() && !recoverReloadedCores) {
        doRecovery = false;
      }

      if (doRecovery) {
        log.info("Core needs to recover:" + core.getName());
        // doRecovery starts an asynchronous thread; the main thread is not blocked.
        core.getUpdateHandler().getSolrCoreState().doRecovery(cc, core.getCoreDescriptor());        
        return true;
      }

      // see if the leader told us to recover
      final Replica.State lirState = getLeaderInitiatedRecoveryState(collection, shardId,
          core.getCoreDescriptor().getCloudDescriptor().getCoreNodeName());
      if (lirState == Replica.State.DOWN) {
        log.info("Leader marked core " + core.getName() + " down; starting recovery process");
        core.getUpdateHandler().getSolrCoreState().doRecovery(cc, core.getCoreDescriptor());
        return true;
      }
    } else {
      log.info("I am the leader, no recovery necessary");
    }

    return false;
  }

12. [RecoveryStrategy] The recovery thread starts. Once recovery completes, the replica's state is published as Replica.State.ACTIVE.

  @Override
  public void run() {

    // set request info for logging
    try (SolrCore core = cc.getCore(coreName)) {

      if (core == null) {
        SolrException.log(LOG, "SolrCore not found - cannot recover:" + coreName);
        return;
      }
      MDCLoggingContext.setCore(core);

      LOG.info("Starting recovery process. recoveringAfterStartup=" + recoveringAfterStartup);

      try {
        // Recovery actually starts HERE!
        doRecovery(core);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        SolrException.log(LOG, "", e);
        throw new ZooKeeperException(SolrException.ErrorCode.SERVER_ERROR, "", e);
      } catch (Exception e) {
        LOG.error("", e);
        throw new ZooKeeperException(SolrException.ErrorCode.SERVER_ERROR, "", e);
      }
    } finally {
      MDCLoggingContext.clear();
    }
  }

13. [RecoveryStrategy] An overview of the recovery flow.

Recovery comes in two flavors: PeerSync and Replication. The flow tries PeerSync first, and only if that fails falls back to Replication (a sketch of this fallback follows the list).

  • PeerSync: if the outage was short and the recovering node missed only a small number of update requests, it can fetch them from the leader's update log. The threshold is 100 updates; if more than 100 were missed, a full index snapshot is pulled from the leader instead.
  • Replication: if the node was down too long to catch up from the leader's update log, or if PeerSync fails, it falls back to Solr's HTTP-based index snapshot replication.
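
A minimal sketch of that try-PeerSync-then-Replication fallback, with both strategies as hypothetical callbacks (this is not Solr's RecoveryStrategy API):

import java.util.function.BooleanSupplier;

public class RecoveryFallback {
  /** Try the cheap sync first; fall back to a full snapshot copy if it fails. */
  public static boolean recover(BooleanSupplier peerSync, BooleanSupplier replication) {
    if (peerSync.getAsBoolean()) {
      return true; // caught up from the leader's update log (at most ~100 missed updates)
    }
    // Too far behind, or PeerSync failed: pull a full index snapshot over HTTP.
    return replication.getAsBoolean();
  }
}

// Usage (tryPeerSync / tryReplication are hypothetical stand-ins for the real strategies):
// boolean ok = RecoveryFallback.recover(() -> tryPeerSync(), () -> tryReplication());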

References

https://blog.csdn.net/weixin_42257250/article/details/89512282
https://www.cnblogs.com/rcfeng/p/4145349.html
