Routing strategies in SolrCloud: DocRouter, CompositeIdRouter, ImplicitDocRouter

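SolrCloud is sharded, so how does it decide which shard a document should go to? The component that answers this question is the DocRouter. When creating a collection we must specify a DocRouter for it. By default Solr uses a hash-based router (CompositeIdRouter), but there are other routers as well, and this post covers them.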

Let's look at the DocRouter source first. It declares several abstract methods:

public abstract Slice getTargetSlice(String id, SolrInputDocument sdoc, SolrParams params, DocCollection collection);

Given a SolrInputDocument, this decides which shard (slice) of a collection the document belongs to. It is used when adding documents.

public abstract Collection<Slice> getSearchSlicesSingle(String shardKey, SolrParams params, DocCollection collection);

This method decides, based on the shardKey, which shards should be searched at query time.

public abstract boolean isTargetSlice(String id, SolrInputDocument sdoc, SolrParams params, String shardId, DocCollection collection);

This checks whether the given shardId is the correct slice for a SolrInputDocument.

The role of DocRouter shows in these methods: queries and document additions call different methods to decide which shards to operate on.

Now for the implementations. First the hash-based one, HashBasedRouter. Here is how it implements the methods above:

1. getTargetSlice:

 @Override
  public Slice getTargetSlice(String id, SolrInputDocument sdoc, SolrParams params, DocCollection collection) {
    if (id == null) id = getId(sdoc, params);// get the id of this doc
    int hash = sliceHash(id, sdoc, params,collection);// hash the id; this calls Hash.murmurhash3_x86_32(id, 0, id.length(), 0), i.e. MurmurHash3
    return hashToSlice(hash, collection);// map the hash to a slice, see the method below
  }

protected Slice hashToSlice(int hash, DocCollection collection) {
    for (Slice slice : collection.getActiveSlices()) {// all active shards of the collection
      Range range = slice.getRange();// each shard owns a range of hash values
      if (range != null && range.includes(hash)) return slice;// the hash falls inside this shard's range
    }
    // No active shard covers this hash, so fail. This suggests that hash-based sharding
    // cannot simply be extended with new shards after the cluster is built: each shard's
    // range is derived from the number of shards at creation time and is then fixed.
    throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "No active slice servicing hash code " + Integer.toHexString(hash) + " in " + collection);
  }

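This method explains a lot. With the hash-based routing strategy, each shard's hash range is fixed when the collection is created, so a new shard cannot simply be added later. The layout can only change by splitting the range of an existing shard (the Collections API SPLITSHARD action), which is what the TODO about shard splitting in the next method refers to.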
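The lookup in hashToSlice can be sketched outside Solr as follows. This is a minimal illustration, not Solr code: the class and method names are hypothetical, the two ranges are hard-coded for a two-shard collection, and String.hashCode() stands in for the MurmurHash3 call, so the shard a given id lands on will differ from a real cluster.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-alone sketch of range-based shard lookup (not Solr code).
final class HashRangeLookupSketch {

    /** Inclusive range of signed 32-bit hash values, playing the role of a slice's Range. */
    static final class Range {
        final int min;
        final int max;
        Range(int min, int max) { this.min = min; this.max = max; }
        boolean includes(int hash) { return hash >= min && hash <= max; }
    }

    // Two shards that together cover the whole 32-bit hash space.
    static final Map<String, Range> SHARDS = new LinkedHashMap<>();
    static {
        SHARDS.put("shard1", new Range(Integer.MIN_VALUE, -1)); // 80000000-ffffffff
        SHARDS.put("shard2", new Range(0, Integer.MAX_VALUE));  // 0-7fffffff
    }

    /** Returns the name of the shard whose range includes the hash. */
    static String hashToShard(int hash) {
        for (Map.Entry<String, Range> e : SHARDS.entrySet()) {
            if (e.getValue().includes(hash)) return e.getKey();
        }
        throw new IllegalStateException("No shard servicing hash code " + Integer.toHexString(hash));
    }

    /** Assumption: String.hashCode() replaces MurmurHash3 purely for illustration. */
    static String targetShard(String id) {
        return hashToShard(id.hashCode());
    }

    public static void main(String[] args) {
        System.out.println(targetShard("doc-1"));
    }
}
```

Because the ranges are fixed and must cover every possible hash, there is no free hash space left for an additional shard; the only way to grow is to divide one of the existing ranges.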

2. getSearchSlicesSingle:

 @Override
  public Collection<Slice> getSearchSlicesSingle(String shardKey, SolrParams params, DocCollection collection) {
    if (shardKey == null) {// no shardKey given at query time: search all active shards, so a shard that is down is skipped by default
      // search across whole collection
      // TODO: this may need modification in the future when shard splitting could cause an overlap
      return collection.getActiveSlices();
    }

    // use the shardKey as an id for plain hashing
    Slice slice = getTargetSlice(shardKey, null, params, collection);// a shardKey was given: delegate to getTargetSlice above
    return slice == null ? Collections.<Slice>emptyList() : Collections.singletonList(slice);
  }

3. isTargetSlice is simple, so it is not shown here.

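HashBasedRouter is still an abstract class, because it does not define how the ranges are built or how they relate to the number of shards. Its implementing class is CompositeIdRouter. Look at its partitionRange method: there, the number of shards in a collection determines each shard's range of hash values. I have not fully understood this method yet; interested readers are welcome to dig into it.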
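To make the idea of partitioning concrete, here is a deliberately simplified sketch that cuts the signed 32-bit hash space into equal, contiguous ranges. It is not the Solr partitionRange implementation and makes no claim to reproduce its exact boundaries; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: equal partitioning of the 32-bit hash space (not Solr code).
final class PartitionRangeSketch {

    /** Splits [Integer.MIN_VALUE, Integer.MAX_VALUE] into inclusive {min, max} pairs. */
    static List<int[]> partition(int partitions) {
        if (partitions < 1) throw new IllegalArgumentException("partitions must be >= 1");
        long min = Integer.MIN_VALUE;
        long max = Integer.MAX_VALUE;
        long step = (max - min + 1) / partitions; // size of each range, computed in long to avoid overflow
        List<int[]> ranges = new ArrayList<>();
        long start = min;
        for (int i = 0; i < partitions; i++) {
            // The last range absorbs any remainder so the whole space stays covered.
            long end = (i == partitions - 1) ? max : start + step - 1;
            ranges.add(new int[] {(int) start, (int) end});
            start = end + 1;
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (int[] r : partition(4)) {
            System.out.println(Integer.toHexString(r[0]) + "-" + Integer.toHexString(r[1]));
        }
    }
}
```

With four partitions this yields 80000000-bfffffff, c0000000-ffffffff, 0-3fffffff and 40000000-7fffffff, which is the shape of range you would expect to see recorded per shard in clusterstate.json.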

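That covers the hash-based sharding strategy. Its drawback is that shards cannot simply be added at runtime. It is a good fit for collections whose documents follow no obvious partitioning rule.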

Another DocRouter implementation: ImplicitDocRouter

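This router sends each document to an explicitly named shard. When creating the collection we normally specify which field is the route field, and the value of that field in a document then decides which shard the document is added to. Here are its methods: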

@Override
  public Slice getTargetSlice(String id, SolrInputDocument sdoc, SolrParams params, DocCollection collection) {
    String shard = null;
    if (sdoc != null) {
      String f = getRouteField(collection);// the field used for routing, specified when the collection is created
      if(f !=null) {
        Object o = sdoc.getFieldValue(f);// the value of that field in this document
        if (o != null) shard = o.toString();// the field value is the id of the target shard
        else throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "No value for field "+f +" in " + sdoc);
      }
      if(shard == null) {// the shard was not resolved above, so fall back to the _ROUTE_ field
        Object o = sdoc.getFieldValue(_ROUTE_);// use the _ROUTE_ field
        if (o == null) o = sdoc.getFieldValue("_shard_");//deprecated . for backcompat remove later; used when there is no _ROUTE_ field
        if (o != null) {
          shard = o.toString();
        }
      }
    }

    if (shard == null) {// nothing found in sdoc above, so look in the request parameters
      shard = params.get(_ROUTE_);
      if(shard == null) shard =params.get("_shard_"); //deprecated for back compat
    }

    if (shard != null) {

      Slice slice = collection.getSlice(shard);// look the slice up directly by name
      if (slice == null) {
        throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "No shard called =" + shard + " in " + collection);
      }
      return slice;
    }

    return null;  // no shard specified... use default.
  }

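As the code shows, the router first uses the configured route field. If no route field was configured, it routes by the _ROUTE_ field of the document, and failing that by the _ROUTE_ request parameter.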
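The resolution order can be sketched with plain maps in place of SolrInputDocument and SolrParams. This is a hypothetical stand-alone illustration: the class and method names are made up, the string "_route_" is assumed to be the value of the _ROUTE_ constant, the deprecated "_shard_" fallback is left out, and the null check on params is an addition of this sketch.

```java
import java.util.Map;

// Hypothetical sketch of the implicit router's lookup order (not Solr code).
final class ImplicitRouteSketch {

    // Assumption: the value behind Solr's _ROUTE_ constant.
    static final String ROUTE = "_route_";

    /**
     * Returns the target shard name, or null when nothing names a shard.
     * routeField may be null when the collection has no router.field.
     */
    static String resolveShard(String routeField, Map<String, Object> doc, Map<String, String> params) {
        String shard = null;
        if (doc != null) {
            if (routeField != null) {
                // 1. A configured route field must be present in the document.
                Object v = doc.get(routeField);
                if (v == null) throw new IllegalArgumentException("No value for field " + routeField);
                shard = v.toString();
            }
            if (shard == null) {
                // 2. Otherwise fall back to the _route_ field of the document.
                Object v = doc.get(ROUTE);
                if (v != null) shard = v.toString();
            }
        }
        if (shard == null && params != null) {
            // 3. Finally fall back to the _route_ request parameter.
            shard = params.get(ROUTE);
        }
        return shard;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = java.util.Collections.<String, Object>singletonMap("nameField", "name1");
        System.out.println(resolveShard("nameField", doc, null));
    }
}
```

In the real router the returned name is then looked up with collection.getSlice(shard), and an unknown name is rejected with a BAD_REQUEST error.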

getSearchSlicesSingle

@Override
  public Collection<Slice> getSearchSlicesSingle(String shardKey, SolrParams params, DocCollection collection) {

    if (shardKey == null) {// no shardKey given at query time: search all active shards
      return collection.getActiveSlices();
    }

    // assume the shardKey is just a slice name
    Slice slice = collection.getSlice(shardKey);// a shardKey was given: return the shard with that name
    if (slice == null) {
      throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "implicit router can't find shard " + shardKey + " in collection " + collection.getName());
    }

    return Collections.singleton(slice);
  }

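The advantage of this routing strategy is that shards can be added dynamically at runtime. Prefer it when documents have an obvious criterion that determines which shard they belong to.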

So how do we create collections with these two routing strategies?

If router.name is not specified when creating the collection, the default is CompositeIdRouter. For example:

admin/collections?action=CREATE&name=collectionName&numShards=4&replicationFactor=2&collection.configName=collectionName&maxShardsPerNode=2

After the collection has been created, look at clusterstate.json in ZooKeeper: it contains "router":{"name":"compositeId"} (Solr 4.7.2).

If router.name=implicit is specified, you get the ImplicitDocRouter. For example:

admin/collections?action=CREATE&name=hello&replicationFactor=2&collection.configName=configName&maxShardsPerNode=10&router.name=implicit&shards=name1,name2,name3,name4&router.field=nameField
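If you issue the request from code, the query string can be assembled from a parameter map. The sketch below only builds the URL and does not send it. The base URL http://localhost:8983/solr is an assumption about where Solr is running, and the class and method names are hypothetical; the parameter names and values are the ones from the implicit-router example in this post.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that assembles a Collections API CREATE URL (does not call Solr).
final class CreateCollectionUrlSketch {

    /** Joins URL-encoded key=value pairs onto baseUrl + "/admin/collections?". */
    static String buildUrl(String baseUrl, Map<String, String> params) {
        StringBuilder sb = new StringBuilder(baseUrl).append("/admin/collections?");
        boolean first = true;
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (!first) sb.append('&');
            sb.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
            first = false;
        }
        return sb.toString();
    }

    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex); // UTF-8 is always available
        }
    }

    /** Parameters of the implicit-router example, in the same order. */
    static Map<String, String> implicitParams() {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("action", "CREATE");
        p.put("name", "hello");
        p.put("replicationFactor", "2");
        p.put("collection.configName", "configName");
        p.put("maxShardsPerNode", "10");
        p.put("router.name", "implicit");
        p.put("shards", "name1,name2,name3,name4");
        p.put("router.field", "nameField");
        return p;
    }

    public static void main(String[] args) {
        // Assumption: Solr listens on localhost:8983 under /solr.
        System.out.println(buildUrl("http://localhost:8983/solr", implicitParams()));
    }
}
```

Note that URLEncoder turns the commas in the shards value into %2C; the server decodes them back into the comma-separated shard names.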
