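Sentinel Rate-Limiting Framework

Preface

In 2020 the pandemic broke out, and the library where A works was given an order from above: admit no more than 50 readers per day, and turn away visitors from other provinces. If you were the librarian, what would you do?

Data collection (each visitor's ID card and where they come from)
Data statistics (count the day's total visitors, and the children's section separately)
Rule enforcement (compare the current statistics against the rules)

A simple example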

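String resource = "enterLibrary";
// Blacklist rule: refuse visitors whose origin is another province
AuthorityRule authorityRule = new AuthorityRule();
authorityRule.setResource(resource);
authorityRule.setLimitApp("otherProvince");
authorityRule.setStrategy(RuleConstant.AUTHORITY_BLACK);
AuthorityRuleManager.loadRules(Collections.singletonList(authorityRule));
// Flow rule: allow at most 50. Sentinel counts per second, so "50 per day" is only the analogy
FlowRule flowRule = new FlowRule();
flowRule.setResource(resource);
flowRule.setCount(50);
FlowRuleManager.loadRules(Collections.singletonList(flowRule));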


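String name = "visitorIdCard";
String origin = "provinceOfOrigin";
ContextUtil.enter(name, origin);
Entry entry = null;
try {
    // Request to enter the library
    entry = SphU.entry(resource);
    // I am XX from XX province and I have been admitted to the library
} catch (BlockException ex) {
    // I am XX from XX province and the library refused me entry
} finally {
    if (entry != null) {
        // I have left the library
        entry.exit();
    }
    // I have gone home
    ContextUtil.exit();
}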

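Sentinel's call chain

ContextUtil.enter(name, origin);

This creates a Context object, which acts as the record (context) of one visit. It is stored in a ThreadLocal, and every call first tries to fetch it from that ThreadLocal.

name: the name of the context
origin: the call origin, that is, which app the request came from
curEntry: the current entry in the call chain
entranceNode: the entrance node of the context, under which data for the context's entries is stored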
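
The ThreadLocal lookup described above can be sketched in a few lines of plain Java. This is only an illustration under simplifying assumptions: SketchContext and SketchContextUtil are hypothetical names, not Sentinel classes, and the real Context carries more state (curEntry, entranceNode) than the two fields kept here.

```java
// Hypothetical, simplified stand-ins for Sentinel's Context and ContextUtil.
final class SketchContext {
    final String name;
    final String origin;

    SketchContext(String name, String origin) {
        this.name = name;
        this.origin = origin;
    }
}

final class SketchContextUtil {
    // One context slot per thread.
    private static final ThreadLocal<SketchContext> HOLDER = new ThreadLocal<>();

    private SketchContextUtil() {
    }

    // Reuses the thread's context if one exists, otherwise creates it.
    static SketchContext enter(String name, String origin) {
        SketchContext context = HOLDER.get();
        if (context == null) {
            context = new SketchContext(name, origin);
            HOLDER.set(context);
        }
        return context;
    }

    static SketchContext getContext() {
        return HOLDER.get();
    }

    // Clears the thread's context so the next visit starts fresh.
    static void exit() {
        HOLDER.remove();
    }
}
```

Calling enter twice on the same thread returns the same object, which is why the matching exit matters: without it, a pooled thread would carry a stale context into its next task.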

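  SphU.entry("1");
  SphU.entry("2");
  SphU.entry("3");

An Entry is the credential that shows whether a resource was acquired successfully.

parent and child: the previous and the next resource in the call chain
createTime: when the Entry was created, used later to compute rt (response time)
resourceWrapper: the resource this Entry is associated with
node: the node this Entry is associated with, which records the statistics of this resource under the current context

 entry.exit();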
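
To make the parent and child links concrete, here is a minimal sketch of how nested entries could be chained and unwound. SketchEntry and SketchCallChain are hypothetical names invented for this illustration; Sentinel's own entry class (CtEntry) holds more fields and keeps the current entry on the Context rather than on a separate chain object.

```java
// Hypothetical simplified entry: a node in a doubly linked call chain.
final class SketchEntry {
    final String resource;
    final long createTime = System.currentTimeMillis();
    SketchEntry parent;
    SketchEntry child;

    SketchEntry(String resource) {
        this.resource = resource;
    }
}

// Hypothetical holder for the current entry, standing in for the Context.
final class SketchCallChain {
    private SketchEntry curEntry;

    // Links the new entry under the current one and makes it current.
    SketchEntry entry(String resource) {
        SketchEntry created = new SketchEntry(resource);
        created.parent = curEntry;
        if (curEntry != null) {
            curEntry.child = created;
        }
        curEntry = created;
        return created;
    }

    // Entries must be released in reverse order of acquisition.
    void exit(SketchEntry entry) {
        if (entry != curEntry) {
            throw new IllegalStateException("entries must exit in reverse order");
        }
        curEntry = entry.parent;
        if (curEntry != null) {
            curEntry.child = null;
        }
    }

    SketchEntry current() {
        return curEntry;
    }
}
```

The reverse-order check is the reason exit belongs in a finally block: skipping one exit leaves every outer entry unable to unwind.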

The slot chain in Sentinel

// SphU.entry eventually calls this method
private Entry entryWithPriority(ResourceWrapper resourceWrapper, int count, boolean prioritized, Object... args)
        throws BlockException {
    Context context = ContextUtil.getContext();

    // Look up the slot chain configured for this resource
    ProcessorSlot<Object> chain = lookProcessChain(resourceWrapper);

    Entry e = new CtEntry(resourceWrapper, chain, context);

    // Invoke the slots of the resource's chain one after another
    chain.entry(context, resourceWrapper, null, count, prioritized, args);

    return e;
}

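NodeSelectorSlot: builds the storage object keyed by both resource and caller (context).
ClusterBuilderSlot: builds the resource-level storage object, which holds the resource's statistics and caller information.
StatisticSlot: records and aggregates runtime information along different dimensions.
SystemSlot: controls total inbound traffic based on system state, such as load1.
AuthoritySlot: enforces blacklist and whitelist control.
FlowSlot: applies rate limiting based on the preset flow rules and the statistics gathered by the earlier slots.
DegradeSlot: performs circuit breaking and degradation based on the statistics and the preset rules.

So Sentinel implements data collection, statistics and rule checking through different Slot implementations invoked in a fixed order. Note that slots are per resource: different resources have different Slot objects.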
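
The "one slot calls the next" structure is a chain of responsibility. The sketch below shows the pattern in isolation; SketchSlot, NamedSlot and SketchSlotChain are hypothetical names, and the slots here only record their own name instead of doing real work.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base class: each slot knows only its successor.
abstract class SketchSlot {
    private SketchSlot next;

    void setNext(SketchSlot next) {
        this.next = next;
    }

    abstract void entry(String resource, List<String> trace);

    // Hands the call to the next slot, if there is one.
    protected void fireEntry(String resource, List<String> trace) {
        if (next != null) {
            next.entry(resource, trace);
        }
    }
}

// A slot that records its name, then lets the call continue.
final class NamedSlot extends SketchSlot {
    private final String name;

    NamedSlot(String name) {
        this.name = name;
    }

    @Override
    void entry(String resource, List<String> trace) {
        trace.add(name);
        fireEntry(resource, trace);
    }
}

// Keeps the head and tail of the linked slots.
final class SketchSlotChain {
    private SketchSlot first;
    private SketchSlot last;

    void addLast(SketchSlot slot) {
        if (first == null) {
            first = slot;
        } else {
            last.setNext(slot);
        }
        last = slot;
    }

    // Runs the chain and returns the order in which slots were visited.
    List<String> entry(String resource) {
        List<String> trace = new ArrayList<>();
        if (first != null) {
            first.entry(resource, trace);
        }
        return trace;
    }
}
```

Because every slot decides when to call fireEntry, a slot can act before the rest of the chain (as a rule check would) or after it returns (as a statistics slot would).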

Storage nodes in the slots

A node holds a resource's real-time statistics, such as passQps, blockQps and rt.

NodeSelectorSlot

private volatile Map<String, DefaultNode> map = new HashMap<String, DefaultNode>(10);

@Override
public void entry(Context context, ResourceWrapper resourceWrapper, Object obj, int count, Object... args) throws Throwable {
    // Look up the DefaultNode by the name of the context.
    // Each thread creates its own Context, but contexts that share
    // a name resolve to the same DefaultNode for this resource.
    DefaultNode node = map.get(context.getName());
    if (node == null) {
        synchronized (this) {
            node = map.get(context.getName());
            if (node == null) {
                // No node exists for this context yet, so create a DefaultNode
                node = Env.nodeBuilder.buildTreeNode(resourceWrapper, null);
                // Part of the code omitted
            }
            // Attach the node as a child of the context's last node:
            // the entranceNode if curEntry.parent.curNode is null,
            // otherwise curEntry.parent.curNode
            ((DefaultNode)context.getLastNode()).addChild(node);
        }
    }
    // Make this node the current node of the context,
    // that is, assign it to curNode of the context's curEntry.
    // Context.getLastNode reads the curNode set here.
    context.setCurNode(node);
    fireEntry(context, resourceWrapper, node, count, args);
}

A DefaultNode is a storage object keyed by both resource and context.

ClusterBuilderSlot

// Note that this map is static, so it is shared by all slot instances
private static volatile Map<ResourceWrapper, ClusterNode> clusterNodeMap = new HashMap<>();
private static final Object lock = new Object();
// One ClusterNode per slot instance, that is, per resource
private volatile ClusterNode clusterNode = null;

@Override
public void entry(Context context, ResourceWrapper resourceWrapper, DefaultNode node, int count, Object... args) throws Throwable {
    if (clusterNode == null) {
        synchronized (lock) {
            if (clusterNode == null) {
                // Create the cluster node.
                clusterNode = Env.nodeBuilder.buildClusterNode();
                // Save the clusterNode in the global map (copy-on-write)
                HashMap<ResourceWrapper, ClusterNode> newMap = new HashMap<ResourceWrapper, ClusterNode>(16);
                newMap.putAll(clusterNodeMap);
                newMap.put(node.getId(), clusterNode);

                clusterNodeMap = newMap;
            }
        }
    }
    // Attach the clusterNode to the DefaultNode
    node.setClusterNode(clusterNode);

    // Part of the code omitted

    fireEntry(context, resourceWrapper, node, count, args);
}

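A ClusterNode is a storage object keyed by resource only.

From the resource's point of view: node relationship diagram

The entranceNode is the entry point of a context. It hangs directly under root and is globally unique, and every context maps to one entranceNode. A DefaultNode records the real-time data of the current call. Each DefaultNode is associated with one resource and one ClusterNode, and DefaultNodes for the same resource share the same ClusterNode.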
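
The sharing relationship can be illustrated with a small registry: one node per (resource, context) pair, each pointing at the single per-resource node. All names below (SketchClusterNode, SketchDefaultNode, SketchNodeRegistry) are hypothetical, and a single counter stands in for the full set of statistics.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical per-resource node: totals across every context.
final class SketchClusterNode {
    final LongAdder pass = new LongAdder();
}

// Hypothetical per-(resource, context) node.
final class SketchDefaultNode {
    final LongAdder pass = new LongAdder();
    final SketchClusterNode clusterNode;

    SketchDefaultNode(SketchClusterNode clusterNode) {
        this.clusterNode = clusterNode;
    }

    // Counts on this node and on the shared per-resource node.
    void addPass(int count) {
        pass.add(count);
        clusterNode.pass.add(count);
    }
}

final class SketchNodeRegistry {
    private final Map<String, SketchClusterNode> clusterNodes = new HashMap<>();
    private final Map<String, SketchDefaultNode> defaultNodes = new HashMap<>();

    // Returns the node for this resource under this context, creating it
    // (and the resource's shared cluster node) on first use.
    synchronized SketchDefaultNode nodeFor(String resource, String contextName) {
        SketchClusterNode cluster =
                clusterNodes.computeIfAbsent(resource, key -> new SketchClusterNode());
        return defaultNodes.computeIfAbsent(
                resource + "@" + contextName, key -> new SketchDefaultNode(cluster));
    }
}
```

With this layout a rule can be checked against one caller's traffic (the per-context node) or against the resource's total traffic (the shared node) without counting anything twice.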

Data collection and statistics in the slots

StatisticSlot

@Override
public void entry(Context context, ResourceWrapper resourceWrapper, DefaultNode node, int count, Object... args) throws Throwable {
    // Trigger the entry method of the next slot
    fireEntry(context, resourceWrapper, node, count, args);
    // Reaching this point means the later slots in the chain let the request
    // through (neither rate limited nor degraded), so record the statistics
    node.increaseThreadNum();
    node.addPassRequest(count);
    // Part of the code omitted
}

@Override
public void exit(Context context, ResourceWrapper resourceWrapper, int count, Object... args) {
    DefaultNode node = (DefaultNode)context.getCurNode();
    if (context.getCurEntry().getError() == null) {
        // Response time = now minus the entry's creation time, capped at TIME_DROP_VALVE
        long rt = TimeUtil.currentTimeMillis() - context.getCurEntry().getCreateTime();
        if (rt > Constants.TIME_DROP_VALVE) {
            rt = Constants.TIME_DROP_VALVE;
        }
        node.rt(rt);
        node.decreaseThreadNum();
        // Part of the code omitted
    }
    fireExit(context, resourceWrapper, count);
}

It collects the number of threads requesting the resource, the number of requests, and the response time.
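
The "count after the rest of the chain has passed" ordering is the key idea here, and it can be shown without Sentinel. In this sketch SketchStatNode and SketchStatisticSlot are hypothetical names, the later slots are reduced to a BooleanSupplier, and the response-time cap is a constructor argument chosen by the caller rather than Sentinel's own constant.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.BooleanSupplier;

// Hypothetical counters for one resource.
final class SketchStatNode {
    final AtomicInteger threadNum = new AtomicInteger();
    final LongAdder pass = new LongAdder();
    final LongAdder block = new LongAdder();
    final LongAdder totalRt = new LongAdder();
}

final class SketchStatisticSlot {
    private final SketchStatNode node;
    private final long maxRtMs; // assumed cap on a single recorded response time

    SketchStatisticSlot(SketchStatNode node, long maxRtMs) {
        this.node = node;
        this.maxRtMs = maxRtMs;
    }

    // Runs the later checks first, then records the outcome.
    boolean entry(BooleanSupplier laterSlots) {
        if (!laterSlots.getAsBoolean()) {
            node.block.increment();
            return false;
        }
        node.threadNum.incrementAndGet();
        node.pass.increment();
        return true;
    }

    // Records the capped response time and releases the thread count.
    void exit(long createTimeMs, long nowMs) {
        long rt = Math.min(nowMs - createTimeMs, maxRtMs);
        node.totalRt.add(rt);
        node.threadNum.decrementAndGet();
    }
}
```

Capping the response time keeps one pathological request from dominating the average that later rules read.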

Statistics with time-window metrics

// node.addPassRequest()
// eventually triggers addPassRequest on both the DefaultNode and the ClusterNode
public void addPassRequest(int count) {
    super.addPassRequest(count);
    this.clusterNode.addPassRequest(count);
}

// Count in both the per-second and the per-minute dimension
public void addPassRequest(int count) {
    rollingCounterInSecond.addPass(count);
    rollingCounterInMinute.addPass(count);
}

// Get the current time window, then add the count to it
public void addPass(int count) {
    WindowWrap<MetricBucket> wrap = data.currentWindow();
    wrap.value().addPass(count);
}

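// A one-second statistics interval split into 2 time windows
// rollingCounterInSecond = new ArrayMetric(2, 1000);
// A one-minute statistics interval split into 60 time windows
// rollingCounterInMinute = new ArrayMetric(60, 60 * 1000);
public abstract class LeapArray<T> {
    protected int windowLengthInMs;
    protected int sampleCount;
    protected int intervalInMs;
    protected final AtomicReferenceArray<WindowWrap<T>> array;

    /**
     * LeapArray object
     * @param sampleCount number of time windows
     * @param intervalInMs statistics interval, in milliseconds
     */
    public LeapArray(int sampleCount, int intervalInMs) {
        // Length of each time window
        this.windowLengthInMs = intervalInMs / sampleCount;
        // Statistics interval
        this.intervalInMs = intervalInMs;
        // Number of time windows
        this.sampleCount = sampleCount;
        // Array of time-window objects
        this.array = new AtomicReferenceArray<>(sampleCount);
    }

}

public class WindowWrap<T> {

    // Length of the time window, in milliseconds
    private final long windowLengthInMs;

    // Start time of the time window
    private long windowStart;

    // Holds the counters (a MetricBucket in the metrics above)
    private T value;
}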

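Take rollingCounterInSecond = new ArrayMetric(2, 1000) as an example.

Time is cut into 500 ms blocks. Suppose the current time x satisfies x % 500 = 200, so x falls in a window we will call timeId1.
Advance the time by 150 ms, or by any amount under 300 ms, and it still falls in timeId1, so requests in this span are counted in the value of the timeId1 window object.
Advance another 200 ms and the time falls in timeId2, so requests in this span are counted in the value of the timeId2 window object.
Advance another 600 ms and the time falls in timeId3, so requests in this span are counted in the value of the timeId3 window object. Because only two windows are kept, timeId1 expires.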
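
The walk-through above can be reproduced with a small, single-lock sketch of the window arithmetic. SketchLeapArray is a hypothetical class written for this article: it takes the timestamp as a parameter so the example is deterministic, uses plain arrays instead of atomic references, and treats a window as expired once it is a full interval old, which is a simplification of what a production implementation has to handle.

```java
import java.util.Arrays;

// Hypothetical simplified sliding window; not Sentinel's LeapArray.
final class SketchLeapArray {
    private final int sampleCount;
    private final int intervalInMs;
    private final int windowLengthInMs;
    private final long[] windowStarts;
    private final long[] passCounts;

    SketchLeapArray(int sampleCount, int intervalInMs) {
        this.sampleCount = sampleCount;
        this.intervalInMs = intervalInMs;
        this.windowLengthInMs = intervalInMs / sampleCount;
        this.windowStarts = new long[sampleCount];
        this.passCounts = new long[sampleCount];
        Arrays.fill(windowStarts, -1L); // -1 marks a slot never used
    }

    // Start of the window that contains timeMs.
    long windowStart(long timeMs) {
        return timeMs - timeMs % windowLengthInMs;
    }

    // Array slot for the window that contains timeMs.
    int index(long timeMs) {
        return (int) ((timeMs / windowLengthInMs) % sampleCount);
    }

    synchronized void addPass(long timeMs, int count) {
        int idx = index(timeMs);
        long start = windowStart(timeMs);
        if (windowStarts[idx] != start) {
            // The slot still holds an older window: reuse it from zero.
            windowStarts[idx] = start;
            passCounts[idx] = 0;
        }
        passCounts[idx] += count;
    }

    // Sum of the windows that are still inside the interval at nowMs.
    synchronized long pass(long nowMs) {
        long sum = 0;
        for (int i = 0; i < sampleCount; i++) {
            if (windowStarts[i] >= 0 && nowMs - windowStarts[i] < intervalInMs) {
                sum += passCounts[i];
            }
        }
        return sum;
    }
}
```

With new SketchLeapArray(2, 1000), the times 200 and 350 share slot 0, 550 lands in slot 1, and 1150 maps back to slot 0 with a new window start of 1000, which is exactly how timeId3 displaces timeId1.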

Rule checking in the slots

FlowSlot

FlowRule rule1 = new FlowRule();
rule1.setResource(KEY);
// Flow threshold: at most 20 requests are allowed through
rule1.setCount(20);
// Threshold type: QPS or thread count. QPS is the default.
rule1.setGrade(RuleConstant.FLOW_GRADE_QPS);
rule1.setLimitApp("default");

// FlowSlot eventually calls DefaultController
public boolean canPass(Node node, int acquireCount, boolean prioritized) {
    // Eventually calls node.passQps()
    int curCount = avgUsedTokens(node);
    if (curCount + acquireCount > count) {
        // Code omitted
        return false;
    }
    return true;
}

// Sum of the passes across the time windows, divided by the statistics interval
public double passQps() {
    return rollingCounterInSecond.pass() / rollingCounterInSecond.getWindowIntervalInSec();
}

FlowSlot makes good use of the data collected earlier. SystemSlot, AuthoritySlot and DegradeSlot work in a similar way.
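
To see the threshold comparison in action, the following sketch applies the same "current + requested > threshold" test to a counter that resets every second. SketchFlowChecker is a hypothetical class: it uses a fixed one-second bucket instead of the sliding windows described earlier, and takes the timestamp as an argument so the behavior is reproducible.

```java
// Hypothetical QPS check with a fixed one-second bucket.
final class SketchFlowChecker {
    private final double threshold;
    private long currentSecond = -1;
    private long passedInSecond = 0;

    SketchFlowChecker(double threshold) {
        this.threshold = threshold;
    }

    // Returns true and counts the request if it fits under the threshold.
    synchronized boolean tryPass(long nowMs, int acquireCount) {
        long second = nowMs / 1000;
        if (second != currentSecond) {
            // A new second has started: begin counting from zero.
            currentSecond = second;
            passedInSecond = 0;
        }
        if (passedInSecond + acquireCount > threshold) {
            return false;
        }
        passedInSecond += acquireCount;
        return true;
    }
}
```

With a threshold of 20, the first 20 calls within one second pass, the 21st is rejected, and the first call of the next second passes again.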

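Project structure of Sentinel

sentinel-core: the core module, where rate limiting, degradation, system protection and so on are implemented (everything discussed above)
sentinel-transport: the transport module, which provides the basic API for the monitoring server and client, plus implementations on top of different libraries (interaction with the outside world)
sentinel-dashboard: the console module, which gives visual management of connected Sentinel clients (clients with the transport module installed send their data here)
sentinel-adapter: the adapter module, which integrates common frameworks (Dubbo, Spring MVC and so on)
sentinel-extension: the extension module, mainly extended DataSource implementations (data persistence)
sentinel-benchmark: the benchmark module, which provides benchmarks for the accuracy of the core code
sentinel-demo: the sample module, showing how to use Sentinel for rate limiting, degradation and so on
sentinel-cluster: cluster mode, which provides a unified token service implementation (clients send their requests here for centralized handling)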

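Sentinel cluster mode

Suppose load testing shows that a 4C8G machine sustains at most 1500 TPS while an 8C16G machine sustains 3000 TPS. With single-machine rate limiting, the threshold can only be set to 1500, because anything higher would overwhelm the 4C8G machines.

To make full use of the hardware, frameworks such as Dubbo offer weight-based load balancing. For example, the 8C16G machine can be given twice the weight of the 4C8G machine.

In single-machine mode, the program has to set a separate threshold for each machine type.
In cluster mode, a single threshold is set for the whole cluster.

How it works

The principle of cluster rate limiting is simple. As with single-machine rate limiting, data such as QPS must be counted. The difference is that the single-machine version counts inside each instance, while the cluster version has a dedicated instance that does the counting.

token client: the cluster flow-control client. It requests tokens from its token server, and the server's response decides whether the request is limited.
token server: the cluster flow-control server. It handles requests from token clients and decides, based on the configured cluster rules, whether to issue a token (that is, whether to let the request pass).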
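
The division of labor between the two roles can be sketched as one shared counter that every client asks before proceeding. SketchTokenServer and SketchTokenClient are hypothetical names; the call is an in-process method call where a real deployment would use the network, and the per-second bucket is the same simplification used for local counting.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical token server: the only place where requests are counted.
final class SketchTokenServer {
    private final Map<String, Integer> clusterThresholds = new HashMap<>();
    private final Map<String, Integer> grantedThisSecond = new HashMap<>();
    private long currentSecond = -1;

    synchronized void setRule(String resource, int threshold) {
        clusterThresholds.put(resource, threshold);
    }

    // Grants the tokens if the cluster-wide total stays within the threshold.
    synchronized boolean requestToken(String resource, int acquireCount, long nowMs) {
        long second = nowMs / 1000;
        if (second != currentSecond) {
            currentSecond = second;
            grantedThisSecond.clear();
        }
        Integer threshold = clusterThresholds.get(resource);
        if (threshold == null) {
            return true; // no cluster rule for this resource
        }
        int granted = grantedThisSecond.getOrDefault(resource, 0);
        if (granted + acquireCount > threshold) {
            return false;
        }
        grantedThisSecond.put(resource, granted + acquireCount);
        return true;
    }
}

// Hypothetical token client: keeps no counters, only asks the server.
final class SketchTokenClient {
    private final SketchTokenServer server;

    SketchTokenClient(SketchTokenServer server) {
        this.server = server;
    }

    boolean tryEnter(String resource, long nowMs) {
        return server.requestToken(resource, 1, nowMs);
    }
}
```

In the scenario above, one rule of 4500 on the server would cover a 4C8G and an 8C16G machine together, and the load balancer's weights decide how that total is split between them.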

// In the flow rule checker
public boolean canPassCheck(FlowRule rule, Context context, DefaultNode node, int acquireCount,
                            boolean prioritized) {
    String limitApp = rule.getLimitApp();
    if (limitApp == null) {
        return true;
    }
    // In cluster mode, delegate to the cluster check
    if (rule.isClusterMode()) {
        return passClusterCheck(rule, context, node, acquireCount, prioritized);
    }
    // Otherwise use the local check
    return passLocalCheck(rule, context, node, acquireCount, prioritized);
}

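Deployment modes

Standalone: a separate token server service is started to handle requests from token clients.
Embedded: one of the sentinel-core instances is designated as the token server and starts along with the application. The other sentinel-core instances act as token clients in the cluster.

When using cluster rate limiting in production, the control plane also needs to address the following:

Automatic token server management (assigning or electing a token server)
Token server high availability: automatic failover to another machine when a server becomes unavailable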

