A Deep Dive into Rust Asynchronous Development Patterns


2020-09-01 12:23

```rust
impl<P> Future for Pin<P>
where
    P: Unpin + ops::DerefMut<Target: Future>,
{
    type Output = <<P as ops::Deref>::Target as Future>::Output;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        Pin::get_mut(self).as_mut().poll(cx)
    }
}
```

The `Pin` type keeps data at a fixed location in memory. withoutboats wrote a [blog](https://boats.gitlab.io/blog/post/2018-01-25-async-i-self-referential-structs/) post that explains in detail what goes wrong when a self-referential struct is moved. `Pin` solves this class of problem. In the code above, `Pin::new` is also applied to `Unpin` types, so that they are handled through the same pinned interface.

The second parameter of `Future::poll` is a `Context`, which is a wrapper around a waker.

```rust
fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>;
```

The `Context` carries what is needed to resume a future. When `poll` returns `Pending`, the information needed to wake the future is kept; when the future is `Ready`, execution continues. So after every `poll` that does not return `Ready`, the `Context` is handed to the future again on the next call, which is triggered by the next event.

`Future` and `Pin` form the foundation of Rust's async/await. Putting `async` in front of a function wraps the function into a `Future`; appending `.await` to a `Future` drives its `poll`. For example:

```rust
// Assumes the async-std crate, which the article discusses below.
use async_std::{fs::File, io, prelude::*};

async fn read_file(path: &str) -> io::Result<String> {
    let mut file = File::open(path).await?;
    let mut contents = String::new();
    file.read_to_string(&mut contents).await?;
    Ok(contents)
}
```

The `async` in front of the function wraps it into a

```rust
Future<Output = io::Result<String>>
```

Inside the function, two more futures are executed: `File::open` and `read_to_string`. This raises a question: how does a `Future` that contains other futures get executed? According to the program logic, `File::open` must complete before the following steps can run. In other words, a `Future` has to execute its inner futures in order, so future execution must support nesting and composition.

There are several ways to nest and combine futures. async-std groups them as follows: join, race, try_join, try_race, flatten and delay.

- Join is the easiest to understand. There are two futures, L and R. It first checks whether L is `Ready`; if so, it checks whether R's output is available (L is not polled again). If both are available, the outputs of L and R are combined into a tuple that becomes the output of the join, and `Poll::Ready` is returned.
- TryJoin is similar to Join. It first checks whether L is `Ready`, then checks whether L's output is an error and returns `Err` if it is. It then checks whether R's output is available and, if so, combines the outputs of L and R and returns `Poll::Ready`. Compared with Join, it has one extra step: checking whether L's output is an error.
- Race means the two futures compete to finish first. It checks L first and, if L is `Ready`, takes L's result; otherwise it does the same for R. As soon as either one completes, the race is complete, the other future is ignored, and the combined future exits. L has a slight head start because it is polled first.
- TryRace relates to Race the way TryJoin relates to Join, so it is not described again.
- Flatten: the output of the first future is the input of the second, that is, nested futures such as `async { async { 1 } }`. The first future runs first; once it returns a result, that result is itself a future, which is then run, and its result is the final output.
- Delay: a future with delayed execution. It lets a future run only after a set amount of time has passed.

Combinations are not limited to two futures. There are versions for three, four or more; Tokio has a `try_join3`, for instance. Nor are these the only combinators: crates.io has crates that, besides `Flatten`, offer `FlattenSink` and `FlattenStream`, along with other ways to combine futures.

This may be a little confusing. Recall that a `Future` is an action to be performed in the future. These combinations and their ordering are a way of weaving the path along which the program will run later. But a `Future` that is never polled or awaited does nothing. This is both similar to and different from ordinary synchronous code. The difference is that reading an intermediate state requires a `poll`, and reacting to an input requires a `match` on the result of that `poll`, which feels complicated. Fortunately Rust provides async/.await, which lets us write code in the familiar sequential style, like the `read_file` function above.

As mentioned, a `Future` is an action to be performed in the future. To actually run it, the code has to call `poll` or append `.await` to the call. Futures exist to provide an asynchronous execution model, and the main reason for an asynchronous model is performance: it follows the way the machine actually works, uses event triggers, cuts tasks into small steps and schedules them more efficiently.

A complete and efficient asynchronous framework needs a few more parts on top of `Future` and `Waker`:

- executor: the foundation that runs tasks. This can be threads, coroutines, processes or some other runtime.
- scheduler: the scheduling policy, which dispatches different computations to different executors.
- park: manages threads so that they meet the scheduler's requirements.
- task: a `Future` is a small piece of work; completing a whole job needs something more powerful, which is the task.

The following sections use Tokio to illustrate these modules.

# executor: Why Native Threads

As the foundation of asynchronous concurrency, threads, green threads and coroutines are all feasible choices. Rust uses native threads.

Why native threads, and what is a native thread? Steve Klabnik, a core member of the Rust team, explains this in detail in [a talk](https://www.infoq.com/presentations/rust-2019/).

The core reason is that Rust is a "system programming language" and must have no overhead relative to C. In other words, Rust has to use the operating system's native threads so that crossing the boundary to and from C carries no extra cost.

Rust's async uses what is called "Synchronous non-blocking network I/O". This sounds contradictory, but it makes sense on closer inspection. The problem with synchronous blocking I/O is low efficiency. The problem with asynchronous non-blocking I/O is that it is inefficient for long-running operations. Asynchronous blocking lets long-running tasks be scheduled onto dedicated threads for better performance. Synchronous non-blocking I/O means writing code in a synchronous style while the calls underneath are asynchronous.

async-std puts it this way in [this blog post](https://async.rs/blog/stop-worrying-about-blocking-the-new-async-std-runtime/): "The new runtime **detects blocking** automatically. We don’t need [spawn_blocking](https://docs.rs/async-std/1.2.0/async_std/task/fn.spawn_blocking.html) anymore and can simply deprecate it". The runtime can detect blocking operations by itself, so there is no need to call `spawn_blocking` explicitly for them.

Native threads have a problem when serving I/O requests, though. With one thread per request, system resources are consumed heavily, and those threads do nothing while they wait. This is very inefficient for asynchronous workloads with a large number of requests.

Go and Erlang both solve this with green threads. Rust did not want to put more distance between itself and C, so it chose not to adopt the green thread model.

Rust drew on Nginx's event poll model and on the "Evented non-blocking IO" model of Node.js. withoutboats is a strong advocate of the Node.js model, but Node.js brought callback hell with it. JavaScript then introduced `Promise` to avoid some of the problems with callbacks, and `Promise` is where the idea of `Future` comes from.

When Twitter's engineers faced this problem, they moved from Ruby to Scala on the JVM and gained a very large performance improvement. They then wrote a paper called "[Your Server as a Function](https://monkey.org/~marius/funsrv.pdf)", which introduces the concept of a future and describes it like this:

> A future is a container used to hold the result of an asynchronous operation such as a network RPC, a timeout, or a disk I/O operation. A future is either *empty*: the result is not yet available; *succeeded*: the producer has completed and has populated the future with the result of the operation; or *failed*: the producer failed, and the future contains the resulting exception

Rust refined this idea and introduced zero-cost futures, which is what the previous part covered.

How is "Synchronous non-blocking network I/O" implemented? The core is scheduling: you write code in a synchronous style, but it runs as asynchronous calls underneath.

The next section covers the scheduler.

# scheduler: Making Async Run More Efficiently

The core job of the scheduler is to make futures and tasks run more efficiently inside the executor. Carl Lerche, a core engineer of Tokio and Mio, wrote an article on the Tokio blog: [Making the Tokio scheduler 10x faster](https://tokio.rs/blog/2019-10-scheduler/)

The article describes how the work-stealing scheduler was optimized for a 10x speedup. The heart of the optimization is the queue used for task scheduling. Tokio originally used the queue from [crossbeam](https://github.com/crossbeam-rs/crossbeam), which is a "single producer, multi-consumer" design. Following Go, Tokio changed it to a "multi-producer, single-consumer" model and added a global queue, which improved scheduling efficiency.

Let us start from the runtime to see how the scheduler is implemented.

![Figure](https://static001.geekbang.org/infoq/8a/8ac310b29da224a84e27b7437a8fb012.png)

The builder has two groups of threads. One is `core_threads`, which defaults to the number of CPU cores. The other is `max_threads`, which defaults to 512. The core threads are the main executor of the Tokio runtime; the remaining threads up to `max_threads` serve as the executor of the blocking pool. Both the core threads and the blocking threads are started when the runtime starts. When a future is run, `tokio::spawn` runs it on the core threads, and `block_on` starts it on the blocking threads. The thread scheduling mechanism differs between the two cases.

There are three scheduling modes: shell, basic_scheduler and threaded_scheduler.

Shell is not used in practice, so the two that matter are basic and threaded. Let us look at basic_scheduler first. The structure of `BasicScheduler` is:

```rust
pub(crate) struct BasicScheduler<P>
where
    P: Park,
{
    /// Scheduler component
    scheduler: Arc<SchedulerPriv>,
    /// Local state
    local: LocalState<P>,
}

struct LocalState<P> {
    /// Current tick
    tick: u8,
    /// Thread park handle
    park: P,
}
```

`LocalState` contains two important fields, `tick` and `park`. Park is another layer on top of `Waker` that gives better control over threads; it is explained separately below. `tick` builds on task state: one tick may cover many task runs. When reading from a socket, for example, many tasks can run, each reading a small piece of data, until the whole payload has been read. In basic_scheduler, `MAX_TASKS_PER_TICK` defaults to 61.

Tokio is typically used like this:

```rust
use tokio::net::TcpListener;
use tokio::prelude::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut listener = TcpListener::bind("127.0.0.1:8080").await?;

    loop {
        let (mut socket, _) = listener.accept().await?;

        tokio::spawn(async move {
            let mut buf = [0; 1024];

            // In a loop, read data from the socket and write the data back.
            loop {
                let n = match socket.read(&mut buf).await {
                    // socket closed
                    Ok(n) if n == 0 => return,
                    Ok(n) => n,
                    Err(e) => {
                        eprintln!("failed to read from socket; err = {:?}", e);
                        return;
                    }
                };

                // Write the data back
                if let Err(e) = socket.write_all(&buf[0..n]).await {
                    eprintln!("failed to write to socket; err = {:?}", e);
                    return;
                }
            }
        });
    }
}
```

The `#[tokio::main]` at the top is a macro, and the async function below it is the future that the macro takes as input. The main work of `tokio::main` is to build a runtime and then call `block_on`, passing the future into it. The flow is:

![Figure](https://static001.geekbang.org/infoq/32/32f0a495c2f5a1e5d230a93178e256dc.png)

The main job of `block_on` is to take the future and run it. The "Tick Local State" step actually runs tasks repeatedly in a loop, as follows:

![Figure](https://static001.geekbang.org/infoq/db/dbad697cfd0a801238a8271655b63ecf.png)

As described above, tasks keep running until none are left or until `MAX_TASKS_PER_TICK` tasks have been run.

The `TcpListener` example contains a `tokio::spawn(future)`. The flow of spawn is:

![Figure](https://static001.geekbang.org/infoq/4f/4fb35a7c2490ed1eb2b0da660516f667.png)

It starts a task and checks whether a scheduler exists. If it does, the future is pushed to the local task queue; if not, it is pushed to the remote queue.

Another key point is that the scheduler has an MPSC (multi-producer, single-consumer) queue. Recall the article mentioned at the start of this section: [Making the Tokio scheduler 10x faster](https://tokio.rs/blog/2019-10-scheduler/). To improve performance, Tokio optimized the crossbeam queue. This MPSC has two queues, local and remote: local holds the tasks of the current thread, and remote holds tasks pushed by other threads. When a tick fetches `next_task`, the logic checks the remote queue once every `CHECK_REMOTE_INTERVAL` iterations (13 by default). In addition, when a future is spawned and no scheduler exists, the task is pushed to the remote queue.

The difference between threaded_scheduler and basic_scheduler is that threaded is multi-threaded multitasking, while basic is single-threaded multitasking. Threaded uses as many threads as there are CPU cores and runs each task on one of those threads, for both `block_on` and `spawn`. Its flow is therefore simpler than the basic one.

Besides these, there is `spawn_blocking` for long-running blocking operations. It runs on the pool created by `create_blocking_pool` in the builder (512 threads).

The above covers the basic flow of the scheduler. Many details are still unclear, because two important pieces have not been described yet: Park and Task. Park is an enhancement on top of `Waker` that manages thread state and futures at a finer level. Task is the control module that connects futures to the threading model, and it is arguably the concept developers touch most often when writing async Rust. In Aaron Turon's definition, a task is a future that is being executed.

Park/Unpark comes first, followed by a detailed look at Task.

# Park/Unpark: Managing Threads

Both the basic and the threaded scheduler involve thread management. When `tokio::main` starts, it launches one core thread per CPU core and a thread pool of 512. Park plays the key role in managing these threads so that they meet the scheduler's requirements.

First, the internal relationships of Park:

![Figure](https://static001.geekbang.org/infoq/c7/c7e2d3d4d838807e81b85b7fc6bcfc4b.png)

The `Park` trait defines a set of state transitions, as follows:

![Figure](https://static001.geekbang.org/infoq/14/149303cede74f4d3f52f21107739ce4d.png)

A thread has three states: `EMPTY`, `PARKED` and `NOTIFIED`. `park()` and `unpark()` switch between them. The `Inner` struct and its `impl` contain a `Condvar`, the standard library's condition variable (`std::sync::Condvar`). The official description of a condition variable is:

> Condition variables represent the ability to block a thread such that it consumes no CPU time while waiting for an event to occur. Condition variables are typically associated with a boolean predicate (a condition) and a mutex. The predicate is always verified inside of the mutex before determining that a thread must block.

A condition variable can block a thread so that it uses no CPU until an event fires and the thread continues. It is usually paired with a boolean and a mutex; the boolean is wrapped in the mutex and is used to decide whether the thread should block. Internally, `Condvar` relies on `sys::condvar`, which is an operating-system-level condition variable.

Inside `park()` there is a call to `self.condvar.wait`, which blocks the thread until the condvar is notified. `unpark()` sets the state to `NOTIFIED` and also calls `self.condvar.notify_one()` to wake the thread up.

Besides `ParkThread` and `UnparkThread`, there is also `CachedParkThread`, which is used for multi-threaded blocking operations, that is, the `spawn_blocking` mentioned in the previous section.

# Task Ties Everything Together

The Tokio documentation says a task is a lightweight, non-blocking unit of execution. The Tokio task source code describes tasks as asynchronous green threads. A task is similar to a thread, but it is managed by the Tokio runtime.

The structure of Task is as follows:

![Figure](https://static001.geekbang.org/infoq/80/802f29fcb7b0b87f7fc0840cd9642111.png)

![Figure](https://static001.geekbang.org/infoq/99/99861ff6a1bcb007401955e045a9dbe3.png)

These two diagrams still do not show all of Task. It also includes more than ten structs and traits such as join, list, local, harness, queue and stack. Let us try a different approach and understand how Task works through its actual use.

Following the example in the Tokio documentation:

```rust
use tokio::net::TcpListener;
use tokio::prelude::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut listener = TcpListener::bind("127.0.0.1:8080").await?;
    loop {
        let (mut socket, _) = listener.accept().await?;
        tokio::spawn(async move {
            loop {
                // ... do somethings ...
            }
        });
    }
}
```

As described above, `tokio::main` starts `runtime::block_on` and passes the body of the async function to `block_on` as a future. With the basic scheduler, `block_on` then loops, calling `Future::poll`. The flow is:

![Figure](https://static001.geekbang.org/infoq/9c/9ccb213b8820fce202330885bdcde253.jpeg)

Inside the async body, `tokio::spawn` is called to create a new task. The code is:

```rust
pub(crate) fn spawn<F>(&self, future: F) -> JoinHandle<F::Output>
where
    F: Future + Send + 'static,
    F::Output: Send + 'static,
{
    let (task, handle) = task::joinable(future);
    self.scheduler.schedule(task);
    handle
}
```

`spawn` takes a new future and wraps it into a joinable task. `JoinHandle` wraps the task so that the task can outlive the lifetime of the future. `JoinHandle` adds logic in its `Drop` implementation:

```rust
impl<T> Drop for JoinHandle<T> {
    fn drop(&mut self) {
        if let Some(raw) = self.raw.take() {
            if raw.header().state.drop_join_handle_fast() {
                return;
            }

            raw.drop_join_handle_slow();
        }
    }
}
```

If the task can be released quickly, `drop_join_handle_fast` handles it; otherwise `drop_join_handle_slow` is called. This is implemented in `task::state`.

The `Schedule.tick` shown above was described in the scheduler section: several task runs can be grouped into one tick.

That is a brief description of Task and how it is used. With it, we have taken a quick, admittedly superficial look at the core of Tokio and its mechanisms. Tokio is still developing and Rust is still evolving, but investing in Rust and learning Tokio is well worth it, and I hope we can all keep at it.

> This article was written during the Spring Festival outbreak. In just half a year, Tokio and the Rust async ecosystem have made even more progress. Next time we will look at the latest developments in Tokio and try to build a fully asynchronous data processing platform with it.
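
The Park/Unpark section describes a thread that blocks on a condition variable in `park()` and is woken by `unpark()`, and the scheduler section describes `block_on` polling a future in a loop. The sketch below puts the two together using only the standard library. It is a minimal illustration under stated assumptions, not Tokio's implementation: `Parker`, `block_on` and `FlagFuture` are hypothetical names introduced here, and the three-state `EMPTY`/`PARKED`/`NOTIFIED` machine is reduced to a single boolean guarded by a mutex.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Condvar, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::thread;
use std::time::Duration;

/// Hypothetical parker: a "notified" flag inside a mutex, plus a condvar.
#[derive(Default)]
struct Parker {
    notified: Mutex<bool>,
    condvar: Condvar,
}

impl Parker {
    /// Block the current thread until `unpark` has been called, then reset the flag.
    fn park(&self) {
        let mut notified = self.notified.lock().unwrap();
        while !*notified {
            // `wait` releases the mutex while blocked and re-acquires it on wake-up.
            notified = self.condvar.wait(notified).unwrap();
        }
        *notified = false;
    }

    /// Set the flag and wake the parked thread, if there is one.
    fn unpark(&self) {
        *self.notified.lock().unwrap() = true;
        self.condvar.notify_one();
    }
}

/// Waking the task simply unparks the thread that is driving it.
impl Wake for Parker {
    fn wake(self: Arc<Self>) {
        self.unpark();
    }
}

/// Hypothetical single-threaded `block_on`: poll, and park while the future is pending.
fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = Box::pin(future);
    let parker = Arc::new(Parker::default());
    let waker = Waker::from(parker.clone());
    let mut cx = Context::from_waker(&waker);
    loop {
        match future.as_mut().poll(&mut cx) {
            Poll::Ready(output) => return output,
            Poll::Pending => parker.park(),
        }
    }
}

/// Toy future: becomes ready after a helper thread sets a shared flag and calls the waker.
struct FlagFuture {
    done: Arc<Mutex<bool>>,
    started: bool,
}

impl Future for FlagFuture {
    type Output = u32;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        if *self.done.lock().unwrap() {
            return Poll::Ready(42);
        }
        if !self.started {
            self.started = true;
            let done = self.done.clone();
            let waker = cx.waker().clone();
            thread::spawn(move || {
                thread::sleep(Duration::from_millis(10));
                *done.lock().unwrap() = true;
                waker.wake();
            });
        }
        Poll::Pending
    }
}

fn main() {
    let future = FlagFuture { done: Arc::new(Mutex::new(false)), started: false };
    println!("{}", block_on(future));
}
```

Keeping the flag inside the mutex means a wake-up that arrives before the thread parks is not lost: `park()` sees the flag already set and returns immediately. A real runtime also has to drive I/O events and timers while parked, which this sketch leaves out.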