Android ZygoteServer: I/O Multiplexing and Copy On Write

I. The Android Zygote_Server process

Android uses three communication mechanisms when creating an app process:

Binder
LocalSocket
Pipe

There are three ways to create an application process on Android:

Zygote_Server
JNI fork
shell + app_process

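Zygote_Server is one of the most important service processes in Android. Android service processes such as System_Server and WebView_Zygote (a child of init in Android O, a child of ZygoteServer in Android Q), as well as all of our apps, share ZygoteServer as their parent process.

ZygoteServer has always used a single thread with a synchronous, non-blocking I/O model. Before Android 6.0 it multiplexed on that single thread with Linux select; from Android 6.0 onward it switched to poll. The main reasons are that select is awkward to use and by default can watch at most 1024 file descriptors (on Linux, everything is a file), whereas poll removes the limit on the number of file descriptors. One might ask: why not use epoll?

The main reason is that processes are forked far less often than, say, Handler and Looper are used. epoll offers little advantage at low concurrency, and it also has an event queue to maintain, so for a service that simply forks processes it is hardly necessary.

II. Multiplexing

[1] Before Android 6.0: single-threaded multiplexing with Linux select (ZygoteInit.java)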

    /**
     * Runs the zygote process's select loop. Accepts new connections as
     * they happen, and reads commands from connections one spawn-request's
     * worth at a time.
     *
     * @throws MethodAndArgsCaller in a child process when a main() should
     * be executed.
     */
    private static void runSelectLoop() throws MethodAndArgsCaller {
        ArrayList<FileDescriptor> fds = new ArrayList<FileDescriptor>();
        ArrayList<ZygoteConnection> peers = new ArrayList<ZygoteConnection>();
        FileDescriptor[] fdArray = new FileDescriptor[4];

        fds.add(sServerSocket.getFileDescriptor());
        peers.add(null);

        int loopCount = GC_LOOP_COUNT;
        while (true) {
            int index;

            /*
             * Call gc() before we block in select().
             * It's work that has to be done anyway, and it's better
             * to avoid making every child do it.  It will also
             * madvise() any free memory as a side-effect.
             *
             * Don't call it every time, because walking the entire
             * heap is a lot of overhead to free a few hundred bytes.
             */
            if (loopCount <= 0) {
                gc();
                loopCount = GC_LOOP_COUNT;
            } else {
                loopCount--;
            }


            try {
                fdArray = fds.toArray(fdArray);
                index = selectReadable(fdArray);
            } catch (IOException ex) {
                throw new RuntimeException("Error in select()", ex);
            }

            if (index < 0) {
                throw new RuntimeException("Error in select()");
            } else if (index == 0) {
                ZygoteConnection newPeer = acceptCommandPeer();
                peers.add(newPeer);
                fds.add(newPeer.getFileDesciptor());
            } else {
                boolean done;
                done = peers.get(index).runOnce();

                if (done) {
                    peers.remove(index);
                    fds.remove(index);
                }
            }
        }
    }

[2] From Android 6.0 onward: Linux poll

    void runSelectLoop(String abiList) throws Zygote.MethodAndArgsCaller {
        ArrayList<FileDescriptor> fds = new ArrayList<FileDescriptor>();
        ArrayList<ZygoteConnection> peers = new ArrayList<ZygoteConnection>();

        fds.add(mServerSocket.getFileDescriptor());
        peers.add(null);

        while (true) {
            StructPollfd[] pollFds = new StructPollfd[fds.size()];
            for (int i = 0; i < pollFds.length; ++i) {
                pollFds[i] = new StructPollfd();
                pollFds[i].fd = fds.get(i);
                pollFds[i].events = (short) POLLIN;
            }
            try {
                Os.poll(pollFds, -1);
            } catch (ErrnoException ex) {
                throw new RuntimeException("poll failed", ex);
            }
            for (int i = pollFds.length - 1; i >= 0; --i) {
                if ((pollFds[i].revents & POLLIN) == 0) {
                    continue;
                }
                if (i == 0) {
                    ZygoteConnection newPeer = acceptCommandPeer(abiList);
                    peers.add(newPeer);
                    fds.add(newPeer.getFileDesciptor());
                } else {
                    boolean done = peers.get(i).runOnce(this);
                    if (done) {
                        peers.remove(i);
                        fds.remove(i);
                    }
                }
            }
        }
    }
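
The poll loop above walks the descriptor indices from high to low while removing finished peers from the lists by index. One plausible reason is that removing by index only shifts the entries after that index, which a descending loop has already visited. The following is a minimal sketch of that pattern in plain Java; the class name ReverseRemoval, the method removeDone, and the sample peer names are hypothetical and not part of the AOSP source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReverseRemoval {
    /**
     * Returns a copy of peers without the entries flagged as done.
     * Indices are visited from high to low, as in the poll loop.
     */
    static List<String> removeDone(List<String> peers, boolean[] done) {
        List<String> result = new ArrayList<>(peers);
        for (int i = done.length - 1; i >= 0; --i) {
            if (done[i]) {
                // remove(int) shifts only entries at indices greater than i,
                // and those have already been visited by this loop.
                result.remove(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Index 0 stands in for the server socket, the rest for connected peers.
        List<String> peers = Arrays.asList("server", "peerA", "peerB", "peerC");
        boolean[] done = {false, true, false, true};
        System.out.println(removeDone(peers, done));
    }
}
```

With an ascending loop, removing index 1 would move "peerB" into slot 1 and the flags in done would no longer line up with the list.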

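[3] What is multiplexing?

In data communication and computer network systems, the bandwidth or capacity of a transmission medium often exceeds what a single signal needs. To use the line efficiently, one channel is made to carry several signals at the same time; this is known as multiplexing. Applied to I/O, it means one thread waits on many file descriptors at once. As a loose analogy, think of the message queue of the main-thread Looper.

[4] How does the single-threaded Zygote_Server handle socket requests?

In the usual multithreaded Java model, a dedicated thread is created to handle each socket's traffic, because otherwise blocking becomes severe. So why does Zygote_Server not block, or at least not noticeably, while still communicating normally with its clients?

The main reason is that much of the work in an operating system needs little processing in user space; even the data copies behind I/O, such as copy_to_user and copy_from_user, are carried out by the kernel. When a Java thread or socket blocks, it is not because user space is busy doing something, but because user space is waiting for the kernel to signal that a task has finished: the thread is either in a wait state or suspended until an interrupt wakes it.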
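
To make the idea concrete, here is a minimal sketch of one thread serving several channels, written with the standard java.nio Selector API instead of the Os.poll call that Zygote uses. Note the substitution: on Linux a Selector is typically backed by epoll rather than poll, so this only illustrates the single-threaded multiplexing pattern. The class name SingleThreadMultiplex and the method drain are hypothetical, and the sketch assumes short, non-empty messages.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class SingleThreadMultiplex {
    /** Writes one message into each of several pipes, then reads them all from one thread. */
    static List<String> drain(String... messages) throws IOException {
        List<String> received = new ArrayList<>();
        List<Pipe> pipes = new ArrayList<>();
        try (Selector selector = Selector.open()) {
            for (String message : messages) {
                if (message.isEmpty()) {
                    throw new IllegalArgumentException("messages must be non-empty");
                }
                Pipe pipe = Pipe.open();
                pipes.add(pipe);
                // The read end must be non-blocking before it can be registered.
                pipe.source().configureBlocking(false);
                pipe.source().register(selector, SelectionKey.OP_READ);
                pipe.sink().write(ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8)));
            }
            ByteBuffer buffer = ByteBuffer.allocate(256);
            while (received.size() < messages.length) {
                selector.select(); // the only place where this thread blocks
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    buffer.clear();
                    int n = ((Pipe.SourceChannel) key.channel()).read(buffer);
                    if (n > 0) {
                        received.add(new String(buffer.array(), 0, n, StandardCharsets.UTF_8));
                        key.cancel(); // one message per pipe in this sketch
                    }
                }
            }
        } finally {
            for (Pipe pipe : pipes) {
                pipe.sink().close();
                pipe.source().close();
            }
        }
        return received;
    }

    public static void main(String[] args) throws IOException {
        // The order of the results depends on which channels become ready first.
        System.out.println(drain("fork app A", "fork app B", "fork app C"));
    }
}
```

The thread sleeps inside select() until the kernel reports that at least one channel is readable, which matches the waiting behaviour described above.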

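Of course, forking on a single thread differs somewhat from I/O, because fork does consume CPU time on behalf of the caller. That raises two questions: why is forking a process on Android so fast, and why not use multiple threads?

The next section answers both.

III. The Copy On Write mechanism

First, the relationship between processes and threads in Linux: a process can be seen as a heavyweight thread and a thread as a lightweight process, and both consume resources. Back to the earlier questions: why is forking a process on Android so fast, and why not use multiple threads?

[1] What does fork actually do?

In essence, fork "copies" the state and data of the thread that calls it. Note that fork does not "copy" the other threads, but it does "copy" the objects those threads created. fork uses a Copy On Write mechanism (similar in spirit to Java's CopyOnWriteArrayList, except that for fork the kernel does it at the level of memory pages): until a variable or piece of state is modified, the child shares the underlying resources with the parent. Because nothing is physically copied up front, fork hardly blocks at all.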
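
The Copy On Write idea described above can be sketched in plain Java. This is only an analogy under stated assumptions: the kernel works on memory pages with reference counts, while the hypothetical CowBuffer class below shares one int array and copies it on the first write after a fork. None of these names come from Android or Linux.

```java
public final class CowBuffer {
    private int[] data;      // storage, possibly shared with another CowBuffer
    private boolean shared;  // true while the storage may still be shared

    private CowBuffer(int[] data, boolean shared) {
        this.data = data;
        this.shared = shared;
    }

    public static CowBuffer of(int... values) {
        return new CowBuffer(values.clone(), false);
    }

    /** "Forks" the buffer in constant time: no element is copied yet. */
    public CowBuffer fork() {
        this.shared = true;
        return new CowBuffer(data, true);
    }

    public int get(int index) {
        return data[index];
    }

    /** The first write after a fork pays for the copy; later writes do not. */
    public void set(int index, int value) {
        if (shared) {
            // Simplification: a real implementation would use a reference count
            // and skip the copy once nobody else holds the storage.
            data = data.clone();
            shared = false;
        }
        data[index] = value;
    }

    public boolean sharesStorageWith(CowBuffer other) {
        return data == other.data;
    }

    public static void main(String[] args) {
        CowBuffer parent = CowBuffer.of(1, 2, 3);
        CowBuffer child = parent.fork();
        System.out.println("shared after fork: " + child.sharesStorageWith(parent));
        child.set(0, 99);
        System.out.println("shared after write: " + child.sharesStorageWith(parent));
        System.out.println("parent[0]=" + parent.get(0) + ", child[0]=" + child.get(0));
    }
}
```

The fork step stays cheap no matter how large the buffer is, and the cost of copying is paid only by whichever side writes first.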

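[2] Why not use multiple threads when forking?

fork "copies" only the calling thread. If fork were called from a child thread, the state and variables of the main thread of the main process could not be "copied" completely.

[3] Why not use Binder before fork?

A question that comes up often on Android is: why is Binder not used to create processes?

I have read many answers, but most of them are flawed:

Many answers say that fork would deadlock.

Counterpoint: Binder could perfectly well be used to talk to Zygote_Server before the fork, since the ServiceManager service is created earlier than Zygote_Server. Besides, using Binder does not mean the fork itself must run on multiple threads; the requests could simply be handed to a queue.

So where is the real problem?

The point of copy on write in fork is to replicate the main process (the heavyweight thread) as completely as possible. A Binder thread pool would still have to pass each message to the main thread's queue and wait there, so the advantage of Binder's multithreading would not show.
Communication with the Zygote_Server process around fork is actually infrequent, so maintaining a thread pool, and then copying that thread pool on fork, would just be extra burden.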
