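學益得 Online Classroom: fork, the Simplest yet Hardest-to-Understand System Call

For anyone new to Linux, fork is usually the concept that causes the most headaches. It looks simple, but understanding it properly is surprisingly hard. Let's start with a classic written-test question.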

#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main()
{
    int i;
    for (i = 0; i < 2; i++)
    {   
        fork();
        printf("-");
    }   
    wait(NULL);
    wait(NULL);

    return 0;
}

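Question: how many '-' characters does this program print in total?

Type the code in and run it to see what you get. We'll come back to the answer later.

Let's begin with the man page for fork.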

FORK(2)       Linux Programmer's Manual       FORK(2)

NAME
       fork - create a child process

SYNOPSIS
       #include <unistd.h>

       pid_t fork(void);              
         
RETURN VALUE
       On success, the PID of the child process is returned in the parent, and 0 is returned in the child.  On  failure,
       -1 is returned in the parent, no child process is created, and errno is set appropriately.

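The wording is straightforward, so I'll leave it as it is.

fork is called once but returns twice: it returns the child's process ID to the parent and 0 to the child. The parent gets the child's ID because a parent may have many children, and the returned ID is how it keeps track of each one.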
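To make the two return values concrete, here is a minimal sketch. The function name fork_return_demo and the exit status 7 are made up for illustration. The child checks that it received 0 and that getppid() names the original process; the parent uses the PID it got back to wait for that specific child.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks once. The child exits with status 7 if getppid() reports the
 * process that called fork(), otherwise with status 1.
 * The parent returns the child's exit status, or -1 on any failure. */
int fork_return_demo(void)
{
    pid_t parent_pid = getpid();
    pid_t pid = fork();

    if (pid == -1) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: fork() returned 0 in this process. */
        _exit(getppid() == parent_pid ? 7 : 1);
    }

    /* Parent: fork() returned the child's PID, which is always positive. */
    assert(pid > 0);

    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The child leaves through _exit() so that it never runs the caller's remaining code or flushes a copy of the parent's stdio buffers.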

Now look at this code:

#include <unistd.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>

int main()
{
    pid_t pid;
    int count=0;

    pid = fork();

    printf( "This is first time, who am i? pid = %d\n", pid );

    count++;
    printf( "count = %d\n", count );

    if (-1 == pid)
    {   
        perror("fork");
    }   
    else if (pid > 0)
    {   
        printf("This is the parent process!\n");
        wait(NULL);
    }   
    else if (pid == 0)
    {   
        printf("This is the child process!\n");
    }

    printf("This is second time, who am i? pid = %d\n", pid);

    return 0;
}

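Run it and every line after the fork call is printed by both processes: the parent shows a positive pid, the child shows pid = 0, and both print count = 1.

That looks strange. Why do the "first time" and "second time" printf statements each run twice, while "count++" seems to take effect only once (count is 1 in both outputs and never reaches 2)?

Let's modify the code slightly and add comments: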

#include <unistd.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>

int main()
{
    pid_t pid;
    int count=0;

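    /* fork() creates a new process. The child receives its own copy of
       the parent's data, heap and stack (it does not share them).
       Unlike a thread, which needs an entry function, the child simply
       continues from the point where fork() returns. */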
    pid = fork();

    /* The value of pid tells us whether the parent or the child is running this line; it could be either. */
    printf( "This is first time, who am i? pid = %d\n", pid );

    if (-1 == pid)
    {   
        perror("fork");
    }   
    else if (pid > 0)
    {   
        printf("This is the parent process!\n");
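        /* fork() returned the child's PID to the parent.
           count is still 0 here because the parent never modifies it,
           which shows that the child's data and stack are separate
           from the parent's rather than shared. */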
        printf("Parent process count = %d\n", count);
        wait(NULL);
    }   
    else if (pid == 0)
    {   
        printf("This is the child process!\n");
        /* The child increments its own copy of count.
           This has no effect on the parent's count,
           which stays 0. */
        count++;
        printf("Child process count = %d\n", count);
    }

    printf("This is second time, who am i? pid = %d\n", pid);

    return 0;
}

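This time the parent prints "Parent process count = 0" and the child prints "Child process count = 1".

To read this program you need one idea firmly in mind: before the statement pid = fork(), only one process is executing this code; after it, two processes are. Their code is identical, and both continue from the point where fork returns, so the next thing each of them runs is the code that follows the fork call.

Apart from having different process IDs, parent and child also differ in the value of the variable pid, which holds fork's return value. The curious thing about fork is that it is called once but returns twice, and it has three possible return values:

In the parent, fork returns the process ID of the newly created child;
In the child, fork returns 0;
On error, fork returns -1.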

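fork usually fails for one of two reasons:

The number of processes has reached a system-imposed limit, in which case errno is set to EAGAIN;
The system is out of memory, in which case errno is set to ENOMEM.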
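The two error cases above can be told apart by inspecting errno right after fork returns -1. The sketch below is one way to do that; the helper names fork_error_hint and fork_with_report and the message strings are hypothetical, not part of any library.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: turns the errno left by a failed fork() into a hint. */
const char *fork_error_hint(int err)
{
    assert(err != 0);   /* only meaningful after fork() has returned -1 */
    switch (err) {
    case EAGAIN: return "process limit reached";
    case ENOMEM: return "not enough memory";
    default:     return strerror(err);
    }
}

/* Hypothetical wrapper: calls fork() and reports a failure on stderr.
 * Returns exactly what fork() returned. */
pid_t fork_with_report(void)
{
    pid_t pid = fork();
    if (pid == -1) {
        int saved = errno;   /* save errno before another call can change it */
        fprintf(stderr, "fork failed: %s\n", fork_error_hint(saved));
        errno = saved;
    }
    return pid;
}
```

A caller would use fork_with_report() exactly like fork() and still has to handle the -1 case itself, for example by retrying later after EAGAIN.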

Next, let's see how Advanced Programming in the UNIX Environment describes fork:

The new process created by fork is called the child process. This function is called once but returns twice. The only difference in the returns is that the return value in the child is 0, whereas the return value in the parent is the process ID of the new child. The reason the child’s process ID is returned to the parent is that a process can have more than one child, and there is no function that allows a process to obtain the process IDs of its children. The reason fork returns 0 to the child is that a process can have only a single parent, and the child can always call getppid to obtain the process ID of its parent. (Process ID 0 is reserved for use by the kernel, so it’s not possible for 0 to be the process ID of a child.)

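In short: the parent must be handed each child's PID because it may have several children and has no other call for looking them up, whereas the child only ever has one parent and can find it with getppid at any time. That is why 0, which can never be a real child PID, is enough as the child's return value.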
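The point about a parent having more than one child can be sketched as follows. The function name spawn_and_reap and the MAX_CHILDREN limit are assumptions made for this example. The parent stores every PID that fork returns and later waits for each child by that PID.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_CHILDREN 8

/* Starts n children (1 <= n <= MAX_CHILDREN). Child i exits with status i.
 * The parent records every PID that fork() returned and waits for each one
 * by PID. Returns the number of children whose exit status matched, or -1
 * if a fork() failed. */
int spawn_and_reap(int n)
{
    pid_t pids[MAX_CHILDREN];
    int matched = 0;

    assert(n >= 1 && n <= MAX_CHILDREN);

    for (int i = 0; i < n; i++) {
        pids[i] = fork();
        if (pids[i] == -1) {
            /* Reap the children already started before giving up. */
            for (int j = 0; j < i; j++)
                waitpid(pids[j], NULL, 0);
            return -1;
        }
        if (pids[i] == 0)
            _exit(i);   /* child: leave at once, never re-enter the loop */
    }

    for (int i = 0; i < n; i++) {
        int status;
        if (waitpid(pids[i], &status, 0) == pids[i] &&
            WIFEXITED(status) && WEXITSTATUS(status) == i)
            matched++;
    }
    return matched;
}
```

Without the saved PIDs the parent could still call wait(), but it would have no way to tell which child a given exit status belongs to.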

Both the child and the parent continue executing with the instruction that follows the call to fork. The child is a copy of the parent. For example, the child gets a copy of the parent’s data space, heap, and stack. Note that this is a copy for the child; the parent and the child do not share these portions of memory. The parent and the child share the text segment (Section 7.6).

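In other words, both processes carry on from the instruction after the fork call, and the child is a copy of the parent: it gets its own copy of the parent's data space, heap and stack instead of sharing that memory. Only the text (code) segment is shared.

Looking at this more closely, we can say:

When a program calls fork, the system sets up a new address space for the new process, covering the code, data, heap and stack segments. The new process uses the same code segment as the old one, since both run the same program. The data, heap and stack segments are typically not copied up front: the pages are shared at first and the system copies a page only when one of the processes writes to it (copy-on-write). The effect is that the child starts with all of the parent's data, but from the moment it begins to run, the two sets of data are logically separate. A write in one process is never visible in the other, so they do not share any of this data.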
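The separation described above also applies to heap memory, not just to a local variable such as count. The following sketch uses a made-up function name, child_writes_are_private, and arbitrary values. The child changes one stack value and one heap value that existed before the fork and reports through its exit status; the parent then checks that its own copies are unchanged.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* The child overwrites a heap value and a stack value that existed before
 * fork(). Returns 1 if the child saw its own writes while the parent's
 * copies stayed unchanged, 0 if not, and -1 on failure. */
int child_writes_are_private(void)
{
    int on_stack = 10;
    int *on_heap = malloc(sizeof *on_heap);
    if (on_heap == NULL)
        return -1;
    *on_heap = 20;

    pid_t pid = fork();
    if (pid == -1) {
        free(on_heap);
        return -1;
    }
    if (pid == 0) {
        /* Child: these writes go to the child's own copy of the memory. */
        on_stack += 1;
        *on_heap += 1;
        _exit(on_stack == 11 && *on_heap == 21 ? 0 : 1);
    }

    assert(pid > 0);   /* parent from here on */

    int status;
    int ok = waitpid(pid, &status, 0) == pid &&
             WIFEXITED(status) && WEXITSTATUS(status) == 0 &&
             on_stack == 10 && *on_heap == 20;   /* parent's values untouched */
    free(on_heap);
    return ok;
}
```

Note that on_heap holds the same address in both processes. After the fork that address refers to different physical memory in each process as soon as one of them writes to it.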

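fork() does more than create a child with the same data as the parent. The parent's whole execution context at the point of the fork is carried over to the child as well, including:

Global and local variables
Open file descriptors (the child's descriptors refer to the same open files as the parent's, so file offsets are shared)
Shared memory attachments, message queues and similar IPC and synchronization objects
Signal settings (handlers and the signal mask; pending signals are not inherited)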
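Inherited file descriptors are what make a pipe between parent and child possible. In this sketch (read_from_child is a hypothetical name) the pipe is created before the fork, so the child already holds both ends when it starts and can write to the parent without opening anything.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Creates a pipe, forks, and lets the child write msg through the write end
 * it inherited. The parent reads up to cap-1 bytes into buf and
 * NUL-terminates it. Returns the number of bytes read, or -1 if the pipe
 * or the fork could not be created. */
ssize_t read_from_child(const char *msg, char *buf, size_t cap)
{
    int fds[2];

    assert(cap > 0);
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: fds[] is a copy, but it refers to the same pipe. */
        close(fds[0]);
        ssize_t w = write(fds[1], msg, strlen(msg));
        _exit(w == (ssize_t)strlen(msg) ? 0 : 1);
    }

    close(fds[1]);   /* parent keeps only the read end */
    size_t total = 0;
    ssize_t n;
    while (total < cap - 1 &&
           (n = read(fds[0], buf + total, cap - 1 - total)) > 0)
        total += (size_t)n;
    buf[total] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (ssize_t)total;
}
```

Each side closes the end it does not use. If the parent kept the write end open, its read loop would never see end-of-file.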

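Finally, let's return to the question from the start of the article.

The answer is 8!

First iteration:

The parent creates a child with fork, and then both parent and child execute the printf that follows. That makes two!

Second iteration:

Note that the child created in the first iteration also goes on to the second iteration, so both processes run it.

The parent forks again in its second iteration, and both resulting processes execute printf, so that is two more "-"!

The child forks a grandchild in its second iteration, and both of those processes execute printf, which adds another two "-"!

But wait, that analysis gives 6! Type the code in again, and this time add a newline to every output:

printf("-\n");

With this change the result really is 6 (when run in a terminal), just as the analysis predicts. So why does adding a newline change the result?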

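The reason is that printf is buffered. When stdout is a terminal it is line-buffered: output is held in a user-space buffer until a newline arrives, the buffer fills up, or the process exits. In the version without a newline, printf only places "-" in the buffer and nothing is actually written yet. That buffer is part of the process's memory, so fork copies it into the child together with its unflushed contents. The "-" left in the buffer by the first iteration is therefore duplicated by each fork in the second iteration, which adds two extra dashes and brings the total to 8. (If stdout is redirected to a file or a pipe it is fully buffered, and then even the "\n" version prints 8 dashes.)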

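What on earth?

Didn't follow? Let's trace it step by step.

First iteration: the parent forks child A. Parent and A each put one "-" into their own buffer. Nothing has been printed yet.
Second iteration: the parent forks child B, and A forks child C. B and C each start with a copy of a buffer that already holds one "-". All four processes then add one more "-", so every buffer holds "--".
At exit: each of the four processes flushes its buffer and prints 2 dashes, so 4 × 2 = 8.

If the trace still doesn't make it clear, leave a comment or send me a private message!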

