Linux Process Address Space
high address +---------------+
             |               |
             |     Stack     |  int local_b
             |               |
             +---------------+
             |       |       |
             |       v       |
             |               |
             |               |
             |       ^       |
             |       |       |
             +---------------+
             |               |
             |     Heap      |  int * heap_c = malloc()
             |               |
             +---------------+
             |     Data      |  int global_a
             +---------------+
             |     Code      |
low address  +---------------+
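The figure above shows the address space of a Linux process. From low to high addresses, the regions are:
- Code Segment: the program's code, i.e. the instructions the CPU executes; shared and read-only
- Data Segment: can be subdivided into the initialized and uninitialized data segments; typically stores global and static variables
- Heap: dynamically allocated memory, such as memory returned by malloc(); grows toward higher addresses
- Stack: function call frames and automatic variables (variables declared inside a function without static, also called local variables); grows toward lower addresses
For a more detailed introduction, see Anatomy of a Program in Memory.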
Fork
              Parent Process              Child Process
high address +---------------+          +---------------+
             |               |          |               |
             |     Stack     |          |     Stack     |
             |               |          |               |
             +---------------+          +---------------+
             |       |       |          |       |       |
             |       v       |          |       v       |
             |               |          |               |
             |               |          |               |
             |       ^       |          |       ^       |
             |       |       |          |       |       |
             +---------------+          +---------------+
             |               |          |               |
             |     Heap      |          |     Heap      |
             |               |          |               |
             +---------------+          +---------------+
             |     Data      |          |     Data      |
             +---------------+----------+---------------+
             |                   Code                   |
low address  +------------------------------------------+
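fork is one of the most important system calls in Linux. It creates a new process by duplicating the parent's data segment, heap, and stack, while the child shares the parent's code segment, since the code segment is normally read-only. Logically, the address spaces of the child and the parent are independent: when the child modifies its own data segment, heap, or stack, the parent's memory is unaffected. Each successful call to fork returns twice: in the parent it returns the child's pid, and in the child it returns 0.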
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int global_a = 0;                          // data segment

int main(void)
{
    int local_b = 0, status;               // stack
    int *heap_c = malloc(sizeof(int));     // heap
    pid_t pid;

    if (heap_c == NULL)
        return 1;
    *heap_c = 0;                           // malloc() does not zero the memory

    pid = fork();
    if (pid < 0) {                         // fork failed, no child was created
        perror("fork");
        return 1;
    }
    if (pid == 0) {                        // child: changes only its own copies
        global_a++;
        local_b++;
        *heap_c = 1;
        exit(0);
    } else {                               // parent: wait for the child to exit
        wait(&status);
    }
    printf("global_a = %d, local_b = %d, heap_c = %d\n", global_a, local_b, *heap_c);
    free(heap_c);
    return 0;
}
The program prints:
$ ./a.out
global_a = 0, local_b = 0, heap_c = 0
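Note: to reduce the cost of fork, the kernel actually uses copy-on-write (COW): parent and child initially share the same physical pages, and a page is copied only when one of them writes to it.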
Properties shared by parent and child processes
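Although the parent and child have independent address spaces, they still have many other properties in common, which the child inherits from the parent. The following list is taken from Advanced Programming in the UNIX Environment, 3rd Edition: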
- Open file descriptors
- Real user ID, real group ID, effective user ID, effective group ID
- Supplementary group IDs
- Process group ID
- Session ID
- Controlling terminal
- The set-user-ID and set-group-ID flags
- Current working directory
- Root directory
- File mode creation mask
- Signal mask and dispositions
- The close-on-exec flag for any open file descriptors
- Environment
- Attached shared memory segments
- Memory mappings
- Resource limits