Reading Notes on Professional Linux Kernel Architecture (006): System Calls for Process Management

Original

2020-04-10 15:33

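Overview

Applications usually do not invoke the process-management system calls directly; they go through an intermediate layer such as the C standard library.

The most important part of creating a process is copying the parent process into the child process.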

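Process duplication

Linux implements three system calls for duplicating a process.

fork: the heavyweight call; it creates a complete copy of the parent process.

vfork: similar to fork, but it does not copy the parent's data; the child shares it with the parent instead. To make this safe, the kernel keeps the parent blocked until the child exits or starts a new program.

clone: creates threads and gives precise control over what parent and child share and what is copied. Its fine-grained resource handling extends the usual notion of a thread and, to some extent, allows a continuous transition between threads and processes. In fact, the distinction between threads and processes is not that rigid in Linux, and the two terms are often used as synonyms.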
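The difference between sharing and copying can be observed from user space. The following is a minimal sketch, assuming Linux with glibc's clone() wrapper; the names bump_counter, counter_after_clone, and CHILD_STACK_SIZE are made up for this illustration and do not come from the book. The child increments a counter that lives in the parent, and the function reports what the parent sees afterwards.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (64 * 1024)   /* hypothetical size, ample for this demo */

/* Entry point of the child: increment the counter the parent passed in. */
static int bump_counter(void *arg)
{
    int *counter = arg;
    *counter += 1;
    return 0;                          /* becomes the child's exit code */
}

/*
 * Runs bump_counter() in a child created with clone() and returns the
 * counter value the parent observes afterwards, or -1 on error.
 * extra_flags selects what is shared, e.g. CLONE_VM or 0.
 */
int counter_after_clone(int extra_flags)
{
    int counter = 0;
    int status;
    char *stack = malloc(CHILD_STACK_SIZE);

    if (stack == NULL)
        return -1;

    /* The stack grows downwards on common architectures, so pass its top. */
    pid_t pid = clone(bump_counter, stack + CHILD_STACK_SIZE,
                      extra_flags | SIGCHLD, &counter);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    /* Wait until the child has finished before reading the counter. */
    if (waitpid(pid, &status, 0) == -1) {
        free(stack);
        return -1;
    }
    free(stack);
    return counter;
}
```

With CLONE_VM the child runs in the parent's address space, so counter_after_clone(CLONE_VM) should report the child's increment. With no sharing flags the child works on its own copy, so counter_after_clone(0) should leave the parent's value untouched.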

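Most importantly, Linux uses copy-on-write (COW). The parent's data is not copied into the child right away; instead, both address spaces point to the same physical pages, which are marked read-only. When either process tries to write to such a page, the processor reports a page fault to the kernel, and the kernel creates a copy of that page private to the writing process, which then performs the write on its own copy.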
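Copy-on-write itself is invisible to user space, but the semantics it preserves are easy to check: a write in the child must never show up in the parent. The sketch below assumes a POSIX system; the function name parent_value_after_child_write is invented for this example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Forks, lets the child overwrite a variable, and returns the value the
 * parent still sees afterwards (or -1 on error).
 */
int parent_value_after_child_write(void)
{
    /* volatile forces real memory accesses instead of register caching */
    volatile int value = 42;
    int status;

    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* On a COW kernel this write faults, and the child gets its own
         * copy of the page; the parent's page is left alone. */
        value = 99;
        _exit(value == 99 ? 0 : 1);
    }

    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    return value;   /* the parent's copy, untouched by the child */
}
```

The parent should still read its original value after the child has written 99 to its own copy.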

The entry points for these system calls are sys_fork, sys_vfork, and sys_clone. They are platform-specific; taking x86 as an example (in arch/x86/kernel/process_64.c):

asmlinkage long sys_fork(struct pt_regs *regs)
{
	return do_fork(SIGCHLD, regs->rsp, regs, 0, NULL, NULL);
}

asmlinkage long
sys_clone(unsigned long clone_flags, unsigned long newsp,
	  void __user *parent_tid, void __user *child_tid, struct pt_regs *regs)
{
	if (!newsp)
		newsp = regs->rsp;
	return do_fork(clone_flags, newsp, regs, 0, parent_tid, child_tid);
}

/*
 * This is trivial, and on the face of it looks like it
 * could equally well be done in user mode.
 *
 * Not so, for quite unobvious reasons - register pressure.
 * In user mode vfork() cannot have a stack frame, and if
 * done by calling the "clone()" system call directly, you
 * do not have enough call-clobbered registers to hold all
 * the information you need.
 */
asmlinkage long sys_vfork(struct pt_regs *regs)
{
	return do_fork(CLONE_VFORK | CLONE_VM | SIGCHLD, regs->rsp, regs, 0,
		    NULL, NULL);
}

In practice, all three end up in the platform-independent function do_fork().

/*
 *  Ok, this is the main fork-routine.
 *
 * It copies the process, and if successful kick-starts
 * it and waits for it to finish using the VM if required.
 */
long do_fork(unsigned long clone_flags,
	      unsigned long stack_start,
	      struct pt_regs *regs,
	      unsigned long stack_size,
	      int __user *parent_tidptr,
	      int __user *child_tidptr)

For the implementation of this function, refer to the book directly.
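The listings above show that fork is just the shared code path called with SIGCHLD as its only flag. As a rough user-space illustration, a fork-like call can be made through the raw clone system call with the same flag. This sketch assumes x86-64 Linux, where flags is the first raw argument (some architectures order the raw arguments differently); the function name fork_via_clone_exit_code is hypothetical. A zero stack argument asks the kernel to keep using the caller's stack pointer, which matches the `if (!newsp)` branch in sys_clone.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Creates a child through the raw clone system call using only SIGCHLD,
 * lets it exit with `code`, and returns the exit code the parent collects
 * (or -1 on error).
 */
int fork_via_clone_exit_code(int code)
{
    int status;

    /* flags = SIGCHLD, every other argument zero: fork-like behaviour */
    long pid = syscall(SYS_clone, (unsigned long)SIGCHLD, 0UL, 0UL, 0UL, 0UL);
    if (pid == -1)
        return -1;

    if (pid == 0)
        _exit(code);            /* child: leave immediately */

    if (waitpid((pid_t)pid, &status, 0) == -1)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the raw call bypasses the C library's fork bookkeeping, the child here does nothing except call _exit(); real programs should normally use fork() itself.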

Kernel threads

Kernel threads are processes started directly by the kernel itself. They are created through the following interface:

/*
 * create a kernel thread without removing it from tasklists
 */
extern long kernel_thread(int (*fn)(void *), void * arg, unsigned long flags);

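Its implementation still calls do_fork underneath (the version shown sets IA-64 registers such as cr_iip, so it appears to be the IA-64 implementation):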

pid_t
kernel_thread (int (*fn)(void *), void *arg, unsigned long flags)
{
	extern void start_kernel_thread (void);
	unsigned long *helper_fptr = (unsigned long *) &start_kernel_thread;
	struct {
		struct switch_stack sw;
		struct pt_regs pt;
	} regs;

	memset(&regs, 0, sizeof(regs));
	regs.pt.cr_iip = helper_fptr[0];	/* set entry point (IP) */
	regs.pt.r1 = helper_fptr[1];		/* set GP */
	regs.pt.r9 = (unsigned long) fn;	/* 1st argument */
	regs.pt.r11 = (unsigned long) arg;	/* 2nd argument */
	/* Preserve PSR bits, except for bits 32-34 and 37-45, which we can't read.  */
	regs.pt.cr_ipsr = ia64_getreg(_IA64_REG_PSR) | IA64_PSR_BN;
	regs.pt.cr_ifs = 1UL << 63;		/* mark as valid, empty frame */
	regs.sw.ar_fpsr = regs.pt.ar_fpsr = ia64_getreg(_IA64_REG_AR_FPSR);
	regs.sw.ar_bspstore = (unsigned long) current + IA64_RBS_OFFSET;
	regs.sw.pr = (1 << PRED_KERNEL_STACK);
	return do_fork(flags | CLONE_VM | CLONE_UNTRACED, 0, &regs.pt, 0, NULL, NULL);
}

Another way to create a kernel thread is kthread_create:

/**
 * kthread_create - create a kthread.
 * @threadfn: the function to run until signal_pending(current).
 * @data: data ptr for @threadfn.
 * @namefmt: printf-style name for the thread.
 *
 * Description: This helper function creates and names a kernel
 * thread.  The thread will be stopped: use wake_up_process() to start
 * it.  See also kthread_run(), kthread_create_on_cpu().
 *
 * When woken, the thread will run @threadfn() with @data as its
 * argument. @threadfn() can either call do_exit() directly if it is a
 * standalone thread for which noone will call kthread_stop(), or
 * return when 'kthread_should_stop()' is true (which means
 * kthread_stop() has been called).  The return value should be zero
 * or a negative error number; it will be passed to kthread_stop().
 *
 * Returns a task_struct or ERR_PTR(-ENOMEM).
 */
struct task_struct *kthread_create(int (*threadfn)(void *data),
				   void *data,
				   const char namefmt[],
				   ...)

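Starting new programs

After a process has been duplicated, a new program is started by replacing the existing program with new code.

Linux uses the execve system call for this.

Likewise, the entry point of execve is the sys_execve function: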

long
sys_execve (char __user *filename, char __user * __user *argv, char __user * __user *envp,
	    struct pt_regs *regs)
{
	char *fname;
	int error;

	fname = getname(filename);
	error = PTR_ERR(fname);
	if (IS_ERR(fname))
		goto out;
	error = do_execve(fname, argv, envp, regs);
	putname(fname);
out:
	return error;
}

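This function is platform-specific, while the do_execve it calls is platform-independent.

For the implementation of do_execve(), again refer to the book.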
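From user space, the duplicate-then-replace pattern described in this section looks roughly like the sketch below. It assumes a POSIX system with a shell at /bin/sh; the function name run_shell_command and the minimal PATH value are choices made for this example only.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Duplicates the current process, replaces the child's program with
 * /bin/sh running `command`, and returns the child's exit code
 * (or -1 on error).
 */
int run_shell_command(const char *command)
{
    int status;

    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        char *const argv[] = { "sh", "-c", (char *)command, NULL };
        char *const envp[] = { "PATH=/usr/bin:/bin", NULL };

        /* On success execve() never returns: the old program is gone. */
        execve("/bin/sh", argv, envp);
        _exit(127);             /* reached only if execve() failed */
    }

    if (waitpid(pid, &status, 0) == -1)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_shell_command("exit 7") should hand the shell's exit code back to the parent. The 127 fallback follows the common shell convention for a command that could not be executed.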

Exiting processes

A process exits through the exit system call, whose entry point is sys_exit:

asmlinkage long sys_exit(int error_code)
{
	do_exit((error_code&0xff)<<8);
}

It is platform-independent.
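The expression (error_code&0xff)<<8 explains two things a parent observes: only the low 8 bits of the exit code survive, and they land in bits 8 to 15 of the wait status. The sketch below checks this from user space, assuming Linux; the function name raw_wait_status_for_exit is invented here. Note that the C library's _exit() typically goes through the exit_group system call rather than exit, which, as far as I know, applies the same masking and shift.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Forks a child that immediately exits with `code` and returns the raw
 * wait status the parent receives (or -1 on error).
 */
int raw_wait_status_for_exit(int code)
{
    int status;

    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0)
        _exit(code);            /* child: terminate with the given code */

    if (waitpid(pid, &status, 0) == -1)
        return -1;

    /* For a normal exit on Linux this is (code & 0xff) << 8. */
    return status;
}
```

An exit code of 3 should therefore come back as the raw status 3 << 8, and a code such as 0x1234 should be truncated to 0x34 before the shift. WEXITSTATUS() undoes the shift for the caller.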
