Threads - Linux

Threads

A thread is an entity within a process and the basic unit of CPU scheduling and dispatch; it is a smaller unit than a process that can run independently. A thread itself owns essentially no system resources, only the few resources indispensable at run time (such as a program counter, a set of registers, and a stack), but it can share all of the resources owned by the process with the other threads belonging to the same process.

A thread of execution is the smallest sequence of programmed instructions that can be managed independently by an operating system scheduler. The implementation of threads and processes differs from one OS to another, but in most cases a thread is contained inside a process. Multiple threads can exist within the same process and share resources such as memory, while different processes do not share these resources. In particular, the threads of a process share the latter's instructions (its code) and its context (the values that its variables reference at any given moment).

On a multiprocessor or multi-core system, threads can be truly concurrent, with every processor or core executing a separate thread simultaneously.

Many modern operating systems directly support both time-sliced and multiprocessor threading with a process scheduler. The kernel of an OS allows programmers to manipulate threads via the system call interface. Such implementations are called kernel threads, whereas a lightweight process (LWP) is a specific type of kernel thread that shares the same state and information.

Programs can also have user-space threads by threading with timers, signals, or other methods to interrupt their own execution, performing a sort of ad hoc time-slicing.


How threads differ from processes

Threads differ from traditional multitasking operating system processes in that:

1. Processes are typically independent, while threads exist as subsets of a process

2. Processes carry considerably more state information than threads, whereas multiple threads within a process share process state as well as memory and other resources

3. Processes have separate address spaces, whereas threads share their address space

4. Processes interact only through system-provided inter-process communication mechanisms

5. Context switching between threads in the same process is typically faster than context switching between processes.

Systems such as Windows NT and OS/2 are said to have "cheap" threads and "expensive" processes; in other operating systems there is not so great a difference, except for the cost of an address-space switch, which implies a TLB flush.


Multi-threading

Multi-threading is a widespread programming and execution model that allows multiple threads to exist within the context of a single process. These threads share the process's resources, but are able to execute independently.

This advantage of a multithreaded program allows it to operate faster on computer systems that have multiple CPUs, CPUs with multiple cores, or across a cluster of machines, because the threads of the program naturally lend themselves to truly concurrent execution. In such cases, the programmer must be careful to avoid race conditions and other non-intuitive behaviors.

Another use of multithreading, applicable even on single-CPU systems, is the ability for an application to remain responsive to input. In a single-threaded program, if the main execution thread blocks on a long-running task, the entire application can appear to freeze. In most cases multithreading is not the only way to keep a program responsive; non-blocking I/O and/or Unix signals can achieve similar results.

Operating systems schedule threads in one of two ways:

1. Preemptive multithreading is generally considered the superior approach, as it allows the OS to determine when a context switch should occur. Its disadvantage is that the system may make a context switch at an inappropriate time, causing lock convoys, priority inversion, or other negative effects, which cooperative multithreading may avoid.

2. Cooperative multithreading, on the other hand, relies on the threads themselves to relinquish control once they reach a stopping point. This can create problems if a thread is waiting for a resource to become available.


History

In the late 1990s, the idea of executing instructions from multiple threads simultaneously, known as simultaneous multithreading, reached desktops with Intel's Pentium 4 processor, under the name hyper-threading. It was dropped from the Intel Core and Core 2 architectures, but later re-instated in the Core i7 architecture and some Core i3 and Core i5 CPUs.


--- The Problem with Threads, Edward A. Lee, UC Berkeley, 2006

Although threads seem to be a small step from sequential computation, in fact, they represent a huge step. They discard the most essential and appealing properties of sequential computation: understandability, predictability, and determinism. Threads, as a model of computation, are wildly non-deterministic, and the job of the programmer becomes one of pruning that nondeterminism.


Processes, kernel threads, user threads, and fibers

A process is the "heaviest" unit of kernel scheduling.

A kernel thread is the "lightest" unit of kernel scheduling. At least one kernel thread exists within each process. Kernel threads are preemptively multitasked if the OS's process scheduler is preemptive.

Threads are sometimes implemented in userspace libraries, and are thus called user threads. The kernel is unaware of them, so they are managed and scheduled in userspace. User threads as implemented by virtual machines are also called green threads. User threads are generally fast to create and manage, but cannot take advantage of multithreading or multiprocessing, and they become blocked if all of their associated kernel threads get blocked, even when some user threads are ready to run.

Fibers are an even lighter unit of scheduling which are cooperatively scheduled: a running fiber must explicitly "yield" to allow another fiber to run, which makes their implementation much easier than kernel or user threads. A fiber can be scheduled to run in any thread in the same process. This permits applications to gain performance improvements by managing scheduling themselves, instead of relying on the kernel scheduler (which may not be tuned for the application). Parallel programming environments such as OpenMP typically implement their tasks through fibers.


I/O and scheduling

The use of blocking system calls in user threads (as opposed to kernel threads) or fibers can be problematic. User thread or fiber implementations are typically entirely in userspace.


When to use multiple threads

When several tasks can execute in parallel, a thread can be started for each task.


Creating a thread

Use pthread_create():

#include <pthread.h>

int pthread_create(pthread_t *__restrict __newthread,       // ID of the newly created thread
                   const pthread_attr_t *__restrict __attr, // thread attributes (NULL for the defaults)
                   void *(*__start_routine)(void *),        // the new thread starts executing at start_routine
                   void *__restrict __arg);                 // argument passed to start_routine

Return value: 0 on success; on failure an error number is returned (pthread functions return the error number directly instead of setting errno), and the error message can be obtained by passing that value to strerror().


Exiting a thread

void pthread_exit(void *value_ptr);

Parameter: value_ptr is a pointer to the return status value.


Synchronization between threads -- waiting for a given thread to terminate

int pthread_join(pthread_t tid, void **status);

Parameters: tid is the ID of the thread to wait for; status points to a location that receives the thread's return value.

A critical section can only be used between the threads of a single process, while a mutex can also be used between processes.


Initializing a mutex

int pthread_mutex_init(pthread_mutex_t *mutex, const pthread_mutexattr_t *attr);

Parameters: if attr is NULL, the mutex is initialized with the default attributes.


Locking a mutex

int pthread_mutex_lock(pthread_mutex_t *mutex);

Parameters: if the mutex pointed to by mutex is already locked, the calling thread is blocked until another thread unlocks the mutex.


Locking a mutex without blocking

int pthread_mutex_trylock(pthread_mutex_t *mutex);

This locks the mutex specified by mutex but, unlike pthread_mutex_lock() above, it does not block: if the mutex is already locked, the call fails immediately with EBUSY.

Ways a thread terminates

Three ways:

1. The thread returns from its start routine; the return value is the thread's exit code.

2. The thread is cancelled by another thread of the same process.

3. The thread calls pthread_exit(). Note that this is not exit(): if a thread calls exit(), the whole process that contains the thread terminates.


Fixing the compile-time error "undefined reference to 'pthread_create'" (for example: $ g++ main.cpp -o runMain)

Cause: the pthread library is not one of Linux's default libraries, so libpthread.a must be named explicitly when compiling.

Solution: add the -lpthread flag when compiling.
