Threads - Linux

Threads

A thread is an entity within a process and the basic unit of CPU scheduling and dispatch; it is a smaller unit of independent execution than a process. A thread itself owns essentially no system resources, only the few that are indispensable at run time (such as a program counter, a set of registers, and a stack), but it can share all of the resources owned by its process with the other threads belonging to that same process.

A thread of execution is the smallest sequence of programmed instructions that can be managed independently by an operating system scheduler; a thread is also known as a light-weight process. The implementation of threads and processes differs from one OS to another, but in most cases a thread is contained inside a process. Multiple threads can exist within the same process and share resources such as memory, while different processes do not share these resources. In particular, the threads of a process share the latter's instructions (its code) and its context (the values that its variables reference at any given moment).

On a multiprocessor or multi-core system, threads can be truly concurrent, with every processor or core executing a separate thread simultaneously.

Many modern operating systems directly support both time-sliced and multiprocessor threading with a process scheduler. The kernel of an OS allows programmers to manipulate threads via the system call interface. Some implementations are called kernel threads, whereas a light-weight process (LWP) is a specific type of kernel thread that shares the same state and information.

Programs can have user-space threads when threading with timers, signals, or other methods to interrupt their own execution, performing a sort of ad hoc time-slicing.


How threads differ from processes

Threads differ from traditional multitasking operating system processes in that:

1. Processes are typically independent, while threads exist as subsets of a process

2. Processes carry considerably more state information than threads, whereas multiple threads within a process share process state as well as memory and other resources

3. Processes have separate address spaces, whereas threads share their address space

4. Processes interact only through system-provided inter-process communication mechanisms

5. Context switching between threads in the same process is typically faster than context switching between processes.

Systems such as Windows NT and OS/2 are said to have "cheap" threads and "expensive" processes; in other operating systems the difference is not so great, except for the cost of an address-space switch, which implies a TLB flush.


Multi-threading

Multi-threading is a widespread programming and execution model that allows multiple threads to exist within the context of a single process. These threads share the process's resources, but are able to execute independently.

This model allows a multithreaded program to operate faster on computer systems that have multiple CPUs, CPUs with multiple cores, or across a cluster of machines, because the threads of the program naturally lend themselves to truly concurrent execution. In such cases, the programmer needs to be careful to avoid race conditions and other non-intuitive behaviors.

Another use of multithreading, applicable even to single-CPU systems, is the ability for an application to remain responsive to input. In a single-threaded program, if the main execution thread blocks on a long-running task, the entire application can appear to freeze. In most cases multithreading is not the only way to keep a program responsive; non-blocking I/O and/or Unix signals can achieve similar results.

Operating systems schedule threads in one of two ways:

(1). Preemptive multitasking is generally considered the superior approach, as it allows the OS to determine when a context switch should occur. The disadvantage of preemptive multithreading is that the system may make a context switch at an inappropriate time, causing lock convoy, priority inversion, or other negative effects, which may be avoided by cooperative multithreading.

(2). Cooperative multithreading, on the other hand, relies on the threads themselves to relinquish control once they reach a stopping point. This can create problems if a thread is waiting for a resource to become available.


History

In the late 1990s, the idea of executing instructions from multiple threads simultaneously, known as simultaneous multithreading, reached desktops with Intel's Pentium 4 processor, under the name hyper-threading. It was dropped from the Intel Core and Core 2 architectures, but later reinstated in the Core i7 architecture and some Core i3 and Core i5 CPUs.


---The Problem with Threads, Edward A. Lee, UC Berkeley, 2006

Although threads seem to be a small step from sequential computation, in fact, they represent a huge step. They discard the most essential and appealing properties of sequential computation: understandability, predictability, and determinism. Threads, as a model of computation, are wildly non-deterministic, and the job of the programmer becomes one of pruning that nondeterminism.


Processes, kernel threads, user threads, and fibers

A process is the "heaviest" unit of kernel scheduling.

A kernel thread is the "lightest" unit of kernel scheduling. At least one kernel thread exists within each process. Kernel threads are preemptively multitasked if the OS's process scheduler is preemptive.

Threads are sometimes implemented in userspace libraries and are thus called user threads. The kernel is unaware of them, so they are managed and scheduled in userspace. User threads as implemented by virtual machines are also called green threads. User threads are generally fast to create and manage, but cannot take advantage of multiprocessing, and they all block if their underlying kernel thread blocks, even when some user threads are ready to run.

Fibers are an even lighter unit of scheduling which are cooperatively scheduled: a running fiber must explicitly "yield" to allow another fiber to run, which makes their implementation much easier than that of kernel or user threads. A fiber can be scheduled to run in any thread in the same process. This permits applications to gain performance improvements by managing scheduling themselves, instead of relying on the kernel scheduler (which may not be tuned for the application). Parallel programming environments such as OpenMP typically implement their tasks through fibers.


I/O and scheduling

The use of blocking system calls in user threads (as opposed to kernel threads) or fibers can be problematic: user thread and fiber implementations are typically entirely in userspace, so a blocking call made by one of them blocks the underlying kernel thread, and with it every other user thread or fiber scheduled on that kernel thread.


When to use multithreading

When multiple tasks can execute in parallel, a thread can be started for each task.


Thread creation

Using pthread_create()

#include<pthread.h>

int pthread_create (pthread_t *__restrict __newthread,           // ID of the newly created thread
                    __const pthread_attr_t *__restrict __attr,   // thread attributes
                    void *(*__start_routine) (void *),           // the new thread begins execution at start_routine
                    void *__restrict __arg);                     // the argument passed to start_routine

Return value: 0 on success; on failure an error number is returned, and the corresponding message can be obtained by passing that number to strerror(). (Pthread functions return the error number directly rather than setting errno.)


Thread exit

void pthread_exit(void* value_ptr);

Parameters: value_ptr is a pointer to the return status value.


Inter-thread synchronization -- waiting for a given thread to terminate

int pthread_join( pthread_t tid, void ** status );

Parameters: tid is the thread ID of the thread to wait for; status is a pointer to the thread's return value.

A critical section can only be used between threads of the same process, whereas a mutex can also be used between processes.


Initializing a mutex variable

int pthread_mutex_init( pthread_mutex_t *mutex, const pthread_mutexattr_t *attr );

Parameters: if attr is NULL, the mutex is initialized with the default attributes.


Locking a mutex variable

int pthread_mutex_lock ( pthread_mutex_t *mutex );

Parameters: if the mutex pointed to by mutex is already locked, the calling thread blocks until another thread unlocks it.


Locking the mutex specified by mutex without blocking, in contrast to the above: int pthread_mutex_trylock( pthread_mutex_t *mutex );

Ways a thread terminates

Three ways:

1. The thread returns from its start routine; the return value is the thread's exit code

2. The thread is cancelled by another thread in the same process

3. The thread calls pthread_exit() to exit. This is not a call to exit(): if a thread calls exit(), the whole process that the thread belongs to terminates.


Fixing the compile-time error "undefined reference to 'pthread_create'" (for example: $ g++ main.cpp -o runMain)

Reason: the pthread library is not one of the default Linux libraries, so libpthread must be specified explicitly when compiling.

Solution: add the -lpthread flag when compiling, e.g. $ g++ main.cpp -o runMain -lpthread
