Linux CPU Resource Tuning

http://www.uplinux.com/shizi/wenxian/3629.html

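I) CPU affinity of interrupts

We can bind an interrupt to specific CPUs, so that interrupt handling makes better use of the available processors.

First stop the irqbalance service, which would otherwise redistribute the interrupts on its own: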

/etc/init.d/irqbalance stop

Stopping irqbalance: [ OK ]

List the CPU mask currently assigned to each interrupt (run from /proc/irq):

for f in `find . -name "smp_affinity"`; do echo -ne "$f->" && cat $f; done

./18/smp_affinity->1

./17/smp_affinity->3

./15/smp_affinity->1

./14/smp_affinity->1

./13/smp_affinity->3

./12/smp_affinity->1

./11/smp_affinity->3

./10/smp_affinity->3

./9/smp_affinity->3

./8/smp_affinity->3

./7/smp_affinity->3

./6/smp_affinity->3

./5/smp_affinity->3

./4/smp_affinity->3

./3/smp_affinity->3

./2/smp_affinity->3

./1/smp_affinity->3

./0/smp_affinity->3

View the per-CPU interrupt counts:

cat /proc/interrupts

CPU0 CPU1

0: 137 0 IO-APIC-edge timer

1: 7 1 IO-APIC-edge i8042

3: 0 1 IO-APIC-edge

4: 1 0 IO-APIC-edge

7: 0 0 IO-APIC-edge parport0

8: 0 0 IO-APIC-edge rtc0

9: 0 0 IO-APIC-fasteoi acpi

12: 103 2 IO-APIC-edge i8042

14: 2301 628 IO-APIC-edge ata_piix

15: 116 49 IO-APIC-edge ata_piix

17: 15 0 IO-APIC-fasteoi ioc0

18: 4122 39 IO-APIC-fasteoi eth0

NMI: 0 0 Non-maskable interrupts

LOC: 18423 16772 Local timer interrupts

SPU: 0 0 Spurious interrupts

PMI: 0 0 Performance monitoring interrupts

PND: 0 0 Performance pending work

RES: 2740 2914 Rescheduling interrupts

CAL: 110 1349 Function call interrupts

TLB: 339 421 TLB shootdowns

TRM: 0 0 Thermal event interrupts

THR: 0 0 Threshold APIC interrupts

MCE: 0 0 Machine check exceptions

MCP: 3 3 Machine check polls

ERR: 0

MIS: 0

Look at the files under IRQ 18:

cd /proc/irq/18/ && ls

affinity_hint eth0 node smp_affinity spurious

Check which CPUs it uses:

more smp_affinity

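Note: an output of 1 means the interrupt uses the first CPU (CPU0). The value is a hexadecimal bitmask, and 1 in binary is also 1.

If it used the first two CPUs, the value would be 3, which is 11 in binary.

With 4 processors all allowed, the value would be f, which is 1111 in binary.

If only the second processor (CPU1) should handle it, the value would be 2, because 2 is 10 in binary.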
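To read such a mask without converting to binary by hand, a small shell helper can list the CPUs whose bits are set. This is only a sketch: the function name mask_to_cpus is made up for illustration, and it assumes at most 63 CPUs so that the mask fits in a shell integer.

```shell
#!/bin/sh
# mask_to_cpus (hypothetical helper): print the CPU numbers selected by a
# hexadecimal affinity mask such as the contents of smp_affinity.
mask_to_cpus() (
    hex=$(printf '%s' "$1" | tr -d ',')   # drop group separators such as "00000000,00000003"
    hex=${hex#0x}                          # accept an optional 0x prefix
    mask=$((0x$hex))
    cpu=0
    out=
    while [ "$mask" -ne 0 ]; do
        if [ $((mask & 1)) -eq 1 ]; then   # bit N set means CPU N is allowed
            out="$out${out:+ }$cpu"
        fi
        mask=$((mask >> 1))
        cpu=$((cpu + 1))
    done
    printf '%s\n' "$out"
)

mask_to_cpus 1    # prints "0"
mask_to_cpus 3    # prints "0 1"
mask_to_cpus f    # prints "0 1 2 3"
mask_to_cpus 2    # prints "1"
```

For a live value, something like mask_to_cpus "$(cat /proc/irq/18/smp_affinity)" should work, with 18 being the IRQ number used in this article.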

From another server, ping this machine:

ping 192.168.75.135

64 bytes from 192.168.75.135: icmp_seq=3454 ttl=64 time=2.80 ms

64 bytes from 192.168.75.135: icmp_seq=3455 ttl=64 time=3.80 ms

64 bytes from 192.168.75.135: icmp_seq=3456 ttl=64 time=0.814 ms

64 bytes from 192.168.75.135: icmp_seq=3457 ttl=64 time=0.293 ms

64 bytes from 192.168.75.135: icmp_seq=3458 ttl=64 time=1.84 ms

64 bytes from 192.168.75.135: icmp_seq=3459 ttl=64 time=0.265 ms

64 bytes from 192.168.75.135: icmp_seq=3460 ttl=64 time=0.021 ms

64 bytes from 192.168.75.135: icmp_seq=3461 ttl=64 time=0.793 ms

64 bytes from 192.168.75.135: icmp_seq=3462 ttl=64 time=0.285 ms

64 bytes from 192.168.75.135: icmp_seq=3463 ttl=64 time=0.038 ms

64 bytes from 192.168.75.135: icmp_seq=3464 ttl=64 time=0.936 ms

64 bytes from 192.168.75.135: icmp_seq=3465 ttl=64 time=0.279 ms

64 bytes from 192.168.75.135: icmp_seq=3466 ttl=64 time=0.706 ms

On this machine, monitor how the NIC interrupts are distributed:

while ((1)) ; do sleep 1; cat /proc/interrupts |grep eth0; done

18: 5568 39 IO-APIC-fasteoi eth0

18: 5570 39 IO-APIC-fasteoi eth0

18: 5576 39 IO-APIC-fasteoi eth0

18: 5580 39 IO-APIC-fasteoi eth0

18: 5584 39 IO-APIC-fasteoi eth0

18: 5590 39 IO-APIC-fasteoi eth0

18: 5592 39 IO-APIC-fasteoi eth0

18: 5598 39 IO-APIC-fasteoi eth0

18: 5604 39 IO-APIC-fasteoi eth0

18: 5606 39 IO-APIC-fasteoi eth0

18: 5612 39 IO-APIC-fasteoi eth0

18: 5616 39 IO-APIC-fasteoi eth0

18: 5620 39 IO-APIC-fasteoi eth0

18: 5626 39 IO-APIC-fasteoi eth0

18: 5628 39 IO-APIC-fasteoi eth0

18: 5634 39 IO-APIC-fasteoi eth0

18: 5638 39 IO-APIC-fasteoi eth0

18: 5641 39 IO-APIC-fasteoi eth0

18: 5647 39 IO-APIC-fasteoi eth0

18: 5650 39 IO-APIC-fasteoi eth0

18: 5656 39 IO-APIC-fasteoi eth0

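The counter in the second column (CPU0) keeps growing while the third column (CPU1) stays at 39, so all NIC interrupt requests are being handled by CPU0.

Now let the first two CPUs handle the NIC interrupts: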

echo "3" > smp_affinity

while ((1)) ; do sleep 1; cat /proc/interrupts |grep eth0; done

18: 6430 50 IO-APIC-fasteoi eth0

18: 6433 53 IO-APIC-fasteoi eth0

18: 6439 53 IO-APIC-fasteoi eth0

18: 6441 53 IO-APIC-fasteoi eth0

18: 6443 57 IO-APIC-fasteoi eth0

18: 6444 58 IO-APIC-fasteoi eth0

18: 6447 61 IO-APIC-fasteoi eth0

18: 6449 61 IO-APIC-fasteoi eth0

18: 6453 63 IO-APIC-fasteoi eth0

18: 6459 63 IO-APIC-fasteoi eth0

18: 6459 65 IO-APIC-fasteoi eth0

18: 6462 68 IO-APIC-fasteoi eth0

18: 6463 69 IO-APIC-fasteoi eth0

18: 6467 71 IO-APIC-fasteoi eth0

18: 6469 71 IO-APIC-fasteoi eth0

18: 6472 73 IO-APIC-fasteoi eth0

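Note: the NIC interrupt requests are now spread over both CPUs. Over this sample CPU0 gained 42 interrupts and CPU1 gained 23, whereas before only CPU0 was counting.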
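The per-CPU increase can be computed rather than read off by eye. The sketch below compares two lines taken from /proc/interrupts at different times; irq_delta is a hypothetical helper name, and it assumes that the per-CPU counters are the run of purely numeric fields right after the IRQ label.

```shell
#!/bin/sh
# irq_delta (hypothetical helper): $1 is an earlier and $2 a later line for
# the same IRQ, both in /proc/interrupts format.
irq_delta() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 {
            # remember every numeric counter that follows the "18:" label
            for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++) before[i] = $i
            n = i - 1
        }
        NR == 2 {
            for (i = 2; i <= n; i++) printf "CPU%d +%d\n", i - 2, $i - before[i]
        }'
}

# First and last samples from the run above
irq_delta "18: 6430 50 IO-APIC-fasteoi eth0" "18: 6472 73 IO-APIC-fasteoi eth0"
# prints:
# CPU0 +42
# CPU1 +23
```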

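II) isolcpus

Setting the isolcpus kernel parameter in grub marks the listed CPUs as isolated, which means that by default the scheduler will not place tasks on them.

Test:

Edit /boot/grub/menu.lst

Append isolcpus=0 to the kernel line: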

kernel /boot/vmlinuz-2.6.32-71.el6.i686 ro root=UUID=96262e00-91a3-432d-b225-cb35d29eec8f rhgb quiet isolcpus=0

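In other words, after boot the system will not use CPU0 by default. "By default" is not absolute: a task can still be placed on an isolated CPU explicitly, and a user can do that with taskset.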
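After a reboot it is worth confirming that the parameter actually reached the kernel. A minimal sketch, assuming the running kernel's command line is available in /proc/cmdline; the helper name isolcpus_from_cmdline is invented here, and it prints the raw value without interpreting it.

```shell
#!/bin/sh
# isolcpus_from_cmdline (hypothetical helper): print the value of the
# isolcpus= parameter found in a kernel command line string, if any.
isolcpus_from_cmdline() (
    set -f                      # do not glob-expand words of the command line
    for word in $1; do          # split the string on whitespace
        case $word in
            isolcpus=*) printf '%s\n' "${word#isolcpus=}" ;;
        esac
    done
)

# With the kernel options used in this article:
isolcpus_from_cmdline "ro root=UUID=96262e00-91a3-432d-b225-cb35d29eec8f rhgb quiet isolcpus=0"
# prints "0"

# With the command line of the running kernel (prints nothing if the
# parameter is not set):
if [ -r /proc/cmdline ]; then
    isolcpus_from_cmdline "$(cat /proc/cmdline)"
fi
```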

After rebooting, check which CPU each process is running on (the PSR column):

ps -eo pid,args:50,psr

PID COMMAND PSR

1 /sbin/init 1

2 [kthreadd] 0

3 [migration/0] 0

4 [ksoftirqd/0] 0

5 [watchdog/0] 0

6 [migration/1] 1

7 [ksoftirqd/1] 1

8 [watchdog/1] 1

9 [events/0] 0

10 [events/1] 1

11 [cpuset] 0

12 [khelper] 0

13 [netns] 0

14 [async/mgr] 0

15 [pm] 0

16 [sync_supers] 1

17 [bdi-default] 0

18 [kintegrityd/0] 0

19 [kintegrityd/1] 1

20 [kblockd/0] 0

21 [kblockd/1] 1

22 [kacpid] 0

23 [kacpi_notify] 0

24 [kacpi_hotplug] 0

25 [ata/0] 0

26 [ata/1] 1

27 [ata_aux] 0

28 [ksuspend_usbd] 0

29 [khubd] 0

30 [kseriod] 0

33 [khungtaskd] 0

34 [kswapd0] 0

35 [ksmd] 0

36 [aio/0] 0

37 [aio/1] 1

38 [crypto/0] 0

39 [crypto/1] 1

45 [kpsmoused] 0

46 [usbhid_resumer] 0

75 [kstriped] 0

239 [scsi_eh_0] 0

240 [scsi_eh_1] 0

250 [mpt_poll_0] 0

251 [mpt/0] 0

252 [scsi_eh_2] 0

301 [jbd2/sda1-8] 0

302 [ext4-dio-unwrit] 0

303 [ext4-dio-unwrit] 1

324 [flush-8:0] 0

390 /sbin/udevd -d 1

643 /sbin/udevd -d 1

644 /sbin/udevd -d 1

731 [kauditd] 0

1004 auditd 1

1029 /sbin/rsyslogd -c 4 1

1062 irqbalance 1

1081 rpcbind 1

1093 mdadm --monitor --scan -f --pid-file=/var/run/mdad 1

1102 dbus-daemon --system 1

1113 NetworkManager --pid-file=/var/run/NetworkManager/ 1

1117 /usr/sbin/modem-manager 1

1124 /sbin/dhclient -d -4 -sf /usr/libexec/nm-dhcp-clie 1

1129 /usr/sbin/wpa_supplicant -c /etc/wpa_supplicant/wp 1

1131 avahi-daemon: registering [linux.local] 1

1132 avahi-daemon: chroot helper 1

1149 rpc.statd 1

1186 [rpciod/0] 0

1187 [rpciod/1] 1

1194 rpc.idmapd 1

1204 cupsd -C /etc/cups/cupsd.conf 1

1229 /usr/sbin/acpid 1

1238 hald 1

1239 hald-runner 1

1279 hald-addon-input: Listening on /dev/input/event2 / 1

1283 hald-addon-acpi: listening on acpid socket /var/ru 1

1303 automount --pid-file /var/run/autofs.pid 1

1326 /usr/sbin/sshd 1

1456 /usr/libexec/postfix/master 1

1463 pickup -l -t fifo -u 1

1464 qmgr -l -t fifo -u 1

1467 /usr/sbin/abrtd 1

1475 crond 1

1486 /usr/sbin/atd 1

1497 libvirtd --daemon 1

1553 /sbin/mingetty /dev/tty1 1

1558 /sbin/mingetty /dev/tty2 1

1561 /sbin/mingetty /dev/tty3 1

1564 /sbin/mingetty /dev/tty4 1

1567 /sbin/mingetty /dev/tty5 1

1569 /sbin/mingetty /dev/tty6 1

1586 /usr/sbin/dnsmasq --strict-order --bind-interfaces 1

1598 sshd: root@pts/0 1

1603 -bash 1

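Some kernel threads, such as [kblockd/0], still run on CPU0 because they are explicitly bound to CPU0. All the other processes run on CPU1.

Test this with a simple busy-loop program:

#include <stdio.h>

int main(void)
{
    /* spin forever so the process stays runnable */
    while (1) {
    }
    return 0;
}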

gcc test.c

./a.out&

Check which CPU the a.out processes run on (the program was started four times here):

ps -eo pid,args:50,psr |grep a.out

1669 ./a.out 1

1670 ./a.out 1

1671 ./a.out 1

1672 ./a.out 1

1675 grep a.out 1

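All 4 a.out processes run on CPU1, which is exactly what we wanted to see.

One last remark: with isolcpus=1 the system uses CPU0 by default. If there are only two CPUs and we specify isolcpus=0,1, CPU0 is used by default.

III) CPU hotplug

CPUs can be taken offline and brought back online at the operating-system level.

Take cpu1 offline at runtime: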

echo "0" > /sys/devices/system/cpu/cpu1/online

Now only 1 CPU is visible in the system:

cat /proc/cpuinfo

processor : 0

vendor_id : AuthenticAMD

cpu family : 15

model : 107

model name : AMD Sempron(tm) Dual Core Processor 2300

stepping : 2

cpu MHz : 2210.053

cache size : 256 KB

fdiv_bug : no

hlt_bug : no

f00f_bug : no

coma_bug : no

fpu : yes

fpu_exception : yes

cpuid level : 1

wp : yes

flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow constant_tsc up tsc_reliable extd_apicid pni cx16 lahf_lm extapic 3dnowprefetch

bogomips : 4420.10

clflush size : 64

cache_alignment : 64

address sizes : 36 bits physical, 48 bits virtual

power management: ts fid vid ttp tm stc 100mhzsteps

Note: if a program is running on cpu1 when cpu1 is taken offline, it is migrated to cpu0. Also, cpu0 cannot be taken offline here: its directory in sysfs has no online file.
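Besides /proc/cpuinfo, the kernel summarizes the online CPUs as a range list such as 0-1 or 0,2-3. The sketch below expands such a list into individual CPU numbers; expand_cpu_list is a hypothetical helper, and the path /sys/devices/system/cpu/online is assumed to exist on the system at hand (it is not shown in the original article).

```shell
#!/bin/sh
# expand_cpu_list (hypothetical helper): turn a kernel CPU list such as
# "0-1,3" into the individual CPU numbers "0 1 3".
expand_cpu_list() (
    out=
    IFS=,                        # list entries are separated by commas
    set -f
    for part in $1; do
        case $part in
            *-*) first=${part%-*}; last=${part#*-} ;;   # a range "first-last"
            *)   first=$part; last=$part ;;             # a single CPU
        esac
        i=$first
        while [ "$i" -le "$last" ]; do
            out="$out${out:+ }$i"
            i=$((i + 1))
        done
    done
    printf '%s\n' "$out"
)

expand_cpu_list 0-1      # prints "0 1"
expand_cpu_list 0,2-3    # prints "0 2 3"

# Live value, if this kernel exposes it:
if [ -r /sys/devices/system/cpu/online ]; then
    expand_cpu_list "$(cat /sys/devices/system/cpu/online)"
fi
```

After echo "0" > /sys/devices/system/cpu/cpu1/online on the two-CPU machine used here, the live value would be expected to shrink to 0.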

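IV) Process/thread affinity and using taskset

1) Overview:

1) CPU affinity comes in two kinds, soft and hard. Soft affinity means the scheduler avoids migrating a process between processors frequently. Hard affinity means the process must run on the processors you specify. Linux applies soft affinity by default, so it tries to keep a process on the same CPU, which makes it possible to reuse what is still in that CPU's TLB.

2) Linux uses an affinity mask to decide which CPUs a program may run on. The default mask allows all CPUs.

3) An application can hand its affinity mask to the scheduler at startup, and it can also adjust the mask while it is running.

2) Adjusting a process's affinity manually with taskset

Start with the simplest possible program, an infinite loop:

#include <stdio.h>

int main(void)
{
    /* spin forever so the process stays runnable */
    while (1) {
    }
    return 0;
}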

Compile and run:

gcc taskloop.c -o taskloop

./taskloop

In another terminal, check the state of the process:

ps -eo pid,args:30,psr

PID COMMAND PSR

2826 ./taskloop 0

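Note: only the taskloop line of the ps output is kept above. PSR shows which CPU the program is running on. CPUs are numbered from 0, so with two CPUs they are CPU0 and CPU1.

Stop the program and use taskset to run it on CPU1: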

taskset 2 ./taskloop

In another terminal, check the state of the process:

ps -eo pid,args:30,psr

PID COMMAND PSR

2892 ./taskloop 1

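Note: by setting the affinity mask with taskset we chose CPU1 for this program. The mask is a bitmask, not a CPU number: bit 0 (value 1) stands for CPU0 and bit 1 (value 2) for CPU1, so running on CPU1 requires taskset 2 COMMAND.

The affinity masks map to CPUs as follows:

0x00000001 is processor 1 (CPU0)

0x00000002 is processor 2 (CPU1)

0x00000003 is processors 1 and 2 (CPU0 and CPU1)

0x00000004 is processor 3 (CPU2)

0x0000001F is the first 5 processors (CPU0, CPU1, CPU2, CPU3, CPU4)

0xFFFFFFFF is all processors (that is, 32 processors)

And so on. The masks above are written in hexadecimal. taskset always reads the mask as hexadecimal, with or without the 0x prefix, so the single-digit masks 1, 2 and 3 used in this article mean the same as 0x1, 0x2 and 0x3, whereas 10 would mean 0x10 (CPU4), not decimal 10. To name CPUs by their decimal numbers instead, use taskset -c with a CPU list.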
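Going the other way, from CPU numbers to the mask that taskset expects, is a matter of setting one bit per CPU. A minimal sketch; cpus_to_mask is a hypothetical helper name, and it assumes CPU numbers below 63 so that the mask fits in a shell integer.

```shell
#!/bin/sh
# cpus_to_mask (hypothetical helper): print the hexadecimal affinity mask
# that selects the CPU numbers given as arguments.
cpus_to_mask() (
    mask=0
    for cpu in "$@"; do
        mask=$((mask | (1 << cpu)))   # CPU N corresponds to bit N
    done
    printf '%x\n' "$mask"
)

cpus_to_mask 0            # prints "1"
cpus_to_mask 1            # prints "2"
cpus_to_mask 0 1          # prints "3"
cpus_to_mask 0 1 2 3 4    # prints "1f"
```

With this helper, taskset "$(cpus_to_mask 1)" ./taskloop would be equivalent to the taskset 2 ./taskloop call used above.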

In the same way we can change the affinity mask of a program while it is running, and with it the CPU it uses.

The taskloop process above is using CPU1. Switch it to CPU0:

taskset -p 1 `pgrep taskloop`

pid 2892's current affinity mask: 2

pid 2892's new affinity mask: 1

In another terminal, check the state of the process:

ps -eo pid,args:30,psr

PID COMMAND PSR

2892 ./taskloop 0

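Note: the process that was running on CPU1 now uses CPU0. If the program is in sleep or pause when its affinity mask is changed, it does not move to the new CPU right away. The new mask is applied only when the sleep or pause ends.

3) Setting affinity from within a process

The sched_setaffinity function sets CPU affinity so that a given process runs on the given CPUs. We modify the earlier program as follows:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
    cpu_set_t mask;

    CPU_ZERO(&mask);      /* start from an empty CPU set */
    CPU_SET(1, &mask);    /* add CPU1 (the second processor) */

    /* pid 0 means the calling process */
    if (sched_setaffinity(0, sizeof(cpu_set_t), &mask) == -1) {
        perror("sched_setaffinity");
        return 1;
    }

    while (1) {
    }
    return 0;
}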

Compile and run:

gcc proaffinity.c -o proaffinity

./proaffinity

In another terminal, check the state of the process:

ps -eo pid,args:20,psr|grep proaffinity|grep -v grep

3088 ./proaffinity 1

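Note: the program calls sched_setaffinity with a CPU set that contains only CPU1 (mask value 2), so it runs on the second processor. This matches the PSR value of 1 shown above.

In sched_setaffinity(0, sizeof(cpu_set_t), &mask), the first argument is the PID, where 0 means the calling process itself. The second argument is the size of the affinity mask in bytes, and the third is a pointer to the mask.

The CPU_ZERO(&mask) macro clears the set. CPU_SET(1, &mask) adds CPU number 1 to the set; it does not set the mask value to 1.

4) Setting affinity for threads

The pthread_attr_setaffinity_np function sets the affinity mask of a thread, that is, which processor the thread will run on: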

#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

/* Number of processors currently online. */
int GetCpuCount(void)
{
    return (int)sysconf(_SC_NPROCESSORS_ONLN);
}

/* Thread body: spin forever so the thread stays visible in ps. */
void *thread_fun(void *arg)
{
    (void)arg;
    while (1) {
    }
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_attr_t attr1, attr2;
    cpu_set_t cpu_info;

    printf("The number of cpu is %d\n", GetCpuCount());

    pthread_attr_init(&attr1);
    pthread_attr_init(&attr2);

    /* The first thread may run only on CPU0. */
    CPU_ZERO(&cpu_info);
    CPU_SET(0, &cpu_info);
    if (0 != pthread_attr_setaffinity_np(&attr1, sizeof(cpu_set_t), &cpu_info)) {
        printf("set affinity failed\n");
        return 1;
    }

    /* The second thread may run only on CPU1. */
    CPU_ZERO(&cpu_info);
    CPU_SET(1, &cpu_info);
    if (0 != pthread_attr_setaffinity_np(&attr2, sizeof(cpu_set_t), &cpu_info)) {
        printf("set affinity failed\n");
        return 1;
    }

    if (0 != pthread_create(&t1, &attr1, thread_fun, NULL)) {
        printf("create thread 1 error\n");
        return 1;
    }
    if (0 != pthread_create(&t2, &attr2, thread_fun, NULL)) {
        printf("create thread 2 error\n");
        return 1;
    }

    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}

Compile and run:

gcc pthraffinity.c -o pthraffinity -pthread

./pthraffinity

In another terminal, check the state of all threads in the process:

ps -eLo pid,lwp,args:20,psr

PID LWP COMMAND PSR

3191 3191 ./pthraffinity 1

3191 3192 ./pthraffinity 0

3191 3193 ./pthraffinity 1

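Note: the LWP column is the thread ID. The main thread (LWP=3191) has no affinity set and happened to be on CPU1. The first created thread (LWP=3192) runs on CPU0 and the second (LWP=3193) runs on CPU1, exactly as the program specifies.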

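V) cgroup

1) Overview:

cgroup stands for control group. cgroups organize processes into groups, and each group can be managed with respect to a given system resource.

A cgroup controls the behavior of a group of processes. For example, to limit the process /bin/sh to 20% of the CPU, we can create a cgroup that is allowed 20% of the CPU and then add the /bin/sh process to it.

ulimit can also restrict system resources, but its limits are set per user, whereas a cgroup can target individual processes.

We can also make a particular process use a particular CPU, or make some processes use a set of CPUs.

cgroup consists of several independent subsystems, each representing a single resource. In newer kernels cgroup provides 9 subsystems:

blkio sets input/output limits for block devices such as disks, optical drives and USB storage.

cpu uses the scheduler to give the tasks in a cgroup access to the CPU.

cpuacct generates reports on the CPU resources used by the tasks in a cgroup.

cpuset assigns individual CPUs and memory nodes to the tasks in a cgroup on multi-core systems.

devices allows or denies access to devices by the tasks in a cgroup.

freezer suspends and resumes the tasks in a cgroup.

memory sets memory limits for each cgroup and generates reports on memory use.

net_cls tags network packets so that traffic from the tasks in a cgroup can be identified.

ns is the namespace subsystem.

RHEL6 ships the libcgroup package, which contains the cgroup service programs, and cgroup is also available as a service in RHEL6.

cpuset and cgroup also appear as pseudo filesystems: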

grep cpuset /proc/filesystems

nodev cpuset

grep cgroup /proc/filesystems

nodev cgroup

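2) The cpuset subsystem of cgroup lets us assign CPUs and memory nodes to processes.

First create a directory on which to mount the cpuset filesystem:

mkdir /cpusets

Mount the cpuset filesystem on /cpusets:

mount -t cpuset nodev /cpusets/

Note: mount -t cgroup none /cpusets/ -o cpuset works as well.

Inspect it: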

ls -l /cpusets/

total 0

-r--r--r--. 1 root root 0 Oct 22 21:52 cgroup.procs

-rw-r--r--. 1 root root 0 Oct 22 21:52 cpuset.cpu_exclusive

-rw-r--r--. 1 root root 0 Oct 22 21:52 cpuset.cpus

-rw-r--r--. 1 root root 0 Oct 22 21:52 cpuset.mem_exclusive

-rw-r--r--. 1 root root 0 Oct 22 21:52 cpuset.mem_hardwall

-rw-r--r--. 1 root root 0 Oct 22 21:52 cpuset.memory_migrate

-r--r--r--. 1 root root 0 Oct 22 21:52 cpuset.memory_pressure

-rw-r--r--. 1 root root 0 Oct 22 21:52 cpuset.memory_pressure_enabled

-rw-r--r--. 1 root root 0 Oct 22 21:52 cpuset.memory_spread_page

-rw-r--r--. 1 root root 0 Oct 22 21:52 cpuset.memory_spread_slab

-rw-r--r--. 1 root root 0 Oct 22 21:52 cpuset.mems

-rw-r--r--. 1 root root 0 Oct 22 21:52 cpuset.sched_load_balance

-rw-r--r--. 1 root root 0 Oct 22 21:52 cpuset.sched_relax_domain_level

drwxr-xr-x. 3 root root 0 Oct 22 21:53 libvirt

-rw-r--r--. 1 root root 0 Oct 22 21:52 notify_on_release

-rw-r--r--. 1 root root 0 Oct 22 21:52 release_agent

-rw-r--r--. 1 root root 0 Oct 22 21:52 tasks

Look at cpuset.cpus. It shows 0-1, which means that by default the system can use both CPUs, 0 and 1.

more cpuset.cpus

0-1

Look at cpuset.mems. It shows 0, which means that by default the system can use the memory of node 0. This is a single-node machine, not NUMA, so only 0 appears.

more cpuset.mems

See which processes are in this cgroup:

more tasks

(omitted)

Now create another cpuset cgroup, to which we will assign some processes:

mkdir /cpusets/test

Look at the new cgroup:

cd test/

[root@localhost test]# ls

cgroup.procs cpuset.mem_exclusive cpuset.memory_pressure cpuset.mems notify_on_release

cpuset.cpu_exclusive cpuset.mem_hardwall cpuset.memory_spread_page cpuset.sched_load_balance tasks

cpuset.cpus cpuset.memory_migrate cpuset.memory_spread_slab cpuset.sched_relax_domain_level

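Note:

The new directory contains the cpuset control files. In addition, the tasks file of the new cgroup is empty, and so are cpuset.cpus and cpuset.mems.

Set cpus in the new cpuset cgroup to 0, so that processes in this cgroup can use only the first CPU:

echo 0 > cpuset.cpus

Do the same for memory:

echo 0 > cpuset.mems

Run a test program, a.out: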

/root/a.out &

[1] 1848

Check which CPU the process is using:

ps -eo pid,args:20,psr|grep a.out

1848 /root/a.out 1

Write the PID of the process into the tasks file, which assigns it to the first CPU:

echo "1848" > tasks

Check which CPU the process is using now:

ps -eo pid,args:50,psr|grep a.out

1848 ./a.out 0

Note: the a.out process has now been moved to the first CPU (CPU0).

Now look at the status of the process:

cat /proc/1848/status

Name: a.out

State: R (running)

Tgid: 1848

Pid: 1848

PPid: 1584

TracerPid: 0

Uid: 0 0 0 0

Gid: 0 0 0 0

Utrace: 0

FDSize: 256

Groups: 0 1 2 3 4 6 10

VmPeak: 1896 kB

VmSize: 1816 kB

VmLck: 0 kB

VmHWM: 276 kB

VmRSS: 276 kB

VmData: 24 kB

VmStk: 88 kB

VmExe: 4 kB

VmLib: 1676 kB

VmPTE: 20 kB

VmSwap: 0 kB

Threads: 1

SigQ: 3/7954

SigPnd: 0000000000000000

ShdPnd: 0000000000000000

SigBlk: 0000000000000000

SigIgn: 0000000000000000

SigCgt: 0000000000000000

CapInh: 0000000000000000

CapPrm: ffffffffffffffff

CapEff: ffffffffffffffff

CapBnd: ffffffffffffffff

Cpus_allowed: 1

Cpus_allowed_list: 0

Mems_allowed: 1

Mems_allowed_list: 0

voluntary_ctxt_switches: 2

nonvoluntary_ctxt_switches: 343

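Note:

Cpus_allowed: 1 is the affinity mask of the CPUs the process may use. We assigned it the first CPU, so the value is 1. If the process were allowed 4 CPUs (assuming the machine had them), the value would be f (1111).

Cpus_allowed_list: 0 is the list of CPUs the process may use. It is 0 here, so the process can use only the first CPU.

Mems_allowed: 1

Mems_allowed_list: 0

Memory works the same way as CPUs: the a.out process may use memory from node 0 only.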
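When only these fields are of interest, they can be picked out of the status file instead of reading the whole listing. A minimal sketch using awk; status_field is a hypothetical helper name, and it assumes the "Name:<whitespace>value" layout shown in the listing above.

```shell
#!/bin/sh
# status_field (hypothetical helper): print the value of one field from a
# /proc/<pid>/status style file. $1 = field name without the colon, $2 = file.
status_field() {
    awk -F':[ \t]*' -v key="$1" '$1 == key { print $2 }' "$2"
}

# Affinity of the process that reads the file (here, awk itself)
if [ -r /proc/self/status ]; then
    status_field Cpus_allowed_list /proc/self/status
    status_field Mems_allowed_list /proc/self/status
fi
```

For the example process in this article the call would be status_field Cpus_allowed_list /proc/1848/status, which should print 0 once the PID has been written to the tasks file.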

For comparison, here is the status of the init process:

cat /proc/1/status

(omitted)

Cpus_allowed: 3

Cpus_allowed_list: 0-1

Mems_allowed: 1

Mems_allowed_list: 0

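Finally, a summary of how nested cpuset directories relate to each other. Suppose we have 8 CPUs, create a subdirectory test1 under /cpusets and set cpuset.cpus in test1 to 0-3. If we then create a subdirectory test2 under test1, cpuset.cpus in test2 can only be set to CPUs within 0-3.

In other words, a child is constrained by what its parent allows. In our example, if we create subtest under /cpusets/test, writing 1 to cpuset.cpus in subtest is not allowed, because cpuset.cpus of test is 0.

For example: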

mkdir subtest && cd subtest

The write fails:

echo 1 > cpuset.cpus

-bash: echo: write error: Permission denied

cpuset.cpus of test is 0:

cat ../cpuset.cpus

Writing 0 to cpuset.cpus of subtest is allowed:

echo 0 > cpuset.cpus

cat cpuset.cpus
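The rule that a child cpuset can only use CPUs its parent already has can be checked before writing to cpuset.cpus. The sketch below is an illustration only, not how the kernel implements the check; expand_cpu_list and cpus_within are hypothetical helper names.

```shell
#!/bin/sh
# expand_cpu_list (hypothetical helper): "0-1,3" -> "0 1 3"
expand_cpu_list() (
    out=
    IFS=,
    set -f
    for part in $1; do
        case $part in
            *-*) first=${part%-*}; last=${part#*-} ;;
            *)   first=$part; last=$part ;;
        esac
        i=$first
        while [ "$i" -le "$last" ]; do
            out="$out${out:+ }$i"
            i=$((i + 1))
        done
    done
    printf '%s\n' "$out"
)

# cpus_within (hypothetical helper): succeed when every CPU in list $1
# (the child) is also in list $2 (the parent).
cpus_within() (
    parent=" $(expand_cpu_list "$2") "
    for cpu in $(expand_cpu_list "$1"); do
        case $parent in
            *" $cpu "*) ;;        # this CPU is available in the parent
            *) exit 1 ;;          # not in the parent: the write would be refused
        esac
    done
    exit 0
)

# subtest under test, where cpuset.cpus of test is 0
if cpus_within 1 0; then echo "1: allowed"; else echo "1: refused"; fi   # prints "1: refused"
if cpus_within 0 0; then echo "0: allowed"; else echo "0: refused"; fi   # prints "0: allowed"
```

In the 8-CPU example, cpus_within 2-3 0-3 succeeds while cpus_within 3-4 0-3 fails.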

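3) Next, the memory subsystem of cgroup. Here we restrict a process to 300MB of memory, counting both RAM and swap.

Create a directory on which to mount the cgroup memory hierarchy:

mkdir /mnt/cgroup/

Mount it as a cgroup filesystem with the memory subsystem selected:

mount -t cgroup none /mnt/cgroup/ -o memory

List the files provided by the memory subsystem: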

ls /mnt/cgroup/

cgroup.procs memory.memsw.usage_in_bytes

libvirt memory.soft_limit_in_bytes

memory.failcnt memory.stat

memory.force_empty memory.swappiness

memory.limit_in_bytes memory.usage_in_bytes

memory.max_usage_in_bytes memory.use_hierarchy

memory.memsw.failcnt notify_on_release

memory.memsw.limit_in_bytes release_agent

memory.memsw.max_usage_in_bytes tasks

Look at the files in the cgroup memory hierarchy:

cd /mnt/cgroup/

cgroup.procs memory.limit_in_bytes memory.memsw.max_usage_in_bytes memory.swappiness release_agent

libvirt memory.max_usage_in_bytes memory.memsw.usage_in_bytes memory.usage_in_bytes tasks

memory.failcnt memory.memsw.failcnt memory.soft_limit_in_bytes memory.use_hierarchy

memory.force_empty memory.memsw.limit_in_bytes memory.stat notify_on_release

Create a new directory, test:

mkdir test && cd test/

cgroup.procs memory.max_usage_in_bytes memory.memsw.usage_in_bytes memory.usage_in_bytes

memory.failcnt memory.memsw.failcnt memory.soft_limit_in_bytes memory.use_hierarchy

memory.force_empty memory.memsw.limit_in_bytes memory.stat notify_on_release

memory.limit_in_bytes memory.memsw.max_usage_in_bytes memory.swappiness tasks

Check the current memory limits:

more memory.limit_in_bytes

9223372036854775807

more memory.memsw.limit_in_bytes

9223372036854775807

Set the memory limit of this process group to 300MB:

echo 300M > memory.limit_in_bytes

echo 300M > memory.memsw.limit_in_bytes

cat memory.limit_in_bytes

314572800

cat memory.memsw.limit_in_bytes

314572800
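The value read back is simply 300 multiplied by 1024 twice. The sketch below reproduces that arithmetic for the K, M and G suffixes; size_to_bytes is a hypothetical helper, and it only mirrors the conversion visible in the output above.

```shell
#!/bin/sh
# size_to_bytes (hypothetical helper): convert a size with an optional
# K, M or G suffix (powers of 1024) into bytes.
size_to_bytes() {
    case $1 in
        *[Kk]) echo $(( ${1%?} * 1024 )) ;;
        *[Mm]) echo $(( ${1%?} * 1024 * 1024 )) ;;
        *[Gg]) echo $(( ${1%?} * 1024 * 1024 * 1024 )) ;;
        *)     echo "$1" ;;      # no suffix: already in bytes
    esac
}

size_to_bytes 300M    # prints "314572800"
```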

Write the PID of the current shell into tasks:

echo $$ > tasks

The following program allocates memory:

more /tmp/test.c

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>   /* sleep() */

#define MALLOC_SIZE (1024 * 1024 * 300)   /* 300MB */

int main(void)
{
    char *i = NULL;
    long int j;

    i = malloc(MALLOC_SIZE);
    if (i != NULL) {
        /* touch every byte so that the pages are really allocated */
        for (j = 0; j < MALLOC_SIZE; j++)
            *(i + j) = 'a';
    }
    sleep(5);
    return 0;
}

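Compile and run it:

gcc /tmp/test.c -o /tmp/test

/tmp/test

Killed

While the process fills its 300MB buffer, the group, which also contains the shell, reaches the 300MB limit and the kernel kills the process. Check the log:

tail -f /var/log/messages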

Oct 23 00:28:38 localhost kernel: 20531 total pagecache pages

Oct 23 00:28:38 localhost kernel: 0 pages in swap cache

Oct 23 00:28:38 localhost kernel: Swap cache stats: add 0, delete 0, find 0/0

Oct 23 00:28:38 localhost kernel: Free swap = 1760248kB

Oct 23 00:28:38 localhost kernel: Total swap = 1760248kB

Oct 23 00:28:38 localhost kernel: 262128 pages RAM

Oct 23 00:28:38 localhost kernel: 35330 pages HighMem

Oct 23 00:28:38 localhost kernel: 4333 pages reserved

Oct 23 00:28:38 localhost kernel: 17603 pages shared

Oct 23 00:28:38 localhost kernel: 103368 pages non-shared

Oct 23 00:28:38 localhost kernel: Memory cgroup out of memory: kill process 1621 (test) score 4828 or a child

Oct 23 00:28:38 localhost kernel: Killed process 1621 (test) vsz:309020kB, anon-rss:307104kB, file-rss:268kB

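The "Memory cgroup out of memory" message shows that the cgroup did its job: it killed the process that caused the OOM condition.

4) Finally, mount the remaining cgroup subsystems and look at their control files. I have only limited material for a deeper analysis at hand and will write it up after further study.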

mount -t cgroup none /mnt/cgroup/ -o cpu

ls -l /mnt/cgroup/

total 0

-r--r--r--. 1 root root 0 Oct 22 21:52 cgroup.procs

-rw-r--r--. 1 root root 0 Oct 22 21:52 cpu.rt_period_us

-rw-r--r--. 1 root root 0 Oct 22 21:52 cpu.rt_runtime_us

-rw-r--r--. 1 root root 0 Oct 22 21:52 cpu.shares

drwxr-xr-x. 3 root root 0 Oct 22 21:53 libvirt

-rw-r--r--. 1 root root 0 Oct 22 21:52 notify_on_release

-rw-r--r--. 1 root root 0 Oct 22 21:52 release_agent

-rw-r--r--. 1 root root 0 Oct 22 21:52 tasks

mount -t cgroup none /mnt/cgroup/ -o cpuacct

ls -l /mnt/cgroup/

total 0

-r--r--r--. 1 root root 0 Oct 22 21:52 cgroup.procs

-r--r--r--. 1 root root 0 Oct 22 21:52 cpuacct.stat

-rw-r--r--. 1 root root 0 Oct 22 21:52 cpuacct.usage

-r--r--r--. 1 root root 0 Oct 22 21:52 cpuacct.usage_percpu

drwxr-xr-x. 3 root root 0 Oct 22 21:53 libvirt

-rw-r--r--. 1 root root 0 Oct 22 21:52 notify_on_release

-rw-r--r--. 1 root root 0 Oct 22 21:52 release_agent

-rw-r--r--. 1 root root 0 Oct 22 21:52 tasks

mount -t cgroup none /mnt/cgroup/ -o devices

ls -l /mnt/cgroup/

total 0

-r--r--r--. 1 root root 0 Oct 22 21:52 cgroup.procs

--w-------. 1 root root 0 Oct 22 21:52 devices.allow

--w-------. 1 root root 0 Oct 22 21:52 devices.deny

-r--r--r--. 1 root root 0 Oct 22 21:52 devices.list

drwxr-xr-x. 3 root root 0 Oct 22 21:53 libvirt

-rw-r--r--. 1 root root 0 Oct 22 21:52 notify_on_release

-rw-r--r--. 1 root root 0 Oct 22 21:52 release_agent

-rw-r--r--. 1 root root 0 Oct 22 21:52 tasks

mount -t cgroup none /mnt/cgroup/ -o freezer

[root@localhost ~]# ls -l /mnt/cgroup/

total 0

-r--r--r--. 1 root root 0 Oct 22 21:52 cgroup.procs

drwxr-xr-x. 3 root root 0 Oct 22 21:53 libvirt

-rw-r--r--. 1 root root 0 Oct 22 21:52 notify_on_release

-rw-r--r--. 1 root root 0 Oct 22 21:52 release_agent

-rw-r--r--. 1 root root 0 Oct 22 21:52 tasks

mount -t cgroup none /mnt/cgroup/ -o net_cls

[root@localhost ~]# ls -l /mnt/cgroup/

total 0

-r--r--r--. 1 root root 0 Oct 22 21:52 cgroup.procs

-rw-r--r--. 1 root root 0 Oct 22 21:52 net_cls.classid

-rw-r--r--. 1 root root 0 Oct 22 21:52 notify_on_release

-rw-r--r--. 1 root root 0 Oct 22 21:52 release_agent

-rw-r--r--. 1 root root 0 Oct 22 21:52 tasks

mount -t cgroup none /mnt/cgroup/ -o blkio

[root@localhost ~]# ls -l /mnt/cgroup/

total 0

-r--r--r--. 1 root root 0 Oct 22 21:52 blkio.io_merged

-r--r--r--. 1 root root 0 Oct 22 21:52 blkio.io_queued

-r--r--r--. 1 root root 0 Oct 22 21:52 blkio.io_service_bytes

-r--r--r--. 1 root root 0 Oct 22 21:52 blkio.io_serviced

-r--r--r--. 1 root root 0 Oct 22 21:52 blkio.io_service_time

-r--r--r--. 1 root root 0 Oct 22 21:52 blkio.io_wait_time

--w-------. 1 root root 0 Oct 22 21:52 blkio.reset_stats

-r--r--r--. 1 root root 0 Oct 22 21:52 blkio.sectors

-r--r--r--. 1 root root 0 Oct 22 21:52 blkio.time

-rw-r--r--. 1 root root 0 Oct 22 21:52 blkio.weight

-rw-r--r--. 1 root root 0 Oct 22 21:52 blkio.weight_device

-r--r--r--. 1 root root 0 Oct 22 21:52 cgroup.procs

-rw-r--r--. 1 root root 0 Oct 22 21:52 notify_on_release

-rw-r--r--. 1 root root 0 Oct 22 21:52 release_agent

-rw-r--r--. 1 root root 0 Oct 22 21:52 tasks
