How to reproduce a condition which invokes the OOM-Killer?

Original link: https://access.redhat.com/solutions/47692

 SOLUTION UNVERIFIED - Updated April 29, 2016 at 23:47

Environment

  • Red Hat Enterprise Linux 5
  • Red Hat Enterprise Linux 6

Issue

  • How to reproduce a condition which invokes the OOM-Killer?

Resolution

Free output:

# free -m
             total       used       free     shared    buffers     cached
Mem:          1999       1819        180          0         94        910
-/+ buffers/cache:        813       1186
Swap:         4095          0       4095
  • The system has almost 2 GB of RAM, of which 910 MB is cached (about 45% of memory). Counting buffers and cache, about 91% of RAM is in use (1819 MB of 1999 MB).

  • The following are the overcommit parameters:

 $ cat /proc/sys/vm/overcommit_memory  

 $ cat /proc/sys/vm/overcommit_ratio  
 50
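
These values can also be read from a program. The following is a minimal sketch and not part of the original solution: the helper name read_proc_long is hypothetical, and it assumes the file holds a single non-negative integer, as the two overcommit files do.

```c
#include <stdio.h>

/* Read a single integer from a file such as
 * /proc/sys/vm/overcommit_memory.
 * Returns -1 if the file cannot be opened or parsed. */
long read_proc_long(const char *path)
{
    FILE *f = fopen(path, "r");
    long value;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}
```

For example, read_proc_long("/proc/sys/vm/overcommit_ratio") would return 50 on the system shown above. A return value of -1 means the file could not be read.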

The following program allocates memory repeatedly but never writes to it, so the allocated memory is not actually used.

memtest1.c

 #include <stdio.h>
 #include <stdlib.h>

 int main (void) {
         int n = 0;

         /* Request 1 MiB at a time; the memory is never written to. */
         while (1) {
                 if (malloc(1<<20) == NULL) {
                         printf("malloc failure after %d MiB\n", n);
                         return 0;
                 }
                 printf ("got %d MiB\n", ++n);
         }
 }



 $ gcc memtest1.c
 $ ./a.out

 got 570528 MiB
 got 570529 MiB
 got 570530 MiB
 got 570531 MiB
 Killed
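  • The kernel let the process allocate about 557 GiB (570531 MiB) of virtual memory, far more than the 2 GB of RAM and 4 GB of swap available; that is, the kernel overcommitted memory. This run used vm.overcommit_memory = 0.
    The following are trimmed log messages: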

 # less /var/log/messages

 6792kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:160 all_unreclaimable? no
 Dec  6 00:19:23 dhcp1-109 kernel: [15358.827694] lowmem_reserve[]: 0 0 0 0  
 Dec  6 00:19:23 dhcp1-109 kernel: [15358.827699] Node 0 DMA: 3*4kB 4*8kB 8*16kB 9*32kB 10*64kB 10*128kB 2*256kB 2*512kB 2*1024kB 1*2048kB 0*4096kB = 8012kB  
 Dec  6 00:19:23 dhcp1-109 kernel: [15358.827711] Node 0 DMA32: 377*4kB 21*8kB 2*16kB 0*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 5740kB  
 Dec  6 00:19:23 dhcp1-109 kernel: [15358.827723] 19644 total pagecache pages  
 Dec  6 00:19:23 dhcp1-109 kernel: [15358.827725] 1378 pages in swap cache  
 Dec  6 00:19:23 dhcp1-109 kernel: [15358.827728] Swap cache stats: add 1114112, delete 1112734, find 9660/15265  
 Dec  6 00:19:23 dhcp1-109 kernel: [15358.827730] Free swap  = 0kB  
 Dec  6 00:19:23 dhcp1-109 kernel: [15358.827732] Total swap = 4194300kB  
 Dec  6 00:19:23 dhcp1-109 kernel: [15358.836840] 521855 pages RAM  
 Dec  6 00:19:23 dhcp1-109 kernel: [15358.836843] 9983 pages reserved  
 Dec  6 00:19:23 dhcp1-109 kernel: [15358.836845] 17279 pages shared  
 Dec  6 00:19:23 dhcp1-109 kernel: [15358.836847] 494732 pages non-shared  
 Dec  6 00:19:23 dhcp1-109 kernel: [15358.836852] Out of memory: kill process 6299 (a.out) score 154833937 or a child  
 Dec  6 00:19:23 dhcp1-109 kernel: [15358.836857] Killed process 6299 (a.out) vsz:619335748kB, anon-rss:535344kB, file-rss:92kB
  • The system was running low on memory, so the OOM killer killed the a.out process.
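
The last log line shows the gap between what was allocated and what was actually resident: vsz is about 590 GiB, while anon-rss is only about 523 MiB. The following is a minimal sketch for pulling those fields out of such a line. It assumes the message format shown in the log above, and parse_killed_line is a hypothetical helper name.

```c
#include <stdio.h>
#include <string.h>

/* Extract pid, virtual size and anonymous RSS (both in kB) from a
 * "Killed process ..." line in the format shown in the log above.
 * Returns 1 on success, 0 if the line does not match. */
int parse_killed_line(const char *line, long *pid,
                      unsigned long long *vsz_kb,
                      unsigned long long *anon_rss_kb)
{
    const char *start = strstr(line, "Killed process ");

    if (start == NULL)
        return 0;
    /* %*[^)] skips the process name between the parentheses. */
    return sscanf(start,
                  "Killed process %ld (%*[^)]) vsz:%llukB, anon-rss:%llukB",
                  pid, vsz_kb, anon_rss_kb) == 3;
}
```

Lines that do not contain "Killed process", such as the preceding "Out of memory: kill process ..." line, make the function return 0.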

Free output:

 # free -m
              total       used       free     shared    buffers     cached
 Mem:          1999        455       1543          0          9        126
 -/+ buffers/cache:        319       1680
 Swap:         4095        354       3741

 # echo "2" > /proc/sys/vm/overcommit_memory
 # echo "100" > /proc/sys/vm/overcommit_ratio   <<< Here your system has failed.
  • The following program allocates memory and also writes to it, so the memory is actually used:

memtest2.c

 #include <stdio.h>
 #include <string.h>
 #include <stdlib.h>

 int main (void) {
         int n = 0;
         char *p;

         while (1) {
                 if ((p = malloc(1<<20)) == NULL) {
                         printf("malloc failure after %d MiB\n", n);
                         return 0;
                 }
                 /* Writing to the block forces the kernel to back it with real memory. */
                 memset (p, 0, (1<<20));
                 printf ("got %d MiB\n", ++n);
         }
 }


 # gcc memtest2.c
 # ./a.out
 got 4511 MiB
 got 4512 MiB
 malloc failure after 4512 MiB
  • The system allowed the program to use about 4.4 GiB (4512 MiB) of memory before malloc failed. This is because of overcommit_memory=2 and overcommit_ratio=100, which limit the total committed memory to swap plus 100% of RAM.

  • After running this program the system became very slow and sluggish, but it did not crash. The OOM killer then ran and killed the correct process.

 # less /var/log/messages
 Dec  6 00:19:23 dhcp1-109 kernel: [15358.836847] 494732 pages non-shared  
 Dec  6 00:19:23 dhcp1-109 kernel: [15358.836852] Out of memory: kill process 6299 (a.out) score 154833937 or a child  
 Dec  6 00:19:23 dhcp1-109 kernel: [15358.836857] Killed process 6299 (a.out) vsz:619335748kB, anon-rss:535344kB, file-rss:92kB  
 Dec  6 00:29:19 dhcp1-109 rtkit-daemon[2166]: The canary thread is apparently starving. Taking action.  
 Dec  6 00:29:19 dhcp1-109 rtkit-daemon[2166]: Demoting known real-time threads.  
 Dec  6 00:29:19 dhcp1-109 rtkit-daemon[2166]: Successfully demoted thread 2336 of process 2333 (/usr/bin/pulseaudio).  
 Dec  6 00:29:19 dhcp1-109 rtkit-daemon[2166]: Successfully demoted thread 2335 of process 2333 (/usr/bin/pulseaudio).  
 Dec  6 00:29:19 dhcp1-109 rtkit-daemon[2166]: Successfully demoted thread 2333 of process 2333 (/usr/bin/pulseaudio).  
 Dec  6 00:29:19 dhcp1-109 rtkit-daemon[2166]: Demoted 3 threads
  • The system is still up and running.
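
The limit that applies with overcommit_memory=2 can be worked out from the free output. The following is a minimal sketch of that arithmetic, assuming no huge pages are configured. commit_limit_mb is a hypothetical helper name, not a kernel or libc function.

```c
/* Estimate the commit limit used when vm.overcommit_memory = 2:
 * swap + RAM * overcommit_ratio / 100 (huge pages are ignored here). */
long commit_limit_mb(long ram_mb, long swap_mb, long overcommit_ratio)
{
    return swap_mb + ram_mb * overcommit_ratio / 100;
}
```

With the values from this system, commit_limit_mb(1999, 4095, 100) gives 6094 MB. The test program stopped at 4512 MiB, below that figure, presumably because memory already committed by other processes counts against the same limit. On a running system the kernel reports its own figures in the CommitLimit and Committed_AS lines of /proc/meminfo.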

Conclusion:

Reference

  • http://www.win.tue.nl/~aeb/linux/lk/lk-9.html