Running on a swappiness of 1, the system shows plenty of free memory: the kernel carves 6 GiB allocating ~ 1.5 GiB as free, 1.5 GiB for cache, and the remainder to processes, swapping roughly 750 MiB of stale process memory to swap.<br />
<br />
sar statistics for this from the operational period of the system (under load; I can provide sar details, but let's start with averages - they are representative of the situation during the day):<br />
Linux 2.6.32-358.2.1.el6.x86_64 () 04/22/2014 _x86_64_ (3 CPU)<br />
<br />
08:00:02 AM CPU %user %nice %system %iowait %steal %idle<br />
Average: all 19.09 0.00 36.17 15.55 0.13 29.06<br />
<br />
08:00:02 AM kbmemfree kbmemused %memused kbbuffers kbcached kbcommit %commit<br />
Average: 1254309 4714303 78.98 20126 2175944 20541612 124.85<br />
<br />
08:00:02 AM frmpg/s bufpg/s campg/s<br />
Average: 5.20 -0.10 -11.65<br />
<br />
08:00:02 AM kbswpfree kbswpused %swpused kbswpcad %swpcad<br />
<br />
Average: 9433352 1051688 10.03 257077 24.44<br />
<br />
Running on a swapiness of 0 (the same system, same load), the next day), the kernel oom killed our database server (on a live production system) taking a business temporarily off line.<br />
<br />
sar statistics for the time period:<br />
<br />
sar -s 15:00:00 -e 16:00:00<br />
Linux 2.6.32-358.2.1.el6.x86_64 04/23/2014 _x86_64_ (3 CPU)<br />
<br />
03:00:02 PM CPU %user %nice %system %iowait %steal %idle<br />
03:10:01 PM all 17.90 0.00 35.27 3.29 0.12 43.42<br />
03:20:02 PM all 19.94 0.00 36.40 8.73 0.12 34.82<br />
03:30:02 PM all 20.18 0.00 42.10 20.37 0.16 17.19<br />
03:40:01 PM all 5.75 0.00 19.93 5.05 0.06 69.20<br />
03:50:01 PM all 22.21 0.00 34.31 8.72 0.16 34.59<br />
Average: all 17.17 0.00 33.58 9.23 0.12 39.90<br />
<br />
sar -s 15:00:00 -e 16:00:00<br />
<br />
03:00:02 PM CPU %user %nice %system %iowait %steal %idle<br />
03:10:01 PM all 17.90 0.00 35.27 3.29 0.12 43.42<br />
03:20:02 PM all 19.94 0.00 36.40 8.73 0.12 34.82<br />
03:30:02 PM all 20.18 0.00 42.10 20.37 0.16 17.19<br />
03:40:01 PM all 5.75 0.00 19.93 5.05 0.06 69.20<br />
03:50:01 PM all 22.21 0.00 34.31 8.72 0.16 34.59<br />
Average: all 17.17 0.00 33.58 9.23 0.12 39.90<br />
<br />
sar -r -s 15:00:00 -e 16:00:00<br />
<br />
03:00:02 PM kbmemfree kbmemused %memused kbbuffers kbcached kbcommit %commit<br />
03:10:01 PM 125396 5843216 97.90 22160 1989352 23282984 141.51<br />
03:20:02 PM 44788 5923824 99.25 19092 1759124 23759172 144.40<br />
03:30:02 PM 34400 5934212 99.42 11864 1649488 23856192 144.99<br />
03:40:01 PM 4139748 1828864 30.64 22972 341264 9558784 58.10<br />
03:50:01 PM 1364632 4603980 77.14 30784 2576044 11827612 71.88<br />
Average: 1141793 4826819 80.87 21374 1663054 18456949 112.18<br />
<br />
sar -R -s 15:00:00 -e 16:00:00<br />
<br />
03:00:02 PM frmpg/s bufpg/s campg/s<br />
03:10:01 PM -24.49 -0.54 26.00<br />
03:20:02 PM -34.14 -1.30 -97.51<br />
03:30:02 PM -4.43 -3.08 -46.70<br />
03:40:01 PM 1732.71 4.69 -552.15<br />
03:50:01 PM -1188.87 3.35 957.39<br />
Average: 100.36 0.62 55.05<br />
<br />
sudo ssh BMS-AppDBServer-4 sar -S -s 15:00:00 -e 16:00:00<br />
<br />
03:00:02 PM kbswpfree kbswpused %swpused kbswpcad %swpcad<br />
03:10:01 PM 10471148 13892 0.13 4300 30.95<br />
03:20:02 PM 10471148 13892 0.13 4300 30.95<br />
03:30:02 PM 10471148 13892 0.13 4300 30.95<br />
03:40:01 PM 10475820 9220 0.09 3484 37.79<br />
03:50:01 PM 10475820 9220 0.09 3484 37.79<br />
Average: 10473017 12023 0.11 3973 33.04<br />
<br />
And kernel log (abreviated to remove the large suite of processes - its an application/database server so tons of httpd and other app server processes, some db server processes, and the usual remainder):<br />
<br />
Apr 23 15:31:01 kernel: proc_count invoked oom-killer: gfp_mask=0xd0, order=1, oom_adj=0, oom_score_adj=0<br />
Apr 23 15:31:01 kernel: proc_count cpuset=/ mems_allowed=0<br />
Apr 23 15:31:01 kernel: Pid: 1713, comm: proc_count Not tainted 2.6.32-358.2.1.el6.x86_64 <a href="http://bugs.centos.org/view.php?id=1">0000001</a><br />
Apr 23 15:31:01 kernel: Call Trace:<br />
Apr 23 15:31:01 kernel: [<ffffffff810cb5f1>] ? cpuset_print_task_mems_allowed+0x91/0xb0<br />
Apr 23 15:31:01 kernel: [<ffffffff8111cdf0>] ? dump_header+0x90/0x1b0<br />
Apr 23 15:31:01 kernel: [<ffffffff8111d272>] ? oom_kill_process+0x82/0x2a0<br />
Apr 23 15:31:01 kernel: [<ffffffff8111d1b1>] ? select_bad_process+0xe1/0x120<br />
Apr 23 15:31:01 kernel: [<ffffffff8111d6b0>] ? out_of_memory+0x220/0x3c0<br />
Apr 23 15:31:01 kernel: [<ffffffff8112c35c>] ? __alloc_pages_nodemask+0x8ac/0x8d0<br />
Apr 23 15:31:01 kernel: [<ffffffff810074fd>] ? xen_force_evtchn_callback+0xd/0x10<br />
Apr 23 15:31:01 kernel: [<ffffffff8116095a>] ? alloc_pages_current+0xaa/0x110<br />
Apr 23 15:31:01 kernel: [<ffffffff81129d3e>] ? __get_free_pages+0xe/0x50<br />
Apr 23 15:31:01 kernel: [<ffffffff8106bef4>] ? copy_process+0xe4/0x1450<br />
Apr 23 15:31:01 kernel: [<ffffffff81168432>] ? kmem_cache_alloc+0x182/0x190<br />
Apr 23 15:31:01 kernel: [<ffffffff8106d2f4>] ? do_fork+0x94/0x460<br />
Apr 23 15:31:01 kernel: [<ffffffff8119f642>] ? alloc_fd+0x92/0x160<br />
Apr 23 15:31:01 kernel: [<ffffffff8117dd17>] ? fd_install+0x47/0x90<br />
Apr 23 15:31:01 kernel: [<ffffffff81510160>] ? _spin_lock_irq+0x10/0x40<br />
Apr 23 15:31:01 kernel: [<ffffffff81009598>] ? sys_clone+0x28/0x30<br />
Apr 23 15:31:01 kernel: [<ffffffff8100b393>] ? stub_clone+0x13/0x20<br />
Apr 23 15:31:01 kernel: [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b<br />
Apr 23 15:31:01 kernel: Mem-Info:<br />
Apr 23 15:31:01 kernel: Node 0 DMA per-cpu:<br />
Apr 23 15:31:01 kernel: CPU 0: hi: 0, btch: 1 usd: 0<br />
Apr 23 15:31:01 kernel: CPU 1: hi: 0, btch: 1 usd: 0<br />
Apr 23 15:31:01 kernel: CPU 2: hi: 0, btch: 1 usd: 0<br />
Apr 23 15:31:01 kernel: Node 0 DMA32 per-cpu:<br />
Apr 23 15:31:01 kernel: CPU 0: hi: 186, btch: 31 usd: 29<br />
Apr 23 15:31:01 kernel: CPU 1: hi: 186, btch: 31 usd: 0<br />
Apr 23 15:31:01 kernel: CPU 2: hi: 186, btch: 31 usd: 0<br />
Apr 23 15:31:01 kernel: Node 0 Normal per-cpu:<br />
Apr 23 15:31:01 kernel: CPU 0: hi: 186, btch: 31 usd: 58<br />
Apr 23 15:31:01 kernel: CPU 1: hi: 186, btch: 31 usd: 0<br />
Apr 23 15:31:01 kernel: CPU 2: hi: 186, btch: 31 usd: 0<br />
Apr 23 15:31:01 kernel: active_anon:657282 inactive_anon:234243 isolated_anon:0<br />
Apr 23 15:31:01 kernel: active_file:1582 inactive_file:1577 isolated_file:0<br />
Apr 23 15:31:01 kernel: unevictable:415617 dirty:0 writeback:4 unstable:0<br />
Apr 23 15:31:01 kernel: free:7393 slab_reclaimable:13322 slab_unreclaimable:36105<br />
Apr 23 15:31:01 kernel: mapped:410321 shmem:409696 pagetables:86995 bounce:0<br />
Apr 23 15:31:01 kernel: Node 0 DMA free:7760kB min:12kB low:12kB high:16kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:7756kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes<br />
Apr 23 15:31:01 kernel: lowmem_reserve[]: 0 4024 6044 6044<br />
Apr 23 15:31:01 kernel: Node 0 DMA32 free:17736kB min:6620kB low:8272kB high:9928kB active_anon:2256636kB inactive_anon:564276kB active_file:2088kB inactive_file:2180kB unevictable:624700kB isolated(anon):0kB isolated(file):0kB present:4120800kB mlocked:0kB dirty:0kB writeback:16kB mapped:624832kB shmem:624832kB slab_reclaimable:32124kB slab_unreclaimable:90036kB kernel_stack:32696kB pagetables:206100kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no<br />
Apr 23 15:31:01 kernel: lowmem_reserve[]: 0 0 2020 2020<br />
Apr 23 15:31:01 kernel: Node 0 Normal free:4076kB min:3320kB low:4148kB high:4980kB active_anon:372492kB inactive_anon:372696kB active_file:4240kB inactive_file:4128kB unevictable:1037768kB isolated(anon):0kB isolated(file):0kB present:2068480kB mlocked:24776kB dirty:0kB writeback:0kB mapped:1016452kB shmem:1013952kB slab_reclaimable:21164kB slab_unreclaimable:54384kB kernel_stack:4328kB pagetables:141880kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no<br />
Apr 23 15:31:01 kernel: lowmem_reserve[]: 0 0 0 0<br />
Apr 23 15:31:01 kernel: Node 0 DMA: 2*4kB 1*8kB 2*16kB 1*32kB 0*64kB 2*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 1*4096kB = 7760kB<br />
Apr 23 15:31:01 kernel: Node 0 DMA32: 4083*4kB 12*8kB 4*16kB 3*32kB 2*64kB 0*128kB 0*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB = 17740kB<br />
Apr 23 15:31:01 kernel: Node 0 Normal: 615*4kB 11*8kB 1*16kB 1*32kB 0*64kB 0*128kB 0*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 4132kB<br />
Apr 23 15:31:01 kernel: 414562 total pagecache pages<br />
Apr 23 15:31:01 kernel: 1075 pages in swap cache<br />
Apr 23 15:31:01 kernel: Swap cache stats: add 22900328, delete 22899253, find 492200942/494634739<br />
Apr 23 15:31:01 kernel: Free swap = 10471148kB<br />
Apr 23 15:31:01 kernel: Total swap = 10485040kB<br />
Apr 23 15:31:01 kernel: 1572863 pages RAM<br />
Apr 23 15:31:01 kernel: 80710 pages reserved<br />
Apr 23 15:31:01 kernel: 2926057 pages shared<br />
Apr 23 15:31:01 kernel: 1058807 pages non-shared<br />
Apr 23 15:31:01 kernel: [ pid ] uid tgid total_vm rss cpu oom_adj oom_score_adj name<br />
<br />
<br />
Apr 23 15:31:01 kernel: Out of memory: Kill process 1357 (oninit) score 99 or sacrifice child<br />
Apr 23 15:31:01 kernel: Killed process 1359, UID 501, (oninit) total-vm:2056000kB, anon-rss:116kB, file-rss:10132kB<br />
Apr 23 15:31:20 kernel: awk invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0<br />
Apr 23 15:31:20 kernel: awk cpuset=/ mems_allowed=0<br />
Apr 23 15:31:20 kernel: Pid: 22899, comm: awk Not tainted 2.6.32-358.2.1.el6.x86_64 <a href="http://bugs.centos.org/view.php?id=1">0000001</a><br />
Apr 23 15:31:20 kernel: Call Trace:<br />
Apr 23 15:31:20 kernel: [<ffffffff810cb5f1>] ? cpuset_print_task_mems_allowed+0x91/0xb0<br />
Apr 23 15:31:20 kernel: [<ffffffff8111cdf0>] ? dump_header+0x90/0x1b0<br />
Apr 23 15:31:20 kernel: [<ffffffff8111d272>] ? oom_kill_process+0x82/0x2a0<br />
Apr 23 15:31:20 kernel: [<ffffffff8111d1b1>] ? select_bad_process+0xe1/0x120<br />
Apr 23 15:31:20 kernel: [<ffffffff8111d6b0>] ? out_of_memory+0x220/0x3c0<br />
Apr 23 15:31:20 kernel: [<ffffffff8112c35c>] ? __alloc_pages_nodemask+0x8ac/0x8d0<br />
Apr 23 15:31:20 kernel: [<ffffffff8116095a>] ? alloc_pages_current+0xaa/0x110<br />
Apr 23 15:31:20 kernel: [<ffffffff8111a1d7>] ? __page_cache_alloc+0x87/0x90<br />
Apr 23 15:31:20 kernel: [<ffffffff81119bbe>] ? find_get_page+0x1e/0xa0<br />
Apr 23 15:31:20 kernel: [<ffffffff8111b197>] ? filemap_fault+0x1a7/0x500<br />
Apr 23 15:31:20 kernel: [<ffffffff8100628f>] ? xen_set_pte_at+0xaf/0x170<br />
Apr 23 15:31:20 kernel: [<ffffffff81143194>] ? __do_fault+0x54/0x530<br />
Apr 23 15:31:20 kernel: [<ffffffff81143767>] ? handle_pte_fault+0xf7/0xb50<br />
Apr 23 15:31:20 kernel: [<ffffffff8100ba1d>] ? retint_restore_args+0x5/0x6<br />
Apr 23 15:31:20 kernel: [<ffffffff8100c1fb>] ? xen_hypervisor_callback+0x1b/0x20<br />
Apr 23 15:31:20 kernel: [<ffffffff8151098a>] ? error_exit+0x2a/0x60<br />
Apr 23 15:31:20 kernel: [<ffffffff8100ba1d>] ? retint_restore_args+0x5/0x6<br />
Apr 23 15:31:20 kernel: [<ffffffff8100c1fb>] ? xen_hypervisor_callback+0x1b/0x20<br />
Apr 23 15:31:20 kernel: [<ffffffff81004a49>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e<br />
Apr 23 15:31:20 kernel: [<ffffffff811443fa>] ? handle_mm_fault+0x23a/0x310<br />
Apr 23 15:31:20 kernel: [<ffffffff810474c9>] ? __do_page_fault+0x139/0x480<br />
Apr 23 15:31:20 kernel: [<ffffffff810074fd>] ? xen_force_evtchn_callback+0xd/0x10<br />
Apr 23 15:31:20 kernel: [<ffffffff81007ca2>] ? check_events+0x12/0x20<br />
Apr 23 15:31:20 kernel: [<ffffffff810074fd>] ? xen_force_evtchn_callback+0xd/0x10<br />
Apr 23 15:31:20 kernel: [<ffffffff81007ca2>] ? check_events+0x12/0x20<br />
Apr 23 15:31:20 kernel: [<ffffffff8151339e>] ? do_page_fault+0x3e/0xa0<br />
Apr 23 15:31:20 kernel: [<ffffffff81510755>] ? page_fault+0x25/0x30<br />
Apr 23 15:31:20 kernel: Mem-Info:<br />
Apr 23 15:31:20 kernel: Node 0 DMA per-cpu:<br />
Apr 23 15:31:20 kernel: CPU 0: hi: 0, btch: 1 usd: 0<br />
Apr 23 15:31:20 kernel: CPU 1: hi: 0, btch: 1 usd: 0<br />
Apr 23 15:31:20 kernel: CPU 2: hi: 0, btch: 1 usd: 0<br />
Apr 23 15:31:20 kernel: Node 0 DMA32 per-cpu:<br />
Apr 23 15:31:20 kernel: CPU 0: hi: 186, btch: 31 usd: 32<br />
Apr 23 15:31:20 kernel: CPU 1: hi: 186, btch: 31 usd: 0<br />
Apr 23 15:31:20 kernel: CPU 2: hi: 186, btch: 31 usd: 0<br />
Apr 23 15:31:20 kernel: Node 0 Normal per-cpu:<br />
Apr 23 15:31:20 kernel: CPU 0: hi: 186, btch: 31 usd: 90<br />
Apr 23 15:31:20 kernel: CPU 1: hi: 186, btch: 31 usd: 0<br />
Apr 23 15:31:20 kernel: CPU 2: hi: 186, btch: 31 usd: 0<br />
Apr 23 15:31:20 kernel: active_anon:659392 inactive_anon:234908 isolated_anon:0<br />
Apr 23 15:31:20 kernel: active_file:1093 inactive_file:1129 isolated_file:0<br />
Apr 23 15:31:20 kernel: unevictable:415617 dirty:0 writeback:0 unstable:0<br />
Apr 23 15:31:20 kernel: free:6348 slab_reclaimable:11215 slab_unreclaimable:36129<br />
Apr 23 15:31:20 kernel: mapped:410358 shmem:409696 pagetables:86756 bounce:0<br />
Apr 23 15:31:20 kernel: Node 0 DMA free:7760kB min:12kB low:12kB high:16kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:7756kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes<br />
Apr 23 15:31:20 kernel: lowmem_reserve[]: 0 4024 6044 6044<br />
Apr 23 15:31:20 kernel: Node 0 DMA32 free:14596kB min:6620kB low:8272kB high:9928kB active_anon:2264272kB inactive_anon:566132kB active_file:1476kB inactive_file:1444kB unevictable:624700kB isolated(anon):0kB isolated(file):0kB present:4120800kB mlocked:0kB dirty:0kB writeback:0kB mapped:624944kB shmem:624832kB slab_reclaimable:24184kB slab_unreclaimable:89940kB kernel_stack:32688kB pagetables:206172kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:1620 all_unreclaimable? no<br />
Apr 23 15:31:20 kernel: lowmem_reserve[]: 0 0 2020 2020<br />
Apr 23 15:31:20 kernel: Node 0 Normal free:3036kB min:3320kB low:4148kB high:4980kB active_anon:373296kB inactive_anon:373500kB active_file:2896kB inactive_file:3072kB unevictable:1037768kB isolated(anon):0kB isolated(file):2568kB present:2068480kB mlocked:24776kB dirty:0kB writeback:0kB mapped:1016488kB shmem:1013952kB slab_reclaimable:20676kB slab_unreclaimable:54576kB kernel_stack:4368kB pagetables:140852kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:9666 all_unreclaimable? no<br />
Apr 23 15:31:20 kernel: lowmem_reserve[]: 0 0 0 0<br />
Apr 23 15:31:20 kernel: Node 0 DMA: 2*4kB 1*8kB 2*16kB 1*32kB 0*64kB 2*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 1*4096kB = 7760kB<br />
Apr 23 15:31:20 kernel: Node 0 DMA32: 3297*4kB 12*8kB 4*16kB 3*32kB 2*64kB 0*128kB 0*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB = 14596kB<br />
Apr 23 15:31:20 kernel: Node 0 Normal: 356*4kB 1*8kB 0*16kB 1*32kB 0*64kB 0*128kB 0*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 3000kB<br />
Apr 23 15:31:20 kernel: 414714 total pagecache pages<br />
Apr 23 15:31:20 kernel: 1097 pages in swap cache<br />
Apr 23 15:31:20 kernel: Swap cache stats: add 22900397, delete 22899300, find 492200999/494634809<br />
Apr 23 15:31:20 kernel: Free swap = 10471368kB<br />
Apr 23 15:31:20 kernel: Total swap = 10485040kB<br />
Apr 23 15:31:20 kernel: 1572863 pages RAM<br />
Apr 23 15:31:20 kernel: 80710 pages reserved<br />
Apr 23 15:31:20 kernel: 2926158 pages shared<br />
Apr 23 15:31:20 kernel: 1059316 pages non-shared<br />
Apr 23 15:31:20 kernel: [ pid ] uid tgid total_vm rss cpu oom_adj oom_score_adj name<br />
<br />
Apr 23 15:31:20 kernel: Out of memory: Kill process 1357 (oninit) score 99 or sacrifice child<br />
Apr 23 15:31:20 kernel: Killed process 1357, UID 501, (oninit) total-vm:2056000kB, anon-rss:648kB, file-rss:1637852kB<br />
Apr 23 15:31:23 kernel: Pid 22322(fglrun-bin) over core_pipe_limit<br />
Apr 23 15:31:23 kernel: Skipping core dump<br />
Apr 23 15:31:23 kernel: Pid 21487(fglrun-bin) over core_pipe_limit<br />
Apr 23 15:31:23 kernel: Skipping core dump<br />
<br />
My expectation (and it would seem reasonable) is that oom killer would use cache or swap before killing processes.<br />
<br />
At a swapiness of 1 we have lots of wait io and the system is lightly sluggish due to swap, at a swappiness of 0 the wait io disappears (except for occasional processes which read large files - and we prefer that). <br />
<br />
There seems to be plenty of free RAM on the system; the problem is the kernel is not managing it well. IN particular, my understanding is that cache is supposed to be used only when available, at is to be freed immediately when required.<br />
<br />
Additionally, my understanding is that swappiness of 0 does not turn swap off - it just says only swap when needed. Hence oom should not kill it should swap.
↧
0007091: oom killer kills process rather than freeing cache or swapping processes under swappiness 0
↧