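Solution
The reboot was caused by the combination of two conditions:
The OOM killer was triggered because physical memory and swap were both fully exhausted.
vm.panic_on_oom was enabled, so the kernel panicked instead of killing a process.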
> kmem -i
                 PAGES        TOTAL      PERCENTAGE
    TOTAL MEM  1979578       7.6 GB         ---
         FREE    27818     108.7 MB    1% of TOTAL MEM
         USED  1951760       7.4 GB   98% of TOTAL MEM    <-- 98% memory used
       SHARED      270       1.1 MB    0% of TOTAL MEM
      BUFFERS       53       212 KB    0% of TOTAL MEM
       CACHED     1461       5.7 MB    0% of TOTAL MEM
         SLAB    14026      54.8 MB    0% of TOTAL MEM

   TOTAL SWAP  4196348        16 GB         ---
    SWAP USED  4196347        16 GB   99% of TOTAL SWAP
    SWAP FREE        1         4 KB    0% of TOTAL SWAP   <-- No free swap space

 COMMIT LIMIT  5186137      19.8 GB         ---
    COMMITTED  1564979         6 GB   30% of TOTAL LIMIT
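The figures in the kmem -i output can be cross-checked by hand. The following is a minimal sketch, assuming the 4 KB page size implied by the SWAP FREE line (1 page = 4 KB). The helper names pages_to_mb and pct_of are illustrative and are not part of crash.

```shell
#!/bin/sh
# Assumption: 4 KB pages, as implied by "SWAP FREE 1 4 KB" in the kmem -i output.
PAGE_KB=4

# Convert a page count to whole megabytes (integer division, rounds down).
pages_to_mb() {
    echo $(( $1 * PAGE_KB / 1024 ))
}

# Integer percentage of a part relative to a total (rounds down).
pct_of() {
    echo $(( $1 * 100 / $2 ))
}

echo "total RAM : $(pages_to_mb 1979578) MB"
echo "RAM used  : $(pct_of 1951760 1979578)%"
echo "swap used : $(pct_of 4196347 4196348)%"
```

Both percentages are rounded down, which matches the 98% and 99% shown in the dump even though only a single swap page remained free.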
# sysctl -a | grep vm.panic_on_oom
vm.panic_on_oom = 1      <-- panic on OOM is enabled
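For reference, vm.panic_on_oom accepts three values according to the kernel's sysctl documentation: 0 lets the OOM killer kill a process, 1 panics on a system-wide OOM, and 2 panics on every OOM, including ones confined to a cpuset or memory policy. The sketch below is only an illustration of how the current value might be read and interpreted; describe_panic_on_oom is a hypothetical helper name, not a system command.

```shell
#!/bin/sh
# Describe a vm.panic_on_oom value (meanings per the kernel sysctl/vm documentation).
# describe_panic_on_oom is an illustrative helper, not an existing tool.
describe_panic_on_oom() {
    case "$1" in
        0) echo "OOM killer kills a process; no panic" ;;
        1) echo "panic on system-wide OOM" ;;
        2) echo "always panic on OOM" ;;
        *) echo "unexpected value: $1" ;;
    esac
}

# Report the running system's setting if the file is readable (Linux only).
if [ -r /proc/sys/vm/panic_on_oom ]; then
    current=$(cat /proc/sys/vm/panic_on_oom)
    echo "vm.panic_on_oom = $current: $(describe_panic_on_oom "$current")"
fi
```

If a panic on OOM is not wanted, the value could be set to 0 with sysctl -w vm.panic_on_oom=0 and persisted in /etc/sysctl.conf. Whether that is appropriate depends on the system, since some environments enable the panic deliberately.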
In this example, the "java" process was consuming roughly 80% of memory (78.3% in the ps output):
> ps
..
   30934      1   1  ffff88010043aae0  IN  78.3  29303436  7384984  java
   30935      1   1  ffff88010043b540  IN  78.3  29303436  7384984  java
   30936      1   1  ffff880238f62080  IN  78.3  29303436  7384984  java
   31008      1   1  ffff8801060b0aa0  IN  78.3  29303436  7384984  java
   31009      1   1  ffff8801060b0040  IN  78.3  29303436  7384984  java
>  31010      1   0  ffff88010055a040  RU  78.3  29303436  7384984  java
   31690      1   0  ffff880239fc6aa0  IN  78.3  29303436  7384984  java
...
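To avoid the OOM condition, either add physical memory or swap space, or reduce the application's memory usage.
Application memory usage can be limited with ulimit or cgroups.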
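As a rough illustration of the ulimit approach, the sketch below runs a command under a cap on its virtual address space. The helper name run_with_vmem_limit and the 2 GB figure are assumptions chosen for the example, not values taken from the affected system.

```shell
#!/bin/sh
# Run a command with a cap on its virtual address space.
# run_with_vmem_limit is an illustrative helper name; the limit is given in KB.
run_with_vmem_limit() {
    limit_kb=$1
    shift
    (
        # The limit applies only inside this subshell and to the command it execs.
        ulimit -v "$limit_kb" || exit 1
        exec "$@"
    )
}

# Example: print the limit as seen by the child process (2 GB = 2097152 KB).
run_with_vmem_limit 2097152 sh -c 'ulimit -v'
```

Note that ulimit -v limits virtual address space rather than resident memory, and a JVM typically reserves far more address space than it touches, so a limit for java needs generous headroom. For the cgroup route on RHEL 6, the libcgroup tools (cgcreate, cgset, cgexec) can place the process in a memory cgroup with memory.limit_in_bytes set; the group name and limit are site-specific.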
Problem
A CentOS/RHEL 6 system reboots automatically because of an out-of-memory error:
      DUMPFILE: vmcore  [PARTIAL DUMP]
          CPUS: 2
          DATE: Mon Nov 29 05:28:02 2016
        UPTIME: 33 days, 09:45:55
  LOAD AVERAGE: 1.88, 1.52, 1.41
         TASKS: 218
      NODENAME: localhost
       RELEASE: 2.6.32-431.el6.x86_64
       VERSION: #1 SMP Sun Nov 10 22:19:54 EST 2013
       MACHINE: x86_64  (3000 Mhz)
        MEMORY: 8 GB
         PANIC: "Kernel panic - not syncing: Out of memory: system-wide panic_on_oom is enabled"
           PID: 31010
       COMMAND: "java"
          TASK: ffff88010055a040  [THREAD_INFO: ffff88001583e000]
           CPU: 0
         STATE: TASK_RUNNING (PANIC)
PANIC: "Kernel panic - not syncing: Out of memory: system-wide panic_on_oom is enabled"
Date: 2020-09-17 00:11:58  Source: oir  Author: oir