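Author: Yao Song, an extraterrestrial.
Produced by the ActionTech open source community. Original content may not be used without authorization; to repost it, please contact the editor and credit the source.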
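Background
With more than 2 GB of MemFree still left on the host, logging in to a container unexpectedly failed with out of memory.
Failing operation
Logging in to the OAT container reported an error.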
docker exec -it oat /bin/bash
Error message
OCI runtime state failed: runc did not terminate sucessfully: fatal error: runtime: out of memory
Check the container status
# docker ps -f name=oat
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
cf4cb4a2c359 reg.docker.alibaba-inc.com/oceanbase/oat:4.3.2_bp1_20250711_x86 "/oat/distribution/p…" 7 hours ago Up 7 hours oat
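Container status: the container is up and running normally. OAT itself can still be logged in to and operated as usual.
Check the container's resource limits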
# docker stats oat --no-stream
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
cf4cb4a2c359 oat 0.21% 483.3MiB / 3.701GiB 12.75% 0B / 0B 784MB / 95.3MB 97
# docker inspect oat | grep -i "memory"
"Memory": 0,
"KernelMemory": 0,
"MemoryReservation": 0,
"MemorySwap": 0,
"MemorySwappiness": null,
Container limits: no memory limit is set on the container, and its memory usage is not high.
Check /var/log/messages for OOM entries
# grep -E -i 'OOM|out of memory' /var/log/messages
Sep 23 01:31:58 localhost dockerd: time="2025-09-23T09:31:58.873152057+08:00" level=error msg="Error running exec cee8b083af29bef970aa60ff78c803e4ae8a6a4a14978aa62ae02a462f69b586 in container: OCI runtime state failed: runc did not terminate sucessfully: fatal error: runtime: out of memory\n\nruntime stack:\nruntime.throw(0xa1aa24, 0x16)\n\truntime/panic.go:1117 +0x72 fp=0x7ffd6d1993d0 sp=0x7ffd6d1993a0 pc=0x437bb2\nruntime.sysMap(0xc000000000, 0x4000000, 0xdcda50)\n\truntime/mem_linux.go:169 +0xc6 fp=0x7ffd6d199410 sp=0x7ffd6d1993d0 pc=0x41b0a6\nruntime.(*mheap).sysAlloc(0xdaf120, 0x400000, 0x0, 0x0)\n\truntime/malloc.go:729 +0x1e5 fp=0x7ffd6d1994b8 sp=0x7ffd6d199410 pc=0x40e545\nruntime.(*mheap).grow(0xdaf120, 0x1, 0x0)\n\truntime/mheap.go:1346 +0x85 fp=0x7ffd6d199520 sp=0x7ffd6d1994b8 pc=0x42a245\nruntime.(*mheap).allocSpan(0xdaf120, 0x1, 0x2c00, 0x0)\n\truntime/mheap.go:1173 +0x609 fp=0x7ffd6d1995a0 sp=0x7ffd6d199520 pc=0x42a049\nruntime.(*mheap).alloc.func1()\n\truntime/mheap.go:910 +0x59 fp=0x7ffd6d1995f0 sp=0x7ffd6d1995a0 pc=0x464ef9\nruntime.(*mheap).alloc(0xdaf120, 0x1, 0x220000012c, 0xffffffff)\n\truntime/mheap.go:904 +0x85 fp=0x7ffd6d199640 sp=0x7ffd6d1995f0 pc=0x4295e5\nruntime.(*mcentral).grow(0xdc1458, 0x0)\n\truntime/mcentral.go:232 +0x79 fp=0x7ffd6d199688 sp=0x7ffd6d199640 pc=0x41a699\nruntime.(*mcentral).cacheSpan(0xdc1458, 0x7ffd6d199738)\n\truntime/mcentral.go:158 +0x2ff fp=0x7ffd6d1996e0 sp=0x7ffd6d199688 pc=0x41a47f\nruntime.(*mcache).refill(0x7fd3b4b3d108, 0x2c)\n\truntime/mcache.go:162 +0xaa fp=0x7ffd6d199728 sp=0x7ffd6d1996e0 pc=0x41998a\nruntime.(*mcache).nextFree(0x7fd3b4b3d108, 0x7ffd6d19972c, 0x463ac5, 0x7ffd6d1997c8, 0x41005e)\n\truntime/malloc.go:882 +0x8d fp=0x7ffd6d199760 sp=0x7ffd6d199728 pc=0x40edad\nruntime.mallocgc(0x178, 0xa09b40, 0x7ffd6d199801, 0x7ffd6d199848)\n\truntime/malloc.go:1069 +0x850 fp=0x7ffd6d1997e8 sp=0x7ffd6d199760 pc=0x40f7b0\nruntime.newobject(0xa09b40, 0x463a80)\n\truntime/malloc.go:1177 +0x38 fp=0x7ffd6d199818 
sp=0x7ffd6d1997e8 pc=0x40fa18\nruntime.malg(0x8000, 0x0)\n\truntime/proc.go:3988 +0x31 fp=0x7ffd6d199858 sp=0x7ffd6d199818 pc=0x442db1\nruntime.mpreinit(0xd97ba0)\n\truntime/os_linux.go:355 +0x29 fp=0x7ffd6d199878 sp=0x7ffd6d199858 pc=0x4347c9\nruntime.mcommoninit(0xd97ba0, 0xffffffffffffffff)\n\truntime/proc.go:744 +0xf7 fp=0x7ffd6d1998c0 sp=0x7ffd6d199878 pc=0x43ba77\nruntime.schedinit()\n\truntime/proc.go:637 +0xaf fp=0x7ffd6d199920 sp=0x7ffd6d1998c0 pc=0x43b5ef\nruntime.rt0_go(0x7ffd6d199b88, 0x9, 0x7ffd6d199b88, 0x4005d8, 0x881252, 0x900000000, 0x7ffd6d199b88, 0x46c3c0, 0x10000, 0x10000, ...)\n\truntime/asm_amd64.s:220 +0x125 fp=0x7ffd6d199928 sp=0x7ffd6d199920 pc=0x46c505\n: unknown"
Log content: the log only contains the error raised by docker exec -it oat /bin/bash. There are no other OOM-related entries.
Check meminfo
# cat /proc/meminfo
MemTotal: 3880632 kB
MemFree: 2978048 kB
MemAvailable: 0 kB
Buffers: 2856 kB
Cached: 222120 kB
SwapCached: 0 kB
Active: 657516 kB
Inactive: 126644 kB
Active(anon): 559364 kB
Inactive(anon): 18312 kB
Active(file): 98152 kB
Inactive(file): 108332 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 0 kB
Writeback: 0 kB
AnonPages: 559184 kB
Mapped: 97404 kB
Shmem: 18492 kB
Slab: 54756 kB
SReclaimable: 30844 kB
SUnreclaim: 23912 kB
KernelStack: 4224 kB
PageTables: 10128 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 3880632 kB
Committed_AS: 4424840 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 15564 kB
VmallocChunk: 34359715580 kB
HardwareCorrupted: 0 kB
AnonHugePages: 8192 kB
CmaTotal: 0 kB
CmaFree: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 114544 kB
DirectMap2M: 3031040 kB
DirectMap1G: 3145728 kB
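MemFree is 2978048 kB (more than 2 GB), but MemAvailable is 0 kB. HugePages_Total is 0, so memory taken by huge pages can be ruled out.
Difference between MemFree and MemAvailable in meminfo
- MemFree is the number of currently free pages (a free page is not necessarily usable by a new allocation), while MemAvailable estimates whether the system has enough memory to start a new program.
- MemAvailable is computed by the kernel, roughly as follows, and a negative result is reported as 0:
MemAvailable ≈ MemFree + reclaimable PageCache + reclaimable Slab (SReclaimable) - low watermark (LowWatermark) - reserved memory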
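The gap between the two counters can be read straight from meminfo. The following is a minimal sketch, assuming the kernel exposes MemAvailable in /proc/meminfo; the function name show_mem_gap is made up for illustration, and it takes the file path as an argument so that it can also be pointed at a saved copy of meminfo.

```shell
# show_mem_gap is a hypothetical helper name, not a standard tool.
# Usage: show_mem_gap [meminfo_file]   (defaults to /proc/meminfo)
show_mem_gap() {
    awk '
        $1 == "MemFree:"      { free = $2 }
        $1 == "MemAvailable:" { avail = $2 }
        END {
            printf "MemFree=%d kB MemAvailable=%d kB\n", free, avail
            # Free pages that the kernel does not count as available
            gap = (free > avail) ? free - avail : 0
            printf "free_but_unavailable=%d kB\n", gap
        }
    ' "${1:-/proc/meminfo}"
}
```

Calling show_mem_gap with no argument reads the live /proc/meminfo. A large free_but_unavailable value is a hint that watermarks or reserves, rather than real usage, are eating the headroom.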
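Analysis
Although MemFree in /proc/meminfo is 2978048 kB (more than 2 GB), MemAvailable is 0 kB, which is why docker exec -it oat /bin/bash failed with out of memory.
Given how MemAvailable is computed, the suspicion is that the memory watermark is set too high, which drives the computed MemAvailable down to 0.
About memory watermarks
Linux memory management uses watermarks to keep the system from ending up in a completely unusable state.
The watermarks control how memory is reclaimed:
- Min watermark (min): when free memory falls below min, the kernel performs direct reclaim, and a process that is allocating memory is blocked while memory is reclaimed;
- Low watermark (low): when free memory falls below low, the kswapd daemon is woken up and starts reclaiming memory asynchronously;
- High watermark (high): when free memory rises above high, kswapd stops reclaiming.
Ordering of the values:
min < low < high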
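To see the three watermarks side by side for each zone, something like the following can be used. It is only a sketch: zone_watermarks is a hypothetical helper name, and it assumes the /proc/zoneinfo layout in which each zone starts with a "Node N, zone NAME" line followed by min, low and high lines counted in pages.

```shell
# zone_watermarks is a hypothetical helper name.
# Usage: zone_watermarks [zoneinfo_file] [page_size_kb]
# Defaults: /proc/zoneinfo and the page size reported by getconf.
zone_watermarks() {
    awk -v page_kb="${2:-$(( $(getconf PAGESIZE) / 1024 ))}" '
        $1 == "Node" && $3 == "zone" { zone = $4 }
        $1 == "min" { min = $2 }
        $1 == "low" { low = $2 }
        $1 == "high" {
            # high is the last of the three lines, so print the zone here
            printf "%s min=%d low=%d high=%d kB\n", zone, min * page_kb, low * page_kb, $2 * page_kb
            total += min * page_kb
        }
        END { printf "total_min=%d kB\n", total }
    ' "${1:-/proc/zoneinfo}"
}
```

Run without arguments it reads the live system. The total_min line is the sum of the per-zone min values converted to kB.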
Check the memory watermarks
# awk '/min/ {sum += $2} END {print sum * 4 " KB"}' /proc/zoneinfo
2097148 KB
# sysctl vm.min_free_kbytes
vm.min_free_kbytes = 2097152
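The min watermark is 2097152 kB, a full 2 GB, while the host has only 3880632 kB (about 3.7 GiB) of memory in total. With more than half of all memory held back by the watermark, vm.min_free_kbytes is clearly misconfigured.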
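The mismatch is easier to spot as a ratio. Here is a small sketch that prints vm.min_free_kbytes as a percentage of MemTotal; min_free_ratio is a hypothetical helper name, and the meminfo path is a parameter so the check can be run against a saved file.

```shell
# min_free_ratio is a hypothetical helper name.
# Usage: min_free_ratio <min_free_kbytes> [meminfo_file]
# Prints min_free_kbytes as an integer percentage of MemTotal (truncated).
min_free_ratio() {
    awk -v min_kb="$1" '
        $1 == "MemTotal:" { printf "%d%%\n", min_kb * 100 / $2 }
    ' "${2:-/proc/meminfo}"
}
```

On a live host it can be called as min_free_ratio "$(sysctl -n vm.min_free_kbytes)". With the numbers from this case, 2097152 kB against a MemTotal of 3880632 kB comes out at 54%, whereas 262144 kB would be 6%.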
Solution
Lower vm.min_free_kbytes to 256 MB.
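# Temporary: takes effect immediately, lost on reboot
sysctl -w vm.min_free_kbytes=262144
# Permanent: persist the setting in /etc/sysctl.conf
name_sysctl="vm.min_free_kbytes"
line="vm.min_free_kbytes=262144"
file="/etc/sysctl.conf"
if grep -w -i -q "${name_sysctl}" "${file}"; then
    # The key is already present: rewrite its value in place
    sed -i "s#${name_sysctl}[ ]*=.*#${line}#g" "${file}"
else
    # The key is missing: append it
    echo "${line}" >> "${file}"
fi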
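One case worth checking afterwards is an entry that exists only as a comment (for example "#vm.min_free_kbytes=..."): grep still matches it, and sed rewrites the value but leaves the line commented out. The sketch below reports the value that is actually active in a sysctl.conf-style file. The helper name persisted_sysctl is hypothetical, and it assumes comment lines start with # or ;.

```shell
# persisted_sysctl is a hypothetical helper name.
# Usage: persisted_sysctl <key> [conf_file]   (defaults to /etc/sysctl.conf)
# Prints the value of the last uncommented "key = value" line, if any.
persisted_sysctl() {
    awk -F= -v key="$1" '
        /^[ \t]*[#;]/ { next }               # skip comment lines
        {
            k = $1; gsub(/[ \t]/, "", k)      # key with whitespace stripped
            if (k == key) {
                v = $2; gsub(/[ \t]/, "", v)  # value with whitespace stripped
                found = v
            }
        }
        END { if (found != "") print found }
    ' "${2:-/etc/sysctl.conf}"
}
```

If persisted_sysctl vm.min_free_kbytes prints 262144 and sysctl -n vm.min_free_kbytes agrees after running sysctl -p, the setting is both persisted and active. Empty output means the file has no uncommented entry for the key.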
Keywords: #OceanBase #Linux #Memory

