Troubleshooting ram usage or out of memory issues with proxmox

From Notes_Wiki
Revision as of 07:57, 4 August 2024 by Saurabh

Home > Debian > Proxmox virtual environment > Troubleshooting ram usage or out of memory issues with proxmox

If a Proxmox host is running short of RAM or, even worse, some VMs are getting killed by the OOM (Out of Memory) killer, the following steps can be used to investigate further.

  1. Compare the total memory allocated to all VMs with the total physical memory of the host.
  2. Also check whether ballooning, with minimum and maximum memory values, has been enabled for the VMs.
  3. Run 'free -m' and look at the value in the last column (available) before planning to create more VMs, as the host OS/kernel also uses quite a bit of memory for its own tasks.
  4. In case of ZFS:
    1. Look at the configured limit and the current memory usage of the cache (ARC) via:
      cat /sys/module/zfs/parameters/zfs_arc_max
      arc_summary | grep -i curr
    2. Set the cache limit to 10GB for the current boot via:
      echo "$((10 * 1024*1024*1024))" >/sys/module/zfs/parameters/zfs_arc_max
    3. To apply this limit on boot as well, create /etc/modprobe.d/zfs.conf with:
      options zfs zfs_arc_max=10737418240
  5. Look at the atop output and check whether the MEM line near the top is shown in red. Also consider recording atop output every 5 minutes to help narrow down the issue.
    See Rocky 9.x atop
    1. Per-process memory usage is also available via smem:
      apt-get -y install smem
      smem -t -k -c "pid user command swap uss pss rss"
    2. Kernel slab allocations can also be seen via:
      slabtop
      However, how to interpret this output is not clear.
  6. Searching for "memory" in /var/log/syslog should give the timestamps at which Out of Memory (OOM) errors occurred:
    grep -i memory /var/log/syslog
    Then look at the activity and resource usage around the time the process was killed by the OOM killer.
  7. If there is a swap file, it should not be on a ZFS partition.
  8. Temporarily clearing the cache has not helped in the past:
    sync; echo 3 > /proc/sys/vm/drop_caches
    This did not free more than 1-2 GB, even when the used cache was 40GB or 140GB.
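
Steps 1 to 3 can be roughly scripted. The sketch below is not part of the original notes: the function names are hypothetical, and it assumes that VM configs live in /etc/pve/qemu-server/*.conf with a "memory: <MiB>" line each. Snapshot sections (lines starting with "[") repeat that setting and are skipped. VMs without an explicit "memory:" line are not counted.

```shell
# Sum the configured "memory:" values (MiB) across Proxmox VM config files.
# Assumption: config directory defaults to /etc/pve/qemu-server.
sum_vm_memory_mib() {
    conf_dir="${1:-/etc/pve/qemu-server}"
    set -- "$conf_dir"/*.conf
    # No config files found: report 0 instead of letting awk wait on stdin.
    [ -e "$1" ] || { echo 0; return 0; }
    awk '
        FNR == 1 { in_snapshot = 0 }   # new file: back in the main section
        /^\[/    { in_snapshot = 1 }   # a snapshot section begins
        !in_snapshot && $1 == "memory:" { total += $2 }
        END      { print total + 0 }
    ' "$@"
}

# Print MemAvailable in MiB from a meminfo-style file
# (defaults to /proc/meminfo, which reports the value in kB).
mem_available_mib() {
    meminfo="${1:-/proc/meminfo}"
    awk '$1 == "MemAvailable:" { print int($2 / 1024) }' "$meminfo"
}
```

Running both functions without arguments on the host gives two numbers to compare. If the configured total is far above the physical RAM shown by 'free -m', the host is overcommitted and relies on ballooning or swap.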

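For the ZFS cache limit in step 4, the byte value can be derived from a size in GiB rather than typed by hand. This is a small sketch with hypothetical helper names; it only prints values and does not change any setting itself.

```shell
# Convert a size in GiB to bytes, the unit zfs_arc_max expects.
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Print the modprobe option line for an ARC limit given in GiB.
arc_modprobe_line() {
    echo "options zfs zfs_arc_max=$(gib_to_bytes "$1")"
}
```

For example, 'gib_to_bytes 10 > /sys/module/zfs/parameters/zfs_arc_max' applies the limit for the current boot, and 'arc_modprobe_line 10 > /etc/modprobe.d/zfs.conf' writes the boot-time setting. When the root file system is on ZFS, the initramfs probably also needs to be refreshed (update-initramfs -u) for the boot-time value to take effect.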

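The grep in step 6 matches every line containing "memory", which can be noisy. A narrower filter is sketched below. The function name is hypothetical, and the patterns assume the usual kernel wording ("invoked oom-killer", "Out of memory: Killed process", "oom-kill:"), which may differ between kernel versions.

```shell
# Print only kernel OOM-killer lines from a log file.
# Assumption: the log path defaults to /var/log/syslog, as in step 6.
oom_events() {
    log_file="${1:-/var/log/syslog}"
    grep -E -i 'invoked oom-killer|out of memory|oom-kill:' "$log_file"
}
```

On hosts where /var/log/syslog does not exist because only the systemd journal is in use, the kernel log can be piped in instead, for example 'journalctl -k | oom_events -' (GNU grep reads standard input when the file name is "-").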