Ollama GPU usage validation

From Notes_Wiki
Revision as of 12:36, 2 February 2025 by Saurabh (talk | contribs)

Home > Local system based AI tools > Ollama > Ollama GPU usage validation

Ollama GPU usage validation

  1. Verify that `/etc/systemd/system/ollama.service` contains the following line:
    Environment="OLLAMA_FLASH_ATTENTION=1"
  2. Run the following in one terminal to monitor GPU usage, including the processes using the GPU:
    watch nvidia-smi
  3. Run a model in a second terminal, for example:
    ollama run deepseek-r1:8b
  4. In the nvidia-settings graphical interface, click on 'GPU 0 - <GPU name>' and check the 'GPU Utilization:' value.
  5. In a third terminal, run `ollama ps` to see how much of the model is running on the CPU versus the GPU:
    ollama ps
    NAME              ID              SIZE      PROCESSOR          UNTIL
    deepseek-r1:8b    28f8fd6cdc67    6.3 GB    35%/65% CPU/GPU    4 minutes from now
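Step 5 above reads the CPU/GPU split from the PROCESSOR column of `ollama ps`. A minimal sketch of extracting that value with awk, using the sample output shown above (the `parse_split` helper name is ours, not part of Ollama):

```shell
# parse_split: print the value preceding "CPU/GPU" (e.g. "35%/65%")
# from `ollama ps`-style output; skips the header line (NR>1).
parse_split() {
  awk 'NR>1 { for (i = 1; i <= NF; i++) if ($i == "CPU/GPU") print $(i-1) }'
}

# Sample output copied from the article; in practice, pipe the live
# command instead: ollama ps | parse_split
parse_split <<'EOF'
NAME              ID              SIZE      PROCESSOR          UNTIL
deepseek-r1:8b    28f8fd6cdc67    6.3 GB    35%/65% CPU/GPU    4 minutes from now
EOF
# prints: 35%/65%
```

Note that a fully GPU-resident model shows "100% GPU" in that column with no split, so this sketch only matches the mixed CPU/GPU case.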
