[[Main Page|Home]] > [[Local system based AI tools]] > [[Pinokio]]


=Install Pinokio AppImage=
To download and install Pinokio on a local system use:
# Go to https://github.com/pinokiocomputer/pinokio/releases
# Download AppImage and run via AppImageLauncher.  See [[Rocky 9.x Owncloud client via AppImage]]
# Then run Pinokio via graphical run options.
## For this in Rocky 9.x use:
##:<pre>
##:: chmod +x Pinokio.AppImage
##:: ./Pinokio.AppImage
##:</pre>
## For this in Ubuntu 24.04 use:
##:<pre>
##:: chmod +x Pinokio.AppImage
##:: ./Pinokio.AppImage --no-sandbox
##:</pre>
# During first run set a parent folder in a location where you have enough space (at least 300GB)
# Install recommended tools from [[Overall list of useful AI tools]]
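If you prefer a desktop launcher instead of AppImageLauncher, below is a minimal sketch of a freedesktop entry (the Exec path is an assumption; adjust it to wherever you saved the AppImage):
<pre>
cat > ~/.local/share/applications/pinokio.desktop <<'EOF'
[Desktop Entry]
Type=Application
Name=Pinokio
# Path below is hypothetical; point Exec at your actual AppImage location.
# Keep --no-sandbox on Ubuntu 24.04, drop it on Rocky 9.x.
Exec=/home/user/Downloads/Pinokio.AppImage --no-sandbox
Terminal=false
Categories=Development;
EOF
</pre>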


Refer:
* https://program.pinokio.computer/#/?id=linux
=Useful tools within Pinokio=
==Ultimate-TTS-Studio-SUP3R-Edition==
Ultimate-TTS-Studio, from https://github.com/pinokiofactory/Ultimate-TTS-Studio, has options to use Kokoro, KittenTTS, Higgs audio, Chatterbox/Multi, Fish-speech, F5, Indextts, Indextts2, VoxCPM and VibeVoice, all in a single application.
===Usage===
# Install using one-click Install
# Click "Load" for Kokoro TTS
# Wait for "Loaded (Auto-selected)" to appear against Kokoro TTS
# Type the input text in the "Text to Synthesize" text box
# Leave "TTS Engine" as "Kokoro TTS - Pre-Trained Voice"
# Set "Output format" to Wav
# From Engine Settings of Kokoro TTS Tab, by default "Heart" is selected
# Click "Generate Speech"
# On a GPU it takes only about a second to generate the audio for a few sentences or a paragraph.
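To sanity-check the generated file from a terminal, a small sketch (the path is an assumption; use wherever you saved the download):
<pre>
file ~/Downloads/output.wav     # should report RIFF WAVE audio data
aplay ~/Downloads/output.wav    # play via ALSA; any media player works too
</pre>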
==Dia==
'''[[#Ultimate-TTS-Studio-SUP3R-Edition]] is better'''
From https://github.com/nari-labs/dia we can use dia for text-to-speech generation.  It also has an option to upload a speech sample along with text, to generate more audio similar to the uploaded sample.
===Installation===
After installing dia via Pinokio one-click install it may not work.  Use the steps below to solve the issue:
# Open a terminal and run:
#:<pre>
#:: cd pinokio-files/api/dia.git/
#:: source app/env/bin/activate
#:: pip uninstall dac
#:: pip install git+https://github.com/descriptinc/descript-audio-codec.git
#:</pre>
#: Learned from https://github.com/nari-labs/dia/issues/140
# After this close Pinokio and open it again.  Then run dia in Pinokio and it should open this time.
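To confirm the replacement package landed in the app's virtualenv (run while the env from above is still activated), one possible check:
<pre>
pip show descript-audio-codec    # should list the git-installed package
</pre>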
===Usage===
# Type text in the "Text to Generate" box and let it generate.  Then play or download the file.
# The total generation time is shown automatically.  The file is generated in wav format.  There is a download button at the top right corner of the playback toolbox.
# Use [S1] or [S2] tags for the male or female speaker options
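A minimal input sketch using the speaker tags (the sentences are just placeholders):
<pre>
[S1] Hello, how are you today?
[S2] I am doing well, thank you for asking.
[S1] Great to hear!
</pre>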
==Browser-use==
From https://github.com/browser-use/web-ui we can benefit from browser-use by using AI (including vision models if required) to perform tasks on websites.  Browser-use can open a browser and perform actions on sites.  It supports local ollama, so we can run the automation locally without depending on SaaS-based AI.
To configure browser-use use:
# Install browser-use using Pinokio one-click install
# Ensure local ollama is installed properly and running.  Check whether the qwen3-vl:4b model is installed (see the pull sketch after this list if it is missing) via:
#:<pre>
#:: ollama list
#:</pre>
# In Agent settings tab:
## Use "LLM Provder": Ollama with "LLM Model Name": qwen3-vl:4b.  Enable use of vision
## Use same for Planner LLM provider including vision for planner LLM also
##: We are enabling vision and using a model 4b that can fit entirely in 8GB VRAM GPU card.  If you have more or less GPU VRAM adjust model accordingly.  We dont want to use CPU for vision as that will be too slow.
##: '''Vision is only required for certain tasks.  If HTML body of page would be enough for model to work then we can use non-vision models eg gpt-oss:20b which might run slower from CPU compared to GPU'''
##: ''' Open in other browser http://127.0.0.1:7788/ to change settings.  For some reason changing model in Pinokio UI does not works unless we close pinokio and open again.'''
# In Browser Settings
## Disable "Keep Browser Open"
## Enable "Headless mode".  We can always see what is visible in browser in the "Run agent" tab and we also get gif under "Task output" once task is complete or stopped. 
## Disable Browser security
# In Run Agent we can give a task such as "Open sbarjatiya.com/notes_wiki and search for article on how to configure swap space using file in Linux"
# Click "Submit Task" and wait for results
# The agent interaction page shows various browser screenshots and the next action plan based on what was displayed in the browser
# The task recording gif at the bottom shows the various browser screens with descriptive overlay text
# '''Overall results are not very impressive.  More research is required to get better output from this tool.'''
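If ollama list does not show qwen3-vl:4b, a quick sketch to fetch it (same model name as configured above):
<pre>
ollama pull qwen3-vl:4b
ollama list    # confirm the model now appears
</pre>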
==Forge==
From https://github.com/lllyasviel/stable-diffusion-webui-forge we can use Forge to generate or modify images using AI. 
===Installation===
# After installing Forge using one-click install we need to stop it to get options to download models
# Go to "Terminal" tab and choose stop. 
# We can also click on "View" button against top level Pinokio window to open Forge without running it.
# Download models (see the manual download sketch after this list if the built-in download keeps failing):
## "sd_xl_base_1.0.safetensors"
## "sd_xl_refiner_1.0.safetensors"
##: The download from the Hugging Face site is slow.  We may have to restart the download a few times by closing Pinokio and clicking these model downloads again.
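As an alternative, the checkpoints can be fetched manually with a resumable downloader.  A sketch, assuming the usual Hugging Face repositories for these files; the destination folder under the Pinokio install is also an assumption, so locate the models/Stable-diffusion directory of the Forge app first:
<pre>
# cd into Forge's checkpoint folder (path below is hypothetical, based on how
# Pinokio lays out apps under pinokio-files/api/)
cd pinokio-files/api/stable-diffusion-webui-forge.git/app/models/Stable-diffusion
# wget -c resumes partially completed downloads
wget -c https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors
wget -c https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0/resolve/main/sd_xl_refiner_1.0.safetensors
</pre>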
===Usage===
# Once we have required models we can run Forge from Pinokio
# The "UI" selector can be left as "SD"
# Under checkpoint ensure "sd_xl_base_1.0.safetensors" is selected
# Look at "Txt2img" tab first
# Under "Generation" click and expand "Refiner" option by also enabling refiner
## Under refiner checkpoint select "sd_xl_refiner_1.0 [7440042bbd]"
# Type some text (e.g. "Fire burning inside a very large hollow ice cube.  We should be able to see ice outside and fire burning within.") and click Generate
# Verify GPU usage using "nvidia-smi" (see the monitoring sketch below)
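For continuous monitoring while a generation runs, one option from another terminal:
<pre>
watch -n 1 nvidia-smi    # refresh GPU utilization and VRAM usage every second
</pre>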
There are options to edit images using Img2img and to generate images using other tools from the Spaces tab.  There are other options in the Extras tab too.
In the Spaces tab there is a '''Background removal''' tool which works very well.
==Wan2GP==
From https://github.com/deepbeepmeep/Wan2GP, if the graphics card has 8GB or more VRAM we can generate simple videos.  This takes a lot of time, usually 10 minutes for a 5.1-second video with a model that fits entirely in VRAM.  If the model does not fit in VRAM, then using both CPU and GPU it might take up to 1.5 hours for a 5.1-second video.
===Installation===
When we run the first generation it downloads the required model automatically, so the first run takes extra time to download the models.
===Usage===
# Use Wan2.1 with Text2video 1.3B and default settings
## You can try "Text2video 14B" if you have a GPU with 12GB or more memory (see the VRAM check sketch after this list)
# Leave the default orange-octopus prompt and test generation.
# Once the file is created we can view it in the tool or download it and save it.
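To check how much VRAM is available before choosing between the 1.3B and 14B models:
<pre>
nvidia-smi --query-gpu=name,memory.total --format=csv
</pre>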




[[Main Page|Home]] > [[Local system based AI tools]] > [[Pinokio]]
