KoboldCpp

What is KoboldCpp?

KoboldCpp is AI text-generation software for GGML and GGUF models, inspired by KoboldAI. It is a single self-contained distributable that builds on llama.cpp and adds additional features. It runs on CPU or GPU, and supports image generation via Stable Diffusion 1.5, SDXL, SD3, and Flux, speech-to-text via Whisper, and text-to-speech via OuteTTS. It includes the KoboldAI Lite UI with multiple modes and provides API endpoints for KoboldCppApi, OpenAiApi, OllamaApi, A1111ForgeApi, ComfyUiApi, WhisperTranscribeApi, XttsApi, and OpenAiSpeechApi. It works on Windows, macOS, Linux, Android (Termux), Docker, Colab, and cloud GPU providers such as RunPod and Novita AI.
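As an illustration of the OpenAiApi endpoint listed above, here is a minimal sketch of a chat completion request sent to a locally running instance. It assumes the default port 5001 and the OpenAI-compatible /v1/chat/completions route; the model field is assumed to be ignored, since KoboldCpp serves whichever model is currently loaded.

```python
import requests

# Assumed local KoboldCpp instance on the default port.
BASE_URL = "http://localhost:5001"

payload = {
    "model": "local-model",  # placeholder; the loaded GGUF model is used regardless
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize what KoboldCpp does in one sentence."},
    ],
    "max_tokens": 128,
    "temperature": 0.7,
}

# Send an OpenAI-style chat completion request and print the reply text.
resp = requests.post(f"{BASE_URL}/v1/chat/completions", json=payload, timeout=120)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```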

How to use KoboldCpp

Download the binary for your operating system from the GitHub releases page and make it executable if required. Launching with no command-line arguments opens the GUI launcher; run with --help to see the available options. Obtain a GGUF model and load it, then connect to http://localhost:5001 in a browser for the KoboldAI Lite UI, or call one of the API endpoints, as in the sketch below.
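The following sketch shows a text-generation request against the native KoboldCppApi. It assumes the default port 5001 and the KoboldAI-style /api/v1/generate route with KoboldAI parameter names (max_length, temperature, top_p); if your build rejects the request, check the API documentation it serves.

```python
import requests

# Assumed prompt sent to a locally running KoboldCpp instance (default port 5001).
payload = {
    "prompt": "### Instruction:\nWrite a haiku about llamas.\n### Response:\n",
    "max_length": 80,     # number of tokens to generate
    "temperature": 0.7,
    "top_p": 0.9,
}

resp = requests.post("http://localhost:5001/api/v1/generate", json=payload, timeout=120)
resp.raise_for_status()
# KoboldAI-style responses place the generated text under results[0].text.
print(resp.json()["results"][0]["text"])
```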

KoboldCpp features

Single File Executable

No installation required.

Runs on CPU or GPU

Supports full or partial GPU offload of model layers.

Image Generation

Stable Diffusion 1.5, SDXL, SD3, and Flux; see the example request after the feature list.

Speech-To-Text

Whisper voice recognition.

Text-To-Speech

OuteTTS voice generation.
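The image-generation feature referenced above can also be driven programmatically. The sketch below is an assumption-heavy example against the A1111ForgeApi compatibility layer: it assumes the default port 5001, the A1111-style /sdapi/v1/txt2img route, that an image model (e.g. SD 1.5) is loaded, and that the response carries base64-encoded images in an images list, as in the A1111 API.

```python
import base64
import requests

# Assumed text-to-image request against KoboldCpp's A1111-compatible route.
payload = {
    "prompt": "a watercolor painting of a llama in the Andes",
    "negative_prompt": "blurry, low quality",
    "width": 512,
    "height": 512,
    "steps": 20,
    "cfg_scale": 7,
}

resp = requests.post("http://localhost:5001/sdapi/v1/txt2img", json=payload, timeout=600)
resp.raise_for_status()

# A1111-style responses return base64-encoded images; decode and save the first one.
image_b64 = resp.json()["images"][0]
with open("llama.png", "wb") as f:
    f.write(base64.b64decode(image_b64))
```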

KoboldCpp pricing

KoboldCpp

Free
  • Open-source download from GitHub releases
  • Runs on CPU/GPU
  • No installation required