llama.cpp by ggml-org
LLM inference in C/C++
About llama.cpp
From the project's README at github.com/ggml-org/llama.cpp. Lightly cleaned for readability; for the full source see the upstream repo.
Recent API changes

- Changelog for API
- Changelog for REST API

Hot topics

- Hugging Face cache migration: models downloaded with are now stored in the standard Hugging Face cache directory, enabling sharing with other HF tools.
- guide : using the new WebUI of llama.cpp
- guide : running gpt-oss with llama.cpp
- [[FEEDBACK] Better packaging for llama.cpp to support downstream consumers](https://github.com/ggml-org/llama.cpp/discussions/15313)
- Support for the model with native MXFP4 format h
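The cache migration item mentions the "standard Hugging Face cache directory". As a rough illustration of where that directory usually lives, the sketch below follows the lookup order documented for the `huggingface_hub` Python library (`HF_HUB_CACHE`, then `HF_HOME`, then the XDG or home-directory default). Whether llama.cpp resolves the path in exactly this order is an assumption, and `hf_hub_cache_dir` is a hypothetical helper name, not part of any library.

```python
import os
from pathlib import Path


def hf_hub_cache_dir(env=None):
    """Resolve the hub cache path using huggingface_hub's documented lookup order.

    Assumption: llama.cpp follows the same convention; check its docs to confirm.
    """
    env = os.environ if env is None else env
    # 1. An explicit HF_HUB_CACHE takes precedence.
    if env.get("HF_HUB_CACHE"):
        return Path(env["HF_HUB_CACHE"]).expanduser()
    # 2. Otherwise the cache is the "hub" folder under HF_HOME.
    if env.get("HF_HOME"):
        return Path(env["HF_HOME"]).expanduser() / "hub"
    # 3. HF_HOME defaults to $XDG_CACHE_HOME/huggingface, or ~/.cache/huggingface.
    base = env.get("XDG_CACHE_HOME") or "~/.cache"
    return Path(base).expanduser() / "huggingface" / "hub"


print(hf_hub_cache_dir())
```

Passing a plain dictionary instead of the real environment, for example `hf_hub_cache_dir({"HF_HOME": "/data/hf"})`, makes it easy to check how an override would change the location.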
Health score breakdown
6-dimension composite. See methodology for formula and weights.
Adoption signals
Real-world usage data, pulled from each registry. Larger numbers suggest a more widely used, battle-tested project.
| Signal | Value | Source |
|---|---|---|
| GitHub stars | 108k | github.com/ggml-org/llama.cpp |
| GitHub forks | 18k | github.com/ggml-org/llama.cpp |
Release & maintenance
Is this project actively maintained, or about to die? Check how recent the last commit and the last release are.
| Signal | Value | Detail |
|---|---|---|
| Project age | 3.2 years | since Mar 2023 |
| Last commit | 2 days ago | May 4, 2026 |
| Releases shipped | 5,993 | last: 2 days ago |
| Security policy | SECURITY.md | declared by maintainers |
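To put the release count in perspective, the figures in the table can be turned into an average cadence. This is back-of-the-envelope arithmetic only: the project age is given to one decimal place, so the result is approximate, and the variable names are illustrative.

```python
# Figures copied from the release and maintenance table.
project_age_years = 3.2
releases_shipped = 5993

# Approximate the age in days; 365.25 averages out leap years.
project_age_days = project_age_years * 365.25
releases_per_day = releases_shipped / project_age_days

print(f"about {releases_per_day:.1f} releases per day on average")
```

An average of roughly five releases per day is consistent with the "last: 2 days ago" entry and suggests releases are cut automatically and often rather than on a manual schedule, though the page itself does not say how they are produced.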
Self-hosting cost across providers
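Detected requirements: 4GB RAM, 40GB disk minimum. The table lists the cheapest plan found per provider; of these, only the hetzner CAX11 meets the stated requirement, while the 1GB · 25GB plans from the other providers fall short of it.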
| Provider | Plan | Specs | Monthly |
|---|---|---|---|
| hetzner | CAX11 | 2c · 4GB · 40GB | $4.13 USD |
| vultr | VC2 | 1c · 1GB · 25GB | $5 USD |
| linode | Nanode 1GB | 1c · 1GB · 25GB | $5.12 USD |
| digitalocean | Basic Regular 1GB | 1c · 1GB · 25GB | $6 USD |
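The selection rule described for this table (cheapest plan per provider that meets the minimum RAM and disk) can be sketched as a filter followed by a per-provider minimum. The plan data below is copied from the table; the `Plan` class and `cheapest_qualifying` function are hypothetical names for illustration, not the site's actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    provider: str
    name: str
    vcpus: int
    ram_gb: int
    disk_gb: int
    monthly_usd: float


# Rows copied from the table; larger plans from each provider are not listed there.
PLANS = [
    Plan("hetzner", "CAX11", 2, 4, 40, 4.13),
    Plan("vultr", "VC2", 1, 1, 25, 5.00),
    Plan("linode", "Nanode 1GB", 1, 1, 25, 5.12),
    Plan("digitalocean", "Basic Regular 1GB", 1, 1, 25, 6.00),
]


def cheapest_qualifying(plans, min_ram_gb, min_disk_gb):
    """Map each provider to its cheapest plan meeting both minimums.

    Providers with no qualifying plan are left out of the result.
    """
    best = {}
    for plan in plans:
        if plan.ram_gb < min_ram_gb or plan.disk_gb < min_disk_gb:
            continue  # does not meet the detected requirements
        current = best.get(plan.provider)
        if current is None or plan.monthly_usd < current.monthly_usd:
            best[plan.provider] = plan
    return best


result = cheapest_qualifying(PLANS, min_ram_gb=4, min_disk_gb=40)
for provider, plan in result.items():
    print(provider, plan.name, f"${plan.monthly_usd:.2f}/mo")
```

Run against the listed rows with the 4GB RAM and 40GB disk minimums, only the hetzner CAX11 passes the filter. A fuller comparison would need each provider's larger plans, which this page does not include.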
Security advisories
CVE-2026-33298
Ready to self-host llama.cpp?
Spin up a hetzner CAX11 (4GB RAM, 40GB disk) for $4.13/mo and follow the project's official install docs.
Data last refreshed May 7, 2026.