Promptfoo: LLM evals & red teaming by promptfoo

Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration. Used by OpenAI and Anthropic.

llm · prompt-engineering · prompts · llmops · prompt-testing · testing · rag · evaluation · evaluation-framework · llm-eval · llm-evaluation · llm-evaluation-framework
Verdict: 83/100 health · $4.13/mo cheapest (hetzner) · 2/5 setup difficulty · Last release 10 days ago

Self-host Promptfoo: LLM evals & red teaming on a Hetzner CAX11 for $4.13/mo.

Health score: 83/100 (6-dim composite)
Self-hosts from: $4.13/mo (hetzner · CAX11)
Difficulty: 2/5 (Docker + read README)
GitHub stars: 21k (1.8k forks)

About Promptfoo: LLM evals & red teaming

From the project's README at github.com/promptfoo/promptfoo. Lightly cleaned for readability; for the full source see the upstream repo.

promptfoo is a CLI and library for evaluating and red-teaming LLM apps. Stop the trial-and-error approach - start shipping secure, reliable AI apps.

> Promptfoo is now part of OpenAI. Promptfoo remains open source and MIT licensed. Read the [company update](https://www.promptfoo.dev/blog/promptfoo-joinin

Health score breakdown

6-dimension composite. See methodology for formula and weights.

activity: 100
maturity: 93
community: 95
security: 85
sustainability: 100
adoption: 39
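
As a rough illustration of how a 6-dimension composite can be computed, the sketch below takes a weighted average of the scores listed in this breakdown. The equal weights are an assumption made purely for illustration; the directory's actual formula and weights are in its methodology and are not reproduced on this page. The function name `weighted_composite` is hypothetical.

```python
# Dimension scores copied from the health score breakdown on this page.
scores = {
    "activity": 100,
    "maturity": 93,
    "community": 95,
    "security": 85,
    "sustainability": 100,
    "adoption": 39,
}


def weighted_composite(scores, weights):
    """Weighted average of dimension scores; weights need not sum to 1."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# ASSUMPTION: equal weights. The real weights are not shown on this page.
equal_weights = {name: 1.0 for name in scores}
unweighted = weighted_composite(scores, equal_weights)
print(f"equal-weight mean: {unweighted:.1f}")  # equal-weight mean: 85.3
```

The equal-weight mean comes out at 85.3, not the published 83, which suggests the real composite does not weight all six dimensions equally.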

Adoption signals

Real-world usage data, pulled from each registry. The bigger the numbers, the more battle-tested the project.

| Signal | Value | Source |
|---|---|---|
| GitHub stars | 21k | github.com/promptfoo/promptfoo |
| GitHub forks | 1.8k | github.com/promptfoo/promptfoo |
| NPM downloads (last month) | 904k | promptfoo |

Release & maintenance

Is this project actively maintained, or about to die? Check how recent the last commit and the last release are.

Project age: 3.0 years (since Apr 2023)
Last commit: 2 days ago (May 5, 2026)
Releases shipped: 406 (last: 10 days ago)
Security policy: SECURITY.md (declared by maintainers)
Funding links: 1 (declared by maintainers)

Self-hosting cost across providers

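Detected requirements: 4GB RAM and 40GB disk minimum. The table lists the cheapest plan per provider. Of the plans shown, only the Hetzner CAX11 meets the detected requirement; the 1GB / 25GB plans fall below it.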

| Provider | Plan | Specs | Monthly |
|---|---|---|---|
| hetzner | CAX11 | 2c · 4GB · 40GB | $4.13 USD |
| vultr | VC2 | 1c · 1GB · 25GB | $5 USD |
| linode | Nanode 1GB | 1c · 1GB · 25GB | $5.12 USD |
| digitalocean | Basic Regular 1GB | 1c · 1GB · 25GB | $6 USD |
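
To make the "cheapest plan that meets the requirement" rule concrete, here is a minimal sketch that filters the four plans from the table by the detected minimums (4GB RAM, 40GB disk) and picks the lowest monthly price. The `Plan` class and the `cheapest_meeting` helper are hypothetical names introduced for illustration; this is not the directory's actual selection code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    provider: str
    name: str
    vcpus: int
    ram_gb: int
    disk_gb: int
    usd_per_month: float


# Rows copied from the cost table on this page.
PLANS = [
    Plan("hetzner", "CAX11", 2, 4, 40, 4.13),
    Plan("vultr", "VC2", 1, 1, 25, 5.00),
    Plan("linode", "Nanode 1GB", 1, 1, 25, 5.12),
    Plan("digitalocean", "Basic Regular 1GB", 1, 1, 25, 6.00),
]


def cheapest_meeting(plans, min_ram_gb, min_disk_gb):
    """Return the cheapest plan satisfying both minimums, or None if none does."""
    eligible = [
        p for p in plans
        if p.ram_gb >= min_ram_gb and p.disk_gb >= min_disk_gb
    ]
    return min(eligible, key=lambda p: p.usd_per_month) if eligible else None


best = cheapest_meeting(PLANS, min_ram_gb=4, min_disk_gb=40)
print(best)
```

With the rows as listed, only the hetzner CAX11 passes the filter, so it is selected regardless of price; the other three plans would need a larger tier to qualify.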

Ready to self-host Promptfoo: LLM evals & red teaming?

Spin up a Hetzner CAX11 (4GB RAM, 40GB disk) for $4.13/mo and follow the project's official install docs.

Data last refreshed May 7, 2026.

Data refreshes every 30 minutes.