Skrub data by skrub-data
Machine learning with dataframes
machine-learning, data-science, data-cleaning, data, data-preparation, data-preprocessing, data-analysis, dirty-data, data-wrangling, dataframe, dataframes
Verdict: 70/100 health · $4.13/mo cheapest (hetzner) · 2/5 setup difficulty · last release 1 month ago
| Metric | Value | Detail |
|---|---|---|
| Health score | 70/100 | 6-dimension composite |
| Self-hosts from | $4.13/mo | hetzner · CAX11 |
| Difficulty | 2/5 | Docker + read README |
| GitHub stars | 1.6k | 215 forks |
Health score breakdown
6-dimension composite. See methodology for formula and weights.
| Dimension | Score |
|---|---|
| activity | 89 |
| maturity | 74 |
| community | 90 |
| security | 70 |
| sustainability | 65 |
| adoption | 29 |
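As a rough illustration of how a composite like this can be computed, the sketch below takes a weighted average of the six dimension scores listed above. The weights are an assumption: this page does not state them, so the example simply uses equal weights. The helper name `composite` is hypothetical.

```python
# Dimension scores copied from the health score breakdown.
scores = {
    "activity": 89,
    "maturity": 74,
    "community": 90,
    "security": 70,
    "sustainability": 65,
    "adoption": 29,
}

# ASSUMPTION: equal weights. The actual weights are defined in the
# methodology and are not given on this page.
weights = {name: 1 for name in scores}


def composite(scores, weights):
    """Weighted average of dimension scores, normalised by total weight."""
    total_weight = sum(weights.values())
    return sum(scores[name] * weights[name] for name in scores) / total_weight


health = composite(scores, weights)
print(f"{health:.1f}")  # prints 69.5
```

The equal-weight mean of 69.5 is close to the reported 70/100, but that alone does not confirm the real formula; a different set of weights could land on the same rounded value.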
Adoption signals
Real-world usage data, pulled from each registry. The bigger the numbers, the more battle-tested the project.
| Signal | Value | Source |
|---|---|---|
| GitHub stars | 1.6k | github.com/skrub-data/skrub |
| GitHub forks | 215 | github.com/skrub-data/skrub |
| PyPI downloads (last month) | 202k | skrub |
Release & maintenance
Is this project actively maintained, or about to die? Check the recency of last commit and last release.
| Metric | Value | Detail |
|---|---|---|
| Project age | 8.2 years | since Mar 2018 |
| Last commit | 3 days ago | May 4, 2026 |
| Releases shipped | 7 | last: 1 month ago |
Self-hosting cost across providers
Detected requirements: 4GB RAM, 40GB disk minimum. Cheapest plan per provider that meets the requirement.
| Provider | Plan | Specs | Monthly |
|---|---|---|---|
| hetzner | CAX11 | 2c · 4GB · 40GB | $4.13 USD |
| vultr | VC2 | 1c · 1GB · 25GB | $5 USD |
| linode | Nanode 1GB | 1c · 1GB · 25GB | $5.12 USD |
| digitalocean | Basic Regular 1GB | 1c · 1GB · 25GB | $6 USD |
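The "cheapest plan that meets the requirement" rule can be sketched as a filter followed by a minimum. The example below takes the plan rows above at face value and checks them against the detected 4GB RAM / 40GB disk minimum. The `Plan` class and `cheapest_matching` helper are hypothetical names used only for illustration, not part of any provider API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    provider: str
    name: str
    ram_gb: int
    disk_gb: int
    usd_per_month: float


# Rows copied from the provider table.
PLANS = [
    Plan("hetzner", "CAX11", 4, 40, 4.13),
    Plan("vultr", "VC2", 1, 25, 5.00),
    Plan("linode", "Nanode 1GB", 1, 25, 5.12),
    Plan("digitalocean", "Basic Regular 1GB", 1, 25, 6.00),
]


def cheapest_matching(plans, min_ram_gb, min_disk_gb):
    """Return the cheapest plan meeting both minimums, or None if none does."""
    eligible = [
        p for p in plans
        if p.ram_gb >= min_ram_gb and p.disk_gb >= min_disk_gb
    ]
    return min(eligible, key=lambda p: p.usd_per_month, default=None)


best = cheapest_matching(PLANS, min_ram_gb=4, min_disk_gb=40)
print(best)
```

With the specs as listed, only the hetzner CAX11 clears the 4GB / 40GB bar; the 1GB plans from the other providers are filtered out, so a larger tier would be needed there.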
What people say on Hacker News
- Officials scrub data showing US citizens swept up in immigration arrests
- Show HN: Apple's SHARP running in the browser via ONNX runtime web
- Show HN: Share browser recordings on Cloudflare Pages from the command line
- Show HN: Flipbook – scrub through media frame-by-frame
- Instead of writing my manuscript, I built a tool
Ready to self-host Skrub data?
Spin up a hetzner CAX11 (4GB RAM, 40GB disk) for $4.13/mo and follow the project's official install docs.
Data last refreshed May 7, 2026.
Similar open-source projects
Projects in our directory that replace the same SaaS or share topics with Skrub data.
pandas
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R da
scikit-learn
scikit-learn: machine learning in Python
polars
Extremely fast Query Engine for DataFrames, written in Rust
airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
streamlit
Streamlit — A faster way to build and share data apps.
prefect
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.