Apache Spark by apache

Apache Spark - A unified analytics engine for large-scale data processing

python · scala · r · java · big-data · jdbc · sql · spark
Verdict: 61/100 health · $4.13/mo cheapest (hetzner) · 2/5 setup difficulty · 24.36M Docker pulls · 9 open CVEs

Self-host Apache Spark on hetzner CAX11 for $4.13/mo.

Health score: 61/100 (6-dimension composite)
Self-hosts from: $4.13/mo (hetzner · CAX11)
Difficulty: 2/5 (Docker + read README)
GitHub stars: 43k (29k forks)

About Apache Spark

From the project's README at github.com/apache/spark. Lightly cleaned for readability; for the full source see the upstream repo.

Spark is a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R (Deprecated), and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools including Spark SQL for SQL and DataFrames, pandas API on Spark for pandas workloads, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for stream processing.

Health score breakdown

6-dimension composite. See methodology for formula and weights.

Dimension | Score
activity | 100
maturity | 30
community | 42
security | 85
sustainability | 65
adoption | 45
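
As a rough illustration of how a composite like this could be formed, here is a minimal sketch that averages the six sub-scores listed above. The equal weighting is an assumption: the actual formula and weights live in the methodology and are not reproduced on this page. The function name `composite` is hypothetical.

```python
# Sub-scores copied from the breakdown above.
scores = {
    "activity": 100,
    "maturity": 30,
    "community": 42,
    "security": 85,
    "sustainability": 65,
    "adoption": 45,
}


def composite(scores, weights=None):
    """Weighted mean of the sub-scores.

    ASSUMPTION: equal weights when none are given. The real
    methodology may weight the dimensions differently.
    """
    if weights is None:
        weights = {name: 1.0 for name in scores}
    total_weight = sum(weights[name] for name in scores)
    weighted_sum = sum(scores[name] * weights[name] for name in scores)
    return weighted_sum / total_weight


health = composite(scores)
print(round(health))  # 367 / 6 = 61.17, which rounds to 61
```

An equal-weight mean happens to land on the displayed 61, but that alone does not confirm the real weighting; other weight sets can produce the same rounded value.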

Adoption signals

Real-world usage data, pulled from each registry. The bigger the numbers, the more battle-tested the project.

Signal | Value | Source
GitHub stars | 43k | github.com/apache/spark
GitHub forks | 29k | github.com/apache/spark
Docker Hub pulls | 24.36M | hub.docker.com / apache

Release & maintenance

Is this project actively maintained, or about to die? Check the recency of last commit and last release.

Project age: 12.2 years (since Feb 2014)
Last commit: 2 days ago (May 5, 2026)
Security policy: SECURITY.md (declared by maintainers)

Self-hosting cost across providers

Detected requirements: 4GB RAM, 40GB disk minimum. Cheapest plan per provider that meets the requirement.

Provider | Plan | Specs | Monthly
hetzner | CAX11 | 2c · 4GB · 40GB | $4.13 USD
vultr | VC2 | 1c · 1GB · 25GB | $5 USD
linode | Nanode 1GB | 1c · 1GB · 25GB | $5.12 USD
digitalocean | Basic Regular 1GB | 1c · 1GB · 25GB | $6 USD
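
To sanity-check a comparison like this one, the listed plans can be filtered against the detected requirements (4GB RAM, 40GB disk). The sketch below uses only the rows shown above; the `Plan` class and `meets_requirements` helper are hypothetical names introduced for illustration, not part of any provider API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    provider: str
    name: str
    vcpus: int
    ram_gb: int
    disk_gb: int
    usd_per_month: float


# Rows copied from the provider table above.
PLANS = [
    Plan("hetzner", "CAX11", 2, 4, 40, 4.13),
    Plan("vultr", "VC2", 1, 1, 25, 5.00),
    Plan("linode", "Nanode 1GB", 1, 1, 25, 5.12),
    Plan("digitalocean", "Basic Regular 1GB", 1, 1, 25, 6.00),
]

# Detected requirements stated on the page.
MIN_RAM_GB = 4
MIN_DISK_GB = 40


def meets_requirements(plan, min_ram_gb=MIN_RAM_GB, min_disk_gb=MIN_DISK_GB):
    """True when the plan has at least the required RAM and disk."""
    return plan.ram_gb >= min_ram_gb and plan.disk_gb >= min_disk_gb


# Keep only qualifying plans, cheapest first.
qualifying = sorted(
    (plan for plan in PLANS if meets_requirements(plan)),
    key=lambda plan: plan.usd_per_month,
)

for plan in qualifying:
    print(f"{plan.provider} {plan.name}: ${plan.usd_per_month:.2f}/mo")
```

With the rows as listed, only the hetzner CAX11 passes the filter; the vultr, linode, and digitalocean rows show 1GB / 25GB plans, which fall below the stated minimum.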

Security advisories

18 known advisories tracked via OSV.dev. 9 currently open without a documented fix. Most recent: CVE-2025-54920.

Ready to self-host Apache Spark?

Spin up a hetzner CAX11 (4GB RAM, 40GB disk) for $4.13/mo and follow the project's official install docs.

Data last refreshed May 7, 2026.

Data refreshes every 30 minutes.