Langfuse alternatives: the honest comparison

Langfuse is a strong open-source LLM observability platform, and after ClickHouse acquired it in January 2026 it has more durability behind it than most independent tools. Five reasons buyers still search for alternatives: self-host operational burden, hosted-SaaS pricing at scale, eval-first workflows, workbench-first authoring, and compliance posture that doesn't fit OSS self-host or US-cloud-acquired SaaS.
What Langfuse actually is, and what it costs
Langfuse is an open-source LLM engineering platform: tracing, prompt management, evals, datasets, dashboards, and a playground, all under one roof. The repository is MIT-licensed (the ee folders are excluded), and the v3 architecture pulled in ClickHouse as the core analytics database alongside Postgres, Redis, and S3-compatible object storage. The OpenTelemetry-native instrumentation makes it framework-agnostic.
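Because the ingestion path is OpenTelemetry-native, any OTLP-speaking exporter can point at Langfuse without a Langfuse-specific SDK. A minimal configuration sketch, assuming Langfuse's documented `/api/public/otel` endpoint and Basic auth built from a project's public/secret key pair (verify both against the current docs before relying on them):

```python
import base64
import os

# Illustrative placeholder keys -- substitute your project's real pair.
public_key, secret_key = "pk-lf-example", "sk-lf-example"

# Standard OTLP exporters read these env vars; the endpoint path and
# Basic-auth scheme here are assumptions based on Langfuse's OTel docs.
auth = base64.b64encode(f"{public_key}:{secret_key}".encode()).decode()
os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = "https://cloud.langfuse.com/api/public/otel"
os.environ["OTEL_EXPORTER_OTLP_HEADERS"] = f"Authorization=Basic {auth}"
```

For a self-hosted deployment, swap the endpoint host for your own; nothing else changes, which is the practical meaning of "framework-agnostic" here.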
Langfuse Cloud pricing (verified against the pricing page, April 2026) lays out four tiers:
- Hobby (Free): $0/month, 50,000 units/month, 30-day retention, 2 users, community support
- Core: $29/month, 100,000 units/month, $8 per 100k overage, 90-day retention, unlimited users, in-app support
- Pro: $199/month, 100,000 units/month, $8 per 100k overage, 3-year retention, SOC2 / ISO27001 / HIPAA. Teams Add-on at $300/month for SSO, RBAC, and dedicated Slack
- Enterprise: $2,499/month, audit logs, SCIM API, custom rate limits, uptime SLA, dedicated SE
A "unit" is any tracing data point: traces (complete interactions), observations (spans, events, generations), or scores (evaluations). The overage rate steps down with volume, hitting $6 per 100k at the 50M+ tier.
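The unit math above can be sketched as a quick calculator. This assumes linear overage at the flat $8 per 100k rate; since the published rate steps down with volume, treat the result as an upper bound:

```python
def langfuse_cloud_cost(units: int, base: float = 29.0,
                        included: int = 100_000,
                        overage_per_100k: float = 8.0) -> float:
    """Estimate a monthly Langfuse Cloud bill (Core-tier defaults).

    Assumes linear overage at a flat rate; the real schedule steps
    down to $6 per 100k at the 50M+ tier, so this is an upper bound.
    """
    extra = max(0, units - included)
    return base + (extra / 100_000) * overage_per_100k

print(langfuse_cloud_cost(100_000))    # -> 29.0, included units only
print(langfuse_cloud_cost(1_000_000))  # -> 101.0, overage kicks in
```

Swap `base=199.0` for Pro; the overage term is the same, which is why the two tiers converge at high volume and the fixed fee stops mattering.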
Self-host is the more interesting story. The MIT-licensed core is free to run, and at trial scale a small Langfuse deployment fits comfortably on a single VM with managed Postgres for around $20 per month. At production scale (10M events per month, 90-day retention), third-party benchmarks put all-in infrastructure cost at roughly $400 to $1,000 per month: ClickHouse $200 to $800, Postgres $20 to $50, app servers $50 to $150. Plus several hours per week of DevOps in steady state. Enterprise self-host adds gated features (SSO, RBAC, SCIM, server-side data masking, audit logs) at custom pricing on top of a ClickHouse commercial plan.
The "OSS self-host is free" pitch is true at trial; it stops being free at production scale. Anyone evaluating Langfuse should run that math against their actual event volume before committing. ClickHouse storage growth and Postgres maintenance windows are a different operational shape than a static-content VM, and the gap shows up at exactly the moment a successful product is also growing the team's other obligations.
Why someone searches "Langfuse alternative" in 2026
The search isn't one search. It's five.
1. Self-host operational burden at production scale. The Langfuse v3 stack is ClickHouse + Postgres + Redis + S3-compatible storage + Langfuse web/worker containers. The GitHub issue tracker has a steady stream of operational pain reports: K8s liveness-probe failures at high data volume, MinIO connectivity issues, VM deployments hanging with no response. None of these are dealbreakers for a team with a competent platform crew. They're real for a four-person team where the AI engineer is also the DevOps engineer.
2. Hosted-SaaS pricing at scale. $199 per month for Pro plus $8 per 100k units of overage runs comfortably at low traffic and stops running comfortably the moment the product crosses 1M units per month. Buyers who priced it out at trial volume find themselves doing the math again at production volume.
3. Eval-first workflow rather than trace-first. Langfuse is observability-first: tracing is the core; prompts and evals layer on top. Teams that adopted "your evals are the spec for your prompts" as their workflow want eval primitives at the center, not a tracing surface they navigate around.
4. Workbench-first authoring rather than observability-first tracing. Some teams want a craft-first prompt editor with version control, a six-dimension critique surface, and a model-graded comparison view as their primary workflow. They're not running production observability through Langfuse; they have separate tracing in Datadog, Honeycomb, or homegrown OpenTelemetry. A Langfuse Cloud subscription for the workbench half is the wrong shape. (Open the editor if that's where you are.)
5. Compliance posture that doesn't fit OSS self-host or US-cloud-acquired SaaS. The Hacker News thread on the ClickHouse acquisition surfaced this concern explicitly: "US companies can be legally compliant with GDPR, it's just that the likes of the CLOUD Act and FISA make it completely meaningless." (The CLOUD Act lets US authorities subpoena data held by US-headquartered companies regardless of where the data physically sits; FISA grants broader surveillance authority over US-incorporated firms.) For European buyers and US buyers under HIPAA/SOC2 mandates, the question after the acquisition is whether a US-headquartered parent company changes the data-sovereignty calculus. Self-host stays available, but compliance teams that can't carry the ops are in a bind.
A buyer who's cleanly in scenario 1 and a buyer who's cleanly in scenario 4 want different tools. The "Langfuse alternative" SERP hides this, and the vendor-listicle results obscure it further. The honest answer is to figure out which scenario you're in first.
Did the ClickHouse acquisition change anything?
Not yet, and not in the ways the acquisition cynics worry about most.
ClickHouse acquired Langfuse on January 16, 2026 alongside a $400M Series D round. The Langfuse deal terms are not publicly disclosed. (A note for readers: the $400M figure cited in some coverage refers to the ClickHouse Series D, not the Langfuse purchase price. The two have been conflated frequently online.)
ClickHouse's announcement carries explicit commitments worth reading verbatim. "Langfuse remains 100% open-source under its existing MIT license for core features." And: "Langfuse Cloud continues operating as a standalone service." And: "The service continues operating with the same SLAs and support."
The cynical concerns from the Hacker News thread are real but separate from the immediate-durability question. Data-sovereignty under the CLOUD Act for European buyers. Market-consolidation worry: "This is a big reason why there are so few EU tech startups." Skepticism about whether the deal benefited Langfuse founders or was a forced exit: "Without the purchase price, it is unclear whether this deserves congratulations or condolences." These are valid framings for a long-term durability assessment. None of them are evidence that Langfuse is in imminent trouble.
The acute 2026 durability story isn't Langfuse. It's OpenAI's acquisition of Promptfoo on March 9, 2026. The future of Promptfoo's standalone hosted offering is unclear; OpenAI plans to incorporate the technology into OpenAI Frontier (the enterprise platform for AI agents) and committed only that "the open-source eval framework continues." Teams that built workflows around Promptfoo Cloud are now in the same posture Humanloop customers were in last September. The migration pattern teams ran when Humanloop sunset is the more relevant playbook here than anything Langfuse-shaped.
The four credible alternatives in 2026
| Tool | License | Self-host | Hosted | Pricing shape | Best for |
|---|---|---|---|---|---|
| Helicone | Apache 2.0 | Yes | Yes | Flat tiers, no per-trace meter | Gateway features (caching, rate-limiting), async logging |
| Phoenix (Arize AX) | Elastic License 2.0 | Yes | Yes | Free / $50 / Enterprise | OpenTelemetry-native teams, eval-driven |
| Braintrust | Closed source | Enterprise only | Yes | GB-processed + scored outputs | Eval-heavy workflows, no self-host need |
| Lunary | Open source (CE free) | Yes | Yes | $20/user/month | Mid-market teams wanting hosted SaaS without ClickHouse ops |
Helicone
Helicone supports two integration patterns: a gateway/proxy mode (you swap your provider base URL to oai.helicone.ai, and Helicone routes the request while logging it) and an async log mode (your application calls the provider directly, then fires a separate logging request after the fact, off the critical path). The async path means Helicone outages don't take your product down.
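The async-log pattern is worth seeing in shape. A minimal sketch, assuming a Helicone-style logging endpoint and payload; the URL and field names below are illustrative assumptions, not Helicone's documented schema, so check their async-logging docs before wiring this up:

```python
import json
import threading
import urllib.request

# Assumed endpoint for illustration -- verify against Helicone's docs.
HELICONE_LOG_URL = "https://api.worker.helicone.ai/custom/v1/log"

def build_log_payload(request_body: dict, response_body: dict,
                      latency_ms: int) -> dict:
    """Shape the provider request/response into a log record (illustrative)."""
    return {
        "providerRequest": request_body,
        "providerResponse": response_body,
        "timing": {"latencyMs": latency_ms},
    }

def log_async(payload: dict, api_key: str) -> threading.Thread:
    """Fire-and-forget: POST the log on a background thread,
    off the critical request path."""
    def _send():
        req = urllib.request.Request(
            HELICONE_LOG_URL,
            data=json.dumps(payload).encode(),
            headers={"Authorization": f"Bearer {api_key}",
                     "Content-Type": "application/json"},
        )
        try:
            urllib.request.urlopen(req, timeout=5)
        except OSError:
            pass  # a logging failure must never surface to the user path
    t = threading.Thread(target=_send, daemon=True)
    t.start()
    return t
```

The `try/except` swallow plus the daemon thread is the whole availability argument: the provider call has already returned to the user before the log ships, so a Helicone outage degrades observability, not the product.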
Pricing is flat: Hobby Free ($0, 10k requests, 1GB), Pro ($79/month, unlimited seats, 1k logs/min, alerts, HQL), Team ($799/month, 5 orgs, 90-day retention, SOC2/HIPAA, dedicated Slack), Enterprise (custom, on-prem, SAML SSO). License is Apache 2.0; self-host runs via Docker Compose with Postgres, ClickHouse, Redis, and MinIO.
Helicone fits buyers in scenario 1 (self-host operational pain at high traffic) who want gateway features Langfuse doesn't have, and buyers in scenario 5 (compliance) who can stand up the async-log path so the platform is never on the critical request path.
Phoenix (Arize AX)
Phoenix is Arize's open-core observability and eval framework. The license is Elastic License 2.0, not Apache; self-host is free, but the ELv2 prohibits a third party from offering Phoenix as a managed service.
Arize itself runs Arize AX (which absorbed Phoenix Cloud) as of 2025: Free at 25,000 spans/month and 1GB ingest, Pro at $50/month with 50,000 spans and 10GB ingest, Enterprise with custom pricing including SaaS or self-hosted deployment with SOC2/HIPAA and data residency. Phoenix fits cleanly when the team is already on Arize's larger ML observability stack or when OpenTelemetry-native instrumentation is the integration constraint.
Braintrust
Braintrust is closed-source and hosted-only on Pro; self-host is Enterprise-only. The product is sharper on the eval side than the trace side. Pricing in April 2026 is GB-processed plus scored outputs: Starter Free ($0, 1GB processed, 10k scores, 14-day retention), Pro at $249/month (5GB, 50k scores, 30-day retention), Enterprise custom.
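The two-meter shape can be reduced to a fit check, using only the included allowances quoted above. Overage rates aren't listed here, so this flags when Pro stops covering a workload rather than pricing the excess:

```python
# Pro-tier allowances from the April 2026 figures above.
PRO_BASE_USD = 249
PRO_INCLUDED_GB = 5
PRO_INCLUDED_SCORES = 50_000

def pro_covers(gb_processed: float, scored_outputs: int) -> bool:
    """True if a month's usage stays inside Pro's included allowances.

    Either meter can be the one that trips first; eval-heavy teams
    usually hit the scored-outputs cap before the GB cap.
    """
    return (gb_processed <= PRO_INCLUDED_GB
            and scored_outputs <= PRO_INCLUDED_SCORES)

print(pro_covers(3.2, 40_000))   # -> True
print(pro_covers(3.2, 120_000))  # -> False: scores meter exceeded
```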
Braintrust wins scenario 3 (eval-first workflows) for teams that don't need self-host. The unit economics flipped from older "1M free trace spans" framing; verify against current numbers before quoting them.
Lunary
Lunary is the dark horse of the four. Open source (Community Edition free), with a hosted Team tier at $20/user/month and Enterprise self-host at custom pricing. SOC2 Type II + ISO27001. Lunary's pitch is "the same observability surface as Langfuse without ClickHouse in the dependency graph": Postgres-only, simpler ops, smaller blast radius. Buyers in scenario 1 who want a hosted alternative without learning ClickHouse should put Lunary on the shortlist.
For LangChain shops specifically, LangSmith stays the natural answer on the Plus tier or the free Developer tier; the per-trace economics that drive teams off LangSmith at agentic scale don't apply at low to mid volume.
Where Prompt Assay fits, honestly
Prompt Assay is not an observability platform. It doesn't ingest production traces, it doesn't run tracing storage, and it doesn't have a request-path SDK that wraps your inference calls. Anyone telling you Prompt Assay is a Langfuse replacement on its own is selling you something.
What Prompt Assay does cover is the workbench half of what most teams pull Langfuse in for:
- Prompt authoring with version control: diff, restore, branching, annotations on the version itself
- Six-dimension critique (Clarity, Completeness, Structure, Technique Usage, Robustness, Efficiency) on the prompt before it ships
- Two-version Compare: model-graded structural diff between two revisions of the same prompt, with improvements, regressions, key differences, maturity changes, and a recommendation on which version to ship
- Eval suites with test cases, rubrics, and LLM-as-a-judge graders
- An AI pair in the editor (Brainstorm, Critique, Improve, Rewrite, Compare)
If your real Langfuse use was 30% trace plumbing and 70% "manage prompts and run evals so we stop shipping regressions," that 70% is what Prompt Assay handles. The 30% is what you pair with Langfuse self-host (or Helicone, or Phoenix, or Lunary) on the side.
Prompt Assay's pricing is flat: $49 per month on Solo, $99 per seat per month on Team, custom on Enterprise. There's no per-trace meter, because we never see your inference traffic. BYOK is mandatory at every paid tier: your Anthropic, OpenAI, and Google keys connect directly to the providers, so your bill stays with your provider, not with us. The BYOK setup is documented and takes about 60 seconds; the trust page covers the encryption-at-rest and key-isolation specifics if compliance review needs them.
That last part matters for scenario 5. The reason a US-headquartered SaaS provider can hold up under GDPR review is the data-flow architecture. Prompt Assay never proxies inference traffic, never reads your API keys outside a single decrypt-call-discard cycle on the server, and never persists prompt outputs server-side beyond evaluation runs you explicitly trigger. The BYOK as a principle post covers the why; the trust page covers the what.
Open the editor and connect a key. No credit card, no demo call.
The five scenarios, mapped to tools
The first-line tool below is the closest single-product fit. The pair-with column is what to add when the scenario also has a workbench-half or tracing-half need that the first-line tool doesn't cover.
| Scenario | First-line answer | Pair with |
|---|---|---|
| 1. Self-host ops burden at production scale | Lunary hosted (Postgres-only, no ClickHouse), or Helicone async-log mode | Add Prompt Assay if authoring + evals are also in scope and you don't want them in the tracing tool |
| 2. Hosted-SaaS pricing at scale | Helicone Pro flat at $79/month, or Lunary Team at $20/user | Add Prompt Assay if you want to separate workbench cost from observability cost so each scales on its own meter |
| 3. Eval-first workflow | Braintrust Pro at $249/month for the eval-centric workflow | Add Prompt Assay if you also want six-dimension critique on the prompt before evals run, plus version control on the prompt itself |
| 4. Workbench-first authoring | Prompt Assay (this is the scenario PA was built for) | Pair with Langfuse self-host, Helicone async, or your existing tracing stack for the production-tracing half |
| 5. Compliance / no-proxy data path | Langfuse self-host on your own infrastructure if you can carry the ops; otherwise Helicone async-log mode | Add Prompt Assay BYOK for the workbench half (BYOK keeps inference traffic on your provider, never on PA's path); the tracing half still needs a separate tool that fits your data-residency rules |
If you came in already convinced you wanted Langfuse, scenarios 4 and 5 are the ones worth re-reading. If you're closer to scenarios 1, 2, or 3, the table above is a faster read than another vendor listicle, and the PromptLayer alternatives breakdown is the closest sibling comparison if PromptLayer is also on your shortlist.
Ship your next prompt in the workbench.
Prompt Assay is the workbench for shipping production LLM prompts. Version every change. Critique, improve, and compare across GPT, Claude, and Gemini. Bring your own keys. No demo call. No card. No sales gate.
Further Reading
- №07 · April 2026 — LangSmith alternatives without per-trace billing. LangSmith's auto-upgrade-on-feedback can turn per-trace billing superlinear. Compare Langfuse, Helicone, Phoenix, Braintrust, and where Prompt Assay fits. (Comparisons & Migrations · 12 min read)
- №04 · April 2026 — PromptLayer alternatives: the honest comparison. PromptLayer alternatives compared honestly: current 2026 pricing, BYOK posture, and when Prompt Assay, LangSmith, Langfuse, or Braintrust fits better. (Comparisons & Migrations · 14 min read)
- №01 · April 2026 — Migrate from Humanloop: a 2026 re-home guide. Humanloop shut down Sep 2025. If the replacement you picked isn't sticking, this 2026 guide covers the durable asset, destinations, and BYOK math. (Comparisons & Migrations · 15 min read)
Issue №08 · Published APRIL 28, 2026 · Prompt Assay