Kelet

Name: Kelet
Author: almogbaku

I've spent the past few years building 50+ AI agents in prod (some reached 1M+ sessions/day), and the hardest part was never building them; it was figuring out why they fail. AI agents don't crash. They just quietly give wrong answers. You end up scrolling through traces one by one, trying to find a pattern across hundreds of sessions. Kelet automates that investigation.

Here's how it works:

1. You connect your traces and signals (user feedback, edits, clicks, sentiment, LLM-as-a-judge, etc.)
2. Kelet processes those signals and extracts facts about each session
3. It forms hypotheses about what went wrong in each case
4. It clusters similar hypotheses across sessions and investigates them together
5. It surfaces a root cause with a suggested fix you can review and apply

The key insight: individual session failures look random. But when you cluster the hypotheses, failure patterns emerge.

The fastest way to integrate is through the Kelet Skill for coding agents: it scans your codebase, discovers where signals should be collected, and sets everything up for you. There are also Python and TypeScript SDKs if you prefer manual setup.

It’s currently free during beta. No credit card required.

Docs: https://kelet.ai/docs/

I'd love feedback on the approach, especially from anyone running agents in prod. Does automating the manual error analysis sound right?
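As a rough illustration of steps 3 and 4 above, here is a minimal sketch of clustering per-session hypotheses so that recurring failure patterns stand out. It is not Kelet's method or SDK, which the post does not describe. The `Hypothesis` and `Cluster` types, the token-overlap (Jaccard) similarity, and the 0.5 threshold are all assumptions chosen to keep the example self-contained; a real system would more likely compare embeddings.

```python
from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    session_id: str
    text: str  # free-text guess at what went wrong in this session


@dataclass
class Cluster:
    tokens: set                      # token set of the cluster's first member
    members: list = field(default_factory=list)


def tokenize(text):
    """Lowercase, split on whitespace, and drop words of two letters or fewer."""
    return {w for w in text.lower().split() if len(w) > 2}


def jaccard(a, b):
    """Overlap between two token sets, from 0.0 (disjoint) to 1.0 (identical)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_hypotheses(hypotheses, threshold=0.5):
    """Greedy single-pass clustering (hypothetical stand-in for step 4).

    Each hypothesis joins the most similar existing cluster if the
    similarity reaches the threshold; otherwise it starts a new cluster.
    Clusters are returned largest first.
    """
    clusters = []
    for h in hypotheses:
        tokens = tokenize(h.text)
        best = max(clusters, key=lambda c: jaccard(tokens, c.tokens), default=None)
        if best is not None and jaccard(tokens, best.tokens) >= threshold:
            best.members.append(h)
        else:
            clusters.append(Cluster(tokens=tokens, members=[h]))
    return sorted(clusters, key=lambda c: len(c.members), reverse=True)


# Made-up hypotheses for four sessions (illustrative data only).
hypotheses = [
    Hypothesis("s1", "retrieval returned stale pricing document"),
    Hypothesis("s2", "retrieval returned stale pricing page"),
    Hypothesis("s3", "tool call timed out on refund api"),
    Hypothesis("s4", "stale pricing document returned by retrieval"),
]

clusters = cluster_hypotheses(hypotheses)
for c in clusters:
    print(len(c.members), [h.session_id for h in c.members])
```

Taken one at a time, the four sessions look unrelated; grouped, three of them point at the same stale-retrieval problem, which is the effect the post describes.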

AI Tools BOTH · almogbaku

Similar Products

AI Tools

Oodle.ai

Hi HN, we're Kiran and Vijay! Over the past two years, we have built a columnar storage engine for observability: logs, metrics, and traces. Today, it's exciting for us to show what we've built on top of that foundation: LLM Agent Observability. Given how non-deterministic agents are, storing all traces without sampling was critical for us. But these traces tend to be in the MBs, sometimes GBs, so we needed to store them inexpensively. We also needed the queries and analyses to be fast. To meet both these goals, we store them in S3 in our own Parquet-like file format, and query them using AWS Lambda. Since we process each span of every trace, instead of running LLM-based evals on each, we first analyze them using deterministic techniques. We detect tool failures, retries, loops, abnormal token usage, latency regressions, schema violations, sentiment, and other production signals. We've written more about the approach here: https://blog.oodle.ai/you-cant-sample-your-way-to-reliable-a... The combination of our own engine, no sampling, and deterministic processing before LLM-for-evals allows us to price at $10 per million traces, provide sub-second p99 query latency, and have healthy margins. Before building this, we used Langfuse for our own agent observability, which was 6x more expensive. It's still super early and rough around some edges, so we would love your questions and feedback!
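To make "deterministic techniques" concrete, here is a minimal sketch of rule-based checks for three of the signals listed above: tool failures, retries, and loops. It is not Oodle's implementation. The `Span` fields, the retry rule (a failed call followed immediately by the same call), and the loop threshold of three identical calls are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    tool: str   # name of the tool the agent called (hypothetical field)
    args: str   # serialized arguments of the call (hypothetical field)
    ok: bool    # whether the call succeeded (hypothetical field)


def detect_signals(spans, loop_threshold=3):
    """Derive simple production signals from one trace without any LLM call."""
    # Tool failure: any span that did not succeed.
    failures = sum(1 for s in spans if not s.ok)

    # Retry: a failed call immediately followed by the same tool and arguments.
    retries = sum(
        1
        for prev, cur in zip(spans, spans[1:])
        if not prev.ok and (prev.tool, prev.args) == (cur.tool, cur.args)
    )

    # Loop: the same tool called with identical arguments many times in a trace.
    calls = Counter((s.tool, s.args) for s in spans)
    loops = sorted({tool for (tool, _), n in calls.items() if n >= loop_threshold})

    return {"tool_failures": failures, "retries": retries, "loops": loops}


# A made-up trace: two failed searches, a successful retry, then a fetch.
trace = [
    Span("search", "q1", ok=False),
    Span("search", "q1", ok=False),
    Span("search", "q1", ok=True),
    Span("fetch", "u1", ok=True),
]

signals = detect_signals(trace)
print(signals)
```

Checks like these run in a single pass over the spans, which is why they can be applied to every trace before deciding which ones deserve a more expensive LLM-based eval.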

Revenue N/A

AI Tools

Benchmark your eng team's AI agent maturity in 5 minutes

we had hundreds of discussions with engineering leaders over the past few months, and everyone's trying to understand where they are in the AI journey. we collected all this data into a benchmark and built a free grader to let you know where you stand. you answer on a 1–5 scale (e.g., autonomy runs from "suggestions only" to "agents own multi-hour workflows across code, infra, and external systems") - takes about 5 minutes. https://agent-benchmarks.com/software-factory/ waiting for your results!

Revenue N/A

AI Tools

Microphone

If you are an aspiring founder, any VC will ask you this question: “why are you the only person who could solve this?” If you want to generate passive income with your side idea, get ready to enter a crowded market as everyone and their mother is shipping. Unless you have an active X account or you’re a TikTok sensation, distribution is going to be tough. I just launched the trie.dev microphone beta to help folks find their edge. You yap into your phone about your ideas; Trie turns the rambling into hypotheses, then prioritizes them based on your experience and your realistic ability to distribute in that idea space, surfacing the problems only you can solve. From there you can generate creative and run Meta ads against your hypotheses straight from your phone, with zero setup, to see how real people respond. I built it initially for myself and friends as an “intake form” for running paid ads to help validate our side gig ideas. Happy to chat about how it works or the stack. Joining the waitlist will send you an email to join via TestFlight.

Revenue N/A

AI Tools