Product Catalog

26 products tracked

Trychert

Hey HN! We’re Gary and Ian, and we’re building Chert (https://www.trychert.com/), an API for businesses to send, receive, and automate iMessage conversations at scale. Check out our demo: https://www.youtube.com/watch?v=SRdwvVxMMoI. We originally started by building products on top of iMessage because the blue bubble interface, typing indicators, and reactions made agentic conversations feel more human than ones on SMS/RCS. These included a one-shot iMessage agent builder that reached 2,000 users in one week and an automated iMessage outbound sequencer that sent thousands of outbound messages per day. The hard part is that iMessage does not have a native API like SMS/RCS. Sending and receiving iMessages requires a separate infrastructure that is difficult to set up and maintain, especially at scale. As we talked to more companies, we realized that the highest-volume use cases for iMessage were not B2C agents or even sales. They were things like customer service, missed-call text-back, cart abandonment, and inbound lead capture in verticals like home services, DTC brands, and property management that drive the highest volume. Furthermore, these companies often need additional support, such as custom infrastructure setup (e.g. contact card, area code, or local worker sessions), integration support with their existing SMS/RCS or voice agent systems, and a reliable way to scale their volume over time. We built Chert to be an infrastructure layer for businesses to handle iMessage conversations at scale. Businesses can use our API to send and receive iMessages programmatically, route replies to humans or agents, and integrate conversations into the systems they already use. To maintain stability across both outbound and inbound use cases, we built phone line health checks and SMS/RCS fallback systems. We also integrate with existing SMS/RCS systems, voice agents, CRMs such as Salesforce, HubSpot, and Attio, and tools like Slack. 
Finally, we let businesses reliably scale from a few test lines to hundreds of lines with automated line provisioning and a usage-based pricing structure. We’re working with companies doing conversational messaging in DTC, sports programs, property management, and home services at the scale of hundreds of lines. We’d love to hear your thoughts on this and other similar verticals where iMessage could be useful. All comments welcome!

Revenue N/A

Developer Tools

Runtm

Hey HN, we're Gus and Carlos from Runtime (https://runtm.com). We're building infra that lets your whole team (including non-engineers) ship with Claude Code, Codex, and other agents without engineering having to handhold every session. After Mentum (YC S21) was acquired, I personally shipped 4 full-stack products in 3 months using coding agents. When I tried to roll the same workflow out to the rest of the team, it fell apart:
- Most PRs were unmergeable slop.
- Every repo required an engineer doing one-off local setup.
- Skills and context lived in one person's head.
- There was no safe way for a PM to touch a real codebase without risking a bad deploy or a secrets leak.
Carlos comes from building agentic reconciliation systems at Modern Treasury and had a similar experience when letting his support team use Devin. We ended up building internal background agent infra, but it quickly became a nightmare to maintain and develop. We built Runtime so you don't have to do this kind of thing. Runtime works as follows. Engineering defines the context once: system instructions, skills, and scoped integrations installable via CLI, mise, npm, or any package manager. Then Runtime snapshots your full running environment, including multi-service Docker Compose setups, Kafka, Redis, and seeded DBs, so it comes up in milliseconds with every server already running. We orchestrate across sandbox providers like E2B, Daytona, EC2, or self-hosted K8s depending on your setup. Secrets are injected through our managed proxy so they never touch the agent directly, and guardrails run at the infrastructure level: command allow/deny lists, network egress controls, and RBAC scoped per human and per agent. Every session also gets a shareable preview URL, so internal builds go from sandbox to the rest of the team without needing production access. Runtime works with whichever agent your team already uses: Claude Code, Codex, Cursor, Copilot, Gemini, Devin. 
You can trigger sandboxes from our web app, CLI, Slack, Linear, GitHub, or API. One of our customers built an on-call inspector that wires PagerDuty, Sentry, and their repo so when an alert fires, the agent finds the cause and opens a PR with a unit test before anyone gets paged. Another runs a finance agent in a private Slack channel pulling from Stripe, NetSuite, and Snowflake to run reconciliations in minutes with source rows attached. A fintech unicorn and several YC scaleups are live on Runtime, including a few teams who had built similar infrastructure internally and handed it to us to take over. The core is open source at https://github.com/runtm-ai/runtm. Hosted version is live at https://app.runtm.com, free tier included. We're charging a flat platform fee plus compute, no token markup. Check our demo: https://www.youtube.com/watch?v=wLwj__aEEh4 We'd love to hear how you're thinking about the infra for letting more people across your org use coding agents without creating chaos!
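To make the "command allow/deny lists" guardrail above concrete, here is a minimal sketch of how such a check could work. It is not Runtime's implementation: the pattern lists, the `is_command_allowed` name, and the rule that deny patterns win over allow patterns are all assumptions made for illustration.

```python
import fnmatch
import shlex

# Hypothetical policy (not taken from Runtime): deny patterns win over allow patterns.
DENY = ["rm -rf *", "curl *", "git push --force*"]
ALLOW = ["git *", "npm *", "pytest*", "ls*"]

# Shell metacharacters that could chain or redirect commands; rejected outright here.
SHELL_META = (";", "&", "|", "`", "$(", ">", "<")


def is_command_allowed(command, allow=ALLOW, deny=DENY):
    """Return True only if the command matches an allow pattern and no deny pattern."""
    if any(token in command for token in SHELL_META):
        return False
    # Normalise whitespace and quoting so "git   status" and "git status" compare equal.
    normalised = " ".join(shlex.split(command))
    if any(fnmatch.fnmatchcase(normalised, pattern) for pattern in deny):
        return False
    return any(fnmatch.fnmatchcase(normalised, pattern) for pattern in allow)
```

With this policy, `is_command_allowed("git status")` is allowed while `is_command_allowed("git push --force origin main")` is refused. Glob matching on a command string is only a rough approximation; a production guardrail would more likely enforce policy at the process or network level, as the post describes.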

Revenue N/A

SaaS

Superlog (YC P26)

Hey HN, we’re Nico and Arseniy, co-founders of Superlog (https://superlog.sh). We're building a self-installing, self-healing observability tool that is meant not to be opened. It has a wizard that runs daily to set up proper logging and an agent that investigates errors and opens PRs. Super short demo: https://www.youtube.com/watch?v=xFhU9Mk247M. In our earlier startups, we tried Sentry, Datadog, Grafana, and Dash0, and nothing was good enough. Proper telemetry and alerting still require a ton of manual setup. We struggled with adding good logs, so debugging was tough, especially as codebases grew at a faster pace. Meanwhile, the Datadog/Dash0 bill kept climbing, and we still spent engineering hours learning, configuring, and maintaining our observability tooling. With Sentry, we found ourselves flooded by a stream of alerts into our Slack channel. Most were duplicates or lacked context, so alert fatigue and constant interrupts were a real pain. The #ops notification is consistently the worst feeling on a Saturday morning. Too many times we’ve seen servers run out of memory and disk, and three AWS metrics give us three different values. Half of the graphs on dashboards are normally empty or outdated, and manually clicking through UIs, especially when the team is small, seems like a huge waste of time. At some point we realized that solving this problem would be more valuable than the things we had been working on, and we had the expertise to do it, since Arseniy had spent years at Datadog, getting paged during the night to debug production incidents. So we decided to build a platform that would just work: agent-first, MCP-native, zero-setup. Here’s how Superlog works: we have a wizard that scans your repo and automatically instruments it with well-structured logs, traces, and metrics via OpenTelemetry. We make sure to highlight main failure modes, endpoint performance, usage per tenant, and LLM/upstream cost (by callsite, tenant, and model). 
Errors get fingerprinted and grouped into incidents, so you see one issue, not a thousand duplicates. When you get a notification from Superlog, you see a clear failure summary along with its inferred severity and impact upfront. Then the agent investigates and tries to solve the issue. If it has enough context, it produces a concise and tested PR. If it doesn't, it posts its findings for the investigating team, and automatically pulls in the engineers who could contribute more context based on documentation, previous investigations, and Slack threads. Either way the output is one clean PR per incident, posted in Slack, that you can merge, ignore, or open as a Claude Code session and modify. Three things we think are different from other observability vendors: (1) We solve the setup pain. The wizard will instrument everything with native OTel SDKs, respecting the semantic conventions, with proper service and environment tagging. We’re also working on native automatic dashboards and alerts, so that you can see what’s going on at a glance and don’t miss subtle failure modes. (2) Our telemetry doesn’t decay. The wizard runs daily, and keeps adding logs, alerts, and dashboards where they’re needed. You don't have to remember to instrument new features. The next time something breaks, the data you need to debug it is already there. (3) Our goal is to solve alert fatigue. We use agents to merge similar errors and refine the summaries, giving you relevant information upfront. We have a custom evaluation setup that makes sure that our summaries are dense and correct, and that severity and impact are on point. We also give you confidence scores for every LLM-enhanced metric so that wrong guesses don’t get boosted. Important: Superlog telemetry is vendor-neutral, so you keep all the logs/metrics/traces we install. Pricing is on the site. We're early, so expect rough edges and please tell us when you find them. You can try it at https://superlog.sh. 
We'd love to hear what you're using today, what's broken about it, and whether the "one mergeable PR per incident" model sounds useful or terrifying. Especially keen to hear from folks running integration-heavy products, anyone who's rolled their own observability, and anyone who has tried Sentry / Datadog MCPs and given up. Comments and feedback welcome!
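As an illustration of "errors get fingerprinted and grouped into incidents", here is a minimal sketch of one common way to do it: strip the volatile parts of a message, then hash what remains together with the error type and top stack frame. This is an assumption about the general technique, not Superlog's actual algorithm; the event fields (`type`, `message`, `frame`) and both function names are hypothetical.

```python
import hashlib
import re
from collections import defaultdict

# Volatile fragments replaced by placeholders so that near-identical errors collide.
# Order matters: UUIDs and hex literals must be handled before bare digits.
_VOLATILE = [
    (re.compile(r"\b[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}\b"), "<uuid>"),
    (re.compile(r"0x[0-9a-fA-F]+"), "<hex>"),
    (re.compile(r"\d+"), "<n>"),
]


def fingerprint(error_type, message, top_frame):
    """Hash the error type, the normalised message, and the top stack frame."""
    normalised = message
    for pattern, placeholder in _VOLATILE:
        normalised = pattern.sub(placeholder, normalised)
    key = f"{error_type}|{normalised}|{top_frame}"
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]


def group_into_incidents(events):
    """Group event dicts (hypothetical keys: type, message, frame) by fingerprint."""
    incidents = defaultdict(list)
    for event in events:
        key = fingerprint(event["type"], event["message"], event["frame"])
        incidents[key].append(event)
    return dict(incidents)
```

Under this scheme, "Timeout after 30 ms for user 42" and "Timeout after 45 ms for user 7" raised from the same frame land in one incident. The post also mentions agents merging similar errors, which goes beyond what a purely syntactic fingerprint like this can do.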

Revenue N/A

AI Tools

Andonlabs

Hey HN! I'm Lukas from Andon Labs. We let AIs run companies without humans in the loop and report to the public on what can go wrong. Previously, we've done experiments in retail (vending machines, stores, and cafes), but we just launched one in the media sector. We gave four AI agents all the tools they need both to broadcast radio shows live and to handle the business side of running a media company. The agents' revenue is so far terrible (you can try to strike a sponsor deal with them if you want!), but their shows are at times hilarious. You can listen to them at andon.fm. I hope you enjoy it!

Revenue N/A

Developer Tools

Headless Cloud Security

The cloud security company I work for, Sysdig, launched “Headless Cloud Security” last week. The short version: as attacks get faster and more automated, security tooling is going to need to evolve beyond dashboards and humans clicking through workflows all day. We’ve already seen “headless” models emerge in other categories, and engineering teams are rapidly adopting agentic and CLI-first workflows with tools like Claude Code, Cursor, and MCP servers. Security teams, historically, tend to lag engineering adoption curves by 6–18 months, but I don’t think that gap will hold much longer. The idea behind headless security is that security capabilities should be consumable programmatically — through APIs, AI agents, IDEs, CI/CD pipelines, and automated workflows — not just through a UI. This post covers it in more detail: https://www.sysdig.com/learn-cloud-native/what-is-headless-c... Curious whether others here are seeing similar shifts inside their orgs, especially around AI-assisted development and security operations.

Revenue N/A

AI Tools

Voker

Hey HN, we're Alex and Tyler, co-founders of Voker.ai (https://voker.ai/), an agent analytics platform for AI product teams. Voker gives full visibility into what users are asking of your agents, and whether your agents are delivering, without having to dig through logs. Our main product is a lightweight SDK that is LLM stack agnostic and purpose-built for agent products. (https://app.voker.ai/docs) Agent Engineers and AI product teams don’t have the right level of visibility into agent performance in production, which results in bad user experiences, churn, and hundreds of hours wasted with spot checks to find and debug issues with agent configurations. Demo: https://www.tella.tv/video/vid_cmoukcsk1000i07jgb4j65u67/vie... We recently conducted a survey of YC Founders and 90%+ of respondents said that the only way they know if their Agents are failing users in production is by hearing complaints from customers. They push a prompt change hoping that it fixes the problem and doesn’t break something somewhere else, and the cycle repeats. We saw tons of observability and evals products popping up to try to address these problems, but we still felt like something was missing in the agent monitoring stack. Obs is good for individual trace debugging but is only accessible to engineers. Evals are good for testing known issues, but don't give insights into trends that teams don’t expect, so engineers are always playing catch up. Traditional product analytics tools do a good job tracking clicks and pageviews across your product surface but weren’t built ground up for agent products. Knowing what users want out of agents, and whether the agent delivered requires specific conversational intelligence / unstructured data processing techniques. 
We came up with the agent analytics primitives of Intents, Corrections, and Resolutions to describe something pretty much all conversational agents have in common: a user will always come to an agent with an intent, the user might have to correct the agent on the way to getting their intent resolved, and hopefully every intent a user has is eventually resolved by the agent. Voker processes LLM calls by automatically annotating individual conversations and picking out user intent and corrections. Voker takes these and uses LLMs and hierarchical text classification to create dynamic categories that give higher-level insights, so you don’t have to read individual conversations to know what the main usage patterns are across your users. The most common substitute solution we’ve seen is uploading observability logs to Claude or ChatGPT and asking for summary insights. There are a few problems with this, mainly that LLMs aren’t good at math or data science, so you don’t get accurate or consistent statistics. It’s highly likely that the LLM overfits to some insights and underfits to others. The LLM isn’t programmatically reading and classifying each individual session or interaction. This is why we don’t use LLMs for any of our core data engineering (processing events, calculating statistics), so the analytics we produce are consistent, reproducible, and accurate. We have a publicly available, lightweight SDK that wraps LLM calls to OpenAI, Anthropic, and Gemini in Python and TypeScript. Voker handles the data engineering to turn raw data into usable analytics primitives and higher-level insights. Free tier: 2,000 events / mo, requires email signup. Paid plans start at $80/mo with a 30 day free trial. We'd love to hear how you're currently detecting trends, and if you try Voker, tell us what part of our analysis is valuable, and what still feels missing. Thanks for reading, and we’re looking forward to your thoughts in the comments!
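The point about keeping statistics out of the LLM can be illustrated with a small sketch: once each session has been annotated with an intent, a correction count, and a resolved flag, the aggregate numbers can be computed with ordinary code and come out the same on every run. The `AnnotatedSession` shape and the `summarise` function are hypothetical and are not part of Voker's SDK.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotatedSession:
    """One conversation after annotation (hypothetical shape, not Voker's schema)."""
    intent: str        # category label assigned upstream
    corrections: int   # how many times the user had to correct the agent
    resolved: bool     # whether the intent was eventually resolved


def summarise(sessions):
    """Compute per-intent statistics deterministically, with no LLM involved."""
    totals = defaultdict(lambda: {"sessions": 0, "corrections": 0, "resolved": 0})
    for session in sessions:
        row = totals[session.intent]
        row["sessions"] += 1
        row["corrections"] += session.corrections
        row["resolved"] += int(session.resolved)
    return {
        intent: {
            "sessions": row["sessions"],
            "corrections_per_session": row["corrections"] / row["sessions"],
            "resolution_rate": row["resolved"] / row["sessions"],
        }
        for intent, row in sorted(totals.items())
    }
```

Because every session is counted exactly once, the same input always yields the same resolution rate, which is the property the post contrasts with asking a chat model to summarise raw logs.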

Revenue N/A

Developer Tools

Agentic interface for mainframes and COBOL

Hi HN, we’re Sai and Aayush, and we’re building Hypercubic (https://www.hypercubic.ai/), bringing AI tools to the mainframe and COBOL world. (We did a Launch HN last year: https://news.ycombinator.com/item?id=45877517.) Today we’re launching Hopper, an agentic development environment for mainframes. You can download it here: https://www.hypercubic.ai/hopper, and you can also request access and immediately get a mainframe user account to play with. There's also a video runthrough at https://www.youtube.com/watch?v=q81L5DcfBvE. Mainframes still run a surprising amount of critical infrastructure: banking, payments, insurance, airlines, government programs, logistics, and core operations at large institutions. Many of these systems are decades old, but they continue to process enormous transaction volumes because they are reliable, secure, and deeply embedded into business operations. A lot of that software is written in COBOL and runs on IBM z/OS. The development environment looks very different from modern cloud or Unix-style development. Instead of GitHub, shell commands, package managers, and CI pipelines, developers often work through TN3270 terminal sessions, ISPF panels, partitioned datasets, JCL, JES queues, spool output, return codes, VSAM files, CICS transactions, and shop-specific conventions. TN3270 is the terminal interface used to interact with many IBM mainframe systems. ISPF is the menu and panel system developers use inside that terminal to browse datasets, edit source, submit jobs, and inspect output. It is powerful and reliable, but it was designed for expert humans navigating screens, function keys, and fixed-width workflows, not AI agents. A simple COBOL change might require finding the right source member, checking copybooks, locating compile JCL, submitting a job, reading JES/SYSPRINT output, interpreting condition codes, patching fixed-width source, and resubmitting. 
Much of this work is so well-defined and repetitive that it's a good fit for agentic AI. To get that working, however, a chatbot next to a terminal is not enough. The agent needs to operate inside the mainframe environment. Hopper combines three things: (1) a real TN3270 terminal, (2) mainframe-aware panels for datasets, members, jobs, and spool output, and (3) an AI agent that can operate across those z/OS surfaces. For example, here is a tiny version of the kind of thing Hopper can help debug:
COBOL:
  IDENTIFICATION DIVISION.
  PROGRAM-ID. PAYCALC.
  DATA DIVISION.
  WORKING-STORAGE SECTION.
  01 CUSTOMER-BALANCE PIC 9(7)V99.
  PROCEDURE DIVISION.
      ADD 100.00 TO CUSTOMER-BALNCE
      DISPLAY "UPDATED BALANCE: " CUSTOMER-BALANCE
      STOP RUN.
JCL:
  //PAYCOMP JOB (ACCT),'COMPILE',CLASS=A,MSGCLASS=X
  //COBOL EXEC IGYWCL
  //COBOL.SYSIN DD DSN=USER1.APP.COBOL(PAYCALC),DISP=SHR
  //LKED.SYSLMOD DD DSN=USER1.APP.LOAD(PAYCALC),DISP=SHR
A human would submit this job, inspect JES output, open `SYSPRINT`, find the undefined `CUSTOMER-BALNCE`, map it back to the source, patch the member, and resubmit. Hopper is designed to let an agent operate through that same loop autonomously. Hopper is not trying to hide the mainframe behind a generic abstraction, and it's not a chatbot. The design principle is simple: preserve the fidelity of the mainframe environment, but make it accessible to AI agents. Sensitive operations require approval, and the terminal remains visible at all times. Once agents can operate inside the mainframe environment, new workflows become possible: faster job debugging, automated documentation, safer code changes, test generation, migration planning, traffic replay, and modernization verification. We’re curious to hear your thoughts, especially from anyone who has worked with mainframes or COBOL, or has done legacy enterprise modernization.

Revenue N/A

Developer Tools

Spec27

Hi HN! We’re a team of ML validation specialists and we’ve been building /Spec27, a tool for testing whether AI agents still do their job safely and reliably as models, prompts, tools, and surrounding systems change. We started working on this because a lot of current LLM evaluation work seems aimed at scoring general model behavior, while many teams are deploying systems that have a specific mission to fulfill. Many of the tools also assume you have full access to the agent stack and traces so you can place SDKs and gateways, but a lot of agents are being created on vendor platforms where this isn’t possible. As a result, we approached it from the outside in: all tests run against the primary interfaces of an agent and don’t assume anything about its internals. The other important thing about the approach is that it is spec-driven. Instead of treating testing as a one-off benchmark or static eval set, we let teams define reusable specifications for the behavior they want from an agent, then generate tests against those specs. With this you can automatically generate adversarial and robustness checks, so you can see what an agent is sensitive to and what kinds of changes cause it to fail. We’ve worked on validation for other AI systems before, including vision and tabular workflows, and /Spec27 is our new product for language-model-based agents. It is currently in early access, so we’d love feedback! The current version is strongest for single-turn agent and application validation. We do not fully support multi-turn interactions yet, and better telemetry/tool-call integration is still on our roadmap. We’ve made the product open to try for HN readers, with a sample flow so it’s easy to poke around without much setup. We’d especially love feedback from people deploying internal agents, vendor agents, or other AI systems where reliability matters more than benchmark scores.

Revenue N/A

SaaS

CatchAll

Hey HN, Artem and Maksym from NewsCatcher here. Some of you may know us: we started six years ago as two freshly graduated economics students who decided to build the best news API product. We started NewsCatcher thinking the market for news APIs was so big that we could build a self-serve platform and get millions of $29 users. Obviously, that was a wrong assumption. We pivoted to serve enterprises and had success with it. But we are hackers at heart, and we want to serve hackers. We haven't used our Launch HN yet, so consider this our smoke test. We're looking for feedback and power users rather than revenue, so we're happy to provide enough credits for any HN user who finds CatchAll useful. CatchAll is built for one thing: retrieving every matching event from the web. The use cases that fit it are ones where missing events have real consequences: funding and M&A monitoring, regulatory and compliance feeds (FDA approvals, SEC filings, policy changes), cybersecurity incident tracking, and supply chain signals. If your pipeline consumes structured records and the answer to your query is "find all of them," that's where it works. It's not the right tool for small, bounded queries that return 5 high-precision results. The 15-minute job time is a direct consequence of the pipeline depth: analyze, fetch, cluster, validate, extract, deduplicate. You're not getting a ranked list of links; you're getting a verified record set. Our latest benchmark run: https://newscatcherapi.com/blog-posts/web-search-api-benchma...

Revenue N/A

Developer Tools

Graph Compose

Hey HN. Graph Compose is a hosted platform for orchestrating API workflows on Temporal. You define workflows as graphs of nodes (HTTP calls, AI agents, iterators, error boundaries) and everything runs as a durable Temporal workflow under the hood. Three ways to build the same graph: a React Flow visual builder, a typed TypeScript SDK (@graph-compose/client), and an AI assistant that turns plain English into a graph. Open-core: the execution foundations and integrations service are AGPL-3.0. The platform orchestrator, visual builder, and AI assistant are proprietary. Longer backstory on why I built this in the first comment. Would love feedback, especially from anyone who's dealt with the "services work fine, the glue between them doesn't" problem. Docs: https://graphcompose.io/docs
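As a rough illustration of "workflows as graphs of nodes", the sketch below models a workflow as a mapping from each node to the nodes it depends on and derives a valid execution order. It uses only Python's standard library and is not the `@graph-compose/client` SDK or Temporal; the node names and the `execution_order` helper are made up for the example.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical workflow: each node maps to the nodes that must finish before it runs.
workflow = {
    "fetch_user": [],
    "fetch_orders": ["fetch_user"],
    "summarise": ["fetch_user", "fetch_orders"],
    "notify": ["summarise"],
}


def execution_order(graph):
    """Return the nodes in an order that respects every dependency edge."""
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        # A workflow graph with a cycle can never finish, so reject it up front.
        raise ValueError(f"workflow contains a cycle: {exc.args[1]}") from exc


print(execution_order(workflow))
```

A durable orchestrator adds much more on top of ordering (retries, persistence of intermediate state, error boundaries), which is the part the post says runs as a Temporal workflow.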

Revenue N/A

AI Tools

CyberWriter

Apple has quietly shipped a pretty complete on-device AI stack into macOS, with these features first getting API access in macOS 26. There are multiple components in the foundation model, but the skills it shipped with actually make this ~3B-parameter model useful. The API to hit the model is super easy, and no one is really wiring them together yet.
- Foundation Models (macOS 26) - a ~3B-parameter LLM with an API. Streaming, structured output, tool use. No API key, no cloud call, no per-token cost.
- NLContextualEmbedding (Natural Language framework, macOS 14+) - a BERT-style 512-dim text embedder. Exactly what OpenAI and Cohere sell, sitting in Apple's SDKs since iOS 17.
- SFSpeechRecognizer / SpeechAnalyzer - on-device speech-to-text including live dictation. Solid accuracy on Apple Silicon.
I built cyberWriter, a Markdown editor, on top of all three, mostly as a test and showcase to see what it can do. I actually integrated local and cloud AI first. Then Apple shipped the foundation model, it stacked on super easily, and now users with no local or API AI knowledge can use it with just a click or two. The real reason, though, is that most Markdown editors need plugins that run with full system access, and I work on health data and can't have that.
Vault chat / semantic search. The app indexes your Markdown folder via NLContextualEmbedding (around 50 seconds for 1000 chunks on an M1). The search bar gets a "Related Ideas" section that matches by meaning - typing "orbital mechanics" surfaces notes about rockets and launch windows even when those exact words never appear. Ask the AI a question and it retrieves the top 5 chunks as context. Plain RAG, but the embedder, retrieval, chat model, and search all run locally.
AI Workspace. Command+Shift+A opens a chat panel, Command+J triggers inline quick actions (rewrite, summarize, change tone, fix grammar, continue). Apple Intelligence is the default; Claude, OpenAI, Ollama, and LM Studio all work if you prefer. 
The same context layer - document selection, attached files, retrieved vault chunks - feeds every provider through the same system-message path. Because the vault context is file and filename aware, it can create backlinks to the referenced file if it writes or edits a doc for you.
Voice notes and dictation. Record a voice note directly into your doc, transcribe it with SpeechAnalyzer, or just dictate into the editor while you think. Audio never leaves the Mac.
The privacy story is straightforward because the primitives are already private. Vectors live in a `.vault.embeddings.json` file next to your vault and are never sent anywhere. If you use Apple Intelligence, even the retrieved text stays on-device. For cloud models there is a clear toggle and an inline warning before any filenames or snippets leave the machine.
Honest limitations:
- 512-dim embeddings are solid mid-tier. A GPT-4-class embedder catches subtler relationships this will miss.
- 256-token chunks can split long paragraphs mid-argument.
- Foundation Models caps its context window around 6K characters, so vault context is budgeted to 3K with truncation markers on the rest.
- Language support is English-only right now. NLContextualEmbedding has Latin, Cyrillic, and CJK model variants; wiring the language detector across chunks is Phase 2.
The developer experience for these APIs is genuinely good. Foundation Models streams cleanly, and NLContextualEmbedding downloads assets on demand and gives you mean-poolable token vectors in a handful of lines. Curious what others here are building on this stack - feels like low-hanging fruit that has been sitting there for a while. https://imgur.com/a/HyhHLv2 The Apple AI embedding feature is going live today. I'm honestly surprised it even works out of the box.
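The retrieval path described above (mean-pool token vectors, rank chunks by similarity, keep the top 5, then fit them into a character budget) can be sketched in a few lines. This is a language-neutral illustration in plain Python, not the app's Swift code and not Apple's API; the function names, the cosine-similarity ranking, and the `[...]` truncation marker are assumptions. Only the top-5 and 3K-character figures come from the post.

```python
import math


def mean_pool(token_vectors):
    """Average per-token vectors into a single chunk vector."""
    count = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(vec[i] for vec in token_vectors) / count for i in range(dim)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, chunks, k=5):
    """chunks is a list of (text, vector) pairs; return the k most similar texts."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def budget_context(texts, limit=3000, marker="[...]"):
    """Keep whole texts while they fit, then truncate one and stop at the limit."""
    out, used = [], 0
    for text in texts:
        remaining = limit - used
        if len(text) <= remaining:
            out.append(text)
            used += len(text)
            continue
        # Not enough room: keep a truncated prefix only if the marker still fits.
        if remaining > len(marker):
            out.append(text[: remaining - len(marker)] + marker)
        break
    return out
```

For example, `budget_context(top_k(query_vec, chunks))` yields at most five snippets whose combined length never exceeds 3,000 characters. A brute-force scan like this is usually adequate for a personal vault of a few thousand chunks.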

Revenue N/A

SaaS

Zatanna

Hey! I'm Alex, and together with my co-founder Tarun I built Kampala (https://www.zatanna.ai/kampala). It’s a man-in-the-middle (MITM) style proxy that allows you to agentically reverse engineer existing workflows without brittle browser automation or computer use agents. It works for websites, mobile apps, and desktop apps. Demo: https://www.youtube.com/watch?v=z_PeostC-b4. Many people spend hours per day in legacy dashboards and on-prem solutions reconciling data across platforms. Current attempts at automation use browser automations or computer use agents, which are brittle, slow, and nondeterministic. I come from a web reverse engineering background and spent the last 7-8 years building integrations by hand for sneaker/ticket releases, sportsbook logins, and everything in between. During that time I consulted for several companies and moved them off browser-based infrastructure and onto the requests layer. When we started Zatanna (that’s our company name) we worked in dental tech, which meant we had to deal with tons of insurance payer dashboards and legacy dental-practice solutions. Our superpower (as a fairly undifferentiated voice agent/front desk assistant company) was that we could integrate with nearly any system requested. During this time we built extensive tooling (including what we’re now calling Kampala) to allow us to spin up these integrations quickly. Existing MITM proxies and tooling didn’t work for a few reasons: (1) They manipulated the TLS and HTTP2 fingerprint over the wire, which was detected by strict anti-bots. (2) They had bad MCPs which did not adequately expose necessary features like scripts/replay. (3) They did not allow for building workflows or actions given a sample or sequence of requests. As the tools we built got more powerful, we began to use them internally to scrape conference attendees, connect to external PMS systems, and interact with Slack apps. 
I even sent it to my property manager mom, who (with a lot of help from me lol) automated 2-3 hours of billing information entry in Yardi. At that point we realized that this wasn’t really about dentistry :) Because Kampala is a MITM, it is able to leverage existing session tokens/anti-bot cookies and automate things deterministically in seconds. You can either use our agent harness, which directly creates scripts/APIs by prompting you with what actions to take, or use our MCP by manually doing a workflow once and asking your preferred coding agent to use Kampala to make a script/API to replicate it. Once you have an API/script, you can export it, run it, or even have us host it for you. We think the future of automation does not consist of sending screenshots of webpages to LLMs, but instead of using the layer below that computers actually understand. Excited to hear your thoughts/questions/feedback!

Revenue N/A