Spec27

Name: Spec27
Author: njyx

Hi HN! We’re a team of ML validation specialists and we’ve been building /Spec27, a tool for testing whether AI agents still do their job safely and reliably as models, prompts, tools, and surrounding systems change. We started working on this because a lot of current LLM evaluation work seems aimed at scoring general model behavior, while many teams are deploying systems that have a specific mission to fulfill. Many of the tools also assume you have full access to the agent stack and traces so you can place SDKs and Gateways, but a lot of agents are being created on vendor platforms where this isn’t possible. As a result, we approaches it from the outside in: all tests just run to the primary interfaces of an Agent and don’t assume anything about internals. The other important things about the approach is spec-driven. Instead of treating testing as a one-off benchmark or static eval set, we let teams define reusable specifications for the behavior they want from an agent, then generate tests against those specs. With this you can automatically generate adversarial and robustness checks, so you can see what an agent is sensitive to and what kinds of changes cause it to fail. We’ve worked on validation for other AI systems before, including vision and tabular workflows, and /Spec27 is our new product for language-model-based agents. Currently in early access, so we’d love feedback! The current version is strongest for single-turn agent and application validation. We do not fully support multi-turn interactions yet, and better telemetry/tool-call integration is still on our roadmap. We’ve made the product open to try for HN readers, with a sample flow so it’s easy to poke around without much setup. We’d especially love feedback from people deploying internal agents, vendor agents, or other AI systems where reliability matters more than benchmark scores.

Developer Tools B2B · njyx

Visit Spec27

N/A

Revenue not available

AI Analysis

Analysis coming soon.

Similar Products

Developer Tools

Capgo

Instant updates for Capacitor apps. Ship fixes in minutes, not weeks. Push OTA updates to users without app store delays.

$15.2K /mo

Developer Tools Easy to clone

OpenAlternative

Open source alternatives to popular software. Over 1 million users replaced their proprietary tools with open source software. Discover the best alternatives and join the movement.

$6.7K /mo

Developer Tools

Vivari

Show HN: Vivari – Open-Source WebContainer for Node, Bun, and Python

Revenue N/A

Developer Tools

Flashpaper

Hi everyone! This is my first HN and I’m very new to the scene. My name is Min from Bangkok. At first, I just want to create a dead man's switch for personal use and for fun. then, I think about information that self destruct like a spy movie. after that, I try to come up with the better version of Privnote or Bitwarden with self-destruct and some kind of censoring or blocking download ability. Somehow, end up with this product. :O Flashpaper is for sending any information that would be burned after reading (with counting down timer like Mission Impossible movie ! or after 24 hrs max if not opened) Encryption happens in browser and because the key stays after # in the link; server never sees the key (Zero-knowledge for web use) and— because I’m a newbie. I don’t want to connect to database because I don’t have the money and I want things light and simple. So, that’s why Flashpaper keeps things in RAM-only, no database. For AI Agent side, Flashpaper provides a REST API and an MCP server so agents can create secret links easily in dead-drop style that can be claimed only once. The second claim would get a 404 which means someone already took it. However, for the agent API flow, the server sees the plaintext for a moment before encrypting, so this flow is not zero-knowledge like the web flow. Overall, I think it work quite well for web use, but for agent API use, I am not sure this is enough security. All the limitations are listed in SECURITY.md. Some feedback would be appreciated. I make it open source with MIT license, with honorware policy for Enterprise use, like self-hosted docker. Here is my repo https://github.com/mmmpym/flashpaper and you can try it here https://flashpaper.app Again ! Please feel free to tell me what I missed. Min

Revenue N/A

Developer Tools

Rise-reforming

Hi HN! This is George, Lucas, and Jona from Rise Reforming (https://www.rise-reforming.com/). We’re developing a process to convert gas produced at landfills, farms, and wastewater plants (“biogas”) into higher value chemicals. Our technology is modular, designed to be deployed and operated on-site. Think of us as a chemical project developer; we sit between biogas producers (suppliers) and chemical end users (customers). We pay biogas producers for their gas and we make money from selling our chemicals. We're starting with dimethyl ether (DME) as our beachhead chemical because of its high-margin use case in the cosmetics industry and ultimately targeting methanol – a versatile and widely used industrial chemical. Being in a two sided market allows us to target two large problems. (1) On the chemical side: The multi-trillion dollar U.S. chemical and fuel industries are vulnerable to geopolitical conflicts and climate-driven natural disasters. The Iran war has caused global methanol prices to skyrocket – even in the U.S., a net exporter of methanol. (https://www.spglobal.com/energy/en/news-research/latest-news... the US). In 2021, Winter Storm Uri wiped out 60% of U.S. organic chemicals production for at least a month (https://www.dallasfed.org/research/swe/2021/swe2102/swe2102c...). The problem? Centralized production and fossil-fuel dependence. The solution isn't unknown; decentralized, fossil-free production could insulate supply chains from these shocks. But distributed green chemical production has yet to become cost-competitive with the status quo. Unlocking it requires the right feedstock paired with the right process and strategy. Also, the chemical industry’s reliance on fossil fuels makes it responsible for 5-6% of global greenhouse gas emissions. About 40% of the industry’s well-to-gate emissions come from just the extraction, processing, and transportation of these fossil fuels (https://rmi.org/resources/chemistry-in-transition-charting-s...). (2) Biogas is an ideal feedstock to address Problem 1. It is decentralized, plentiful, and a large part of it is not properly utilized. Biogas is a mixture of methane (CH4) and carbon dioxide (CO2), produced as a result of anaerobic digestion at landfills, farms, and wastewater plants, and can be used as a raw material in chemical manufacturing. The U.S. produces around 780 billion cubic feet of biogas a year – if we converted all that biogas into methanol, that’s about $20 billion a year. Currently, about 60% of this biogas is either burned for power/heat (low-margin and unreliable) or flared altogether. The rest is used in the highly subsidized renewable natural gas (RNG) market (https://americanbiogascouncil.org/abcs-data-digest-lite-july...). The result: many biogas producers leave substantial revenue on the table and experience huge operational headaches. Our modular technology takes in biogas, electricity, and water as inputs. Co-location with biogas producers allows us to tap into their existing infrastructure and speeds up permitting vs a greenfield project. Our 3 step process is outlined below: Step 1: We clean the biogas of contaminants. That means running the gas over specialized adsorbents that trap any nasty sulfur-containing and silicon-containing compounds we don’t want in our process. Step 2: We reform that biogas into an intermediate gas called syngas through the bi-reforming process, which combines the novel dry methane reforming reaction with the legacy steam methane reforming reaction. Syngas is a versatile combination of H2 and CO and is the building block for many chemicals, allowing us to be a platform company. Step 3: Lastly, we upgrade that syngas into our end chemicals. We do this step using conventional catalysts and operating conditions. The modular approach paired with our patent-pending integrated process makes our solution one of the cheapest ways of making green chemicals. Where are we today? We’ve completed our proof-of-concept in the lab and just broke ground on our pilot plant at a Chicagoland wastewater plant that currently flares all of its biogas. We will convert that wasted biogas into methanol. Estimated commissioning is Q1 2027. We all met at the University of Chicago studying Molecular Engineering and started the company back in June 2024. Rise Reforming’s first iteration came after attending a talk from an Argonne National Laboratory researcher on low-carbon fuels. In that seminar, we heard about a reaction called “dry reforming” wherein one can react CH4 with CO2, effectively eliminating both pollutants and making useful syngas (CO + H2). We realized that this reaction could enable cheaper decarbonization of chemicals than the legacy electrolysis pathway and started to build a technoeconomic analysis. George has a background in energy generation, storage, and carbon capture. He was an early employee at Highland Electric Fleets (now a unicorn) and later worked at Nexamp, GenH, and Mantel Capture – researching various battery chemistries, building a first-of-a-kind (FOAK) modular hydropower system, and helping prove a novel point-source capture prototype. He also conducted battery research at UChicago's Patel Lab and Rowan Group, co-authoring two papers. Lucas led the design, procurement, construction, and operation of Rise Reforming’s bench-scale reforming unit with controls that operated successfully for over 1800+ continuous hours. Prior to Rise, he worked at Avangrid (Iberdrola Group) with the offshore wind project services team and did transmutation research of spent nuclear fuel at Argonne National Laboratory. Jona also studied Molecular Engineering at the University of Chicago. He grew up around the marine industry and brings deep knowledge of the space to the team. While at UChicago, he conducted research in the Patel Lab on batteries and sustainable polymer applications and built novel equipment for the lab, including a high-throughput cyclic voltammetry battery performance testing device. Our advisory board has 220+ combined years in aerosols, permitting/safety, low-carbon fuels, catalysts, scale-up, automated modular chemical plants, and wastewater treatment. Here’s our launch video if you want to put faces to the names: https://youtu.be/Bx_ASPapxlQ?si=PAlqvd1eUhW8kjJm. We’d appreciate any feedback, questions, or advice. Thank you for reading! George, Lucas, and Jona

Revenue N/A

Quick Facts

Category: Developer Tools
Audience: B2B
Founder: njyx
Revenue data: Unknown

Twitter LinkedIn

Spec27

AI Analysis

Similar Products

Capgo

OpenAlternative

Vivari

Flashpaper

Rise-reforming

Quick Facts

Share