Boxes.dev: ditch localhost; run Claude Code and Codex in the cloud

Hi HN, we’re Nick and Drew, and we’re building boxes.dev – the first cloud-only agentic dev environment (ADE) that gives every Codex and Claude Code agent its own cloud computer.

We’re two engineers who previously built Gem (co-founder/CTO and first hire), and we spent the last year coding almost exclusively using Codex and Claude Code. It’s been a huge change to how we code, and it’s been exhilarating seeing the models keep getting better – but we eventually realized that developing on localhost was holding us back:

- Git worktrees are clunky to set up and use for parallelizing work.
- It’s 2026, but somehow everyone is still walking around with laptops cracked open or SSHing into Mac minis in their garage so their agents don’t stop working.
- Mobile is treated like an afterthought even though coding is just texting now.
- We started hitting resource constraints when multiple parallel agents tested their own work by running the full app locally.

We tried different products, but couldn’t find any that solved all of our pain points – so we pivoted and decided to just build the ADE we wanted for ourselves.

Boxes.dev is a desktop and mobile app that lets you run Claude Code, Codex (using your subscription!), and the full dev environment for whatever you’re building, all on remote compute. It’s similar to Conductor or the Codex desktop app, except everything is in the cloud.

We use coding agents to scan your local dev setup and port it to the cloud. Then every Claude Code/Codex thread starts from a snapshot of the full setup, with its own filesystem and compute. No more git worktrees, no more cracked-open laptops, and your coding agents can actually test their work end-to-end because they can run your full app in isolation.

We’ve mirrored the Claude Code and Codex UX to feel natural to power users, and we also have a fully featured mobile app (no handoffs or remote control), plus scheduled automations and a Slack integration.

We’re obviously biased, but we’ve been building boxes.dev with boxes.dev for months and it’s honestly been a gamechanger. It’s hard to go back once you realize how much localhost has been limiting you; based on early feedback from beta testers, we’re increasingly sure that cloud is the future of agentic coding.

We’d love for you to experience it yourselves! We would appreciate any feedback, and we’re happy to answer any questions on this thread.

Design BOTH · nab
N/A
Revenue not available

Similar Products

Design
We post-trained a model that pen tests instead of refusing your code

I'm Dimitrios at Cosine. Quick orientation first: the read-only scan is free and you can run it right now; that's the part to try. The pen-test mode is gated behind written authorisation because it's live offensive testing against real systems. I'll explain that below; it's not a paywall thing.

The reason this exists: most "AI security" tools wrap a general model, so they inherit its refusals. Point one at a real offensive task and it hedges or declines, because the base model was trained to. We went the other way and post-trained our own model for offensive security, so it does the work instead of apologising for it. It's our model, not a wrapper.

Under the hood it's a multi-agent swarm: an orchestrator splits the job across subagents running in parallel, each owning a slice, then synthesises one report. That's what gets a polyglot microservice repo done in one pass.

The fair objection to a model that doesn't refuse, pointed at your code: how is that not reckless? I think refusals are the wrong layer to put safety in. A model that refuses is both useless (it won't do the job) and unsafe (you're trusting a probability distribution to hold a hard line). So we don't ask the model to behave; we enforce it in the harness.

A runtime guard written in Go intercepts every tool call before it runs. In scan mode it hard-blocks every mutating tool and any non-read-only shell command: the model can decide whatever it wants, but the guard won't let it write. In pen-test mode the same guard pins the agent's network scope to the targets you authorised; it can't reach anything else. Safety is deterministic and sits below the model, not inside it.

Two modes, one CLI:

- Security Scan - read-only audit of a local codebase, every finding tied to a file and line. Free, runnable today.
- Pen Test - the swarm attacks systems you authorise and hands back the request it sent and the response your code gave. Gated behind written authorisation.

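For readers who want a concrete picture of the harness idea, here is a minimal sketch of a guard that checks each tool call before it runs. It is not Cosine's implementation: the `ToolCall` shape, the tool names, the command allowlist, and the host `staging.example.test` are all hypothetical, chosen only to illustrate "block writes in scan mode, pin network scope in pen-test mode".

```go
package main

import (
	"fmt"
	"strings"
)

// Mode selects which policy the guard enforces.
type Mode int

const (
	ScanMode Mode = iota
	PenTestMode
)

// ToolCall is a hypothetical shape for a tool invocation requested by the model.
type ToolCall struct {
	Tool   string // e.g. "read_file", "write_file", "shell", "http_request"
	Arg    string // file path or shell command line
	Target string // network host the call wants to reach, if any
}

// Guard holds the policy; the model has no way to modify it.
type Guard struct {
	Mode         Mode
	AllowedHosts map[string]bool // authorised pen-test targets
}

// Illustrative allowlists (assumptions, not a real product's lists).
var readOnlyTools = map[string]bool{"read_file": true, "list_dir": true}
var readOnlyCommands = map[string]bool{"ls": true, "cat": true, "grep": true, "head": true}

// checkReadOnlyShell rejects anything that is not a plain allowlisted command.
// Metacharacters are refused outright so redirects and pipes cannot smuggle writes.
func checkReadOnlyShell(cmd string) error {
	if strings.ContainsAny(cmd, ";|&><$`\n") {
		return fmt.Errorf("shell metacharacters are not allowed in scan mode: %q", cmd)
	}
	fields := strings.Fields(cmd)
	if len(fields) == 0 || !readOnlyCommands[fields[0]] {
		return fmt.Errorf("command is not on the read-only allowlist: %q", cmd)
	}
	return nil
}

// Check returns nil if the call may run and an error describing the block otherwise.
func (g Guard) Check(c ToolCall) error {
	switch g.Mode {
	case ScanMode:
		if c.Tool == "shell" {
			return checkReadOnlyShell(c.Arg)
		}
		if !readOnlyTools[c.Tool] {
			return fmt.Errorf("tool %q is blocked in scan mode", c.Tool)
		}
		return nil
	case PenTestMode:
		// In this sketch every pen-test call must name a target that is in scope.
		if !g.AllowedHosts[c.Target] {
			return fmt.Errorf("target %q is outside the authorised scope", c.Target)
		}
		return nil
	default:
		return fmt.Errorf("unknown mode %d", g.Mode)
	}
}

func main() {
	scan := Guard{Mode: ScanMode}
	pen := Guard{Mode: PenTestMode, AllowedHosts: map[string]bool{"staging.example.test": true}}

	fmt.Println(scan.Check(ToolCall{Tool: "read_file", Arg: "main.go"}))
	fmt.Println(scan.Check(ToolCall{Tool: "shell", Arg: "cat notes.txt > out.txt"}))
	fmt.Println(pen.Check(ToolCall{Tool: "http_request", Target: "staging.example.test"}))
	fmt.Println(pen.Check(ToolCall{Tool: "http_request", Target: "prod.example.test"}))
}
```

The point of the design is that `Check` is ordinary deterministic code: whatever the model proposes, a call that fails the check never executes. A real guard would need a far more careful shell parser and network enforcement below the process level; the allowlist here only shows where the decision lives.
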
Demo target, to be straight about it: Bank of Anthos, Google's open-source reference bank. It's a known app with some intentionally soft bits, which is why I picked it: you can reproduce the run instead of trusting a screenshot. The scan found an integer overflow in the transfer path that would let you forge an account balance, plus the usual injection/auth/secrets classes.

The tool is a closed binary from Cosine (installed via brew/curl/winget) that runs locally. Run it behind a firewall and `tcpdump` exactly what it does before you trust it on anything real. Install is free; the scan runs on a $20 Cosine subscription; pen test is scoped per engagement.

I'll be in the thread all day. The harness-vs-refusals design is the part I most want torn apart - tell me where it breaks.

Revenue N/A
Design
ABC Classic 100 Rankings visualised

This weekend is the ABC Classic FM countdown, which prompted me to dust off an old unpublished data visualisation of rankings from previous years. I've considered adding a search function, but I also kind of like that the current form requires a bit of exploration. Some of the code is a bit clunky and I wouldn't mind refactoring it. I'm also not sure about browser compatibility, since I've only got access to a couple of devices to test it on.

Revenue N/A
Design
Hydron

Hi HN, this is Prashant from H2Loop. Embedded engineers we work with were annoyed that generic AI tools hallucinated register addresses, generated code for peripherals that don't exist on the chip, and mixed up timer quirks between similar platforms like STM32F4 and F7. The code looks clean but it just won't boot, which sent them back to the datasheet every time. So we built Hydron, an AI tool that writes datasheet-grounded code for your hardware.

Demo: Hydron setting up sleep-mode CPU logging for an onboard temp sensor on an STM32U385 - https://boot.hydron.sh/zzzDemo

First, we've pre-indexed 580+ platforms and peripherals. That covers most of what you'd use in a robotics, UAV, or IoT build: common dev platforms like the STM32, ESP32, RP2040, and AM6 families, plus the IMUs, GNSS modules, motor drivers, and baros that ship around them. Ask about a peripheral and the answer comes from our knowledge graph (KG) and the actual datasheet.

Second, you can bring your own context and share it with your team. Hydron indexes PDFs up to 5000 pages, plus a range of other file types and even ZIPs of full C/C++/Python codebases up to 250MB. One engineer indexes the HW, BSPs, and datasheet pack once. Anyone else can reference it from their own Hydron agent.

Third, HW-SW development happens in your editor or terminal. The agentic serial monitor is live today. GDB integration and an AI log reader land in the next two weeks.

Up next, we're focused on closing the hardware-software loop. We're building more HIL debugging capabilities, deeper target awareness, and support for additional platforms. If you work on embedded SW, we'd love your feedback on where today's tools fall short and what you'd like to see next.

Install:

- VS Code extension - https://marketplace.visualstudio.com/items?itemName=H2Loop.h...
- CLI mac/linux -> curl -fsSL https://get.hydron.sh/cli/install.sh | bash
- CLI windows -> irm https://get.hydron.sh/cli/install.ps1 | iex

More at http://boot.hydron.sh/HN. 200 free credits, 50% off on paid plans, one-step signup. u/ajithhyd and I will be in the thread all day.

Revenue N/A
Design
Nutrepedia

Show HN: Nutrepedia – nutrition info in 29 locales built with Clojure and Htmx

Revenue N/A
Design
TV Explorer. Adding advanced UI to free online TV

Show HN: TV Explorer. Adding advanced UI to free online TV

Revenue N/A

Quick Facts

Category: Design
Audience: BOTH
Founder: nab
Revenue data: Unknown
