Filling PDF forms with AI using client-side tool calling

Hey HN! I built SimplePDF Copilot: an AI assistant that can interact with the PDF editor. It fills fields, answers questions, focuses on a specific field, adds fields, deletes pages, and so on.

It's built on top of SimplePDF, which I started 7 years ago, pioneering privacy-respecting client-side PDF editing, and which is now used by 200k+ people every month.

As for the privacy model: the PDF itself never leaves the browser. Parsing, rendering, and field detection all run client-side. The text the model needs (and your messages) goes to whatever LLM you point it at. By default that's our demo proxy (DeepSeek V4 Flash, rate-capped), but you can BYOK and point it at any cloud provider, or go fully local (I've been testing with LM Studio).

Unlike the existing "Chat with PDF" tools that only retrieve the text/OCR layer, Copilot can act on the PDF: filling fields, adding fields (detected client-side using CommonForms by Joe Barrow [1], jbarrow on HN, with some post-processing heuristics I added on top), focusing on fields, deleting pages, and so on.

I built this because SimplePDF is mostly used by healthcare customers, where document privacy is paramount, and I wanted an AI experience that didn't require shipping PII to a third party.

The stack is pretty standard:

- TanStack Start
- AI SDK from Vercel
- Tailwind (I personally prefer CSS modules, I'm old-school, but since the goal was to open source it, I figured Tailwind would be a better fit)

The more interesting part is the client-side tool calling: events are passed back and forth via iframe postMessage.

If you're not familiar with "tool calling" and "client-side tool calling", a quick primer: tool calling is what LLMs use to take actions. When Claude runs grep or ls, or hits an MCP server, those are tool calls. Client-side tool calling means the intent to call a tool comes from the LLM, but the execution happens in the browser.

That matters for two reasons. The first is speed: you can't go faster than client-to-client operations. The second is control: it gives you the ability to limit the data you expose to the LLM. For the demo I do feed the content of the document to the LLM, but that connection could be severed as simply as removing the tool that exposes the content data.

The demo is fully open source and available on GitHub [2]; it is the same demo this post links to [3]. What's not open source is SimplePDF itself (loaded as the iframe).

I could talk on and on about this, so let me know if you have any questions, anything goes!

[1] https://github.com/jbarrow/commonforms
[2] https://github.com/SimplePDF/simplepdf-embed/tree/main/copil...
[3] https://copilot.simplepdf.com/?share=a7d00ad073c75a75d493228...
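To make the client-side tool calling idea more concrete, here is a minimal TypeScript sketch. It is not the SimplePDF implementation and does not reflect its actual iframe protocol. The message shapes (`ToolCall`, `ToolResult`), the helper names (`createTools`, `executeToolCall`), the tool names (`fill_field`, `list_fields`, `get_document_content`), and the in-memory `Doc` are all hypothetical, chosen only to illustrate two points from the post: the LLM only supplies the intent while the browser executes it, and removing the content tool cuts the model off from the document text.

```typescript
// Hypothetical message shapes; the real SimplePDF iframe protocol is not shown in the post.
type ToolCall = { id: string; name: string; args: Record<string, unknown> };
type ToolResult =
  | { id: string; ok: true; data: unknown }
  | { id: string; ok: false; error: string };

type Tool = (args: Record<string, unknown>) => unknown;

// In-memory stand-in for editor state that lives only in the browser.
type Doc = { fields: Record<string, string>; text: string };

// Build the set of tools the LLM is allowed to request.
// With exposeContent = false the content tool is simply never registered.
function createTools(doc: Doc, exposeContent: boolean): Map<string, Tool> {
  const tools = new Map<string, Tool>();

  tools.set("fill_field", (args) => {
    const name = String(args.name);
    if (!Object.prototype.hasOwnProperty.call(doc.fields, name)) {
      throw new Error(`unknown field: ${name}`);
    }
    doc.fields[name] = String(args.value);
    return { filled: name };
  });

  tools.set("list_fields", () => Object.keys(doc.fields));

  if (exposeContent) {
    tools.set("get_document_content", () => doc.text);
  }
  return tools;
}

// Execute one tool call locally and turn the outcome into a reply message.
function executeToolCall(tools: Map<string, Tool>, call: ToolCall): ToolResult {
  const tool = tools.get(call.name);
  if (!tool) {
    return { id: call.id, ok: false, error: `tool not available: ${call.name}` };
  }
  try {
    return { id: call.id, ok: true, data: tool(call.args) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { id: call.id, ok: false, error: message };
  }
}

// Browser wiring, shown as a comment because it needs a DOM (EDITOR_ORIGIN is an assumption):
//
//   window.addEventListener("message", (event) => {
//     if (event.origin !== EDITOR_ORIGIN) return; // always check the sender
//     const result = executeToolCall(tools, event.data as ToolCall);
//     (event.source as Window).postMessage(result, event.origin);
//   });
```

In this sketch only the tool result travels back toward the model, so a `fill_field` call reveals nothing beyond the field name it touched. Calling `createTools(doc, false)` is the "remove the tool" switch: a request for `get_document_content` then comes back as an error instead of the document text.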

Developer Tools · BOTH · nip
Revenue: not available

Similar Products

Developer Tools
Capgo

Instant updates for Capacitor apps. Ship fixes in minutes, not weeks. Push OTA updates to users without app store delays.

$15.2K /mo
Developer Tools Easy to clone
OpenAlternative

Open source alternatives to popular software. Over 1 million users replaced their proprietary tools with open source software. Discover the best alternatives and join the movement.

$6.7K /mo
Developer Tools
Homebrew 6.0.0

Today, I’m proud to announce Homebrew 6.0.0. The most significant changes since 5.1.0 are a new tap trust security mechanism, the new faster, smaller, default internal Homebrew JSON API, sandboxing on Linux, better defaults informed by our user survey, many brew bundle improvements, improved performance and initial support for macOS 27 (Golden Gate). Happy to discuss any questions here!

Revenue N/A
Developer Tools
Intunedhq

Hey HN, we're Faisal and Ahmad from Intuned (https://intunedhq.com). We're building a platform for building, deploying, and maintaining browser automations. Customers primarily use the Intuned AI agent to automate websites that don't expose APIs. Common use cases include scraping data, pulling reports, and submitting forms. As the website changes, our agent also helps automatically heal the automation.

On Intuned, browser automations are created by an AI agent and run as code. Our infra captures the context of every run, allowing our agent to debug and maintain the underlying code to keep the automations working over time. This way, we're able to offer the predictability, speed, and cost of code, without the painful parts of writing and maintaining it.

Here's a demo of building a scraper on Intuned: https://youtu.be/ruZP73bK4FU
Here's a demo of using AI to maintain a project: https://youtu.be/e4R4hLdHBro

Backstory: we were accepted into YC for a completely different idea. During the batch, because of Faisal's background at UiPath, several batchmates asked us whether RPA tools could fill API gaps in their products by automating websites without APIs. When it was time to pivot, we went back to those founders to dig deeper. (RPA in this context refers to using UI automation to complete non-testing tasks.)

We discovered that the actual hard problem in browser automation is maintenance. Websites change, selectors break, and failures can be painful to reproduce and fix. So in early 2024, we decided to take a crack at this problem with a handful of customers. It took a fair number of iterations before we landed on our current code-first approach.

How it works: Intuned is infra + agent, deeply integrated.

On the infrastructure side, Intuned is a managed runtime for browser automation code. Projects are usually Playwright-based TypeScript or Python. Users can write them directly in our online IDE, or hand the work off to the agent. Either way, once deployed, the platform runs each project in its own isolated machine and handles auth/session reuse, scheduling, batch execution, concurrency, observability, and the other plumbing around running browser code.

On the agent side, it took us a few iterations to get to the current approach. Our initial attempts were rigid pipelines: collect requirements, inspect the site, generate code, then try to patch whatever broke. It looked reasonable on paper, but real websites are too messy for fixed paths. Late last year, we were planning to ship that version when stronger models landed and harnesses like Claude Code and Codex showed what a more open-ended coding agent could do. We built a prototype on the Claude Agent SDK, and it felt much better than what we had, so we scrapped the release and decided to rebuild the agent.

The rebuild came down to three pieces around the SDK: an execution environment for running long agent sessions reliably, a CLI that exposes the platform to the agent so it operates Intuned the way engineers do, and a custom plugin (skills + MCP) built around what we've learned building browser automations.

The infra-agent integration is where the product gets more interesting. The runtime doesn't just run the automation; it captures the context needed to debug it when it fails: params, results, traces, logs. That enables features like Fix with AI, where you can open a failed run and have the agent investigate and prepare a fix.

The same integration powers a feature called self-healing. For configured projects, the platform detects failures, starts an agent session with the relevant context, and either proposes a fix for review or deploys it automatically. Demo: https://youtu.be/IVHIXw0lYMs

We also recently packaged the infra and agent as an API called Web Task API. Here is a demo: https://youtu.be/1olRn3l95vw

We strongly believe that browser automations can and should be faster, cheaper, and more predictable. Check us out at https://app.intuned.io/; we have a free tier with trial credits for your first few automations. Excited to hear your thoughts, questions, and feedback!

Revenue N/A
Developer Tools
Mercek

Hey HN, I've been using ECS for a while now and found it annoying having to log into the console every time. I use Lens for Kubernetes but couldn't find an equivalent for ECS, so I built one! The project is open source as well: https://github.com/utibeabasi6/mercek

Revenue N/A

Quick Facts

Category
Developer Tools
Audience
BOTH
Founder
nip
Revenue data
Unknown
