Filling PDF forms with AI using client-side tool calling
Hey HN! I built SimplePDF Copilot: an AI assistant that can interact with the PDF editor. It fills fields, answers questions, focuses on a specific field, adds fields, deletes pages, and so on. It's built on top of SimplePDF, which I started 7 years ago, pioneering privacy-respecting client-side PDF editing, now used monthly by 200k+ people.

As for the privacy model: the PDF itself never leaves the browser. Parsing, rendering, and field detection all run client-side. The text the model needs (and your messages) goes to whatever LLM you point at. By default that's our demo proxy (DeepSeek V4 Flash, rate-capped), but you can BYOK and point it at any cloud provider, or go fully local (I've been testing with LM Studio).

Unlike existing "Chat with PDF" tools that only retrieve the text/OCR layer, Copilot can act on the PDF: filling fields, adding fields (detected client-side using CommonForms by Joe Barrow [1], jbarrow on HN, with some post-processing heuristics I added on top), focusing on fields, deleting pages, and so on.

I built this because SimplePDF is mostly used by healthcare customers, where document privacy is paramount, and I wanted an AI experience that didn't require shipping PII to a third party.

The stack is pretty standard:

- TanStack Start
- AI SDK from Vercel
- Tailwind (I personally prefer CSS modules, I'm old-school, but since I'm open-sourcing it, I figured Tailwind would be a better fit)

The more interesting part is the client-side tool calling: events are passed back and forth via iframe postMessage.

If you're not familiar with "tool calling" and "client-side tool calling", a quick primer: tool calling is what LLMs use to take actions. When Claude runs grep or ls, or hits an MCP server, those are tool calls. Client-side tool calling means the intent to call a tool comes from the LLM, but the execution happens in the browser.
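To make the idea concrete, here is a minimal sketch of a client-side tool dispatcher. This is not SimplePDF's actual message schema or code (the tool names and shapes here are hypothetical); it just shows the pattern: the page keeps a registry of handlers, executes the LLM's tool-call intent locally, and would post the result back over the iframe boundary.

```typescript
// A tool-call intent as emitted by the LLM (names/shapes are illustrative).
type ToolCall = { id: string; name: string; args: Record<string, unknown> };
type ToolHandler = (args: Record<string, unknown>) => unknown;

// Registry of local handlers; in the real app these would drive the editor.
const registry: Record<string, ToolHandler> = {
  fill_field: ({ field, value }) => ({ ok: true, field, value }),
  delete_page: ({ page }) => ({ ok: true, deleted: page }),
};

// Execute the intent locally; the LLM never touches the document directly.
function executeToolCall(reg: Record<string, ToolHandler>, call: ToolCall) {
  const handler = reg[call.name];
  if (!handler) return { id: call.id, error: `unknown tool: ${call.name}` };
  return { id: call.id, result: handler(call.args) };
}

// In the browser this result would be relayed to the editor iframe, roughly:
//   editorIframe.contentWindow?.postMessage(
//     executeToolCall(registry, call), editorOrigin);
```

The key property: the model only ever sees the tool's (serialized) result, never the PDF bytes.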
That matters for two reasons: speed (you can't go faster than client-to-client operations) and control over the data you expose to the LLM. For the demo I do feed the content of the document to the LLM, but that connection could be severed as simply as removing the tool that exposes the content data.

The demo is fully open source, available on GitHub [2], and the live demo is the same as the link of this post [3]. What's not open source is SimplePDF itself (loaded as the iframe).

I could go on and on about this, so let me know if you have any questions, anything goes!

[1] https://github.com/jbarrow/commonforms

[2] https://github.com/SimplePDF/simplepdf-embed/tree/main/copil...

[3] https://copilot.simplepdf.com/?share=a7d00ad073c75a75d493228...
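A sketch of what "removing the tool that exposes the content" means in practice (tool names here are hypothetical, not the repo's actual identifiers): the tool list handed to the LLM is just data the client builds, so dropping one entry is enough to cut off document text entirely while keeping all the action tools.

```typescript
type ToolSpec = { name: string; description: string };

// Build the tool list offered to the model. Only one tool ever ships
// document text; omit it and the model can still act on the PDF blind.
function buildTools(exposeContent: boolean): ToolSpec[] {
  const tools: ToolSpec[] = [
    { name: "fill_field", description: "Fill a form field with a value" },
    { name: "focus_field", description: "Scroll to and highlight a field" },
    { name: "delete_page", description: "Delete a page by index" },
  ];
  if (exposeContent) {
    tools.push({
      name: "get_document_text",
      description: "Return the document's text layer",
    });
  }
  return tools;
}
```

With `exposeContent: false`, the model can still fill, focus, and delete based on your instructions; it just never reads the document.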
Similar products
Capgo
Instant updates for Capacitor apps. Ship fixes in minutes, not weeks. Deliver OTA updates to users without App Store delays.
OpenAlternative
OpenAlternative is a directory of open-source alternatives to proprietary software. The site collects projects across various categories, with information on features, tech stack, and GitHub metrics. The platform is monetized through paid placements and affiliate links.
zot
Why I built another coding agent harness: https://dev.to/patriceckhart/zot-why-i-built-another-coding-...
GitHub repo: https://github.com/patriceckhart/zot
Utilyze
The standard GPU utilization metric reported by nvidia-smi, nvtop, Weights & Biases, Amazon CloudWatch, Google Cloud Monitoring, and Azure Monitor is highly misleading. It reports the fraction of time that any kernel is running on the GPU, which means a GPU can report 100% utilization even if only a small portion of its compute capacity is actually being used. In practice, we've seen workloads with ~1–10% real compute throughput while dashboards show 100%. This becomes a problem when teams rely on that metric for capacity planning or optimization decisions; it can make underutilized systems look saturated.

We're releasing an open-source (Apache 2.0) tool, Utilyze, to measure GPU utilization differently. It samples hardware performance counters and reports compute and memory throughput relative to the hardware's theoretical limits. It also estimates an attainable utilization ceiling for a given workload.

GitHub link: https://github.com/systalyze/utilyze

We'd love to hear your thoughts!
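The gap between the two metrics can be made concrete with a toy calculation (the numbers are invented for illustration; the real tool reads hardware performance counters rather than computing this from wall-clock figures):

```typescript
// nvidia-smi-style "utilization": fraction of wall time any kernel was resident.
function kernelActiveUtilization(kernelActiveMs: number, wallMs: number): number {
  return kernelActiveMs / wallMs;
}

// Throughput-style utilization: achieved FLOP/s relative to the hardware peak.
function computeThroughputUtilization(achievedFlops: number, peakFlops: number): number {
  return achievedFlops / peakFlops;
}

// A kernel that runs the entire time but uses a sliver of the compute units:
const busy = kernelActiveUtilization(1000, 1000);        // dashboard says 100%
const real = computeThroughputUtilization(5e12, 100e12); // 5% of peak FLOP/s
```

The first number is what the dashboards quote; the second is closer to what capacity planning actually needs.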
Tiao, a two-player turn-based board game
Hi HN, I built this digital version of Tiao, a two-player turn-based strategy board game. Think Checkers meets Go. It's free, runs in the browser, and has multiplayer, AI, an over-the-board mode, and a lot of other neat things. The source is on GitHub (AGPL).

The game was originally designed by my friend Andreas Edmeier. He created the rules and has been playtesting and refining the game design for years. I built the website for it: the core took about 2 weeks using TypeScript, Next.js, Express, WebSockets, and MongoDB. Fully dockerized, deployed on a Hetzner VPS with Coolify. Authentication with better-auth. Real-time gameplay, Elo matchmaking, OpenPanel analytics, and a fully functional achievements system.

Play it: https://playtiao.com
Source: https://github.com/trebeljahr/tiao

Happy to answer questions about the tech, the game design, or anything else. My hope is that more people will play this game, because I think it is genuinely fun, and it would be cool to one day see people play it on a Go board or on their phones/computers. Have a good one.