LLMs consume 5.4x less mobile energy than ad-supported web search
The standard AI energy debate compares server-side LLM inference to a server-side Google query. I think this misses most of what actually happens on a mobile device during a real search session.

I built a parametric model of the full end-to-end mobile search session: 4G/5G radio energy, SoC rendering cost for a 2.5MB page, programmatic advertising RTB auctions running in the background, and network transmission costs for both sides. Then I compared it to an equivalent LLM session.

Main finding across 10,000 Monte Carlo draws: on mobile, a standard LLM session uses on average 5.4x less energy than a classic ad-supported web search session. Programmatic advertising alone accounts for up to 41% of device battery drain per session.

Caveats I tried to be explicit about:

- The advantage disappears on fixed Wi-Fi/fiber
- It reverses for reasoning models
- This is a parametric model, not an empirical device measurement. Greenspector has offered to run terminal measurements for v2
- Jevons paradox applies

This is an SSRN working paper, not peer-reviewed. The methodology and Monte Carlo distributions are fully documented in the paper. Happy to defend the assumptions.

DOI: 10.2139/ssrn.6287918
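The shape of the Monte Carlo comparison is easy to sketch. Every component range below is a placeholder assumption for illustration only, not one of the fitted distributions documented in the paper:

```python
import random

random.seed(0)
N = 10_000  # number of Monte Carlo draws

def draw(lo, mode, hi):
    """Triangular draw in joules. Bounds are illustrative placeholders,
    NOT the paper's distributions."""
    return random.triangular(lo, hi, mode)

def web_session():
    """Per-session energy for an ad-supported mobile search (made-up ranges)."""
    radio  = draw(5, 12, 25)   # 4G/5G radio energy for a ~2.5 MB page
    render = draw(3, 8, 15)    # SoC rendering cost
    ads    = draw(4, 10, 20)   # RTB auctions / ad delivery share on-device
    server = draw(1, 3, 6)     # server-side query cost
    return radio + render + ads + server

def llm_session():
    """Per-session energy for an equivalent LLM query (made-up ranges)."""
    radio  = draw(1, 2, 4)     # small text payload over the radio
    render = draw(0.5, 1, 2)   # minimal rendering, no ad creatives
    server = draw(2, 5, 10)    # inference energy share per query
    return radio + render + server

ratios = [web_session() / llm_session() for _ in range(N)]
mean_ratio = sum(ratios) / N
print(f"mean web/LLM energy ratio over {N} draws: {mean_ratio:.1f}x")
```

With these toy ranges the ratio comes out in the same ballpark as the headline number, but that is by construction; the paper's actual distributions are what carry the result.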
I built a toy that plays grandma's stories when my daughter hugs it
This was a project I built for my daughter's first birthday present. For context, I'm a surgical resident in the UK by background and am currently taking a year out of training to study a master's in computer science.

My daughter just turned one. There are two things she really loves: the first is a particular soft toy that she just can't live without, and the other is a good story book. Her grandparents live hours away and I didn't want her to forget what they sound like between visits. I wanted her to hear them whenever she missed them.

My parents brought my brother and me up with incredible stories and books from all sorts of cultures, many of them passed down from their parents before them. I didn't want my daughter to miss out on that. Finally, I was sick of missing storytime with her when I had to leave for night shifts. I wanted her to hear my voice before she slept every night.

For all these reasons, I decided to build Storyfriend. It's her favourite soft toy with a custom-made speaker module inside. I combined my surgical skills with the skills I was learning as a CS student; along the way I dipped my toes into the world of 3D printing, CAD, and electronics design.

When she hugs the toy, it plays stories read by her grandparents. She can take the toy with her anywhere and hear the stories anytime she wants - it works offline and has internal storage. It meets my wife's strict no-screen rule (which is getting harder to stick to as the days go by). I've recorded some of the stories that we would read together, so that on nights when I'm working she still has me there to read her a bedtime story.

The bit I'm most pleased with: grandparents don't need an app. They just call a phone number. The audio routes through my server and is pushed to the toy over WiFi.
My own 86-year-old grandmother in a rural village in another country can do it just by making a regular call via her landline, as she has done for many years - no help needed, no apps required, no smartphones involved.

Hardware is a BLE/WiFi module with a MAX98357 amplifier chip and a custom battery management system, all soldered together, housed in a 3D-printed enclosure, and tucked into a compartment that I stitched into her cuddly toy. Firmware pulls new messages when connected to WiFi and stores them on an SD card.

So far I've sold a few hand-made units to parents and grandparents who resonated with the project.

Site: https://storyfriend.co.uk

Would love feedback on the technical approach, the product itself, or anything else. Happy to answer questions about the build.
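The pull-when-connected sync is simple enough to sketch. Everything below is a hypothetical stand-in (the message format, fetch function, and file layout are illustrative, not the actual firmware, which runs on the embedded module):

```python
import os
import tempfile

def fetch_new_messages(last_seen_id):
    """Stand-in for the HTTPS poll against the server over WiFi.
    Returns messages newer than last_seen_id. The catalogue here is
    a hard-coded fake for illustration."""
    catalogue = [
        {"id": 1, "caller": "grandma", "audio": b"fake-audio-bytes-1"},
        {"id": 2, "caller": "dad",     "audio": b"fake-audio-bytes-2"},
    ]
    return [m for m in catalogue if m["id"] > last_seen_id]

def sync_to_sd(sd_root, last_seen_id):
    """Pull any new messages and persist them to the SD card so
    playback keeps working fully offline afterwards."""
    for msg in fetch_new_messages(last_seen_id):
        path = os.path.join(sd_root, f'{msg["id"]:04d}_{msg["caller"]}.audio')
        with open(path, "wb") as f:
            f.write(msg["audio"])
        last_seen_id = msg["id"]
    return last_seen_id  # cursor to store for the next sync

sd = tempfile.mkdtemp()          # stands in for the SD card mount point
cursor = sync_to_sd(sd, last_seen_id=0)
print(cursor, sorted(os.listdir(sd)))
```

The important property is the cursor: the toy only ever asks for messages it hasn't seen, so a flaky WiFi connection just means the next successful sync catches up.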
AthleteData
I'm a triathlete and the data for my training lives in 6+ apps: Garmin, Strava, WHOOP, Intervals.icu, Wahoo, Withings, Apple Health, sometimes Hevy. Every morning I'd eyeball a few of them and make a call on whether to do the planned session. For the past month I've been building a thing that does this for me, and got it to the point where I use it myself every day.

It OAuths into whatever platforms you connect, reconciles the activities (tbh harder than it sounds: the same ride shows up in Strava, Garmin, and Wahoo with different timestamps and rounding), computes daily load and readiness, and proactively messages you over Telegram or WhatsApp when something matters.

Stack is straightforward: TypeScript all the way, Postgres, and an agent loop running on Claude (via Bedrock) with tool access to all your data plus my computed metrics: zones, CTL/ATL/TSB, power/pace curves, anomaly detection on HRV and RHR, etc.

Two things that were harder than expected:

1. Garmin's API only exposes the last 90 days. So for anyone with Garmin as their primary device, you have to backfill from Strava and stitch the two together. Strava has full history but misses some fields (e.g. HR-based TSS only - no power). Wahoo and Intervals.icu fill different gaps. The dedup pipeline is ugly and I'd welcome feedback from anyone who has solved this better.

2. Deciding when to message vs. stay silent is entirely a product problem. Too chatty -> muted. Too quiet -> feels dead.

One honest caveat: no RCT data, and I'd be skeptical of anyone who claims they have it for AI coaching at this stage.

I'm at ~50 paying users, and I personally reach out to every user to build the next iterations of the product based on feedback. I already have testimonials from Ironman world championship finishers and other pro athletes. There's also a $9/mo MCP tier for people who would rather pipe their data into their own Claude/ChatGPT.

Happy to go deep on any topic, e.g. the tool-calling architecture, or the cost-per-user question (running an agent on every athlete daily is not free, and the margins here are worth discussing).
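For the curious, a minimal version of the time-window dedup heuristic might look like this. The tolerances, field names, and source priority are illustrative guesses, not the production pipeline (the real thing is in TypeScript and, as noted, uglier):

```python
from datetime import datetime, timedelta

def same_activity(a, b, start_tol_s=120, dur_tol_frac=0.02):
    """Treat two records as the same workout if their start times are
    within a tolerance and their durations are close (thresholds are
    guesses, tuned in practice)."""
    if abs((a["start"] - b["start"]).total_seconds()) > start_tol_s:
        return False
    longer = max(a["duration_s"], b["duration_s"])
    return abs(a["duration_s"] - b["duration_s"]) <= dur_tol_frac * longer

def dedupe(activities, priority=("garmin", "wahoo", "strava")):
    """Group matching records, then keep one per group, preferring the
    source with the richest fields."""
    groups = []
    for act in sorted(activities, key=lambda a: a["start"]):
        for g in groups:
            if same_activity(g[0], act):
                g.append(act)
                break
        else:
            groups.append([act])
    rank = {src: i for i, src in enumerate(priority)}
    return [min(g, key=lambda a: rank.get(a["source"], 99)) for g in groups]

ride = datetime(2024, 6, 1, 7, 0)
acts = [
    {"source": "strava", "start": ride,                           "duration_s": 3600},
    {"source": "garmin", "start": ride + timedelta(seconds=45),   "duration_s": 3605},
    {"source": "wahoo",  "start": ride + timedelta(seconds=30),   "duration_s": 3598},
    {"source": "strava", "start": datetime(2024, 6, 2, 7, 0),     "duration_s": 1800},
]
kept = dedupe(acts)
print([(a["source"], a["start"].isoformat()) for a in kept])
```

The three copies of the morning ride collapse into the Garmin record; the next day's run survives untouched. The hard part in practice is that no pair of thresholds is right for every sport, which is why I'd love to hear from anyone with a better approach.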
Daemons
For almost two years, we've been developing Charlie, a coding agent that is autonomous, cloud-based, and focused primarily on TypeScript development. During that time, the explosion in growth and development of LLMs and agents has surpassed even our initially very bullish prognosis. When we started Charlie, we were one of the only teams we knew of that relied fully on agents to build all of our code. We all know how that has gone: the world has caught up, but working with agents hasn't been all kittens and rainbows, especially for fast-moving teams.

The one thing we've noticed over the last 3 months is that the more you use agents, the more work they create. Dozens of pull requests mean older code gets out of date quickly. Documentation drifts. Dependencies become stale. Developers are so focused on pushing out new code that this crucial work falls through the cracks.

That's why we pivoted away from agents and invented what we think is the necessary next step for AI-powered software development. Today, we're introducing Daemons: a new product category built for teams dealing with operational drag from agent-created output. Named after the familiar background processes from Linux, Daemons are added to your codebase by adding an .md file to your repo, and they run in a set-it-and-forget-it way that will make your lives easier and accelerate any project.

For teams that use Claude, Codex, Cursor, Cline, or any other agent, we think you'll really enjoy what Daemons bring to the table.
CyberWriter
Apple has quietly shipped a pretty complete on-device AI stack into macOS, with these features first getting API access in macOS 26. There are multiple components in the foundation model stack, and the skills it shipped with actually make this ~3B-parameter model useful. The API to hit the model is super easy, and no one is really wiring the pieces together yet.

- Foundation Models (macOS 26) - a ~3B-parameter LLM with an API. Streaming, structured output, tool use. No API key, no cloud call, no per-token cost.
- NLContextualEmbedding (Natural Language framework, macOS 14+) - a BERT-style 512-dim text embedder. Exactly what OpenAI and Cohere sell, sitting in Apple's SDKs since iOS 17.
- SFSpeechRecognizer / SpeechAnalyzer - on-device speech-to-text, including live dictation. Solid accuracy on Apple Silicon.

I built cyberWriter, a Markdown editor, on top of all three, mostly as a test and showcase to see what the stack can do. I actually integrated local and cloud AI first; then Apple shipped the foundation model, it stacked on super easily, and now users with no local or API AI knowledge can use it with just a click or two. Well, the real reason is that most Markdown editors need plugins that run with full system access, and I work on health data and can't have that.

Vault chat / semantic search. The app indexes your Markdown folder via NLContextualEmbedding (around 50 seconds for 1,000 chunks on an M1). The search bar gets a "Related Ideas" section that matches by meaning: typing "orbital mechanics" surfaces notes about rockets and launch windows even when those exact words never appear. Ask the AI a question and it retrieves the top 5 chunks as context. Plain RAG, but the embedder, retrieval, chat model, and search all run locally.

AI Workspace. Command+Shift+A opens a chat panel; Command+J triggers inline quick actions (rewrite, summarize, change tone, fix grammar, continue). Apple Intelligence is the default; Claude, OpenAI, Ollama, and LM Studio all work if you prefer.
The same context layer - document selection, attached files, retrieved vault chunks - feeds every provider through the same system-message path. Because the vault context is file- and filename-aware, it can create backlinks to the referenced file if it writes or edits a doc for you.

Voice notes and dictation. Record a voice note directly into your doc, transcribe it with SpeechAnalyzer, or just dictate into the editor while you think. Audio never leaves the Mac.

The privacy story is straightforward because the primitives are already private. Vectors live in a `.vault.embeddings.json` file next to your vault and are never sent anywhere. If you use Apple Intelligence, even the retrieved text stays on-device. For cloud models there is a clear toggle and an inline warning before any filenames or snippets leave the machine.

Honest limitations:

- 512-dim embeddings are solid mid-tier. A GPT-4-class embedder catches subtler relationships this will miss.
- 256-token chunks can split long paragraphs mid-argument.
- Foundation Models caps its context window around 6K characters, so vault context is budgeted to 3K with truncation markers on the rest.
- It's English-only right now. NLContextualEmbedding has Latin, Cyrillic, and CJK model variants; wiring the language detector across chunks is Phase 2.

The developer experience for these APIs is genuinely good. Foundation Models streams cleanly, and NLContextualEmbedding downloads assets on demand and gives you mean-poolable token vectors in a handful of lines. Curious what others here are building on this stack - it feels like low-hanging fruit that has been sitting there for a while.

https://imgur.com/a/HyhHLv2

The Apple AI embedding feature is going live today. I'm honestly surprised it even works out of the box.
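The retrieval half of "plain RAG" is nothing exotic: embed the query, rank the stored chunk vectors by cosine similarity, take the top k as context. A language-neutral sketch, with toy 3-dim vectors standing in for the 512-dim NLContextualEmbedding outputs loaded from `.vault.embeddings.json` (the chunk texts and vectors below are made up):

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either is zero-length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec, chunks, k=5):
    """chunks: list of (chunk_text, vector) pairs.
    Returns the k chunk texts most similar to the query vector."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

vault = [
    ("notes on rocket launch windows", [0.9, 0.1, 0.0]),
    ("grocery list",                   [0.0, 0.0, 1.0]),
    ("delta-v budget for transfer",    [0.8, 0.3, 0.1]),
]
context = top_k([1.0, 0.2, 0.0], vault, k=2)
print(context)
```

In the app the only differences are scale (a linear scan over a vault's worth of 512-dim vectors, which is fine at personal-notes size) and that the embedding itself comes from NLContextualEmbedding rather than a toy list.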