Bonsai 1.7B ternary model at 442T/s on M4 Max

We took the recently released Bonsai 1.7B ternary model from PrismML (https://github.com/PrismML-Eng/Bonsai-demo) and ran our agentic evolution search on it for 6 hours to optimize its Metal kernels. The search was fully autonomous. Measured against unmodified upstream llama.cpp at the same Bonsai/Q2_0 commit, on the same M4 Max:

- tg128: 309.82 → 442.42 t/s (+42.8%)
- pp512: 4250.32 → 4622.63 t/s (+8.8%)
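For anyone wanting to reproduce numbers like these, llama.cpp ships a llama-bench tool whose default tests are exactly pp512 and tg128. A minimal sketch, assuming hypothetical paths to an upstream and a patched build plus a local GGUF file:

```python
# Minimal sketch: compare tg128/pp512 across two llama-bench builds.
# Build paths and the model filename are hypothetical; adjust to your setup.
import subprocess

MODEL = "bonsai-1.7b-q2_0.gguf"  # hypothetical GGUF filename

for build in ("./upstream/llama-bench", "./optimized/llama-bench"):
    # -p 512 measures prompt processing (pp512), -n 128 measures token
    # generation (tg128), -r 5 averages each test over five runs.
    result = subprocess.run(
        [build, "-m", MODEL, "-p", "512", "-n", "128", "-r", "5"],
        capture_output=True, text=True, check=True,
    )
    print(f"=== {build} ===")
    print(result.stdout)
```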

Developer Tools · BOTH · hhuytho
Revenue: N/A (revenue data unavailable)

AI Analysis

Analysis coming soon.

Similar products

Developer Tools
Capgo

Instant updates for Capacitor apps. Ship fixes in minutes, not weeks. Push OTA updates to users without App Store delays.

$15.2K /mo
Developer Tools · Easy to clone
OpenAlternative

OpenAlternative is a directory of open-source alternatives to proprietary software. The site collects projects across categories with information on features, tech stack, and GitHub metrics. The platform monetizes through paid listings and affiliate links.

$6.7K /mo
Developer Tools
Generate SKILL.md files from URLs, in the browser

I created this tool after writing a few agent skills by hand and noticing how repetitive the process was. Paste a documentation URL, enter your own model API key, and it fetches the page content client-side and turns it into a reusable SKILL.md. There is no backend or proxy, so your API key never leaves the browser. I'd welcome feedback on the structure of the output and on edge cases.
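As a rough sketch of the flow described above: the real tool runs entirely client-side in the browser, while this server-side Python stand-in, with its endpoint, model name, and prompt, is my assumption rather than the tool's actual internals.

```python
# Rough sketch of the fetch-docs -> generate-SKILL.md flow, assuming an
# OpenAI-compatible chat completions API; endpoint, model, and prompt are
# hypothetical, and the real tool does all of this client-side.
import json
import urllib.request

DOC_URL = "https://example.com/docs/some-library"   # hypothetical input
API_URL = "https://api.openai.com/v1/chat/completions"
API_KEY = "sk-..."  # user-supplied key

# 1. Fetch the documentation page (the browser tool does this client-side).
with urllib.request.urlopen(DOC_URL) as resp:
    page = resp.read().decode("utf-8", errors="replace")

# 2. Ask the model to distill the page into a reusable SKILL.md.
req = urllib.request.Request(
    API_URL,
    data=json.dumps({
        "model": "gpt-4o-mini",
        "messages": [{
            "role": "user",
            "content": "Turn this documentation page into a SKILL.md an "
                       "agent can reuse:\n\n" + page[:50_000],
        }],
    }).encode(),
    headers={"Authorization": f"Bearer {API_KEY}",
             "Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    skill_md = json.load(resp)["choices"][0]["message"]["content"]

with open("SKILL.md", "w") as f:
    f.write(skill_md)
```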

Revenue: N/A
Developer Tools
State of the Art of Coding Models, According to Hacker News Commenters

Hello HN, I was away from my computer for two weeks, and after coming back and reading the latest HN discussions about coding assistants (models, harnesses), I felt very out of the loop. My normal process would have been to keep reading and piece together the latest and greatest from people's comments, but I wanted to try to automate that.

The goal is a quick overview of which coding models are popular on HN. A next iteration could also scan for the harnesses people use, or for info on self-hosting and hardware setups. I wrote a short intro on the page about the pipeline that collects and analyzes the data, but feel free to ask for more details or to check the Google Sheet: https://hnup.date/hn-sota
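One way such a pipeline could start (a sketch under my own assumptions, not necessarily the author's implementation): pull recent comments from the public HN Algolia search API and count mentions per model name as a crude popularity signal.

```python
# Sketch: crude "which coding models does HN mention" counter, using the
# public HN Algolia API. The model list and substring matching are
# illustrative stand-ins for the real pipeline's analysis.
import json
import urllib.request
from collections import Counter

MODELS = ["claude", "gpt-4", "gemini", "qwen", "deepseek"]  # illustrative
API = ("https://hn.algolia.com/api/v1/search_by_date"
       "?tags=comment&hitsPerPage=100&query={q}")

counts = Counter()
for model in MODELS:
    with urllib.request.urlopen(API.format(q=model)) as resp:
        hits = json.load(resp)["hits"]
    # Count recent comments whose text actually contains the model name.
    counts[model] = sum(
        1 for h in hits
        if model in (h.get("comment_text") or "").lower()
    )

for model, n in counts.most_common():
    print(f"{model}: {n} recent comment mentions")
```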

Revenue: N/A
Developer Tools
Large Scale Article Extract of Newspapers 1730s-1960s

Hello HN, over the past 7 months I've spent nearly 3,000 hours building SNEWPAPERS, the first historical newspaper archive with full-text extractions, nearly perfect OCR, a vast categorization taxonomy, and of course semantic and agentic search capabilities.

Problem: I wanted to search through newspaper archives, but every service I tried only lets you search by keywords and dates, and gives you back raw images of the papers, too many of them and with no context. A sea of noise.

Solution: I taught machines how to read the newspapers. So far I've extracted the content from >600k pages (about 5TB) from the Chronicling America collection. The problems I had to deal with were an infinite variety of layouts, font sizes, scan qualities, resolutions, and aspect ratios, plus navigating around the images on the page. I also had to figure out how to get OCR nearly perfect so people wouldn't hate reading the extracts.

I stitched together a multi-model pipeline (layout tech, OCR tech, LLM, VLLM) with heuristics to go from layout -> segmentation -> classification. Everything goes into OpenSearch / Postgres, is semantically searchable, and has an agentic search tool on top that knows how to use the API well and helps you write queries to find what you're looking for. Happy to discuss AWS architecture and scaling as well; that was tough!

If you have five minutes and just want to jump in for your own personalized experience, here's what I'd suggest:
- Before searching for anything, go to the Sleuth page
- Ask it about anything from 1736 to 1963, with maybe 1 or 2 follow-up questions
- Then go to the search page to see the queries it wrote for you (bottom left, "saved queries") and uncover more on whatever you're interested in

If you think it's cool and want to learn more, there's about 10 minutes of video guides on the various capabilities under "Guide" in the nav bar.

Some other people have also taken a crack at this, notably:
- https://dell-research-harvard.github.io/resources/americanst... (very good attempt)
- https://labs.loc.gov/work/experiments/newspaper-navigator/ (focused on images)
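To make the "semantically searchable" part concrete, here's a minimal sketch of a k-NN query against an OpenSearch index like the one described; the index name, field names, and schema are my assumptions, not SNEWPAPERS' actual setup.

```python
# Sketch: semantic search over article embeddings via the OpenSearch k-NN
# plugin. Index and field names here are hypothetical.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

def semantic_search(query_vector: list[float], k: int = 10) -> list[dict]:
    # Nearest-neighbour query against a vector field holding article
    # embeddings; returns the top-k hits with a few source fields.
    body = {
        "size": k,
        "query": {"knn": {"embedding": {"vector": query_vector, "k": k}}},
        "_source": ["title", "date", "category", "ocr_text"],
    }
    return client.search(index="newspaper_articles", body=body)["hits"]["hits"]
```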

Revenue: N/A

Key facts

Category: Developer Tools
Audience: BOTH
Founder: hhuytho
Revenue data: Unknown
