Show HN: Spegel, a Terminal Browser That Uses LLMs to Rewrite Webpages

simedw | 377 points

This is great! Another useful addition that would make me use it: a Chrome browser tool that gives it access to pages that need authn and then scrapes them for you.

My #1 use case is fetching wikis to my hard drive and letting a local coding agent use them for creating plans.

ghm2180 | 13 minutes ago

This is actually very cool. Not really replacing a browser, but it could enable an alternative way of browsing the web with a combination of deterministic search and prompts. It would probably work even better as a command line tool.

A natural next step could be doing things with multiple "tabs" at once, e.g.: tab 1 contains news outlet A's coverage of a story, tab 2 has outlet B's coverage, tab 3 has Wikipedia; summarize and provide references. I guess the problem at that point is whether the underlying model can support this type of workflow, which doesn't really seem to be the case even with SOTA models.

qsort | a day ago

Classic that the first example is for parsing the goddamn recipe from the goddamn recipe site. Instant thumbs up from me haha, looks like a neat little project.

bubblyworld | a day ago

Cool idea, but kind of wasteful. It just feels wrong to waste energy. At the very least, you could first turn the page into Markdown with a library that preserves semantic web structures (I authored this one: https://github.com/romansky/dom-to-semantic-markdown), which saves many tokens and so uses much less energy.
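
To illustrate the token-saving idea, here is a minimal sketch using only Python's standard library. It is not the linked library and not how Spegel works; the class name, the function name, and the list of tags to skip are all assumptions made for the example. It drops scripts, styles, and navigation, and keeps headings, paragraphs, links, and list items as Markdown before anything is sent to a model.

```python
import re
from html.parser import HTMLParser

# Tags whose contents rarely matter to a reader (an assumption; tune as needed).
SKIP_TAGS = {"script", "style", "nav", "footer", "aside", "noscript"}
HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}


class CompactMarkdown(HTMLParser):
    """Hypothetical helper: collects a Markdown-like rendering of the page."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside any skipped element
        self.href = None

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif self.skip_depth:
            return
        elif tag in HEADINGS:
            self.out.append("\n\n" + HEADINGS[tag])
        elif tag == "p":
            self.out.append("\n\n")
        elif tag in ("ul", "ol"):
            self.out.append("\n")
        elif tag == "li":
            self.out.append("\n- ")
        elif tag == "a":
            self.href = dict(attrs).get("href")
            self.out.append("[")

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag == "a" and not self.skip_depth:
            self.out.append(f"]({self.href or ''})")
            self.href = None

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(re.sub(r"\s+", " ", data))


def html_to_compact_markdown(html: str) -> str:
    parser = CompactMarkdown()
    parser.feed(html)
    parser.close()
    text = re.sub(r"[ \t]+", " ", "".join(parser.out))
    text = "\n".join(line.strip() for line in text.split("\n"))
    return re.sub(r"\n{3,}", "\n\n", text).strip()
```

The output is usually much shorter than the raw HTML, so fewer tokens reach the model. A real converter handles far more (tables, images, nested lists), so treat this only as a picture of the preprocessing step.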

leroman | 9 hours ago

I've thought about getting a web browser to work on the terminal for a while now. This is an idea that hadn't occurred to me yet and I'm intrigued.

But I feel it doesn't solve the main issue of terminal-based web browsing. Displaying HTML in the terminal is often kind of ugly and css-based fanciness does not work at all, but that can usually just be ignored. The main problem is javascript and dynamic content, which this approach just ignores.

So no real step forward for cli web browsing, imo.

hambes | 2 hours ago

I definitely like the LLM in the middle, it’s a nice way to circumvent the SEO machine and how Google has optimized writing in recent years. Removing all the cruft from a recipe is a brilliant case for an LLM. And I suspect more of this is coming: LLMs to filter. I mean, it would be nice to just read the recipe from HTML, but SEO has turned everything into an arms race.

mromanuk | 20 hours ago

Insanely resource expensive, but still a very interesting "why not?" idea. I think a fitting use case would be adapting newer websites for them to work on older hardware. That is, assuming the new technologies used are not vital to the functionality of the website (ex. Spotify, YouTube, WhatsApp) and can be adapted to older technologies (ex. Google Search, from all the styles that it has, to a simple input and a button).

In theory this could be used for ad blocking, though it would be more expensive and less efficient; the idea is there.

So, it is a very curious idea, but we still have to find an appropriate use case.

Jotalea | 5 hours ago

I wonder if you could use a less sophisticated model (maybe even something based on LSTMs) to walk over the DOM and extract just the chunks that should be emitted and collected into the browsable data structure, but doing it all locally. I feel like it'd be straightforward to generate training data for this, using an LLM-based toolchain like what the author wrote to be used directly.

treyd | 21 hours ago

I'm curious whether anyone has run into hallucinations with this kind of use of an LLM.

They are pretty great at converting data between formats, but I always worry there's a small chance it changes the actual data in the output in some small but misleading way.
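
One cheap guard against that, sketched under assumptions: compare the numbers in the rewritten text against the numbers in the source and flag any that the source never contained. The function names and the regular expression are made up for this example and are not part of Spegel.

```python
import re

# Integers, decimals, simple fractions like 1/2, and a few vulgar-fraction glyphs.
NUMBER = re.compile(r"\d+(?:[.,/]\d+)?|[¼½¾⅓⅔]")


def numbers_in(text: str) -> set[str]:
    """All numeric tokens found in the text."""
    return set(NUMBER.findall(text))


def unexpected_numbers(source: str, rewritten: str) -> set[str]:
    """Numbers present in the rewrite that never appear in the source."""
    return numbers_in(rewritten) - numbers_in(source)


source = "2 lb lamb, 6 cloves garlic, 1 1/2 cups red wine"
rewritten = "2 kg lamb, 4 cloves garlic, 1 cup red wine"
print(unexpected_numbers(source, rewritten))  # → {'4'}
```

This only catches invented numbers. It does not notice a unit swap (lb to kg above), a dropped ingredient, or a number that was moved to the wrong item, so it is a smoke test rather than a guarantee.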

robbles | 7 hours ago

Suggestion: add a -p option:

    spegel -p "extract only the product reviews" > REVIEWS.md

clbrmbr | 21 hours ago

I need this, but for the new forum formats such as Discourse or Discuss or whatever it's called. An eyesore and a brainsore.

barrenko | 5 hours ago

It would be cool if it were smart enough to figure out whether it was necessary to rewrite the page on every visit. There's a large chunk of the web where one of us could visit once, rewrite to markdown, and then serve the cleaned up version to each other without requiring a distinct rebuild on each visit.
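
A rough sketch of how that could look, assuming a content-addressed cache: the key is a hash of the URL, the prompt, and the fetched HTML, so an unchanged page is rewritten only once. `RewriteCache` and the `rewrite` callback are hypothetical names; the callback stands in for whatever LLM call does the actual rewriting.

```python
import hashlib
from pathlib import Path
from typing import Callable


class RewriteCache:
    """Hypothetical on-disk cache of rewritten pages, keyed by content hash."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def _path(self, url: str, html: str, prompt: str) -> Path:
        # Same URL + prompt + bytes always maps to the same file name.
        key = "\x00".join((url, prompt, html)).encode("utf-8")
        return self.directory / f"{hashlib.sha256(key).hexdigest()}.md"

    def get_or_rewrite(
        self,
        url: str,
        html: str,
        prompt: str,
        rewrite: Callable[[str, str], str],
    ) -> str:
        path = self._path(url, html, prompt)
        if path.exists():
            return path.read_text(encoding="utf-8")
        markdown = rewrite(html, prompt)  # the expensive LLM call
        path.write_text(markdown, encoding="utf-8")
        return markdown
```

Because the key depends only on the inputs, the same files could in principle be shared between users, though pages that embed timestamps or per-user content will hash differently on every fetch and never hit the cache.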

__MatrixMan__ | 20 hours ago

People here are not realizing that html is just the start. If you can render a webpage into a view, you can render any input the model accepts. PDF to this view. Zip file of images to this view. Giant json file into this view. Whatever. The view is the product here, not the html input.

kelsey98765431 | 20 hours ago

Why not use pandoc to convert html to markdown and have the LLM condense from there?

hyperific | 20 hours ago

This is a terrific idea and could also have a lot of value with regards to accessibility.

ohadron | a day ago

Very cool! My retired AI agent transformed live webpage content, here's an old video clip of transforming HN to My Little Pony (with some annoying sounds): https://www.youtube.com/watch?v=1_j6cYeByOU. Skip to ~37 seconds for the outcome. I made an open-source standalone Chrome extension as well, it should probably still work for anyone curious: https://github.com/joshgriffith/ChromeGPT

cheevly | 21 hours ago

Changes Spegel made to the linked recipe's ingredients:

Pounds of lamb become kilograms (more than doubling the quantity of meat), a medium onion turns large, one celery stalk becomes two, six cloves of garlic turn into four, tomato paste vanishes, we lose nearly half a cup of wine, beef stock gets an extra ¾ cup, rosemary is replaced with oregano.

mossTechnician | 19 hours ago

Just a typo note: the flow diagram in the article says "Gemini 2.5 Pro Lite", but there is no such thing.

coder543 | 20 hours ago

Super neat - I did something similar on a lark to enable useful "web browsing" over 1200 baud packet - I have Starlink back at my camp but might be a few miles away, so as long as I can get line of sight I can Google up stuff, albeit slow. Worked well but I never really productionalized it beyond some weekend tinkering.

adrianpike | 21 hours ago

The main problem with these approaches is that most sites now are useless without JS or access to the accessibility tree. Projects like browser-use or other DOM-based approaches at least see the DOM (and screenshots).

I wonder if you could turn this into a Chrome extension that at least filters and parses the DOM.

deepdarkforest | 21 hours ago

A step towards the future of ad-blocking maybe? Just rewrite every page?

Buttons840 | 16 hours ago

Could work great with emacs' eww!

pepperonipboy | 21 hours ago

Does it fail cloudflare captcha?

neocodesoftware | 20 hours ago

Any chance it would work for pages like Facebook or LinkedIn? I would love to have a distraction-free way of searching information there.

Obviously, against wishes of these social networks, which want us to be addicted... I mean, engaged.

stared | 21 hours ago

great POC

looks very similar to a chrome extension i use for a similar goal: reader view - https://chromewebstore.google.com/detail/ecabifbgmdmgdllomnf...

eevmanu | 13 hours ago

I have been thinking of a project extremely similar to this for a totally different purpose. It’s lovely to see something like this. Thank you for sharing it, inspiring

cyrillite | 20 hours ago

We’re back to the BBS days, 30 years later!

herval | 8 hours ago

Don't you need javascript to make most webpages useful?

anonu | 21 hours ago

You should call this software a lens and filter instead of a mirror. It takes the essential information and transforms it into another medium.

nashashmi | 20 hours ago

Loving the text only browsing. Is this as fast as in the preview?

tartoran | 14 hours ago

Welcome to 2025 where it's more reasonable to filter all content through an LLM than to expect web developers to make use of the semantic web that's existed for more than a decade. . .

Seriously though, looks like a novel fix for the problem that most terminal browsers face. Namely that terminals are text based, but the web, whilst it contains text, is often subdivided up in a way that only really makes sense graphically.

I wonder if a similar type of thing might work for screen readers or other accessibility features

benrutter | 17 hours ago

Very cool. I’ve been interested in browsing the web directly from my terminal; this feels accessible.

web3aj | 20 hours ago

Cool! It would be even better if it was able to create simple web pages for vintage browsers.

eniac111 | 21 hours ago

This is a neat idea!

I wonder if it could be adapted to render as gopher pages.

cout | 17 hours ago

Not to be confused with Kubernetes' Spegel: https://spegel.dev/ https://github.com/spegel-org/spegel

remram | 9 hours ago

Congrats! Now you need an entire datacenter to visualize a web page.

fzaninotto | 21 hours ago

You could also use headless Selenium under the hood and pipe the entire DOM of the document to the model after the JavaScript has loaded. Of course it would make it much slower, but it would also address the main worry people have, which is that many websites will flat out not show anything in the initial GET request.
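
A minimal sketch of that idea, assuming Selenium 4 and a local Chrome install; the function names are invented for the example and nothing here is taken from Spegel itself. The driver is passed in as a factory so the fetch logic can be exercised without a real browser.

```python
from typing import Callable


def make_headless_chrome():
    # Imported lazily so the rest of the sketch does not require Selenium.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless=new")
    return webdriver.Chrome(options=options)


def rendered_html(url: str, driver_factory: Callable = make_headless_chrome) -> str:
    """Load the page in a browser and return the DOM serialized after scripts ran."""
    driver = driver_factory()
    try:
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()  # always release the browser process
```

`driver.get` returns once the page's load event fires, so content that arrives later over XHR may still be missing; an explicit wait on a known element would be needed for those sites.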

098799 | 20 hours ago

Does anyone know why LLMs love emojis so much?

WD-42 | 20 hours ago

Can it strip ads?

amelius | 20 hours ago

A cool hack, but also impressive to come up with a CLI "browser" that's even more expensive to run than Chromium.

crest | 10 hours ago

Have you considered making an MCP for this? Would be great for use in vibe-coding

nicklo | 21 hours ago

I would like to see a version of this where an LLM just takes the highlights of various social media content from your feed and just gives you the stuff worth watching. This also means excluding crap you had no interest in and was simply inserted into your feed. Fight algorithms with algorithms. Eliminate doom scrolling.

deadbabe | 12 hours ago

I did something similar, but with a chrome extension. Basically, for every web page, I feed the HTML to a local LLM (well, on a server in my basement). I ask it to consider if the content is likely clickbait or can be summarized without losing too many interesting details, and if so, it adds a little floating icon to the top of the page that I can click on to see the summary instead.

My next plan is to rewrite hyperlinks to provide a summary of the page on hover, or possibly to rewrite the hyperlinks to be more indicative of the content at the end of them (no more complaining about the titles of HN posts...). But my machine isn't too beefy and I'm not sure how well that will work, or how to prioritize links on the page.

IncreasePosts | 17 hours ago

Now that's a user agent!

Klaster_1 | 21 hours ago

Interesting, but why round-trip through an LLM just to convert HTML to Markdown?

insane_dreamer | 20 hours ago

Use uv instead of pip

revskill | 14 hours ago

Reminds me of https://www.brow.sh/ which is not AI related at all but just a very powerful terminal browser which in fact supports JS, even videos.

ktpsns | 21 hours ago

I think the project itself is really cool, but I really don't like the trend of having LLMs regurgitate content back to us. That said, this kinda makes me think of Browsh, which took the opposite approach and tries to render the HTML in the terminal (without LLMs as far as I know).

https://github.com/browsh-org/browsh https://www.youtube.com/watch?v=HZq86XfBoRo

nartho | 20 hours ago

I built something that did this a bit ago

https://github.com/sstraust/simpleweb

sammy0910 | 21 hours ago

this is another layer of abstraction on top of an already broken system. you're running html through an llm to get markdown that gets rendered in a terminal browser. that's like... three format conversions just to read text. the original web had simple html that was readable in any terminal browser already. now pages aren't designed as documents anymore but rather as applications that happen to deliver some content as a side effect

b0a04gl | 20 hours ago

[dead]

ghaering | 21 hours ago

[dead]

jannniii | 18 hours ago

Why not just use ncurses?

willm | 19 hours ago

Gopher is back!

jannniii | 18 hours ago

Gosh. Lovely project and cool, and - likewise - a bit scary: This is where the "bubble" seals itself "from the inside" and custom (or cloud, biased) LLMs sear the "bubble" in.-

The ultimate rose (or red, or blue or black ...) coloured glasses.-

Bluestein | 19 hours ago