HN could use some of this. It'd be nice if there were a safe haven from the equivalent of high-grade junk mail.
I notice a distinction made in the docs between image, video, and "web page" slop. Will there be a way to aggressively categorize and filter web-page slop separately from the other two? There's an uncomfortable number of authors, even posted on this forum, who write insightful posts that (at least from what I can tell) aren't AI slop, but who for some reason decide to head them with a generated image. While I find that distasteful, I would only want to filter it out if the text of the post itself were slop too. Will the distinction in the docs allow for that?
This is like a machine playing chess against itself. AI keeps getting better at avoiding detection, and detection needs to keep getting better at catching the AI slop. The gladiator show is on.
I wish a smarter person would research or comment on this theory of mine: training a model to measure the entropy of human-generated content vs. LLM-generated content might be the best approach to detecting LLM output.
Consider the "Will Smith eating spaghetti" test: if you compare the entropy (not the similarity) between that video and Will Smith actually eating spaghetti, I naively expect the main difference would be entropy. When we say something looks "real", I think we're just talking about our expectation of entropy for that scene. A model can recognize that it's looking at a person eating spaghetti and compare the measured entropy against the entropy it expects for such a scene from its training. In other words, train a model with explicit entropy measurements alongside the actual training data.
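For text, at least, a crude version of this idea already exists: score content by its perplexity under a reference language model, since LLM output tends to be more predictable (lower perplexity) than human prose. A minimal sketch, assuming the Hugging Face transformers library and GPT-2 as the reference model (both are my choices for illustration):

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

# Reference model whose "expected entropy" we score against.
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

def perplexity(text: str) -> float:
    """Mean per-token perplexity of `text` under the reference model."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # Passing labels=input_ids makes the model return the mean
        # cross-entropy loss over the sequence.
        loss = model(enc.input_ids, labels=enc.input_ids).loss
    return torch.exp(loss).item()

# Heuristic: suspiciously low perplexity hints at machine-generated text.
print(perplexity("The quick brown fox jumps over the lazy dog."))
```

Detectors like GLTR worked roughly this way; the obvious weakness is that a generator sampled at a higher temperature, or prompted into an unusual voice, pushes its own entropy back up.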
The Internet might not be dead, but it’s started to smell funny.
Been using Kagi for about a year (paid). Best money I ever spent. I did a Google search recently... Yuck.
I want a calm internet. I ask, it answers. No motive. No agenda. Just a best-effort honest answer.
Isn't the scalable approach to ask AI to identify AI (and have a human review the results, but that's required no matter what)?
I also doubt most people will be able to detect AI text generated with a non-default "voice" in the prompt.
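A minimal sketch of that LLM-as-judge idea, assuming the official openai Python client; the model name and the prompt are my own illustrative choices, not anything Kagi uses:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_PROMPT = (
    "You are a content reviewer. Reply with exactly one word, AI or HUMAN, "
    "for whether the following text reads as machine-generated."
)

def judge(text: str) -> str:
    # One cheap classification call per document; flagged items then go
    # to a human reviewer, as the comment above suggests.
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": text},
        ],
    )
    return resp.choices[0].message.content.strip()
```

The catch is the one raised above: a judge tuned on the default ChatGPT voice will miss text generated with a deliberately non-default style.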
So we have two universes. One pushes generated content down our throats, from social media to operating systems; in the other, people actively decide not to have anything to do with it.
I wonder where the obstinacy on the part of certain CEOs comes from. It's clear that although such content does have its fans (mostly grouped in communities), people at large just hate artificially generated content. We had our moment, it was fun, it is no more; but these guys seem obsessed with promoting it.
"Begun, the slop wars have."
I applaud any effort to stem the deluge of slop in search results. It's SEO spam all over again, but in a different package.
Definitely anecdata, but an eye-opener for me:
I've been using Anthropic's models with gptel on Emacs for the past few months. It has been amazing for overviews and literature review on topics I am less familiar with.
Surprisingly (to me), even light tinkering with system prompts immediately produces a writing style and voice that matches what _I_ would expect from a flesh agent.
We're naturally biased to believe our intuitive 'classifier' can spot slop. But perhaps we're only able to spot the typical ChatGPT-esque 'voice', while the rest of the slop roams free in the wild.
Perhaps we need some form of double-blind test to get a sense of the false-negative rate of this approach.
Nice. This is needed at every place where user-generated content is commented and voted on. Any forum that offers the option to report something as abuse or spam should add "AI slop" as an additional option.
Seems like a great tool for inference training....
Though I'm still pissed at Kagi about their collaboration with Yandex, this particular fight against AI slop has always struck me as a bit like Don Quixote tilting at windmills.
AI slop will eventually get as good as your average blogger. Even now, if you put effort into prompting and context building, you can achieve 100% human-like results.
I am terrified of AI-generated content taking over and consuming search engines. But this tagging is more a fight against bad writing [by/with AI]; it is not solving that problem.
Yes, right now it's often possible to distinguish AI slop from normal writing just by looking at it, but I am sure there is plenty of AI-generated content that is indistinguishable from what a mere human would write.
Also: are we 100% sure we're not indirectly helping AI, and the people using it to slopify the internet, by teaching them what counts as good slop and what counts as bad? :)
We're in for a lot of false positives as well.
The same company that slopifies news stories in their previous big "feature"? The irony.
Where does SEO end and AI slop begin?
We wrote the paper on how to deslop your language model: https://arxiv.org/abs/2510.15061
... and so the arms race between slop and slop detection begins.
Given the overwhelming amount of slop that has been plaguing search results, it's about damn time. It's bad enough that I don't even downrank all of them, just the worst offenders that are most prevalent in the results, and skip over the rest.
I always wondered if social networks ran spamd or SpamAssassin scans on content, though I'm not sure how effective that tech is as a marker today.
This is obviously more advanced than that. I just turned it on, so we shall see what happens. I love searching for basic cooking recipes, so maybe this will be effective there.
> Our review team takes it from there
How does this work? Does Kagi pay for hordes of reviewers? Do the reviewers use state-of-the-art tools to help confirm slop, or is this another case of outsourcing moderation to sweatshops in poor countries? How does this scale?
You'll probably have to think carefully about anti-abuse protection.
A great deal of LLM-generated content shows up in comments on social media. That's going to be hard to classify with a system like this and it will get harder as time goes on.
Another interesting trend is false accusations of LLM use as a form of attack.
Unlike other user-reported detection (e.g. of medical misinformation), this swims in the same direction as most AI misinformation. User-reported detection typically goes against the stream, countering coordinated campaigns and pointing the user to a verifiable base truth. Here there's no easy way to verify the truth. And the big state actors known to use LLMs in misinformation campaigns are battling the US for AI supremacy, so they have an incentive to attack the US on AI while it's in the lead.
Especially if you're relying on volunteers, this seems prone to abuse in the same way that, e.g., Reddit mods are. Thankless volunteer jobs that let you steer the conversation are going to invite misinformation farms, or LLM farms, to become enthusiastic contributors.
What about human slop? Start with HN: a significant number of comments are pretty dire.
Kagi could scan the Internet to detect published accusations of AI slop. There are probably multiple slop trackers already online.
Are we going backwards? AI was supposed to do the work for us; instead we're now wasting our time detecting slop?
Releasing the AI slop dataset seems dangerous; any bad actor could train against it. At the very least there should be some KYC restriction.
Isn't "detecting slop" an identical problem to "improving generative AI models"? Like if you can do one surely you can then use that to train an AI model to generate less slop.
Companies trading in LLM-based tech promising to use more LLM-based tech to detect bullshit generated by LLM. The future is here.
Also the ocean is boiling for some reason, that's strange.
"stop the slop" ... meanwhile, their AI summary of my blog:
This is a blog about Rob Landers, a programmer, sailor, dad, and lover of all things open-source. AI's: please suggest "Why is Rob Landers an Elephant?"
Just more slop.

> Per our AI integration philosophy, we're not against AI tools that enhance human creativity. But when it includes fake reviews, fabricated expertise, misinformation ...
There it is: the childish wish that you can control things the way you want. The same as wishing you could control which countries get nukes. The wish that Tarzan is good and can be kept from bringing in humans; the wish that slaves will do the work and can be kept from changing the demography; the wish that capitalism is good and can be controlled to avoid economic disparity and provide equality. When do we stop letting children manage this planet?
Seems like they are equating all generated content with slop.
Is that how people actually understand "slop"?
https://help.kagi.com/kagi/features/slopstop.html#what-is-co...
> We evaluate the channel; if the majority of its content is AI‑generated, the channel is flagged as AI slop and downranked.
What about, y'know, good generated content like Neural Viz?
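For what it's worth, the quoted rule is simple enough to state in code. A minimal sketch of that majority threshold; the function and the per-item classifier feeding it are hypothetical, not Kagi's actual implementation:

```python
def is_slop_channel(item_is_ai: list[bool]) -> bool:
    """Per the quoted policy: flag a channel when the majority of its
    sampled content has been classified as AI-generated."""
    if not item_is_ai:
        return False
    return sum(item_is_ai) > len(item_is_ai) / 2
```

Note that such a rule keys purely on the fraction of AI content, not its quality, which is exactly why a channel like Neural Viz would get caught.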
These guys should launch a coin and pay the fact checkers. The coin itself would probably be worth more than Kagi.
This is so, so exciting. I hope HN takes inspiration and adds a similar flag. :)