

In reply to Dennis Schubert

I wish there was some way to automagically harm them somehow.

Like detect them and send them to sniff something that wrecks their training data, like a list of randomly generated nonsense words or something.

In reply to Dennis Schubert

Would it be possible to redirect that sort of traffic to a decoy site/file?
In reply to Dennis Schubert

Yes. I plan to redirect them to randomly generated text based on some LLM-generated text snippets that contain absolute nonsense (but that also isn't static, so it looks slightly different each time the page is loaded).
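
A minimal sketch of what such a non-static decoy could look like, assuming a small pool of snippets prepared ahead of time so that nothing has to be generated per request. The `SNIPPETS` list and the `decoy_page` function are hypothetical names used for illustration, not the actual setup described here.

```python
import random

# Hypothetical pool of nonsense snippets, prepared once in advance.
# These are placeholders, not real data from any deployment.
SNIPPETS = [
    "The quarterly moon invoices its own tide.",
    "Seven umbrellas voted against the staircase.",
    "A polite kernel apologised to the marmalade.",
    "Gravity is mostly a rumour spread by ladders.",
    "Every second Tuesday compiles into a small goat.",
]

def decoy_page(snippets, rng=None, count=3):
    """Return an HTML page built from `count` distinct, randomly ordered snippets."""
    rng = rng or random.Random()
    # sample() picks without replacement, so no snippet repeats within one page
    chosen = rng.sample(snippets, count)
    return "<html><body><p>" + " ".join(chosen) + "</p></body></html>"
```

Each call without a fixed `rng` picks and orders the snippets anew, so repeated loads usually differ. Passing a seeded `random.Random` makes the output reproducible for testing.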

Sadly, I need to finish some ongoing infrastructure restructuring before I can deploy that across everything I host.

In reply to Dennis Schubert

Here are some ideas from the responses I got on Lemmy: awful.systems/post/3153500
Apologies for cross-posting, but I thought you might not be seeing them over there.
In reply to Dennis Schubert

Apologies for an angry cross-post mastodon.social/@khobochka/113…

There are some suggestions there, ranging from easier to more complex than redirecting to generated content, e.g. redirecting to Hetzner's speedtest file, setting up tarpits for bots and a few block lists.
There's also a list of AI fuckery here, a few hops from which people confirm that 20-75% of traffic (depending on the amount of content on the site) is LLM crawlers, and that they outrank the usual WordPress attacks during on-calls.

None of this, however, in any way compensates for the sad reality that this would eat away at your time and compute, simply because the LLM training infra exists. This makes me absolutely livid.

In reply to Dennis Schubert

Sorry to read this. Hope there's a way to mitigate this.

I have to be cynical, but I find it pretty rich that these companies, as well as the techbro scene in general, will espouse meritocracy on every end, yet they will happily syphon the work and hobby time of open source and Fediverse enthusiasts without explicitly giving back. But maybe I'm also naive and don't realise that they support Fediverse infrastructure/coding. Correct me if I'm wrong.

In reply to Dennis Schubert

They do not, in fact, give anything back to the people they are stealing labour from.
In reply to Dennis Schubert

for some reason, this post went semi-viral on mastodon and hackernews, and I now have a fun combination of

  1. people who question my experience because the current robots.txt of the wiki or some arbitrary archived version did not contain any relevant blocks (despite me not specifying when or how I blocked them),
  2. people who have had the exact same experience before,
  3. people offering "suggestions", despite me not asking for any - and a lot of those "suggestions" either don't make any sense, or involve hosting an LLM myself to generate nonsense for LLMs to consume, while also wasting another fuckton of resources.

I love the internet.

In reply to Dennis Schubert

@Dennis Schubert

(despite me not specifying when or how I blocked them)


Um, what did you do? I would love my bandwidth back, lol

In reply to Andreas G

offtopic

reply guys


@Andreas G

Why shouldn't people consider, discuss and react to important, current developments in their environment?
Especially with something as crucial as LLMs right now and their effect on a community like ours?
Dennis demonstrated and published a very important matter. It was not unexpected, quite the contrary, but he investigated and published it first, so the reaction is normal and healthy.
What we are witnessing is just the hive mind in action.

What's your point in insulting people who try to check and "wrap their minds around something"?

In reply to Dennis Schubert

People giving unsolicited advice are not being helpful, just annoying.

They are a scourge.

In reply to Andreas G

@Andreas G > viral offtopic

It's the brainstorming mode of an interconnected, internet-social species called mono sapiens.
Take it or leave it ..
A chimp watching over his glasses into the camera, reading in a book called "human behavior"

youtube.com/watch?v=JVk26rurvL…

btw
At the end of this 14-year-old take, Kruse refers to semantic understanding. I guess that's exactly the LLM and big brother situation we are in right now. And that's why people in our free web are going crazy, leading to the viral reaction Dennis described.
btw btw
Looks like Dennis went viral in the ActivityPub space thx to friendica ..
😀

In reply to Dennis Schubert

Looks like Dennis went viral in the ActivityPub space thx to friendica …


no. it was primarily someone taking a screenshot and posting it. someone who took a screenshot of.. diaspora. while being logged into their geraspora account.

but of course it's a friendica user who also sees nothing wrong about posting unsolicited advice who is making wrong claims.

In reply to Dennis Schubert

@Dennis Schubert /offtopic viral

@denschub
I stumbled over it on a post from a friendica account on a mastodon account of mine.
👍

posting unsolicited advice


Are you referring to something I wrote in this post of yours?
If so, and you point me to it, I could learn what you consider unsolicited advice and try to avoid doing that in the future.

In reply to Dennis Schubert

@utopiArte Your grasp of human psychology, internet culture, and science in general, is weak.

Consider staying off the internet.

(How'd you like that unsolicited advice?)

In reply to Dennis Schubert

I am sated, honest.

But some people exist in a mode of constant omnidirectional condescension, like little almighties, looking down in all directions.

Mostly lost causes. Deflating their egos sometimes helps, but usually just makes them worse.

In reply to Dennis Schubert

It was fun, but it's time to stop. This post is about LLM-bots being assholes, but that doesn't mean we have to go down to the same levels.
In reply to Dennis Schubert

This is the reason why our FOSS project restricted viewing the diffs to logged-in accounts. For us, some Chinese bots have been the main problem, not Google or Bing.
In reply to Dennis Schubert

"don't host web properties for foss projects" seems to be good advice.
In reply to Dennis Schubert

Is it unique to wikis for foss projects?
The silly way they crawl it makes me think this is a general thing happening to every service on the web.
Is there a way to find out/compare whether the crawlers are trying to target specific kinds of things?
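
One rough way to compare what different crawlers go after is to tally requests per user agent and top-level site section from the web server's access log. The sketch below assumes the common "combined" log format. The `crawl_targets` function and the sample lines are made up for illustration and do not come from any server discussed in this thread.

```python
import re
from collections import Counter

# Matches the request, status, size, referrer and user agent fields
# of a "combined" format access log line (assumed format).
LOG_LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def crawl_targets(lines):
    """Count requests per (user agent, first path segment)."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # skip lines that do not look like combined-format entries
        path = match.group("path").split("?", 1)[0]  # drop the query string
        section = path.strip("/").split("/", 1)[0] or "/"
        counts[(match.group("agent"), section)] += 1
    return counts

# Made-up sample lines, using documentation-only IP ranges.
SAMPLE = [
    '203.0.113.5 - - [01/Jan/2025:00:00:01 +0000] "GET /wiki/Foo?action=history HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
    '203.0.113.5 - - [01/Jan/2025:00:00:02 +0000] "GET /wiki/Bar HTTP/1.1" 200 256 "-" "ExampleBot/1.0"',
    '198.51.100.7 - - [01/Jan/2025:00:00:03 +0000] "GET / HTTP/1.1" 200 128 "-" "Mozilla/5.0"',
]
```

Calling `crawl_targets(SAMPLE).most_common()` lists the busiest (agent, section) pairs first. User agent strings can be spoofed, so this only shows what crawlers claim to be.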
In reply to Dennis Schubert

I have deleted one comment in this post because I will not be offering a platform to distribute legal hot takes. If you want legal advice, talk to a lawyer, don't just Google things.

utopiArte doesn't like this.

In reply to Dennis Schubert

Roger on the advice: "for people who post stuff to the internet and who are concerned that their content will be used to train LLMs, I only have one suggestion: use platforms that allow you to distribute content non-publicly, and carefully pick who you share content with." And thanks, @Dennis Schubert
In reply to Dennis Schubert

Maybe worth a try: Nepenthes

404media.co/email/7a39d947-4a4…