Is it legal and possible to scrape the social media platforms?

iamnnk | 13 points

Most of the social media companies are scraping everything to train their LLMs. I think we’ll see some court decisions soon regarding legality.

Some of the social platforms have APIs you can pay to access. Some have aggressive anti-scraping countermeasures.

gregjor | 4 months ago

hiQ v. LinkedIn determined this. If the content is available without a login, it's fair game. If a login is required, then it's not. That's why Twitter now requires a login to view extended content.

fragmede | 4 months ago

Legal, yes, as long as you are not accessing stuff you are not supposed to.

Possible, very much so, just depends on the platform and the rate of access that they allow. Some platforms will basically rate limit hard if they detect a lot of traffic from a single IP.

With paid API access, you may get a higher rate limit and an easier time getting the data (usually without having to parse HTML).
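To make the rate-limit point concrete, here is a minimal sketch of a fetch loop that backs off when the server signals throttling. It assumes the platform answers with HTTP 429 or 503 when you go too fast, which is common but not guaranteed. The names `fetch_status` and `polite_get`, the user-agent string, and the delay values are all made up for illustration.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url, timeout=10.0):
    """Return (status_code, body_bytes) for a GET request, stdlib only."""
    # Hypothetical user-agent string; identify your own client honestly.
    req = urllib.request.Request(url, headers={"User-Agent": "example-research-bot/0.1"})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; turn that back into a status code.
        return err.code, b""


def polite_get(url, fetch=fetch_status, sleep=time.sleep, max_retries=4, base_delay=1.0):
    """Fetch url, doubling the wait each time the server answers 429 or 503."""
    delay = base_delay
    for attempt in range(max_retries + 1):
        status, body = fetch(url)
        if status not in (429, 503):
            return status, body
        if attempt < max_retries:
            sleep(delay)
            delay *= 2
    # Still throttled after all retries: hand back the last response.
    return status, body
```

`fetch` and `sleep` are parameters only so the retry logic can be exercised without touching the network. In real use you would call `polite_get("https://example.com/some/page")` and probably also add a fixed pause between successful requests.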

ActorNightly | 4 months ago

Generally speaking, if you're not logged in and nobody has told you to stop, you should be ok.

There is a service called SerpAPI that provides an API around stuff you might scrape. Haven't tried it myself, but I've heard it's good.

leros | 4 months ago

This question is way too broad. What is your purpose? What specifically are you scraping (e.g., images, text, audio, video)? Please expand.

vieques | 4 months ago

Approaching the platforms adversarially makes you an adversary. This might not be a solid foundation for a stable business.

Your lawyer is the best source for an opinion regarding legality.

Good luck.

brudgers | 4 months ago

they's already scraped bro

they's on the archive dot org

that site has everything but their search is shite

so to find scraped things that they scraped, you need to scrape their site and build a non-broken search engine for yourself

but you'll find your post-scraped social media sites

and many other interdasting things
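For the "build your own search" part, a rough sketch: the Wayback Machine exposes a CDX endpoint that lists captures for a URL pattern, so you can index those listings yourself instead of relying on the site search. The helper names below (`cdx_query_url`, `parse_cdx_rows`, `snapshot_url`, `list_captures`) are made up for illustration, and the default limit is arbitrary.

```python
import json
import urllib.parse
import urllib.request

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query_url(url_pattern, match_type="prefix", limit=10):
    """Build a CDX query that lists captures whose URL matches url_pattern."""
    params = {
        "url": url_pattern,
        "matchType": match_type,  # e.g. "exact", "prefix", "host", "domain"
        "output": "json",
        "limit": str(limit),
    }
    return CDX_ENDPOINT + "?" + urllib.parse.urlencode(params)


def parse_cdx_rows(rows):
    """Turn the JSON output (first row is the header) into a list of dicts."""
    if not rows:
        return []
    header, *records = rows
    return [dict(zip(header, rec)) for rec in records]


def snapshot_url(record):
    """Build the replay URL for one capture record."""
    return "https://web.archive.org/web/{timestamp}/{original}".format(**record)


def list_captures(url_pattern, limit=10):
    """Query the live endpoint; this makes a network call, so go easy on it."""
    with urllib.request.urlopen(cdx_query_url(url_pattern, limit=limit), timeout=30) as resp:
        return parse_cdx_rows(json.load(resp))
```

From there, `list_captures("example.com/some/path/")` gives you records you can feed into whatever local index you like, and `snapshot_url(record)` points at the archived page itself.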

fasa99 | 4 months ago