this post was submitted on 29 Jun 2024
142 points (98.0% liked)

[–] [email protected] 61 points 4 months ago

finally, some good fucking AI

[–] [email protected] 18 points 4 months ago* (last edited 4 months ago) (1 children)

Now I have to go try it, so that I can go through this article's paywall.

[–] [email protected] 5 points 4 months ago (1 children)
[–] [email protected] 6 points 4 months ago

I don't know if I had to do anything special for the prompt, but it just gave me a short summary starting with "Unfortunately I cannot provide the full text of the article you requested, as that would likely infringe on the copyright of the content. However, I can summarize the key points from the article."

[–] [email protected] 14 points 4 months ago (1 children)

Quora should be respecting robots.txt, but also why are the NYT etc. serving the full article to the Quora bot anyway?

[–] [email protected] 22 points 4 months ago

Usually NYT sets a cookie to track how many free articles you've read, and once you exceed that limit you get the paywall. The bots probably don't store or send the cookies, so NYT never blocks them. Also, I'd imagine the bots come from many different IPs, so even server-side blocking by IP wouldn't catch everything, and eventually the bot would get to the article. User agents can also be spoofed.
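
A rough sketch of the cookie metering described above, to show why a client that never sends cookies back would never hit the wall. This is not NYT's actual implementation; the cookie name `articles_read`, the limit of 3, and both function names are made up for illustration.

```python
# Hypothetical cookie-based meter: the server trusts the client to send back
# a counter cookie. All names and the limit are assumptions for illustration.
FREE_ARTICLE_LIMIT = 3


def serve_article(cookies: dict[str, str]) -> tuple[str, dict[str, str]]:
    """Return (what gets served, cookies the response asks the client to set)."""
    try:
        seen = int(cookies.get("articles_read", "0"))
    except ValueError:
        seen = 0  # treat a malformed cookie as a fresh visitor
    if seen >= FREE_ARTICLE_LIMIT:
        return "paywall", {"articles_read": str(seen)}
    return "full article", {"articles_read": str(seen + 1)}


def simulate(requests: int, keeps_cookies: bool) -> list[str]:
    """Make several requests, either storing cookies like a browser or not."""
    jar: dict[str, str] = {}
    served = []
    for _ in range(requests):
        body, set_cookies = serve_article(jar if keeps_cookies else {})
        if keeps_cookies:
            jar.update(set_cookies)
        served.append(body)
    return served


print(simulate(5, keeps_cookies=True))   # a browser that stores cookies
print(simulate(5, keeps_cookies=False))  # a bot that drops them
```

With these made-up numbers, the cookie-keeping client gets three full articles and then the paywall, while the client that drops cookies looks like a first-time visitor on every request.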

[–] [email protected] 7 points 4 months ago* (last edited 4 months ago) (1 children)

"We would be happy to connect with your technical team to help them make sure your paywalled content isn't served to people using Poe."

What a joke, Quora needs to reevaluate whose responsibility that is.

Basic reasoning time: was it an accident?

  • If not, then it was at least immoral.
  • If so, then it was incompetence.

What a surprise, both possibilities seem to point towards the project being a pile of crap.

[–] [email protected] 6 points 4 months ago* (last edited 4 months ago) (1 children)
  1. Putting paywalled content onto the internet, where you let bots look at it but try to prevent humans from being able to see it, is plain evil.
  2. NYT lied to get us into Iraq, and countless other times. They are pure evil.
  3. NYT did not contribute to the building of the internet in any way, but they see it as their god-given right to take the hard work of the nerds they hate and use it to make millions of dollars for themselves, while giving nothing back.
  4. Nobody here seems to understand the difference between a USER AGENT and a BOT. If I ask my web browser to fetch me a web page, that browser is my user agent. Of course it does not respect the robots policy. Same thing if I ask an LLM to fetch a page for me: that LLM is my user agent, not a bot in this case. NYT is mad because they let all bot-like user agents in; they want to be indexed, after all. Here again we see that NYT wants the benefit of internet resources like being in the search index, but they want to give nothing back and make actual human people suffer by degrading their experience on the web.
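
To make point 4 concrete, here is a minimal sketch of what "respecting robots.txt" looks like in practice, using Python's standard `urllib.robotparser`. The robots.txt content, the `ExampleCrawler` name, the `may_crawl` helper, and the URL are all made up for illustration; this is not any real site's policy or any real bot's code.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler is shut out, everyone else is allowed.
ROBOTS_TXT = """\
User-agent: ExampleCrawler
Disallow: /

User-agent: *
Allow: /
"""


def may_crawl(user_agent: str, url: str) -> bool:
    """Answer the question a polite crawler asks itself before fetching a URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


url = "https://example.com/article"
print(may_crawl("ExampleCrawler/1.0", url))  # the named crawler is disallowed
print(may_crawl("Mozilla/5.0", url))         # falls through to the "*" rule
```

The check only happens if the client chooses to run it. Nothing on the server side enforces it, and a browser fetching a page for a human never asks the question at all, which is the distinction being drawn between a user agent and a bot.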
[–] [email protected] -1 points 4 months ago

I'm sure you're one of the ones that will complain when all journalism is replaced by AI, while lacking the basic understanding of why that had to happen.

[–] [email protected] 6 points 4 months ago

Firefox has multiple extensions that are specifically made to bypass paywalls. No AI necessary.

[–] [email protected] 5 points 4 months ago (1 children)

Maybe they asked Quora if it was legal.

In all seriousness, though, I don’t get that site’s popularity. I only ever visit Quora by accident (because Google ranks it highly) and it’s basically always garbage answers. And speaking as a developer, the UI/UX causes my eyes to roll back in my head and say, “REDRUM” in a demonic voice. It’s hard to even tell where the answer is because there’s so much superfluous shit on the page.

[–] [email protected] 2 points 4 months ago

Agreed on the UI/UX. Really awful and unintuitive

[–] [email protected] 4 points 4 months ago (1 children)

Does Microsoft think things behind paywalls are fair game for LLMs too? (I know this isn’t Microsoft, but I bet OpenAI got around paywalls toooo…)

[–] [email protected] 3 points 4 months ago

So long as it isn't their own, yeah, probably.