this post was submitted on 30 Dec 2024
Piracy: ꜱᴀɪʟ ᴛʜᴇ ʜɪɢʜ ꜱᴇᴀꜱ
Looks like newspaper4k uses headless Chrome. You could try loading the Bypass Paywalls Clean extension and browsing the pages directly.
I regularly use it (in Firefox) without even thinking about it. I only notice when I send someone an article they can't access.
It doesn't use headless Chrome; it just uses the Python requests library. Did you get got by an AI hallucination?
Source: I went digging in the source code.
No, just this example code from their site:
My mistake was not knowing where newspaper4k fits in the stack. They're pairing it with Playwright, which does the actual page loading, and it seems you could do the same here.
Ahh, I see. I'm using newspaper4k to fetch articles directly. It seems the example you found uses it purely as a parser, with Playwright as the HTML fetcher. I might try that approach.
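For anyone following along, the split described above (Playwright fetches the HTML, newspaper4k only parses it) might look roughly like the sketch below. This is not the example from the newspaper4k site. The function names `fetch_html`, `parse_article`, and `fetch_and_parse` are made up for illustration, and it assumes both `playwright` and `newspaper4k` are installed and that `Article.download(input_html=...)` is used to skip newspaper4k's own HTTP request.

```python
def fetch_html(url: str) -> str:
    """Load the page in headless Chromium and return the rendered HTML."""
    # Imported lazily so the parsing half stays usable without Playwright.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url, wait_until="domcontentloaded")
        html = page.content()
        browser.close()
    return html


def parse_article(url: str, html: str):
    """Give newspaper4k HTML we already have, so it only acts as a parser."""
    from newspaper import Article

    article = Article(url)
    article.download(input_html=html)  # uses our HTML instead of fetching
    article.parse()
    return article


def fetch_and_parse(url: str, fetch=fetch_html, parse=parse_article):
    """Fetch with one tool, parse with the other; both steps are swappable."""
    return parse(url, fetch(url))
```

Calling `fetch_and_parse(url)` should return an article object whose `.title` and `.text` can be read as usual. The sketch does not load any browser extension, so it only helps where the full article is already present in the rendered page.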