12ft works, if you really need it. But in general, I just don’t read any publications that paywall their content. Mass media is all owned by one or two billionaires; if they need money, they can get it from them.
Looks like newspaper4k uses headless Chrome. You could try loading the Bypass Paywalls Clean extension and browsing the pages directly.
I regularly use it (in Firefox) without even thinking about it. Only notice when I send someone an article they can't access.
It doesn't use headless Chrome; it just uses the Python requests library. Did you get got by an AI hallucination?
Source: I went digging in the source code.
No, just this example code from their site:
browser = p.chromium.launch(headless=True)
My mistake was not knowing where newspaper4k fits in the stack. They're wrapping it with Playwright, which it seems you could do here.
Ahh, I see. I'm using newspaper4k to fetch articles directly. It seems the example you found only uses it as a parser, after Playwright has fetched the HTML. I might try that approach.
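For anyone following along, here's a rough sketch of that split: Playwright fetches the rendered HTML, newspaper4k only parses it. I'm assuming newspaper4k's `Article.download(input_html=...)` accepts pre-fetched HTML the way I expect, and the function names (`fetch_html_with_playwright`, `parse_article`, `fetch_and_parse`) are made up for illustration. Not tested against any particular site.

```python
from typing import Callable, Dict


def fetch_html_with_playwright(url: str) -> str:
    """Load the page in headless Chromium and return the rendered HTML."""
    # Imported lazily so the rest of the sketch works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.goto(url, wait_until="domcontentloaded")
            return page.content()
        finally:
            browser.close()


def parse_article(url: str, html: str) -> Dict[str, str]:
    """Use newspaper4k purely as a parser on HTML we already have."""
    # Assumption: download(input_html=...) skips the network request.
    from newspaper import Article

    article = Article(url)
    article.download(input_html=html)
    article.parse()
    return {"title": article.title, "text": article.text}


def fetch_and_parse(
    url: str,
    fetch: Callable[[str], str] = fetch_html_with_playwright,
    parse: Callable[[str, str], Dict[str, str]] = parse_article,
) -> Dict[str, str]:
    """Fetch with one tool, parse with another; both steps are swappable."""
    return parse(url, fetch(url))
```

Keeping the fetcher and parser as separate arguments means you can swap the fetch step later without touching the parsing code.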
Why use one when you can use 6
Yeah, I've tried that. Only some of them are easy to implement, but if the one I'm currently using goes down, I guess I'll have to bodge something together.
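If it comes to that, the bodge could be as simple as trying each fetcher in order until one returns something. A minimal sketch, where `first_working` and the fetcher callables are hypothetical names and each fetcher is assumed to take a URL and return HTML (or `None`/empty on failure):

```python
from typing import Callable, Iterable, Optional


def first_working(
    url: str, fetchers: Iterable[Callable[[str], Optional[str]]]
) -> Optional[str]:
    """Return the first non-empty result from the fetchers, or None if all fail."""
    for fetch in fetchers:
        try:
            html = fetch(url)
        except Exception:
            # This service is down or blocked; move on to the next one.
            continue
        if html:
            return html
    return None
```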
Generally, 12ft.io works pretty well for me.
Most of the time archive.today gets the job done.
It also offers a URL pattern that takes you to the newest snapshot of a given URL: http://archive.is/newest/http://lemmy.dbzer0.com/c/piracy
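If you want to build those links in a script, it's just string concatenation. A small sketch that mirrors the pattern in the example link above; `newest_snapshot_url` is a made-up helper name, and I haven't checked how the service handles target URLs with query strings:

```python
# Prefix taken from the example link; the target URL is appended as-is.
ARCHIVE_NEWEST_PREFIX = "http://archive.is/newest/"


def newest_snapshot_url(page_url: str) -> str:
    """Build the 'newest snapshot' link for a page."""
    return ARCHIVE_NEWEST_PREFIX + page_url.strip()


print(newest_snapshot_url("http://lemmy.dbzer0.com/c/piracy"))
```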