Technology
This is the official technology community of Lemmy.ml for all news related to the creation and use of technology, and for facilitating civil, meaningful discussion around it.
Ask in a DM before posting product reviews or ads; otherwise, such posts are subject to removal.
Rules:
1: All Lemmy rules apply
2: Do not make low-effort posts
3: NEVER post naziped*gore stuff
4: Always post article URLs or their archived versions as sources, NOT screenshots. Help blind users.
5: Personal rants about Big Tech CEOs like Elon Musk are unwelcome (this does not include posts about their companies affecting a wide range of people)
6: No advertisement posts unless verified as legitimate and non-exploitative/non-consumerist
7: Crypto-related posts, unless essential, are disallowed
We need decentralized, federated search. I remember YaCy was attempting this years ago. Anybody know if anyone is actively working on this?
@makeasnek @schizoidman YaCy is still around.
And https://searx.space/ is an open-source metasearch engine with many instances. (Try https://searx.be/ if you want to test it out.)
SearX/SearXNG lets you aggregate results from a number of different search engines. You choose which ones, and your choices are stored in your browser, with no account needed.
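To make that concrete, here is a minimal sketch of querying a SearXNG instance programmatically and restricting it to chosen upstream engines. It assumes the instance has its JSON output format enabled in its settings, which many public instances do not; searx.be is used purely as an example.

```python
# Minimal sketch: ask a SearXNG instance to aggregate results from chosen engines.
# Assumes the instance enables JSON output; many public instances serve HTML only.
import requests

def searxng_search(query, engines=("wikipedia", "duckduckgo"), instance="https://searx.be"):
    """Return aggregated results for `query` from the selected upstream engines."""
    resp = requests.get(
        f"{instance}/search",
        params={
            "q": query,
            "format": "json",              # only works if the instance enables JSON output
            "engines": ",".join(engines),  # per-request engine selection
        },
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json().get("results", [])

if __name__ == "__main__":
    for r in searxng_search("federated search")[:5]:
        print(r.get("engine"), "-", r.get("title"), "-", r.get("url"))
```

In the web UI the same engine selection is made on the preferences page and kept client-side, which is what makes the no-account setup described above work.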
Interesting, thanks, I'll take a look!
@makeasnek On a broader note, I think the best approach for decentralised, open-source web search might be an evolution of the SearXNG model.
At the top of the funnel, you have metasearch engines that query and aggregate results from a number of smaller niche search engines.
The metasearch engines are open source; anyone with a spare server or a web hosting account can spin one up.
For some larger sites that are trustworthy, such as Wikipedia, the site's own search engine might be what's queried.
For the Fediverse and other similar federated networks, the query is fed through a trusted node on the network.
And then there's a host of smaller niche search engines, which only crawl and index pages on a small number of websites vetted and curated by a human.
(Perhaps on a particular topic? Or a local library or university might curate a list of notable local websites?)
(Alternatively, it might be that a crawler for a web index like Curlie.org only crawls websites chosen by its topic moderators.)
In this manner, you could build a decent web search engine without needing the scale of Google or Microsoft.
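A rough sketch of that funnel, just to make the shape concrete: a metasearch front end fans a query out in parallel to a large site's own search API (Wikipedia here, as suggested above) and to a curated niche index, then merges whatever comes back. The "local-library" endpoint and its response shape are hypothetical placeholders, not a real service.

```python
# Toy sketch of the funnel described above: a metasearch front end fans a query
# out to curated back ends and merges the results. The "local-library" endpoint
# and its response shape are hypothetical placeholders, not a real service.
import concurrent.futures
import requests

def query_wikipedia(query):
    """For large trustworthy sites, hit the site's own search API (Wikipedia here)."""
    resp = requests.get(
        "https://en.wikipedia.org/w/api.php",
        params={"action": "opensearch", "search": query, "limit": 5, "format": "json"},
        timeout=10,
    )
    titles, _descriptions, urls = resp.json()[1:4]
    return [{"source": "wikipedia", "title": t, "url": u} for t, u in zip(titles, urls)]

def query_niche_engine(name, endpoint, query):
    """Placeholder for a small, human-curated niche index exposing a JSON search API."""
    resp = requests.get(endpoint, params={"q": query}, timeout=10)
    return [{"source": name, **hit} for hit in resp.json().get("results", [])]

def metasearch(query):
    """Fan the query out to every back end in parallel and merge whatever comes back."""
    backends = [
        lambda: query_wikipedia(query),
        # Hypothetical curated index, e.g. one run by a local library or university.
        lambda: query_niche_engine("local-library", "https://search.example.org/api", query),
    ]
    merged = []
    with concurrent.futures.ThreadPoolExecutor() as pool:
        futures = [pool.submit(fn) for fn in backends]
        for fut in concurrent.futures.as_completed(futures):
            try:
                merged.extend(fut.result())
            except (requests.RequestException, ValueError):
                pass  # one back end being down shouldn't break the whole search
    return merged

if __name__ == "__main__":
    for hit in metasearch("federated search"):
        print(hit["source"], "-", hit.get("title", ""), "-", hit.get("url", ""))
```

The point of the sketch is the shape rather than the details: the front end is cheap to host, and each niche back end only needs to crawl the handful of sites its curators have vetted.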
@ajsadauskas sounds like what @pears is trying to do
@makeasnek