this post was submitted on 09 Oct 2024
610 points (96.6% liked)

Technology


I suspect that this is the direct result of AI generated content just overwhelming any real content.

I tried ddg, google, bing, qwant, and none of them really help me find the information I want these days.

Perplexity seems to work, but I don't like the idea of AI giving me "facts" since they are mostly based on other AI posts.

ETA: someone suggested SearXNG and after using it a bit it seems to be much better compared to ddg and the rest.

[–] [email protected] 13 points 1 week ago (4 children)

There's an extension that filters out websites from every engine. So when you see Quora or other digital garbage in your results, block it once and you'll never see another Quora article again.

Idr the name of the extension - I'll check when I get home and follow up.
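
The block-once idea can be sketched roughly like this. It is only an illustration of domain filtering, not how the extension itself works; the function names and sample URLs are made up.

```python
from urllib.parse import urlparse

def is_blocked(url, blocked_domains):
    """True if the URL's host is a blocked domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in blocked_domains)

def filter_results(urls, blocked_domains):
    """Drop every result whose host matches the blocklist."""
    return [u for u in urls if not is_blocked(u, blocked_domains)]

# Hypothetical blocklist and search results.
blocked = {"quora.com"}
results = [
    "https://www.quora.com/some-question",
    "https://example.org/article",
    "https://notquora.com/page",
]
kept = filter_results(results, blocked)
```

Matching on the full host or a dot-prefixed suffix keeps lookalike domains such as notquora.com from being blocked by accident.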

[–] [email protected] 12 points 1 week ago

DDG and qwant are basically bing

[–] [email protected] 11 points 1 week ago

I've found that using Kagi, then DDG, then Google always gets me the results I need. But 95% of the time, Kagi gets it.

[–] [email protected] 9 points 1 week ago

I've been trying to use ddg and I just find it infuriating that it never finds what I need, especially if I'm looking for local information about something. Google seems to always prioritize those types of results when I need them (probably because it makes it easier to sell me something).

[–] [email protected] 8 points 1 week ago (7 children)

Why have you not tried Kagi? If it's important to you to have good search and you don't like being spied on and having ads shoved down your throat, it's worth paying a small fee for quality instead of paying with your privacy for crap results. It's been a breath of fresh air. Searching is fun again. It also indexes Lemmy. Traditional Search has largely gone to crap, but I'm tired of everyone complaining that these mega companies offering 'free' services aren't holding their end of the deal instead of supporting the people that are doing something about it. I'm not optimistic things like qwant or searx will be sustainable or deliver high quality results, but by all means donate to them with time or money if you believe in them.

[–] [email protected] 14 points 1 week ago (7 children)
[–] [email protected] 9 points 1 week ago

Neither is beer

[–] [email protected] 5 points 1 week ago* (last edited 1 week ago) (2 children)

If it's free then you're the product

And if you're the product, then there's an interest in keeping you on the site and showing you ads, which works best if the first result isn't the correct one and you need to scroll or even go to page two.

That's literally the reason why Google got so much worse: they wanted to show more ads to users, which wouldn't work if the best result were always the first.

[–] [email protected] 11 points 1 week ago (2 children)

If you think paying means you aren't still the product, I have news for you.

I don't need my search history tied directly to a means of payment, and not because I search for illicit stuff, but because I don't need an advertising profile built on me that is absolutely tied to me now, because I paid for it.

If Kagi doesn't turn out to be selling that info in a year or two, I'll eat a bug.

[–] [email protected] 6 points 1 week ago* (last edited 1 week ago) (1 children)

Exactly. A paid search engine is a privacy nightmare, and you have zero guarantees that they don't monetize you one way or another in addition to the subscription fee.

[–] [email protected] 11 points 1 week ago (4 children)

Searching is fun again.

What? When was searching ever "fun"? And when was that even a desirable state? Statements like this contribute to the propensity to dismiss kagi fans as shills.

[–] [email protected] 6 points 1 week ago (1 children)

Kagi is the same as ddg 99% of the time.

[–] [email protected] 8 points 1 week ago

Kagi is very good.

[–] [email protected] 7 points 1 week ago

I don't use Perplexity, but AI is generally 60-80% effective with a larger-than-average open-weights offline model running on your own hardware.

DDG offers the ability to use some of these. I use a modified Mistral model still, even though its base model(s) are Llama 2. Llama 3 can be better in some respects but it has terrible alignment bias. The primary entity in the underlying model structure is idiotic in alignment strength and incapable of reasoning about edge cases like creative writing for SciFi futurism. The alignment bleeds over. If you get on DDG and use Mixtral 8×7b, it is pretty good. The thing with models is to not talk to them like humans. Everything must be explicitly described. Humans rely on a lot of implied context, where we assume people understand what we are talking about. Talking to an AI is like appearing in court before a judge; every word matters. The LLM is basically a reflection of all of human language too. If the majority of humans are wrong about something, so is the AI.

If you ask something simple like just a question, you're not going to get very far into what the model knows. Models have very limited scope of focus. If you do not build prompt momentum into the space by describing a lot of details, the scope of focus is large but the depth is shallow. The more you build up momentum by describing what you are asking in detail, the more it narrows the scope and deeper connections can be made.

It is hard to tell what a model really knows unless you can observe the perplexity output. This is more advanced, but the perplexity score for each generated token is how you infer that the model does not know something.
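
To make that concrete, here is a minimal sketch of the arithmetic, assuming your runtime can give you the natural-log probability of each generated token. The numbers and the threshold are invented for illustration; nothing here is tied to a specific model or tool.

```python
import math

def token_perplexities(logprobs):
    """Per-token perplexity: exp(-log p), i.e. 1/p for each generated token."""
    return [math.exp(-lp) for lp in logprobs]

def sequence_perplexity(logprobs):
    """Perplexity of the whole sequence: exp of the mean negative log-probability."""
    return math.exp(-sum(logprobs) / len(logprobs))

# Hypothetical log-probabilities for four generated tokens.
logprobs = [math.log(0.9), math.log(0.8), math.log(0.05), math.log(0.7)]

per_token = token_perplexities(logprobs)

# Arbitrary cutoff: tokens above it are the ones the model was unsure about.
THRESHOLD = 10.0
flagged = [i for i, p in enumerate(per_token) if p > THRESHOLD]
```

A token the model assigned probability 0.05 has a per-token perplexity of 20, so it stands out against the others, which all sit near 1.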

Search sucks because it is a monopoly. There are only 2 relevant web crawlers: m$ and the goo. All search queries go through these, either directly or indirectly. No search provider is deterministic any more. Your results are uniquely packaged to manipulate you. They are also obfuscated to block others from using them for training better or competitive models. Then there is the US government antitrust case and all of that, which makes obfuscating their market position, by pushing people onto other platforms temporarily, their best path forward. Criminal manipulators are going to manipulate.

[–] [email protected] 6 points 1 week ago

I asked Google why search engines are so bad now, and its AI summarizes its own deficiencies quite well:

Some say search engines have declined in quality due to a number of factors, including:

Search engine optimization (SEO) spam: A wave of SEO spam has contributed to the decline in search result quality.

Affiliate marketing: Affiliate link sites contribute to the low-quality content that floods the internet.

AI-generated content: New technology can quickly produce low-quality content.

Marketing: Search results are filled with marketing and links that may not be relevant to the query.

Recommender algorithms: Some say the algorithm that recommends content is a mess. For example, someone might be recommended alt-right content after watching a click-bait video.

Ads: Google's biggest business is advertising, and it's inserting more ads into its products to make more money.

Some say it's harder to find specific information these days, and that search operators are often needed to filter search results.
