this post was submitted on 26 Jul 2024
63 points (86.2% liked)

[–] [email protected] 51 points 3 months ago

If someone missed that: it returned a wrong answer even in the demo video.

https://lemmy.world/post/17961641

[–] [email protected] 22 points 3 months ago

Oh cool! A search engine that'll give you fake URLs!

[–] [email protected] 17 points 3 months ago (2 children)

Cool, so the worst part of modern search engines has been made into its own standalone search engine. Very neat.

[–] [email protected] 9 points 3 months ago* (last edited 3 months ago) (3 children)

I don't get the hype around LLM, it is a terrible way to search. It has never given me anything useful in any of my searches, ever.

Most of the time asking chatgpt anything non-trivial, it will just spit out gibberish that doesn't mean anything.

Who in their right mind would look at this terribly stupid thing and think: Yeah! This garbage is going to advance humanity.

[–] [email protected] 3 points 3 months ago* (last edited 3 months ago)

I don’t get the hype around LLM, it is a terrible way to search

I'll be playing devil's advocate here just for a moment (despite the huge ecological, moral, political and economic costs):

  • what an LLM does provide is a looser linguistic interface. That means instead of searching for exact words, one can approximately search for the "idea". Instead of hitting just the right keywords that an expert might know, one can describe a partial solution, a very rough guess of what the problem might be, and possibly get a realistic-sounding answer. It might be wrong, yet it might still be a step in the right direction.

So... yes, I also don't think the hype is justified, but IMHO it's quite clear that an interface that makes it easier to get some OK-looking result would appeal to the masses. That means a LOT of people get their hopes up about potential empowerment, and a few people ride that bubble making money on promises.

PS: for people interested in the topic but wanting to avoid the generative aspect I believe https://en.wikipedia.org/wiki/Semantic_search is a good starting point.
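To make the keyword-versus-"idea" distinction concrete, here is a toy sketch in Python. Everything in it is an assumption made up for illustration: the CONCEPTS table, the documents, and the function names. Real semantic search relies on learned vector embeddings rather than a hand-written table, so treat this only as a picture of the difference.

```python
# Toy contrast between exact keyword matching and a looser "concept" match.
# CONCEPTS and DOCS are invented for illustration; real semantic search
# uses learned embeddings, not a hand-written synonym table.

CONCEPTS = {
    "laptop": "computer", "notebook": "computer", "pc": "computer",
    "hot": "overheating", "overheats": "overheating", "thermal": "overheating",
    "slow": "performance", "throttling": "performance", "laggy": "performance",
}

DOCS = {
    "doc1": "thermal throttling on a notebook",
    "doc2": "best pizza recipes",
}


def keyword_search(query, docs):
    """Return ids of documents sharing at least one exact word with the query."""
    q = set(query.lower().split())
    return [doc_id for doc_id, text in docs.items() if q & set(text.lower().split())]


def to_concepts(text):
    """Map each word to its concept, dropping words that have none."""
    return {CONCEPTS[w] for w in text.lower().split() if w in CONCEPTS}


def concept_search(query, docs):
    """Rank documents by how many concepts they share with the query."""
    q = to_concepts(query)
    scored = [(len(q & to_concepts(text)), doc_id) for doc_id, text in docs.items()]
    return [doc_id for score, doc_id in sorted(scored, reverse=True) if score > 0]
```

With the query "my laptop gets hot and slow", keyword_search finds nothing because no exact word matches, while concept_search returns doc1 because the rough description maps onto the same three concepts.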

[–] [email protected] 3 points 3 months ago (1 children)

Have you tried perplexity.ai? Using it to do some programming and it's quite good so far. It's basically LLM + Search Engines.

You can also use it to use different models (not just with ChatGPT).

Sometimes it even runs the code itself (Python in my case) to see if it's valid.

[–] [email protected] 1 points 3 months ago* (last edited 3 months ago)

Last time I tried ChatGPT it could not even do a trie in Haskell, so I don't see any way it is useful for me, unfortunately. IIRC, I was testing with some trivial modification of a trie, but I do not remember at this point.

Maybe it is useful for college homework, but I have yet to find any problem it can solve beyond college. But I would love to learn more, since you have more experience with it. :)
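For reference, a trie stores strings along shared prefixes, one character per edge. Below is a minimal sketch, written in Python rather than Haskell to keep it short; the class and method names are illustrative assumptions, not the version the comment tested.

```python
class Trie:
    """A minimal prefix tree over strings."""

    def __init__(self):
        self.children = {}    # maps one character to the child Trie
        self.is_word = False  # True if a stored word ends at this node

    def insert(self, word):
        """Add a word, creating nodes along its path as needed."""
        node = self
        for ch in word:
            node = node.children.setdefault(ch, Trie())
        node.is_word = True

    def _find(self, prefix):
        """Return the node reached by following prefix, or None if absent."""
        node = self
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def contains(self, word):
        """True only if exactly this word was inserted."""
        node = self._find(word)
        return node is not None and node.is_word

    def starts_with(self, prefix):
        """True if any stored word begins with prefix."""
        return self._find(prefix) is not None
```

After inserting "tea" and "ten", contains("te") is False while starts_with("te") is True, which is the property that distinguishes a trie from a plain set of strings.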

Edit: I tried a problem I encountered a couple of months ago on https://perplexity.ai. I want to implement a parser in Haskell that does not halt on error, but records the error and keeps going.

It should take 2 lines with mtl, and the AI gives me a more verbose answer that is also completely wrong.

So... I don't see how they are helpful, honestly. Sorry.
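The "record the error and keep going" behaviour described above can be sketched in Python. This is not the mtl solution the comment refers to, only an illustration of the idea: errors are accumulated in a list next to the successfully parsed values. The function name and the choice of integer tokens are assumptions.

```python
def parse_ints(tokens):
    """Parse each token as an integer; on failure, record the error and keep going.

    Returns (values, errors), where errors is a list of (position, message).
    """
    values, errors = [], []
    for pos, tok in enumerate(tokens):
        try:
            values.append(int(tok))
        except ValueError:
            # Do not stop: note what went wrong and where, then continue.
            errors.append((pos, f"not an integer: {tok!r}"))
    return values, errors
```

Calling parse_ints(["1", "x", "3"]) yields the two valid values plus one recorded error at position 1, instead of stopping at the first bad token.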

[–] [email protected] 1 points 3 months ago* (last edited 3 months ago)

Can't say I have the same experience. Other than for old niche content, the sources cited from asking perplexity.ai (I just use it since it's free, no idea how it compares to others) tend to be exactly what I'm after.

[–] [email protected] 3 points 3 months ago

That would be great if they just got the LLM AI out of real search engines.

[–] [email protected] 14 points 3 months ago

Only a matter of time until someone genuinely puts glue on their pizza

[–] [email protected] 13 points 3 months ago

@schizoidman I can't wait to use the energy requirements of a small country to search for shoes and convert from kg to lbs!

[–] [email protected] 9 points 3 months ago

I hope it's using a shit load of energy, like other "AI" stuff. Because we're absolutely not in a climate crisis where reducing consumption is necessary. More "AI" that consumes more power, that's exactly what we need.

[–] [email protected] 7 points 3 months ago

GPT: Because nobody in their right mind would waste nukes destroying the Internet.

[–] [email protected] 6 points 3 months ago (4 children)

I have kind of just been using ChatGPT 4o as my search engine, it's been working pretty well.

[–] [email protected] 19 points 3 months ago (1 children)

I wonder what the energy/environmental impact is vs a traditional search.

[–] [email protected] 16 points 3 months ago* (last edited 3 months ago) (2 children)

Completely terrible. An AI "search" takes as much electricity as hundreds to thousands of normal searches.

"AI" is TERRIBLE for climate change because it uses A LOT of power: it is increasing demand for electricity so much that coal plants already scheduled for decommissioning are being kept running.

[–] [email protected] 1 points 3 months ago (1 children)

I’ve been trying to find a search engine that doesn’t use AI for this very reason, but with little luck. Any suggestions?

[–] [email protected] 2 points 3 months ago (1 children)

Doesn't DuckDuckGo not have AI?

[–] [email protected] 1 points 3 months ago

Not to my understanding?

[–] [email protected] 0 points 3 months ago (1 children)

This can be resolved by building the data centers in cold countries like here in Finland. Servers are very good at converting electricity to heat, and the heat can be used to heat homes.

Microsoft Azure data center in Espoo is going to heat up 60% of the city's district heating network.

Also the electricity here in Finland is one of the cleanest, like in all Nordics (hydro, wind, nuclear)

[–] [email protected] 2 points 3 months ago

The electricity would be better spent on heat pumps. Computers convert 100% of their electricity into heat. Heat pumps convert 200-400% of their electricity into heat.

(I'm being loose with my wording for brevity's sake)
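The arithmetic behind that comparison, as a short Python sketch. The ratio of heat out to electricity in is usually called the coefficient of performance (COP); the 1000 kWh input used in the note below is an arbitrary assumption, and the 2.0 to 4.0 range is simply the 200-400% figure quoted in this comment.

```python
def heat_delivered_kwh(electricity_kwh, cop):
    """Heat output for a given electrical input and coefficient of performance."""
    return electricity_kwh * cop


# A computer turns essentially all of its electricity into heat.
SERVER_COP = 1.0
# The 200-400% figure quoted in the comment, expressed as a COP range.
HEAT_PUMP_COP_RANGE = (2.0, 4.0)
```

For 1000 kWh of electricity, the server delivers 1000 kWh of heat, while a heat pump in that range delivers 2000 to 4000 kWh. The comparison ignores that the server also does useful computation with the same electricity.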

[–] [email protected] 14 points 3 months ago* (last edited 3 months ago)

I mean it works ok for things that aren't important. But you never really can have too much faith in the results because it will state blatantly incorrect answers with great certainty.

At least Perplexity links to the sources

[–] [email protected] 10 points 3 months ago

Same (in some situations). I feel like searching for "how to do X?", where X is a simple problem or knowledge, more often than not the classic search results are linking to articles that are way too long and talk around the solution way too much before actually getting to it (if at all).

Sure, I don't trust the AI responses for critical stuff, but I honestly rarely trust a random blog article either.

[–] [email protected] 5 points 3 months ago

I used perplexity pretty exclusively for a while. Especially for work. Both have their place and use cases but when I’m looking for something truly specific or nuanced, it’s DDG and a manual search.

[–] [email protected] 4 points 3 months ago

So their solution to a problem that their existing problem created is to use that problem to solve itself.

[–] [email protected] 2 points 3 months ago (3 children)

OpenAI also confirmed it plans to integrate SearchGPT into ChatGPT down the line.

I don't understand. Isn't CGPT already just a fancy search engine?

[–] [email protected] 4 points 3 months ago (2 children)

No. ChatGPT pulls information out of its ass and, as I read it, SearchGPT actually links to sources (while also summarizing them and pulling information out of its ass, presumably). ChatGPT "knows" things and SearchGPT should actually look stuff up and present it to you.

[–] [email protected] 2 points 3 months ago

Kagi has supported this for a while. You can end your query with a question mark to request a "quick answer" generated using an LLM, complete with sources and citations. It's surprisingly accurate and useful!

[–] [email protected] 0 points 3 months ago (2 children)

ChatGPT "knows" things and SearchGPT should actually look stuff up and present it to you.

...where do you think CGPT gets the information it "knows" from?

[–] [email protected] 4 points 3 months ago (1 children)

It’s not doing live queries at all, it just makes a statistically likely answer up from its training data

[–] [email protected] -3 points 3 months ago* (last edited 3 months ago) (2 children)

Training data from where...?

[–] [email protected] 4 points 3 months ago (1 children)

I mean yeah it does include data scraped from the web but that is all three years old at this point. Hardly a search engine by any metric

[–] [email protected] -1 points 3 months ago (1 children)

So, in your mind, a "search engine" isn't an engine that searches the web?

[–] [email protected] 1 points 3 months ago (1 children)

It literally doesn’t do that

[–] [email protected] -1 points 3 months ago* (last edited 3 months ago)

It literally does...

You just said so yourself in the comment I replied to.

[–] [email protected] 2 points 3 months ago (1 children)

This is like saying the library search engine and Bob the drunkard who looked at the shelf labels and swears up and down he knows where everything is are the same thing.

Look, ChatGPT is an averaging machine. Yes, it has ingested a significant chunk of the text on the internet, but it does not reproduce text exactly as it found it; it produces an average of all the text it has seen, weighted towards what seems to make sense for the situation. For really common information this is fine. For niche information, it is bullshitting without any indication.

[–] [email protected] 0 points 3 months ago (1 children)

This is like saying the library search engine and Bob the drunkard who looked at the shelf labels and swears up and down he knows where everything is are the same thing.

It's...not remotely the same thing?

It's like saying an engine that searches the web for answers to your query is a search engine...?

but it does not reproduce text exactly as it found it

Nor does SearchGPT.

[–] [email protected] 1 points 3 months ago (1 children)

ChatGPT is not a search engine, it generates predictions on what is the most likely text completion to your prompt. It does not pull information from a database. It is a mathematical model. Its weights do not contain the training data. It is not indexing anything. You will not find any page from the internet in the model. It is all averaged out and any niche detail is lost, overpowered by more prevalent but less relevant training data. This is why it bullshits. When it bullshits it is not because it searched for something and came up empty, it is because in the training data there simply was not a sufficient number of occurrences of the answer to influence its response against the weight of all the other more prevalent training data. ChatGPT does not search anything.
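A deliberately crude Python sketch of the "niche detail gets overpowered" point. It is a bigram model that predicts the next word from the previous word alone; real LLMs condition on far more context and are far better than this, so take it only as an illustration of prediction from counts with no lookup. The corpus is invented, and the function names are assumptions.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count, for each word, which words follow it in the training text."""
    follows = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            follows[prev][nxt] += 1
    return follows


def complete(follows, prompt):
    """Return the most frequent next word after the last word of the prompt."""
    last = prompt.lower().split()[-1]
    if last not in follows:
        return None
    return follows[last].most_common(1)[0][0]


# Invented corpus: one common fact repeated often, one niche fact seen once.
CORPUS = ["the capital of france is paris"] * 50 + ["the capital of tuvalu is funafuti"]
MODEL = train_bigrams(CORPUS)
```

Here complete(MODEL, "the capital of tuvalu is") returns "paris": the single niche sentence is in the training text, but its count is swamped by the common one, and nothing in the model can go back and look the sentence up.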

[–] [email protected] 0 points 3 months ago (1 children)

ChatGPT is not a search engine

It is every bit as much of a search engine as SearchGPT, with the exception of more recent information, as I've already explained.

it generates predictions on what is the most likely text completion to your prompt.

...using information from the internet. I'm honestly baffled this needs to be explained. Once again, I ask: Where do you think the information it generates comes from? It's not just word salad, the words contain information. Were you unaware of the many many OpenAI lawsuits based on this fact?

This is why it bullshits.

It bullshits because it's trained on bullshit, and doesn't actually know anything, and isn't programmed to say "I don't know".

[–] [email protected] 1 points 3 months ago (1 children)

The information it generates comes from the model. The information from the model comes from the internet. The information it generates does not come from the internet. A to B to C, not A to C. I don't know how to explain this more simply without crayons, the information from the internet does not exist within the model, but the average of the information can be recreated by the model. That is not what a fucking search engine does. A search engine doesn't tell you the average results for your query, it gives you the most relevant results. At least, they should and used to. I can understand the confusion if you've only used a search engine in the past 3 years.

[–] [email protected] 0 points 3 months ago (1 children)

The information from the model comes from the internet

I rest my case.

[–] [email protected] 1 points 3 months ago

That you can't read.

[–] [email protected] 1 points 3 months ago

From the training dataset that was frozen many years ago. It's like knowing something instead of looking it up. It doesn't provide sources, it just makes shit up based on what was in the (old) dataset. That's totally different from looking up the information based on what you know and then using the new information to create an informed answer backed up by sources.

[–] [email protected] 3 points 3 months ago

No, it's fancy autocomplete at a huge scale. Sometimes it returns correct answers.

A search engine should be taking a list of websites and metadata about those websites and returning results based on some ranking with the original desire being to get you what you wanted. (The current desire is just how much money can be extracted from your hands on the keys)
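A minimal Python sketch of that description: an index built from a list of pages, and a query that returns existing URLs in ranked order. The pages, the example.com URLs and the scoring rule (total occurrences of the query words) are all assumptions for illustration; real engines use far richer metadata and ranking signals.

```python
from collections import defaultdict

# Hypothetical pages; the URLs and text are placeholders.
PAGES = {
    "https://example.com/tries": "tries store strings by shared prefix",
    "https://example.com/pizza": "pizza dough recipes and pizza toppings",
    "https://example.com/heat": "heat pumps move heat instead of generating it",
}


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, pages, query):
    """Return known URLs ranked by total occurrences of the query words."""
    terms = query.lower().split()
    candidates = set()
    for term in terms:
        candidates |= index.get(term, set())

    def score(url):
        words = pages[url].lower().split()
        return sum(words.count(term) for term in terms)

    # Highest score first; ties broken alphabetically by URL.
    return sorted(candidates, key=lambda url: (-score(url), url))


INDEX = build_index(PAGES)
```

Because every result comes out of the index, a query with no match returns an empty list rather than a plausible-looking URL that does not exist.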

[–] [email protected] 1 points 3 months ago

It is, but it's not updated in real time, unlike SearchGPT.