this post was submitted on 04 May 2024
289 points (95.6% liked)

 
  • The Rabbit R1, an AI gadget, runs as an Android app and does not require the "very bespoke AOSP" firmware that Rabbit claimed.
  • The Rabbit R1 launcher app can run on existing Android phones and does not need system-level permissions for its core functionality.
  • Analysis of the Rabbit R1 firmware shows minimal modifications to standard AOSP, contradicting Rabbit's claims that custom hardware is necessary.
[–] [email protected] 0 points 6 months ago* (last edited 6 months ago) (2 children)

ChatGPT 4 is a great assistant and I find it indispensable. I use it on my phone and computer, but would like it in a dedicated device.

Privacy? Yeah, it's not great, but that's mitigated by OpenAI focusing the product hard on areas that don't really need privacy.

I do think these tools can be private, but to get there we need more RAM on our computers and phones, and it needs to be high-bandwidth RAM, which costs a fortune right now. A lot of research is being done to reduce memory requirements, and more manufacturing capacity for memory is being ramped up.

[–] [email protected] 2 points 6 months ago

Last I checked (around the time that Llama 3 was released), the speed of local models on CPU was also pretty bad on most consumer hardware (Apple Silicon excepted) compared to GPU, and the consumer GPU RAM situation is even worse. At least, that's the case for models with quality anywhere near that of ChatGPT, which were mostly 70B models with a few exceptional 30B models.

My home server has a 3090, so I can use a self-hosted 4-bit (or 5-bit with reduced context) quantized 30B model. If I added another 3090 I’d be able to use a 4-bit quantized 70B model.
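As a rough sanity check on those numbers, here is a minimal sketch of the arithmetic. It counts the weights only and assumes 24 GB of VRAM per 3090; the KV cache, activations, and runtime overhead come on top of this. The function names `weight_gb` and `cards_needed` are made up for illustration.

```python
import math

VRAM_PER_3090_GB = 24  # an RTX 3090 has 24 GB of VRAM

def weight_gb(params_billions: float, bits_per_param: float) -> float:
    """Size of the weights alone, in decimal GB (no KV cache or activations)."""
    return params_billions * bits_per_param / 8

def cards_needed(params_billions: float, bits_per_param: float) -> int:
    """Minimum number of 24 GB cards whose combined VRAM holds the weights."""
    return math.ceil(weight_gb(params_billions, bits_per_param) / VRAM_PER_3090_GB)

for params, bits in [(30, 4), (30, 5), (70, 4)]:
    print(f"{params}B @ {bits}-bit: {weight_gb(params, bits):.2f} GB, "
          f"{cards_needed(params, bits)} card(s)")
```

A 30B model comes out at 15 GB at 4-bit and 18.75 GB at 5-bit, so the 5-bit version leaves noticeably less room for context on a single 24 GB card. A 4-bit 70B model needs 35 GB for the weights, which is why it takes a second card.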

There’s some research that suggests that 1.58-bit (ternary) quantization has a lot of potential, and I think it’ll be critical to getting performant models on phones and laptops. At 1.58 bits per parameter, the weights of a 30B model could fit into 6 GB of RAM, and the quality hit is allegedly negligible.
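For anyone wondering where 1.58 comes from: a ternary weight takes one of three values, so it carries log2(3) bits of information. A quick sketch of the 30B figure, assuming ideal packing and counting the weights only (real formats add some overhead, and context memory is extra); `ternary_weight_gb` is a made-up helper name:

```python
import math

# Information content of one ternary weight (three possible values).
BITS_PER_TERNARY = math.log2(3)

def ternary_weight_gb(params_billions: float) -> float:
    """Weights-only size in decimal GB at ideal ternary packing."""
    return params_billions * BITS_PER_TERNARY / 8

print(round(BITS_PER_TERNARY, 3))       # 1.585
print(round(ternary_weight_gb(30), 2))  # 5.94
```

So the weights of a 30B model land just under 6 GB, with essentially no headroom left inside that figure for anything else.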

[–] [email protected] 1 points 6 months ago

So you are using OpenAI's app? Do you have it integrated into your phone? What are the main features that you use (beyond asking questions like one does from their app/site)?