codexarcanum

joined 3 months ago
[–] [email protected] 43 points 1 month ago (6 children)

I can't browse lemmy or play phone games while I read docs, but I can while I wait for the program to compile, load itself into a docker container, deploy to the test server, load my browser, and then fail to have fixed that bug I was looking at. Oh well, let me change one character and try again, it only takes about 15 minutes per attempt.

[–] [email protected] 24 points 1 month ago (8 children)

I switched fully over and am in the process of degoogling and de-microsofting my life. No more easy defaults of VS Code, back to custom configuring my emacs. No more surveillance, self-hosting and encryption. No more shitty windows gaming, Linux and Proton for gaming bliss!

[–] [email protected] 5 points 2 months ago (1 children)

[email protected] will ultimately be looking for this.

[–] [email protected] 5 points 2 months ago* (last edited 2 months ago)

I made a comment on a beehaw post about something similar; I should make it a post so the .world can see it.

I've been running the 14B distilled model, based on Alibaba's Qwen2.5 model, but distilled by R1 and given its chain-of-thought ability. You can run it locally with Ollama and download it from their site.
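For anyone who wants to try it, the basic Ollama invocation looks like the following. The `deepseek-r1:14b` tag is what Ollama's model library uses for the 14B distill at time of writing; check their site if the tag has changed:

```shell
# Pull the 14B distilled model (it's a multi-gigabyte download)
ollama pull deepseek-r1:14b

# Start an interactive chat session; the chain-of-thought text
# is shown between <think> tags before the final answer
ollama run deepseek-r1:14b
```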

That version has a couple of odd quirks, like the first interaction in a new session seeming much more prone to triggering a generic brush-off response. But in subsequent responses I've noticed very few guardrails.

I got it to write a very harsh essay on Tiananmen Square, tell me how to make gunpowder (very generally; the 14B model doesn't appear to have as much data available in some fields, like chemistry), offer very balanced views on Israel and Palestine, and a few other spicy responses.

At one point though I did get a very odd and suspicious message out of it regarding the "Realis" group within China and how the government always treats them very fairly. It had misread my misspelled "Isrealis" and apparently got defensive about something else entirely.

[–] [email protected] 37 points 2 months ago

In a rich man's house, there's nowhere to spit except for his face.

[–] [email protected] 3 points 2 months ago

Seems like the cross post isn't displaying quoted content (for me on Voyager mobile anyway) so I just wanted to add that in the original post, there is a long discussion I wrote highlighting some interesting aspects of this output. Please click through if you'd like to know more!

[–] [email protected] 4 points 2 months ago

It's already happening. This article takes a long look at many of the rising threats to nvidia. Some highlights:

  • Google has been running on their own homemade TPUs (tensor processing units) for years, and say they're on the 6th generation of those.

  • Some AI researchers are building an entirely AMD based stack from scratch, essentially writing their own drivers and utilities to make it happen.

  • Cerebras.ai is creating their own AI chips using a unique whole-die approach. They make an AI chip the size of an entire silicon wafer (roughly 30 cm across) with 900,000 micro-cores.

So yeah, it's not just "China AI bad" but that the entire market is catching up and innovating around nvidia's monopoly.

[–] [email protected] 6 points 2 months ago

I'm not an expert so take anything I say with hearty skepticism as well. But yes, I think it's possible that's just part of its data. Presumably it was trained using a lot of available Chinese documents, and possibly official Party documents include such statements often enough for it to internalize them as part of responses on related topics.

It could also have been intentionally trained that way. It could be using a combination of methods. All these chatbots are censored in some ways, otherwise they could tell you how to make illegal things or plan illegal acts. I've also seen so many joke/fake DeepSeek outputs in the last 2 days that I'm taking any screenshots with extra salt.

[–] [email protected] 14 points 2 months ago (3 children)

"Reasoning" models like DeepSeek R1 or ChatGPT-o1 (I hate these naming conventions) work a little differently. Before responding, they do a preliminary inference round to generate a "chain of thought", then feed it back into themselves along with the prompt and other context. By tuning this reasoning round, the output is improved by giving the model "more time to think."

In R1 (not sure about gpt), you can read this chain of thought as it's generated, which feels like it's giving you a peek inside its thoughts, but I'm skeptical of that feeling. It isn't really showing you anything secret, just running itself twice (very simplified). Perhaps some of its "cold start data" (as DS puts it) does include instructions like that, but it could also be something it dreamed up from similar discussions in its training data.
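The "run itself twice" structure can be sketched roughly like this. Note that `generate()` here is a stand-in stub, not a real model call, and the prompt formats are made up for illustration; actual reasoning models do this inside one training-shaped generation, not literally two API calls:

```python
def generate(prompt: str) -> str:
    # Stand-in for an actual LLM call; returns canned text so the
    # two-pass control flow is visible without a real model.
    if prompt.startswith("Think step by step"):
        return "<think>First consider X, then Y, so the answer is Z.</think>"
    return "Final answer based on the reasoning above."

def reasoning_inference(user_prompt: str) -> tuple[str, str]:
    # Pass 1: generate a chain of thought from the prompt alone.
    chain = generate(f"Think step by step about: {user_prompt}")
    # Pass 2: feed the chain back in alongside the original prompt,
    # so the final answer is conditioned on the "thinking" text.
    answer = generate(f"{user_prompt}\n{chain}\nNow answer concisely:")
    return chain, answer

chain, answer = reasoning_inference("Why is the sky blue?")
print(chain)   # this is the visible "thinking" you can read in R1
print(answer)
```

The chain is just ordinary model output that gets re-fed as context, which is why reading it feels revealing but isn't a window into anything hidden.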

[–] [email protected] -1 points 2 months ago (1 children)

What a strangely hostile response to an obvious joke

[–] [email protected] 9 points 2 months ago (3 children)

This is the new captcha: only an AI would know which is the real download button.

[–] [email protected] 3 points 2 months ago

Imagine working in the department the company is now named after and realizing your whole product line is irrelevant and the AI-people get all the money now.
