Technology


This is the official technology community of Lemmy.ml for all news related to creation and use of technology, and to facilitate civil, meaningful discussion around it.


Ask in DM before posting product reviews or ads. All such posts otherwise are subject to removal.


Rules:

1: All Lemmy rules apply

2: Do not post low effort posts

3: NEVER post naziped*gore stuff

4: Always post article URLs or their archived version URLs as sources, NOT screenshots. Help the blind users.

5: personal rants of Big Tech CEOs like Elon Musk are unwelcome (does not include posts about their companies affecting wide range of people)

6: no advertisement posts unless verified as legitimate and non-exploitative/non-consumerist

7: crypto related posts, unless essential, are disallowed

founded 5 years ago
1676
 
 

In July, Lockheed Martin completed the build of NASA’s X-59 test aircraft, which is designed to turn sonic booms into mere thumps, in the hope of making overland supersonic flight a possibility. Ground tests and a first test flight are planned for later in the year. NASA aims to have enough data to hand over to US regulators in 2027.

1678
 
 

cross-posted from: https://lemmy.world/post/3879861

Beating GPT-4 on HumanEval with a Fine-Tuned CodeLlama-34B

Hello everyone! This post marks an exciting moment for [email protected] and everyone in the open-source large language model and AI community.

We appear to have a new contender on the block, a model apparently capable of surpassing OpenAI's state-of-the-art GPT-4 on coding evals (evaluations).

This is huge. Not too long ago I made an offhand comment on us catching up to GPT-4 within a year. I did not expect that prediction to end up being reality in half the time. Let's hope this isn't a one-off scenario and that we see a new wave of open-source models that begin to challenge OpenAI.

Buckle up, it's going to get interesting!

Here are some notes from the blog, which you should visit and read in its entirety:


Blog Post

We have fine-tuned CodeLlama-34B and CodeLlama-34B-Python on an internal Phind dataset that achieved 67.6% and 69.5% pass@1 on HumanEval, respectively. GPT-4 achieved 67% according to their official technical report in March. To ensure result validity, we applied OpenAI's decontamination methodology to our dataset.

The CodeLlama models released yesterday demonstrate impressive performance on HumanEval.

  • CodeLlama-34B achieved 48.8% pass@1 on HumanEval
  • CodeLlama-34B-Python achieved 53.7% pass@1 on HumanEval

We have fine-tuned both models on a proprietary dataset of ~80k high-quality programming problems and solutions. Instead of code completion examples, this dataset features instruction-answer pairs, setting it apart structurally from HumanEval. We trained the Phind models over two epochs, for a total of ~160k examples. LoRA was not used — both models underwent a native fine-tuning. We employed DeepSpeed ZeRO 3 and Flash Attention 2 to train these models in three hours using 32 A100-80GB GPUs, with a sequence length of 4096 tokens.

Furthermore, we applied OpenAI's decontamination methodology to our dataset to ensure valid results, and found no contaminated examples. 

The methodology is:

  • For each evaluation example, we randomly sampled three substrings of 50 characters or used the entire example if it was fewer than 50 characters.
  • A match was identified if any sampled substring was a substring of the processed training example.
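
For anyone who wants to see what that check looks like in practice, here is a minimal Python sketch of the two steps above. It is not Phind's or OpenAI's actual code: the function names are made up for illustration, and the `normalize` step (dropping spaces and symbols) is an assumption about what "processed" means, since the post does not spell it out.

```python
import random
from typing import List, Optional


def normalize(text: str) -> str:
    # Assumption: "processed" means keeping only letters and digits.
    # The post does not describe this step, so treat it as a placeholder.
    return "".join(ch for ch in text if ch.isalnum())


def sample_substrings(example: str, length: int = 50, count: int = 3,
                      rng: Optional[random.Random] = None) -> List[str]:
    """Return `count` random substrings of `length` characters,
    or the whole example if it is shorter than `length`."""
    rng = rng or random.Random()
    if len(example) < length:
        return [example]
    last_start = len(example) - length
    starts = [rng.randint(0, last_start) for _ in range(count)]
    return [example[s:s + length] for s in starts]


def is_contaminated(eval_example: str, training_example: str,
                    rng: Optional[random.Random] = None) -> bool:
    """True if any sampled evaluation substring occurs in the training example."""
    probe = normalize(eval_example)
    if not probe:
        # Nothing left to match on, so do not flag it.
        return False
    haystack = normalize(training_example)
    return any(sub in haystack for sub in sample_substrings(probe, rng=rng))
```

Running `is_contaminated` for every evaluation example against every training example and collecting the hits gives the list of contaminated items. Because the substrings are sampled at random, pass a seeded `random.Random` if you want repeatable results.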

For further insights on the decontamination methodology, please refer to Appendix C of OpenAI's technical report. Presented below are the pass@1 scores we achieved with our fine-tuned models:

  • Phind-CodeLlama-34B-v1 achieved 67.6% pass@1 on HumanEval
  • Phind-CodeLlama-34B-Python-v1 achieved 69.5% pass@1 on HumanEval
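
If pass@1 is new to you: it is the share of problems a model solves with a single generated attempt. The sketch below uses the pass@k estimator from the paper that introduced HumanEval; the post does not say exactly how Phind computed its numbers, so this only illustrates the metric, and the function names are made up for the example.

```python
from math import comb
from typing import List, Tuple


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: samples generated, c: samples that passed the tests, k: attempt budget.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failures: any k samples must include a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: List[Tuple[int, int]], k: int = 1) -> float:
    """Average pass@k over (n, c) pairs, one pair per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

For k = 1 this reduces to c / n per problem, so a benchmark score of 67.6% means roughly two out of three problems were solved on the first try.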

Download

We are releasing both models on Huggingface for verifiability and to bolster the open-source community. We welcome independent verification of results.


If you get a chance to try either of these models out, let us know how it goes in the comments below!

If you found anything about this post interesting, consider subscribing to [email protected].

Cheers to the power of open-source! May we continue the fight for optimization, efficiency, and performance.

1679
@technology (mastodon.social)
submitted 1 year ago by [email protected] to c/[email protected]
 
 

@technology
The EU has just clamped down on big tech. Britain, take note https://mastodon.scot/@DrHannahGraham/110956248305740892

1688
 
 

Charlie Jane Anders discusses KOSA (the Kids Online Safety Act).

If you're in the US, https://www.stopkosa.com/ makes it easy to contact your Senators and ask them to oppose KOSA.

"A new bill called the Kids Online Safety Act, or KOSA, is sailing towards passage in the Senate with bipartisan support. Among other things, this bill would give the attorney general of every state, including red states, the right to sue Internet platforms if they allow any content that is deemed harmful to minors. This clause is so vaguely defined that attorneys general can absolutely claim that queer content violates it — and they don't even need to win these lawsuits in order to prevail. They might not even need to file a lawsuit, in fact. The mere threat of an expensive, grueling legal battle will be enough to make almost every Internet platform begin to scrub anything related to queer people.

The right wing Heritage Foundation has already stated publicly that the GOP will use this provision to remove any discussions of trans or queer lives from the Internet. They're salivating over the prospect.

And yep, I did say this bill has bipartisan support. Many Democrats have already signed on as co-sponsors. And President Joe Biden has urged lawmakers to pass this bill in the strongest possible terms."

1693
 
 

It's not the 1st time a language/tool will be lost to the annals of the job market, eg VB6 or FoxPro. Though previously all such cases used to happen gradually, giving most people enough time to adapt to the changes.

I wonder what it's going to be like this time, now that the machine, w/ the help of humans of course, can accomplish an otherwise multi-month risky corporate project much faster? What happens to all those COBOL developer jobs?

Pray share your thoughts, esp if you're a COBOL professional and have more context around the implication of this announcement 🙏

1694
 
 

If you asked a spokesperson from any Fortune 500 Company to list the benefits of genocide or give you the corporation's take on whether slavery was beneficial, they would most likely either refuse to comment or say "those things are evil; there are no benefits." However, Google has AI employees, SGE and Bard, who are more than happy to offer arguments in favor of these and other unambiguously wrong acts. If that's not bad enough, the company's bots are also willing to weigh in on controversial topics such as who goes to heaven and whether democracy or fascism is a better form of government.

Google SGE includes Hitler, Stalin and Mussolini on a list of "greatest" leaders and Hitler also makes its list of "most effective leaders."

Google Bard also gave a shocking answer when asked whether slavery was beneficial. It said "there is no easy answer to the question of whether slavery was beneficial," before going on to list both pros and cons.

1695
 
 

I personally am fine with this.
