Thursday, October 17th 2024
NVIDIA Fine-Tunes Llama3.1 Model to Beat GPT-4o and Claude 3.5 Sonnet with Only 70 Billion Parameters
NVIDIA has officially released its Llama-3.1-Nemotron-70B-Instruct model. Based on Meta's Llama 3.1 70B, Nemotron is a large language model customized by NVIDIA to improve the helpfulness of LLM-generated responses. NVIDIA fine-tunes the model on structured preference data to steer it toward more helpful answers. With only 70 billion parameters, the model is punching far above its weight class: the company claims it beats the current top models from leading labs, OpenAI's GPT-4o and Anthropic's Claude 3.5 Sonnet, across several AI benchmarks. On Arena Hard, Llama-3.1-Nemotron-70B scores 85 points, while GPT-4o and Claude 3.5 Sonnet score 79.3 and 79.2, respectively. NVIDIA also holds the top spot on other benchmarks such as AlpacaEval and MT-Bench, with scores of 57.6 and 8.98; Claude 3.5 Sonnet and GPT-4o reach 52.4 / 8.81 and 57.5 / 8.74, just below Nemotron.
This language model underwent training using reinforcement learning from human feedback (RLHF), specifically the REINFORCE algorithm, with an LLM-based reward model and custom preference prompts guiding the model's behavior. Training started from a pre-existing instruction-tuned model: Llama-3.1-70B-Instruct served as the initial policy, with Llama-3.1-Nemotron-70B-Reward providing the reward signal and HelpSteer2-Preference prompts supplying the training data. Running the model locally requires either four 40 GB or two 80 GB VRAM GPUs and 150 GB of free disk space. We managed to take it for a spin on NVIDIA's website to say hello to TechPowerUp readers. The model also passes the infamous "strawberry" test, where it has to count the occurrences of a specific letter in a word; however, that test appears to have been part of the fine-tuning data, as the model fails the next test, shown in the image below.
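For readers who want to try the model on their own hardware, the weights published on Hugging Face can be loaded with the standard transformers library. The snippet below is a minimal sketch, assuming the "nvidia/Llama-3.1-Nemotron-70B-Instruct-HF" repository ID and GPUs meeting the VRAM figures above; it is not NVIDIA's reference code.

# Minimal sketch: run Llama-3.1-Nemotron-70B-Instruct-HF locally with transformers.
# Assumes the Hugging Face repo ID below and enough GPU memory (see VRAM note above).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/Llama-3.1-Nemotron-70B-Instruct-HF"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # bf16 roughly halves memory vs. fp32
    device_map="auto",            # shard the 70B weights across available GPUs
)

# Chat-style prompt using the tokenizer's built-in chat template
messages = [{"role": "user", "content": "How many r's are in 'strawberry'?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))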
Sources:
NVIDIA, HuggingFace
31 Comments on NVIDIA Fine-Tunes Llama3.1 Model to Beat GPT-4o and Claude 3.5 Sonnet with Only 70 Billion Parameters
Glad to see all that VRAM going to "good use" while we have to pay top dollar for a mediocre amount so our games don't stutter, nVstuttterrrrrrrr here are all your "R"s Mr A.I. :roll::roll::roll::roll:
Just look at the "new" perma-beta NV App... It's been a year, and they still haven't been able to add even half the functionality of their 20-year-old control panel. If A.I. were what NV promises it is, why can't they get it to do anything meaningful for them?
We have learned that anything NV announces doesn't necessarily mean that they will deliver it. I feel for the most part, A.I. is fool's gold.
If you feed lots of electronics schematics through the model, it will still just deliver another schematic based on the provided data. It can't really think, so don't expect an improvement. I'm not saying it can't yield some useful stuff, but the chances are really low.