Meta’s Next-Gen Llama 3 Model is Powered by NVIDIA GPUs, Providing AI Optimization Across All Platforms, Including RTX

According to NVIDIA, Meta’s Llama 3 LLMs were developed using NVIDIA GPUs and are optimized to run on PCs and servers alike.

NVIDIA is the engine powering Meta’s next-gen Llama 3 LLMs, which have optimized support across cloud, edge, and RTX PCs.

Press Release: NVIDIA has released platform-wide optimizations to accelerate Meta Llama 3, the latest generation of the large language model (LLM). The open model, combined with NVIDIA accelerated computing, empowers developers, researchers, and businesses to innovate responsibly across a broad range of applications.

Trained on NVIDIA AI

Meta engineers trained Llama 3 on a compute cluster of 24,576 NVIDIA H100 Tensor Core GPUs linked by an NVIDIA Quantum-2 InfiniBand network. With support from NVIDIA, Meta tuned its network, software, and model architectures for its flagship LLM.

To push the boundaries of generative AI even further, Meta recently revealed plans to scale its infrastructure to 350,000 H100 GPUs.

Putting Llama 3 to Work

Versions of Llama 3, optimized for NVIDIA GPUs, are now available for cloud, data center, edge, and PC deployments.

Using NVIDIA NeMo, an open-source LLM framework that is part of the secure, supported NVIDIA AI Enterprise platform, businesses can fine-tune Llama 3 on their own data. The customized models can then be optimized for inference with NVIDIA TensorRT-LLM and deployed with NVIDIA Triton Inference Server, as in the sketch below.
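
Once a Llama 3 model is serving behind Triton Inference Server, clients can reach it over the server’s standard KServe v2 HTTP API. The following is a minimal sketch, not an official recipe: the model name llama3 and the text_input/text_output tensor names are assumptions for illustration, and the actual names depend on your model repository’s config.pbtxt.

```python
import requests  # pip install requests

# Hypothetical deployment: a Triton Inference Server hosting a
# Llama 3 model under the name "llama3". Tensor names below are
# assumptions; check your model's config.pbtxt.
TRITON_URL = "http://localhost:8000/v2/models/llama3/infer"

payload = {
    "inputs": [
        {
            "name": "text_input",          # assumed input tensor name
            "shape": [1, 1],
            "datatype": "BYTES",
            "data": ["What is the capital of France?"],
        }
    ],
    "outputs": [{"name": "text_output"}],  # assumed output tensor name
}

response = requests.post(TRITON_URL, json=payload, timeout=60)
response.raise_for_status()

# A KServe v2 response carries output tensors under "outputs".
result = response.json()
print(result["outputs"][0]["data"][0])
```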

Taking Llama 3 to Devices and PCs

Llama 3 also runs on Jetson Orin for robotics and edge computing devices, creating interactive agents like those in the Jetson AI Lab. What’s more, RTX and GeForce RTX GPUs for workstations and PCs speed inference on Llama 3. These systems give developers a target of more than 100 million NVIDIA-accelerated systems worldwide.
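
For local experimentation on an RTX-class GPU, the eight-billion-parameter model can be run with the Hugging Face Transformers library. A minimal sketch, assuming access to the gated meta-llama/Meta-Llama-3-8B-Instruct checkpoint on the Hugging Face Hub and a GPU with enough VRAM for FP16 weights:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumes you have accepted Meta's license for this gated checkpoint.
MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,  # half precision to fit consumer VRAM
    device_map="auto",          # place weights on the available GPU
)

# Llama 3 Instruct expects a chat-formatted prompt; the tokenizer's
# chat template builds it.
messages = [{"role": "user", "content": "Explain Tensor Cores in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128, do_sample=False)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```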

Get Optimal Performance with Llama 3

On edge devices, the eight-billion-parameter version of Llama 3 generated up to 40 tokens/second on Jetson AGX Orin and 15 tokens/second on Jetson Orin Nano.
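
Throughput figures like these are straightforward to reproduce: time the generation call and divide the number of new tokens by the elapsed wall-clock time. Below is a minimal sketch; generate_fn is a hypothetical wrapper around whichever inference runtime you use, assumed to return the count of newly generated tokens.

```python
import time

def tokens_per_second(generate_fn, prompt: str, runs: int = 3) -> float:
    """Average decode throughput across several runs.

    generate_fn is a hypothetical callable: it takes a prompt and
    returns the number of new tokens it produced.
    """
    rates = []
    for _ in range(runs):
        start = time.perf_counter()
        new_tokens = generate_fn(prompt)
        elapsed = time.perf_counter() - start
        rates.append(new_tokens / elapsed)
    return sum(rates) / len(rates)

if __name__ == "__main__":
    # Stub that pretends to emit 64 tokens, standing in for a real model.
    def fake_generate(prompt: str) -> int:
        time.sleep(0.5)  # stand-in for actual decoding work
        return 64

    print(f"{tokens_per_second(fake_generate, 'hello'):.1f} tokens/s")
```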

Advancing Community Models

An active open-source contributor, NVIDIA is committed to optimizing community software that helps users address their toughest challenges. Open-source models also promote AI transparency and let users broadly share work on AI safety and resilience.
