vLLM-ATOM is a purpose-built plugin that aims to improve inference performance across a range of LLMs.
AMD's vLLM-ATOM plugin enables native kernel optimizations for MI350 and MI400 GPUs without vLLM code changes.
KEY POINTS
- vLLM-ATOM grants instant access to features like FP4 precision and rack-scale inference on AMD's newest GPUs.
- The plugin validates new hardware and kernel features, then upstreams mature optimizations to vLLM's ROCm backend.
- Users can run vLLM-ATOM as either a standalone server or as a plugin backend within vLLM workflows.
- vLLM-ATOM supports both LLMs and VLMs through a unified inference pipeline on AMD hardware.
Summarized by Newsio from Wccftech.