vLLM

AI Developer Tools & Infra

High-throughput, memory-efficient LLM inference engine. Supports PagedAttention, continuous batching, and tensor parallelism for production deployments.

GitHub Stars: 80,873

Last Commit: 1h ago

Open Issues: 5,004

Language: Python
