Modular builds a unified AI inference platform, spanning a custom compiler, an inference engine (MAX), and a high-performance programming language (Mojo), that lets developers run AI models across Nvidia, AMD, Intel, ARM, and Apple silicon without rewriting code. The platform is a single integrated stack from GPU kernel to API endpoint, replacing the patchwork of serving, optimization, and scaling tools most teams assemble today. Modular positions Mojo as the first programming language purpose-built for AI, combining Python-level usability with systems-level performance; the company reports speedups of up to 68,000x over Python on some benchmarks. Because the MAX engine handles per-hardware optimization, developers do not need to tune a model separately for each target, which gives them write-once-run-anywhere AI deployment.
AI hardware is fragmented: Nvidia, AMD, Intel, and custom accelerators each require different optimization work. Modular's unified software layer targets that gap by making models portable across these platforms.
Modular raised $250M in a Series C at a $1.6B valuation, bringing total funding to $380M. Investors include GV, General Catalyst, Greylock, and others.