Standard Kernel uses AI to autonomously generate highly specialized GPU kernels, the foundational units of computation that determine how efficiently AI models run on hardware. By optimizing down to native chip instructions, it replaces static, one-size-fits-all libraries with code tailored to specific workloads and hardware configurations, without requiring changes to models or hardware. Kernel generation has become a popular LLM benchmark, but most approaches target higher-level abstractions or simple workloads; Standard Kernel operates at the instruction level to match or beat human-engineered implementations. In partner testing, it demonstrated performance improvements ranging from 80% to 4x on H100 GPUs, outperforming Nvidia's cuDNN library in certain scenarios.
As AI models grow larger and inference costs dominate budgets, the efficiency of GPU kernel code has become a critical determinant of AI economics, making Standard Kernel's optimization expertise increasingly valuable.
Standard Kernel raised a $20M seed round led by Jump Capital, with participation from General Catalyst, Felicis, CoreWeave, and Ericsson Ventures, along with notable angels including Jeff Dean and SemiAnalysis founder Dylan Patel. The company is early-stage, founded by a team from MIT and Stanford that created the widely used KernelBench and Kernel Tree Search open-source benchmarks.