AI Digest
Research · OpenAI

OpenAI Releases MRC Protocol to Enhance AI Supercomputer Networks

OpenAI has announced MRC (Multipath Reliable Connection), a new networking protocol designed specifically for AI supercomputer clusters. The protocol has been released through the Open Compute Project (OCP), making it available to the broader technology community. It aims to improve both resilience and performance in large-scale AI training environments.

As AI models grow more complex and demand ever larger computing resources, the network connecting thousands of GPUs has become a critical bottleneck. Traditional networking protocols struggle with the demands of distributed AI training, where even brief connection failures or slowdowns can disrupt training runs that cost millions of dollars. MRC addresses these challenges by using multiple network paths simultaneously, so that training workloads can continue when individual connections experience problems.
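The general idea of spreading traffic over several paths and routing around a failed one can be illustrated with a toy simulation. This is only a sketch of the multipath concept, not MRC's actual algorithm or wire format, which the announcement does not detail. The `Path` class, the `spray` function, and the round-robin retry policy are all hypothetical names and choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Path:
    """One hypothetical network path between two endpoints (illustrative only)."""
    name: str
    healthy: bool = True
    delivered: list = field(default_factory=list)

    def send(self, packet: int) -> bool:
        # A failed path drops the packet; a healthy one records it as delivered.
        if not self.healthy:
            return False
        self.delivered.append(packet)
        return True


def spray(packets, paths):
    """Spread packets round-robin across paths, retrying on the next path on failure.

    Returns the packets that no path could deliver.
    This is an assumed policy for illustration, not the MRC specification.
    """
    undelivered = []
    for i, packet in enumerate(packets):
        # Try every path at most once, starting from a rotating offset.
        for attempt in range(len(paths)):
            path = paths[(i + attempt) % len(paths)]
            if path.send(packet):
                break
        else:
            # Reached only if every path refused the packet.
            undelivered.append(packet)
    return undelivered


paths = [Path("A"), Path("B"), Path("C"), Path("D")]
paths[1].healthy = False  # simulate one link failing
lost = spray(range(8), paths)

for p in paths:
    print(p.name, p.delivered)
print("undelivered:", lost)
```

In this sketch, packets that would have gone over the failed path "B" are picked up by the next healthy path, so the transfer completes with nothing lost. A single-path connection in the same situation would stall until the link recovered.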

The release of MRC through the Open Compute Project signals OpenAI's commitment to advancing infrastructure standards across the AI industry. Because the protocol is openly available, other organizations building large-scale AI systems can benefit from the same resilience improvements that OpenAI has developed for its own training clusters. This could accelerate the development of more robust AI infrastructure industry-wide and reduce the costs associated with failed training runs.