# OpenAI Releases Two New Reinforcement Learning Algorithms
OpenAI has announced the release of two new implementations in their Baselines library: ACKTR and A2C, offering researchers and developers more efficient tools for training AI agents.
A2C (Advantage Actor-Critic) is a synchronous variant of the popular A3C algorithm. Unlike its asynchronous predecessor, A2C waits for each parallel actor to finish its segment of experience and then performs a single batched update, matching A3C's performance while making training deterministic. This makes the algorithm more predictable and easier to debug.
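The core quantity both algorithms estimate is the advantage: how much better an action turned out than the critic's value estimate for that state. A minimal sketch of that computation (the function names and the simple Monte Carlo return are illustrative assumptions, not OpenAI's implementation):

```python
import numpy as np

def discounted_returns(rewards, gamma=0.99):
    """Compute discounted returns G_t = r_t + gamma * G_{t+1} over one rollout."""
    returns = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

def advantages(rewards, values, gamma=0.99):
    """Advantage A_t = G_t - V(s_t): how much better the outcome was
    than the critic predicted. Positive advantages reinforce the action."""
    return discounted_returns(rewards, gamma) - np.asarray(values, dtype=float)

# In the synchronous (A2C-style) setup, every environment copy produces a
# fixed-length rollout, advantages are computed for the whole batch, and one
# joint gradient step follows -- in contrast to A3C's per-worker async updates.
print(advantages([1.0, 0.0, 1.0], [0.5, 0.5, 0.5], gamma=0.9))
```

In practice the policy loss weights each action's log-probability by its advantage, while the critic regresses toward the returns; this sketch isolates only the advantage arithmetic.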
ACKTR (Actor Critic using Kronecker-Factored Trust Region) represents a more significant advance in sample efficiency. It replaces the ordinary gradient step with a natural-gradient step, using a Kronecker-factored approximation of the curvature (K-FAC) to keep that step tractable. According to OpenAI, it outperforms both TRPO and A2C in learning from fewer samples, a critical factor in reducing training time and computational cost, while requiring only marginally more computation per update than A2C.
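The Kronecker-factored idea can be shown for a single linear layer: instead of inverting the full Fisher matrix, K-FAC approximates it as a Kronecker product of two small covariance matrices, one over the layer's inputs and one over the gradients at its output. A hedged numpy sketch (the function name, damping scheme, and plain Monte Carlo covariances are simplifying assumptions; the real ACKTR implementation maintains running estimates and a trust region):

```python
import numpy as np

def kfac_natural_gradient(grad_W, acts, out_grads, damping=1e-2):
    """Approximate natural gradient for one linear layer with weights W.

    K-FAC approximates the Fisher as a Kronecker product A (x) G, where
    A = E[a a^T] over layer inputs and G = E[g g^T] over output gradients.
    Then F^{-1} vec(grad_W) ~= vec(A^{-1} grad_W G^{-1}), which only needs
    inverses of two small matrices instead of one huge one.

    acts: (batch, in_dim) inputs to the layer
    out_grads: (batch, out_dim) loss gradients at the layer's output
    grad_W: (in_dim, out_dim) ordinary gradient of the weights
    """
    n = acts.shape[0]
    # Damping keeps the factors invertible, analogous to a trust region.
    A = acts.T @ acts / n + damping * np.eye(acts.shape[1])
    G = out_grads.T @ out_grads / n + damping * np.eye(out_grads.shape[1])
    return np.linalg.solve(A, grad_W) @ np.linalg.inv(G)
```

The payoff is the cost profile the article describes: two small matrix inversions per layer cost only marginally more than a plain gradient step, yet the rescaled update accounts for curvature the way a full natural gradient would.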
**Why it matters:** Sample efficiency is a major bottleneck in reinforcement learning: agents typically need enormous amounts of environment interaction, so algorithms that learn from fewer samples directly cut training time and compute cost.