OpenAI Baselines: ACKTR & A2C

For AI builders, these implementations make reinforcement learning training more efficient. ACKTR's sample efficiency in particular can cut the time and compute needed to train agents.

OpenAI Blog · August 18, 2017 · 1 min read

openai.com

What happened

OpenAI has added two reinforcement learning algorithms to its Baselines library: ACKTR and A2C. According to the OpenAI Blog, A2C is a synchronous, deterministic variant of the Asynchronous Advantage Actor Critic (A3C) algorithm that delivers equal performance. ACKTR (Actor Critic using Kronecker-Factored Trust Region) is more sample-efficient than both TRPO and A2C, and requires only slightly more computation per update than A2C.

For developers whose AI workflows involve reinforcement learning, such as game agents, robotics simulations, or optimization tasks, these baselines provide ready-to-use, well-tested implementations. ACKTR's sample efficiency means an agent needs fewer environment interactions to reach a given level of performance, which makes it a practical choice when data or compute budgets are constrained.
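To make "synchronous" concrete, here is a minimal NumPy sketch of the advantage computation an A2C-style update might perform once every parallel environment has finished the same number of steps. It is an illustration under stated assumptions, not the Baselines code: the function names `n_step_returns` and `a2c_advantages`, the array layout `(n_steps, n_envs)`, and the toy numbers are all hypothetical.

```python
import numpy as np


def n_step_returns(rewards, dones, bootstrap_values, gamma=0.99):
    """Discounted n-step returns for a batch of environments stepped in lockstep.

    rewards, dones: shape (n_steps, n_envs)
    bootstrap_values: critic estimate of the state after the last step, shape (n_envs,)
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    dones = np.asarray(dones, dtype=np.float64)
    returns = np.zeros_like(rewards)
    running = np.asarray(bootstrap_values, dtype=np.float64).copy()
    # Walk backwards in time; a done flag cuts off bootstrapping across episodes.
    for t in reversed(range(rewards.shape[0])):
        running = rewards[t] + gamma * running * (1.0 - dones[t])
        returns[t] = running
    return returns


def a2c_advantages(rewards, dones, values, bootstrap_values, gamma=0.99):
    """Advantage = n-step return minus the critic's value estimate."""
    returns = n_step_returns(rewards, dones, bootstrap_values, gamma)
    advantages = returns - np.asarray(values, dtype=np.float64)
    return returns, advantages


# Toy rollout (hypothetical numbers): 2 steps from 2 environments.
rewards = [[1.0, 0.0], [1.0, 1.0]]
dones = [[0, 0], [0, 1]]            # the second environment terminates at step 1
values = [[0.0, 0.0], [0.0, 0.0]]   # placeholder critic outputs
returns, advantages = a2c_advantages(
    rewards, dones, values, bootstrap_values=[10.0, 10.0], gamma=0.5
)
print(returns)  # [[4.  0.5]
                #  [6.  1. ]]
```

Because all environments contribute to one batch before any parameters change, a single gradient step can use the whole `(n_steps, n_envs)` block, in contrast to A3C workers that apply updates asynchronously.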

Key takeaways

OpenAI released ACKTR and A2C implementations in its Baselines library.
A2C is a synchronous, deterministic variant of A3C that OpenAI reports performs equally well.
ACKTR is more sample-efficient than TRPO and A2C, at the cost of slightly more computation per update than A2C.
Both algorithms target reinforcement learning workloads such as game agents, robotics simulations, and optimization tasks.
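The "Kronecker-Factored" part of ACKTR's name refers to approximating a layer's curvature matrix as a Kronecker product of two small matrices, which is why the extra cost per update can stay modest. The sketch below illustrates that idea for a single linear layer with NumPy. It is a toy under assumptions, not the Baselines implementation: the function name `kfac_natural_gradient`, the damping scheme, and the random data are all hypothetical.

```python
import numpy as np


def kfac_natural_gradient(grad_W, A, G, damping=1e-3):
    """Apply a Kronecker-factored curvature inverse to a weight gradient.

    grad_W: gradient for a weight matrix of shape (n_out, n_in)
    A: second moment of layer inputs, shape (n_in, n_in)
    G: second moment of output gradients, shape (n_out, n_out)
    Returns (G + damping*I)^-1 @ grad_W @ (A + damping*I)^-1.
    """
    A_d = A + damping * np.eye(A.shape[0])
    G_d = G + damping * np.eye(G.shape[0])
    left = np.linalg.solve(G_d, grad_W)        # G_d^-1 @ grad_W
    # A_d is symmetric, so solving against the transpose right-multiplies by A_d^-1.
    return np.linalg.solve(A_d, left.T).T


# Hypothetical data: a batch of layer inputs and output gradients.
rng = np.random.default_rng(0)
n_in, n_out, batch = 3, 2, 64
acts = rng.normal(size=(batch, n_in))
out_grads = rng.normal(size=(batch, n_out))

A = acts.T @ acts / batch
G = out_grads.T @ out_grads / batch
grad_W = out_grads.T @ acts / batch

nat_grad = kfac_natural_gradient(grad_W, A, G)

# Cross-check against the dense (n_in*n_out) x (n_in*n_out) system.
damping = 1e-3
F = np.kron(A + damping * np.eye(n_in), G + damping * np.eye(n_out))
dense = np.linalg.solve(F, grad_W.reshape(-1, order="F"))
dense = dense.reshape(n_out, n_in, order="F")
print(np.allclose(nat_grad, dense))  # True
```

The factored form solves one `n_out`-sized and one `n_in`-sized system instead of a single system of size `n_in * n_out`. Note that damping each factor separately, as done here for simplicity, is not the same as adding a damping term to the full Kronecker product.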