Fresh daily

AI News

Latest AI tool releases, research breakthroughs, and industry news.

Learning from human preferences

One step towards building safe AI systems is to remove the need for humans to write goal functions, since using a simple proxy for a complex goal, or getting the complex goal a bit wrong, can lead to undesirable and even dangerous behavior. In collaboration with DeepMind’s safety team, we’ve developed an algorithm which can infer what humans want by being told which of two proposed behaviors is better.

OpenAI Blog·Jun 13·research
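To make the idea of learning from "which of two behaviors is better" concrete, here is a toy sketch. It is not the algorithm from the post: it assumes each behavior gets a single scalar reward and that preferences follow a Bradley-Terry style logistic model, and the names `preference_probability`, `fit_rewards`, and the sample comparisons are hypothetical.

```python
import math


def preference_probability(reward_a, reward_b):
    """Probability that A is preferred over B under a logistic (Bradley-Terry style) model."""
    return 1.0 / (1.0 + math.exp(reward_b - reward_a))


def fit_rewards(num_behaviors, comparisons, lr=0.5, epochs=500):
    """Fit one scalar reward per behavior from (winner, loser) index pairs.

    Uses plain gradient ascent on the log-likelihood of the observed comparisons.
    """
    rewards = [0.0] * num_behaviors
    for _ in range(epochs):
        grads = [0.0] * num_behaviors
        for winner, loser in comparisons:
            p = preference_probability(rewards[winner], rewards[loser])
            # d/dr_winner of log(p) is (1 - p); the loser gets the opposite sign.
            grads[winner] += 1.0 - p
            grads[loser] -= 1.0 - p
        rewards = [r + lr * g / len(comparisons) for r, g in zip(rewards, grads)]
    return rewards


# Hypothetical human judgments: each pair is (preferred behavior, rejected behavior).
comparisons = [(0, 1), (0, 2), (1, 2), (0, 1), (1, 2), (2, 1)]
rewards = fit_rewards(3, comparisons)
ranking = sorted(range(3), key=lambda i: rewards[i], reverse=True)
print(ranking)
```

The fitted rewards only matter relative to each other; a real system would replace the per-behavior table with a learned model and then optimize a policy against it.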

Learning to cooperate, compete, and communicate

Multiagent environments where agents compete for resources are stepping stones on the path to AGI. Multiagent environments have two useful properties: first, there is a natural curriculum—the difficulty of the environment is determined by the skill of your competitors (and if you’re competing against clones of yourself, the environment exactly matches your skill level). Second, a multiagent environment has no stable equilibrium: no matter how smart an agent is, there’s always pressure to get smarter. These environments have a very different feel from traditional environments, and it’ll take a lot more research before we become good at them.

OpenAI Blog·Jun 8·research

UCB exploration via Q-ensembles

OpenAI Blog·Jun 5·research

OpenAI Baselines: DQN

We’re open-sourcing OpenAI Baselines, our internal effort to reproduce reinforcement learning algorithms with performance on par with published results. We’ll release the algorithms over upcoming months; today’s release includes DQN and three of its variants.

OpenAI Blog·May 24·release

Robots that learn

We’ve created a robotics system, trained entirely in simulation and deployed on a physical robot, which can learn a new task after seeing it done once.

OpenAI Blog·May 16·research

Roboschool

We are releasing Roboschool: open-source software for robot simulation, integrated with OpenAI Gym.

OpenAI Blog·May 15·release

Equivalence between policy gradients and soft Q-learning

OpenAI Blog·Apr 21·research

Stochastic Neural Networks for hierarchical reinforcement learning

OpenAI Blog·Apr 10·research

Unsupervised sentiment neuron

We’ve developed an unsupervised system which learns an excellent representation of sentiment, despite being trained only to predict the next character in the text of Amazon reviews.

OpenAI Blog·Apr 6·research

Spam detection in the physical world

We’ve created the world’s first Spam-detecting AI trained entirely in simulation and deployed on a physical robot.

OpenAI Blog·Apr 1·research

Evolution strategies as a scalable alternative to reinforcement learning

We’ve discovered that evolution strategies (ES), an optimization technique that’s been known for decades, rivals the performance of standard reinforcement learning (RL) techniques on modern RL benchmarks (e.g. Atari/MuJoCo), while overcoming many of RL’s inconveniences.

OpenAI Blog·Mar 24·research
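For readers unfamiliar with evolution strategies, a minimal sketch of the basic idea follows. It is a toy under stated assumptions, not the implementation from the post: the `evolution_strategy` function, its hyperparameters, and the quadratic `objective` are all hypothetical, and the update shown is a simple Gaussian-perturbation estimator with a mean baseline.

```python
import random


def evolution_strategy(objective, theta, sigma=0.1, lr=0.05,
                       population=50, iterations=300, seed=0):
    """Maximize `objective` by perturbing parameters with Gaussian noise.

    Each iteration samples `population` noisy copies of theta, scores them,
    and moves theta toward the perturbations that scored above average.
    No gradients of `objective` are required.
    """
    rng = random.Random(seed)
    theta = list(theta)
    dim = len(theta)
    for _ in range(iterations):
        noises, scores = [], []
        for _ in range(population):
            eps = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            candidate = [t + sigma * e for t, e in zip(theta, eps)]
            noises.append(eps)
            scores.append(objective(candidate))
        baseline = sum(scores) / population  # mean baseline reduces variance
        for i in range(dim):
            grad_i = sum((s - baseline) * eps[i]
                         for s, eps in zip(scores, noises)) / (population * sigma)
            theta[i] += lr * grad_i
    return theta


# Hypothetical toy problem: recover TARGET by maximizing negative squared error.
TARGET = [0.5, -0.3, 0.8]


def objective(params):
    return -sum((p - t) ** 2 for p, t in zip(params, TARGET))


solution = evolution_strategy(objective, [0.0, 0.0, 0.0])
print([round(x, 2) for x in solution])
```

Because each candidate is scored independently, the inner loop is easy to spread across many workers, which is one reason this family of methods scales well.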

One-shot imitation learning

OpenAI Blog·Mar 21·research

Distill

We’re excited to support today’s launch of Distill, a new kind of journal aimed at excellent communication of machine learning results (novel or existing).

OpenAI Blog·Mar 20·release

Learning to communicate

In this post we’ll outline new OpenAI research in which agents develop their own language.

OpenAI Blog·Mar 16·research

Attacking machine learning with adversarial examples

Adversarial examples are inputs to machine learning models that an attacker has intentionally designed to cause the model to make a mistake; they’re like optical illusions for machines. In this post we’ll show how adversarial examples work across different mediums, and will discuss why securing systems against them can be difficult.

OpenAI Blog·Feb 24·research

Adversarial attacks on neural network policies

OpenAI Blog·Feb 8·research

Team update

The OpenAI team is now 45 people. Together, we’re pushing the frontier of AI capabilities—whether by validating novel ideas, creating new software systems, or deploying machine learning on robots.

OpenAI Blog·Jan 30·opinion
