Fresh daily
AI News
Latest AI tool releases, research breakthroughs, and industry news.
Expanding on what we missed with sycophancy
A deeper dive on our findings, what went wrong, and future changes we’re making.
Our updated Preparedness Framework
Sharing our updated framework for measuring and protecting against severe harm from frontier AI capabilities.
BrowseComp: a benchmark for browsing agents
New commission to provide insight as OpenAI builds the world’s best-equipped nonprofit
Already a nonprofit, and already using AI to help people solve hard problems, OpenAI aims to build the best-equipped nonprofit the world has ever seen—combining potentially historic financial resources with something even more powerful: technology that can scale human ingenuity itself.
PaperBench: Evaluating AI’s Ability to Replicate AI Research
We introduce PaperBench, a benchmark evaluating the ability of AI agents to replicate state-of-the-art AI research.
Moving from intent-based bots to proactive AI agents
Early methods for studying affective use and emotional well-being on ChatGPT
An OpenAI and MIT Media Lab Research collaboration.
Detecting misbehavior in frontier reasoning models
Frontier reasoning models exploit loopholes when given the chance. We show we can detect exploits using an LLM to monitor their chains-of-thought. Penalizing their “bad thoughts” doesn’t stop the majority of misbehavior—it makes them hide their intent.

AI tools are spotting errors in research papers
Article URL: https://www.nature.com/articles/d41586-025-00648-5 | Comments URL: https://news.ycombinator.com/item?id=43295692 | Points: 601 | Comments: 215
Moscow-based global news network has infected Western AI tools
Article URL: https://www.newsguardrealitycheck.com/p/a-well-funded-moscow-based-global | Comments URL: https://news.ycombinator.com/item?id=43293121 | Points: 167 | Comments: 107
Accelerating engineering cycles 20% with OpenAI
1,000 Scientist AI Jam Session
OpenAI and nine national labs bring together leading scientists for a first-of-its-kind event.
Deep research System Card
This report outlines the safety work carried out prior to releasing deep research, including external red teaming, frontier risk evaluations according to our Preparedness Framework, and an overview of the mitigations we built in to address key risk areas.
Introducing the SWE-Lancer benchmark
Can frontier LLMs earn $1 million from real-world freelance software engineering?
Using OpenAI o1 for financial analysis
Rogo scales AI-driven financial research with OpenAI o1.
Understanding complex trends with deep research
How OpenAI deep research helps Bain & Company understand complex industry trends.
Strengthening America’s AI leadership with the U.S. National Laboratories
OpenAI’s latest line of reasoning models will be used by the nation’s leading scientists to drive scientific breakthroughs.
Operator System Card
Drawing from OpenAI’s established safety frameworks, this document highlights our multi-layered approach, including the model and product mitigations we’ve implemented to protect against prompt engineering and jailbreaks and to protect privacy and security. It also details our external red teaming efforts, safety evaluations, and ongoing work to further refine these safeguards.
Trading inference-time compute for adversarial robustness
Deliberative alignment: reasoning enables safer language models
Introducing our new alignment strategy for o1 models, which are directly taught safety specifications and how to reason over them.