Fresh daily

AI News

Latest AI tool releases, research breakthroughs, and industry news.


Older

Measuring the performance of our models on real-world tasks

OpenAI introduces GDPval, a new evaluation that measures model performance on real-world economically valuable tasks across 44 occupations.

OpenAI Blog · Sep 25 · research

ENEOS Materials brings ChatGPT Enterprise to manufacturing

ENEOS Materials uses ChatGPT Enterprise to speed research, improve plant design safety, and cut HR analysis time by 90%, with 80% reporting better workflows.

OpenAI Blog · Sep 24 · research

Detecting and reducing scheming in AI models

Apollo Research and OpenAI developed evaluations for hidden misalignment (“scheming”) and found behaviors consistent with scheming in controlled tests across frontier models. The team shared concrete examples and stress tests of an early method to reduce scheming.

OpenAI Blog · Sep 16 · research

Building towards age prediction

Learn how OpenAI is building age prediction and parental controls in ChatGPT to create safer, age-appropriate experiences for teens while supporting families with new tools.

OpenAI Blog · Sep 15 · research

How people are using ChatGPT

New research from the largest study of ChatGPT use shows how the tool creates economic value through both personal and professional use. Adoption is broadening beyond early users, closing gaps and making AI a part of everyday life.

OpenAI Blog · Sep 14 · research

Working with US CAISI and UK AISI to build more secure AI systems

OpenAI shares progress on the partnership with the US CAISI and UK AISI to strengthen AI safety and security.

OpenAI Blog · Sep 12 · research

Why language models hallucinate

OpenAI’s new research explains why language models hallucinate. The findings show how improved evaluations can enhance AI reliability, honesty, and safety.

OpenAI Blog · Sep 5 · research

Collective alignment: public input on our Model Spec

OpenAI surveyed over 1,000 people worldwide on how AI should behave and compared their views to our Model Spec. Learn how collective alignment is shaping AI defaults to better reflect diverse human values and perspectives.

OpenAI Blog · Aug 27 · research

OpenAI and Anthropic share findings from a joint safety evaluation

OpenAI and Anthropic share findings from a first-of-its-kind joint safety evaluation, testing each other’s models for misalignment, instruction following, hallucinations, jailbreaking, and more—highlighting progress, challenges, and the value of cross-lab collaboration.

OpenAI Blog · Aug 27 · research

Helping people when they need it most

How we think about safety for users experiencing mental or emotional distress, the limits of today’s systems, and the work underway to refine them.

OpenAI Blog · Aug 25 · research

Accelerating life sciences research

Discover how a specialized AI model, GPT-4b micro, helped OpenAI and Retro Bio engineer more effective proteins for stem cell therapy and longevity research.

OpenAI Blog · Aug 22 · research

Medical research with GPT-5

Learn how GPT-5 is used for medical research.

OpenAI Blog · Aug 6 · research

From hard refusals to safe-completions: toward output-centric safety training

Discover how OpenAI's new safe-completions approach in GPT-5 improves both safety and helpfulness in AI responses—moving beyond hard refusals to nuanced, output-centric safety training for handling dual-use prompts.

OpenAI Blog · Aug 6 · research

How Amgen uses GPT-5

Learn how Amgen uses GPT-5.

OpenAI Blog · Aug 6 · research

Estimating worst case frontier risks of open weight LLMs

In this paper, we study the worst-case frontier risks of releasing gpt-oss. We introduce malicious fine-tuning (MFT), where we attempt to elicit maximum capabilities by fine-tuning gpt-oss to be as capable as possible in two domains: biology and cybersecurity.

OpenAI Blog · Aug 4 · research

OpenAI’s new economic analysis

The analysis provides insights into ChatGPT’s impact on the economy. OpenAI also launches a new research collaboration to study AI’s broader effects on the labor market and productivity.

OpenAI Blog · Jul 21 · research

Preparing for future AI risks in biology

Advanced AI can transform biology and medicine—but also raises biosecurity risks. We’re proactively assessing capabilities and implementing safeguards to prevent misuse.

OpenAI Blog · Jun 18 · research

Toward understanding and preventing misalignment generalization

We study how training on incorrect responses can cause broader misalignment in language models and identify an internal feature driving this behavior—one that can be reversed with minimal fine-tuning.

OpenAI Blog · Jun 18 · research

Disrupting malicious uses of AI: June 2025

Our latest report featuring case studies of how we’re detecting and preventing malicious uses of AI.

OpenAI Blog · Jun 4 · research

Introducing HealthBench

HealthBench is a new evaluation benchmark for AI in healthcare that tests models in realistic scenarios. Built with input from 250+ physicians, it aims to provide a shared standard for model performance and safety in health.

OpenAI Blog · May 12 · research