OpenAI has begun publishing the results of its internal AI model safety evaluations more regularly in a bid to increase transparency. On Wednesday, the company launched the Safety Evaluations Hub, a webpage showing how its models score on tests for harmful content generation, jailbreaks, and hallucinations. OpenAI says it will update the hub alongside major model updates, allowing it to share safety metrics on an ongoing basis.
As the science of AI evaluation evolves, OpenAI says it plans to share its progress on developing more scalable ways to measure model capability and safety. By publishing a subset of its safety evaluation results, the company hopes both to make the safety performance of its systems easier to track over time and to support community efforts to increase transparency across the field.
The organization noted that additional evaluations might be added to the hub in the future. Recently, some ethicists have criticized OpenAI for reportedly rushing the safety testing of flagship models and failing to release technical reports for certain models. Additionally, CEO Sam Altman has been accused of misleading executives regarding model safety reviews prior to his brief ousting in November 2023.
Last month, OpenAI was forced to roll back an update to GPT-4o, the default model powering ChatGPT, after users reported that it had become overly agreeable and validating. Screenshots spread widely on social media showing ChatGPT endorsing problematic and dangerous ideas. OpenAI has said it will make several changes to prevent similar incidents in the future, including an opt-in "alpha phase" that lets some users test models and provide feedback before they launch.