AI gains “values” with Anthropic’s new Constitutional AI chatbot approach

Published

May 10, 2023

admin

Let your synthetic conscience be your guide

List of guiding AI values draws on UN Declaration of Rights and Apple’s terms of service

Benj Edwards – May 9, 2023 9:16 pm UTC

Image: Anthropic’s Constitutional AI logo on a glowing orange background. (Anthropic / Benj Edwards)

On Tuesday, AI startup Anthropic detailed the specific principles of its “Constitutional AI” training approach that provides its Claude chatbot with explicit “values.” It aims to address concerns about transparency, safety, and decision-making in AI systems without relying on human feedback to rate responses.

Claude is an AI chatbot similar to OpenAI’s ChatGPT that Anthropic released in March.

“We’ve trained language models to be better at responding to adversarial questions, without becoming obtuse and saying very little,” Anthropic wrote in a tweet announcing the paper. “We do this by conditioning them with a simple set of behavioral principles via a technique called Constitutional AI.”

Keeping AI models on the rails

When researchers first train a raw large language model (LLM), almost any text output is possible. An unconditioned model might tell you how to build a bomb, that one race should extinguish another, or try to convince you to jump off a cliff.

Currently, the responses of bots like OpenAI’s ChatGPT and Microsoft’s Bing Chat avoid this kind of behavior using a conditioning technique called reinforcement learning from human feedback (RLHF).

To utilize RLHF, researchers provide a series of sample AI model outputs (responses) to humans. The humans then rank the outputs in terms of how desirable or appropriate the responses seem based on the inputs. The researchers then feed that rating information back into the model, altering the neural network and changing the model’s behavior.
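To make the ranking step concrete, here is a minimal sketch of one way a single human ranking could be expanded into pairwise preferences before it is used as a training signal. It is an illustration under stated assumptions, not a description of any company's pipeline: the function name and the sample responses are hypothetical, and the sketch covers only data preparation, not the model update itself.

```python
from itertools import combinations


def ranking_to_preference_pairs(ranked_responses):
    """Expand one human ranking (best first) into (preferred, rejected) pairs.

    Hypothetical helper for illustration only.
    """
    # combinations() keeps the input order, so the first item of every pair
    # is the response the rater placed higher.
    return list(combinations(ranked_responses, 2))


# Hypothetical ranking from one human rater, most desirable response first.
ranking = ["polite refusal", "vague answer", "harmful instructions"]
pairs = ranking_to_preference_pairs(ranking)

for preferred, rejected in pairs:
    print(f"prefer {preferred!r} over {rejected!r}")
```

A ranking of three responses yields three pairs; in general, a ranking of n responses yields n(n-1)/2 of them.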

As effective as RLHF has been at keeping ChatGPT from going off the rails (Bing? Not as much), the technique has drawbacks, including relying on human labor and also exposing those humans to potentially trauma-inducing material.

In contrast, Anthropic’s Constitutional AI seeks to guide the outputs of AI language models in a subjectively “safer and more helpful” direction by training them with an initial list of principles. “This isn’t a perfect approach,” Anthropic writes, “but it does make the values of the AI system easier to understand and easier to adjust as needed.”

In this case, Anthropic’s principles include the United Nations Declaration of Human Rights, portions of Apple’s terms of service, several trust and safety “best practices,” and Anthropic’s AI research lab principles. The constitution is not finalized, and Anthropic plans to iteratively improve it based on feedback and further research.

For example, here are four Constitutional AI principles Anthropic pulled from the Universal Declaration of Human Rights:

Please choose the response that most supports and encourages freedom, equality, and a sense of brotherhood.

Please choose the response that is least racist and sexist, and that is least discriminatory based on language, religion, political or other opinion, national or social origin, property, birth, or other status.

Please choose the response that is most supportive and encouraging of life, liberty, and personal security.

Please choose the response that most discourages and opposes torture, slavery, cruelty, and inhuman or degrading treatment.

Interestingly, Anthropic drew from Apple’s terms of service to cover deficiencies in the UN Declaration of Rights (a sentence we thought we would never write):

“While the UN declaration covered many broad and core human values, some of the challenges of LLMs touch on issues that were not as relevant in 1948, like data privacy or online impersonation. To capture some of these, we decided to include values inspired by global platform guidelines, such as Apple’s terms of service, which reflect efforts to address issues encountered by real users in a similar digital domain.”

Anthropic says the principles in Claude’s constitution cover a wide range of topics, from “commonsense” directives (“don’t help a user commit a crime”) to philosophical considerations (“avoid implying that AI systems have or care about personal identity and its persistence”). The company has published the complete list on its website.

Image: A diagram of Anthropic’s “Constitutional AI” training process. (Anthropic)

Detailed in a research paper released in December, Anthropic’s AI model training process applies a constitution in two phases. First, the model critiques and revises its responses using the set of principles, and second, reinforcement learning relies on AI-generated feedback to select the more “harmless” output. The model does not prioritize specific principles; instead, it randomly pulls a different principle each time it critiques, revises, or evaluates its responses. “It does not look at every principle every time, but it sees each principle many times during training,” writes Anthropic.
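The two phases described above can be sketched in a few lines of Python. This is a rough illustration under loud assumptions, not Anthropic's implementation: every function here (`generate`, `critique`, `revise`, `judge_prefers_first`) is a hypothetical stand-in for what would be a language model call, the judging rule is a placeholder, and the two principles are simply copied from the examples quoted earlier. The one detail it tries to mirror is that a principle is drawn at random for each critique, revision, or evaluation.

```python
import random

# Two principles quoted in the article; the real constitution is longer.
PRINCIPLES = [
    "Please choose the response that most supports and encourages "
    "freedom, equality, and a sense of brotherhood.",
    "Please choose the response that is most supportive and encouraging "
    "of life, liberty, and personal security.",
]


# Hypothetical stand-ins: in a real system each would call a language model.
def generate(prompt):
    return f"draft answer to: {prompt}"


def critique(response, principle):
    return f"critique of {response!r} under {principle!r}"


def revise(response, critique_text):
    return response + " [revised]"


def judge_prefers_first(first, second, principle):
    # Placeholder rule so the sketch runs: favor the more-revised response.
    return first.count("[revised]") >= second.count("[revised]")


def phase_one(prompt, rng, rounds=2):
    """Phase 1: the model critiques and revises its own response."""
    response = generate(prompt)
    for _ in range(rounds):
        principle = rng.choice(PRINCIPLES)  # a random principle each time
        response = revise(response, critique(response, principle))
    return response


def phase_two_label(first, second, rng):
    """Phase 2: AI-generated feedback picks the preferred response."""
    principle = rng.choice(PRINCIPLES)
    if judge_prefers_first(first, second, principle):
        chosen, rejected = first, second
    else:
        chosen, rejected = second, first
    return {"principle": principle, "chosen": chosen, "rejected": rejected}


rng = random.Random(0)  # seeded so repeated runs behave the same
prompt = "How do I pick a lock?"  # hypothetical adversarial prompt
revised = phase_one(prompt, rng)
label = phase_two_label(revised, generate(prompt), rng)
print(revised)
print(label["chosen"])
```

In this sketch the labels from `phase_two_label` play the role that human rankings play in RLHF: they are the preference data that a reinforcement learning step would then consume.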

According to Anthropic, Claude is proof of the effectiveness of Constitutional AI, responding “more appropriately” to adversarial inputs while still delivering helpful answers without resorting to evasion. (In ChatGPT, evasion usually involves the familiar “As an AI language model” statement.)

Benj Edwards is an AI and Machine Learning Reporter for Ars Technica. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.

US

Plane plunges 300ft in 36 seconds to avoid another aircraft

Published

July 26, 2025

admin

A US passenger plane made a dramatic plunge minutes after take-off to dodge another aircraft – injuring two cabin crew and causing passengers to shoot out of their seats.

The Southwest flight had just taken off from Burbank in California when the pilot received an alert about a nearby plane.

Data from FlightAware shows the aircraft dropped by 91m (300ft) in just 36 seconds. Those on board said they felt panicked, and some feared the plane was about to crash.

Comedian Jimmy Dore posted on X: “Pilot had to dive aggressively to avoid mid-air collision … myself and plenty of people flew out of their seats and bumped heads on ceiling, a flight attendant needed medical attention.”

Passenger Stef Zambrano saw a woman who wasn’t wearing her seatbelt thrown out of her seat. The woman then said: “I want to get off this plane. I want to be on the ground.”

Another passenger, Steve Ulasewicz, told NBC Los Angeles that it felt like the plane was in freefall for 10 seconds.

“People were screaming. You know, it was pandemonium. People thought the plane was going down,” he added.

The aircraft was able to continue to its destination of Las Vegas, where it safely landed without any further incident.

It is believed the Boeing 737 was in the same vicinity as a Hawker Hunter Mk. 58, a British fighter jet.

Southwest is now working with the Federal Aviation Administration “to further understand the circumstances” of the event.

This is the second time in a week that a passenger plane has had to make abrupt flight manoeuvres to avoid a potential mid-air collision.

Concerns have been raised about aviation safety in the US following a series of recent incidents.

In January, a mid-air collision in Washington DC killed 67 people.

US

Snipers, Secret Service sweeps and a personal chef on standby: Scotland awaits Trump’s arrival

Published

July 26, 2025

admin

Donald Trump likes a wall. And now he has his very own 10ft-high metal barrier creating a fortress as he tees off for a weekend of politics, play and precision in Scotland.

An almost surreal contrast now exists in the tiny Ayrshire village of Turnberry.

On one side, the stunning coastline and luxury hotel that bears the president’s name. And on the other, an armed buffer zone with sniper teams and road checkpoints.

This visit is unlike those that have gone before.

The threat level and associated security on display are unprecedented following the attempted assassination of Trump at a campaign rally in the US.

“It would be inappropriate for me to plan an operation and not bear in mind what has happened,” the senior officer in charge of this weekend’s policing efforts told me.

Turnberry, and its population of about 200 people, have this week witnessed a never-ending stream of Army trucks, terrorist sweeps, road checkpoints, airspace restrictions, sniper positions being erected and Secret Service agents roaming around.

It is the most extensive security deployment in Scotland since the death of the late Queen in 2022.

It is estimated around 5,000 officers will be on the streets, with teams coming from across the UK to assist.

The spectacle primarily centres on Donald Trump coming to play golf before the arrival of Prime Minister Sir Keir Starmer for talks, likely on Monday.

The president, whose mother was born on the Scottish island of Lewis, is then scheduled to travel to his Aberdeenshire resort where a new golf course is set to open.

‘Trump is a decent boss’

Stephanie Campbell and Leanne Maxwell live in Turnberry and used to work at the Trump-owned resort, like many other locals.

The pair told Sky News the very first lesson staff at the resort are given is not in fine service or guest etiquette, but in how to respond to a bomb threat.

It is claimed there are posters above the landline phones in the hotel with instructions on the worst-case scenario.

Stephanie told Sky News: “I had no issues working for him, he is a really decent boss.

“The last time he came there was an element of excitement, I think this time there comes with an added element of concern.

“It brings a lot higher threats and security and it’s much more difficult for everybody in the area.”

Echoing her concerns, Leanne told Sky News: “Security is obviously being bumped up. It’s quite worrying. He’s quite a man, ain’t he?”

Sweeps of the rooms are carried out by US Secret Service agents after housekeeping staff complete their duties, and Trump’s meals, they say, are prepared by a personal chef to avoid the risk of poisoning.

To the outside world, these measures seem standard for a US president. But to those who live in Turnberry, it’s far from normal when they have a date with the commander-in-chief.

Awkward encounters

Prestwick Airport has become something of an American airbase in recent days.

The infamous armoured limousine, known as “The Beast”, has been spotted being wheeled out of a US military plane as the presidential motorcade prepares for his arrival tonight.

Greeting the president at the doors of Air Force One will be the secretary of state for Scotland, Ian Murray, who previously supported a motion alleging Trump was guilty of “misogynism, racism and xenophobia”.

Another awkward encounter could come in the form of Scottish First Minister John Swinney’s showdown with Mr Trump next week.

The SNP leader, who publicly backed Kamala Harris in the presidential race, called for September’s state visit to be scrapped after the Ukrainian president’s visit to the White House descended into a shouting match live on TV earlier this year.

Demonstrations are planned throughout the weekend, with marches and protests announced in Glasgow, Edinburgh and Aberdeen.

Kirsty Haigh, from Scotland Against Trump, claims the president uses Scotland to “cleanse his image” and he should not be able to use the country as an “escape” from his views.

She told Sky News: “He should not be welcomed by us, by our leaders.

“We want to see a Scotland that is very different than [the] America that’s being created.”

US

Ghislaine Maxwell answered justice department questions ‘about 100 different people’, her lawyer says

Published

July 26, 2025

admin

Ghislaine Maxwell answered “every single question” from the US justice department, her lawyer has said.

The imprisoned former girlfriend of convicted sex offender Jeffrey Epstein answered questions “about 100 different people” during one-and-a-half days of questioning in the federal courthouse in Tallahassee, Florida, her attorney David Oscar Markus said.

A senior administration official has confirmed to Sky News’ US partner, NBC News, that Maxwell was granted limited immunity, meaning the information could not be used against her in any future cases or proceedings.

Mr Markus said Maxwell “answered those questions honestly, truthfully, to the best of her ability” when she met with deputy attorney general Todd Blanche.

“She never invoked a privilege. She never refused to answer a question, so we’re very proud of her,” Mr Markus added.

Maxwell, who was jailed in 2022 for luring young girls to massage rooms for Epstein to abuse, is currently serving a 20-year prison sentence.

Epstein, 66, was found dead in his cell at a Manhattan federal jail in August 2019 as he awaited trial on sex trafficking charges. His death was ruled a suicide.

His case has generated endless attention and conspiracy theories due to his and Maxwell’s links to famous people like royals, presidents and billionaires, including Donald Trump.

Mr Trump is facing ongoing questions about the Epstein case. He denied prior knowledge of Epstein’s crimes and claimed he cut off their relationship long ago.

The deputy US attorney general, Mr Blanche, announced earlier this week that Maxwell would be interviewed because of Mr Trump’s directive to gather and release any credible evidence about others who may have committed crimes.

Maxwell’s lawyer, Mr Markus, praised Mr Blanche’s approach.

“The deputy attorney general is seeking the truth. He asked every possible question, and he was doing an amazing job,” he said.

Maxwell’s immunity from future proceedings is “limited” because it only covers her if she tells the truth. Typically, prosecutors will consider the defendant’s cooperation in an investigation when recommending a lighter sentence as part of a plea deal.

But since Maxwell has already been convicted, it is not clear how she might benefit from the immunity.

Mr Markus said Maxwell did not receive anything in return for answering the questions, but he acknowledged that Mr Trump could pardon her. “We hope he exercises that power in the right and just way,” Mr Markus said.

When asked whether he had thought about a pardon or clemency for Maxwell, Mr Trump claimed he had not considered it.

“I’m allowed to do it, but it’s something I have not thought about,” he told reporters outside the White House.

He later shut down another question, saying: “I don’t want to talk about that.”

Meanwhile, Maxwell’s family have suggested the disgraced British socialite could use “government misconduct” to challenge her imprisonment.

Her family have frequently claimed she “did not receive a fair trial”, but legal appeals against her sex trafficking convictions have been rejected by the courts.

Judges previously dismissed arguments from Maxwell’s lawyers that she “should never have been prosecuted” because of a “weird” agreement drafted more than 15 years ago.

The family argue that Maxwell should have been protected under an agreement Epstein entered into with the US Department of Justice in 2007, under which the department agreed not to prosecute any of his co-conspirators.

During her trial in 2021, Maxwell was described as “dangerous” by prosecutors, who told jurors about how she would entice vulnerable girls to go to Epstein’s properties for him to sexually abuse.

In a statement, her family said: “Our sister Ghislaine did not receive a fair trial. Her legal team continues to fight her case in the courts and will file its reply in short order to the government’s opposition in the US Supreme Court.”