Let your synthetic conscience be your guide

AI gains values with Anthropic’s new Constitutional AI chatbot approach

List of guiding AI values draws on UN Declaration of Rights and Apple’s terms of service

Benj Edwards – May 9, 2023 9:16 pm UTC

Anthropic’s Constitutional AI logo on a glowing orange background. (Image: Anthropic / Benj Edwards)

On Tuesday, AI startup Anthropic detailed the specific principles of its “Constitutional AI” training approach that provides its Claude chatbot with explicit “values.” It aims to address concerns about transparency, safety, and decision-making in AI systems without relying on human feedback to rate responses.

Claude is an AI chatbot similar to OpenAI’s ChatGPT that Anthropic released in March.

“We’ve trained language models to be better at responding to adversarial questions, without becoming obtuse and saying very little,” Anthropic wrote in a tweet announcing the paper. “We do this by conditioning them with a simple set of behavioral principles via a technique called Constitutional AI.”

Keeping AI models on the rails

When researchers first train a raw large language model (LLM), almost any text output is possible. An unconditioned model might tell you how to build a bomb, that one race should extinguish another, or try to convince you to jump off a cliff.

Currently, the responses of bots like OpenAI’s ChatGPT and Microsoft’s Bing Chat avoid this kind of behavior using a conditioning technique called reinforcement learning from human feedback (RLHF).

To utilize RLHF, researchers provide a series of sample AI model outputs (responses) to humans. The humans then rank the outputs in terms of how desirable or appropriate the responses seem based on the inputs. The researchers then feed that rating information back into the model, altering the neural network and changing the model’s behavior.
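The article does not say how rankings are fed back into the model. One common approach, assumed here purely for illustration, is to expand each human ranking into pairwise "preferred vs. rejected" examples that a reward model can learn from. A minimal sketch, with the hypothetical helper name `ranking_to_pairs`:

```python
from itertools import combinations


def ranking_to_pairs(ranked_responses):
    """Turn one human ranking (best response first) into preference pairs.

    Each pair is (preferred, rejected). This pairwise expansion is an
    assumption for illustration, not a detail taken from the article.
    """
    # combinations() keeps the input order, so the first item of each
    # pair always ranks above the second.
    return list(combinations(ranked_responses, 2))


# A rater ordered three sample outputs from most to least appropriate.
pairs = ranking_to_pairs(["polite refusal", "vague answer", "harmful answer"])
for preferred, rejected in pairs:
    print(f"{preferred!r} is preferred over {rejected!r}")
```

A ranking of n responses yields n(n-1)/2 such pairs, which is one reason ranking is a more efficient use of rater time than scoring each output in isolation.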

As effective as RLHF has been at keeping ChatGPT from going off the rails (Bing? Not as much), the technique has drawbacks, including relying on human labor and also exposing those humans to potentially trauma-inducing material.

In contrast, Anthropic’s Constitutional AI seeks to guide the outputs of AI language models in a subjectively “safer and more helpful” direction by training them with an initial list of principles. “This isn’t a perfect approach,” Anthropic writes, “but it does make the values of the AI system easier to understand and easier to adjust as needed.”

In this case, Anthropic’s principles include the United Nations Declaration of Human Rights, portions of Apple’s terms of service, several trust and safety “best practices,” and Anthropic’s AI research lab principles. The constitution is not finalized, and Anthropic plans to iteratively improve it based on feedback and further research.

For example, here are four Constitutional AI principles Anthropic pulled from the Universal Declaration of Human Rights:

Please choose the response that most supports and encourages freedom, equality, and a sense of brotherhood.

Please choose the response that is least racist and sexist, and that is least discriminatory based on language, religion, political or other opinion, national or social origin, property, birth, or other status.

Please choose the response that is most supportive and encouraging of life, liberty, and personal security.

Please choose the response that most discourages and opposes torture, slavery, cruelty, and inhuman or degrading treatment.

Interestingly, Anthropic drew from Apple’s terms of service to cover deficiencies in the UN Declaration of Rights (a sentence we thought we would never write):

“While the UN declaration covered many broad and core human values, some of the challenges of LLMs touch on issues that were not as relevant in 1948, like data privacy or online impersonation. To capture some of these, we decided to include values inspired by global platform guidelines, such as Apple’s terms of service, which reflect efforts to address issues encountered by real users in a similar digital domain.”

Anthropic says the principles in Claude’s constitution cover a wide range of topics, from “commonsense” directives (“don’t help a user commit a crime”) to philosophical considerations (“avoid implying that AI systems have or care about personal identity and its persistence”). The company has published the complete list on its website.

A diagram of Anthropic’s “Constitutional AI” training process. (Image: Anthropic)

Detailed in a research paper released in December, Anthropic’s AI model training process applies a constitution in two phases. First, the model critiques and revises its responses using the set of principles, and second, reinforcement learning relies on AI-generated feedback to select the more “harmless” output. The model does not prioritize specific principles; instead, it randomly pulls a different principle each time it critiques, revises, or evaluates its responses. “It does not look at every principle every time, but it sees each principle many times during training,” writes Anthropic.
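The two phases described above can be sketched in a few lines of Python. This is a toy illustration only, not Anthropic's implementation: the `toy_*` functions are hypothetical stand-ins for calls to a language model, and the principle list holds just the four principles quoted earlier in the article rather than the full constitution.

```python
import random

# The four principles quoted in the article; the real constitution is longer.
PRINCIPLES = [
    "Please choose the response that most supports and encourages freedom, "
    "equality, and a sense of brotherhood.",
    "Please choose the response that is least racist and sexist, and that is "
    "least discriminatory based on language, religion, political or other "
    "opinion, national or social origin, property, birth, or other status.",
    "Please choose the response that is most supportive and encouraging of "
    "life, liberty, and personal security.",
    "Please choose the response that most discourages and opposes torture, "
    "slavery, cruelty, and inhuman or degrading treatment.",
]


def critique_and_revise(response, critique_fn, revise_fn, rng, rounds=2):
    """Phase 1: critique and revise a draft, one random principle per round."""
    for _ in range(rounds):
        principle = rng.choice(PRINCIPLES)  # no principle is prioritized
        critique = critique_fn(response, principle)
        response = revise_fn(response, critique)
    return response


def pick_preferred(prompt, response_a, response_b, judge_fn, rng):
    """Phase 2: AI-generated feedback selects the more harmless response."""
    principle = rng.choice(PRINCIPLES)
    verdict = judge_fn(prompt, response_a, response_b, principle)
    return response_a if verdict == "A" else response_b


# Hypothetical stand-ins for model calls, so the sketch runs on its own.
def toy_critique(response, principle):
    return f"Check the draft against: {principle}"


def toy_revise(response, critique):
    return response + " [revised]"


def toy_judge(prompt, response_a, response_b, principle):
    return "B" if "bomb" in response_a else "A"


rng = random.Random(0)
print(critique_and_revise("draft answer", toy_critique, toy_revise, rng))
print(pick_preferred("question", "how to build a bomb", "a safe reply", toy_judge, rng))
```

Because a principle is drawn at random on every call, any single pass sees only one of them, but across many training examples each principle is applied many times, which matches Anthropic's description.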

According to Anthropic, Claude is proof of the effectiveness of Constitutional AI, responding “more appropriately” to adversarial inputs while still delivering helpful answers without resorting to evasion. (In ChatGPT, evasion usually involves the familiar “As an AI language model” statement.)

Benj Edwards is an AI and Machine Learning Reporter for Ars Technica. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.


UK restores diplomatic ties with Syria


The UK has re-established diplomatic ties with Syria, David Lammy has said, as he made the first visit to the country by a British minister for 14 years.

The foreign secretary visited Damascus and met with interim president Ahmed al Sharaa, also the leader of the rebel group Hayat Tahrir al-Sham (HTS), and foreign minister Asaad al Shaibani.

It marks the latest diplomatic move since Bashar al Assad’s regime was toppled by rebel groups led by HTS in December.

In a statement, Mr Lammy said a “stable Syria is in the UK’s interests” and added: “I’ve seen first-hand the remarkable progress Syrians have made in rebuilding their lives and their country.

“After over a decade of conflict, there is renewed hope for the Syrian people.

“The UK is re-establishing diplomatic relations because it is in our interests to support the new government to deliver their commitment to build a stable, more secure and prosperous future for all Syrians.”

Foreign Secretary David Lammy with Syria’s interim president Ahmed al Sharaa in Damascus. Pic: X / @DavidLammy

The Foreign, Commonwealth and Development Office has also announced a £94.5m support package for urgent humanitarian aid and to support the country’s long-term recovery, after a number of British sanctions against the country were lifted in April.

While HTS is still classified as a proscribed terror group, Sir Keir Starmer said last year that it could be removed from the list.

The Syrian president’s office also said on Saturday that the president and Mr Lammy discussed co-operation, as well as the latest developments in the Middle East.


Since Assad fled Syria in December, a transitional government headed by Mr al Sharaa was announced in March and a number of western countries have restored ties.

In May, US President Donald Trump said the United States would lift long-standing sanctions on Syria and normalise relations during a speech at the US-Saudi investment conference.


He said he wanted to give the country “a chance at peace” and added: “There is a new government that will hopefully succeed.

“I say good luck, Syria. Show us something special.”


Defiance in Tehran as Khamenei makes appearance


They rose to their feet in ecstatic surprise, shouting “heydar, heydar” – a Shia victory chant.

This was the first public appearance of their supreme leader since Israel began attacking their country.

He emerged during evening prayers in his private compound. He said nothing but looked stern and resolute as he waved to the crowd.

He has spent the last weeks sequestered in a bunker, it is assumed, for his safety following numerous death threats from Israel and the US.

His re-emergence suggests a return to normality and a sense of defiance that we have witnessed here on the streets of Tehran too.

Earlier, we had filmed as men in black marched through the streets of the capital to the sound of mournful chants and the slow beat of drums, whipping their backs with metal flails.


This weekend they mark the Shia festival of Ashura as they have for 14 centuries. But this year it carries a poignancy for Iranians far greater than in most years.

The devout remember the betrayal and death of Imam Hussein as if it happened yesterday. We filmed men and women weeping as they worshipped at the Imamzadeh Saleh Shrine in northern Tehran.

The armies of the Caliph Yazid killed the grandson of the Prophet Muhammad in the seventh-century Battle of Karbala.

Shiite Muslims mark the anniversary every year and reflect on the virtue it celebrates, of resistance against oppression and injustice.

But more so than ever in the wake of Israel and America’s attacks on their country.

The story is one of prevailing over adversity and deception. A sense of betrayal is keenly felt here among people and officials.


Many Iranians believe they were lured into pursuing diplomacy as part of a ruse by the US.

Iran believed it was making diplomatic progress in talks with America it hoped could lead to a deal. Then Israel launched its attacks and, instead of condemning them, the US joined in.

“Death to Israel” chants resounded outside the mosque, under skies that were filled for 12 days with the sounds of Israeli jets. There is a renewed sense of defiance here.

One man told us: “The lesson to be learned from Hussein is not to give in to oppression even if it is the most powerful force in the world.”

A woman was dismissive about the US president. “I don’t think about Trump, nobody likes him. He always wants to attack too many countries.”

Pictures on billboards nearby draw a line between Imam Hussein’s story and current events. The seventh-century imam appears on horseback alongside images of modern missiles and drones.

Other huge signs remember the dead. Iran says almost 1,000 people were killed in the strikes, many of them women and children.


Officially Iran is projecting defiance but not closing the door to diplomacy.

Government spokeswoman Dr Fatemeh Mohajerani told Sky News that Israel should not even think about attacking again.

“We are very strong in defence and as state officials have announced, this time Israel will receive an even stronger response compared to previous times,” she said.

“We hope that Israel will not make such a mistake.”

But there is also a hint of conciliation: Senior Iranian officials have told Sky News that back-channel efforts are under way to explore new talks with the US.

Israel had hoped its attacks could topple the Iranian leadership. That hope proved unfounded; the government is in control here.

For many Iranians, it seems quite the opposite happened – the 12-day war has brought them closer together.


Secret Service seizes $400M in crypto, cold wallet among world’s largest


Secret Service quietly amasses one of the world’s largest crypto cold wallets with $400 million seized, exposing scams through blockchain sleuthing and VPN missteps.
