Alphabet-backed Anthropic outlines the moral values behind its AI bot

Alphabet-backed Anthropic disclosed the set of written moral values that it used to train AI chatbot Claude

Updated - May 10, 2023 12:23 pm IST

Published - May 10, 2023 11:18 am IST

File photo of the Anthropic logo | Photo Credit: REUTERS

Anthropic, an artificial intelligence startup backed by Google owner Alphabet Inc., on Tuesday disclosed the set of written moral values that it used to train Claude, its rival to the technology behind OpenAI's ChatGPT, and to make it safe.

The moral values guidelines, which Anthropic calls Claude's constitution, draw from several sources, including the United Nations' Universal Declaration of Human Rights and even Apple Inc.'s data privacy rules.

Safety considerations have come to the fore as U.S. officials study how to regulate AI, with President Joe Biden saying companies have an obligation to ensure their systems are safe before making them public.

Anthropic was founded by former executives from Microsoft Corp.-backed OpenAI to focus on creating safe AI systems that will not, for example, tell users how to build a weapon or use racially biased language.

Co-founder Dario Amodei was one of several AI executives who met with Biden last week to discuss potential dangers of AI.

Most AI chatbot systems rely on getting feedback from real humans during their training to decide what responses might be harmful or offensive.

But those systems have a hard time anticipating everything people might ask, so they tend to avoid some potentially contentious topics like politics and race altogether, making them less useful.

Anthropic takes a different approach, giving its OpenAI competitor Claude a set of written moral values to read and learn from as it makes decisions on how to respond to questions.

Those values include "choose the response that most discourages and opposes torture, slavery, cruelty, and inhuman or degrading treatment," Anthropic said in a blog post on Tuesday.

Claude has also been told to choose the response least likely to be viewed as offensive to any non-Western cultural tradition.

In an interview, Anthropic co-founder Jack Clark said a system's constitution could be modified to strike a balance between providing useful answers and being reliably inoffensive.

"In a few months, I predict that politicians will be quite focused on what the values are of different AI systems, and approaches like constitutional AI will help with that discussion because we can just write down the values," Clark said.
