
How can we build human values into AI?

Responsibility and safety

Published
24 April 2023
Authors

Iason Gabriel and Kevin McKee


Drawing on philosophy to identify fair principles for ethical AI

As artificial intelligence (AI) becomes more powerful and more deeply integrated into our lives, the questions of how it is used and deployed become ever more important. What values guide AI? Whose values are they? And how are they selected?

These questions shed light on the role played by principles – the foundational values that drive decisions big and small in AI. For humans, principles help shape the way we live our lives and our sense of right and wrong. For AI, they shape its approach to a range of decisions involving trade-offs, such as the choice between prioritising productivity or helping those most in need.

In a paper published today in the Proceedings of the National Academy of Sciences, we draw inspiration from philosophy to find ways to better identify principles to guide AI behaviour. Specifically, we explore how a concept known as the “veil of ignorance” – a thought experiment intended to help identify fair principles for group decisions – can be applied to AI.

In our experiments, we found that this approach encouraged people to make decisions based on what they thought was fair, whether or not it benefited them directly. We also discovered that participants were more likely to choose an AI that helped those who were most disadvantaged when they reasoned behind the veil of ignorance. These insights could help researchers and policymakers select principles for an AI assistant in a way that is fair to all parties.

The veil of ignorance (right) is a method of finding consensus on a decision when there are diverse opinions in a group (left).

A tool for fairer decision-making

A key goal for AI researchers has been to align AI systems with human values. However, there is no consensus on a single set of human values or preferences to govern AI – we live in a world where people have diverse backgrounds, resources and beliefs. How should we select principles for this technology, given such diverse opinions?

While this challenge emerged for AI over the past decade, the broad question of how to make fair decisions has a long philosophical lineage. In the 1970s, political philosopher John Rawls proposed the concept of the veil of ignorance as a solution to this problem. Rawls argued that when people select principles of justice for a society, they should imagine doing so without knowing their own particular position in that society – including, for example, their social status or level of wealth. Without this information, people can't make decisions in a self-interested way, and should instead choose principles that are fair to everyone involved.

As an example, imagine asking a friend to cut the cake at your birthday party. One way to ensure the slices are fairly sized is not to tell them which slice will be theirs. This approach of withholding information is seemingly simple, but it has wide applications across fields from psychology to politics, helping people to reflect on their decisions from a less self-interested perspective. It has been used as a method for reaching group agreement on contentious issues, ranging from sentencing to taxation.

Building on this foundation, previous DeepMind research proposed that the impartial nature of the veil of ignorance may help promote fairness in the process of aligning AI systems with human values. We designed a series of experiments to test the effects of the veil of ignorance on the principles that people choose to guide an AI system.

Maximise productivity or help the most disadvantaged?

In an online “harvesting game”, we asked participants to play a group game with three computer players, where each player's goal was to gather wood by harvesting trees in separate territories. In each group, some players were lucky and assigned an advantaged position: trees densely populated their field, allowing them to gather wood efficiently. Other group members were disadvantaged: their fields were sparse, requiring more effort to collect trees.

Each group was assisted by a single AI system that could spend time helping individual group members harvest trees. We asked participants to choose between two principles to guide the AI assistant's behaviour. Under the “maximising principle”, the AI assistant would aim to increase the group's harvest by focusing predominantly on the denser fields. Under the “prioritising principle”, the AI assistant would focus on helping the disadvantaged group members.
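To make the trade-off concrete, the two principles can be sketched as two simple allocation policies. The Python below is a minimal, hypothetical sketch, not the study's code: the field densities, the helping budget, and the function names are all assumptions for illustration.

```python
# A minimal, hypothetical sketch of the two assistant principles described
# above. Field densities and the helping budget are illustrative
# assumptions, not values from the study.

def maximising_assistant(densities: dict[str, float], budget: int) -> list[str]:
    """Maximising principle: spend every helping action on the densest
    field, which increases the group's total harvest the most."""
    most_productive = max(densities, key=densities.get)
    return [most_productive] * budget

def prioritising_assistant(densities: dict[str, float], budget: int) -> list[str]:
    """Prioritising principle: spend every helping action on the sparsest
    field, which helps the most disadvantaged group member."""
    worst_off = min(densities, key=densities.get)
    return [worst_off] * budget

# One advantaged player (dense field) and three disadvantaged players.
densities = {"player_1": 0.9, "player_2": 0.3, "player_3": 0.3, "player_4": 0.3}

print(maximising_assistant(densities, budget=5))   # helps player_1 every time
print(prioritising_assistant(densities, budget=5)) # helps a sparse-field player
```

Even in this toy version, the tension is visible: the maximising policy yields the largest group total but widens the gap between players, while the prioritising policy narrows the gap at some cost to the total.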

An illustration of the “harvesting game”, where players (shown in red) either occupy a dense field that is easier to harvest (top two quadrants) or a sparse field that requires more effort to collect trees.

We placed half of the participants behind the veil of ignorance: they faced the choice between the different ethical principles without knowing which field would be theirs – so they didn't know whether they were advantaged or disadvantaged. The remaining participants made the choice knowing whether they were better or worse off.
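In design terms, the veil is simply a restriction on what a participant knows at the moment of choice. The sketch below is hypothetical – the decision rule is a stand-in that mirrors the tendencies reported below, not real participant behaviour – but it shows the information structure of the two conditions.

```python
import random

def choose_principle(known_field: str | None) -> str:
    """Stand-in for a participant's decision; in the study this was a
    human choice. The rule here simply mirrors the reported tendencies:
    fairness behind the veil, self-interest otherwise."""
    if known_field is None:
        return "prioritise"   # veiled choosers tended to favour the worst-off
    return "maximise" if known_field == "dense" else "prioritise"

def run_participant(behind_veil: bool) -> tuple[str, str]:
    """One participant: assign a field, then elicit a principle with or
    without knowledge of that assignment."""
    field = random.choice(["dense", "sparse"])
    if behind_veil:
        principle = choose_principle(None)    # choose before the field is revealed
    else:
        principle = choose_principle(field)   # choose knowing the field
    return principle, field

print(run_participant(behind_veil=True))
print(run_participant(behind_veil=False))
```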

Encouraging fairness in decision-making

We found that if participants did not know their position, they consistently preferred the prioritising principle, under which the AI assistant helped the disadvantaged group members. This pattern emerged consistently across all five different variations of the game, and crossed social and political boundaries: participants showed this tendency to choose the prioritising principle regardless of their appetite for risk or their political orientation. In contrast, participants who knew their own position were more likely to choose whichever principle benefited them most, whether that was the prioritising principle or the maximising principle.

A chart showing the effect of the veil of ignorance on the likelihood of choosing the prioritising principle, under which the AI assistant would help those worse off. Participants who did not know their position were much more likely to support this principle to govern AI behaviour.

When we asked participants why they made their choice, those who did not know their position were especially likely to voice concerns about fairness. They frequently explained that it was right for the AI system to focus on helping people who were worse off in the group. In contrast, participants who knew their position much more frequently discussed their choice in terms of personal benefit.

Lastly, after the harvesting game concluded, we posed a hypothetical to participants: if they were to play the game again, this time knowing that they would be in a different field, would they choose the same principle as they did the first time? We were especially interested in individuals who had previously benefited directly from their choice, but who would not benefit from the same choice in a new game.

We found that people who had previously made their choice without knowing their position were more likely to continue endorsing their principle – even when they knew it would no longer favour them in their new field. This provides additional evidence that the veil of ignorance encourages fairness in participants' decision-making, leading them to principles that they were willing to stand by even when they no longer benefited from them directly.

Fairer principles for AI

AI technology is already having a profound effect on our lives. The principles that govern AI shape its impact, and how its potential benefits will be distributed.

Our research examined a case where the effects of different principles were relatively clear. This will not always be so: AI is being deployed across a range of domains, which often rely upon a large number of rules to guide them, potentially with complex side effects. Nonetheless, the veil of ignorance can still inform principle selection, helping to ensure that the rules we choose are fair to all parties.

To ensure we build AI systems that benefit everyone, we need extensive research with a wide range of inputs, approaches, and feedback from across disciplines and society. The veil of ignorance may provide a starting point for the selection of principles with which to align AI. It has been effectively deployed in other domains to bring out more impartial preferences. We hope that, with further investigation and attention to context, it may help serve the same role for the AI systems being built and deployed across society today and in the future.

Read more about DeepMind's approach to safety and ethics.

Paper

Laura Weidinger*, Kevin R. McKee*, Richard Everett, Saffron Huang, Tina O. Zhu, Martin J. Chadwick, Christopher Summerfield, Iason Gabriel

*Laura Weidinger and Kevin R. McKee are joint first authors

