Google’s Artificial Intelligence Learns That “Highly Aggressive” Behaviour & Betrayal Pay Off


Written by Alexa Erickson

An artificial intelligence created by Google made headlines recently because of its ability to learn “highly aggressive” behaviour.

DeepMind, Google’s cutting-edge AI company, has accomplished much thus far: its systems have learned from their own memory, mimicked human voices, written music, and beaten the best Go player in the world. Recently, the DeepMind team ran a series of tests to see how its AI agents would respond to certain social dilemmas, specifically whether they would be more likely to cooperate or compete.

Among the tests, one involved 40 million instances of playing the computer game Gathering, which revealed just how far the AI will go to get what it wants. The tests used two scenarios: fruit gathering, and a wolf pack hunt requiring cooperation.

The fruit gathering scenario proved particularly intriguing, since the AI agents were given the ability to attack one another by shooting beams. The team discovered that the agents attacked more aggressively when less fruit was present.

The DeepMind team described the fruit game in a blog post:

We let the agents play this game many thousands of times and let them learn how to behave rationally using deep multi-agent reinforcement learning. Rather naturally, when there are enough apples in the environment, the agents learn to peacefully coexist and collect as many apples as they can. However, as the number of apples is reduced, the agents learn that it may be better for them to tag the other agent to give themselves time on their own to collect the scarce apples.
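The trade-off the team describes can be illustrated with a rough back-of-the-envelope sketch. This is not DeepMind's model or code: the function names, the one-apple-per-step cap, and the `aim_cost` and `timeout` values are all hypothetical, chosen only to show why tagging a rival might pay off when apples are scarce but not when they are plentiful.

```python
def share_rate(respawn_rate, capacity=1.0):
    """Apples per step for one agent when two agents split the supply.

    Hypothetical toy model: an agent can pick at most `capacity`
    apples per step, and the two agents share new apples equally.
    """
    return min(capacity, respawn_rate / 2)


def tag_rate(respawn_rate, capacity=1.0, aim_cost=5, timeout=25):
    """Apples per step for an agent that repeatedly tags its rival.

    Hypothetical toy model: the agent spends `aim_cost` steps aiming
    (collecting nothing), then has the apples to itself for `timeout`
    steps while the rival is out of the game.
    """
    solo = min(capacity, respawn_rate)
    return solo * timeout / (aim_cost + timeout)


def best_policy(respawn_rate):
    """Return whichever toy policy collects more apples per step."""
    if tag_rate(respawn_rate) > share_rate(respawn_rate):
        return "tag"
    return "coexist"


if __name__ == "__main__":
    for rate in (3.0, 0.2):  # abundant vs. scarce apples (made-up rates)
        print(rate, best_policy(rate))
```

In this toy setting, time spent aiming is wasted when there are more apples than one agent can pick anyway, so coexisting wins; when apples are scarce, having them to oneself outweighs the cost of aiming.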

The team also acknowledged that the AI systems began to develop some forms of human behaviour, saying their model “shows that some aspects of human-like behaviour emerge as a product of the environment and learning.” “Less aggressive policies,” on the other hand, “emerge from learning in relatively abundant environments with less possibility for costly action.” They continued: “The greed motivation reflects the temptation to take out a rival and collect all the apples oneself.”

The team explained that, with the Wolfpack game, “The idea is that the prey is dangerous, a lone wolf can overcome it, but is at risk of losing the carcass to scavengers. However, when the two wolves capture the prey together, they can better protect the carcass from scavengers and hence receive a higher reward.”
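As a loose illustration of that reward structure, the sketch below pays every wolf near the prey at the moment of capture, and pays more when at least two wolves are close by. The function name, the capture radius, and the reward values are hypothetical placeholders, not figures from DeepMind.

```python
import math


def capture_rewards(wolves, prey, capture_radius=2.0,
                    lone_reward=1.0, team_reward=5.0):
    """Reward for each wolf at the moment the prey is captured.

    Hypothetical toy rule: wolves within `capture_radius` of the prey
    are rewarded; each gets `team_reward` if two or more wolves are
    that close, otherwise the single nearby wolf gets `lone_reward`.
    """
    near = [math.dist(w, prey) <= capture_radius for w in wolves]
    per_wolf = team_reward if sum(near) >= 2 else lone_reward
    return [per_wolf if is_near else 0.0 for is_near in near]


if __name__ == "__main__":
    print(capture_rewards([(0, 0), (1, 1)], (0, 1)))    # both wolves close
    print(capture_rewards([(0, 0), (10, 10)], (0, 1)))  # one wolf far away
```

Because the payout is higher when both wolves are close, an agent learning under a rule like this has an incentive to stay near its partner rather than hunt alone.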

The researchers are now trying to figure out how AI can eventually “control complex multi-agent systems such as the economy, traffic systems, or the ecological health of our planet – all of which depend on our continued cooperation.”

In everyday life, such findings could prove important to the design of self-driving cars, which will need to find the safest routes while also taking into consideration the intentions of all parties involved.

As for the next step for DeepMind, team member Joel Leibo envisions the AI going deeper into the motivations behind decision-making. He said, “Going forward it would be interesting to equip agents with the ability to reason about other agent’s beliefs and goals.”

Originally posted @ Collective Evolution

