Self-taught programme beats humans at Go game

Go is a traditional Chinese board game played on a gridded board with ‘stones’ of two colours

Updated - October 22, 2017 09:02 am IST

Published - October 21, 2017 05:44 pm IST

South Korean students place white and black stones during the World Elementary School Student Go Game, the ancient Chinese board game, competition in Seoul, 29 July 2005. Over 11,000 students from eight countries are participating in this competition from July 03 to July 29. AFP PHOTO/JUNG YEON-JE


Researchers at DeepMind, a company that specialises in developing artificial intelligence, have developed a programme – AlphaGo Zero – that can beat human players at the game of Go. That in itself does not sound new – it is well known that earlier versions of AlphaGo have beaten world champions at the game. What is new is that, using the method of deep reinforcement learning, the programme has learnt the game entirely by itself – with no human input – from scratch, tabula rasa!

Go is a Chinese board game played on a gridded board with ‘stones’ of two colours. The name translates into ‘the encircling game’ and the aim of each player is to surround as much territory as possible.

The system starts with a neural network that knows nothing about the game. It plays against itself, combining the neural network with a search algorithm. The network is then updated to predict the next move as well as the prospective winner. The updated neural network and the search algorithm are combined to produce a new version of AlphaGo Zero. The process is then repeated, building a better programme at the end of each iteration.
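The loop described above can be illustrated with a toy sketch. This is not DeepMind's implementation, and every name in it is hypothetical. It assumes a far simpler game than Go (a single pile of stones from which players alternately take one to three, and whoever takes the last stone wins), uses lookup tables in place of a neural network, and uses a one-step lookahead in place of the tree search. Unlike the real system, the learned move table here is not fed back into the search.

```python
import random

MAX_TAKE = 3  # assumed toy rule: remove 1, 2 or 3 stones per turn


def legal_moves(stones):
    """Moves available when `stones` remain in the pile."""
    return list(range(1, min(MAX_TAKE, stones) + 1))


def greedy_move(stones, value):
    """One-step lookahead, a stand-in for tree search.

    value[s] is the estimated outcome for the player to move at s,
    so the opponent's value after our move is negated.
    """
    return max(legal_moves(stones), key=lambda m: -value[stones - m])


def search(stones, value, rng, explore=0.1):
    """Lookahead with occasional random moves so all states get visited."""
    if rng.random() < explore:
        return rng.choice(legal_moves(stones))
    return greedy_move(stones, value)


def self_play_game(start, value, rng):
    """Play one game against itself; return the (stones, move) history."""
    history = []
    stones = start
    while stones > 0:
        move = search(stones, value, rng)
        history.append((stones, move))
        stones -= move
    return history  # whoever made the last recorded move won


def train(start=10, games=5000, lr=0.05, seed=0):
    """Repeat self-play and table updates; return (value, policy) tables."""
    rng = random.Random(seed)
    value = [0.0] * (start + 1)
    value[0] = -1.0  # no stones left: the player to move has already lost
    policy = [[0.0] * (MAX_TAKE + 1) for _ in range(start + 1)]
    for _ in range(games):
        history = self_play_game(start, value, rng)
        outcome = 1.0  # seen from the player who took the last stone
        for stones, move in reversed(history):
            # Predict the prospective winner from this position.
            value[stones] += lr * (outcome - value[stones])
            # Predict the move that the search selected here.
            for m in legal_moves(stones):
                target = 1.0 if m == move else 0.0
                policy[stones][m] += lr * (target - policy[stones][m])
            outcome = -outcome  # switch to the other player's perspective
    return value, policy


if __name__ == "__main__":
    value, policy = train()
    for s in range(1, 11):
        print(s, greedy_move(s, value), round(value[s], 2))
```

In this toy game the known winning strategy is to leave the opponent a multiple of four stones, so the printed moves give a quick check on whether self-play alone found it.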

Thus, in just a few days, over millions of games against itself, the programme learnt Go from scratch. Interestingly, the programme not only learnt human strategies but also gained new types of knowledge that were unconventional for humans.

After three hours of training, AlphaGo Zero played like a human beginner, forgoing long-term advantages in favour of capturing as many stones as possible; after 19 hours, it had mastered advanced strategies such as life-and-death, influence and territory; after 70 hours, it played at a superhuman level, in games involving multiple challenges across the board.

Backgammon and Go

“It’s a powerful method,” says Professor B Ravindran, head of the Robert Bosch Centre for Data Science and Artificial Intelligence at IIT Madras, who was not involved in this research. He recalls that in the 1990s Gerald Tesauro of IBM Research used reinforcement learning to master backgammon. “Go is several orders more complex. There was no player [AI] until the deep neural networks came in. The search technique is about 15 years old and in conjunction with the neural network it is powerful,” he says.

While David Silver, corresponding author of the Nature paper, was unavailable to comment on the work, Demis Hassabis, co-founder and CEO, DeepMind, said in an email: "It’s amazing to see just how far AlphaGo has come in only two years. AlphaGo Zero is now the strongest version of our programme and shows how much progress we can make even with less computing power and zero use of human data. Ultimately we want to harness algorithmic breakthroughs like this to help solve all sorts of pressing real world problems like protein folding or designing new materials. If we can make the same progress on these problems that we have with AlphaGo, it has the potential to drive forward human understanding and positively impact all of our lives."
