Your game of hide and seek

Programmers at OpenAI, an artificial intelligence research company, recently taught a gaggle of intelligent artificial agents — bots — to play hide-and-seek. Not because they cared who won: The goal was to observe how competition between hiders and seekers would drive the bots to find and use digital tools. When your opponent adopts a strategy that works, you have to abandon what you were doing before and find a new, better plan. So it went with hide-and-seek. After hundreds of millions of games, the bots learned to manipulate their environment to give themselves an advantage.

The hiders, for example, learned to build miniature forts and barricade themselves inside; the seekers, in response, learned how to use ramps to scale the walls and find the hiders. These actions showed how AI agents could learn to use things around them as tools, according to the OpenAI team. Imagine when they can use many tools, or create tools. Would they invent a ladder?

Recent studies have probed ways to teach AI agents to use tools, but in most of them, tool use itself is the goal. The hide-and-seek experiment was different: Rewards were associated with hiding and finding, and tool use just happened — and evolved — along the way. But after enough games, the seekers learned, for example, that they could move boxes even after climbing on top of them. The tactic conferred a double advantage, combining movement with the ability to peer nimbly over walls, and it showed a more innovative use of tools than the human programmers had imagined.

In addition, the emergence of advantageous traits like tool use seems to echo a more familiar course of adaptation: the evolution of human intelligence. Games have long been a useful test bed for artificial intelligence, in part because competition drives players to find ever-better strategies to win. That should work for AI systems, too: In a competitive environment, algorithms learn to avoid their own mistakes, and those of their opponents, to optimize strategy. The relationship between games and AI runs deep. In the late 1940s and early 1950s, computer scientists including Claude Shannon and Alan Turing first described chess-playing algorithms.

Four decades later, an IBM researcher named Gerald Tesauro unveiled a backgammon AI program that learned the game through self-play, which means it improved by competing against older versions of itself. Instead of scanning through all possible moves — as some chess programs do — an algorithm using reinforcement learning prioritizes decisions that give it an advantage over its opponent.
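
The idea is easy to sketch in code. Below is a minimal, purely illustrative self-play loop in Python; the Agent, play_game and update_from_outcome pieces are hypothetical stand-ins rather than anything from TD-Gammon or a real chess engine, but they show the basic shape: a current policy improves by repeatedly playing a frozen, older copy of itself and reinforcing whatever led to wins.

```python
import copy
import random

# Toy self-play loop (illustrative only; hypothetical names, not code from
# TD-Gammon or OpenAI). The current agent trains by playing a frozen, older
# copy of itself and nudging its policy toward whatever led to wins.

class Agent:
    def __init__(self):
        self.skill = 0.0  # stand-in for a learned value function / policy

def play_game(current, opponent):
    # Stand-in for a real game simulator: the more "skilled" agent is more
    # likely to win. Returns True if `current` wins this game.
    edge = max(-0.4, min(0.4, current.skill - opponent.skill))
    return random.random() < 0.5 + edge

def update_from_outcome(agent, won, lr=0.01):
    # Reinforcement-style update: strengthen decisions that led to a win,
    # weaken those that led to a loss.
    agent.skill += lr if won else -lr

current = Agent()
opponent = copy.deepcopy(current)  # frozen "older self"

for game in range(10_000):
    won = play_game(current, opponent)
    update_from_outcome(current, won)
    if game % 500 == 499:
        opponent = copy.deepcopy(current)  # periodically refresh the rival

print(f"final skill estimate: {current.skill:.2f}")
```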

Chess programs that start with random moves, for example, soon discover how to arrange pawns or use other pieces to protect the king. Self-play can also produce wildly inventive strategies that no human player would ever attempt. To wit: In shogi, a Japanese game similar to chess, human players typically shy away from moving their king to the middle of the board.

An AI system, however, recently used exactly this maneuver to trounce human competitors. For some games, like rock-paper-scissors, no amount of self-play will ever transcend dumb luck. Go, by contrast, rewards deep strategy: players take turns placing black or white stones on a square grid, and the goal is to surround more territory than your opponent. DeepMind's AlphaGo learned in part through self-play, and as training continued, the program improved by discovering successful new maneuvers.

Bolstered by these successes, programmers began tackling video games, which often involve multiple players and run in continuous time, rather than move by move. The hide-and-seek experiment pushed the same competitive dynamic further: hiders learned to barricade themselves using boxes and walls, but seekers soon adapted by climbing over them with ramps. One objection holds that there are no groundbreaking methods here. But Igor Mordatch, an OpenAI researcher who worked on the project, maintains that the hide-and-seek demonstration is different. It gave its AI players room to grow — instead of just learning to win the game, they taught themselves to make their own tools to do so.

The rules of the game were simple. Each AI team had between one and three agents, and games lasted 240 moves. The setting was a virtual square arena with boxes, ramps and walls that agents could push around and that hiders could lock in place. The arena was based in a 3D physics simulator that prevented real-world impossibilities like agents walking through walls or two agents occupying the same space.
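
A rough configuration sketch in Python captures the setup described above. The field names and the counts of boxes and ramps are illustrative assumptions, not parameters taken from OpenAI's actual environment code.

```python
import random
from dataclasses import dataclass

# Hypothetical configuration mirroring the rules described in the text.
# Field names and default object counts are assumptions for illustration,
# not OpenAI's actual environment parameters.

@dataclass
class HideAndSeekConfig:
    n_hiders: int = 2            # each team had between one and three agents
    n_seekers: int = 2
    episode_length: int = 240    # moves per game
    preparation_steps: int = 96  # seekers frozen while hiders set up
    n_boxes: int = 4             # movable objects that hiders can lock in place
    n_ramps: int = 2             # movable ramps for scaling walls
    arena_size: float = 10.0     # side length of the square arena (arbitrary units)

def sample_config(rng: random.Random) -> HideAndSeekConfig:
    # Team sizes varied from game to game between one and three agents.
    return HideAndSeekConfig(
        n_hiders=rng.randint(1, 3),
        n_seekers=rng.randint(1, 3),
    )

print(sample_config(random.Random(0)))
```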

Seekers were frozen in place for the first 96 moves, giving hiders a chance to, well, hide. Each agent was programmed to maximize its point totals, and points were awarded to all the members of a team at once. Hiders gained a point if they were all concealed, and they lost a point if one of them was seen by a seeker. The point system helped both sides learn better strategies, said Bowen Baker, a researcher on the OpenAI team. Instead of getting points for specific tasks, like moving a box or climbing a ramp, the teams were rewarded for how well they used such tasks to accomplish their goals — to find, or avoid being found.
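
That team-level reward is simple enough to write down. The sketch below is one interpretation of the scheme as described, assuming the seekers receive the mirror image of the hiders' reward (a zero-sum setup the article does not spell out); the function name and types are illustrative.

```python
from typing import List, Tuple

def team_rewards(hider_seen: List[bool]) -> Tuple[float, float]:
    """Per-step team reward, shared by every member of each team.

    hider_seen[i] is True if hider i is currently visible to any seeker.
    Hiders get +1 only when every hider is concealed, -1 otherwise;
    seekers are assumed to receive the opposite (zero-sum) signal.
    There are no points for intermediate acts like moving a box or ramp.
    """
    any_seen = any(hider_seen)
    hider_reward = -1.0 if any_seen else 1.0
    return hider_reward, -hider_reward

# Example: two hiders, one of them spotted on this move.
print(team_rewards([False, True]))  # (-1.0, 1.0)
```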

At first, gameplay was crude. The agents scurried around — sometimes forward, sometimes backward — like confused toddlers. To make the AI agents easy to observe, and to match the playful spirit of the game, the programmers made their creations look like little jelly monsters with oversized heads, wide smiles and bright eyes.

The bots were adorable bumblers. The experiment went through six phases. First, motivated only by the incentive to gain points, the agents learned to chase or run away. This was a sort of pre-tool phase. Then, after about 25 million games, the hiders learned to make forts out of the boxes and walls. After another 75 million games, seekers learned to push a ramp to the edge of a fort, climb up and jump in.

Hiders had to adopt a new strategy. In the fourth phase, after roughly 10 million more games, they learned to lock the ramps in place before sealing themselves inside their forts, leaving seekers with no way in. After those games, the researchers suspected that the program had run its course. But the AI kept on changing — and learning. After hundreds of millions more games came the fifth phase and the introduction of box surfing. During these rounds seekers learned they could still use the locked ramps by moving a box close to one, climbing the ramp and jumping onto the box.

The boxes were too high to scale without a ramp. Once on a box, a bot could move it around the arena while remaining on top of it. This allowed it to effectively ride around, or surf, on the box, searching for hiders.

This gave seekers the advantage of height and mobility. In the sixth and final phase of the game — which emerged after hundreds of millions of rounds — the hiders finally learned to lock the boxes beforehand, preventing the surfing. The OpenAI researchers see these unexpected but advantageous behaviors as proof that their system can discover tasks beyond what was expected, and in a setting with real-world rules. Lange thinks that is a realistic goal: more complex problems in the virtual world could suggest useful applications in the real world.

One way to increase complexity — and see how far self-learning can go — is to increase the number of agents playing the game. Each one will require its own independent algorithm, and the project will require much more computational power. And he says an AI system that can complete increasingly complex tasks raises questions about intelligence itself. During their post-game analysis of hide-and-seek, the OpenAI team devised and ran intelligence tests to see how the AI agents gained and organized knowledge.

Nevertheless, the way the AI agents used self-play and competition to develop tools does look a lot like evolution — of some variety — to some researchers in the field. Joel Leibo, a research scientist at DeepMind, notes that the history of life on Earth is rich with cases in which an innovation or change by one species prompted other species to adapt. Billions of years ago, for example, tiny algaelike creatures pumped the atmosphere full of oxygen, which allowed for the evolution of larger organisms that depend on the gas.

He sees a similar pattern in human culture, which has evolved by introducing and adapting to new standards and practices, from agriculture to the 40-hour workweek to the prominence of social media.

In March 2019, he was part of a quartet of researchers at DeepMind who released a manifesto describing how cooperation and competition in multi-agent AI systems can lead to innovation. In other words: When push comes to shove, shove better. They saw it happen when AlphaGo bested the best human players at Go, and Leibo says the hide-and-seek game offers another robust example. Baker similarly sees parallels between hide-and-seek and natural adaptation: one organism stumbles on a beneficial innovation, and that creates pressure for all the other organisms to adapt.
