Google DeepMind’s AI agents exceed "human-level" gameplay in Quake III CTF

**Commissar SFLUFAN** · July 6, 2018

AI agents continue to rack up wins in the video game world. Last week, OpenAI’s bots were playing Dota 2; this week, it’s Quake III, with a team of researchers from Google’s DeepMind subsidiary successfully training agents that can beat humans at a game of capture the flag.

TwinIon · July 6, 2018

Interesting that they used procedurally generated maps. I can only imagine the strategies that would develop if you allowed an AI like this to play 500k games on a standard map.

Now that I think about it, I wonder if that will become a standard way to find map glitches or loopholes while still in development. Let an AI play a few hundred thousand games and see if it figures out anything interesting.

XxEvil AshxX · July 6, 2018

But do they teabag?

number305 · July 6, 2018

Nothing new here. When I was a kid Mike Tyson's Punch-Out! A.I. used to kick my butt all day long.

Raggit · July 6, 2018

How do they handle the AI's ability to aim? Because that would strongly impact their performance. If they're basically using an aimbot, it's kind of unfair.

Scape Zero · July 6, 2018

3 hours ago, TwinIon said:

Interesting that they used procedurally generated maps. I can only imagine the strategies that would develop if you allowed an AI like this to play 500k games on a standard map.

Now that I think about it, I wonder if that will become a standard way to find map glitches or loopholes while still in development. Let an AI play a few hundred thousand games and see if it figures out anything interesting.

That would probably be a really good idea for finding flaws.

TwinIon · July 6, 2018

2 hours ago, Raggit said:

How do they handle the AI's ability to aim? Because that would strongly impact their performance. If they're basically using an aimbot, it's kind of unfair.

Despite using Q3 as the basis, they don't actually have guns. They can only tag opponents to get them to drop the flag.

They were mostly interested in how multiple AI agents would work together in a team exercise.

The DeepMind site has a better look at the game they played.

Rev · July 6, 2018

3 hours ago, TwinIon said:

Interesting that they used procedurally generated maps. I can only imagine the strategies that would develop if you allowed an AI like this to play 500k games on a standard map.

Now that I think about it, I wonder if that will become a standard way to find map glitches or loopholes while still in development. Let an AI play a few hundred thousand games and see if it figures out anything interesting.

13 minutes ago, Scape Zero said:

That would probably be a really good idea for finding flaws.

This is exactly what they did in The Talos Principle

https://blog.us.playstation.com/2015/09/25/the-talos-principle-on-ps4-designing-ai-to-test-a-game-about-ai/

TwinIon · July 6, 2018

2 minutes ago, Rev said:

This is exactly what they did in The Talos Principle

https://blog.us.playstation.com/2015/09/25/the-talos-principle-on-ps4-designing-ai-to-test-a-game-about-ai/

That's very cool. That's a bit more like a test script for the whole game than the kind of machine learning that DeepMind is doing, but it's a good example of what I was thinking about.

XxEvil AshxX · July 6, 2018

8 minutes ago, Rev said:

This is exactly what they did in The Talos Principle

https://blog.us.playstation.com/2015/09/25/the-talos-principle-on-ps4-designing-ai-to-test-a-game-about-ai/

Too bad the A.I. didn't tell them their game was boring.

legend · July 6, 2018

This is cool work. Using evolutionary algorithms to learn the reward function is an idea that's been explored in the literature before (including in work I was involved in), but not on this scale. Their reports of learning time are a bit unclear and potentially misleading. I'll have to look at the paper, but I think it's 500,000 games for each member of the population (which is huge).

3 hours ago, TwinIon said:

Interesting that they used procedurally generated maps. I can only imagine the strategies that would develop if you allowed an AI like this to play 500k games on a standard map.

Now that I think about it, I wonder if that will become a standard way to find map glitches or loopholes while still in development. Let an AI play a few hundred thousand games and see if it figures out anything interesting.

It actually might not do as well as you'd think. Correlated experience, which is common if you only have one agent learning with its own experience, can really fuck with learning and make it behave badly. Increasingly, you're seeing papers that do these huge distributed learning spaces to try and avoid that issue. That may not have been the motivation for lots of maps here, but there is a reasonable chance.
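To make the "correlated experience" point concrete, here's a minimal sketch. It's a toy and not anything from the DeepMind paper: the random-walk "environment", the `lag1_autocorrelation` helper, and all the sizes are assumptions picked for illustration. It compares what a single actor sees in sequence against one observation taken from each of many independent actors.

```python
import random


def lag1_autocorrelation(xs):
    """Sample correlation between each value and the one that follows it."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    cov = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1))
    return cov / var


def random_walk(rng, steps):
    """Toy stand-in for one actor's stream of observations (hypothetical)."""
    x, path = 0.0, []
    for _ in range(steps):
        x += rng.gauss(0.0, 1.0)
        path.append(x)
    return path


rng = random.Random(0)

# One actor, 2000 consecutive observations: each one is close to the last.
single_actor = random_walk(rng, 2000)

# 2000 independent actors, one observation from each (after 50 steps).
many_actors = [random_walk(rng, 50)[-1] for _ in range(2000)]

sequential_corr = lag1_autocorrelation(single_actor)
parallel_corr = lag1_autocorrelation(many_actors)
print(f"single actor: {sequential_corr:.2f}, many actors: {parallel_corr:.2f}")
```

The single actor's consecutive samples are almost copies of each other, while the batch drawn across independent actors looks close to uncorrelated. That second kind of batch is much closer to what most learning updates assume.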

It's also a direction that grinds my gears a bit. The reinforcement-learning problem is at its heart about learning when you have to suffer the consequences of your actions. This direction of huge numbers of parallel actors is sidestepping that issue and isn't practical for many real-world scenarios.

I do like that they're looking at the evolution angle though, because I've always found the interaction of evolution with internal reward functions interesting. For that reason, I'll cut them some slack in this case.
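For anyone wondering what "evolving a reward function" can look like, here's a toy sketch of the general idea, not the paper's actual method. Everything in it is an assumption for illustration: the event names, the `HIDDEN_BEST` weights, and especially `fake_win_rate`, which stands in for the expensive step of actually training an agent on those internal rewards and then measuring how often it wins.

```python
import random

# Hypothetical game events that an internal reward could assign value to.
EVENTS = ["picked_up_flag", "captured_flag", "tagged_opponent"]

# Made-up "ideal" weights, used only to fake a win rate for this toy.
HIDDEN_BEST = {"picked_up_flag": 0.3, "captured_flag": 1.0, "tagged_opponent": 0.1}


def fake_win_rate(weights):
    """Stand-in for 'train with these rewards, then measure win rate'."""
    error = sum((weights[e] - HIDDEN_BEST[e]) ** 2 for e in EVENTS)
    return 1.0 / (1.0 + error)


def mutate(weights, rng, scale=0.1):
    """Copy a parent's reward weights with small Gaussian perturbations."""
    return {e: w + rng.gauss(0.0, scale) for e, w in weights.items()}


def evolve(population, rng, generations=200):
    """Each generation, the worse half is replaced by mutated copies of the better half."""
    for _ in range(generations):
        ranked = sorted(population, key=fake_win_rate, reverse=True)
        survivors = ranked[: len(ranked) // 2]
        children = [mutate(parent, rng) for parent in survivors]
        population = survivors + children
    return max(population, key=fake_win_rate)


rng = random.Random(0)
start = [{e: rng.uniform(-1.0, 1.0) for e in EVENTS} for _ in range(20)]
best_start = max(fake_win_rate(w) for w in start)
best = evolve(start, rng)
print(f"best fitness before: {best_start:.3f}, after: {fake_win_rate(best):.3f}")
```

The point of the two levels is that the outer loop only ever sees the sparse outcome (did the team win?), while the inner learner gets a denser signal from whatever reward weights it inherited.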

3 hours ago, Raggit said:

How do they handle the AI's ability to aim? Because that would strongly impact their performance. If they're basically using an aimbot, it's kind of unfair.

It's just looking at the image and deciding when to tag, but they do say its reaction time and aiming account for some of its excellent performance, so in further challenges they artificially added some random inaccuracy and it still did well.

legend · July 6, 2018

4 minutes ago, XxEvil AshxX said:

Too bad the A.I. didn't tell them their game was boring.

Psh. The Talos Principle is one of the best puzzle games out there.

Rev · July 6, 2018

2 minutes ago, legend said:

Psh. The Talos Principle is one of the best puzzle games out there.

This is exactly what an AI agent would say.

TwinIon · July 6, 2018

11 minutes ago, legend said:

It actually might not do as well as you'd think. Correlated experience, which is common if you only have one agent learning with its own experience, can really fuck with learning and make it behave badly. Increasingly, you're seeing papers that do these huge distributed learning spaces to try and avoid that issue. That may not have been the motivation for lots of maps here, but there is a reasonable chance.

It's also a direction that grinds my gears a bit. The reinforcement-learning problem is at its heart about learning when you have to suffer the consequences of your actions. This direction of huge numbers of parallel actors is sidestepping that issue and isn't practical for many real-world scenarios.

I'm not sure what you mean when you say the number of parallel actors sidesteps the issue of consequences.

legend · July 6, 2018

16 minutes ago, TwinIon said:

I'm not sure what you mean when you say the number of parallel actors sidesteps the issue of consequences.

So in RL, you have an agent. It doesn't know how to behave, so it takes an action and observes what happens. If that action was bad, too bad. If you want to know how something else would have turned out if you behaved differently, too bad. You don't get to magically reset to where you were. You'll have to somehow work your way back to that scenario (or something similar where similar is something you also have to learn) to see what else could have been. The agent is forced to deal with the consequences of what it did.
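A minimal sketch of that "no magic reset" situation, assuming a made-up corridor environment (the `Corridor` class and its two actions are hypothetical, purely for illustration):

```python
class Corridor:
    """Toy environment (hypothetical): walk right to reach the goal.

    There is deliberately no reset() method. A bad action sends the agent
    back to the start, and it has to walk back to try something else.
    """

    def __init__(self, length=5):
        self.length = length
        self.position = 0
        self.steps_taken = 0

    def step(self, action):
        """Apply 'forward' or 'jump' and return (position, reward)."""
        self.steps_taken += 1
        if action == "forward":
            self.position = min(self.position + 1, self.length)
        elif action == "jump":
            # The consequence: jumping drops the agent back at the start.
            self.position = 0
        else:
            raise ValueError(f"unknown action: {action!r}")
        reward = 1.0 if self.position == self.length else 0.0
        return self.position, reward


env = Corridor(length=5)
for _ in range(3):
    env.step("forward")

# At position 3 the agent tries the untested action and pays for it.
position_after_jump, _ = env.step("jump")

# To learn what 'forward' would have done at position 3, it must first
# spend three more steps getting back there.
for _ in range(3):
    env.step("forward")
position, reward = env.step("forward")
print(position_after_jump, position, reward, env.steps_taken)
```

One curious jump at position 3 costs the agent four extra steps before it can see what the other action would have done from the same spot.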

Because of how challenging that scenario is, a seriously important concept in RL is how the agent balances exploration vs. exploitation. Is the current state a good state to try something new? Or should the agent instead try to exploit what it's learned at this point? This balance, and even how you choose to explore, are huge, challenging topics for which we by and large don't have good answers.
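The simplest textbook way to trade off the two is epsilon-greedy: with a small probability try a random action, otherwise take the one that currently looks best. Here's a minimal sketch on a multi-armed bandit. The arm means, the number of pulls, and the `run_bandit` helper are assumptions for illustration, and this is nowhere near how the Quake agents explore.

```python
import random


def run_bandit(epsilon, true_means, pulls, seed=0):
    """Epsilon-greedy on a Gaussian bandit; returns (average reward, pull counts)."""
    rng = random.Random(seed)
    counts = [0] * len(true_means)
    estimates = [0.0] * len(true_means)
    total = 0.0
    for _ in range(pulls):
        if rng.random() < epsilon:
            # Explore: try any arm, even one that currently looks bad.
            arm = rng.randrange(len(true_means))
        else:
            # Exploit: pick the arm with the best estimate so far.
            arm = max(range(len(true_means)), key=lambda a: estimates[a])
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        # Incremental update of the running mean for the chosen arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total += reward
    return total / pulls, counts


avg_reward, counts = run_bandit(epsilon=0.1, true_means=[0.0, 0.5, 2.0], pulls=5000)
print(f"average reward: {avg_reward:.2f}, pulls per arm: {counts}")
```

Setting `epsilon` to 0 makes the agent purely greedy, which can lock onto a worse arm because it never gathers evidence about the others. Setting it too high wastes pulls on arms it already knows are bad.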

So what are people doing now instead? They've given the agent the ability to clone itself, or versions of itself, at huge scales so that in effect, any single decision isn't all that important and the agents don't have to worry so much about careful and useful exploration.

The way these huge numbers of actors learn is also often not merely equivalent to learning longer, because of how the sequence of learning updates works. But that might require a bit more mathematical discussion to explain.

As I said, I think it's interesting that they were explicitly looking at the evolution of reward functions, and you have to investigate learning across a population to do that. So I'm less down on it here because of that. But other stuff lately has been especially abusive IMO.

Scape Zero · July 6, 2018

1 hour ago, Rev said:

This is exactly what they did in The Talos Principle

https://blog.us.playstation.com/2015/09/25/the-talos-principle-on-ps4-designing-ai-to-test-a-game-about-ai/

That's really cool. I wonder when that will be mainstream.

XxEvil AshxX · July 6, 2018

53 minutes ago, legend said:

Psh. The Talos Principle is one of the best puzzle games out there.

I played it through from start to finish and it was literally the same puzzle mechanic from beginning to end. By the time I finished I was so fuckin' tired of pointing lasers, and the whole sentient A.I. philosophy got old about halfway through. I literally forced myself to finish it.

legend · July 6, 2018

1 hour ago, XxEvil AshxX said:

I played it through from start to finish and it was literally the same puzzle mechanic from beginning to end. By the time I finished I was so fuckin' tired of pointing lasers, and the whole sentient A.I. philosophy got old about halfway through. I literally forced myself to finish it.

It literally was not the same puzzle mechanic from beginning to end. Not only are you being reductive, you're also wrong!

Rev · July 6, 2018

I need to finish that game. The only part that annoyed me was the philosophy part. I'm not sure why I shelved it.

legend · July 6, 2018

1 minute ago, Rev said:

I need to finish that game. The only part that annoyed me was the philosophy part. I'm not sure why I shelved it.

What annoyed me about the philosophy part is I had great answers for all the questions, but they were never options.

In that way, it's a lot like too many modern philosophers who haven't kept up with what we've learned in math since the '50s.

Rev · July 6, 2018

2 minutes ago, legend said:

What annoyed me about the philosophy part is I had great answers for all the questions, but they were never options.

In that way, it's a lot like too many modern philosophers who haven't kept up with what we've learned in math since the '50s.

Yep, exactly the same experience for me. These are topics I've devoted a *ton* of time to thinking about and I was forced to answer badly. Not a deal-breaker, just annoying.

mo1518 · July 9, 2018

I wonder if we'll get to a stage where instead of watching human streamers play games, we'll watch various AIs built by different teams compete against each other. Like battle bots but for video games.

Rev · July 9, 2018

16 minutes ago, mo1518 said:

I wonder if we'll get to a stage where instead of watching human streamers play games, we'll watch various AIs built by different teams compete against each other. Like battle bots but for video games.

It's huge in chess already so maybe.
