DESTROYING Donkey Kong with AI (Deep Reinforcement Learning)

Code Bullet, 34-minute read

Three AI algorithms (a genetic algorithm, NEAT, and PPO) are tested on a hand-built version of Donkey Kong, where the player must avoid barrels and climb ladders. Each algorithm runs into limits as the game's conditions change and its complexity grows. Despite initial struggles, all three improve gradually with training: NEAT evolves neural networks capable of more advanced behaviors, and PPO introduces collective learning to get past obstacles and progress further in the game.

Insights

  • The genetic algorithm optimizes solutions through evolution: it starts with random players, selects parents based on performance, mutates their instructions, and repeats the process. It is a simple but effective way to get through the game.
  • The NEAT algorithm evolves neural networks over generations, allowing for more complex behaviors and strategies. It struggles with jumping, and its success depends on luck because barrel movements are random, which prompts a switch to the more advanced Proximal Policy Optimization (PPO) algorithm.

Recent questions

  • What are the different AI algorithms being tested?

    The three AI algorithms tested are a genetic algorithm, NEAT, and PPO.

Summary

00:00

"AI Algorithms in Game Development Process"

  • Three different AI algorithms are being tested: a genetic algorithm, NeuroEvolution of Augmenting Topologies (NEAT), and Proximal Policy Optimization (PPO).
  • The game development process starts with creating the player character, a square, with intentionally janky movement for an arcade feel.
  • Physics for player movement are coded manually to avoid smooth animations.
  • Ground creation involves handling individual steps rather than smooth slopes.
  • A laser is used to detect ground beneath the player to prevent falling through.
  • Ladders are implemented by checking for player collision and allowing movement up and down.
  • The player character is visually enhanced with sprites and animations.
  • Barrels are programmed to move in a single direction, bounce off walls, and climb down ladders.
  • The genetic algorithm is explained as a process of evolution to optimize solutions through mutation and selection.
  • The algorithm involves creating random players, testing them in the game, selecting parents based on performance, mutating their instructions, and repeating the process for improvement.
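
The loop described in the last two points (random players, test, select parents, mutate, repeat) can be sketched in a few lines of Python. This is a minimal illustration and not the video's actual code: the action names, the mutation rate, the fitness-proportional selection, the choice to carry the best player over unchanged, and the `toy_fitness` stand-in for running the game are all assumptions.

```python
import random

# Hypothetical action set; the video's real instruction set is not given.
ACTIONS = ["left", "right", "up", "down", "jump", "wait"]


def random_player(length):
    """A player is just a fixed list of instructions."""
    return [random.choice(ACTIONS) for _ in range(length)]


def mutate(instructions, rate=0.05):
    """Copy the instructions, replacing each one with probability `rate`."""
    return [random.choice(ACTIONS) if random.random() < rate else action
            for action in instructions]


def select_parent(population, fitnesses):
    """Pick one parent; fitter players are proportionally more likely."""
    return random.choices(population, weights=fitnesses, k=1)[0]


def evolve(fitness_fn, pop_size=100, length=50, generations=30, rate=0.05):
    """Run the select-and-mutate loop and return the best player found."""
    population = [random_player(length) for _ in range(pop_size)]
    for _ in range(generations):
        fitnesses = [fitness_fn(player) for player in population]
        best = population[fitnesses.index(max(fitnesses))]
        # Assumption: the best player survives unchanged; the rest of the
        # next generation are mutated copies of selected parents.
        children = [mutate(select_parent(population, fitnesses), rate)
                    for _ in range(pop_size - 1)]
        population = [best] + children
    fitnesses = [fitness_fn(player) for player in population]
    return population[fitnesses.index(max(fitnesses))]


def toy_fitness(instructions):
    """Stand-in for playing the game: reward moving right (always > 0)."""
    return 1 + instructions.count("right")


if __name__ == "__main__":
    best = evolve(toy_fitness)
    print(toy_fitness(best), best[:10])
```

In a real setup, `toy_fitness` would be replaced by a function that plays the instruction list in the game and scores the result. Because each player is only a fixed instruction list, nothing in it reacts to what is on screen.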

11:51

Evolution of AI in Donkey Kong

  • By generation 10, the players use only a few moves, mostly going right; by generation 51, they start using ladders to avoid barrels.
  • The AI struggles with the concept of barrels initially but eventually learns to jump over them while climbing ladders.
  • The genetic algorithm used is simple to program and reliable, ensuring successful gameplay, but lacks understanding of game elements like barrels and ladders.
  • Because each player only replays its evolved instructions, the algorithm cannot handle randomness: it repeats the same actions and breaks down if conditions change.
  • The neural-network-based algorithm, NEAT, evolves neural networks over generations, allowing for more complex behaviors and strategies.
  • The "augmenting topologies" in NEAT's name means the structure of the network itself evolves, not just its weights, which leads to more advanced behaviors and strategies.
  • Inputs for the Donkey Kong AI include player position, ladder interaction, barrel proximity, and velocity, with the fitness function based on player height and avoiding barrel collisions.
  • NEAT AI gradually learns to climb ladders, with some reaching the second-last level, but success depends on luck due to random barrel movements.
  • After 178 generations, the NEAT AI still struggles with jumping, so its only strategy is to avoid barrels on the way to the top.
  • The NEAT AI's small brain size hinders complex strategies, prompting a transition to a more advanced algorithm, Proximal Policy Optimization (PPO).
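
The inputs and fitness function mentioned above could look roughly like the sketch below. The summary only says that the inputs cover position, ladder interaction, barrel proximity, and velocity, and that fitness is based on height and avoiding barrels. The field names, the normalization, and the penalty factor here are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class PlayerState:
    # All field names are hypothetical; they mirror the inputs listed above.
    x: float                  # horizontal position
    y: float                  # height above the bottom of the level
    vx: float                 # horizontal velocity
    vy: float                 # vertical velocity
    on_ladder: bool           # whether the player is touching a ladder
    barrel_dx: float          # horizontal offset to the nearest barrel
    barrel_dy: float          # vertical offset to the nearest barrel
    max_height: float = 0.0   # highest point reached so far
    hit_barrel: bool = False  # whether the run ended on a barrel


def network_inputs(state, level_width, level_height):
    """Flatten the state into the list of numbers fed to the network."""
    return [
        state.x / level_width,
        state.y / level_height,
        state.vx,
        state.vy,
        1.0 if state.on_ladder else 0.0,
        state.barrel_dx / level_width,
        state.barrel_dy / level_height,
    ]


def fitness(state, barrel_penalty=0.5):
    """Reward height reached; scale it down if the player hit a barrel."""
    score = state.max_height
    if state.hit_barrel:
        score *= barrel_penalty  # assumed penalty, not from the video
    return score
```

NEAT then ranks each evolved network by this score, so a player that climbs higher without touching a barrel is more likely to reproduce.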

23:51

"Collective Brain Controls Game Progress and Challenges"

  • With PPO, all players share a single collective brain: each player sends its experiences back to a main brain, which is updated and in turn controls every player.
  • Falling down ladders is an issue, addressed by penalizing players to deter the behavior.
  • Progress in the game is luck-based, with players learning to avoid obstacles like barrels and improving over time.
  • Training progresses from the second level, where barrels are absent, to the third level where players must learn to jump over barrels.
  • Players struggle at the last stage due to sudden barrel appearances, leading to frequent deaths.
  • Various adjustments were made to improve the algorithm, including changes in neural network size, learning rates, and rewards/penalties.
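
Two of the ideas above can be made concrete with a short sketch: the clipped objective that PPO uses when updating the shared brain from pooled experience, and a shaped reward that penalizes falling down ladders. The clipping rule is the standard PPO one. The reward values and the names `shaped_reward`, `fall_penalty`, and `barrel_penalty` are hypothetical and not taken from the video.

```python
import math


def ppo_clip_objective(new_logps, old_logps, advantages, clip_eps=0.2):
    """Average clipped surrogate objective over a batch of pooled experience.

    new_logps / old_logps are log-probabilities of the taken actions under
    the updated and the data-collecting policy; advantages say how much
    better than expected each action turned out.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        ratio = math.exp(new_lp - old_lp)
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Taking the minimum removes any incentive to move the policy
        # far away from the one that collected the data.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


def shaped_reward(height_gain, fell_down_ladder, hit_barrel,
                  fall_penalty=1.0, barrel_penalty=5.0):
    """Hypothetical reward: climbing is good, falling and barrels are bad."""
    reward = height_gain
    if fell_down_ladder:
        reward -= fall_penalty
    if hit_barrel:
        reward -= barrel_penalty
    return reward
```

During training this objective would be maximized with a gradient-based optimizer over the network's weights. The learning rate, network size, and reward values are the kinds of settings the last point says were adjusted.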