The Challenge

When building a multiplayer shooter game, you need opponents that can challenge players even when no other humans are online. This is where AI agents come in.

Approach

I used Proximal Policy Optimization (PPO), a reinforcement learning algorithm, via its Stable-Baselines3 implementation. The key components:

Environment Design

The game environment was wrapped as a Gymnasium environment, exposing:

  • Observations: Player positions, health, weapon states, enemy locations
  • Actions: Movement, aiming, shooting, abilities
  • Rewards: Damage dealt, kills, staying alive, objective completion

Training Strategy

Training followed a curriculum:

  1. Basic movement: Learn to navigate the map
  2. Aiming: Learn to track and aim at targets
  3. Combat: Learn weapon timing and dodging
  4. Self-play: Train against other AI agents

Weapon Specialists

Instead of one generalist agent, I trained a specialist for each weapon type:

  • SMG Agent: Aggressive, close-range combat
  • Sniper Agent: Positioning, long-range accuracy
  • Shotgun Agent: Flanking, burst damage
  • Pistol Agent: Balanced, precision shots
  • Bow Agent: Prediction, leading targets
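One way to get these distinct styles is to give each specialist its own reward weights over shared game events. The event names and weight values below are assumptions for illustration, not the tuned values from this project:

```python
# Illustrative per-specialist reward weights -- the event names and
# numbers are assumptions, not the project's tuned values.
SPECIALIST_REWARDS = {
    "smg":     {"damage": 1.0, "close_range_bonus": 0.5, "survival": 0.05},
    "sniper":  {"damage": 1.0, "long_range_bonus": 0.5, "hit_accuracy": 1.0},
    "shotgun": {"damage": 1.2, "flank_bonus": 0.5, "burst_kill": 2.0},
    "pistol":  {"damage": 1.0, "hit_accuracy": 0.8, "survival": 0.1},
    "bow":     {"damage": 1.0, "predicted_hit": 1.5},
}


def shaped_reward(weapon: str, events: dict) -> float:
    """Combine one step's game events into a scalar reward for one specialist."""
    weights = SPECIALIST_REWARDS[weapon]
    return sum(weights.get(name, 0.0) * value for name, value in events.items())
```

Events a specialist has no weight for simply contribute nothing, so the same game loop can feed every agent.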

Results

The trained agents can:

  • Navigate complex 3D environments
  • Track and engage moving targets
  • Use weapon-specific strategies
  • Adapt to different opponent playstyles

Lessons Learned

  1. Reward shaping matters: Small changes to the reward function dramatically affect learned behavior
  2. Curriculum learning helps: Starting simple and increasing complexity improved training stability
  3. Self-play is powerful: Agents improved significantly when training against themselves
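The self-play lesson is often implemented as a pool of frozen policy snapshots the learner plays against, mixing the newest snapshot with older ones so it doesn't overfit to a single opponent. This is a common pattern sketched under my own assumptions, not necessarily this project's exact setup:

```python
import copy
import random


class SelfPlayPool:
    """Keep frozen snapshots of past policies and sample opponents from them."""

    def __init__(self, max_size=10):
        self.snapshots = []
        self.max_size = max_size

    def add(self, policy):
        """Freeze a copy of the current policy as a future opponent."""
        self.snapshots.append(copy.deepcopy(policy))
        if len(self.snapshots) > self.max_size:
            self.snapshots.pop(0)  # drop the oldest snapshot

    def sample_opponent(self, latest_bias=0.5):
        """Usually play the newest snapshot, sometimes an older one,
        so the learner improves without forgetting how to beat old selves."""
        if random.random() < latest_bias or len(self.snapshots) == 1:
            return self.snapshots[-1]
        return random.choice(self.snapshots[:-1])
```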

More details on the specific implementation coming in future posts.