Usage Guide

This comprehensive guide will help you get started with pSim, from basic usage to advanced features. pSim provides multiple environment types for different use cases: traditional control algorithms, reinforcement learning with Gymnasium, and multi-agent learning with PettingZoo.

Quick Start

First, make sure you have installed pSim with the appropriate dependencies.

Basic Configuration Setup

pSim automatically manages configuration files. When you first create an environment, it will automatically copy the default configuration to your project root if one doesn't exist.

# Configuration is created automatically - no manual setup required!
from pSim import SimpleVSSSEnv

env = SimpleVSSSEnv()  # Creates game_config.json automatically if needed

The game_config.json file in your project root defines robot behaviors and scenarios.

Environment Types

pSim offers three main environment types:

1. SimpleEnv - Traditional Control

Best for traditional control algorithms and manual testing.

from pSim import SimpleVSSSEnv
import numpy as np

# Create environment
env = SimpleVSSSEnv(
    render_mode="human",  # "human" for GUI, None for headless
    scenario="formation",
    num_agent_robots=3,
    num_adversary_robots=3,
    color_team="blue"
)

# Reset and run
obs, info = env.reset()
print(f"Observation keys: {list(obs.keys())}")

# Control loop
for step in range(1000):
    # Random actions for demonstration [v, w] for each robot
    action = np.random.uniform(-1, 1, (env.num_agent_robots, 2))
    obs, reward, terminated, truncated, info = env.step(action)

    if terminated or truncated:
        obs, info = env.reset()

env.close()

2. VSSSGymEnv - Gymnasium Integration

For reinforcement learning with Gymnasium.

from pSim import VSSSGymEnv
from gymnasium.wrappers import FlattenObservation
import numpy as np

# Create environment with flattened observations
env = FlattenObservation(VSSSGymEnv(
    render_mode="human",
    scenario="formation",
    num_agent_robots=3,
    num_adversary_robots=3,
    color_team="blue"
))

print(f"Action space: {env.action_space}")
print(f"Observation space: {env.observation_space}")

# Reset and run
obs, info = env.reset()
print(f"Observation shape: {obs.shape}")

for step in range(1000):
    action = env.action_space.sample()
    obs, reward, terminated, truncated, info = env.step(action)

    if terminated or truncated:
        obs, info = env.reset()

env.close()

3. VSSSPettingZooEnv - Multi-Agent Learning

For multi-agent reinforcement learning with PettingZoo.

from pSim import VSSSPettingZooEnv
import numpy as np

# Create multi-agent environment
env = VSSSPettingZooEnv(
    render_mode="human",
    scenario="formation",
    num_agent_robots=3,
    num_adversary_robots=3,
    color_team="blue"
)

print(f"Agents: {env.possible_agents}")
print(f"Action space: {env.action_space('agent_0')}")

# Reset and run
obs, info = env.reset()
print(f"Active agents: {env.agents}")

for step in range(1000):
    # Random actions for all agents
    actions = {agent: env.action_space(agent).sample() for agent in env.agents}
    obs, rewards, terminations, truncations, infos = env.step(actions)

    if any(terminations.values()) or any(truncations.values()):
        obs, info = env.reset()

env.close()

Core Concepts

Rendering Modes

pSim supports different rendering modes for various use cases:

  • "human": Displays a graphical window for interactive visualization
  • None: Headless mode for training and automated testing
  • "rgb_array": Returns RGB arrays for custom rendering or recording
# Interactive mode - shows GUI window
env = SimpleVSSSEnv(render_mode="human")

# Headless mode - no GUI, faster for training
env = SimpleVSSSEnv(render_mode=None)

# RGB array mode - for custom processing
env = SimpleVSSSEnv(render_mode="rgb_array")
obs, info = env.reset()
rgb_frame = env.render()  # Returns numpy array with RGB image

Actions and Observations

Action Format

Actions control robot movement as [linear_velocity, angular_velocity] pairs:

# For a single robot
action = [0.5, 0.2]  # Move forward at 0.5 speed, turn right at 0.2 speed

# For multiple robots (SimpleEnv)
actions = [
    [0.5, 0.2],  # Robot 0: forward + right turn
    [-0.3, 0.0], # Robot 1: backward + straight
    [0.0, -0.1]  # Robot 2: stop + left turn
]

Velocity ranges are normalized to -1.0 to 1.0:

  • Linear velocity: forward/backward movement
  • Angular velocity: rotation (positive = right turn, negative = left turn)
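Because actions must stay within the normalized range, it can be useful to clamp raw controller outputs before passing them to step(). A minimal sketch using NumPy (the clamp_action helper below is illustrative, not part of pSim):

```python
import numpy as np

def clamp_action(action):
    """Clamp [v, w] pairs to the normalized range [-1.0, 1.0]."""
    return np.clip(np.asarray(action, dtype=float), -1.0, 1.0)

# A raw controller output that exceeds the allowed range
raw = [[1.7, 0.2], [-0.3, -2.5]]
safe = clamp_action(raw)
print(safe)  # values outside [-1, 1] are clipped to the boundary
```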

Observation Format

Observations contain the complete environment state:

obs = {
    'agent_robots': [
        [x, y, theta, vx, vy, omega],  # Robot 0: position, orientation, velocities
        [x, y, theta, vx, vy, omega],  # Robot 1: ...
    ],
    'adversary_robots': [...],  # Same format for opponent robots
    'ball': [x, y, vx, vy],     # Ball position and velocity
    'field': [...],             # Field boundaries and features
    'game_state': {             # Game status information
        'score': {'blue': 0, 'yellow': 0},
        'time': 0.0,
        'episode_length': 1000
    }
}
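The nested arrays can be unpacked directly. For example, a distance-to-ball computation from an observation in this format (the sample values below are made up for illustration):

```python
import numpy as np

# A hand-built observation following the format described above
obs = {
    'agent_robots': [[0.3, 0.4, 0.0, 0.0, 0.0, 0.0]],  # one robot at (0.3, 0.4)
    'ball': [0.0, 0.0, 0.0, 0.0],                       # ball at the origin
}

# First two entries of each array are the x, y position
robot_x, robot_y = obs['agent_robots'][0][:2]
ball_x, ball_y = obs['ball'][:2]
distance = np.hypot(robot_x - ball_x, robot_y - ball_y)
print(distance)  # 0.5 for this sample data
```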

Core Methods

reset()

Initializes or reinitializes the environment:

obs, info = env.reset()
# obs: Initial observation dictionary
# info: Additional information (usually empty dict)

When to call: At the start of each episode or when you want to restart the simulation.

step(action)

Advances the simulation by one time step:

obs, reward, terminated, truncated, info = env.step(action)

Parameters:

  • action: Control inputs for robots

Returns:

  • obs: New observation after the action
  • reward: Scalar reward value
  • terminated: True if episode ended naturally (goal scored, etc.)
  • truncated: True if episode was cut short (time limit, etc.)
  • info: Additional diagnostic information

Termination Conditions

  • terminated = True: Episode ended due to game logic (goal scored, out of bounds, etc.)
  • truncated = True: Episode ended due to external constraints (time limit, manual stop, etc.)
# Always check both conditions
if terminated or truncated:
    obs, info = env.reset()  # Start new episode

Reward System Customization

pSim allows you to customize the reward system by subclassing the RewardSystem class and injecting it into your environment. This provides full control over the reward calculation logic.

Creating a Custom Reward System

To create a custom reward system, inherit from RewardSystem and override the calculate_reward method.

from pSim.modules.env_description import RewardSystem
import numpy as np

class CustomRewardSystem(RewardSystem):
    """Custom reward system example."""

    def calculate_reward(self) -> tuple[float, bool, bool]:
        """
        Calculate custom reward.
        Returns: (reward, terminated, truncated)
        """
        # Access simulator state via self.simulator
        ball = self.simulator.ball_body
        agent = self.simulator.robots_agent[0]

        # Example: Simple distance-based reward
        distance = np.linalg.norm(agent.position - ball.position)
        reward = -distance

        # Check termination (e.g., goal scored)
        # You can use self.simulator.contact_listener or check positions
        terminated = False
        if distance < 0.1:
            reward += 10
            terminated = True

        # Check truncation (time limit)
        truncated = self.time_step >= self.truncated_time

        return reward, terminated, truncated

Using Custom Rewards

To use your custom reward system, you need to create a custom environment class that initializes it.

from pSim import VSSSGymEnv
from my_custom_rewards import CustomRewardSystem

class CustomVSSSGymEnv(VSSSGymEnv):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # Override the default reward system
        self.reward_system = CustomRewardSystem(
            self.simulator, 
            truncated_time=kwargs.get('truncated_time', 600)
        )

# Usage
env = CustomVSSSGymEnv(render_mode="human", scenario="formation")

For a complete working example, see examples/vsss_gym_custom_obs_reward.py.

Configuration System

game_config.json Structure

The game_config.json file is the central configuration for pSim environments. It defines scenarios, robot behaviors, and initial conditions:

{
  "scenarios": {
    "scenario_name": {
      "ball_position_type": "fixed|range",
      "ball_position": [x, y],
      "ball_position_range": {"x": [min, max], "y": [min, max]},
      "ball_velocity": [vx, vy],

      "agent_robots": {
        "position_type": "fixed|range",
        "positions": [[x, y, theta], ...],
        "position_range": {"x": [min, max], "y": [min, max], "angle": [min, max]},
        "movement_types": ["action|ou|no_move", ...]
      },

      "adversary_robots": {
        "position_type": "fixed|range",
        "positions": [[x, y, theta], ...],
        "position_range": {"x": [min, max], "y": [min, max], "angle": [min, max]},
        "movement_types": ["action|ou|no_move", ...]
      }
    }
  }
}
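A scenario entry can be built and round-tripped with the standard json module. A minimal sketch following the structure above (the scenario name "my_scenario" and the single-robot setup are just examples):

```python
import json

# Build a minimal scenario following the documented structure
config = {
    "scenarios": {
        "my_scenario": {
            "ball_position_type": "fixed",
            "ball_position": [0.0, 0.0],
            "agent_robots": {
                "position_type": "fixed",
                "positions": [[-0.1, 0.0, 0.0]],
                "movement_types": ["action"],
            },
        }
    }
}

# Round-trip through JSON, as game_config.json would store it
text = json.dumps(config, indent=2)
loaded = json.loads(text)
print(loaded["scenarios"]["my_scenario"]["ball_position"])  # [0.0, 0.0]
```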

Movement Types

pSim uses three movement types for robots:

  • "action": Robot controlled by your agent/learning algorithm
  • "ou": Robot driven by an Ornstein-Uhlenbeck process, which generates realistic, correlated random movement
  • "no_move": Robot stays stationary
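To give an intuition for what "ou" movement looks like, here is a minimal Ornstein-Uhlenbeck sketch. The parameter names and values are illustrative, not pSim's internals:

```python
import numpy as np

def ou_step(x, theta=0.15, mu=0.0, sigma=0.2, dt=0.05, rng=None):
    """One Euler step of an Ornstein-Uhlenbeck process:
    dx = theta * (mu - x) * dt + sigma * sqrt(dt) * noise.
    The mean-reverting term keeps successive samples correlated."""
    if rng is None:
        rng = np.random.default_rng()
    return x + theta * (mu - x) * dt + sigma * np.sqrt(dt) * rng.standard_normal(x.shape)

rng = np.random.default_rng(0)
action = np.zeros(2)  # [v, w]
trajectory = []
for _ in range(100):
    action = ou_step(action, rng=rng)
    trajectory.append(np.clip(action, -1.0, 1.0))  # keep in the valid range

print(np.asarray(trajectory).shape)  # (100, 2)
```

Unlike independent uniform noise, consecutive actions drift smoothly, which is what makes OU-driven robots move in a realistic way.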

Position Types

Fixed Positions

{
  "position_type": "fixed",
  "positions": [
    [-0.1, 0.0, 0.0],    // Robot 0: x, y, angle
    [-0.2, 0.2, 0.0],    // Robot 1: x, y, angle
    [-0.7, 0.0, -1.571]  // Robot 2: x, y, angle
  ]
}

Random Positions

{
  "position_type": "range",
  "position_range": {
    "x": [-0.7, 0.7],      // X coordinate range
    "y": [-0.6, 0.6],      // Y coordinate range
    "angle": [-3.14159, 3.14159]  // Angle range (radians)
  }
}
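Conceptually, a "range" entry is sampled uniformly each time the scenario is set up. A rough sketch of that sampling using Python's random module (the sample_pose helper is illustrative, not pSim's implementation):

```python
import random

position_range = {
    "x": [-0.7, 0.7],
    "y": [-0.6, 0.6],
    "angle": [-3.14159, 3.14159],
}

def sample_pose(position_range):
    """Draw one [x, y, angle] pose uniformly from the configured ranges."""
    return [random.uniform(*position_range[key]) for key in ("x", "y", "angle")]

random.seed(0)
pose = sample_pose(position_range)
print(pose)  # a random [x, y, angle] within the configured ranges
```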

Default Scenarios

pSim includes several built-in scenarios:

Formation Scenario

Standard 3v3 formation with mixed control:

{
  "ball_position_type": "fixed",
  "ball_position": [0.0, 0.0],
  "agent_robots": {
    "position_type": "fixed",
    "positions": [[-0.1, 0.0, 0.0], [-0.2, 0.2, 0.0], [-0.7, 0.0, -1.571]],
    "movement_types": ["action", "action", "action"]
  },
  "adversary_robots": {
    "position_type": "fixed",
    "positions": [[0.25, 0.0, 3.14159], [0.2, -0.2, 3.14159], [0.7, 0.0, 1.571]],
    "movement_types": ["action", "ou", "ou"]
  }
}

Full Random Scenario

Complete randomization for robust training:

{
  "ball_position_type": "range",
  "ball_position_range": {"x": [-0.7, 0.7], "y": [-0.6, 0.6]},
  "ball_velocity": [0.6, 0.6],
  "agent_robots": {
    "position_type": "range",
    "position_range": {
      "x": [-0.7, 0.0],
      "y": [-0.6, 0.6],
      "angle": [-3.14159, 3.14159]
    },
    "movement_types": ["action", "ou", "ou"]
  },
  "adversary_robots": {
    "position_type": "range",
    "position_range": {
      "x": [0.0, 0.7],
      "y": [-0.6, 0.6],
      "angle": [-3.14159, 3.14159]
    },
    "movement_types": ["ou", "ou", "ou"]
  }
}

Dynamic Configuration

You can change robot behaviors at runtime:

from pSim import SimpleVSSSEnv

env = SimpleVSSSEnv(scenario="formation", num_agent_robots=3)

# Make robot 2 controllable
env.set_robot_controlled('agent', 2, True)

# Switch robot 1 back to automatic
env.set_robot_controlled('agent', 1, False)

# Toggle robot behavior
new_type = env.toggle_robot_control('adversary', 0)

Human-Machine Interface (HMI)

pSim includes a sophisticated HMI system for manual control with keyboard and joystick support.

Keyboard Controls

  • W/S: Forward/backward movement
  • A/D: Turn left/right
  • E/Q: Next/previous robot
  • X/Y: Switch to Blue/Yellow team
  • B: Toggle ball control mode
  • R: Reset environment
  • ESC: Exit

Joystick Controls (Universal Mapping)

  • Left Stick Y: Forward/backward
  • Left Stick X: Turn left/right (robot) or strafe (ball)
  • Right Stick X: Turn left/right (robot only)
  • RB/LB: Next/previous robot
  • X/Y: Switch to Blue/Yellow team
  • B: Toggle ball control mode
  • BACK: Reset environment
  • START: Exit

HMI Example

from pSim import SimpleVSSSEnv, HMI
import numpy as np

env = SimpleVSSSEnv(render_mode="human", num_agent_robots=3)
hmi = HMI()

obs, info = env.reset()

while hmi.active:
    # HMI returns a dictionary with control state
    control_state = hmi()

    if not control_state['active']:
        break

    if control_state['reset_commanded']:
        obs, info = env.reset()
        continue

    # Get actions from HMI state
    actions = control_state['actions']

    # Apply actions to controllable robots
    # For SimpleEnv with multiple robots, you might need to route actions
    # This is handled internally in main.py, but here is a simplified view:
    env_actions = np.zeros((env.num_agent_robots, 2))
    current_robot = control_state['current_robot_id']
    if current_robot < env.num_agent_robots:
        env_actions[current_robot] = actions

    obs, reward, terminated, truncated, info = env.step(env_actions)

    if terminated or truncated:
        obs, info = env.reset()

hmi.quit()
env.close()

Advanced Multi-Robot Control

The HMI automatically handles robot and team switching. You can access the current selection from the returned state:

control_state = hmi()
current_team = control_state['current_team']
current_robot_id = control_state['current_robot_id']
ball_mode = control_state['ball_control_mode']

Troubleshooting

Common Issues

Configuration not found

  • Configuration is created automatically when you first create an environment
  • Check that game_config.json exists in your project root
  • Ensure you have write permissions in your project directory

Wrong robot count

  • Check that movement_types arrays match num_agent/adversary_robots in your game_config.json
  • The configuration file is created automatically in your project root

Permission errors

  • Make sure you can write to your project directory
  • The automatic configuration creation requires write access

Environment won't start

  • Check that all dependencies are installed
  • Verify your Python version (3.12+ recommended)
  • Try running with render_mode=None first

Controller not detected

  • Try unplugging and replugging your controller
  • Check that pygame is properly installed
  • Some controllers may need additional drivers

Getting Help

  • Check the API Reference for detailed documentation
  • Review the examples in the examples/ directory
  • Configuration is managed automatically - no manual setup required