TQC Agent for PandaPickAndPlace-v1 (sb3)

Introduction

The TQC agent for the PandaPickAndPlace-v1 task is a deep reinforcement learning model trained with the Stable Baselines3 library (via its sb3-contrib extension) and the RL Zoo training framework. The model targets the goal-conditioned pick-and-place task from the panda-gym robotics environments.

Architecture

The agent is built on the Stable Baselines3 ecosystem and uses Truncated Quantile Critics (TQC), a distributional reinforcement learning algorithm provided by the sb3-contrib package. TQC extends Soft Actor-Critic with an ensemble of quantile critics and controls overestimation bias by truncating the highest return quantiles when forming the critic target. The policy is defined as MultiInputPolicy, which handles the dictionary observations of the goal-conditioned Panda environment, and training uses Hindsight Experience Replay (HER) to learn from sparse rewards.
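As a quick illustration of why a MultiInputPolicy is required: the goal-conditioned Panda environments return dictionary observations rather than flat vectors. The sketch below is illustrative only and assumes panda-gym 1.x with the classic gym API (matching the -v1 environment id); exact behaviour depends on your installed versions.

    import gym
    import panda_gym  # noqa: F401 -- importing registers the Panda* environments

    # PandaPickAndPlace-v1 is goal-conditioned: observations are dictionaries,
    # which is why a MultiInputPolicy is used instead of a plain MlpPolicy.
    env = gym.make("PandaPickAndPlace-v1")
    obs = env.reset()
    print(sorted(obs.keys()))  # ['achieved_goal', 'desired_goal', 'observation']
    env.close()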

Training

The model has been trained using the RL Zoo, which provides a comprehensive framework for training Stable Baselines3 agents. Key hyperparameters used during training are listed below; an illustrative sketch of an equivalent sb3-contrib call follows the list.

  • batch_size: 2048
  • buffer_size: 1,000,000
  • gamma: 0.95
  • learning_rate: 0.001
  • n_timesteps: 1,000,000
  • policy_kwargs: dict(net_arch=[512, 512, 512], n_critics=2)
  • Replay buffer strategy: HER with online sampling and future goal selection
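For illustration only, these settings correspond roughly to the sb3-contrib call sketched below. This is a minimal sketch assuming SB3 1.x, sb3-contrib, and panda-gym 1.x; the HER keyword arguments shown (for example n_sampled_goal=4) are assumptions for the sketch, not values taken from the published configuration.

    import gym
    import panda_gym  # noqa: F401 -- registers the Panda* environments
    from stable_baselines3 import HerReplayBuffer
    from sb3_contrib import TQC

    env = gym.make("PandaPickAndPlace-v1")

    model = TQC(
        "MultiInputPolicy",
        env,
        batch_size=2048,
        buffer_size=1_000_000,
        gamma=0.95,
        learning_rate=1e-3,
        policy_kwargs=dict(net_arch=[512, 512, 512], n_critics=2),
        replay_buffer_class=HerReplayBuffer,
        # HER with online sampling and "future" goal selection, as listed above;
        # n_sampled_goal=4 is an assumed value for this sketch.
        replay_buffer_kwargs=dict(
            n_sampled_goal=4,
            goal_selection_strategy="future",
            online_sampling=True,
        ),
        verbose=1,
    )
    model.learn(total_timesteps=1_000_000)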

Guide: Running Locally

  1. Download the Model:
    Use the following command to download the model and save it to the logs/ directory:

    python -m rl_zoo3.load_from_hub --algo tqc --env PandaPickAndPlace-v1 -orga sb3 -f logs/
    
  2. Run the Model:
    To run the model and see the agent in action, execute the command below (a programmatic loading sketch in Python is also given after this guide):

    python enjoy.py --algo tqc --env PandaPickAndPlace-v1 -f logs/
    
  3. Training the Model Locally:
    If you wish to train the model locally, use:

    python train.py --algo tqc --env PandaPickAndPlace-v1 -f logs/
    
  4. Uploading the Model:
    To upload the trained model to the Hugging Face Hub and, when possible, generate a replay video:

    python -m rl_zoo3.push_to_hub --algo tqc --env PandaPickAndPlace-v1 -f logs/ -orga sb3
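As an alternative to enjoy.py, the downloaded agent can also be loaded directly with sb3-contrib. This is a minimal sketch assuming SB3 1.x, sb3-contrib, and panda-gym 1.x; the zip path is a hypothetical example of where load_from_hub may place the file, so adjust it to whatever actually appears under logs/.

    import gym
    import panda_gym  # noqa: F401 -- registers the Panda* environments
    from sb3_contrib import TQC

    # Hypothetical path: check the folder that load_from_hub created under logs/.
    model = TQC.load("logs/tqc/PandaPickAndPlace-v1_1/PandaPickAndPlace-v1.zip")

    env = gym.make("PandaPickAndPlace-v1", render=True)
    obs = env.reset()
    for _ in range(1000):
        action, _states = model.predict(obs, deterministic=True)
        obs, reward, done, info = env.step(action)
        if done:
            obs = env.reset()
    env.close()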
    

Cloud GPU Suggestions

For optimal performance, consider using cloud GPUs such as those provided by AWS, Google Cloud, or Azure, which can significantly speed up the training and inference processes.

License

The model and associated code are distributed under the licenses of the Stable Baselines3 and RL Baselines3 Zoo repositories. For detailed information, refer to the respective GitHub repositories and their LICENSE files.
