# HIRO

**Repository Path**: www2171668/HIRO

## Basic Information

- **Project Name**: HIRO
- **Description**: 123123123123
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: HIRO
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2021-09-29
- **Last Updated**: 2023-02-19

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Overview

An implementation of [Data-Efficient Hierarchical Reinforcement Learning](https://arxiv.org/pdf/1805.08296.pdf) (HIRO) in PyTorch.

![Demonstration](media/demo.gif)

# Installation

1. Follow the installation steps for [OpenAI Gym Mujoco](https://github.com/openai/mujoco-py):

```
1. Obtain a 30-day free trial on the MuJoCo website or free license if you are a student. The license key will arrive in an email with your username and password.
2. Download the MuJoCo version 2.0 binaries for Linux or OSX.
3. Unzip the downloaded mujoco200 directory into ~/.mujoco/mujoco200, and place your license key (the mjkey.txt file from your email) at ~/.mujoco/mjkey.txt.
```

2. Install the dependencies:

```
pip install -r requirements.txt
```

# Run

For `HIRO`,

```
python main.py --train
```

For `TD3`,

```
python main.py --train --td3
```

# Evaluate Trained Model

Passing the `--eval` argument loads the most recently saved model parameters and starts playing. The goal is to reach the position (0, 16), which is the top-left corner.

For `HIRO`,

```
python main.py --eval
```

For `TD3`,

```
python main.py --eval --td3
```

# Training Results

In every plot, blue is HIRO and orange is TD3.

## Success Rate

Plot: `Success_Rate`

## Reward

Plot: `reward_Reward`

## Intrinsic Reward

Plot: `reward_Intrinsic_Reward`

## Losses

- Higher Controller Actor: plot `loss_actor_loss_high`
- Higher Controller Critic: plot `loss_critic_loss_high`
- Lower Controller Actor: plot `loss_actor_loss_low`
- Lower Controller Critic: plot `loss_critic_loss_low`
- TD3 Controller Actor: plot `loss_actor_loss_td3`
- TD3 Controller Critic: plot `loss_critic_loss_td3`
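
# Intrinsic Reward Sketch

The intrinsic reward tracked in the training results is defined in the HIRO paper as the negative Euclidean distance between the state the higher controller asked for (`s + g`) and the state actually reached (`s'`). The paper also re-expresses the goal after each step as `s + g - s'`, so that it keeps pointing at the same absolute target. The snippet below is a minimal sketch of those two formulas in plain Python. It is not the code in this repository: the function names `intrinsic_reward`, `goal_transition`, and `is_success` are hypothetical, and the success threshold of `5.0` around the target (0, 16) is an assumption, not a value read from `main.py`.

```python
import math
from typing import List, Sequence


def intrinsic_reward(state: Sequence[float],
                     goal: Sequence[float],
                     next_state: Sequence[float]) -> float:
    """Lower-controller reward: -||s + g - s'||_2 (formula from the HIRO paper)."""
    squared = sum((s + g - n) ** 2 for s, g, n in zip(state, goal, next_state))
    return -math.sqrt(squared)


def goal_transition(state: Sequence[float],
                    goal: Sequence[float],
                    next_state: Sequence[float]) -> List[float]:
    """Relative goal for the next step: h(s, g, s') = s + g - s'."""
    return [s + g - n for s, g, n in zip(state, goal, next_state)]


def is_success(position: Sequence[float],
               target: Sequence[float] = (0.0, 16.0),
               threshold: float = 5.0) -> bool:
    """Hypothetical success check; the threshold is an assumed value."""
    squared = sum((p - t) ** 2 for p, t in zip(position, target))
    return math.sqrt(squared) <= threshold


if __name__ == "__main__":
    s, g, s_next = [0.0, 0.0], [3.0, 4.0], [1.0, 1.0]
    print(intrinsic_reward(s, g, s_next))  # negative distance left to the goal
    print(goal_transition(s, g, s_next))   # goal re-expressed relative to s_next
    print(is_success([0.5, 15.0]))         # close to the target (0, 16)
```

The reward is zero only when the lower controller lands exactly on `s + g`, and it grows more negative the further the agent ends up from the requested state.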