This is a minimal MuJoCo simulation environment for training with SERL. The environment consists of a Franka Panda robot arm and a cube; the goal is to lift the cube to a target position. The environment is implemented with `franka_sim` and exposes a `gym` interface.
**Install Franka Sim library**

```bash
cd franka_sim
pip install -e .
pip install -r requirements.txt
```
Check that `franka_sim` runs via `python franka_sim/franka_sim/test/test_gym_env_human.py`.

Before beginning, please make sure that the simulation environment with `franka_sim` is working.
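For an additional scripted sanity check, a minimal loop over the gym interface might look like the sketch below. The env id `PandaPickCube-v0`, the register-on-import behavior, and the 5-tuple `step` return are assumptions; check `franka_sim`'s environment registration and your gym version for the exact names and signatures.

```python
# Minimal sketch: drive the lift-cube task through the gym interface.
import gym
import franka_sim  # noqa: F401 -- assumed to register the Panda envs on import

env = gym.make("PandaPickCube-v0")  # env id is an assumption
obs, info = env.reset()
for _ in range(200):
    action = env.action_space.sample()  # random actions, just to exercise the env
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()
env.close()
```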
Note: set `MUJOCO_GL` to `egl` if you are doing off-screen rendering. You can do so with `export MUJOCO_GL=egl`, and remember to set the rendering argument to `False` in the script.
If you receive a `Cannot initialize a EGL device display` error due to `GLIBCXX` not found, try running `conda install -c conda-forge libstdcxx-ng` (ref).
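Equivalently, the environment variable can be set from Python, as long as this happens before MuJoCo chooses its rendering backend (most simply, before `mujoco` or the env is imported), for example:

```python
import os

# Choose the EGL backend for off-screen rendering; set this before MuJoCo
# picks its GL backend.
os.environ.setdefault("MUJOCO_GL", "egl")
```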
Optionally install `tmux`: `sudo apt install tmux`
✨ One-liner launcher (requires `tmux`) ✨

```bash
bash examples/async_sac_state_sim/tmux_launch.sh
```

To kill the tmux session, run `tmux kill-session -t serl_session`.
You can opt for running the commands individually in 2 different terminals.

```bash
cd examples/async_sac_state_sim
```

Run learner node:

```bash
bash run_learner.sh
```

Run actor node with rendering window:

```bash
# add --ip x.x.x.x if running on a different machine
bash run_actor.sh
```
You can optionally launch the learner and actor on separate machines. For example, if the learner node is running on a PC with `ip=x.x.x.x`, you can launch the actor node on a different machine with network access to `ip=x.x.x.x` and add `--ip x.x.x.x` to the commands in `run_actor.sh`.
Remove the `--debug` flag in `run_learner.sh` to upload training stats to `wandb`.
The image-observation (DrQ) example follows the same pattern.

✨ One-liner launcher (requires `tmux`) ✨

```bash
bash examples/async_drq_sim/tmux_launch.sh
```
You can opt for running the commands individually in 2 different terminals.

```bash
cd examples/async_drq_sim

# to use pre-trained ResNet weights, please download them first
wget https://github.com/rail-berkeley/serl/releases/download/resnet10/resnet10_params.pkl
```
Run learner node:

```bash
bash run_learner.sh
```

Run actor node with rendering window:

```bash
# add --ip x.x.x.x if running on a different machine
bash run_actor.sh
```
The RLPD example (image observations plus demo trajectories) follows the same pattern.

✨ One-liner launcher (requires `tmux`) ✨

```bash
bash examples/async_drq_sim/tmux_rlpd_launch.sh
```
You can opt for running the commands individually in 2 different terminals.

```bash
cd examples/async_drq_sim

# to use pre-trained ResNet weights, please download them first
# note: manual download is only needed for now; once the repo is public, auto-download will work
wget https://github.com/rail-berkeley/serl/releases/download/resnet10/resnet10_params.pkl

# download 20 demo trajectories
wget \
https://github.com/rail-berkeley/serl/releases/download/franka_sim_lift_cube_demos/franka_lift_cube_image_20_trajs.pkl
```
Run learner node, providing the path to the demo trajectories via the `--demo_path` argument:

```bash
bash run_learner.sh --demo_path franka_lift_cube_image_20_trajs.pkl
```
Run actor node with rendering window:

```bash
# add --ip x.x.x.x if running on a different machine
bash run_actor.sh
```
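Optionally, you may want to sanity-check the downloaded demo file before training. The sketch below only uses standard pickle; the internal layout of the file (e.g. a list of trajectories or transition dicts) is an assumption, so adjust the printing to whatever structure you find.

```python
# Peek at the downloaded demo trajectories file.
import pickle

with open("franka_lift_cube_image_20_trajs.pkl", "rb") as f:
    demos = pickle.load(f)

print(type(demos))
if hasattr(demos, "__len__") and len(demos):
    print(f"{len(demos)} items; first item type: {type(demos[0])}")
```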
This provides a way to save and load trajectories for SERL training. The TensorFlow RLDS dataset format is used to save and load trajectories. This standard is compliant with the RTX datasets, which can potentially be used for other robot learning tasks.
This requires additional installation of `oxe_envlogger`:

```bash
git clone git@github.com:rail-berkeley/oxe_envlogger.git
cd oxe_envlogger
pip install -e .
```
**Save the trajectories**

With the example above, we can save the data from the replay buffer by providing the `--log_rlds_path` argument. This will save the data to the specified path.

```bash
./run_learner.sh --log_rlds_path /path/to/save
```
This will save the data to the specified path in the following format:

```
- /path/to/save
    - dataset_info.json
    - features.json
    - serl_rlds_dataset-train.tfrecord-00000
    - serl_rlds_dataset-train.tfrecord-00001
    ...
```
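For reference, a dataset saved in this layout can typically be opened directly with `tensorflow_datasets` for inspection. This is a minimal sketch assuming a standard RLDS episode/`steps` structure, not part of the SERL scripts themselves:

```python
# Inspect a saved RLDS dataset directory.
import tensorflow_datasets as tfds

builder = tfds.builder_from_directory("/path/to/save")
ds = builder.as_dataset(split="train")

for episode in ds.take(1):
    # RLDS stores each episode's transitions under a nested "steps" dataset.
    for step in episode["steps"].take(3):
        print(list(step.keys()))
```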
**Load the trajectories**

With the example above, we can load the data into the replay buffer by providing the `--preload_rlds_path` argument. This will load the data from the specified path.

```bash
./run_learner.sh --preload_rlds_path /path/to/load
```
This is similar to the `examples/async_rlpd_drq_sim/run_learner.sh` script, which uses the `--demo_path` argument to load .pkl offline demo trajectories.
The learner's batch size can be adjusted by adding the `--batch_size` argument to the `run_learner.sh` script, for example `bash run_learner.sh --batch_size 64`.

A custom transform can be applied to the loaded data via the `data_transform(data, metadata) -> data` function in the `examples/async_drq_sim/async_drq_sim.py` script.
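As an illustration, a custom transform following that signature might look like the hypothetical sketch below; the transition layout (a dict with a scalar `rewards` entry) is an assumption and should be matched to the actual format used in `async_drq_sim.py`.

```python
# Hypothetical data_transform(data, metadata) -> data sketch: clip the reward
# of each preloaded transition into [0, 1] before it enters the replay buffer.
def data_transform(data, metadata):
    if "rewards" in data:  # key name and scalar reward are assumptions
        data["rewards"] = min(max(float(data["rewards"]), 0.0), 1.0)
    return data
```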