marllib 1.0.3


MARLlib: A Scalable and Efficient Multi-agent Reinforcement Learning Library










News:
We are excited to announce that a major update has just been released. For details, please refer to the version info.

Multi-agent Reinforcement Learning Library (MARLlib) is a MARL library built on Ray and its reinforcement learning toolkit, RLlib. It offers a comprehensive platform for developing, training, and testing MARL algorithms across various tasks and environments.
Here's an example of how MARLlib can be used:
from marllib import marl

# prepare env
env = marl.make_env(environment_name="mpe", map_name="simple_spread")

# initialize algorithm with appointed hyper-parameters
mappo = marl.algos.mappo(hyperparam_source='mpe')

# build agent model based on env + algorithms + user preference
model = marl.build_model(env, mappo, {"core_arch": "gru", "encode_layer": "128-256"})

# start training
mappo.fit(env, model, stop={'timesteps_total': 1000000}, share_policy='group')

# ready to control
mappo.render(env, model, share_policy='group', restore_path='path_to_checkpoint')

Why MARLlib?
Here we provide a table comparing MARLlib with existing MARL libraries.



| Library | Supported Env | Algorithms | Parameter Sharing | Model |
|---------|---------------|------------|-------------------|-------|
| PyMARL | 1 cooperative | 5 | share | GRU |
| PyMARL2 | 2 cooperative | 11 | share | MLP + GRU |
| MAPPO Benchmark | 4 cooperative | 1 | share + separate | MLP + GRU |
| MAlib | 4 self-play | 10 | share + group + separate | MLP + LSTM |
| EPyMARL | 4 cooperative | 9 | share + separate | GRU |
| MARLlib | 12 (no task mode restriction) | 18 | share + group + separate + customizable | MLP + CNN + GRU + LSTM |





A second table in the source compares these libraries by GitHub stars, documentation, open issues, activity, and last update; most of its cells are live badges that do not survive as text. In that table, PyMARL, PyMARL2, the MAPPO Benchmark, and EPyMARL are marked :x: for documentation.


Key features
:beginner: MARLlib offers several key features that make it stand out:

- MARLlib unifies diverse algorithm pipelines with agent-level distributed dataflow, allowing researchers to develop, test, and evaluate MARL algorithms across different tasks and environments.
- MARLlib supports all task modes, including cooperative, collaborative, competitive, and mixed, making it easier to train and evaluate MARL algorithms across a wide range of tasks.
- MARLlib provides a new interface that follows the structure of Gym, making it easier to work with multi-agent environments.
- MARLlib provides flexible and customizable parameter-sharing strategies, allowing researchers to optimize their algorithms for different tasks and environments.

:rocket: Using MARLlib, you can take advantage of various benefits, such as:

- Zero knowledge of MARL: MARLlib provides 18 pre-built algorithms with an intuitive API, allowing researchers to start experimenting with MARL without prior knowledge of the field.
- Support for all task modes: MARLlib supports almost all multi-agent environments, making it easier to experiment with different task modes.
- Customizable model architecture: Researchers can choose their preferred model architecture from the model zoo, or build their own.
- Customizable policy sharing: MARLlib provides grouping options for policy sharing, or researchers can create their own (see the sketch after this list).
- Access to over a thousand released experiments: Researchers can consult the released experiments to see how others have used MARLlib.
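As a minimal sketch of the last two points, reusing only the calls shown elsewhere in this README (the LSTM core and the 'group' sharing level are illustrative choices, not defaults):

from marllib import marl

# illustrative: pick a non-default core architecture and a policy-sharing level
env = marl.make_env(environment_name="mpe", map_name="simple_spread")
mappo = marl.algos.mappo(hyperparam_source="mpe")
# swap the recurrent core for an LSTM while keeping a two-layer encoder spec
model = marl.build_model(env, mappo, {"core_arch": "lstm", "encode_layer": "128-256"})
# 'group' shares one policy per agent group; 'all' and 'individual' are the other options
mappo.fit(env, model, stop={"timesteps_total": 1000000}, share_policy="group")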

Installation

Note:
Currently MARLlib supports Linux only.

Step-by-step (recommended)

1. install dependencies
2. install environments
3. install patches

1. install dependencies (basic)
First, install the MARLlib dependencies to guarantee basic usage; then install the environments you need following the guide below, and finally install the patches for RLlib.
$ conda create -n marllib python=3.8 # or 3.9
$ conda activate marllib
$ git clone https://github.com/Replicable-MARL/MARLlib.git && cd MARLlib
$ pip install -r requirements.txt

2. install environments (optional)
Please follow this guide.
3. install patches (basic)
Fix bugs in RLlib by applying the patches with the following commands:
$ cd /Path/To/MARLlib/marl/patch
$ python add_patch.py -y

PyPI
$ pip install --upgrade pip
$ pip install marllib
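After either installation path, a quick smoke test can confirm that the package imports and can build an environment. This is a minimal sketch reusing the MPE example from this README; it assumes the optional MPE dependencies from step 2 are installed:

from marllib import marl

# smoke test: import MARLlib and construct the MPE simple_spread environment
env = marl.make_env(environment_name="mpe", map_name="simple_spread")
print("environment ready:", env)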

Getting started

Prepare the configuration
Four groups of configuration take charge of the whole training process:

- scenario: specify the environment/task settings
- algorithm: choose the hyperparameters of the algorithm
- model: customize the model architecture
- ray/rllib: change the basic training settings




Before training, ensure all the parameters are set correctly, especially those you don't want to change.

Note:
You can also modify all the pre-set parameters via the MARLlib API.
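As a sketch of such API-level overrides, combining only arguments that appear in the tables later in this README (the concrete values are illustrative):

from marllib import marl

# override pre-set parameters through the API instead of editing config files
env = marl.make_env(environment_name="mpe", map_name="simple_spread")
mappo = marl.algos.mappo(hyperparam_source="mpe")           # algorithm hyperparameters
model = marl.build_model(env, mappo, {"core_arch": "gru",   # model architecture
                                      "encode_layer": "128-256"})
mappo.fit(env, model,
          stop={"episode_reward_mean": 2000,                # stop conditions
                "timesteps_total": 10000000},
          num_gpus=1, num_workers=5,                        # ray/rllib resource settings
          share_policy="group", checkpoint_freq=100)        # policy sharing + checkpoints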



Register the environment
Ensure all the dependencies are installed for the environment you are running. Otherwise, please refer to the MARLlib documentation.



| task mode | api example |
|-----------|-------------|
| cooperative | marl.make_env(environment_name="mpe", map_name="simple_spread", force_coop=True) |
| collaborative | marl.make_env(environment_name="mpe", map_name="simple_spread") |
| competitive | marl.make_env(environment_name="mpe", map_name="simple_adversary") |
| mixed | marl.make_env(environment_name="mpe", map_name="simple_crypto") |



Most of the popular environments in MARL research are supported by MARLlib:



| Env Name | Learning Mode | Observability | Action Space | Observations |
|----------|---------------|---------------|--------------|--------------|
| LBF | cooperative + collaborative | Both | Discrete | 1D |
| RWARE | cooperative | Partial | Discrete | 1D |
| MPE | cooperative + collaborative + mixed | Both | Both | 1D |
| SMAC | cooperative | Partial | Discrete | 1D |
| MetaDrive | collaborative | Partial | Continuous | 1D |
| MAgent | collaborative + mixed | Partial | Discrete | 2D |
| Pommerman | collaborative + competitive + mixed | Both | Discrete | 2D |
| MAMuJoCo | cooperative | Full | Continuous | 1D |
| GRF | collaborative + mixed | Full | Discrete | 2D |
| Hanabi | cooperative | Partial | Discrete | 1D |
| MATE | cooperative + mixed | Partial | Both | 1D |
| GoBigger | cooperative + mixed | Both | Continuous | 1D |



Each environment has a README file that serves as the instructions for that task, covering environment settings, installation, and important notes.


Initialize the algorithm



| running target | api example |
|----------------|-------------|
| train & finetune | marl.algos.mappo(hyperparam_source=$ENV) |
| develop & debug | marl.algos.mappo(hyperparam_source="test") |
| 3rd party env | marl.algos.mappo(hyperparam_source="common") |
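For example, a development run can combine the "test" hyperparameter source with local-mode training. This is a sketch using only calls shown in this README; the stop condition is illustrative:

from marllib import marl

# debug run: small "test" hyperparameters plus Ray local mode for easier debugging
env = marl.make_env(environment_name="mpe", map_name="simple_spread")
mappo = marl.algos.mappo(hyperparam_source="test")
model = marl.build_model(env, mappo, {"core_arch": "mlp", "encode_layer": "128-256"})
mappo.fit(env, model, stop={"timesteps_total": 10000}, local_mode=True)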



Here is a chart describing the characteristics of each algorithm:



| algorithm | support task mode | discrete action | continuous action | policy type |
|-----------|-------------------|-----------------|-------------------|-------------|
| IQL* | all four | :heavy_check_mark: | | off-policy |
| PG | all four | :heavy_check_mark: | :heavy_check_mark: | on-policy |
| A2C | all four | :heavy_check_mark: | :heavy_check_mark: | on-policy |
| DDPG | all four | | :heavy_check_mark: | off-policy |
| TRPO | all four | :heavy_check_mark: | :heavy_check_mark: | on-policy |
| PPO | all four | :heavy_check_mark: | :heavy_check_mark: | on-policy |
| COMA | all four | :heavy_check_mark: | | on-policy |
| MADDPG | all four | | :heavy_check_mark: | off-policy |
| MAA2C* | all four | :heavy_check_mark: | :heavy_check_mark: | on-policy |
| MATRPO* | all four | :heavy_check_mark: | :heavy_check_mark: | on-policy |
| MAPPO | all four | :heavy_check_mark: | :heavy_check_mark: | on-policy |
| HATRPO | cooperative | :heavy_check_mark: | :heavy_check_mark: | on-policy |
| HAPPO | cooperative | :heavy_check_mark: | :heavy_check_mark: | on-policy |
| VDN | cooperative | :heavy_check_mark: | | off-policy |
| QMIX | cooperative | :heavy_check_mark: | | off-policy |
| FACMAC | cooperative | | :heavy_check_mark: | off-policy |
| VDAC | cooperative | :heavy_check_mark: | :heavy_check_mark: | on-policy |
| VDPPO* | cooperative | :heavy_check_mark: | :heavy_check_mark: | on-policy |



*all four: cooperative, collaborative, competitive, and mixed.
IQL is the multi-agent version of Q-learning.
MAA2C and MATRPO are the centralized versions of A2C and TRPO.
VDPPO is the value decomposition version of PPO.


Build the agent model
An agent model consists of two parts: an encoder and a core architecture.
The encoder is constructed by MARLlib according to the observation space;
choose mlp, gru, or lstm as the core architecture to complete the model.



| model arch | api example |
|------------|-------------|
| MLP | marl.build_model(env, algo, {"core_arch": "mlp"}) |
| GRU | marl.build_model(env, algo, {"core_arch": "gru"}) |
| LSTM | marl.build_model(env, algo, {"core_arch": "lstm"}) |
| Encoder Arch | marl.build_model(env, algo, {"core_arch": "gru", "encode_layer": "128-256"}) |
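Since the encoder is derived from the observation space, the encode_layer entry is optional. A minimal sketch (env and algo as built in the earlier examples):

# let MARLlib construct the encoder from the observation space; only the core is fixed
model = marl.build_model(env, algo, {"core_arch": "gru"})
# or additionally pin the encoder layout with an explicit encode_layer string
model = marl.build_model(env, algo, {"core_arch": "gru", "encode_layer": "128-256"})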





Kick off the training



| setting | api example |
|---------|-------------|
| train | algo.fit(env, model) |
| debug | algo.fit(env, model, local_mode=True) |
| stop condition | algo.fit(env, model, stop={'episode_reward_mean': 2000, 'timesteps_total': 10000000}) |
| policy sharing | algo.fit(env, model, share_policy='all') # or 'group' / 'individual' |
| save model | algo.fit(env, model, checkpoint_freq=100, checkpoint_end=True) |
| GPU accelerate | algo.fit(env, model, local_mode=False, num_gpus=1) |
| CPU accelerate | algo.fit(env, model, local_mode=False, num_workers=5) |





Training & rendering API
from marllib import marl

# prepare env
env = marl.make_env(environment_name="mpe", map_name="simple_spread")
# initialize algorithm with appointed hyper-parameters
mappo = marl.algos.mappo(hyperparam_source="mpe")
# build agent model based on env + algorithms + user preference
model = marl.build_model(env, mappo, {"core_arch": "mlp", "encode_layer": "128-256"})
# start training
mappo.fit(
    env, model,
    stop={"timesteps_total": 1000000},
    checkpoint_freq=100,
    share_policy="group"
)
# rendering
mappo.render(
    env, model,
    local_mode=True,
    restore_path={'params_path': "checkpoint_000010/params.json",
                  'model_path': "checkpoint_000010/checkpoint-10"}
)


Benchmark results
All results are listed here.
Quick examples
MARLlib provides some practical examples for you to refer to.

- Detailed API usage: shows how to use the MARLlib API in detail, e.g., cmd + api combined running.
- Policy sharing customization: define your own group policy-sharing strategy based on the current task.
- Loading model and rendering: render the environment based on a pre-trained model.
- Incorporating a new environment: add your new environment following MARLlib's env-agent interaction interface (see the sketch after this list).
- Incorporating a new algorithm: add your new algorithm following the MARLlib learning pipeline.
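As a rough orientation for the new-environment example: MARLlib builds on RLlib, whose multi-agent environments follow a dict-keyed reset/step convention. The skeleton below is only an illustrative RLlib-style sketch, not MARLlib's exact wrapper interface; the agent names, episode length, and rewards are made up, and the linked example documents the real registration steps.

import numpy as np
from gym.spaces import Box, Discrete
from ray.rllib.env.multi_agent_env import MultiAgentEnv

class TwoAgentToyEnv(MultiAgentEnv):
    """Toy two-agent environment following the RLlib MultiAgentEnv convention."""

    def __init__(self, env_config=None):
        self.agents = ["agent_0", "agent_1"]          # hypothetical agent ids
        self.observation_space = Box(-1.0, 1.0, shape=(4,), dtype=np.float32)
        self.action_space = Discrete(2)
        self._t = 0

    def reset(self):
        self._t = 0
        # per-agent observations keyed by agent id
        return {a: self.observation_space.sample() for a in self.agents}

    def step(self, action_dict):
        self._t += 1
        obs = {a: self.observation_space.sample() for a in self.agents}
        rewards = {a: 1.0 for a in self.agents}       # dummy shared reward
        done = self._t >= 25                          # arbitrary episode length
        dones = {a: done for a in self.agents}
        dones["__all__"] = done
        return obs, rewards, dones, {a: {} for a in self.agents}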

Tutorials
Try MPE + MAPPO examples on Google Colaboratory!

More tutorial documentation is available here.
Awesome List
A collection of research and review papers on multi-agent reinforcement learning (MARL) is available, organized by publication date and by the environments the papers evaluate on, with separate lists for algorithms and environments.
Community



| Channel | Link |
|---------|------|
| Issues | GitHub Issues |



Roadmap
The roadmap for future releases is available in ROADMAP.md.
Contributing
We are a small team working on multi-agent reinforcement learning, and we will take all the help we can get!
If you would like to get involved, here is information on contribution guidelines and how to test the code locally.
You can contribute in multiple ways, e.g., reporting bugs, writing or translating documentation, reviewing or refactoring code, requesting or implementing new features, etc.
Paper
If you use MARLlib in your research, please cite the MARLlib paper.
@article{hu2022marllib,
  title={MARLlib: Extending RLlib for Multi-agent Reinforcement Learning},
  author={Hu, Siyi and Zhong, Yifan and Gao, Minquan and Wang, Weixun and Dong, Hao and Li, Zhihui and Liang, Xiaodan and Chang, Xiaojun and Yang, Yaodong},
  journal={arXiv preprint arXiv:2210.13708},
  year={2022}
}
