RLCard: A Toolkit for Reinforcement Learning in Card Games

RLCard is a toolkit for Reinforcement Learning (RL) in card games. It supports multiple card environments with easy-to-use interfaces. The goal of RLCard is to bridge reinforcement learning and imperfect information games, and to push forward research on reinforcement learning in domains with multiple agents, large state and action spaces, and sparse rewards. RLCard is developed by DATA Lab at Texas A&M University.

Installation

Make sure that you have Python 3.5+ and pip installed. We recommend installing rlcard from source with pip as follows:

git clone https://github.com/datamllab/rlcard.git
cd rlcard
pip install -e .

Alternatively, you can install the released package directly with:

pip install rlcard

Examples

A short example is shown below.

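import rlcard
from rlcard.agents.random_agent import RandomAgent

# Create the Blackjack environment by name
env = rlcard.make('blackjack')
# Register a single random agent to play the game
env.set_agents([RandomAgent()])

# Play one complete game and collect the trajectories and payoffs
trajectories, payoffs = env.run()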
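To make the shape of this loop easier to picture without installing anything, here is a minimal, self-contained sketch. It is not RLCard's implementation: ToyEnv and ToyRandomAgent are hypothetical stand-ins, and the game rules (draw cards worth 1 to 10, win with a total between 17 and 21) are assumptions chosen only for illustration. It mirrors the same set_agents / run pattern and returns one trajectory and one payoff per player.

```python
import random


class ToyRandomAgent:
    """Hypothetical stand-in for a random agent: picks any legal action."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)

    def step(self, state):
        return self.rng.choice(state["legal_actions"])


class ToyEnv:
    """Hypothetical single-player game, loosely inspired by blackjack."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)
        self.agents = []

    def set_agents(self, agents):
        self.agents = agents

    def run(self):
        total = 0
        trajectory = []  # (state, action) pairs for the single player
        while True:
            state = {"total": total, "legal_actions": ["hit", "stand"]}
            action = self.agents[0].step(state)
            trajectory.append((state, action))
            if action == "stand":
                break
            total += self.rng.randint(1, 10)  # draw a card worth 1..10
            if total > 21:  # going over 21 ends the game immediately
                break
        # Assumed toy rule: win (+1) with a total in 17..21, lose (-1) otherwise
        payoff = 1 if 17 <= total <= 21 else -1
        return [trajectory], [payoff]


env = ToyEnv(seed=0)
env.set_agents([ToyRandomAgent(seed=0)])
trajectories, payoffs = env.run()
print(len(trajectories[0]), payoffs)
```

The payoff arrives only when the game ends, which is a small instance of the sparse-reward setting mentioned in the introduction.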

We also recommend the toy examples in Getting Started.

Demo

Run examples/leduc_holdem_human.py to play against the pre-trained Leduc Hold’em model:

>> Leduc Hold'em pre-trained model

>> Start a new game!
>> Agent 1 chooses raise

=============== Community Card ===============
┌─────────┐
│░░░░░░░░░│
│░░░░░░░░░│
│░░░░░░░░░│
│░░░░░░░░░│
│░░░░░░░░░│
│░░░░░░░░░│
│░░░░░░░░░│
└─────────┘
===============   Your Hand    ===============
┌─────────┐
│J        │
│         │
│         │
│    ♥    │
│         │
│         │
│        J│
└─────────┘
===============     Chips      ===============
Yours:   +
Agent 1: +++
=========== Actions You Can Choose ===========
0: call, 1: raise, 2: fold

>> You choose action (integer):

Available Environments

Name: the name that should be passed to rlcard.make to create the game environment.

Game                                    Name             Status
Blackjack (wiki, baike)                 blackjack        Available
Leduc Hold’em                           leduc-holdem     Available
Limit Texas Hold’em (wiki, baike)       limit-holdem     Available
Dou Dizhu (wiki, baike)                 doudizhu         Available
Mahjong (wiki, baike)                   mahjong          Available
No-limit Texas Hold’em (wiki, baike)    no-limit-holdem  Available
UNO (wiki, baike)                       uno              Available
Sheng Ji (wiki, baike)                  -                Developing
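As a small convenience sketch, the game-to-name pairs listed above can be kept in a lookup table so that a script fails early on a misspelled or not-yet-available game. The dictionary contents are copied from the list above; the ENV_NAMES constant and the env_name_for helper are hypothetical names introduced here and are not part of RLCard.

```python
# Game-to-name pairs copied from the list above.
# None marks a game that is still under development.
ENV_NAMES = {
    "Blackjack": "blackjack",
    "Leduc Hold'em": "leduc-holdem",
    "Limit Texas Hold'em": "limit-holdem",
    "Dou Dizhu": "doudizhu",
    "Mahjong": "mahjong",
    "No-limit Texas Hold'em": "no-limit-holdem",
    "UNO": "uno",
    "Sheng Ji": None,
}


def env_name_for(game):
    """Return the environment name for a game title (hypothetical helper)."""
    if game not in ENV_NAMES:
        raise KeyError("Unknown game: {}".format(game))
    name = ENV_NAMES[game]
    if name is None:
        raise ValueError("{} is still under development".format(game))
    return name


# Names that can currently be used to create an environment
available = sorted(n for n in ENV_NAMES.values() if n is not None)
print(available)
```

The returned string is what would then be passed when creating the environment, for example 'doudizhu' for Dou Dizhu.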

Evaluation

Performance is measured by winning rate in tournaments. Example outputs are shown in the Learning Curves figure.
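As a rough illustration of the metric, the sketch below computes a winning rate from a list of per-game payoffs. It is a minimal sketch under stated assumptions, not the toolkit's evaluation code: the winning_rate function is a hypothetical helper, each game is assumed to yield one payoff per player, and a game is assumed to count as a win when the player's payoff is positive.

```python
def winning_rate(payoffs_per_game, player_id=0):
    """Fraction of games in which the given player has a positive payoff.

    Hypothetical helper: assumes payoffs_per_game is a list of per-game
    payoff lists, indexed by player, and that payoff > 0 means a win.
    """
    if not payoffs_per_game:
        raise ValueError("at least one game is required")
    wins = sum(1 for payoffs in payoffs_per_game if payoffs[player_id] > 0)
    return wins / len(payoffs_per_game)


# Four made-up two-player games: player 0 wins two, loses one, ties one
games = [[1, -1], [-1, 1], [1, -1], [0, 0]]
rate = winning_rate(games, player_id=0)
print(rate)
```

Plotting this value against the number of training episodes would give a learning curve of the kind referred to above.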

Contributing

Contributions to this project are greatly appreciated! Please create an issue for feedback or bug reports. If you want to contribute code, please contact daochen.zha@tamu.edu or khlai037@tamu.edu.

Acknowledgements

We would like to thank JJ World Network Technology Co.,LTD for the generous support.