rlcard.envs

rlcard.envs.env

class rlcard.envs.env.Env(game, allow_step_back=False)

Bases: object

The base Env class

decode_action(action_id)

Decode an action id to the action in the game.

Parameters

action_id (int) – The id of the action

Returns

The action that will be passed to the game engine.

Return type

(string)

Note: Must be implemented in the child class.

extract_state(state)

Extract useful information from state for RL. Must be implemented in the child class.

Parameters

state (dict) – the raw state

Returns

the extracted state

Return type

(numpy.array)

get_legal_actions()

Get all legal actions for the current state.

Returns

A list of legal action ids.

Return type

(list)

Note: Must be implemented in the child class.

get_payoffs()

Get the payoffs of players. Must be implemented in the child class.

Returns

A list of payoffs for each player.

Return type

(list)

Note: Must be implemented in the child class.

get_player_id()

Get the current player id

Returns

the id of the current player

Return type

(int)

get_state(player_id)

Get the state for a given player id

Parameters

player_id (int) – The player id

Returns

The observed state of the player

Return type

(numpy.array)

init_game()

Start a new game

Returns

Tuple containing:

  • (numpy.array): The beginning state of the game

  • (int): The beginning player

Return type

(tuple)

is_over()

Check whether the current game is over

Returns

True if the current game is over

Return type

(boolean)

load_model()

Load pretrained/rule model

Returns

A Model object

Return type

model (Model)

static print_action(action)

Print out an action in a nice form

Parameters

action (str) – A string representing an action

print_result(player)

Print the game result when the game is over

Parameters

player (int) – The human player id

print_state(player)

Print out the state of a given player

Parameters

player (int) – Player id

reset()

Reset environment in single-agent mode

run(is_training=False, seed=None)

Run a complete game, either for evaluation or training RL agent.

Parameters
  • is_training (boolean) – True if for training purpose.

  • seed (int) – A seed for running the game. For a single-process program, the seed should be set to None. For a multi-process program, the seed should be assigned for reproducibility.

Returns

Tuple containing:

  • (list): A list of trajectories generated from the environment.

  • (list): A list of payoffs. Each entry corresponds to one player.

Return type

(tuple)

Note: The trajectories are a 3-dimensional list. The first dimension indexes the players, the second indexes the transitions, and the third holds the contents of each transition.
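The three-level nesting described in the note can be pictured with a small hand-written example. This is only a sketch: the contents of each transition are not specified above, so the layout used here (state, action, reward, next_state, done) is an assumption, and the helper name count_transitions is hypothetical.

```python
# Hypothetical transition layout (an assumption, not taken from the reference):
# (state, action, reward, next_state, done)
trajectories = [
    # Dimension 1, index 0: player 0, who has two transitions
    [
        ([0, 1], 1, 0.0, [0, 2], False),
        ([0, 2], 0, 1.0, [0, 3], True),
    ],
    # Dimension 1, index 1: player 1, who has one transition
    [
        ([1, 1], 0, -1.0, [1, 2], True),
    ],
]
payoffs = [1.0, -1.0]  # one entry per player


def count_transitions(trajectories):
    """Return how many transitions were recorded for each player."""
    return [len(player_trajectory) for player_trajectory in trajectories]


# trajectories[player][transition][content]
action_of_player0_second_transition = trajectories[0][1][1]
```

Indexing follows the order of the dimensions: player first, then transition, then the field inside the transition.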

run_multi(task_num, result, is_training=False, seed=None)

set_agents(agents)

Set the agents that will interact with the environment

Parameters

agents (list) – List of Agent classes

set_mode(active_player=0, single_agent_mode=False, human_mode=False)

Turn on single-agent mode. Pretrained models will be loaded to simulate the other agents

Parameters

active_player (int) – The player that does not use pretrained models

single_agent_step(action)

Step forward for human/single agent

Parameters

action (int) – the action taken by the current player

Returns

The next state

Return type

next_state (numpy.array)

step(action)

Step forward

Parameters

action (int) – the action taken by the current player

Returns

Tuple containing:

  • (numpy.array): The next state

  • (int): The ID of the next player

Return type

(tuple)
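The way init_game, step, is_over, get_legal_actions and get_payoffs fit together can be sketched as a simple game loop. ToyEnv below is a hypothetical stand-in that only mimics the documented method names and return shapes; it is not an rlcard environment, and its rules (alternating players, fixed payoffs) are made up for illustration.

```python
import random


class ToyEnv:
    """Hypothetical stand-in that mirrors the documented method names only."""

    player_num = 2

    def __init__(self, max_turns=4):
        self.max_turns = max_turns
        self.turn = 0
        self.current_player = 0

    def init_game(self):
        # Returns (beginning state, beginning player), as documented
        self.turn = 0
        self.current_player = 0
        return self._state(), self.current_player

    def step(self, action):
        # Returns (next state, id of the next player), as documented
        self.turn += 1
        self.current_player = (self.current_player + 1) % self.player_num
        return self._state(), self.current_player

    def is_over(self):
        return self.turn >= self.max_turns

    def get_legal_actions(self):
        return [0, 1]

    def get_payoffs(self):
        return [1, -1]  # made-up payoffs, one per player

    def _state(self):
        return [self.turn, self.current_player]


def play_one_game(env, rng):
    """Play until the game is over, choosing random legal actions."""
    state, player_id = env.init_game()
    history = []
    while not env.is_over():
        action = rng.choice(env.get_legal_actions())
        history.append((player_id, action))
        state, player_id = env.step(action)
    return history, env.get_payoffs()


history, payoffs = play_one_game(ToyEnv(), random.Random(0))
```

The loop keeps the player id returned by the previous call so that each recorded action is attributed to the player who took it.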

step_back()

Take one step backward.

Returns

Tuple containing:

  • (numpy.array): The previous state

  • (int): The ID of the previous player

Return type

(tuple)

Note: An error will be raised when stepping back from the root node.
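One common way to support stepping backward is to keep a history stack of earlier (state, player) pairs. The sketch below illustrates that idea with a hypothetical SteppableCounter class; it is not the library's implementation, and the exception type (RuntimeError) is an assumption, since the reference does not name the error that is raised.

```python
class SteppableCounter:
    """Hypothetical two-player toy whose state is a running sum of actions."""

    def __init__(self, allow_step_back=False):
        self.allow_step_back = allow_step_back
        self.state = 0
        self.player_id = 0
        self._history = []  # stack of (state, player_id) snapshots

    def step(self, action):
        if self.allow_step_back:
            # Save a snapshot only when stepping back is enabled
            self._history.append((self.state, self.player_id))
        self.state += action
        self.player_id = 1 - self.player_id
        return self.state, self.player_id

    def step_back(self):
        if not self.allow_step_back:
            raise RuntimeError("step_back requires allow_step_back=True")
        if not self._history:
            # Nothing to undo: we are at the root node
            raise RuntimeError("cannot step back from the root node")
        self.state, self.player_id = self._history.pop()
        return self.state, self.player_id


env = SteppableCounter(allow_step_back=True)
env.step(3)
env.step(4)
previous_state, previous_player = env.step_back()
```

Recording snapshots only when allow_step_back is True avoids the bookkeeping cost for callers that never step backward.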

rlcard.envs.registration

class rlcard.envs.registration.EnvRegistry

Bases: object

Register an environment (game) by ID

make(env_id, allow_step_back=False)

Create an environment instance

Parameters
  • env_id (string) – the name of the environment

  • allow_step_back (boolean) – True if you want to be able to call step_back

register(env_id, entry_point)

Register an environment

Parameters
  • env_id (string) – the name of the environment

  • entry_point (string) – a string that indicates the location of the environment class

class rlcard.envs.registration.EnvSpec(env_id, entry_point=None)

Bases: object

A specification for a particular instance of the environment.

make(allow_step_back=False)

Instantiates an instance of the environment

Parameters

allow_step_back (boolean) – True if you want to be able to call step_back

Returns

An instance of the environment

Return type

env (Env)

rlcard.envs.registration.make(env_id, allow_step_back=False)

Create an environment instance

Parameters
  • env_id (string) – the name of the environment

  • allow_step_back (boolean) – True if you want to be able to call step_back

rlcard.envs.registration.register(env_id, entry_point)

Register an environment

Parameters
  • env_id (string) – the name of the environment

  • entry_point (string) – a string that indicates the location of the environment class
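The register/make pair follows a familiar registry pattern: an id is mapped to a string that locates a class, and the class is instantiated on demand. The sketch below shows one way such a registry could work. ToySpec, ToyRegistry, ToyEnv and the module name toy_envs are hypothetical, and the "module:ClassName" format of the entry point string is an assumption, since the reference only says the string indicates the location of the environment class.

```python
import importlib
import sys
import types


class ToySpec:
    """Hypothetical spec: resolves 'module:ClassName' to a class."""

    def __init__(self, env_id, entry_point=None):
        self.env_id = env_id
        module_name, class_name = entry_point.split(":")
        module = importlib.import_module(module_name)
        self._entry_point = getattr(module, class_name)

    def make(self, allow_step_back=False):
        return self._entry_point(allow_step_back=allow_step_back)


class ToyRegistry:
    """Hypothetical registry keyed by environment id."""

    def __init__(self):
        self.env_specs = {}

    def register(self, env_id, entry_point):
        if env_id in self.env_specs:
            raise ValueError("Cannot re-register env_id: {}".format(env_id))
        self.env_specs[env_id] = ToySpec(env_id, entry_point)

    def make(self, env_id, allow_step_back=False):
        if env_id not in self.env_specs:
            raise ValueError("Cannot find env_id: {}".format(env_id))
        return self.env_specs[env_id].make(allow_step_back)


class ToyEnv:
    def __init__(self, allow_step_back=False):
        self.allow_step_back = allow_step_back


# Expose ToyEnv under an importable module name so the string lookup works
toy_module = types.ModuleType("toy_envs")
toy_module.ToyEnv = ToyEnv
sys.modules["toy_envs"] = toy_module

registry = ToyRegistry()
registry.register("toy", "toy_envs:ToyEnv")
env = registry.make("toy", allow_step_back=True)
```

Storing a string rather than the class itself lets the registry be filled in without importing every environment up front.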

rlcard.envs.blackjack

class rlcard.envs.blackjack.BlackjackEnv(allow_step_back=False)

Bases: rlcard.envs.env.Env

Blackjack Environment

decode_action(action_id)

Decode the action for applying to the game

Parameters

action_id (int) – The action id

Returns

action for the game

Return type

action (str)

extract_state(state)

Extract the state representation from state dictionary for agent

Parameters

state (dict) – Original state from the game

Returns

The player’s score combined with the dealer’s observable score, used as the observation

Return type

observation (list)

get_legal_actions()

Get all legal actions

Returns

The encoded legal action list (converted from str to int)

Return type

encoded_action_list (list)
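The relationship between decode_action (int to str) and the encoded legal action list (str to int) can be sketched with a fixed action table. The action names "hit" and "stand" and their order are assumptions made for illustration, and encode_legal_actions is a hypothetical helper name.

```python
# Assumed action table; the position in the list is the action id
ACTIONS = ["hit", "stand"]


def decode_action(action_id):
    """Map an integer action id to the action string used by the game."""
    return ACTIONS[action_id]


def encode_legal_actions(legal_actions):
    """Map action strings to their integer ids (str to int)."""
    return [ACTIONS.index(action) for action in legal_actions]


encoded = encode_legal_actions(["hit", "stand"])
decoded = [decode_action(action_id) for action_id in encoded]
```

The two functions are inverses of each other, so decoding an encoded list returns the original action strings.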

get_payoffs()

Get the payoff of a game

Returns

list of payoffs

Return type

payoffs (list)

rlcard.envs.doudizhu

class rlcard.envs.doudizhu.DoudizhuEnv(allow_step_back=False)

Bases: rlcard.envs.env.Env

Doudizhu Environment

decode_action(action_id)

Action id -> the action in the game. Must be implemented in the child class.

Parameters

action_id (int) – the id of the action

Returns

the action that will be passed to the game engine.

Return type

action (string)

extract_state(state)

Encode state

Parameters

state (dict) – dict of original state

Returns

A 6*5*15 array made of six 5*15 planes:

  • the current hand

  • the union of the other two players’ hands

  • the recent three actions (one plane each)

  • the union of all played cards

Return type

numpy array
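To make the 6*5*15 shape concrete, the sketch below encodes a set of cards into one 5*15 plane and stacks six planes. The interpretation is an assumption: 15 columns for the card ranks (13 ranks plus two jokers) and 5 rows for how many copies of a rank are present (0 to 4). The rank string, the joker letters and both function names are hypothetical, and plain nested lists are used in place of a numpy array.

```python
# Assumed rank order; "B" and "R" stand for the black and red jokers
RANKS = "3456789TJQKA2BR"


def encode_cards(cards):
    """Return a 5 x 15 plane where plane[n][r] == 1 if rank r occurs n times."""
    plane = [[0] * len(RANKS) for _ in range(5)]
    for column, rank in enumerate(RANKS):
        plane[cards.count(rank)][column] = 1
    return plane


def encode_state(hand, others, recent_actions, played):
    """Stack six planes: hand, others' union, three recent actions, played."""
    planes = [encode_cards(hand), encode_cards(others)]
    planes.extend(encode_cards(action) for action in recent_actions)
    planes.append(encode_cards(played))
    return planes


hand_plane = encode_cards("33345B")
state = encode_state("33345B", "6789T", ["77", "", "88"], "7788")
```

Under this encoding every column contains exactly one 1, so an empty action (a pass) is a plane with all ones in row 0.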

get_legal_actions()

Get all legal actions for the current state

Returns

A list of legal action ids

Return type

legal_actions (list)

get_payoffs()

Get the payoffs of players. Must be implemented in the child class.

Returns

a list of payoffs for each player

Return type

payoffs (list)

rlcard.envs.limitholdem

class rlcard.envs.limitholdem.LimitholdemEnv(allow_step_back=False)

Bases: rlcard.envs.env.Env

Limitholdem Environment

decode_action(action_id)

Decode the action for applying to the game

Parameters

action_id (int) – The action id

Returns

action for the game

Return type

action (str)

extract_state(state)

Extract the state representation from state dictionary for agent

Note: Currently uses the hand cards and the public cards. TODO: encode the states

Parameters

state (dict) – Original state from the game

Returns

combine the player’s score and dealer’s observable score for observation

Return type

observation (list)

get_legal_actions()

Get all legal actions

Returns

The encoded legal action list (converted from str to int)

Return type

encoded_action_list (list)

get_payoffs()

Get the payoff of a game

Returns

list of payoffs

Return type

payoffs (list)