rlcard.utils¶

rlcard.utils.utils¶

rlcard.utils.utils.elegent_form(card)¶

Get an elegant form of a card string.

Parameters: card (string) – A card string
Returns: A nice form of the card
Return type: elegent_card (string)

rlcard.utils.utils.get_device()¶

rlcard.utils.utils.init_54_deck()¶

Initialize a deck of 54 cards: the standard 52 cards plus BJ and RJ (the black and red jokers).

Returns: A list of Card objects
Return type: (list)

rlcard.utils.utils.init_standard_deck()¶

Initialize a standard deck of 52 cards

Returns: A list of Card objects
Return type: (list)
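As a rough illustration of what the two deck initializers describe, the sketch below builds a 52-card and a 54-card deck. The `Card` namedtuple, the suit and rank codes, and the `sketch_*` function names are assumptions made for this example; they are not taken from rlcard.

```python
from collections import namedtuple

# Hypothetical stand-in for rlcard's Card class (illustration only).
Card = namedtuple("Card", ["suit", "rank"])

# Assumed single-letter codes for suits and ranks.
SUITS = ["S", "H", "D", "C"]
RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "T", "J", "Q", "K"]


def sketch_standard_deck():
    """Build 52 cards: every suit paired with every rank."""
    return [Card(suit, rank) for suit in SUITS for rank in RANKS]


def sketch_54_deck():
    """Build the 52 standard cards plus the two jokers, BJ and RJ."""
    deck = sketch_standard_deck()
    # Assumption: a joker stores its code as the suit and has an empty rank.
    deck.append(Card("BJ", ""))
    deck.append(Card("RJ", ""))
    return deck


print(len(sketch_standard_deck()), len(sketch_54_deck()))
```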

rlcard.utils.utils.plot_curve(csv_path, save_path, algorithm)¶

Read data from a CSV file and plot the results.

rlcard.utils.utils.print_card(cards)¶

Nicely print a card or list of cards

Parameters: cards (string or list) – The card(s) to be printed

rlcard.utils.utils.rank2int(rank)¶

Get the corresponding number of a rank.

Parameters: rank (str) – The rank stored in a Card object
Returns: The number corresponding to the rank
Return type: (int)

Note

If the input rank is an empty string, the function will return -1.
If the input rank is not valid, the function will return None.
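The two special cases in the note can be illustrated with a small sketch. Only the `-1` and `None` results come from the note; the numeric values for the letter ranks (ace high, `T` for ten) and the name `sketch_rank2int` are assumptions for this example.

```python
def sketch_rank2int(rank):
    """Map a rank string to an integer, following the note above."""
    if rank == "":
        return -1  # documented: an empty rank gives -1

    # Number ranks "2" to "10" map to their own value.
    number_values = {str(n): n for n in range(2, 11)}
    # Assumed letter encoding with the ace ranked highest.
    letter_values = {"T": 10, "J": 11, "Q": 12, "K": 13, "A": 14}

    if rank in number_values:
        return number_values[rank]
    if rank in letter_values:
        return letter_values[rank]
    return None  # documented: an invalid rank gives None
```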

rlcard.utils.utils.remove_illegal(action_probs, legal_actions)¶

Remove illegal actions and normalize the probability vector

Parameters:

action_probs (numpy.array) – A 1-dimensional numpy array.
legal_actions (list) – A list of indices of legal actions.

Returns:

A normalized vector without illegal actions.

Return type:

probs (numpy.array)
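The masking and renormalization described for remove_illegal can be sketched as follows. This version uses plain Python lists instead of numpy so that it runs without dependencies. The name `sketch_remove_illegal` and the uniform fallback when all legal probabilities are zero are assumptions, not confirmed behavior of the library.

```python
def sketch_remove_illegal(action_probs, legal_actions):
    """Zero out illegal actions, then renormalize so the result sums to 1."""
    probs = [0.0] * len(action_probs)
    for action in legal_actions:
        probs[action] = action_probs[action]

    total = sum(probs)
    if total == 0:
        # Assumption: fall back to a uniform distribution over legal actions.
        for action in legal_actions:
            probs[action] = 1.0 / len(legal_actions)
    else:
        probs = [p / total for p in probs]
    return probs


# Action 1 is illegal here, so its probability mass is redistributed.
print(sketch_remove_illegal([0.5, 0.25, 0.25], [0, 2]))
```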

rlcard.utils.utils.reorganize(trajectories, payoffs)¶

Reorganize the trajectory to make it RL friendly

Parameters:

trajectories (list) – A list of trajectories
payoffs (list) – A list of payoffs for the players. Each entry corresponds to one player

Returns:

New trajectories that can be fed into RL algorithms.

Return type:

(list)
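One plausible reading of "RL friendly" is a list of (state, action, reward, next_state, done) transitions per player. The sketch below shows that idea under stated assumptions: each player's trajectory alternates state, action, state, and so on, ending with a state; intermediate rewards are 0; and the final transition carries the player's payoff. The name `sketch_reorganize` and this layout are assumptions, not a description of the library's internals.

```python
def sketch_reorganize(trajectories, payoffs):
    """Turn per-player [s0, a0, s1, a1, ..., sN] lists into transitions."""
    new_trajectories = []
    for player, trajectory in enumerate(trajectories):
        transitions = []
        # Slide over (state, action, next_state) windows, two items at a time.
        for i in range(0, len(trajectory) - 2, 2):
            state, action, next_state = trajectory[i:i + 3]
            done = i == len(trajectory) - 3  # last window of this trajectory
            reward = payoffs[player] if done else 0
            transitions.append([state, action, reward, next_state, done])
        new_trajectories.append(transitions)
    return new_trajectories


example = sketch_reorganize([["s0", "a0", "s1", "a1", "s2"]], [1])
print(example)
```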

rlcard.utils.utils.set_seed(seed)¶

rlcard.utils.utils.tournament(env, num)¶

Evaluate the performance of the agents in the environment

Parameters:

env (Env class) – The environment to be evaluated.
num (int) – The number of games to play.

Returns:

A list of average payoffs for each player
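The averaging that tournament describes can be sketched with a toy environment. `StubEnv`, its `run` method returning one payoff per player, and `sketch_tournament` are all hypothetical names used only for this illustration; a real environment would play actual games between agents.

```python
import random


class StubEnv:
    """Hypothetical two-player environment; not part of rlcard."""

    num_players = 2

    def __init__(self, seed=0):
        self._rng = random.Random(seed)

    def run(self):
        # Toy zero-sum game: the winner gets 1, the other player gets -1.
        winner = self._rng.randrange(self.num_players)
        return [1 if p == winner else -1 for p in range(self.num_players)]


def sketch_tournament(env, num):
    """Play num games and return the average payoff of each player."""
    totals = [0.0] * env.num_players
    for _ in range(num):
        payoffs = env.run()
        for player, payoff in enumerate(payoffs):
            totals[player] += payoff
    return [total / num for total in totals]


print(sketch_tournament(StubEnv(), 100))
```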

rlcard.utils.logger¶

class rlcard.utils.logger.Logger(log_dir)¶

Bases: object

Logger saves the running results and helps make plots from the results

log(text)¶

Write the text to the log file, then print it.

Parameters: text (string) – The text to log

log_performance(episode, reward)¶

Log a point on the curve.

Parameters:

episode (int) – The episode of the current point.
reward (float) – The reward of the current point.

rlcard.utils.seeding¶

rlcard.utils.seeding.colorize(string, color, bold=False, highlight=False)¶

Return string surrounded by appropriate terminal color codes to print colorized text. Valid colors: gray, red, green, yellow, blue, magenta, cyan, white, crimson
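A minimal sketch of how a string can be wrapped in terminal color codes is shown below. The numeric codes are an assumption: they simply number the listed colors from ANSI code 30 upward, add 10 for a highlighted (background) variant, and append 1 for bold. `sketch_colorize` is a hypothetical name for this example.

```python
# Assumed mapping: colors numbered in listed order, starting at ANSI code 30.
COLOR_NAMES = ["gray", "red", "green", "yellow", "blue",
               "magenta", "cyan", "white", "crimson"]
COLOR_CODES = {name: 30 + i for i, name in enumerate(COLOR_NAMES)}


def sketch_colorize(string, color, bold=False, highlight=False):
    """Wrap string in an ANSI escape sequence for the given color."""
    code = COLOR_CODES[color]
    if highlight:
        code += 10  # background colors sit 10 above foreground colors
    attrs = [str(code)]
    if bold:
        attrs.append("1")  # ANSI attribute 1 is bold
    # "\x1b[0m" resets the terminal style after the text.
    return "\x1b[%sm%s\x1b[0m" % (";".join(attrs), string)


print(sketch_colorize("warning", "yellow", bold=True))
```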

rlcard.utils.seeding.create_seed(a=None, max_bytes=8)¶

Create a strong random seed. Without this, Python 2 would seed from the system time, which may not be robust, especially in the presence of concurrency.

Parameters:

a (Optional[int, str]) – None seeds from an operating system specific randomness source.
max_bytes – Maximum number of bytes to use in the seed.
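The parameters above suggest three input cases, sketched here with the standard library only. Reading `max_bytes` from `os.urandom` for `None` follows the description; hashing a string with SHA-512 and reducing an integer modulo the byte range are assumptions, as is the name `sketch_create_seed`.

```python
import hashlib
import os


def sketch_create_seed(a=None, max_bytes=8):
    """Return a non-negative integer seed that fits in max_bytes bytes."""
    if a is None:
        # os.urandom reads from the operating system's randomness source.
        return int.from_bytes(os.urandom(max_bytes), "little")
    if isinstance(a, str):
        # Assumption: derive the seed from a hash of the string.
        digest = hashlib.sha512(a.encode("utf8")).digest()
        return int.from_bytes(digest[:max_bytes], "little")
    if isinstance(a, int):
        # Assumption: fold the integer into the max_bytes range.
        return a % 2 ** (8 * max_bytes)
    raise TypeError("Invalid type for seed: %r" % type(a))


print(sketch_create_seed())
```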

rlcard.utils.seeding.error(msg, *args)¶

rlcard.utils.seeding.hash_seed(seed=None, max_bytes=8)¶

Any given evaluation is likely to have many PRNGs active at once. (Most commonly, because the environment is running in multiple processes.) There is literature indicating that linear correlations between the seeds of multiple PRNGs can correlate their outputs:

http://blogs.unity3d.com/2015/01/07/a-primer-on-repeatable-random-numbers/
http://stackoverflow.com/questions/1554958/how-different-do-random-seeds-need-to-be
http://dl.acm.org/citation.cfm?id=1276928

Thus, for sanity we hash the seeds before using them. (This scheme is likely not crypto-strength, but it should be good enough to get rid of simple correlations.)

Parameters:

seed (Optional[int]) – None seeds from an operating system specific randomness source.
max_bytes – Maximum number of bytes to use in the hashed seed.
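The hashing step described above can be sketched as follows: nearby seeds such as 1 and 2 are mapped to unrelated integers before being handed to a PRNG. The choice of SHA-512 and little-endian byte order are assumptions for this example, and `sketch_hash_seed` is a hypothetical name.

```python
import hashlib
import os


def sketch_hash_seed(seed=None, max_bytes=8):
    """Hash a seed so that linearly related seeds give unrelated results."""
    if seed is None:
        # No seed given: draw one from the operating system.
        seed = int.from_bytes(os.urandom(max_bytes), "little")
    # Assumption: SHA-512 of the seed's decimal string representation.
    digest = hashlib.sha512(str(seed).encode("utf8")).digest()
    # Keep only the first max_bytes bytes of the digest.
    return int.from_bytes(digest[:max_bytes], "little")


# Consecutive seeds produce hashed values with no simple linear relation.
print(sketch_hash_seed(1), sketch_hash_seed(2))
```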

rlcard.utils.seeding.np_random(seed=None)¶

rlcard.utils.pettingzoo_utils¶

rlcard.utils.pettingzoo_utils.reorganize_pettingzoo(trajectories)¶

Reorganize the trajectory to make it RL friendly

Parameters: trajectories (list) – A list of trajectories
Returns: New trajectories that can be fed into RL algorithms.
Return type: (list)

rlcard.utils.pettingzoo_utils.run_game_pettingzoo(env, agents, is_training=False)¶

rlcard.utils.pettingzoo_utils.tournament_pettingzoo(env, agents, num_episodes)¶

rlcard.utils.pettingzoo_utils.wrap_state(state)¶