rlcard.utils¶
rlcard.utils.utils¶
- rlcard.utils.utils.elegent_form(card)¶
Get an elegant form of a card string
- Parameters:
card (string) – A card string
- Returns:
A nicely formatted card string
- Return type:
elegent_card (string)
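The exact formatting rule is an implementation detail; a minimal sketch of this kind of transformation, assuming the card string is a suit letter followed by a rank (e.g. `'SA'`) and that suits map to Unicode symbols (the mapping and input format are assumptions, not taken from the source):

```python
# Hypothetical sketch of the kind of transformation elegent_form performs.
# The 'SA' -> '♠A' format is an assumption for illustration only.
SUIT_SYMBOLS = {'S': '♠', 'H': '♥', 'D': '♦', 'C': '♣'}

def elegant_form_sketch(card):
    """Turn a card string such as 'SA' into a display form such as '♠A'."""
    suit, rank = card[0], card[1:]
    return SUIT_SYMBOLS.get(suit, suit) + rank
```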
- rlcard.utils.utils.get_device()¶
- rlcard.utils.utils.init_54_deck()¶
Initialize a 54-card deck: a standard 52-card deck plus a black joker (BJ) and a red joker (RJ)
- Returns:
A list of Card objects
- Return type:
(list)
- rlcard.utils.utils.init_standard_deck()¶
Initialize a standard deck of 52 cards
- Returns:
A list of Card objects
- Return type:
(list)
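A minimal sketch of building a standard 52-card deck. RLCard's actual function returns `Card` objects; the plain `(rank, suit)` tuples here are an illustrative stand-in:

```python
# Sketch: build all 52 rank/suit combinations. The tuple representation
# is an assumption; the real function returns Card instances.
RANKS = ['A', '2', '3', '4', '5', '6', '7', '8', '9', 'T', 'J', 'Q', 'K']
SUITS = ['S', 'H', 'D', 'C']

def init_standard_deck_sketch():
    return [(rank, suit) for suit in SUITS for rank in RANKS]
```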
- rlcard.utils.utils.plot_curve(csv_path, save_path, algorithm)¶
Read data from a CSV file and plot the results
- rlcard.utils.utils.print_card(cards)¶
Nicely print a card or list of cards
- Parameters:
cards (string or list) – The card(s) to be printed
- rlcard.utils.utils.rank2int(rank)¶
Get the corresponding number of a rank.
- Parameters:
rank (str) – rank stored in Card object
- Returns:
the number corresponding to the rank
- Return type:
(int)
Note
If the input rank is an empty string, the function will return -1.
If the input rank is not valid, the function will return None.
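A sketch of a rank-to-integer mapping consistent with the Note above. The edge cases (empty string, invalid rank) follow the documented behavior; the specific values for face cards (T=10, J=11, Q=12, K=13, A=14) are assumptions about the convention used:

```python
def rank2int_sketch(rank):
    """Sketch of the documented rank-to-integer mapping.

    Face-card values are assumptions; the empty-string and invalid-rank
    behavior follows the Note in the documentation.
    """
    if rank == '':
        return -1
    if rank.isdigit():
        return int(rank) if 2 <= int(rank) <= 10 else None
    face = {'T': 10, 'J': 11, 'Q': 12, 'K': 13, 'A': 14}
    return face.get(rank)  # None for invalid ranks
```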
- rlcard.utils.utils.remove_illegal(action_probs, legal_actions)¶
- Remove illegal actions and normalize the probability vector
- Parameters:
action_probs (numpy.array) – A 1-dimensional numpy array.
legal_actions (list) – A list of indices of legal actions.
- Returns:
A normalized probability vector with illegal actions removed.
- Return type:
probs (numpy.array)
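A minimal sketch of the described behavior: zero out probabilities for illegal actions and renormalize. The uniform fallback for an all-zero legal mask is an assumption about how degenerate inputs are handled:

```python
import numpy as np

def remove_illegal_sketch(action_probs, legal_actions):
    """Mask out illegal actions and renormalize the probability vector.
    Sketch only; the all-zero fallback is an assumption."""
    action_probs = np.asarray(action_probs, dtype=float)
    probs = np.zeros_like(action_probs)
    probs[legal_actions] = action_probs[legal_actions]
    if probs.sum() == 0:
        probs[legal_actions] = 1 / len(legal_actions)  # uniform fallback
    else:
        probs /= probs.sum()
    return probs
```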
- rlcard.utils.utils.reorganize(trajectories, payoffs)¶
Reorganize the trajectory to make it RL friendly
- Parameters:
trajectories (list) – A list of trajectories
payoffs (list) – A list of payoffs for the players. Each entry corresponds to one player
- Returns:
New trajectories that can be fed into RL algorithms.
- Return type:
(list)
- rlcard.utils.utils.set_seed(seed)¶
- rlcard.utils.utils.tournament(env, num)¶
Evaluate the performance of the agents in the environment
- Parameters:
env (Env class) – The environment to be evaluated.
num (int) – The number of games to play.
- Returns:
A list of average payoffs for each player
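The evaluation loop can be sketched as follows. It assumes the environment exposes `num_players` and a `run()` method returning `(trajectories, payoffs)`; both are assumptions about the `Env` interface for illustration:

```python
def tournament_sketch(env, num):
    """Play `num` games and average each player's payoff.
    The env interface (num_players, run) is an assumption."""
    totals = [0.0] * env.num_players
    for _ in range(num):
        _, payoffs = env.run(is_training=False)
        for i, p in enumerate(payoffs):
            totals[i] += p
    return [t / num for t in totals]

# Usage with a hypothetical stub environment:
class DummyEnv:
    num_players = 2
    def run(self, is_training=False):
        return None, [1.0, -1.0]
```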
rlcard.utils.logger¶
- class rlcard.utils.logger.Logger(log_dir)¶
Bases:
object
Logger saves the running results and helps make plots from the results
- log(text)¶
Write the text to the log file, then print it.
- Parameters:
text (string) – The text to log
- log_performance(episode, reward)¶
Log a point on the learning curve.
- Parameters:
episode (int) – The episode of the current point
reward (float) – The reward of the current point
rlcard.utils.seeding¶
- rlcard.utils.seeding.colorize(string, color, bold=False, highlight=False)¶
Return string surrounded by appropriate terminal color codes to print colorized text. Valid colors: gray, red, green, yellow, blue, magenta, cyan, white, crimson
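ANSI escape sequences are the standard way to colorize terminal output; a minimal sketch using the common 30–37 foreground convention (the exact code numbers used by rlcard's palette are an assumption):

```python
# Sketch of ANSI colorizing. Code numbers follow the common SGR
# convention (30-37 foreground, +10 for background, 1 for bold).
COLOR_CODES = {'gray': 30, 'red': 31, 'green': 32, 'yellow': 33,
               'blue': 34, 'magenta': 35, 'cyan': 36, 'white': 37}

def colorize_sketch(string, color, bold=False, highlight=False):
    num = COLOR_CODES[color]
    if highlight:
        num += 10  # switch to the background-color variant
    attrs = [str(num)]
    if bold:
        attrs.append('1')
    return '\x1b[%sm%s\x1b[0m' % (';'.join(attrs), string)
```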
- rlcard.utils.seeding.create_seed(a=None, max_bytes=8)¶
Create a strong random seed. Without this, Python 2 would seed from the system time, which can be non-robust, especially in the presence of concurrency.
- Parameters:
a (Optional[int, str]) – None seeds from an operating system specific randomness source.
max_bytes – Maximum number of bytes to use in the seed.
- rlcard.utils.seeding.error(msg, *args)¶
- rlcard.utils.seeding.hash_seed(seed=None, max_bytes=8)¶
Any given evaluation is likely to have many PRNGs active at once (most commonly because the environment is running in multiple processes). There is literature indicating that linear correlations between the seeds of multiple PRNGs can correlate their outputs:
http://blogs.unity3d.com/2015/01/07/a-primer-on-repeatable-random-numbers/ http://stackoverflow.com/questions/1554958/how-different-do-random-seeds-need-to-be http://dl.acm.org/citation.cfm?id=1276928
Thus, for sanity we hash the seeds before using them. (This scheme is likely not crypto-strength, but it should be good enough to get rid of simple correlations.)
- Parameters:
seed (Optional[int]) – None seeds from an operating system specific randomness source.
max_bytes – Maximum number of bytes to use in the hashed seed.
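The seed-hashing idea described above can be sketched with a cryptographic hash: nearby seeds (1, 2, 3, …) map to uncorrelated values. The choice of SHA-512 mirrors common practice but is an assumption here:

```python
import hashlib

def hash_seed_sketch(seed, max_bytes=8):
    """Push an integer seed through a hash and keep `max_bytes` bytes,
    breaking linear correlations between nearby seeds. SHA-512 is an
    assumed choice for illustration."""
    digest = hashlib.sha512(str(seed).encode('utf8')).digest()
    return int.from_bytes(digest[:max_bytes], 'big')
```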
- rlcard.utils.seeding.np_random(seed=None)¶
rlcard.utils.pettingzoo_utils¶
- rlcard.utils.pettingzoo_utils.reorganize_pettingzoo(trajectories)¶
Reorganize the trajectory to make it RL friendly
- Parameters:
trajectories (list) – A list of trajectories
- Returns:
New trajectories that can be fed into RL algorithms.
- Return type:
(list)
- rlcard.utils.pettingzoo_utils.run_game_pettingzoo(env, agents, is_training=False)¶
- rlcard.utils.pettingzoo_utils.tournament_pettingzoo(env, agents, num_episodes)¶
- rlcard.utils.pettingzoo_utils.wrap_state(state)¶