
Implementation of Self-Play Training for Real-Time Environments #252

Closed
Tiikara opened this issue Jul 28, 2024 · 2 comments
Tiikara commented Jul 28, 2024

I'm currently conducting experiments with the UniZero neural network using your library. I'm particularly interested in understanding the feasibility of implementing self-play training for UniZero in real-time environments, such as Atari games and similar platforms.
Upon examining the codebase, I noticed that self-play is currently implemented for board games. However, I'm finding it challenging to assess the extent of work required to extend this functionality to real-time environments.

Could you provide some insights on the following:

  1. Do you have an existing implementation for self-play in real-time environments?
  2. If yes, could you guide me on where to start to implement this feature? (I believe asking for guidance might be more efficient than trying to figure everything out independently)
  3. If not, could you share your perspective on the complexity of implementing this within your framework?

Your expertise and guidance would be greatly appreciated in helping me navigate this aspect of the library.

Tiikara commented Jul 28, 2024

I've made some progress in understanding the system's functionality. It appears that the default Gym environment doesn't natively support multiplayer capabilities. In light of this, I'm currently working on integrating the environment from https://github.com/Farama-Foundation/stable-retro, which offers multiplayer support, into LightZero, and then implementing self-play training.

If my implementation proves successful and aligns with your project's interests, I'd be happy to submit a pull request. However, I'm still very interested in any insights or best practices you might have regarding the optimal implementation approach. Your guidance would be invaluable in this process.
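For context on what such an integration involves: Gym Retro (and its stable-retro fork) exposes multiplayer games via `retro.make(game, players=2)`, where `step()` takes the concatenated button array of both players and returns one reward per player. The sketch below shows how a thin wrapper might split that joint interface into per-player actions and rewards for self-play collection. `DummyTwoPlayerEnv` is a stand-in I made up so the sketch runs without ROMs; the `SelfPlaySplitter` name and its exact return shape are likewise assumptions, not stable-retro or LightZero API.

```python
# Sketch: splitting a two-player retro-style env into per-player views.
# Assumes a retro-like interface where, with players=2, step() takes the
# concatenated button arrays of both players and returns one reward per
# player. Verify the exact spaces against stable-retro for your game.

class DummyTwoPlayerEnv:
    """Stand-in for retro.make(game=..., players=2): 4 buttons per player,
    zero-sum rewards derived from the pressed-button counts."""
    buttons_per_player = 4

    def reset(self):
        return [0.0] * 8  # fake observation

    def step(self, action):
        p1, p2 = action[:4], action[4:]
        rewards = (sum(p1) - sum(p2), sum(p2) - sum(p1))
        return [0.0] * 8, rewards, False, {}


class SelfPlaySplitter:
    """Combines two per-player actions into one joint env step and hands
    each player back its own reward, as a self-play collector expects."""

    def __init__(self, env, buttons_per_player):
        self.env = env
        self.n = buttons_per_player

    def reset(self):
        return self.env.reset()

    def step(self, action_p1, action_p2):
        joint = list(action_p1) + list(action_p2)
        obs, rewards, done, info = self.env.step(joint)
        return obs, (rewards[0], rewards[1]), done, info
```

Usage: `SelfPlaySplitter(DummyTwoPlayerEnv(), 4).step([1, 0, 0, 0], [0, 0, 0, 0])` steps both players at once and returns `(reward_p1, reward_p2)` separately.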

puyuan1996 (Collaborator) commented

Hello, as you mentioned, we currently use self-play training primarily in board games and plan to extend UniZero to these games in the near future. As for self-play training in real-time environments: it currently requires the environment to be a two-player game, which rules out most Atari games, since they are mostly single-player. If you wish to adapt an environment that supports multiplayer games into the LightZero library and conduct self-play training, you can follow these steps:

  1. Create a game environment with an API similar to that of the board game TicTacToe (link). The main methods to implement are reset() and step(), and you may also need to implement three different battle_modes. Additionally, write corresponding test files for your environment.

  2. Write a configuration file similar to TicTacToe's configuration file (link), called your_env_muzero_sp_mode_config.py, and test its performance.

  3. Adapt UniZero to environments that support different battle_modes by referring to the implementation of MuZero and test its performance.
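Step 1 above can be sketched as a minimal two-player environment skeleton. Everything here is illustrative: the class name and the toy game are my inventions, and while the three `battle_mode` values mirror the ones used by LightZero's TicTacToe env (`self_play_mode`, `play_with_bot_mode`, `eval_mode`), you should verify them and the exact timestep format (LightZero envs return a `BaseEnvTimestep`) against the current code.

```python
# Minimal sketch of a two-player env with a TicTacToe-like API (reset/step
# plus a battle_mode switch). Hypothetical toy game, not LightZero code.

class TwoPlayerCountingEnv:
    """Toy alternating game: players add 1 or 2 to a running total;
    whoever lands exactly on 10 wins. Stands in for a real env."""

    def __init__(self, battle_mode='self_play_mode'):
        # The three modes LightZero's board-game envs distinguish.
        assert battle_mode in ('self_play_mode', 'play_with_bot_mode', 'eval_mode')
        self.battle_mode = battle_mode
        self.total = 0
        self.current_player = 1

    def reset(self):
        self.total = 0
        self.current_player = 1
        return self._obs()

    def _obs(self):
        # Expose whose turn it is, as self-play collectors need `to_play`.
        return {'observation': self.total, 'to_play': self.current_player}

    def step(self, action):
        assert action in (1, 2)
        self.total += action
        done = self.total >= 10
        # Reward from the perspective of the player who just moved.
        reward = 1.0 if self.total == 10 else (-1.0 if self.total > 10 else 0.0)
        if not done:
            self.current_player = 3 - self.current_player  # swap 1 <-> 2
        return self._obs(), reward, done, {}
```

In `self_play_mode` a collector would feed both players' actions through this single `step()`, alternating on `to_play`; in `play_with_bot_mode` one side would instead be driven by a scripted bot inside the env.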

Thank you for your interest. Feel free to reach out with any questions.
