
[Question] State representation in Simple Adversary parallel environment #1202

Open
baraahsidahmed opened this issue Apr 24, 2024 · 3 comments
Labels
question Further information is requested

Comments

@baraahsidahmed

Question

Hi, I am working on simple_adversary_v3 together with agileRL to train the agents. I am using parallel_env and want to monitor the agents' positions during training. I found that the step function only returns the next-state observation as a dictionary of arrays (8 elements for the adversary, 10 for the good agents), and I can't quite work out what each element of the array is. The documentation says the observation is [self_pos, self_vel, goal_rel_position, landmark_rel_position, other_agent_rel_positions], which is five items, but the code comments say it should be [goal_rel_position, landmark_rel_position, other_agent_rel_positions].

I am working with N=2, so I interpreted the ten values of a good agent as: [goal_rel_pos_x, goal_rel_pos_y, 1st_landmark_x, 1st_landmark_y, 2nd_landmark_x (same as goal), 2nd_landmark_y (same as goal), other_good_agent_x, other_good_agent_y, adversary_x, adversary_y]. Can you please confirm whether this is the right value mapping, or otherwise explain what the returned values are exactly?
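For reference, this is roughly how I am looking at those observations (a minimal sketch using the standard PettingZoo parallel API; the agileRL training loop is left out):

```python
from pettingzoo.mpe import simple_adversary_v3

env = simple_adversary_v3.parallel_env(N=2, max_cycles=25, continuous_actions=False)
observations, infos = env.reset(seed=42)

# One random step, just to look at the per-agent observation arrays
actions = {agent: env.action_space(agent).sample() for agent in env.agents}
observations, rewards, terminations, truncations, infos = env.step(actions)

for agent, obs in observations.items():
    print(agent, obs.shape, obs)  # adversary_0 -> (8,), agent_0 / agent_1 -> (10,)
```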

@baraahsidahmed baraahsidahmed added the question Further information is requested label Apr 24, 2024
@gresavage

gresavage commented May 1, 2024

Take a look at the observation function for the scenario.

On line 247 we see that if the agent is a good agent you will observe:

rel_goal_x, rel_goal_y, (rel_lm_x, rel_lm_y) * n_landmarks, (rel_other_agent_x, rel_other_agent_y) * n_other_agents

In the case of N=2 there are N+1 agents (so each agent observes N others) and N landmarks (see line 95), so that's 2 + 2*2 + 2*2 = 2 + 4 + 4 = 10.

On line 250 we see that if the agent is an adversarial agent you will observe:

(rel_lm_x, rel_lm_y) * n_landmarks, (rel_other_agent_x, rel_other_agent_y) * n_other_agents

In the case of N=2 that's 2*2 + 2*2 = 4 + 4 = 8.
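If that layout is what you want to monitor, a minimal slicing sketch could look like the following (the helper names and reshapes are mine, not part of PettingZoo; they just mirror the counts above for N=2):

```python
import numpy as np

N = 2  # number of good agents; there are also N landmarks and 1 adversary

def split_good_agent_obs(obs):
    """Slice a good agent's 10-element observation: goal, landmarks, other agents."""
    obs = np.asarray(obs)
    goal_rel = obs[0:2]                            # relative position of the goal landmark
    landmark_rel = obs[2:2 + 2 * N].reshape(N, 2)  # one (x, y) pair per landmark
    other_rel = obs[2 + 2 * N:].reshape(-1, 2)     # one (x, y) pair per other agent
    return goal_rel, landmark_rel, other_rel

def split_adversary_obs(obs):
    """Slice the adversary's 8-element observation: no goal term."""
    obs = np.asarray(obs)
    landmark_rel = obs[0:2 * N].reshape(N, 2)      # one (x, y) pair per landmark
    other_rel = obs[2 * N:].reshape(-1, 2)         # one (x, y) pair per good agent
    return landmark_rel, other_rel
```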

@dm-ackerman
Contributor

Just an added comment. The documentation on the website defaults to the most recently released version of PettingZoo. If you're using the current master branch, things may have changed since then. The documentation for this game was updated a couple of months ago (after the last release). You can switch to the current master docs by selecting master from the version dropdown in the lower right of the doc page. IMO it's still unclear, because it doesn't explain that some values are arrays, but it does correctly match the code now.
(credit to Elliot for pointing this out)

@baraahsidahmed
Author

Thank you for all the in-depth clarifications!

On line 247 we see that if the agent is a good agent you will observe:

rel_goal_x, rel_goal_y, (rel_lm_x, rel_lm_y) * n_landmarks, (rel_other_agent_x, rel_other_agent_y) * n_other_agents

I think the problem with the current documentation was that 'self_vel' is not actually returned in the observation array, which is what confused me.

The documentation on the website defaults to the most recently released version of PettingZoo.

Thank you for pointing this out! I just clicked the GitHub link on the documentation page and assumed it was the corresponding code, without realizing the version difference. Sorry about that.
