[RLlib|New API|Inconsistency] LSTM Encoder lacks the output Linear, but stated in the docstring (#47625) #47626

brieyla1 · 2024-09-12T12:50:00Z

I don't know who to add for review

Why are these changes needed?

Fixes an inconsistency between the docstring and the implementation of the TorchLSTMEncoder class in the RLlib new API.

See Issue for more info

Related issue number

Closes #47625

Checks

I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
I've run scripts/format.sh to lint the changes in this PR.
I've included any doc changes needed for https://docs.ray.io/en/master/.
- I've added any new APIs to the API Reference. For example, if I added a
  method in Tune, I've added it in doc/source/tune/api/ under the
  corresponding .rst file.
I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
Testing Strategy
- Unit tests
- Release tests (Not sure; I ran all tests from python -m pytest -v -s python/ray/tests/test_basic.py as stated in https://docs.ray.io/en/latest/ray-contribute/getting-involved.html#testing and didn't run any C++ tests, as none have been changed.)

Signed-off-by: Brieyla <[email protected]>

simonsays1980 · 2024-09-25T09:13:50Z

@brieyla1 Thanks a lot for your PR and the great catch! Because we are rethinking the design of encoders and their configs to make implementing modules easier for users, we are reluctant to make changes to the encoders themselves at the moment and are opting to change all the docstrings instead.

This PR is great as it is in its code implementation.

brieyla1 · 2024-09-25T09:45:16Z

Interesting. I wonder what the impact on action processing is when an LSTM is connected directly to a head. The extra linear layer may make training easier when the VF layers are not shared, because it processes the data and its relationships once more before the split.

At the same time, I understand the reluctance to modify the Catalog & Encoder, as models built with it may have issues, e.g. when reinitializing the RLModule and migrating the weights.

We should do some benchmarks to weigh the benefits of such an "improvement" since we already know the downsides.
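To make the weight-migration concern concrete, here is a rough parameter count, assuming a single-layer LSTM with PyTorch-style parameters (input weights, recurrent weights and two bias vectors per gate) and an optional output Linear. The layer sizes and function names are hypothetical and chosen only for illustration; they are not taken from the RLlib encoder.

```python
def lstm_param_count(input_size: int, hidden_size: int) -> int:
    """Parameters of one PyTorch-style LSTM layer (4 gates, two bias vectors)."""
    per_gate = input_size * hidden_size + hidden_size * hidden_size + 2 * hidden_size
    return 4 * per_gate


def linear_param_count(in_features: int, out_features: int) -> int:
    """Parameters of a dense layer with bias."""
    return in_features * out_features + out_features


# Hypothetical sizes, for illustration only.
INPUT_SIZE, HIDDEN_SIZE, OUTPUT_SIZE = 64, 256, 256

lstm_only = lstm_param_count(INPUT_SIZE, HIDDEN_SIZE)
lstm_plus_linear = lstm_only + linear_param_count(HIDDEN_SIZE, OUTPUT_SIZE)

# The difference is the set of weights an older checkpoint would not contain.
extra_params = lstm_plus_linear - lstm_only
print(lstm_only, lstm_plus_linear, extra_params)
```

Under these assumptions, a checkpoint saved from the LSTM-only encoder has no values for the extra Linear weights, so they would have to be freshly initialized when migrating.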

simonsays1980 · 2024-09-26T14:51:51Z

> Interesting. I wonder what the impact on action processing is when an LSTM is connected directly to a head. The extra linear layer may make training easier when the VF layers are not shared, because it processes the data and its relationships once more before the split.
>
> At the same time, I understand the reluctance to modify the Catalog & Encoder, as models built with it may have issues, e.g. when reinitializing the RLModule and migrating the weights.
>
> We should do some benchmarks to weigh the benefits of such an "improvement" since we already know the downsides.

Benchmarking can already be done without adding the linear layer to the encoder, by adding it to the head(s) instead: post_fcnet_hiddens adds layers to the head network(s). The more complex your environment is, the more layers are needed to succeed at the task(s). PPO learns Atari Pong with just a single FC layer after the CNN encoder (this should take around 8 minutes on 4 GPUs and 95 EnvRunners). In this case the layers are shared between the policy and the critic (just make sure that you tune vf_loss_coeff).
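A minimal sketch of such a benchmark setup, written as plain Python dicts. The key names follow RLlib's model config defaults and the PPO vf_loss_coeff setting mentioned above; all values are assumptions for illustration, and how these dicts are passed to the algorithm config depends on the Ray version and is not shown here.

```python
# Model settings: LSTM encoder with one extra dense layer in the head(s).
model_config = {
    "use_lstm": True,
    "lstm_cell_size": 256,           # assumed size, illustration only
    "post_fcnet_hiddens": [256],     # a single FC layer after the encoder
    "post_fcnet_activation": "relu",
    "vf_share_layers": True,         # policy and critic share layers
}

# Training settings: with shared layers, the value loss weight needs tuning.
training_config = {
    "vf_loss_coeff": 0.5,            # assumed starting point, tune per task
}

# Variant without the extra dense layer, as the baseline to compare against.
baseline_model_config = {**model_config, "post_fcnet_hiddens": []}
```

Running both variants on the same environment and seeds would show whether the extra layer helps before any change to the encoder itself is considered.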

Fixed encoders & ran script

a8eb4b9

Signed-off-by: Brieyla <[email protected]>

brieyla1 requested review from sven1977, ArturNiederfahrenhorst and simonsays1980 as code owners September 12, 2024 12:50

brieyla1 changed the title ~~[RLlib|New API|Inconsistency] LSTM Encoder lacks the output Linear, but stated in the docstring~~ [RLlib|New API|Inconsistency] LSTM Encoder lacks the output Linear, but stated in the docstring (#47625) Sep 12, 2024

brieyla1 added 4 commits September 13, 2024 19:58

Merge branch 'master' into issue/rllib-linear-missing-rnn

01b0e7e

Merge branch 'master' into issue/rllib-linear-missing-rnn

f2255f6

Merge branch 'master' into issue/rllib-linear-missing-rnn

7e839aa

Merge branch 'master' into issue/rllib-linear-missing-rnn

55b8aa1

anyscalesam added the triage (Needs triage, e.g. priority, bug/not-bug, and owning component) and rllib (RLlib related issues) labels Sep 16, 2024

brieyla1 added 4 commits September 18, 2024 14:53

Merge branch 'master' into issue/rllib-linear-missing-rnn

f0d4918

Merge branch 'master' into issue/rllib-linear-missing-rnn

0ad19d0

Merge branch 'master' into issue/rllib-linear-missing-rnn

a4a02f1

Merge branch 'master' into issue/rllib-linear-missing-rnn

cd04697

simonsays1980 mentioned this pull request Sep 25, 2024

[RLlib] Update docstrings in recurrent encoders. #47816

Closed

8 tasks
