Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REP] Ray Export API. #55

Open
wants to merge 8 commits into
base: main
Choose a base branch
from
Open

[REP] Ray Export API. #55

wants to merge 8 commits into from

Conversation

MissiontoMars
Copy link

No description provided.

## Summary
### General Motivation

In the current design of Ray, the way to export various states in the Ray cluster are inconsistent.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the State API is typically recommended for end users to fetch current state info of a running Ray cluster, but the state API gets data from multiple sources depending on the resource type.

The export API will provide another way to fetch state data that works well for large scale clusters or can be saved to query after the cluster is terminated, but I don’t see disjoint interfaces for state data as the main motivation for this feature.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense. I mentioned it in the Key requirements:
To obtain the current status of the ray cluster at one time, you need to use state api.

we can achieve the observable ability independent of the Ray cluster. The most typical scenario is to build the Ray history server.

#### Key requirements:
- Need to expose all necessary ray states for basic observability, tasks/actor/jobs/nodes.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A good initial use case of this data is rebuilding the dashboard APIs, so we should make sure the schema for each resource contains all required fields. Timestamps should also be included so the reader can reconstruct a history of state changes.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me try to understand. Do you mean that we can use the data format of the dashboard api to correct the completeness of the exported data by the export api? For example, exported state is published to gcs, and dashboard gets data from gcs.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we don’t need to exactly use the dashboard API schemas because that would require some post processing which we can leave to the users for flexibility. We should be able to postprocess the export API data to match the dashboard API responses though (which is needed for applications like history server)



- Export event
We propose to use pubsub for event export, because pubsub is easier to decouple systems,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My opinion is we can limit this REP to generating the events and let users implement exporting the events to an external system. We can share recommendations on how to do this (eg: log ingestion using Vector which can send data to various sinks per the use case), but I think the pub sub event export isn’t necessary for this version to keep the interface simple and flexible

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or instead, we give some possible export methods (such as gcs or vector) as a suggestion, not a proposal?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think it’s fine to describe some export options as a suggestion, but this shouldn’t be the key part of this REP. Could you condense and move this to a “How to use Export API” section, so it’s clear these are ideas for the users?

@SongGuyang
Copy link

I have two questions:

  • It looks like that users can not get a completed history server after the implementation of this REP. Do you have any plans to support a default history server including UI?
  • What's the status of this REP? Do you have any plans for the implementation of this REP?

Copy link

@nikitavemuri nikitavemuri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly rewording and restructuring comments

we can achieve the observable ability independent of the Ray cluster. The most typical scenario is to build the Ray history server.

#### Key requirements:
- Need to expose all necessary ray states for basic observability, tasks/actor/jobs/nodes.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Need to expose all necessary ray states for basic observability, tasks/actor/jobs/nodes.
- Need to expose necessary Ray state information which is currently returned in the dashboard APIs for tasks, actors, jobs, and nodes.

Comment on lines +18 to +19
- Light load may be added when we export states, but we should be able to limit this impact (eg: max data stored in memory, not blocking any control logic).
The export API should not put too much load on the Ray cluster.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Light load may be added when we export states, but we should be able to limit this impact (eg: max data stored in memory, not blocking any control logic).
The export API should not put too much load on the Ray cluster.
- We should be able to limit any additional load from exporting state information (eg: max data stored in memory). Default configurations for the export API should have minimal impact on the Ray cluster and we will run performance tests to determine this.

Comment on lines +20 to +21
- Export states streamingly rather than fetching it directly from ray cluster. To obtain the current status of the ray cluster at one time,
you need to use [the state observability api](https://github.com/ray-project/enhancements/blob/main/reps/2022-04-21-state-observability-apis.md "the state observability api") .

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Export states streamingly rather than fetching it directly from ray cluster. To obtain the current status of the ray cluster at one time,
you need to use [the state observability api](https://github.com/ray-project/enhancements/blob/main/reps/2022-04-21-state-observability-apis.md "the state observability api") .
- Export events should be emitted on state change, rather than being pulled from the current status of the ray cluster (as done with [the state observability api](https://github.com/ray-project/enhancements/blob/main/reps/2022-04-21-state-observability-apis.md)).

This was how I interpreted it, please let me know if it refers to something else

Comment on lines +22 to +23
- Certain states can be exported respectively. RayCores (actors/tasks/nodes), RayData, and RayServe-related states can be selectively exported.
For example, if a user is only interested in the state of RayServe and not in the state of RayCore, then the export of RayCore state can be disabled to reduce overhead.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Certain states can be exported respectively. RayCores (actors/tasks/nodes), RayData, and RayServe-related states can be selectively exported.
For example, if a user is only interested in the state of RayServe and not in the state of RayCore, then the export of RayCore state can be disabled to reduce overhead.
- [P1] Allow users to selectively enable exporting RayCore (actors/tasks/jobs/nodes), RayData, or RayServe-related states

Let’s label this P1 and move to the bottom of the list because it will be more relevant after we finish implementing the library events

you need to use [the state observability api](https://github.com/ray-project/enhancements/blob/main/reps/2022-04-21-state-observability-apis.md "the state observability api") .
- Certain states can be exported respectively. RayCores (actors/tasks/nodes), RayData, and RayServe-related states can be selectively exported.
For example, if a user is only interested in the state of RayServe and not in the state of RayCore, then the export of RayCore state can be disabled to reduce overhead.
- Friendly to all types of users (especially cloud vendors), easy to deploy and use, without modifying Ray.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove this point because it doesn’t add a specific requirement. Was there something in particular you wanted to call out here?

| Objects | CoreWorker and raylet | None | |
| Jobs | JobManager | When JobInfoStorageClient.put_status called | https://docs.google.com/document/d/1upQRU-f8WgVH_NWBmeJyegyOSJwNDPPqnl1cCGpmiGo/edit?usp=sharing |
| Nodes | GCS | GcsNodeManager::HandleXXXNode | https://docs.google.com/document/d/1qjoF51h2oUN2sr_MtPnovbNFZYZrh3WLNR_P0HrUuOI/edit?usp=sharing |
| Placement groups | GCS | GcsPlacementGroupManager::HandleXXXPlacementGroup | |
Copy link

@nikitavemuri nikitavemuri Sep 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have an example schema for placement groups? Otherwise let’s remove from this table or mark as P1

Comment on lines +79 to +91
- RayServe
State change events for replicas, deployments, applications.

| Event | Event Source | When to export | Format example(maybe a file link)|
| ---- | ---- | ---- | ---- |
| Serve App | ServeController Actor | None | None |

- RayData
All datasets, the dag, and execution progress.

| Event | Event Source | When to export | Format example(maybe a file link)|
| ---- | ---- | ---- | ---- |
| Datasets | ray.data.internal.stats._StatsActor | _StatsActor's update function called. | None |

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let’s actually remove the library events info for now because we don’t have schemas for these yet. Can just say We are planning to emit events for Ray Serve and Ray Data in the future after validating implementation and use cases of the core events.



- Export event
We propose to use pubsub for event export, because pubsub is easier to decouple systems,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think it’s fine to describe some export options as a suggestion, but this shouldn’t be the key part of this REP. Could you condense and move this to a “How to use Export API” section, so it’s clear these are ideas for the users?

we can achieve the observable ability independent of the Ray cluster. The most typical scenario is to build the Ray history server.

#### Key requirements:
- Need to expose all necessary ray states for basic observability, tasks/actor/jobs/nodes.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we don’t need to exactly use the dashboard API schemas because that would require some post processing which we can leave to the users for flexibility. We should be able to postprocess the export API data to match the dashboard API responses though (which is needed for applications like history server)

Comment on lines +4 to +14
In the current design of Ray, the way to export various states in the Ray cluster are inconsistent.
For example, the state of the Actor is broadcast through GCS pubsub, and to obtain the state change of the node,
it is necessary to query rpc service (NodeInfoGcsService). For the jobs submitted through job submission,
there is no way to expose the state. The high-level libraries on top of Ray also don't have a unified export way.
E.g. RayServe and RayData collect the states through their own StateActor and report to the Dashboard respectively.

It is very difficult to obtain these ray basic states outside the Ray cluster. This issue is reflected in two aspects:
1. Scale of data exceeds current dashboard API limits.
2. After the Ray cluster terminates, all status data is lost.
If a unified export API can be defined,
we can achieve the observable ability independent of the Ray cluster. The most typical scenario is to build the Ray history server.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would re-organize this to something like

Suggested change
In the current design of Ray, the way to export various states in the Ray cluster are inconsistent.
For example, the state of the Actor is broadcast through GCS pubsub, and to obtain the state change of the node,
it is necessary to query rpc service (NodeInfoGcsService). For the jobs submitted through job submission,
there is no way to expose the state. The high-level libraries on top of Ray also don't have a unified export way.
E.g. RayServe and RayData collect the states through their own StateActor and report to the Dashboard respectively.
It is very difficult to obtain these ray basic states outside the Ray cluster. This issue is reflected in two aspects:
1. Scale of data exceeds current dashboard API limits.
2. After the Ray cluster terminates, all status data is lost.
If a unified export API can be defined,
we can achieve the observable ability independent of the Ray cluster. The most typical scenario is to build the Ray history server.
There are several major improvements requested by users about the Ray Dashboard
1. Currently, all state data is lost when the Ray cluster terminates and users are looking for persistence of this data.
2. The Ray dashboard has scalability limits on the amount of data stored and returned. Advanced users would benefit from greater scalability.
3. There is no unified and accurate way to get state events pushed on state change across various resource types. The State and dashboard APIs support a pull model, where it is possible to miss some events. There is also not a consistent way across resource types to get this information directly from components that generate it.
A unified export API could serve as a base to address these issues by allowing state data to be exported and processed separately. This would ensure that observability features remain functional independent of the Ray cluster.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to obtain the state change of the node, it is necessary to query rpc service (NodeInfoGcsService)

Just curious, do you or other users already query these GCS services directly to get data?

@nikitavemuri
Copy link

nikitavemuri commented Sep 5, 2024

@SongGuyang

  • It looks like that users can not get a completed history server after the implementation of this REP. Do you have any plans to support a default history server including UI?

Yeah, we wanted to open source the export API as the first step for various dashboard persistence and scalability features. We are looking into prototyping a history server to better understand the design options.

One open question for a full history server is choosing the right backend and how to interface with it. We need to choose a light weight backend, or we should probably support plugging into multiple backends.

  • What's the status of this REP? Do you have any plans for the implementation of this REP?

Yes, I have an initial working version of this in review, but we need to still verify if there are any performance regressions

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants