Replies: 4 comments 5 replies
-
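Much of singularity-v2's functionality involves database tables, and changes like this one will probably come up again. When making a major database schema change, consider adding handling for database files from older versions, or providing a small tool that migrates data from an old-version database to the new one.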
-
@Xinan Xu - PL
If we continue the thinking logically we have a hierarchy:
-
After more thought and incorporating some feedback, I think we can simplify this into a two-step process.
Example commands below
-
Agreed over a Zoom call to use the commands above.
-
We received some community feedback that the concepts of dataset and datasource are easy to confuse.
There is also a relevant feature that requires some refactoring of the current mapping.
Below is my proposal for how we can refactor our current CLI commands and potentially some database schemas:
Dataset
A dataset is a top-level object to which you can attach multiple source storages for data preparation and one optional output storage for exporting CAR files. If no output storage is defined, inline preparation is used.
It is also the object to which you attach one or more wallets for deal making. A dataset has no other arguments besides an enabled/disabled state. Only enabled datasets are considered for data preparation, so the user can define the source storages and the output storage before enabling the dataset.
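To make the proposed shape of a dataset concrete, here is a minimal sketch in Python. It is only an illustration of the description above, not the actual singularity schema: the class names, field names, and the two helper methods are all assumptions made up for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Storage:
    # Hypothetical fields; the real schema may differ.
    name: str
    path: str


@dataclass
class Dataset:
    name: str
    # Any number of source storages can be attached.
    sources: List[Storage] = field(default_factory=list)
    # At most one output storage; None means no CAR export.
    output: Optional[Storage] = None
    # Wallet addresses attached for deal making.
    wallets: List[str] = field(default_factory=list)
    # The only dataset-level setting in the proposal.
    enabled: bool = False

    def uses_inline_preparation(self) -> bool:
        # Without an output storage, inline preparation is used.
        return self.output is None

    def eligible_for_prep(self) -> bool:
        # Only enabled datasets are considered for data preparation.
        return self.enabled


ds = Dataset(name="example")
ds.sources.append(Storage("src1", "/data/src1"))
```

A freshly created dataset is disabled and has no output, so it would use inline preparation once enabled.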
Storage
A storage is a local or remote storage system. It can be used as a source of a dataset, or as its output to store CAR files.
When defining a source storage, you can also define how you'd like to chunk it and whether to encrypt the files.
To browse the list of files in a storage, you can use the check command.
Attach and detach storage to Dataset
Storages can be attached to a dataset as either a source or an output. One dataset can have multiple sources but at most one output.
If the dataset is not currently in the process of deal making or data preparation, you can detach a source storage.
You can switch to a different output storage at any time, e.g. when your current output storage is running out of space.
If you don't set an output storage, or you clear it, inline preparation is used.
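The attach and detach rules above could be sketched roughly as follows. This is a Python illustration under stated assumptions: the `busy` flag, the method names, and `DatasetBusyError` are hypothetical and stand in for whatever check the real implementation would do against the database.

```python
from typing import List, Optional


class DatasetBusyError(Exception):
    """Raised when a source is detached while the dataset is in use."""


class Dataset:
    def __init__(self, name: str) -> None:
        self.name = name
        self.sources: List[str] = []
        self.output: Optional[str] = None
        # Hypothetical flag: True during data preparation or deal making.
        self.busy = False

    def attach_source(self, storage: str) -> None:
        # A dataset can have multiple sources; ignore duplicates.
        if storage not in self.sources:
            self.sources.append(storage)

    def detach_source(self, storage: str) -> None:
        # Detaching is only allowed while the dataset is idle.
        if self.busy:
            raise DatasetBusyError(f"{self.name} is in use")
        self.sources.remove(storage)

    def set_output(self, storage: Optional[str]) -> None:
        # Replaces any existing output; None clears it.
        self.output = storage

    @property
    def inline_preparation(self) -> bool:
        # No output storage means inline preparation.
        return self.output is None


ds = Dataset("example")
ds.attach_source("src1")
ds.attach_source("src2")
ds.set_output("disk-a")
ds.set_output("disk-b")  # switch output, e.g. disk-a is full
```

Because there is only a single `output` slot, switching outputs is just an overwrite, and clearing it falls back to inline preparation.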
Data preparation management
You can start the dataset worker at any time.
However, you need to enable a dataset before the dataset worker can pick it up.
Data preparation tasks are managed from the storage menu. They are not managed at the dataset level because a dataset can contain multiple source storages, which makes dataset-level commands confusing and error prone: a user could accidentally start or stop data preparation for every source storage. For example, if one source finished preparation and another failed, would a retry mean retrying both, or only the failed one? The same issue exists for all similar commands that update state in the database. See below
For read-only commands, we can offer the same command at the dataset level, which iterates over all source storages and displays similar info.
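The split between state-changing and read-only commands could look something like the sketch below. It is a Python illustration only; the state names and the `retry` and `status` methods are assumptions for this example, not existing singularity commands.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SourceStorage:
    name: str
    # Hypothetical states: ready, processing, complete, error.
    state: str = "ready"

    def retry(self) -> None:
        # State-changing action: affects only this one source.
        if self.state == "error":
            self.state = "ready"


@dataclass
class Dataset:
    name: str
    sources: List[SourceStorage] = field(default_factory=list)

    def status(self) -> Dict[str, str]:
        # Read-only: iterate over every source and report its state.
        return {s.name: s.state for s in self.sources}


done = SourceStorage("a", "complete")
failed = SourceStorage("b", "error")
ds = Dataset("example", [done, failed])

# Retrying is explicit per source, so the finished source is untouched.
failed.retry()
```

Keeping `retry` on the storage removes the ambiguity about which sources a dataset-level retry would touch, while `status` stays safe to expose at the dataset level because it changes nothing.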
Wallet
Wallet management will not change.
You can associate a wallet with a dataset. A dataset needs at least one associated wallet before it can be used for deal making.