Commit

Address #5, thanks @Fquico1999
Signed-off-by: Jay Wang <[email protected]>
xiaohk committed Feb 11, 2022
1 parent 9a88bf7 commit 4ae0438
Showing 2 changed files with 5 additions and 3 deletions.
4 changes: 3 additions & 1 deletion data-generation/README.md
@@ -36,4 +36,6 @@ If you change the dataset being visualized, should start by setting constants li

### Swap Dataset

-You should start by swapping out the loaded models in [`dataset_train`](https://github.com/poloclub/dodrio/blob/dd7a98bb26335cf960e7e508b683ff02d4a4a1ea/data-generation/dodrio-data-gen.py#L892), [`dataset_vali`](https://github.com/poloclub/dodrio/blob/dd7a98bb26335cf960e7e508b683ff02d4a4a1ea/data-generation/dodrio-data-gen.py#L891), and [`dataset_test`](https://github.com/poloclub/dodrio/blob/dd7a98bb26335cf960e7e508b683ff02d4a4a1ea/data-generation/dodrio-data-gen.py#L873). The primary change needed, so that the correct files can be generate across datasets are changes to the function [`collate_sst2`](https://github.com/poloclub/dodrio/blob/dd7a98bb26335cf960e7e508b683ff02d4a4a1ea/data-generation/dodrio-data-gen.py#L182) (which should be renamed as well). Changes to this function will affect how entries within the dataset are processed/tokenized. For non-classification NLP tasks, more significant changes will need to be made accross the file to support visualization in Dodrio. If you have any questions or problems, feel free to [open an issue](https://github.com/poloclub/dodrio/issues/new/choose).
+You should start by swapping out the loaded models in [`dataset_train`](https://github.com/poloclub/dodrio/blob/dd7a98bb26335cf960e7e508b683ff02d4a4a1ea/data-generation/dodrio-data-gen.py#L892), [`dataset_vali`](https://github.com/poloclub/dodrio/blob/dd7a98bb26335cf960e7e508b683ff02d4a4a1ea/data-generation/dodrio-data-gen.py#L891), and [`dataset_test`](https://github.com/poloclub/dodrio/blob/dd7a98bb26335cf960e7e508b683ff02d4a4a1ea/data-generation/dodrio-data-gen.py#L873). The primary change needed, so that the correct files can be generate across datasets are changes to the function [`collate_sst2`](https://github.com/poloclub/dodrio/blob/dd7a98bb26335cf960e7e508b683ff02d4a4a1ea/data-generation/dodrio-data-gen.py#L182) (which should be renamed as well). Changes to this function will affect how entries within the dataset are processed/tokenized. For non-classification NLP tasks, more significant changes will need to be made across the file to support visualization in Dodrio.
+For the demo, we choose the sample with `instanceID=1562`. You would need to change this variable across [different views](https://github.com/poloclub/dodrio/search?q=instanceID) for your custom dataset.
+If you have any questions or problems, feel free to [open an issue](https://github.com/poloclub/dodrio/issues/new/choose).
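
Reviewer note: the README text in this hunk describes replacing `collate_sst2` with a dataset-specific collate function. As a rough illustration only, a renamed collate function might look like the sketch below. Everything in it is an assumption and is not taken from `dodrio-data-gen.py`: the name `collate_custom`, the `text` and `label` field names, the whitespace tokenizer standing in for the model's tokenizer, and the shape of the returned dictionary.

```python
from typing import Any, Callable, Dict, List, Sequence

# Hypothetical field names; a custom dataset may use different keys.
TEXT_KEY = "text"
LABEL_KEY = "label"


def whitespace_tokenize(text: str) -> List[str]:
    """Stand-in tokenizer (assumption); the real script would use the model's tokenizer."""
    return text.lower().split()


def collate_custom(
    batch: Sequence[Dict[str, Any]],
    tokenize: Callable[[str], List[str]] = whitespace_tokenize,
    pad_token: str = "[PAD]",
) -> Dict[str, list]:
    """Turn raw dataset entries into padded token lists, attention masks, and labels."""
    tokens = [tokenize(str(entry[TEXT_KEY])) for entry in batch]
    max_len = max((len(t) for t in tokens), default=0)
    # Pad every sequence to the longest one in the batch.
    padded = [t + [pad_token] * (max_len - len(t)) for t in tokens]
    # 1 marks a real token, 0 marks padding.
    mask = [[1] * len(t) + [0] * (max_len - len(t)) for t in tokens]
    labels = [int(entry[LABEL_KEY]) for entry in batch]
    return {"tokens": padded, "attention_mask": mask, "labels": labels}


batch = collate_custom([
    {"text": "A great movie", "label": 1},
    {"text": "Dull", "label": 0},
])
```

Only the two key constants and the tokenizer would need to change for a dataset with different field names. A non-classification task would also need a different label handling step, as the README text notes.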
4 changes: 2 additions & 2 deletions package.json
@@ -4,7 +4,7 @@
"scripts": {
"build": "rollup -c",
"dev": "rollup -c -w",
-"start": "sirv public"
+"start": "sirv public --port 5005"
},
"devDependencies": {
"@rollup/plugin-commonjs": "^14.0.0",
@@ -13,7 +13,7 @@
"babel-eslint": "^10.1.0",
"eslint": "^7.18.0",
"eslint-plugin-svelte3": "^3.0.0",
-"node-sass": "^5.0.0",
+"node-sass": "^7.0.0",
"rollup": "^2.3.4",
"rollup-plugin-livereload": "^2.0.0",
"rollup-plugin-svelte": "^6.0.0",