Polar Deep Insights Topics (pdi-topics)

Topic modeling Jupyter notebooks for Polar Deep Insights data and scientific text.

Each notebook below can be launched in a browser from its launch button:

EGU sessions analysis

EGU complete corpus topic modeling

ScatterText visualization on EGU abstracts

The notebooks can also be run on your own machine, either with Docker or in a Conda environment, as described below.

Build and start a docker image

Copy the Dockerfile to the project folder and run the following command to build the image.

docker build -t pdi-topics .

To start a container from that image, run the following command.

docker run -d -t -p 8888:8888 --name pdi-topics pdi-topics

To run notebooks from a particular location on the host, mount that directory as a volume.

docker run -d -t -p 8888:8888 -v "$MY_LOCAL_PATH":/opt/pdi-topics/notebooks --name pdi-topics pdi-topics
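
The host side of -v is only treated as a host directory when it is a path; a bare name such as notebooks is interpreted by Docker as a named volume instead. The sketch below, which assumes a POSIX shell, resolves a directory to an absolute path and prints the resulting command without running it. The function name pdi_topics_mount_cmd is hypothetical and not part of this project.

```shell
# Hypothetical helper: print the docker run command for a host directory.
# It only prints the command; no container is started.
pdi_topics_mount_cmd() {
  dir=$1
  if [ ! -d "$dir" ]; then
    echo "not a directory: $dir" >&2
    return 1
  fi
  # cd into the directory and ask the shell for its absolute path.
  abs_dir=$(cd "$dir" && pwd)
  printf 'docker run -d -t -p 8888:8888 -v "%s":/opt/pdi-topics/notebooks --name pdi-topics pdi-topics\n' "$abs_dir"
}
```

Calling pdi_topics_mount_cmd "$MY_LOCAL_PATH" prints the command so it can be reviewed before it is run.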

You will need the Jupyter token to access the notebooks. To get it, inspect the logs of the Docker container.

docker logs pdi-topics
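
The token can also be pulled out of the log text directly. The sketch below assumes the Jupyter server prints a login URL containing ?token= followed by a hexadecimal string, which is the usual format; extract_jupyter_token is a hypothetical helper name, not something this project provides.

```shell
# Hypothetical helper: read log text on standard input and print the
# first Jupyter token found in it.
# Assumes the token appears as "token=<hex digits>" inside a URL.
extract_jupyter_token() {
  grep -o 'token=[0-9a-f]*' | head -n 1 | cut -d= -f2
}
```

A possible use is docker logs pdi-topics 2>&1 | extract_jupyter_token, where 2>&1 makes sure log lines written to standard error are searched as well.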

Using Conda environments

The topic notebooks can also be run without Docker, in an environment created with conda (Anaconda or Miniconda).

conda env create -f environment.yml

Then activate the environment and start Jupyter. With conda 4.4 or later, conda activate pdi-topics is the preferred form of the first command.

source activate pdi-topics
jupyter notebook --allow-root --notebook-dir=$MY_DIR --ip='0.0.0.0' --port=8888 --no-browser
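
The activation step assumes the environment is called pdi-topics. conda env create takes the environment name from the top-level name: field of environment.yml, so the name can be read from the file rather than typed by hand. This is a minimal sketch assuming a POSIX shell and a plain, unquoted name value; conda_env_name is a hypothetical helper.

```shell
# Hypothetical helper: print the environment name declared in a conda
# environment file, taken from its top-level "name:" key.
# Does not strip quotes or trailing whitespace around the value.
conda_env_name() {
  sed -n 's/^name:[[:space:]]*//p' "$1" | head -n 1
}
```

For example, source activate "$(conda_env_name environment.yml)" activates whichever name the file declares.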

Running pdi-topics on a local Solr index with Sparkler data

Follow the steps at https://github.com/USCDataScience/sparkler to run Sparkler on a seed URL or seed file.
When the crawl completes, the indexed data can be browsed at http://localhost:8983/solr/#/crawldb/query
Build the Docker image and run it with the following command, replacing {HOST-IP} with your system's IP address.

docker run -d -t --add-host=docker:{HOST-IP} -p 8888:8888 --name pdi-topics pdi-topics

Run the sparkler-pdi-topics.ipynb and sparkler-pdi-scikit-topics.ipynb notebooks to view results for the Sparkler data.
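
For the container start step above, leaving the {HOST-IP} placeholder in the command is an easy mistake. The sketch below assumes a POSIX shell and an IPv4 address; it checks the shape of the value and prints the command without running it. The function name pdi_topics_run_cmd is hypothetical.

```shell
# Hypothetical helper: print the docker run command for a given host IP.
# It only prints the command; no container is started.
pdi_topics_run_cmd() {
  host_ip=$1
  # Shape check only: four dot-separated groups of 1 to 3 digits.
  # It does not verify that each group is at most 255.
  if ! printf '%s\n' "$host_ip" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'; then
    echo "not an IPv4 address: $host_ip" >&2
    return 1
  fi
  printf 'docker run -d -t --add-host=docker:%s -p 8888:8888 --name pdi-topics pdi-topics\n' "$host_ip"
}
```

Calling it with an unreplaced placeholder, as in pdi_topics_run_cmd '{HOST-IP}', returns a non-zero status instead of printing a command.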
