
Commit

Permalink
Merge branch 'main' into 1334-submission-portal--text-change-environmental-extension
pkalita-lbl committed Jan 31, 2025
2 parents 32b8649 + 677cd54 commit 71b2316
Showing 7 changed files with 201 additions and 87 deletions.
36 changes: 26 additions & 10 deletions .env.example
Original file line number Diff line number Diff line change
@@ -1,27 +1,43 @@
# NMDC Database URI Override
# == NMDC Database URI Override ==
# Uncomment and modify these lines to provide arguments to nmdc server backend if running OUTSIDE the docker environment.
# Otherwise, .docker-env will take care of providing these arguments.
# NMDC_DATABASE_URI="postgresql:///nmdc_a"
# NMDC_TESTING_DATABASE_URI="postgresql:///nmdc_testing"

# ORCID OAuth Setup
# NMDC_ORCID_CLIENT_ID=changeme
# NMDC_ORCID_CLIENT_SECRET=changeme
#
# == NERSC Settings ==
# This is the username that will be used to authenticate with NERSC. It is used to fetch files
# used in the ingest process and to fetch database backups to restore your local database.
NERSC_USER=changeme

# == Authentication ==
# These values come from ORCID and are used when logging into your local portal instance.
NMDC_ORCID_CLIENT_ID=changeme
NMDC_ORCID_CLIENT_SECRET=changeme

# Base URL (without a trailing slash) at which the application can access an instance of ORCID.
# Note: For the production instance of ORCID, use: "https://orcid.org" (default)
# For the sandbox instance of ORCID, use: "https://sandbox.orcid.org"
# NMDC_ORCID_BASE_URL="https://orcid.org"

# MongoDB Ingest Setup
# These should be generated using a secure random string generator (e.g. `openssl rand -hex 32`)
NMDC_SESSION_SECRET_KEY=generateme
NMDC_API_JWT_SECRET=generateme

# == MongoDB Ingest Setup ==
# These values are used to connect to the MongoDB instance that provides data during the ingest process.
# NMDC_MONGO_HOST=host.docker.internal
# NMDC_MONGO_PORT=changeme
# NMDC_MONGO_USER=changeme
# NMDC_MONGO_PASSWORD=changeme

NMDC_ENVIRONMENT="development"

# Uncomment to enable CORS for local development of the nmdc-field-notes
# == CORS ==
# Uncomment to enable CORS for local development of the NMDC Field Notes mobile app.
# NMDC_CORS_ALLOW_ORIGINS=capacitor://localhost,ionic://localhost,http://localhost,http://127.0.0.1:8100

# == Testing ==
# Change this value to "testing" to run tests outside tox
NMDC_ENVIRONMENT=development

# (Optional) Slack incoming webhook URL the ingester can use to post messages to Slack.
# Reference: https://api.slack.com/messaging/webhooks#create_a_webhook
# SLACK_WEBHOOK_URL_FOR_INGESTER=changeme
# SLACK_WEBHOOK_URL_FOR_INGESTER=changeme
2 changes: 2 additions & 0 deletions .git-blame-ignore-revs
@@ -3,3 +3,5 @@
8d6cb9d723f925a89bc79f5492f9166d819e5fe6
# Reformat files with black v24
b96c270d51c7f4a5aba10dac30d203be502daceb
# Format with black v25
c623f18ac3db95b0a2272fa9b5c1a23800731c43
1 change: 1 addition & 0 deletions .gitignore
@@ -9,3 +9,4 @@ __pycache__/
vetur.config.js
**.local**
build/**
data/
2 changes: 2 additions & 0 deletions docker-compose.yml
@@ -54,6 +54,8 @@ services:
- default
ports:
- "8000:8000"
volumes:
- ./data/ingest:/data/ingest

web:
image: ghcr.io/microbiomedata/nmdc-server/client:main
243 changes: 168 additions & 75 deletions docs/development.md
@@ -1,46 +1,85 @@
# Development Setup

## Docker

Install docker and docker-compose.
## Prerequisites

* [Docker](https://docs.docker.com/get-started/)
* [Docker Compose](https://docs.docker.com/compose/install/) (does not need to be installed separately if using Docker Desktop)
* Python 3.9
* Optional: [pyenv](https://github.com/pyenv/pyenv) for managing Python versions
* Node.js >= 20
* Optional: [nvm](https://github.com/nvm-sh/nvm) for managing Node.js versions
* Yarn 1.x
* Recommended: install via `corepack`:
```shell
corepack enable
corepack install --global yarn@1
```
* A local clone of the `nmdc-server` repository
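
As a rough preflight for the list above, the sketch below reports which of the prerequisite commands are missing from your `PATH`. The `missing_tools` helper is hypothetical (it is not part of this repository), the command names are assumptions (your system may expose Python as `python3`, for example), and it does not check versions.

```bash
# Hypothetical helper: print the name of each given command that is not on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Command names are taken from the prerequisites list; adjust them to match your system.
missing="$(missing_tools docker python node yarn)"
if [ -n "$missing" ]; then
  echo "Not found on PATH:" $missing
else
  echo "All prerequisite commands were found."
fi
```

A command being present says nothing about its version, so Python 3.9, Node.js >= 20 and Yarn 1.x still need to be confirmed by hand.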

## Configuration

Start by copying the example environment configuration file.

```bash
cp .env.example .env
```

Edit values in `.env` to point to existing postgresql databases. See `nmdc_server/config.py` for all configuration variables. Variable names in `.env` should be all uppercase and prefixed with `NMDC_`.
See `nmdc_server/config.py` for all configuration variables. Variable names in `.env` should be all uppercase and prefixed with `NMDC_`.

Follow the steps below to configure the necessary settings.

## OAuth setup
### Authentication

1. Create an ORCID account at [orcid.org](https://orcid.org).
2. Create an Application via the ORCID [developer tools](https://orcid.org/developer-tools) page.
- Set the Redirect URIs (the first and only one) to `http://127.0.0.1:8000`
- In case you run into validation errors, you may find [this issue](https://github.com/microbiomedata/nmdc-server/issues/1041) helpful.
- Note: Our production Redirect URIs are listed
1. If necessary, create an ORCID account at [orcid.org](https://orcid.org).
2. Once logged in to ORCID, create an Application via the ORCID [developer tools](https://orcid.org/developer-tools) page.
- In the "Application details" section:
- Set the **Application name** to `NMDC Portal`
- Set the **Application URL** to `https://microbiomedata.org/`
- Set the **Application description** to `NMDC Portal`
- In the "Redirect URIs" section:
- Add an entry for `http://127.0.0.1:8000`
> **Note**: Our production Redirect URIs are listed
[here](https://github.com/microbiomedata/infra-admin/blob/main/orcid/README.md#redirect-uris).
- You will use the resulting **Client ID** and **Client Secret** in the next step.
3. Set the following configuration in `.env`.
3. Populate the following variables in `.env`.
```bash
NMDC_ORCID_CLIENT_ID=changeme
NMDC_ORCID_CLIENT_SECRET=changeme
```

4. Populate the below fields in `.env`. Values for these should be generated by running `openssl rand -hex 32`. You will have to run the command twice total to get a value for each field. Now restart the stack.
4. Populate the below variables in `.env`. Values for these should be generated by running `openssl rand -hex 32`. You will have to run the command once for each variable (i.e. twice total).
```bash
NMDC_SESSION_SECRET_KEY=changeme
NMDC_API_JWT_SECRET=changeme
```
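
One possible way to produce both secrets in a single pass is sketched below. It assumes `openssl` is installed; the shell variable names `session_secret` and `jwt_secret` are only illustrative. The script prints lines you can paste into `.env` rather than editing the file for you.

```bash
# Generate one independent 32-byte secret (64 hex characters) per variable.
session_secret="$(openssl rand -hex 32)"
jwt_secret="$(openssl rand -hex 32)"

# Print lines in the format used by .env; copy them into the file yourself.
printf 'NMDC_SESSION_SECRET_KEY=%s\n' "$session_secret"
printf 'NMDC_API_JWT_SECRET=%s\n' "$jwt_secret"
```

Each variable gets its own `openssl rand` call so the two secrets are never identical.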

### NERSC Credentials

To load production data or to run an ingest locally, you will need NERSC credentials.

# Load production data
1. If necessary, create a new account by following the instructions at https://docs.nersc.gov/accounts.
2. Populate the following variable in `.env`.
```bash
NERSC_USER=changeme
```
3. Install the `sshproxy` tool by following the instructions at https://docs.nersc.gov/connect/mfa/#sshproxy.

### MongoDB Credentials

To connect to the dev or prod MongoDB instances for ingest, you will need credentials. If you do not have any, ask a team member either to create an account for you or to provide you with the generic `org.microbiomedata.data_reader` credentials. Then add the credentials to your `.env` file.

```bash
NMDC_MONGO_USER=changeme
NMDC_MONGO_PASSWORD=changeme
```
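
After working through the configuration steps, a check along the following lines can reveal variables that were left at their placeholder values. This is only a sketch: `find_placeholders` is a hypothetical helper, and it assumes placeholders are spelled exactly `changeme` or `generateme`, as they are in `.env.example`. Commented-out lines are ignored.

```bash
# Hypothetical helper: list uncommented assignments that still hold a placeholder value.
# Prints matching lines prefixed with their line number; exits non-zero if none match.
find_placeholders() {
  local env_file="${1:-.env}"
  grep -nE '^[A-Za-z_][A-Za-z0-9_]*="?(changeme|generateme)"?[[:space:]]*$' "$env_file"
}

if [ -f .env ]; then
  if find_placeholders .env; then
    echo "The variables above still hold placeholder values."
  else
    echo "No uncommented placeholder values found in .env."
  fi
fi
```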

## Load production data

The `nmdc-server` CLI has a `load-db` subcommand which populates your local database using a nightly production backup. These backups are stored on NERSC. You must have NERSC credentials to use this subcommand.

First use NERSC's `sshproxy` [tool](https://docs.nersc.gov/connect/mfa/#sshproxy) to generate an ssh key.
First use NERSC's `sshproxy` [tool](https://docs.nersc.gov/connect/mfa/#sshproxy) to generate an ssh key if you haven't done so in the last 24 hours.

```bash
sshproxy.sh -u <nersc_username>
sshproxy.sh -u $NERSC_USER
```
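
To decide whether a fresh key is needed, one rough heuristic is to look at the age of the key file. The sketch below assumes the key lives at `~/.ssh/nersc` (the path mounted into the container in the next command) and treats the file's modification time as a stand-in for when the key was issued; `nersc_key_is_fresh` is a hypothetical helper, not part of `sshproxy` or this repository.

```bash
# Hypothetical helper: succeed if the key file exists and was modified
# within the last 24 hours (1440 minutes).
nersc_key_is_fresh() {
  local key_path="${1:-$HOME/.ssh/nersc}"
  [ -f "$key_path" ] && [ -n "$(find "$key_path" -mmin -1440 2>/dev/null)" ]
}

if nersc_key_is_fresh; then
  echo "NERSC SSH key was written within the last 24 hours."
else
  echo "NERSC SSH key is missing or older than 24 hours; rerun sshproxy.sh."
fi
```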

Then run the `load-db` subcommand from a `backend` container, mounting the ssh key.
@@ -50,7 +89,7 @@ docker compose run \
--rm \
-v ~/.ssh/nersc:/tmp/nersc \
backend \
nmdc-server load-db -u <nersc_username>
nmdc-server load-db -u $NERSC_USER
```

To see all CLI options run:
@@ -67,93 +106,147 @@ docker compose down -v

This should only need to be done once. When the `db` service starts up again (including via running the `load-db` command), the necessary roles and databases will be created automatically.

# Running the server
<details>
<summary><b>Don't have a NERSC account?</b></summary>
If you're an NMDC team member, but don't have a NERSC account yet: talk to your team lead about getting a NERSC account, specifically one that has access to NMDC's project files. The process takes about a week. Docs: https://docs.nersc.gov/accounts/

If that is not an option for you:

1. Ask an NMDC team member with NERSC access to get a recent production backup for you. This will come in the form of a `.dump` file. Save the file locally.
2. Bring your local database up
```bash
docker compose up db -d
```
3. Load data from the `.dump` file into the running database
```bash
docker compose run --rm \
-v <absolute path to .dump file>:/tmp/backup.dump \
db pg_restore --dbname postgresql://postgres:[email protected]:5432/nmdc_a --clean --if-exists --verbose --single-transaction /tmp/backup.dump
```
</details>

## Installing dependencies locally

Although the project is designed to be run in Docker, having the dependencies installed locally can be useful for development (e.g. providing code completion in your editor, running tests, etc.).

### Backend dependencies

1. If necessary, create a new virtual environment.
```bash
python -m venv .venv
```
2. Activate your virtual environment and install the backend dependencies.
```bash
source .venv/bin/activate
pip install -e .
```
### Frontend dependencies
1. Install the frontend dependencies.
```bash
cd web
yarn
```
## Running the server
Run the full stack via Docker Compose:
<!-- TODO: Consider adding `--build` to this command so that Docker Compose builds
the containers, rather than pulling from GHCR (unless you
want to use the versions that happen to currently be on GHCR).
This has to do with the fact that the `docker-compose.yml` file
contains service specs having both an `image` and `build` section. -->
```bash
docker-compose up -d
docker compose up -d
```
> The `-d` flag is short for `--detach`. It keeps the container logs (i.e. the STDOUT and STDERR streams) from taking over your shell, so you don't have to open a new shell in order to run more commands.
View main application at `http://127.0.0.1:8080/` and the swagger page at `http://127.0.0.1:8080/api/docs`.
Troubleshooting: If the building of one of the services fails with an error citing networking timeouts, _but_ the building of _other_ services completes successfully, we recommend you retry building only the service that failed, by itself. Our thinking is that there will be less demand on your network that way. You can do that via `$ docker compose build {service_name}` (e.g. `$ docker compose build web`).
View the main application at `http://127.0.0.1:8080/` and the API documentation page at `http://127.0.0.1:8080/api/docs`.
Changing Python files in the `nmdc_server` directory will automatically reload the server.
## Outside Docker
To stop the server:
```bash
# Start only the service dependencies.
docker-compose up -d db data redis
docker compose down
```
With python virtualenv. Requires Python 3.7+
If you add or modify project dependencies, you may need to rebuild the containers:
```bash
pip install -e .
pip install uvicorn tox

uvicorn nmdc_server.asgi:app --reload
docker compose up --build -d
```
View swagger page at `http://127.0.0.1:8000/api/docs`.

### Running with frontend development server

# Running ingest

You need an active SSH tunnel connection to NERSC attached to the compose network. After running docker-compose up, run this container.

If you haven't already, [set up MFA on your NERSC account](https://docs.nersc.gov/connect/mfa/) (it's required for SSHing in).
If you are modifying files in the `web` directory, additionally run the frontend development server to enable hot reloading in your browser:
```bash
export NERSC_USER=changeme
docker run --rm -it -p 27017:27017 --network nmdc-server_default --name tunnel kroniak/ssh-client ssh -o StrictHostKeyChecking=no -L 0.0.0.0:27017:mongo-loadbalancer.nmdc.production.svc.spin.nersc.org:27017 $NERSC_USER@dtn01.nersc.gov '/bin/bash -c "while [[ 1 ]]; do echo heartbeat; sleep 300; done"'
```

You can connect to the instance manually

```bash
docker run -d -p 3000:3000 --network nmdc-server_default mongoclient/mongoclient
```

In order to populate the database, you must create a `.env` file in the top
level directory containing mongo credentials.

```bash
# .env
NMDC_MONGO_USER=changeme
NMDC_MONGO_PASSWORD=changeme
```

With that file in place, populate the docker volume by running,

```bash
docker-compose run backend nmdc-server truncate # if necessary
docker-compose run backend nmdc-server migrate
docker-compose run backend nmdc-server ingest -vv --function-limit 100
```

# Running the client

Run the client in development mode.

``` bash
cd web/
yarn
cd web
yarn serve
```
View main application at `http://127.0.0.1:8081`
View the main application at `http://127.0.0.1:8081/`. Changes to files in the `web` directory will automatically trigger a reload in your browser.
> **Note**: the frontend application will still be served via Docker Compose on port `8080`, but it will not pick up changes to the `web` directory automatically. Be aware of which port you are accessing when doing frontend development.
## Why not `localhost`?
### Why not `localhost`?
It is recommended to use `127.0.0.1` instead of `localhost` for local development. This is because `localhost` is **not** allowed as a redirect URI for an ORCID client. The workaround is to register `127.0.0.1` as a redirect URI with ORCID and then to visit `127.0.0.1` for local testing.
# Testing
## Running ingest
> **Note**: This is not generally required unless you are specifically working on the ingest code. If you are working on the web application, simply [loading from a recent production backup](#load-production-data) is sufficient.
1. Ensure that you have completed the sections above about configuring your [NERSC credentials](#nersc-credentials) and [MongoDB credentials](#mongodb-credentials).
2. Obtain a new SSH key from NERSC if you haven't done so in the last 24 hours.
```bash
sshproxy.sh -u $NERSC_USER
```
3. Set up an active SSH tunnel to the dev and production MongoDB instances if you do not already have one established.
```bash
ssh \
-L 37018:mongo-loadbalancer.nmdc-dev.production.svc.spin.nersc.org:27017 \
-L 37019:mongo-loadbalancer.nmdc.production.svc.spin.nersc.org:27017 \
-o ServerAliveInterval=60 \
-f \
-N \
-l $NERSC_USER \
-i ~/.ssh/nersc \
dtn01.nersc.gov
```
> That command will set up SSH port forwarding such that your computer can access the dev MongoDB server at `localhost:37018` and the prod MongoDB server at `localhost:37019`.
> From within a Docker container `host.docker.internal` can be used to access the `localhost` of your computer. When ingesting from the dev or prod MongoDB instances, be sure to set `NMDC_MONGO_HOST=host.docker.internal` in your `.env` file.
> See https://github.com/microbiomedata/infra-admin/blob/main/mongodb/connection-guide.md (internal) for more information on connecting to the MongoDB instances.
4. Create a local copy of ingest support files:
```bash
mkdir -p data && scp \
-r \
-i ~/.ssh/nersc \
"[email protected]:/global/cfs/cdirs/m3408/ingest" \
data
```
5. Run the ingest command:
```bash
docker-compose run backend nmdc-server ingest -vv --function-limit 100
```
> **Note**: The `--function-limit` flag is optional. It is used to reduce the time that the ingest takes by limiting the number of certain types of objects loaded. This can be useful for testing purposes. For more information on options run `nmdc-server ingest --help`.
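
Before starting an ingest, it may be worth confirming that the SSH tunnel from step 3 is actually listening. The sketch below probes the two forwarded ports from your host using bash's built-in `/dev/tcp` redirection, so it requires bash rather than a plain POSIX shell. `port_is_open` is a hypothetical helper, and the check only shows that something accepts connections on the port, not that MongoDB authentication will succeed.

```bash
# Hypothetical helper: succeed if a TCP connection to host:port can be opened.
# The connection is opened in a subshell, so it is closed again immediately.
port_is_open() {
  local host="$1" port="$2"
  (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

# 37018 forwards to the dev MongoDB instance and 37019 to prod (see step 3).
for port in 37018 37019; do
  if port_is_open 127.0.0.1 "$port"; then
    echo "Port $port is accepting connections."
  else
    echo "Port $port is not reachable; the SSH tunnel may not be running."
  fi
done
```
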
## Testing
```bash
tox
```
# Generating new migrations
## Generating new migrations
```bash
# Autogenerate a migration diff from the current HEAD
@@ -174,7 +267,7 @@ docker-compose run backend alembic -c nmdc_server/alembic.ini upgrade head
docker-compose run backend alembic -c nmdc_server/alembic.ini revision --autogenerate
```
# Developing with the shell
## Developing with the shell
A handy IPython shell is provided with some commonly used symbols automatically
imported, and `autoreload 2` enabled. To run it: