Commit 6f69975

committed Jan 18, 2024
init migration
0 parents  commit 6f69975

35 files changed, +1420 -0 lines changed
 

‎.github/workflows/pr.yaml

+33
@@ -0,0 +1,33 @@
# pre-commit workflow
#
# Ensures the codebase passes the pre-commit stack.

name: pre-commit

on: [pull_request]

jobs:
  pre-commit:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v3

      - name: Run pre-commit hooks
        uses: pre-commit/action@v3.0.0
        with:
          extra_args: --all-files --show-diff-on-failure

      - name: Setup TFLint
        uses: terraform-linters/setup-tflint@v3
        with:
          tflint_version: v0.44.1

      - name: Show version
        run: tflint --version

      - name: Init TFLint
        run: tflint --init

      - name: Run TFLint
        run: tflint -f compact

‎.gitignore

+12
@@ -0,0 +1,12 @@
# Secret configs
*.json
configs/encoded

# Deployment files
*.tar.gz

# Terraform
.terraform/
.terraform*
terraform.tfvars
terraform.tfstate*

‎.pre-commit-config.yaml

+12
@@ -0,0 +1,12 @@
repos:
  # Default pre-commit hooks
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v3.2.0
    hooks:
      # Ensure files end with a single newline
      - id: end-of-file-fixer
      # Prevent adding large files
      - id: check-added-large-files
        args: ["--maxkb=5000"]
      # Trim trailing whitespace
      - id: trailing-whitespace

‎.tflint.hcl

+16
@@ -0,0 +1,16 @@
plugin "terraform" {
  enabled = true
  preset  = "recommended"
}

plugin "aws" {
  enabled = true
  version = "0.28.0"
  source  = "github.com/terraform-linters/tflint-ruleset-aws"
}

plugin "google" {
  enabled = true
  version = "0.26.0"
  source  = "github.com/terraform-linters/tflint-ruleset-google"
}

‎LICENSE

+32
@@ -0,0 +1,32 @@
The Clear BSD License

Copyright (c) 2023 Origin Research Ltd
All rights reserved.

Redistribution and use in source and binary forms, with or without modification,
are permitted (subject to the limitations in the disclaimer below) provided that
the following conditions are met:

     * Redistributions of source code must retain the above copyright notice,
     this list of conditions and the following disclaimer.

     * Redistributions in binary form must reproduce the above copyright
     notice, this list of conditions and the following disclaimer in the
     documentation and/or other materials provided with the distribution.

     * Neither the name of the copyright holder nor the names of its
     contributors may be used to endorse or promote products derived from this
     software without specific prior written permission.

NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE GRANTED BY
THIS LICENSE. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.

‎README.md

+111
@@ -0,0 +1,111 @@
# Infernet Node Deployment

Deploy a cluster of heterogeneous [Infernet](https://github.com/ritual-net/infernet-node) nodes on Amazon Web Services (AWS) and/or Google Cloud Platform (GCP), using [Terraform](https://www.terraform.io/) for infrastructure procurement and [Docker Compose](https://docs.docker.com/compose/) for deployment.


### Setup
1. [Install Terraform](https://developer.hashicorp.com/terraform/install)
2. **Configure nodes**: Provide a node configuration file **for each** node being deployed.
    - See [example configuration](configs/0.json.example).
    - They must be named `0.json`, `1.json`, etc.
        - Misnamed files are ignored.
    - They must be placed under the top-level `configs/` directory.
    - Each node *strictly* requires its own configuration `.json` file, even if those files are identical.
    - The number of `.json` files must match the `node_count` variable in `terraform.tfvars`.
        - Extra files are ignored.
    - For instructions on configuring nodes, refer to the [Infernet Node](https://github.com/ritual-net/infernet-node) repository.

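The naming rules above can be checked before running Terraform. The following is a minimal sketch, not part of this repository: `find_node_configs` is a hypothetical helper, and it assumes that "extra files are ignored" means files whose index is `node_count` or higher. How Terraform itself selects the files is not shown in this commit.

```python
import re
from pathlib import Path

# Matches "0.json", "1.json", ... but not "0.json.example" or "notes.json".
CONFIG_NAME = re.compile(r"^(0|[1-9]\d*)\.json$")


def find_node_configs(config_dir, node_count):
    """Return config paths for nodes 0..node_count-1 (hypothetical helper).

    Misnamed files and files with an index >= node_count are ignored.
    Raises FileNotFoundError if any expected file is missing.
    """
    indexed = {}
    for path in Path(config_dir).iterdir():
        match = CONFIG_NAME.match(path.name)
        if match and path.is_file():
            indexed[int(match.group(1))] = path

    missing = [f"{i}.json" for i in range(node_count) if i not in indexed]
    if missing:
        raise FileNotFoundError(f"missing node configs: {', '.join(missing)}")

    return [indexed[i] for i in range(node_count)]
```

For example, `find_node_configs("configs", 2)` would return the paths of `configs/0.json` and `configs/1.json`, and raise if either is absent.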
#### Infernet Router:
The Infernet Router REST server is configured automatically by Terraform. However, if you plan to use it, you need to understand its implications:
> **IMPORTANT:** When configuring a heterogeneous node cluster (i.e. `0.json`, `1.json`, etc. are not identical), each container ID should be reserved for a **unique container setup at the cluster level, i.e. across nodes (and thus `.json` files)**. That is because the router uses container IDs to make routing decisions between services running across the cluster.
>
> _Example:_ Consider nodes A and B, each running a single LLM inference container; node A runs `image1`, and node B runs `image2`. If we set `id: "llm-inference"` in both containers (`containers[0].id` attribute in `0.json`, `1.json`), the router will be **unable to disambiguate** between the two services, and will consider them interchangeable, _which they are not._ Any request for `"llm-inference"` may then be routed to either container, which is incorrect.
>
> Therefore, **reusing an ID across configuration files must imply an identical container configuration**, including image, environment variables, command, etc. This explicitly tells the router which containers are interchangeable, and allows it to distribute requests for those containers across _all nodes running that container._

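The ID rule above can be checked mechanically. This is a sketch under stated assumptions, not the router's own logic: `find_conflicting_container_ids` is a hypothetical helper, and it treats two containers as "identical" only when their whole JSON objects are equal, which may be stricter than what the router needs.

```python
import json
from pathlib import Path


def find_conflicting_container_ids(config_paths):
    """Map each reused container ID to the files where its setup differs.

    Hypothetical helper: compares the full container object, so any
    difference in image, env, command, etc. counts as a conflict.
    """
    seen = {}       # container id -> (canonical JSON, first file seen)
    conflicts = {}  # container id -> set of files with differing setups

    for path in config_paths:
        config = json.loads(Path(path).read_text())
        for container in config.get("containers", []):
            cid = container["id"]
            canonical = json.dumps(container, sort_keys=True)
            if cid not in seen:
                seen[cid] = (canonical, str(path))
            elif seen[cid][0] != canonical:
                conflicts.setdefault(cid, {seen[cid][1]}).add(str(path))

    return {cid: sorted(files) for cid, files in conflicts.items()}
```

An empty result means every reused ID refers to the same container configuration in all files; in the example above, `"llm-inference"` would be reported with both `0.json` and `1.json`.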
### Deploy on AWS

1. Create an AWS service account for deployment:
    ```bash
    cd procure/aws
    chmod 700 create_service_account.sh
    ./create_service_account.sh
    ```
    This requires local authentication with the AWS CLI. Add `access_key_id` and `secret_access_key` to your Terraform variables (see step 3).

2. Make a copy of the example configuration file [terraform.tfvars.example](procure/aws/terraform.tfvars.example):
    ```bash
    cd procure/aws
    cp terraform.tfvars.example terraform.tfvars
    ```

3. Configure your `terraform.tfvars` file. See [variables.tf](procure/aws/variables.tf) for config descriptions.

4. Run Terraform:
    ```bash
    # Initialize
    cd procure
    make init provider=aws

    # Print deployment plan
    make plan provider=aws

    # Deploy
    make apply provider=aws

    # WARNING: Destructive
    # Destroy deployment
    make destroy provider=aws
    ```

### Deploy on GCP


1. Create a GCP service account for deployment:
    ```bash
    cd procure/gcp
    chmod 700 create_service_account.sh
    ./create_service_account.sh
    ```
    This requires local authentication with the GCP CLI, and creates a local credentials file. Add the path to the credentials file (`gcp_credentials_file_path`) to your Terraform variables (see step 3).

2. Make a copy of the example configuration file [terraform.tfvars.example](procure/gcp/terraform.tfvars.example):
    ```bash
    cd procure/gcp
    cp terraform.tfvars.example terraform.tfvars
    ```
3. Configure your `terraform.tfvars` file. See [variables.tf](procure/gcp/variables.tf) for config descriptions.

4. Run Terraform:
    ```bash
    # Initialize
    cd procure
    make init provider=gcp

    # Print deployment plan
    make plan provider=gcp

    # Deploy
    make apply provider=gcp

    # WARNING: Destructive
    # Destroy deployment
    make destroy provider=gcp
    ```

### Using TFLint

```bash
# Install tflint (assumes Homebrew is available)
brew install tflint

# Install plugins
tflint --init

# Run on all directories
tflint --recursive
```

## License

[BSD 3-clause Clear](./LICENSE)

‎configs/0.json.example

+64
@@ -0,0 +1,64 @@
{
    "log_path": "infernet_node.log",
    "server": {
        "port": 4000
    },
    "chain": {
        "enabled": true,
        "rpc_url": "http://127.0.0.1:8545",
        "coordinator_address": "0x...",
        "trail_head_blocks": 4,
        "wallet": {
            "max_gas_limit": 100000,
            "private_key": "12345s"
        }
    },
    "docker": {
        "username": "username",
        "password": "password"
    },
    "redis": {
        "host": "localhost",
        "port": 6379
    },
    "forward_stats": true,
    "containers": [
        {
            "id": "container-1",
            "image": "org1/image1:tag1",
            "description": "Container 1 description",
            "external": true,
            "port": "4999",
            "allowed_addresses": [],
            "allowed_delegate_addresses": [],
            "allowed_ips": [
                "XX.XX.XX.XXX",
                "XX.XX.XX.XXX"
            ],
            "command": "--bind=0.0.0.0:3000 --workers=2",
            "env": {
                "KEY1": "VALUE1",
                "KEY2": "VALUE2"
            },
            "gpu": true
        },
        {
            "id": "container-2",
            "image": "org2/image2:tag2",
            "description": "Container 2 description",
            "external": false,
            "port": "4998",
            "allowed_addresses": [],
            "allowed_delegate_addresses": [],
            "allowed_ips": [
                "XX.XX.XX.XXX",
                "XX.XX.XX.XXX"
            ],
            "command": "--bind=0.0.0.0:3000 --workers=2",
            "env": {
                "KEY3": "VALUE3",
                "KEY4": "VALUE4"
            }
        }
    ]
}

‎deploy/docker-compose.yaml

+56
@@ -0,0 +1,56 @@
version: '3'

services:
  node:
    image: ritualnetwork/infernet-node:0.1.0
    ports:
      - "0.0.0.0:4000:4000"
    volumes:
      - type: bind
        source: ./config.json
        target: /app/config.json
      - node-logs:/logs
      - /var/run/docker.sock:/var/run/docker.sock
    networks:
      - network
    restart:
      on-failure
    depends_on:
      - redis
    extra_hosts:
      - "host.docker.internal:host-gateway"
    stop_grace_period: 1m

  redis:
    image: redis:latest
    ports:
      - "6379:6379"
    networks:
      - network
    volumes:
      - ./redis.conf:/usr/local/etc/redis/redis.conf
      - redis-data:/data
    restart:
      on-failure

  fluentbit:
    image: fluent/fluent-bit:latest
    ports:
      - "24224:24224"
    environment:
      - FLUENTBIT_CONFIG_PATH=/fluent-bit/etc/fluent-bit.conf
    volumes:
      - ./fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf
      - /var/log:/var/log:ro
    networks:
      - network
    restart:
      on-failure

networks:
  network:


volumes:
  node-logs:
  redis-data:

‎deploy/fluent-bit.conf

+38
@@ -0,0 +1,38 @@
[SERVICE]
    Flush                     1
    Daemon                    Off
    Log_Level                 info
    storage.path              /tmp/fluentbit.log
    storage.sync              normal
    storage.checksum          on
    storage.backlog.mem_limit 5M

[INPUT]
    Name         forward
    Listen       0.0.0.0
    Port         24224
    Storage.type filesystem

[OUTPUT]
    name  stdout
    match *

[OUTPUT]
    Name     pgsql
    Match    stats.node
    Host     meta-sink.ritual.net
    Port     5432
    User     append_only_user
    Password ogy29Z4mRCLfpup*9fn6
    Database postgres
    Table    node_stats

[OUTPUT]
    Name     pgsql
    Match    stats.live
    Host     meta-sink.ritual.net
    Port     5432
    User     append_only_user
    Password ogy29Z4mRCLfpup*9fn6
    Database postgres
    Table    live_stats
