[WIP] Autoscaling #566

Draft · wants to merge 21 commits into base: main
49 changes: 44 additions & 5 deletions .github/workflows/functional-tests.yaml
@@ -2,7 +2,7 @@ name: Functional tests
on: [pull_request]

jobs:
functional-tests:
operator:
runs-on: ubuntu-latest
steps:
- name: Checkout code
@@ -27,17 +27,56 @@ jobs:
## Prepare kubeconfig
k3d kubeconfig get $CLUSTER_NAME > functionaltests/kubeconfig
export KUBECONFIG=$(pwd)/functionaltests/kubeconfig


## Build controller docker image
make docker-build

## Import controller docker image
k3d image import -c $CLUSTER_NAME controller:latest

## Install helm chart
helm install opensearch-operator ../charts/opensearch-operator --set manager.image.repository=controller --set manager.image.tag=latest --set manager.image.pullPolicy=IfNotPresent --namespace default --wait
cd functionaltests

## Run tests
go test ./operatortests -timeout 30m

cluster-helm-chart:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v2
- name: Setup go
uses: actions/setup-go@v3
with:
go-version: '1.19'
- uses: nolar/setup-k3d-k3s@v1
with:
version: v1.22
k3d-name: opensearch-operator-tests
k3d-args: --agents 2 -p 30000-30005:30000-30005@agent:0
github-token: ${{ secrets.GITHUB_TOKEN }}
- name: Run tests
run: |
set -e
export CLUSTER_NAME=opensearch-operator-tests
## Check disk to avoid failed shard assignments due to watermarking
df -h
cd opensearch-operator
## Prepare kubeconfig
k3d kubeconfig get $CLUSTER_NAME > functionaltests/kubeconfig
export KUBECONFIG=$(pwd)/functionaltests/kubeconfig

## Build controller docker image
make docker-build

## Import controller docker image
k3d image import -c $CLUSTER_NAME controller:latest

## Install helm chart
helm install opensearch-operator ../charts/opensearch-operator --set manager.image.repository=controller --set manager.image.tag=latest --set manager.image.pullPolicy=IfNotPresent --namespace default --wait
helm install opensearch-cluster ../charts/opensearch-cluster --set OpenSearchClusterSpec.enabled=true --wait
cd functionaltests

## Run tests
go test -timeout 30m
go test ./helmtests -timeout 15m
2 changes: 1 addition & 1 deletion README.md
@@ -55,7 +55,7 @@ The opensearch k8s operator aims to be compatible to all supported opensearch ve

| Operator Version | Min Supported Opensearch Version | Max supported Opensearch version | Comment |
|------------------|----------------------------------|----------------------------------|---------|
| 2.3 | 1.0 | 2.7 | |
| 2.3 | 1.0 | 2.8 | |
| 2.2 | 1.0 | 2.5 | |
| 2.1 | 1.0 | 2.3 | |
| 2.0 | 1.0 | 2.3 | |
154 changes: 154 additions & 0 deletions docs/designs/autoscaling.md
@@ -0,0 +1,154 @@
# Autoscaling

## Content
- [Autoscaling](#autoscaling)
- [Goals](#goals)
- [Design](#design)
- [Getting Started](#getting-started)

## Goals
1. Scale OpenSearch clusters managed by the operator up and down via monitoring metrics.
2. Support making scaling decisions from one or more metrics, with aggregations.

## Design
A separate CRD is used for defining autoscaling policies. Autoscaler resources are stateless: the operator only ever reads them and never updates them. Once an autoscaler is created, it can be referenced at either the cluster or nodepool level inside the OpensearchCluster configuration. When enabled, the autoscaler queries a Prometheus backend containing the cluster metrics and makes scaling determinations based on the user configuration.

### Requirements
To support the second goal of making scaling decisions with aggregations, there needs to be a record of cluster metrics over a time period. Since the monitoring component of the operator already leverages Prometheus, it makes sense to utilize it here as well. The autoscaler therefore requires a Prometheus instance that is scraping your cluster's metrics.

### Considerations
Some design considerations to make note of:
1. ScaleConf only contains maxReplicas and no minReplicas; this is because the number of replicas specified in the nodepool of the OpenSearch cluster is used as the minReplicas value.
2. The operator field of an Item can be any supported Prometheus comparison binary operator.
```
== (equal)
!= (not-equal)
> (greater-than)
< (less-than)
>= (greater-or-equal)
<= (less-or-equal)
```
3. The interval field of a queryOption can be an integer followed by any valid Prometheus time duration.
```
ms - milliseconds
s - seconds
m - minutes
h - hours
d - days - assuming a day always has 24h
w - weeks - assuming a week always has 7d
y - years - assuming a year always has 365d
```
4. The function field of a queryOption can be any valid singular Prometheus function.

### Autoscaler Custom Resource Reference Guide

The Autoscaler CRD is defined by kind: `Autoscaler`, group: `opensearch.opster.io` and version `v1`.
| Name | Type | Description | Required |
|--------|--------|--------|--------|
| apiVersion | string | opensearch.opster.io/v1 | true |
| kind | string | Autoscaler | true |
| metadata | object | Refer to the Kubernetes API documentation for the fields of the `metadata` field. | true |
| spec | object | AutoscalerSpec defines the desired configuration of the autoscaler. | true |


### Autoscaler.spec
AutoscalerSpec defines the desired configuration of the autoscaler.
| Name | Type | Description | Required |
|--------|--------|--------|--------|
| rules | []Rule | A list of rules defining the scaling logic. | true |


### Rule
Rule defines a single rule.
| Name | Type | Description | Required |
|--------|--------|--------|--------|
| items | []Item | A list of type Item, defining conditions for scaling. | true |
| nodeRole | string | The role of the Opensearch node type you would like to target for scaling. | true |
| behavior | Scale | The container for the scaling behavior of the ruleset. | true |

A rule may contain many items; by default, all item expressions generated from the configuration must evaluate to true for a scaling action to take place.
A nodeRole is needed primarily for the case where the autoscale policy is defined at the cluster level, so that the operator knows which nodes to scale.

### Item
Item defines a singular item in a rule.
| Name | Type | Description | Required |
|--------|--------|--------|--------|
| metric | string | A Prometheus metric to target for performing conditional operations. | true |
| operator | string | The operator to use for comparing the Prometheus query result and threshold. | true |
| threshold | string | The threshold value for taking scaling action. | true |
| queryOptions | QueryOptions | Optional additions to the Prometheus query. | false |

The operator field of an Item can be any supported Prometheus comparison binary operator.
```
== (equal)
!= (not-equal)
> (greater-than)
< (less-than)
>= (greater-or-equal)
<= (less-or-equal)
```

### QueryOptions
QueryOptions defines additional query configurations.
| Name | Type | Description | Required |
|--------|--------|--------|--------|
| labelMatchers | []string | Prometheus-supported label matchers to limit results. | false |
| interval | string | A Prometheus-supported interval of time over which to query. | false |
| function | string | A Prometheus-supported function wrapper. | false |
| aggregateEvaluation | bool | A flag to average your Prometheus query results together. | false |

The interval field of a queryOption can be an integer followed by any valid Prometheus time duration.
```
ms - milliseconds
s - seconds
m - minutes
h - hours
d - days - assuming a day always has 24h
w - weeks - assuming a week always has 7d
y - years - assuming a year always has 365d
```

The aggregateEvaluation field averages the query results from multiple nodes before comparison. This is useful when you want to scale based on an average of node metrics rather than evaluating each node individually.
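
As an illustration, an Item and its queryOptions might compose into a Prometheus comparison along the lines of the sketch below (the metric name, label matcher, and generated query shape are illustrative assumptions, not the operator's actual output):
```
# A hypothetical Item; field names follow the tables above.
metric: cpu_usage_percent
operator: ">="
threshold: "80"
queryOptions:
  labelMatchers:
    - 'cluster="my-cluster"'
  interval: 5m
  function: avg_over_time
  aggregateEvaluation: true
# Conceptually this evaluates a comparison such as:
#   avg(avg_over_time(cpu_usage_percent{cluster="my-cluster"}[5m])) >= 80
# where aggregateEvaluation averages the per-node results together.
```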

### Behavior
Behavior defines a scaling behavior for a rule.
| Name | Type | Description | Required |
|--------|--------|--------|--------|
| enable | bool | Flag to enable or disable the rule. | true |
| scaleUp | ScaleConf | Container for upscaling behavior. | false |
| scaleDown | ScaleConf | Container for downscaling behavior. | false |

scaleUp and scaleDown should never both be defined for the same rule; each rule should have only one or the other.

### ScaleConf
Scaling behavior for scaling up or down.
| Name | Type | Description | Required |
|--------|--------|--------|--------|
| maxReplicas | int32 | Maximum amount of replicas to scale up to. | false |

MaxReplicas is optional for a rule that scales down; when scaling up, however, it is required so that there is an upper boundary. MinReplicas is absent because the nodepool.Replicas value defined in the cluster spec performs this function: when scaling down, the cluster will never scale below the number of replicas defined in the cluster spec.
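
Putting the pieces together, a complete Autoscaler resource might look like the following sketch (the rule values and metric name are illustrative assumptions; the field names follow the reference tables above):
```
apiVersion: opensearch.opster.io/v1
kind: Autoscaler
metadata:
  name: data-autoscaler              # hypothetical name, referenced from the cluster spec
spec:
  rules:
    - nodeRole: data                 # target the data nodes for scaling
      behavior:
        enable: true
        scaleUp:
          maxReplicas: 10            # required upper boundary when scaling up
      items:
        - metric: cpu_usage_percent  # hypothetical metric name
          operator: ">="
          threshold: "80"
          queryOptions:
            interval: 5m
            function: avg_over_time
            aggregateEvaluation: true
```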


In addition to the autoscaler CRD, this design includes changes to the existing OpensearchCluster CRD, specifically to generalConfig and nodePools.

### OpensearchCluster.General.Autoscaler
Addition of an `Autoscaler` section under generalConfig.
| Name | Type | Description | Required |
|--------|--------|--------|--------|
| enable | boolean | Enables or disables autoscaling functionality. | false |
| prometheusEndpoint | string | A prometheus endpoint to monitor. | false |
| scaleTimeout | int | The amount of time, in minutes, to wait since the last scaling action (or cluster creation) before scaling again. | false |
| clusterPolicy | string | The override to set a cluster specific autoscale policy. | false |

### OpensearchCluster.nodePools
Addition of `AutoScalePolicy` to nodePools.
| Name | Type | Description | Required |
|--------|--------|--------|--------|
| autoScalePolicy | string | The name of an autoscaler that the user has applied. | false |

Note that clusterPolicy and autoScalePolicy are synonymous; choose one or the other depending on whether you want the policy applied at the cluster level or per node pool.
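
For reference, a sketch of how these additions might be wired into an OpensearchCluster spec (the exact field paths are assumptions based on the tables above, and the node pool layout is abbreviated):
```
apiVersion: opensearch.opster.io/v1
kind: OpenSearchCluster
metadata:
  name: my-cluster
spec:
  general:
    autoscaler:
      enable: true
      prometheusEndpoint: http://prometheus.monitoring:9090  # assumed Prometheus URL
      scaleTimeout: 10               # minutes to wait between scaling actions
  nodePools:
    - component: data
      replicas: 3                    # also acts as the minimum replica count
      roles: ["data"]
      autoScalePolicy: data-autoscaler  # name of the Autoscaler resource above
```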

## Getting Started
1. Have a Prometheus instance where metrics from your cluster are being stored.
2. Create an autoscaling policy with the CRD that meets your scaling requirements.
3. Define the autoscaling policy in your OpensearchCluster and enable it.
45 changes: 45 additions & 0 deletions docs/designs/crd.md
@@ -653,6 +653,51 @@ Monitoring TLS configuration options
</tbody>
</table>

<h3 id="Autoscaler">
Autoscaler
</h3>

Autoscaler defines the OpenSearch autoscaling configuration.

<table>
<thead>
<tr>
<th>Name</th>
<th>Type</th>
<th>Description</th>
<th>Required</th>
<th>default</th>
</tr>
</thead>
<tbody><tr>
<td><b>enable</b></td>
<td>bool</td>
<td>Whether to enable autoscaling for the cluster</td>
<td>false</td>
<td>-</td>
</tr><tr>
<td><b>prometheusEndpoint</b></td>
<td>string</td>
<td>The URL of a Prometheus endpoint to which monitoring metrics from the OpenSearch cluster are sent.</td>
<td>false</td>
<td>-</td>
</tr><tr>
<td><b>scaleTimeout</b></td>
<td>string</td>
<td>This interval limits how often the cluster will attempt an automatic scaling action. Notation should follow <a href="https://prometheus.io/docs/prometheus/latest/querying/basics/#time-durations">Prometheus time duration standards</a>.</td>
<td>false</td>
<td>10m</td>
</tr><tr>
<td><b>clusterAutoScalePolicy</b></td>
<td>string</td>
<td>Optional to define an autoscaling policy at the cluster level instead of nodePool.</td>
<td>false</td>
<td>-</td>
</tr></tbody>
</table>

<h3 id="GeneralConfig">
Keystore
</h3>