Predictive Horizontal Pod Autoscalers (PHPAs) are Horizontal Pod Autoscalers (HPAs) with extra predictive capabilities baked in, allowing you to apply statistical models to the results of HPA calculations to make proactive scaling decisions.
This project makes extensive use of the jthomperoo/k8shorizmetrics library to gather metrics and to evaluate them in the same way the Kubernetes Horizontal Pod Autoscaler does.
PHPAs let you choose models and fine-tune them to predict how many replicas a resource should have, preempting events such as regular, repeated high load. This allows scaling to be proactive rather than purely reactive, with decisions made ahead of time.
PHPAs suit systems with predictable changes in load. For example, if over a 24 hour period the load on a resource is generally higher between 3pm and 5pm, then with enough data and correctly chosen and tuned models the autoscaler could predict this and scale up before the load arrives, making the system more responsive to changes in load. This could be useful for handling different userbases across different timezones, or for getting ahead of a rapidly increasing load by predicting the replica counts it will need.
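To make the daily-pattern idea concrete, here is a minimal hand-rolled sketch of additive Holt-Winters smoothing applied to past replica counts. It is an illustration only, not the code this project runs: the function name `holt_winters_additive`, its parameters, and the initialisation scheme are all assumptions made for the example.

```python
from statistics import mean


def holt_winters_additive(series, season_length, alpha, beta, gamma, steps_ahead=1):
    """Forecast with additive Holt-Winters (triple exponential smoothing).

    Hypothetical helper for illustration. `series` must hold at least two
    full seasons so that an initial trend can be estimated.
    """
    m = season_length
    if len(series) < 2 * m:
        raise ValueError("need at least two full seasons of history")

    # Initial level: average of the first season.
    level = mean(series[:m])
    # Initial trend: average per-step change between the first two seasons.
    trend = (mean(series[m:2 * m]) - level) / m
    # Initial seasonal offsets: distance of each slot from the initial level.
    seasonal = [value - level for value in series[:m]]

    for t, observed in enumerate(series):
        slot = t % m
        previous_level = level
        level = alpha * (observed - seasonal[slot]) + (1 - alpha) * (level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
        seasonal[slot] = gamma * (observed - level) + (1 - gamma) * seasonal[slot]

    # Seasonal slot that the forecast step falls into.
    slot = (len(series) + steps_ahead - 1) % m
    return level + steps_ahead * trend + seasonal[slot]


# Three "days" of four slots each, with a spike in the third slot of every day.
replica_history = [1, 1, 4, 1] * 3
next_slot = holt_winters_additive(replica_history, 4, 0.5, 0.5, 0.5, steps_ahead=1)
spike_slot = holt_winters_additive(replica_history, 4, 0.5, 0.5, 0.5, steps_ahead=3)
```

The sample data uses four slots per "day" to stay short; a real daily pattern sampled hourly would use a season length of 24. Because the spike recurs in the same slot each season, the forecast for that slot is higher than for its neighbours, which is what lets an autoscaler add replicas before the load arrives.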
- Functionally identical to Horizontal Pod Autoscaler for calculating replica counts without prediction.
- Choice of statistical models to apply over Horizontal Pod Autoscaler replica counting logic.
  - Holt-Winters Smoothing
  - Linear Regression
- Allows customisation of Kubernetes autoscaling options without master node access. Can therefore work on managed solutions such as EKS or GCP.
  - CPU Initialization Period.
  - Downscale Stabilization.
  - Sync Period.
This project works by calculating the number of replicas a resource should have, then storing these values and using statistical models against them to produce predictions for the future. These predictions are compared and can be used instead of the raw replica count calculated by the Horizontal Pod Autoscaler logic.
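The calculate and compare steps can be sketched roughly as follows. The first function uses the replica formula from the Kubernetes Horizontal Pod Autoscaler documentation (ignoring its tolerance and readiness handling); the second assumes the largest candidate wins, which is one possible way of comparing predictions. Both function names are hypothetical and are not part of this project's API.

```python
import math


def hpa_desired_replicas(current_replicas, current_metric, target_metric):
    """Simplified HPA formula: ceil(currentReplicas * current / target)."""
    return math.ceil(current_replicas * current_metric / target_metric)


def decide_replicas(calculated, predictions, min_replicas, max_replicas):
    """Compare the calculated count with model predictions.

    Assumption for this sketch: the largest value is chosen, then clamped
    to the configured minimum and maximum replica counts.
    """
    candidates = [calculated] + [math.ceil(p) for p in predictions]
    return max(min_replicas, min(max_replicas, max(candidates)))


# 4 replicas running at 75% average CPU against a 50% target.
calculated = hpa_desired_replicas(4, 75, 50)
# A model predicts 7.2 replicas will be needed shortly.
target = decide_replicas(calculated, [7.2], min_replicas=1, max_replicas=10)
```

Choosing the maximum errs on the side of having too many replicas rather than too few; other comparison strategies, such as taking the mean of the candidates, would trade that safety margin for lower resource use.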
PHPAs are designed to be as similar in configuration to Horizontal Pod Autoscalers as possible, with extra configuration options.
PHPAs have their own custom resource:
apiVersion: jamiethompson.me/v1alpha1
kind: PredictiveHorizontalPodAutoscaler
metadata:
  name: simple-linear
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 0
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          averageUtilization: 50
          type: Utilization
  models:
    - type: Linear
      name: simple-linear
      linear:
        lookAhead: 10000
        historySize: 6
This PHPA acts like a Horizontal Pod Autoscaler and autoscales to try to keep the target resource's CPU utilization at 50%, with the extra predictive layer of a linear regression model applied to the results.
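As a rough illustration of what a linear model with these settings does, the sketch below fits a least-squares line through the most recent replica counts and evaluates it a short time into the future. It assumes `lookAhead` is a duration in milliseconds and `historySize` is the number of past values kept; the function name `predict_replicas` and the sample data (one value every 15 seconds) are made up for the example and do not reflect the project's internal code.

```python
import math


def predict_replicas(history, look_ahead_ms, history_size):
    """Least-squares line through recent (timestamp_ms, replicas) pairs,
    evaluated look_ahead_ms after the newest timestamp and rounded up."""
    if not history:
        raise ValueError("history must contain at least one sample")
    recent = history[-history_size:]
    if len(recent) < 2:
        # Not enough points to fit a line; fall back to the latest value.
        return recent[-1][1]

    # Measure time relative to the newest sample to keep the numbers small.
    newest = recent[-1][0]
    xs = [t - newest for t, _ in recent]
    ys = [r for _, r in recent]
    n = len(recent)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n

    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        # All samples share one timestamp, so no slope can be estimated.
        return math.ceil(y_mean)
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sxx
    intercept = y_mean - slope * x_mean
    return math.ceil(intercept + slope * look_ahead_ms)


# Replica counts rising by one every 15 seconds (hypothetical data).
rising = [(i * 15000, i + 1) for i in range(6)]
predicted = predict_replicas(rising, look_ahead_ms=10000, history_size=6)
```

With a steadily rising history the fitted line continues upward, so the prediction exceeds the latest calculated value and the resource is scaled up before the load is actually observed. With a flat history the slope is zero and the prediction matches the current count.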
The operator for managing Predictive Horizontal Pod Autoscalers can be installed using Helm:
VERSION=v0.13.1
HELM_CHART=predictive-horizontal-pod-autoscaler-operator
helm install ${HELM_CHART} https://github.com/jthomperoo/predictive-horizontal-pod-autoscaler/releases/download/${VERSION}/predictive-horizontal-pod-autoscaler-${VERSION}.tgz
Check out the getting started guide and the examples for ways to use Predictive Horizontal Pod Autoscalers.
See the wiki for more information, such as guides and references.
See the examples/ directory for working code samples.
Developing this project requires these dependencies:
Any Python dependencies must be installed by running:
pip install -r requirements-dev.txt
It is recommended to test locally using a local Kubernetes management system, such as k3d, which runs a small Kubernetes cluster locally using Docker.
You can deploy a PHPA example (see the examples/ directory for choices) to test your changes.
- make run - runs the PHPA locally against the cluster configured in your kubeconfig file.
- make docker - builds the PHPA image.
- make lint - lints the code.
- make format - beautifies the code, must be run to pass the CI.
- make test - runs the unit tests.
- make doc - hosts the documentation locally, at 127.0.0.1:8000.
- make coverage - opens up any generated coverage reports in the browser.