Skip to main content

Analysis Templates Reference

An AnalysisTemplate is a resource that defines how to perform verification testing, including:

  • Container images and commands to run
  • Queries to external monitoring tools
  • How to interpret results from metric providers
  • Success or failure criteria
  • Frequency and duration of measurements

AnalysisTemplate resources (and the AnalysisRun resources that are spawned from them) are CRDs re-used from the Argo Rollouts project. They were intentionally built to be useful in contexts other than Argo Rollouts. Re-using this resource type to define verification processes means those processes benefit from this rich and battle-tested feature of Argo Rollouts.

info

This reference guide is intended to give a brief introduction to AnalysisTemplates for some common use cases. Please consult the relevant sections of the Argo Rollouts documentation for comprehensive coverage of the full range of AnalysisTemplate capabilities.

AnalysisTemplates integrate natively with many popular open-source and commercial monitoring tools, including:

In addition to monitoring tools, analysis can integrate with internal systems by:

  • Running containerized processes as Kubernetes Jobs
  • Making HTTP requests and interpreting JSON responses

Arguments

AnalysisTemplates may declare a set of arguments that can be "passed" in by the Stage. The arguments are resolved at the time the AnalysisRun is created and can then be referenced in metrics configuration. Arguments are dereferenced using the syntax: {{ args.<name> }}.

caution

Unlike Kargo promotion processes, which require expressions to be enclosed within ${{ }}, Argo Rollouts AnalysisTemplates require expressions to be enclosed within {{ }} (i.e. without $).

The following example shows an AnalysisTemplate with three arguments. Values for arguments can have a default value, supplied by the Stage, or obtained from a Secret if the value is sensitive (e.g. a bearer token for an HTTP request):

apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
name: args-example
spec:
args:
# An argument can specify a value to be used as its default.
# This will be overridden by a value supplied by the Stage.
- name: api-url
value: http://example/measure
# If an argument specifies no value, it is considered a required
# argument and must be supplied by the Stage.
- name: service-name
# Arguments can be obtained from a Secret in the Project Namespace
- name: api-token
valueFrom:
secretKeyRef:
name: token-secret
key: apiToken
metrics:
- name: webmetric
successCondition: result == 'true'
provider:
web:
# placeholders are resolved when an AnalysisRun is created
url: "{{ args.api-url }}?service={{ args.service-name }}"
headers:
- key: Authorization
value: "Bearer {{ args.api-token }}"
jsonPath: "{$.results.ok}"

Success Condition

When interpreting the result of a query, an Expression Language expression can be used to evaluate the response. The response payload is set in a variable result. The following will interpret the response of a Prometheus query, and require that the element of the returned vector is greater than or equal to 0.95:

apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
name: success-rate
spec:
metrics:
- name: success-rate
# Expr expression that can be evaluated to true or false
# NOTE: prometheus queries return results in the form of a vector.
# So it is common to access the index 0 of the returned array to obtain the value
successCondition: result[0] >= 0.95
provider:
prometheus:
address: "http://prometheus.example.com:9090"
query: |
sum(irate(
istio_requests_total{reporter="source",response_code!~"5.*"}[5m]
)) /
sum(irate(
istio_requests_total{reporter="source"}[5m]
))

Failure Conditions and Limits

As an alternative to successCondition, a failureCondition can be used to describe when a measurement is considered failed. Additionally, failureLimit can also be used to specify the maximum number of failed measurements that are allowed before the entire AnalysisRun is considered Failed.

The following example continually polls a Prometheus server to get the total number of errors (i.e., HTTP response code >= 500) every five minutes, causing the measurement to fail if ten or more errors are encountered. The entire AnalysisRun is considered to have Failed after three failed measurements.

apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
name: failure-condition-example
spec:
metrics:
- name: total-errors
interval: 5m
failureCondition: result[0] >= 10
failureLimit: 3
provider:
prometheus:
address: http://prometheus.example.com:9090
query: |
sum(irate(
istio_requests_total{reporter="source",response_code=~"5.*"}[5m]
))

Delaying Measurements

In some scenarios, it may be necessary to delay the start of a metric measurement. For example, some time may need to pass after an update in order for new data to populate in the monitoring services. The initialDelay option can be used to delay the start of measurements. Each metric can be configured to have a different delay.

apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
name: initial-delay-example
spec:
metrics:
- name: success-rate
# Duration before measurement collection. Default is no delay
initialDelay: 5m
successCondition: result[0] >= 0.90
provider:
prometheus:
address: http://prometheus.example.com:9090
query: ...

Example Metric Types

Web

An HTTP request can be performed against some external service to obtain the measurements.

apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
name: web-metric-example
spec:
args:
- name: api-token
valueFrom:
secretKeyRef:
name: token-secret
key: apiToken
metrics:
- name: webmetric
successCondition: result == true
provider:
web:
url: "http://example.com/api/v1/measurement"
# HTTP Method. valid values are GET|POST|PUT. Defaults to GET
method: POST
# Timeout for the request. Defaults to 10 seconds
timeoutSeconds: 20
headers:
- key: Authorization
value: "Bearer {{ args.api-token }}"
# if body is a json, it is recommended to set the Content-Type
- key: Content-Type
value: "application/json"
# Requst body to send.
body: |
{"foo": "bar"}
# Optional JSON path to set the value of `result` in successCondition/failureCondition
jsonPath: "{$.data.ok}"

Job

A Kubernetes Job can be used to perform analysis. When a Job is used, the metric is considered successful if the Job completes with an exit code of zero and is otherwise considered to have failed.

apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
name: integration-test
namespace: guestbook
spec:
metrics:
- name: integration-test
provider:
job:
spec:
template:
spec:
containers:
- name: sleep
image: alpine:latest
command: [sleep, "10"]
restartPolicy: Never
backoffLimit: 1