correctly discover the Alertmanager cluster and ServiceMonitors

Frederic Branczyk
2016-12-12 22:24:52 -08:00
parent 3c8df35189
commit b867f6c9ec
6 changed files with 29 additions and 10 deletions


@@ -24,6 +24,7 @@ which manages Prometheus servers and their configuration in a cluster. With a si
* A Prometheus configuration covering monitoring of all Kubernetes core components and exporters
* A default set of alerting rules on the cluster component's health
* A Grafana instance serving dashboards on cluster metrics
+* A three node highly available Alertmanager cluster
Simply run:
@@ -35,6 +36,7 @@ hack/cluster-monitoring/deploy
After all pods are ready, you can reach:
* Prometheus UI on node port `30900`
+* Alertmanager UI on node port `30903`
* Grafana on node port `30902`
To tear it all down again, run:
@@ -57,7 +59,9 @@ hack/example-service-monitoring/deploy
```
After all pods are ready you can reach the Prometheus server on node port `30100` and observe
-how it monitors the service as specified.
+how it monitors the service as specified. Same as before, this Prometheus server automatically
+discovers the Alertmanager cluster deployed in the [Monitoring Kubernetes](#Monitoring-Kubernetes)
+section.
Teardown:


@@ -19,3 +19,4 @@ until kctl get prometheus; do sleep 1; done
kctl apply -f manifests/exporters
kctl apply -f manifests/grafana
kctl apply -f manifests/prometheus
+kctl apply -f manifests/alertmanager


@@ -11,6 +11,7 @@ kctl() {
kctl delete -f manifests/exporters
kctl delete -f manifests/grafana
kctl delete -f manifests/prometheus
+kctl delete -f manifests/alertmanager
# Hack: wait a bit to let the controller delete the deployed Prometheus server.
sleep 5


@@ -7,10 +7,9 @@ metadata:
     prometheus: frontend
 spec:
   version: v1.4.1
-  serviceMonitors:
-  - selector:
-      matchLabels:
-        tier: frontend
+  serviceMonitorSelector:
+    matchLabels:
+      tier: frontend
   resources:
     requests:
       # 2Gi is default, but won't schedule if you don't have a node with >2Gi
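
Going by the hunk header (10 lines before, 9 after), this hunk replaces the `serviceMonitors` list with a single `serviceMonitorSelector`. A `matchLabels` selector matches an object when every key/value pair in the selector is present in the object's labels. The sketch below mimics that rule in plain shell. The `selector_matches` helper and its comma-separated `key=value` encoding are assumptions made for illustration and are not part of the operator.

```shell
#!/bin/sh
# selector_matches SELECTOR LABELS
# Both arguments are comma-separated key=value lists (a hypothetical
# encoding chosen for this sketch). Succeeds only when every pair in
# SELECTOR also appears in LABELS, which is the matchLabels rule.
selector_matches() {
  old_ifs=$IFS
  IFS=,
  for pair in $1; do
    case ",$2," in
      *",$pair,"*) ;;                      # pair present, keep checking
      *) IFS=$old_ifs; return 1 ;;         # pair missing, no match
    esac
  done
  IFS=$old_ifs
  return 0
}

# Hypothetical ServiceMonitor label sets checked against tier=frontend.
selector_matches "tier=frontend" "app=web,tier=frontend" && echo "selected"
selector_matches "tier=frontend" "app=api,tier=backend" || echo "ignored"
```

Extra labels on the object do not prevent a match. Only the pairs named in the selector are compared.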


@@ -1,6 +1,25 @@
 apiVersion: v1
 data:
   prometheus.yaml: |
+    alerting:
+      alertmanagers:
+      - kubernetes_sd_configs:
+        - role: endpoints
+        relabel_configs:
+        - action: keep
+          regex: alertmanager-main
+          source_labels:
+          - __meta_kubernetes_service_name
+        - action: keep
+          regex: monitoring
+          source_labels:
+          - __meta_kubernetes_namespace
+        - action: keep
+          regex: web
+          source_labels:
+          - __meta_kubernetes_endpoint_port_name
+        scheme: http
     global:
       scrape_interval: 15s
       evaluation_interval: 15s
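
The three `keep` rules in the `alerting` section act as a chain of filters: a discovered endpoint survives only if its service name, namespace, and port name each match the corresponding regex in full (Prometheus anchors relabel regexes at both ends). The following is a minimal sketch of that filtering in shell. It assumes `grep -E -x` is close enough to Prometheus' RE2 matching for these literal patterns, and the `keep` and `is_alertmanager_target` helpers and the sample endpoints are hypothetical.

```shell
#!/bin/sh
# keep VALUE REGEX: succeed only when VALUE matches REGEX in full,
# mirroring the anchored matching of a relabel `keep` action.
keep() {
  printf '%s\n' "$1" | grep -Eqx -- "$2"
}

# is_alertmanager_target SERVICE NAMESPACE PORT_NAME
# Applies the three keep rules in order; all three must pass.
is_alertmanager_target() {
  keep "$1" 'alertmanager-main' &&
  keep "$2" 'monitoring' &&
  keep "$3" 'web'
}

# Hypothetical discovered endpoints: service, namespace, port name.
while read -r svc ns port; do
  if is_alertmanager_target "$svc" "$ns" "$port"; then
    echo "keep $svc $ns $port"
  else
    echo "drop $svc $ns $port"
  fi
done <<'EOF'
alertmanager-main monitoring web
alertmanager-main monitoring mesh
alertmanager-main-extra monitoring web
grafana monitoring web
alertmanager-main default web
EOF
```

With these sample inputs only the first endpoint is kept. The others fail on the port name, the anchored service name, the service name, and the namespace respectively.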


@@ -13,8 +13,3 @@ spec:
       # production use. This value is mainly meant for demonstration/testing
       # purposes.
       memory: 400Mi
-  alerting:
-    alertmanagers:
-    - namespace: monitoring
-      name: alertmanager-main
-      port: web