Compare commits

...

20 Commits

Author SHA1 Message Date
Philip Gough
344ec3464f [release-0.10] - Adds dependency on github.com/grafana/jsonnet-libs mixin-utils (#2480)
* Add lockfile

* chore: ignore mdox errors for https://www.weave.works

* ci: Generate with newer go version
2024-08-07 17:19:09 +01:00
PromOperatorBot
096039236a [bot] [release-0.10] Automated version update (#2313)
Co-authored-by: Prometheus Operator Bot <prom-op-bot@users.noreply.github.com>
2024-01-03 17:31:51 +00:00
PromOperatorBot
94e8af35bb [bot] [release-0.10] Automated version update (#2288)
Co-authored-by: Prometheus Operator Bot <prom-op-bot@users.noreply.github.com>
2023-12-01 17:05:21 +00:00
PromOperatorBot
6dfe0e2ed6 [bot] [release-0.10] Automated version update (#2182)
Co-authored-by: Prometheus Operator Bot <prom-op-bot@users.noreply.github.com>
2023-08-16 16:55:54 +01:00
Philip Gough
63a13ae16b Fix docs check for 0.10 release (#2175) 2023-07-27 04:36:01 -07:00
Prometheus Operator Bot
e7eff18e7e [bot] [release-0.10] Automated version update 2022-11-07 10:53:19 +00:00
Jan Fajerski
ad6e0c2770 release-0.10: update jsonnet dependencies (#1894)
This pulls in a bug fix for kubernetes-mixins
https://github.com/kubernetes-monitoring/kubernetes-mixin/issues/786.

Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
2022-10-06 20:57:26 +01:00
Simon Pasquier
b76224662e Merge pull request #1790 from JoaoBraveCoding/add-liveness
cherry-pick: Adds readinessProbe and livenessProbe to prometheus-adapter jsonnet
2022-07-07 15:02:38 +02:00
Joao Marcal
508722d5db Adds YAML for jsonnet modified for prometheus-adapter 2022-06-21 14:17:21 +01:00
Joao Marcal
a38f7012a9 Adds port name to prometheus-adapter jsonnet 2022-06-21 14:17:05 +01:00
Joao Marcal
26c8329481 Adds YAML for jsonnet modified in the previous commit 2022-06-21 14:14:51 +01:00
Joao Marcal
3fa00f11f3 Adds readinessProbe and livenessProbe to prometheus-adapter jsonnet
Problem: Currently the prometheus-adapter pods are restarted at the same
time even though the deployment is configured with strategy RollingUpdate.
This happens because the kubelet does not know when the prometheus-adapter
pods are ready to start receiving requests.

Solution: Add both readinessProbe and livenessProbe to the
prometheus-adapter; this way the kubelet will know either when the pod
has stopped working and should be restarted, or when it is ready to start
receiving requests.

Issue: https://bugzilla.redhat.com/show_bug.cgi?id=2048333
2022-06-21 14:14:16 +01:00
Philip Gough
b38868e361 Merge pull request #1740 from arajkumar/bp-NodeFilesystemSpaceFillingUp-gc-thershold
Adjust NodeFilesystemSpaceFillingUp thresholds according to default kubelet GC behavior
2022-05-10 08:26:47 +01:00
Arunprasad Rajkumar
c4e43dc412 assets: regenerate
Signed-off-by: Arunprasad Rajkumar <arajkuma@redhat.com>
2022-04-28 10:40:06 +05:30
Arunprasad Rajkumar
b54ad2ea71 Adjust NodeFilesystemSpaceFillingUp thresholds according to default kubelet GC behavior
Previously[1] we attempted to do the same, but there was a
misunderstanding about the GC behavior and it caused the alert to fire
even before GC came into play.

According to[2][3], kubelet GC kicks in only when `imageGCHighThresholdPercent` is hit, which is set to 85% by default. However, `NodeFilesystemSpaceFillingUp` is set to fire as soon as 80% usage is hit.

This commit changes `fsSpaceFillingUpWarningThreshold` to 15% so
that GC has ample time to reclaim unwanted images. This commit
also changes `fsSpaceFillingUpCriticalThreshold` to 10%, which gives admins more time to react to the warning before the critical alert is sent.

[1] https://github.com/prometheus-operator/kube-prometheus/pull/1357
[2] https://docs.openshift.com/container-platform/4.10/nodes/nodes/nodes-nodes-garbage-collection.html#nodes-nodes-garbage-collection-images_nodes-nodes-configuring
[3] https://kubernetes.io/docs/reference/config-api/kubelet-config.v1beta1/

Signed-off-by: Arunprasad Rajkumar <arajkuma@redhat.com>
(cherry picked from commit 6ff8bfbb02)
2022-04-27 08:03:57 +05:30
Arthur Silva Sens
125fb56d74 Merge pull request #1715 from yogeek/backport-1630-kubeproxy-monitor-0-10
Backport PodMonitor for kube-proxy to 0.10 release
2022-04-09 08:31:59 +01:00
Philip Gough
35a61d9a0e Update PodMonitor for kube-proxy
Signed-off-by: yogeek <gdupin@gmail.com>
2022-04-08 16:21:54 +02:00
Paweł Krupa
5b9aa36169 Merge pull request #1611 from ArthurSens/release-0.10 2022-02-01 19:52:23 +01:00
Paweł Krupa (paulfantom)
142434ca2b manifests: regenerate
(cherry picked from commit 35f0bca4da)
2022-02-01 17:34:19 +00:00
Paweł Krupa (paulfantom)
701e3c91eb jsonnet: filter out kube-proxy alerts when kube-proxy is disabled
Signed-off-by: Paweł Krupa (paulfantom) <pawel@krupa.net.pl>
(cherry picked from commit 86ac6f79b1)
2022-02-01 17:34:11 +00:00
12 changed files with 352 additions and 697 deletions

View File

@@ -3,7 +3,7 @@ on:
- push
- pull_request
env:
golang-version: '1.15'
golang-version: '1.18'
kind-version: 'v0.11.1'
jobs:
generate:

View File

@@ -6,4 +6,8 @@ validators:
type: "ignore"
# Ignore release links.
- regex: 'https:\/\/github\.com\/prometheus-operator\/kube-prometheus\/releases'
type: "ignore"
type: "ignore"
# the www.weave.works domain returns 404 for many pages.
# Ignoring for now but we need remove the related content if it persists.
- regex: 'https:\/\/www.weave.works.*'
type: "ignore"

View File

@@ -14,11 +14,13 @@ date: "2021-03-08T23:04:32+01:00"
`kube-prometheus` ships with a set of default [Prometheus rules](https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/) and [Grafana](http://grafana.com/) dashboards. At some point one might like to extend them, the purpose of this document is to explain how to do this.
All manifests of kube-prometheus are generated using [jsonnet](https://jsonnet.org/) and Prometheus rules and Grafana dashboards in specific follow the [Prometheus Monitoring Mixins proposal](https://docs.google.com/document/d/1A9xvzwqnFVSOZ5fD3blKODXfsat5fg6ZhnKu9LK3lB4/).
All manifests of kube-prometheus are generated using [jsonnet](https://jsonnet.org/).
Prometheus rules and Grafana dashboards in specific follow the
[Prometheus Monitoring Mixins proposal](https://github.com/monitoring-mixins/docs/blob/master/design.pdf).
For both the Prometheus rules and the Grafana dashboards Kubernetes `ConfigMap`s are generated within kube-prometheus. In order to add additional rules and dashboards simply merge them onto the existing json objects. This document illustrates examples for rules as well as dashboards.
As a basis, all examples in this guide are based on the base example of the kube-prometheus [readme](../../README.md):
As a basis, all examples in this guide are based on the base example of the kube-prometheus [readme](https://github.com/prometheus-operator/kube-prometheus/blob/main/README.md):
```jsonnet mdox-exec="cat example.jsonnet"
local kp =
@@ -61,11 +63,14 @@ local kp =
### Alerting rules
According to the [Prometheus Monitoring Mixins proposal](https://docs.google.com/document/d/1A9xvzwqnFVSOZ5fD3blKODXfsat5fg6ZhnKu9LK3lB4/) Prometheus alerting rules are under the key `prometheusAlerts` in the top level object, so in order to add an additional alerting rule, we can simply merge an extra rule into the existing object.
As per the [Prometheus Monitoring Mixins proposal](https://github.com/monitoring-mixins/docs/blob/master/design.pdf)
Prometheus alerting rules are under the key `prometheusAlerts` in the top level object.
Additional alerting rules can be added by merging into the existing object.
The format is exactly the Prometheus format, so there should be no changes necessary should you have existing rules that you want to include.
> Note that alerts can just as well be included into this file, using the jsonnet `import` function. In this example it is just inlined in order to demonstrate their use in a single file.
> Note that alerts can also be included into this file, using the jsonnet `import` function.
> In this example it is just inlined in order to demonstrate their use in a single file.
```jsonnet mdox-exec="cat examples/prometheus-additional-alert-rule-example.jsonnet"
local kp = (import 'kube-prometheus/main.libsonnet') + {
@@ -336,9 +341,14 @@ Dashboards can either be added using jsonnet or simply a pre-rendered json dashb
### Jsonnet dashboard
We recommend using the [grafonnet](https://github.com/grafana/grafonnet-lib/) library for jsonnet, which gives you a simple DSL to generate Grafana dashboards. Following the [Prometheus Monitoring Mixins proposal](https://docs.google.com/document/d/1A9xvzwqnFVSOZ5fD3blKODXfsat5fg6ZhnKu9LK3lB4/) additional dashboards are added to the `grafanaDashboards` key, located in the top level object. To add new jsonnet dashboards, simply add one.
We recommend using the [grafonnet](https://github.com/grafana/grafonnet-lib/) library for jsonnet,
which gives you a simple DSL to generate Grafana dashboards.
Following the [Prometheus Monitoring Mixins proposal](https://github.com/monitoring-mixins/docs/blob/master/design.pdf)
additional dashboards are added to the `grafanaDashboards` key, located in the top level object.
To add new jsonnet dashboards, simply add one.
> Note that dashboards can just as well be included into this file, using the jsonnet `import` function. In this example it is just inlined in order to demonstrate their use in a single file.
> Note that dashboards can just as well be included into this file, using the jsonnet `import` function.
> In this example it is just inlined in order to demonstrate their use in a single file.
```jsonnet mdox-exec="cat examples/grafana-additional-jsonnet-dashboard-example.jsonnet"
local grafana = import 'grafonnet/grafana.libsonnet';

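The documentation changes above keep the guide's core pattern: alerting rules live under `prometheusAlerts` as ordinary Prometheus rule groups, and extra rules are merged into the existing object with jsonnet's `+` operator. A minimal, self-contained sketch of that merge pattern (hypothetical names and a placeholder expression, not the library code itself):

```jsonnet
// Hypothetical sketch of the merge pattern described in the guide above.
local base = {
  prometheusAlerts+:: {
    groups+: [],
  },
};

local extended = base {
  prometheusAlerts+:: {
    groups+: [
      {
        name: 'example-group',
        rules: [
          {
            alert: 'ExampleAlert',
            expr: 'vector(1)',  // placeholder expression, always firing
            'for': '5m',
            labels: { severity: 'warning' },
            annotations: { description: 'Example alert merged into the existing object.' },
          },
        ],
      },
    ],
  },
};

// Materialize the hidden field so plain `jsonnet` prints the merged rule groups.
extended.prometheusAlerts
```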
View File

@@ -37,6 +37,14 @@ function(params) {
mixin:: (import 'github.com/kubernetes-monitoring/kubernetes-mixin/mixin.libsonnet') {
_config+:: k8s._config.mixin._config,
} + {
// Filter-out alerts related to kube-proxy when `kubeProxy: false`
[if !(defaults + params).kubeProxy then 'prometheusAlerts']+:: {
groups: std.filter(
function(g) !std.member(['kubernetes-system-kube-proxy'], g.name),
super.groups
),
},
},
prometheusRule: {
@@ -280,7 +288,6 @@ function(params) {
},
podMetricsEndpoints: [{
honorLabels: true,
targetPort: 10249,
relabelings: [
{
action: 'replace',
@@ -289,6 +296,13 @@ function(params) {
sourceLabels: ['__meta_kubernetes_pod_node_name'],
targetLabel: 'instance',
},
{
action: 'replace',
regex: '(.*)',
replacement: '$1:10249',
targetLabel: '__address__',
sourceLabels: ['__meta_kubernetes_pod_ip'],
},
],
}],
},

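The first hunk in this file wires the `kubernetes-system-kube-proxy` alert group to the component's `kubeProxy` parameter: when the flag is off, the group is filtered out of `prometheusAlerts`. A standalone sketch of the same filtering idea (plain jsonnet with hypothetical minimal data, not the component code itself):

```jsonnet
// Standalone illustration of the std.filter pattern used in the hunk above:
// drop a named alert group when a feature flag is disabled.
local kubeProxyEnabled = false;

local alerts = {
  groups: [
    { name: 'kubernetes-system-kube-proxy', rules: [] },
    { name: 'kubernetes-apps', rules: [] },
  ],
};

alerts {
  groups: std.filter(
    function(g) kubeProxyEnabled || !std.member(['kubernetes-system-kube-proxy'], g.name),
    super.groups
  ),
}
```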
View File

@@ -35,9 +35,12 @@ local defaults = {
// GC values,
// imageGCLowThresholdPercent: 80
// imageGCHighThresholdPercent: 85
// GC kicks in when imageGCHighThresholdPercent is hit and attempts to free upto imageGCLowThresholdPercent.
// See https://kubernetes.io/docs/reference/config-api/kubelet-config.v1beta1/ for more details.
fsSpaceFillingUpWarningThreshold: 20,
fsSpaceFillingUpCriticalThreshold: 15,
// Warn only after imageGCHighThresholdPercent is hit, but filesystem is not freed up for a prolonged duration.
fsSpaceFillingUpWarningThreshold: 15,
// Send critical alert only after (imageGCHighThresholdPercent + 5) is hit, but filesystem is not freed up for a prolonged duration.
fsSpaceFillingUpCriticalThreshold: 10,
diskDeviceSelector: 'device=~"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+"',
runbookURLPattern: 'https://runbooks.prometheus-operator.dev/runbooks/node/%s',
},

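To make the relationship in the new comments concrete: these thresholds are percentages of filesystem space still available, so the new defaults only start alerting once the kubelet's image GC should already be running. A small worked sketch of that arithmetic (illustrative values only, not part of the library):

```jsonnet
// Illustrative arithmetic only: map availability thresholds to usage percentages.
local imageGCHighThresholdPercent = 85;       // kubelet default
local fsSpaceFillingUpWarningThreshold = 15;  // new default above
local fsSpaceFillingUpCriticalThreshold = 10; // new default above

{
  // Warning can only fire above 100 - 15 = 85% usage, i.e. once the kubelet
  // image GC threshold has already been crossed.
  warningUsagePercent: 100 - fsSpaceFillingUpWarningThreshold,
  // Critical corresponds to 100 - 10 = 90% usage (imageGCHighThresholdPercent + 5).
  criticalUsagePercent: 100 - fsSpaceFillingUpCriticalThreshold,
  kubeletGCUsagePercent: imageGCHighThresholdPercent,
}
```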
View File

@@ -220,7 +220,27 @@ function(params) {
'--tls-cipher-suites=' + std.join(',', pa._config.tlsCipherSuites),
],
resources: pa._config.resources,
ports: [{ containerPort: 6443 }],
readinessProbe: {
httpGet: {
path: '/readyz',
port: 'https',
scheme: 'HTTPS',
},
initialDelaySeconds: 30,
periodSeconds: 5,
failureThreshold: 5,
},
livenessProbe: {
httpGet: {
path: '/livez',
port: 'https',
scheme: 'HTTPS',
},
initialDelaySeconds: 30,
periodSeconds: 5,
failureThreshold: 5,
},
ports: [{ containerPort: 6443, name: 'https' }],
volumeMounts: [
{ name: 'tmpfs', mountPath: '/tmp', readOnly: false },
{ name: 'volume-serving-cert', mountPath: '/var/run/serving-cert', readOnly: false },

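These settings implement the backported commit "Adds readinessProbe and livenessProbe to prometheus-adapter jsonnet" listed above. A rough, illustrative reading of the timing (a sketch, not library code): with `periodSeconds: 5` and `failureThreshold: 5`, the kubelet needs about 25 seconds of consecutive failed checks, after the 30-second initial delay, before it restarts the container (liveness) or removes the pod from service endpoints (readiness).

```jsonnet
// Illustrative timing arithmetic for the probe settings above (not library code).
local probe = { initialDelaySeconds: 30, periodSeconds: 5, failureThreshold: 5 };

{
  // Shortest window of consecutive failures before the kubelet acts.
  failureWindowSeconds: probe.periodSeconds * probe.failureThreshold,         // 25
  // Worst case from container start until the first possible restart/unready.
  worstCaseFromStartSeconds: probe.initialDelaySeconds
                             + probe.periodSeconds * probe.failureThreshold,  // 55
}
```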
View File

@@ -1,6 +1,15 @@
{
"version": 1,
"dependencies": [
{
"source": {
"git": {
"remote": "https://github.com/grafana/jsonnet-libs.git",
"subdir": "mixin-utils"
}
},
"version": "master"
},
{
"source": {
"local": {

View File

@@ -18,7 +18,7 @@
"subdir": "contrib/mixin"
}
},
"version": "73080a716634f45d50d0593e0454ed3206a52f5b",
"version": "ae3b43a924c688f06560ada76a047d14b3935829",
"sum": "W/Azptf1PoqjyMwJON96UY69MFugDA4IAYiKURscryc="
},
{
@@ -28,8 +28,8 @@
"subdir": "grafonnet"
}
},
"version": "3626fc4dc2326931c530861ac5bebe39444f6cbf",
"sum": "gF8foHByYcB25jcUOBqP6jxk0OPifQMjPvKY0HaCk6w="
"version": "a1d61cce1da59c71409b99b5c7568511fec661ea",
"sum": "342u++/7rViR/zj2jeJOjshzglkZ1SY+hFNuyCBFMdc="
},
{
"source": {
@@ -38,8 +38,18 @@
"subdir": "grafana-builder"
}
},
"version": "264a5c2078c5930af57fe2d107eff83ab63553af",
"sum": "0KkygBQd/AFzUvVzezE4qF/uDYgrwUXVpZfINBti0oc="
"version": "02db06f540086fa3f67d487bd01e1b314853fb8f",
"sum": "B49EzIY2WZsFxNMJcgRxE/gcZ9ltnS8pkOOV6Q5qioc="
},
{
"source": {
"git": {
"remote": "https://github.com/grafana/jsonnet-libs.git",
"subdir": "mixin-utils"
}
},
"version": "d9ba581fb27aa6689e911f288d4df06948eb8aad",
"sum": "LoYq5QxJmUXEtqkEG8CFUBLBhhzDDaNANHc7Gz36ZdM="
},
{
"source": {
@@ -48,8 +58,8 @@
"subdir": ""
}
},
"version": "b538a10c89508f8d12885680cca72a134d3127f5",
"sum": "GLt5T2k4RKg36Gfcaf9qlTfVumDitqotVD0ipz/bPJ4="
"version": "ab104c5c406b91078d676475c14ab18644f84f2d",
"sum": "tRpIInEClWUNe5IS6uIjucFN/KqDFgg19+yo78VrLfU="
},
{
"source": {
@@ -58,7 +68,7 @@
"subdir": "lib/promgrafonnet"
}
},
"version": "fd913499e956da06f520c3784c59573ee552b152",
"version": "c72ac0392998343d53bd73343467f8bf2aa4e333",
"sum": "zv7hXGui6BfHzE9wPatHI/AGZa4A2WKo6pq7ZdqBsps="
},
{
@@ -141,7 +151,7 @@
"subdir": "mixin"
}
},
"version": "632032712f12eea0015aaef24ee1e14f38ef3e55",
"version": "fb97c9a5ef51849ccb7960abbeb9581ad7f511b9",
"sum": "X+060DnePPeN/87fgj0SrfxVitywTk8hZA9V4nHxl1g=",
"name": "thanos-mixin"
},

File diff suppressed because it is too large.

View File

@@ -150,7 +150,7 @@ spec:
!=
0
) or (
kube_daemonset_updated_number_scheduled{job="kube-state-metrics"}
kube_daemonset_status_updated_number_scheduled{job="kube-state-metrics"}
!=
kube_daemonset_status_desired_number_scheduled{job="kube-state-metrics"}
) or (
@@ -159,7 +159,7 @@ spec:
kube_daemonset_status_desired_number_scheduled{job="kube-state-metrics"}
)
) and (
changes(kube_daemonset_updated_number_scheduled{job="kube-state-metrics"}[5m])
changes(kube_daemonset_status_updated_number_scheduled{job="kube-state-metrics"}[5m])
==
0
)
@@ -752,18 +752,6 @@ spec:
for: 15m
labels:
severity: critical
- name: kubernetes-system-kube-proxy
rules:
- alert: KubeProxyDown
annotations:
description: KubeProxy has disappeared from Prometheus target discovery.
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeproxydown
summary: Target disappeared from Prometheus target discovery.
expr: |
absent(up{job="kube-proxy"} == 1)
for: 15m
labels:
severity: critical
- name: kube-apiserver-burnrate.rules
rules:
- expr: |

View File

@@ -23,7 +23,7 @@ spec:
summary: Filesystem is predicted to run out of space within the next 24 hours.
expr: |
(
node_filesystem_avail_bytes{job="node-exporter",fstype!=""} / node_filesystem_size_bytes{job="node-exporter",fstype!=""} * 100 < 20
node_filesystem_avail_bytes{job="node-exporter",fstype!=""} / node_filesystem_size_bytes{job="node-exporter",fstype!=""} * 100 < 15
and
predict_linear(node_filesystem_avail_bytes{job="node-exporter",fstype!=""}[6h], 24*60*60) < 0
and
@@ -41,7 +41,7 @@ spec:
summary: Filesystem is predicted to run out of space within the next 4 hours.
expr: |
(
node_filesystem_avail_bytes{job="node-exporter",fstype!=""} / node_filesystem_size_bytes{job="node-exporter",fstype!=""} * 100 < 15
node_filesystem_avail_bytes{job="node-exporter",fstype!=""} / node_filesystem_size_bytes{job="node-exporter",fstype!=""} * 100 < 10
and
predict_linear(node_filesystem_avail_bytes{job="node-exporter",fstype!=""}[6h], 4*60*60) < 0
and

View File

@@ -37,9 +37,26 @@ spec:
- --secure-port=6443
- --tls-cipher-suites=TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA,TLS_RSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA
image: k8s.gcr.io/prometheus-adapter/prometheus-adapter:v0.9.1
livenessProbe:
failureThreshold: 5
httpGet:
path: /livez
port: https
scheme: HTTPS
initialDelaySeconds: 30
periodSeconds: 5
name: prometheus-adapter
ports:
- containerPort: 6443
name: https
readinessProbe:
failureThreshold: 5
httpGet:
path: /readyz
port: https
scheme: HTTPS
initialDelaySeconds: 30
periodSeconds: 5
resources:
limits:
cpu: 250m