Compare commits

346 Commits

Author SHA1 Message Date
Paweł Krupa
7f94cfff2e Merge pull request #1494 from PhilipGough/revert-1406-dropped-cadvisor-metrics-6 2021-11-24 13:13:18 +01:00
Philip Gough
6783b6df04 Revert "Adjust dropped metrics from cAdvisor" 2021-11-10 10:30:04 +00:00
Paweł Krupa
9d5c3cece3 Merge pull request #1440 from prometheus-operator/automated-updates-release-0.6 2021-10-18 10:42:18 +02:00
dgrisonnet
a0999ff8d3 [bot] [release-0.6] Automated version update 2021-10-18 07:39:25 +00:00
Damien Grisonnet
5d07b5d659 Merge pull request #1429 from prometheus-operator/automated-updates-release-0.6
[bot] [release-0.6] Automated version update
2021-10-12 09:20:11 +02:00
dgrisonnet
8e257a945f [bot] [release-0.6] Automated version update 2021-10-11 07:39:26 +00:00
Damien Grisonnet
0cbec5b320 Merge pull request #1406 from PhilipGough/dropped-cadvisor-metrics-6
This change drops pod-centric metrics without a non-empty 'container'…
2021-09-28 12:02:28 +02:00
Philip Gough
d714141304 This change drops pod-centric metrics without a non-empty 'container' label.
Previously we dropped pod-centric metrics without a (pod, namespace) label set;
however, these can be critical for debugging.

Keep 'container_fs_.*' metrics from cAdvisor
2021-09-28 10:53:20 +01:00
Arthur Silva Sens
ccdb3781ca Merge pull request #1355 from PhilipGough/bz-199074
jsonnet: Drop cAdvisor metrics with no (pod, namespace) labels while …
2021-09-02 17:17:13 -03:00
Philip Gough
47e55a460e jsonnet: Drop cAdvisor metrics with no (pod, namespace) labels while preserving ability to monitor system services resource usage
The following provides a description and cardinality estimation based on the tests in a local cluster:

container_blkio_device_usage_total - useful for containers, but not for system services (nodes*disks*services*operations*2)
container_fs_.*                    - add filesystem read/write data (nodes*disks*services*4)
container_file_descriptors         - file descriptors limits and global numbers are exposed via (nodes*services)
container_threads_max              - max number of threads in cgroup. Usually for system services it is not limited (nodes*services)
container_threads                  - used threads in cgroup. Usually not important for system services (nodes*services)
container_sockets                  - used sockets in cgroup. Usually not important for system services (nodes*services)
container_start_time_seconds       - container start. Possibly not needed for system services (nodes*services)
container_last_seen                - Not needed as system services are always running (nodes*services)
container_spec_.*                  - Everything related to cgroup specification and thus static data (nodes*services*5)
2021-08-30 12:45:20 +01:00
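The cardinality formulas in the commit message above can be turned into a quick estimate. The following is a minimal sketch, not part of the commit: the function name and the cluster sizes in the example are hypothetical, and only the per-metric formulas are taken from the message.

```python
def estimate_series(nodes: int, disks: int, services: int, operations: int) -> dict:
    """Estimate series counts per metric family for system services,
    using the formulas quoted in the commit message."""
    per_service = nodes * services
    return {
        "container_blkio_device_usage_total": nodes * disks * services * operations * 2,
        "container_fs_.*": nodes * disks * services * 4,
        "container_file_descriptors": per_service,
        "container_threads_max": per_service,
        "container_threads": per_service,
        "container_sockets": per_service,
        "container_start_time_seconds": per_service,
        "container_last_seen": per_service,
        "container_spec_.*": per_service * 5,
    }


if __name__ == "__main__":
    # Hypothetical cluster: 3 nodes, 2 disks per node, 10 system services,
    # 4 block I/O operation types. None of these numbers come from the commit.
    estimate = estimate_series(nodes=3, disks=2, services=10, operations=4)
    for name, count in estimate.items():
        print(f"{name:40s} {count}")
    print("total", sum(estimate.values()))
```

The estimate grows linearly with each factor, so the families that multiply by disks and operations dominate as soon as a node has more than one disk.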
Damien Grisonnet
dcc97b6f38 Merge pull request #1329 from prometheus-operator/automated-updates-release-0.6
[bot] [release-0.6] Automated version update
2021-08-17 10:19:16 +02:00
dgrisonnet
684dbc0bd5 [bot] [release-0.6] Automated version update 2021-08-17 08:05:35 +00:00
Paweł Krupa
80a830422d Merge pull request #1311 from dgrisonnet/ci-release-0.6 2021-08-16 10:16:32 +02:00
Philip Gough
3c8b17cb0b ci: Harden action to wait for kind cluster readiness 2021-08-09 18:23:03 +02:00
Damien Grisonnet
24e0682b05 ci: replace travis CI by github actions
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2021-08-09 18:22:01 +02:00
Paweł Krupa
8cb3f62e67 Merge pull request #1296 from dgrisonnet/1245-release-0.6
release-0.6: *: add "update" target to makefile and use it in automatic updater
2021-08-02 17:56:50 +02:00
paulfantom
e364778771 *: add "update" target to makefile and use it in automatic updater
Signed-off-by: paulfantom <pawel@krupa.net.pl>
2021-08-02 12:41:58 +02:00
Paweł Krupa
95ba62c107 Merge pull request #976 from vshn/0.6/pin-dependencies
[release-0.6] Pin Jsonnet dependencies
2021-02-24 15:21:42 +01:00
Simon Rüegg
a09aff9709 [release-0.6] Pin Jsonnet dependencies
Pin all Jsonnet dependencies to current commit SHA.

Signed-off-by: Simon Rüegg <simon@rueggs.ch>
2021-02-24 15:01:02 +01:00
Paweł Krupa
b2b90f25b8 Merge pull request #935 from underrun/fix-etcd-mixin-move
Pin etcd-mixin to last working version for release 0.6
2021-02-11 23:09:25 +01:00
Derek Wilson
d9465ce7a3 pin etcd-mixin to last working version for release
etcd refactored their repo, moving and renaming etcd-mixin. The
jsonnetfile depended on "master" even though the lock was for an older
version. Checking out from the last commit before the move works.
2021-02-11 18:08:25 +00:00
Lili Cosic
f69ff3d63d Merge pull request #687 from lilic/bump-to-patch-prom-operator
[release-0.6]: Bump to prometheus-operator 0.42.1
2020-09-23 13:55:22 +02:00
Lili Cosic
5550829c60 manifests: Regenerate 2020-09-23 11:45:08 +02:00
Lili Cosic
8b07a38917 jsonnetfile.lock.json: jb update 2020-09-23 11:42:25 +02:00
Sergiusz Urbaniak
3dfe4ee112 Merge pull request #669 from lilic/test-against-1.19
Test against 1.18 and 1.19
2020-09-22 10:10:38 +02:00
Sergiusz Urbaniak
09199d875c Merge pull request #684 from s-urbaniak/release-0.6
jsonnet: bump to prometheus-operator 0.42
2020-09-21 11:27:51 +02:00
Sergiusz Urbaniak
05b7a932ab jsonnet: bump to prometheus-operator 0.42 2020-09-21 10:51:24 +02:00
Lili Cosic
3bb92838bf Test against 1.18 and 1.19 2020-09-09 18:09:31 +02:00
Sergiusz Urbaniak
82fe2a3b06 Merge pull request #665 from s-urbaniak/node-exporter-max-unavailable-0.6
Backport: node-exporter: set maxUnavailable to 10%
2020-09-03 10:48:05 +02:00
Scott Dodson
87fabbc077 node-exporter: set maxUnavailable to 10%
This daemonset doesn't affect workload availability so allow its rollout to
be parallelized.
2020-09-03 10:01:53 +02:00
Frederic Branczyk
f1d92c8a80 Merge pull request #635 from lilic/cherry-pick-alerts
[release-0.6] jsonnet/prometheus-operator.libsonnet: Adjust alerts range
2020-08-06 15:40:04 +02:00
Lili Cosic
09a305bc0e manifests/prometheus-rules.yaml: Regenerate 2020-08-06 15:08:37 +02:00
Lili Cosic
f8b4c681a6 jsonnet/prometheus-operator.libsonnet: Adjust alerts range 2020-08-06 15:06:53 +02:00
Lili Cosic
85e22303f0 Merge pull request #627 from lilic/pin-mixin
Pin kube-mixin project to latest release
2020-08-03 13:33:17 +02:00
Lili Cosic
c0ff2c9f2d Pin kube-mixin project to latest release 2020-08-03 13:24:32 +02:00
Sergiusz Urbaniak
7d5d6d6a63 Merge pull request #626 from s-urbaniak/release-0.6
pin release0.6 release
2020-07-31 12:12:05 +02:00
Sergiusz Urbaniak
2932a74170 README: update compatibility matrix 2020-07-31 10:52:40 +02:00
Sergiusz Urbaniak
685a85e3e0 jb update, manifests: generate 2020-07-31 10:18:24 +02:00
Sergiusz Urbaniak
2326773ee1 jsonnet/kube-prometheus: pin dependencies 2020-07-31 10:18:24 +02:00
Frederic Branczyk
f0955e0540 Merge pull request #623 from brancz/add-kubelet-probes-metrics
Add scraping of endpoint for kubelet probe metrics
2020-07-29 12:57:28 +02:00
Frederic Branczyk
7c35752e3f Add scraping of endpoint for kubelet probe metrics 2020-07-29 11:49:52 +02:00
Frederic Branczyk
df3bfc6575 Merge pull request #622 from brancz/po-metrics
prometheus-adapter: Collect metrics from Prometheus Adapter
2020-07-29 11:45:00 +02:00
Frederic Branczyk
b51b9b983f prometheus-adapter: Collect metrics from Prometheus Adapter 2020-07-29 11:38:42 +02:00
Frederic Branczyk
6771c9bcc2 Merge pull request #616 from paulfantom/ciphers
Update default ciphers used by kube-rbac-proxy
2020-07-28 09:31:20 +02:00
paulfantom
63ad66e3f3 manifests: regenerate 2020-07-28 08:49:27 +02:00
paulfantom
8f85949438 jsonnet: update kube-rbac-proxy ciphers 2020-07-28 08:49:21 +02:00
Frederic Branczyk
2539ba9548 Merge pull request #621 from tafkam/master
secure metrics port for scheduler and controller-manager
2020-07-27 10:46:17 +02:00
root
3a6a0d0837 make generate 2020-07-27 10:29:31 +02:00
tafkam
6dfbcf35f2 port https-metrics 2020-07-27 10:27:14 +02:00
tafkam
c1304caa28 update secure ports for other cluster 2020-07-25 18:30:07 +02:00
tafkam
4410a80e4e secure scheduler/controller metrics ports, kubeadm discovery services 2020-07-25 18:27:17 +02:00
Frederic Branczyk
40adbfae6c Merge pull request #617 from paulfantom/node_filesystem_usage
Remove instance:node_filesystem_usage:sum
2020-07-23 21:25:55 +02:00
Frederic Branczyk
ba5c6e2e6a Merge pull request #618 from simonpasquier/bump-thanos
jsonnet: update component versions
2020-07-23 21:24:48 +02:00
Frederic Branczyk
d67c5da75e Merge pull request #620 from adinhodovic/regenerate-dashboards-rules
Regenerate dashboards and prometheus alerts
2020-07-23 21:04:47 +02:00
Adin Hodovic
6a34239786 Regenerate dashboards and alerts
Merged https://github.com/kubernetes-monitoring/kubernetes-mixin/pull/463 to remove duplicate entries for memory usage, however I'd like to move these changes to the Prometheus-Operator helm chart(https://github.com/helm/charts/pull/23024#issuecomment-661967101). I've regenerated the dashboards/alerts.
2020-07-23 18:36:41 +02:00
Simon Pasquier
a9ffdaa35c manifests: regenerate 2020-07-23 18:04:56 +02:00
Simon Pasquier
fcf7a2fcbf jsonnet: update component versions 2020-07-23 17:06:48 +02:00
paulfantom
550d42d95b manifests: regenerate 2020-07-23 16:51:35 +02:00
paulfantom
4e116aa7e2 jsonnet: remove incorrect instance:node_filesystem_usage:sum rule 2020-07-23 16:50:27 +02:00
Frederic Branczyk
b55c2825f7 Merge pull request #610 from lilic/add-more-alerts
Add PrometheusOperatorListErrors and fix PrometheusOperatorWatchErrors threshold
2020-07-15 13:19:45 +02:00
Lili Cosic
d88cb26377 manifests/prometheus-rules.yaml: Regenerate 2020-07-15 10:28:03 +02:00
Lili Cosic
5743540fbb prometheus-operator.libsonnet: Add List error alert and fix threshold to
Watch error alert
2020-07-15 10:24:45 +02:00
Frederic Branczyk
1917a57280 Merge pull request #608 from ghostsquad/chore/update-go-jsonnet
chore(jsonnet): ⬆️  update jsonnet to master
2020-07-14 10:10:36 +02:00
Frederic Branczyk
2421e8cbe9 Merge pull request #609 from lilic/add-prom-operator-alerts
prometheus-operator.libsonnet: Add PrometheusOperatorWatchErrors alert
2020-07-14 08:17:32 +02:00
Lili Cosic
a5b71282cd manifests/prometheus-rules.yaml: Regenerate 2020-07-13 17:35:36 +02:00
Lili Cosic
dfe9184c9b prometheus-operator.libsonnet: Add PrometheusOperatorWatchErrors alert 2020-07-13 17:35:36 +02:00
Weston McNamee
6f4a9e5233 chore(jsonnet): ⬆️ update jsonnet to master
pulls in recent performance improvement changes to speed up rendering

resolves #537
2020-07-12 23:27:36 -07:00
Lili Cosic
a87f322edc Merge pull request #605 from lilic/bump-prom-version
jsonnet/kube-prometheus: Bump default versions of prometheus and alertmanager
2020-07-09 12:03:01 +02:00
Lili Cosic
617003a583 manifests: Regenerate files 2020-07-09 11:48:30 +02:00
Lili Cosic
3865eacdb3 jsonnet/kube-prometheus: Bump default versions of prometheus and alertmanager 2020-07-09 11:48:22 +02:00
Frederic Branczyk
bce16b41eb Merge pull request #600 from tkashem/etcd-latency-metrics
enable etcd latency metrics in kube-apiserver
2020-07-03 16:20:52 +02:00
Abu Kashem
4d6e3d5c19 enable etcd latency metrics in kube-apiserver
kube-apiserver has a histogram etcd_request_duration_seconds that
measures latency between the kube-apiserver and etcd instance.
This metric is currently dropped by cluster-prometheus. Enable
this metric so we have visibility into etcd latency.

We ensured that this does not enable other unwanted metrics:

count by(__name__) ({__name__=~"etcd_request.+"})

etcd_request_duration_seconds_bucket
etcd_request_duration_seconds_count
etcd_request_duration_seconds_sum
2020-07-03 09:49:56 -04:00
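The check described in the commit above (that the pattern admits only the three histogram series) can be reproduced offline. This is a rough sketch under stated assumptions: the sample list is hypothetical apart from the three names quoted in the message, and Python's re.fullmatch stands in for Prometheus regex matching, which is anchored at both ends.

```python
import re

# Prometheus anchors regex matchers on both ends, so fullmatch mirrors "=~".
PATTERN = re.compile(r"etcd_request.+")

# Hypothetical sample of series names. Only the first three appear in the
# commit message; the other two are illustrative non-matches.
SAMPLE_NAMES = [
    "etcd_request_duration_seconds_bucket",
    "etcd_request_duration_seconds_count",
    "etcd_request_duration_seconds_sum",
    "etcd_object_counts",
    "apiserver_request_duration_seconds_bucket",
]


def matching_names(names):
    """Return the names the pattern would keep, sorted for stable output."""
    return sorted(n for n in names if PATTERN.fullmatch(n))


if __name__ == "__main__":
    for name in matching_names(SAMPLE_NAMES):
        print(name)
```

Against a live cluster, the PromQL query in the commit message remains the authoritative check; the sketch only illustrates why the names with a different prefix are not matched.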
Matthias Loibl
f4568b06dc Merge pull request #594 from metalmatze/discussions
Update the Issue templates to redirect to GitHub Discussions.
2020-06-30 12:58:59 +02:00
Matthias Loibl
cc7583fefb Update the Issue templates to redirect to GitHub Discussions. 2020-06-30 10:38:28 +02:00
Frederic Branczyk
176e9659f3 Merge pull request #590 from metalmatze/update-kubernetes-mixin
Update kubernetes-mixin to remove KubeAPILatencyHigh & KubeAPIErrorsHigh
2020-06-30 09:09:53 +02:00
Matthias Loibl
ea7a834755 Update kubernetes-mixin to remove KubeAPILatencyHigh & KubeAPIErrorsHigh 2020-06-29 19:43:34 +02:00
Lucas Servén Marín
2c1fc1cc11 Merge pull request #587 from andresterba/fix-typo
Fix typo
2020-06-26 12:58:22 +02:00
André Sterba
829a553e7a Fix typo 2020-06-26 12:17:49 +02:00
Simon Pasquier
de9591cbb0 Merge pull request #584 from simonpasquier/bump-grafana-6.7.4
Bump Grafana to v6.7.4
2020-06-24 13:32:26 +02:00
Simon Pasquier
83ebd535e6 manifests: regenerate 2020-06-24 10:55:13 +02:00
Simon Pasquier
bbd4e61fc1 Bump Grafana version to v6.7.4 2020-06-24 10:51:35 +02:00
Frederic Branczyk
1d41243b54 Merge pull request #579 from tommyjmquinn/master
Updated prometheus adapter deployment to use a multi arch image repo
2020-06-23 16:09:32 +02:00
Frederic Branczyk
b707a94314 Merge pull request #577 from kradalby/master
Make node-exporter listening address configurable
2020-06-23 16:00:51 +02:00
Tom Quinn
e82acdb253 Updated prometheus adapter deployment to use a multi arch image repo 2020-06-22 13:57:41 +01:00
Kristoffer Dalby
f55a17718d Allow nodeExporter address to be configured 2020-06-21 09:11:16 +01:00
Kristoffer Dalby
6b4bc0bb26 Allow nodeExporter address to be configured 2020-06-21 08:28:48 +01:00
Frederic Branczyk
6f488250fd Merge pull request #576 from simonpasquier/fix-alertmanager-config-inconsistent-alert
Fix AlertmanagerConfigInconsistent alert
2020-06-19 16:20:40 +02:00
Frederic Branczyk
97ca4616ff Merge pull request #575 from stafot/update_adapter_endpoint
Update prometheus-adapter endpoint
2020-06-19 16:08:30 +02:00
Simon Pasquier
0a43e85917 manifests: regenerate 2020-06-19 14:41:11 +02:00
Simon Pasquier
c3ea4675da Fix AlertmanagerConfigInconsistent alert
Previously the alert would fire when the number of Alertmanager pods
didn't match the number of replicas defined in the Alertmanager spec
even though all the running pods had the same configuration hash. This
type of issue is already covered by KubeStatefulSetUpdateNotRolledOut
(and possibly KubePodNotReady), having AlertmanagerConfigInconsistent
also active in this situation creates unnecessary noise.

With this change, the alert expression only returns when Alertmanager
pods have different configuration hash values irrespective of the number
of pod replicas. The message annotation has also been enhanced to report
the configuration hash for each pod.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-06-19 14:30:55 +02:00
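The condition described in the commit above can be modeled in a few lines. This is an illustrative sketch only, not the alert's actual PromQL expression: the function name, pod names, and hash values are hypothetical. It shows the stated intent, namely that the replica count plays no part and only differing configuration hashes trigger the alert.

```python
def config_inconsistent(config_hash_by_pod: dict) -> bool:
    """Return True when the running pods report more than one distinct
    configuration hash. The desired replica count is deliberately not
    an input."""
    return len(set(config_hash_by_pod.values())) > 1


if __name__ == "__main__":
    # Two pods running out of a hypothetical three replicas, same hash:
    # the old alert would have fired, the new condition does not.
    print(config_inconsistent({"alertmanager-0": "abc", "alertmanager-1": "abc"}))
    # Same pods with different hashes: the condition holds.
    print(config_inconsistent({"alertmanager-0": "abc", "alertmanager-1": "def"}))
```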
Stavros Foteinopoulos
3cbc97d782 Update prometheus-adapter endpoint 2020-06-19 15:27:26 +03:00
Lili Cosic
17989b42aa Merge pull request #574 from lilic/bump-prom-op-40
Bump prometheus-operator to v0.40
2020-06-19 11:55:50 +02:00
Lili Cosic
beaba9f4da docs, manifests: Regenerate files 2020-06-19 10:30:50 +02:00
Lili Cosic
c5ecc42244 jsonnetfile.lock.json: jb update 2020-06-19 10:27:34 +02:00
Lili Cosic
53bb3431ad jsonnet/kube-prometheus/jsonnetfile.json: Bump prometheus-operator to
v0.40
2020-06-19 10:26:55 +02:00
Frederic Branczyk
7e0c503b13 Merge pull request #553 from atmosx/update-grafana-dashboard-docs
Update grafana dashboard docs
2020-05-27 19:09:32 +02:00
Panagiotis Atmatzidis
e3ad00999f [docs/update-grafana-dashboard-docs] Update Grafana dashboard instructions
Instructions to add Grafana dashboard do not work. The proposed
functions are wrong, according to
[grafana.libsonnet](https://github.com/brancz/kubernetes-grafana/blob/master/grafana/grafana.libsonnet)
`dashboards` and `rawDashboards` should be used in `grafana+::`
field.

This PR updates the existing documentation and fixes minor typos.
2020-05-27 19:39:31 +03:00
Frederic Branczyk
4b0fb40717 Merge pull request #551 from dgrisonnet/fix-release-0.5-compat
Update release-0.5 compatibility
2020-05-26 21:11:49 +02:00
Damien Grisonnet
ce1bc17d98 doc: update release-0.5 compatibility
kubernetes-mixin release-0.4 is only supported by 1.18+

Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2020-05-26 18:27:32 +02:00
Frederic Branczyk
5a84ac52c7 Merge pull request #548 from dmayle/update_docs
Update kubelet config section and compatibility
2020-05-25 14:08:30 +02:00
dmayle
37fb5cb53a Update kubelet config section and compatibility
This readme update includes two changes:
 1) Update the kubelet config requirements to mention the modern (non-deprecated) kubelet configuration values that can be used in place of the flags
 2) Update the compatibility matrix to mention the issue running release-0.4 on kubernetes versions 1.16.2 through 1.16.4, including a workaround.
2020-05-25 01:12:54 +02:00
Paul Gier
28332b410a Merge pull request #538 from pgier/update-compat-matrix-for-release-0.4
update compatibility matrix with note for release-0.4
2020-05-20 13:49:49 -05:00
Paul Gier
0983947755 update compatibility matrix with note for release-0.4 2020-05-20 11:16:42 -05:00
Frederic Branczyk
5b9341cad6 Merge pull request #527 from pgier/node-exporter-ignore-pod-mounts
Node exporter ignore pod mounts
2020-05-15 07:10:32 +02:00
Paul Gier
d288206d06 Merge pull request #526 from pgier/update-generated-files
update generated files for prometheus operator v0.39.0
2020-05-13 10:37:36 -05:00
Paul Gier
6742260399 update generated files for prometheus operator v0.39.0 2020-05-12 17:38:11 -05:00
Paul Gier
b40e70065b update generated files for node exporter ignored filesystems 2020-05-12 17:20:24 -05:00
Paul Gier
d1690d95f7 node_exporter: remove outdated comment and CLI arg
The ignored filesystem types now match the default, so the
comment and arg can be removed.
2020-05-12 17:14:05 -05:00
Paul Gier
69b6883033 node-exporter: ignore kubelet pod mounts
Ignore kubelet pod filesystem mounts of the form:
/var/lib/kubelet/pods/1b260ce7-e75d-44d4-8409-922d2bd0851f/volumes...
Metrics for these volumes are available via the kubelet_volume_stats*
metrics.
2020-05-12 17:12:36 -05:00
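The effect of ignoring kubelet pod mounts, as described in the commit above, can be sketched with a regular expression. The pattern below is a hypothetical stand-in written for illustration; the value actually passed to node-exporter is not shown in this log and may differ. Python's re.search is used here on the assumption that the exporter applies its pattern unanchored.

```python
import re

# Hypothetical pattern for illustration only; the project's real flag value
# is not quoted in the commit message.
IGNORED_MOUNT_POINTS = re.compile(r"^/var/lib/kubelet/pods/.+")


def is_ignored(mount_point: str) -> bool:
    """Return True when the mount point falls under a kubelet pod directory."""
    return IGNORED_MOUNT_POINTS.search(mount_point) is not None


if __name__ == "__main__":
    # Path prefix taken from the commit message (the original is elided after "volumes").
    print(is_ignored("/var/lib/kubelet/pods/1b260ce7-e75d-44d4-8409-922d2bd0851f/volumes"))
    # The kubelet root itself is not a pod mount and stays visible.
    print(is_ignored("/var/lib/kubelet"))
```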
Frederic Branczyk
f58d7b5695 Merge pull request #519 from pgier/dont-remove-preserve-unknown-fields
Revert "Remove field preserveUnknownFields from CRDs"
2020-05-11 16:16:22 +02:00
Paweł Krupa
11d57e468c Merge pull request #524 from paulfantom/prom-op-v0.39 2020-05-11 12:48:09 +02:00
paulfantom
7faed14744 *: regenerate 2020-05-11 11:59:55 +02:00
paulfantom
96ea25d5de *: update jsonnet to use prometheus-operator v0.39 2020-05-11 11:59:46 +02:00
Frederic Branczyk
dab022fc62 Merge pull request #508 from johanneswuerbach/custom-metrics-b2
custom metrics v1beta2 api with k8s-prometheus-adapter v0.7.0
2020-05-07 10:12:42 +02:00
Paul Gier
4840cdcb66 Revert "Remove field preserveUnknownFields from CRDs"
This reverts commit cdaaf3d51c.
2020-05-05 14:15:18 -05:00
Frederic Branczyk
d07466766d Merge pull request #517 from benjaminhuo/master
Update prometheus version to v2.17.2
2020-05-04 10:06:58 +02:00
Benjamin
7130905473 Update prometheus version to v2.17.2
Signed-off-by: Benjamin <benjamin@yunify.com>
2020-04-30 14:46:17 +08:00
Johannes Würbach
ab8f1bb9f2 custom metrics v1beta2 api 2020-04-30 00:26:06 +02:00
Johannes Würbach
8d6679658f k8s-prometheus-adapter v0.7.0 2020-04-30 00:26:06 +02:00
Frederic Branczyk
49ad6a67af Merge pull request #501 from dgrisonnet/fix-generate-cleanup
Fix json files cleanup when generating manifests
2020-04-29 14:14:05 +02:00
Damien Grisonnet
be4b525774 build.sh: fix json files cleanup
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2020-04-29 13:10:32 +02:00
Frederic Branczyk
070413521c Merge pull request #478 from NickelMedia/fix-nodeexporter-selector-labels
Remove version label from node-exporter selectors
2020-04-27 15:45:58 +02:00
Lili Cosic
60424ff54c Merge pull request #510 from bycEEE/patch-1
fix readme typos
2020-04-24 13:03:47 +02:00
Brian Choy
affbc9d7ff fix readme typos 2020-04-23 17:44:38 -07:00
Frederic Branczyk
320d512fc8 Merge pull request #374 from johanneswuerbach/custom-metrics
Transform custom-metrics into an addon
2020-04-22 19:28:59 +02:00
Frederic Branczyk
a1cf984749 Merge pull request #500 from lilic/bump-1.18
Test against kubernetes 1.18 release
2020-04-20 14:07:42 +02:00
Johannes Würbach
145ee24e09 Convert custom-metrics into an addon 2020-04-20 12:38:50 +02:00
Lili Cosic
626f1af8c0 tests/e2e/travis-e2e.sh: Bump kind version 2020-04-18 14:24:59 +02:00
Lili Cosic
be4d32cba2 README.md: Change compatibility matrix 2020-04-17 11:51:09 +02:00
Lili Cosic
b3dfd223b6 scripts,tests: Bump kubernetes version to 1.18 2020-04-17 11:36:32 +02:00
Frederic Branczyk
dcc46c8aa8 Merge pull request #496 from lilic/bump-things
Bump dependencies
2020-04-17 10:30:55 +02:00
Lili Cosic
b0f70c173b Bump to go 1.13 2020-04-17 09:53:19 +02:00
Lili Cosic
926337feac manifests: Regenerate 2020-04-17 09:48:06 +02:00
Lili Cosic
fd67733729 go.mod,sum: go mod tidy 2020-04-16 22:02:52 +02:00
Lili Cosic
f6ff666135 jb update 2020-04-16 21:59:33 +02:00
Lili Cosic
a8b4985de4 Merge pull request #482 from dgrisonnet/jsonnet-tooling
Move to go-jsonnet and lock tooling
2020-04-14 15:38:54 +02:00
Frederic Branczyk
e590ae2c68 Merge pull request #491 from sdarwin/prometheus-pvc.jsonnet
update prometheus-pvc.jsonnet
2020-04-14 11:10:57 +02:00
Frederic Branczyk
876bb9c5a1 Merge pull request #481 from omerlh/patch-2
Allow to configure EKS available IPs alert
2020-04-14 10:09:32 +02:00
Omer Levi Hevroni
6a08c7d69e Update kube-prometheus-eks.libsonnet 2020-04-13 10:51:13 +03:00
sdarwin
63078c2e78 update prometheus-pvc.jsonnet 2020-04-09 19:49:22 +00:00
Frederic Branczyk
d1c90625b1 Merge pull request #488 from johanneswuerbach/fix-window
prometheus-adapter: Fix rules window
2020-04-08 08:55:19 +02:00
Johannes Würbach
2ab69fdac0 Fix rules window 2020-04-07 22:01:26 +02:00
Frederic Branczyk
115721bbba Merge pull request #485 from johanneswuerbach/prom-adapter
Make prometheus-adapter config a real object
2020-04-07 17:02:29 +02:00
Zack Brenton
46aa9554d1 updated generated manifests 2020-04-07 11:06:30 -03:00
Johannes Würbach
bb21ea32e3 Make prometheus-adapter config a real object 2020-04-07 15:32:33 +02:00
Damien Grisonnet
7f9b082ed3 go.mod: remove unused packages
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2020-04-07 10:56:49 +02:00
Damien Grisonnet
7b4adb08f6 test.sh: update PATH to use project tooling
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2020-04-07 10:55:17 +02:00
Damien Grisonnet
c9900d6a57 Makefile: export GO111MODULE=on
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2020-04-07 10:38:13 +02:00
Zack Brenton
432db2c799 use top-level config for all nodeExporter selector labels 2020-04-06 13:54:17 -03:00
Damien Grisonnet
026425117d Makefile: use go install instead of go build
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2020-04-06 18:50:56 +02:00
Damien Grisonnet
f4b8064899 README: remove make in docker guidance
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2020-04-06 18:15:36 +02:00
Damien Grisonnet
9a7ba10755 build.sh: update PATH to use project tooling
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2020-04-06 18:15:34 +02:00
Paweł Krupa
0904ea78c0 Merge pull request #484 from miff2000/patch-1
Correct typo in Rolebindig
2020-04-06 16:47:02 +02:00
Matt Calvert
441065c2f9 Correct typo in Rolebindig 2020-04-06 14:50:30 +01:00
Damien Grisonnet
cb49f90491 ci: use golang tooling
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2020-04-06 12:50:11 +02:00
Damien Grisonnet
a9df00baec mod: add tooling dependencies
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2020-04-06 12:49:51 +02:00
Damien Grisonnet
0f6cd6d0a8 Makefile: remove containerized tooling
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2020-04-06 12:47:57 +02:00
Omer Levi Hevroni
ea9f474ab3 Allow to configure EKS available IPs alert 2020-04-06 12:15:09 +03:00
Frederic Branczyk
8fdf1c772c Merge pull request #480 from lilic/bump-kube-mixin
Bump dependencies
2020-04-03 14:51:40 +02:00
Lili Cosic
7992aa4e73 manifests: Regenerate files 2020-04-03 12:00:49 +02:00
Lili Cosic
5ee1229be8 jsonnetfile.json: Update deps 2020-04-03 11:59:10 +02:00
Frederic Branczyk
82c3d9e8e4 Merge pull request #467 from dgrisonnet/compatibility-matrix
doc: add kubernetes compatibility matrix
2020-04-03 09:54:33 +02:00
Zack Brenton
0d907098ae remove version label from node-exporter selectors 2020-04-02 12:53:17 -03:00
Damien Grisonnet
63bdb7d931 doc: add kubernetes compatibility matrix
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>

Co-Authored-By: Lili Cosic <cosiclili@gmail.com>
2020-03-31 13:39:26 +02:00
Lili Cosic
cf7bb8706c Merge pull request #463 from rajatvig/support_standard_labels_nodeexporter
Support standard labels for nodeexporter
2020-03-25 09:33:12 +01:00
Paweł Krupa
86b0419f59 Merge pull request #470 from paulfantom/sync_k8s_mixins
Sync dependencies
2020-03-25 09:29:41 +01:00
paulfantom
771ff9dcf4 manifests: regenerate 2020-03-24 15:49:39 +01:00
paulfantom
6b253bf13b *: update dependencies 2020-03-24 15:49:29 +01:00
paulfantom
0ad11b64d7 replace clock skew alert with one provided by node_exporter mixin 2020-03-24 15:49:10 +01:00
Rajat Vig
805d2e65f5 Update lock files 2020-03-24 11:17:34 +00:00
Rajat Vig
ff6b7ae5f3 Update Manifests based off the new jsonnets 2020-03-24 11:08:39 +00:00
Rajat Vig
83812948b7 Update lock files 2020-03-24 10:49:45 +00:00
Rajat Vig
474d4e39dc Remove the app label for node-exporter 2020-03-24 10:41:51 +00:00
Rajat Vig
6f4f34606d Remove custom k8s-app label in favor of standard k8s labels in the manifest for node-exporter 2020-03-24 10:33:42 +00:00
Paweł Krupa
68505af1f9 Merge pull request #453 from paulfantom/secure-metrics
Secure metrics endpoint
2020-03-24 11:06:00 +01:00
paulfantom
1dd5bbeb58 *: regenerate 2020-03-24 10:41:45 +01:00
paulfantom
f846c2e722 tests/e2e: use prometheus client_golang in e2e tests & add testing for http endpoints 2020-03-24 10:38:40 +01:00
paulfantom
6f37ddbcf9 jsonnet: expose prometheus-operator metrics over secure channel 2020-03-24 10:38:39 +01:00
paulfantom
4541b9e10c *: bump jb to 0.3.1 to be on par with latest tooling container 2020-03-24 10:38:38 +01:00
Paul Gier
75c532df17 Merge pull request #466 from pgier/prometheus-operator-v0.38.0
update prometheus-operator to v0.38.0
2020-03-23 11:43:27 -05:00
Paul Gier
09813bea10 update prometheus-operator to v0.38.0 2020-03-23 10:49:14 -05:00
Frederic Branczyk
a5e278372a Merge pull request #462 from bgagnon/460-ksm-namespace
Fix kube-state-metrics namespace override
2020-03-20 07:10:29 +01:00
Benoit Gagnon
bb5de11c89 fix kube-state-metrics namespace override
use $._config.namespace instead of hard-coding 'monitoring'
2020-03-19 21:32:34 -04:00
Lili Cosic
285624d8fb Merge pull request #456 from carlosedp/pr404_fix
Add version and image source as config parameters on kube-state-metrics
2020-03-18 15:38:07 +01:00
Carlos de Paula
0d4bfe7db5 Add version and image source as config parameters.
Fixes #455.
2020-03-18 10:20:31 -03:00
Latch Mihay
c4561b3206 adding security context to kube-rbac-proxy (#450)
* adding security context to kube-rbac-proxy

* make clean generate-in-docker

* Revert "make clean generate-in-docker"

This reverts commit ed136f1e37.

* make clean generate-in-docker

Co-authored-by: Latch M <latch_mihaylov@homedepot.com>
2020-03-18 07:52:26 +01:00
Frederic Branczyk
502f81b235 Merge pull request #441 from jadia/master
fix invalid Usage section reference
2020-03-17 14:20:31 +01:00
Frederic Branczyk
d2389d3e71 Merge pull request #452 from paulfantom/irate
Use irate for CPU measurements
2020-03-17 11:10:24 +01:00
paulfantom
ae69b62d01 manifests: regenerate 2020-03-17 10:57:53 +01:00
paulfantom
081f418273 jsonnet/prometheus-adapter: use irate for CPU queries 2020-03-16 11:58:55 +01:00
Lili Cosic
b100eead2e Merge pull request #448 from alok87/446-weave-net
Fixes for the weave-net monitoring setup
2020-03-12 12:14:55 +01:00
Alok Kumar Singh
50ff549b52 Updated the doc as grafana deployment needs modifications
The Grafana deployment needs to be modified for weave-net to mount the
weave-net config map volumes
2020-03-12 11:53:55 +05:30
Alok Kumar Singh
4ebc37e47b Fixed the port name for weave-net metrics endpoint 2020-03-12 10:40:18 +05:30
Alok Kumar Singh
486b233c6a Fixed the label for weave net selector 2020-03-12 10:33:38 +05:30
Nitish Jadia
90148e2356 fix invalid Usage section reference
Replace the Usage section reference with a reference to the Customizing Kube-Prometheus section.
2020-03-06 17:53:13 +05:30
Lucas Servén Marín
66c625d0bf Merge pull request #438 from dgrisonnet/update-customizing-section
Add note related to example.jsonnet in README
2020-03-05 11:56:54 +01:00
Damien Grisonnet
848285797c doc: add note related to example.jsonnet
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2020-03-05 10:23:27 +01:00
Frederic Branczyk
e27f575347 Merge pull request #439 from alok87/patch-1
Fix the grafana dashboard link
2020-03-05 07:48:06 +01:00
Alok Kumar Singh
17db5a68e5 Fix the grafana dashboard link 2020-03-05 08:54:55 +05:30
Frederic Branczyk
7a2572d1f9 Merge pull request #425 from alok87/weave-net
Weave Net Monitoring setup using kube-prometheus
2020-03-04 20:20:16 +01:00
Alok Kumar Singh
7a85d7d8a6 Weave Net name consistencies resolved
https://github.com/coreos/kube-prometheus/pull/425#pullrequestreview-368779890
2020-03-04 21:41:02 +05:30
Lili Cosic
23a6adea16 Merge pull request #437 from dgrisonnet/update-customizing-guildelines
Update README customizing guidelines with new release version
2020-03-04 13:50:24 +01:00
Damien Grisonnet
b5ba409b9a doc: update release version in customizing section
Fixes #435

Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2020-03-04 13:14:04 +01:00
Alok Kumar Singh
c942d6b837 Example with option to modify alert thresholds
Review comment: https://github.com/coreos/kube-prometheus/pull/425#discussion_r387494885
2020-03-04 16:18:37 +05:30
Frederic Branczyk
9323c4c98f Merge pull request #436 from lilic/remove-checks-ksm
jsonnet/kube-prometheus/kube-state-metrics: Remove probes
2020-03-04 11:22:46 +01:00
Lili Cosic
5469bea0a6 manifests: Regenerate kube-state-metrics deployment 2020-03-04 11:02:32 +01:00
Lili Cosic
15185bf4c0 jsonnet/kube-prometheus/kube-state-metrics: Remove probes 2020-03-04 11:02:32 +01:00
Alok Kumar Singh
29d4648af9 Added weavenet monitoring setup using kube-prometheus 2020-03-04 06:32:43 +05:30
Frederic Branczyk
b6ad6644d5 Merge pull request #428 from pgier/prometheus-operator-v0.37.0
update prometheus-operator to v0.37.0
2020-03-03 19:44:45 +01:00
Paul Gier
d24cf329d2 update prometheus-operator to v0.37.0 2020-03-03 11:55:51 -06:00
Lili Cosic
e4a8abe17f Merge pull request #434 from lilic/bump-1.9.5
Bump kube-state-metrics to 1.9.5
2020-03-03 16:49:00 +01:00
Frederic Branczyk
dd00a80be4 Merge pull request #432 from skyscrapers/fsAlert
Adjust the threshold of the NodeFilesystemSpaceFillingUp alert from the node-exporter mixin
2020-03-03 16:40:36 +01:00
Lili Cosic
90daccf6c7 manifests: Generate files 2020-03-03 16:35:10 +01:00
Lili Cosic
f66f94ac79 jsonnet/kube-prometheus/../kube-state-metrics.libsonnet: Bump to 1.9.5 2020-03-03 16:29:01 +01:00
Lili Cosic
0e8353ba91 jsonnetfile.lock.json: Bump kube-state-metrics to 1.9.5 2020-03-03 16:17:47 +01:00
Lili Cosic
50eee211dd Merge pull request #427 from lilic/fix-ksm
jsonnet/kube-prometheus: Add back kube-rbac-proxy containers to
2020-03-03 16:01:05 +01:00
Paul Gier
60dcc3a86b Merge pull request #429 from russorat/k8s-1.17
Updating to latest k8s version in minikube start
2020-03-03 08:32:16 -06:00
Lili Cosic
298f216847 Makefile: Force jsonnet-bundler to be at v0.2.0 2020-03-03 13:49:37 +01:00
Lili Cosic
2e73de0106 manifests: Regenerate kube-state-metrics files 2020-03-03 13:49:37 +01:00
Lili Cosic
f2540537cb jsonnet/kube-prometheus: Add back kube-rbac-proxy containers to
kube-state-metrics. These were removed by accident when migrating to
using upstream libsonnet.
2020-03-03 13:49:37 +01:00
iuri aranda
5638f48f9d Regenerate
Signed-off-by: iuri aranda <iuri@skyscrapers.eu>
2020-03-03 09:47:55 +01:00
iuri aranda
eaa83c461f Adjust threshold for the SpaceFillingUp alert
Reduce threshold of the node-exporter alert to 15% space available, instead of 20% (default).

As per https://github.com/coreos/kube-prometheus/issues/294

Signed-off-by: iuri aranda <iuri@skyscrapers.eu>
2020-03-03 09:47:03 +01:00
Frederic Branczyk
8e6f5217b4 Merge pull request #430 from pgier/lock-jsonnet-bundler-version
Makefile: lock jsonnet-bundler version
2020-03-03 09:25:20 +01:00
Paul Gier
199d619741 Makefile: lock jsonnet-bundler version
The new version (v0.3.1) of jsonnet bundler causes some changes
to go.mod and jsonnetfile.json.  The build should 'go get' a
specific version instead of the latest to prevent new releases
from breaking existing builds.
2020-03-02 21:11:11 -06:00
Russ Savage
895bf84e87 chore(README): updating to latest k8s version in minikube start 2020-03-02 15:45:45 -08:00
Frederic Branczyk
953c5464f7 Merge pull request #417 from benjaminhuo/alertmanager
Adjust Alertmanager inhibit conditions
2020-02-19 18:56:43 +01:00
Frederic Branczyk
3f3d4e2947 Merge pull request #414 from benjaminhuo/master
Change deprecated BaseImage to Image
2020-02-18 09:41:15 +01:00
Benjamin
1144885da0 regenerate
Signed-off-by: Benjamin <benjamin@yunify.com>
2020-02-17 22:02:07 +08:00
Benjamin
af9c1539e3 Adjust Alertmanager inhibit conditions
Signed-off-by: Benjamin <benjamin@yunify.com>
2020-02-17 21:44:49 +08:00
Benjamin
3531e303dc regenerate
Signed-off-by: Benjamin <benjamin@yunify.com>
2020-02-14 12:14:45 +08:00
Benjamin
c736d1a47b Change deprecated BaseImage to Image
Signed-off-by: Benjamin <benjamin@yunify.com>
2020-02-14 11:57:36 +08:00
Paul Gier
8b0b0bc514 Merge pull request #412 from pgier/prometheus-operator-v0.36.0
Prometheus operator v0.36.0
2020-02-11 11:04:23 -06:00
Paul Gier
6a2cc72573 remove preserveUnknownFields from thanos CRD
This keeps the CRD compatible with kubernetes v1.14 and earlier
2020-02-11 10:13:25 -06:00
Paul Gier
bb0ca63533 upgrade prometheus-operator to v0.36.0 2020-02-11 09:46:06 -06:00
Paul Gier
d8b4d25f9a update jsonnet dependencies 2020-02-11 09:46:06 -06:00
Paul Gier
0ed3f70014 Merge pull request #404 from olegmayko/master
Use kube-state-metrics jsonnet dependency #369
2020-02-11 09:45:27 -06:00
Paul Gier
5cabd5eeda Merge pull request #410 from gjkim42/experimental/custom-metrics-api
experimental/custom-metrics-api: Fix deprecated query for k8s 1.16
2020-02-11 08:57:21 -06:00
Paul Gier
95a853c531 Merge pull request #408 from pgier/optionally-disable-crd-pruning
Remove preserveUnknownField CRD setting
2020-02-11 08:41:42 -06:00
Oleg Mayko
f043bc32d3 Use kube-state-metrics jsonnet dependency #369 2020-02-11 08:12:22 +01:00
Geonju Kim
7f315e2262 experimental/custom-metrics-api: Fix deprecated query for k8s 1.16 2020-02-11 09:10:49 +09:00
Paul Gier
cdaaf3d51c Remove field preserveUnknownFields from CRDs
This allows compatiblity with kubernetes v1.14 and earlier.
2020-02-07 14:40:56 -06:00
Frederic Branczyk
8550ac35bf Merge pull request #406 from pgier/build-improvements
minor build improvements
2020-02-07 08:51:32 +01:00
Frederic Branczyk
9095ed4ccf Merge pull request #407 from pgier/prometheus-operator-v0.35.1
update jsonnet dependencies
2020-02-07 08:51:01 +01:00
Paul Gier
59de4a911b update jsonnet dependencies
Includes prometheus-operator v0.35.1 which should fix the statefulset
crash loop issue #2950
2020-02-06 16:39:31 -06:00
Paul Gier
33c7e23ccd go mod tidy 2020-02-06 16:13:24 -06:00
Paul Gier
92212085c6 Makefile: set bash -o pipefail
Fails if any command in a pipe fails.  Similar to the
prometheus-operator Makefile.
2020-02-06 16:11:13 -06:00
Paul Gier
37c8d369ee generate jsonnet-bundler binary if it's not available
Also locks jsonnet-bundler to version v0.2.0
2020-02-06 16:11:09 -06:00
Paul Gier
5774353d24 Merge pull request #403 from pgier/prometheus-operator-v0.35
Prometheus operator v0.35
2020-02-04 11:23:19 -06:00
Paul Gier
7292f0950a update prometheus-operator to v0.35.0 2020-02-03 14:31:33 -06:00
Paul Gier
e3174aef84 update jsonnet dependencies 2020-02-03 14:21:58 -06:00
Frederic Branczyk
eee5e10e72 Merge pull request #400 from JTarasovic/include-service-in-targetdown
Include service in targetdown
2020-01-30 16:14:25 +01:00
Jason Tarasovic
0b66cd33bd manifests/prometheus-rules.yaml: regenerated file 2020-01-30 07:33:14 -06:00
Jason Tarasovic
27e0a4c9a2 jsonnet/kube-prometheus/alerts: included service in TargetDown message 2020-01-30 07:31:23 -06:00
Sergiusz Urbaniak
519ae8681e Merge pull request #397 from s-urbaniak/up-down
jsonnet: add general rules for up/down targets
2020-01-30 12:06:15 +01:00
Frederic Branczyk
f30cf2e778 Merge pull request #398 from brancz/default-receivers
*: Add default receivers
2020-01-30 10:52:47 +01:00
Frederic Branczyk
fabf273d30 *: Fix jsonnet-bundler files 2020-01-30 10:39:44 +01:00
Frederic Branczyk
3e7d8b391a *: Add default receivers
This patch adds a few out of the box receivers that only need their
notification provider configuration filled in, instead of figuring out
all the wiring for critical alerts for example.
2020-01-30 10:39:41 +01:00
Sergiusz Urbaniak
9b429842e6 manifests: regenerate 2020-01-29 18:23:52 +01:00
Sergiusz Urbaniak
52e46a68a0 jsonnet: add general rules for up/down targets 2020-01-29 18:22:46 +01:00
Frederic Branczyk
1973936fd3 Merge pull request #395 from paulfantom/versions
Update components to latest versions
2020-01-29 10:00:48 +01:00
paulfantom
7bbec26ff3 manifests: regenerate 2020-01-28 23:22:02 +01:00
paulfantom
bd20662d48 jsonnet: update component versions 2020-01-28 23:20:29 +01:00
Frederic Branczyk
748b889a9f Merge pull request #392 from paulfantom/piecharts
Remove piecharts
2020-01-28 08:42:47 +01:00
paulfantom
ecf4a99634 manifests: regenerate 2020-01-28 01:06:05 +01:00
paulfantom
3137c5f607 update jsonnet dependencies 2020-01-28 01:05:20 +01:00
Frederic Branczyk
3277200fc5 Merge pull request #391 from brancz/default-inhibit-rules
*: Add some simple default inhibition rules
2020-01-25 16:55:35 +01:00
Frederic Branczyk
23344a39eb *: Add some simple default inhibition rules 2020-01-24 17:18:18 +01:00
Frederic Branczyk
f2b4528b63 Merge pull request #387 from brancz/reduce-histogram-buckets
*: Throw away unused high cardinality apiserver duration buckets
2020-01-23 15:32:18 +01:00
Frederic Branczyk
a7628e0223 Merge pull request #381 from krasi-georgiev/remove-collectors
remove some unused collectors
2020-01-23 14:50:47 +01:00
Krasi Georgiev
8984606f5d re-added most collectors
Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
2020-01-23 15:17:56 +02:00
Frederic Branczyk
48d95f0b9f *: Throw away unused high cardinality apiserver duration buckets 2020-01-23 13:24:42 +01:00
Frederic Branczyk
e410043b6b Merge pull request #386 from paulfantom/bump_kube-mix
Bump kubernetes-mixins
2020-01-23 12:22:40 +01:00
paulfantom
894069f24d manifests: regenerate 2020-01-23 12:01:21 +01:00
paulfantom
d074ea1427 bump kubernetes-mixins dependency 2020-01-23 12:01:10 +01:00
Frederic Branczyk
269aef6e37 Merge pull request #384 from s-urbaniak/agg
prometheus-adapter: add nodes resource to aggregated-metrics-reader
2020-01-22 09:45:38 +01:00
Sergiusz Urbaniak
90e5982de4 manifests: regenerate 2020-01-21 20:43:47 +01:00
Sergiusz Urbaniak
7165938b39 prometheus-adapter: add nodes resource to aggregated-metrics-reader 2020-01-21 18:36:52 +01:00
Frederic Branczyk
9ebe632d5d Merge pull request #380 from omerlh/prom-all-namespaces
added patch to allow prom to watch all namespaces
2020-01-20 14:16:29 +01:00
Lili Cosic
72ae778bfc Merge pull request #382 from tlereste/update_kube_state_metrics
bump kube-state-metrics to version 1.9.2
2020-01-17 11:17:57 +01:00
Thibault Le Reste
0608c96bf6 bump kube-state-metrics to version 1.9.2 2020-01-15 13:12:35 +01:00
Krasi Georgiev
44f3c61010 remove some unused collectors
Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
2020-01-15 12:03:04 +02:00
omerlh
f517b35a42 added patch to allow prom to watch all namespaces 2020-01-14 17:55:27 +02:00
Frederic Branczyk
54c0fda307 Merge pull request #378 from LiliC/drop-less
jsonnet,manifests: Do not drop not all metrics
2020-01-14 14:55:54 +01:00
Lili Cosic
6a3d667d3e manifests: Regenerate files 2020-01-14 10:34:46 +01:00
Lili Cosic
d9d3139dc8 jsonnet: Drop exact metrics 2020-01-14 10:26:42 +01:00
Frederic Branczyk
67ed0f63c2 Merge pull request #371 from tlereste/update_kube_state_metrics_version
update kube-state-metrics version to 1.9.1
2020-01-10 14:47:42 +01:00
Thibault Le Reste
7788d0d327 update kube-state-metrics version to 1.9.1 2020-01-10 14:23:52 +01:00
Lili Cosic
fca505f2a2 Merge pull request #368 from jfassad/master
jsonnet/kube-prometheus/kube-state-metrics: Add missing clusterRole permissions
2020-01-10 11:47:45 +01:00
João Assad
d40548d3a0 manifests: Regenerate manifests 2020-01-09 15:24:50 -03:00
João Assad
dba42d3477 jsonnet/kube-prometheus/kube-state-metrics: add missing clusterRole permissions 2020-01-09 15:12:59 -03:00
Lili Cosic
ee37661c34 Merge pull request #367 from LiliC/bump-k8s
tests/e2e/travis-e2e.sh: Switch to 1.17 k8s cluster
2020-01-09 13:13:39 +01:00
Lili Cosic
8b36950f0e tests/e2e/travis-e2e.sh: Switch to 1.17 k8s cluster 2020-01-09 13:03:01 +01:00
Frederic Branczyk
932745172d Merge pull request #365 from LiliC/drop-kubelet
Drop correct deprecated metrics and add e2e test to ensure that
2020-01-08 17:39:26 +01:00
Lili Cosic
1af59f3130 tests/e2e: Add e2e test to make sure all deprecated metrics are being
dropped
2020-01-08 12:35:21 +01:00
Lili Cosic
6562b02da8 manifests/*: Regenerate manifests 2020-01-08 12:35:21 +01:00
Lili Cosic
23999e44df jsonnet/kube-prometheus/prometheus: Drop correct deprecated metrics 2020-01-08 12:35:21 +01:00
Frederic Branczyk
69d3357892 Merge pull request #362 from pgier/lock-version-of-prometheus-operator-jsonnet-dependency
lock prometheus-operator jsonnet dependencies to v0.34.0
2020-01-07 08:06:46 +01:00
Frederic Branczyk
3465b0fa0d Merge pull request #346 from omerlh/patch-1
fix coredns monitoring on EKS
2020-01-06 16:19:16 +01:00
Paul Gier
1d1ce4967f lock prometheus-operator jsonnet dependencies to release-0.34 branch
This prevents mismatch between prometheus-operator binary and related
CRD yaml files.
2020-01-06 09:16:42 -06:00
Frederic Branczyk
3a0e6ba91f Merge pull request #360 from omerlh/patch-2
added metric_path to kublet/cadvisor selector
2020-01-06 13:24:23 +01:00
omerlh
81e2d19398 run make 2020-01-06 13:49:57 +02:00
Omer Levi Hevroni
92d4cbae08 added metric_path to kublet/cadvisor selector 2020-01-06 11:52:48 +02:00
Omer Levi Hevroni
2e72a8a832 fix coredns monitoring on EKS 2019-12-23 12:39:21 +02:00
Lili Cosic
9493a1a5f7 Merge pull request #342 from tlereste/update_kube_state_metrics
update kube-state-metrics version to 1.9.0
2019-12-20 16:57:17 +01:00
Thibault LE RESTE
0a48577bb7 update kube-state-metrics version to 1.9.0 2019-12-20 16:21:52 +01:00
Frederic Branczyk
9211c42df0 Merge pull request #336 from LiliC/change-dropped-metrics
jsonnet/kube-prometheus: Adjust dropped deprecated metrics names
2019-12-19 13:05:37 +01:00
Lili Cosic
5cddfd8da7 manifests: Regenerate manifests 2019-12-19 10:10:46 +01:00
Lili Cosic
bd69007c8c jsonnet/kube-prometheus: Adjust dropped deprecated metrics names
The names were not complete in the kubernetes CHANGELOG.
2019-12-19 10:09:34 +01:00
Frederic Branczyk
4f2b9c1ec8 Merge pull request #332 from LiliC/remove-pin-release
jsonnet/kube-prometheus/jsonnetfile.json: Pin prometheus-operator version to master instead
2019-12-18 13:16:03 +01:00
Lili Cosic
0be63d47fc manifests: Regenerate manifests 2019-12-18 11:18:21 +01:00
Lili Cosic
5fe60f37a2 jsonnetfile.lock.json: Update 2019-12-18 11:18:21 +01:00
Lili Cosic
200fee8d7c jsonnet/kube-prometheus/jsonnetfile.json: Pin prometheus-operator
version to master instead
2019-12-18 11:18:21 +01:00
Frederic Branczyk
1b9be6d00b Merge pull request #330 from LiliC/remove-depr-metrics
jsonnet,manifests: Drop all metrics which are deprecated in kubernetes
2019-12-17 16:51:40 +01:00
Lili Cosic
ce68c4b392 manifests/*: Regenerate manifest 2019-12-17 15:13:04 +01:00
Lili Cosic
5e9b883528 jsonnet/kube-prometheus*: Drop deprecated kubernetes metrics
These metrics were deprecated in kubernetes from 1.14 and 1.15 onwards.
2019-12-17 15:13:04 +01:00
Paweł Krupa
69b0ba03f1 Merge pull request #329 from paulfantom/e2e
tests/e2e: reenable checking targets availability
2019-12-16 14:40:43 +01:00
paulfantom
3279f222a0 tests/e2e: reenable checking targets availability 2019-12-16 14:23:43 +01:00
Paweł Krupa
543ccec970 Fix typo in node-exporter DaemonSet (#328)
Fix typo in node-exporter DaemonSet
2019-12-16 12:56:49 +01:00
paulfantom
f17ddfd293 assets: regenerate 2019-12-16 12:53:49 +01:00
paulfantom
3b8530d742 jsonnet/kube-prometheus/node-exporter: fix typo 2019-12-16 12:53:39 +01:00
Frederic Branczyk
44fe363211 Merge pull request #327 from paulfantom/deps
Update dependencies
2019-12-16 12:14:26 +01:00
paulfantom
326453cf47 manifests: regenerate 2019-12-16 11:24:04 +01:00
paulfantom
159a14ef47 update jsonnet dependencies 2019-12-16 11:20:37 +01:00
Frederic Branczyk
d03d57e6bb Merge pull request #326 from paulfantom/ipv6
IPv6 compatibility
2019-12-16 10:34:51 +01:00
Frederic Branczyk
31cb71fcd9 Merge pull request #317 from josqu4red/podmonitor-default-ns
Enable discovery of Podmonitors across namespaces
2019-12-12 16:54:39 +01:00
paulfantom
4474b24a32 manifests: regenerate 2019-12-12 16:26:58 +01:00
paulfantom
339ade5a81 jsonnet/kube-prometheus/node-exporter: wrap pod ip address in square brackets for ipv6 compatibility reasons 2019-12-12 16:14:08 +01:00
Frederic Branczyk
ce7c5fa3b4 Merge pull request #325 from sereinity-forks/master
Make limits/requests resources of kube-state-metrics removable
2019-12-12 16:06:58 +01:00
Sereinity
3f388b797d Make limits/requests resources of kube-state-metrics removable, unify tunning 2019-12-12 15:50:34 +01:00
Frederic Branczyk
20abdf3b72 Merge pull request #323 from simonpasquier/bump-kubernetes-mixin
Bump kubernetes mixin
2019-12-10 17:05:35 +01:00
Simon Pasquier
cd0f3c641e regenerate
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-12-10 16:48:51 +01:00
Simon Pasquier
408fde189b Bump kubernetes-mixin
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-12-10 16:48:28 +01:00
Jonathan Amiez
90cf0ae21c Update generated manifests 2019-12-05 15:12:18 +01:00
Jonathan Amiez
3ba4b5602a Enable PodMonitors discovery across namespaces 2019-12-05 15:09:40 +01:00
Frederic Branczyk
cb0e6e2c89 Merge pull request #309 from benjaminhuo/master
Group alert by namespace instead of job
2019-12-04 08:38:04 +01:00
Benjamin
03f7adcf92 regenerate
Signed-off-by: Benjamin <benjamin@yunify.com>
2019-12-04 10:14:42 +08:00
Benjamin
fd267aebeb Merge remote-tracking branch 'upstream/master' 2019-12-04 10:09:14 +08:00
Benjamin
420425d88e regenerate
Signed-off-by: Benjamin <benjamin@yunify.com>
2019-12-03 23:46:08 +08:00
Benjamin
965bec0ad7 Change Alertmanager group by condition
Signed-off-by: Benjamin <benjamin@yunify.com>
2019-12-03 20:02:47 +08:00
Frederic Branczyk
d22bad8293 Merge pull request #313 from yeya24/update-apiverison
Update apiversion
2019-12-03 11:22:47 +01:00
Frederic Branczyk
8c255e9e6c Merge pull request #310 from paulfantom/node-exporter-scrape-interval
Change node-exporter scrape interval to follow best practices
2019-12-03 10:15:52 +01:00
yeya24
56027ac757 update apiversion
Signed-off-by: yeya24 <yb532204897@gmail.com>
2019-12-01 09:33:11 -05:00
paulfantom
50b06b0d33 manifests: regenerate 2019-11-27 15:11:06 +01:00
paulfantom
6f6fd65a48 jsonnet/kube-prometheus/node-exporter: follow node-exporter best practices and scrape data every 15s 2019-11-27 15:09:04 +01:00
Frederic Branczyk
f48fe057dc Merge pull request #307 from EricHorst/patch-1
Update README.md with apply clarification.
2019-11-21 17:41:53 -08:00
Eric Horst
8487871388 Update README.md with apply clarification.
Update the kubectl apply commands in the customizing section to match those the quickstart section. The customizing section did not account for the recently introduced setup/ subdirectory.
2019-11-17 21:10:32 -08:00
110 changed files with 30371 additions and 15006 deletions

@@ -4,48 +4,14 @@ about: If you have questions about kube-prometheus
labels: kind/support
---
<!--
This repository now has the new GitHub Discussions enabled:
https://github.com/coreos/kube-prometheus/discussions
Feel free to ask questions in #prometheus-operator on Kubernetes Slack!
Please create a new discussion to ask for any kind of support, which is not a Bug or Feature Request.
-->
Thank you for being part of this community!
**What did you do?**
---
**Did you expect to see some different?**
We are still happy to chat with you in the #prometheus-operator channel on Kubernetes Slack!
**Environment**
* Prometheus Operator version:
`Insert image tag or Git SHA here`
<!-- Try kubectl -n monitoring describe deployment prometheus-operator -->
* Kubernetes version information:
`kubectl version`
<!-- Replace the command with its output above -->
* Kubernetes cluster kind:
insert how you created your cluster: kops, bootkube, tectonic-installer, etc.
* Manifests:
```
insert manifests relevant to the issue
```
* Prometheus Operator Logs:
```
Insert Prometheus Operator logs relevant to the issue here
```
* Prometheus Logs:
```
Insert Prometheus logs relevant to the issue here
```
**Anything else we need to know?**:

.github/workflows/ci.yaml

@@ -0,0 +1,54 @@
name: ci
on:
  - push
  - pull_request
env:
  golang-version: '1.15'
  kind-version: 'v0.11.1'
jobs:
  generate:
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        os:
          - macos-latest
          - ubuntu-latest
    name: Generate
    steps:
    - uses: actions/checkout@v2
    - uses: actions/setup-go@v2
      with:
        go-version: ${{ env.golang-version }}
    - run: make --always-make generate && git diff --exit-code
  unit-tests:
    runs-on: ubuntu-latest
    name: Unit tests
    steps:
    - uses: actions/checkout@v2
    - run: make --always-make test
  e2e-tests:
    name: E2E tests
    runs-on: ubuntu-latest
    strategy:
      matrix:
        kind-image:
          - 'kindest/node:v1.19.11'
    steps:
    - uses: actions/checkout@v2
    - name: Start KinD
      uses: engineerd/setup-kind@v0.5.0
      with:
        version: ${{ env.kind-version }}
        image: ${{ matrix.kind-image }}
        wait: 300s
    - name: Wait for cluster to finish bootstraping
      run: kubectl wait --for=condition=Ready pods --all --all-namespaces --timeout=300s
    - name: Create kube-prometheus stack
      run: |
        kubectl create -f manifests/setup
        until kubectl get servicemonitors --all-namespaces ; do date; sleep 1; echo ""; done
        kubectl create -f manifests/
    - name: Run tests
      run: |
        export KUBECONFIG="${HOME}/.kube/config"
        make test-e2e

@@ -1,24 +0,0 @@
sudo: required
dist: xenial
language: go
go:
- "1.12.x"
go_import_path: github.com/coreos/kube-prometheus
cache:
  directories:
  - $GOCACHE
  - $GOPATH/pkg/mod
services:
- docker
jobs:
  include:
  - name: Check generated files
    script: make --always-make generate-in-docker && git diff --exit-code
  - name: Run tests
    script: make --always-make test-in-docker
  - name: Run e2e tests
    script: GO111MODULE=on ./tests/e2e/travis-e2e.sh

@@ -1,60 +1,59 @@
JSONNET_ARGS := -n 2 --max-blank-lines 2 --string-style s --comment-style s
ifneq (,$(shell which jsonnetfmt))
JSONNET_FMT_CMD := jsonnetfmt
else
JSONNET_FMT_CMD := jsonnet
JSONNET_FMT_ARGS := fmt $(JSONNET_ARGS)
endif
JSONNET_FMT := $(JSONNET_FMT_CMD) $(JSONNET_FMT_ARGS)
SHELL=/bin/bash -o pipefail
JB_BINARY := jb
EMBEDMD_BINARY := embedmd
CONTAINER_CMD:=docker run --rm \
-e http_proxy -e https_proxy -e no_proxy \
-u="$(shell id -u):$(shell id -g)" \
-v "$(shell go env GOCACHE):/.cache/go-build" \
-v "$(PWD):/go/src/github.com/coreos/kube-prometheus:Z" \
-w "/go/src/github.com/coreos/kube-prometheus" \
quay.io/coreos/jsonnet-ci
export GO111MODULE=on
BIN_DIR?=$(shell pwd)/tmp/bin
EMBEDMD_BIN=$(BIN_DIR)/embedmd
JB_BIN=$(BIN_DIR)/jb
GOJSONTOYAML_BIN=$(BIN_DIR)/gojsontoyaml
JSONNET_BIN=$(BIN_DIR)/jsonnet
JSONNETFMT_BIN=$(BIN_DIR)/jsonnetfmt
TOOLING=$(EMBEDMD_BIN) $(JB_BIN) $(GOJSONTOYAML_BIN) $(JSONNET_BIN) $(JSONNETFMT_BIN)
JSONNETFMT_ARGS=-n 2 --max-blank-lines 2 --string-style s --comment-style s
all: generate fmt test
.PHONY: generate-in-docker
generate-in-docker:
@echo ">> Compiling assets and generating Kubernetes manifests"
$(CONTAINER_CMD) make $(MFLAGS) generate
.PHONY: clean
clean:
# Remove all files and directories ignored by git.
git clean -Xfd .
.PHONY: generate
generate: manifests **.md
**.md: $(shell find examples) build.sh example.jsonnet
$(EMBEDMD_BINARY) -w `find . -name "*.md" | grep -v vendor`
**.md: $(EMBEDMD_BIN) $(shell find examples) build.sh example.jsonnet
$(EMBEDMD_BIN) -w `find . -name "*.md" | grep -v vendor`
manifests: examples/kustomize.jsonnet vendor build.sh
rm -rf manifests
manifests: examples/kustomize.jsonnet $(GOJSONTOYAML_BIN) vendor build.sh
./build.sh $<
vendor: jsonnetfile.json jsonnetfile.lock.json
vendor: $(JB_BIN) jsonnetfile.json jsonnetfile.lock.json
rm -rf vendor
$(JB_BINARY) install
$(JB_BIN) install
fmt:
.PHONY: update
update: $(JB_BIN)
$(JB_BIN) update
.PHONY: fmt
fmt: $(JSONNETFMT_BIN)
find . -name 'vendor' -prune -o -name '*.libsonnet' -o -name '*.jsonnet' -print | \
xargs -n 1 -- $(JSONNET_FMT) -i
xargs -n 1 -- $(JSONNETFMT_BIN) $(JSONNETFMT_ARGS) -i
test:
$(JB_BINARY) install
.PHONY: test
test: $(JB_BIN)
$(JB_BIN) install
./test.sh
.PHONY: test-e2e
test-e2e:
go test -timeout 55m -v ./tests/e2e -count=1
test-in-docker:
@echo ">> Compiling assets and generating Kubernetes manifests"
$(CONTAINER_CMD) make $(MFLAGS) test
$(BIN_DIR):
mkdir -p $(BIN_DIR)
.PHONY: generate generate-in-docker test test-in-docker fmt
$(TOOLING): $(BIN_DIR)
@echo Installing tools from scripts/tools.go
@cat scripts/tools.go | grep _ | awk -F'"' '{print $$2}' | GOBIN=$(BIN_DIR) xargs -tI % go install %
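
The `$(TOOLING)` recipe above derives its install targets from the blank imports in `scripts/tools.go`. The following is a minimal sketch of that extraction step only. The sample file contents are an assumption (the real `scripts/tools.go` is not shown in this diff), and `list_tools` is a hypothetical helper name. Note that `$$2` in the Makefile becomes `$2` once make hands the line to the shell.

```shell
# Sample stand-in for scripts/tools.go; the import paths are illustrative assumptions.
tools_file="$(mktemp)"
cat > "$tools_file" <<'EOF'
// +build tools

package tools

import (
    _ "github.com/brancz/gojsontoyaml"
    _ "github.com/jsonnet-bundler/jsonnet-bundler/cmd/jb"
)
EOF

# list_tools FILE: print the import path of every blank-import line.
# Mirrors the Makefile pipeline: keep lines containing "_", then take the
# text between the first pair of double quotes.
list_tools() {
  grep _ "$1" | awk -F'"' '{print $2}'
}

list_tools "$tools_file"
```

In the Makefile each printed path is then passed to `go install` with `GOBIN` pointing at `$(BIN_DIR)`, so the binaries land in `tmp/bin` rather than on the global `PATH`.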

@@ -24,6 +24,8 @@ This stack is meant for cluster monitoring, so it is pre-configured to collect m
- [Table of contents](#table-of-contents)
- [Prerequisites](#prerequisites)
- [minikube](#minikube)
- [Compatibility](#compatibility)
- [Kubernetes compatibility matrix](#kubernetes-compatibility-matrix)
- [Quickstart](#quickstart)
- [Access the dashboards](#access-the-dashboards)
- [Customizing Kube-Prometheus](#customizing-kube-prometheus)
@@ -44,7 +46,7 @@ This stack is meant for cluster monitoring, so it is pre-configured to collect m
- [node-exporter DaemonSet namespace](#node-exporter-daemonset-namespace)
- [Alertmanager configuration](#alertmanager-configuration)
- [Adding additional namespaces to monitor](#adding-additional-namespaces-to-monitor)
- [Defining the ServiceMonitor for each addional Namespace](#defining-the-servicemonitor-for-each-addional-namespace)
- [Defining the ServiceMonitor for each additional Namespace](#defining-the-servicemonitor-for-each-additional-namespace)
- [Static etcd configuration](#static-etcd-configuration)
- [Pod Anti-Affinity](#pod-anti-affinity)
- [Customizing Prometheus alerting/recording rules and Grafana dashboards](#customizing-prometheus-alertingrecording-rules-and-grafana-dashboards)
@@ -63,8 +65,8 @@ You will need a Kubernetes cluster, that's it! By default it is assumed, that th
This means the kubelet configuration must contain these flags:
* `--authentication-token-webhook=true` This flag enables, that a `ServiceAccount` token can be used to authenticate against the kubelet(s).
* `--authorization-mode=Webhook` This flag enables, that the kubelet will perform an RBAC request with the API to determine, whether the requesting entity (Prometheus in this case) is allow to access a resource, in specific for this project the `/metrics` endpoint.
* `--authentication-token-webhook=true` This flag enables, that a `ServiceAccount` token can be used to authenticate against the kubelet(s). This can also be enabled by setting the kubelet configuration value `authentication.webhook.enabled` to `true`.
* `--authorization-mode=Webhook` This flag enables, that the kubelet will perform an RBAC request with the API to determine, whether the requesting entity (Prometheus in this case) is allow to access a resource, in specific for this project the `/metrics` endpoint. This can also be enabled by setting the kubelet configuration value `authorization.mode` to `Webhook`.
This stack provides [resource metrics](https://github.com/kubernetes/metrics#resource-metrics-api) by deploying the [Prometheus Adapter](https://github.com/DirectXMan12/k8s-prometheus-adapter/).
This adapter is an Extension API Server and Kubernetes needs to be have this feature enabled, otherwise the adapter has no effect, but is still deployed.
@@ -74,7 +76,7 @@ This adapter is an Extension API Server and Kubernetes needs to be have this fea
To try out this stack, start [minikube](https://github.com/kubernetes/minikube) with the following command:
```shell
$ minikube delete && minikube start --kubernetes-version=v1.16.0 --memory=6g --bootstrapper=kubeadm --extra-config=kubelet.authentication-token-webhook=true --extra-config=kubelet.authorization-mode=Webhook --extra-config=scheduler.address=0.0.0.0 --extra-config=controller-manager.address=0.0.0.0
$ minikube delete && minikube start --kubernetes-version=v1.18.1 --memory=6g --bootstrapper=kubeadm --extra-config=kubelet.authentication-token-webhook=true --extra-config=kubelet.authorization-mode=Webhook --extra-config=scheduler.address=0.0.0.0 --extra-config=controller-manager.address=0.0.0.0
```
The kube-prometheus stack includes a resource metrics API server, so the metrics-server addon is not necessary. Ensure the metrics-server addon is disabled on minikube:
@@ -83,9 +85,25 @@ The kube-prometheus stack includes a resource metrics API server, so the metrics
$ minikube addons disable metrics-server
```
## Compatibility
### Kubernetes compatibility matrix
The following versions are supported and work as we test against these versions in their respective branches. But note that other versions might work!
| kube-prometheus stack | Kubernetes 1.14 | Kubernetes 1.15 | Kubernetes 1.16 | Kubernetes 1.17 | Kubernetes 1.18 | Kubernetes 1.19 |
|-----------------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|
| `release-0.3` | ✔ | ✔ | ✔ | ✔ | ✗ | ✗ |
| `release-0.4` | ✗ | ✗ | ✔ (v1.16.5+) | ✔ | ✗ | ✗ |
| `release-0.5` | ✗ | ✗ | ✗ | ✗ | ✔ | ✗ |
| `release-0.6` | ✗ | ✗ | ✗ | ✗ | ✔ | ✔ |
| `HEAD` | ✗ | ✗ | ✗ | ✗ | ✔ | ✗ |
Note: Due to [two](https://github.com/kubernetes/kubernetes/issues/83778) [bugs](https://github.com/kubernetes/kubernetes/issues/86359) in Kubernetes v1.16.1, and prior to Kubernetes v1.16.5 the kube-prometheus release-0.4 branch only supports v1.16.5 and higher. The `extension-apiserver-authentication-reader` role in the kube-system namespace can be manually edited to include list and watch permissions in order to workaround the second issue with Kubernetes v1.16.2 through v1.16.4.
## Quickstart
>Note: For versions before Kubernetes v1.14.0 use the release-0.1 branch instead of master.
>Note: For versions before Kubernetes v1.18.z refer to the [Kubernetes compatibility matrix](#kubernetes-compatibility-matrix) in order to choose a compatible branch.
This project is intended to be used as a library (i.e. the intent is not for you to create your own modified copy of this repository).
@@ -100,7 +118,7 @@ kubectl create -f manifests/
```
We create the namespace and CustomResourceDefinitions first to avoid race conditions when deploying the monitoring components.
Alternatively, the resources in both folders can be applied with a single command
Alternatively, the resources in both folders can be applied with a single command
`kubectl create -f manifests/setup -f manifests`, but it may be necessary to run the command multiple times for all components to
be created successfullly.
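
The ordering concern described above (CustomResourceDefinitions must be registered before the resources that use them) can be handled with a small retry helper instead of re-running the command by hand. This is a sketch under stated assumptions: `retry` is a hypothetical helper, not part of this repository, and the attempt count and delay are arbitrary.

```shell
# retry ATTEMPTS DELAY CMD...: run CMD until it succeeds, at most ATTEMPTS times,
# sleeping DELAY seconds after each failure. Returns 1 if every attempt fails.
retry() {
  local attempts="$1" delay="$2" i=0
  shift 2
  while [ "$i" -lt "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Possible usage against a cluster (left commented out so the sketch has no side effects):
#   kubectl create -f manifests/setup
#   retry 60 1 kubectl get servicemonitors --all-namespaces
#   kubectl create -f manifests/
```

Polling `kubectl get servicemonitors` is the same readiness signal the CI workflow in this change set uses; the helper only adds an upper bound so a broken cluster does not loop forever.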
@@ -154,12 +172,12 @@ Install this library in your own project with [jsonnet-bundler](https://github.c
$ mkdir my-kube-prometheus; cd my-kube-prometheus
$ jb init # Creates the initial/empty `jsonnetfile.json`
# Install the kube-prometheus dependency
$ jb install github.com/coreos/kube-prometheus/jsonnet/kube-prometheus@release-0.1 # Creates `vendor/` & `jsonnetfile.lock.json`, and fills in `jsonnetfile.json`
$ jb install github.com/coreos/kube-prometheus/jsonnet/kube-prometheus@release-0.4 # Creates `vendor/` & `jsonnetfile.lock.json`, and fills in `jsonnetfile.json`
```
> `jb` can be installed with `go get github.com/jsonnet-bundler/jsonnet-bundler/cmd/jb`
> An e.g. of how to install a given version of this library: `jb install github.com/coreos/kube-prometheus/jsonnet/kube-prometheus@release-0.1`
> An e.g. of how to install a given version of this library: `jb install github.com/coreos/kube-prometheus/jsonnet/kube-prometheus@release-0.4`
In order to update the kube-prometheus dependency, simply use the jsonnet-bundler update functionality:
```shell
@@ -174,6 +192,8 @@ e.g. of how to compile the manifests: `./build.sh example.jsonnet`
Here's [example.jsonnet](example.jsonnet):
> Note: some of the following components must be configured beforehand. See [configuration](#configuration) and [customization-examples](#customization-examples).
[embedmd]:# (example.jsonnet)
```jsonnet
local kp =
@@ -184,6 +204,7 @@ local kp =
// (import 'kube-prometheus/kube-prometheus-node-ports.libsonnet') +
// (import 'kube-prometheus/kube-prometheus-static-etcd.libsonnet') +
// (import 'kube-prometheus/kube-prometheus-thanos-sidecar.libsonnet') +
// (import 'kube-prometheus/kube-prometheus-custom-metrics.libsonnet') +
{
_config+:: {
namespace: 'monitoring',
@@ -218,12 +239,19 @@ set -x
# only exit with zero if all commands of the pipeline exit successfully
set -o pipefail
# Make sure to use project tooling
PATH="$(pwd)/tmp/bin:${PATH}"
# Make sure to start with a clean 'manifests' dir
rm -rf manifests
mkdir -p manifests/setup
# optional, but we would like to generate yaml, not json
jsonnet -J vendor -m manifests "${1-example.jsonnet}" | xargs -I{} sh -c 'cat {} | gojsontoyaml > {}.yaml; rm -f {}' -- {}
# Calling gojsontoyaml is optional, but we would like to generate yaml, not json
jsonnet -J vendor -m manifests "${1-example.jsonnet}" | xargs -I{} sh -c 'cat {} | gojsontoyaml > {}.yaml' -- {}
# Make sure to remove json files
find manifests -type f ! -name '*.yaml' -delete
rm -f kustomization
```
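
The revised script converts every generated file to YAML first and only then removes whatever is not a `.yaml` file, rather than deleting each JSON file inline. The cleanup pass can be tried on a throwaway directory. This is a sketch: the file names are hypothetical stand-ins for the output of `jsonnet -m`, and no real conversion is performed.

```shell
# Build a scratch tree that looks like the state after the conversion step:
# each extension-less JSON file has a .yaml sibling next to it.
workdir="$(mktemp -d)"
mkdir -p "$workdir/manifests/setup"
for f in grafana-service setup/namespace; do
  printf '{}\n' > "$workdir/manifests/$f"        # pretend JSON emitted by jsonnet
  printf '{}\n' > "$workdir/manifests/$f.yaml"   # pretend output of gojsontoyaml
done

# Same cleanup as in build.sh, pointed at the scratch tree.
find "$workdir/manifests" -type f ! -name '*.yaml' -delete

# Only the .yaml files should remain.
find "$workdir/manifests" -type f | sort
```

One pass with `find` also catches files in subdirectories such as `manifests/setup`, which matters now that the setup manifests live in their own folder.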
@@ -236,8 +264,13 @@ The previous steps (compilation) has created a bunch of manifest files in the ma
Now simply use `kubectl` to install Prometheus and Grafana as per your configuration:
```shell
# Update the namespace and CRDs, and then wait for them to be availble before creating the remaining resources
$ kubectl apply -f manifests/setup
$ kubectl apply -f manifests/
```
Alternatively, the resources in both folders can be applied with a single command
`kubectl apply -Rf manifests`, but it may be necessary to run the command multiple times for all components to
be created successfullly.
Check the monitoring namespace (or the namespace you have specific in `namespace: `) and make sure the pods are running. Prometheus and Grafana should be up and running soon.
@@ -271,7 +304,7 @@ Once updated, just follow the instructions under "Compiling" and "Apply the kube
## Configuration
Jsonnet has the concept of hidden fields. These are fields, that are not going to be rendered in a result. This is used to configure the kube-prometheus components in jsonnet. In the example jsonnet code of the above [Usage section](#Usage), you can see an example of this, where the `namespace` is being configured to be `monitoring`. In order to not override the whole object, use the `+::` construct of jsonnet, to merge objects, this way you can override individual settings, but retain all other settings and defaults.
Jsonnet has the concept of hidden fields. These are fields, that are not going to be rendered in a result. This is used to configure the kube-prometheus components in jsonnet. In the example jsonnet code of the above [Customizing Kube-Prometheus section](#customizing-kube-prometheus), you can see an example of this, where the `namespace` is being configured to be `monitoring`. In order to not override the whole object, use the `+::` construct of jsonnet, to merge objects, this way you can override individual settings, but retain all other settings and defaults.
These are the available fields with their respective default values:
```
@@ -561,11 +594,11 @@ local kp = (import 'kube-prometheus/kube-prometheus.libsonnet') + {
{ ['grafana-' + name]: kp.grafana[name] for name in std.objectFields(kp.grafana) }
```
#### Defining the ServiceMonitor for each addional Namespace
#### Defining the ServiceMonitor for each additional Namespace
In order for Prometheus to be able to discover and scrape services inside the additional namespaces specified in the previous step, you need to define a ServiceMonitor resource.
> Typically it is up to the users of a namespace to provision the ServiceMonitor resource, but in case you want to generate it with the same tooling as the rest of the cluster monitoring infrastructure, this is a guide on how to achieve this.
> Typically it is up to the users of a namespace to provision the ServiceMonitor resource, but in case you want to generate it with the same tooling as the rest of the cluster monitoring infrastructure, this is a guide on how to achieve this.
You can define ServiceMonitor resources in your `jsonnet` spec. See the snippet below:
@@ -651,9 +684,10 @@ Should the Prometheus `/targets` page show kubelet targets, but not able to succ
As described in the [Prerequisites](#prerequisites) section, in order to retrieve metrics from the kubelet, token authentication and authorization must be enabled. Some Kubernetes setup tools do not enable this by default.
If you are using Google's GKE product, see [cAdvisor support](docs/GKE-cadvisor-support.md).
- If you are using Google's GKE product, see [cAdvisor support](docs/GKE-cadvisor-support.md).
- If you are using AWS EKS, see [AWS EKS CNI support](docs/EKS-cni-support.md).
- If you are using Weave Net, see [Weave Net support](docs/weave-net-support.md).
If you are using AWS EKS, see [AWS EKS CNI support](docs/EKS-cni-support.md)
#### Authentication problem
The Prometheus `/targets` page will show the kubelet job with the error `403 Unauthorized` when token authentication is not enabled. Ensure that the `--authentication-token-webhook=true` flag is enabled on all kubelet configurations.
@@ -692,5 +726,5 @@ the following process:
2. Commit your changes (This is currently necessary due to our vendoring
process. This is likely to change in the future).
3. Update the pinned kube-prometheus dependency in `jsonnetfile.lock.json`: `jb update`
3. Generate dependent `*.yaml` files: `make generate-in-docker`
3. Generate dependent `*.yaml` files: `make generate`
4. Commit the generated changes.


@@ -7,10 +7,17 @@ set -x
# only exit with zero if all commands of the pipeline exit successfully
set -o pipefail
# Make sure to use project tooling
PATH="$(pwd)/tmp/bin:${PATH}"
# Make sure to start with a clean 'manifests' dir
rm -rf manifests
mkdir -p manifests/setup
# optional, but we would like to generate yaml, not json
jsonnet -J vendor -m manifests "${1-example.jsonnet}" | xargs -I{} sh -c 'cat {} | gojsontoyaml > {}.yaml; rm -f {}' -- {}
# Calling gojsontoyaml is optional, but we would like to generate yaml, not json
jsonnet -J vendor -m manifests "${1-example.jsonnet}" | xargs -I{} sh -c 'cat {} | gojsontoyaml > {}.yaml' -- {}
# Make sure to remove json files
find manifests -type f ! -name '*.yaml' -delete
rm -f kustomization
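The cleanup step can be tried in isolation on a throwaway directory. This is a minimal sketch assuming the same `find` expression as the script; the `remove_non_yaml` helper and the file names are hypothetical and exist only for this illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not part of the build script): delete every regular
# file under the given directory whose name does not end in .yaml.
remove_non_yaml() {
  find "$1" -type f ! -name '*.yaml' -delete
}

# Build a throwaway tree that mimics the layout of the jsonnet output:
# each manifest exists once as raw JSON (no suffix) and once as .yaml.
demo_dir="$(mktemp -d)"
mkdir -p "${demo_dir}/manifests/setup"
touch "${demo_dir}/manifests/grafana-service" \
      "${demo_dir}/manifests/grafana-service.yaml" \
      "${demo_dir}/manifests/setup/0namespace-namespace" \
      "${demo_dir}/manifests/setup/0namespace-namespace.yaml"

remove_non_yaml "${demo_dir}/manifests"

# List what is left, relative to the demo directory.
remaining="$(cd "${demo_dir}" && find manifests -type f | sort)"
echo "${remaining}"
```

Because the match is on the `.yaml` suffix rather than on the names printed by `jsonnet -m`, the same expression also reaches files in `manifests/setup`.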


@@ -7,8 +7,8 @@ One fatal issue that can occur is that you run out of IP addresses in your eks c
You can monitor the `awscni` using kube-prometheus with:
[embedmd]:# (../examples/eks-cni-example.jsonnet)
```jsonnet
local kp = (import 'kube-prometheus/kube-prometheus.libsonnet') +
(import 'kube-prometheus/kube-prometheus-eks.libsonnet') + {
local kp = (import 'kube-prometheus/kube-prometheus.libsonnet') +
(import 'kube-prometheus/kube-prometheus-eks.libsonnet') + {
_config+:: {
namespace: 'monitoring',
},
@@ -32,7 +32,7 @@ local kp = (import 'kube-prometheus/kube-prometheus.libsonnet') +
{ ['node-exporter-' + name]: kp.nodeExporter[name] for name in std.objectFields(kp.nodeExporter) } +
{ ['kube-state-metrics-' + name]: kp.kubeStateMetrics[name] for name in std.objectFields(kp.kubeStateMetrics) } +
{ ['prometheus-' + name]: kp.prometheus[name] for name in std.objectFields(kp.prometheus) } +
{ ['prometheus-adapter-' + name]: kp.prometheusAdapter[name] for name in std.objectFields(kp.prometheusAdapter) }
{ ['prometheus-adapter-' + name]: kp.prometheusAdapter[name] for name in std.objectFields(kp.prometheusAdapter) }
```
After you have the required yaml file please run


@@ -18,6 +18,7 @@ local kp =
// (import 'kube-prometheus/kube-prometheus-node-ports.libsonnet') +
// (import 'kube-prometheus/kube-prometheus-static-etcd.libsonnet') +
// (import 'kube-prometheus/kube-prometheus-thanos-sidecar.libsonnet') +
// (import 'kube-prometheus/kube-prometheus-custom-metrics.libsonnet') +
{
_config+:: {
namespace: 'monitoring',
@@ -163,7 +164,7 @@ local kp = (import 'kube-prometheus/kube-prometheus.libsonnet') + {
Along with adding additional rules, we give the user the option to filter or adjust the existing rules imported by `kube-prometheus/kube-prometheus.libsonnet`. The recording rules can be found in [kube-prometheus/rules](../jsonnet/kube-prometheus/rules) and [kubernetes-mixin/rules](https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/rules) while the alerting rules can be found in [kube-prometheus/alerts](../jsonnet/kube-prometheus/alerts) and [kubernetes-mixin/alerts](https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/alerts).
Knowing which rules to change, the user can now use functions from the [Jsonnet standard library](https://jsonnet.org/ref/stdlib.html) to make these changes. Below are examples of both a filter and an adjustment being made to the default rules. These changes can be assigned to a local variable and then added to the `local kp` object as seen in the examples above.
Knowing which rules to change, the user can now use functions from the [Jsonnet standard library](https://jsonnet.org/ref/stdlib.html) to make these changes. Below are examples of both a filter and an adjustment being made to the default rules. These changes can be assigned to a local variable and then added to the `local kp` object as seen in the examples above.
#### Filter
Here the alert `KubeStatefulSetReplicasMismatch` is being filtered out of the group `kubernetes-apps`. The default rule can be seen [here](https://github.com/kubernetes-monitoring/kubernetes-mixin/blob/master/alerts/apps_alerts.libsonnet).
@@ -213,7 +214,7 @@ local update = {
},
};
```
Using the example from above about adding in pre-rendered rules, the new local vaiables can be added in as follows:
Using the example from above about adding in pre-rendered rules, the new local variables can be added in as follows:
```jsonnet
local kp = (import 'kube-prometheus/kube-prometheus.libsonnet') + filter + update + {
prometheusAlerts+:: (import 'existingrule.json'),
@@ -227,7 +228,7 @@ local kp = (import 'kube-prometheus/kube-prometheus.libsonnet') + filter + updat
{ ['prometheus-' + name]: kp.prometheus[name] for name in std.objectFields(kp.prometheus) } +
{ ['prometheus-adapter-' + name]: kp.prometheusAdapter[name] for name in std.objectFields(kp.prometheusAdapter) } +
{ ['grafana-' + name]: kp.grafana[name] for name in std.objectFields(kp.grafana) }
```
```
## Dashboards
Dashboards can either be added using jsonnet or simply a pre-rendered json dashboard.
@@ -251,30 +252,32 @@ local kp = (import 'kube-prometheus/kube-prometheus.libsonnet') + {
_config+:: {
namespace: 'monitoring',
},
grafanaDashboards+:: {
'my-dashboard.json':
dashboard.new('My Dashboard')
.addTemplate(
{
current: {
text: 'Prometheus',
value: 'Prometheus',
grafana+:: {
dashboards+:: {
'my-dashboard.json':
dashboard.new('My Dashboard')
.addTemplate(
{
current: {
text: 'Prometheus',
value: 'Prometheus',
},
hide: 0,
label: null,
name: 'datasource',
options: [],
query: 'prometheus',
refresh: 1,
regex: '',
type: 'datasource',
},
hide: 0,
label: null,
name: 'datasource',
options: [],
query: 'prometheus',
refresh: 1,
regex: '',
type: 'datasource',
},
)
.addRow(
row.new()
.addPanel(graphPanel.new('My Panel', span=6, datasource='$datasource')
.addTarget(prometheus.target('vector(1)')))
),
)
.addRow(
row.new()
.addPanel(graphPanel.new('My Panel', span=6, datasource='$datasource')
.addTarget(prometheus.target('vector(1)')))
),
},
},
};
@@ -297,9 +300,14 @@ local kp = (import 'kube-prometheus/kube-prometheus.libsonnet') + {
_config+:: {
namespace: 'monitoring',
},
grafanaDashboards+:: {
grafanaDashboards+:: { // monitoring-mixin compatibility
'my-dashboard.json': (import 'example-grafana-dashboard.json'),
},
grafana+:: {
dashboards+:: { // use this method to import your dashboards to Grafana
'my-dashboard.json': (import 'example-grafana-dashboard.json'),
},
},
};
{ ['00namespace-' + name]: kp.kubePrometheus[name] for name in std.objectFields(kp.kubePrometheus) } +
@@ -311,15 +319,17 @@ local kp = (import 'kube-prometheus/kube-prometheus.libsonnet') + {
{ ['grafana-' + name]: kp.grafana[name] for name in std.objectFields(kp.grafana) }
```
Incase you have lots of json dashboard exported out from grafan UI the above approch is going to take lots of time. to improve performance we can use `rawGrafanaDashboards` field and provide it's value as json string by using importstr
In case you have many JSON dashboards exported from the Grafana UI, the above approach is going to take a lot of time. To improve performance, use the `rawDashboards` field and provide its value as a JSON string by using `importstr`:
[embedmd]:# (../examples/grafana-additional-rendered-dashboard-example-2.jsonnet)
```jsonnet
local kp = (import 'kube-prometheus/kube-prometheus.libsonnet') + {
_config+:: {
namespace: 'monitoring',
},
rawGrafanaDashboards+:: {
'my-dashboard.json': (importstr 'example-grafana-dashboard.json'),
grafana+:: {
rawDashboards+:: {
'my-dashboard.json': (importstr 'example-grafana-dashboard.json'),
},
},
};


@@ -5,14 +5,14 @@ This guide will help you monitor applications in other Namespaces. By default th
You have to give the list of the Namespaces that you want to be able to monitor.
This is done in the variable `prometheus.roleSpecificNamespaces`. You usually set this in your `.jsonnet` file when building the manifests.
Example to create the needed `Role` and `Rolebindig` for the Namespace `foo` :
Example to create the needed `Role` and `RoleBinding` for the Namespace `foo` :
```
local kp = (import 'kube-prometheus/kube-prometheus.libsonnet') + {
_config+:: {
namespace: 'monitoring',
prometheus+:: {
namespaces: ["default", "kube-system","foo"],
namespaces: ["default", "kube-system", "foo"],
},
},
};

69
docs/weave-net-support.md Normal file

@@ -0,0 +1,69 @@
# Setup Weave Net monitoring using kube-prometheus
[Weave Net](https://kubernetes.io/docs/concepts/cluster-administration/networking/#weave-net-from-weaveworks) is a resilient and simple to use CNI provider for Kubernetes. A well monitored and observed CNI provider helps in troubleshooting Kubernetes networking problems. [Weave Net](https://www.weave.works/docs/net/latest/concepts/how-it-works/) emits [prometheus metrics](https://www.weave.works/docs/net/latest/tasks/manage/metrics/) for monitoring Weave Net. There are many ways to install Weave Net in your cluster. One of them is using [kops](https://github.com/kubernetes/kops/blob/master/docs/networking.md).
Following this document, you can set up Weave Net monitoring for your cluster using kube-prometheus.
## Contents
Using kube-prometheus and kubectl you will be able to install the following for monitoring Weave Net in your cluster:
1. [Service for Weave Net](https://gist.github.com/alok87/379c6234b582f555c141f6fddea9fbce) The service which the [service monitor](https://coreos.com/operators/prometheus/docs/latest/user-guides/cluster-monitoring.html) scrapes.
2. [ServiceMonitor for Weave Net](https://gist.github.com/alok87/e46a7f9a79ef6d1da6964a035be2cfb9) Service monitor to scrape the Weave Net metrics and bring them to Prometheus.
3. [Prometheus Alerts for Weave Net](https://stackoverflow.com/a/60447864) This will set up all the important Weave Net metrics you should be alerted on.
4. [Grafana Dashboard for Weave Net](https://grafana.com/grafana/dashboards/11789) This will set up per-pod monitoring for Weave Net.
5. [Grafana Dashboard for Weave Net (Cluster)](https://grafana.com/grafana/dashboards/11804) This will set up cluster-level monitoring for Weave Net.
## Instructions
- You can monitor Weave Net using an example like the one below. **Please note that some alert configurations are environment specific and may require modifications of alert thresholds**. For example, the FastDP flows have never gone below 15000 for us, but if this value is, say, 20000 for you, then you can use an example like the one below to update the alert. The alerts which may require threshold modifications are `WeaveNetFastDPFlowsLow` and `WeaveNetIPAMUnreachable`.
[embedmd]:# (../examples/weave-net-example.jsonnet)
```jsonnet
local kp = (import 'kube-prometheus/kube-prometheus.libsonnet') +
(import 'kube-prometheus/kube-prometheus-weave-net.libsonnet') + {
_config+:: {
namespace: 'monitoring',
},
prometheusAlerts+:: {
groups: std.map(
function(group)
if group.name == 'weave-net' then
group {
rules: std.map(
function(rule)
if rule.alert == 'WeaveNetFastDPFlowsLow' then
rule {
expr: 'sum(weave_flows) < 20000',
}
else if rule.alert == 'WeaveNetIPAMUnreachable' then
rule {
expr: 'weave_ipam_unreachable_percentage > 25',
}
else
rule
,
group.rules
),
}
else
group,
super.groups
),
},
};
{ ['00namespace-' + name]: kp.kubePrometheus[name] for name in std.objectFields(kp.kubePrometheus) } +
{ ['0prometheus-operator-' + name]: kp.prometheusOperator[name] for name in std.objectFields(kp.prometheusOperator) } +
{ ['node-exporter-' + name]: kp.nodeExporter[name] for name in std.objectFields(kp.nodeExporter) } +
{ ['kube-state-metrics-' + name]: kp.kubeStateMetrics[name] for name in std.objectFields(kp.kubeStateMetrics) } +
{ ['prometheus-' + name]: kp.prometheus[name] for name in std.objectFields(kp.prometheus) } +
{ ['prometheus-adapter-' + name]: kp.prometheusAdapter[name] for name in std.objectFields(kp.prometheusAdapter) } +
{ ['grafana-' + name]: kp.grafana[name] for name in std.objectFields(kp.grafana) }
```
- After you have the required yaml files, please run
```
kubectl create -f prometheus-serviceWeaveNet.yaml
kubectl create -f prometheus-serviceMonitorWeaveNet.yaml
kubectl apply -f prometheus-rules.yaml
kubectl apply -f grafana-dashboardDefinitions.yaml
kubectl apply -f grafana-deployment.yaml
```


@@ -6,6 +6,7 @@ local kp =
// (import 'kube-prometheus/kube-prometheus-node-ports.libsonnet') +
// (import 'kube-prometheus/kube-prometheus-static-etcd.libsonnet') +
// (import 'kube-prometheus/kube-prometheus-thanos-sidecar.libsonnet') +
// (import 'kube-prometheus/kube-prometheus-custom-metrics.libsonnet') +
{
_config+:: {
namespace: 'monitoring',


@@ -1,5 +1,5 @@
local kp = (import 'kube-prometheus/kube-prometheus.libsonnet') +
(import 'kube-prometheus/kube-prometheus-eks.libsonnet') + {
local kp = (import 'kube-prometheus/kube-prometheus.libsonnet') +
(import 'kube-prometheus/kube-prometheus-eks.libsonnet') + {
_config+:: {
namespace: 'monitoring',
},
@@ -23,4 +23,4 @@ local kp = (import 'kube-prometheus/kube-prometheus.libsonnet') +
{ ['node-exporter-' + name]: kp.nodeExporter[name] for name in std.objectFields(kp.nodeExporter) } +
{ ['kube-state-metrics-' + name]: kp.kubeStateMetrics[name] for name in std.objectFields(kp.kubeStateMetrics) } +
{ ['prometheus-' + name]: kp.prometheus[name] for name in std.objectFields(kp.prometheus) } +
{ ['prometheus-adapter-' + name]: kp.prometheusAdapter[name] for name in std.objectFields(kp.prometheusAdapter) }
{ ['prometheus-adapter-' + name]: kp.prometheusAdapter[name] for name in std.objectFields(kp.prometheusAdapter) }


@@ -14,12 +14,16 @@ spec:
port: 8080
targetPort: web
---
apiVersion: extensions/v1beta1
apiVersion: apps/v1
kind: Deployment
metadata:
name: example-app
namespace: default
spec:
selector:
matchLabels:
app: example-app
version: 1.1.3
replicas: 4
template:
metadata:


@@ -9,30 +9,32 @@ local kp = (import 'kube-prometheus/kube-prometheus.libsonnet') + {
_config+:: {
namespace: 'monitoring',
},
grafanaDashboards+:: {
'my-dashboard.json':
dashboard.new('My Dashboard')
.addTemplate(
{
current: {
text: 'Prometheus',
value: 'Prometheus',
grafana+:: {
dashboards+:: {
'my-dashboard.json':
dashboard.new('My Dashboard')
.addTemplate(
{
current: {
text: 'Prometheus',
value: 'Prometheus',
},
hide: 0,
label: null,
name: 'datasource',
options: [],
query: 'prometheus',
refresh: 1,
regex: '',
type: 'datasource',
},
hide: 0,
label: null,
name: 'datasource',
options: [],
query: 'prometheus',
refresh: 1,
regex: '',
type: 'datasource',
},
)
.addRow(
row.new()
.addPanel(graphPanel.new('My Panel', span=6, datasource='$datasource')
.addTarget(prometheus.target('vector(1)')))
),
)
.addRow(
row.new()
.addPanel(graphPanel.new('My Panel', span=6, datasource='$datasource')
.addTarget(prometheus.target('vector(1)')))
),
},
},
};


@@ -2,8 +2,10 @@ local kp = (import 'kube-prometheus/kube-prometheus.libsonnet') + {
_config+:: {
namespace: 'monitoring',
},
rawGrafanaDashboards+:: {
'my-dashboard.json': (importstr 'example-grafana-dashboard.json'),
grafana+:: {
rawDashboards+:: {
'my-dashboard.json': (importstr 'example-grafana-dashboard.json'),
},
},
};


@@ -2,9 +2,14 @@ local kp = (import 'kube-prometheus/kube-prometheus.libsonnet') + {
_config+:: {
namespace: 'monitoring',
},
grafanaDashboards+:: {
grafanaDashboards+:: { // monitoring-mixin compatibility
'my-dashboard.json': (import 'example-grafana-dashboard.json'),
},
grafana+:: {
dashboards+:: { // use this method to import your dashboards to Grafana
'my-dashboard.json': (import 'example-grafana-dashboard.json'),
},
},
};
{ ['00namespace-' + name]: kp.kubePrometheus[name] for name in std.objectFields(kp.kubePrometheus) } +


@@ -7,7 +7,12 @@ local pvc = k.core.v1.persistentVolumeClaim; // https://kubernetes.io/docs/refe
local kp =
(import 'kube-prometheus/kube-prometheus.libsonnet') +
(import 'kube-prometheus/kube-prometheus-bootkube.libsonnet') +
// Uncomment the following imports to enable its patches
// (import 'kube-prometheus/kube-prometheus-anti-affinity.libsonnet') +
// (import 'kube-prometheus/kube-prometheus-managed-cluster.libsonnet') +
// (import 'kube-prometheus/kube-prometheus-node-ports.libsonnet') +
// (import 'kube-prometheus/kube-prometheus-static-etcd.libsonnet') +
// (import 'kube-prometheus/kube-prometheus-thanos-sidecar.libsonnet') +
{
_config+:: {
namespace: 'monitoring',
@@ -50,9 +55,16 @@ local kp =
};
{ ['0prometheus-operator-' + name]: kp.prometheusOperator[name] for name in std.objectFields(kp.prometheusOperator) } +
{ ['setup/0namespace-' + name]: kp.kubePrometheus[name] for name in std.objectFields(kp.kubePrometheus) } +
{
['setup/prometheus-operator-' + name]: kp.prometheusOperator[name]
for name in std.filter((function(name) name != 'serviceMonitor'), std.objectFields(kp.prometheusOperator))
} +
// serviceMonitor is separated so that it can be created after the CRDs are ready
{ 'prometheus-operator-serviceMonitor': kp.prometheusOperator.serviceMonitor } +
{ ['node-exporter-' + name]: kp.nodeExporter[name] for name in std.objectFields(kp.nodeExporter) } +
{ ['kube-state-metrics-' + name]: kp.kubeStateMetrics[name] for name in std.objectFields(kp.kubeStateMetrics) } +
{ ['alertmanager-' + name]: kp.alertmanager[name] for name in std.objectFields(kp.alertmanager) } +
{ ['prometheus-' + name]: kp.prometheus[name] for name in std.objectFields(kp.prometheus) } +
{ ['prometheus-adapter-' + name]: kp.prometheusAdapter[name] for name in std.objectFields(kp.prometheusAdapter) } +
{ ['grafana-' + name]: kp.grafana[name] for name in std.objectFields(kp.grafana) }


@@ -0,0 +1,40 @@
local kp = (import 'kube-prometheus/kube-prometheus.libsonnet') +
(import 'kube-prometheus/kube-prometheus-weave-net.libsonnet') + {
_config+:: {
namespace: 'monitoring',
},
prometheusAlerts+:: {
groups: std.map(
function(group)
if group.name == 'weave-net' then
group {
rules: std.map(
function(rule)
if rule.alert == 'WeaveNetFastDPFlowsLow' then
rule {
expr: 'sum(weave_flows) < 20000',
}
else if rule.alert == 'WeaveNetIPAMUnreachable' then
rule {
expr: 'weave_ipam_unreachable_percentage > 25',
}
else
rule
,
group.rules
),
}
else
group,
super.groups
),
},
};
{ ['00namespace-' + name]: kp.kubePrometheus[name] for name in std.objectFields(kp.kubePrometheus) } +
{ ['0prometheus-operator-' + name]: kp.prometheusOperator[name] for name in std.objectFields(kp.prometheusOperator) } +
{ ['node-exporter-' + name]: kp.nodeExporter[name] for name in std.objectFields(kp.nodeExporter) } +
{ ['kube-state-metrics-' + name]: kp.kubeStateMetrics[name] for name in std.objectFields(kp.kubeStateMetrics) } +
{ ['prometheus-' + name]: kp.prometheus[name] for name in std.objectFields(kp.prometheus) } +
{ ['prometheus-adapter-' + name]: kp.prometheusAdapter[name] for name in std.objectFields(kp.prometheusAdapter) } +
{ ['grafana-' + name]: kp.grafana[name] for name in std.objectFields(kp.grafana) }


@@ -1,7 +0,0 @@
apiserver-key.pem
apiserver.csr
apiserver.pem
metrics-ca-config.json
metrics-ca.crt
metrics-ca.key
cm-adapter-serving-certs.yaml


@@ -1,21 +0,0 @@
# Custom Metrics API
The custom metrics API allows the HPA v2 to scale based on arbitrary metrics.
This directory contains an example deployment which extends the Prometheus Adapter, deployed with kube-prometheus, to serve the [Custom Metrics API](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/instrumentation/custom-metrics-api.md) by talking to Prometheus running inside the cluster.
Make sure you have the Prometheus Adapter up and running in the `monitoring` namespace.
You can deploy everything in the `monitoring` namespace using `./deploy.sh`.
When you're done, you can teardown using the `./teardown.sh` script.
### Sample App
Additionally, this directory contains a sample app that uses the [Horizontal Pod Autoscaler](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/) to scale the Deployment's replicas of Pods up and down as needed.
Deploy this app by running `kubectl apply -f sample-app.yaml`.
Make the app accessible on your system, for example by using `kubectl port-forward svc/sample-app 8080`. Next you need to put some load on its http endpoints.
A tool like [hey](https://github.com/rakyll/hey) is helpful for doing so: `hey -c 20 -n 100000000 http://localhost:8080/metrics`
There is even more detailed information on this sample app at [luxas/kubeadm-workshop](https://github.com/luxas/kubeadm-workshop#deploying-the-prometheus-operator-for-monitoring-services-in-the-cluster).


@@ -1,12 +0,0 @@
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
name: custom-metrics-server-resources
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: custom-metrics-server-resources
subjects:
- kind: ServiceAccount
name: prometheus-adapter
namespace: monitoring


@@ -1,13 +0,0 @@
apiVersion: apiregistration.k8s.io/v1beta1
kind: APIService
metadata:
name: v1beta1.custom.metrics.k8s.io
spec:
service:
name: prometheus-adapter
namespace: monitoring
group: custom.metrics.k8s.io
version: v1beta1
insecureSkipTLSVerify: true
groupPriorityMinimum: 100
versionPriority: 100


@@ -1,9 +0,0 @@
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
name: custom-metrics-server-resources
rules:
- apiGroups:
- custom.metrics.k8s.io
resources: ["*"]
verbs: ["*"]


@@ -1,98 +0,0 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: adapter-config
namespace: monitoring
data:
config.yaml: |
rules:
- seriesQuery: '{__name__=~"^container_.*",container_name!="POD",namespace!="",pod_name!=""}'
seriesFilters: []
resources:
overrides:
namespace:
resource: namespace
pod_name:
resource: pod
name:
matches: ^container_(.*)_seconds_total$
as: ""
metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>,container_name!="POD"}[1m])) by (<<.GroupBy>>)
- seriesQuery: '{__name__=~"^container_.*",container_name!="POD",namespace!="",pod_name!=""}'
seriesFilters:
- isNot: ^container_.*_seconds_total$
resources:
overrides:
namespace:
resource: namespace
pod_name:
resource: pod
name:
matches: ^container_(.*)_total$
as: ""
metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>,container_name!="POD"}[1m])) by (<<.GroupBy>>)
- seriesQuery: '{__name__=~"^container_.*",container_name!="POD",namespace!="",pod_name!=""}'
seriesFilters:
- isNot: ^container_.*_total$
resources:
overrides:
namespace:
resource: namespace
pod_name:
resource: pod
name:
matches: ^container_(.*)$
as: ""
metricsQuery: sum(<<.Series>>{<<.LabelMatchers>>,container_name!="POD"}) by (<<.GroupBy>>)
- seriesQuery: '{namespace!="",__name__!~"^container_.*"}'
seriesFilters:
- isNot: .*_total$
resources:
template: <<.Resource>>
name:
matches: ""
as: ""
metricsQuery: sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)
- seriesQuery: '{namespace!="",__name__!~"^container_.*"}'
seriesFilters:
- isNot: .*_seconds_total
resources:
template: <<.Resource>>
name:
matches: ^(.*)_total$
as: ""
metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)
- seriesQuery: '{namespace!="",__name__!~"^container_.*"}'
seriesFilters: []
resources:
template: <<.Resource>>
name:
matches: ^(.*)_seconds_total$
as: ""
metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)
resourceRules:
cpu:
containerQuery: sum(rate(container_cpu_usage_seconds_total{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)
nodeQuery: sum(rate(container_cpu_usage_seconds_total{<<.LabelMatchers>>, id='/'}[1m])) by (<<.GroupBy>>)
resources:
overrides:
node:
resource: node
namespace:
resource: namespace
pod_name:
resource: pod
containerLabel: container_name
memory:
containerQuery: sum(container_memory_working_set_bytes{<<.LabelMatchers>>}) by (<<.GroupBy>>)
nodeQuery: sum(container_memory_working_set_bytes{<<.LabelMatchers>>,id='/'}) by (<<.GroupBy>>)
resources:
overrides:
node:
resource: node
namespace:
resource: namespace
pod_name:
resource: pod
containerLabel: container_name
window: 1m


@@ -1,7 +0,0 @@
#!/usr/bin/env bash
kubectl apply -n monitoring -f custom-metrics-apiserver-resource-reader-cluster-role-binding.yaml
kubectl apply -n monitoring -f custom-metrics-apiservice.yaml
kubectl apply -n monitoring -f custom-metrics-cluster-role.yaml
kubectl apply -n monitoring -f custom-metrics-configmap.yaml
kubectl apply -n monitoring -f hpa-custom-metrics-cluster-role-binding.yaml


@@ -1,12 +0,0 @@
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
name: hpa-controller-custom-metrics
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: custom-metrics-server-resources
subjects:
- kind: ServiceAccount
name: horizontal-pod-autoscaler
namespace: kube-system


@@ -1,67 +0,0 @@
kind: ServiceMonitor
apiVersion: monitoring.coreos.com/v1
metadata:
name: sample-app
labels:
app: sample-app
spec:
selector:
matchLabels:
app: sample-app
endpoints:
- port: http
interval: 5s
---
apiVersion: v1
kind: Service
metadata:
name: sample-app
labels:
app: sample-app
spec:
ports:
- name: http
port: 8080
targetPort: 8080
selector:
app: sample-app
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: sample-app
labels:
app: sample-app
spec:
replicas: 1
selector:
matchLabels:
app: sample-app
template:
metadata:
labels:
app: sample-app
spec:
containers:
- image: luxas/autoscale-demo:v0.1.2
name: metrics-provider
ports:
- name: http
containerPort: 8080
---
kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2beta1
metadata:
name: sample-app
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: sample-app
minReplicas: 1
maxReplicas: 10
metrics:
- type: Pods
pods:
metricName: http_requests
targetAverageValue: 500m


@@ -1,7 +0,0 @@
#!/usr/bin/env bash
kubectl delete -n monitoring -f custom-metrics-apiserver-resource-reader-cluster-role-binding.yaml
kubectl delete -n monitoring -f custom-metrics-apiservice.yaml
kubectl delete -n monitoring -f custom-metrics-cluster-role.yaml
kubectl delete -n monitoring -f custom-metrics-configmap.yaml
kubectl delete -n monitoring -f hpa-custom-metrics-cluster-role-binding.yaml


@@ -14,6 +14,14 @@ rules:
- get
- list
- watch
- apiGroups:
- "apps"
resources:
- deployments
verbs:
- get
- list
- watch
- apiGroups:
- "extensions"
resources:


@@ -1,4 +1,4 @@
apiVersion: extensions/v1beta1
apiVersion: apps/v1
kind: Deployment
metadata:
name: metrics-server

24
go.mod

@@ -1,33 +1,27 @@
module github.com/coreos/kube-prometheus
go 1.12
go 1.13
require (
github.com/Jeffail/gabs v1.2.0
github.com/alecthomas/template v0.0.0-20190718012654-fb15b899a751 // indirect
github.com/alecthomas/units v0.0.0-20190924025748-f65c72e2690d // indirect
github.com/gogo/protobuf v1.1.1 // indirect
github.com/google/gofuzz v0.0.0-20170612174753-24818f796faf // indirect
github.com/brancz/gojsontoyaml v0.0.0-20191212081931-bf2969bbd742
github.com/campoy/embedmd v1.0.0
github.com/google/go-jsonnet v0.16.1-0.20200703153429-aaf50f5b655f
github.com/googleapis/gnostic v0.0.0-20170729233727-0c5108395e2d // indirect
github.com/imdario/mergo v0.3.7 // indirect
github.com/json-iterator/go v0.0.0-20180701071628-ab8a2e0c74be // indirect
github.com/jsonnet-bundler/jsonnet-bundler v0.1.0 // indirect
github.com/mattn/go-colorable v0.1.4 // indirect
github.com/mattn/go-isatty v0.0.10 // indirect
github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect
github.com/modern-go/reflect2 v1.0.1 // indirect
github.com/jsonnet-bundler/jsonnet-bundler v0.4.0
github.com/kr/pretty v0.2.0 // indirect
github.com/mattn/go-colorable v0.1.7 // indirect
github.com/pkg/errors v0.8.1
github.com/prometheus/client_golang v1.5.1
github.com/spf13/pflag v1.0.3 // indirect
github.com/stretchr/objx v0.2.0 // indirect
golang.org/x/crypto v0.0.0-20190411191339-88737f569e3a // indirect
golang.org/x/net v0.0.0-20190206173232-65e2d4e15006 // indirect
golang.org/x/oauth2 v0.0.0-20190402181905-9f3314589c9a // indirect
golang.org/x/sys v0.0.0-20191023151326-f89234f9a2c2 // indirect
golang.org/x/sys v0.0.0-20200625212154-ddb9806d33ae // indirect
golang.org/x/text v0.3.1-0.20181227161524-e6919f6577db // indirect
golang.org/x/time v0.0.0-20190308202827-9d24e82272b4 // indirect
gopkg.in/check.v1 v1.0.0-20190902080502-41f04d3bba15 // indirect
gopkg.in/inf.v0 v0.9.1 // indirect
gopkg.in/yaml.v2 v2.2.4 // indirect
k8s.io/api v0.0.0-20190313235455-40a48860b5ab // indirect
k8s.io/apimachinery v0.0.0-20190313205120-d7deff9243b1
k8s.io/client-go v11.0.0+incompatible

111
go.sum

@@ -5,86 +5,169 @@ github.com/alecthomas/template v0.0.0-20160405071501-a0175ee3bccc/go.mod h1:LOuy
github.com/alecthomas/template v0.0.0-20190718012654-fb15b899a751 h1:JYp7IbQjafoB+tBA3gMyHYHrpOtNuDiK/uB5uXxq5wM=
github.com/alecthomas/template v0.0.0-20190718012654-fb15b899a751/go.mod h1:LOuyumcjzFXgccqObfd/Ljyb9UuFJ6TxHnclSeseNhc=
github.com/alecthomas/units v0.0.0-20151022065526-2efee857e7cf/go.mod h1:ybxpYRFXyAe+OPACYpWeL0wqObRcbAqCMya13uyzqw0=
github.com/alecthomas/units v0.0.0-20190717042225-c3de453c63f4/go.mod h1:ybxpYRFXyAe+OPACYpWeL0wqObRcbAqCMya13uyzqw0=
github.com/alecthomas/units v0.0.0-20190924025748-f65c72e2690d h1:UQZhZ2O0vMHr2cI+DC1Mbh0TJxzA3RcLoMsFw+aXw7E=
github.com/alecthomas/units v0.0.0-20190924025748-f65c72e2690d/go.mod h1:rBZYJk541a8SKzHPHnH3zbiI+7dagKZ0cgpgrD7Fyho=
github.com/beorn7/perks v0.0.0-20180321164747-3a771d992973/go.mod h1:Dwedo/Wpr24TaqPxmxbtue+5NUziq4I4S80YR8gNf3Q=
github.com/beorn7/perks v1.0.0/go.mod h1:KWe93zE9D1o94FZ5RNwFwVgaQK1VOXiVxmqh+CedLV8=
github.com/beorn7/perks v1.0.1/go.mod h1:G2ZrVWU2WbWT9wwq4/hrbKbnv/1ERSJQ0ibhJ6rlkpw=
github.com/brancz/gojsontoyaml v0.0.0-20191212081931-bf2969bbd742 h1:PdvQdwUXiFnSmWsOJcBXLpyH3mJfP2FMPTT3J0i7+8o=
github.com/brancz/gojsontoyaml v0.0.0-20191212081931-bf2969bbd742/go.mod h1:IyUJYN1gvWjtLF5ZuygmxbnsAyP3aJS6cHzIuZY50B0=
github.com/campoy/embedmd v1.0.0 h1:V4kI2qTJJLf4J29RzI/MAt2c3Bl4dQSYPuflzwFH2hY=
github.com/campoy/embedmd v1.0.0/go.mod h1:oxyr9RCiSXg0M3VJ3ks0UGfp98BpSSGr0kpiX3MzVl8=
github.com/cespare/xxhash/v2 v2.1.1/go.mod h1:VGX0DQ3Q6kWi7AoAeZDth3/j3BFtOZR5XLFGgcrjCOs=
github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/fatih/color v1.7.0 h1:DkWD4oS2D8LGGgTQ6IvwJJXSL5Vp2ffcQg58nFV38Ys=
github.com/fatih/color v1.7.0/go.mod h1:Zm6kSWBoL9eyXnKyktHP6abPY2pDugNf5KwzbycvMj4=
github.com/fatih/color v1.9.0 h1:8xPHl4/q1VyqGIPif1F+1V3Y3lSmrq01EabUW3CoW5s=
github.com/fatih/color v1.9.0/go.mod h1:eQcE1qtQxscV5RaZvpXrrb8Drkc3/DdQ+uUYCNjL+zU=
github.com/ghodss/yaml v1.0.0 h1:wQHKEahhL6wmXdzwWG11gIVCkOv05bNOh+Rxn0yngAk=
github.com/ghodss/yaml v1.0.0/go.mod h1:4dBDuWmgqj2HViK6kFavaiC9ZROes6MMH2rRYeMEF04=
github.com/go-kit/kit v0.8.0/go.mod h1:xBxKIO96dXMWWy0MnWVtmwkA9/13aqxPnvrjFYMA2as=
github.com/go-kit/kit v0.9.0/go.mod h1:xBxKIO96dXMWWy0MnWVtmwkA9/13aqxPnvrjFYMA2as=
github.com/go-logfmt/logfmt v0.3.0/go.mod h1:Qt1PoO58o5twSAckw1HlFXLmHsOX5/0LbT9GBnD5lWE=
github.com/go-logfmt/logfmt v0.4.0/go.mod h1:3RMwSq7FuexP4Kalkev3ejPJsZTpXXBr9+V4qmtdjCk=
github.com/go-stack/stack v1.8.0/go.mod h1:v0f6uXyyMGvRgIKkXu+yp6POWl0qKG85gN/melR3HDY=
github.com/gogo/protobuf v1.1.1 h1:72R+M5VuhED/KujmZVcIquuo8mBgX4oVda//DQb3PXo=
github.com/gogo/protobuf v1.1.1/go.mod h1:r8qH/GZQm5c6nD/R0oafs1akxWv10x8SbQlK7atdtwQ=
github.com/golang/protobuf v1.2.0 h1:P3YflyNX/ehuJFLhxviNdFxQPkGK5cDcApsge1SqnvM=
github.com/golang/protobuf v1.2.0/go.mod h1:6lQm79b+lXiMfvg/cZm0SGofjICqVBUtrP5yJMmIC1U=
github.com/google/gofuzz v0.0.0-20170612174753-24818f796faf h1:+RRA9JqSOZFfKrOeqr2z77+8R2RKyh8PG66dcu1V0ck=
github.com/google/gofuzz v0.0.0-20170612174753-24818f796faf/go.mod h1:HP5RmnzzSNb993RKQDq4+1A4ia9nllfqcQFTQJedwGI=
github.com/golang/protobuf v1.3.1/go.mod h1:6lQm79b+lXiMfvg/cZm0SGofjICqVBUtrP5yJMmIC1U=
github.com/golang/protobuf v1.3.2 h1:6nsPYzhq5kReh6QImI3k5qWzO4PEbvbIW2cwSfR/6xs=
github.com/golang/protobuf v1.3.2/go.mod h1:6lQm79b+lXiMfvg/cZm0SGofjICqVBUtrP5yJMmIC1U=
github.com/google/go-cmp v0.3.1/go.mod h1:8QqcDgzrUqlUb/G2PQTWiueGozuR1884gddMywk6iLU=
github.com/google/go-cmp v0.4.0/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/gNBxE=
github.com/google/go-jsonnet v0.16.1-0.20200703153429-aaf50f5b655f h1:mw4KoMG5/DXLPhpKXQRYTEIZFkFo0a1HU2R1HbeYpek=
github.com/google/go-jsonnet v0.16.1-0.20200703153429-aaf50f5b655f/go.mod h1:sOcuej3UW1vpPTZOr8L7RQimqai1a57bt5j22LzGZCw=
github.com/google/gofuzz v1.0.0 h1:A8PeW59pxE9IoFRqBp37U+mSNaQoZ46F1f0f863XSXw=
github.com/google/gofuzz v1.0.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/M65Eg=
github.com/googleapis/gnostic v0.0.0-20170729233727-0c5108395e2d h1:7XGaL1e6bYS1yIonGp9761ExpPPV1ui0SAC59Yube9k=
github.com/googleapis/gnostic v0.0.0-20170729233727-0c5108395e2d/go.mod h1:sJBsCZ4ayReDTBIg8b9dl28c5xFWyhBTVRp3pOg5EKY=
github.com/imdario/mergo v0.3.7 h1:Y+UAYTZ7gDEuOfhxKWy+dvb5dRQ6rJjFSdX2HZY1/gI=
github.com/imdario/mergo v0.3.7/go.mod h1:2EnlNZ0deacrJVfApfmtdGgDfMuh/nq6Ok1EcJh5FfA=
github.com/json-iterator/go v0.0.0-20180701071628-ab8a2e0c74be h1:AHimNtVIpiBjPUhEF5KNCkrUyqTSA5zWUl8sQ2bfGBE=
github.com/json-iterator/go v0.0.0-20180701071628-ab8a2e0c74be/go.mod h1:+SdeFBvtyEkXs7REEP0seUULqWtbJapLOCVDaaPEHmU=
github.com/jsonnet-bundler/jsonnet-bundler v0.1.0 h1:T/HtHFr+mYCRULrH1x/RnoB0prIs0rMkolJhFMXNC9A=
github.com/jsonnet-bundler/jsonnet-bundler v0.1.0/go.mod h1:YKsSFc9VFhhLITkJS3X2PrRqWG9u2Jq99udTdDjQLfM=
github.com/json-iterator/go v1.1.6/go.mod h1:+SdeFBvtyEkXs7REEP0seUULqWtbJapLOCVDaaPEHmU=
github.com/json-iterator/go v1.1.9 h1:9yzud/Ht36ygwatGx56VwCZtlI/2AD15T1X2sjSuGns=
github.com/json-iterator/go v1.1.9/go.mod h1:KdQUCv79m/52Kvf8AW2vK1V8akMuk1QjK/uOdHXbAo4=
github.com/jsonnet-bundler/jsonnet-bundler v0.4.0 h1:4BKZ6LDqPc2wJDmaKnmYD/vDjUptJtnUpai802MibFc=
github.com/jsonnet-bundler/jsonnet-bundler v0.4.0/go.mod h1:/by7P/OoohkI3q4CgSFqcoFsVY+IaNbzOVDknEsKDeU=
github.com/julienschmidt/httprouter v1.2.0/go.mod h1:SYymIcj16QtmaHHD7aYtjjsJG7VTCxuUUipMqKk8s4w=
github.com/konsorten/go-windows-terminal-sequences v1.0.1/go.mod h1:T0+1ngSBFLxvqU3pZ+m/2kptfBszLMUkC4ZK/EgS/cQ=
github.com/kr/logfmt v0.0.0-20140226030751-b84e30acd515/go.mod h1:+0opPa2QZZtGFBFZlji/RkVcI2GknAs/DXo4wKdlNEc=
github.com/kr/pretty v0.1.0/go.mod h1:dAy3ld7l9f0ibDNOQOHHMYYIIbhfbHSm3C4ZsoJORNo=
github.com/kr/pretty v0.2.0 h1:s5hAObm+yFO5uHYt5dYjxi2rXrsnmRpJx4OYvIWUaQs=
github.com/kr/pretty v0.2.0/go.mod h1:ipq/a2n7PKx3OHsz4KJII5eveXtPO4qwEXGdVfWzfnI=
github.com/kr/pty v1.1.1/go.mod h1:pFQYn66WHrOpPYNljwOMqo10TkYh1fy3cYio2l3bCsQ=
github.com/kr/text v0.1.0 h1:45sCR5RtlFHMR4UwH9sdQ5TC8v0qDQCHnXt+kaKSTVE=
github.com/kr/text v0.1.0/go.mod h1:4Jbv+DJW3UT/LiOwJeYQe1efqtUx/iVham/4vfdArNI=
github.com/mattn/go-colorable v0.0.9 h1:UVL0vNpWh04HeJXV0KLcaT7r06gOH2l4OW6ddYRUIY4=
github.com/mattn/go-colorable v0.0.9/go.mod h1:9vuHe8Xs5qXnSaW/c/ABM9alt+Vo+STaOChaDxuIBZU=
github.com/mattn/go-colorable v0.1.4 h1:snbPLB8fVfU9iwbbo30TPtbLRzwWu6aJS6Xh4eaaviA=
github.com/mattn/go-colorable v0.1.4/go.mod h1:U0ppj6V5qS13XJ6of8GYAs25YV2eR4EVcfRqFIhoBtE=
github.com/mattn/go-colorable v0.1.7 h1:bQGKb3vps/j0E9GfJQ03JyhRuxsvdAanXlT9BTw3mdw=
github.com/mattn/go-colorable v0.1.7/go.mod h1:u6P/XSegPjTcexA+o6vUJrdnUu04hMope9wVRipJSqc=
github.com/mattn/go-isatty v0.0.6 h1:SrwhHcpV4nWrMGdNcC2kXpMfcBVYGDuTArqyhocJgvA=
github.com/mattn/go-isatty v0.0.6/go.mod h1:Iq45c/XA43vh69/j3iqttzPXn0bhXyGjM0Hdxcsrc5s=
github.com/mattn/go-isatty v0.0.8/go.mod h1:Iq45c/XA43vh69/j3iqttzPXn0bhXyGjM0Hdxcsrc5s=
github.com/mattn/go-isatty v0.0.10 h1:qxFzApOv4WsAL965uUPIsXzAKCZxN2p9UqdhFS4ZW10=
github.com/mattn/go-isatty v0.0.10/go.mod h1:qgIWMr58cqv1PHHyhnkY9lrL7etaEgOFcMEpPG5Rm84=
github.com/mattn/go-isatty v0.0.11 h1:FxPOTFNqGkuDUGi3H/qkUbQO4ZiBa2brKq5r0l8TGeM=
github.com/mattn/go-isatty v0.0.11/go.mod h1:PhnuNfih5lzO57/f3n+odYbM4JtupLOxQOAqxQCu2WE=
github.com/mattn/go-isatty v0.0.12 h1:wuysRhFDzyxgEmMf5xjvJ2M9dZoWAXNNr5LSBS7uHXY=
github.com/mattn/go-isatty v0.0.12/go.mod h1:cbi8OIDigv2wuxKPP5vlRcQ1OAZbq2CE4Kysco4FUpU=
github.com/matttproud/golang_protobuf_extensions v1.0.1/go.mod h1:D8He9yQNgCq6Z5Ld7szi9bcBfOoFv/3dc6xSMkL2PC0=
github.com/modern-go/concurrent v0.0.0-20180228061459-e0a39a4cb421/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q=
github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd h1:TRLaZ9cD/w8PVh93nsPXa1VrQ6jlwL5oN8l14QlcNfg=
github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q=
github.com/modern-go/reflect2 v0.0.0-20180701023420-4b7aa43c6742/go.mod h1:bx2lNnkwVCuqBIxFjflWJWanXIb3RllmbCylyMrvgv0=
github.com/modern-go/reflect2 v1.0.1 h1:9f412s+6RmYXLWZSEzVVgPGK7C2PphHj5RJrvfx9AWI=
github.com/modern-go/reflect2 v1.0.1/go.mod h1:bx2lNnkwVCuqBIxFjflWJWanXIb3RllmbCylyMrvgv0=
github.com/mwitkow/go-conntrack v0.0.0-20161129095857-cc309e4a2223/go.mod h1:qRWi+5nqEBWmkhHvq77mSJWrCKwh8bxhgT7d/eI7P4U=
github.com/pkg/errors v0.8.0/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0=
github.com/pkg/errors v0.8.1 h1:iURUrRGxPUNPdy5/HRSm+Yj6okJ6UtLINN0Q9M4+h3I=
github.com/pkg/errors v0.8.1/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0=
github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
github.com/prometheus/client_golang v0.9.1/go.mod h1:7SWBe2y4D6OKWSNQJUaRYU/AaXPKyh/dDVn+NZz0KFw=
github.com/prometheus/client_golang v1.0.0/go.mod h1:db9x61etRT2tGnBNRi70OPL5FsnadC4Ky3P0J6CfImo=
github.com/prometheus/client_golang v1.5.1 h1:bdHYieyGlH+6OLEk2YQha8THib30KP0/yD0YH9m6xcA=
github.com/prometheus/client_golang v1.5.1/go.mod h1:e9GMxYsXl05ICDXkRhurwBS4Q3OK1iX/F2sw+iXX5zU=
github.com/prometheus/client_model v0.0.0-20180712105110-5c3871d89910/go.mod h1:MbSGuTsp3dbXC40dX6PRTWyKYBIrTGTE9sqQNg2J8bo=
github.com/prometheus/client_model v0.0.0-20190129233127-fd36f4220a90/go.mod h1:xMI15A0UPsDsEKsMN9yxemIoYk6Tm2C1GtYGdfGttqA=
github.com/prometheus/client_model v0.2.0/go.mod h1:xMI15A0UPsDsEKsMN9yxemIoYk6Tm2C1GtYGdfGttqA=
github.com/prometheus/common v0.4.1/go.mod h1:TNfzLD0ON7rHzMJeJkieUDPYmFC7Snx/y86RQel1bk4=
github.com/prometheus/common v0.9.1 h1:KOMtN28tlbam3/7ZKEYKHhKoJZYYj3gMH4uc62x7X7U=
github.com/prometheus/common v0.9.1/go.mod h1:yhUN8i9wzaXS3w1O07YhxHEBxD+W35wd8bs7vj7HSQ4=
github.com/prometheus/procfs v0.0.0-20181005140218-185b4288413d/go.mod h1:c3At6R/oaqEKCNdg8wHV1ftS6bRYblBhIjjI8uT2IGk=
github.com/prometheus/procfs v0.0.2/go.mod h1:TjEm7ze935MbeOT/UhFTIMYKhuLP4wbCsTZCD3I8kEA=
github.com/prometheus/procfs v0.0.8/go.mod h1:7Qr8sr6344vo1JqZ6HhLceV9o3AJ1Ff+GxbHq6oeK9A=
github.com/sergi/go-diff v1.1.0 h1:we8PVUC3FE2uYfodKH/nBHMSetSfHDR6scGdBi+erh0=
github.com/sergi/go-diff v1.1.0/go.mod h1:STckp+ISIX8hZLjrqAeVduY0gWCT9IjLuqbuNXdaHfM=
github.com/sirupsen/logrus v1.2.0/go.mod h1:LxeOpSwHxABJmUn/MG1IvRgCAasNZTLOkJPxbbu5VWo=
github.com/sirupsen/logrus v1.4.2/go.mod h1:tLMulIdttU9McNUspp0xgXVQah82FyeX6MwdIuYE2rE=
github.com/spf13/pflag v1.0.3 h1:zPAT6CGy6wXeQ7NtTnaTerfKOsV6V6F8agHXFiazDkg=
github.com/spf13/pflag v1.0.3/go.mod h1:DYY7MBk1bdzusC3SYhjObp+wFpr4gzcvqqNjLnInEg4=
github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME=
github.com/stretchr/objx v0.2.0/go.mod h1:qt09Ya8vawLte6SNmTgCsAVtYtaKzEcn8ATUoHMkEqE=
github.com/stretchr/testify v1.2.2 h1:bSDNvY7ZPG5RlJ8otE/7V6gMiyenm9RtJ7IUVIAoJ1w=
github.com/stretchr/objx v0.1.1/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME=
github.com/stretchr/testify v1.2.2/go.mod h1:a8OnRcib4nhh0OaRAV+Yts87kKdq0PP7pXfy6kDkUVs=
github.com/stretchr/testify v1.3.0/go.mod h1:M5WIy9Dh21IEIfnGCwXGc5bZfKNJtfHm1UVUgZn+9EI=
github.com/stretchr/testify v1.4.0 h1:2E4SXV/wtOkTonXsotYi4li6zVWxYlZuYNCXe9XRJyk=
github.com/stretchr/testify v1.4.0/go.mod h1:j7eGeouHqKxXV5pUuKE4zz7dFj8WfuZ+81PSLYec5m4=
golang.org/x/crypto v0.0.0-20180904163835-0709b304e793/go.mod h1:6SG95UA2DQfeDnfUPMdvaQW0Q7yPrPDi9nlGo2tz2b4=
golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2/go.mod h1:djNgcEr1/C05ACkg1iLfiJU5Ep61QUkGW8qpdssI0+w=
golang.org/x/crypto v0.0.0-20190411191339-88737f569e3a h1:Igim7XhdOpBnWPuYJ70XcNpq8q3BCACtVgNfoJxOV7g=
golang.org/x/crypto v0.0.0-20190411191339-88737f569e3a/go.mod h1:WFFai1msRO1wXaEeE5yQxYXgSfI8pQAWXbQop6sCtWE=
golang.org/x/net v0.0.0-20180724234803-3673e40ba225/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4=
golang.org/x/net v0.0.0-20181114220301-adae6a3d119a/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4=
golang.org/x/net v0.0.0-20190108225652-1e06a53dbb7e/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4=
golang.org/x/net v0.0.0-20190206173232-65e2d4e15006 h1:bfLnR+k0tq5Lqt6dflRLcZiz6UaXCMt3vhYJ1l4FQ80=
golang.org/x/net v0.0.0-20190206173232-65e2d4e15006/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4=
golang.org/x/net v0.0.0-20190613194153-d28f0bde5980 h1:dfGZHvZk057jK2MCeWus/TowKpJ8y4AmooUzdBSR9GU=
golang.org/x/net v0.0.0-20190613194153-d28f0bde5980/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
golang.org/x/oauth2 v0.0.0-20190402181905-9f3314589c9a h1:tImsplftrFpALCYumobsd0K86vlAs/eXGFms2txfJfA=
golang.org/x/oauth2 v0.0.0-20190402181905-9f3314589c9a/go.mod h1:gOpvHmFTYa4IltrdGE7lF6nIHvwfUNPOp7c8zoXwtLw=
golang.org/x/sync v0.0.0-20181108010431-42b317875d0f/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20181221193216-37e7f081c4d4 h1:YUO/7uOKsKeq9UokNS62b8FYywz3ker1l1vDZRCRefw=
golang.org/x/sync v0.0.0-20181221193216-37e7f081c4d4/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20190911185100-cd5d95a43a6e/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sys v0.0.0-20180905080454-ebe1bf3edb33/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20181116152217-5ac8a444bdc5/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20190222072716-a9d3bda3a223/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20190310054646-10058d7d4faa/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20190403152447-81d4e9dc473e h1:nFYrTHrdrAOpShe27kaFHjsqYSEQ0KWqdWLu3xuZJts=
golang.org/x/sys v0.0.0-20190403152447-81d4e9dc473e/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20191008105621-543471e840be/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20191023151326-f89234f9a2c2/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20190422165155-953cdadca894/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20191026070338-33540a1f6037 h1:YyJpGZS1sBuBCzLAR1VEpK193GlqGZbnPFnPV/5Rsb4=
golang.org/x/sys v0.0.0-20191026070338-33540a1f6037/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200116001909-b77594299b42/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200122134326-e047566fdf82 h1:ywK/j/KkyTHcdyYSZNXGjMwgmDSfjglYZ3vStQ/gSCU=
golang.org/x/sys v0.0.0-20200122134326-e047566fdf82/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200223170610-d5e6a3e2c0ae/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200625212154-ddb9806d33ae h1:Ih9Yo4hSPImZOpfGuA4bR/ORKTAbhZo2AbWNRCnevdo=
golang.org/x/sys v0.0.0-20200625212154-ddb9806d33ae/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
golang.org/x/text v0.3.1-0.20181227161524-e6919f6577db h1:6/JqlYfC1CCaLnGceQTI+sDGhC9UBSPAsBqI0Gun6kU=
golang.org/x/text v0.3.1-0.20181227161524-e6919f6577db/go.mod h1:bEr9sfX3Q8Zfm5fL9x+3itogRgK3+ptLWKqgva+5dAk=
golang.org/x/time v0.0.0-20190308202827-9d24e82272b4 h1:SvFZT6jyqRaOeXpc5h/JSfZenJ2O330aBsf7JfSUXmQ=
golang.org/x/time v0.0.0-20190308202827-9d24e82272b4/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ=
golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ=
golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
google.golang.org/appengine v1.4.0 h1:/wp5JvzpHIxhs/dumFmF7BXTf3Z+dd4uXta4kVyO508=
google.golang.org/appengine v1.4.0/go.mod h1:xpcJRLb0r/rnEns0DIKYYv+WjYCduHsrkT7/EB5XEv4=
gopkg.in/alecthomas/kingpin.v2 v2.2.6 h1:jMFz6MfLP0/4fUyZle81rXUoxOBFi19VUFKVDOQfozc=
gopkg.in/alecthomas/kingpin.v2 v2.2.6/go.mod h1:FMv+mEhP44yOT+4EoQTLFTRgOQ1FBLkstjWtayDeSgw=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405 h1:yhCVgyC4o1eVCa2tZl7eS0r+SDo693bJlVdllGtEeKM=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/check.v1 v1.0.0-20190902080502-41f04d3bba15 h1:YR8cESwS4TdDjEe65xsg0ogRM/Nc3DYOhEAlW+xobZo=
gopkg.in/check.v1 v1.0.0-20190902080502-41f04d3bba15/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/inf.v0 v0.9.1 h1:73M5CoZyi3ZLMOyDlQh031Cx6N9NDJ2Vvfl76EDAgDc=
gopkg.in/inf.v0 v0.9.1/go.mod h1:cWUDdTG/fYaXco+Dcufb5Vnc6Gp2YChqWtbxRZE0mXw=
gopkg.in/yaml.v2 v2.1.1/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
gopkg.in/yaml.v2 v2.2.1/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
gopkg.in/yaml.v2 v2.2.2 h1:ZCJp+EgiOT7lHqUV2J862kp8Qj64Jo6az82+3Td9dZw=
gopkg.in/yaml.v2 v2.2.2/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
gopkg.in/yaml.v2 v2.2.4 h1:/eiJrUcujPVeJ3xlSWaiNi3uSVmDGBK1pDHUHAnao1I=
gopkg.in/yaml.v2 v2.2.4/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
gopkg.in/yaml.v2 v2.2.5 h1:ymVxjfMaHvXD8RqPRmzHHsB3VvucivSkIAvJFDI5O3c=
gopkg.in/yaml.v2 v2.2.5/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
k8s.io/api v0.0.0-20190313235455-40a48860b5ab h1:DG9A67baNpoeweOy2spF1OWHhnVY5KR7/Ek/+U1lVZc=
k8s.io/api v0.0.0-20190313235455-40a48860b5ab/go.mod h1:iuAfoD4hCxJ8Onx9kaTIt30j7jUFS00AXQi6QMi99vA=
k8s.io/apimachinery v0.0.0-20190313205120-d7deff9243b1 h1:IS7K02iBkQXpCeieSiyJjGoLSdVOv2DbPaWHJ+ZtgKg=

@@ -5,7 +5,7 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
namespace: 'default',
versions+:: {
alertmanager: 'v0.18.0',
alertmanager: 'v0.21.0',
},
imageRepos+:: {
@@ -18,24 +18,53 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
global: {
resolve_timeout: '5m',
},
inhibit_rules: [{
source_match: {
severity: 'critical',
},
target_match_re: {
severity: 'warning|info',
},
equal: ['namespace', 'alertname'],
}, {
source_match: {
severity: 'warning',
},
target_match_re: {
severity: 'info',
},
equal: ['namespace', 'alertname'],
}],
route: {
group_by: ['job'],
group_by: ['namespace'],
group_wait: '30s',
group_interval: '5m',
repeat_interval: '12h',
receiver: 'null',
receiver: 'Default',
routes: [
{
receiver: 'null',
receiver: 'Watchdog',
match: {
alertname: 'Watchdog',
},
},
{
receiver: 'Critical',
match: {
severity: 'critical',
},
},
],
},
receivers: [
{
name: 'null',
name: 'Default',
},
{
name: 'Watchdog',
},
{
name: 'Critical',
},
],
},
@@ -48,7 +77,8 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
local secret = k.core.v1.secret;
if std.type($._config.alertmanager.config) == 'object' then
secret.new('alertmanager-' + $._config.alertmanager.name, { 'alertmanager.yaml': std.base64(std.manifestYamlDoc($._config.alertmanager.config)) }) +
secret.new('alertmanager-' + $._config.alertmanager.name, {})
.withStringData({ 'alertmanager.yaml': std.manifestYamlDoc($._config.alertmanager.config) }) +
secret.mixin.metadata.withNamespace($._config.namespace)
else
secret.new('alertmanager-' + $._config.alertmanager.name, { 'alertmanager.yaml': std.base64($._config.alertmanager.config) }) +
@@ -111,7 +141,7 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
spec: {
replicas: $._config.alertmanager.replicas,
version: $._config.versions.alertmanager,
baseImage: $._config.imageRepos.alertmanager,
image: $._config.imageRepos.alertmanager + ':' + $._config.versions.alertmanager,
nodeSelector: { 'kubernetes.io/os': 'linux' },
serviceAccountName: 'alertmanager-' + $._config.alertmanager.name,
securityContext: {
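
The two inhibit_rules added to the Alertmanager config in this file mute lower-severity alerts while a higher-severity alert with the same namespace and alertname is firing. The sketch below illustrates that behaviour in Python. It is a simplified model written for this note, not Alertmanager's implementation; the function name `is_inhibited` and the dictionary layout are assumptions, and regex matchers are treated as fully anchored.

```python
import re

# Rules copied from the config in the hunk above. The matching logic is a
# simplified assumption about how Alertmanager evaluates inhibit rules.
INHIBIT_RULES = [
    {
        "source_match": {"severity": "critical"},
        "target_match_re": {"severity": "warning|info"},
        "equal": ["namespace", "alertname"],
    },
    {
        "source_match": {"severity": "warning"},
        "target_match_re": {"severity": "info"},
        "equal": ["namespace", "alertname"],
    },
]


def is_inhibited(target, firing, rules=INHIBIT_RULES):
    """Return True if any firing alert mutes `target` under any rule."""
    for rule in rules:
        # The target must match every regex in target_match_re (anchored).
        target_ok = all(
            re.fullmatch(pattern, target.get(label, "")) is not None
            for label, pattern in rule["target_match_re"].items()
        )
        if not target_ok:
            continue
        for source in firing:
            # The source must match source_match exactly ...
            source_ok = all(
                source.get(label, "") == value
                for label, value in rule["source_match"].items()
            )
            # ... and agree with the target on every label listed in `equal`.
            same_group = all(
                source.get(label, "") == target.get(label, "")
                for label in rule["equal"]
            )
            if source_ok and same_group:
                return True
    return False
```

Under this model a warning-level alert is muted only when a critical alert with the same alertname fires in the same namespace; the critical alert itself is never muted by these rules.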

@@ -7,10 +7,15 @@
{
alert: 'AlertmanagerConfigInconsistent',
annotations: {
message: 'The configuration of the instances of the Alertmanager cluster `{{$labels.service}}` are out of sync.',
message: |||
The configuration of the instances of the Alertmanager cluster `{{ $labels.namespace }}/{{ $labels.service }}` are out of sync.
{{ range printf "alertmanager_config_hash{namespace=\"%s\",service=\"%s\"}" $labels.namespace $labels.service | query }}
Configuration hash for pod {{ .Labels.pod }} is "{{ printf "%.f" .Value }}"
{{ end }}
|||,
},
expr: |||
count_values("config_hash", alertmanager_config_hash{%(alertmanagerSelector)s}) BY (service) / ON(service) GROUP_LEFT() label_replace(max(prometheus_operator_spec_replicas{%(prometheusOperatorSelector)s,controller="alertmanager"}) by (name, job, namespace, controller), "service", "alertmanager-$1", "name", "(.*)") != 1
count by(namespace,service) (count_values by(namespace,service) ("config_hash", alertmanager_config_hash{%(alertmanagerSelector)s})) != 1
||| % $._config,
'for': '5m',
labels: {
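
The rewritten AlertmanagerConfigInconsistent expression counts the distinct values of alertmanager_config_hash per (namespace, service) and fires when that count is not 1. The Python sketch below mirrors that aggregation on plain tuples. The function name and the sample layout are assumptions made for illustration; it does not query Prometheus.

```python
from collections import defaultdict


def inconsistent_services(samples):
    """Map (namespace, service) to its number of distinct config hashes.

    `samples` is an iterable of (namespace, service, pod, config_hash).
    Only groups whose count differs from 1 are returned, mirroring the
    `!= 1` filter in the PromQL expression.
    """
    hashes = defaultdict(set)
    for namespace, service, _pod, config_hash in samples:
        # Inner count_values by(namespace, service): one entry per distinct hash.
        hashes[(namespace, service)].add(config_hash)
    # Outer count by(namespace, service): number of distinct hashes per group.
    return {
        key: len(values) for key, values in hashes.items() if len(values) != 1
    }
```

A service whose pods all report the same hash yields a count of 1 and is filtered out; two pods with different hashes yield 2 and would trigger the alert after the 5m hold.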

@@ -7,7 +7,7 @@
{
alert: 'TargetDown',
annotations: {
message: '{{ printf "%.4g" $value }}% of the {{ $labels.job }} targets in {{ $labels.namespace }} namespace are down.',
message: '{{ printf "%.4g" $value }}% of the {{ $labels.job }}/{{ $labels.service }} targets in {{ $labels.namespace }} namespace are down.',
},
expr: '100 * (count(up == 0) BY (job, namespace, service) / count(up) BY (job, namespace, service)) > 10',
'for': '10m',
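
The TargetDown expression computes, per (job, namespace, service), the percentage of targets whose `up` sample is 0 and keeps groups above 10. A small Python sketch of the same arithmetic follows. The function name and the tuple layout are assumptions for illustration. As in PromQL, a group with no down targets produces no result at all.

```python
from collections import Counter


def target_down(up_samples, threshold=10.0):
    """Return {(job, namespace, service): percent_down} above `threshold`.

    `up_samples` is an iterable of (job, namespace, service, up_value).
    """
    total, down = Counter(), Counter()
    for job, namespace, service, value in up_samples:
        key = (job, namespace, service)
        total[key] += 1  # count(up) BY (job, namespace, service)
        if value == 0:
            down[key] += 1  # count(up == 0) BY (job, namespace, service)
    result = {}
    for key, down_count in down.items():
        percent = 100 * down_count / total[key]
        if percent > threshold:
            result[key] = percent
    return result
```

With one of four targets down the group reports 25.0 and exceeds the threshold; with one of twenty down it reports 5.0 and is dropped.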

@@ -1,24 +1,6 @@
{
prometheusAlerts+:: {
groups+: [
{
name: 'node-time',
rules: [
{
alert: 'ClockSkewDetected',
annotations: {
message: 'Clock skew detected on node-exporter {{ $labels.namespace }}/{{ $labels.pod }}. Ensure NTP is configured correctly on this host.',
},
expr: |||
abs(node_timex_offset_seconds{%(nodeExporterSelector)s}) > 0.05
||| % $._config,
'for': '2m',
labels: {
severity: 'warning',
},
},
],
},
{
name: 'node-network',
rules: [

@@ -4,6 +4,32 @@
{
name: 'prometheus-operator',
rules: [
{
alert: 'PrometheusOperatorListErrors',
expr: |||
(sum by (controller,namespace) (rate(prometheus_operator_list_operations_failed_total{%(prometheusOperatorSelector)s}[10m])) / sum by (controller,namespace) (rate(prometheus_operator_list_operations_total{%(prometheusOperatorSelector)s}[10m]))) > 0.4
||| % $._config,
labels: {
severity: 'warning',
},
annotations: {
message: 'Errors while performing List operations in controller {{$labels.controller}} in {{$labels.namespace}} namespace.',
},
'for': '15m',
},
{
alert: 'PrometheusOperatorWatchErrors',
expr: |||
(sum by (controller,namespace) (rate(prometheus_operator_watch_operations_failed_total{%(prometheusOperatorSelector)s}[10m])) / sum by (controller,namespace) (rate(prometheus_operator_watch_operations_total{%(prometheusOperatorSelector)s}[10m]))) > 0.4
||| % $._config,
labels: {
severity: 'warning',
},
annotations: {
message: 'Errors while performing Watch operations in controller {{$labels.controller}} in {{$labels.namespace}} namespace.',
},
'for': '15m',
},
{
alert: 'PrometheusOperatorReconcileErrors',
expr: |||

@@ -0,0 +1,50 @@
[
// Drop all kubelet metrics which are deprecated in kubernetes.
{
sourceLabels: ['__name__'],
regex: 'kubelet_(pod_worker_latency_microseconds|pod_start_latency_microseconds|cgroup_manager_latency_microseconds|pod_worker_start_latency_microseconds|pleg_relist_latency_microseconds|pleg_relist_interval_microseconds|runtime_operations|runtime_operations_latency_microseconds|runtime_operations_errors|eviction_stats_age_microseconds|device_plugin_registration_count|device_plugin_alloc_latency_microseconds|network_plugin_operations_latency_microseconds)',
action: 'drop',
},
// Drop all scheduler metrics which are deprecated in kubernetes.
{
sourceLabels: ['__name__'],
regex: 'scheduler_(e2e_scheduling_latency_microseconds|scheduling_algorithm_predicate_evaluation|scheduling_algorithm_priority_evaluation|scheduling_algorithm_preemption_evaluation|scheduling_algorithm_latency_microseconds|binding_latency_microseconds|scheduling_latency_seconds)',
action: 'drop',
},
// Drop all apiserver metrics which are deprecated in kubernetes.
{
sourceLabels: ['__name__'],
regex: 'apiserver_(request_count|request_latencies|request_latencies_summary|dropped_requests|storage_data_key_generation_latencies_microseconds|storage_transformation_failures_total|storage_transformation_latencies_microseconds|proxy_tunnel_sync_latency_secs)',
action: 'drop',
},
// Drop all docker metrics which are deprecated in kubernetes.
{
sourceLabels: ['__name__'],
regex: 'kubelet_docker_(operations|operations_latency_microseconds|operations_errors|operations_timeout)',
action: 'drop',
},
// Drop all reflector metrics which are deprecated in kubernetes.
{
sourceLabels: ['__name__'],
regex: 'reflector_(items_per_list|items_per_watch|list_duration_seconds|lists_total|short_watches_total|watch_duration_seconds|watches_total)',
action: 'drop',
},
// Drop all etcd metrics which are deprecated in kubernetes.
{
sourceLabels: ['__name__'],
regex: 'etcd_(helper_cache_hit_count|helper_cache_miss_count|helper_cache_entry_count|request_cache_get_latencies_summary|request_cache_add_latencies_summary|request_latencies_summary)',
action: 'drop',
},
// Drop all transformation metrics which are deprecated in kubernetes.
{
sourceLabels: ['__name__'],
regex: 'transformation_(transformation_latencies_microseconds|failures_total)',
action: 'drop',
},
// Drop all other metrics which are deprecated in kubernetes.
{
sourceLabels: ['__name__'],
regex: '(admission_quota_controller_adds|crd_autoregistration_controller_work_duration|APIServiceOpenAPIAggregationControllerQueue1_adds|AvailableConditionController_retries|crd_openapi_controller_unfinished_work_seconds|APIServiceRegistrationController_retries|admission_quota_controller_longest_running_processor_microseconds|crdEstablishing_longest_running_processor_microseconds|crdEstablishing_unfinished_work_seconds|crd_openapi_controller_adds|crd_autoregistration_controller_retries|crd_finalizer_queue_latency|AvailableConditionController_work_duration|non_structural_schema_condition_controller_depth|crd_autoregistration_controller_unfinished_work_seconds|AvailableConditionController_adds|DiscoveryController_longest_running_processor_microseconds|autoregister_queue_latency|crd_autoregistration_controller_adds|non_structural_schema_condition_controller_work_duration|APIServiceRegistrationController_adds|crd_finalizer_work_duration|crd_naming_condition_controller_unfinished_work_seconds|crd_openapi_controller_longest_running_processor_microseconds|DiscoveryController_adds|crd_autoregistration_controller_longest_running_processor_microseconds|autoregister_unfinished_work_seconds|crd_naming_condition_controller_queue_latency|crd_naming_condition_controller_retries|non_structural_schema_condition_controller_queue_latency|crd_naming_condition_controller_depth|AvailableConditionController_longest_running_processor_microseconds|crdEstablishing_depth|crd_finalizer_longest_running_processor_microseconds|crd_naming_condition_controller_adds|APIServiceOpenAPIAggregationControllerQueue1_longest_running_processor_microseconds|DiscoveryController_queue_latency|DiscoveryController_unfinished_work_seconds|crd_openapi_controller_depth|APIServiceOpenAPIAggregationControllerQueue1_queue_latency|APIServiceOpenAPIAggregationControllerQueue1_unfinished_work_seconds|DiscoveryController_work_duration|autoregister_adds|crd_autoregistration_controller_queue_latency|crd_finalizer_retries|AvailableConditionController_unfinished_work_seconds|autoregister_longest_running_processor_microseconds|non_structural_schema_condition_controller_unfinished_work_seconds|APIServiceOpenAPIAggregationControllerQueue1_depth|AvailableConditionController_depth|DiscoveryController_retries|admission_quota_controller_depth|crdEstablishing_adds|APIServiceOpenAPIAggregationControllerQueue1_retries|crdEstablishing_queue_latency|non_structural_schema_condition_controller_longest_running_processor_microseconds|autoregister_work_duration|crd_openapi_controller_retries|APIServiceRegistrationController_work_duration|crdEstablishing_work_duration|crd_finalizer_adds|crd_finalizer_depth|crd_openapi_controller_queue_latency|APIServiceOpenAPIAggregationControllerQueue1_work_duration|APIServiceRegistrationController_queue_latency|crd_autoregistration_controller_depth|AvailableConditionController_queue_latency|admission_quota_controller_queue_latency|crd_naming_condition_controller_work_duration|crd_openapi_controller_work_duration|DiscoveryController_depth|crd_naming_condition_controller_longest_running_processor_microseconds|APIServiceRegistrationController_depth|APIServiceRegistrationController_longest_running_processor_microseconds|crd_finalizer_unfinished_work_seconds|crdEstablishing_retries|admission_quota_controller_unfinished_work_seconds|non_structural_schema_condition_controller_adds|APIServiceRegistrationController_unfinished_work_seconds|admission_quota_controller_work_duration|autoregister_depth|autoregister_retries|kubeproxy_sync_proxy_rules_latency_microseconds|rest_client_request_latency_seconds|non_structural_schema_condition_controller_retries)',
action: 'drop',
},
]
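
Each entry in the list above is a metric relabeling rule that drops a series when its `__name__` matches the regex. The Python sketch below shows the effect on a list of metric names. It is an illustration under two assumptions: the regex is treated as fully anchored, which is how Prometheus applies relabeling regexes, and Python's `re` stands in for the RE2 engine Prometheus uses. The function name `apply_drop_rules` is hypothetical.

```python
import re

# One rule copied verbatim from the list above; the others behave the same way.
RULES = [
    {
        "sourceLabels": ["__name__"],
        "regex": "kubelet_docker_(operations|operations_latency_microseconds|operations_errors|operations_timeout)",
        "action": "drop",
    },
]


def apply_drop_rules(metric_names, rules=RULES):
    """Return the metric names that survive every `drop` rule."""
    patterns = [
        re.compile(rule["regex"]) for rule in rules if rule["action"] == "drop"
    ]
    # fullmatch models the implicit anchoring of relabeling regexes.
    return [
        name
        for name in metric_names
        if not any(pattern.fullmatch(name) for pattern in patterns)
    ]
```

Because the match is anchored, `kubelet_docker_operations` is dropped while a name that merely starts with it, such as `kubelet_docker_operations_duration_seconds`, is kept.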

File diff suppressed because it is too large

File diff suppressed because it is too large

@@ -1,74 +1,89 @@
{
"dependencies": [
{
"name": "ksonnet",
"source": {
"git": {
"remote": "https://github.com/ksonnet/ksonnet-lib",
"subdir": ""
}
},
"version": "master"
},
{
"name": "kubernetes-mixin",
"source": {
"git": {
"remote": "https://github.com/kubernetes-monitoring/kubernetes-mixin",
"subdir": ""
}
},
"version": "master"
},
{
"name": "grafana",
"source": {
"git": {
"remote": "https://github.com/brancz/kubernetes-grafana",
"subdir": "grafana"
}
},
"version": "master"
},
{
"name": "prometheus-operator",
"source": {
"git": {
"remote": "https://github.com/coreos/prometheus-operator",
"subdir": "jsonnet/prometheus-operator"
}
},
"version": "release-0.34"
},
{
"name": "etcd-mixin",
"source": {
"git": {
"remote": "https://github.com/coreos/etcd",
"subdir": "Documentation/etcd-mixin"
}
},
"version": "master"
},
{
"name": "prometheus",
"source": {
"git": {
"remote": "https://github.com/prometheus/prometheus",
"subdir": "documentation/prometheus-mixin"
}
},
"version": "master"
},
{
"name": "node-mixin",
"source": {
"git": {
"remote": "https://github.com/prometheus/node_exporter",
"subdir": "docs/node-mixin"
}
},
"version": "master"
"version": 1,
"dependencies": [
{
"source": {
"git": {
"remote": "https://github.com/brancz/kubernetes-grafana",
"subdir": "grafana"
}
]
},
"version": "release-0.1"
},
{
"source": {
"git": {
"remote": "https://github.com/coreos/etcd",
"subdir": "Documentation/etcd-mixin"
}
},
"version": "e8ba375032e8e48d009759dfb285f7812e7bcb8c"
},
{
"source": {
"git": {
"remote": "https://github.com/coreos/prometheus-operator",
"subdir": "jsonnet/prometheus-operator"
}
},
"version": "release-0.42"
},
{
"source": {
"git": {
"remote": "https://github.com/ksonnet/ksonnet-lib",
"subdir": ""
}
},
"version": "0d2f82676817bbf9e4acf6495b2090205f323b9f",
"name": "ksonnet"
},
{
"source": {
"git": {
"remote": "https://github.com/kubernetes-monitoring/kubernetes-mixin",
"subdir": ""
}
},
"version": "release-0.5"
},
{
"source": {
"git": {
"remote": "https://github.com/kubernetes/kube-state-metrics",
"subdir": "jsonnet/kube-state-metrics"
}
},
"version": "release-1.9"
},
{
"source": {
"git": {
"remote": "https://github.com/kubernetes/kube-state-metrics",
"subdir": "jsonnet/kube-state-metrics-mixin"
}
},
"version": "release-1.9"
},
{
"source": {
"git": {
"remote": "https://github.com/prometheus/node_exporter",
"subdir": "docs/node-mixin"
}
},
"version": "ff2ff3410f4ea8195e51f5fb8d84151684f91b3f"
},
{
"source": {
"git": {
"remote": "https://github.com/prometheus/prometheus",
"subdir": "documentation/prometheus-mixin"
}
},
"version": "release-2.20",
"name": "prometheus"
}
],
"legacyImports": true
}

@@ -0,0 +1,20 @@
local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
{
prometheus+:: {
clusterRole+: {
rules+:
local role = k.rbac.v1.role;
local policyRule = role.rulesType;
local rule = policyRule.new() +
policyRule.withApiGroups(['']) +
policyRule.withResources([
'services',
'endpoints',
'pods',
]) +
policyRule.withVerbs(['get', 'list', 'watch']);
[rule]
},
}
}

@@ -5,12 +5,12 @@ local servicePort = k.core.v1.service.mixin.spec.portsType;
{
prometheus+:: {
kubeControllerManagerPrometheusDiscoveryService:
service.new('kube-controller-manager-prometheus-discovery', { 'k8s-app': 'kube-controller-manager' }, servicePort.newNamed('http-metrics', 10252, 10252)) +
service.new('kube-controller-manager-prometheus-discovery', { 'k8s-app': 'kube-controller-manager' }, servicePort.newNamed('https-metrics', 10257, 10257)) +
service.mixin.metadata.withNamespace('kube-system') +
service.mixin.metadata.withLabels({ 'k8s-app': 'kube-controller-manager' }) +
service.mixin.spec.withClusterIp('None'),
kubeSchedulerPrometheusDiscoveryService:
service.new('kube-scheduler-prometheus-discovery', { 'k8s-app': 'kube-scheduler' }, servicePort.newNamed('http-metrics', 10251, 10251)) +
service.new('kube-scheduler-prometheus-discovery', { 'k8s-app': 'kube-scheduler' }, servicePort.newNamed('https-metrics', 10259, 10259)) +
service.mixin.metadata.withNamespace('kube-system') +
service.mixin.metadata.withLabels({ 'k8s-app': 'kube-scheduler' }) +
service.mixin.spec.withClusterIp('None'),

@@ -0,0 +1,197 @@
local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
// Custom metrics API allows the HPA v2 to scale based on arbitrary metrics.
// For more details on usage visit https://github.com/DirectXMan12/k8s-prometheus-adapter#quick-links
{
_config+:: {
prometheusAdapter+:: {
// Rules for custom-metrics
config+:: {
rules+: [
{
seriesQuery: '{__name__=~"^container_.*",container!="POD",namespace!="",pod!=""}',
seriesFilters: [],
resources: {
overrides: {
namespace: {
resource: 'namespace'
},
pod: {
resource: 'pod'
}
},
},
name: {
matches: '^container_(.*)_seconds_total$',
as: ""
},
metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>,container!="POD"}[1m])) by (<<.GroupBy>>)'
},
{
seriesQuery: '{__name__=~"^container_.*",container!="POD",namespace!="",pod!=""}',
seriesFilters: [
{ isNot: '^container_.*_seconds_total$' },
],
resources: {
overrides: {
namespace: {
resource: 'namespace'
},
pod: {
resource: 'pod'
}
},
},
name: {
matches: '^container_(.*)_total$',
as: ''
},
metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>,container!="POD"}[1m])) by (<<.GroupBy>>)'
},
{
seriesQuery: '{__name__=~"^container_.*",container!="POD",namespace!="",pod!=""}',
seriesFilters: [
{ isNot: '^container_.*_total$' },
],
resources: {
overrides: {
namespace: {
resource: 'namespace'
},
pod: {
resource: 'pod'
}
},
},
name: {
matches: '^container_(.*)$',
as: ''
},
metricsQuery: 'sum(<<.Series>>{<<.LabelMatchers>>,container!="POD"}) by (<<.GroupBy>>)'
},
{
seriesQuery: '{namespace!="",__name__!~"^container_.*"}',
seriesFilters: [
{ isNot: '.*_total$' },
],
resources: {
template: '<<.Resource>>'
},
name: {
matches: '',
as: ''
},
metricsQuery: 'sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'
},
{
seriesQuery: '{namespace!="",__name__!~"^container_.*"}',
seriesFilters: [
{ isNot: '.*_seconds_total' },
],
resources: {
template: '<<.Resource>>'
},
name: {
matches: '^(.*)_total$',
as: ''
},
metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)'
},
{
seriesQuery: '{namespace!="",__name__!~"^container_.*"}',
seriesFilters: [],
resources: {
template: '<<.Resource>>'
},
name: {
matches: '^(.*)_seconds_total$',
as: ''
},
metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)'
}
],
},
},
},
prometheusAdapter+:: {
customMetricsApiService: {
apiVersion: 'apiregistration.k8s.io/v1',
kind: 'APIService',
metadata: {
name: 'v1beta1.custom.metrics.k8s.io',
},
spec: {
service: {
name: $.prometheusAdapter.service.metadata.name,
namespace: $._config.namespace,
},
group: 'custom.metrics.k8s.io',
version: 'v1beta1',
insecureSkipTLSVerify: true,
groupPriorityMinimum: 100,
versionPriority: 100,
},
},
customMetricsApiServiceV1Beta2: {
apiVersion: 'apiregistration.k8s.io/v1',
kind: 'APIService',
metadata: {
name: 'v1beta2.custom.metrics.k8s.io',
},
spec: {
service: {
name: $.prometheusAdapter.service.metadata.name,
namespace: $._config.namespace,
},
group: 'custom.metrics.k8s.io',
version: 'v1beta2',
insecureSkipTLSVerify: true,
groupPriorityMinimum: 100,
versionPriority: 200,
},
},
customMetricsClusterRoleServerResources:
local clusterRole = k.rbac.v1.clusterRole;
local policyRule = clusterRole.rulesType;
local rules =
policyRule.new() +
policyRule.withApiGroups(['custom.metrics.k8s.io']) +
policyRule.withResources(['*']) +
policyRule.withVerbs(['*']);
clusterRole.new() +
clusterRole.mixin.metadata.withName('custom-metrics-server-resources') +
clusterRole.withRules(rules),
customMetricsClusterRoleBindingServerResources:
local clusterRoleBinding = k.rbac.v1.clusterRoleBinding;
clusterRoleBinding.new() +
clusterRoleBinding.mixin.metadata.withName('custom-metrics-server-resources') +
clusterRoleBinding.mixin.roleRef.withApiGroup('rbac.authorization.k8s.io') +
clusterRoleBinding.mixin.roleRef.withName('custom-metrics-server-resources') +
clusterRoleBinding.mixin.roleRef.mixinInstance({ kind: 'ClusterRole' }) +
clusterRoleBinding.withSubjects([{
kind: 'ServiceAccount',
name: $.prometheusAdapter.serviceAccount.metadata.name,
namespace: $._config.namespace,
}]),
customMetricsClusterRoleBindingHPA:
local clusterRoleBinding = k.rbac.v1.clusterRoleBinding;
clusterRoleBinding.new() +
clusterRoleBinding.mixin.metadata.withName('hpa-controller-custom-metrics') +
clusterRoleBinding.mixin.roleRef.withApiGroup('rbac.authorization.k8s.io') +
clusterRoleBinding.mixin.roleRef.withName('custom-metrics-server-resources') +
clusterRoleBinding.mixin.roleRef.mixinInstance({ kind: 'ClusterRole' }) +
clusterRoleBinding.withSubjects([{
kind: 'ServiceAccount',
name: 'horizontal-pod-autoscaler',
namespace: 'kube-system',
}]),
}
}
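
The rules above decide which Prometheus series the adapter exposes through `custom.metrics.k8s.io` and under which name. As a hedged sketch of the consuming side: assuming an application exports a counter named `http_requests_total` with `namespace` and `pod` labels, and assuming the adapter's `^(.*)_total$` rule exposes it as `http_requests`, an HPA could target it roughly as follows. The `example-app` names, the metric name, and the target value are all hypothetical and do not come from this change.

```jsonnet
// Hypothetical HorizontalPodAutoscaler that scales on a per-pod custom metric.
// Every name below (example-app, http_requests) is an illustration only.
{
  apiVersion: 'autoscaling/v2',
  kind: 'HorizontalPodAutoscaler',
  metadata: {
    name: 'example-app',
    namespace: 'default',
  },
  spec: {
    scaleTargetRef: {
      apiVersion: 'apps/v1',
      kind: 'Deployment',
      name: 'example-app',
    },
    minReplicas: 1,
    maxReplicas: 5,
    metrics: [
      {
        // 'Pods' metrics are averaged across the pods of the scale target.
        type: 'Pods',
        pods: {
          metric: { name: 'http_requests' },
          target: { type: 'AverageValue', averageValue: '10' },
        },
      },
    ],
  },
}
```

Older clusters may only serve `autoscaling/v2beta2`; the `spec` layout shown here is the same in both versions.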

@@ -3,7 +3,24 @@ local service = k.core.v1.service;
local servicePort = k.core.v1.service.mixin.spec.portsType;
{
_config+:: {
eks: {
minimumAvailableIPs: 10,
minimumAvailableIPsTime: '10m'
}
},
prometheus+: {
serviceMonitorCoreDNS+: {
spec+: {
endpoints: [
{
bearerTokenFile: "/var/run/secrets/kubernetes.io/serviceaccount/token",
interval: "15s",
targetPort: 9153
}
]
},
},
AwsEksCniMetricService:
service.new('aws-node', { 'k8s-app' : 'aws-node' } , servicePort.newNamed('cni-metrics-port', 61678, 61678)) +
service.mixin.metadata.withNamespace('kube-system') +
@@ -48,14 +65,14 @@ local servicePort = k.core.v1.service.mixin.spec.portsType;
name: 'kube-prometheus-eks.rules',
rules: [
{
expr: 'sum by(instance) (awscni_total_ip_addresses) - sum by(instance) (awscni_assigned_ip_addresses) < 10',
expr: 'sum by(instance) (awscni_total_ip_addresses) - sum by(instance) (awscni_assigned_ip_addresses) < %s' % $._config.eks.minimumAvailableIPs,
labels: {
severity: 'critical',
},
annotations: {
message: 'Instance {{ $labels.instance }} has less than 10 IPs available.'
},
'for': '10m',
'for': $._config.eks.minimumAvailableIPsTime,
alert: 'EksAvailableIPs'
},
],
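
The hunk above replaces the hard-coded threshold and duration of the `EksAvailableIPs` alert with values read from `_config.eks`. The following is a minimal, self-contained sketch of how a downstream override could change them. The `base` object is a hypothetical stand-in for the addon, not the real import, and it reproduces only the fields needed for illustration.

```jsonnet
// Hypothetical stand-in for the EKS addon: defaults plus one alert rule.
local base = {
  _config+:: {
    eks: {
      minimumAvailableIPs: 10,
      minimumAvailableIPsTime: '10m',
    },
  },
  rule: {
    alert: 'EksAvailableIPs',
    // '%s' is filled with the configured threshold when the object is evaluated.
    expr: 'sum by(instance) (awscni_total_ip_addresses) - sum by(instance) (awscni_assigned_ip_addresses) < %s' % $._config.eks.minimumAvailableIPs,
    'for': $._config.eks.minimumAvailableIPsTime,
  },
};

// Override only the threshold; 'eks+:' merges with the defaults instead of replacing them.
base {
  _config+:: {
    eks+: { minimumAvailableIPs: 20 },
  },
}
```

Evaluating this with `jsonnet` should render `expr` ending in `< 20` while `for` stays at `10m`. Note that the alert's `message` annotation in the diff still says "10 IPs", so it would not follow such an override.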

@@ -37,6 +37,23 @@
regex: 'container_(network_tcp_usage_total|network_udp_usage_total|tasks_state|cpu_load_average_10s)',
action: 'drop',
},
// Drop cAdvisor metrics with no (pod, namespace) labels while preserving ability to monitor system services resource usage (cardinality estimation)
{
sourceLabels: ['__name__', 'pod', 'namespace'],
action: 'drop',
regex: '(' + std.join('|',
[
'container_fs_.*', // add filesystem read/write data (nodes*disks*services*4)
'container_spec_.*', // everything related to cgroup specification and thus static data (nodes*services*5)
'container_blkio_device_usage_total', // useful for containers, but not for system services (nodes*disks*services*operations*2)
'container_file_descriptors', // file descriptors limits and global numbers are exposed via (nodes*services)
'container_sockets', // used sockets in cgroup. Usually not important for system services (nodes*services)
'container_threads_max', // max number of threads in cgroup. Usually for system services it is not limited (nodes*services)
'container_threads', // used threads in cgroup. Usually not important for system services (nodes*services)
'container_start_time_seconds', // container start. Possibly not needed for system services (nodes*services)
'container_last_seen', // not needed as system services are always running (nodes*services)
]) + ');;',
},
],
},
],
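
The drop rule above relies on two Prometheus relabeling behaviors: the values of `sourceLabels` are joined with `;` (the default separator), and the regex is anchored at both ends. A series with empty `pod` and `namespace` labels therefore produces the string `<metric name>;;`, which is what the trailing `;;` matches. The sketch below builds the same kind of pattern from a shortened, illustrative list; the `droppedForSystemServices` name is hypothetical.

```jsonnet
// Shortened list for illustration; the rule in the diff lists more metric names.
local droppedForSystemServices = [
  'container_fs_.*',
  'container_spec_.*',
  'container_last_seen',
];

{
  sourceLabels: ['__name__', 'pod', 'namespace'],
  action: 'drop',
  // Evaluates to '(container_fs_.*|container_spec_.*|container_last_seen);;'
  regex: '(' + std.join('|', droppedForSystemServices) + ');;',
}
```

With that pattern, `container_last_seen` without `pod` and `namespace` is dropped. The same metric with both labels set is joined to something like `container_last_seen;my-pod;my-namespace`, which does not match, so it is kept.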

@@ -5,12 +5,12 @@ local servicePort = k.core.v1.service.mixin.spec.portsType;
{
prometheus+:: {
kubeControllerManagerPrometheusDiscoveryService:
service.new('kube-controller-manager-prometheus-discovery', { 'k8s-app': 'kube-controller-manager' }, servicePort.newNamed('http-metrics', 10252, 10252)) +
service.new('kube-controller-manager-prometheus-discovery', { 'k8s-app': 'kube-controller-manager' }, servicePort.newNamed('https-metrics', 10257, 10257)) +
service.mixin.metadata.withNamespace('kube-system') +
service.mixin.metadata.withLabels({ 'k8s-app': 'kube-controller-manager' }) +
service.mixin.spec.withClusterIp('None'),
kubeSchedulerPrometheusDiscoveryService:
service.new('kube-scheduler-prometheus-discovery', { 'k8s-app': 'kube-scheduler' }, servicePort.newNamed('http-metrics', 10251, 10251)) +
service.new('kube-scheduler-prometheus-discovery', { 'k8s-app': 'kube-scheduler' }, servicePort.newNamed('https-metrics', 10259, 10259)) +
service.mixin.metadata.withNamespace('kube-system') +
service.mixin.metadata.withLabels({ 'k8s-app': 'kube-scheduler' }) +
service.mixin.spec.withClusterIp('None'),

@@ -5,12 +5,12 @@ local servicePort = k.core.v1.service.mixin.spec.portsType;
{
prometheus+: {
kubeControllerManagerPrometheusDiscoveryService:
service.new('kube-controller-manager-prometheus-discovery', { 'k8s-app': 'kube-controller-manager' }, servicePort.newNamed('http-metrics', 10252, 10252)) +
service.new('kube-controller-manager-prometheus-discovery', { 'k8s-app': 'kube-controller-manager' }, servicePort.newNamed('https-metrics', 10257, 10257)) +
service.mixin.metadata.withNamespace('kube-system') +
service.mixin.metadata.withLabels({ 'k8s-app': 'kube-controller-manager' }) +
service.mixin.spec.withClusterIp('None'),
kubeSchedulerPrometheusDiscoveryService:
service.new('kube-scheduler-prometheus-discovery', { 'k8s-app': 'kube-scheduler' }, servicePort.newNamed('http-metrics', 10251, 10251)) +
service.new('kube-scheduler-prometheus-discovery', { 'k8s-app': 'kube-scheduler' }, servicePort.newNamed('https-metrics', 10259, 10259)) +
service.mixin.metadata.withNamespace('kube-system') +
service.mixin.metadata.withLabels({ 'k8s-app': 'kube-scheduler' }) +
service.mixin.spec.withClusterIp('None'),

@@ -5,12 +5,12 @@ local servicePort = k.core.v1.service.mixin.spec.portsType;
{
prometheus+: {
kubeControllerManagerPrometheusDiscoveryService:
service.new('kube-controller-manager-prometheus-discovery', { component: 'kube-controller-manager' }, servicePort.newNamed('http-metrics', 10252, 10252)) +
service.new('kube-controller-manager-prometheus-discovery', { component: 'kube-controller-manager' }, servicePort.newNamed('https-metrics', 10257, 10257)) +
service.mixin.metadata.withNamespace('kube-system') +
service.mixin.metadata.withLabels({ 'k8s-app': 'kube-controller-manager' }) +
service.mixin.spec.withClusterIp('None'),
kubeSchedulerPrometheusDiscoveryService:
service.new('kube-scheduler-prometheus-discovery', { component: 'kube-scheduler' }, servicePort.newNamed('http-metrics', 10251, 10251)) +
service.new('kube-scheduler-prometheus-discovery', { component: 'kube-scheduler' }, servicePort.newNamed('https-metrics', 10259, 10259)) +
service.mixin.metadata.withNamespace('kube-system') +
service.mixin.metadata.withLabels({ 'k8s-app': 'kube-scheduler' }) +
service.mixin.spec.withClusterIp('None'),

@@ -6,12 +6,12 @@ local servicePort = k.core.v1.service.mixin.spec.portsType;
prometheus+: {
kubeControllerManagerPrometheusDiscoveryService:
service.new('kube-controller-manager-prometheus-discovery', { 'component': 'kube-controller-manager' }, servicePort.newNamed('http-metrics', 10252, 10252)) +
service.new('kube-controller-manager-prometheus-discovery', { 'component': 'kube-controller-manager' }, servicePort.newNamed('https-metrics', 10257, 10257)) +
service.mixin.metadata.withNamespace('kube-system') +
service.mixin.metadata.withLabels({ 'k8s-app': 'kube-controller-manager' }) +
service.mixin.spec.withClusterIp('None'),
kubeSchedulerPrometheusDiscoveryService:
service.new('kube-scheduler-prometheus-discovery', { 'component': 'kube-scheduler' }, servicePort.newNamed('http-metrics', 10251, 10251)) +
service.new('kube-scheduler-prometheus-discovery', { 'component': 'kube-scheduler' }, servicePort.newNamed('https-metrics', 10259, 10259)) +
service.mixin.metadata.withNamespace('kube-system') +
service.mixin.metadata.withLabels({ 'k8s-app': 'kube-scheduler' }) +
service.mixin.spec.withClusterIp('None'),

@@ -9,6 +9,9 @@
'kube-rbac-proxy'+: {
limits: {},
},
'kube-state-metrics'+: {
limits: {},
},
'node-exporter'+: {
limits: {},
},

@@ -5,7 +5,7 @@ local servicePort = k.core.v1.service.mixin.spec.portsType;
{
_config+:: {
versions+:: {
thanos: 'v0.7.0',
thanos: 'v0.14.0',
},
imageRepos+:: {
thanos: 'quay.io/thanos/thanos',
@@ -30,7 +30,7 @@ local servicePort = k.core.v1.service.mixin.spec.portsType;
spec+: {
thanos+: {
version: $._config.versions.thanos,
baseImage: $._config.imageRepos.thanos,
image: $._config.imageRepos.thanos + ':' + $._config.versions.thanos,
objectStorageConfig: $._config.thanos.objectStorageConfig,
},
},

@@ -0,0 +1,189 @@
local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
local service = k.core.v1.service;
local servicePort = k.core.v1.service.mixin.spec.portsType;
{
prometheus+: {
serviceWeaveNet:
service.new('weave-net', { 'name': 'weave-net' }, servicePort.newNamed('weave-net-metrics', 6782, 6782)) +
service.mixin.metadata.withNamespace('kube-system') +
service.mixin.metadata.withLabels({ 'k8s-app': 'weave-net' }) +
service.mixin.spec.withClusterIp('None'),
serviceMonitorWeaveNet: {
apiVersion: 'monitoring.coreos.com/v1',
kind: 'ServiceMonitor',
metadata: {
name: 'weave-net',
labels: {
'k8s-app': 'weave-net',
},
namespace: 'monitoring',
},
spec: {
jobLabel: 'k8s-app',
endpoints: [
{
port: 'weave-net-metrics',
path: '/metrics',
interval: '15s',
},
],
namespaceSelector: {
matchNames: [
'kube-system',
],
},
selector: {
matchLabels: {
'k8s-app': 'weave-net',
},
},
},
},
},
prometheusRules+: {
groups+: [
{
name: 'weave-net',
rules: [
{
alert: 'WeaveNetIPAMSplitBrain',
expr: 'max(weave_ipam_unreachable_percentage) - min(weave_ipam_unreachable_percentage) > 0',
'for': '3m',
labels: {
severity: 'critical',
},
annotations: {
summary: 'Percentage of all IP addresses owned by unreachable peers is not the same for every node.',
description: 'actionable: Weave Net network has a split brain problem. Please find the problem and fix it.',
},
},
{
alert: 'WeaveNetIPAMUnreachable',
expr: 'weave_ipam_unreachable_percentage > 25',
'for': '10m',
labels: {
severity: 'critical',
},
annotations: {
summary: 'Percentage of all IP addresses owned by unreachable peers is above threshold.',
description: 'actionable: Please find the problem and fix it.',
},
},
{
alert: 'WeaveNetIPAMPendingAllocates',
expr: 'sum(weave_ipam_pending_allocates) > 0',
'for': '3m',
labels: {
severity: 'critical',
},
annotations: {
summary: 'Number of pending allocates is above the threshold.',
description: 'actionable: Please find the problem and fix it.',
},
},
{
alert: 'WeaveNetIPAMPendingClaims',
expr: 'sum(weave_ipam_pending_claims) > 0',
'for': '3m',
labels: {
severity: 'critical',
},
annotations: {
summary: 'Number of pending claims is above the threshold.',
description: 'actionable: Please find the problem and fix it.',
},
},
{
alert: 'WeaveNetFastDPFlowsLow',
expr: 'sum(weave_flows) < 15000',
'for': '3m',
labels: {
severity: 'critical',
},
annotations: {
summary: 'Number of FastDP flows is below the threshold.',
description: 'actionable: Please find the reason for FastDP flows to go below the threshold and fix it.',
},
},
{
alert: 'WeaveNetFastDPFlowsOff',
expr: 'sum(weave_flows == bool 0) > 0',
'for': '3m',
labels: {
severity: 'critical',
},
annotations: {
summary: 'Number of FastDP flows is zero.',
description: 'actionable: Please find the reason for FastDP flows to be off and fix it.',
},
},
{
alert: 'WeaveNetHighConnectionTerminationRate',
expr: 'rate(weave_connection_terminations_total[5m]) > 0.1',
'for': '5m',
labels: {
severity: 'critical',
},
annotations: {
summary: 'A lot of connections are getting terminated.',
description: 'actionable: Please find the reason for the high connection termination rate and fix it.',
},
},
{
alert: 'WeaveNetConnectionsConnecting',
expr: 'sum(weave_connections{state="connecting"}) > 0',
'for': '3m',
labels: {
severity: 'critical',
},
annotations: {
summary: 'A lot of connections are in connecting state.',
description: 'actionable: Please find the reason for this and fix it.',
},
},
{
alert: 'WeaveNetConnectionsRetying',
expr: 'sum(weave_connections{state="retrying"}) > 0',
'for': '3m',
labels: {
severity: 'critical',
},
annotations: {
summary: 'A lot of connections are in retrying state.',
description: 'actionable: Please find the reason for this and fix it.',
},
},
{
alert: 'WeaveNetConnectionsPending',
expr: 'sum(weave_connections{state="pending"}) > 0',
'for': '3m',
labels: {
severity: 'critical',
},
annotations: {
summary: 'A lot of connections are in pending state.',
description: 'actionable: Please find the reason for this and fix it.',
},
},
{
alert: 'WeaveNetConnectionsFailed',
expr: 'sum(weave_connections{state="failed"}) > 0',
'for': '3m',
labels: {
severity: 'critical',
},
annotations: {
summary: 'A lot of connections are in failed state.',
description: 'actionable: Please find the reason and fix it.',
},
},
],
},
],
},
grafanaDashboards+:: {
'weave-net.json': (import 'grafana-weave-net.json'),
'weave-net-cluster.json': (import 'grafana-weave-net-cluster.json'),
},
}

@@ -4,6 +4,7 @@ local configMapList = k3.core.v1.configMapList;
(import 'grafana/grafana.libsonnet') +
(import 'kube-state-metrics/kube-state-metrics.libsonnet') +
(import 'kube-state-metrics-mixin/mixin.libsonnet') +
(import 'node-exporter/node-exporter.libsonnet') +
(import 'node-mixin/mixin.libsonnet') +
(import 'alertmanager/alertmanager.libsonnet') +
@@ -17,6 +18,63 @@ local configMapList = k3.core.v1.configMapList;
kubePrometheus+:: {
namespace: k.core.v1.namespace.new($._config.namespace),
},
prometheusOperator+:: {
service+: {
spec+: {
ports: [
{
name: 'https',
port: 8443,
targetPort: 'https',
},
],
},
},
serviceMonitor+: {
spec+: {
endpoints: [
{
port: 'https',
scheme: 'https',
honorLabels: true,
bearerTokenFile: '/var/run/secrets/kubernetes.io/serviceaccount/token',
tlsConfig: {
insecureSkipVerify: true,
},
},
]
},
},
clusterRole+: {
rules+: [
{
apiGroups: ['authentication.k8s.io'],
resources: ['tokenreviews'],
verbs: ['create'],
},
{
apiGroups: ['authorization.k8s.io'],
resources: ['subjectaccessreviews'],
verbs: ['create'],
},
],
},
} +
((import 'kube-prometheus/kube-rbac-proxy/container.libsonnet') {
config+:: {
kubeRbacProxy: {
local cfg = self,
image: $._config.imageRepos.kubeRbacProxy + ':' + $._config.versions.kubeRbacProxy,
name: 'kube-rbac-proxy',
securePortName: 'https',
securePort: 8443,
secureListenAddress: ':%d' % self.securePort,
upstream: 'http://127.0.0.1:8080/',
tlsCipherSuites: $._config.tlsCipherSuites,
},
},
}).deploymentMixin,
grafana+:: {
dashboardDefinitions: configMapList.new(super.dashboardDefinitions),
serviceMonitor: {
@@ -46,42 +104,43 @@ local configMapList = k3.core.v1.configMapList;
namespace: 'default',
versions+:: {
grafana: '6.4.3',
grafana: '7.1.0',
},
tlsCipherSuites: [
'TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256', // required by h2: http://golang.org/cl/30721
'TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256', // required by h2: http://golang.org/cl/30721
// 'TLS_RSA_WITH_RC4_128_SHA', // insecure: https://access.redhat.com/security/cve/cve-2013-2566
// 'TLS_RSA_WITH_3DES_EDE_CBC_SHA', // insecure: https://access.redhat.com/articles/2548661
// 'TLS_RSA_WITH_AES_128_CBC_SHA', // disabled by h2
// 'TLS_RSA_WITH_AES_256_CBC_SHA', // disabled by h2
'TLS_RSA_WITH_AES_128_CBC_SHA256',
// 'TLS_RSA_WITH_AES_128_GCM_SHA256', // disabled by h2
// 'TLS_RSA_WITH_AES_256_GCM_SHA384', // disabled by h2
// 'TLS_ECDHE_ECDSA_WITH_RC4_128_SHA', // insecure: https://access.redhat.com/security/cve/cve-2013-2566
// 'TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA',// disabled by h2
// 'TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA',// disabled by h2
// 'TLS_ECDHE_RSA_WITH_RC4_128_SHA', // insecure: https://access.redhat.com/security/cve/cve-2013-2566
// 'TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA', // insecure: https://access.redhat.com/articles/2548661
// 'TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA', // disabled by h2
// 'TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA', // disabled by h2
'TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256',
'TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256',
// 'TLS_RSA_WITH_RC4_128_SHA', // insecure: https://access.redhat.com/security/cve/cve-2013-2566
// 'TLS_RSA_WITH_3DES_EDE_CBC_SHA', // insecure: https://access.redhat.com/articles/2548661
// 'TLS_RSA_WITH_AES_128_CBC_SHA', // disabled by h2
// 'TLS_RSA_WITH_AES_256_CBC_SHA', // disabled by h2
// 'TLS_RSA_WITH_AES_128_CBC_SHA256', // insecure: https://access.redhat.com/security/cve/cve-2013-0169
// 'TLS_RSA_WITH_AES_128_GCM_SHA256', // disabled by h2
// 'TLS_RSA_WITH_AES_256_GCM_SHA384', // disabled by h2
// 'TLS_ECDHE_ECDSA_WITH_RC4_128_SHA', // insecure: https://access.redhat.com/security/cve/cve-2013-2566
// 'TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA', // disabled by h2
// 'TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA', // disabled by h2
// 'TLS_ECDHE_RSA_WITH_RC4_128_SHA', // insecure: https://access.redhat.com/security/cve/cve-2013-2566
// 'TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA', // insecure: https://access.redhat.com/articles/2548661
// 'TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA', // disabled by h2
// 'TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA', // disabled by h2
// 'TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256', // insecure: https://access.redhat.com/security/cve/cve-2013-0169
// 'TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256', // insecure: https://access.redhat.com/security/cve/cve-2013-0169
// disabled by h2 means: https://github.com/golang/net/blob/e514e69ffb8bc3c76a71ae40de0118d794855992/http2/ciphers.go
// 'TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384', // TODO: Might not work with h2
// 'TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384', // TODO: Might not work with h2
// 'TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305', // TODO: Might not work with h2
// 'TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305', // TODO: Might not work with h2
'TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384',
'TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384',
'TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305',
'TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305',
],
cadvisorSelector: 'job="kubelet"',
kubeletSelector: 'job="kubelet"',
cadvisorSelector: 'job="kubelet", metrics_path="/metrics/cadvisor"',
kubeletSelector: 'job="kubelet", metrics_path="/metrics"',
kubeStateMetricsSelector: 'job="kube-state-metrics"',
nodeExporterSelector: 'job="node-exporter"',
fsSpaceFillingUpCriticalThreshold: 15,
notKubeDnsSelector: 'job!="kube-dns"',
kubeSchedulerSelector: 'job="kube-scheduler"',
kubeControllerManagerSelector: 'job="kube-controller-manager"',
@@ -116,6 +175,10 @@ local configMapList = k3.core.v1.configMapList;
requests: { cpu: '10m', memory: '20Mi' },
limits: { cpu: '20m', memory: '40Mi' },
},
'kube-state-metrics': {
requests: { cpu: '100m', memory: '150Mi' },
limits: { cpu: '100m', memory: '150Mi' },
},
'node-exporter': {
requests: { cpu: '102m', memory: '180Mi' },
limits: { cpu: '250m', memory: '180Mi' },

@@ -0,0 +1,91 @@
local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
local deployment = k.apps.v1.deployment;
local container = deployment.mixin.spec.template.spec.containersType;
local containerPort = container.portsType;
{
local krp = self,
config+:: {
kubeRbacProxy: {
image: error 'must provide image',
name: error 'must provide name',
securePortName: error 'must provide securePortName',
securePort: error 'must provide securePort',
secureListenAddress: error 'must provide secureListenAddress',
upstream: error 'must provide upstream',
tlsCipherSuites: error 'must provide tlsCipherSuites',
},
},
specMixin:: {
local sm = self,
config+:: {
kubeRbacProxy: {
image: error 'must provide image',
name: error 'must provide name',
securePortName: error 'must provide securePortName',
securePort: error 'must provide securePort',
secureListenAddress: error 'must provide secureListenAddress',
upstream: error 'must provide upstream',
tlsCipherSuites: error 'must provide tlsCipherSuites',
},
},
spec+: {
template+: {
spec+: {
containers+: [
container.new(krp.config.kubeRbacProxy.name, krp.config.kubeRbacProxy.image) +
container.mixin.securityContext.withRunAsUser(65534) +
container.withArgs([
'--logtostderr',
'--secure-listen-address=' + krp.config.kubeRbacProxy.secureListenAddress,
'--tls-cipher-suites=' + std.join(',', krp.config.kubeRbacProxy.tlsCipherSuites),
'--upstream=' + krp.config.kubeRbacProxy.upstream,
]) +
container.withPorts(containerPort.newNamed(krp.config.kubeRbacProxy.securePort, krp.config.kubeRbacProxy.securePortName)),
],
},
},
},
},
deploymentMixin:: {
local dm = self,
config+:: {
kubeRbacProxy: {
image: error 'must provide image',
name: error 'must provide name',
securePortName: error 'must provide securePortName',
securePort: error 'must provide securePort',
secureListenAddress: error 'must provide secureListenAddress',
upstream: error 'must provide upstream',
tlsCipherSuites: error 'must provide tlsCipherSuites',
},
},
deployment+: krp.specMixin {
config+:: {
kubeRbacProxy+: dm.config.kubeRbacProxy,
},
},
},
statefulSetMixin:: {
local sm = self,
config+:: {
kubeRbacProxy: {
image: error 'must provide image',
name: error 'must provide name',
securePortName: error 'must provide securePortName',
securePort: error 'must provide securePort',
secureListenAddress: error 'must provide secureListenAddress',
upstream: error 'must provide upstream',
tlsCipherSuites: error 'must provide tlsCipherSuites',
},
},
statefulSet+: krp.specMixin {
config+:: {
kubeRbacProxy+: sm.config.kubeRbacProxy,
},
},
},
}

@@ -1,303 +1,129 @@
local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
{
_config+:: {
namespace: 'default',
kubeStateMetrics+:: {
collectors: '', // empty string gets a default set
scrapeInterval: '30s',
scrapeTimeout: '30s',
baseCPU: '100m',
baseMemory: '150Mi',
},
versions+:: {
kubeStateMetrics: 'v1.8.0',
kubeRbacProxy: 'v0.4.1',
kubeStateMetrics: '1.9.5',
},
imageRepos+:: {
kubeStateMetrics: 'quay.io/coreos/kube-state-metrics',
kubeRbacProxy: 'quay.io/coreos/kube-rbac-proxy',
},
kubeStateMetrics+:: {
scrapeInterval: '30s',
scrapeTimeout: '30s',
},
},
kubeStateMetrics+:: {
clusterRoleBinding:
local clusterRoleBinding = k.rbac.v1.clusterRoleBinding;
clusterRoleBinding.new() +
clusterRoleBinding.mixin.metadata.withName('kube-state-metrics') +
clusterRoleBinding.mixin.roleRef.withApiGroup('rbac.authorization.k8s.io') +
clusterRoleBinding.mixin.roleRef.withName('kube-state-metrics') +
clusterRoleBinding.mixin.roleRef.mixinInstance({ kind: 'ClusterRole' }) +
clusterRoleBinding.withSubjects([{ kind: 'ServiceAccount', name: 'kube-state-metrics', namespace: $._config.namespace }]),
clusterRole:
local clusterRole = k.rbac.v1.clusterRole;
local rulesType = clusterRole.rulesType;
local rules = [
rulesType.new() +
rulesType.withApiGroups(['']) +
rulesType.withResources([
'configmaps',
'secrets',
'nodes',
'pods',
'services',
'resourcequotas',
'replicationcontrollers',
'limitranges',
'persistentvolumeclaims',
'persistentvolumes',
'namespaces',
'endpoints',
]) +
rulesType.withVerbs(['list', 'watch']),
rulesType.new() +
rulesType.withApiGroups(['extensions']) +
rulesType.withResources([
'daemonsets',
'deployments',
'replicasets',
'ingresses',
]) +
rulesType.withVerbs(['list', 'watch']),
rulesType.new() +
rulesType.withApiGroups(['apps']) +
rulesType.withResources([
'statefulsets',
'daemonsets',
'deployments',
'replicasets',
]) +
rulesType.withVerbs(['list', 'watch']),
rulesType.new() +
rulesType.withApiGroups(['batch']) +
rulesType.withResources([
'cronjobs',
'jobs',
]) +
rulesType.withVerbs(['list', 'watch']),
rulesType.new() +
rulesType.withApiGroups(['autoscaling']) +
rulesType.withResources([
'horizontalpodautoscalers',
]) +
rulesType.withVerbs(['list', 'watch']),
rulesType.new() +
rulesType.withApiGroups(['authentication.k8s.io']) +
rulesType.withResources([
'tokenreviews',
]) +
rulesType.withVerbs(['create']),
rulesType.new() +
rulesType.withApiGroups(['authorization.k8s.io']) +
rulesType.withResources([
'subjectaccessreviews',
]) +
rulesType.withVerbs(['create']),
rulesType.new() +
rulesType.withApiGroups(['policy']) +
rulesType.withResources([
'poddisruptionbudgets',
]) +
rulesType.withVerbs(['list', 'watch']),
rulesType.new() +
rulesType.withApiGroups(['certificates.k8s.io']) +
rulesType.withResources([
'certificatesigningrequests',
]) +
rulesType.withVerbs(['list', 'watch']),
rulesType.new() +
rulesType.withApiGroups(['storage.k8s.io']) +
rulesType.withResources([
'storageclasses',
]) +
rulesType.withVerbs(['list', 'watch']),
];
clusterRole.new() +
clusterRole.mixin.metadata.withName('kube-state-metrics') +
clusterRole.withRules(rules),
deployment:
local deployment = k.apps.v1.deployment;
local container = deployment.mixin.spec.template.spec.containersType;
local volume = deployment.mixin.spec.template.spec.volumesType;
local containerPort = container.portsType;
local containerVolumeMount = container.volumeMountsType;
local podSelector = deployment.mixin.spec.template.spec.selectorType;
local podLabels = { app: 'kube-state-metrics' };
local proxyClusterMetrics =
container.new('kube-rbac-proxy-main', $._config.imageRepos.kubeRbacProxy + ':' + $._config.versions.kubeRbacProxy) +
container.withArgs([
'--logtostderr',
'--secure-listen-address=:8443',
'--tls-cipher-suites=' + std.join(',', $._config.tlsCipherSuites),
'--upstream=http://127.0.0.1:8081/',
]) +
container.withPorts(containerPort.newNamed(8443, 'https-main',)) +
container.mixin.resources.withRequests($._config.resources['kube-rbac-proxy'].requests) +
container.mixin.resources.withLimits($._config.resources['kube-rbac-proxy'].limits);
local proxySelfMetrics =
container.new('kube-rbac-proxy-self', $._config.imageRepos.kubeRbacProxy + ':' + $._config.versions.kubeRbacProxy) +
container.withArgs([
'--logtostderr',
'--secure-listen-address=:9443',
'--tls-cipher-suites=' + std.join(',', $._config.tlsCipherSuites),
'--upstream=http://127.0.0.1:8082/',
]) +
container.withPorts(containerPort.newNamed(9443, 'https-self',)) +
container.mixin.resources.withRequests($._config.resources['kube-rbac-proxy'].requests) +
container.mixin.resources.withLimits($._config.resources['kube-rbac-proxy'].limits);
local kubeStateMetrics =
container.new('kube-state-metrics', $._config.imageRepos.kubeStateMetrics + ':' + $._config.versions.kubeStateMetrics) +
container.withArgs([
'--host=127.0.0.1',
'--port=8081',
'--telemetry-host=127.0.0.1',
'--telemetry-port=8082',
] + if $._config.kubeStateMetrics.collectors != '' then ['--collectors=' + $._config.kubeStateMetrics.collectors] else []) +
container.mixin.resources.withRequests({ cpu: $._config.kubeStateMetrics.baseCPU, memory: $._config.kubeStateMetrics.baseMemory }) +
container.mixin.resources.withLimits({ cpu: $._config.kubeStateMetrics.baseCPU, memory: $._config.kubeStateMetrics.baseMemory });
local c = [proxyClusterMetrics, proxySelfMetrics, kubeStateMetrics];
deployment.new('kube-state-metrics', 1, c, podLabels) +
deployment.mixin.metadata.withNamespace($._config.namespace) +
deployment.mixin.metadata.withLabels(podLabels) +
deployment.mixin.spec.selector.withMatchLabels(podLabels) +
deployment.mixin.spec.template.spec.withNodeSelector({ 'kubernetes.io/os': 'linux' }) +
deployment.mixin.spec.template.spec.securityContext.withRunAsNonRoot(true) +
deployment.mixin.spec.template.spec.securityContext.withRunAsUser(65534) +
deployment.mixin.spec.template.spec.withServiceAccountName('kube-state-metrics'),
roleBinding:
local roleBinding = k.rbac.v1.roleBinding;
roleBinding.new() +
roleBinding.mixin.metadata.withName('kube-state-metrics') +
roleBinding.mixin.metadata.withNamespace($._config.namespace) +
roleBinding.mixin.roleRef.withApiGroup('rbac.authorization.k8s.io') +
roleBinding.mixin.roleRef.withName('kube-state-metrics') +
roleBinding.mixin.roleRef.mixinInstance({ kind: 'Role' }) +
roleBinding.withSubjects([{ kind: 'ServiceAccount', name: 'kube-state-metrics' }]),
role:
local role = k.rbac.v1.role;
local rulesType = role.rulesType;
local coreRule = rulesType.new() +
rulesType.withApiGroups(['']) +
rulesType.withResources([
'pods',
]) +
rulesType.withVerbs(['get']);
local extensionsRule = rulesType.new() +
rulesType.withApiGroups(['extensions']) +
rulesType.withResources([
'deployments',
]) +
rulesType.withVerbs(['get', 'update']) +
rulesType.withResourceNames(['kube-state-metrics']);
local appsRule = rulesType.new() +
rulesType.withApiGroups(['apps']) +
rulesType.withResources([
'deployments',
]) +
rulesType.withVerbs(['get', 'update']) +
rulesType.withResourceNames(['kube-state-metrics']);
local rules = [coreRule, extensionsRule, appsRule];
role.new() +
role.mixin.metadata.withName('kube-state-metrics') +
role.mixin.metadata.withNamespace($._config.namespace) +
role.withRules(rules),
serviceAccount:
local serviceAccount = k.core.v1.serviceAccount;
serviceAccount.new('kube-state-metrics') +
serviceAccount.mixin.metadata.withNamespace($._config.namespace),
service:
local service = k.core.v1.service;
local servicePort = service.mixin.spec.portsType;
local ksmServicePortMain = servicePort.newNamed('https-main', 8443, 'https-main');
local ksmServicePortSelf = servicePort.newNamed('https-self', 9443, 'https-self');
service.new('kube-state-metrics', $.kubeStateMetrics.deployment.spec.selector.matchLabels, [ksmServicePortMain, ksmServicePortSelf]) +
service.mixin.metadata.withNamespace($._config.namespace) +
service.mixin.metadata.withLabels({ 'k8s-app': 'kube-state-metrics' }) +
service.mixin.spec.withClusterIp('None'),
serviceMonitor:
{
apiVersion: 'monitoring.coreos.com/v1',
kind: 'ServiceMonitor',
metadata: {
name: 'kube-state-metrics',
namespace: $._config.namespace,
labels: {
'k8s-app': 'kube-state-metrics',
},
},
spec: {
jobLabel: 'k8s-app',
selector: {
matchLabels: {
'k8s-app': 'kube-state-metrics',
},
},
endpoints: [
{
port: 'https-main',
scheme: 'https',
interval: $._config.kubeStateMetrics.scrapeInterval,
scrapeTimeout: $._config.kubeStateMetrics.scrapeTimeout,
honorLabels: true,
bearerTokenFile: '/var/run/secrets/kubernetes.io/serviceaccount/token',
relabelings: [
{
regex: '(pod|service|endpoint|namespace)',
action: 'labeldrop',
},
],
tlsConfig: {
insecureSkipVerify: true,
},
},
{
port: 'https-self',
scheme: 'https',
interval: '30s',
bearerTokenFile: '/var/run/secrets/kubernetes.io/serviceaccount/token',
tlsConfig: {
insecureSkipVerify: true,
},
},
],
},
},
},
kubeStateMetrics+:: (import 'kube-state-metrics/kube-state-metrics.libsonnet') +
{
local ksm = self,
name:: 'kube-state-metrics',
namespace:: $._config.namespace,
version:: $._config.versions.kubeStateMetrics,
image:: $._config.imageRepos.kubeStateMetrics + ':v' + $._config.versions.kubeStateMetrics,
service+: {
spec+: {
ports: [
{
name: 'https-main',
port: 8443,
targetPort: 'https-main',
},
{
name: 'https-self',
port: 9443,
targetPort: 'https-self',
},
],
},
},
deployment+: {
spec+: {
template+: {
spec+: {
containers: std.map(function(c) c {
ports:: null,
livenessProbe:: null,
readinessProbe:: null,
args: ['--host=127.0.0.1', '--port=8081', '--telemetry-host=127.0.0.1', '--telemetry-port=8082'],
}, super.containers),
},
},
},
},
serviceMonitor:
{
apiVersion: 'monitoring.coreos.com/v1',
kind: 'ServiceMonitor',
metadata: {
name: 'kube-state-metrics',
namespace: $._config.namespace,
labels: {
'app.kubernetes.io/name': 'kube-state-metrics',
'app.kubernetes.io/version': ksm.version,
},
},
spec: {
jobLabel: 'app.kubernetes.io/name',
selector: {
matchLabels: {
'app.kubernetes.io/name': 'kube-state-metrics',
},
},
endpoints: [
{
port: 'https-main',
scheme: 'https',
interval: $._config.kubeStateMetrics.scrapeInterval,
scrapeTimeout: $._config.kubeStateMetrics.scrapeTimeout,
honorLabels: true,
bearerTokenFile: '/var/run/secrets/kubernetes.io/serviceaccount/token',
relabelings: [
{
regex: '(pod|service|endpoint|namespace)',
action: 'labeldrop',
},
],
tlsConfig: {
insecureSkipVerify: true,
},
},
{
port: 'https-self',
scheme: 'https',
interval: $._config.kubeStateMetrics.scrapeInterval,
bearerTokenFile: '/var/run/secrets/kubernetes.io/serviceaccount/token',
tlsConfig: {
insecureSkipVerify: true,
},
},
],
},
},
} +
((import 'kube-prometheus/kube-rbac-proxy/container.libsonnet') {
config+:: {
kubeRbacProxy: {
local cfg = self,
image: $._config.imageRepos.kubeRbacProxy + ':' + $._config.versions.kubeRbacProxy,
name: 'kube-rbac-proxy-main',
securePortName: 'https-main',
securePort: 8443,
secureListenAddress: ':%d' % self.securePort,
upstream: 'http://127.0.0.1:8081/',
tlsCipherSuites: $._config.tlsCipherSuites,
},
},
}).deploymentMixin +
((import 'kube-prometheus/kube-rbac-proxy/container.libsonnet') {
config+:: {
kubeRbacProxy: {
local cfg = self,
image: $._config.imageRepos.kubeRbacProxy + ':' + $._config.versions.kubeRbacProxy,
name: 'kube-rbac-proxy-self',
securePortName: 'https-self',
securePort: 9443,
secureListenAddress: ':%d' % self.securePort,
upstream: 'http://127.0.0.1:8082/',
tlsCipherSuites: $._config.tlsCipherSuites,
},
},
}).deploymentMixin,
}

View File

@@ -15,7 +15,17 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
},
nodeExporter+:: {
listenAddress: '127.0.0.1',
port: 9100,
labels: {
'app.kubernetes.io/name': 'node-exporter',
'app.kubernetes.io/version': $._config.versions.nodeExporter,
},
selectorLabels: {
[labelName]: $._config.nodeExporter.labels[labelName]
for labelName in std.objectFields($._config.nodeExporter.labels)
if !std.setMember(labelName, ['app.kubernetes.io/version'])
},
},
},
@@ -64,7 +74,8 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
local toleration = daemonset.mixin.spec.template.spec.tolerationsType;
local containerEnv = container.envType;
local podLabels = { app: 'node-exporter' };
local podLabels = $._config.nodeExporter.labels;
local selectorLabels = $._config.nodeExporter.selectorLabels;
local existsToleration = toleration.new() +
toleration.withOperator('Exists');
@@ -85,16 +96,13 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
local nodeExporter =
container.new('node-exporter', $._config.imageRepos.nodeExporter + ':' + $._config.versions.nodeExporter) +
container.withArgs([
'--web.listen-address=127.0.0.1:' + $._config.nodeExporter.port,
'--web.listen-address=' + std.join(':', [$._config.nodeExporter.listenAddress, std.toString($._config.nodeExporter.port)]),
'--path.procfs=/host/proc',
'--path.sysfs=/host/sys',
'--path.rootfs=/host/root',
// The following settings have been taken from
// https://github.com/prometheus/node_exporter/blob/0662673/collector/filesystem_linux.go#L30-L31
// Once node exporter is being released with those settings, this can be removed.
'--collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+)($|/)',
'--collector.filesystem.ignored-fs-types=^(autofs|binfmt_misc|cgroup|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|mqueue|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|sysfs|tracefs)$',
'--no-collector.wifi',
'--no-collector.hwmon',
'--collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+|var/lib/kubelet/pods/.+)($|/)',
]) +
container.withVolumeMounts([procVolumeMount, sysVolumeMount, rootVolumeMount]) +
container.mixin.resources.withRequests($._config.resources['node-exporter'].requests) +
@@ -105,7 +113,7 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
container.new('kube-rbac-proxy', $._config.imageRepos.kubeRbacProxy + ':' + $._config.versions.kubeRbacProxy) +
container.withArgs([
'--logtostderr',
'--secure-listen-address=$(IP):' + $._config.nodeExporter.port,
'--secure-listen-address=[$(IP)]:' + $._config.nodeExporter.port,
'--tls-cipher-suites=' + std.join(',', $._config.tlsCipherSuites),
'--upstream=http://127.0.0.1:' + $._config.nodeExporter.port + '/',
]) +
@@ -128,7 +136,8 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
daemonset.mixin.metadata.withName('node-exporter') +
daemonset.mixin.metadata.withNamespace($._config.namespace) +
daemonset.mixin.metadata.withLabels(podLabels) +
daemonset.mixin.spec.selector.withMatchLabels(podLabels) +
daemonset.mixin.spec.selector.withMatchLabels(selectorLabels) +
daemonset.mixin.spec.updateStrategy.rollingUpdate.withMaxUnavailable('10%') +
daemonset.mixin.spec.template.metadata.withLabels(podLabels) +
daemonset.mixin.spec.template.spec.withTolerations([existsToleration]) +
daemonset.mixin.spec.template.spec.withNodeSelector({ 'kubernetes.io/os': 'linux' }) +
@@ -153,22 +162,18 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
metadata: {
name: 'node-exporter',
namespace: $._config.namespace,
labels: {
'k8s-app': 'node-exporter',
},
labels: $._config.nodeExporter.labels,
},
spec: {
jobLabel: 'k8s-app',
jobLabel: 'app.kubernetes.io/name',
selector: {
matchLabels: {
'k8s-app': 'node-exporter',
},
matchLabels: $._config.nodeExporter.selectorLabels,
},
endpoints: [
{
port: 'https',
scheme: 'https',
interval: '30s',
interval: '15s',
bearerTokenFile: '/var/run/secrets/kubernetes.io/serviceaccount/token',
relabelings: [
{
@@ -193,9 +198,9 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
local nodeExporterPort = servicePort.newNamed('https', $._config.nodeExporter.port, 'https');
service.new('node-exporter', $.nodeExporter.daemonset.spec.selector.matchLabels, nodeExporterPort) +
service.new('node-exporter', $._config.nodeExporter.selectorLabels, nodeExporterPort) +
service.mixin.metadata.withNamespace($._config.namespace) +
service.mixin.metadata.withLabels({ 'k8s-app': 'node-exporter' }) +
service.mixin.metadata.withLabels($._config.nodeExporter.labels) +
service.mixin.spec.withClusterIp('None'),
},
}
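
Note on the `selectorLabels` comprehension above: it copies every entry of `labels` except `app.kubernetes.io/version`, and the DaemonSet, Service and ServiceMonitor selectors then use that reduced set. The likely motivation is that a version bump should not change the selector. The following is a minimal Python model of that filtering and of the `std.join`-based listen address, not the Jsonnet itself; the function names are hypothetical and the sample values are taken from this diff.

```python
def selector_labels(labels, excluded=("app.kubernetes.io/version",)):
    """Model of the Jsonnet object comprehension: keep every label whose
    name is not in `excluded`. Keys are visited in sorted order, which is
    also the order std.objectFields returns them in."""
    return {name: labels[name] for name in sorted(labels) if name not in excluded}


def listen_address(host, port):
    """Model of std.join(':', [listenAddress, std.toString(port)])."""
    return ":".join([host, str(port)])


labels = {
    "app.kubernetes.io/name": "node-exporter",
    "app.kubernetes.io/version": "v0.18.1",
}

selectors = selector_labels(labels)
address_flag = "--web.listen-address=" + listen_address("127.0.0.1", 9100)

print(selectors)
print(address_flag)
```

With this split, pod templates and metadata carry both labels, while anything that selects pods matches on the name label only.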

View File

@@ -5,45 +5,58 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
namespace: 'default',
versions+:: {
prometheusAdapter: 'v0.5.0',
prometheusAdapter: 'v0.7.0',
},
imageRepos+:: {
prometheusAdapter: 'quay.io/coreos/k8s-prometheus-adapter-amd64',
prometheusAdapter: 'directxman12/k8s-prometheus-adapter',
},
prometheusAdapter+:: {
name: 'prometheus-adapter',
labels: { name: $._config.prometheusAdapter.name },
prometheusURL: 'http://prometheus-' + $._config.prometheus.name + '.' + $._config.namespace + '.svc:9090/',
config: |||
resourceRules:
cpu:
containerQuery: sum(rate(container_cpu_usage_seconds_total{<<.LabelMatchers>>,container!="POD",container!="",pod!=""}[5m])) by (<<.GroupBy>>)
nodeQuery: sum(1 - rate(node_cpu_seconds_total{mode="idle"}[5m]) * on(namespace, pod) group_left(node) node_namespace_pod:kube_pod_info:{<<.LabelMatchers>>}) by (<<.GroupBy>>)
resources:
overrides:
node:
resource: node
namespace:
resource: namespace
pod:
resource: pod
containerLabel: container
memory:
containerQuery: sum(container_memory_working_set_bytes{<<.LabelMatchers>>,container!="POD",container!="",pod!=""}) by (<<.GroupBy>>)
nodeQuery: sum(node_memory_MemTotal_bytes{job="node-exporter",<<.LabelMatchers>>} - node_memory_MemAvailable_bytes{job="node-exporter",<<.LabelMatchers>>}) by (<<.GroupBy>>)
resources:
overrides:
instance:
resource: node
namespace:
resource: namespace
pod:
resource: pod
containerLabel: container
window: 5m
|||,
prometheusURL: 'http://prometheus-' + $._config.prometheus.name + '.' + $._config.namespace + '.svc.cluster.local:9090/',
config: {
resourceRules: {
cpu: {
containerQuery: 'sum(irate(container_cpu_usage_seconds_total{<<.LabelMatchers>>,container!="POD",container!="",pod!=""}[5m])) by (<<.GroupBy>>)',
nodeQuery: 'sum(1 - irate(node_cpu_seconds_total{mode="idle"}[5m]) * on(namespace, pod) group_left(node) node_namespace_pod:kube_pod_info:{<<.LabelMatchers>>}) by (<<.GroupBy>>)',
resources: {
overrides: {
node: {
resource: 'node'
},
namespace: {
resource: 'namespace'
},
pod: {
resource: 'pod'
},
},
},
containerLabel: 'container'
},
memory: {
containerQuery: 'sum(container_memory_working_set_bytes{<<.LabelMatchers>>,container!="POD",container!="",pod!=""}) by (<<.GroupBy>>)',
nodeQuery: 'sum(node_memory_MemTotal_bytes{job="node-exporter",<<.LabelMatchers>>} - node_memory_MemAvailable_bytes{job="node-exporter",<<.LabelMatchers>>}) by (<<.GroupBy>>)',
resources: {
overrides: {
instance: {
resource: 'node'
},
namespace: {
resource: 'namespace'
},
pod: {
resource: 'pod'
},
},
},
containerLabel: 'container'
},
window: '5m',
},
}
},
},
@@ -70,10 +83,37 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
configMap:
local configmap = k.core.v1.configMap;
configmap.new('adapter-config', { 'config.yaml': std.manifestYamlDoc($._config.prometheusAdapter.config) }) +
configmap.new('adapter-config', { 'config.yaml': $._config.prometheusAdapter.config }) +
configmap.mixin.metadata.withNamespace($._config.namespace),
serviceMonitor:
{
apiVersion: 'monitoring.coreos.com/v1',
kind: 'ServiceMonitor',
metadata: {
name: $._config.prometheusAdapter.name,
namespace: $._config.namespace,
labels: $._config.prometheusAdapter.labels,
},
spec: {
selector: {
matchLabels: $._config.prometheusAdapter.labels,
},
endpoints: [
{
port: 'https',
interval: '30s',
scheme: 'https',
tlsConfig: {
insecureSkipVerify: true,
},
bearerTokenFile: '/var/run/secrets/kubernetes.io/serviceaccount/token',
},
],
},
},
service:
local service = k.core.v1.service;
local servicePort = k.core.v1.service.mixin.spec.portsType;
@@ -191,7 +231,7 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
local rules =
policyRule.new() +
policyRule.withApiGroups(['metrics.k8s.io']) +
policyRule.withResources(['pods']) +
policyRule.withResources(['pods', 'nodes']) +
policyRule.withVerbs(['get','list','watch']);
clusterRole.new() +

View File

@@ -6,7 +6,7 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
namespace: 'default',
versions+:: {
prometheus: 'v2.11.0',
prometheus: 'v2.20.0',
},
imageRepos+:: {
@@ -160,6 +160,7 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
local resourceRequirements = container.mixin.resourcesType;
local selector = statefulSet.mixin.spec.selectorType;
local resources =
resourceRequirements.new() +
resourceRequirements.withRequests({ memory: '400Mi' });
@@ -177,11 +178,12 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
spec: {
replicas: p.replicas,
version: $._config.versions.prometheus,
baseImage: $._config.imageRepos.prometheus,
image: $._config.imageRepos.prometheus + ':' + $._config.versions.prometheus,
serviceAccountName: 'prometheus-' + p.name,
serviceMonitorSelector: {},
podMonitorSelector: {},
serviceMonitorNamespaceSelector: {},
podMonitorNamespaceSelector: {},
nodeSelector: { 'kubernetes.io/os': 'linux' },
ruleSelector: selector.withMatchLabels({
role: 'alert-rules',
@@ -244,8 +246,13 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
jobLabel: 'k8s-app',
endpoints: [
{
port: 'http-metrics',
port: 'https-metrics',
interval: '30s',
scheme: "https",
bearerTokenFile: "/var/run/secrets/kubernetes.io/serviceaccount/token",
tlsConfig: {
insecureSkipVerify: true
}
},
],
selector: {
@@ -283,10 +290,11 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
insecureSkipVerify: true,
},
bearerTokenFile: '/var/run/secrets/kubernetes.io/serviceaccount/token',
metricRelabelings: (import 'kube-prometheus/dropping-deprecated-metrics-relabelings.libsonnet'),
relabelings: [
{
sourceLabels: ['__metrics_path__'],
targetLabel: 'metrics_path'
targetLabel: 'metrics_path',
},
],
},
@@ -303,7 +311,7 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
relabelings: [
{
sourceLabels: ['__metrics_path__'],
targetLabel: 'metrics_path'
targetLabel: 'metrics_path',
},
],
metricRelabelings: [
@@ -314,6 +322,40 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
regex: 'container_(network_tcp_usage_total|network_udp_usage_total|tasks_state|cpu_load_average_10s)',
action: 'drop',
},
// Drop cAdvisor metrics with no (pod, namespace) labels while preserving ability to monitor system services resource usage (cardinality estimation)
{
sourceLabels: ['__name__', 'pod', 'namespace'],
action: 'drop',
regex: '(' + std.join('|',
[
'container_fs_.*', // add filesystem read/write data (nodes*disks*services*4)
'container_spec_.*', // everything related to cgroup specification and thus static data (nodes*services*5)
'container_blkio_device_usage_total', // useful for containers, but not for system services (nodes*disks*services*operations*2)
'container_file_descriptors', // file descriptors limits and global numbers are exposed via (nodes*services)
'container_sockets', // used sockets in cgroup. Usually not important for system services (nodes*services)
'container_threads_max', // max number of threads in cgroup. Usually for system services it is not limited (nodes*services)
'container_threads', // used threads in cgroup. Usually not important for system services (nodes*services)
'container_start_time_seconds', // container start. Possibly not needed for system services (nodes*services)
'container_last_seen', // not needed as system services are always running (nodes*services)
]) + ');;',
},
],
},
{
port: 'https-metrics',
scheme: 'https',
path: '/metrics/probes',
interval: '30s',
honorLabels: true,
tlsConfig: {
insecureSkipVerify: true,
},
bearerTokenFile: '/var/run/secrets/kubernetes.io/serviceaccount/token',
relabelings: [
{
sourceLabels: ['__metrics_path__'],
targetLabel: 'metrics_path',
},
],
},
],
@@ -344,9 +386,14 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
jobLabel: 'k8s-app',
endpoints: [
{
port: 'http-metrics',
port: 'https-metrics',
interval: '30s',
metricRelabelings: [
scheme: "https",
bearerTokenFile: "/var/run/secrets/kubernetes.io/serviceaccount/token",
tlsConfig: {
insecureSkipVerify: true
},
metricRelabelings: (import 'kube-prometheus/dropping-deprecated-metrics-relabelings.libsonnet') + [
{
sourceLabels: ['__name__'],
regex: 'etcd_(debugging|disk|request|server).*',
@@ -401,10 +448,10 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
serverName: 'kubernetes',
},
bearerTokenFile: '/var/run/secrets/kubernetes.io/serviceaccount/token',
metricRelabelings: [
metricRelabelings: (import 'kube-prometheus/dropping-deprecated-metrics-relabelings.libsonnet') + [
{
sourceLabels: ['__name__'],
regex: 'etcd_(debugging|disk|request|server).*',
regex: 'etcd_(debugging|disk|server).*',
action: 'drop',
},
{
@@ -417,6 +464,11 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
regex: 'apiserver_admission_step_admission_latencies_seconds_.*',
action: 'drop',
},
{
sourceLabels: ['__name__', 'le'],
regex: 'apiserver_request_duration_seconds_bucket;(0.15|0.25|0.3|0.35|0.4|0.45|0.6|0.7|0.8|0.9|1.25|1.5|1.75|2.5|3|3.5|4.5|6|7|8|9|15|25|30|50)',
action: 'drop',
},
],
},
],
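
Note on the `drop` rules added in this file: Prometheus relabeling joins the values of `sourceLabels` with `;` (the default separator), treats a missing label as an empty string, and requires the regex to match the whole joined string. The trailing `;;` in the cAdvisor rule therefore only matches series whose `pod` and `namespace` labels are both empty. The sketch below models that behaviour in Python with `re.fullmatch`; it is an approximation for illustration (Prometheus uses RE2, not Python's `re`), and `is_dropped` is a hypothetical helper.

```python
import re

# Metric name patterns copied from the cAdvisor drop rule in this diff.
CADVISOR_NAMES = [
    "container_fs_.*",
    "container_spec_.*",
    "container_blkio_device_usage_total",
    "container_file_descriptors",
    "container_sockets",
    "container_threads_max",
    "container_threads",
    "container_start_time_seconds",
    "container_last_seen",
]
# Same construction as '(' + std.join('|', [...]) + ');;' in the Jsonnet.
CADVISOR_REGEX = "(" + "|".join(CADVISOR_NAMES) + ");;"

# Histogram bucket rule from the apiserver endpoint, copied from this diff.
BUCKET_REGEX = (
    "apiserver_request_duration_seconds_bucket;"
    "(0.15|0.25|0.3|0.35|0.4|0.45|0.6|0.7|0.8|0.9|1.25|1.5|1.75|2.5|3|3.5|4.5"
    "|6|7|8|9|15|25|30|50)"
)


def is_dropped(sample, source_labels, regex):
    """Join the source label values with ';' (missing labels become '')
    and report whether the regex matches the entire joined string."""
    joined = ";".join(sample.get(label, "") for label in source_labels)
    return re.fullmatch(regex, joined) is not None


CADVISOR_LABELS = ("__name__", "pod", "namespace")
BUCKET_LABELS = ("__name__", "le")

system_service = {"__name__": "container_threads"}
pod_series = {"__name__": "container_threads", "pod": "p", "namespace": "ns"}
cpu_series = {"__name__": "container_cpu_usage_seconds_total"}

print(is_dropped(system_service, CADVISOR_LABELS, CADVISOR_REGEX))
print(is_dropped(pod_series, CADVISOR_LABELS, CADVISOR_REGEX))
print(is_dropped(cpu_series, CADVISOR_LABELS, CADVISOR_REGEX))
```

In this model a listed metric is dropped only when it has no pod and namespace, so CPU and memory series for system services are kept.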

View File

@@ -0,0 +1,19 @@
{
prometheusRules+:: {
groups+: [
{
name: 'kube-prometheus-general.rules',
rules: [
{
expr: 'count without(instance, pod, node) (up == 1)',
record: 'count:up1',
},
{
expr: 'count without(instance, pod, node) (up == 0)',
record: 'count:up0',
},
],
},
],
},
}
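
Note on the two recording rules above: `count without(instance, pod, node) (up == 1)` keeps the series whose value is 1, removes the three listed labels, and counts how many series share each remaining label set. `count:up0` does the same for value 0. The sketch below is a rough Python model of that aggregation over made-up sample data; `count_without` and the sample series are hypothetical and only illustrate the grouping.

```python
from collections import Counter


def count_without(series, wanted_value, dropped=("instance", "pod", "node")):
    """Count series whose value equals `wanted_value`, grouped by the
    labels that remain after removing the `dropped` label names."""
    counts = Counter()
    for labels, value in series:
        if value != wanted_value:
            continue
        group = tuple(sorted((k, v) for k, v in labels.items() if k not in dropped))
        counts[group] += 1
    return dict(counts)


# Hypothetical `up` samples: two healthy node-exporter targets, one failing.
up_samples = [
    ({"job": "node-exporter", "instance": "10.0.0.1:9100"}, 1),
    ({"job": "node-exporter", "instance": "10.0.0.2:9100"}, 1),
    ({"job": "node-exporter", "instance": "10.0.0.3:9100"}, 0),
]

count_up1 = count_without(up_samples, wanted_value=1)
count_up0 = count_without(up_samples, wanted_value=0)
print(count_up1)
print(count_up0)
```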

View File

@@ -8,10 +8,6 @@
expr: 'sum(rate(node_cpu_seconds_total{mode!="idle",mode!="iowait"}[3m])) BY (instance)',
record: 'instance:node_cpu:rate:sum',
},
{
expr: 'sum((node_filesystem_size_bytes{mountpoint="/"} - node_filesystem_free_bytes{mountpoint="/"})) BY (instance)',
record: 'instance:node_filesystem_usage:sum',
},
{
expr: 'sum(rate(node_network_receive_bytes_total[3m])) BY (instance)',
record: 'instance:node_network_receive_bytes:rate:sum',

View File

@@ -1 +1,2 @@
(import 'node-rules.libsonnet')
(import 'node-rules.libsonnet') +
(import 'general.libsonnet')

View File

@@ -1,7 +1,7 @@
{
"version": 1,
"dependencies": [
{
"name": "kube-prometheus",
"source": {
"local": {
"directory": "jsonnet/kube-prometheus"
@@ -9,5 +9,6 @@
},
"version": ""
}
]
],
"legacyImports": true
}

View File

@@ -1,123 +1,136 @@
{
"version": 1,
"dependencies": [
{
"name": "etcd-mixin",
"source": {
"git": {
"remote": "https://github.com/coreos/etcd",
"subdir": "Documentation/etcd-mixin"
}
},
"version": "cbc1340af53f50728181f97f6bce442ac33d8993",
"sum": "bkp18AxkOUYnVC15Gh9EoIi+mMAn0IT3hMzb8mlzpSw="
},
{
"name": "grafana",
"source": {
"git": {
"remote": "https://github.com/brancz/kubernetes-grafana",
"remote": "https://github.com/brancz/kubernetes-grafana.git",
"subdir": "grafana"
}
},
"version": "539a90dbf63c812ad0194d8078dd776868a11c81",
"sum": "b8faWX1qqLGyN67sA36oRqYZ5HX+tHBRMPtrWRqIysE="
"version": "57b4365eacda291b82e0d55ba7eec573a8198dda",
"sum": "92DWADwGjnCfpZaL7Q07C0GZayxBziGla/O03qWea34="
},
{
"name": "grafana-builder",
"source": {
"git": {
"remote": "https://github.com/grafana/jsonnet-libs",
"subdir": "grafana-builder"
"remote": "https://github.com/coreos/etcd.git",
"subdir": "Documentation/etcd-mixin"
}
},
"version": "67ab3dc52f3cdbc3b29d30afd3261375b5ad13fd",
"sum": "ELsYwK+kGdzX1mee2Yy+/b2mdO4Y503BOCDkFzwmGbE="
"version": "e8ba375032e8e48d009759dfb285f7812e7bcb8c",
"sum": "EgKKzxcW3ttt7gjPMX//DNTqNcn/0o2VAIaWJ/HSLEc="
},
{
"name": "grafonnet",
"source": {
"git": {
"remote": "https://github.com/grafana/grafonnet-lib",
"remote": "https://github.com/coreos/prometheus-operator.git",
"subdir": "jsonnet/prometheus-operator"
}
},
"version": "cd331ce9bb58bb926e391c6ae807621cb12cc29e",
"sum": "nM1eDP5vftqAeQSmVYzSBAh+lG0SN6zu46QiocQiVhk="
},
{
"source": {
"git": {
"remote": "https://github.com/grafana/grafonnet-lib.git",
"subdir": "grafonnet"
}
},
"version": "b82411476842f583817e67feff5becf1228fd540",
"sum": "mEosZ6hZCTCw8AaASEtRFjY8PSmpvqI3xj6IWpwcroU="
"version": "3626fc4dc2326931c530861ac5bebe39444f6cbf",
"sum": "gF8foHByYcB25jcUOBqP6jxk0OPifQMjPvKY0HaCk6w="
},
{
"name": "ksonnet",
"source": {
"git": {
"remote": "https://github.com/ksonnet/ksonnet-lib",
"remote": "https://github.com/grafana/jsonnet-libs.git",
"subdir": "grafana-builder"
}
},
"version": "2ed138b205717af721af57b572bc7cd63bda62fd",
"sum": "U34Nd1ViO2LZ3D8IzygPPRfUcy6zOgCnTMVHZ+9O/QE="
},
{
"source": {
"git": {
"remote": "https://github.com/ksonnet/ksonnet-lib.git",
"subdir": ""
}
},
"version": "0d2f82676817bbf9e4acf6495b2090205f323b9f",
"sum": "h28BXZ7+vczxYJ2sCt8JuR9+yznRtU/iA6DCpQUrtEg="
"sum": "h28BXZ7+vczxYJ2sCt8JuR9+yznRtU/iA6DCpQUrtEg=",
"name": "ksonnet"
},
{
"source": {
"git": {
"remote": "https://github.com/kubernetes-monitoring/kubernetes-mixin.git",
"subdir": ""
}
},
"version": "7acc2fa2cad8d0038646e23656986dfb179cfa78",
"sum": "Of/1Y2kgQZSI/wutrkLtsq6GOMzbYXOilcTEMqaUXCQ="
},
{
"source": {
"git": {
"remote": "https://github.com/kubernetes-monitoring/kubernetes-mixin.git",
"subdir": "lib/promgrafonnet"
}
},
"version": "06d00e40b43e4e618afbebe8e453b5650c659015",
"sum": "zv7hXGui6BfHzE9wPatHI/AGZa4A2WKo6pq7ZdqBsps="
},
{
"source": {
"git": {
"remote": "https://github.com/kubernetes/kube-state-metrics.git",
"subdir": "jsonnet/kube-state-metrics"
}
},
"version": "e72315512a38653b19dcfe4429f93eadedc0ea96",
"sum": "zD/pbQLnQq+5hegEelaheHS8mn1h09GTktFO74iwlBI="
},
{
"source": {
"git": {
"remote": "https://github.com/kubernetes/kube-state-metrics.git",
"subdir": "jsonnet/kube-state-metrics-mixin"
}
},
"version": "e72315512a38653b19dcfe4429f93eadedc0ea96",
"sum": "E1GGavnf9PCWBm4WVrxWnc0FIj72UcbcweqGioWrOdU="
},
{
"source": {
"git": {
"remote": "https://github.com/prometheus/node_exporter.git",
"subdir": "docs/node-mixin"
}
},
"version": "ff2ff3410f4ea8195e51f5fb8d84151684f91b3f",
"sum": "znDrZiHvvascm7Xuj3lTASIOfwX4Vmx7PELmKKw4YiI="
},
{
"source": {
"git": {
"remote": "https://github.com/prometheus/prometheus.git",
"subdir": "documentation/prometheus-mixin"
}
},
"version": "983ebb4a513302315a8117932ab832815f85e3d2",
"sum": "TBq4SL7YsPInARbJqwz25JaBvvAegcnRCsuz3K9niWc=",
"name": "prometheus"
},
{
"name": "kube-prometheus",
"source": {
"local": {
"directory": "jsonnet/kube-prometheus"
}
},
"version": ""
},
{
"name": "kubernetes-mixin",
"source": {
"git": {
"remote": "https://github.com/kubernetes-monitoring/kubernetes-mixin",
"subdir": ""
}
},
"version": "325f8a46fac9605f1de8bc20ca811cb92d1ef7e5",
"sum": "qfm0EpLrEZ1+fe93LFLa9tyOalK6JehpholxO2d0xXU="
},
{
"name": "node-mixin",
"source": {
"git": {
"remote": "https://github.com/prometheus/node_exporter",
"subdir": "docs/node-mixin"
}
},
"version": "20fe5bfb5be4caf3c8c11533b7fb35cb97d810f5",
"sum": "7vEamDTP9AApeiF4Zu9ZyXzDIs3rYHzwf9k7g8X+wsg="
},
{
"name": "prometheus",
"source": {
"git": {
"remote": "https://github.com/prometheus/prometheus",
"subdir": "documentation/prometheus-mixin"
}
},
"version": "431844f0a7c289e4255a68f09a18fcca09637fb2",
"sum": "wSDLAXS5Xzla9RFRE2IW5mRToeRFULHb7dSYYBDfEsM="
},
{
"name": "prometheus-operator",
"source": {
"git": {
"remote": "https://github.com/coreos/prometheus-operator",
"subdir": "jsonnet/prometheus-operator"
}
},
"version": "8d44e0990230144177f97cf62ae4f43b1c4e3168",
"sum": "5U7/8MD3pF9O0YDTtUhg4vctkUBRVFxZxWUyhtNiBM8="
},
{
"name": "promgrafonnet",
"source": {
"git": {
"remote": "https://github.com/kubernetes-monitoring/kubernetes-mixin",
"subdir": "lib/promgrafonnet"
}
},
"version": "325f8a46fac9605f1de8bc20ca811cb92d1ef7e5",
"sum": "VhgBM39yv0f4bKv8VfGg4FXkg573evGDRalip9ypKbc="
}
]
],
"legacyImports": false
}

View File

@@ -16,8 +16,6 @@ resources:
- ./manifests/kube-state-metrics-clusterRole.yaml
- ./manifests/kube-state-metrics-clusterRoleBinding.yaml
- ./manifests/kube-state-metrics-deployment.yaml
- ./manifests/kube-state-metrics-role.yaml
- ./manifests/kube-state-metrics-roleBinding.yaml
- ./manifests/kube-state-metrics-service.yaml
- ./manifests/kube-state-metrics-serviceAccount.yaml
- ./manifests/kube-state-metrics-serviceMonitor.yaml
@@ -38,6 +36,7 @@ resources:
- ./manifests/prometheus-adapter-roleBindingAuthReader.yaml
- ./manifests/prometheus-adapter-service.yaml
- ./manifests/prometheus-adapter-serviceAccount.yaml
- ./manifests/prometheus-adapter-serviceMonitor.yaml
- ./manifests/prometheus-clusterRole.yaml
- ./manifests/prometheus-clusterRoleBinding.yaml
- ./manifests/prometheus-operator-serviceMonitor.yaml
@@ -58,9 +57,11 @@ resources:
- ./manifests/setup/0namespace-namespace.yaml
- ./manifests/setup/prometheus-operator-0alertmanagerCustomResourceDefinition.yaml
- ./manifests/setup/prometheus-operator-0podmonitorCustomResourceDefinition.yaml
- ./manifests/setup/prometheus-operator-0probeCustomResourceDefinition.yaml
- ./manifests/setup/prometheus-operator-0prometheusCustomResourceDefinition.yaml
- ./manifests/setup/prometheus-operator-0prometheusruleCustomResourceDefinition.yaml
- ./manifests/setup/prometheus-operator-0servicemonitorCustomResourceDefinition.yaml
- ./manifests/setup/prometheus-operator-0thanosrulerCustomResourceDefinition.yaml
- ./manifests/setup/prometheus-operator-clusterRole.yaml
- ./manifests/setup/prometheus-operator-clusterRoleBinding.yaml
- ./manifests/setup/prometheus-operator-deployment.yaml

View File

@@ -6,7 +6,7 @@ metadata:
name: main
namespace: monitoring
spec:
baseImage: quay.io/prometheus/alertmanager
image: quay.io/prometheus/alertmanager:v0.21.0
nodeSelector:
kubernetes.io/os: linux
replicas: 3
@@ -15,4 +15,4 @@ spec:
runAsNonRoot: true
runAsUser: 1000
serviceAccountName: alertmanager-main
version: v0.18.0
version: v0.21.0

View File

@@ -1,8 +1,44 @@
apiVersion: v1
data:
alertmanager.yaml: Imdsb2JhbCI6CiAgInJlc29sdmVfdGltZW91dCI6ICI1bSIKInJlY2VpdmVycyI6Ci0gIm5hbWUiOiAibnVsbCIKInJvdXRlIjoKICAiZ3JvdXBfYnkiOgogIC0gImpvYiIKICAiZ3JvdXBfaW50ZXJ2YWwiOiAiNW0iCiAgImdyb3VwX3dhaXQiOiAiMzBzIgogICJyZWNlaXZlciI6ICJudWxsIgogICJyZXBlYXRfaW50ZXJ2YWwiOiAiMTJoIgogICJyb3V0ZXMiOgogIC0gIm1hdGNoIjoKICAgICAgImFsZXJ0bmFtZSI6ICJXYXRjaGRvZyIKICAgICJyZWNlaXZlciI6ICJudWxsIg==
data: {}
kind: Secret
metadata:
name: alertmanager-main
namespace: monitoring
stringData:
alertmanager.yaml: |-
"global":
"resolve_timeout": "5m"
"inhibit_rules":
- "equal":
- "namespace"
- "alertname"
"source_match":
"severity": "critical"
"target_match_re":
"severity": "warning|info"
- "equal":
- "namespace"
- "alertname"
"source_match":
"severity": "warning"
"target_match_re":
"severity": "info"
"receivers":
- "name": "Default"
- "name": "Watchdog"
- "name": "Critical"
"route":
"group_by":
- "namespace"
"group_interval": "5m"
"group_wait": "30s"
"receiver": "Default"
"repeat_interval": "12h"
"routes":
- "match":
"alertname": "Watchdog"
"receiver": "Watchdog"
- "match":
"severity": "critical"
"receiver": "Critical"
type: Opaque
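
Note on the Secret change above: values under a Secret's `data` field are base64-encoded, while values under `stringData` are written as plain text and encoded by the API server on write. Moving `alertmanager.yaml` to `stringData` makes the configuration readable in the manifest. The sketch below shows the round trip in Python on a short excerpt of the config; the helper names are hypothetical.

```python
import base64


def to_data_field(plain_text):
    """Encode a plain-text value the way it would appear under `data`."""
    return base64.b64encode(plain_text.encode("utf-8")).decode("ascii")


def from_data_field(encoded):
    """Decode a `data` value back to the text one would put in `stringData`."""
    return base64.b64decode(encoded).decode("utf-8")


# Short excerpt of the Alertmanager config shown in this diff.
snippet = '"global":\n  "resolve_timeout": "5m"\n'

encoded = to_data_field(snippet)
decoded = from_data_field(encoded)

print(encoded)
print(decoded)
```

The same `from_data_field` call can be applied to the removed `data.alertmanager.yaml` value to read the previous configuration.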

File diff suppressed because it is too large

View File

@@ -5,7 +5,7 @@ data:
"apiVersion": 1,
"providers": [
{
"folder": "",
"folder": "Default",
"name": "0",
"options": {
"path": "/grafana-dashboard-definitions/0"

View File

@@ -16,7 +16,8 @@ spec:
app: grafana
spec:
containers:
- image: grafana/grafana:6.4.3
- env: []
image: grafana/grafana:7.1.0
name: grafana
ports:
- containerPort: 3000
@@ -93,9 +94,6 @@ spec:
- mountPath: /grafana-dashboard-definitions/0/pod-total
name: grafana-dashboard-pod-total
readOnly: false
- mountPath: /grafana-dashboard-definitions/0/pods
name: grafana-dashboard-pods
readOnly: false
- mountPath: /grafana-dashboard-definitions/0/prometheus-remote-write
name: grafana-dashboard-prometheus-remote-write
readOnly: false
@@ -180,9 +178,6 @@ spec:
- configMap:
name: grafana-dashboard-pod-total
name: grafana-dashboard-pod-total
- configMap:
name: grafana-dashboard-pods
name: grafana-dashboard-pods
- configMap:
name: grafana-dashboard-prometheus-remote-write
name: grafana-dashboard-prometheus-remote-write

View File

@@ -1,6 +1,9 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
labels:
app.kubernetes.io/name: kube-state-metrics
app.kubernetes.io/version: v1.9.5
name: kube-state-metrics
rules:
- apiGroups:
@@ -86,6 +89,22 @@ rules:
- storage.k8s.io
resources:
- storageclasses
- volumeattachments
verbs:
- list
- watch
- apiGroups:
- admissionregistration.k8s.io
resources:
- mutatingwebhookconfigurations
- validatingwebhookconfigurations
verbs:
- list
- watch
- apiGroups:
- networking.k8s.io
resources:
- networkpolicies
verbs:
- list
- watch

View File

@@ -1,6 +1,9 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
labels:
app.kubernetes.io/name: kube-state-metrics
app.kubernetes.io/version: v1.9.5
name: kube-state-metrics
roleRef:
apiGroup: rbac.authorization.k8s.io

View File

@@ -2,71 +2,53 @@ apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: kube-state-metrics
app.kubernetes.io/name: kube-state-metrics
app.kubernetes.io/version: v1.9.5
name: kube-state-metrics
namespace: monitoring
spec:
replicas: 1
selector:
matchLabels:
app: kube-state-metrics
app.kubernetes.io/name: kube-state-metrics
template:
metadata:
labels:
app: kube-state-metrics
app.kubernetes.io/name: kube-state-metrics
app.kubernetes.io/version: v1.9.5
spec:
containers:
- args:
- --host=127.0.0.1
- --port=8081
- --telemetry-host=127.0.0.1
- --telemetry-port=8082
image: quay.io/coreos/kube-state-metrics:v1.9.5
name: kube-state-metrics
- args:
- --logtostderr
- --secure-listen-address=:8443
- --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
- --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305
- --upstream=http://127.0.0.1:8081/
image: quay.io/coreos/kube-rbac-proxy:v0.4.1
name: kube-rbac-proxy-main
ports:
- containerPort: 8443
name: https-main
resources:
limits:
cpu: 20m
memory: 40Mi
requests:
cpu: 10m
memory: 20Mi
securityContext:
runAsUser: 65534
- args:
- --logtostderr
- --secure-listen-address=:9443
- --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
- --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305
- --upstream=http://127.0.0.1:8082/
image: quay.io/coreos/kube-rbac-proxy:v0.4.1
name: kube-rbac-proxy-self
ports:
- containerPort: 9443
name: https-self
resources:
limits:
cpu: 20m
memory: 40Mi
requests:
cpu: 10m
memory: 20Mi
- args:
- --host=127.0.0.1
- --port=8081
- --telemetry-host=127.0.0.1
- --telemetry-port=8082
image: quay.io/coreos/kube-state-metrics:v1.8.0
name: kube-state-metrics
resources:
limits:
cpu: 100m
memory: 150Mi
requests:
cpu: 100m
memory: 150Mi
securityContext:
runAsUser: 65534
nodeSelector:
kubernetes.io/os: linux
securityContext:
runAsNonRoot: true
runAsUser: 65534
serviceAccountName: kube-state-metrics

@@ -1,30 +0,0 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: kube-state-metrics
namespace: monitoring
rules:
- apiGroups:
- ""
resources:
- pods
verbs:
- get
- apiGroups:
- extensions
resourceNames:
- kube-state-metrics
resources:
- deployments
verbs:
- get
- update
- apiGroups:
- apps
resourceNames:
- kube-state-metrics
resources:
- deployments
verbs:
- get
- update

@@ -1,12 +0,0 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: kube-state-metrics
namespace: monitoring
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: kube-state-metrics
subjects:
- kind: ServiceAccount
name: kube-state-metrics

@@ -2,7 +2,8 @@ apiVersion: v1
kind: Service
metadata:
labels:
k8s-app: kube-state-metrics
app.kubernetes.io/name: kube-state-metrics
app.kubernetes.io/version: v1.9.5
name: kube-state-metrics
namespace: monitoring
spec:
@@ -15,4 +16,4 @@ spec:
port: 9443
targetPort: https-self
selector:
app: kube-state-metrics
app.kubernetes.io/name: kube-state-metrics

@@ -1,5 +1,8 @@
apiVersion: v1
kind: ServiceAccount
metadata:
labels:
app.kubernetes.io/name: kube-state-metrics
app.kubernetes.io/version: v1.9.5
name: kube-state-metrics
namespace: monitoring

@@ -2,7 +2,8 @@ apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
labels:
k8s-app: kube-state-metrics
app.kubernetes.io/name: kube-state-metrics
app.kubernetes.io/version: 1.9.5
name: kube-state-metrics
namespace: monitoring
spec:
@@ -24,7 +25,7 @@ spec:
scheme: https
tlsConfig:
insecureSkipVerify: true
jobLabel: k8s-app
jobLabel: app.kubernetes.io/name
selector:
matchLabels:
k8s-app: kube-state-metrics
app.kubernetes.io/name: kube-state-metrics

@@ -2,17 +2,19 @@ apiVersion: apps/v1
kind: DaemonSet
metadata:
labels:
app: node-exporter
app.kubernetes.io/name: node-exporter
app.kubernetes.io/version: v0.18.1
name: node-exporter
namespace: monitoring
spec:
selector:
matchLabels:
app: node-exporter
app.kubernetes.io/name: node-exporter
template:
metadata:
labels:
app: node-exporter
app.kubernetes.io/name: node-exporter
app.kubernetes.io/version: v0.18.1
spec:
containers:
- args:
@@ -20,8 +22,9 @@ spec:
- --path.procfs=/host/proc
- --path.sysfs=/host/sys
- --path.rootfs=/host/root
- --collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+)($|/)
- --collector.filesystem.ignored-fs-types=^(autofs|binfmt_misc|cgroup|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|mqueue|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|sysfs|tracefs)$
- --no-collector.wifi
- --no-collector.hwmon
- --collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+|var/lib/kubelet/pods/.+)($|/)
image: quay.io/prometheus/node-exporter:v0.18.1
name: node-exporter
resources:
@@ -44,8 +47,8 @@ spec:
readOnly: true
- args:
- --logtostderr
- --secure-listen-address=$(IP):9100
- --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
- --secure-listen-address=[$(IP)]:9100
- --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305
- --upstream=http://127.0.0.1:9100/
env:
- name: IP
@@ -85,3 +88,6 @@ spec:
- hostPath:
path: /
name: root
updateStrategy:
rollingUpdate:
maxUnavailable: 10%

@@ -2,7 +2,8 @@ apiVersion: v1
kind: Service
metadata:
labels:
k8s-app: node-exporter
app.kubernetes.io/name: node-exporter
app.kubernetes.io/version: v0.18.1
name: node-exporter
namespace: monitoring
spec:
@@ -12,4 +13,4 @@ spec:
port: 9100
targetPort: https
selector:
app: node-exporter
app.kubernetes.io/name: node-exporter

@@ -2,13 +2,14 @@ apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
labels:
k8s-app: node-exporter
app.kubernetes.io/name: node-exporter
app.kubernetes.io/version: v0.18.1
name: node-exporter
namespace: monitoring
spec:
endpoints:
- bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
interval: 30s
interval: 15s
port: https
relabelings:
- action: replace
@@ -20,7 +21,7 @@ spec:
scheme: https
tlsConfig:
insecureSkipVerify: true
jobLabel: k8s-app
jobLabel: app.kubernetes.io/name
selector:
matchLabels:
k8s-app: node-exporter
app.kubernetes.io/name: node-exporter

@@ -11,6 +11,7 @@ rules:
- metrics.k8s.io
resources:
- pods
- nodes
verbs:
- get
- list

@@ -1,32 +1,32 @@
apiVersion: v1
data:
config.yaml: |
resourceRules:
cpu:
containerQuery: sum(rate(container_cpu_usage_seconds_total{<<.LabelMatchers>>,container!="POD",container!="",pod!=""}[5m])) by (<<.GroupBy>>)
nodeQuery: sum(1 - rate(node_cpu_seconds_total{mode="idle"}[5m]) * on(namespace, pod) group_left(node) node_namespace_pod:kube_pod_info:{<<.LabelMatchers>>}) by (<<.GroupBy>>)
resources:
overrides:
node:
resource: node
namespace:
resource: namespace
pod:
resource: pod
containerLabel: container
memory:
containerQuery: sum(container_memory_working_set_bytes{<<.LabelMatchers>>,container!="POD",container!="",pod!=""}) by (<<.GroupBy>>)
nodeQuery: sum(node_memory_MemTotal_bytes{job="node-exporter",<<.LabelMatchers>>} - node_memory_MemAvailable_bytes{job="node-exporter",<<.LabelMatchers>>}) by (<<.GroupBy>>)
resources:
overrides:
instance:
resource: node
namespace:
resource: namespace
pod:
resource: pod
containerLabel: container
window: 5m
config.yaml: |-
"resourceRules":
"cpu":
"containerLabel": "container"
"containerQuery": "sum(irate(container_cpu_usage_seconds_total{<<.LabelMatchers>>,container!=\"POD\",container!=\"\",pod!=\"\"}[5m])) by (<<.GroupBy>>)"
"nodeQuery": "sum(1 - irate(node_cpu_seconds_total{mode=\"idle\"}[5m]) * on(namespace, pod) group_left(node) node_namespace_pod:kube_pod_info:{<<.LabelMatchers>>}) by (<<.GroupBy>>)"
"resources":
"overrides":
"namespace":
"resource": "namespace"
"node":
"resource": "node"
"pod":
"resource": "pod"
"memory":
"containerLabel": "container"
"containerQuery": "sum(container_memory_working_set_bytes{<<.LabelMatchers>>,container!=\"POD\",container!=\"\",pod!=\"\"}) by (<<.GroupBy>>)"
"nodeQuery": "sum(node_memory_MemTotal_bytes{job=\"node-exporter\",<<.LabelMatchers>>} - node_memory_MemAvailable_bytes{job=\"node-exporter\",<<.LabelMatchers>>}) by (<<.GroupBy>>)"
"resources":
"overrides":
"instance":
"resource": "node"
"namespace":
"resource": "namespace"
"pod":
"resource": "pod"
"window": "5m"
kind: ConfigMap
metadata:
name: adapter-config

@@ -23,9 +23,9 @@ spec:
- --config=/etc/adapter/config.yaml
- --logtostderr=true
- --metrics-relist-interval=1m
- --prometheus-url=http://prometheus-k8s.monitoring.svc:9090/
- --prometheus-url=http://prometheus-k8s.monitoring.svc.cluster.local:9090/
- --secure-port=6443
image: quay.io/coreos/k8s-prometheus-adapter-amd64:v0.5.0
image: directxman12/k8s-prometheus-adapter:v0.7.0
name: prometheus-adapter
ports:
- containerPort: 6443

@@ -0,0 +1,18 @@
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
labels:
name: prometheus-adapter
name: prometheus-adapter
namespace: monitoring
spec:
endpoints:
- bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
interval: 30s
port: https
scheme: https
tlsConfig:
insecureSkipVerify: true
selector:
matchLabels:
name: prometheus-adapter

@@ -4,15 +4,19 @@ metadata:
labels:
app.kubernetes.io/component: controller
app.kubernetes.io/name: prometheus-operator
app.kubernetes.io/version: v0.34.0
app.kubernetes.io/version: v0.42.1
name: prometheus-operator
namespace: monitoring
spec:
endpoints:
- honorLabels: true
port: http
- bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
honorLabels: true
port: https
scheme: https
tlsConfig:
insecureSkipVerify: true
selector:
matchLabels:
app.kubernetes.io/component: controller
app.kubernetes.io/name: prometheus-operator
app.kubernetes.io/version: v0.34.0
app.kubernetes.io/version: v0.42.1

@@ -11,9 +11,10 @@ spec:
- name: alertmanager-main
namespace: monitoring
port: web
baseImage: quay.io/prometheus/prometheus
image: quay.io/prometheus/prometheus:v2.20.0
nodeSelector:
kubernetes.io/os: linux
podMonitorNamespaceSelector: {}
podMonitorSelector: {}
replicas: 2
resources:
@@ -30,4 +31,4 @@ spec:
serviceAccountName: prometheus-k8s
serviceMonitorNamespaceSelector: {}
serviceMonitorSelector: {}
version: v2.11.0
version: v2.20.0

File diff suppressed because it is too large

@@ -11,7 +11,39 @@ spec:
interval: 30s
metricRelabelings:
- action: drop
regex: etcd_(debugging|disk|request|server).*
regex: kubelet_(pod_worker_latency_microseconds|pod_start_latency_microseconds|cgroup_manager_latency_microseconds|pod_worker_start_latency_microseconds|pleg_relist_latency_microseconds|pleg_relist_interval_microseconds|runtime_operations|runtime_operations_latency_microseconds|runtime_operations_errors|eviction_stats_age_microseconds|device_plugin_registration_count|device_plugin_alloc_latency_microseconds|network_plugin_operations_latency_microseconds)
sourceLabels:
- __name__
- action: drop
regex: scheduler_(e2e_scheduling_latency_microseconds|scheduling_algorithm_predicate_evaluation|scheduling_algorithm_priority_evaluation|scheduling_algorithm_preemption_evaluation|scheduling_algorithm_latency_microseconds|binding_latency_microseconds|scheduling_latency_seconds)
sourceLabels:
- __name__
- action: drop
regex: apiserver_(request_count|request_latencies|request_latencies_summary|dropped_requests|storage_data_key_generation_latencies_microseconds|storage_transformation_failures_total|storage_transformation_latencies_microseconds|proxy_tunnel_sync_latency_secs)
sourceLabels:
- __name__
- action: drop
regex: kubelet_docker_(operations|operations_latency_microseconds|operations_errors|operations_timeout)
sourceLabels:
- __name__
- action: drop
regex: reflector_(items_per_list|items_per_watch|list_duration_seconds|lists_total|short_watches_total|watch_duration_seconds|watches_total)
sourceLabels:
- __name__
- action: drop
regex: etcd_(helper_cache_hit_count|helper_cache_miss_count|helper_cache_entry_count|request_cache_get_latencies_summary|request_cache_add_latencies_summary|request_latencies_summary)
sourceLabels:
- __name__
- action: drop
regex: transformation_(transformation_latencies_microseconds|failures_total)
sourceLabels:
- __name__
- action: drop
regex: (admission_quota_controller_adds|crd_autoregistration_controller_work_duration|APIServiceOpenAPIAggregationControllerQueue1_adds|AvailableConditionController_retries|crd_openapi_controller_unfinished_work_seconds|APIServiceRegistrationController_retries|admission_quota_controller_longest_running_processor_microseconds|crdEstablishing_longest_running_processor_microseconds|crdEstablishing_unfinished_work_seconds|crd_openapi_controller_adds|crd_autoregistration_controller_retries|crd_finalizer_queue_latency|AvailableConditionController_work_duration|non_structural_schema_condition_controller_depth|crd_autoregistration_controller_unfinished_work_seconds|AvailableConditionController_adds|DiscoveryController_longest_running_processor_microseconds|autoregister_queue_latency|crd_autoregistration_controller_adds|non_structural_schema_condition_controller_work_duration|APIServiceRegistrationController_adds|crd_finalizer_work_duration|crd_naming_condition_controller_unfinished_work_seconds|crd_openapi_controller_longest_running_processor_microseconds|DiscoveryController_adds|crd_autoregistration_controller_longest_running_processor_microseconds|autoregister_unfinished_work_seconds|crd_naming_condition_controller_queue_latency|crd_naming_condition_controller_retries|non_structural_schema_condition_controller_queue_latency|crd_naming_condition_controller_depth|AvailableConditionController_longest_running_processor_microseconds|crdEstablishing_depth|crd_finalizer_longest_running_processor_microseconds|crd_naming_condition_controller_adds|APIServiceOpenAPIAggregationControllerQueue1_longest_running_processor_microseconds|DiscoveryController_queue_latency|DiscoveryController_unfinished_work_seconds|crd_openapi_controller_depth|APIServiceOpenAPIAggregationControllerQueue1_queue_latency|APIServiceOpenAPIAggregationControllerQueue1_unfinished_work_seconds|DiscoveryController_work_duration|autoregister_adds|crd_autoregistration_controller_queue_latency|crd_finalizer_retries|AvailableConditionController_unfinished_work_seconds|autoregister_longest_running_processor_microseconds|non_structural_schema_condition_controller_unfinished_work_seconds|APIServiceOpenAPIAggregationControllerQueue1_depth|AvailableConditionController_depth|DiscoveryController_retries|admission_quota_controller_depth|crdEstablishing_adds|APIServiceOpenAPIAggregationControllerQueue1_retries|crdEstablishing_queue_latency|non_structural_schema_condition_controller_longest_running_processor_microseconds|autoregister_work_duration|crd_openapi_controller_retries|APIServiceRegistrationController_work_duration|crdEstablishing_work_duration|crd_finalizer_adds|crd_finalizer_depth|crd_openapi_controller_queue_latency|APIServiceOpenAPIAggregationControllerQueue1_work_duration|APIServiceRegistrationController_queue_latency|crd_autoregistration_controller_depth|AvailableConditionController_queue_latency|admission_quota_controller_queue_latency|crd_naming_condition_controller_work_duration|crd_openapi_controller_work_duration|DiscoveryController_depth|crd_naming_condition_controller_longest_running_processor_microseconds|APIServiceRegistrationController_depth|APIServiceRegistrationController_longest_running_processor_microseconds|crd_finalizer_unfinished_work_seconds|crdEstablishing_retries|admission_quota_controller_unfinished_work_seconds|non_structural_schema_condition_controller_adds|APIServiceRegistrationController_unfinished_work_seconds|admission_quota_controller_work_duration|autoregister_depth|autoregister_retries|kubeproxy_sync_proxy_rules_latency_microseconds|rest_client_request_latency_seconds|non_structural_schema_condition_controller_retries)
sourceLabels:
- __name__
- action: drop
regex: etcd_(debugging|disk|server).*
sourceLabels:
- __name__
- action: drop
@@ -22,6 +54,11 @@ spec:
regex: apiserver_admission_step_admission_latencies_seconds_.*
sourceLabels:
- __name__
- action: drop
regex: apiserver_request_duration_seconds_bucket;(0.15|0.25|0.3|0.35|0.4|0.45|0.6|0.7|0.8|0.9|1.25|1.5|1.75|2.5|3|3.5|4.5|6|7|8|9|15|25|30|50)
sourceLabels:
- __name__
- le
port: https
scheme: https
tlsConfig:
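
The `apiserver_request_duration_seconds_bucket` rule in this hunk lists two source labels, `__name__` and `le`. Going by the `sourceLabels` and `separator` descriptions in the PodMonitor CRD later in this diff, the label values are joined with `;` before the regex is applied, and Prometheus anchors the regex at both ends. The sketch below illustrates that behaviour in Python. It is not taken from the repository: `should_drop` is a hypothetical helper, Python's `re` module stands in for Prometheus's RE2 engine, and the sample label sets are made up.

```python
import re

# Hypothetical helper (not a Prometheus API): evaluates one `drop` rule.
def should_drop(labels, source_labels, regex, separator=";"):
    # A label that is absent contributes an empty string to the joined value.
    joined = separator.join(labels.get(name, "") for name in source_labels)
    # Relabel regexes are anchored at both ends, so fullmatch is the analogue.
    return re.fullmatch(regex, joined) is not None

# Regex copied from the rule above; it is matched against "<__name__>;<le>".
BUCKET_RULE = (
    r"apiserver_request_duration_seconds_bucket;"
    r"(0.15|0.25|0.3|0.35|0.4|0.45|0.6|0.7|0.8|0.9|1.25|1.5|1.75|2.5|3|3.5|4.5|6|7|8|9|15|25|30|50)"
)
SOURCE_LABELS = ["__name__", "le"]

# Assumed sample series, for illustration only.
dropped = {"__name__": "apiserver_request_duration_seconds_bucket", "le": "0.25"}
kept = {"__name__": "apiserver_request_duration_seconds_bucket", "le": "0.5"}

print(should_drop(dropped, SOURCE_LABELS, BUCKET_RULE))  # True
print(should_drop(kept, SOURCE_LABELS, BUCKET_RULE))     # False
```

Only the listed bucket boundaries are discarded, so the histogram stays usable at a coarser resolution.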

@@ -7,13 +7,49 @@ metadata:
namespace: monitoring
spec:
endpoints:
- interval: 30s
- bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
interval: 30s
metricRelabelings:
- action: drop
regex: kubelet_(pod_worker_latency_microseconds|pod_start_latency_microseconds|cgroup_manager_latency_microseconds|pod_worker_start_latency_microseconds|pleg_relist_latency_microseconds|pleg_relist_interval_microseconds|runtime_operations|runtime_operations_latency_microseconds|runtime_operations_errors|eviction_stats_age_microseconds|device_plugin_registration_count|device_plugin_alloc_latency_microseconds|network_plugin_operations_latency_microseconds)
sourceLabels:
- __name__
- action: drop
regex: scheduler_(e2e_scheduling_latency_microseconds|scheduling_algorithm_predicate_evaluation|scheduling_algorithm_priority_evaluation|scheduling_algorithm_preemption_evaluation|scheduling_algorithm_latency_microseconds|binding_latency_microseconds|scheduling_latency_seconds)
sourceLabels:
- __name__
- action: drop
regex: apiserver_(request_count|request_latencies|request_latencies_summary|dropped_requests|storage_data_key_generation_latencies_microseconds|storage_transformation_failures_total|storage_transformation_latencies_microseconds|proxy_tunnel_sync_latency_secs)
sourceLabels:
- __name__
- action: drop
regex: kubelet_docker_(operations|operations_latency_microseconds|operations_errors|operations_timeout)
sourceLabels:
- __name__
- action: drop
regex: reflector_(items_per_list|items_per_watch|list_duration_seconds|lists_total|short_watches_total|watch_duration_seconds|watches_total)
sourceLabels:
- __name__
- action: drop
regex: etcd_(helper_cache_hit_count|helper_cache_miss_count|helper_cache_entry_count|request_cache_get_latencies_summary|request_cache_add_latencies_summary|request_latencies_summary)
sourceLabels:
- __name__
- action: drop
regex: transformation_(transformation_latencies_microseconds|failures_total)
sourceLabels:
- __name__
- action: drop
regex: (admission_quota_controller_adds|crd_autoregistration_controller_work_duration|APIServiceOpenAPIAggregationControllerQueue1_adds|AvailableConditionController_retries|crd_openapi_controller_unfinished_work_seconds|APIServiceRegistrationController_retries|admission_quota_controller_longest_running_processor_microseconds|crdEstablishing_longest_running_processor_microseconds|crdEstablishing_unfinished_work_seconds|crd_openapi_controller_adds|crd_autoregistration_controller_retries|crd_finalizer_queue_latency|AvailableConditionController_work_duration|non_structural_schema_condition_controller_depth|crd_autoregistration_controller_unfinished_work_seconds|AvailableConditionController_adds|DiscoveryController_longest_running_processor_microseconds|autoregister_queue_latency|crd_autoregistration_controller_adds|non_structural_schema_condition_controller_work_duration|APIServiceRegistrationController_adds|crd_finalizer_work_duration|crd_naming_condition_controller_unfinished_work_seconds|crd_openapi_controller_longest_running_processor_microseconds|DiscoveryController_adds|crd_autoregistration_controller_longest_running_processor_microseconds|autoregister_unfinished_work_seconds|crd_naming_condition_controller_queue_latency|crd_naming_condition_controller_retries|non_structural_schema_condition_controller_queue_latency|crd_naming_condition_controller_depth|AvailableConditionController_longest_running_processor_microseconds|crdEstablishing_depth|crd_finalizer_longest_running_processor_microseconds|crd_naming_condition_controller_adds|APIServiceOpenAPIAggregationControllerQueue1_longest_running_processor_microseconds|DiscoveryController_queue_latency|DiscoveryController_unfinished_work_seconds|crd_openapi_controller_depth|APIServiceOpenAPIAggregationControllerQueue1_queue_latency|APIServiceOpenAPIAggregationControllerQueue1_unfinished_work_seconds|DiscoveryController_work_duration|autoregister_adds|crd_autoregistration_controller_queue_latency|crd_finalizer_retries|AvailableConditionController_unfinished_work_seconds|autoregister_longest_running_processor_microseconds|non_structural_schema_condition_controller_unfinished_work_seconds|APIServiceOpenAPIAggregationControllerQueue1_depth|AvailableConditionController_depth|DiscoveryController_retries|admission_quota_controller_depth|crdEstablishing_adds|APIServiceOpenAPIAggregationControllerQueue1_retries|crdEstablishing_queue_latency|non_structural_schema_condition_controller_longest_running_processor_microseconds|autoregister_work_duration|crd_openapi_controller_retries|APIServiceRegistrationController_work_duration|crdEstablishing_work_duration|crd_finalizer_adds|crd_finalizer_depth|crd_openapi_controller_queue_latency|APIServiceOpenAPIAggregationControllerQueue1_work_duration|APIServiceRegistrationController_queue_latency|crd_autoregistration_controller_depth|AvailableConditionController_queue_latency|admission_quota_controller_queue_latency|crd_naming_condition_controller_work_duration|crd_openapi_controller_work_duration|DiscoveryController_depth|crd_naming_condition_controller_longest_running_processor_microseconds|APIServiceRegistrationController_depth|APIServiceRegistrationController_longest_running_processor_microseconds|crd_finalizer_unfinished_work_seconds|crdEstablishing_retries|admission_quota_controller_unfinished_work_seconds|non_structural_schema_condition_controller_adds|APIServiceRegistrationController_unfinished_work_seconds|admission_quota_controller_work_duration|autoregister_depth|autoregister_retries|kubeproxy_sync_proxy_rules_latency_microseconds|rest_client_request_latency_seconds|non_structural_schema_condition_controller_retries)
sourceLabels:
- __name__
- action: drop
regex: etcd_(debugging|disk|request|server).*
sourceLabels:
- __name__
port: http-metrics
port: https-metrics
scheme: https
tlsConfig:
insecureSkipVerify: true
jobLabel: k8s-app
namespaceSelector:
matchNames:

@@ -7,8 +7,12 @@ metadata:
namespace: monitoring
spec:
endpoints:
- interval: 30s
port: http-metrics
- bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
interval: 30s
port: https-metrics
scheme: https
tlsConfig:
insecureSkipVerify: true
jobLabel: k8s-app
namespaceSelector:
matchNames:

@@ -10,6 +10,39 @@ spec:
- bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
honorLabels: true
interval: 30s
metricRelabelings:
- action: drop
regex: kubelet_(pod_worker_latency_microseconds|pod_start_latency_microseconds|cgroup_manager_latency_microseconds|pod_worker_start_latency_microseconds|pleg_relist_latency_microseconds|pleg_relist_interval_microseconds|runtime_operations|runtime_operations_latency_microseconds|runtime_operations_errors|eviction_stats_age_microseconds|device_plugin_registration_count|device_plugin_alloc_latency_microseconds|network_plugin_operations_latency_microseconds)
sourceLabels:
- __name__
- action: drop
regex: scheduler_(e2e_scheduling_latency_microseconds|scheduling_algorithm_predicate_evaluation|scheduling_algorithm_priority_evaluation|scheduling_algorithm_preemption_evaluation|scheduling_algorithm_latency_microseconds|binding_latency_microseconds|scheduling_latency_seconds)
sourceLabels:
- __name__
- action: drop
regex: apiserver_(request_count|request_latencies|request_latencies_summary|dropped_requests|storage_data_key_generation_latencies_microseconds|storage_transformation_failures_total|storage_transformation_latencies_microseconds|proxy_tunnel_sync_latency_secs)
sourceLabels:
- __name__
- action: drop
regex: kubelet_docker_(operations|operations_latency_microseconds|operations_errors|operations_timeout)
sourceLabels:
- __name__
- action: drop
regex: reflector_(items_per_list|items_per_watch|list_duration_seconds|lists_total|short_watches_total|watch_duration_seconds|watches_total)
sourceLabels:
- __name__
- action: drop
regex: etcd_(helper_cache_hit_count|helper_cache_miss_count|helper_cache_entry_count|request_cache_get_latencies_summary|request_cache_add_latencies_summary|request_latencies_summary)
sourceLabels:
- __name__
- action: drop
regex: transformation_(transformation_latencies_microseconds|failures_total)
sourceLabels:
- __name__
- action: drop
regex: (admission_quota_controller_adds|crd_autoregistration_controller_work_duration|APIServiceOpenAPIAggregationControllerQueue1_adds|AvailableConditionController_retries|crd_openapi_controller_unfinished_work_seconds|APIServiceRegistrationController_retries|admission_quota_controller_longest_running_processor_microseconds|crdEstablishing_longest_running_processor_microseconds|crdEstablishing_unfinished_work_seconds|crd_openapi_controller_adds|crd_autoregistration_controller_retries|crd_finalizer_queue_latency|AvailableConditionController_work_duration|non_structural_schema_condition_controller_depth|crd_autoregistration_controller_unfinished_work_seconds|AvailableConditionController_adds|DiscoveryController_longest_running_processor_microseconds|autoregister_queue_latency|crd_autoregistration_controller_adds|non_structural_schema_condition_controller_work_duration|APIServiceRegistrationController_adds|crd_finalizer_work_duration|crd_naming_condition_controller_unfinished_work_seconds|crd_openapi_controller_longest_running_processor_microseconds|DiscoveryController_adds|crd_autoregistration_controller_longest_running_processor_microseconds|autoregister_unfinished_work_seconds|crd_naming_condition_controller_queue_latency|crd_naming_condition_controller_retries|non_structural_schema_condition_controller_queue_latency|crd_naming_condition_controller_depth|AvailableConditionController_longest_running_processor_microseconds|crdEstablishing_depth|crd_finalizer_longest_running_processor_microseconds|crd_naming_condition_controller_adds|APIServiceOpenAPIAggregationControllerQueue1_longest_running_processor_microseconds|DiscoveryController_queue_latency|DiscoveryController_unfinished_work_seconds|crd_openapi_controller_depth|APIServiceOpenAPIAggregationControllerQueue1_queue_latency|APIServiceOpenAPIAggregationControllerQueue1_unfinished_work_seconds|DiscoveryController_work_duration|autoregister_adds|crd_autoregistration_controller_queue_latency|crd_finalizer_retries|AvailableConditionController_unfinished_work_seconds|autoregister_longest_running_processor_microseconds|non_structural_schema_condition_controller_unfinished_work_seconds|APIServiceOpenAPIAggregationControllerQueue1_depth|AvailableConditionController_depth|DiscoveryController_retries|admission_quota_controller_depth|crdEstablishing_adds|APIServiceOpenAPIAggregationControllerQueue1_retries|crdEstablishing_queue_latency|non_structural_schema_condition_controller_longest_running_processor_microseconds|autoregister_work_duration|crd_openapi_controller_retries|APIServiceRegistrationController_work_duration|crdEstablishing_work_duration|crd_finalizer_adds|crd_finalizer_depth|crd_openapi_controller_queue_latency|APIServiceOpenAPIAggregationControllerQueue1_work_duration|APIServiceRegistrationController_queue_latency|crd_autoregistration_controller_depth|AvailableConditionController_queue_latency|admission_quota_controller_queue_latency|crd_naming_condition_controller_work_duration|crd_openapi_controller_work_duration|DiscoveryController_depth|crd_naming_condition_controller_longest_running_processor_microseconds|APIServiceRegistrationController_depth|APIServiceRegistrationController_longest_running_processor_microseconds|crd_finalizer_unfinished_work_seconds|crdEstablishing_retries|admission_quota_controller_unfinished_work_seconds|non_structural_schema_condition_controller_adds|APIServiceRegistrationController_unfinished_work_seconds|admission_quota_controller_work_duration|autoregister_depth|autoregister_retries|kubeproxy_sync_proxy_rules_latency_microseconds|rest_client_request_latency_seconds|non_structural_schema_condition_controller_retries)
sourceLabels:
- __name__
port: https-metrics
relabelings:
- sourceLabels:
@@ -26,6 +59,12 @@ spec:
regex: container_(network_tcp_usage_total|network_udp_usage_total|tasks_state|cpu_load_average_10s)
sourceLabels:
- __name__
- action: drop
regex: (container_fs_.*|container_spec_.*|container_blkio_device_usage_total|container_file_descriptors|container_sockets|container_threads_max|container_threads|container_start_time_seconds|container_last_seen);;
sourceLabels:
- __name__
- pod
- namespace
path: /metrics/cadvisor
port: https-metrics
relabelings:
@@ -35,6 +74,18 @@ spec:
scheme: https
tlsConfig:
insecureSkipVerify: true
- bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
honorLabels: true
interval: 30s
path: /metrics/probes
port: https-metrics
relabelings:
- sourceLabels:
- __metrics_path__
targetLabel: metrics_path
scheme: https
tlsConfig:
insecureSkipVerify: true
jobLabel: k8s-app
namespaceSelector:
matchNames:
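
The cAdvisor rule added in this file ends its regex with `;;` and lists `__name__`, `pod` and `namespace` as source labels. Read together with the commit message ("Drop cAdvisor metrics with no (pod, namespace) labels"), the trailing `;;` appears to match only series whose `pod` and `namespace` values are both empty. The sketch below checks that reading in Python. It is illustrative only: `should_drop` is a hypothetical helper, Python's `re` module stands in for Prometheus's RE2 engine, and the sample series (including the `id` value) are made up.

```python
import re

# Hypothetical helper (not a Prometheus API): evaluates one `drop` rule.
def should_drop(labels, source_labels, regex, separator=";"):
    # A label that is absent contributes an empty string to the joined value.
    joined = separator.join(labels.get(name, "") for name in source_labels)
    # Relabel regexes are anchored at both ends, so fullmatch is the analogue.
    return re.fullmatch(regex, joined) is not None

# Regex copied from the rule above; matched against "<__name__>;<pod>;<namespace>".
CADVISOR_RULE = (
    r"(container_fs_.*|container_spec_.*|container_blkio_device_usage_total"
    r"|container_file_descriptors|container_sockets|container_threads_max"
    r"|container_threads|container_start_time_seconds|container_last_seen);;"
)
SOURCE_LABELS = ["__name__", "pod", "namespace"]

# Assumed sample series, for illustration only.
no_pod_labels = {"__name__": "container_fs_reads_total", "id": "/system.slice/kubelet.service"}
pod_series = {"__name__": "container_fs_reads_total", "pod": "web-0", "namespace": "default"}
unlisted_metric = {"__name__": "container_cpu_usage_seconds_total"}

print(should_drop(no_pod_labels, SOURCE_LABELS, CADVISOR_RULE))    # True
print(should_drop(pod_series, SOURCE_LABELS, CADVISOR_RULE))       # False
print(should_drop(unlisted_metric, SOURCE_LABELS, CADVISOR_RULE))  # False
```

Metrics that are not named in the alternation, such as CPU usage, are kept even without pod labels, which is what preserves resource monitoring for system services.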

@@ -1,239 +1,265 @@
apiVersion: apiextensions.k8s.io/v1beta1
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
controller-gen.kubebuilder.io/version: v0.2.4
creationTimestamp: null
name: podmonitors.monitoring.coreos.com
spec:
group: monitoring.coreos.com
names:
kind: PodMonitor
listKind: PodMonitorList
plural: podmonitors
singular: podmonitor
scope: Namespaced
validation:
openAPIV3Schema:
properties:
apiVersion:
description: 'APIVersion defines the versioned schema of this representation
of an object. Servers should convert recognized schemas to the latest
internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
type: string
kind:
description: 'Kind is a string value representing the REST resource this
object represents. Servers may infer this from the endpoint the client
submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
type: string
spec:
description: PodMonitorSpec contains specification parameters for a PodMonitor.
properties:
jobLabel:
description: The label to use to retrieve the job name from.
type: string
namespaceSelector:
description: NamespaceSelector is a selector for selecting either all
namespaces or a list of namespaces.
properties:
any:
description: Boolean describing whether all namespaces are selected
in contrast to a list restricting them.
type: boolean
matchNames:
description: List of namespace names.
items:
type: string
type: array
type: object
podMetricsEndpoints:
description: A list of endpoints allowed as part of this PodMonitor.
items:
description: PodMetricsEndpoint defines a scrapeable endpoint of a
Kubernetes Pod serving Prometheus metrics.
properties:
honorLabels:
description: HonorLabels chooses the metric's labels on collisions
with target labels.
type: boolean
honorTimestamps:
description: HonorTimestamps controls whether Prometheus respects
the timestamps present in scraped data.
type: boolean
interval:
description: Interval at which metrics should be scraped
type: string
metricRelabelings:
description: MetricRelabelConfigs to apply to samples before ingestion.
items:
description: 'RelabelConfig allows dynamic rewriting of the
label set, being applied to samples before ingestion. It defines
`<metric_relabel_configs>`-section of Prometheus configuration.
More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#metric_relabel_configs'
properties:
action:
description: Action to perform based on regex matching.
Default is 'replace'
type: string
modulus:
description: Modulus to take of the hash of the source label
values.
format: int64
type: integer
regex:
description: Regular expression against which the extracted
value is matched. Default is '(.*)'
type: string
replacement:
description: Replacement value against which a regex replace
is performed if the regular expression matches. Regex
capture groups are available. Default is '$1'
type: string
separator:
description: Separator placed between concatenated source
label values. default is ';'.
type: string
sourceLabels:
description: The source labels select values from existing
labels. Their content is concatenated using the configured
separator and matched against the configured regular expression
for the replace, keep, and drop actions.
items:
type: string
type: array
targetLabel:
description: Label to which the resulting value is written
in a replace action. It is mandatory for replace actions.
Regex capture groups are available.
type: string
type: object
type: array
params:
description: Optional HTTP URL parameters
type: object
path:
description: HTTP path to scrape for metrics.
type: string
port:
description: Name of the port this endpoint refers to. Mutually
exclusive with targetPort.
type: string
proxyUrl:
description: ProxyURL eg http://proxyserver:2195 Directs scrapes
to proxy through this endpoint.
type: string
relabelings:
description: 'RelabelConfigs to apply to samples before ingestion.
More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config'
items:
description: 'RelabelConfig allows dynamic rewriting of the
label set, being applied to samples before ingestion. It defines
`<metric_relabel_configs>`-section of Prometheus configuration.
More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#metric_relabel_configs'
properties:
action:
description: Action to perform based on regex matching.
Default is 'replace'
type: string
modulus:
description: Modulus to take of the hash of the source label
values.
format: int64
type: integer
regex:
description: Regular expression against which the extracted
value is matched. Default is '(.*)'
type: string
replacement:
description: Replacement value against which a regex replace
is performed if the regular expression matches. Regex
capture groups are available. Default is '$1'
type: string
separator:
description: Separator placed between concatenated source
label values. Default is ';'.
type: string
sourceLabels:
description: The source labels select values from existing
labels. Their content is concatenated using the configured
separator and matched against the configured regular expression
for the replace, keep, and drop actions.
items:
type: string
type: array
targetLabel:
description: Label to which the resulting value is written
in a replace action. It is mandatory for replace actions.
Regex capture groups are available.
type: string
type: object
type: array
scheme:
description: HTTP scheme to use for scraping.
type: string
scrapeTimeout:
description: Timeout after which the scrape is ended
type: string
targetPort:
anyOf:
- type: string
- type: integer
type: object
type: array
podTargetLabels:
description: PodTargetLabels transfers labels on the Kubernetes Pod
onto the target.
items:
versions:
- name: v1
schema:
openAPIV3Schema:
description: PodMonitor defines monitoring for a set of pods.
properties:
apiVersion:
description: 'APIVersion defines the versioned schema of this representation
of an object. Servers should convert recognized schemas to the latest
internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
type: string
kind:
description: 'Kind is a string value representing the REST resource this
object represents. Servers may infer this from the endpoint the client
submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
type: string
metadata:
type: object
spec:
description: Specification of desired Pod selection for target discovery
by Prometheus.
properties:
jobLabel:
description: The label to use to retrieve the job name from.
type: string
type: array
sampleLimit:
description: SampleLimit defines per-scrape limit on number of scraped
samples that will be accepted.
format: int64
type: integer
selector:
description: A label selector is a label query over a set of resources.
The result of matchLabels and matchExpressions are ANDed. An empty
label selector matches all objects. A null label selector matches
no objects.
properties:
matchExpressions:
description: matchExpressions is a list of label selector requirements.
The requirements are ANDed.
items:
description: A label selector requirement is a selector that contains
values, a key, and an operator that relates the key and values.
properties:
key:
description: key is the label key that the selector applies
to.
type: string
operator:
description: operator represents a key's relationship to a
set of values. Valid operators are In, NotIn, Exists and
DoesNotExist.
type: string
values:
description: values is an array of string values. If the operator
is In or NotIn, the values array must be non-empty. If the
operator is Exists or DoesNotExist, the values array must
be empty. This array is replaced during a strategic merge
patch.
namespaceSelector:
description: Selector to select which namespaces the Endpoints objects
are discovered from.
properties:
any:
description: Boolean describing whether all namespaces are selected
in contrast to a list restricting them.
type: boolean
matchNames:
description: List of namespace names.
items:
type: string
type: array
type: object
podMetricsEndpoints:
description: A list of endpoints allowed as part of this PodMonitor.
items:
description: PodMetricsEndpoint defines a scrapeable endpoint of
a Kubernetes Pod serving Prometheus metrics.
properties:
honorLabels:
description: HonorLabels chooses the metric's labels on collisions
with target labels.
type: boolean
honorTimestamps:
description: HonorTimestamps controls whether Prometheus respects
the timestamps present in scraped data.
type: boolean
interval:
description: Interval at which metrics should be scraped
type: string
metricRelabelings:
description: MetricRelabelConfigs to apply to samples before
ingestion.
items:
description: 'RelabelConfig allows dynamic rewriting of the
label set, being applied to samples before ingestion. It
defines `<metric_relabel_configs>`-section of Prometheus
configuration. More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#metric_relabel_configs'
properties:
action:
description: Action to perform based on regex matching.
Default is 'replace'
type: string
modulus:
description: Modulus to take of the hash of the source
label values.
format: int64
type: integer
regex:
description: Regular expression against which the extracted
value is matched. Default is '(.*)'
type: string
replacement:
description: Replacement value against which a regex replace
is performed if the regular expression matches. Regex
capture groups are available. Default is '$1'
type: string
separator:
description: Separator placed between concatenated source
label values. Default is ';'.
type: string
sourceLabels:
description: The source labels select values from existing
labels. Their content is concatenated using the configured
separator and matched against the configured regular
expression for the replace, keep, and drop actions.
items:
type: string
type: array
targetLabel:
description: Label to which the resulting value is written
in a replace action. It is mandatory for replace actions.
Regex capture groups are available.
type: string
type: object
type: array
params:
additionalProperties:
items:
type: string
type: array
required:
- key
- operator
type: object
type: array
matchLabels:
description: matchLabels is a map of {key,value} pairs. A single
{key,value} in the matchLabels map is equivalent to an element
of matchExpressions, whose key field is "key", the operator is
"In", and the values array contains only "value". The requirements
are ANDed.
description: Optional HTTP URL parameters
type: object
path:
description: HTTP path to scrape for metrics.
type: string
port:
description: Name of the pod port this endpoint refers to. Mutually
exclusive with targetPort.
type: string
proxyUrl:
description: ProxyURL, e.g. http://proxyserver:2195. Directs scrapes
to proxy through this endpoint.
type: string
relabelings:
description: 'RelabelConfigs to apply to samples before ingestion.
More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config'
items:
description: 'RelabelConfig allows dynamic rewriting of the
label set, being applied to samples before ingestion. It
defines `<metric_relabel_configs>`-section of Prometheus
configuration. More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#metric_relabel_configs'
properties:
action:
description: Action to perform based on regex matching.
Default is 'replace'
type: string
modulus:
description: Modulus to take of the hash of the source
label values.
format: int64
type: integer
regex:
description: Regular expression against which the extracted
value is matched. Default is '(.*)'
type: string
replacement:
description: Replacement value against which a regex replace
is performed if the regular expression matches. Regex
capture groups are available. Default is '$1'
type: string
separator:
description: Separator placed between concatenated source
label values. Default is ';'.
type: string
sourceLabels:
description: The source labels select values from existing
labels. Their content is concatenated using the configured
separator and matched against the configured regular
expression for the replace, keep, and drop actions.
items:
type: string
type: array
targetLabel:
description: Label to which the resulting value is written
in a replace action. It is mandatory for replace actions.
Regex capture groups are available.
type: string
type: object
type: array
scheme:
description: HTTP scheme to use for scraping.
type: string
scrapeTimeout:
description: Timeout after which the scrape is ended
type: string
targetPort:
anyOf:
- type: integer
- type: string
description: 'Deprecated: Use ''port'' instead.'
x-kubernetes-int-or-string: true
type: object
type: object
required:
- podMetricsEndpoints
- selector
type: object
type: object
version: v1
type: array
podTargetLabels:
description: PodTargetLabels transfers labels on the Kubernetes Pod
onto the target.
items:
type: string
type: array
sampleLimit:
description: SampleLimit defines per-scrape limit on number of scraped
samples that will be accepted.
format: int64
type: integer
selector:
description: Selector to select Pod objects.
properties:
matchExpressions:
description: matchExpressions is a list of label selector requirements.
The requirements are ANDed.
items:
description: A label selector requirement is a selector that
contains values, a key, and an operator that relates the key
and values.
properties:
key:
description: key is the label key that the selector applies
to.
type: string
operator:
description: operator represents a key's relationship to
a set of values. Valid operators are In, NotIn, Exists
and DoesNotExist.
type: string
values:
description: values is an array of string values. If the
operator is In or NotIn, the values array must be non-empty.
If the operator is Exists or DoesNotExist, the values
array must be empty. This array is replaced during a strategic
merge patch.
items:
type: string
type: array
required:
- key
- operator
type: object
type: array
matchLabels:
additionalProperties:
type: string
description: matchLabels is a map of {key,value} pairs. A single
{key,value} in the matchLabels map is equivalent to an element
of matchExpressions, whose key field is "key", the operator
is "In", and the values array contains only "value". The requirements
are ANDed.
type: object
type: object
required:
- podMetricsEndpoints
- selector
type: object
required:
- spec
type: object
served: true
storage: true
status:
acceptedNames:
kind: ""
plural: ""
conditions: []
storedVersions: []
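As an illustration of the PodMonitor schema above, the following is a minimal sketch of a resource that should validate against it. The names `example-app`, the `app` label, the `metrics` port name, and the dropped metric pattern are hypothetical and not taken from this change; only the field names come from the schema.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: example-app            # hypothetical name
  namespace: default
spec:
  # 'selector' and 'podMetricsEndpoints' are the two required spec fields.
  selector:
    matchLabels:
      app: example-app         # hypothetical pod label
  namespaceSelector:
    matchNames:
      - default
  podMetricsEndpoints:
    - port: metrics            # name of the pod port (hypothetical)
      path: /metrics
      interval: 30s
      metricRelabelings:
        # Drop series whose metric name matches the regex.
        # 'separator' and 'replacement' are left at their defaults.
        - sourceLabels: [__name__]
          regex: 'go_gc_.*'
          action: drop
```

The `metricRelabelings` entry relies on the documented behaviour that source label values are concatenated and matched against `regex` for the `drop` action.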

@@ -0,0 +1,212 @@
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
controller-gen.kubebuilder.io/version: v0.2.4
creationTimestamp: null
name: probes.monitoring.coreos.com
spec:
group: monitoring.coreos.com
names:
kind: Probe
listKind: ProbeList
plural: probes
singular: probe
scope: Namespaced
versions:
- name: v1
schema:
openAPIV3Schema:
description: Probe defines monitoring for a set of static targets or ingresses.
properties:
apiVersion:
description: 'APIVersion defines the versioned schema of this representation
of an object. Servers should convert recognized schemas to the latest
internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
type: string
kind:
description: 'Kind is a string value representing the REST resource this
object represents. Servers may infer this from the endpoint the client
submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
type: string
metadata:
type: object
spec:
description: Specification of desired Ingress selection for target discovery
by Prometheus.
properties:
interval:
description: Interval at which targets are probed using the configured
prober. If not specified Prometheus' global scrape interval is used.
type: string
jobName:
description: The job name assigned to scraped metrics by default.
type: string
module:
description: 'The module to use for probing specifying how to probe
the target. Example module configuring in the blackbox exporter:
https://github.com/prometheus/blackbox_exporter/blob/master/example.yml'
type: string
prober:
description: Specification for the prober to use for probing targets.
The prober.URL parameter is required. Targets cannot be probed if
left empty.
properties:
path:
description: Path to collect metrics from. Defaults to `/probe`.
type: string
scheme:
description: HTTP scheme to use for scraping. Defaults to `http`.
type: string
url:
description: Mandatory URL of the prober.
type: string
required:
- url
type: object
scrapeTimeout:
description: Timeout for scraping metrics from the Prometheus exporter.
type: string
targets:
description: Targets defines a set of static and/or dynamically discovered
targets to be probed using the prober.
properties:
ingress:
description: Ingress defines the set of dynamically discovered
ingress objects whose hosts are considered for probing.
properties:
namespaceSelector:
description: Select Ingress objects by namespace.
properties:
any:
description: Boolean describing whether all namespaces
are selected in contrast to a list restricting them.
type: boolean
matchNames:
description: List of namespace names.
items:
type: string
type: array
type: object
relabelingConfigs:
description: 'RelabelConfigs to apply to samples before ingestion.
More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config'
items:
description: 'RelabelConfig allows dynamic rewriting of
the label set, being applied to samples before ingestion.
It defines `<metric_relabel_configs>`-section of Prometheus
configuration. More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#metric_relabel_configs'
properties:
action:
description: Action to perform based on regex matching.
Default is 'replace'
type: string
modulus:
description: Modulus to take of the hash of the source
label values.
format: int64
type: integer
regex:
description: Regular expression against which the extracted
value is matched. Default is '(.*)'
type: string
replacement:
description: Replacement value against which a regex
replace is performed if the regular expression matches.
Regex capture groups are available. Default is '$1'
type: string
separator:
description: Separator placed between concatenated source
label values. Default is ';'.
type: string
sourceLabels:
description: The source labels select values from existing
labels. Their content is concatenated using the configured
separator and matched against the configured regular
expression for the replace, keep, and drop actions.
items:
type: string
type: array
targetLabel:
description: Label to which the resulting value is written
in a replace action. It is mandatory for replace actions.
Regex capture groups are available.
type: string
type: object
type: array
selector:
description: Select Ingress objects by labels.
properties:
matchExpressions:
description: matchExpressions is a list of label selector
requirements. The requirements are ANDed.
items:
description: A label selector requirement is a selector
that contains values, a key, and an operator that
relates the key and values.
properties:
key:
description: key is the label key that the selector
applies to.
type: string
operator:
description: operator represents a key's relationship
to a set of values. Valid operators are In, NotIn,
Exists and DoesNotExist.
type: string
values:
description: values is an array of string values.
If the operator is In or NotIn, the values array
must be non-empty. If the operator is Exists or
DoesNotExist, the values array must be empty.
This array is replaced during a strategic merge
patch.
items:
type: string
type: array
required:
- key
- operator
type: object
type: array
matchLabels:
additionalProperties:
type: string
description: matchLabels is a map of {key,value} pairs.
A single {key,value} in the matchLabels map is equivalent
to an element of matchExpressions, whose key field is
"key", the operator is "In", and the values array contains
only "value". The requirements are ANDed.
type: object
type: object
type: object
staticConfig:
description: 'StaticConfig defines static targets which are considered
for probing. More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#static_config.'
properties:
labels:
additionalProperties:
type: string
description: Labels assigned to all metrics scraped from the
targets.
type: object
static:
description: Targets is a list of URLs to probe using the
configured prober.
items:
type: string
type: array
type: object
type: object
type: object
required:
- spec
type: object
served: true
storage: true
status:
acceptedNames:
kind: ""
plural: ""
conditions: []
storedVersions: []
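To make the Probe schema above more concrete, here is a hedged sketch of a resource using static targets. The prober address, module name, target URL, and label are assumptions for illustration; the module must exist in the configuration of whichever prober (for example a blackbox exporter) is actually deployed.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Probe
metadata:
  name: example-probe          # hypothetical name
  namespace: default
spec:
  jobName: example-probe
  interval: 60s                # falls back to the global scrape interval if omitted
  module: http_2xx             # assumed to be defined in the prober's configuration
  prober:
    url: blackbox-exporter.monitoring.svc:9115   # hypothetical address; 'url' is required
    path: /probe               # schema default
    scheme: http               # schema default
  targets:
    staticConfig:
      static:
        - https://example.com  # hypothetical target
      labels:
        environment: example   # added to all metrics scraped from these targets
```

Replacing `staticConfig` with an `ingress` block (with `selector` and `namespaceSelector`) would switch the same probe to dynamically discovered Ingress hosts.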

@@ -1,250 +1,94 @@
apiVersion: apiextensions.k8s.io/v1beta1
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
controller-gen.kubebuilder.io/version: v0.2.4
creationTimestamp: null
name: prometheusrules.monitoring.coreos.com
spec:
group: monitoring.coreos.com
names:
kind: PrometheusRule
listKind: PrometheusRuleList
plural: prometheusrules
singular: prometheusrule
scope: Namespaced
validation:
openAPIV3Schema:
properties:
apiVersion:
description: 'APIVersion defines the versioned schema of this representation
of an object. Servers should convert recognized schemas to the latest
internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
type: string
kind:
description: 'Kind is a string value representing the REST resource this
object represents. Servers may infer this from the endpoint the client
submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
type: string
metadata:
description: ObjectMeta is metadata that all persisted resources must have,
which includes all objects users must create.
properties:
annotations:
description: 'Annotations is an unstructured key value map stored with
a resource that may be set by external tools to store and retrieve
arbitrary metadata. They are not queryable and should be preserved
when modifying objects. More info: http://kubernetes.io/docs/user-guide/annotations'
type: object
clusterName:
description: The name of the cluster which the object belongs to. This
is used to distinguish resources with same name and namespace in different
clusters. This field is not set anywhere right now and apiserver is
going to ignore it if set in create or update request.
type: string
creationTimestamp:
description: Time is a wrapper around time.Time which supports correct
marshaling to YAML and JSON. Wrappers are provided for many of the
factory methods that the time package offers.
format: date-time
type: string
deletionGracePeriodSeconds:
description: Number of seconds allowed for this object to gracefully
terminate before it will be removed from the system. Only set when
deletionTimestamp is also set. May only be shortened. Read-only.
format: int64
type: integer
deletionTimestamp:
description: Time is a wrapper around time.Time which supports correct
marshaling to YAML and JSON. Wrappers are provided for many of the
factory methods that the time package offers.
format: date-time
type: string
finalizers:
description: Must be empty before the object is deleted from the registry.
Each entry is an identifier for the responsible component that will
remove the entry from the list. If the deletionTimestamp of the object
is non-nil, entries in this list can only be removed.
items:
type: string
type: array
generateName:
description: |-
GenerateName is an optional prefix, used by the server, to generate a unique name ONLY IF the Name field has not been provided. If this field is used, the name returned to the client will be different than the name passed. This value will also be combined with a unique suffix. The provided value has the same validation rules as the Name field, and may be truncated by the length of the suffix required to make the value unique on the server.
If this field is specified and the generated name exists, the server will NOT return a 409 - instead, it will either return 201 Created or 500 with Reason ServerTimeout indicating a unique name could not be found in the time allotted, and the client should retry (optionally after the time indicated in the Retry-After header).
Applied only if Name is not specified. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#idempotency
type: string
generation:
description: A sequence number representing a specific generation of
the desired state. Populated by the system. Read-only.
format: int64
type: integer
labels:
description: 'Map of string keys and values that can be used to organize
and categorize (scope and select) objects. May match selectors of
replication controllers and services. More info: http://kubernetes.io/docs/user-guide/labels'
type: object
managedFields:
description: ManagedFields maps workflow-id and version to the set of
fields that are managed by that workflow. This is mostly for internal
housekeeping, and users typically shouldn't need to set or understand
this field. A workflow can be the user's name, a controller's name,
or the name of a specific apply path like "ci-cd". The set of fields
is always in the version that the workflow used when modifying the
object.
items:
description: ManagedFieldsEntry is a workflow-id, a FieldSet and the
group version of the resource that the fieldset applies to.
properties:
apiVersion:
description: APIVersion defines the version of this resource that
this field set applies to. The format is "group/version" just
like the top-level APIVersion field. It is necessary to track
the version of a field set because it cannot be automatically
converted.
type: string
fieldsType:
description: 'FieldsType is the discriminator for the different
fields format and version. There is currently only one possible
value: "FieldsV1"'
type: string
fieldsV1:
description: |-
FieldsV1 stores a set of fields in a data structure like a Trie, in JSON format.
Each key is either a '.' representing the field itself, and will always map to an empty set, or a string representing a sub-field or item. The string will follow one of these four formats: 'f:<name>', where <name> is the name of a field in a struct, or key in a map 'v:<value>', where <value> is the exact json formatted value of a list item 'i:<index>', where <index> is position of a item in a list 'k:<keys>', where <keys> is a map of a list item's key fields to their unique values If a key maps to an empty Fields value, the field that key represents is part of the set.
The exact format is defined in sigs.k8s.io/structured-merge-diff
type: object
manager:
description: Manager is an identifier of the workflow managing
these fields.
type: string
operation:
description: Operation is the type of operation which led to
this ManagedFieldsEntry being created. The only valid values
for this field are 'Apply' and 'Update'.
type: string
time:
description: Time is a wrapper around time.Time which supports
correct marshaling to YAML and JSON. Wrappers are provided
for many of the factory methods that the time package offers.
format: date-time
type: string
type: object
type: array
name:
description: 'Name must be unique within a namespace. Is required when
creating resources, although some resources may allow a client to
request the generation of an appropriate name automatically. Name
is primarily intended for creation idempotence and configuration definition.
Cannot be updated. More info: http://kubernetes.io/docs/user-guide/identifiers#names'
type: string
namespace:
description: |-
Namespace defines the space within which each name must be unique. An empty namespace is equivalent to the "default" namespace, but "default" is the canonical representation. Not all objects are required to be scoped to a namespace - the value of this field for those objects will be empty.
Must be a DNS_LABEL. Cannot be updated. More info: http://kubernetes.io/docs/user-guide/namespaces
type: string
ownerReferences:
description: List of objects this object depends on. If ALL objects
in the list have been deleted, this object will be garbage collected.
If this object is managed by a controller, then an entry in this list
will point to this controller, with the controller field set to true.
There cannot be more than one managing controller.
items:
description: OwnerReference contains enough information to let you
identify an owning object. An owning object must be in the same
namespace as the dependent, or be cluster-scoped, so there is no
namespace field.
properties:
apiVersion:
description: API version of the referent.
type: string
blockOwnerDeletion:
description: If true, AND if the owner has the "foregroundDeletion"
finalizer, then the owner cannot be deleted from the key-value
store until this reference is removed. Defaults to false. To
set this field, a user needs "delete" permission of the owner,
otherwise 422 (Unprocessable Entity) will be returned.
type: boolean
controller:
description: If true, this reference points to the managing controller.
type: boolean
kind:
description: 'Kind of the referent. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
type: string
name:
description: 'Name of the referent. More info: http://kubernetes.io/docs/user-guide/identifiers#names'
type: string
uid:
description: 'UID of the referent. More info: http://kubernetes.io/docs/user-guide/identifiers#uids'
type: string
required:
- apiVersion
- kind
- name
- uid
type: object
type: array
resourceVersion:
description: |-
An opaque value that represents the internal version of this object that can be used by clients to determine when objects have changed. May be used for optimistic concurrency, change detection, and the watch operation on a resource or set of resources. Clients must treat these values as opaque and passed unmodified back to the server. They may only be valid for a particular resource or set of resources.
Populated by the system. Read-only. Value must be treated as opaque by clients. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#concurrency-control-and-consistency
type: string
selfLink:
description: |-
SelfLink is a URL representing this object. Populated by the system. Read-only.
DEPRECATED Kubernetes will stop propagating this field in 1.20 release and the field is planned to be removed in 1.21 release.
type: string
uid:
description: |-
UID is the unique in time and space value for this object. It is typically generated by the server on successful creation of a resource and is not allowed to change on PUT operations.
Populated by the system. Read-only. More info: http://kubernetes.io/docs/user-guide/identifiers#uids
type: string
type: object
spec:
description: PrometheusRuleSpec contains specification parameters for a
Rule.
properties:
groups:
description: Content of Prometheus rule file
items:
description: RuleGroup is a list of sequentially evaluated recording
and alerting rules.
properties:
interval:
type: string
name:
type: string
rules:
items:
description: Rule describes an alerting or recording rule.
properties:
alert:
type: string
annotations:
type: object
expr:
anyOf:
- type: string
- type: integer
for:
type: string
labels:
type: object
record:
type: string
required:
- expr
type: object
type: array
required:
- name
- rules
type: object
type: array
type: object
type: object
version: v1
versions:
- name: v1
schema:
openAPIV3Schema:
description: PrometheusRule defines alerting rules for a Prometheus instance
properties:
apiVersion:
description: 'APIVersion defines the versioned schema of this representation
of an object. Servers should convert recognized schemas to the latest
internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
type: string
kind:
description: 'Kind is a string value representing the REST resource this
object represents. Servers may infer this from the endpoint the client
submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
type: string
metadata:
type: object
spec:
description: Specification of desired alerting rule definitions for Prometheus.
properties:
groups:
description: Content of Prometheus rule file
items:
description: 'RuleGroup is a list of sequentially evaluated recording
and alerting rules. Note: PartialResponseStrategy is only used
by ThanosRuler and will be ignored by Prometheus instances. Valid
values for this field are ''warn'' or ''abort''. More info: https://github.com/thanos-io/thanos/blob/master/docs/components/rule.md#partial-response'
properties:
interval:
type: string
name:
type: string
partial_response_strategy:
type: string
rules:
items:
description: Rule describes an alerting or recording rule.
properties:
alert:
type: string
annotations:
additionalProperties:
type: string
type: object
expr:
anyOf:
- type: integer
- type: string
x-kubernetes-int-or-string: true
for:
type: string
labels:
additionalProperties:
type: string
type: object
record:
type: string
required:
- expr
type: object
type: array
required:
- name
- rules
type: object
type: array
type: object
required:
- spec
type: object
served: true
storage: true
status:
acceptedNames:
kind: ""
plural: ""
conditions: []
storedVersions: []
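The v1 PrometheusRule schema above requires `name` and `rules` per group and `expr` per rule. A minimal sketch of a conforming resource follows; the resource name, group name, rule names, expressions, and thresholds are hypothetical examples and not part of this change.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-rules          # hypothetical name
  namespace: default
spec:
  groups:
    - name: example.rules      # required
      interval: 1m
      rules:                   # required
        # Recording rule: 'record' plus the required 'expr'.
        - record: job:up:sum
          expr: sum by (job) (up)
        # Alerting rule: 'alert' plus the required 'expr'.
        - alert: TargetDown
          expr: up == 0
          for: 10m
          labels:
            severity: warning  # label values must be strings
          annotations:
            summary: 'Target {{ $labels.instance }} is down.'
```

Note that in the new schema `labels` and `annotations` declare `additionalProperties` of type string, so non-string values such as bare numbers or booleans would need to be quoted.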

@@ -1,346 +1,466 @@
apiVersion: apiextensions.k8s.io/v1beta1
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
controller-gen.kubebuilder.io/version: v0.2.4
creationTimestamp: null
name: servicemonitors.monitoring.coreos.com
spec:
group: monitoring.coreos.com
names:
kind: ServiceMonitor
listKind: ServiceMonitorList
plural: servicemonitors
singular: servicemonitor
scope: Namespaced
validation:
openAPIV3Schema:
properties:
apiVersion:
description: 'APIVersion defines the versioned schema of this representation
of an object. Servers should convert recognized schemas to the latest
internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
type: string
kind:
description: 'Kind is a string value representing the REST resource this
object represents. Servers may infer this from the endpoint the client
submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
type: string
spec:
description: ServiceMonitorSpec contains specification parameters for a
ServiceMonitor.
properties:
endpoints:
description: A list of endpoints allowed as part of this ServiceMonitor.
items:
description: Endpoint defines a scrapeable endpoint serving Prometheus
metrics.
properties:
basicAuth:
description: 'BasicAuth allows an endpoint to authenticate over
basic authentication. More info: https://prometheus.io/docs/operating/configuration/#endpoints'
properties:
password:
description: SecretKeySelector selects a key of a Secret.
properties:
key:
description: The key of the secret to select from. Must
be a valid secret key.
type: string
name:
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names'
type: string
optional:
description: Specify whether the Secret or its key must
be defined
type: boolean
required:
- key
type: object
username:
description: SecretKeySelector selects a key of a Secret.
properties:
key:
description: The key of the secret to select from. Must
be a valid secret key.
type: string
name:
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names'
type: string
optional:
description: Specify whether the Secret or its key must
be defined
type: boolean
required:
- key
type: object
type: object
bearerTokenFile:
description: File to read bearer token for scraping targets.
type: string
bearerTokenSecret:
description: SecretKeySelector selects a key of a Secret.
properties:
key:
description: The key of the secret to select from. Must be
a valid secret key.
type: string
name:
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names'
type: string
optional:
description: Specify whether the Secret or its key must be
defined
type: boolean
required:
- key
type: object
honorLabels:
description: HonorLabels chooses the metric's labels on collisions
with target labels.
type: boolean
honorTimestamps:
description: HonorTimestamps controls whether Prometheus respects
the timestamps present in scraped data.
type: boolean
interval:
description: Interval at which metrics should be scraped
type: string
metricRelabelings:
description: MetricRelabelConfigs to apply to samples before ingestion.
items:
description: 'RelabelConfig allows dynamic rewriting of the
label set, being applied to samples before ingestion. It defines
`<metric_relabel_configs>`-section of Prometheus configuration.
More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#metric_relabel_configs'
versions:
- name: v1
schema:
openAPIV3Schema:
description: ServiceMonitor defines monitoring for a set of services.
properties:
apiVersion:
description: 'APIVersion defines the versioned schema of this representation
of an object. Servers should convert recognized schemas to the latest
internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
type: string
kind:
description: 'Kind is a string value representing the REST resource this
object represents. Servers may infer this from the endpoint the client
submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
type: string
metadata:
type: object
spec:
description: Specification of desired Service selection for target discovery
by Prometheus.
properties:
endpoints:
description: A list of endpoints allowed as part of this ServiceMonitor.
items:
description: Endpoint defines a scrapeable endpoint serving Prometheus
metrics.
properties:
basicAuth:
description: 'BasicAuth allow an endpoint to authenticate over
basic authentication More info: https://prometheus.io/docs/operating/configuration/#endpoints'
properties:
action:
description: Action to perform based on regex matching.
Default is 'replace'
type: string
modulus:
description: Modulus to take of the hash of the source label
values.
format: int64
type: integer
regex:
description: Regular expression against which the extracted
value is matched. defailt is '(.*)'
type: string
replacement:
description: Replacement value against which a regex replace
is performed if the regular expression matches. Regex
capture groups are available. Default is '$1'
type: string
separator:
description: Separator placed between concatenated source
label values. default is ';'.
type: string
sourceLabels:
description: The source labels select values from existing
labels. Their content is concatenated using the configured
separator and matched against the configured regular expression
for the replace, keep, and drop actions.
items:
type: string
type: array
targetLabel:
description: Label to which the resulting value is written
in a replace action. It is mandatory for replace actions.
Regex capture groups are available.
type: string
password:
description: The secret in the service monitor namespace
that contains the password for authentication.
properties:
key:
description: The key of the secret to select from. Must
be a valid secret key.
type: string
name:
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
TODO: Add other useful fields. apiVersion, kind, uid?'
type: string
optional:
description: Specify whether the Secret or its key must
be defined
type: boolean
required:
- key
type: object
username:
description: The secret in the service monitor namespace
that contains the username for authentication.
properties:
key:
description: The key of the secret to select from. Must
be a valid secret key.
type: string
name:
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
TODO: Add other useful fields. apiVersion, kind, uid?'
type: string
optional:
description: Specify whether the Secret or its key must
be defined
type: boolean
required:
- key
type: object
type: object
type: array
params:
description: Optional HTTP URL parameters
type: object
path:
description: HTTP path to scrape for metrics.
type: string
port:
description: Name of the service port this endpoint refers to.
Mutually exclusive with targetPort.
type: string
proxyUrl:
description: ProxyURL eg http://proxyserver:2195 Directs scrapes
to proxy through this endpoint.
type: string
relabelings:
description: 'RelabelConfigs to apply to samples before scraping.
More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config'
items:
description: 'RelabelConfig allows dynamic rewriting of the
label set, being applied to samples before ingestion. It defines
`<metric_relabel_configs>`-section of Prometheus configuration.
More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#metric_relabel_configs'
bearerTokenFile:
description: File to read bearer token for scraping targets.
type: string
bearerTokenSecret:
description: Secret to mount to read bearer token for scraping
targets. The secret needs to be in the same namespace as the
service monitor and accessible by the Prometheus Operator.
properties:
action:
description: Action to perform based on regex matching.
Default is 'replace'
key:
description: The key of the secret to select from. Must
be a valid secret key.
type: string
modulus:
description: Modulus to take of the hash of the source label
values.
format: int64
type: integer
regex:
description: Regular expression against which the extracted
value is matched. defailt is '(.*)'
type: string
replacement:
description: Replacement value against which a regex replace
is performed if the regular expression matches. Regex
capture groups are available. Default is '$1'
type: string
separator:
description: Separator placed between concatenated source
label values. default is ';'.
type: string
sourceLabels:
description: The source labels select values from existing
labels. Their content is concatenated using the configured
separator and matched against the configured regular expression
for the replace, keep, and drop actions.
items:
type: string
type: array
targetLabel:
description: Label to which the resulting value is written
in a replace action. It is mandatory for replace actions.
Regex capture groups are available.
name:
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
TODO: Add other useful fields. apiVersion, kind, uid?'
type: string
optional:
description: Specify whether the Secret or its key must
be defined
type: boolean
required:
- key
type: object
type: array
scheme:
description: HTTP scheme to use for scraping.
type: string
scrapeTimeout:
description: Timeout after which the scrape is ended
type: string
targetPort:
anyOf:
- type: string
- type: integer
tlsConfig:
description: TLSConfig specifies TLS configuration parameters.
properties:
ca: {}
caFile:
description: Path to the CA cert in the Prometheus container
to use for the targets.
type: string
cert: {}
certFile:
description: Path to the client cert file in the Prometheus
container for the targets.
type: string
insecureSkipVerify:
description: Disable target certificate validation.
type: boolean
keyFile:
description: Path to the client key file in the Prometheus
container for the targets.
type: string
keySecret:
description: SecretKeySelector selects a key of a Secret.
honorLabels:
description: HonorLabels chooses the metric's labels on collisions
with target labels.
type: boolean
honorTimestamps:
description: HonorTimestamps controls whether Prometheus respects
the timestamps present in scraped data.
type: boolean
interval:
description: Interval at which metrics should be scraped
type: string
metricRelabelings:
description: MetricRelabelConfigs to apply to samples before
ingestion.
items:
description: 'RelabelConfig allows dynamic rewriting of the
label set, being applied to samples before ingestion. It
defines `<metric_relabel_configs>`-section of Prometheus
configuration. More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#metric_relabel_configs'
properties:
key:
description: The key of the secret to select from. Must
be a valid secret key.
action:
description: Action to perform based on regex matching.
Default is 'replace'
type: string
name:
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names'
modulus:
description: Modulus to take of the hash of the source
label values.
format: int64
type: integer
regex:
description: Regular expression against which the extracted
value is matched. Default is '(.*)'
type: string
replacement:
description: Replacement value against which a regex replace
is performed if the regular expression matches. Regex
capture groups are available. Default is '$1'
type: string
separator:
description: Separator placed between concatenated source
label values. default is ';'.
type: string
sourceLabels:
description: The source labels select values from existing
labels. Their content is concatenated using the configured
separator and matched against the configured regular
expression for the replace, keep, and drop actions.
items:
type: string
type: array
targetLabel:
description: Label to which the resulting value is written
in a replace action. It is mandatory for replace actions.
Regex capture groups are available.
type: string
optional:
description: Specify whether the Secret or its key must
be defined
type: boolean
required:
- key
type: object
serverName:
description: Used to verify the hostname for the targets.
type: string
type: object
type: object
type: array
jobLabel:
description: The label to use to retrieve the job name from.
type: string
namespaceSelector:
description: NamespaceSelector is a selector for selecting either all
namespaces or a list of namespaces.
properties:
any:
description: Boolean describing whether all namespaces are selected
in contrast to a list restricting them.
type: boolean
matchNames:
description: List of namespace names.
items:
type: string
type: array
type: object
podTargetLabels:
description: PodTargetLabels transfers labels on the Kubernetes Pod
onto the target.
items:
type: string
type: array
sampleLimit:
description: SampleLimit defines per-scrape limit on number of scraped
samples that will be accepted.
format: int64
type: integer
selector:
description: A label selector is a label query over a set of resources.
The result of matchLabels and matchExpressions are ANDed. An empty
label selector matches all objects. A null label selector matches
no objects.
properties:
matchExpressions:
description: matchExpressions is a list of label selector requirements.
The requirements are ANDed.
items:
description: A label selector requirement is a selector that contains
values, a key, and an operator that relates the key and values.
properties:
key:
description: key is the label key that the selector applies
to.
type: string
operator:
description: operator represents a key's relationship to a
set of values. Valid operators are In, NotIn, Exists and
DoesNotExist.
type: string
values:
description: values is an array of string values. If the operator
is In or NotIn, the values array must be non-empty. If the
operator is Exists or DoesNotExist, the values array must
be empty. This array is replaced during a strategic merge
patch.
type: array
params:
additionalProperties:
items:
type: string
type: array
required:
- key
- operator
type: object
type: array
matchLabels:
description: matchLabels is a map of {key,value} pairs. A single
{key,value} in the matchLabels map is equivalent to an element
of matchExpressions, whose key field is "key", the operator is
"In", and the values array contains only "value". The requirements
are ANDed.
description: Optional HTTP URL parameters
type: object
path:
description: HTTP path to scrape for metrics.
type: string
port:
description: Name of the service port this endpoint refers to.
Mutually exclusive with targetPort.
type: string
proxyUrl:
description: ProxyURL eg http://proxyserver:2195 Directs scrapes
to proxy through this endpoint.
type: string
relabelings:
description: 'RelabelConfigs to apply to samples before scraping.
More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config'
items:
description: 'RelabelConfig allows dynamic rewriting of the
label set, being applied to samples before ingestion. It
defines `<metric_relabel_configs>`-section of Prometheus
configuration. More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#metric_relabel_configs'
properties:
action:
description: Action to perform based on regex matching.
Default is 'replace'
type: string
modulus:
description: Modulus to take of the hash of the source
label values.
format: int64
type: integer
regex:
description: Regular expression against which the extracted
value is matched. Default is '(.*)'
type: string
replacement:
description: Replacement value against which a regex replace
is performed if the regular expression matches. Regex
capture groups are available. Default is '$1'
type: string
separator:
description: Separator placed between concatenated source
label values. default is ';'.
type: string
sourceLabels:
description: The source labels select values from existing
labels. Their content is concatenated using the configured
separator and matched against the configured regular
expression for the replace, keep, and drop actions.
items:
type: string
type: array
targetLabel:
description: Label to which the resulting value is written
in a replace action. It is mandatory for replace actions.
Regex capture groups are available.
type: string
type: object
type: array
scheme:
description: HTTP scheme to use for scraping.
type: string
scrapeTimeout:
description: Timeout after which the scrape is ended
type: string
targetPort:
anyOf:
- type: integer
- type: string
description: Name or number of the target port of the Pod behind
the Service, the port must be specified with container port
property. Mutually exclusive with port.
x-kubernetes-int-or-string: true
tlsConfig:
description: TLS configuration to use when scraping the endpoint
properties:
ca:
description: Stuct containing the CA cert to use for the
targets.
properties:
configMap:
description: ConfigMap containing data to use for the
targets.
properties:
key:
description: The key to select.
type: string
name:
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
TODO: Add other useful fields. apiVersion, kind,
uid?'
type: string
optional:
description: Specify whether the ConfigMap or its
key must be defined
type: boolean
required:
- key
type: object
secret:
description: Secret containing data to use for the targets.
properties:
key:
description: The key of the secret to select from. Must
be a valid secret key.
type: string
name:
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
TODO: Add other useful fields. apiVersion, kind,
uid?'
type: string
optional:
description: Specify whether the Secret or its key
must be defined
type: boolean
required:
- key
type: object
type: object
caFile:
description: Path to the CA cert in the Prometheus container
to use for the targets.
type: string
cert:
description: Struct containing the client cert file for
the targets.
properties:
configMap:
description: ConfigMap containing data to use for the
targets.
properties:
key:
description: The key to select.
type: string
name:
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
TODO: Add other useful fields. apiVersion, kind,
uid?'
type: string
optional:
description: Specify whether the ConfigMap or its
key must be defined
type: boolean
required:
- key
type: object
secret:
description: Secret containing data to use for the targets.
properties:
key:
description: The key of the secret to select from. Must
be a valid secret key.
type: string
name:
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
TODO: Add other useful fields. apiVersion, kind,
uid?'
type: string
optional:
description: Specify whether the Secret or its key
must be defined
type: boolean
required:
- key
type: object
type: object
certFile:
description: Path to the client cert file in the Prometheus
container for the targets.
type: string
insecureSkipVerify:
description: Disable target certificate validation.
type: boolean
keyFile:
description: Path to the client key file in the Prometheus
container for the targets.
type: string
keySecret:
description: Secret containing the client key file for the
targets.
properties:
key:
description: The key of the secret to select from. Must
be a valid secret key.
type: string
name:
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
TODO: Add other useful fields. apiVersion, kind, uid?'
type: string
optional:
description: Specify whether the Secret or its key must
be defined
type: boolean
required:
- key
type: object
serverName:
description: Used to verify the hostname for the targets.
type: string
type: object
type: object
type: object
targetLabels:
description: TargetLabels transfers labels on the Kubernetes Service
onto the target.
items:
type: array
jobLabel:
description: The label to use to retrieve the job name from.
type: string
type: array
required:
- endpoints
- selector
type: object
type: object
version: v1
namespaceSelector:
description: Selector to select which namespaces the Endpoints objects
are discovered from.
properties:
any:
description: Boolean describing whether all namespaces are selected
in contrast to a list restricting them.
type: boolean
matchNames:
description: List of namespace names.
items:
type: string
type: array
type: object
podTargetLabels:
description: PodTargetLabels transfers labels on the Kubernetes Pod
onto the target.
items:
type: string
type: array
sampleLimit:
description: SampleLimit defines per-scrape limit on number of scraped
samples that will be accepted.
format: int64
type: integer
selector:
description: Selector to select Endpoints objects.
properties:
matchExpressions:
description: matchExpressions is a list of label selector requirements.
The requirements are ANDed.
items:
description: A label selector requirement is a selector that
contains values, a key, and an operator that relates the key
and values.
properties:
key:
description: key is the label key that the selector applies
to.
type: string
operator:
description: operator represents a key's relationship to
a set of values. Valid operators are In, NotIn, Exists
and DoesNotExist.
type: string
values:
description: values is an array of string values. If the
operator is In or NotIn, the values array must be non-empty.
If the operator is Exists or DoesNotExist, the values
array must be empty. This array is replaced during a strategic
merge patch.
items:
type: string
type: array
required:
- key
- operator
type: object
type: array
matchLabels:
additionalProperties:
type: string
description: matchLabels is a map of {key,value} pairs. A single
{key,value} in the matchLabels map is equivalent to an element
of matchExpressions, whose key field is "key", the operator
is "In", and the values array contains only "value". The requirements
are ANDed.
type: object
type: object
targetLabels:
description: TargetLabels transfers labels on the Kubernetes Service
onto the target.
items:
type: string
type: array
required:
- endpoints
- selector
type: object
required:
- spec
type: object
served: true
storage: true
status:
acceptedNames:
kind: ""
plural: ""
conditions: []
storedVersions: []
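
For orientation, here is a minimal sketch of a ServiceMonitor manifest that should fit the `v1` schema in the diff above. Only the field names come from the schema. The object name, namespace, labels, port name `web`, and metric regex are hypothetical and do not appear in this diff.

```yaml
# Hypothetical example: all names, labels, and values below are illustrative.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  namespace: default
spec:
  # Service label used to derive the job name.
  jobLabel: app.kubernetes.io/name
  # Required: which objects to select, by label.
  selector:
    matchLabels:
      app.kubernetes.io/name: example-app
  # Restrict discovery to a list of namespaces.
  namespaceSelector:
    matchNames:
    - default
  # Required: at least one scrape endpoint definition.
  endpoints:
  - port: web            # named Service port; mutually exclusive with targetPort
    path: /metrics
    scheme: http
    interval: 30s
    scrapeTimeout: 10s
    metricRelabelings:
    - action: drop       # drop samples whose metric name matches the regex
      sourceLabels:
      - __name__
      regex: example_debug_.*
```

The `endpoints` and `selector` keys are included because the schema lists them as required under `spec`. Everything else is optional.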

File diff suppressed because it is too large

@@ -4,24 +4,21 @@ metadata:
labels:
app.kubernetes.io/component: controller
app.kubernetes.io/name: prometheus-operator
app.kubernetes.io/version: v0.34.0
app.kubernetes.io/version: v0.42.1
name: prometheus-operator
rules:
- apiGroups:
- apiextensions.k8s.io
resources:
- customresourcedefinitions
verbs:
- '*'
- apiGroups:
- monitoring.coreos.com
resources:
- alertmanagers
- alertmanagers/finalizers
- prometheuses
- prometheuses/finalizers
- alertmanagers/finalizers
- thanosrulers
- thanosrulers/finalizers
- servicemonitors
- podmonitors
- probes
- prometheusrules
verbs:
- '*'
@@ -71,3 +68,15 @@ rules:
- get
- list
- watch
- apiGroups:
- authentication.k8s.io
resources:
- tokenreviews
verbs:
- create
- apiGroups:
- authorization.k8s.io
resources:
- subjectaccessreviews
verbs:
- create

Some files were not shown because too many files have changed in this diff