Commit Graph

528 Commits

Author SHA1 Message Date
dgrisonnet
8e257a945f [bot] [release-0.6] Automated version update 2021-10-11 07:39:26 +00:00
Philip Gough
d714141304 This change drops pod-centric metrics without a non-empty 'container' label.
Previously we dropped pod-centric metrics without a (pod, namespace) label set;
however, these can be critical for debugging.

Keep 'container_fs_.*' metrics from cAdvisor
2021-09-28 10:53:20 +01:00
Philip Gough
47e55a460e jsonnet: Drop cAdvisor metrics with no (pod, namespace) labels while preserving ability to monitor system services resource usage
The following provides a description and cardinality estimation based on the tests in a local cluster:

container_blkio_device_usage_total - useful for containers, but not for system services (nodes*disks*services*operations*2)
container_fs_.*                    - add filesystem read/write data (nodes*disks*services*4)
container_file_descriptors         - file descriptors limits and global numbers are exposed via (nodes*services)
container_threads_max              - max number of threads in cgroup. Usually for system services it is not limited (nodes*services)
container_threads                  - used threads in cgroup. Usually not important for system services (nodes*services)
container_sockets                  - used sockets in cgroup. Usually not important for system services (nodes*services)
container_start_time_seconds       - container start. Possibly not needed for system services (nodes*services)
container_last_seen                - Not needed as system services are always running (nodes*services)
container_spec_.*                  - Everything related to cgroup specification and thus static data (nodes*services*5)
2021-08-30 12:45:20 +01:00
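The cardinality formulas listed in the commit message above can be turned into a quick back-of-the-envelope calculation. The following is a minimal sketch only: the function name `estimate_series` and the cluster sizes (6 nodes, 2 disks, 10 system services, 4 block I/O operations) are hypothetical values chosen for illustration, not figures from the commit.

```python
def estimate_series(nodes: int, disks: int, services: int, operations: int) -> dict:
    """Estimate series counts per metric family using the formulas from the commit message."""
    per_service = nodes * services  # families that scale as nodes*services
    return {
        "container_blkio_device_usage_total": nodes * disks * services * operations * 2,
        "container_fs_.*": nodes * disks * services * 4,
        "container_file_descriptors": per_service,
        "container_threads_max": per_service,
        "container_threads": per_service,
        "container_sockets": per_service,
        "container_start_time_seconds": per_service,
        "container_last_seen": per_service,
        "container_spec_.*": nodes * services * 5,
    }


# Hypothetical cluster dimensions, assumed purely for illustration.
estimate = estimate_series(nodes=6, disks=2, services=10, operations=4)
for family, count in estimate.items():
    print(f"{family:<36} {count}")
print("total", sum(estimate.values()))
```

Under these assumed sizes the block I/O family dominates, which is consistent with the commit's note that it is useful for containers but not for system services.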
dgrisonnet
684dbc0bd5 [bot] [release-0.6] Automated version update 2021-08-17 08:05:35 +00:00
Lili Cosic
5550829c60 manifests: Regenerate 2020-09-23 11:45:08 +02:00
Sergiusz Urbaniak
05b7a932ab jsonnet: bump to prometheus-operator 0.42 2020-09-21 10:51:24 +02:00
Scott Dodson
87fabbc077 node-exporter: set maxUnavailable to 10%
This daemonset doesn't affect workload availability so allow its rollout to
be parallelized.
2020-09-03 10:01:53 +02:00
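To illustrate what a 10% maxUnavailable setting means in practice, here is a rough sketch of how many node-exporter pods could be updated in parallel. It assumes the percentage is rounded up to an absolute pod count, as the Kubernetes API documentation describes for DaemonSet rolling updates; the helper `max_unavailable_pods` and the node counts are hypothetical.

```python
def max_unavailable_pods(total_pods: int, percent: int = 10) -> int:
    """Absolute pod count for a percentage, rounded up (integer ceiling division)."""
    return -(-total_pods * percent // 100)


# Hypothetical cluster sizes, assumed for illustration only.
for nodes in (5, 25, 100):
    print(nodes, "nodes ->", max_unavailable_pods(nodes), "pods updated in parallel")
```

Under that rounding assumption, even a small cluster would still update at least one pod at a time, while larger clusters roll out in wider batches.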
Lili Cosic
09a305bc0e manifests/prometheus-rules.yaml: Regenerate 2020-08-06 15:08:37 +02:00
Sergiusz Urbaniak
685a85e3e0 jb update, manifests: generate 2020-07-31 10:18:24 +02:00
Frederic Branczyk
f0955e0540 Merge pull request #623 from brancz/add-kubelet-probes-metrics
Add scraping of endpoint for kubelet probe metrics
2020-07-29 12:57:28 +02:00
Frederic Branczyk
7c35752e3f Add scraping of endpoint for kubelet probe metrics 2020-07-29 11:49:52 +02:00
Frederic Branczyk
b51b9b983f prometheus-adapter: Collect metrics from Prometheus Adapter 2020-07-29 11:38:42 +02:00
Frederic Branczyk
6771c9bcc2 Merge pull request #616 from paulfantom/ciphers
Update default ciphers used by kube-rbac-proxy
2020-07-28 09:31:20 +02:00
paulfantom
63ad66e3f3 manifests: regenerate 2020-07-28 08:49:27 +02:00
root
3a6a0d0837 make generate 2020-07-27 10:29:31 +02:00
Frederic Branczyk
40adbfae6c Merge pull request #617 from paulfantom/node_filesystem_usage
Remove instance:node_filesystem_usage:sum
2020-07-23 21:25:55 +02:00
Frederic Branczyk
ba5c6e2e6a Merge pull request #618 from simonpasquier/bump-thanos
jsonnet: update component versions
2020-07-23 21:24:48 +02:00
Adin Hodovic
6a34239786 Regenerate dashboards and alerts
Merged https://github.com/kubernetes-monitoring/kubernetes-mixin/pull/463 to remove duplicate entries for memory usage; however, I'd like to move these changes to the Prometheus-Operator helm chart (https://github.com/helm/charts/pull/23024#issuecomment-661967101). I've regenerated the dashboards/alerts.
2020-07-23 18:36:41 +02:00
Simon Pasquier
a9ffdaa35c manifests: regenerate 2020-07-23 18:04:56 +02:00
paulfantom
550d42d95b manifests: regenerate 2020-07-23 16:51:35 +02:00
Lili Cosic
d88cb26377 manifests/prometheus-rules.yaml: Regenerate 2020-07-15 10:28:03 +02:00
Lili Cosic
a5b71282cd manifests/prometheus-rules.yaml: Regenerate 2020-07-13 17:35:36 +02:00
Lili Cosic
617003a583 manifests: Regenerate files 2020-07-09 11:48:30 +02:00
Abu Kashem
4d6e3d5c19 enable etcd latency metrics in kube-apiserver
kube-apiserver has a histogram etcd_request_duration_seconds that
measures latency between the kube-apiserver and etcd instance.
This metric is currently dropped by cluster-prometheus. Enable
this metric so we have visibility into etcd latency.

We ensured that this does not enable other unwanted metrics:

count by(__name__) ({__name__=~"etcd_request.+"})

etcd_request_duration_seconds_bucket
etcd_request_duration_seconds_count
etcd_request_duration_seconds_sum
2020-07-03 09:49:56 -04:00
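The check described in the commit message above can be approximated offline. The sketch below applies the `etcd_request.+` pattern to a list of metric names using full-string matching, since Prometheus anchors its regular expressions at both ends. The three `etcd_request_duration_seconds_*` names come from the commit message; the other names in the list are hypothetical examples added only to show non-matches, and Python's `re` module stands in for the RE2 engine Prometheus uses.

```python
import re

# Pattern from the commit message; Prometheus matches it against the whole name.
PATTERN = re.compile(r"etcd_request.+")

metric_names = [
    "etcd_request_duration_seconds_bucket",
    "etcd_request_duration_seconds_count",
    "etcd_request_duration_seconds_sum",
    "etcd_object_counts",        # hypothetical non-matching name
    "apiserver_request_total",   # hypothetical non-matching name
]

# fullmatch mimics Prometheus' implicit ^...$ anchoring.
matched = [name for name in metric_names if PATTERN.fullmatch(name)]
for name in matched:
    print(name)
```

With this sample list, only the three histogram series named in the commit message match.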
Matthias Loibl
ea7a834755 Update kubernetes-mixin to remove KubeAPILatencyHigh & KubeAPIErrorsHigh 2020-06-29 19:43:34 +02:00
Simon Pasquier
83ebd535e6 manifests: regenerate 2020-06-24 10:55:13 +02:00
Tom Quinn
e82acdb253 Updated prometheus adapter deployment to use a multi arch image repo 2020-06-22 13:57:41 +01:00
Frederic Branczyk
6f488250fd Merge pull request #576 from simonpasquier/fix-alertmanager-config-inconsistent-alert
Fix AlertmanagerConfigInconsistent alert
2020-06-19 16:20:40 +02:00
Simon Pasquier
0a43e85917 manifests: regenerate 2020-06-19 14:41:11 +02:00
Stavros Foteinopoulos
3cbc97d782 Update prometheus-adapter endpoint 2020-06-19 15:27:26 +03:00
Lili Cosic
beaba9f4da docs, manifests: Regenerate files 2020-06-19 10:30:50 +02:00
Frederic Branczyk
5b9341cad6 Merge pull request #527 from pgier/node-exporter-ignore-pod-mounts
Node exporter ignore pod mounts
2020-05-15 07:10:32 +02:00
Paul Gier
6742260399 update generated files for prometheus operator v0.39.0 2020-05-12 17:38:11 -05:00
Paul Gier
b40e70065b update generated files for node exporter ignored filesystems 2020-05-12 17:20:24 -05:00
Frederic Branczyk
f58d7b5695 Merge pull request #519 from pgier/dont-remove-preserve-unknown-fields
Revert "Remove field preserveUnknownFields from CRDs"
2020-05-11 16:16:22 +02:00
paulfantom
7faed14744 *: regenerate 2020-05-11 11:59:55 +02:00
Frederic Branczyk
dab022fc62 Merge pull request #508 from johanneswuerbach/custom-metrics-b2
custom metrics v1beta2 api with k8s-prometheus-adapter v0.7.0
2020-05-07 10:12:42 +02:00
Paul Gier
4840cdcb66 Revert "Remove field preserveUnknownFields from CRDs"
This reverts commit cdaaf3d51c.
2020-05-05 14:15:18 -05:00
Benjamin
7130905473 Update prometheus version to v2.17.2
Signed-off-by: Benjamin <benjamin@yunify.com>
2020-04-30 14:46:17 +08:00
Johannes Würbach
8d6679658f k8s-prometheus-adapter v0.7.0 2020-04-30 00:26:06 +02:00
Frederic Branczyk
070413521c Merge pull request #478 from NickelMedia/fix-nodeexporter-selector-labels
Remove version label from node-exporter selectors
2020-04-27 15:45:58 +02:00
Lili Cosic
926337feac manifests: Regenerate 2020-04-17 09:48:06 +02:00
Johannes Würbach
2ab69fdac0 Fix rules window 2020-04-07 22:01:26 +02:00
Zack Brenton
46aa9554d1 updated generated manifests 2020-04-07 11:06:30 -03:00
Johannes Würbach
bb21ea32e3 Make prometheus-adapter config a real object 2020-04-07 15:32:33 +02:00
Lili Cosic
7992aa4e73 manifests: Regenerate files 2020-04-03 12:00:49 +02:00
Lili Cosic
cf7bb8706c Merge pull request #463 from rajatvig/support_standard_labels_nodeexporter
Support standard labels for nodeexporter
2020-03-25 09:33:12 +01:00
paulfantom
771ff9dcf4 manifests: regenerate 2020-03-24 15:49:39 +01:00
Rajat Vig
ff6b7ae5f3 Update Manifests based off the new jsonnets 2020-03-24 11:08:39 +00:00
Rajat Vig
474d4e39dc Remove the app label for node-exporter 2020-03-24 10:41:51 +00:00