680 Commits

Author SHA1 Message Date
simonpasquier 62a5b28b55 [bot] [release-0.9] Automated version update 2021-08-25 09:37:18 +00:00
Damien Grisonnet b5ec93208b jsonnet: drop deprecated etcd metric
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2021-08-18 17:27:50 +02:00
Damien Grisonnet 45adc03cfb jsonnet: update prometheus-adapter to v0.9.0
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2021-08-17 18:05:45 +02:00
dgrisonnet 6ade9e5c7d [bot] [main] Automated version update 2021-08-17 08:05:49 +00:00
paulfantom ad3fc8920e [bot] [main] Automated version update 2021-08-16 08:04:51 +00:00
Dimitrije Manic 12cd7fd9ce Prometheus ruleSelector defaults to all rules 2021-08-11 10:16:24 -04:00
dgrisonnet e97eb0fbe9 [bot] [main] Automated version update 2021-08-02 13:37:08 +00:00
Paweł Krupa b9c73c7b29 Merge pull request #1283 from prashbnair/node-veth
changing node exporter ignore list
2021-07-28 09:17:03 +02:00
Prashant Balachandran 09fdac739d changing node exporter ignore list 2021-07-27 17:17:19 +05:30
lanmarti ed48391831 Add resource requests and limits to prometheus-adapter container 2021-07-27 12:19:51 +02:00
paulfantom 05c72f83ef [bot] Automated version update 2021-07-26 13:44:14 +00:00
Manuel Rüger acd1eeba4c node.libsonnet: Fix small typo
Signed-off-by: Manuel Rüger <manuel@rueg.eu>
2021-07-22 19:14:24 +02:00
paulfantom 755d2fe5c1 manifests: regenerate 2021-07-22 17:31:30 +02:00
Paweł Krupa acea5efd85 Merge pull request #1268 from paulfantom/alerts-best-practices
Alerts best practices
2021-07-21 09:32:32 +02:00
Philip Gough 463ad065d3 jsonnet: Drop cAdvisor metrics with no (pod, namespace) labels while preserving ability to monitor system services resource usage
The following provides a description and cardinality estimation based on the tests in a local cluster:

container_blkio_device_usage_total - useful for containers, but not for system services (nodes*disks*services*operations*2)
container_fs_.*                    - add filesystem read/write data (nodes*disks*services*4)
container_file_descriptors         - file descriptors limits and global numbers are exposed via (nodes*services)
container_threads_max              - max number of threads in cgroup. Usually for system services it is not limited (nodes*services)
container_threads                  - used threads in cgroup. Usually not important for system services (nodes*services)
container_sockets                  - used sockets in cgroup. Usually not important for system services (nodes*services)
container_start_time_seconds       - container start. Possibly not needed for system services (nodes*services)
container_last_seen                - Not needed as system services are always running (nodes*services)
container_spec_.*                  - Everything related to cgroup specification and thus static data (nodes*services*5)
2021-07-20 12:50:02 +01:00
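
The cardinality formulas listed in the commit above can be turned into a rough series count. The following is a minimal sketch, not part of the original commit; the function name and the cluster sizes (3 nodes, 2 disks, 10 system services, 4 block I/O operations) are hypothetical values chosen only for illustration.

```python
def estimate_dropped_series(nodes, disks, services, operations):
    """Apply the per-metric formulas from the commit message.

    All arguments are illustrative cluster dimensions, not measured values.
    """
    per_service = nodes * services
    return {
        "container_blkio_device_usage_total": nodes * disks * services * operations * 2,
        "container_fs_.*": nodes * disks * services * 4,
        "container_file_descriptors": per_service,
        "container_threads_max": per_service,
        "container_threads": per_service,
        "container_sockets": per_service,
        "container_start_time_seconds": per_service,
        "container_last_seen": per_service,
        "container_spec_.*": per_service * 5,
    }


# Hypothetical small cluster used only to show the arithmetic.
estimates = estimate_dropped_series(nodes=3, disks=2, services=10, operations=4)
total = sum(estimates.values())

for name, count in estimates.items():
    print(f"{name:40s} {count}")
print(f"{'total':40s} {total}")
```

The formulas grow linearly in each dimension, so the block I/O and filesystem metrics dominate as soon as nodes carry more than one disk.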
paulfantom 02454b3f53 manifests: regenerate 2021-07-20 11:14:28 +02:00
paulfantom 1a3c610c61 [bot] Automated version update 2021-07-19 13:44:23 +00:00
Yury Gargay 9b08b941f8 Update kubernetes-mixin
From https://github.com/kubernetes-monitoring/kubernetes-mixin/commit/b710a868a95621aa93e0b661954f63f4db82aaea
2021-07-14 18:51:36 +02:00
Damien Grisonnet 97e77e9996 Merge pull request #1231 from dgrisonnet/fix-adapter-queries
Consolidate intervals used in prometheus-adapter CPU queries
2021-07-07 13:48:02 +02:00
Philip 3e6865d776 Generate kubernetes-mixin 2021-07-06 17:49:32 +02:00
Damien Grisonnet b9563b9c2d jsonnet: improve adapter queries readability
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2021-07-05 15:29:45 +02:00
Damien Grisonnet 8812e45501 jsonnet: readjust prometheus-adapter intervals
Previously, the prometheus-adapter configuration didn't take into account
the scrape interval of kubelet, node-exporter and windows-exporter,
which led to stale results, and even negative results from the
CPU queries when the irate() function was extrapolating data.
To fix that, we set the interval used in the irate() function in
the CPU queries to 4x the scrape interval in order to extrapolate data
between the last two scrapes. This improves the freshness of the CPU
usage exposed and prevents incorrect extrapolations.

Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2021-07-05 15:28:25 +02:00
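
The 4x rule described in the commit above can be sketched as follows. This is an illustration only: the helper names, the 30-second scrape interval, and the query shape are assumptions, and the actual prometheus-adapter queries are not reproduced here.

```python
def irate_window_seconds(scrape_interval_seconds, factor=4):
    """Return the range-vector window: `factor` times the scrape interval."""
    return scrape_interval_seconds * factor


def cpu_query(metric, scrape_interval_seconds):
    """Build a simplified PromQL string (assumed shape, for illustration)."""
    window = irate_window_seconds(scrape_interval_seconds)
    return f"sum(irate({metric}[{window}s]))"


# Assumed 30s scrape interval; the real value depends on the scraped target.
query = cpu_query("container_cpu_usage_seconds_total", 30)
print(query)
```

Because irate() only uses the last two samples inside the window, a window of several scrape intervals leaves room for a missed scrape while still reporting the most recent rate.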
Sunil Thaha 0280f4ddf9 jsonnet: kube-prometheus adapt to changes to veth interfaces names
With OVN, the container veth network interface names that used to start
with `veth` have now changed to `<rand-hex>{15}@if<number>` (see Related
Links below).

This patch adapts to the change introduced in OVN and ignores the network
interfaces that match `[a-z0-9]{15}@if\d+` in addition to those starting
with `veth`.

Related Links:
  - https://github.com/openshift/ovn-kubernetes/blob/master/go-controller/vendor/github.com/containernetworking/plugins/pkg/ip/link_linux.go#L107
  - https://github.com/openshift/ovn-kubernetes/blob/master/go-controller/pkg/cni/helper_linux.go#L148

Signed-off-by: Sunil Thaha <sthaha@redhat.com>
2021-07-01 12:01:19 +10:00
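
The matching rule from the commit above can be checked quickly with a regular expression. This is a minimal sketch: combining both cases into a single anchored pattern is an assumption made here for illustration, and the interface names are invented examples.

```python
import re

# Assumed combination of the two cases named in the commit message:
# names starting with `veth`, or 15 lowercase alphanumerics followed by @if<number>.
IGNORED_INTERFACE = re.compile(r"veth.*|[a-z0-9]{15}@if\d+")


def is_ignored(interface_name):
    """Return True when the whole interface name matches the ignore pattern."""
    return IGNORED_INTERFACE.fullmatch(interface_name) is not None


# Invented interface names used only to exercise the pattern.
samples = ["veth1a2b3c4d", "0123456789abcde@if42", "eth0", "0123456789abcd@if42"]
results = {name: is_ignored(name) for name in samples}
print(results)
```

The last sample has only 14 characters before `@if`, so it falls outside the `{15}` quantifier and is not ignored.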
Damien Grisonnet 2c5c20cfff Merge pull request #1216 from fpetkovski/prometheus-adapter-cipher-suites
jsonnet: disable insecure cypher suites for prometheus-adapter
2021-06-23 21:19:24 +02:00
paulfantom d0e21f34e5 [bot] Automated version update 2021-06-23 13:41:46 +00:00
fpetkovski 0959155a1c jsonnet: update downstream dependencies
This commit updates all downstream dependencies

Signed-off-by: fpetkovski <filip.petkovsky@gmail.com>
2021-06-22 16:27:29 +02:00
fpetkovski 0ff173efea jsonnet: disable insecure cypher suites for prometheus-adapter
Running sslscan against the prometheus adapter secure port reports two
insecure SSL ciphers, ECDHE-RSA-DES-CBC3-SHA and DES-CBC3-SHA.

This commit removes those ciphers from the list.

Signed-off-by: fpetkovski <filip.petkovsky@gmail.com>
2021-06-22 14:17:09 +02:00
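
Removing the two reported ciphers from a suite list amounts to a simple filter. The sketch below is illustrative only: the starting list is hypothetical and is not the list prometheus-adapter actually ships with, and the names are written in the OpenSSL style that sslscan reports.

```python
# The two ciphers that the commit message reports as insecure.
INSECURE_CIPHERS = {"ECDHE-RSA-DES-CBC3-SHA", "DES-CBC3-SHA"}

# Hypothetical starting list, used only to demonstrate the filter.
configured = [
    "ECDHE-RSA-AES128-GCM-SHA256",
    "ECDHE-RSA-DES-CBC3-SHA",
    "ECDHE-RSA-AES256-GCM-SHA384",
    "DES-CBC3-SHA",
]

# Keep the original order and drop only the insecure entries.
allowed = [name for name in configured if name not in INSECURE_CIPHERS]
print(allowed)
```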
Philip Gough 3a4e292aab Sync with kubernetes-mixin 2021-06-22 11:11:40 +01:00
paulfantom ffea8f498e [bot] Automated version update 2021-06-18 13:50:44 +00:00
paulfantom d6201759b8 [bot] Automated version update 2021-06-14 13:50:57 +00:00
Paweł Krupa 11778868b1 Merge pull request #1202 from prashbnair/kube-mixin 2021-06-12 13:36:39 +02:00
Prashant Balachandran 78a4677370 pulling in changes from kubernetes-mixin
adding changes from kube-mixin
2021-06-12 15:26:37 +05:30
paulfantom 54f79428ce [bot] Automated version update 2021-06-11 13:51:10 +00:00
Paweł Krupa df197f6759 Merge pull request #1192 from prometheus-operator/automated-updates 2021-06-11 15:47:41 +02:00
paulfantom edc869991d manifests: regenerate 2021-06-11 11:02:21 +02:00
paulfantom a2cf1acd95 [bot] Automated version update 2021-06-10 13:59:30 +00:00
ArthurSens f643955034 Update alertmanager mixin
Signed-off-by: ArthurSens <arthursens2005@gmail.com>
2021-06-08 18:19:23 +00:00
Prem Saraswat 93282accb7 Generate manifests 2021-05-27 23:21:30 +05:30
paulfantom b10e0c9690 manifests: regenerate 2021-05-27 10:51:14 +02:00
paulfantom edd0eb639e manifests: regenerate 2021-05-26 12:50:11 +02:00
paulfantom 888443e447 manifests: regenerate 2021-05-25 16:03:49 +02:00
Paweł Krupa a1210f1eff Merge pull request #1132 from paulfantom/ruleNamespaceSelector 2021-05-06 23:05:34 +02:00
Damien Grisonnet a4a4d4b744 jsonnet: add PDB to prometheus-adapter
Adding a PodDisruptionBudget to prometheus-adapter ensures that at least
one replica of the adapter is always available. This makes sure that even
during disruption the aggregated API is available and thus does not
impact the availability of the apiserver.

Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2021-05-05 16:15:25 +02:00
paulfantom 15a8351ce0 manifests: regenerate 2021-05-05 08:57:27 +02:00
paulfantom 415afa4cc0 *: cut release-0.8
Signed-off-by: paulfantom <pawel@krupa.net.pl>
2021-04-27 13:08:03 +02:00
Paweł Krupa a3d67f5219 Merge pull request #1095 from dgrisonnet/prometheus-adapter-ha
Make prometheus-adapter highly-available
2021-04-22 12:00:39 +02:00
Damien Grisonnet 4c6a06cf7e jsonnet: make prometheus-adapter highly-available
Prometheus-adapter is a component of the monitoring stack that in most
cases needs to be highly available. For instance, we most likely
always want the autoscaling pipeline to be available and we also want to
avoid having no available backends serving the metrics API apiservices,
as it would result in both the AggregatedAPIDown alert firing and the
kubectl top command not working anymore.

In order to make the adapter highly available, we need to increase its
replica count to 2 and come up with a rolling update strategy and a
pod anti-affinity rule based on the kubernetes hostname to prevent the
adapters from being scheduled on the same node. The default rolling update
strategy for deployments isn't enough as the default maxUnavailable value
is 25% and is rounded down to 0. This means that during rolling updates
scheduling will fail if there aren't more nodes than the number of
replicas. As for the maxSurge, the default should be fine as it is
rounded up to 1, but for clarity it might be better to just set it to 1.
For the pod anti-affinity constraints, it would be best if it was hard,
but having it soft should be good enough and fit most use-cases.

Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2021-04-22 09:57:14 +02:00
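
The rounding behaviour described in the commit above can be worked through for the two-replica case. This is a minimal sketch of the arithmetic only; the function name is hypothetical, and it models just the two rounding rules stated in the commit message (maxUnavailable rounds down, maxSurge rounds up), not the full Deployment controller logic.

```python
def default_rolling_update(replicas, percent=25):
    """Resolve percentage-based rolling update values to absolute pod counts.

    maxUnavailable is rounded down and maxSurge is rounded up, as stated
    in the commit message. Integer arithmetic avoids float rounding issues.
    """
    scaled = replicas * percent
    max_unavailable = scaled // 100      # floor
    max_surge = -(-scaled // 100)        # ceiling
    return max_unavailable, max_surge


max_unavailable, max_surge = default_rolling_update(replicas=2)
print(f"maxUnavailable={max_unavailable} maxSurge={max_surge}")
```

With 2 replicas, 25% resolves to 0.5 pods, so no existing pod may be taken down and one extra pod must be scheduled first. Under a hard anti-affinity rule that extra pod would need a third node, which is why the commit weighs a soft constraint.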
paulfantom 412061ef51 manifests: regenerate 2021-04-21 18:43:01 +02:00
Paweł Krupa 752d1a7fdc Merge pull request #1093 from ArthurSens/as/custom-alerts-description 2021-04-20 19:13:48 +02:00
Jan Fajerski 8b39a459fa update generated assets
Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
2021-04-20 14:35:31 +02:00