This change drops pod-centric metrics without a non-empty 'container' label.
Previously we dropped pod-centric metrics without a (pod, namespace) label set;
however, some of those metrics can be critical for debugging.
Keep 'container_fs_.*' metrics from cAdvisor
The following provides a description and a cardinality estimate for each metric, based on tests in a local cluster:
container_blkio_device_usage_total - useful for containers, but not for system services (nodes*disks*services*operations*2)
container_fs_.* - add filesystem read/write data (nodes*disks*services*4)
container_file_descriptors - file descriptor limits and global numbers are exposed via (nodes*services)
container_threads_max - max number of threads in cgroup. Usually for system services it is not limited (nodes*services)
container_threads - used threads in cgroup. Usually not important for system services (nodes*services)
container_sockets - used sockets in cgroup. Usually not important for system services (nodes*services)
container_start_time_seconds - container start. Possibly not needed for system services (nodes*services)
container_last_seen - Not needed as system services are always running (nodes*services)
container_spec_.* - Everything related to cgroup specification and thus static data (nodes*services*5)
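
A minimal sketch of what the resulting cAdvisor metric relabeling could look
like, assuming standard Prometheus metric_relabel_configs syntax (the exact
rule shipped by this change may differ; note that container_fs_.* is
deliberately absent from the drop list):

    metric_relabel_configs:
      # Drop the listed pod-centric series only when the 'container' label is
      # empty; the same series with a non-empty 'container' label are kept.
      - source_labels: [__name__, container]
        regex: (container_blkio_device_usage_total|container_file_descriptors|container_sockets|container_threads|container_threads_max|container_start_time_seconds|container_last_seen|container_spec_.*);
        action: drop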
Adding a PodDisruptionBudget to prometheus-adapter ensures that at least
one replica of the adapter is always available. This makes sure that the
aggregated API remains available even during disruptions, so the
availability of the apiserver is not impacted.
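
A minimal sketch of the PodDisruptionBudget, assuming the usual
prometheus-adapter labels and namespace (both are assumptions; the
apiVersion may be policy/v1beta1 on older clusters):

    apiVersion: policy/v1
    kind: PodDisruptionBudget
    metadata:
      name: prometheus-adapter
      namespace: openshift-monitoring
    spec:
      minAvailable: 1
      selector:
        matchLabels:
          name: prometheus-adapter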
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
Prometheus-adapter is a component of the monitoring stack that in most
cases needs to be highly available. For instance, we most likely always
want the autoscaling pipeline to be available, and we also want to avoid
having no backends available to serve the metrics API APIServices, as
that would result in both the AggregatedAPIDown alert firing and the
kubectl top command no longer working.
In order to make the adapter highly available, we need to increase its
replica count to 2 and come up with a rolling update strategy and a
pod anti-affinity rule based on the Kubernetes hostname to prevent the
adapters from being scheduled on the same node. The default rolling
update strategy for Deployments isn't enough, as the default
maxUnavailable value of 25% is rounded down to 0. This means that during
rolling updates, scheduling will fail if there aren't more nodes than
replicas. As for maxSurge, the default should be fine since it is
rounded up to 1, but for clarity it is better to set it to 1 explicitly.
For the pod anti-affinity constraint, a hard requirement would be ideal,
but a soft one should be good enough and fit most use cases.
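
Under those assumptions, the relevant Deployment fields could look roughly
like this (the selector label is illustrative, not necessarily the one used
by the actual manifests):

    spec:
      replicas: 2
      strategy:
        rollingUpdate:
          maxSurge: 1
          maxUnavailable: 1
      template:
        spec:
          affinity:
            podAntiAffinity:
              preferredDuringSchedulingIgnoredDuringExecution:
              - weight: 100
                podAffinityTerm:
                  topologyKey: kubernetes.io/hostname
                  labelSelector:
                    matchLabels:
                      name: prometheus-adapter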
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
This commit adds a relabeling config to the scrape config of
windows-exporter, using the 'replace' action to replace the node
endpoint address with the node name. The windows_exporter target
address is the node IP, but the prometheus-adapter queries need the
node name in order to collect resource metrics.
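
A sketch of what such a relabeling could look like, assuming
endpoints-based service discovery where the target node name is exposed
as __meta_kubernetes_endpoint_address_target_name (the label names are
assumptions; the actual config may differ):

    relabel_configs:
      - action: replace
        source_labels: [__meta_kubernetes_endpoint_address_target_name]
        target_label: instance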
This commit includes windows_exporter metrics in the node queries of
the prometheus-adapter configuration. This helps obtain the resource
metrics (memory and CPU) for Windows nodes. It also fixes the metrics
reported through the 'kubectl top' command, which currently shows an
'unknown' status for Windows nodes.
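
A rough sketch of how windows_exporter series could be folded into the
adapter's node resource queries, using upstream windows_exporter metric
names (the exact PromQL shipped by this change may differ):

    resourceRules:
      cpu:
        nodeQuery: |
          sum by (<<.GroupBy>>) (1 - irate(node_cpu_seconds_total{mode="idle",<<.LabelMatchers>>}[4m]))
          or
          sum by (<<.GroupBy>>) (1 - irate(windows_cpu_time_total{mode="idle",<<.LabelMatchers>>}[4m]))
      memory:
        nodeQuery: |
          sum by (<<.GroupBy>>) (node_memory_MemTotal_bytes{<<.LabelMatchers>>} - node_memory_MemAvailable_bytes{<<.LabelMatchers>>})
          or
          sum by (<<.GroupBy>>) (windows_cs_physical_memory_bytes{<<.LabelMatchers>>} - windows_os_physical_memory_free_bytes{<<.LabelMatchers>>})

With the node-name relabeling from the previous change in place, the Windows
series should line up on the same node-identifying labels as the Linux
queries, so the 'or' branches cover both platforms.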