Commit Graph

198 Commits

Author SHA1 Message Date
Damien Grisonnet eb7f83a407 Drop process start time from SLI endpoints (#2501)
* jsonnet: drop process start time metric from SLI

Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>

* manifests: regenerate

Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>

---------

Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2024-09-05 08:45:00 +01:00
Damien Grisonnet 43f2094629 jsonnet: add component SLI metrics
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2024-08-27 21:48:38 +02:00
Kemal Akkoyun 1c54cb22ca Merge pull request #2427 from philipgough/ci-fix-ksm
ci: Add runAsGroup for kube-state-metrics
2024-05-24 11:32:55 +02:00
Philip Gough e8f461ba38 ci: Add runAsGroup for prom operator Deployment 2024-05-14 10:04:07 +01:00
Philip Gough 387731a945 ci: Add runAsGroup for kube-state-metrics 2024-05-14 09:31:24 +01:00
Kemal Akkoyun cb55161e24 Merge pull request #2423 from philipgough/ci-fix-grafana 2024-05-13 12:07:04 +02:00
Philip Gough e8995efcf9 ci: Add runAsGroup for node_exporter sidecars 2024-05-13 10:40:04 +01:00
Philip Gough d0b0b0d087 ci: Add runAsGroup for Grafana Deployment 2024-05-13 10:34:58 +01:00
Philip Gough d1ec0ab362 ci: Add runAsGroup for blackbox exporter containers 2024-05-13 09:20:00 +01:00
Brad Ison 895db2a272 jsonnet/components/prometheus: Fix thanos-sidecar metrics access (#2330)
When enabled, the thanos-sidecar opens an HTTP listener on port 10902,
which is what's used to scrape metrics.  This port wasn't being added to
the Prometheus Service, so wasn't added to the NetworkPolicy causing
scraping to fail from Prometheus instances other than the local one.

This adds the port to the Service and NetworkPolicy.

Fixes: #2006
2024-01-15 09:16:17 +00:00
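
The fix described in 895db2a272 amounts to declaring the same port in two places. A minimal Python sketch of that idea, not the repository's jsonnet: only port 10902 comes from the commit message, while the trimmed manifests, the `expose_port` helper and the port name `thanos-sidecar` are hypothetical.

```python
import copy

THANOS_SIDECAR_HTTP_PORT = 10902  # port named in the commit message


def expose_port(service, network_policy, name, port):
    """Return copies of both manifests with `port` declared in each."""
    service = copy.deepcopy(service)
    network_policy = copy.deepcopy(network_policy)
    # The Service must list the port so other instances can reach it.
    service["spec"]["ports"].append({"name": name, "port": port, "targetPort": port})
    # Every ingress rule of the NetworkPolicy must allow the same port.
    for rule in network_policy["spec"]["ingress"]:
        rule.setdefault("ports", []).append({"port": port, "protocol": "TCP"})
    return service, network_policy


# Hypothetical, heavily trimmed manifests used only for illustration.
service = {"kind": "Service",
           "spec": {"ports": [{"name": "web", "port": 9090, "targetPort": 9090}]}}
policy = {"kind": "NetworkPolicy",
          "spec": {"ingress": [{"ports": [{"port": 9090, "protocol": "TCP"}]}]}}

new_service, new_policy = expose_port(
    service, policy, "thanos-sidecar", THANOS_SIDECAR_HTTP_PORT
)
```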
Marco Lehmann ad86a3fa21 Patch 2 (#2259)
* Update node-exporter.libsonnet

* Update windows-hostprocess.libsonnet

* Update blackbox-exporter.libsonnet
2023-11-07 17:25:06 +00:00
Paweł Krupa 8a1a537524 Merge pull request #2252 from m99coder/patch-1
Update kube-state-metrics.libsonnet
2023-11-01 17:08:27 +01:00
Paweł Krupa 8b73d18b76 Merge pull request #2232 from paulfantom/scrape-configs
explicitly enable ScrapeConfig support
2023-11-01 17:00:45 +01:00
Marco Lehmann 5308081f75 Update kube-state-metrics.libsonnet
Fix typo in error message
2023-10-25 16:20:10 +02:00
Matthias Loibl 1e55a4057c Add securityContext items and add pod security labels 2023-10-09 13:02:20 +02:00
paulfantom ec915f7b47 explicitly enable ScrapeConfig support 2023-10-07 11:09:11 +02:00
Roeland van Batenburg 76ebaeeafe allow configuration of secrets for alertmanager (#2206) 2023-09-06 14:06:29 +01:00
Simon Pasquier 01d4f1daa6 Remove deprecated 'logtostderr' argument for kube-rbac-proxy (#2198)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2023-08-28 20:36:39 +01:00
Rita Canavarro 84c295f06b [FEAT] Prometheus Adapter reload (#2195)
* [FEAT] Prometheus Adapter reload

Signed-off-by: rita.canavarro <ritinhamcm@gmail.com>

* [FEAT] Prometheus Adapter reload

Signed-off-by: rita.canavarro <ritinhamcm@gmail.com>

---------

Signed-off-by: rita.canavarro <ritinhamcm@gmail.com>
2023-08-23 11:35:10 +01:00
Simon Pasquier 535a91cf4c prometheus-adapter: remove deprecated argument
The `--logtostderr` argument has been deprecated in v0.10.0 and removed
in v0.11.0.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2023-08-16 14:58:40 +02:00
Andrew N Golovkov 4068700e27 use names from resources directly (#2135) 2023-07-17 03:31:59 -07:00
Brian Torres-Gil 3af1d8320c fix: non-namespaced resources incorrectly have ns (#2158) 2023-07-13 12:22:56 -07:00
Jan-Otto Kröpke 09135ee9b3 Enable Multi Cluster alerts by default (#2099) 2023-05-22 16:26:44 +01:00
adinhodovic 64ed9f1f44 fix(component/node-exporter): Disable btrfs collector by default 2023-04-13 13:28:18 +02:00
Fran ec56f4559f Improve ArgoCD support (#2041)
* Improve ArgoCD support

Signed-off-by: Fran Sanjuán <francesc.sanjuan@marfeel.com>

* Add modified yamls

Signed-off-by: Fran Sanjuán <francesc.sanjuan@marfeel.com>

---------

Signed-off-by: Fran Sanjuán <francesc.sanjuan@marfeel.com>
2023-03-20 09:21:46 +00:00
Ricardo Ribeiro 274d5856c7 Added custom overrides for kube-rbac-proxy. (#1987)
Missing in prometheus-operator, node-exporter and blackbox-exporter.
2023-03-15 11:47:31 +00:00
Siyuan Wang c3dad8c70b fix: prometheus network policy let prometheus-adapter pass (#1982) 2023-03-15 11:45:51 +00:00
SUN Haoyu ed6a2f0fc7 additional selector for resource queries in Prometheus Adapter. (#2003)
Signed-off-by: Haoyu Sun <hasun@redhat.com>
2023-03-15 11:42:28 +00:00
Joao Marcal 7363e20b65 Adds startupProbe to prometheus-adapter (#2029)
Issue: https://issues.redhat.com/browse/OCPBUGS-7694

Problem: in clusters with a large number of CRDs deployed, prom-adapter takes too long to discover all of them, which makes it fail the livenessProbe

Solution: introduce a startupProbe that gives 3 minutes for prom-adapter to initialize

Signed-off-by: JoaoBraveCoding <jmarcal@redhat.com>
2023-03-14 15:39:30 +00:00
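
The 3-minute allowance mentioned in 7363e20b65 can be checked with simple arithmetic: Kubernetes allows roughly `failureThreshold * periodSeconds` for a startup probe to succeed. A small sketch under assumptions: the field names are real Kubernetes probe fields, but the values 18 and 10 are hypothetical and only chosen to add up to 180 seconds; the commit message does not state them.

```python
def startup_budget_seconds(probe):
    """Approximate time a container gets to pass its startupProbe."""
    return (probe.get("initialDelaySeconds", 0)
            + probe["failureThreshold"] * probe["periodSeconds"])


# Hypothetical timing values; only the 3-minute total comes from the commit.
startup_probe = {"periodSeconds": 10, "failureThreshold": 18}

budget = startup_budget_seconds(startup_probe)
print(f"startup budget: {budget}s ({budget / 60:.0f} min)")
```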
Ashton Kemerling e435c1d640 jsonnet: Update disk selector regex. (#1945)
Update regex used for selecting disk devices to include raid devices of
the form "md123".
2022-11-23 11:41:56 +00:00
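
To illustrate the kind of change e435c1d640 describes, here is a sketch of a device selector that also accepts RAID devices such as "md123". The pattern below is an assumption made up for illustration; it is not the regex used in the repository.

```python
import re

# Hypothetical selector: SCSI/SATA disks, NVMe namespaces, and md RAID devices.
DISK_DEVICE = re.compile(r"(sd[a-z]+|nvme\d+n\d+|md\d+)")


def is_disk_device(name: str) -> bool:
    """True if the whole device name matches the selector."""
    return DISK_DEVICE.fullmatch(name) is not None


for device in ("sda", "nvme0n1", "md123", "loop0"):
    print(device, is_disk_device(device))
```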
Haoyu Sun b7781c19a1 set path.udev.data argument of node exporter.
Signed-off-by: Haoyu Sun <hasun@redhat.com>
2022-10-21 17:17:23 +02:00
SUN Haoyu 1bf12a9842 Node Exporter: add parameter for ignored network devices (#1887) 2022-10-19 09:14:25 +01:00
Jan Fajerski 75bc89f6e3 prometheus-adapter: add prefix option to config for container metrics (#1844)
This commit adds the option `containerMetricsPrefix`
to the prometheus-adapter config-map generator. By default this option
is the empty string and doesn't change the current behavior. If set
however to e.g. `pa_`, the prometheus-adapter configuration will add
this prefix to all container_ queries in the resource rules.
This enables users of kube-prometheus to define a specialised service
monitor that only exposes the prometheus-adapter related container metrics with a
different configuration, like `honorTimestamps: true` or a tighter
scrape interval.

Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
2022-09-02 16:51:30 +01:00
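
The effect of `containerMetricsPrefix` described in 75bc89f6e3 can be sketched as a string rewrite over a query. This is an illustration in Python, not the jsonnet from the repository; the helper name and the sample query are hypothetical, and placing the prefix directly in front of `container_` is an assumption based on the commit message.

```python
import re


def prefix_container_metrics(query: str, prefix: str = "") -> str:
    """Put `prefix` in front of every metric name starting with container_."""
    # A callable replacement avoids treating the prefix as a regex template.
    return re.sub(r"\bcontainer_", lambda m: prefix + m.group(0), query)


query = "sum(rate(container_cpu_usage_seconds_total[5m]))"
print(prefix_container_metrics(query))         # default: unchanged
print(prefix_container_metrics(query, "pa_"))  # prefixed variant
```

With the default empty prefix the query comes back unchanged, which mirrors the commit's statement that current behavior is preserved unless the option is set.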
PromOperatorBot 6190853c1c [bot] [main] Automated version update (#1854)
* [bot] [main] Automated version update

* jsonnet: drop deprecated KSM metric

Signed-off-by: Paweł Krupa (paulfantom) <pawel@krupa.net.pl>

* jsonnet: Drop deprecated KSM metrics

Signed-off-by: Paweł Krupa (paulfantom) <pawel@krupa.net.pl>
Co-authored-by: Prometheus Operator Bot <prom-op-bot@users.noreply.github.com>
Co-authored-by: Paweł Krupa (paulfantom) <pawel@krupa.net.pl>
Co-authored-by: Philip Gough <philip.p.gough@gmail.com>
2022-08-30 11:34:17 +01:00
Philip Gough 9a64c41065 Merge pull request #1810 from zanhsieh/main
fix device regex adjusted for aks and eks
2022-07-22 16:04:36 +01:00
zanhsieh 44e3fc11c6 fix device regex adjusted for aks and eks
Signed-off-by: zanhsieh <zanhsieh@gmail.com>
2022-07-11 22:29:05 +08:00
Bernd Malmqvist 72583f3d3b enable automountServiceAccountToken for prometheus 2022-07-10 13:19:34 +01:00
Vladislav Polyakov 76f1ba051a style: fmt code 2022-04-19 10:48:24 +03:00
Vladislav Polyakov 17d2831fc5 feat: include ingress network policy for thanos 2022-04-19 10:27:14 +03:00
Vladislav Polyakov 62b2347277 Access requests to sidecar from thanos-query 2022-04-14 15:01:55 +03:00
Arunprasad Rajkumar 6ff8bfbb02 Adjust NodeFilesystemSpaceFillingUp thresholds according to default kubelet GC behavior
Previously[1] we attempted to do the same, but there was a
misunderstanding about the GC behavior and it caused the alert to be
fired even before GC comes into play.

According to[2][3] kubelet GC kicks in only when `imageGCHighThresholdPercent` is hit which is set to 85% by default. However `NodeFilesystemSpaceFillingUp` is set to fire as soon as 80% usage is hit.

This commit changes `fsSpaceFillingUpWarningThreshold` to 15% so
that GC has ample time to reclaim unwanted images. It also changes
`fsSpaceFillingUpCriticalThreshold` to 10%, which gives admins more time to react to the warning before a critical alert is sent.

[1] https://github.com/prometheus-operator/kube-prometheus/pull/1357
[2] https://docs.openshift.com/container-platform/4.10/nodes/nodes/nodes-nodes-garbage-collection.html#nodes-nodes-garbage-collection-images_nodes-nodes-configuring
[3] https://kubernetes.io/docs/reference/config-api/kubelet-config.v1beta1/

Signed-off-by: Arunprasad Rajkumar <arajkuma@redhat.com>
2022-04-13 12:01:06 +05:30
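
The reasoning in 6ff8bfbb02 comes down to comparing free-space thresholds with the point where kubelet image GC starts. A minimal sketch using the numbers from the commit message; the function name is hypothetical, and the model only looks at the static threshold, ignoring any other conditions in the real alert expression.

```python
IMAGE_GC_HIGH_THRESHOLD_PERCENT = 85  # kubelet default cited in the commit


def alert_precedes_gc(free_threshold_percent, gc_high=IMAGE_GC_HIGH_THRESHOLD_PERCENT):
    """True if the alert's free-space threshold is crossed before GC starts."""
    free_when_gc_starts = 100 - gc_high  # 15% free at 85% usage
    return free_threshold_percent > free_when_gc_starts


# Old behavior: alert at 80% usage, i.e. 20% free, fires before GC kicks in.
print(alert_precedes_gc(20))
# New warning (15%) and critical (10%) thresholds no longer precede GC.
print(alert_precedes_gc(15), alert_precedes_gc(10))
```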
ArthurSens 8bdd526039 jsonnet/components/prometheus: Fix grafana network access
Signed-off-by: ArthurSens <arthursens2005@gmail.com>
2022-04-11 07:23:09 +00:00
ArthurSens 3da9bcd152 jsonnet/components/grafana: Address FIXME
Signed-off-by: ArthurSens <arthursens2005@gmail.com>
2022-04-05 09:28:43 +00:00
Arthur Silva Sens 01004de76c Merge pull request #1650 from ArthurSens/as/network-policies
Adds NetworkPolicies to all components of Kube-prometheus
2022-04-05 09:47:05 +01:00
Joao Marcal 1d46f7ece9 Adds port name to prometheus-adapter jsonnet 2022-03-30 15:34:40 +01:00
Joao Marcal f6190e200a Adds readinessProbe and livenessProbe to prometheus-adapter jsonnet
Problem: Currently the prometheus-adapter pods are restarted at the same
time even though the deployment is configured with strategy RollingUpdate.
This happens because the kubelet does not know when the prometheus-adapter
pods are ready to start receiving requests.

Solution: Add both readinessProbe and livenessProbe to the
prometheus-adapter, this way the kubelet will know when either the pod
stopped working and should be restarted or simply when it is ready to start
receiving requests.

Issue: https://bugzilla.redhat.com/show_bug.cgi?id=2048333
2022-03-30 07:22:55 +01:00
paulfantom 3ad08674b3 manifests: regenerate
Signed-off-by: paulfantom <pawel@krupa.net.pl>
Signed-off-by: Paweł Krupa (paulfantom) <pawel@krupa.net.pl>
(cherry picked from commit d3ea3147a8)
(cherry picked from commit d24c347b2742d9474c8f441f2831262c63b8c79b)
2022-03-09 07:48:01 +00:00
Arthur Silva Sens 3f3b56e247 alertmanager/networkPolicy: Allow cluster peer-to-peer communication
Signed-off-by: GitHub <noreply@github.com>
(cherry picked from commit df68b8d1da5d2d91b9502d4be67063c2c497e0cb)
2022-03-09 07:47:28 +00:00
Arthur Silva Sens ea158da23f Add networkPolicies for alertmanager, grafana, prometheus-operator and prometheus
Signed-off-by: GitHub <noreply@github.com>
(cherry picked from commit 86e16b539cc57710b50f4692848cab5645e3d2bc)
2022-03-09 07:47:25 +00:00
paulfantom fddf642de7 jsonnet: add networkpolicies for components accessed by prometheus
(cherry picked from commit f8c00b9963)
(cherry picked from commit f09b8e5de2e46db85f090549d37eeb878a81842f)
2022-03-09 07:42:09 +00:00