Commit Graph

198 Commits

Author SHA1 Message Date
Damien Grisonnet
eb7f83a407 Drop process start time from SLI endpoints (#2501)
* jsonnet: drop process start time metric from SLI

Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>

* manifests: regenerate

Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>

---------

Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2024-09-05 08:45:00 +01:00
Damien Grisonnet
43f2094629 jsonnet: add component SLI metrics
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2024-08-27 21:48:38 +02:00
Kemal Akkoyun
1c54cb22ca Merge pull request #2427 from philipgough/ci-fix-ksm
ci: Add runAsGroup for kube-state-metrics
2024-05-24 11:32:55 +02:00
Philip Gough
e8f461ba38 ci: Add runAsGroup for prom operator Deployment 2024-05-14 10:04:07 +01:00
Philip Gough
387731a945 ci: Add runAsGroup for kube-state-metrics 2024-05-14 09:31:24 +01:00
Kemal Akkoyun
cb55161e24 Merge pull request #2423 from philipgough/ci-fix-grafana 2024-05-13 12:07:04 +02:00
Philip Gough
e8995efcf9 ci: Add runAsGroup for node_exporter sidecars 2024-05-13 10:40:04 +01:00
Philip Gough
d0b0b0d087 ci: Add runAsGroup for Grafana Deployment 2024-05-13 10:34:58 +01:00
Philip Gough
d1ec0ab362 ci: Add runAsGroup for blackbox exporter containers 2024-05-13 09:20:00 +01:00
Brad Ison
895db2a272 jsonnet/components/prometheus: Fix thanos-sidecar metrics access (#2330)
When enabled, the thanos-sidecar opens an HTTP listener on port 10902,
which is what's used to scrape metrics. This port wasn't being added to
the Prometheus Service, so it wasn't added to the NetworkPolicy, causing
scraping to fail from Prometheus instances other than the local one.

This adds the port to the Service and NetworkPolicy.

Fixes: #2006
2024-01-15 09:16:17 +00:00
Marco Lehmann
ad86a3fa21 Patch 2 (#2259)
* Update node-exporter.libsonnet

* Update windows-hostprocess.libsonnet

* Update blackbox-exporter.libsonnet
2023-11-07 17:25:06 +00:00
Paweł Krupa
8a1a537524 Merge pull request #2252 from m99coder/patch-1
Update kube-state-metrics.libsonnet
2023-11-01 17:08:27 +01:00
Paweł Krupa
8b73d18b76 Merge pull request #2232 from paulfantom/scrape-configs
explicitly enable ScrapeConfig support
2023-11-01 17:00:45 +01:00
Marco Lehmann
5308081f75 Update kube-state-metrics.libsonnet
Fix typo in error message
2023-10-25 16:20:10 +02:00
Matthias Loibl
1e55a4057c Add securityContext items and add pod security labes 2023-10-09 13:02:20 +02:00
paulfantom
ec915f7b47 explicitly enable ScrapeConfig support 2023-10-07 11:09:11 +02:00
Roeland van Batenburg
76ebaeeafe allow configuration of secrets for alertmanager (#2206) 2023-09-06 14:06:29 +01:00
Simon Pasquier
01d4f1daa6 Remove deprecated 'logtostderr' argument for kube-rbac-proxy (#2198)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2023-08-28 20:36:39 +01:00
Rita Canavarro
84c295f06b [FEAT] Prometheus Adapter reload (#2195)
* [FEAT] Prometheus Adapter reload

Signed-off-by: rita.canavarro <ritinhamcm@gmail.com>

* [FEAT] Prometheus Adapter reload

Signed-off-by: rita.canavarro <ritinhamcm@gmail.com>

---------

Signed-off-by: rita.canavarro <ritinhamcm@gmail.com>
2023-08-23 11:35:10 +01:00
Simon Pasquier
535a91cf4c prometheus-adapter: remove deprecated argument
The `--logtostderr` argument has been deprecated in v0.10.0 and removed
in v0.11.0.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2023-08-16 14:58:40 +02:00
Andrew N Golovkov
4068700e27 use names from resources directly (#2135) 2023-07-17 03:31:59 -07:00
Brian Torres-Gil
3af1d8320c fix: non-namespaced resources incorrectly have ns (#2158) 2023-07-13 12:22:56 -07:00
Jan-Otto Kröpke
09135ee9b3 Enable Multi Cluster alerts by default (#2099) 2023-05-22 16:26:44 +01:00
adinhodovic
64ed9f1f44 fix(component/node-exporter): Disable btrfs collector by default 2023-04-13 13:28:18 +02:00
Fran
ec56f4559f Improve ArgoCD support (#2041)
* Improve ArgoCD support

Signed-off-by: Fran Sanjuán <francesc.sanjuan@marfeel.com>

* Add modified yamls

Signed-off-by: Fran Sanjuán <francesc.sanjuan@marfeel.com>

---------

Signed-off-by: Fran Sanjuán <francesc.sanjuan@marfeel.com>
2023-03-20 09:21:46 +00:00
Ricardo Ribeiro
274d5856c7 Added custom overrides for kube-rbac-proxy. (#1987)
Missing in prometheus-operator, node-exporter and blackbox-exporter.
2023-03-15 11:47:31 +00:00
Siyuan Wang
c3dad8c70b fix: prometheus network policy let prometheus-adapter pass (#1982) 2023-03-15 11:45:51 +00:00
SUN Haoyu
ed6a2f0fc7 additional selector for resource queries in Prometheus Adapter. (#2003)
Signed-off-by: Haoyu Sun <hasun@redhat.com>
2023-03-15 11:42:28 +00:00
Joao Marcal
7363e20b65 Adds startupProbe to prometheus-adapter (#2029)
Issue: https://issues.redhat.com/browse/OCPBUGS-7694

Problem: in clusters with a large number of CRDs deployed, prom-adapter takes too long to discover all of them, which makes it fail the livenessProbe

Solution: introduce a startupProbe that gives 3 minutes for prom-adapter to initialize

Signed-off-by: JoaoBraveCoding <jmarcal@redhat.com>
2023-03-14 15:39:30 +00:00
Ashton Kemerling
e435c1d640 jsonnet: Update disk selector regex. (#1945)
Update regex used for selecting disk devices to include raid devices of
the form "md123".
2022-11-23 11:41:56 +00:00
Haoyu Sun
b7781c19a1 set path.udev.data argument of node exporter.
Signed-off-by: Haoyu Sun <hasun@redhat.com>
2022-10-21 17:17:23 +02:00
SUN Haoyu
1bf12a9842 Node Exporter: add parameter for ignored network devices (#1887) 2022-10-19 09:14:25 +01:00
Jan Fajerski
75bc89f6e3 prometheus-adapter: add prefix option to config for container metrics (#1844)
This commit adds the option `containerMetricsPrefix`
to the prometheus-adapter config-map generator. By default this option
is the empty string and doesn't change the current behavior. If set,
however, to e.g. `pa_`, the prometheus-adapter configuration will add
this prefix to all container_ queries in the resource rules.
This enables users of kube-prometheus to define a specialised service
monitor that only exposes the prometheus-adapter related container metrics with a
different configuration, like `honorTimestamps: true` or a tighter
scrape interval.

Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
2022-09-02 16:51:30 +01:00
PromOperatorBot
6190853c1c [bot] [main] Automated version update (#1854)
* [bot] [main] Automated version update

* jsonnet: drop deprecated KSM metric

Signed-off-by: Paweł Krupa (paulfantom) <pawel@krupa.net.pl>

* jsonnet: Drop deprecated KSM metrics

Signed-off-by: Paweł Krupa (paulfantom) <pawel@krupa.net.pl>
Co-authored-by: Prometheus Operator Bot <prom-op-bot@users.noreply.github.com>
Co-authored-by: Paweł Krupa (paulfantom) <pawel@krupa.net.pl>
Co-authored-by: Philip Gough <philip.p.gough@gmail.com>
2022-08-30 11:34:17 +01:00
Philip Gough
9a64c41065 Merge pull request #1810 from zanhsieh/main
fix device regext adjusted for aks and eks
2022-07-22 16:04:36 +01:00
zanhsieh
44e3fc11c6 fix device regext adjusted for aks and eks
Signed-off-by: zanhsieh <zanhsieh@gmail.com>
2022-07-11 22:29:05 +08:00
Bernd Malmqvist
72583f3d3b enable automountServiceAccountToken for prometheus 2022-07-10 13:19:34 +01:00
Vladislav Polyakov
76f1ba051a style: fmt code 2022-04-19 10:48:24 +03:00
Vladislav Polyakov
17d2831fc5 fet: include ingress network policy for thanos 2022-04-19 10:27:14 +03:00
Vladislav Polyakov
62b2347277 Access requests to sidecar from thanos-query 2022-04-14 15:01:55 +03:00
Arunprasad Rajkumar
6ff8bfbb02 Adjust NodeFilesystemSpaceFillingUp thresholds according default kubelet GC behavior
Previously[1] we attempted to do the same, but there was a
misunderstanding about the GC behavior and it caused the alert to be
fired even before GC comes into play.

According to [2][3], kubelet GC kicks in only when `imageGCHighThresholdPercent` is hit, which is set to 85% by default. However, `NodeFilesystemSpaceFillingUp` is set to fire as soon as 80% usage is hit.

This commit changes the `fsSpaceFillingUpWarningThreshold` to 15% so
that we give ample time to GC to reclaim unwanted images. This commit
also changes `fsSpaceFillingUpCriticalThreshold` to 10%, which gives admins more time to react to the warning before a critical alert is sent.

[1] https://github.com/prometheus-operator/kube-prometheus/pull/1357
[2] https://docs.openshift.com/container-platform/4.10/nodes/nodes/nodes-nodes-garbage-collection.html#nodes-nodes-garbage-collection-images_nodes-nodes-configuring
[3] https://kubernetes.io/docs/reference/config-api/kubelet-config.v1beta1/

Signed-off-by: Arunprasad Rajkumar <arajkuma@redhat.com>
2022-04-13 12:01:06 +05:30
ArthurSens
8bdd526039 jsonnet/components/prometheus: Fix grafana network access
Signed-off-by: ArthurSens <arthursens2005@gmail.com>
2022-04-11 07:23:09 +00:00
ArthurSens
3da9bcd152 jsonnet/components/grafana: Address FIXME
Signed-off-by: ArthurSens <arthursens2005@gmail.com>
2022-04-05 09:28:43 +00:00
Arthur Silva Sens
01004de76c Merge pull request #1650 from ArthurSens/as/network-policies
Adds NetworkPolicies to all components of Kube-prometheus
2022-04-05 09:47:05 +01:00
Joao Marcal
1d46f7ece9 Adds port name to prometheus-adapter jsonnet 2022-03-30 15:34:40 +01:00
Joao Marcal
f6190e200a Adds readinessProbe and livenessProbe to prometheus-adapter jsonnet
Problem: Currently the prometheus-adapter pods are restarted at the same
time even though the deployment is configured with strategy RollingUpdate.
This happens because the kubelet does not know when the prometheus-adapter
pods are ready to start receiving requests.

Solution: Add both readinessProbe and livenessProbe to the
prometheus-adapter. This way the kubelet will know either when the pod
has stopped working and should be restarted, or simply when it is ready to start
receiving requests.

Issue: https://bugzilla.redhat.com/show_bug.cgi?id=2048333
2022-03-30 07:22:55 +01:00
paulfantom
3ad08674b3 manifests: regenerate
Signed-off-by: paulfantom <pawel@krupa.net.pl>
Signed-off-by: Paweł Krupa (paulfantom) <pawel@krupa.net.pl>
(cherry picked from commit d3ea3147a8)
(cherry picked from commit d24c347b2742d9474c8f441f2831262c63b8c79b)
2022-03-09 07:48:01 +00:00
Arthur Silva Sens
3f3b56e247 alertmanager/networkPolicy: Allow cluster peer-to-peer communication
Signed-off-by: GitHub <noreply@github.com>
(cherry picked from commit df68b8d1da5d2d91b9502d4be67063c2c497e0cb)
2022-03-09 07:47:28 +00:00
Arthur Silva Sens
ea158da23f Add networkPolicies for alertmanager, grafana, prometheus-operator and prometheus
Signed-off-by: GitHub <noreply@github.com>
(cherry picked from commit 86e16b539cc57710b50f4692848cab5645e3d2bc)
2022-03-09 07:47:25 +00:00
paulfantom
fddf642de7 jsonnet: add networkpolicies for components accessed by prometheus
(cherry picked from commit f8c00b9963)
(cherry picked from commit f09b8e5de2e46db85f090549d37eeb878a81842f)
2022-03-09 07:42:09 +00:00