OpenShift Cluster Monitoring Operator: I'm Not Telling You

slide: https://hackmd.io/p/OonUQ9QKQ7-7JPBd1N9tOA?both

This is a collaborative session.

Please have a laptop or smartphone ready to join!

Who am I?

Jason Li
SRE/Backend developer
❤️ Kubernetes, Go, Rust
🐱 lover
Forever going from getting started to giving up

Agenda

Background
Related Work
Method
Conclusion

Background

Prometheus Operator, Prometheus, Prometheus Adapter, kube-state-metrics, etc.

Managing such diverse components calls for one centralized configuration file.

Related Work

UI
Prometheus
Metrics
Thanos

UI

Grafana

Prometheus

Prometheus Operator
Prometheus-k8s
👎 Prometheus-user-workload
Alertmanager

Prometheus Operator

Provides Kubernetes-native deployment and management of Prometheus and related monitoring components.
Automates the configuration of a Prometheus-based monitoring stack for Kubernetes clusters:
- Prometheus
- Alertmanager
- Related components
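
To make "Kubernetes native" a little more concrete, here is a minimal sketch of a ServiceMonitor, one of the custom resources the Prometheus Operator watches to generate scrape configuration. The resource name, namespace, label, and port name are hypothetical and only serve as an illustration.

```yaml
# Sketch only: name, namespace, label, and port name are hypothetical.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  namespace: example-ns
spec:
  selector:
    matchLabels:
      app: example-app      # Services carrying this label are selected
  endpoints:
    - port: metrics         # name of the Service port that serves metrics
      interval: 30s         # how often Prometheus scrapes the endpoint
```

The point is that scrape targets are declared as Kubernetes objects instead of being edited into prometheus.yml by hand.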

Prometheus Operator (cont’d)

Metrics

node-exporter
kube-state-metrics
openshift-state-metrics

👎 prometheus-adapter
👎 Telemeter Client
👎 configuration sharing

node-exporter

Prometheus exporter for hardware and OS metrics exposed by *NIX kernels.
Once scraped, its output includes a wide variety of system metrics, all prefixed with node_.

# HELP node_network_transmit_queue_length transmit_queue_length value of /sys/class/net/<iface>.
# TYPE node_network_transmit_queue_length gauge
node_network_transmit_queue_length{device="br0"} 1000
node_network_transmit_queue_length{device="eth0"} 1000
node_network_transmit_queue_length{device="lo"} 1000
node_network_transmit_queue_length{device="ovs-system"} 1000
node_network_transmit_queue_length{device="tun0"} 1000
node_network_transmit_queue_length{device="veth24377b8e"} 0
node_network_transmit_queue_length{device="veth58bd788d"} 0
...

kube-state-metrics

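Focuses on the health of the objects inside the cluster, such as deployments, nodes, and pods, rather than on the individual Kubernetes components.
Exposes raw data unmodified from the Kubernetes API.
Designed to be consumed by Prometheus or by any scraper compatible with a Prometheus client endpoint.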
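
As a rough illustration of how these raw object metrics get used, the sketch below defines an alert through a PrometheusRule resource. The rule name, namespace, threshold duration, and severity are hypothetical; the two metric names are the ones kube-state-metrics exposes for Deployments, to the best of my knowledge.

```yaml
# Sketch only: rule name, namespace, "for" duration, and severity are hypothetical.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-deployment-rules
  namespace: example-ns
spec:
  groups:
    - name: example.deployments
      rules:
        - alert: DeploymentReplicasUnavailable
          # Fires when fewer replicas are available than the spec requests.
          expr: kube_deployment_status_replicas_available < kube_deployment_spec_replicas
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "Deployment {{ $labels.namespace }}/{{ $labels.deployment }} has fewer available replicas than requested."
```

Because kube-state-metrics reports the API state as-is, the comparison itself happens in PromQL, not in the exporter.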

openshift-state-metrics

Expands upon kube-state-metrics by adding metrics for OpenShift-specific resources.
Exposes cluster-level metrics for those resources:

openshift-state-metrics (cont’d)

BuildConfig Metrics
Build Metrics
DeploymentConfig Metrics
ClusterResourceQuota Metrics
Route Metrics
Group Metrics

ref: https://github.com/openshift/openshift-state-metrics

Thanos

Thanos
Thanos Querier
Thanos Ruler

Method

Component	Key
Prometheus Operator	prometheusOperator
Prometheus	prometheusK8s
Alertmanager	alertmanagerMain
kube-state-metrics	kubeStateMetrics
openshift-state-metrics	openshiftStateMetrics
Grafana	grafana
Telemeter Client	telemeterClient
Prometheus Adapter	k8sPrometheusAdapter
Thanos Querier	thanosQuerier

Method (cont’d)

Only Prometheus and Alertmanager have extensive configuration options.
Other components usually provide only the nodeSelector field.

Method (cont’d)

Move components to specific nodes

data:
  config.yaml: |
    prometheusOperator:
      nodeSelector:
        foo: bar
    prometheusK8s:
      nodeSelector:
        foo: bar    
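
For context, the fragment above is the data section of a ConfigMap. A fuller sketch follows, assuming the ConfigMap is named cluster-monitoring-config and lives in the openshift-monitoring namespace (that is what I have seen in OpenShift 4.x, so please check it against your version). The extra component keys come from the table above; the foo: bar label is the same placeholder as before.

```yaml
# Sketch only: ConfigMap name and namespace are assumptions to verify for your release.
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusOperator:
      nodeSelector:
        foo: bar
    prometheusK8s:
      nodeSelector:
        foo: bar
    # Other components take the same nodeSelector field under their own key.
    kubeStateMetrics:
      nodeSelector:
        foo: bar
    alertmanagerMain:
      nodeSelector:
        foo: bar
```

Nodes need the matching label (here foo=bar) before the pods can be scheduled onto them.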

Persistent volume claim

data:
  config.yaml: |
    prometheusK8s:
      volumeClaimTemplate:
        metadata:
          name: localpvc
        spec:
          storageClassName: local-storage
          resources:
            requests:
              storage: 40Gi    
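
Alertmanager can probably be given persistent storage the same way. The sketch below assumes that alertmanagerMain accepts a volumeClaimTemplate of the same shape as prometheusK8s; the claim name and the 10Gi size are hypothetical, and local-storage is simply reused from the example above.

```yaml
# Sketch only: assumes alertmanagerMain supports volumeClaimTemplate like prometheusK8s.
data:
  config.yaml: |
    alertmanagerMain:
      volumeClaimTemplate:
        metadata:
          name: alertmanager-pvc        # hypothetical claim name
        spec:
          storageClassName: local-storage
          resources:
            requests:
              storage: 10Gi             # hypothetical size
```

With a local storage class, matching PersistentVolumes have to exist on the target nodes first, otherwise the claims stay pending.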

Custom Alertmanager configuration

At this stage, the cluster monitoring configuration file does not provide settings for Alertmanager's own configuration.
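
As far as I know, the usual workaround in OpenShift 4.x is to edit the alertmanager.yaml stored in the alertmanager-main secret in the openshift-monitoring namespace; treat that location as an assumption to verify for your release. The sketch below shows what such a file might contain. Receiver names and the webhook URL are hypothetical.

```yaml
# alertmanager.yaml (sketch only: receiver names and webhook URL are hypothetical)
global:
  resolve_timeout: 5m
route:
  receiver: default             # fallback receiver for unmatched alerts
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 12h
  routes:
    # "match" is the older syntax; newer Alertmanager releases prefer "matchers".
    - match:
        severity: critical
      receiver: team-webhook
receivers:
  - name: default
  - name: team-webhook
    webhook_configs:
      - url: http://example.invalid/alert-hook
```

Only critical alerts reach the webhook here; everything else falls through to the default receiver, which sends nothing.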

Conclusion

💯 💪 🎉

Wrap up

A self-updating monitoring stack built on the wider Prometheus ecosystem
Provides monitoring of cluster components
Expect to manage every component through the configuration file 🎉

Thank you! 🐑