Monitor Kubernetes with Prometheus and Thanos
Introduction:
Congratulations: you have managed to convince your team of engineers to migrate the company's entire workload to microservices, using containers on top of Kubernetes.
After some time, you realize things have become a bit more complicated. You have multiple applications to deploy to the cluster, so you need an Ingress Controller. Then, before going to production, you want visibility into how your workloads are performing, so you start looking for a monitoring solution. Luckily, you find Prometheus, deploy it, add Grafana, and you're done!
Later on, you start wondering: why is Prometheus running with just one replica? What happens if there is a container restart, or just a version update? For how long can Prometheus store my metrics? What will happen when the cluster goes down? Do I need another cluster for HA and DR? How will I get a centralized view across my Prometheus instances?
Typical Kubernetes Cluster
The deployment consists of three layers:
- Underlying virtual machines — master nodes and worker nodes
- Kubernetes infrastructure applications
- User applications
The different components communicate with each other internally, usually over HTTP(S) (REST or gRPC). Some of them expose APIs outside of the cluster (Ingress); those APIs are mainly used for:
- Cluster management via Kubernetes API Server
- User application interaction exposed via an Ingress Controller
In some scenarios, applications might send traffic outside of the cluster (Egress) to consume other services, such as Azure SQL, Azure Blob Storage, or any third-party service.
Deploying Thanos
We'll start by deploying the Thanos Sidecar into our Kubernetes clusters: the same clusters we use for running our workloads and our Prometheus and Grafana deployments.
While there are many ways to install Prometheus, I prefer the Prometheus Operator, which gives you easy monitoring definitions for Kubernetes services and handles the deployment and management of Prometheus instances.
Before we deploy the Thanos Sidecar, we need a Kubernetes Secret with details on how to connect to the cloud storage. For this demo, I will be using Microsoft Azure.
Create a blob storage account:

az storage account create --name <storage_name> --resource-group <resource_group> --location <location> --sku Standard_LRS --encryption-services blob
Then, create a folder (aka container) for the metrics:

az storage container create --account-name <storage_name> --name thanos
Grab the storage keys:

az storage account keys list -g <resource_group> -n <storage_name>
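If you only need the first key's value for the config file below, a JMESPath query can extract it directly. This is a convenience sketch: the `--query` expression assumes the standard `az` CLI output shape (a list of `{keyName, value}` objects).

```shell
# print only the first key's value as plain text, ready to paste into the config
az storage account keys list -g <resource_group> -n <storage_name> --query '[0].value' -o tsv
```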
Create a file for the storage settings (thanos-storage-config.yaml):

type: AZURE
config:
  storage_account: "<storage_name>"
  storage_account_key: "<key>"
  container: "thanos"
Create a Kubernetes Secret:

kubectl -n monitoring create secret generic thanos-objstore-config --from-file=thanos.yaml=thanos-storage-config.yaml
Create a values file (prometheus-operator-values.yaml) to override the default Prometheus Operator settings:

prometheus:
  prometheusSpec:
    replicas: 2 # run in high-availability mode
    retention: 12h # we only need a few hours of retention, since the rest is uploaded to blob storage
    image:
      tag: v2.8.0 # use a specific version of Prometheus
    externalLabels: # a cool way to add default labels to all metrics
      geo: us
      region: eastus
    serviceMonitorNamespaceSelector: # allows the operator to find target config from multiple namespaces
      any: true
    thanos: # add the Thanos Sidecar
      tag: v0.3.1 # a specific version of Thanos
      objectStorageConfig: # blob storage configuration to upload metrics
        key: thanos.yaml
        name: thanos-objstore-config
grafana: # (optional) we don't need Grafana in all clusters
  enabled: false
Then deploy:

helm install --namespace monitoring --name prometheus-operator stable/prometheus-operator -f prometheus-operator-values.yaml
For the Thanos Store Gateway and the Thanos Sidecar we will use mutual TLS (mTLS), meaning the client authenticates the server and vice versa.
Assuming you have a .pfx file, you can extract its private key, public certificate, and CA chain using openssl:
# private key
openssl pkcs12 -in cert.pfx -nocerts -nodes | sed -ne '/-BEGIN PRIVATE KEY-/,/-END PRIVATE KEY-/p' > cert.key

# public certificate
openssl pkcs12 -in cert.pfx -clcerts -nokeys | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' > cert.cer

# certificate authority (CA)
openssl pkcs12 -in cert.pfx -cacerts -nokeys -chain | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' > cacerts.cer
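Before wiring these files into Kubernetes, it is worth checking that the extracted certificate and key actually belong together. A minimal sketch (assuming openssl is installed; it generates a throwaway self-signed pair purely for demonstration, where in practice you would point the last two commands at your extracted cert.cer and cert.key):

```shell
# generate a throwaway self-signed pair (demo only)
openssl req -x509 -newkey rsa:2048 -nodes -keyout demo.key -out demo.cer -days 1 -subj "/CN=thanos-demo" 2>/dev/null

# a certificate and key match when their RSA modulus digests are identical
cert_mod=$(openssl x509 -noout -modulus -in demo.cer | openssl md5)
key_mod=$(openssl rsa -noout -modulus -in demo.key | openssl md5)
[ "$cert_mod" = "$key_mod" ] && echo "cert and key match"
```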
Create two Kubernetes Secrets from these files:
# a secret to be used for TLS termination
kubectl create secret tls -n monitoring thanos-ingress-secret --key ./cert.key --cert ./cert.cer

# a secret to be used for client authentication using the same CA
kubectl create secret generic -n monitoring thanos-ca-secret --from-file=ca.crt=./cacerts.cer
Make sure you have a domain that resolves to your Kubernetes cluster, and create two sub-domains to be used for routing to each Thanos Sidecar:
thanos-0.your.domain
thanos-1.your.domain
Now we can create the Ingress rules (replace the host values):
apiVersion: v1
kind: Service
metadata:
  labels:
    app: prometheus
  name: thanos-sidecar-0
spec:
  ports:
  - port: 10901
    protocol: TCP
    targetPort: grpc
    name: grpc
  selector:
    statefulset.kubernetes.io/pod-name: prometheus-prometheus-operator-prometheus-0
  type: ClusterIP
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: prometheus
  name: thanos-sidecar-1
spec:
  ports:
  - port: 10901
    protocol: TCP
    targetPort: grpc
    name: grpc
  selector:
    statefulset.kubernetes.io/pod-name: prometheus-prometheus-operator-prometheus-1
  type: ClusterIP
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/backend-protocol: "GRPC"
    nginx.ingress.kubernetes.io/auth-tls-verify-client: "on"
    nginx.ingress.kubernetes.io/auth-tls-secret: "monitoring/thanos-ca-secret"
  labels:
    app: prometheus
  name: thanos-sidecar-0
spec:
  rules:
  - host: thanos-0.your.domain
    http:
      paths:
      - backend:
          serviceName: thanos-sidecar-0
          servicePort: grpc
  tls:
  - hosts:
    - thanos-0.your.domain
    secretName: thanos-ingress-secret
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/backend-protocol: "GRPC"
    nginx.ingress.kubernetes.io/auth-tls-verify-client: "on"
    nginx.ingress.kubernetes.io/auth-tls-secret: "monitoring/thanos-ca-secret"
  labels:
    app: prometheus
  name: thanos-sidecar-1
spec:
  rules:
  - host: thanos-1.your.domain
    http:
      paths:
      - backend:
          serviceName: thanos-sidecar-1
          servicePort: grpc
  tls:
  - hosts:
    - thanos-1.your.domain
    secretName: thanos-ingress-secret
Create a thanos-values.yaml file to override the default chart settings:

# Thanos Query configuration
query:
  replicaCount: 1
  logLevel: debug
  queryReplicaLabel: prometheus_replica
  stores:
  - thanos-store-grpc:10901
  - thanos-0.your.domain:443
  - thanos-1.your.domain:443
  tlsClient:
    enabled: true
objectStorageConfig:
  enabled: true
store:
  tlsServer:
    enabled: true
Re-create the storage secret in this cluster as well:

kubectl -n thanos create secret generic thanos-objstore-config --from-file=thanos.yaml=thanos-storage-config.yaml
To deploy this chart, we will use the same certificates we created earlier and inject them as values:

helm install --name thanos --namespace thanos ./thanos -f thanos-values.yaml \
  --set-file query.tlsClient.cert=cert.cer \
  --set-file query.tlsClient.key=cert.key \
  --set-file query.tlsClient.ca=cacerts.cer \
  --set-file store.tlsServer.cert=cert.cer \
  --set-file store.tlsServer.key=cert.key \
  --set-file store.tlsServer.ca=cacerts.cer
This will install both the Thanos Query Gateway and the Thanos Store Gateway, configured to use a secure channel between them.
Validation

To validate the deployment, port-forward the Thanos Query HTTP service and browse to http://localhost:8080:

kubectl -n thanos port-forward svc/thanos-query-http 8080:10902
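With the port-forward running, you can also hit the Prometheus-compatible query API that Thanos Query exposes. A quick check (assuming the port-forward above is active in another terminal):

```shell
# instant query for the 'up' metric, with replica deduplication enabled
curl -s 'http://localhost:8080/api/v1/query?query=up&dedup=true'
```

With `replicas: 2` and `queryReplicaLabel: prometheus_replica` configured above, each series should appear once rather than once per Prometheus replica.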
Grafana
To add dashboards, you can simply install Grafana using its Helm chart.
Create a grafana-values.yaml with the following content:

datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
    - name: Prometheus
      type: prometheus
      url: http://thanos-query-http:10902
      access: proxy
      isDefault: true
dashboardProviders:
  dashboardproviders.yaml:
    apiVersion: 1
    providers:
    - name: 'default'
      orgId: 1
      folder: ''
      type: file
      disableDeletion: false
      editable: true
      options:
        path: /var/lib/grafana/dashboards/default
dashboards:
  default:
    cluster-stats:
      # Ref: https://grafana.com/dashboards/1621
      gnetId: 1621
      revision: 1
      datasource: Prometheus
    prometheus-stats:
      # Ref: https://grafana.com/dashboards/2
      gnetId: 2
      revision: 2
      datasource: Prometheus
    node-exporter:
      # Ref: https://grafana.com/dashboards/1860
      gnetId: 1860
      revision: 13
      datasource: Prometheus
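The chart can then be installed with this values file. A sketch using the stable repository chart and Helm v2 style flags, matching the earlier commands (the release name is an assumption; the thanos namespace is chosen so the datasource URL http://thanos-query-http:10902 resolves to the Query service in the same namespace):

```shell
helm install --name grafana --namespace thanos stable/grafana -f grafana-values.yaml
```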
Conclusion:
These are the commands and ideas to consider when monitoring Kubernetes in different ways. Cloud Stack Group is a leading provider of AWS services, assisting its clients in moving their servers from one place to another.
Feel free to contact us today to learn more about our services or about the commands above. You are just a call or an email away from us.