In Thanos Query, the Prometheus sidecar's status is shown as healthy:
10.0.66.140:10901 | UP | prometheus="monitoring/k8s" prometheus_replica="prometheus-k8s-0" | 2021-10-03 01:41:57 | | 793.000ms ago
But when I run a query, it reports an error:
Error executing query: expanding series: proxy Series(): Addr: 10.0.66.140:10901 LabelSets: {prometheus="monitoring/k8s", prometheus_replica="prometheus-k8s-0"} Mint: 1633225317596 Maxt: 9223372036854775807: receive series from Addr: 10.0.66.140:10901 LabelSets: {prometheus="monitoring/k8s", prometheus_replica="prometheus-k8s-0"} Mint: 1633225317596 Maxt: 9223372036854775807: rpc error: code = Canceled desc = grpc: the client connection is closing
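The Mint and Maxt values in the error look like Unix timestamps in milliseconds, with Maxt set to the largest signed 64-bit integer (this reading is an assumption, not something confirmed from the Thanos source). A minimal sketch for decoding them, where `describe_ms` is a hypothetical helper name:

```python
from datetime import datetime, timezone

INT64_MAX = 2**63 - 1  # 9223372036854775807, the Maxt value in the error


def describe_ms(ts_ms: int) -> str:
    """Render a millisecond Unix timestamp as a UTC string.

    Assumption: the int64 maximum is used as a "no upper bound" sentinel.
    """
    if ts_ms == INT64_MAX:
        return "unbounded (int64 max)"
    # Integer division avoids float rounding; sub-second precision is dropped.
    return datetime.fromtimestamp(ts_ms // 1000, tz=timezone.utc).isoformat()


if __name__ == "__main__":
    print("Mint:", describe_ms(1633225317596))
    print("Maxt:", describe_ms(9223372036854775807))
```

Decoded this way, Mint is 2021-10-03T01:41:57 UTC, the same time shown in the store status line above.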
The Thanos Query pod logs the following errors:
level=warn ts=2021-10-03T06:04:55.23300433Z caller=storeset.go:570 component=storeset msg="update of store node failed" err="getting metadata: fetching store info from 10.0.66.140:10901: rpc error: code = DeadlineExceeded desc = context deadline exceeded" address=10.0.66.140:10901
level=info ts=2021-10-03T06:04:55.23316993Z caller=storeset.go:426 component=storeset msg="removing store because it's unhealthy or does not exist" address=10.0.66.140:10901 extLset="{prometheus=\"monitoring/k8s\", prometheus_replica=\"prometheus-k8s-0\"}"
level=error ts=2021-10-03T06:04:55.236279545Z caller=proxy.go:307 component=proxy request="min_time:1633240788495 max_time:1633241088495 matchers:<name:\"__name__\" value:\"up\" > aggregates:COUNT aggregates:SUM partial_response_disabled:true " err="Addr: 10.0.66.140:10901 LabelSets: {prometheus=\"monitoring/k8s\", prometheus_replica=\"prometheus-k8s-0\"} Mint: 1633225317596 Maxt: 9223372036854775807: receive series from Addr: 10.0.66.140:10901 LabelSets: {prometheus=\"monitoring/k8s\", prometheus_replica=\"prometheus-k8s-0\"} Mint: 1633225317596 Maxt: 9223372036854775807: rpc error: code = Canceled desc = grpc: the client connection is closing"
level=info ts=2021-10-03T06:04:56.069001968Z caller=storeset.go:463 component=storeset msg="adding new storeAPI to query storeset" address=10.0.66.140:10901 extLset="{prometheus=\"monitoring/k8s\", prometheus_replica=\"prometheus-k8s-0\"}"
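The log lines above suggest the store is removed and then re-added within about a second. A rough sketch for measuring that gap from the logfmt output follows; the sample lines are trimmed copies of the log above, and `flap_gap_ns` is a hypothetical helper, not part of Thanos:

```python
import re
from datetime import datetime, timezone

# Matches logfmt timestamps such as ts=2021-10-03T06:04:55.23316993Z
TS_RE = re.compile(r"ts=(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z")
MSG_RE = re.compile(r'msg="([^"]*)"')

# Trimmed copies of two lines from the Thanos Query log in this post.
LOG_LINES = [
    'level=info ts=2021-10-03T06:04:55.23316993Z caller=storeset.go:426 '
    'msg="removing store because it\'s unhealthy or does not exist" '
    'address=10.0.66.140:10901',
    'level=info ts=2021-10-03T06:04:56.069001968Z caller=storeset.go:463 '
    'msg="adding new storeAPI to query storeset" address=10.0.66.140:10901',
]


def parse_line(line):
    """Return (timestamp in integer nanoseconds, msg) or None if no match."""
    ts, msg = TS_RE.search(line), MSG_RE.search(line)
    if ts is None or msg is None:
        return None
    base = datetime.strptime(ts.group(1), "%Y-%m-%dT%H:%M:%S")
    seconds = int(base.replace(tzinfo=timezone.utc).timestamp())
    # Pad or truncate the fraction to exactly nine digits (nanoseconds).
    frac_ns = int((ts.group(2) or "").ljust(9, "0")[:9])
    return seconds * 10**9 + frac_ns, msg.group(1)


def flap_gap_ns(lines):
    """Nanoseconds between the first 'removing store' and the next 'adding new storeAPI'."""
    removed_at = None
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        ts_ns, msg = parsed
        if removed_at is None and msg.startswith("removing store"):
            removed_at = ts_ns
        elif removed_at is not None and msg.startswith("adding new storeAPI"):
            return ts_ns - removed_at
    return None


if __name__ == "__main__":
    gap = flap_gap_ns(LOG_LINES)
    print(f"store was absent for {gap / 1e9:.3f} s")
```

A short gap like this would be consistent with the store flapping between healthy and unhealthy rather than being permanently unreachable, though the logs alone do not show why.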
I don't know what causes this. Can anyone suggest some ideas? My object storage is working normally, and the Prometheus monitoring data is being uploaded to object storage by the sidecar.
Thanos Query YAML (unnecessary content omitted; resources are not set):
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/instance: thanos-query
  name: thanos-query
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/instance: thanos-query
  template:
    metadata:
      labels:
        app.kubernetes.io/instance: thanos-query
    spec:
      containers:
      - args:
        - query
        - --grpc-address=0.0.0.0:10901
        - --http-address=0.0.0.0:9090
        - --log.level=info
        - --log.format=logfmt
        - --query.replica-label=prometheus_replica
        - --query.replica-label=rule_replica
        - --store=dnssrv+_grpc._tcp.thanos-sidecar-self.monitoring.svc.cluster.local
        - --store=dnssrv+_grpc._tcp.thanos-ruler.monitoring.svc.cluster.local
        - --store=dnssrv+_grpc._tcp.thanos-store.monitoring.svc.cluster.local:10901
        - --store=10.0.66.140:10901
        - --query.auto-downsampling
        env:
        - name: HOST_IP_ADDRESS
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: status.hostIP
        image: quay.io/thanos/thanos:v0.22.0
      dnsPolicy: ClusterFirst
      nodeSelector:
      securityContext:
        fsGroup: 65534
        runAsUser: 65534
      serviceAccount: thanos-query
      serviceAccountName: thanos-query
      terminationGracePeriodSeconds: 120