k8s Pitfall Notes: CoreDNS

  1. Symptoms
  2. Troubleshooting

Symptoms

While deploying the development environment recently, I noticed a pod whose READY status stayed at 0/1 for about 7 minutes.

NAME                                       READY   STATUS    RESTARTS   AGE   IP            NODE                                                NOMINATED NODE   READINESS GATES
ks-sso-server-deployment-896964cb6-9xdnb 0/1 Running 0 6m36s 10.244.1.30 ecs.ali-bj-vpc.other.172.25.116.185.vpc-dev-k8s02 <none> <none>

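Troubleshooting

kubectl describe shows the readiness probe against the health check endpoint failing with "connection refused". The service inside the pod had presumably not finished starting, so the readiness check could not pass.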

kubectl describe pod/ks-sso-server-deployment-896964cb6-9xdnb

Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled <unknown> default-scheduler Successfully assigned default/ks-sso-server-deployment-896964cb6-9xdnb to ecs.ali-bj-vpc.other.172.25.116.185.vpc-dev-k8s02
Normal Pulling 78s kubelet, ecs.ali-bj-vpc.other.172.25.116.185.vpc-dev-k8s02 Pulling image "busybox"
Normal Pulled 69s kubelet, ecs.ali-bj-vpc.other.172.25.116.185.vpc-dev-k8s02 Successfully pulled image "busybox"
Normal Created 69s kubelet, ecs.ali-bj-vpc.other.172.25.116.185.vpc-dev-k8s02 Created container proj-init
Normal Started 69s kubelet, ecs.ali-bj-vpc.other.172.25.116.185.vpc-dev-k8s02 Started container proj-init
Normal Pulled 68s kubelet, ecs.ali-bj-vpc.other.172.25.116.185.vpc-dev-k8s02 Container image "harbor.x.xxx.com/library/ks-sso-server:1.6.0.0-SNAPSHOT" already present on machine
Normal Created 68s kubelet, ecs.ali-bj-vpc.other.172.25.116.185.vpc-dev-k8s02 Created container ks-sso-server
Normal Started 68s kubelet, ecs.ali-bj-vpc.other.172.25.116.185.vpc-dev-k8s02 Started container ks-sso-server
Warning Unhealthy 3s (x8 over 38s) kubelet, ecs.ali-bj-vpc.other.172.25.116.185.vpc-dev-k8s02 Readiness probe failed: Get http://10.244.1.30:9999/actuator/health: dial tcp 10.244.1.30:9999: connect: connection refused
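
The probe failure above can be reproduced by hand. Below is a minimal sketch of what an HTTP readiness check amounts to: issue a GET and treat a refused connection or an error status as "not ready". The function name http_ready and the one-second timeout are my own assumptions for illustration; this is not the kubelet's implementation.

```python
import urllib.error
import urllib.request


def http_ready(url: str, timeout: float = 1.0) -> bool:
    """Return True if a GET on url answers with a status in [200, 400)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts, DNS failures and HTTP 4xx/5xx end up here.
        return False


# Example call, using the address from the Events output above:
# http_ready("http://10.244.1.30:9999/actuator/health")
```

Run from the node, a False result here only confirms that nothing is listening on the port yet; it does not say why the service is slow to start.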

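From past experience, the business service was probably stuck waiting for a database connection during startup. Connecting to the database from the k8s node worked fine, but inside the pod, pinging an external domain failed at name resolution.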

bash-4.2$ ping -c 1 www.baidu.com
ping: www.baidu.com: Name or service not known

Look up the DNS configuration

# Run inside the pod
bash-4.2$ cat /etc/resolv.conf
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5

# Run on the node
root@pts/1 $ kubectl get svc -n kube-system -o wide | grep 10.96.0.10
kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 33d k8s-app=kube-dns

# Find the pods behind the service
root@pts/1 $ kubectl get pods -n kube-system -o wide -l k8s-app=kube-dns
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
coredns-5f8cbd7dcb-7d4tq 1/1 Running 0 26h 10.244.1.28 ecs.ali-bj-vpc.other.172.25.116.185.vpc-dev-k8s02 <none> <none>
coredns-5f8cbd7dcb-sq4kf 1/1 Running 0 33d 10.244.2.2 ecs.ali-bj-vpc.other.172.25.116.184.vpc-dev-k8s03 <none> <none>

# Resolve the domain against each pod IP in turn
root@pts/1 $ dig @10.244.1.28 www.baidu.com

; <<>> DiG 9.11.4-P2-RedHat-9.11.4-9.P2.el7 <<>> @10.244.1.28 www.baidu.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 42941
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;www.baidu.com. IN A

;; ANSWER SECTION:
www.baidu.com. 30 IN CNAME www.a.shifen.com.
www.a.shifen.com. 30 IN A 220.181.38.149
www.a.shifen.com. 30 IN A 220.181.38.150

;; Query time: 2 msec
;; SERVER: 10.244.1.28#53(10.244.1.28)
;; WHEN: Tue Dec 24 20:48:52 CST 2019
;; MSG SIZE rcvd: 149

ecs.ali-bj-vpc.other.172.25.116.186.vpc-dev-k8s01 [~] 2019-12-24 20:48:52
root@pts/1 $ dig @10.244.2.2 www.baidu.com

; <<>> DiG 9.11.4-P2-RedHat-9.11.4-9.P2.el7 <<>> @10.244.2.2 www.baidu.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 3294
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;www.baidu.com. IN A

;; ANSWER SECTION:
www.baidu.com. 22 IN CNAME www.a.shifen.com.
www.a.shifen.com. 22 IN A 220.181.38.150
www.a.shifen.com. 22 IN A 220.181.38.149

;; Query time: 2 msec
;; SERVER: 10.244.2.2#53(10.244.2.2)
;; WHEN: Tue Dec 24 20:49:15 CST 2019
;; MSG SIZE rcvd: 149
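
If dig is not available on the machine, the same "ask one specific server" check can be done with a few lines of Python. This is only a sketch: build_query and query_server are names I made up, the two-second timeout is arbitrary, and the packet is a bare A query without the EDNS option that dig adds.

```python
import secrets
import socket
import struct


def build_query(name: str, qid: int) -> bytes:
    """Encode a standard recursive A-record query for name."""
    # Header: ID, flags (RD set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0.
    header = struct.pack("!HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in name.split(".")
        if part  # skip the empty label left by a trailing dot
    )
    question = labels + b"\x00" + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def query_server(server: str, name: str, timeout: float = 2.0) -> int:
    """Send the query to server:53 over UDP and return the response RCODE.

    Raises socket.timeout if no answer arrives within timeout seconds.
    """
    qid = secrets.randbelow(1 << 16)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name, qid), (server, 53))
        data, _ = sock.recvfrom(4096)
    rid, flags = struct.unpack("!HH", data[:4])
    if rid != qid:
        raise ValueError("response ID does not match query ID")
    return flags & 0x000F  # 0 = NOERROR, 3 = NXDOMAIN


# Example calls against the two CoreDNS pod IPs listed above:
# query_server("10.244.1.28", "www.baidu.com")
# query_server("10.244.2.2", "www.baidu.com")
```

A return value of 0 corresponds to "status: NOERROR" in the dig output. For www.baidu.com the encoded query is 31 bytes, the same size tcpdump reports for the pod's own lookup later in this post.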

Querying CoreDNS directly works for both pods. Access to a Service goes through iptables forwarding rules, so could the iptables rules be the problem?

The forwarding path looks roughly like this:
                                                                     |-->KUBE-SEP-MNEVT5LK3OWCRPXW---|           |--->10.244.1.28(50%)
request-->OUTPUT-->KUBE-SERVICES-->KUBE-SVC-TCOU7JCQXEZGVUNU-->RANDOM                                 --->routing
                                                                     |-->KUBE-SEP-TCIZBYBD3WWXNWF5---|           |--->10.244.2.2(50%)
# Only the relevant rules are shown
root@pts/0 $ iptables -t nat -vnL
Chain PREROUTING (policy ACCEPT 17 packets, 2178 bytes)
pkts bytes target prot opt in out source destination
393K 59M KUBE-SERVICES all -- * * 0.0.0.0/0 0.0.0.0/0 /* kubernetes service portals */
138K 40M DOCKER all -- * * 0.0.0.0/0 0.0.0.0/0 ADDRTYPE match dst-type LOCAL

Chain INPUT (policy ACCEPT 8 packets, 1243 bytes)
pkts bytes target prot opt in out source destination

Chain OUTPUT (policy ACCEPT 16 packets, 2436 bytes)
pkts bytes target prot opt in out source destination
1138K 81M KUBE-SERVICES all -- * * 0.0.0.0/0 0.0.0.0/0 /* kubernetes service portals */
0 0 DOCKER all -- * * 0.0.0.0/0 !127.0.0.0/8 ADDRTYPE match dst-type LOCAL

Chain KUBE-SERVICES (2 references)
pkts bytes target prot opt in out source destination
0 0 KUBE-MARK-MASQ udp -- * * !10.244.0.0/16 10.96.0.10 /* kube-system/kube-dns:dns cluster IP */ udp dpt:53
0 0 KUBE-SVC-TCOU7JCQXEZGVUNU udp -- * * 0.0.0.0/0 10.96.0.10 /* kube-system/kube-dns:dns cluster IP */ udp dpt:53
0 0 KUBE-MARK-MASQ tcp -- * * !10.244.0.0/16 10.96.0.10 /* kube-system/kube-dns:dns-tcp cluster IP */ tcp dpt:53
0 0 KUBE-SVC-ERIFXISQEP7F7OF4 tcp -- * * 0.0.0.0/0 10.96.0.10 /* kube-system/kube-dns:dns-tcp cluster IP */ tcp dpt:53

Chain KUBE-SVC-TCOU7JCQXEZGVUNU (1 references)
pkts bytes target prot opt in out source destination
0 0 KUBE-SEP-MNEVT5LK3OWCRPXW all -- * * 0.0.0.0/0 0.0.0.0/0 statistic mode random probability 0.50000000000
0 0 KUBE-SEP-TCIZBYBD3WWXNWF5 all -- * * 0.0.0.0/0 0.0.0.0/0

Chain KUBE-SEP-MNEVT5LK3OWCRPXW (1 references)
pkts bytes target prot opt in out source destination
0 0 KUBE-MARK-MASQ all -- * * 10.244.1.28 0.0.0.0/0
0 0 DNAT udp -- * * 0.0.0.0/0 0.0.0.0/0 udp to:10.244.1.28:53

Chain KUBE-SEP-TCIZBYBD3WWXNWF5 (1 references)
pkts bytes target prot opt in out source destination
0 0 KUBE-MARK-MASQ all -- * * 10.244.2.2 0.0.0.0/0
0 0 DNAT udp -- * * 0.0.0.0/0 0.0.0.0/0 udp to:10.244.2.2:53
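
The "statistic mode random probability 0.5" rule followed by an unconditional rule is what gives each CoreDNS pod half of the traffic. Rules in a KUBE-SVC chain are evaluated in order, so with n endpoints the per-rule probabilities are 1/n, 1/(n-1), ..., 1, and every endpoint ends up with an equal share. The sketch below simulates that chain walk; the function names are hypothetical and it models only the selection logic, not iptables itself.

```python
import random
from collections import Counter


def chain_probabilities(n: int) -> list[float]:
    """Per-rule match probabilities for n endpoints: 1/n, 1/(n-1), ..., 1."""
    return [1.0 / (n - i) for i in range(n)]


def pick_endpoint(endpoints: list[str], rng: random.Random) -> str:
    """Walk the rules in order; the first rule that matches wins."""
    for endpoint, p in zip(endpoints, chain_probabilities(len(endpoints))):
        if rng.random() < p:
            return endpoint
    return endpoints[-1]  # not reached: the last rule has probability 1


# Simulate 10,000 DNS requests hitting the two endpoints from the rules above.
rng = random.Random(0)
counts = Counter(
    pick_endpoint(["10.244.1.28", "10.244.2.2"], rng) for _ in range(10_000)
)
print(counts)
```

For the two endpoints here the probabilities come out as 0.5 and 1.0, matching the two rules in KUBE-SVC-TCOU7JCQXEZGVUNU.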

These rules are fine too, which was frustrating. Time to bring out tcpdump and look at the packets.

# Run in the pod
bash-4.2$ ping -c 1 www.baidu.com
ping: www.baidu.com: Name or service not known

# Run on the node
tcpdump -i cni0 -p udp port 53 and host 10.244.2.55 -vv -a
21:30:57.993070 IP (tos 0x0, ttl 64, id 14287, offset 0, flags [DF], proto UDP (17), length 85)
10.244.2.55.51209 > 10.96.0.10.domain: [bad udp cksum 0x17e7 -> 0x89b4!] 606+ A? www.baidu.com.default.svc.cluster.local. (57)
21:30:57.993102 IP (tos 0x0, ttl 63, id 14287, offset 0, flags [DF], proto UDP (17), length 85)
10.244.2.55.51209 > 10.244.2.2.domain: [bad udp cksum 0x1a73 -> 0x8728!] 606+ A? www.baidu.com.default.svc.cluster.local. (57)
######
5-second pause here
######
21:31:02.998159 IP (tos 0x0, ttl 64, id 18413, offset 0, flags [DF], proto UDP (17), length 85)
10.244.2.55.51209 > 10.96.0.10.domain: [bad udp cksum 0x17e7 -> 0x89b4!] 606+ A? www.baidu.com.default.svc.cluster.local. (57)
21:31:02.998189 IP (tos 0x0, ttl 63, id 18413, offset 0, flags [DF], proto UDP (17), length 85)
10.244.2.55.51209 > 10.244.2.2.domain: [bad udp cksum 0x1a73 -> 0x8728!] 606+ A? www.baidu.com.default.svc.cluster.local. (57)
######
5-second pause here
######
21:31:08.002208 IP (tos 0x0, ttl 64, id 22190, offset 0, flags [DF], proto UDP (17), length 59)
10.244.2.55.48623 > 10.96.0.10.domain: [bad udp cksum 0x17cd -> 0xcdb6!] 41007+ A? www.baidu.com. (31)
21:31:08.005166 IP (tos 0x0, ttl 62, id 57825, offset 0, flags [DF], proto UDP (17), length 166)
10.96.0.10.domain > 10.244.2.55.48623: [udp sum ok] 41007 q: A? www.baidu.com. 3/0/0 www.baidu.com. CNAME www.a.shifen.com., www.a.shifen.com. A 220.181.38.150, www.a.shifen.com. A 220.181.38.149 (138)
21:31:08.012680 IP (tos 0x0, ttl 64, id 22196, offset 0, flags [DF], proto UDP (17), length 73)
10.244.2.55.35870 > 10.96.0.10.domain: [bad udp cksum 0x17db -> 0xa4f7!] 12939+ PTR? 150.38.181.220.in-addr.arpa. (45)
21:31:08.015056 IP (tos 0x0, ttl 62, id 57832, offset 0, flags [DF], proto UDP (17), length 171)
10.96.0.10.domain > 10.244.2.55.35870: [udp sum ok] 12939 NXDomain q: PTR? 150.38.181.220.in-addr.arpa. 0/1/0 ns: 38.181.220.IN-ADDR.ARPA. SOA idc-ns1.bjtelecom.net. wang_ye.bjxywh.com. 1201938454 10800 3600 604800 38400 (143)

# In a healthy pod the capture should look like this
23:21:04.525768 IP (tos 0x0, ttl 64, id 10315, offset 0, flags [DF], proto UDP (17), length 85)
10.244.0.2.58318 > 10.96.0.10.domain: [bad udp cksum 0x15b2 -> 0x4c30!] 9810+ A? www.baidu.com.default.svc.cluster.local. (57)
23:21:04.526224 IP (tos 0x0, ttl 62, id 16446, offset 0, flags [DF], proto UDP (17), length 178)
10.96.0.10.domain > 10.244.0.2.58318: [udp sum ok] 9810 NXDomain*- q: A? www.baidu.com.default.svc.cluster.local. 0/1/0 ns: cluster.local. [30s] SOA ns.dns.cluster.local. hostmaster.cluster.local. 1577200182 7200 1800 86400 30 (150)
23:21:04.526312 IP (tos 0x0, ttl 64, id 10316, offset 0, flags [DF], proto UDP (17), length 77)
10.244.0.2.52627 > 10.96.0.10.domain: [bad udp cksum 0x15aa -> 0xb683!] 3326+ A? www.baidu.com.svc.cluster.local. (49)
23:21:04.526617 IP (tos 0x0, ttl 62, id 16447, offset 0, flags [DF], proto UDP (17), length 170)
10.96.0.10.domain > 10.244.0.2.52627: [udp sum ok] 3326 NXDomain*- q: A? www.baidu.com.svc.cluster.local. 0/1/0 ns: cluster.local. [30s] SOA ns.dns.cluster.local. hostmaster.cluster.local. 1577200182 7200 1800 86400 30 (142)
23:21:04.526665 IP (tos 0x0, ttl 64, id 10317, offset 0, flags [DF], proto UDP (17), length 73)
10.244.0.2.35606 > 10.96.0.10.domain: [bad udp cksum 0x15a6 -> 0xe2fb!] 40161+ A? www.baidu.com.cluster.local. (45)
23:21:04.527091 IP (tos 0x0, ttl 62, id 12183, offset 0, flags [DF], proto UDP (17), length 166)
10.96.0.10.domain > 10.244.0.2.35606: [udp sum ok] 40161 NXDomain*- q: A? www.baidu.com.cluster.local. 0/1/0 ns: cluster.local. [30s] SOA ns.dns.cluster.local. hostmaster.cluster.local. 1577200182 7200 1800 86400 30 (138)
23:21:04.527127 IP (tos 0x0, ttl 64, id 10318, offset 0, flags [DF], proto UDP (17), length 59)
10.244.0.2.47739 > 10.96.0.10.domain: [bad udp cksum 0x1598 -> 0x2b05!] 18570+ A? www.baidu.com. (31)
23:21:04.529487 IP (tos 0x0, ttl 62, id 16448, offset 0, flags [DF], proto UDP (17), length 166)
10.96.0.10.domain > 10.244.0.2.47739: [udp sum ok] 18570 q: A? www.baidu.com. 3/0/0 www.baidu.com. [30s] CNAME www.a.shifen.com., www.a.shifen.com. [30s] A 220.181.38.150, www.a.shifen.com. [30s] A 220.181.38.149 (138)
23:21:04.537253 IP (tos 0x0, ttl 64, id 10325, offset 0, flags [DF], proto UDP (17), length 73)
10.244.0.2.35284 > 10.96.0.10.domain: [bad udp cksum 0x15a6 -> 0x7998!] 25193+ PTR? 150.38.181.220.in-addr.arpa. (45)
23:21:04.539122 IP (tos 0x0, ttl 62, id 12187, offset 0, flags [DF], proto UDP (17), length 171)
10.96.0.10.domain > 10.244.0.2.35284: [udp sum ok] 25193 NXDomain q: PTR? 150.38.181.220.in-addr.arpa. 0/1/0 ns: 38.181.220.IN-ADDR.ARPA. [30s] SOA idc-ns1.bjtelecom.net. wang_ye.bjxywh.com. 1201938454 10800 3600 604800 38400 (143)
# Querying this way returns a result quickly
bash-4.2$ ping www.baidu.com. <-- note the trailing ".": the name is treated as absolute (fully qualified) for the DNS lookup
PING www.a.shifen.com (220.181.38.149) 56(84) bytes of data.
64 bytes from 220.181.38.149 (220.181.38.149): icmp_seq=1 ttl=51 time=6.97 ms
64 bytes from 220.181.38.149 (220.181.38.149): icmp_seq=2 ttl=51 time=6.93 ms
64 bytes from 220.181.38.149 (220.181.38.149): icmp_seq=3 ttl=51 time=7.08 ms
^C
--- www.a.shifen.com ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 6.930/6.996/7.082/0.093 ms
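
To put a number on the difference between the two forms, a small timing helper can be run inside the pod. This is a sketch that goes through the system resolver, so it honors the search and ndots settings in /etc/resolv.conf; time_lookup is a name I made up.

```python
import socket
import time


def time_lookup(name: str) -> tuple[bool, float]:
    """Resolve name via the system resolver; return (succeeded, seconds taken)."""
    start = time.monotonic()
    try:
        socket.getaddrinfo(name, None)
        ok = True
    except socket.gaierror:
        ok = False
    return ok, time.monotonic() - start


# Example comparison of the relative and the absolute form:
# for name in ("www.baidu.com", "www.baidu.com."):
#     print(name, time_lookup(name))
```

In the broken pod, the relative form is the one expected to be slow or to fail, since it is the one that walks the search list first.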

Putting the evidence together, there were two problems:

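  1. Packets were being dropped on the network. After a lot of digging, a restart finally fixed this. It may be related to Alibaba Cloud's recent vulnerability patching, which was a real pain.
  2. Unnecessary queries were being made. /etc/resolv.conf contains the default search domains "search default.svc.cluster.local svc.cluster.local cluster.local", so a lookup first appends each search domain to the name in turn and queries those, and only then queries the name you actually asked for. The fix is as follows: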
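    Add the following to the pod spec in deploy.yml (under spec.template.spec, at the same level as containers, not inside a container entry):
    dnsConfig:
      options:
        - name: ndots
          value: "1"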


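    ndots is a threshold on the number of "." characters in a name. If a name contains fewer dots than this, the resolver (not Kubernetes itself) treats it as relative and tries the search domains first. Kubernetes sets ndots to 5 by default on the assumption that internal names contain at most 5 dots: lookups for internal names should go to the cluster DNS first and should have no chance of being sent out to external DNS. For in-cluster names, ndots 5 is therefore a reasonable default. With ndots set to 1, a bare Service name without any dot still goes through the search list.
    With that, the problem was solved. The packet loss issue really was a pain.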
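
The effect of ndots on the query order can be sketched in a few lines. This is a simplified model of glibc-style behavior, not the resolver's actual code, and candidate_names is a hypothetical helper; other resolvers, musl for example, handle the search list differently.

```python
def candidate_names(name: str, search: list[str], ndots: int) -> list[str]:
    """Return the fully qualified names the resolver tries, in order."""
    if name.endswith("."):
        return [name]  # absolute name: the search list is skipped
    absolute = name + "."
    via_search = [f"{name}.{domain}." for domain in search]
    if name.count(".") >= ndots:
        return [absolute] + via_search  # enough dots: try the name as-is first
    return via_search + [absolute]  # too few dots: search domains first


search = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]

for ndots in (5, 1):
    print(f"ndots={ndots}:")
    for candidate in candidate_names("www.baidu.com", search, ndots):
        print("  ", candidate)
```

With ndots 5 the list is the three cluster.local variants followed by www.baidu.com., the same order as in the healthy tcpdump capture above. With ndots 1, www.baidu.com. is tried first.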

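Please credit the source when reposting. Readers are welcome to verify the sources cited in this article and to point out any mistakes or unclear wording, either in the comments below or by email to jaytp@qq.com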

Author: Aaron

Published: 2019-12-24, 20:20:17

Last updated: 2019-12-26, 14:42:16

Original link: http://blog.linuxerbulo.com/2019/12/24/k8s%E9%87%87%E5%9D%91%E8%AE%B0%E5%BD%95%E4%B9%8BCoreDns/

License: "Attribution-NonCommercial-ShareAlike 4.0". Please keep the original link and author when reposting.
