Kubernetes Complete Course

Kubernetes Networking and Storage

王渊命 @jolestar

Agenda

  1. Kubernetes networking
    1. Kubernetes networking overview
    2. The Kubernetes ClusterIP mechanism
    3. CNI, the Kubernetes network specification
    4. Cross-host container networking
    5. Kubernetes networking, with Flannel as an example
    6. QingCloud SDN Passthrough
    7. Troubleshooting Kubernetes networking
  2. Kubernetes storage
    1. Kubernetes Volume
    2. Kubernetes PersistentVolume
    3. Kubernetes PersistentVolumeClaim and StorageClass

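Kubernetes Networking Overview

  1. Service ClusterIP
  2. Pod network
    • Containers can reach one another directly, without NAT
    • Nodes and containers can reach one another directly, without NAT
    • The IP a container sees for itself is the same IP other containers see for it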

Network address translation (NAT) is a method of remapping one IP address space into another by modifying network address information in Internet Protocol (IP) datagram packet headers while they are in transit across a traffic routing device.
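Why the no-NAT rule matters can be seen in a small sketch. The Python below is a toy model, not real packet handling: `Packet` and `snat` are hypothetical names, and the pod address 10.244.2.5 is made up for illustration (10.244.1.3 and 192.168.0.11 are borrowed from later slides). It suggests that once source NAT sits in the path, the receiver no longer sees the sender's own IP, which would break the third requirement above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str


def snat(packet, new_src):
    """Source NAT: rewrite the source address while the packet is in transit."""
    return replace(packet, src=new_src)


pod_a = "10.244.1.3"     # sender pod
pod_b = "10.244.2.5"     # receiver pod (hypothetical address)
node_a = "192.168.0.11"  # host that pod_a runs on

sent = Packet(src=pod_a, dst=pod_b)

direct = sent                     # flat pod network: delivered unchanged
masqueraded = snat(sent, node_a)  # NAT in the path: source rewritten to the node

print(direct.src == pod_a)       # receiver sees the pod's own IP
print(masqueraded.src == pod_a)  # receiver sees the node's IP instead
```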


Service ClusterIP

[Figure 2: Service ClusterIP]


Iptables

iptables is a command-line tool for configuring the Linux kernel firewall. It is built on the kernel's netfilter framework.

Tables↓ / Chains→       PREROUTING  INPUT  FORWARD  OUTPUT  POSTROUTING
raw                     x                           x
(connection tracking)   x                           x
mangle                  x           x      x        x       x
nat (DNAT)              x                           x
filter (default)                    x      x        x
security                            x      x        x
nat (SNAT)                          x                       x

Iptables data flow

[Figure 3: iptables data flow]
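The data flow can also be written down as a small lookup. This is a simplified sketch of the standard netfilter chain order, not a full model (it ignores loopback traffic and per-table ordering inside each chain); `FLOWS` and `nat_hooks` are hypothetical names. It may help explain why both the Docker and the Kubernetes rules later in these slides hook PREROUTING and OUTPUT: those are the two chains where DNAT can apply, depending on whether the packet arrived from outside or was generated locally.

```python
# Chains a packet passes through, by traffic type (simplified netfilter model).
FLOWS = {
    "incoming": ["PREROUTING", "INPUT"],                   # addressed to this host
    "forwarded": ["PREROUTING", "FORWARD", "POSTROUTING"],  # routed through this host
    "outgoing": ["OUTPUT", "POSTROUTING"],                 # generated on this host
}

DNAT_CHAINS = {"PREROUTING", "OUTPUT"}   # chains where the DNAT target is valid
SNAT_CHAINS = {"INPUT", "POSTROUTING"}   # chains where the SNAT target is valid


def nat_hooks(kind):
    """Return (first chain where DNAT can apply, last chain where SNAT can apply)."""
    chains = FLOWS[kind]
    dnat = next(c for c in chains if c in DNAT_CHAINS)
    snat = next(c for c in reversed(chains) if c in SNAT_CHAINS)
    return dnat, snat


for kind, chains in FLOWS.items():
    print(kind, chains, nat_hooks(kind))
```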


Iptables example Docker

    iptables -A INPUT -p tcp --dport 22 -j ACCEPT
    iptables -P INPUT DROP
    iptables -P FORWARD DROP
    # docker iptables nat
    iptables -S -t nat
    -N DOCKER
    -A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
    -A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
    -A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
    -A DOCKER ! -i docker0 -p tcp -m tcp --dport 80 -j DNAT --to-destination 172.17.0.2:80

Iptables example Kubernetes ClusterIP

    kubectl get service
    NAME         CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
    helloworld   10.105.189.133   <nodes>       80:31061/TCP   1d

    -N KUBE-NODEPORTS
    -N KUBE-POSTROUTING
    -N KUBE-SERVICES
    -N KUBE-SVC-2WB5SOAIQNMPIUJO
    -A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
    -A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
    -A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
    -A KUBE-NODEPORTS -p tcp -m comment --comment "default/helloworld:" -m tcp --dport 31061 -j KUBE-SVC-2WB5SOAIQNMPIUJO
    -A KUBE-SERVICES -d 10.105.189.133/32 -p tcp -m comment --comment "default/helloworld: cluster IP" -m tcp --dport 80 -j KUBE-SVC-2WB5SOAIQNMPIUJO
    -A KUBE-SVC-2WB5SOAIQNMPIUJO -m comment --comment "default/helloworld:" -j KUBE-SEP-3F6YTS3NO7JJQOGS
    -A KUBE-SEP-3F6YTS3NO7JJQOGS -p tcp -m comment --comment "default/helloworld:" -m tcp -j DNAT --to-destination 10.244.1.3:80
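The rules above can be read as a chain walk: KUBE-SERVICES matches the ClusterIP and port, jumps to the per-service chain, which jumps to the per-endpoint chain, which rewrites the destination to the pod. The sketch below replays that walk in Python under simplifying assumptions: it models only the ClusterIP path (not KUBE-NODEPORTS), only the single endpoint shown here, and treats a jump as final. `Packet`, `RULES`, and `traverse` are hypothetical names; the chain names and addresses are copied from the listing.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    dst_ip: str
    dst_port: int
    proto: str = "tcp"


# chain name -> list of (match function, action); values copied from the rules above.
RULES = {
    "KUBE-SERVICES": [
        (lambda p: p.proto == "tcp" and p.dst_ip == "10.105.189.133" and p.dst_port == 80,
         ("jump", "KUBE-SVC-2WB5SOAIQNMPIUJO")),
    ],
    "KUBE-SVC-2WB5SOAIQNMPIUJO": [
        (lambda p: True, ("jump", "KUBE-SEP-3F6YTS3NO7JJQOGS")),
    ],
    "KUBE-SEP-3F6YTS3NO7JJQOGS": [
        (lambda p: p.proto == "tcp", ("dnat", "10.244.1.3", 80)),
    ],
}


def traverse(packet, chain="KUBE-SERVICES"):
    """Walk the chains and return the packet as it leaves the nat table."""
    for matches, action in RULES[chain]:
        if not matches(packet):
            continue
        if action[0] == "jump":
            # Simplification: a jump never returns to the calling chain.
            return traverse(packet, action[1])
        if action[0] == "dnat":
            return replace(packet, dst_ip=action[1], dst_port=action[2])
    return packet  # no rule matched: destination left untouched


print(traverse(Packet("10.105.189.133", 80)))   # rewritten to the pod
print(traverse(Packet("10.105.189.133", 443)))  # not a service port: unchanged
```

With more than one endpoint, kube-proxy adds one KUBE-SEP chain per pod and picks among them at random; the sketch does not cover that case.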

CNI (Container Network Interface)

  1. CNI_COMMAND (ADD/DEL)
  2. CNI_PATH (/opt/cni/bin)
  3. CNI_CONTAINERID
  4. CNI_NETNS

    $ cat /etc/cni/net.d/10-mynet.conf
    {
      "cniVersion": "0.2.0",
      "name": "mynet",
      "type": "bridge",
      "bridge": "cni0",
      "isGateway": true,
      "ipMasq": true,
      "ipam": {
        "type": "host-local",
        "subnet": "10.22.0.0/16",
        "routes": [
          { "dst": "0.0.0.0/0" }
        ]
      }
    }
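How the environment variables and the config file fit together: the runtime runs the plugin executable named by `type`, found under `CNI_PATH`, passes the parameters as environment variables, and feeds the network config on stdin. The sketch below only assembles such a call and does not execute anything. `build_invocation`, the container ID `abc123`, and its netns path are hypothetical; `CNI_IFNAME` is part of the CNI spec although it is not listed above, and treating `CNI_PATH` as a single directory is a simplification.

```python
import json
import posixpath

# The network config from /etc/cni/net.d/10-mynet.conf above.
CONF = {
    "cniVersion": "0.2.0",
    "name": "mynet",
    "type": "bridge",
    "bridge": "cni0",
    "isGateway": True,
    "ipMasq": True,
    "ipam": {
        "type": "host-local",
        "subnet": "10.22.0.0/16",
        "routes": [{"dst": "0.0.0.0/0"}],
    },
}


def build_invocation(conf, command, container_id, netns,
                     cni_path="/opt/cni/bin", ifname="eth0"):
    """Return (plugin executable, environment, stdin payload) for one CNI call."""
    env = {
        "CNI_COMMAND": command,          # ADD or DEL
        "CNI_CONTAINERID": container_id,
        "CNI_NETNS": netns,              # path to the container's network namespace
        "CNI_IFNAME": ifname,            # interface name to create inside the netns
        "CNI_PATH": cni_path,
    }
    plugin = posixpath.join(cni_path, conf["type"])
    return plugin, env, json.dumps(conf)


plugin, env, stdin = build_invocation(
    CONF, "ADD", "abc123", "/var/run/netns/abc123")  # hypothetical container
print(plugin)
print(env["CNI_COMMAND"], env["CNI_NETNS"])
```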

Cross-Host Container Networking

  1. Container networking (see the preparatory lesson for details)
  2. Problems that cross-host container networking has to solve
    1. IP allocation
    2. Forwarding traffic between hosts
  3. Building a cross-host container network by hand (demo)

    # node1 (192.168.0.11) docker: 172.17.0.0/16
    # node2 (192.168.0.12) docker: 172.18.0.0/16 (/etc/docker/daemon.json {"bip":"172.18.0.1/16"})
    # node1
    iptables -P FORWARD ACCEPT
    ip route add 172.18.0.0/16 via 192.168.0.12
    # node2
    iptables -P FORWARD ACCEPT
    ip route add 172.17.0.0/16 via 192.168.0.11
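The manual setup generalizes: every node needs one route per other node, pointing that node's container subnet at its host IP. The sketch below generates those commands as strings and does not run them; `static_routes` is a hypothetical helper, and the approach assumes all nodes share one L2 network so that the `via` address is directly reachable.

```python
def static_routes(nodes):
    """Map each node name to the `ip route add` commands it needs.

    nodes: dict of node name -> (host IP, container subnet on that host).
    """
    commands = {}
    for name in nodes:
        commands[name] = [
            f"ip route add {subnet} via {host_ip}"
            for other, (host_ip, subnet) in nodes.items()
            if other != name  # a node needs no route to its own subnet
        ]
    return commands


nodes = {
    "node1": ("192.168.0.11", "172.17.0.0/16"),
    "node2": ("192.168.0.12", "172.18.0.0/16"),
}
for node, cmds in static_routes(nodes).items():
    print(node, cmds)
```

With n nodes this yields n - 1 routes per node, which is the bookkeeping that network plugins automate.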

Flannel


[Figure 4: Flannel]

Backend Type

  1. udp
  2. vxlan
  3. host-gw
  4. aws-vpc
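Whatever the backend, Flannel first has to solve the IP-allocation problem: each node leases its own subnet out of one cluster-wide range, and the backend then decides how packets travel between those subnets. The sketch below shows only the carving step, as an illustration rather than Flannel's actual code. The cluster range 10.244.0.0/16 and the /24 per-node size are assumptions (the slides only show the pod address 10.244.1.3), `lease_subnets` is a hypothetical name, and real Flannel records leases in a shared store instead of handing them out in order.

```python
import ipaddress


def lease_subnets(cluster_cidr, nodes, prefix=24):
    """Give each node the next unused /prefix subnet of the cluster range."""
    pool = ipaddress.ip_network(cluster_cidr).subnets(new_prefix=prefix)
    return dict(zip(nodes, pool))


leases = lease_subnets("10.244.0.0/16", ["node1", "node2"])  # assumed range
for node, subnet in leases.items():
    print(node, subnet)

# A pod address such as 10.244.1.3 then identifies the node that hosts it.
print(ipaddress.ip_address("10.244.1.3") in leases["node2"])
```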

QingCloud SDN Passthrough

Create a NIC (192.168.0.101), attach it to the host, then move it into the container's netns from the command line:

    container_id=$(docker run --network none -d jolestar/go-probe)
    docker exec $container_id ifconfig
    # Link the container's netns under /var/run/netns so that `ip netns` can see it
    pid=$(docker inspect -f '{{.State.Pid}}' ${container_id})
    mkdir -p /var/run/netns/
    ln -sfT /proc/$pid/ns/net /var/run/netns/$container_id
    # Move the NIC into the netns and bring it up
    ip link set eth1 netns ${container_id}
    ip netns exec ${container_id} ip addr add 192.168.0.101/24 dev eth1
    ip netns exec ${container_id} ip link set dev eth1 up
    ip netns exec ${container_id} ip route add default via 192.168.0.1
    docker exec $container_id ifconfig
    docker exec $container_id nping 192.168.0.12

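Troubleshooting Kubernetes Networking

  1. Check whether pods on the same host can reach each other; if not, investigate the local network (ARP, iptables)
  2. Check whether pods on different hosts can reach each other
  3. Check whether the DNS service is working
  4. Check whether requests to a Service ClusterIP succeed (inspect the Service iptables rules)
  5. Check whether requests from a pod to the apiserver succeed
  6. Check whether requests from a pod to the public internet succeed

    # iptables debug
    iptables -t raw -A PREROUTING -s 172.17.0.0/16 -j TRACE
    iptables -t nat -A POSTROUTING -p tcp -m tcp -d 172.17.0.0/16 -j LOG --log-prefix "POSTROUTING"
    tail -f /var/log/kern.log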
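The checklist is ordered from the narrowest scope to the widest, so the first failing step points at the layer to investigate. A minimal sketch of that idea, assuming each step can be approximated by a TCP connection test (which is not true for every step, DNS for example): `tcp_reachable` and `first_failure` are hypothetical helpers, and nothing here is Kubernetes-specific.

```python
import socket


def tcp_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def first_failure(checks):
    """Run (name, check) pairs in order; return the first failing name, or None."""
    for name, check in checks:
        if not check():
            return name
    return None
```

Run from inside a pod, a call might look like `first_failure([("cross-host pod", lambda: tcp_reachable("10.244.1.3", 80)), ("ClusterIP", lambda: tcp_reachable("10.105.189.133", 80))])`, using the addresses from the ClusterIP example; substitute the addresses of the cluster under test.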

Kubernetes Network Troubleshooting Tools

  • arp/arping
  • ping/nping/traceroute
  • iproute2
  • nmap/telnet/curl
  • nslookup/dig
  • tcpdump
  • iptables

Kubernetes Volume

    apiVersion: v1
    kind: Pod
    metadata:
      name: test-pd
    spec:
      containers:
      - image: jolestar/go-probe
        name: test-container
        volumeMounts:
        - mountPath: /cache
          name: cache-volume
      volumes:
      - name: cache-volume
        emptyDir: {}

Kubernetes Volume

  • emptyDir
  • hostPath
      volumes:
      - name: test-volume
        hostPath:
          path: /data
  • downwardAPI
  • secret
  • configMap

Kubernetes Volume


[Figure 5: Kubernetes Volume]
  • projected (secret, downwardAPI, configMap)
  • gitRepo
      volumes:
      - name: git-volume
        gitRepo:
          repository: "git@xxxx:me/my-repo.git"
          revision: "22f1d8406d464"

Kubernetes Volume

  • NFS/CephFS/Glusterfs
      volumes:
      - name: nfs-volume
        nfs:
          path: /opt/nfs
          server: nfs.f22
  • Cloud Disk (GCEPersistentDisk, AWSElasticBlockStore, AzureDisk)
      volumes:
      - name: test-volume
        awsElasticBlockStore:
          volumeID: <volume-id>
          fsType: ext4

Kubernetes VolumeMount Option

  • mountPath
  • name
  • readOnly
  • subPath

PersistentVolume and PersistentVolumeClaim

And why PersistentVolume and PersistentVolumeClaim exist

  • Lifecycle management
  • Resource cleanup and reuse
  • Pod replicas
  • Environments

Kubernetes PersistentVolume Spec

    kind: PersistentVolume
    apiVersion: v1
    metadata:
      name: qingcloud-pv
      labels:
        type: qingcloud
    spec:
      capacity:
        storage: 10Gi
      accessModes:
        - ReadWriteOnce
      flexVolume:
        driver: "qingcloud/flex-volume"
        fsType: "ext4"
        options:
          volumeID: "vol-xxxx"

Kubernetes PersistentVolume Spec

  • Capacity
  • Access Modes
    • ReadWriteOnce
    • ReadOnlyMany
    • ReadWriteMany
  • Mount Options
  • Phase (Available, Bound, Released, Failed)
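These fields are what a claim is matched against: a volume qualifies when it is Available, at least as large as the request, and offers every access mode the claim asks for. The sketch below models that matching in simplified form. It is not the controller's actual algorithm: it ignores storage classes and label selectors, measures capacity in whole GiB, and assumes the smallest qualifying volume is preferred. `Volume`, `Claim`, `bind`, and `big-pv` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    name: str
    capacity_gi: int
    access_modes: frozenset
    phase: str = "Available"


@dataclass
class Claim:
    name: str
    request_gi: int
    access_modes: frozenset


def bind(claim, volumes):
    """Bind claim to the smallest Available volume that satisfies it, or return None."""
    candidates = [
        v for v in volumes
        if v.phase == "Available"
        and v.capacity_gi >= claim.request_gi
        and claim.access_modes <= v.access_modes  # volume offers every requested mode
    ]
    if not candidates:
        return None
    chosen = min(candidates, key=lambda v: v.capacity_gi)
    chosen.phase = "Bound"
    return chosen


RWO = frozenset({"ReadWriteOnce"})
volumes = [
    Volume("big-pv", 100, RWO),       # hypothetical second volume
    Volume("qingcloud-pv", 10, RWO),
]
first = bind(Claim("qingcloud-pvc", 3, RWO), volumes)
print(first.name, first.phase)
```

Note that a 3Gi claim bound to a 10Gi volume holds the whole volume; the unused capacity is not available to other claims.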

Kubernetes PersistentVolumeClaim Spec

    kind: PersistentVolumeClaim
    apiVersion: v1
    metadata:
      name: qingcloud-pvc
    spec:
      storageClassName: qingcloud-storageclass
      # Note: persistentVolumeReclaimPolicy is a PersistentVolume field
      # (reclaimPolicy on a StorageClass), so it is not set on the claim.
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 3Gi

    volumes:
    - name: wordpress-persistent-storage
      persistentVolumeClaim:
        claimName: qingcloud-pvc

Reclaim Policy (Retain, Recycle, Delete): set on the PersistentVolume as persistentVolumeReclaimPolicy
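The three policies differ in what happens to the volume once its claim is deleted. The lookup below is a summary sketch of the documented behavior, not controller code; `release` and the result keys are hypothetical names.

```python
def release(policy):
    """Describe a volume's state after its claim is deleted."""
    outcomes = {
        # Volume and data stay; an administrator must reclaim it by hand.
        "Retain": {"phase": "Released", "data_kept": True, "pv_exists": True},
        # Contents are scrubbed and the volume is offered to new claims.
        "Recycle": {"phase": "Available", "data_kept": False, "pv_exists": True},
        # The PersistentVolume object and its backing storage are removed.
        "Delete": {"phase": None, "data_kept": False, "pv_exists": False},
    }
    return outcomes[policy]


for policy in ("Retain", "Recycle", "Delete"):
    print(policy, release(policy))
```

Recycle has since been deprecated in Kubernetes in favor of dynamic provisioning.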


Kubernetes StorageClass Spec

    kind: StorageClass
    apiVersion: storage.k8s.io/v1
    metadata:
      name: qingcloud-storageclass
      annotations:
        storageclass.kubernetes.io/is-default-class: "true"
    provisioner: qingcloud/volume-provisioner

    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: slow
    provisioner: kubernetes.io/glusterfs
    parameters:
      resturl: "http://192.168.10.100:8080"
      restuser: ""
      secretNamespace: ""
      secretName: ""
    allowVolumeExpansion: true

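Homework

  1. Try a different network solution, such as Calico, analyze how it is implemented, and run a simple performance comparison.
  2. Try running a GlusterFS server on Kubernetes and using it from within Kubernetes.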

References

  1. https://github.com/feiskyer/sdn-handbook
  2. Linux iptables Pocket Reference, Gregor N. Purdy
  3. 《图解 TCP/IP》 (Illustrated TCP/IP), 竹下隆史, 村山公保, 荒井透, 苅田幸雄
  4. 《计算机网络:自顶向下方法》 (Computer Networking: A Top-Down Approach), James F. Kurose, Keith W. Ross

About Me

Personal blog: http://jolestar.com
Course GitHub: https://github.com/jolestar/kubernetes-complete-course
