kubeadm

https://github.com/kubernetes/kubeadm

kubeadm is the cluster deployment tool that ships with Kubernetes.

Install

Prerequisites

https://kubernetes.io/zh/docs/setup/production-environment/tools/kubeadm/install-kubeadm/

Install a container runtime

https://kubernetes.io/zh/docs/setup/production-environment/container-runtimes/

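The default cgroup driver is cgroupfs. On a host whose init system is systemd, this leaves two cgroup managers (systemd and cgroupfs) on the same node, which can make the node unstable under resource pressure.

When changing the cgroup driver, change it for both the container runtime (CRI) and the kubelet so that the two match.

Change containerd's cgroup driver: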

$ sudo vim /etc/containerd/config.toml
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = true
$ sudo systemctl restart containerd
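
The same edit can be scripted instead of made by hand in vim. The following is a minimal sketch, not a definitive procedure: it works on a temporary file standing in for /etc/containerd/config.toml, assumes the file already contains a `SystemdCgroup = false` line, and assumes GNU sed.

```shell
# Temporary stand-in for /etc/containerd/config.toml (assumption: for illustration only).
cfg="$(mktemp)"
cat > "$cfg" <<'EOF'
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = false
EOF

# Flip the value in place, keeping the original indentation.
sed -i 's/^\([[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*\)false/\1true/' "$cfg"

# Show the resulting line.
grep 'SystemdCgroup' "$cfg"
```

On a real host, run the sed command with sudo against /etc/containerd/config.toml and restart containerd afterwards.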

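Change the kubelet's cgroup driver:

In Kubernetes 1.21.0, the cgroup driver is detected automatically only when the CRI is Docker; other CRIs do not support this yet, so set it explicitly: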

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd

Install kubeadm, kubelet, kubectl

Install kubeadm, kubelet, and kubectl on every machine:

$ sudo apt-get update
$ sudo apt-get install -y apt-transport-https ca-certificates curl
$ curl -fsSL https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | sudo apt-key add -
$ echo "deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list
$ sudo apt-get update
$ sudo apt-get --yes --allow-unauthenticated install kubeadm kubelet kubectl
$ sudo apt-mark hold kubelet kubeadm kubectl
$ sudo systemctl enable kubelet
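
After installation it can help to confirm that all three binaries are on PATH before going further. The sketch below is one way to do that; `check_tools` is a hypothetical helper name, not part of kubeadm.

```shell
# Report which of the given commands are missing from PATH.
# Returns 0 when all are present, 1 otherwise.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found:   $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

check_tools kubeadm kubelet kubectl || echo "install the missing packages before continuing"
```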

Kubeadm CLI

init:

$ kubeadm init
--config <config>
--kubernetes-version <version> // see kubelet --version
--apiserver-advertise-address <master> // on a multi-NIC host, the IP of the NIC to advertise
--image-repository <registry> // default: k8s.gcr.io
--pod-network-cidr <cidr> // CIDR for the pod network
--service-cidr <cidr> // default: 10.96.0.0/12
--service-dns-domain // default: cluster.local
--cri-socket // required when more than one CRI is installed
--ignore-preflight-errors
--upload-certs

join:

$ kubeadm join <apiserver-address>:<port> --token <token> --discovery-token-ca-cert-hash <hash>

reset:

$ kubeadm reset -f/--force

token:

$ kubeadm token create/delete/generate/list
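
Bootstrap tokens have a fixed shape: six characters, a dot, then sixteen characters, all drawn from lowercase letters and digits. The sketch below checks a string against that shape before it is passed to `kubeadm join`; `is_bootstrap_token` is a hypothetical helper name.

```shell
# Succeeds when the argument looks like a bootstrap token: 6 chars, a dot, 16 chars.
is_bootstrap_token() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

for t in abcdef.0123456789abcdef ABCDEF.0123456789abcdef abcdef.0123; do
  if is_bootstrap_token "$t"; then echo "ok:  $t"; else echo "bad: $t"; fi
done
```

`kubeadm token generate` prints a random token in this format without creating it on the cluster.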

Deploy a Cluster

Deploy the master

Disable swap

$ sudo swapoff -a
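
`swapoff -a` only lasts until the next reboot. To keep swap off permanently, the swap entries in /etc/fstab also need to be disabled. The sketch below shows one way to comment them out; it works on a temporary file with made-up sample entries and assumes GNU sed.

```shell
# Temporary stand-in for /etc/fstab (assumption: sample entries are illustrative).
fstab="$(mktemp)"
cat > "$fstab" <<'EOF'
UUID=1111-aaaa /     ext4 defaults 0 1
/swapfile      none  swap sw       0 0
EOF

# Comment out every active line whose filesystem type field is "swap".
sed -i -E 's/^([^#].*[[:space:]]swap[[:space:]].*)$/#\1/' "$fstab"

cat "$fstab"
```

On a real host, back up /etc/fstab first and run the sed command with sudo against it.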

Initialize

$ sudo kubeadm init \
--pod-network-cidr=10.244.0.0/16 \
--apiserver-advertise-address=<IP> \
--kubernetes-version=v1.17.0 \
--image-repository=registry.aliyuncs.com/google_containers \
--cri-socket=/run/containerd/containerd.sock \
-v=6
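// --config: the defaults are usually fine.
// --pod-network-cidr=10.244.0.0/16 matches the default Network in flannel's kube-flannel.yml; the flag does not itself select a network plugin.
// --image-repository sets the registry; the default is k8s.gcr.io.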

Configure kubectl for the current user

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Deploy the flannel network plugin

// Install the CNI plugins on every node:
// <https://github.com/containernetworking/plugins/releases>
$ sudo mkdir -p /opt/cni/bin
// Download and extract all plugin binaries into this directory.

$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

Allow or prevent scheduling pods on the master

# allow pods to be scheduled on the master (by default they are not)
kubectl taint nodes --all node-role.kubernetes.io/master-

# prevent pods from being scheduled on the master
kubectl taint nodes <node> node-role.kubernetes.io/master=true:NoSchedule

Deploy a node

$ sudo swapoff -a

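// If a VPN is available, kubeadm downloads and installs these automatically.
// Otherwise install the CNI plugins on every node:
// <https://github.com/containernetworking/plugins/releases>
$ sudo mkdir -p /opt/cni/bin
// Download and extract all plugin binaries into this directory.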

$ sudo kubeadm join 192.168.1.1:6443 \
--token 8po0v5.m1qlbc7w0btq15of \
--discovery-token-ca-cert-hash sha256:21d8365e336d5218637ddf26e2ec5d91c7dd2de518dbe47973e089837b13265b
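
If the hash printed by `kubeadm init` has been lost, it can be recomputed from the cluster CA certificate: it is the SHA-256 digest of the CA's DER-encoded public key. The sketch below wraps that pipeline in a hypothetical helper, `ca_cert_hash`, and demonstrates it on a throwaway self-signed certificate. It assumes openssl is installed and that the CA uses an RSA key.

```shell
# Print the --discovery-token-ca-cert-hash value for a CA certificate (RSA key assumed).
ca_cert_hash() {
  hash="$(openssl x509 -pubkey -noout -in "$1" \
    | openssl rsa -pubin -outform der 2>/dev/null \
    | openssl dgst -sha256 -hex \
    | sed 's/^.* //')"
  echo "sha256:$hash"
}

# Demo on a throwaway self-signed certificate; on a real master use /etc/kubernetes/pki/ca.crt.
tmp="$(mktemp -d)"
openssl req -x509 -newkey rsa:2048 -nodes -subj "/CN=demo-ca" -days 1 \
  -keyout "$tmp/ca.key" -out "$tmp/ca.crt" 2>/dev/null
ca_cert_hash "$tmp/ca.crt"
```

Alternatively, `kubeadm token create --print-join-command` on the master creates a new token and prints a complete join command, hash included.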

Verify

$ kubectl get pods -n kube-system
$ kubectl get nodes
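
Nodes report NotReady until the network plugin is running, so it is useful to check the STATUS column rather than just the node count. The sketch below counts nodes that are not Ready; `count_not_ready` is a hypothetical helper, and the sample input is illustrative rather than captured from a real cluster.

```shell
# Count nodes whose STATUS column is not exactly "Ready".
# Reads `kubectl get nodes --no-headers` output on stdin.
count_not_ready() {
  awk '$2 != "Ready" { n++ } END { print n + 0 }'
}

# Illustrative sample input (not output captured from a real cluster).
sample='master1   Ready      master   10m   v1.18.6
node1     NotReady   <none>   2m    v1.18.6
node2     Ready      <none>   2m    v1.18.6'

printf '%s\n' "$sample" | count_not_ready
```

Against a live cluster the call would be `kubectl get nodes --no-headers | count_not_ready`. A cordoned node shows `Ready,SchedulingDisabled` and is counted as not Ready by this sketch.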

Delete the cluster

Run on every node:

$ sudo kubeadm reset -f
// Stops the kubelet automatically and deletes the following files and directories
[/etc/kubernetes/manifests /etc/kubernetes/pki]
[/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
[/var/lib/etcd /var/lib/kubelet /var/lib/dockershim /var/run/kubernetes /var/lib/cni]

Delete manually:

$ sudo rm -rf /etc/cni/net.d

Remove flannel's network configuration on every node

$ sudo ifconfig cni0 down
$ sudo ip link delete cni0
$ sudo ifconfig flannel.1 down
$ sudo ip link delete flannel.1
$ sudo rm -rf /run/flannel

Flush iptables on every node

$ sudo iptables -F
$ sudo iptables -X
$ sudo iptables -t nat -F
$ sudo iptables -t nat -X

If IPVS was used:

$ sudo ipvsadm --clear

Remove the kubectl configuration

$ rm -rf $HOME/.kube
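
The per-node teardown steps above can be collected into one script. The sketch below is a hedged example that only prints the commands by default; `run` and `cleanup_node` are hypothetical helper names, and the commands themselves are the ones listed in this section.

```shell
# Print commands instead of executing them; change to 0 to actually run them.
DRY_RUN=1
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

cleanup_node() {
  run sudo kubeadm reset -f
  run sudo rm -rf /etc/cni/net.d /run/flannel
  # Remove the interfaces created by the CNI bridge and flannel.
  for dev in cni0 flannel.1; do
    run sudo ip link set "$dev" down
    run sudo ip link delete "$dev"
  done
  # Flush rules and delete user-defined chains in the filter and nat tables.
  for table in filter nat; do
    run sudo iptables -t "$table" -F
    run sudo iptables -t "$table" -X
  done
}

cleanup_node
```

Reviewing the dry-run output first is deliberate: flushing iptables also removes any rules that are unrelated to Kubernetes.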

Deploy an HA Cluster

This HA setup requires haproxy and keepalived on every master node.

https://kubernetes.io/zh/docs/setup/production-environment/tools/kubeadm/high-availability/

https://github.com/kubernetes/kubeadm/blob/master/docs/ha-considerations.md#options-for-software-load-balancing

Initialize on master1:

$ sudo kubeadm init --config ./kubeadm.yaml -v=6 --upload-certs

Join the other masters:

$ sudo kubeadm join 192.168.1.200:8443 --token <token> --discovery-token-ca-cert-hash <hash> --control-plane --certificate-key <key>

Join worker nodes:

$ sudo kubeadm join 192.168.1.200:8443 --token <token> --discovery-token-ca-cert-hash <hash>

Configuration

https://kubernetes.io/zh/docs/setup/production-environment/tools/kubeadm/control-plane-flags/

Use a custom configuration:

$ sudo kubeadm init --config ./config.yaml -v=6

Print the default configuration:

$ kubeadm config print init-defaults

Configure kubeadm:

apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
localAPIEndpoint:
  advertiseAddress: 10.103.1.1 # master IP
  bindPort: 6443
nodeRegistration:
  criSocket: /run/containerd/containerd.sock
  name: debug # master hostname
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master

Configure Kubernetes:

# Customize the control plane:
# https://kubernetes.io/zh/docs/setup/production-environment/tools/kubeadm/control-plane-flags/
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: 10.58.203.200:8443 # VIP and port of haproxy in the HA setup
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  podSubnet: 10.244.0.0/16 # for flannel
imageRepository: k8s.gcr.io
kubernetesVersion: v1.18.6
controllerManager:
  ...
  # https://kubernetes.io/docs/reference/command-line-tools-reference/kube-controller-manager/
  extraArgs:
    allocate-node-cidrs: 'true'
    node-cidr-mask-size: '16' # flannel's SubnetLen
    cluster-cidr: '10.0.0.0/8' # flannel's Network
apiServer:
  timeoutForControlPlane: 4m0s
  # https://kubernetes.io/docs/reference/command-line-tools-reference/kube-apiserver/
  extraArgs:
    advertise-address: 192.168.0.103
    ...
scheduler:
  ...
  # https://kubernetes.io/docs/reference/command-line-tools-reference/kube-scheduler/
  extraArgs:
    ...
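
The relationship between cluster-cidr and node-cidr-mask-size in the controllerManager extraArgs above is simple arithmetic: the cluster CIDR is cut into per-node subnets of the given mask size. A short worked sketch, using the values from this config:

```shell
# Values taken from the extraArgs above.
cluster_prefix=8     # cluster-cidr 10.0.0.0/8
node_prefix=16       # node-cidr-mask-size 16

# Number of per-node subnets the cluster CIDR can be split into.
node_subnets=$(( 1 << (node_prefix - cluster_prefix) ))
# Number of addresses in each per-node subnet.
addrs_per_node=$(( 1 << (32 - node_prefix) ))

echo "node subnets: $node_subnets"
echo "addresses per node subnet: $addrs_per_node"
```

By the same formula, a /16 cluster CIDR combined with a node mask size of 16 would yield only a single node subnet, so the two values need to be chosen together.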

Change the kubelet's cgroup driver:

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
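
The InitConfiguration, ClusterConfiguration, and KubeletConfiguration fragments go into one file, separated by `---`, which is then passed to `kubeadm init --config`. The sketch below writes a trimmed-down version of such a file to a temporary path; the field values are copied from the fragments above and would need adjusting for a real cluster.

```shell
# Write a single config file holding all three documents, separated by "---".
out="$(mktemp)"
cat > "$out" <<'EOF'
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
nodeRegistration:
  criSocket: /run/containerd/containerd.sock
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.18.6
networking:
  podSubnet: 10.244.0.0/16
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
EOF

# Count the documents as a quick sanity check.
grep -c '^kind:' "$out"
```

The file would then be used as `sudo kubeadm init --config "$out" -v=6`.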