
Installing KubeSphere (v3.2.1) on Kubernetes (v1.21)

  • You can run this lab on any public cloud with pay-as-you-go billing.
  • This guide builds a single-master cluster, which is not suitable for production.
  • All node machines need Internet access.

Chapter 1: Host Planning

Role     IP address       hostname     OS                                   Spec                                 Extra disk
Master   192.168.65.100   k8s-master   CentOS 7.9 (infrastructure server)   4-core CPU, 8 GB RAM, 40 GB disk     -
Node1    192.168.65.101   k8s-node1    CentOS 7.9 (infrastructure server)   8-core CPU, 16 GB RAM, 40 GB disk    /dev/sdb 100G
Node2    192.168.65.102   k8s-node2    CentOS 7.9 (infrastructure server)   8-core CPU, 16 GB RAM, 40 GB disk    /dev/sdb 100G
Node3    192.168.65.103   k8s-node3    CentOS 7.9 (infrastructure server)   8-core CPU, 16 GB RAM, 40 GB disk    /dev/sdb 100G

Chapter 2: Installing Kubernetes

2.1 Kubernetes and Docker Version Compatibility

According to the documentation, Docker v20.10 corresponds to Kubernetes v1.21.

2.2 Prerequisites

  • If you are using virtual machines, they must all be able to reach one another; the simplest way is to disable the firewall.
systemctl stop firewalld
systemctl disable firewalld

2.3 Preparation

2.3.1 Upgrade the System Kernel

  • Check the current OS release:
cat /etc/redhat-release

  • Check the current kernel:
uname -sr
# Check the default boot entry (grub2-pc provides grub2-editenv)
yum install -y grub2-pc
grub2-editenv list
# List the available kernels and their boot order
sudo awk -F\' '$1=="menuentry " {print i++ " : " $2}' /boot/grub2/grub.cfg

The default 3.10.0 kernel is far too old, so it needs to be upgraded.

  • Enable the ELRepo repository on CentOS 7.x:
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
rpm -Uvh https://www.elrepo.org/elrepo-release-7.0-4.el7.elrepo.noarch.rpm

Tip: you can switch to the Tsinghua mirror instead:
sed -i "s/mirrorlist=/#mirrorlist=/g" /etc/yum.repos.d/elrepo.repo
sed -i "s#elrepo.org/linux#mirrors.tuna.tsinghua.edu.cn/elrepo#g" /etc/yum.repos.d/elrepo.repo

  • List the kernel packages available in the repository (optional):
yum --disablerepo="*" --enablerepo="elrepo-kernel" list available

  • Install the latest mainline kernel:
yum -y --enablerepo=elrepo-kernel install kernel-ml

  • Set the default kernel:
vim /etc/default/grub
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=0 # Change this line; it was originally "saved"
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=centos/root rd.lvm.lv=centos/swap rhgb quiet"
GRUB_DISABLE_RECOVERY="true"

Tip: alternatively, you can run grub2-set-default 0

  • Regenerate the GRUB configuration:
grub2-mkconfig -o /boot/grub2/grub.cfg

  • Reboot the system:
reboot

  • Check the kernel version again:
uname -sr

2.3.2 Set the Hostnames

  • Command:
hostnamectl set-hostname <hostname>
  • Examples:
# 192.168.65.100
hostnamectl set-hostname k8s-master
# 192.168.65.101
hostnamectl set-hostname k8s-node1
# 192.168.65.102
hostnamectl set-hostname k8s-node2
# 192.168.65.103
hostnamectl set-hostname k8s-node3

2.3.3 Hostname Resolution

  • To let the cluster nodes reach one another by name, configure hostname resolution; in an enterprise setting an internal DNS server is recommended.
cat >> /etc/hosts << EOF
127.0.0.1   $(hostname)
192.168.65.100 k8s-master
192.168.65.101 k8s-node1
192.168.65.102 k8s-node2
192.168.65.103 k8s-node3
EOF
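  • Optional sanity check: ping each node by name to confirm that resolution works:
ping -c 1 k8s-node1
ping -c 1 k8s-node2
ping -c 1 k8s-node3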

2.3.4 Time Synchronization

  • Kubernetes requires the clocks of all cluster nodes to be closely synchronized, so set up time synchronization on every node:
yum install ntpdate -y
ntpdate time.windows.com
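  • A one-shot ntpdate drifts over time; if you want periodic synchronization, one option is a cron entry (this assumes ntpdate is installed at /usr/sbin/ntpdate):
crontab -e
# add a line such as:
*/30 * * * * /usr/sbin/ntpdate time.windows.com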

2.3.5 Disable SELinux

  • Check whether SELinux is enabled:
getenforce
  • Disable SELinux permanently (requires a reboot):
sed -i 's/enforcing/disabled/' /etc/selinux/config
  • Disable SELinux for the current session only (does not survive a reboot):
setenforce 0

2.3.6 Disable the swap Partition

  • Disable swap permanently (requires a reboot):
sed -ri 's/.*swap.*/#&/' /etc/fstab
  • Disable swap for the current session only (does not survive a reboot):
swapoff -a

2.3.7 Pass Bridged IPv4 Traffic to iptables Chains

  • Edit the /etc/sysctl.conf file:
# If the settings already exist, modify them in place
sed -i "s#^net.ipv4.ip_forward.*#net.ipv4.ip_forward=1#g"  /etc/sysctl.conf
sed -i "s#^net.bridge.bridge-nf-call-ip6tables.*#net.bridge.bridge-nf-call-ip6tables=1#g"  /etc/sysctl.conf
sed -i "s#^net.bridge.bridge-nf-call-iptables.*#net.bridge.bridge-nf-call-iptables=1#g"  /etc/sysctl.conf
sed -i "s#^net.ipv6.conf.all.disable_ipv6.*#net.ipv6.conf.all.disable_ipv6=1#g"  /etc/sysctl.conf
sed -i "s#^net.ipv6.conf.default.disable_ipv6.*#net.ipv6.conf.default.disable_ipv6=1#g"  /etc/sysctl.conf
sed -i "s#^net.ipv6.conf.lo.disable_ipv6.*#net.ipv6.conf.lo.disable_ipv6=1#g"  /etc/sysctl.conf
sed -i "s#^net.ipv6.conf.all.forwarding.*#net.ipv6.conf.all.forwarding=1#g"  /etc/sysctl.conf
# If a setting is missing, append it
echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.conf
echo "net.bridge.bridge-nf-call-ip6tables = 1" >> /etc/sysctl.conf
echo "net.bridge.bridge-nf-call-iptables = 1" >> /etc/sysctl.conf
echo "net.ipv6.conf.all.disable_ipv6 = 1" >> /etc/sysctl.conf
echo "net.ipv6.conf.default.disable_ipv6 = 1" >> /etc/sysctl.conf
echo "net.ipv6.conf.lo.disable_ipv6 = 1" >> /etc/sysctl.conf
echo "net.ipv6.conf.all.forwarding = 1"  >> /etc/sysctl.conf
  • Load the br_netfilter module:
modprobe br_netfilter
  • Apply the settings from /etc/sysctl.conf so they take effect immediately and persist across reboots:
sysctl -p
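  • To verify that the settings took effect, query them directly; each should print a value of 1:
sysctl net.ipv4.ip_forward net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables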

2.3.8 Enable ipvs

  • In Kubernetes, a Service has two proxy modes: one based on iptables and one based on ipvs.
  • ipvs performs better than iptables, but to use it the ipvs kernel modules must be loaded manually.
  • Install ipset and ipvsadm on all machines:
yum -y install ipset ipvsadm
  • Run the following script on all machines:
cat > /etc/sysconfig/modules/ipvs.modules <<EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack
EOF
  • Make the script executable, run it, and check that the modules loaded:
chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules && lsmod | grep -e ip_vs -e nf_conntrack_ipv4
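  • Note: on kernels 4.19 and later (including the mainline kernel installed in section 2.3.1), nf_conntrack_ipv4 was merged into nf_conntrack, so a check along these lines is more reliable there:
lsmod | grep -e ip_vs -e nf_conntrack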

2.3.9 Reboot

  • Reboot all machines:
reboot

2.4 Install Docker

Docker must be installed on all machines.

  • Uninstall old versions:
yum remove docker \
           docker-client \
           docker-client-latest \
           docker-common \
           docker-latest \
           docker-latest-logrotate \
           docker-logrotate \
           docker-engine

  • Install gcc and related build tools via yum:
yum -y install gcc
yum -y install gcc-c++

  • Install the required helper packages:
yum -y install yum-utils

  • Configure the stable repository (Aliyun mirror):
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo

  • Refresh the yum package index:
yum makecache fast

  • List the Docker versions available in the repository (optional):
yum list docker-ce --showduplicates | sort -r

  • Install the pinned Docker version (v20.10):
yum -y install docker-ce-3:20.10.8-3.el7.x86_64 docker-ce-cli-3:20.10.8-3.el7.x86_64 containerd.io

  • Start Docker:
# Start Docker
systemctl start docker
# Enable start on boot
systemctl enable docker

  • Verify that Docker installed successfully:
docker version

  • Configure the Aliyun registry mirror:
sudo mkdir -p /etc/docker
sudo tee /etc/docker/daemon.json <<-'EOF'
{
  "exec-opts": ["native.cgroupdriver=systemd"], 
  "registry-mirrors": ["https://du3ia00u.mirror.aliyuncs.com"], 
  "live-restore": true,
  "log-driver":"json-file",
  "log-opts": {"max-size":"500m", "max-file":"3"},
  "max-concurrent-downloads": 10,
  "max-concurrent-uploads": 5,
  "storage-driver": "overlay2"
}
EOF
sudo systemctl daemon-reload
sudo systemctl restart docker
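  • You can confirm that Docker picked up the systemd cgroup driver, which is the value kubelet will be configured to match in section 2.6:
docker info | grep -i "cgroup driver"
# expected: Cgroup Driver: systemd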

2.5 Add the Aliyun Kubernetes YUM Repository

  • The default Kubernetes package source is hosted abroad and is very slow to reach from China, so switch to the Aliyun mirror (run the following on all machines):
cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

2.6 Install kubelet, kubeadm, and kubectl

  • Kubernetes architecture diagram (figure omitted).

  • Install kubelet, kubeadm, and kubectl on all machines:
yum install -y kubelet-1.21.10 kubeadm-1.21.10 kubectl-1.21.10

  • To keep the cgroup driver used by Docker and the one used by kubelet consistent, edit the /etc/sysconfig/kubelet file on all machines:
vim /etc/sysconfig/kubelet
# Change the contents to:
KUBELET_EXTRA_ARGS="--cgroup-driver=systemd"
KUBE_PROXY_MODE="ipvs"

  • Simply enable kubelet to start on boot; it will not run yet because no configuration has been generated, and it starts automatically once the cluster is initialized:
systemctl enable kubelet
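  • Optionally verify the installed versions:
kubeadm version -o short
kubelet --version
kubectl version --client --short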

2.7 List the Images Required by Kubernetes (Optional)

  • List the images required for the Kubernetes installation:
kubeadm config images list

2.8 Pull the Images Required by Kubernetes

  • Pull the required images with Docker on all machines:
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.21.10
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.21.10
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.21.10
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.21.10
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.4.1
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.4.13-0
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:v1.8.0
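  • The pulls can also be scripted; a minimal sketch that assumes the image list from section 2.7:
for img in kube-apiserver:v1.21.10 kube-controller-manager:v1.21.10 kube-scheduler:v1.21.10 kube-proxy:v1.21.10 pause:3.4.1 etcd:3.4.13-0 coredns:v1.8.0; do
  docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/$img
done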

  • Re-tag the coredns image; kubeadm expects it under google_containers/coredns/coredns, while the Aliyun mirror hosts it as google_containers/coredns:
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:v1.8.0 registry.cn-hangzhou.aliyuncs.com/google_containers/coredns/coredns:v1.8.0

2.9 Deploy the Kubernetes Master Node

  • Deploy the Kubernetes master on the k8s-master machine (192.168.65.100):
# The default registry k8s.gcr.io is unreachable from China, so point kubeadm at the Aliyun mirror
kubeadm init \
  --apiserver-advertise-address=192.168.65.100 \
  --image-repository=registry.cn-hangzhou.aliyuncs.com/google_containers \
  --kubernetes-version=v1.21.10 \
  --service-cidr=10.96.0.0/16 \
  --pod-network-cidr=10.244.0.0/16

Note:

  • apiserver-advertise-address must be the host's actual IP address.
  • The apiserver-advertise-address, service-cidr, and pod-network-cidr ranges must not overlap.

  • Log output:
Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.65.100:6443 --token 5oqv3n.4n2ak6e1y4h35cra \
    --discovery-token-ca-cert-hash sha256:d82d66af9a8b1ef328501eb082235c65627be53918cb910501e088a78c766425
  • Following the log output, run the commands below on k8s-master (192.168.65.100):
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# If you are the root user, you can also run
export KUBECONFIG=/etc/kubernetes/admin.conf

  • By default a bootstrap token is valid for 24 hours; once it expires it can no longer be used, and you can create a new one with:
kubeadm token create --print-join-command
# Generate a token that never expires
kubeadm token create --ttl 0 --print-join-command
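  • If you still have a valid token but lost the discovery hash, it can be recomputed from the cluster CA certificate:
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'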

2.10 Deploy the Kubernetes Worker Nodes

  • Following the log output, run the join command on k8s-node1 (192.168.65.101), k8s-node2 (192.168.65.102), and k8s-node3 (192.168.65.103):
kubeadm join 192.168.65.100:6443 --token 5oqv3n.4n2ak6e1y4h35cra \
    --discovery-token-ca-cert-hash sha256:d82d66af9a8b1ef328501eb082235c65627be53918cb910501e088a78c766425

2.11 Deploy the Network Plugin

kubectl apply -f https://projectcalico.docs.tigera.io/v3.19/manifests/calico.yaml

Note: see here for the reason v3.19 is used.

  • Watch the rollout of the CNI network plugin:
watch -n 1 kubectl get pod -n kube-system

  • List the pods once the rollout settles:
kubectl get pods -n kube-system

2.12 Check the Node Status

  • On the master node (192.168.65.100), check the node status:
kubectl get nodes
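  • The worker nodes show <none> in the ROLES column by default. If you want them labeled as workers (the KubeEdge settings in section 4.2 also select nodes by this label), you can add the role yourself:
kubectl label node k8s-node1 node-role.kubernetes.io/worker=
kubectl label node k8s-node2 node-role.kubernetes.io/worker=
kubectl label node k8s-node3 node-role.kubernetes.io/worker=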

2.13 Switch kube-proxy to ipvs Mode

  • On the master node (192.168.65.100), edit the kube-proxy ConfigMap to enable ipvs mode:
kubectl edit cm kube-proxy -n kube-system
apiVersion: v1
data:
  config.conf: |-
    apiVersion: kubeproxy.config.k8s.io/v1alpha1
    bindAddress: 0.0.0.0
    bindAddressHardFail: false
    clientConnection:
      acceptContentTypes: ""
      burst: 0
      contentType: ""
      kubeconfig: /var/lib/kube-proxy/kubeconfig.conf
      qps: 0
    clusterCIDR: 10.244.0.0/16
    configSyncPeriod: 0s
    conntrack:
      maxPerCore: null
      min: null
      tcpCloseWaitTimeout: null
      tcpEstablishedTimeout: null
    detectLocalMode: ""
    enableProfiling: false
    healthzBindAddress: ""
    hostnameOverride: ""
    iptables:
      masqueradeAll: false
      masqueradeBit: null
      minSyncPeriod: 0s
      syncPeriod: 0s
    ipvs:
      excludeCIDRs: null
      minSyncPeriod: 0s
      scheduler: ""
      strictARP: false
      syncPeriod: 0s
      tcpFinTimeout: 0s
      tcpTimeout: 0s
      udpTimeout: 0s
    kind: KubeProxyConfiguration
    metricsBindAddress: ""
    mode: "ipvs" # Change this; the default is ""
    nodePortAddresses: null
...

  • Delete the existing kube-proxy pods so that the cluster recreates them with the new configuration:
kubectl delete pod -l k8s-app=kube-proxy -n kube-system
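  • To confirm that ipvs mode is active, check the logs of the new kube-proxy pods or list the ipvs rules (ipvsadm was installed in section 2.3.8):
kubectl logs -n kube-system -l k8s-app=kube-proxy | grep -i ipvs
ipvsadm -Ln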

Chapter 3: KubeSphere Prerequisites

3.1 Install metrics-server (v0.6.1)

  • Install metrics-server on the master node (192.168.65.100):
vi k8s-metrics.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
    rbac.authorization.k8s.io/aggregate-to-view: "true"
  name: system:aggregated-metrics-reader
rules:
- apiGroups:
  - metrics.k8s.io
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
rules:
- apiGroups:
  - ""
  resources:
  - nodes/metrics
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:metrics-server
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  ports:
  - name: https
    port: 443
    protocol: TCP
    targetPort: https
  selector:
    k8s-app: metrics-server
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  strategy:
    rollingUpdate:
      maxUnavailable: 0
  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls # do not verify kubelet serving certificates
        image: bitnami/metrics-server:0.6.1   # mirror of k8s.gcr.io/metrics-server/metrics-server:v0.6.1
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /livez
            port: https
            scheme: HTTPS
          periodSeconds: 10
        name: metrics-server
        ports:
        - containerPort: 4443
          name: https
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /readyz
            port: https
            scheme: HTTPS
          initialDelaySeconds: 20
          periodSeconds: 10
        resources:
          requests:
            cpu: 100m
            memory: 200Mi
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1000
        volumeMounts:
        - mountPath: /tmp
          name: tmp-dir
      nodeSelector:
        kubernetes.io/os: linux
      priorityClassName: system-cluster-critical
      serviceAccountName: metrics-server
      volumes:
      - emptyDir: {}
        name: tmp-dir
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  labels:
    k8s-app: metrics-server
  name: v1beta1.metrics.k8s.io
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
  version: v1beta1
  versionPriority: 100
  • Apply the manifest:
kubectl apply -f k8s-metrics.yaml

  • Verify that the installation succeeded:
kubectl top nodes --use-protocol-buffers
kubectl top pods --use-protocol-buffers

3.2 Install the GlusterFS Storage System

Note:

  • NFS storage is not recommended in production (especially on Kubernetes 1.20 or later); it can cause failed to obtain lock and input/output errors, leaving Pods in CrashLoopBackOff. Some applications, such as Prometheus, are also incompatible with NFS.
  • NFS storage is fine for development and testing.

3.2.1 Add Blank Disks in VMware

  • In VMware, add a disk to the k8s-node1, k8s-node2, and k8s-node3 nodes; only k8s-node1 is shown here, and the other nodes follow the same steps:

  • After adding the disks, reboot k8s-node1, k8s-node2, and k8s-node3 so that the machines detect the new disks:
reboot

  • Inspect the newly added disk:
lsblk -f

Note: cloud providers require the disk to be zeroed; if lsblk -f reports the disk as vdc, run dd if=/dev/zero of=/dev/vdc bs=1M status=progress.

3.2.2 Install GlusterFS on All Node Machines

  • Configure the yum repository on all Node machines:
yum -y install centos-release-gluster

  • Install the GlusterFS server on all Node machines:
yum -y install glusterfs glusterfs-server glusterfs-fuse

  • Verify the installation:
glusterfs -V

  • Start the glusterd service and enable it at boot:
systemctl enable glusterd && systemctl start glusterd

  • Check that the glusterd service started successfully:
systemctl status glusterd

3.2.3 Create the GlusterFS Cluster

  • From any node, probe the other nodes to form the GlusterFS cluster; k8s-node1 is used here:
gluster peer probe k8s-node2
gluster peer probe k8s-node3

  • Verify that all nodes in the cluster connected successfully:
gluster peer status

3.2.4 Set Up Passwordless SSH Between the Node Machines

  • On k8s-node1, run:
ssh-keygen
ssh-copy-id root@k8s-node2
ssh-copy-id root@k8s-node3

3.2.5 Deploy Heketi

GlusterFS does not provide an API of its own, so you can install Heketi to manage the lifecycle of GlusterFS volumes through a RESTful API that Kubernetes can call. With it, your Kubernetes cluster can provision GlusterFS volumes dynamically. This example installs Heketi v7.0.0.

  • Download Heketi on k8s-node1 (if the network is unreliable, use the attached heketi.zip):
wget https://github.com/heketi/heketi/releases/download/v7.0.0/heketi-v7.0.0.linux.amd64.tar.gz

  • Extract Heketi on k8s-node1:
tar -zxvf heketi-v7.0.0.linux.amd64.tar.gz
cd heketi
cp heketi /usr/bin
cp heketi-cli /usr/bin

  • Create the Heketi service file on k8s-node1:
vi /lib/systemd/system/heketi.service
[Unit]
Description=Heketi Server
[Service]
Type=simple
WorkingDirectory=/var/lib/heketi
ExecStart=/usr/bin/heketi --config=/etc/heketi/heketi.json
Restart=on-failure
StandardOutput=syslog
StandardError=syslog
[Install]
WantedBy=multi-user.target

  • Create the Heketi directories on k8s-node1:
mkdir -p /var/lib/heketi
mkdir -p /etc/heketi

  • Create the JSON file that configures Heketi on k8s-node1:
vi /etc/heketi/heketi.json
{
  "_port_comment": "Heketi Server Port Number",
  "port": "8080",

  "_use_auth": "Enable JWT authorization. Please enable for deployment",
  "use_auth": false,

  "_jwt": "Private keys for access",
  "jwt": {
    "_admin": "Admin has access to all APIs",
    "admin": {
      "key": "123456"
    },
    "_user": "User only has access to /volumes endpoint",
    "user": {
      "key": "123456"
    }
  },

  "_glusterfs_comment": "GlusterFS Configuration",
  "glusterfs": {
    "_executor_comment": [
      "Execute plugin. Possible choices: mock, ssh",
      "mock: This setting is used for testing and development.",
      "      It will not send commands to any node.",
      "ssh:  This setting will notify Heketi to ssh to the nodes.",
      "      It will need the values in sshexec to be configured.",
      "kubernetes: Communicate with GlusterFS containers over",
      "            Kubernetes exec api."
    ],
    "executor": "ssh",

    "_sshexec_comment": "SSH username and private key file information",
    "sshexec": {
      "keyfile": "/root/.ssh/id_rsa",
      "user": "root"
    },

    "_kubeexec_comment": "Kubernetes configuration",
    "kubeexec": {
      "host" :"https://kubernetes.host:8443",
      "cert" : "/path/to/crt.file",
      "insecure": false,
      "user": "kubernetes username",
      "password": "password for kubernetes user",
      "namespace": "Kubernetes namespace",
      "fstab": "Optional: Specify fstab file on node.  Default is /etc/fstab"
    },

    "_db_comment": "Database file name",
    "db": "/var/lib/heketi/heketi.db",
    "brick_max_size_gb" : 1024,
 "brick_min_size_gb" : 1,
 "max_bricks_per_volume" : 33,


    "_loglevel_comment": [
      "Set log level. Choices are:",
      "  none, critical, error, warning, info, debug",
      "Default is warning"
    ],
    "loglevel" : "debug"
  }
}

When installing GlusterFS as the storage type for a KubeSphere cluster, you must provide the admin account and its Secret value.

  • Start Heketi on k8s-node1:
systemctl start heketi

  • Check Heketi's status on k8s-node1:
systemctl status heketi

  • Enable Heketi at boot on k8s-node1:
systemctl enable heketi

  • On k8s-node1, create the topology file for Heketi; it describes the clusters, nodes, and disks that will be added to Heketi.
vi /etc/heketi/topology.json
{
    "clusters": [
        {
            "nodes": [
                {
                    "node": {
                        "hostnames": {
                            "manage": [
                                "192.168.65.101" 
                            ],
                            "storage": [
                                "192.168.65.101" 
                            ]
                        },
                        "zone": 1
                    },
                    "devices": [
                        "/dev/sdb" 
                    ]
                },
                {
                    "node": {
                        "hostnames": {
                            "manage": [
                                "192.168.65.102" 
                            ],
                            "storage": [
                                "192.168.65.102"
                            ]
                        },
                        "zone": 1
                    },
                    "devices": [
                        "/dev/sdb" 
                    ]
                },
                {
                    "node": {
                        "hostnames": {
                            "manage": [
                                "192.168.65.103"
                            ],
                            "storage": [
                                "192.168.65.103"  
                            ]
                        },
                        "zone": 1
                    },
                    "devices": [
                        "/dev/sdb"
                    ]
                }
            ]
        }
    ]
}
  • Replace the IP addresses above with your own.
  • Put your own disk names under devices.

  • Load the Heketi topology file on k8s-node1:
export HEKETI_CLI_SERVER=http://localhost:8080
heketi-cli topology load --json=/etc/heketi/topology.json

The output also shows the cluster ID and node IDs.

  • View the cluster information on k8s-node1:
heketi-cli cluster info 3df3bf32cf1dd7c047f46725facd814c # Use your own cluster ID.
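  • You can also dump the full topology to confirm that every node and its /dev/sdb device were added:
heketi-cli topology info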

3.2.6 Create a Kubernetes StorageClass Backed by GlusterFS

  • Create the StorageClass on the k8s-master node:
vi glusterfs-sc.yaml
apiVersion: v1
kind: Secret
metadata:
  name: heketi-secret
  namespace: kube-system
type: kubernetes.io/glusterfs
data:
  key: "MTIzNDU2"    # 请替换为您自己的密钥。Base64 编码。 echo -n "123456" | base64
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    storageclass.beta.kubernetes.io/is-default-class: "true"
    storageclass.kubesphere.io/supported-access-modes: '["ReadWriteOnce","ReadOnlyMany","ReadWriteMany"]'
  name: glusterfs
parameters:
  clusterid: "3df3bf32cf1dd7c047f46725facd814c"    # Replace with your own GlusterFS cluster ID.
  gidMax: "50000"
  gidMin: "40000"
  restauthenabled: "true"
  resturl: "http://192.168.65.101:8080"    # Gluster REST / Heketi service URL used to provision volumes on demand. Replace with your own URL.
  restuser: admin
  secretName: heketi-secret
  secretNamespace: kube-system
  volumetype: "replicate:3"    # Replace with your own volume type.
provisioner: kubernetes.io/glusterfs
reclaimPolicy: Delete
volumeBindingMode: Immediate
allowVolumeExpansion: true
  • Apply the manifest:
kubectl apply -f glusterfs-sc.yaml
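  • To confirm that dynamic provisioning works, you can create a throwaway PVC (pvc-test is a hypothetical name) and check that it becomes Bound:
cat > test-pvc.yaml << EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-test
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
EOF
kubectl apply -f test-pvc.yaml
kubectl get pvc pvc-test
kubectl delete -f test-pvc.yaml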

Chapter 4: Installing KubeSphere (v3.2.1)

4.1 Download the Core Configuration Files

wget https://github.com/kubesphere/ks-installer/releases/download/v3.2.1/kubesphere-installer.yaml
wget https://github.com/kubesphere/ks-installer/releases/download/v3.2.1/cluster-configuration.yaml

4.2 Modify cluster-configuration

  • On k8s-master, edit the cluster-configuration.yaml file to specify the features we want to enable:
vi cluster-configuration.yaml
---
apiVersion: installer.kubesphere.io/v1alpha1
kind: ClusterConfiguration
metadata:
  name: ks-installer
  namespace: kubesphere-system
  labels:
    version: v3.2.1
spec:
  persistence:
    storageClass: ""        # If there is no default StorageClass in your cluster, you need to specify an existing StorageClass here.
  authentication:
    jwtSecret: ""           # Keep the jwtSecret consistent with the Host Cluster. Retrieve the jwtSecret by executing "kubectl -n kubesphere-system get cm kubesphere-config -o yaml | grep -v "apiVersion" | grep jwtSecret" on the Host Cluster.
  local_registry: ""        # Add your private registry address if it is needed.
  # dev_tag: ""               # Add your kubesphere image tag you want to install, by default it's same as ks-install release version.
  etcd:
    monitoring: true       # Enable or disable etcd monitoring dashboard installation. You have to create a Secret for etcd before you enable it.
    endpointIps: localhost  # etcd cluster EndpointIps. It can be a bunch of IPs here.
    port: 2379              # etcd port.
    tlsEnable: true
  common:
    core:
      console:
        enableMultiLogin: true  # Enable or disable simultaneous logins. It allows different users to log in with the same account at the same time.
        port: 30880
        type: NodePort
    # apiserver:            # Enlarge the apiserver and controller manager's resource requests and limits for the large cluster
    #  resources: {}
    # controllerManager:
    #  resources: {}
    redis:
      enabled: true
      volumeSize: 2Gi # Redis PVC size.
    openldap:
      enabled: true
      volumeSize: 2Gi   # openldap PVC size.
    minio:
      volumeSize: 20Gi # Minio PVC size.
    monitoring:
      # type: external   # Whether to specify the external prometheus stack, and need to modify the endpoint at the next line.
      endpoint: http://prometheus-operated.kubesphere-monitoring-system.svc:9090 # Prometheus endpoint to get metrics data.
      GPUMonitoring:     # Enable or disable the GPU-related metrics. If you enable this switch but have no GPU resources, Kubesphere will set it to zero. 
        enabled: false
    gpu:                 # Install GPUKinds. The default GPU kind is nvidia.com/gpu. Other GPU kinds can be added here according to your needs. 
      kinds:         
      - resourceName: "nvidia.com/gpu"
        resourceType: "GPU"
        default: true
    es:   # Storage backend for logging, events and auditing.
      # master:
      #   volumeSize: 4Gi  # The volume size of Elasticsearch master nodes.
      #   replicas: 1      # The total number of master nodes. Even numbers are not allowed.
      #   resources: {}
      # data:
      #   volumeSize: 20Gi  # The volume size of Elasticsearch data nodes.
      #   replicas: 1       # The total number of data nodes.
      #   resources: {}
      logMaxAge: 7             # Log retention time in built-in Elasticsearch. It is 7 days by default.
      elkPrefix: logstash      # The string making up index names. The index name will be formatted as ks-<elk_prefix>-log.
      basicAuth:
        enabled: false
        username: ""
        password: ""
      externalElasticsearchUrl: ""
      externalElasticsearchPort: ""
  alerting:                # (CPU: 0.1 Core, Memory: 100 MiB) It enables users to customize alerting policies to send messages to receivers in time with different time intervals and alerting levels to choose from.
    enabled: true         # Enable or disable the KubeSphere Alerting System.
    # thanosruler:
    #   replicas: 1
    #   resources: {}
  auditing:                # Provide a security-relevant chronological set of records,recording the sequence of activities happening on the platform, initiated by different tenants.
    enabled: true         # Enable or disable the KubeSphere Auditing Log System.
    # operator:
    #   resources: {}
    # webhook:
    #   resources: {}
  devops:                  # (CPU: 0.47 Core, Memory: 8.6 G) Provide an out-of-the-box CI/CD system based on Jenkins, and automated workflow tools including Source-to-Image & Binary-to-Image.
    enabled: true             # Enable or disable the KubeSphere DevOps System.
    # resources: {}
    jenkinsMemoryLim: 2Gi      # Jenkins memory limit.
    jenkinsMemoryReq: 1500Mi   # Jenkins memory request.
    jenkinsVolumeSize: 8Gi     # Jenkins volume size.
    jenkinsJavaOpts_Xms: 512m  # The following three fields are JVM parameters.
    jenkinsJavaOpts_Xmx: 512m
    jenkinsJavaOpts_MaxRAM: 2g
  events:                  # Provide a graphical web console for Kubernetes Events exporting, filtering and alerting in multi-tenant Kubernetes clusters.
    enabled: true         # Enable or disable the KubeSphere Events System.
    # operator:
    #   resources: {}
    # exporter:
    #   resources: {}
    # ruler:
    #   enabled: true
    #   replicas: 2
    #   resources: {}
  logging:                 # (CPU: 57 m, Memory: 2.76 G) Flexible logging functions are provided for log query, collection and management in a unified console. Additional log collectors can be added, such as Elasticsearch, Kafka and Fluentd.
    enabled: true         # Enable or disable the KubeSphere Logging System.
    containerruntime: docker
    logsidecar:
      enabled: true
      replicas: 2
      # resources: {}
  metrics_server:                    # (CPU: 56 m, Memory: 44.35 MiB) It enables HPA (Horizontal Pod Autoscaler).
    enabled: false                   # Enable or disable metrics-server.
  monitoring:
    storageClass: ""                 # If there is an independent StorageClass you need for Prometheus, you can specify it here. The default StorageClass is used by default.
    # kube_rbac_proxy:
    #   resources: {}
    # kube_state_metrics:
    #   resources: {}
    # prometheus:
    #   replicas: 1  # Prometheus replicas are responsible for monitoring different segments of data source and providing high availability.
    #   volumeSize: 20Gi  # Prometheus PVC size.
    #   resources: {}
    #   operator:
    #     resources: {}
    #   adapter:
    #     resources: {}
    # node_exporter:
    #   resources: {}
    # alertmanager:
    #   replicas: 1          # AlertManager Replicas.
    #   resources: {}
    # notification_manager:
    #   resources: {}
    #   operator:
    #     resources: {}
    #   proxy:
    #     resources: {}
    gpu:                           # GPU monitoring-related plug-in installation. 
      nvidia_dcgm_exporter:        # Ensure that gpu resources on your hosts can be used normally, otherwise this plug-in will not work properly.
        enabled: false             # Check whether the labels on the GPU hosts contain "nvidia.com/gpu.present=true" to ensure that the DCGM pod is scheduled to these nodes.
        # resources: {}
  multicluster:
    clusterRole: none  # host | member | none  # You can install a solo cluster, or specify it as the Host or Member Cluster.
  network:
    networkpolicy: # Network policies allow network isolation within the same cluster, which means firewalls can be set up between certain instances (Pods).
      # Make sure that the CNI network plugin used by the cluster supports NetworkPolicy. There are a number of CNI network plugins that support NetworkPolicy, including Calico, Cilium, Kube-router, Romana and Weave Net.
      enabled: true # Enable or disable network policies.
    ippool: # Use Pod IP Pools to manage the Pod network address space. Pods to be created can be assigned IP addresses from a Pod IP Pool.
      type: none # Specify "calico" for this field if Calico is used as your CNI plugin. "none" means that Pod IP Pools are disabled.
    topology: # Use Service Topology to view Service-to-Service communication based on Weave Scope.
      type: none # Specify "weave-scope" for this field to enable Service Topology. "none" means that Service Topology is disabled.
  openpitrix: # An App Store that is accessible to all platform tenants. You can use it to manage apps across their entire lifecycle.
    store:
      enabled: true # Enable or disable the KubeSphere App Store.
  servicemesh:         # (0.3 Core, 300 MiB) Provide fine-grained traffic management, observability and tracing, and visualized traffic topology.
    enabled: true     # Base component (pilot). Enable or disable KubeSphere Service Mesh (Istio-based).
  kubeedge:          # Add edge nodes to your cluster and deploy workloads on edge nodes.
    enabled: true   # Enable or disable KubeEdge.
    cloudCore:
      nodeSelector: {"node-role.kubernetes.io/worker": ""}
      tolerations: []
      cloudhubPort: "10000"
      cloudhubQuicPort: "10001"
      cloudhubHttpsPort: "10002"
      cloudstreamPort: "10003"
      tunnelPort: "10004"
      cloudHub:
        advertiseAddress: # At least a public IP address or an IP address which can be accessed by edge nodes must be provided.
          - ""            # Note that once KubeEdge is enabled, CloudCore will malfunction if the address is not provided.
        nodeLimit: "100"
      service:
        cloudhubNodePort: "30000"
        cloudhubQuicNodePort: "30001"
        cloudhubHttpsNodePort: "30002"
        cloudstreamNodePort: "30003"
        tunnelNodePort: "30004"
    edgeWatcher:
      nodeSelector: {"node-role.kubernetes.io/worker": ""}
      tolerations: []
      edgeWatcherAgent:
        nodeSelector: {"node-role.kubernetes.io/worker": ""}
        tolerations: []

4.3 Run the Installation

  • Run the installation on k8s-master:
kubectl apply -f kubesphere-installer.yaml
kubectl apply -f cluster-configuration.yaml

  • Watch the installation progress on k8s-master:
kubectl logs -n kubesphere-system $(kubectl get pod -n kubesphere-system -l app=ks-install -o jsonpath='{.items[0].metadata.name}') -f
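  • Once the installer finishes, the log prints the console address. The console listens on NodePort 30880 (set in section 4.2), and KubeSphere's documented default account is admin / P@88w0rd:
kubectl get svc ks-console -n kubesphere-system
# then open http://192.168.65.100:30880 in a browser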

4.4 Fix the Missing etcd Monitoring Certificates

  • On k8s-master, create the Secret for the etcd monitoring certificates so the dashboard can find them:
kubectl -n kubesphere-monitoring-system create secret generic kube-etcd-client-certs  --from-file=etcd-client-ca.crt=/etc/kubernetes/pki/etcd/ca.crt  --from-file=etcd-client.crt=/etc/kubernetes/pki/apiserver-etcd-client.crt  --from-file=etcd-client.key=/etc/kubernetes/pki/apiserver-etcd-client.key
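  • Verify that the Secret was created:
kubectl get secret kube-etcd-client-certs -n kubesphere-monitoring-system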

Updated: 2023-04-01 06:17:22
Source: https://www.yuque.com/fairy-era/yg511q/renxuz
