pod污点taint 与容忍度tolerations详解

更新时间：2022年11月07日 17:05:26 作者：人生的哲理

这篇文章主要为大家介绍了pod污点taint与容忍度tolerations示例详解，有需要的朋友可以借鉴参考下，希望能够有所帮助，祝大家多多进步，早日升职加薪

一.系统环境

服务器版本	docker软件版本	Kubernetes(k8s)集群版本	CPU架构
CentOS Linux release 7.4.1708 (Core)	Docker version 20.10.12	v1.21.9	x86_64

Kubernetes集群架构：k8scloude1作为master节点，k8scloude2，k8scloude3作为worker节点

服务器	操作系统版本	CPU架构	进程	功能描述
k8scloude1/192.168.110.130	CentOS Linux release 7.4.1708 (Core)	x86_64	docker，kube-apiserver，etcd，kube-scheduler，kube-controller-manager，kubelet，kube-proxy，coredns，calico	k8s master节点
k8scloude2/192.168.110.129	CentOS Linux release 7.4.1708 (Core)	x86_64	docker，kubelet，kube-proxy，calico	k8s worker节点
k8scloude3/192.168.110.128	CentOS Linux release 7.4.1708 (Core)	x86_64	docker，kubelet，kube-proxy，calico	k8s worker节点

二.前言

本文介绍污点taint 与容忍度tolerations，可以影响pod的调度。

使用污点taint 与容忍度tolerations的前提是已经有一套可以正常运行的Kubernetes集群，关于Kubernetes(k8s)集群的安装部署，可以查看博客《Centos7 安装部署Kubernetes(k8s)集群》

三.污点taint

3.1 污点taint概览

节点亲和性是 Pod 的一种属性，它使 Pod 被吸引到一类特定的节点（这可能出于一种偏好，也可能是硬性要求）。污点（Taint）则相反——它使节点能够排斥一类特定的 Pod。

3.2 给节点添加污点taint

给节点增加一个污点的语法如下：给节点 node1 增加一个污点，它的键名是 key1，键值是 value1，效果是 NoSchedule。这表示只有拥有和这个污点相匹配的容忍度的 Pod 才能够被分配到 node1 这个节点。

#污点的格式：键=值:NoSchedule
kubectl taint nodes node1 key1=value1:NoSchedule
#只有键没有值的话，格式为：键:NoSchedule
kubectl taint nodes node1 key1:NoSchedule

移除污点语法如下：

kubectl taint nodes node1 key1=value1:NoSchedule-

节点的描述信息里有一个Taints字段，Taints字段表示节点有没有污点

[root@k8scloude1 deploy]# kubectl get nodes -o wide
NAME         STATUS   ROLES                  AGE   VERSION   INTERNAL-IP       EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION          CONTAINER-RUNTIME
k8scloude1   Ready    control-plane,master   8d    v1.21.0   192.168.110.130   <none>        CentOS Linux 7 (Core)   3.10.0-693.el7.x86_64   docker://20.10.12
k8scloude2   Ready    <none>                 8d    v1.21.0   192.168.110.129   <none>        CentOS Linux 7 (Core)   3.10.0-693.el7.x86_64   docker://20.10.12
k8scloude3   Ready    <none>                 8d    v1.21.0   192.168.110.128   <none>        CentOS Linux 7 (Core)   3.10.0-693.el7.x86_64   docker://20.10.12
[root@k8scloude1 deploy]# kubectl describe nodes k8scloude1
Name:               k8scloude1
Roles:              control-plane,master
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=k8scloude1
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/control-plane=
                    node-role.kubernetes.io/master=
                    node.kubernetes.io/exclude-from-external-load-balancers=
Annotations:        kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                    node.alpha.kubernetes.io/ttl: 0
                    projectcalico.org/IPv4Address: 192.168.110.130/24
                    projectcalico.org/IPv4IPIPTunnelAddr: 10.244.158.64
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Sun, 09 Jan 2022 16:19:06 +0800
Taints:             node-role.kubernetes.io/master:NoSchedule
Unschedulable:      false
......

查看节点是否有污点，Taints: node-role.kubernetes.io/master:NoSchedule表示k8s集群的master节点有污点，这是默认就存在的污点，这也是master节点为什么不能运行应用pod的原因。

[root@k8scloude1 deploy]# kubectl describe nodes k8scloude2 | grep -i Taints
Taints:             <none>
[root@k8scloude1 deploy]# kubectl describe nodes k8scloude1 | grep -i Taints
Taints:             node-role.kubernetes.io/master:NoSchedule
[root@k8scloude1 deploy]# kubectl describe nodes k8scloude3 | grep -i Taints
Taints:             <none>

创建pod，nodeSelector:kubernetes.io/hostname: k8scloude1表示pod运行在标签为kubernetes.io/hostname=k8scloude1的节点上。

关于pod的调度详细内容，请查看博客《pod(八)：pod的调度——将 Pod 指派给节点》

[root@k8scloude1 pod]# vim schedulepod4.yaml 
[root@k8scloude1 pod]# cat schedulepod4.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: pod1
  name: pod1
  namespace: pod
spec:
  nodeSelector:
    kubernetes.io/hostname: k8scloude1
  containers:
  - image: nginx
    imagePullPolicy: IfNotPresent
    name: pod1
    resources: {}
    ports:
    - name: http
      containerPort: 80
      protocol: TCP
      hostPort: 80
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}

标签为kubernetes.io/hostname=k8scloude1的节点为k8scloude1节点

[root@k8scloude1 pod]# kubectl get nodes -l kubernetes.io/hostname=k8scloude1
NAME         STATUS   ROLES                  AGE   VERSION
k8scloude1   Ready    control-plane,master   8d    v1.21.0

创建pod，因为k8scloude1上有污点，pod1不能运行在k8scloude1上，所以pod1状态为Pending

[root@k8scloude1 pod]# kubectl apply -f schedulepod4.yaml 
pod/pod1 created
 #因为k8scloude1上有污点，pod1不能运行在k8scloude1上，所以pod1状态为Pending
[root@k8scloude1 pod]# kubectl get pod -o wide
NAME   READY   STATUS    RESTARTS   AGE   IP       NODE     NOMINATED NODE   READINESS GATES
pod1   0/1     Pending   0          9s    <none>   <none>   <none>           <none>
[root@k8scloude1 pod]# kubectl delete pod pod1 --force
warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
pod "pod1" force deleted
[root@k8scloude1 pod]# kubectl get pod -o wide
No resources found in pod namespace.

四.容忍度tolerations

4.1 容忍度tolerations概览

容忍度（Toleration）是应用于 Pod 上的。容忍度允许调度器调度带有对应污点的 Pod。容忍度允许调度但并不保证调度：作为其功能的一部分，调度器也会评估其他参数。

污点和容忍度（Toleration）相互配合，可以用来避免 Pod 被分配到不合适的节点上。每个节点上都可以应用一个或多个污点，这表示对于那些不能容忍这些污点的 Pod，是不会被该节点接受的。

4.2 设置容忍度tolerations

只有拥有和这个污点相匹配的容忍度的 Pod 才能够被分配到 node节点。

查看k8scloude1节点的污点

[root@k8scloude1 pod]# kubectl describe nodes k8scloude1 | grep -i taint
Taints:             node-role.kubernetes.io/master:NoSchedule

你可以在 Pod 规约中为 Pod 设置容忍度，创建pod，tolerations参数表示可以容忍污点：node-role.kubernetes.io/master:NoSchedule ，nodeSelector:kubernetes.io/hostname: k8scloude1表示pod运行在标签为kubernetes.io/hostname=k8scloude1的节点上。

[root@k8scloude1 pod]# vim schedulepod4.yaml 
[root@k8scloude1 pod]# cat schedulepod4.yaml 
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: pod1
  name: pod1
  namespace: pod
spec:
  tolerations:
  - key: "node-role.kubernetes.io/master"
    operator: "Equal"
    value: ""
    effect: "NoSchedule"
  nodeSelector:
    kubernetes.io/hostname: k8scloude1
  containers:
  - image: nginx
    imagePullPolicy: IfNotPresent
    name: pod1
    resources: {}
    ports:
    - name: http
      containerPort: 80
      protocol: TCP
      hostPort: 80
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}
[root@k8scloude1 pod]# kubectl get pods -o wide
No resources found in pod namespace.
[root@k8scloude1 pod]# kubectl apply -f schedulepod4.yaml 
pod/pod1 created

查看pod，即使k8scloude1节点有污点，pod还是正常运行。

taint污点和cordon，drain的区别：某个节点上有污点，可以设置tolerations容忍度，让pod运行在该节点，某个节点被cordon，drain，则该节点不能被分配出去运行pod。

关于cordon，drain的详细信息，请查看博客《cordon节点，drain驱逐节点，delete 节点》

[root@k8scloude1 pod]# kubectl get pods -o wide
NAME   READY   STATUS    RESTARTS   AGE   IP              NODE         NOMINATED NODE   READINESS GATES
pod1   1/1     Running   0          4s    10.244.158.84   k8scloude1   <none>           <none>
[root@k8scloude1 pod]# kubectl delete pod pod1 --force
warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
pod "pod1" force deleted
[root@k8scloude1 pod]# kubectl get pods -o wide
No resources found in pod namespace.

注意，tolerations容忍度有两种写法，任选一种即可:

tolerations:
- key: "key1"
  operator: "Equal"
  value: "value1"
  effect: "NoSchedule"
tolerations:
- key: "key1"
  operator: "Exists"
  effect: "NoSchedule"

给k8scloude2节点打标签

[root@k8scloude1 pod]# kubectl label nodes k8scloude2 taint=T
node/k8scloude2 labeled
[root@k8scloude1 pod]# kubectl get node --show-labels
NAME         STATUS   ROLES                  AGE   VERSION   LABELS
k8scloude1   Ready    control-plane,master   8d    v1.21.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8scloude1,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/exclude-from-external-load-balancers=
k8scloude2   Ready    <none>                 8d    v1.21.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8scloude2,kubernetes.io/os=linux,taint=T
k8scloude3   Ready    <none>                 8d    v1.21.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8scloude3,kubernetes.io/os=linux

对k8scloude2设置污点

#污点taint的格式：键=值:NoSchedule
[root@k8scloude1 pod]# kubectl taint node k8scloude2 wudian=true:NoSchedule
node/k8scloude2 tainted
[root@k8scloude1 pod]# kubectl describe nodes k8scloude2 | grep -i Taints
Taints:             wudian=true:NoSchedule

创建pod，tolerations参数表示容忍污点wudian=true:NoSchedule，nodeSelector:taint: T参数表示pod运行在标签为nodeSelector=taint: T的节点。

[root@k8scloude1 pod]# vim schedulepod4.yaml 
[root@k8scloude1 pod]# cat schedulepod4.yaml 
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: pod1
  name: pod1
  namespace: pod
spec:
  tolerations:
  - key: "wudian"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
  nodeSelector:
    taint: T
  containers:
  - image: nginx
    imagePullPolicy: IfNotPresent
    name: pod1
    resources: {}
    ports:
    - name: http
      containerPort: 80
      protocol: TCP
      hostPort: 80
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}
[root@k8scloude1 pod]# kubectl get pod -o wide
No resources found in pod namespace.
[root@k8scloude1 pod]# kubectl apply -f schedulepod4.yaml 
pod/pod1 created

查看pod，k8scloude2节点就算有污点也能运行pod

[root@k8scloude1 pod]# kubectl get pods -o wide
NAME   READY   STATUS    RESTARTS   AGE   IP               NODE         NOMINATED NODE   READINESS GATES
pod1   1/1     Running   0          8s    10.244.112.177   k8scloude2   <none>           <none>
[root@k8scloude1 pod]# kubectl delete pod pod1 --force
warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
pod "pod1" force deleted
[root@k8scloude1 pod]# kubectl get pods -o wide
No resources found in pod namespace.

污点容忍的另一种写法：operator: "Exists"，没有value值。

[root@k8scloude1 pod]# vim schedulepod4.yaml 
[root@k8scloude1 pod]# cat schedulepod4.yaml 
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: pod1
  name: pod1
  namespace: pod
spec:
  tolerations:
  - key: "wudian"
    operator: "Exists"
    effect: "NoSchedule"
  nodeSelector:
    taint: T
  containers:
  - image: nginx
    imagePullPolicy: IfNotPresent
    name: pod1
    resources: {}
    ports:
    - name: http
      containerPort: 80
      protocol: TCP
      hostPort: 80
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}
[root@k8scloude1 pod]# kubectl apply -f schedulepod4.yaml 
pod/pod1 created

查看pod，k8scloude2节点就算有污点也能运行pod

[root@k8scloude1 pod]# kubectl get pods -o wide
NAME   READY   STATUS    RESTARTS   AGE   IP               NODE         NOMINATED NODE   READINESS GATES
pod1   1/1     Running   0          10s   10.244.112.178   k8scloude2   <none>           <none>
[root@k8scloude1 pod]# kubectl delete pod pod1 --force
warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
pod "pod1" force deleted
[root@k8scloude1 pod]# kubectl get pods -o wide
No resources found in pod namespace.

给k8scloude2节点再添加一个污点

[root@k8scloude1 pod]# kubectl describe nodes k8scloude2 | grep Taints
Taints:             wudian=true:NoSchedule
[root@k8scloude1 pod]# kubectl taint node k8scloude2 zang=shide:NoSchedule
node/k8scloude2 tainted
[root@k8scloude1 pod]# kubectl describe nodes k8scloude2 | grep Taints
Taints:             wudian=true:NoSchedule
[root@k8scloude1 pod]# kubectl describe nodes k8scloude2 | grep -A2 Taints
Taints:             wudian=true:NoSchedule
                    zang=shide:NoSchedule
Unschedulable:      false
[root@k8scloude1 pod]# kubectl describe nodes k8scloude2 | grep -A1 Taints
Taints:             wudian=true:NoSchedule
                    zang=shide:NoSchedule

创建pod，tolerations参数表示容忍2个污点：wudian=true:NoSchedule和zang=shide:NoSchedule，nodeSelector:taint: T参数表示pod运行在标签为nodeSelector=taint: T的节点。

[root@k8scloude1 pod]# vim schedulepod4.yaml 
[root@k8scloude1 pod]# cat schedulepod4.yaml 
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: pod1
  name: pod1
  namespace: pod
spec:
  tolerations:
  - key: "wudian"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
  - key: "zang"
    operator: "Equal"
    value: "shide"
    effect: "NoSchedule"
  nodeSelector:
    taint: T
  containers:
  - image: nginx
    imagePullPolicy: IfNotPresent
    name: pod1
    resources: {}
    ports:
    - name: http
      containerPort: 80
      protocol: TCP
      hostPort: 80
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}
[root@k8scloude1 pod]# kubectl apply -f schedulepod4.yaml 
pod/pod1 created

查看pod，k8scloude2节点就算有2个污点也能运行pod

[root@k8scloude1 pod]# kubectl get pods -o wide
NAME   READY   STATUS    RESTARTS   AGE   IP               NODE         NOMINATED NODE   READINESS GATES
pod1   1/1     Running   0          6s    10.244.112.179   k8scloude2   <none>           <none>
[root@k8scloude1 pod]# kubectl delete pod pod1 --force
warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
pod "pod1" force deleted

创建pod，tolerations参数表示容忍污点：wudian=true:NoSchedule，nodeSelector:taint: T参数表示pod运行在标签为nodeSelector=taint: T的节点。

[root@k8scloude1 pod]# vim schedulepod4.yaml 
[root@k8scloude1 pod]# cat schedulepod4.yaml 
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: pod1
  name: pod1
  namespace: pod
spec:
  tolerations:
  - key: "wudian"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
  nodeSelector:
    taint: T
  containers:
  - image: nginx
    imagePullPolicy: IfNotPresent
    name: pod1
    resources: {}
    ports:
    - name: http
      containerPort: 80
      protocol: TCP
      hostPort: 80
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}
[root@k8scloude1 pod]# kubectl apply -f schedulepod4.yaml 
pod/pod1 created

查看pod，一个节点有两个污点值，但是yaml文件只容忍一个，所以pod创建不成功。

[root@k8scloude1 pod]# kubectl get pods -o wide
NAME   READY   STATUS    RESTARTS   AGE   IP       NODE     NOMINATED NODE   READINESS GATES
pod1   0/1     Pending   0          8s    <none>   <none>   <none>           <none>
[root@k8scloude1 pod]# kubectl delete pod pod1 --force
warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
pod "pod1" force deleted
[root@k8scloude1 pod]# kubectl get pods -o wide
No resources found in pod namespace.

取消k8scloude2污点

[root@k8scloude1 pod]# kubectl describe nodes k8scloude2 | grep -A2 Taints
Taints:             wudian=true:NoSchedule
                    zang=shide:NoSchedule
Unschedulable:      false
#取消污点
[root@k8scloude1 pod]# kubectl taint node k8scloude2 zang-
node/k8scloude2 untainted
[root@k8scloude1 pod]# kubectl taint node k8scloude2 wudian-
node/k8scloude2 untainted
[root@k8scloude1 pod]# kubectl describe nodes k8scloude1 | grep -A2 Taints
Taints:             node-role.kubernetes.io/master:NoSchedule
Unschedulable:      false
Lease:
[root@k8scloude1 pod]# kubectl describe nodes k8scloude2 | grep -A2 Taints
Taints:             <none>
Unschedulable:      false
Lease:
[root@k8scloude1 pod]# kubectl describe nodes k8scloude3 | grep -A2 Taints
Taints:             <none>
Unschedulable:      false
Lease:

Tips:如果自身机器有限，只能有一台机器，则可以把master节点的污点taint取消，就可以在master上运行pod了。

以上就是pod污点taint 与容忍度tolerations详解的详细内容，更多关于污点taint容忍度tolerations 的资料请关注脚本之家其它相关文章！

您可能感兴趣的文章:

Docker网络配置的三种方式
在使用Docker时,网络通信是必不可少的,它可以使不同的Docker容器相互通信,也可以将容器与外部网络连接起来,本文给大家介绍了Docker网络配置的三种方式,文中通过图文给大家讲解非常详细,需要的朋友可以参考下
2024-01-01
Docker实现安装ELK(单节点)
这篇文章主要介绍了Docker实现安装ELK(单节点),具有很好的参考价值,希望对大家有所帮助,如有错误或未考虑完全的地方,望不吝赐教
2024-08-08
docker容器存储清理删除所需命令和方法
这篇文章主要介绍了docker容器存储清理所需命令和方法,我在用docker安装的es使用过程中，发现内存占满了，我把全部的都删除掉了，但有时候数据我们必须要使用，所以不能全删，需要指定删除，下面就是一些docker容器存储清理所需的一些命令和方法，需要的朋友可以参考下
2023-01-01
Docker的核心及安装的具体使用
这篇文章主要介绍了Docker的核心及安装的具体使用，文中通过示例代码介绍的非常详细，对大家的学习或者工作具有一定的参考学习价值，需要的朋友们下面随着小编来一起学习学习吧
2019-11-11
详解centos7 docker1.12安装私有仓库
本篇文章主要介绍了centos7 docker1.12安装私有仓库，小编觉得挺不错的，现在分享给大家，也给大家做个参考。一起跟随小编过来看看吧
2017-01-01
docker配置阿里云镜像仓库的实现
本文主要介绍了docker配置阿里云镜像仓库的实现，文中通过示例代码介绍的非常详细，对大家的学习或者工作具有一定的参考学习价值，需要的朋友们下面随着小编来一起学习学习吧
2022-08-08
docker镜像导入的实现方法
如果服务器网络不好或者pull不下来镜像,只能进行导入,本文主要介绍了docker镜像导入的实现方法,具有一定的参考价值,感兴趣的可以了解一下
2023-09-09
基于Docker搭建Redis主从集群的实现
本文基于Docker+Redis5.0.5版本，通过cluster方式创建一个6个redis实例的主从集群，需要的朋友们下面随着小编来一起学习学习吧
2021-05-05
Docker中如何删除image（镜像）的方法
这篇文章主要介绍了Docker中如何删除image（镜像）的方法，文中通过示例代码介绍的非常详细，对大家的学习或者工作具有一定的参考学习价值，需要的朋友们下面随着小编来一起学习学习吧
2019-09-09
搭建一个私有的Docker registry教程
这篇文章提供了一个非常务实的方法来处理搭建私有Docker registry时出现的各种错综复杂的情况。我们将会使用一个运行于DigitalOcean（之后简称为DO）的非常小巧的512MB VPS 实例
2016-09-09