Kubernetes运维之容器编排StatefulSet

StatefulSet

StatefulSet是为了解决有状态服务的问题(对应Deployments和ReplicaSets是为无状态服务而设计),其应用场景包括

  • 稳定的持久化存储,即Pod重新调度后还是能访问到相同的持久化数据,基于PVC来实现
  • 稳定的网络标志,即Pod重新调度后其PodName和HostName不变,基于Headless Service(即没有Cluster IP的Service)来实现
  • 有序部署,有序扩展,即Pod是有顺序的,在部署或者扩展的时候要依据定义的顺序依次依次进行(即从0到N-1,在下一个Pod运行之前所有之前的Pod必须都是Running和Ready状态),基于init containers来实现
  • 有序收缩,有序删除(即从N-1到0)

从上面的应用场景可以发现,StatefulSet由以下几个部分组成:

  • 用于定义网络标志(DNS domain)的Headless Service
  • 用于创建PersistentVolumes的volumeClaimTemplates
  • 定义具体应用的StatefulSet

StatefulSet中每个Pod的DNS格式为statefulSetName-{0..N-1}.serviceName.namespace.svc.cluster.local,其中

  • serviceName为Headless Service的名字
  • 0..N-1为Pod所在的序号,从0开始到N-1
  • statefulSetName为StatefulSet的名字
  • namespace为服务所在的namespace,Headless Servic和StatefulSet必须在相同的namespace
  • .cluster.local为Cluster Domain,

StatefulSet模板

必要字段

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
[root@k8s-master1 ~]# kubectl explain sts.spec
KIND: StatefulSet
VERSION: apps/v1

RESOURCE: spec <Object>

DESCRIPTION:
Spec defines the desired identities of pods in this set.

A StatefulSetSpec is the specification of a StatefulSet.

FIELDS:
podManagementPolicy <string> # POD启动规则,OrderedReady有序启动,Parallel一起启动
podManagementPolicy controls how pods are created during initial scale up,
when replacing pods on nodes, or when scaling down. The default policy is
`OrderedReady`, where pods are created in increasing order (pod-0, then
pod-1, etc) and the controller will wait until each pod is ready before
continuing. When scaling down, the pods are removed in the opposite order.
The alternative policy is `Parallel` which will create pods in parallel to
match the desired scale without waiting, and on scale down will delete all
pods at once.

replicas <integer> # 副本数量
replicas is the desired number of replicas of the given Template. These are
replicas in the sense that they are instantiations of the same Template,
but individual replicas also have a consistent identity. If unspecified,
defaults to 1.

revisionHistoryLimit <integer> # 保留历史副本
revisionHistoryLimit is the maximum number of revisions that will be
maintained in the StatefulSet's revision history. The revision history
consists of all revisions not represented by a currently applied
StatefulSetSpec version. The default value is 10.

selector <Object> -required- # labels标签
selector is a label query over pods that should match the replica count. It
must match the pod template's labels. More info:
https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#label-selectors

serviceName <string> -required- # service网络访问名称
serviceName is the name of the service that governs this StatefulSet. This
service must exist before the StatefulSet, and is responsible for the
network identity of the set. Pods get DNS/hostnames that follow the
pattern: pod-specific-string.serviceName.default.svc.cluster.local where
"pod-specific-string" is managed by the StatefulSet controller.

template <Object> -required- # POD模板
template is the object that describes the pod that will be created if
insufficient replicas are detected. Each pod stamped out by the StatefulSet
will fulfill this Template, but have a unique identity from the rest of the
StatefulSet.

updateStrategy <Object> # 滚动更新规则
rollingUpdate:
partition: 2 <integer> # 只会更新大于等于这个索引的POD

volumeClaimTemplates <[]Object> # 挂载数据信息
volumeClaimTemplates is a list of claims that pods are allowed to
reference. The StatefulSet controller is responsible for mapping network
identities to claims in a way that maintains the identity of a pod. Every
claim in this list must have at least one matching (by name) volumeMount in
one container in the template. A claim in this list takes precedence over
any volumes in the template, with the same name.

简单示例

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
---
apiVersion: v1
kind: Service
metadata:
name: nginx # 与下面serviceName相同
labels:
app: nginx
spec:
ports:
- port: 80
name: web
clusterIP: None ## 不分配ClusterIP,代表headless service网络,整个集群的Pod能访问,外部不能访问
selector:
app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: web
spec:
selector:
matchLabels:
app: nginx
serviceName: "nginx"
replicas: 3
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx
ports:
- containerPort: 80
name: web
# volumeMounts:
# - name: www
# mountPath: /usr/share/nginx/html
# volumeClaimTemplates:
# - metadata:
# name: www
# annotations:
# volume.alpha.kubernetes.io/storage-class: anything
# spec:
# accessModes: [ "ReadWriteOnce" ]
# resources:
# requests:
# storage: 1Gi

上述例子中:

  • 名为 nginx 的 Headless Service 用来控制网络域名。
  • 名为 web 的 StatefulSet 有一个 Spec,它表明将在独立的 3 个 Pod 副本中启动 nginx 容器。
  • volumeClaimTemplates 将通过 PersistentVolumes 驱动提供的 PersistentVolumes 来提供稳定的存储。

下面给出一些选择集群域、服务名、StatefulSet 名、及其怎样影响 StatefulSet 的 Pod 上的 DNS 名称的示例:

集群域名服务(名字空间/名字)StatefulSet(名字空间/名字)StatefulSet 域名Pod DNSPod 主机名
cluster.localdefault/nginxdefault/webnginx.default.svc.cluster.localweb-{0..N-1}.nginx.default.svc.cluster.localweb-{0..N-1}
cluster.localfoo/nginxfoo/webnginx.foo.svc.cluster.localweb-{0..N-1}.nginx.foo.svc.cluster.localweb-{0..N-1}
kube.localfoo/nginxfoo/webnginx.foo.svc.kube.localweb-{0..N-1}.nginx.foo.svc.kube.localweb-{0..N-1}

执行结果

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
[root@k8s-master1 ~]# kubectl get pod                       
NAME READY STATUS RESTARTS AGE
web-0 1/1 Running 0 72s
web-1 1/1 Running 0 43s
web-2 0/1 ContainerCreating 0 3s
[root@k8s-master1 ~]# kubectl get pod
NAME READY STATUS RESTARTS AGE
web-0 1/1 Running 0 2m22s
web-1 1/1 Running 0 113s
web-2 1/1 Running 0 73s

# 查看创建的headless service和statefulset
[root@k8s-master1 ~]# kubectl get s
secrets serviceaccounts services statefulsets.apps storageclasses.storage.k8s.io
[root@k8s-master1 ~]# kubectl get statefulsets.apps
NAME READY AGE
web 3/3 3m13s
[root@k8s-master1 ~]# kubectl get statefulsets
NAME READY AGE
web 3/3 3m16s
# 其他pod中访问
root@nginx-deployment-594d59fb8d-56nfd:/# curl web-0.nginx
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>

其他的操作

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
# 扩容
$ kubectl scale statefulset web --replicas=5

# 缩容
$ kubectl patch statefulset web -p '{"spec":{"replicas":3}}'

# 镜像更新(目前还不支持直接更新image,需要patch来间接实现)
$ kubectl patch statefulset web --type='json' -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/image", "value":"gcr.io/google_containers/nginx-slim:0.7"}]'

# 删除StatefulSet和Headless Service
$ kubectl delete statefulset web
$ kubectl delete service nginx

# StatefulSet删除后PVC还会保留着,数据不再使用的话也需要删除
$ kubectl delete pvc www-web-0 www-web-1

StatefulSet注意事项

  1. 还在beta状态,需要kubernetes v1.5版本以上才支持
  2. 所有Pod的Volume必须使用PersistentVolume或者是管理员事先创建好
  3. 为了保证数据安全,删除StatefulSet时不会删除Volume
  4. StatefulSet需要一个Headless Service来定义DNS domain,需要在StatefulSet之前创建好
  5. 目前StatefulSet还没有feature complete,比如更新操作还需要手动patch。

更多可以参考Kubernetes文档