K8S API

API

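The api-server is the single entry point for every operation on the cluster.

kubectl, the web UI, and other clients all manage resources through the api-server.

The request payload can be JSON or YAML.

In a YAML file, # starts a line comment.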
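
As a small illustration of the JSON form, the sketch below builds a minimal Pod manifest as a Python dict and serializes it with the standard json module. The names demo and default and the image nginx are placeholders chosen for this example, not values from these notes.

```python
import json

# A minimal Pod manifest as a plain dict (all names are placeholders).
manifest = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "demo", "namespace": "default"},
    "spec": {"containers": [{"name": "demo", "image": "nginx"}]},
}

# JSON text carrying the same fields a YAML manifest would carry.
payload = json.dumps(manifest, indent=2)
print(payload)
```

Saved to a file, this JSON can be applied in the same way as the equivalent YAML manifest.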


yaml

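Resources are usually deployed to Kubernetes from YAML manifests.

A resource object consists of:

apiVersion: API group and version
kind: resource type
metadata: metadata
spec: desired state
status: observed state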

List the available apiVersions:

kubectl api-versions

List the available kinds:

kubectl api-resources

# In a namespace
kubectl api-resources --namespaced=true

# Not in a namespace
kubectl api-resources --namespaced=false

metadata:

metadata:

  name:
  namespace:

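  labels: used to filter resources and the only way to group them; query them with a selector.

  annotations: hold non-identifying information about a resource and extend its spec/status.

  ownerReferences: make it easy to find the object that created a resource and to cascade deletes.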
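
To make the label/selector relationship concrete, here is a minimal sketch of equality-based selection. The function name matches and the sample pods are made up for illustration; set-based operators such as In and NotIn are left out.

```python
def matches(labels: dict, selector: dict) -> bool:
    """Equality-based selector: every key/value pair in the selector
    must be present in the resource's labels."""
    return all(labels.get(key) == value for key, value in selector.items())


# Hypothetical resources keyed by name, each with its labels.
pods = {
    "web-1": {"app": "web", "tier": "frontend"},
    "db-1": {"app": "db", "tier": "backend"},
}

selected = [name for name, labels in pods.items() if matches(labels, {"app": "web"})]
print(selected)
```

On a real cluster the same kind of query is expressed with the -l flag, for example kubectl get pods -l app=web.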

spec:

status:


Scheduling, preemption, eviction

taints: make a node repel pods that do not tolerate them. Applied to nodes.

taints:
- effect: NoSchedule
  key: kubernetes.io/arch
  value: arm64

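tolerations: allow a pod to be scheduled onto nodes with matching taints. Applied to pods. A toleration only permits placement on a tainted node; it does not attract the pod there, so the pod can still land on untainted nodes.

tolerations:
- key: "key1"
  operator: "Equal"     # or "Exists" (then omit value)
  value: "value1"
  effect: "NoSchedule"  # or "NoExecute"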
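
The matching rule between a toleration and a taint can be sketched as follows. This is a simplified model of the documented semantics, not the scheduler's actual code; the function name tolerates is hypothetical.

```python
def tolerates(toleration: dict, taint: dict) -> bool:
    """Return True if the toleration matches the taint."""
    # An empty effect on the toleration matches every effect.
    effect = toleration.get("effect")
    if effect and effect != taint.get("effect"):
        return False

    key = toleration.get("key")
    operator = toleration.get("operator", "Equal")  # Equal is the default
    if operator == "Exists":
        # Exists ignores the value; an empty key tolerates every taint.
        return not key or key == taint.get("key")
    # Equal: both key and value must match.
    return key == taint.get("key") and toleration.get("value", "") == taint.get("value", "")


taint = {"key": "kubernetes.io/arch", "value": "arm64", "effect": "NoSchedule"}
by_key = tolerates({"key": "kubernetes.io/arch", "operator": "Exists"}, taint)
other = tolerates(
    {"key": "key1", "operator": "Equal", "value": "value1", "effect": "NoSchedule"}, taint
)
print(by_key, other)
```

A pod is kept off a node when at least one NoSchedule taint on that node is not matched by any of the pod's tolerations.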

affinity: uses labels to steer a pod onto particular nodes. It does not stop other pods from being scheduled onto those nodes.

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/arch
          operator: In
          values:
          - arm64

nodeSelector: the simplest way to restrict a pod to nodes that carry the given labels.

nodeSelector:
  kubernetes.io/arch: arm64

Pod

Pod template. Pods are usually managed through a Deployment, Job, StatefulSet, or DaemonSet.

apiVersion: v1
kind: Pod
metadata:
  name: test
  namespace: test
  labels:
    app: test
spec:

  #### containers
  os:  # target operating system
    name:

  imagePullSecrets:  # credentials for private registries
  - name: my-harbor

  initContainers:  # init container template
  - name: init
    image: my-image
    command: ...
    args: ...

  containers:  # container template
  - name: test
    # image
    image: image
    imagePullPolicy: Always  # or IfNotPresent / Never

    # entrypoint
    command:
    args:
    workingDir:

    # port
    ports:

    # resources
    resources:
      requests:  # resources the container asks for
        memory: "64Mi"  # bytes
        cpu: "250m"     # millicores (1 core = 1000m)
        ephemeral-storage: "2Gi"  # bytes
      limits:  # upper bound the container may use
        memory: "128Mi"
        cpu: "500m"
        ephemeral-storage: "4Gi"

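    # environment variables, one key/value pair at a time.
    env:
    - name: key
      value: value
    - name: key
      valueFrom:  # pass the value of cm-key in ConfigMap cm-name to key
        configMapKeyRef:
          name: cm-name
          key: cm-key
          optional:
    - name: key
      valueFrom:  # read the value from a Secret key
        secretKeyRef:
          key:
          name:
          optional:
        # alternatives to secretKeyRef (use only one per valueFrom):
        # fieldRef:
        # resourceFieldRef:

    # environment variables from every key/value pair in a ConfigMap or Secret.
    envFrom:
    - configMapRef:  # turn every key/value pair in my-cm into an environment variable
        name: my-cm
        optional:
    - secretRef:
        name:
        optional:
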
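    # volumeMounts (each name refers to an entry under volumes)
    # Without subPath the mount hides the whole directory: only the files from the Secret/ConfigMap are visible there.
    volumeMounts:  # mount the Secret as files under /etc/foo
    - name: my-secret
      mountPath: "/etc/foo"  # the mount replaces the whole directory
      readOnly: true
    - name: my-configmap
      mountPath: "/etc/bar"  # the mount replaces the whole directory
    # With subPath, the key in the Secret/ConfigMap data must match the file name given by subPath and mountPath.
    - name: config
      mountPath: /etc/app/app.conf  # a file; its name must match subPath
      subPath: app.conf  # only the file of the same name is replaced; other files in the directory are untouched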

    # lifecycle
    livenessProbe:
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 5
      httpGet:
        path: /admin
        port: django
        # httpHeaders:
        # - name: Authorization
          # value: Basic $LDAP_ACCOUNT
    readinessProbe:
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 5
      httpGet:
        path: /admin
        port: django
        # httpHeaders:
        # - name: Authorization
          # value: Basic $LDAP_ACCOUNT

    # securityContext
    securityContext:

    # debugging
    stdin: false
    stdinOnce: false
    tty: false

  #### security context
  securityContext:  # pod-level security context
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000

  #### volumes
  volumes:
  - name: my-secret  # the Secret to mount
    secret:
      secretName: mysecret
  - name: my-configmap
    configMap:
      name: myconfigmap

  #### lifecycle
  restartPolicy:

  #### scheduling
  nodeName:
  nodeSelector:  # schedule the pod onto nodes with this label
    key: value
  affinity:
  tolerations:

  #### others
  hostname:
  hostNetwork:
  serviceAccountName:

Containers in a pod can share storage through pod volumes:

apiVersion: v1
kind: Pod
metadata:
spec:

  # two kinds of pod volume
  volumes:
  # emptyDir: the directory is deleted together with the pod
  - name: cache-volume
    emptyDir: {}
  # hostPath: the directory stays on the host after the pod is deleted
  - name: hostpath-volume
    hostPath:
      path: /path/on/host

  containers:
  - name: container1
    image: test
    volumeMounts:
    - name: cache-volume
      mountPath: /path/on/container
      # subPath mounts a subdirectory of the emptyDir or hostPath volume
      subPath: cache1
  - name: container2
    image: test
    volumeMounts:
    - name: hostpath-volume
      mountPath: /path/on/container
      readOnly: true

Deployment

Used to deploy stateless services.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app-deploy
  namespace: my-ns
  labels:
    app: my-app
spec:
  replicas: 3
  # selector: must match the labels in template.metadata.labels
  selector: 
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: image:latest
        imagePullPolicy: IfNotPresent  # or Always
        ports:
        - containerPort: 443
        volumeMounts:
        - name: my-hostpath
          mountPath: /path/on/cpod
        - name: my-pvc
          mountPath: /data 
      volumes:
      - name: my-hostpath
        hostPath: 
          path: /path/on/host
      - name: my-pvc
        persistentVolumeClaim:
          claimName: nfs-pvc

DaemonSet

Runs one pod on every node; typically used to deploy agents.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: my-ds
  namespace: my-ns
  labels:
    k8s-app: my-app
spec:
  selector:
    matchLabels:
      name: my-app
  template:
    metadata:
      labels:
        name: my-app
    spec:
      containers:
      - name: my-container
        image: my-img

StatefulSet

Used to deploy stateful services.

Each pod in a StatefulSet has a unique ordinal index and a stable network identity.

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
  namespace: test
  labels:
    k8s-app: my-app
spec:
  serviceName: "nginx"
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: k8s.gcr.io/nginx-slim:0.8
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi
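
The stable identities of the StatefulSet above follow a predictable naming scheme, sketched below. The sketch assumes that serviceName refers to a headless Service named nginx in the same namespace and that the cluster domain is the default cluster.local; the function name is hypothetical.

```python
def statefulset_identities(name, service, namespace, replicas, claim,
                           cluster_domain="cluster.local"):
    """Yield (pod name, pod DNS name, PVC name) for each ordinal index."""
    for ordinal in range(replicas):
        pod = f"{name}-{ordinal}"
        dns = f"{pod}.{service}.{namespace}.svc.{cluster_domain}"
        pvc = f"{claim}-{pod}"  # <volumeClaimTemplate name>-<pod name>
        yield pod, dns, pvc


ids = list(statefulset_identities("web", "nginx", "test", replicas=2, claim="www"))
for row in ids:
    print(row)
```

Because the names depend only on the ordinal, a rescheduled pod comes back with the same name and is reattached to the same PersistentVolumeClaim.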

Job

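apiVersion: batch/v1
kind: Job
metadata:
  name: my-job
spec:
  # number of pods that must complete successfully (here 8)
  completions: 8
  # number of pods allowed to run at the same time (here 2)
  parallelism: 2
  # retries before the Job is marked as failed
  backoffLimit: 4
  template:
    spec:
      containers:
      - name: my-job
        image: my-image
        command: ['test']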
      restartPolicy: Never
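
How completions and parallelism interact can be sketched with a deliberately simplified model: every pod succeeds, and the pods of one wave finish together. The real Job controller starts a replacement pod as soon as any pod finishes, so this only illustrates the counting. The function name job_waves is hypothetical.

```python
def job_waves(completions: int, parallelism: int) -> list:
    """Split the required completions into waves of at most
    `parallelism` pods running at the same time."""
    waves = []
    remaining = completions
    while remaining > 0:
        running = min(parallelism, remaining)
        waves.append(running)
        remaining -= running
    return waves


waves = job_waves(completions=8, parallelism=2)
print(waves)
```

With completions: 8 and parallelism: 2 this gives four waves of two pods each.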

CronJob

apiVersion: batch/v1
kind: CronJob
metadata:
  name: my-cj
spec:
  schedule: "* * * * *"
  startingDeadlineSeconds: 10  # optional; unset by default
  concurrencyPolicy: Allow  # default; or Forbid / Replace
  suspend: false  # default; set to true to pause scheduling
  successfulJobsHistoryLimit: 3  # default
  failedJobsHistoryLimit: 1  # default
  jobTemplate:
    spec:
      template:
        metadata:
          annotations: ...
          labels: ...
        spec:
          nodeSelector:
            ...
          imagePullSecrets:
            ...
          restartPolicy: OnFailure
          containers:
          - name: image
            image: image
            args:
            - /bin/sh
            - -c
            - date
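
The schedule field uses the standard five-field cron syntax. The sketch below only labels the fields; it does not compute run times, and the helper name describe_schedule is hypothetical.

```python
FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_schedule(expression: str) -> dict:
    """Map each of the five cron fields to its name."""
    parts = expression.split()
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return dict(zip(FIELDS, parts))


every_minute = describe_schedule("* * * * *")
monday_3am = describe_schedule("0 3 * * 1")
print(every_minute)
print(monday_3am)
```

"* * * * *" therefore runs every minute, while "0 3 * * 1" runs at 03:00 every Monday.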

ConfigMap

A ConfigMap can only be used within its own namespace.

Files mounted from a ConfigMap are read-only inside the pod.

Each key under data becomes the file name after mounting.

apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    app: flannel
    tier: node
  name: flannel-cfg
  namespace: kube-system
data:
  cni-conf.json: |
    {
      "name": "n1"
    }
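
The rule that each data key becomes a file name can be imitated locally. The sketch below writes the data of the ConfigMap above into a temporary directory the way a volume mount without subPath would present it. It is only an analogy, since the kubelet does the real work, and the function name materialize and the path etc/bar are assumptions for the example.

```python
import tempfile
from pathlib import Path


def materialize(data: dict, mount_path: Path) -> list:
    """Write every key of `data` as a file under mount_path and
    return the resulting file names."""
    mount_path.mkdir(parents=True, exist_ok=True)
    for filename, content in data.items():
        (mount_path / filename).write_text(content)
    return sorted(p.name for p in mount_path.iterdir())


data = {"cni-conf.json": '{\n  "name": "n1"\n}\n'}
with tempfile.TemporaryDirectory() as tmp:
    files = materialize(data, Path(tmp) / "etc" / "bar")
print(files)
```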

Create a ConfigMap from a configuration file:

$ kubectl -n app create cm my-conf --from-file ./config.ini -o yaml > myconf-configmap.yaml
$ kubectl -n influxdata create cm dashboard-docker --from-file Docker.json -o yaml > grafana-dashboard-docker-configmap.yaml

Secret

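A Secret can only be used within its own namespace.

Each key under data becomes the file name after mounting.

Opaque is the type for arbitrary user-defined data.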

generic secret

kubectl create secret generic empty-secret

apiVersion: v1
kind: Secret
type: Opaque
metadata:
  name: mysecret
  namespace: kube-system
data:
  # values under data must be base64-encoded
  username: bmFtZQ==  # base64 of "name"
  password: cHc=      # base64 of "pw"
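
Values under data in a Secret must be base64-encoded (plain text goes under stringData instead). A minimal sketch of producing the encoded values with the standard library follows; the helper name encode_secret_data is hypothetical.

```python
import base64


def encode_secret_data(values: dict) -> dict:
    """Base64-encode every value, as the data field of a Secret expects."""
    return {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in values.items()
    }


encoded = encode_secret_data({"username": "name", "password": "pw"})
print(encoded)
```

Base64 is an encoding, not encryption: anyone who can read the Secret can decode the values.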

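Create a TLS secret:

kubectl -n kubernetes-dashboard create secret tls \
kubernetes-dashboard-tls --key ca.key --cert ca.crt

A Secret of type kubernetes.io/tls holds the DER data of the key and certificate, Base64-encoded. If you know the PEM format for private keys and certificates, the Base64 data is the same as in that format, except that you skip the first and last lines of the PEM data.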

apiVersion: v1
kind: Secret
metadata:
  name: secret-tls
type: kubernetes.io/tls
data:
  tls.crt: |
    MIIC2DCCAcCgAwIBAgIBATANBgkqh ...    
  tls.key: |
    MIIEpgIBAAKCAQEA7yn3bRHQ5FHMQ ...   

Create a Docker registry secret:

kubernetes.io/dockercfg: serialized form of a ~/.dockercfg file

kubernetes.io/dockerconfigjson: serialized form of a ~/.docker/config.json file

Create a secret for a private registry:

$ kubectl -n ns create secret docker-registry <name> \
--docker-server=https://harbor.domain.com --docker-username=user --docker-password=pw --docker-email=canuxcheng@gmail.com 

Create a secret from a local file (if you need several registries, log in to each of them locally first):

$ kubectl -n ns create secret generic regcred \
--from-file=.dockerconfigjson=$HOME/.docker/config.json \
--type=kubernetes.io/dockerconfigjson

apiVersion: v1
kind: Secret
metadata:
  name: artifactory-cred
  namespace: ...
type: kubernetes.io/dockerconfigjson
data:
  .dockerconfigjson: ewoJImF1d......
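
For reference, the value of .dockerconfigjson is a base64-encoded Docker config file. The sketch below builds one by hand, assuming the usual auths layout of ~/.docker/config.json; the server, user, and password are the placeholders from the kubectl command above, and the function name is hypothetical.

```python
import base64
import json


def dockerconfigjson(server: str, username: str, password: str) -> str:
    """Return the base64 string for the .dockerconfigjson key."""
    # The auth entry is base64("username:password").
    auth = base64.b64encode(f"{username}:{password}".encode()).decode("ascii")
    config = {"auths": {server: {"username": username, "password": password, "auth": auth}}}
    return base64.b64encode(json.dumps(config).encode()).decode("ascii")


value = dockerconfigjson("https://harbor.domain.com", "user", "pw")
decoded = json.loads(base64.b64decode(value))
print(decoded["auths"].keys())
```

In practice kubectl create secret docker-registry produces this structure for you; building it manually is mainly useful for checking what an existing Secret contains.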

Service

apiVersion: v1
kind: Service
metadata:
  name: grafana-service
  namespace: influxdata
spec:
  type: NodePort
  ports:
  - name: https
    port: 3000        # port used inside the cluster
    targetPort: 3000  # port exposed by the pod
    nodePort: 32000   # port on every node for access from outside the cluster
  selector:  # matches metadata.labels of the target pods
    app: grafana

ExternalName Service

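An ExternalName Service is a special case of Service: it has no selector and instead maps the Service to a DNS name.

It can be used to reach a Service in another namespace.

When looking up the host my-service.my-ns.svc.cluster.local, the cluster DNS service returns a CNAME record with the value out-service.out-ns.svc.cluster.local. Clients access my-service the same way as any other Service; the main difference is that the redirection happens at the DNS level rather than through proxying or forwarding.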

apiVersion: v1
kind: Service
metadata:
  name: my-service
  namespace: my-ns
spec:
  type: ExternalName
  externalName: out-service.out-ns.svc.cluster.local  # points to a Service in another namespace
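
The DNS names involved follow the pattern name.namespace.svc.cluster-domain. A small sketch of the CNAME mapping described above, assuming the default cluster domain cluster.local; the helper name service_fqdn is hypothetical.

```python
def service_fqdn(name: str, namespace: str, cluster_domain: str = "cluster.local") -> str:
    """Build the in-cluster DNS name of a Service."""
    return f"{name}.{namespace}.svc.{cluster_domain}"


# The record the cluster DNS would answer with for the ExternalName Service.
record = (
    service_fqdn("my-service", "my-ns"),
    "CNAME",
    service_fqdn("out-service", "out-ns"),
)
print(record)
```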

Endpoint

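Endpoints are useful in the following cases.

  1. You want an external database cluster in production, but your own database in the test environment.
  2. You want the Service to point to a Service in another namespace or in another cluster.
  3. You are migrating a workload to Kubernetes and, while evaluating the approach, run only part of the backends in Kubernetes.

First create the Service (without a selector):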

apiVersion: v1
kind: Service
metadata:
  name: mysql-service
  namespace: influxdata
spec:
  ports:
    - protocol: TCP
      port: 3306
      targetPort: 3306

Then create the Endpoints object with the same name:

apiVersion: v1
kind: Endpoints
metadata:
  name: mysql-service
  namespace: influxdata
subsets:
  - addresses:
      - ip: 10.103.X.X  # IP of the external service
    ports:
      - port: 3306

HPA

Horizontal Pod Autoscaler

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: 
  labels: 
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: d-name
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
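
The scaling decision for a single metric follows the documented formula desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue), clamped to minReplicas and maxReplicas. The sketch below is a simplification: it ignores the controller's tolerance band, stabilization windows, and the rule that the largest result wins when several metrics are configured. The function name is hypothetical.

```python
import math


def desired_replicas(current_replicas: int, current_utilization: float,
                     target_utilization: float, min_replicas: int,
                     max_replicas: int) -> int:
    """Apply the HPA ratio formula for one metric and clamp the result."""
    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    return max(min_replicas, min(max_replicas, raw))


# 4 replicas at 120% average CPU utilization against an 80% target.
scaled_up = desired_replicas(4, 120, 80, min_replicas=2, max_replicas=20)
print(scaled_up)
```

With the manifest's bounds, low load never drops the Deployment below 2 replicas and a spike never pushes it above 20.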

Authentication

Most of the default ClusterRoles and ClusterRoleBindings have names that start with system:.

ServiceAccount

A service account belongs to a specific namespace.

apiVersion: v1
kind: ServiceAccount
metadata:
  name: default
  namespace: default

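Before v1.22, Kubernetes automatically created a long-lived token for each ServiceAccount.

Since v1.24, time-limited tokens are obtained through the TokenRequest API.

Create a long-lived token: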

apiVersion: v1
kind: Secret
type: kubernetes.io/service-account-token
metadata:
  name: mysecretname
  annotations:
    kubernetes.io/service-account.name: myserviceaccount

Role

A Role grants access to resources within a single namespace.

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: pod-reader
rules:
- apiGroups: [""] # "" indicates the core API group
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
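
How a rule like the one above is evaluated can be sketched as a simple membership test. This is a simplified model, assuming only exact matches and the "*" wildcard; resourceNames, subresources, and nonResourceURLs are left out, and the function name rule_allows is hypothetical.

```python
def rule_allows(rule: dict, api_group: str, resource: str, verb: str) -> bool:
    """Return True if the rule permits `verb` on `resource` in `api_group`."""
    def hit(allowed: list, value: str) -> bool:
        return "*" in allowed or value in allowed

    return (
        hit(rule["apiGroups"], api_group)
        and hit(rule["resources"], resource)
        and hit(rule["verbs"], verb)
    )


pod_reader = {"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "watch", "list"]}
can_list = rule_allows(pod_reader, "", "pods", "list")
can_delete = rule_allows(pod_reader, "", "pods", "delete")
print(can_list, can_delete)
```

RBAC is purely additive: a request is allowed if any rule of any bound role matches, and there are no deny rules. On a live cluster, kubectl auth can-i list pods -n default answers the same question for the current user.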

RoleBinding

Grants the permissions defined in a Role or ClusterRole to specific subjects (users, groups, or service accounts).

apiVersion: rbac.authorization.k8s.io/v1
# This role binding allows "jane" to read pods in the "default" namespace
kind: RoleBinding
metadata:
  name: read-pods
  namespace: default
subjects:
# You can specify more than one "subject"
- kind: User
  name: jane # "name" is case sensitive
  apiGroup: rbac.authorization.k8s.io
roleRef:
  # "roleRef" specifies the binding to a Role or ClusterRole
  kind: Role # this must be Role or ClusterRole
  name: pod-reader     # this must match the name of the Role or ClusterRole you want to bind to
  apiGroup: rbac.authorization.k8s.io

ClusterRole

A ClusterRole grants access across the whole cluster, so it has no namespace.

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  # "namespace" is omitted because ClusterRoles are not namespaced
  name: secret-reader
rules:
- apiGroups: [""]
  # at the HTTP level, the name of the resource for accessing Secret objects is "secrets"
  resources: ["secrets"]
  verbs: ["get", "watch", "list"]

ClusterRoleBinding

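Cluster-wide binding: grants access to resources in every namespace, so it has no namespace either.

apiVersion: rbac.authorization.k8s.io/v1
# This cluster role binding allows anyone in the "manager" group to read secrets in any namespace
kind: ClusterRoleBinding
metadata:
  name: read-secrets-global
subjects:
- kind: Group
  name: manager # "name" is case sensitive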
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: secret-reader
  apiGroup: rbac.authorization.k8s.io
Designed by Canux