Running the vitess operator and orchestrator on a local k8s cluster hit quite a few snags; here are some quick notes.
Environment: Ubuntu 12.04 LTS, set up following the referenced guide.
$ sudo kubeadm init --pod-network-cidr=10.244.0.0/16
Per the docs, the pod network CIDR must be specified, otherwise pod networking breaks. 10.244.0.0/16 is the range flannel's default manifest assumes.
Following the official AWS example, download operator.yaml and exampledb_aws.yaml locally, then apply them:
$ kubectl create -f ./operator.yaml
# wait for the operator to finish starting up
$ kubectl create -f ./exampledb_aws.yaml
The AWS example uses S3 for backups; on a local cluster, switch it to a local volume, e.g.:
volume:
  hostPath:
    path: /tmp/
This maps the host's /tmp directory to the vttablet backup directory.
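In context, the backup section of the example manifest ends up looking roughly like this (field names follow the vitess operator's VitessBackupLocation API; verify the exact structure against the operator version you downloaded):

```yaml
spec:
  backup:
    engine: builtin          # assumed; the AWS example may use a different engine
    locations:
      - volume:
          hostPath:
            path: /tmp/
            type: Directory
```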
The init_db.sql defined inside exampledb_aws.yaml also needs changes: the orchestrator user has to be created explicitly during vttablet initialization, e.g.:
CREATE USER 'orchestrator'@'%' IDENTIFIED BY 'orchestrator';
GRANT SUPER, PROCESS, REPLICATION SLAVE, RELOAD
ON *.* TO 'orchestrator'@'%';
GRANT SELECT
ON _vt.* TO 'orchestrator'@'%';
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vitess-orchestrator
  labels:
    app: orchestrator
spec:
  replicas: 1
  selector:
    matchLabels:
      app: orchestrator
  template:
    metadata:
      labels:
        app: orchestrator
    spec:
      containers:
        - name: orc
          image: vitess/orchestrator
          ports:
            - containerPort: 3000
          volumeMounts:
            - name: config-volume
              mountPath: /conf/
      volumes:
        - name: config-volume
          configMap:
            name: orchestrator-config
---
apiVersion: v1
kind: Service
metadata:
  name: orchestrator
spec:
  selector:
    app: orchestrator
  ports:
    - protocol: TCP
      port: 80
      targetPort: 3000
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: orchestrator-config
data:
  orchestrator.conf.json: |
    {
      "ActiveNodeExpireSeconds": 5,
      "ApplyMySQLPromotionAfterMasterFailover": true,
      "AuditLogFile": "/tmp/orchestrator-audit.log",
      "AuditToSyslog": false,
      "AuthenticationMethod": "",
      "AuthUserHeader": "",
      "AutoPseudoGTID": false,
      "BackendDB": "sqlite",
      "BinlogEventsChunkSize": 10000,
      "CandidateInstanceExpireMinutes": 60,
      "CoMasterRecoveryMustPromoteOtherCoMaster": false,
      "DataCenterPattern": "[.]([^.]+)[.][^.]+[.]vitess[.]io",
      "Debug": true,
      "DefaultInstancePort": 3306,
      "DetachLostSlavesAfterMasterFailover": true,
      "DetectClusterAliasQuery": "SELECT value FROM _vt.local_metadata WHERE name='ClusterAlias'",
      "DetectClusterDomainQuery": "",
      "DetectInstanceAliasQuery": "SELECT value FROM _vt.local_metadata WHERE name='Alias'",
      "DetectPromotionRuleQuery": "SELECT value FROM _vt.local_metadata WHERE name='PromotionRule'",
      "DetectDataCenterQuery": "SELECT value FROM _vt.local_metadata WHERE name='DataCenter'",
      "DetectPseudoGTIDQuery": "",
      "DetectSemiSyncEnforcedQuery": "SELECT @@global.rpl_semi_sync_master_wait_no_slave AND @@global.rpl_semi_sync_master_timeout > 1000000",
      "DiscoverByShowSlaveHosts": false,
      "EnableSyslog": false,
      "ExpiryHostnameResolvesMinutes": 60,
      "DelayMasterPromotionIfSQLThreadNotUpToDate": true,
      "FailureDetectionPeriodBlockMinutes": 10,
      "GraphiteAddr": "",
      "GraphiteConvertHostnameDotsToUnderscores": true,
      "GraphitePath": "",
      "HostnameResolveMethod": "none",
      "HTTPAuthPassword": "",
      "HTTPAuthUser": "",
      "InstanceBulkOperationsWaitTimeoutSeconds": 10,
      "InstancePollSeconds": 5,
      "ListenAddress": ":3000",
      "MasterFailoverLostInstancesDowntimeMinutes": 0,
      "MySQLConnectTimeoutSeconds": 1,
      "MySQLHostnameResolveMethod": "none",
      "MySQLTopologyCredentialsConfigFile": "",
      "MySQLTopologyMaxPoolConnections": 3,
      "MySQLTopologyPassword": "orchestrator",
      "MySQLTopologyReadTimeoutSeconds": 3,
      "MySQLTopologySSLCAFile": "",
      "MySQLTopologySSLCertFile": "",
      "MySQLTopologySSLPrivateKeyFile": "",
      "MySQLTopologySSLSkipVerify": true,
      "MySQLTopologyUseMutualTLS": false,
      "MySQLTopologyUser": "orchestrator",
      "OnFailureDetectionProcesses": [
        "echo 'Detected {failureType} on {failureCluster}. Affected replicas: {countSlaves}' >> /tmp/recovery.log"
      ],
      "OSCIgnoreHostnameFilters": [
      ],
      "PhysicalEnvironmentPattern": "[.]([^.]+[.][^.]+)[.]vitess[.]io",
      "PostFailoverProcesses": [
        "echo '(for all types) Recovered from {failureType} on {failureCluster}. Failed: {failedHost}:{failedPort}; Successor: {successorHost}:{successorPort}' >> /tmp/recovery.log"
      ],
      "PostIntermediateMasterFailoverProcesses": [
        "echo 'Recovered from {failureType} on {failureCluster}. Failed: {failedHost}:{failedPort}; Successor: {successorHost}:{successorPort}' >> /tmp/recovery.log"
      ],
      "PostMasterFailoverProcesses": [
        "echo 'Recovered from {failureType} on {failureCluster}. Failed: {failedHost}:{failedPort}; Promoted: {successorHost}:{successorPort}' >> /tmp/recovery.log",
        "n=0; until [ $n -ge 10 ]; do vtctlclient -server example-vtctld-625ee430:15999 TabletExternallyReparented {successorAlias} && break; n=$[$n+1]; sleep 5; done"
      ],
      "PostponeSlaveRecoveryOnLagMinutes": 0,
      "PostUnsuccessfulFailoverProcesses": [
      ],
      "PowerAuthUsers": [
        "*"
      ],
      "PreFailoverProcesses": [
        "echo 'Will recover from {failureType} on {failureCluster}' >> /tmp/recovery.log"
      ],
      "ProblemIgnoreHostnameFilters": [
      ],
      "PromotionIgnoreHostnameFilters": [
      ],
      "PseudoGTIDMonotonicHint": "asc:",
      "PseudoGTIDPattern": "drop view if exists .*?`_pseudo_gtid_hint__",
      "ReadLongRunningQueries": false,
      "ReadOnly": false,
      "ReasonableMaintenanceReplicationLagSeconds": 20,
      "ReasonableReplicationLagSeconds": 10,
      "RecoverMasterClusterFilters": [
        ".*"
      ],
      "RecoveryIgnoreHostnameFilters": [
      ],
      "RecoveryPeriodBlockSeconds": 60,
      "ReduceReplicationAnalysisCount": true,
      "RejectHostnameResolvePattern": "",
      "RemoveTextFromHostnameDisplay": ".vitess.io:3306",
      "ReplicationLagQuery": "",
      "ServeAgentsHttp": false,
      "SkipBinlogEventsContaining": [
      ],
      "SkipBinlogServerUnresolveCheck": true,
      "SkipMaxScaleCheck": true,
      "SkipOrchestratorDatabaseUpdate": false,
      "SlaveStartPostWaitMilliseconds": 1000,
      "SnapshotTopologiesIntervalHours": 0,
      "SQLite3DataFile": ":memory:",
      "SSLCAFile": "",
      "SSLCertFile": "",
      "SSLPrivateKeyFile": "",
      "SSLSkipVerify": false,
      "SSLValidOUs": [
      ],
      "StaleSeedFailMinutes": 60,
      "StatusEndpoint": "/api/status",
      "StatusOUVerify": false,
      "UnseenAgentForgetHours": 6,
      "UnseenInstanceForgetHours": 240,
      "UseMutualTLS": false,
      "UseSSL": false,
      "VerifyReplicationFilters": false
    }
The config file comes from the vitess repo. Note that its username and password must match a user explicitly created in the init_db.sql used by the VitessCluster — orchestrator:orchestrator here. The vtctld server address also has to be specified, so it's best to create a Service with a fixed, recognizable name for it.
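A fixed Service for vtctld could be sketched like this (the selector label is an assumption about how the operator labels its vtctld pods — check with `kubectl get pods --show-labels` and adjust; 15999 is vtctld's standard grpc port, the one vtctlclient talks to):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: vtctld-fixed
spec:
  selector:
    planetscale.com/component: vtctld   # assumed label; verify on your pods
  ports:
    - protocol: TCP
      port: 15999
      targetPort: 15999
```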
After the cluster came up, the node stayed NotReady and no pods could be scheduled. `describe node` showed the condition network plugin is not ready: cni config uninitialized. Installing flannel fixed it:
$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
⚠️ fetching this may require a proxy
When the vitess components run, the operator creates PVCs, which need a provisioner to be satisfied; I went with rancher's local-path-provisioner:
$ kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/master/deploy/local-path-storage.yaml
rancher registers its storage class as local-path, while the PVCs the vitess operator creates don't specify a StorageClass, so local-path has to be marked as the default — otherwise the PVCs stay in Pending forever. Patch it per the docs:
$ kubectl patch storageclass local-path -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
Check the StorageClass again:
$ kubectl get StorageClass
The output should now show (default):
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
local-path (default) rancher.io/local-path Delete WaitForFirstConsumer false 81m
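With local-path as the default class, a PVC that omits storageClassName — which is what the operator creates — gets served automatically. A minimal PVC to sanity-check this (name and size are arbitrary):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: default-class-test
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  # no storageClassName: the default (local-path) applies
```

Note that local-path uses WaitForFirstConsumer binding (see the table above), so this test PVC stays Pending until some pod actually mounts it — that part is expected.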
On a freshly installed single-node cluster, the only node is a master and won't schedule any pods; the taint has to be removed (see the referenced doc):
$ kubectl taint nodes --all node-role.kubernetes.io/master-
A quick round of Googling found this widely reported as a k8s problem; proactively downgrading microk8s made it go away:
$ sudo snap remove microk8s
# in versions after 1.13, docker was replaced with ctr
$ sudo snap install microk8s --classic --channel=1.13/stable
One very strange issue: changes to init_db.sql never took effect no matter what. Searching the vitess slack channel history turned up several people with the same symptom, and a reply from sougou suggested an apparmor problem. Removing the snaps entirely and installing the cluster with kubeadm instead solved it completely.
Domestic mirrors mostly work well, but sometimes, for reasons that cannot be described, docker images won't pull. Per the official docs, configure dockerd to use the HTTP_PROXY, HTTPS_PROXY, and NO_PROXY environment variables.
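Following the Docker docs, those variables go into a systemd drop-in for the docker service (the proxy address below is a placeholder):

```ini
# /etc/systemd/system/docker.service.d/http-proxy.conf
[Service]
Environment="HTTP_PROXY=http://proxy.example.com:8080"
Environment="HTTPS_PROXY=http://proxy.example.com:8080"
Environment="NO_PROXY=localhost,127.0.0.1"
```

Then reload and restart: `sudo systemctl daemon-reload && sudo systemctl restart docker`.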