To ensure that your aks-engine upgrade operation runs smoothly, there are a few things you should be aware of before getting started.
You will need access to the apimodel.json that was generated by aks-engine deploy or aks-engine generate (by default this file is placed into a relative directory that looks like _output/<dnsPrefix>/). aks-engine will use the --api-model argument to introspect the apimodel.json file in order to determine the cluster's current Kubernetes version, as well as all other cluster configuration data as defined the last time that aks-engine was used to deploy, scale, or upgrade the cluster.
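For example, you can confirm the Kubernetes version recorded in an apimodel.json before upgrading. The stub file below is illustrative only; in practice you would query the real file under your _output directory:

```shell
# Create a stub apimodel for illustration; a real one is produced by
# `aks-engine deploy` or `aks-engine generate`.
cat > /tmp/apimodel.json <<'EOF'
{"properties":{"orchestratorProfile":{"orchestratorType":"Kubernetes","orchestratorVersion":"1.12.8"}}}
EOF

# The cluster's current Kubernetes version is recorded at
# properties.orchestratorProfile.orchestratorVersion:
jq -r '.properties.orchestratorProfile.orchestratorVersion' /tmp/apimodel.json
```

This is the version you would then pass to get-versions to see available upgrade paths.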
aks-engine upgrade expects a cluster configuration that conforms to the current state of the cluster. In other words, the Azure resources inside the resource group deployed by aks-engine should be in the same state as when they were originally created by aks-engine. If you perform manual operations on your Azure IaaS resources (other than aks-engine scale and aks-engine upgrade), DO NOT use aks-engine upgrade, as the aks-engine-generated ARM template won't be reconcilable against the state of the Azure resources that reside in the resource group. Manual operations such as renaming, deleting, or resizing resources in the resource group will prevent upgrade from working successfully.
To get the list of all available Kubernetes versions and upgrades, run the get-versions command:
./bin/aks-engine get-versions
To get the versions of Kubernetes that your particular cluster version is upgradable to, provide its current Kubernetes version via the --version argument:
./bin/aks-engine get-versions --version 1.12.8
aks-engine upgrade relies upon a working connection to the cluster control plane during upgrade, both (1) to validate successful upgrade progress, and (2) to cordon and drain nodes before upgrading them, in order to minimize operational downtime of any running cluster workloads. If you are upgrading a private cluster, you must run aks-engine upgrade from a host VM that has network access to the control plane, for example a jumpbox VM that resides in the same VNET as the master VMs. For more information on private clusters refer to this documentation.
If using aks-engine upgrade in production, it is recommended to stage an upgrade test on a cluster that was built to the same specifications (built with the same cluster configuration + the same version of the aks-engine binary) as your production cluster before performing the upgrade, especially if the cluster configuration is "interesting", or in other words differs significantly from defaults. The reason for this is that AKS Engine supports many different cluster configurations and the extent of E2E testing that the AKS Engine team runs cannot practically cover every possible configuration. Therefore, it is recommended that you ensure in a staging environment that your specific cluster configuration is upgradable using aks-engine upgrade before attempting this potentially destructive operation on your production cluster.
aks-engine upgrade is backwards compatible. If you deployed with aks-engine version 0.27.x, you can run upgrade with version 0.29.y. In fact, it is recommended that you use the latest available aks-engine version when running an upgrade operation. This will ensure that you get the latest available software and bug fixes in your upgraded cluster.
aks-engine upgrade will automatically re-generate your cluster configuration to best pair with the desired new version of Kubernetes, and/or the version of AKS Engine that is used to execute aks-engine upgrade. To use an example of both:
When you upgrade to (for example) Kubernetes 1.14 from 1.13, AKS Engine will automatically change your control plane configuration (e.g., coredns, metrics-server, kube-proxy) so that the cluster component configurations have a close, known-working affinity with 1.14.
When you perform an upgrade with a newer version of AKS Engine, even a Kubernetes patch release upgrade such as 1.14.1 to 1.14.2, a newer version of etcd (for example) may have been validated and configured as the default since the version of AKS Engine originally used to build the cluster was released. In that case, without any explicit user direction, the newly upgraded cluster will now be running etcd v3.2.26 instead of v3.2.25. This is by design.
In summary, using aks-engine upgrade means you will freshen and re-pave the entire stack that underlies Kubernetes to reflect the best-known, recent implementation of Azure IaaS + OS + OS config + Kubernetes config.
During the upgrade, aks-engine successively visits virtual machines that constitute the cluster (first the master nodes, then the agent nodes) and performs the following operations:
Master nodes:
cordon the node and drain existing workloads
delete the VM
create new VM and install desired Kubernetes version
add the new VM to the cluster (custom annotations, labels and taints etc are retained automatically)
Agent nodes:
create new VM and install desired Kubernetes version
add the new VM to the cluster
evict any pods that might be scheduled onto this node by Kubernetes before copying custom node properties
copy the custom annotations, labels and taints of the old node to the new node
cordon the node and drain existing workloads
delete the VM
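The cordon and drain steps above map onto standard kubectl operations. The sketch below uses a placeholder node name (k8s-master-12345678-0); aks-engine performs the equivalent of these steps via the Kubernetes API rather than literally shelling out to kubectl:

```shell
# Mark the node unschedulable so no new pods land on it:
kubectl cordon k8s-master-12345678-0

# Evict running workloads; DaemonSet-managed pods are skipped because
# they would immediately be rescheduled onto the same node:
kubectl drain k8s-master-12345678-0 --ignore-daemonsets

# ...the backing VM is then deleted and recreated at the target Kubernetes
# version, and the replacement node rejoins the cluster with its custom
# annotations, labels, and taints restored.
```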
wget https://github.com/Azure/aks-engine/releases/download/v0.55.4/aks-engine-v0.55.4-linux-amd64.tar.gz
tar -xvf aks-engine-v0.55.4-linux-amd64.tar.gz
cd aks-engine-v0.55.4-linux-amd64
azcopy copy "https://{{storageaccount}}.blob.core.chinacloudapi.cn/aks-engine/{{location}}/akse?{{SAS Token}}" "/home/vmadmin/akse" --recursive=true
cd /home/vmadmin/akse
kubectl edit cm kube-flannel-cfg -n kube-system
# (add "cniVersion": "0.2.0" to the cni-conf.json section of the ConfigMap)
aks-engine upgrade --azure-env AzureChinaCloud \
--api-model _output/{{Resource Group}}/apimodel.json \
--location chinanorth2 \
--resource-group {{Resource Group}} \
--subscription-id {{subscription-id}} \
--upgrade-version 1.16.14 \
--client-id {{client-id}} \
--client-secret {{client-secret}}
curl -LO https://storage.googleapis.com/kubernetes-release/release/v1.16.14/bin/linux/amd64/kubectl
chmod +x kubectl
sudo cp kubectl /usr/bin/kubectl
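Once the upgrade finishes and the matching kubectl client is installed, a quick sanity check can confirm the result (these commands require connectivity to the upgraded cluster; node names will differ in your environment):

```shell
# Every node should report the target version (v1.16.14) in the VERSION column:
kubectl get nodes -o wide

# Control plane and system components should all be Running:
kubectl get pods -n kube-system
```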