# Upgrading Kubernetes in Kubespray

Kubespray handles upgrades the same way it handles initial deployment. That is to
say that each component is laid down in a fixed order.

You can also individually control versions of components by explicitly defining their
versions. Here are all version vars for each component:

* docker_version
* docker_containerd_version (relevant when `container_manager` == `docker`)
* containerd_version (relevant when `container_manager` == `containerd`)
* kube_version
* etcd_version
* calico_version
* calico_cni_version
* weave_version
* flannel_version
* kubedns_version

> **Warning**
> [Attempting to upgrade from an older release straight to the latest release is unsupported and likely to break something](https://github.com/kubernetes-sigs/kubespray/issues/3849#issuecomment-451386515)

See [Multiple Upgrades](#multiple-upgrades) for how to upgrade from an older Kubespray release to the latest release.
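
As a sketch, pinning one of the version vars above on the command line might look like the following. The version values here are placeholders for illustration, not recommendations; always check which component versions a given Kubespray release supports:

```ShellSession
ansible-playbook cluster.yml -i inventory/sample/hosts.ini \
  -e kube_version=v1.19.7 \
  -e etcd_version=v3.4.13
```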

## Unsafe upgrade example

If you wanted to upgrade just kube_version from v1.18.10 to v1.19.7, you could
deploy the following way:

```ShellSession
ansible-playbook cluster.yml -i inventory/sample/hosts.ini -e kube_version=v1.18.10 -e upgrade_cluster_setup=true
```

And then repeat with v1.19.7 as kube_version:

```ShellSession
ansible-playbook cluster.yml -i inventory/sample/hosts.ini -e kube_version=v1.19.7 -e upgrade_cluster_setup=true
```

The var `-e upgrade_cluster_setup=true` is needed in order to immediately migrate the deployments of e.g. kube-apiserver inside the cluster, which is usually only done during a graceful upgrade. (Refer to [#4139](https://github.com/kubernetes-sigs/kubespray/issues/4139) and [#4736](https://github.com/kubernetes-sigs/kubespray/issues/4736))

## Graceful upgrade

Kubespray also supports cordoning, draining and uncordoning of nodes when performing
a cluster upgrade. There is a separate playbook used for this purpose. It is
important to note that upgrade-cluster.yml can only be used for upgrading an
existing cluster. That means there must be at least 1 kube_control_plane already
deployed.

```ShellSession
ansible-playbook upgrade-cluster.yml -b -i inventory/sample/hosts.ini -e kube_version=v1.19.7
```

After a successful upgrade, the Server Version should be updated:

```ShellSession
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.7", GitCommit:"1dd5338295409edcfff11505e7bb246f0d325d15", GitTreeState:"clean", BuildDate:"2021-01-13T13:23:52Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.7", GitCommit:"1dd5338295409edcfff11505e7bb246f0d325d15", GitTreeState:"clean", BuildDate:"2021-01-13T13:15:20Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}
```

You can control how many nodes are upgraded at the same time by modifying the ansible variable named `serial`, as explained [here](https://docs.ansible.com/ansible/latest/playbook_guide/playbooks_strategies.html#setting-the-batch-size-with-serial). If you don't set this variable, it will upgrade the cluster nodes in batches of 20% of the available nodes. Setting `serial=1` would mean upgrading one node at a time.

```ShellSession
ansible-playbook upgrade-cluster.yml -b -i inventory/sample/hosts.ini -e kube_version=v1.20.7 -e "serial=1"
```

### Pausing the upgrade

If you want to manually control the upgrade procedure, you can set some variables to pause the upgrade playbook. Pausing *before* upgrading each node may be useful for inspecting pods running on that node, or performing manual actions on the node:

* `upgrade_node_confirm: true` - This will pause the playbook execution prior to upgrading each node. The play will resume when manually approved by typing "yes" at the terminal.
* `upgrade_node_pause_seconds: 60` - This will pause the playbook execution for 60 seconds prior to upgrading each node. The play will resume automatically after 60 seconds.

Pausing *after* upgrading each node may be useful for rebooting the node to apply kernel updates, or testing the still-cordoned node:

* `upgrade_node_post_upgrade_confirm: true` - This will pause the playbook execution after upgrading each node, but before the node is uncordoned. The play will resume when manually approved by typing "yes" at the terminal.
* `upgrade_node_post_upgrade_pause_seconds: 60` - This will pause the playbook execution for 60 seconds after upgrading each node, but before the node is uncordoned. The play will resume automatically after 60 seconds.
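
For instance, a run that waits for manual confirmation before each node could be launched like this. This is a sketch: the kube_version value is a placeholder, and the pause variables can equally be set in your inventory group vars instead of on the command line:

```ShellSession
ansible-playbook upgrade-cluster.yml -b -i inventory/sample/hosts.ini \
  -e kube_version=v1.20.7 \
  -e upgrade_node_confirm=true
```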

## Node-based upgrade

If you don't want to upgrade all nodes in one run, you can use `--limit` [patterns](https://docs.ansible.com/ansible/latest/user_guide/intro_patterns.html#patterns-and-ansible-playbook-flags).

Before using `--limit`, run playbook `facts.yml` without the limit to refresh the facts cache for all nodes:

```ShellSession
ansible-playbook facts.yml -b -i inventory/sample/hosts.ini
```

After this, upgrade the control plane and etcd groups [#5147](https://github.com/kubernetes-sigs/kubespray/issues/5147):

```ShellSession
ansible-playbook upgrade-cluster.yml -b -i inventory/sample/hosts.ini -e kube_version=v1.20.7 --limit "kube_control_plane:etcd"
```

Now you can upgrade other nodes in any order and quantity:

```ShellSession
ansible-playbook upgrade-cluster.yml -b -i inventory/sample/hosts.ini -e kube_version=v1.20.7 --limit "node4:node6:node7:node12"
ansible-playbook upgrade-cluster.yml -b -i inventory/sample/hosts.ini -e kube_version=v1.20.7 --limit "node5*"
```

## Multiple upgrades

> **Warning**
> [Do not skip minor releases (patch releases are ok) when upgrading--upgrade by one tag at a
> time.](https://github.com/kubernetes-sigs/kubespray/issues/3849#issuecomment-451386515)

For instance, given the tag list:

```console
$ git tag
v2.20.0
v2.21.0
v2.22.0
v2.22.1
v2.23.0
v2.23.1
v2.23.2
v2.24.0
...
```

v2.22.0 -> v2.23.2 -> v2.24.0 : ✓

v2.22.0 -> v2.24.0 : ✕

* If you don't explicitly define a Kubernetes version in your k8s_cluster.yml, you can simply check out the next tag and run the upgrade-cluster.yml playbook.
* If you do define a Kubernetes version in your inventory (e.g. group_vars/k8s_cluster.yml), then either make sure to update it before running upgrade-cluster, or specify the new version you're upgrading to: `ansible-playbook -i inventory/mycluster/hosts.ini -b upgrade-cluster.yml -e kube_version=v1.11.3`.
  Otherwise, the upgrade will leave your cluster at the same k8s version defined in your inventory vars.

The below example shows taking a cluster that was set up for v2.6.0 up to v2.10.0:

```ShellSession
$ kubectl get node
NAME      STATUS   ROLES         AGE   VERSION
apollo    Ready    master,node   1h    v1.10.4
boomer    Ready    master,node   42m   v1.10.4
caprica   Ready    master,node   42m   v1.10.4

$ git describe --tags
v2.6.0

$ git tag
...
v2.6.0
v2.7.0
v2.8.0
v2.8.1
v2.8.2
...

$ git checkout v2.7.0
Previous HEAD position was 8b3ce6e4 bump upgrade tests to v2.5.0 commit (#3087)
HEAD is now at 05dabb7e Fix Bionic networking restart error #3430 (#3431)

# NOTE: May need to `pip3 install -r requirements.txt` when upgrading.

$ ansible-playbook -i inventory/mycluster/hosts.ini -b upgrade-cluster.yml
...

$ kubectl get node
NAME      STATUS   ROLES         AGE   VERSION
apollo    Ready    master,node   1h    v1.11.3
boomer    Ready    master,node   1h    v1.11.3
caprica   Ready    master,node   1h    v1.11.3

$ git checkout v2.8.0
Previous HEAD position was 05dabb7e Fix Bionic networking restart error #3430 (#3431)
HEAD is now at 9051aa52 Fix ubuntu-contiv test failed (#3808)
```

> **Note**
> Review changes between the sample inventory and your inventory when upgrading versions.

There are some deprecations between versions that mean you can't just upgrade straight from 2.7.0 to 2.8.0 if you started with the sample inventory.

In this case, I set "kubeadm_enabled" to false, knowing that it is deprecated and removed by 2.9.0, to delay converting the cluster to kubeadm as long as I could.

```ShellSession
$ ansible-playbook -i inventory/mycluster/hosts.ini -b upgrade-cluster.yml
...
"msg": "DEPRECATION: non-kubeadm deployment is deprecated from v2.9. Will be removed in next release."
...
Are you sure you want to deploy cluster using the deprecated non-kubeadm mode. (output is hidden):
yes
...

$ kubectl get node
NAME      STATUS   ROLES         AGE    VERSION
apollo    Ready    master,node   114m   v1.12.3
boomer    Ready    master,node   114m   v1.12.3
caprica   Ready    master,node   114m   v1.12.3

$ git checkout v2.8.1
Previous HEAD position was 9051aa52 Fix ubuntu-contiv test failed (#3808)
HEAD is now at 2ac1c756 More Feature/2.8 backports for 2.8.1 (#3911)

$ ansible-playbook -i inventory/mycluster/hosts.ini -b upgrade-cluster.yml
...
"msg": "DEPRECATION: non-kubeadm deployment is deprecated from v2.9. Will be removed in next release."
...
Are you sure you want to deploy cluster using the deprecated non-kubeadm mode. (output is hidden):
yes
...

$ kubectl get node
NAME      STATUS   ROLES         AGE     VERSION
apollo    Ready    master,node   2h36m   v1.12.4
boomer    Ready    master,node   2h36m   v1.12.4
caprica   Ready    master,node   2h36m   v1.12.4

$ git checkout v2.8.2
Previous HEAD position was 2ac1c756 More Feature/2.8 backports for 2.8.1 (#3911)
HEAD is now at 4167807f Upgrade to 1.12.5 (#4066)

$ ansible-playbook -i inventory/mycluster/hosts.ini -b upgrade-cluster.yml
...
"msg": "DEPRECATION: non-kubeadm deployment is deprecated from v2.9. Will be removed in next release."
...
Are you sure you want to deploy cluster using the deprecated non-kubeadm mode. (output is hidden):
yes
...

$ kubectl get node
NAME      STATUS   ROLES         AGE    VERSION
apollo    Ready    master,node   3h3m   v1.12.5
boomer    Ready    master,node   3h3m   v1.12.5
caprica   Ready    master,node   3h3m   v1.12.5

$ git checkout v2.8.3
Previous HEAD position was 4167807f Upgrade to 1.12.5 (#4066)
HEAD is now at ea41fc5e backport cve-2019-5736 to release-2.8 (#4234)

$ ansible-playbook -i inventory/mycluster/hosts.ini -b upgrade-cluster.yml
...
"msg": "DEPRECATION: non-kubeadm deployment is deprecated from v2.9. Will be removed in next release."
...
Are you sure you want to deploy cluster using the deprecated non-kubeadm mode. (output is hidden):
yes
...

$ kubectl get node
NAME      STATUS   ROLES         AGE     VERSION
apollo    Ready    master,node   5h18m   v1.12.5
boomer    Ready    master,node   5h18m   v1.12.5
caprica   Ready    master,node   5h18m   v1.12.5

$ git checkout v2.8.4
Previous HEAD position was ea41fc5e backport cve-2019-5736 to release-2.8 (#4234)
HEAD is now at 3901480b go to k8s 1.12.7 (#4400)

$ ansible-playbook -i inventory/mycluster/hosts.ini -b upgrade-cluster.yml
...
"msg": "DEPRECATION: non-kubeadm deployment is deprecated from v2.9. Will be removed in next release."
...
Are you sure you want to deploy cluster using the deprecated non-kubeadm mode. (output is hidden):
yes
...

$ kubectl get node
NAME      STATUS   ROLES         AGE     VERSION
apollo    Ready    master,node   5h37m   v1.12.7
boomer    Ready    master,node   5h37m   v1.12.7
caprica   Ready    master,node   5h37m   v1.12.7

$ git checkout v2.8.5
Previous HEAD position was 3901480b go to k8s 1.12.7 (#4400)
HEAD is now at 6f97687d Release 2.8 robust san handling (#4478)

$ ansible-playbook -i inventory/mycluster/hosts.ini -b upgrade-cluster.yml
...
"msg": "DEPRECATION: non-kubeadm deployment is deprecated from v2.9. Will be removed in next release."
...
Are you sure you want to deploy cluster using the deprecated non-kubeadm mode. (output is hidden):
yes
...

$ kubectl get node
NAME      STATUS   ROLES         AGE     VERSION
apollo    Ready    master,node   5h45m   v1.12.7
boomer    Ready    master,node   5h45m   v1.12.7
caprica   Ready    master,node   5h45m   v1.12.7

$ git checkout v2.9.0
Previous HEAD position was 6f97687d Release 2.8 robust san handling (#4478)
HEAD is now at a4e65c7c Upgrade to Ansible >2.7.0 (#4471)
```

> **Warning**
> IMPORTANT: Some variable formats changed in the k8s_cluster.yml between 2.8.5 and 2.9.0.
> If you do not keep your inventory copy up to date, **your upgrade will fail** and your first master will be left non-functional until fixed and re-run.

It is at this point that the cluster was upgraded from non-kubeadm to kubeadm as per the deprecation warning.

```ShellSession
$ ansible-playbook -i inventory/mycluster/hosts.ini -b upgrade-cluster.yml
...

$ kubectl get node
NAME      STATUS   ROLES         AGE     VERSION
apollo    Ready    master,node   6h54m   v1.13.5
boomer    Ready    master,node   6h55m   v1.13.5
caprica   Ready    master,node   6h54m   v1.13.5

# Watch out: 2.10.0 is hiding between 2.1.2 and 2.2.0
$ git tag
...
v2.1.0
v2.1.1
v2.1.2
v2.10.0
v2.2.0
...

$ git checkout v2.10.0
Previous HEAD position was a4e65c7c Upgrade to Ansible >2.7.0 (#4471)
HEAD is now at dcd9c950 Add etcd role dependency on kube user to avoid etcd role failure when running scale.yml with a fresh node. (#3240) (#4479)

$ ansible-playbook -i inventory/mycluster/hosts.ini -b upgrade-cluster.yml
...

$ kubectl get node
NAME      STATUS   ROLES         AGE     VERSION
apollo    Ready    master,node   7h40m   v1.14.1
boomer    Ready    master,node   7h40m   v1.14.1
caprica   Ready    master,node   7h40m   v1.14.1
```

## Upgrading to v2.19

`etcd_kubeadm_enabled` is being deprecated at v2.19. The same functionality is achievable by setting `etcd_deployment_type` to `kubeadm`.

Deploying etcd using kubeadm is experimental and is only available for new deployments, or for deployments where `etcd_kubeadm_enabled` was set to `true` while deploying the cluster.

From 2.19 onward, the `etcd_deployment_type` variable will be placed in `group_vars/all/etcd.yml` instead of `group_vars/etcd.yml`, due to scope issues.
The placement of the variable is only important for `etcd_deployment_type: kubeadm` right now. However, since this might change in future updates, it is recommended to move the variable.

Upgrading is straightforward; no changes are required if `etcd_kubeadm_enabled` was not set to `true` when deploying.

If you have a cluster where `etcd` was deployed using `kubeadm`, you will need to remove the `etcd_kubeadm_enabled` variable, move the `etcd_deployment_type` variable from `group_vars/etcd.yml` to `group_vars/all/etcd.yml` (due to the scope issues mentioned above), and set `etcd_deployment_type` to `kubeadm`.
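
The file shuffle above can be sketched as follows. This is illustration only: the inventory path `inventory/mycluster` and the file contents are assumptions for the sake of a self-contained example, so adapt the paths and edit your real group vars files by hand rather than running this verbatim:

```ShellSession
# Assume group_vars/etcd.yml currently carries the deprecated variable:
mkdir -p inventory/mycluster/group_vars/all
printf 'etcd_kubeadm_enabled: true\n' > inventory/mycluster/group_vars/etcd.yml

# 1. Set etcd_deployment_type in the new location:
printf 'etcd_deployment_type: kubeadm\n' > inventory/mycluster/group_vars/all/etcd.yml

# 2. Drop the deprecated variable from the old file:
sed -i '/^etcd_kubeadm_enabled/d' inventory/mycluster/group_vars/etcd.yml
```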

## Upgrade order

As mentioned above, components are upgraded in the order in which they were
installed in the Ansible playbook. The order of component installation is as
follows:

* Docker
* Containerd
* etcd
* kubelet and kube-proxy
* network_plugin (such as Calico or Weave)
* kube-apiserver, kube-scheduler, and kube-controller-manager
* Add-ons (such as KubeDNS)

### Component-based upgrades

A deployer may want to upgrade specific components in order to minimize risk
or save time. This strategy is not covered by CI as of this writing, so it is
not guaranteed to work.

These commands are useful only for upgrading fully-deployed, healthy, existing
hosts. This will definitely not work for undeployed or partially deployed
hosts.

Upgrade docker:

```ShellSession
ansible-playbook -b -i inventory/sample/hosts.ini cluster.yml --tags=docker
```

Upgrade etcd:

```ShellSession
ansible-playbook -b -i inventory/sample/hosts.ini cluster.yml --tags=etcd
```

Upgrade etcd without rotating etcd certs:

```ShellSession
ansible-playbook -b -i inventory/sample/hosts.ini cluster.yml --tags=etcd --limit=etcd --skip-tags=etcd-secrets
```

Upgrade kubelet:

```ShellSession
ansible-playbook -b -i inventory/sample/hosts.ini cluster.yml --tags=node --skip-tags=k8s-gen-certs,k8s-gen-tokens
```

Upgrade Kubernetes master components:

```ShellSession
ansible-playbook -b -i inventory/sample/hosts.ini cluster.yml --tags=master
```

Upgrade network plugins:

```ShellSession
ansible-playbook -b -i inventory/sample/hosts.ini cluster.yml --tags=network
```

Upgrade all add-ons:

```ShellSession
ansible-playbook -b -i inventory/sample/hosts.ini cluster.yml --tags=apps
```

Upgrade just helm (assuming `helm_enabled` is true):

```ShellSession
ansible-playbook -b -i inventory/sample/hosts.ini cluster.yml --tags=helm
```

## Migrate from Docker to Containerd

Please note that **migrating container engines is not officially supported by Kubespray**. While this procedure can be used to migrate your cluster, it applies to one particular scenario and will likely evolve over time. At the moment, these steps are intended as an additional resource to provide insight into how they could be officially integrated into the Kubespray playbooks.

As of Kubespray 2.18.0, containerd is already the default container engine. If you have the chance, it is advisable and safer to reset and redeploy the entire cluster with the new container engine.

* [Migrating from Docker to Containerd](upgrades/migrate_docker2containerd.md)

## System upgrade

If you want to upgrade the APT or YUM packages while the nodes are cordoned, you can use:

```ShellSession
ansible-playbook upgrade-cluster.yml -b -i inventory/sample/hosts.ini -e system_upgrade=true
```

Nodes will be rebooted when there are package upgrades (`system_upgrade_reboot: on-upgrade`).
This can be changed to `always` or `never`.

Note: Downloads will happen twice unless `system_upgrade_reboot` is `never`.
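
As a sketch, a system upgrade that never reboots nodes (and therefore avoids the double download noted above) might be launched like this:

```ShellSession
ansible-playbook upgrade-cluster.yml -b -i inventory/sample/hosts.ini \
  -e system_upgrade=true -e system_upgrade_reboot=never
```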