# Kubernetes on OpenStack with Terraform

Provision a Kubernetes cluster with [Terraform](https://www.terraform.io) on
OpenStack.

## Status

This will install a Kubernetes cluster on an OpenStack cloud. It should work on
most modern installs of OpenStack that support the basic services.

## Approach

The Terraform configuration inspects variables found in
[variables.tf](variables.tf) to create resources in your OpenStack cluster.
There is a [Python script](../terraform.py) that reads the generated `.tfstate`
file to generate a dynamic inventory that is consumed by the main Ansible script
to actually install Kubernetes and stand up the cluster.
### Networking

The configuration includes creating a private subnet with a router to the
external net. It will allocate floating IPs from a pool and assign them to the
hosts where that makes sense. You have the option of creating bastion hosts
inside the private subnet to access the nodes there. Alternatively, a node with
a floating IP can be used as a jump host to nodes without one.

### Kubernetes Nodes

You can create many different Kubernetes topologies by setting the number of
hosts in each class. For each class there are options for allocating
floating IP addresses or not.

- Master nodes with etcd
- Master nodes without etcd
- Standalone etcd hosts
- Kubernetes worker nodes

Note that the Ansible script will report an invalid configuration if you wind up
with an even number of etcd instances.
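
The etcd count can be checked before provisioning. The following is a minimal sketch, assuming that every "master with etcd" (with or without a floating IP) and every standalone etcd host contributes exactly one etcd instance; the function names are hypothetical and this is not the check the Ansible script itself performs.

```shell
# Sum the etcd members implied by the Terraform host counts.
# $1: number_of_k8s_masters
# $2: number_of_k8s_masters_no_floating_ip
# $3: number_of_etcd
etcd_total() {
  echo $(( $1 + $2 + $3 ))
}

# Succeed only when the member count is odd (zero counts as even).
etcd_count_is_odd() {
  [ $(( $1 % 2 )) -eq 1 ]
}

# Example:
#   total=$(etcd_total 2 0 1)
#   etcd_count_is_odd "$total" || echo "even etcd count: $total" >&2
```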
### GlusterFS

The Terraform configuration supports provisioning of an optional GlusterFS
shared file system based on a separate set of VMs. To enable this, you need to
specify:

- the number of Gluster hosts (minimum 2)
- the size of the non-ephemeral volumes to be attached to store the GlusterFS bricks
- other properties related to provisioning the hosts

Even if you are using Container Linux by CoreOS for your cluster, you will still
need the GlusterFS VMs to be based on either Debian or RedHat based images.
Container Linux by CoreOS cannot serve GlusterFS, but can connect to it through
binaries available on hyperkube v1.4.3_coreos.0 or higher.
## Requirements

- [Install Terraform](https://www.terraform.io/intro/getting-started/install.html)
- [Install Ansible](http://docs.ansible.com/ansible/latest/intro_installation.html)
- you already have a suitable OS image in Glance
- you already have a floating IP pool created
- you have security groups enabled
- you have a pair of keys generated that can be used to secure the new hosts
## Module Architecture

The configuration is divided into three modules:

- Network
- IPs
- Compute

The main reason for splitting the configuration up in this way is to easily
accommodate situations where floating IPs are limited by a quota, or where
external references to a floating IP (e.g. DNS) would otherwise have to be
updated.

You can force the use of your existing IPs by modifying the compute variables in
`kubespray.tf` as follows:

```
k8s_master_fips = ["151.101.129.67"]
k8s_node_fips = ["151.101.129.68"]
```
## Terraform

Terraform will be used to provision all of the OpenStack resources with base software as appropriate.

### Configuration

#### Inventory files

Create an inventory directory for your cluster by copying the existing sample and linking the `hosts` script (used to build the inventory based on Terraform state). Here `$CLUSTER` is a name of your choosing for the cluster:

```ShellSession
$ cp -LRp contrib/terraform/openstack/sample-inventory inventory/$CLUSTER
$ cd inventory/$CLUSTER
$ ln -s ../../contrib/terraform/openstack/hosts
```

This will be the base for subsequent Terraform commands.
#### OpenStack access and credentials

No provider variables are hardcoded inside `variables.tf` because Terraform
supports various authentication methods for OpenStack: the older script and
environment method (using `openrc`) as well as a newer declarative method, and
different OpenStack environments may support Identity API version 2 or 3.

These are examples and may vary depending on your OpenStack cloud provider.
For an exhaustive list of ways to authenticate on OpenStack with Terraform,
please read the [OpenStack provider documentation](https://www.terraform.io/docs/providers/openstack/).

##### Declarative method (recommended)

The recommended authentication method is to describe credentials in a YAML file `clouds.yaml` that can be stored in:

* the current directory
* `~/.config/openstack`
* `/etc/openstack`

`clouds.yaml`:

```
clouds:
  mycloud:
    auth:
      auth_url: https://openstack:5000/v3
      username: "username"
      project_name: "projectname"
      project_id: projectid
      user_domain_name: "Default"
      password: "password"
    region_name: "RegionOne"
    interface: "public"
    identity_api_version: 3
```

If you have multiple clouds defined in your `clouds.yaml` file you can choose
the one you want to use with the environment variable `OS_CLOUD`:

```
export OS_CLOUD=mycloud
```
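
When more than one `clouds.yaml` exists, it helps to know which copy is in effect. The sketch below assumes the three locations are searched in the order listed, with the first file found winning, which is how the OpenStack client libraries document it; the function name is hypothetical.

```shell
# Print the path of the first clouds.yaml found in the documented locations.
# Returns non-zero when none of them contains the file.
find_clouds_yaml() {
  for dir in "$PWD" "${HOME}/.config/openstack" /etc/openstack; do
    if [ -f "$dir/clouds.yaml" ]; then
      echo "$dir/clouds.yaml"
      return 0
    fi
  done
  return 1
}

# Example:
#   find_clouds_yaml || echo "no clouds.yaml found" >&2
```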
##### Openrc method (deprecated)

When using classic environment variables, Terraform uses default `OS_*`
environment variables. A script suitable for your environment may be available
from Horizon under *Project* -> *Compute* -> *Access & Security* -> *API Access*.

With identity v2:

```
source openrc

env | grep OS

OS_AUTH_URL=https://openstack:5000/v2.0
OS_PROJECT_ID=projectid
OS_PROJECT_NAME=projectname
OS_USERNAME=username
OS_PASSWORD=password
OS_REGION_NAME=RegionOne
OS_INTERFACE=public
OS_IDENTITY_API_VERSION=2
```

With identity v3:

```
source openrc

env | grep OS

OS_AUTH_URL=https://openstack:5000/v3
OS_PROJECT_ID=projectid
OS_PROJECT_NAME=projectname
OS_PROJECT_DOMAIN_ID=default
OS_USERNAME=username
OS_PASSWORD=password
OS_REGION_NAME=RegionOne
OS_INTERFACE=public
OS_IDENTITY_API_VERSION=3
OS_USER_DOMAIN_NAME=Default
```

Terraform does not support a mix of DomainName and DomainID. If both are set,
it fails with:

```
* provider.openstack: You must provide exactly one of DomainID or DomainName to authenticate by Username
```

Choose one or the other:

```
# Use domain IDs only:
unset OS_USER_DOMAIN_NAME
export OS_USER_DOMAIN_ID=default

# or use domain names only:
unset OS_PROJECT_DOMAIN_ID
export OS_PROJECT_DOMAIN_NAME=Default
```
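
The mix can be detected before running Terraform. This is a minimal sketch that only looks at the four variables named above and treats any combination of an ID variable with a name variable as a conflict, following the fix described in the text; the function name is hypothetical.

```shell
# Return 0 when the environment uses domain IDs only, domain names only,
# or neither; return 1 and print a message when both styles are set.
check_os_domain_vars() {
  ids="${OS_USER_DOMAIN_ID:-}${OS_PROJECT_DOMAIN_ID:-}"
  names="${OS_USER_DOMAIN_NAME:-}${OS_PROJECT_DOMAIN_NAME:-}"
  if [ -n "$ids" ] && [ -n "$names" ]; then
    echo "both OS_*_DOMAIN_ID and OS_*_DOMAIN_NAME are set; keep only one style" >&2
    return 1
  fi
  return 0
}

# Example, after `source openrc`:
#   check_os_domain_vars || exit 1
```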
#### Cluster variables

The construction of the cluster is driven by values found in
[variables.tf](variables.tf).

For your cluster, edit `inventory/$CLUSTER/cluster.tf`.

|Variable | Description |
|---------|-------------|
|`cluster_name` | All OpenStack resources will use the Terraform variable `cluster_name` (default `example`) in their name to make it easier to track. For example the first compute resource will be named `example-kubernetes-1`. |
|`network_name` | The name to be given to the internal network that will be generated |
|`dns_nameservers` | An array of DNS name server names to be used by hosts in the internal subnet. |
|`floatingip_pool` | Name of the pool from which floating IPs will be allocated |
|`external_net` | UUID of the external network that will be routed to |
|`flavor_k8s_master`, `flavor_k8s_node`, `flavor_etcd`, `flavor_bastion`, `flavor_gfs_node` | Flavor depends on your OpenStack installation. You can get available flavor IDs through `nova flavor-list` |
|`image`, `image_gfs` | Name of the image to use in provisioning the compute resources. Should already be loaded into Glance. |
|`ssh_user`, `ssh_user_gfs` | The username to ssh into the image with. This usually depends on the image you have selected |
|`public_key_path` | Path on your local workstation to the public key file you wish to use in creating the key pairs |
|`number_of_k8s_masters`, `number_of_k8s_masters_no_floating_ip` | Number of nodes that serve as both master and etcd. These can be provisioned with or without floating IP addresses |
|`number_of_k8s_masters_no_etcd`, `number_of_k8s_masters_no_floating_ip_no_etcd` | Number of nodes that serve as just master with no etcd. These can be provisioned with or without floating IP addresses |
|`number_of_etcd` | Number of pure etcd nodes |
|`number_of_k8s_nodes`, `number_of_k8s_nodes_no_floating_ip` | Kubernetes worker nodes. These can be provisioned with or without floating IP addresses. |
|`number_of_bastions` | Number of bastion hosts to create. Scripts assume this is really just zero or one |
|`number_of_gfs_nodes_no_floating_ip` | Number of Gluster servers to provision. |
|`gfs_volume_size_in_gb` | Size of the non-ephemeral volumes to be attached to store the GlusterFS bricks |
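
To make the table concrete, here is a sketch of what a small `cluster.tf` could contain. Only the variable names come from the table; every value is a placeholder or an assumption chosen for illustration, and the helper name `sample_cluster_tf` is hypothetical. Any variable left out is expected to fall back to its default in `variables.tf`, where one exists.

```shell
# Print a minimal cluster.tf to stdout. All values are illustrative
# placeholders; replace them with values from your own OpenStack project.
sample_cluster_tf() {
  cat <<'EOF'
cluster_name = "example"
network_name = "example-network"
dns_nameservers = ["<dns-server-address>"]
floatingip_pool = "<floating-ip-pool>"
external_net = "<external-network-uuid>"

image = "<glance-image-name>"
ssh_user = "<image-login-user>"
public_key_path = "<path-to-public-key>"

flavor_k8s_master = "<flavor-id>"
flavor_k8s_node = "<flavor-id>"

# One master that also runs etcd (an odd etcd count), two workers, no bastion.
number_of_k8s_masters = 1
number_of_k8s_masters_no_etcd = 0
number_of_etcd = 0
number_of_k8s_nodes = 2
number_of_bastions = 0
EOF
}

# Example (this overwrites the copied sample file):
#   sample_cluster_tf > "inventory/$CLUSTER/cluster.tf"
```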

#### Terraform state files

In the cluster's inventory folder, the following files might be created (either by Terraform
or manually). To prevent you from pushing them accidentally, they are listed in a
`.gitignore` file in the `terraform/openstack` directory:

* `.terraform`
* `.tfvars`
* `.tfstate`
* `.tfstate.backup`

You can still add them manually if you want to.
### Initialization

Before Terraform can operate on your cluster you need to install the required
plugins. This is accomplished as follows:

```ShellSession
$ cd inventory/$CLUSTER
$ terraform init ../../contrib/terraform/openstack
```

This should finish fairly quickly, telling you that Terraform has successfully initialized and loaded the necessary modules.
### Provisioning cluster

You can apply the Terraform configuration to your cluster with the following command
issued from your cluster's inventory directory (`inventory/$CLUSTER`):

```ShellSession
$ terraform apply -var-file=cluster.tf ../../contrib/terraform/openstack
```

If you chose to create a bastion host, this script will create
`contrib/terraform/openstack/k8s-cluster.yml` with an ssh command that lets
Ansible reach your machines by tunneling through the bastion's IP address. If
you want to handle the ssh tunneling to these machines manually, delete
or move that file. If you want to use it, leave it in place and Ansible will
pick it up automatically.
### Destroying cluster

You can destroy your new cluster with the following command issued from the cluster's inventory directory:

```ShellSession
$ terraform destroy -var-file=cluster.tf ../../contrib/terraform/openstack
```

If you've started the Ansible run, it may also be a good idea to do some manual cleanup:

* remove SSH keys from the destroyed cluster from your `~/.ssh/known_hosts` file
* clean up any temporary cache files: `rm /tmp/$CLUSTER-*`
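
One caveat with the cache cleanup: if `$CLUSTER` is unset, the pattern becomes `/tmp/-*`. The sketch below guards against that by refusing an empty cluster name; the function name and the optional directory argument are hypothetical additions for illustration.

```shell
# Remove temporary cache files belonging to one cluster.
# $1: cluster name (required)
# $2: cache directory (optional, defaults to /tmp)
clean_cluster_cache() {
  if [ -z "${1:-}" ]; then
    echo "clean_cluster_cache: cluster name required" >&2
    return 1
  fi
  cache_dir="${2:-/tmp}"
  # The glob is left unquoted on purpose so the shell expands it.
  rm -f -- "$cache_dir/$1"-*
}

# Example:
#   clean_cluster_cache "$CLUSTER"
```

Stale host keys can be removed one address at a time with `ssh-keygen -R <address>`.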

### Debugging

You can enable debugging output from Terraform by setting
`OS_DEBUG` to 1 and `TF_LOG` to `DEBUG` before running the Terraform command.
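
One way to apply these settings to a single command, without leaving them exported in your shell, is a small wrapper. This is a sketch; the function name `with_tf_debug` is hypothetical.

```shell
# Run the given command with Terraform and OpenStack debug output enabled.
# The variables are set for that command only.
with_tf_debug() {
  env OS_DEBUG=1 TF_LOG=DEBUG "$@"
}

# Example, from inventory/$CLUSTER:
#   with_tf_debug terraform apply -var-file=cluster.tf ../../contrib/terraform/openstack
```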

### Terraform output

Terraform can output values that are useful for configuring Neutron/Octavia LBaaS or Cinder persistent volume provisioning as part of your Kubernetes deployment:

- `private_subnet_id`: the subnet where your instances are running; used for `openstack_lbaas_subnet_id`
- `floating_network_id`: the network ID where the floating IPs are provisioned; used for `openstack_lbaas_floating_network_id`
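
The mapping above can be turned into variable definitions with a short helper. This is a sketch only: the function name is hypothetical, and this document does not say which variables file the two settings belong in, so the helper just prints them.

```shell
# Print the two LBaaS variables as YAML.
# $1: value of the private_subnet_id output
# $2: value of the floating_network_id output
lbaas_vars() {
  printf 'openstack_lbaas_subnet_id: "%s"\n' "$1"
  printf 'openstack_lbaas_floating_network_id: "%s"\n' "$2"
}

# Example, from inventory/$CLUSTER after `terraform apply`:
#   lbaas_vars "$(terraform output private_subnet_id)" \
#              "$(terraform output floating_network_id)"
# Depending on the Terraform version, `terraform output` may print string
# values wrapped in quotes; strip them first if that is the case.
```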

## Ansible

### Node access

#### SSH

Ensure your local ssh-agent is running and your ssh key has been added. This
step is required by the Terraform provisioner:

```
$ eval $(ssh-agent -s)
$ ssh-add ~/.ssh/id_rsa
```

If you have deployed and destroyed a previous iteration of your cluster, you will need to clear out any stale keys from your SSH "known hosts" file (`~/.ssh/known_hosts`).

#### Bastion host

If you are not using a bastion host, but not all of your nodes have floating IPs, create a file `inventory/$CLUSTER/group_vars/no-floating.yml` with the following content. For `MASTER_IP`, use one of your nodes with a floating IP (this should have been output at the end of the Terraform step), and for `USER`, the appropriate user for that OS. If you have another jump host, use that instead.

```
ansible_ssh_common_args: '-o ProxyCommand="ssh -o StrictHostKeyChecking=no -W %h:%p -q USER@MASTER_IP"'
```
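
The file can also be generated rather than edited by hand. A minimal sketch, with a hypothetical function name, that substitutes the user and jump host address into the line above:

```shell
# Print the content of no-floating.yml.
# $1: SSH user on the jump host
# $2: floating IP or host name of the jump host
no_floating_vars() {
  printf '%s\n' "ansible_ssh_common_args: '-o ProxyCommand=\"ssh -o StrictHostKeyChecking=no -W %h:%p -q $1@$2\"'"
}

# Example (user and address are placeholders):
#   no_floating_vars core 203.0.113.10 > "inventory/$CLUSTER/group_vars/no-floating.yml"
```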

#### Test access

Make sure you can connect to the hosts. Note that Container Linux by CoreOS will have a state `FAILED` due to Python not being present. This is okay, because Python will be installed during bootstrapping, so long as the hosts are not `UNREACHABLE`.

```
$ ansible -i inventory/$CLUSTER/hosts -m ping all
example-k8s_node-1 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
example-etcd-1 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
example-k8s-master-1 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
```

If it fails, try to connect manually via SSH. It could be something as simple as a stale host key.
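
With many hosts, the unreachable ones are easy to miss in the output. The sketch below filters them out, assuming the `host | STATUS => {` line format shown above and that unreachable hosts are reported with a status beginning with `UNREACHABLE`; the function name is hypothetical.

```shell
# Read `ansible -m ping` output on stdin and print the names of hosts
# reported as UNREACHABLE, one per line. FAILED hosts are ignored on
# purpose, since FAILED is acceptable before bootstrapping.
unreachable_hosts() {
  grep ' | UNREACHABLE' | sed 's/ | .*//'
}

# Example:
#   ansible -i "inventory/$CLUSTER/hosts" -m ping all | unreachable_hosts
```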

### Configure cluster variables

Edit `inventory/$CLUSTER/group_vars/all.yml`:

- Set variable **bootstrap_os** appropriately for your desired image:

```
# Valid bootstrap options (required): ubuntu, coreos, centos, none
bootstrap_os: coreos
```

- **bin_dir**:

```
# Directory where the binaries will be installed
# Default:
# bin_dir: /usr/local/bin
# For Container Linux by CoreOS:
bin_dir: /opt/bin
```

- and **cloud_provider**:

```
cloud_provider: openstack
```

Edit `inventory/$CLUSTER/group_vars/k8s-cluster.yml`:

- Set variable **kube_network_plugin** to your desired networking plugin.
  - **flannel** works out-of-the-box
  - **calico** requires [configuring OpenStack Neutron ports](/docs/openstack.md) to allow service and pod subnets

```
# Choose network plugin (calico, weave or flannel)
# Can also be set to 'cloud', which lets the cloud provider setup appropriate routing
kube_network_plugin: flannel
```

- Set variable **resolvconf_mode**

```
# Can be docker_dns, host_resolvconf or none
# Default:
# resolvconf_mode: docker_dns
# For Container Linux by CoreOS:
resolvconf_mode: host_resolvconf
```

### Deploy Kubernetes

```
$ ansible-playbook --become -i inventory/$CLUSTER/hosts cluster.yml
```

This will take some time as there are many tasks to run.

## Kubernetes

### Set up kubectl

1. [Install kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/) on your workstation
2. Add a route to the internal IP of a master node (if needed):

```
sudo route add [master-internal-ip] gw [router-ip]
```

or

```
sudo route add -net [internal-subnet]/24 gw [router-ip]
```

3. List Kubernetes certificates & keys:

```
ssh [os-user]@[master-ip] sudo ls /etc/kubernetes/ssl/
```

4. Get `admin`'s certificates and keys:

```
ssh [os-user]@[master-ip] sudo cat /etc/kubernetes/ssl/admin-[cluster_name]-k8s-master-1-key.pem > admin-key.pem
ssh [os-user]@[master-ip] sudo cat /etc/kubernetes/ssl/admin-[cluster_name]-k8s-master-1.pem > admin.pem
ssh [os-user]@[master-ip] sudo cat /etc/kubernetes/ssl/ca.pem > ca.pem
```

5. Configure kubectl:

```ShellSession
$ kubectl config set-cluster default-cluster --server=https://[master-internal-ip]:6443 \
    --certificate-authority=ca.pem

$ kubectl config set-credentials default-admin \
    --certificate-authority=ca.pem \
    --client-key=admin-key.pem \
    --client-certificate=admin.pem

$ kubectl config set-context default-system --cluster=default-cluster --user=default-admin
$ kubectl config use-context default-system
```

6. Check it:

```
kubectl version
```

If you are using floating IP addresses then you may get this error:

```
Unable to connect to the server: x509: certificate is valid for 10.0.0.6, 10.0.0.6, 10.233.0.1, 127.0.0.1, not 132.249.238.25
```

You can tell kubectl to ignore this condition by adding the
`--insecure-skip-tls-verify` option.
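
Step 4 above can be wrapped in a small function so that the three files are fetched in one call and written with owner-only permissions, since `admin-key.pem` is a private key. This is a sketch under stated assumptions: the function name and the overridable `SSH` variable (handy for dry runs) are hypothetical, and the remote paths are the ones shown in step 4.

```shell
# Fetch the admin key, admin certificate and CA certificate from a master
# into the current directory.
# $1: OS user, $2: master address, $3: cluster_name
fetch_admin_certs() {
  if [ $# -ne 3 ]; then
    echo "usage: fetch_admin_certs OS_USER MASTER_IP CLUSTER_NAME" >&2
    return 1
  fi
  remote="$1@$2"
  ssl_dir=/etc/kubernetes/ssl
  admin="admin-$3-k8s-master-1"
  (
    umask 077  # create the files readable by the owner only
    ${SSH:-ssh} "$remote" sudo cat "${ssl_dir}/${admin}-key.pem" > admin-key.pem &&
    ${SSH:-ssh} "$remote" sudo cat "${ssl_dir}/${admin}.pem" > admin.pem &&
    ${SSH:-ssh} "$remote" sudo cat "${ssl_dir}/ca.pem" > ca.pem
  )
}

# Example (user and address are placeholders):
#   fetch_admin_certs core 203.0.113.10 example
```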

## GlusterFS

GlusterFS is not deployed by the standard `cluster.yml` playbook. See the
[GlusterFS playbook documentation](../../network-storage/glusterfs/README.md)
for instructions.

In short, Gluster is installed with:

```ShellSession
$ ansible-playbook --become -i inventory/$CLUSTER/hosts ./contrib/network-storage/glusterfs/glusterfs.yml
```

## What's next

Try out your new Kubernetes cluster with the [Hello Kubernetes service](https://kubernetes.io/docs/tasks/access-application-cluster/service-access-application-cluster/).