When etcd exceeds its memory limit, it becomes useless but keeps running.
We should let OOM killer kill etcd process in the container, so systemd can spot
the problem and restart etcd according to "Restart" setting in etcd.service unit file.
If OOME problem keep repeating, i.e. it happens every single restart,
systemd will eventually back off and stop restarting it anyway.
--restart=on-failure:5 in this file has no effect because memory allocation error
doesn't by itself cause the process to die
Related: https://github.com/kubernetes-incubator/kubespray/blob/master/roles/etcd/templates/etcd-docker.service.j2
This kind of reverts a change introduced in #1860.
Even though there it kubeadm_token_ttl=0 which means that kubeadm token never expires, it is not present in `kubeadm token list` after cluster is provisioned (at least after it is running for some time) and there is issue regarding this https://github.com/kubernetes/kubeadm/issues/335, so we need to create a new temporary token during the cluster upgrade.
Ansible automatically installs the python-apt package when using
the 'apt' Ansible module, if python-apt is not present. This patch
removes the (unneeded) explicit installation in the Kubespray
'preinstall' role.
In some installation, it can take up to 3sec to get the value. Retrying
for 5 sec will ensure the command won't return 1.
Signed-off-by: Sébastien Han <seb@redhat.com>
* allow installs to not have hostname overriden with fqdn from inventory
* calico-config no longer requires local as and will default to global
* when cloudprovider is not defined, use the inventory_hostname for cni-calico
* allow reset to not restart network (buggy nodes die with this cmd)
* default kube_override_hostname to inventory_hostname instead of ansible_hostname