Until now it was not possible to add an API Loadbalancer
without an static IP Address. But certain Loadbalancers
like AWS Elastic Loadbalanacer dontt have an fixed IP address.
With this commit it is possible to add these kind of Loadbalancers
to the Kargo deployment.
Updates based on feedback
Simplify checks for file exists
remove invalid char
Review feedback. Use regular systemd file.
Add template for docker systemd atomic
* Leave all.yml to keep only optional vars
* Store groups' specific vars by existing group names
* Fix optional vars casted as mandatory (add default())
* Fix missing defaults for an optional IP var
* Relink group_vars for terraform to reflect changes
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
Migrate older inline= syntax to pure yml syntax for module args as to be consistant with most of the rest of the tasks
Cleanup some spacing in various files
Rename some files named yaml to yml for consistancy
* Drop linux capabilities for unprivileged containerized
worlkoads Kargo configures for deployments.
* Configure required securityContext/user/group/groups for kube
components' static manifests, etcd, calico-rr and k8s apps,
like dnsmasq daemonset.
* Rework cloud-init (etcd) users creation for CoreOS.
* Fix nologin paths, adjust defaults for addusers role and ensure
supplementary groups membership added for users.
* Add netplug user for network plugins (yet unused by privileged
networking containers though).
* Grant the kube and netplug users read access for etcd certs via
the etcd certs group.
* Grant group read access to kube certs via the kube cert group.
* Remove priveleged mode for calico-rr and run it under its uid/gid
and supplementary etcd_cert group.
* Adjust docs.
* Align cpu/memory limits and dropped caps with added rkt support
for control plane.
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
Add BGP route reflectors support in order to optimize BGP topology
for deployments with Calico network plugin.
Also bump version of calico/ctl for some bug fixes.
* For Debian/RedHat OS families (with NetworkManager/dhclient/resolvconf
optionally enabled) prepend /etc/resolv.conf with required nameservers,
options, and supersede domain and search domains via the dhclient/resolvconf
hooks.
* Drop (z)nodnsupdate dhclient hook and re-implement it to complement the
resolvconf -u command, which is distro/cloud provider specific.
Update docs as well.
* Enable network restart to apply and persist changes and simplify handlers
to rely on network restart only. This fixes DNS resolve for hostnet K8s
pods for Red Hat OS family. Skip network restart for canal/calico plugins,
unless https://github.com/projectcalico/felix/issues/1185 fixed.
* Replace linefiles line plus with_items to block mode as it's faster.
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
Co-authored-by: Matthew Mosesohn <mmosesohn@mirantis.com>
In order to enable offline/intranet installation cases:
* Move DNS/resolvconf configuration to preinstall role. Remove
skip_dnsmasq_k8s var as not needed anymore.
* Preconfigure DNS stack early, which may be the case when downloading
artifacts from intranet repositories. Do not configure
K8s DNS resolvers for hosts /etc/resolv.conf yet early (as they may be
not existing).
* Reconfigure K8s DNS resolvers for hosts only after kubedns/dnsmasq
was set up and before K8s apps to be created.
* Move docker install task to early stage as well and unbind it from the
etcd role's specific install path. Fix external flannel dependency on
docker role handlers. Also fix the docker restart handlers' steps
ordering to match the expected sequence (the socket then the service).
* Add default resolver fact, which is
the cloud provider specific and remove hardcoded GCE resolver.
* Reduce default ndots for hosts /etc/resolv.conf to 2. Multiple search
domains combined with high ndots values lead to poor performance of
DNS stack and make ansible workers to fail very often with the
"Timeout (12s) waiting for privilege escalation prompt:" error.
* Update docs.
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
The variale etcd_access_addresses is used to determine
how to address communication from other roles to
the etcd cluster.
It was set to the address that ansible uses to
connect to instance ({{ item }})s and not the
the variable:
ip_access
which had already been created and could already
be overridden through the access_ip variable.
This change allows ansible to connect to a machine using
a different address than the one used to access etcd.
Also adds all masters by hostname and localhost/127.0.0.1 to
apiserver SSL certificate.
Includes documentation update on how localhost loadbalancer works.
* Add HA docs for API server.
* Add auto-evaluated internal endpoints and clarify the loadbalancer_apiserver
vars and usecases.
* Use facts for kube_apiserver to not repeat code and enable LB endpoints use.
* Use /healthz check for the wait-for apiserver.
* Use the single endpoint for kubelet instead of the list of apiservers
* Specify kube_apiserver_count to for HA layout
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
* Add auto-evaluated internal endpoints and clarify the loadbalancer_apiserver
vars and usecases.
* Add loadbalancer_apiserver_localhost (default false). If enabled, override
the external LB and expect localhost:443/8080 to be new internal only frontends.
* Add kube_apiserver_multiaccess to ignore loadbalancers, and make clients
to access the apiservers as a comma-separated list of access_ip/ip/ansible ip
(a default mode). When disabled, allow clients to use the given loadbalancers.
* Define connections security mode for kube controllers, schedulers, proxies.
It is insecure be default, which is the current deployment choice.
* Rework the groups['kube-master'][0] hardcode defining the apiserver
endpoints.
* Improve grouping of vars and add facts for kube_apiserver.
* Define kube_apiserver_insecure_bind_address as a fact, add more
facts for ease of use.
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
Move set_facts to the preinstall scope, so every role
may see it. For example, network plugins to see the etcd_endpoint.
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
* Enforce a etcd-proxy role to a k8s-cluster group members. This
provides an HA layout for all of the k8s cluster internal clients.
* Proxies to be run on each node in the group as a separate etcd
instances with a readwrite proxy mode and listen the given endpoint,
which is either the access_ip:2379 or the localhost:2379.
* A notion for the 'kube_etcd_multiaccess' is: ignore endpoints and
loadbalancers and use the etcd members IPs as a comma-separated
list. Otherwise, clients shall use the local endpoint provided by a
etcd-proxy instances on each etcd node. A Netwroking plugins always
use that access mode.
* Fix apiserver's etcd servers args to use the etcd_access_endpoint.
* Fix networking plugins flannel/calico to use the etcd_endpoint.
* Fix name env var for non masters to be set as well.
* Fix etcd_client_url was not used anywhere and other etcd_* facts
evaluation was duplicated in a few places.
* Define proxy modes only in the env file, if not a master. Del
an automatic proxy mode decisions for etcd nodes in init/unit scripts.
* Use Wants= instead of Requires= as "This is the recommended way to
hook start-up of one unit to the start-up of another unit"
* Make apiserver/calico Wants= etcd-proxy to keep it always up
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
Co-authored-by: Matthew Mosesohn <mmosesohn@mirantis.com>
Running etcd in Docker reduces the number of individual file
downloads and services running on the host.
Note: etcd container v3.0.1 moves bindir to /usr/local/bin
Fixes: #298
* Set ETCD_INITIAL_CLUSTER_STATE from `new` to `existing`,
because parameter `new` makes sense only on cluster assembly
stage.
* If cluster exists and current node is not a part
of the cluster, add it with command `etcdctl add member name url`.
Closes kubespray/kargo/#270