The new CI does not define k8s_cluster group, so it relies on
kubernetes-sigs/kubespray#11559.
This does not work for upgrade testing (which uses the previous release).
We can revert this commit after 2.27.0.
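For reference, a minimal sketch of what defining the group explicitly looks like in a YAML inventory, as releases that predate kubernetes-sigs/kubespray#11559 expect it (host names are illustrative, not the CI inventory):

```yaml
# Sketch only: k8s_cluster defined explicitly from its member groups.
all:
  children:
    kube_control_plane:
      hosts:
        node1:
    kube_node:
      hosts:
        node1:
        node2:
    etcd:
      hosts:
        node1:
    k8s_cluster:
      children:
        kube_control_plane:
        kube_node:
```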
We should not roll back our test setup during upgrade tests.
The only reason to do that would be incompatible changes in the test
inventory, and we already check out master for those (${CI_JOB_NAME}.yml).
Also do some cleanup by removing unnecessary intermediate variables.
VirtualMachineInstance resources sometimes temporarily lose their
IP (at least as far as the kubevirt controllers can see).
See https://github.com/kubevirt/kubevirt/issues/12698 for the upstream
bug.
This does not seem to affect actual connections (if it did, our current
CI would not work).
However, our CI executes multiple playbooks, in particular:
1. The provisioning playbook (which checks that the IPs have been
provisioned by querying the K8S API)
2. Kubespray itself
If any of the VirtualMachineInstances loses its IP after 1 has checked
for it and before 2 starts, the dynamic inventory (which is invoked when
the playbook is launched by ansible-playbook) will not have an IP for
that host, and will try to use its name for SSH, which of course will
not work.
Instead, when we have a valid state during provisioning (all IPs
present), use it to construct a static inventory which will be used for
the rest of the CI run.
This allows a single source of truth for the virtual machines in a
kubevirt ci-run.
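A hedged sketch of the idea (play, task, and variable names are illustrative, not the exact CI code): once provisioning has seen every VMI with an IP, dump that state into a static inventory file used by the rest of the run.

```yaml
# Illustrative only: build a static inventory from the observed VMI state,
# so later playbooks no longer depend on the dynamic inventory.
- name: Render a static inventory from the provisioned VMIs
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Fetch the VMIs of this CI job
      kubernetes.core.k8s_info:
        api_version: kubevirt.io/v1
        kind: VirtualMachineInstance
        namespace: "{{ ci_namespace }}"      # assumed variable
      register: vmis

    - name: Write the static inventory file
      ansible.builtin.copy:
        dest: "{{ ci_inventory_file }}"      # assumed variable
        content: |
          [all]
          {% for vmi in vmis.resources %}
          {{ vmi.metadata.name }} ansible_host={{ vmi.status.interfaces[0].ipAddress }}
          {% endfor %}
```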
`etcd_member_name` should be correctly handled in kubespray-defaults for
testing the recovery cases.
Not constraining the inventory to .ini allows us to use dynamic
inventory, which is needed to simplify the kubevirt jobs' inventory.
Also reduces the scope of the ANSIBLE_INVENTORY variable.
VMIs in Kubevirt are the abstraction below VirtualMachine.
- We don't really need the extra abstraction of VirtualMachine objects
- Convert waiting for the VMs' IP addresses to use kubernetes.core.k8s_info
  instead of a shell pipeline, as sketched below
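A minimal sketch of that wait (namespace variable and retry values are assumptions, not the exact CI code):

```yaml
# Illustrative wait: retry until every VMI reports an IP address in its status.
- name: Wait for all VMIs to report an IP address
  kubernetes.core.k8s_info:
    api_version: kubevirt.io/v1
    kind: VirtualMachineInstance
    namespace: "{{ ci_namespace }}"          # assumed variable
  register: vmis
  retries: 30
  delay: 10
  until: >-
    vmis.resources | length > 0
    and (vmis.resources
         | selectattr('status.interfaces.0.ipAddress', 'defined')
         | list | length) == (vmis.resources | length)
```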
This leverages the Kubernetes GC to delete kubevirt VMs, by using
ownerReferences, with the CI pod running the playbook as the owner.
Concretely, this means that the control plane in our CI cluster will
delete the kubevirt VMs associated with a particular CI job as soon as
that job's pod is deleted, which usually happens when the job terminates
(barring errors, which will be addressed in the cluster directly).
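A hedged sketch of the manifest side (metadata only; the variable names and the downward-API style env vars for the owning pod are assumptions):

```yaml
# Metadata sketch: the CI pod running the playbook is set as the owner, so
# the cluster's garbage collector deletes the VMI when that pod is deleted.
apiVersion: kubevirt.io/v1
kind: VirtualMachineInstance
metadata:
  name: "{{ vm_name }}"                          # assumed variable
  namespace: "{{ ci_namespace }}"                # assumed variable
  ownerReferences:
    - apiVersion: v1
      kind: Pod
      name: "{{ lookup('env', 'POD_NAME') }}"    # assumed env var for the CI pod
      uid: "{{ lookup('env', 'POD_UID') }}"      # must match the owning pod's uid
```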
Upgrade to kubevirt.io/v1 for the VirtualMachine manifests, since the
alpha version is deprecated.
Before these changes, `ansible_facts.services["containerd.service"]` was not defined, so the check that triggers the container stop and delete behavior failed.
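A minimal sketch of the guarded check (task names and the crictl cleanup command are illustrative, not the exact reset tasks):

```yaml
# Gather service facts first, then guard the cleanup so a missing
# containerd.service entry no longer fails the play.
- name: Populate service facts
  ansible.builtin.service_facts:

- name: Stop and remove all containers           # illustrative cleanup task
  ansible.builtin.command: "{{ bin_dir | default('/usr/local/bin') }}/crictl rm -fa"
  when:
    - "'containerd.service' in ansible_facts.services"
    - ansible_facts.services['containerd.service'].state == 'running'
```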
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
Simplify registry mirror rendering in config.toml.
The map filter can extract the host list from the mirrors, so we can
just deduplicate them with unique and render them without needing to
construct intermediate vars for it.
For the registry mirror tls section, we can first extract the mirrors
from the dict, keep only the ones that have skip_verify defined, and
then keep only the ones where it is true (the dict might not have
skip_verify defined, which would otherwise cause undefined-variable
errors).
This will speed up and simplify the templating.
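A hedged sketch of the two filter chains, written as an Ansible task for illustration and assuming the usual containerd_registries_mirrors layout (a list of {prefix, mirrors: [{host, capabilities, skip_verify}]}):

```yaml
# Derive the mirror hosts and the skip_verify hosts inline, without
# intermediate set_fact variables.
- name: Illustrate the mirror host and skip_verify filter chains
  vars:
    mirror_hosts: >-
      {{ containerd_registries_mirrors | map(attribute='mirrors') | flatten
         | map(attribute='host') | unique }}
    skip_verify_hosts: >-
      {{ containerd_registries_mirrors | map(attribute='mirrors') | flatten
         | selectattr('skip_verify', 'defined') | selectattr('skip_verify')
         | map(attribute='host') | unique }}
  ansible.builtin.debug:
    msg: "mirrors: {{ mirror_hosts }}, skip tls verify for: {{ skip_verify_hosts }}"
```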
Signed-off-by: Seena Fallah <seenafallah@gmail.com>
Dropping the ansible dependencies for ansible-lint will allow us to
catch missing collection dependencies in galaxy.yml. For collections
needed only for contrib/ or tests/ (i.e. not part of core kubespray
dependencies), we can just configure ansible-lint to mock them.
This means it won't check the mocked modules' parameters, but for those
areas of the code base it's an acceptable trade-off.
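The relevant knob looks roughly like this (the module name below is only an example of a non-core module, not necessarily one kubespray mocks):

```yaml
# .ansible-lint (excerpt): mock modules that are not part of kubespray's
# core galaxy.yml dependencies, e.g. ones used only under contrib/ or tests/.
mock_modules:
  - gluster.gluster.gluster_volume   # hypothetical example
mock_roles: []
```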
Using the hosts directive at the play level prevents those tasks from
being run when using --limit and the group in question is not part of
the limit (e.g.: running scale.yml on new worker nodes only).
Instead, run on all hosts and, for each group, partition hosts between
that group and '_' (a generic group name which is not used; an empty
string as the group name is not supported by ansible.builtin.group_by),
as sketched below.
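A minimal sketch of the partition pattern, using k8s_cluster as the example group (the real playbooks repeat this for each group they need):

```yaml
# Run on all hosts so the group is rebuilt even under --limit.
- name: Build the k8s_cluster group even when --limit is used
  hosts: all
  gather_facts: false
  tasks:
    - name: Partition hosts between k8s_cluster and the unused '_' group
      ansible.builtin.group_by:
        key: >-
          {{ 'k8s_cluster'
             if ('kube_control_plane' in group_names or 'kube_node' in group_names)
             else '_' }}
```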
Reported-by: asteppat <asteppat@cisco.com>
* Add Fedora 39/40 to Vagrantfile
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
* Add CI tests for Fedora 39/40
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
* Update CI tests documentation
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
* Update supported OS versions in README.md
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
---------
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
* Update cluster-role for cilium to prevent errors in agent startup
ciliumloadbalancerippools permissions exist in the cilium helm chart for version 1.13.0:
https://github.com/cilium/cilium/blob/v1.13.0/install/kubernetes/cilium/templates/cilium-agent/clusterrole.yaml#L71
The agent also needs permission to read/watch secrets for BGP auth when using a CiliumBGPPeeringPolicy with a secret.
* Remove list/watch permissions for secrets
* Remove secrets from list/watch permissions
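For reference, the added rule ends up looking roughly like the excerpt below, mirroring the upstream chart; the secrets read/watch rule mentioned above was dropped again per the follow-up commits, so it is not shown.

```yaml
# ClusterRole rules excerpt for the cilium agent (sketch based on the
# upstream cilium 1.13 chart; kubespray's actual template may differ).
- apiGroups:
    - cilium.io
  resources:
    - ciliumloadbalancerippools
  verbs:
    - get
    - list
    - watch
```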