The kubeconfig provided to the kubernetes-control-plane-online.service
is invalid. However, the apiserver /healthz endpoint can be accessed without auth so it's
simpler to just use curl for that.
by adding targets and curl wait loops to services to ensure services
are not started before their depended services are reachable.
Extra targets cfssl-online.target and kube-apiserver-online.target
syncronize starts across machines and node-online.target ensures
docker is restarted and ready to deploy containers on after flannel
has discussed the network cidr with apiserver.
Since flannel needs to be started before addon-manager to configure
the docker interface, it has to have its own rbac bootstrap service.
The curl wait loops within the other services exists to ensure that when
starting the service it is able to do its work immediately without
clobbering the log about failing conditions.
By ensuring kubernetes.target is only reached after starting the
cluster it can be used in the tests as a wait condition.
In kube-certmgr-bootstrap mkdir is needed for it to not fail to start.
The following is the relevant part of systemctl list-dependencies
default.target
● ├─certmgr.service
● ├─cfssl.service
● ├─docker.service
● ├─etcd.service
● ├─flannel.service
● ├─kubernetes.target
● │ ├─kube-addon-manager.service
● │ ├─kube-proxy.service
● │ ├─kube-apiserver-online.target
● │ │ ├─flannel-rbac-bootstrap.service
● │ │ ├─kube-apiserver-online.service
● │ │ ├─kube-apiserver.service
● │ │ ├─kube-controller-manager.service
● │ │ └─kube-scheduler.service
● │ └─node-online.target
● │ ├─node-online.service
● │ ├─flannel.target
● │ │ ├─flannel.service
● │ │ └─mk-docker-opts.service
● │ └─kubelet.target
● │ └─kubelet.service
● ├─network-online.target
● │ └─cfssl-online.target
● │ ├─certmgr.service
● │ ├─cfssl-online.service
● │ └─kube-certmgr-bootstrap.service
to protect services from crashing and clobbering the logs when
certificates are not in place yet and make sure services are activated
when certificates are ready.
To prevent errors similar to "kube-controller-manager.path: Failed to
enter waiting state: Too many open files"
fs.inotify.max_user_instances has to be increased.
Before flannel is ready there is a brief time where docker will be
running with a default docker0 bridge. If kubernetes happens to spawn
containers before flannel is ready, docker can't be restarted when
flannel is ready because some containers are still running on the
docker0 bridge with potentially different network addresses.
Environment variables in `EnvironmentFile` override those defined via
`Environment` in the systemd service config.
Co-authored-by: Christian Albrecht <christian.albrecht@mayflower.de>
+ isolate etcd on the master node by letting it listen only on loopback
+ enabling kubelet on master and taint master with NoSchedule
The reason for the latter is that flannel requires all nodes to be "registered"
in the cluster in order to setup the cluster network. This means that the
kubelet is needed even at nodes on which we don't plan to schedule anything.