A problem caused by deleting a CRD

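Running k delete crd ingressroutes.traefik.containo.us (k being an alias for kubectl) crashed kube-apiserver on version 1.16. Recovering requires deleting the corresponding key, /registry/apiextensions.k8s.io/customresourcedefinitions/ingressroutes.traefik.containo.us, from etcd by hand.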
Issue: https://github.com/kubernetes/kubernetes/issues/90585 Solution: https://github.com/kubernetes/kubernetes/pull/83789 The workaround, as quoted: The issue lies in the fact that kube-apiserver is trying to fetch details with respect to CRDs from ETCD but the memory address of the CRDs is changed as soon as we run the delete statement.
To resolve this you can exec into the etcd pod (if running as a pod, or directly via the etcdctl command line if running as a service) and run the following commands to delete the entries from there permanently:
    List all the CRDs in ETCD -> ETCDCTL_API=3 etcdctl get /registry/apiextensions.k8s.io/customresourcedefinitions --limit=10 --prefix --keys-only --cacert=cacertfile --cert=tlscertfile --key=tlskeyfile --endpoints https://127.0.0.1:2379
    Delete the individual CRDs with respect to prometheus -> ETCDCTL_API=3 etcdctl del /registry/apiextensions.k8s.io/customresourcedefinitions/podmonitors.monitoring.coreos.com --cacert=cacertfile --cert=tlscertfile --key=tlskeyfile --endpoints https://127.0.0.1:2379
There will be several other Prometheus CRDs (5 as far as I remember); delete them in a similar manner and restart your docker and kubelet services.
PS: This was the quickest and the only fix without upgrading or crashing the server and reinstalling kubernetes
@liggitt Can you please verify the same if possible? I have tried and tested this in my ENV and it works like a charm :)
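The workaround quoted above boils down to mapping each CRD name to its etcd key and deleting that key. Here is a minimal sketch, assuming etcdctl v3 and the placeholder certificate file names from the quote. The helper names crd_key and delete_crd_keys and the DRY_RUN and ETCD_ENDPOINT variables are hypothetical, not part of etcdctl. By default the script only prints the commands it would run:

```shell
#!/bin/sh
# Sketch only: helper names, DRY_RUN and ETCD_ENDPOINT are illustrative.
set -eu

CRD_PREFIX="/registry/apiextensions.k8s.io/customresourcedefinitions"
# Assumption: adjust the endpoint and certificate paths to your cluster.
ETCD_ENDPOINT="${ETCD_ENDPOINT:-https://127.0.0.1:2379}"

# Print the etcd key under which the named CRD object is stored.
crd_key() {
  printf '%s/%s\n' "$CRD_PREFIX" "$1"
}

# For each CRD name: print the delete command (DRY_RUN=1, the default)
# or actually run it against etcd (DRY_RUN=0).
delete_crd_keys() {
  for name in "$@"; do
    key="$(crd_key "$name")"
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "ETCDCTL_API=3 etcdctl del $key --cacert=cacertfile --cert=tlscertfile --key=tlskeyfile --endpoints $ETCD_ENDPOINT"
    else
      ETCDCTL_API=3 etcdctl del "$key" \
        --cacert=cacertfile --cert=tlscertfile --key=tlskeyfile \
        --endpoints "$ETCD_ENDPOINT"
    fi
  done
}

delete_crd_keys podmonitors.monitoring.coreos.com ingressroutes.traefik.containo.us
```

Reviewing the printed commands before setting DRY_RUN=0 is a cheap safeguard, since a key deleted from etcd this way bypasses the API server entirely.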

ETCDCTL_API=3 etcdctl del /registry/apiextensions.k8s.io/customresourcedefinitions/ingressroutes.traefik.containo.us --endpoints https://1.0.:2379

systemctl restart docker
systemctl restart kubelet
systemctl restart kube-apiserver
systemctl status kube-apiserver
After k delete crd ingressroutes.traefik.containo.us, kube-apiserver crashed. To find the error, add the logging flags --logtostderr=true --v=0.

Start kube-apiserver with the logging flags to see the specific error: /opt/k8s/bin/kube-apiserver --logtostderr=true --v=0

The error turned out to be: kube apiserver Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
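If the apiserver output is captured to a file, the panic line can be picked out with grep. This is a small sketch: the function name has_nil_panic is hypothetical, and the sample line is the error message reported above.

```shell
#!/bin/sh
# Succeed if the log text on stdin contains the nil-pointer panic.
has_nil_panic() {
  grep -q 'Observed a panic: .*invalid memory address or nil pointer dereference'
}

# Sample input: the message from this post, with plain ASCII quotes.
sample='kube apiserver Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)'

if printf '%s\n' "$sample" | has_nil_panic; then
  echo "panic found"
fi
```

Against a real run, the same check could read a saved log instead, for example has_nil_panic < apiserver.log, where the file name is an assumption.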
According to the GitHub issue, the crash is triggered by deleting the CRD and is a bug in kube-apiserver 1.16, so the CRD data has to be removed from etcd by hand with etcdctl: ETCDCTL_API=3 etcdctl del /registry/apiextensions.k8s.io/customresourcedefinitions/ingressroutes.traefik.containo.us --endpoints https://179 After that, restarting docker, kubelet and kube-apiserver brought everything back to normal.
