
k8s kube-apiserver CPU load is very high

    salamanderMH · 2019-04-25 16:58:24 +08:00 · 3932 views

    Problem

    Monitoring raised an alert, so I checked with the top command:

      PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                      
     1171 root      20   0 1155672 785112  77040 S 120.3  9.6 157:44.39 kube-apiserver                               
     7903 root      20   0 10.742g 777632  46784 S   5.3  9.5   8:23.43 etcd                                         
     8957 root      20   0 1365948 123764  73864 S   1.3  1.5   2:57.95 kubelet                                      
    10369 root      20   0   44012  31584  20276 S   1.3  0.4   1:53.49 calico-felix                                 
     1147 root      20   0  451168  89944  68120 S   1.0  1.1   1:51.80 kube-scheduler
    
    

    As you can see, kube-apiserver's CPU usage has spiked to 120%. I don't know what is causing it.

    Addendum 1  ·  2019-04-25 18:00:58 +08:00

    The apiserver log shows:

    E0425 10:00:25.721663       1 controller.go:111] loading OpenAPI spec for "v1beta1.admission.certmanager.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: Error: 'dial tcp 10.43.42.227:443: i/o timeout'
    Trying to reach: 'https://10.43.42.227:443/openapi/v2', Header: map[]
    I0425 10:00:25.721833       1 controller.go:119] OpenAPI AggregationController: action for item v1beta1.admission.certmanager.k8s.io: Rate Limited Requeue.
    I0425 10:00:25.722490       1 controller.go:105] OpenAPI AggregationController: Processing item v1beta1.metrics.k8s.io
    E0425 10:00:26.573574       1 available_controller.go:311] v1beta1.admission.certmanager.k8s.io failed with: Operation cannot be fulfilled on apiservices.apiregistration.k8s.io "v1beta1.admission.certmanager.k8s.io": the object has been modified; please apply your changes to the latest version and try again
    E0425 10:00:27.436960       1 available_controller.go:311] v1beta1.metrics.k8s.io failed with: Operation cannot be fulfilled on apiservices.apiregistration.k8s.io "v1beta1.metrics.k8s.io": the object has been modified; please apply your changes to the latest version and try again
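
To see which aggregated APIs are failing and in what way, the error lines above can be tallied. This is a minimal Python sketch, assuming the klog line layout shown in this log. `classify`, `tally_errors`, and both regular expressions are illustrative names and patterns, not part of Kubernetes:

```python
import re
from collections import Counter

# klog header: severity letter + MMDD, time, thread id, "file:line]", then the message.
KLOG_RE = re.compile(
    r"^(?P<sev>[IWEF])(?P<date>\d{4})\s+(?P<time>[\d:.]+)\s+\d+\s+"
    r"(?P<src>[\w.]+:\d+)\]\s+(?P<msg>.*)$"
)
# Assumption: APIService names look like "v1beta1.<group>", as in this thread.
APISERVICE_RE = re.compile(r"\bv\d+(?:alpha\d+|beta\d+)?\.[a-z0-9.-]+\.[a-z]+\b")


def classify(msg):
    """Bucket a message by the two failure modes seen in this log."""
    if "timeout" in msg.lower():
        return "timeout"
    if "the object has been modified" in msg:
        return "update-conflict"
    return "other"


def tally_errors(lines):
    """Count E-level lines per (APIService name, failure kind)."""
    counts = Counter()
    for line in lines:
        m = KLOG_RE.match(line.strip())
        if m is None or m.group("sev") != "E":
            continue  # skip info lines and continuation lines
        svc = APISERVICE_RE.search(m.group("msg"))
        name = svc.group(0) if svc else "unknown"
        counts[(name, classify(m.group("msg")))] += 1
    return counts
```

Feeding it a saved log, for example `tally_errors(open("apiserver.log"))` with a hypothetical file name, gives a quick view of whether timeouts or update conflicts dominate for each APIService.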
    
    10 replies  ·  2019-04-26 14:22:05 +08:00
    HypoChen · #1 · 2019-04-25 17:01:23 +08:00
    Check the logs first and look for anything abnormal, for example whether some buggy service is hammering your API server.
    salamanderMH (OP) · #2 · 2019-04-25 17:12:34 +08:00
    @HypoChen
    I took a look at the apiserver log:
    ```
    E0425 09:11:11.383772 1 available_controller.go:311] v1beta1.admission.certmanager.k8s.io failed with: Operation cannot be fulfilled on apiservices.apiregistration.k8s.io "v1beta1.admission.certmanager.k8s.io": the object has been modified; please apply your changes to the latest version and try again
    E0425 09:11:14.341853 1 available_controller.go:311] v1beta1.metrics.k8s.io failed with: Operation cannot be fulfilled on apiservices.apiregistration.k8s.io "v1beta1.metrics.k8s.io": the object has been modified; please apply your changes to the latest version and try again
    E0425 09:11:16.391080 1 available_controller.go:311] v1beta1.admission.certmanager.k8s.io failed with: Get https://10.43.42.227:443: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
    E0425 09:11:19.349480 1 available_controller.go:311] v1beta1.metrics.k8s.io failed with: Get https://10.43.219.61:443: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
    E0425 09:11:21.400839 1 available_controller.go:311] v1beta1.admission.certmanager.k8s.io failed with: Operation cannot be fulfilled on apiservices.apiregistration.k8s.io "v1beta1.admission.certmanager.k8s.io": the object has been modified; please apply your changes to the latest version and try again
    E0425 09:11:24.367592 1 available_controller.go:311] v1beta1.metrics.k8s.io failed with: Operation cannot be fulfilled on apiservices.apiregistration.k8s.io "v1beta1.metrics.k8s.io": the object has been modified; please apply your changes to the latest version and try again
    ```
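
Both timeout errors in the log point at `10.43.42.227:443` and `10.43.219.61:443`. One way to confirm that from the node running kube-apiserver is a plain TCP connect test. This is a minimal Python sketch; `can_connect` and `probe_all` are hypothetical helper names, and a successful connect only shows that the port is reachable, not that the service behind it is healthy:

```python
import socket

# Endpoints copied from the log lines above.
ENDPOINTS = [("10.43.42.227", 443), ("10.43.219.61", 443)]


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, unreachable hosts, and timeouts
        return False


def probe_all(endpoints, timeout=3.0):
    """Map each 'host:port' string to its reachability."""
    return {f"{h}:{p}": can_connect(h, p, timeout) for h, p in endpoints}
```

Calling `probe_all(ENDPOINTS)` on the master node returns a dict of reachability per endpoint. If both come back `False`, the network path to those addresses is worth checking before looking at kube-apiserver itself.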
    HypoChen · #3 · 2019-04-25 18:30:02 +08:00
    @salamanderMH How much request traffic is the API server handling?
    salamanderMH (OP) · #4 · 2019-04-25 20:09:50 +08:00
    @HypoChen I see about 1.29 Mbit/s of outbound bandwidth and 400 kbit/s of inbound bandwidth on the internal network.
    0312birdzhang · #5 · 2019-04-26 08:16:42 +08:00
    Which version are you running? I suspect your version has a bug. Restarting kubelet can ease the problem.
    salamanderMH (OP) · #6 · 2019-04-26 12:31:18 +08:00
    0312birdzhang · #7 · 2019-04-26 12:35:39 +08:00
    @salamanderMH Please give the exact version, down to the patch number.
    salamanderMH (OP) · #8 · 2019-04-26 12:46:06 +08:00
    @0312birdzhang v1.11.6
    0312birdzhang · #9 · 2019-04-26 13:36:31 +08:00
    @salamanderMH #8 You can upgrade straight to 1.11.7; a bug was fixed in that release. That said, your errors are not quite the same as ours. Ours said something about the version having been changed.
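
Whether a cluster is still below the release that carries the fix is a simple patch-level comparison. This is a minimal Python sketch, assuming plain `major.minor.patch` version strings with an optional `v` prefix; `parse_version` and `needs_upgrade` are illustrative names:

```python
import re


def parse_version(text):
    """Turn 'v1.11.6' or '1.11.7' into a comparable tuple such as (1, 11, 6)."""
    m = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)", text.strip())
    if m is None:
        raise ValueError(f"not a major.minor.patch version: {text!r}")
    return tuple(int(part) for part in m.groups())


def needs_upgrade(current, fixed_in):
    """True when `current` is older than the release that carries the fix."""
    return parse_version(current) < parse_version(fixed_in)
```

Comparing integer tuples rather than strings keeps versions like `1.11.10` ordered after `1.11.9`.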
    salamanderMH (OP) · #10 · 2019-04-26 14:22:05 +08:00
    @0312birdzhang OK, I'll give it a try.