
The Joy of Reinventing the Wheel! A Blazing-Fast Pure C Async Logging Library

weidaizi (MuggleWei) · 2023-10-12 14:07:20 +08:00 · 3829 views
This topic was created 421 days ago; the information in it may have changed since then.

Project: https://github.com/MuggleWei/haclog

Before the National Day holiday I read the NanoLog paper and found the approach brilliant. Unfortunately NanoLog isn't cross-platform, and in exchange for performance and throughput its log output isn't directly human-readable. With the general idea already in hand, the natural next step was to write my own, and since quite a few of my own libraries are in C, I wrote a pure C version.
I wrote bits and pieces during the long holiday and added a few more in my spare time at home over the past few days; today it has reached a state I'm fairly happy with. Below are benchmark results on a Mac M1 2021 and a ThinkPad X1C 2019 (the test program can be found directly in the project directory: haclog gbenchmark). Everyone is welcome to try it out and share feedback.

    Mac M1 2021

    Run on (8 X 24 MHz CPU s)
    CPU Caches:
      L1 Data 64 KiB
      L1 Instruction 128 KiB
      L2 Unified 4096 KiB (x8)
    Load Average: 8.88, 4.53, 2.84
    -----------------------------------------------------------------------------------------------------------
    Benchmark                                                                 Time             CPU   Iterations
    -----------------------------------------------------------------------------------------------------------
    ConstIterFixture/run/min_time:3.000/threads:1                          66.3 ns         66.2 ns     60803739
    ConstIterFixture/run/min_time:3.000/threads:2                          34.0 ns         67.8 ns     58196532
    ConstIterFixture/run/min_time:3.000/threads:4                          24.7 ns         96.0 ns     46114032
    ConstIterFixture/run/min_time:3.000/threads:7                          30.0 ns          146 ns     25301248
    ConstIterFixture/run/iterations:6000/repeats:5/threads:1                120 ns          120 ns         6000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:1               74.8 ns         74.8 ns         6000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:1               65.9 ns         66.0 ns         6000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:1               87.8 ns         87.8 ns         6000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:1               79.8 ns         80.0 ns         6000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:1_mean          85.7 ns         85.7 ns            5
    ConstIterFixture/run/iterations:6000/repeats:5/threads:1_median        79.8 ns         80.0 ns            5
    ConstIterFixture/run/iterations:6000/repeats:5/threads:1_stddev        20.8 ns         20.7 ns            5
    ConstIterFixture/run/iterations:6000/repeats:5/threads:1_cv           24.29 %         24.19 %             5
    ConstIterFixture/run/iterations:6000/repeats:5/threads:4               25.6 ns          102 ns        24000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:4               23.0 ns         91.8 ns        24000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:4               28.5 ns          114 ns        24000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:4               27.8 ns          111 ns        24000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:4               28.6 ns          114 ns        24000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:4_mean          26.7 ns          107 ns            5
    ConstIterFixture/run/iterations:6000/repeats:5/threads:4_median        27.8 ns          111 ns            5
    ConstIterFixture/run/iterations:6000/repeats:5/threads:4_stddev        2.42 ns         9.71 ns            5
    ConstIterFixture/run/iterations:6000/repeats:5/threads:4_cv            9.05 %          9.09 %             5
    ConstIterFixture/run/iterations:6000/repeats:5/threads:8               16.7 ns          132 ns        48000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:8               17.5 ns          130 ns        48000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:8               17.5 ns          135 ns        48000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:8               16.8 ns          134 ns        48000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:8               16.9 ns          133 ns        48000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:8_mean          17.1 ns          133 ns            5
    ConstIterFixture/run/iterations:6000/repeats:5/threads:8_median        16.9 ns          133 ns            5
    ConstIterFixture/run/iterations:6000/repeats:5/threads:8_stddev       0.396 ns         1.64 ns            5
    ConstIterFixture/run/iterations:6000/repeats:5/threads:8_cv            2.32 %          1.24 %             5
    ConstIterFixture/run/iterations:6000/repeats:5/threads:16              7.82 ns          125 ns        96000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:16              8.64 ns          136 ns        96000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:16              13.2 ns          204 ns        96000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:16              7.25 ns          114 ns        96000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:16              11.0 ns          169 ns        96000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:16_mean         9.57 ns          149 ns            5
    ConstIterFixture/run/iterations:6000/repeats:5/threads:16_median       8.64 ns          136 ns            5
    ConstIterFixture/run/iterations:6000/repeats:5/threads:16_stddev       2.47 ns         36.7 ns            5
    ConstIterFixture/run/iterations:6000/repeats:5/threads:16_cv          25.83 %         24.56 %             5
    

ThinkPad X1C 2019

    Run on (12 X 1780.93 MHz CPU s)
    CPU Caches:
      L1 Data 32 KiB (x6)
      L1 Instruction 32 KiB (x6)
      L2 Unified 256 KiB (x6)
      L3 Unified 12288 KiB (x1)
    Load Average: 0.09, 0.03, 0.08
    -----------------------------------------------------------------------------------------------------------
    Benchmark                                                                 Time             CPU   Iterations
    -----------------------------------------------------------------------------------------------------------
    ConstIterFixture/run/min_time:3.000/threads:1                          63.8 ns         63.7 ns     65480842
    ConstIterFixture/run/min_time:3.000/threads:2                          35.3 ns         70.6 ns     57795198
    ConstIterFixture/run/min_time:3.000/threads:6                          28.4 ns          170 ns     19480434
    ConstIterFixture/run/min_time:3.000/threads:11                         2491 ns        27350 ns       370425
    ConstIterFixture/run/iterations:6000/repeats:5/threads:1                144 ns          144 ns         6000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:1                145 ns          144 ns         6000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:1                144 ns          144 ns         6000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:1                137 ns          136 ns         6000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:1                128 ns          128 ns         6000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:1_mean           140 ns          139 ns            5
    ConstIterFixture/run/iterations:6000/repeats:5/threads:1_median         144 ns          144 ns            5
    ConstIterFixture/run/iterations:6000/repeats:5/threads:1_stddev        7.14 ns         7.04 ns            5
    ConstIterFixture/run/iterations:6000/repeats:5/threads:1_cv            5.11 %          5.06 %             5
    ConstIterFixture/run/iterations:6000/repeats:5/threads:6               40.1 ns          241 ns        36000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:6               38.5 ns          231 ns        36000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:6               41.1 ns          246 ns        36000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:6               39.6 ns          238 ns        36000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:6               38.1 ns          228 ns        36000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:6_mean          39.5 ns          237 ns            5
    ConstIterFixture/run/iterations:6000/repeats:5/threads:6_median        39.6 ns          238 ns            5
    ConstIterFixture/run/iterations:6000/repeats:5/threads:6_stddev        1.20 ns         7.48 ns            5
    ConstIterFixture/run/iterations:6000/repeats:5/threads:6_cv            3.03 %          3.16 %             5
    ConstIterFixture/run/iterations:6000/repeats:5/threads:12              37.3 ns          446 ns        72000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:12              36.2 ns          434 ns        72000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:12              35.5 ns          426 ns        72000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:12              34.7 ns          416 ns        72000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:12              37.6 ns          450 ns        72000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:12_mean         36.3 ns          434 ns            5
    ConstIterFixture/run/iterations:6000/repeats:5/threads:12_median       36.2 ns          434 ns            5
    ConstIterFixture/run/iterations:6000/repeats:5/threads:12_stddev       1.21 ns         13.8 ns            5
    ConstIterFixture/run/iterations:6000/repeats:5/threads:12_cv           3.34 %          3.19 %             5
    ConstIterFixture/run/iterations:6000/repeats:5/threads:24              12.8 ns          307 ns       144000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:24              11.9 ns          286 ns       144000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:24              11.6 ns          277 ns       144000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:24              13.6 ns          326 ns       144000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:24              12.5 ns          300 ns       144000
    ConstIterFixture/run/iterations:6000/repeats:5/threads:24_mean         12.5 ns          299 ns            5
    ConstIterFixture/run/iterations:6000/repeats:5/threads:24_median       12.5 ns          300 ns            5
    ConstIterFixture/run/iterations:6000/repeats:5/threads:24_stddev      0.783 ns         18.8 ns            5
    ConstIterFixture/run/iterations:6000/repeats:5/threads:24_cv           6.27 %          6.28 %             5
    
29 replies · last reply 2023-10-17 10:59:58 +08:00
Geekgogo · #1 · 2023-10-12 14:10:52 +08:00
Legend.
JL1990 · #2 · 2023-10-12 14:14:20 +08:00
Strong 👍
duandashen · #3 · 2023-10-12 14:23:11 +08:00
I have to hand it to you.
o562dsRcFqYl375i · #4 · 2023-10-12 14:53:22 +08:00
This must be what passion looks like. Envious.
hankai17 · #5 · 2023-10-12 15:07:32 +08:00
This looks a lot like something I maintained before; the core is a multi-queue ring buffer.
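For context, the "multi-queue ring buffer" idea usually means one single-producer/single-consumer ring per logging thread, all drained by a single background consumer. Below is a minimal SPSC ring sketch; the names and layout are illustrative and not taken from haclog or the library mentioned above.

```c
/* Minimal single-producer/single-consumer ring buffer sketch, one per
 * logging thread; a background consumer drains all of them. Illustrative
 * only, not code from either project. */
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_CAP 1024  /* power of two so index wrap is a cheap mask */

typedef struct {
    void           *slots[RING_CAP];
    _Atomic size_t  head;   /* advanced only by the consumer */
    _Atomic size_t  tail;   /* advanced only by the producer */
} spsc_ring_t;

/* producer side: the logging thread pushes a record pointer */
static bool spsc_push(spsc_ring_t *r, void *rec)
{
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (tail - head == RING_CAP)
        return false;  /* full: caller can spin, drop, or fall back */
    r->slots[tail & (RING_CAP - 1)] = rec;
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}

/* consumer side: the background thread pops and formats records */
static void *spsc_pop(spsc_ring_t *r)
{
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head == tail)
        return NULL;   /* empty */
    void *rec = r->slots[head & (RING_CAP - 1)];
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return rec;
}
```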
heelerdeer · #6 · 2023-10-12 15:57:05 +08:00
Awesome.
cmhonker · #7 · 2023-10-12 16:46:42 +08:00
C is such a pain; anyone still using C these days must be running on pure faith.
pengtdyd · #8 · 2023-10-12 19:03:28 +08:00
C is still impressive. I've done Java, JS, and Dart, and in the end I was still forced back to C; without understanding the low level it always feels like walking on one leg......
zhs227 · #9 · 2023-10-12 19:36:23 +08:00
Does log rotation support compressing rotated files into archives?
zooo · #10 · 2023-10-12 19:45:51 +08:00 · ❤️ 1
Pure C...

I'd rather use C++ than C.
hitmanx · #11 · 2023-10-12 19:48:35 +08:00
A small suggestion: many people are probably like me, quite interested in this, but we haven't read the paper and don't know where the innovation lies, nor what each item in your benchmark actually measures; on top of that, the benchmark is one you wrote yourself rather than a widely known one.

If performance is the selling point, a relative comparison against other products would probably work better than listing dozens or hundreds of absolute numbers...
zooo · #12 · 2023-10-12 19:57:42 +08:00
Kudos; being able to build this at all is already very 🐂.

I had assumed the original wasn't C++/C and was thinking of implementing it in C++, but it turns out the original is implemented in C++.
weidaizi (OP) · #13 · 2023-10-12 19:58:18 +08:00
@huangzongzhuan Thanks for the kind words! It's purely out of interest and I enjoy it; I hope you too can stay curious about technology for a long time!
weidaizi (OP) · #14 · 2023-10-12 20:06:19 +08:00
@zhs227 That could indeed be added, but since I want to keep the logging library itself simple and free of external dependencies, I need to think about how to implement it:
* implement the compression myself
* or pull code from an open-source compression library whose license allows commercial use
* or add, as an extra, a handler that depends on an external gzip implementation (roughly like the sketch below)
Thanks a lot for the reminder!!!
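A minimal sketch of what such an "extra" handler could look like if it leaned on zlib's gzip API. This is illustrative only and not an existing haclog handler at the time of the thread; the function name is hypothetical. Link with -lz.

```c
/* Hypothetical "extra" handler step: compress an already-rotated plain-text
 * log file into <path>.gz using zlib. Illustrative only, not part of haclog. */
#include <stdio.h>
#include <zlib.h>

static int gzip_rotated_file(const char *path)
{
    char gz_path[1024];
    snprintf(gz_path, sizeof(gz_path), "%s.gz", path);

    FILE *in = fopen(path, "rb");
    if (!in) return -1;

    gzFile out = gzopen(gz_path, "wb");
    if (!out) { fclose(in); return -1; }

    char buf[64 * 1024];
    size_t n;
    while ((n = fread(buf, 1, sizeof(buf), in)) > 0) {
        if (gzwrite(out, buf, (unsigned)n) == 0) {  /* 0 means error */
            gzclose(out);
            fclose(in);
            return -1;
        }
    }

    gzclose(out);
    fclose(in);
    return 0;  /* caller may now unlink(path) */
}
```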
weidaizi (OP) · #15 · 2023-10-12 20:07:17 +08:00
@zhs227 Once I've added it, I'll definitely come back and reply to you! (*^_^*)
weidaizi (OP) · #16 · 2023-10-12 20:18:30 +08:00 · ❤️ 1
@hitmanx Thanks for the reminder, that really is a good suggestion! I'll start a separate repo later that runs the same benchmark parameters against several different libraries, and I'll reply to you when it's up~

Also, you're right that I didn't explain the project's benchmark in much detail in the README; that's on me. It simply uses google benchmark to test two scenarios:
1. Set a minimum test time (MinTime in google benchmark) and stress-write logs as hard as possible for that duration, to see how efficient the logging front end is when the buffer is under heavy pressure (see the plain-C sketch below). This scenario is run with 1 / 2 / ${half the CPU count} / ${CPU count - 1} threads.
2. Set iterations and repetitions (Iterations + Repetitions in google benchmark) to see how fast it runs when the buffer is not saturated. This scenario is run with 1 / ${half the CPU count} / ${CPU count} / ${2 × CPU count} threads.
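For readers who don't use google benchmark, scenario 1 boils down to something like the following plain-C pthread sketch. The real test lives in haclog's gbenchmark target; log_write() here is just a placeholder for the measured call.

```c
/* Illustrative plain-C version of scenario 1: N threads hammer a logging
 * front end for a fixed wall-clock time, then total throughput is reported. */
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define NUM_THREADS  4
#define MIN_TIME_SEC 3

static atomic_bool g_stop;
static _Atomic unsigned long g_total;

static void log_write(int tid, unsigned long i)
{
    (void)tid; (void)i;  /* stand-in for the log call being measured */
}

static void *worker(void *arg)
{
    int tid = (int)(intptr_t)arg;
    unsigned long n = 0;
    while (!atomic_load_explicit(&g_stop, memory_order_relaxed)) {
        log_write(tid, n);
        ++n;
    }
    atomic_fetch_add(&g_total, n);
    return NULL;
}

int main(void)
{
    pthread_t threads[NUM_THREADS];
    for (intptr_t i = 0; i < NUM_THREADS; ++i)
        pthread_create(&threads[i], NULL, worker, (void *)i);

    sleep(MIN_TIME_SEC);           /* let the workers run at full pressure */
    atomic_store(&g_stop, true);

    for (int i = 0; i < NUM_THREADS; ++i)
        pthread_join(threads[i], NULL);

    printf("total writes in %d s: %lu\n", MIN_TIME_SEC,
           (unsigned long)atomic_load(&g_total));
    return 0;
}
```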
enchilada2020 · #17 · 2023-10-12 20:27:22 +08:00 via Android
Great stuff, upvoted.
weidaizi (OP) · #18 · 2023-10-12 20:32:37 +08:00
@zooo Aha, yes. If you're interested, take a look at my source code, or at NanoLog's; the idea is the same but the implementations are completely different, and both hide a lot of interesting things!
For example, a very simple question: why do both haclog and NanoLog require the format string to be a string literal? If you're curious, go dig around in the code, hahahaha.
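For the curious, one plausible answer (my reading of the general technique, not a quote from either codebase): a string literal has static storage duration, so the hot path can enqueue just the pointer plus the raw argument values and defer all parsing and formatting to the background thread; a heap or stack string might be freed or overwritten before the consumer reads it. A toy sketch with hypothetical names:

```c
/* Illustrative only; not haclog's real API. */
#include <stdio.h>

typedef struct {
    const char *fmt;   /* points at the literal; never copied */
    long long   arg0;  /* raw argument captured at the call site */
} log_record_t;

/* stub consumer: a real library would push rec into a ring buffer and
 * do the formatting later on the background thread */
static void ring_buffer_push(const log_record_t *rec)
{
    printf(rec->fmt, rec->arg0);
}

/* "" fmt_literal only compiles if the argument really is a string literal,
 * which is exactly the restriction being asked about */
#define LOG_FAST(fmt_literal, a0)                        \
    do {                                                 \
        log_record_t rec_ = { "" fmt_literal, (a0) };    \
        ring_buffer_push(&rec_);                         \
    } while (0)

int main(void)
{
    LOG_FAST("answer = %lld\n", 42LL);
    return 0;
}
```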
weidaizi (OP) · #19 · 2023-10-12 20:33:43 +08:00
@zooo Ah~ I missed that just now~ Thank you for the compliment! (*^_^*)
xmtpw · #20 · 2023-10-12 22:22:23 +08:00
Impressive, big shot.
icyalala · #21 · 2023-10-12 22:26:26 +08:00
I like C too.
For the performance comparison, I suggest putting a bar chart 📊 in the README; it really matters when it comes to showcasing performance.
weidaizi (OP) · #22 · 2023-10-12 22:34:45 +08:00
@icyalala 👍 Thanks for the reminder! A friend above also mentioned a side-by-side comparison. I plan to start a separate repo later that benchmarks various C/C++ logging libraries; that will also reveal which other libraries are especially fast.
bashbot · #23 · 2023-10-13 10:04:57 +08:00
A nice wheel; I hand-rolled one before too.
May I ask: can the multi-queue ring buffer guarantee the time ordering of logs across threads?

To guarantee time ordering I once wrote a growable multi-producer single-consumer ring buffer, but it had a hole in its data-race handling that occasionally dropped logs under multi-thread contention; in the end I had to fix it with a lock, and performance dropped a lot.
Also, does nobody use run-time log filtering by tag and level? I see that many logging libraries lack these features.
junmoxiao · #24 · 2023-10-13 10:14:58 +08:00
Impressive. You could add it to the xmake-repo repository.
weidaizi (OP) · #25 · 2023-10-13 10:51:07 +08:00
@bashbot Aha, it can only guarantee ordering within a single thread, not across threads. Thinking about it, even a single shared buffer can't fully guarantee that either, unless the code that takes the timestamp is also held inside some lock, which would be far too wasteful. Coming back to the logs themselves: since every record carries a timestamp, even if entries from different threads end up slightly out of order in the log file (and only within one consumer loop at that), it's not a big problem; it's a trade-off that depends on the use case.
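The pattern behind that trade-off, sketched for illustration (not haclog's actual code): the timestamp is captured at the call site on the producer side, before the record is enqueued, so records that land in the file slightly out of order can still be re-sorted by timestamp afterwards.

```c
/* Illustrative: take the timestamp on the producer side, then enqueue.
 * clock_gettime is POSIX. */
#include <stdint.h>
#include <time.h>

static uint64_t log_timestamp_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_REALTIME, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}
```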
weidaizi (OP) · #26 · 2023-10-13 10:53:04 +08:00
@junmoxiao Thanks! I don't normally use xmake; I'll look into how to add it later, ha~
Masoud2023 · #27 · 2023-10-13 15:37:39 +08:00
I skimmed the paper yesterday when I had some free time.

Can I roughly understand this as an asynchronous, compressed, structured logger...?
weidaizi (OP) · #28 · 2023-10-13 17:31:58 +08:00
@Masoud2023 Yes. Building NanoLog also produces a decompressor; to view the logs as text you have to run them through that program first~
weidaizi (OP) · #29 · 2023-10-17 10:59:58 +08:00 · ❤️ 1
@hitmanx
@icyalala
Thanks to you both for last week's suggestions. Over the past few days I put together a C/C++ benchmark project for a side-by-side comparison of some well-known or low-latency-oriented C/C++ logging libraries, with the results displayed as charts.
Link: https://github.com/MuggleWei/cc_log_benchmark