
Cloudflare automatically generates a robots.txt for websites

    VforVendetta · 62 days ago · 1060 views

    By default it blocks a number of LLM crawlers.

    # site through automated means, including any device, tool,
    # or process designed to data mine or scrape content, is
    # prohibited except (1) for the purpose of search engine indexing or
    # artificial intelligence retrieval augmented generation or (2) with express
    # written permission from this site’s operator.
    
    # To request permission to license our intellectual
    # property and/or other materials, please contact this
    # site’s operator directly.
    
    # BEGIN Cloudflare Managed content
    
    User-agent: Amazonbot
    Disallow: /
    
    User-agent: Applebot-Extended
    Disallow: /
    
    User-agent: Bytespider
    Disallow: /
    
    User-agent: CCBot
    Disallow: /
    
    User-agent: ClaudeBot
    Disallow: /
    
    User-agent: Google-Extended
    Disallow: /
    
    User-agent: GPTBot
    Disallow: /
    
    User-agent: meta-externalagent
    Disallow: /
    
    # END Cloudflare Managed Content
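
For anyone who wants to check which crawlers a file like the one above actually shuts out, here is a minimal sketch using Python's standard `urllib.robotparser`. The names `ROBOTS_TXT` and `blocked_agents` are made up for illustration, the rules are copied from the managed block above, and `Googlebot` is only an example of an agent that is not on the list.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the Cloudflare managed block quoted above.
ROBOTS_TXT = """\
User-agent: Amazonbot
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: meta-externalagent
Disallow: /
"""


def blocked_agents(robots_txt, agents, path="/"):
    """Return the agents from `agents` that may not fetch `path`.

    Hypothetical helper for illustration; it parses the text locally
    and makes no network request.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, path)]


if __name__ == "__main__":
    # Googlebot is an assumed example of an agent not named in the file.
    for agent in ["GPTBot", "ClaudeBot", "Googlebot"]:
        status = "blocked" if blocked_agents(ROBOTS_TXT, [agent]) else "allowed"
        print(agent, status)
```

Calling `blocked_agents(ROBOTS_TXT, ["GPTBot", "ClaudeBot", "Googlebot"])` should leave out `Googlebot`, since no rule in the managed block names it. Keep in mind that robots.txt is advisory: it only stops crawlers that choose to honor it.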
    1 reply · 2025-07-03 17:51:59 +08:00
    laobaiguolai · #1 · 62 days ago
    Take a look at Cloudflare's analytics and you'll see these crawlers hit the site very heavily. Blocking them is a good thing.