author | 2024-08-02 18:22:39 +0200 | |
---|---|---|
committer | 2024-08-02 18:22:39 +0200 | |
commit | 9b50151f17b5921b68b3c413a26edf8ec6cdc6f8 (patch) | |
tree | 30b422982c0798870232835314fbaa827ad57a9a /internal/web/robots.go | |
parent | [bugfix] close files before error return (#3163) (diff) | |

[feature] Beef up our AI opt-outs (#3165)

* [chore] Synchronise our robots.txt with upstream
* [feature] Add headers to escape AI crawlers

This adds 2 headers that a number of AI crawlers respect to signal that
content should not be included in their datasets.
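
The header change itself is outside this file-limited diff, so what follows is only a minimal sketch of the idea in plain `net/http`, not the code from this commit. It assumes the two signals are sent as two `X-Robots-Tag` values, `noai` and `noimageai`; the commit message does not name them, and `withAIOptOut` and `optOutValues` are hypothetical names invented for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// robotsTagHeader and optOutDirectives are ASSUMPTIONS for illustration:
// the commit message does not say which headers or values it adds.
const robotsTagHeader = "X-Robots-Tag"

var optOutDirectives = []string{"noai", "noimageai"}

// withAIOptOut (hypothetical) wraps next so that every response carries
// one X-Robots-Tag header line per opt-out directive.
func withAIOptOut(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		for _, d := range optOutDirectives {
			// Add, not Set, so the header is emitted once per directive.
			w.Header().Add(robotsTagHeader, d)
		}
		next.ServeHTTP(w, r)
	})
}

// optOutValues sends one in-memory request through the middleware and
// returns the X-Robots-Tag values found on the recorded response.
func optOutValues() []string {
	page := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "hello")
	})
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	withAIOptOut(page).ServeHTTP(rec, req)
	return rec.Result().Header.Values(robotsTagHeader)
}

func main() {
	fmt.Println(optOutValues())
}
```

Setting the headers in a middleware rather than in each handler keeps the opt-out on every route; whether a given crawler honours it is up to that crawler.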
Diffstat (limited to 'internal/web/robots.go')
-rw-r--r-- | internal/web/robots.go | 9 |
1 files changed, 9 insertions, 0 deletions
diff --git a/internal/web/robots.go b/internal/web/robots.go
index 39708eb55..3309de97c 100644
--- a/internal/web/robots.go
+++ b/internal/web/robots.go
@@ -43,15 +43,24 @@ User-agent: Claude-Web
 User-agent: cohere-ai
 User-agent: Diffbot
 User-agent: FacebookBot
+User-agent: facebookexternalhit
 User-agent: FriendlyCrawler
 User-agent: Google-Extended
 User-agent: GoogleOther
+User-agent: GoogleOther-Image
+User-agent: GoogleOther-Video
 User-agent: GPTBot
 User-agent: ImagesiftBot
 User-agent: img2dataset
+User-agent: Meta-ExternalAgent
+User-agent: OAI-SearchBot
 User-agent: omgili
 User-agent: omgilibot
 User-agent: PerplexityBot
+User-agent: PetalBot
+User-agent: Scrapy
+User-agent: Timpibot
+User-agent: VelenPublicWebCrawler
 User-agent: YouBot
 Disallow: /
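
The added lines need no rules of their own: in robots.txt, consecutive `User-agent` lines followed by rules form one group, so each new agent inherits the existing `Disallow: /`. The sketch below checks that with a deliberately simplified group parser. `robotsExcerpt` is a shortened stand-in rather than the full file from the repository, and `fullyDisallowed` is a hypothetical helper, not part of GoToSocial.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// robotsExcerpt is a shortened stand-in for the group edited in the diff,
// not the complete robots.txt shipped by the project.
const robotsExcerpt = `User-agent: GPTBot
User-agent: OAI-SearchBot
User-agent: PetalBot
User-agent: YouBot
Disallow: /
`

// fullyDisallowed (hypothetical) reports whether agent is named in a group
// that contains "Disallow: /". It ignores wildcards and path matching.
func fullyDisallowed(robots, agent string) bool {
	var agents []string // user agents of the group being read
	inRules := false    // true once the current group has seen a rule line
	want := strings.ToLower(agent)

	sc := bufio.NewScanner(strings.NewReader(robots))
	for sc.Scan() {
		key, val, ok := strings.Cut(sc.Text(), ":")
		if !ok {
			continue // blank line or anything that is not "key: value"
		}
		key = strings.ToLower(strings.TrimSpace(key))
		val = strings.TrimSpace(val)

		switch key {
		case "user-agent":
			if inRules {
				// A User-agent line after rules starts a new group.
				agents, inRules = nil, false
			}
			agents = append(agents, strings.ToLower(val))
		case "allow", "disallow":
			inRules = true
			if key != "disallow" || val != "/" {
				continue
			}
			for _, a := range agents {
				if a == want {
					return true
				}
			}
		}
	}
	return false
}

func main() {
	for _, ua := range []string{"PetalBot", "oai-searchbot", "SomeBrowser"} {
		fmt.Printf("%s blocked: %v\n", ua, fullyDisallowed(robotsExcerpt, ua))
	}
}
```

Agent names are compared case-insensitively here, in line with how robots.txt product tokens are matched.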