author     2025-05-03 09:28:43 +0000
committer  2025-05-03 09:28:43 +0000
commit     bad427e7f0e654835ece503e4666d42769ed4f58
tree       464cebdd6a21378074d314e3e63489a171fb7e0b /docs
parent     [docs/zh] Update zh docs: synced to 6c879186 (#4117)
[chore/docs] fix relative link to scraper deterrence (#4111)
# Description
While working on the doc translation update, I found a broken link, so I'm opening this separate PR to keep it clean of the translation stuff. Marked as draft for now while I check for any other typos :)
Reviewed-on: https://codeberg.org/superseriousbusiness/gotosocial/pulls/4111
Co-authored-by: cdn0x12 <git@cdn0x12.dev>
Co-committed-by: cdn0x12 <git@cdn0x12.dev>
Diffstat (limited to 'docs')
-rw-r--r--   docs/admin/robots.md            |  2
-rw-r--r--   docs/configuration/advanced.md  | 19

2 files changed, 20 insertions(+), 1 deletion(-)
diff --git a/docs/admin/robots.md b/docs/admin/robots.md
index e4b3d27ce..29a02db42 100644
--- a/docs/admin/robots.md
+++ b/docs/admin/robots.md
@@ -10,6 +10,6 @@ You can allow or disallow crawlers from collecting stats about your instance fro
 
 The AI scrapers come from a [community maintained repository][airobots]. It's manually kept in sync for the time being. If you know of any missing robots, please send them a PR!
 
-A number of AI scrapers are known to ignore entries in `robots.txt` even if it explicitly matches their User-Agent. This means the `robots.txt` file is not a foolproof way of ensuring AI scrapers don't grab your content. In addition to this you might want to look into blocking User-Agents via [requester header filtering](request_filtering_modes.md), and enabling a proof-of-work [scraper deterrence](scraper_deterrence.md).
+A number of AI scrapers are known to ignore entries in `robots.txt` even if it explicitly matches their User-Agent. This means the `robots.txt` file is not a foolproof way of ensuring AI scrapers don't grab your content. In addition to this you might want to look into blocking User-Agents via [requester header filtering](request_filtering_modes.md), and enabling a proof-of-work [scraper deterrence](../advanced/scraper_deterrence.md).
 
 [airobots]: https://github.com/ai-robots-txt/ai.robots.txt/
diff --git a/docs/configuration/advanced.md b/docs/configuration/advanced.md
index 88f4aff67..0b8b3183f 100644
--- a/docs/configuration/advanced.md
+++ b/docs/configuration/advanced.md
@@ -182,4 +182,23 @@ advanced-csp-extra-uris: []
 # Options: ["block", "allow", ""]
 # Default: ""
 advanced-header-filter-mode: ""
+
+# Bool. Enables a proof-of-work based deterrence against scrapers
+# on profile and status web pages. This will generate a unique but
+# deterministic challenge for each HTTP client to complete before
+# accessing the above mentioned endpoints, on success being given
+# a cookie that permits challenge-less access within a 1hr window.
+#
+# The outcome of this is that it should make scraping of these
+# endpoints economically unfeasible, while having a negligible
+# performance impact on your own instance.
+#
+# The downside is that it requires javascript to be enabled.
+#
+# For more details please check the documentation at:
+# https://docs.gotosocial.org/en/latest/admin/scraper_deterrence
+#
+# Options: [true, false]
+# Default: true
+advanced-scraper-deterrence: false
 ```
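For anyone curious about what the new option actually enables, the config comment above sketches the mechanism: each client gets a deterministic challenge, solves it in the browser, and is then handed a cookie good for roughly an hour of challenge-free access. Below is a rough, illustrative sketch of that general proof-of-work pattern, not GoToSocial's actual code: the `challengeFor`, `solve`, and `verify` helpers, the SHA-256 hashing, and the hex-zero difficulty target are all invented for demonstration.

```go
package main

// Illustrative sketch only: NOT GoToSocial's implementation, just a minimal
// example of the proof-of-work pattern described in the config comment.
// The challenge derivation, hash choice, and difficulty are invented here.

import (
	"crypto/sha256"
	"encoding/binary"
	"encoding/hex"
	"fmt"
	"strings"
)

// challengeFor derives a deterministic challenge seed per client, so the same
// client always gets the same challenge while different clients get different
// ones. (Hypothetical helper.)
func challengeFor(clientIP, userAgent, serverSecret string) string {
	sum := sha256.Sum256([]byte(serverSecret + "|" + clientIP + "|" + userAgent))
	return hex.EncodeToString(sum[:])
}

// hashOf hashes the challenge together with a candidate nonce.
func hashOf(challenge string, nonce uint64) string {
	buf := make([]byte, 8)
	binary.BigEndian.PutUint64(buf, nonce)
	sum := sha256.Sum256(append([]byte(challenge), buf...))
	return hex.EncodeToString(sum[:])
}

// solve is the client-side work: brute-force a nonce whose hash starts with
// `difficulty` hex zeroes. In the real feature this kind of work runs in the
// browser, which is why javascript must be enabled.
func solve(challenge string, difficulty int) uint64 {
	target := strings.Repeat("0", difficulty)
	for nonce := uint64(0); ; nonce++ {
		if strings.HasPrefix(hashOf(challenge, nonce), target) {
			return nonce
		}
	}
}

// verify is the server-side check: a single hash, cheap for the instance.
// On success the server would set a time-limited cookie permitting
// challenge-less access, as the config comment describes.
func verify(challenge string, nonce uint64, difficulty int) bool {
	return strings.HasPrefix(hashOf(challenge, nonce), strings.Repeat("0", difficulty))
}

func main() {
	challenge := challengeFor("203.0.113.7", "ExampleBrowser/1.0", "instance-secret")
	nonce := solve(challenge, 4) // low difficulty so the demo finishes quickly
	fmt.Println("nonce:", nonce, "valid:", verify(challenge, nonce, 4))
}
```

The asymmetry is what the comment means by "economically unfeasible": verifying a submission costs the instance a single hash, while every fresh client identity a scraper presents has to burn CPU before it can fetch a single profile or status page.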
