author: Daenney <daenney@users.noreply.github.com> 2023-06-12 15:38:53 +0200
committer: GitHub <noreply@github.com> 2023-06-12 15:38:53 +0200
commit: 4990099fdeee5ac362295de3879d4b291e629c76
tree: c630d02d3ce4e7600f68b012f5cfe3c02b958d1f /docs/advanced/caching
parent: [chore]: Bump modernc.org/sqlite from 1.23.0 to 1.23.1 (#1884)
[docs] Made Advanced its own section (#1883)
* [docs] Made Advanced its own section

  This splits the Advanced page off from the Getting Started guide and makes it its own thing. It now has some additional sub-sections for bigger topics like caching and enhanced security. This also moves tracing from Getting Started to Advanced as that feels like a more appropriate location for it. The enhanced security looks a little silly with a single section, but I have guides pending for firewall configurations and I'd also like to consolidate our how to provision TLS certificates in there as we repeat this information multiple times.

* [docs] Fix all my spelling errors
* [docs] Inline the links in sandboxing
Diffstat (limited to 'docs/advanced/caching')
-rw-r--r-- docs/advanced/caching/api.md 86
-rw-r--r-- docs/advanced/caching/assets-media.md 72
-rw-r--r-- docs/advanced/caching/index.md 11
3 files changed, 169 insertions, 0 deletions
diff --git a/docs/advanced/caching/api.md b/docs/advanced/caching/api.md
new file mode 100644
index 000000000..89df55a09
--- /dev/null
+++ b/docs/advanced/caching/api.md
@@ -0,0 +1,86 @@
+# Caching API responses
+
+It is possible to cache certain API responses so that the GoToSocial process doesn't have to handle every request itself. We don't recommend caching responses to requests under `/api`.
+
+When using a [split domain](../host-account-domain.md) deployment style, you need to ensure you configure caching on the host domain. The account domain should only be issuing redirects to the host domain which clients will automatically remember.
+
+!!! warning "There are only two hard things in computer science"
+ Configuring caching incorrectly can result in all kinds of problems. Follow this guide carefully and thoroughly test your modifications. Don't cache endpoints that require authentication without taking the `Authorization` header into account.
+
+## Endpoints
+
+### Webfinger and hostmeta
+
+Requests to `/.well-known/webfinger` and `/.well-known/host-meta` can be safely cached. Make sure any caching strategy takes query parameters into account when caching webfinger requests, as requests to that endpoint are of the form `?resource=acct:username@domain.tld`.
+
+### Public keys
+
+Many implementations will regularly request the public key for a user in order to validate the signature on a message they received. This happens whenever a message gets federated, among other things. These keys are long-lived, essentially eternal, and can thus be cached with a long lifetime.
+
+## Configuration snippets
+
+### nginx
+
+For nginx, you'll need to start by configuring a cache zone. The cache zone must be created in the `http` section, not within `server` or `location`.
+
+```nginx
+http {
+ ...
+ proxy_cache_path /var/cache/nginx keys_zone=gotosocial_ap_public_responses:10m inactive=1w;
+}
+```
+
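+This configures a cache whose entries are kept for up to one week after they were last accessed. The `10m` in `keys_zone` sizes the shared memory zone that holds the cache keys at 10MB; it does not limit the size of the cached responses on disk.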
+
+The zone is named `gotosocial_ap_public_responses` but you can name it whatever you want. 10MB is a lot of cache keys; you can probably use a smaller value on small instances.
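+
+As a rough sketch for a small instance, the key zone could be shrunk and the on-disk cache capped. The nginx documentation states that a one megabyte zone can store about 8 thousand keys. The `1m` and `max_size=100m` values below are illustrative assumptions, not values taken from this guide.
+
+```nginx
+http {
+    # 1m of shared memory holds roughly 8,000 cache keys (per the nginx docs).
+    # max_size caps the cached responses on disk; 100m is an assumed limit.
+    proxy_cache_path /var/cache/nginx keys_zone=gotosocial_ap_public_responses:1m inactive=1w max_size=100m;
+}
+```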
+
+Second, we need to update our GoToSocial nginx configuration to actually use the cache for the endpoints we want to cache.
+
+```nginx
+server {
+ server_name social.example.org;
+
+ location ~ /.well-known/(webfinger|host-meta)$ {
+ proxy_set_header Host $host;
+ proxy_set_header X-Forwarded-For $remote_addr;
+ proxy_set_header X-Forwarded-Proto $scheme;
+
+ proxy_cache gotosocial_ap_public_responses;
+ proxy_cache_background_update on;
+ proxy_cache_key $scheme://$host$uri$is_args$query_string;
+ proxy_cache_valid 200 10m;
+ proxy_cache_use_stale error timeout updating http_500 http_502 http_503 http_504 http_429;
+ proxy_cache_lock on;
+ add_header X-Cache-Status $upstream_cache_status;
+
+ proxy_pass http://localhost:8080;
+ }
+
+ location ~ ^\/users\/(?:[a-z0-9_\.]+)\/main-key$ {
+ proxy_set_header Host $host;
+ proxy_set_header X-Forwarded-For $remote_addr;
+ proxy_set_header X-Forwarded-Proto $scheme;
+
+ proxy_cache gotosocial_ap_public_responses;
+ proxy_cache_background_update on;
+ proxy_cache_key $scheme://$host$uri;
+ proxy_cache_valid 200 604800s;
+ proxy_cache_use_stale error timeout updating http_500 http_502 http_503 http_504 http_429;
+ proxy_cache_lock on;
+ add_header X-Cache-Status $upstream_cache_status;
+
+ proxy_pass http://localhost:8080;
+ }
+}
+```
+
+The `proxy_pass` and `proxy_set_header` directives are mostly the same as in your existing reverse proxy configuration, but the `proxy_cache*` entries warrant some explanation:
+
+- `proxy_cache gotosocial_ap_public_responses` tells nginx to use the `gotosocial_ap_public_responses` cache zone we previously created. If you named it something else, you should change this value
+- `proxy_cache_background_update on` means nginx will refresh an expired cache entry in the background while the stale entry is returned to the client, so a current copy ends up on disk without the client having to wait for it
+- `proxy_cache_key` is configured in such a way that, for the webfinger and host-meta location, it takes the query string into account for caching. So requests for `.well-known/webfinger?resource=acct:user1@example.org` and `.well-known/webfinger?resource=acct:user2@example.org` are not seen as the same.
+- `proxy_cache_valid 200 10m;` means we only cache 200 responses from GTS and for 10 minutes. You can add additional lines of these, like `proxy_cache_valid 404 1m;` to cache 404 responses for 1 minute
+- `proxy_cache_use_stale` tells nginx it's allowed to use a stale cache entry (so older than 10 minutes) in certain cases
+- `proxy_cache_lock on` means that if a resource is not cached and there are multiple concurrent requests for it, the requests will be queued up so that only one goes through to GoToSocial and the rest are then answered from cache
+- `add_header X-Cache-Status $upstream_cache_status` will add an `X-Cache-Status` header to the response so you can check if things are getting cached. You can remove this.
+
+The provided configuration will serve a stale response in case there's an error proxying to GoToSocial, if our connection to GoToSocial times out, if GoToSocial returns a `5xx` status code or if GoToSocial returns 429 (Too Many Requests). The `updating` value says that we're allowed to serve a stale entry if nginx is currently in the process of refreshing its cache. Because we configured `inactive=1w` in the `proxy_cache_path` directive, nginx may serve a response up to one week old if the conditions in `proxy_cache_use_stale` are met.
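+
+The extra `proxy_cache_valid` line mentioned in the list above could be added to the webfinger location roughly as follows. This is a minimal sketch: whether caching 404 responses suits your instance is an assumption to check for yourself, and the comment placeholder stands in for the directives already shown earlier.
+
+```nginx
+location ~ /.well-known/(webfinger|host-meta)$ {
+    # proxy_set_header, proxy_cache, proxy_cache_key and proxy_pass
+    # lines stay as in the example above.
+
+    proxy_cache_valid 200 10m;
+    # Assumed addition: remember "not found" answers for one minute so
+    # repeated lookups for unknown accounts don't all reach GoToSocial.
+    proxy_cache_valid 404 1m;
+}
+```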
diff --git a/docs/advanced/caching/assets-media.md b/docs/advanced/caching/assets-media.md
new file mode 100644
index 000000000..a7639885c
--- /dev/null
+++ b/docs/advanced/caching/assets-media.md
@@ -0,0 +1,72 @@
+# Caching assets and media
+
+When you've configured your GoToSocial instance with local storage for media, you can use your [reverse proxy](../../getting_started/reverse_proxy/index.md) to serve these files directly and cache them. This avoids hitting GoToSocial for these requests and reverse proxies can typically serve assets faster than GoToSocial.
+
+You can also use your reverse proxy to cache the GoToSocial web UI assets, like the CSS and images it uses.
+
+When using a [split domain](../host-account-domain.md) deployment style, you need to ensure you configure caching of the assets and media on the host domain.
+
+!!! warning "Media pruning"
+ If you've configured media pruning, you need to ensure that when media is not found on disk the request is still sent on to GoToSocial. This will ensure the media is fetched again from the remote instance and subsequent requests for this media will then be handled by your reverse proxy again.
+
+## Endpoints
+
+There are two endpoints whose files we can serve directly and cache:
+
+* `/assets` which contains fonts, CSS, images etc. for the web UI
+* `/fileserver` which serves attachments for status posts when using the local storage backend
+
+The filesystem location of `/assets` is defined by the [`web-asset-base-dir`](../../configuration/web.md) configuration option. Files under `/fileserver` are retrieved from the [`storage-local-base-path`](../../configuration/storage.md).
+
+## Configuration
+
+### nginx
+
+Here's an example of the three location blocks you'll need to add to your existing configuration in nginx:
+
+```nginx
+server {
+ server_name social.example.org;
+
+ location /assets/ {
+ alias web-asset-base-dir/;
+ autoindex off;
+ expires 5m;
+ add_header Cache-Control "public";
+ }
+
+ location @fileserver {
+ proxy_pass http://localhost:8080;
+ proxy_set_header Host $host;
+ proxy_set_header Upgrade $http_upgrade;
+ proxy_set_header Connection "upgrade";
+ proxy_set_header X-Forwarded-For $remote_addr;
+ proxy_set_header X-Forwarded-Proto $scheme;
+ }
+
+ location /fileserver/ {
+ alias storage-local-base-path/;
+ autoindex off;
+ expires max;
+ add_header Cache-Control "private, immutable";
+ try_files $uri @fileserver;
+ }
+}
+```
+
+The `/fileserver` location is a bit special. When we fail to fetch the media from disk, we want to proxy the request on to GoToSocial so it can try and fetch it. The `try_files` directive can't take a `proxy_pass` itself so instead we created the named `@fileserver` location that we pass in last to `try_files`.
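+
+To make the placeholders concrete, here is a sketch of the two `alias` lines. It assumes `web-asset-base-dir` is set to `/gotosocial/web/assets` and `storage-local-base-path` to `/gotosocial/storage`; both paths are assumptions for illustration, so substitute the values from your own configuration.
+
+```nginx
+location /assets/ {
+    # Assumed value of web-asset-base-dir, with a trailing slash added.
+    alias /gotosocial/web/assets/;
+    # Remaining directives as in the example above.
+}
+
+location /fileserver/ {
+    # Assumed value of storage-local-base-path, with a trailing slash added.
+    alias /gotosocial/storage/;
+    # Remaining directives as in the example above.
+}
+```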
+
+!!! bug "Trailing slashes"
+ The trailing slashes in the `location` directives and the `alias` are significant; do not remove them.
+
+The `expires` directive adds the necessary headers to inform the client how long it may cache the resource:
+
+* For assets, which may change on each release, 5 minutes is used in this example
+* For attachments, which should never change once they're created, `max` is used instead, which sets the `Expires` header to the 31st of December 2037.
+
+For other options, see the nginx documentation on the [`expires` directive](https://nginx.org/en/docs/http/ngx_http_headers_module.html#expires).
+
+Nginx does not add cache headers to 4xx or 5xx response codes so a failure to fetch an asset won't get cached by clients. The `autoindex off` directive tells nginx to not serve a directory listing. This should be the default but it doesn't hurt to be explicit. The added `add_header` lines set additional options for the `Cache-Control` header:
+
+* `public` is used to indicate that anyone may cache this resource
+* `private` is used to indicate that only the client's own cache, and not a shared cache such as a proxy or CDN, may store this resource
+* `immutable` is used to indicate this resource will never change while it is fresh (that is, until it expires), allowing clients to forgo conditional requests to revalidate the resource during that timespan.
diff --git a/docs/advanced/caching/index.md b/docs/advanced/caching/index.md
new file mode 100644
index 000000000..ca946dd8e
--- /dev/null
+++ b/docs/advanced/caching/index.md
@@ -0,0 +1,11 @@
+# Caching
+
+This section covers a number of different caching techniques that can be used to make GoToSocial more robust in the face of higher traffic and to offload some work from the GoToSocial instance.
+
+!!! note
+ These guides are only relevant if you're running a [reverse proxy](../../getting_started/reverse_proxy/index.md).
+
+## Guides
+
+* [Caching API responses](api.md)
+* [Assets and media serving and caching](assets-media.md)