Diffstat (limited to 'docs/advanced')
-rw-r--r-- | docs/advanced/caching/api.md | 86
-rw-r--r-- | docs/advanced/caching/assets-media.md | 72
-rw-r--r-- | docs/advanced/caching/index.md | 11
-rw-r--r-- | docs/advanced/host-account-domain.md | 104
-rw-r--r-- | docs/advanced/index.md | 14
-rw-r--r-- | docs/advanced/outgoing-proxy.md | 21
-rw-r--r-- | docs/advanced/security/index.md | 10
-rw-r--r-- | docs/advanced/security/sandboxing.md | 78
-rw-r--r-- | docs/advanced/tracing.md | 44
9 files changed, 440 insertions, 0 deletions
diff --git a/docs/advanced/caching/api.md b/docs/advanced/caching/api.md
new file mode 100644
index 000000000..89df55a09
--- /dev/null
+++ b/docs/advanced/caching/api.md
@@ -0,0 +1,86 @@
+# Caching API responses
+
+It is possible to cache certain API responses to offload the GoToSocial process from having to handle all requests. We don't recommend caching responses to requests under `/api`.
+
+When using a [split domain](../host-account-domain.md) deployment style, you need to ensure you configure caching on the host domain. The account domain should only be issuing redirects to the host domain which clients will automatically remember.
+
+!!! warning "There are only two hard things in computer science"
+    Configuring caching incorrectly can result in all kinds of problems. Follow this guide carefully and thoroughly test your modifications. Don't cache endpoints that require authentication without taking the `Authorization` header into account.
+
+## Endpoints
+
+### Webfinger and hostmeta
+
+Requests to `/.well-known/webfinger` and `/.well-known/host-meta` can be safely cached. Do be careful to ensure any caching strategy takes query parameters into account when caching webfinger requests as requests to that endpoint are of the form `?resource=acct:@username@domain.tld`.
+
+### Public keys
+
+Many implementations will regularly request the public key for a user in order to validate the signature on a message they received. This will happen whenever a message gets federated amongst other things. These keys are long-lived, essentially eternal, and can thus be cached with a long lifetime.
+
+## Configuration snippets
+
+### nginx
+
+For nginx, you'll need to start by configuring a cache zone. The cache zone must be created in the `http` section, not within `server` or `location`.
+
+```nginx
+http {
+  ...
+  proxy_cache_path /var/cache/nginx keys_zone=gotosocial_ap_public_responses:10m inactive=1w;
+}
+```
+
+This configures a cache with a 10MB key zone whose entries will be kept up to one week if they're not accessed.
+
+The zone is named `gotosocial_ap_public_responses` but you can name it whatever you want. 10MB is a lot of cache keys; you can probably use a smaller value on small instances.
+
+Second, we need to update our GoToSocial nginx configuration to actually use the cache for the endpoints we want to cache.
+
+```nginx
+server {
+  server_name social.example.org;
+
+  location ~ /.well-known/(webfinger|host-meta)$ {
+    proxy_set_header Host $host;
+    proxy_set_header X-Forwarded-For $remote_addr;
+    proxy_set_header X-Forwarded-Proto $scheme;
+
+    proxy_cache gotosocial_ap_public_responses;
+    proxy_cache_background_update on;
+    proxy_cache_key $scheme://$host$uri$is_args$query_string;
+    proxy_cache_valid 200 10m;
+    proxy_cache_use_stale error timeout updating http_500 http_502 http_503 http_504 http_429;
+    proxy_cache_lock on;
+    add_header X-Cache-Status $upstream_cache_status;
+
+    proxy_pass http://localhost:8080;
+  }
+
+  location ~ ^\/users\/(?:[a-z0-9_\.]+)\/main-key$ {
+    proxy_set_header Host $host;
+    proxy_set_header X-Forwarded-For $remote_addr;
+    proxy_set_header X-Forwarded-Proto $scheme;
+
+    proxy_cache gotosocial_ap_public_responses;
+    proxy_cache_background_update on;
+    proxy_cache_key $scheme://$host$uri;
+    proxy_cache_valid 200 604800s;
+    proxy_cache_use_stale error timeout updating http_500 http_502 http_503 http_504 http_429;
+    proxy_cache_lock on;
+    add_header X-Cache-Status $upstream_cache_status;
+
+    proxy_pass http://localhost:8080;
+  }
+```
+
+The `proxy_pass` and `proxy_set_header` are mostly the same, but the `proxy_cache*` entries warrant some explanation:
+
+- `proxy_cache gotosocial_ap_public_responses` tells nginx to use the `gotosocial_ap_public_responses` cache zone we previously created. If you named it something else, you should change this value
+- `proxy_cache_background_update on` means nginx will refresh an expired cache entry in the background while the stale copy is returned to the client, so that it ends up with a current copy on disk
+- `proxy_cache_key` is configured in such a way that it takes the query string into account for caching. So requests for `.well-known/webfinger?resource=acct:user1@example.org` and `.well-known/webfinger?resource=acct:user2@example.org` are not treated as the same.
+- `proxy_cache_valid 200 10m;` means we only cache 200 responses from GTS and for 10 minutes. You can add additional lines of these, like `proxy_cache_valid 404 1m;` to cache 404 responses for 1 minute
+- `proxy_cache_use_stale` tells nginx it's allowed to use a stale cache entry (so older than 10 minutes) in certain cases
+- `proxy_cache_lock on` means that if a resource is not cached and there are multiple concurrent requests for it, the requests will be queued up so that only one goes through and the rest are then answered from cache
+- `add_header X-Cache-Status $upstream_cache_status` will add an `X-Cache-Status` header to the response so you can check if things are getting cached. You can remove this.
+
+The provided configuration will serve a stale response in case there's an error proxying to GoToSocial, if our connection to GoToSocial times out, if GoToSocial returns a `5xx` status code or if GoToSocial returns 429 (Too Many Requests). The `updating` value says that we're allowed to serve a stale entry if nginx is currently in the process of refreshing its cache. Because we configured `inactive=1w` in the `proxy_cache_path` directive, nginx may serve a response up to one week old if the conditions in `proxy_cache_use_stale` are met.
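The effect of including the query string in `proxy_cache_key` can be illustrated outside nginx. The following is a minimal Python sketch, not part of this change: `cache_key` is a hypothetical helper that imitates how the two key templates above expand, and the sample URLs are made up. nginx builds and hashes the real key internally, and its `$uri` is a normalised path, which this sketch does not reproduce.

```python
from urllib.parse import urlsplit


def cache_key(url: str, include_query: bool) -> str:
    """Imitate the expansion of $scheme://$host$uri[$is_args$query_string]."""
    parts = urlsplit(url)
    key = f"{parts.scheme}://{parts.hostname}{parts.path}"
    if include_query and parts.query:
        # $is_args expands to "?" only when the request has a query string.
        key += "?" + parts.query
    return key


# Hypothetical webfinger lookups for two different users.
a = "https://social.example.org/.well-known/webfinger?resource=acct:user1@example.org"
b = "https://social.example.org/.well-known/webfinger?resource=acct:user2@example.org"

# With the query string in the key, the two lookups get separate cache entries.
print(cache_key(a, include_query=True) != cache_key(b, include_query=True))

# With a $uri-only key, both lookups collapse into a single cache entry.
print(cache_key(a, include_query=False) == cache_key(b, include_query=False))
```

Both comparisons print `True`. This is why the webfinger location uses the longer key, while the `main-key` location, whose path already identifies the user, can use the shorter one.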
diff --git a/docs/advanced/caching/assets-media.md b/docs/advanced/caching/assets-media.md
new file mode 100644
index 000000000..a7639885c
--- /dev/null
+++ b/docs/advanced/caching/assets-media.md
@@ -0,0 +1,72 @@
+# Caching assets and media
+
+When you've configured your GoToSocial instance with local storage for media, you can use your [reverse proxy](../../getting_started/reverse_proxy/index.md) to serve these files directly and cache them. This avoids hitting GoToSocial for these requests and reverse proxies can typically serve assets faster than GoToSocial.
+
+You can also use your reverse proxy to cache the GoToSocial web UI assets, like the CSS and images it uses.
+
+When using a [split domain](../host-account-domain.md) deployment style, you need to ensure you configure caching of the assets and media on the host domain.
+
+!!! warning "Media pruning"
+    If you've configured media pruning, you need to ensure that when media is not found on disk the request is still sent on to GoToSocial. This will ensure the media is fetched again from the remote instance and subsequent requests for this media will then be handled by your reverse proxy again.
+
+## Endpoints
+
+There are 2 endpoints whose content we can serve and cache:
+
+* `/assets` which contains fonts, CSS, images etc. for the web UI
+* `/fileserver` which serves attachments for status posts when using the local storage backend
+
+The filesystem location of `/assets` is defined by the [`web-asset-base-dir`](../../configuration/web.md) configuration option. Files under `/fileserver` are retrieved from the [`storage-local-base-path`](../../configuration/storage.md).
+
+## Configuration
+
+### nginx
+
+Here's an example of the three location blocks you'll need to add to your existing configuration in nginx:
+
+```nginx
+server {
+  server_name social.example.org;
+
+  location /assets/ {
+    alias web-asset-base-dir/;
+    autoindex off;
+    expires 5m;
+    add_header Cache-Control "public";
+  }
+
+  location @fileserver {
+    proxy_pass http://localhost:8080;
+    proxy_set_header Host $host;
+    proxy_set_header Upgrade $http_upgrade;
+    proxy_set_header Connection "upgrade";
+    proxy_set_header X-Forwarded-For $remote_addr;
+    proxy_set_header X-Forwarded-Proto $scheme;
+  }
+
+  location /fileserver/ {
+    alias storage-local-base-path/;
+    autoindex off;
+    expires max;
+    add_header Cache-Control "private, immutable";
+    try_files $uri @fileserver;
+  }
+}
+```
+
+The `/fileserver` location is a bit special. When we fail to fetch the media from disk, we want to proxy the request on to GoToSocial so it can try and fetch it. The `try_files` directive can't take a `proxy_pass` itself so instead we created the named `@fileserver` location that we pass in last to `try_files`.
+
+!!! bug "Trailing slashes"
+    The trailing slashes in the `location` directives and the `alias` are significant, do not remove those.
+
+The `expires` directive adds the necessary headers to inform the client how long it may cache the resource:
+
+* For assets, which may change on each release, 5 minutes is used in this example
+* For attachments, which should never change once they're created, `max` is used instead, setting the cache expiry to the 31st of December 2037.
+
+For other options, see the nginx documentation on the [`expires` directive](https://nginx.org/en/docs/http/ngx_http_headers_module.html#expires).
+
+Nginx does not add cache headers to 4xx or 5xx response codes so a failure to fetch an asset won't get cached by clients. The `autoindex off` directive tells nginx to not serve a directory listing. This should be the default but it doesn't hurt to be explicit. The added `add_header` lines set additional options for the `Cache-Control` header:
+
+* `public` is used to indicate that anyone may cache this resource
+* `immutable` is used to indicate this resource will never change while it is fresh (that is, before the `expires` time is reached), allowing clients to forgo conditional requests to revalidate the resource during that timespan.
diff --git a/docs/advanced/caching/index.md b/docs/advanced/caching/index.md
new file mode 100644
index 000000000..ca946dd8e
--- /dev/null
+++ b/docs/advanced/caching/index.md
@@ -0,0 +1,11 @@
+# Caching
+
+This section covers a number of different caching techniques that can be used to make GoToSocial more robust in the face of higher traffic and offload the GoToSocial instance from some work.
+
+!!! note
+    These guides are only relevant if you're running a [reverse proxy](../../getting_started/reverse_proxy/index.md).
+
+## Guides
+
+* [Caching API responses](api.md)
+* [Assets and media serving and caching](assets-media.md)
diff --git a/docs/advanced/host-account-domain.md b/docs/advanced/host-account-domain.md
new file mode 100644
index 000000000..6831a7e19
--- /dev/null
+++ b/docs/advanced/host-account-domain.md
@@ -0,0 +1,104 @@
+# Split-domain deployments
+
+This guide explains how to have usernames like `@me@example.org` but run the GoToSocial instance itself on a subdomain like `social.example.org`. Configuring this type of deployment layout **must** be done before starting GoToSocial for the first time.
+
+!!! danger
+    You cannot change your domain layout after you've federated with someone. Servers are going to get confused and you'll need to convince the admin of every instance that has previously federated with you to mess with their database to resolve it. It also requires regenerating the database on your side to create a new instance account and pair of encryption keys.
+
+## Background
+
+The way ActivityPub implementations discover how to map your account domain to your host domain is through a protocol called [webfinger](https://www.rfc-editor.org/rfc/rfc7033). This mapping is typically cached by servers, which is why you can't change it after the fact.
+
+It works by doing a request to `https://<account domain>/.well-known/webfinger?resource=acct:@me@example.org`. At this point, a server can return a redirect to where the actual webfinger endpoint is, `https://<host domain>/.well-known/webfinger?resource=acct:@me@example.org`, or may respond directly. The JSON document that is returned informs you what the endpoint to query is for the user:
+
+```json
+{
+    "subject": "acct:me@example.org",
+    "aliases": [
+        "https://social.example.org/users/me",
+        "https://social.example.org/@me"
+    ],
+    "links": [
+        {
+            "rel": "http://webfinger.net/rel/profile-page",
+            "type": "text/html",
+            "href": "https://social.example.org/@me"
+        },
+        {
+            "rel": "self",
+            "type": "application/activity+json",
+            "href": "https://social.example.org/users/me"
+        }
+    ]
+}
+```
+
+ActivityPub clients and servers will now use the entry from the `links` array with `rel` `self` and `type` `application/activity+json` to query for further information, like where the `inbox` is located to federate messages to.
+
+## Configuration
+
+There are 2 configuration settings you'll need to concern yourself with:
+
+* `host`, the domain the API will be served on and what clients and servers will end up using when talking to your instance
+* `account-domain`, the domain user accounts will be created on
+
+In order to achieve the setup as described in the introduction, you'll need to set these two configuration options accordingly:
+
+```yaml
+host: social.example.org
+account-domain: example.org
+```
+
+!!! info
+    The `host` must always be the DNS name that your GoToSocial instance runs on. It does not affect the IP address the GoToSocial instance binds to. That is controlled with `bind-address`.
+
+## Reverse proxy
+
+When using a [reverse proxy](../getting_started/reverse_proxy/index.md) you'll need to ensure you're set up to handle traffic on both of those domains. You'll need to redirect a few endpoints from the account domain to the host domain.
+
+Redirects are typically used so that the change of domain can be detected client side. The endpoints to redirect from the account domain to the host domain are:
+
+* `/.well-known/webfinger`
+* `/.well-known/host-meta`
+* `/.well-known/nodeinfo`
+
+!!! tip
+    Do not proxy or redirect requests to the API endpoints, `/api/...`, from the account domain to the host domain. This will confuse heuristics some clients use to detect a split-domain deployment resulting in broken login flows and other weird behaviour.
+
+### nginx
+
+In order to configure the redirect, you'll need to configure it on the account domain. Assuming the account domain is `example.org` and the host domain is `social.example.org`, the following configuration snippet showcases how to do this:
+
+```nginx
+server {
+  server_name example.org;
+
+  location /.well-known/webfinger {
+    rewrite ^.*$ https://social.example.org/.well-known/webfinger permanent;
+  }
+
+  location /.well-known/host-meta {
+    rewrite ^.*$ https://social.example.org/.well-known/host-meta permanent;
+  }
+
+  location /.well-known/nodeinfo {
+    rewrite ^.*$ https://social.example.org/.well-known/nodeinfo permanent;
+  }
+}
+```
+
+### Traefik
+
+If `example.org` is running on [Traefik](https://doc.traefik.io/traefik/), we could use labels similar to the following to set up the redirect.
+
+```yaml
+myservice:
+  image: foo
+  # Other stuff
+  labels:
+    - 'traefik.http.routers.myservice.rule=Host(`example.org`)'
+    - 'traefik.http.middlewares.myservice-gts.redirectregex.permanent=true'
+    - 'traefik.http.middlewares.myservice-gts.redirectregex.regex=^https://(.*)/.well-known/(webfinger|nodeinfo|host-meta)$$'
+    - 'traefik.http.middlewares.myservice-gts.redirectregex.replacement=https://social.$${1}/.well-known/$${2}'
+    - 'traefik.http.routers.myservice.middlewares=myservice-gts@docker'
+```
diff --git a/docs/advanced/index.md b/docs/advanced/index.md
new file mode 100644
index 000000000..f196b191b
--- /dev/null
+++ b/docs/advanced/index.md
@@ -0,0 +1,14 @@
+# Advanced
+
+In this section we touch on a number of more advanced topics, primarily related to deploying, operating and tuning GoToSocial.
+
+We consider these topics advanced because applying them incorrectly does have the possibility of causing client and federation issues. Applying any of these configuration changes may also make it harder for you to debug an issue with your GoToSocial instance if you don't understand the changes that you're making.
+
+## Guides
+
+* [Split-domain deployment (API vs. account domain)](host-account-domain.md)
+* [Using an HTTP proxy for client/outgoing requests](outgoing-proxy.md)
+* [Caching API responses](caching/api.md)
+* [Serving and caching assets and media from local storage](caching/assets-media.md)
+* [Process sandboxing](security/sandboxing.md)
+* [Tracing](tracing.md)
diff --git a/docs/advanced/outgoing-proxy.md b/docs/advanced/outgoing-proxy.md
new file mode 100644
index 000000000..67d00777a
--- /dev/null
+++ b/docs/advanced/outgoing-proxy.md
@@ -0,0 +1,21 @@
+# Outgoing HTTP proxy
+
+GoToSocial supports canonical environment variables for configuring the use of an HTTP proxy for outgoing requests:
+
+* `HTTP_PROXY`
+* `HTTPS_PROXY`
+* `NO_PROXY`
+
+The lowercase versions of these environment variables are also recognised. `HTTPS_PROXY` takes precedence over `HTTP_PROXY` for https requests.
+
+The environment values may be either a complete URL or a `host[:port]`, in which case the "http" scheme is assumed. The schemes "http", "https", and "socks5" are supported.
+
+## systemd
+
+When running with systemd, you can add the necessary environment variables using the `Environment` option in the `Service` section.
+
+How to do so is documented in the [`systemd.exec` manual](https://www.freedesktop.org/software/systemd/man/systemd.exec.html#Environment).
+
+## Container runtime
+
+Environment variables can be set in the compose file under the `environment` key. You can also pass them on the CLI to Docker or Podman's `run` command with `-e KEY=VALUE` or `--env KEY=VALUE`.
diff --git a/docs/advanced/security/index.md b/docs/advanced/security/index.md
new file mode 100644
index 000000000..63ee73c9a
--- /dev/null
+++ b/docs/advanced/security/index.md
@@ -0,0 +1,10 @@
+# Enhanced security
+
+These guides cover improving the security posture of your GoToSocial deployment. They don't involve tweaking settings in GoToSocial, but rather additional things you can do to better lock down your instance.
+
+!!! note
+    Anything in these guides is meant to enhance the security of your GoToSocial deployment; they are not a replacement for good security practices like keeping your systems patched and up to date.
+
+## Guides
+
+* [Sandboxing the GoToSocial binary](sandboxing.md)
diff --git a/docs/advanced/security/sandboxing.md b/docs/advanced/security/sandboxing.md
new file mode 100644
index 000000000..067940e32
--- /dev/null
+++ b/docs/advanced/security/sandboxing.md
@@ -0,0 +1,78 @@
+# Application sandboxing
+
+By sandboxing the GoToSocial binary it's possible to control which parts of the system GoToSocial can access, and limit which things it can read and write. This can be helpful to ensure that even in the face of a security issue in GoToSocial, an attacker is severely hindered in escalating their privileges and gaining a foothold on your system.
+
+!!! note
+    As GoToSocial is still early in its development, the sandboxing policies we ship may get out of date. If you happen to run into this, please raise an issue on the issue tracker or better yet submit a PR to help us fix it.
+
+Different distributions have different sandboxing mechanisms they prefer and support:
+
+* **AppArmor** for the Debian or Ubuntu family of distributions or OpenSuSE, including when running with Docker
+* **SELinux** for the Red Hat/Fedora/CentOS family of distributions or Gentoo
+
+!!! warning "Containers and sandboxing"
+    Running GoToSocial as a container does not in and of itself provide much additional security. Despite their name, "containers do not contain". Containers are a distribution mechanism, not a security sandbox. To further secure your container you can instruct the container runtime to load the AppArmor profile and look into limiting which syscalls can be used using a seccomp profile.
+
+## AppArmor
+
+We ship an example AppArmor policy for GoToSocial, which you can retrieve and install as follows:
+
+```sh
+$ curl -LO 'https://github.com/superseriousbusiness/gotosocial/raw/main/example/apparmor/gotosocial'
+$ sudo install -o root -g root gotosocial /etc/apparmor.d/gotosocial
+$ sudo apparmor_parser -Kr /etc/apparmor.d/gotosocial
+```
+
+!!! tip
+    If you're using SQLite, the AppArmor profile expects the database in `/gotosocial/db/` so you'll need to adjust your configuration paths or the policy accordingly.
+
+With the policy installed, you'll need to configure your system to use it to constrain the permissions GoToSocial has.
+
+You can disable the policy like this:
+
+```sh
+$ sudo apparmor_parser -R /etc/apparmor.d/gotosocial
+$ sudo rm -vi /etc/apparmor.d/gotosocial
+```
+Don't forget to roll back any configuration changes you made that load the AppArmor policy.
+
+### systemd
+
+Add the following to the systemd service, or create an override:
+
+```ini
+[Service]
+...
+AppArmorProfile=gotosocial
+```
+
+Reload systemd and restart GoToSocial:
+
+```
+$ systemctl daemon-reload
+$ systemctl restart gotosocial
+```
+
+### Containers
+
+!!! tip
+    You should review the [Docker](https://docs.docker.com/engine/security/apparmor/) or [Podman](https://docs.podman.io/en/latest/markdown/options/security-opt.html) documentation on AppArmor.
+
+When using our example Compose file, you can tell it to load the AppArmor policy by tweaking it like so:
+
+```yaml
+services:
+  gotosocial:
+    ...
+    security_opt:
+      - apparmor=gotosocial
+```
+
+When launching the container with `docker run` or `podman run`, you'll need the `--security-opt="apparmor=gotosocial"` command line flag.
+
+## SELinux
+
+!!! note
+    SELinux can only be used in combination with the [binary installation](../../getting_started/installation/metal.md) method. SELinux cannot be used to constrain GoToSocial when running in a container.
+
+The SELinux policy is maintained by the community in the [`lzap/gotosocial-selinux`](https://github.com/lzap/gotosocial-selinux) repository on GitHub. Make sure to read its documentation, review the policy before using it and use their issue tracker for any support requests around the SELinux policy.
diff --git a/docs/advanced/tracing.md b/docs/advanced/tracing.md
new file mode 100644
index 000000000..34b47f563
--- /dev/null
+++ b/docs/advanced/tracing.md
@@ -0,0 +1,44 @@
+# Tracing
+
+GoToSocial comes with [OpenTelemetry][otel] based tracing built-in. It's not wired through every function, but our HTTP handlers and database library will create spans. How to configure tracing is explained in the [Observability configuration reference][obs].
+
+In order to receive the traces, you need something to ingest them and then visualise them. There are many options available including self-hosted and commercial options.
+
+We provide an example of how to do this using [Grafana Tempo][tempo] to ingest the spans and [Grafana][grafana] to explore them. Please be aware that the configuration we provide is not suitable for a production setup. It can be used safely for local development and can provide a good starting point for setting up your own tracing infrastructure.
+
+You'll need the files in [`example/tracing`][ext]. Once you have those you can run `docker-compose up -d` to get Tempo and Grafana running. With both services running, you can add the following to your GoToSocial configuration and restart your instance:
+
+```yaml
+tracing-enabled: true
+tracing-transport: "grpc"
+tracing-endpoint: "localhost:4317"
+tracing-insecure: true
+```
+
+[otel]: https://opentelemetry.io/
+[obs]: ../configuration/observability.md
+[tempo]: https://grafana.com/oss/tempo/
+[grafana]: https://grafana.com/oss/grafana/
+[ext]: https://github.com/superseriousbusiness/gotosocial/tree/main/example/tracing
+
+## Querying and visualising traces
+
+Once you execute a few queries against your instance, you'll be able to find them in Grafana. You can use the Explore tab and pick Tempo as the datasource. Because our example configuration for Grafana enables [TraceQL][traceql], the Explore tab will have the TraceQL query type selected by default. You can switch to "Search" instead and find all traces emitted by GoToSocial under the "GoToSocial" service name.
+
+Using TraceQL, a simple query to find all traces related to requests to `/api/v1/instance` would look like this:
+
+```
+{.http.route = "/api/v1/instance"}
+```
+
+If you wanted to see all GoToSocial traces, you could instead run:
+
+```
+{.service.name = "GoToSocial"}
+```
+
+Once you select a trace, a second panel will open up visualising the span. You can drill down from there, by clicking into every sub-span to see what it was doing.
+
+
+
+[traceql]: https://grafana.com/docs/tempo/latest/traceql/
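If no traces show up in Grafana, one thing worth ruling out is that nothing is listening on the configured `tracing-endpoint`. The following Python sketch, not part of this change, opens a plain TCP connection to check this. `endpoint_reachable` is a hypothetical helper name, and a successful connection only shows that the port is open, not that the listener speaks OTLP.

```python
import socket


def endpoint_reachable(endpoint: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to "host:port" can be opened."""
    host, _, port = endpoint.rpartition(":")
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Same value as `tracing-endpoint` in the example configuration.
print(endpoint_reachable("localhost:4317"))
```

Run it on the machine where GoToSocial runs, since `localhost` in the example configuration refers to that host.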