---
gitea: none
include_toc: true
---

# Reverse proxy for Synapse with workers

Quite a lot has to change when you turn nginx from a reverse proxy for a
normal, monolithic Synapse into one for a Synapse that uses workers.

As mentioned in [Synapse with workers](../../synapse/workers.md#synapse), we're
changing the "backend" from network sockets to UNIX sockets.

Because we're going to have to forward a lot of specific requests to all kinds
of workers, we'll split the configuration into a few bits:

* all proxy forwarding settings (the `proxy_*` directives)
* all `location` definitions
* maps that define variables
* upstreams that point to the correct socket(s) with the correct settings
* settings for private access
* connection optimizations

Some of these go into `/etc/nginx/conf.d` because they are part of the
configuration of nginx itself, others go into `/etc/nginx/snippets` because we
need to include them several times in different places.
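
As a rough sketch of how the two directories are used: files in `conf.d` are
loaded once in the `http` context, while a snippet is included explicitly
wherever it is needed. The snippet name `proxy.conf`, the upstream names and
the paths in the `location` lines are made up for illustration; they are not
taken from this setup.

```
http {
    # Loaded once: maps, upstreams and other global settings
    include /etc/nginx/conf.d/*.conf;

    server {
        # ... listen, server_name, TLS settings ...

        # The same snippet is reused in every location that needs it
        location /example/one {
            include /etc/nginx/snippets/proxy.conf;   # hypothetical file name
            proxy_pass http://some_worker;            # hypothetical upstream
        }

        location /example/two {
            include /etc/nginx/snippets/proxy.conf;   # hypothetical file name
            proxy_pass http://another_worker;         # hypothetical upstream
        }
    }
}
```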


# Maps

A map sets a variable based on, usually, another variable. One place where we
use this is in determining the type of sync a client is doing. A normal sync,
which simply updates an existing session, is a rather lightweight operation. An
initial sync, meaning a full sync because the session is brand new, is not so
lightweight.

A normal sync can be recognised by the `since` query parameter in the request:
it tells the server where the client's previous sync left off. If there is no
`since`, we're dealing with an initial sync.

We want to forward requests for normal syncs to the `normal_sync` workers, and
the initial syncs to the `initial_sync` workers.

We decide which type of worker to forward the sync request to by looking at
the presence or absence of `since`: if it's there, it's a normal sync and we
set the variable `$sync` to `normal_sync`. If it's not there, we set `$sync` to
`initial_sync`. The content of `since` is irrelevant to nginx.

This is what the map looks like:

```
map $arg_since $sync {
    default normal_sync;
    '' initial_sync;
}
```

We evaluate `$arg_since` to set `$sync`. nginx makes every query argument
available as `$arg_` followed by the argument's name, so `$arg_since` holds the
value of `since`. See [the index of
variables in nginx](https://nginx.org/en/docs/varindex.html) for more
variables we can use in nginx.

By default we set `$sync` to `normal_sync`, unless the argument `since` is
empty (absent); then we set it to `initial_sync`.

After this mapping, we forward the request to the correct worker like this:

```
proxy_pass http://$sync;
```
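
Put together, the two pieces could be placed roughly like this. This is only a
sketch: it assumes the map is in a file under `/etc/nginx/conf.d` (so it ends
up in the `http` context, the only context where `map` is allowed), and it uses
just the plain `/sync` endpoint as an example.

```
# http context, e.g. a file under /etc/nginx/conf.d
map $arg_since $sync {
    default normal_sync;     # "since" present: normal sync
    ''      initial_sync;    # "since" empty or absent: initial sync
}

# server context
location ~ ^/_matrix/client/(r0|v3)/sync$ {
    # nginx first looks up the value of $sync among the defined
    # upstream groups, so both names must exist as upstreams
    proxy_pass http://$sync;
}
```

Because the name in `$sync` always matches an upstream group, nginx doesn't
need a `resolver` here, even though `proxy_pass` contains a variable.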


# Upstreams

In our configuration, nginx is not only a reverse proxy but also a load
balancer. Just like `haproxy`, it can forward requests to "servers" behind it.
Such a server is the inbound UNIX socket of a worker, and there can be several
of them in one group.

Two of these upstreams are the sync workers: `normal_sync` and `initial_sync`,
both consisting of several "servers":

```
upstream initial_sync {
    hash $mxid_localpart consistent;
    server unix:/run/matrix-synapse/inbound_initial_sync1.sock max_fails=0;
    server unix:/run/matrix-synapse/inbound_initial_sync2.sock max_fails=0;
    keepalive 10;
}

upstream normal_sync {
    hash $mxid_localpart consistent;
    server unix:/run/matrix-synapse/inbound_normal_sync1.sock max_fails=0;
    server unix:/run/matrix-synapse/inbound_normal_sync2.sock max_fails=0;
    server unix:/run/matrix-synapse/inbound_normal_sync3.sock max_fails=0;
    keepalive 10;
}
```
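
One detail about `keepalive`: the nginx documentation for this directive
recommends that, for HTTP, proxied requests use HTTP/1.1 and that the
`Connection` header is cleared, otherwise connections to the upstream are not
reused. A minimal sketch of the two directives involved follows. Where they
live (for example in the snippet with the shared proxy settings) is an
assumption, and they may already be set elsewhere.

```
# Needed for "keepalive" in the upstream blocks to have any effect
proxy_http_version 1.1;
proxy_set_header Connection "";
```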

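The `hash` directive makes sure that requests with the same value of
`$mxid_localpart` (that is, from the same user) are always forwarded to the
same worker. The `consistent` parameter keeps most users on the worker they
already had when a server is added to or removed from the group.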
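
`$mxid_localpart` is not a built-in nginx variable, so it has to be defined
somewhere, typically by a map. The sketch below follows the example in the
Synapse worker documentation, which derives the key from the access token.
Treat the regular expressions and the helper variable
`$accesstoken_from_urlparam` as assumptions to check against your own setup,
not as the definitive configuration used here.

```
# Fallback: take the key from the "access_token" URL parameter
map $arg_access_token $accesstoken_from_urlparam {
    default                   $arg_access_token;   # whole token as the key
    "~syt_(?<username>.*?)_.*" $username;          # user part of the token
}

# Preferred: take the key from the Authorization header
map $http_authorization $mxid_localpart {
    default                          $http_authorization;
    "~Bearer syt_(?<username>.*?)_.*" $username;
    ""                               $accesstoken_from_urlparam;
}
```

For hashing it only matters that the key is stable per user; it doesn't have
to be the literal localpart of the Matrix ID.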


# Locations

Now that we have defined the workers and/or worker pools, we have to forward
the right traffic to the right workers. The Synapse documentation about
[available worker
types](https://element-hq.github.io/synapse/latest/workers.html#available-worker-applications)
lists which endpoints a specific worker type can handle.

The docs say that the `generic_worker` can handle these synchronisation
requests:

```
# Sync requests
^/_matrix/client/(r0|v3)/sync$
^/_matrix/client/(api/v1|r0|v3)/events$
^/_matrix/client/(api/v1|r0|v3)/initialSync$
^/_matrix/client/(api/v1|r0|v3)/rooms/[^/]+/initialSync$
```

Now, if we had only one worker type for synchronisations, named `sync`, and
didn't split those requests into normal and initial, we would direct all
sync requests to that worker with this `location`:

```
location ~ ^(/_matrix/client/(r0|v3)/sync|/_matrix/client/(api/v1|r0|v3)/events|/_matrix/client/(api/v1|r0|v3)/initialSync|/_matrix/client/(api/v1|r0|v3)/rooms/[^/]+/initialSync)$ {
    proxy_pass http://sync;
}