2024-12-30 09:56:43 +01:00

Reverse proxy for Synapse with workers

Turning nginx's configuration from a reverse proxy for a normal, monolithic Synapse into one for a Synapse that uses workers takes quite a lot of changes.

As mentioned in Synapse with workers, we're changing the "backend" from network sockets to UNIX sockets.

Because we're going to have to forward a lot of specific requests to all kinds of workers, we'll split the configuration into a few bits:

  • all proxy forwarding settings (nginx's proxy_* directives)
  • all location definitions
  • maps that define variables
  • upstreams that point to the correct socket(s) with the correct settings
  • settings for private access
  • connection optimizations

Some of these go into /etc/nginx/conf.d because they are part of the configuration of nginx itself; others go into /etc/nginx/snippets because we need to include them several times in different places.
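As a rough sketch of how that split could look on disk: every file name below is hypothetical, and the sketch assumes a Debian-style nginx.conf that already includes /etc/nginx/conf.d/*.conf inside its http block.

```nginx
# Hypothetical layout, for illustration only:
#
#   /etc/nginx/conf.d/maps.conf        map blocks (http context)
#   /etc/nginx/conf.d/upstreams.conf   upstream blocks (http context)
#   /etc/nginx/snippets/proxy.conf     shared proxy_* settings
#   /etc/nginx/snippets/private.conf   access restrictions
#
# Files in conf.d are loaded once by nginx.conf. Snippets are pulled in
# explicitly, in every location that needs them:
location ~ ^/_matrix/client/(r0|v3)/sync$ {
    include snippets/proxy.conf;   # hypothetical snippet name
    proxy_pass http://$sync;
}
```

Relative include paths are resolved against the directory of the main configuration file, which is /etc/nginx on such a system.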

Maps

A map sets a variable based on, usually, another variable. One case where we use this is determining the type of sync a client is doing. A normal sync, simply updating an existing session, is a rather lightweight operation. An initial sync, meaning a full sync because the session is brand new, is not so lightweight.

A normal sync can be recognised by the since argument in the request's query string: it tells the server the point up to which the client has already synced. If there is no since, we're dealing with an initial sync.

We want to forward requests for normal syncs to the normal_sync workers, and the initial syncs to the initial_sync workers.

We decide which type of worker to forward the sync request to by looking at the presence or absence of since: if it's there, it's a normal sync and we set the variable $sync to normal_sync. If it's not there, we set $sync to initial_sync. The content of since is irrelevant for nginx.

This is what the map looks like (map blocks are only allowed in nginx's http context, so it lives outside any server block):

map $arg_since $sync {
    default normal_sync;
    '' initial_sync;
}

We evaluate $arg_since to set $sync: nginx exposes every query argument as $arg_ followed by the argument's name, so $arg_since holds the value of since. See the index of variables in nginx for more variables we can use in nginx.

By default we set $sync to normal_sync, unless the argument since is empty (absent); then we set it to initial_sync.

After this mapping, we forward the request to the correct worker like this:

proxy_pass http://$sync;
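To check that the map picks the pool we expect, one option is to expose $sync temporarily in a response header. This is only a debugging sketch: the header name X-Sync-Pool is made up for this example, and the location regex is the /sync endpoint from the Synapse worker documentation.

```nginx
location ~ ^/_matrix/client/(r0|v3)/sync$ {
    # Debug aid, remove after testing: shows which pool the map picked.
    # "always" adds the header on error responses too.
    add_header X-Sync-Pool $sync always;

    # With a variable in proxy_pass, nginx first looks for an upstream
    # group with that name, so the value of $sync has to match an
    # upstream name exactly.
    proxy_pass http://$sync;
}
```

A request without since should then come back with X-Sync-Pool: initial_sync, and one with since=... in the query string with X-Sync-Pool: normal_sync. Keep in mind that an add_header inside a location replaces any add_header directives inherited from the server level.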

Upstreams

In our configuration, nginx is not only a reverse proxy, it's also a load balancer. Just like haproxy, it can forward requests to "servers" behind it. Such a server is the inbound UNIX socket of a worker, and there can be several of them in one group.

Two of these upstreams are the sync workers: normal_sync and initial_sync, both consisting of several "servers":

upstream initial_sync {
    hash $mxid_localpart consistent;
    server unix:/run/matrix-synapse/inbound_initial_sync1.sock max_fails=0;
    server unix:/run/matrix-synapse/inbound_initial_sync2.sock max_fails=0;
    keepalive 10;
}

upstream normal_sync {
    hash $mxid_localpart consistent;
    server unix:/run/matrix-synapse/inbound_normal_sync1.sock max_fails=0;
    server unix:/run/matrix-synapse/inbound_normal_sync2.sock max_fails=0;
    server unix:/run/matrix-synapse/inbound_normal_sync3.sock max_fails=0;
    keepalive 10;
}

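The hash directive keys on $mxid_localpart, so requests from the same user are always forwarded to the same worker; consistent keeps most users on their worker when a server is added to or removed from the group. max_fails=0 stops nginx from marking a worker as unavailable after failed requests, and keepalive 10 keeps up to ten idle connections to the workers open per nginx worker process. Keepalive only takes effect if the proxied requests use HTTP/1.1 with an empty Connection header.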
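The upstreams above hash on $mxid_localpart, which is not an nginx built-in and has to come from a map of our own. The following is a sketch modelled on the example in the Synapse workers documentation. It assumes access tokens of the form syt_<encoded localpart>_... that arrive either in the Authorization header or as the access_token query argument; the helper variable $accesstoken_from_urlparam is only illustrative.

```nginx
# Fallback: take the user part from ?access_token=syt_<user>_...
map $arg_access_token $accesstoken_from_urlparam {
    # Unknown token format: hash on the whole token instead.
    default                    $arg_access_token;
    "~syt_(?<username>.*?)_.*" $username;
}

# Preferred: take the user part from "Authorization: Bearer syt_<user>_..."
map $http_authorization $mxid_localpart {
    # Unknown token format: hash on the whole header instead.
    default                           $http_authorization;
    "~Bearer syt_(?<username>.*?)_.*" $username;
    # No Authorization header at all: fall back to the query argument.
    ""                                $accesstoken_from_urlparam;
}
```

The captured value is the encoded form of the localpart as it appears in the token, not the readable user name. For hashing that makes no difference, as long as the same user always yields the same value.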

Locations

Now that we have defined the workers and/or worker pools, we have to forward the right traffic to the right workers. The Synapse documentation about available worker types lists which endpoints a specific worker type can handle.

The docs say that the generic_worker can handle these synchronisation requests:

# Sync requests
^/_matrix/client/(r0|v3)/sync$
^/_matrix/client/(api/v1|r0|v3)/events$
^/_matrix/client/(api/v1|r0|v3)/initialSync$
^/_matrix/client/(api/v1|r0|v3)/rooms/[^/]+/initialSync$

If we had only one worker type for synchronisation, named sync, and did not split those requests into normal and initial ones, we would direct all sync requests to that worker with this location:

location ~ ^(/_matrix/client/(r0|v3)/sync|/_matrix/client/(api/v1|r0|v3)/events|/_matrix/client/(api/v1|r0|v3)/initialSync|/_matrix/client/(api/v1|r0|v3)/rooms/[^/]+/initialSync)$ {
    proxy_pass http://sync;
}
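With the sync work split over two pools, that single location no longer fits. One possible split, as a sketch: /sync goes through the $sync map, and the remaining endpoints are pinned to a pool by hand. Which pool the legacy events and initialSync endpoints belong to is an assumption made here, not something the Synapse docs prescribe.

```nginx
# /sync: the map on $arg_since picks normal_sync or initial_sync.
location ~ ^/_matrix/client/(r0|v3)/sync$ {
    proxy_pass http://$sync;
}

# Legacy initialSync endpoints: always a full sync (assumed routing).
location ~ ^/_matrix/client/(api/v1|r0|v3)/(initialSync|rooms/[^/]+/initialSync)$ {
    proxy_pass http://initial_sync;
}

# Legacy event stream: treated as incremental traffic (assumed routing).
location ~ ^/_matrix/client/(api/v1|r0|v3)/events$ {
    proxy_pass http://normal_sync;
}
```

None of the proxy_pass targets carries a URI part; nginx does not allow one in a regex location, and without it the original request URI is passed to the worker unchanged.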