

Reverse proxy for Synapse with workers

Changing nginx's configuration from a reverse proxy for a normal, monolithic Synapse to one for a Synapse that uses workers requires quite a lot of changes.

As mentioned in Synapse with workers, we're changing the "backend" from network sockets to UNIX sockets.

Because we're going to have to forward a lot of specific requests to all kinds of workers, we'll split the configuration into a few bits:

  • all proxy_pass settings
  • all location definitions
  • maps that define variables
  • upstreams that point to the correct socket(s) with the correct settings
  • settings for private access
  • connection optimizations

Some of these go into /etc/nginx/conf.d because they are part of the configuration of nginx itself, others go into /etc/nginx/snippets because we need to include them several times in different places.

Optimizations

In the quest for speed, we are going to tweak several settings in nginx. To keep things manageable, most of those tweaks go into separate configuration files that are either automatically included (those under /etc/nginx/conf.d) or explicitly where we need them (those under /etc/nginx/snippets).

For every proxy_pass we want to configure several settings. Because we don't want to repeat the same list of settings every time, we put all of them in one snippet of code that we can include wherever we need it.

Create /etc/nginx/snippets/proxy.conf and put this in it:

proxy_connect_timeout 2s;
proxy_buffering off;
proxy_http_version 1.1;
proxy_read_timeout 3600s;
proxy_redirect off;
proxy_send_timeout 120s;
proxy_socket_keepalive on;
proxy_ssl_verify off;

proxy_set_header Accept-Encoding "";
proxy_set_header Host $host;
proxy_set_header X-Forwarded-For $remote_addr;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header Connection $connection_upgrade;
proxy_set_header Upgrade $http_upgrade;

client_max_body_size 50M;

Every time we use a proxy_pass, we include this snippet.
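In practice that looks like this; a minimal sketch, where the upstream name example_worker is purely illustrative (substitute one of the upstreams defined later):

```nginx
# Hypothetical location: "example_worker" stands in for a real
# upstream; the include pulls in all the proxy_* settings above.
location /_matrix/client/v3/login {
    include /etc/nginx/snippets/proxy.conf;
    proxy_pass http://example_worker;
}
```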

Maps

A map sets a variable based on, usually, another variable. One case where we use this is in determining the type of sync a client is doing. A normal sync, which simply updates an existing session, is a rather lightweight operation. An initial sync, a full sync because the session is brand new, is not so lightweight.

A normal sync can be recognised by the since parameter in the request: it tells the server when the client's last sync was. If there is no since, we're dealing with an initial sync.

We want to forward requests for normal syncs to the normal_sync workers, and the initial syncs to the initial_sync workers.

We decide which type of worker to forward the sync request to by looking at the presence or absence of since: if it's there, it's a normal sync and we set the variable $sync to normal_sync; if it's not, we set $sync to initial_sync. The content of since is irrelevant to nginx.

This is what the map looks like:

map $arg_since $sync {
    default normal_sync;
    '' initial_sync;
}

We evaluate $arg_since to set $sync: $arg_since is nginx's $arg_name construct, $arg_ followed by since, the request argument we want to inspect. See the index of variables in nginx for more variables we can use.

By default we set $sync to normal_sync; only when the since argument is empty or absent do we set it to initial_sync.

After this mapping, we forward the request to the correct worker like this:

proxy_pass http://$sync;
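The proxy snippet from earlier also references $connection_upgrade. That variable is not built into nginx; it has to come from another map, following nginx's standard WebSocket proxying pattern:

```nginx
# Standard WebSocket upgrade map (see nginx's own documentation):
# if the client sent an Upgrade header, forward "upgrade" in the
# Connection header; otherwise send "close".
map $http_upgrade $connection_upgrade {
    default upgrade;
    ''      close;
}
```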

Upstreams

In our configuration, nginx is not only a reverse proxy, it's also a load balancer. Just like haproxy, it can forward requests to "servers" behind it. Such a server is the inbound UNIX socket of a worker, and there can be several of them in one group.

Let's start with a simple one, the login worker, that handles the login process for clients.

(login worker configuration goes here...)
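Until that section is filled in, here is a minimal sketch of what such an upstream could look like, assuming a single login worker whose inbound socket follows the same naming convention as the sync workers below (the socket path is hypothetical):

```nginx
# Hypothetical upstream for a single login worker; the socket name
# is an assumption modelled on the sync workers' naming scheme.
upstream login {
    server unix:/run/matrix-synapse/inbound_login.sock max_fails=0;
    keepalive 10;
}
```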

Two of these upstreams are the sync workers: normal_sync and initial_sync, both consisting of several "servers":

upstream initial_sync {
    hash $mxid_localpart consistent;
    server unix:/run/matrix-synapse/inbound_initial_sync1.sock max_fails=0;
    server unix:/run/matrix-synapse/inbound_initial_sync2.sock max_fails=0;
    keepalive 10;
}

upstream normal_sync {
    hash $mxid_localpart consistent;
    server unix:/run/matrix-synapse/inbound_normal_sync1.sock max_fails=0;
    server unix:/run/matrix-synapse/inbound_normal_sync2.sock max_fails=0;
    server unix:/run/matrix-synapse/inbound_normal_sync3.sock max_fails=0;
    keepalive 10;
}

The hash line ensures that requests from the same user are always forwarded to the same worker: nginx hashes the variable $mxid_localpart, and the consistent keyword keeps most of that mapping intact when a server is added or removed.
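Note that $mxid_localpart is not a built-in nginx variable; it has to be set by a map of its own. One community-known approach, sketched here with an illustrative regex and variable name, extracts the localpart from a modern syt_-style Synapse access token in the Authorization header:

```nginx
# Illustrative map: modern Synapse access tokens look like
# "syt_<encoded localpart>_<random>_<crc>", so the localpart can be
# fished out of the Bearer token. Requests without a token fall
# through to the empty default and all hash to the same worker.
map $http_authorization $mxid_localpart {
    default "";
    "~Bearer syt_(?<token_localpart>[A-Za-z0-9]+)_" $token_localpart;
}
```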

Locations

Now that we have defined the workers and/or worker pools, we have to forward the right traffic to the right workers. The Synapse documentation about available worker types lists which endpoints a specific worker type can handle.

The docs say that the generic_worker can handle these synchronisation endpoints:

# Sync requests
^/_matrix/client/(r0|v3)/sync$
^/_matrix/client/(api/v1|r0|v3)/events$
^/_matrix/client/(api/v1|r0|v3)/initialSync$
^/_matrix/client/(api/v1|r0|v3)/rooms/[^/]+/initialSync$

Now, if we had only one worker type for synchronisation, named syncworkers, without splitting those requests into normal and initial, we would direct all sync requests to that worker pool with this location:

location ~ ^(/_matrix/client/(r0|v3)/sync|/_matrix/client/(api/v1|r0|v3)/events|/_matrix/client/(api/v1|r0|v3)/initialSync|/_matrix/client/(api/v1|r0|v3)/rooms/[^/]+/initialSync)$ {
    proxy_pass http://syncworkers;
}
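Since we do split the requests, here is a sketch of how the $sync map and the two upstreams could combine. Which pool the legacy /events endpoint belongs in is a design choice and is left out of this sketch:

```nginx
# These endpoints are always an initial sync:
location ~ ^/_matrix/client/(api/v1|r0|v3)(/rooms/[^/]+)?/initialSync$ {
    include /etc/nginx/snippets/proxy.conf;
    proxy_pass http://initial_sync;
}

# /sync can be either kind; the map on $arg_since decides:
location ~ ^/_matrix/client/(r0|v3)/sync$ {
    include /etc/nginx/snippets/proxy.conf;
    proxy_pass http://$sync;
}
```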

That's the concept.