---
gitea: none
include_toc: true
---
# Worker-based setup
Very busy servers grind to a halt because a single process can't keep up with
the load, so you want to spread different types of work over several
processes. See this [Matrix blog](https://matrix.org/blog/2020/11/03/how-we-fixed-synapse-s-scalability/)
for some background information.

The traditional Synapse setup is one monolithic piece of software that does
everything. Joining a very busy room creates a bottleneck, as the server will
spend all its cycles on synchronizing that room.

You can split the server into workers, which are basically Synapse servers
themselves. Redirect specific tasks to them and you have several different
servers doing all kinds of tasks at the same time. A busy room will no longer
freeze the rest.

Workers communicate with each other via socket files and Redis.
# Redis
The first step is to install Redis.
```
apt install redis-server
```
For less overhead we use a UNIX socket instead of a network connection to
localhost. Disable the TCP listener and enable the socket in
`/etc/redis/redis.conf`:
```
port 0
unixsocket /run/redis/redis-server.sock
unixsocketperm 770
```
Our matrix user (`matrix-synapse`) has to be able to read from and write to
that socket, which is created by Redis and owned by `redis:redis`, so we add
user `matrix-synapse` to the group `redis`.
```
adduser matrix-synapse redis
```
Restart Redis for these changes to take effect. Check that port 6379 is no
longer active, and that the socket file `/run/redis/redis-server.sock` exists.
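
The two checks can also be scripted. This is a minimal sketch using only the Python standard library; the function names are our own, and the host, port and socket path are simply the values used in this document.

```python
import os
import socket
import stat


def is_unix_socket(path):
    """Return True if `path` exists and is a UNIX domain socket."""
    try:
        return stat.S_ISSOCK(os.stat(path).st_mode)
    except OSError:
        # Missing file, or a directory we are not allowed to look into.
        return False


def tcp_port_open(host, port, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print("TCP listener gone:", not tcp_port_open("127.0.0.1", 6379))
    print("socket present:   ", is_unix_socket("/run/redis/redis-server.sock"))
```

Run it as a user that may enter `/run/redis` (root, or a member of the `redis` group), otherwise the socket check reports `False` even when the socket exists.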
# Synapse
Workers communicate with each other over sockets, which are all placed in one
directory. To make sure only the users that need access will have it, we
create a new group and add those users to it.

Then, create the directory where all the socket files for workers will live,
and give it the correct user, group and permissions:
```
groupadd --system clubmatrix
adduser matrix-synapse clubmatrix
adduser www-data clubmatrix
mkdir /run/matrix-synapse
dpkg-statoverride --add --update matrix-synapse clubmatrix 2770 /run/matrix-synapse
```
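
Mode `2770` may need a word of explanation: the leading `2` is the setgid bit, which makes every socket created in the directory inherit the group `clubmatrix`, and `770` gives owner and group full access and everyone else none. The sketch below decodes the number and checks a directory against it; the helper names are our own.

```python
import os
import stat


def describe_mode(mode):
    """Render a numeric directory mode the way `ls -ld` shows it."""
    return stat.filemode(stat.S_IFDIR | mode)


def has_mode(path, mode):
    """Return True if the permission bits of `path` equal `mode`."""
    return stat.S_IMODE(os.stat(path).st_mode) == mode


if __name__ == "__main__":
    print(describe_mode(0o2770))  # drwxrws---
```

On the server, `has_mode("/run/matrix-synapse", 0o2770)` should return `True` once the commands above have run.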
First we change Synapse from listening on `localhost:8008` to listening on a
socket. We'll do most of our workers work in `conf.d/listeners.yaml`, so let's
put the new configuration for the main process there, including a replication
listener:
```
listeners:
  - path: /run/matrix-synapse/inbound_main.sock
    mode: 0660
    type: http
    resources:
      - names:
          - client
          - consent
          - federation

  - path: /run/matrix-synapse/replication.sock
    mode: 0660
    type: http
    resources:
      - names:
          - replication
```
This means Synapse will create two sockets under `/run/matrix-synapse`: one
for incoming traffic that is forwarded by nginx (`inbound_main.sock`), and one for
communicating with all the other workers (`replication.sock`).
If you restart Synapse now, it will no longer receive any traffic, because
nginx is still forwarding its traffic to `localhost:8008`. We'll get to nginx
later, but you'd have to change
```
proxy_pass http://localhost:8008;
```
to
```
proxy_pass http://unix:/run/matrix-synapse/inbound_main.sock;
```
Once you've done this, restart Synapse and check that the socket is created
and has the correct permissions. Now point Synapse at Redis in `conf.d/redis.yaml`:
```
redis:
  enabled: true
  path: /run/redis/redis-server.sock
```
Check that Synapse can connect to Redis via the socket; you should find log
entries like this:
```
synapse.replication.tcp.redis - 292 - INFO - sentinel - Connecting to redis server UNIXAddress('/run/redis/redis-server.sock')
synapse.util.httpresourcetree - 56 - INFO - sentinel - Attaching <synapse.replication.http.ReplicationRestResource object at 0x7f95f850d150> to path b'/_synapse/replication'
synapse.replication.tcp.redis - 126 - INFO - sentinel - Connected to redis
synapse.replication.tcp.redis - 138 - INFO - subscribe-replication-0 - Sending redis SUBSCRIBE for ['matrix.example.com/USER_IP', 'matrix.example.com']
synapse.replication.tcp.redis - 141 - INFO - subscribe-replication-0 - Successfully subscribed to redis stream, sending REPLICATE command
synapse.replication.tcp.redis - 146 - INFO - subscribe-replication-0 - REPLICATE successfully sent
```
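
If you'd rather not read the log by eye, the check can be scripted. A minimal sketch: the two marker strings are taken from the log excerpt above, and the function name is our own.

```python
# Phrases from the log excerpt that show the Redis connection works.
MARKERS = (
    "Connected to redis",
    "Successfully subscribed to redis stream",
)


def redis_replication_ok(log_lines):
    """Return True if every marker occurs in at least one of the log lines."""
    lines = list(log_lines)
    return all(any(marker in line for line in lines) for marker in MARKERS)


if __name__ == "__main__":
    sample = [
        "synapse.replication.tcp.redis - 126 - INFO - sentinel - Connected to redis",
        "synapse.replication.tcp.redis - 141 - INFO - subscribe-replication-0 - "
        "Successfully subscribed to redis stream, sending REPLICATE command",
    ]
    print(redis_replication_ok(sample))  # True
```

To run it against a real log, pass an open file object instead of the sample list. Where that log lives depends on your log configuration.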
# Worker overview
Every worker is, in fact, a Synapse server, only with a limited set of tasks.
Some tasks can be handled by a number of workers, others only by one. Every
worker starts as a normal Synapse process, reading all the normal
configuration files, and then a bit of configuration for the specific worker
itself.

Workers need to communicate with each other and the main process; they do that
via the `replication` sockets under `/run/matrix-synapse`.

Most workers also need a way to be fed traffic by nginx; they have an `inbound`
socket for that, in the same directory.

Finally, all those replicating workers need to be registered in the main
process: all workers and their replication sockets are listed in the
`instance_map`.

Every worker has its own configuration file; we'll put those under
`/etc/matrix-synapse/workers`. Create that directory, and then one systemd
service file for all workers (the template unit further down).
## Types of workers
We'll make separate workers for almost every task, and several for the
heaviest task: synchronising. An overview of which endpoints are to be
forwarded to which worker is in [Synapse's documentation](https://element-hq.github.io/synapse/latest/workers.html#available-worker-applications).
We'll create the following workers:
* login
* federation_sender
* mediaworker
* userdir
* pusher
* push_rules
* typing
* todevice
* accountdata
* presence
* receipts
* initial_sync: 1 and 2
* normal_sync: 1, 2 and 3
Some of them are `stream_writers`, and the [documentation about
stream writers](https://element-hq.github.io/synapse/latest/workers.html#stream-writers)
says:
```
Note: The same worker can handle multiple streams, but unless otherwise documented, each stream can only have a single writer.
```
So, stream writers must have unique tasks: you can't have two or more workers
writing to the same stream. Stream writers have to be listed in `stream_writers`:
```
stream_writers:
  account_data:
    - accountdata
  presence:
    - presence
  receipts:
    - receipts
  to_device:
    - todevice
  typing:
    - typing
  push_rules:
    - push_rules
```
As you can see, we've given the stream workers the name of the stream they're
writing to. We could instead combine all those streams into one worker, which
would probably be enough for most instances: define a worker with the name
`streamwriter` and list it under every stream, instead of a single worker for
every stream.
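
The single-writer rule is easy to check mechanically. The sketch below takes the `stream_writers` mapping as a plain Python dict and reports any stream with more than one writer. It assumes none of the listed streams is documented as supporting multiple writers, and the function name is our own.

```python
def streams_with_multiple_writers(stream_writers):
    """Return the streams that list more than one writer, sorted by name."""
    return sorted(
        stream for stream, writers in stream_writers.items() if len(writers) > 1
    )


# One worker per stream, as configured above.
per_stream = {
    "account_data": ["accountdata"],
    "presence": ["presence"],
    "receipts": ["receipts"],
    "to_device": ["todevice"],
    "typing": ["typing"],
    "push_rules": ["push_rules"],
}

# The alternative: one worker called "streamwriter" handling every stream.
combined = {stream: ["streamwriter"] for stream in per_stream}

if __name__ == "__main__":
    print(streams_with_multiple_writers(per_stream))  # []
    print(streams_with_multiple_writers(combined))    # []
```

Both layouts pass, because one worker may write several streams; only a stream with two or more writers is reported.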
Finally, we have to list all these workers under `instance_map`: their name
and their replication socket:
```
instance_map:
  main:
    path: "/run/matrix-synapse/replication.sock"
  login:
    path: "/run/matrix-synapse/replication_login.sock"
  federation_sender:
    path: "/run/matrix-synapse/replication_federation_sender.sock"
  mediaworker:
    path: "/run/matrix-synapse/replication_mediaworker.sock"
  ...
  normal_sync1:
    path: "/run/matrix-synapse/replication_normal_sync1.sock"
  normal_sync2:
    path: "/run/matrix-synapse/replication_normal_sync2.sock"
  normal_sync3:
    path: "/run/matrix-synapse/replication_normal_sync3.sock"
```
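
A stream writer that is missing from `instance_map` cannot be reached by the other processes, so it is worth cross-checking the two settings. A minimal sketch, with both settings as plain Python dicts and a function name of our own:

```python
def writers_missing_from_instance_map(stream_writers, instance_map):
    """Return the stream writer names that have no instance_map entry, sorted."""
    named = {name for writers in stream_writers.values() for name in writers}
    return sorted(named - set(instance_map))


if __name__ == "__main__":
    stream_writers = {"typing": ["typing"], "presence": ["presence"]}
    instance_map = {
        "main": {"path": "/run/matrix-synapse/replication.sock"},
        "typing": {"path": "/run/matrix-synapse/replication_typing.sock"},
    }
    # "presence" writes a stream but is not listed in the instance map.
    print(writers_missing_from_instance_map(stream_writers, instance_map))
    # ['presence']
```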
## Defining a worker
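# The rest
The systemd service file for all workers is a template unit; `%i` is replaced
by the name of the worker: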
```
[Unit]
Description=Synapse %i
AssertPathExists=/etc/matrix-synapse/workers/%i.yaml
# This service should be restarted when the synapse target is restarted.
PartOf=matrix-synapse.target
ReloadPropagatedFrom=matrix-synapse.target
# if this is started at the same time as the main, let the main process start
# first, to initialise the database schema.
After=matrix-synapse.service
[Service]
Type=notify
NotifyAccess=main
User=matrix-synapse
WorkingDirectory=/var/lib/matrix-synapse
EnvironmentFile=-/etc/default/matrix-synapse
ExecStart=/opt/venvs/matrix-synapse/bin/python -m synapse.app.generic_worker --config-path=/etc/matrix-synapse/homeserver.yaml --config-path=/etc/matrix-synapse/conf.d/ --config-path=/etc/matrix-synapse/workers/%i.yaml
ExecReload=/bin/kill -HUP $MAINPID
Restart=always
RestartSec=3
SyslogIdentifier=matrix-synapse-%i
[Install]
WantedBy=matrix-synapse.target
```
And create the `matrix-synapse.target`, which combines all Synapse parts into
one systemd target:
```
[Unit]
Description=Matrix Synapse with all its workers
After=network.target
[Install]
WantedBy=multi-user.target
```
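
Because the template takes the worker name from `%i` and expects `/etc/matrix-synapse/workers/%i.yaml`, the list of units to enable follows directly from the files in that directory. A small sketch that derives it; the unit file name `matrix-synapse-worker@.service` is an assumption (it is the name Synapse's own systemd examples use), so adjust it to whatever you saved the template as.

```python
from pathlib import Path

# Assumed name of the template unit; change it if yours differs.
UNIT_TEMPLATE = "matrix-synapse-worker@{name}.service"


def worker_units(workers_dir):
    """Return one systemd unit name per `<name>.yaml` in `workers_dir`, sorted."""
    return [
        UNIT_TEMPLATE.format(name=config.stem)
        for config in sorted(Path(workers_dir).glob("*.yaml"))
    ]


if __name__ == "__main__":
    for unit in worker_units("/etc/matrix-synapse/workers"):
        print(unit)
```

Each printed name can then be handed to `systemctl enable --now`.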
# Create workers
We need a configuration file for each worker, and the main process needs to
know which workers there are and how to contact them.
The latter is done in the ...
## Temporary block
We're going to configure a few different workers:
* client-sync
* roomworker
* federation-sender
* mediaworker
### Client-sync
This type needs both an inbound socket to receive stuff from nginx, and a
replication socket to communicate with the rest. We probably want a few of
these workers. The configuration should look like this:
```
worker_app: "synapse.app.generic_worker" # Always this unless "synapse.app.media_repository"
worker_name: "clientsync1" # Name of worker specified in instance map
worker_log_config: "/etc/matrix-synapse/logconf.d/clientsync.yaml" # Log config file

worker_listeners:
  # Include for any worker in the instance map above:
  - path: "/run/matrix-synapse/replication_clientsync1.sock"
    type: http
    resources:
      - names: [replication]
        compress: false
  # Include for any worker that receives requests in Nginx:
  - path: "/run/matrix-synapse/synapse_inbound_client_sync1.sock"
    type: http
    x_forwarded: true # Trust the X-Forwarded-For header from Nginx
    resources:
      - names:
          - client
          - consent
```
### Roomworker
These don't need a replication socket as they're not in the instance map, but
they do need an inbound socket for nginx to pass stuff to them. We want a few
of these workers; we may even configure a worker for one specific busy room...
Configuration should look like this:
```
worker_app: "synapse.app.generic_worker"
worker_name: "roomworker1"
worker_log_config: "/etc/matrix-synapse/logconf.d/roomworker.yaml"

worker_listeners:
  - path: "/run/matrix-synapse/inbound_roomworker1.sock"
    type: http
    x_forwarded: true
    resources:
      - names:
          - client
          - consent
          - federation
        compress: false
```
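
Since we want several of these, and the files differ only in the worker name, they can be generated from one template. The sketch below renders the configuration above for `roomworker1` up to `roomworkerN`. The function name is our own, and writing the results into `/etc/matrix-synapse/workers` is left to you.

```python
# The roomworker configuration from above, with the name as a placeholder.
TEMPLATE = """\
worker_app: "synapse.app.generic_worker"
worker_name: "{name}"
worker_log_config: "/etc/matrix-synapse/logconf.d/roomworker.yaml"

worker_listeners:
  - path: "/run/matrix-synapse/inbound_{name}.sock"
    type: http
    x_forwarded: true
    resources:
      - names:
          - client
          - consent
          - federation
        compress: false
"""


def render_roomworkers(count):
    """Return a dict mapping each worker name to its configuration text."""
    names = [f"roomworker{number}" for number in range(1, count + 1)]
    return {name: TEMPLATE.format(name=name) for name in names}


if __name__ == "__main__":
    for name, config in render_roomworkers(2).items():
        print(f"# {name}.yaml")
        print(config)
```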
### Mediaworker
To make sure the worker takes care of handling media, and not the main
process, you need to tell the main process to keep its hands off media, and
which worker will take care of it:
```
enable_media_repo: false
media_instance_running_background_jobs: "mediaworker1"
```
Then define the worker, like this:
```
worker_app: "synapse.app.media_repository"
worker_name: "mediaworker1"
worker_log_config: "/etc/matrix-synapse/logconf.d/mediaworker.yaml"

worker_listeners:
  - path: "/run/matrix-synapse/inbound_mediaworker1.sock"
    type: http
    x_forwarded: true
    resources:
      - names: [media]
```