Important ASA (Admins Service Announcement) for Mastodon admins and PostgreSQL admins in general: Due to locale collation (sort order) changes in glibc 2.28, some distribution upgrades will cause PostgreSQL text indexes to become corrupted, potentially leading to unique indexes not being correctly enforced and to inconsistent application data.
THIS WILL ALSO HAPPEN IF YOU DO NOT UPDATE YOUR POSTGRESQL, A DISTRO UPDATE IS ENOUGH!
Affected upgrades are Ubuntu 18.04 or lower to 18.10 or higher, so especially current LTS upgrades (18.04 to 20.04), Debian 9 or lower to Debian 10, RHEL/CentOS 7 or lower to RHEL/CentOS 8.
Not heeding the linked advice will at some point lead to strange search and ordering results from your db and will probably cause duplicate entries despite unique indexes. Fixing this afterwards involves rebuilding the indexes and somehow resolving every duplicate key error you encounter.
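To make the failure mode concrete, here is a minimal sketch of why a changed sort order breaks lookups and uniqueness checks. It is a toy model, not PostgreSQL code: the "index" is a plain sorted list, and `old_key` and `new_key` are made-up stand-ins for the collation rules before and after the upgrade.

```python
import bisect

# Hypothetical collations: stand-ins for the locale rules before and
# after the upgrade. Real glibc rules are far more involved.
def old_key(s):
    return s.replace("-", "")   # "old" rules: hyphens are ignored

def new_key(s):
    return s                    # "new" rules: plain code point order

def index_lookup(index, value, key):
    """Binary search that trusts `index` to be sorted by `key`."""
    keys = [key(v) for v in index]
    pos = bisect.bisect_left(keys, key(value))
    return pos < len(index) and index[pos] == value

def unique_insert(index, value, key):
    """Insert `value` unless the lookup says it already exists."""
    if index_lookup(index, value, key):
        raise ValueError("duplicate key")
    keys = [key(v) for v in index]
    index.insert(bisect.bisect_left(keys, key(value)), value)

# The index was built (sorted) under the old rules.
index = sorted(["ab", "a-c", "ad"], key=old_key)

found_before = index_lookup(index, "a-c", old_key)  # search with old rules
found_after = index_lookup(index, "a-c", new_key)   # search with new rules

# Under the new rules the existing entry is not found, so the
# "unique" insert goes through and the value is stored twice.
unique_insert(index, "a-c", new_key)
```

The data never changed; only the comparison rules did. That is why a distro upgrade alone is enough, and why rebuilding the index under the new rules fixes lookups but cannot undo duplicates that were already inserted.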
We did not know about this beforehand, as it is not noted in any docs we could find regarding distro or pg upgrades. We noticed strange things approx. 6 hours after the distro+pg upgrade with pg_upgrade and had ~10 unique index violations in our relatively small Mastodon instance, ~10 across 2 smallish HackMD instances, and ~30-40 across 2 rather lively Matrix Synapse instances. It took us around 3h to get everything sorted, and we can only _hope_ everything is good now. Please read the linked wiki.
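For hunting down such violations, the usual pattern is to group by the column that is supposed to be unique and keep the groups with more than one row. The sketch below shows that pattern against an in-memory SQLite database so it runs anywhere; the table `accounts_demo` and its `username` column are hypothetical, and this is not necessarily how the violations above were found. On a real PostgreSQL server the same query might be answered from the damaged index, so treat its results with care until the index has been rebuilt.

```python
import sqlite3

def find_duplicates(conn, table, column):
    """Return (value, count) pairs for values that occur more than once."""
    # Identifiers cannot be bound as parameters; only pass trusted names.
    query = (
        f'SELECT "{column}", COUNT(*) FROM "{table}" '
        f'GROUP BY "{column}" HAVING COUNT(*) > 1 '
        f'ORDER BY "{column}"'
    )
    return conn.execute(query).fetchall()

def demo():
    """Build a tiny hypothetical table with one duplicated value."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts_demo (id INTEGER PRIMARY KEY, username TEXT)"
    )
    conn.executemany(
        "INSERT INTO accounts_demo (username) VALUES (?)",
        [("alice",), ("bob",), ("alice",)],
    )
    return find_duplicates(conn, "accounts_demo", "username")
```

Deciding which of the duplicate rows to keep is application-specific and is the part that takes the time.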
@thegcat Yeah, it's not really a standard unfortunately - some use "mastoadmins" too. (And I've also seen "fediadmins" - I think Pleroma uses a Postgres backend too? So other Fediverse software will likely be affected as well, btw.)
There's also a (mostly unused) mailing list and a forum at https://discourse.joinmastodon.org where instance admins are around.
@thegcat I wish this #MastoAdmin tip was more widely known 2 months ago because I started having weirdness with my instance when I upgraded to Debian 10 and battled with masto weirdness throughout March.
If one or more of your mastodon docker containers starts randomly flapping in the breeze (restarting) and you did an OS update, look at your database indexes. My instance and a few others have already dealt with this.
Also postgresql in a docker container seems to be a bad idea.
@msh Note that the Debian 10 release notes did include this piece of information, so kudos to them for that. And from what I understood the pg Docker containers only did dump and restore pg upgrades, which would be unaffected?
@thegcat I will have to read the release notes more carefully in the future!
It should be noted that when I started having problems I was using a docker container for the db, and it was upgraded on a different day than the host OS. I'm not sure if there are issues with clashing host/container glibc versions as well.
If you have issues with random broken emojis, @ mentions, certain public profiles going 404, masto containers flapping etc. then your instance has probably hit this gotcha.
Let's say you upgrade the db container during a docker-compose rebuild, and it is still a v11 database but the base image has updated glibc. The container could theoretically just reattach the data volume and then your indexes will start corrupting if you don't manually reindex ASAP, just as if you upgraded OS on a base image...
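One way to catch this scenario is to record the glibc version that was in place when the indexes were built and compare it with the version the database process runs against now. The sketch below assumes you keep that old version string somewhere yourself; `needs_reindex` and `parse_version` are illustrative helpers, not part of any tool. It only covers the single collation change at glibc 2.28 discussed in this thread.

```python
import platform

# glibc release that shipped the changed collation data.
COLLATION_CHANGE = (2, 28)

def parse_version(text):
    """Turn '2.27' into (2, 27) so versions compare numerically."""
    return tuple(int(part) for part in text.split(".")[:2])

def needs_reindex(version_at_index_build, version_now):
    """True if the upgrade crossed the glibc 2.28 collation change."""
    old = parse_version(version_at_index_build)
    new = parse_version(version_now)
    return old < COLLATION_CHANGE <= new

if __name__ == "__main__":
    # Run this where the database runs (inside the container, if any).
    # On non-glibc systems both fields may come back empty.
    lib, version = platform.libc_ver()
    print(lib, version)
```

For example, `needs_reindex("2.24", "2.28")` (Debian 9 to Debian 10) is true, while `needs_reindex("2.28", "2.31")` is false. Comparing tuples of integers rather than raw strings matters because "2.9" would otherwise sort after "2.10".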
@atrus @thegcat ...so docker isn't a solution unto itself; however, if the creator of the docker image is mindful, they could address the issue. All I know is that I started noticing an increasing number of glitches after upgrading (I rebuilt the containers one weekend and updated the OS of the machine hosting them to Buster the next, IIRC), and it appears to have been this glibc issue anyway, so nothing seemed to catch the problem.
Aside from that I had other reasons to ditch docker...
For me, docker just kept getting in the way. For some reason I couldn't determine, the database ran noticeably slower in docker despite there not being anything that caused obvious overhead. Also, when errors did start to appear, docker-compose kept slamming services off and on abruptly to recover, including the database. Occasionally graceful shutdown would time out and docker would just force kill. Going through logs was annoying. All those overlayFS mounts were annoying, etc...
...it's not so much that docker is bad, m'kay? I know there are answers to all the docker problems I had...But for my use case it ultimately made no sense. I have a small instance and a pretty simple environment. Tuning the stock docker-compose to work *optimally* with my environment (including learning the intricacies of docker) turned out to be much more of a time sink than just following the docs to install masto straight onto a stock debian setup.
@thegcat pretty sure I'm not affected but just to be safe
mastodon=# reindex database concurrently mastodon;