This new version introduced a system so that your instance stops sending out content to other instances that are supposedly dead / offline.
Unfortunately, for some reason there are false positives. When I compared the results of a curl request against the information in our Lemmy database, I found more than 350 false positives.
In the DB there is a table called “instance” which has a column called “updated”. If the date on that column is older than 3 days, your server will stop sending any content to those instances.
For some reason I had entries dated as last being alive in July, even though those instances were up the whole time. If an entry is incorrect, you can fix it manually with an update statement that sets today’s date. If your instance is not too large, you can safely update all entries to today’s date and then check whether everything works as expected for any new content created on your instance.
Dead instances won’t have much impact unless your instance is large and generates a lot of content, so if you start noticing wonky behavior and don’t want to check entries one by one, it might be easier to simply update all of them and have Lemmy believe they’re all alive.
If you don’t know how to access the database, run this command, where domaincom is your instance domain without the dot:
- docker exec -it domaincom_postgres_1 busybox /bin/sh
- psql -U
(The default user is ‘lemmy’.) You could technically do this in one single step, but it’s good to know the command for getting shell access to the container itself.
This should give you access to the postgres CLI. Use \c to connect, \dt to list tables, and \d+ tablename to show a table’s definition. You can also run SQL queries from there.
Try this query to list all instances and their updated dates:
SELECT * FROM instance;
You can use other SQL queries to get better results or correct false positives. Just be careful with what you execute since there’s no undo.
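Since there is no undo, one cautious option is to dump the instance table to a file before changing anything, so the old dates can still be consulted afterwards. This is only a minimal sketch, not an official procedure: the container name is the placeholder from the post, the user is the default mentioned above, and the database name "lemmy" and the helper name backup_instance_table are assumptions to adjust for your own setup.

```shell
# Assumed defaults; override them via environment variables for your setup.
CONTAINER="${CONTAINER:-domaincom_postgres_1}"  # placeholder name from the post
DB_USER="${DB_USER:-lemmy}"                     # default user per the post
DB_NAME="${DB_NAME:-lemmy}"                     # assumption: database is named "lemmy"

# Dump only the "instance" table to a local file (hypothetical helper).
backup_instance_table() {
    outfile="${1:-instance_backup.sql}"
    docker exec "$CONTAINER" pg_dump -U "$DB_USER" -t instance "$DB_NAME" > "$outfile"
}
```

Calling backup_instance_table with no argument would write instance_backup.sql to the current directory on the host.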
One liner to get into postgres:
docker exec -it lemmyca_postgres_1 psql -U lemmy
Show only the instances not updated in the past 3 days:
select * from instance where updated < current_date-3;
Fix those old rows:
update instance set updated = current_date where updated < current_date - 3;
Your less-than signs got turned into
&lt;
Almost a one-liner; you just need to pass in the statement using the -c flag.
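Putting that together, a small wrapper can run a single statement without opening an interactive session. This is just a sketch: lemmy_sql is a hypothetical helper name, and the container name and user are the ones quoted in this thread, so substitute your own.

```shell
# Container name and user as quoted in the thread; adjust as needed.
CONTAINER="${CONTAINER:-lemmyca_postgres_1}"
DB_USER="${DB_USER:-lemmy}"

# Run one SQL statement through psql's -c flag and return.
# -it is left out because a single statement needs no interactive terminal.
lemmy_sql() {
    docker exec "$CONTAINER" psql -U "$DB_USER" -c "$1"
}
```

For example, lemmy_sql 'select * from instance where updated < current_date - 3;' would list the rows Lemmy currently treats as dead.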
If you used Lemmy-Easy-Deploy, the Docker container name is
lemmy-easy-deploy-postgres-1
docker exec -it lemmy-easy-deploy-postgres-1 busybox /bin/sh
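Because the container name differs between setups, it may be easier to look it up than to guess it. A minimal sketch, assuming the container name contains "postgres" as in the examples in this thread; find_postgres_container is a hypothetical helper name.

```shell
# List running containers by name and keep those that look like Postgres.
find_postgres_container() {
    docker ps --format '{{.Names}}' | grep -i 'postgres'
}
```

Whatever name it prints can then be used in place of the container name in the docker exec commands above.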
Is there a link to a GitHub issue or pull request that I can track for this?
Very useful thanks for sharing! We should have an admin tips community to collect information like this
You can post it here, along with the explanation of why it happens.
there is !lemmy_admin@lemmy.ml
Perfect! Thanks!
Thank you, but is this really safe?
Just asking since there were db errors that messed up other people’s instances, or so it seemed.
It should be safe as long as you put in a valid timestamp and not some other value. If you run a large instance, then you run the risk of pseudo-DDoSing yourself by sending a large number of requests to dead servers, but unless you’re a large instance you shouldn’t have to worry about that.
It’s effectively how lemmy pre-0.18.3 behaved so it’d be no worse than that.
Is it possible to add some scheduled trigger to check all instances in that table regardless of the last updated value? It shouldn’t be too frequent, but frequent enough to avoid fully defederating some instances.
In theory that’s what Lemmy now does every day, but I have no idea why it sometimes fails to update some instances, including instances which are very much alive at 12AM, which is when this gets executed.
If you’re unlucky enough to have lemmy.world momentarily down at the exact moment your instance is doing the check, then it’s possible. Also, lemmy.world implements rate limiting via Cloudflare, and their admins are probably not whitelisting requests from the dead instance checker.
I think instance owners that use Cloudflare should whitelist
/nodeinfo/2.0.json
so automated requests to that endpoint are not blocked by Cloudflare’s bot protection (which can lead to failed daily instance checks on instances you federate with).
Good point. I think that might be it actually. This could be the reason.
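To see whether a given instance answers on that endpoint, or whether something in front of it rejects automated requests, a quick probe can help. This is a rough sketch and not Lemmy's actual checker: check_nodeinfo is a hypothetical helper, the 10-second timeout is arbitrary, and bot protection may treat a plain curl request differently from Lemmy's own requests.

```shell
# Print the HTTP status returned for each domain's nodeinfo endpoint.
# 200 suggests the endpoint is reachable; 403 or 429 may point to bot
# protection or rate limiting; 000 means curl got no HTTP response at all.
check_nodeinfo() {
    for domain in "$@"; do
        status=$(curl -s -o /dev/null --max-time 10 -w '%{http_code}' \
            "https://$domain/nodeinfo/2.0.json")
        echo "$domain $status"
    done
}
```

For example, check_nodeinfo lemmy.world lemmy.ml would print one line per domain with its status code.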
I hope that’s not 12am on the dot for all instances, with no fudging of the check time.
That seems to be the case more or less. In my case times go from 0:00 to 0:25 or so when it finishes.
So at that exact time big servers get a few hundred requests? I hope it’s a very lightweight check and doesn’t trigger any flood or spam protection… otherwise https://i.imgflip.com/xhss9.jpg
thank you :)