• 0 Posts
  • 16 Comments
Joined 2 years ago
Cake day: June 7th, 2023

  • It’s been a few years since I did my initial setup (8, apparently, I just checked); so, my info is definitely out of date. Looking at the Ubuntu site, they still list Ubuntu 16.04, but I think the info on setting it up is still valid. Though, it looks like they only list setting up a mirror or a stripe set without parity. A mirror is fine, but you trade half your storage space for complete data redundancy. That can make sense, but usually not for a self-hosting situation. A stripe set without parity is only useful for losing data, never use this. The option you’ll want is a raidz, which is a stripe set with parity. The command will look like:

    zpool create zpool raidz /dev/sdb /dev/sdc /dev/sdd
    

    This would create a zpool named “zpool” from the drives at /dev/sdb, /dev/sdc and /dev/sdd.
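
    If you want to sanity check the result, the built-in status commands will show the pool layout and usable space (using the same example pool name as above):

    # show the pool layout and the health of each member disk
    zpool status zpool

    # show total, used and free space for the pool
    zpool list zpool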

    I would suggest spending some time reading up on the setup. It was actually pretty simple to do, but it’s good to have a foundation to work with. I also have this link bookmarked, as it was really helpful for getting rolling snapshots set up. As with the data redundancy given by RAID, snapshots do not replace backups; but, they can be used as part of a backup strategy. They also help when you make a mistake and delete/overwrite a file.
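
    For a rough idea of what rolling snapshots look like at the command line (a minimal sketch; the tooling in that link automates the rotation, and the pool name here is just the example from above):

    # take a dated, recursive snapshot of the whole pool
    zfs snapshot -r zpool@$(date +%Y-%m-%d)

    # list existing snapshots
    zfs list -t snapshot

    # drop a snapshot once it has aged out
    zfs destroy -r zpool@2023-01-01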

    Finally, to answer your question about hardware, my recollection and experience has been that ZFS is not terribly demanding of CPU. I ran an Intel Core i3 for most of the server’s life and only upgraded when I realized that I wanted to run game servers on it. Memory is more of an issue. The minimum requirement most often cited is 8GB, but I also saw a rule of thumb that you want 1GB of memory for each TB of storage. In the end, I went with 8GB of RAM, as I only had 4TB of storage (three 2TB disks in a RAIDZ1). But, also think about what other workloads you have on the system. When I built it, I was only running NextCloud, Nginx, Splunk, PiHole and WordPress (all in docker containers), and the initial 8GB of RAM was doing just fine. When I started running game servers, I started to run into issues. I now have 16GB and am mostly fine. Some game servers can be a bit heavy (e.g. Minecraft, because fucking Java), but I don’t normally see problems. Also, since the link I provided mentioned it, skip ECC memory. It’s almost never worth the cost, and for home use that “almost never” gets much closer to “actually never”.

    When choosing disks, keep in mind that you will need a minimum of 2 disks and that you effectively lose the storage space of one of the disks in the pool to parity storage (assuming all disks are the same size). Also, it is best for all of the disks to be the same size. You can technically use different sized disks in the same pool; but, the larger disks get treated as the same size as the smaller disks. So long as the pool is healthy, read speeds are better than a single disk, as the read can be spread out among the pool. But, write speeds can be slower, as the parity needs to be calculated at write time. Otherwise, you’re pretty free to choose any disks which will be recognized by the OS. You mention that 1TB is filling up; so, you’ll want to pick something bigger. I mentioned using spinning disks, as they can provide a lot more space for the money. Something like a 14TB WD Red drive can be had for $280 ($20/TB). With three of those in a RAIDZ1 pool, you get ~28TB of storage and can tolerate one disk failure without losing data. With solid state disks, you can expect costs closer to $80/TB. Though, there is a tradeoff in speed. So, you need to consider what type of workloads you expect the storage pool to handle. Video editing on spinning rust is not going to be fun. Streaming video at 4k is probably OK, though 8k is going to struggle.

    A couple of other things to think about are space in the chassis, drive connections and power. Chassis space is pretty obvious: you gotta put the disks in the box. Technically, you don’t have to mount the disks, they can just be sitting at the bottom of the case, but this can cause problems with heat shortening the lifespan of the drives. It’s best to have them properly mounted and fans pushing air over them. Drive connections are one of those things where you either have the headers or you don’t. Make sure your motherboard can support 3 more drives with the chosen interface (SATA, NVMe, etc.) before you get the drives. Nothing sucks more than having a fancy new drive only to be unable to plug it into the motherboard. Lastly, drives (and especially spinning drives) can be power hungry. Make sure your power supply can support the extra power requirements.

    Good luck with whatever route you pick.


  • Probably the easiest solution would be to just chuck a larger disk in the system and retain the original drive for the operating system. If you do not need the high speed of an SSD, you may be able to get more storage space for the money by going with a spinning disk. 7200RPM drives are fast enough for most applications, though you may run into issues streaming 4K (or higher) resolution video.

    Another option would be to start building out a storage pool using some type of RAID technology. On my own server, I use ZFS for the data partition. It is basically a software RAID. I use a RAID-Z1 configuration, which stripes the data over multiple disks (three in my case) and uses a parity calculation to provide data redundancy. It also has the advantage that it can be expanded to new disks dynamically and does not require that all disks are the same size. Initial setup does require more work and you are now monitoring multiple physical disks, but having a unified storage pool and redundancy is a nice way to go.

    Any way you go, just make sure you have good backups. Drives fail, and sometimes even early in their life. Backblaze reports can be an interesting read when looking at drive options, as they really do put the drives through the wringer.


  • do any of you hate how self-hosting services like photo- or document-management systems, or even a simple rss tool, forces you to sort your stuff out, and put your decades old files in order?!

    What is this “sort” thing you speak of? I don’t sort anything, I have NextCloud syncing my entire photos, videos and documents folders and they are just as messy as ever. Granted, I do go through my photos and videos once a year and dump them in a folder named for the year they were taken. Occasionally, I’ll go hog wild and try to sort some of a year’s photos/videos into folders named after events. Though, that hasn’t happened in a number of years. I set up NextCloud so I could have everything synced to my own server and just forget about it, not have to deal with labeling my data.

    As for bookmarks: I already keep those in folders; but, I don’t sync them. I use my desktop far more than I use my phone for web browsing. And the types of things I use my phone for (mostly recipes), I just keep bookmarked there.



  • Ya, sadly there is still a lot of useful content in the technical subreddits. So I find myself ending up there via search engines on a fairly regular basis. But, I specifically use the Redirector plugin for Firefox to auto-magically force the use of old Reddit. If I hit the site on my work computer, I’m quickly reminded about why I quit the site.



  • It makes little sense why it works on an offsite WiFi, but not mobile data.

    I’d agree with unbuckled above, it’s a DNS issue. If your mobile device is capable, use nslookup or dig to see what responses you are getting in different scenarios. It’s possible that your VPN software is leaking DNS queries out to the mobile data provider’s DNS servers while you are on mobile data and only using the correct DNS settings when you are on wifi. Look for split tunnel settings in the VPN software, as split tunneling can create this type of situation.

    You can also confirm this from the pihole side. Connect to the VPN via mobile data and browse to some website you don’t use often, but is not your own internal stuff. Then open the query log on your pihole and see if that domain shows up. I’d put money on that query not showing in the pihole query log.
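
    For example, from the phone (or anything going over the VPN), a quick way to see which resolver is actually answering looks something like this (the pihole address here is just a placeholder for your internal IP):

    # the "Server:" line in the output shows which resolver answered
    nslookup somerandomsite.example

    # ask the pihole directly (placeholder internal address) and compare the answers
    nslookup somerandomsite.example 192.168.1.53

    # dig also prints the responding server in its ";; SERVER:" line
    dig somerandomsite.example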


  • Along with the things others have said (Backups, Linux, Docker, Networking), I’d also recommend getting comfortable with server and network security. A lot of this is wrapped up in the simple mantra “install your goddamn updates!” But, there is more to it than that. For example, if you go with Nextcloud, read through their hardening guide and seriously consider implementing all of the recommendations.

    Also think through how you intend to manage both the server and the instance. If this is all local, then it is easier, as you can keep SSH access to the server firewalled off from the internet. If you host part of your stuff “in the cloud”, you’ll want to start looking at locking down access and using keys to log in (which is good practice in all situations). Also, never use default credentials.

    You may also want to familiarize yourself with the logs provided by the applications and maybe set up some monitoring around them. I personally run Nextcloud and I feed all my logs into Splunk (you can run a free instance in a docker container). I have a number of dashboards I look at every morning to keep an eye on things. E.g. failed/successful logins, traffic sources, URI requests, file access, etc.

    If your server is attached to the internet, it will be under attack constantly. Fail2Ban on my wireguard container banned 112 IP addresses over the last 24 hours for 3 failed attempts to log in via SSH. Less commonly, attackers try to log in to my Nextcloud instance. And my WordPress site is under constant attack. If you choose to run WordPress, be very careful about the plugins you choose to install, and then keep them up to date. WordPress itself is reasonably secure; the plugins are a shit-show and worse when they aren’t kept up to date.
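
    If you do set up Fail2Ban, it’s worth getting comfortable with its status output as well; a quick sketch (the sshd jail is the common stock example, your jail names may differ):

    # list the jails fail2ban is currently running
    sudo fail2ban-client status

    # show failure counts and currently banned IPs for the SSH jail
    sudo fail2ban-client status sshd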


  • I’m sure there are several out there. But, when I was starting out, I didn’t see one and just rolled my own. The process was general enough that I’ve been able to mostly just replace the Steam app ID of the game in the Dockerfile and have it work well for other games. It doesn’t do anything fancy like automatic updating; but, it works and doesn’t need anything special.
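
    For anyone curious, the heart of those Dockerfiles is really just a steamcmd install step, roughly along these lines (the install path is a placeholder, and 896660 is the Valheim dedicated server’s app ID as an example; swap in the ID for your game):

    # download/update the dedicated server files for a given Steam app ID
    steamcmd +force_install_dir /opt/gameserver +login anonymous +app_update 896660 validate +quit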


  • I see containers as having a couple of advantages:

    1. Separation of dependencies - while not as big of an issue as it used to be, just knowing that you won’t end up with the requirements for one application conflicting with another is one less issue to worry about. Additionally, you can do anything you want to one container without having an effect on another container. You don’t get stuck wanting to reboot or revert the system, but not wanting to break a different running service.
    2. Portability - Eventually, you are going to replace the OS of that VM (at least, you should). Moving a container to a new OS is dead simple. Re-installing an application on a new OS, moving data and configs can be anywhere from easy to a pain in the arse, depending on the software.
    3. Easier fall back - Have you ever upgraded an application and had everything go to shit? In my years working as a sysadmin, I lost way too many evenings to this sort of bullshit. And while VM snapshots should make reverting easy, sometimes it just didn’t work out that way. Containers force enough separation of applications that you can do just about anything to one container and not affect others.
    4. Less dependency on a single install - Have you ever had a system just get FUBAR, and after a few hours of digging the answer seems to be, just format the drive and start over? Maybe you tried some weird application out and the uninstall wasn’t really clean. By having all that crap happen in containers, you can isolate the damage. Nuke the container, nuke the image, and the base OS is still clean.
    5. Easier version testing - Want to try out upgrading to version 2 of an application, but worried that it may not be fully baked yet or that the new configs may take a while to get right? Do it off in a separate container on a copy of the data (a rough sketch of that workflow is below this list). You can do this with VMs and snapshots; but, I find containers to be less overhead.
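
    To make number 5 concrete, the workflow with Docker looks roughly like this (the image name, tag, port and paths are all made up for the example):

    # copy the live data so the test can't touch the real files
    cp -a /srv/myapp/data /srv/myapp/data-v2-test

    # run the candidate version against the copy, on a different port
    docker run -d --name myapp-v2-test -p 8081:8080 -v /srv/myapp/data-v2-test:/data myapp:2.0

    # poke at it, read the logs, then throw it away when done
    docker logs myapp-v2-test
    docker rm -f myapp-v2-test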

    That all said, if an application does not have an official container image, the added complexity of creating and maintaining your own image can be a significant downside. One of my use cases for containers is running game servers (e.g. Valheim). There isn’t an official image; so, I had to roll my own. The effort to set this up isn’t zero and, when trying to sort out an image for a new game, it does take me a while before I can start playing. And those images need to be updated when a new version of the game releases. Technically, you can update a running container in a lot of cases; but, I usually end up rebuilding it at some point anyway.

    I’d also note that careful use of VMs and snapshots can replicate or mitigate most of the advantages I listed. I’ve done both (a decade and a half as a sysadmin). But, part of that “careful use” usually meant spinning up a new VM for each application. Putting multiple applications on the same OS install was usually asking for trouble. Eventually, one of the applications would get borked and having the flexibility to just nuke the whole install saved a lot of time and effort. Going with containers removed the need to nuke the OS along with the application to get a similar effect.

    At the end of the day, though, it’s your box; you do what you are most comfortable with and want to support. If that’s a monolithic install, then go for it. While I, or others, might find containers a better answer for us, maybe it isn’t for you.


  • My list of items I look for:

    • A docker image is available. Not some sort of make or build script which makes gods-know-what changes to my system, even if the end result is a docker image. Just have a docker image out on Dockerhub or a Dockerfile as part of the project. A docker-compose.yaml file is a nice bonus.
    • Two factor auth. I understand this is hard, but if you are actually building something you want people to seriously use, it needs to be seriously secured. Bonus points for working with my YubiKey.
    • Good authentication logging. I may be an outlier on this one, but I actually look at the audit logs for my services. Having a log of authentication activity (successes and failures) is important to me. I use fail2ban to block off IPs which get up to any fuckery, and I manually blackhole entire ASNs when it seems they are sourcing a lot of attacks. Give me timestamps (in ISO8601 format, all other formats are wrong), IP address, username, success or failure (as an independent field, not buried in a message or other string) and any client information you can (e.g. User-Agent strings). There’s a quick example of what I mean just after this list.
    • Good error logging. Look, I kinda suck, I’m gonna break stuff. When I do, it’s nice to have solid logging giving me an idea of what I broke and to provide a standardized error code to search on. It also means that, when I give up and post it as an issue to your github page, I can provide you with some useful context.
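
    To illustrate the logging points, this is the kind of thing I want to be able to do against a service’s logs (the log path and field names here are made up; the point is that each piece of information is its own field):

    # pull failed logins out of a hypothetical JSON-lines audit log
    jq -r 'select(.result == "failure") | [.timestamp, .src_ip, .username, .user_agent] | @tsv' /var/log/someapp/auth.log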

    As for that hackernews response, I’d categorically disagree with most of it.

    An app, self-contained, (essentially) a single file with minimal dependencies.

    Ya…no. Complex stuff is complex. And a lot of good stuff is complex. My main self-hosted app is NextCloud. Trying to run that as some monolithic app would be brain-dead stupid. Just for the sake of maintainability, it is going to need to be a fairly sprawling list of files and folders. And it’s going to be dependent on some sort of web server software. And that is a very good place to NOT roll your own. Good web server software is hard, secure web server software is damn near impossible. Let the large projects (Apache/Nginx) handle that bit for you.

    Not something so complex that it requires docker.

    “Requires docker” may be a bit much. But, there is a reason people like to containerize stuff, it avoids a lot of problems. And supporting whatever random setup people have just sucks. I can understand just putting a project out as a container and telling people to fuck off with their magical snowflake setup. There is a reason flatpak is gaining popularity.
    Honestly, I see docker as a way to reduce complexity in my setup. I don’t have to worry about dependencies or having the right version of some library on my OS. I don’t worry about different apps needing different versions of the same library. I don’t need to maintain different virtual python environments for different apps. The containers “just work”. Hell, I regularly dockerize dedicated game servers just for my wife and me to play on.

    Not something that requires you to install a separate database.

    Oh goodie, let’s all create our own database formats and re-learn the lessons of the '90s about how hard databases actually are! No really, fuck off with that noise. If your app needs a small database backend, maybe try SQLite. But, some things just need a real database. And as with web servers, rolling your own is usually a bad plan.

    Not something that depends on redis and other external services.

    Again, sometimes you just need to have certain functionality and there is no point re-inventing the wheel every time. Breaking those discrete things out into other microservices can make sense. Sure, this means you are now beholden to everything that other service does; but, your app will never be an island. You are always going to be using libraries that other people wrote. Just try to avoid too much sprawl. Every dependency you spin up means your users are now maintaining an extra application. And you should probably build a bit of checking into your app to ensure that those dependencies are in sync. It really sucks to upgrade a service and have it fail, only to discover that one of its dependencies needed to be upgraded manually first, and now the whole thing is corrupt and needs to be restored from backup. Yes, users should read the release notes, they never do.
    The corollary here is to be careful about setting your users up for a supply chain attack. Every dependency or external library you add is one more place for your application to be attacked. And just because the actual vulnerability is in SomeCoolLib.js, it’s still your app getting hacked. You chose that library, you’re now beholden to everything it gets wrong.

    At the end of it all, I’d say the best app to write is the one you are interested in writing. The internet is littered with lots of good intentions and interesting starts. There is a lot less software which is actually feature complete and useful. If you lose interest, because you are so busy trying to please a whole bunch of idiots on the other side of the internet, you will never actually release anything. You do you, and fuck all the haters. If what you put out is interesting and useful, us users will show up and figure out how to use it. We’ll also bitch and moan, no matter how great your app is. It’s what users do. Do listen, feedback is useful. But, also remember that opinions are like assholes: everyone has one, and most of them stink.


  • Have you considered just beige boxing a server yourself? My home server is a mini-ITX board from Asus running a Core i5, 32GB of RAM and a stack of SATA HDDs all stuffed in a smaller case. Nothing fancy, just hardware picked to fulfill my needs.

    Limiting yourself to bespoke systems means limiting yourself to what someone else wanted to build. The main downside to building it yourself is ensuring hardware compatibility with the OS/software you want to run. If you are willing to take that on, you can tailor your server to just what you want.



  • No, but you are the target of bots scanning for known exploits. The time between an exploit being announced and threat actors adding it to commodity bot kits is incredibly short these days. I work in Incident Response and seeing wp-content in the URL of an attack is nearly a daily occurrence. Sure, for whatever random software you have running on your normal PC, it’s probably less of an issue. Once you open a system up to the internet and constant scanning and attack by commodity malware, falling out of date quickly opens your system to exploit.


  • Short answer: yes, you can self-host on any computer connected to your network.

    Longer answer:
    You can, but this is probably not the best way to go about things. The first thing to consider is what you are actually hosting. If you are talking about a website, this means that you are running some sort of web server software 24x7 on your main PC. This will be eating up resources (CPU cycles, RAM) which you may want to dedicate to other processes (e.g. gaming). Also, anything you do on that PC may have a negative impact on the server software you are hosting. Reboot and your server software is now offline. Install something new and you might have a conflict bringing your server software down. Lastly, if your website ever gets hacked, then your main PC also just got hacked, and your life may really suck.

    This is why you often see things like Raspberry Pis being used for self-hosting. It moves the server software onto separate hardware which can be updated/maintained outside a PC which is used for other purposes. And it gives any attacker on that box one more step to cross before owning your main PC. Granted, it’s a small step, but the goal there is to slow them down as much as possible.

    That said, the process is generally straightforward. Though, there will be some variations depending on what you are hosting (e.g. webserver, nextcloud, plex, etc.). And, your ISP can throw a massive monkey wrench in the whole thing if they use CG-NAT. I would also warn you that, once you have a presence on the internet, you will need to consider the security implications for whatever it is you are hosting. With the most important security recommendation being “install your updates”. And not just OS updates, but keeping all software up to date. And, if you host WordPress, you need to stay on top of plugin and theme updates as well. In short, if it’s running on your system, it needs to stay up to date.

    The process generally looks something like:

    • Install your updates.
    • Install the server software.
    • Apply updates to the software (the installer may be an outdated version).
    • Apply security hardening based on guides from the software vendor.
    • Configure your firewall to forward the required ports (and only the required ports) from the WAN side to the server.
    • Figure out your external IP address.
    • Try accessing the service from the outside (a couple of example commands for these last two steps are sketched after this list).
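
    For those last two steps, a couple of commands along these lines usually do the trick (the address and port below are placeholders for your own):

    # ask an external service what your public IP looks like
    curl https://ifconfig.me

    # then, from a connection outside your network (e.g. a phone on mobile data),
    # check that the forwarded port actually answers
    curl -v http://203.0.113.10:8080/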

    Optionally, you may want to consider using a Dynamic DNS service (DDNS) (e.g. noip.com) to make reaching your server easier. But, this is technically optional, if you’re willing to just use an IP address and manually update things on the fly.

    Good luck, and in case I didn’t mention it, install your updates.


  • Yup, OP doesn’t seem to understand that the Fediverse is still in an “early adopter” stage. For tech stuff, that’s a demographic which is usually dominated by male tech nerds. I suspect the age range is more diverse than the OP claims. Though, I imagine it centers somewhere near the 25-35 range, as those will be the people with the drive and time to engage in online mental masturbation (read: argue with people on the internet).

    Also, the math will make the average age skew older. Each older person has a larger effect on moving the average age up. For example, consider 3 people with ages 45, 20, and 18: they have an average age of just a bit over 27. There is a lot of room for us old farts to drag the average up, while there is only a small range for the kids to drag it down. There might be the rare 9-10 year old who has the wherewithal to make an account and comment. But, I don’t suspect the user base will get all that much younger. Whereas, there are probably lots of late-30s and 40 year old users and some even older. Take those same three ages listed above and add in a precocious 10 year old and a 60 year old grey beard. The average is now just over 30. The older person had a larger effect on the average than the kid.

    Also, age doesn’t matter all that much. The important thing is that the early adopters are here and making content. Ideally, this will build out the site and others may follow (or not, not every trend works out). But hopefully, this will end up like the Great Digg Migration before it, which fed the Reddit beast. And we’ll have something new, different and maybe actually better this time.