Off-and-on trying out an account over at @tal@oleo.cafe due to scraping bots bogging down lemmy.today to the point of near-unusability.

  • 1 Post
  • 201 Comments
Joined 2 years ago
Cake day: October 4th, 2023

  • If I’m traveling or I wipe my device or get a new one, I would have to add the new key to many servers as authorized keys,

    So, I don’t want to get into a huge argument over the best way to deal with things, since everyone has their own use cases. But if that’s your only concern, you have a list of hosts that you want to put the key on, and you still have a working key for another device, then that shouldn’t be terribly difficult. Generate your new keypair for your new device. Then, on a Linux machine, something like:

    $ cat username-host-pairs.txt
    me@host1
    me@host2
    me@host3
    $ cat username-host-pairs.txt | xargs -n1 ssh-copy-id -i new-device-key-file-id_ed25519.pub
    

    That should use your other device’s private key to authenticate to the servers in question and copy the new device’s pubkey to your account on each host. It won’t need password authentication enabled.
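
    To sanity-check afterwards, you could confirm that one of the hosts accepts the new key (the filename and host here match the hypothetical ones above):

    # use only the new key for this test, ignoring the agent and default identities
    $ ssh -i new-device-key-file-id_ed25519 -o IdentitiesOnly=yes me@host1 true && echo "new key works"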



  • So an internet

    The highest data rate that LoRa looks to support in North America is 21,900 bits per second, so you’re talking about roughly 22 kbps, or about 2.7 kB/s in a best-case scenario. That’s less than half of what a 56 kbps analog telephone modem could achieve.
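
    Back-of-the-envelope, just to put units on that (best case, ignoring protocol overhead):

    # convert the peak LoRa data rate from bits/s to bytes/s
    $ echo "$(( 21900 / 8 )) bytes/s"
    2737 bytes/s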

    It’s going to be pretty bandwidth-constrained, which limits how much traffic you can usefully route across it.

    I think that the idea of a “public access, zero-admin mesh Internet over the air” isn’t totally crazy, but that it’d probably need to use something like laser links and hardware that can identify and auto-align to other links.




  • Having a limited attack surface will reduce exposure.

    If, say, the only thing that you’re exposing is a Wireguard VPN, then unless there’s a misconfiguration or a remotely-exploitable bug in Wireguard, you’re fine as far as random people running exploit scanners go.
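
    If you want to see what you’re actually exposing, something like the following helps (the port number is just Wireguard’s default, so adjust for your config):

    # list everything listening for inbound TCP/UDP connections
    $ ss -tulnp
    # for a Wireguard-only setup, the only Internet-facing entry should be a
    # single UDP port (51820 by default)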

    I’m not too worried about stuff like (vanilla) Apache, OpenSSH, or Wireguard, the “big” stuff that has a lot of eyes on it. I’d be a lot more dubious about niche stuff that some guy just threw together.

    To put this in perspective, you gotta remember that most software that people run isn’t run in a sandbox. It can phone home. Games on Steam. If your Web browser has bugs, it’s got a lot of sites that might attack it. Plugins for that Web browser. Some guy’s open-source project. Those are all potential vectors too. Sure, some random script kiddie running an exploit scanner is a potential risk, but my bet is that if you look at the actual number of compromises via that route, it’s probably rather lower than via plain old malware.

    It’s good to be aware of what you’re doing when you expose something to the Internet, but also to keep perspective. A lot of people out there run services exposed to the Internet every day; they need to do so to make things work.




  • https://stackoverflow.com/questions/30869297/difference-between-memfree-and-memavailable

    Rik van Riel’s comments when adding MemAvailable to /proc/meminfo:

    /proc/meminfo: MemAvailable: provide estimated available memory

    Many load balancing and workload placing programs check /proc/meminfo to estimate how much free memory is available. They generally do this by adding up “free” and “cached”, which was fine ten years ago, but is pretty much guaranteed to be wrong today.

    It is wrong because Cached includes memory that is not freeable as page cache, for example shared memory segments, tmpfs, and ramfs, and it does not include reclaimable slab memory, which can take up a large fraction of system memory on mostly idle systems with lots of files.

    Currently, the amount of memory that is available for a new workload, without pushing the system into swap, can be estimated from MemFree, Active(file), Inactive(file), and SReclaimable, as well as the “low” watermarks from /proc/zoneinfo.

    However, this may change in the future, and user space really should not be expected to know kernel internals to come up with an estimate for the amount of free memory.

    It is more convenient to provide such an estimate in /proc/meminfo. If things change in the future, we only have to change it in one place.

    Looking at the htop source:

    https://github.com/htop-dev/htop/blob/main/MemoryMeter.c

       /* we actually want to show "used + shared + compressed" */
       double used = this->values[MEMORY_METER_USED];
       if (isPositive(this->values[MEMORY_METER_SHARED]))
          used += this->values[MEMORY_METER_SHARED];
       if (isPositive(this->values[MEMORY_METER_COMPRESSED]))
          used += this->values[MEMORY_METER_COMPRESSED];
    
       written = Meter_humanUnit(buffer, used, size);
    

    It’s adding used, shared, and compressed memory to get the amount actually tied up, but it’s disregarding cached memory, which, based on the above comment, is problematic, since some of that cache may not actually be available for use.

    free and top, on the other hand, use the kernel’s MemAvailable directly. From procps-ng’s free.c:

    https://gitlab.com/procps-ng/procps/-/blob/master/src/free.c

    	printf(" %11s", scale_size(MEMINFO_GET(mem_info, MEMINFO_MEM_AVAILABLE, ul_int), args.exponent, flags & FREE_SI, flags & FREE_HUMANREADABLE));
    

    In short: you probably want to trust /proc/meminfo’s MemAvailable (which is what free and top will show), and htop is probably giving a misleadingly low number.
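
    If you want to compare on your own machine, something like:

    # the kernel's own estimate, plus the fields htop folds into "used"
    $ grep -E 'MemTotal|MemFree|MemAvailable|^Cached|SReclaimable|Shmem' /proc/meminfo
    # procps's summary; its "available" column comes straight from MemAvailable
    $ free -h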



  • tal@lemmy.today to Selfhosted@lemmy.world · Where to start with backups?

    If databases are involved they usually offer some method of dumping all data to some kind of text file. Usually relying on their binary data is not recommended.

    It’s not so much text versus binary. It’s that a normal backup program that just treats a live database file as an ordinary file to back up is liable to catch the DBMS writing to the database mid-backup, resulting in a backed-up file that’s a mix of old and new versions and may be corrupt.

    Either:

    1. The DBMS needs to have a way to create a dump — possibly triggered by the backup software, if it’s aware of the DBMS — that won’t change during the backup (see the sketch after this list)

    or:

    2. One needs to have filesystem-level support to grab an atomic snapshot (e.g. one takes an atomic snapshot using something like btrfs and then backs up the snapshot rather than the live filesystem). This avoids the issue of the database file changing while the backup runs.
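
    For #1, with e.g. PostgreSQL, that’s roughly the following (the database name and output paths are just placeholders):

    # dump a consistent snapshot of one database to a file the backup job can pick up
    $ pg_dump mydb > /var/backups/mydb.sql
    # or everything in the cluster at once
    $ pg_dumpall > /var/backups/all-databases.sql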

    In general, if this is a concern, I’d tend to favor #2 as an option, because it’s an all-in-one solution that deals with all of the problems of files changing while being backed up: DBMSes are just a particularly thorny example of that.
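
    A rough sketch of #2 on btrfs, assuming the data lives on a subvolume mounted at /srv and the backup target is /mnt/backup (both paths made up):

    # take a read-only, atomic snapshot of the subvolume holding the data
    $ btrfs subvolume snapshot -r /srv /srv/.backup-snap
    # back up the frozen snapshot instead of the live tree
    $ rsync -a /srv/.backup-snap/ /mnt/backup/srv/
    # clean up once the backup has finished
    $ btrfs subvolume delete /srv/.backup-snap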

    Full disclosure: I mostly use ext4 myself, rather than btrfs. But I also don’t run live DBMSes.

    EDIT: Plus, #2 also provides consistency across different files on the filesystem, though that’s usually less critical. Like, you won’t run into a situation where software on your computer updates File A, then does a sync(), then updates File B, but your backup program grabs the new version of File B and the old version of File A. Absent help from the filesystem, your backup program won’t know where write barriers spanning different files are happening.

    In practice, that’s not usually a huge issue, since fewer software packages are going to be affected by this than by write ordering internal to a single file, but under Unix filesystem semantics a program is permitted to expect that write order to persist and to kerplode if it doesn’t…and a traditional backup won’t preserve it the way that a backup with help from the filesystem can.



  • I think that the problem will be if software comes out that doesn’t target home PCs. That’s not impossible. I mean, that happens today with Web services. Closed-weight AI models aren’t going to be released to run on your home computer. I don’t use Office 365, but I understand that at least some of that is a cloud service.

    Like, say the developer of Video Game X says “I don’t want to target a ton of different pieces of hardware. I want to tune for a single one. I don’t want to target multiple OSes. I’m tired of people pirating my software. I can reduce cheating. I’m just going to release for a single cloud platform.”

    Nobody is going to take your hardware away. And you can probably keep running Linux or whatever. But…not all the new software you want to use may be something that you can run locally, if it isn’t released for your platform. Maybe you’ll use some kind of thin-client software — think telnet, ssh, RDP, VNC, etc for past iterations of this — to use that software remotely on your Thinkpad. But…can’t run it yourself.

    If it happens, I think that that’s what you’d see. More and more software would just be available only to run remotely. Phones and PCs would still exist, but they’d increasingly run a thin client, not run software locally. Same way a lot of software migrated to web services that we use with a Web browser, but with a protocol and software more aimed at low-latency, high-bandwidth use. Nobody would ban existing local software, but a lot of it would stagnate. A lot of new and exciting stuff would only be available as an online service. More and more people would buy computers that are only really suitable for use as a thin client — fewer resources, closer to a smartphone than what we conventionally think of as a computer.

    EDIT: I’d add that this is basically the scenario that the AGPL is aimed at dealing with. The concern was that people would just run open-source software as a service. They could build on that base, make their own improvements. They’d never release binaries to end users, so they wouldn’t hit the traditional GPL’s obligation to release source to anyone who gets the binary. The AGPL extends that obligation to people who merely use the software over a network.


  • I will say that, realistically, in terms purely of physical distance, a lot of the world’s population is in a city and probably isn’t too far from a datacenter.

    https://calculatorshub.net/computing/fiber-latency-calculator/

    It’s about five microseconds of latency per kilometer down fiber optics, so ten microseconds per kilometer for a round trip.
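
    So, for a hypothetical 40 km run to the nearest datacenter:

    # 5 microseconds/km one way, doubled for the round trip
    $ echo "$(( 40 * 5 * 2 )) microseconds round trip"
    400 microseconds round trip
    # i.e. ~0.4 ms of propagation delay, before any switching/queueing overhead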

    I think a larger issue might be bandwidth for some applications. Like, if you want to unicast uncompressed video to every computer user, say, you’re going to need an ungodly amount of bandwidth.

    DisplayPort looks like it’s currently up to 80Gb/sec. Okay, not everyone is currently saturating that, but if you want comparable capability, that’s what you’re going to have to be moving from a datacenter to every user. For video alone. And that’s assuming that they don’t have multiple monitors or something.

    I can believe that it is cheaper to have many computers in a datacenter. I am not sold that any gains will more than offset the cost of the staggering fiber rollout that this would require.

    EDIT: There are situations where it is completely reasonable to use (relatively) thin clients. That’s, well, what a lot of the Web is — browser thin clients accessing software running on remote computers. I’m typing this comment into Eternity before it gets sent to a Lemmy instance on a server in Oregon, much further away than the closest datacenter to me. That works fine.

    But “do a lot of stuff in a browser” isn’t the same thing as “eliminate the PC entirely”.





  • I’m not familiar with FreshRSS, but assuming that there’s something in the protocol that lets a reader push up a “read” bit on a per-article basis — this page references a “GReader” API — I’d assume that that’d depend on the client, not the server.

    If the client attempts an update and fails, and that causes it to not retry again later, then I imagine that it wouldn’t work. If it does retry until it sets the bit, I’d imagine that it does work. The FreshRSS server can’t really be a factor, because it won’t know whether the client has tried to talk to it while it’s offline.

    EDIT: Some of the clients in the table on the page I linked to say that they “work offline”, so I assume that the developers at least have some level of disconnected operation in mind.

    The RSS readers I’ve always used are strictly pull. They don’t set bits on the server, and any “read” flag lives only on the client.


  • tal@lemmy.today to Selfhosted@lemmy.world · LVM question

    Secondly, is there a benefit to creating an LVM volume with a btrfs filesystem vs just letting btrfs handle it?

    Like, btrfs on top of LVM versus plain btrfs? Well, the former gives you access to LVM features. If you want to use lvmcache or something, you’d want btrfs on LVM.
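
    A rough sketch of what that might look like, with made-up device names and sizes (/dev/sdb as a big, slow disk and /dev/sdc as a small SSD):

    # put both devices under LVM
    $ pvcreate /dev/sdb /dev/sdc
    $ vgcreate vg0 /dev/sdb /dev/sdc
    # main logical volume on the slow disk, cache pool on the SSD
    $ lvcreate -n data -l 100%PVS vg0 /dev/sdb
    $ lvcreate --type cache-pool -n cpool -L 50G vg0 /dev/sdc
    # attach the cache pool to the data LV
    $ lvconvert --type cache --cachepool vg0/cpool vg0/data
    # btrfs goes on top, same as it would on a bare partition
    $ mkfs.btrfs /dev/vg0/data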