Just a guy shilling for gun ownership, tech privacy, and trans rights.

I’m open for chats on mastodon https://hachyderm.io/

my blog: thinkstoomuch.net

My email: nags@thinkstoomuch.net

Always looking for penpals!

  • 7 Posts
  • 55 Comments
Joined 2 years ago
Cake day: December 21st, 2023

  • Ollama and everything that runs on it work fine; it's the firewall rules and opening it up to my network that are the issue.

    I cannot get ufw, iptables, or anything like that working on it. So I usually just SSH into the PC and do a CLI-only interaction, which is mostly fine.

    I want to use OpenWebUI so I can feed it notes and books as context, but I need the API, which isn't open on my network.
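For what it's worth, the usual two-part fix here is to make Ollama bind to all interfaces instead of localhost only, then allow its default port (11434) through the firewall. This is a sketch, assuming a systemd-based distro and ufw; the subnet is an example and needs to match your LAN:

```
# 1) Make Ollama listen on all interfaces, not just 127.0.0.1
sudo systemctl edit ollama
# In the override file that opens, add:
#   [Service]
#   Environment="OLLAMA_HOST=0.0.0.0"
sudo systemctl restart ollama

# 2) Open the API port, but only to the local subnet (example subnet)
sudo ufw allow from 192.168.1.0/24 to any port 11434 proto tcp
```

After that, OpenWebUI on another box can point at `http://<server-ip>:11434` as its Ollama endpoint.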



  • nagaram@startrek.website (OP) to Selfhosted@lemmy.world, re: "1U mini PC for AI?" (edited, 1 day ago)

    Ollama + Gemma/DeepSeek is a great start. I have only run AI on my AMD 6600 XT, and that wasn't great; everything I know says AMD is fine for gaming-side AI tasks these days, but not really for LLM or gen-AI tasks.

    An RTX 3060 12 GB is the easiest and best self-hosted option in my opinion. New for <$300, and used for even less. However, I was running a GeForce 1660 Ti for a while, and that's <$100.
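As a rough sketch of getting started on a 12 GB card (model tags are examples that existed at time of writing; check the Ollama library for current ones):

```
ollama pull gemma2:9b        # a 9B model at default quantization fits comfortably in 12 GB
ollama pull deepseek-r1:8b   # example DeepSeek distill tag
ollama run gemma2:9b "Write a bash script that tars and dates a directory."
```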




  • I do already have a NAS. It’s in another box in my office.

    I was considering replacing the Pis with a JBOD and passing that through to one of my boxes via USB and virtualizing something. I compromised by putting 2 TB SATA SSDs in each box to use for database stuff and then backing that up to the spinning rust in the other room.

    How do I do that? Good question. I take suggestions.
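One common suggestion for the "back the SSDs up to the NAS" part is a nightly rsync over SSH, driven by cron or a systemd timer. A minimal sketch, assuming the NAS accepts SSH and the paths/hostname here are made up for illustration:

```
# Push the local database SSD to the NAS nightly (example paths and host)
rsync -aH --delete /mnt/ssd/ backup@nas.local:/volume1/backups/node1/

# Example crontab entry: run at 02:30 every night
# 30 2 * * * rsync -aH --delete /mnt/ssd/ backup@nas.local:/volume1/backups/node1/
```

`-a` preserves permissions and timestamps, `-H` keeps hard links, and `--delete` makes the NAS copy mirror the SSD, so deletions propagate too.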


  • With an RTX 3060 12 GB, I have been perfectly happy with the quality and speed of the responses. It's much slower than my 5060 Ti, which I think is the sweet spot for text-based LLM tasks. A larger context window, provided by more VRAM or a web-based AI, is cool and useful, but I haven't found the need for it yet in my use case.

    As you may have guessed, I can't fit a 3060 in this rack. That's in a different server that houses my NAS. I have done AI on my 2018 Epyc server CPU, and it's just not usable. Even with 109 GB of RAM, not usable. Even clustered, I wouldn't try running anything on these machines; they are for Docker containers and Minecraft servers. Jeff Geerling probably has a video on trying to run an AI on a bunch of Raspberry Pis. I just saw his video using Ryzen AI Strix boards, and that was ass compared to my 3060.

    But for my use case, I am just asking AI to generate simple scripts based on manuals I feed it, or some sort of writing task. I either get it to take my notes on a topic and make an outline that makes sense, which I then fill in, or I feed it finished writing and ask for grammar or tone fixes. That's fucking it, and it boggles my mind that anyone is doing anything more intensive than that. I am not training anything, and 12 GB of VRAM is plenty if I want to feed it like 10-100 pages of context. Would it be better with a 4090? Probably, but for my uses I haven't noticed a difference in quality between my local LLM and the web-based stuff.
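The "12 GB is plenty" intuition can be sanity-checked with back-of-the-envelope math: VRAM use is roughly model weights plus the KV cache for the context window. All the numbers below are illustrative assumptions (a 7B model at ~4-bit quantization, a hypothetical 32-layer/4096-hidden architecture), not measurements of any specific model:

```python
# Rough VRAM estimate for a local LLM: weights + KV cache.
# Every constant here is an assumption for illustration.

def model_vram_gb(params_b: float, bytes_per_param: float) -> float:
    """Weight memory in GB for params_b billion parameters."""
    return params_b * bytes_per_param  # 1e9 params * bytes, expressed in GB

def kv_cache_gb(context_tokens: int, layers: int, hidden: int,
                bytes_per_val: int = 2) -> float:
    """KV cache: two tensors (K and V) per layer, `hidden` values per token."""
    return 2 * layers * hidden * context_tokens * bytes_per_val / 1e9

# A 7B model at 4-bit quantization (~0.55 bytes/param including overhead):
weights = model_vram_gb(7, 0.55)
# An 8k-token context, assuming 32 layers, 4096 hidden dim, fp16 cache:
cache = kv_cache_gb(8192, 32, 4096, 2)
print(f"weights ~ {weights:.1f} GB, KV cache ~ {cache:.1f} GB, "
      f"total ~ {weights + cache:.1f} GB")
```

Under those assumptions the total lands around 8 GB, which is why a 7B-9B model with a generous context sits comfortably on a 12 GB card while a 70B model does not.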





  • Honestly, not a lot of thought went into the rack choice. I wanted something smaller and more powerful than the several OptiPlexes I had.

    I also decided I didn't want storage to happen here anymore, because I am stupid and only knew how to pass through disks for TrueNAS. So I had four TrueNAS servers on my network, and I hated it.

    This was just what I wanted at a price I was good with, at like $120. There's a 3D-printable version, but I wasn't interested in that. I do want to 3D print racks, though, and I want to make my own custom ones for the Pis to save space.

    But this set up is way cheaper if you have a printer and some patience.


  • Not much. As much as I like LLMs, I don't trust them for more than rubber-duck duty.

    Eventually I want to have a Copilot-at-Home setup where I can feed it a notes database and whatever manuals and books I've read, so it can draw from that when I ask it questions.

    The problem is that my best GPU is my gaming GPU, a 5060 Ti, and it's in a Bazzite gaming PC, so it's hard to get the AI out of it because of Bazzite's "No, I won't let you break your computer" philosophy, which is why I chose it. And my second-best GPU is a 3060 12 GB, which is really good, but if I made a dedicated AI server, I'd want it to be better than my current server.
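The "feed it a notes database" idea is basically retrieval-augmented generation: find the most relevant note for a question, then prepend it to the prompt. Here is a toy, stdlib-only sketch of the retrieval step using bag-of-words cosine similarity; a real setup (OpenWebUI's document feature, or a vector database with embeddings) does the same thing with much better similarity scoring. The notes and question are made-up examples:

```python
# Toy retrieval step for a "Copilot at Home": pick the note most
# similar to the question, then build a prompt around it.
import math
import re
from collections import Counter

def tokenize(text: str) -> Counter:
    """Lowercase bag-of-words token counts."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def best_note(question: str, notes: list[str]) -> str:
    """Return the note with the highest similarity to the question."""
    q = tokenize(question)
    return max(notes, key=lambda n: cosine(q, tokenize(n)))

notes = [
    "ufw allow 11434 opens the Ollama API port on the firewall.",
    "The NAS exports an NFS share for nightly backups.",
]
question = "How do I open the Ollama API port?"
context = best_note(question, notes)
prompt = f"Answer using this note:\n{context}\n\nQuestion: {question}"
print(prompt)
```

The resulting `prompt` string is what would get sent to the local model, so its answer is grounded in your notes rather than just its training data.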








  • The rat's nest is behind it.

    I need to redo some of the wiring.

    I have all four power cables braided and zip-tied together with the single data cable, so it's nice to pull out and put back into the entertainment center.

    The only problem is I only had four 1-foot Ethernet cables and three 7-foot cables. So I used the 1-footers for the Pis, and the 7-footers are bundled up as best I could and neatly hidden.

    I'm waiting on some color-coordinated 0.5-foot cables from Cables and Kits, and then I will swap the switch and patch panel. I want the Pis to have that one cable and that's it, but I also want all the patch panel ports to work.