@Diabolo96

Diabolo96@lemmy.dbzer0.com · 27 days ago

Looks like an email, not a link.

Diabolo96@lemmy.dbzer0.com · edit-2 6 months ago

Too…much…info…my head…HURT.

Cool stuff tho

Diabolo96@lemmy.dbzer0.com · edit-2 8 months ago

I haven’t checked progress in TTS tech for months (probably several revolutionary evolutions have happened since then), but try Coqui xttsv2.

Diabolo96@lemmy.dbzer0.com · edit-2 1 year ago

If you guys like hiking and stuff, there’s this cool open source app called Trail Sense on F-Droid, and it’s just so feature-packed…

I don’t hike, so I only use it for its pedometer and for a hypothetical situation where “I might get really lost”, but the amount of features it has for hiking and survival is crazy, so I think it deserves to be better known.

Diabolo96@lemmy.dbzer0.com · 1 year ago

It’s beans worthy quality content.

Diabolo96@lemmy.dbzer0.com · 1 year ago

Forget about personalisation. That UX work is just 👌👌💯✨

But I’d definitely like to know how it works.

Diabolo96@lemmy.dbzer0.com · edit-2 1 year ago

I had a hunch that writing the actual upload/download speed rather than Mbps was probably wrong. My bad, my internet provider lingo is rusty.

Diabolo96@lemmy.dbzer0.com · edit-2 1 year ago

I don’t have a Jellyfin server, but 1 MB/s (8 Mbps) for each person watching 1080p (3.6 GB per hour of content for each file) seems reasonable. ~3 MB/s (24 Mbps) upload and as much download should work.
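Here’s the arithmetic behind those numbers as a quick sketch. The 1 MB/s per 1080p stream is my rough estimate from above, not a measured Jellyfin figure, and the 3 viewers is just an assumption to match the ~3 MB/s total; `stream_budget` is a made-up helper name.

```python
def stream_budget(viewers, mb_per_s_per_stream=1.0):
    """Rough bandwidth estimate for simultaneous streams.

    mb_per_s_per_stream is an assumed average bitrate in megabytes/second.
    """
    total_mb_s = viewers * mb_per_s_per_stream
    return {
        "upload_MBps": total_mb_s,            # megabytes per second
        "upload_Mbps": total_mb_s * 8,        # megabits per second (8 bits per byte)
        "GB_per_hour_per_stream": mb_per_s_per_stream * 3600 / 1000,
    }

# Assumption: 3 people watching at the same time
budget = stream_budget(viewers=3)
print(budget)
```

Swap in your own viewer count and bitrate; higher-bitrate files will need proportionally more upload.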

Diabolo96@lemmy.dbzer0.com · edit-2 1 year ago

No. Quantization makes it go faster. Not blazing fast, but decent.

Diabolo96@lemmy.dbzer0.com · 1 year ago

Completely forgot to tell you to only use quantized models. Your PC can run 4-bit quantized versions of the models I mentioned. That’s the key to running LLMs on consumer-level hardware. You can read up later on the different quantizations and toy with other ones like Q5_K_M and such.

Just read that phi-3 got released, and apparently it’s a 4B that reaches GPT-3.5 level. Follow the news and wait for it to be added to ollama/llama.cpp.
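For anyone wondering what “4-bit quantized” means in practice, here’s a toy sketch. It’s plain absmax rounding on one small block of weights, not the actual Q4/Q5_K_M schemes (those are more elaborate); the function names and sample weights are made up for illustration.

```python
def quantize_block(weights, bits=4):
    """Map a block of floats to small signed integers plus one shared scale."""
    qmax = 2 ** (bits - 1) - 1                      # 7 for 4-bit
    scale = max(abs(w) for w in weights) / qmax or 1.0  # avoid a zero scale
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def dequantize_block(q, scale):
    """Recover approximate floats from the integers and the scale."""
    return [v * scale for v in q]

# Hypothetical weights, just to show the round trip
weights = [0.7, -0.35, 0.1, 0.0]
q, scale = quantize_block(weights)
restored = dequantize_block(q, scale)
print(q)
print(restored)
```

Each weight now needs only 4 bits (plus one shared scale per block), at the cost of a small rounding error. That trade is why the quantized models fit on weaker machines.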

Thank you so much for taking the time to help me with that! I’m very new to the whole LLM things, and sorta figuring it out as I go

I became fascinated with LLMs after the first AI boom, but all this knowledge is basically useless where I live, so I might as well make it useful by teaching people what I know.

Diabolo96@lemmy.dbzer0.com · edit-2 1 year ago

The key is quantized models. A full model wouldn’t fit but a 4bit 8b llama3 would fit.
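A back-of-the-envelope sketch of why that works. It only counts the weights; real quantized files come out somewhat bigger (scales, a few layers kept at higher precision), and you still need extra memory for the context, so treat these as lower bounds. `weights_gb` is a made-up helper, and 16-bit for the “full” model is my assumption.

```python
def weights_gb(params_billion, bits_per_weight):
    """Approximate size of the weights alone, in gigabytes."""
    return params_billion * bits_per_weight / 8   # bits -> bytes, billions -> GB

full = weights_gb(8, 16)    # assumed 16-bit "full" 8b model
quant = weights_gb(8, 4)    # 4-bit quantized 8b model
print(f"16-bit: ~{full:.0f} GB, 4-bit: ~{quant:.0f} GB")
```

So the weights drop to about a quarter of the 16-bit size, which is the difference between not fitting and fitting on a typical consumer machine.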

Diabolo96@lemmy.dbzer0.com · edit-2 1 year ago

Yeah, it’s not a potato but not that powerful either. Nonetheless, it should run 7b/8b/9b and maybe 13b models easily.

running them in Python with Huggingface’s Transformers library (from local models

That’s your problem right here. Python is great for making llms but is horrible at running them. With a computer as weak as yours, every bit of performance counts.

Just try ollama or llama.cpp. Their GitHub pages are also a goldmine for other projects you could try.

Llama.cpp can partially run the model on the GPU for way faster inference.

Piper is a pretty decent, very lightweight TTS engine that can run directly on your CPU if you want to add TTS capabilities to your setup.

Good luck and happy tinkering!

Diabolo96@lemmy.dbzer0.com · 1 year ago

Specs? Try mistral with llama.cpp.

Diabolo96@lemmy.dbzer0.com · 1 year ago

It shouldn’t happen for an 8b model. Even on CPU, it’s supposed to be decently fast. There’s definitely something wrong here.

Diabolo96@lemmy.dbzer0.com · 1 year ago

Sadly, I can’t really help you much. I have a potato PC and the biggest model I ran on it was Microsoft phi-2 using the candle framework. I used to tinker with Llama.cpp on colab, but it seems they don’t handle llama3 yet. ollama says it does, but I’ve never tried it before. As for the speed, it’s kinda expected for a 70b model to be really slow on the CPU. How slow is too slow? I don’t really know…

You can always try the 8b model. People say it’s really great and that it even replaced the 70b models they’ve been using.

Diabolo96@lemmy.dbzer0.com · edit-2 1 year ago

Run 70b llama3 on one and have a 100% local, GPT-4 level home assistant. Hook it up with coqui.ai xttsv2 for mind-boggling natural language speech (100% local too) that can imitate anyone’s voice. Now you’ve got yourself Jarvis from Iron Man.

Edit: I thought they were some kind of beast machines with 192 GB of RAM and stuff. They’re just regular middle-to-low tier PCs.

Diabolo96@lemmy.dbzer0.com · edit-2 1 year ago

Oh, be assured that Threads will one day defederate and build a wall so you can’t access their content anymore. The Fediverse needs to have a critical mass of users to survive when that happens, but if the features Threads offers are too compelling and the majority of new accounts are made there, then the Fediverse is screwed.

Diabolo96@lemmy.dbzer0.com · 1 year ago

It’s far worse. They’re making improvements only on their side. The protocol everyone uses will lack the features their protocol offers. In other words, their side of the garden is now greener than ours, and one day, their side will be so majestic and beautiful compared to ours that almost nobody will want to visit ours anymore, and like a flame without fuel, the Fediverse will Extinguish on its own.

Diabolo96@lemmy.dbzer0.com · edit-2 1 year ago

Embrace Extend Extinguish

       ^

   we are here

Diabolo96@lemmy.dbzer0.com · 1 year ago

Thanks for the info! I guess we’ll just have to be patient.