John Colagioia

John Colagioia@lemmy.sdf.org · 4 months ago

I’ve been using different versions of SearX for a long while (sometimes on my server, sometimes through a provider like Disroot) as my standard search engine, since I’ve never had great luck with the big names. It’s decent, but between upstream provider quota limits and the fact that it relies on corporate search APIs at all, sometimes the quality craters.

Since I haven’t had the energy to run YaCy on my own, and public instances tend not to last long, I don’t have nearly as much experience with it. When I have gotten to try it out, the search itself looked great, but it generally didn’t have as broad or current an index. Long-term, though, it (and its protocol) is probably going to be the way to go, if only because a company can’t randomly tank it like they can with the meta-search systems or their own interfaces.

Looking at Presearch for the first time now, the search results look almost surprisingly good if poorly sorted, but the fact that I now know orders of magnitude more about their finances and their cryptocurrency token than what and how the thing actually searches makes me worry a bit about its future.

John Colagioia@lemmy.sdf.org · 6 months ago

I believe that YouTube supports RSS. I haven’t used it in years, but gPodder allowed subscribing to channels.

Ah, yeah. From this post:

Go to the YouTube channel page.
Click more for the About box.
Scroll down to click Share channel. Choose Copy channel ID.
Get the feed from https://www.youtube.com/feeds/videos.xml?channel_id= plus that channel ID from the previous step.
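As a rough sketch of those steps in Python: the base URL comes from the steps above, while the function names and the assumption that the feed is Atom (entries carrying a title and a link with an href) are mine, so treat this as illustrative rather than definitive.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Base URL taken from the steps above.
FEED_BASE = "https://www.youtube.com/feeds/videos.xml"
# Assumption: the feed is Atom, so its elements live in this namespace.
ATOM = "{http://www.w3.org/2005/Atom}"


def feed_url(channel_id: str) -> str:
    """Build the feed URL from a copied channel ID (hypothetical helper)."""
    return f"{FEED_BASE}?{urlencode({'channel_id': channel_id.strip()})}"


def list_entries(feed_xml: str) -> list[tuple[str, str]]:
    """Return (title, link) pairs for each entry in an Atom feed document."""
    root = ET.fromstring(feed_xml)
    entries = []
    for entry in root.iter(f"{ATOM}entry"):
        title = entry.findtext(f"{ATOM}title", default="")
        link = entry.find(f"{ATOM}link")
        href = link.get("href", "") if link is not None else ""
        entries.append((title, href))
    return entries
```

Fetching the text at `feed_url(...)` (with `urllib.request.urlopen`, say) and passing it to `list_entries` should give the list of videos that a podcast client or downloader would then need to grab.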

From there, something (like a podcast client) needs to grab the video.

Otherwise, I’ve been using Tartube to download to my media server, which is not great but fine, except for needing to delete the lock file when it (or the computer) crashes, and the fact that the media server hasn’t the foggiest idea of how to organize the “episodes.”

John Colagioia@lemmy.sdf.org · 10 months ago

The Indie Web website up there actually has protocols to do most of what people do for social media, in exactly that structure. It’s enough of a pain to set up that I don’t see it becoming normal, but the amount that I’ve set up for my website at least works…

John Colagioia@lemmy.sdf.org · edited 2 years ago

In addition to YaCy and the varieties of SearX (both of which perform better for me than any of the commercial search engines), it’s not even out of the question to do this yourself, if you’re willing to start with the most recent Common Crawl dump and do some spidering in between releases. I don’t recommend it, unless you want to learn for yourself why search engines often give such miserable results, but it’s possible.

However, that’s the issue here. Can you self-host a search engine? Sure, if you want to maintain the storage to back it. That depends on how deep your pockets go…

John Colagioia@lemmy.sdf.org · 2 years ago

Probably, though I don’t know their architecture well enough to say. The discussion that I saw referred specifically to PDF.js, which I believe is what the browsers use, though.

John Colagioia@lemmy.sdf.org · 2 years ago

It’s not as clean a solution as they’d like it to be, but for another option, Jellyfin hosts media including books. When I say “not as clean,” I mean that you can stream video and music from the server, but it has you download books to read on another device. Last I heard, they were looking to integrate at least a PDF viewer into the interface, though.

John Colagioia@lemmy.sdf.org · 2 years ago

My half-solution to this has always been to refer to where I’m working in my notes, like a file, method name, and maybe control structure if warranted. I’ve never needed to take that final step (hence half-solution), but this carries about enough information that someone could hack together a quick program to merge the notes and code in a reasonable way.

While (as I say) I’ve never specifically needed it, at work I’ve often wanted to do that and take the next step of sifting through version control, the ticketing system, and team chats to pull a complete view of what’s been happening around a particular chunk of code. I point that all out because I think that you’re on the right track, however you ultimately solve that problem for yourself.
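For what it’s worth, here’s a minimal sketch of what that “quick program” to merge notes and code might look like for Python source. Everything in it is an assumption for illustration: the `file::function: text` note format, the helper names, and the choice to insert notes as comments above the matching function.

```python
import ast
import re

# Hypothetical note format: "path/to/file.py::function_name: note text"
NOTE_RE = re.compile(r"^(?P<file>[^:\s]+)::(?P<func>\w+):\s*(?P<text>.+)$")


def parse_notes(notes: str) -> list[tuple[str, str, str]]:
    """Pull (file, function, text) triples out of free-form notes."""
    found = []
    for line in notes.splitlines():
        match = NOTE_RE.match(line.strip())
        if match:
            found.append((match["file"], match["func"], match["text"]))
    return found


def annotate(source: str, filename: str, notes: list[tuple[str, str, str]]) -> str:
    """Insert each matching note as a comment above the function it names."""
    # Map each function name to the line where its definition starts.
    starts: dict[str, int] = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            starts.setdefault(node.name, node.lineno)

    # Group note texts by the line number they should appear above.
    by_line: dict[int, list[str]] = {}
    for file, func, text in notes:
        if file == filename and func in starts:
            by_line.setdefault(starts[func], []).append(text)

    merged = []
    for number, line in enumerate(source.splitlines(), start=1):
        indent = line[: len(line) - len(line.lstrip())]
        for text in by_line.get(number, []):
            merged.append(f"{indent}# NOTE: {text}")
        merged.append(line)
    return "\n".join(merged)
```

Notes that name a different file or an unknown function are silently skipped here; a real tool would probably want to report those instead.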