I recently joined Pixelfed and, considering there’s no algorithm, hashtags are the only way to be discovered.

I hate hashtag optimizing, but I also don’t want to upload my image to someone else’s random server just to generate hashtags before posting it to Pixelfed. Where should I look to find something I can host myself, or even something that runs natively on Android/Linux, that’ll generate hashtags/keywords for an image?

  • giyila7033@sh.itjust.works
    2 days ago

    Ugh, hashtag optimizing is soul-sucking, but you don't need to hand your photos to some random web service to get tags. Run a small image-captioning or tagging model locally and convert the results into hashtags. My go-to is clip-interrogator: it runs on your own machine, spits out concise keywords and prompt-like descriptions, and is built to pull useful phrases out of images. GitHub: https://github.com/pharmapsychotic/clip-interrogator
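A minimal sketch of the clip-interrogator route, assuming the `clip-interrogator` and `Pillow` packages are installed and that the model returns a comma-separated prompt. The `Config`/`Interrogator` calls follow the project README; `phrases_to_hashtags`, `interrogate_image`, and the `max_words`/`max_tags` limits are hypothetical names chosen for illustration.

```python
import re


def phrases_to_hashtags(prompt, max_words=3, max_tags=8):
    """Turn a comma-separated prompt into CamelCase hashtags.

    Phrases longer than max_words (usually the leading full caption)
    are skipped, and duplicates are dropped while keeping order.
    """
    tags = []
    for phrase in prompt.split(","):
        words = re.findall(r"[A-Za-z0-9]+", phrase)
        if not words or len(words) > max_words:
            continue
        tag = "#" + "".join(word.capitalize() for word in words)
        if tag not in tags:
            tags.append(tag)
    return tags[:max_tags]


def interrogate_image(path):
    """Run clip-interrogator locally on one image and return its prompt."""
    # Imported lazily so the hashtag helper works without the model installed.
    from PIL import Image
    from clip_interrogator import Config, Interrogator

    image = Image.open(path).convert("RGB")
    # Model name taken from the project README; other CLIP models also work.
    interrogator = Interrogator(Config(clip_model_name="ViT-L-14/openai"))
    return interrogator.interrogate(image)
```

Something like `print(" ".join(phrases_to_hashtags(interrogate_image("photo.jpg"))))` would then give a line to paste into a post. The first run downloads the model weights; after that everything stays on your machine.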

    If you want something more generic, use BLIP (Salesforce) image captioning via Hugging Face, e.g. the Salesforce/blip-image-captioning models. Install the libraries with pip, run the model to get a caption, then use a simple part-of-speech filter (spaCy works) to pull out the nouns and adjectives and prefix them with #. That workflow is trivial to script and keeps everything local. The models will run on CPU but are far faster with a GPU.
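A rough sketch of that BLIP workflow, assuming `transformers`, `torch`, and `Pillow` are installed and using the `Salesforce/blip-image-captioning-base` checkpoint. To stay dependency-light it swaps the POS filter for a small hand-written stopword list, which is cruder than spaCy (verbs like "sitting" get through). `caption_image`, `caption_to_hashtags`, and `STOPWORDS` are hypothetical names.

```python
import re

# Tiny stopword list standing in for a real part-of-speech filter.
STOPWORDS = {
    "the", "and", "with", "for", "from", "that", "this", "there",
    "are", "was", "has", "its", "into", "over", "near", "next",
}


def caption_to_hashtags(caption, max_tags=10):
    """Keep the content-looking words of a caption and prefix them with '#'."""
    tags = []
    for word in re.findall(r"[a-z0-9]+", caption.lower()):
        # Drop very short words ("a", "on", "at") and listed stopwords.
        if len(word) < 3 or word in STOPWORDS:
            continue
        tag = "#" + word
        if tag not in tags:
            tags.append(tag)
    return tags[:max_tags]


def caption_image(path, model_name="Salesforce/blip-image-captioning-base"):
    """Caption one image locally with BLIP (CPU works, a GPU is faster)."""
    # Imported lazily so the hashtag helper works without the model installed.
    from PIL import Image
    from transformers import BlipForConditionalGeneration, BlipProcessor

    processor = BlipProcessor.from_pretrained(model_name)
    model = BlipForConditionalGeneration.from_pretrained(model_name)
    image = Image.open(path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=30)
    return processor.decode(output_ids[0], skip_special_tokens=True)
```

Calling `caption_to_hashtags(caption_image("photo.jpg"))` returns a list you can join with spaces. Swapping the stopword check for spaCy's `token.pos_ in {"NOUN", "ADJ"}` would give the noun/adjective filter described above.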

    If you absolutely need Android, use Termux with a lightweight model, or run the model on a small home server and call it from your phone. Either way, do not upload to third-party servers. For anime art, DeepDanbooru is the standard local tagger. For everything else, clip-interrogator plus a tiny post-processing script is your best bet.
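For the home-server option, one possible shape is a tiny HTTP endpoint that accepts raw image bytes in a POST body and answers with JSON. This is only a sketch using Python's standard library; `make_handler`, `serve`, the port, and the `tagger` callback (any function that maps image bytes to a list of hashtags) are assumptions, and it has no authentication, so it belongs on a trusted home network only.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_handler(tagger):
    """Build a request handler that passes POSTed image bytes to `tagger`."""

    class TagHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            # Read exactly the number of bytes the client announced.
            length = int(self.headers.get("Content-Length", 0))
            image_bytes = self.rfile.read(length)
            body = json.dumps({"hashtags": tagger(image_bytes)}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the console quiet

    return TagHandler


def serve(tagger, host="0.0.0.0", port=8000):
    """Serve forever; `tagger` wraps whichever local model you picked."""
    HTTPServer((host, port), make_handler(tagger)).serve_forever()
```

From the phone, something like `curl --data-binary @photo.jpg http://<server-address>:8000/` (for example from Termux) would return the tags without the image ever leaving your network.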

    Honestly, stop treating hashtags like SEO black magic: generate sensible descriptive tags locally and move on with your life. If you want, I can paste a minimal Python snippet that takes an image, runs BLIP or clip-interrogator, and outputs a ready-to-paste list of hashtags. Which model do you want to try, BLIP or clip-interrogator?