Got a warning for my blog going over 100GB in bandwidth this month… which sounded incredibly unusual. My blog is text and a couple images and I haven’t posted anything to it in ages… like how would that even be possible?

Turns out it’s possible when you have crawlers going apeshit on your server. Am I even reading this right? 12,181 with 181 zeros at the end for ‘Unknown robot’? This is actually bonkers.

Edit: As Thunraz points out below, there’s a footnote that reads “Numbers after + are successful hits on ‘robots.txt’ files” — so it’s 12,181 hits plus 181 robots.txt hits, not scientific notation.

  • slazer2au@lemmy.world · 4 months ago

    AI scrapers are the new internet DDoS.

    Might want to throw something in front of your blog to ward them off, like Anubis or a tarpit.
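    A tarpit here just means a deliberately slow endpoint you route bot traffic to, so each crawler request wastes the crawler’s time instead of your bandwidth. A minimal sketch in Python — the port, delay, and finite chunk count are illustrative (a real deployment would drip indefinitely and serve many connections):

    ```python
    import socket
    import time

    def tarpit(port, delay=1.0, chunks=3):
        """Serve one connection a slow trickle of bytes, then hang up.
        A real tarpit drips forever; 'chunks' keeps this demo finite."""
        srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind(("127.0.0.1", port))
        srv.listen(1)
        conn, _ = srv.accept()
        conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n")
        for _ in range(chunks):
            time.sleep(delay)  # make the crawler wait for every morsel
            conn.sendall(b"<a href='/deeper'>.</a>\n")  # bait link, one drip at a time
        conn.close()
        srv.close()
    ```

    You’d only point suspected bots at this (say, via a user-agent match in your reverse proxy), never real visitors.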

    • ikt@aussie.zone · 4 months ago

      the one with the quadrillion hits is this bad boy: https://www.babbar.tech/crawler

      “Babbar.tech operates a crawler service named Barkrowler, which fuels and updates our graph representation of the world wide web. This database, and all the metrics we compute with it, are used to provide a set of online marketing and SEO tools for the SEO community.”
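      If you just want this one crawler gone, a server-side user-agent block is usually enough, assuming it identifies itself honestly. A sketch for an nginx `server` block — the `Barkrowler` match is an assumption based on the crawler’s name, so confirm the exact User-Agent string in your own access logs first:

      ```nginx
      # Reject the crawler before it eats bandwidth.
      # "Barkrowler" is assumed from the crawler's name — check your
      # access logs for the exact User-Agent string it sends.
      if ($http_user_agent ~* "Barkrowler") {
          return 403;
      }
      ```

      A matching `robots.txt` `Disallow` rule is the polite version of the same idea, but it only works on crawlers that bother to honor it.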