Spam Report

While checking the Akismet spam filter to respond to one of the commenters on another thread, PG discovered a statistical summary of the last twelve months of TPV from a ham/spam perspective. He thought some visitors might find it interesting.

PG really enjoys reading 99% of the comments that appear on TPV, but he had no idea there were almost 11 thousand comments he didn’t see because Akismet zapped them. That’s 36% of the total comments that were submitted to the blog.

Since PG opened up Excel to calculate that 36% spam percentage, he played with the numbers a little bit more.

Absent the spam filter, PG isn’t certain how long it would have taken him to clean up the spam by manually deleting spam comments.

However, with Excel at hand: if each spam comment took him an average of 15 seconds to identify and delete, that would total almost 45 hours he would have to devote to keeping the conversational space tidy, in addition to the time he already spends on TPV.

If it took 30 seconds per spam comment, that’s almost 90 hours. At 60 seconds per spam comment, the total would be about 179 hours, or roughly one month of 8-hour Monday-Friday work days.

PG’s calculations only assume time spent on the 10,744 spam comments that Akismet caught during the last twelve months. However, in order to identify the chaff or mostly-chaff comments, PG would also have to at least briefly examine the wheat comments before determining he wouldn’t need to delete them.

The total number of wheat and chaff comments would have been almost 30,000. Presumably, without Akismet cleaning the chaff, spammers might well have been incented to drop more comments into TPV, thereby consuming more human filtering time.
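For visitors who would rather check the arithmetic than take Excel's word for it, here is a minimal Python sketch of the same back-of-the-envelope calculation. The 10,744 spam count, the 36% share, and the 8-hour work day come from the post; the function and constant names are made up for illustration, and because the 36% figure is rounded, the implied total is only approximate.

```python
# Back-of-the-envelope check of the spam-cleanup estimates in the post.
# All names below are illustrative; only the input figures come from the post.
SPAM_COMMENTS = 10_744   # spam comments Akismet caught in twelve months
SPAM_SHARE = 0.36        # spam as a (rounded) fraction of all submitted comments
HOURS_PER_WORKDAY = 8    # the post's 8-hour Monday-Friday work day


def cleanup_hours(seconds_per_comment: float, comments: int = SPAM_COMMENTS) -> float:
    """Hours needed to handle `comments` at a fixed number of seconds each."""
    return comments * seconds_per_comment / 3600


def implied_total_comments(spam: int = SPAM_COMMENTS, share: float = SPAM_SHARE) -> int:
    """Total submitted comments implied by the spam count and its rounded share."""
    return round(spam / share)


if __name__ == "__main__":
    for seconds in (15, 30, 60):
        hours = cleanup_hours(seconds)
        workdays = hours / HOURS_PER_WORKDAY
        print(f"{seconds:>2} s per comment: {hours:6.1f} hours ({workdays:4.1f} work days)")
    print(f"Implied total comments: about {implied_total_comments():,}")
```

The results land close to the figures above: just under 45 hours at 15 seconds, just under 90 hours at 30 seconds, a bit over 22 work days at 60 seconds, and a total of a little under 30,000 comments.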

PG needs to figure out a way to make a donation to Akismet.

14 thoughts on “Spam Report”

  1. I’m nowhere near your levels but do find the spam stuff interesting. I have the filter save spam rather than delete it directly.

    However, I think the numbers are less scary “per spam” than what you have. Most of them would take less than a second to identify, and all no more than 3 or 4 tops. Reviewing them in the save queue makes it obvious how repetitive they are, and how easy to triage by one of three fields:

    a. The return address is often “GUCCI WATCHES” or some stupid product name;
    b. The URLs are often GUCCIWATCHES.SALE.CH or something similarly obvious; and,
    c. The content of the email is either REALLY badly worded, a string of links, in a different alphabet, or very clearly about a product.

    If I didn’t have AKISMET, it would be some less useful process, but it would still be possible to manually triage quickly. Just easier to let the bot do it. 🙂

    For me, I’m far more impressed by how ACCURATE the bot is. Once in a while, I’ll review a list of the spam to see if there are any false positives, and it is VERY rare to see any (1 or 2 a year, if that).

    • You’re probably right about processing speed without Akismet, Paul, but doing so manually would not be my favorite task several times per day.

      I think there would also be a bad-drives-out-the-good effect on serious comments if a commenter sees a bunch of junk in the comments.

      • Agree completely. My legitimate comment volume is low enough that I approve all comments, even if they’ve commented before. Eventually I might have to change that. Separately, though, I’m impressed with how effective Gmail filters are. I get no spam in my inbox except for spam directed to me personally, which is really just unsolicited offers, but at least hand-tailored. Some things captured by my spam filter are occasionally not spam, though, and have to be put on a whitelist (often e-zines that come in with an ad built into the copy).

        P.

  2. Pricing and Support Options for Akismet. Akismet has a basic plan which is provided on an honor system basis. This means you decide what Akismet is worth and then pay that amount. Apart from that, Akismet also has monthly paid plans starting from $5 per month for a single site. (May 18, 2018)

    • Randell – I took a quick look for Akismet plans yesterday, but couldn’t find a straight donation option. I’ll go look again.

      • I didn’t see anything myself except through the plugin link. It defaults on mine to either free or a per-year subscription rate at whatever you think is fair, but the amounts start at $16.20 Cdn and go up in $8 increments. It does have a button at the bottom to AUTO RENEW or not, so if you don’t tell it to auto-renew, presumably it’s a one-time contribution. But I don’t like the wording around “expiry dates” for your subscription…makes it sound like once you sign up, you can’t go back to your free, non-renewing options.

        If you find a tip jar option, let me know!

        P.

    • I had not thought about Ham/Spam not working for a vegetarian, Deb.

      I don’t know if there is a vegetarian equivalent to those terms available.

      • I’m not certain I want to meet the vegetarian version of SPAM. Although I saw an ad for vegan haggis once*, so that might be close.

        *once. I’ve seen more for vegetarian, so I guess the vegan option was not popular enough to warrant offering it a second time.

  3. The zero false positives figure looks good, but how do you tell whether it is correct? Clearly you don’t examine all the spam to check.

    When my ISP changed its email service the number of false positives was so high that I had to change the settings so everything it called spam was still directed to my PC. It’s much better now but the email program on my PC still chucks the odd email into its junk folder (it particularly likes doing this to notifications of new comments on PV).

    • Obviously, you are in dire need of a more discerning email spam filter, Mike. 🙂

      On occasion, I’ll receive an email via the Contact page about an improperly-condemned comment, but it’s pretty rare considering the quantity of comments that pass under Akismet’s all-seeing eye.

  4. Back around the mid ’90s, I realized the vast majority of spam I received could be dealt with via a simple spell checker, perhaps backed up by a grammar checker. There weren’t any that were suitable to pipe from procmail on the OS I was using, so that idea didn’t go anywhere.

    That sort of filtering would have also trapped at least 10% of my non-spam email, but one of my personal quirks is that I won’t bother to parse gibberish that looks like it was typed by someone’s cat. Since that also applies to business mail, it has also cost me some undetermined amount of money over the years…

  5. There may be more issues with Akismet than you know. I recently commented on the Rose Noir posting and likely because I included a couple of links to solutions to the WP color space issue, my comment is not currently appearing. I assumed at the time I posted it that while it triggered a spam filter hit, someone would eventually come along and rescue it. If you never look at filtered comments, then I guess you won’t see the solution to the issue.

    In addition, several times in the past, while reading from a very large corporate network that proxies web traffic, I’ve been blocked from reading TPV and instead given a page saying that I’m reading too fast. I assume that’s to stop scraping bots, but when you’re reading at a normal pace and the site sees every single person behind the same corporate proxy as one IP, it feels overaggressive to prevent even reading a post!

Comments are closed.