June 17, 2024

Web Form Spam - Actionable Rules

Published: 17 June 2024 

I've had a website promoting my services as a web designer and then later as an SEO since April 2000.

Yes, really 24 years ago! Check out the history section of our about page for more info.

In this post I'll look at the following.

Introduction to Form Spam

In today's digital age, maintaining a clean and efficient online presence is crucial for businesses and individuals alike. One of the significant challenges faced by website owners and their website designers is the issue of form spam.

Form spam occurs when automated bots or malicious users submit fake or irrelevant information through your website's forms, e.g. contact forms, registration forms, or comments on blog posts (like this one).

This type of spam can lead to various problems including cluttered databases and wasted resources.

In this blog post, we will look at web form spam, its impact on your website, effective strategies to prevent and manage it, and how we keep a handle on it for ourselves and for some of our Digital Marketing clients.

By understanding web form spam and implementing robust (yet simple to manage) countermeasures, you can protect your website and ensure a seamless experience for your genuine users and potential customers.

The Impact of Form Spam

Form spam can have far-reaching consequences for both your website and your overall online strategy. While it may seem like a minor inconvenience at first, the cumulative effects of web form spam can be detrimental in several ways:

  1. Database Clutter: Spam submissions can quickly fill up your database with junk data, making it difficult to manage and analyse legitimate entries. This can lead to increased storage costs and, more significantly, reduce the efficiency of data retrieval.
  2. Database Load: If the volume of submissions hitting your database is very heavy, it has the potential to degrade the user experience (UX) for legitimate visitors. The main reason is that saving data into a database table (a write) can be over 10x as much work as loading information from a database table (a read). This is the classic input/output (I/O) problem.
  3. Resource Drain: Aside from I/O, resources are needed for processing and filtering spam entries. This effort can consume valuable server resources, slowing down your website and potentially affecting the performance of other critical functions.
  4. Email Overload: If your forms are set up to send notifications for each submission, spam can flood your inbox with irrelevant messages. This can make it challenging to identify and respond to genuine inquiries in a timely manner.
  5. Emails ALL Marked as SPAM: A horrific situation can occur when the webmaster's email provider decides that the spammy-looking form submissions are so bad that ALL email from the website ends up being treated as spam. This is a disaster! We've fixed this for many clients.
  6. User Experience: Excessive spam can degrade the user experience by cluttering comment sections, forums, or review pages with irrelevant content. This not only frustrates legitimate users but can also make your site appear unprofessional and poorly managed.
  7. Security Risks: Some form spam may contain malicious links or scripts intended to exploit vulnerabilities in your website. This can lead to data breaches, malware infections, or other security incidents that compromise the integrity of your site and the safety of your users.
  8. SEO Implications: Search engines demote or even penalise websites that are poorly maintained, and this especially includes comment sections. When comment sections and other user submissions are filled with spammy, unsavoury content (think porn, pills and poker/gambling), this can negatively impact your search engine rankings, reducing your site's visibility and organic traffic, and therefore leads.

When you realise the impact of these issues, you can appreciate the importance of addressing form spam promptly and effectively.

Why & How Form Spam Happens

Form spam is a widespread issue on the internet, and understanding why it occurs can help you develop more effective strategies to combat it. Here are some of the primary reasons form spam happens:

  1. Automated Bots: A significant amount of form spam is generated by automated bots designed to crawl the web and submit fake information to forms. These bots can be programmed to find and exploit vulnerabilities in web forms, flooding them with spam entries.
  2. Advertising and SEO Manipulation: Some spammers use form spam as a way to promote their products or services. By submitting links and advertisements through your forms, they aim to drive traffic to their websites or improve their search engine rankings through backlink generation.
  3. Data Harvesting: Malicious actors may use form spam to harvest data from your website. By submitting forms with phishing links or malware, they can trick users into providing personal information or infect their devices.
  4. Disruption and Vandalism: Some spammers engage in form spam simply to disrupt your operations or vandalize your website. This type of spam can include offensive or irrelevant content intended to tarnish your site's reputation and frustrate users.
  5. Competitor Sabotage: In some cases, competitors may use form spam to undermine your business. By flooding your forms with spam, they can drain your resources, distract your attention, and create a negative user experience on your site.
  6. Testing Security: Some attackers use form spam as a way to test the security measures you have in place. By probing your forms with various spam techniques, they can identify vulnerabilities that they might later exploit for more serious attacks.

Understanding these motivations helps us anticipate the types of spam that each site might encounter and tailor anti-spam strategies accordingly.

How I deal with the worst Form Spam at Matter Solutions

Ben Maden

EVERY (LEGIT) ENQUIRY IS (SUPER) VALUABLE

For me the primary issue is ensuring that my team or I see every single legitimate enquiry.

It is my view that even just one valid enquiry that I don't see because of a rule or spam filter is a disaster. This is because each of those valid enquiries represents:-

  • The culmination of all our marketing efforts, on our website and offline, i.e. a huge investment of time, effort and money.
  • Every missed enquiry represents a real human being who will never get a call back or a reply. This is a bad experience for them, and the last thing we (at Matter Solutions) want them to experience.
  • The human (who would be let down) also represents an organisation that Matter Solutions would be very likely to win over and help. Missing just one of these represents a huge opportunity cost.

For these reasons it is my position that wading through some web-form-spam is worth it to never miss a legit enquiry.

I know this isn't the case for all my clients, but as many of our clients get a manageable volume of enquiries and form spam, most of them follow the same process we use (or one close to it).

KEEP IT SIMPLE (KISS)

I've been working with WordPress since 2008 so I've seen lots of products to handle spam come and go. The original gangster is Akismet, owned by the brand behind WordPress (Automattic). We're big fans of Akismet and have had an agency license for many many years, more on that below.

My initial attempts over the years involved using reCAPTCHA and honeypots, explained below in more detail.

These days I let almost anything in (after using GravityForms linked to Akismet) and handle the resulting messages with hand-written filters, defined by me.

MY EMAIL FILTERS

Our inbound communications come to my email address primarily. It is worth noting that other team members have visibility over my inbox but we stick to rules about who follows up each lead.

The sort of rules I put in place look like this: in Google Mail, go to Settings > Filters, or do a search and then use the dropdown below the advanced search to "Create Filter". In this case I've been spammed enough times by brandbuildingassistance@outlook.com to realise they're just wasting our time, every single time!

As a result this rule silently takes any enquiry that mentions their reply email and (1) applies a label, which works like a folder, and (2) skips past my inbox so I never have to see it. The best thing is that I can scale these simple rules to cover many of the spammers who complete our forms over and over again.

One could say it is a game of cat and mouse, they get ignored so they move on to another email address or brand. Yes, probably, but it doesn't quite work that way.

They're in business and need leads, so they use a website, a brand, or an email address over and over. For example, "Monkey Digital" are in my inbox 48 times, but it just doesn't matter to me, because they're silently filtered and therefore only ever seen if I go and check the (hierarchical) Gmail label that I call "X-Bad/Form-Spam".
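The logic of such a filter can be sketched in a few lines of Python. This is only an illustration of the rule, not how Gmail implements filters: the `Message` class, the `apply_form_spam_filter` function and the "INBOX" label name are all hypothetical, and the blocklist holds just the one address mentioned above.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical blocklist; the only entry is the address named in this post.
BLOCKED_REPLY_ADDRESSES = {"brandbuildingassistance@outlook.com"}
SPAM_LABEL = "X-Bad/Form-Spam"


@dataclass
class Message:
    """A stand-in for an email notification from the website form."""
    body: str
    labels: set[str] = field(default_factory=lambda: {"INBOX"})


def apply_form_spam_filter(message: Message) -> Message:
    """Label the message and skip the inbox if a blocked address appears in it."""
    body = message.body.lower()
    if any(address in body for address in BLOCKED_REPLY_ADDRESSES):
        message.labels.add(SPAM_LABEL)    # (1) apply the label
        message.labels.discard("INBOX")   # (2) skip the inbox
    return message
```

Note that nothing is deleted: a filtered message still exists under the label, so it can be reviewed later.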

If you'd like a list of the current offenders let me know and I'll share it with you.

ROOM FOR IMPROVEMENT

This method works fine for the day-to-day management of inbound leads, but it doesn't handle the elephant in the room for us as Digital Marketers: how do we take these "spam" submissions out of the KPIs when judging the performance of the enquiry forms?

This is best handled in another post, as it involves some technical details about UTM parameters and transferring those to the GravityForms entries.

Implementing CAPTCHA and Other Verification Methods

One of the most effective ways to combat form spam is by implementing verification methods that distinguish between human users and automated bots. CAPTCHAs (Completely Automated Public Turing tests to tell Computers and Humans Apart) are the most common and widely used verification tools. Here’s a closer look at CAPTCHAs and other verification methods you can use to secure your forms:

  1. CAPTCHA:
    • Traditional CAPTCHAs: These typically involve distorted text or images that users must decipher and enter into a form field. While effective, they can sometimes be challenging for legitimate users to solve.
    • reCAPTCHA: Developed by Google, reCAPTCHA is more user-friendly and often requires users to simply check a box that says "I'm not a robot." Advanced versions analyse user behavior to determine if the interaction is genuine, minimizing the need for user input.
    • Invisible reCAPTCHA: This version operates in the background, analysing user interactions with the site and only prompting for additional verification if the behavior appears suspicious.
  2. Honeypots:
    • Honeypot Fields: These are hidden form fields that human users cannot see and therefore will not fill out. Bots, however, will usually attempt to fill in every form field, including the hidden ones. Detecting entries in these fields can help identify and block bots.
  3. Time-Based Challenges:
    • Submission Timing: Bots typically submit forms much faster than human users can. By tracking the time it takes for a form to be completed and submitted, you can flag submissions that occur too quickly as potential spam.
  4. Email Verification:
    • Double Opt-In: Require users to verify their email address by sending a confirmation link to the provided email. This ensures that the email is valid and the user is genuine.
    • Domain Whitelisting/Blacklisting: Block form submissions from known spam email domains and prioritize submissions from trusted domains.
  5. Behavioral Analysis:
    • Mouse Movement and Click Patterns: Analyse the patterns of mouse movements, clicks, and other interactions to determine if the user behaves like a human. Bots often have predictable and mechanical interaction patterns.
    • Keystroke Dynamics: Monitor the typing patterns of users. Humans have unique typing rhythms, whereas bots typically have uniform keystroke timings.
  6. Third-Party Anti-Spam Services:
    • Akismet: Originally developed for filtering comment spam on blogs, Akismet can also be integrated into your forms to detect and block spam submissions.
    • CleanTalk: A cloud-based anti-spam service that provides spam protection for forms, comments, and registrations, using algorithms to filter out spammy submissions.
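Two of the checks listed above, honeypot fields and submission timing, are simple enough to sketch in Python. This is a minimal illustration under stated assumptions: the hidden field name `website_url`, the three-second threshold and the `spam_signals` function are all hypothetical and would need tuning for a real form.

```python
import time
from typing import Optional

HONEYPOT_FIELD = "website_url"   # hypothetical hidden field humans never see
MIN_SECONDS_TO_COMPLETE = 3.0    # hypothetical threshold; tune per form


def spam_signals(form_data: dict, rendered_at: float,
                 submitted_at: Optional[float] = None) -> list:
    """Return the reasons a submission looks automated (empty list = none)."""
    if submitted_at is None:
        submitted_at = time.time()
    signals = []
    # Honeypot: a hidden field should arrive empty from a real browser user.
    if str(form_data.get(HONEYPOT_FIELD, "")).strip():
        signals.append("honeypot_filled")
    # Timing: forms completed faster than a human could type are suspicious.
    if submitted_at - rendered_at < MIN_SECONDS_TO_COMPLETE:
        signals.append("submitted_too_fast")
    return signals
```

The function only reports signals rather than rejecting anything, so a site owner can choose to flag a submission for review instead of discarding it.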

By implementing a combination of these methods, you can significantly reduce the amount of spam that reaches your forms, thereby protecting your website's integrity and ensuring a better experience for legitimate users.
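The domain whitelisting/blacklisting item from the list above could look something like the following Python sketch. The domain names are placeholders and `classify_email_domain` is a hypothetical helper; a real list would come from your own spam history.

```python
# Placeholder lists; replace with domains from your own submissions.
BLOCKED_DOMAINS = {"spam.example"}
TRUSTED_DOMAINS = {"client.example"}


def classify_email_domain(email: str) -> str:
    """Return 'blocked', 'trusted', 'unknown' or 'invalid' for an email address."""
    local, sep, domain = email.strip().lower().rpartition("@")
    if not sep or not local or not domain:
        return "invalid"
    if domain in BLOCKED_DOMAINS:
        return "blocked"
    if domain in TRUSTED_DOMAINS:
        return "trusted"
    return "unknown"
```

Most genuine enquiries will come back as "unknown", so that result should be treated as normal rather than suspicious.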

Regular Checks & Maintenance

Weekly
  1. It is important to keep an eye on the folder/label used to silently ignore spam.
  2. I go in and have a look every week or two to feel comfortable that no legitimate leads are being sent there by mistake.
  3. I might even look through the recent submissions and check whether a particular IP address appears often. If it does I'll note it down.
Monthly (ish)
  1. I log into WordPress, go to the Dashboard > GravityForms > Entries list.
  2. Here I can mark any egregious spammy submissions as SPAM.
  3. I will often have a quick look in this SPAM folder in WordPress to double-check that nothing valuable is being incorrectly flagged as SPAM. This is known as a 'false positive'.

Ad-hoc / Quarterly

IP Blocking - Every so often I'll collate any IP addresses that have been recorded against multiple spam submissions and add them to the DENY section in our web server config. This stops anyone visiting from that IP from seeing anything at all; they can't even load the website.
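Collating the repeat offenders is easy to automate. The Python sketch below assumes you have already exported the IP address recorded against each spam entry into a plain list; the `repeat_offender_ips` function and the threshold of three submissions are hypothetical choices, not part of any plugin.

```python
from collections import Counter


def repeat_offender_ips(spam_ips, min_submissions=3):
    """Return the IPs seen on at least `min_submissions` spam entries, sorted."""
    counts = Counter(spam_ips)
    return sorted(ip for ip, seen in counts.items() if seen >= min_submissions)
```

The resulting list can then be reviewed by hand before anything is added to the server config, since blocking an IP is a blunt tool and shared or office IP addresses can affect legitimate visitors too.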

ASIDE - NEVER PRESS "SPAM" IN INBOX

Never never ever press the SPAM button in your inbox. You will cause way more issues than it can solve. You'll tell your email program that the email that came FROM your website is SPAM.

The best case scenario is that your email program thinks you're a silly goose and ignores you.

The most likely, and horrifically bad scenario is that your email program treats ALL the email from your website as SPAM and so you NEVER ever see it, even the legit ones... and if you read what I said above about how valuable enquiries are you'll understand why I'm YELLING about this!

I have often been asked to rescue clients, or even new prospective clients from this disaster. It is possible to solve, and not especially hard... but this mistake is a major pain in the proverbial that I recommend everyone avoids.

 

What do you do? Comment below.

Want some help?

If you want some help doing this for your website then speak with a web team member. Use this button to book a call back with a web specialist.

Ben Maden
