How to prevent spam in contact forms without using CAPTCHA?

Question

19

I'm looking for a simple and efficient way to avoid spam in contact forms. The idea is that the visitor should not have to prove that they are a "person" (as with a CAPTCHA), and the form must stay clean: the visitor should see nothing besides the form itself.

Rules:

CAPTCHA cannot be used;
The client should only see the form;

Recommendations:

Use only HTML, CSS and JS;
Smallest possible code;
Extra server-side validation is acceptable, but ideally there would be none.

Form:

<form>
    <label for='name'>Name:</label><input type='text' id='name' name='name'>
    <label for='email'>E-mail:</label><input type='text' id='email' name='email'>
    <label for='message'>Message:</label><textarea id='message' name='message'></textarea>
    <input type='submit' value='Send'>
</form>

Note: The primary goal is to block spam from bots that run against random sites.

spam

asked by anonymous 28.05.2014 / 23:41

5 answers

9

One way is to use a proof-of-work system, such as Hashcash or the mining scheme of the virtual currency Bitcoin (among others). A simple example would be:

Choose an integer T to be your work factor; if your site faces little spam, this number can be small, and if the spam problem grows, increase it;

Generate a random token a on the server, along with a timestamp and a (short) expiration date, digitally sign them and send them to the client (in a cookie) together with the current work factor;

Add JavaScript code that generates random tokens b, looking for one such that:

scrypt(a + b) < (2**256 >> T)

In other words, find a b such that the resulting hash has T leading zeros [in binary]. If you do not want to deal with binary arithmetic, just check whether the hexadecimal hash starts with T/4 zero characters (each hexadecimal digit covers 4 bits, so pick T as a multiple of 4).

When the form is submitted, also send the token b that was found;

While this token has not been found, do not submit the form, since the server would reject it; show the user a message such as "validating...";

Upon receiving the form, check that the date is valid (i.e. later than the timestamp and earlier than the expiration) and that the b parameter satisfies the constraint above. Otherwise, reject the form.

(Note: I suggested scrypt because it is harder to manipulate, but you can use another hash that is lighter on the server.)
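
The steps above can be sketched roughly as follows. This is only an illustration under a few assumptions: it runs on Node.js, it swaps scrypt for SHA-256 (the lighter-hash option mentioned in the note), and it searches for b with a counter instead of random values. The names leadingZeroBits, isValidProof and findProof are made up for this example.

```javascript
const { createHash } = require('node:crypto');

// Count the leading zero bits of a hash given as a Buffer.
function leadingZeroBits(hash) {
  let bits = 0;
  for (const byte of hash) {
    if (byte === 0) {
      bits += 8;
      continue;
    }
    // Math.clz32 counts zeros over 32 bits; a byte only uses the low 8.
    bits += Math.clz32(byte) - 24;
    break;
  }
  return bits;
}

// hash(a + b) < 2^(256 - T) is the same as "the hash starts with at least T zero bits".
// Assumption: SHA-256 stands in for scrypt here.
function isValidProof(a, b, T) {
  const hash = createHash('sha256').update(a + b).digest();
  return leadingZeroBits(hash) >= T;
}

// Client-side work: try candidates for b until one satisfies the constraint.
// Assumption: a simple counter replaces the random tokens described above.
function findProof(a, T) {
  for (let counter = 0; ; counter++) {
    const b = counter.toString(16);
    if (isValidProof(a, b, T)) return b;
  }
}

// Usage: the client searches, the server re-checks with a single hash.
const b = findProof('token-a-from-server', 16);
console.log(isValidProof('token-a-from-server', b, 16)); // true
```

The asymmetry is the point: the search costs about 2^T hashes on average, while the server verifies with one. In a browser the search would have to use the page's own hashing facilities (for example the Web Crypto API) rather than Node's crypto module, ideally inside a Web Worker.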

A single human visitor filling out the form once will barely notice the extra work in the browser, especially if you run it in parallel with the form filling, in a Web Worker for example. A spammer who wants to submit the form many times, on the other hand, has to repeat this time-consuming calculation for every submission, in particular if you invalidate tokens that have already been used (which may put some load on the database *) or if you include the key form fields (email and message **) in the hash function input.

At the end of the day, what matters is that the cost of the processing power *** needed to send the spam is greater than the profit expected from it. And the scheme is costly for the client but cheap for the server.

This was a simple example, and the protocol can certainly be improved, but it is the only solution I know of that cannot simply be bypassed: even the methods described in the other answers (checking the IP or Referer, using sophisticated algorithms to identify "human behavior", etc.) can be circumvented if your site is valuable enough to spammers. The ideal, of course, is "defense in depth", so if you can employ those methods as well (simultaneously), the chances of stopping spam increase.

* This may or may not matter, depending on whether your site already queries the database anyway. One way to minimize the cost is to store the a token as a column in the message table with a UNIQUE constraint: that prevents a spammer from reusing the same token in multiple messages, and you even get a free "UUID" for the message. :)

** The disadvantage is that this work cannot run in parallel with filling in the form, so the user will always see the "validating" message and wait a few extra seconds for each message sent. Compared with a CAPTCHA it is still a net win, but from the user's point of view your website will look "slow"...

*** If the spammer is using a botnet to send spam, he is not the one paying that cost, so the technique is less effective. On the other hand, it will slow down the infected user's computer, which may help that user suspect a malware infection.

03.06.2014 / 01:24

6

Why not use Captcha?

Well, there are several spamming techniques, so we would have to handle all of them (which is impossible, since techniques are reinvented every day).

You can use a CSRF token, that is, a unique hash generated by the server and included in the request (POST) sent by the user. If the hash does not arrive as expected, someone is trying to make the request outside the browser.
You can check the HTTP Referer header of the request on the server and see whether the user comes from the expected URL and domain (yours);
None of the above stops a robot from submitting the form through the browser again and again (this is where the CAPTCHA would come in). What you can do is compare the timestamp of when the page was generated with the timestamp of when the user submits the form. Robots usually fill out the form and send it very quickly. You can set a minimum time a human needs to complete the form (15 seconds, perhaps, depending on the form) and reject submissions that arrive faster.
Robots usually fill in every field found inside a <form>, i.e. if you add an <input type="text" style="display: none" name="name2" />, the robot will probably fill it in. This field is not shown to the user, so a human would submit it empty. If it arrives filled in, you know it was not a human who filled it.
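
A minimal sketch of the timestamp check from the list above, assuming a Node.js server. The server signs the render time with an HMAC so the client cannot forge it, which also yields a token that only your server can issue. SECRET, MIN_FILL_MS, issueFormToken and isAcceptableSubmission are hypothetical names, and the 15-second threshold is just the figure suggested above.

```javascript
const { createHmac, timingSafeEqual } = require('node:crypto');

// Assumption: in a real deployment this comes from configuration, not source code.
const SECRET = 'replace-with-a-long-random-server-secret';
// Minimum time a human is assumed to need to fill in the form.
const MIN_FILL_MS = 15 * 1000;

function sign(value) {
  return createHmac('sha256', SECRET).update(value).digest('hex');
}

// Called when the form is rendered; the result goes into a hidden field.
function issueFormToken(now = Date.now()) {
  const issuedAt = String(now);
  return `${issuedAt}.${sign(issuedAt)}`;
}

// Called on POST; true only for an untampered token that is old enough.
function isAcceptableSubmission(token, now = Date.now()) {
  const [issuedAt, signature = ''] = String(token).split('.');
  const expected = Buffer.from(sign(issuedAt));
  const received = Buffer.from(signature);
  if (expected.length !== received.length || !timingSafeEqual(expected, received)) {
    return false; // forged, missing or corrupted token
  }
  return now - Number(issuedAt) >= MIN_FILL_MS;
}
```

A maximum age could be enforced in the same function so that old tokens cannot be replayed indefinitely.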

Will your app be safe from spam? Certainly not. It would not be even with a CAPTCHA, but it would be great to have one. =)

Reference: link

02.06.2014 / 21:21

4

I would do the following:

I would take the source IP address of whoever is submitting the form and check it against a trusted RBL (Barracuda, SpamCop, etc.). If the IP is listed by any of them, the chance that the message is spam increases considerably, and you can decide whether or not to accept the submission.
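
As a rough sketch of how such a lookup usually works, assuming Node.js and an IPv4 address: DNS-based blocklists are queried by reversing the octets of the IP, appending the list's zone, and resolving the result. The zone name used as the default below is an assumption to be checked against the list operator's documentation, and dnsblQueryName and isListed are hypothetical names.

```javascript
const dns = require('node:dns').promises;

// Build the DNSBL query name: the IPv4 octets reversed, followed by the list's zone.
function dnsblQueryName(ip, zone) {
  const octets = ip.split('.');
  const valid =
    octets.length === 4 &&
    octets.every((o) => /^\d{1,3}$/.test(o) && Number(o) <= 255);
  if (!valid) {
    throw new Error(`not an IPv4 address: ${ip}`);
  }
  return `${octets.reverse().join('.')}.${zone}`;
}

// Resolves to true if the IP is listed. A listed address resolves to an A record;
// an unlisted one produces a "not found" DNS error.
// Assumption: 'bl.spamcop.net' as the zone; confirm it with the list you use.
async function isListed(ip, zone = 'bl.spamcop.net') {
  try {
    await dns.resolve4(dnsblQueryName(ip, zone));
    return true;
  } catch (err) {
    if (err.code === 'ENOTFOUND' || err.code === 'ENODATA') return false;
    throw err; // timeouts and other failures are left to the caller
  }
}
```

Since the lookup can fail or time out, it is probably safer to treat a listing as one signal that raises the spam score rather than as an automatic rejection.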

Another method is to pass the form content to a script that checks and classifies it: by analyzing the text written in the form, it can decide whether or not it is spam. You can integrate with SpamAssassin, which can classify text and which you can train through learning.

Another method is to add a From field to your form, to be filled in with a valid email address of the submitter. You send a confirmation link to that address, and the message is only delivered if the person clicks the link received by email.

02.06.2014 / 22:27

3

There is only one way to avoid spam properly, and that is filtering on the backend. Hidden fields and other "creative" tricks only stop old and/or simple bots.

If you care that much about users not having to solve a CAPTCHA, the "loss" has to fall on you: receive the messages and then filter them with services such as Akismet and Mollom.

This is how most large sites deal with spam: the filtering happens after the message is sent.

05.06.2014 / 14:36


score 23 · Accepted Answer

A warning

Not requiring validation on the server means doing all the checking in the user's browser.

This implies that any solution can easily be circumvented with minimal knowledge of JavaScript. Another possible attack is to simply replay the HTTP request without using a browser at all. In summary: without server validation, any solution will be extremely vulnerable.

Solution that does not require server validation

To stop only the most naive attacks, one solution is to enable form submission from a page event that fires only once a real user appears to be using the page.

The challenge is identifying the behavior pattern of a real user. I imagine a user will either click the button or press TAB until reaching it, right? So we could enable submission only after a mouseover or focus event on the button.

In addition, to keep an automated script from finding the form's action and fields, I would use an Ajax-based solution.

See the following example of two fields with a button:

Field 1: <input id="f1"/><br/>
Field 2: <input id="f2"/><br/>
<button type="button">Enviar</button>

And then a script that listens for the mouseover and focus events and, only once one of them has fired, adds the click handler that will perform the Ajax request:

// watch for focus and mouseover on the button
$('button').on('focus mouseover', function() {

    $(this)
        // these events are no longer needed
        .off('mouseover focus')
        // add the handler that will perform the final request
        .on('click', function() {
           console.log('implement the Ajax call here');
        });

});

Demo on JSFiddle

Anyway, I think this can also be adapted to a conventional submit if needed.

Solutions that require server validation

For the record, I am leaving here an answer I had written earlier that uses server validation.

There are several ways to avoid a CAPTCHA, some more professional, others based on creativity.

Hidden field for "humans"

An SO answer suggested creating a hidden field in the form. A "robot" program that sends messages automatically will try to fill this field with some random information. Your code then knows that, if the hidden field is filled in, someone has been touching what they should not.

Example:

<!-- this field must stay empty, but bots will probably try to fill it in -->
<input type="text" id="nao_humano" name="nome" />

<!-- this is the field the user is actually meant to fill in -->
<input type="text" name="nome_real" />

<!-- the style hides the field the user must not fill in -->
<style>
   #nao_humano { display: none }
</style>
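
On the server, the check for the markup above could look something like this. It is a minimal sketch that assumes the form body has already been parsed into an object mapping field names to strings; checkHoneypot is a hypothetical name, while nome and nome_real are the field names from the example.

```javascript
// Inspect a parsed form body (field name -> string value).
// "nome" is the hidden trap field; "nome_real" is the field humans fill in.
function checkHoneypot(fields) {
  const trap = (fields.nome || '').trim();
  if (trap !== '') {
    // A human never sees the trap field, so any content marks the post as automated.
    return { spam: true, name: null };
  }
  return { spam: false, name: (fields.nome_real || '').trim() };
}
```

For example, checkHoneypot({ nome: '', nome_real: 'Maria' }) reports a legitimate submission, while any non-empty nome flags it as spam.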

Service with artificial "intelligence"

Some services do the spam-filtering work for you. For example, on my blog I use Akismet.

Akismet works something like this:

The user submits a comment on the form

A code on my site receives the message and sends it to the Akismet service

The Akismet service checks the comment against a Spam information base

Akismet returns whether the message is potential spam or not
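
The flow above might be wired up roughly like this on a Node.js server (version 18 or later, for the built-in fetch). The endpoint and parameter names reflect my understanding of Akismet's public REST API and should be confirmed against the official documentation before use; buildAkismetRequest and isSpam are hypothetical names.

```javascript
// Build the form-encoded body for a comment-check call.
// Assumption: parameter names follow Akismet's REST API; verify them in the docs.
function buildAkismetRequest({ apiKey, siteUrl, userIp, userAgent, author, email, content }) {
  return new URLSearchParams({
    api_key: apiKey,
    blog: siteUrl,
    user_ip: userIp,
    user_agent: userAgent,
    comment_type: 'contact-form',
    comment_author: author,
    comment_author_email: email,
    comment_content: content,
  });
}

// Send the message to the service and interpret the plain-text answer.
// Assumption: the service replies with the text "true" for spam, "false" otherwise.
async function isSpam(message) {
  const response = await fetch('https://rest.akismet.com/1.1/comment-check', {
    method: 'POST',
    body: buildAkismetRequest(message),
  });
  return (await response.text()).trim() === 'true';
}
```

Keeping the request builder separate from the network call makes it easy to log or test what is actually sent to the third party.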

Obviously, there is some concern about the security of this process. For a public blog it is not a problem, but for a company that receives information from clients, sending that information to a third-party server can be a deal-breaker.

Detecting Human Behavior

Another idea I saw some time ago is to detect events on the site to validate if someone is actually typing the message.

Doing this is relatively simple. First, generate a random code and put it in the user's session. Print this same code in a Javascript block inside the page:

var codigo = 'CODIGO_GERADO';

Then add a hidden and initially empty field in the form:

<input type="hidden" name="validacao"/>

Now add a handler for some event on the page, such as mouseover or keyup, that fills the validacao field with the value of codigo.

Finally, the server must check that the validacao field arrived with the code. To dodge smarter spammers, the name of this field can also be randomized.
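
The client-side half of this idea could be sketched as below. armValidationField is a hypothetical helper written against the generic EventTarget interface, so the same function works with document in the browser; the event names are the ones mentioned above.

```javascript
// Copy `code` into `field.value` the first time any of the given events
// fires on `target`, then stop listening.
function armValidationField(target, field, code, eventTypes = ['mouseover', 'keyup']) {
  const fill = () => {
    field.value = code;
    eventTypes.forEach((type) => target.removeEventListener(type, fill));
  };
  eventTypes.forEach((type) => target.addEventListener(type, fill));
}

// In the page, this could be wired up as:
//   armValidationField(document, document.querySelector('input[name="validacao"]'), codigo);
```

The server then compares the submitted validacao value with the code stored in the session and rejects the message when they differ.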

Conclusion

In my opinion, creativity is what matters most here. The more unusual and creative your solution, the harder it will be for spammers to detect it.

Do not forget that any client-side validation can easily be fooled by anyone who knows how to use the browser's developer tools and has an intermediate knowledge of JavaScript.