Catching 'Apache Spammers'

Abuse – Warning – Spam originating from server

Dear Customer,
You are being contacted because NOC has received complaints concerning
unsolicited emails originating from an IP address assigned to
you.
======================

So starts the nightmare alert for a web server administrator. It is followed by a SpamCop mail informing you that the server has been blocked, and then by a support ticket saying that mail cannot be sent to an AOL account.

Bobcares helps online businesses maintain secure web services. Emergency security assistance and proactive security management are an important part of our web server management services. Today we’ll go through what we do to quickly identify the spam source and block outgoing spam.

Who are these ‘Apache Spammers’?

Anyone using the web server (Apache) to generate spam mails is called an ‘Apache spammer’. Instead of using the SMTP port (25) directly to relay mails, they exploit the vulnerabilities of an existing script or upload their own scripts to the webspace available, and then use those scripts to send spam mails.

What’s this all about?

I manage a Red Hat machine running stock Apache, PHP 4.3.10 and qmail (with qpsmtpd). I also have a copy of the spam mail with headers, obtained either from the NOC’s warning mail or from the mail queue. In this article, I jot down many of the tricks that I have used to trace a spammer on my web server, to help sysadmins detect the source of spam. The article covers steps for detection and recovery, but NOT prevention.

Give me the steps..

The first and obvious trick is to read the mail content for any clue to a domain name. Very often the spammer leaves a lot of URLs and email addresses inside the mail. If a URL points to a domain on the server, you have got the clue straight away. But what if the URLs do not point to any domain on the server? Well, that’s where we have to use the tricks.

The tricks fall into 2 groups, depending on the uid shown in the spam mail’s header.

1) User id given in mail-header

The first step is to look for the user id in the headers of the spam mail in the qmail queue.
———————-
Received: (qmail 7984 invoked by uid 188); Thu, 14 Jul
2005 10:15:06 -0800
———————–
You get the user id (uid) that invoked the qmail process from the mail header. Search for the uid inside /etc/passwd (use grep for a fast result). If you are in luck, the uid belongs to a human user on the server. For example, apache and qmaild are users but not human users, while mike, john and sojish are human users. Once you know which user is generating the spam, it is easy to isolate that user’s files and, if needed, suspend the user as an immediate action.
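The lookup above can be sketched as two small shell functions. This is only an illustration: the function names and the file name spam_headers.txt are made up for the example, and it assumes the header contains the "invoked by uid N" text shown above.

```shell
# Sketch only: function names and spam_headers.txt are hypothetical.

# Print the first uid found in a qmail "invoked by uid N" header line.
uid_from_header() {
    sed -n 's/.*invoked by uid \([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

# Print the login name whose uid (third passwd field) equals $1.
# The second argument is the passwd file, defaulting to /etc/passwd.
user_for_uid() {
    awk -F: -v uid="$1" '$3 == uid { print $1 }' "${2:-/etc/passwd}"
}

# Example usage:
#   uid=$(uid_from_header spam_headers.txt)
#   user_for_uid "$uid"
```

Matching on the third field with awk avoids false hits that a plain grep for the number could give on group ids or other fields.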

If you have suEXEC enabled on your server, you need to check the webspace files too, to figure out how the spam is being generated. I have suEXEC enabled for cgi-bin, but not for PHP. Hence, I check the user’s CGI scripts whenever a specific uid is available. A very common example of spamming through a CGI script is an incorrectly configured FormMail script.

2) User-Id is that of Apache.

Here the uid obtained from the mail header is that of Apache. But apache, a non-human user, cannot generate spam by itself, which means there must be a crooked human brain behind the Apache server sending the spam mails. We cannot simply guess who the owner of that brain is. We have to reverse engineer it using the basic concepts of how an Apache server works. The email content has to be readable by Apache in order for it to send the mails. I thought about every way in which Apache could read that email content and, depending on the source from which Apache reads it, classified all the possibilities into 3 categories.

a) Email content is available in document root (in HTML, scripts or text files).
b) Email content is available in database.
c) Email content is dynamically provided to forms (given through HTML forms).

2.a) Email content is available in document root.

I don’t know why, but my mind always comes up with this option first. Assume that the email content is available in the webspace. It could be in a script, an HTML page or even a simple text file. To figure out which virtual host holds that email content, you are forced to do an extensive search across all virtual hosts.

You can search for a part of the email content, or maybe even for one of the email addresses in the “To:” part of the email header.

What if your server has over 1000 virtual hosts? Then this trick becomes time-consuming and CPU-intensive. That is why I use this trick only if no other trick works out.
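A minimal sketch of such a search is given here. The function name is made up for the example, and /var/www/vhosts is only an assumed location; use whatever directory holds your virtual hosts.

```shell
# Sketch only: find_spam_source is a hypothetical helper name.

# List files under a directory tree that contain a fixed phrase from the spam.
#   -r  recurse into subdirectories
#   -l  print only the names of matching files
#   -F  treat the phrase as a literal string, not a regular expression
find_spam_source() {
    phrase=$1
    docroot=$2
    grep -rlF -- "$phrase" "$docroot" 2>/dev/null
}

# Example usage (the path is an assumption):
#   find_spam_source "J Crew" /var/www/vhosts
```

On a busy server it may help to run the underlying grep under nice, so that the search does not compete with the web server for CPU.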

2.b) Email content is available in database.

A spammer knows that keeping the email content inside his document root is an easy giveaway, so he tries to hide it. Where can he hide it and still access it fast enough? A database. Every shared web server gives its users database access. So the spammer saves the mail content, and all the email addresses too, inside a database table. He can then start spamming using a very simple database script in PHP or CGI. That’s when you start running into trouble.

I use a Linux server with MySQL. The first thing is to go to the directory that stores the MySQL databases and do a search. The commands I keep handy are

   $ cd /var/lib/mysql

   $ grep -r "spam_mail_content" ./*
   $ grep -r "to_address_in_spam" ./*

If God is gracious enough, or the spammer is foolish enough, we’ll get the database name, since each database is stored in a directory of the same name. From there it is easy to figure out the owner of the database and the virtual host. It’s a cheeky trick, but it gives results.

2.c) Email content is dynamically provided to forms or scripts (given through HTML forms).

This is the most common attack strategy devised by spammers. They think they don’t leave any trail behind, and they are right in most situations. The spammer posts the email content and the addresses to a script which processes them. This could be a very simple HTML form, or the spammer could be misusing a vulnerability in some standard script without the knowledge of the webmaster. Once I was amazed to see a PHP form that asks for a “LIST OF email ids” or “the database” which has it. But how do we trace the script?

Searching for the email content inside the document root or MySQL would not give you any result. Still, Apache would be saving the form content in a temporary file, so we search inside the temporary files. Usually Apache’s temporary files are inside the /tmp directory. Do a grep inside /tmp for part of the email content and read the matching file. Here is a sample that I did.

---------


 [root@my_server /tmp]# grep "J Crew" ./* -r -l

  ./20050708-163842-68.47.122.xx-request_body-BkhAnH

 ----------

 I opened the file, and it contained details like:

 ----------

 Content-Disposition: form-data; name="EmailIDs"


 ----------

That’s the clue that I was looking for. The name is actually the name of an INPUT used in the form, or of an argument passed to the script. I search for that name inside the document roots of the virtual hosts. Unlike step 2.a), here you can be sure that there is a form accepting the email content.
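A rough sketch of that search, limited to files that are likely to hold a form or a script, could look like this. The function name, the list of file extensions and the example path are assumptions made for illustration.

```shell
# Sketch only: find_form_field and the extension list are hypothetical choices.

# List script and HTML files under a directory that mention a form field name.
# find picks files by extension; grep -lF prints the names of those that
# contain the field name as a literal string.
find_form_field() {
    field=$1
    docroot=$2
    find "$docroot" -type f \
        \( -name '*.php' -o -name '*.cgi' -o -name '*.pl' \
           -o -name '*.html' -o -name '*.htm' \) \
        -exec grep -lF -- "$field" {} +
}

# Example usage (the path is an assumption):
#   find_form_field "EmailIDs" /var/www/vhosts
```

Restricting the search by extension keeps it faster than the full content search of step 2.a), at the cost of missing scripts with unusual file names.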

Some of you might ask: what if the temporary file gets deleted before we can even check it? What if the spammer passes only a few arguments in one go? For them, I’ll pray. You can keep a continuous watch on the Apache server-status page to catch any POSTs being made. This requires a lot of patience and concentration.
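Part of that watching can be scripted. The sketch below assumes that mod_status is enabled with extended status, that the status page is reachable at http://localhost/server-status, and that requests appear in it as text like "POST /path HTTP/1.1". The function names and the 2 second interval are arbitrary choices for the example.

```shell
# Sketch only: the URL, the function names and the polling interval are
# assumptions; adjust them to your own server-status setup.

# Read a server-status page on stdin and print each POST request line
# together with the number of times it occurs, most frequent first.
filter_posts() {
    grep -oE 'POST [^ <]+ HTTP/[0-9.]+' | sort | uniq -c | sort -rn
}

# Poll the status page until interrupted with Ctrl-C.
watch_posts() {
    url=${1:-http://localhost/server-status}
    while :; do
        curl -s "$url" | filter_posts
        sleep 2
    done
}

# Example usage:
#   watch_posts
```

A script path that keeps showing up with a high count is a good candidate for the form the spammer is posting to.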

Conclusion

You can easily catch a spammer if you keep your eyes wide open, look for clues that are hidden, and stay persistent. No spammer is as intelligent as a system administrator. They get away because we are lazy enough to let them do it. But chasing, catching and killing spammers should be considered a sysadmin’s virtue.

NOTE: The author does not claim that the above tricks are the ONLY or ALL steps available to catch a spammer.

About the author:
Sojish Krishnan works as a Sr. Engineer at Bobcares.com. Sojish has worked at Bobcares for 4 years and is a passionate advocate of superior customer support. He graduated with a Bachelor’s degree in Computer Science in 2001.