Bobcares

How to find malware and malicious code that anti-malware tools cannot

Sep 28, 2015

Linux servers have a great set of open source anti-malware tools such as Linux Malware Detect and ClamAV + SaneSecurity. These tools do a good job of identifying the vast majority of malware out there. However, they need some time to create signatures from malware samples found in the wild. So, in cases such as zero-day exploits, these anti-malware tools may need anywhere from a few hours to a couple of days to update their virus databases.

For example, take the recent Active VisitorTracker malware campaign. The malware was first seen in the wild on 5th September, but malware signatures became available in popular anti-malware tools only after thousands of WordPress sites were affected on 17th September. As a server owner, you may not have the luxury of waiting until signatures are available. Today we’ll see how to detect malware and malicious code that anti-malware tools cannot detect.

 

Anatomy of malware or malicious code infection

In the case of zero-day exploits, news of attacks usually comes first through security bulletins, community discussions, or hacked files reported by webmasters. When such malware is found, you’ll usually get a code sample, which will look something like this:

eval(gzinflate(base64_decode('pRlrc9o69nN2Zv+DyF2YigCdyFGdz9lYvlgCNsX
KpcCdyFGdz9lYvdCKzR3cafR3biVGbn92bn91cp1DdvJGJJkQCK0wepkyM90TZwl....
....
...(and a lot more unreadable gibberish)
....
Gro7eoJguDPBX4+B8=')));

The code above is obfuscated. Code obfuscation is a favorite way for malware authors to hide their malicious code. It is clever, but it is also a dead giveaway: if a file has something to hide, it probably doesn’t belong on the server. A few PHP functions and patterns commonly appear alongside such gibberish in website files. They are:

eval, exec, gzinflate, base64_decode, str_rot13, gzuncompress, rawurldecode, strrev, ini_set(chr, chr(rand, shell_exec, fopen, curl_exec, popen, \x..\x..

As we’ll see later, we can use the presence of gibberish to detect infected files.
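One rough way to look for gibberish directly is to search for very long unbroken runs of base64-style characters. The sketch below is only an illustration: find_blobs is a hypothetical helper name, the 200-character threshold is an assumption you should tune, and it assumes GNU grep (for --include). Minified or legitimately encoded files can also match.

```shell
#!/usr/bin/env bash
# Sketch: list PHP files that contain a long run of base64-style characters.
# find_blobs is a hypothetical helper; the default threshold of 200 is an assumption.
find_blobs() {
    local dir="$1" min="${2:-200}"
    # -r recurses, -l prints only file names, -E enables extended regex.
    # --include restricts the search to *.php files (GNU grep option).
    grep -rlE --include='*.php' "[A-Za-z0-9+/=]{${min},}" "$dir"
}

# Example (the path is a placeholder):
#   find_blobs /home/example/public_html 200
```

Treat every hit as a file to review by hand, not as a confirmed infection.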

Another common sign of a malware infection is an unexplained change in file modification times. If the website owner did not make any changes, how did the files change? Again, this is a good way to detect infected files.
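To eyeball recent changes, it helps to list files together with their modification times, newest first. This is a minimal sketch assuming GNU find (the -printf action is a GNU extension); recent_changes is a hypothetical helper name and the 7-day default is just an example.

```shell
#!/usr/bin/env bash
# Sketch: print "modification-time path" for files changed in the last N days.
# recent_changes is a hypothetical helper; N defaults to 7.
recent_changes() {
    local dir="$1" days="${2:-7}"
    # %T+ is the modification time in a sortable format, %p is the file path.
    # sort -r puts the most recently modified files at the top.
    find "$dir" -type f -mtime "-${days}" -printf '%T+ %p\n' | sort -r
}

# Example (the path is a placeholder):
#   recent_changes /home/example/public_html 7
```

Compare the list against what the website owner says was changed; anything nobody can account for deserves a closer look.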

So, to summarize, if you suspect there are infected files on your server (or website) that your anti-malware tools did not detect, you have 3 things to search for:

  1. The exact malware code (if you get it from community discussions or security bulletins).
  2. Malware-like code that looks like gibberish.
  3. Recent file modification time stamps.

At Bobcares, we are big fans of anti-malware tools, but we are even bigger believers in proactive security management. That is why we keep a close watch on security news outlets, learn about new malware as it is released, and weed it out even before malware signatures are available. We do it with a few Linux commands, as explained below.

 

Using Bash shell scripting in Linux servers to locate infected files

Linux servers provide some very useful tools to find recently modified files and search for specific patterns. Using the Bash shell, you can string these tools together to get a list of files that could contain malicious code. Here’s how it looks on a server where website files are located in /home/[account-name]/public_html/:

# find /home/*/public_html/ -maxdepth 4 -type f -mtime -7 -exec egrep -q 'eval\(|exec\(|gzinflate\(|base64_decode\(|str_rot13\(|gzuncompress\(|rawurldecode\(|strrev\(|ini_set\(chr|chr\(rand\(|shell_exec\(|fopen\(|curl_exec\(|popen\(|\\x..\\x..' {} \; -print > /tmp/suspected-malware.txt

Now, let’s see how this command works:

find – This is a Linux tool, installed by default on most servers, that searches for files.

/home/*/public_html/ – This is the path where find looks for files. The shell expands the * to every directory name under /home.

-type f – This tells find to consider only files, not directories, so egrep is never run on a directory.

-mtime -7 – This matches only files that were modified within the last 7 days.

-maxdepth 4 – This limits the search to 4 directory levels below public_html, which makes find run faster.

-exec egrep -q 'pattern' {} \; – This passes each file found by find to egrep, which looks for the malicious code patterns in that file. The -q option suppresses egrep’s own output; find only uses its exit status.

-print – This outputs the file name if egrep found a malware pattern in the file.

> /tmp/suspected-malware.txt – This stores the output in /tmp/suspected-malware.txt, one file name per line.
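The command above starts one egrep process per file. On large accounts, a variant that hands files to grep in batches may be faster. The sketch below is one possible way to do that, not the only one: scan_suspects is a hypothetical helper name, and it uses grep -lE (print names of matching files) with the `{} +` form of -exec instead of -q and -print.

```shell
#!/usr/bin/env bash
# Sketch: same search as the one-liner, but grep receives many files per call.
# scan_suspects is a hypothetical helper; days defaults to 7.
scan_suspects() {
    local root="$1" days="${2:-7}"
    # Same function list as in the article; \\x..\\x.. matches hex escapes like \x47\x4c.
    local pattern='eval\(|exec\(|gzinflate\(|base64_decode\(|str_rot13\(|gzuncompress\(|rawurldecode\(|strrev\(|ini_set\(chr|chr\(rand\(|shell_exec\(|fopen\(|curl_exec\(|popen\(|\\x..\\x..'
    # "{} +" appends as many file names as fit on one grep command line.
    # grep -l prints only the names of files that contain a match.
    find "$root" -maxdepth 4 -type f -mtime "-${days}" -exec grep -lE "$pattern" {} +
}

# Example (the path is a placeholder):
#   scan_suspects /home/example/public_html 7 > /tmp/suspected-malware.txt
```

The exit status is non-zero when a batch has no matches, so rely on the printed list rather than the return code.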

If you have the exact malware pattern from a community report, you can search for it directly. For example, the VisitorTracker malware had the string visitorTracker_isMob in its code, so I used the find command below:

# find /home/*/public_html/ -maxdepth 4 -type f -exec egrep -q 'visitorTracker_isMob' {} \; -print
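Because a known signature is a literal string rather than a regular expression, a fixed-string search is a reasonable alternative. Here is a small sketch, assuming a grep that supports -r (GNU grep does); scan_signature is a hypothetical helper name, and unlike the find command it does not limit the directory depth.

```shell
#!/usr/bin/env bash
# Sketch: list every file under a directory that contains a literal signature.
# scan_signature is a hypothetical helper name.
scan_signature() {
    local root="$1" sig="$2"
    # -F treats the signature as a plain string, so regex characters need no escaping.
    # -r recurses into subdirectories, -l prints only the names of matching files.
    grep -rlF -- "$sig" "$root"
}

# Example (the path is a placeholder):
#   scan_signature /home/example/public_html visitorTracker_isMob
```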

Note: When you use PHP functions as the search pattern, non-infected files can turn up in the list. This is because some developers obfuscate their code (much as hackers do) to protect intellectual property, and legitimate code also calls functions such as fopen and curl_exec. Use the function search method only if you cannot get a sample of the malware code from the community.
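When the list contains false positives, it can help to decide which files to open first. One heuristic, which is my assumption and not a rule, is that obfuscated payloads tend to sit on a single very long line. The sketch below ranks files by their longest line; rank_by_longest_line is a hypothetical helper name, and it reads file names from standard input.

```shell
#!/usr/bin/env bash
# Sketch: rank suspected files by the length of their longest line.
# rank_by_longest_line is a hypothetical helper; it reads one file name per line.
rank_by_longest_line() {
    local file longest
    while IFS= read -r file; do
        # Skip names that no longer point to a regular file.
        [ -f "$file" ] || continue
        # awk remembers the longest line it has seen; max+0 prints 0 for empty files.
        longest=$(awk 'length($0) > max { max = length($0) } END { print max + 0 }' "$file")
        printf '%s\t%s\n' "$longest" "$file"
    done | sort -rn
}

# Example:
#   rank_by_longest_line < /tmp/suspected-malware.txt | head
```

Minified JavaScript and CSS also have long lines, so this only sets a review order; it does not prove anything about a file.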

 

Conclusion

New website malware comes out all the time. Even the best anti-malware vendors may not catch every piece of malware on the first day of infection. Knowing a few Linux command line tools can come in handy for detecting emerging threats.


1 Comment

  1. Georgi Todorov

    Wow, this rocks!

    I have previously thought about such ideas, but you do it with a single command!

    Very helpful!
