SpiderLabs Blog

Setting HoneyTraps with ModSecurity: Adding Fake robots.txt Disallow Entries

August 26, 2013 7 minutes read Ryan Barnett

The following blog post is taken from Recipe 3-2: Adding Fake robots.txt Disallow Entries in my new book "Web Application Defender's Cookbook: Battling Hackers and Protecting Users". Additionally, if this topic interests you and you would like to have official, hands-on training, register for the upcoming OWASP AppSecUSA conference where I will be giving a full 2-day training class based on my book.

All warfare is based on deception. Hold out baits to entice the enemy. Feign disorder, and crush him.

— Sun Tzu in "The Art of War"

Signature Detection is Insufficient

We, as web application defenders, have a challenging task. We must try to defend web applications from attackers without the benefit of knowledge of the application internals. Without this information, it may be difficult to identify malicious behavior hidden in a flood of legitimate traffic.

How do normal users interact with the application resources? If you understand how normal users utilize the web application, you should be able to identify when attackers deviate from this usage profile. Unfortunately, many organizations attempt to use signature-based detection systems to identify malicious behavior. This approach often blocks legitimate clients by accident and, worse, misses real attackers altogether. How can we change the game so the odds of identifying malicious users are in our favor?

Rather than searching through "haystacks" of legitimate traffic looking for the malicious attack "needles", we need a method of removing the haystacks altogether. If we can set up a method that removes all normal user traffic, what we are left with would be abnormal traffic. This brings us to the concept of honeypots, or as we will be implementing them with ModSecurity, honeytraps.

Honeytrap Concepts

The honeytrap concept is rather simple: honeytraps are essentially booby traps built into the application. Whereas honeypot systems are separate hosts that you deploy within your network to act as targets, honeytraps are dynamically planted throughout a web application and act as a virtual minefield for would-be attackers. They have no valid production purpose and there is no authorized interaction with or alteration of them, so any interaction is most assuredly suspicious.

A honeytrap's true value lies in quickly distinguishing malicious users from benign ones. This detection value comes from the way honeytraps work as tripwires during the various phases of an attack. Before an attacker can launch a malicious request, they must first conduct some reconnaissance of the application to understand its layout, construction and technologies in use. They must find out how the application handles authentication, authorization, session management and input validation. It is during these initial reconnaissance stages that honeytraps can easily spot users with malicious intent when they attempt to manipulate the data. Honeytrap concepts are extremely simple, yet extremely effective.

Let's take a closer look at the three main benefits of utilizing honeytraps:

High Fidelity Alerts – Since all honeytrap activity is, by definition, unauthorized, it is extremely effective at reducing false positive alerts.
Smaller Number of Alerts – A honeytrap only generates alerts when a client either interacts with or manipulates it. This results in a much lower number of alerts that a security analyst would need to validate.
Identifying False Negatives – Negative security signatures do not catch all data manipulation attacks as they have no knowledge of what the data was supposed to be. Honeytraps, on the other hand, excel at identifying data manipulation attacks as the payloads are known in advance.

As you can see, the concept of honeytraps is very easy to understand and implement. Simply lay your honeytraps throughout your web application and if anything is altered, you have most likely identified an attacker.

Recipe 3-2: Adding Fake robots.txt Disallow Entries

This recipe shows you how to add fake Disallow entries to the robots.txt file and alert when clients attempt to access those locations.

Ingredients

Robots Exclusion Standard

The Robots Exclusion Standard was created as a means to allow website owners to advise search engine crawlers which resources they were allowed to index. They do this by placing a file called robots.txt in the website's document root. In this file, the administrator of the site can include allow and disallow commands to instruct web crawlers which resources to access. Here are some real examples of robots.txt entries:

User-agent: *
Allow: /

User-agent: Googlebot
Disallow: /backup/
Disallow: /cgi-bin/
Disallow: /admin.bak/
Disallow: /old/

The first entry means that all crawlers are allowed to access and index the entire site. The second entry, however, states that Google's Googlebot crawler should not access four different directories. Given the names of these directories, this makes sense: they may contain sensitive data or files that the website owners do not want indexed by Google.

While all of this makes sense and serves a legitimate purpose, do you see a problem with using the robots.txt file? The Robots Exclusion Standard is merely a suggestion and does not serve as access control. The issue is that you are basically telling external clients about specific sensitive areas of your website that you don't want them poking around in. Malicious users and their tools will not abide by these entries. They will most assuredly try to access those locations. Therein lies our opportunity to lay another honeytrap detection point.
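To see why these entries attract attention, here is a minimal Python sketch of the first step a reconnaissance script might take: pulling every Disallow path out of a robots.txt body. The function name `disallowed_paths` and the `SAMPLE` text are illustrative assumptions and do not come from any particular tool.

```python
def disallowed_paths(robots_txt: str) -> list[str]:
    """Return every Disallow path in a robots.txt body, ignoring User-agent groups."""
    paths = []
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments, then split "Field: value" on the first colon.
        line = raw_line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


# Hypothetical robots.txt body, mirroring the example entries above.
SAMPLE = """\
User-agent: *
Allow: /

User-agent: Googlebot
Disallow: /backup/
Disallow: /cgi-bin/
Disallow: /admin.bak/
Disallow: /old/
"""

if __name__ == "__main__":
    for path in disallowed_paths(SAMPLE):
        print(path)
```

A script like this pays no attention to which User-agent a group targets; every listed path simply becomes a candidate to probe.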

Dynamically Updating the robots.txt file

With ModSecurity, we can dynamically insert our own honeytrap robots.txt entries. First, you must enable the following two directives:

SecContentInjection On
SecStreamOutBodyInspection On

These directives tell ModSecurity that we want the ability to inject data into the live response stream (using the prepend or append actions) or to modify it (using the @rsub operator). With these directives in place, we can then use the following rule to add a fake honeytrap robots.txt entry.

SecRule REQUEST_FILENAME "@streq /robots.txt" \
    "id:'999005',phase:4,t:none,nolog,pass,append:'Disallow: /db_backup.%{time_epoch}/ # Old DB crash data'"

This rule will silently append a fake directory location to the end of the legitimate robots.txt data. When the attacker now accesses the robots.txt file, this is how it would appear:

User-agent: *
Allow: /

User-agent: Googlebot
Disallow: /backup/
Disallow: /cgi-bin/
Disallow: /admin.bak/
Disallow: /old/
Disallow: /db_backup.1331084262/ # Old DB crash data

Notice the new honeytrap Disallow entry at the end. You should try to make the name of your directory, and any comments after it, enticing to would-be attackers. In this case, we have named our honeytrap directory so that it appears to contain database crash dump data. This tidbit would be almost irresistible to an attacker. Now that we have laid out our honeytrap, we next need to write the detection rule that catches any client that tries to access this location.

SecRule REQUEST_FILENAME "^/db_backup.\d{10}" \
    "id:'999006',phase:1,t:none,log,block,msg:'HoneyTrap Alert: Disallowed robots.txt Entry Accessed.',logdata:'%{matched_var}',setvar:ip.malicious_client=1"

This rule will identify if any client accesses our honeytrap Disallow location and it will then set a variable in the IP Collection labeling this client as malicious. The ModSecurity debug log shows the following processing when a client accesses this honeytrap Disallow location:

Recipe: Invoking rule b81169b8; [file "/etc/apache2/modsecurity-crs/base_rules/modsecurity_crs_15_custom.conf"] [line "5"] [id "999006"].
Rule b81169b8: SecRule "REQUEST_FILENAME" "@rx ^/db_backup.\\d{10}" "phase:1,id:999006,t:none,log,block,msg:'HoneyTrap Alert: Disallowed robots.txt Entry Accessed.',logdata:%{matched_var},setvar:ip.malicious_client=1"
Transformation completed in 0 usec.
Executing operator "rx" with param "^/db_backup.\\d{10}" against REQUEST_FILENAME.
Target value: "/db_backup.1331084275/"
Operator completed in 8 usec.
Setting variable: ip.malicious_client=1
Set variable "ip.malicious_client" to "1".
Resolved macro %{matched_var} to: /db_backup.1331084275/
Warning. Pattern match "^/db_backup.\\d{10}" at REQUEST_FILENAME. [file "/etc/apache2/modsecurity-crs/base_rules/modsecurity_crs_15_custom.conf"] [line "5"] [id "999006"] [msg "HoneyTrap Alert: Disallowed robots.txt Entry Accessed."] [data "/db_backup.1331084275/"]
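The injection rule and the detection rule have to agree: the path produced by `%{time_epoch}` must be one that the detection pattern matches. A small Python sketch can check this offline. It assumes that `%{time_epoch}` expands to the current Unix time in whole seconds (consistent with the 10-digit values in the examples above) and that Python's `re` module handles this simple pattern the same way ModSecurity's `@rx` operator does. The helper names are hypothetical.

```python
import re
import time

# Same pattern as rule 999006. Note that the unescaped "." matches any character.
HONEYTRAP_RX = re.compile(r"^/db_backup.\d{10}")


def honeytrap_entry(epoch=None):
    """Model the Disallow line that rule 999005 appends (a model, not ModSecurity itself)."""
    if epoch is None:
        epoch = int(time.time())
    return f"Disallow: /db_backup.{epoch}/ # Old DB crash data"


def is_honeytrap_hit(request_filename):
    """Mirror the @rx check performed against REQUEST_FILENAME."""
    return HONEYTRAP_RX.search(request_filename) is not None


if __name__ == "__main__":
    entry = honeytrap_entry(1331084262)
    path = entry.split()[1]  # second whitespace-separated token is the path
    print(entry)
    print(path, is_honeytrap_hit(path))
    # The unescaped dot also accepts other separators.
    print(is_honeytrap_hit("/db_backup-1331084262/"))
    # A legitimate Disallow entry does not trip the rule.
    print(is_honeytrap_hit("/backup/"))
```

Escaping the dot (`\.`) would tighten the match, although for a honeytrap the looser pattern only widens what the tripwire catches.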

Implement Fake Basic Authentication

An extension to the concept of a fake directory is to add a layer of fake authentication. This is useful on two fronts:

By replying to a request with a 401 Authorization Required HTTP response code, you are making the honeytrap resource appear more real.
When faced with an authentication challenge-response, attackers will often attempt to either manually enter some default username and password credential combinations or try a fully automated attack. In this scenario, we have won this battle as there is no correct authentication combination and the attacker is wasting their time attempting to brute force the credentials.

We can update the previous ModSecurity SecRule to include this fake authentication response by changing the phase, adding in a deny action, and instructing ModSecurity to issue a 401 response code.

SecRule REQUEST_FILENAME "^/db_backup.\d{10}" \
    "id:'999011',phase:3,t:none,log,deny,status:401,msg:'HoneyTrap Alert: Disallowed robots.txt Entry Accessed.',logdata:'%{matched_var}',setvar:ip.malicious_client=1, setenv:basic_auth=1"

Header always set WWW-Authenticate "Basic realm=\"Admin\"" env=basic_auth

Note that when this rule triggers, it sets an Apache environment variable named basic_auth. The final Apache Header directive is then executed only if that environment variable is set, and it adds the WWW-Authenticate response header.
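The interplay between `setenv` and the conditional `Header` directive can be modeled in a few lines of Python. This is only a sketch of the decision logic, not of Apache or ModSecurity internals: the function name `respond` is hypothetical, and the 200 status for non-honeytrap paths is a stand-in assumption for whatever the real application would return.

```python
import re

# Same pattern as rule 999011.
HONEYTRAP_RX = re.compile(r"^/db_backup.\d{10}")


def respond(path):
    """Return (status, headers) as modeled for a request to the given path."""
    env = {}       # stands in for the per-request Apache environment
    headers = {}
    status = 200   # assumption: placeholder for the application's normal response

    # Rule 999011: deny with status 401 and flag the request via setenv.
    if HONEYTRAP_RX.search(path):
        status = 401
        env["basic_auth"] = "1"

    # Header always set WWW-Authenticate ... env=basic_auth
    if "basic_auth" in env:
        headers["WWW-Authenticate"] = 'Basic realm="Admin"'

    return status, headers


if __name__ == "__main__":
    print(respond("/db_backup.1331278051/"))
    print(respond("/index.html"))
```

Keeping the header conditional on the environment variable means ordinary responses never carry the challenge; only requests that hit the honeytrap do.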

Now, when an attacker decides to access our honeytrap resource from the robots.txt file, they will be greeted with an HTTP basic authentication pop-up box as shown below.

[Screenshot: browser HTTP basic authentication pop-up box for the honeytrap resource]

If the attacker attempts to authenticate to our honeytrap resource, we can then use the following ruleset to extract and decode the credentials.

SecRule REQUEST_FILENAME "^/db_backup.\d{10}" \
    "chain,id:'999012',phase:1,t:none,log,msg:'HoneyTrap Alert: Authentication Attempt to Fake Resource.',logdata:'Credentials used: %{matched_var}'"
    SecRule REQUEST_HEADERS:Authorization "^Basic (.*)" "chain,capture"
        SecRule TX:1 ".*" "t:base64Decode"

The last rule uses the base64Decode transformation function to decode the submitted data in the Authorization request header. Here is a debug log section showing how this processing works:

Recipe: Invoking rule b7aae038; [file "/etc/apache2/modsecurity-crs/base_rules/modsecurity_crs_15_custom.conf"] [line "12"] [id "999012"].
Rule b7aae038: SecRule "REQUEST_FILENAME" "@rx ^/db_backup.\\d{10}" "phase:1,deny,chain,id:999012,t:none,log,msg:'HoneyTrap Alert: Authentication Attempt to Fake Resource.',logdata:'Credentials used: %{matched_var}'"
Transformation completed in 1 usec.
Executing operator "rx" with param "^/db_backup.\\d{10}" against REQUEST_FILENAME.
Target value: "/db_backup.1331278051/"
Operator completed in 4 usec.
Rule returned 1.
Match -> mode NEXT_RULE.
Recipe: Invoking rule b7aaedc8; [file "/etc/apache2/modsecurity-crs/base_rules/modsecurity_crs_15_custom.conf"] [line "13"].
Rule b7aaedc8: SecRule "REQUEST_HEADERS:Authorization" "@rx ^Basic (.*)" "chain,capture"
Transformation completed in 0 usec.
Executing operator "rx" with param "^Basic (.*)" against REQUEST_HEADERS:Authorization.
Target value: "Basic YWRtaW46UGFzc3dvcmQxMjM0"
Added regex subexpression to TX.0: Basic YWRtaW46UGFzc3dvcmQxMjM0
Added regex subexpression to TX.1: YWRtaW46UGFzc3dvcmQxMjM0
Operator completed in 24 usec.
Rule returned 1.
Match -> mode NEXT_RULE.
Recipe: Invoking rule b7aaf368; [file "/etc/apache2/modsecurity-crs/base_rules/modsecurity_crs_15_custom.conf"] [line "14"].
Rule b7aaf368: SecRule "TX:1" "@rx .*" "t:base64Decode"
T (0) base64Decode: "admin:Password1234"
Transformation completed in 7 usec.
Executing operator "rx" with param ".*" against TX:1.
Target value: "admin:Password1234"
Operator completed in 3 usec.
Resolved macro %{matched_var} to: admin:Password1234
Warning. Pattern match ".*" at TX:1. [file "/etc/apache2/modsecurity-crs/base_rules/modsecurity_crs_15_custom.conf"] [line "12"] [id "999012"] [msg "HoneyTrap Alert: Authentication Attempt to Fake Resource."] [data "Credentials used: admin:Password1234"]

As the base64Decode and [data "Credentials used: ..."] entries in the debug log show, the attacker sent the following credentials:

Username = admin
Password = Password1234
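The same decoding can be reproduced outside ModSecurity when reviewing logged Authorization headers. The following Python sketch mirrors the chained rule: match the `Basic` scheme, capture the token, and base64-decode it. The function name `decode_basic_auth` is hypothetical, and treating the decoded bytes as UTF-8 is an assumption.

```python
import base64
import binascii


def decode_basic_auth(header_value):
    """Return (username, password) from a Basic Authorization header, or None if malformed."""
    # Equivalent of "^Basic (.*)": require the scheme, capture the rest.
    scheme, _, token = header_value.partition(" ")
    if scheme != "Basic" or not token:
        return None
    try:
        # Equivalent of t:base64Decode on the captured value.
        decoded = base64.b64decode(token, validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        return None
    # Basic credentials are "username:password"; split on the first colon only.
    username, sep, password = decoded.partition(":")
    if not sep:
        return None
    return username, password


if __name__ == "__main__":
    # Header value taken from the debug log above.
    print(decode_basic_auth("Basic YWRtaW46UGFzc3dvcmQxMjM0"))
```

Splitting on the first colon only keeps passwords that themselves contain colons intact.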

Conclusion

As you can see from this example, setting HoneyTraps can be pretty easy and can also provide great value. Not only are we able to identify when an attacker is attempting to access restricted resources, but we are also able to force the attacker to waste time attempting to brute-force our fake authentication. This gives web defenders valuable time to respond to the threat. Keep an eye out for more examples of Setting HoneyTraps with ModSecurity.
