SpiderLabs Blog

The Keystone Rocks - Foundation Chips of Pentesting Tips Part 1

The knowledgebase of a penetration tester can be broadly split into two categories: Relevant knowledge and Meaningless knowledge. These can also be thought of as Non-persistent knowledge and Persistent knowledge respectively, although I should highlight that these alternative labels are not entirely interchangeable; that is to say, we cannot simply deduce that meaningless knowledge is persistent, or that non-persistent knowledge is relevant. It could very well be the case that the only relevant knowledge for a given situation was persistent, and that what was non-persistent also transpired to be irrelevant.

Perhaps it helps if I expand upon what I mean exactly: Persistent knowledge is only eventually meaningless to the penetration tester in the absence of conditions applicable to any Relevant (or non-persistent) knowledge. It is also persistently the case that the Meaningless knowledge we will discuss can never result in successful penetration of a network by itself; it is defined as meaningless because it has nothing whatsoever to do with insecurity or insecure conditions. However, without this Meaningless knowledge, the Relevant knowledge is much less effective - ergo, as with anything, what is truly meaningless is limited to a specific and finite set of conditions or scope.

To use an analogy by way of compressed air: a tank of compressed air is completely meaningless upon the surface of the Earth in relation to breathing (unless you happen to be in an area where there is poison gas, or are atop Mount Everest), but is extremely meaningful by itself if one is underwater. Despite the lack of air, it is also meaningless in orbit without the complement of a space suit, as the freezing (or boiling, if in sunlight) temperatures and differentials in pressure will cause death before you could take a breath.

This series of posts will focus entirely upon the Meaningless knowledge and therefore begin with a brief and one-off elaboration of the Relevant category by way of contrast...

Security bugs get fixed, in one way or another - either directly at the source, or their radius of propagation, and therefore of possible exploitation, is limited by the introduction of a secondary system. This is the essence of the non-persistent nature of Relevant penetration testing knowledge. A penetration tester can be effective at their job having assimilated only a recent portion of the catalogue of insecure conditions to date. Very old vulnerabilities have an escalating probability of being irrelevant and therefore also (arguably) meaningless. That being said, ignorance of ancient vulnerabilities may be a grievous weakness in the face of grotesque stasis, and so new penetration testers should start at the top and "work downwards". This is a massive task, and therefore, to eliminate the risk of this article becoming irrelevant, we focus upon the other, Meaningless and Persistent, aspects of knowledge the reader may not yet possess - timeless skills and techniques which transcend circumstance, have nothing directly to do with insecurity, and are uniformly applicable and beneficial to effectiveness and efficiency during penetration tests. A platform and scaffold to support the active ammunition in our transient, shifting knowledgebase.

It also goes without saying that this "top down" approach, whilst broadly true, is a simplified generalisation, as the definition of "vulnerability" is complex - certainly from a penetration tester's perspective, many interesting avenues for investigation and insecure aspects do not have CVE numbers and are merely configuration defaults, residual due to the momentum in the design decisions of various vendors. As to why some persistent security-related knowledge is arguably meaningless? For the same reason most of us have lost the ability to hunt with a spear: such skills are obsolete and generally inapplicable in the modern world. Yet, Neanderthal penetration testers have not forgotten how to exploit the IIS 4.0 double decode vulnerability, for example. Nowadays we can of course unanchor this knowledge from its historic circumstance and identify aspects which may be persistently meaningful, for instance by reflecting that directory traversal attacks are still often found in the wild to the present day...

Having digressed into relevance, we continue in our serious context, because without a lot of this so-termed "meaningless" knowledge it will be very difficult to be a good penetration tester. A good penetration tester needs many generic skills related to the organisation and manipulation of data. This is what this series is about - increasing your persistent knowledge such that what is directly meaningless to insecurity itself provides a continuous basis from which to support your ongoing hacking attempts. A zero and a one have been the same since the dawn of numbers, since before digitalisation, and will ever remain so.

The Nitty-Gritty

Enough of this wordplay and intellectual tomfoolery - fun as it is to think about - let's get down to it. One may presume that any and all tools and utilities used continuously (with a persistent relevance in a non-security context) during penetration tests are already present on common penetration testing distributions. If that were the case I'd have nothing to write about...We'll be covering two of the most basic and common use cases in this article. As the series progresses, new topics will be introduced, and converge into a metaphorical Swiss army knife. Combination techniques will then be discussed, with a prerequisite of all the Meaningless knowledge discussed and accumulated thus far.

Throughout the series, the tools we discuss will be released in the "Keystone" package, available on the SpiderLabs GitHub.

Sorting by IP Address

As penetration testers we very often wish to organise IPv4 addresses into their correct order, for various reasons. Take a look at the following short list of unordered IPs:

If we just run this through the UNIX utility "sort", we get the following:

A reasonable attempt; however, when sorting octets numerically, "2" should come before "192", and "100" should come after "2". So we try "sort -n" for a numerical sort:

Better, but the 3rd and 4th lines still need to be swapped. A quick Internet search reveals we can make sort do what we want thus:

sort -n -t . -k 1,1 -k 2,2 -k 3,3 -k 4,4

This gives the correctly IP-sorted list:
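As a self-contained illustration, the full progression with a hypothetical set of addresses (any list of IPv4 addresses will behave the same way):

```shell
# A hypothetical unordered list of IPv4 addresses
printf '192.168.0.100\n2.10.0.1\n192.168.0.2\n10.0.0.5\n' > ips.txt

# Plain "sort" orders lexically, so "192.168.0.100" lands before "192.168.0.2"
sort ips.txt

# "sort -n" only keys on the leading digits of each whole line
sort -n ips.txt

# Keying each dot-separated octet numerically gives the true IP order
sort -n -t . -k 1,1 -k 2,2 -k 3,3 -k 4,4 ips.txt
```

The final command prints 2.10.0.1, 10.0.0.5, 192.168.0.2, 192.168.0.100 - each octet compared as a number.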

One could wrap up the solution to the problem by making a script called "ipsort" which looks like this:

#!/bin/sh
if [ -r "$1" ]; then
    sort -n -t . -k 1,1 -k 2,2 -k 3,3 -k 4,4 "$1"
else
    sort -n -t . -k 1,1 -k 2,2 -k 3,3 -k 4,4 -
fi

Then we can either do ipsort <file> or cat <file> | ipsort for flexibility. This works, but only for certain situations...

Limitation 1 - Files are not always nicely organised

The initial solution fails for cases where IP addresses appear within a wrapper of other data containing any full stops (periods), such as:


If you run the above through our initial "ipsort" solution, the order won't change. That's because it already satisfies the sorting condition, which fails to distinguish the IP address octets from the numerical indices.
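A hypothetical numbered listing makes the failure concrete - the sort keys land on the line indices rather than the octets:

```shell
# Hypothetical wrapper data: an index, a full stop, then the IP address
printf '1. 10.0.0.20\n2. 10.0.0.3\n' > wrapped.txt

# Field 1 is now the index ("1", "2"), so the output order follows the
# indices and 10.0.0.20 stays above 10.0.0.3 despite being the larger IP
sort -n -t . -k 1,1 -k 2,2 -k 3,3 -k 4,4 wrapped.txt
```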

Limitation 2 - Sometimes we just want the IP addresses

Very often the reason for sorting a file by IP address transcends the sorting process itself; frequently it is only the list of IP addresses which is of interest. Consider the following:

Nmap scan report for
Nmap scan report for
Nmap scan report for
Nmap scan report for

Here, our original "ipsort" script is effective as there are no other full stops and data left of the initial sorting point is identical, resulting in a correct line order:

Nmap scan report for
Nmap scan report for
Nmap scan report for
Nmap scan report for

However, the "Nmap scan report for" prefix is uninteresting because it's repetitive and prevents any useful comparison of the data. We would need to strip it manually by piping to cut -f 5 -d ' ' or a similar command.
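Stripping the prefix with cut and then sorting might look like this (the addresses here are invented for illustration):

```shell
# Mocked-up Nmap output: the IP address is the fifth space-delimited field
printf 'Nmap scan report for 10.0.0.9\nNmap scan report for 10.0.0.2\n' > scan.txt

# Discard the prefix, then sort the bare addresses by octet
cut -f 5 -d ' ' scan.txt | sort -n -t . -k 1,1 -k 2,2 -k 3,3 -k 4,4
```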

Solution - Improved ipsort

Let's write a more flexible variant using Perl. Various solutions exist on the Internet for IP sorting, so we wrap one into a Perl sort function here.

sub ip {
    @oa = split /\./, $a;
    @ob = split /\./, $b;

    $oa[0] <=> $ob[0] ||
    $oa[1] <=> $ob[1] ||
    $oa[2] <=> $ob[2] ||
    $oa[3] <=> $ob[3];
}

while (<>) {
    /(\d{1,3}\.){3}\d{1,3}/ or next;
    push @ips, $&;
}

print join("\n", sort ip @ips)."\n";

This script is more effective, as it first extracts only the IP address from each line in the file, discarding any other data. If we run this on our "Nmap scan report for" example we get just the sorted list of IPs. A more powerful variant of this script, which has several options (including options for identifying multiple IP addresses appearing on a single line and the ability to sort unique) can be found in the Keystone package on GitHub.

The benefits of the new and more powerful "ipsort" utility are obvious. Not only can we search for the presence of IP addresses in a file and extract them, we can now invoke the "ipsort" function in the standard UNIX way using pipes in any other manipulation process - even just in a grepping capacity with automatic sort. This is exceptionally useful.

Comparing lists of IP addresses robustly

Many readers may be familiar with the standard UNIX utility comm. Briefly, this provides a good way of comparing two files which are sorted lexically (non-numerically). As penetration testers, it is very common to wish to compare two files containing nothing other than IP addresses. There are numerous occasions when we might want to do this - let's suppose we have a list of IP addresses with port 80 open, and another with port 443 open. We wish to know which IP addresses have only port 80 and not 443, which have only 443 and not 80, and which have both. We could do this visually quite easily for small sets of data, but the comparison becomes difficult for humans when there are more than a dozen or so instances of each.

If we attempt to use our list of IP addresses sorted in IP-address order from before, comm will not function as intended. For a refresher, the first column displays lines unique to the first file, the second those unique to the second, and the third those common to both. You can suppress any column using -1, -2, or -3, or any combination thereof.
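A hypothetical pair of lexically sorted host lists illustrates the three columns and the suppression flags:

```shell
# Two lexically sorted, hypothetical lists: hosts with port 80 and port 443 open
printf 'a.example\nb.example\nc.example\n' > port80.txt
printf 'b.example\nd.example\n' > port443.txt

# Column 1: port 80 only; column 2: port 443 only; column 3: both
comm port80.txt port443.txt

# Suppress columns 1 and 2 to keep only hosts with both ports open
comm -12 port80.txt port443.txt
```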

$ cat list1.txt

Now we insert another IP address in our second file:

$ cat list2.txt

Let's try comm on the two files:

$ comm list1.txt list2.txt

Clearly, the results are nonsensical because the lines do not appear in lexical order. In order to address the problem, we must first sort them lexically. Sort by default achieves this:

$ sort list1.txt | tee list1-lexical.txt

$ sort list2.txt | tee list2-lexical.txt

$ comm list1-lexical.txt list2-lexical.txt

This gives the correct and anticipated result - every line is common to both files, with the exception of the newly inserted address, which is unique to list2-lexical.txt.

As penetration testers, it is common to want lists of IP addresses sorted by IP address, and it's common to want to compare such lists. We could always sort our files lexically before using comm, but this process becomes tedious given the frequency of use. We need to create a utility which wraps "comm" in a more useful way.

Along comes the utility "scomm", which is available in the SpiderLabs GitHub Keystone package. This blog post does not discuss the specifics of its creation; suffice it to say, the utility is exactly like comm (because it invokes it), but it first sorts the files appropriately. Therefore we no longer have to worry about the order of the lines, and can concentrate on comm's primary function - its ability to tell us which lines are unique or common to the two compared files. You will never use comm by itself again.
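The internals aren't shown in the post, but a minimal sketch of such a wrapper might look like this (hypothetical; the real Keystone utility may differ):

```shell
# scomm-style sketch: lexically sort both files, then hand off to comm,
# forwarding any leading options (e.g. -12) unchanged. Requires bash for
# process substitution.
scomm() {
    local opts=()
    while [[ $1 == -* ]]; do opts+=("$1"); shift; done
    comm "${opts[@]}" <(sort "$1") <(sort "$2")
}

# Usage is identical to comm, but the inputs need not be pre-sorted:
# scomm -12 list1.txt list2.txt
```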

This concludes the first two tools and techniques released by this series. Look out for Part 2, where we will be discussing grepping for penetration testing.
