List comprehension one liners to extract info from nmap scans using Python and libnmap

  1. Setup
  2. Port information
  3. SSL/TLS and HTTP/HTTPS
  4. Other Service Information
  5. Random Stuff
  6. Ways to use the results
  7. Conclusion

When I perform internal penetration tests involving a large number of hosts and services, it's useful to be able to quickly extract certain sets of information from nmap scan data in an automated fashion.  This helps when running automated tests against various service types, such as directory brute forcing on web servers, SSL/TLS cipher and protocol testing on SSL/TLS servers, and other targeted tests on particular products or protocols.

I do a lot of my processing during a pentest either from IPython or the *nix shell, so being able to access this information from Python, where I can use it directly in scripts or the REPL, or write it to disk for access with shell commands, is extremely useful.

To this end, the libnmap library proves extremely useful.  This post covers a number of list comprehension "one liners" that use the NmapParser class from libnmap to pull this information into a Python environment, where it can easily be used for other purposes, such as driving a loop of other actions or writing it to disk in a text file.

This is largely for my own use, so I don't forget these techniques, but hopefully other people find this useful as well.  I'm hoping that this post does more than just give you a list of code to copy and paste, and also gives you an appreciation of how useful IPython is as a data processing tool for pentesting.

I may add to this post in future if I have other commonly used patterns that I think would be useful.


Setup


The first step in being able to parse nmap scan data is doing an nmap scan.  I won't go into too much detail about how this is done, but to use the code in this post you will need to have saved your scan results to an xml file (options -oX or -oA), performed service detection (-sV) and run scripts (-sC) on the open ports.

The rest of the commands in this post will assume you are working from a Python REPL environment such as IPython, and have installed the libnmap module (which you can do using easy_install or pip; the package is published on PyPI as python-libnmap).

To start off with, you need to set up the environment by importing NmapParser and then reading in your xml scan results file (named "up_hosts_all_ports_fullscan.xml" and located in the present working directory in the example below).

from libnmap.parser import NmapParser
nmap_report = NmapParser.parse_fromfile('up_hosts_all_ports_fullscan.xml')


The rest of this post covers various one-liners that generate lists with useful groupings of information.  The examples all assume that the nmap scan data resides in a variable named nmap_report, as generated in the example above.  The base list comprehensions are given in the examples below, and if you paste these directly into the IPython REPL in which you have already run the instructions above, the output is dumped straight to the console so you can view it.  I usually do this first, before taking other steps, so I can check that the data "looks" as expected.

Then, you can optionally preface these lines with a variable name and an equals "=" sign to assign the data to a variable so you can use it in future Python code, or surround it with a join and a write to save it to disk so you can work on it with shell commands.  You could also paste the snippets into a Python script if it's something you might want to use multiple times, or if you want to incorporate some more complex logic that would become awkward in a REPL environment.  I'll include a section at the end that shows you how to quickly perform these operations.


Port information

Hosts with a given open port

Show all hosts that have a given port open.  Generates a list of host addresses as strings.  Port 443 is used in the example below; change this to your desired value.

[ a.address for a in nmap_report.hosts if (a.get_open_ports()) and 443 in [b[0] for b in a.get_open_ports()] ]


Unique port numbers found open


Show a unique list of the port numbers that are open on various hosts.  Generates a list of port numbers, as ints, sorted numerically.

sorted(set([ b[0] for a in nmap_report.hosts for b in a.get_open_ports()]), key=int)


Hosts serving each open port, grouped by port

Show all open ports and the hosts that have them open, grouped by port and sorted by port order.  Generates a list of lists, where the first item of each member list is the port number as an int, and the second item is a list of the IP addresses as strings that have that port open.

[ [a, [ b.address for b in nmap_report.hosts for c in b.get_open_ports() if a==c[0] ] ] for a in sorted(set([ b[0] for a in nmap_report.hosts for b in a.get_open_ports()]),key=int) ]
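The nested comprehension above walks every host once for each port.  For larger scans, a single pass that builds a dictionary may be easier to read.  The following is a minimal sketch rather than one of the one-liners: the function name group_hosts_by_port and the sample data are my own inventions, and it works on plain (address, port) pairs, which you could build with something like [ (a.address, b[0]) for a in nmap_report.hosts for b in a.get_open_ports() ].

```python
from collections import defaultdict


def group_hosts_by_port(pairs):
    """Group (address, port) pairs into [port, [addresses]] lists, sorted by port."""
    grouped = defaultdict(list)
    for address, port in pairs:
        grouped[port].append(address)
    return [[port, grouped[port]] for port in sorted(grouped)]


# Made-up sample data standing in for real scan output.
sample_pairs = [('10.0.0.1', 443), ('10.0.0.1', 22), ('10.0.0.2', 443)]
by_port = group_hosts_by_port(sample_pairs)
print(by_port)
```

The result has the same list-of-lists shape as the one-liner, so it can be used in the same way afterwards.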


SSL/TLS and HTTP/HTTPS

Host and port combinations with SSL

Show all host and port combinations with SSL/TLS.  This works by looking for any reference to the service being tunneled over "ssl" or the script result including results that reference a pem certificate.  Generates a list of lists, where each list item includes the host address as a string, and the port as an int.

[ [a.address,  b.port] for a in nmap_report.hosts for b in a.services if b.tunnel=='ssl' or "'pem'" in str(b.scripts_results)  ]


The following produces the same information as the above, all SSL enabled host and port combinations, but instead of a list of lists I have used the join function to create a list of host:port strings.

[ ':'.join([a.address,  str(b.port)]) for a in nmap_report.hosts for b in a.services if b.tunnel=='ssl' or "'pem'" in str(b.scripts_results)  ]


Host and port combinations serving websites

Show all websites, including the port and protocol (http or https).  This generates a list of lists, where each child list contains the protocol as a string, address as a string and port number as an int.  There is some inconsistency in the way that nmap reports on https sites (sometimes the service is "https", and other times the service is "http" with an "ssl" tunnel), so I have performed some munging of field data to make the output here consistent.

[ [(b.service + b.tunnel).replace('sl',''), a.address, b.port] for a in nmap_report.hosts for b in a.services if b.open() and b.service.startswith('http') ]


Here is the same information as the above, but instead of a list of lists with protocol, host and port as separate items, it joins all these together to provide a list of URLs as strings.

[ (b.service + b.tunnel).replace('sl','') + '://' + a.address + ':' + str(b.port) + '/' for a in nmap_report.hosts for b in a.services if b.open() and b.service.startswith('http') ]
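The replace('sl','') trick handles the two cases described above, but it is easy to misread, and a service name such as "http-proxy" would pass straight through as the protocol.  As a more explicit alternative, here is a minimal sketch.  The helper names web_scheme and build_url are my own, and the rule it applies (an "ssl" tunnel or a service name starting with "https" means https, anything else means http) is an assumption about how you want such names handled, so adjust to taste.

```python
def web_scheme(service, tunnel):
    """Return 'https' or 'http' for an nmap service name and tunnel value."""
    if tunnel == 'ssl' or service.startswith('https'):
        return 'https'
    return 'http'


def build_url(service, tunnel, address, port):
    """Build a base URL string from the individual service fields."""
    return '%s://%s:%d/' % (web_scheme(service, tunnel), address, port)


# Made-up sample values standing in for b.service, b.tunnel, a.address, b.port.
print(build_url('http', 'ssl', '10.0.0.1', 8443))
```

With real scan data, the call would sit inside the same comprehension: [ build_url(b.service, b.tunnel, a.address, b.port) for a in nmap_report.hosts for b in a.services if b.open() and b.service.startswith('http') ].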


Other Service Information

Unidentified services

Show all of the services that nmap could not identify during its service enumeration.  Generates a list of lists where each child list contains the address as a string, the port as an int, and the nmap service fingerprint as a string.  I usually like to generate this information for manual review of those particular services, and don't do anything automated with the output, but it's still nice to be able to quickly generate this information for easy review.

[ [ a.address, b.port, b.servicefp ] for a in nmap_report.hosts for b in a.services if (b.service =='unknown' or b.servicefp) and b.port in [c[0] for c in a.get_open_ports()] ]


Software products identified by nmap

Show a unique list of the products that nmap identified during the scan.  Generates a sorted list of product strings.

sorted(set([ b.banner for a in nmap_report.hosts for b in a.services if 'product' in b.banner]))


Host and port combinations serving software products, grouped by product

Show each software product, with hosts and ports where they are served, grouped by product.  Generates a list of lists, where each child has a first element of the product name as a string, followed by a list of lists, where each child list contains the address as a string, and the port number as an int.

[ [ a, [ [b.address, c.port] for b in nmap_report.hosts for c in b.services if c.banner==a] ] for a in sorted(set([ b.banner for a in nmap_report.hosts for b in a.services if 'product' in b.banner])) ]


Same as the above (each product, with the hosts and ports where it is served, grouped by product), but with a slightly different presentation.  This generates a list of lists, where the first element in each list is the product name as a string, and the second element is a list of host:port strings.

[ [ a, [ ':'.join([b.address, str(c.port)]) for b in nmap_report.hosts for c in b.services if c.banner==a] ] for a in sorted(set([ b.banner for a in nmap_report.hosts for b in a.services if 'product' in b.banner])) ]


Host and port combinations serving services relating to a particular search string

Show all the hosts and ports that relate to a given (case sensitive) search string, which can be found anywhere in a raw text dump of all the service information provided by nmap, covering the product name, the service name, etc.  The string "Oracle" is used in the example below. Can be used to create a more generalised, or alternatively more specific grouping of services than the snippet above.  Generates a list of lists, where each child list contains the address of a host as a string and the port as an int.

[ [a.address, b.port] for a in nmap_report.hosts for b in a.services if b.open() and 'Oracle' in str(b.get_dict()) + str(b.scripts_results) ]


Shows the same as the above, all host and port combinations that match a given search string, but in this case it's modified to match against a service information dump that's all in lower case.  Use lower case search strings here (the example is a lower case "oracle").  Generates output in the same format as the example above.

[ [a.address, b.port] for a in nmap_report.hosts for b in a.services if b.open() and 'oracle' in (str(b.get_dict()) + str(b.scripts_results)).lower() ]


Random Stuff

Common Name from Certificate Subject

Shows the common name field from any SSL certificates found and parsed by nmap during a script scan.  Can be useful to determine the system's host name if you only started with an IP address and reverse DNS doesn't work.  Generates a list of lists, each containing the IP address and extracted host name as strings.  The membership tests use the "in" operator rather than has_key, since has_key only exists in Python 2.

[ [a.address, c['elements']['subject']['commonName'] ] for a in nmap_report.hosts for b in a.services for c in b.scripts_results if 'elements' in c and 'subject' in c['elements'] ]
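The one-liner above assumes every certificate subject carries a commonName entry.  If you would rather skip results where it is missing than hit a KeyError, a small helper can do the digging.  This is a minimal sketch: the name common_names is my own, and the sample dictionaries are made up to mirror the shape that the one-liner expects from scripts_results (an 'elements' dict holding a 'subject' dict), not captured from a real scan.

```python
def common_names(address, scripts_results):
    """Return [address, commonName] pairs found in a list of script results."""
    found = []
    for result in scripts_results:
        elements = result.get('elements')
        if not isinstance(elements, dict):
            continue
        subject = elements.get('subject')
        # Skip results with no subject, or a subject lacking commonName.
        if isinstance(subject, dict) and 'commonName' in subject:
            found.append([address, subject['commonName']])
    return found


# Made-up sample data in the assumed shape of b.scripts_results.
sample_results = [
    {'id': 'ssl-cert', 'elements': {'subject': {'commonName': 'intranet.example.local'}}},
    {'id': 'http-title', 'elements': {'title': 'Welcome'}},
    {'id': 'ssl-cert', 'elements': {'subject': {'organizationName': 'Example'}}},
]
names = common_names('10.0.0.5', sample_results)
print(names)
```

Against real data it would be called once per service, for example: [ pair for a in nmap_report.hosts for b in a.services for pair in common_names(a.address, b.scripts_results) ].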


Ways to use the results

As mentioned earlier, the examples above, when pasted into your IPython REPL, will just dump the output to screen, where you can look at it.  That's nice, because it allows you to see the data you're interested in, ensure it passes the "smell" test, etc, but you probably want to do other stuff with it as well.  One of the benefits of generating the information in this way is that you can easily perform some further automated action with the result.

If you're already familiar with Python, it will be pretty easy for you to perform these other types of tasks, and you can skip this section, but for those who are not, this section will provide some basic pointers on how you can make use of these snippets.  The examples below demonstrate some simple things that you might want to do with them.


Saving to disk

If you want to write the output of one of the snippets to disk to a text file, you need to join the list together in an appropriate string format (depending on your use case) and then actually write it out to a file.  In Python, we can join the list to a string using the join function, and write it to disk using open and write.  Here's an example.

Let's say we want to take our generated list of hosts and ports that support ssl, and write them out to a newline separated file so we can do a for loop in bash and test each combination for secure ssl usage using a command line tool (are the ciphers correct, are there bad protocols like SSLv2 or SSLv3, etc).  I would do all of this in a giant one liner in IPython, because that's the sort of thing that amuses me, but I will break it down into individual lines of code here for readability in this example.

Our list comprehension, that generates a list of host:port strings from above, is as follows.  Note how we use the str function around the port number to convert it from an int to a string, which allows it to be joined together with other strings.  This is something to be aware of when manipulating this sort of data, and was something that caught me out a lot when I was first learning to use Python.

[ ':'.join([a.address,  str(b.port)]) for a in nmap_report.hosts for b in a.services if b.tunnel=='ssl' or "'pem'" in str(b.scripts_results)  ]


Let's assign it to a variable named "ssl_services" to make it easier to work with.

ssl_services = [ ':'.join([a.address,  str(b.port)]) for a in nmap_report.hosts for b in a.services if b.tunnel=='ssl' or "'pem'" in str(b.scripts_results)  ] 


Now, let's join each list element together with newlines ('\n'), using Python's join function, and assign it to a new variable, "ssl_services_text".

ssl_services_text = '\n'.join(ssl_services)


Now, we can create a new file "ssl_services_file.txt" in the present working directory, and write the contents of our ssl_services_text variable to it.

open('ssl_services_file.txt','w').write(ssl_services_text)


Easy.  Now you can bash away at the file to your heart's content.
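If you end up doing this often, the join and write steps can be wrapped in a small function.  The following is a minimal sketch, not something from the steps above: the name save_lines is my own, the sample list is made up, and the example writes into a temporary directory purely so it can be run anywhere.  It uses a with block so the file is closed explicitly, and adds a trailing newline, which keeps shell loops built on "while read" from skipping the last entry.

```python
import os
import tempfile


def save_lines(path, items):
    """Write each item on its own line and close the file when done."""
    with open(path, 'w') as handle:
        handle.write('\n'.join(items) + '\n')


# Made-up sample data standing in for the ssl_services list.
sample_services = ['10.0.0.1:443', '10.0.0.2:8443']
out_path = os.path.join(tempfile.mkdtemp(), 'ssl_services_file.txt')
save_lines(out_path, sample_services)
```

In a real session you would pass the ssl_services variable and a path in your working directory instead of the sample values.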


Using with other Python code

Perhaps you want to use the list comprehensions above in other Python code?  That's easy too.  Here's a simple example, where we will loop through each of our websites identified from nmap, and see the result of requesting a particular page from that site.

Here's the list comprehension that generates the list of URLs.  We will assign it directly to a variable called "urls".

urls = [ (b.service + b.tunnel).replace('sl','') + '://' + a.address + ':' + str(b.port) + '/' for a in nmap_report.hosts for b in a.services if b.open() and b.service.startswith('http') ]


Next, we do some prep work: import the requests module and set up a simple helper function that makes a web request and saves the result to a file on disk with an autogenerated name based on the url.  You will note below that the get request uses the "verify=False" option, which ignores certificate validation errors when making the request, something that's often necessary when testing internally on machines that may not trust the certificate authorities used to sign SSL certificates.

import requests

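def getAndSave(url):
    # verify=False skips certificate validation for the request
    r = requests.get(url, verify=False)
    # file name is the host, port and path joined with '_', with ':' removed
    open('_'.join(url.split('/')[2:]).replace(':',''),'wb').write(r.text.encode('utf8'))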


Now, we will add some code to iterate through each of the base URLs from nmap, and request the robots.txt file for each site so that it can be saved to disk for our later perusal.

for a in urls:
    getAndSave(a + 'robots.txt')


This will request the robots.txt file from each of the sites in turn, and save it to disk in the current working directory.  This is just a very simple example (there's not even any error checking in the getAndSave function), but it hopefully gives you an idea of what's possible.
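The file name expression inside getAndSave is fairly dense, so it may help to see it unpacked on its own.  The sketch below mirrors that expression in a standalone function (the name url_to_filename is my own) and adds a hypothetical variant, url_to_filename_sep, that swaps the colon for an underscore rather than deleting it.  The variant is a suggestion, not part of the original helper: with the colon simply removed, the address and port run together, so two different host and port combinations can produce the same file name.

```python
def url_to_filename(url):
    """Mirror the file name expression used in getAndSave."""
    # Splitting on '/' gives ['https:', '', 'host:port', 'path', ...];
    # dropping the first two pieces leaves the host:port and the path.
    parts = url.split('/')[2:]
    return '_'.join(parts).replace(':', '')


def url_to_filename_sep(url):
    """Hypothetical variant that keeps address and port visibly separate."""
    parts = url.split('/')[2:]
    return '_'.join(parts).replace(':', '_')


print(url_to_filename('https://10.0.0.1:8443/robots.txt'))
print(url_to_filename_sep('https://10.0.0.1:8443/robots.txt'))
```

Whether the collision matters depends on your address range and ports, but the underscore version costs nothing and makes the files easier to sort by host.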

Conclusion

Hope you found this useful, and that it gave you an appreciation of how useful and flexible Python can be when parsing nmap scan data.

Are there any other common tasks you perform with nmap scan data that you would like a one-liner for?  Leave a comment!