BYOS – Build your own sniffer – Python Tutorial

Last post of 2017 and it's going to be a long one.

Before proceeding take into consideration the following:

1) I am not a developer and everything shown below is my own trip on understanding some principles
2) If there are mistakes or incorrect understanding let me know on Twitter.

During an assessment you will most probably end up running Responder.

If you read about Responder you will see that it is an LLMNR, NBT-NS and mDNS poisoner, meaning that for the tool to work it first needs to capture packets from one of those protocols; if there are none, it simply has nothing to poison. The idea of this post is to create a simple tool that looks for these packets, so you don’t run Responder when the packets it needs are not on the wire. The outcome will not be an awesome tool, but it will teach you a thing or two about protocols and understanding packets. If you want to avoid all this trouble and just want a way to see LLMNR and mDNS packets, use Wireshark and apply the display filter “llmnr” or “mdns”.

Before we begin, a question that comes to mind is: why not use Wireshark? Wireshark is awesome and I use it during all my assessments, but I wanted to play a bit with Python’s packing, unpacking and sockets. I also now code 100% in Python, so I am researching a lot to improve my coding; the below is basically a learning trip of my own.

I will not deep dive into the specifics of how Responder does the poisoning part, as so many other people have made good tutorials, but I will use these underlying protocols to build a sniffer/dissector that looks only for these protocols and ignores all others. The idea behind this is twofold: first, it shows you how you can inspect packets in detail, and second, you can build on it to create a tool that does more, so think of it as a skeleton script for future development or reference.

Before I get into the code we need some theory that will come in handy later on.

LLMNR – Link-Local Multicast Name Resolution

Summary of LLMNR’s RFC…

..based on the DNS packet format and supports all current and future DNS formats, types, and classes. LLMNR operates on a separate port from the Domain Name System (DNS), with a distinct resolver cache.

  • IPv4

Multicast destination IP address 224.0.0.252
Multicast MAC destination address 01:00:5E:00:00:FC

Destination Port UDP 5355

  • IPv6

Multicast destination IP address FF02:0:0:0:0:0:1:3
IPv6 multicast destination MAC address 33:33:00:01:00:03

Destination Port UDP 5355

So for the LLMNR part this is all the information we need. If we observe packets that fulfil the above protocol parameters then we can conclude that there is LLMNR traffic in the network.

mDNS – multicast Domain Name System

Summary of mDNS’s RFC…

“…the DNS top-level domain “.local.” is a special domain with special semantics, namely that any fully qualified name ending in “.local.” is link-local, and names within this domain are meaningful only on the link where they originate.
[..] Any DNS query for a name ending with “.local.” MUST be sent to the mDNS IPv4 link-local multicast address 224.0.0.251 (or its IPv6 equivalent FF02::FB).”

  • IPv4

Multicast destination IP 224.0.0.251
Multicast MAC address 01:00:5E:00:00:FB
Destination Port UDP 5353

  • IPv6

Multicast destination IP FF02::FB
Multicast MAC address 33:33:00:00:00:FB
Destination Port UDP 5353

This concludes the mDNS part. If we observe packets that fulfil the above protocol parameters then we can conclude that there is mDNS traffic in the network.
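The multicast MAC addresses listed above are not arbitrary: an IPv4 multicast group maps to 01:00:5E followed by the low 23 bits of the IP address, and an IPv6 group maps to 33:33 followed by the low 32 bits. The snippet below is a minimal sketch of that mapping so you can double check the values yourself. The function name multicast_mac is my own and not part of any library.

```python
import ipaddress


def multicast_mac(group):
    """Return the Ethernet multicast MAC that a multicast IP address maps to."""
    addr = ipaddress.ip_address(group)
    if not addr.is_multicast:
        raise ValueError("{} is not a multicast address".format(group))
    raw = addr.packed
    if addr.version == 4:
        # 01:00:5E prefix + the low 23 bits of the IPv4 group address
        mac = bytes([0x01, 0x00, 0x5E, raw[1] & 0x7F, raw[2], raw[3]])
    else:
        # 33:33 prefix + the low 32 bits of the IPv6 group address
        mac = b"\x33\x33" + raw[-4:]
    return ":".join("{:02X}".format(b) for b in mac)


for group in ("224.0.0.252", "ff02::1:3", "224.0.0.251", "ff02::fb"):
    print(group, "->", multicast_mac(group))
# 224.0.0.252 -> 01:00:5E:00:00:FC
# ff02::1:3 -> 33:33:00:01:00:03
# 224.0.0.251 -> 01:00:5E:00:00:FB
# ff02::fb -> 33:33:00:00:00:FB
```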

We can verify the above by setting a filter in Wireshark for either LLMNR or mDNS.

[Image: mDNS packets in Wireshark]

Ethernet

As the Ethernet frames are captured they have the following structure.

[Image: Ethernet II frame structure]

What we care about is the first 14 bytes, which hold the Destination MAC, the Source MAC and the Type. The data comes after these 14 bytes; the Ethernet layer itself makes no use of it, but we will hand it on to the IP dissector later. By Type we mean the EtherType, a two-octet field in an Ethernet frame that indicates which protocol is encapsulated in the payload of the frame (see the Wikipedia article on EtherType). If the EtherType is 0x0800 it means this is IPv4 traffic. If the EtherType is 0x86DD it means it is IPv6 traffic.

Let's start coding.

import socket
import struct

def main():
	packets = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.ntohs(3))

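We begin with a main function which creates a socket to listen for incoming packets at Layer 2 of the OSI model. Breaking this line down:

socket.AF_PACKET – Address Family Packet. This Linux-only family is used when we want to capture and manipulate traffic at the link layer. An alternative is AF_INET, which works one layer up with IPv4 and is what ordinary TCP and UDP sockets use.

socket.SOCK_RAW – Raw packets are passed to and from the device driver without any changes in the packet data. An alternative would be to use SOCK_DGRAM where the physical header is removed before the packet is passed to the user.

socket.ntohs(3) – The protocol to capture. 3 is ETH_P_ALL, which means every protocol, and it has to be passed in network byte order. ntohs and htons perform the same 16-bit byte swap, so either call produces the value the kernel expects. Creating this socket requires root privileges.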

More info on sockets can be found here.
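As a minimal sketch of the socket setup described above, the helper below wraps the same call and optionally binds to one interface. The name open_sniffer and the interface parameter are my own additions for illustration; the rest of this post keeps using the plain one-liner.

```python
import socket

ETH_P_ALL = 0x0003  # from <linux/if_ether.h>: match every protocol


def open_sniffer(interface=None):
    """Open a Layer 2 capture socket (Linux only, needs root or CAP_NET_RAW)."""
    if not hasattr(socket, "AF_PACKET"):
        raise OSError("AF_PACKET is only available on Linux")
    sniffer = socket.socket(socket.AF_PACKET, socket.SOCK_RAW,
                            socket.htons(ETH_P_ALL))
    if interface is not None:
        # Only receive frames seen on this interface, e.g. "eth0"
        sniffer.bind((interface, 0))
    return sniffer


# htons and ntohs do the same 16-bit swap, so both give the same protocol value
print(socket.htons(ETH_P_ALL) == socket.ntohs(ETH_P_ALL))  # True
```

Calling open_sniffer("eth0") as root would return a socket you can use with recvfrom exactly like the packets socket in the rest of this post. Without root it raises PermissionError.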

Adding some more functionality to the code.

import socket
import struct

def main():
	packets = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.ntohs(3))

	while True:
		ethernet_data, address = packets.recvfrom(65536)
	

ethernet_data, address = packets.recvfrom(65536)

Here we are receiving packets on our socket with a buffer size of 65536 bytes, which is far more than any single Ethernet frame needs. recvfrom returns a tuple containing the raw data received and the address it came from, which for this kind of socket includes the name of the capturing interface, e.g. eth0. We take the tuple values and assign them to the ethernet_data and address variables. The address variable is not useful here and will not be used at all.

The problem right now is twofold: (1) the captured data is a raw sequence of bytes, and (2) we need to locate where the information we need sits inside each packet, for example what Type of packet it is.

What we need is a function that ingests the raw data captured and returns the Source MAC Address, Destination MAC Address and Type of protocol.

import socket
import struct

def main():
	packets = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.ntohs(3))

	while True:
		ethernet_data, address = packets.recvfrom(65536)
		dest_mac, src_mac, protocol, ip_data = ethernet_dissect(ethernet_data)


def ethernet_dissect(ethernet_data):

	dest_mac, src_mac, protocol = struct.unpack('! 6s 6s H', ethernet_data[:14])
	return mac_format(dest_mac), mac_format(src_mac), socket.htons(protocol), ethernet_data[14:]

dest_mac, src_mac, protocol = struct.unpack('! 6s 6s H', ethernet_data[:14])

The first 14 bytes of each raw packet received are unpacked into three variables. If you remember from the Ethernet Frame image above, the first 6 bytes are the destination MAC address, the next 6 bytes are the source MAC address and the next 2 bytes are the type. The bytes from offset 14 onwards are the data bytes, which contain the IP packet.

As mentioned before, the raw data is a sequence of bytes and we need to tell Python how to convert it. Breaking it down:

! – Network byte order, which is big-endian. As the Python documentation puts it, the form ‘!’ is available for those poor souls who claim they can’t remember whether network byte order is big-endian or little-endian.
6s – A byte string of length 6 for the destination MAC. MAC addresses are 48 bits or 6 bytes long.
6s – A byte string of length 6 for the source MAC.
H – Unsigned short integer of 2 bytes for the EtherType.

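An alternative way is to use unsigned chars. Every B yields its own integer, so unpack now returns 13 values instead of 3 and they have to be regrouped:

fields = struct.unpack('! BBBBBB BBBBBB H', ethernet_data[:14])

or, more compactly, with a repeat count:

fields = struct.unpack('! 6B 6B H', ethernet_data[:14])
dest_mac, src_mac, protocol = fields[:6], fields[6:12], fields[12]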

Right now we have 6 bytes stored in the dest_mac variable, another 6 bytes stored in src_mac and a 2-byte integer stored in the protocol variable. Although the protocol is usable as is, the source and destination MAC addresses are not, as they are not yet in MAC address format (MM:MM:MM:SS:SS:SS).
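To see what unpack hands back, here is a small worked example on a hand-built header. The bytes are made up for illustration (the LLMNR multicast MAC as destination, an invented source MAC and the IPv4 EtherType), so nothing here comes from a real capture.

```python
import struct

# Hypothetical frame header: LLMNR multicast destination, made-up source, IPv4 EtherType
header = bytes.fromhex("01005e0000fc" "000a959d6816" "0800")

dest_mac, src_mac, protocol = struct.unpack("! 6s 6s H", header)
print(len(dest_mac))   # 6
print(list(src_mac))   # [0, 10, 149, 157, 104, 22]
print(hex(protocol))   # 0x800

# The 6B variant returns one integer per byte, so 13 values in total
fields = struct.unpack("! 6B 6B H", header)
print(len(fields))     # 13
```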

After this line of code executes we call the function responsible for formatting the MAC addresses and return the values to the main function.
We also run socket.htons(protocol) on the EtherType. Note that '!' has already converted the value from network byte order to a normal integer, so on a little-endian machine (any x86 PC) htons swaps the two bytes once more and the IPv4 EtherType 0x0800 comes out as 0x0008, i.e. 8. That is why the code later on compares the protocol against 8.

More details on Python’s struct can be found here.
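If the byte swapping feels confusing, one alternative is to skip socket.htons altogether and compare the unpacked value against the EtherType numbers directly. This is only a sketch of that idea, not what the rest of this post does; the constant names and the ethertype helper are my own.

```python
import socket
import struct
import sys

ETH_P_IP = 0x0800    # IPv4
ETH_P_IPV6 = 0x86DD  # IPv6


def ethertype(frame):
    """Return the EtherType of a frame as a plain integer."""
    # 12x skips both MAC addresses, H reads the 2-byte type in network order
    (value,) = struct.unpack("! 12x H", frame[:14])
    return value


sample = bytes.fromhex("01005e0000fc" "000a959d6816" "0800")
print(ethertype(sample) == ETH_P_IP)  # True

# What the extra htons does depends on the machine: 8 on little-endian hosts
print(sys.byteorder, socket.htons(ETH_P_IP))
```

The upside of this variant is that the comparison reads like the protocol tables and does not depend on the endianness of the machine running the script.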

The next part of the code is the function that does the heavy lifting on correctly formatting the mac address.

import socket
import struct

def main():
	packets = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.ntohs(3))

	while True:
		ethernet_data, address = packets.recvfrom(65536)
		dest_mac, src_mac, protocol, ip_data = ethernet_dissect(ethernet_data)


def ethernet_dissect(ethernet_data):

	dest_mac, src_mac, protocol = struct.unpack('! 6s 6s H', ethernet_data[:14])
	return mac_format(dest_mac), mac_format(src_mac), socket.htons(protocol), ethernet_data[14:]


def mac_format(mac):
	mac = map('{:02x}'.format, mac)
	return ':'.join(mac).upper()

The mac_format function first takes the destination MAC address and then the source MAC address and corrects their format. It does that by running map over each 6-byte value that comes in. What map does is apply the same function to every element and return the results. The input data is the MAC bytes and the function that runs on each byte is '{:02x}'.format.

'{:02x}'.format turns one byte into two lowercase hexadecimal digits, padding with a leading zero where needed. In Python 3 map returns an iterator rather than a list, but once consumed the mac variable yields something like this:

mac = ['00', '0a', '95', '9d', '68', '16']

The next line of code will return the value after joining the elements of the list and converting them to uppercase effectively returning

00:0A:95:9D:68:16
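Here is the formatting step on its own so you can try it without a socket. The second function is an alternative I am adding for comparison; it assumes Python 3.8 or newer, where bytes.hex accepts a separator argument.

```python
def mac_format(mac):
    # Same approach as in this post: format every byte as two hex digits
    return ":".join(map("{:02x}".format, mac)).upper()


def mac_format_hex(mac):
    # Alternative (assumes Python 3.8+): let bytes.hex insert the separators
    return mac.hex(":").upper()


raw = bytes.fromhex("000a959d6816")
print(list(map("{:02x}".format, raw)))  # ['00', '0a', '95', '9d', '68', '16']
print(mac_format(raw))                  # 00:0A:95:9D:68:16
print(mac_format_hex(raw))              # 00:0A:95:9D:68:16
```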

So at this stage our main function has the correctly formatted Destination MAC, Source MAC and Type.

The first thing to establish is whether the traffic received is IPv4 or not, because we don’t want to deal with other traffic such as IPv6, ARP etc. Handling those is completely doable but out of scope for this post. Depending on the result we can look for different things in each frame. Before that we need to dissect the IP packet so that we can read the information it contains and extract the protocol, which tells us what kind of packet it is. This is crucial for any type of analysis: to dissect and manipulate a packet you first need to understand what you are dealing with. All of this information is contained in the data segment of the Ethernet frame, shown in blue.

[Image: Ethernet II frame structure]

In particular, the data segment comes after the first 14 bytes of the Ethernet frame.

The next check is that the packet must be a UDP packet, as both LLMNR and mDNS send their traffic over UDP.

The protocol field of the IP header holds a number identifying the next protocol, for example:

1 – ICMP
6 – TCP
17 – UDP
27 – RDP
41 – IPv6

The complete list of protocols is here

For what we are trying to build here, the only relevant protocol is 17 – UDP. This is the first test that we will run on every single packet that comes in; if the protocol is 17 – UDP we can proceed with the rest of the checks.
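If you want readable names in your output, the short list above can be turned into a lookup table. This is a small sketch with a helper name of my own (protocol_name); the socket module already ships constants such as socket.IPPROTO_UDP that you can compare against.

```python
import socket

# Only the protocol numbers mentioned in this post
IP_PROTOCOLS = {1: "ICMP", 6: "TCP", 17: "UDP", 27: "RDP", 41: "IPv6"}


def protocol_name(number):
    """Map an IP protocol number to a name, falling back to the raw number."""
    return IP_PROTOCOLS.get(number, "other ({})".format(number))


print(protocol_name(socket.IPPROTO_UDP))  # UDP
print(protocol_name(2))                   # other (2)
```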

import socket
import struct

def main():
	packets = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.ntohs(3))

	while True:
		ethernet_data, address = packets.recvfrom(65536)
		dest_mac, src_mac, protocol, data = ethernet_dissect(ethernet_data)
		print("\nEthernet Frame Captured:")
		print("Destination: {} Source: {} Protocol {}".format(dest_mac, src_mac, protocol))

def ethernet_dissect(ethernet_data):

	dest_mac, src_mac, protocol = struct.unpack('! 6s 6s H', ethernet_data[:14])
	return mac_format(dest_mac), mac_format(src_mac), socket.htons(protocol), ethernet_data[14:]


def mac_format(mac):
	mac = map('{:02x}'.format, mac)
	return ':'.join(mac).upper()


def ipv4_packet(ip_data):

	ip_protocol, source_ip, target_ip = struct.unpack('! 9x B 2x 4s 4s' , ip_data[:20])
	return ip_protocol, ipv4(source_ip), ipv4(target_ip), ip_data[20:]

def ipv4(address):
	return '.'.join(map(str, address))

Let's break it down because it is starting to get a bit complicated. The IP header is located inside the data portion of the Ethernet frame and it looks like the image below.

[Image: IP header]

The elements of interest to us are the protocol, to distinguish what kind of packet it is, the source IP address and the destination IP address. Let's break down the ipv4_packet function, which is going to help extract the information needed.


def ipv4_packet(ip_data):

	ip_protocol, source_ip, target_ip = struct.unpack('! 9x B 2x 4s 4s' , ip_data[:20])
	return ip_protocol, ipv4(source_ip), ipv4(target_ip), ip_data[20:]


def ipv4(address):
	return '.'.join(map(str, address))



We first begin by sending to the function the ip_data part that was extracted initially from ethernet_dissect function. This data is the ethernet data after the first 14 bytes so we are now dealing only with the data portion of the frame.

We know that the IP header, excluding the options, is 20 bytes long, so all the information we need is in those 20 bytes. We also need what comes after the 20 header bytes, which is the UDP packet data. Note that this assumes a header without options; when options are present the header is longer than 20 bytes.
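The fixed 20-byte slice works for the common case, but the real header length is stored in the IHL field (the low 4 bits of the first byte, counted in 32-bit words). The sketch below is a variant that honours it. The function names and the sample header are my own, made up for illustration, and the code in the rest of this post keeps the simpler 20-byte assumption.

```python
import socket
import struct


def ipv4_header_length(ip_data):
    """Header length in bytes, taken from the IHL field (low nibble of byte 0)."""
    return (ip_data[0] & 0x0F) * 4


def ipv4_packet_ihl(ip_data):
    ip_protocol, source_ip, target_ip = struct.unpack("! 9x B 2x 4s 4s", ip_data[:20])
    header_len = ipv4_header_length(ip_data)
    # Slice the payload after the full header, options included
    return (ip_protocol, socket.inet_ntoa(source_ip),
            socket.inet_ntoa(target_ip), ip_data[header_len:])


# Hypothetical header: IHL = 6 (24 bytes, one 4-byte option), TTL 1, protocol 17 (UDP)
header = struct.pack("! B B H H H B B H 4s 4s 4s",
                     0x46, 0, 28, 0, 0, 1, 17, 0,
                     socket.inet_aton("192.168.1.10"),
                     socket.inet_aton("224.0.0.252"),
                     b"\x00\x00\x00\x00")
payload = b"\xde\xad\xbe\xef"

print(ipv4_header_length(header))         # 24
print(ipv4_packet_ihl(header + payload))
# (17, '192.168.1.10', '224.0.0.252', b'\xde\xad\xbe\xef')
```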

[Image: IPv4 packet format]

All the information we need sits at byte offsets 9, 12 and 16 of the header. IP addresses are 32 bits or 4 bytes long and the protocol is 1 byte long, so now it is a matter of unpacking those values from the 20-byte IP header.

The most challenging thing is understanding how to unpack. Let's take it step by step.

struct.unpack('! 9x B 2x 4s 4s' , ip_data[:20])

9x – Means 9 pad bytes of no value. The first 8 bytes, shown in red, hold the version, header length, type of service, total length, identification, flags and fragment offset, and the byte after them is the TTL. We don’t care about any of that information, so in essence we are skipping that part with 9 pad bytes.

[Image: IP header byte numbering]

We then reach byte 9, which holds the Protocol, and store it in the variable ip_protocol. Because the protocol is a single byte we unpack it with B, an unsigned integer of length 1. Then we skip bytes 10 and 11 (the header checksum) with 2x until we reach the source IP address. As mentioned before, an IP address is 4 bytes long, so it can be unpacked with 4s, meaning a byte string of length 4. Because the destination IP address comes right after the 4 bytes of the source IP address, no padding is required and the destination IP address can be unpacked with 4s again.

So we have the IP addresses stored in the source_ip and target_ip variables, but they are still not in the right format, so they are sent to the formatting function, which takes each variable in and converts it appropriately. Remember that 4s gives us raw bytes, and iterating over them yields integers, so each IP address will be something like this before formatting.

[165, 32, 123, 43]


def ipv4(address):
	return '.'.join(map(str, address))

What the function above does is take in the four bytes, convert each one to a string and join all the strings using a ‘.’ separator, leading to an IP of 165.32.123.43.
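For comparison, the standard library can do this conversion for you. The snippet below shows the join approach next to socket.inet_ntoa and ipaddress.IPv4Address on the same four example bytes; which one you use is a matter of taste.

```python
import ipaddress
import socket

raw = bytes([165, 32, 123, 43])

print(".".join(map(str, raw)))          # 165.32.123.43
print(socket.inet_ntoa(raw))            # 165.32.123.43
print(str(ipaddress.IPv4Address(raw)))  # 165.32.123.43
```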

This program will decide on the packet if it is LLMNR or mDNS based on 3 things:

Destination IP Address
Destination MAC address
Destination Port
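The three checks above can be sketched as a small lookup table using the IPv4 values from the theory section. The names SIGNATURES and classify are my own and are not used by the code in the rest of this post, which does the same comparisons with plain if statements.

```python
# IPv4 values from the LLMNR and mDNS sections of this post
SIGNATURES = {
    "LLMNR": {"ip": "224.0.0.252", "mac": "01:00:5E:00:00:FC", "port": 5355},
    "mDNS": {"ip": "224.0.0.251", "mac": "01:00:5E:00:00:FB", "port": 5353},
}


def classify(dest_ip, dest_mac, dest_port):
    """Return 'LLMNR' or 'mDNS' when all three values match, otherwise None."""
    for name, sig in SIGNATURES.items():
        if (dest_ip, dest_mac.upper(), dest_port) == (sig["ip"], sig["mac"], sig["port"]):
            return name
    return None


print(classify("224.0.0.252", "01:00:5e:00:00:fc", 5355))  # LLMNR
print(classify("224.0.0.251", "01:00:5E:00:00:FB", 5353))  # mDNS
print(classify("224.0.0.251", "01:00:5E:00:00:FB", 53))    # None
```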

So far we have the MAC addresses and the destination IP address, but we don’t have the destination port, so we need a function that ingests the UDP packet and returns the destination port. Before moving on to that, let’s put in some checks to ensure that the packet received is an IPv4 packet (and ignore it otherwise) and, if it is, check whether its protocol is UDP, with protocol code 17.

import socket
import struct

def main():
	packets = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.ntohs(3))

	while True:
		ethernet_data, address = packets.recvfrom(65536)
		dest_mac, src_mac, protocol, ip_data = ethernet_dissect(ethernet_data)
		print("\nEthernet Frame Captured:")
		print("Destination MAC Address: {} Source MAC Address: {} Protocol: {}".format(dest_mac, src_mac, protocol))

		if protocol == 8:
			ip_protocol, source_ip, target_ip, udp_data = ipv4_packet(ip_data)
			if ip_protocol == 17:
				source, destination = udp_packet(udp_data)
				if destination == 5355:
					print("LLMNR Packet Received from {} to {}".format(source_ip, target_ip))
				elif destination == 5353:
					print("mDNS Packet Received from {} to {}".format(source_ip, target_ip))


def ethernet_dissect(ethernet_data):

	dest_mac, src_mac, protocol = struct.unpack('! 6s 6s H', ethernet_data[:14])
	return mac_format(dest_mac), mac_format(src_mac), socket.htons(protocol), ethernet_data[14:]


def mac_format(mac):
	mac = map('{:02x}'.format, mac)
	return ':'.join(mac).upper()


def ipv4_packet(ip_data):

	ip_protocol, source_ip, target_ip = struct.unpack('! 9x B 2x 4s 4s' , ip_data[:20])
	return ip_protocol, ipv4(source_ip), ipv4(target_ip), ip_data[20:]


def ipv4(address):
	return '.'.join(map(str, address))


def udp_packet(udp_data):
	src_port, dst_port = struct.unpack('! H H', udp_data[:4])
	return src_port, dst_port


The UDP packet format looks like the image. The first 2 bytes are the source port and the next 2 bytes are the destination port. We only care about the destination port, to determine whether it is LLMNR or mDNS, but nevertheless we get both.
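For completeness, the UDP header is 8 bytes in total: source port, destination port, length and checksum, each 2 bytes. Here is a small sketch that unpacks all four from a hand-built header. The sample values are made up and the helper name udp_header is my own.

```python
import struct


def udp_header(udp_data):
    """Unpack the four 2-byte fields of a UDP header."""
    src_port, dst_port, length, checksum = struct.unpack("! H H H H", udp_data[:8])
    return src_port, dst_port, length, checksum


# Hypothetical header: source port 50000, destination 5355, length 8, checksum 0
sample = struct.pack("! H H H H", 50000, 5355, 8, 0)
print(udp_header(sample))  # (50000, 5355, 8, 0)
```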

[Image: UDP header format]

So after a lot of debugging this is what my final code looks like.


import socket
import struct

def main():
	
	packets = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.ntohs(3))

	while True:
		ethernet_data, address = packets.recvfrom(65536)
		dest_mac, src_mac, protocol, ip_data = ethernet_dissect(ethernet_data)
	
		# 8 is the IPv4 EtherType 0x0800 after socket.htons() on a little-endian machine
		if protocol == 8:
			ip_protocol, source_ip, target_ip, ipdata = ipv4_packet(ip_data)
			source, destination = udp_packet(ipdata)
			if dest_mac == "01:00:5E:00:00:FB":
				print("[+] mDNS IPv4 Packet Received")
				if ip_protocol == 17:
					print("\t[-] Source IP: {} Source Port: {} Destination IP: {} Destination Port: {}".format(source_ip, source, target_ip, destination))
					if destination == 5353:
						print("mDNS Packet from {} to {}".format(source_ip, target_ip))
			elif dest_mac == "01:00:5E:00:00:FC":
				print("[+] LLMNR IPv4 Packet Received")
				if ip_protocol == 17:
					if destination == 5355:
						print("\t[-] LLMNR Packet from {} to {}".format(source_ip, target_ip))
		# IPv6 frames carry EtherType 0x86DD; only the multicast MAC is checked here
		elif protocol == socket.htons(0x86DD):
			if dest_mac == "33:33:00:00:00:FB":
				print("[+] mDNS IPv6 Packet Received")
			elif dest_mac == "33:33:00:01:00:03":
				print("[+] LLMNR IPv6 Packet received")


def ethernet_dissect(data):

	dest_mac, src_mac, protocol = struct.unpack('! 6s 6s H', data[:14])
	return mac_format(dest_mac), mac_format(src_mac), socket.htons(protocol), data[14:]


def mac_format(mac):
	mac = map('{:02x}'.format, mac)
	return ':'.join(mac).upper()


def ipv4_packet(ip_data):

	ip_protocol, source_ip, target_ip = struct.unpack('! 9x B 2x 4s 4s' , ip_data[:20])
	
	return ip_protocol, ipv4(source_ip), ipv4(target_ip), ip_data[20:]

def ipv4(address):
	return '.'.join(map(str, address))


def udp_packet(ipdata):
	src_port, dst_port = struct.unpack('! H H', ipdata[:4])
	return src_port, dst_port

main()
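The script above needs root and live traffic, which makes it awkward to test. One way around that is to build a fake frame by hand and run the same unpacking logic over it. The sketch below does exactly that. build_frame and dissect are hypothetical helpers of my own, the source MAC, source IP and source port are made up, and dissect is a condensed version of the dissectors above rather than a copy of them.

```python
import socket
import struct


def build_frame(dest_mac, dest_ip, dest_port):
    """Build a minimal Ethernet + IPv4 + UDP frame with made-up source values."""
    ethernet = (bytes.fromhex(dest_mac.replace(":", ""))
                + bytes.fromhex("000a959d6816")
                + struct.pack("! H", 0x0800))
    ip = struct.pack("! B B H H H B B H 4s 4s", 0x45, 0, 28, 0, 0, 1, 17, 0,
                     socket.inet_aton("192.168.1.10"), socket.inet_aton(dest_ip))
    udp = struct.pack("! H H H H", 50000, dest_port, 8, 0)
    return ethernet + ip + udp


def dissect(frame):
    """Return (dest MAC, IP protocol, dest IP, dest port) for an IPv4/UDP frame."""
    dest_mac = ":".join(map("{:02x}".format, frame[:6])).upper()
    # Skip 9 bytes, read the protocol, skip checksum and source IP, read dest IP
    ip_protocol, target_ip = struct.unpack("! 9x B 6x 4s", frame[14:34])
    # Skip the source port and read the destination port
    (dst_port,) = struct.unpack("! 2x H", frame[34:38])
    return dest_mac, ip_protocol, ".".join(map(str, target_ip)), dst_port


print(dissect(build_frame("01:00:5E:00:00:FC", "224.0.0.252", 5355)))
# ('01:00:5E:00:00:FC', 17, '224.0.0.252', 5355)
```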


This was kind of a long post, but I wanted to show you step by step how you can do this kind of thing with Python, and hopefully you learned a thing or two that will help you develop your own tools.

Big shout out to Bucky Roberts (@thenewboston) for his awesome tutorials.