Calculating the Shannon entropy of network traffic (saved in a CAP file) using Python

I have a dump file (CAP format) of network traffic captured with tcpdump on Ubuntu. Up to a certain point in time it is attack-free traffic. Then a series of TCP SYN flooding attacks begins. My goal is to calculate the entropy of each traffic period (with and without attacks) and to compare them.

Do you know of any Python library that calculates the Shannon entropy of network traffic?

I found the following code. What do you think?

import numpy as np
import collections

sample_ips = [
    "131.084.001.031",
    "131.084.001.031",
    "131.284.001.031",
    "131.284.001.031",
    "131.284.001.000",
]

# Count how many times each address occurs
C = collections.Counter(sample_ips)
counts = np.array(list(C.values()), dtype=float)
# Normalize the counts to probabilities that sum to 1
prob = counts / counts.sum()
# Shannon entropy in bits: H = -sum(p * log2(p))
shannon_entropy = (-prob * np.log2(prob)).sum()
print(shannon_entropy)

Imagine that these were the only IPs in the traffic collected during a certain time interval.

I would take several captures on different days to see how the entropy behaves, which gives me several different entropy values. What would be the best way to plot a graph in Python to check the behavior of the entropy?

    
asked by anonymous 01.03.2017 / 15:17

1 answer

Hmm, I don't know of any library for what you need. I use entropy calculations for audio, to help measure how different (random, disorganized) an audio frame is in the spectrum. What you want to do makes sense: depending on the returned entropy value you can decide whether an attack is under way. The more organized (less random) the tcpdump traffic is, the greater the chances that an attack is occurring. The code shown appears to match the equation that I use for entropy:

$$H = -\sum_i p_i \log_2 p_i, \qquad p_i = \frac{T_i}{\sum_j T_j}$$

where T_i is the data from your tcpdump capture; in your case you seem to be taking only the number of occurrences of each IP in a certain time interval. Before calculating the entropy you need to normalize the data, and this step also appears to be OK: your data is normalized in the line prob = counts/counts.sum()
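The normalization and entropy steps described here can be sketched as one small function that uses only the standard library. This is a minimal illustration, not a library API: the name `shannon_entropy` and the `normalized` flag are hypothetical, and dividing by log2 of the number of distinct values is an assumption made so that windows with different numbers of addresses can be compared on a 0 to 1 scale.

```python
import math
from collections import Counter


def shannon_entropy(values, normalized=False):
    """Shannon entropy (in bits) of the empirical distribution of `values`.

    With normalized=True the result is divided by log2(k), where k is the
    number of distinct values, so it falls in the range [0, 1].
    (Hypothetical helper, written for illustration only.)
    """
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Normalize the counts to probabilities: p_i = count_i / total
    probs = [c / total for c in counts.values()]
    # H = sum(p * log2(1 / p)), which equals -sum(p * log2(p))
    entropy = sum(p * math.log2(1 / p) for p in probs)
    if normalized and len(counts) > 1:
        entropy /= math.log2(len(counts))
    return entropy


# Same sample values as in the question
sample_ips = [
    "131.084.001.031",
    "131.084.001.031",
    "131.284.001.031",
    "131.284.001.031",
    "131.284.001.000",
]
print(shannon_entropy(sample_ips))
print(shannon_entropy(sample_ips, normalized=True))
```

For the five sample addresses the probabilities are 0.4, 0.4 and 0.2, which gives roughly 1.52 bits, or about 0.96 after normalization.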

As for the plot, the most obvious way is to store each entropy value together with its collection day and then make a simple plot using matplotlib.pyplot, something like plot(dia, entropia) (day, entropy). From your observations you may be able to set a threshold and later classify automatically which days had an attack. Remember that the lower the entropy value, the more organized the traffic and so the greater the chances that an attack has occurred (when the entropy is normalized to the 0 to 1 range, values close to 1 mean the data is most random). For source addresses under a spoofed flood the shift can go the other way, so compare against your attack-free baseline. It may also be interesting to go further and do an hourly analysis rather than a daily one :-)
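The store-then-plot idea, with the hourly analysis and a threshold, might look roughly like the sketch below. It assumes the capture has already been turned into `(timestamp, ip)` pairs by some pcap reader, which is not shown here. All function names, the made-up records, the choice of the first window as the attack-free baseline and the tolerance of 0.5 bits are hypothetical. Flagging by absolute deviation from the baseline is also an assumption, chosen because it does not depend on which direction an attack moves the entropy.

```python
import math
from collections import Counter, defaultdict


def shannon_entropy(values):
    """Shannon entropy in bits of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def entropy_per_window(records, window_seconds=3600):
    """Group (timestamp, ip) records into fixed-size windows and return
    a list of (window_start, entropy) pairs sorted by window start."""
    windows = defaultdict(list)
    for timestamp, ip in records:
        start = int(timestamp // window_seconds) * window_seconds
        windows[start].append(ip)
    return [(start, shannon_entropy(ips)) for start, ips in sorted(windows.items())]


def flag_windows(series, baseline, tolerance):
    """Return the window starts whose entropy differs from `baseline`
    by more than `tolerance` bits."""
    return [start for start, h in series if abs(h - baseline) > tolerance]


def plot_entropy(series, baseline=None, path="entropy.png"):
    """Plot entropy against window start and save the figure to `path`."""
    # Imported here so the functions above work without matplotlib installed
    import matplotlib.pyplot as plt

    plt.plot([s for s, _ in series], [h for _, h in series], marker="o")
    if baseline is not None:
        plt.axhline(baseline, linestyle="--")
    plt.xlabel("window start (s)")
    plt.ylabel("entropy (bits)")
    plt.savefig(path)
    plt.close()


# Made-up records: four distinct addresses in the first hour,
# a single repeated address in the second hour.
records = [
    (10.0, "10.0.0.1"), (20.0, "10.0.0.2"), (30.0, "10.0.0.3"), (40.0, "10.0.0.4"),
    (3700.0, "10.0.0.9"), (3710.0, "10.0.0.9"), (3720.0, "10.0.0.9"), (3730.0, "10.0.0.9"),
]
series = entropy_per_window(records)
baseline = series[0][1]  # assumption: the first window is attack-free
flagged = flag_windows(series, baseline, tolerance=0.5)
print(series)
print(flagged)
```

Calling `plot_entropy(series, baseline)` afterwards would write the curve to `entropy.png`; with real data the baseline and tolerance would have to be chosen from the attack-free part of the capture.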

    
02.03.2017 / 02:06