The Shannon entropy is given by the formula:
WhereTi
willbethedataextractedfrommynetworkdump(dump.pcap).
TheendofanHTTPheaderonaregularconnectionismarkedby\r\n\r\n
:
ExampleofanincompleteHTTPheader(couldbeadenialofserviceattack):
Mygoalistocalculatetheentropyofthenumberofpackageswith\r\n\r\n
andwithout\r\n\r\n
inordertocomparethem.
IcanreadthePCAPfilelikethis:
importpysharkpkts=pyshark.FileCapture('dump.pcap')
EntropybasedonIPnumbersI'vedone:
importnumpyasnpimportcollectionssample_ips=["131.084.001.031",
"131.084.001.031",
"131.284.001.031",
"131.284.001.031",
"131.284.001.000",
]
C = collections.Counter(sample_ips)
counts = np.array(list(C.values()),dtype=float)
#counts = np.array(C.values(),dtype=float)
prob = counts/counts.sum()
shannon_entropy = (-prob*np.log2(prob)).sum()
print (shannon_entropy)
Any ideas? Is it possible / does it make sense to calculate the entropy based on the number of packages with \r\n\r\n
and without \r\n\r\n
? Or is it something that does not make sense?
Any ideas on how to do the calculation?
The network dump is here: link
Some lines from it:
30 2017/246 11:20:00.304515 192.168.1.18 192.168.1.216 HTTP 339 GET / HTTP/1.1
GET / HTTP/1.1
Host: 192.168.1.216
accept-language: en-US,en;q=0.5
accept-encoding: gzip, deflate
accept: */*
user-agent: Mozilla/5.0 (X11; Linux i686; rv:45.0) Gecko/20100101 Firefox/45.0
Connection: keep-alive
content-type: application/x-www-form-urlencoded; charset=UTF-8