Data classification of network attacks (attack or non-attack)


I'm using the dataset:

https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets/

The purpose is to classify a sample as attack or non-attack. Is it a good idea to use logistic regression?

I wrote the code below to draw pairplots of the dataset. The problem is that the dataset has 49 columns, so I would like to restrict which columns the pairplot uses. I tried slicing the variable UNSW11, as in UNSW11[:, 1:5], inside the pairplot call, but I got the error: "builtins.TypeError: unhashable type: 'slice'".

Is there any way to limit the number of columns included in the pairplot?

    # -*- coding: utf-8 -*-
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    import seaborn as sns

    # Load the dataset and plot every pair of columns, coloured by class
    UNSW11 = pd.read_csv('/home/govinda/Desktop/UNSW-NB15_1_ed.csv')
    sns.pairplot(UNSW11, palette='bwr', hue='class')  # use hue!
    plt.show()
    
asked by anonymous 10.06.2018 / 21:46

1 answer


I ran a test with the dataset you linked and was able to select the pairplot columns with the vars parameter:

    sns.pairplot(df, palette='bwr', hue="var45", vars=["var8", "var9", "var41"])
    plt.show()

Here var8, var9, var41, and var45 are columns 8, 9, 41, and 45 of your dataset, respectively.

This was my resulting plot
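As for the error in the question: as far as I can tell, UNSW11[:, 1:5] fails because square brackets on a DataFrame look up column labels, and a tuple containing a slice cannot be used as a label. Positional slicing goes through .iloc instead, as in UNSW11.iloc[:, 1:5]. Keep in mind that a sliced frame passed to pairplot must still contain the hue column, which is why vars is usually more convenient. Below is a minimal sketch of picking the vars by position. The helper names columns_by_position and pairplot_subset, and the toy data, are my own illustrations and not part of seaborn or of the dataset.

```python
import pandas as pd


def columns_by_position(df, positions):
    """Return the column names found at the given 0-based positions."""
    return [df.columns[i] for i in positions]


def pairplot_subset(df, positions, hue):
    """Draw a pairplot restricted to the columns at `positions`, coloured by `hue`."""
    import seaborn as sns  # imported here so the helper above works without seaborn

    return sns.pairplot(df, vars=columns_by_position(df, positions),
                        hue=hue, palette='bwr')


# Toy stand-in for the real dataset (hypothetical column names and values).
toy = pd.DataFrame({
    'dur': [0.1, 0.5, 0.2, 0.9],
    'sbytes': [100, 2500, 300, 4000],
    'dbytes': [50, 1200, 80, 3000],
    'class': [0, 1, 0, 1],
})

# Positional slicing uses .iloc; plain toy[:, 1:3] would raise an error.
print(list(toy.iloc[:, 1:3].columns))  # ['sbytes', 'dbytes']

# Names at positions 0 and 2, ready to pass as vars=...
print(columns_by_position(toy, [0, 2]))  # ['dur', 'dbytes']
```

With the real data this would be called as pairplot_subset(UNSW11, [7, 8, 40], hue='class') followed by plt.show(). Positions are 0-based, so columns 8, 9, and 41 counted from 1 become 7, 8, and 40.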

    
28.06.2018 / 04:46