A bar graph generated with Python has become unreadable. How can I improve it? How can I work with a dataset of more than 1 million rows?

-1

Friends,

The following bar chart was generated (the first column of each dataset is UNIX time):

The Python code (version 3.5) used was as follows:

# -*- coding: utf-8 -*-
import matplotlib.pyplot as plt
import matplotlib.dates as dates
from datetime import datetime, timedelta

x = []
y = []
with open("/Radhe/LabAbril2017Capturas/slices_calculos/winTime_10Abril_SemAtaques.csv") as f:
    for l in f:
        X, Y = l.split(",")  # the separator is a comma
        x.append(float(X))
        y.append(float(Y))

x1 = [datetime.fromtimestamp(int(d)) for d in x]
y_pos = [idx for idx, i in enumerate(y)]

plt.gca().xaxis.set_major_formatter(dates.DateFormatter('%m/%d/%Y %H:%M:%S'))

y1 = []
v = 0
y_sorted = sorted(y)
for i in y_sorted:
    if abs(i - v) > 50:
        y1.append(i)
        v = i

plt.bar(y_pos, y, align='edge', color="blue", alpha=0.5, width=0.5) 

plt.title("TCP window size during a period without attacks")
plt.ylabel("TCP window size")
plt.xlabel('Time')
plt.xticks(y_pos, x1, size='small',rotation=35, ha="right")
plt.yticks(y1)
plt.ylim(bottom=y_sorted[0] - 200)  # minimum value of the y axis

plt.show()

With the dataset winTime_10Abril_slowloris.csv, the chart was just as unreadable:

The dataset winTime_10Abril_SemAtaques.csv is available here: link

The dataset winTime_10Abril_slowloris.csv is available here: link

How can I make the chart more readable? Is there a more efficient way to do it? My next dataset has about 1 million rows, so it will take a long time.

The dataset with 1 million rows (winTime_10Abril_sockstress.csv): link

    
asked by anonymous 26.08.2017 / 16:39

1 answer

2

Half answer, half comment.

Take a look at this question / answer:

link

It relies on matplotlib's LineCollection.
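As a rough illustration of that idea (not necessarily what the linked answer does), each sample can be drawn as one thin vertical segment, with all segments handed to matplotlib in a single LineCollection instead of one bar patch per row. The helper names build_segments and plot_as_line_collection, the English labels, and the styling values are assumptions made for this sketch.

```python
from datetime import datetime


def build_segments(xs, ys):
    """Return one vertical segment [(x, 0), (x, y)] per sample."""
    return [[(x, 0.0), (x, y)] for x, y in zip(xs, ys)]


def plot_as_line_collection(timestamps, values):
    """Draw every sample as a thin vertical line on a real time axis."""
    # Imported here so build_segments stays usable without matplotlib.
    import matplotlib.pyplot as plt
    import matplotlib.dates as mdates
    from matplotlib.collections import LineCollection

    # Convert UNIX seconds to matplotlib date numbers.
    xs = [mdates.date2num(datetime.fromtimestamp(int(t))) for t in timestamps]
    segments = build_segments(xs, values)

    fig, ax = plt.subplots()
    # A single collection is much cheaper than one artist per bar.
    ax.add_collection(
        LineCollection(segments, colors="blue", linewidths=0.5, alpha=0.5)
    )
    ax.xaxis_date()
    ax.xaxis.set_major_formatter(mdates.DateFormatter("%m/%d/%Y %H:%M:%S"))
    # add_collection does not rescale the view by itself.
    ax.autoscale_view()
    fig.autofmt_xdate(rotation=35)
    ax.set_xlabel("Time")
    ax.set_ylabel("TCP window size")
    plt.show()
```

Calling plot_as_line_collection(x, y) with the two lists read from the CSV should produce the plot. Because the x axis holds real dates, matplotlib picks a handful of ticks on its own rather than one label per row.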

Maybe you can use the same technique. Another option is to reduce the number of lines and points, as suggested there. In general you do not need 1 million points; about 1000 are enough. The hard part is selecting and plotting only what is needed (for example, showing only a few of the […] already lightens the graphical part).
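One way to get from 1 million rows to roughly 1000 points is to split the data into equal-sized buckets and keep only the largest value of each bucket, so spikes survive the reduction. The sketch below is only one possible selection rule; the names downsample and plot_downsampled, the default of 1000 points, and the choice of 10 tick labels are assumptions, not something taken from the question or from the linked answer.

```python
from datetime import datetime


def downsample(xs, ys, target=1000):
    """Keep at most `target` points: the peak of each equal-sized bucket."""
    n = len(ys)
    if n <= target:
        return list(xs), list(ys)
    out_x, out_y = [], []
    for b in range(target):
        start = b * n // target
        end = (b + 1) * n // target
        # Index of the largest value inside this bucket.
        peak = max(range(start, end), key=ys.__getitem__)
        out_x.append(xs[peak])
        out_y.append(ys[peak])
    return out_x, out_y


def plot_downsampled(timestamps, values, target=1000, n_ticks=10):
    """Bar chart of the reduced data with only a few x tick labels."""
    # Imported here so downsample stays usable without matplotlib.
    import matplotlib.pyplot as plt

    xs, ys = downsample(timestamps, values, target)
    positions = list(range(len(ys)))

    fig, ax = plt.subplots()
    ax.bar(positions, ys, width=1.0, align="edge", color="blue", alpha=0.5)

    # Label roughly n_ticks bars instead of every single one.
    step = max(1, len(ys) // n_ticks)
    tick_pos = positions[::step]
    tick_labels = [
        datetime.fromtimestamp(int(xs[i])).strftime("%m/%d/%Y %H:%M:%S")
        for i in tick_pos
    ]
    ax.set_xticks(tick_pos)
    ax.set_xticklabels(tick_labels, rotation=35, ha="right", size="small")
    ax.set_xlabel("Time")
    ax.set_ylabel("TCP window size")
    plt.show()
```

Keeping the bucket maximum suits data where peaks matter. If the typical level is of more interest, the mean of each bucket could be used instead.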

    
26.08.2017 / 20:15