How to save space in an audit log?

2

I am building a system that returns data about my users for a date range I request, from X to Y for example. The only way I found to do it was to create a table like this:

date: the date the row was inserted, in DATE format

name: the name of what was saved, for example Browser

value: the value recorded, for example Google Chrome

ip: the user's IP

When the user enters the site, I check whether there is already a row for TODAY's date with the name BROWSER. If it does not exist, I insert it; if it exists, I do nothing.
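The check described above can be sketched as follows. This is only an illustration: it assumes SQLite through Python's `sqlite3` module, a hypothetical table called `audit`, and that "already exists" means the same date, name and IP (the uniqueness rule is a guess at what is intended).

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
conn.execute(
    """
    CREATE TABLE audit (
        date  TEXT NOT NULL,   -- ISO date, e.g. 2014-12-27
        name  TEXT NOT NULL,   -- e.g. BROWSER
        value TEXT NOT NULL,   -- e.g. Google Chrome
        ip    TEXT NOT NULL,
        UNIQUE (date, name, ip) -- assumed rule: one row per day, name and IP
    )
    """
)

def record(conn, name, value, ip, day=None):
    """Insert the row unless one already exists for this day, name and IP."""
    day = (day or date.today()).isoformat()
    cur = conn.execute(
        "INSERT OR IGNORE INTO audit (date, name, value, ip) VALUES (?, ?, ?, ?)",
        (day, name, value, ip),
    )
    conn.commit()
    return cur.rowcount == 1  # True only when a new row was written

first = record(conn, "BROWSER", "Google Chrome", "127.0.0.1")
second = record(conn, "BROWSER", "Google Chrome", "127.0.0.1")  # ignored
```

Letting the database enforce the rule with a `UNIQUE` constraint avoids a separate SELECT before each INSERT.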

That way it records the browser of each user who accessed my site, and then I can select the data, create charts, etc.

The problem with this is space: wouldn't it take up a lot of space? Take, for example, an average of 5-10 thousand visits per day, and consider that I do not want to capture only the browser. How could I solve this problem?

    
asked by anonymous 27.12.2014 / 18:55

2 answers

5

The simple answer is that you can't. And even if it does take up space, if you need this data you should store it. You do not have a problem to solve, just a decision to make.

Do you have access to the site's access log? Every site has one, and nobody complains about lack of space for it. It already holds all the information you want, and probably much more, with more redundancy, so it occupies far more space.

If space is lacking, buy more space. If you cannot afford it, stop doing what is taking up space. There is no miracle.

Of course there are clever solutions, but they are probably not worth the complexity, and it is questionable whether they would give a good result. Disk space is far cheaper and risk-free.

For a while you had accepted the other answer, perhaps because you found something good in it, but not entirely. Maybe you did not understand what I meant, so I'll try to make it clearer.

The other answer proposes merging two pieces of information that were stored separately into a single row. That really is good, because it eliminates some repetition. But if you do this, you must treat each access as a row. That is, the structure has to be completely different from what you are doing now. And the way you are doing it may be needed for some reason that only you know. No one can say what is best without knowing everything you need.

The problem with that answer is that if you are going to create a row for each access, it probably makes no sense to create a database for it: that line already exists in the HTTP server's log file. Just query it and compute the statistics you want. Duplicating the effort is what wastes space. In fact, other people have already done this for you: there are hundreds or thousands of free and commercial programs that generate fairly complete statistics.
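As a rough illustration of computing statistics straight from the server log, here is a minimal sketch. It assumes the server writes the Combined Log Format (check your own configuration); the sample lines and the function name `agents_per_day` are made up for the example.

```python
import re
from collections import Counter
from datetime import datetime

# One line of the Combined Log Format (assumed; adjust to your server's format).
LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def agents_per_day(lines):
    """Count visitors per (day, user agent), counting each IP once per day."""
    seen = set()
    counts = Counter()
    for line in lines:
        m = LINE.match(line)
        if not m:
            continue  # skip lines written in another format
        day = datetime.strptime(m["time"], "%d/%b/%Y:%H:%M:%S %z").date()
        key = (day, m["ip"], m["agent"])
        if key not in seen:
            seen.add(key)
            counts[(day, m["agent"])] += 1
    return counts

# Made-up sample lines; in practice you would iterate over the log file.
SAMPLE = [
    '127.0.0.1 - - [27/Dec/2014:18:55:00 +0000] "GET / HTTP/1.1" 200 512 "http://google.com/" "Mozilla/5.0 Chrome/39.0"',
    '127.0.0.1 - - [27/Dec/2014:18:56:10 +0000] "GET /about HTTP/1.1" 200 734 "-" "Mozilla/5.0 Chrome/39.0"',
    '10.0.0.2 - - [27/Dec/2014:19:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 Firefox/34.0"',
    '10.0.0.3 - - [27/Dec/2014:19:05:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 Chrome/39.0"',
]

counts = agents_per_day(SAMPLE)
for (day, agent), visitors in sorted(counts.items()):
    print(day, agent, visitors)
```

Existing log analyzers do this and much more, so a script like this is only worth writing if you need a statistic they do not provide.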

    
27.12.2014 / 19:11
1

Some things can be changed to save space, but it's not that simple.

First, you say you have name and value.

Wouldn't it be better to create a column for each fixed value?

   date    |   name    |     value     |      ip
2014/12/27 | BROWSER   | GOOGLE CHROME | 127.0.0.1
2014/12/27 | HTTP_REF  | GOOGLE.COM    | 127.0.0.1

In this case date and ip are duplicated, which could be solved if it were:

 date      |   browser       |     HTTP_REF     |      ip
2014/12/27 | GOOGLE CHROME   |    GOOGLE.COM    | 127.0.0.1

That way, you would save some duplicated rows and data.

Why not use INT?

If you are the one who sets the name, why not store it as an INT?

0 -> BROWSER
1 -> HTTP_REF

   date    | name  |     value     |      ip
2014/12/27 | 0     | GOOGLE CHROME | 127.0.0.1
2014/12/27 | 1     | GOOGLE.COM    | 127.0.0.1
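On the application side, the mapping between the integers and the names could be kept in one place. A minimal sketch, assuming Python; the enum name `Name` is hypothetical, and the codes are the ones from the table above.

```python
from enum import IntEnum

class Name(IntEnum):
    """Integer codes stored in the name column (values from the table above)."""
    BROWSER = 0
    HTTP_REF = 1

# Encoding: store the small integer instead of the text.
row = ("2014/12/27", int(Name.BROWSER), "GOOGLE CHROME", "127.0.0.1")

# Decoding: turn the stored integer back into a readable label for reports.
label = Name(row[1]).name
```

The saving comes from storing a small integer per row instead of repeating the same string; how much that matters depends on how the database stores each column type.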

Cron job to delete old data

If you save data daily, create a cron job at 00:00 that computes all the necessary averages and then erases the raw data with a good old TRUNCATE. That would convert 10 thousand rows into one containing all the averages and data already calculated.
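A minimal sketch of such a nightly job, under several assumptions: SQLite via Python's `sqlite3`, hypothetical tables `audit` (detail) and `audit_daily` (summary), and counts per value rather than averages. SQLite has no TRUNCATE, so the sketch uses DELETE instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
conn.execute("CREATE TABLE audit (date TEXT, name TEXT, value TEXT, ip TEXT)")
conn.execute("CREATE TABLE audit_daily (date TEXT, name TEXT, value TEXT, hits INTEGER)")
conn.executemany(
    "INSERT INTO audit VALUES (?, ?, ?, ?)",
    [
        ("2014-12-27", "BROWSER", "GOOGLE CHROME", "127.0.0.1"),
        ("2014-12-27", "BROWSER", "GOOGLE CHROME", "10.0.0.2"),
        ("2014-12-27", "BROWSER", "FIREFOX", "10.0.0.3"),
    ],
)

def roll_up(conn, day):
    """Replace the detail rows for `day` with one count per (name, value)."""
    with conn:  # single transaction: both statements commit, or neither does
        conn.execute(
            """
            INSERT INTO audit_daily (date, name, value, hits)
            SELECT date, name, value, COUNT(*)
            FROM audit
            WHERE date = ?
            GROUP BY date, name, value
            """,
            (day,),
        )
        conn.execute("DELETE FROM audit WHERE date = ?", (day,))

roll_up(conn, "2014-12-27")
summary = conn.execute(
    "SELECT name, value, hits FROM audit_daily ORDER BY value"
).fetchall()
```

A crontab entry such as `0 0 * * * python3 /path/to/rollup.py` (the path is hypothetical) would run it at midnight. Keep in mind that once the detail rows are deleted, any statistic you did not compute beforehand is lost.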

    
27.12.2014 / 19:31