Recombination algorithm

3

Imagine the following scenario: a gas station is sued for tax evasion by issuing an invoice of tax coupons already issued - what happens is that each vehicle of the companies agreed to the station supplied and at the end of the month the station issued a note with your invoice.

The operation, however, was made wrong for almost 2 years. There was no evasion in any case, however, the numbers of tax coupons were not referenced to the invoices.

We are talking about an immense mass of data, where we basically need a match for quantity of liters per type of fuel x value - for example: an invoice of R $ 32,127.12 and 19,047.61 liters of diesel oil has to be "regrouped" with N tax coupons.

However, we have the following problems: fuel prices vary, since the invoice can be the combination of N pumps x N fiscal printers, that is, we are talking about a stratospheric mass of data.

However, knowing that we can limit the "search" radius of recombination by date (last 30 days) - (which in volume of data sums up to trillions of combinations in this period), could we use some tree algorithm? Or some variant algorithm of the traveling salesman?

    
asked by anonymous 05.04.2016 / 22:02

1 answer

1

It looks like you're dealing with a backpack problem , in English Knapsack problem . It's an NP-complete problem, so it has no quick fix, and performance only worsens with increasing items. He mentioned that it is a huge mass, so without much hope.

The advantage, in your case, is that the tax coupons have date, and possibly time. Assuming the account closes , what you can do is sort these coupons by date and time, and from back to front you will add up all the items until you match or exceed the nearest date note. Equal, lucky: remove the used coupons, mark the note as ok, and start over. Failed to sum up you "give up" the last coupon still alive, and repeat the process. This solution assumes that the system issuing orders does a similar process, picking up open and closing coupons in packages.

But the hypothesis that the account closes is actually very fragile. An alternative is to close sums of notes a day, for the days you are "sure" on certain notes, and then stick the "uncertain" coupons in the note of the previous month or later, as you have space in that or that month. >

The same solution, only you get from the station (s) that prepared that the list of coupons and notes. Anything else, even the perfect knapsack solution of all coupons on all the notes does not guarantee that this was the original relationship. It's just an approximation like any other in this indeterministic problem.

    
06.04.2016 / 00:04