Imagine the following scenario: a gas station is being sued for tax evasion for issuing invoices that cover tax coupons it had already issued. What happened is that every vehicle belonging to the companies that have an agreement with the station filled up there, and at the end of the month the station issued each company a single invoice.
The operation, however, was done incorrectly for almost 2 years. There was no evasion at all; the tax coupon numbers simply were not referenced on the invoices.
We are talking about an immense mass of data, where we basically need to match the quantity of liters per fuel type against the value. For example: an invoice for R$ 32,127.12 and 19,047.61 liters of diesel oil has to be "regrouped" with N tax coupons.
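To make the matching condition concrete, here is a minimal sketch that assumes each coupon records a fuel type, a quantity and a value. The `Coupon` class, the `find_matching_coupons` function and the sample figures are all made up for illustration. Amounts are stored as integers (centiliters and cents) so the sums compare exactly. It tries every combination, so it is only usable on a handful of coupons:

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Coupon:
    # Hypothetical record layout; the real coupon data may differ.
    number: int
    fuel: str
    centiliters: int  # liters * 100, kept as an integer
    cents: int        # value * 100, kept as an integer


def find_matching_coupons(coupons, fuel, target_centiliters, target_cents):
    """Return the first group of coupons whose liters AND value both
    add up to the invoice totals, or None if no group does."""
    candidates = [c for c in coupons if c.fuel == fuel]
    for size in range(1, len(candidates) + 1):
        for group in combinations(candidates, size):
            if (sum(c.centiliters for c in group) == target_centiliters
                    and sum(c.cents for c in group) == target_cents):
                return list(group)
    return None


# Made-up sample data, not taken from the real case.
coupons = [
    Coupon(1, "diesel", 10000, 16900),
    Coupon(2, "diesel", 5000, 8450),
    Coupon(3, "gasoline", 4000, 10000),
    Coupon(4, "diesel", 2500, 4200),
]
match = find_matching_coupons(coupons, "diesel", 12500, 21100)
print([c.number for c in match])
```

The number of groups doubles with every extra coupon, which is exactly why this naive form does not scale to the real volume.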
However, we have the following problems: fuel prices vary, and an invoice can be the combination of N pumps x N fiscal printers, so we are talking about a stratospheric mass of data.
Knowing that we can limit the "search" radius of the recombination by date (the last 30 days), which in volume of data still adds up to trillions of combinations in that period, could we use some tree algorithm? Or some variant of a traveling salesman algorithm?
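One possible reading of the problem is as a subset-sum search rather than a routing problem. The sketch below works under that assumption and is not a definitive solution. The function `match_invoice`, the tuple layout and the sample data are hypothetical. It first keeps only the coupons inside the date window, then builds the set of reachable (liters, value) totals one coupon at a time, discarding any partial total that already exceeds the invoice:

```python
from datetime import date, timedelta


def match_invoice(coupons, invoice_date, target_cl, target_cents, window_days=30):
    """coupons: iterable of (number, issued_on, centiliters, cents) tuples.
    Returns the coupon numbers of one matching group, or None."""
    start = invoice_date - timedelta(days=window_days)
    in_window = [c for c in coupons if start <= c[1] <= invoice_date]

    # Maps a reachable (centiliters, cents) total to one coupon list producing it.
    reachable = {(0, 0): []}
    for number, _, cl, cents in in_window:
        new_states = {}
        for (sum_cl, sum_cents), used in reachable.items():
            state = (sum_cl + cl, sum_cents + cents)
            if state[0] > target_cl or state[1] > target_cents:
                continue  # overshoots the invoice, cannot be part of a match
            if state not in reachable and state not in new_states:
                new_states[state] = used + [number]
        reachable.update(new_states)
        if (target_cl, target_cents) in reachable:
            return reachable[(target_cl, target_cents)]
    return None


# Made-up sample data, not taken from the real case.
sample = [
    (1, date(2019, 12, 15), 10000, 16900),  # outside the 30-day window
    (2, date(2020, 1, 5), 10000, 16900),
    (3, date(2020, 1, 10), 5000, 8450),
    (4, date(2020, 1, 20), 2500, 4200),
]
result = match_invoice(sample, date(2020, 1, 31), 12500, 21100)
print(result)
```

Because equal partial totals are stored only once, the work depends on how many distinct totals occur rather than on the raw number of combinations. The table can still grow very large on real data, and a matching total does not prove that those were the coupons actually billed, since several groups may add up to the same invoice.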