Rails update_all using another column of the same table

I need to update every row of a table by copying the value of one column into another, for example:

Model.update_all("a = b")

I would like to split the work across threads and run it in a nightly window, for example starting at midnight and finishing at 7 o'clock the next morning.

What is the best way to do this?

    
asked by anonymous 03.11.2016 / 20:37

1 answer

To run such a costly task, you first have to decide what happens if it does not finish within the time window. It can be cancelled entirely and rolled back to the previous state, suspended and resumed in the next window, or allowed to run on until it is done.

Each of these cases calls for a different approach. A good rule is divide and conquer: split the records into ranges of IDs and push each range as a batch to a Thread::Queue or to a Redis server, from which each thread (or each process, in the case of Redis) consumes one batch at a time.
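
As a rough illustration of the Thread::Queue variant, here is a minimal sketch in plain Ruby. The helper name process_batches and the worker_count parameter are made up for this example, and the block stands in for whatever each batch should do.

```ruby
# Hypothetical helper: runs the given block on every batch using a
# fixed pool of threads that all consume from one shared queue.
def process_batches(batches, worker_count: 4, &work)
  queue = Queue.new
  batches.each { |batch| queue << batch }
  queue.close # once closed and empty, pop returns nil instead of blocking

  results = Queue.new # Queue is thread-safe, so workers can push freely
  workers = Array.new(worker_count) do
    Thread.new do
      while (batch = queue.pop)
        results << work.call(batch)
      end
    end
  end
  workers.each(&:join)

  # Drain the results; their order depends on thread scheduling.
  Array.new(results.size) { results.pop }
end

sums = process_batches([1..3, 4..6, 7..9], worker_count: 2) { |range| range.sum }
```

In a Rails application the block would presumably be something like Model.where(id: range).update_all("a = b"), and the database connection pool would need at least as many connections as there are worker threads.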

One way to get groups of IDs is Model.find_in_batches, keeping only the ids of each batch. Another is to build ranges of IDs for each job: query Model.maximum(:id) and Model.minimum(:id) to get the bounds of the ID space, then divide it either by the number of batches you want or by the size you want each batch to have. Gaps in the IDs mean some ranges will hold fewer rows than others.
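
The range-based split can be sketched in plain Ruby as follows. The function name id_ranges is hypothetical; min_id and max_id are assumed to come from Model.minimum(:id) and Model.maximum(:id), both of which return nil on an empty table.

```ruby
# Hypothetical helper: cuts the ID space [min_id, max_id] into
# consecutive ranges holding at most batch_size IDs each.
def id_ranges(min_id, max_id, batch_size)
  raise ArgumentError, "batch_size must be positive" unless batch_size.positive?
  return [] if min_id.nil? || max_id.nil? # empty table

  (min_id..max_id).step(batch_size).map do |start|
    # The last range is clamped so it never goes past max_id.
    start..[start + batch_size - 1, max_id].min
  end
end

id_ranges(1, 10, 4) # => [1..4, 5..8, 9..10]
```

Each range can then be handed to a job that runs, for instance, Model.where(id: range).update_all("a = b").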

In addition, to enforce the end of the window, a monitoring thread can send a stop signal to each running thread.
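
One possible shape for that monitor, as a sketch only: the stop signal here is a shared flag that workers check between batches, so a batch that is already running is allowed to finish. The name run_until and its parameters are invented for this example.

```ruby
# Hypothetical helper: processes batches until the deadline passes and
# returns the batches that were never started, so they can be resumed
# in the next window.
def run_until(deadline, batches, worker_count: 2, &work)
  queue = Queue.new
  batches.each { |batch| queue << batch }
  queue.close # pop returns nil once the queue is empty

  stop = false # shared stop flag, checked by workers between batches
  monitor = Thread.new do
    sleep 0.05 until Time.now >= deadline
    stop = true
  end

  workers = Array.new(worker_count) do
    Thread.new do
      until stop
        batch = queue.pop
        break if batch.nil? # nothing left to do
        work.call(batch)
      end
    end
  end
  workers.each(&:join)
  monitor.kill # the work ended before the deadline, or the flag is already set

  remaining = []
  remaining << queue.pop until queue.empty?
  remaining
end

leftover = run_until(Time.now + 60, [1..100, 101..200]) { |range| range.size }
```

Returning the unprocessed batches fits the "suspend and continue in the next window" policy; a full cancel and rollback would need additional bookkeeping that this sketch does not attempt.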

A good scheduling library is rufus-scheduler.

Remember that if you only work with threads, you cannot scale beyond a single process, so Redis becomes an interesting tool for distributing jobs and even for broadcasting suspend signals among the processes involved.

    
23.11.2016 / 20:48