How to separate records into blocks for parallel processing?

1

I have a table in SQL Server that makes the control of messages to be sent and a routine in C# , configured in a Cron, that selects the pending messages, sends and marks as sent.

What I'm implementing is the ability to configure other calls on the Cron so that this processing is done in parallel, thus processing a larger volume of records in less time.

The challenge is in assembling the SQL statement so that one call does not select the records of the other call.

I have some premises to follow:

  • I can not change the table structure, nor create other tables;
  • The number of calls should be flexible, and more calls on the Cron may be included on demand.

The schema of the table and the query that I have is more or less this:

DECLARE @Mensagem TABLE (
    id_mensagem INT IDENTITY PRIMARY KEY,
    criado_em DATETIME DEFAULT GETDATE(),
    destinatário VARCHAR(250),
    assunto VARCHAR(250),
    mensagem VARCHAR(MAX),
    enviado_em DATETIME
);

INSERT @Mensagem
    (destinatário, assunto, mensagem)
VALUES
    ('[email protected]', 'Assnto 1 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 2 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 3 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 4 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 5 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 6 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 7 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 8 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 9 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 10 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 11 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 12 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 13 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 14 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 15 Lorem Ipsum', 'Mensagem Lorem Ipsum');

-- Seleciona os registros não enviados
SELECT
    M.id_mensagem,
    M.criado_em,
    M.destinatário,
    M.assunto,
    M.mensagem,
    M.enviado_em
FROM
    @Mensagem AS M
WHERE
    M.enviado_em IS NULL;

-- Após o envio pela rotina C#, marca a mensagem como enviada
UPDATE @Mensagem SET enviado_em = GETDATE() WHERE id_mensagem = 1;
    
asked by anonymous 20.04.2018 / 21:18

2 answers

0

The defined solution should consider that new messages can be received in the table at the same time that the process of sending messages is triggered or is running. That is, you need to differentiate the rows that have already been selected to send those that entered the table after the sending process has started. The ideal would be to add column in the table that indicated the situation of the message. Something like "A" (waiting for sending), "S" (selected for sending), "E" (sent) and "R" (sending error); perhaps some other situation that it deems necessary to be recorded. This would reduce the possibility of sending the same message more than once. However, since you can not change the table structure, this will have to be simulated differently. One option is to use the enviado_em column, with the following values:

  • NULL: message waiting to be sent;
  • 0: message selected for submission;
  • -1: error in sending the message;
  • other value : message sent.

To get the list of messages to send, and at the same time update the status of the selected rows, I suggest that you use the UPDATE statement in conjunction with the OUTPUT clause.

-- código #1
UPDATE Mensagem
  set enviado_em= 0
  output INSERTED.id_mensagem, 
         INSERTED.destinatario, 
         INSERTED.assunto, 
         INSERTED.mensagem
  where enviado_em is null;

You can then start sending the returned message set by executing code # 1 or trigger sending threads , passing a subset of the messages to each one. The number of threads as well as the subsets are defined dynamically in the program in #C, depending on the number of messages to send. That is, a single call on the Cron is enough.

As for marking sent messages, there are some approaches. One of them is that after sending each message the situation is updated in the database by thread , using the value of message id to find the line to be updated. If the message was sent, here's the code hint:

-- código #2
UPDATE Mensagem
  set enviado_em= current_timestamp
 where id_mensagem= @id;

Another is that each thread , in the end, inform the main body of the program which messages were sent and which ones were not, leaving it on the responsibility of the main body to update the situation of the messages in the table. >     

22.04.2018 / 15:02
-1

One approach that resolves is to perform the calculation of the remainder of the division between a single constant and the number of calls that will be made, thus generating an integer that groups the results.

In the example below, the constant used is the auto-increment column, as it generates a unique value for each line:

DECLARE @Mensagem TABLE (
    id_mensagem INT IDENTITY PRIMARY KEY,
    criado_em DATETIME DEFAULT GETDATE(),
    destinatário VARCHAR(250),
    assunto VARCHAR(250),
    mensagem VARCHAR(MAX),
    enviado_em DATETIME
);

INSERT @Mensagem
    (destinatário, assunto, mensagem)
VALUES
    ('[email protected]', 'Assnto 1 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 2 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 3 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 4 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 5 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 6 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 7 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 8 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 9 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 10 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 11 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 12 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 13 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 14 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 15 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 16 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 17 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 18 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 19 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 20 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 21 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 22 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 23 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 24 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 25 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 26 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 27 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 28 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 29 Lorem Ipsum', 'Mensagem Lorem Ipsum'),
    ('[email protected]', 'Assnto 30 Lorem Ipsum', 'Mensagem Lorem Ipsum');

-- Seleciona os registros não enviados apenas para uma determinada chamada
SELECT
    (M.id_mensagem % 3) + 1 AS chamada,
    M.id_mensagem,
    M.criado_em,
    M.destinatário,
    M.assunto,
    M.mensagem,
    M.enviado_em
FROM
    @Mensagem AS M
WHERE
    M.enviado_em IS NULL
    AND (M.id_mensagem % 3) + 1 = 1;-- Aqui entra o número da chamada, neste exemplo seria 1, 2 ou 3

What other approaches can solve this problem?

    
20.04.2018 / 21:18