UPSERT or UPDATE-INSERT?

1

Through this question I felt the need to research and understand a little about the UPSERT . I was wondering what a blessed command this is? I never saw ..

I found some information but it is still not clear to me because there is no "basic command", type UPSERT tabela (id=0, valor='nome') , for example.

I usually see codes using the following structure:

IF EXISTS (SELECT 1 FROM tabela WHERE id = 0)
    UPDATE tabela SET campo = 2 where id = 0
ELSE
    INSERT INTO tabela (id, campo) VALUES (0, 1)
END

About upsert found this way :

MERGE INTO tabela
USING (SELECT 0 AS id, 1 AS campo) AS reg
ON tabela.id = reg.id
WHEN MATCHED THEN
     UPDATE SET campo = reg.campo
WHEN NOT MATCHED THEN
     INSERT(id , campo)
     VALUES (reg.id, reg.campo)

So, regarding performance , which way should I use it? the IF that queries existence by the identifier and defines which flow to follow ( insert or update ) or MERGE ( upsert ) condenses the two commands?

    
asked by anonymous 23.08.2018 / 13:16

2 answers

3

How do you call the second block of code in the question? I call it a command. It could have indentation to make it clear that it is one thing.

You mean you do not have a UPSERT keyword? Ok, this is not necessary, except, perhaps, to facilitate. If it was a keyword I would have written with the appropriate style. I think it's wrong to write everything in capital letters because of the grammar.

The command needs to perform atomically and perhaps indempotently (I still do not conclude this, but have several sources saying, without explaining, then I will not buy so easy, have tried to explain me and not convinced me, became dictionary explanation) an operation or the other.

Is the first one in a transaction? It may be considered Upsert, but not a single command. I can not say that. If it is certainly not is not and may even generate problems, even if rare. I guess the first one does not run in isolation, so that's fine.

Each database implements as it wishes, or fails to implement. As far as I know it does not have a standard in ANSI SQL, even if it does, they do a poor job in the implementations.

Whether it interferes with performance or is not implementation detail, but it should change very little, other things make more difference, the interpretation of the command will make much more difference than itself. The former may be faster because it is simpler to parse it and execute it, but only by speculating. It may be that in the first case a duplicate existence check is occurring and this can make performance worse. Take a real test. You can change case by case.

What I do know is that some DBs are struggling to do right and keep up with performance, probably because of past wrong choices.

I do not know if MERGE has an exact match to the first block, so comparing them does not make sense.

    
23.08.2018 / 15:13
2

In terms of performance, I think it makes a difference when you have the UPSERT natively, as in the cassandra database that the INSERT is a UPSERT, so when giving INSERT as every bank already has to validate if the key already exists, it already decides there in that query if it is going to be an update or insert. Now, if you have to query, as in the case of the IF or MERGE, I believe it does not make a significant difference.

    
23.08.2018 / 13:23