Code Answer: How do I UPDATE a row in a table or INSERT it if it doesn't exist?

I have the following table of counters:

CREATE TABLE cache (
    key text PRIMARY KEY,
    generation int
);

I would like to increment one of the counters, or set it to zero if the corresponding row doesn't exist yet. Is there a way to do this without concurrency issues in standard SQL? The operation is sometimes part of a transaction, sometimes separate.

The SQL must run unmodified on SQLite, PostgreSQL and MySQL, if possible.

A search yielded several ideas which either suffer from concurrency issues, or are specific to a database:

Try to INSERT a new row, and UPDATE if there was an error. Unfortunately, the error on INSERT aborts the current transaction.
UPDATE the row, and if no rows were modified, INSERT a new row.
MySQL has an ON DUPLICATE KEY UPDATE clause.

EDIT: Thanks for all the great replies. It looks like Paul is right, and there's not a single, portable way of doing this. That's quite surprising to me, as it sounds like a very basic operation.

From stackoverflow

I don't know that you are going to find a platform-neutral solution.

This is commonly called an "UPSERT".

See some related discussions:
- http://stackoverflow.com/questions/108403/solutions-for-insert-or-update-on-sql-server
- http://stackoverflow.com/questions/234
Ryan Graham : Is that solution really applicable? and portable?
If you don't have a common way to atomically update or insert (e.g., via a transaction) then you can fallback to another locking scheme. A 0-byte file, system mutex, named pipe, etc...
Could you use an insert trigger? If it fails, do an update.

depesz : Trigger (at least in PostgreSQL) is running when the command worked. i.e. you can't have trigger that runs when base command failed.

Michael Todd : Hunh. Didn't know that. Thanks.
MySQL (and subsequently SQLite) also support the REPLACE INTO syntax:
```
REPLACE INTO my_table (pk_id, col1) VALUES (5, '123');
```
This automatically identifies the primary key and finds a matching row to update, inserting a new one if none is found.

Kyle Cronin : > (and subsequently SQLite) ...you totally stole that from my answer :-)

andygeers : Sorry - hope that's not considered rude?

Learning : if you found it useful , you could've atleast upvoted it :)

Evan : Actually, to be specific, MySQL's REPLACE always does an insert, but it will delete the row if it already exists, first. http://dev.mysql.com/doc/refman/4.1/en/replace.html

Zoredache : It is important to understand that it is an insert+delete and never and update. The consequence of this, is that you will always want to make sure when you do a replace, you should always include data for all the fields.

j_random_hacker : +1 Zoredache, that's important info.
SQLite supports replacing a row if it already exists:
```
INSERT OR REPLACE INTO [...blah...]
```
You can shorten this to
```
REPLACE INTO [...blah...]
```
This shortcut was added to be compatible with the MySQL REPLACE INTO expression.

Learning : +1 for sqlite context.
Standard SQL provides the MERGE statement for this task. Not all DBMS support the MERGE statement.
In PostgreSQL there is no merge command, and actually writing it is not trivial - there are actually strange edge cases that make the task "interesting".

The best (as in: working in the most possible conditions) approach, is to use function - such as one shown in manual (merge_db).

If you don't want to use function, you can usually get away with:
```
updated = db.execute(UPDATE ... RETURNING 1)
if (!updated)
  db.execute(INSERT...)
```
Just remember that it is not fault proof and it will fail eventually.
I would do something like the following:
```
INSERT INTO cache VALUES (key, generation)
ON DUPLICATE KEY UPDATE (key = key, generation = generation + 1);
```
Setting the generation value to 0 in code or in the sql but the using the ON DUP... to increment the value. I think that's the syntax anyway.

Fire Crow : I've used this solution in php/mysql applications it works very well.
the ON DUPLICATE KEY UPDATE clause is the best solution because: REPLACE does a DELETE followed by an INSERT so for an ever so slight period the record is removed creating the ever so slight possibility that a query could come back having skipped that if the page was viewed during the REPLACE query.

I prefer INSERT ... ON DUPLICATE UPDATE ... for that reason.

jmoz's solution is the best: though I prefer the SET syntax to the parentheses
```
INSERT INTO cache VALUES 
SET key = 'key', generation = 'generation'
ON DUPLICATE KEY 
UPDATE key = 'key', generation = (generation + 1)
;
```

Code Answer

Friday, April 15, 2011

How do I UPDATE a row in a table or INSERT it if it doesn't exist?

0 comments:

Post a Comment

Blog Archive