InnoDB repeatable read (RR) is complicated. This is my short description. It isn't exactly correct.
Summary
Processing of statements within a transaction for InnoDB RR:
Sources of Confusion
Summary
Processing of statements within a transaction for InnoDB RR:
- Statements that are read-only and x-lock free use snapshot isolation
- Other statements use read committed (RC) with gap locks
- All statements observe uncommitted changes from the transaction
It would have been simpler to use snapshot isolation as-is but that leads to more write-write conflicts.
Sources of Confusion
- The repeatable read snapshot is created at the start of the first real statement (SELECT, INSERT, etc) and not by BEGIN or START TRANSACTION. It is created by START TRANSACTION WITH CONSISTENT SNAPSHOT.
- Some RR transactions use SELECT to find rows to modify followed by UPDATE when they should have used SELECT ... FOR UPDATE. Without the FOR UPDATE a SELECT statement uses the snapshot from transaction start while the UPDATE uses RC and sees all committed changes. So SELECT might observe different rows from the UPDATE that follows.
- I never tried to learn whether SELECT ... FOR UPDATE and UPDATE use 1) use a snapshot from statement start or 2) see any row that is committed at the time statement processing encounters that row. I assume that it is #2. The manual states "reads the latest available data".
- The rules for gap locks and next-key locks are non-trivial. I don't claim they are more complicated than they need to be because concurrency is hard and they provide useful semantics by preventing phantom reads and more. While the academic community tends towards stronger isolation the common choices in popular systems are weaker (RC in Postgres and Oracle, RR in InnoDB, ? in SQL Server).
When does InnoDB RR implement RR?
I am not an expert on isolation. Experts have yet to review this post. I hope they do. The ANSI spec requires that RR prevent dirty read, non-repeatable read and phantom reads. Alas, InnoDB can do reads (evaluate the WHERE clause) at different points in time (see above) which can allow for some of non-repeatable reads and phantom reads unless you are careful to use the FOR UPDATE clause.