[Lazarus] SQLdb: Pseudo Transaction?
R.Smith
ryansmithhe at gmail.com
Tue Jun 30 16:49:06 CEST 2020
>
> Ryan,
>
> your answer helps a lot; thank you.
>
> Another (maybe stupid) question: What exactly is the purpose of a
> 'transaction'? Do I need it for each SQL statement to be executed?
That's an excellent question in fact - lots of confusion out there.
Two main reasons, which I will quickly show - 1 - ACID guarantees
(especially Isolation and Consistency, the others [Atomicity and
Durability] you get anyway), and 2 - Speed.
(https://en.wikipedia.org/wiki/ACID)
In most DB engines (SQLite definitely) most executed commands gets their
own little wrapped transaction if you did not start one explicitly -
just to make the nuts and bolts of the engine function correctly. This
might be a "lesser" form of transaction, but it has to at a minimum
prepare the table, lock tables/rows/whatever, do the reading/writing,
and then release those locks.
This also means that if you issue four commands, let's use an example
(based loosely on a contacts-list table), say you execute these three in
order:
SELECT MAX(ID) FROM contacts;
(The result of the previous statement is put into a variable, say: X)
INSERT INTO contacts(ID, FirstName, LastName, TimeAdded, ....) VALUES
(X+1, 'Joe', 'Soap', Now, ...);
INSERT INTO contacts(ID, FirstName, LastName, TimeAdded, ....) VALUES
(X+2, 'Joe', 'Jones', Now, ...);
INSERT INTO contacts(ID, FirstName, LastName, TimeAdded, ....) VALUES
(X+3, 'Joe', 'Smith', Now, ...);
Now inside the DB Engine, it is wrapped in pseudo-code like this
(Obviously there is more going on, but I'm just mentioning the
interesting bits to our example):
tt = Start_Transaction;
qq = prepare query( SELECT MAX(ID) FROM contacts; ) :
if (qq is READ Query)
obtain_read_lock(tt, qq); // Executed
ELSE
obtain_write_lock(tt, qq);// Skipped
run_query(qq);
output_results;
drop_locks(tt, qq);
release_prepared_resources(qq);
IF ERRORS > 0
roll_back(tt);
ELSE
commit(tt);
release_transaction(tt);
tt = Start_Transaction;
qq = prepare query( INSERT INTO contacts(ID, FirstName, LastName,
TimeAdded, ...) VALUES (X+1, 'Joe', 'Soap', Now, ...); ) :
if (qq is READ Query)
obtain_read_lock(tt, qq);// Skipped
ELSE
obtain_write_lock(tt, qq); // Executed
run_query(qq);
output_results;
drop_locks(tt, qq);
release_prepared_resources(qq);
IF ERRORS > 0
roll_back(tt);
ELSE
commit(tt);
release_transaction(tt);
tt = Start_Transaction;
qq = prepare query(INSERT INTO contacts(ID, FirstName, LastName,
TimeAdded, ...) VALUES (X+2, 'Joe', 'Jones', Now, ...); ) :
if (qq is READ Query)
obtain_read_lock(tt, qq); // Skipped
ELSE
obtain_write_lock(tt, qq); // Executed
run_query(qq);
output_results;
drop_locks(tt, qq);
release_prepared_resources(qq);
IF ERRORS > 0
roll_back(tt);
ELSE
commit(tt);
release_transaction(tt);
tt = Start_Transaction;
qq = prepare query(INSERT INTO contacts(ID, FirstName, LastName,
TimeAdded, ...) VALUES (X+3, 'Joe', 'Smith', Now, ...); ) :
if (qq is READ Query)
obtain_read_lock(tt, qq); // Skipped
ELSE
obtain_write_lock(tt, qq); // Executed
run_query(qq);
output_results;
drop_locks(tt, qq);
release_prepared_resources(qq);
IF ERRORS > 0
roll_back(tt);
ELSE
commit(tt);
release_transaction(tt);
The "output_results;" command might be a no-op for INSERT queries,
though some DBs do return values.
Now note some things:
Towards Point 1 above: If someone else also were trying to insert to
this database at the same time, they might get an X that is in-between
your inserts, and indeed insert values in between. This is called a
concurrancy problem and is fixed by the "isolation" in an ACID transaction.
(I know that behaviour can well be fixed by declaring the ID column
with AUTO_INCREMENT, but let's imagine it wasn't in this example, you
may well have other columns that must be Unique).
Note also that the inserted values for "Now()" would be slightly
different every time since it's just a littlebit later when the next
INSERT happens. Inside a transaction, the TimeAdded values will all be
the same.
Basically, while the transaction is in progress, and for the duration of
it, no other connection to the same database can mess with those values,
plus those values will remain consistent for the duration of the
transaction. This is a very important property to many people.
There are a few other considerations to get technical with, like
Serializability modes etc, allowing Dirty-reads, Read-uncommitted data
(they don't fall in the scope of this discussion, but you can look them
up if interested) - i'll only say here that not everyone wants their
transactions to be closed off completely so Database Engines have
switches/ways to allow breaking of the ACID model.
Point 2: Also note that the speed magnificently increases when you
explicitly START the transaction, because the above script and
pseudo-code changes to this:
BEGIN TRANSACTION;
SELECT MAX(ID) FROM contacts;
INSERT INTO contacts(ID, FirstName, LastName, TimeAdded, ....) VALUES
(X+1, 'Joe', 'Soap', Now, ...);
INSERT INTO contacts(ID, FirstName, LastName, TimeAdded, ....) VALUES
(X+2, 'Joe', 'Jones', Now, ...);
INSERT INTO contacts(ID, FirstName, LastName, TimeAdded, ....) VALUES
(X+3, 'Joe', 'Smith', Now, ...);
END TRANSACTION;
tt = Start_Transaction;
obtain_read_lock(tt);
qq = prepare query( SELECT MAX(ID) FROM contacts; ) :
if (qq is WRITE Query)escalate_to_write_lock(tt, qq); // Skipped
run_query(qq);
output_results;**
release_prepared_resources(qq);
qq = prepare query( INSERT INTO contacts(ID, FirstName, LastName,
TimeAdded, ...) VALUES (X+1, 'Joe', 'Soap', Now, ...); ) :
if (qq is WRITE Query) if (qq is WRITE
Query)escalate_to_write_lock(tt, qq); // Executed
run_query(qq);
release_prepared_resources(qq);
qq = prepare query( INSERT INTO contacts(ID, FirstName, LastName,
TimeAdded, ...) VALUES (X+2, 'Joe', 'Jones', Now, ...); ) :
if (qq is WRITE Query) if (qq is WRITE Query)escalate_to_write_lock(tt,
qq); // No-op because write-lock is already held.
run_query(qq);
release_prepared_resources(qq);
qq = prepare query( INSERT INTO contacts(ID, FirstName, LastName,
TimeAdded, ...) VALUES (X+3, 'Joe', 'Smith', Now, ...); ) :
if (qq is WRITE Query) if (qq is WRITE Query)escalate_to_write_lock(tt,
qq); // No-op because write-lock is already held.
run_query(qq);
release_prepared_resources(qq);
IF ERRORS > 0
roll_back(tt);
ELSE
commit(tt);
output_results;**
release_transaction(tt);
** - The exact times and places when results are output in queries
differ from Engine to Engine and can be controlled with switches in
some. Typically inline after selects, but sometimes you only want to see
the results that actually made it into the database (i.e. at the END of
the transaction).
You can of course get even faster when using multi-line inserts and
prepared statements - but that's also another discussion.
Note also that only transactions with explicit errors will be rolled
back by the Engines typically. If there is an error (a thing that you
don't want to commit to the data) but it isn't a problem to commit it
for the DB engine, then you have to explicitly tell it to roll back. (In
sqlite this is just querying the command: ROLLBACK;)
There are also conflict handlers which you can specify on how you want
to proceed when a conflict (error) occurs, but again, that's another
topic, this one is too long already. :)
Cheers,
Ryan
More information about the lazarus
mailing list