DB – So, what’s a Transaction?

I posted a Gotcha in Oracle Commit some time back. This is one of the posts that gets visited a lot. Also, at work, I saw some SQL scripts that showed a complete lack of understanding of how transactions or commits work in Oracle (or any database for that matter). Hence this post.

Dictionary.com defines Transact as “To carry or conduct business to a conclusion or settlement”. And a Transaction is an act of Transacting! BusinessDictionary.com goes a bit further to explain it as a contract or agreement. It even goes further to define it as an event in accounting, such as entries that go into a cash register. A Transaction then is a self-contained unit of work, that is completed (either with success or failure). This is what is adapted in Database terminology. A transaction must be complete, reliable and independent of other actions.

In Database parlance, we call it an ACID – atomic, consistent, isolated and durable. See Microsoft’s post about ACID here or this Wikipedia article. When a DB transaction succeeds, we COMMIT it and when it fails, we ROLLBACK it.

Transactions in real life
A transaction typically happens between 2 parties and it typically concludes to the satisfaction of both the parties involved. The parties involved either agree to go all the way, or cancel/settle somehow. Think about it. Even if there are multiple parties involved, you can always break it down to smaller and smaller transaction until you can no longer split and that will among 2 parties.

A typical transaction we all can relate is the cash transaction at a bank (at a Teller counter the ATM): You give your card. The teller (or the machine) validates it and asks you to enter your pin number. Once you have done that successfully, then the teller (or the machine) asks you the type of transaction you would like to perform – deposit or withdraw. If you are withdrawing, you tell the amount and the teller gives you the money and deducts it from the account and finally gives you a receipt.

SOCIAL ATM
Courtesy life2point1 on toondoo.com

For example, imagine you are at an ATM depositing money into your account. Say you had $1000 in the account before you deposited another $100. This is your transaction with the bank at the moment. (You may have other transactions at the same bank and elsewhere, but each is an independent unit of work). If everything goes well, at the end of the “transaction”, your account will have $1100. What if all is not well? What if Power goes down in the middle of this transaction? Will your account now have $1000 or $1100 or even $900 (incorrect)? What happened to your deposit? The answer in all these cases should be, your deposit is safely completed irrespective of external issues. That’s where the concept of transactions help.

building_a_houseFor another e.g., slightly longer transaction, let’s assume your transaction is building a house. To do this, you go to the bank and get the loan, go to the builder and decide the plan, builder cashes his check upon which his bank transacts with yours, builder goes to construction company and gets the building built and so on. And if any of these individual transactions fail, there are repercussions in others. For .e.g., if your transaction with the bank failed, your transaction with the builder does not happen, even if you previously agreed.

Imagine such transactions – they have to be atomic – a collection of small steps each of which is completed and the teller or the system will take you to the next step. For e.g., when you are at the ATM, if the password step doesn’t complete, you don’t stand a chance of seeing the money (worse case, may even trigger a new transaction with the security guard!). Your transaction with the teller is isolated or independent, we don’t want it to be mixed with your other transactions or someone else’s! And it is durable – once done, the results are set in stone (or computer). We certainly expect the house to be durable at the end of that transaction!! And these concepts are typically applied to the Database, since that’s where your results of your transactions are stored (persisted).

Transactions in a DB
The concept of transactions is not new to Databases. Mainframe databases had them for a long time. Modern Relational Databases (and SQL) are all transactional by nature. See this about.com post for a description of database transactions in general. Every major SQL database vendor support transactions. Some may support Autocommit by default. Some, like Oracle, may leave it to the SQL developer  (you!).

In a relational database, every transaction may contain one or more database operations. In RDBMS these operations are grouped as Data Manipulation Language (DML), Query and Data Declaration Language (DDL). As I mentioned in my previous post, Commits work differently in DMLs and DDLs in Oracle.

The DML statements are the atomic operations that perform a single database operation like INSERT, UPDATE and DELETE. These operations can be performed individually or combined in a group of statements. Together these make a “Transaction”. At the end of such a transaction, you will have to COMMIT or ROLLBACK to make the effect of transaction permanent or ROLLBACK to revert back to before the transaction block, so the database is left unaffected. See here for more on COMMIT etc. Again, COMMIT can be automatic or manual.

I would like to point out one gotcha here: Since DDL automatically commits, you don’t really need to use an explicit COMMIT after a DDL statement (CREATE, ALTER etc). I am somewhat puzzled to see even experienced developers and DBAs sometimes make this mistake. This is true for even CTAS (Create Table As) statements in Oracle.

CREATE TABLE dup_employee AS
SELECT * FROM employee;
-- You don't really need this COMMIT; it's already committed above.
<span style="text-decoration: line-through;">COMMIT;</span>

We say that this is in an implicitly committed transaction. If we expanded the above statement, it may look like this:

COMMIT;
CREATE TABLE dup_employee AS
SELECT * FROM employee;
COMMIT;

Just you or the database don’t have to show it explicitly, as a matter of convenience. But, this definitely

 TRANSACTION Block

When a bunch of SQL statements participate in a single transaction, they are in one single Transactional block. To indicate beginning and end Transaction block, SQL standard talks about transaction control statements. For e.g, a START TRANSACTION (MySQL), BEGIN TRANSACTION (SQL Server) or BEGIN WORK(Informix) marks the beginning of a transaction block. Typically a COMMIT or ROLLBACK would mark the end. To be able commit or rollback to a specific section in the transactional block, databases offer sort of bookmarks, the SAVE POINTS. When you start a transaction block, make sure there is no other transaction block is already open, meaning the last block was committed or rolled back. Otherwise, you will get an error from the database.

In Oracle SQL, you don’t need an explicit “START TRANSACTION” or “BEGIN WORK”  statement to mark the beginning of a transaction. Typically,  the first non-committed executable SQL statement, can be DML or DDL, (implicitly) marks the beginning of a new transaction block. (otherwise, each statement joins the previously open transaction block). A commit or Rollback will end the transaction.
For e.g,
<<implicit beginning of transaction>>

INSERT INTO department (dept_id, dept_name) VALUES (10, 'HR');
INSERT INTO employee (emp_id, employee_name, dept_id) VALUES (100, 'Employee', 10);
INSERT INTO payroll(emp_id, emp_type, pay_rate, pay_rate_type) VALUES(100, 'TEMP', 100, 'HOURLY');

COMMIT;

INSERT INTO lotto_pool (pool_id, emp_id) VALUES (1, 100);

These imaginary inserts are done for each new employee that joins a company. The employee gets added to the employee table and also cross-referenced in a department, payroll tables. Come to think about it, each SQL statement above depends on the previous one to succeed. Thus, they can be part of a single transaction block. If the department does not exist, he cannot be added and if the employee doesn’t exist, he cannot be added to payroll etc.

The above example is just for the illustration of a transaction. Really speaking, the department could have been added long before 100th employee arrives. An employee record could be added well before, his payroll information is setup).

To add little more control to the transaction handling, Databases also provides a marker. In Oracle this is called a SAVEPOINT.

In the above example, if you assume a save point after each DML statement, then you will be able to rollback or commit to a specific save point.

INSERT INTO department (dept_id, dept_name) VALUES (10, 'HR');

SAVEPOINT dept_save;

INSERT INTO employee (emp_id, employee_name, dept_id) VALUES (100, 'Employee', 10);
SAVEPOINT emp_save;
INSERT INTO payroll(emp_id, emp_type, pay_rate, pay_rate_type) VALUES(100, 'TEMP', 100, 'HOURLY');

SAVEPOINT payroll_save;

In this example, we could simply save the department, even if there is an error in employee or payroll.

 

References

http://www.oracle-dba-online.com/sql/commit_rollback_savepoint.htm

https://www-304.ibm.com/support/knowledgecenter/SSGU8G_11.50.0/com.ibm.sqls.doc/ids_sqs_0329.htm

http://dev.mysql.com/doc/refman/5.7/en/commit.html

https://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_4010.htm

Advertisements

Comments, please?

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s