SQL Dependency Injection
“All problems in computer science can be solved by another level of indirection, but that usually will create another problem.” — David Wheeler, inventor of the subroutine
This is the story of a “framework” that doesn’t exist: SqlDi
Most database frameworks’ authors consider that it is wrong to write SQL in your source code and provide many kinds of wrappers you need to deal with, instead of pure query strings.
Their reasons are along the lines of:
Portability: SQL is not database independent
Sadly, this is true. Although there are ANSI standards – ANSI SQL 99, 2003 etc. – no RDBMS supports all the features from ANSI and most vendors force you to use some of their custom extensions to accomplish things from the standard.
(For example, SQL 2003 introduces ‘window functions’ that could be used instead of the vendor specific ROWNUM, LIMIT or TOP keywords – and the window functions allow you to do even more. Few database systems support SQL 2003 though..)
Abstraction: SQL queries contain details about the storage mechanism
The relational model is a good way of assuring the integrity and non-redundancy of data. JOINs allow you to reverse the normalization process, while keeping the “Single Point of Truth” – storage model. INDEXes allow you to lookup faster.
Having SQL (in your source) that displays all these storage details is bad in practice. Even if your attributes (columns) are not subject to change, the way of retrieving them may be.
For example, if you want to display a (paginated) list of books on a web page, with their authors, a minimal query would be:
SELECT books.title as title, books.date_published as date_published, authors.name as author FROM books INNER JOIN authors ON books.author_id = autors.id WHERE ROW_NUMBER() OVER (ORDER BY books.title) BETWEEN 1 AND 10
This is both an example of SQL that reveals the storage internals and that is not portable (because ROW_NUMBER() is supported by very few ‘chosen’ ones)
Correctness: Query strings are validated only when run against the database
Even if you’re using an editor that knows how to highlight syntax from the SQL strings, you are still prone to mistakes, syntactical or not (e.g. wrong column name).
The usual solution is to write objects that construct your query – here’s an example for Hibernate’s Criteria API
Criteria crit = session.createCriteria(Cat.class); crit.add( Expression.eq( "color", eg.Color.BLACK ) ); crit.setMaxResults(10); List cats = crit.list();
This still fails if the “color” string is mistyped, for example.
The proposed SqlDi solution emphasizes some of the positive things about SQL, while working around the above-mentioned issues.
include StarbucksDSL order = latte venti, half_caf, non_fat, no_foam, no_whip print order.prepare
As opposed to this ruby Starbucks language I found on the web, SQL has proper syntax rules, and everybody knows at least its basics.
So SqlDI will allow you to write a subset of SQL called SqlDi.
But how about the aforementioned Abstraction ?
In order to decouple the storage model from the application query, SqlDi uses a kind of remote-process inversion of control (RP/IoC) pattern where the actual storage-specific part of the query gets injected in your code by the RDBMS process (instead of your code depending on the storage layout).
Here’s an example of the previous query, re-written in SqlDi:
SELECT title, date_published, author FROM books_with_authors LIMIT 1,10
You can already guess the syntax of SqlDi, but I will paste it here anyway:
SELECT attribute_expr [, attribute_expr]* FROM thing [WHERE boolean_expr] [LIMIT n [,m]]
Now how does the ‘thing’ called ‘books_with_authors’ got injected?
Simple, it’s part of the database “configuration file” (the script that creates the schema):
CREATE TABLE authors ( id INTEGER PRIMARY KEY, name VARCHAR(80), birthday DATE); CREATE TABLE books ( id INTEGER PRIMARY KEY, date_published DATE, author_id INTEGER FOREIGN KEY REFERENCES authors(id) ); CREATE VIEW books_with_authors AS SELECT books.title as title, books.date_published as date_published, authors.name as author FROM books INNER JOIN authors ON books.author_id = autors.id;
Now, if some other time you discover that one book can have multiple authors, all you have to do is to re-write the storage-dependent part of the query.
But how does SqlDi provide Portability ?
SqlDi is a parser for a subset of SQL – you have already seen the SELECT statement syntax. The various functions dealing with numbers, dates and strings, even the operators (such as string concatenation) are defined one way or another in SqlDi and translated into the native dialect.
For example, the LIMIT keyword is translated into ROWNUM, LIMIT, TOP or ROW_NUMBER() depending on the used database.
This translation is accomplished in two steps:
- First, the query is tokenized and parsed, and an object which represents the root of the Abstract Syntax Tree is constructed.
- Second, the AST spills out the native dialect – based on some form of run-time pluggable dialect writers (plus ‘drivers’)
In this way you’ll never need to change the “SELECT x FROM y WHERE …” from your source, even if you change the used database. (You do, however, need another schema creation script).
Yeah, but how about the compile-time Correctness ?
This is the most interesting part.
SqlDi.Select ( "books_with_authors", null /*the where clause*/, 1, 10 );
For example, if you have WHERE name LIKE ‘Richard %’, instead of a null you will have SqlDi.Like ( SqlDi.Column(“name”), “Richard %” ) – where “SqlDi” is the package name.
Of course you could write this yourself if you are a hardcore SqlDi user;)
To summarize what we achieved:
- you can write SQL in your source code,
- it will be database independent and
- you will get compile-time syntax checking
- bonus points: the DBA will become your best friend
What we didn’t do:
- no object mapping was even considered
- we didn’t enter into details of INSERTs and UPDATEs
- SqlDi does not exist