Maven, SQL and ordered execution
I have been re-architecting this new project. This is huge. And broken. And I was called upon to clean, refactor and re-architect it. Phew!
I thought a good strategy would be to work outside-in: fix the build first, then move down into the individual parts. One of the interesting things the team had done was to use Maven to create the PostgreSQL DB schemas and then load data into the DB.
Steps of execution
- Drop everything
- Create users
- Create schemas
- Load data
- Load upgrade data
They used sql-maven-plugin, under which they had multiple execution tasks. The whole thing was very untidy and I didn’t like it, for two reasons:
- They used <fileSets> to import the SQL files. FileSets are an elegant choice if you want to pick up files with Ant-like wildcard patterns, but the matched files are sorted alphabetically. So, if you care about the order in which the files are executed, you should avoid FileSets. In their case, they wanted to run ‘drop_users_databases.sql’ before ‘create_users_databases.sql’. With FileSets that was impossible, because alphabetical sorting puts ‘create_users_databases.sql’ first, exactly the opposite of what they needed. Solution? They should have used <srcFiles>, but instead they created two separate <execution> tasks to run in order. Too much verbosity. Eh?
- Another issue was with loading upgrade data. They follow agile methodologies and keep updating their DB schemas, so after a few iterations, before they merge the changes into the main script, they end up with a number of files like upgrade-schema-2.0.4.sql, upgrade-schema-2.0.5.sql, upgrade-schema-18.104.22.168.sql and so on. The pain is that they had no automated mechanism to execute these files in order, so the developers were forced to do it manually. With only a handful of files that wouldn’t have been a problem… but how about thirty or forty of them? Pissed off.
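To make the two ordering problems above concrete, here is a small sketch in Python. It assumes plain alphabetical (lexicographic) sorting of file names, as described above; it does not call Maven or the plugin, and the list contents are only examples taken from this post.

```python
# Illustration only: plain alphabetical sorting of file names,
# which is the assumed behaviour of the wildcard-based import.
setup_scripts = ["drop_users_databases.sql", "create_users_databases.sql"]
alphabetical_setup = sorted(setup_scripts)
# 'c' sorts before 'd', so the create script would run before the drop script.

upgrade_scripts = [
    "upgrade-schema-3.0.9.sql",
    "upgrade-schema-3.0.10.sql",
    "upgrade-schema-2.1.5.sql",
]
alphabetical_upgrades = sorted(upgrade_scripts)
# As strings, "3.0.10" sorts before "3.0.9" because '1' < '9'.

# Comparing the versions as tuples of integers gives the intended order.
string_order_ok = "3.0.9" < "3.0.10"        # False: wrong for versions
numeric_order_ok = (3, 0, 9) < (3, 0, 10)   # True: what we actually want
```

So alphabetical order breaks twice: once for the drop/create pair, and again as soon as a version component reaches two digits.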
So, here is what I did:
- Kick <fileSets> out of the window
- Use <srcFiles> instead (because I cared more about order)
- Logically partition the operations
- Create a new profile to help run these DB executions at will
<profiles>
  <profile>
    <!-- profile id -->
    <id>init-db</id>
    <build>
      <plugins>
        <plugin>
          <groupId>org.codehaus.mojo</groupId>
          <artifactId>sql-maven-plugin</artifactId>
          <!-- postgresql dependencies -->
          <dependencies>
            <dependency>
              <groupId>postgresql</groupId>
              <artifactId>postgresql</artifactId>
              <version>8.3-603.jdbc4</version>
            </dependency>
          </dependencies>
          <!-- DB Configuration -->
          <configuration>
            <url>jdbc:postgresql:my_db</url>
            <driver>org.postgresql.Driver</driver>
            <username>user</username>
            <password>pass</password>
            <autocommit>true</autocommit>
          </configuration>
          <executions>
            <!-- 1. drop everything, create users and load data -->
            <execution>
              <id>drop-and-create-database</id>
              <phase>clean</phase>
              <goals><goal>execute</goal></goals>
              <configuration>
                <srcFiles>
                  <srcFile>src/main/sql/drop_users_databases.sql</srcFile>
                  <srcFile>src/main/sql/create_users_databases.sql</srcFile>
                  <srcFile>src/main/sql/create_schema.sql</srcFile>
                  <srcFile>src/main/sql/load_data.sql</srcFile>
                  <srcFile>src/main/sql/test_data.sql</srcFile>
                </srcFiles>
              </configuration>
            </execution>
            <!-- 2. run all upgrade scripts -->
            <execution>
              <id>upgrade-schema</id>
              <phase>clean</phase>
              <goals><goal>execute</goal></goals>
              <configuration>
                <srcFiles>
                  <srcFile>src/main/sql/upgrade-schema-2.0.6.sql</srcFile>
                  <srcFile>src/main/sql/upgrade-schema-2.1.5.sql</srcFile>
                  <srcFile>src/main/sql/upgrade-schema-2.1.7.sql</srcFile>
                  <srcFile>src/main/sql/upgrade-schema-2.1.8.sql</srcFile>
                  <srcFile>src/main/sql/upgrade-schema-2.2.1.sql</srcFile>
                  <srcFile>src/main/sql/upgrade-schema-3.0.0.sql</srcFile>
                  <srcFile>src/main/sql/upgrade-schema-3.0.1.sql</srcFile>
                  <srcFile>src/main/sql/upgrade-schema-3.0.2.sql</srcFile>
                  <srcFile>src/main/sql/upgrade-schema-3.0.3.sql</srcFile>
                  <srcFile>src/main/sql/upgrade-schema-3.0.4.sql</srcFile>
                  <srcFile>src/main/sql/upgrade-schema-3.0.5.sql</srcFile>
                  <srcFile>src/main/sql/upgrade-schema-3.0.6.sql</srcFile>
                  <srcFile>src/main/sql/upgrade-schema-22.214.171.124.sql</srcFile>
                  <srcFile>src/main/sql/upgrade-schema-3.0.8.sql</srcFile>
                  <srcFile>src/main/sql/upgrade-schema-3.0.9.sql</srcFile>
                  <srcFile>src/main/sql/upgrade-schema-3.0.10.sql</srcFile>
                  <srcFile>src/main/sql/upgrade-schema-3.0.11.sql</srcFile>
                </srcFiles>
              </configuration>
            </execution>
          </executions>
        </plugin>
      </plugins>
    </build>
  </profile>
</profiles>
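Typing thirty or forty <srcFile> lines by hand gets old quickly. As a rough sketch, not part of the original build, the entries for the upgrade execution could be generated with a small Python helper. The function name `src_file_entries` is hypothetical, and the file name pattern `upgrade-schema-<version>.sql` is assumed from the examples in this post.

```python
import re
from pathlib import Path

# Assumed naming convention: upgrade-schema-<dotted numeric version>.sql
VERSION_RE = re.compile(r"^upgrade-schema-(\d+(?:\.\d+)*)\.sql$")


def src_file_entries(sql_dir):
    """Return <srcFile> lines for the upgrade scripts in sql_dir, oldest version first."""
    versioned = []
    for path in Path(sql_dir).iterdir():
        match = VERSION_RE.match(path.name)
        if match:  # anything that is not an upgrade script is skipped
            # Compare versions numerically, component by component.
            version = tuple(int(part) for part in match.group(1).split("."))
            versioned.append((version, path))
    return [f"<srcFile>{path.as_posix()}</srcFile>" for _, path in sorted(versioned)]
```

Calling `print("\n".join(src_file_entries("src/main/sql")))` from the project root would print the lines to paste into the upgrade <execution>. Because the versions are compared as tuples of integers, 3.0.9 comes before 3.0.10.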
How to run?
Note that I have bound the executions to the ‘clean’ phase within a profile with id ‘init-db’. So, I just need to tell Maven which profile to activate and which phase to run, like this:
mvn -Pinit-db clean
I hope this helps someone in need. I will post more code-cleaning adventures as I encounter them. Cheers!