Don't clean up, prepare!

A classical way of doing database tests is to start from an empty database, populate the database before each test, run the test, then remove the data after the test.

This approach is not satisfactory: if a test fails or is interrupted, the cleanup after it might never run, and the next test then starts from an unknown database state.

We recommend using multiple very small data sets, specific to each test class. The less data you insert, the faster the tests run.

We thus recommend the following strategy:

  1. clear the database before the test,
  2. then insert a small data set,
  3. then run the test,
  4. then leave the database as it is.

If another test runs after it, that test will start by clearing the database, and everything will be fine. This means each test must start by clearing all the tables, not just the ones it uses. You want your tests to be independent, and able to run in any order.


Getting started

Create the list of tables to clear before each test

The order of this list is important. You must start with the tables that don't have any reference to them. Then the tables that reference these tables, etc. For example, if the product table has a foreign key to the vendor table, which has a foreign key to the country table, you must start by clearing the product table, then the vendor table, then the country table.
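To make that ordering concrete, here is a minimal, self-contained sketch (plain Java, not the DbSetup API) that derives the delete order from the child-to-parent foreign key relationships of the example above:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DeleteOrder {
    // Walk from the table that nobody references down to the root parent,
    // following child -> parent foreign key relationships.
    static List<String> deleteOrder(Map<String, String> fkToParent, String leaf) {
        List<String> order = new ArrayList<>();
        for (String table = leaf; table != null; table = fkToParent.get(table)) {
            order.add(table);
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, String> fk = new LinkedHashMap<>();
        fk.put("PRODUCT", "VENDOR");  // PRODUCT references VENDOR
        fk.put("VENDOR", "COUNTRY");  // VENDOR references COUNTRY
        System.out.println(deleteOrder(fk, "PRODUCT")); // [PRODUCT, VENDOR, COUNTRY]
    }
}
```

Deleting in that order (children first) guarantees no foreign key constraint is ever violated.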

Since the tables will be cleared before each test, you can define a global constant that will be used by all the tests:

import static com.ninja_squad.dbsetup.Operations.*;

public class CommonOperations {
    public static final Operation DELETE_ALL =
        deleteAllFrom("PRODUCT", "VENDOR", "COUNTRY", "USER");
    ...
}

Optional: create a common data set

We recommend creating data sets that are as independent as possible from one another. Using a global data set for multiple test classes leads to data sets larger than necessary, and makes the tests run slower. More importantly, if you add a row to the data set to test a specific corner case in a specific test, you risk breaking a whole lot of other tests that have already been written and don't expect this new row to be in the table.

It can be useful, though, to have a very small set of reference data that are necessary for nearly all the tests, because all tables depend on them: countries, languages, users, things like that:

    public static final Operation INSERT_REFERENCE_DATA =
        sequenceOf(
            insertInto("COUNTRY")
                .columns("ID", "ISO_CODE", "NAME")
                .values(1, "FRA", "France")
                .values(2, "USA", "United States")
                .build(),
            insertInto("USER")
                .columns("ID", "LOGIN", "NAME")
                .values(1L, "jbnizet", "Jean-Baptiste Nizet")
                .values(2L, "clacote", "Cyril Lacote")
                .build());

Create your test class

The example below uses JUnit, but the principle is the same with TestNG or any other test framework. You need a reference to a JDBC DataSource, or you can create a new connection each time using the DriverManager:

public class VendorRepositoryTest {

    private DataSource dataSource = ...;

    @Before
    public void prepare() throws Exception {
        Operation operation =
            sequenceOf(
                CommonOperations.DELETE_ALL,
                CommonOperations.INSERT_REFERENCE_DATA,
                insertInto("VENDOR")
                    .columns("ID", "CODE", "NAME", "COUNTRY_ID")
                    .values(1L, "AMA", "Amazon", 2)
                    .values(2L, "PMI", "PriceMinister", 1)
                    .build());
        DbSetup dbSetup = new DbSetup(new DataSourceDestination(dataSource), operation);
        // or without DataSource:
        // DbSetup dbSetup = new DbSetup(new DriverManagerDestination(TEST_DB_URL, TEST_DB_USER, TEST_DB_PASSWORD), operation);
        dbSetup.launch();
    }
}

Add tests, and speed up the setup

You might want to add tests for the methods findByCode(), findByName(), findByCriteria() and createVendor(). Among those, the test for createVendor() is the only one that may modify the database. All the other tests are read-only, so it's unnecessary to re-execute the setup after any of them has run. To take advantage of that, we'll introduce a DbSetupTracker, and mark the read-only tests as such:

    // the tracker is static because JUnit uses a separate Test instance for every test method.
    private static DbSetupTracker dbSetupTracker = new DbSetupTracker();

    @Before
    public void prepare() throws Exception {
        // same operation definition as above
        Operation operation =
            sequenceOf(
                CommonOperations.DELETE_ALL,
                CommonOperations.INSERT_REFERENCE_DATA,
                insertInto("VENDOR")
                    .columns("ID", "CODE", "NAME", "COUNTRY_ID")
                    .values(1L, "AMA", "Amazon", 2)
                    .values(2L, "PMI", "Price Minister", 1)
                    .build());

        // same DbSetup definition as above
        DbSetup dbSetup = new DbSetup(new DataSourceDestination(dataSource), operation);

        // use the tracker to launch the DbSetup.
        dbSetupTracker.launchIfNecessary(dbSetup);
    }

    @Test
    public void testFindByCode() {
        dbSetupTracker.skipNextLaunch();
        ...
    }

    @Test
    public void testFindByName() {
        dbSetupTracker.skipNextLaunch();
        ...
    }

    @Test
    public void testFindByCriteria() {
        dbSetupTracker.skipNextLaunch();
        ...
    }

    @Test
    public void testCreateVendor() {
        // The test writes to the database, so dbSetupTracker.skipNextLaunch(); must NOT be called
        ...
    }

Note that if you forget to call dbSetupTracker.skipNextLaunch(); from a read-only test, nothing serious will happen. The only consequence will be an unnecessary setup, but all the tests will pass. That's why we chose this strategy.


Data formats

DbSetup uses the parameter metadata to detect the type of the columns it must insert into. Depending on the type of the column, you may pass values of various types. Read the javadoc of the DefaultBinderConfiguration and Binders classes for a complete list of supported types and formats. Here are the most important ones:

Type of the column — Supported types and formats

  VARCHAR (and similar): String, Enum, or any Object. If an enum is passed, its name is inserted. If any other Object is passed, its toString() value is inserted.

  DATE: java.sql.Date, java.util.Date, java.util.Calendar, several classes from the java.time package (see the javadoc of Binders.dateBinder() for details), or String. The expected format of the String is the format of java.sql.Date.valueOf(): yyyy-MM-dd.

  TIMESTAMP: java.sql.Timestamp, java.util.Date, java.util.Calendar, several classes from the java.time package (see the javadoc of Binders.timestampBinder() for details), or String. The expected format of the String is the format of java.sql.Timestamp.valueOf(): yyyy-MM-dd hh:mm:ss[.f...], or the format of java.sql.Date.valueOf(): yyyy-MM-dd.
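The String formats above are exactly those accepted by java.sql.Date.valueOf() and java.sql.Timestamp.valueOf(); a quick self-contained check (plain JDK, no DbSetup involved):

```java
import java.sql.Date;
import java.sql.Timestamp;

public class StringFormatsDemo {
    public static void main(String[] args) {
        // DATE strings use the java.sql.Date.valueOf() format: yyyy-MM-dd
        Date date = Date.valueOf("2014-12-25");
        System.out.println(date); // 2014-12-25

        // TIMESTAMP strings use the java.sql.Timestamp.valueOf() format:
        // yyyy-MM-dd hh:mm:ss[.f...]
        Timestamp timestamp = Timestamp.valueOf("2014-12-25 14:30:00.5");
        System.out.println(timestamp); // 2014-12-25 14:30:00.5
    }
}
```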

Note that to insert NULL into a column, you must pass null (and not the String "null").

To support additional types or formats, create your own BinderConfiguration and pass it to the DbSetup constructor.

Some databases, like MySQL and Oracle, don't expose the type of a column through parameter metadata. In that case, since version 1.3.0, the default BinderConfiguration returns the default binder instead of throwing a SQLException. This binder only supports standard JDBC types (bound using PreparedStatement.setObject()), java.util.Date and java.util.Calendar (bound as JDBC Timestamp), and enums (whose name is bound as a String).


Value generators

Database tables often have columns that are mandatory and must contain a unique value, but are irrelevant to your tests. Or sometimes, you might simply want to populate a column with a constant or sequential value without having to specify it for each row of the data set. In these cases, you can use a ValueGenerator. Several implementations are available that generate sequential numeric, string or date values, with customizable starting values and increments. Here's an example where the ID column is populated by a generator starting at 1000 and incrementing by 10:

    Operation insertParameters =
        insertInto("PARAMETER")
            .withGeneratedValue("ID", ValueGenerators.sequence().startingAt(1000L).incrementingBy(10))
            .columns("CODE", "LABEL")
            .values("AMA", "AMAZON")
            .values("PMI", "Price Minister")
            .values("EBA", "EBay")
            .build();
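To make the behavior concrete, here's a minimal sketch of what a sequence generator produces (an illustration only, not the DbSetup implementation): one value per inserted row, so the three rows above get the IDs 1000, 1010 and 1020.

```java
import java.util.ArrayList;
import java.util.List;

public class SequenceSketch {
    // Minimal sketch of a sequential value generator: each call to
    // nextValue() returns the current value, then advances by the increment.
    private long next;
    private final long increment;

    SequenceSketch(long startingAt, long incrementingBy) {
        this.next = startingAt;
        this.increment = incrementingBy;
    }

    long nextValue() {
        long value = next;
        next += increment;
        return value;
    }

    public static void main(String[] args) {
        SequenceSketch sequence = new SequenceSketch(1000L, 10L);
        List<Long> ids = new ArrayList<>();
        for (int row = 0; row < 3; row++) {
            ids.add(sequence.nextValue()); // one generated ID per inserted row
        }
        System.out.println(ids); // [1000, 1010, 1020]
    }
}
```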

Repeating values

It's also useful sometimes to insert similar rows several times. Here's an example showing how to insert 100 tags with generated IDs and names, and the same fake description:

    Operation insertTags =
        insertInto("TAG")
            .withGeneratedValue("ID", ValueGenerators.sequence().startingAt(1L))
            .withGeneratedValue("NAME", ValueGenerators.stringSequence("tag-").startingAt(1L))
            .columns("DESCRIPTION")
            .repeatingValues("fake description").times(100)
            .build();
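As a sketch of the resulting rows (again an illustration of the generated values, not the DbSetup internals): the 100 inserted tags get sequential IDs, names built from the "tag-" prefix, and the same description.

```java
public class RepeatedRowsSketch {
    // Sketch of the rows produced by the insert above: 100 rows with a
    // generated ID (1, 2, 3, ...), a generated NAME ("tag-1", "tag-2", ...)
    // and the same DESCRIPTION for every row.
    static String row(long i) {
        return i + " | tag-" + i + " | fake description";
    }

    public static void main(String[] args) {
        // Print a sample of the 100 generated rows:
        System.out.println(row(1));   // 1 | tag-1 | fake description
        System.out.println(row(2));   // 2 | tag-2 | fake description
        System.out.println(row(100)); // 100 | tag-100 | fake description
    }
}
```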

Specifying column names in rows

If a table has many columns, the code inserting values into it can become hard to read. In a list of 20 values, it's hard to tell which column a given value is for, and hard to check that all 20 required values have been specified for each row. A solution to this problem is to specify the column name for every inserted value. This makes the code more verbose, of course, but you can't have your cake and eat it too.

Here's an example showing this alternative way of inserting rows in a table:

    Operations.insertInto("VENDOR")
              .columns("ID", "CODE", "NAME", "COUNTRY_ID") // optional
              .row().column("ID", 1L)
                    .column("CODE", "AMA")
                    .column("NAME", "Amazon")
                    .column("COUNTRY_ID", 2)
                    .end()
              .row().column("ID", 2L)
                    .column("CODE", "PMI")
                    .column("NAME", "Price Minister")
                    .column("COUNTRY_ID", 1)
                    .end()
              .build();

In this kind of insert, you may omit the call to columns(). In that case, the first row is used to determine the columns that all the inserted rows will have. If a row tries to insert a value for a column that was not specified in columns() (or in the first row, if columns() was omitted), an exception is thrown. If a row doesn't contain any value for one of the columns of the insert, null is inserted for that column.

Note that you can mix the two kinds of row definitions inside a single insert, and that you can also use value generators. Repeating the same row is also possible by ending the row with .times(N) instead of .end().


Cyclic dependencies and SQL support

Sometimes, you need a data set that contains cyclic dependencies. For example, the product table might have a foreign key to the vendor table, and the vendor table might have a foreign key to its featured product. This kind of cyclic dependency is problematic, because you can't delete all the vendors before deleting the products, and vice versa. Similarly, you can't insert a product before its vendor, but you can't insert the vendor before its featured product either. You thus have two solutions:

  1. deactivate (or delete) one of the foreign key constraints, insert the data, then reactivate (or recreate) the constraint;
  2. insert the vendors, then the products, then update the vendors to set their featured product.

In both cases, the solution is to use a SQL operation. Here's an example implementing the second approach:

    Operation insertVendorsAndProducts =
        sequenceOf(
            insertInto("VENDOR")
                .columns("ID", "CODE", "NAME")
                .values(1L, "AMA", "AMAZON")
                .build(),
            insertInto("PRODUCT")
                .columns("ID", "NAME", "VENDOR_ID")
                .values(1L, "Kindle", 1L)
                .build(),
            sql("update VENDOR set FEATURED_PRODUCT_ID = 1 where ID = 1"));

A note on metadata, MySQL and Oracle

If you tried using DbSetup before version 1.3.0 with MySQL or Oracle, you probably got one of the following exceptions:

java.sql.SQLException: Parameter metadata not available for the given statement.
java.sql.SQLException: Unsupported feature.

DbSetup tries to know the type of the columns it inserts into in order to know how to bind the parameter. For example, if you pass an enum as argument, and the type of the column is VARCHAR, it will store the name of the enum. But if the column is of type NUMBER, it will store the ordinal of the enum. This works fine most of the time, but not with MySQL and Oracle which don't support this feature.

MySQL is special in that regard. MySQL doesn't support parameter metadata (or at least I haven't found any way to make it support them). Adding ?useServerPrepStmts=true to the URL makes it return metadata, but VARBINARY is returned whatever the actual type of the column is. Adding ?generateSimpleParameterMetadata=true to the URL also makes it return metadata, but VARCHAR is returned whatever the actual type of the column is.

In previous versions of DbSetup, you had to call useMetadata(false) to avoid this exception. Since 1.3.0, DbSetup degrades gracefully. The default binder configuration catches the exception and returns the default binder, which uses setObject() on the prepared statement for any value passed, except for Date and Calendar instances, which are transformed to Timestamp, and enums, which are bound by their name. So, to come back to the enum example: if you want to bind its ordinal value, you have to pass the ordinal of the enum explicitly, instead of simply passing the enum and letting DbSetup decide how to bind it.
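A quick illustration of these conversions (plain Java showing the values involved, not the DbSetup internals, with a hypothetical VendorStatus enum):

```java
import java.sql.Timestamp;
import java.util.Date;

public class DefaultBinderSketch {
    // Hypothetical enum, for illustration only.
    enum VendorStatus { ACTIVE, DISABLED }

    public static void main(String[] args) {
        // The default binder binds an enum by its name...
        System.out.println(VendorStatus.DISABLED.name());    // DISABLED

        // ...so to store its position in a numeric column, pass the ordinal explicitly:
        System.out.println(VendorStatus.DISABLED.ordinal()); // 1

        // A java.util.Date is transformed to the equivalent java.sql.Timestamp:
        Date date = new Date(0L);
        Timestamp bound = new Timestamp(date.getTime());
        System.out.println(bound.getTime());                 // 0
    }
}
```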


The Kotlin module

Since version 2.1.0, a separate Kotlin module is available, providing an even nicer DSL to create your database setups. This module is a very thin layer on top of the Java API, so everything discussed in this guide applies to Kotlin, too. The type-safe builders, lambda expressions and extension functions that Kotlin provides allow writing your database setup in a nice, idiomatic and extensible way.

Here is an example of a setup written with the Kotlin DSL:

    val setup = dbSetup(to = testDataSource) {
        binderConfiguration = customBinderConfiguration // optional

        deleteAllFrom("PRODUCT", "VENDOR")

        insertInto("VENDOR") {
            columns("ID", "CODE", "NAME")               // using ordered values
            values(1L, "AMA", "Amazon")
            values(2L, "PMI", "Price Minister")
        }

        insertInto("PRODUCT") {
            mappedValues("ID" to 1L,
                         "NAME" to "Kindle",
                         "VENDOR_ID" to 1L)             // using column/value pairs
            mappedValues("ID" to 2L,
                         "NAME" to "Television",
                         "VENDOR_ID" to 2L)
            withGeneratedValue("FIRST_SELLING_DATE", ValueGenerators.dateSequence())
        }
    }

You can of course launch the setup immediately:

    dbSetup(to = testDataSource) {
        ...
    }.launch()

or you can launch it with a DbSetupTracker:

    dbSetup(to = testDataSource) {
        ...
    }.launchWith(tracker)

Don't repeat yourself

Most of the time, all your setups will use the same destination and the same binder configuration, and will probably also start by deleting everything from the same tables. How can you make that DRY?

Kotlin extension functions are great for that. Here is an example showing how you can define your own prepareDatabase() function doing all this common stuff (it assumes you have defined the destination, custom binder configuration and tracker):

    fun prepareDatabase(configure: DbSetupBuilder.() -> Unit) {
        dbSetup(to = testDestination) {
            // do the common stuff
            binderConfiguration = customBinderConfiguration
            deleteAllFrom("PRODUCT", "VENDOR", "COUNTRY")

            // then do the additional configuration with the lambda passed as argument
            configure()
        }.launchWith(tracker)
    }

In all your tests, you can now use this function the same way you would use dbSetup(), but without having to redefine the common stuff:

    prepareDatabase {
        insertInto("VENDOR") {
            columns("ID", "CODE", "NAME")
            values(1L, "AMA", "Amazon")
            values(2L, "PMI", "Price Minister")
        }
    }

Extending the DSL

Now suppose that many, but not necessarily all, of your tests must insert the same set of reference data. You can use the power of extension functions to achieve that. For example:

    fun DbSetupBuilder.insertReferenceData() {
        insertInto("country") {
            ...
        }
        insertInto("city") {
            ...
        }
    }

You can then use this extension function inside any configuration lambda. For example:

    prepareDatabase {
        insertReferenceData()

        insertInto("VENDOR") {
            ...
        }
    }