Applying TDD to Scalding Development

In this post I would like to show a structured approach that allows a proper TDD process during the development of Scalding jobs. The post will contain a lot of code so, for the more impatient, let’s clearly state where we are heading. The goal of the testing approach I’m describing here is to let you test a subset of the transformations composing your Scalding job in isolation, without too much boilerplate, simply by specifying the input condition and the expected output. We also need to verify that the composition of all these operations in the final Scalding job operates as required. That means that our tests should operate at two levels:

- Unit Tests: operating on a relatively small set of operations
- Integration Tests: operating at job level

In order to test at both levels we need to structure our code in a way that allows us to extract a series of operations into a block that can be tested (i.e. has an expressible meaning), and we need some support that allows us to test both these operations in isolation and the final job as a whole.
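To make the two levels concrete before touching any Scalding code, here is a minimal sketch of the idea in plain Scala. It is only an illustration under stated assumptions: ordinary collections stand in for Scalding pipes, and `Visit`, `VisitCount`, `VisitOperations` and `VisitJob` are hypothetical names invented for this example, not part of Scalding or of the approach described in the rest of the post.

```scala
// Hypothetical record types, standing in for the tuples flowing through a pipe.
final case class Visit(userId: String, url: String)
final case class VisitCount(userId: String, visits: Int)

// Each method is a small block of operations with an expressible meaning.
// Plain Scala collections are used here as a stand-in for Scalding pipes.
object VisitOperations {
  // "Count the visits of each user", sorted by user id for a stable output.
  def countVisitsPerUser(visits: Seq[Visit]): Seq[VisitCount] =
    visits
      .groupBy(_.userId)
      .map { case (userId, userVisits) => VisitCount(userId, userVisits.size) }
      .toSeq
      .sortBy(_.userId)

  // "Keep only the users with at least minVisits visits".
  def keepFrequentVisitors(counts: Seq[VisitCount], minVisits: Int): Seq[VisitCount] =
    counts.filter(_.visits >= minVisits)
}

// The "job" does nothing but compose the blocks defined above.
object VisitJob {
  def run(visits: Seq[Visit], minVisits: Int): Seq[VisitCount] =
    VisitOperations.keepFrequentVisitors(
      VisitOperations.countVisitsPerUser(visits),
      minVisits
    )
}

object VisitChecks {
  def main(args: Array[String]): Unit = {
    val input = Seq(Visit("a", "/x"), Visit("b", "/y"), Visit("a", "/z"))

    // Unit level: one block, given input, expected output.
    assert(
      VisitOperations.countVisitsPerUser(input) ==
        Seq(VisitCount("a", 2), VisitCount("b", 1))
    )

    // Integration level: the whole composition, checked end to end.
    assert(VisitJob.run(input, minVisits = 2) == Seq(VisitCount("a", 2)))
  }
}
```

The point of the split is that each block can be checked by specifying only its input and expected output, while the job-level check only has to confirm that the blocks are wired together correctly.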
