Applying TDD to Scalding Development

In this post I would like to show a structured approach that enables a proper TDD process when developing Scalding jobs. The post contains a lot of code so, for the more impatient, let's state clearly where we are heading. The goal of the testing approach described here is to test a subset of the transformations composing your Scalding job in isolation, without too much boilerplate, simply by specifying the input and the expected output. We also need to verify that the composition of all these operations in the final Scalding job behaves as required.

That means our tests should operate at two levels: unit tests, which cover a relatively small set of operations, and integration tests, which cover the job as a whole. To test at both levels we need to structure our code so that a series of operations can be extracted into a block that can be tested (i.e. one that has an expressible meaning), and we need some support for testing both these operations in isolation and the final job as a whole.
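As an illustration of the job-level (integration) side, here is a minimal sketch using Scalding's own JobTest helper with the fields-based API. The job, the argument names and the sample data are hypothetical and are not taken from the post; the sketch assumes a Scalding version on Scala 2.11 or 2.12, where symbol literals and the paren-less finish call compile cleanly.

```scala
import com.twitter.scalding._

// Hypothetical job used only for illustration: counts words in a text file.
class WordCountJob(args: Args) extends Job(args) {
  TextLine(args("input"))
    .flatMap('line -> 'word) { line: String => line.split("\\s+").toList }
    .groupBy('word) { _.size }
    .write(Tsv(args("output")))
}

// Integration-style check: feed in-memory input, inspect what the job writes.
object WordCountJobCheck {
  def main(argv: Array[String]): Unit = {
    JobTest(new WordCountJob(_))
      .arg("input", "inputFile")
      .arg("output", "outputFile")
      // TextLine sources are fed as (offset, line) pairs in test mode.
      .source(TextLine("inputFile"), List((0, "hello world"), (1, "hello scalding")))
      .sink[(String, Long)](Tsv("outputFile")) { buffer =>
        val counts = buffer.toMap
        assert(counts("hello") == 2L)
        assert(counts("world") == 1L)
        assert(counts("scalding") == 1L)
      }
      .run
      .finish
  }
}
```

The test only states the input condition and the expected output; no files are touched, since JobTest replaces the sources and sinks with in-memory buffers.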


The External Operations Pattern

This is the first pattern in the series of Scalding-specific patterns. As I mentioned in the introduction, each pattern is presented with the same structure: Motivation, where I describe why the pattern might be relevant to you; Structure, where I describe how the code is organised; Sample code, where I show a very simple solution implementing the pattern; and Interactions, where I describe how the pattern interacts with the others.

The code of the Scalding Job class should not contain complex pipe transformations but should delegate to externally written functions.

Motivation

There are several reasons to adopt this approach. The first is to follow the KISS principle and reduce the complexity of the code composing the Scalding Job. Extracting the operations into an external module (a Trait or Object) also respects the DRY principle, because common operations can be reused by different Jobs. The pattern is also very useful for increasing the testability of your code, and it forms the basis of the TDD approach described in the next chapter.

Structure

A Scalding Job is created by defining a class that extends com.twitter.scalding.Job. The Primary Constructor will contain all the…
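
The idea of delegating to externally written functions can be sketched as follows. This is not the post's own sample code: the object name, the field names ('timestamp, 'userId, 'day, 'events) and the job are hypothetical, and the sketch assumes the fields-based API of a Scalding version on Scala 2.11 or 2.12.

```scala
import cascading.pipe.Pipe
import com.twitter.scalding._

// Hypothetical operations module: each method is a named, reusable
// pipe transformation that can be tested without running the whole job.
object EventOperations {
  // Brings the Pipe => RichPipe and Symbol => Fields conversions into scope
  // outside of a Job class.
  import com.twitter.scalding.Dsl._

  // Derives a 'day field from a 'timestamp such as "2014-03-01T10:15:00".
  def addDayColumn(events: Pipe): Pipe =
    events.map('timestamp -> 'day) { ts: String => ts.take(10) }

  // Counts the events of each day into an 'events field.
  def countEventsPerDay(eventsWithDay: Pipe): Pipe =
    eventsWithDay.groupBy('day) { _.size('events) }
}

// The job itself only wires sources, operations and sinks together.
class DailyEventCountJob(args: Args) extends Job(args) {
  import EventOperations._

  val events: Pipe = Tsv(args("input"), ('timestamp, 'userId)).read

  countEventsPerDay(addDayColumn(events))
    .write(Tsv(args("output")))
}
```

Keeping the transformations in a separate object means another job can reuse addDayColumn as is, and each method can be exercised with a handful of input tuples.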
