For data-driven organizations, a key challenge is to incorporate new data into existing charts, reports or recommendations. However, these data products often also contain the manual work of domain experts. An efficient procedure for recomputability should therefore reapply these human interactions automatically and also handle dependencies on external systems. As part of a customer project, mgm developed a Big Data architecture designed precisely for this purpose. It makes it possible to reconstruct older versions of data products and compare them with newer versions.
Matthias Kricke, Martin Grimmer and Michael Schmeißer present the technical basis of this system architecture in the paper “Preserving Recomputability of Results from Big Data Transformation Workflows”, which appeared in the “Datenbank-Spektrum” in mid-September. The paper first describes the architecture and then discusses the experience gained from implementing the system.