Verification of data quality (variation)
The quality of the results of data-driven applications depends largely on the quality of the underlying data. Minimizing errors and outliers in the data basis is therefore a crucial step in the further development of big data solutions. Using descriptive statistics, we run tests that compute statistical distributions, averages and higher moments. With these key figures, changes can be recognized quickly. It is important to keep in mind that data may change due to internal factors as well as due to changes in environmental conditions.
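Such key figures can be computed with a short script. The following is a minimal sketch using only the Python standard library; the function names, the set of statistics, and the k-sigma outlier rule are our own illustrative choices, not prescribed by the text:

```python
import statistics

def profile(values):
    """Key figures of a numeric column: mean, spread, and the third
    standardized moment (skewness), so that shifts in distribution
    or asymmetry can be spotted between data deliveries."""
    mean = statistics.fmean(values)
    stdev = statistics.stdev(values)
    n = len(values)
    # Skewness: average cubed standardized deviation (0 for symmetric data).
    skew = sum(((x - mean) / stdev) ** 3 for x in values) / n
    return {"mean": mean, "stdev": stdev, "skewness": skew}

def flag_outliers(values, k=2.0):
    """Flag points more than k standard deviations from the mean."""
    mean = statistics.fmean(values)
    stdev = statistics.stdev(values)
    return [x for x in values if abs(x - mean) > k * stdev]
```

Comparing the output of `profile` across data deliveries makes sudden changes in the key figures visible; `flag_outliers` marks individual suspicious values.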
Evaluating the significance of the prediction
Were the forecasts helpful and informative? The chronological sequence makes it possible to assess the effect of the changes that were initiated. If the expected results occur, the weights of the related parameters are strengthened; otherwise the weights may be lowered. In this way the forecasts can be retrospectively adapted to reality, refining the precision of the statements with each iteration.
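The retrospective adaptation described above can be sketched as a simple multiplicative weight update. This is only one possible reading: the tolerance for "expected results occurred" and the update factor are illustrative assumptions, not values from the text:

```python
def update_weights(weights, forecast, actual, tolerance=0.1, factor=1.1):
    """Strengthen the parameter weights when the forecast matched
    reality within the relative tolerance, otherwise lower them.
    The multiplicative rule and all parameter names are assumptions
    made for illustration."""
    hit = abs(forecast - actual) <= tolerance * abs(actual)
    scale = factor if hit else 1 / factor
    return {name: w * scale for name, w in weights.items()}
```

Applied after each forecasting cycle, repeated hits let the relevant weights grow while repeated misses shrink them, which is the iterative refinement the text describes.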
In order to continuously increase the precision of the statements produced by data-driven applications, previous results should be reviewed critically. We prefer to work in iterative processes so that the informative value of new findings is continuously evaluated and sharpened.
This can be done by taking into account new data that was previously not available in sufficient quantity or simply seemed irrelevant. During the operating phase, one has to ensure that the data used remains representative. Supervising the content is especially important for identifying changes over time and deriving corrective measures. Even small variations in the underlying questions can require an adaptation of the training data.
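The supervision of changes over time can be sketched as a drift check that compares each new data batch against a baseline. The threshold of two baseline standard deviations is a hypothetical heuristic for illustration, not a rule from the text:

```python
import statistics

def drifted(baseline, batch, threshold=2.0):
    """Return True if the batch mean deviates from the baseline mean
    by more than `threshold` baseline standard deviations - a simple
    heuristic for detecting that incoming data is no longer
    representative (illustrative choice of statistic and threshold)."""
    mean = statistics.fmean(baseline)
    stdev = statistics.stdev(baseline)
    return abs(statistics.fmean(batch) - mean) > threshold * stdev
```

When a batch is flagged, corrective measures such as retraining on updated data can be triggered; unflagged batches are processed as usual.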