Source: wikihow.com
The activities that can and must be carried out on data are many, and some are very complex. The solutions applied are often difficult to explain, and even harder to demonstrate convincingly to the end customer who pays for them and to the other experts involved.
This is because the quality of a solution often emerges only as the project evolves, some time after the work is done and not immediately afterwards.
Our approach is to plan an activity only after we have identified the KPIs that will demonstrate what we have done. We then present the results to the end customer in a convincing way: in terms that are within their reach and competence, and tied to the objectives for which we were hired.
This is sometimes very difficult to achieve. Imagine choosing a data schema, for example a star schema. Why dedicate time to an activity that, in the end, changes nothing from a practical point of view, at least in the initial phase of the project?
Why spend time rigorously defining the semantics of the data when the name associated with the data is intuitively explanatory?
Instead, the real question we must answer to reduce future problems is: are you sure that no one will interpret the data differently, and therefore populate it or use it differently from how the developer originally designed it?
Let's take a trivial example: registration date.
There may be different interpretations:
- Is it populated only with the first registration?
- Or only with the latest registration, even if that one is no longer active?
- Or does it mean only the date of the registration that is still active and has not yet expired?
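The three readings of registration date can be made concrete with a small sketch. Everything here is an illustrative assumption: the `Registration` record, the function names, and the sample dates are invented for the example and do not come from a real system. The point is only that the same history yields three different values depending on which definition is chosen.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Registration:
    """Hypothetical record: one registration period for a customer."""
    start: date            # day the registration was made
    end: Optional[date]    # expiry day, or None if it never expires


def first_registration(history):
    """Reading 1: date of the very first registration."""
    return min(r.start for r in history)


def last_registration(history):
    """Reading 2: date of the most recent registration, active or not."""
    return max(r.start for r in history)


def active_registration(history, today):
    """Reading 3: date of the registration still valid on `today`, else None."""
    active = [r.start for r in history
              if r.start <= today and (r.end is None or r.end >= today)]
    return max(active) if active else None


# Invented sample history: three registrations, all expired by `today`.
history = [
    Registration(date(2019, 3, 1), date(2020, 3, 1)),
    Registration(date(2021, 6, 15), date(2022, 6, 15)),
    Registration(date(2023, 1, 10), date(2023, 7, 10)),
]
today = date(2024, 6, 5)

print(first_registration(history))          # 2019-03-01
print(last_registration(history))           # 2023-01-10
print(active_registration(history, today))  # None
```

With this sample data, a field simply called "registration date" could legitimately hold 2019-03-01, 2023-01-10, or nothing at all, which is why the definition has to be written down before the field is populated.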
And then there are further questions:
- Who uses this information? Under what circumstances?
- Who can update it? When?
- Who can cancel it?
- What happens to the deleted data? Can I recover it?
- How can I check whether a modification has enriched the information base or reduced it?
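One possible way to keep these questions answerable is to record every change instead of overwriting the value. The sketch below is a minimal illustration under stated assumptions: `AuditedField`, its methods, and the user names are hypothetical, and a real project would more likely rely on database history tables or similar tooling.

```python
from datetime import date


class AuditedField:
    """Hypothetical append-only store for the values of a single field."""

    def __init__(self):
        self._log = []  # every change is appended, nothing is overwritten

    def set(self, value, who, when):
        self._log.append({"action": "set", "value": value, "by": who, "at": when})

    def delete(self, who, when):
        # Soft delete: record the deletion instead of erasing the history.
        self._log.append({"action": "delete", "value": None, "by": who, "at": when})

    def current(self):
        """Value visible to users right now (None if unset or deleted)."""
        return self._log[-1]["value"] if self._log else None

    def last_known(self):
        """Most recent non-deleted value, so a deletion can be undone."""
        for entry in reversed(self._log):
            if entry["value"] is not None:
                return entry["value"]
        return None

    def history(self):
        """Copy of the full change log: who did what, and when."""
        return list(self._log)


# Invented usage: an import sets the date, then an agent deletes it.
field = AuditedField()
field.set(date(2023, 1, 10), who="crm_import", when=date(2023, 1, 10))
field.delete(who="support_agent", when=date(2024, 6, 5))

print(field.current())       # None
print(field.last_known())    # 2023-01-10
print(len(field.history()))  # 2
```

Because the log is never truncated, comparing `current()` with `last_known()` shows whether the latest change removed information, and `history()` shows who made it and when.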
This is just a glimpse of our data science vision.
Last Updated on June 5, 2024