Five key components of best practice data management
11 July 2018
Following on from our previous article on IRB data management, Inderjit Mund, our Data Practice Director, explores key themes in data management and provides practical steps to help you to deliver it effectively.
The people, processes and knowledge associated with data, as well as the data itself, are the assets that strong data management looks to leverage. Material improvements can be made to each of these assets to enhance data management.
Data management can be divided up into five themes:
- Data Capture - Identification and collection of data.
- Data Organisation - Retention and organisation of data for flexible, efficient use and maintenance.
- Data Exploitation - Conversion of data into useful business information.
- Data Knowledge - Acquisition, retention and presentation of the information about data that is required in support of 'Data Capture', 'Data Organisation' and 'Data Exploitation'.
- Data Culture - Attitudes and behaviours demonstrated across all data themes.
Excellent data management is characterised by consistent good practice across all these themes. Most organisations demonstrate good practice in some of these themes but few (for various reasons) can demonstrate it across all of them. The main reason is that the first three themes are usually under the control of different teams (data capture and elements of data organisation are often technology matters, while other elements of data organisation and data exploitation are matters for business-focused teams).
This results in differing people, processes and knowledge which, without organisation-wide focus and communication, can adversely affect the fourth and fifth themes (data knowledge and data culture).
It is perfectly possible for technology and business teams each to achieve excellent data management by different means, but unless there is a common approach or culture there may always be issues that prevent truly excellent data management across the organisation.
What can be done?
We previously discussed the need for data to be treated as an organisation-wide asset with responsibility for data residing at board level. This is important so that teams involved in data management (be they technology, business or other) have a data strategy that can be followed in the development and implementation of data management policies and procedures.
Data management experts (which every organisation has) can work within the guidelines provided by the strategy while maintaining the differences required of their role within the organisation. Alignment is beneficial, particularly under the themes of data knowledge and data culture but it is still possible for technology and business teams to do things differently under key guiding principles.
Whilst we have strong opinions on desirable data management behaviours, it is important that any steps taken are aligned to each organisation’s own priorities.
Here are some key recommendations to help you with best practice data management:
Are all data sources identified, documented and validated?
Too often data users only see ‘their’ part of the data landscape. As an example, technology teams typically have no time to investigate the business meaning of data and business teams typically have little time to investigate sourcing. It is beneficial if the information held by each area is recorded, maintained and made available as a matter of course, as part of the process of capturing data, to ensure the most complete view possible.
Is a comprehensive, trusted, readily available, consistently formatted, quality data source available?
The stock approach to data organisation is a data warehouse with subordinate marts and, while this is an ideal strategic approach, full implementation can be time-consuming and delays discouraging. Ensuring common principles around reconciliation, availability and data structures whilst also gathering, maintaining and making available information about data goes a long way towards making data friendlier for all users.
As with data capture, these principles should be employed using a ‘matter of course’ approach.
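As an illustration of what a common reconciliation principle might look like in practice, the sketch below compares a row count and a control total between a source extract and its copy in a warehouse or mart. The field name `balance`, the sample records and the function name `reconcile` are all hypothetical; this is a minimal sketch of one possible check, not a prescribed method.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class ReconciliationResult:
    row_count_matches: bool
    control_total_matches: bool

    @property
    def passed(self) -> bool:
        # The extract reconciles only if both checks agree.
        return self.row_count_matches and self.control_total_matches


def reconcile(source_rows, target_rows, amount_field="balance"):
    """Compare row counts and a control total between two extracts.

    `amount_field` is an assumed numeric column used as the control total.
    """
    source_total = sum(Decimal(str(row[amount_field])) for row in source_rows)
    target_total = sum(Decimal(str(row[amount_field])) for row in target_rows)
    return ReconciliationResult(
        row_count_matches=len(source_rows) == len(target_rows),
        control_total_matches=source_total == target_total,
    )


# Hypothetical sample data: the target has lost one record.
source = [
    {"account": "A1", "balance": "100.50"},
    {"account": "A2", "balance": "20.00"},
]
target = [{"account": "A1", "balance": "100.50"}]

result = reconcile(source, target)
print(result.passed)  # False
```

Running a check of this kind every time data is loaded, and publishing the result alongside the data, is one way of making reconciliation a ‘matter of course’ rather than an occasional exercise.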
Can users access data when they need it, how they need it and with sufficient confidence?
Nothing locks users into a narrow view of data more than a lack of availability, a lack of support and a feeling that they are dependent on others for their data. Making sure that users can access, or at least be aware of, all available data and information about data, can share methods and can give feedback on data (and see action taken as a direct result of that feedback) encourages creative use and starts a cycle by which the use and quality of data continuously improve.
Users are part of data management and those involved in data management are users.
Do all users have information on the lineage, quality and ownership of the data they use?
Inevitably some data are more trusted, and users gravitate to these data if they can. Such trust comes from confidence in the sourcing and reconciliation of the data and from knowing that someone is responsible for the data. Less trusted data tend to be subjected to users’ own validation before use.
As for data capture, data organisation and data exploitation, a ‘matter of course’ process is beneficial and ultimately more efficient. Identifying provenance, quality and ownership of data and making this knowledge available is a key part of a successful data management strategy.
It is also important that measures employed in the data knowledge theme are organisation-wide as knowledge of an organisation’s data becomes as widespread as knowledge of an organisation’s business itself.
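To make the idea of recording provenance, quality and ownership more concrete, here is a minimal sketch of a catalogue entry and a check for gaps in what is known about each data set. The class name `DatasetRecord`, its fields and the example entries are assumptions made purely for illustration; real catalogues will hold far richer information.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatasetRecord:
    name: str
    source_system: Optional[str] = None    # lineage: where the data comes from
    owner: Optional[str] = None            # who is responsible for the data
    last_reconciled: Optional[str] = None  # quality: date of last reconciliation

    def missing_knowledge(self) -> List[str]:
        """Return the names of the knowledge fields not yet recorded."""
        fields = ("source_system", "owner", "last_reconciled")
        return [f for f in fields if getattr(self, f) is None]


# Hypothetical catalogue: one fully documented data set, one with gaps.
catalogue = [
    DatasetRecord("customer_accounts", "core_banking", "Finance data team", "2018-07-01"),
    DatasetRecord("marketing_extract", source_system="crm"),
]

for record in catalogue:
    gaps = record.missing_knowledge()
    status = "complete" if not gaps else "missing: " + ", ".join(gaps)
    print(f"{record.name}: {status}")
```

A report like this, available to every user, shows at a glance which data sets have a known source, a named owner and a recent reconciliation, and which still rely on users’ own validation.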
Who ‘bangs the drum’ for data management and do they have influence over all data management in the organisation?
Data culture needs to transcend users of data at all levels within the organisation from data processing to the use of data within the boardroom and should underpin all use of data. By taking on responsibility for effective data management and encouraging those implementing data management to adopt the kind of principles outlined in the five themes, a suitably influential individual can propagate a data culture in the same way that charismatic business leaders propagate important business cultures.
As stated earlier, everyone involved with data is a ‘user’ albeit with different roles and responsibilities in relation to data. Data culture ensures that, despite those differing roles and responsibilities, sight is not lost of the fact that data are there to be used and data management is geared towards this.
The enforcement of GDPR has brought an even sharper focus on effective data management. Data ownership is firmly in the hands of the individual and organisations face hefty reputational and financial penalties for not handling an individual’s data in the way they want it to be handled. The GDPR has a bearing on data used for analytics and profiling.
It is not uncommon for ineffective organisational-level data management to result in users building and refining bespoke data sets for their specific analytical, profiling and MI data processing needs, and then hoarding these “unofficial” data sets in private storage areas for use as and when the data processing needs to be repeated.
The dangers of this are obvious: the provenance of the data is unknown; such data sets are usually undocumented; the code used to produce them is rarely optimal in terms of either processing efficiency or maintainability; and intimate knowledge of the data set is usually confined to a specific user, resulting in a human single point of failure.
In addition, these unofficial data sets are unlikely to be kept up to date with user permissions, resulting in contravention of the GDPR.
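The permissions problem can be illustrated with a short sketch in which a stored extract is filtered against the current set of permissions before it is reused. The field name `customer_id`, the sample records and the function name `filter_by_permission` are hypothetical, and this is a simplified illustration of the principle only: handling personal data properly under the GDPR involves a great deal more than a single filter.

```python
def filter_by_permission(records, permitted_ids):
    """Keep only records for individuals who currently permit this use."""
    return [r for r in records if r["customer_id"] in permitted_ids]


# Hypothetical extract that was built some time ago and stored privately.
stored_extract = [
    {"customer_id": "C001", "segment": "retail"},
    {"customer_id": "C002", "segment": "retail"},
    {"customer_id": "C003", "segment": "business"},
]

# Hypothetical current permissions, as held by the official source.
# C002 has since withdrawn permission for profiling.
current_permissions = {"C001", "C003"}

usable = filter_by_permission(stored_extract, current_permissions)
print([r["customer_id"] for r in usable])  # ['C001', 'C003']
```

An unofficial data set has no link back to the official record of permissions, so a check of this kind is only possible when data are drawn from a managed source each time they are needed.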
All of the above are examples of steps that can be taken to deliver excellence in data management.
There are many more measures, some very specific and others more nebulous, and the application of these depends very much on organisations’ existing, unique data landscapes. This reinforces our assertion from our previous data management articles that ‘one size fits all’ is rarely true. The real challenge (and interesting aspect) of data management is identifying, tailoring and implementing measures appropriate to each organisation.
Data management shouldn’t be seen as a ‘big bang’ implementation, but rather as a series of steps, often taken first for a particular function and then rolled out to other parts of the organisation, towards the ultimate goal – a ‘little-by-little’ approach of continuous improvement. To quote Groove Armada: “If everybody looked the same, we’d get tired of looking at each other”.