Data Governance playbooks for 2024
Back in 2020, I offered up some thoughts for consideration around generic or homogeneous data governance playbooks. Revisit that post if you care to.
This was fueled in part by frustration with the various maturity models and frameworks available, but also by the push, particularly from some software vendors, to suggest that any organization could capture and implement a data governance program generically using boilerplate scenarios, without going through the painful process of analysis, assessment and design.
Of course, there is the adage, “Anything worth doing is worth doing well”, and that remains a truism, as applicable to a data governance program as to anything else in the data management space.
You can’t scrimp on the planning and evaluation phase if you want to get your data governance program to be widely adopted and effective irrespective of how many bucks you drop and irrespective of the mandates and prescripts yelled from the boardroom.
Like any change program, a DG initiative needs advocacy and design appropriate to the context and no vendor is going to do that perfectly well for you without you making a significant investment of time, people and effort to get the program running. If you’re evaluating a software vendor to do this for you, in particular, you need to be sure to check out their implementation chops and assess their domain knowledge, particularly relevant to your industry sector, market and organizational culture. This is a consulting focus area that “The Big Four” have started to look more closely at and are competing with boutique consultancies on. So if you have a passion for consulting and you feel all the big ERP and CRM projects have been done and you want to break into this space, then here is an area to consider.
What is it exactly?
The term “playbook” in a business context is borrowed from American football. In sports, a playbook is often a collection of a team’s plays and strategies, all compiled and organized into one book or binder. Players are expected to learn “the plays” and ahead of the game the coach and team work out the play that they are likely to run at the opposing team or the approach that they will use if the opposing team is observed to run a particular play of their own. Some plays may be offensive, some defensive and then there may be other plays for specialised tactical runs at a given goal or target.
A “business playbook” contains all your company’s processes, policies, and standard operating procedures (SOPs). Also termed a “company playbook”, it is effectively a manual outlining how your business does what it does, down to each business operations role, responsibility, business strategy, and differentiator. This should be distinguished from a runbook, which is your “go-to” when a team needs step-by-step instructions for certain tasks. Playbooks have a broader focus and are great for teams that need to document more complex processes. It is a subtlety that is appreciated more when you are in the weeds of the work than when you are talking or thinking conceptually about new ways of optimizing organizational effectiveness and efficiency.
A data governance playbook is then effectively a library of documented processes and procedures that describe each activity in terms of the inputs and capture or adoption criteria, the processes to be completed, who would be accountable for which tasks, and the interactions required. It also often outlines the deliverables, quality expectations, data controls, and the like.
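To make that description a little more concrete, here is a minimal sketch of how a single playbook entry might be recorded as structured data. The class name, field names and the example entry are my own illustrative assumptions; they are not taken from any standard, framework or vendor product, and your own playbook will need a shape that fits your context.

```python
from dataclasses import dataclass, field


@dataclass
class PlaybookActivity:
    """One documented activity in a data governance playbook (illustrative shape only)."""

    name: str
    inputs: list[str] = field(default_factory=list)
    entry_criteria: list[str] = field(default_factory=list)    # capture or adoption criteria
    steps: list[str] = field(default_factory=list)             # processes to be completed
    accountable: dict[str, str] = field(default_factory=dict)  # step -> accountable role
    deliverables: list[str] = field(default_factory=list)
    data_controls: list[str] = field(default_factory=list)

    def gaps(self) -> list[str]:
        """List what is still undocumented, so a reviewer knows where to dig."""
        sections = {
            "inputs": self.inputs,
            "entry_criteria": self.entry_criteria,
            "steps": self.steps,
            "deliverables": self.deliverables,
            "data_controls": self.data_controls,
        }
        missing = [name for name, content in sections.items() if not content]
        # Every step should have someone accountable for it.
        missing += [
            f"accountable role for step: {step}"
            for step in self.steps
            if step not in self.accountable
        ]
        return missing


# Hypothetical example entry; the content is invented for illustration.
onboarding = PlaybookActivity(
    name="Onboard a new data source",
    inputs=["source system description", "sample extract"],
    entry_criteria=["business owner identified"],
    steps=["classify data", "agree quality rules"],
    accountable={"classify data": "data steward"},
    deliverables=["catalog entry"],
)

for gap in onboarding.gaps():
    print(gap)
```

The gaps check is deliberately simple. Its only purpose is to show that once the playbook is written down in a consistent shape, the missing pieces (here, the data controls and an owner for one of the steps) become visible rather than assumed.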
Under the US President’s Management Agenda, the Federal Data Strategy offers up a Data Governance Playbook that is worth taking a look at as an example. Similarly, the Health IT Playbook is a tool for administrators, physician practice owners, clinicians and practitioners, practice staff, and anyone else who wants to leverage health IT. The focus is on the protection and security of patient information and ensuring patient safety.
So, in 2024, if you’re just picking up the concept of a playbook, and a data governance playbook in particular, it is likely that you’ll look at what the software vendors have in mind; you’ll evaluate a couple of implementation proposals from consultants and you’ll consider repurposing something from an adjacent industry, a past project or a comparable organization.
Taking a “roll-your-own” approach
There’s plenty of reading material out there, from books written by industry practitioners and analysts to content from technology vendors, as mentioned. Some are as dry as ditchwater and very few get beyond a first edition, although some authors have been moderately successful at pushing out subsequent volumes with different titles. A lot of the content, though, turns out to be thought exercises with examples, factors to consider, experiences, and industry- or context-specific understandings or challenges. Some will focus on particular functionality or expectations around the complementary implementation or adoption of particular technologies.
With the latest LLM and AI/ML innovations, you’ll also discover a great deal of content. Many of these publications, articles and posts found across the internet have already been parsed and assimilated into the LLM engines, so a good starting point is to ask your favourite chatbot what it thinks.
Using a large language model (LLM) like ChatGPT to facilitate the building of data playbooks might be feasible to a certain extent but there will be challenges.
On the plus side, an LLM could generate content and provide templates for various sections of a data playbook, such as data classification, access controls, data lifecycle management, and compliance. It can also assist in drafting policy statements, guidelines, and procedures.
It could help in explaining complex data governance concepts, definitions, and best practices in more accessible language for use in, say, a business glossary or thesaurus. This could be beneficial for individuals who might not have a deep understanding of data governance; think about your data literacy campaigning in this context.
Users can also directly interact with an LLM in a question-answer format to seek clarity on specific aspects of data governance and help build an understanding of key data governance concepts and data management requirements.
Just as for generic playbooks, there are going to be problems with this approach. LLMs operate based on patterns learned from a diverse range of data, but they often lack domain specificity. A data management platform or data catalog might have an LLM attached to it, but has it been trained on data governance practice content?
Data governance often requires an understanding of industry-specific regulations, data types, and organizational contexts that might not be captured adequately by a generic model.
We’ve also heard about AI hallucinations, and some of us may have even experienced a chatbot hallucination. Without grounding in data governance practice and domain knowledge, there’s a risk that the AI might generate content that is wholly or partially inaccurate, incomplete, or not aligned with the actual organizational need. This, then, would have you second-guessing the results and having to dig into the details to ensure that the suggested content is appropriate. You’ll need to have a domain expert on hand to validate the machine-generated output.
Data governance practices and regulations are also ever-evolving. The LLM might not be aware of new regulations, new compliance expectations or new industry standards. So leaning purely on machine-generated content may fail to reveal emerging best practices unless the model is retrained with updates.
Each organization has its unique culture, structure, and processes. DG is intertwined with the various organizational processes, and understanding these interconnections is vital; that’s best achieved with careful analysis, process design and domain knowledge. The tool you use to help elaborate your playbook might simply provide information in isolation, without any grasp of the broader organizational context. Without appropriate training and prompting, the specific nuances of the organization will make it almost impossible to tailor the generated content to align with organizational goals and practices.
I guess my whole point is that you will not escape the human factor. If you insist on going it alone, and on relying on machine-generated content in particular, then that same content should undergo thorough validation by domain experts and organizational stakeholders to ensure that the results are accurate and aligned with organizational and industry requirements.
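One way to keep that human factor explicit is to treat any machine-generated draft as unpublishable until named people have signed it off. The sketch below is hypothetical: the DraftSection class, the two reviewer roles, and the rule that human-written sections skip the gate are all assumptions chosen for illustration, not a prescribed workflow.

```python
from dataclasses import dataclass, field

# Assumed reviewer roles; substitute whatever your own operating model calls them.
REQUIRED_REVIEWER_ROLES = frozenset({"domain_expert", "business_stakeholder"})


@dataclass
class DraftSection:
    """A playbook section awaiting validation (illustrative only)."""

    title: str
    body: str
    machine_generated: bool
    approvals: dict[str, str] = field(default_factory=dict)  # role -> reviewer name

    def approve(self, role: str, reviewer: str) -> None:
        """Record a sign-off from a reviewer acting in one of the required roles."""
        if role not in REQUIRED_REVIEWER_ROLES:
            raise ValueError(f"unknown reviewer role: {role}")
        self.approvals[role] = reviewer

    def missing_approvals(self) -> list[str]:
        """Roles that still need to sign off before the section can be published."""
        if not self.machine_generated:
            return []  # assumption: human-written sections follow a separate process
        return sorted(REQUIRED_REVIEWER_ROLES - set(self.approvals))

    def is_publishable(self) -> bool:
        return not self.missing_approvals()


# Hypothetical usage with invented names.
draft = DraftSection(
    title="Data classification policy",
    body="(text produced by a chatbot)",
    machine_generated=True,
)
draft.approve("domain_expert", "A. Steward")
print(draft.is_publishable(), draft.missing_approvals())
```

The value is not in the code itself but in forcing the question of who validated a given piece of content, and in what capacity, before it lands in the playbook.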
Using modern-day tooling to assist human experts in drafting and refining data playbooks is a valuable acceleration approach, but just as for generic playbooks and templates, you need to combine the strengths of canned, automated generation with human expertise to arrive at a good result.
I’d love to hear what, if anything, you’ve done with chatbots, AI, ML and LLMs to generate content. If you are implementing any data management or data governance initiatives, I would love to know how successful you have been and any tips or tricks you acquired along the way.
Author: Clinton Jones