Data to Knowledge
“Data! Data! Data!” he cried impatiently. “I can’t make bricks without clay!”
– Sherlock Holmes, The Adventure of the Copper Beeches
I have an unhealthy relationship with data. We’ve spent way too many sleepless nights together, have constantly fought each other, and I’ve spent so much of my time and energy cleaning it up. Yet, in the end, I love data, for in data I’ve found so much strategic gold and insight.
Strategic leaders are masters of data. They know where to get the right data, when they need data to prove or disprove a hypothesis, how to massage insight out of data, when to question the validity of data, how to abstract an organization’s processes into data, and how to systematically get their hands on more of the right data fast. Let’s go over the ins and outs of data so that you can have a healthier relationship with it.
What is Data?
Data are values of qualitative and quantitative variables. Qualitative data are characteristic variables that you can’t represent numerically. In an organization, qualitative data, also known as categorical data, can be non-numeric data tied to products, customers, team members, strategic options, and issues. Quantitative data are numeric values of variables. Examples include the number of customers, the average sales per employee, or product costs.
Data can evolve through three stages: raw data, information, and knowledge.
In the context of an organization, data is typically used for two main purposes. The first purpose is to enable the workflow of a process. The second purpose is to inform and drive to decisions.
Why is data important?
Think about the two primary purposes of data…enable the workflow of a process or inform and drive to decisions. If you want to enable better workflow or drive to better decisions, you need better data, and you need to become better at transforming that data into information and knowledge.
How do you create value from data?
We are swimming in a vast ocean of ever-increasing amounts of data. By one widely cited estimate, 90% of the world’s data was created over the past two years. Data is in systems, computers, spreadsheets, documents, databases, conversations, people’s heads, interactions, and all over the internet. Yet, data is pretty useless without context and purpose. To create value from data, you need to transform data into information and knowledge.
Stage 1 to Stage 2 – Transform data into information
This is a sorely ignored topic. Much like the popular saying that a person uses less than 10% of their brain, companies typically transform less than 10% of their data into information. When conducting analysis, I typically spend 20-30% of my time getting the right data, 40-60% of my time transforming data into information, and then 20-30% of my time “analyzing” the information and transforming it into knowledge. There are three major steps in transforming data into information:
Step 1 – Properly clean the data
It’s a dirty little secret that most people don’t properly clean data, which often leads to poor analysis and incorrect conclusions. Much data, especially data derived from human interactions and keystrokes, is termed “unstructured” or “semi-structured” data, meaning that it isn’t very consistent, well organized, or categorized. In these instances, you must spend time cleaning, organizing, and categorizing the data.
The following are elements of a data set that need to be cleaned up:
Any time you get a column of data that pertains to a category, such as state, country, segment, demographic, or any other categorical flag, you need to make sure that the values are consistent. For example, in a data set, California may be represented as CA, California, Ca, CAL, Calif., etc.
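Outside of Excel, the same cleanup can be sketched in a few lines of Python. This is only a minimal sketch: the alias table and the function name standardize_state are hypothetical, and a real data set would need a fuller list of variants.

```python
# Hypothetical lookup table: each known variant (lowercased, trailing period
# removed) maps to one canonical label.
STATE_ALIASES = {
    "ca": "CA",
    "cal": "CA",
    "calif": "CA",
    "california": "CA",
}

def standardize_state(raw):
    """Return the canonical label for a state value, or the trimmed input if unknown."""
    key = raw.strip().lower().rstrip(".")
    return STATE_ALIASES.get(key, raw.strip())

values = ["CA", "California", "Ca", "CAL", "Calif.", " ca "]
cleaned = [standardize_state(v) for v in values]
print(cleaned)  # every variant collapses to "CA"
```

Unknown values pass through unchanged, so you can list whatever is left over and decide how to map it.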
Remove unnecessary data
Managing a dataset can be unwieldy. One of the simplest ways to make a dataset easier to work with is to remove unnecessary data. Remove columns and data that aren’t necessary for the purpose of the analysis.
If you see a lot of holes or empty fields in a data set, you need to assess how that will affect your analysis. Often, you may need to omit data elements that have empty fields from the dataset to make sure the analysis isn’t misleading.
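One way to size up the holes before deciding what to omit is to count empty fields per column. The sketch below assumes a small, made-up list of rows and hypothetical helper names (is_empty, missing_counts, drop_incomplete); it is an illustration, not a prescribed method.

```python
# Made-up sample rows; blank strings and None both stand for empty fields.
rows = [
    {"customer": "A", "state": "CA", "sales": "1200"},
    {"customer": "B", "state": "", "sales": "950"},
    {"customer": "C", "state": "NY", "sales": None},
    {"customer": "D", "state": "TX", "sales": "400"},
]

def is_empty(value):
    """Treat None and blank or whitespace-only strings as missing."""
    return value is None or str(value).strip() == ""

def missing_counts(rows):
    """Count empty fields per column to judge how big the holes are."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            counts[column] = counts.get(column, 0) + (1 if is_empty(value) else 0)
    return counts

def drop_incomplete(rows, required):
    """Keep only rows where every required column has a value."""
    return [r for r in rows if not any(is_empty(r.get(c)) for c in required)]

print(missing_counts(rows))
complete = drop_incomplete(rows, required=["state", "sales"])
print([r["customer"] for r in complete])
```

Dropping rows is only one option; whether it is the right one depends on how many rows you lose and whether the gaps are random.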
Often, with numeric data, you’ll have nonsensical values created by miskeys or errors. Sort each column to see the range of values within a dataset, and consider removing data elements with unrealistic values. Furthermore, some data sets have legitimate but extreme outliers that you may need to remove so they don’t distort the analysis.
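As a rough illustration, the sketch below sorts a made-up column of order values and then flags suspicious ones with the common 1.5 x interquartile-range rule. That rule is my assumption here, one of several reasonable choices, and the sample data and the name flag_outliers are hypothetical.

```python
import statistics

# Made-up order values with one miskey (13800) and one nonsensical negative.
order_values = [120, 135, 128, 142, 131, 125, 13800, 138, -5, 129]

def flag_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Sorting first makes the extremes at either end easy to spot by eye.
print(sorted(order_values))
print(flag_outliers(order_values))
```

A flagged value is a candidate for review, not an automatic deletion; a very large order may be real.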
Text to numbers
Sometimes, numeric values are stored as text, and you need to convert them into numbers using an Excel function such as VALUE.
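If the data lives outside Excel, the same conversion might look like the following sketch. It assumes US-style formatting (commas as thousands separators, an optional dollar sign), and the function name to_number is hypothetical.

```python
def to_number(text):
    """Convert text such as ' $1,234.50 ' to a float; return None if it isn't numeric."""
    cleaned = text.strip().replace(",", "").replace("$", "")
    try:
        return float(cleaned)
    except ValueError:
        return None

raw_values = [" $1,234.50 ", "42", "n/a", ""]
print([to_number(v) for v in raw_values])
```

Returning None for values that won’t convert keeps them visible, so they can be handled with the rest of the empty fields.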
Text to columns
If you have a dataset that is comma-delimited or has data combined in the same column, you can use the Text to Columns feature in Excel to separate the data elements.
When you collect and analyze data, make sure it is unbiased. Biased means one-sided, lacking a neutral viewpoint, or not having an open mind. Biased data can be caused by wanting a particular result or outcome. The way people create and gather data can also be biased, such as when people ask leading survey questions.
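For comparison, here is a minimal Python sketch of the same idea using the standard csv module. The sample text and the column names (name, city_state, city, state) are made up for illustration.

```python
import csv
import io

# Made-up comma-delimited text; quoted fields may themselves contain commas.
raw = 'name,city_state\n"Smith, Jane","Sacramento, CA"\n"Lee, Sam","Austin, TX"\n'

# Step 1: split the delimited text into columns.
rows = list(csv.DictReader(io.StringIO(raw)))

# Step 2: split the combined "City, ST" column into two separate columns.
for row in rows:
    city, state = (part.strip() for part in row.pop("city_state").split(",", 1))
    row["city"] = city
    row["state"] = state

print(rows)
```

Using a CSV parser rather than a plain split on commas matters when fields are quoted, as the name column is here.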
For most Excel analysis, you can use pivot tables to quickly assess and understand how the data needs to be cleaned up.
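The idea behind a pivot table can be sketched in Python as a grouped sum. The records below are made up and pivot_sum is a hypothetical helper; notice how the stray "Calif." row shows up as its own group, which is exactly the kind of cleanup signal a pivot table gives you.

```python
from collections import defaultdict

# Made-up records: (state, channel, sales).
records = [
    ("CA", "Retail", 1200.0),
    ("CA", "Online", 300.0),
    ("Calif.", "Retail", 500.0),
    ("NY", "Retail", 800.0),
    ("CA", "Retail", 250.0),
]

def pivot_sum(records):
    """Return {row_label: {column_label: total}}, like a simple pivot table."""
    table = defaultdict(lambda: defaultdict(float))
    for row_label, col_label, value in records:
        table[row_label][col_label] += value
    return {row: dict(cols) for row, cols in table.items()}

summary = pivot_sum(records)
for state, totals in summary.items():
    print(state, totals)
```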
Step 2 – Add metadata
One of my favorite methods to turn data into information is to add metadata, which is data that is appended to add new categorization to the data. A good example is when you have a data set of customers with their total purchase values, and you create metadata that organizes the customers into “high spend,” “medium spend,” and “low spend.” You should always think through what metadata to add. Some of the metadata dimensions to consider include timing, value, organizational categorizations, demographics, and source.
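The customer spend example might be sketched like this. The thresholds (10,000 and 1,000) and the customer names are purely hypothetical; in practice you would pick cutoffs that fit your own business.

```python
def spend_tier(total, high=10_000, medium=1_000):
    """Label a customer's total purchases; the thresholds are illustrative assumptions."""
    if total >= high:
        return "high spend"
    if total >= medium:
        return "medium spend"
    return "low spend"

# Made-up customers and total purchase values.
customers = {"Acme": 25_000, "Birch": 4_200, "Cedar": 310}

# The new tier label is the metadata appended to each customer.
tiers = {name: spend_tier(total) for name, total in customers.items()}
print(tiers)
```

Once the tier exists as its own column, you can group, filter, and compare by it like any other category.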
Step 3 – Get to know the data
A dataset is to an analyst a bit like what land is to an architect. You have to get to know the data to understand the potential value in it. Ask yourself basic questions, such as “Where did the data come from?” “What insights can be gleaned from the data?” “Are there other data sets that could be combined to produce more information?” Spend some time using pivot tables and conducting simple analyses. Once you start transforming data into information, more and more ideas and hypotheses will pop up. That is how minds work: stimuli, new facts, new questions, new ideas, rinse and repeat. Once you get a new question or idea, write it down, and follow up to see if the data can answer your question or provide insight.
Stage 2 to Stage 3 – Transform information into knowledge
One of the keys to transforming information into knowledge is being clear about the knowledge you are trying to produce and the question(s) you are trying to answer. Anytime anyone comes to me with analysis, data, or information, my first question is always, “What are we trying to answer?” If there isn’t a good answer, then you’re probably in for an exercise in “boiling the ocean,” or a random walk to nowhere. Once you know what you are trying to answer, the next set of analytic tools will help you answer your questions and transform information into knowledge.