“The ultimate authority must always rest with the individual’s own reason and critical analysis.”
– Dalai Lama
Many of the biggest insights I’ve had working with organizations have come from data analysis, whether it was understanding that customers who received service from an associate at a retailer spent almost 50% more than those who didn’t, discovering millions in potential savings for an insurance company that was being overcharged by vendors, or determining that the aerodynamics of a redesigned big rig truck would save a driver over $2,000 a year in fuel.
Data analysis is the tool to objectively understand an organization’s health issues and their potential remedies. Strategic leaders use data analysis to unearth insights that aren’t obvious through observation alone. So let’s go over the various types and best practices of data analysis.
What is Data Analysis?
Data analysis is the organization, manipulation, and examination of qualitative and quantitative data in the pursuit of transforming the data into information, and information into knowledge. Developing a strategy can involve a significant amount of data analysis, some of it quick and straightforward, and some complicated and in-depth. Every step of strategic planning, including diagnosis, option set creation, prioritization, and execution should utilize data analysis.
Once again, people utilize data for two primary purposes. The first is to enable the workflow of a process. The second is to inform and drive decisions, which is the purview of this section on data analysis. Let’s go over some of the most important types of data analyses.
Trend analysis is the organization of data in a time series, such as daily, monthly, quarterly, annually, and year over year. Trend analysis is often used to predict or forecast a data series into the future, typically with regression analysis. One of the major issues with trend analysis is adjusting for seasonality. Most organizations have seasonality in product sales and costs, and seasonality often leads to misdiagnosis of trends. The easiest way to understand seasonality is to line up 2 to 5 years of weekly or monthly data, build a chart, and see if there are common peaks and troughs (example below). Other anomalies can also distort trend analysis, such as one-time events, large orders, new product launches, and out-of-stocks.
Sample trend analysis of monthly sales over 3 years
Line chart comparing the year-over-year monthly sales
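As a sketch of the line-up approach described above, the snippet below averages each calendar month across three years of hypothetical sales figures (all numbers are invented for illustration). A peak that recurs every December across all three years points to seasonality rather than a trend.

```python
# Seasonality check: line up several years of monthly data and average
# each calendar month across years. All figures are made up.
monthly_sales = {
    2021: [110, 95, 120, 130, 140, 150, 145, 160, 135, 125, 170, 210],
    2022: [115, 100, 125, 138, 148, 155, 150, 168, 140, 130, 180, 225],
    2023: [120, 104, 132, 142, 155, 162, 158, 175, 148, 137, 190, 240],
}

months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# A December average well above the other months, recurring in every
# year, suggests a seasonal peak rather than an underlying trend.
for i, month in enumerate(months):
    avg = sum(year[i] for year in monthly_sales.values()) / len(monthly_sales)
    print(f"{month}: {avg:.0f}")
```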
Variance analysis is calculating the difference between two similar data sets. It is one of the analyses most used by finance departments, which calculate the “variance,” or difference, between budgeted and actual spend or sales. The same approach can compare costs in one year versus another. Variance analysis can be applied to all sorts of similar data sets, including the financials of two or more companies, the cost structures of multiple products, the difference in web traffic year over year, and many more.
Sample variance analysis ’13 vs. ’12 actuals and ’13 budget
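A minimal sketch of budget-versus-actual variance analysis, using made-up line items and figures:

```python
# Budget vs. actual variance by line item (illustrative numbers only).
budget = {"Salaries": 500_000, "Marketing": 120_000, "Travel": 40_000}
actual = {"Salaries": 515_000, "Marketing": 98_000, "Travel": 52_000}

for line_item in budget:
    # Positive variance means actual spend came in above budget.
    variance = actual[line_item] - budget[line_item]
    pct = variance / budget[line_item] * 100
    print(f"{line_item}: {variance:+,} ({pct:+.1f}%)")
```

The same loop works for any pair of similar data sets, such as this year’s costs versus last year’s.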
Waterfall analysis is one of the go-to analyses of top strategy consulting firms. Waterfall analysis simply disaggregates a value into its sub-values. The chart below is a good example: a simple disaggregation of a P&L into revenue, product costs, overhead, other costs, and net loss. While it is simple, often the simplest charts are the most effective, since they communicate magnitudes very well.
Sample waterfall analysis of a company’s P&L
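The running totals that position the bars of a waterfall chart can be computed directly. Below is a sketch with invented P&L figures, chosen so that, like the example above, the components net out to a loss:

```python
# Waterfall sketch: disaggregate a P&L into its components and track
# the running total that positions each bar. Figures are illustrative.
steps = [("Revenue", 1000), ("Product costs", -600),
         ("Overhead", -300), ("Other costs", -150)]

running = 0
for label, value in steps:
    running += value
    print(f"{label:14s} {value:+6d}  cumulative: {running}")

# The final running total is the closing bar of the waterfall.
print(f"{'Net loss' if running < 0 else 'Net profit':14s} {running:+6d}")
```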
Sensitivity analysis determines the effect of changing the values of different input variables on the output values of a model. It is one of the most important, yet overlooked, analyses. Any time you build a model, you should understand the high-sensitivity variables, those for which small changes drive big changes in the model’s output. Let’s take the below example of a digital campaign ROI calculator for an ecommerce website. The sensitivity analysis involves going through each variable, changing it by 10%, and then calculating the corresponding change in net profit. You can see that three variables (conversion rate, average spend, and gross margin) have the highest sensitivity at 35%, and should probably be the focus of any improvement efforts. Though, you also have to take into account the ease and potential of a variable to actually change. In our example, gross margin is probably pretty set at 50% and will be difficult to improve, while click-through rate could probably be substantially improved with better messaging and imagery on the ads.
Sensitivity analysis of a digital campaign model
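The 10%-bump procedure can be sketched as follows. The model and all of its input values are invented for illustration and won’t reproduce the 35% figures above; note that in this simple multiplicative model the multiplicative inputs share the same sensitivity, while the fixed campaign cost behaves differently.

```python
# Sensitivity sketch for a hypothetical digital-campaign ROI model.
# All input values are invented for illustration.
def net_profit(p):
    visitors = p["impressions"] * p["click_through_rate"]
    orders = visitors * p["conversion_rate"]
    return orders * p["average_spend"] * p["gross_margin"] - p["campaign_cost"]

base = {
    "impressions": 1_000_000,
    "click_through_rate": 0.02,
    "conversion_rate": 0.03,
    "average_spend": 80.0,
    "gross_margin": 0.50,
    "campaign_cost": 10_000.0,
}

base_profit = net_profit(base)
for var in base:
    # Bump one input by 10%, holding everything else constant,
    # and measure the percentage change in net profit.
    bumped = dict(base, **{var: base[var] * 1.10})
    change = (net_profit(bumped) - base_profit) / base_profit * 100
    print(f"{var:20s} {change:+.1f}%")
```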
Crosstab or Pivot Table Analysis
Crosstab or pivot table analysis summarizes categorical data in a matrix form to enable a better understanding of relationships in data. If you haven’t become a master at pivot tables, get some training, since it will pay dividends in your ability to analyze large data sets quickly. The example below illustrates a crosstab analysis. It takes the 18 customers, organizes them into four sets based on what they bought and whether they are a new or repeat customer, and then averages their customer satisfaction score. Illustratively, you can see a big gap in the satisfaction score between new and repeat customers who bought The Ship.
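As a sketch of this kind of crosstab, the snippet below uses pandas’ `pivot_table` on a small made-up customer dataset (the scores and the second product are invented; “The Ship” echoes the example):

```python
# Crosstab sketch: average satisfaction by product and customer type.
# Dataset is invented for illustration.
import pandas as pd

df = pd.DataFrame({
    "product":  ["The Ship", "The Ship", "The Ship", "The Ship",
                 "The Boat", "The Boat", "The Boat", "The Boat"],
    "customer": ["New", "New", "Repeat", "Repeat",
                 "New", "New", "Repeat", "Repeat"],
    "satisfaction": [6, 7, 9, 10, 8, 8, 8, 9],
})

# Rows: product, columns: new vs. repeat, cells: mean satisfaction.
pivot = pd.pivot_table(df, values="satisfaction",
                       index="product", columns="customer", aggfunc="mean")
print(pivot)
```

Here the gap between new and repeat buyers of The Ship (6.5 vs. 9.5) stands out immediately, which is exactly what the matrix form is for.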
Correlation analysis calculates the degree to which there are patterns and relationships between two or more variables or sets of data. Correlation analysis can be used to test the effects of marketing, advertising, and changes to product or pricing on consumer demand. It can also be used to derive the root cause of quality issues in products, services, and processes. Regression analysis is the statistical tool most used to conduct a correlation analysis. Let’s go over a simple correlation analysis of the effects of increasing advertising spend on sales. Below is the data set I’ll use for the example.
Given the dataset of 20 markets, the increase in advertising spend and the subsequent increase in sales can be plotted on a chart, shown below. Just by looking at the upward-sloping pattern of the plotted points, you can see there is a correlation. That is confirmed by running a regression analysis using Excel’s tools. One of the outputs of regression analysis is the R-squared value, shown on the chart as R² = 0.61484. R-squared is a measure of how closely the data fits the regression line, with 0% being no fit and 100% being an exact fit.
Below is a scatter plot chart with a simple linear regression
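A simple linear regression and its R-squared can also be computed by hand, which helps demystify what Excel is doing. The ad-spend and sales figures below are invented; the 20-market dataset above isn’t reproduced here.

```python
# Ordinary least squares on an invented ad-spend vs. sales dataset.
ad_spend = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
sales    = [3, 4, 7, 6, 9, 10, 12, 11, 15, 16]

n = len(ad_spend)
mean_x = sum(ad_spend) / n
mean_y = sum(sales) / n

# Slope and intercept of the best-fit line.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ad_spend, sales))
sxx = sum((x - mean_x) ** 2 for x in ad_spend)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# R-squared: share of the variance in sales explained by the line.
ss_tot = sum((y - mean_y) ** 2 for y in sales)
ss_res = sum((y - (intercept + slope * x)) ** 2
             for x, y in zip(ad_spend, sales))
r_squared = 1 - ss_res / ss_tot
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r_squared:.3f}")
```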
What are the best practices with data analysis?
As a strategic leader, imagine yourself as a doctor, and analysis as your toolkit to both diagnose a sick or unhealthy patient, and to see if different medicines and treatments are effective in curing the patient. Beyond the doctor analogy, here are some of the best practices of data analysis.
Start with a Hypothesis or Question
Analysis always starts with a hypothesis or a question that needs to be answered. The best analysts are constantly curious about the business, and are always generating new questions they want to answer, creating an objective view of reality through different analyses, and ultimately generating the insight necessary to carve a strategic course through the future.
Work with Clean Data
Like I’ve said before, analyzing the data only takes 20-30% of the time, while getting the right data and cleaning it up takes 70-80%. If you don’t start with clean data, you’ll find yourself either cleaning it after you start or generating poor and often incorrect conclusions.
Pick the Right Analysis for the Situation
Given whatever you are trying to understand or generate insights about, typically one of the types of data analysis covered above will help you.
Sanity Check your Work
One of the fastest ways to lose credibility with analysis is an answer that doesn’t make any sense. Always sanity check your analysis: go through it and make sure the answers seem reasonable. Also, give it a quick audit by following the analysis through its logic and ensuring the logic is correct. Any time you sanity check your analysis, try to take a step back and put yourself in the shoes of someone who has never seen the analysis and hasn’t been trapped in it for 5 hours. Better yet, have someone else sanity check the analysis and findings. Excel also has a nice set of auditing tools to trace the inputs and outputs of formulas, flag data that doesn’t look consistent, and check for errors.