The cautionary advice that stood out to me most in chapter 14 of the Harvard Business Review book HBR Guide to Data Analytics Basics for Managers was about linear thinking bias. People often rely on their gut to make decisions that involve data with complex nonlinear relationships, and that often leads to incorrect action. Even when someone is trying to use data to inform a decision, it is easy to overlook the nonlinear aspects of that data. The first example, about vehicle fuel efficiency, was excellent because it showed how most people assume the largest miles-per-gallon increase is the most beneficial, even though a smaller increase on a low-MPG vehicle can save more fuel, since fuel consumed is distance divided by MPG.
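To make the fuel efficiency point concrete, here is a minimal sketch of the arithmetic. The 10,000 miles driven and the specific MPG values are assumptions chosen for illustration, and the function names are hypothetical. The only relationship relied on is that gallons used equals miles divided by MPG.

```python
# Illustrative sketch: all figures below are assumed for the example.
MILES = 10_000  # assumed distance driven per vehicle


def gallons_used(miles: float, mpg: float) -> float:
    """Fuel consumed is distance divided by fuel efficiency."""
    return miles / mpg


def gallons_saved(old_mpg: float, new_mpg: float, miles: float = MILES) -> float:
    """Fuel saved by upgrading a vehicle from old_mpg to new_mpg."""
    return gallons_used(miles, old_mpg) - gallons_used(miles, new_mpg)


# A 10 MPG improvement on an inefficient vehicle.
small_jump = gallons_saved(10, 20)
# A 30 MPG improvement on an already efficient vehicle.
large_jump = gallons_saved(20, 50)

print(f"10 -> 20 MPG saves {small_jump:.0f} gallons")
print(f"20 -> 50 MPG saves {large_jump:.0f} gallons")
```

Under these assumed numbers the smaller MPG increase saves more fuel, because gallons used falls with the inverse of MPG and not in a straight line.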
I found it interesting that most businesses focus on averages for their metrics, even though averages are especially likely to mask nonlinear relationships, which leads to prediction errors (HBR 2018, 138). Metrics built on rates of change are also common, but those too can hide nonlinear complexities.
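As a rough illustration of how an average can hide a nonlinear relationship, the sketch below predicts fuel use for a two-vehicle fleet from its average MPG and compares that to the actual total. The fleet, the mileage, and the variable names are all assumptions made up for this example.

```python
from statistics import mean

# Illustrative sketch: the fleet and mileage are assumed for the example.
MILES_PER_VEHICLE = 10_000
fleet_mpg = [10, 50]  # one inefficient and one efficient vehicle

# Linear shortcut: plug the average MPG into the fuel formula.
avg_mpg = mean(fleet_mpg)
predicted_gallons = len(fleet_mpg) * MILES_PER_VEHICLE / avg_mpg

# Actual fuel use: compute each vehicle separately, then add.
actual_gallons = sum(MILES_PER_VEHICLE / mpg for mpg in fleet_mpg)

print(f"Predicted from average MPG: {predicted_gallons:.0f} gallons")
print(f"Actual total:               {actual_gallons:.0f} gallons")
```

With these assumed values the average-based prediction understates the real fuel use, because the inefficient vehicle contributes far more gallons than the average suggests.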
One case I have encountered where masked nonlinearity led to incorrect conclusions involved energy pricing. To understand cost savings in a specific energy market, the average utility unit cost across all accounts in that territory was used to estimate the savings from a contracted rate. The results didn’t quite “feel” right, so I dug into the underlying dataset and ran additional statistics on it. I found that many small accounts had high unit costs because a portion of the rates was not usage-based: the embedded flat charges were spread over very little usage, which inflated the cost per unit. This skewed the overall average unit cost significantly. We implemented a process that excluded a percentage of the highest and lowest cases to remove those skewing accounts. If the dataset had been more comprehensive and had separated out each cost component, better methods could have been used, but with our limited dataset this was the best way for us to reduce that nonlinearity bias. Without digging deeper, we would simply have assumed that accounts of every size had similar unit costs, as if cost scaled proportionally with usage.
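The effect described above can be sketched with made-up numbers. Everything here is an assumption for illustration: the flat monthly charge, the per-unit rate, the account usages, the 20% trim fraction, and the helper name trimmed_mean. The actual dataset and the percentage that was excluded are not given, so this only shows the general shape of the approach.

```python
from statistics import mean

# Illustrative sketch: every figure below is hypothetical.
FIXED_CHARGE = 20.0  # assumed flat charge per account, not usage-based
RATE = 0.10          # assumed usage-based price per unit
usages = [50, 100, 200, 1_000, 2_000, 5_000, 10_000, 20_000, 50_000, 100_000]

# Unit cost per account: the flat charge is spread over that account's usage,
# so small accounts end up with much higher per-unit costs.
unit_costs = [(FIXED_CHARGE + RATE * u) / u for u in usages]


def trimmed_mean(values, trim_fraction):
    """Average after dropping trim_fraction of values from each end."""
    if not 0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must be in [0, 0.5)")
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)  # number of values cut from each end
    return mean(ordered[k:len(ordered) - k])


simple_avg = mean(unit_costs)
trimmed_avg = trimmed_mean(unit_costs, 0.20)

print(f"Simple average unit cost:  {simple_avg:.4f}")
print(f"Trimmed average unit cost: {trimmed_avg:.4f}")
print(f"Usage-based rate alone:    {RATE:.4f}")
```

In this toy data the trimmed average sits closer to the usage-based rate than the simple average does, though it still carries some of the flat charge. That is consistent with trimming being a workaround when the flat and usage-based costs cannot be separated.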
Understanding the data is critical to using it properly. It is important to understand both the weaknesses of your data and the relationships among its variables. After reading this chapter, I think it will be easier to explain datasets with nonlinear factors to others, now that the issue has been framed well. I encourage you to examine and understand the nonlinearity in the data you are working with.
Author: Logan Callen
Harvard Business Review Press. 2018. HBR Guide to Data Analytics Basics for Managers. Boston: Harvard Business Review Press.