In this post I want to discuss a different perspective that, I think, should be taken into account when studying social and economic systems. It is, fundamentally, about considering that the number of factors and causes behind socioeconomic phenomena can be very large. And by large, I mean that the combination of influences that determined each instance of our object of interest can be (almost) unique.
The current methodology in economics runs the opposite way. The typical empirical exercise is about identifying the few significant factors of influence (i.e., explanatory variables) of a measurable phenomenon. This is typically done by postulating a theoretical model of the world, and then testing it with regression analysis. This sounds perfectly reasonable. The concern of economists with regression analysis is what they call the ``endogeneity problem''. That is, the situation in which you don't know what caused what, as when causation goes both ways between the phenomenon you are interested in and the factors you are including.
My concern, however, is about the possibility that we may live in a world in which the number of (equally significant) factors of influence can be larger than the number of observations. I feel, in fact, that it is precisely because there are so many factors at play in socioeconomic phenomena that most of the research in the social sciences is constantly fueled by the discovery of new factors of influence.
Let me show you an example I read today. Every week, I receive in my email a list of new working papers on specific topics in economics. I quote from the abstract of the first paper in the list I received today: ``While various potential determinants of innovative activity have been considered, little empirical evidence is yet available for the influence of firm governance issues. This paper aims at filling this gap in the literature by studying whether the relative importance of owner-managed small and medium sized enterprises has an effect on regional innovative capacity''. What this abstract expresses is true: many (and I say MANY) determinants of innovative activity have been considered, but ``firm governance'' has received little attention. The authors conclude that, indeed, they find ``a sizable and significant influence of the regional importance of owner-managed [Small and Medium Enterprises] on relative regional innovative capacity''.
The previous example was just the first paper in a list of 21 papers that I received this week. The majority share the same general template: ``Many factors have been studied, but here we report an additional factor that we found to be significant for a subset of observations''. Of course, I am only giving anecdotal evidence. But I feel this trend shows no sign of stopping any time soon. Many factors of influence are still waiting to be studied. And I predict that many will be found significant for many subtypes of economic activity.
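Okay, but so what? In a regression model, if one has N observations, one cannot estimate separate effects for more than N factors: with more unknown coefficients than observations, infinitely many combinations of coefficients fit the data equally well. What do we do if there are, in fact, more than N factors influencing our N observations?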
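To make the regression limit concrete, here is a minimal sketch in Python with NumPy. Everything in it is an illustrative assumption of mine: the data are synthetic, the sizes (20 observations, 50 factors) are arbitrary, and the outcome is noiseless so the point is easy to see. It shows that two very different sets of coefficients reproduce the same observations exactly, so the data alone cannot tell us which factors matter.

```python
import numpy as np

rng = np.random.default_rng(0)

n_obs = 20       # N observations (illustrative size)
n_factors = 50   # more candidate factors than observations (illustrative size)

# Synthetic data: each column of X is one factor, y is the outcome.
X = rng.normal(size=(n_obs, n_factors))
true_beta = rng.normal(size=n_factors)   # the "true" effects, unknown in practice
y = X @ true_beta                        # noiseless outcome, for clarity

# The rank of X cannot exceed the number of observations.
rank = np.linalg.matrix_rank(X)

# With more factors than observations, least squares has infinitely many
# exact solutions; lstsq returns the one with the smallest norm.
beta_min_norm, *_ = np.linalg.lstsq(X, y, rcond=None)

# Build a second, different solution by adding a direction that X cannot see
# (a vector from the null space of X, taken from the full SVD).
_, _, vt = np.linalg.svd(X)
null_vec = vt[-1]
beta_other = beta_min_norm + 5.0 * null_vec

fit_a = np.allclose(X @ beta_min_norm, y)           # first solution fits exactly
fit_b = np.allclose(X @ beta_other, y)              # so does the second one
recovered = np.allclose(beta_min_norm, true_beta)   # but the truth is not recovered

print(f"rank of X: {rank} (factors: {n_factors})")
print(f"both coefficient vectors fit the data: {fit_a and fit_b}")
print(f"true effects recovered: {recovered}")
```

In practice people sidestep this with regularization or variable selection, but those methods work by assuming that only a few factors matter, which is exactly the assumption being questioned here.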
In my opinion, this calls for a change in perspective. We need to give up the need to identify specific causes, and we need a Boltzmannian revolution in economics. By this I mean that we need a ``statistical economics''. In this approach, we would recognize the complexity of the system, treat the parts of the system as stochastic, and try to understand the aggregate statistical properties of our phenomenon. We would then be trying to understand what the statistical properties of an aggregate measure of economic activity are, given that it will probably be the result of a very large number of processes and causal factors.
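As a toy illustration of what asking about aggregate statistical properties could look like, here is a small simulation. It rests on strong assumptions that I am adding only for the sake of the example: each unit's aggregate is a plain sum of many independent factors, and every factor is drawn from the same skewed (exponential) distribution. Real socioeconomic factors need not be additive or independent. Under these assumptions, the aggregate is far more regular than any single factor, even though no individual cause is ever identified.

```python
import numpy as np

rng = np.random.default_rng(1)

n_units = 5_000    # e.g. firms or regions (hypothetical)
n_factors = 500    # many small influences acting on each unit (hypothetical)

# Assumption: factors are independent, strongly skewed draws (mean 1, variance 1).
factors = rng.exponential(scale=1.0, size=(n_units, n_factors))

# Assumption: the aggregate measure is simply the sum of all factors.
aggregate = factors.sum(axis=1)


def skewness(x):
    """Sample skewness: the mean cubed z-score (0 for a symmetric distribution)."""
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** 3))


single_skew = skewness(factors[:, 0])    # one factor on its own: heavily skewed
aggregate_skew = skewness(aggregate)     # the sum: much closer to symmetric

agg_mean = float(aggregate.mean())       # theory: n_factors * 1
agg_std = float(aggregate.std())         # theory: sqrt(n_factors)

print(f"skewness of a single factor: {single_skew:.2f}")
print(f"skewness of the aggregate:   {aggregate_skew:.2f}")
print(f"aggregate mean and std:      {agg_mean:.1f}, {agg_std:.1f}")
```

The regularity here comes from the central limit theorem, which is only one of several possible mechanisms. With multiplicative or correlated factors the aggregate would follow a different law, and finding out which law holds is itself an empirical question.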
Whether there are many or a few factors of influence in a given socioeconomic phenomenon is, ultimately, an empirical question. It is quite possible that there are just ten or fewer. However, there can also be a hundred factors. Or a thousand. Even a million. If reality is such that the number of factors influencing a phenomenon is that large, what, then, are the questions we should be asking?