Principles of analytics

There is one basic idea behind all the principles in this guide: anyone at any point should be able to understand, use, and scrutinize all the analytics developed at DIL. This means that for any code written to analyze data, it should be easy for other people to (1) understand what the code is doing, (2) run the code, and (3) obtain exactly the same results. More often than not, that other person will be a future version of the original code writer.

Although economists and other social scientists will generally agree that this is a desirable goal, in practice few of us know how to get there. And there is a good reason for it: we may spend a good chunk of our time as a profession doing analytics, but most of us were never trained as programmers, statisticians, or data scientists. As a result, two unfortunate ideas are imprinted on our hive mind. The first one is that analytics is just a hurdle we need to get through to answer a research question. The second is that we don’t have enough time to learn and implement best practices.

These beliefs, however, are far from true. First, good analytics are just as important for credible research as interesting questions and sound methods: using bad data, or using data badly, means giving bad answers. Second, time spent learning how to best implement your analysis is an investment that pays off. It will spare you from trying to reinvent the wheel and from falling into well-known traps.

That is not to say that as a lab member involved in analytics you have to enjoy writing code or getting your hands dirty with data. A good analytical workflow (by which we mean your system for processing and analyzing data that eventually leads to publishable, reliable, and reproducible results) allows researchers to collaborate with others in ways that play to their strengths and compensate for their weaknesses. Therefore, in building your workflows, you should be realistic about your weaknesses as well as your strengths so you can continue enjoying the work that you do.

However, to do good research it is essential for every researcher to be able to differentiate between good and bad analytics. No matter what stage of your career you are at, chances are you will continue working with data for a long time. So use this guide as an invitation to start thinking about how to create an analytical workflow that works for you. In the long run, this will give you more time and peace of mind to focus on the most important and productive aspects of your research.

The focus of this guide is on compiling and motivating overarching principles. Its objective is not to teach you how to implement every single principle, but rather to ensure that you can identify the occasions when they should be applied and know where to find help to do so. As such, it may skip some details and point to more comprehensive and practical resources instead. Click through the links to see examples, applications, and more in-depth discussions.


Table of contents