Reproducible Research

We apply data science, an intersection of the quantitative social and management sciences, statistics and computer science, to make data collection, processing and analysis fully reproducible. Our observatory systems allow users to save thousands of working hours in data collection, senior supervision, documentation and compliance functions. We want to help business researchers, policymakers, scientists and journalists focus on what they do best, and use the latest, most complete, and perfectly processed information to support their work.

Our methodology can be used in business consultancy, evidence-based policy making, social and management science, and data journalism. We strive to support all these communities because, by incorporating their different review, confirmation and auditing processes, we can provide even more reliable and robust data services to all of our users.

Reviewability means that our application’s results can be assessed and judged by our users’ experts, or by experts they trust. We automate not only data collection and cleaning, but also documentation.

  • Reviewability for professional consultants means that the data is always available in its freshest form and with up-to-date documentation, instead of in ad hoc spreadsheets collected, modified and overwritten by junior staff. The error-prone work that junior staff usually do is done automatically by our observatory system, and the latest spreadsheets are available to all staff.
  • Reviewability in a scientific context means that our observatory systems produce the research data in a well-documented format that meets the expectations of peer review in the social and management sciences.
  • Reviewability in data journalism means that all the data is automatically refreshed and present for journalistic and editorial source-checking and analysis.

Reproducibility means that we provide data products and tools that allow the exact duplication of our results during assessments. The most important aspect of our reproducibility is that we either make the entire statistical program code fully open source, or make the entire algorithm available to our clients. From the raw data that they give us, or that we collect for them, they can arrive step by step at the same indicator or KPI values. Reproducibility ensures that there is no lock-in to our applications. You can always choose a different data and software vendor, or compare our results with theirs.

  • Reproducibility in a consulting context means that our data is well-processed and described, and a senior analyst can re-create it, or even modify it to the organization’s needs.
  • In a scientific context it means that other researchers can reproduce the scientific findings which are supported by our data, because it is of high quality, well formatted and documented.
  • In data journalism this means that journalists can reliably follow leads and sources over a longer period of time.
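
The step-by-step path from raw data to the same indicator value can be illustrated with a minimal sketch. It is written in Python for brevity, although the data products described here use R, and everything in it is an assumption made for illustration: the `fingerprint`, `compute_indicator` and `run_pipeline` names, the toy records, and the choice of a simple mean as the "indicator". The idea is only that a deterministic pipeline, run twice on the same raw data, should yield the same input fingerprint and the same result.

```python
import hashlib
import json


def fingerprint(obj) -> str:
    """Return a SHA-256 hex digest of a JSON-serialisable object."""
    # Canonical serialisation: sorted keys and fixed separators, so the same
    # data always produces the same bytes and therefore the same digest.
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def compute_indicator(raw_rows):
    """Hypothetical indicator: mean of the non-missing 'value' fields."""
    values = [row["value"] for row in raw_rows if row.get("value") is not None]
    return round(sum(values) / len(values), 6)


def run_pipeline(raw_rows):
    """Run the whole pipeline and report what went in and what came out."""
    return {
        "raw_sha256": fingerprint(raw_rows),
        "indicator": compute_indicator(raw_rows),
    }


# Toy raw data (illustrative only, not real observations).
raw = [
    {"unit": "A", "value": 10.0},
    {"unit": "B", "value": None},
    {"unit": "C", "value": 14.0},
]

ours = run_pipeline(raw)          # the vendor's run
theirs = run_pipeline(list(raw))  # an independent re-run on the same raw data

# Same raw data and same code give the same fingerprint and indicator.
assert ours == theirs
```

A reviewer who receives the raw data and the code can compare both the input fingerprint and the indicator value; a mismatch in the first points to different raw data, a mismatch in the second to a different computation.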

We are creating applications that are confirmable and auditable. These are usually required by the ethical guidelines and professional standards of our users.

Confirmability means that using our applications’ findings leads to the same professional results as other available software and information. Our data products use the open-source statistical programming language R. We provide details about our algorithms and methodology so that our results can be confirmed in SPSS or Stata, or sometimes even in Excel or OpenOffice. Confirmability is often a requirement of the ethical guidelines and professional standards that our clients must adhere to in the legal, economic or policy-making professions.

  • Confirmability in a consulting context means that our data is well-processed and described, and a senior analyst can re-create it, or even modify it to the organization’s needs. They can compare the data against their qualitative results or parallel information bases.
  • In a scientific context it means that other researchers can reproduce the scientific findings which are supported by our data, because it is of high quality, well formatted and documented. If any issues are raised with scientific confirmation, we assist our scientific partner to amend, replace or correct the problematic data.
  • In data journalism this means that journalists can reliably follow leads and sources over a longer period of time. We review the entire data pipeline if our journalistic partners find contrary evidence or red flags with our data sources.

Auditability means that our data and software are archived in a way that external auditors can later review, reproduce and confirm our findings. This is a stricter form of data retention than most organizations apply, because we archive not only the results of each step but all computational steps, as if your colleagues saved not only every step in Excel but also their keystrokes. While auditability is a requirement in accounting, we are extending this approach to all the quantitative work of a professional organization in an advisory or consulting capacity.

  • Auditability means that we are following professional guidelines and standards of various statistical, accounting, national accounting and other professional bodies to develop our indicators and data products, and we allow auditors to review not only our data, but the entire history of the data processing, including key elements of our software code, and our general methodology.
  • Auditability means for us that we make our critical methods and software code available for peer review, both as software on CRAN and as a statistical method in an appropriate, peer-reviewed journal.
  • In a data journalism context our data products can be subject to full editorial and journalistic review, from source to publication.
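
Archiving every computational step, and not only the results, can be sketched as a simple hash-chained log. This is an illustrative Python sketch under stated assumptions, not a description of the actual archiving system: the `AuditTrail` class, its `run` and `verify` methods, and the two toy steps are all hypothetical names introduced here. Each entry records the digests of a step's input and output together with the digest of the previous entry, so a later change to any archived step becomes detectable.

```python
import hashlib
import json


def digest(obj) -> str:
    """SHA-256 hex digest of a JSON-serialisable object (canonical form)."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class AuditTrail:
    """Hypothetical append-only log of computational steps."""

    def __init__(self):
        self.entries = []

    def run(self, name, func, data):
        """Apply one step and record what went in and what came out."""
        result = func(data)
        previous = self.entries[-1]["entry_sha256"] if self.entries else ""
        entry = {
            "step": name,
            "input_sha256": digest(data),
            "output_sha256": digest(result),
            "previous_entry_sha256": previous,
        }
        # The entry's own digest covers the four fields above, which links
        # it to the previous entry and forms a chain.
        entry["entry_sha256"] = digest(entry)
        self.entries.append(entry)
        return result

    def verify(self) -> bool:
        """Check that no entry was altered or reordered after the fact."""
        previous = ""
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "entry_sha256"}
            if entry["previous_entry_sha256"] != previous:
                return False
            if digest(body) != entry["entry_sha256"]:
                return False
            previous = entry["entry_sha256"]
        return True


def drop_missing(values):
    """Toy cleaning step: remove missing observations."""
    return [v for v in values if v is not None]


trail = AuditTrail()
cleaned = trail.run("drop_missing", drop_missing, [3.0, None, 5.0])
total = trail.run("sum", sum, cleaned)

assert trail.verify()  # the untouched trail checks out
```

An auditor holding the archived inputs and this log could re-run each step and compare digests; editing any recorded entry afterwards makes `verify()` return `False`.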