
Data Curation Tools

We want to help you manage your data as well as we have managed ours, so we are sharing the toolset we used to manage this project.

📄️ Data dictionaries

A data dictionary is usually the last thing on a researcher's mind when publishing data. Few data collection platforms treat a data dictionary as an output of a data collection instrument, and the dictionaries that do exist are not standardized. You need a data dictionary when someone else has to understand your data. When you begin data collection, you are the creator of your data instrument, so you know what you are collecting. But when the data is finally retrieved from the instrument, do your variable names look like Q1, Q3, Q34? What do those names represent?
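To make the idea concrete, here is a minimal sketch of a data dictionary held as a plain Python mapping. The labels, types, and units attached to Q1, Q3, and Q34 are invented for illustration only, and `describe` is a hypothetical helper, not part of any tool in this guide.

```python
# Hypothetical data dictionary: one entry per variable exported by the instrument.
# Every label, type, and unit below is made up for illustration.
data_dictionary = {
    "Q1": {"label": "Participant age", "type": "integer", "units": "years"},
    "Q3": {"label": "Highest education level completed", "type": "categorical", "units": None},
    "Q34": {"label": "Hours of sleep last night", "type": "float", "units": "hours"},
}


def describe(variable):
    """Return a readable description of a variable, or flag it as undocumented."""
    entry = data_dictionary.get(variable)
    if entry is None:
        return f"{variable}: (undocumented)"
    # Only show units when the entry actually defines them.
    units = f" [{entry['units']}]" if entry["units"] else ""
    return f"{variable}: {entry['label']}{units} ({entry['type']})"


if __name__ == "__main__":
    for name in ["Q1", "Q3", "Q34", "Q99"]:
        print(describe(name))
```

A variable that is missing from the dictionary, such as the made-up Q99, is reported as undocumented, which is exactly the situation a reader of an unlabeled export is in.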

📄️ Jupyter notebooks

This is likely the best tool to have in your data curation toolbox. A Jupyter notebook not only presents your metadata in a visually appealing way, it also gives you the tools to download and work with much of the data. This matters because we can run data analyses from within the notebook and save the results and notes from those analyses in a single file. Best of all, these notebooks are JSON formatted (you will notice we talk at great length about JSON in this guide). In this user guide we provide examples that demonstrate the power of these notebooks.
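Because a notebook file is JSON, it can be inspected with nothing more than a JSON parser. The sketch below assumes the standard notebook layout (a top-level `cells` list whose items carry `cell_type` and `source`). The sample notebook content and the `summarize_cells` helper are hypothetical, written inline so the example is self-contained.

```python
import json

# A tiny notebook document built inline; a real .ipynb file has the same shape.
notebook_text = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Notes\n", "Analysis notes go here."],
        },
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ["print(1 + 1)"],
        },
    ],
})


def summarize_cells(text):
    """Return (cell_type, first_line) for each cell in a notebook JSON string."""
    notebook = json.loads(text)
    summary = []
    for cell in notebook["cells"]:
        source = cell["source"]
        # The source field may be a single string or a list of line strings.
        if isinstance(source, list):
            source = "".join(source)
        first_line = source.splitlines()[0] if source else ""
        summary.append((cell["cell_type"], first_line))
    return summary


if __name__ == "__main__":
    for cell_type, first_line in summarize_cells(notebook_text):
        print(cell_type, "|", first_line)
```

To try this on a notebook of your own, read the file's text and pass it to `summarize_cells` in place of `notebook_text`.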