It seemed like a simple problem.
At the beginning of the pandemic, our development teams set out to build a tool that would accelerate the work of COVID-19 researchers: a tool that would give our team and research organizations an easier way to gather, use, and share the large and rapidly growing volume of publicly available COVID-19 data in support of further research.
Like many researchers who work with multiple data sources, we regularly need to access different systems to complete a single analysis. These could be commercial surveying and assessment systems (like Qualtrics) or academic research surveying tools (like REDCap), but they may also include Customer Relationship Management systems (like Salesforce) and even commercial digital health platforms, wearable devices, electronic health records, and openly available government databases. The data types we work with range from survey responses collected on many different platforms, to records from appointment and event management systems, to wearable data from consumer devices like a Fitbit or Apple Watch, to publicly available datasets (like vaccine distribution information from the CDC). Often they include data that has been collected from a proprietary app or system built by one of our partners.
We discovered that we needed tools to help us more easily gather and combine these massive volumes of records from various sources. These tools, we hoped, would be a better alternative to the manual way that many researchers handle data: downloading spreadsheets from different sources and then spending tedious hours merging them into a single source. And, while many researchers use helpful tools such as Jupyter notebooks, even these tools are a lot easier to use if you already have all the data in one place. The challenge of manually integrating multiple sources remains.
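To make the merging problem concrete, here is a minimal sketch in Python of what combining two exports by hand amounts to. It is only an illustration, not the solution we built: the two CSV snippets, the column names, and the merge_sources helper are all hypothetical, and it assumes every source shares a participant_id column.

```python
import csv
import io
from collections import defaultdict

# Hypothetical exports from two systems; column names are illustrative only.
SURVEY_CSV = """participant_id,symptom_score
p001,3
p002,7
"""

WEARABLE_CSV = """participant_id,resting_heart_rate
p001,62
p003,71
"""


def merge_sources(sources, key="participant_id"):
    """Combine rows from several CSV texts into one record per key value."""
    merged = defaultdict(dict)
    for text in sources:
        for row in csv.DictReader(io.StringIO(text)):
            # Fields from later sources are added to the same participant record.
            merged[row[key]].update(row)
    return dict(merged)


combined = merge_sources([SURVEY_CSV, WEARABLE_CSV])
for participant, record in sorted(combined.items()):
    print(participant, record)
```

Even in this toy case, participants who appear in only one source end up with partial records, and every value arrives as an untyped string. Multiply that by dozens of systems and formats and the tedium of manual merging becomes clear.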
With the urgency of COVID research to motivate us, we built a secure research data management solution that provides access to data in any of these systems through an easy-to-manage process.
Our question isn’t, “How do we analyze COVID-19 data?” (There are many tools that support general analysis.) The question is, “How can we make it easier to acquire and gather the data to analyze it in the first place?”
COVID-19 data is flowing in from around the world. COVID-19 has sparked accelerated innovation by technologists to meet the needs of the research community, and it will continue to be a driving force as we strive to find solutions that collect critical statistical information and make it accessible across a range of systems.
This is one of many efforts we are undertaking to build the future of research.