We are releasing a new report with the goal of increasing awareness of which data is available and which is missing across humanitarian operations. The State of Open Humanitarian Data is based on the data shared by dozens of partners through the Humanitarian Data Exchange (HDX) platform, as measured through the Data Grid, a feature that organizes core data into six categories.

As we start 2020, just over 50 percent of relevant crisis data is available across 14 humanitarian operations. Afghanistan and the Central African Republic have the most complete data, while Venezuela has the least. The largest data gaps are in the categories for health and education, and food security and nutrition. The categories with the best coverage of data include affected people, and geography and infrastructure.

When HDX was launched in 2014, it held around 800 datasets. Over the past five years, that number has skyrocketed to over 17,000 datasets. The data covers every active humanitarian crisis, from Afghanistan to Yemen, and has been shared by dozens of organizations, from ACLED to WFP. In 2019, HDX was accessed by over 600,000 users. 

This is a tremendous achievement for collective action in a sector that relies on cooperation. It also shows the value of an open data platform. OCHA’s work to aggregate data from many sources in one place has undoubtedly created efficiency in the system. Humanitarians, donors, academics, and journalists no longer need to chase contacts to locate data; they can go to HDX and search for it. If the data is not there, the HDX team will help find it.

“Accurate data is the lifeblood of good policy and decision-making. Obtaining it, and sharing it across hundreds of organizations, in the middle of a humanitarian emergency, is complicated and time-consuming – but it is absolutely crucial.”
- United Nations Secretary-General António Guterres, at the opening of the OCHA Centre for Humanitarian Data in The Hague in December 2017

One challenge that comes with all of this data sharing is knowing which data is most relevant to understanding a crisis context. In May 2019, HDX added a new feature called the Data Grid to help people in their quest for good and relevant data. Based on extensive user research, the Data Grid places the most important crisis data into six categories: affected people; coordination and context; food security and nutrition; geography and infrastructure; health and education; and population and socio-economy. Each category contains several sub-categories, for a total of 27.

There are three main criteria for whether relevant data is included in the Data Grid: 1) is it disaggregated beyond the national level?; 2) is it in a commonly used format?; and 3) is it timely? If at least one dataset meets all three criteria, the sub-category is considered ‘complete’. If at least one dataset meets some of these criteria, the sub-category is considered ‘incomplete’. If no dataset meets the criteria, or no dataset exists on HDX, the sub-category is considered ‘empty’, meaning it has no data.
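
The rule above can be illustrated with a minimal sketch. This is not how HDX implements the Data Grid; the `Dataset` class, its field names, and the `classify_subcategory` function are hypothetical, and the sketch assumes that "some of these criteria" means at least one but not all three.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Dataset:
    """Hypothetical record of how one dataset fares against the three criteria."""
    subnational: bool    # disaggregated beyond the national level
    common_format: bool  # shared in a commonly used format
    timely: bool         # recent enough to be useful

    def criteria_met(self) -> int:
        # Booleans count as 0 or 1, so the sum is the number of criteria met.
        return sum((self.subnational, self.common_format, self.timely))


def classify_subcategory(datasets: Iterable[Dataset]) -> str:
    """Return 'complete', 'incomplete', or 'empty' for one sub-category."""
    # The best-scoring dataset decides the status; no datasets means a score of 0.
    best = max((d.criteria_met() for d in datasets), default=0)
    if best == 3:
        return "complete"
    if best > 0:
        return "incomplete"
    return "empty"


if __name__ == "__main__":
    examples = [
        Dataset(subnational=True, common_format=True, timely=False),
        Dataset(subnational=False, common_format=True, timely=False),
    ]
    print(classify_subcategory(examples))
```

In this sketch a single dataset that meets all three criteria is enough to mark the sub-category complete, regardless of how many weaker datasets sit alongside it.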

Of course, relevant data will greatly depend on who is looking and what they are looking for. A dataset might have the right data, but not cover the part of the country needed for the analysis. Or it might cover the right geographic area but be in a format that is difficult to work with. For this reason, the HDX team reviews all relevant datasets and assesses them against the criteria. This careful curation process is undertaken daily on all newly shared or updated datasets. So far, some 700 datasets have been taken through this process. 

It is important to note that not all humanitarian data can be shared openly. Data about the location of affected people and responders can put people at risk, especially in conflict environments. The HDX Terms of Service prohibit the sharing of data that includes personally identifiable information. For sensitive, non-personal data that can be shared under certain conditions, HDX offers a feature, HDX Connect, which enables organizations to share only the metadata and make the underlying data available bilaterally upon request. There are two HDX Connect datasets that are included in the Data Grid, both related to Venezuela.

We will expand the Data Grids to cover all locations with a Humanitarian Response Plan throughout 2020. We may also expand the categories and sub-categories based on feedback. 

We look forward to collaborating with partners to close data gaps in the year ahead. To fuel this work, we will collaborate with the Rockefeller Foundation on cutting-edge data science (stay tuned for more about these efforts in 2020). We are grateful to the many donors, not least the Netherlands Ministry of Foreign Affairs, which has supported this work over the years, and to the HDX users who are committed to ensuring humanitarian response is data-driven.

Share your feedback on the report by contacting us at centrehumdata@un.org or on Twitter @humdata