The first release of the HDX Repository beta site is coming soon. The data-set repository will use open-source software called CKAN, which is being used by dozens of organizations around the world. The task is to make CKAN fit for humanitarian use and ensure that the initial features we build are based on potential users’ priority needs.

The HDX team will engage users frequently as we get to a minimum viable product. The user-experience process—discover-design-implement-evaluate—will be repeated many times during this year. So, stick with us!


Image from “The Business value of UX” available here.

This initial user research was limited in scope. I conducted one-hour interviews with 15 people and asked them how they share and find data. The interviewees were technical and non-technical staff based in New York, Geneva and our three pilot locations (Colombia, Kenya and Yemen).

The Results

The interviews were informative and my findings were extensive. Below is a high-level take on what I heard and an initial look at the features we could build.

5 Findings

  • Data hunting is a daily activity: Participants said that they are on the lookout for data nearly every day.
  • Accessing data is ad hoc: Participants find data by searching various websites, or by e-mailing and calling trusted sources. Sometimes data can’t be found.
  • Data sharing is local and informal: Data is shared by internal groups over e-mail, and through Google spreadsheets and file-sharing services, such as Dropbox.
  • Data quality inhibits data sharing: Participants share data internally, but they are hesitant to share publicly because they feel the quality of their data isn’t perfect.
  • Timeliness often trumps quality: Although people had concerns about the quality of the data they find, timeliness was often more important as long as the data was “good enough”.


5 Features

  • Search: We can make it easy for users to find the data they need by offering a search by country, keyword, cluster, crisis and other intuitive tags.
  • Easy uploading: We can create an e-mail address to which data providers can send their data for our data team to upload on their behalf. We can also find ways to link to popular file-sharing systems.
  • Highlight source and date gathered: We can design the user interface so that users can quickly ascertain whether or not the data comes from a trusted source and is timely.
  • Notifications: We can provide a way for users to indicate what data they are interested in and then be notified immediately when new data is uploaded to those categories.
  • Data-caveats field: We can provide a field where data providers can add background on their data and any caveats that users should be aware of (e.g. this data is incomplete, or this data was compiled in the first 72 hours of the crisis; use with caution). This would hopefully counter any nervousness about sharing imperfect data.
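To make the search idea a little more concrete, here is a minimal sketch of how a query by keyword and tags might be expressed against CKAN's standard `package_search` action. It only builds the request URL and does not contact any server. The base URL is a placeholder, the function name is hypothetical, and treating country and cluster as ordinary CKAN tags is an assumption for illustration, not a description of how HDX will actually model them.

```python
from urllib.parse import urlencode


def build_search_url(base_url, keyword="", tags=None, rows=10):
    """Build a CKAN package_search URL (hypothetical helper).

    Assumption: country, cluster and crisis are stored as plain CKAN tags,
    and every tag supplied must match (they are combined with AND).
    """
    # "*:*" is the match-everything query, used when no keyword is given.
    params = {"q": keyword or "*:*", "rows": rows}
    if tags:
        # fq is a filter query passed through to the search backend.
        params["fq"] = " AND ".join(f'tags:"{tag}"' for tag in tags)
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{urlencode(params)}"


# Placeholder site; the tag values are made up for the example.
url = build_search_url(
    "https://example.org",
    keyword="food security",
    tags=["kenya", "nutrition"],
)
print(url)
```

A real interface would hide this behind drop-downs or checkboxes, so users would never need to write a query themselves.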


Wireframe example with caveat field in metadata table.

I’d love to know what you think and if you feel that these results reflect your data-sharing behaviours. Please share your comments. And if you want to be part of our early beta-testers group, e-mail