Sharing and Using Data

| January 9, 2021

The data that users download from HDX will always reflect updates made to the remote resource (such as a file on Dropbox or Google Drive). However, the metadata and activity stream will not automatically indicate the updated date of the data. This has to be done manually in HDX by the dataset owner. We are working to improve this functionality, so please bear with us!


| January 9, 2021

HDX can live-link to files stored in any Dropbox folder, and can preview them if they are in CSV or XLS format. Log in to Dropbox via the web application and navigate to the folder containing the spreadsheet (or other file) that you want to share. Select the file and choose 'Share link', following the instructions in the Dropbox help centre. You will then receive a special link that allows anyone to download the file.

Add that link as a resource to your HDX dataset. When you receive a Dropbox link, it normally looks something like this:
https://www.dropbox.com/etc/etc/your_file_name.csv?dl=0

For HDX to be able to process and preview your file, you'll need to change the last '0' to a '1' so that it looks like this:
https://www.dropbox.com/etc/etc/your_file_name.csv?dl=1
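
If you prepare many Dropbox links, the change from '0' to '1' can be scripted. The following is a minimal sketch using only the Python standard library; the function name `dropbox_direct_link` is hypothetical and is not part of HDX or Dropbox.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def dropbox_direct_link(share_url: str) -> str:
    """Return the Dropbox share link with its dl parameter set to 1."""
    parts = urlsplit(share_url)
    # Keep every existing query parameter except dl, then force dl=1.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "dl"
    ]
    query.append(("dl", "1"))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(dropbox_direct_link("https://www.dropbox.com/etc/etc/your_file_name.csv?dl=0"))
```

Any other query parameters in the share link are left as they are; only the dl value is rewritten.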

The HDX resource will automatically track any changes you save to the Dropbox file on your own computer. Be careful not to move or rename the file after you share it.


| January 9, 2021

To include a link to a Google Sheet, you must first set the sheet's sharing permissions so that it is either publicly visible or at least accessible to anyone who has the link. We recommend creating at least two separate resources for each Google Sheet: 1) a link to the sheet itself in the regular Google Drive interface; and 2) a direct-download link to an Excel or CSV version of the sheet, so that users can preview it in HDX. The version in HDX will update automatically as you make changes to the original Google Sheet.

To obtain the direct-download link, select "Publish to the web..." from the "File" menu in Google Sheets. In the dialog box that opens, choose your preferred file format (such as Excel or CSV) under the "Link" tab and confirm. Google Sheets will then provide you with the link. (Note that this process is not necessary simply for working with HXL-aware tools like Quick Charts, because they can open data directly from the regular Google Sheets link.)
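
As an alternative to the "Publish to the web..." dialog, some publishers build a download link from the regular sheet URL. The sketch below is only an illustration of that different route: it assumes the sheet is shared with anyone who has the link and that Google's `/export?format=...` URL pattern is available for it, neither of which is stated in this FAQ. The function name and the placeholder sheet ID are hypothetical.

```python
import re

# Assumption: regular Google Sheets URLs contain /spreadsheets/d/<sheet id>/.
SHEET_ID_PATTERN = re.compile(r"/spreadsheets/d/([A-Za-z0-9_-]+)")


def sheet_export_url(sheet_url: str, file_format: str = "csv") -> str:
    """Build a direct-download URL from a regular Google Sheets URL."""
    if file_format not in {"csv", "xlsx"}:
        raise ValueError("file_format must be 'csv' or 'xlsx'")
    match = SHEET_ID_PATTERN.search(sheet_url)
    if match is None:
        raise ValueError("not a recognisable Google Sheets URL")
    sheet_id = match.group(1)
    # Assumption: the /export endpoint serves the first sheet in this format.
    return f"https://docs.google.com/spreadsheets/d/{sheet_id}/export?format={file_format}"


# EXAMPLE_SHEET_ID is a placeholder, not a real document.
print(sheet_export_url("https://docs.google.com/spreadsheets/d/EXAMPLE_SHEET_ID/edit#gid=0"))
```

Check the resulting link in a private browser window before adding it as a resource, since it only works if the sharing permissions described above are in place.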


| January 9, 2021

First, make sure that the Google Drive file or files are publicly visible or accessible to anyone who has the link. For instructions on how to change these sharing settings, follow the walkthrough slides below.

You can click on 'Add Data' and choose the option to import files from 'Google Drive'. A 'Google Drive' pop-up will show and help you choose the file/files from your account. The files will not be copied into HDX. Instead, the HDX 'Download' button will always direct users to the live version of the Google document.

The HDX Resource Picker for Google Drive will only have access to your list of Google Drive files when you are choosing Google Drive resources through the HDX interface. You can revoke this permission at any time in Google Drive's App Manager. However, this will not change the visibility of the Google Drive resources already created on HDX.


| January 9, 2021

Yes. HDX allows you to drag and drop files from your computer. First, you need to click on the 'Add Data' link and then select files from your computer. Drop the files in the designated area. A new dataset form will appear with some fields already pre-filled.


| January 9, 2021

Yes. HDX can host the data for you, but it works equally well with a link to data hosted somewhere else on the web. For example, if your organisation already has a system or API that produces data for download, you can simply include a link to that data as a resource in your dataset, and the version on HDX will automatically stay up to date.


| January 9, 2021

If your resource is simply a link to a file hosted elsewhere, there is no size limit. If you are uploading a file onto HDX, the file size is limited to 300MB. If you have larger files that you want to share, e-mail us at hdx@un.org.
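
A quick local check can tell you whether a file is within the upload limit before you try to add it. This is a minimal sketch; it assumes "300MB" means 300 × 1024 × 1024 bytes, which HDX may count differently, and the function names are hypothetical.

```python
import os

# Assumption: 300MB is read as 300 * 1024 * 1024 bytes.
HDX_UPLOAD_LIMIT_BYTES = 300 * 1024 * 1024


def fits_hdx_upload(size_bytes: int, limit: int = HDX_UPLOAD_LIMIT_BYTES) -> bool:
    """Return True if a file of this size is within the upload limit."""
    return size_bytes <= limit


def file_fits_hdx_upload(path: str) -> bool:
    """Check a file on disk against the upload limit."""
    return fits_hdx_upload(os.path.getsize(path))
```

Files that fail the check can still be shared by linking to a copy hosted elsewhere, as described above.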


| January 9, 2021

For datasets: the keywords in your dataset title are matched to the search terms users enter when looking for data in HDX. Avoid using abbreviations in the title that users may not be familiar with. Also avoid using words such as current, latest or previous when referring to the time period (e.g., latest 3W), as these terms become misleading as the dataset ages. The following is a good example of a dataset title: 'Who is Doing What Where in Afghanistan in Dec 2016'.

For resources: by default, the resource name is the name of the uploaded file. However, you can change it if a different name would be clearer to users.

For zipped shapefiles: we recommend the filename be name_of_the_file.shp.zip. However, the system does not require this construction.
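
The naming advice above can be turned into a simple pre-publication check. The sketch below is illustrative only; the function names are hypothetical, and the word list covers just the three terms mentioned in this answer.

```python
import re
from typing import List

# Time-relative words that this guidance asks publishers to avoid in titles.
TIME_RELATIVE_WORDS = ("current", "latest", "previous")


def title_warnings(title: str) -> List[str]:
    """Return the discouraged time-relative words found in a dataset title."""
    words = set(re.findall(r"[a-z0-9]+", title.lower()))
    return [word for word in TIME_RELATIVE_WORDS if word in words]


def follows_shapefile_convention(filename: str) -> bool:
    """Check the recommended (not required) name_of_the_file.shp.zip pattern."""
    return filename.lower().endswith(".shp.zip")


print(title_warnings("Latest 3W"))
print(title_warnings("Who is Doing What Where in Afghanistan in Dec 2016"))
print(follows_shapefile_convention("name_of_the_file.shp.zip"))
```

The check matches whole words only, so a title containing a word such as "currently" is not flagged.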


| January 9, 2021

Resources can be either different formats of the same data (such as XLSX and CSV) or different releases of the same data (such as March, April, and May needs assessments). Always put the resource with the most-recent or most-important information first, because the HDX system will by default use the first resource to create visualisations such as Quick Charts or geographic preview (this default can be overridden in the dataset edit page).

If you have data that is substantially different, like a different type of assessment or data about a different province, we recommend creating a separate dataset.


| January 9, 2021

We define data as information that common software can read and analyse. We encourage contributions in any common data format. HDX has built-in preview support for tabular data in CSV and Microsoft Excel (XLS only) formats, and for geographic data in zipped shapefile, KML and GeoJSON formats. If multiple formats are available, each can be added as a resource to the dataset. If you only wish to add one format, CSV is preferred for tabular data and zipped shapefile for geographic data.
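
To see at a glance which of your files fall into the previewable formats listed above, you could classify them by file extension. This is a rough sketch: the function name is hypothetical, and recognising zipped shapefiles by a `.shp.zip` ending is an assumption, since HDX does not require that naming.

```python
from typing import Optional

# Extensions taken from the preview formats listed above.
TABULAR_EXTENSIONS = (".csv", ".xls")
# Assumption: zipped shapefiles follow the recommended .shp.zip naming.
GEOGRAPHIC_EXTENSIONS = (".shp.zip", ".kml", ".geojson")


def preview_kind(filename: str) -> Optional[str]:
    """Return 'tabular', 'geographic', or None if no preview format matches."""
    name = filename.lower()
    if name.endswith(TABULAR_EXTENSIONS):
        return "tabular"
    if name.endswith(GEOGRAPHIC_EXTENSIONS):
        return "geographic"
    return None


for example in ("needs.csv", "boundaries.shp.zip", "report.pdf"):
    print(example, preview_kind(example))
```

A result of None only means the file name does not match one of the listed preview formats; the file can still be shared as a resource.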

A PDF file is not data. If you have a data visualization in PDF format, you can add it as a showcase item on the dataset page. If you wish to share documents, graphics, or other types of humanitarian information that are not related to the data you are sharing, please visit our companion sites ReliefWeb and HumanitarianResponse. A resource, such as a readme file, could also contain documentation that helps users to understand the dataset.


| January 9, 2021

Data Check automatically detects and highlights common humanitarian data errors in your HXL-tagged spreadsheet, including values that fail validation against CODs and other vocabularies. You can access Data Check from:

  1. HDX via dataset pages (the "Validate with Data Check" option will appear under the "More" button for HXL-tagged resources)
  2. HDX Tools, for datasets that exist outside of HDX. For this option, you should not use Data Check to process personal or otherwise sensitive data.

Data uploaded to HDX Tools is not retained within the HDX infrastructure, while data downloaded by HDX Tools from public URLs is cached only as long as necessary for processing.

You can access both versions of Data Check without being a registered user of HDX. For instructions on how to use Data Check, review the walkthrough slides below.

Data Check uses a generic schema that detects many kinds of common errors like possible spelling mistakes or atypical numeric values, but in some cases, an organisation will want to validate against its own more-specific rules. In that case, you can write your own, custom HXL schema and validate using the HXL Proxy (Data Check's backend engine) directly. Information is available on these pages in the HXL Proxy wiki: HXL schemas, Validation page, and Validation service.
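
Because Data Check and the HXL Proxy work on HXL-tagged data, it can be useful to confirm that a CSV file actually contains a hashtag row before validating it. The sketch below is a simplified, standard-library illustration and is not Data Check's or the HXL Proxy's own logic: the hashtag pattern, the 25-row scan limit, the function name, and the sample rows are all assumptions made for the example.

```python
import csv
import io
import re
from typing import Optional

# Simplified pattern: '#tag' optionally followed by '+attribute' parts.
HXL_TAG = re.compile(r"^#[A-Za-z][A-Za-z0-9_]*(\s*\+[A-Za-z][A-Za-z0-9_]*)*$")
# Assumption: only the first 25 rows are searched for a hashtag row.
MAX_ROWS_TO_SCAN = 25


def find_hxl_row(csv_text: str) -> Optional[int]:
    """Return the zero-based index of the first HXL hashtag row, or None."""
    reader = csv.reader(io.StringIO(csv_text))
    for index, row in enumerate(reader):
        if index >= MAX_ROWS_TO_SCAN:
            break
        cells = [cell.strip() for cell in row if cell.strip()]
        # A hashtag row has at least one cell, and every non-empty cell is a tag.
        if cells and all(HXL_TAG.match(cell) for cell in cells):
            return index
    return None


# Made-up sample data for illustration only.
sample = (
    "Province,P-code,People affected\n"
    "#adm1+name,#adm1+code,#affected\n"
    "Kabul,AF01,1200\n"
)
print(find_hxl_row(sample))
```

If the function returns None, the file probably needs HXL hashtags added before Data Check can validate it.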


| January 9, 2021

Organization admins and editors can add data visualizations to dataset pages to let users explore the data. The visuals can be made using Tableau, Power BI or whatever software you prefer. They will appear in the "Interactive Data" section at the top of the page.
