Python

Q: Python

By Alexandru Artimon

The most fully developed HXL library, libhxl-python, is written in Python. Recent versions support Python 3 only; earlier versions also support Python 2.7. The library's features include filtering, validation, and reading and writing a range of formats. libhxl-python uses a method-chaining idiom familiar from jQuery and other JavaScript libraries. For example, to load a dataset, you would simply write:

import hxl 
source = hxl.data('http://example.org/dataset.xlsx')

As in jQuery, you process the dataset by adding steps to the chain. The following example selects every row where the organisation is “UNICEF” and removes the column containing email addresses:

source.with_rows('#org=UNICEF').without_columns('#contact+email')
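To make the effect of such a chain concrete, here is a minimal standard-library sketch of the same idea: filter rows by a hashtag value, then drop a column, with each step returning a new object so calls can be chained. The `MiniDataset` class, its method signatures, and the sample data are hypothetical and for illustration only; this is not how libhxl-python is implemented, and the real library takes a query string such as '#org=UNICEF' rather than separate arguments.

```python
import csv
import io

# Hypothetical sample data: a header row, an HXL hashtag row, then data rows.
SAMPLE_CSV = """Organisation,Sector,Contact email
#org,#sector,#contact+email
UNICEF,Education,a@example.org
WFP,Food,b@example.org
UNICEF,Health,c@example.org
"""


class MiniDataset:
    """Illustrative stand-in for a chainable dataset (not libhxl's class)."""

    def __init__(self, headers, tags, rows):
        self.headers = list(headers)
        self.tags = list(tags)
        self.rows = [list(row) for row in rows]

    @classmethod
    def from_csv(cls, text):
        # First row is headers, second row is hashtags, the rest is data.
        headers, tags, *rows = list(csv.reader(io.StringIO(text)))
        return cls(headers, tags, rows)

    def with_rows(self, tag, value):
        """Return a new dataset keeping rows where column `tag` equals `value`."""
        index = self.tags.index(tag)
        kept = [row for row in self.rows if row[index] == value]
        return MiniDataset(self.headers, self.tags, kept)

    def without_columns(self, tag):
        """Return a new dataset with the column tagged `tag` removed."""
        keep = [i for i, t in enumerate(self.tags) if t != tag]
        return MiniDataset(
            [self.headers[i] for i in keep],
            [self.tags[i] for i in keep],
            [[row[i] for i in keep] for row in self.rows],
        )


source = MiniDataset.from_csv(SAMPLE_CSV)
result = source.with_rows("#org", "UNICEF").without_columns("#contact+email")

for row in result.rows:
    print(row)
```

Because every step returns a new dataset and leaves `source` untouched, the same source can feed several different chains.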

The library also includes a set of command-line tools for processing HXL data in shell scripts. For example, the following will perform the same operation shown above, without the need to write Python code:

$ cat dataset.xlsx | hxlselect -q "#org=UNICEF" | hxlcut -x '#contact+email'

API-level documentation for the library is available online.
