The most well-developed HXL library, libhxl-python, is written in Python. The most recent versions support Python 3 only, but earlier versions support Python 2.7. Its features include filtering, validation, and the ingestion and generation of various formats. libhxl-python uses a method-chaining idiom familiar from jQuery and other JavaScript libraries; for example, to load a dataset, you would simply write:
import hxl
source = hxl.data('http://example.org/dataset.xlsx')
As in jQuery, you process the dataset by adding further steps to the chain. The following example selects every row with the organisation “UNICEF” and removes the column with email addresses:
source.with_rows('#org=UNICEF').without_columns('#contact+email')
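To make the chaining idiom concrete, here is a minimal standard-library sketch of how each step can return a new dataset that the next step operates on. It is not libhxl-python's implementation: the TinyDataset class, its internals, and the sample rows are hypothetical, and only the simple "#tag=value" query form is handled. The method names merely mirror the ones used in the example above.

```python
# Toy illustration of the chaining idiom only; this is NOT libhxl-python's
# implementation. TinyDataset and the sample data are hypothetical.

class TinyDataset:
    """Hold a hashtag row plus data rows; every step returns a new object."""

    def __init__(self, tags, rows):
        self.tags = list(tags)
        self.rows = [list(row) for row in rows]

    def with_rows(self, query):
        # Assumption: only the simple "#tag=value" form is supported here.
        tag, value = query.split('=', 1)
        index = self.tags.index(tag)
        kept = [row for row in self.rows if row[index] == value]
        return TinyDataset(self.tags, kept)

    def without_columns(self, tag):
        # Drop every column whose hashtag matches the given tag exactly.
        keep = [i for i, t in enumerate(self.tags) if t != tag]
        return TinyDataset(
            [self.tags[i] for i in keep],
            [[row[i] for i in keep] for row in self.rows],
        )


source = TinyDataset(
    ['#org', '#sector', '#contact+email'],
    [
        ['UNICEF', 'Education', 'a@example.org'],
        ['WFP', 'Food security', 'b@example.org'],
        ['UNICEF', 'WASH', 'c@example.org'],
    ],
)

# Each call returns a new TinyDataset, so steps can be chained freely
# and the original source is left unchanged.
result = source.with_rows('#org=UNICEF').without_columns('#contact+email')
print(result.tags)
print(result.rows)
```

Because every method returns a fresh object instead of modifying the original, the same source can feed several different chains without one affecting another.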
The library also includes a set of command-line tools for processing HXL data in shell scripts. For example, the following will perform the same operation shown above, without the need to write Python code:
$ cat dataset.xlsx | hxlselect -q "#org=UNICEF" | hxlcut -x '#contact+email'
API-level documentation for the library is available online.