Data storage standards: NXcanSAS accepted!

Data storage. Public domain image, source: https://en.wikipedia.org/wiki/Tape_library#/media/File:NDOC_magnetic_tape_library.jpg

Small-angle scattering datasets do not adhere to any particular format. This point was painfully obvious during the round robin tests, where I had to write a very flexible ASCII data reader just to keep up with the myriad of data formats. For a very long time, there has been a (glacial but steady) movement towards something more interoperable for storing corrected SAS datasets: NXcanSAS.

A while ago, a standard was agreed upon in the form of the CanSAS XML format. It is intended for storing 1D corrected data for later ingestion by data fitting programs, but its adoption has been very slow (even McSAS doesn’t support it, to my shame).

Some of the reasons given for its slow adoption were that the format only applies to a subset of data (1D curves), and that it required the instrument responsible to fill in a lot of metadata fields about their instrument. XML also proved to be a hurdle for some (read: at least me) to implement cleanly. We need something capable of storing 1D, 2D or multidimensional data with the appropriate metadata.

With the advent of NeXus, an HDF5-based format was defined for archival data storage from synchrotron beamlines, together with an exhaustive list of optional descriptors to describe the instrument and experiment. This would be the ideal vessel for a renewed attempt at data storage, and so, CanSAS2012 was brought to life.

This format was intended to be compatible with NeXus, mimicking its structure and definitions. It was then submitted as a proposed official addition (an application definition) to the standard, under the name “NXcanSAS”. In a big step forward, this has now been accepted by the NIAC (NeXus International Advisory Committee), and so it is pretty much ready for use.

Unlike the older CanSAS XML format, NXcanSAS requires only a minimum of information (Q, I, and if it were up to me, also the uncertainty on I), but specifies an additional set of recommended data that can be used to further detail the measurement (wavelength, uncertainties in both Q and I, etc.). It therefore has a very low threshold for implementation, and we should be seeing it pop up in software packages soon. Indeed, Mantid and SasView have already implemented it.
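
To give an idea of how low that threshold is, here is a minimal sketch of what writing such a file could look like in Python. It is not taken from any existing package: the function names, the group names ("sasentry01", "sasdata01"), the placeholder title and run, and the default units are all my own assumptions, and the attribute names follow my reading of the NXcanSAS application definition, so check them against the NeXus manual before relying on them. The layout is built as plain dictionaries first, so it can be inspected without any dependencies; only the final write step needs h5py.

```python
def build_nxcansas_layout(q, intensity, intensity_err,
                          q_units="1/nm", i_units="1/m"):
    """Describe a minimal 1D NXcanSAS entry as nested dicts (no file I/O).

    Group names, title, run and default units are illustrative assumptions.
    """
    if not (len(q) == len(intensity) == len(intensity_err)):
        raise ValueError("Q, I and Idev must have the same length")
    sasdata = {
        "attrs": {
            "NX_class": "NXdata",
            "canSAS_class": "SASdata",
            "signal": "I",       # name of the dataset holding the intensity
            "I_axes": "Q",       # intensity is a function of Q
            "Q_indices": [0],    # Q runs along the first axis of I
        },
        "datasets": {
            # each dataset is (values, attributes)
            "Q": (list(q), {"units": q_units}),
            "I": (list(intensity), {"units": i_units, "uncertainties": "Idev"}),
            "Idev": (list(intensity_err), {"units": i_units}),
        },
        "groups": {},
    }
    return {
        "attrs": {"NX_class": "NXentry", "canSAS_class": "SASentry",
                  "version": "1.0"},
        "datasets": {
            "definition": ("NXcanSAS", {}),
            "title": ("example dataset", {}),  # placeholder
            "run": ("1", {}),                  # placeholder
        },
        "groups": {"sasdata01": sasdata},
    }


def write_layout(h5group, layout):
    """Recursively copy a layout dict into an open h5py group."""
    for key, value in layout["attrs"].items():
        h5group.attrs[key] = value
    for name, (data, attrs) in layout["datasets"].items():
        dataset = h5group.create_dataset(name, data=data)
        for key, value in attrs.items():
            dataset.attrs[key] = value
    for name, sub_layout in layout["groups"].items():
        write_layout(h5group.create_group(name), sub_layout)


def save_nxcansas(path, layout, entry_name="sasentry01"):
    """Write the layout to an HDF5 file; requires the h5py package."""
    import h5py  # imported here so the layout code runs without it
    with h5py.File(path, "w") as h5file:
        write_layout(h5file.create_group(entry_name), layout)
```

With some made-up numbers, usage would be along the lines of `save_nxcansas("example.h5", build_nxcansas_layout([0.1, 0.2], [10.0, 5.0], [1.0, 0.5]))`. The recommended extras (wavelength, Q uncertainties, instrument details) would go in as further datasets and groups in the same way.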

A big thanks to all the people in the CanSAS working groups for their continuing efforts!
