Archival/universal file format

At the 3-way meeting last week, there was some discussion of general file formats for beamlines. The argument for them is that when beamlines and scientific institutes are asked to open their research data to the general public (perhaps after a certain period has expired, as the astronomy community apparently has to), the data should be published in a readable format with enough metadata attached to uniquely identify the experiment and its conditions.

While it sounds like a lot of extra work (and there is a barrier to extra work, as we have enough to do as it is), the idea of using such a format for archival and interchange appeals to me. The astronomy community managed to agree on a format, but the X-ray and neutron scattering communities have discussed the matter plenty and (let’s be honest) have not settled on much yet. Whether that is because there is as yet no need for such a universal file format, or because the format would need to be more flexible than any single format can be, is not known. All I know is that a file format I made for my own experiments, containing the calibrated and corrected data, has saved me a lot of time.

At the 3-way meeting, there was consensus that management should gently impose the file format on the beamlines. However, it was also agreed that money needs to be spent on hiring full-time employees to set up, maintain and support the file format. Finally, it was agreed that NeXus is the format to go for.

With that in mind, I contacted the community to ask whether I could write NeXus files from Matlab (the only language I know). In short, the file format can be read, but the writing tools are not yet implemented. Implementing them will involve including Java classes in the Matlab environment, a method I am unfortunately unfamiliar with. As the effort continues, I will let you know once more is available.
