Rationale and Discussion
Robert W Cox, PhD
Director, Scientific and Statistical Computing Core
National Institute of Mental Health
National Institutes of Health
Department of Health and Human Services
United States of America
26 Feb 2004
The idea that grew into the NIfTI-1 format originated as a random thought on how to get various pieces of well-established FMRI software to work together with the least amount of tinkering. Since the ANALYZE™ 7.5 format is extremely simple, widely used in the FMRI community, and contains some unused or little-used spaces, this format was a logical place to start. The idea was to take the ANALYZE 7.5 header, squeeze some extra fields into it, and get everyone in the DFWG to agree to support it. Since the people in the room included John Ashburner (SPM), Steve Smith (FSL), and myself (AFNI), I figured that if the three of us could agree to use such a hybrid format, for both read and write, the rest of the FMRI world would perforce come along.
Roughly speaking, the two goals of the NIfTI-1 format development were:
- To add information to the header that will be useful for functional neuroimaging (especially FMRI) data analysis and display.
- To maintain compatibility with non-NIfTI-aware ANALYZE 7.5 compatible software.
Of course, nothing is as simple as it ought to be. We had to agree upon which fields in the ANALYZE 7.5 header were really "little-used" and could be over-written with other content. And we had to agree upon what that new content would be, and how it would be interpreted. The issue of coordinate systems was probably the most annoying, as usual. And of course, there was the problem of keeping everything in the 348 byte length of the ANALYZE 7.5 header. The result is not fully satisfying; however, I believe it is a practical short-term solution for the image data file interoperability problem in functional neuroimaging software.
The next section details the fields jettisoned from the ANALYZE 7.5 header. The remainder of this document then discusses the thinking behind each of the conceptual components of the NIfTI-1 header. Along the way, most of the features of the NIfTI-1 format are at least outlined. However, the comments in the NIfTI-1 C header are more complete about most of these features.
What Was Sacrificed from ANALYZE 7.5
All of the fields marked as "unused" in the ANALYZE 7.5 header have been taken for NIfTI-1 purposes. In addition:
- The compressed field has been renamed to slice_duration.
  Reasoning: NIfTI-1 doesn't support compressed image data. Also, the meaning of compressed is quite unclear.
- The verified field has been renamed to toffset.
  Reasoning: The meaning of verified is obscure and we were unaware of any neuroimaging data analysis software that used this field.
- All the data_history sub-structure after the aux_file field has been replaced by other contents.
  Reasoning: The ANALYZE 7.5 documentation describes this entire sub-structure as "not required", and the meanings of most of its fields are unobvious, so deciding to alter it extensively was easy. The only field within data_history that we considered keeping was orient, which was an integer code describing the spatial orientation of the 3D volume. However, this field did not even allow encoding all 48 possible orthogonal orientations, and was seldom set or used in any software of which we were aware. We therefore decided to completely replace the orientation specification in NIfTI-1.
The entire header structure is described in the NIfTI-1 C header as a single C struct, rather than as a struct containing 3 sub-structs. This cosmetic change is entirely for convenience in programming.
Various little-used ANALYZE 7.5 header fields have been kept in the NIfTI-1 format. These are marked as ++UNUSED++ in the NIfTI-1 C header. One reason for keeping these fields is that we weren't sure how necessary they are for ANALYZE 7.5 compatibility. These unused-but-not-replaced fields are data_type, db_name, extents, session_error, glmin, glmax, and regular. The contents of none of these fields are specified by the NIfTI-1 standard. A "proper" NIfTI-aware program should ignore these fields on input. On output, regular should be set to the character 'r' for compatibility with ANALYZE 7.5; the other unused fields may be filled with zero bytes for safety.
The glmin and glmax fields are sometimes used to hold the global data minimum and maximum values. We decided against mandating this use of these fields because we are now allowing vector-valued data to be stored at each voxel (in the 5th dimension of the image array). The concept of data minimum and maximum doesn't apply to such data. However, we did not "recycle" these fields since some ANALYZE 7.5 compatible software uses these values. In a NIfTI-1 file, these fields should either be set to 0, or be set to the global minimum and maximum of the image data.
NIfTI-1, like ANALYZE 7.5, allows data to be stored on a regular grid in 1..7 dimensions. However, the NIfTI-1 standard adds some specific interpretations to the dimensions that the ANALYZE 7.5 format does not require.
The number of dimensions is stored in dim[0], as in ANALYZE 7.5; this may be a value from 1..7. (If this value is out of range, then that is the signal that the header and data should be byte-swapped: pairwise for short fields and quadwise for float fields. If dim[0] is still out of range after byte-swapping, then the file is not interpretable as either a NIfTI-1 or ANALYZE 7.5 file. Note that this technique is also the correct way to detect if an ANALYZE 7.5 file needs byte-swapping; the practice of checking if sizeof_hdr=348 for this purpose is not correct.)
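The dim[0] test described above can be sketched in a few lines of Python. This is a minimal illustration, not reference code: the function name is hypothetical, and the byte offset of dim[0] (40) is an assumption taken from the field layout in the NIfTI-1 C header rather than from this document. Instead of swapping in place, the sketch simply tries both byte orders and reports the one that yields a legal value.

```python
import struct

# Assumed byte offset of dim[0] within the 348-byte header
# (from the NIfTI-1 C header layout, not stated in this document).
DIM_OFFSET = 40


def detect_byte_order(header: bytes) -> str:
    """Return the struct prefix ('<' or '>') for which dim[0] lies in 1..7."""
    if len(header) < DIM_OFFSET + 2:
        raise ValueError("header too short to contain dim[0]")
    for order in ("<", ">"):
        # dim[0] is a 16-bit signed integer
        (ndim,) = struct.unpack_from(order + "h", header, DIM_OFFSET)
        if 1 <= ndim <= 7:
            return order
    # Out of range in both byte orders: neither NIfTI-1 nor ANALYZE 7.5
    raise ValueError("dim[0] is out of range in both byte orders")
```

Because the legal values 1..7 become 256..1792 when their two bytes are exchanged, at most one byte order can pass the test.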
As a shorthand, let N=dim[0]. Then dim[1]..dim[N] contain the length of each dimension in the N-dimensional image data array; again, as in ANALYZE 7.5. Unlike ANALYZE 7.5, in NIfTI-1 each dimension index 1..5 is assigned a specific role. Dimensions 1..3 are always interpreted as spatial dimensions, here called x, y, and z. Dimension 4 is the time dimension, here called t. Dimension 5 is reserved for storing multiple values per spatiotemporal voxel (e.g., for storing vector-valued data). NIfTI-1 does not specify any uses for dimensions 6 and 7.
The NIfTI-1 C header comments have a number of examples of how the dimensional roles described above should be used. For example, a spatially 1D time series of 3-vectors would have
dim[0]=5 dim[1]=dim[2]=dim[3]=1 dim[4]=number_of_time_points dim[5]=3
(Further discussion of how vector-valued data is to be stored can be found in a later section.)
The reason for assigning fixed dimensional roles to each index 1..5 is to make the interpretation of the file simple and consistent. A program can tell if an input file is time-dependent simply by checking if dim[0]≥4 and then if dim[4]>1. As another example, a single slice FMRI experiment with 155 image acquisitions might have
dim[0]=4 dim[1]=dim[2]=64 dim[3]=1 dim[4]=155
Although this is really 3 dimensional data (x,y,t), it is easily distinguished from a 3 dimensional volume (x,y,z) because dim[4]>1 and dim[3]=1.
It may seem odd that some earlier dimensions might be 1, while later dimensions are greater than 1. In terms of the actual image data storage, this makes no difference, since the data at the 5D index (a,b,c,d,e) is to be stored at byte offset
(a + b*dim[1] + c*dim[1]*dim[2] + d*dim[1]*dim[2]*dim[3] + e*dim[1]*dim[2]*dim[3]*dim[4]) * bitpix/8
into the image data array, for a=0..dim[1]-1, b=0..dim[2]-1, etc. In the spatially 1D time series example above, the only legal value for a, b, and c is 0. In this example, the offset formula simplifies to
(d + e*dim[4]) * bitpix/8
which is exactly the form one would get by storing the time axis in dimension 1 and the vector components along dimension 2. Thus, there is no computational or storage efficiency loss to assigning dimensional roles that result in leading dimensions being set to 1.
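The offset formula above can be written as a running stride, which makes it easy to see why leading dimensions of length 1 cost nothing. The sketch below is illustrative only; the function name is hypothetical, and it assumes that dimensions beyond dim[0] are treated as having length 1.

```python
def voxel_byte_offset(index, dim, bitpix):
    """Byte offset of the zero-based index (a, b, c, d, e) in the image data.

    dim is the header dim array (dim[0] = number of dimensions) and
    bitpix is the number of bits per stored value.
    """
    offset = 0  # offset counted in voxels
    stride = 1  # voxels skipped per unit step along the current axis
    for axis, i in enumerate(index, start=1):
        length = dim[axis] if axis <= dim[0] else 1
        if not 0 <= i < length:
            raise IndexError(f"index {i} out of range for dimension {axis}")
        offset += i * stride
        stride *= length
    return offset * bitpix // 8


# Spatially 1D time series of 3-vectors with 155 time points, 32-bit values
dim = [5, 1, 1, 1, 155, 3, 1, 1]
print(voxel_byte_offset((0, 0, 0, 10, 2), dim, 32))  # (10 + 2*155) * 4
```

With dim[1]=dim[2]=dim[3]=1 the stride stays at 1 until the time axis is reached, so the loop reproduces the simplified form (d + e*dim[4]) * bitpix/8.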
There was some discussion in the DFWG about having an option to store the vector values along dimension 1. The advantage of using this option would be that data that "belongs" together in the same spatiotemporal voxel would be stored contiguously in the data array. Storing the vector values along dimension 5 means that such data can be spaced quite far apart; for example, if dim[1]=dim[2]=dim[3]=100, dim[4]=1, then each unit step along dimension 5 steps across 1000000*bitpix/8 bytes of image data. On modern computer systems, such non-local access can be quite slow if not programmed very carefully.
The reason we decided against this "vector-values along dimension 1" option is that such files would make no sense to a non-NIfTI-aware program. With dimensions 1..3 being assigned to space and dimension 4 to time, our hope is that most programs reading the ANALYZE 7.5 format will still be able to at least display NIfTI-1 files in some recognizable way.
.hdr/.img File Pairs and .nii Single Files
The ANALYZE 7.5 format famously stores a dataset in two files: prefix.hdr to contain just the 348-byte header, and prefix.img to contain just the binary image data. (In principle, the image data could be stored in another filename, which would be stored in header field db_name, but this is very uncommon in the functional neuroimaging community.)
NIfTI-1 keeps the file pair storage strategy, in part to keep a measure of compatibility with ANALYZE 7.5. Another reason for storing the binary image data in a separate file is that it becomes easy to load the data into memory using the Unix mmap() function. This function allows a program to map a file into memory space; the OS then takes care of the I/O.
There are two principal reasons that a single file storage of a dataset is desirable:
- Convenience in file copying, moving, and renaming operations.
- Making it possible to link to a dataset file on a Web page; the creation of "helper" applications that can then display such files will make it possible for researchers to post 3D and 4D neuroimaging data and results for easy viewing.
To signal that a 348-byte header is NIfTI-1 compatible, the last 4 bytes of this header comprise the magic field. This field should have one of these two values:
- "ni1\0" (hexadecimal: 6E 69 31 00) — indicates a NIfTI-1 dataset stored in two files;
- "n+1\0" (hexadecimal: 6E 2B 31 00) — indicates a NIfTI-1 dataset stored in one file.
The preferred NIfTI-1 suffix for such single files is .nii. This was chosen because the more obvious .nif was already spoken for; cf. http://www.icdatamaster.com/n.html. On the plus side, "nii" can be pronounced as in this movie scene. However, this filename suffix is not required; the ultimate decision about whether the image data is in the same file as the header comes from the magic field, not from the filename.
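A reader can therefore decide how a dataset is stored from the header bytes alone. The following is a minimal sketch with hypothetical function and return-value names; it relies only on the two magic strings listed above and on the magic field occupying the last 4 bytes of the 348-byte header.

```python
HEADER_SIZE = 348
MAGIC_OFFSET = HEADER_SIZE - 4  # the magic field is the last 4 bytes


def storage_layout(header: bytes):
    """Classify a 348-byte header by its magic field.

    Returns "pair" for a .hdr/.img NIfTI-1 dataset, "single" for a
    one-file NIfTI-1 dataset, and None if the header is not marked as
    NIfTI-1 (for example, a plain ANALYZE 7.5 header).
    """
    if len(header) < HEADER_SIZE:
        raise ValueError("need at least 348 header bytes")
    magic = header[MAGIC_OFFSET:HEADER_SIZE]
    if magic == b"ni1\x00":
        return "pair"
    if magic == b"n+1\x00":
        return "single"
    return None
```

Note that the filename suffix plays no part in the decision, in keeping with the rule stated above.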
For years, SPM has used the ANALYZE 7.5 funused1 field as a scaling factor; if the value stored therein is nonzero, then all the image data is scaled by funused1 before being used in the program. NIfTI-1 has formalized and extended this idea. The funused1 field is renamed to scl_slope and the funused2 field is renamed to scl_inter. If scl_slope is nonzero, then a voxel value val read from the file should be interpreted as scl_slope * val + scl_inter inside the program. This scaling is to apply to all the values in the dataset, unless the datatype is RGB.
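The scaling rule can be sketched as follows. The function name and the is_rgb flag are hypothetical; the logic only restates the rule above: scale when scl_slope is nonzero and the datatype is not RGB, otherwise pass the stored values through unchanged.

```python
def scale_voxels(raw_values, scl_slope, scl_inter, is_rgb=False):
    """Apply the NIfTI-1 scaling rule to values read from the file."""
    if is_rgb or scl_slope == 0:
        # No scaling: RGB data, or the slope field is unset (zero)
        return list(raw_values)
    return [scl_slope * v + scl_inter for v in raw_values]


print(scale_voxels([0, 1, 2], 2.0, 1.0))  # stored shorts 0, 1, 2 with slope 2, intercept 1
print(scale_voxels([0, 1, 2], 0.0, 5.0))  # slope of zero means "do not scale"
```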
In general, it is probably better to store floating point values as floating point data, rather than try to compress them to 16-bit shorts or 8-bit bytes along with a global scaling factor. However, several DFWG members felt that providing a standard way of expressing scaling was important, especially for backward compatibility.
A large number of DFWG members felt it was important to provide a "complete" set of elementary data type codes (stored in the datatype header field). The allowable datatype codes have been extended to allow for 8..128 bit integer, float, and complex values. For more information, see FAQ #12.
There was some discussion of encoding more complicated structured types to be stored at each voxel. (The structured types allowed in ANALYZE 7.5 are complex numbers=float pairs and RGB colors=byte triples.) The interpretation of dimension 5 as stored vector-values for each spatiotemporal voxel partly elides this issue. However, since each value in the image binary data must have the same datatype, it is impossible currently to store (say) an int and 3 floats at each location. Unfortunately, it seems to be impracticable to encode any reasonably general typing mechanism and at the same time keep to the 348-byte header length compatibility requirement.
Direct imaging data (i.e., the output from a scanner) is only a fraction of what is stored in functional neuroimaging data analyses. A large quantity of derived datasets is created along the way. Many of these are statistics of various kinds. One goal of our efforts was to provide a standard way to mark the values in a dataset as being t-statistics, F-statistics, etc., and to store the auxiliary parameters (e.g., degrees-of-freedom) that go with these parametric distributions.
"Meaning" is attached to the numbers in a dataset by setting the intent_code field to a nonzero value. Values of intent_code in the range 2..22 indicate that the dataset numbers are drawn from various standard probability distributions. Any particular distribution may have from 0 to 3 parameters. The distribution codes and parameters are:
| intent_code | Distribution | Parameter 1 | Parameter 2 | Parameter 3 |
|---|---|---|---|---|
| 4 | F-statistic | numerator DOF | denominator DOF | N/A |
| 8 | Binomial | no. trials | prob. per trial | N/A |
| 12 | Noncentral F-statistic | numerator DOF | denominator DOF | noncentrality |
| 16 | Uniform | lower bound | upper bound | N/A |
| 20 | Inverse Gaussian | μ (mu) | λ (lambda) | N/A |
| 21 | Extreme Value I | location | scale | N/A |
Codes 2..10 are chosen to be compatible with AFNI. It is my intention to provide C functions to inter-convert these distributional values between p-values and z-scores. A little more information about these codes and distributions can be found in the C header file. Three books of particular interest are
- Univariate Discrete Distributions. NL Johnson, S Kotz, AW Kemp.
- Continuous Univariate Distributions, vol. 1. NL Johnson, S Kotz, N Balakrishnan.
- Continuous Univariate Distributions, vol. 2. NL Johnson, S Kotz, N Balakrishnan.
It was my original intention that the statistical parameters for such a dataset would be globally fixed in the header; for example, a dataset of t-statistics with 23.7 degrees-of-freedom. (This is the current status in AFNI.) Several other members of the DFWG convinced me that allowing the parameters to be voxel-dependent was an important feature. Thus, NIfTI-1 allows a statistical dataset to have its parameters specified in two different ways:
- If dim[5]=1 (or if dim[0]<5), then the parameters are global (voxel-independent) and are stored in intent_p1, intent_p2, and intent_p3.
- If dim[5]>1, then dim[5] should equal 1 plus the number of parameters for the intent_code distribution. If we call e the index for the 5th dimension, then for each 4D voxel index (a,b,c,d), the statistic itself is stored in the 5D location (a,b,c,d,e=0), the first parameter in (a,b,c,d,e=1), et cetera.
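The two cases above can be handled by one lookup routine. This is a rough sketch under stated assumptions: the function and argument names are hypothetical, the image data is assumed to be already loaded as a flat sequence in file order, and n_params (the number of parameters of the intent_code distribution) is supplied by the caller.

```python
def stat_and_params(data, dim, intent_p, n_params, voxel):
    """Return (statistic, parameters) for the 4D voxel index (a, b, c, d).

    data     : flat sequence of values in file storage order
    dim      : header dim array (dim[0] = number of dimensions)
    intent_p : (intent_p1, intent_p2, intent_p3) from the header
    n_params : number of parameters of the distribution (0..3)
    """
    a, b, c, d = voxel
    # Lengths of dimensions 1..4; dimensions beyond dim[0] count as 1
    n1, n2, n3, n4 = (dim[k] if k <= dim[0] else 1 for k in (1, 2, 3, 4))
    base = a + n1 * (b + n2 * (c + n3 * d))  # location of (a, b, c, d, e=0)
    n5 = dim[5] if dim[0] >= 5 else 1

    if n5 == 1:
        # Global parameters, stored once in the header
        return data[base], tuple(intent_p[:n_params])

    if n5 != 1 + n_params:
        raise ValueError("dim[5] should be 1 + number of parameters")
    step = n1 * n2 * n3 * n4  # distance between successive e values
    return data[base], tuple(data[base + e * step] for e in range(1, n5))
```

For example, a two-voxel dataset of t-statistics with per-voxel degrees of freedom would have dim[5]=2, with the statistics in the first half of the data array and the degrees of freedom in the second half.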
Values of intent_code greater than 1000 are used to signal other meanings of the data. Several of these codes are used to signify the interpretation of the 5th dimension in a dataset:
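| intent_code | Interpretation | Header settings |
|---|---|---|
| 1004 | M×N General Matrix | dim[5]=M*N intent_p1=M intent_p2=N |
| 1005 | N×N Symmetric Matrix | dim[5]=N*(N+1)/2 intent_p1=N |
| 1008 | N-dimensional Point Set | dim[5]=N dim[2]=dim[3]=dim[4]=1 |
| 1010 | Triangle Set | dim[5]=3 dim[2]=dim[3]=dim[4]=1 |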
The difference between "Displacement" and "General" vector is that intent_code=1006 is used to signal that the vectors stored at each location are a geometric displacement (or distortion). This feature can be used to encode nonlinear warps, for example. intent_code=1007 just signals that vectors are stored at each location, but does not impute any more meaning to them (e.g., this feature could be used to store a diffusion-tensor derived tract direction at each location in space).
The "Point Set" intent_code's function is to allow storage of random points in space; for example, the nodes of a surface mesh. The "Triangle Set" intent_code's function is to allow storage of integer triples that encode a surface mesh. In both cases, dim[1] would be set to the number of entries in the dataset. The combination of a "Point Set" dataset and a "Triangle Set" dataset would allow the specification of a surface mesh, where the indexes stored in the "Triangle Set" dataset would indicate which nodes from the "Point Set" dataset were in each triangle.
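To make the pairing concrete, here is a small sketch of how a program might resolve a "Triangle Set" against a "Point Set" once both datasets are in memory. The function name is hypothetical, and the sketch assumes zero-based node indexes, which this document does not specify.

```python
def resolve_triangles(points, triangles):
    """Replace each triangle's node indexes by the node coordinates.

    points    : list of (x, y, z) tuples from a "Point Set" dataset
    triangles : list of (i, j, k) index triples from a "Triangle Set"
                dataset (assumed zero-based)
    """
    resolved = []
    for tri in triangles:
        if len(tri) != 3:
            raise ValueError("each triangle needs exactly 3 node indexes")
        for node in tri:
            if not 0 <= node < len(points):
                raise IndexError(f"triangle refers to missing node {node}")
        resolved.append(tuple(points[node] for node in tri))
    return resolved


# A unit square split into two triangles
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
triangles = [(0, 1, 2), (0, 2, 3)]
mesh = resolve_triangles(points, triangles)
```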
Numerous other specialty vector types could easily be devised. Rather than get carried away, we decided to stop here and see what comments we get. All these features are frankly experimental and tentative.
Storing Other Images with
A few remaining intent_code values are provided to allow for tagging a dataset's values as having particular meanings. The intent_name string field can be used to store additional textual information about these meanings.
| intent_code | Interpretation | Notes |
|---|---|---|
| 1001 | Values are estimates | intent_name=what's being estimated |
| 1002 | Values are label indexes | aux_file=file with labels |
| 1003 | Values are NeuroNames indexes | in my dreams |
Values 1002 and 1003 for intent_code are there to let a dataset be a set of labels; for example, to mark each voxel in a brain volume with an anatomical structure name (or names, if dim[5]>1). Code 1002 is for a custom set of labels; the integer values in the dataset are supposed to correspond to an external file, perhaps specified in header field aux_file. Code 1003 is for a standard set of labels, from the NeuroNames list. (This feature is still in development, and not available yet.)
Coordinate Systems and Data
The issue of specifying the orientation and location in space of a volumetric dataset is moderately annoying. On the one hand, it seems basically trivial. On the other hand, no one agrees on what is the "right way" to do this. The choices in NIfTI-1 are the result of a compromise between John Ashburner, Steve Smith, and myself, worked out via e-mail and at the bar in the Marquis Marriott during the 9th Annual Meeting of the OHBM.
When dealing with a single volume, or a set of spatially identical volumes, the orientation and location of the raster grid are relatively unimportant when processing the data. But in FMRI and other neuroimaging applications, one usually needs to deal with sets of volumes acquired on different grid sizes, in different orientations, and with different coverage of the brain. For example, one usually wants to overlay low-resolution FMRI results onto high-resolution MP-RAGE/SPGR type structural reference volumes. To do this correctly and automatically requires that the spatial relationship between different volumes be encoded somehow.
One way to encode such relationships (adopted here) is to postulate a master coordinate system, and then store the location/orientation of each dataset with respect to that master system. Then the spatial relationship between any two datasets can be found. The basic spatial relationship question is, "What value of (i,j,k)_1 in dataset #1 corresponds to a given (i,j,k)_2 in dataset #2?" With a common master coordinate system, each dataset #p contains a transformation from (i,j,k)_p to (x,y,z); call this transformation (x,y,z) = T_p[(i,j,k)_p]. Then the answer to the basic question is (i,j,k)_1 = T_1^{-1}[T_2[(i,j,k)_2]]. When T_p is an affine transformation, then T_p^{-1} is also affine, and so is T_1^{-1}T_2. (The transformations stored in the NIfTI-1 header are affine.)
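The index mapping between two datasets can be sketched directly from these definitions. The representation below is an assumption made for illustration: each transformation is a pair (M, t) of a 3×3 matrix and a translation, so that T[(i,j,k)] = M·(i,j,k) + t; all function names are hypothetical.

```python
def apply_affine(T, p):
    """Apply the affine map T = (M, t) to the 3-vector p."""
    M, t = T
    return tuple(sum(M[r][c] * p[c] for c in range(3)) + t[r] for r in range(3))


def invert_affine(T):
    """Invert T = (M, t); the inverse is (M^-1, -M^-1 t)."""
    M, t = T
    (a, b, c), (d, e, f), (g, h, i) = M
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if det == 0:
        raise ValueError("transformation is singular")
    # Inverse of a 3x3 matrix via its adjugate
    inv = [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]
    t_inv = [-sum(inv[r][c] * t[c] for c in range(3)) for r in range(3)]
    return inv, t_inv


def index_in_dataset1(T1, T2, ijk2):
    """Answer the basic question: which (i,j,k)_1 matches a given (i,j,k)_2?"""
    xyz = apply_affine(T2, ijk2)            # dataset #2 index -> master coordinates
    return apply_affine(invert_affine(T1), xyz)  # master coordinates -> dataset #1 index
```

For instance, if dataset #1 has 1 mm voxels and dataset #2 has 2 mm voxels with the same origin, index (3,4,5) in dataset #2 maps to index (6,8,10) in dataset #1. In general the result is not an integer triple, so a real program would interpolate or round.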
An alternative way to encode the spatial relationship between datasets (not adopted here) is to eschew the use of an arbitrary master coordinate system. Instead, the pairwise relationship between any two datasets is stored separately. This approach is followed in FSL, where external files with names like example_func2highres.mat encode the spatial transformation between the EPI data (example_func.hdr) and the high-resolution anatomical volume (highres.hdr). The main drawback to this scheme is that it is hard to generalize to cases with many different types of images, each gathered in a different way. For this reason, we did not use this approach in NIfTI-1.
NIfTI-1 allows the master coordinate system to be labeled as one of the following:
- Scanner-based anatomical coordinates (e.g., from the DICOM image header);
- Scanner-based anatomical coordinates, realigned to some "truth";
- Coordinates aligned to the Talairach-Tournoux Atlas (e.g., (0,0,0)=Anterior Commissure);
- Coordinates aligned to the MNI-152 standard template.
The NIfTI-1 format allows the storage of two affine transformations from (i,j,k) to (x,y,z). The first transformation ("qform") is intended to encode the orthogonal orientation of the 3D volumes in the master coordinate space. The second transformation ("sform") is intended to encode a general affine transformation to a standard space (i.e., Talairach-Tournoux or MNI-152). Each transform can be useful in different contexts: the "qform" for displaying the data on its original grid (what the scanner reported), and the "sform" for displaying the data scaled to a standard space. For example, the "qform" can be constructed from the attributes from a set of DICOM files; the "sform" can be computed and inserted into the header later. Image display software can use either transform, depending on its needs.
To save space, the "qform" orthogonal transformation is encoded using a quaternion in the header file. The "sform" general transformation is stored as a 3×4 matrix (three rows of four values each). The details are described in the NIfTI-1 C header at length.
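As a rough illustration of why a quaternion saves space, the sketch below rebuilds a 3×3 rotation matrix from three stored numbers. It uses the standard unit-quaternion formula and assumes that only the components (b, c, d) are stored, with a recovered as sqrt(1-b²-c²-d²); the function name is hypothetical, and voxel sizes, offsets, and any axis-flip handling described in the NIfTI-1 C header are deliberately omitted.

```python
import math


def quaternion_to_rotation(b, c, d):
    """Rotation matrix for the unit quaternion (a, b, c, d), a >= 0."""
    # Clamp at zero to guard against round-off when b*b + c*c + d*d is ~1
    a = math.sqrt(max(1.0 - (b * b + c * c + d * d), 0.0))
    return [
        [a * a + b * b - c * c - d * d, 2 * (b * c - a * d), 2 * (b * d + a * c)],
        [2 * (b * c + a * d), a * a + c * c - b * b - d * d, 2 * (c * d - a * b)],
        [2 * (b * d - a * c), 2 * (c * d + a * b), a * a + d * d - b * b - c * c],
    ]
```

Here (b, c, d) = (0, 0, 0) gives the identity matrix, and (0, 0, 1) gives a 180 degree rotation about the z axis, so three floats stand in for the nine entries of the matrix.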
An important point is that the (x,y,z) value assigned by the header transformations to a voxel 3D index (i,j,k) refers to the center of the voxel. This point was the subject of some debate at the Marquis Marriott meeting. The alternative proposal was to define (x,y,z) as being a corner of a voxel, where (i,j,k)=(0,0,0) would correspond to the outermost corner of the first voxel in the dataset. This proposal has the advantage that the 3D bounding box of the volume is then given by T([0,0,0])..T([dim[1]-1,dim[2]-1,dim[3]-1]), where again T represents the affine transformation from (i,j,k) to (x,y,z). After some mild debate, the center proposal won the vote. This is consistent with the custom in scanner-supplied image headers (including DICOM), where pixel coordinates invariably refer to the center.
Codes are supplied to indicate the spatial units of the (x,y,z) axes: millimeters, micrometers, and meters. Codes are also available to indicate the units of the t (4th dimension) axis: seconds, milliseconds, and microseconds. If the "t" axis is really frequency (e.g., after Fourier transforming), then codes are supplied to indicate the 4th dimension units are Hz or ppm. If necessary, codes could similarly be added to indicate that the spatial axes are in k-space (e.g., units of cycles/mm).
Note that centimeters are not an available code. This is because cm is not an SI unit, and its use is deprecated in the scientific literature.
The time step is given in pixdim[4]; the time origin is given in toffset; that is, the qth time point is at time toffset+q*pixdim[4] for q=0..dim[4]-1.
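In code, the time axis is just an arithmetic progression. A minimal sketch, with a hypothetical function name and with the header values passed in explicitly:

```python
def time_of_point(q, toffset, time_step, n_points):
    """Time of the q-th point: toffset + q * pixdim[4], for q = 0..dim[4]-1.

    time_step is pixdim[4] and n_points is dim[4]; the result is in
    whatever time units the header declares.
    """
    if not 0 <= q < n_points:
        raise IndexError("time index out of range")
    return toffset + q * time_step


# 155 acquisitions, one every 2 time units, starting at time 0
times = [time_of_point(q, 0.0, 2.0, 155) for q in range(155)]
```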
Since FMRI is the most widely used functional neuroimaging technique, and since the DFWG's charter from the main NIfTI committee was to help make FMRI analysis software interoperable, several features in the NIfTI-1 format are specialized for multislice echo-planar image (EPI) acquisition:
- A field to specify which of (x,y,z) corresponds to the frequency, phase, and slice encoding directions. This is essential information when trying to apply a magnetic field map to unwarp an echo-planar image.
- A field to specify the amount of time required to gather a single slice. Clustered acquisition schemes may have slice_duration*dim[slice_dim] less than pixdim[4], with the rest of the TR=pixdim[4] time silent.
- A field to indicate the slice acquisition order (e.g., interleaved). These last two fields are essential when trying to estimate hemodynamic delays from EPI time series.
- The ANALYZE 7.5 fields cal_min and cal_max have been left untouched; however, the meaning of these values is not specifically defined by the NIfTI-1 standard. One possible use is to map data values to some color display scale; for example, map values at (or below) cal_min to "black", values at or above cal_max to "white", and values in between linearly to intermediate colors. Here, "black" and "white" can be the extreme ends of any color scheme, of course.
- Integer types are implicitly assumed to be stored in twos-complement form. Floating point types are implicitly assumed to be stored in IEEE-754 format (so that binary floats are interchangeable, with at most byte-swapping required). These assumptions should be valid for all modern microprocessors one is likely to encounter.
- Explicit use in the header is made of the assumptions that sizeof(short)==2 and sizeof(float)==4. Implicitly, we are also assuming that sizeof(int)==4, sizeof(double)==8, and that the complex types are stored with no padding.
- We are assuming that there are only two byte orders possible, and that correctly swapping data values that occupy N bytes is done by swapping end-for-end (e.g., ABCD↔DCBA). Also, we are assuming that the header and data are both stored in the same endian-ness, so that by examining dim[0] one can determine whether to byte swap the rest of the header and the image data.
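The end-for-end swap assumed in the last item can be sketched as below. The function name is hypothetical; width is the number of bytes N occupied by each value (2 for shorts, 4 for floats and ints, and so on).

```python
def swap_bytes(buf: bytes, width: int) -> bytes:
    """Reverse the byte order of every width-byte value in buf."""
    if width < 1 or len(buf) % width != 0:
        raise ValueError("buffer length must be a multiple of the value width")
    out = bytearray(len(buf))
    for start in range(0, len(buf), width):
        # Reverse one value end-for-end, e.g. ABCD -> DCBA when width is 4
        out[start:start + width] = buf[start:start + width][::-1]
    return bytes(out)


print(swap_bytes(b"ABCD", 4))  # quadwise swap
print(swap_bytes(b"ABCD", 2))  # pairwise swap
```

Since header fields have different widths, a reader would apply the swap field by field across the header, and then to the image data using the width implied by the datatype.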
Although I am the author of AFNI and its file format, very little of that material has crept into the NIfTI-1 format. About the only place where my AFNI work has directly impacted the new format is my insistence on including intent codes for indicating that the values in a dataset are drawn from various common probability distributions. The actual codes used for this are drawn from the AFNI format.
I'm not particularly enamored of the AFNI file format any more. I "designed" it in about an hour in July 1994. I would prefer an XML-based format to become a de facto FMRI standard, but that doesn't seem to be an option at this time (i.e., the enthusiasm level for this in the DFWG was non-existent). I chose to put forward the idea of adapting the ANALYZE 7.5 format to FMRI needs because I felt this would be the most likely way to make progress in a short amount of time. Here, "progress" is defined as being able to get the major FMRI software packages to exchange data without the users experiencing too much anguish. To my mind, the biggest issue was (and is) coordinate systems.
One of the FAQs is "Why didn't the committee choose the DICOM format?" My personal answer to this question includes the following points:
- DICOM is far too complex to understand easily. A reasonably competent programmer should be able to write a NIfTI-1 read/write set of functions in a day or so. (Less if they start with the existing software.)
- DICOM doesn't allow storage of floating point values. (This "feature" is perhaps the biggest objection in my mind.)
- We'd need to add a bunch of attributes to describe neuroimaging-specific needs (e.g., statistical codes and parameters; vector-valued datasets). These additions would make NIfTI-1 datasets not particularly readable in standard DICOM programs.
Last modified 2008-01-22 12:50