In the IM library, images are 2D matrices of pixels with a given width and height. Stacks, animations, videos and volumes are represented as a sequence of individual images.
The pixels can have one of several color spaces:
IM_MAP is a subset of the IM_RGB color space. It can have a maximum of 256 colors. Each value is an index into an RGB palette.
IM_GRAY usually means luma (nonlinear Luminance), but it can represent any other intensity value that is not necessarily related to color.
IM_BINARY is a subset of the IM_GRAY color space, and it has only 2 colors, black and white. Each value can be 0 or 1, but for practical reasons we use one byte to store it.
The other color spaces are standard CIE color spaces, except CMYK, which does not have a clear definition without other parameters to complement it.
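To illustrate the IM_MAP description above, here is a minimal sketch of a palette lookup. The `PaletteColor` type and the `map_to_packed_rgb` function are hypothetical names made up for this example and are not the library's own palette representation; the sketch also assumes the palette array always has 256 entries.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical palette entry for this sketch (not the library's palette type). */
typedef struct { unsigned char r, g, b; } PaletteColor;

/* Expand a map image (one palette index per pixel) into packed RGB bytes.
   'count' is the number of pixels; 'rgb' must hold 3 * count bytes.
   'palette' is assumed to have 256 entries, so any byte index is in range. */
void map_to_packed_rgb(const unsigned char *map, size_t count,
                       const PaletteColor *palette, unsigned char *rgb)
{
  size_t i;
  assert(map != NULL && palette != NULL && rgb != NULL);
  for (i = 0; i < count; i++) {
    const PaletteColor c = palette[map[i]];
    rgb[3 * i + 0] = c.r;
    rgb[3 * i + 1] = c.g;
    rgb[3 * i + 2] = c.b;
  }
}
```

Because each pixel is a single byte, the index can never exceed 255, which is where the 256 color limit comes from.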
There are several numeric representations for the color component, or several data types:
There is no bit type; binary images use 1 byte per pixel, which wastes space but keeps processing simple.
To avoid defining another image parameter, we use a parameter called color_mode that is composed of the color_space plus some flags, i.e. color_mode = color_space + flags. The flags are combined with the color space using a bitwise OR, for example color_mode = IM_RGB | IM_XXX. Several flags can be combined in the same color_mode.
There are 3 flags: IM_ALPHA, IM_PACKED and IM_TOPDOWN.
When a flag is absent, the opposite definition is assumed. For simplicity, we define some macros that help handle the color mode.
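As an illustration of how a color mode can be composed and then taken apart again, here is a minimal sketch. The numeric values given to the constants are assumptions chosen so the example compiles on its own; real code should take them from the library headers. The `EX_` macros and `ex_make_color_mode` are hypothetical helpers for this example, not the library's macros.

```c
#include <assert.h>

/* Stand-in definitions so the sketch compiles without the library.
   The numeric values are assumptions for this example only. */
enum { IM_RGB = 0, IM_MAP = 1, IM_GRAY = 2, IM_BINARY = 3 };
enum { IM_ALPHA = 0x100, IM_PACKED = 0x200, IM_TOPDOWN = 0x400 };

/* Hypothetical helpers: the low byte holds the color space, higher bits hold flags. */
#define EX_SPACE_MASK          0xFF
#define EX_MODE_SPACE(cm)      ((cm) & EX_SPACE_MASK)
#define EX_MODE_HAS_ALPHA(cm)  (((cm) & IM_ALPHA) != 0)
#define EX_MODE_IS_PACKED(cm)  (((cm) & IM_PACKED) != 0)
#define EX_MODE_IS_TOPDOWN(cm) (((cm) & IM_TOPDOWN) != 0)

/* Combine a color space with any number of flags into one color mode. */
int ex_make_color_mode(int color_space, int flags)
{
  assert((color_space & ~EX_SPACE_MASK) == 0); /* space must fit in the low byte */
  assert((flags & EX_SPACE_MASK) == 0);        /* flags must not overlap the space */
  return color_space | flags;
}
```

Keeping the flags in bits that never overlap the color space is what lets a single integer carry both, and an absent flag simply reads as zero.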
The number of components of the color space defines the depth of the image. The color components can be packed sequentially in one plane (like rgbrgbrgb...) or separated in several planes (like rrr...ggg...bbb...). Packed color components are normally used by graphics systems. We allow these two options because many users define their own image structure, which can have a packed or a separated organization. The following picture illustrates the difference between the two options:

[Figure: Separated (flag not defined) and Packed (IM_PACKED) RGB Components]
An extra component, the alpha channel, may be present. The number of components is then increased by one. Its organization follows the rules of packed and unpacked components.
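The difference between the two organizations comes down to how the position of one component is computed. The following is a minimal sketch; `component_index` is a hypothetical helper for this example, and it counts positions in components rather than bytes, so the data type size is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Position (in components, not bytes) of component 'c' of pixel (x, y).
   'depth' is the number of components per pixel, including alpha if present.
   'packed' is nonzero for rgbrgb... and zero for rrr...ggg...bbb... */
size_t component_index(size_t width, size_t height, size_t depth,
                       size_t x, size_t y, size_t c, int packed)
{
  assert(x < width && y < height && c < depth);
  if (packed)
    return (y * width + x) * depth + c;         /* components of a pixel are adjacent */
  return c * (width * height) + y * width + x;  /* each component has its own plane */
}
```

With an alpha channel, `depth` grows by one and the same formulas apply: alpha becomes the last component of each pixel when packed, or one more plane when separated.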
Image orientation can be bottom up, with the origin at the bottom left corner, or top down, with the origin at the top left corner.

[Figure: Top Down (IM_TOPDOWN) and Bottom Up (flag not defined) Orientations]
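Converting between the two orientations only changes the order of the lines. The sketch below shows one way this could be done; `stored_line` and `flip_lines` are hypothetical helpers for this example and are not library functions.

```c
#include <assert.h>
#include <stddef.h>

/* Index of the stored line that holds the row 'y_from_top' of the displayed image. */
size_t stored_line(size_t y_from_top, size_t height, int topdown)
{
  assert(y_from_top < height);
  return topdown ? y_from_top : height - 1 - y_from_top;
}

/* Reverse the order of the lines of one plane in place.
   'line_size' is the number of bytes in one line of that plane. */
void flip_lines(unsigned char *data, size_t line_size, size_t height)
{
  size_t top, bottom, i;
  assert(data != NULL);
  if (height < 2)
    return;
  for (top = 0, bottom = height - 1; top < bottom; top++, bottom--) {
    unsigned char *a = data + top * line_size;
    unsigned char *b = data + bottom * line_size;
    for (i = 0; i < line_size; i++) {
      unsigned char t = a[i];
      a[i] = b[i];
      b[i] = t;
    }
  }
}
```

For separated components the flip would be applied once per plane; for packed components a line contains all components of its pixels, so a single call covers the whole image.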
Examples of color modes:
    IM_RGB | IM_ALPHA - RGB color space with an alpha channel, bottom up orientation and separated components
    IM_GRAY | IM_TOPDOWN - gray color space with no alpha channel and top down orientation
    IM_RGB | IM_ALPHA | IM_PACKED - RGB color space with an alpha channel, bottom up orientation and packed components
So these four parameters define our raw image data: width, height, color_mode and data_type. The raw data buffer is always byte aligned and each component is stored sequentially in the buffer following the specified packing.
For example, if an RGB image is 4x4 pixels, it will have the following organization in memory. The spaces are added here only to mark the boundaries between lines 0, 1, 2 and 3; the buffer itself is contiguous.
RRRR RRRR RRRR RRRR GGGG GGGG GGGG GGGG BBBB BBBB BBBB BBBB - for non packed components (lines 0 1 2 3 of each plane)
RGBRGBRGBRGB RGBRGBRGBRGB RGBRGBRGBRGB RGBRGBRGBRGB - for packed components (lines 0 1 2 3)
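The two organizations above can be reproduced with a small sketch that writes, for every position in the buffer, the letter of the component stored there. `layout_string` is a hypothetical helper for this example only.

```c
#include <assert.h>
#include <stddef.h>

/* Fill 'out' with the component letter stored at each buffer position
   of a width x height RGB image. 'out' must hold width*height*3 + 1 chars. */
void layout_string(size_t width, size_t height, int packed, char *out)
{
  static const char names[3] = { 'R', 'G', 'B' };
  const size_t count = width * height;  /* pixels per plane */
  size_t i, c;
  assert(out != NULL);
  for (i = 0; i < count; i++) {
    for (c = 0; c < 3; c++) {
      /* packed: components interleaved per pixel; separated: one plane per component */
      const size_t pos = packed ? i * 3 + c : c * count + i;
      out[pos] = names[c];
    }
  }
  out[count * 3] = '\0';
}
```

Calling it for a 4x4 image with `packed` set to 0 gives 16 R values followed by 16 G and 16 B values, while a nonzero `packed` gives the RGB triple repeated 16 times.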
We could restrict the data organization by eliminating the extra flags, but several users requested these features in the library. So we keep them, but restrict them to raw data buffers.
For the high level image processing functions we created a structure called imImage that eliminates the extra flags and assumes bottom up orientation and separated components. Alpha channel is supported as an extra component.
The imImage structure is defined using four image parameters: width, height, color_space and data_type. It is an open structure in C where you can access all the parameters. In addition to the 4 creation parameters there are many auxiliary parameters like depth, count, line_size, plane_size and size.
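The auxiliary parameters can be derived from the creation parameters. The sketch below shows one plausible derivation under stated assumptions: `line_size` is taken to be the byte size of one line in one plane, `plane_size` the byte size of one plane, `size` the byte size of all planes, and `count` the number of pixels. The `ImageInfo` struct and `image_info` function are hypothetical and are not the library's imImage definition; `type_size` stands for the byte size of the chosen data type.

```c
#include <assert.h>

/* Hypothetical struct for this sketch; it borrows the parameter names
   from the text but is not the library's imImage structure. */
typedef struct {
  int width, height;
  int depth;       /* number of components of the color space */
  int type_size;   /* assumed: bytes per component value */
  int count;       /* assumed: pixels per plane */
  int line_size;   /* assumed: bytes per line of one plane */
  int plane_size;  /* assumed: bytes per plane */
  int size;        /* assumed: bytes for all planes */
} ImageInfo;

ImageInfo image_info(int width, int height, int depth, int type_size)
{
  ImageInfo info;
  assert(width > 0 && height > 0 && depth > 0 && type_size > 0);
  info.width = width;
  info.height = height;
  info.depth = depth;
  info.type_size = type_size;
  info.count = width * height;
  info.line_size = width * type_size;
  info.plane_size = info.line_size * height;
  info.size = info.plane_size * depth;
  return info;
}
```

Under these assumptions, a 4x4 RGB image with one byte per component has 16 bytes per plane and 48 bytes in total.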