Include one or more of the headers <im_process_ana.h>, <im_process_glo.h>, <im_process_loc.h> and <im_process_pon.h>, and link with the "im_process.a/im_process.lib" library.
The processing operations are very simple to use. Usually you just have to call the respective function, but you are responsible for ensuring that the image parameters of the input and output data are correct. Here is an example:
void imProcessFlip(const imImage* src_image, imImage* dst_image);
The processing operations work exclusively on the imImage structure. This keeps the implementation cleaner and makes color images much easier to process, since the planes are separated. But remember that you can always use the imImageInit function to initialize an imImage structure with your own buffer.
The image data of the output image is assumed to be zero before any operation. This is always true after creating a new image, but if you are reusing an image for several operations, use imImageClear to zero the image data between operations.
The complexity of an operation is directly affected by the number of data types it operates on.
If there is only one, then it is as simple as:
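The two caller-side duties described above (matching parameters, and a zeroed destination when an image is reused) can be sketched without the library. The following is only an illustration under stated assumptions: PlanarBytes, PlanesMatch, ClearPlanes and FlipPlanes are hypothetical stand-ins written for this example, not the imImage structure or the IM functions, and the flip shown is a plain vertical row swap.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the few image fields used here (not imImage).
struct PlanarBytes
{
  int width, height, depth;
  std::vector<unsigned char> data;  // depth planes stored one after another

  PlanarBytes(int w, int h, int d)
    : width(w), height(h), depth(d), data((std::size_t)w * h * d, 0) {}
};

// The check the caller is responsible for: both images share the same layout.
bool PlanesMatch(const PlanarBytes& a, const PlanarBytes& b)
{
  return a.width == b.width && a.height == b.height && a.depth == b.depth;
}

// Zero the destination before reusing it for another operation.
void ClearPlanes(PlanarBytes& image)
{
  std::fill(image.data.begin(), image.data.end(), (unsigned char)0);
}

// Vertical flip, plane by plane: source row y becomes destination row (height-1-y).
void FlipPlanes(const PlanarBytes& src, PlanarBytes& dst)
{
  assert(PlanesMatch(src, dst));
  std::size_t plane_size = (std::size_t)src.width * src.height;
  for (int d = 0; d < src.depth; d++)
  {
    for (int y = 0; y < src.height; y++)
    {
      std::size_t src_row = d * plane_size + (std::size_t)y * src.width;
      std::size_t dst_row = d * plane_size + (std::size_t)(src.height - 1 - y) * src.width;
      std::memcpy(&dst.data[dst_row], &src.data[src_row], src.width);
    }
  }
}
```

With the real library, the destination would be an imImage with the same parameters as the source, and imImageClear would take the place of ClearPlanes.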
void DoProc(imbyte* data, int width, int height)
{
  for (int y = 0; y < height; y++)
  {
    for (int x = 0; x < width; x++)
    {
      // Do something
      int offset = y * width + x;
      data[offset] = 0;
    }
  }
}
void SampleProc(imImage* image)
{
  // a loop for all the color planes
  for (int d = 0; d < image->depth; d++)
  {
    // Notice that the same operation may be used to process each color component
    DoProc((imbyte*)image->data[d], image->width, image->height);
  }
}
  
Or, if you want to use templates to support more data types:
template <class T> 
void DoProc2(const T* src_data, T* dst_data, int count)
{
  for (int i = 0; i < count; i++)
  {
    dst_data[i] = src_data[i];

    // or, as a lower-level alternative to the line above (use one or the other):
    // *dst_data++ = *src_data++;
  }
}
// This sample does not depend on the spatial distribution of the data.
// It uses data[0], the pointer from which all the depth planes are addressed.
void SampleProc2(const imImage* src_image, imImage* dst_image)
{
  int total_count = src_image->count * src_image->depth; 
  switch(src_image->data_type)
  {
  case IM_BYTE:
    DoProc2((const imbyte*)src_image->data[0], (imbyte*)dst_image->data[0], total_count);
    break;
  case IM_USHORT:
    DoProc2((const imushort*)src_image->data[0], (imushort*)dst_image->data[0], total_count);
    break;
  case IM_INT:
    DoProc2((const int*)src_image->data[0], (int*)dst_image->data[0], total_count);
    break;
  case IM_FLOAT:
    DoProc2((const float*)src_image->data[0], (float*)dst_image->data[0], total_count);
    break;
  case IM_CFLOAT:
    DoProc2((const imcfloat*)src_image->data[0], (imcfloat*)dst_image->data[0], total_count);
    break;
  }
}
  
The first sample can be implemented in C, but the second cannot; it must be in C++. Check the manual and the source code for the many operations already available.
Adding support for the counter callback to a new operation is very simple. The following code shows how:
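For readers who want to try the template dispatch pattern without the library, here is a self-contained sketch. It rests on assumptions: TypedImage and the SK_* codes are hypothetical stand-ins for the imImage fields and the IM data type constants, and CopyImageData is a name invented for this example. Unlike the sample above, it also checks that source and destination match and reports unsupported types.

```cpp
#include <cassert>

// Hypothetical stand-ins for the IM data type codes.
enum SketchDataType { SK_BYTE, SK_USHORT, SK_INT, SK_FLOAT };

// Hypothetical stand-in for the imImage fields used by the dispatch.
struct TypedImage
{
  int count;      // pixels per plane (width * height)
  int depth;      // number of planes
  int data_type;  // one of SketchDataType
  void* data0;    // start of the contiguous buffer holding all planes
};

// One template covers every data type: copy count values from src to dst.
template <class T>
void CopyData(const T* src_data, T* dst_data, int count)
{
  for (int i = 0; i < count; i++)
    dst_data[i] = src_data[i];
}

// Returns false when the images do not match or the type is not handled.
bool CopyImageData(const TypedImage& src, TypedImage& dst)
{
  if (src.count != dst.count || src.depth != dst.depth || src.data_type != dst.data_type)
    return false;

  assert(src.data0 != 0 && dst.data0 != 0);

  // All planes are contiguous, so one pass over count * depth values is enough.
  int total_count = src.count * src.depth;
  switch (src.data_type)
  {
  case SK_BYTE:
    CopyData((const unsigned char*)src.data0, (unsigned char*)dst.data0, total_count);
    return true;
  case SK_USHORT:
    CopyData((const unsigned short*)src.data0, (unsigned short*)dst.data0, total_count);
    return true;
  case SK_INT:
    CopyData((const int*)src.data0, (int*)dst.data0, total_count);
    return true;
  case SK_FLOAT:
    CopyData((const float*)src.data0, (float*)dst.data0, total_count);
    return true;
  default:
    return false;
  }
}
```

Returning a status from the dispatch function is a design choice of this sketch; it makes a missing case visible to the caller instead of silently leaving the destination untouched.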
int counter = imCounterBegin("Process Test 1");
imCounterTotal(counter, count_steps, "Processing");
for (int i = 0; i < count_steps; i++)
{
  // Do something
  if (!imCounterInc(counter))
    return IM_ERR_COUNTER;
}
imCounterEnd(counter);
  
Every call to imCounterTotal between an imCounterBegin/imCounterEnd pair for the same counter starts a new count at that counter. So one operation can be composed of many sub-operations and still have a single counter to display progress. For example, each call to imFileReadImageData starts a new count for the same counter.
When counting, it is a good idea not to display progress changes that are too small. To accomplish that, have the implementation of the counter callback enforce a minimum delay between one display update and the next.
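One possible way to enforce such a minimum delay is sketched below. ProgressThrottle is a hypothetical helper written for this example and is not part of IM; a counter callback could keep one instance and ask it whether the current update should reach the screen. The current time is passed in as a parameter so the policy can be exercised without actually waiting.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper, not part of IM: decides whether a progress update
// is worth displaying, based on the time elapsed since the last display.
class ProgressThrottle
{
public:
  typedef std::chrono::steady_clock Clock;

  explicit ProgressThrottle(std::chrono::milliseconds min_delay)
    : min_delay_(min_delay), has_shown_(false)
  {
    assert(min_delay.count() >= 0);
  }

  // Returns true when the caller should refresh the display.
  // The first update and the final one (is_last) are always shown.
  bool ShouldDisplay(Clock::time_point now, bool is_last)
  {
    if (!has_shown_ || is_last || now - last_shown_ >= min_delay_)
    {
      last_shown_ = now;
      has_shown_ = true;
      return true;
    }
    return false;
  }

private:
  std::chrono::milliseconds min_delay_;
  Clock::time_point last_shown_;
  bool has_shown_;
};
```

Inside a callback, the typical call would be ShouldDisplay(ProgressThrottle::Clock::now(), finished), where finished is whatever condition the callback uses to detect the end of a count. Always showing the last update avoids leaving the display stuck just short of completion.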
See Utilities / Counter.