Chapter 2. Contributions: How to and guides


GRASP is open so that people can investigate and modify it. You can use the code just to learn how it works, but you can also contribute to the community in many ways: testing, reporting bugs, helping other users, or developing new features and extensions. This chapter explains all of this, along with the main GRASP architecture and its libraries: everything needed to start helping the GRASP team.


2.1. Bug reporting

Reporting a bug is easy thanks to the GitLab web interface, and it is a great way to support the GRASP project. If you notice strange behavior in the code or an error such as "segmentation fault", you can report it to the development team by opening an issue (see section 1.2. Open an issue).

2.2. Documentation

The documentation is part of the code repository, and the success of the project depends in part on its quality. Our objective is therefore complete, high-quality documentation that is easy to read and understand. To achieve this, the documentation is split into user documentation and technical documentation, so that topics are approached from different points of view.

2.2.1. User documentation

The user documentation is published at www.grasp-open.com/doc. Its source is in doc/overview/ and changes are published on the web automatically. The documentation is written in DocBook, which makes it possible to generate both web (HTML) and PDF output.

To edit the documentation, the developer (or reviewer in this case) has to edit the chapter files doc/overview/src/en/chap0*.dbl. The result can be previewed after compiling the documentation, which is easy once the prerequisites are installed:

  • docbook
  • an XSLT processor (xsltproc by default, or alternate tools Saxon or Xalan)
  • a Formatting Objects processor (currently Apache fop).
  • The Apache Xerces XML parser (for use with Saxon or Xalan)

Then, from the doc/overview folder, you just need to type make. To see the results, open doc/overview/website/index.html or doc/overview/website/docgrasp.pdf. Once the changes are done, run make clean before committing anything. Changes can be committed to a fork and then submitted for merging into the original project via a "pull request". These actions are explained in section 1.3. Forking, contributing and pulling request.

2.2.2. Technical documentation

The documentation you are reading right now is the technical documentation. Its raw files are in the doc/technical folder of the GRASP repository, and changes are published automatically at www.grasp-open.com/tech-doc. This documentation is generated with Doxygen and complemented with Markdown, a popular lightweight markup language with plain-text formatting syntax.

The technical documentation is written in the code files themselves, which helps keep it up to date with the code interfaces. Additionally, some introductory chapters are written in Markdown. With Doxygen installed on the system, the documentation can be compiled by running make in the doc/technical folder. Changes can be committed to a fork and then submitted for merging into the original project via a "pull request". These actions are explained in section 1.3. Forking, contributing and pulling request.

2.3. Developing

The code is organized in modules that are easy to identify in the source code because each one has its own folder under src/. These modules are:

  • Retrieval: A scientific library that can also work as a stand-alone application (advanced mode is not recommended).
  • Input: Responsible for preparing the input data and injecting it into the retrieval algorithm.
  • Output: Retrieves the results from the algorithm and prepares the output for the user. Data can be dumped to the screen or to files in different formats (such as ASCII, HDF, NetCDF...).
  • Settings: Defines the user interface, which determines how the software is going to work. Settings are provided as a YAML file that sets up the behavior of the process.
  • Controller: The starting point of the code. It connects all the other modules.
  • Global: General module that can be used by the rest of the modules. It contains general-purpose code such as a parameters library, which can be used in the input or settings modules to work with an initial guess, or in the output module to work with an array of result parameters.

A general overview of the GRASP architecture is available in chapter three of the overview documentation. It is the first recommended reference before starting to work with the code.

The following diagram gives a clear picture of how the retrieval library is called and of the general workflow:

GRASP_call_flow_sequence_diagram.png
Sequence diagram of GRASP control unit

The MPI build of the code follows a similar workflow, but the master node performs all the work except the inversion, which is done by the workers. In other words, the loop in the previous diagram is parallelized.

2.3.1. Global libraries

2.3.1.1 Output streams

INTRODUCTION

The user guide for output streams is in www.grasp-open.com/doc/apc.php

Output streams make it easy to redirect the output of GRASP to a file or to /dev/null. They are well integrated with the settings module, which allows a powerful way of defining the filename. The basic idea behind this library is to provide a flexible way to name output files. If the output generated by a segment is named "output.txt", it will be overwritten by every segment, and only the information from the last segment is kept. Using wildcards, which are expanded dynamically, you can store all the output in an organized way. Implementation details are explained in the following sections.
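To make the idea of dynamically expanded wildcards concrete, here is a minimal, self-contained sketch of how a `{token}` pattern could be turned into a filename. It is not the GRASP implementation: the functions `expand_pattern` and `resolve_segment_tokens`, the resolver callback type and the `segment` token are all hypothetical names chosen for this illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical resolver: writes the value of `token` into `out` and
 * returns 0, or returns -1 if the token is unknown. */
typedef int (*token_resolver)(const char *token, char *out, size_t out_size, void *ctx);

/* Expand every {token} found in `pattern` into `out` (capacity `out_size`).
 * Returns 0 on success, -1 on unknown token, unbalanced brace or overflow. */
int expand_pattern(const char *pattern, token_resolver resolve, void *ctx,
                   char *out, size_t out_size)
{
    size_t used = 0;
    const char *p = pattern;

    if (out_size == 0) return -1;
    while (*p != '\0') {
        if (*p == '{') {
            const char *end = strchr(p, '}');
            char token[64];
            char value[128];
            size_t len;

            if (end == NULL) return -1;          /* unbalanced brace */
            len = (size_t)(end - p - 1);
            if (len >= sizeof token) return -1;  /* token name too long */
            memcpy(token, p + 1, len);
            token[len] = '\0';
            if (resolve(token, value, sizeof value, ctx) != 0) return -1;
            len = strlen(value);
            if (used + len >= out_size) return -1;
            memcpy(out + used, value, len);
            used += len;
            p = end + 1;
        } else {
            if (used + 1 >= out_size) return -1;
            out[used++] = *p++;
        }
    }
    out[used] = '\0';
    return 0;
}

/* Example resolver (hypothetical token): {segment} becomes the index of
 * the segment being written, passed through `ctx`. */
int resolve_segment_tokens(const char *token, char *out, size_t out_size, void *ctx)
{
    const int *isegment = (const int *)ctx;
    int n;

    if (strcmp(token, "segment") != 0) return -1;
    n = snprintf(out, out_size, "%d", *isegment);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

With such a scheme, the pattern "output_{segment}.txt" yields a different filename for each segment, so no segment overwrites the output of another.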

USAGE EXAMPLE

// Stream declaration
// Stream initialization
grasp_output_stream_initialize("data_{tile_from}_{tile_to}.txt", &stream);
// Open stream (You have to provide the known information)
grasp_output_stream_open(stream, settings, segment, NULL, &tile_description->dimensions, -1, -1, -1);
// Stream usage
gos_fprintf(stream," message=%d", value);
// Closing stream
// Deallocating memory

If you call the initialize function at the beginning, you have to call the destroy function at the end. You may not need initialize at all if you only want a pointer to the files generated by the stream library.

FILE *f;
f = grasp_output_stream_open(stream, settings, segment, output, &tile_description->dimensions, icol, irow, itime);
if (grasp_output_stream_writable(stream) == false) { // If it is not writable, close stream and finish function
    return 0;
}

HOW TO ADD A NEW WILDCARD

There is a list of available wildcards that can be used in output streams. In principle this list should cover all user requirements, so adding more wildcards should not be necessary. However, if you need a special wildcard that is not implemented, please follow this guide.

  • 1. New wildcard definition

In the file src/output/grasp_output_stream.h you have to declare a function that returns the value of the wildcard. As an example, take this function, which returns the version of the compilation:

/**
 * @param token Token to be translated: version
 * @param format Format of the token read from pattern: N. It has to be an integer number which represents the number of 0s at left
 * @param settings Current settings
 * @param segment Input segment
 * @param output Output current segment
 * @param tile_description Tile description (dimensions)
 * @param icol Number of current column of the segment. If it is not known you have to use -1
 * @param irow Number of current row of the segment
 * @param itime Number of current time of the segment
 * @return the wildcard translated
 */
char *grasp_output_stream_filename_generate_by_version(const char *token, const char *format, const grasp_settings *settings, const grasp_segment_t *segment, const output_segment_general *output, const grasp_tile_dimensions_t *tile_description, int icol, int irow, int itime);

  • 2. Wildcard implementation

In the file src/output/grasp_output_stream.c you have to implement the function declared in step 1. Example:

char *grasp_output_stream_filename_generate_by_version(const char *token, const char *format, const grasp_settings *settings, const grasp_segment_t *segment, const output_segment_general *output, const grasp_tile_dimensions_t *tile_description, int icol, int irow, int itime){
    char *token_result;

    token_result = (char *) trackmem_malloc(sizeof (char)* _GBL_FILE_PATH_LEN);
    assert( token_result!= NULL);
    strcpy(token_result, "");
    if (strcmp(token, "version") != 0) {
        return token_result;
    }
    strcpy(token_result,GRASP_VERSION);
    return token_result;
}

As you can see, this function returns a newly allocated string (using the trackmem library) that contains the selected value. This example is simple because it does not use any other information. You can look through the file to see other examples that use settings or segment information.
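For wildcards that do use the format argument, the following sketch shows one way a numeric value could be turned into an allocated string. It is only an illustration under stated assumptions: `format_index_token` is a hypothetical helper that is not part of GRASP, it uses plain malloc instead of the trackmem library so that it is self-contained, and it reads the format as a minimum field width padded with zeros, which is one possible reading of "number of 0s at left".

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not part of GRASP: formats `value` as a newly
 * allocated string. `format` is read as a minimum field width padded
 * with zeros (assumption); NULL or "" means no padding.
 * The caller owns the result and must free() it. */
char *format_index_token(int value, const char *format)
{
    int width = 0;
    int needed;
    char *result;

    if (format != NULL && format[0] != '\0') {
        width = atoi(format);
        if (width < 0) width = 0;   /* ignore negative widths */
    }
    /* First call only measures the length of the formatted text */
    needed = snprintf(NULL, 0, "%0*d", width, value);
    if (needed < 0) return NULL;
    result = malloc((size_t)needed + 1);
    if (result == NULL) return NULL;
    snprintf(result, (size_t)needed + 1, "%0*d", width, value);
    return result;
}
```

Measuring the length before allocating avoids relying on a fixed maximum size, at the cost of formatting the value twice.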

  • 3. Add the new wildcard to the dictionary

At the beginning of XXX file you will find this structure definition:

You can add your new wildcard just by adding the name of the wildcard (the string which will be used to call it) and the name of the function to be used.
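A dictionary of this kind is commonly written as a static table that pairs each wildcard name with a function pointer. The sketch below only illustrates that pattern and is not the GRASP structure: the types `wildcard_fn` and `wildcard_entry`, the table `wildcard_dictionary`, the `project` wildcard and the simplified generator signature (no arguments) are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified signature: the real generators take more arguments. */
typedef const char *(*wildcard_fn)(void);

typedef struct {
    const char *name;      /* text written between braces in the pattern */
    wildcard_fn generate;  /* function that produces the replacement text */
} wildcard_entry;

static const char *wildcard_version(void) { return "1.0.0"; }  /* placeholder value */
static const char *wildcard_project(void) { return "grasp"; }  /* placeholder value */

static const wildcard_entry wildcard_dictionary[] = {
    { "version", wildcard_version },
    { "project", wildcard_project },  /* adding a wildcard means adding one row */
};

/* Return the generator registered under `name`, or NULL if there is none. */
wildcard_fn wildcard_lookup(const char *name)
{
    size_t i;
    size_t n = sizeof wildcard_dictionary / sizeof wildcard_dictionary[0];

    for (i = 0; i < n; i++) {
        if (strcmp(wildcard_dictionary[i].name, name) == 0) {
            return wildcard_dictionary[i].generate;
        }
    }
    return NULL;
}
```

The advantage of a table is that the code that expands patterns never changes when a wildcard is added: only the table grows.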

  • 4. Improve documentation

Remember to add information about the new wildcard to the user documentation. Edit the file doc/overview/src/en/appc.dbk and add an explanation of the new wildcard to the list of available wildcards.

2.3.2. Extensions

GRASP code can be extended with a kind of "plug-in", although extensions have to be selected at compilation time. GRASP is highly optimized software, and extensions cause no performance loss because the user chooses which ones to install. In a production environment, the fewer extensions installed the better, since they take up memory even if they are not used.

Many types of extensions can be developed. They are organized as input or output extensions depending on their target. The following sections explain each type of extension, its goal and how to implement it. Manual implementation is not recommended, since the grasp-manager script provides easy commands to create the initial scaffold. Nevertheless, knowing all the details helps developers understand how to create an extension.

If you want to create initial code for an extension you can use the grasp-manager with the following command:

./grasp-manager.sh create-extension [extension-type] [extension-name] [extension-url]

2.3.2.1. Input

Input extensions extend the input functionality of the GRASP code. They can be drivers or transformers. A driver is a method for reading raw data directly and injecting it into the GRASP core without intermediate files. The whole process is performed in memory, which is crucial in satellite data processing because of the huge databases involved. Input transformers modify input data after it has been read. They are a way to share data preprocessing between different drivers. For example, a transformer can modify the initial guess for each pixel or read a climatology database.

These extensions work on the SDATA structure, setting the information there. You can see the SDATA structure from the user's point of view in the user documentation (The SDATA format). To set data in that structure, the grasp_input_segment.h library should be used. The following scheme represents the SDATA structure:

sdata.png
Sensor input data structure

2.3.2.1.1. Drivers

A driver is a method for reading custom data and injecting it directly into the GRASP retrieval algorithm without any intermediate file. The general approach is to start by developing a data-reading library, implemented in any language, and then bind it with C, passing the data to a fixed structure called SDATA. This methodology makes it possible to implement drivers in any language and then bind them to the framework code. Drivers are placed in src/input/drivers/{DRIVER_NAME} and are compiled automatically by the framework makefiles. Drivers are an optimal way to feed data into the retrieval algorithm, but they are also a good way to organize the scripts that prepare raw data and to reuse code.

How to develop a driver

  • 1. Prepare the framework

You need to create a folder in src/input/drivers/{DRIVER_NAME}

  • 2. Write a read library

Place the library that reads your data in the driver folder. Your library can provide its own Makefile, whose default rule will be called by the framework compiler. You need to create a file named object_list that lists all the object files your library needs. These object files will be added to the grasp_input library. The object_list file must end with a newline. If your reader library is written in C and can be compiled with the default rules, you don't need to create a Makefile: the object list is enough.

  • 3. Develop a bridge between the reader library and the GRASP retrieval algorithm (create bridge files)

You need to create files for the driver bridge code and for the driver settings. At this point some names are fixed and the compiler will look for them. You have to create a file named grasp_input_driver_{DRIVER_NAME}.h that must declare the following interfaces:

// Main function that returns the driver object with the driver main functions. This function will return a driver structure which contains pointers to the driver functions to initialize, get segment information and close it. This function will be described also in this file. The name of this function is fixed and the framework will find it automatically
grasp_input_driver_t grasp_input_driver_{DRIVER_NAME}();
// Initializing driver function. The name of this function is not fixed but it is recommended to use this pattern. This function will initialize the driver data. If the driver is going to read all the data in one step, this action can be performed here. If the driver wants to preload data for different areas, this function can't load the data and it will delegate this action to get_segment function. You can read the documentation about preloader to get more information about it. In this function input_information->input_files will contain the information of the input files to the driver. Mandatorily, this function will provide input_information->dimensions. input_information->used_files can be set at this point or segment by segment. At the end of the process, this information should be available.
int grasp_input_driver_{DRIVER_NAME}_init(grasp_settings *settings, grasp_tile_description_t *input_information);
// This function will be the translator between the reader library and the framework. It will get data read from raw file, and set in the correct place in sdata structure. If preload functionality is used, this function will also call the read library.
int grasp_input_driver_{DRIVER_NAME}_get_segment(grasp_settings *settings, grasp_segment_t *segment, int col, int row, int itime);
// This function has to deallocate all memory taken by the driver.
int grasp_input_driver_{DRIVER_NAME}_close(void);
// Function with fixed name. The framework will call it to retrieve the driver settings and merge them into the general settings of the system.
grasp_settings_parameter_array *grasp_input_driver_settings_{DRIVER_NAME}(grasp_settings *settings);
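The interface above amounts to a "driver object": a structure of function pointers returned by a factory function with a fixed name. The following self-contained sketch illustrates that pattern with reduced, hypothetical stand-ins (`example_driver_t`, `example_driver`, `run_driver`). The real framework types take more arguments, and the loop order used by the caller here is an assumption made only for the example.

```c
#include <stddef.h>

/* Hypothetical, reduced stand-in for the framework driver type. */
typedef struct {
    int (*init)(void);
    int (*get_segment)(int col, int row, int itime);
    int (*close)(void);
} example_driver_t;

static int example_init(void) { return 0; }

static int example_get_segment(int col, int row, int itime)
{
    (void)col; (void)row; (void)itime;
    return 2;   /* pretend that every segment holds 2 pixels */
}

static int example_close(void) { return 0; }

/* Factory playing the role of grasp_input_driver_{DRIVER_NAME}(). */
example_driver_t example_driver(void)
{
    example_driver_t x;

    x.init = example_init;
    x.get_segment = example_get_segment;
    x.close = example_close;
    return x;
}

/* How a caller could use any such object without knowing which driver
 * it is. Returns the total number of pixels read, or -1 on error. */
int run_driver(example_driver_t driver, int ncols, int nrows, int ntimes)
{
    int col, row, itime, total = 0;

    if (driver.init() != 0) return -1;
    for (itime = 0; itime < ntimes; itime++) {
        for (row = 0; row < nrows; row++) {
            for (col = 0; col < ncols; col++) {
                int n = driver.get_segment(col, row, itime);
                if (n < 0) { driver.close(); return -1; }
                total += n;
            }
        }
    }
    if (driver.close() != 0) return -1;
    return total;
}
```

Because the caller only sees the structure, a new driver can be added by providing another factory function, without touching the calling code.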

The file grasp_input_driver_{DRIVER_NAME}.c will implement these functions. The following code shows their general structure:

// Data read from the instrument, stored globally
records_t records;

grasp_input_driver_t grasp_input_driver_{DRIVER_NAME}(){
    grasp_input_driver_t x;

    x.init=grasp_input_driver_{DRIVER_NAME}_init;
    x.get_segment=grasp_input_driver_{DRIVER_NAME}_get_segment;
    x.close=grasp_input_driver_{DRIVER_NAME}_close;
    return x;
}
int grasp_input_driver_{DRIVER_NAME}_get_segment(grasp_settings *settings, grasp_segment_t *segment, int col, int row, int itime){
    // sdata stands for the SDATA structure of the segment being filled (see grasp_input_segment.h)
    int irecord, iwl, ip, ivalidmeas;
    int ipixel=0;

    // Setting SDATA header
    sdata->NX=?;
    sdata->NY=?;
    sdata->NT=?;
    // Loop over the records that will be set in this segment
    for (irecord = FIRST_RECORD_FOR(col,row,itime) ; irecord < LAST_RECORD_FOR(col,row,itime) ; irecord++){
        sdata->pixel[ipixel].ix=?;
        sdata->pixel[ipixel].iy=?;
        sdata->pixel[ipixel].it=?;
        //...
        // Loop over wavelengths...
        for(iwl=0;iwl<8;iwl++){
            //sdata->pixel[ipixel].meas[sdata->pixel[ipixel].nwl].ind_wl=?;
            //...
            // Loop over measurement types...
            for(ip=0; ip < sdata->pixel[ipixel].meas[sdata->pixel[ipixel].nwl].nip ;ip++){
                sdata->pixel[ipixel].meas[sdata->pixel[ipixel].nwl].meas_type[ip]=?;
                //...
                // Loop over directions
                for(ivalidmeas=0;ivalidmeas<sdata->pixel[ipixel].meas[sdata->pixel[ipixel].nwl].nbvm[ip] ;ivalidmeas++){
                    sdata->pixel[ipixel].meas[sdata->pixel[ipixel].nwl].thetav[ip][ivalidmeas]=?;
                    //...
                }
            }
        }
        ipixel++;
    }
    sdata->npixels=ipixel;
    return ipixel;
}
int grasp_input_driver_{DRIVER_NAME}_close(void){
    free(records);
    return 0;
}
int grasp_input_driver_{DRIVER_NAME}_init(grasp_settings *settings, grasp_tile_description_t *input_information){
    int i, nrecords;

    // Some validations...
    // Example of validation
    if(input_information->ninput_files!=1){
        printf("ERROR: {DRIVER_NAME} driver only supports 1 input file for now\n");
        exit(-1);
    }
    // Read data calling your reader library
    nrecords = read_records(input_information->input_files[0], &records, args...);
    if (nrecords < 0) {
        perror(input_information->input_files[0]);
        exit(-1);
    }
    // Set mandatory information
    // Number of segments that will be processed
    input_information->dimensions.segment_ncols=?;
    input_information->dimensions.segment_nrows=?;
    input_information->dimensions.segment_ntimes=?;
    // Number of pixels that will be processed
    input_information->dimensions.npixel_estimated=?;
    input_information->dimensions.tile_nt=?;
    input_information->dimensions.tile_nx=?;
    input_information->dimensions.tile_ny=?;
    // Used files (optional: you can set them here or in the get_segment function)
    input_information->nused_files=?;
    input_information->used_files=(char **)malloc(sizeof(char *)*input_information->nused_files);
    for(i=0;i<input_information->nused_files;i++){
        // used_file[] stands for the list of file names known to your reader library
        input_information->used_files[i]=(char *)malloc(strlen(used_file[i])+1);
        strcpy(input_information->used_files[i],used_file[i]);
    }
    // Return iterator
    return 0;
}
grasp_settings_parameter_array *grasp_input_driver_settings_{DRIVER_NAME}(grasp_settings *settings){
    // If you don't need extra parameters, skip this code and copy the next example instead.
    // If you need extra parameters, you need to read the documentation about how to define them. An example of how to do it:
    int i;
    grasp_settings_parameter_array *result;

    // Static definition of a dictionary (we recommend defining it statically inside the function and copying it afterwards)
    yamlsettings_parameter parameters[]= {
        // Settings definition. See documentation.
        // The parameter name will be placed inside the extension parameters block: input.driver_settings.{DRIVER_NAME}.{parameter_name}
    };

    result = (grasp_settings_parameter_array *) malloc(sizeof (grasp_settings_parameter_array));
    result->nparameters=sizeof(parameters)/sizeof(yamlsettings_parameter);
    result->parameters = (yamlsettings_parameter *) malloc(sizeof (yamlsettings_parameter)*result->nparameters);
    for(i=0;i<result->nparameters;i++){
        grasp_settings_copy_parameter(&(parameters[i]),&(result->parameters[i]));
    }
    return result;
}

The last function in the code above needs somewhere to store the settings. To define this storage you have to create a file called grasp_input_driver_settings_{DRIVER_NAME}.h. The settings specified there will be available in the settings structure. Following our example:

typedef struct grasp_input_driver_settings_{DRIVER_NAME}_t_ {
    // An example of integer value
    int value;
} grasp_input_driver_settings_{DRIVER_NAME}_t;

You can access this value in:

settings->input.driver.{DRIVER_NAME}.value

If you don't want to define extra parameters, you don't need to add the file grasp_input_driver_settings_{DRIVER_NAME}.h, and the grasp_input_driver_settings_{DRIVER_NAME} function will look like this:

grasp_settings_parameter_array *grasp_input_driver_settings_{DRIVER_NAME}(grasp_settings *settings){
    // If your driver does not need extra parameters you can return NULL
    return NULL;
}

  • 4. Create an object list

You have to create a file called object_list and place it in the driver folder. This file has to list all the object files that the framework will use when the driver is chosen to run. The file must end with a newline. An example of this file:

drivers/{DRIVER_NAME}/grasp_input_driver_{DRIVER_NAME}.o
drivers/{DRIVER_NAME}/aeronet_{DRIVER_NAME}.o

  • 5. Does your driver need extra resources?

Some drivers may need extra resources, such as a climatology of gases. First of all, we recommend checking the resources folder, because the resources you need may already be available. If your resources are instrument dependent, you can create a resources folder inside your driver directory. These resources will be copied at the same time as the other resources used by the framework, with the Makefile rule:

make install_resources

You can then reference them in your code using an absolute path, so that it does not matter from which folder you execute the code. To build that path, concatenate the parameter settings->global.resources_path, which holds the absolute path to the resources folder, with "/drivers/{DRIVER_NAME}/{RESOURCE_NAME}/{RESOURCE_FILE}".
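The concatenation described above can be sketched with snprintf, which also detects a destination buffer that is too small. The function name `build_resource_path` and its parameters are hypothetical. In a real driver the first argument would be settings->global.resources_path.

```c
#include <stdio.h>

/* Build "<resources_path>/drivers/<driver>/<resource>/<file>" into `out`.
 * Returns 0 on success, -1 if the result does not fit in `out_size` bytes. */
int build_resource_path(char *out, size_t out_size,
                        const char *resources_path, const char *driver_name,
                        const char *resource_name, const char *resource_file)
{
    int n = snprintf(out, out_size, "%s/drivers/%s/%s/%s",
                     resources_path, driver_name, resource_name, resource_file);

    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

Checking the return value matters here: a silently truncated path would point to a file that does not exist.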

  • 6. Does your driver have extra tools?

Finally, if your driver has extra tools (for example, a tool for easily downloading raw data), you can create a bin folder inside the driver directory. The executables placed there will be copied to the bin folder of the framework and installed in the system if the user runs the "install" rule of the makefile.

2.3.2.1.2. Transformers

[ Section in process ]

2.3.2.2. Output

Output extensions are functions that are called to print the output. There are three types of output functions, depending on their arguments and on the moment of the process at which they are called. To work with the output, it is important to know the output structures and the functions used to access the data. The segment output structure follows this scheme:

segment_output.png
Segment output structure

To access that structure, however, it is highly recommended to use the grasp_output_segment_result.h library.

Output extensions that work at tile level (tile and current output functions) receive a grasp_results_t structure as an argument. The following diagram shows its content:

tile_output.png
Tile output structure

This structure contains the results for all pixels. Dashed lines represent dynamic memory. Only the products requested by the user in the settings file are allocated. The array fields are also dynamically allocated, which makes the data more difficult to access because multi-dimensional arrays cannot be indexed with the bracket syntax ([x][y] cannot be used). For this reason, you can use the grasp output tile library, which simplifies retrieving the results. You can find the API to access the results in grasp_output_segment_result.h

Note: a common way to access a pixel in the grasp_output_segment_result.h library is by using the t, x and y indexes. If the developer wants to access a pixel by its index (as in a linear array), this can be done by passing 0, 0 and ipixel as arguments. Example:

int ipixel;
for (ipixel = 0; ipixel < results->information.tile_npixels; ipixel++) {
    if (grasp_output_tile_is_pixel(results, 0, 0, ipixel)) {
        printf("latitude of pixel %d: %f", ipixel, grasp_output_tile_pixel_information_latitude(results, 0, 0, ipixel));
    }
}
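The reason the bracket syntax cannot be used on dynamically allocated multi-dimensional data is that the values live in one flat block of memory, so the offset has to be computed by hand. The sketch below shows that computation for a generic t, x, y block. The helper names `flat_index` and `read_value` are hypothetical, and the row-major order (t slowest, y fastest) is an assumption of this example, not a statement about the internal layout of grasp_results_t.

```c
#include <stddef.h>

/* Offset of element (it, ix, iy) in a flat block of nt*nx*ny values.
 * Row-major order (it slowest, iy fastest) is an assumption of this sketch. */
size_t flat_index(size_t it, size_t ix, size_t iy, size_t nx, size_t ny)
{
    return (it * nx + ix) * ny + iy;
}

/* Read one value from a dynamically allocated nt*nx*ny block. */
float read_value(const float *data, size_t it, size_t ix, size_t iy,
                 size_t nx, size_t ny)
{
    return data[flat_index(it, ix, iy, nx, ny)];
}
```

Under this layout, passing 0, 0 and a linear index as the three coordinates returns that same linear index, which is one way a library can accept both addressing styles with a single function.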

2.3.2.2.1. Segment functions

Segment output functions are called after the process of each segment.

2.3.2.2.2. Tile functions

[ Section in process ]

2.3.2.2.3. Current functions

[ Section in process ]