Chapter 2. Contributions: How to and guides


GRASP is open because its developers want people to dive into the code. You can do so simply to learn how it works, but you can also contribute to the community in many ways: testing, reporting bugs, helping other users, or developing new features and extensions. This chapter explains all of this, along with the main GRASP architecture and libraries: everything necessary to start helping the GRASP team.


2.1. Bug reporting

Reporting a bug is an easy task thanks to the GitLab web interface, and it is a great way to support the GRASP project. If you identify strange behavior in the code or an error such as a "segmentation fault", you can report it to the development team by opening an issue (see section 1.2. Open an issue).

2.2. Documentation

The documentation is also part of the code repository. The success of the project also depends on the quality of the documentation, which is why the objective is very good documentation: complete and easy to read and understand. To achieve this goal, the documentation is split into user documentation and technical documentation. The idea is to approach the topics from different points of view.

2.2.1. User documentation

The user documentation is published on the web at www.grasp-open.com/doc . Its source is in doc/overview/ and changes are automatically published on the web. The documentation is written with DocBook, which helps to deploy the results in web (HTML) and PDF formats.

To edit the documentation, the developer (or reviewer in this case) has to edit the chapters, which are in the files doc/overview/src/en/chap0*.dbl. The results can be previewed after compiling the documentation. Compiling it is easy if the prerequisites are installed:

  • docbook
  • an XSLT processor (xsltproc by default, or alternate tools Saxon or Xalan)
  • a Formatting Objects processor (currently Apache fop).
  • The Apache Xerces XML parser (for use with Saxon or Xalan)

Then, from the doc/overview folder, just type make. To see the results, open doc/overview/website/index.html or doc/overview/website/docgrasp.pdf. Once the changes are done, run make clean before committing anything. Changes can be committed to a fork and then proposed for merging into the original project via a "pull request". These actions are explained in section 1.3. Forking, contributing and pulling request.

2.2.2. Technical documentation

The documentation you are reading right now is the technical documentation. Its raw files are in the doc/technical folder of the GRASP repository. Changes are automatically published on the web at www.grasp-open.com/tech-doc . This documentation is generated with Doxygen and complemented with markdown, a popular lightweight markup language with plain-text formatting syntax.

The technical documentation is written in the code files themselves, which helps to keep it up to date with the code interfaces. Additionally, some introductory chapters are written in markdown. The source can be compiled with the make command in the doc/technical folder, provided doxygen is installed in the system. Changes can be committed to a fork and then proposed for merging into the original project via a "pull request". These actions are explained in section 1.3. Forking, contributing and pulling request.

2.3. Developing

The code is organized in modules that are easy to identify in the source code because each one has its own folder under src/. These modules are:

  • Retrieval: The scientific library, which can also work as a stand-alone application (an advanced mode that is not recommended).
  • Input: Responsible for preparing the input data and injecting it into the retrieval algorithm.
  • Output: Retrieves the results from the algorithm and prepares the output for the user. Data can be dumped to the screen or to files in different formats (such as ASCII, HDF, NetCDF...).
  • Settings: Defines the user interface, which determines how the software is going to work. Settings are provided as a YAML file which sets up the behavior of the process.
  • Controller: The starting point of the code; it makes the connections between all the different modules.
  • Global: General module which can be used by the rest of the modules. It holds general things such as the parameters library, which can be used in the input or settings modules to work with the initial guess, or in the output module to work with the array of result parameters.

A general overview of the GRASP architecture is given in chapter three of the overview documentation, which is the first reference to read before starting to work with the code.

The following diagram gives a clear picture of how the retrieval library is called and of the general workflow:

GRASP_call_flow_sequence_diagram.png
Sequence diagram of GRASP control unit

The MPI compilation of the code follows a similar workflow, but the master node performs all the work except the inversion, which is done by the workers. So, the loop of the previous diagram is parallelized.

2.3.1. Global libraries

2.3.1.1. Output streams

INTRODUCTION

The user guide for output streams is in www.grasp-open.com/doc/apc.php

Output streams allow the output from GRASP to be redirected easily to a file or to /dev/null. They are well integrated with the settings module, which offers a powerful way to define the filename. The basic idea behind this library is to provide a flexible way to name the output files. If the output generated by a segment is called "output.txt", it will be overwritten by every segment, keeping only the information of the last one. Using wildcards, which are dynamically generated, you can store all the output information in an organized way. Implementation details are explained in the following sections.
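The wildcard idea can be illustrated with a minimal sketch. It is not the GRASP implementation: expand_pattern, wildcard_lookup_fn and demo_lookup are hypothetical names, and the values returned for tile_from and tile_to are made up. The sketch only shows how a pattern such as "data_{tile_from}_{tile_to}.txt" could be turned into a distinct filename.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical lookup: returns the text for a wildcard name, or NULL if unknown. */
typedef const char *(*wildcard_lookup_fn)(const char *name);

/* Expands every {name} found in pattern into out (always NUL-terminated).
 * Unknown wildcards are copied verbatim.
 * Returns 0 on success, -1 if the result does not fit in out_size bytes. */
int expand_pattern(const char *pattern, wildcard_lookup_fn lookup,
                   char *out, size_t out_size)
{
    size_t used = 0;
    const char *p = pattern;

    assert(pattern != NULL && lookup != NULL && out != NULL);
    if (out_size == 0) return -1;
    out[0] = '\0';

    while (*p != '\0') {
        const char *end = (*p == '{') ? strchr(p, '}') : NULL;
        const char *value = NULL;
        const char *src;
        char name[64];
        size_t len;

        if (end != NULL && (size_t)(end - p - 1) < sizeof name) {
            size_t nlen = (size_t)(end - p - 1);
            memcpy(name, p + 1, nlen);      /* text between the braces */
            name[nlen] = '\0';
            value = lookup(name);
        }
        if (value != NULL) {                /* known wildcard: copy its value */
            src = value;
            len = strlen(value);
            p = end + 1;
        } else {                            /* plain character: copy it as is */
            src = p;
            len = 1;
            p++;
        }
        if (used + len >= out_size) return -1;
        memcpy(out + used, src, len);
        used += len;
        out[used] = '\0';
    }
    return 0;
}

/* Example lookup with fixed, made-up values, for illustration only. */
const char *demo_lookup(const char *name)
{
    if (strcmp(name, "tile_from") == 0) return "20080101";
    if (strcmp(name, "tile_to") == 0) return "20080131";
    return NULL;
}
```

With this lookup, the pattern "data_{tile_from}_{tile_to}.txt" expands to "data_20080101_20080131.txt", so each tile would write to its own file instead of overwriting a shared one.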

USAGE EXAMPLE

// Stream declaration
// Stream initialization
grasp_output_stream_initialize("data_{tile_from}_{tile_to}.txt", &stream);
// Open stream (You have to provide the known information)
grasp_output_stream_open(stream, settings, segment, NULL, &tile_description->dimensions, -1, -1, -1);
// Stream usage
gos_fprintf(stream," message=%d", value);
// Closing stream
// Deallocating memory

If you call the initialize function at the beginning, you have to call the destroy function at the end. You may not need to call initialize if you only want pointers to the files generated by the stream library:

FILE *f;
f=grasp_output_stream_open(stream, settings, segment, output, &tile_description->dimensions, icol, irow, itime);
if(grasp_output_stream_writable(stream)==false){ // If it is not writable, close stream and finish function
    return 0;
}

HOW TO ADD A NEW WILDCARD

A list of wildcards is available for use in output streams. In principle, this list should cover all user requirements and there is no need to add more. But if you need a special wildcard that is not implemented, follow this guide.

The segment definition is the retrieval (Fortran code) segment definition, bound to C, plus extra information added by the framework to this structure (iguess is in the same structure). In addition, this file contains the function to set sdata information with limit checking.

  • 1. New wildcard definition

In the file src/output/grasp_output_stream.h you have to declare a function that returns the translated wildcard. You can follow this example, a function that returns the version of the compilation:

/**
 * @param token Token to be translated: version
 * @param format Format of the token read from pattern: N. It has to be an integer number which represents the number of 0s at the left
 * @param settings Current settings
 * @param segment Input segment
 * @param output Output current segment
 * @param tile_description Tile description (dimensions)
 * @param icol Number of current column of the segment. If it is not known you have to use -1
 * @param irow Number of current row of the segment
 * @param itime Number of current time of the segment
 * @return the wildcard translated
 */
char *grasp_output_stream_filename_generate_by_version(const char *token, const char *format, const grasp_settings *settings, const grasp_segment_t *segment, const output_segment_general *output, const grasp_tile_dimensions_t *tile_description, int icol, int irow, int itime);
  • 2. Wildcard implementation

In the file src/output/grasp_output_stream.c you have to implement the function declared in point 1. Example:

char *grasp_output_stream_filename_generate_by_version(const char *token, const char *format, const grasp_settings *settings, const grasp_segment_t *segment, const output_segment_general *output, const grasp_tile_dimensions_t *tile_description, int icol, int irow, int itime){
    char *token_result;

    token_result = (char *) trackmem_malloc(sizeof (char)* _GBL_FILE_PATH_LEN);
    assert( token_result!= NULL);
    strcpy(token_result, "");

    // Any other token is not handled here: return the empty string
    if (strcmp(token, "version") != 0) {
        return token_result;
    }

    strcpy(token_result,GRASP_VERSION);

    return token_result;
}

As you can see, this function returns an allocated string (using the trackmem library) containing the selected text. The current example is simple since it does not use any other information. You can dive into the file and see other examples which use settings or segment information.
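A wildcard that uses its format argument could look like the following minimal sketch. It is an illustration under stated assumptions, not code from the repository: demo_generate_by_row, the "row" token and DEMO_PATH_LEN are hypothetical, the argument list is reduced to what the example needs, and plain malloc is used in place of the trackmem library.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_PATH_LEN 256  /* hypothetical stand-in for _GBL_FILE_PATH_LEN */

/* Returns a newly allocated string for the hypothetical wildcard "row".
 * format is the zero-padding width written as text ("3" turns 7 into "007");
 * NULL or "" means no padding. A negative irow means "not known" and gives
 * an empty string. The caller frees the result. */
char *demo_generate_by_row(const char *token, const char *format, int irow)
{
    char *result = malloc(DEMO_PATH_LEN);
    int width = 0;

    assert(result != NULL);
    result[0] = '\0';

    if (strcmp(token, "row") != 0 || irow < 0) {
        return result;   /* not our token, or value unknown */
    }
    if (format != NULL && format[0] != '\0') {
        width = atoi(format);
        if (width < 0 || width >= DEMO_PATH_LEN) width = 0;
    }
    /* "%0*d" pads with zeros up to the width passed as an argument */
    snprintf(result, DEMO_PATH_LEN, "%0*d", width, irow);

    return result;
}
```

The function follows the same contract as the version example: it always returns an allocated string, which is empty when the token does not match.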

  • 3. Add the new wildcard to the dictionary

At the beginning of XXX file you will find this structure definition:

You can add your new wildcard by adding its name (the string used to call it) and the name of the function to be used.
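The structure itself is not reproduced here, so the following is only a sketch of what a name-to-function dictionary can look like in C. Every identifier in it (demo_wildcard_entry, demo_dictionary, demo_find_wildcard and the two example functions) is hypothetical, and the function signature is simplified to take no arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified signature: the real functions take more arguments. */
typedef const char *(*demo_wildcard_fn)(void);

typedef struct {
    const char *name;           /* text written between braces in the pattern */
    demo_wildcard_fn function;  /* function that produces the replacement */
} demo_wildcard_entry;

const char *demo_version(void) { return "v0.0.0-demo"; }
const char *demo_pwd(void)     { return "/tmp"; }

const demo_wildcard_entry demo_dictionary[] = {
    { "version", demo_version },
    { "pwd",     demo_pwd },
    /* a new wildcard is registered by adding one line here */
};

/* Returns the function registered for name, or NULL if there is none. */
demo_wildcard_fn demo_find_wildcard(const char *name)
{
    size_t i;
    size_t n = sizeof demo_dictionary / sizeof demo_dictionary[0];

    assert(name != NULL);
    for (i = 0; i < n; i++) {
        if (strcmp(demo_dictionary[i].name, name) == 0) {
            return demo_dictionary[i].function;
        }
    }
    return NULL;
}
```

A table like this keeps registration in one place: adding a wildcard means adding one entry, and the lookup code does not change.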

  • 4. Improve documentation

Remember to add information about the new wildcard to the user documentation. Edit the file doc/overview/src/en/appc.dbk and add an explanation of the new wildcard to the list of available wildcards.

2.3.2. Extensions

GRASP code can be extended using a kind of "plug-in", but at compilation time. GRASP is highly optimized software, and this optimization is not lost because of extensions, since users can choose which extensions to install. In a production environment, the fewer extensions installed the better, since they take memory even if they are not used.

Many types of extensions can be developed; they are organized as input or output extensions depending on their target. The following sections explain each type of extension, its goal and the way to implement it. Manual implementation is not recommended, since the grasp-manager script provides easy commands to create the initial scaffold, but knowing all the details will help developers understand how to create an extension.

If you want to create the initial code for an extension, you can use grasp-manager with the following command:

./grasp-manager.sh create-extension [extension-type] [extension-name] [extension-url]

2.3.2.1. Input

Input extensions extend the input functionality of the GRASP code. They can be drivers or transformers. A driver is a method to read raw data directly and inject it into the GRASP core without using intermediate files. The whole process is performed in memory, which is crucial in satellite data processing due to the huge databases. Input transformers allow input data to be modified after it has been read. They are a way to share data preprocessing between different drivers. For example, a transformer can modify the initial guess for each pixel or read a climatology database.

These extensions work on the sdata structure, setting the information there. You can see the SDATA structure from the user's point of view in the user documentation: The SDATA format. To set data into that structure, the grasp_input_segment.h library should be used. The following scheme represents the sdata structure:

sdata.png
Sensor input data structure
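Setting data through a library instead of writing to the structure directly makes it possible to check array limits in one place. The following minimal sketch shows the idea with a toy structure. All names and sizes (demo_sdata_t, demo_pixel_t, demo_set_pixel_wl, DEMO_MAX_PIXELS, DEMO_MAX_WL) are hypothetical and much simpler than the real SDATA structure and the functions in grasp_input_segment.h.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_PIXELS 4   /* hypothetical limits, far smaller than real ones */
#define DEMO_MAX_WL     3

typedef struct {
    int ix, iy, it;             /* position of the pixel in the segment */
    int nwl;                    /* number of wavelengths already set */
    float wl[DEMO_MAX_WL];
} demo_pixel_t;

typedef struct {
    int npixels;
    demo_pixel_t pixel[DEMO_MAX_PIXELS];
} demo_sdata_t;

/* Stores wavelength number iwl of pixel ipixel.
 * Returns 0 on success and -1 when an index is outside the fixed limits,
 * leaving the structure untouched in that case. */
int demo_set_pixel_wl(demo_sdata_t *sdata, int ipixel, int iwl, float value)
{
    assert(sdata != NULL);
    if (ipixel < 0 || ipixel >= DEMO_MAX_PIXELS) return -1;
    if (iwl < 0 || iwl >= DEMO_MAX_WL) return -1;

    sdata->pixel[ipixel].wl[iwl] = value;
    /* keep the counters consistent with the highest index written so far */
    if (iwl + 1 > sdata->pixel[ipixel].nwl) sdata->pixel[ipixel].nwl = iwl + 1;
    if (ipixel + 1 > sdata->npixels) sdata->npixels = ipixel + 1;

    return 0;
}
```

A driver that only writes through setters of this kind cannot overflow the fixed-size arrays, and the error code tells it when the raw data exceeds the compiled limits.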

2.3.2.1.1. Drivers

A driver is a method for reading custom data and injecting it directly into the GRASP retrieval algorithm without any intermediate file. The general scheme is to start by developing a data reading library, implemented in any language, and then bind it with C, passing the data to a fixed structure called SDATA. This methodology allows drivers to be implemented in any language and then bound to the framework code. The drivers are placed in src/input/drivers/{DRIVER_NAME} and are compiled automatically by the framework makefiles. Drivers are an optimal way to introduce data into the retrieval algorithm, and also a good way to keep your scripts for preparing raw data organized and to reuse the code.

How to develop a driver

  • 1. Prepare the framework

You have to create a folder in src/input/drivers/{DRIVER_NAME}

  • 2. Write a read library

Place your data reading library in the driver folder. Your library can have its own Makefile, whose default rule will be called by the framework compiler. You need to create a file called object_list and add to it all the object files that your library needs to work. These object files will be added to the grasp_input library. The object_list file has to end with a new line. If your reader library is written in C and can be compiled with the default rules, you don't need to create a Makefile; the object list will be enough.

  • 3. Develop a bridge between the reader library and the GRASP retrieval algorithm (create the bridge files)

You need to create files for the driver bridge code and for the driver settings. At this point some names are fixed and the compiler will look for them. You have to create a file called grasp_input_driver_{DRIVER_NAME}.h that must declare the following interfaces:

// Main function that returns the driver object. The returned driver structure contains pointers to the
// driver functions to initialize, get segment information and close. These functions are also declared
// in this file. The name of this function is fixed and the framework will find it automatically.
grasp_input_driver_t grasp_input_driver_{DRIVER_NAME}();
// Initialize driver function. The name of this function is not fixed, but it is recommended to use this
// pattern. This function initializes the driver data. If the driver reads all the data in one step, it
// can do so here. If the driver preloads data for different areas, this function cannot load the data
// and delegates that action to the get_segment function. See the preloader documentation for more
// information. In this function, input_information->input_files contains the input files given to the
// driver. This function must set input_information->dimensions. input_information->used_files can be
// set at this point or segment by segment; at the end of the process this information has to be available.
int grasp_input_driver_{DRIVER_NAME}_init(grasp_settings *settings, grasp_tile_description_t *input_information);
// This function is the translator between the reader library and the framework. It gets the data read
// from the raw file and sets it in the correct place in the sdata structure. If the preload
// functionality is used, this function also calls the reader library.
int grasp_input_driver_{DRIVER_NAME}_get_segment(grasp_settings *settings, grasp_segment_t *segment, int col, int row, int itime);
// This function has to deallocate all the memory taken by the driver.
int grasp_input_driver_{DRIVER_NAME}_close(void);
// Function with a fixed name. The framework calls it to retrieve the driver settings and join them to
// the general settings of the system.
grasp_settings_parameter_array *grasp_input_driver_settings_{DRIVER_NAME}(grasp_settings *settings);

The file grasp_input_driver_{DRIVER_NAME}.c implements these functions. The following code shows their general structure:

// Data read from the instrument, stored globally
records_t records;

grasp_input_driver_t grasp_input_driver_{DRIVER_NAME}(){
    grasp_input_driver_t x;

    x.init=grasp_input_driver_{DRIVER_NAME}_init;
    x.get_segment=grasp_input_driver_{DRIVER_NAME}_get_segment;
    x.close=grasp_input_driver_{DRIVER_NAME}_close;

    return x;
}
int grasp_input_driver_{DRIVER_NAME}_get_segment(grasp_settings *settings, grasp_segment_t *segment, int col, int row, int itime){
    // sdata points to the SDATA structure of the segment (its declaration is omitted in this skeleton)
    int ipixel=0, irecord, iwl, ip, ivalidmeas;

    // Setting SDATA HEADER
    sdata->NX=?;
    sdata->NY=?;
    sdata->NT=?;
    // Loop over the data to be set
    for (irecord = FIRST_RECORD_FOR(col,row,itime) ; irecord < LAST_RECORD_FOR(col,row,itime) ; irecord++){
        sdata->pixel[ipixel].ix=?;
        sdata->pixel[ipixel].iy=?;
        sdata->pixel[ipixel].it=?;
        //...
        // Loop over wavelengths also...
        for(iwl=0;iwl<8;iwl++){
            //sdata->pixel[ipixel].meas[sdata->pixel[ipixel].nwl].ind_wl=?;
            //...
            // Loop over measurement types...
            for(ip=0; ip < sdata->pixel[ipixel].meas[sdata->pixel[ipixel].nwl].nip ;ip++){
                sdata->pixel[ipixel].meas[sdata->pixel[ipixel].nwl].meas_type[ip]=?;
                //...
                // Loop over directions
                for(ivalidmeas=0;ivalidmeas<sdata->pixel[ipixel].meas[sdata->pixel[ipixel].nwl].nbvm[ip] ;ivalidmeas++){
                    sdata->pixel[ipixel].meas[sdata->pixel[ipixel].nwl].thetav[ip][ivalidmeas]=?;
                    //...
                }
            }
        }
        ipixel++;
    }
    sdata->npixels=ipixel;

    return ipixel;
}
int grasp_input_driver_{DRIVER_NAME}_close(void){
    free(records);

    return 0;
}
int grasp_input_driver_{DRIVER_NAME}_init(grasp_settings *settings, grasp_tile_description_t *input_information){
    int nrecords, i;

    // Some validations...
    // Example of validation
    if(input_information->ninput_files!=1){
        printf("ERROR: {DRIVER_NAME} driver only supports 1 input file for now\n");
        exit(-1);
    }
    // Read data calling your reader library
    nrecords = read_records(input_information->input_files[0], &records, args...);
    if (nrecords < 0) {
        perror(input_information->input_files[0]);
        exit(-1);
    }
    // Set mandatory information
    // Number of segments that will be processed
    input_information->dimensions.segment_ncols=?;
    input_information->dimensions.segment_nrows=?;
    input_information->dimensions.segment_ntimes=?;
    // Number of pixels that will be processed
    input_information->dimensions.npixel_estimated=?;
    input_information->dimensions.tile_nt=?;
    input_information->dimensions.tile_nx=?;
    input_information->dimensions.tile_ny=?;
    // Used files (optional: you can set them here or in the get_segment function)
    input_information->nused_files=?;
    input_information->used_files=(char **)malloc(sizeof(char *)*input_information->nused_files);
    for(i=0;i<input_information->nused_files;i++){
        // Each name needs its own memory before it is copied
        input_information->used_files[i]=(char *)malloc(strlen(used_file[i])+1);
        strcpy(input_information->used_files[i],used_file[i]);
    }
    // Return iterator
    return 0;
}
grasp_settings_parameter_array *grasp_input_driver_settings_{DRIVER_NAME}(grasp_settings *settings){
    // If you don't need extra parameters, skip this code and use the shorter version of this function instead.
    // If you need extra parameters, read the documentation about how to define them. An example of how to do it is:
    int i;
    grasp_settings_parameter_array *result;
    // Static definition of a dictionary (we recommend defining it statically inside the function and copying it afterwards)
    yamlsettings_parameter parameters[]= {
        // Settings definition. See documentation.
        // The parameter will be classified inside the extension parameters block: input.driver_settings.{DRIVER_NAME}.{parameter_name}
    };

    result = (grasp_settings_parameter_array *) malloc(sizeof (grasp_settings_parameter_array));
    result->nparameters=sizeof(parameters)/sizeof(yamlsettings_parameter);
    result->parameters = (yamlsettings_parameter *) malloc(sizeof (yamlsettings_parameter)*result->nparameters);
    for(i=0;i<result->nparameters;i++){
        grasp_settings_copy_parameter(&(parameters[i]),&(result->parameters[i]));
    }

    return result;
}

For the settings function, grasp_input_driver_settings_{DRIVER_NAME}, you need to store the settings somewhere in memory. To define this new memory, you have to create a file called grasp_input_driver_settings_{DRIVER_NAME}.h . The settings specified there will be available in the settings structure. Following our example:

typedef struct grasp_input_driver_settings_{DRIVER_NAME}_t_ {
    // An example of integer value
    int value;
}grasp_input_driver_settings_{DRIVER_NAME}_t;

You can access this value with:

settings->input.driver.{DRIVER_NAME}.value

If you don't want to define new extra parameters, you don't need to add the file grasp_input_driver_settings_{DRIVER_NAME}.h, and the code of the grasp_input_driver_settings_{DRIVER_NAME} function will look like this:

grasp_settings_parameter_array *grasp_input_driver_settings_{DRIVER_NAME}(grasp_settings *settings){
    // If your driver does not need extra parameters you can return NULL
    return NULL;
}
  • 4. Create an object list

You have to create a file called object_list and place it in the driver folder. This file has to list all the object files that the framework will use when the driver is chosen to run. The file must end with a new line. An example of this file:

drivers/{DRIVER_NAME}/grasp_input_driver_{DRIVER_NAME}.o
drivers/{DRIVER_NAME}/aeronet_{DRIVER_NAME}.o
  • 5. Does your driver need extra resources?

Some drivers may need extra resources, such as a climatology of gases. First of all, we recommend checking the resources folder, because the resources you need may already be available there. If your resources are instrument dependent, you can create a resources folder inside your driver directory. These resources will be copied at the same time as the other resources used by the framework, with the Makefile rule:

make install_resources

Then you can reference them in your code using an absolute path, so that you do not need to care about which folder the code is executed from. To build the path, concatenate the parameter settings->global.resources_path, which holds the absolute path to the resources folder, with "/drivers/{DRIVER_NAME}/{RESOURCE_NAME}/{RESOURCE_FILE}".
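The concatenation can be done safely with snprintf, as in this minimal sketch. The helper demo_resource_path is a hypothetical name, and the resources path is passed as a plain string where driver code would pass settings->global.resources_path.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes "<resources_path>/drivers/<driver>/<resource>/<file>" into out.
 * Returns 0 on success, -1 if the path does not fit in out_size bytes. */
int demo_resource_path(char *out, size_t out_size, const char *resources_path,
                       const char *driver, const char *resource, const char *file)
{
    int n;

    assert(out != NULL && resources_path != NULL);
    assert(driver != NULL && resource != NULL && file != NULL);

    n = snprintf(out, out_size, "%s/drivers/%s/%s/%s",
                 resources_path, driver, resource, file);

    /* snprintf returns the length it needed: detect truncation */
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

Checking the return value avoids silently opening a truncated path when the resources folder is installed in a deep directory.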

  • 6. Does your driver have extra tools?

Finally, if your driver has some extra tools (for example, a tool to download raw data easily), you can create a bin folder inside the driver directory. The executables placed there will be copied to the bin folder of the framework and installed in the system if the user runs the "install" rule of the makefile.
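Returning to step 3, the driver object is a structure of function pointers, which lets the framework call any driver through the same three entry points. The following self-contained sketch shows that pattern with toy types. Nothing in it is the real GRASP interface: toy_driver_t, toy_tile_t, toy_segment_t and demo_run are hypothetical and much simpler than grasp_input_driver_t and its companions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the framework types. */
typedef struct { int nsegments; } toy_tile_t;
typedef struct { int npixels; } toy_segment_t;

typedef struct {
    int (*init)(toy_tile_t *tile);
    int (*get_segment)(toy_segment_t *segment, int isegment);
    int (*close)(void);
} toy_driver_t;

/* A toy driver: three segments, segment i holds i + 1 pixels. */
int toy_init(toy_tile_t *tile)
{
    tile->nsegments = 3;
    return 0;
}

int toy_get_segment(toy_segment_t *segment, int isegment)
{
    segment->npixels = isegment + 1;
    return segment->npixels;
}

int toy_close(void)
{
    return 0;
}

/* Equivalent of the fixed-name function that returns the driver object. */
toy_driver_t demo_driver_toy(void)
{
    toy_driver_t x;

    x.init = toy_init;
    x.get_segment = toy_get_segment;
    x.close = toy_close;

    return x;
}

/* How a controller could use any driver through the same three pointers.
 * Returns the total number of pixels read, or -1 on error. */
int demo_run(toy_driver_t driver)
{
    toy_tile_t tile;
    toy_segment_t segment;
    int i, total = 0;

    assert(driver.init != NULL && driver.get_segment != NULL && driver.close != NULL);

    if (driver.init(&tile) != 0) return -1;
    for (i = 0; i < tile.nsegments; i++) {
        total += driver.get_segment(&segment, i);
    }
    if (driver.close() != 0) return -1;

    return total;
}
```

Because the controller only sees the structure, a new driver can be added without touching the calling code; it only has to provide its own three functions.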

2.3.2.1.2. Transformers

[ Section in process ]

2.3.2.2. Output

Output extensions are functions that are called in order to print the output. There are three types of output functions, depending on their arguments and on the moment of the process at which they are called. To work with the output, it is important to know the output structures and the functions to access the data. For segments, the output structure follows this scheme:

segment_output.png
Segment output structure

However, to access that structure it is highly recommended to use the grasp_output_segment_result.h library.

The output extensions which work at tile level (tile and current output functions) receive a grasp_results_t structure as argument. Its content is shown in the following diagram:

tile_output.png
Tile output structure

This structure contains the results for all pixels. Dashed lines represent dynamic memory. Only the products requested by the user in the settings file are allocated. The array fields are also dynamically allocated. This makes the data more difficult to access, because multi-dimensional arrays cannot be accessed with the bracket syntax ([x][y] cannot be used). That is why, to access those products, you can use the grasp output tile library, which simplifies the way to retrieve the results. You can find the API to access the results in grasp_output_segment_result.h

Note: a common way to access a pixel in the grasp_output_segment_result.h library is to use the t, x and y indexes. If the developer wants to access pixels by pixel index (as a linear array), this can be done by passing 0, 0 and ipixel as arguments. Example:

int ipixel;
for (ipixel = 0; ipixel < results->information.tile_npixels; ipixel++) {
    if(grasp_output_tile_is_pixel(results,0,0,ipixel)){
        printf("latitude of pixel %d: %f", ipixel, grasp_output_tile_pixel_information_latitude(results,0,0,ipixel));
    }
}
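One way to see why accessor functions are needed, and why the arguments 0, 0 and ipixel can behave like a linear index, is the following sketch. It assumes that the values are stored in a single dynamically allocated block in row-major order; this layout is an assumption made for illustration and may differ from what the GRASP library does internally. All names (demo_results_t, demo_index, demo_latitude, demo_results_init) are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical results: one value per (it, ix, iy), stored in a single
 * dynamically allocated block in row-major order. */
typedef struct {
    int nt, nx, ny;
    float *latitude;   /* nt * nx * ny values */
} demo_results_t;

/* Position of (it, ix, iy) inside the flat block. */
size_t demo_index(const demo_results_t *r, int it, int ix, int iy)
{
    assert(r != NULL);
    return ((size_t)it * r->nx + ix) * r->ny + iy;
}

/* Accessor that replaces the latitude[it][ix][iy] syntax. */
float demo_latitude(const demo_results_t *r, int it, int ix, int iy)
{
    return r->latitude[demo_index(r, it, ix, iy)];
}

/* Allocates the block and fills each cell with its own flat position.
 * Returns 0 on success, -1 if the allocation fails. */
int demo_results_init(demo_results_t *r, int nt, int nx, int ny)
{
    size_t i, n = (size_t)nt * nx * ny;

    r->nt = nt;
    r->nx = nx;
    r->ny = ny;
    r->latitude = malloc(n * sizeof *r->latitude);
    if (r->latitude == NULL) return -1;
    for (i = 0; i < n; i++) r->latitude[i] = (float)i;

    return 0;
}
```

With this layout, demo_index(r, 0, 0, ipixel) equals ipixel, so passing 0, 0 and the pixel number walks the block as a linear array, as long as ipixel stays below nt * nx * ny.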

2.3.2.2.1. Segment functions

Segment output functions are called after each segment is processed.

2.3.2.2.2. Tile functions

[ Section in process ]

2.3.2.2.3. Current functions

[ Section in process ]