GRASP is open so that people can investigate and modify it. You can use it simply to learn how it works, but you can also contribute to the community in many ways: testing, reporting bugs, helping other users, or developing new features and extensions. This chapter explains all of this, along with the main GRASP architecture and its libraries: everything necessary to start helping the GRASP team.
Index of content chapter 2:
- 2. Contributions: How to and guides
- 2.1. Bug reporting
- 2.2. Documentation
- 2.2.1. User documentation
- 2.2.2. Technical documentation
- 2.3. Developing
- 2.3.1. Global libraries
- 2.3.1.1. Output streams
- 2.3.2. Extensions
- 2.3.2.1. Input
- 2.3.2.1.1. Drivers
- 2.3.2.1.2. Transformers
- 2.3.2.2. Output
- 2.3.2.2.1. Segment functions
- 2.3.2.2.2. Tile functions
- 2.3.2.2.3. Current functions
- 2.3.3. Making a new setting
2.1. Bug reporting
Reporting a bug is an easy task thanks to the GitLab web interface, and it is a great way to support the GRASP project. If you identify strange behavior in the code or an error such as "segmentation fault", you can report it to the development team by opening an issue (see the section 1.2. Open an issue).
2.2. Documentation
The documentation is also part of the code repository, and the success of the project depends on its quality. Our objective is therefore complete, high-quality documentation that is easy to read and understand. To achieve this, the documentation is split into user and technical documentation, which approach the topics from different points of view.
2.2.1. User documentation
The user documentation is published on the web at www.grasp-open.com/doc. Its source is in doc/overview/ and changes are automatically published on the web. The documentation is written with DocBook, which makes it possible to produce the results in web format (HTML) and PDF.
To edit the documentation, the developer (or reviewer in this case) has to edit the chapters in the files doc/overview/src/en/chap0*.dbl. The results can be previewed after compiling the documentation, which is easy once the prerequisites are installed:
- docbook
- an XSLT processor (xsltproc by default, or alternate tools Saxon or Xalan)
- a Formatting Objects processor (currently Apache FOP)
- the Apache Xerces XML parser (for use with Saxon or Xalan)
Then, from the doc/overview folder, you just need to type make. To see the results you can open doc/overview/website/index.html or doc/overview/website/docgrasp.pdf. Once the changes are done, run make clean before committing anything. Changes can be committed to a fork and then proposed for merging into the original project via a "pull request". These actions are explained in the section 1.3. Forking, contributing and pulling request.
2.2.2. Technical documentation
The documentation that you are reading right now is the technical documentation. Its raw files are in the doc/technical folder of the GRASP repository. Changes are automatically published on the web at www.grasp-open.com/tech-doc. This documentation is generated with Doxygen and accompanied by Markdown, a popular lightweight markup language with plain-text formatting syntax.
Technical documentation is written in the code files themselves, which helps to keep it up to date with the code interfaces. Additionally, some introductory chapters are written in Markdown. With Doxygen installed on the system, the source can be compiled using the make command in the doc/technical folder. Changes can be committed to a fork and then proposed for merging into the original project via a "pull request". These actions are explained in the section 1.3. Forking, contributing and pulling request.
2.3. Developing
The code is organized into modules, which are easy to identify in the source code because each one has its own folder under the src/ folder. These modules are:
- Retrieval: A scientific library that can also work as a stand-alone application (an advanced mode that is not recommended).
- Input: Responsible for preparing input data and injecting it into the retrieval algorithm.
- Output: Retrieves results from the algorithm and prepares the output for the user. Data can be dumped to the screen or into files in different formats (such as ASCII, HDF, NetCDF...).
- Settings: Defines the user interface, which determines how the software is going to work. Settings are provided as a YAML file which sets up the behavior of the process.
- Controller: The starting point of the code; it connects all the other modules.
- Global: A general module which can be used by the rest of the modules. It holds general things such as a parameters library, which can be used in the input or settings modules to work with an initial guess, or in the output module to work with an array of result parameters.
A general overview of the GRASP architecture is available in chapter three of the overview documentation, which is the recommended first reference before starting to work with the code.
The following diagram shows how the retrieval library is called and the general workflow:
The MPI compilation of the code follows a similar workflow, but the master node performs all the work except the inversion, which is done by the workers. The loop of the previous diagram is thus parallelized.
2.3.1. Global libraries
2.3.1.1. Output streams
INTRODUCTION
The user guide for output streams is in www.grasp-open.com/doc/apc.php
Output streams make it easy to redirect the output from GRASP to a file or to /dev/null. The library is well integrated with the settings module, which provides a powerful way to define the filename. The basic idea behind it is to offer a flexible way to name output files. If the output generated by a segment is called "output.txt", it will be overwritten by every segment, keeping only the information of the last one. Using wildcards, which are expanded dynamically, you can store all output information in an organized way. Implementation details are explained in the following sections.
USAGE EXAMPLE
If you call the initialize function at the beginning, you have to call the destroy function at the end. You may not need initialize if you only want a pointer to files generated by the stream library.
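The pairing rule can be illustrated with a minimal, self-contained sketch. It does not use the GRASP stream API: the `demo_stream` type and all `demo_stream_*` functions are hypothetical stand-ins, written only to show why every successful initialize needs a matching destroy (the stream owns memory that nobody else will release).

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a stream object; this is NOT the GRASP type. */
typedef struct {
    char *pattern; /* filename pattern, owned by the stream */
    FILE *file;    /* NULL while the stream is closed */
} demo_stream;

/* Copies the pattern. Each successful call must be paired with
 * demo_stream_destroy, otherwise the copy is leaked. */
int demo_stream_initialize(demo_stream *s, const char *pattern) {
    size_t n = strlen(pattern) + 1;
    s->pattern = malloc(n);
    if (s->pattern == NULL) return -1;
    memcpy(s->pattern, pattern, n);
    s->file = NULL;
    return 0;
}

/* Opens the stored path for writing and returns the file pointer. */
FILE *demo_stream_open(demo_stream *s) {
    s->file = fopen(s->pattern, "w");
    return s->file;
}

/* Closes the file if it is open; safe to call more than once. */
void demo_stream_close(demo_stream *s) {
    if (s->file != NULL) {
        fclose(s->file);
        s->file = NULL;
    }
}

/* Releases everything acquired by demo_stream_initialize. */
void demo_stream_destroy(demo_stream *s) {
    demo_stream_close(s);
    free(s->pattern);
    s->pattern = NULL;
}
```

A caller would run initialize, then open, write and close as many times as needed, and finally destroy. Refer to the headers of the stream library for the real function names and arguments.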
HOW TO ADD A NEW WILDCARD
There is a list of available wildcards that can be used in output streams. In principle, this list should cover all user requirements and it should not be necessary to add any more. But if you need a special wildcard which is not implemented, please follow this guide.
- 1. New wildcard definition
In the file src/output/grasp_output_stream.h you have to declare a function which will return the value of the wildcard. As an example, consider this function, which returns the version of the compilation:
- 2. Wildcard implementation
In the file src/output/grasp_output_stream.c you have to implement the function defined in point 1. Example:
As you can see, this function returns a newly allocated string (using the trackmem library) containing the selected text. The current example is simple since it does not use any other information. You can dive into the file and see other examples which use settings or segment information.
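As a rough illustration of steps 1 and 2, the sketch below shows the general shape of a wildcard function that returns an allocated string. Everything in it is an assumption made for the example: the name `demo_wildcard_version`, the empty argument list and the `DEMO_VERSION` string are hypothetical, and plain `malloc` stands in for the trackmem library.

```c
#include <stdlib.h>
#include <string.h>

/* Assumed version string, used only for illustration. */
#define DEMO_VERSION "v0.0.0-demo"

/* Hypothetical wildcard function: returns a newly allocated copy of the
 * version string, or NULL if the allocation fails. The caller owns the
 * result and must free it. malloc stands in for the trackmem library. */
char *demo_wildcard_version(void) {
    size_t n = strlen(DEMO_VERSION) + 1;
    char *out = malloc(n);
    if (out != NULL) {
        memcpy(out, DEMO_VERSION, n);
    }
    return out;
}
```

Returning allocated memory rather than a pointer to a static buffer lets the caller keep several expanded wildcards alive at once while it assembles a filename.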
- 3. Add the new wildcard to the dictionary
At the beginning of XXX file you will find this structure definition:
You can add your new wildcard just by adding the name of the wildcard (the string which will be used to call it) and the name of the function to be used.
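The idea of a dictionary that maps a wildcard name to the function producing its value can be sketched as follows. This is not the structure found in the GRASP sources: the type names, the table contents and the lookup function are hypothetical, and `malloc` again stands in for trackmem.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical signature shared by all wildcard functions in this sketch. */
typedef char *(*demo_wildcard_fn)(void);

/* One dictionary row: the name used to call the wildcard and its function. */
typedef struct {
    const char *name;
    demo_wildcard_fn fn;
} demo_wildcard_entry;

/* Helper that returns an allocated copy of a string (NULL on failure). */
static char *demo_dup(const char *s) {
    size_t n = strlen(s) + 1;
    char *out = malloc(n);
    if (out != NULL) memcpy(out, s, n);
    return out;
}

static char *demo_version(void) { return demo_dup("v0.0.0-demo"); }
static char *demo_hello(void)   { return demo_dup("hello"); }

/* Adding a wildcard means adding one row to this table. */
static const demo_wildcard_entry demo_dictionary[] = {
    { "version", demo_version },
    { "hello",   demo_hello },
};

/* Returns the allocated value of the named wildcard, or NULL if the
 * name is not in the dictionary. The caller frees the result. */
char *demo_wildcard_resolve(const char *name) {
    size_t count = sizeof demo_dictionary / sizeof demo_dictionary[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(demo_dictionary[i].name, name) == 0) {
            return demo_dictionary[i].fn();
        }
    }
    return NULL;
}
```

With a table like this, the code that expands a filename never needs to change when a wildcard is added; only the table grows.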
- 4. Improve documentation
Remember to add information about the new wildcard to the user documentation. Edit the file doc/overview/src/en/appc.dbk and add an explanation of the new one to the list of available wildcards.
2.3.2. Extensions
GRASP code can be extended with a kind of "plug-in"; however, extensions have to be selected at compilation time. GRASP is highly optimized software that loses nothing to extensions, since the user can choose which ones to install. In a production environment, the fewer extensions installed the better, since they take memory even if they are not used.
Many types of extensions can be developed, and they are organized as input or output depending on their target. The following sections explain each type of extension, its goal and how to implement it. Manual implementation is not recommended, since the grasp-manager script provides easy commands to create the initial scaffold, but knowing all the details will help developers to understand how to create an extension.
If you want to create initial code for an extension you can use the grasp-manager with the following command:
2.3.2.1. Input
Input extensions extend the input functionality of the GRASP code. They can be drivers or transformers. A driver is a method to read raw data directly and inject it into the GRASP core without using intermediate files. The whole process is performed in memory, which is crucial in satellite data processing because of the huge databases involved. Input transformers allow input data to be modified after it has been read. They are a way to share data preprocessing between different drivers. For example, a transformer can modify the initial guess for each pixel or read a climatology database.
These extensions work on the SDATA structure, setting the information there. You can see the SDATA structure from the user point of view in the user documentation: The SDATA format. To set data into that structure, the grasp_input_segment.h library should be used. The following scheme represents the SDATA structure:
2.3.2.1.1. Drivers
A driver is a method for reading custom data and injecting it directly into the GRASP retrieval algorithm without any intermediate file. The general approach is to first develop a data-reading library in any language and then bind it to C, passing the data to a fixed structure called SDATA. This methodology allows drivers to be implemented in any language and then bound to the framework code. Drivers are placed in src/input/drivers/{DRIVER_NAME} and are compiled automatically by the framework's makefiles. Drivers are an optimal way to introduce data into the retrieval algorithm, but they are also a good way to organize the scripts that prepare raw data and to reuse code.
How to develop a driver
- 1. Prepare the framework
You need to create a folder in src/input/drivers/{DRIVER_NAME}
- 2. Write a read library
Place your data-reading library in the driver folder. Your library can provide its own Makefile, in which case the default rule will be called by the framework compiler. You need to create a file named object_list and add all the object files that your library needs to work. These object files will be added to the grasp_input library. The object_list file must end with a new line. If your reader library is written in C and can be compiled with the default rules, you don't need to create a Makefile: the object list will be enough.
- 3. Developing a bridge between the reader library and GRASP retrieval algorithm (Create bridge files)
You need to create files for the driver bridge code and for the driver settings. At this point some names are fixed and the compiler will look for them. You have to create a file named grasp_input_driver_{DRIVER_NAME}.h that must declare the following interfaces:
The file grasp_input_driver_{DRIVER_NAME}.c will implement these functions. The following code shows their general structure:
For the last function in the code above you will need to store the settings somewhere in memory. To define these new memory places you have to create a file called grasp_input_driver_settings_{DRIVER_NAME}.h. The settings specified here will be available in the settings structure. Following our example:
You can access this value in:
If you don't want to define any extra parameters, you don't need to add the file grasp_input_driver_settings_example.h, and the code of the grasp_input_driver_settings_example function will be like this:
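To make the shape of a bridge more concrete, here is a self-contained sketch of the three operations a driver typically needs: open the raw data, fill one segment on request, and release resources. It is only a sketch under stated assumptions. The types `demo_settings`, `demo_tile_info` and `demo_segment`, the `demo_driver_*` names, the argument lists and the return conventions are all hypothetical; the mandatory interfaces and the real structures are the ones declared in the GRASP headers.

```c
#include <stddef.h>

/* Stand-in types (assumptions): the real settings, tile and segment
 * structures come from the GRASP headers and are much richer. */
typedef struct { int npixels_per_segment; } demo_settings;
typedef struct { int ncols, nrows, ntimes; } demo_tile_info;
typedef struct { int npixels; double value[16]; } demo_segment;

static int demo_driver_ready = 0;

/* Called once before processing: validates the settings and describes
 * the tile that is going to be processed. Returns 0 on success. */
int demo_driver_init(const demo_settings *settings, demo_tile_info *tile) {
    if (settings->npixels_per_segment < 1 || settings->npixels_per_segment > 16) {
        return -1;
    }
    tile->ncols = 2;
    tile->nrows = 2;
    tile->ntimes = 1;
    demo_driver_ready = 1;
    return 0;
}

/* Called for each segment: fills the segment in memory and returns the
 * number of pixels loaded, or -1 if the driver was not initialized. */
int demo_driver_get_segment(const demo_settings *settings, demo_segment *segment,
                            int col, int row, int itime) {
    if (!demo_driver_ready) return -1;
    segment->npixels = settings->npixels_per_segment;
    for (int i = 0; i < segment->npixels; i++) {
        /* Fake data; a real driver would call its reader library here. */
        segment->value[i] = 100.0 * itime + 10.0 * col + row;
    }
    return segment->npixels;
}

/* Called once at the end: releases whatever init acquired. */
int demo_driver_close(void) {
    demo_driver_ready = 0;
    return 0;
}
```

Keeping the reader library behind these few entry points is what lets the reader itself be written in any language: only the bridge has to speak C.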
- 4. Create an object list
You have to create a file called object_list and place it in the driver folder. This file has to contain all the object files that will be used by the framework when the driver is chosen to run. The file must end with a new line. An example of this file:
- 5. Does your driver need extra resources?
Some drivers may need extra resources, such as a climatology of gases. First of all, we recommend checking the resources folder, because the resources you need may already be available. If your resources are instrument dependent, you can create a resources folder inside your driver directory. These resources will be copied at the same time as the other resources used by the framework, with the Makefile rule.
and then you can link them inside your code using an absolute path, so that it does not matter from which folder you execute the code. To reference them, concatenate the parameter settings->global.resources_path, which holds the absolute path to the resources folder, with "/drivers/{DRIVER_NAME}/{RESOURCE_NAME}/{RESOURCE_FILE}".
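The concatenation described above can be done safely with `snprintf`. The helper below is a hypothetical sketch: `demo_resource_path` is not part of GRASP, and the resources path is passed in as a plain string instead of being read from the settings structure, so that the example stays self-contained.

```c
#include <stdio.h>

/* Builds "<resources_path>/drivers/<driver>/<resource>/<file>" into out.
 * Returns the length of the path on success, or -1 if the buffer is too
 * small or formatting fails. Hypothetical helper, not a GRASP function. */
int demo_resource_path(char *out, size_t size, const char *resources_path,
                       const char *driver, const char *resource,
                       const char *file) {
    int n = snprintf(out, size, "%s/drivers/%s/%s/%s",
                     resources_path, driver, resource, file);
    if (n < 0 || (size_t)n >= size) {
        return -1; /* truncated or failed: do not use the buffer */
    }
    return n;
}
```

In a driver, the first string argument would be the value of settings->global.resources_path. Checking the return value avoids silently opening a truncated path.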
- 6. Does your driver have extra tools?
Finally, if your driver has some extra tools (for example a tool for easily downloading raw data), you can create a bin folder inside the driver directory. The executables placed there will be copied to the bin folder of the framework and installed on the system if the user runs the "install" rule of the makefile.
2.3.2.1.2. Transformers
[ Section in process ]
2.3.2.2. Output
Output extensions are functions that are called in order to print the output. There are three types of output functions, depending on their arguments and on the moment of the process at which they are called. To work with the output it is important to know the output structures and the functions used to access the data. The output structure for segments follows this scheme:
To access that structure, however, it is highly recommended to use the grasp_output_segment_result.h library.
The output extensions which work at tile level (tile and current output functions) will receive as argument a grasp_results_t structure. A diagram of the content is the following one:
This structure contains the results for all pixels. Dashed lines represent dynamic memory. Only the products requested by the user in the settings file will be allocated. The array fields are also dynamically allocated. This makes the data more difficult to access, because multi-dimensional arrays cannot be accessed with the bracket syntax ([x][y] cannot be used). That is why, in order to access these products, you can use the grasp output tile library, which simplifies the way to retrieve the results. You can find the API to access the results in grasp_output_segment_result.h
Note: the common way to access a pixel in the grasp_output_segment_result.h library is by using the t, x and y indexes. If the developer wants to access pixels by pixel index (as a linear array), this can be done by passing 0, 0 and ipixel as arguments. Example:
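The sketch below illustrates why a dynamically allocated product cannot be read with [t][x][y] and why passing 0, 0 and ipixel can behave as a linear index. It is not the GRASP accessor: the `demo_results` type, the field `aod` and both functions are hypothetical, and the memory layout (y varying fastest) is an assumption of this sketch.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a per-pixel product stored in one
 * dynamically allocated block of nt * nx * ny values. */
typedef struct {
    int nt, nx, ny; /* tile dimensions */
    double *aod;    /* flattened product, not a true 3-D array */
} demo_results;

/* Offset of pixel (t, x, y), assuming y varies fastest. */
size_t demo_index(const demo_results *r, int t, int x, int y) {
    return ((size_t)t * (size_t)r->nx + (size_t)x) * (size_t)r->ny + (size_t)y;
}

/* Accessor that hides the offset arithmetic from the caller. */
double demo_get_aod(const demo_results *r, int t, int x, int y) {
    return r->aod[demo_index(r, t, x, y)];
}
```

Under this layout, `demo_get_aod(r, 0, 0, ipixel)` reads element `ipixel` of the flattened block, because the t and x terms of the offset vanish. An accessor function also keeps calling code independent of the layout actually used.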
2.3.2.2.1. Segment functions
Segment output functions are called after the process of each segment.
2.3.2.2.2. Tile functions
[ Section in process ]
2.3.2.2.3. Current functions
[ Section in process ]
2.3.3. Making a new setting
This is meant to be a small guide, to be followed as a reference together with the structures already present in the corresponding files, which can be copied and adapted. Five different files have to be modified to propagate the input values from the YAML file to the Fortran90 core:
1˚ src/settings/grasp_settings.c
C dictionary to set (in order): tree structure in the YAML file, memory address if needed, variable type and dimensions, default value, array type, short description
2˚ src/settings/grasp_settings_data_types.c
If the setting is an option that accepts different strings as possible input values, these strings should be defined in this file, together with their numerical correspondence if needed. The function to do this is called "yamlsettings_enumeration_definition". Here is an example:
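The principle of mapping option strings to numeric values can be sketched in a self-contained way. The code below does not call yamlsettings_enumeration_definition, whose signature is not shown here; the table, the option names and `demo_enum_parse` are hypothetical and only illustrate the idea.

```c
#include <string.h>

/* One accepted string and the number the core will receive for it. */
typedef struct {
    const char *name;
    int value;
} demo_enum_option;

/* Hypothetical option table for an imaginary "output format" setting. */
static const demo_enum_option demo_output_format[] = {
    { "ascii",  0 },
    { "hdf",    1 },
    { "netcdf", 2 },
};

/* Returns the numeric value of the option, or -1 if the string is not
 * one of the accepted values. */
int demo_enum_parse(const char *text) {
    size_t count = sizeof demo_output_format / sizeof demo_output_format[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(demo_output_format[i].name, text) == 0) {
            return demo_output_format[i].value;
        }
    }
    return -1;
}
```

Rejecting unknown strings at parse time means a typing mistake in the YAML file is reported before the retrieval starts, instead of reaching the core as a meaningless number.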
3˚ src/settings/grasp_settings_data_types.h
Memory allocation interface. There are two functions, to be named "grasp_data_type_VARIABLE_NAME_set" and "grasp_data_type_VARIABLE_NAME_get".
The two following files establish the interface between Fortran90 and C.
4˚ src/settings/grasp_settings_t.h
Definition inside the structure for C
5˚ src/retrieval/interfaces/mod_retr_settings_derived_type.f90
Definition inside the structure for Fortran90
In this step, please keep the order of both lists of variables consistent. Note that in the C part of the interface the names of the constants begin with a "_", and that the order of the dimensions of the variables is reversed between C and Fortran90.
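The reversal of dimensions follows from the storage order of the two languages: C stores arrays row by row, Fortran column by column. A C array declared as `double a[2][3]` therefore shares its memory layout with a Fortran array declared as `a(3,2)`. The two helper functions below are illustrative only (they are not part of GRASP) and compute the memory offset of an element in each convention.

```c
#include <stddef.h>

/* Offset of a[i][j] in a C array "double a[n0][n1]" (row-major,
 * zero-based indexes). Only n1 is needed to compute it. */
size_t c_offset(size_t i, size_t j, size_t n1) {
    return i * n1 + j;
}

/* Offset of a(i,j) in a Fortran array "a(m0,m1)" (column-major,
 * one-based indexes). Only m0 is needed to compute it. */
size_t fortran_offset(size_t i, size_t j, size_t m0) {
    return (i - 1) + (j - 1) * m0;
}
```

For example, `c_offset(1, 2, 3)` and `fortran_offset(3, 2, 3)` both give 5: element a[1][2] on the C side is element a(3,2) on the Fortran side, with the indexes swapped and shifted by one.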
If you create a struct of variables, be careful to use different names for the structure type and for its final definition.