To increase the traceability of changes between versions of datasets and reduce uncertainty stemming from choices made at implementation time, we favored using original code implementations or thoroughly validated re-implementations of original codes. Thus, our code organization strings together loosely coupled components which implement each step, with clear-cut interfaces between steps. This modular approach eases the maintenance and change efforts for any individual step, but adds complexity to evaluating changes for the entire pipeline. Different programming languages (Python, C, MATLAB and IDL, plus PV-WAVE for FLUXNET2015) were used to implement the different steps, all connected using a controller code that makes appropriate calls in the correct order. The ONEFlux22 code collection replaced the PV-WAVE code with a re-implementation in Python, and also collates most of these steps into a cohesive pipeline (see also the Code Availability section). The IDL code, which applies the sundown partitioning method44, is not yet currently implemented in ONEFlux, because some additional testing and development are needed to make it robust and more suitable for general application. Implementation details of individual steps are discussed next, with references to the outputs each step identified by an execution sequential number and the step name–e.g., 01_qc_visual contains the results of the first processing step, the visual check step. Each of these steps correspond to a code module. Supplementary Fig. SM2 shows the steps and their inter-dependencies.
The main controller code for ONEFlux is implemented in Python. Besides being the glue code that executes each step, pre- and post-checks are also executed before and after each step. These checks guarantee that the input data meet the minimum requirements to run the step, that the minimum expected outputs were generated by the execution of the step, and that any errors or exception conditions were handled correctly. Information about execution is recorded in a log for the entire pipeline, along with logs for individual steps. Besides the controller code, two of the three flux partitioning steps were re-implemented in Python (the nighttime and daytime methods, 10_nee_partition_nt and 11_nee_partition_dt), together with other specific steps such as data preparation for the uncertainty estimates (12_ure_input), and the creation and checking of final products (99_fluxnet2015). The original flux partitioning implementation in PV-WAVE was used for the LaThuile2007 and FLUXNET2015 datasets. Also, the tool for the downscaling of the ERA-I meteorological data is implemented in Python and runs on a server connected to the ERA data.
Several steps are implemented in the C programming language, allowing better control over execution performance of these steps. These steps include:
automated QA/QC flagging (02_qc_auto), USTAR threshold estimation using the MP method (04_ustar_mp), the filtering and gap-filling of meteorological data, including the merging with the ERA-I downscaled data (07_meteo_proc), the filtering and gap-filling of CO2 fluxes (08_nee_proc), the filtering, gap-filling, and energy corrections of energy fluxes (09_energy_proc), and the computation of uncertainty products (12_ure). The source codes and the compiled executables are provided for steps implemented in C, as well as build procedures in make/Makefile format.
The estimation of USTAR thresholds using the CP method (05_ustar_cp) is the only step implemented in MATLAB. It is distributed both as source code and compiled code to be used with the MATLAB Runtime Environment, such that it does not require a license purchase.
Do you have any questions about this protocol?
Post your question to gather feedback from the community. We will also invite the authors of this article to respond.
 Tips for asking effective questions
+ Description
Write a detailed description. Include all information that will help others answer your question including experimental processes, conditions, and relevant images.