Indicates whether slopes should be discarded or replaced by 0, according to quality thresholds set by the user.
Usage
flux_quality(
  slopes_df,
  conc_col,
  f_fluxid = f_fluxid,
  f_slope = f_slope,
  f_time = f_time,
  f_start = f_start,
  f_end = f_end,
  f_fit = f_fit,
  f_cut = f_cut,
  f_pvalue = f_pvalue,
  f_rsquared = f_rsquared,
  f_b = f_b,
  force_discard = c(),
  force_ok = c(),
  ratio_threshold = 0,
  fit_type = c(),
  ambient_conc = 421,
  error = 100,
  pvalue_threshold = 0.3,
  rsquared_threshold = 0.7,
  rmse_threshold = 25,
  cor_threshold = 0.5,
  b_threshold = 1,
  cut_arg = "cut"
)
Arguments
- slopes_df
dataset containing slopes
- conc_col
column containing the measured gas concentration (exponential fit)
- f_fluxid
column containing unique IDs for each flux
- f_slope
column containing the slope of each flux (as calculated by the flux_fitting function)
- f_time
column containing the time of each measurement in seconds (exponential fit)
- f_start
column with datetime of the start of the measurement (after cuts)
- f_end
column with datetime of the end of the measurement (after cuts)
- f_fit
column containing the modeled data (exponential fit)
- f_cut
column containing the cutting information
- f_pvalue
column containing the p-value of each flux (linear and quadratic fit)
- f_rsquared
column containing the r squared of each flux (linear and quadratic fit)
- f_b
column containing the b parameter of the exponential expression (exponential fit)
- force_discard
vector of fluxIDs that should be discarded by the user's decision
- force_ok
vector of fluxIDs for which the user wants to keep the calculated slope despite a bad quality flag
- ratio_threshold
ratio of gas concentration data points to the length of the measurement (in seconds) below which the measurement is considered to have too few data points to be used in calculations
- fit_type
model fitted to the data: linear, quadratic or exponential. Filled automatically if slopes_df was produced using flux_fitting()
- ambient_conc
ambient gas concentration in ppm at the site of measurement (used to detect measurements that started with a polluted setup)
- error
error of the setup; defines a window around ambient_conc outside of which the starting values indicate a polluted setup
- pvalue_threshold
threshold of p-value below which the change of gas concentration over time is considered not significant (linear and quadratic fit)
- rsquared_threshold
threshold of r squared value below which the linear model is considered an unsatisfactory fit (linear and quadratic fit)
- rmse_threshold
threshold for the RMSE of each flux above which the fit is considered unsatisfactory (exponential fit)
- cor_threshold
threshold for the correlation coefficient of gas concentration with time below which the correlation is considered not significant (exponential fit)
- b_threshold
threshold for the b parameter. Defines a window between -b_threshold and b_threshold inside which the fit is considered good enough (exponential fit)
- cut_arg
value in the f_cut column indicating that a data point should be cut out
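To make the window-type arguments more concrete, the following base R sketch shows one possible reading of ratio_threshold, ambient_conc with error, and b_threshold. It is not the package's implementation: the helper names (enough_data, start_in_window, b_in_window), the symmetric window ambient_conc ± error, and the choice of strict or non-strict inequalities are assumptions made for illustration only.

```r
# Illustrative helpers only; names and exact inequalities are assumptions,
# not fluxible internals.

# Ratio of non-missing concentration points to measurement length in seconds,
# compared against ratio_threshold.
enough_data <- function(conc, length_s, ratio_threshold = 0) {
  ratio <- sum(!is.na(conc)) / length_s
  ratio >= ratio_threshold
}

# Assumed symmetric window around the ambient concentration.
start_in_window <- function(start_conc, ambient_conc = 421, error = 100) {
  abs(start_conc - ambient_conc) <= error
}

# Window between -b_threshold and b_threshold for the exponential b parameter.
b_in_window <- function(b, b_threshold = 1) {
  abs(b) < b_threshold
}

conc <- c(447, NA, 448, 449, NA, 450)  # toy concentrations in ppm

enough_data(conc, length_s = 6, ratio_threshold = 0.5)  # 4 of 6 points present
start_in_window(447)                                    # compared to 421 +/- 100
b_in_window(-1.5)                                       # compared to (-1, 1)
```

With the defaults shown in Usage (ratio_threshold = 0), the data-point check in this sketch never rejects a measurement, so it only becomes active once the user sets a positive ratio.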
Value
a dataframe with added columns of quality flags (f_quality_flag), the slope corrected according to the quality flags (f_slope_corr), some diagnostics depending on the fit, and any columns present in the input.
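Since the returned dataframe keeps one row per data point, the flag and corrected slope repeat within each flux. A minimal base R sketch of collapsing such output to one row per flux is given here. The toy dataframe is invented for illustration; only the column names f_fluxid, f_quality_flag and f_slope_corr come from this page, and the NA used for the discarded slope is an assumption.

```r
# Toy stand-in for the output of flux_quality(); the values are made up,
# only the column names come from the documentation.
flagged <- data.frame(
  f_fluxid = c(1, 1, 1, 2, 2, 3, 3),
  f_quality_flag = c("ok", "ok", "ok", "discard", "discard", "zero", "zero"),
  # Assumption: a discarded slope is shown as NA, a "zero" slope as 0.
  f_slope_corr = c(0.8, 0.8, 0.8, NA, NA, 0, 0)
)

# Keep the first row of each flux, since flag and slope repeat within a flux.
per_flux <- flagged[!duplicated(flagged$f_fluxid), ]

# Count fluxes per quality flag.
flag_counts <- table(per_flux$f_quality_flag)
print(flag_counts)
```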
Examples
data(slopes0lin)
flux_quality(slopes0lin, conc, fit_type = "li")
#>
#> Total number of measurements: 6
#>
#> discard 5 83 %
#> ok 1 17 %
#> zero 0 0 %
#> force_discard 0 0 %
#> start_error 0 0 %
#> no_data 0 0 %
#> force_ok 0 0 %
#> # A tibble: 1,251 × 22
#> datetime temp_air temp_soil conc PAR turfID type
#> <dttm> <dbl> <dbl> <dbl> <dbl> <fct> <fct>
#> 1 2022-07-28 23:43:35 NA NA 447. NA 156 AN2C 156 ER
#> 2 2022-07-28 23:43:36 7.22 10.9 447. 1.68 156 AN2C 156 ER
#> 3 2022-07-28 23:43:37 NA NA 448. NA 156 AN2C 156 ER
#> 4 2022-07-28 23:43:38 NA NA 449. NA 156 AN2C 156 ER
#> 5 2022-07-28 23:43:39 NA NA 449. NA 156 AN2C 156 ER
#> 6 2022-07-28 23:43:40 NA NA 450. NA 156 AN2C 156 ER
#> 7 2022-07-28 23:43:41 NA NA 451. NA 156 AN2C 156 ER
#> 8 2022-07-28 23:43:42 NA NA 451. NA 156 AN2C 156 ER
#> 9 2022-07-28 23:43:43 NA NA 453. NA 156 AN2C 156 ER
#> 10 2022-07-28 23:43:44 NA NA 453. NA 156 AN2C 156 ER
#> # ℹ 1,241 more rows
#> # ℹ 15 more variables: f_start <dttm>, f_end <dttm>, f_fluxid <fct>,
#> # f_flag_match <chr>, f_time <dbl>, f_cut <fct>, f_pvalue <dbl>,
#> # f_rsquared <dbl>, f_adj_rsquared <dbl>, f_intercept <dbl>, f_slope <dbl>,
#> # f_fit <dbl>, f_ratio <dbl>, f_quality_flag <chr>, f_slope_corr <dbl>
data(slopes30)
flux_quality(slopes30, conc, fit_type = "expo")
#>
#> Total number of measurements: 6
#>
#> ok 6 100 %
#> discard 0 0 %
#> zero 0 0 %
#> force_discard 0 0 %
#> start_error 0 0 %
#> no_data 0 0 %
#> force_ok 0 0 %
#> # A tibble: 1,251 × 27
#> datetime temp_air temp_soil conc PAR turfID type
#> <dttm> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
#> 1 2022-07-28 23:43:35 NA NA 447. NA 156 AN2C 156 ER
#> 2 2022-07-28 23:43:36 7.22 10.9 447. 1.68 156 AN2C 156 ER
#> 3 2022-07-28 23:43:37 NA NA 448. NA 156 AN2C 156 ER
#> 4 2022-07-28 23:43:38 NA NA 449. NA 156 AN2C 156 ER
#> 5 2022-07-28 23:43:39 NA NA 449. NA 156 AN2C 156 ER
#> 6 2022-07-28 23:43:40 NA NA 450. NA 156 AN2C 156 ER
#> 7 2022-07-28 23:43:41 NA NA 451. NA 156 AN2C 156 ER
#> 8 2022-07-28 23:43:42 NA NA 451. NA 156 AN2C 156 ER
#> 9 2022-07-28 23:43:43 NA NA 453. NA 156 AN2C 156 ER
#> 10 2022-07-28 23:43:44 NA NA 453. NA 156 AN2C 156 ER
#> # ℹ 1,241 more rows
#> # ℹ 20 more variables: f_start <dttm>, f_end <dttm>, f_fluxid <dbl>,
#> # f_flag_match <lgl>, f_time <dbl>, f_cut <chr>, f_Cz <dbl>, f_Cm <dbl>,
#> # f_a <dbl>, f_b <dbl>, f_tz <dbl>, f_slope <dbl>, f_fit <dbl>,
#> # f_fit_slope <dbl>, f_start_z <dttm>, f_ratio <dbl>, f_cor_coef <dbl>,
#> # f_RMSE <dbl>, f_quality_flag <chr>, f_slope_corr <dbl>