SCT with fg

Validates observations based on spatial properties and first-guess values. Specifically, it applies a spatial consistency test (SCT) to a set of observations, given a first guess for the observed values.

Pseudo-Algorithm

Definitions

Good Observation: An observation that accurately represents the actual atmospheric state, with reasonable accuracy and precision compared to its nearest neighbors.
Suspect or Bad Observation: An observation that does not accurately represent the atmospheric state or has significantly different accuracy or precision than its neighbors.
Centroid Observation: The observation at the center of two concentric circles, the inner circle and the outer circle, with radii inner_radius and outer_radius, respectively.
Outer Circle: The area used to select observations for assessing the quality of one or more observations simultaneously. It may include observations that help evaluate others but are not themselves assessed for quality.
Inner Circle: The area that allows multiple observations to be flagged at the same time. Checking more observations simultaneously speeds up the quality control process but increases the risk of misclassifying good observations as suspect.
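The two circles can be pictured as two distance cut-offs around the same centroid. The following is a minimal sketch, not the library's implementation: the `Station` type and the `within_radius` helper are hypothetical, and planar coordinates in metres are assumed for simplicity, whereas the function itself takes longitude and latitude.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical planar station position in metres (assumption: the real
// function receives longitude/latitude, not projected coordinates).
struct Station {
    double x;
    double y;
};

// Indices of all stations whose distance from the centroid is at most `radius`.
// The centroid itself is always included because its distance is zero.
std::vector<std::size_t> within_radius(const std::vector<Station>& stations,
                                       std::size_t centroid,
                                       double radius) {
    std::vector<std::size_t> selected;
    const Station& c = stations[centroid];
    for (std::size_t i = 0; i < stations.size(); ++i) {
        const double d = std::hypot(stations[i].x - c.x, stations[i].y - c.y);
        if (d <= radius) {
            selected.push_back(i);
        }
    }
    return selected;
}
```

Calling `within_radius` once with inner_radius and once with outer_radius gives the set of observations that may be flagged and the larger set that contributes to the analysis.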

Main Steps

SCT iteration
- Gradually tighten thresholds with each iteration, making it harder to flag observations as suspect.
- Detection Loop: Identify suspect observations.
- Cluster Preservation Loop: Save data that blends well with neighbors.
- Stray Data Redemption Loop: Restore good observations that were wrongly flagged along with suspect neighbors.
- Flag Assignment Step: Assign a final flag to each observation based on detection, cluster preservation, and redemption results.
- Exit Condition: Terminate the SCT iteration if no suspect observations are found or the maximum number of iterations is reached.

The final flag is assigned by the Stray Data Redemption Loop, which relies on the flags from the previous two loops. Subsequent SCT iterations do not change the suspect flags from earlier iterations, but they may flag previously good observations as suspect.
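The control flow of the outer iteration can be sketched as follows. This is an illustrative skeleton only: `SctPass` is a hypothetical callback standing in for the three loops combined, and the real function does not expose such a hook.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical callback: given the current flags, return the indices judged
// suspect in this pass (detection, preservation and redemption combined).
using SctPass = std::function<std::vector<std::size_t>(const std::vector<int>&)>;

// Run at most `num_iterations` passes and return how many were executed.
int run_sct_iterations(std::vector<int>& flags, int num_iterations, const SctPass& pass) {
    int executed = 0;
    for (int it = 0; it < num_iterations; ++it) {
        ++executed;
        const std::vector<std::size_t> suspects = pass(flags);
        std::size_t newly_flagged = 0;
        for (std::size_t i : suspects) {
            if (flags[i] == 0) {  // suspect flags from earlier passes are never cleared
                flags[i] = 1;
                ++newly_flagged;
            }
        }
        if (newly_flagged == 0) {
            break;  // exit condition: no new suspect observations
        }
    }
    return executed;
}
```

Flags only ever move from 0 to 1 here, which mirrors the statement that later iterations can add suspect flags but never remove earlier ones.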

Detection Loop

Loop over all observations:
- Check if the current observation qualifies as a centroid observation.
- If yes, gather all neighbors within the outer circle that were not flagged as suspect in previous SCT iterations.
- If the observation is isolated, exit without flagging it.
- For all selected observations, perform analysis and leave-one-out analysis.
- Estimate observation error variance within the outer circle.
- Compute the SCT score for all selected observations within the inner circle based on analysis, leave-one-out analysis, and observation error variance.
- Flag as suspect any observations (in the inner circle) with an SCT score exceeding the specified threshold.
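The exact score formula is not spelled out here. As a rough illustration, the sketch below assumes the form commonly used for the SCT: the product of the analysis residual and the leave-one-out residual, divided by the observation error variance. The names yo, ya, yav and sig2 follow the diagnostic file columns. Using pos when the observation lies above the leave-one-out analysis and neg otherwise, and using min_obs_var as a floor on sig2, are assumptions of this sketch.

```cpp
#include <algorithm>

// Assumed score: (observation - analysis) * (observation - leave-one-out analysis)
// divided by the observation error variance, which is floored at min_obs_var
// to avoid dividing by a vanishing variance (assumption).
double sct_score(double yo, double ya, double yav, double sig2, double min_obs_var) {
    const double variance = std::max(sig2, min_obs_var);
    return (yo - ya) * (yo - yav) / variance;
}

// Returns 1 (suspect) when the score exceeds the threshold, 0 otherwise.
// Assumption: `pos` applies when the observation is above the leave-one-out
// analysis, `neg` when it is below.
int flag_from_score(double yo, double ya, double yav, double sig2,
                    double min_obs_var, double pos, double neg) {
    const double score = sct_score(yo, ya, yav, sig2, min_obs_var);
    const double threshold = (yo - yav) >= 0.0 ? pos : neg;
    return score > threshold ? 1 : 0;
}
```

With this form, the score is large only when the observation disagrees with both the analysis and the leave-one-out analysis in the same direction.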

Cluster Preservation Loop

Loop over all observations flagged by the Detection Loop:
- Treat each flagged observation as a centroid observation.
- Retrieve all neighbors close to the centroid (same neighbors as in the Detection Loop).
- For each selected observation, perform both analysis and leave-one-out analysis. These are based on an iterative Optimal Interpolation (OI) scheme with four iterations. In the first iteration, the background is the first-guess values. For subsequent iterations, the background is the leave-one-out analysis. The final analysis and leave-one-out analysis should be closer to observed values, especially when clusters of observations reconstruct similar atmospheric patterns.
- Estimate the observation error variance within the outer circle.
- Compute the SCT score for the centroid observation based on the analysis, leave-one-out analysis, and observation error variance.
- Flag the centroid observation as suspect if the SCT score exceeds the threshold.
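The background update in the iterative OI scheme can be sketched as a small driver loop. This is a structural sketch only: `OiPass` is a hypothetical callback standing in for one OI pass, and no actual Optimal Interpolation is implemented here.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

struct OiResult {
    std::vector<double> analysis;       // analysis at each observation
    std::vector<double> leave_one_out;  // leave-one-out analysis at each observation
};

// Hypothetical stand-in for one OI pass: takes observations and a background.
using OiPass = std::function<OiResult(const std::vector<double>&,
                                      const std::vector<double>&)>;

OiResult iterate_oi(const std::vector<double>& observations,
                    const std::vector<double>& first_guess,
                    const OiPass& oi_pass,
                    int iterations = 4) {
    std::vector<double> background = first_guess;  // first iteration: first guess
    OiResult result{background, background};
    for (int it = 0; it < iterations; ++it) {
        result = oi_pass(observations, background);
        background = result.leave_one_out;  // later iterations: leave-one-out analysis
    }
    return result;
}
```

If each pass moves the leave-one-out analysis part of the way from the background toward the observations, repeating the pass brings the final values progressively closer to the observed ones, which is the behaviour described above for clusters of consistent observations.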

Stray Data Redemption Loop

Loop over all observations flagged by the Cluster Preservation Loop:
- Treat each flagged observation as a centroid observation.
- Retrieve all neighbors close to the centroid, considering only non-flagged neighbors.
- If the observation is isolated, flag it as suspect and skip the remaining steps.
- For all selected observations, perform analysis and leave-one-out analysis.
- Estimate observation error variance within the outer circle.
- Compute the SCT score for the centroid observation based on the analysis, leave-one-out analysis, and observation error variance.
- Flag the centroid observation as suspect if the SCT score exceeds the threshold.
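Isolated observations are treated differently in the two loops: the Detection Loop leaves them unflagged, while the Stray Data Redemption Loop flags them as suspect. A minimal sketch of that difference follows. The helper names are hypothetical, and the definition of "isolated" as having fewer than num_min observations available is an assumption.

```cpp
#include <cstddef>

enum class Loop { Detection, Redemption };

// Assumption: an observation is isolated when fewer than num_min
// observations are available within the outer circle.
bool is_isolated(std::size_t n_available, int num_min) {
    return n_available < static_cast<std::size_t>(num_min);
}

// Flag given to an isolated centroid: left unchanged by the Detection Loop,
// set to suspect (1) by the Stray Data Redemption Loop.
int flag_when_isolated(Loop loop, int current_flag) {
    return loop == Loop::Redemption ? 1 : current_flag;
}
```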

Function Signature

ivec titanlib::sct_with_fg(const Points& points,
        const vec& values,
        const vec& background_values,
        float values_min,
        float values_max,
        int num_min,
        int num_max,
        float inner_radius,
        float outer_radius,
        int num_iterations,
        float min_horizontal_scale,
        float max_horizontal_scale,
        float vertical_scale,
        const vec& pos,
        const vec& neg,
        const vec& eps2,
        const vec& min_obs_var,
        bool diagnostics,
        const std::string& filename_diagnostics,
        vec& sct_scores,
        const ivec& obs_to_check)

Description:

points: Longitude, latitude, and elevation of observation locations
values: Observed values
background_values: First-guess values at observation locations
values_min: Minimum acceptable observed value (set equal to values_max to ignore)
values_max: Maximum acceptable observed value (set equal to values_min to ignore)
num_min: Minimum required observations within the outer radius (must be > 1)
num_max: Maximum observations used for the test (must be > num_min)
inner_radius: Radius for flagging [m]
outer_radius: Radius for computing OI [m]
num_iterations: Maximum number of iterations (stops early if no new flags are set)
min_horizontal_scale: Minimum horizontal decorrelation length [m]
max_horizontal_scale: Maximum horizontal decorrelation length [m]
vertical_scale: Vertical decorrelation length [m]
pos: Allowed positive deviation
neg: Allowed negative deviation
eps2: Observation-to-background error variance ratio (e.g., 0.5 means observations are trusted twice as much as the background)
min_obs_var: Minimum observation error variance (reflects estimated representativeness error or expected observation uncertainty at min_horizontal_scale)
diagnostics: Whether to write diagnostics to a file (true or false)
filename_diagnostics: Diagnostics filename
obs_to_check: Observations to be checked (1 = check, 0 = ignore)
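Some of the constraints listed above can be checked before calling the function. The helpers below are hypothetical and not part of titanlib. They only restate the documented rules: num_min > 1, num_max > num_min, and the range check being ignored when values_min equals values_max.

```cpp
#include <stdexcept>

// Hypothetical pre-flight check for the documented count constraints.
void check_counts(int num_min, int num_max) {
    if (num_min <= 1) {
        throw std::invalid_argument("num_min must be > 1");
    }
    if (num_max <= num_min) {
        throw std::invalid_argument("num_max must be > num_min");
    }
}

// Range check as described above: setting values_min equal to values_max
// disables the check, so every value is accepted.
bool within_range(float value, float values_min, float values_max) {
    if (values_min == values_max) {
        return true;
    }
    return value >= values_min && value <= values_max;
}
```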

Returns:

Flags indicating suspect observations (1 = suspect, 0 = good)
sct_scores: SCT (gross error) score per observation (higher values indicate a greater likelihood of a measurement error or a large representativeness error)

Diagnostic file

Header:

it;loop;curr;i;index;lon;lat;z;yo;yb;ya;yav;dh;sig2;flags_d;scores_d;flags_c;scores_c;saved_c;flags_r;scores_r;saved_r;flags;sct_scores;
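When post-processing the diagnostic file, it can help to look columns up by name rather than by position. The sketch below splits the header line on ';' and records each column's zero-based position. The `column_index` helper is hypothetical, and nothing is assumed here about the layout of the data rows.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Map each column name in a ';'-separated header line to its zero-based position.
// A trailing ';' does not produce an extra column.
std::map<std::string, std::size_t> column_index(const std::string& header) {
    std::map<std::string, std::size_t> index;
    std::istringstream stream(header);
    std::string name;
    std::size_t position = 0;
    while (std::getline(stream, name, ';')) {
        if (!name.empty()) {
            index[name] = position;
        }
        ++position;
    }
    return index;
}
```

For the header shown above, `column_index(header).at("yo")` gives the position of the observed value and `at("sct_scores")` the position of the final score.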