Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Event onsets MUST NOT be negative #2075

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

yarikoptic
Copy link
Collaborator

@yarikoptic yarikoptic commented Mar 3, 2025

I found not a single openneuro datasets with negative onsets.

smaug:/mnt/datasets/datalad/crawl/openneuro
*$> for ds in ds*; do git grep '^ *-' '**_events.tsv' | head ; done

But ATM we are looking at a dataset with some buggy _events.tsv which has huge negative values sorted within onsets column. I think in general/theory in some .1% of the cases there could be point to have onsets with negative onsets, e.g. due to desire to encode some preparatory to the data scanning activities (e.g. during dummy scans etc). So for those -- it would be nice to have a dedicated ERROR code to ignore/overwrite. But in general, it just signals a problem with the file and thus IMHO should be an ERROR.

TODOs (@effigies @rwblair @tsalo - please guide)

  • how to handle n/a values there? (should be ok)
  • how/where to add a test?

Sorry, something went wrong.

I found not a single openneuro datasets with negative onsets.

    smaug:/mnt/datasets/datalad/crawl/openneuro
    *$> for ds in ds*; do git grep '^ *-' '**_events.tsv' | head ; done

But ATM we are looking at a dataset with some buggy _events.tsv which has huge
negative values sorted within onsets column.  I think in general/theory in some
.1% of the cases there could be point to have onsets with negative onsets, e.g.
due to desire to encode some preparatory to the data scanning activities  (e.g.
during dummy scans etc).  So for those -- it would be nice to have a dedicated
ERROR code to ignore/overwrite.  But in general, it just signals a problem with
the file and thus IMHO should be an ERROR.
@yarikoptic yarikoptic requested a review from erdalkaraca as a code owner March 3, 2025 16:34
Copy link

codecov bot commented Mar 3, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.58%. Comparing base (1e99fc3) to head (00be46e).
Report is 2 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #2075   +/-   ##
=======================================
  Coverage   82.58%   82.58%           
=======================================
  Files          17       17           
  Lines        1499     1499           
=======================================
  Hits         1238     1238           
  Misses        261      261           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@rwblair
Copy link
Member

rwblair commented Mar 3, 2025

- expression: min([-1, "n/a", 1])                                                                                        
  result: -1

In the expression_tests.yaml negative values are treated as less than "n/a". I believe the deno validator handles this case in its implementation.

yarikoptic added a commit to yarikoptic/bids-specification that referenced this pull request Mar 3, 2025
As @tsalo pointed out in bids-standard#2075 (comment)
there is a test in expression_tests.yaml which ensures that -1 is less than n/a
@Remi-Gau
Copy link
Collaborator

Remi-Gau commented Mar 3, 2025

that would be breaking backwards compatibility, no?

for as far as I can remember negative onsets have been explicitly allowed: https://bids-specification.readthedocs.io/en/stable/glossary.html#onset-columns

Description: Onset (in seconds) of the event, measured from the beginning of the acquisition of the first data point stored in the corresponding task data file. Negative onsets are allowed, to account for events that occur prior to the first stored data point. For example, in case there is an in-scanner training phase that begins before the scanning sequence has started events from this sequence should have negative onset time counting down to the beginning of the acquisition of the first volume.

Copy link
Collaborator

@effigies effigies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be a breaking change:

https://bids-specification.readthedocs.io/en/stable/modality-specific-files/task-events.html

Column name Level Data type Description
onset REQUIRED number Onset (in seconds) of the event, measured from the beginning of the acquisition of the first data point stored in the corresponding task data file. Negative onsets are allowed, to account for events that occur prior to the first stored data point. [...]

There's very little point in including an onset <-10s. We could set a threshold at, say, -30s or -60s to allow for unconsidered cases while identifying probable errors.

If we are going to set a lower bound in the schema, it should also be indicated in the text.

@yarikoptic
Copy link
Collaborator Author

d'oh @effigies , and I thought I checked for that being spelled out!

I am Ok with making it "up to 60 seconds" or any other reasonable amount for an ERROR. I wonder if it would still make sense to issue a warning if any negative is encountered given that none of the datasets seems to use them -- WDYT?

@Remi-Gau
Copy link
Collaborator

Remi-Gau commented Mar 3, 2025

Personally I'd prefer to be conservative not make it an error but just a warning as suggested by @VisLab
#2075 (comment)

@yarikoptic
Copy link
Collaborator Author

ok, made it into a warning and extended description -- or would such wording be not advisable there?

yarikoptic and others added 2 commits March 3, 2025 21:56
As @tsalo pointed out in
bids-standard#2075 (comment)
there is a test in expression_tests.yaml which ensures that -1 is less
than n/a

Fix comparison to be >= -- it is a check to pass.

Make it a warning and only on implausible negative (over 60 sec before)
events

Co-authored-by: Chris Markiewicz <[email protected]>
@yarikoptic yarikoptic force-pushed the enh-onsets-positive branch from c25f791 to 603de8d Compare March 4, 2025 02:59
- suffix == "events"
- extension == ".tsv"
checks:
- max(columns.onset) < 2678400
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, got "inspired" and added one for implausible positive one. I think not even .1% of users should stick more than a month worth of events into one file. WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure these checks are necessary. On the negative onsets, I think I have seen driving simulation experiments where there were simulator events and driver response events that were recorded in the events file before the EEG recording was turned on after the person was driving a while. The behavioral events were recorded through the whole thing, but the onset was zeroed at the point the EEG started.

On the upper end... I am not sure why this would be necessary either. Consider a wearable EKG recording --- it is possible to have long term monitoring....

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are necessary as I think they could help catch buggy data, as happened in our case and was identified only thanks due to HED validation! ;) (just a side effect)

The use cases where such onsets would be legit would constitute probably less than .1% of the studies given that I found none among existing openneuro for negative onsets altogether. In those few cases, it would be easy to add ignore for those warning codes and sleep happily after.

Even with EKG recording -- I think it is very unlikely (is there a single instance anywhere?) to have over a month long (I was very liberal) recording.

@Remi-Gau
Copy link
Collaborator

Remi-Gau commented Mar 4, 2025

Should the definition of onsets be amended to mention those warnings?

=== Do not change lines below ===
{
 "chain": [],
 "cmd": "git-sedi Implausible Suspicious; git-sedi IMPLAUSIBLE SUSPICIOUS",
 "exit": 0,
 "extra_inputs": [],
 "inputs": [],
 "outputs": [],
 "pwd": "."
}
^^^ Do not change lines above ^^^
@yarikoptic
Copy link
Collaborator Author

Should the definition of onsets be amended to mention those warnings?

Lazy me: I think they are of the kind of SUSPICIOUSLY_LONG_EVENT_DESIGN and SUSPICIOUSLY_SHORT_EVENT_DESIGN hint types.. I do not think we document for them.

Doer me: I pushed change to use SUSPICIOUS_ instead of IMPLAUSIBLE_ prefix here for consistency

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants