-
Notifications
You must be signed in to change notification settings - Fork 172
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Event onsets MUST NOT be negative #2075
base: master
Are you sure you want to change the base?
Conversation
I found not a single openneuro datasets with negative onsets. smaug:/mnt/datasets/datalad/crawl/openneuro *$> for ds in ds*; do git grep '^ *-' '**_events.tsv' | head ; done But ATM we are looking at a dataset with some buggy _events.tsv which has huge negative values sorted within onsets column. I think in general/theory in some .1% of the cases there could be point to have onsets with negative onsets, e.g. due to desire to encode some preparatory to the data scanning activities (e.g. during dummy scans etc). So for those -- it would be nice to have a dedicated ERROR code to ignore/overwrite. But in general, it just signals a problem with the file and thus IMHO should be an ERROR.
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #2075 +/- ##
=======================================
Coverage 82.58% 82.58%
=======================================
Files 17 17
Lines 1499 1499
=======================================
Hits 1238 1238
Misses 261 261 ☔ View full report in Codecov by Sentry. |
In the expression_tests.yaml negative values are treated as less than "n/a". I believe the deno validator handles this case in its implementation. |
As @tsalo pointed out in bids-standard#2075 (comment) there is a test in expression_tests.yaml which ensures that -1 is less than n/a
that would be breaking backwards compatibility, no? for as far as I can remember negative onsets have been explicitly allowed: https://bids-specification.readthedocs.io/en/stable/glossary.html#onset-columns Description: Onset (in seconds) of the event, measured from the beginning of the acquisition of the first data point stored in the corresponding task data file. Negative onsets are allowed, to account for events that occur prior to the first stored data point. For example, in case there is an in-scanner training phase that begins before the scanning sequence has started events from this sequence should have negative onset time counting down to the beginning of the acquisition of the first volume. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be a breaking change:
https://bids-specification.readthedocs.io/en/stable/modality-specific-files/task-events.html
Column name | Level | Data type | Description |
---|---|---|---|
onset | REQUIRED | number | Onset (in seconds) of the event, measured from the beginning of the acquisition of the first data point stored in the corresponding task data file. Negative onsets are allowed, to account for events that occur prior to the first stored data point. [...] |
There's very little point in including an onset <-10s. We could set a threshold at, say, -30s or -60s to allow for unconsidered cases while identifying probable errors.
If we are going to set a lower bound in the schema, it should also be indicated in the text.
d'oh @effigies , and I thought I checked for that being spelled out! I am Ok with making it "up to 60 seconds" or any other reasonable amount for an ERROR. I wonder if it would still make sense to issue a warning if any negative is encountered given that none of the datasets seems to use them -- WDYT? |
Personally I'd prefer to be conservative not make it an error but just a warning as suggested by @VisLab |
ok, made it into a warning and extended description -- or would such wording be not advisable there? |
As @tsalo pointed out in bids-standard#2075 (comment) there is a test in expression_tests.yaml which ensures that -1 is less than n/a Fix comparison to be >= -- it is a check to pass. Make it a warning and only on implausible negative (over 60 sec before) events Co-authored-by: Chris Markiewicz <[email protected]>
c25f791
to
603de8d
Compare
- suffix == "events" | ||
- extension == ".tsv" | ||
checks: | ||
- max(columns.onset) < 2678400 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, got "inspired" and added one for implausible positive one. I think not even .1% of users should stick more than a month worth of events into one file. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure these checks are necessary. On the negative onsets, I think I have seen driving simulation experiments where there were simulator events and driver response events that were recorded in the events file before the EEG recording was turned on after the person was driving a while. The behavioral events were recorded through the whole thing, but the onset was zeroed at the point the EEG started.
On the upper end... I am not sure why this would be necessary either. Consider a wearable EKG recording --- it is possible to have long term monitoring....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are necessary as I think they could help catch buggy data, as happened in our case and was identified only thanks due to HED validation! ;) (just a side effect)
The use cases where such onsets would be legit would constitute probably less than .1% of the studies given that I found none among existing openneuro for negative onsets altogether. In those few cases, it would be easy to add ignore for those warning codes and sleep happily after.
Even with EKG recording -- I think it is very unlikely (is there a single instance anywhere?) to have over a month long (I was very liberal) recording.
Should the definition of onsets be amended to mention those warnings? |
=== Do not change lines below === { "chain": [], "cmd": "git-sedi Implausible Suspicious; git-sedi IMPLAUSIBLE SUSPICIOUS", "exit": 0, "extra_inputs": [], "inputs": [], "outputs": [], "pwd": "." } ^^^ Do not change lines above ^^^
Lazy me: I think they are of the kind of SUSPICIOUSLY_LONG_EVENT_DESIGN and SUSPICIOUSLY_SHORT_EVENT_DESIGN hint types.. I do not think we document for them. Doer me: I pushed change to use |
I found not a single openneuro datasets with negative onsets.
But ATM we are looking at a dataset with some buggy _events.tsv which has huge negative values sorted within onsets column. I think in general/theory in some .1% of the cases there could be point to have onsets with negative onsets, e.g. due to desire to encode some preparatory to the data scanning activities (e.g. during dummy scans etc). So for those -- it would be nice to have a dedicated ERROR code to ignore/overwrite. But in general, it just signals a problem with the file and thus IMHO should be an ERROR.
TODOs (@effigies @rwblair @tsalo - please guide)
n/a
values there? (should be ok)