timestep, timehorizon, timeseries need new place and def #267

Closed
1 of 7 tasks
akleinau opened this issue Mar 4, 2020 · 47 comments · Fixed by #538 or #570
Labels: [B] restructure (Restructuring existing parts of the ontology) · oeo dev meeting (Discuss issue at oeo dev meeting) · oeo-physical (changes the oeo-physical module)

Comments

@akleinau
Contributor

akleinau commented Mar 4, 2020

Description of the issue

TimeStep, TimeHorizon and TimeSeries are currently variables and without a definition.

Ideas of solution

TimeStep: A TimeStep is a temporal region (?) stating the time between two calculations or measurements made.

  • or: turn into properties has_temporal_resolution, has_number_of_time_steps

TimeHorizon: A TimeHorizon is a temporal region (?) stating a specific point in time at which specific events will be reviewed or should end.

TimeSeries: A TimeSeries is a data set storing data indexed by time.

Workflow checklist

  • I discussed the issue with someone else than me before working on a solution
  • I already read the latest version of the workflow for this repository
  • I added this issue to the Project 'Issues'. If suitable, I add it to further Projects.
  • The goal of this ontology is clear to me

I am aware that

  • every entry in the ontology should have an annotation
  • classes should arise from concepts rather than from words
  • class or property names should follow the UpperCamelCase
@akleinau akleinau added the [B] restructure Restructuring existing parts of the ontology label Mar 4, 2020
@akleinau akleinau added the oeo-physical changes the oeo-physical module label Mar 12, 2020
@akleinau
Contributor Author

delete to leave out in first release

@akleinau akleinau added this to the oeo-release-0.0.1 milestone Apr 15, 2020
@akleinau akleinau removed this from the oeo-release-0.0.1 milestone Apr 15, 2020
@l-emele l-emele added this to the oeo-release-1.1 milestone Jun 25, 2020
@Vera-IER
Contributor

I don't like the term temporal region. I just googled it to see if that's something you can say in English, and it's actually a term for a part of the brain ;-)
We could maybe use model property or model characteristic instead.
So the def would be:
A time step is a model property that describes the time period between two calculations or measurements made.
etc.

@akleinau
Contributor Author

temporal region is an already implemented class of the BFO with the definition: "A temporal region is an occurrent entity that is part of time as defined relative to some reference frame".
In my opinion this fits better than model property, as it is already there and more specific about the main aspect of these concepts: the description of some time period.

@stap-m
Contributor

stap-m commented Jul 29, 2020

I agree, classes time step and time horizon should be classified as 1-dimensional temporal regions. And they definitely need to be related to models / scenarios / time series.
Also, time step needs to be related to a quantity value (e.g. time step = 15 min).

@akleinau
Contributor Author

on further thinking I think time step should actually be a quantity value?
It is just a description of a portion of time, no fixed moment in time

@akleinau akleinau pinned this issue Jul 29, 2020
@akleinau akleinau self-assigned this Jul 30, 2020
@0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q
Contributor

Also, time step needs to be related to a quantity value (e.g. time step = 15 min).

It is just a description of a portion of time, no fixed moment in time

I think there are two perspectives on time steps: on the one hand a description of a model or scenario (e.g. "how many time steps of what duration does this model calculate?"), and on the other hand the description of data (e.g. "to which time step does this specific datum belong?"). Since both need to be accommodated, the model description either needs to list all time steps that are calculated -- tedious for 35040 15-minute time-slices to a year -- or we need two different "time step" classes.

For the "data perspective", a time step is defined by start and end, because it is a fixed moment in time (or rather, a region of time). The denomination of the time steps is arbitrary and differs between fields. E.g. in meteorology, observations are designated by the end time of the time interval (so for 15-minute time steps, 08:15 would be the observations between 08:00 and 08:15, if I remember this discussion correctly @carstenhoyerklick), whereas the IPCC data sets for the Assessment Reports use the mid-point (so 2035 refers to data from the beginning of 2033 to the end of 2037). So

  • time step [data perspective] would have two attributes
    • start time
    • end time

I don't know what the crucial characteristics for the "model perspective" are. I guess number and size of time steps. I just want to point out that these need not be homogeneous. Our model uses five-, ten- and twenty-year time steps at the same time (because it is computationally cheaper and the temporal resolution is important for the short term, but not so much when you look 100 years out).

@0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q
Contributor

Related to #474.

@akleinau
Contributor Author

akleinau commented Aug 5, 2020

These are two perspectives, yes: one that looks at a single time step and one that looks at all time steps in a model. The underlying concept of "timestep" remains the same, though, and should be treated as one concept. It can have the start and end time you proposed, which are used directly for your data perspective. It can also have a duration property, so to describe your model perspective we can just relate the model to the time steps, e.g. "model has x instances of the timestep class", and those instances have duration y.

So

  • one class: Timestep with start, end, duration
  • model has exactly x timesteps, each of duration y

@stap-m
Contributor

stap-m commented Aug 5, 2020

on further thinking I think time step should actually be a quantity value?
It is just a description of a portion of time, no fixed moment in time

No, I think time step itself is still a 1-dimensional temporal region. But we should add further relations and quantity values. For example

  • time step has quantity value duration
  • and / or time step has quantity value start time and end time

@0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q
Contributor

  • one class: Timestep with start, end, duration

This makes me cringe a little, because duration would be derivative to start time and end time, allowing for inconsistent definitions. Would there be a "generic" time step, that only has duration, but no position in time? What would be the use case for it?
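
One way to sidestep that inconsistency is to store only start and end and derive the duration from them. The following is a minimal Python sketch of that idea; the class name `TimeStep` and its fields are illustrative assumptions, not classes or properties of the OEO.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class TimeStep:
    """Hypothetical time step in the data perspective: a region of time."""
    start: datetime
    end: datetime

    def __post_init__(self) -> None:
        # Reject regions that end before they start.
        if self.end < self.start:
            raise ValueError("end must not be earlier than start")

    @property
    def duration(self) -> timedelta:
        # Derived, never stored, so it cannot contradict start and end.
        return self.end - self.start


step = TimeStep(
    start=datetime(2020, 8, 5, 8, 0, tzinfo=timezone.utc),
    end=datetime(2020, 8, 5, 8, 15, tzinfo=timezone.utc),
)
print(step.duration)  # 0:15:00
```

A "generic" time step that has a duration but no position in time would not fit this sketch; it would need a separate representation.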

@akleinau
Contributor Author

akleinau commented Aug 5, 2020

use case would be that instead of giving start and end time of every timestep used in a model when looking at the model view we can just state the duration (that is typically the same for all).
Yes, I thought about the inconsistency problem too. Maybe just include start and duration? So leave end out?

@0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q
Contributor

instead of giving start and end time of every timestep used in a model when looking at the model view we can just state the duration

One doesn't need a class for that. One could also just give the model number of time steps and duration of time step attributes.

(that is typically the same for all)

This domain ontology is a joint effort to represent the typical energy-system modelling context based on standard terminologies used by human experts in this field of research.

Fixed. 😎
But probably it makes sense to allow models to either just state the number and duration of time steps, or give a comprehensive list. FYI: there are power sector models using different time step setups depending on the data they run on.

Yes, I thought about the inconsistency problem too. Maybe just include start and duration? So leave end out?

I'm not sure if this has to be addressed in the ontology. Users will always find a way to define nonsensical data. If a "generic time step" having only duration proves useful, then go for it and make the attributes optional. If not, personally, I would default to start time and end time, but in the end it doesn't matter.

@stap-m
Contributor

stap-m commented Aug 6, 2020

Ok, here comes a differentiation to not mix up time series and time step:
We need time series with

  • has quantity value start time and (optional?) end time
  • has quantity value number of time steps
  • has part some time step

We need time step with

  • has quantity value duration
  • optional has quantity value start time and end time --> and it lies in the responsibility of the users to not assign nonsense?

Do you agree?

@0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q
Contributor

Do you agree?

I for one do in principle agree, but this is not sufficient.

tl;dr: Time series are not necessarily homogenous (composed of time steps of identical duration), nor continuous. Therefore a time series is not always completely identified by start time, number of time steps, and time step duration.

  • has quantity value start time and (optional?) end time
  • has quantity value number of time steps

If "has part some time step" links to only one "generic" time step (with duration, but without start time or end time), this presupposes homogeneous time steps, which is not always the case.

A concrete example: the REMIND model uses 19 time steps that vary between five and 20 years in duration. We don't care about the precise start and end times and durations in our work, especially not down to the second, but this would be the formal definition.

| number | period | start time | end time | duration |
|---|---|---|---|---|
| 1 | 2005 | 2003-01-01 00:00:00 UTC | 2008-01-01 00:00:00 UTC | 5 years |
| 2 | 2010 | 2008-01-01 00:00:00 UTC | 2013-01-01 00:00:00 UTC | 5 years |
| 3 | 2015 | 2013-01-01 00:00:00 UTC | 2018-01-01 00:00:00 UTC | 5 years |
| 4 | 2020 | 2018-01-01 00:00:00 UTC | 2023-01-01 00:00:00 UTC | 5 years |
| 5 | 2025 | 2023-01-01 00:00:00 UTC | 2028-01-01 00:00:00 UTC | 5 years |
| 6 | 2030 | 2028-01-01 00:00:00 UTC | 2033-01-01 00:00:00 UTC | 5 years |
| 7 | 2035 | 2033-01-01 00:00:00 UTC | 2038-01-01 00:00:00 UTC | 5 years |
| 8 | 2040 | 2038-01-01 00:00:00 UTC | 2043-01-01 00:00:00 UTC | 5 years |
| 9 | 2045 | 2043-01-01 00:00:00 UTC | 2048-01-01 00:00:00 UTC | 5 years |
| 10 | 2050 | 2048-01-01 00:00:00 UTC | 2053-01-01 00:00:00 UTC | 5 years |
| 11 | 2055 | 2053-01-01 00:00:00 UTC | 2058-01-01 00:00:00 UTC | 5 years |
| 12 | 2060 | 2058-01-01 00:00:00 UTC | 2065-07-02 12:00:00 UTC | 7.5 years |
| 13 | 2070 | 2065-07-02 12:00:00 UTC | 2075-07-02 12:00:00 UTC | 10 years |
| 14 | 2080 | 2075-07-02 12:00:00 UTC | 2085-07-02 12:00:00 UTC | 10 years |
| 15 | 2090 | 2085-07-02 12:00:00 UTC | 2095-07-02 12:00:00 UTC | 10 years |
| 16 | 2100 | 2095-07-02 12:00:00 UTC | 2105-07-02 12:00:00 UTC | 10 years |
| 17 | 2110 | 2105-07-02 12:00:00 UTC | 2120-07-02 12:00:00 UTC | 15 years |
| 18 | 2130 | 2120-07-02 12:00:00 UTC | 2140-07-02 12:00:00 UTC | 20 years |
| 19 | 2150 | 2140-07-02 12:00:00 UTC | 2167-07-02 12:00:00 UTC | 27 years |

Other IAMs do this differently. E.g.

The carbon price for a given model year t is usually assumed to be constant over the length of the time step Δt (either from time t-1 to t or from t-Δt/2 to t+Δt/2, depending on the model).

(From the Model Diagnostic Exercise – Study Protocol of the ADVANCE Project)

The Assessment Report data of the IPCC on the other hand is not specific about what a time step like 2030 actually refers to. But one interpretation (held by the people at IIASA, who are hosting the data) is that it denotes that specific year, in which case the time series is not continuous, but has ten-year gaps in-between:
2029 2030 2031 2032 2033 2034 2035 2036 2037 2038 2039 2040 2041
Practically, it makes little difference and is interpolated away. But the ontology must be able to represent it.
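
To make the two properties concrete, here is a small Python sketch that tests a list of (start time, end time) pairs for continuity and homogeneity. The helper names `utc`, `is_continuous` and `is_homogeneous` are made up for illustration; the sample data are rows 11 to 13 of the REMIND table and the single-year reading of "2030" and "2040".

```python
from datetime import datetime, timezone


def utc(year, month=1, day=1, hour=0):
    """Shorthand for a timezone-aware UTC datetime."""
    return datetime(year, month, day, hour, tzinfo=timezone.utc)


def is_continuous(steps):
    """True if every step starts exactly where the previous one ends."""
    return all(prev[1] == nxt[0] for prev, nxt in zip(steps, steps[1:]))


def is_homogeneous(steps):
    """True if all steps have exactly the same duration."""
    return len({end - start for start, end in steps}) <= 1


# Rows 11 to 13 of the REMIND table: continuous, but not homogeneous.
remind = [
    (utc(2053), utc(2058)),
    (utc(2058), utc(2065, 7, 2, 12)),
    (utc(2065, 7, 2, 12), utc(2075, 7, 2, 12)),
]

# "2030" and "2040" read as single calendar years: there is a gap in between.
single_years = [
    (utc(2030), utc(2031)),
    (utc(2040), utc(2041)),
]

print(is_continuous(remind), is_homogeneous(remind))  # True False
print(is_continuous(single_years))                    # False
```

Note that comparing exact durations treats calendar years of different length (leap years) as different, so a real check would probably compare nominal durations such as "5 years" instead.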

We need time step with

  • has quantity value duration
  • optional has quantity value start time and end time

I still don't see the use for the generic time step, but sure.

--> and it lies in the responsibility of the users to not assign nonsense?

Sure.

@stap-m
Contributor

stap-m commented Aug 6, 2020

tl;dr: Time series are not necessarily homogenous (composed of time steps of identical duration), nor continuous. Therefore a time series is not always completely identified by start time, number of time steps, and time step duration.

I agree. Any ideas for further relations?

If "has part some time step" links to only one "generic" time step (with duration, but without start time or end time), this presupposes homogeneous time steps, which is not always the case.

Indeed. If the relation is has part some time step, then several, also inhomogeneous, time steps could be assigned. Please correct me if I am getting this wrong, @akleinau.

I still don't see the use for the generic time step, but sure.

What would be your choice?

@0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q
Contributor

What would be your choice?

One could also just give the model number of time steps and duration of time step attributes.

@stap-m
Contributor

stap-m commented Aug 10, 2020

One could also just give the model number of time steps and duration of time step attributes.

So-called "attributes" (not really an ontological term, though) are implemented also as classes (often as dependent continuants) that are related via properties (e.g. has quantity value, has part, ...) to an independent continuant. See also wiki and OpenEnergyPlatform/oeo-extended#5.
I don't know if it's even possible to implement one class number of time steps. How would it be classified? "number of" is dependent on something, e.g. time steps. The same goes for "duration of".
And if you have generic classes number, time step, duration, ... you can also reuse them for other purposes.

@0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q
Contributor

How do you plan on representing the hub height of a wind turbine (to use a popular example from the dev meetings)? Is there going to be a class for it, with subclasses for every possible hub height?

@stap-m
Contributor

stap-m commented Aug 10, 2020

Is there going to be a class for it, with subclasses for every possible hub height?

Yes, there is going to be a class hub height, defining the concept of hub height.
But instead of having subclasses (or rather instances) of possible hub height values, this class hub height would be related via has quantity value to a quantity value length value (or height value, or whatever it ends up being called), which is related per definition to a value (data property, e.g. 140) and a unit (e.g. metre).
Whether the value instances of length value will be stored within the OEO or not has to be discussed. I can't answer that yet. But I'll put this question on the agenda for the developer meeting on Thursday.
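
As a rough illustration of that pattern outside of OWL, the Python sketch below models the concept and its quantity value as two separate types. The names `QuantityValue`, `HubHeight` and `has_quantity_value` are stand-ins chosen for this example and are not the actual OEO identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuantityValue:
    """Hypothetical stand-in for a quantity value: a number plus a unit."""
    value: float
    unit: str


@dataclass(frozen=True)
class HubHeight:
    """Hypothetical stand-in for the concept 'hub height'."""
    has_quantity_value: QuantityValue


# One concept class, any number of values attached to its instances.
hub_height = HubHeight(has_quantity_value=QuantityValue(value=140, unit="metre"))
print(hub_height.has_quantity_value)  # QuantityValue(value=140, unit='metre')
```

The point of the split is that no subclass per possible height is needed; the number and the unit live in the related value.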

@l-emele
Contributor

l-emele commented Aug 10, 2020

Whether the value instances of length value will be stored within the OEO or not has to be discussed.

My understanding is that data like that will not be stored in the OEO but in the OEP database as one major use case of the OEO is to annotate the data in the OEP.

@l-emele
Contributor

l-emele commented Aug 19, 2020

Ping @christian-rli : Any thoughts?

@carstenhoyerklick
Contributor

I would go with the the time step, as especially for irradiation you need to know when this time interval was or is, e.g. if you need to relate it to solar geometry.

Doesn't that all boil down to and ?

Some data sets would use 11:30 for the 11-12h value, some use 11h.

To me that's just a label that is attached to the time step in the data set. It is up to the data provider to (by whatever means that should be done at some point) describe the "11:30" item in the data set to mean "a time step that has some part start time (11:01) and has some part end time (12:00)".

In an ideal world yes, if we could force everyone to and that would be perfect. The second best ist let them define the and the , then it would be equivalent. So it is alternatives.

@0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q
Contributor

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented Sep 17, 2020

In an ideal world yes, if we could force everyone to and that would be perfect. The second best ist let them define the and the , then it would be equivalent. So it is alternatives.

🤔

In an ideal world yes, if we could force everyone to <start time> and <end time> 
that would be perfect. The second best ist let them define the <duration> and 
the <time stamp>, then it would be equivalent. So it is alternatives. 
  1. Don't make up your own HTML tags ;) (Or maybe tell your e-mail client not to.)

  2. But we can force them, by structuring the ontology in that way! Data providers will have to annotate their data set in any case, and hopefully will do so programmatically. Extracting start time and end time is only marginally more elaborate than extracting time stamp and duration,

switch (timestamp_meaning) {
	case TS_BEGIN:
		start_time = timestamp;
		end_time   = timestamp + duration;
		break;
		
	case TS_END:
		start_time = timestamp - duration;
		end_time   = timestamp;
		break;
		
	case TS_MIDDLE:
		start_time = timestamp - duration / 2;
		end_time   = timestamp + duration / 2;
		break;
		
	case TS_INSTANTANEOUS:
		start_time = timestamp;
		end_time   = timestamp;
		break;
}

and anybody who can't manage that will fail several times over in other areas with the ontology.

  3. (Arguing from a LOD-GEOSS perspective):
    If we were to have multiple definitions of time steps (start time and end time [SE] or time stamp, duration, meaning of time stamp [TsDM]), there would have to be a conversion between them on the Databus in any case, in order for users to extract data in their preferred format. So the [TsDM] → [SE] conversion could also be used up front, allowing easy uploading to the Databus, and not burdening the ontology with two equivalent yet different definitions.
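
For anyone who wants to try the [TsDM] → [SE] conversion, the switch above could look roughly like this in runnable Python. The enum `TimestampMeaning` and the function `to_start_end` are names invented for this sketch; nothing here is an existing Databus or OEO API.

```python
from datetime import datetime, timedelta
from enum import Enum


class TimestampMeaning(Enum):
    BEGIN = "begin"
    END = "end"
    MIDDLE = "middle"
    INSTANTANEOUS = "instantaneous"


def to_start_end(timestamp, duration, meaning):
    """Convert [TsDM] (time stamp, duration, meaning) to [SE] (start, end)."""
    if meaning is TimestampMeaning.BEGIN:
        return timestamp, timestamp + duration
    if meaning is TimestampMeaning.END:
        return timestamp - duration, timestamp
    if meaning is TimestampMeaning.MIDDLE:
        return timestamp - duration / 2, timestamp + duration / 2
    if meaning is TimestampMeaning.INSTANTANEOUS:
        return timestamp, timestamp
    raise ValueError(f"unknown time stamp meaning: {meaning!r}")


# Meteorological convention mentioned earlier: 08:15 labels 08:00 to 08:15.
start, end = to_start_end(
    datetime(2020, 9, 17, 8, 15), timedelta(minutes=15), TimestampMeaning.END
)
print(start.time(), end.time())  # 08:00:00 08:15:00
```

The date used in the example is arbitrary; only the time of day matters for the illustration.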

I would go with the the time step, as especially for irradiation you need to know when this time interval was or is, e.g. if you need to relate it to solar geometry.

Good point. I yield to the expert ;)

@carstenhoyerklick
Contributor

Sorry I edited directly on GitHub, .. I misinterpreted the coding style.

I am mostly convinced, with one exception: we can enforce this for new data sets, but what do we do if we want to annotate old existing data sets? Convert them to start time and end time and republish them?

@carstenhoyerklick
Contributor

p.s. also holds true for platforms that already regularly publish data as transparency platforms. We would need to force them to different publishing formats.

@0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q
Contributor

Convert them to start time and end time and republish them?

You seem to assume some automatic connection between the data set and the ontology. As I understand it, all these connections have to be explicitly stated, so "old" and "new" makes no difference.

but what do we do if we want to annotate old existing data sets

What do these data sets look like? I don't know, but I imagine something like

| geographic reference | time stamp | Irradiation |
|---|---|---|
| some grid square | 10:30 | some number |
| some grid square | 11:30 | some number |
| some grid square | 12:30 | some number |
| ... | ... | ... |

with meta data attached specifying that "time stamp" means "the middle of the time step" and "time step duration is 1 hour". Then there has to be a link detailing the connection between the data item "11:30" and the ontology element "a time step that has some part start time (11:01) and has some part end time (12:00)."

I'm not sure how this connection is to be made (see LOD-GEOSS Redmine, maybe @Ludee can clarify), but I don't see any difference between "old" and "new" data sets. If the data set had the format

| geographic reference | start time | end time | Irradiation |
|---|---|---|---|
| some grid square | 10:01 | 11:00 | some number |
| some grid square | 11:01 | 12:00 | some number |
| some grid square | 12:01 | 13:00 | some number |
| ... | ... | ... | ... |

there would still have to be a link saying "this specific line concerns 'a time step that has some part start time (11:01) and has some part end time (12:00)'".

@0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q
Contributor

Also holds true for platforms that already regularly publish data as transparency platforms. We would need to force them to different publishing formats.

Maybe this might need a wider discussion or explanation on an OEO dev call. My understanding (that might be incorrect) is that the ontology is agnostic to the format data is in, but used to annotate the meaning of the data.

But if different formats are needed, isn't (automatic) republishing in "better" formats a core feature of the Databus? ;)

@carstenhoyerklick
Contributor

To my understanding, we may need all the definitions to be able to annotate all the data, also before republishing it. To be able to interpret the time information, we may also need the time stamp and duration concepts for those who are not willing or able to include the start time and end time information in their data sets.

@stap-m
Contributor

stap-m commented Sep 17, 2020

time stamp definitely is a common term among energy system modelers, thus it should be part of the OEO.
We had a discussion about time stamps when we implemented time series metadata for the OEP. For the metadata we solved it like this (here's the full example file):

"temporal": {
    "referenceDate": "2016-01-01",
    "timeseries": {
        "start": "2017-01-01T00:00+01",
        "end": "2017-12-31T23:00+01",
        "resolution": "1 h",
        "alignment": "left",
        "aggregationType": "sum"
    }
}

alignment means 11:01 (left) or 11:30 (centre) or 12:00 (right), referring to the above mentioned example.
aggregation type could be sum/integrated, mean, instantaneous
Maybe this could help for a solution.
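
To show how these metadata keys could be turned back into individual time steps, here is a small Python sketch. It assumes that `start` and `end` are the first and last time stamp and that `alignment` says where a stamp sits inside its time step; both are my reading of the example, not something the metadata format is confirmed to guarantee. The values are written out by hand rather than parsed, and `step_interval` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

# Values copied from the metadata example; the "+01" offset and the "1 h"
# resolution are written out by hand here instead of being parsed.
UTC_PLUS_1 = timezone(timedelta(hours=1))
start = datetime(2017, 1, 1, 0, 0, tzinfo=UTC_PLUS_1)
end = datetime(2017, 12, 31, 23, 0, tzinfo=UTC_PLUS_1)
resolution = timedelta(hours=1)
alignment = "left"

# Assumption: "start" and "end" are the first and last time stamp.
number_of_time_steps = (end - start) // resolution + 1

# Assumption: the alignment says where the stamp sits inside its time step.
offset = {"left": 0.0, "centre": 0.5, "right": 1.0}[alignment] * resolution


def step_interval(index):
    """Return (start time, end time) of the time step with the given index."""
    stamp = start + index * resolution
    step_start = stamp - offset
    return step_start, step_start + resolution


print(number_of_time_steps)  # 8760
first_start, first_end = step_interval(0)
print(first_start.isoformat())  # 2017-01-01T00:00:00+01:00
print(first_end.isoformat())    # 2017-01-01T01:00:00+01:00
```

With `alignment` set to `"centre"` or `"right"`, the same stamps would describe intervals shifted by half a step or a full step.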

What's also missing is a concept for time standards like UTC, CET, ...

@carstenhoyerklick
Contributor

I think this is a good way, with alignment and aggregation type.

@sfluegel05
Contributor

Currently we have (among others) the following axioms for time series:

  • has part some start time
  • has part some ending time

In order to make time stamp usable we should replace them with (has part some start time and has part some end time) or (has part some time stamp and has part some alignment)

We also need definitions for the classes:

  • time stamp: A time stamp is a zero-dimensional temporal region that is used to describe a time series.
  • alignment (maybe time stamp alignment would be more clear): An alignment is a data descriptor that indicates the position of a time stamp in a time series.
    We could add left alignment, centre alignment and right alignment as Individuals and make them Instances of alignment (this would be analogous to data format and its instances)

@carstenhoyerklick
Contributor

carstenhoyerklick commented Sep 30, 2020

In general, it sounds very reasonable, only that the second option also needs a duration, so it would be something like ... or (has part some time stamp and has part some duration). Otherwise `centre alignment` and `right alignment` are undefined.

@0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q
Contributor

~~time series~~ time step (has part some start time and has part some ending time) or (has part some time stamp and has part some duration and has part some alignment)

@sfluegel05
Contributor

~~time series~~ time step (has part some start time and has part some ending time) or (has part some time stamp and has part some duration and has part some alignment)

Yes, but also time series. We should add this relation for time step and time series.
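
Read as a plain either/or rule, the proposed restriction can be sketched in Python as a check on which parts a description provides. This is only an illustration with made-up names (`satisfies_time_step_axiom`, string labels for the parts); it is a closed-world check and does not reproduce OWL semantics, where reasoning is open-world.

```python
def satisfies_time_step_axiom(parts):
    """Check the proposed either/or restriction on a set of part names."""
    start_end = {"start time", "ending time"}
    stamp_based = {"time stamp", "duration", "alignment"}
    # Either both boundaries are given, or stamp, duration and alignment are.
    return start_end <= parts or stamp_based <= parts


print(satisfies_time_step_axiom({"start time", "ending time"}))            # True
print(satisfies_time_step_axiom({"time stamp", "alignment"}))              # False
print(satisfies_time_step_axiom({"time stamp", "duration", "alignment"}))  # True
```

The second call shows why the duration was added to the stamp-based branch: without it, the time step's extent cannot be reconstructed.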

@akleinau
Contributor Author

akleinau commented Oct 7, 2020

this issue has 42 comments. Maybe it is a good idea to define an upper limit like 30 comments, after which an issue should be discussed in a dev meeting as it got too complex?

@stap-m stap-m added the oeo dev meeting Discuss issue at oeo dev meeting label Oct 8, 2020
@carstenhoyerklick
Contributor

I think, we are more or less done in this discussion. I think we may just call it to a close in the next dev meeting.

@Ludee
Member

Ludee commented Oct 14, 2020

OEO-TimeSeries

@0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q
Contributor

Yes, but also time series. We should add this relation for time step and time series.

For start time and end time I agree. But does anybody use time stamps for entire time series?

@Ludee Ludee linked a pull request Oct 14, 2020 that will close this issue
@sfluegel05
Contributor

For start time and end time I agree. But does anybody use time stamps for entire time series?

Maybe we can leave that part out at the moment. We can still implement it when we find someone who does use it.

After we discussed this issue in dev-meeting 10, I will implement the part concerning time stamp.
Also, I suggest to open two new issues, one about aggregation which is needed to describe a time step and another about start time and end time which don't work the way they should at the moment.

9 participants