Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Include closed milestones in milestones stream #300

Merged

Conversation

TrishGillett
Copy link
Contributor

@TrishGillett TrishGillett commented Aug 28, 2024

⚠️ This may be a disruptive change for some users who are only interested in open milestones and are relying on the fact that this tap only pulls open milestones. These existing users would need to apply a stream map or downstream filter in order to continue excluding closed milestones. Update: we've landed on a way to implement it as optional, with the stream still extracting only open milestones by default.

Why is it worth making this potentially disruptive change?
If a user is using this tap to maintain and update a data source about milestones and their states/attributes, but each run only extracts data about open milestones, all past and current milestones will show up as open whether they are open or closed. Other fields (especially counts of open/closed issues, closed_at, updated_at) would contain potentially stale data corresponding to the last time they were observed before they were closed. By extracting data about all milestones, previously extracted milestones which are now closed can be updated to a closed state and have any other changed fields updated as well.

The implementation approach for this change follows the pattern used in streams like IssuesStream and PullRequestsStream when diverging from the default query parameters.

@TrishGillett TrishGillett changed the title Include closed milestones in milestones stream feat: Include closed milestones in milestones stream Aug 28, 2024
@edgarrmondragon edgarrmondragon self-assigned this Aug 29, 2024
@edgarrmondragon edgarrmondragon added the enhancement New feature or request label Aug 29, 2024
@edgarrmondragon
Copy link
Member

@TrishGillett wdyt about putting behind a (new) setting to make it opt-in?

@TrishGillett
Copy link
Contributor Author

TrishGillett commented Sep 9, 2024

@TrishGillett wdyt about putting behind a (new) setting to make it opt-in?

I'd be down for that! If we can make it non-disruptive for existing users that's a big win. Are you thinking of:

  • An alternative stream similar to how stargazers rest and stargazers graphql coexist?
  • Some kind of stream specific setting on the existing stream? I don't think I've seen examples of something like that so a hint or reference would be appreciated!
  • A top level setting like start_date, auth_token, etc, but that would only change behaviour for this stream?

@edgarrmondragon
Copy link
Member

A top level setting like start_date, auth_token, etc, but that would only change behaviour for this stream?

I think this is my preference. Not ideal that a top-level setting affects the behavior of a stream, but there's some precedence in the ecosystem so I think we can live with it.

@TrishGillett
Copy link
Contributor Author

TrishGillett commented Sep 13, 2024

Not ideal that a top-level setting affects the behavior of a stream, but there's some precedence in the ecosystem so I think we can live with it.

Yeah, it's not my favourite either. I think the least-bad way to do a top-level setting might be to structure it in a way where we won't hate ourselves later if more cases like this come up in other streams and we need to add more.

(Full disclosure: I've been looking into Github's GraphQL API lately and thinking about how there may be reasons to make GraphQL versions of some streams, and I can picture wanting to add more stream options in that case since GraphQL offers a lot more flexibility. For example, a GraphQL issues stream could support an end date, and hopefully also an option to only get data about issues, instead of the default which includes data about PRs too.)

For a pattern, how would you feel about one of these:
Option 1 (It adds a little bulk to the config but the organization is a little nice IMO)

config:
  milestones_stream_params:
    state: all
  other_stream_params:
    thing: value_1
    other_thing: value_2

Option 2

config:
  milestones_stream_state_param: all
  other_stream_thing_param: value_1
  other_stream_other_thing_param: value_1

@TrishGillett TrishGillett force-pushed the include-closed-milestones branch 2 times, most recently from d2da696 to 5638c38 Compare September 17, 2024 23:07
@edgarrmondragon
Copy link
Member

Not ideal that a top-level setting affects the behavior of a stream, but there's some precedence in the ecosystem so I think we can live with it.

Yeah, it's not my favourite either. I think the least-bad way to do a top-level setting might be to structure it in a way where we won't hate ourselves later if more cases like this come up in other streams and we need to add more.

(Full disclosure: I've been looking into Github's GraphQL API lately and thinking about how there may be reasons to make GraphQL versions of some streams, and I can picture wanting to add more stream options in that case since GraphQL offers a lot more flexibility. For example, a GraphQL issues stream could support an end date, and hopefully also an option to only get data about issues, instead of the default which includes data about PRs too.)

For a pattern, how would you feel about one of these: Option 1 (It adds a little bulk to the config but the organization is a little nice IMO)

config:
  milestones_stream_params:
    state: all
  other_stream_params:
    thing: value_1
    other_thing: value_2

Option 2

config:
  milestones_stream_state_param: all
  other_stream_thing_param: value_1
  other_stream_other_thing_param: value_1

Hey @TrishGillett, sorry for the delay in reviewing. I like Option 1 😄. Wdyt of shorter names, e.g.

config:
  milestones:
    state: all
  all_streams:
    thing: value_1
    other_thing: value_2

Otherwise I think "nested" is the right approach.

@TrishGillett
Copy link
Contributor Author

Cool! Will implement. 🙌

@TrishGillett
Copy link
Contributor Author

Pushed an update. This can be considered fair game for review, although it shouldn't be merged just yet because I still need to do a test run and confirm it works as expected.

@TrishGillett TrishGillett force-pushed the include-closed-milestones branch 2 times, most recently from 1df0367 to 24aa4ce Compare September 19, 2024 21:00
tap_github/tap.py Outdated Show resolved Hide resolved
meltano.yml Outdated Show resolved Hide resolved
@TrishGillett
Copy link
Contributor Author

Done test runs:

  • ✅ When not using the new config, you still get only open milestones
  • ✅ When using the new config with state = all, you get all milestones
    🎉

@TrishGillett TrishGillett force-pushed the include-closed-milestones branch 2 times, most recently from 196814e to 5a291a6 Compare October 2, 2024 06:33
Copy link

sonarcloud bot commented Oct 17, 2024

Copy link
Member

@edgarrmondragon edgarrmondragon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Thanks for your patience @TrishGillett 😄

@edgarrmondragon edgarrmondragon merged commit 3002f2b into MeltanoLabs:main Oct 17, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants