Hoodie Community Dashboard #102
Comments
https://libraries.io/ has a lot of awesome stuff by @andrew which you probably know about.
1. Load by users
Stats like # issues/PRs should be easy to track via https://www.githubarchive.org/ or just the github API since it's pretty straightforward. Also big query, 2, and more. Actually just looking at the winners/entries for the data challenge might give some more inspiration in general - https://github.com/blog/1864-third-annual-github-data-challenge. Not sure why there wasn't one last/this year though. I know there was http://issuestats.com/ (might be down) that tracked avg time to close an issue/merge a PR + badges and a graph (won the github data challenge previously). You can also track activity of 3rd party stuff (tweets with Probably not very many data points but other things like # of blog posts related, # of meetups/conferences/videos, # of talks.
2. Active contributors
Comments are really good - also active conversations via slack/twitter or other ways we do community engagement. |
Thanks @hzoo, I think it would be nice to share some Big Query APIs We're also working on gathering contributor information in a google sheet as well. The sheet helps us keep track of the names and backgrounds of our contributors so that we can answer several maintainer questions:
We are also working on the right process for checking in on the community:
|
@gr2m Saw a tweet directing readers to this issue -- very exciting. I work in Berlin at Zalando (huge, publicly traded company) as open source evangelist. Would love to chat, but in the meantime wanted to share these links to some dashboards (apologies if they're already known to you);
I'm aware of other initiatives falling along these lines; happy to talk more. |
Hi there, willing to help with metrics :) |
Let me know if I can help pulling any statistics out of @librariesio for you |
BTW, I run a small analysis of ten projects of HoodieHQ, you can have a look at http://cauldron.io/dashboards/hoodiehq . This is based on the tech we have at grimoirelab as referenced by @LappleApple . You can see aggregated info per data source (Git, GitHub Issues and GitHub Pull Requests) and each panel provides several charts and tables. You can drill down, filter or even export the data and use your own viz. Indeed you have info you already mention such as the number of open pull requests and issues, people involved in them, time to close those pull requests and issues, time zone distribution of commits and developers and others. Hope this is useful! |
So happy you're doing this! A few more ideas:
Maybe average time til an issue/PR is closed, as well? Basically tracking how long it takes to resolve. I think support departments often track this.
Also the number of opened issues/PRs, i.e. the rate at which they are being opened each week/month/whatever (which is more about growth).
This is probably implied, but I'd also track the number of repeat contributors, the ratio of first-time to repeat, and how that changes over time.
If you haven't seen it, icecrime's vossibility project might be useful here. |
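For what it's worth, here is a minimal Python sketch of the three numbers suggested above (average time to close, items opened per week, first-time vs. repeat contributors). The `issues` records and all function names are hypothetical; in practice the data would come from the GitHub API or a similar source. "Repeat" is assumed here to mean "opened more than one item in the data set", which is only one possible definition.

```python
from collections import Counter
from datetime import datetime, timedelta
from statistics import mean

# Hypothetical records: (author, opened_at, closed_at or None if still open).
issues = [
    ("alice", datetime(2016, 12, 1), datetime(2016, 12, 3)),
    ("bob", datetime(2016, 12, 2), datetime(2016, 12, 8)),
    ("alice", datetime(2016, 12, 10), None),
    ("carol", datetime(2016, 12, 12), datetime(2016, 12, 13)),
]


def average_days_to_close(records):
    """Mean number of days between opening and closing; open items are ignored."""
    durations = [
        (closed - opened) / timedelta(days=1)
        for _, opened, closed in records
        if closed is not None
    ]
    return mean(durations) if durations else None


def opened_per_week(records):
    """Number of items opened, keyed by (ISO year, ISO week)."""
    return Counter(tuple(opened.isocalendar())[:2] for _, opened, _ in records)


def first_time_vs_repeat(records):
    """Return (authors with exactly one item, authors with more than one)."""
    per_author = Counter(author for author, _, _ in records)
    first_time = sum(1 for n in per_author.values() if n == 1)
    repeat = sum(1 for n in per_author.values() if n > 1)
    return first_time, repeat
```

Recomputing these per week or month and comparing the series would show how they change over time.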
+1 to @nayafia's ideas here. Nadia, are you saying that vossibility addresses your point? From the README I can't tell, exactly. @icecrime, do you have other docs? (If not, let me know if you need help on that; I tend to follow this README template I created for Zalando.) |
# of StackOverflow questions and average time until a StackOverflow response are also good things to track. Dunno about Slack, but for IRC there's https://botbot.me/ which tracks logs and can be used to calculate messages per day (probably need to average it over number of users, though, because of bots). |
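A rough Python sketch of the "messages per day, averaged over users, minus bots" idea. The log format and the `BOTS` set are assumptions for illustration only; this does not use any botbot.me API, and bot nicks would have to be identified by hand.

```python
from collections import defaultdict
from datetime import date

# Hypothetical chat log entries: (day, nick).
log = [
    (date(2016, 12, 1), "alice"),
    (date(2016, 12, 1), "alice"),
    (date(2016, 12, 1), "bob"),
    (date(2016, 12, 1), "ci-bot"),
    (date(2016, 12, 2), "carol"),
    (date(2016, 12, 2), "ci-bot"),
]
BOTS = {"ci-bot"}  # assumed to be known in advance


def messages_per_user_per_day(entries, bots=frozenset()):
    """For each day, human messages divided by distinct human nicks that spoke."""
    messages = defaultdict(int)
    users = defaultdict(set)
    for day, nick in entries:
        if nick in bots:
            continue  # skip bot traffic so it does not inflate the numbers
        messages[day] += 1
        users[day].add(nick)
    return {day: messages[day] / len(users[day]) for day in messages}


per_day = messages_per_user_per_day(log, BOTS)
```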
@LappleApple I meant that vossibility might be a useful tool for creating dashboards/visualizations of any GitHub data collected. vossibility-collector has a bit more info in its README. |
Thanks for the ping! Vossibility is a tool I created to help me manage the 🐳 Docker open source project. I do wish it was easier to consume or use in other projects, it's mostly a matter of documentation 😞 TL;DR: vossibility takes GitHub data, transforms it to extract the information you want and to enrich where necessary, sends all this to Elastic Search, and then you can use the wonderful Kibana as a frontend. These are some examples of how I'm using vossibility today:
Happy to discuss more if this kind of information is helpful 👍 |
Hey just to mention (do not want to spam! ^^) that grimoirelab supports the following data sources: askbot, bugzilla, confluence, discourse, gerrit, git, github issues, github pull requests, mbox, jenkins, jira, mediawiki, meetup, phabricator, pipermail, redmine, rss, stackexchange (stackoverflow), supybot, telegram, kitsune and remo. There's some extra info of Perceval, the retrieval tool at: https://github.com/GrimoireLab/perceval That means, that having all of that information in a database (ElasticSearch mainly), we all can go for the metrics that you're mentioning such as people and evolution of contributors in all of those data sources, activity for all of those data sources, etc... And then on top of that, build more advanced analysis, such as the demographics of the community or some others. btw, is any of you attending FOSDEM? that could be a great place to meet and discuss about metrics. We're also having this workshop to talk about metrics and show how to use the grimoire toolchain, just in case you're interested [http://grimoirelab.github.io/con/] and we also have this collaborative book https://jgbarah.gitbooks.io/grimoirelab-training/content/ where that's also detailed. |
One of the things I would love to focus FOSS Heartbeat on is the people in open source communities. I think we often get caught up in metrics like "Is rate of merged pull requests increasing?" without focusing on the people behind those metrics. Examples of more people-focused questions I would love FOSS Heartbeat to answer are:
I'd love to chat more about this. If you're looking to hire contractors to work on these sort of people-focused metrics, you can drop a line to [email protected]. I'll also be at FOSDEM. |
@sarahsharp's excellent qs remind me...a lot of these metrics should be used to measure not just growth, but sustainability. Ex. "average response time" can be used to measure how quickly maintainers respond to issues/PRs, but if it's decreasing over time, that can also be a sign of exhaustion. So the response isn't just "answer them faster!" but might be "how do we get additional 👀s and ✋s to help out?" |
A few things that we pull regular metrics on in the Node.js project that have been important.
We used to track who was merging commits but that has gotten less useful over time because it doesn't really indicate who is reviewing commits as more PRs get reviewed by many people before being merged and it's common for someone to merge a bunch of already reviewed PRs. You can probably get better data out of the new review tools in GitHub if you're using that review system. |
@dicortazar / @sarahsharp I'll be at FOSDEM too, as will some of my Zalando colleagues (@alexkops for sure, hopefully @hjacobs and our IP lawyer at minimum as well). Also cc'ing my colleagues @jbspeakr and @KathleenLD here so they can follow this thread; both are interested in/have exp in metrics and balance. @nayafia Thank you for clarifying your point and for the extra link. @icecrime, would be up for adding some bits to your READMEs over the holidays. |
One last thing I'll say: think carefully about what is important to the project before building the dashboard. There's plenty of data out there and it's easy to get lost in creating amazing visualizations of it. I've done this myself a few times and the result was more of a distraction than a benefit. There's also a bunch of products out there that already do this and I feel the same way about most of them as well. I've actually pared back the data that I regularly consider. For instance, I no longer try to track the total number of commits in the main repo in master. There's an inflection point where the project can't handle any more activity in one place and things are spun off more liberally. If we obsessed about that metric we'd end up overloading that branch/repo. Instead, more focus is put on "how" the work is getting done rather than just the volume of work. If you want to distribute the work load, attract more contributors, increase diversity, etc, a lot of these metrics won't help and can become counter-productive. |
@mikeal I very much agree. The Hoodie way to avoid this problem is to start with the end result without thinking about technical limitations or what data and tooling is available today. Then we will probably create some kind of dummy dashboard that just looks and feels amazing, then we all get super excited about it, and then we make it work backwards :) This is also the reason why I want @leighphan to lead this project, because she cares about the processes and the experience from the perspective of new and existing contributors as well as maintainers, and she has the skills for and interest in data visualisations. Thanks for this great discussion y’all <3 keep it coming |
This is great & very important! @sarahsharp's point about understanding the perceptions of people in the open source community is important for understanding the health & sustainability of the project, beyond the quantitative metrics. @kariljordan just pointed me at this great paper from Steinmacher et al on self-efficacy towards OSS projects: Increasing the Self-Efficacy of Newcomers to Open Source Software Projects. This study shows that self-efficacy (belief in one's ability to succeed in accomplishing a task) can increase with more guidance around initial commits. #win! The study doesn't then go on to show that people with more self-efficacy continue to contribute to the project, but more general studies in self-efficacy show that it's important for involvement in an activity. It would be interesting to survey people new to and actively working on the projects, potentially with these survey questions, to understand why there might be a particular balance between active users, contributors and maintainers on a project and if people are transitioning from being newcomers to active contributors. Steinmacher et al survey questions:
|
Thanks everyone for your interest. I greatly appreciate all the tips on starting points! Looks like there are many different angles and ways we can extract and visualize the climate of Open Source communities. @jasonLaster I'm curious - how do you get to know contributors better? Are there weekly chats/meetings open to everyone? While project data can reveal efficiency and growth of OS projects, I'm also very interested in the data that will help us connect with people first in Open Source communities. Thanks @sarahsharp and @nayafia for bringing up very perceptive questions and indicators about sustainability of people and (before) projects. I'm all for taking the Hoodie approach of starting with the interface look and feel, then working backward - people first. :) |
Hey all, where are we on this thread? Some of us are talking about FOSDEM right now and it reminded me, there was talk of a FOSDEM get-together. Should we plan? |
👍 GitHub would be happy to host a dinner conversation. |
Thanks everyone for the wealth of information. I am inspired by all of the ongoing work. Here's a quick doc that I started to summarize what I learned. Please feel free to improve it in any way. Also, if you're interested in joining a hangout, add your name to the doc and we can discuss next steps. |
@jasonLaster this is great work, thanks for putting it together 👍 |
@tracykteal you might be interested in http://opensourcesurvey.org/, which is being conducted by GitHub. While not specific to any one project, some of those Steinmacher et al questions will be asked of respondents, so might be helpful as a baseline. The survey questions are public, and results will be public too. |
Just getting back into the swing of things but @jasonLaster pointed me to this over the holidays and I was really excited to see all the enthusiasm and work going on. I've been looking at this from a couple different angles with very similar goals to what I've seen here so I'll share what I've been thinking. First I want to understand the contributor funnel (funnel being a standard marketing term, perhaps not the best way to describe people; onboarding?)
Then I'd like to understand area of interest or strengths for contributors. (I think this is what @tracykteal is getting at)
Looking forward to more discussion! 🎉 |
I'm also curious how or if projects have an onboarding process (perhaps like Hoodie Camp) to get to know contributors' interests, such as building a portfolio, getting a job - this would help give a clearer idea of their trajectory. I started @codelaboc, a learning group in my community, and we conduct a survey for new members, to get an idea of their goals and strengths, and shape our events/direction to help each other. @nayafia Any chance the http://opensourcesurvey.org/ will touch on such questions? |
@leighphan off the top of my head, I don't think so (it's geared more towards contributor behavior than project norms), but you can see the full set of questions here. |
Hello, Please guide me through :) |
If you asked me:
I could not answer it. Nor could any other maintainer from any other Open Source project that I asked so far. And this is a problem, because Open Source Burnout is real and yet we don’t measure the underlying problems in ways we measure code quality.
What we don’t measure, we cannot improve.
The question about active contributors is only one aspect. What I am really interested in is how well balanced the community is between active users, contributors and maintainers.
Goal
The goal for the Hoodie Community Dashboard is to be able to answer this question at all times, and make the underlying measurements transparent to everyone.
Out of scope:
In future, I would also measure the success / impact of the Hoodie community which would include things like the reach we have, number of first-time open source contributors, diversity numbers etc.
Measurements
1. Work load
Measuring the number of users is hard for Open Source projects, for good reason. But while it would be nice to know how many active users we have to measure the success of Hoodie, we are only interested in how much work load people produce that the Hoodie community has to take care of. Things that we can measure are
2. Active contributors
At Hoodie, we think contributions go beyond code and documentation. Equally important is the work from our editorial and design team, and from people helping answer questions in slack or on GitHub. In contrast to the load by users, we are not interested in the amount of contributions, but in the number of different people who make them: we do not want a few people doing huge amounts of work, but a big group that balances the work load.
We can experiment with the details, but for a start I would define active as "contributed within the past month"
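As a minimal sketch of that definition, assuming contribution events are available as (person, timestamp) pairs: the data, the `active_contributors` name and the 30-day reading of "past month" are all assumptions for illustration, not an existing Hoodie tool. The function counts distinct people, not contributions, in line with the goal stated above.

```python
from datetime import datetime, timedelta

# Hypothetical contribution events: (person, timestamp).
contributions = [
    ("alice", datetime(2016, 12, 20)),
    ("alice", datetime(2016, 12, 28)),
    ("bob", datetime(2016, 11, 15)),
    ("carol", datetime(2016, 12, 5)),
]


def active_contributors(events, now, window=timedelta(days=30)):
    """Distinct people with at least one contribution inside the window."""
    cutoff = now - window
    return {person for person, when in events if cutoff <= when <= now}


active = active_contributors(contributions, now=datetime(2016, 12, 31))
```

Here `len(active)` is the number that matters; alice's two contributions count once, and the `window` parameter makes it easy to experiment with other definitions of "active".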
A contribution can be one of the following (from people who are part of the Contributors Team on GitHub)
3. Active maintainers
Traditionally maintainers are seen as gate keepers in Open Source projects, often referred to as "committers". At Hoodie, we see maintainers as being in charge of maintaining and growing the space in which people enjoy becoming and staying active contributors. Just like with contributors, we are less interested in the total amount of work by maintainers, and more interested in the total number of active contributors.
Activities by maintainers are
Visualisations
To be done.
Basically I would love to see different charts, the main one showing the "community climate" indicator (or however we want to call it) over time.
I would like to add these visualisations to hoodie.camp (it currently is a simple prototype only showing open issues).
Besides having a website, I would like to be able to send out weekly and monthly reports via email
Feedback
We are actively discussing all aspects of the Hoodie dashboard and are very interested in your thoughts, questions and insights into existing tools or your experiences with other Open Source communities.