Under Discussion: The Maintenance of Large Open-Source Projects

13 Feb 2020

Author

Tech Editor @ WTTJ

While attending the 2019 edition of dotJS, Behind the Code asked former Node.js core team member Bert Belder and Vladimir Agafonkin, creator of the JavaScript library Leaflet, to sit down and outline what it takes to maintain large open-source projects. While at Node.js, Belder was the person mainly responsible for creating the platform abstraction layer libuv. He also co-created the open-source API framework LoopBack and the open-source runtime Deno. Agafonkin is the maintainer of more than 40 open-source libraries, including Leaflet, an open-source JavaScript library for mobile-friendly interactive maps. We discussed what the role of a large open-source project maintainer involves, the difficulties Belder and Agafonkin have encountered along the way, and the advice they would pass on to others.

Could you tell us a bit about yourselves and explain your involvement as maintainers of large open-source projects?

Vladimir Agafonkin: I have been working in open-source software for 9 years. My most popular project is Leaflet, a library for interactive maps that has hundreds of contributors and is even being taught at universities now. I also work on vector rendering technology at Mapbox, more specifically on a project called Mapbox GL JS, and also on a few dozen smaller libraries that deal with specific algorithms or data structures that we use as dependencies. Those libraries have smaller followings but I still have to deal with managing the community and maintaining the libraries.

Bert Belder: I got seriously involved in open source for the first time in 2010. At the time, Node.js was very new. I was working at a company where I really wanted to use Node.js, but I couldn’t because it didn’t run on Windows. So I initially got involved just to make it work on Windows. A year later, in 2011, when Node.js really took off in popularity, I was hired by another company to be a full-time contributor to the project. I stayed on as a technical steering committee member until I quit my job more or less in 2017, and that was also the end of my Node.js career. Now I am starting a new open-source project called Deno and it’s actually nice to be programming again and occasionally shepherding some contributors forward.

What is the role of a maintainer, specifically on large open-source projects?

BB: For the Node.js project, we started with two or three core team people, which grew to a dozen contributors, to eventually hundreds of people. So my role changed over time. I was initially mostly contributing code myself, and at some point I was mostly reviewing other people’s code. And even later, as people weren’t happy with how Node.js was owned by one particular company and really wanted a foundation, my role evolved again and consisted mostly of talking to people and getting them to agree on how to move forward with the project.

VA: For me, the role of a maintainer has two sides. The first is purely technical—the maintainer ensures the project is healthy, that the bugs get fixed, that nothing is broken, and that new features are developed. The second is more non-technical and involves nurturing the community and making sure that no one feels excluded, that everybody is heard, and that decisions are made.

What are the main difficulties you might encounter while maintaining a large open-source project?

VA: For sure one of the main difficulties is routine! Because once the project gets established and mature, most of your work is not about exciting features and doing cool things anymore—99% of the work is just dealing with some weird bugs, some obscure situations and just trying to reproduce someone else’s problem, and boring things like that. It can be demoralizing and sometimes people will burn out and not be able to handle the project once it becomes really popular.

The second one is saying “No” to people. Sometimes someone will spend a month working on something, will send a pull request, and will be very excited about it. And you need to be able to tell them, “Sorry, we can’t accept that as a contribution.” This can really be heartbreaking! On the one hand you want to encourage contributions, but on the other you want to keep the project lightweight and focused. Keeping this balance between external contributions and a sharp focus on the goal of the project is very difficult, from my point of view.

BB: I really agree with your second point, Vladimir. And in addition to knowing how to say “No” you also have to learn how to use “Not yet.” For instance, when someone sends you a patch for a feature that you want but you’re not totally happy with the quality. You comment or go to them and ask, “Please can you change this?” And they fix it, but then you look at it again and it’s still not great and you say, “Oh, it’s still not great, please fix that too.” You have to constantly balance things, like, “OK, I want to maintain a certain quality, but at the same time, if I ask someone to make changes to a pull request five times in a row, they’ll likely get frustrated and disappear.”

From my experience, another difficulty is juggling the two roles Vladimir was talking about—the technical role and the management role. For example, you may want to solve a technical issue yourself, which usually requires a lot of focus over several days, and while you’re doing it, other stuff in the community gets neglected. Nobody looks at other people’s patches, nobody responds to mailing-list issues. And then when you get back to it, you find all this mess that’s piled up while you weren’t paying attention.

And the last difficulty is getting people to work together to solve big problems. Sometimes there are important issues that are too big for any individual contributor to solve by themselves. So you need multiple unrelated contributors to work together and to say, “OK, we’re going to do this together.” And it’s really hard to get that stuff in motion. It requires defining a long-term vision for the project. When I was working on Node.js, we didn’t really succeed in making a long-term plan. Even when we had weekly technical steering committee meetings, what was discussed was mostly short-term focused, with individual contributors working on their individual pet peeves.

What is the best organization for the team of a large open-source project?

VA: On my biggest project, Leaflet, we follow the benevolent dictator governance model. [This means] there is one person who defines the guiding principles of a project and tries to make sure that the project stays on track.

BB: A lot of projects following the benevolent dictator model fall by the wayside just because their maintainer either can’t handle the pressure or doesn’t have the people skills to do it. But when someone does have the right skills, then yes, a project can be very successful.

In very large open-source projects, there will usually be multiple maintainers—sections of the code will be owned by one or two people. But it doesn’t guarantee that your open-source project will be successful!

Google Chrome is also an interesting case, as it is a very successful open-source project that is not community driven. We can ask ourselves, “Is the fact that it is not community driven actually what makes it successful?” And maybe it is. For example, it might be easier to decide on a focus for the project. When everybody gets paid by the same boss, they will generally move in the same direction. In a community-driven open-source project, you don’t really get that.

VA: I think organization is not such a key factor. What is most important is to have a clear focus. If a project has a razor-sharp, defined focus, that’s going to help a lot! For example, Leaflet was created specifically as a minimal, simple, lightweight alternative to an existing project that tried to cram every possible feature into one project. And this is why Leaflet became successful, because it [maintained a] sharp focus on the most essential things.

What would be your advice about handling issues and pull requests efficiently?

BB: Sometimes there are issues that pop up and, as a maintainer, you know it’s an issue and that it’s somehow important to fix it, but you can’t fix it yet because it depends on a lot of other stuff being done first. And so there are basically two strategies for going about it. Either you let it sit there in the meantime, but then you just never look at your older issues because newer ones come in on top and that’s where your attention goes. Or you are just brutally honest about it and say things like, “OK, well, you were right. This is a bug, but we don’t have the time to fix it, so we’re going to close it now.” Some projects use bots for this, but as someone who took the time to report a bug, I often find that a bit offensive. Just because the team didn’t care to look at it for six months is not a valid reason to close an issue.

VA: I don’t find it offensive! I think it’s a good strategy to set up bots to close old issues, because if there is an issue that no one looked at, it usually means that the issue is not critical enough. If an issue gets a lot of attention and lots of comments, it means that it’s an important issue that a lot of people are facing. You can maybe also be clear about it with your contributors by saying, “Sorry, we will close issues after a certain period of inactivity.” The same can probably be applied to pull requests as well. Because, in the end, you’re still trying to manage very limited resources. You have to prioritize and compromise. Maybe someone will get offended, but isn’t it better that someone gets offended than for you to burn out and leave the project?
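To make the inactivity policy described above concrete, here is a minimal sketch of the selection rule such a bot might apply. It is only an illustration under stated assumptions: the `Issue` and `StalePolicy` shapes, the function names, and the thresholds are all hypothetical and are not taken from Leaflet, Node.js, or any real bot. The exemption for heavily commented issues is one possible reading of the point that lots of comments signal an important issue.

```typescript
// Hypothetical issue record; not tied to any real issue-tracker API.
interface Issue {
  id: number;
  lastActivity: Date;
  commentCount: number;
}

// Hypothetical policy knobs; the values used below are illustrative only.
interface StalePolicy {
  maxInactiveDays: number;  // close after this many days without activity
  busyCommentCount: number; // issues with at least this many comments stay open
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Return the issues that have been inactive longer than the policy allows,
// skipping issues that have attracted a lot of discussion.
function selectStaleIssues(issues: Issue[], policy: StalePolicy, now: Date): Issue[] {
  return issues.filter((issue) => {
    const inactiveDays = (now.getTime() - issue.lastActivity.getTime()) / MS_PER_DAY;
    const isInactive = inactiveDays > policy.maxInactiveDays;
    const isBusy = issue.commentCount >= policy.busyCommentCount;
    return isInactive && !isBusy;
  });
}

// State the rule up front so that closing an issue does not come as a surprise.
function closingMessage(policy: StalePolicy): string {
  return (
    `Sorry, we close issues after ${policy.maxInactiveDays} days of inactivity. ` +
    `Please comment if this still affects you.`
  );
}

// Usage with made-up data.
const policy: StalePolicy = { maxInactiveDays: 180, busyCommentCount: 10 };
const now = new Date("2020-02-13T00:00:00Z");
const issues: Issue[] = [
  { id: 1, lastActivity: new Date("2019-06-01T00:00:00Z"), commentCount: 1 },  // old and quiet
  { id: 2, lastActivity: new Date("2019-06-01T00:00:00Z"), commentCount: 40 }, // old but busy
  { id: 3, lastActivity: new Date("2020-02-01T00:00:00Z"), commentCount: 0 },  // recent
];
const stale = selectStaleIssues(issues, policy, now);
console.log(stale.map((issue) => issue.id), closingMessage(policy));
```

With this made-up data, only the old, quiet issue is selected. Whether an exemption like `busyCommentCount` is a good idea is a judgment call for each project.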

BB: From a maintainer’s perspective, I actually do agree with you. It’s just that I also use open-source projects that I don’t maintain, and it makes me a little angry every time my issues are closed without an actual fix. At the same time, and I’m very aware of it, I can’t demand anything from the maintainer. I know they’re probably doing it in their spare time. I’m not paying them. But still, it’s hard for me to handle it emotionally.

The most successful large open-source projects took their time to get the technology sorted and to build a community. It really helps if you can focus on getting your thing right before the rush of bugs and requests comes in. When we started the Deno project [in 2018], we had, like, 25,000 stars on GitHub after three months, which is quite a lot by GitHub’s standards, and that did worry me. It’s good to see so many people interested, but I wouldn’t want all these people to run away disappointed because the technology isn’t ready yet.

Can it be useful to be clear in the readme about what you expect from people in terms of contributions?

VA: Usually you create a separate “CONTRIBUTING.md” file that defines your principles for accepting contributions. It usually outlines how you work with pull requests and how you review them, and it sets expectations for contributors. It helps for sure!

BB: I agree. But sometimes it’s not so black and white. Pull requests that are clearly in violation are easy to close, but there’s always a lot of gray area, things that you didn’t think about, or other trade-offs. Sometimes a pull request gets you a little bit closer to your goal in some way, but further away from it in another. And so that means that, as a maintainer, you have to make a decision and making those decisions can be very tiring.

VA: It becomes much easier when you have support from other maintainers to make those decisions. So it’s very important to try to nurture contributors into [becoming] maintainers. For instance, if you see someone bringing a pull request that’s not good, but they have a lot of enthusiasm, want to learn, and want to improve, then you try to capture that enthusiasm and guide them into becoming better. I have had people who submitted really bad pull requests but were very enthusiastic, and now they’re maintainers of the project, and you can rely on them and trust them to make their own decisions about the project.

What is the right frequency for releases?

BB: It depends a little bit on the maturity of the project. I’m currently working on a very young project that changes very fast, so we release weekly. Projects that ship mostly bug fixes can release frequently as well, especially if the release process can be automated. But on the other hand, you don’t want the JavaScript standard to release every week; it would drive everyone crazy.

VA: For the Mapbox GL JS project we’ve adopted a monthly-release cycle. The project is now more or less mature and lots of people depend on certain things working. So you have to have some buffer when you release a beta. Make sure people test it out and don’t merge any new features into that beta. To be able to give this buffer, you have to extend the cycles. So, the more critical it is to make sure there are no regressions, the bigger the release cycle will be.

How do you manage testing on open-source projects?

VA: For Mapbox GL JS, we extensively test every release, but we also count on users trying out the beta releases and getting feedback. And we also run non-regression tests.

BB: Regarding Node.js, it’s very different now than it used to be. Before, it was a manual process to manage the releases. So we would spend a day building them for all the different platforms and doing some smoke tests. Nowadays, of course, we all have continuous integration and it’s a lot smoother that way. But we don’t really release betas. Node.js still relies mostly on the fact that some people are eager to get the latest features and they are also a little bit more willing to accept that there might be a bug or a regression in it, versus a large corporate user who is not going to upgrade.

If you release infrequently, the problem is that once you do make a release, everybody has to upgrade, because there aren’t so many releases. And then if you have a regression, it’s a much bigger problem because the regression is not going to get fixed until your next release, which might be a year or more away. So, for me, a higher frequency is usually better!

How can a maintainer ensure they retain contributors?

BB: As Vladimir said earlier, I think the biggest challenge as a maintainer is to identify contributors who want to learn and give them the opportunity to grow. Eventually they will become very valuable contributors. And in order to achieve this, the key is to be available to talk to them and answer their questions, as well as getting them more involved by making them a moderator, asking them to review some code, involving them in building a release or asking them to speak at a conference.

VA: I think retaining contributors involves making people feel really welcome. So leaving a comment like, “Wow, this is so great. Thank you so much. If you’re interested in this area of code, let’s work our way through this next cool feature together,” when merging a pull request can be very helpful to foster passion about the project and make people feel included and appreciated. There will be a much higher chance of them staying!

Is it also the role of the maintainer to answer questions or comments about the project on Reddit or Hacker News, for example?

BB: It can help! Especially if there is confusion about something. But it can also be tricky because some people on those social platforms will just spew negative opinions, even if they are not actually interested in using your software or helping it improve. So my preference is to not engage with them. At the same time, of course, there can be good feedback in the comments on your project. But I prefer it as one-way communication—I read it but I’m not going to respond to it. I think that’s also important for your sanity.

VA: It helps a lot for the popularity of the project to participate in discussions. Of course there will be some negativity, but if you respond to that and let people know that you are open to advice, you can turn it into a constructive, positive discussion.

BB: I think what you’re saying is true, but is it worth your time to try it every time?

VA: Well, probably not… But if I’m reading Hacker News discussions in my spare time and I see that some people have posted something about my library, I’ll be like, “Hey, this is the author here, please feel free to ask any questions.” Most people are nice, or at least that’s what I’ve experienced. But maybe it’s because my projects are not popular enough. When projects go above a certain threshold of popularity, they attract bad comments.

How do you know when it’s time to end an open-source project?

VA: Some projects become mature and move into maintenance mode, and then, at some point, a new technology emerges and yours slowly declines. And sometimes the best thing to do is just to let it happen. It’s not necessary to support some projects forever.

BB: I agree. I think there isn’t really any reason to try to give projects eternal life. There’s a useful life to everything and then, at some point, it’s OK if it disappears. But I think it is more painful when a project has become very popular and a lot of people now rely on it. For example, if you’re a bank and you run Apache web server—instead of letting it die, maybe you can hire someone and make sure the project is maintained. A project can end as a volunteer-driven open-source project, but it doesn’t have to die if there are developers who get paid to keep it working.

VA: The only way to make open-source projects sustainable and to not burn out is to make it your job to work on them. Because projects you work on in your spare time are never sustainable. The only way to achieve sustainability is for companies that are interested in these technologies to sponsor development and hire people to work on those projects, which is how I got to work on open source. I work on open source full time because that’s our company’s product. This company earns money based on it. That’s the only way.

This interview has been edited for space and clarity.

This article is part of Behind the Code, the media for developers, by developers.

Illustration by Victoria Roussel
