Reflections On My OPW Internship
My last patch to Parsoid got merged a couple of weeks ago, so now seems like a good time to write a final post to summarize what I learned. I picked up plenty of technical skills during my internship, but more importantly, I learned how to get up to speed on a large codebase and how to work with a team.
Tips for New Open Source Contributors
- Don’t try to understand the whole thing. Just learn enough to make the contribution that you want to make, at least at first. You will get to know the codebase organically as you continue to make contributions.
- Keep your patchsets small. I spent the majority of my internship working on the same patch; it took 7 weeks and 31 patchsets before it got merged. This was mostly because the patch spanned 23 different files. In general, I believe that new contributors should either stay away from projects that affect a huge number of files, or split up their work so that each patchset covers just 2-3 files at a time.
- Having a mentor really helps. Even if you are operating on your own and not through a program like OPW, it’s good to establish a relationship with someone who can help guide you through the codebase. I spent about six hours a week asking my mentor all kinds of questions.
- Peripheral participation can be productive. My team often got into long IRC debates about my project, most of which I didn’t understand. But I would save the IRC transcript and ask my mentor to translate it for me later. My coworkers were able to converse naturally without feeling held back, while I benefited from understanding all of their different viewpoints.
- Figure out who the decision-makers are. Whose approval will you need before your patch gets merged in? On the Parsoid team, for example, it took me some time to realize that the team lead was the final authority, and that no patch would be merged without his approval. Once I knew that, I tried harder to seek his opinion ahead of time, so that I wouldn’t submit a patch only to be forced to revise it later.
- You are ultimately responsible for your code. You know your code better than anyone else will, especially as you continue to make contributions. Don’t expect other contributors to have the same level of understanding, or to always spot your mistakes, even if they have more experience than you do.
- Use IRC. In the beginning, I thought it would be weird if I had to correspond on IRC with people I’d never met in real life, and I asked if I could have Google Hangouts with my mentors once a week. After the first Hangout, I realized that everyone just felt more comfortable on IRC.
What I Learned (Technically)
From a technical perspective, I mainly learned better ways of structuring my code that are probably applicable to any language or framework. I learned the most from code review and from IRC discussions about how to implement my project. Examples of things people taught me include:
- To have explicit returns on all paths of if / else statements, for code clarity
- To avoid “side effects” where a function does more than what it says
- To build something once in the constructor, instead of building it over and over again (to improve performance)
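These three habits can be sketched in a few lines of Python. This is only an illustration: the WordCounter class and its methods are hypothetical names invented for the example, not code from Parsoid.

```python
import re


class WordCounter:
    """Hypothetical example class; not taken from Parsoid."""

    def __init__(self, stop_words):
        # Build the lookup set and the compiled regex once, in the
        # constructor, instead of rebuilding them on every call.
        self._stop_words = frozenset(w.lower() for w in stop_words)
        self._word_pattern = re.compile(r"[A-Za-z']+")

    def count(self, text):
        """Count non-stop words. Does only what its name says:
        it reads its input and returns a number, with no side effects."""
        words = self._word_pattern.findall(text)
        return sum(1 for w in words if w.lower() not in self._stop_words)

    def describe(self, text):
        """Classify the text by length, with an explicit return on every path."""
        n = self.count(text)
        if n == 0:
            return "empty"
        elif n < 5:
            return "short"
        else:
            return "long"
```

Calling count() twice on the same text gives the same answer both times, because the method changes nothing on the object; that is what makes a side-effect-free function easy to review.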
I also became more familiar with version control and code review systems. I learned how to use Gerrit, picked up a few new Git commands (like git rebase -i), and learned how to write well-structured and informative commit messages.
Finally, by working on Parsoid I began to develop an understanding of how parsers work: how they break up text into indivisible chunks called tokens, transform the tokens, and then reassemble them on the other side using a tree structure.
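The tokenize, transform, and reassemble stages can be sketched with a toy example in Python. This is a deliberately tiny illustration that only handles the ''' bold marker; the function names and token shapes are assumptions made for the sketch, and it does not reflect how Parsoid is actually implemented.

```python
import re


def tokenize(text):
    """Break text into indivisible tokens: bold markers and plain text."""
    tokens = []
    for piece in re.split(r"(''')", text):
        if piece == "'''":
            tokens.append(("bold", piece))
        elif piece:  # skip empty strings produced by the split
            tokens.append(("text", piece))
    return tokens


def transform(tokens):
    """Turn alternating bold markers into explicit open/close tokens."""
    out = []
    is_open = False
    for kind, value in tokens:
        if kind == "bold":
            out.append(("close", "b") if is_open else ("open", "b"))
            is_open = not is_open
        else:
            out.append((kind, value))
    return out


def build_tree(tokens):
    """Reassemble the tokens into a nested tree, using a stack of open nodes."""
    root = {"tag": "root", "children": []}
    stack = [root]
    for kind, value in tokens:
        if kind == "open":
            node = {"tag": value, "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)
        elif kind == "close":
            if len(stack) > 1:  # ignore a stray close at the top level
                stack.pop()
        else:
            stack[-1]["children"].append(value)
    return root


tree = build_tree(transform(tokenize("a '''b''' c")))
```

Here tree ends up as a root node with three children: the text "a ", a "b" node containing the text "b", and the text " c".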