Hadley Wickham is the most important developer for the programming language R. Wes McKinney is among the most important developers for the programming language Python. The two languages, which are free to use, are often seen as competitors in the world of data science. Wickham and McKinney don’t think the rivalry is necessary. In fact, they think that by working together, they can make each other’s languages more useful for their millions of users.
Last month, McKinney announced the founding of Ursa Labs, an innovation group intended to improve data-science tools. McKinney will partner on the project with RStudio, Wickham’s employer, which maintains the most popular user interface for R. The main goals of Ursa Labs are to make it easier for data scientists working in different programming languages to collaborate, and to avoid redundant work by developers across languages. In addition to improving R and Python, the group hopes its work will improve the user experience in other open-source programming languages like Java and Julia.
R and Python are essential tools for data scientists working at tech platforms like Google and Facebook, academic researchers, and data journalists (Quartz is a big user of both). A common problem for coders is that it’s hard to collaborate with colleagues who use one of the other languages. Ursa Labs will try to make sharing data and code with someone using another data science language easier, by creating new standards that work in all of them. Developers call this an improvement to “interoperability.” Wickham and McKinney have already worked together to create a file format that can be used in both Python and R.
Besides making collaboration easier, Wickham and McKinney tell Quartz that a key motivation for the project was watching developers in each language solve the same problems, but not share what they found with one another.
For example, Wickham explains that in every language people need to be able to calculate averages. This is a simple process for users, involving one line of code in R or Python. But for the languages’ developers, it is a tricky problem to figure out the best way for that one line of code to perform the calculation. Developers in R and Python both tend to solve this problem in the languages C++ and C—languages that are good for development, but tricky for the average user. Ideally, Wickham says, if a developer in one language figures out the best way to do something, it should be applied in every other language. That’s the main mission of Ursa Labs.
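To give a sense of why something as simple as an average can be tricky under the hood, here is a minimal Python sketch. It is an illustration only, not code from Ursa Labs, R, or any Python library's internals; the function names `naive_mean` and `careful_mean` and the sample data are made up for this example. It compares a straightforward running-sum average with one built on `math.fsum`, a standard-library function that tracks rounding error while adding.

```python
import math
import statistics


def naive_mean(values):
    """Average computed with a plain running sum (hypothetical helper)."""
    total = 0.0
    for v in values:
        total += v  # each addition can silently drop small values
    return total / len(values)


def careful_mean(values):
    """Average computed with math.fsum, which compensates for rounding."""
    return math.fsum(values) / len(values)


# Made-up data: two huge values that cancel out, plus two small ones.
# The true sum is 2.0, so the true average is 0.5.
data = [1e16, 1.0, -1e16, 1.0]

print(naive_mean(data))        # the first 1.0 is lost to rounding
print(careful_mean(data))      # accurate result
print(statistics.mean(data))   # the one-line version most users would write
```

On this input the naive version loses one of the small values to floating-point rounding, while the other two agree on the correct answer. Deciding how to balance that kind of accuracy against speed is the sort of work that developers in each language currently tend to do separately.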
Wickham and McKinney add that besides solving technical problems, the project also serves as an effort to make peace between programming tribes. The more that people using these languages work together, they say, the better it is for data science in general. “I hope it ends the pointless feuding between R and Python,” says Wickham. “Both languages are awesome.”