How Duolingo Builds Its Data Science Methodology

For business leaders and other powers that be inside a company, data science can be a mysterious, almost-magical tool: They don’t necessarily understand it, but in it they see the potential to answer any question they have about their users, business, revenue or product.

So when a request lands on the desk of a data science team, it can often betray the author’s lack of understanding of whether the request is viable at all.

Data science methodology is the task of crafting a project that answers the needs of customers and colleagues alike, and it requires a deep understanding of a request and the motivations behind it. Following a carefully defined methodology, a well-run data science team will then craft a project plan that defines and collects the kind of data it needs, and then prepares and models the data to enhance its understanding of the insights contained therein. Finally, the team must be ready to deploy the requested tool or feature with the expectation that feedback processes will most likely require some post-production tweaks or updates.

This, of course, is your average data scientist’s bread and butter. But what does this process actually look like in practice?

With more than 300 million users completing more than 7 billion exercises each month, language-learning platform Duolingo offers an example of data science methodology in action. Not only do the company’s vast databases inform tweaks to Duolingo’s user experience and underlying infrastructure all the time, but the company’s data science teams also conduct regular research into everything from optimizing reminder notifications to theories on how to improve teaching practices and outcomes for learners of indigenous languages.

Duolingo’s data science methodology underpins much of this work. To learn more about the nuts and bolts of how a project moves from an amorphous idea to a usable tool or valuable insight, Lead Data Scientist Erin Gustafson (one of RE•WORK’s Top 30 Women Aiding AI Advancement back in 2019) took us through her team’s best practices.


Erin Gustafson, Lead Data Scientist at Duolingo


What are your team’s best practices when designing your data science methodology for a new project?

Our main best practice is a project kickoff process that we’ve been honing over time. Most of our projects go through this process, which involves drafting a kickoff document and scheduling a meeting with key stakeholders to discuss the plan. We’ve found that both stages of this process add a ton of value.

At the document stage, data scientists work with their managers and team leads to define the goals, requirements, key stakeholders, technical approach and timeline for the project. This stage forces us to do the important foundational thinking for a project so we can make sure that we have the data we need (more than once, the kickoff process has helped us realize we don’t) and that the project has high ROI.

In the kickoff meeting, the data scientist talks through the plan and any areas that need further alignment with cross-functional stakeholders. The cross-functional nature of this meeting is really important because the success of a data science project is not only determined by how well the technical approach is executed; success is also driven by the impact that the work has on the product or business more generally. Including product managers, engineers, learning scientists and others in the meeting ensures that we’re asking the right questions and plan to answer them appropriately.

This brings me to another best practice or key principle for deciding on a data science methodology: Don’t let perfect be the enemy of good. As a small data science team in a fast-moving company, we don’t often have the luxury of spending months on a single project. This means that we think iteratively about data science projects and often agree to start with a minimum viable product model that can deliver “good enough” insights or predictions, and level up the approach later once we’ve demonstrated the value of the model. This allows us to move quickly and take on more projects.


“Understanding how your work will be used is an important part of choosing your technical approach.”


What’s an example of your methodology in action?

A recent initiative that encapsulates a lot of our best practices was a revamp of our learner forecasting methodology. For the last couple of years, we’ve relied on a method that gave us a fairly accurate forecast (even during COVID times) but required a ton of overhead to update and maintain. We decided to take a step back at the beginning of this year to take stock of our approach and consider alternatives.

We began by going through our typical kickoff process. This ended up being invaluable because the requirements of this project were fairly complex. We wanted to find a new methodology that would be easier to maintain, more flexible so we could add functionality as our business matures, more robust from a statistical perspective and at least as accurate as the legacy approach. What’s more, we also needed to make sure we were satisfying the growing needs of our stakeholders in marketing, finance and product. The kickoff process made sure that we were clear on what success looked like and that we had buy-in from our stakeholders about the revamp.
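Editor’s illustration: one of the requirements named above, a new method that is “at least as accurate as the legacy approach,” can be written down as an explicit acceptance check. The sketch below is not Duolingo’s method. The choice of mean absolute percentage error (MAPE) as the metric, the function names `mape` and `at_least_as_accurate`, the `tolerance` parameter and all of the numbers are assumptions made up for illustration.

```python
from statistics import mean


def mape(actual, forecast):
    """Mean absolute percentage error; assumes no actual value is zero."""
    return mean(abs(a - f) / abs(a) for a, f in zip(actual, forecast))


def at_least_as_accurate(actual, legacy, candidate, tolerance=0.0):
    """True when the candidate's MAPE is no worse than the legacy MAPE plus a tolerance."""
    return mape(actual, candidate) <= mape(actual, legacy) + tolerance


# Made-up holdout values for illustration only (not Duolingo data).
actual = [100.0, 110.0, 120.0, 130.0]
legacy = [95.0, 112.0, 126.0, 125.0]
candidate = [98.0, 111.0, 121.0, 128.0]

print(f"legacy MAPE:    {mape(actual, legacy):.3f}")
print(f"candidate MAPE: {mape(actual, candidate):.3f}")
print("candidate passes:", at_least_as_accurate(actual, legacy, candidate))
```

Agreeing on the metric and the tolerance in a kickoff document is one way to make “what success looks like” concrete before any modeling starts.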


How has your data science methodology process evolved over time?

A recent addition to our process is after-action reviews. This is a practice that our engineering organization has used in the past to reflect on lessons learned from past projects. After-action reviews often involve a cross-functional group similar to the one in our kickoff meetings, and they give us an opportunity to identify aspects of our process or technical approach that worked well, fell short or could be improved. We’ve started to incorporate this into the standard lifecycle of a data science project. For example, we recently wrapped work on an MVP model, reflected on the project as a team in an after-action review and immediately applied those learnings in a kickoff document for the next iteration on the model. These two processes in tandem have helped us work smarter.



What are some common ways in which a faulty methodology can compromise a data science project?

A process that doesn’t ensure a well-defined goal for the project can cause a myriad of problems. For example, not being aligned on goals could mean that the data scientist doesn’t understand the use case for the model they’re building. Success for a model looks different depending on whether you hope to draw strong inferences from your model or to generate accurate predictions. Understanding how your work will be used is an important part of choosing your technical approach. A strong process for kicking off data science projects ensures that data scientists and their key stakeholders get on the same page early in the lifecycle of a project.

