
A study by UC Berkeley researchers examines machine learning operations (MLOps), revealing the variables that govern the success of machine learning models in deployment

By Marktechpost Staff on September 26, 2022

As machine learning becomes more prevalent in software, a new subfield known as MLOps (short for ML Operations) has evolved to organize the “collection of methods that attempt to deploy and manage ML models in production safely and efficiently”. MLOps is widely recognized as complicated: anecdotal reports hold that 90% of ML models never make it to production, and some suggest that 85% of ML initiatives fail to deliver value. At the same time, it is unclear why MLOps is difficult. Current scholarly understanding of MLOps is limited to a patchwork of white papers, anecdotes, and opinion pieces, even as a cottage industry of companies has emerged to address MLOps concerns.

MLOps raises issues of “technical debt”, leading to “huge ongoing maintenance expenditures in real-world ML systems”. Most successful ML deployments seem to involve a “team of engineers who spend much of their time on the less glamorous elements of ML, such as maintaining and monitoring ML pipelines.” Previous research has focused on generic data analysis and scientific procedures without considering the MLOps issues around model building. There is thus an urgent need to clarify MLOps, in particular by describing what MLOps typically entails across ML businesses and applications. A more complete understanding of best practices and barriers in MLOps can reveal gaps in current procedures and better inform the creation of next-generation solutions.

Source: https://arxiv.org/pdf/2209.09125.pdf | Standard ML pipeline

Accordingly, the researchers conducted a semi-structured interview study of ML engineers (MLEs), each of whom had worked on production ML models. They recruited 18 participants from various companies and applications, as shown in the table below, and asked open-ended questions to better understand their workflows and day-to-day issues. As shown in the figure above, MLEs undertake four main tasks: (i) data collection, (ii) experimentation, (iii) evaluation and deployment, and (iv) monitoring and response. Across the board, they identify three variables that influence the success of a production ML deployment: velocity, validation, and versioning. They discuss specific MLOps practices, organized into broad conclusions, the first being that ML engineering is highly exploratory in nature.
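The four-stage loop above can be sketched in a few lines of Python. This is a minimal illustration with made-up function names and scores (none of it comes from the paper), showing how the stages feed into one another and how monitoring closes the loop back to data collection:

```python
# Hypothetical sketch of the four-stage production ML loop described
# in the paper: (i) data collection, (ii) experimentation,
# (iii) evaluation and deployment, (iv) monitoring and response.

def collect_data():
    # (i) Gather fresh labeled examples (toy data here).
    return [{"features": [0.2, 0.7], "label": 1},
            {"features": [0.9, 0.1], "label": 0}]

def experiment(dataset):
    # (ii) Try many candidate model versions; keep the best one.
    candidates = [{"name": f"model_v{i}", "val_score": 0.6 + 0.1 * i}
                  for i in range(3)]
    return max(candidates, key=lambda m: m["val_score"])

def evaluate_and_deploy(model, threshold=0.75):
    # (iii) Gate deployment on an offline validation score.
    return model["val_score"] >= threshold

def monitor(model):
    # (iv) Watch live performance; a drop would trigger another
    # iteration of the whole loop (new data, new experiments).
    live_score = model["val_score"] - 0.05  # pretend live metric
    return live_score >= 0.7

best = experiment(collect_data())
if evaluate_and_deploy(best):
    print("deployed", best["name"], "healthy:", monitor(best))
```

Only one of the three candidate versions is deployed, which mirrors the paper's observation that experimentation produces far more models than production ever sees.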

Source: https://arxiv.org/pdf/2209.09125.pdf | Description of interviewees

As stated earlier, many reports claim that 90% of models never make it to production. However, the researchers believe this estimate is misleading: because of constant experimentation, engineers create many model versions, of which only a tiny fraction is ever expected to reach production. It therefore pays to prototype ideas quickly, with few changes to established procedures, and to demonstrate practical benefits early so that faulty designs are abandoned before they consume resources. Operationalizing model evaluation also requires an intentional organizational effort. Popular model evaluation “best practices” do not reflect how seriously companies take deployments: they often focus on evaluating the model on a single static dataset with a single choice of ML metric (e.g., precision or recall).
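A toy example makes the critique concrete: a single aggregate metric on one dataset can look acceptable while the model fails badly on a subpopulation that matters. The segments and numbers below are invented for illustration, not taken from the study:

```python
# Illustration (made-up data): overall accuracy hides a failing
# subpopulation, which is why MLEs track per-segment metrics.

def accuracy(pairs):
    # pairs is a list of (prediction, label) tuples.
    return sum(pred == label for pred, label in pairs) / len(pairs)

# Predictions tagged by a hypothetical user segment.
results = {
    "majority_segment": [(1, 1)] * 90 + [(0, 1)] * 2,
    "minority_segment": [(0, 1)] * 8,  # the model fails here
}

overall = accuracy([p for seg in results.values() for p in seg])
per_segment = {seg: accuracy(pairs) for seg, pairs in results.items()}

print(f"overall accuracy: {overall:.2f}")  # 0.90 -- looks fine
for seg, acc in per_segment.items():
    print(f"{seg}: {acc:.2f}")             # minority segment is 0.00
```

The overall score of 0.90 would pass most static evaluation gates, yet every prediction on the minority segment is wrong, which is exactly the gap that per-subpopulation evaluation is meant to close.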

The researchers found that MLEs spend substantial effort updating evaluation datasets and metrics over time to ensure that data subpopulations of interest are well covered. Non-ML rules and human-in-the-loop practices keep production models reliable. They discovered that MLEs favor simple approaches, even if that means maintaining many model versions: for example, rather than using complex strategies to mitigate distribution-shift errors, MLEs would simply create new models and retrain them on fresh data. MLEs ensured reliable deployments by employing tactics such as on-call rotations, model rollbacks, and numerous rules-based guardrails to prevent erroneous outputs. The paper is a must-read for anyone trying to do ML in production.
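A rules-based guardrail of the kind the interviewees describe is just a non-ML check wrapped around the model's output. The sketch below uses hypothetical thresholds and a hypothetical fallback value (none of this is specified in the paper); the idea is only that simple deterministic rules catch out-of-range or implausible predictions before they reach users:

```python
# Minimal sketch (hypothetical thresholds) of rules-based guardrails:
# deterministic, non-ML checks applied to a model's prediction.

def guarded_prediction(model_score, last_known_good=0.5,
                       lower=0.0, upper=1.0, max_jump=0.3):
    # Rule 1: reject out-of-range outputs and fall back to the last
    # known-good value instead of serving a nonsensical score.
    if not (lower <= model_score <= upper):
        return last_known_good
    # Rule 2: clamp implausibly large jumps relative to the previous
    # value, limiting the blast radius of a bad model version.
    if abs(model_score - last_known_good) > max_jump:
        step = max_jump if model_score > last_known_good else -max_jump
        return last_known_good + step
    return model_score

print(guarded_prediction(0.55))  # in range, small change: passes through
print(guarded_prediction(7.0))   # out of range: fallback value served
print(guarded_prediction(0.95))  # large jump: clamped toward last good
```

Guardrails like this pair naturally with the rollback tactic the study mentions: if the rules fire too often for a new model version, on-call engineers roll back to the previous one.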

This article is written as a research summary by Marktechpost Staff based on the research paper 'Operationalizing Machine Learning: An Interview Study'. All credit for this research goes to the researchers on this project. Check out the paper.

