Machine Learning Engineering for Production (MLOps)

At SMS, we support lifelong education and continuous learning: acquiring new skills is essential for both personal and professional development. For this reason, we continually expand our education program by conducting training and mentoring sessions and by offering online courses on platforms such as Coursera or Udemy. Our “Open Fridays” can be used for self-education, e.g., working through the material of online classes or attending presentations by teammates on a specific expert topic.

In this article, we describe a specialization that we worked through from August 2021 to April 2022. With ten colleagues from different data science teams of SMS digital, we learned how to plan, develop, deploy and continuously improve a production-ready machine learning application.

Machine learning poses a new set of challenges that the majority of existing software architectures are not designed for. It puts stress on software systems because it creates a double dependency: not only on code but also on data. This data dependency causes different parts of the system landscape to suddenly depend on each other; they become entangled. Widespread entanglement is generally an unfavorable property in software architectures, and software developers aim to prevent it.

This diagram shows some typical steps in a machine learning workflow. Many tools and platforms aim to cover one or several steps of such workflows. Machine learning engineering aims to supervise, orchestrate and optimize this tool landscape (diagram from https://ml-ops.org/content/end-to-end-ml-workflow).

To prevent entanglement, one can analyze the data lifecycle with data schemas and make use of data lineage, data provenance and metadata. Such techniques fall under the term machine learning engineering. More generally, the field of machine learning engineering brings machine learning algorithms from a research environment into a production environment. It therefore requires competencies that are also common in technical fields such as software engineering and DevOps.
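To make the idea of a data schema concrete, here is a minimal sketch in plain Python. It is not taken from the specialization; the names `SchemaField`, `validate` and `SENSOR_SCHEMA`, as well as the example fields, are hypothetical and chosen only for illustration. The sketch checks each incoming record against a declared schema, so that a change in an upstream data source is reported at the boundary instead of silently affecting a downstream model.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class SchemaField:
    """One field of a (hypothetical) data schema."""
    name: str
    dtype: type
    required: bool = True


def validate(record: dict[str, Any], schema: list[SchemaField]) -> list[str]:
    """Return a list of schema violations; the list is empty if the record is valid."""
    errors = []
    for field in schema:
        if field.name not in record:
            if field.required:
                errors.append(f"missing field: {field.name}")
            continue
        value = record[field.name]
        if not isinstance(value, field.dtype):
            errors.append(
                f"wrong type for {field.name}: expected {field.dtype.__name__}, "
                f"got {type(value).__name__}"
            )
    # Fields that the schema does not know about often indicate an upstream change.
    known = {field.name for field in schema}
    for name in record:
        if name not in known:
            errors.append(f"unexpected field: {name}")
    return errors


# Hypothetical schema for illustration only.
SENSOR_SCHEMA = [
    SchemaField("sensor_id", str),
    SchemaField("temperature_c", float),
    SchemaField("comment", str, required=False),
]

good = {"sensor_id": "s-01", "temperature_c": 812.5}
bad = {"sensor_id": 7, "temperature": 812.5}  # wrong type and a renamed field

print(validate(good, SENSOR_SCHEMA))
for problem in validate(bad, SENSOR_SCHEMA):
    print(problem)
```

In practice, a dedicated data validation library would likely replace this hand-written check; the point here is only that the schema is written down explicitly and enforced where data enters the system.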

University programs often cover the research and development of machine learning algorithms but don’t teach how to bring such models into production. The Coursera specialization "Machine Learning Engineering for Production (MLOps)" fills this gap and covers how to conceptualize, build, and maintain machine learning in enterprise systems. Participants learn how to develop, deploy and continuously improve a production-scale machine learning application.

The specialization is taught by Andrew Ng, Stanford professor and co-founder of the Google Brain team, together with several Google developers. All of them have experience in creating condensed classes that quickly bring you up to speed in the respective field. Our participants were able to work through the class at their own pace while meeting every week to discuss the content of the specialization.

As there are no gold standards in machine learning engineering yet, the specialization offers a glimpse into newly emerging standards and technologies. It is a hands-on experience in which participants watch short video lectures. Their understanding of the material is then checked by mandatory quizzes and programming exercises. It is a multi-channel learning experience using modern technologies. The platform also provides free cloud credits so that participants can use cloud services to deploy algorithms and test the learned concepts in sandbox environments. That way, the participants could try out techniques like a canary deployment scheme outside of a running project. This type of exercise is a real hands-on experience with a lasting impact on the participants.
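For readers unfamiliar with the deployment technique mentioned above, the following is a minimal sketch of canary-style traffic splitting, written under our own assumptions and not taken from the course material. The function name `route` and the labels "stable" and "canary" are hypothetical. A small, configurable share of requests is sent to the new model version while the rest continues to use the proven one; hashing the request ID keeps the assignment deterministic, so the same ID always reaches the same version.

```python
import hashlib


def route(request_id: str, canary_percent: int) -> str:
    """Assign a request to the "canary" or "stable" model version.

    The request ID is hashed into one of 100 buckets; buckets below
    `canary_percent` go to the canary version.
    """
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return "canary" if bucket < canary_percent else "stable"


# Send roughly 10 % of the traffic to the new model version.
assignments = [route(f"request-{i}", canary_percent=10) for i in range(10_000)]
share = assignments.count("canary") / len(assignments)
print(f"canary share: {share:.1%}")
```

In a real rollout, the percentage would be raised step by step while monitoring the canary version, and set back to zero if its metrics degrade.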

Working through such a specialization alongside your daily work is demanding. For six months, our participants invested six hours per week on average. Most of the participants worked on the specialization in their spare time. This shows the high motivation of our colleagues and their great interest in this area.

In total, colleagues from four countries were involved, ranging from young graduates to senior data scientists with many years of industry experience, so all skill levels were present in the class. This was also a valuable exchange between different development teams, as most of the participants do not regularly work together. In conclusion, we are very happy with this Coursera specialization. It offered a condensed overview of different machine learning engineering techniques and technologies. The skills acquired in this class are already being used in running projects. Furthermore, such classes enhance the collaboration across the development teams. Due to the success of this class, we are already increasing the number of participants and will offer a wider variety of classes.

Many thanks to our great colleagues who carried on with the class and finished the specialization successfully!

Maximilian Christ
Lead Data Scientist
SMS digital GmbH