Engineering Best Practices for Machine Learning // Alex Serban // MLOps Meetup #79
MLOps Community Meetup #79! Last Wednesday we talked to Alex Serban, a PhD Candidate at Radboud University.
//Abstract
The increasing reliance on applications with ML components calls for mature engineering techniques that ensure these are built in a robust and future-proof manner. Moreover, the negative impact that improper use of ML can have on users and society is now widely recognized and policymakers are working on guidelines aiming to promote trustworthy development of ML.
To address these issues, we mined both academic and non-academic literature and compiled a catalog of engineering best practices for the development of ML applications. The catalog was validated with over 500 teams of practitioners, allowing us to extract valuable information about practice difficulty and the effects of adopting the practices.
In this talk, I will give an overview of our findings, which indicate, for example, that teams tend to neglect traditional software engineering practices, and that effects such as traceability or reproducibility can be accurately predicted from practice adoption. Moreover, I will present a quantitative method to assess a team’s engineering ability to develop software with ML components and to suggest improvements to your team’s processes.
// Bio
Alex works at the intersection of machine learning and software engineering, looking for ways to design, develop and maintain robust machine learning solutions.
Since robustness has broad implications along each stage of the development life cycle, Alex studies robustness both from a system (engineering) and from an algorithmic perspective.
// Related links
Slides: https://cs.ru.nl/~aserban/talks/pdf/engineering_ML.pdf
https://cs.ru.nl/~aserban/
https://se-ml.github.io/
————— ✌️ Connect With Us ✌️ —————
Join our slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, Feature Store, Machine Learning Monitoring and Blogs: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Alex on https://www.linkedin.com/in/serbanac/
Timestamps:
[00:00] Alex Serban’s background
[00:32] Machine Learning robustness
[02:21] Robustness in the wild
[03:11] Robustness in policy
[03:55] Good engineering, a prerequisite for building robust ML systems
[05:16] Investigating ML engineering best practices
[08:19] Online catalog of engineering practices for ML
[10:54] Example practice
[11:54] Measuring practice adoption
[13:50] Tech companies lead practice adoption
[15:03] Practice adoption increases with team size and experience
[16:37] ML-specific practices are adopted slightly more than traditional SE practices
[17:36] Practice adoption by data type
[18:19] Example practice
[19:10] 29 practices ranked
[20:02] Most adopted practices
[20:53] Least adopted practices
[21:48] Measuring effects of practice adoption
[21:56] Shadow deployment low adoption
[24:53] Different practices, different outcomes
[25:31] Practice importance for each effect
[26:32] Engineering best practices for ML
[27:50] Seven key requirements
[29:17] New practices, mapped to trustworthiness requirements
[30:17] Adoption of practices for trustworthy ML
[30:50] Challenges of adopting new best practices
[31:51] EU Regulations Robust Data Set gray area
[34:45] Key Takeaways
[35:51] Take the survey at se-ml.github.io!
[36:06] Learn more
[37:00] Post-survey follow-up
[38:33] Surprising best practices
[39:26] Criticisms
[40:34] Discarded practices
[41:12] Is Machine Learning becoming crowded like Cyber Security?
[42:27] Random Forest Model best practices
[43:14] Difficult but important best practices
[44:30] Interpretability method suggestion for GNNs
[45:09] Impact of the paper
[47:17] ML Test Score
[47:44] Goal