SREcon22 Asia/Pacific – Improving Machine Learning Development Reliability
Improving Machine Learning Development Reliability
Brian Hansen and Yan Yan, Meta
The machine learning development lifecycle is not the same as the software development lifecycle. It’s so different that we believe we need new ways to rationalize how we build, monitor, and alert on ML artifacts as they go through the process. This talk explores those differences. It highlights the challenges of ML reliability and scalability, what we’ve done, and the need for this community’s involvement in evolving how we think about the development and productization of machine learning as it explodes across our industry.
View the full SREcon22 Asia/Pacific program at https://www.usenix.org/conference/srecon22apac