Scaling Production Machine Learning Pipelines with Databricks
Conde Nast is a global leader in media production, housing iconic brands such as The New Yorker, Wired, Vanity Fair, and Epicurious, among many others. Alongside content production, Conde Nast invests heavily in companion products that enhance our audience’s experience. One such product is Spire, Conde Nast’s service for user segmentation and targeted advertising for over a hundred million users. Spire consists of thousands of models, many of which require individual scheduling and optimization. From data preparation to model training to inference, we’ve built abstractions around the data flow, monitoring, orchestration, and other internal operations. In this talk, we explore the complexities of building large-scale machine learning pipelines within Spire and discuss some of the solutions we’ve discovered using Databricks, MLflow, and Apache Spark. The key focus is on production-grade engineering patterns, the inner workings of the required components, and the lessons learned throughout their development.
About:
Databricks provides a unified data analytics platform, powered by Apache Spark™, that accelerates innovation by unifying data science, engineering and business.
Read more here: https://databricks.com/product/unifie…
Connect with us:
Website: https://databricks.com
Facebook: https://www.facebook.com/databricksinc
Twitter: https://twitter.com/databricks
LinkedIn: https://www.linkedin.com/company/databricks
Instagram: https://www.instagram.com/databricksinc/
Databricks is proud to announce that Gartner has named us a Leader in both the 2021 Magic Quadrant for Cloud Database Management Systems and the 2021 Magic Quadrant for Data Science and Machine Learning Platforms. Download the reports here: https://databricks.com/databricks-named-leader-by-gartner