Channel: PyData
Category: Science & Technology
Tags: python, learn to code, education, software, pydata, learn, coding, how to program, julia, opensource, scientific programming, numfocus, python 3, tutorial
Description: Serving BERT Models in Production with TorchServe

Speakers: Adway Dhillon, Nidhin Pattaniyil

Summary
This talk is for data scientists and ML engineers looking to serve their PyTorch models in production. It covers post-training steps that should be taken to optimize the model, such as quantization and TorchScript. It also walks the user through packaging and serving the model with Facebook’s TorchServe.

Description

Intro (10 mins)
- Introduce the deep learning BERT model.
- Walk through the notebook setup on Google Colab.
- Show the end model served along with sample inference.

Review Some Deep Learning Concepts (10 mins)
- Review sample trained PyTorch model code
- Review sample model transformer architecture
- Tokenization / pre- and post-processing

Optimizing the model (30 mins)
- Two modes of PyTorch: eager vs. script mode
- Benefits of script mode and PyTorch JIT
- Post-training optimization methods: static and dynamic quantization, distillation
- Hands on:
  - Quantizing the model
  - Converting the BERT model with TorchScript

Deploying the model (30 mins)
- Overview of deployment options: pure Flask app vs. model servers like TorchServe / TF-Serving
- Benefits of TorchServe: high-performance serving, multi-model serving, model versioning for A/B testing, server-side batching, support for pre- and post-processing
- Exploring the built-in model handlers and how to write your own
- Managing the model through the Management API
- Exploring built-in and custom metrics provided by TorchServe
- Hands on:
  - Package the given model using Torch Model Archiver
  - Write a custom handler to support pre-processing and post-processing

Lessons Learned (10 mins)
- Share some performance benchmarks of the model served at Walmart Search
- Next steps

Q&A (5 mins)

Adway Dhillon's Bio
Software and Machine Learning Engineer at Walmart Labs
GitHub: github.com/adwaydhillon
LinkedIn: linkedin.com/in/adwaydhillon

Nidhin Pattaniyil's Bio
Senior Machine Learning Engineer at Walmart Labs
GitHub: github.com/npatta01
Twitter: twitter.com/npatta01
LinkedIn: linkedin.com/in/nidhinpattaniyil
Website: npatta01.github.io/

PyData Global 2021
Website: pydata.org/global2021
LinkedIn: linkedin.com/company/pydata-global
Twitter: twitter.com/PyData

pydata.org

PyData is an educational program of NumFOCUS, a 501(c)(3) non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.

PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.

Want to help add timestamps to our YouTube videos to help with discoverability? Find out more here: github.com/numfocus/YouTubeVideoTimestamps
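
Illustration (not from the talk): the agenda lists post-training quantization as one of the optimization steps. As a rough sketch of the arithmetic idea only, the plain-Python snippet below maps a few float weights to int8 values with a single shared scale and then restores them. The function names `quantize_int8` and `dequantize` and the sample weights are hypothetical, chosen for illustration; PyTorch's own quantization machinery is more involved and is not reproduced here.

```python
# Illustrative only: symmetric, per-tensor int8 quantization in plain Python.
# This is NOT PyTorch's implementation; all names here are hypothetical.


def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    # Fall back to a scale of 1.0 when every weight is zero.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the int8 values."""
    return [v * scale for v in quantized]


weights = [0.5, -1.27, 0.02, 1.0]  # made-up sample values
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The rounding error per weight is at most half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

The trade-off this hints at is the one the talk's optimization section addresses: storing weights as 8-bit integers shrinks the model and can speed up inference, at the cost of a small, bounded rounding error per weight.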