Our California-based utilities client supplies natural gas and electricity to over 3 million customers every year. Since incorporating over 140 years ago, this Fortune 500 company has built more than 10,000 miles of underground power lines and 6,000 miles of overhead power lines.
Scaling machine learning
Our client had begun using models to respond to fire emergencies, but was struggling to scale them to meet the intensity and unpredictability of fire season. Performance, reliability, auditability, and developer-friendly tools were major priorities for the client. They wanted a smooth transition to the cloud that maximized the work they had already put into their models, while architecting a future operating state that made model operations more collaborative, automated, and visible within the enterprise.
Success in the cloud
Our goal was to pair an innovative machine learning architecture with industry-leading best practices. The new architecture would improve scalability, reduce costs, establish an automated audit trail, and optimize business insights, all as part of our client’s forward-thinking cloud strategy. Using MLOps, solution architecture, and workflow best practices would help the team work effectively and enable new data scientists to onboard easily.
For a scalable, cost-effective, and modern solution, we turned to AWS.
Compiling the data
Data science teams often develop features from a range of datasets housed in multiple systems, each with different update schedules and owners. One of our first priorities for this cloud migration was to place data into a cost-effective storage solution and enable low-latency querying using a cloud-native query engine. We selected Athena for querying and used Glue and Wrangler to infer schemas and convert data files into performant formats. When selecting services and designing architecture, we considered feature complexity, data size, cross-team sharing, and model use cases.
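To make the idea of converting data files into a performant format more concrete, here is a minimal sketch that builds an Athena CREATE TABLE AS SELECT (CTAS) statement to rewrite a raw table as Parquet. It is only an illustration, not the pipeline used on this project: the database, table, and bucket names are hypothetical, and Parquet is assumed as the target format.

```python
import re
from dataclasses import dataclass

# Conservative pattern for unquoted identifiers (a simplifying assumption).
_IDENTIFIER = re.compile(r"^[a-z_][a-z0-9_]*$")


@dataclass(frozen=True)
class ParquetConversion:
    """One raw-to-Parquet conversion. All names are hypothetical."""
    database: str
    source_table: str
    target_table: str
    target_location: str  # S3 prefix where the Parquet files should land


def build_ctas(job: ParquetConversion) -> str:
    """Return an Athena CTAS statement that rewrites a table as Parquet."""
    for name in (job.database, job.source_table, job.target_table):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsupported identifier: {name!r}")
    if not job.target_location.startswith("s3://"):
        raise ValueError("target_location must be an s3:// URI")
    location = job.target_location.rstrip("/") + "/"
    return (
        f"CREATE TABLE {job.database}.{job.target_table}\n"
        f"WITH (format = 'PARQUET', external_location = '{location}')\n"
        f"AS SELECT * FROM {job.database}.{job.source_table}"
    )


if __name__ == "__main__":
    job = ParquetConversion(
        database="example_db",
        source_table="raw_events",
        target_table="events_parquet",
        target_location="s3://example-bucket/curated/events",
    )
    print(build_ctas(job))
```

The resulting statement could be submitted through the Athena console or any Athena client. Columnar formats such as Parquet generally reduce the amount of data each query scans, which is where the cost and latency benefits come from.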
Scalable machine learning
We chose SageMaker Studio as our data science platform. SageMaker combines the power of Jupyter Notebooks with the tools and services available in AWS. Our solution combined SageMaker with CI/CD pipelines, Terraform infrastructure as code, and distributed compute workflows to create a versatile environment for experimentation and model development. Deployment pathways enable quick model promotion into production using automated pipelines. Migrated models run faster in AWS with scalable compute, while development teams iterate faster with data, environments, and infrastructure that are established through SageMaker and ready for use.
Model governance strategy for auditability and time savings
Technical debt is a reality for data science teams that are iterating rapidly to meet business demands. MLOps and model governance strategies can balance immediate needs with a long-term vision for growth and scalability. By putting model governance into practice using a model registry, we automated the creation of audit trails and the versioning of data, model artifacts, and models, with approval gates built in to ensure only high-quality models go to production.
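The approval-gate idea can be sketched in a few lines of plain Python. This is a simplified stand-in for a model registry, not the client's implementation: the `ModelVersion` class, the metric names, the thresholds, and the reviewer field are all assumptions made for illustration. Only the three status strings are borrowed from the approval states that SageMaker Model Registry uses.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Status strings mirror the approval states used by SageMaker Model Registry.
APPROVED = "Approved"
REJECTED = "Rejected"
PENDING = "PendingManualApproval"


@dataclass
class ModelVersion:
    """Hypothetical registry entry tying a model to its data and metrics."""
    name: str
    version: int
    metrics: dict
    data_snapshot: str  # identifier of the dataset version used for training
    status: str = PENDING
    audit_trail: list = field(default_factory=list)


def apply_gate(model: ModelVersion, thresholds: dict, reviewer: str = None) -> str:
    """Set the approval status and append an audit record for the decision."""
    # A metric that is missing counts as a failure.
    failures = sorted(
        metric
        for metric, minimum in thresholds.items()
        if model.metrics.get(metric, float("-inf")) < minimum
    )
    if failures:
        status = REJECTED
    elif reviewer is None:
        status = PENDING  # metrics pass, but a human has not signed off yet
    else:
        status = APPROVED
    model.status = status
    model.audit_trail.append(
        {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "model": f"{model.name}:{model.version}",
            "data_snapshot": model.data_snapshot,
            "status": status,
            "failed_metrics": failures,
            "reviewer": reviewer,
        }
    )
    return status


if __name__ == "__main__":
    candidate = ModelVersion("example-model", 3, {"recall": 0.91}, "snapshot-001")
    print(apply_gate(candidate, {"recall": 0.85}))
    print(apply_gate(candidate, {"recall": 0.85}, reviewer="example.reviewer"))
```

Because every call appends a record, the audit trail builds up as a side effect of the normal promotion workflow rather than as a separate manual step.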
Our migration of these modeling activities has improved the data science team’s workflow and made delivering insights faster and easier. Data science teams can develop locally or in SageMaker for easy integration with familiar tools and workflows. When a model is ready for deployment, the promotion process can be automated and the advantages of model governance strategies, data pipelines, and CI/CD can be fully realized. Model predictions are easily accessible, and permissioning is streamlined through IAM roles, so insights and records aren’t lost on a hard drive or delayed by time-consuming permissioning processes.
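As an illustration of role-based permissioning, the sketch below generates a read-only IAM policy document that could be attached to a role for consumers of model predictions. It assumes predictions are stored under an S3 prefix, which is not stated above, and the bucket and prefix names are hypothetical.

```python
import json


def read_only_predictions_policy(bucket: str, prefix: str) -> dict:
    """Build an IAM policy document granting read-only access to one S3 prefix."""
    prefix = prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Allow listing only the keys under the predictions prefix.
                "Sid": "ListPredictionPrefix",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
            {
                # Allow reading objects under that prefix, and nothing else.
                "Sid": "ReadPredictionObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
        ],
    }


if __name__ == "__main__":
    policy = read_only_predictions_policy("example-bucket", "predictions/")
    print(json.dumps(policy, indent=2))
```

Keeping a policy like this in code means access to predictions can be reviewed and versioned alongside the rest of the infrastructure.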
A better system for a safer world
This cloud migration and MLOps implementation has had an immediate, positive impact on our client’s ability to prevent and mitigate wildfires. As climate conditions continue to change and wildfire threats emerge, our client is positioned to scale its operations to accommodate complex use cases and keep pace with innovation.