Case Study - Simplified Deployment for Machine Learning


Executive Summary:
Stratus10 helped GravityAI create an Infrastructure as Code (IaC) implementation and deployment process for their customer environments. GravityAI’s solution requires a custom environment for each customer, and Stratus10 helped them use AWS CloudFormation and AWS Service Catalog to deliver consistent, stable environments for new customers with minimal overhead on their operations team.

About GravityAI:
GravityAI is a leader in Machine Learning and Artificial Intelligence as a service. They deliver pre-built and customizable machine learning models to isolated customer environments. These models can continue to be trained and redeployed within the customer’s environment, ensuring maximum control and security of the customer’s data.


Customer Challenges:
GravityAI was building every customer environment by hand through the AWS Console. Every customer needs a new VPC, an ECS cluster, multiple ECR repositories, CloudFront distributions, and more. It took GravityAI several days to fully onboard a new customer and give them access to their environment.

Building environments manually also made it difficult to keep them consistent. Because every environment was slightly different, the overhead of administering them increased.

Finally, GravityAI was deploying part of their application to EC2 instances because of underlying system requirements. They wanted to reuse their ECS compute layer rather than having a specialized deployment process for these components.


Why AWS:
GravityAI’s solution requires highly flexible and scalable infrastructure that they simply could not operate in an on-premises environment. Their complex networking, large spikes in compute needs, and security requirements would require a massive on-premises solution that would not be cost effective.

By leveraging Amazon ECS, Amazon VPC, Amazon CloudFront, and Amazon RDS, GravityAI is able to deliver best-in-class Machine Learning solutions that are optimized for the cloud. They can produce segregated environments for each customer, which gives customers the flexibility to run highly sensitive data through the GravityAI solution without having to compromise security.


Why Stratus10:
Stratus10 was the perfect fit to deliver a DevOps solution for GravityAI that was repeatable and flexible enough to adjust to their clients’ demands. Our expertise across the AWS suite of services resulted in a solution that leveraged services GravityAI didn’t even know were available. Stratus10’s knowledge of containerization and Docker helped GravityAI remove EC2 from their compute layer and containerize a part of their application that they did not think could be containerized. Finally, Stratus10 delivered parameterized CloudFormation templates that removed the complexity of creating customer environments.


The Solution:
Stratus10 delivered a solution based on AWS Service Catalog with cross-account sharing to deploy a VPC with ECS, RDS, S3, ECR, CodePipeline, CloudFront, Route 53, and AWS WAF (Web Application Firewall) components.
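
As an illustration of what a parameterized template of this kind can look like, here is a minimal sketch in Python that builds a CloudFormation template as a dictionary and serializes it to JSON. It covers only a VPC and an ECS cluster. The parameter names (CustomerName, VpcCidr), the logical resource names, and the default CIDR range are assumptions made for this example and are not taken from GravityAI's actual templates.

```python
import json


def build_customer_template() -> dict:
    """Return a minimal parameterized CloudFormation template as a dict.

    All parameter and logical resource names are hypothetical.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Sketch of a per-customer environment (VPC and ECS cluster only)",
        # Values supplied per customer at deployment time.
        "Parameters": {
            "CustomerName": {
                "Type": "String",
                "AllowedPattern": "[a-z][a-z0-9-]*",
            },
            "VpcCidr": {"Type": "String", "Default": "10.0.0.0/16"},
        },
        "Resources": {
            "CustomerVpc": {
                "Type": "AWS::EC2::VPC",
                "Properties": {
                    "CidrBlock": {"Ref": "VpcCidr"},
                    "EnableDnsSupport": True,
                    "EnableDnsHostnames": True,
                    "Tags": [
                        {"Key": "Customer", "Value": {"Ref": "CustomerName"}}
                    ],
                },
            },
            "CustomerCluster": {
                "Type": "AWS::ECS::Cluster",
                "Properties": {
                    "ClusterName": {"Fn::Sub": "${CustomerName}-cluster"},
                },
            },
        },
        "Outputs": {
            "VpcId": {"Value": {"Ref": "CustomerVpc"}},
            "ClusterName": {"Value": {"Ref": "CustomerCluster"}},
        },
    }


# JSON form of the template, suitable for saving to a file.
template_json = json.dumps(build_customer_template(), indent=2)
```

Because every customer-specific value enters through the Parameters section, the same template can be reused unchanged for each new environment.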

ECS was the perfect container orchestration solution because GravityAI needs only a relatively small number of individual services and the service itself has low overhead and cost. The horizontal scalability of ECS fits GravityAI’s very spiky compute requirements, and the simple delivery process with CodePipeline ensures that an automated delivery solution is provided out of the box for GravityAI’s customers.

In addition to automating the creation of the environment, Stratus10 was able to assist GravityAI in containerizing the portion of their application that produces the trained models from their customers’ data. Because this component calls Docker commands throughout its build and test processes, and because running Docker inside Docker has limitations, GravityAI was not sure this portion of their application could be containerized. We were able to mount the Docker socket into the containers and execute Docker commands using the host’s Docker daemon. This allowed GravityAI to remove all self-managed EC2 instances from their customer environments and optimize the cost of running those services on the cloud.
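
The socket-mounting approach described above can be sketched as an ECS task definition that bind-mounts the host's Docker socket into the container. The Python below only builds the request dictionary in the shape accepted by the ECS RegisterTaskDefinition API. The family, container name, image, and memory value are hypothetical, and the sketch assumes the task runs on ECS container instances where the host path /var/run/docker.sock is available; the case study does not state how the cluster capacity is provisioned.

```python
# Default location of the Docker daemon socket on a Linux host.
DOCKER_SOCKET = "/var/run/docker.sock"


def build_task_definition(family: str, image: str) -> dict:
    """Build an ECS task definition that shares the host's Docker socket.

    The container name and memory size are illustrative assumptions.
    """
    return {
        "family": family,
        # Assumption: host bind mounts need container-instance capacity.
        "requiresCompatibilities": ["EC2"],
        # Declare the host path as a named volume.
        "volumes": [
            {"name": "docker-socket", "host": {"sourcePath": DOCKER_SOCKET}}
        ],
        "containerDefinitions": [
            {
                "name": "model-builder",
                "image": image,
                "memory": 2048,
                "essential": True,
                # Mount the socket at the same path inside the container so
                # the docker CLI in the container talks to the host daemon.
                "mountPoints": [
                    {
                        "sourceVolume": "docker-socket",
                        "containerPath": DOCKER_SOCKET,
                        "readOnly": False,
                    }
                ],
            }
        ],
    }


task_definition = build_task_definition("model-builder", "example/model-builder:latest")
```

A dictionary like this could be passed to boto3 as ecs_client.register_task_definition(**task_definition). Note that a container with access to the host's Docker socket can control every container on that host, so this pattern is usually limited to trusted build workloads.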

Stratus10 leveraged AWS Service Catalog to provide a simple interface of pre-built environment definitions that GravityAI can deploy with the click of a button. The sharing capability allows GravityAI to keep all of their customer environments in sync by maintaining the Products in their main account.
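
The same deployment can also be started programmatically rather than from the console. The sketch below builds the arguments for the Service Catalog ProvisionProduct API and passes them to a client object. The product and artifact IDs, the provisioned product naming scheme, and the parameter keys (CustomerName, VpcCidr) are placeholders assumed for this example and would have to match the real Product's template.

```python
def build_provision_request(
    product_id: str,
    artifact_id: str,
    customer_name: str,
    vpc_cidr: str,
) -> dict:
    """Build keyword arguments for Service Catalog's provision_product call.

    The parameter keys are hypothetical and must match the template
    behind the Product being provisioned.
    """
    return {
        "ProductId": product_id,
        "ProvisioningArtifactId": artifact_id,
        "ProvisionedProductName": f"{customer_name}-environment",
        "ProvisioningParameters": [
            {"Key": "CustomerName", "Value": customer_name},
            {"Key": "VpcCidr", "Value": vpc_cidr},
        ],
    }


def provision(client, request: dict):
    """Submit the request using a boto3 'servicecatalog' client."""
    return client.provision_product(**request)


# Placeholder IDs, for illustration only.
request = build_provision_request(
    "prod-EXAMPLE", "pa-EXAMPLE", "acme", "10.20.0.0/16"
)
```

With boto3 installed and credentials configured, the call would look like provision(boto3.client("servicecatalog"), request). Keeping the request builder separate from the API call makes it possible to review or test the parameters without touching an AWS account.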

Results and Benefits:
GravityAI’s Service Catalog solution, built around CloudFormation templates, has significantly reduced the time it takes to onboard new customers. Creating a customer environment used to take multiple days; the entire environment now comes up in less than 20 minutes. The consistency delivered through an infrastructure as code approach has also greatly simplified the administration of each of the customer environments that GravityAI has to maintain, and has ensured that consistent performance and functionality are delivered to each customer.

By containerizing all of the components in the GravityAI solution, Stratus10 has helped them optimize the cost of running their customers’ environments and simplified the solution that GravityAI has to administer.


Use case: DevOps Automation

Client: GravityAI

Date: January 2021

Category: DevOps / AI / ML