Setup MLflow on AWS EC2

Vaibhav Satpathy
Published in Analytics Vidhya
4 min read · Jan 7, 2021


Welcome to this three-part tutorial on end-to-end MLOps, covering training, tracking, deployment and inference.

Part 1: Setup MLflow on AWS EC2

Part 2: MLOps deployment on AWS Fargate: I

Part 3: MLOps deployment on AWS Fargate: II

Getting on with Part 1…

You can find the GitHub repository for all the code used in this series here!

Tracking the metrics of the heavy-duty training runs that engineers put neural networks through is of utmost importance. Here we will use a framework called MLflow to track all our training runs. Its advantage is that it permits tracking across users, teams and organisations, so that they can learn from previous runs and use those learnings to develop better-performing models over time.

We will now go through the steps needed to set up an MLflow tracking server on AWS EC2 with minimal effort.

Step 1: Set up an AWS account and sign in to your console

Step 2: Search for EC2 and click on Launch Instance

Step 3: Select Amazon Linux 2 64-bit (x86)

Step 4: Choose t2.micro instance

Step 5: Configure your instance by selecting your existing VPC and subnets, if you have them, or else keep the default options

Step 6: Skip to Configure Security Group. If you have existing security groups that allow SSH and HTTP connections, add those; otherwise create a new security group that allows SSH connections from your IP, to configure the instance, and HTTP connections, to interact with the tracking server

Step 7: Review and Launch your instance
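If you would rather script Steps 3 to 7 than click through the console, the same choices can be expressed with boto3. This is only a rough sketch under a few assumptions: boto3 and AWS credentials are already set up, the AMI ID, key pair name and group name are placeholders you would supply yourself, the default VPC is used, and HTTP is opened to everyone (0.0.0.0/0), which the steps above do not specify. The helper names are made up for illustration.

```python
def ingress_rules(my_ip_cidr):
    """IpPermissions for Step 6: SSH from one address, HTTP from anywhere."""
    return [
        {   # SSH, restricted to your own address, e.g. "203.0.113.7/32"
            "IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
            "IpRanges": [{"CidrIp": my_ip_cidr}],
        },
        {   # HTTP on port 80 (assumption: open to all addresses)
            "IpProtocol": "tcp", "FromPort": 80, "ToPort": 80,
            "IpRanges": [{"CidrIp": "0.0.0.0/0"}],
        },
    ]


def launch_params(ami_id, key_name, security_group_id):
    """Keyword arguments for ec2.run_instances matching Steps 3 to 6."""
    return {
        "ImageId": ami_id,            # an Amazon Linux 2 x86 AMI for your region
        "InstanceType": "t2.micro",   # Step 4
        "KeyName": key_name,          # key pair used later for SSH
        "SecurityGroupIds": [security_group_id],
        "MinCount": 1,
        "MaxCount": 1,
    }


def launch(ami_id, key_name, my_ip_cidr, group_name="mlflow-tracking"):
    """Create the security group and start one instance; returns its ID."""
    import boto3  # imported lazily; needs `pip install boto3` and AWS credentials

    ec2 = boto3.client("ec2")
    group = ec2.create_security_group(
        GroupName=group_name,
        Description="SSH and HTTP for MLflow tracking server",
    )
    ec2.authorize_security_group_ingress(
        GroupId=group["GroupId"], IpPermissions=ingress_rules(my_ip_cidr)
    )
    reply = ec2.run_instances(
        **launch_params(ami_id, key_name, group["GroupId"])
    )
    return reply["Instances"][0]["InstanceId"]
```

Nothing here touches AWS until launch is called with your own AMI ID, key pair name and IP range.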

Step 8: Connect via SSH to your instance. To find the necessary command, open EC2 in your dashboard and click on your instance ID; from there click on Connect and you should see the commands.

Step 9: Install Python 3.5 or above if it is not already present on the system

sudo yum install python3

Step 10: Install MLflow

sudo pip3 install mlflow

Step 11: Install httpd-tools, which provides htpasswd, to password-protect the dashboard

sudo yum install httpd-tools

Step 12: Install nginx

sudo yum install nginx

Step 13: Add a password for testuser

sudo htpasswd -c /etc/nginx/.htpasswd testuser

Step 14: Configure nginx to reverse proxy to port 5000

sudo nano /etc/nginx/nginx.conf

Add the following location block inside the server block of the config file, replacing any existing location / block

location / {
    proxy_pass http://localhost:5000/;
    auth_basic "Restricted Content";
    auth_basic_user_file /etc/nginx/.htpasswd;
}

Step 15: Start the nginx server and mlflow server

sudo service nginx start
mlflow server --host 0.0.0.0

Step 16: Create an S3 bucket to store the model and other run artifacts
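The bucket can be created from the S3 console, or from Python. Here is a minimal sketch with boto3, assuming credentials are configured; the bucket name and region are yours to choose, and the helper names are hypothetical. One quirk worth knowing: for us-east-1 the LocationConstraint must be left out, while for every other region it is required.

```python
def create_bucket_kwargs(bucket_name, region):
    """Arguments for s3.create_bucket, handling the us-east-1 special case."""
    kwargs = {"Bucket": bucket_name}
    if region != "us-east-1":
        # Every region except us-east-1 needs an explicit LocationConstraint.
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


def create_artifact_bucket(bucket_name, region):
    """Create the bucket and return its s3:// URI."""
    import boto3  # imported lazily; needs `pip install boto3` and AWS credentials

    s3 = boto3.client("s3", region_name=region)
    s3.create_bucket(**create_bucket_kwargs(bucket_name, region))
    return f"s3://{bucket_name}"
```

The server command in Step 15 does not mention the bucket. One common way to connect the two is to pass the returned URI to mlflow server through its --default-artifact-root option, with the EC2 instance given permission to write to the bucket.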

Step 17: To use this setup in your Python training code, set the tracking URI in your training script to the address of this server, along with the testuser credentials
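What exactly goes into the training script depends on your model, so treat the following as a minimal sketch rather than the definitive snippet. It assumes the server is reachable over HTTP at the public DNS of your instance and is protected by the testuser password from Step 13. The experiment name, parameter and loss values are placeholders, and the function names are made up for illustration. The MLflow client picks up basic-auth credentials from the MLFLOW_TRACKING_USERNAME and MLFLOW_TRACKING_PASSWORD environment variables.

```python
import os


def tracking_env(username, password):
    """Environment variables the MLflow client reads for HTTP basic auth."""
    return {
        "MLFLOW_TRACKING_USERNAME": username,
        "MLFLOW_TRACKING_PASSWORD": password,
    }


def log_example_run(tracking_uri, username, password):
    """Log one illustrative run to the remote tracking server."""
    import mlflow  # imported lazily; needs `pip install mlflow`

    os.environ.update(tracking_env(username, password))
    # tracking_uri is "http://" followed by the public DNS of your instance
    mlflow.set_tracking_uri(tracking_uri)
    mlflow.set_experiment("demo-experiment")  # hypothetical experiment name

    with mlflow.start_run():
        mlflow.log_param("learning_rate", 0.01)  # placeholder hyperparameter
        # Placeholder loss values standing in for a real training loop
        for epoch, loss in enumerate([0.9, 0.6, 0.4]):
            mlflow.log_metric("loss", loss, step=epoch)
```

In a real script the log_param and log_metric calls would sit inside your own training loop, next to the values they record.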

Step 18: Once you have performed the above actions, you will be able to view the MLflow dashboard at the public DNS of your EC2 instance

To find your public DNS, open your instance and click on Connect. Copy the address and open it in a new browser tab.
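If you want to check the setup without a browser, a small script can confirm that nginx asks for the password and lets testuser through. This is a sketch using only the Python standard library; the URL and credentials are whatever you configured above, and the function names are hypothetical. HTTP basic auth simply sends "username:password", base64-encoded, in an Authorization header.

```python
import base64
import urllib.error
import urllib.request


def basic_auth_header(username, password):
    """Value of the Authorization header for HTTP basic auth."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def dashboard_status(url, username=None, password=None, timeout=10):
    """Return the HTTP status code the server answers with."""
    request = urllib.request.Request(url)
    if username is not None:
        request.add_header(
            "Authorization", basic_auth_header(username, password)
        )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # nginx answers 401 when credentials are missing or wrong
        return err.code
```

With the server running, calling dashboard_status on your public DNS address without credentials should give 401, and with the testuser credentials it should give 200.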

Congratulations, you have successfully set up the MLflow tracking server on an EC2 instance.
Hope you liked the tutorial.😁😁
