Setup MLflow on AWS EC2
Welcome to this three-part tutorial on end-to-end MLOps, covering training, tracking, deployment and inference.
Part 1: Setup MLflow on AWS EC2
Part 2: MLOps deployment on AWS Fargate: I
Part 3: MLOps deployment on AWS Fargate: II
Getting on with Part 1…
You can find the GitHub repository with all the code used here!
Neural network training runs are expensive, and the models behind them take a lot of engineering effort, so tracking the metrics of each run is of utmost importance. Here we will use a framework called MLflow to track all our training runs. Its advantage is that it permits tracking across users, teams and organisations, so everyone can see what was learned from previous runs and use it to develop better-performing models over time.
Now we will go through the steps needed to set up an MLflow tracking server on AWS EC2 with minimal effort.
Step 1: Set up an AWS account and sign in to your console
Step 2: Search for EC2 and click on Launch Instance
Step 3: Select Amazon Linux 2 64-bit (x86)
Step 4: Choose t2.micro instance
Step 5: Configure your instance by selecting your existing VPC and subnets if you have them, or else keep the default options
Step 6: Skip to Configure Security Group. If you have existing security groups that allow SSH and HTTP connections, add those. Otherwise create a new security group that allows SSH connections from your IP (to configure the instance) and HTTP connections (to interact with the tracking server)
Step 7: Review and Launch your instance
Step 8: Connect via SSH to your instance. To find the necessary command, open EC2 in your dashboard and click on your instance ID. From there click on Connect and you should see the commands.
Step 9: Install Python 3.5 or above if it is not already on the system. On Amazon Linux 2 the package is named python3
sudo yum install python3
Step 10: Install MLflow
sudo pip3 install mlflow
Step 11: Install httpd-tools to password-protect the dashboard
sudo yum install httpd-tools
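Step 12: Install nginx. If yum cannot find the package on Amazon Linux 2, it may need to be installed with sudo amazon-linux-extras install nginx1 instead
sudo yum install nginx
Step 13: Add a password for testuser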
sudo htpasswd -c /etc/nginx/.htpasswd testuser
Step 14: Configure nginx to reverse proxy to port 5000
sudo nano /etc/nginx/nginx.conf
Add the following location block inside the server block of the config file
location / {
    proxy_pass http://localhost:5000/;
    auth_basic "Restricted Content";
    auth_basic_user_file /etc/nginx/.htpasswd;
}
Step 15: Start the nginx server and the MLflow server
sudo service nginx start
mlflow server --host 0.0.0.0
Step 16: Create an S3 bucket to store the metrics and model
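The bucket can be created from the S3 console. As an alternative, here is a minimal sketch of doing it from Python with boto3. The bucket name and region are placeholders you would choose yourself, and the helper names are made up for this example. It assumes boto3 is installed and AWS credentials are already configured.

```python
def create_bucket_kwargs(bucket_name, region):
    """Build the arguments for S3 create_bucket.

    us-east-1 is the default region and must not be passed as a
    LocationConstraint; every other region has to be named explicitly.
    """
    kwargs = {"Bucket": bucket_name}
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


def create_bucket(bucket_name, region):
    """Create the bucket (hypothetical helper, needs AWS credentials)."""
    # Imported here so the helper above works without boto3 installed.
    import boto3

    s3 = boto3.client("s3", region_name=region)
    s3.create_bucket(**create_bucket_kwargs(bucket_name, region))


# Example call, with a placeholder name and region:
# create_bucket("my-mlflow-artifacts", "eu-west-1")
```

For the tracking server to write artifacts to this bucket, mlflow server accepts a --default-artifact-root option (for example s3://my-mlflow-artifacts), and the EC2 instance needs credentials with access to the bucket.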
Step 17: To use this setup from your Python training code, add the following to your training script
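The original snippet for this step is not shown, so what follows is only a minimal sketch of what such a script could look like, not the author's exact code. It assumes the server is reached through nginx on the instance's public DNS and protected with the testuser password from Step 13. The experiment name, parameter and metric values are invented for illustration.

```python
import os


def tracking_env(username, password):
    """Environment variables MLflow reads for HTTP basic authentication."""
    return {
        "MLFLOW_TRACKING_USERNAME": username,
        "MLFLOW_TRACKING_PASSWORD": password,
    }


def log_example_run(tracking_uri, username, password):
    """Log one dummy run to the remote tracking server."""
    # Imported here so this file can be loaded without mlflow installed.
    import mlflow

    # The credentials must be set before the first request is made.
    os.environ.update(tracking_env(username, password))
    mlflow.set_tracking_uri(tracking_uri)
    mlflow.set_experiment("demo-experiment")  # hypothetical experiment name

    with mlflow.start_run():
        mlflow.log_param("learning_rate", 0.001)
        # Stand-in for a real training loop: one loss value per epoch.
        for epoch, loss in enumerate([0.9, 0.6, 0.4]):
            mlflow.log_metric("loss", loss, step=epoch)


# Example call, replacing the placeholders with your own values:
# log_example_run("http://<your-ec2-public-dns>", "testuser", "<password>")
```

In a real script the log_param and log_metric calls would sit inside your own training loop. Reading the password from the environment rather than hard-coding it keeps it out of version control.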
Step 18: After completing the steps above, you will be able to view the MLflow dashboard at the public DNS of your EC2 instance
To find your public DNS, open your instance and click on Connect. Copy the address and open it in a new browser tab.
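If you would rather check from a script that the dashboard is reachable, the sketch below sends one request with the basic-auth credentials from Step 13, using only the Python standard library. The function names are made up for this example and the URL is a placeholder for your own public DNS.

```python
import base64
import urllib.request


def basic_auth_header(username, password):
    """Return the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def dashboard_status(base_url, username, password, timeout=10):
    """Request the dashboard and return the HTTP status code.

    urlopen raises urllib.error.HTTPError for error responses such as
    401, which is what nginx returns when the credentials are wrong.
    """
    request = urllib.request.Request(
        base_url,
        headers={"Authorization": basic_auth_header(username, password)},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Example call, replacing the placeholders with your own values:
# print(dashboard_status("http://<your-ec2-public-dns>", "testuser", "<password>"))
```

A status of 200 suggests that both nginx and the MLflow server behind it are responding.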
Congratulations, you have successfully set up the MLflow tracking server on an EC2 instance.
Hope you liked the tutorial.😁😁