TASK 3: AUTOMATION IN ML

Saurava

5 min read · May 29, 2020

PROBLEM STATEMENT

1. Create a container image that has Python3 and Keras or NumPy installed, using a Dockerfile.

2. When we launch this image, it should automatically start training the model in the container.

3. Create a job chain of job1, job2, job3, job4 and job5 using the Build Pipeline plugin in Jenkins.

4. Job1: Pull the GitHub repo automatically when a developer pushes to GitHub.

5. Job2: By looking at the code or program file, Jenkins should automatically start the container image that has the respective machine learning software and interpreter installed, to deploy the code and start training (e.g. if the code uses a CNN, Jenkins should start the container that already has all the software required for CNN processing installed).

6. Job3: Train your model and predict accuracy or metrics.

7. Job4: If the accuracy metric is less than 80%, tweak the machine learning model architecture.

8. Job5: Retrain the model or notify that the best model has been created.

9. Create one extra job, job6, for monitoring: if the container where the app is running fails for any reason, this job should automatically start the container again from where the last trained model left off.

SOLUTION:

Now that we have decided what has to be done, I have created 6 jobs for this project.

Creating the git repository

When the developer writes code in the local repository, it first has to be pushed from the local repository to GitHub.

JOB1: PULL CODE FROM GITHUB

Whenever the developer pushes code to GitHub, this job is triggered automatically, because I have used Poll SCM, which keeps an eye on the GitHub repository. Our duty is to provide the GitHub URL in the Git plugin so that the code is cloned into the local workspace on the base OS where Jenkins is set up (Red Hat as the base OS).

In the repository URL field I have given the path of my own GitHub repository.

So this command will copy the code from GitHub to the local repository on the base OS, Red Hat.

JOB2: CHECK CODE

This job looks at the code of the main program file, in my case mlproject.py. Our main motive is to check the Python code and search it for important keywords, so that we can create an environment that matches it. In my case I have used LeNet.
A CNN model will definitely contain two words that it needs to implement its modules: keras and conv2D. Thus, if these words are present, we know that the code is a CNN model.
So we have to create one image in which all the modules for building a CNN model are present. For this, we first create a Dockerfile and list in it all the modules needed to set up the environment.
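The keyword check described above can be sketched in Python. This is only a minimal illustration, not the exact script used in the job: the function names `detect_model_type` and `detect_model_type_from_file` are hypothetical, and the match is made case-insensitive as an assumption so that both `conv2D` and `Conv2D` are found.

```python
def detect_model_type(source_code: str) -> str:
    """Return 'cnn' when both marker keywords appear in the source, else 'unknown'."""
    text = source_code.lower()  # assumption: ignore case when searching
    if "keras" in text and "conv2d" in text:
        return "cnn"
    return "unknown"


def detect_model_type_from_file(path: str) -> str:
    """Read a program file such as mlproject.py and classify its contents."""
    with open(path, encoding="utf-8") as handle:
        return detect_model_type(handle.read())
```

Job2 could call `detect_model_type_from_file("mlproject.py")` and launch the CNN container only when the result is `"cnn"`.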

As we know, these are the basic modules for building a CNN model.

We have created one image, mlops, with version v1, as you can see at the top of the image list; the size of my image is about 2.26 GB.

Now this shell script launches a container named mlos, and we have created a volume named mlops1.
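As a rough sketch of what such a launch step looks like, the Python helper below only assembles a `docker run` argument list from the names mentioned above (image mlops, version v1, container mlos, volume mlops1). The helper name `build_docker_run_command` and the mount point `/workspace` are assumptions made for illustration; the real mount path used in the job is not given here.

```python
def build_docker_run_command(image, tag, container_name, volume, mount_point):
    """Assemble the argument list for a detached, interactive container."""
    return [
        "docker", "run",
        "-dit",                           # detached, keep STDIN open, allocate a TTY
        "--name", container_name,         # container name
        "-v", f"{volume}:{mount_point}",  # attach the volume at the mount point
        f"{image}:{tag}",
    ]


# '/workspace' is a hypothetical mount point, not taken from the original job.
command = build_docker_run_command("mlops", "v1", "mlos", "mlops1", "/workspace")
print(" ".join(command))
```

The list can be passed to `subprocess.run(command, check=True)` on a host where Docker is installed.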

You can see in the above image that I have launched one container.
At the end of this job, we have built the model from a few basic layers, and the accuracy we achieved is about 97%.

JOB3: DISPLAY PAGE

In job 3 we have created a webpage that shows the accuracy of the model after 6 epochs, along with the hyperparameters used.
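A page like this can be generated with a few lines of Python. The sketch below is an assumption about how such a report might be built, not the actual page from this project: the function name `render_report` and the hyperparameter names in the example are hypothetical.

```python
from html import escape


def render_report(accuracy, epochs, hyperparameters):
    """Build a small HTML page with the accuracy and a hyperparameter table."""
    rows = "\n".join(
        f"<tr><td>{escape(str(name))}</td><td>{escape(str(value))}</td></tr>"
        for name, value in hyperparameters.items()
    )
    return (
        "<html><body>\n"
        f"<h1>Model accuracy after {epochs} epochs: {accuracy:.2%}</h1>\n"
        "<table>\n"
        f"{rows}\n"
        "</table>\n"
        "</body></html>"
    )


# Hypothetical hyperparameter names, used only to show the output format.
page = render_report(0.97, 6, {"kernel_size": 3, "filters": 32})
```

Writing `page` to a file served by a web server would make the result visible in a browser.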

So let's see the page after the epochs have completed and check the hyperparameters used, without any tweaking.

So, on the first run we achieved an accuracy of about 97%. Now we will show how tweaking can improve the accuracy of the model. After job 3 completes, it automatically triggers job4.

JOB4: TWEAKING CODE

This job checks the accuracy of our model. If the accuracy has not reached the desired level, it executes job2 again, which once more creates a container and launches the model.
If we have already reached the desired level on the first run, it automatically triggers job5 to send the success mail.
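The branching described above can be written as a small decision function. This is a minimal sketch: the name `next_job` is hypothetical, and the default target of 0.80 is taken from the 80% threshold in the problem statement.

```python
def next_job(accuracy, target=0.80):
    """Pick the next Jenkins job: job5 on success, job2 to retrain otherwise."""
    if accuracy >= target:
        return "job5"  # desired level reached: send the success mail
    return "job2"      # below target: relaunch the container and train again


print(next_job(0.97))
print(next_job(0.75))
```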

BUT HOW TO CHECK THE ACCURACY AND COMPARE?

For this I have created a Python script, tweaker.py, which always compares the accuracy of the old run with that of the new run.
As soon as a hyperparameter value is changed, job2 is re-run to measure the accuracy. If the accuracy has increased, the change was a good one and the value can be increased further, so the script increases that parameter's value again. This repetition continues until we reach our desired level.
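The loop described above might look roughly like the following. This is a sketch of the idea and not the contents of tweaker.py: the function name `tune`, the `train` callback (standing in for a re-run of job2), the `max_rounds` limit, and the rule of stopping when accuracy no longer improves are all assumptions.

```python
def tune(train, start_value, target=0.80, step=1, max_rounds=10):
    """Increase one hyperparameter while accuracy keeps improving.

    train: callable that takes the hyperparameter value and returns accuracy
           (a stand-in for re-running job2).
    """
    value = start_value
    accuracy = train(value)
    for _ in range(max_rounds):
        if accuracy >= target:
            break  # desired level reached
        candidate = value + step
        candidate_accuracy = train(candidate)
        if candidate_accuracy <= accuracy:
            break  # assumption: stop when the increase no longer helps
        value, accuracy = candidate, candidate_accuracy
    return value, accuracy


# Made-up accuracies for illustration only; these are not measured results.
fake_scores = {1: 0.60, 2: 0.72, 3: 0.85}
best_value, best_accuracy = tune(lambda v: fake_scores[v], start_value=1)
```

With the made-up scores, the loop raises the value from 1 to 3 and stops once the 80% target is passed.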