All things GCP: Machine Learning Decision Pyramid

Understand which Google Cloud tool matches your needs best.

Updated: 21st October 2019
Most cloud providers offer numerous tools and services that help users kick-start their development process. Although these tools span many levels of abstraction to suit different preferences, the sheer number of options can be overwhelming.
Many users struggle to find the right tool for their needs, and nowhere is this more apparent than in Machine Learning.
Many users want Machine Learning in their products, and cloud providers like GCP, AWS, Azure, etc. know this. To meet that demand, they offer out-of-the-box machine learning tools matched to the user's needs and expertise, so users can tackle complex Machine Learning tasks with a few lines of code, or no code at all!
For this article I have selected GCP, and why not: from an ML tooling perspective, Google Cloud has one of the most useful ML toolkits, supporting a wide range of expertise. Let's jump into understanding each layer of the stack, starting from the bottom.
Note: this article is inspired by a tweet from Sara Robinson, and a short 5-minute video explanation has been uploaded by Google Cloud Platform on YouTube. This article explains the tools and their usefulness in more detail.

ML that requires Data Scientists


1. ML Frameworks

  • Frameworks like TensorFlow, PyTorch, scikit-learn, XGBoost, etc. have been popular with Data Scientists for quite a while.
  • These frameworks provide out-of-the-box functions to create Machine Learning models or Neural Networks in a few lines of code.
  • Extensive documentation for the most popular frameworks and a wealth of tutorials have made them the go-to stack for Data Scientists.
  • If needed, you can also easily modify the code and customize the framework to your needs.
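
To illustrate the "few lines of code" point, here is a minimal sketch using scikit-learn, one of the frameworks listed above. The Iris dataset and logistic regression are arbitrary choices for the example, not a recommendation:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load a toy dataset and hold out a test split.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Train an out-of-the-box model and evaluate it.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"test accuracy: {accuracy:.2f}")
```

The entire train-and-evaluate loop fits in a handful of lines; this is the kind of workflow you would run on any of the GCP options below.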

Most popular GCP services for running these frameworks

Google Compute Engine
  • You can create your own Virtual Machine instance and launch a system tailored to your needs.
  • You get granular control over the OS and its version, the Python version, which framework you need, etc.
To get started:
  • Google Cloud Instance for Machine Learning.
  • Deep Learning in Google Compute Engine.
GCP Compute Engine: creating a new instance

2. Deep Learning VM Images


  • You get pre-installed frameworks with Python or R environments already configured.
  • You can select a VM image based on your framework or environment.
  • Like Compute Engine, you can specify the number of cores and the amount of RAM for your instance.
  • If needed, GPU support and JupyterLab are also provided out of the box.
To get started:
  • Deep Learning VM console
  • Deep Learning VM Documentation
Deep Learning VM images — Creation of the new instance with configurations

3. Kubeflow


  • This is mostly used when we need to deploy our trained models to production.
  • Because of configuration differences between local and production environments, many ML tasks break when moved to production for deployment and model serving.
  • Developers had already run into this issue in applications outside ML, and they created Kubernetes to address it.
  • The demand for a Kubernetes-like orchestration platform for ML is what led to Kubeflow.
To get started:
  • End-to-End Kubeflow on GCP
  • Codelabs — Kubeflow
Kubeflow UI

4. Cloud ML Engine


  • Similar to Deep Learning VM Images, GCP has AI Platform Notebooks: a one-click, easy-to-deploy AI environment built around a notebook.
  • This AI notebook can seamlessly pull data from BigQuery, use Cloud Dataproc to transform it, and leverage AI Platform services or Kubeflow for distributed training and online prediction.
  • You can run a Notebook instance on a container of your choice.
To get started:
  • AI Platform Documentation
AI Notebook — One-click access to JupyterLab

Some understanding of the data


5. BQML

  • Here you have the data (if not, BigQuery has plenty of public datasets to play with). You have loaded your data into BigQuery and now want to perform ML on it, but all you know is basic SQL. Here's where BQML comes to the rescue.
  • Inside BigQuery we have BQML (BigQuery Machine Learning), which can train ML models and run predictions from BigQuery itself.
  • A sample statement to train a model looks like the following.
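
As a sketch, a BQML training statement is just SQL wrapped in `CREATE MODEL`. The dataset, table, and column names below are placeholders, not a real schema, and actually running the statement requires the google-cloud-bigquery client and credentials, so that step is shown commented out:

```python
# Hypothetical BQML training statement; `mydataset`, the table, and the
# column names are placeholders for your own data.
create_model_sql = """
CREATE OR REPLACE MODEL `mydataset.sample_model`
OPTIONS (model_type = 'logistic_reg', input_label_cols = ['label']) AS
SELECT
  label,
  feature_1,
  feature_2
FROM `mydataset.training_data`
"""

# With google-cloud-bigquery installed and credentials configured,
# this would start the training job:
# from google.cloud import bigquery
# bigquery.Client().query(create_model_sql).result()
print(create_model_sql.strip())
```

Prediction afterwards is just as terse: a `SELECT * FROM ML.PREDICT(MODEL ..., ...)` query against the trained model.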
To get started:
  • BQML Documentation
  • Codelab — BQML
BigQuery training using SQL

Not much understanding at all


6. AutoML

  • With AutoML you only have to provide the data on which you want to perform ML, and that's all!
  • AutoML removes even the complications of BQML and provides state-of-the-art ML models that you can use on your dataset.
  • AutoML has many tools in its bucket, such as:
    1. AutoML Vision: integrate vision detection features within applications, including image labeling, face and landmark detection, optical character recognition (OCR), and tagging of explicit content.
    2. AutoML Video Intelligence: classify shots and segments in your videos according to your own defined labels, or track multiple objects across shots and segments.
    3. AutoML Natural Language: build and deploy custom machine learning models that analyze documents, categorizing them, identifying entities within them, or assessing attitudes within them.
    4. AutoML Translation: lets you create your own custom translation models.
    5. AutoML Tables: build and deploy state-of-the-art machine learning models on structured data at massively increased speed and scale.
To get started:
  • AutoML docs
  • Deep understanding of how AutoML works
How AutoML works

7. ML APIs


  • These are Google-provided APIs that return ML predictions when given the appropriate input.
  • The ML APIs include:
    1. Vision AI
    2. Video AI
    3. Cloud Speech-to-Text API
    4. Cloud Text-to-Speech API
    5. Dialogflow
    6. Cloud Inference API
    7. Recommendations AI (beta)
    8. Cloud AutoML
To get started:
  • Codelabs — ML Api’s
Some GCP ML APIs
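
As a sketch of how these APIs are typically called, here is the JSON payload shape for the Vision API's REST `images:annotate` endpoint with label detection. The image bytes are a placeholder, and credentials or an API key would be needed to actually send the request:

```python
import base64
import json


def build_vision_request(image_bytes: bytes, max_results: int = 5) -> dict:
    """Build an images:annotate request body asking for image labels."""
    return {
        "requests": [
            {
                # The REST API expects base64-encoded image content.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [
                    {"type": "LABEL_DETECTION", "maxResults": max_results}
                ],
            }
        ]
    }


# Placeholder bytes stand in for a real image file.
payload = build_vision_request(b"fake-image-bytes")
print(json.dumps(payload, indent=2))
# With credentials, this payload would be POSTed to
# https://vision.googleapis.com/v1/images:annotate
```

The response contains `labelAnnotations` with a description and confidence score per label; the other ML APIs follow the same request/response pattern with different feature types.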

Conclusion

As more and more development and research is done in the ML/AI space, more people are interested in leveraging it. And as people from different backgrounds enter the field, there is no doubt that ML service providers will build ever simpler "sophisticated" tools that appeal to a wide range of users. If we take a step back, we can also wonder: if Machine Learning keeps becoming simpler day by day, companies, organizations, and individuals may eventually stop relying on Data Scientists to build ML models, and the "hype" this sector has created could burst. This is just speculation, and even if true, we are very far from relying completely on a computer for ML tasks… or are we?

For any clarification, comment, help, appreciation, or suggestions, just post in the comments and I will help you. If you liked this article, you can follow me on Medium.

Also, you can connect with me on social media if you want more content on Data Science, Artificial Intelligence, or Data Pipelines.



