Saturday, May 1, 2021

Imputation Strategies

Imputation is a preprocessing step for training data in machine learning. It fills in missing values so the dataset can be used by algorithms that cannot handle gaps.
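To make the idea of a "strategy" concrete, here is a minimal sketch in plain Python. The function name `impute`, the sample `ages` column, and the use of NaN as the missing-value marker are assumptions made for illustration. The strategy names mirror the ones accepted by scikit-learn's `SimpleImputer` ("mean", "median", "most_frequent"), which is what you would more likely use in practice.

```python
import math
import statistics


def impute(values, strategy="mean"):
    """Return a copy of `values` with NaN entries replaced.

    Hypothetical helper for illustration only; the fill value is
    computed from the observed (non-NaN) entries of the column.
    """
    observed = [v for v in values if not math.isnan(v)]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")

    if strategy == "mean":
        fill = statistics.mean(observed)
    elif strategy == "median":
        fill = statistics.median(observed)
    elif strategy == "most_frequent":
        fill = statistics.mode(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy}")

    return [fill if math.isnan(v) else v for v in values]


# Made-up column with two missing entries.
ages = [22.0, float("nan"), 35.0, 22.0, float("nan"), 41.0]

print(impute(ages, "mean"))           # fills with 30.0
print(impute(ages, "median"))         # fills with 28.5
print(impute(ages, "most_frequent"))  # fills with 22.0
```

Which strategy fits depends on the column: the mean is sensitive to outliers, the median less so, and the most frequent value also works for categorical data.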

Installation - Machine Learning and Deep Learning Prerequisites

import numpy as np # linear algebra

import seaborn as sns # data visualization, API

from bs4 import BeautifulSoup as soup # web scraping

Install the packages listed in requirements.txt from the command line:

$ sudo pip install -r requirements.txt
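After installing, it can help to confirm that the packages are actually importable. The following is a rough sketch using only the standard library. The helper name `missing_packages` and the contents of the `required` list are assumptions based on the imports shown in this post, not the contents of any particular requirements.txt.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names that cannot be found.

    Hypothetical helper: find_spec() returns None when a top-level
    module is not installed in the current environment.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names (not pip package names): bs4 comes from beautifulsoup4,
# sklearn comes from scikit-learn.
required = ["numpy", "seaborn", "bs4", "sklearn"]

print("Missing:", missing_packages(required))
```

An empty list means every module in `required` was found.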

Other commonly used libraries:

numpy and scipy for scientific computing, and matplotlib for plotting.

import os 
# import the os module 
# "This module provides a portable way of 
# using operating system dependent functionality."

Other scikit-learn import statements you might see in the wild:

from sklearn.metrics import roc_auc_score

from sklearn.ensemble import RandomForestClassifier

from sklearn.naive_bayes import GaussianNB

from sklearn.neighbors import KNeighborsClassifier

from sklearn.tree import DecisionTreeClassifier

from sklearn.ensemble import AdaBoostClassifier

from sklearn.ensemble import GradientBoostingClassifier

from sklearn.linear_model import LogisticRegression
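As a rough illustration of how a few of these imports fit together, the sketch below trains two of the classifiers on a synthetic dataset and scores them with `roc_auc_score`. The dataset from `make_classification`, the split sizes, and the hyperparameters are arbitrary choices for the example, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (assumption: stands in for real data).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # ROC AUC needs a score per sample, so use the positive-class probability.
    proba = model.predict_proba(X_test)[:, 1]
    scores[name] = roc_auc_score(y_test, proba)
    print(f"{name}: ROC AUC = {scores[name]:.3f}")
```

The other classifiers listed above expose the same `fit` and `predict_proba` methods, so they can be added to the `models` dictionary in the same way.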

Machine Learning in the Cloud

Workflow: how to generate or collect data, preprocess it, and train a model on it.

Sample tasks:

  • Train machine learning models in Google Cloud.
  • Collect data in Google Cloud or on Amazon Web Services (AWS).
  • Analyze and preprocess training data.
  • Clean and analyze data, and present your findings.
  • Preprocess data using Python.
  • Train a basic machine learning model.
  • Deploy a model for prediction using a REST API.
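The last task, serving predictions over a REST API, can be sketched with nothing but the Python standard library. Everything named here is hypothetical: `predict` is a stand-in for a trained model (a fixed linear score, not a real one), and `handle_payload`, `PredictHandler`, `serve`, port 8000, and the `{"features": [...]}` JSON shape are assumptions for illustration. A cloud deployment would normally use a managed service or a web framework instead.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    """Hypothetical stand-in for a trained model: a fixed linear score."""
    weights = [0.5, -0.25]
    score = sum(w * x for w, x in zip(weights, features))
    return 1 if score > 0 else 0


def handle_payload(raw_body):
    """Parse a JSON request body and return a JSON response string."""
    payload = json.loads(raw_body)
    return json.dumps({"prediction": predict(payload["features"])})


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes the client declared.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        response = handle_payload(body).encode("utf-8")

        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(response)))
        self.end_headers()
        self.wfile.write(response)


def serve(port=8000):
    """Start the server on localhost; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), PredictHandler).serve_forever()
```

Calling `serve()` starts the server, after which a POST request with a body such as `{"features": [2.0, 1.0]}` returns a JSON object with a `prediction` field. Keeping the parsing logic in `handle_payload` means it can be tested without opening a socket.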