My Solutions: Practical Data Science Projects from Coursera, DeepLearning.AI and Amazon Web Services
ML Pipeline using Amazon SageMaker

Project 1:
Analyze Datasets and Train ML Models using AutoML
Perform a multiclass classification for sentiment analysis by first ingesting product review data into a central repository, an Amazon S3 bucket. Then use Amazon Athena and AWS Glue to catalog the data, analyze it with interactive queries, and visualize the results, which are used during the model development process.
Steps
1. List and access the Women's Clothing Reviews dataset files hosted in an S3 bucket
2. Install and import AWS Data Wrangler
3. Create an AWS Glue Catalog database and list all Glue Catalog databases
4. Register dataset files with the AWS Glue Catalog
5. Write SQL queries to answer specific questions on your dataset and run your queries with Amazon Athena
6. Return the query results in a pandas dataframe
7. Produce and select different plots and visualizations that address your questions
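Steps 5 and 6 can be sketched as follows. This is a minimal illustration, not the lab's actual code: the table name `reviews`, the column name `sentiment`, and the database name passed to `run_query` are assumptions. The AWS Data Wrangler import is deferred so the query builder can be tried without AWS credentials.

```python
def build_sentiment_count_query(table: str) -> str:
    """Return an Athena SQL query that counts reviews per sentiment class.

    The column name `sentiment` is an assumption about the dataset schema.
    """
    # The table name is interpolated into the SQL text, so validate it first.
    if not table.replace("_", "").isalnum():
        raise ValueError(f"unexpected table name: {table!r}")
    return (
        "SELECT sentiment, COUNT(*) AS count_sentiment\n"
        f"FROM {table}\n"
        "GROUP BY sentiment\n"
        "ORDER BY sentiment"
    )


def run_query(sql: str, database: str):
    """Run the query with Amazon Athena and return a pandas DataFrame.

    Requires the awswrangler package and valid AWS credentials.
    """
    # Imported lazily so the query builder above works without AWS dependencies.
    import awswrangler as wr

    return wr.athena.read_sql_query(sql=sql, database=database)


if __name__ == "__main__":
    # `reviews` is a hypothetical table registered in the Glue Catalog.
    print(build_sentiment_count_query("reviews"))
```

The returned DataFrame can then be passed to any plotting library to address step 7.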
Project 2:
Detect data bias with Amazon SageMaker Clarify
Bias can be present in your data before any model training occurs. Inspecting the dataset for bias can help detect collection gaps, inform your feature engineering, and reveal societal biases the dataset may reflect. In this lab you will analyze bias in the dataset, generate and analyze a bias report, and prepare the dataset for model training.
Steps
1. Download and save the raw unbalanced dataset
2. Analyze bias with open source Clarify
3. Balance the dataset
4. Analyze bias at scale with an Amazon SageMaker processing job and Clarify
5. Analyze bias reports before and after balancing the dataset
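To make "analyze bias" and "balance the dataset" concrete, here is a small standard-library sketch. It computes one pre-training metric, class imbalance (CI), and balances by downsampling every group to the size of the smallest one. It does not call Clarify itself. The column names `product_category` and `sentiment`, the toy rows, and the downsampling strategy are assumptions for illustration.

```python
import random
from collections import defaultdict


def class_imbalance(rows, facet, facet_value):
    """Class imbalance CI = (n_a - n_d) / (n_a + n_d), a value in [-1, 1].

    n_d counts rows whose `facet` equals `facet_value`; n_a counts all others.
    A value of 0 means both groups are the same size.
    """
    n_d = sum(1 for r in rows if r[facet] == facet_value)
    n_a = len(rows) - n_d
    return (n_a - n_d) / (n_a + n_d)


def balance(rows, keys, seed=0):
    """Downsample every group defined by `keys` to the smallest group size."""
    groups = defaultdict(list)
    for r in rows:
        groups[tuple(r[k] for k in keys)].append(r)
    n_min = min(len(g) for g in groups.values())
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    balanced = []
    for g in groups.values():
        balanced.extend(rng.sample(g, n_min))
    return balanced


# Toy data: six "Dresses" reviews and two "Blouses" reviews.
rows = (
    [{"product_category": "Dresses", "sentiment": 1}] * 4
    + [{"product_category": "Dresses", "sentiment": -1}] * 2
    + [{"product_category": "Blouses", "sentiment": 1}]
    + [{"product_category": "Blouses", "sentiment": -1}]
)

print(class_imbalance(rows, "product_category", "Blouses"))      # 0.5
balanced = balance(rows, ["product_category", "sentiment"])
print(class_imbalance(balanced, "product_category", "Blouses"))  # 0.0
```

Downsampling discards data, so on a real dataset it is worth checking how many rows remain before training.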
Project 3:
SageMaker Pipelines to train a BERT-based text classifier
Use Amazon SageMaker Autopilot to train a BERT-based natural language processing (NLP) model. The model will analyze customer feedback and classify the messages into positive (1), neutral (0) and negative (-1) sentiment.
Steps
1. Dataset review
2. Configure the Autopilot job
3. Launch Autopilot job
4. Track Autopilot job progress
5. Feature engineering
6. Model training and tuning
7. Review all output
8. Deploy and test best candidate model
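Steps 2 to 4 can be sketched with the low-level boto3 SageMaker client. This is one possible way to configure and launch a job, not necessarily how the lab does it. The job name, S3 URIs, role ARN, target column `sentiment`, the accuracy objective, and the candidate limit are all placeholders or assumptions. The request is built as a plain dictionary so it can be inspected without AWS access.

```python
def build_autopilot_request(job_name, train_s3_uri, output_s3_uri, role_arn,
                            target_column="sentiment", max_candidates=3):
    """Build keyword arguments for the boto3 `create_auto_ml_job` call."""
    return {
        "AutoMLJobName": job_name,
        "InputDataConfig": [{
            "DataSource": {
                "S3DataSource": {"S3DataType": "S3Prefix", "S3Uri": train_s3_uri}
            },
            # Column Autopilot should learn to predict (assumed name).
            "TargetAttributeName": target_column,
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        # Three sentiment classes: -1, 0 and 1.
        "ProblemType": "MulticlassClassification",
        "AutoMLJobObjective": {"MetricName": "Accuracy"},
        "AutoMLJobConfig": {"CompletionCriteria": {"MaxCandidates": max_candidates}},
        "RoleArn": role_arn,
    }


def launch_and_check(request):
    """Launch the job and return its current status (needs AWS credentials)."""
    import boto3  # imported lazily so the builder works offline

    sm = boto3.client("sagemaker")
    sm.create_auto_ml_job(**request)
    status = sm.describe_auto_ml_job(AutoMLJobName=request["AutoMLJobName"])
    return status["AutoMLJobStatus"]


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    request = build_autopilot_request(
        job_name="automl-sentiment-demo",
        train_s3_uri="s3://example-bucket/autopilot/train/",
        output_s3_uri="s3://example-bucket/autopilot/output/",
        role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    )
    print(request["ProblemType"])
```

Polling `describe_auto_ml_job` repeatedly is one way to track progress through the feature engineering and model tuning stages.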
Project 4:
Train a text classifier using Amazon SageMaker BlazingText built-in algorithm
Use the SageMaker BlazingText built-in algorithm to predict the sentiment for each customer review. BlazingText is a variant of FastText, which is based on word2vec. For more information on BlazingText, see the documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/blazingtext.html
Steps
1. Prepare dataset
2. Train the model with Amazon SageMaker BlazingText
3. Deploy the model
4. Test the model
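For the dataset preparation step, BlazingText in supervised mode expects one example per line: a label prefixed with `__label__`, followed by space-separated tokens. The sketch below shows that transformation. The regex tokenizer and lowercasing are simplifying assumptions; the lab may use a different tokenizer.

```python
import re


def to_blazingtext_line(sentiment: int, review: str) -> str:
    """Format one review as `__label__<sentiment> token token ...`."""
    # Lowercase, then split into word tokens and single punctuation tokens.
    tokens = re.findall(r"[a-z0-9']+|[^\sa-z0-9']", review.lower())
    return f"__label__{sentiment} " + " ".join(tokens)


if __name__ == "__main__":
    print(to_blazingtext_line(1, "Love this dress!"))
    # __label__1 love this dress !
    print(to_blazingtext_line(-1, "Too small, returned it."))
    # __label__-1 too small , returned it .
```

Writing one such line per review to a text file and uploading it to S3 gives a training channel in the format BlazingText accepts.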
References:
Built-in algorithms:
Elastic Machine Learning Algorithms in Amazon SageMaker