Get Advanced Python Certification Course In Delhi and Gurgaon

Introduction to and Overview of Python

Environment Setup (installing and setting up a Python IDE – Anaconda, Spyder, Jupyter)

Introduction

  • History
  • Features
  • Setting up path
  • Working with Python
  • Basic Syntax
  • Variable and Data Types
  • Operators

 

PYTHON: ESSENTIALS (CORE)

  • Overview of Python – Starting with Python
  • Introduction to installation of Python
  • Introduction to Python Editors & IDEs (Spyder, Jupyter, etc.)
  • Understand Spyder & Customize Settings
  • Concept of Packages/Libraries – Important packages (NumPy, SciPy, scikit-learn, Pandas, Matplotlib, etc.)
  • Installing & loading Packages & Namespaces
  • Data Types & Data objects/structures (strings, Tuples, Lists, Dictionaries)
  • List and Dictionary Comprehensions
  • Variable & Value Labels – Date & Time Values
  • Basic Operations – Mathematical – string – date
  • Reading and writing data
  • Simple plotting
  • Control flow & conditional statements
  • Debugging & Code profiling
  • Creating classes and modules, and how to call them
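
As a small taste of the "List and Dictionary Comprehensions" topic above, here is a minimal sketch. The `scores` data and the pass mark of 60 are hypothetical values chosen only for illustration.

```python
# Hypothetical sample data, used only for illustration.
scores = {"asha": 72, "ravi": 58, "meera": 91}

# List comprehension: names of students who scored 60 or more.
passed = [name for name, score in scores.items() if score >= 60]

# Dictionary comprehension: map each name to a pass/fail label.
labels = {name: ("pass" if score >= 60 else "fail") for name, score in scores.items()}

print(passed)
print(labels)
```

Both forms build a new collection in a single expression instead of an explicit loop with `append` or key assignment.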

 

DATA EXPLORATION FOR MODELING

  • Need for structured exploratory data analysis
  • EDA framework for exploring the data and identifying any problems with the data (Data Audit Report)
  • Identify missing data
  • Identify outliers
  • Visualize the data trends and patterns

ACCESSING/IMPORTING AND EXPORTING DATA USING PYTHON MODULES

  • Importing Data from various sources (CSV, TXT, Excel, Access, etc.)
  • Database Input (Connecting to database)
  • Viewing Data objects – subsetting, methods
  • Exporting Data to various formats
  • Important Python modules: Pandas, Beautiful Soup

DATA MANIPULATION – CLEANSING – MUNGING USING PYTHON MODULES

  • Cleansing Data with Python
  • Data Manipulation steps (Sorting, filtering, duplicates, merging, appending, subsetting, derived variables, sampling, Data type conversions, renaming, formatting, etc.)
  • Data manipulation tools (Operators, Functions, Packages, control structures, Loops, arrays, etc.)
  • Python Built-in Functions (Text, numeric, date, utility functions)
  • Python User Defined Functions
  • Stripping out extraneous information
  • Normalizing data
  • Formatting data
  • Important Python modules for data manipulation (Pandas, NumPy, re, math, string, datetime, etc.)

DATA ANALYSIS – VISUALIZATION USING PYTHON

  • Introduction to exploratory data analysis
  • Descriptive statistics, Frequency Tables and summarization
  • Univariate Analysis (Distribution of data & Graphical Analysis)
  • Bivariate Analysis (Cross Tabs, Distributions & Relationships, Graphical Analysis)
  • Creating Graphs (bar, pie, line chart, histogram, boxplot, scatter, density, etc.)
  • Important Packages for Exploratory Analysis (NumPy Arrays, Matplotlib, Pandas, scipy.stats, etc.)

INTRODUCTION TO PREDICTIVE MODELING

  • Concept of a model in analytics and how it is used
  • Common terminology used in analytics & modelling process

INTRODUCTION TO STATISTICS

  • Basic Statistics – Measures of Central Tendencies and Variance
  • Building blocks – Probability Distributions – Normal distribution – Central Limit Theorem
  • Inferential Statistics – Sampling – Concept of Hypothesis Testing
  • Statistical Methods – Z/t-tests (one-sample, independent, paired), ANOVA, Correlations and Chi-square
  • Important modules for statistical methods: NumPy, SciPy, Pandas

SEGMENTATION: SOLVING SEGMENTATION PROBLEMS

  • Introduction to Segmentation
  • Types of Segmentation (Subjective vs. Objective, Heuristic vs. Statistical)
  • Heuristic Segmentation Techniques (Value Based, RFM Segmentation and Life Stage Segmentation)
  • Behavioural Segmentation Techniques (K-Means Cluster Analysis)
  • Cluster evaluation and profiling – Identify cluster characteristics
  • Interpretation of results – Implementation on new data

DATA PREPARATION

  • Need for Data Preparation
  • Consolidation/Aggregation – Outlier treatment – Flat Liners – Missing values – Dummy creation – Variable Reduction
  • Variable Reduction Techniques – Factor Analysis & PCA

MACHINE LEARNING – PREDICTIVE MODELING – BASICS

  • Introduction to Machine Learning & Predictive Modelling
  • Types of Business problems – Mapping of Techniques – Regression vs. classification vs. segmentation vs. Forecasting
  • Major Classes of Learning Algorithms – Supervised vs. Unsupervised Learning
  • Different Phases of Predictive Modelling (Data Pre-processing, Sampling, Model Building, Validation)
  • Overfitting (Bias-Variance Trade-off) & Performance Metrics
  • Feature engineering & dimension reduction
  • Concept of optimization & cost function
  • Overview of gradient descent algorithm
  • Overview of Cross-validation (Bootstrapping, K-Fold validation, etc.)
  • Model performance metrics (R-square, Adjusted R-square, RMSE, MAPE, AUC, ROC curve, recall, etc.)
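
To give a feel for the "cost function" and "gradient descent" topics listed above, here is a minimal sketch that fits a straight line by repeatedly stepping against the gradient of the mean squared error. The function name `gradient_descent`, the learning rate, the step count and the sample points are all illustrative assumptions, not course material.

```python
def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by minimising mean squared error (illustrative sketch)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of the mean squared error with respect to w and b.
        grad_w = (2 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = (2 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
        # Move a small step in the direction that lowers the cost.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical data generated from y = 2x + 1, so the fit should approach w=2, b=1.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
w, b = gradient_descent(xs, ys)
print(round(w, 3), round(b, 3))
```

A learning rate that is too large makes the updates diverge, while one that is too small makes convergence slow; the value used here is simply one that works for this toy data.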

UNSUPERVISED LEARNING: SEGMENTATION

  • What is segmentation, and what is the role of ML in segmentation?
  • Concept of Distance and related math background
  • K-Means Clustering
  • Hierarchical Clustering
  • Principal Component Analysis (PCA)

LINEAR REGRESSION: SOLVING REGRESSION PROBLEMS

  • Introduction – Applications
  • Assumptions of Linear Regression
  • Building Linear Regression Model
  • Understanding standard metrics (Variable significance, R-square/Adjusted R-square, Global hypothesis, etc.)
  • Assess the overall effectiveness of the model
  • Validation of Models (Re-running vs. Scoring)
  • Standard Business Outputs (Decile Analysis, Error distribution (histogram), Model equation, drivers etc.)
  • Interpretation of Results – Business Validation – Implementation on new data

TIME SERIES FORECASTING: SOLVING FORECASTING PROBLEMS

  • Introduction – Applications
  • Time Series Components (Trend, Seasonality, Cyclicity and Level) and Decomposition
  • Classification of Techniques (Pattern-based – Pattern-less)
  • Basic Techniques – Averages, Smoothing, etc.
  • Advanced Techniques – AR Models, ARIMA, etc
  • Understanding Forecasting Accuracy – MAPE, MAD, MSE, etc
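
The forecasting accuracy measures named above (MAD, MSE and MAPE) can be sketched in a few lines of plain Python. The helper name `forecast_errors` and the sample numbers are hypothetical, and the sketch assumes no actual value is zero, since MAPE is undefined in that case.

```python
def forecast_errors(actual, forecast):
    """Return MAD, MSE and MAPE (in percent) for two equal-length sequences."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and the same length")
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    mad = sum(abs(e) for e in errors) / n   # mean absolute deviation
    mse = sum(e * e for e in errors) / n    # mean squared error
    # Mean absolute percentage error; assumes every actual value is non-zero.
    mape = 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return {"MAD": mad, "MSE": mse, "MAPE": mape}


# Hypothetical actual and forecast values, used only for illustration.
actual = [100, 110, 120, 130]
forecast = [90, 115, 120, 140]
metrics = forecast_errors(actual, forecast)
print(metrics)
```

MAD and MSE are expressed in the units of the series (MSE in squared units), while MAPE is scale-free, which makes it easier to compare across series.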

LOGISTIC REGRESSION: SOLVING CLASSIFICATION PROBLEMS

  • Introduction – Applications
  • Linear Regression Vs. Logistic Regression Vs. Generalized Linear Models
  • Building Logistic Regression Model (Binary Logistic Model)
  • Understanding standard model metrics (Concordance, Variable significance, Hosmer-Lemeshow Test, Gini, KS, Misclassification, ROC Curve, etc.)
  • Validation of Logistic Regression Models (Re-running vs. Scoring)
  • Standard Business Outputs (Decile Analysis, ROC Curve, Probability Cut-offs, Lift charts, Model equation, Drivers or variable importance, etc.)
  • Interpretation of Results – Business Validation – Implementation on new data

SUPERVISED LEARNING: DECISION TREES

  • Decision Trees – Introduction – Applications
  • Types of Decision Tree Algorithms
  • Construction of Decision Trees through Simplified Examples; Choosing the “Best” attribute at each Non-Leaf node; Entropy; Information Gain, Gini Index, Chi Square, Regression Trees
  • Generalizing Decision Trees; Information Content and Gain Ratio; Dealing with Numerical Variables; other Measures of Randomness
  • Pruning a Decision Tree; Cost as a consideration; Unwrapping Trees as Rules
  • Decision Trees – Validation
  • Overfitting – Best Practices to avoid
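
The idea of choosing the "best" attribute by entropy and information gain, mentioned in the list above, can be illustrated with a minimal sketch. The function names, the toy rows and the class labels are hypothetical; real decision tree libraries add many refinements on top of this.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(rows, labels, attribute_index):
    """Reduction in entropy obtained by splitting on one attribute."""
    total = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute_index], []).append(label)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: (outlook, humidity) -> play?
rows = [("sunny", "high"), ("sunny", "low"), ("rainy", "high"), ("rainy", "low")]
labels = ["no", "no", "yes", "yes"]

gains = [information_gain(rows, labels, i) for i in range(2)]
best = max(range(2), key=lambda i: gains[i])
print(gains, best)
```

In this toy data the first attribute separates the classes perfectly, so it has the higher gain and would be chosen for the root split.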

SUPERVISED LEARNING: KNN

  • What is KNN & Applications?
  • KNN for missing value treatment
  • KNN for solving regression problems
  • KNN for solving classification problems
  • Validating KNN model
  • Model fine-tuning with hyperparameters

SUPERVISED LEARNING: ENSEMBLE LEARNING

  • Concept of Ensembling
  • Manual Ensembling Vs. Automated Ensembling
  • Methods of Ensembling (Stacking, Mixture of Experts)
  • Random forest (Logic, Practical Applications)
  • Boosting (Logic, Practical Applications)

SUPERVISED LEARNING: SUPPORT VECTOR MACHINES

  • Motivation for Support Vector Machine & Applications
  • Support Vector Regression
  • Support vector classifier (Linear & Non-Linear)
  • Mathematical Intuition (Kernel Methods Revisited, Quadratic Optimization and Soft Constraints)
  • Interpretation of Outputs and fine-tuning the models with hyperparameters
  • Validating SVM models

SUPERVISED LEARNING: ARTIFICIAL NEURAL NETWORKS (ANN)

  • Motivation for Neural Networks and Its Applications
  • Perceptron and Single Layer Neural Network, and Hand Calculations
  • Learning in a Multi-Layered Neural Net: Back Propagation and Conjugate Gradient Techniques
  • Neural Networks for Regression
  • Neural Networks for Classification
  • Interpretation of Outputs and fine-tuning the models with hyperparameters
  • Validating ANN models

SUPERVISED LEARNING: NAÏVE BAYES

  • Concept of Conditional Probability
  • Bayes Theorem and Its Applications
  • Naïve Bayes for classification
  • Applications of Naïve Bayes in Classifications
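
Bayes' theorem, which underlies the Naïve Bayes topics above, can be shown with a short worked example. The function name `posterior` and the probabilities (a 1% prior, a 90% true positive rate and a 5% false positive rate) are hypothetical numbers picked only to illustrate the calculation.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(class | evidence) via Bayes' theorem for a two-class problem."""
    # Total probability of seeing the evidence, across both classes.
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical figures: 1% of messages are spam, the word appears in 90% of
# spam messages and in 5% of non-spam messages.
p_spam_given_word = posterior(prior=0.01, likelihood=0.9, false_positive_rate=0.05)
print(round(p_spam_given_word, 4))
```

Even with a fairly reliable signal, a low prior keeps the posterior modest, which is why class priors matter in Naïve Bayes classification.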

TEXT MINING & ANALYTICS

  • Taming big text; Unstructured vs. Semi-structured Data; Fundamentals of information retrieval; Properties of words; Creating Term-Document (TxD) Matrices; Similarity measures; Low-level processes (Sentence Splitting, Tokenization, Part-of-Speech Tagging, Stemming, Chunking)
  • Finding patterns in text: text mining, text as a graph
  • Assignment and Project – Implementing the concepts on real-time projects
  • Applying different algorithms to solve business problems and understand the industry applications