All posts by year


2020

Time-series forecasting

20 Jul 2020
machine-learning forecasting

Forecasting

It is the technique of predicting future data by analysing historical data.

Types of forecasting

  • Quantitative forecasting: based on the data and any repeating historical patterns in it. It captures complex patterns that humans cannot identify, and carries no human bias. E.g. time-series forecasting.
  • Qualitative forecasting: used when data is not available or historical patterns do not repeat. It is based on expert judgement, so it may not capture complex patterns and is prone to bias. E.g. the Delphi method.
More...

General ML questions

09 Jul 2020
machine-learning practice-questions

What is Linear regression?

  • Linear regression is the machine learning (ML) algorithm that finds the best linear-fit relationship between the independent and dependent variables in the given data.
  • The fit is usually obtained by minimising the sum of squared residuals (ordinary least squares).
  • The sum of the residuals of a linear regression model (with an intercept) is always zero.
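The zero-sum property of the residuals can be checked numerically. A minimal sketch with NumPy, using made-up data:

```python
import numpy as np

# Toy data: y roughly linear in x, with a little noise
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Degree-1 least-squares fit (minimises the sum of squared residuals)
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (slope * x + intercept)

# With an intercept term, the residuals of an OLS fit sum to (numerically) zero
print(abs(residuals.sum()) < 1e-8)  # True
```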

What are the assumptions in a Linear regression model?

  1. Assumption about the form of the model: Linearity assumption: it is assumed that there is a linear relationship between the dependent and independent variables.
  2. Assumptions about the residuals:
    a. Normality assumption: it is assumed that the error terms are normally distributed.
    b. Zero mean assumption: it is assumed that the residuals have a mean value of zero.
    c. Constant variance assumption: it is assumed that the residuals have the same (but unknown) variance, i.e. Var(εᵢ) = σ². This assumption is also known as the assumption of homogeneity or homoscedasticity.
    d. Independent error assumption: it is assumed that the residuals are independent of each other, i.e. their pair-wise covariance is zero.
  3. Assumptions about the estimators:
    a. The independent variables are measured without error.
    b. The independent variables are independent of each other, i.e. there is no multicollinearity in the data.

What is Feature engineering? How do you apply it in the process of modelling?

  • Feature engineering is the process of transforming raw data into features that better represent the underlying problem to the predictive models, resulting in improved model accuracy on unseen data.
  • In simple words, it is the development of new features that may help you to understand and model the problem in a better way.
  • Types of feature engineering:
    • Business-driven:
      • This revolves around the inclusion of features from a business point of view.
      • The job here is to transform the business variables into features of the problem.
    • Data-driven:
      • In this, the features you add do not have a significant physical interpretation, but they help the model in prediction of the target variable.
  • To apply feature engineering, one must be fully acquainted with the dataset.
    • This involves knowing what the given data is, what it signifies, what the raw features are, etc.
    • You must also have a crystal clear idea of the problem, such as what factors affect the target variable, what the physical interpretation of the variable is, etc.

What is the use of regularisation? Explain L1 and L2 regularisation techniques?

  • Regularisation is the technique that is used to tackle the problem of overfitting of the model.
  • At times, the simple model might not be able to generalize the data and the complex model overfits. To address this problem, regularisation is used.
  • Regularisation is nothing but adding a penalty on the coefficient terms (the βs) to the cost function, so that the coefficients are penalised and kept small in magnitude.
  • This essentially helps in capturing the trends in the data and at the same time prevents overfitting by not letting the model become too complex.
  • L1 or LASSO regularisation:
    • Here, the absolute values of the coefficients are added to the cost function.
    • This regularisation technique gives sparse results, which leads to feature selection as well.
  • L2 or Ridge regularisation:
    • Here, the squares of the coefficients are added to the cost function.
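A minimal sketch contrasting the two penalties, assuming scikit-learn is available; the synthetic data and the alpha value are made up for illustration. Only the first two features actually drive the target, so L1 should zero out the rest while L2 merely shrinks them:

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
# Only the first two features matter; the other eight are noise
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=100)

lasso = Lasso(alpha=0.1).fit(X, y)   # L1 penalty: sum of |coefficients|
ridge = Ridge(alpha=0.1).fit(X, y)   # L2 penalty: sum of coefficients squared

# L1 drives irrelevant coefficients exactly to zero (sparse result =
# feature selection); L2 only shrinks them towards zero
print("Lasso zero coefficients:", int(np.sum(lasso.coef_ == 0)))
print("Ridge zero coefficients:", int(np.sum(ridge.coef_ == 0)))
```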

How to choose the value of the parameter Learning rate?

  • If the value is too small, the gradient descent algorithm takes ages to converge to the optimal solution.
  • On the other hand, if the value of the learning rate (α) is high, the gradient descent will overshoot the optimal solution and most likely never converge.
  • To overcome this problem, you can try different values of α over a range and plot the cost against the number of iterations for each. Then, based on the graphs, the value corresponding to the most rapid decrease in cost can be chosen.
  • If you see that the cost is increasing with the number of iterations, then your learning rate parameter is high and it needs to be decreased.
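The sweep described above can be sketched in plain NumPy. This toy example fits a line by gradient descent and prints the final cost for each learning rate instead of plotting; all the numbers are made up. A too-small α barely moves, a sensible α converges, and a too-large α makes the cost grow:

```python
import numpy as np

def gradient_descent(x, y, alpha, iters=100):
    """Fit y ≈ m*x + b by gradient descent; return the cost history."""
    m, b = 0.0, 0.0
    costs = []
    for _ in range(iters):
        pred = m * x + b
        costs.append(np.mean((pred - y) ** 2))
        # Gradients of the mean-squared-error cost
        dm = 2 * np.mean((pred - y) * x)
        db = 2 * np.mean(pred - y)
        m -= alpha * dm
        b -= alpha * db
    return costs

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2 * x + 1

# Try several learning rates and compare how the cost evolves
for alpha in [0.001, 0.01, 0.1]:
    costs = gradient_descent(x, y, alpha)
    trend = "decreasing" if costs[-1] < costs[0] else "INCREASING (alpha too high)"
    print(f"alpha={alpha}: cost is {trend}")
```

With these numbers, α = 0.1 overshoots and the cost blows up, which is exactly the "cost increasing with iterations" symptom mentioned above.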
More...

Practice questions

06 Jul 2020
statistics practice-questions

What is the difference between univariate, bivariate and multivariate analyses?

  • Univariate analysis: The data being analysed contains only one variable. For example, analysing the customers visiting an e-commerce fashion website. The data can be divided into women, men and kids, as follows:

    Category Number per hour
    Women 78
    Men 49
    Kids 12

    The major purpose of this analysis is to describe and find patterns in data.

  • Bivariate analysis: It is used to find relationships between two different variables. For example, finding the relationship between Employee satisfaction level and income.
  • Multivariate analysis: It is the analysis of three or more variables. For example, analyse the factors that could have affected the lower sales of a newly launched product.
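The first two analyses can be sketched with pandas. The visitor counts come from the table above; the satisfaction/income numbers are hypothetical, purely to illustrate a bivariate relationship:

```python
import pandas as pd

# Univariate: a single variable (visitor counts per category)
visitors = pd.DataFrame({"Category": ["Women", "Men", "Kids"],
                         "Number per hour": [78, 49, 12]})
print(visitors["Number per hour"].describe())  # describe one variable's pattern

# Bivariate: the relationship between two variables,
# e.g. employee satisfaction level vs income (hypothetical data)
satisfaction = pd.Series([3, 5, 6, 8, 9])
income = pd.Series([30, 45, 50, 70, 85])
print(satisfaction.corr(income))  # close to +1: strong positive relationship
```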
More...

F-Test

05 Jul 2020
statistics hypothesis-testing

Two sample t-tests can validate a hypothesis containing only two groups at a time. For samples involving three or more groups, the t-test becomes tedious as you have to perform the tests for each combination of the groups. Also, Type-1 error increases in this process. You use ANOVA in such cases.

ANOVA

ANOVA or Analysis of Variance can determine whether the means of three or more groups are different. ANOVA uses F-tests to statistically test the equality of means.
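A one-way ANOVA is a one-liner with scipy; the three groups below are hypothetical scores (say, from three teaching methods):

```python
from scipy.stats import f_oneway

# Hypothetical scores from three groups
group_a = [85, 86, 88, 75, 78, 94, 98]
group_b = [91, 92, 93, 85, 87, 84, 82]
group_c = [79, 78, 88, 94, 92, 85, 83]

# One-way ANOVA: F-test of the null hypothesis that all group means are equal
f_stat, p_value = f_oneway(group_a, group_b, group_c)
print(f"F = {f_stat:.3f}, p = {p_value:.3f}")
# If p > 0.05, we fail to reject the null hypothesis of equal means
```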

More...

p-Value approach

04 Jul 2020
statistics hypothesis-testing

The p-value approach avoids the use of significance level; instead, it reports how significant the sample evidence is. The p value of a sample is:

  • The probability of seeing a sample with at least as much evidence in favour of the alternative hypothesis as the sample actually observed.
  • The smaller the p-value, the stronger the evidence in favour of the alternative hypothesis.

Let’s understand the concept of p-value with a simple example.

Suppose that a bakery owner wants to know about the satisfaction level of his customers. He takes feedback from his customers on a ‘-5 to +5’ scale, -5 representing highly dissatisfied, +5 representing fully satisfied, and 0 being neutral.
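A sketch of how a p-value could be computed for such feedback, using scipy and entirely hypothetical scores on the −5 to +5 scale. The null hypothesis is that the average customer is neutral (mean = 0); the alternative is that customers are, on average, satisfied (mean > 0):

```python
from scipy import stats

# Hypothetical feedback scores on the -5 to +5 scale
scores = [2, 3, -1, 4, 1, 0, 2, 3, -2, 1, 2, 4]

# One-sample t-test of H0: mean = 0 against H1: mean > 0
t_stat, p_value = stats.ttest_1samp(scores, popmean=0, alternative="greater")
print(f"t = {t_stat:.3f}, p-value = {p_value:.4f}")
# A small p-value is strong evidence that customers are, on average, satisfied
```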

More...

Chi-square Test

04 Jul 2020
statistics hypothesis-testing

Following are the two types of Chi-square test:

  • Chi-square test of independence
  • Chi-square goodness of fit (used to test whether sample data correctly represents the population)

Chi-square test of independence

This is used to determine if there is a significant relationship between 2 nominal (categorical) variables.
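With scipy, the test of independence takes a contingency table of observed counts; the table below is hypothetical:

```python
from scipy.stats import chi2_contingency

# Hypothetical contingency table: gender vs product preference
#           Product A  Product B
observed = [[30, 10],            # Men
            [20, 40]]            # Women

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.3f}, p = {p_value:.4f}, dof = {dof}")
# A small p-value suggests the two categorical variables are not independent
```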

More...

A/B testing

04 Jul 2020
statistics hypothesis-testing
  • The process of testing original and alternate versions of a website is known as A/B testing.
  • In this process, we employ data-driven decision making, instead of just gut feeling, to decide on a change to the website.
  • For a website, divide all the visitors to the platform into two categories:
    • one experiencing the control (version A)
    • the other experiencing the variation (version B)
  • A/B testing is a tool in the hands of managers, which they can use to understand user behaviour in two competing scenarios. This helps them select the best way to make customers complete the activities the website wants them to.
  • A/B testing helps you find out what you could change in the current product to increase user conversion rate.
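One common way to judge such a test is a chi-square test on the conversion counts of the two versions; the numbers below are hypothetical:

```python
from scipy.stats import chi2_contingency

# Hypothetical A/B test results: [converted, did not convert]
version_a = [200, 1800]   # control group A: 10% conversion
version_b = [260, 1740]   # variation B: 13% conversion

chi2, p_value, dof, expected = chi2_contingency([version_a, version_b])
print(f"p-value = {p_value:.4f}")
# A small p-value suggests the difference in conversion rate
# is unlikely to be due to chance alone
```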
More...

T-Test

02 Jul 2020
statistics hypothesis-testing

Hypothesis testing can be done only when the following two conditions are met:

  • Standard deviation of population is known
  • Sample size is greater than 30

But how will you test the hypothesis if these conditions are not fulfilled? By using a t-test.
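A sketch with scipy and hypothetical small samples (population standard deviation unknown, sample sizes well under 30), which is exactly the setting where the t-test applies:

```python
from scipy import stats

# Two small hypothetical samples; population standard deviation is unknown
sample_a = [22.1, 21.8, 23.0, 22.5, 21.9, 22.7]
sample_b = [23.4, 23.1, 24.0, 23.6, 23.9, 23.2]

# Two-sample t-test of H0: the two population means are equal
t_stat, p_value = stats.ttest_ind(sample_a, sample_b)
print(f"t = {t_stat:.3f}, p-value = {p_value:.4f}")
# A small p-value means the two group means are significantly different
```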

The t-distribution is:

More...

Hypothesis testing

28 Jun 2020
statistics hypothesis-testing
  • An assumption or claim made about the entire population, using sample statistics obtained after an analysis is performed on a sample, is called a hypothesis.
  • What hypothesis testing helps you do is statistically verify whether a claim is likely to be true or not for the whole population.
  • Thus, we can say that Hypothesis testing is a method or procedure that tests the statistical validity of a claim.
  • Components involved in Hypothesis testing:
    • Null hypothesis: It states that there is no change or no difference in the situation and assumes that the status quo is true. It always contains the = sign and represents the common belief about the population.
    • Alternative hypothesis: It is the claim that opposes the null hypothesis and challenges the status quo; it may or may not be proved. It never contains the = sign.
More...

Inferential statistics

26 Jun 2020
statistics sampling

Terminology

  • Population: Entire data
  • Sample: Part of the population selected by a defined procedure to be representative of the data.
  • Variables used:

    Term                 Sample   Population
    Size                 n        N
    Mean                 x̄        μ
    Standard Deviation   s        σ
More...

Normal distribution

25 Jun 2020
statistics probability-distributions
  • A normal distribution curve is a type of probability distribution for continuous random variables.
  • Examples:
    • The distribution of height
    • The distribution of IQ
  • Properties of normally distributed curve are as follows:
    • It is symmetrical on both sides of its mean
    • The mean lies at the middle of the curve. Mean = Median = Mode
    • The total area under the curve is equal to 1 (since it is a probability density function)
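The properties above are easy to verify for the standard normal distribution with scipy:

```python
from scipy.stats import norm

# Standard normal: mean 0, standard deviation 1

# Total area under the curve is 1
print(norm.cdf(float("inf")))       # 1.0

# Symmetry about the mean: half the area lies on each side
print(norm.cdf(0))                  # 0.5

# About 68% of values lie within one standard deviation of the mean
print(norm.cdf(1) - norm.cdf(-1))   # ~0.6827
```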
More...

Probability distributions

18 Jun 2020
statistics probability-distributions

Random variable

  • A variable whose possible values are outcomes of a random experiment is called a random variable.
  • Types:
    • Discrete random variable is a random variable that has a countable number of possible values.
      • Example: a random variable representing the sum of two dice.
    • Continuous random variable is a random variable that can take on any value within a range, so its possible values are uncountably infinite.
      • Example: a random variable representing the height of students in a class.

Probability distribution

  • It is a table of values that shows the probabilities of all possible values of a random variable.
  • Probability distributions for discrete and continuous random variables are different.
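For the discrete example above (sum of two dice), the whole distribution can be tabulated by enumerating the 36 equally likely outcomes:

```python
from collections import Counter
from itertools import product

# Probability distribution of a discrete random variable:
# the sum of two fair dice (36 equally likely outcomes)
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
distribution = {s: c / 36 for s, c in sorted(counts.items())}

for value, prob in distribution.items():
    print(f"P(sum = {value}) = {prob:.4f}")

# The probabilities of all possible values sum to 1
print(abs(sum(distribution.values()) - 1) < 1e-9)  # True
```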
More...

Joint probability and conditional probability

18 Jun 2020
statistics probability-basics

Joint probability

  • Joint probability is the probability of two events occurring simultaneously.
  • When the two events are independent of each other, the joint probability is the product of their individual probabilities.
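A tiny worked example for two independent events, using exact fractions:

```python
from fractions import Fraction

# Joint probability of two independent events: P(A and B) = P(A) * P(B)
# Example: rolling a 6 on a fair die (A) and flipping heads on a fair coin (B)
p_a = Fraction(1, 6)
p_b = Fraction(1, 2)
p_joint = p_a * p_b
print(p_joint)  # 1/12
```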
More...

Introduction to probability

16 Jun 2020
statistics probability-basics

Definition

  • Probability is the measurement of the likelihood of the occurrence of an event.
  • It ranges between 0 and 1.
  • 0 = impossible event.
  • 1 = certain event.
More...

Image classification using OpenCV

13 Jun 2020
open-cv deep-learning image-classification

In the Deep Learning with OpenCV blog, we talked about using the pre-trained models provided in OpenCV 3.3 to classify images or even videos (real-time webcam streams, video files, etc.). In this blog, we'll deep-dive into image classification using OpenCV and GoogLeNet (pre-trained on ImageNet) using the Caffe framework.

More...

Deep Learning with OpenCV

01 Jun 2020
open-cv deep-learning

Deep learning with OpenCV using its inbuilt dnn module

Introduction to dnn module

The dnn module of OpenCV was included in the main repository in v3.3.

With OpenCV 3.3 or later, we can utilize pre-trained networks with popular deep learning frameworks. The fact that they are pre-trained implies that we don’t need to spend many hours training the network - rather we can complete a forward pass and utilize the output to make a decision within our application.

More...