Scipy Stats – Complete Guide

In this Python tutorial, we will learn how to use scipy.stats through various examples. We will cover the following topics.

  • Scipy Stats
  • Scipy Stats Lognormal
  • Scipy Stats Norm
  • Scipy Stats T-test
  • Scipy Stats Pearsonr
  • Scipy Stats chi-square
  • Scipy Stats IQR
  • Scipy Stats Poisson
  • Scipy Stats Entropy
  • Scipy Stats Anova
  • Scipy Stats Anderson
  • Scipy Stats Average
  • Scipy Stats Alpha
  • Scipy Stats Boxcox
  • Scipy Stats Binom
  • Scipy Stats Beta
  • Scipy Stats Binomial test
  • Scipy Stats Binned statistics
  • Scipy Stats Binom pmf
  • Scipy Stats CDF
  • Scipy Stats Cauchy
  • Scipy Stats Describe
  • Scipy Stats Exponential
  • Scipy Stats Gamma
  • Scipy Stats Geometric
  • Scipy Stats gmean
  • Scipy Stats Gennorm
  • Scipy Stats Genpareto
  • Scipy Stats Gumbel
  • Scipy Stats Genextreme
  • Scipy Stats Histogram
  • Scipy Stats Half normal
  • Scipy Stats Half cauchy
  • Scipy Stats Inverse gamma
  • Scipy Stats Inverse normal CDF
  • Scipy Stats Johnson
  • Scipy Stats PDF
  • Scipy Stats Hypergeom
  • Scipy Stats Interval
  • Scipy Stats ISF
  • Scipy Stats Independent T-test
  • Scipy Stats Fisher Exact

Scipy Stats

SciPy has a module, scipy.stats, that contains a large number of statistical functions. Statistics is a very broad area, so the module groups its functions into the following major categories.

  • Summary Statistics
  • Frequency Statistics
  • Statistical tests
  • Probability distributions
  • Correlation functions
  • Quasi-Monte Carlo
  • Masked statistics functions
  • Other statistical functionality

Scipy Stats Lognormal

A lognormal random variable is a continuous random variable whose logarithm is normally distributed. In SciPy it is available as scipy.stats.lognorm.

The syntax is given below.

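scipy.stats.lognorm.method_name(data,s,loc,scale,size,moments)

Where parameters are:

  • data: It is a set of points or values, in the form of array data, at which the method is evaluated.
  • s: It is the shape parameter, the standard deviation of the logarithm of the variable. It is required.
  • loc: It is used to shift the distribution, by default it is 0.
  • scale: It is used to scale the distribution, by default it is 1. It equals exp(mu), where mu is the mean of the logarithm of the variable.
  • size: It is the number of random variates to generate, used by rvs().
  • moments: It is used by stats() to choose which statistics to return: mean ('m'), variance ('v'), skew ('s'), and kurtosis ('k').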

The above parameters are common to the methods of the object scipy.stats.lognorm. The methods are given below.

  • scipy.stats.lognorm.cdf(): It is used for the cumulative distribution function.
  • scipy.stats.lognorm.pdf(): It is used for the probability density function.
  • scipy.stats.lognorm.rvs(): To get the random variates.
  • scipy.stats.lognorm.stats(): It is used to get the mean, variance, skew, and kurtosis.
  • scipy.stats.lognorm.logpdf(): It is used to get the log of the probability density function.
  • scipy.stats.lognorm.logcdf(): It is used to find the log of the cumulative distribution function.
  • scipy.stats.lognorm.sf(): It is used to get the values of the survival function.
  • scipy.stats.lognorm.isf(): It is used to get the values of the inverse survival function.
  • scipy.stats.lognorm.logsf(): It is used to find the log of the survival function.
  • scipy.stats.lognorm.mean(): It is used to find the mean of the distribution.
  • scipy.stats.lognorm.median(): It is used to find the median of the distribution.
  • scipy.stats.lognorm.var(): It is used to find the variance of the distribution.
  • scipy.stats.lognorm.std(): It is used to find the standard deviation of the distribution.
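
As a minimal sketch of how these methods fit together, the example below evaluates a few of them and checks the link between the lognormal and the normal distribution. The values mu = 0.5 and sigma = 0.25 are arbitrary choices made for illustration, not values from this tutorial.

```python
import numpy as np
from scipy import stats

# Illustrative parameters of the underlying normal distribution (assumed values).
mu, sigma = 0.5, 0.25

# In scipy.stats.lognorm the shape is sigma and the scale is exp(mu).
x = np.array([0.5, 1.0, 2.0, 4.0])
lognorm_cdf = stats.lognorm.cdf(x, sigma, scale=np.exp(mu))

# The same probabilities come from the normal CDF evaluated at log(x).
norm_cdf = stats.norm.cdf(np.log(x), loc=mu, scale=sigma)

# The median of a lognormal distribution equals its scale, exp(mu).
median = stats.lognorm.median(sigma, scale=np.exp(mu))

print(lognorm_cdf)
print(norm_cdf)
print(median, np.exp(mu))
```

The shape is passed positionally here, right after the data, which is how the distribution methods expect shape parameters.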

Scipy Stats Norm

The scipy.stats.norm object represents a normal continuous random variable. It provides the usual functions of the normal distribution, such as the CDF, PDF, and median.

It has two important parameters: loc for the mean and scale for the standard deviation. These parameters control the location and spread of the distribution.

The syntax is given below.

scipy.stats.norm.method_name(data,loc,size,moments,scale)

Where parameters are:

  • data: It is a set of points or values, in the form of array data, at which the method is evaluated.
  • loc: It is used to specify the mean, by default it is 0.
  • size: It is the number of random variates to generate, used by rvs().
  • moments: It is used by stats() to choose which statistics to return: mean ('m'), variance ('v'), skew ('s'), and kurtosis ('k').
  • scale: It is used to specify the standard deviation, by default it is 1.

The above parameters are common to the methods of the object scipy.stats.norm. The methods are given below.

  • scipy.stats.norm.cdf(): It is used for the cumulative distribution function.
  • scipy.stats.norm.pdf(): It is used for the probability density function.
  • scipy.stats.norm.rvs(): To get the random variates.
  • scipy.stats.norm.stats(): It is used to get the mean, variance, skew, and kurtosis.
  • scipy.stats.norm.logpdf(): It is used to get the log of the probability density function.
  • scipy.stats.norm.logcdf(): It is used to find the log of the cumulative distribution function.
  • scipy.stats.norm.sf(): It is used to get the values of the survival function.
  • scipy.stats.norm.isf(): It is used to get the values of the inverse survival function.
  • scipy.stats.norm.logsf(): It is used to find the log of the survival function.
  • scipy.stats.norm.mean(): It is used to find the mean of the normal distribution.
  • scipy.stats.norm.median(): It is used to find the median of the normal distribution.
  • scipy.stats.norm.var(): It is used to find the variance of the distribution.
  • scipy.stats.norm.std(): It is used to find the standard deviation of the distribution.

Let’s take an example by using one of the methods mentioned above to know how to use the methods with parameters.

Import the required libraries using the below code.

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

Create observation data values and calculate the probability density function from these data values with mean = 0 and standard deviation = 1.

observation_x = np.linspace(-4,4,200)
pdf_norm = stats.norm.pdf(observation_x,loc=0,scale=1)

Plot the created distribution using the below code.

plt.plot(observation_x,pdf_norm)
plt.xlabel('x-values')
plt.ylabel('PDF_norm_values')
plt.title("Probability density function of normal distribution")
plt.show()

Look at the output, which shows the probability density function graph of normal distribution.

Scipy Stats CDF

Scipy stats CDF stands for cumulative distribution function, which is a method of the object scipy.stats.norm. The values of the CDF range from 0 to 1.

The syntax is given below.

scipy.stats.norm.cdf(data,loc,scale)

Where parameters are:

  • data: It is a set of points or values, in the form of array data, at which the function is evaluated.
  • loc: It is used to specify the mean, by default it is 0.
  • scale: It is used to specify the standard deviation, by default it is 1.

Let’s take an example and calculate using the below steps:

Import the required libraries using the below code.

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

Create observation data values and calculate the cumulative distribution function from these data values with mean = 0 and standard deviation = 1.

observation_x = np.linspace(-4,4,200)
cdf_norm = stats.norm.cdf(observation_x,loc=0,scale=1)

Plot the created distribution using the below code.

plt.plot(observation_x,cdf_norm)
plt.xlabel('x-values')
plt.ylabel('CDF_norm_values')
plt.title("Cumulative distribution function")
plt.show()

From the above output, the CDF increases from 0 to 1. Its value at a point x is the probability that a value drawn from the population is less than or equal to x.
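
A few numeric values make this reading of the CDF concrete. The sketch below uses the standard normal distribution; the points 0, 1, and 0.975 are chosen only for illustration.

```python
from scipy import stats

# P(X <= 0) for a standard normal variable: half of the probability mass.
p_below_zero = stats.norm.cdf(0)

# P(-1 <= X <= 1) is a difference of two CDF values (about 0.68).
p_within_one_sd = stats.norm.cdf(1) - stats.norm.cdf(-1)

# ppf() is the inverse of cdf(): it maps a probability back to a point x.
x_975 = stats.norm.ppf(0.975)

print(p_below_zero)
print(p_within_one_sd)
print(x_975, stats.norm.cdf(x_975))
```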

Scipy Stats Histogram

SciPy used to provide a method histogram() in the subpackage scipy.stats to create a histogram from the given values. This function divides the range into several bins and returns the number of instances in each bin.

The syntax is given below.

scipy.stats.histogram(a, numbins, defaultreallimits, weights)

Where parameters are:

  • a (array): It is the array of data that is provided as input.
  • numbins (int): It is used to set the number of bins for the histogram.
  • defaultreallimits: It is used to specify the range like lower and upper values of the histogram.
  • weights (array): It is used to specify the weight of each value within the array.

The above function exists only in older versions of SciPy. The top-level alias scipy.histogram is NumPy's histogram() function and is deprecated, so here we will call numpy.histogram() directly. Let’s take an example using the below steps.

Import the required libraries using the below code.

import numpy as np
import matplotlib.pyplot as plt

Generate the histogram values and bin edges by passing the array [1, 2, 2, 3, 2, 3, 3] and the bin edges range(4) to the function histogram().

histogram, bins = np.histogram([1, 2, 2, 3, 2, 3, 3],
                               bins = range(4))

View the number of values in each bin and the bin edges respectively.

print ("Number of values in each bin : ", histogram)
print ("Edges of the bins            : ", bins)

Plot the above-created histogram using the below code.

plt.bar(bins[:-1], histogram, width = 0.9)
plt.xlim(min(bins), max(bins))
plt.show()

Look at the above output, this is how a histogram is created from the given values.
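
One detail of this example is worth a short sketch. With the edges 0, 1, 2, 3 the last bin is closed on the right, so the values 2 and 3 land in the same bin. The half-integer edges used below are an illustrative assumption, not part of the original example; they give each integer value its own bin.

```python
import numpy as np

values = [1, 2, 2, 3, 2, 3, 3]

# Edges 0, 1, 2, 3: the bins are [0, 1), [1, 2) and [2, 3]; the last one is closed.
counts_int_edges, edges_int = np.histogram(values, bins=range(4))

# Half-integer edges (an illustrative choice) centre one bin on each integer.
counts_half_edges, edges_half = np.histogram(values, bins=[0.5, 1.5, 2.5, 3.5])

print(counts_int_edges)   # counts for edges 0, 1, 2, 3
print(counts_half_edges)  # counts for edges 0.5, 1.5, 2.5, 3.5
```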

Scipy Stats Pearsonr

The Pearson correlation coefficient measures the linear relationship between two variables or datasets. The method pearsonr() in the subpackage scipy.stats computes it.

The syntax is given below.

scipy.stats.pearsonr(x, y)

Where parameters are:

  • x: It is the array data.
  • y: It is also the array data.

The method pearsonr() returns two values: r (the Pearson correlation coefficient) and a p-value. The value of r lies between -1 and 1, where -1 means a perfect negative linear relationship, 1 means a perfect positive linear relationship, and 0 means there is no linear relationship.

Let’s take an example by following the below steps:

Import the libraries using the below code.

from scipy import stats

Now access the method pearsonr() and pass it two array values using the below code.

r, p_values = stats.pearsonr([1, 4, 3, 2, 5], [9, 10, 3.5, 7, 5])

Check the values of the Pearson correlation coefficient and p-value using the below code.

print('The Pearson correlation coefficient',r)
print('P-value                            ',p_values)
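
To see where r comes from, the sketch below recomputes it from the textbook formula for the same two arrays and compares the result with pearsonr(). The variable names are chosen here for illustration.

```python
import numpy as np
from scipy import stats

x = np.array([1, 4, 3, 2, 5])
y = np.array([9, 10, 3.5, 7, 5])

# Deviations of each value from its sample mean.
dx = x - x.mean()
dy = y - y.mean()

# Pearson's r: sum of co-deviations over the product of the root sums of squares.
r_manual = np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2))

r_scipy, p_value = stats.pearsonr(x, y)

print(r_manual)
print(r_scipy)
```

Both values are about -0.29, a weak negative linear relationship.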

Scipy Stats PDF

Scipy stats PDF stands for probability density function, which is a method of the object scipy.stats.norm. The values of the PDF are never negative and the total area under the curve is 1, although the density itself can exceed 1.

The syntax is given below.

scipy.stats.norm.pdf(data,loc,scale)

Where parameters are:

  • data: It is a set of points or values, in the form of array data, at which the function is evaluated.
  • loc: It is used to specify the mean, by default it is 0.
  • scale: It is used to specify the standard deviation, by default it is 1.

Let’s take an example and calculate using the below steps:

Import the required libraries using the below code.

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

Create observation data values and calculate the probability density function from these data values with mean = 0 and standard deviation = 1.

observation_x = np.linspace(-4,4,200)
pdf_norm = stats.norm.pdf(observation_x,loc=0,scale=1)

Plot the created distribution using the below code.

plt.plot(observation_x,pdf_norm)
plt.xlabel('x-values')
plt.ylabel('PDF_norm_values')
plt.title("Probability density function")
plt.show()
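
A density is not a probability, so its values are not limited to 1. The sketch below illustrates this with a narrow normal distribution; scale = 0.1 is an assumed value picked for the demonstration, and the area under the curve is approximated with a simple Riemann sum.

```python
import numpy as np
from scipy import stats

# A narrow normal distribution; scale=0.1 is an illustrative assumption.
narrow_scale = 0.1

# The density at the mean is about 3.99, well above 1.
peak = stats.norm.pdf(0, loc=0, scale=narrow_scale)

# Approximate the area under the curve with a simple Riemann sum.
x = np.linspace(-1, 1, 20001)
dx = x[1] - x[0]
area = np.sum(stats.norm.pdf(x, loc=0, scale=narrow_scale)) * dx

print(peak)
print(area)
```

The area stays at 1 even though the peak is higher than 1, because the curve is correspondingly narrow.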

Scipy Stats chi-square

The chi-square test checks the difference between the observed and expected results. It is used in hypothesis testing and is applied to categorical data. In SciPy, the method chisquare in the subpackage scipy.stats performs the test.

  • To use the chi-squared test, the total sample size should be greater than 13.
  • This test doesn’t work well if the expected or observed frequencies in a category are very small. As a rule of thumb, keep every observed and expected frequency at least 5.

The syntax is given below.

scipy.stats.chisquare(f_obs, f_exp=None, ddof=0)

Where parameters are:

  • f_obs(array data): It is the observed frequencies in each category.
  • f_exp(array data): It is the expected frequencies in each category. By default the categories are assumed to be equally likely.
  • ddof(int): It is used to define the delta degrees of freedom, an adjustment to the degrees of freedom used for the p-value.

The method chisquare returns two float values, the first is the chi-square test statistic and the second is the p-value.

Let’s take an example by following the below steps:

Import the method chisquare from the module scipy.stats using the below code.

from scipy.stats import chisquare

Create two array variables to store the observed and expected frequencies. Both arrays here sum to 80. Pass the two arrays to the method chisquare to perform the chi-squared test.

observed_f = [10, 25, 10, 13, 11, 11]
expected_f = [15, 15, 15, 15, 15, 5]
test_value = chisquare(f_obs=observed_f, f_exp=expected_f)

View the test result using the below code.

print('The value of chi-squared test statistic: ',test_value[0])
print('The value of p-value: ',test_value[1])

The output shows the result of the chi-squared test. This is how to perform the chi-squared test on categorical data to find the differences between observed and expected data using the value of the chi-squared test statistic and p-value.
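
The statistic can also be computed by hand, which shows what the method does with the two arrays. The sketch below repeats the example data and takes the p-value from the chi-squared distribution with k - 1 degrees of freedom, which corresponds to ddof=0.

```python
import numpy as np
from scipy.stats import chi2, chisquare

observed_f = np.array([10, 25, 10, 13, 11, 11])
expected_f = np.array([15, 15, 15, 15, 15, 5])

# Chi-squared statistic: sum over categories of (observed - expected)^2 / expected.
stat_manual = np.sum((observed_f - expected_f) ** 2 / expected_f)

# With ddof=0 the p-value uses k - 1 degrees of freedom for k categories.
dof = len(observed_f) - 1
p_manual = chi2.sf(stat_manual, dof)

stat_scipy, p_scipy = chisquare(f_obs=observed_f, f_exp=expected_f)

print(stat_manual, stat_scipy)
print(p_manual, p_scipy)
```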

Scipy Stats IQR

The IQR stands for interquartile range, which is the difference between the 3rd quartile (75th percentile) and the 1st quartile (25th percentile). It is used to measure the dispersion of data. SciPy has a method iqr in the module scipy.stats to calculate the interquartile range of data along the stated axis.

The syntax is given below.

scipy.stats.iqr(x, axis=None, rng=(25, 75), nan_policy='propagate', interpolation='linear')

Where parameters are:

  • x(array data): It is the input array or object.
  • axis(int): It is used to specify the axis for computing the range. By default the IQR is computed over the whole array.
  • rng(two values in the range [0,100]): It is used to specify the percentiles between which the range is calculated.
  • nan_policy: It is used to deal with the nan values and accepts three values:
  1. omit: It means calculating the IQR by ignoring the nan values.
  2. propagate: It means returning nan.
  3. raise: It means to throw an error for the nan values.
  • interpolation(string): It is used to specify the interpolation method to use like linear, lower, higher, nearest, and midpoint.

The method iqr returns a scalar or an ndarray depending upon the provided input.

Let’s take an example to calculate the IQR given array data by following the below steps.

Import the method iqr from the module scipy.stats, along with numpy, using the below code.

from scipy.stats import iqr
import numpy as np

Create an array of data and pass it to the method iqr to calculate the IQR.

x_data = np.array([[15, 8, 7], [4, 3, 2]])

iqr(x_data)

The above output shows the interquartile range of the given array data. This is how to find the IQR of the data.
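
As a quick check, the same number can be obtained from the 25th and 75th percentiles computed with NumPy. The sketch below also shows the effect of the axis parameter on the example array.

```python
import numpy as np
from scipy.stats import iqr

x_data = np.array([[15, 8, 7], [4, 3, 2]])

# With axis=None (the default) the array is flattened before the calculation.
q25, q75 = np.percentile(x_data, [25, 75])
iqr_manual = q75 - q25
iqr_all = iqr(x_data)

# With axis=1 one IQR is returned per row.
iqr_rows = iqr(x_data, axis=1)

print(iqr_manual, iqr_all)  # both are 4.5 for this data
print(iqr_rows)
```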

Scipy Stats Average

SciPy used to expose a method mean in its top-level namespace to calculate the average of the given data. It is an alias of numpy.mean() and is deprecated, so here we call numpy.mean() directly. The mean or average is the sum of all the values divided by the number of values.

The syntax is given below.

numpy.mean(array_data,axis)

Where parameters are:

  • array_data: It is the data in the array form containing all the elements.
  • axis(int): It is used to specify the axis along which the average or mean needs to be calculated.

The method mean() returns the arithmetic mean of the elements in the array.

Let’s understand through an example following the below steps.

Import the required libraries using the below code.

import numpy as np

Create an array containing the elements whose arithmetic mean needs to be calculated.

array_data = [2,4,6,8,12,23]

Calculate the mean of the created array by passing it to the method mean().

np.mean(array_data)

The output shows the mean of the given array.
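
The sketch below checks the definition by hand and shows what the axis parameter does. It uses numpy.mean(); the two-row array named table is a hypothetical example added for illustration.

```python
import numpy as np

array_data = [2, 4, 6, 8, 12, 23]

# The mean is the sum of the values divided by how many there are.
mean_manual = sum(array_data) / len(array_data)
mean_numpy = np.mean(array_data)

# A hypothetical 2-D array to show the effect of the axis parameter.
table = np.array([[1, 2, 3],
                  [4, 5, 6]])
column_means = np.mean(table, axis=0)  # one mean per column
row_means = np.mean(table, axis=1)     # one mean per row

print(mean_manual, mean_numpy)
print(column_means, row_means)
```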

Scipy Stats Entropy

First, we need to know what entropy is. In thermodynamics, entropy is a measure of disorder or uncertainty. The concept has been carried over to statistics and information theory, where it is computed from probabilities. There, entropy is used to assess the amount of information in distributions, variables, and events.

SciPy has a method entropy() to calculate the entropy of distributions.

The syntax of the method entropy() is given below.

scipy.stats.entropy(pk, qk=None, base=None, axis=0)

Where parameters are:

  • pk(array): It defines the discrete distribution. It is normalized if it does not sum to 1.
  • qk(array data): It is the sequence against which the relative entropy is computed. It must be in the same form as pk.
  • base(float): It is used to define which logarithmic base is used, by default the natural logarithm.
  • axis(int): It is used to specify the axis along which the entropy is calculated.

Follow the below steps for the demonstration of the method entropy().

Import the method entropy() from module scipy.stats.

from scipy.stats import entropy

Pass the pk values to the method to compute the entropy.

entropy([8/9, 2/9], base=2)
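
The sketch below reproduces the same value from the definition of entropy. The input values sum to 10/9, so entropy() first normalizes them to [0.8, 0.2]. The reference distribution qk used for the relative entropy is a hypothetical choice made for illustration.

```python
import numpy as np
from scipy.stats import entropy

# The values from the example; they sum to 10/9, so entropy() normalizes them.
pk = np.array([8/9, 2/9])
pk_normalized = pk / pk.sum()  # [0.8, 0.2]

# Shannon entropy in bits: -sum(p * log2(p)).
h_manual = -np.sum(pk_normalized * np.log2(pk_normalized))
h_scipy = entropy(pk, base=2)

# With qk given, entropy() returns the relative entropy sum(p * log2(p / q)).
qk = np.array([0.5, 0.5])  # hypothetical reference distribution
kl_manual = np.sum(pk_normalized * np.log2(pk_normalized / qk))
kl_scipy = entropy(pk, qk, base=2)

print(h_manual, h_scipy)
print(kl_manual, kl_scipy)
```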

Scipy Stats Anderson

The Anderson-Darling test checks the null hypothesis that a sample comes from a population that follows a specific distribution. SciPy has a method anderson() in the module scipy.stats for that test.

The syntax of the method anderson() is given below.

scipy.stats.anderson(x, dist='norm')

Where parameters are:

  • x(array_data): It is the sample data.
  • dist(string): It is used to define the distribution to test against. It accepts the following values.
  1. ‘norm’
  2. ‘expon’
  3. ‘logistic’
  4. ‘gumbel’
  5. ‘gumbel_l’
  6. ‘gumbel_r’
  7. ‘extreme1’

The method anderson() returns statistic, critical_values, and significance_level.
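
As a minimal sketch of how the three returned values are read together, the example below tests a sample for normality. The sample is hypothetical: 200 seeded draws from a normal distribution with assumed loc and scale values.

```python
import numpy as np
from scipy import stats

# Hypothetical sample: 200 draws from a normal distribution, seeded for repeatability.
rng = np.random.default_rng(0)
sample = rng.normal(loc=5.0, scale=2.0, size=200)

result = stats.anderson(sample, dist='norm')

print('Statistic:', result.statistic)
# Each critical value belongs to the significance level (in percent) at the same index.
for level, critical in zip(result.significance_level, result.critical_values):
    decision = 'reject' if result.statistic > critical else 'do not reject'
    print(f'{level}%: critical value {critical}, {decision} normality')
```

The null hypothesis is rejected at a given significance level when the statistic is larger than the critical value for that level.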

Scipy Stats Anova

ANOVA stands for analysis of variance. It tests the null hypothesis that the population means of two or more groups are the same, against the alternative that at least one mean differs. SciPy has a method f_oneway for this test.

The syntax is given below.

scipy.stats.f_oneway(*args, axis=0)

Where parameters are:

  • *args(array_data): It is sample_1, sample_2, and so on, the measurements of every group.
  • axis(int): It is used to specify the axis of the provided arrays as input on which the test is performed.

The method f_oneway returns two values, the F statistic and the p-value. They are floats for one-dimensional samples and arrays when the samples are multidimensional.

Let’s understand through demonstration by following the below steps.

Import the method f_oneway from the module scipy.stats using the below code.

from scipy.stats import f_oneway
import numpy as np

Create the multidimensional arrays using the below code. With the default axis=0, each column is tested separately.

first_data = np.array([[7.77, 7.03, 5.71],
              [5.17, 7.35, 7.00],
              [7.39, 7.57, 7.57],
              [7.45, 5.33, 9.35],
              [5.41, 7.10, 9.33],
              [7.00, 7.24, 7.44]])
second_data = np.array([[5.35, 7.30, 7.15],
              [5.55, 5.57, 7.53],
              [5.72, 7.73, 5.72],
              [7.01, 9.19, 7.41],
              [7.75, 7.77, 7.30],
              [5.90, 7.97, 5.97]])
third_data = np.array([[3.31, 7.77, 1.01],
              [7.25, 3.24, 3.52],
              [5.32, 7.71, 5.19],
              [7.47, 7.73, 7.91],
              [7.59, 5.01, 5.07],
              [3.07, 9.72, 7.47]])

Pass the above-created arrays to a method f_oneway for the testing using the below code.

f_statistic_value, p_value = f_oneway(first_data,second_data,third_data)

Check the computed values using the below code.

print('The value of F statistic',f_statistic_value)
print('The value of p-value',p_value)

This is how to use the ANOVA test using SciPy.
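
To show what the F statistic measures, the sketch below computes it by hand for three one-dimensional groups and compares it with f_oneway. The group values are hypothetical numbers chosen for illustration.

```python
import numpy as np
from scipy.stats import f_oneway

# Three hypothetical one-dimensional groups (illustrative values).
groups = [np.array([6.9, 5.4, 5.8, 4.6, 4.0]),
          np.array([8.3, 6.8, 7.8, 9.2, 6.5]),
          np.array([8.0, 10.5, 8.1, 6.9, 9.3])]

all_values = np.concatenate(groups)
grand_mean = all_values.mean()
k, n = len(groups), len(all_values)

# Variation of the group means around the grand mean.
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# Variation of the observations around their own group mean.
ss_within = sum(np.sum((g - g.mean()) ** 2) for g in groups)

# F is the ratio of the two mean squares.
f_manual = (ss_between / (k - 1)) / (ss_within / (n - k))

f_scipy, p_value = f_oneway(*groups)

print(f_manual, f_scipy)
print(p_value)
```

A large F value means the group means differ by more than the spread inside the groups would explain.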

Scipy Stats T-test

The one-sample T-test checks the null hypothesis that the mean of a sample of independent observations equals a given population mean. There are several T-test methods in the module scipy.stats, but here we will learn about a specific method, ttest_1samp.

The syntax is given below.

scipy.stats.ttest_1samp(a, popmean, axis=0, nan_policy='propagate')

Where parameters are:

  • a(array_data): It is the sample of independent observations.
  • popmean(float or array_data): It is the mean or expected value of the population.
  • axis(int): It is used to specify the axis on which the test is done.
  • nan_policy: It is used to deal with the nan values and accepts three values:
  1. omit: It means performing the test by ignoring the nan values.
  2. propagate: It means returning nan.
  3. raise: It means to throw an error for the nan values.

The method ttest_1samp returns two values, the t-statistic and the p-value.

Let’s take an example by following the below steps:

Import stats from SciPy, along with numpy, using the below code.

from scipy import stats
import numpy as np

Create a random number generator using the below code.

randomnub_gen = np.random.default_rng()

Create a random sample from a normal distribution using the below code.

random_variate_s = stats.norm.rvs(loc=6, scale=11, size=(51, 3), random_state=randomnub_gen)

The generated sample random_variate_s is an array with 51 rows and 3 columns.

Now perform the T-test on this generated random sample to know whether the sample mean is equal to the population mean or not.

stats.ttest_1samp(random_variate_s, 5.0)

Again perform the test with a population mean equal to zero using the below code.

stats.ttest_1samp(random_variate_s, 0.0)

From the above output, we can reject or fail to reject the null hypothesis based on the statistic and p-value.
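
The statistic and p-value can be reproduced from their definitions. The sketch below does this for a one-dimensional sample; the seed and the sample itself are assumptions made so that the run is repeatable.

```python
import numpy as np
from scipy import stats

# Hypothetical sample, seeded so the run is repeatable.
rng = np.random.default_rng(42)
sample = stats.norm.rvs(loc=6, scale=11, size=51, random_state=rng)
popmean = 5.0

# t = (sample mean - population mean) / (sample standard deviation / sqrt(n)).
n = sample.size
t_manual = (sample.mean() - popmean) / (sample.std(ddof=1) / np.sqrt(n))

# Two-sided p-value from the t distribution with n - 1 degrees of freedom.
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)

result = stats.ttest_1samp(sample, popmean)

print(t_manual, result.statistic)
print(p_manual, result.pvalue)
```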

Scipy Stats Half normal

The scipy.stats.halfnorm object represents a half-normal continuous random variable, that is, the absolute value of a normal variable with mean 0. It provides the usual functions of the half-normal distribution, such as the CDF, PDF, and median.

It has two important parameters: loc, which shifts the distribution, and scale, which is the standard deviation of the underlying normal distribution. These parameters control the location and spread of the distribution.

The syntax is given below.

scipy.stats.halfnorm.method_name(data,loc,size,moments,scale)

Where parameters are:

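  • data: It is a set of points or values, in the form of array data, at which the method is evaluated.
  • loc: It is used to shift the distribution and sets its lower bound, by default it is 0.
  • size: It is the number of random variates to generate, used by rvs().
  • moments: It is used by stats() to choose which statistics to return: mean ('m'), variance ('v'), skew ('s'), and kurtosis ('k').
  • scale: It is used to specify the standard deviation of the underlying normal distribution, by default it is 1.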

The above parameters are common to the methods of the object scipy.stats.halfnorm. The methods are given below.

  • scipy.stats.halfnorm.cdf(): It is used for the cumulative distribution function.
  • scipy.stats.halfnorm.pdf(): It is used for the probability density function.
  • scipy.stats.halfnorm.rvs(): To get the random variates.
  • scipy.stats.halfnorm.stats(): It is used to get the mean, variance, skew, and kurtosis.
  • scipy.stats.halfnorm.logpdf(): It is used to get the log of the probability density function.
  • scipy.stats.halfnorm.logcdf(): It is used to find the log of the cumulative distribution function.
  • scipy.stats.halfnorm.sf(): It is used to get the values of the survival function.
  • scipy.stats.halfnorm.isf(): It is used to get the values of the inverse survival function.
  • scipy.stats.halfnorm.logsf(): It is used to find the log of the survival function.
  • scipy.stats.halfnorm.mean(): It is used to find the mean of the half-normal distribution.
  • scipy.stats.halfnorm.median(): It is used to find the median of the half-normal distribution.
  • scipy.stats.halfnorm.var(): It is used to find the variance of the distribution.
  • scipy.stats.halfnorm.std(): It is used to find the standard deviation of the distribution.

Let’s take an example by using one of the methods mentioned above to know how to use the methods with parameters.

Import the required libraries using the below code.

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

Create observation data values and calculate the probability density function from these data values with loc = 0 and scale = 1.

observation_x = np.linspace(0,4,200)
pdf_halfnorm = stats.halfnorm.pdf(observation_x,loc=0,scale=1)

Plot the created distribution using the below code.

plt.plot(observation_x,pdf_halfnorm)
plt.xlabel('x-values')
plt.ylabel('PDF_halfnorm_values')
plt.title("Probability density function of half normal distribution")
plt.show()

Look at the above output, which shows the half-normal distribution.
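
The link between the half-normal and the normal distribution can be verified numerically. The sketch below uses the standard versions of both distributions (loc = 0, scale = 1) on a grid of non-negative points chosen for illustration.

```python
import numpy as np
from scipy import stats

x = np.linspace(0, 4, 50)

# For x >= 0 the half-normal density is twice the standard normal density.
pdf_halfnorm = stats.halfnorm.pdf(x)
pdf_from_norm = 2 * stats.norm.pdf(x)

# The mean of the standard half-normal distribution is sqrt(2 / pi).
mean_halfnorm = stats.halfnorm.mean()

print(np.allclose(pdf_halfnorm, pdf_from_norm))
print(mean_halfnorm, np.sqrt(2 / np.pi))
```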

Scipy Stats Cauchy

The Cauchy distribution is a continuous probability distribution with a bell-shaped curve like the normal distribution. It has much heavier tails than the normal distribution, and its mean and variance are undefined.

The syntax is given below.

scipy.stats.cauchy.method_name(data,loc,scale)

Where parameters are:

  • data: It is a set of points or values, in the form of array data, at which the method is evaluated.
  • loc: It is used to specify the location of the peak, which is also the median, by default it is 0.
  • scale: It is used to specify the scale, the half-width of the curve at half of its maximum height, by default it is 1.

The above parameters are common to the methods of the object scipy.stats.cauchy. The methods are given below.

  • scipy.stats.cauchy.cdf(): It is used for the cumulative distribution function.
  • scipy.stats.cauchy.pdf(): It is used for the probability density function.
  • scipy.stats.cauchy.rvs(): To get the random variates.
  • scipy.stats.cauchy.stats(): It is used to get the mean, variance, skew, and kurtosis, all of which are undefined for the Cauchy distribution.
  • scipy.stats.cauchy.logpdf(): It is used to get the log of the probability density function.
  • scipy.stats.cauchy.logcdf(): It is used to find the log of the cumulative distribution function.
  • scipy.stats.cauchy.sf(): It is used to get the values of the survival function.
  • scipy.stats.cauchy.isf(): It is used to get the values of the inverse survival function.
  • scipy.stats.cauchy.logsf(): It is used to find the log of the survival function.
  • scipy.stats.cauchy.mean(): It is used to find the mean of the distribution, which is undefined for the Cauchy distribution.
  • scipy.stats.cauchy.median(): It is used to find the median of the distribution.
  • scipy.stats.cauchy.var(): It is used to find the variance of the distribution, which is undefined for the Cauchy distribution.
  • scipy.stats.cauchy.std(): It is used to find the standard deviation of the distribution, which is undefined for the Cauchy distribution.

Let’s take an example by following the below steps:

Import cauchy, numpy, and matplotlib using the below code.

from scipy.stats import cauchy
import matplotlib.pyplot as plt
import numpy as np

Create a cauchy distribution using the below code.

fig, ax = plt.subplots(1, 1)
x = np.linspace(cauchy.ppf(0.02),
                cauchy.ppf(0.98), 99)
ax.plot(x, cauchy.pdf(x),
       'r-', lw=5, alpha=0.6, label='cauchy pdf')
plt.show()

Look at the above output, this is how the Cauchy distribution looks: bell-shaped like a normal distribution, but with heavier tails.
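
A numeric comparison of the standard Cauchy and standard normal distributions (loc = 0, scale = 1 for both) shows where they differ. The cut-off x = 4 used for the tail is an arbitrary point picked for illustration.

```python
from scipy.stats import cauchy, norm

# Peak height of the standard densities at x = 0.
cauchy_peak = cauchy.pdf(0)  # 1 / pi, about 0.318
norm_peak = norm.pdf(0)      # 1 / sqrt(2 * pi), about 0.399

# Probability of a value larger than 4: the survival function sf(4).
cauchy_tail = cauchy.sf(4)
norm_tail = norm.sf(4)

print(cauchy_peak, norm_peak)
print(cauchy_tail, norm_tail)
```

With equal scale values the Cauchy peak is the lower of the two, while the Cauchy tail probability is larger by several orders of magnitude.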

Scipy Stats Half cauchy

The half-Cauchy distribution is a continuous probability distribution obtained by keeping only the non-negative half of a Cauchy distribution centered at 0. It resembles the half-normal distribution but has a much heavier tail.

The syntax is given below.

scipy.stats.halfcauchy.method_name(data,loc,scale)

Where parameters are:

  • data: It is a set of points or values, in the form of array data, at which the method is evaluated.
  • loc: It is used to shift the distribution and sets its lower bound, by default it is 0.
  • scale: It is used to specify the scale of the distribution, by default it is 1.

The above parameters are common to the methods of the object scipy.stats.halfcauchy. The methods are given below.

  • scipy.stats.halfcauchy.cdf(): It is used for the cumulative distribution function.
  • scipy.stats.halfcauchy.pdf(): It is used for the probability density function.
  • scipy.stats.halfcauchy.rvs(): To get the random variates.
  • scipy.stats.halfcauchy.stats(): It is used to get the mean, variance, skew, and kurtosis.
  • scipy.stats.halfcauchy.logpdf(): It is used to get the log of the probability density function.
  • scipy.stats.halfcauchy.logcdf(): It is used to find the log of the cumulative distribution function.
  • scipy.stats.halfcauchy.sf(): It is used to get the values of the survival function.
  • scipy.stats.halfcauchy.isf(): It is used to get the values of the inverse survival function.
  • scipy.stats.halfcauchy.logsf(): It is used to find the log of the survival function.
  • scipy.stats.halfcauchy.mean(): It is used to find the mean of the distribution.
  • scipy.stats.halfcauchy.median(): It is used to find the median of the distribution.
  • scipy.stats.halfcauchy.var(): It is used to find the variance of the distribution.
  • scipy.stats.halfcauchy.std(): It is used to find the standard deviation of the distribution.

Let’s take an example by following the below steps:

Import halfcauchy, numpy, and matplotlib using the below code.

from scipy.stats import halfcauchy
import matplotlib.pyplot as plt
import numpy as np

Create a halfcauchy distribution using the below code.

fig, ax = plt.subplots(1, 1)
x = np.linspace(halfcauchy.ppf(0.02),
                halfcauchy.ppf(0.98), 99)
ax.plot(x, halfcauchy.pdf(x),
       'r-', lw=5, alpha=0.6, label='halfcauchy pdf')
plt.show()

Scipy Stats Binom

The scipy.stats.binom object represents a binomial discrete random variable. It provides the usual functions of the binomial distribution, such as the CDF, PMF, and median.

It has one optional parameter, loc, for shifting the distribution.

The syntax is given below.

scipy.stats.binom.method_name(k,n,p,loc)

Where parameters are:

  • k(int): It is used to define the number of successes.
  • n(int): It is used to specify the number of trials.
  • p(float): It is used to specify the assumed probability of success.
  • loc: It is used to shift the distribution, by default it is 0.

The above parameters are the common parameter of all the methods in the object scipy.stats.binom(). The methods are given below.

  • scipy.stats.binom.cdf(): It is used for the cumulative distribution function.
  • scipy.stats.binom.rvs(): To get the random variates.
  • scipy.stats.binom.stats(): It is used to get the mean, variance, skew, and kurtosis.
  • scipy.stats.binom.logpmf(): It is used to get the log of the probability mass function.
  • scipy.stats.binom.logcdf(): It is used to find the log of the cumulative distribution function.
  • scipy.stats.binom.sf(): It is used to get the values of the survival function.
  • scipy.stats.binom.isf(): It is used to get the values of the inverse survival function.
  • scipy.stats.binom.logsf(): It is used to find the log of the survival function.
  • scipy.stats.binom.mean(): It is used to find the mean of the distribution.
  • scipy.stats.binom.median(): It is used to find the median of the distribution.
  • scipy.stats.binom.var(): It is used to find the variance of the distribution.
  • scipy.stats.binom.std(): It is used to find the standard deviation of the distribution.
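
As a quick illustration of a few of these methods, here is a minimal sketch. The values n = 4 and p = 0.3 are only illustrative, and the variable names are chosen for this sketch.

```python
from scipy.stats import binom

n, p = 4, 0.3  # illustrative number of trials and success probability

# P(X <= 2) and its complement P(X > 2)
prob_at_most_2 = binom.cdf(2, n, p)
prob_more_than_2 = binom.sf(2, n, p)

# Mean and variance of the distribution itself (no data involved)
mean, var = binom.stats(n, p, moments='mv')

print(prob_at_most_2, prob_more_than_2)
print(mean, var, binom.std(n, p), binom.median(n, p))
```

The survival function sf() is 1 - cdf(), so the two probabilities printed on the first line add up to 1.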

Let’s take an example by using one of the methods mentioned above to know how to use the methods with parameters.

Import the required libraries using the below code.

from scipy.stats import binom
import matplotlib.pyplot as plt
import numpy as np

Define the values of the parameters n and p using the below code.

p, n = 0.3, 4

Create an array of data using the method ppf() (percent point function) of object binom .

array_data = np.arange(binom.ppf(0.02, n, p),
              binom.ppf(0.98, n, p))
array_data

Show the probability mass function using the below code.

fig, ax = plt.subplots(1, 1)
ax.plot(array_data, binom.pmf(array_data, n, p), 'bo', ms=7, label='binom pmf')
ax.vlines(array_data, 0, binom.pmf(array_data, n, p), colors='b', lw=6, alpha=0.5)

Scipy Stats Describe

Scipy has a method describe() in the module scipy.stats to find the descriptive statistics of the given data.

The syntax is given below.

scipy.stats.describe(a, axis=0, ddof=1, bias=True, nan_policy='propagate')

Where parameters are:

  • a(array_data): It is the data of type array.
  • axis(int): It is used to specify the axis along which the statistics are calculated, by default it is 0. If it is None, the statistics are calculated over the whole array.
  • ddof(int): It is used to specify the delta degrees of freedom, which is used only for the variance.
  • bias(Boolean): If it is False, the skewness and kurtosis are corrected for statistical bias.
  • nan_policy: It is used to deal with the nan values and accepts three values:
  1. omit: It means calculating the statistics by ignoring the nan values.
  2. propagate: It means returns nan values.
  3. raise: It means to throw an error for the nan values.

The method describe() returns the number of observations, the minimum and maximum, the mean, the variance, the skewness and the kurtosis, each as an ndarray or float.

Let’s take an example by following the below steps:

Import the required libraries using the below code.

from scipy import stats
import numpy as np

Create an array containing 20 observations or values using the below code.

array_data = np.arange(20)

Pass the above-created array to a method describe() for finding the descriptive statistics using the below code.

result = stats.describe(array_data)
result

Let’s view each statistic of the array using the below code.

print('Number of observation in array',result[0])
print('Minimum and maximum values in a array',result[1])
print('Mean of the array',result[2])
print('Variance of the array',result[3])
print('Skewness of the array',result[4])
print('Kurtosis of the array',result[5])
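
The result can also be read by field name instead of by index. The sketch below assumes the same 20-value array and adds a small 2-D array (a made-up example) to show the effect of the axis parameter.

```python
import numpy as np
from scipy import stats

array_data = np.arange(20)
result = stats.describe(array_data)

# The result is a named tuple, so fields can be read by name
print(result.nobs, result.minmax)
print(result.mean, result.variance)

# On a 2-D array the default axis=0 gives one value per column
table = np.arange(20).reshape(4, 5)
per_column = stats.describe(table)
overall = stats.describe(table, axis=None)  # treat the table as one flat array
print(per_column.mean, overall.mean)
```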

Scipy Stats Binomial test

The Binomial test checks whether the observed number of successes in a series of trials with only two possible outcomes is consistent with an assumed probability of success. It is a test of the null hypothesis that the probability of success in a Bernoulli experiment is p.

Scipy has a method binomtest() within the module scipy.stats to perform the Binomial test.

The syntax is given below.

scipy.stats.binomtest(k, n, p=0.5, alternative='two-sided')

Where parameters are:

  • k(int): It is used to define the no of successes.
  • n(int): It is used to specify the no of trials.
  • p(float): It is used to specify the assumed probability of success.
  • alternative: It is used to specify the alternative hypothesis.

The method binomtest() returns a result object that holds the p-value in pvalue and the estimated proportion of successes in proportion_estimate as floats, along with a method proportion_ci() to get the confidence interval of the estimate.

Let’s understand through an example by following the below steps.

Import the method binomtest() from the module scipy.stats using the below code.

from scipy.stats import binomtest

Now, a phone manufacturer claims that no more than 10% of their phones are unsafe. 20 phones are inspected for safety, and 6 are found to be unsafe. Test the manufacturer’s claim.

Test_result = binomtest(6, n=20, p=0.1, alternative='greater')

View the result using the below code.

print('The p-value is ',Test_result.pvalue)
print('The estimated proportion is 6/20 ',Test_result.proportion_estimate)
print('The confidence interval of the estimate ',Test_result.proportion_ci(confidence_level=0.95))
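
To see where the p-value comes from, the sketch below compares it with the upper tail of the binomial distribution. It assumes the same numbers as the call above (6 unsafe phones out of 20, tested against p=0.1); the variable names are chosen for this sketch.

```python
from scipy.stats import binom, binomtest

k, n, p = 6, 20, 0.1
result = binomtest(k, n=n, p=p, alternative='greater')

# Probability of seeing k or more successes if the null hypothesis is true
tail_probability = binom.sf(k - 1, n, p)

print(result.pvalue, tail_probability)
```

With alternative='greater' the two numbers agree, and a small p-value counts as evidence against the manufacturer's claim.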

Scipy Stats Binom pmf

In Scipy, the method binom.pmf() in the module scipy.stats computes the probability mass function of the binomial distribution.

The syntax is given below.

scipy.stats.binom.pmf(k,n, p,loc=0)

Where parameters are:

  • k(int): It is used to define the no of successes.
  • n(int): It is used to specify the no of trials.
  • p(float): It is used to specify the assumed probability of success.
  • loc: It is used to shift the distribution, by default it is 0.

To understand with an example, please refer to the above sub-section Scipy Stats Binom, where the method pmf(), which stands for probability mass function, is used in the example.
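
As a small additional check, the sketch below compares binom.pmf() with the closed-form binomial formula. The values n = 4 and p = 0.3 are illustrative.

```python
from math import comb
from scipy.stats import binom

n, p = 4, 0.3  # illustrative number of trials and success probability

for k in range(n + 1):
    # Closed form: C(n, k) * p^k * (1 - p)^(n - k)
    closed_form = comb(n, k) * p**k * (1 - p)**(n - k)
    print(k, binom.pmf(k, n, p), closed_form)

# The probabilities over all possible k add up to 1
total = sum(binom.pmf(k, n, p) for k in range(n + 1))
print(total)
```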

Scipy Stats gmean

The method gmean() of the module scipy.stats.mstats of Scipy finds the geometric mean of the given array along the specified axis.

The syntax is given below.

scipy.stats.mstats.gmean(a, axis=0, dtype=None, weights=None)

Where parameters are:

  • a(array_data): It is the collection of elements within an array or array data.
  • axis(int): It is used to specify the axis of the array on which we want to find the geometric mean.
  • dtype: It is used to specify the data type of the returned array.
  • weights(array_data): It is used to specify the weight of the values, by default every value in the array has a weight of 1.0.

The method gmean() returns the geometric mean of the passed array as an ndarray or a single value.

Let’s understand through an example by following the below steps.

Import the required libraries using the below code.

from scipy.stats.mstats import gmean

Find the geometric mean of the array [2,4,6,8] using the below code.

gmean([2,4,6,8])
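
To make the definition concrete, the sketch below computes the same geometric mean in three ways. It imports gmean from scipy.stats, which provides the same function for plain arrays; the variable names are chosen for this sketch.

```python
import numpy as np
from scipy.stats import gmean

values = np.array([2, 4, 6, 8])

via_scipy = gmean(values)
via_logs = np.exp(np.mean(np.log(values)))        # exp of the mean of the logs
via_root = np.prod(values) ** (1 / len(values))   # n-th root of the product

print(via_scipy, via_logs, via_root)
```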

Scipy Stats Alpha

The scipy.stats.alpha object represents the alpha distribution, a continuous random variable. It provides the usual distribution functions such as the CDF, PDF, median, etc.

It has two important parameters, loc for shifting the distribution and scale for scaling it, while the shape parameter a controls its shape.

The syntax is given below.

scipy.stats.alpha.method_name(q,x,a,loc,size,moments,scale)

Where parameters are:

  • x: It is used to define the quantiles.
  • a: It is used to define the shape parameter.
  • q: It is used to specify the lower or upper tail probability.
  • loc: It is used to shift the distribution, by default it is 0.
  • moments: It is used to select which statistics to compute: mean ('m'), variance ('v'), skew ('s') and kurtosis ('k').
  • scale: It is used to scale the distribution, by default it is 1.

The above parameters are common to the methods of the object scipy.stats.alpha. The methods are given below.

  • scipy.stats.alpha.cdf(): It is used for the cumulative distribution function.
  • scipy.stats.alpha.pdf(): It is used for the probability density function.
  • scipy.stats.alpha.rvs(): To get the random variates.
  • scipy.stats.alpha.stats(): It is used to get the mean, variance, skew, and kurtosis.
  • scipy.stats.alpha.logpdf(): It is used to get the log of the probability density function.
  • scipy.stats.alpha.logcdf(): It is used to find the log of the cumulative distribution function.
  • scipy.stats.alpha.sf(): It is used to get the values of the survival function.
  • scipy.stats.alpha.isf(): It is used to get the values of the inverse survival function.
  • scipy.stats.alpha.logsf(): It is used to find the log of the survival function.
  • scipy.stats.alpha.mean(): It is used to find the mean of the distribution.
  • scipy.stats.alpha.median(): It is used to find the median of the distribution.
  • scipy.stats.alpha.var(): It is used to find the variance of the distribution.
  • scipy.stats.alpha.std(): It is used to find the standard deviation of the distribution.

Let’s take an example by using one of the methods mentioned above to know how to use the methods with parameters.

Import the required libraries using the below code.

from scipy.stats import alpha
import matplotlib.pyplot as plt
import numpy as np

Create a variable for the shape parameter and assign it a value.

a = 4.3

Create an array of data using the method ppf() of an object alpha using the below code.

array_data = np.linspace(alpha.ppf(0.01, a),
                alpha.ppf(0.90, a), 90)
array_data

Now plot the probability density function by accessing the method pdf() of object alpha of module scipy.stats using the below code.

fig, ax = plt.subplots(1, 1)
ax.plot(array_data, alpha.pdf(array_data, a),
       'r-', lw=4, alpha=0.5, label='alpha pdf')

Scipy Stats Beta

The scipy.stats.beta object represents the beta distribution, a continuous random variable. It provides the usual distribution functions such as the CDF, PDF, median, etc.

It has two important parameters, loc for shifting the distribution and scale for scaling it, while the shape parameters a and b control its shape.

The syntax is given below.

scipy.stats.beta.method_name(q,x,a,b,loc,size,moments,scale)

Where parameters are:

  • x: It is used to define the quantiles.
  • a,b: They are used to define the shape parameters.
  • q: It is used to specify the lower or upper tail probability.
  • loc: It is used to shift the distribution, by default it is 0.
  • moments: It is used to select which statistics to compute: mean ('m'), variance ('v'), skew ('s') and kurtosis ('k').
  • scale: It is used to scale the distribution, by default it is 1.

The above parameters are common to the methods of the object scipy.stats.beta. The methods are given below.

  • scipy.stats.beta.cdf(): It is used for the cumulative distribution function.
  • scipy.stats.beta.pdf(): It is used for the probability density function.
  • scipy.stats.beta.rvs(): To get the random variates.
  • scipy.stats.beta.stats(): It is used to get the mean, variance, skew, and kurtosis.
  • scipy.stats.beta.logpdf(): It is used to get the log of the probability density function.
  • scipy.stats.beta.logcdf(): It is used to find the log of the cumulative distribution function.
  • scipy.stats.beta.sf(): It is used to get the values of the survival function.
  • scipy.stats.beta.isf(): It is used to get the values of the inverse survival function.
  • scipy.stats.beta.logsf(): It is used to find the log of the survival function.
  • scipy.stats.beta.mean(): It is used to find the mean of the distribution.
  • scipy.stats.beta.median(): It is used to find the median of the distribution.
  • scipy.stats.beta.var(): It is used to find the variance of the distribution.
  • scipy.stats.beta.std(): It is used to find the standard deviation of the distribution.

Let’s take an example by using one of the methods mentioned above to know how to use the methods with parameters.

Import the required libraries using the below code.

from scipy.stats import beta
import matplotlib.pyplot as plt
import numpy as np

Create two variables a and b for the shape parameters and assign them values.

a = 3.4
b = 0.763

Create an array of data using the method ppf() of an object beta using the below code.

array_data = np.linspace(beta.ppf(0.01, a, b),
                beta.ppf(0.90, a, b), 90)
array_data

Now plot the probability density function by accessing the method pdf() of an object beta of the module scipy.stats using the below code.

fig, ax = plt.subplots(1, 1)
ax.plot(array_data, beta.pdf(array_data, a, b),
       'r-', lw=4, alpha=0.5, label='beta pdf')
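
The effect of loc and scale can be seen in a short sketch. The shape parameters are the ones used in this section, while loc=10 and scale=5 are hypothetical values picked only for illustration.

```python
from scipy.stats import beta

a, b = 3.4, 0.763  # shape parameters used in this section

# The standard beta distribution lives on [0, 1]; its mean is a / (a + b)
standard_mean = beta.mean(a, b)

# loc and scale move the support to [loc, loc + scale], here [10, 15]
loc, scale = 10, 5  # hypothetical values
shifted = beta(a, b, loc=loc, scale=scale)  # "frozen" distribution
shifted_mean = shifted.mean()

print(standard_mean, a / (a + b))
print(shifted_mean, loc + scale * standard_mean)
print(shifted.support())
```

Here loc and scale shift and stretch the distribution; they are not its mean and standard deviation.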

Scipy Stats Gamma

The scipy.stats.gamma object represents the gamma distribution, a continuous random variable. It provides the usual distribution functions such as the CDF, PDF, median, etc.

It has two important parameters, loc for shifting the distribution and scale for scaling it, while the shape parameter a controls its shape.

The syntax is given below.

scipy.stats.gamma.method_name(q,x,a,loc,size,moments,scale)

Where parameters are:

  • x: It is used to define the quantiles.
  • a: It is used to define the shape parameter.
  • q: It is used to specify the lower or upper tail probability.
  • loc: It is used to shift the distribution, by default it is 0.
  • moments: It is used to select which statistics to compute: mean ('m'), variance ('v'), skew ('s') and kurtosis ('k').
  • scale: It is used to scale the distribution, by default it is 1.

The above parameters are common to the methods of the object scipy.stats.gamma. The methods are given below.

  • scipy.stats.gamma.cdf(): It is used for the cumulative distribution function.
  • scipy.stats.gamma.pdf(): It is used for the probability density function.
  • scipy.stats.gamma.rvs(): To get the random variates.
  • scipy.stats.gamma.stats(): It is used to get the mean, variance, skew, and kurtosis.
  • scipy.stats.gamma.logpdf(): It is used to get the log of the probability density function.
  • scipy.stats.gamma.logcdf(): It is used to find the log of the cumulative distribution function.
  • scipy.stats.gamma.sf(): It is used to get the values of the survival function.
  • scipy.stats.gamma.isf(): It is used to get the values of the inverse survival function.
  • scipy.stats.gamma.logsf(): It is used to find the log of the survival function.
  • scipy.stats.gamma.mean(): It is used to find the mean of the distribution.
  • scipy.stats.gamma.median(): It is used to find the median of the distribution.
  • scipy.stats.gamma.var(): It is used to find the variance of the distribution.
  • scipy.stats.gamma.std(): It is used to find the standard deviation of the distribution.

Let’s take an example by using one of the methods mentioned above to know how to use the methods with parameters.

Import the required libraries using the below code.

from scipy.stats import gamma
import matplotlib.pyplot as plt
import numpy as np

Create a variable for the shape parameter and assign it a value.

a = 1.95

Create an array of data using the method ppf() of an object gamma using the below code.

array_data = np.linspace(gamma.ppf(0.01, a),
                gamma.ppf(0.90, a), 90)
array_data

Now plot the probability density function by accessing the method pdf() of an object gamma of the module scipy.stats using the below code.

fig, ax = plt.subplots(1, 1)
ax.plot(array_data, gamma.pdf(array_data, a),
       'r-', lw=4, alpha=0.5, label='gamma pdf')
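
The sketch below shows how the shape and scale parameters determine the mean and variance of the gamma distribution. The shape value is the one used in this section, and scale = 2.0 is a hypothetical value picked only for illustration.

```python
from scipy.stats import gamma

a = 1.95      # shape parameter used in this section
scale = 2.0   # hypothetical scale

mean, var = gamma.stats(a, scale=scale, moments='mv')

print(mean, a * scale)        # the mean is shape * scale
print(var, a * scale ** 2)    # the variance is shape * scale**2
print(gamma.std(a, scale=scale))
```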

Scipy Stats Inverse Normal CDF

Here, we will learn about the inverse of the normal cumulative distribution function. The normal distribution was covered in the sub-section ‘Scipy Stats Norm’, so here we will use the method ppf() of the object scipy.stats.norm of Scipy, which represents the inverse of the CDF.

scipy.stats.norm.ppf(q,loc,scale)

Where parameters are:

  • q: It is used to specify the lower tail probability.
  • loc: It is used to specify the mean, by default it is 0.
  • scale: It is used to specify the standard deviation, by default it is 1.

Let’s take an example by following the below steps.

Import the library stats using the below code.

from scipy import stats

Find the inverse of the CDF using the below code. Applying cdf() to the result of ppf() returns the original probability 0.7.

stats.norm.cdf(stats.norm.ppf(0.7))
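
A slightly fuller sketch of the inverse CDF is given below. The values loc=100 and scale=15 are hypothetical and used only for illustration.

```python
from scipy.stats import norm

# Quantile below which 97.5% of a standard normal lies (about 1.96)
z = norm.ppf(0.975)

# ppf() undoes cdf(): the round trip returns the original probability
round_trip = norm.cdf(norm.ppf(0.7))

# With loc and scale the quantile is loc + scale * z
q = norm.ppf(0.975, loc=100, scale=15)

print(z, round_trip, q)
```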

Scipy Stats Johnson

The scipy.stats contains two objects johnsonsb and johnsonsu that belong to the family of Johnson distributions. They provide the usual distribution functions such as the CDF, PDF, median, etc.

  • The object johnsonsb represents the bounded continuous probability distribution whereas johnsonsu is the unbounded continuous probability distribution.

They have two important parameters, loc for shifting the distribution and scale for scaling it, while the shape parameters a and b control the shape.

The syntax is given below.

scipy.stats.johnsonsb.method_name(q,x,a,b,loc,size,moments,scale)

Where parameters are:

  • x: It is used to define the quantiles.
  • a,b: They are used to define the shape parameters.
  • q: It is used to specify the lower or upper tail probability.
  • loc: It is used to shift the distribution, by default it is 0.
  • moments: It is used to select which statistics to compute: mean ('m'), variance ('v'), skew ('s') and kurtosis ('k').
  • scale: It is used to scale the distribution, by default it is 1.

Let’s take an example by using one of the methods mentioned above to know how to use the methods with parameters.

Import the required libraries using the below code.

from scipy.stats import johnsonsb
import matplotlib.pyplot as plt
import numpy as np

Create two variables a and b for the shape parameters and assign them values.

a, b = 3.35, 2.25

Create an array of data using the method ppf() of an object johnsonsb using the below code.

array_data = np.linspace(johnsonsb.ppf(0.01, a, b),
                johnsonsb.ppf(0.90, a, b), 90)
array_data

Now plot the probability density function by accessing the method pdf() of an object johnsonsb of the module scipy.stats using the below code.

fig, ax = plt.subplots(1, 1)
ax.plot(array_data, johnsonsb.pdf(array_data, a, b),
       'r-', lw=4, alpha=0.5, label='johnsonsb pdf')

The same process works for Johnson’s unbounded continuous probability distribution by using johnsonsu in place of johnsonsb.
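
A minimal sketch for the unbounded case is given below. It assumes the same illustrative shape parameters as the bounded example and only computes the density values rather than plotting them.

```python
import numpy as np
from scipy.stats import johnsonsu

a, b = 3.35, 2.25  # same illustrative shape parameters as the bounded example

array_data = np.linspace(johnsonsu.ppf(0.01, a, b),
                         johnsonsu.ppf(0.90, a, b), 90)
density = johnsonsu.pdf(array_data, a, b)

# Unlike johnsonsb, the support is the whole real line
print(johnsonsu.support(a, b))
print(array_data[:3], density[:3])
```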

Scipy Stats Inverse gamma

The scipy.stats.invgamma object represents the inverted gamma distribution, a continuous random variable. It provides the usual distribution functions such as the CDF, PDF, median, etc.

It has two important parameters, loc for shifting the distribution and scale for scaling it, while the shape parameter a controls its shape.

The syntax is given below.

scipy.stats.invgamma.method_name(q,x,a,loc,size,moments,scale)

Where parameters are:

  • x: It is used to define the quantiles.
  • a: It is used to define the shape parameter.
  • q: It is used to specify the lower or upper tail probability.
  • loc: It is used to shift the distribution, by default it is 0.
  • moments: It is used to select which statistics to compute: mean ('m'), variance ('v'), skew ('s') and kurtosis ('k').
  • scale: It is used to scale the distribution, by default it is 1.

The above parameters are common to the methods of the object scipy.stats.invgamma. The methods are given below.

  • scipy.stats.invgamma.cdf(): It is used for the cumulative distribution function.
  • scipy.stats.invgamma.pdf(): It is used for the probability density function.
  • scipy.stats.invgamma.rvs(): To get the random variates.
  • scipy.stats.invgamma.stats(): It is used to get the mean, variance, skew, and kurtosis.
  • scipy.stats.invgamma.logpdf(): It is used to get the log of the probability density function.
  • scipy.stats.invgamma.logcdf(): It is used to find the log of the cumulative distribution function.
  • scipy.stats.invgamma.sf(): It is used to get the values of the survival function.
  • scipy.stats.invgamma.isf(): It is used to get the values of the inverse survival function.
  • scipy.stats.invgamma.logsf(): It is used to find the log of the survival function.
  • scipy.stats.invgamma.mean(): It is used to find the mean of the distribution.
  • scipy.stats.invgamma.median(): It is used to find the median of the distribution.
  • scipy.stats.invgamma.var(): It is used to find the variance of the distribution.
  • scipy.stats.invgamma.std(): It is used to find the standard deviation of the distribution.

Let’s take an example by using one of the methods mentioned above to know how to use the methods with parameters.

Import the required libraries using the below code.

from scipy.stats import invgamma
import matplotlib.pyplot as plt
import numpy as np

Create a variable for the shape parameter and assign it a value.

a = 3.04

Create an array of data using the method ppf() of an object invgamma using the below code.

array_data = np.linspace(invgamma.ppf(0.01, a),
                invgamma.ppf(0.90, a), 90)
array_data

Now plot the probability density function by accessing the method pdf() of an object invgamma of the module scipy.stats using the below code.

fig, ax = plt.subplots(1, 1)
ax.plot(array_data, invgamma.pdf(array_data, a),
       'r-', lw=4, alpha=0.5, label='invgamma pdf')
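
The name "inverted gamma" refers to the reciprocal of a gamma random variable. The sketch below checks this numerically on an arbitrary grid of x values chosen for illustration.

```python
import numpy as np
from scipy.stats import gamma, invgamma

a = 3.04  # shape parameter used in this section
x = np.linspace(0.1, 3.0, 30)  # arbitrary positive points

# If Y follows gamma(a), then 1/Y follows invgamma(a),
# so P(1/Y <= x) equals P(Y >= 1/x)
lhs = invgamma.cdf(x, a)
rhs = gamma.sf(1 / x, a)

print(np.allclose(lhs, rhs))
```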

Scipy Stats Gennorm

The scipy.stats.gennorm object represents the generalized normal distribution, a continuous random variable. It provides the usual distribution functions such as the CDF, PDF, median, etc.

It has two important parameters, loc for shifting the distribution and scale for scaling it, while the shape parameter beta controls its shape.

The syntax is given below.

scipy.stats.gennorm.method_name(x,beta,loc,size,moments,scale)

Where parameters are:

  • x: It is a set of points or values that represent evenly sampled data in the form of array data.
  • beta: It is used to specify the shape.
  • loc: It is used to shift the distribution, by default it is 0.
  • moments: It is used to select which statistics to compute: mean ('m'), variance ('v'), skew ('s') and kurtosis ('k').
  • scale: It is used to scale the distribution, by default it is 1.

The above parameters are common to the methods of the object scipy.stats.gennorm. The methods are given below.

  • scipy.stats.gennorm.cdf(): It is used for the cumulative distribution function.
  • scipy.stats.gennorm.pdf(): It is used for the probability density function.
  • scipy.stats.gennorm.rvs(): To get the random variates.
  • scipy.stats.gennorm.stats(): It is used to get the mean, variance, skew, and kurtosis.
  • scipy.stats.gennorm.logpdf(): It is used to get the log of the probability density function.
  • scipy.stats.gennorm.logcdf(): It is used to find the log of the cumulative distribution function.
  • scipy.stats.gennorm.sf(): It is used to get the values of the survival function.
  • scipy.stats.gennorm.isf(): It is used to get the values of the inverse survival function.
  • scipy.stats.gennorm.logsf(): It is used to find the log of the survival function.
  • scipy.stats.gennorm.mean(): It is used to find the mean of the distribution.
  • scipy.stats.gennorm.median(): It is used to find the median of the distribution.
  • scipy.stats.gennorm.var(): It is used to find the variance of the distribution.
  • scipy.stats.gennorm.std(): It is used to find the standard deviation of the distribution.

Let’s take an example by using one of the methods mentioned above to know how to use the methods with parameters.

Import the required libraries using the below code.

from scipy.stats import gennorm
import matplotlib.pyplot as plt
import numpy as np

Create a variable for the shape parameter and assign it a value.

beta = 1.4

Create an array of data using the method ppf() of an object gennorm using the below code.

array_data = np.linspace(gennorm.ppf(0.01, beta),
                gennorm.ppf(0.90, beta), 90)
array_data

Now plot the probability density function by accessing the method pdf() of an object gennorm of the module scipy.stats using the below code.

fig, ax = plt.subplots(1, 1)
ax.plot(array_data, gennorm.pdf(array_data, beta),
       'r-', lw=4, alpha=0.5, label='gennorm pdf')
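
The shape parameter beta decides which familiar distribution gennorm resembles. The sketch below checks two special cases numerically; the grid of x values is arbitrary.

```python
import numpy as np
from scipy.stats import gennorm, laplace, norm

x = np.linspace(-3, 3, 61)  # arbitrary points

# beta = 1 gives the Laplace distribution
is_laplace = np.allclose(gennorm.pdf(x, 1), laplace.pdf(x))

# beta = 2 gives a normal distribution with standard deviation 1/sqrt(2)
is_normal = np.allclose(gennorm.pdf(x, 2),
                        norm.pdf(x, scale=1 / np.sqrt(2)))

print(is_laplace, is_normal)
```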

Scipy Stats Genpareto

The scipy.stats.genpareto object represents the generalized Pareto distribution, a continuous random variable. It provides the usual distribution functions such as the CDF, PDF, median, etc.

It has two important parameters, loc for shifting the distribution and scale for scaling it, while the shape parameter c controls its shape.

The syntax is given below.

scipy.stats.genpareto.method_name(x,c,loc,size,moments,scale)

Where parameters are:

  • x: It is a set of points or values that represent evenly sampled data in the form of array data.
  • c: It is used to specify the shape.
  • loc: It is used to shift the distribution, by default it is 0.
  • moments: It is used to select which statistics to compute: mean ('m'), variance ('v'), skew ('s') and kurtosis ('k').
  • scale: It is used to scale the distribution, by default it is 1.

The above parameters are common to the methods of the object scipy.stats.genpareto. The methods are given below.

  • scipy.stats.genpareto.cdf(): It is used for the cumulative distribution function.
  • scipy.stats.genpareto.pdf(): It is used for the probability density function.
  • scipy.stats.genpareto.rvs(): To get the random variates.
  • scipy.stats.genpareto.stats(): It is used to get the mean, variance, skew, and kurtosis.
  • scipy.stats.genpareto.logpdf(): It is used to get the log of the probability density function.
  • scipy.stats.genpareto.logcdf(): It is used to find the log of the cumulative distribution function.
  • scipy.stats.genpareto.sf(): It is used to get the values of the survival function.
  • scipy.stats.genpareto.isf(): It is used to get the values of the inverse survival function.
  • scipy.stats.genpareto.logsf(): It is used to find the log of the survival function.
  • scipy.stats.genpareto.mean(): It is used to find the mean of the distribution.
  • scipy.stats.genpareto.median(): It is used to find the median of the distribution.
  • scipy.stats.genpareto.var(): It is used to find the variance of the distribution.
  • scipy.stats.genpareto.std(): It is used to find the standard deviation of the distribution.

Let’s take an example by using one of the methods mentioned above to know how to use the methods with parameters.

Import the required libraries using the below code.

from scipy.stats import genpareto
import matplotlib.pyplot as plt
import numpy as np

Create a variable for the shape parameter and assign it a value.

c = 0.2

Create an array of data using the method ppf() of an object genpareto using the below code.

array_data = np.linspace(genpareto.ppf(0.01, c),
                genpareto.ppf(0.90, c), 90)
array_data

Now plot the probability density function by accessing the method pdf() of an object genpareto of the module scipy.stats using the below code.

fig, ax = plt.subplots(1, 1)
ax.plot(array_data, genpareto.pdf(array_data, c),
       'r-', lw=4, alpha=0.5, label='genpareto pdf')
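
Two properties of the shape parameter c can be checked with a short sketch. The grid of x values is arbitrary, and c = 0.2 is the value used in this section.

```python
import numpy as np
from scipy.stats import expon, genpareto

x = np.linspace(0, 5, 51)  # arbitrary non-negative points

# c = 0 reduces the generalized Pareto to the exponential distribution
same_as_expon = np.allclose(genpareto.pdf(x, 0), expon.pdf(x))

# For c < 1 (with loc=0 and scale=1) the mean is 1 / (1 - c)
c = 0.2
print(same_as_expon, genpareto.mean(c), 1 / (1 - c))
```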

Scipy Stats Gumbel

The scipy.stats contains two objects gumbel_r and gumbel_l that are used to model right-skewed and left-skewed distributions. They provide the usual distribution functions such as the CDF, PDF, median, etc.

  • The object gumbel_r represents the right-skewed Gumbel continuous distribution whereas gumbel_l is the left-skewed Gumbel continuous distribution.

They have two important parameters, loc for shifting the distribution and scale for scaling it. The Gumbel distributions have no shape parameters.

The syntax is given below.

scipy.stats.gumbel_r.method_name(q,x,loc,size,moments,scale)

Where parameters are:

  • x: It is used to define the quantiles.
  • q: It is used to specify the lower or upper tail probability.
  • loc: It is used to shift the distribution, by default it is 0.
  • moments: It is used to select which statistics to compute: mean ('m'), variance ('v'), skew ('s') and kurtosis ('k').
  • scale: It is used to scale the distribution, by default it is 1.

Let’s take an example by using one of the methods mentioned above to know how to use the methods with parameters.

Import the required libraries using the below code.

from scipy.stats import gumbel_r
import matplotlib.pyplot as plt
import numpy as np

Create an array of data using the method ppf() of an object gumbel_r using the below code. No shape parameters are passed because the Gumbel distribution has none.

array_data = np.linspace(gumbel_r.ppf(0.01),
                gumbel_r.ppf(0.90), 90)
array_data

Now plot the probability density function by accessing the method pdf() of an object gumbel_r of the module scipy.stats using the below code.

fig, ax = plt.subplots(1, 1)
ax.plot(array_data, gumbel_r.pdf(array_data),
       'r-', lw=4, alpha=0.5, label='gumbel pdf')
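
The relation between the two Gumbel objects can be checked with a short sketch. The grid of x values and the values loc=2.0 and scale=3.0 are hypothetical and used only for illustration.

```python
import numpy as np
from scipy.stats import gumbel_l, gumbel_r

x = np.linspace(-4, 4, 81)  # arbitrary points

# gumbel_l is the mirror image of gumbel_r about zero
mirrored = np.allclose(gumbel_l.pdf(x), gumbel_r.pdf(-x))

# loc and scale are the only parameters; for gumbel_r the mean is
# loc + scale * (Euler-Mascheroni constant)
loc, scale = 2.0, 3.0  # hypothetical values
print(mirrored)
print(gumbel_r.mean(loc=loc, scale=scale), loc + scale * np.euler_gamma)
```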

Scipy Stats Binned statistics

The Scipy submodule scipy.stats contains a method binned_statistic() to calculate statistics like the mean, median, sum, etc. of the values within each bin.

The syntax is given below.

scipy.stats.binned_statistic(x, values, statistic='mean', bins=10, range=None)

Where parameters are:

  • x(array_data): It is a sequence of values that is binned.
  • values(array_data, list(N)): It is the data on which the statistic is calculated. It must have the same length as x.
  • statistic(string): It is used to specify what kind of statistics we want to compute like mean, sum, median, max, std, and count.
  • bins(sequence or int): It is used to define the number of bins, or the bin edges if a sequence is given.
  • range((float, float)): It defines the lower and upper range of the bins.

The method binned_statistic() returns the statistic of each bin, the bin edges and the bin number of each value of x, all of array type.

Let’s understand with an example by following the below steps:

Import the required libraries using the below code.

from scipy import stats

Create a set of values and compute the binned statistics using the below code.

set_values = [2.0, 2.0, 3.0, 2.5, 4.0]
stats.binned_statistic([2, 2, 3, 6, 8], set_values, 'mean', bins=2)
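
The returned fields can be read by name, as the sketch below shows for the same call. The variable names x and result are chosen for this sketch.

```python
from scipy import stats

x = [2, 2, 3, 6, 8]
set_values = [2.0, 2.0, 3.0, 2.5, 4.0]

result = stats.binned_statistic(x, set_values, 'mean', bins=2)

print(result.statistic)   # mean of set_values in each bin
print(result.bin_edges)   # bins=2 splits the range [2, 8] at 5
print(result.binnumber)   # 1-based bin index of each value of x
```

The first three values of x fall in the first bin and the last two in the second bin, so the means are taken over (2.0, 2.0, 3.0) and (2.5, 4.0).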

Scipy Stats Poisson

The scipy.stats.poisson object represents the Poisson distribution, a discrete random variable. It provides the usual distribution functions such as the PMF, CDF, median, etc.

It has one optional parameter loc for shifting the distribution.

The syntax is given below.

scipy.stats.poisson.method_name(mu,k,loc,moments)

Where parameters are:

  • mu: It is used to define the shape parameter, which is the expected number of events.
  • k: It is the data.
  • loc: It is used to shift the distribution, by default it is 0.
  • moments: It is used to select which statistics to compute: mean ('m'), variance ('v'), skew ('s') and kurtosis ('k').

The above parameters are common to the methods of the object scipy.stats.poisson. The methods are given below.

  • scipy.stats.poisson.CDF(): It is used for the cumulative distribution function.
  • scipy.stats.poisson.rvs(): To get the random variates.
  • scipy.stats.poisson.stats(): It is used to get the standard deviation, mean, kurtosis, and skew.
  • scipy.stats.poisson.logPDF(): It is used to get the log related to the probability density function.
  • scipy.stats.poisson.logCDF(): It is used to find the log related to the cumulative distribution function.
  • scipy.stats.poisson.sf(): It is used to get the values of the survival function.
  • scipy.stats.poisson.isf(): It is used to get the values of the inverse survival function.
  • scipy.stats.poisson.logsf(): It is used to find the log related to the survival function.
  • scipy.stats.poisson.mean(): It is used to find the mean related to the normal distribution.
  • scipy.stats.poisson.medain(): It is used to find the median related to the normal distribution.
  • scipy.stats.poisson.var(): It is used to find the variance related to the distribution.
  • scipy.stats.poisson.std(): It is used to find the standard deviation related to the distribution
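
The following is a minimal sketch of how a few of these methods relate to each other. The value mu = 0.5 and the variable names are assumptions made for illustration.

```python
import numpy as np
from scipy.stats import poisson

mu = 0.5  # illustrative mean of the distribution

p0 = poisson.pmf(0, mu)          # P(X = 0), which equals exp(-mu)
c1 = poisson.cdf(1, mu)          # P(X <= 1)
s1 = poisson.sf(1, mu)           # P(X > 1), the survival function is 1 - cdf
mean, var = poisson.stats(mu, moments='mv')  # mean and variance both equal mu

# loc shifts the distribution: P(X = 3) with loc=3 equals P(X = 0) without a shift
shifted = poisson.pmf(3, mu, loc=3)

print(p0, np.exp(-mu))
print(c1 + s1)
print(mean, var)
print(shifted)
```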

Let’s take an example by using one of the methods mentioned above to know how to use the methods with parameters.

Import the required libraries using the below code.

from scipy.stats import poisson
import matplotlib.pyplot as plt
import numpy as np

Create a variable for the shape parameter and assign it a value.

mu = 0.5

Create an array of integer values between two quantiles obtained with the method ppf() of the object poisson using the below code. The values must be integers because the Poisson PMF is zero everywhere else.

array_data = np.arange(poisson.ppf(0.01, mu),
                poisson.ppf(0.90, mu) + 1)
array_data
Scipy Stats Poisson example
Scipy Stats Poisson example

Now plot the probability mass function by accessing the method pmf() of the object poisson of the module scipy.stats using the below code.

fig, ax = plt.subplots(1, 1)
ax.plot(array_data, poisson.pmf(array_data, mu), 'bo',ms=8,label='poisson pmf')
ax.vlines(array_data, 0,poisson.pmf(array_data, mu),colors='b', lw=4, alpha=0.5,)
Scipy Stats Poisson

This is how to use the Poisson distribution of Scipy.

Scipy Stats Geometric

The scipy.stats.geom represents a discrete random variable. It provides the usual functions of the geometric distribution such as the PMF, CDF, median, etc.

It takes the shape parameter p, the success probability of a single trial, and an optional parameter loc that shifts the distribution.

The syntax is given below.

scipy.stats.geom.method_name(k,p,q,loc,size)

Where parameters are:

  • k(float or float of array_data): It is used to specify the number of Bernoulli trials needed to get the first success.
  • p(float or float of array_data): It is used to specify the success probability for each trial.
  • q(float or float of array_data): It represents the probabilities, for example for the methods ppf() and isf().
  • loc: It is used to shift the distribution, by default it is 0.

The above parameters are common to the methods of the object scipy.stats.geom. The methods are given below.

  • scipy.stats.geom.cdf(): It is used for the cumulative distribution function.
  • scipy.stats.geom.pmf(): It is used for the probability mass function.
  • scipy.stats.geom.rvs(): To get the random variates.
  • scipy.stats.geom.stats(): It is used to get the mean, variance, skew, and kurtosis.
  • scipy.stats.geom.logpmf(): It is used to get the log of the probability mass function.
  • scipy.stats.geom.logcdf(): It is used to find the log of the cumulative distribution function.
  • scipy.stats.geom.sf(): It is used to get the values of the survival function.
  • scipy.stats.geom.isf(): It is used to get the values of the inverse survival function.
  • scipy.stats.geom.logsf(): It is used to find the log of the survival function.
  • scipy.stats.geom.mean(): It is used to find the mean of the distribution.
  • scipy.stats.geom.median(): It is used to find the median of the distribution.
  • scipy.stats.geom.var(): It is used to find the variance of the distribution.
  • scipy.stats.geom.std(): It is used to find the standard deviation of the distribution.
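
As a quick sketch, the values returned by some of these methods can be compared with the closed-form expressions of the geometric distribution. The values p = 0.5 and k = 3 are assumptions chosen for illustration.

```python
from scipy.stats import geom

p = 0.5   # illustrative success probability of a single trial
k = 3     # number of trials until the first success

pmf_k = geom.pmf(k, p)       # P(X = k) = (1 - p)**(k - 1) * p
cdf_k = geom.cdf(k, p)       # P(X <= k) = 1 - (1 - p)**k
mean = geom.mean(p)          # expected number of trials, 1 / p

print(pmf_k, (1 - p) ** (k - 1) * p)
print(cdf_k, 1 - (1 - p) ** k)
print(mean, 1 / p)
```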

Let’s take an example by using one of the methods mentioned above to know how to use the methods with parameters.

Import the required libraries using the below code.

from scipy.stats import geom
import numpy as np
import matplotlib.pyplot as plt

Create an array containing the integers from 1 to 29 and also create a variable that contains the success probability of each trial using the below code.

array_data = np.arange(1,30,1)
p = 0.5

Now plot the probability mass function by accessing the method pmf() of an object geom of the module scipy.stats using the below code.

geom_pmf_data = geom.pmf(array_data,p)
plt.plot(array_data,geom_pmf_data,'bo')
plt.show()
Scipy Stats Geometric

Scipy Stats Exponential

The scipy.stats.expon represents a continuous random variable. It provides the usual functions of the exponential distribution such as the CDF, PDF, median, etc.

It has two important parameters: loc, which shifts the distribution, and scale, which stretches it. For the exponential distribution, scale is the inverse of the rate, so with loc equal to 0 both the mean and the standard deviation are equal to scale.

The syntax is given below.

scipy.stats.expon.method_name(x,q,loc,scale,size)

Where parameters are:

  • x(float or float of array_data): It is used to specify the values (quantiles) at which the function is evaluated.
  • q(float or float of array_data): It represents the probabilities.
  • loc: It is used to specify the location (shift) of the distribution, by default it is 0.
  • scale: It is used to specify the scale of the distribution (the inverse of the rate), by default it is 1.
  • size: It is used to specify the output shape.

The above parameters are common to the methods of the object scipy.stats.expon. The methods are given below.

  • scipy.stats.expon.cdf(): It is used for the cumulative distribution function.
  • scipy.stats.expon.pdf(): It is used for the probability density function.
  • scipy.stats.expon.rvs(): To get the random variates.
  • scipy.stats.expon.stats(): It is used to get the mean, variance, skew, and kurtosis.
  • scipy.stats.expon.logpdf(): It is used to get the log of the probability density function.
  • scipy.stats.expon.logcdf(): It is used to find the log of the cumulative distribution function.
  • scipy.stats.expon.sf(): It is used to get the values of the survival function.
  • scipy.stats.expon.isf(): It is used to get the values of the inverse survival function.
  • scipy.stats.expon.logsf(): It is used to find the log of the survival function.
  • scipy.stats.expon.mean(): It is used to find the mean of the distribution.
  • scipy.stats.expon.median(): It is used to find the median of the distribution.
  • scipy.stats.expon.var(): It is used to find the variance of the distribution.
  • scipy.stats.expon.std(): It is used to find the standard deviation of the distribution.
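
The role of loc and scale can be illustrated with a short sketch. The rate of 0.5 and the variable names below are assumptions made for this example; scale is set to the inverse of the rate.

```python
import numpy as np
from scipy.stats import expon

rate = 0.5          # hypothetical rate parameter (lambda)
scale = 1 / rate    # scipy parameterises the distribution by scale = 1 / rate

x = np.array([-1.0, 0.0, 1.0, 4.0])
density = expon.pdf(x, loc=0, scale=scale)

# closed form: rate * exp(-rate * x) for x >= 0, and 0 for x < 0
manual = np.where(x >= 0, rate * np.exp(-rate * x), 0.0)

mean = expon.mean(loc=0, scale=scale)   # equals scale when loc is 0
std = expon.std(loc=0, scale=scale)     # also equals scale

print(density)
print(manual)
print(mean, std)
```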

Let’s take an example by using one of the methods mentioned above to know how to use the methods with parameters.

Import the required libraries using the below code.

from scipy.stats import expon
import numpy as np
import matplotlib.pyplot as plt

Create an array containing values from -1 up to (but not including) 30 in steps of 0.1 using the below code.

array_data = np.arange(-1,30,0.1)

Now plot the probability density function by accessing the method pdf() of the object expon of the module scipy.stats using the below code. Here loc is 0 and scale is 2.

expon_PDF_data = expon.pdf(array_data,0,2)
plt.plot(array_data,expon_PDF_data,'bo')
plt.show()
Scipy Stats Exponential

Scipy Stats Boxcox

The Scipy submodule scipy.stats has a method boxcox() that transforms a non-normal dataset into an approximately normal dataset.

The syntax is given below,

scipy.stats.boxcox(x, lmbda=None, alpha=None, optimizer=None)

Where parameters are:

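  • x(array_data): It is the input array data that should be positive and one-dimensional.
  • lmbda(scalar): If it is given, the transformation is performed for this value. If it is None, the value that maximizes the log-likelihood function is found and returned as the second output.
  • alpha(float): If it is given, the confidence interval for lmbda is returned as an additional output.
  • optimizer: If lmbda is not set, then the optimizer is used to find the value of lmbda.

When lmbda is not set, the method boxcox() returns two values: boxcox of type ndarray (the transformed data) and maxlog of type float (the lambda that maximizes the log-likelihood function).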
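
The two ways of calling boxcox() can be sketched as follows. The seed, the sample size and the use of scipy.special.inv_boxcox to undo the transformation are choices made for this illustration.

```python
import numpy as np
from scipy import stats
from scipy.special import inv_boxcox

rng = np.random.default_rng(0)            # fixed seed so the sketch is repeatable
data = rng.exponential(size=500)          # positive, skewed sample

# lmbda not given: the lambda is estimated and returned with the transformed data
transformed, fitted_lambda = stats.boxcox(data)

# lmbda given: only the transformed array is returned; lmbda=0 is the natural log
log_transformed = stats.boxcox(data, lmbda=0)

# the transformation can be undone with the same lambda
restored = inv_boxcox(transformed, fitted_lambda)

print(fitted_lambda)
print(np.allclose(log_transformed, np.log(data)))
print(np.allclose(restored, data))
```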

Let’s understand with an example by following the below steps:

Import the required modules using the below code.

import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from scipy import stats

Create or generate non-normal values using the below code.

non_normal_data = np.random.exponential(size = 500)

Transform the non-normal data or generated data into normal using the method boxcox() and also save the lambda value.

transformed_data, lambda_value = stats.boxcox(non_normal_data)
Scipy Stats Boxcox example

Plot both the non-normal data and the transformed data using the below code.

fig, ax = plt.subplots(1, 2)

# kdeplot replaces the deprecated distplot(hist=False, kde=True)
sns.kdeplot(non_normal_data, fill = True, linewidth = 2,
            label = "Non-Normal", color ="green", ax = ax[0])

sns.kdeplot(transformed_data, fill = True, linewidth = 2,
            label = "Normal", color ="green", ax = ax[1])

# add a legend to each subplot
ax[0].legend(loc = "upper right")
ax[1].legend(loc = "upper right")

fig.set_figheight(5)
fig.set_figwidth(10)
Scipy Stats Boxcox

Scipy Stats Genextreme

The scipy.stats.genextreme represents a continuous random variable that follows the generalized extreme value distribution. It provides the usual distribution functions such as the CDF, PDF, median, etc.

It takes the shape parameter c and two further parameters, loc and scale, which shift and stretch the distribution.

The syntax is given below.

scipy.stats.genextreme.method_name(q,x,c,loc,size,moments,scale)

Where parameters are:

  • x: It is used to define the quantiles.
  • c: It is used to define the shape parameter.
  • q: It is used to specify the lower or upper tail probability.
  • loc: It is used to specify the location (shift) of the distribution, by default it is 0.
  • moments: It selects which statistics to calculate: mean ('m'), variance ('v'), skew ('s') and kurtosis ('k').
  • scale: It is used to specify the scale of the distribution, by default it is 1.

The above parameters are common to the methods of the object scipy.stats.genextreme. The methods are given below.

  • scipy.stats.genextreme.cdf(): It is used for the cumulative distribution function.
  • scipy.stats.genextreme.pdf(): It is used for the probability density function.
  • scipy.stats.genextreme.rvs(): To get the random variates.
  • scipy.stats.genextreme.stats(): It is used to get the mean, variance, skew, and kurtosis.
  • scipy.stats.genextreme.logpdf(): It is used to get the log of the probability density function.
  • scipy.stats.genextreme.logcdf(): It is used to find the log of the cumulative distribution function.
  • scipy.stats.genextreme.sf(): It is used to get the values of the survival function.
  • scipy.stats.genextreme.isf(): It is used to get the values of the inverse survival function.
  • scipy.stats.genextreme.logsf(): It is used to find the log of the survival function.
  • scipy.stats.genextreme.mean(): It is used to find the mean of the distribution.
  • scipy.stats.genextreme.median(): It is used to find the median of the distribution.
  • scipy.stats.genextreme.var(): It is used to find the variance of the distribution.
  • scipy.stats.genextreme.std(): It is used to find the standard deviation of the distribution.

Let’s take an example by using one of the methods mentioned above to know how to use the methods with parameters.

Import the required libraries using the below code.

from scipy.stats import genextreme
import matplotlib.pyplot as plt
import numpy as np

Create a variable for the shape parameter and assign it a value.

c = 1.95

Create an array of data using the method ppf() of an object genextreme using the below code.

array_data = np.linspace(genextreme.ppf(0.01, c),
                genextreme.ppf(0.90,c), 90)
array_data
Scipy Stats Genextreme example
Scipy Stats Genextreme example

Now plot the probability density function by accessing the method pdf() of the object genextreme of the module scipy.stats using the below code.

fig, ax = plt.subplots(1, 1)
ax.plot(array_data, genextreme.pdf(array_data,c),
       'r-', lw=4, alpha=0.5, label='genextreme PDF')
Scipy Stats Genextreme
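
As a brief sketch of how ppf() and cdf() fit together, the quantiles returned by ppf() can be passed back to cdf() to recover the original probabilities. The probabilities in q are illustrative; the upper bound 1/c of the support for a positive c follows the parameterisation used by scipy.

```python
import numpy as np
from scipy.stats import genextreme

c = 1.95                                  # same shape parameter as above
q = np.array([0.01, 0.5, 0.9])            # illustrative probabilities

x = genextreme.ppf(q, c)                  # quantiles for the given probabilities
roundtrip = genextreme.cdf(x, c)          # cdf is the inverse of ppf

# for c > 0 the distribution has no mass above 1 / c
beyond_support = genextreme.pdf(1 / c + 0.1, c)

print(x)
print(roundtrip)
print(beyond_support)
```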

Scipy Stats Dirichlet

Scipy has an object dirichlet that represents the Dirichlet distribution, a continuous multivariate probability distribution. It has some methods or functions that are given below.

  • scipy.stats.dirichlet.pdf(): It is used for the probability density function.
  • scipy.stats.dirichlet.var(): It is used to find the variance of the Dirichlet distribution.
  • scipy.stats.dirichlet.mean(): It is used to find the mean of the Dirichlet distribution.
  • scipy.stats.dirichlet.rvs(): To get the random variates.
  • scipy.stats.dirichlet.logpdf(): It is used to get the log of the probability density function.

The syntax is given below.

scipy.stats.dirichlet.method_name(x,alpha)

Where parameters are:

  • x(array_data): It is used to specify the quantiles.
  • alpha(array_data): It is used to define the concentration parameters.

Let’s take an example by following the below steps:

Import the required libraries using the below code.

from scipy.stats import dirichlet
import numpy as np

Define the quantiles and alpha values within an array using the below code.

quant = np.array([0.3, 0.3, 0.4])
alp = np.array([0.5, 6, 16]) 

Now evaluate the probability density function of the Dirichlet distribution at these quantiles using the below code.

dirichlet.pdf(quant,alp)
Scipy Stats Dirichlet
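
The methods mean() and rvs() can be sketched in the same way. The sample size of 5 and the seed are assumptions made for this illustration.

```python
import numpy as np
from scipy.stats import dirichlet

alp = np.array([0.5, 6, 16])   # same concentration parameters as above

# the mean of each component is its alpha divided by the sum of all alphas
mean = dirichlet.mean(alp)

# draw 5 random variates; each row is a point on the simplex and sums to 1
samples = dirichlet.rvs(alp, size=5, random_state=0)

print(mean)
print(samples.shape)
print(samples.sum(axis=1))
```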

Scipy Stats Hypergeom

Scipy has an object hypergeom in the module scipy.stats that represents the hypergeometric distribution, which models drawing objects from a bin without replacement.

The syntax is given below.

scipy.stats.hypergeom(M,n,N)

Where parameters are:

  • M: It is used to define the total number of objects.
  • n: It is used to define the number of Type I objects in M.
  • N: It is the number of objects drawn without replacement from the whole population. The random variate represents the number of Type I objects among these N.

Let’s take an example by following the below steps:

Import the required libraries using the below code.

from scipy.stats import hypergeom
import numpy as np
import matplotlib.pyplot as plt

Now, suppose we have a total of 30 phones, of which 10 are apple phones. We want to know the probability of getting a given number of apple phones if we choose 15 of the 30 phones at random. Let’s use the below code to find the solution to this problem.

[M, n, N] = [30, 10, 15]
rv = hypergeom(M, n, N)
x = np.arange(0, n+1)
pmf_applephones = rv.pmf(x)

Plot the above result using the below code.

fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(x, pmf_applephones, 'bo')
ax.vlines(x, 0, pmf_applephones, lw=2)
ax.set_xlabel('# of apple phones in our group of chosen phones')
ax.set_ylabel('hypergeom PMF')
plt.show()
Scipy Stats Hypergeom
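
As a short sketch, the same frozen distribution can answer further questions about the phone example, such as the expected number of apple phones or the probability of drawing at least 5 of them. The threshold of 5 is chosen only for illustration.

```python
import numpy as np
from scipy.stats import hypergeom

M, n, N = 30, 10, 15           # total phones, apple phones, phones drawn
rv = hypergeom(M, n, N)

x = np.arange(0, n + 1)
total_probability = rv.pmf(x).sum()   # the PMF over all possible counts sums to 1

expected = rv.mean()                  # N * n / M apple phones on average
at_least_5 = rv.sf(4)                 # P(X >= 5) is the survival function at 4

print(total_probability)
print(expected)
print(at_least_5)
```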

Scipy Stats Interval

Here in Scipy, interval refers to a confidence interval, a range of values that contains a given fraction of the probability of the distribution. Scipy has a method interval() in the class scipy.stats.rv_continuous that finds the confidence interval with equal areas around the median.

The syntax is given below.

rv_continuous.interval(alpha, *args, loc, scale)

Where parameters are:

  • alpha(array_data of floats): It defines the probability that a random variate is drawn from the returned range. Each value should be between 0 and 1.
  • *args(array_data): It is used for defining the shape of the distribution.
  • loc(array_data): It is used for defining the location parameter, by default it is 0.
  • scale(array_data): It is used for defining the scale parameter, by default it is 1.
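
A minimal sketch of interval() on the standard normal distribution is given below. The choice of scipy.stats.norm and of the 0.95 level is an assumption made for illustration. The level is passed positionally because the keyword name of this argument has, to the best of our knowledge, changed between SciPy releases (alpha in older versions, confidence in newer ones).

```python
from scipy.stats import norm

# range that contains 95% of the probability of a standard normal distribution
lower, upper = norm.interval(0.95, loc=0, scale=1)

# the area between the two end points equals the requested probability
covered = norm.cdf(upper) - norm.cdf(lower)

print(lower, upper)
print(covered)
```

For a symmetric distribution such as the normal, the two end points lie at the same distance on either side of the median.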

Scipy Stats ISF

ISF stands for inverse survival function. The method isf() returns the value at which the survival function (1 - CDF) of the distribution equals the given probability q.

The syntax is given below.

rv_continuous.isf(q, *args, loc, scale)

Where parameters are:

  • q(array_data): It defines the upper tail probability.
  • *args(array_data): It is used for defining the shape of the distribution.
  • loc(array_data): It is used for defining the location parameter, by default it is 0.
  • scale(array_data): It is used for defining the scale parameter, by default it is 1.
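
The relation between isf(), sf() and ppf() can be sketched on the standard normal distribution. The choice of scipy.stats.norm and of q = 0.05 is an assumption made for illustration.

```python
from scipy.stats import norm

q = 0.05                       # illustrative upper tail probability

x = norm.isf(q)                # value with probability q above it
tail = norm.sf(x)              # the survival function at x gives q back
same_as_ppf = norm.ppf(1 - q)  # isf(q) is equivalent to ppf(1 - q)

print(x)
print(tail)
print(same_as_ppf)
```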

Scipy Stats Independent T-test

The independent T-test tests the null hypothesis that two independent samples have identical average (expected) values. In simple terms, it tests whether the two independent samples have the same average value.

The syntax is given below.

scipy.stats.ttest_ind(a, b, axis=0, equal_var=True, nan_policy='propagate', alternative='two-sided', trim=0)

Where parameters are:

  • a,b(array_data): It is the sample of independent observations in the form of an array.
  • axis(int): It is used to specify the axis on which the test is done.
  • equal_var(boolean): If it is true, then it considers that the variance of two independent samples is equal, otherwise in the case of false, it uses Welch’s t-test for two independent samples whose variance is not equal.
  • alternative: It is used to specify the alternative hypothesis.
  • nan_policy: It is used to deal with the nan values and accept three values:
  1. omit: It means performing the test by ignoring the nan values.
  2. propagate: It means returning nan values.
  3. raise: It means throwing an error for the nan values.

The method ttest_ind returns two float values, the t-statistic and pvalue.

Let’s take an example by following the below steps:

Import the required libraries stats from Scipy using the below code.

from scipy import stats
import numpy as np

Create a random number generator using the below code.

randomnum_gen = np.random.default_rng()

Create two samples with identical means using the below code.

sample1 = stats.norm.rvs(loc=6, scale=15, size=1000, random_state=randomnum_gen)
sample2 = stats.norm.rvs(loc=6, scale=15, size=1000, random_state=randomnum_gen)

Calculate the T-test of independent samples that we have created above.

stats.ttest_ind(sample1, sample2)
Scipy Stats Independent t test

From the above output, we reject the null hypothesis of equal means if the p-value is smaller than the chosen significance level (for example 0.05); otherwise, we fail to reject it.
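
When the two samples may have different variances, Welch's t-test can be used by setting equal_var to False. The following is a minimal sketch; the seed, the scale of 30 for the second sample and the 0.05 significance level are assumptions made for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(12345)   # fixed seed so the sketch is repeatable

# same mean, but the second sample has a larger spread
sample1 = stats.norm.rvs(loc=6, scale=15, size=1000, random_state=rng)
sample2 = stats.norm.rvs(loc=6, scale=30, size=1000, random_state=rng)

# equal_var=False selects Welch's t-test
t_stat, p_value = stats.ttest_ind(sample1, sample2, equal_var=False)

significance_level = 0.05
if p_value < significance_level:
    print("Reject the null hypothesis of equal means")
else:
    print("Fail to reject the null hypothesis of equal means")
```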

Scipy Stats Fisher Exact

Fisher's exact test is a statistical test of the nonrandom association between two categorical variables. Scipy has a method fisher_exact() for that kind of test.

The syntax is given below.

scipy.stats.fisher_exact(table, alternative='two-sided')

Where parameters are:

  • table(array_data of type ints): It is the 2×2 table on which we want to perform the test.
  • alternative: It is used to specify the alternative hypothesis. The alternative options are given below:
  1. ‘two-sided’
  2. ‘less’: one-sided
  3. ‘greater’: one-sided

The method returns two values of type float: the odds ratio and the p_value.

Let’s take an example by following the below steps:

Suppose we have a survey of the students in college about using the iPhone and Android phones based on gender, then we found the below data.

          iPhone    Android
Male      10        5
Female    5         11
Survey about phones

To find out whether there is a statistically significant association between gender and phone preference, use the below code.

Import the libraries using the below code.

from scipy import stats

Create the array of data for holding the survey information.

survey_data = [[10,5],[5,11]]

Perform the fisher_exact() function on this data to know the significance.

stats.fisher_exact(survey_data)
Scipy Stats fisher exact

From the output, the p_value is greater than 0.05, so there is not enough evidence to say that there is an association between gender and phone preference.
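
As a small sketch, the two returned values can be unpacked and the odds ratio checked by hand. The variable names and the 0.05 threshold are chosen here for illustration.

```python
from scipy import stats

survey_data = [[10, 5], [5, 11]]

odds_ratio, p_value = stats.fisher_exact(survey_data)

# the sample odds ratio of a 2x2 table is (a * d) / (b * c)
manual_odds_ratio = (10 * 11) / (5 * 5)

print(odds_ratio, manual_odds_ratio)
print(p_value)                  # about 0.076 for this table
print(p_value > 0.05)
```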

So, in this Scipy tutorial, we understood the need for and the use of Scipy Stats. We have also covered the following topics.

  • Scipy Stats
  • Scipy Stats Lognormal
  • Scipy Stats Norm
  • Scipy Stats T-test
  • Scipy Stats Pearsonr
  • Scipy Stats chi-square
  • Scipy Stats IQR
  • Scipy Stats Poisson
  • Scipy Stats Entropy
  • Scipy Stats Anova
  • Scipy Stats Anderson
  • Scipy Stats Average
  • Scipy Stats Alpha
  • Scipy Stats Boxcox
  • Scipy Stats Binom
  • Scipy Stats Beta
  • Scipy Stats Binomial test
  • Scipy Stats Binned statistics
  • Scipy Stats Binom pmf
  • Scipy Stats CDF
  • Scipy Stats Cauchy
  • Scipy Stats Describe
  • Scipy Stats Exponential
  • Scipy Stats Gamma
  • Scipy Stats Geometric
  • Scipy Stats gmean
  • Scipy Stats Gennorm
  • Scipy Stats Genpareto
  • Scipy Stats Gumbel
  • Scipy Stats Genextreme
  • Scipy Stats Histogram
  • Scipy Stats Half normal
  • Scipy Stats Half cauchy
  • Scipy Stats Inverse gamma
  • Scipy Stats Inverse normal CDF
  • Scipy Stats Johnson
  • Scipy Stats PDF
  • Scipy Stats Hypergeom
  • Scipy Stats Interval
  • Scipy Stats ISF
  • Scipy Stats Independent T-test
  • Scipy Stats Fisher Exact