In this Python Scipy tutorial, we will learn how to use “Python Scipy Spatial Distance Cdist” to compute the distance between each pair of observations taken from two input collections, using metrics such as Cityblock, Jaccard, and others. We will cover the following topics.
- Python Scipy Spatial Distance Cdist
- Python Scipy Spatial Distance Cdist Metric
- Python Scipy Spatial Distance Cdist Output
- Python Scipy Spatial Distance Cdist Euclidean
- Python Scipy Spatial Distance Cdist Russellrao
- Python Scipy Spatial Distance Cdist Chebyshev
- Python Scipy Spatial Distance Cdist Cityblock
- Python Scipy Spatial Distance Cdist Jaccard
Python Scipy Spatial Distance Cdist
The scipy.spatial.distance module of Python Scipy contains a method called cdist() that computes the distance between each pair of observations from the two input collections.
The syntax is given below.
scipy.spatial.distance.cdist(XA, XB, metric='euclidean')
Where parameters are:
- XA (array_like): An mA by n array holding mA original observations in an n-dimensional space.
- XB (array_like): An mB by n array holding mB original observations in an n-dimensional space.
- metric (str or callable): The distance metric to apply. It can be the name of a built-in metric such as “canberra,” “braycurtis,” “chebyshev,” “correlation,” “cityblock,” “cosine,” “euclidean,” “dice,” “hamming,” “jaccard,” “kulsinski,” “jensenshannon,” “kulczynski1,” “matching,” “mahalanobis,” “minkowski,” “russellrao,” “rogerstanimoto,” “seuclidean,” or a function that takes two 1-D vectors and returns a number. The exact set of accepted names depends on the installed Scipy version, so check the documentation for your release.
The outcome of the method cdist() is a distance matrix of dimensions mA by mB.
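As a quick illustration of the mA by mB output shape described above, here is a minimal sketch that passes two collections of different sizes to cdist(). The array names and values are made up for this example.

```python
import numpy as np
from scipy.spatial.distance import cdist

# Hypothetical data: 3 observations and 2 observations, both in 2 dimensions.
XA = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
XB = np.array([[0.0, 0.0], [3.0, 0.0]])

# No metric is given here, so the default metric is used.
dist = cdist(XA, XB)

# One row per observation in XA, one column per observation in XB.
print(dist.shape)
# Entry [i, j] is the distance between XA[i] and XB[j].
print(dist[1, 0])
```

Swapping the two arguments would give the transposed shape, 2 by 3.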
Let’s take four 2-D coordinates of cities in the USA and find the pairwise Cosine distances between them by following the below steps:
Import the required libraries or methods using the below python code.
from scipy.spatial.distance import cdist
Define the coordinate of any four cities in the USA using the below code.
coord_data = [(5, -10),
              (4, -9),
              (6, -14),
              (9, -11)]
Now pass the above coordinates to the method cdist() with metric equal to cosine using the below code.
cdist(coord_data,coord_data,'cosine')
This is how to compute the distance between coordinates using the method cdist() of Python Scipy.
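To make the result of the cosine example easier to trust, the sketch below recomputes it with plain NumPy. It assumes the usual definition of cosine distance, one minus the dot product of the two vectors divided by the product of their lengths. The variable names unit and manual are introduced only for this check.

```python
import numpy as np
from scipy.spatial.distance import cdist

coord_data = np.array([(5, -10), (4, -9), (6, -14), (9, -11)], dtype=float)

cos_dist = cdist(coord_data, coord_data, 'cosine')

# Manual check: scale every row to unit length, then
# cosine distance = 1 - (dot product of the unit vectors).
unit = coord_data / np.linalg.norm(coord_data, axis=1, keepdims=True)
manual = 1.0 - unit @ unit.T

print(cos_dist.shape)
print(np.allclose(cos_dist, manual))
```

Because both arguments are the same collection, the matrix is symmetric and its diagonal is zero up to floating-point rounding.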
Python Scipy Spatial Distance Cdist Metric
We now know enough about the method cdist() of Python Scipy to compute the distance between two input collections of values. Let’s look at the parameter metric, which specifies the type of distance that we want to use.
The parameter metric accepts many distance metrics, such as “canberra,” “braycurtis,” “chebyshev,” “correlation,” “cityblock,” “cosine,” “euclidean,” “dice,” “hamming,” “jaccard,” “kulsinski,” “jensenshannon,” “kulczynski1,” “matching,” “mahalanobis,” “minkowski,” “russellrao,” “rogerstanimoto,” “seuclidean”.
Explaining every distance metric is beyond the scope of this tutorial, but the Scipy documentation describes each of them.
Each metric computes the distance in different ways according to what kind of distance we want to calculate. We will learn about some of the metrics in the subsections of this tutorial.
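The following sketch shows three ways of supplying the metric: by name, by name with a metric-specific keyword argument, and as a callable. The helper manhattan and the sample arrays are hypothetical and exist only for this illustration.

```python
import numpy as np
from scipy.spatial.distance import cdist

XA = np.array([[5.0, 0.0], [3.0, -7.0]])
XB = np.array([[3.0, 1.0], [7.0, -4.0]])

# 1. Metric given by name.
by_name = cdist(XA, XB, 'cityblock')

# 2. Metric given by name plus a keyword argument:
#    the Minkowski distance with p=1 equals the cityblock distance.
by_kwarg = cdist(XA, XB, 'minkowski', p=1)

# 3. Metric given as a callable that receives two 1-D vectors.
def manhattan(u, v):
    return np.abs(u - v).sum()

by_callable = cdist(XA, XB, manhattan)

print(by_name)
print(np.allclose(by_name, by_kwarg), np.allclose(by_name, by_callable))
```

A callable is evaluated pair by pair in Python, so for large inputs a built-in metric name is usually the faster choice.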
Python Scipy Spatial Distance Cdist Output
The method cdist() of Python Scipy returns a single value y, which is an mA by mB distance matrix. The metric dist(u=XA[i], v=XB[j]) is calculated for each i and j and stored in the ij-th element. If XA and XB don’t have the same number of columns, the method raises a ValueError instead of returning a result.
Let’s check the ValueError case with an example. The return value y appears in the examples of the later subsections of this tutorial.
Import the required libraries or methods using the below python code.
from scipy.spatial.distance import cdist
Define one coordinate in the USA as a flat list and the coordinates of four cities in the United Kingdom (UK) using the below code.
usa_cordi_data = [5,6]
uk_cordi_data = [(6, -8),
(9, -4),
(12, -14),
(5, -10)]
From the above code, we can see that the first array usa_cordi_data is one-dimensional and the second array uk_cordi_data is two-dimensional.
Now pass the above coordinates to the method cdist() with a metric equal to euclidean using the below code.
cdist(usa_cordi_data,uk_cordi_data,'euclidean')
The above code raises a ValueError because cdist() requires both inputs to be two-dimensional arrays, and usa_cordi_data is one-dimensional.
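The sketch below catches the error instead of letting it stop the script, and shows one possible fix. It covers both failure cases mentioned in this section: a one-dimensional input and a different number of columns. The list errors and the three-column point are assumptions added for the demonstration, and the exact wording of the error messages may differ between Scipy versions.

```python
from scipy.spatial.distance import cdist

usa_cordi_data = [5, 6]  # a flat list, so it is treated as a 1-D array
uk_cordi_data = [(6, -8), (9, -4), (12, -14), (5, -10)]

errors = []

# Case 1: a 1-D input is rejected.
try:
    cdist(usa_cordi_data, uk_cordi_data, 'euclidean')
except ValueError as err:
    errors.append(str(err))

# Case 2: both inputs are 2-D, but the column counts differ (3 vs 2).
try:
    cdist([(5, 6, 7)], uk_cordi_data, 'euclidean')
except ValueError as err:
    errors.append(str(err))

for message in errors:
    print("ValueError:", message)

# Wrapping the single point in an outer list gives a 1 x 2 array, which works.
fixed = cdist([usa_cordi_data], uk_cordi_data, 'euclidean')
print(fixed.shape)
```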
Python Scipy Spatial Distance Cdist Euclidean
In mathematics, the Euclidean distance between two points is the length of the line segment that connects them in Euclidean space.
It is sometimes referred to as the Pythagorean distance because it can be calculated from the coordinates of the points using the Pythagorean theorem.
As we have talked about metrics in the above subsections, here we will use the metric euclidean to calculate the distance between coordinates of cities in the USA and the UK.
Import the required libraries or methods using the below python code.
from scipy.spatial.distance import cdist
Define the coordinates of any four cities in the USA and the United Kingdom (UK) using the below code.
usa_cordi_data = [(5, -10),
(4, -9),
(6, -14),
(9, -11)]
uk_cordi_data = [(6, -8),
(9, -4),
(12, -14),
(5, -10)]
Now pass the above coordinates to the method cdist() with a metric equal to euclidean using the below code.
cdist(usa_cordi_data,uk_cordi_data,'euclidean')
This is how to compute spatial distance using the method cdist() with metric equal to euclidean.
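As a sanity check on the Euclidean example, this sketch recomputes the same matrix with NumPy broadcasting, applying the Pythagorean formula to every pair. The names diff and manual are introduced here only for the comparison.

```python
import numpy as np
from scipy.spatial.distance import cdist

usa_cordi_data = np.array([(5, -10), (4, -9), (6, -14), (9, -11)], dtype=float)
uk_cordi_data = np.array([(6, -8), (9, -4), (12, -14), (5, -10)], dtype=float)

eu = cdist(usa_cordi_data, uk_cordi_data, 'euclidean')

# Broadcasting builds a (4, 4, 2) array of coordinate differences,
# one (dx, dy) pair for every USA/UK combination.
diff = usa_cordi_data[:, None, :] - uk_cordi_data[None, :, :]
manual = np.sqrt((diff ** 2).sum(axis=2))

print(np.allclose(eu, manual))
# (5, -10) appears in both collections, so this entry is zero.
print(eu[0, 3])
```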
Python Scipy Spatial Distance Cdist Russellrao
The Python Scipy method cdist() accepts the metric russellrao to calculate the Russell-Rao dissimilarity between each pair of the two input collections. The Russell-Rao dissimilarity is defined for boolean vectors.
Let’s take an example by following the below steps:
Import the required libraries or methods using the below python code.
from scipy.spatial.distance import cdist
Create the coordinate points using the below code.
usa_cordi_data = [(1, 0),
(4, -9),
(6, -14),
(9, -11)]
uk_cordi_data = [(0, 1),
(9, -4),
(12, -14),
(5, -10)]
Now pass the above coordinates to the method cdist() with a metric equal to russellrao using the below code.
cdist(usa_cordi_data,uk_cordi_data,'russellrao')
This is how to compute spatial distance using the method cdist() with metric equal to russellrao.
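Since Russell-Rao is a dissimilarity between boolean vectors, it is easier to follow with explicit True/False data than with numeric coordinates. The sketch below uses made-up boolean arrays and assumes the standard definition (n - c_TT) / n, where n is the vector length and c_TT counts the positions that are True in both vectors. The names A, B, both_true, and manual are hypothetical.

```python
import numpy as np
from scipy.spatial.distance import cdist

# Hypothetical boolean observations with 4 features each.
A = np.array([[True, False, True, True],
              [False, False, True, False]])
B = np.array([[True, True, True, False],
              [False, False, False, False]])

rr = cdist(A, B, 'russellrao')

# Manual check: count positions that are True in both vectors,
# then apply (n - c_TT) / n.
n = A.shape[1]
both_true = A.astype(int) @ B.astype(int).T
manual = (n - both_true) / n

print(rr)
print(np.allclose(rr, manual))
```

With numeric input such as the coordinates above, only whether each entry is zero or nonzero should matter for a boolean metric, so converting the data to booleans explicitly first keeps the result easy to interpret.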
Python Scipy Spatial Distance Cdist Chebyshev
The Chebyshev distance, also known as the “maximum metric” in mathematics, defines the distance between two points as the greatest absolute difference between their coordinates along any single axis.
The Python Scipy method cdist() accepts the metric chebyshev to calculate the Chebyshev distance between each pair of the two input collections.
Let’s take an example by following the below steps:
Import the required libraries or methods using the below python code.
from scipy.spatial.distance import cdist
Create the two sets of coordinate points using the below code.
cordi_data_1 = [(1, 0),
(4, -9),
(6, -14),
(9, -11)]
cordi_data_2 = [(0, 1),
(9, -4),
(12, -14),
(5, -10)]
Pass the above coordinates to the method cdist() with a metric equal to chebyshev using the below code.
cdist(cordi_data_1,cordi_data_2,'chebyshev')
This is how to compute spatial distance using the method cdist() with metric equal to chebyshev.
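The definition of the Chebyshev distance can be checked directly: take the absolute coordinate differences for each pair and keep the largest one. The sketch below does this with NumPy for the same data. The name manual is introduced only for the comparison.

```python
import numpy as np
from scipy.spatial.distance import cdist

cordi_data_1 = np.array([(1, 0), (4, -9), (6, -14), (9, -11)], dtype=float)
cordi_data_2 = np.array([(0, 1), (9, -4), (12, -14), (5, -10)], dtype=float)

cheb = cdist(cordi_data_1, cordi_data_2, 'chebyshev')

# Largest absolute difference along any axis, for every pair of points.
manual = np.abs(cordi_data_1[:, None, :] - cordi_data_2[None, :, :]).max(axis=2)

print(np.allclose(cheb, manual))
# (6, -14) vs (12, -14): the differences are 6 and 0, so the distance is 6.
print(cheb[2, 2])
```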
Python Scipy Spatial Distance Cdist Cityblock
The Manhattan (cityblock) distance between two points is the sum of the absolute differences of their coordinates across all dimensions. The Python Scipy method cdist() accepts the metric cityblock to calculate the Manhattan distance between each pair of the two input collections.
Let’s take an example by following the below steps:
Import the required libraries or methods using the below python code.
from scipy.spatial.distance import cdist
Create the two sets of coordinate points using the below code.
cordi_data_1 = [(5, 0),
(3, -7),
(2, -9),
(10, -11)]
cordi_data_2 = [(3, 1),
(7, -4),
(7, -14),
(9, -10)]
Pass the above coordinates to the method cdist() with a metric equal to cityblock using the below code.
cdist(cordi_data_1,cordi_data_2,'cityblock')
This is how to compute spatial distance using the method cdist() with metric equal to cityblock in Python Scipy.
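To connect the cityblock result with the definition, the sketch below works one pair out by hand and compares it with the matching entry of the matrix. The names first_pair and cb are introduced only for this example.

```python
import numpy as np
from scipy.spatial.distance import cdist

cordi_data_1 = np.array([(5, 0), (3, -7), (2, -9), (10, -11)], dtype=float)
cordi_data_2 = np.array([(3, 1), (7, -4), (7, -14), (9, -10)], dtype=float)

cb = cdist(cordi_data_1, cordi_data_2, 'cityblock')

# Hand calculation for (5, 0) and (3, 1): |5 - 3| + |0 - 1| = 3.
first_pair = abs(5 - 3) + abs(0 - 1)

print(cb[0, 0], first_pair)
# Each row lists the distances from one point of cordi_data_1
# to every point of cordi_data_2.
print(cb[0])
```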
Python Scipy Spatial Distance Cdist Jaccard
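The Jaccard distance measures how dissimilar two sets are: it is one minus the size of their intersection divided by the size of their union. It is a metric on the collection of all finite sets, and it is commonly used to build an n by n distance matrix for clustering and multidimensional scaling of n sample sets.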
The Python Scipy method cdist() accepts the metric jaccard to calculate the Jaccard distance between each pair of the two input collections.
Let’s take an example by following the below steps:
Import the required libraries or methods using the below python code.
from scipy.spatial.distance import cdist
Create the two sets of coordinate points using the below code.
cordi_data_1 = [(5, 0),
(3, -7),
(2, -9),
(10, -11)]
cordi_data_2 = [(3, 1),
(7, -4),
(7, -14),
(9, -10)]
Pass the above coordinates to the method cdist() with a metric equal to jaccard using the below code.
cdist(cordi_data_1,cordi_data_2,'jaccard')
This is how to compute spatial distance using the method cdist() with metric equal to jaccard.
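The Jaccard distance is easiest to interpret when each row is a boolean membership vector, where True means the item belongs to the set. The sketch below uses made-up sets encoded this way and checks cdist() against the definition of one minus intersection over union. The names A, B, inter, union, and manual are hypothetical. How numeric, non-boolean input is handled may vary between Scipy versions, so boolean data is used here on purpose.

```python
import numpy as np
from scipy.spatial.distance import cdist

# Hypothetical sets over 4 items, encoded as membership vectors.
A = np.array([[True, True, False, False],    # {0, 1}
              [True, False, True, False]])   # {0, 2}
B = np.array([[True, True, True, False],     # {0, 1, 2}
              [False, False, False, True]])  # {3}

jac = cdist(A, B, 'jaccard')

# Manual check: 1 - |intersection| / |union| for every pair of sets.
inter = A.astype(int) @ B.astype(int).T
union = A.sum(axis=1)[:, None] + B.sum(axis=1)[None, :] - inter
manual = 1.0 - inter / union

print(jac)
print(np.allclose(jac, manual))
```

Sets that share no items, such as {0, 1} and {3}, get the maximum distance of 1.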
This tutorial taught us how to compute the spatial distance between each pair of two input collections in Scipy using metrics such as Euclidean, Jaccard, Cityblock, and Chebyshev, with the following topics.
- Python Scipy Spatial Distance Cdist
- Python Scipy Spatial Distance Cdist Metric
- Python Scipy Spatial Distance Cdist Output
- Python Scipy Spatial Distance Cdist Euclidean
- Python Scipy Spatial Distance Cdist Russellrao
- Python Scipy Spatial Distance Cdist Chebyshev
- Python Scipy Spatial Distance Cdist Cityblock
- Python Scipy Spatial Distance Cdist Jaccard
I am Bijay Kumar, a Microsoft MVP in SharePoint. Apart from SharePoint, I have been working on Python, machine learning, and artificial intelligence for the last 5 years. During this time I have gained expertise in various Python libraries such as Tkinter, Pandas, NumPy, Turtle, Django, Matplotlib, Tensorflow, Scipy, and Scikit-Learn for various clients in the United States, Canada, the United Kingdom, Australia, New Zealand, etc.