Have you ever wondered what **Python Scipy Smoothing** is? In this tutorial we will learn about *Python Scipy Smoothing*: how to smooth a curve using different filters and methods, and how to remove noise from noisy data, by covering the following topics.

- What is Data Smoothing?
- Python Scipy Smoothing Spline
- How to use the filter for smoothing
- How to smooth the 1d data
- How to remove noise from the data and make it smooth
- How to control the smoothness using the method of smoothing factor
- Python Scipy Smoothing 2d Data

## What is Data Smoothing?

Data smoothing is the process of removing noise from a data set using an algorithm, so that important patterns stand out more clearly. It is used in economic analysis and to help predict trends, such as those seen in securities prices. The purpose of data smoothing is to eliminate one-off outliers and account for seasonality.

When data is compiled, volatility and other kinds of noise can be reduced or eliminated in this way.

Data smoothing rests on the idea that simpler, smoothed changes make trends and patterns easier to predict. It helps statisticians and traders, who must examine large amounts of data that are often difficult to comprehend, to spot trends they might not otherwise notice.

The process of data smoothing can be carried out in a variety of ways. A few options are the randomization approach, conducting an exponential smoothing procedure, computing a moving average, or employing a random walk.
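Two of the options listed above, the moving average and exponential smoothing, can be sketched in a few lines of NumPy. This is a minimal illustration only; the helper names `moving_average` and `exponential_smoothing`, the sample `prices` array, and the `alpha` value are assumptions made for this example and are not part of SciPy.

```python
import numpy as np

def moving_average(values, window):
    """Simple moving average; returns len(values) - window + 1 points."""
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")

def exponential_smoothing(values, alpha):
    """Simple exponential smoothing: s[t] = alpha * x[t] + (1 - alpha) * s[t-1]."""
    smoothed = np.empty(len(values), dtype=float)
    smoothed[0] = values[0]
    for t in range(1, len(values)):
        smoothed[t] = alpha * values[t] + (1 - alpha) * smoothed[t - 1]
    return smoothed

# Hypothetical sample data for illustration
prices = np.array([10.0, 12.0, 11.0, 15.0, 14.0, 18.0])
ma = moving_average(prices, window=3)
es = exponential_smoothing(prices, alpha=0.5)
print(ma)
print(es)
```

A larger `window` or a smaller `alpha` gives a smoother but slower-reacting curve.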


## Python Scipy Smoothing Spline

A spline is a function built from a collection of polynomial pieces that are joined at particular locations known as knots.

Because the pieces are joined smoothly, a spline produces a smooth function that avoids sudden changes in slope. Splines are used to interpolate a set of data points with a function that is continuous over the range of interest.

Python Scipy has a class *scipy.interpolate.UnivariateSpline()* that fits a 1-D smoothing spline to an existing set of data points.

The syntax is given below.

`class scipy.interpolate.UnivariateSpline(x, y, w=None, bbox=[None, None], k=3, s=None, ext=0, check_finite=False)`

Where parameters are:

- **x (array_data, N):** 1-D array of independent input data. Must be increasing; if s is 0, it must be strictly increasing.
- **y (array_data, N):** 1-D array of dependent input data, with the same length as x.
- **w (array_data, N):** Weights for spline fitting. Must be positive. If w is None (the default), all weights are equal.
- **bbox (array_data, 2):** 2-sequence specifying the boundary of the approximation interval. If None (the default), bbox is equal to [x[0], x[-1]].
- **k (int):** Degree of the smoothing spline. Must satisfy 1 <= k <= 5. k = 3 is a cubic spline and is the default.
- **s (float):** Positive smoothing factor used to choose the number of knots. The number of knots is increased until the smoothing condition is satisfied, that is, until the weighted sum of squared residuals is no larger than s. If s is 0, the spline passes through all data points.
- **ext (string, int):** Controls the extrapolation mode for elements outside the interval defined by the knot sequence.
  - if ext=0 or ‘extrapolate’, return the extrapolated value.
  - if ext=1 or ‘zeros’, return 0
  - if ext=2 or ‘raise’, raise a ValueError
  - if ext=3 or ‘const’, return the boundary value.
- **check_finite (boolean):** Whether to check that the input arrays contain only finite numbers. Disabling this may improve performance, but may cause problems (crashes, non-termination, or nonsensical results) if the inputs do contain infinities or NaNs. The default is False.
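The effect of the `s` and `ext` parameters described above can be checked with a small sketch. The sample data (`y = x**2`) and the variable names are assumptions chosen for illustration; with `s=0` the spline should reproduce the data points, and `ext` only changes what is returned outside the range of x.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

x = np.linspace(0.0, 5.0, 11)
y = x ** 2

# s=0 forces the spline through every data point (pure interpolation).
interp = UnivariateSpline(x, y, k=3, s=0)
max_gap = np.max(np.abs(interp(x) - y))

# The ext argument only matters outside [x[0], x[-1]].
outside = np.array([-1.0, 6.0])
zeros = UnivariateSpline(x, y, s=0, ext="zeros")(outside)  # 0 outside the range
const = UnivariateSpline(x, y, s=0, ext="const")(outside)  # boundary values

try:
    UnivariateSpline(x, y, s=0, ext="raise")(outside)
    raised = False
except ValueError:
    raised = True

print(max_gap, zeros, const, raised)
```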

Let’s take an example and smooth noisy data by following the below steps:

Import the required libraries or methods using the below python code.

```
import numpy as np
import matplotlib.pyplot as plt
from scipy import interpolate
```

Generate x and y, and plot them using the below code.

```
rng_ = np.random.default_rng()
x_ = np.linspace(-4, 4, 50)
y_ = np.exp(-x_**2) + 0.1 * rng_.standard_normal(50)
plt.plot(x_, y_, 'ro', ms=5)
```

Smooth the data using *UnivariateSpline()* with the default parameter values using the below code.

```
spline = interpolate.UnivariateSpline(x_, y_)
xs_ = np.linspace(-4, 4, 1000)
plt.plot(xs_, spline(xs_), 'g', lw=2)
```

Now manually adjust the amount of smoothing using the below code.

```
spline.set_smoothing_factor(0.5)
plt.plot(xs_, spline(xs_), 'b', lw=3)
plt.show()
```

The method *set_smoothing_factor()* continues the spline computation using the given smoothing factor *s* and the knots found during the previous call.

This is how to smooth the data using the method *UnivariateSpline()* of Python Scipy.


## Python Scipy Smoothing Filter

The Savitzky-Golay filter is a digital filter that smooths a curve using the data points themselves. It takes a small window of data, fits a polynomial to the points in that window using the least-squares method, and then uses the fitted polynomial to compute the value at the window’s center point.

The window is then shifted by one data point and the process is repeated, until every point has been adjusted relative to its neighbors.
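The sliding-window procedure described above can be written out by hand and compared with SciPy's result. The sketch below is illustrative only: the window size, polynomial order, and test signal are assumptions, and the manual loop leaves the edge points untouched (as NaN) because edge handling depends on the filter's `mode`.

```python
import numpy as np
from scipy.signal import savgol_filter

rng = np.random.default_rng(0)
y = np.sin(np.linspace(0, 2 * np.pi, 60)) + 0.1 * rng.standard_normal(60)

window_length, polyorder = 7, 2
half = window_length // 2
offsets = np.arange(-half, half + 1)  # sample positions relative to the window center

manual = np.full_like(y, np.nan)      # edge points are left as NaN in this sketch
for i in range(half, len(y) - half):
    window = y[i - half:i + half + 1]
    coeffs = np.polyfit(offsets, window, polyorder)  # least-squares polynomial fit
    manual[i] = np.polyval(coeffs, 0.0)              # fitted value at the window center

reference = savgol_filter(y, window_length, polyorder)
interior_matches = np.allclose(manual[half:-half], reference[half:-half])
print(interior_matches)
```

Away from the edges, the hand-written loop and `savgol_filter()` should agree to floating-point precision.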

Python Scipy has a method *savgol_filter()* in the module *scipy.signal* that applies a Savitzky-Golay filter to an array.

The syntax is given below.

`scipy.signal.savgol_filter(x, window_length, polyorder, deriv=0, delta=1.0, axis=-1, mode='interp', cval=0.0)`

Where parameters are:

- *x (array_data):* The data to be filtered. If x is not a single-precision or double-precision floating-point array, it is converted to type *numpy.float64* before filtering.
- *window_length (int):* The size of the filter window. If mode is “interp”, window_length must be less than or equal to the size of x.
- *polyorder (int):* The order of the polynomial used to fit the data. polyorder must be less than window_length.
- *deriv (int):* The order of the derivative to compute. It must be a non-negative integer. The default is 0, which means the data is filtered without being differentiated.
- *delta (float):* The spacing of the samples to which the filter is applied. Only used when deriv > 0. The default is 1.0.
- *axis (int):* The axis of the array x along which the filter is applied. The default is -1.
- *mode (string):* Must be “mirror”, “constant”, “nearest”, “wrap”, or “interp”. This determines the type of extension used for the padded signal to which the filter is applied. When mode is “constant”, the padding value is given by cval. See the Notes in the SciPy documentation for details on “mirror”, “constant”, “wrap”, and “nearest”. When the default “interp” mode is selected, no extension is used. Instead, a polynomial of degree polyorder is fitted to the last window_length values at each edge, and this polynomial is used to evaluate the last window_length // 2 output values.
- *cval (scalar):* The value used to fill past the edges of the input if mode is “constant”. The default is 0.0.

The method *savgol_filter()* returns the filtered data.
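As a hedged illustration of the `deriv` and `delta` parameters, the sketch below estimates the first derivative of a noise-free sine wave and compares it with the exact derivative. The window length and polynomial order are arbitrary choices for this example.

```python
import numpy as np
from scipy.signal import savgol_filter

x = np.linspace(0, 2 * np.pi, 200)
dx = x[1] - x[0]          # sample spacing, passed as delta
y = np.sin(x)

# deriv=1 returns the first derivative of the fitted polynomials;
# delta scales the result to the actual sample spacing.
dy = savgol_filter(y, window_length=11, polyorder=3, deriv=1, delta=dx)
max_error = np.max(np.abs(dy - np.cos(x)))
print(max_error)
```

If `delta` is left at its default of 1.0, the derivative is expressed per sample rather than per unit of x.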

Let’s take an example by following the below steps:

Import the required libraries or methods using the below python code.

```
import numpy as np
import matplotlib.pyplot as plt
from scipy import signal
```

Generate noisy data and plot the data using the below code.

```
x_ = np.linspace(0,2*np.pi,200)
y_ = np.sin(x_) + np.random.random(200) * 0.2
plt.plot(x_, y_)
```

Now apply the Savitzky-Golay filter to the noisy data to smooth it.

```
yhat_ = signal.savgol_filter(y_, 49, 3)
plt.plot(x_, y_)
plt.plot(x_,yhat_, color='green')
plt.show()
```
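To get a feel for how `window_length` influences the result, one option is to measure the error against a known clean signal. The sketch below is an assumption-laden variant of the example above: it uses zero-mean Gaussian noise with a fixed seed (rather than the uniform noise used earlier) and a hypothetical helper `rmse`.

```python
import numpy as np
from scipy.signal import savgol_filter

rng = np.random.default_rng(42)
x = np.linspace(0, 2 * np.pi, 200)
clean = np.sin(x)
noisy = clean + 0.1 * rng.standard_normal(x.size)

def rmse(a, b):
    """Root-mean-square difference between two arrays."""
    return float(np.sqrt(np.mean((a - b) ** 2)))

noisy_rmse = rmse(noisy, clean)
results = {}
for window_length in (5, 21, 49):
    smoothed = savgol_filter(noisy, window_length, 3)
    results[window_length] = rmse(smoothed, clean)

print(noisy_rmse, results)
```

A wider window averages over more points and so removes more noise, at the risk of flattening genuine features when the window becomes too wide for the signal.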

This is how to apply the Savitzky-Golay filter to noisy data to smooth it using the method *savgol_filter()* of Python Scipy.


## Python Scipy Smoothing 1d

The class *interp1d()* in the Python Scipy module *scipy.interpolate* is used for 1-D function interpolation. Arrays of values x and y are used to approximate a function f such that y = f(x).

The function returned by this class employs interpolation in its call method to determine the value of new points.

The syntax is given below.

`class scipy.interpolate.interp1d(x, y, kind='linear', axis=-1, copy=True, bounds_error=None, fill_value=nan, assume_sorted=False)`

Where parameters are:

- **x (array_data):** A 1-D array of real values.
- **y (array_data):** An N-D array of real values. The length of y along the interpolation axis must match the length of x.
- **kind (str or int):** Specifies the kind of interpolation as a string, or the order of the spline interpolator to use as an integer. The string must be one of linear, nearest, nearest-up, zero, slinear, quadratic, cubic, previous, or next. “zero”, “slinear”, “quadratic”, and “cubic” refer to spline interpolation of zeroth, first, second, or third order. “previous” and “next” simply return the previous or next value of the point. “nearest-up” and “nearest” differ when interpolating half-integers (such as 0.5, 1.5): “nearest-up” rounds up and “nearest” rounds down. The default is linear.
- **axis (int):** Specifies the axis of y along which to interpolate. The default is the last axis of y.
- **copy (boolean):** If True, the class makes internal copies of x and y. If False, references to x and y are used. The default is to copy.
- **bounds_error (boolean):** If True, a ValueError is raised whenever interpolation is attempted on a value outside the range of x (where extrapolation would be necessary). If False, out-of-bounds values are assigned fill_value. By default, an error is raised unless fill_value="extrapolate" is specified.
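The differences between the `kind` options and the out-of-bounds behavior described above can be seen on a tiny data set. The sample points (`y = x**2` on four nodes) and variable names are assumptions for illustration only.

```python
import numpy as np
from scipy.interpolate import interp1d

x = np.array([0.0, 1.0, 2.0, 3.0])
y = x ** 2  # samples of f(x) = x**2

linear = interp1d(x, y)  # kind="linear" is the default
cubic = interp1d(x, y, kind="cubic")
nearest = interp1d(x, y, kind="nearest")
nearest_up = interp1d(x, y, kind="nearest-up")

# Evaluate every interpolator halfway between x=1 and x=2.
at_half = {
    "linear": float(linear(1.5)),
    "cubic": float(cubic(1.5)),
    "nearest": float(nearest(1.5)),
    "nearest-up": float(nearest_up(1.5)),
}

# By default, asking for a value outside [0, 3] raises a ValueError.
try:
    linear(4.0)
    out_of_range_raises = False
except ValueError:
    out_of_range_raises = True

# fill_value="extrapolate" continues the last linear segment instead.
extrap = interp1d(x, y, fill_value="extrapolate")
beyond = float(extrap(4.0))

print(at_half, out_of_range_raises, beyond)
```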

Let’s take an example by following the below steps:

Import the required libraries or methods using the below python code.

```
import numpy as np
import matplotlib.pyplot as plt
from scipy import interpolate
```

Create x and y data and interpolate using the below code.

```
x_ = np.arange(0, 15)
y_ = np.exp(-x_/4.0)
f_ = interpolate.interp1d(x_, y_)
```

Plot the computed values using the below code.

```
xnew_ = np.arange(0, 10, 0.1)
ynew_ = f_(xnew_)
plt.plot(x_, y_, 'o', xnew_, ynew_, '-')
plt.show()
```

This is how to use the method *interp1d()* of Python Scipy to compute interpolated values of a 1-D function.


## Python Scipy Smoothing Noisy Data

In Python Scipy, *LSQUnivariateSpline()* is another class for creating splines. As we shall see, it works in much the same way as *UnivariateSpline()*.

The main difference from the preceding class is that its *t* parameter lets you directly control the number and position of the knots used to build the spline curve.

The syntax is given below.

`class scipy.interpolate.LSQUnivariateSpline(x, y, t, w=None, bbox=[None, None], k=3, ext=0, check_finite=False)`

Where parameters are:

- **x (array_data):** Input dimension of the data points. Must be increasing.
- **y (array_data):** Input dimension of the data points.
- **t (array_data):** Interior knots of the spline. Must be in ascending order.
- **w (array_data):** Weights for spline fitting. Must be positive. If None (the default), all weights are equal.
- **bbox (array_data):** 2-sequence specifying the boundary of the approximation interval. If None, bbox is equal to [x[0], x[-1]].
- **k (int):** Degree of the smoothing spline. Must satisfy 1 <= k <= 5. The default is k = 3, a cubic spline.
- **ext (string, int):** Controls the extrapolation mode for elements outside the interval defined by the knot sequence.
  - if ext=0 or ‘extrapolate’, return the extrapolated value.
  - if ext=1 or ‘zeros’, return 0
  - if ext=2 or ‘raise’, raise a ValueError
  - if ext=3 or ‘const’, return the boundary value.
- **check_finite (boolean):** Whether to check that the input arrays contain only finite numbers. Disabling this may improve performance, but may cause problems (crashes, non-termination, or nonsensical results) if the inputs do contain infinities or NaNs. The default is False.
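A short sketch can show how the knots passed through `t` end up in the fitted spline, and how adding interior knots changes the fit. The seed, the two knot lists, and the variable names are assumptions for illustration.

```python
import numpy as np
from scipy.interpolate import LSQUnivariateSpline

rng = np.random.default_rng(1)
x = np.linspace(-4, 4, 50)
y = np.exp(-x ** 2) + 0.1 * rng.standard_normal(50)

# Same data, two different sets of interior knots.
coarse = LSQUnivariateSpline(x, y, t=[-1, 0, 1])
fine = LSQUnivariateSpline(x, y, t=[-2, -1, -0.5, 0, 0.5, 1, 2])

coarse_knots = coarse.get_knots()        # boundary knots plus the requested interior knots
coarse_residual = coarse.get_residual()  # weighted sum of squared residuals
fine_residual = fine.get_residual()

print(coarse_knots)
print(coarse_residual, fine_residual)
```

Because the finer knot set contains the coarser one, the finer spline can follow the data at least as closely; whether that is desirable depends on how much of the variation is noise.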

Let’s take an example by following the below steps:

Import the required libraries or methods using the below python code.

```
import numpy as np
import matplotlib.pyplot as plt
from scipy import interpolate
```

Create x and y, then plot them using the code below.

```
rng_ = np.random.default_rng()
x_ = np.linspace(-4, 4, 50)
y_ = np.exp(-x_**2) + 0.1 * rng_.standard_normal(50)
plt.plot(x_, y_, 'ro', ms=5)
```

Fit a smoothing spline with predetermined internal knots using the below code.

```
t_ = [-1, 0, 1]
spline = interpolate.LSQUnivariateSpline(x_, y_, t_)
xs_ = np.linspace(-4, 4, 1000)
plt.plot(x_, y_, 'ro', ms=5)
plt.plot(xs_, spline(xs_), 'g-', lw=3)
plt.show()
```

This is how to create a smooth curve by removing noise from noisy data using the method *LSQUnivariateSpline()* of Python Scipy.


## Python Scipy Smoothing Factor

The class *scipy.interpolate.UnivariateSpline()* has a method *set_smoothing_factor(s)* that continues the spline computation using the knots found in the previous call and the given smoothing factor **s**.
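One way to see what the smoothing factor controls is to inspect the number of knots and the residual before and after calling `set_smoothing_factor()`. In the sketch below, the seed and the two values of `s` are assumptions chosen for illustration.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(7)
x = np.linspace(-4, 4, 50)
y = np.exp(-x ** 2) + 0.1 * rng.standard_normal(50)

spline = UnivariateSpline(x, y, s=5.0)  # loose fit: large allowed residual
before = (len(spline.get_knots()), spline.get_residual())

spline.set_smoothing_factor(0.5)        # tighter fit: smaller allowed residual
after = (len(spline.get_knots()), spline.get_residual())

print("knots, residual before:", before)
print("knots, residual after: ", after)
```

A smaller `s` asks for a smaller residual, so the spline may need more knots to follow the data more closely.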

Let’s take an example and use the method *set_smoothing_factor()* by following the below steps:

Import the required libraries or methods using the below python code.

```
import numpy as np
import matplotlib.pyplot as plt
from scipy import interpolate
```

Generate x and y, and plot them using the below code.

```
x_ = np.linspace(-4, 4, 50)
y_ = np.sin(x_) + np.random.random(50) * 0.8
plt.plot(x_, y_, 'ro', ms=5)
```

Using the code below, smooth the data using the *UnivariateSpline()* class with the default parameter values.

```
spline = interpolate.UnivariateSpline(x_, y_)
xs_ = np.linspace(-4, 4, 500)
plt.plot(xs_, spline(xs_), 'g', lw=2)
```

Now use the method `set_smoothing_factor(0.7)` to adjust the smoothness of the data using the below code.

```
spline.set_smoothing_factor(0.7)
plt.plot(xs_, spline(xs_), 'b', lw=3)
plt.show()
```

This is how to adjust the smoothness of the data using the method *set_smoothing_factor()* of Python Scipy.


## Python Scipy Smoothing 2d Data

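Python Scipy has a class *interp2d()* in the module `scipy.interpolate` that interpolates over a 2-D grid. Arrays of values x, y, and z are used to approximate a function f, where z = f(x, y) yields a scalar value z. Note that `interp2d` was deprecated in SciPy 1.10 and removed in SciPy 1.14, so the example in this section requires an older SciPy release.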

This class gives a function that uses spline interpolation in its call method to determine the value of newly created points.

The bare minimum of data points needed along the axis of interpolation is (k+1)**2, where k is equal to 1 for linear interpolation, 3 for cubic interpolation, and 5 for quintic interpolation.

The interpolator is built with `bisplrep`, using a smoothing factor of 0. Using `bisplrep` directly is recommended if more control over smoothing is required.
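As a rough sketch of using `bisplrep` directly with a non-zero smoothing factor, the code below fits a smooth surface to noisy grid data and evaluates it with `bisplev`. The test surface, noise level, seed, and the rule of thumb used for `s` are all assumptions for this illustration, not recommendations from the SciPy documentation.

```python
import numpy as np
from scipy.interpolate import bisplrep, bisplev

rng = np.random.default_rng(3)
x = np.linspace(0, 3, 20)
y = np.linspace(0, 3, 20)
xx, yy = np.meshgrid(x, y, indexing="ij")  # zz[i, j] corresponds to (x[i], y[j])
zz = np.sin(xx) * np.cos(yy) + 0.05 * rng.standard_normal(xx.shape)

# s > 0 lets the surface deviate from the noisy samples. Here s is set to
# roughly (number of points) * (noise variance), an assumption for this sketch.
s = xx.size * 0.05 ** 2
tck = bisplrep(xx.ravel(), yy.ravel(), zz.ravel(), kx=3, ky=3, s=s)

x_new = np.linspace(0, 3, 50)
y_new = np.linspace(0, 3, 50)
z_new = bisplev(x_new, y_new, tck)         # shape (len(x_new), len(y_new))

truth = np.sin(x_new)[:, None] * np.cos(y_new)[None, :]
rms_error = float(np.sqrt(np.mean((z_new - truth) ** 2)))
print(z_new.shape, rms_error)
```

Setting `s=0` would instead make the surface pass through every noisy sample.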

The syntax is given below.

`class scipy.interpolate.interp2d(x, y, z, kind='linear', copy=True, bounds_error=False, fill_value=None)`

Where parameters are:

- **x, y (array_data):** Arrays defining the coordinates of the data points. If the points lie on a regular grid, x can specify the column coordinates and y the row coordinates.
- **z (array_data):** The values of the function to interpolate at the data points. If z is a multidimensional array, it is flattened before use, assuming Fortran ordering (order=’F’). The length of the flattened z array is len(x)*len(y) if x and y specify the column and row coordinates, or len(z) == len(x) == len(y) if x and y specify the coordinates of each point.
- **kind (‘linear’, ‘cubic’, ‘quintic’):** The kind of spline interpolation to use. The default is ‘linear’.
- **copy (boolean):** If True, the class makes internal copies of x, y, and z. If False, references may be used. The default is to copy.
- **bounds_error (boolean):** If True, a ValueError is raised whenever interpolated values are requested outside the domain of the input data (x, y). If False, fill_value is used.
- **fill_value (number):** If provided, the value to use for points outside the interpolation domain. If omitted (None), values outside the domain are extrapolated using nearest-neighbor extrapolation.

Let’s take an example by following the below steps:

Import the required libraries or methods using the below python code.

```
import numpy as np
import matplotlib.pyplot as plt
from scipy import interpolate
```

Create a 2-dimensional grid using the below code.

```
x_ = np.arange(-4.01, 4.01, 0.20)
y_ = np.arange(-4.01, 4.01, 0.20)
xx_, yy_ = np.meshgrid(x_, y_)
z_ = np.sin(xx_**2+yy_**2)
```

Interpolate the above-created data using the below code.

`f_ = interpolate.interp2d(x_, y_, z_, kind='cubic')`

Plot the outcome using the interpolation function we just obtained using the below code:

```
xnew_ = np.arange(-4.01, 4.01, 1e-2)
ynew_ = np.arange(-4.01, 4.01, 1e-2)
znew_ = f_(xnew_, ynew_)
plt.plot(x_, z_[0, :], 'ro-', xnew_, znew_[0, :], 'b-')
plt.show()
```

This is how to interpolate 2-D data smoothly using the method *interp2d()* of Python Scipy.


In this Python tutorial, we learned how to make smooth curves using different filters and methods, and how to remove noise from data, by covering the following topics.

- What is Data Smoothing?
- Python Scipy Smoothing Spline
- How to use the filter for smoothing
- How to smooth the 1d data
- How to remove noise from the data and make it smooth
- How to control the smoothness using the method of smoothing factor
- Python Scipy Smoothing 2d Data

I am Bijay Kumar, a Microsoft MVP in SharePoint. Apart from SharePoint, I have been working with Python, machine learning, and artificial intelligence for the last 5 years. During this time I have gained expertise in various Python libraries such as Tkinter, Pandas, NumPy, Turtle, Django, Matplotlib, Tensorflow, Scipy, and Scikit-Learn, among others, working for clients in the United States, Canada, the United Kingdom, Australia, New Zealand, and elsewhere. Check out my profile.