Contents
1 Project description
1.1 Filter bank analysis
1.2 Conversion of discrete Fourier transform
1.3 Windowing of a signal
1.4 Spectrogram of signal
2 Code
3 Output
4 References
SPECTRAL ANALYSIS OF SPEECH SIGNAL
INTRODUCTION:
Spectral analysis is an elementary operation in speech recognition. Speech
recognition is computationally heavy because each analysis window contains a
large number of samples. Methods based on the Fourier transform are commonly
used, and one of the most widely used is the Fast Fourier Transform (FFT). The
FFT is a basic digital signal processing technique for spectrum analysis, and
it is often used to compute numerical approximations to the continuous Fourier
transform. Another transform is the Discrete Cosine Transform (DCT). The
Discrete Tchebichef Transform (DTT) is a further method, based on discrete
Tchebichef polynomials. The DTT has lower computational complexity and, unlike
continuous orthonormal transforms, does not require a complex transform.
Preliminary experimental results show that the DTT has the potential to be a
simpler and faster transform for speech recognition.
OBJECTIVES:
Load, display, and manipulate speech signals in both the time domain and the
frequency domain.
MODULES:
1. Build and perform a filter bank analysis of a speech signal.
2. Use the discrete Fourier transform to convert a waveform to a spectrum and
vice versa.
3. Divide a signal into overlapping windows.
4. Compute and display a spectrogram.
Time domain & Frequency domain:
Time-domain analysis studies mathematical functions, physical signals, or time
series of economic or environmental data with respect to time. In the time
domain, the value of the signal or function is known for all real numbers in
the case of continuous time, or at separate instants in the case of discrete
time. Frequency-domain analysis studies mathematical functions or signals with
respect to frequency rather than time.
A time-domain graph shows how a signal changes over time, whereas a
frequency-domain graph shows how much of the signal lies within each frequency
band over a range of frequencies. A frequency-domain representation can also
include the phase shift that must be applied to each sinusoid in order to
recombine the frequency components and recover the original time signal.
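The role of phase in recovering the time signal can be illustrated with a small round trip. This is a minimal sketch in plain Python rather than the MATLAB used in the report; the helper name dft and the eight sample values are made up for illustration, and the direct O(N^2) sum stands in for a real FFT.

```python
import cmath
import math

def dft(x, inverse=False):
    """Direct DFT (or inverse DFT) of a sequence; O(N^2), for illustration only."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = [
        sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # The inverse transform carries the 1/N scaling.
    return [v / n for v in out] if inverse else out

# Hypothetical sample values, chosen only for the demonstration.
signal = [0.0, 1.0, 0.5, -0.25, -1.0, 0.3, 0.8, -0.6]

spectrum = dft(signal)

# Magnitude and phase together recover the waveform.
recovered = [v.real for v in dft(spectrum, inverse=True)]

# Keeping only the magnitudes (phase discarded) gives a different waveform.
magnitude_only = [abs(c) for c in spectrum]
without_phase = [v.real for v in dft(magnitude_only, inverse=True)]

print(recovered)
print(without_phase)
```

The first printed list matches the input to within rounding error, while the second does not, which is why the phase spectrum has to be kept if the time signal is to be rebuilt.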
Filter bank analysis:
The most flexible way to perform spectral analysis is to use a bank of
band-pass filters. A filter bank can be designed to provide a spectral
analysis with any degree of frequency resolution (wide or narrow), even with
non-linear filter spacing and bandwidths. A disadvantage of filter banks is
that they almost always take more calculation and processing time than
discrete Fourier analysis using the FFT.
To use a filter bank for analysis we need one band-pass filter per channel to
do the filtering, a means of rectification, and a low-pass filter to smooth
the energies. In this example, we build a 19-channel filter bank using
bandwidths that are modelled on human auditory bandwidths. We rectify and
smooth the filtered energies and convert them to a decibel scale. A band-pass
filter is a device that passes frequencies within a certain range and rejects
frequencies outside that range.
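The three stages described above (band-pass filtering, rectification, low-pass smoothing, then conversion to decibels) can be sketched per channel as follows. This is a minimal plain-Python sketch, not the report's MATLAB code. The second-order band-pass design, the Q value, the 50 Hz smoothing cutoff and the three centre frequencies are all assumptions made for illustration; they are not the 19 auditory-based channels of the report.

```python
import math

def bandpass_coeffs(fc, q, fs):
    """Second-order band-pass with 0 dB peak gain (common biquad 'cookbook' form)."""
    w0 = 2.0 * math.pi * fc / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)
    a = (1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a

def channel_db(x, fc, fs, q=4.0, smooth_hz=50.0):
    """Band-pass, full-wave rectify, one-pole low-pass smooth, convert to dB."""
    b, a = bandpass_coeffs(fc, q, fs)
    k = 1.0 - math.exp(-2.0 * math.pi * smooth_hz / fs)  # smoothing coefficient
    x1 = x2 = y1 = y2 = env = 0.0
    out = []
    for xn in x:
        # Band-pass filter (direct form I difference equation).
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1, y2, y1 = x1, xn, y1, yn
        # Rectify with abs(), then smooth with a one-pole low-pass filter.
        env += k * (abs(yn) - env)
        # Decibel scale, floored to avoid log10(0).
        out.append(20.0 * math.log10(max(env, 1e-10)))
    return out

# Hypothetical three-channel bank driven by a 1000 Hz test tone.
fs = 8000.0
centers = [250.0, 1000.0, 3000.0]
tone = [math.sin(2.0 * math.pi * 1000.0 * n / fs) for n in range(4000)]
bank = [channel_db(tone, fc, fs) for fc in centers]
final_db = [channel[-1] for channel in bank]
print(final_db)
```

With the test tone at 1000 Hz, the middle channel settles at a much higher level than the other two. A full analysis would use one such channel per band, with centre frequencies and bandwidths chosen to follow the auditory scale.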
"Band pass" is an adjective that describes a type of filter or filtering
process; it is to be distinguished from "pass band", which refers to the
actual portion of the spectrum affected. Hence, one might say "A dual band
pass filter has two pass bands." A band pass signal is a signal containing a
band of frequencies not adjacent to zero frequency, such as a signal that
comes out of a band pass filter.
Spectral analysis using Fourier transforms:
The discrete-time, discrete-frequency version of the Fourier transform (DFT)
converts an array of N sample amplitudes to an array of N complex harmonic
amplitudes. If the sampling rate is fs, the N input samples are 1/fs seconds
apart, and the output harmonic frequencies are fs/N Hertz apart. That is, the
N output amplitudes are evenly spaced at frequencies between 0 and (N-1)fs/N
Hertz. Perform the DFT on the speech signal. Use power-of-two sizes such as
512 or 1024 for the fastest speed. Plot and display the magnitude and phase
spectrum.
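The bin spacing of fs/N Hertz can be checked numerically. This is a minimal sketch in plain Python with a direct O(N^2) DFT rather than an FFT; the helper names dft and bin_frequencies, the 8000 Hz sampling rate and the 1000 Hz test tone are assumptions chosen for illustration.

```python
import cmath
import math

def dft(x):
    """Direct DFT: N samples in, N complex harmonic amplitudes out."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def bin_frequencies(n, fs):
    """Frequency in Hz of each DFT output: 0, fs/N, 2*fs/N, ..., (N-1)*fs/N."""
    return [k * fs / n for k in range(n)]

fs = 8000.0   # assumed sampling rate
n = 64        # transform size, so bins are fs/n = 125 Hz apart
x = [math.cos(2.0 * math.pi * 1000.0 * t / fs) for t in range(n)]

spectrum = dft(x)
freqs = bin_frequencies(n, fs)

# Search only the first half: for a real signal the second half mirrors it.
peak = max(range(n // 2), key=lambda k: abs(spectrum[k]))
print(freqs[1], freqs[peak])
```

Here the strongest component falls in the bin whose frequency is 1000 Hz, eight bins of 125 Hz above zero.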
To compute the DFT in MATLAB, we use the function fft(x,n). This function
takes a waveform x and the number of samples n. When n is less than the length
of x, x is truncated; when n is greater than the length of x, x is padded with
zeros. The output is an array of complex amplitudes of length n. You can
obtain the magnitude of each spectral component with abs(), and its phase with
angle() (result in radians).
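The truncate-or-pad behaviour, and the magnitude and phase steps, can be mimicked in plain Python. This is a sketch only: fft_like is a hypothetical helper that copies the described behaviour with a direct DFT, and cmath.phase is used here as the counterpart of MATLAB's angle().

```python
import cmath
import math

def fft_like(x, n):
    """Length-n DFT of x: x is truncated if longer than n, zero-padded if shorter."""
    x = list(x[:n]) + [0.0] * max(0, n - len(x))
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

# One cycle of a sine wave over 8 samples.
wave = [math.sin(2.0 * math.pi * t / 8) for t in range(8)]
spectrum = fft_like(wave, 8)

magnitude = [abs(c) for c in spectrum]        # size of each component
phase = [cmath.phase(c) for c in spectrum]    # radians; not meaningful where magnitude is ~0

padded = fft_like([1.0, 2.0, 3.0], 8)                 # 3 samples padded to 8
truncated = fft_like([1.0, 2.0, 3.0, 4.0, 5.0], 4)    # 5 samples cut to 4

print(magnitude[1], phase[1])
print(len(padded), len(truncated))
```

For the sine input, the component at bin 1 has a phase of minus a quarter cycle, which is what distinguishes it from a cosine of the same frequency.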
Windowing a signal:
It is often desirable to analyze a long signal in overlapping short sections
called “windows”, for example to calculate an average spectrum or a
spectrogram. Unfortunately, we cannot simply chop the signal into short
pieces, because this causes sharp discontinuities at the edges of each
section. Instead, it is preferable to have smooth joins between sections.
Raised cosine windows are a popular shape for the join.
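Splitting a signal into overlapping raised-cosine sections can be sketched as follows, in plain Python. The helper names, the section size of 8 and the hop of 4 (50% overlap) are assumptions for illustration, and the window is the periodic form of the raised cosine (Hann) shape.

```python
import math

def raised_cosine(size):
    """Periodic raised-cosine (Hann) window: starts at 0 and peaks at the centre."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / size) for i in range(size)]

def overlapping_sections(s, size, hop):
    """Cut s into sections of `size` samples, `hop` samples apart, each tapered."""
    w = raised_cosine(size)
    return [
        [w[i] * s[start + i] for i in range(size)]
        for start in range(0, len(s) - size + 1, hop)
    ]

# A constant signal makes the taper itself visible in each section.
s = [1.0] * 16
sections = overlapping_sections(s, size=8, hop=4)

# Where two neighbouring sections overlap, their tapers add back up to the input.
overlap_sum = [sections[0][4 + i] + sections[1][i] for i in range(4)]
print(len(sections), overlap_sum)
```

With this window and a hop of half the section size, the tapered sections sum back to the original signal in the overlap regions, so no part of the signal is over- or under-weighted.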
The speech signal is constantly changing (non-stationary), whereas signal
processing algorithms usually assume that the signal is stationary.
Piecewise stationarity: model the speech signal as a sequence of frames (each
assumed to be stationary).
Windowing: multiply the full waveform s[n] by a window w[n] (in the time
domain):
x[n] = w[n]s[n]
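Applying x[n] = w[n]s[n] frame by frame and taking the DFT magnitude of each frame gives the rows of a spectrogram. This is a minimal plain-Python sketch, assuming a raised-cosine window, a frame size of 64, a hop of 32 and a two-tone test signal, none of which come from the report. It returns numbers only and does not plot.

```python
import cmath
import math

def spectrogram_db(s, size, hop):
    """One row per frame: magnitude spectrum in dB for bins 0 .. size/2."""
    w = [0.5 - 0.5 * math.cos(2.0 * math.pi * i / size) for i in range(size)]
    rows = []
    for start in range(0, len(s) - size + 1, hop):
        # Windowing in the time domain: x[n] = w[n] * s[n]
        x = [w[i] * s[start + i] for i in range(size)]
        row = []
        for k in range(size // 2 + 1):
            c = sum(x[n] * cmath.exp(-2j * math.pi * k * n / size) for n in range(size))
            row.append(20.0 * math.log10(max(abs(c), 1e-10)))  # floor avoids log10(0)
        rows.append(row)
    return rows

# Test signal: 1000 Hz for the first half, 2000 Hz for the second half.
fs = 8000.0
s = [
    math.sin(2.0 * math.pi * (1000.0 if t < 256 else 2000.0) * t / fs)
    for t in range(512)
]

rows = spectrogram_db(s, size=64, hop=32)
first_peak = rows[0].index(max(rows[0]))
last_peak = rows[-1].index(max(rows[-1]))
print(len(rows), first_peak * fs / 64, last_peak * fs / 64)
```

Each frame is treated as stationary, so the change of frequency over time shows up as the peak moving from one bin to another between the first and last rows.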