Articulatory Feature Based Automatic Speech Recognition Using Neural Network

UIU Institutional Repository



    View/Open
    articulatory feature based ASR system.pdf (3.785Mb)
    Date
    2018-07-06
    Author
    Ifrat, Kazi
    Israt, Kazi
    Saimun, Imran Hossain
    Akter, Fahima
    Abstract
    Many automatic speech recognition (ASR) systems based on hidden Markov models (HMMs) have been developed over the years. Most of them use mel-frequency cepstral coefficients (MFCCs) of the speech signal, which capture the time-frequency distribution of signal energy. The main purpose of this research is to improve the performance of ASR systems by introducing articulatory information, which describes features of speech production rather than features of the acoustic signal. Articulatory information is represented in terms of articulatory features (AFs). A phoneme can be identified by its unique AF set, which comprises the manner of articulation (vocalic, consonantal, continuant, etc.) and the place of articulation (tongue position: high, low, front, back, etc.). The use of AFs in ASR has been investigated previously and has been actively discussed in recent years. However, AFs are not widely used in place of MFCCs because AFs alone cannot provide sufficient performance.

    This thesis presents a method to extract these features (referred to as DPFs) using multilayer neural networks (MLNs). Since AFs are designed with speech production fully in mind, the AF space represents well the distances among phonemes according to their articulatory differences, which suggests that DPFs are efficient feature parameters for ASR. In the AF extractor, the MLN takes 39-dimensional MFCC vectors as input and outputs 22-dimensional AF vectors.

    In this work a new Bangla speech corpus with proper transcriptions has been developed, and various acoustic feature extraction methods have been investigated to find their effective integration into a Bangla ASR system. The use of multiple acoustic features of the speech signal is considered for Bangla speech recognition. The features are a sequence of representative vectors extracted from the speech signal, and the classes are either words or sub-word units such as phonemes. The Bangla ASR system developed in this work models the probability distribution of the feature vectors with HMMs and uses the Viterbi rule for classification. Experimental results are presented for a medium-sized database of female speech samples, and the performance of the individual methods is compared. The proposed method is found to outperform the existing standard MFCC-based method.
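    The AF extractor described above is, in essence, a feed-forward network mapping 39-dimensional MFCC frames to 22-dimensional AF vectors. The sketch below illustrates that mapping; only the input and output dimensions come from the abstract, while the hidden-layer size, activations, and the use of PyTorch are illustrative assumptions rather than details taken from the thesis.

    # Minimal sketch of an MLN-style AF extractor (assumed architecture;
    # only the 39-dim MFCC input and 22-dim AF output come from the abstract).
    import torch
    import torch.nn as nn

    class AFExtractor(nn.Module):
        def __init__(self, n_mfcc: int = 39, n_hidden: int = 256, n_af: int = 22):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(n_mfcc, n_hidden),  # 39-dimensional MFCC frame in
                nn.Sigmoid(),
                nn.Linear(n_hidden, n_af),    # 22-dimensional AF vector out
                nn.Sigmoid(),                 # each AF treated as a [0, 1] activation
            )

        def forward(self, mfcc_frames: torch.Tensor) -> torch.Tensor:
            # mfcc_frames: (n_frames, 39) -> (n_frames, 22)
            return self.net(mfcc_frames)

    # Example: map a batch of 10 MFCC frames to AF vectors.
    extractor = AFExtractor()
    af_vectors = extractor(torch.randn(10, 39))
    print(af_vectors.shape)  # torch.Size([10, 22])

    The recognition step models feature-vector distributions with HMMs and classifies with the Viterbi rule. The generic Viterbi decoder below, written in log-probability space, is included only to make that decoding rule concrete; the state inventory, transition probabilities, and emission models used in the thesis are not specified here.

    # Generic Viterbi decoding over an HMM in log-probability space
    # (a sketch of the decoding rule, not the thesis's actual decoder).
    import numpy as np

    def viterbi(log_init, log_trans, log_emit):
        """Return the most likely state sequence.

        log_init:  (S,)   log prior over states
        log_trans: (S, S) log transition matrix, rows indexed by previous state
        log_emit:  (T, S) log likelihood of each frame under each state
        """
        T, S = log_emit.shape
        score = log_init + log_emit[0]          # best log-prob of paths ending in each state
        back = np.zeros((T, S), dtype=int)      # backpointers
        for t in range(1, T):
            cand = score[:, None] + log_trans   # (previous state, next state)
            back[t] = cand.argmax(axis=0)
            score = cand.max(axis=0) + log_emit[t]
        # Trace the best path backwards from the best final state.
        path = [int(score.argmax())]
        for t in range(T - 1, 0, -1):
            path.append(int(back[t, path[-1]]))
        return path[::-1]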
    URI
    http://dspace.uiu.ac.bd/handle/52243/321
    Collections
    • B.Sc Thesis/Project [82]


    Copyright 2003-2017 United International University