Speech Recognition System Building Based On Finite State Graphs

Posted on: 2012-05-01
Degree: Master
Type: Thesis
Country: China
Candidate: J Xiao
GTID: 2248330362968065
Subject: Information and Communication Engineering
Abstract/Summary:
Pioneered by Mohri and others at AT&T, the application of weighted finite-state transducers (WFSTs) to large vocabulary continuous speech recognition has attracted many research groups around the world. In recent years, a number of speech recognition systems using the WFST approach have emerged, for example at IBM, AT&T, Titech, and IDIAP. The WFST approach consists of two stages: compiling the search network and performing the search.

This thesis presents our effort to build a state-of-the-art WFST-based speech recognition system. It is concerned with the first stage, namely the compilation of a static search network for both 1-best and lattice generation on English and Mandarin spontaneous speech recognition tasks. We use the standard WFST representations and operations when compiling the search network. The compiled WFST network is then converted to an equivalent finite-state graph (FSG) representation, which is better suited to Viterbi decoding and more compact in memory. The FSG search network is then fed to GrpDecoder, a fast, scalable, and flexible decoder developed by our lab, to produce the experimental results.

Experiments on the 1-best and lattice tasks are carried out separately for the two languages, English and Mandarin. The results show that the new system is superior to HTK and to a traditional two-pass system on the 1-best task. Although work on the lattice task is still at an early stage, some experiments have already been carried out. Their results show that the density of the lattices generated by the new system is much lower than that of the lattices produced by HTK and the two-pass system, which means that the lattices generated by the new system are more compact and efficient.
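As an illustration of the kind of "standard WFST operation" used when compiling a search network, the following is a minimal sketch of composition in the tropical semiring, where weights add along a path. It is not the thesis's implementation: the dictionary layout, the `compose` function, and the toy lexicon `L` and grammar `G` are all hypothetical, and the sketch assumes epsilon-free transducers. Real compilation also has to handle epsilon labels and usually applies further optimizations such as determinization and minimization.

```python
from collections import defaultdict

# Hypothetical plain-dict transducer layout (not any toolkit's API):
#   {"start": state,
#    "finals": {state: final_weight},
#    "arcs": {state: [(in_label, out_label, weight, next_state), ...]}}

def compose(a, b):
    """Epsilon-free composition of two WFSTs in the tropical semiring."""
    start = (a["start"], b["start"])
    arcs = defaultdict(list)
    finals = {}
    stack, seen = [start], {start}
    while stack:
        state = stack.pop()
        qa, qb = state
        # A pair state is final only if both component states are final.
        if qa in a["finals"] and qb in b["finals"]:
            finals[state] = a["finals"][qa] + b["finals"][qb]
        # Match output labels of `a` against input labels of `b`.
        for in_label, mid, w1, next_a in a["arcs"].get(qa, []):
            for mid2, out_label, w2, next_b in b["arcs"].get(qb, []):
                if mid == mid2:
                    nxt = (next_a, next_b)
                    arcs[state].append((in_label, out_label, w1 + w2, nxt))
                    if nxt not in seen:
                        seen.add(nxt)
                        stack.append(nxt)
    return {"start": start, "finals": finals, "arcs": dict(arcs)}

# Toy lexicon: one (made-up) phone symbol per word, looping on state 0.
L = {"start": 0, "finals": {0: 0.0},
     "arcs": {0: [("ey", "A", 0.0, 0), ("bee", "B", 0.5, 0)]}}

# Toy grammar: accepts exactly the word sequence "A B".
G = {"start": 0, "finals": {2: 0.0},
     "arcs": {0: [("A", "A", 1.0, 1)], 1: [("B", "B", 2.0, 2)]}}

LG = compose(L, G)
for state, out_arcs in LG["arcs"].items():
    for arc in out_arcs:
        print(state, arc)
```

The composed network maps phone sequences directly to word sequences, with each arc carrying the combined lexicon and grammar weight. Only pair states reachable from the start state are created, which keeps the result small in this toy case.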
Keywords/Search Tags: Speech recognition, WFST, finite-state graph