Search results
RWTH ASR has been developed by the Human Language Technology and Pattern Recognition Group at RWTH Aachen University. It includes tools for the development of acoustic models and decoders, as well as components for speaker adaptation, speaker adaptive training, unsupervised training, discriminative training, and word lattice processing. [1]
Speaker recognition is a pattern recognition problem. The various technologies used to process and store voice prints include frequency estimation, hidden Markov models, Gaussian mixture models, pattern matching algorithms, neural networks, matrix representation, vector quantization and decision trees.
Kaldi is an open-source toolkit written in C++ for speech recognition and signal processing, freely available under the Apache License v2.0. Kaldi aims to provide software that is flexible and extensible, [2] and is intended for use by automatic speech recognition (ASR) researchers building recognition systems.
Sound recognition is a technology based on both traditional pattern recognition theory and audio signal analysis methods. Sound recognition systems comprise preliminary data processing, feature extraction, and classification algorithms; the classifier assigns the extracted feature vectors to sound classes.
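The three stages named in this snippet (preliminary processing, feature extraction, classification) can be illustrated with a minimal Python sketch. Everything in it is an assumption chosen for brevity rather than a description of any particular system: the function names are hypothetical, the features are RMS energy and zero-crossing rate, and the classifier is a nearest-centroid rule over synthetic sine tones.

```python
import math

def preprocess(signal):
    """Preliminary processing: remove the DC offset and scale to unit peak."""
    mean = sum(signal) / len(signal)
    centered = [x - mean for x in signal]
    peak = max(abs(x) for x in centered) or 1.0  # avoid dividing by zero on silence
    return [x / peak for x in centered]

def extract_features(signal):
    """Feature extraction: (RMS energy, zero-crossing rate) as a 2-D vector."""
    rms = math.sqrt(sum(x * x for x in signal) / len(signal))
    crossings = sum(1 for a, b in zip(signal, signal[1:]) if (a < 0) != (b < 0))
    return (rms, crossings / (len(signal) - 1))

def classify(features, centroids):
    """Classification: return the label whose centroid is nearest (Euclidean)."""
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))

def tone(freq_hz, rate=8000, n=800):
    """Synthetic test signal (assumed for illustration): a pure sine tone."""
    return [math.sin(2 * math.pi * freq_hz * t / rate) for t in range(n)]

# One reference example per class stands in for a trained model.
centroids = {
    "low": extract_features(preprocess(tone(120))),
    "high": extract_features(preprocess(tone(2100))),
}

def recognize(signal):
    """Run the full pipeline on a raw signal."""
    return classify(extract_features(preprocess(signal)), centroids)
```

Calling recognize(tone(1700)) runs an unseen tone through all three stages. Practical systems typically replace these toy features with spectral ones and the centroid rule with a trained statistical model.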
Microsoft Kinect includes built-in software that allows speech recognition of commands. Older generations of Nokia phones, such as the Nokia N Series (before the move to Windows Phone 7), used speech recognition to match family names from the contact list and a few commands.
Modular Audio Recognition Framework (MARF) is an open-source research platform and a collection of voice, sound, speech, text, and natural language processing (NLP) algorithms written in Java and arranged into a modular and extensible framework that attempts to facilitate the addition of new algorithms.
... multi-track audio editor intended as a replacement for Cubase-like software; GPL-2.0-or-later
MusE: Yes, No, No; Qt; MIDI sequencer; GPL-2.0-or-later
Qtractor: Yes, No, No; Qt; a non-destructive multi-track audio and MIDI workstation; GPL-2.0-or-later
Rosegarden: Chris Cannam; Yes, No, No; Qt; MIDI sequencer and multi-track recorder; GPL-2.0-or-later
SoX ...
Sphinx is a continuous-speech, speaker-independent recognition system that uses hidden Markov acoustic models and an n-gram statistical language model. It was developed by Kai-Fu Lee. Sphinx demonstrated the feasibility of continuous-speech, speaker-independent, large-vocabulary recognition, the possibility of which was in dispute at the time (1986).
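To make the term "n-gram statistical language model" concrete, here is a minimal Python sketch of a bigram model (n = 2) with add-one smoothing. It is an illustration only and is not taken from Sphinx: the function names, the toy corpus, and the choice of smoothing are all assumptions.

```python
from collections import Counter

def train_bigram(sentences):
    """Count histories and bigrams over tokenized sentences with boundary markers."""
    history, bigrams = Counter(), Counter()
    for words in sentences:
        tokens = ["<s>"] + words + ["</s>"]
        history.update(tokens[:-1])                   # every token that has a successor
        bigrams.update(zip(tokens[:-1], tokens[1:]))  # adjacent word pairs
    vocab = {w for s in sentences for w in s} | {"</s>"}
    return history, bigrams, vocab

def bigram_prob(prev, word, model):
    """Add-one smoothed estimate of P(word | prev)."""
    history, bigrams, vocab = model
    return (bigrams[(prev, word)] + 1) / (history[prev] + len(vocab))

def sentence_prob(words, model):
    """Probability of a whole sentence as a product of bigram probabilities."""
    tokens = ["<s>"] + words + ["</s>"]
    p = 1.0
    for prev, word in zip(tokens[:-1], tokens[1:]):
        p *= bigram_prob(prev, word, model)
    return p

# Toy corpus, assumed for illustration.
corpus = [
    ["recognize", "speech"],
    ["recognize", "speech"],
    ["wreck", "a", "nice", "beach"],
]
model = train_bigram(corpus)
```

With this model, sentence_prob(["recognize", "speech"], model) comes out higher than sentence_prob(["speech", "recognize"], model), which is how a recognizer can use a language model to prefer word orders seen in training when the acoustic evidence is ambiguous.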