mirror of https://github.com/cmusphinx/sphinxtrain.git synced 2026-05-17 13:10:52 +00:00

Kevin Lenzo b0a23541c6 Add vocab_dict filter and optional 00a.vocab_dict script

Python reduces the dictionary to words in the transcript vocabulary while keeping pronunciation variants. Perl script is invoked from the optional 00a.vocab_dict Makefile step; uses python like other SphinxTrain drivers. Define CFG_VOCAB_DICT and CFG_VOCAB_DICTIONARY in sphinx_train.cfg.

2026-04-08 13:52:36 -04:00
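
The filtering described in the commit message above can be sketched roughly as follows. This is not the script added by the commit. The function names `transcript_vocab` and `filter_dict` are hypothetical, and the file formats are assumptions: transcript lines of the form `<s> WORD WORD </s> (utt_id)`, and dictionary lines of the form `WORD PHONE PHONE ...`, with alternate pronunciations written as `WORD(2)`, `WORD(3)`, and so on.

```python
import re

# Assumed convention: "WORD(2)", "WORD(3)", ... mark alternate pronunciations.
_VARIANT = re.compile(r"\(\d+\)$")


def transcript_vocab(lines):
    """Collect the set of tokens in transcript lines.

    A trailing "(utt_id)" token, if present, is dropped (assumed format).
    """
    vocab = set()
    for line in lines:
        tokens = line.split()
        if tokens and tokens[-1].startswith("(") and tokens[-1].endswith(")"):
            tokens = tokens[:-1]
        vocab.update(tokens)
    return vocab


def filter_dict(dict_lines, vocab):
    """Keep dictionary entries whose base word is in vocab.

    The variant suffix is stripped before the lookup, so every
    pronunciation variant of a kept word survives.
    """
    kept = []
    for line in dict_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        base = _VARIANT.sub("", fields[0])
        if base in vocab:
            kept.append(line.rstrip("\n"))
    return kept
```

For example, with a vocabulary built from `<s> HELLO WORLD </s> (utt_001)`, the entries `HELLO` and `HELLO(2)` are both kept, while an entry for a word that never appears in the transcripts is dropped. The sketch compares words case-sensitively; a real filter might need to normalize case to match the training configuration.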

| Name | Last commit | Date |
| --- | --- | --- |
| cmusphinx | Add vocab_dict filter and optional 00a.vocab_dict script | 2026-04-08 13:52:36 -04:00 |
| test | build: reorganize and modernize cmusphinx python module | 2022-07-05 22:25:50 -04:00 |
| CMakeLists.txt | build: install everything, we hope (scripts will be in share not lib) | 2022-10-12 19:17:29 -04:00 |
| README.md | build: reorganize and modernize cmusphinx python module | 2022-07-05 22:25:50 -04:00 |
| setup.cfg | fix: oops, duplicate key in setup.cfg | 2022-07-05 22:28:57 -04:00 |
| setup.py | build: reorganize and modernize cmusphinx python module | 2022-07-05 22:25:50 -04:00 |

README.md

CMU Sphinx Python Modules

This is a collection of modules for speech processing, generally oriented towards CMU Sphinx file formats. There is a wide variety of stuff here, mostly things created during my (David Huggins-Daines) PhD work. You will find, for instance, pure-Python implementations (using numpy and scipy) of certain feature extraction and adaptation algorithms, along with code to read and write the various files produced by SphinxTrain.

There is some code to convert things to FSTs and play with them, but it does not work, because the Python bindings for OpenFst have changed greatly since it was written. In fact, it may have used my own personal wrapper library, which may or may not still exist. Sorry.