Speech and language resources for LVCSR of Russian: conference paper, abstract

Publication type: conference paper, abstract, article in conference proceedings

Conference: International Conference on Language Resources and Evaluation, LREC 2012; Istanbul

Year of publication: 2012

Keywords: language modelling, LVCSR, Russian, sub-word units

Abstract: A syllable-based language model reduces the lexicon size by a factor of hundreds. This is especially beneficial for highly inflected languages such as Russian, where each word has many forms across grammatical categories. The main challenge, however, is concatenating the recognised syllables back into the originally spoken sentence or phrase, particularly when some syllables are misrecognised. Natural fluent speech usually carries no clear information about word boundaries. This paper proposes and tests a method for syllable concatenation and error correction. It is based on a purpose-designed co-evolutionary asymptotic probabilistic genetic algorithm that determines the most likely sentence corresponding to the recognised chain of syllables within an acceptable time frame. The advantage of this modification of the genetic algorithm is that it minimises the number of settings that must be adjusted manually, compared with the standard algorithm. The data used for acoustic and language modelling are also described. A particular concern is the preprocessing of the textual data, especially the handling of abbreviations and of Arabic and Roman numerals, since their inflection mostly depends on context and grammar.
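To make the concatenation problem concrete, here is a minimal sketch in Python. It is not the co-evolutionary genetic algorithm described in the abstract: it swaps in a plain exhaustive search with memoisation, which is only practical for tiny inputs. It also assumes every syllable was recognised correctly, so it covers word-boundary recovery only, not error correction. The lexicon, its probabilities, the transliterated syllables, and the function name `segment` are all hypothetical and chosen purely for illustration.

```python
from functools import lru_cache

# Hypothetical toy lexicon: each word is a tuple of syllables mapped to a
# made-up unigram probability. None of this is data from the paper.
LEXICON = {
    ("ma", "ma"): 0.4,
    ("my", "la"): 0.3,
    ("ra", "mu"): 0.2,
    ("ma",): 0.05,
    ("la",): 0.05,
}


def segment(syllables):
    """Return (probability, words) for the most likely split of a syllable
    chain into lexicon words, or (0.0, []) if no split exists."""
    syllables = tuple(syllables)

    @lru_cache(maxsize=None)
    def best(start):
        # Best segmentation of syllables[start:], memoised per start index.
        if start == len(syllables):
            return 1.0, ()
        top = (0.0, ())
        for end in range(start + 1, len(syllables) + 1):
            prob = LEXICON.get(syllables[start:end])
            if prob is None:
                continue  # this span is not a known word
            rest_prob, rest_words = best(end)
            score = prob * rest_prob
            if score > top[0]:
                word = "".join(syllables[start:end])
                top = (score, (word,) + rest_words)
        return top

    prob, words = best(0)
    return prob, list(words)


if __name__ == "__main__":
    print(segment(["ma", "ma", "my", "la", "ra", "mu"]))
```

With this toy lexicon, the chain ma-ma-my-la-ra-mu is split into the words "mama", "myla", "ramu", because that split scores higher than one treating each "ma" as a separate word. A real lexicon and recognition errors enlarge the search space considerably, which is the situation the abstract's heuristic search addresses.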

Publication

Journal: Proceedings of the 8th International Conference on Language Resources and Evaluation, LREC 2012

Pages: 3374-3377

Authors

  • Zablotskiy S. (University of Ulm)
  • Minker W. (University of Ulm)
  • Shvets A. (Institute for System Analysis of RAS)
  • Sidorov M. (Siberian State Aerospace University)
  • Semenkin E. (Siberian State Aerospace University)
