Speaker Diarization Boosts Automatic Speaker Recognition In Audio Recordings

Sunday, October 20, 2013 - 10:40 in Mathematics & Economics

An important goal in spoken-language-systems research is speaker diarization: computationally determining how many speakers feature in a recording and which of them speaks when. To date, the best diarization systems have used supervised machine learning; they're trained on sample recordings that a human has indexed, indicating which speaker enters when. In a new paper, MIT researchers show how they can improve speaker diarization so that it can automatically annotate audio or video recordings without supervision: no prior indexing is necessary. They also discuss a compact way to represent the differences between individual speakers' voices, which could be of use in other spoken-language computational tasks.
