[Athen] Searching Video Lectures - MIT lecture search engine
Kathleen Cahill
kcahill at MIT.EDU
Wed Jan 9 08:36:28 PST 2008
Hi all,
I have been in touch with Jim Glass, the researcher for this project, to
inquire about any plans to make the software available in the future.
I'll post when I find something out.
Thanks,
Kathy
_______________________
Kathleen Cahill
MIT ATIC (Adaptive Technology) Lab
77 Mass. Ave. 7-143
Cambridge MA 02139
(617) 253-5111
kcahill at mit.edu
Saroj Primlani wrote:
> Have you all heard about this? I can't find information on the speech
> recognition engine. If this is viable, it would be a major solution to our
> problems; we really need to investigate this.
> http://web.mit.edu/newsoffice/2007/lectures-tt1107.html
>
> Article in Technology Review
> http://www.technologyreview.com/Infotech/19747/page1/
>
> Monday, November 26, 2007
> Searching Video Lectures
> A tool from MIT finds keywords so that students can efficiently review
> lectures.
> By Kate Greene
> Researchers at MIT have released a video and audio search tool that solves
> one of the most challenging problems in the field: how to break up a lengthy
> academic lecture into manageable chunks, pinpoint the location of keywords,
> and direct the user to them. Announced last month, the MIT Lecture Browser
> website gives the general public detailed access to more than 200 lectures
> publicly available through the university's OpenCourseWare initiative. The
> search engine leverages decades' worth of speech-recognition research at MIT
> and other institutions to convert audio into text and make it searchable.
>
> The Lecture Browser arrives at a time when more and more universities,
> including Carnegie Mellon University and the University of California,
> Berkeley, are posting videos and podcasts of lectures online. While this
> content is useful, locating specific information within lectures can be
> difficult, frustrating students who are accustomed to finding what they need
> in less than a second with Google.
>
> "This is a growing issue for universities around the country as it becomes
> easier to record classroom lectures," says Jim Glass, research scientist at
> MIT. "It's a real challenge to know how to disseminate them and make it
> easier for students to get access to parts of the lecture they might be
> interested in. It's like finding a needle in a haystack."
>
> The fundamental elements of the Lecture Browser have been kicking around
> research labs at MIT and places such as BBN Technologies in Boston, Carnegie
> Mellon, SRI International in Menlo Park, CA, and the University of Southern
> California for more than 30 years. Their efforts have produced software
> that's finally good enough to find its way to the average person, says
> Premkumar Natarajan, a scientist at BBN. "There's about three decades of work
> where many fundamental problems were addressed," he says. "The technology is
> mature enough now that there's a growing sense in the community that it's
> time [to test applications in the real world]. We've done all we can in the
> lab."
>
> A handful of companies, such as online audio and video search engines Blinkx
> and EveryZing (which has licensed technology from BBN), are making use of
> software that converts audio speech into searchable text. (See "Surfing TV
> on the Internet" and "More-Accurate Video Search".) But the MIT researchers
> faced particular challenges with academic lectures. For one, many lecturers
> are not native English speakers, which makes automatic transcription tricky
> for systems trained on American English accents. Second, the words favored
> in science lectures can be rather obscure. Finally, says Regina Barzilay,
> professor of computer science at MIT, lectures have very little discernible
> structure, making them difficult to break up and organize for easy
> searching. "Topical transitions are very subtle," she says. "Lectures aren't
> organized like normal text."
>
> To tackle these problems, the researchers first configured the software that
> converts the audio to text. They trained the software to understand
> particular accents using accurate transcriptions of short snippets of
> recorded speech. To help the software identify uncommon words--anything from
> "drosophila" to "closed-loop integrals"--the researchers provided it with
> additional data, such as text from books and lecture notes, which assists
> the software in accurately transcribing as many as four out of five words.
> If the system is used with a nonnative English speaker whose accent and
> vocabulary it hasn't been trained to recognize, the accuracy can drop to 50
> percent. (Such a low accuracy would not be useful for direct transcription
> but can still be useful for keyword searches.)
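[Note inline: as a rough illustration of how extra text such as lecture notes could feed the recognizer's vocabulary, here is a minimal sketch. This is not the MIT group's actual code; the file names and the frequency cutoff are made up for the example. It simply pulls out terms that appear often in the notes but are missing from a general-English word list, i.e. the kind of words ("drosophila" and so on) that would be added to the recognizer's lexicon and language-model text.]

    import re
    from collections import Counter

    # Hypothetical inputs: plain-text lecture notes and a general-English word list.
    NOTES_FILE = "lecture_notes.txt"          # assumed path
    COMMON_WORDS_FILE = "common_english.txt"  # assumed path, one word per line
    MIN_COUNT = 3                             # arbitrary cutoff for this sketch

    def tokenize(text):
        # Lowercase alphabetic tokens; hyphens kept so "closed-loop" survives.
        return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())

    with open(COMMON_WORDS_FILE) as f:
        common = {line.strip().lower() for line in f if line.strip()}

    with open(NOTES_FILE) as f:
        counts = Counter(tokenize(f.read()))

    # Keep frequent terms a general-purpose vocabulary would likely miss.
    domain_terms = sorted(
        w for w, c in counts.items() if c >= MIN_COUNT and w not in common
    )

    for term in domain_terms:
        print(term, counts[term])
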
>
> The next step, explains Barzilay, is to add structure to the transcribed
> words. Software was already available that could break up long strings of
> sentences into high-level concepts, but she found that it didn't do the
> trick with the lectures. So her group designed its own. "One of the key
> distinctions," she says, "is that, during a lecture, you speak freely; you
> ramble and mumble."
>
> To organize the transcribed text, her group created software that breaks the
> text into chunks that often correspond with individual sentences. The
> software places these chunks in a network structure; chunks that have
> similar words or were spoken closely together in time are placed closer
> together in the network. The relative distance of the chunks in the network
> lets the software decide which sentences belong with each topic or subtopic
> in the lecture.
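[Note inline: to make the chunk-network idea concrete, here is a toy sketch, my own simplification rather than Barzilay's published method. Sentences become word-count vectors, each adjacent pair gets a weight combining word overlap with closeness in time, and the lecture is cut wherever neighboring chunks are only weakly connected. The sentences, weights, and thresholds are invented for the example.]

    import math
    from collections import Counter

    def cosine(a, b):
        # Cosine similarity between two word-count vectors.
        dot = sum(a[w] * b[w] for w in a if w in b)
        norm = (math.sqrt(sum(v * v for v in a.values()))
                * math.sqrt(sum(v * v for v in b.values())))
        return dot / norm if norm else 0.0

    def segment(chunks, time_scale=60.0, alpha=0.7, cut_threshold=0.25):
        """chunks: list of (start_time_seconds, text). Returns lists of chunk
        indices, one list per inferred topic. Parameters are arbitrary here."""
        vecs = [Counter(text.lower().split()) for _, text in chunks]
        segments, current = [], [0]
        for i in range(1, len(chunks)):
            word_sim = cosine(vecs[i - 1], vecs[i])
            # Chunks spoken close together in time count as more related.
            time_sim = math.exp(-abs(chunks[i][0] - chunks[i - 1][0]) / time_scale)
            weight = alpha * word_sim + (1 - alpha) * time_sim
            if weight < cut_threshold:
                segments.append(current)
                current = []
            current.append(i)
        segments.append(current)
        return segments

    # Tiny invented example: three sentences on integrals, then a topic jump.
    lecture = [
        (0.0,   "today we evaluate a closed loop integral over the contour"),
        (40.0,  "the integral around the loop picks up residues inside the contour"),
        (95.0,  "so the contour integral equals two pi i times the sum of residues"),
        (400.0, "now switch gears drosophila genetics and mutation rates"),
    ]
    print(segment(lecture))   # -> [[0, 1, 2], [3]]
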
>
> The result, she says, is a coherent transcription. When a person searches
> for a keyword, the browser offers results in the form of a video or audio
> timeline that is partitioned into sections. The section of the lecture that
> contains the keyword is highlighted; below it are snippets of text that
> surround each instance of the keyword. When a video is playing, the browser
> shows the transcribed text below it.
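[Note inline: and a last toy sketch of the search side, assuming time-stamped transcript segments like those produced above. The segment data and function name are illustrative, not the Lecture Browser's API: a keyword query returns each matching segment's start time plus a short snippet of text around the hit, roughly what the highlighted timeline sections and snippets described above would be built from.]

    def search(segments, keyword, context=40):
        """segments: list of (start_seconds, end_seconds, text).
        Returns (start, end, snippet) for every segment containing the keyword."""
        hits = []
        kw = keyword.lower()
        for start, end, text in segments:
            pos = text.lower().find(kw)
            if pos >= 0:
                snippet = text[max(0, pos - context): pos + len(kw) + context]
                hits.append((start, end, "..." + snippet + "..."))
        return hits

    # Invented transcript fragments with start/end times in seconds.
    transcript = [
        (0, 180,   "we begin with the definition of a closed loop integral"),
        (180, 410, "the residue theorem lets us evaluate the integral quickly"),
        (410, 700, "drosophila is a workhorse of modern genetics"),
    ]
    for start, end, snippet in search(transcript, "integral"):
        print(f"{start // 60:02d}:{start % 60:02d}  {snippet}")
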
>
> Barzilay says that the browser currently receives an average of 21,000 hits
> a day, and while it's proving popular, there is still work to be done.
> Within the next few months, her team will add a feature that automatically
> attaches a text outline to lectures so users can jump to a desired section.
> Further ahead, the researchers will give users the ability to make
> corrections to the transcript in the same way that people contribute to
> Wikipedia. While such improvements seem straightforward, they pose technical
> challenges, Barzilay says. "It's not a trivial matter, because you want an
> interface that's not tedious, and you need to propagate the correction
> throughout the lecture and to other lectures." She says that bringing people
> into the transcription loop could improve the accuracy of the system by a
> couple of percentage points, making the user experience even better.
>
> Copyright Technology Review 2007
> _________________________________
> Saroj Primlani
> Coordinator of University IT Accessibility
> Office of Information Technology
> 919 513 4087
> http://ncsu.edu/it/access
>
>