[Athen] Searching Video Lectures - MIT lecture search engine
Kathleen Cahill
kcahill at MIT.EDU
Thu Jan 10 11:11:13 PST 2008
Gaeir is right... the Lecture Browser, if you try it out (see
http://www.galaxy.csail.mit.edu/lectures/), has a pretty high error rate.
Prof. Glass said that they are continuing to try to improve the
recognition, work with other colleagues, and seek more funding for the
effort. So I guess it's not ready yet for us AT providers as a reliable
captioning/transcription option.
Thanks,
Kathy
__________________________
Kathleen Cahill
MIT ATIC (Adaptive Technology) Lab
77 Mass. Ave. 7-143
Cambridge MA 02139
kcahill at mit.edu
617 253-5111
Gaeir Dietrich wrote:
> We have looked at other similar products, and while they do a fairly decent
> job of indexing material, they are not even close to creating accurate
> word-for-word transcripts.
>
> If all you want is to know the kinds of words being spoken five minutes into
> an audio file, it works. If you actually want to know what is being said, it
> does not work.
>
> ******************************************************
> Gaeir (rhymes with "fire") Dietrich
> High Tech Center Training Unit of the
> California Community Colleges
> De Anza College, Cupertino, CA
> www.htctu.net
> 408-996-6043
> -----Original Message-----
> From: athen-bounces at athenpro.org [mailto:athen-bounces at athenpro.org] On
> Behalf Of Kathleen Cahill
> Sent: Wednesday, January 09, 2008 8:36 AM
> To: Access Technologists in Higher Education Network
> Cc: ITACCESS at LISTSERV.EDUCAUSE.EDU
> Subject: Re: [Athen] Searching Video Lectures - MIT lecture search engine
>
> Hi all;
>
> I have been in touch with Jim Glass, the researcher for this project to
> inquire about any plans to make the software available in the future.
> I'll post when I find something out.
>
> Thanks,
> Kathy
>
> _______________________
>
> Kathleen Cahill
> MIT ATIC (Adaptive Technology) Lab
> 77 Mass. Ave. 7-143
> Cambridge MA 02139
> (617) 253-5111
> kcahill at mit.edu
>
>
> Saroj Primlani wrote:
>
>> Have you all heard about this? I can't find information on the speech
>> recognition engine. If this is viable, it would be a major solution to
>> our problems; we really need to investigate this.
>> http://web.mit.edu/newsoffice/2007/lectures-tt1107.html
>>
>> Article in Technology Review
>> http://www.technologyreview.com/Infotech/19747/page1/
>>
>> Monday, November 26, 2007
>> Searching Video Lectures
>> A tool from MIT finds keywords so that students can efficiently review
>> lectures.
>> By Kate Greene
>> Researchers at MIT have released a video and audio search tool that solves
>> one of the most challenging problems in the field: how to break up a
>> lengthy academic lecture into manageable chunks, pinpoint the location of
>> keywords, and direct the user to them. Announced last month, the MIT
>> Lecture Browser website gives the general public detailed access to more
>> than 200 lectures publicly available through the university's
>> OpenCourseWare initiative. The search engine leverages decades' worth of
>> speech-recognition research at MIT and other institutions to convert audio
>> into text and make it searchable.
>>
>> The Lecture Browser arrives at a time when more and more universities,
>> including Carnegie Mellon University and the University of California,
>> Berkeley, are posting videos and podcasts of lectures online. While this
>> content is useful, locating specific information within lectures can be
>> difficult, frustrating students who are accustomed to finding what they
>> need in less than a second with Google.
>>
>> "This is a growing issue for universities around the country as it becomes
>> easier to record classroom lectures," says Jim Glass, research scientist
>>
> at
>
>> MIT. "It's a real challenge to know how to disseminate them and make it
>> easier for students to get access to parts of the lecture they might be
>> interested in. It's like finding a needle in a haystack."
>>
>> The fundamental elements of the Lecture Browser have been kicking around
>> research labs at MIT and places such as BBN Technologies in Boston,
>> Carnegie Mellon, SRI International in Palo Alto, CA, and the University of
>> Southern California for more than 30 years. Their efforts have produced
>> software that's finally good enough to find its way to the average person,
>> says Premkumar Natarajan, scientist at BBN. "There's about three decades
>> of work where many fundamental problems were addressed," he says. "The
>> technology is mature enough now that there's a growing sense in the
>> community that it's time [to test applications in the real world]. We've
>> done all we can in the lab."
>>
>> A handful of companies, such as the online audio and video search engines
>> Blinkx and EveryZing (which has licensed technology from BBN), are making
>> use of software that converts audio speech into searchable text. (See
>> "Surfing TV on the Internet" and "More-Accurate Video Search".) But the
>> MIT researchers faced particular challenges with academic lectures. For
>> one, many lecturers are not native English speakers, which makes automatic
>> transcription tricky for systems trained on American English accents.
>> Second, the words favored in science lectures can be rather obscure.
>> Finally, says Regina Barzilay, professor of computer science at MIT,
>> lectures have very little discernible structure, making them difficult to
>> break up and organize for easy searching. "Topical transitions are very
>> subtle," she says. "Lectures aren't organized like normal text."
>>
>> To tackle these problems, the researchers first configured the software
>> that converts the audio to text. They trained the software to understand
>> particular accents using accurate transcriptions of short snippets of
>> recorded speech. To help the software identify uncommon words--anything
>> from "drosophila" to "closed-loop integrals"--the researchers provided it
>> with additional data, such as text from books and lecture notes, which
>> assists the software in accurately transcribing as many as four out of
>> five words. If the system is used with a nonnative English speaker whose
>> accent and vocabulary it hasn't been trained to recognize, the accuracy
>> can drop to 50 percent. (Such a low accuracy would not be useful for
>> direct transcription but can still be useful for keyword searches.)
>>
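For a concrete sense of that step, here is a rough Python sketch of how text
from lecture notes and textbooks might be mined for vocabulary that a
recognizer's stock lexicon lacks. It is illustrative only, not the MIT
group's code; the file name, base vocabulary, and function name below are
made up.

    # Rough sketch (not the Lecture Browser's actual code): mine lecture notes
    # and textbook excerpts for words that are missing from a recognizer's
    # base vocabulary, so they can be added to its lexicon and language model.
    import re
    from collections import Counter

    def harvest_domain_terms(note_files, base_vocab, min_count=2):
        """Return {word: count} for domain words absent from base_vocab."""
        counts = Counter()
        for path in note_files:
            with open(path, encoding="utf-8") as f:
                counts.update(re.findall(r"[a-z][a-z'-]+", f.read().lower()))
        return {w: c for w, c in counts.items()
                if c >= min_count and w not in base_vocab}

    if __name__ == "__main__":
        base_vocab = {"the", "a", "of", "we", "cell", "gene"}  # stand-in lexicon
        terms = harvest_domain_terms(["bio_lecture_notes.txt"], base_vocab)
        for word, count in sorted(terms.items(), key=lambda kv: -kv[1]):
            print(word, count)  # candidates like "drosophila" rise to the top
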
>> The next step, explains Barzilay, is to add structure to the transcribed
>> words. Software was already available that could break up long strings of
>> sentences into high-level concepts, but she found that it didn't do the
>> trick with the lectures. So her group designed its own. "One of the key
>> distinctions," she says, "is that, during a lecture, you speak freely; you
>> ramble and mumble."
>>
>> To organize the transcribed text, her group created software that breaks
>> the text into chunks that often correspond with individual sentences. The
>> software places these chunks in a network structure; chunks that have
>> similar words or were spoken closely together in time are placed closer
>> together in the network. The relative distance of the chunks in the
>> network lets the software decide which sentences belong with each topic or
>> subtopic in the lecture.
>>
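That network-of-chunks idea can be illustrated with a small Python sketch:
link neighboring chunks by word overlap, discount links between chunks far
apart in time, and start a new topic wherever the link is weak. The
thresholds and sample text are invented, and this is only an illustration of
the idea, not the group's actual algorithm.

    import math
    from collections import Counter

    def cosine(a, b):
        """Cosine similarity between two word-count vectors (Counters)."""
        num = sum(a[w] * b[w] for w in a if w in b)
        den = (math.sqrt(sum(v * v for v in a.values()))
               * math.sqrt(sum(v * v for v in b.values())))
        return num / den if den else 0.0

    def segment(chunks, sim_threshold=0.08, time_scale=60.0):
        """chunks: list of (start_time_sec, text). Returns topic groups of indices."""
        if not chunks:
            return []
        vecs = [Counter(text.lower().split()) for _, text in chunks]
        topics, current = [], [0]
        for i in range(1, len(chunks)):
            gap = chunks[i][0] - chunks[i - 1][0]
            # Chunks spoken close together in time count as more strongly linked.
            link = cosine(vecs[i - 1], vecs[i]) * math.exp(-gap / time_scale)
            if link < sim_threshold:
                topics.append(current)  # weak link: begin a new topic here
                current = [i]
            else:
                current.append(i)
        topics.append(current)
        return topics

    chunks = [(0, "today we study the fruit fly drosophila"),
              (20, "drosophila genetics let us trace mutations"),
              (400, "now switching topics to closed loop integrals"),
              (430, "a closed loop integral sums a field around a path")]
    print(segment(chunks))  # -> [[0, 1], [2, 3]] with these invented thresholds
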
>> The result, she says, is a coherent transcription. When a person searches
>> for a keyword, the browser offers results in the form of a video or audio
>> timeline that is partitioned into sections. The section of the lecture
>> that contains the keyword is highlighted; below it are snippets of text
>> that surround each instance of the keyword. When a video is playing, the
>> browser shows the transcribed text below it.
>>
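A toy version of that search behavior, again just a sketch with made-up data
structures rather than the Lecture Browser's API, might look like this: each
timeline section holds time-stamped transcript chunks, and a query returns
the matching section, timestamp, and surrounding snippet.

    # Toy sketch of the search behavior described above (hypothetical layout,
    # not the Lecture Browser's API): report the timeline section, timestamp,
    # and snippet for every transcript chunk that mentions the keyword.
    def search_lecture(keyword, sections):
        keyword = keyword.lower()
        hits = []
        for section in sections:
            for start, text in section["chunks"]:
                if keyword in text.lower():
                    hits.append({"section": section["title"],
                                 "time": start,
                                 "snippet": text})
        return hits

    sections = [{"title": "Fly genetics", "start": 0, "end": 300,
                 "chunks": [(12, "the fruit fly drosophila is a model organism")]},
                {"title": "Vector calculus", "start": 300, "end": 900,
                 "chunks": [(430, "a closed-loop integral sums a field around a path")]}]

    for hit in search_lecture("drosophila", sections):
        print(f'{hit["section"]} @ {hit["time"]}s: {hit["snippet"]}')
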
>> Barzilay says that the browser currently receives an average of 21,000
>> hits a day, and while it's proving popular, there is still work to be
>> done. Within the next few months, her team will add a feature that
>> automatically attaches a text outline to lectures so users can jump to a
>> desired section. Further ahead, the researchers will give users the
>> ability to make corrections to the transcript in the same way that people
>> contribute to Wikipedia. While such improvements seem straightforward,
>> they pose technical challenges, Barzilay says. "It's not a trivial matter,
>> because you want an interface that's not tedious, and you need to
>> propagate the correction throughout the lecture and to other lectures."
>> She says that bringing people into the transcription loop could improve
>> the accuracy of the system by a couple of percentage points, making the
>> user experience even better.
>>
>> Copyright Technology Review 2007
>> _________________________________
>> Saroj Primlani
>> Coordinator of University IT Accessibility
>> Office of Information Technology
>> 919 513 4087
>> http://ncsu.edu/it/access
>>
>>
>>
>
>
> _______________________________________________
> Athen mailing list
> Athen at athenpro.org
> http://athenpro.org/mailman/listinfo/athen_athenpro.org
>