[Athen] Searching Video Lectures - MIT lecture search engine

Kathleen Cahill kcahill at MIT.EDU
Thu Jan 10 11:11:13 PST 2008


Gaeir is right. The Lecture Browser, if you try it out (see
http://www.galaxy.csail.mit.edu/lectures/), has a pretty high error rate.

Prof. Glass said that they are continuing to try to improve the
recognition, work with other colleagues, and get more funding. So I guess
it's not yet ready to serve us AT providers as a reliable
captioning/transcription option.

Thanks,
Kathy

__________________________
Kathleen Cahill
MIT ATIC (Adaptive Technology) Lab
77 Mass. Ave. 7-143
Cambridge MA 02139
kcahill at mit.edu
617 253-5111

Gaeir Dietrich wrote:

> We have looked at other similar products, and while they do a fairly decent

> job of indexing material, they are not even close to creating accurate

> word-for-word transcripts.

>

> If all you want is to know the kinds of words being spoken five minutes into

> an audio file, it works. If you actually want to know what is being said, it

> does not work.

>

> ******************************************************

> Gaeir (rhymes with "fire") Dietrich

> High Tech Center Training Unit of the

> California Community Colleges

> De Anza College, Cupertino, CA

> www.htctu.net

> 408-996-6043

> -----Original Message-----

> From: athen-bounces at athenpro.org [mailto:athen-bounces at athenpro.org] On

> Behalf Of Kathleen Cahill

> Sent: Wednesday, January 09, 2008 8:36 AM

> To: Access Technologists in Higher Education Network

> Cc: ITACCESS at LISTSERV.EDUCAUSE.EDU

> Subject: Re: [Athen] Searching Video Lectures - MIT lecture search engine

>

> Hi all;

>

> I have been in touch with Jim Glass, the researcher for this project to

> inquire about any plans to make the software available in the future.

> I'll post when I find something out.

>

> Thanks,

> Kathy

>

> _______________________

>

> Kathleen Cahill

> MIT ATIC (Adaptive Technology) Lab

> 77 Mass. Ave. 7-143

> Cambridge MA 02139

> (617) 253-5111

> kcahill at mit.edu

>

>

> Saroj Primlani wrote:

>

>> Have you all heard about this? I can't find information on the speech

>> recognition engine. If this is viable, it would be a major solution to our

>> problems; we really need to investigate this.

>> http://web.mit.edu/newsoffice/2007/lectures-tt1107.html

>>

>> Article in Technology Review

>> http://www.technologyreview.com/Infotech/19747/page1/

>>

>> Monday, November 26, 2007

>> Searching Video Lectures

>> A tool from MIT finds keywords so that students can efficiently review

>> lectures.

>> By Kate Greene

>> Researchers at MIT have released a video and audio search tool that solves

>> one of the most challenging problems in the field: how to break up a

>> lengthy academic lecture into manageable chunks, pinpoint the location of

>> keywords, and direct the user to them. Announced last month, the MIT

>> Lecture Browser website gives the general public detailed access to more

>> than 200 lectures publicly available through the university's

>> OpenCourseWare initiative. The search engine leverages decades' worth of

>> speech-recognition research at MIT and other institutions to convert

>> audio into text and make it searchable.

>>

>> The Lecture Browser arrives at a time when more and more universities,

>> including Carnegie Mellon University and the University of California,

>> Berkeley, are posting videos and podcasts of lectures online. While this

>> content is useful, locating specific information within lectures can be

>> difficult, frustrating students who are accustomed to finding what they

>> need in less than a second with Google.

>>

>> "This is a growing issue for universities around the country as it becomes

>> easier to record classroom lectures," says Jim Glass, research scientist

>> at MIT. "It's a real challenge to know how to disseminate them and make it

>> easier for students to get access to parts of the lecture they might be

>> interested in. It's like finding a needle in a haystack."

>>

>> The fundamental elements of the Lecture Browser have been kicking around

>> research labs at MIT and places such as BBN Technologies in Boston,

>> Carnegie Mellon, SRI International in Palo Alto, CA, and the University of

>> Southern California for more than 30 years. Their efforts have produced

>> software that's finally good enough to find its way to the average person,

>> says Premkumar Natarajan, scientist at BBN. "There's about three decades

>> of work where many fundamental problems were addressed," he says. "The

>> technology is mature enough now that there's a growing sense in the

>> community that it's time [to test applications in the real world]. We've

>> done all we can in the lab."

>>

>> A handful of companies, such as online audio and video search engines

>> Blinkx and EveryZing (which has licensed technology from BBN), are making

>> use of software that converts audio speech into searchable text. (See

>> "Surfing TV on the Internet" and "More-Accurate Video Search".) But the

>> MIT researchers faced particular challenges with academic lectures. For

>> one, many lecturers are not native English speakers, which makes automatic

>> transcription tricky for systems trained on American English accents.

>> Second, the words favored in science lectures can be rather obscure.

>> Finally, says Regina Barzilay, professor of computer science at MIT,

>> lectures have very little discernible structure, making them difficult to

>> break up and organize for easy searching. "Topical transitions are very

>> subtle," she says. "Lectures aren't organized like normal text."

>>

>> To tackle these problems, the researchers first configured the software

>> that converts the audio to text. They trained the software to understand

>> particular accents using accurate transcriptions of short snippets of

>> recorded speech. To help the software identify uncommon words--anything

>> from "drosophila" to "closed-loop integrals"--the researchers provided it

>> with additional data, such as text from books and lecture notes, which

>> assists the software in accurately transcribing as many as four out of

>> five words. If the system is used with a nonnative English speaker whose

>> accent and vocabulary it hasn't been trained to recognize, the accuracy

>> can drop to 50 percent. (Such a low accuracy would not be useful for

>> direct transcription but can still be useful for keyword searches.)
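
(An aside from me, not from the article: "four out of five words" is
roughly a 20 percent word error rate. Below is a generic Python sketch of
how that figure is usually computed against a reference transcript. It is
only an illustration; the sentences are made up and this has nothing to do
with MIT's actual recognizer.)

# Illustrative only; not the Lecture Browser's code.
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edits needed to turn the first i reference words
    # into the first j hypothesis words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + sub)  # substitution or match
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)

# One wrong word out of five -> WER 0.2, i.e. "four out of five" correct.
print(word_error_rate("the closed loop integral vanishes",
                      "the closed loop integral vanished"))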

>>

>> The next step, explains Barzilay, is to add structure to the transcribed

>> words. Software was already available that could break up long strings of

>> sentences into high-level concepts, but she found that it didn't do the

>> trick with the lectures. So her group designed its own. "One of the key

>> distinctions," she says, "is that, during a lecture, you speak freely; you

>> ramble and mumble."

>>

>> To organize the transcribed text, her group created software that breaks

>> the text into chunks that often correspond with individual sentences. The

>> software places these chunks in a network structure; chunks that have

>> similar words or were spoken closely together in time are placed closer

>> together in the network. The relative distance of the chunks in the

>> network lets the software decide which sentences belong with each topic

>> or subtopic in the lecture.
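
(Another aside, not from the article: to make the chunking idea concrete,
here is a deliberately simplified Python sketch that starts a new topic
whenever adjacent transcript chunks share too few words. The real MIT
system is far more sophisticated; the threshold, names, and sample lecture
below are invented.)

# Illustrative only; not Barzilay's actual segmentation software.
from typing import List

def overlap(a: str, b: str) -> float:
    """Jaccard similarity between the word sets of two chunks."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(len(wa | wb), 1)

def segment(chunks: List[str], threshold: float = 0.1) -> List[List[str]]:
    """Start a new topic wherever adjacent chunks share too few words."""
    topics = [[chunks[0]]]
    for prev, cur in zip(chunks, chunks[1:]):
        if overlap(prev, cur) < threshold:
            topics.append([cur])       # low similarity -> topic boundary
        else:
            topics[-1].append(cur)     # similar enough -> same topic
    return topics

lecture = [
    "today we look at the fourier transform of a rectangular pulse",
    "the fourier transform turns convolution into multiplication",
    "now switching gears to sampling and aliasing",
    "aliasing appears when the sampling rate is too low",
]
for topic in segment(lecture):
    print(topic)  # two topics: transforms, then sampling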

>>

>> The result, she says, is a coherent transcription. When a person searches

>> for a keyword, the browser offers results in the form of a video or audio

>> timeline that is partitioned into sections. The section of the lecture

>> that contains the keyword is highlighted; below it are snippets of text

>> that surround each instance of the keyword. When a video is playing, the

>> browser shows the transcribed text below it.
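
(One more aside, not from the article: the keyword-plus-snippet behavior it
describes is easy to picture with a toy example. The Python sketch below
searches timestamped transcript chunks and returns the start time of each
hit plus a snippet of surrounding text. The names, timestamps, and sample
lines are all made up; this is not the Lecture Browser's code.)

# Illustrative only; a toy keyword search over a timestamped transcript.
from typing import List, Tuple

def search(transcript: List[Tuple[float, str]], keyword: str,
           context: int = 40) -> List[Tuple[float, str]]:
    """Return (start_time_seconds, snippet) for every chunk with the keyword."""
    hits = []
    for start_sec, text in transcript:
        pos = text.lower().find(keyword.lower())
        if pos != -1:
            lo, hi = max(0, pos - context), pos + len(keyword) + context
            hits.append((start_sec, "..." + text[lo:hi] + "..."))
    return hits

transcript = [
    (0.0,   "welcome back, today we continue with closed-loop integrals"),
    (312.5, "a drosophila example makes the point about gene regulation"),
    (904.0, "returning to the integral, note that the contour encloses the pole"),
]
for start, snippet in search(transcript, "integral"):
    print(f"{start:7.1f}s  {snippet}")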

>>

>> Barzilay says that the browser currently receives an average of 21,000

>> hits a day, and while it's proving popular, there is still work to be

>> done. Within the next few months, her team will add a feature that

>> automatically attaches a text outline to lectures so users can jump to a

>> desired section. Further ahead, the researchers will give users the

>> ability to make corrections to the transcript in the same way that people

>> contribute to Wikipedia. While such improvements seem straightforward,

>> they pose technical challenges, Barzilay says. "It's not a trivial

>> matter, because you want an interface that's not tedious, and you need to

>> propagate the correction throughout the lecture and to other lectures."

>> She says that bringing people into the transcription loop could improve

>> the accuracy of the system by a couple percentage points, making the user

>> experience even better.

>>

>> Copyright Technology Review 2007

>> _________________________________

>> Saroj Primlani

>> Coordinator of University IT Accessibility

>> Office of Information Technology

>> 919 513 4087

>> http://ncsu.edu/it/access

>>

>>

>>

>

>

> _______________________________________________

> Athen mailing list

> Athen at athenpro.org

> http://athenpro.org/mailman/listinfo/athen_athenpro.org






