[Athen] Docsoft

Sean Keegan skeegan at htctu.net
Thu May 8 14:18:43 PDT 2008

>> Others that were testing the system (for podcasting-type situations)
>> were able to get similar levels of recognition provided that
>> they scripted out what they planned to say before recording.

> If there's a script why wouldn't you just work from that?
> I don't mean this rhetorically, how helpful is a script?

We were just testing the system to see how good the recognition could
actually be, and we used the script as a reference to compare the output
against. We also tested the time-stamping functionality and the creation
of caption files (SAMI, SMIL, etc.).
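
As an illustration only: one common way to score recognizer output against
a script is word error rate (WER). The metric actually used in our testing
is not described here, so treat the function name and the choice of WER
below as assumptions; this is a minimal sketch, not the procedure we ran.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Hypothetical helper for illustration: `reference` is the script,
    `hypothesis` is the recognizer's output. Case is ignored and words
    are split on whitespace; punctuation is not stripped.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference script must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    script = "the quick brown fox"
    recognized = "the quick brown box"
    print(word_error_rate(script, recognized))
```

A lower value means the recognized text is closer to the script; 0.0 means
the two match word for word.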

All that being said, in the real world, if you are already scripting out
what you are going to say, then you have your transcript. For video
content, you would still need to create the caption file, but there are
businesses that can perform that function, or it can be done in-house. We
were just testing various models and found that the recognition was better
when working from a script than when speaking extemporaneously.

Take care,

