Their newest project, funded by a Provost’s Tier 1 Interdisciplinary Seed Grant, targets a trickier population: teenagers who want to learn English as a second language.
“Adolescents who are not native to English often lag in reading comprehension because the prosody, which is the melody of speech, is different in their language,” said Patel, an associate professor of speech-language pathology and audiology in the Bouvé College of Health Sciences with a joint appointment in the College of Computer and Information Science. “We want to help them both to sound more fluent and native and also to help them with language comprehension.”
But the new task presented a new set of visual challenges, noted Meirelles, an associate professor of graphic design in the College of Arts, Media & Design. “We spent a lot of time trying to devise a structural metaphor for how to engage people,” she said. “Because it’s a difficult age group, we didn’t want them to feel like they were doing work even though it is work.”
In the end, Meirelles and Patel appealed to that universal feature of puberty: narcissism. “This is the period in a person’s life when they go from where the family is a unit to where they are the unit,” Patel explained.
The program, she said, turns teenagers into television stars. With an interface modeled after a television set, users can enter any of a series of “recording studios” and become anything from a sportscaster to a fashion critic.
Here’s how it works: Upon entering a studio, a user hears a native speaker reading a script. Then, while looking at the visual cues for each sentence, the user records herself reading it. Finally, she plays back her own recording.
Patel said an important aspect of the computational program is its scalability. “There isn’t an inventory of rules,” she explained. “It doesn’t see a question mark and say okay, for the question mark, raise your pitch, because in English that’s not always true. The question mark itself doesn’t tell you enough.”
Instead, the program uses speech acoustics and actual speech data to render the visual cues of native-speaker recordings. Eventually, the team hopes to introduce a voice-recognition tool, allowing the program to immediately render users’ recordings so they can compare the visual representation of their own speech with that of the native speaker.
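The article does not describe the team's actual algorithm, so the following is only a rough sketch of what deriving a cue from acoustics, rather than from punctuation rules, could look like. It estimates a pitch contour from raw audio samples by autocorrelation, one common textbook approach. The function names, frame size, and pitch range are all assumptions made for illustration.

```python
import math

def estimate_pitch_hz(frame, sample_rate, fmin=75.0, fmax=400.0):
    """Estimate one frame's fundamental frequency by autocorrelation.

    Hypothetical helper: the 75-400 Hz search range is an assumed
    range for speaking voices, not a value taken from the project.
    """
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    if max_lag <= min_lag or not any(frame):
        return None  # silent or too-short frame: no pitch estimate
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Similarity between the frame and a copy shifted by `lag` samples.
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None

def pitch_contour(samples, sample_rate, frame_size=320):
    """Split the audio into fixed-size frames and estimate pitch per frame."""
    return [
        estimate_pitch_hz(samples[start:start + frame_size], sample_rate)
        for start in range(0, len(samples) - frame_size + 1, frame_size)
    ]

def tone(freq_hz, sample_rate, n_samples):
    """Synthesize a pure tone, standing in for a recorded voice."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n_samples)]

# A synthetic "utterance" whose pitch rises across three frames.
SAMPLE_RATE = 8000
rising = tone(160, SAMPLE_RATE, 320) + tone(200, SAMPLE_RATE, 320) + tone(250, SAMPLE_RATE, 320)
contour = pitch_contour(rising, SAMPLE_RATE)
```

Here the rising contour comes from the signal itself, not from any rule attached to a question mark. A contour like this could then be drawn as a line that a learner compares against a native speaker's. Real speech is far noisier than these synthetic tones, so a practical tool would need a more robust pitch tracker.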
Ultimately, Patel and Meirelles envision the program as a web application featuring recordings that could be shared among users.
First, however, they will test the user interface in a proof-of-concept pilot study. The resulting data will help the design team determine which visual cues need more development before the program is deployed in a larger efficacy study.