New lip-reading technology could help solve crimes by deciphering what people caught on CCTV are saying, researchers have claimed. The visual speech recognition technology, developed by the University of East Anglia in Norwich, can be used to determine what people are saying when the audio is too poor to make out, such as on security camera footage.
Helen Bear, from the university’s school of computing science, said the technology could be applied to a wide range of situations, from criminal investigations to entertainment. She added: “Lip-reading has been used to pinpoint words footballers have shouted in heated moments on the pitch, but is likely to be of most practical use in situations where there are high levels of noise, such as in cars or aircraft cockpits. Crucially, whilst there are still improvements to be made, such a system could be adapted for use for a range of purposes – for example, for people with hearing or speech impairments.”
Some sounds, such as “P” and “B”, look similar on the lips and have traditionally been hard to tell apart, the researchers said. The machine lip-reading technology can now differentiate between these sounds, giving a more accurate translation. Co-creator Richard Harvey said: “Lip-reading is one of the most challenging problems in artificial intelligence so it’s great to make progress on one of the trickier aspects, which is how to train machines to recognise the appearance and shape of human lips.”