TY - JOUR
T1 - Graceful degradation of speech recognition performance over packet-erasure networks
AU - Boulis, Constantinos
AU - Ostendorf, Mari
AU - Riskin, Eve A.
AU - Otterson, Scott
PY - 2002/11
Y1 - 2002/11
N2 - This paper explores packet loss recovery for automatic speech recognition (ASR) in spoken dialog systems, assuming an architecture in which a lightweight client communicates with a remote ASR server. Speech is transmitted with source and channel codes optimized for the ASR application, i.e., to minimize word error rate. Unequal amounts of forward error correction, depending on the data's effect on ASR performance, are assigned to protect against packet loss. Experiments with simulated packet loss in a range of loss conditions are conducted on the DARPA Communicator (air travel information) task. Results show that the approach provides robust ASR performance which degrades gracefully as packet loss rates increase. Transmitting at 5.2 Kbps with up to 200 ms added delay leads to only a 7% relative degradation in word error rate even under extremely adverse network conditions.
AB - This paper explores packet loss recovery for automatic speech recognition (ASR) in spoken dialog systems, assuming an architecture in which a lightweight client communicates with a remote ASR server. Speech is transmitted with source and channel codes optimized for the ASR application, i.e., to minimize word error rate. Unequal amounts of forward error correction, depending on the data's effect on ASR performance, are assigned to protect against packet loss. Experiments with simulated packet loss in a range of loss conditions are conducted on the DARPA Communicator (air travel information) task. Results show that the approach provides robust ASR performance which degrades gracefully as packet loss rates increase. Transmitting at 5.2 Kbps with up to 200 ms added delay leads to only a 7% relative degradation in word error rate even under extremely adverse network conditions.
KW - Bit allocation
KW - Forward error correction
KW - Packet loss
KW - Speech recognition
KW - Unequal loss protection
UR - http://www.scopus.com/inward/record.url?scp=0036880137&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0036880137&partnerID=8YFLogxK
U2 - 10.1109/TSA.2002.804532
DO - 10.1109/TSA.2002.804532
M3 - Article
AN - SCOPUS:0036880137
SN - 1063-6676
VL - 10
SP - 580
EP - 590
JO - IEEE Transactions on Speech and Audio Processing
JF - IEEE Transactions on Speech and Audio Processing
IS - 8
ER -