As mobile technologies mature, voice biometrics is becoming increasingly popular as a promising alternative to traditional passwords for user authentication on smartphones. Moreover, with the popularity of voice user interface (VUI) capable devices such as smart speakers and the evolution of speech recognition techniques, voice biometrics has advanced further to become the primary channel of secure user-machine communication. On these smart devices, voice biometrics authenticates the user and secures access to, and control of, an array of systems, devices, and services. However, a growing body of research shows that state-of-the-art voice authentication systems are vulnerable to replay attacks, in which an adversary spoofs the system with pre-recorded or concatenated voice samples of the genuine user. To this end, we propose three liveness detection systems, VoiceLive, VoiceGesture, and VibLive, towards enhanced mobile voice authentication. Our solutions emphasize discovering new biometrics, inventing new sensing mechanisms, and designing practical liveness detection systems for voice authentication. VoiceLive captures the first new biometric: the changes in time-difference-of-arrival (TDoA) across a sequence of phoneme sounds, measured with the stereo recording available on smartphones. Because these TDoA changes do not exist under replay attacks, VoiceLive uses them for liveness detection. VoiceGesture examines the second new biometric, articulatory gestures, by re-using the smartphone as a Doppler radar that measures the velocities and locations of the articulatory gestures for liveness detection. VibLive senses the third new biometric, bone-conducted vibrations, with a built-in speaker and microphone pair on a VUI-capable device. For liveness detection, VibLive looks for dissimilarities between the air-conducted voice recorded by the microphone and the corresponding sensed bone-conducted vibrations.
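To make the TDoA cue concrete, the following is a minimal sketch of how a delay between two stereo channels could be estimated. It uses brute-force cross-correlation, a common textbook technique; the abstract does not say which estimator VoiceLive uses, so this is an assumption. The names `estimate_tdoa` and `synthetic_burst`, the 48 kHz sample rate, and the 24-sample search window are all hypothetical and chosen only for illustration.

```python
import math


def synthetic_burst(n, start, freq_hz, sample_rate):
    """Toy stand-in for one phoneme: a decaying sine that begins at `start`."""
    out = [0.0] * n
    for i in range(start, n):
        t = (i - start) / sample_rate
        out[i] = math.exp(-2000.0 * t) * math.sin(2 * math.pi * freq_hz * t)
    return out


def estimate_tdoa(left, right, sample_rate, max_lag):
    """Return the delay in seconds of `right` relative to `left`.

    A positive value means the sound reached the `left` channel first.
    The lag is found by brute-force cross-correlation over
    [-max_lag, max_lag] samples (an assumed estimator, not the
    dissertation's confirmed method).
    """
    n = min(len(left), len(right))
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / sample_rate


# Illustrative use: the right channel hears the same burst 3 samples later.
SAMPLE_RATE = 48_000  # assumed sample rate
left = synthetic_burst(400, 50, 1000.0, SAMPLE_RATE)
right = synthetic_burst(400, 53, 1000.0, SAMPLE_RATE)
delay = estimate_tdoa(left, right, SAMPLE_RATE, max_lag=24)
print(round(delay * SAMPLE_RATE))  # recovers the 3-sample delay
```

In a liveness setting, an estimate like this would be computed per phoneme segment. The cue described above is that the per-phoneme values change for a live speaker, whereas a replayed recording comes from a single loudspeaker position.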
Both VoiceLive and VoiceGesture are text-dependent liveness detection solutions for short-range mobile voice authentication. Experiments show that both systems achieve over 99% accuracy and an Equal Error Rate (EER) of around 1% in detecting replay attacks on enrolled passphrases. In comparison, VibLive is a text-independent liveness detection solution that supports both short-range and long-range voice authentication on generic VUI-capable devices, including but not limited to smartphones and smart speakers. An evaluation indicates that VibLive yields over 97% accuracy in protecting continuous voice authentication regardless of the user's speech content.
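For readers unfamiliar with the EER metric reported above, the sketch below shows one simple way it can be approximated from liveness scores: sweep a decision threshold and take the point where the false acceptance rate (replays accepted as live) and the false rejection rate (live speech rejected) are closest. The function name `equal_error_rate` and the score values are hypothetical toy inputs, not data from the evaluation.

```python
def equal_error_rate(genuine_scores, attack_scores):
    """Approximate the EER, assuming scores >= threshold are accepted as live.

    Each observed score is tried as a threshold; the result is the mean of
    the false acceptance and false rejection rates where they are closest.
    """
    thresholds = sorted(set(genuine_scores) | set(attack_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # Replay attacks wrongly accepted at this threshold.
        far = sum(s >= t for s in attack_scores) / len(attack_scores)
        # Genuine live utterances wrongly rejected at this threshold.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Made-up scores for illustration only.
toy_genuine = [0.9, 0.8, 0.7, 0.4]
toy_attack = [0.1, 0.2, 0.3, 0.6]
print(equal_error_rate(toy_genuine, toy_attack))  # 0.25
```

A lower EER means the genuine and replayed score distributions overlap less. With perfectly separated scores, this function returns 0.0.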