CS474: Human Computer Interaction - Modalities - Voice Prompts

Activity Goals

The goals of this activity are:
  1. To identify alternative modalities for human-computer interaction
  2. To write a program that uses voice prompts for engagement
  3. To identify signifiers and affordances for a given application and modality

The Activity

Directions

Consider the activity models and answer the questions provided. First, reflect briefly on each question on your own; then discuss and compare your answers with your group. Appoint one member of your group to report your findings to the class, and have the rest of the group help that member prepare a response. Report out on areas of disagreement and on items for which your group identified alternative approaches. Write down any questions you encountered along the way and raise them during the whole-class discussion. After class, think about the questions in the reflective prompt and respond to them individually in your notebook.

Model 1: Voice Prompts

# sudo apt install portaudio19-dev libespeak-dev
# on mac: brew install portaudio
# pip3 install pyaudio pyttsx3 speechrecognition
# on Windows, alternatively: pip3 install pipwin && python -m pipwin install pyaudio
 
import speech_recognition as sr
import pyttsx3
import sys
import time
 
tts = pyttsx3.init()
 
 
def speak(tts, text):
    tts.say(text)
    tts.runAndWait()
 
def main():
    # get audio from the microphone                                                                      
    listener = sr.Recognizer()                                                                                  
    with sr.Microphone() as source:
        listener.adjust_for_ambient_noise(source) # used to detect silence to stop listening after a phrase is spoken
        while True:
            print("Listening.")
            speak(tts, "listening") # how do we prevent this from being spoken every time an exception is thrown?
            time.sleep(1) # used to prevent hearing any spoken text; what else could we do?
            user_input = None
            sys.stdout.write(">")
            #record audio
            listener.pause_threshold = 0.5 # how long, in seconds, to observe silence before processing what was heard
            try:
                # With timeout=N, listen() raises sr.WaitTimeoutError if no speech starts
                # within N seconds, so the call belongs inside the try block.
                # Alternative: listen_in_background(source, callback) starts a thread that
                # calls callback(recognizer, audio) whenever a phrase is heard.
                audio = listener.listen(source, timeout=5)
                #convert audio to text
                #user_input = listener.recognize_sphinx(audio) #requires PocketSphinx installation
                user_input = listener.recognize_google(audio, show_all = False) # set show_all to True to get a dictionary of all possible transcriptions
 
                print(user_input)
                speak(tts, user_input)
            except sr.WaitTimeoutError:
                print("No speech detected")
            except sr.UnknownValueError:
                print("Could not understand audio")
            except sr.RequestError as e:
                print("Could not request results; {0}".format(e))
                 
            sys.stdout.write("\n")
 
 
if __name__ == "__main__":
    main()
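
The comment on recognize_google notes that show_all=True returns a dictionary of candidate transcriptions instead of a single string. The sketch below shows one way to pick a candidate from such a result. It assumes the dictionary has an "alternative" list whose entries carry a "transcript" and sometimes a "confidence"; that shape is an assumption, so print a real result to confirm it before relying on it. The helper name best_transcript and the sample data are hypothetical.

```python
def best_transcript(result):
    """Return (transcript, confidence) for the most confident alternative, or None."""
    # Assumption: an unrecognized phrase yields an empty or non-dict result.
    if not isinstance(result, dict):
        return None
    alternatives = result.get("alternative", [])
    if not alternatives:
        return None
    # Assumption: some alternatives omit "confidence"; treat those as 0.0.
    best = max(alternatives, key=lambda alt: alt.get("confidence", 0.0))
    return best["transcript"], best.get("confidence", 0.0)


# Hypothetical response, hand-written for illustration (not captured from a real request).
sample = {
    "alternative": [
        {"transcript": "turn on the light", "confidence": 0.92},
        {"transcript": "turn on the lights"},
    ],
    "final": True,
}

print(best_transcript(sample))
print(best_transcript([]))
```

A program could use the confidence value to decide whether to act on the input or to ask the user to repeat it.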

Questions

  1. How might you adapt this code for use in a text-based program you've written in the past?
  2. What challenges might you anticipate when using a voice approach, particularly with respect to accessibility, and how might you address them?
  3. What other modalities can you think of?
  4. How might you indicate to a user that it is time to input a certain value, and indicate what kinds of values are permissible?
  5. How do you enable the user to provide input and to understand output at the right time?
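
As a starting point for discussing questions 1 and 4, here is a minimal sketch of a prompt loop that states the permitted values each time it asks and re-prompts on anything else. It is one possible design, not the expected answer. The names prompt_for_choice, say, and hear are hypothetical: say could wrap speak(tts, text) from Model 1, and hear could wrap listen() plus recognize_google() and return None when nothing is understood. In a text-based program, print and input would serve the same roles.

```python
def prompt_for_choice(question, choices, say, hear, max_attempts=3):
    """Ask until the reply matches one of `choices` (lowercase strings).

    Returns the matched choice, or None after max_attempts failed tries.
    """
    options = ", ".join(choices)
    for _ in range(max_attempts):
        # State the permitted values each time so the user knows what can be said.
        say("{0} You can say: {1}.".format(question, options))
        reply = hear()
        if reply is None:
            say("I did not catch that.")
            continue
        normalized = reply.strip().lower()
        if normalized in choices:
            return normalized
        say("{0} is not one of the options.".format(reply))
    return None


# Stand-ins for the voice functions so the sketch runs without a microphone.
spoken = []
replies = iter([None, "purple", "Small"])
result = prompt_for_choice(
    "What size?", ["small", "large"], spoken.append, lambda: next(replies)
)
print(result)
```

Passing say and hear as parameters keeps the dialogue logic independent of the modality, so the same loop can be exercised with typed input before a microphone is involved.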

Adapted from Dr. Alvin Grissom’s 2020 HCI course

Submission

I encourage you to submit your answers to the questions (and ask your own questions!) using the Class Activity Questions discussion board. You may also respond to questions or comments made by others, or ask follow-up questions there. Answer any reflective prompt questions in the Reflective Journal section of your OneNote Classroom personal section. You can find the link to the class notebook on the syllabus.