Skip to main content

Advanced CUI Design

Branching and Disambiguation

Constrained Responses

  • Designing the prompt to constrain the response, so that the response is likely to contain a single keyword to progress
    • What city are you traveling to
    • What is your main symptom
  • Good strat is to used N-best lists based on the users following constrained response
    • Containing the N likely matches to find the correct results
    • Then iterate over the list until the user confirms

Open Speech

  • To allow for flow when the response doesn't need to be processed
  • Provide a general reply
  • Similar to the generic confirmation strategy
  • Can be used to obtain information in a 'natural', 'conversational' way

Categorisation of input

  • Customizing user input

Wildcards and logical expressions

  • Wildcards allows for more flexibility by allowing certain words to be repeated without having to specify them explicitly
  • Logical expressions to link keywords/phrases together

Disambiguation

  • Different kinds of situation that require disambiguation
    • Not enough information
    • More than one piece of information, when only one is expected

Dialog Management and other advanced topics

  • Need to recognise both object and intent (What the user wants to do with it)
    • Show me my calendar
    • Add an event to my calendar
    • Delete my meeting from my calendar
  • Intent and object are usually combined in the concept of utterances

Sentiment Analysis

  • The process of computationally identifying and categorizing opinions expressed in a piece of text
  • Whether the writers attitude towards a particular topic, product is pos, neg or neut
  • NLP needed.

TTS vs. recorded speech

  • Synthetic generated vs recorded human speech
    • TTS much better than what it was, good range of voices available
    • Recording voice talent can make your VUI sound more natural
    • Pronunciation, prosody adapted to contextual/local nuances

Speaker Verification

  • AKA voice biometric authentication
  • Could be used for authentication for security

Personalisation

  • Using the user's name and other information about the user to personalise the chatbots prompts
  • Greeting the user according to time zone etc

Advanced multimodal

  • Hybrid strategies of voice and visual output
  • Could also have hybrid strategies of voice and visual input