Using a DTMF grammar with semantic tags

Jean-Philippe Gariépy edited this page Apr 25, 2015 · 2 revisions

The earlier DTMF recognition examples that relied on a built-in grammar read the interpretation property of the recognition result, because built-in grammars provide a semantic interpretation. Working with the interpretation is preferable to working with the raw utterance.

In our previous example (Using a DTMF grammar), the optional "1" is part of the result whenever the caller presses that key, so the dialogue must handle both cases. In other words, the normalization has to be done in the dialogue. The DTMF utterance may also contain a space between digits (e.g. 5 1 4 5 5 5 1 2 3 4), which calls for a second normalization step.
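To make that burden concrete, here is a minimal sketch of the dialogue-side normalization that a grammar without semantic tags would require. The class and method names (PhoneNumberNormalizer, normalize) are hypothetical and not part of Rivr. It assumes the raw utterance contains only digits and optional whitespace.

```java
public final class PhoneNumberNormalizer {

    private PhoneNumberNormalizer() {
    }

    /**
     * Hypothetical helper: turns a raw DTMF utterance such as
     * "1 5 1 4 5 5 5 1 2 3 4" into a bare 10-digit phone number.
     */
    public static String normalize(String rawUtterance) {
        // Remove the whitespace that may separate the digits.
        String digits = rawUtterance.replaceAll("\\s+", "");

        // The area code starts with 2-9, so a leading "1" on an
        // 11-digit string can only be the optional country code.
        if (digits.length() == 11 && digits.startsWith("1")) {
            return digits.substring(1);
        }
        return digits;
    }

    public static void main(String[] args) {
        System.out.println(normalize("1 5 1 4 5 5 5 1 2 3 4")); // prints 5145551234
        System.out.println(normalize("5145551234"));            // prints 5145551234
    }
}
```

Every dialogue that uses the untagged grammar would need a helper like this, which is the duplication that semantic tags avoid.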

If the grammar provides semantic tags, you have control over what is returned in the recognition result. Moreover, if you are performing speech recognition and DTMF recognition at once, you can return the same format for the interpretation regardless of the input mode (i.e. DTMF or speech), making it easier to process in the dialogue.

Here's the phone number grammar modified with semantic tags:

phone-number.grxml:

<rule id="phoneNumber" scope="public">
  <tag>out = "";</tag>

  <!-- optional country code: 1 (no tag follows, so it is never appended to out) -->
  <item repeat="0-1">1</item>

  <!-- NPA (area code) -->
  <ruleref uri="#r2to9" />
  <tag>out += rules.latest()</tag>
  <item repeat="2">
    <item>
      <ruleref uri="#r0to9" />
      <tag>out += rules.latest()</tag>
    </item>
  </item>

  <!-- NXX -->
  <ruleref uri="#r2to9" />
  <tag>out += rules.latest()</tag>
  <item repeat="2">
    <item>
      <ruleref uri="#r0to9" />
      <tag>out += rules.latest()</tag>
    </item>
  </item>

  <!-- XXXX -->
  <item repeat="4">
    <item>
      <ruleref uri="#r0to9" />
      <tag>out += rules.latest()</tag>
    </item>
  </item>
</rule>

Here's how to access the result:

Dialogue.java:

Logger logger = context.getLogger();
if (inputTurn.getRecognitionInfo() != null) {
    JsonArray recognitionResult = inputTurn.getRecognitionInfo().getRecognitionResult();
    // Extract the "interpretation" of the first recognition hypothesis.
    String phoneNumber = recognitionResult.getJsonObject(0).getString("interpretation");
    logger.info("Phone number entered: " + phoneNumber);
} else if (VoiceXmlEvent.hasEvent(VoiceXmlEvent.NO_INPUT, inputTurn.getEvents())) {
    logger.info("Timeout.");
} else if (VoiceXmlEvent.hasEvent(VoiceXmlEvent.NO_MATCH, inputTurn.getEvents())) {
    logger.info("Invalid phone number");
}

Note that the phone number doesn't need any normalization at this point. The optional country code will not be part of the interpretation.
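If a defensive check is wanted anyway, the shape of the interpretation can be verified against the structure the grammar enforces. This is only a sketch: the class and method names (InterpretationCheck, isNormalized) are hypothetical, and the pattern assumes that the r2to9 and r0to9 rules return a single digit each, as their names suggest.

```java
import java.util.regex.Pattern;

public final class InterpretationCheck {

    // Mirrors the grammar: area code and exchange start with 2-9,
    // followed by four digits; no country code and no spaces.
    private static final Pattern TEN_DIGITS =
            Pattern.compile("[2-9][0-9]{2}[2-9][0-9]{2}[0-9]{4}");

    private InterpretationCheck() {
    }

    /** Hypothetical helper: true when the value is a bare 10-digit number. */
    public static boolean isNormalized(String interpretation) {
        return interpretation != null
                && TEN_DIGITS.matcher(interpretation).matches();
    }

    public static void main(String[] args) {
        System.out.println(isNormalized("5145551234"));  // prints true
        System.out.println(isNormalized("15145551234")); // prints false
    }
}
```

With the tagged grammar this check should always pass, so it is mostly useful as an assertion during development.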


Running this example

You can download or browse the complete code for this example at GitHub. This is a complete working application that you can build and run for yourself.

You can also clone the Rivr Cookbook repository and check out this example:

git clone -b dtmf-grammar-with-semantic git@github.com:nuecho/rivr-cookbook.git

Then, to build and run it:

cd rivr-cookbook

./gradlew jettyRun

The VoiceXML dialogue should be available at http://localhost:8080/rivr-cookbook/dialogue

To stop the application, press Control-C in the console.