Next IVR

Providing innovative contact center solutions and services.

Creating Dynamic GRXML Grammars with C# and ASP.NET


One of the fundamental tasks in creating speech applications is building the grammars for automated speech recognition (ASR).  This entry features techniques to make your grammar-building code fast, efficient and maintainable.

In many situations, it is unrealistic to design and build all your grammars in development and deploy them as static grammars in production.  For example, if you are writing an address verification application, you may want to ask the caller for the state, then the city, then the street and so on.  Instead of building every grammar up front (all the streets in each city, all the cities in each state), you can let caller activity decide: build and cache the street grammars only for the cities callers actually ask for, the city grammars only for the states they actually ask for, and so on.
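Here is a minimal sketch of that build-on-first-request idea. The GrammarCache class, its BuildCount property and the "state/city" key format are hypothetical names made up for illustration; the build delegate stands in for whatever code actually produces the grammar text.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: builds a grammar the first time a key is requested
// and hands back the cached copy on every later request.
public class GrammarCache
{
    private readonly Dictionary<string, string> _grammars = new Dictionary<string, string>();
    private readonly Func<string, string> _build;
    private readonly object _lock = new object();

    // Number of grammars actually built, i.e. the number of distinct keys requested so far.
    public int BuildCount { get; private set; }

    public GrammarCache(Func<string, string> build)
    {
        _build = build;
    }

    public string Get(string key)
    {
        lock (_lock)
        {
            string grammar;
            if (!_grammars.TryGetValue(key, out grammar))
            {
                // First request for this key: build once and remember the result.
                grammar = _build(key);
                _grammars[key] = grammar;
                BuildCount++;
            }
            return grammar;
        }
    }
}
```

Only the keys callers really ask for ever get built, so a city nobody requests never costs you a street grammar.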

I tried three methods for performing this task.  In all the examples, I connected to a database to retrieve choices for the grammar.
In the first example, I wrote directly to a stream, building the XML from strings.  In the next example, I used the XML from a DataSet and an XSL stylesheet to transform the data into a grammar.  In the third example, I did the same as the second example, but I sent the results directly to the Response stream in ASP.NET.  Figures 1, 2 and 3 respectively provide samples of the code.

All three performed well.  Using a simple performance measurement of the total processor time used, each took about a second or less.  The top performer by far was ASP.NET writing to the response stream.  Of course, I/O is the performance killer for the first two; writing a file to disk is slow compared to writing to memory.

Total seconds of processor time, by code sample.

Grammar items   Code 1   Code 2   Code 3
10              0.891    0.938    0.000!
100             0.906    0.984    0.012
1000            0.938    1.141    0.141
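The figures at the end of this entry print ticks, while the table above is in seconds.  A .NET tick is 100 nanoseconds, so there are 10,000,000 ticks per second.  The sketch below wraps the same TotalProcessorTime measurement in a helper and shows the conversion; CpuTimer and its method names are hypothetical, not part of the original test code.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper around the TotalProcessorTime measurement used in the figures.
public static class CpuTimer
{
    // Runs the work once and returns the processor time the whole process
    // consumed while it ran, in seconds (other threads are included).
    public static double MeasureSeconds(Action work)
    {
        TimeSpan before = Process.GetCurrentProcess().TotalProcessorTime;
        work();
        TimeSpan after = Process.GetCurrentProcess().TotalProcessorTime;
        return after.Subtract(before).TotalSeconds;
    }

    // Converts a tick count, as printed by the figures, to seconds.
    public static double TicksToSeconds(long ticks)
    {
        return (double)ticks / TimeSpan.TicksPerSecond;
    }
}
```

Keep in mind that processor time is not wall-clock time, so treat numbers like these as a rough comparison rather than a benchmark.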

So of course, I suggest you use the code in Figure 3.  Here are some reasons to prefer it over the code in Figure 1, along with a few tips.

  • If you need to customize the grammar, you can change the XSL file without recompiling the code.  Let's say you need to change the TAG element in the grammar (and for each VoiceXML platform, tags are implemented differently!).  You can adjust the XSL file and see visually how you're affecting the grammar.
  • By using the XML from the DataSet, you don't have to worry about data types as they're all converted to text.  If a database field changes in size or precision, the code still works without recompiling.
  • XSL is easier to read.  Mind you, it takes some work to master, but which code is easier for an IVR programmer to pick up?
    This:
while(TestDataReader.Read())  
   {  
   TestWriter.Write("<item>{0}</item><tag>colorid = {1};</tag>", TestDataReader.GetString(1), TestDataReader.GetInt32(0));  
   }  

Or this?

<xsl:for-each select="//record">
  <item>
    <xsl:value-of select="description"/>
  </item>
  <tag>colorid = <xsl:value-of select="id"/>;</tag>  
</xsl:for-each>
  • Using Page Output Caching (http://msdn2.microsoft.com/en-us/library/ms972362.aspx), you can get great performance from the dynamic grammars.  Cache the pages fresh every day, based on the URL parameters.  Schedule a task to call the common URLs, so the first caller of the day doesn't have to wait for the first compile (even though it's a fraction of a second).  You can also cache based on a database dependency; there are examples out there on how to do this.
  • Use web.config to store the SQL queries and XSL file names.  That makes this grammar builder really flexible.

In web.config:

<configuration>
  <appSettings>
    <add key="colors" value="SELECT id, description FROM colors" />
    <add key="colorsxsl" value="SimpleGrammarTransformer.xsl" />
  </appSettings>
</configuration>

In your code replace the SQL query with the following:

System.Configuration.ConfigurationSettings.AppSettings[Request.QueryString.Get("grammar_id")];

URL to get the colors:

http://servername/BuildGrammar?grammar_id=colors
  • Use a SCRIPT block in the XSL to manipulate the data, instead of doing it in the compiled code.  Using script makes it easy to run JavaScript against the XML as it's being processed by the XSL stylesheet.  I've used JavaScript to parse comma-delimited strings into grammar items, clean up data, and more.  Perhaps if you need an example, I can post one...
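Going back to the web.config tip: since grammar_id comes straight from the URL, it is worth checking that it maps to a known pair of settings before running anything.  Below is a minimal sketch, assuming the convention from the web.config sample (the query under the id, the stylesheet under the id plus "xsl").  GrammarSettings and TryResolve are hypothetical names.

```csharp
using System;
using System.Collections.Specialized;

// Hypothetical helper: looks up the SQL query and XSL file name for a grammar id.
public static class GrammarSettings
{
    // Returns true only when both the "<id>" and "<id>xsl" entries exist.
    // On failure both out values are null, so nothing half-resolved leaks out.
    public static bool TryResolve(NameValueCollection settings, string grammarId,
                                  out string sql, out string xslFile)
    {
        sql = null;
        xslFile = null;
        if (string.IsNullOrEmpty(grammarId))
        {
            return false;
        }

        // The indexer returns null for keys that are not present.
        string candidateSql = settings[grammarId];
        string candidateXsl = settings[grammarId + "xsl"];
        if (candidateSql == null || candidateXsl == null)
        {
            return false;
        }

        sql = candidateSql;
        xslFile = candidateXsl;
        return true;
    }
}
```

In the page you would pass System.Configuration.ConfigurationSettings.AppSettings and Request.QueryString.Get("grammar_id"), and return an error instead of a grammar when the call comes back false.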

In conclusion, you should use ASP.NET and XSLT to create your dynamic grammars.  It's fast, flexible and easy.  Let me know what you think.  Should I include a download, or can you take it from here?

Have fun!

Figure 1 - Reading from a DB, writing strings to a stream

using System;
using System.IO;
using MySql.Data.MySqlClient;

class Figure1
{
        static void Main(string[] args)
        {
                // Snapshot processor time so the cost of the build can be reported at the end.
                TimeSpan TS1 = System.Diagnostics.Process.GetCurrentProcess().TotalProcessorTime;

                MySqlConnection DatabaseConnection = new MySqlConnection("Database=;Data Source=;User Id=;Password=");
                DatabaseConnection.Open();
                MySqlCommand TestCommand = new MySqlCommand("SELECT id, description FROM colors", DatabaseConnection);
                // CloseConnection closes the connection when the reader is closed.
                MySqlDataReader TestDataReader = TestCommand.ExecuteReader(System.Data.CommandBehavior.CloseConnection);
                StreamWriter TestWriter = new StreamWriter(@"c:\temp\Grammar2.grxml");

                TestWriter.Write("<?xml version=\"1.0\" encoding=\"utf-8\"?><grammar mode=\"voice\" version=\"1.0\" root=\"main\"><rule id=\"main\"><one-of>");

                if (TestDataReader.HasRows)
                {
                        // Note: values are written unescaped, so a description containing & or < yields malformed XML.
                        while (TestDataReader.Read())
                        {
                                TestWriter.Write("<item>{0}</item><tag>colorid = {1};</tag>", TestDataReader.GetString(1), TestDataReader.GetInt32(0));
                        }
                }
                TestDataReader.Close();

                TestWriter.Write("</one-of></rule></grammar>");
                TestWriter.Close();

                TimeSpan TS2 = System.Diagnostics.Process.GetCurrentProcess().TotalProcessorTime;
                Console.WriteLine("Done. Total ticks = {0}.", TS2.Subtract(TS1).Ticks.ToString());
                Console.ReadLine();
        }
}
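One thing to watch with the string approach in Figure 1: the descriptions go into the XML as-is, so a value containing & or < would produce a grammar that doesn't parse.  If you want to stay with hand-written code rather than XSL, a possible alternative is to let XmlWriter do the escaping.  The sketch below is an assumption-laden illustration, not the code I timed: GrammarWriter is a hypothetical name, the rows are passed in as id/description pairs instead of coming from a data reader, and the element layout simply mirrors Figure 1.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Xml;

// Hypothetical alternative to the string writes in Figure 1.
public static class GrammarWriter
{
    // Builds the grammar text from id/description pairs, escaping values as needed.
    public static string Build(IEnumerable<KeyValuePair<int, string>> colors)
    {
        StringBuilder output = new StringBuilder();
        XmlWriterSettings settings = new XmlWriterSettings();
        // Writing to a StringBuilder would declare utf-16, so leave the declaration out here.
        settings.OmitXmlDeclaration = true;

        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            writer.WriteStartElement("grammar");
            writer.WriteAttributeString("mode", "voice");
            writer.WriteAttributeString("version", "1.0");
            writer.WriteAttributeString("root", "main");
            writer.WriteStartElement("rule");
            writer.WriteAttributeString("id", "main");
            writer.WriteStartElement("one-of");

            foreach (KeyValuePair<int, string> color in colors)
            {
                // WriteElementString escapes &, < and > in the text for us.
                writer.WriteElementString("item", color.Value);
                writer.WriteElementString("tag", "colorid = " + color.Key + ";");
            }

            writer.WriteEndElement(); // one-of
            writer.WriteEndElement(); // rule
            writer.WriteEndElement(); // grammar
        }
        return output.ToString();
    }
}
```

The XSL approach in Figures 2 and 3 gets the same escaping for free, which is one more point in its favor.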

Figure 2 - Reading from a DB to a DataSet, then transforming with XSLT.

using System;
using System.Data;
using System.Xml;
using System.Xml.Xsl;
using MySql.Data.MySqlClient;

class Figure2
{
        static void Main(string[] args)
        {
                TimeSpan TS1 = System.Diagnostics.Process.GetCurrentProcess().TotalProcessorTime;

                // Fill a DataSet whose rows are named "record" so the XSL can select //record.
                MySqlConnection DatabaseConnection = new MySqlConnection("Database=;Data Source=;User Id=;Password=");
                MySqlDataAdapter DataAdapter = new MySqlDataAdapter("SELECT id, description FROM colors", DatabaseConnection);
                DataSet DBDataSet = new DataSet();
                DataAdapter.Fill(DBDataSet, "record");
                XmlDocument XMLTarget = new XmlDocument();
                XMLTarget.LoadXml("<records>" + DBDataSet.GetXml() + "</records>");
                string XmlTempFile = @"c:\temp\temprecords.xml";
                XMLTarget.Save(XmlTempFile);

                // Transform the temporary XML file into the grammar file on disk.
                string XslFile = "file://c:/temp/SimpleGrammarTransformer.xsl";
                XslTransform StyleSheet = new XslTransform();
                StyleSheet.Load(XslFile);
                StyleSheet.Transform(XmlTempFile, @"c:\temp\Grammar1.grxml");

                TimeSpan TS2 = System.Diagnostics.Process.GetCurrentProcess().TotalProcessorTime;
                Console.WriteLine("Done. Total ticks = {0}.", TS2.Subtract(TS1).Ticks.ToString());
                Console.ReadLine();
        }
}

Figure 3 - Reading from a DB and transforming the results to the response stream.

<%@ Page Language="C#" AutoEventWireup="false" Debug="true" %><%@ Import namespace="System.Xml"%><%@ Import namespace="MySql.Data.MySqlClient"%><%

TimeSpan TS1 =
System.Diagnostics.Process.GetCurrentProcess().TotalProcessorTime;

MySqlConnection DatabaseConnection = new MySqlConnection("Database=;Data Source=;User Id=;Password=");  
MySqlDataAdapter DataAdapter = new MySqlDataAdapter("SELECT id, description FROM colors", DatabaseConnection);  
System.Data.DataSet DBDataSet = new System.Data.DataSet();  
DataAdapter.Fill(DBDataSet, "record");  
XmlDocument XMLTarget = new XmlDocument();  
XMLTarget.LoadXml("<records>" + DBDataSet.GetXml() + "</records>");

// Turn the physical path into a file:// URL by swapping backslashes for forward slashes.
string XslFile = String.Format("file://{0}",
Server.MapPath("SimpleGrammarTransformer.xsl")).Replace("\\", "/");
System.Xml.Xsl.XslTransform StyleSheet = new System.Xml.Xsl.XslTransform();
XmlUrlResolver URLResolver = new XmlUrlResolver();
StyleSheet.Load(XslFile, URLResolver);

StyleSheet.Transform(XMLTarget, null, Response.OutputStream, null);

TimeSpan TS2 = System.Diagnostics.Process.GetCurrentProcess().TotalProcessorTime;

Response.Write(String.Format("<!--Done. Total ticks = {0} .-->",
TS2.Subtract(TS1).Ticks.ToString()));

%>

Figure 4 - An XSLT file for building GRXML grammars

You can always change this so it outputs ABNF, GSL, or any other grammar format.

<?xml version="1.0"?>  
<xsl:stylesheet version="1.0"  
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"  
  xmlns:msxsl="urn:schemas-microsoft-com:xslt"  
  exclude-result-prefixes="msxsl"  
  >  
<xsl:output method="xml"/>  
<xsl:template match="/">  
  <grammar mode="voice" version="1.0" root="main" >  
    <rule id="main">  
      <one-of>  
        <xsl:for-each select="//record">  
          <item><xsl:value-of select="description"/></item>
          <tag>colorid = <xsl:value-of select="id"/>;</tag>  
        </xsl:for-each>  
      </one-of>  
    </rule>  
  </grammar>  
</xsl:template>  
</xsl:stylesheet>  
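If you want to try a stylesheet like this without a database or a web server, here is a minimal sketch that runs it entirely in memory.  Some assumptions to flag: it uses XslCompiledTransform (the newer replacement for the XslTransform class used in my figures), the stylesheet is inlined as a string with the msxsl namespace dropped and omit-xml-declaration added, and the Records constant is a hand-written stand-in for the XML that DataSet.GetXml() returns.  TransformDemo is a hypothetical name.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

// Hypothetical test harness for the stylesheet in Figure 4.
public static class TransformDemo
{
    // Same template as Figure 4, inlined so the sketch is self-contained.
    const string Xsl = @"<?xml version=""1.0""?>
<xsl:stylesheet version=""1.0"" xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">
<xsl:output method=""xml"" omit-xml-declaration=""yes""/>
<xsl:template match=""/"">
  <grammar mode=""voice"" version=""1.0"" root=""main"">
    <rule id=""main"">
      <one-of>
        <xsl:for-each select=""//record"">
          <item><xsl:value-of select=""description""/></item>
          <tag>colorid = <xsl:value-of select=""id""/>;</tag>
        </xsl:for-each>
      </one-of>
    </rule>
  </grammar>
</xsl:template>
</xsl:stylesheet>";

    // Stand-in for "<records>" + DBDataSet.GetXml() + "</records>".
    const string Records =
        "<records><NewDataSet>" +
        "<record><id>1</id><description>red</description></record>" +
        "<record><id>2</id><description>orange</description></record>" +
        "</NewDataSet></records>";

    public static string Run()
    {
        XslCompiledTransform styleSheet = new XslCompiledTransform();
        using (XmlReader xslReader = XmlReader.Create(new StringReader(Xsl)))
        {
            styleSheet.Load(xslReader);
        }

        XmlDocument source = new XmlDocument();
        source.LoadXml(Records);

        // Transform straight into memory; nothing touches the disk.
        StringWriter output = new StringWriter();
        styleSheet.Transform(source, null, output);
        return output.ToString();
    }
}
```

Swapping the StringWriter for Response.OutputStream gives you the same shape as Figure 3.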

Figure 5 - GRXML result

<?xml version="1.0" encoding="utf-8"?>  
<grammar mode="voice" version="1.0" root="main">  
  <rule id="main">  
    <one-of>  
      <item>red</item><tag>colorid = 1;</tag>  
      <item>orange</item><tag>colorid = 2;</tag>  
      <item>yellow</item><tag>colorid = 3;</tag>  
      ...
    </one-of>  
  </rule>  
</grammar>