One of the fundamental tasks in creating speech applications is building the grammars for automated speech recognition (ASR). This entry features techniques to make your grammar-building code fast, efficient and maintainable.
In many situations, it is unrealistic to design and build your grammars in development and deploy them as static grammars in production. For example, if you are writing an address verification application, you may want to ask the caller for the state, then the city, then the street and so on. Instead of building every grammar up front (all the streets in each city, all the cities in each state), you may want to let caller activity decide which grammars get built and cached: street grammars for the most popular cities, city grammars for the most popular states, and so on.
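To make the "build on demand, then cache" idea concrete, here is a minimal sketch. The class name GrammarCache, the key format and the builder delegate are all hypothetical; the delegate stands in for whichever grammar-building code you choose from the figures below. It has no eviction or expiry, so treat it as a starting point rather than a finished design.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: builds a grammar the first time a key is requested
// and serves the cached copy on every later request for the same key.
public sealed class GrammarCache
{
    private readonly Dictionary<string, string> grammars = new Dictionary<string, string>();
    private readonly Func<string, string> buildGrammar;
    private readonly object sync = new object();

    // buildGrammar is an assumption: any function that turns a key
    // (for example "streets/WA/Seattle") into grammar text.
    public GrammarCache(Func<string, string> buildGrammar)
    {
        if (buildGrammar == null) throw new ArgumentNullException("buildGrammar");
        this.buildGrammar = buildGrammar;
    }

    public string Get(string key)
    {
        lock (sync)
        {
            string grammar;
            if (!grammars.TryGetValue(key, out grammar))
            {
                // Cache miss: only grammars that callers actually ask for get built.
                grammar = buildGrammar(key);
                grammars[key] = grammar;
            }
            return grammar;
        }
    }
}
```

With this in place, a call such as cache.Get("streets/WA/Seattle") pays the database and transform cost once, and only for the cities callers actually ask about.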
I tried three methods for performing this task. In all the examples, I connected to a database to retrieve the choices for the grammar. In the first example, I wrote directly to a stream, writing the XML as strings. In the second example, I used an XML dataset and an XSL stylesheet to transform the data into a grammar. In the third example, I did the same as in the second, but sent the results directly to the Response stream in ASP.NET. Figures 1, 2 and 3 respectively provide samples of the code.
All three performed well. Measured simply by total processor time used, each finished in a fraction of a second. The top performer by far was the ASP.NET version that writes to the response stream. Of course, I/O is the performance killer for the first two: writing a file to disk is slow compared to writing to memory.
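The figures measure total processor time, which counts CPU work but not time spent waiting on the disk. If you want to see the I/O cost directly, one option is to record wall-clock time alongside it. The sketch below does both; TimingSketch and Measure are hypothetical names, not part of the original samples.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: runs a piece of work once and reports both
// CPU time (as the figures do) and elapsed wall-clock time.
public static class TimingSketch
{
    public static void Measure(Action work, out TimeSpan cpu, out TimeSpan elapsed)
    {
        if (work == null) throw new ArgumentNullException("work");

        // GetCurrentProcess() returns a fresh snapshot on each call.
        TimeSpan cpuBefore = Process.GetCurrentProcess().TotalProcessorTime;
        Stopwatch wallClock = Stopwatch.StartNew();

        work();

        wallClock.Stop();
        TimeSpan cpuAfter = Process.GetCurrentProcess().TotalProcessorTime;

        cpu = cpuAfter - cpuBefore;      // processor time consumed by the process
        elapsed = wallClock.Elapsed;     // includes time blocked on disk or network
    }
}
```

Wrapping each of the three approaches in a call to Measure and comparing the two numbers should show how much of the difference is CPU and how much is waiting on the file system.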
So I suggest you use the code in Figure 3. Here are some reasons to prefer it over the code in Figure 1.
Would you rather maintain this?

    while (TestDataReader.Read())
    {
        TestWriter.Write("<item>{0}</item><tag>colorid = {1};</tag>",
            TestDataReader.GetString(1), TestDataReader.GetInt32(0));
    }

Or this?

    <xsl:for-each select="//record">
      <item><xsl:value-of select="description"/></item><tag>colorid = <xsl:value-of select="id"/>;</tag>
    </xsl:for-each>
In web.config:

    <configuration>
      <appSettings>
        <add key="colors" value="SELECT id, description FROM colors" />
        <add key="colorsxsl" value="SimpleGrammarTransformer.xsl" />
      </appSettings>
    </configuration>

In your code, replace the SQL query with the following:

    System.Configuration.ConfigurationSettings.AppSettings[Request.QueryString.Get("grammar_id")]

URL to get the colors grammar: http://servername/BuildGrammar?grammar_id=colors
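Because grammar_id comes straight from the query string, it is worth checking that it names a complete grammar entry before using it. The sketch below assumes the key convention shown above (the SQL under "colors", the stylesheet under "colorsxsl"); GrammarSettings and TryResolve are hypothetical names, and the settings collection is passed in so the lookup can be tried outside ASP.NET.

```csharp
using System;
using System.Collections.Specialized;

// Hypothetical helper: resolves a grammar_id to its SQL query and stylesheet,
// following the "<id>" / "<id>xsl" key convention from web.config.
public static class GrammarSettings
{
    public static bool TryResolve(NameValueCollection settings, string grammarId,
                                  out string sql, out string stylesheet)
    {
        sql = null;
        stylesheet = null;
        if (settings == null || string.IsNullOrEmpty(grammarId))
        {
            return false;
        }

        // The indexer returns null when the key is not present.
        string candidateSql = settings[grammarId];
        string candidateXsl = settings[grammarId + "xsl"];

        // Require both keys, so an id such as "colorsxsl" (which has a value
        // but no matching "colorsxslxsl" entry) is rejected.
        if (candidateSql == null || candidateXsl == null)
        {
            return false;
        }

        sql = candidateSql;
        stylesheet = candidateXsl;
        return true;
    }
}
```

In the page itself you would pass the application's settings collection and the grammar_id from the query string, and return an error response when TryResolve reports false.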
In conclusion, you should use ASP.NET and XSLT to create your dynamic grammars. It's fast, flexible and easy. Let me know what you think. Should I include a download, or can you take it from here?
Have fun!
Figure 1 - Reading from a DB, writing strings to a stream
    // Requires: using System; and a reference to the MySql.Data.MySqlClient provider.
    static void Main(string[] args)
    {
        // Record processor time before doing any work.
        TimeSpan TS1 = System.Diagnostics.Process.GetCurrentProcess().TotalProcessorTime;

        MySqlConnection DatabaseConnection = new MySqlConnection("Database=;Data Source=;User Id=;Password=");
        DatabaseConnection.Open();
        MySqlCommand TestCommand = new MySqlCommand("SELECT id, description FROM colors", DatabaseConnection);
        // CloseConnection ties the connection's lifetime to the reader.
        MySql.Data.MySqlClient.MySqlDataReader TestDataReader = TestCommand.ExecuteReader(System.Data.CommandBehavior.CloseConnection);

        // Write the grammar to disk as hand-built XML strings.
        System.IO.StreamWriter TestWriter = new System.IO.StreamWriter("c:\\temp\\Grammar2.grxml");
        TestWriter.Write("<?xml version=\"1.0\" encoding=\"utf-8\"?><grammar mode=\"voice\" version=\"1.0\" root=\"main\"><rule id=\"main\"><one-of>");
        if (TestDataReader.HasRows)
        {
            while (TestDataReader.Read())
            {
                TestWriter.Write("<item>{0}</item><tag>colorid = {1};</tag>", TestDataReader.GetString(1), TestDataReader.GetInt32(0));
            }
        }
        TestWriter.Write("</one-of></rule></grammar>");
        TestWriter.Close();
        // Closing the reader also closes the connection (see CloseConnection above).
        TestDataReader.Close();

        TimeSpan TS2 = System.Diagnostics.Process.GetCurrentProcess().TotalProcessorTime;
        Console.WriteLine("Done. Total ticks = {0}.", TS2.Subtract(TS1).Ticks.ToString());
        Console.ReadLine();
    }
Figure 2 - Reading from a DB to a DataSet, then transforming with XSLT.
    // Requires: using System; using System.Xml; using MySql.Data.MySqlClient;
    static void Main(string[] args)
    {
        TimeSpan TS1 = System.Diagnostics.Process.GetCurrentProcess().TotalProcessorTime;

        MySqlConnection DatabaseConnection = new MySqlConnection("Database=;Data Source=;User Id=;Password=");
        MySqlDataAdapter DataAdapter = new MySqlDataAdapter("SELECT id, description FROM colors", DatabaseConnection);
        System.Data.DataSet DBDataSet = new System.Data.DataSet();
        // Naming the table "record" makes each row serialize as a <record> element.
        DataAdapter.Fill(DBDataSet, "record");

        // Wrap the DataSet XML in a <records> root and save it to a temp file.
        XmlDocument XMLTarget = new XmlDocument();
        XMLTarget.LoadXml("<records>" + DBDataSet.GetXml() + "</records>");
        string XmlTempFile = "c:\\temp\\temprecords.xml";
        XMLTarget.Save(XmlTempFile);

        // Transform the temp file into the grammar file on disk.
        string XslFile = "file://c:/temp/SimpleGrammarTransformer.xsl";
        System.Xml.Xsl.XslTransform StyleSheet = new System.Xml.Xsl.XslTransform();
        StyleSheet.Load(XslFile);
        StyleSheet.Transform(XmlTempFile, "c:\\temp\\Grammar1.grxml");

        TimeSpan TS2 = System.Diagnostics.Process.GetCurrentProcess().TotalProcessorTime;
        Console.WriteLine("Done. Total ticks = {0}.", TS2.Subtract(TS1).Ticks.ToString());
        Console.ReadLine();
    }
Figure 3 - Reading from a DB and transforming the results to the response stream.
    <%@ Page Language="c#" AutoEventWireup="false" Debug="true" %>
    <%@ Import namespace="System.Xml"%>
    <%@ Import namespace="MySql.Data.MySqlClient"%>
    <%
    TimeSpan TS1 = System.Diagnostics.Process.GetCurrentProcess().TotalProcessorTime;

    MySqlConnection DatabaseConnection = new MySqlConnection("Database=;Data Source=;User Id=;Password=");
    MySqlDataAdapter DataAdapter = new MySqlDataAdapter("SELECT id, description FROM colors", DatabaseConnection);
    System.Data.DataSet DBDataSet = new System.Data.DataSet();
    // Naming the table "record" makes each row serialize as a <record> element.
    DataAdapter.Fill(DBDataSet, "record");

    XmlDocument XMLTarget = new XmlDocument();
    XMLTarget.LoadXml("<records>" + DBDataSet.GetXml() + "</records>");

    // Resolve the stylesheet relative to this page and load it as a file URL.
    string XslFile = String.Format("file://{0}", Server.MapPath("SimpleGrammarTransformer.xsl")).Replace("\\", "/");
    System.Xml.Xsl.XslTransform StyleSheet = new System.Xml.Xsl.XslTransform();
    XmlUrlResolver URLResolver = new XmlUrlResolver();
    StyleSheet.Load(XslFile, URLResolver);

    // Transform in memory and write straight to the HTTP response; no temp files.
    StyleSheet.Transform(XMLTarget, null, Response.OutputStream, null);

    TimeSpan TS2 = System.Diagnostics.Process.GetCurrentProcess().TotalProcessorTime;
    Response.Write(String.Format("<!--Done. Total ticks = {0} .-->", TS2.Subtract(TS1).Ticks.ToString()));
    %>
Figure 4 - An XSLT file for building GRXML grammars
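You can always change this stylesheet so that it outputs ABNF or GSL instead. Both are plain-text formats, so that also means switching xsl:output to method="text".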
    <?xml version="1.0"?>
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:msxsl="urn:schemas-microsoft-com:xslt"
        exclude-result-prefixes="msxsl">
      <xsl:output method="xml"/>
      <xsl:template match="/">
        <grammar mode="voice" version="1.0" root="main">
          <rule id="main">
            <one-of>
              <xsl:for-each select="//record">
                <item><xsl:value-of select="description"/></item><tag>colorid = <xsl:value-of select="id"/>;</tag>
              </xsl:for-each>
            </one-of>
          </rule>
        </grammar>
      </xsl:template>
    </xsl:stylesheet>
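As a rough illustration of retargeting the stylesheet in Figure 4 at ABNF, here is a sketch that runs a text-mode stylesheet over the same records XML. Several things here are assumptions: the class and method names are hypothetical, the stylesheet is embedded as a string only to keep the example self-contained, and it uses XslCompiledTransform rather than the XslTransform class from the figures. The ABNF header mirrors the attributes of the GRXML grammar and has not been tested against a particular recognizer; descriptions containing ABNF special characters would need quoting.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

// Hypothetical sketch: transforms <record> elements into an ABNF grammar.
public static class AbnfGrammarSketch
{
    // Text-mode variant of the Figure 4 stylesheet (an assumption, not the author's file).
    private const string AbnfStylesheet = @"<?xml version=""1.0""?>
<xsl:stylesheet version=""1.0"" xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">
  <xsl:output method=""text""/>
  <xsl:template match=""/"">
    <xsl:text>#ABNF 1.0 UTF-8;&#10;mode voice;&#10;root $main;&#10;$main = </xsl:text>
    <xsl:for-each select=""//record"">
      <xsl:if test=""position() &gt; 1""><xsl:text> | </xsl:text></xsl:if>
      <xsl:value-of select=""description""/>
      <xsl:text> {colorid = </xsl:text>
      <xsl:value-of select=""id""/>
      <xsl:text>;}</xsl:text>
    </xsl:for-each>
    <xsl:text>;&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>";

    // recordsXml is expected in the shape the figures produce:
    // <records><NewDataSet><record><id>..</id><description>..</description></record>..</NewDataSet></records>
    public static string ToAbnf(string recordsXml)
    {
        XslCompiledTransform transform = new XslCompiledTransform();
        using (XmlReader stylesheet = XmlReader.Create(new StringReader(AbnfStylesheet)))
        {
            transform.Load(stylesheet);
        }

        using (XmlReader input = XmlReader.Create(new StringReader(recordsXml)))
        using (StringWriter output = new StringWriter())
        {
            // No extension arguments are needed, so the argument list is null.
            transform.Transform(input, null, output);
            return output.ToString();
        }
    }
}
```

The alternatives are separated with " | " inside a single rule, which plays the role that one-of plays in the GRXML version.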
Figure 5 - GRXML result

    <?xml version="1.0" encoding="utf-8"?>
    <grammar mode="voice" version="1.0" root="main">
      <rule id="main">
        <one-of>
          <item>red</item><tag>colorid = 1;</tag>
          <item>orange</item><tag>colorid = 2;</tag>
          <item>yellow</item><tag>colorid = 3;</tag>
          . . .
        </one-of>
      </rule>
    </grammar>