Friday, May 11, 2007

I needed to add some simple, free FAX functionality to an application, so I invested a little time in doing it with Visual C# 2005 Express Edition.

Source from the web:

Resulting image:

I decided to make it use web browser content: since it's pretty easy to format a web page, no FAX imaging tools are required. Thanks to Michael McCloskey's article, Bitonal (TIFF) Image Converter for .NET, I was able to accomplish my goal rather easily!

I added in some random dithering to produce a decent balance between text pages and pages with images. I was tempted to implement Floyd-Steinberg Dithering, but my random results are good enough for my project.
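
If you haven't met random dithering before, here is a minimal sketch of the idea. It is not the code from Url2Fax: the names RandomDither and ToBitonal are made up for illustration, and a plain byte array stands in for the bitmap. Each gray pixel is compared against a random threshold instead of a fixed one, so mid-gray areas come out as a scatter of black and white dots.

```csharp
using System;

public static class RandomDither
{
    // gray[y, x] holds 0 (black) to 255 (white).
    // Returns true where the output pixel should be white.
    public static bool[,] ToBitonal(byte[,] gray, int seed)
    {
        int height = gray.GetLength(0);
        int width = gray.GetLength(1);
        bool[,] white = new bool[height, width];
        Random random = new Random(seed);

        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                // Threshold in 1..255: a 0 pixel is never white, a 255 pixel always is
                int threshold = random.Next(1, 256);
                white[y, x] = gray[y, x] >= threshold;
            }
        }
        return white;
    }
}
```

Pure black and pure white pixels pass through unchanged, which is why text stays readable while photos still get some tone.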

I also added in code for generating the image from a web page, and creating the multi-page TIFF document.

The code references FaxComEx.dll (the Windows FAX service COM library), which makes it easy to send a FAX from the command line. For example, Url2Fax http://localhost/reportapp/report.jsp?repnum=1 18885551212 FAXSVR100 will send a report to 8885551212, using the fax printer on FAXSVR100 (don't use "localhost" as the server name). If you don't want to use the FAX service, leave off the fax number and server name. You can grab the image from your %TEMP% directory.

Here's the source code: Url2Fax.zip. The exe is located in the release folder, in case you want to try it as is. You'll need Windows XP or 2003 to use the built-in Fax delivery. For Windows 2000, you can generate the image and then send it with the Windows 2000 fax server (I have a script for that, too).


Enjoy!

5/11/2007 4:21:20 PM (GMT Daylight Time, UTC+01:00)
Thursday, May 03, 2007

One of the fundamental tasks in creating speech applications is building the grammars for automated speech recognition (ASR).  This entry features techniques to make your grammar-building code fast, efficient and maintainable.

In many situations, it is unrealistic to design and build your grammars in development and deploy them as static grammars in production. For example, if you are writing an address verification application, you may want to ask the caller for the state, then the city, then the street and so on. Instead of building every grammar up front (all the streets in each city, all the cities in each state), you may want to let caller activity decide which grammars get built and cached: street grammars for the most popular cities, city grammars for the most popular states, and so on.

I tried three methods for performing this task. In all the examples, I connected to a database to retrieve choices for the grammar.
In the first example, I wrote directly to a file stream, building the XML from strings. In the second, I loaded the data into a DataSet, saved its XML, and used an XSL stylesheet to transform it into a grammar file. In the third, I did the same as the second, but I sent the results directly to the Response stream in ASP.NET. Figures 1, 2 and 3 respectively provide samples of the code.

All three performed well. Measured simply by total processor time used, each took about a second or less. The top performer by far was ASP.NET with the response stream. Of course, I/O is the performance killer for the first two: writing a file to disk is slow compared to writing to memory.

Total seconds of processor time, by code sample:

Grammar items   Code 1   Code 2   Code 3
10              0.891    0.938    0.000!
100             0.906    0.984    0.012
1000            0.938    1.141    0.141


So of course, I suggest you use the code in Figure 3. Here are some reasons to prefer it over the code in Figure 1, along with a few tips.

  • If you need to customize the grammar, you can change the XSL file without recompiling the code. Let's say you need to change the TAG element in the grammar (and tags are implemented differently on each VoiceXML platform!): you can adjust the XSL file and see visually how you're affecting the grammar.
  • By using the XML from the DataSet, you don't have to worry about data types as they're all converted to text.  If a database field changes in size or precision, the code still works without recompiling.
  • XSL is easier to read.  Mind you, to master it takes some work, but which code is easier for an IVR programmer to pick up...
    This:              

    while(TestDataReader.Read())
    {
    TestWriter.Write("<item>{0}</item><tag>colorid = {1};</tag>", TestDataReader.GetString(1), TestDataReader.GetInt32(0));
    }

    Or this?
    <xsl:for-each select="//record">
    <item><xsl:value-of select="description"/></item><tag>colorid = <xsl:value-of select="id"/>;</tag>
    </xsl:for-each>

  • Using Page Output Caching (http://msdn2.microsoft.com/en-us/library/ms972362.aspx), you can get great performance from the dynamic grammars. Cache the pages fresh every day, varying by the URL parameters. Schedule a task to call the common URLs, so the first caller of the day doesn't have to wait for the first compile (even though it's a fraction of a second). You can also cache based on a database dependency; there are examples out there on how to do this.
  • Use web.config to store the SQL queries and XSL file names. That'll make the grammar builder really flexible.

 In web.config:
 <configuration>
  <appSettings>
   <add key="colors" value="SELECT id, description FROM colors" />
   <add key="colorsxsl" value="SimpleGrammarTransformer.xsl" />
  </appSettings>
 </configuration>

 In your code replace the SQL query with the following:
  System.Configuration.ConfigurationSettings.AppSettings[Request.QueryString.Get("grammar_id")]

 URL to get the colors grammar: 
  http://servername/BuildGrammar?grammar_id=colors

  • Use a script block in the XSL (an msxsl:script element) to manipulate the data, instead of doing it in the compiled code. Script makes it easy to run JavaScript on the XML as it's being processed by the XSL stylesheet. I've used JavaScript to parse comma-delimited strings into grammar items, clean up data, and more. Perhaps if you need an example, I can post one...
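
For the output caching tip above, the directive might look like the sketch below. The one-day duration (86400 seconds) and varying by the grammar_id parameter are my assumptions, chosen to match the tips in the list. In a page like Figure 3, you would put it on the same line as the other directives, so no stray whitespace is written ahead of the XML.

```aspx
<%@ OutputCache Duration="86400" VaryByParam="grammar_id" %>
```

With this in place, each distinct grammar_id gets its own cached copy, so the colors grammar and any other grammar served by the same page don't overwrite each other.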

In conclusion, you should use ASP.NET and XSLT to create your dynamic grammars. It's fast, flexible and easy. Let me know what you think. Should I include a download, or can you take it from here?

Have fun!

Figure 1 - Reading from a DB, writing strings to a stream

using System;
using MySql.Data.MySqlClient;

class Program
{
    static void Main(string[] args)
    {
        // Processor time used so far, so we can measure just the grammar build
        TimeSpan TS1 = System.Diagnostics.Process.GetCurrentProcess().TotalProcessorTime;

        MySqlConnection DatabaseConnection = new MySqlConnection("Database=;Data Source=;User Id=;Password=");
        DatabaseConnection.Open();
        MySqlCommand TestCommand = new MySqlCommand("SELECT id, description FROM colors", DatabaseConnection);
        // CloseConnection closes the connection when the reader is closed
        MySqlDataReader TestDataReader = TestCommand.ExecuteReader(System.Data.CommandBehavior.CloseConnection);
        System.IO.StreamWriter TestWriter = new System.IO.StreamWriter("c:\\temp\\Grammar2.grxml");

        TestWriter.Write("<?xml version=\"1.0\" encoding=\"utf-8\"?><grammar mode=\"voice\" version=\"1.0\" root=\"main\"><rule id=\"main\"><one-of>");

        // One item/tag pair per row; the description is written without XML escaping
        while (TestDataReader.Read())
        {
            TestWriter.Write("<item>{0}</item><tag>colorid = {1};</tag>", TestDataReader.GetString(1), TestDataReader.GetInt32(0));
        }
        TestDataReader.Close();

        TestWriter.Write("</one-of></rule></grammar>");
        TestWriter.Close();

        // Report the processor time consumed (1 tick = 100 ns)
        TimeSpan TS2 = System.Diagnostics.Process.GetCurrentProcess().TotalProcessorTime;
        Console.WriteLine("Done. Total ticks = {0}.", TS2.Subtract(TS1).Ticks.ToString());
        Console.ReadLine();
    }
}
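
One thing to watch with the string approach in Figure 1: a description containing & or < would produce a broken grammar. Below is a minimal sketch of the same output written through XmlWriter, which escapes the text for you. GrammarWriter and its Write method are made-up names, and a list of id/description pairs stands in for the data reader. The XML declaration is left out to keep the sketch short.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class GrammarWriter
{
    // Writes one item/tag pair per color, in the same shape as Figure 1
    public static void Write(TextWriter output, IEnumerable<KeyValuePair<int, string>> colors)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.OmitXmlDeclaration = true;

        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            writer.WriteStartElement("grammar");
            writer.WriteAttributeString("mode", "voice");
            writer.WriteAttributeString("version", "1.0");
            writer.WriteAttributeString("root", "main");
            writer.WriteStartElement("rule");
            writer.WriteAttributeString("id", "main");
            writer.WriteStartElement("one-of");

            foreach (KeyValuePair<int, string> color in colors)
            {
                // WriteElementString escapes &, < and > in the text
                writer.WriteElementString("item", color.Value);
                writer.WriteElementString("tag", "colorid = " + color.Key + ";");
            }

            writer.WriteEndElement(); // one-of
            writer.WriteEndElement(); // rule
            writer.WriteEndElement(); // grammar
        }
    }
}
```

To write to a file as in Figure 1, pass a StreamWriter as the output.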

Figure 2 - Reading from a DB to a DataSet, then transforming with XSLT.

using System;
using System.Xml;
using MySql.Data.MySqlClient;

class Program
{
    static void Main(string[] args)
    {
        // Processor time used so far, so we can measure just the grammar build
        TimeSpan TS1 = System.Diagnostics.Process.GetCurrentProcess().TotalProcessorTime;

        // Load the query results into a DataSet; each row becomes a <record> element
        MySqlConnection DatabaseConnection = new MySqlConnection("Database=;Data Source=;User Id=;Password=");
        MySqlDataAdapter DataAdapter = new MySqlDataAdapter("SELECT id, description FROM colors", DatabaseConnection);
        System.Data.DataSet DBDataSet = new System.Data.DataSet();
        DataAdapter.Fill(DBDataSet, "record");

        // Save the DataSet XML to a temporary file for the transform to read
        XmlDocument XMLTarget = new XmlDocument();
        XMLTarget.LoadXml("<records>" + DBDataSet.GetXml() + "</records>");
        string XmlTempFile = "c:\\temp\\temprecords.xml";
        XMLTarget.Save(XmlTempFile);

        // Transform the temporary file into the grammar file
        string XslFile = "file://c:/temp/SimpleGrammarTransformer.xsl";
        System.Xml.Xsl.XslTransform StyleSheet = new System.Xml.Xsl.XslTransform();
        StyleSheet.Load(XslFile);
        StyleSheet.Transform(XmlTempFile, "c:\\temp\\Grammar1.grxml");

        // Report the processor time consumed (1 tick = 100 ns)
        TimeSpan TS2 = System.Diagnostics.Process.GetCurrentProcess().TotalProcessorTime;
        Console.WriteLine("Done. Total ticks = {0}.", TS2.Subtract(TS1).Ticks.ToString());
        Console.ReadLine();
    }
}

Figure 3 - Reading from a DB and transforming the results to the response stream.

<%@ Page Language="c#" AutoEventWireup="false" Debug="true" %><%@ Import namespace="System.Xml"%><%@ Import namespace="MySql.Data.MySqlClient"%><%

// Processor time used so far, so we can measure just the grammar build
TimeSpan TS1 = System.Diagnostics.Process.GetCurrentProcess().TotalProcessorTime;

// Load the query results into a DataSet; each row becomes a <record> element
MySqlConnection DatabaseConnection = new MySqlConnection("Database=;Data Source=;User Id=;Password=");
MySqlDataAdapter DataAdapter = new MySqlDataAdapter("SELECT id, description FROM colors", DatabaseConnection);
System.Data.DataSet DBDataSet = new System.Data.DataSet();
DataAdapter.Fill(DBDataSet, "record");
XmlDocument XMLTarget = new XmlDocument();
XMLTarget.LoadXml("<records>" + DBDataSet.GetXml() + "</records>");

// Find the stylesheet next to this page and load it as a file:// URL
string XslFile = String.Format("file://{0}", Server.MapPath("SimpleGrammarTransformer.xsl")).Replace("\\", "/");
System.Xml.Xsl.XslTransform StyleSheet = new System.Xml.Xsl.XslTransform();
XmlUrlResolver URLResolver = new XmlUrlResolver();
StyleSheet.Load(XslFile, URLResolver);

// Transform straight to the response stream; nothing is written to disk
StyleSheet.Transform(XMLTarget, null, Response.OutputStream, null);

TimeSpan TS2 = System.Diagnostics.Process.GetCurrentProcess().TotalProcessorTime;

// Report the processor time as a trailing XML comment (1 tick = 100 ns)
Response.Write(String.Format("<!--Done. Total ticks = {0} .-->", TS2.Subtract(TS1).Ticks.ToString()));

%>
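
A note on Figures 2 and 3: XslTransform was marked obsolete in .NET Framework 2.0 in favor of XslCompiledTransform. I haven't timed the newer class for this post, so treat the following as a sketch of how the same transform might look, not as a measured result. GrammarTransformer is a made-up name, and the XML and stylesheet are passed in as strings to keep the example self-contained.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class GrammarTransformer
{
    // Applies the stylesheet to the records XML and returns the result as text
    public static string Transform(string recordsXml, string stylesheetXml)
    {
        XmlDocument source = new XmlDocument();
        source.LoadXml(recordsXml);

        XslCompiledTransform transform = new XslCompiledTransform();
        using (XmlReader stylesheet = XmlReader.Create(new StringReader(stylesheetXml)))
        {
            transform.Load(stylesheet);
        }

        StringWriter result = new StringWriter();
        transform.Transform(source, null, result);
        return result.ToString();
    }
}
```

In an ASP.NET page like Figure 3, you would pass Response.OutputStream as the last argument to Transform instead of a StringWriter.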

Figure 4 - An XSLT file for building GRXML grammars

You can always change this so it outputs ABNF or GSL instead.


<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:msxsl="urn:schemas-microsoft-com:xslt"
  exclude-result-prefixes="msxsl"
>
<xsl:output method="xml"/>
  <xsl:template match="/">
    <grammar mode="voice" version="1.0" root="main" >
      <rule id="main">
        <one-of>
          <xsl:for-each select="//record">
            <item><xsl:value-of select="description"/></item><tag>colorid = <xsl:value-of select="id"/>;</tag>         
          </xsl:for-each>
        </one-of>
      </rule>
    </grammar>
  </xsl:template>
</xsl:stylesheet>


Figure 5 - GRXML result
<?xml version="1.0" encoding="utf-8"?>
<grammar mode="voice" version="1.0" root="main">
  <rule id="main">
  <one-of>
    <item>red</item><tag>colorid = 1;</tag>
    <item>orange</item><tag>colorid = 2;</tag>
    <item>yellow</item><tag>colorid = 3;</tag>
    .
    .
    .   
  </one-of>
  </rule>
</grammar>

 

5/3/2007 12:58:11 AM (GMT Daylight Time, UTC+01:00)
