# Friday, May 27, 2011

Eclipse Plug-ins

I'm currently writing Eclipse plug-ins, leveraging the JWT project.

There's a great book for learning how to program plug-ins, and they have a site for it: http://www.qualityeclipse.com/

If you use Eclipse, you should read Chapter One on the site (it's free), even if you aren't programming plug-ins!  It goes over all the features of Eclipse that you should know about.  There's plenty of interesting reading in the other chapters, too: the previous IDE technologies that made their way into the app, how plug-ins work, how to build a basic plug-in, and using SWT to develop great GUIs.

I'll be releasing a project soon, complete with a help section, an update site, and all that you'd expect from an Eclipse plug-in.  More on that in June!

Eric

# Friday, May 13, 2011

Updates to the Phone Service

Since the last posting, I've had some worthwhile experiences with my VOIP.

First, the HT286 worked perfectly.  I'm working toward switching over my home phone services, too.  I picked up the Grandstream HandyTone 503, which provides 2 phone lines for 2 SIP trunks.

The first thing I noticed during the echo test was that the latency was much lower than the 286's.  That joy was short-lived, though.  My outgoing sound path started having lots of jitter.  The incoming sound was perfectly clear, but people would tell me my voice was breaking up.  I could recreate this in the echo test.  After a minute or two, the sound quality was pitiful.

Basically, it was a QoS problem.  I tried setting the QoS properties on the 503, but it didn't help.  Even though my router had VOIP QoS settings, they didn't seem to work.  I replaced the stock firmware on my wireless router with DD-WRT, configured QoS, and the problem disappeared!  I even ran a bandwidth download/upload test during a call, and there was no loss of quality.

I'll be porting my home phone number to my SIP trunk provider real soon :)

Eric

# Monday, March 14, 2011

You are paying too much for your phone service.

I wanted to test the quality of using a VOIP adapter with Google Voice (GV).  I have to say it is excellent.  The latency is not noticeable, compared with a 1s-2s potential delay if you are using a soft-phone (especially one on your smart phone).  You can also send and receive faxes!

Overall, I think with a little effort, you can drop your phone services bill to $5/month, with no loss in service or quality.

Simple (but technical) steps to reduce/remove your telephone service:
  1. Set up a Google Voice account.
  2. Set up a pbxes.org account.  You will have your own Asterisk PBX in the cloud for $5/month.  You need to have a paid account to get access to GV.  You'll have to look in the forums for details on how to set up your GV trunk and your inbound/outbound paths.  Email me if I should document the steps :)
  3. Buy a VOIP adapter.  I got the HT286 from Telephony Depot for $25.
  4. Plug the adapter into your home network (an ethernet/wired connection is needed).  Figure out its IP address (look on your router for DHCP leases, and find the newest entry) and configure it.
  5. Call *43 to echo test the line.
  6. Call a friend.  Have a friend call you.  It's all good!
Reasons (GV limitations) to go with voip.ms instead of steps 1 and 2 above:
  1. 1 hour max call length?
  2. E911 Support.
  3. Number portability


Building a LYME server - Linux Yaws Mnesia Erlang

A LYME server is Linux Yaws Mnesia Erlang - similar to how LAMP is Linux Apache (Tomcat) MySQL PHP, but of course, this system focuses on functional programming.

I started studying functional programming a while ago.  It's pretty interesting.
Here are some resources for you:
  • The Wikipedia entry for the Erlang programming language - not Erlang the person, or the unit of measurement, thus the link.  (To go off on a tangent here, look up Erlang and Markov, early 20th century mathematicians who still influence our lives today.)
  • The official Erlang site.
  • http://lisperati.com/ - This is a fun tour of functional programming in Lisp.
    • go through "Casting SPELs" to learn what a Lisp program looks like.
    • go through the "Land of Lisp" cartoon to learn the tech points of Lisp - read it and click on the links as you go.
  • http://mitpress.mit.edu/sicp/ -  This is the "not fun" tour of functional programming in Scheme.  It's comprehensive, and you need to have an aptitude for math (or be able to ignore it and focus on the concepts).
  • http://www.sics.se/~joe/apachevsyaws.html - This is what really got me interested in using Erlang: Apache dies at about 4,000 parallel sessions. Yaws is still functioning at over 80,000 parallel connections.
You can create your own LYME server in about 5 minutes. 

On Ubuntu, you can install Yaws with "apt-get install yaws".  That'll create an http Yaws server pretty quickly.  However, there is no https.

On Fedora, you can do "yum install yaws".  However, the http example site is not configured.

If you want to use CentOS, or https, then you are in for a longer setup.  The good thing about the following setup is:
  1. It is the latest code release
  2. It includes https and all example code.
These are my notes for building a LYME server using CentOS.  I hope you like the command line;)

yum install make wget gcc m4  openssl openssl-devel pam-devel ncurses-devel git automake glibc-devel.i386

rpm -Uvh http://download.fedora.redhat.com/pub/epel/5/i386/epel-release-5-4.noarch.rpm; wget http://erlang.org/download/otp_src_R14B01.tar.gz; wget http://ftp.gnu.org/gnu/autoconf/autoconf-2.68.tar.gz

tar -xvzf otp_src_R14B01.tar.gz ; tar -vxzf autoconf-2.68.tar.gz

cd autoconf-2.68; ./configure; make; make install;

cd ../otp_src_R14B01; ./configure --with-ssl; make; make install

cd ..; git clone git://github.com/klacke/yaws.git

cd yaws; autoconf ; ./configure; make; make install

edit /etc/sysconfig/iptables  and add a rule to allow ports 80 and 443
  -A RH-Firewall-1-INPUT -m state --state NEW -p tcp --dport 80 -j ACCEPT
  -A RH-Firewall-1-INPUT -m state --state NEW -p tcp --dport 443 -j ACCEPT

/etc/init.d/iptables restart

yaws

http://[your machine IP or FQDN]/  you should see the yaws server page, with examples

back in yaws, ctrl-g, then q [enter]

That's it!  All done!  On a decent server, if you are cutting and pasting commands, it'll probably take < 30 minutes to complete.  On a slower machine, plan on taking some coffee breaks during the configure and make commands.

Also, on my latest run of the above script, the https port wasn't showing the yaws app page.  I edited /usr/local/etc/yaws/yaws.conf as follows:
In the ssl configuration, I added the following lines under the line with "port=443":
        docroot = /usr/local/var/yaws/www
        appmods = <cgi-bin, yaws_appmod_cgi>
Replace the existing docroot entry, of course...

Eric


# Wednesday, April 09, 2008

Transcribing for MS Speech Server

Transcribing speech utterances is a highly repetitive task, usually performed by a pool of people who are good at typing.

Out of the box, the speech server tools for transcribing are accessible through Visual Studio, and are not good for transcribing any volume of utterances.

The following attachment contains two Visual Studio 2005 projects that can get you on your way towards a fast transcription process for non-developers (no Visual Studio needed for them).


MSSTranscriptionService.zip (3.75 MB)
# Thursday, April 03, 2008

Unit Testing Managed Speech Server Applications

Unit testing code is important.  If you make code changes in a library that other people are using, you want to make sure all of the code works as expected.  Using NUnit is great for that.

However, when it comes to speech applications, you probably manually test your applications before each release.

If you are writing managed code for Office Communications Server 2007 Speech Server, and you are using SIP for your telephony lines, I have something that will help you automate your testing.  I created a simple class that will send SIP INFO requests to the caller if they include "log=true" in the SIP URI parameters.  It basically works like this:

  1. Call into your application and then generate a test script based on the call log.
  2. Run the customized OutboundCalls application, passing your newly generated script as the script to run.
  3. The OutboundCalls application will automatically go through the application, following the same path.

Attached is the unit testing code, a demo and some basic instructions.  Open the UnitTesting solution and read the ReadMe.htm file for all the details.

Happy Testing!

UnitTesting.zip (8 MB)
# Tuesday, June 05, 2007

A Grammar for AlphaNumeric IDs

In the health industry, I've frequently run into the following predicament:  User IDs are no longer simple numeric fields.  Traditionally, an employee's Social Security number may have been used as an ID, but HIPAA has put an end to that.

For speech recognition, this could present a common tuning and maintenance issue.  The following XSLT document is designed to automate this maintenance.

First, let's take a look at some simple user IDs stored in my sample database.

SELECT UserIDs FROM SampleIDs

UserIDs                                                     
------
119821
319871
31987M
D19821
D1982M
D19871
...

You can see there are a few alphanumeric patterns being used with these IDs.  Using a simple replacement, we can get all the non-numeric patterns in the database:

SELECT REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(UserIDs, '1', '_'), '2', '_'), '3', '_'), '4', '_'), '5', '_'), '6', '_'), '7', '_'), '8', '_'), '9', '_'), '0', '_') AS Pattern,
COUNT(*) AS RecordCount
FROM SampleIDs
GROUP BY REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(UserIDs, '1', '_'), '2', '_'), '3', '_'), '4', '_'), '5', '_'), '6', '_'), '7', '_'), '8', '_'), '9', '_'), '0', '_')

Pattern RecordCount
-------------------
D_____  71680
_____M  57344
X____M  14336
D____M  71680
X_____  14336
______  57344
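
The same pattern extraction can be sketched outside the database.  This is a minimal Python sketch, assuming the IDs have already been fetched into a list; the function name id_pattern and the six sample IDs (taken from the first query's output) are only for illustration and are not part of the original code.

```python
import re
from collections import Counter

def id_pattern(user_id):
    """Replace every digit with an underscore, leaving letters in place."""
    return re.sub(r"[0-9]", "_", user_id)

# Sample IDs taken from the query output shown earlier.
sample_ids = ["119821", "319871", "31987M", "D19821", "D1982M", "D19871"]

# Count how many IDs share each pattern (the GROUP BY / COUNT(*) step).
pattern_counts = Counter(id_pattern(uid) for uid in sample_ids)

for pattern, count in sorted(pattern_counts.items()):
    print(pattern, count)
```

A single regular expression stands in for the ten nested REPLACE calls; the grouping and counting are otherwise the same idea.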

With this, we'll simply assume that every ID in the database is equally likely to be spoken on a call, so each pattern's probability is its share of the records.  So, D_____ has the probability of 71680/286720, or a 25% chance of being provided.  Using this information, we can weight the probability of this pattern matching an utterance for a user ID.
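
The weight arithmetic for all six patterns can be checked with a short script.  This is only a sketch in Python, assuming the counts from the query results above; the name pattern_weights is made up for the example.

```python
# Record counts per pattern, copied from the query results above.
pattern_counts = {
    "D_____": 71680,
    "_____M": 57344,
    "X____M": 14336,
    "D____M": 71680,
    "X_____": 14336,
    "______": 57344,
}

def pattern_weights(counts):
    """Return each pattern's share of the total record count."""
    total = sum(counts.values())
    return {pattern: count / total for pattern, count in counts.items()}

weights = pattern_weights(pattern_counts)
for pattern, weight in weights.items():
    print(pattern, round(weight, 2))
```

The counts add up to 286720, and the resulting shares are the values that end up in the weight attributes of the generated grammar.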

First, I run the above SQL statement and put the results into a dataset.  You can simply create the XML from the recordset, too.  My XML results look like this:

<records>
<record>
<Pattern>D_____</Pattern>
<RecordCount>71680</RecordCount>
</record>
<record>
<Pattern>_____M</Pattern>
<RecordCount>57344</RecordCount>
</record>
<record>
<Pattern>X____M</Pattern>
<RecordCount>14336</RecordCount>
</record>
<record>
<Pattern>D____M</Pattern>
<RecordCount>71680</RecordCount>
</record>
<record>
<Pattern>X_____</Pattern>
<RecordCount>14336</RecordCount>
</record>
<record>
<Pattern>______</Pattern>
<RecordCount>57344</RecordCount>
</record>
</records>

Now comes the fun part - using XSLT to create a GRXML grammar file.

A couple of points about the XSL file:

  1. For the JavaScript function buildCharacterArray(s1), you could probably slim the function down to "return s1.split('');"  It splits a string into a character array.
  2. XSLT doesn't have a counting for loop (xsl:for-each only iterates over a set of nodes), so I recursively call the buildItem template.
  3. In this example, the TAG is set as an attribute of the related ITEM element.  For other platforms, this may need to be an element trailing the ITEM element.  Of course, the tag syntax is different on each platform:(
  4. In the real world you may find, through transcriptions and tuning, that people may utter dashes and spaces, too, or say, "B as in boy."  They may also truncate leading 0's (00001214V may be spoken 1214V).
  5. It doesn't handle robust recognition for utterances like "nine double oh one seven" or "nine thirty nine twenty two."  Unless you find patterns through transcriptions and tuning, effectively accommodating this will drop your accuracy and performance through the floor.
  6. Use your web server to cache the GRXML output; there's no need to run it too often.
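
Point 4 can be made concrete with a small helper.  This is a rough Python sketch, not part of the original XSL; it assumes callers drop leading zeros one at a time from the front of the ID, and the name spoken_variants is hypothetical.  Transcriptions should confirm which of these forms callers really use before you add them to a grammar.

```python
def spoken_variants(user_id):
    """Return the ID plus every form with some or all leading zeros dropped."""
    variants = [user_id]
    trimmed = user_id
    # Drop one leading zero at a time, but never empty the ID completely.
    while trimmed.startswith("0") and len(trimmed) > 1:
        trimmed = trimmed[1:]
        variants.append(trimmed)
    return variants

print(spoken_variants("00001214V"))
```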

The resulting grammar looks like this:

<?xml version="1.0" encoding="utf-8" ?>
<grammar xml:lang="en-US" version="1.0" root="main" mode="voice" xmlns:msxsl="urn:schemas-microsoft-com:xslt" xmlns:nextivr="http://www.nextivr.com/XSLFunctions" xmlns:rs="urn:schemas-microsoft-com:rowset">
<rule id="main" scope="public">
<one-of>
<item weight="0.25">
<ruleref type="application/srgs+xml" uri="#D_____" />
</item>
<item weight="0.2">
<ruleref type="application/srgs+xml" uri="#_____M" />
</item>
<item weight="0.05">
<ruleref type="application/srgs+xml" uri="#X____M" />
</item>
<item weight="0.25">
<ruleref type="application/srgs+xml" uri="#D____M" />
</item>
<item weight="0.05">
<ruleref type="application/srgs+xml" uri="#X_____" />
</item>
<item weight="0.2">
<ruleref type="application/srgs+xml" uri="#______" />
</item>
</one-of>
</rule>
<rule scope="private" id="D_____">
<item tag="D">D</item>
<ruleref type="application/srgs+xml" uri="#number" />
<ruleref type="application/srgs+xml" uri="#number" />
<ruleref type="application/srgs+xml" uri="#number" />
<ruleref type="application/srgs+xml" uri="#number" />
<ruleref type="application/srgs+xml" uri="#number" />
</rule>
<rule scope="private" id="_____M">
<ruleref type="application/srgs+xml" uri="#number" />
<ruleref type="application/srgs+xml" uri="#number" />
<ruleref type="application/srgs+xml" uri="#number" />
<ruleref type="application/srgs+xml" uri="#number" />
<ruleref type="application/srgs+xml" uri="#number" />
<item tag="M">M</item>
</rule>
<rule scope="private" id="X____M">
<item tag="X">X</item>
<ruleref type="application/srgs+xml" uri="#number" />
<ruleref type="application/srgs+xml" uri="#number" />
<ruleref type="application/srgs+xml" uri="#number" />
<ruleref type="application/srgs+xml" uri="#number" />
<item tag="M">M</item>
</rule>
<rule scope="private" id="D____M">
<item tag="D">D</item>
<ruleref type="application/srgs+xml" uri="#number" />
<ruleref type="application/srgs+xml" uri="#number" />
<ruleref type="application/srgs+xml" uri="#number" />
<ruleref type="application/srgs+xml" uri="#number" />
<item tag="M">M</item>
</rule>
<rule scope="private" id="X_____">
<item tag="X">X</item>
<ruleref type="application/srgs+xml" uri="#number" />
<ruleref type="application/srgs+xml" uri="#number" />
<ruleref type="application/srgs+xml" uri="#number" />
<ruleref type="application/srgs+xml" uri="#number" />
<ruleref type="application/srgs+xml" uri="#number" />
</rule>
<rule scope="private" id="______">
<ruleref type="application/srgs+xml" uri="#number" />
<ruleref type="application/srgs+xml" uri="#number" />
<ruleref type="application/srgs+xml" uri="#number" />
<ruleref type="application/srgs+xml" uri="#number" />
<ruleref type="application/srgs+xml" uri="#number" />
<ruleref type="application/srgs+xml" uri="#number" />
</rule>
<rule scope="private" id="number">
<one-of>
<item tag="1">one</item>
<item tag="2">two</item>
<item tag="3">three</item>
<item tag="4">four</item>
<item tag="5">five</item>
<item tag="6">six</item>
<item tag="7">seven</item>
<item tag="8">eight</item>
<item tag="9">nine</item>
<item tag="0">zero</item>
</one-of>
</rule>
</grammar>
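
To show how one pattern maps to one private rule in the grammar above, here is a minimal Python sketch.  It is not the XSLT from the download (which uses a recursive template); it assumes the same conventions as the output above, with an underscore becoming a reference to the "number" rule and a letter becoming a tagged item.  The function name build_rule is made up for the example.

```python
import xml.etree.ElementTree as ET

def build_rule(pattern):
    """Build a private <rule> element for one ID pattern such as 'D____M'."""
    rule = ET.Element("rule", {"scope": "private", "id": pattern})
    for ch in pattern:
        if ch == "_":
            # A digit position: reference the shared "number" rule.
            ET.SubElement(rule, "ruleref",
                          {"type": "application/srgs+xml", "uri": "#number"})
        else:
            # A literal letter: emit an item tagged with the letter itself.
            item = ET.SubElement(rule, "item", {"tag": ch})
            item.text = ch
    return rule

print(ET.tostring(build_rule("D____M"), encoding="unicode"))
```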

Download the code!

# Friday, May 11, 2007

Url2Fax, free FAX software with source code, converting web pages to FAX documents.

I needed to add some simple, free FAX functionality to an application, so I invested a little time in doing it with Visual C# 2005 Express Edition.

Source from the web:

Resulting image:

I decided to make it use web browser content; since it's pretty easy to format a web page, no FAX imaging tools are required. Thanks to Michael McCloskey's article, Bitonal (TIFF) Image Converter for .NET, I was able to accomplish my goal rather easily!

I added in some random dithering to produce a decent balance between text pages and pages with images. I was tempted to implement Floyd-Steinberg Dithering, but my random results are good enough for my project.
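
For readers who haven't seen random dithering, the idea can be sketched in a few lines.  This is a generic Python illustration and not the code in Url2Fax.zip; it assumes the image is already a grid of 0-255 grayscale values, and the name random_dither is made up for the example.

```python
import random

def random_dither(gray_rows, seed=None):
    """Convert 0-255 grayscale rows to 1-bit rows (0 = black, 1 = white).

    Each pixel is compared with its own random threshold, so mid-gray areas
    become a noisy mix of black and white dots instead of a flat block.
    """
    rng = random.Random(seed)
    return [[1 if pixel > rng.randrange(255) else 0 for pixel in row]
            for row in gray_rows]

# A tiny 2x4 test image: black, dark gray, light gray, white in each row.
image = [[0, 64, 192, 255],
         [0, 64, 192, 255]]
for row in random_dither(image, seed=1):
    print(row)
```

Pure black and pure white pixels always survive unchanged, which keeps text crisp; only the grays are randomized.  Floyd-Steinberg would instead push each pixel's rounding error onto its neighbors.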

I also added in code for generating the image from a web page, and creating the multi-page TIFF document.

I reference FaxComEx.dll, from the Windows FAX service, in the code.  This makes it easy to generate a FAX from the command line.  For example,

Url2Fax http://localhost/reportapp/report.jsp?repnum=1 18885551212 FAXSVR100

will send a report to 8885551212, using the fax printer on FAXSVR100 (don't use "localhost").  If you don't want to use the FAX service, leave off the parameters.  You can grab the image from your %TEMP% directory.

Here's the source code: Url2Fax.zip.  The exe is located in the release folder, in case you want to try it as is.  You'll need Windows XP or 2003 to use the built-in Fax delivery.  For Windows 2000, you can generate the image and then send it with the Windows 2000 fax server. (I have a script for that, too).


Enjoy!

# Thursday, May 03, 2007

Creating Dynamic GRXML Grammars with C# and ASP.NET

One of the fundamental tasks in creating speech applications is building the grammars for automated speech recognition (ASR).  This entry features techniques to make your grammar-building code fast, efficient and maintainable.

In many situations, it is unrealistic to design and build your grammars in development and deploy them as static grammars in production.  For example, if you are writing an address verification application, you may want to ask the caller for the state, then the city, then the street and so on.  Instead of building many grammars (all the streets in each city, all the cities in each state), you may want to let the user activity decide which of the most popular cities have their street grammars created, and the most popular states have their city grammars cached, too, and so on.

I tried three methods for performing this task.  In all the examples, I connected to a database to retrieve choices for the grammar.
In the first example, I wrote directly to a stream, writing the XML using strings.  In the next example, I used an XML dataset and an XSL stylesheet to transform the data to a grammar.  In the third example, I did the same as the second example, but I sent the results directly to a Response stream in ASP.NET.  Figures 1, 2 and 3 respectively provide samples of the code.

All three performed well.  Measured by the total processor time used, each finished in about a second or less.  The top performer by far was ASP.NET with the response stream.  Of course, I/O is the performance killer for the first two; writing a file to disk is slow compared to writing to memory.

Total seconds of processor time, by code sample:

Grammar Items   Code 1   Code 2   Code 3
10              0.891    0.938    0.000!
100             0.906    0.984    0.012
1000            0.938    1.141    0.141
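
The code in Figures 1 through 3 prints ticks, while the table above is in seconds.  A .NET TimeSpan tick is 100 nanoseconds, so the conversion is a single division.  The Python sketch below only illustrates that arithmetic; the tick count in it is a made-up example value, not a measurement from these tests.

```python
TICKS_PER_SECOND = 10_000_000  # one .NET tick is 100 nanoseconds

def ticks_to_seconds(ticks):
    """Convert a .NET TimeSpan tick count to seconds."""
    return ticks / TICKS_PER_SECOND

# Hypothetical tick count, chosen only to illustrate the conversion.
print(ticks_to_seconds(9_380_000))
```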


So of course, I suggest you use the code in Figure 3.  Here are some tips on why I think you should prefer it over the code in Figure 1.

  • If you need to customize the grammar, you can change the XSL file without recompiling the code.  Let's say you need to change the TAG element in the grammar (and for each VoiceXML platform, tags are implemented differently!).  You can adjust the XSL file and see visually how you're affecting the grammar.
  • By using the XML from the DataSet, you don't have to worry about data types as they're all converted to text.  If a database field changes in size or precision, the code still works without recompiling.
  • XSL is easier to read.  Mind you, to master it takes some work, but which code is easier for an IVR programmer to pick up...
    This:              

    while(TestDataReader.Read())
    {
    TestWriter.Write("<item>{0}</item><tag>colorid = {1};</tag>", TestDataReader.GetString(1), TestDataReader.GetInt32(0));
    }

    Or this?
    <xsl:for-each select="//record">
    <item><xsl:value-of select="description"/></item><tag>colorid = <xsl:value-of select="id"/>;</tag>
    </xsl:for-each>

  • Using Page Output Caching (http://msdn2.microsoft.com/en-us/library/ms972362.aspx), you can get great performance from the dynamic grammars.  Cache the files fresh every day, based on the URL parameters.  Schedule a task to call the common URLs, so the first caller of the day doesn't have to wait for the first compile (even though it's a fraction of a second).  You can also cache based on a database dependency - there are examples out there on how to do this.
  • Use web.config to store the SQL queries and XSL file names.  That'll make this grammar builder really flexible.

 In web.config
 <configuration>
  <appSettings>
   <add key="colors" value="SELECT id, description FROM colors" />
   <add key="colorsxsl" value="SimpleGrammarTransformer.xsl" />
  </appSettings>
 </configuration>

 In your code replace the SQL query with the following:
  System.Configuration.ConfigurationSettings.AppSettings[Request.QueryString.Get("grammar_id")]

 URL to get the colors grammar: 
  http://servername/BuildGrammar?grammar_id=colors

  • Use a SCRIPT block in the XSL to manipulate the data, instead of doing it in the compiled code.  Using script makes it easy to run JavaScript on the XML as it's being processed by the XSL stylesheet.  I've used JavaScript to parse comma-delimited strings into grammar items, clean up data, and more.  Perhaps if you need an example, I can post one...

In conclusion, you should use ASP.NET and XSLT to create your dynamic grammars.  It's fast, flexible and easy.  Let me know what you think.  Should I include a download, or can you take it from here?
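
As an illustration of the comma-delimited case mentioned in the last tip, here is a minimal sketch of the parsing logic.  It is written in Python rather than as an XSL script block, purely to show the idea; the function name items_from_csv and the sample string are made up for the example.

```python
from xml.sax.saxutils import escape

def items_from_csv(value):
    """Turn 'red, orange,,yellow' into a list of <item> strings."""
    items = []
    for part in value.split(","):
        text = part.strip()
        if text:  # skip empty entries left by stray commas
            items.append("<item>{}</item>".format(escape(text)))
    return items

for item in items_from_csv("red, orange,,black & white"):
    print(item)
```

Trimming whitespace, skipping empty entries and escaping special characters are the kind of data clean-up that is easier to keep in the stylesheet than in compiled code.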

Have fun!

Figure 1 - Reading from a DB, writing strings to a stream

// Requires: using System; using MySql.Data.MySqlClient;
static void Main(string[] args)
{
    // CPU time used so far, so only the grammar build is measured.
    TimeSpan TS1 = System.Diagnostics.Process.GetCurrentProcess().TotalProcessorTime;

    MySqlConnection DatabaseConnection = new MySqlConnection("Database=;Data Source=;User Id=;Password=");
    DatabaseConnection.Open();
    MySqlCommand TestCommand = new MySqlCommand("SELECT id, description FROM colors", DatabaseConnection);
    // CloseConnection closes the connection when the reader is closed.
    MySqlDataReader TestDataReader = TestCommand.ExecuteReader(System.Data.CommandBehavior.CloseConnection);
    System.IO.StreamWriter TestWriter = new System.IO.StreamWriter("c:\\temp\\Grammar2.grxml");

    TestWriter.Write("<?xml version=\"1.0\" encoding=\"utf-8\"?><grammar mode=\"voice\" version=\"1.0\" root=\"main\"><rule id=\"main\"><one-of>");

    // Read() returns false when there are no rows, so no HasRows check is needed.
    while (TestDataReader.Read())
    {
        // Escape the description so characters like & and < don't break the XML.
        TestWriter.Write("<item>{0}</item><tag>colorid = {1};</tag>",
            System.Security.SecurityElement.Escape(TestDataReader.GetString(1)),
            TestDataReader.GetInt32(0));
    }

    TestWriter.Write("</one-of></rule></grammar>");
    TestWriter.Close();
    TestDataReader.Close();

    TimeSpan TS2 = System.Diagnostics.Process.GetCurrentProcess().TotalProcessorTime;
    Console.WriteLine("Done. Total ticks = {0}.", TS2.Subtract(TS1).Ticks.ToString());
    Console.ReadLine();
}

Figure 2 - Reading from a DB to a DataSet, then transforming with XSLT.

// Requires: using System; using System.Xml; using MySql.Data.MySqlClient;
static void Main(string[] args)
{
    TimeSpan TS1 = System.Diagnostics.Process.GetCurrentProcess().TotalProcessorTime;

    // Fill a DataSet from the query; each row becomes a <record> element.
    MySqlConnection DatabaseConnection = new MySqlConnection("Database=;Data Source=;User Id=;Password=");
    MySqlDataAdapter DataAdapter = new MySqlDataAdapter("SELECT id, description FROM colors", DatabaseConnection);
    System.Data.DataSet DBDataSet = new System.Data.DataSet();
    DataAdapter.Fill(DBDataSet, "record");

    // Wrap the DataSet XML in a <records> root and save it to a temp file.
    XmlDocument XMLTarget = new XmlDocument();
    XMLTarget.LoadXml("<records>" + DBDataSet.GetXml() + "</records>");
    string XmlTempFile = "c:\\temp\\temprecords.xml";
    XMLTarget.Save(XmlTempFile);

    // Transform the temp file into the grammar file on disk.
    string XslFile = "file://c:/temp/SimpleGrammarTransformer.xsl";
    System.Xml.Xsl.XslTransform StyleSheet = new System.Xml.Xsl.XslTransform();
    StyleSheet.Load(XslFile);
    StyleSheet.Transform(XmlTempFile, "c:\\temp\\Grammar1.grxml");

    TimeSpan TS2 = System.Diagnostics.Process.GetCurrentProcess().TotalProcessorTime;
    Console.WriteLine("Done. Total ticks = {0}.", TS2.Subtract(TS1).Ticks.ToString());
    Console.ReadLine();
}

Figure 3 - Reading from a DB and transforming the results to the response stream.

<%@ Page Language="c#" AutoEventWireup="false" Debug="true" %><%@ Import namespace="System.Xml"%><%@ Import namespace="MySql.Data.MySqlClient"%><%

TimeSpan TS1 = System.Diagnostics.Process.GetCurrentProcess().TotalProcessorTime;

// Fill a DataSet from the query; each row becomes a <record> element.
MySqlConnection DatabaseConnection = new MySqlConnection("Database=;Data Source=;User Id=;Password=");
MySqlDataAdapter DataAdapter = new MySqlDataAdapter("SELECT id, description FROM colors", DatabaseConnection);
System.Data.DataSet DBDataSet = new System.Data.DataSet();
DataAdapter.Fill(DBDataSet, "record");
XmlDocument XMLTarget = new XmlDocument();
XMLTarget.LoadXml("<records>" + DBDataSet.GetXml() + "</records>");

// Load the stylesheet that sits next to this page.
string XslFile = String.Format("file://{0}", Server.MapPath("SimpleGrammarTransformer.xsl")).Replace("\\", "/");
System.Xml.Xsl.XslTransform StyleSheet = new System.Xml.Xsl.XslTransform();
XmlUrlResolver URLResolver = new XmlUrlResolver();
StyleSheet.Load(XslFile, URLResolver);

// Transform straight into the response stream; nothing is written to disk.
StyleSheet.Transform(XMLTarget, null, Response.OutputStream, null);

TimeSpan TS2 = System.Diagnostics.Process.GetCurrentProcess().TotalProcessorTime;

Response.Write(String.Format("<!--Done. Total ticks = {0} .-->", TS2.Subtract(TS1).Ticks.ToString()));

%>

Figure 4 - An XSLT file for building GRXML grammars

You can always change this so it outputs ABNF, or any GSL.


<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:msxsl="urn:schemas-microsoft-com:xslt"
  exclude-result-prefixes="msxsl"
>
<xsl:output method="xml"/>
  <xsl:template match="/">
    <grammar mode="voice" version="1.0" root="main" >
      <rule id="main">
        <one-of>
          <xsl:for-each select="//record">
            <item><xsl:value-of select="description"/></item><tag>colorid = <xsl:value-of select="id"/>;</tag>         
          </xsl:for-each>
        </one-of>
      </rule>
    </grammar>
  </xsl:template>
</xsl:stylesheet>


Figure 5 - GRXML result
<?xml version="1.0" encoding="utf-8"?>
<grammar mode="voice" version="1.0" root="main">
  <rule id="main">
  <one-of>
    <item>red</item><tag>colorid = 1;</tag>
    <item>orange</item><tag>colorid = 2;</tag>
    <item>yellow</item><tag>colorid = 3;</tag>
    .
    .
    .   
  </one-of>
  </rule>
</grammar>
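
For comparison, the same grammar can be produced by building the XML as a tree instead of writing strings or running a stylesheet.  This is only a sketch in Python, assuming the rows have already been read from the colors table; the function name build_color_grammar and the sample rows are illustrative.  One benefit of a tree builder is that descriptions containing characters such as & are escaped automatically.

```python
import xml.etree.ElementTree as ET

def build_color_grammar(rows):
    """Build the colors grammar from (id, description) rows."""
    grammar = ET.Element("grammar",
                         {"mode": "voice", "version": "1.0", "root": "main"})
    rule = ET.SubElement(grammar, "rule", {"id": "main"})
    one_of = ET.SubElement(rule, "one-of")
    for color_id, description in rows:
        # ElementTree escapes &, < and > in text when serializing.
        ET.SubElement(one_of, "item").text = description
        ET.SubElement(one_of, "tag").text = "colorid = {};".format(color_id)
    return ET.tostring(grammar, encoding="unicode")

rows = [(1, "red"), (2, "orange"), (3, "yellow")]
print(build_color_grammar(rows))
```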

 

# Wednesday, March 07, 2007

Natural Microsystems Vox Wav files

Hi All,

Seeing that some of the Google search traffic I receive is around NMS sound files, let me provide a little insight.

NMS sound files are recorded in their own proprietary format, optimized for quality and performance.

The NMS vox files are NOT the same as the Dialogic VOX files.

If you have some NMS VOX files that you need to convert, the easiest way is to convert them where some NMS software is installed (probably on the IVR machine itself).

Here's a Windows command line command to convert a folder of  NMS files to WAV files:

for %1 in (*.vox) do VCECOPY %1 %~n1.wav -c44M16

-c44M16 means the output encoding is 44 kHz mono 16-bit.

If the NMS file is indexed (use VCEINFO to figure it out), meaning it contains more than one recording - kind of like a ZIP file contains a bunch of files - you'll have to use a manual technique something like the following:

vcecopy messages.vox 0.wav -c44M16 -m0,0

vcecopy messages.vox 1.wav -c44M16 -m1,0

vcecopy messages.vox 2.wav -c44M16 -m2,0

Using Excel, you can write some formulas to build a list of commands.  You could also script it with some more advanced command line utilities - perhaps Windows PowerShell or grep, depending on the platform you are using.
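
As one way to build that list of commands, here is a minimal Python sketch.  It only reproduces the command pattern shown above and assumes you already know how many messages the indexed file contains (for example, from VCEINFO); the function name vcecopy_commands is made up for the example.

```python
def vcecopy_commands(source, message_count, encoding="-c44M16"):
    """Build one vcecopy command line per message index in an indexed file."""
    return [
        "vcecopy {src} {idx}.wav {enc} -m{idx},0".format(
            src=source, idx=index, enc=encoding)
        for index in range(message_count)
    ]

# Three messages, matching the manual example above.
for command in vcecopy_commands("messages.vox", 3):
    print(command)
```

Redirect the output to a .bat file and run it on the machine where the NMS software is installed.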

If you need any help with decoding/encoding from one format to another, drop me a line.  NMS, Dialogic, Talx, raw PCM, GSM, whatever...

 
