How to suppress numbers in the text from being spoken.


Post by Jim Bretti » Sun Jan 21, 2007 1:59 pm

We frequently get questions about how to skip over numbers in text. Typically the questions involve how to skip page numbers, or the numbers included in a numbered list.

This can be done using regular expressions, and I'll provide two examples here: one to handle numbered lists and another to handle page numbers. TextAloud's pronunciation editor, found on the main menu under Options->Pronunciation Editor, accepts regular expressions for advanced pronunciation entries.


First, here is some general information on using a regular expression to match strings of digits. The simplest thing to do is use \d, which matches any single digit. You can then specify how many consecutive digits you want to match like this:

\d+ Matches any string of 1 or more digits
\d{1,3} Matches a string of 1 to 3 digits
\d{4} Matches a string of exactly four digits
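
To see how these quantifiers behave, here is a small sketch using Python's re module. Python is only a stand-in here; TextAloud's own regular expression engine may differ in details, and the sample sentence is made up for illustration.

```python
import re

# Made-up sample text containing digit runs of different lengths.
sample = "Call 555-1234 before 2007, room 12."

one_or_more = re.findall(r"\d+", sample)       # runs of 1 or more digits
one_to_three = re.findall(r"\d{1,3}", sample)  # runs of 1 to 3 digits
exactly_four = re.findall(r"\d{4}", sample)    # runs of exactly 4 digits

print(one_or_more)   # ['555', '1234', '2007', '12']
print(one_to_three)  # ['555', '123', '4', '200', '7', '12']
print(exactly_four)  # ['1234', '2007']
```

Note that \d{1,3} on its own splits longer runs such as 1234 into pieces. This is why the expressions in this post also match the characters around the digits.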

Numbered Lists
We'll assume here that the numbers in the list are some number of digits followed by a period. For example:

1. Monday
2. Tuesday
3. Wednesday


or ...

1. Monday, 2. Tuesday, 3. Wednesday.


One thing to keep in mind is that you want to minimize the chance that the expression you use to suppress these list numbers also suppresses other numbers you'd like to hear spoken. For example, if you simply suppress any string of digits followed by a period, you'll suppress the last 4 digits of a phone number that happens to fall at the end of a sentence.

So to be safe, we'll also assume that list numbers are no more than 3 digits long, and look for whitespace preceding the digits and after the period. \s matches whitespace, including newline characters.

Here is the expression to do this:

Word:
{{re=\s\d{1,3}\.\s}}

Pronunciation:
<s>

The expression is looking for a whitespace character, followed by 1 to 3 digits, a period, and a trailing whitespace character. The backslash (\) is required before the period because an unescaped period is a wildcard in regular expressions and matches any character. The backslash 'escapes' it so that it matches only a literal period.

The <s> in the pronunciation field indicates that whatever string is matched by the expression should be replaced with a single space character.
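
As a rough way to try the expression outside TextAloud, the sketch below applies the same pattern with Python's re module. The assumptions: Python's regex behavior is close enough to TextAloud's for this pattern, re.sub with a single space stands in for the <s> pronunciation, and the function name and sample strings are invented for this example.

```python
import re

# Same pattern as the Word field, without the {{re=...}} wrapper.
LIST_NUMBER = re.compile(r"\s\d{1,3}\.\s")

def suppress_list_numbers(text):
    """Replace each matched list number with a single space (stand-in for <s>)."""
    return LIST_NUMBER.sub(" ", text)

multi_line = suppress_list_numbers("Days:\n1. Monday\n2. Tuesday\n3. Wednesday")
one_line = suppress_list_numbers("Days: 1. Monday, 2. Tuesday, 3. Wednesday.")
phone = suppress_list_numbers("Call me at 555-1234. Thanks.")

print(multi_line)  # Days: Monday Tuesday Wednesday
print(one_line)    # Days: Monday, Tuesday, Wednesday.
print(phone)       # Call me at 555-1234. Thanks.
```

The phone number is left alone because the digits before the period are preceded by a hyphen, not whitespace. For the same reason, a list number at the very start of the text, with nothing before it, would not be matched by this pattern.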

To remove the 1 to 3 digit restriction, you can use this instead:

Word:
{{re=\s\d+\.\s}}

Pronunciation:
<s>
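
The trade-off between the two expressions can be sketched the same way. Again this uses Python's re module as a stand-in for TextAloud, and the sample strings are made up.

```python
import re

LIMITED = re.compile(r"\s\d{1,3}\.\s")  # list numbers of 1 to 3 digits
UNLIMITED = re.compile(r"\s\d+\.\s")    # list numbers of any length

long_list = "Item\n1000. Last entry"
sentence = "The year was 2007. Then things changed."

long_list_limited = LIMITED.sub(" ", long_list)
long_list_unlimited = UNLIMITED.sub(" ", long_list)
sentence_limited = LIMITED.sub(" ", sentence)
sentence_unlimited = UNLIMITED.sub(" ", sentence)

print(repr(long_list_limited))   # 'Item\n1000. Last entry' (not matched)
print(long_list_unlimited)       # Item Last entry
print(sentence_limited)          # The year was 2007. Then things changed.
print(sentence_unlimited)        # The year was Then things changed.
```

So removing the restriction handles longer list numbers, but it also suppresses a 4 digit number such as a year that ends a sentence.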

Page Numbers

As with numbered lists, suppressing page numbers depends on how these numbers are formatted. In this example, we'll assume the page numbers are both preceded and followed by newline characters, and that the page number string itself is the letter 'p' followed by a period and a number.

So the example text we're trying to match looks something like this:


... end of page twenty.

       p. 20

Start of page twenty one ...



An expression to suppress these page numbers could look like this:

Word:
{{re=(\r\n\s*)p\.\s*\d+\s*(\r\n)}}

Pronunciation:
<s>

Here is an explanation of what this is doing.

\r\n matches a carriage return/line feed pair in the text, so the expression is looking for any string that starts with a new line. After this initial newline, we allow any whitespace that may follow: \s* matches 0 or more whitespace characters, in any combination of newlines, spaces and tabs, up to the next part of the expression, the letter 'p' followed by a period. After the period, the pattern allows for more whitespace, then a number (\d+). After the number, the pattern allows more optional whitespace, and requires a newline (\r\n) to terminate.
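
Here is the same page number pattern tried in Python's re module, as a sketch only. It assumes Python behaves like TextAloud's engine for this pattern, uses a single space in place of <s>, and builds the sample text with explicit \r\n line endings.

```python
import re

# Same pattern as the Word field, without the {{re=...}} wrapper.
PAGE_NUMBER = re.compile(r"(\r\n\s*)p\.\s*\d+\s*(\r\n)")

text = (
    "... end of page twenty.\r\n"
    "\r\n"
    "       p. 20\r\n"
    "\r\n"
    "Start of page twenty one ..."
)

result = PAGE_NUMBER.sub(" ", text)
print(result)  # ... end of page twenty. Start of page twenty one ...

# The pattern requires \r\n, so text with bare \n line endings is unchanged.
unix_text = text.replace("\r\n", "\n")
unix_result = PAGE_NUMBER.sub(" ", unix_text)
print(unix_result == unix_text)  # True
```

The match runs from the newline after "twenty." through the newline before "Start", so the whole page number block, blank lines included, is replaced by one space.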

For help getting started with regular expressions check these forum threads:

http://www.nextup.com/phpBB2/viewtopic.php?t=475

http://www.nextup.com/phpBB2/viewtopic.php?t=2430

There is also a good online reference at
http://www.regular-expressions.info/

If you have questions, either post them on the forum or send us an email at support@nextup.com.
Jim Bretti
NextUp.com
Listen and Learn Anywhere
http://www.NextUp.com