SOME REGULAR EXPRESSION PROBLEMS

Forum for TextAloud version 3

Moderator: Jim Bretti

Slyths
Posts: 22
Joined: Thu Apr 10, 2014 6:09 am

SOME REGULAR EXPRESSION PROBLEMS

Post by Slyths »

Good day, great TA members,

Can someone lend me a hand with the two issues below?
1. I use a regular expression to assign a non-default voice to read anything that comes after an open parenthesis, and I use the regular expression [\)] with the respell </voice> to return to the default voice when a close parenthesis is encountered. I use another regular expression to assign a different non-default voice to anything that comes after an open bracket, and the regular expression [\]] with the respell </voice> to return to the default voice when a close bracket is encountered. All of this works fine, except when the close parenthesis directly follows a close bracket, like this: ]). In that situation the voice doesn't return to the default but continues as if no closing parenthesis had been encountered, and the voice assigned to the parentheses behaves as if it were the default. Example:

Text aloud is the best reading software (ever designed [and we] while) say kudos to the programmers. Text aloud is the best reading software (ever designed [and we]) say kudos to the programmers.

In the above paragraph, I have no problem with the first sentence, but in the second sentence the voice assigned to the parentheses continues to the end. All three voices (the default and those assigned to parentheses and brackets) are Microsoft SAPI 5 voices.

2. I use the regular expression [\[](\d*?)[\]] to skip reference numbers in articles. It works fine, except when a bracket cites more than one reference and so contains a comma or a full stop, like [2,3].
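
To make the second issue concrete, here is a minimal sketch in Python. It assumes Python's `re` module is a fair stand-in for TextAloud's regex engine for this simple pattern, which the thread does not confirm. Between the brackets the pattern allows digits only, so a comma, full stop, or space prevents a match.

```python
import re

# The pattern from the post: an open bracket, digits only, a close bracket.
digits_only = re.compile(r"[\[](\d*?)[\]]")

for sample in ["[2]", "[2,3]", "[4.3, 6]"]:
    # fullmatch succeeds only if the whole sample fits the pattern.
    matched = digits_only.fullmatch(sample) is not None
    print(f"{sample}: {'matches' if matched else 'no match'}")
```

Only the first sample matches, because \d does not cover the comma, the full stop, or the space.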

Thank you.

Last edited by Slyths on Fri Aug 15, 2014 11:56 am, edited 1 time in total.
PHenry1026
Posts: 231
Joined: Thu Jan 11, 2007 12:10 pm

Re: SOME REGULAR EXPRESSION PROBLEMS

Post by PHenry1026 »

For the voice change on parenthesis problem, you can use the following post as a template:

http://www.nextup.com/phpBB2/viewtopic. ... sis#p15209



For the skip numbers problem, you can use the following post as a template:

http://www.nextup.com/phpBB2/viewtopic. ... ers#p15370
Slyths
Posts: 22
Joined: Thu Apr 10, 2014 6:09 am

Re: SOME REGULAR EXPRESSION PROBLEMS

Post by Slyths »

Thank you, PHenry, for the kind response.

I'm sure the solution in the link you provided works, as the enquirer there acknowledged. However, I am a newbie and not as good as you might think. Although it looks simple to you, I couldn't adapt it to my exact problem, which I badly and urgently want to solve. I would be grateful if you could show me how to deal with the above for now, while I dedicate time to understanding regex as much as possible. I would also appreciate some explanation of how the commands work.

You may answer in the thread you linked, so as to reduce redundancy and keep that thread as the place for problems of this nature, as they are likely to crop up from time to time.

Thank you once again.
PHenry1026
Posts: 231
Joined: Thu Jan 11, 2007 12:10 pm

Re: SOME REGULAR EXPRESSION PROBLEMS

Post by PHenry1026 »

Greetings,



Regular expression: (?#start_[)(?m)(?:^|\s|['"‘“(]|\p{Pi}|\p{Ps}|\p{Pd})\(\K(?=[^)\r\n(]+\)[’”\p{Po}\p{Pe}\p{Pf}]{0,2}(?:\s|$))
Respell: <voice required="name = Heather22k_HQ">

Regular expression: (?#end_])(?m)(?:^|\s|['"‘“(]|\p{Pi}|\p{Ps}|\p{Pd})\([^)\r\n(]+\K(?=\)[’”\p{Po}\p{Pe}\p{Pf}]{0,2}(?:\s|$))
Respell: <voice required="name = Graham22k_HQ">




The first regex, labelled (?#start_[), switches to the voice <voice required="name = Heather22k_HQ"> on encountering the open parenthesis \(. If your opening delimiter is a bracket, substitute \[ for \( in the regex.


[^)\r\n(]+ is the content of the parentheses: one or more characters that are not parentheses or line breaks. The \K drops everything matched before it from the result, so only the position after it is matched.


The second regex, labelled (?#end_]), switches the voice back to <voice required="name = Graham22k_HQ"> on encountering the closing parenthesis \). If your closing delimiter is a bracket, substitute \] for \) in the regex.
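
For readers who want to experiment outside TextAloud, the idea behind the two entries can be sketched in Python. This is a simplified approximation, not the regexes above: Python's `re` module has no \K and no \p{...} classes, so the sketch uses a lookbehind and a capture group instead, and it drops the checks on what precedes the open parenthesis and what follows the close parenthesis. The function name `tag_parentheses` is made up for illustration.

```python
import re

START_TAG = '<voice required="name = Heather22k_HQ">'
END_TAG = '<voice required="name = Graham22k_HQ">'

# Start entry: the empty position just after "(", provided that
# paren-free content and a ")" follow on the same line.
start = re.compile(r"(?<=\()(?=[^)\r\n(]+\))")

# End entry: Python lookbehinds must be fixed width, so instead of \K
# the "(" plus content is captured and written back before the tag.
end = re.compile(r"(\([^)\r\n(]+)(?=\))")

def tag_parentheses(text):
    """Insert a voice change after "(" and another one before ")"."""
    text = end.sub(lambda m: m.group(1) + END_TAG, text)
    return start.sub(START_TAG, text)

print(tag_parentheses("the best reading software (ever designed) say kudos"))
```

Both tags are inserted at a position; the parentheses themselves stay in the text.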




For me to show you how to modify the skip-numbers example, you need to give specific examples covering every situation you want skipped.
Slyths
Posts: 22
Joined: Thu Apr 10, 2014 6:09 am

Re: SOME REGULAR EXPRESSION PROBLEMS

Post by Slyths »

Oh! Thank you, PHenry, as always.
Although I haven't yet tried what you sent me, I think it better to answer your question first. Let me start with skipping the numbers, which I think is less complicated.

1. For the numbers, all I want is this: when a bracket contains ONLY numbers, even with punctuation and spaces, it should be skipped. However, if it contains even a single letter, it should be pronounced. In my articles, a bracket with no letters in it is almost always a reference, and so is less important to me during reading. E.g.
[9]
[8, 9, 10]
[4.3, 6]
are all references and should be skipped. But

[a56]
[7 come]
[give 7 egg, boiled]
contain letters and should be pronounced.

2. For the change of voices in brackets and parentheses:
Initially, I was using the regular expression [\[](.*?)[\]] to respell with the Microsoft Zira voice as follows: <voice required="name = Microsoft Zira Desktop">$1</voice>.
I was also using the regular expression [\(](.*?)[\)] to respell with Bridget as follows: <voice required="name = VW Bridget">$1</voice>
They were working fine, until I noticed that simple text respells are not applied to text inside the brackets or parentheses, e.g.,

I am a man (who works in IA)

In the above sentence, I have already respelled IA as 'in all' using simple text. But instead of pronouncing the parenthesis as 'who works in in all', Bridget pronounces the content just as it is written.

I then contacted Jim Bretti, who explained that TA doesn't support an entry within an entry. So the best way was to create two entries. For the voice starting a parenthesis or bracket, like this:
Regular expression: [\(]
Respell: <voice required="name = VW Bridget">

and then, for the return to default voice, like this;
Regular expression: [\)]
Respell: </voice>

I have to tell you that the above, sent to me by Jim Bretti, works almost perfectly, except when my closing parenthesis is immediately preceded by a closing bracket, like this: ])
If it is not preceded by a bracket, I have no trouble at all, and simple text entries within the brackets and everything else work perfectly. I suspect the bracket somehow invalidates the closing parenthesis, because when I tried adding a full stop between the bracket and the parenthesis, like this ].), it worked fine.
Please note that Jim's approach also works perfectly even when the parenthesis is preceded by a bracket, AS LONG AS I deactivate the regular expression that assigns Zira to read anything after an open bracket. (Remember that Zira reads what is between [ and ], Bridget reads what is between ( and ), and I have a different default voice which reads the rest of the article.)

I'm also not comfortable with the idea of assigning a specific voice at my closing bracket or closing parenthesis. I would rather return to the default voice with </voice> upon encountering a closing bracket or closing parenthesis, for the following reason:

Here I am using my default voice (and here in parenthesis I want Bridget [while here in brackets for Zira] and Bridget should still continue since the parenthesis has not closed until now) but from here, it should return to the default voice which started the sentence.

In the above sentence, three voices are at work. You will also observe that the third voice, Zira, interrupts while the second voice, Bridget, is active. When Zira finishes, the voice returns not to the default but to Bridget, because the parenthesis has not closed. Bridget stops only when the parenthesis closes. From there my first voice, Heather, takes over.
If I were to assign Heather specifically as the return voice for a closing bracket or closing parenthesis, Bridget would be terminated prematurely, right after the closing bracket that ends Zira.
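
The nesting described above can be sketched as a small stack in Python. This is only a model of the behaviour I am asking for, assuming voice tags nest the way XML elements do; it is not how TextAloud or SAPI 5 is implemented, and the function name `speakers` is made up for illustration.

```python
import re

# Matches an opening voice tag (capturing the voice name) or a closing tag.
TOKEN = re.compile(r'<voice required="name = ([^"]+)">|</voice>')

def speakers(text, default="Heather"):
    """Return (voice, segment) pairs; </voice> pops back to the enclosing voice."""
    stack = [default]
    result = []
    pos = 0
    for m in TOKEN.finditer(text):
        segment = text[pos:m.start()].strip()
        if segment:
            result.append((stack[-1], segment))
        if m.group(1) is not None:
            stack.append(m.group(1))   # a new voice starts
        elif len(stack) > 1:
            stack.pop()                # return to whichever voice was active before
        pos = m.end()
    tail = text[pos:].strip()
    if tail:
        result.append((stack[-1], tail))
    return result

sample = ('default text <voice required="name = VW Bridget">in parenthesis '
          '<voice required="name = Microsoft Zira Desktop">in brackets</voice> '
          'still in parenthesis</voice> back to default')
for voice, segment in speakers(sample):
    print(f"{voice}: {segment}")
```

With a fixed return voice instead of a pop, the segment "still in parenthesis" would go to the default voice rather than to Bridget, which is the premature termination I want to avoid.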


Thank you for having the patience to listen to me. Hope to see your response soon.
PHenry1026
Posts: 231
Joined: Thu Jan 11, 2007 12:10 pm

Re: SOME REGULAR EXPRESSION PROBLEMS

Post by PHenry1026 »

The following regex should do the skip for you.

Enter as regular expression

(?#Skip)(?m)(?:^|\s|['"‘“(]|\p{Pi}|\p{Ps}|\p{Pd})\K\[[0-9][0-9,. ]*\](?=[’”\p{Po}\p{Pe}\p{Pf}]{0,2}(?:\s|$))

then choose Skip text.
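
As a quick check of the bracket part of that entry against the examples given earlier, here is a minimal sketch in Python. It tests only the core \[[0-9][0-9,. ]*\] and leaves out the surrounding context checks, since Python's `re` module supports neither \K nor \p{...}; treating it as a stand-in for TextAloud's engine is an assumption.

```python
import re

# Core of the Skip entry: "[", one digit, then only digits, commas,
# full stops or spaces, then "]".
reference = re.compile(r"\[[0-9][0-9,. ]*\]")

should_skip = ["[9]", "[8, 9, 10]", "[4.3, 6]"]
should_speak = ["[a56]", "[7 come]", "[give 7 egg, boiled]"]

for sample in should_skip + should_speak:
    verdict = "skip" if reference.fullmatch(sample) else "speak"
    print(f"{sample}: {verdict}")
```

Any letter inside the brackets falls outside the character class, so those brackets are left alone.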

On the voice change issue:

If you replace \) with \]\) the regexes I supplied to you should work as you desire

In the second regex instead of using a specific voice, I guess you can use </voice> to get back to the default voice.

If you are going to be a serious user of TextAloud, you should avoid using Simple Text. In any case, place any Simple Text entry below the regular expressions for the voice change.
Slyths
Posts: 22
Joined: Thu Apr 10, 2014 6:09 am

Re: SOME REGULAR EXPRESSION PROBLEMS

Post by Slyths »

May I say here that your guide solved my problem 100% to my satisfaction. I tested it every way I could and everything worked perfectly. Thank you very much.

For those following our discussion, this is how I solved my problem:
I just replaced the \) after the equal sign with \]\) in the command sent by PHenry, like this: (?#end_])(?m)(?:^|\s|['"‘“(]|\p{Pi}|\p{Ps}|\p{Pd})\([^)\r\n(]+\K(?=\]\)[’”\p{Po}\p{Pe}\p{Pf}]{0,2}(?:\s|$)) For the skipping of numbers, I simply copied and pasted what he wrote. Both worked like magic. I left all my previous entries as they are; I have yet to explore the power of the regex with the "start".

I know I need to educate myself on regex, but I'm wondering why using the regular expression \]\) with the respell </voice> didn't address the issue.

I wish I knew what every character in your command means. Is there any guide that will make me conversant with regex within a VERY SHORT time? Not too detailed for now, but at least enough to solve a lot of my problems while I study in detail later.

Thank you PHenry, your help is invaluable.

:D :D
PHenry1026
Posts: 231
Joined: Thu Jan 11, 2007 12:10 pm

Re: SOME REGULAR EXPRESSION PROBLEMS

Post by PHenry1026 »

Slyths wrote: "I know I need to educate myself on regex, but I'm wondering why using the regular expression \]\) with the respell </voice> didn't address the issue."
I just noticed your follow-up question. Respelling with </voice> did not work because of a known bug in TextAloud. It is a serious and annoying bug that should be fixed: I believe it affects a lot of users, but very few TA users report it.

References for regular expressions can be found here: http://www.pcre.org/pcre.txt or http://www.regular-expressions.info/unicode.html

I do not believe you can learn regular expressions adequately from these sources. When I started learning regular expressions to use with TextAloud, I posted questions to the TextAloud forums and experimented until the regex worked. I wished there had been a library of regexes that I could use as a starting point.
Slyths
Posts: 22
Joined: Thu Apr 10, 2014 6:09 am

Re: SOME REGULAR EXPRESSION PROBLEMS

Post by Slyths »

Thank you, PHenry, for the additional insight.

I ran into similar problems again, so much so that I had to send a mail to TA. I very much appreciate Jim Bretti's response, but it didn't completely solve the problem (although he educated me a lot more on the issue, and I believe he must be working hard on it). Unfortunately, he hasn't heard from me since then, so he doesn't even know how it went. It was a weakness on my part not to furnish you with more information. Believe me, I've been very busy since I made that post and didn't even have the time to respond to Jim. I will share on the forum some of the helpful hints he offered me.

I visited the site today out of strict necessity. I finished my work as usual; in fact, TA was the last programme I was using last night, as I have become so attached to it (no one should underestimate the value this programme adds to one's life). Unfortunately, I woke up this morning to discover that all my articles, including archives, have vanished. I checked the 'Data Folder' directory but found nothing. Except for my dictionaries, it's as if I'm installing the programme for the very first time. I searched my whole computer (hidden files, system files, archived files, temporary files and so on) but found nothing. Since morning I've been combing the net for good data recovery programmes and have installed a couple of them, but with no success.

Sincerely, I would have pressed for damages. I'm devastated! Months of real hard work, important materials, you name it, have vanished into thin air.
But I'm taking it lightly with the TA programmers for obvious reasons: they are very responsive whenever one has a complaint. Many times I have felt that I was pestering them, but they never fail to respond. I have also used other programmes, including IVONA and NaturalReader, but they are nowhere close to TA. So I accept that a programming failure can happen with any programme. However, they must look into this and ensure that the safety of files comes first in everything they do. Otherwise, they are very much at risk of losing customers. I hope this disaster never happens to anyone.
PHenry1026
Posts: 231
Joined: Thu Jan 11, 2007 12:10 pm

Re: SOME REGULAR EXPRESSION PROBLEMS

Post by PHenry1026 »

Getting respell with </voice> to work

You can get </voice> to work as you intended if you follow these steps:

1. Create a new TA Dictionary

2. In the new Dictionary enter only the two regexes for the voice change.

3. Turn off all other TA Dictionaries (only the new dictionary should be checked on).

</voice> should now be working as you intended.

Proving that this is a Bug

1. Make the new Dictionary the last Dictionary

2. Turn on all your other dictionaries.

With all your dictionaries now on, </voice> should crash TextAloud.


The reason why this is such a nasty bug is that almost any end tag, not just </voice>, will make TextAloud unstable and crash the program.
Jim Bretti
Posts: 1558
Joined: Wed Oct 29, 2003 11:07 am

Re: SOME REGULAR EXPRESSION PROBLEMS

Post by Jim Bretti »

On the </voice> end tag (and other end tags) ... I know of a few instances where end tags do not work. One example is if an audio clip is enclosed in <voice required...> and </voice> tags.

I'm not aware of this causing stability problems or crashing TextAloud ... the only cases I've seen are ones that display error messages. The most common error is a cryptic one generated by Sapi5, "Speech Error: Invalid at the top level of the document". I believe there are a few others, but they boil down to Sapi5 xml parsing errors. I probably need to look at performing more validation in the TextAloud application so we at least get some meaningful error messages.

If there are cases where this actually causes TextAloud to crash, I need to see examples; I'm not able to duplicate it myself. If you can duplicate a crash, can you send me sample article text and a pronunciation dictionary that cause the crash?

In the cases where end tags do not work, the only workaround I have for now is to avoid using them, and instead, insert actual voice changes in the text. So instead of this:
some text <voice required= ... > more text </voice> and more text
do this
some text <voice required= ... > more text <voice required= ... > and more text
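
For text that already contains end tags, the substitution above could be sketched in Python as follows. This is an illustration only, not TextAloud code: the function name `expand_end_tags` and its `default_voice` parameter are made up, the voice names are the ones mentioned earlier in this thread, and it assumes every voice tag uses the required="name = ..." form.

```python
import re

# Matches an opening voice tag (capturing the voice name) or a closing tag.
TAG = re.compile(r'<voice required="name = ([^"]+)">|</voice>')

def expand_end_tags(text, default_voice):
    """Replace each </voice> with an explicit change back to the enclosing voice."""
    stack = [default_voice]

    def swap(m):
        if m.group(1) is not None:
            stack.append(m.group(1))   # remember the voice being opened
            return m.group(0)          # leave opening tags untouched
        if len(stack) > 1:
            stack.pop()                # drop the voice that just ended
        return f'<voice required="name = {stack[-1]}">'

    return TAG.sub(swap, text)

before = 'some text <voice required="name = VW Bridget">more text</voice> and more text'
print(expand_end_tags(before, "Heather"))
```

Keeping a stack means nested voice changes return to the voice that enclosed them, not always to the default.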

I apologize for the problem and hope to have this working better in future versions.
Jim Bretti
NextUp.com
PHenry1026
Posts: 231
Joined: Thu Jan 11, 2007 12:10 pm

Re: SOME REGULAR EXPRESSION PROBLEMS

Post by PHenry1026 »

Greetings Jim,

I have sent you the files to simulate a crash.

My definition of a crash is that the error stops TextAloud from working. I am not sure if this is the definition of a crash you are also using.
Jim Bretti
Posts: 1558
Joined: Wed Oct 29, 2003 11:07 am

Re: SOME REGULAR EXPRESSION PROBLEMS

Post by Jim Bretti »

I'm not trying to get hung up on terms ... what I mean is that if we're not able to handle one of these cases in the app, we should display an error message, and you should be able to use TextAloud to speak another article. If *any* error causes TextAloud to become unusable until it is restarted, forces the app to exit, or anything like that, it's something we need to look at. We want to be able to recover from errors.
Jim Bretti
NextUp.com