Batch conversion bug and feature requests

Forum for TextAloud version 3

Moderator: Jim Bretti


Postby StarBeamAlpha » Mon Apr 06, 2009 2:03 pm

Thanks for your transparency in the development of this program!

I have a PDF created with ABBYY FineReader Pro from a scanned book, and when I run it through the batch converter, TextAloud spells out about one word per sentence letter by letter instead of pronouncing it.

For example, in this sample PDF, TextAloud spells out: thermodynamics, Maxwellian, They, who, Joule, Thomson, Maxwell

If I copy and paste from Adobe into a text file, or save the PDF as a text file, and then batch convert that text file, TextAloud pronounces all of the above words fine and does not spell them out.

Here is a link to the sample pdf

https://mywebspace.wisc.edu/jcswanson/w ... %20103.pdf

I would assume there is either an error in TextAloud's PDF-to-text conversion, or ABBYY FineReader uses an unusual PDF format that you did not encounter when writing the conversion. For instance, it may be adding spaces between the letters of certain words, either in the text itself or through formatting.
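To make the guess above concrete, here is a minimal Python sketch of what "spaces between the letters" would look like and how it could be undone. This is only an illustration of the hypothesis, not how TextAloud extracts text; the function name `collapse_letter_spacing` is made up for this example, and it only handles plain ASCII letters.

```python
import re

# Hypothetical helper: matches runs of 3 or more single ASCII letters
# separated by single spaces, e.g. "J o u l e".
_SPACED_WORD = re.compile(r"\b(?:[A-Za-z] ){2,}[A-Za-z]\b")


def collapse_letter_spacing(text: str) -> str:
    """Join letter-spaced words back together; leave other text alone."""
    return _SPACED_WORD.sub(lambda m: m.group(0).replace(" ", ""), text)


if __name__ == "__main__":
    sample = "the J o u l e and T h o m s o n effect"
    print(collapse_letter_spacing(sample))
```

A rule this simple can also glue together genuine one-letter words that happen to sit next to each other (for example "I a m"), so it is a rough check rather than a fix.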


Also here are some features I would find useful in the batch converter:

Add new folder creation to the batch converter
Add drag and drop to the batch converter
Allow the program to fully minimize so I can see the desktop while converting
Show the percentage of the batch conversion completed in the taskbar
Allow TextAloud to minimize to the system tray while batch converting


Thanks for your time!
StarBeamAlpha
 
Posts: 4
Joined: Mon Mar 30, 2009 12:30 am

Postby Jim Bretti » Tue Apr 07, 2009 1:20 pm

Thanks for including the link to the problem pdf. I gave that a try here, and we're definitely having some trouble extracting the text. These types of problems generally have to do with the specific way the pdf is created. I'll need to work some with our text extraction and see if there is a way we can handle the text better.

On your other issues ...

The batch file conversion window allows you to create new folders from the Browse button.

I just added drag and drop support, so the next build will allow you to drag and drop files on the batch converter window.

The other issues, related to minimizing while a batch conversion is in progress, have been added to our enhancement list. This has come up before, so I think we'll try to do something in a future build.
Jim Bretti
NextUp.com
Listen and Learn Anywhere
http://www.NextUp.com
Jim Bretti
 
Posts: 1223
Joined: Wed Oct 29, 2003 11:07 am

Postby StarBeamAlpha » Thu Apr 09, 2009 5:47 pm

Nice! 8) Thanks for the update.

About the new folder: I didn't realize that, since I was just typing paths into the edit box. I thought that offering to create the new folder instead of giving an error would be more user friendly, as I have seen the "folder does not exist, would you like to create it?" prompt in other programs, but it's no big deal.
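Roughly the behavior I have in mind, as a small Python sketch. The name `ensure_folder` and the `confirm` callback are invented for this example and have nothing to do with TextAloud's actual code; `confirm` stands in for whatever yes/no dialog the program would show.

```python
import tempfile
from pathlib import Path
from typing import Callable


def ensure_folder(path: str, confirm: Callable[[str], bool]) -> bool:
    """Return True if the folder exists or was created, False otherwise."""
    folder = Path(path)
    if folder.is_dir():
        return True
    if folder.exists():
        # The path points at a file, so it cannot be used as an output folder.
        return False
    prompt = f"Folder '{folder}' does not exist. Would you like to create it?"
    if confirm(prompt):
        # Create any missing parent folders as well.
        folder.mkdir(parents=True, exist_ok=True)
        return True
    return False


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "audiobooks" / "physics"
        # Simulate the user clicking "Yes" in the dialog.
        print(ensure_folder(str(target), confirm=lambda prompt: True))
```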

I tried batch converting 38 PDF books of various lengths, made in various programs, and for 8 of them the resulting mp3 only read the last couple of pages of the book.

For instance, in this PDF:
https://mywebspace.wisc.edu/jcswanson/w ... 944%29.pdf

It takes 10 minutes to convert in TextAloud, but I only end up with a 1 minute 37 second mp3 file of the last couple of pages.

The mp3 file starts at the semicolon in this passage and goes to the end of the book:

"him out of the dazzling sun
in the park at Budapest, that the mysteries of religion are
built. But he comprehended them not; for, if he had not
suppressed the rich mystical inheritance of"

So something about some of the characters in this PDF is making TextAloud skip things.

This does not happen when I open the PDF in TextAloud, save it as text, and then run that text file through the batch converter. But if I convert it to text in Adobe Acrobat, the batch converter does the same thing as with the PDF version.

As a side note, I am impressed that TextAloud created a 42-hour mp3 of a 1000-page textbook! Impressive! :)

Thank you.
StarBeamAlpha
 
Posts: 4
Joined: Mon Mar 30, 2009 12:30 am

Postby Jim Bretti » Fri Apr 10, 2009 6:17 pm

I found a bug in the batch file converter that was causing the problem you noticed with the pdf in your last post. The problem doesn't have anything to do with the pdf text extraction.

If you had any PDFs (or other documents) where audio files created by the batch converter contained only the text at the end of the document, try them with the build I just posted (Beta 44).
Jim Bretti
 
Posts: 1223
Joined: Wed Oct 29, 2003 11:07 am

