The ultimate guide on TT's LoquendoTT system and how to make   

Postby Downunder35m » Sun Feb 26, 2012 9:52 pm

First, my apologies for taking so long to create this and for it being complicated, but this is a hard problem with TTs....
In this posting I will cover:
a) how to modify the crippled TT LoquendoTTS V7 voices to conform to Loquendo standards
b) how to modify certain words and phrases with the use of the working Loquendo engine

Important: I only tested on a 9.400 Navcore with the 885.4021 version of the Australia map. 880 and earlier maps will not work properly with this voice!
Your map must support British English and should be a 4021 version.
If you try a version other than 4021, make sure it contains a cphoneme.dat (all maps with CS SPEECH) and please post your results.


History of the making:
Everyone knows that a TT, compared to other devices, has a big problem with pronunciation.
Some say it is fine, others say it is so bad that they can't even guess what comes out of the speaker.
I tried many times, with more or less success, to modify a voice so that it is able to pronounce the Australian names correctly and to change some annoying words that simply no one uses in Australia.
After weeks of trial and error tests I was getting close in terms of changing some words and improving the pronunciation.
The big problem was a complete lack of information, not only for the Loquendo engine used on TTs but also for the changes TT made to the standards Loquendo uses.
This is one reason to create this tutorial ;)
One of the biggest issues was finding all the files and filenames needed for a complete Loquendo voice and the corresponding language settings.
No other system can be compared to a TT, and most files that work fine on other devices won't be accepted on a TT.

Tomtom's LoquendoTTS system (only for V7!):
If you have played with Loquendo voices you might have noticed that you need a so-called distribution and at least one voice to make it work.
Windows already has support for it, iGO uses it too, and you get various programs using the LoquendoTTS system to speak.
Apart from very costly solutions, no Loquendo voice comes with the corresponding lexicons or definitions other than for the voice itself.
You get no localisation, no dialect, not even a dictionary to define abbreviations yourself.
Loq (to make it shorter) is very expensive compared to other systems, but the most expensive part is defining a voice for regional differences.
You can find readers that use a Welsh dialect and others that offer support for different reading languages with the same voice.
Tomtom however goes the easy and cheaper way by using a phoneme definition inside the map folder - the file is cphoneme.dat.
In this file you should find all words and abbreviations that differ from the pronunciation used in the default voice for that language.
This explains why instructions from a French map sound terrible if a German computer voice is used.

Breakthrough:
In old and IMHO wrong tutorials you will always find that deleting the cphoneme.dat can help a lot with the pronunciation.
Now I understand how some users got that idea:
If the file is gone the TT uses only what the computer voice has included and no changes from TT.
The only reason to do that would be if you can provide all the needed phonetic translations for words that are not included in the selected language; the same goes for changes like from "Motorway" to "Highway".
And that is exactly where everyone was thinking wrong!
The cphoneme.dat is not only bound to the languages supported by the map but more importantly to the Loquendo engine as well!
To explain the error I have to go a bit deeper...
It all started with normal voices (recorded), followed by the Vocalizer voices and then finally the Loquendo voices.
Most people know that a TT uses the V7 voices, but barely anyone knows that there are also V6, V6.5, V6.7, V7.1 and V7.3 or V7.9 (Android only) voices made by TT.
Now I know that using the wrong Loquendo engine on your TT will cause problems with the cphoneme.dat!
What I mean is that even if the cphoneme.dat corrects some problems, user changes will cause instructions to be completely messed up, for example "Smith Street" becomes "Smith circle".
But!! If the correct Loquendo version and voice for the map and Navcore is used there is no messing up of instructions!

I never thought about it too much as all of them worked more or less well on my TT.
Until the new 9.400 Navcore arrived....
Home refuses to offer me any computer voices except the German one - and that is with English language and voice settings as well as an Australian map!
I started digging and found some minor differences in the libraries of the 9.400, which made me think that it could be related to the available voices in Home.
Tested with an older Navcore, Home offers many more voices, but not as many as I got about 2 years ago.
Assuming that the offered voices for the 9.400 are indeed updates for the current V6.9 voices I adopted the new voice system from Windows, Android and iGO (thanks to Chas521 for the input on the way!).

What does that mean in noob terms?
Well, a standard voice for your TT has about 30 files, not including the files for the libraries and Loquendo support for the TT itself.
I have not counted, but in my release you'll find far over 100 files, all done by trial and error....

The Kate in my release is a mix from V6.5 files to V7.5 files but the engine now supports all features according to Loquendo V7.3/V7.5 standards.
Great!! but I don't understand a word you're saying ;)
Overview of TT's Loq engines (to the best of my knowledge):
V6.5 : only very basic support and almost no user changes possible
V6.9 : many changes possible, but problems with the cphoneme in current maps
V7.1 : added support for different pronunciation rules, still problems with cphoneme (at least for AU)
V7.3 : added support for almost any Loquendo definition (but not with the same file/rule hierarchy), including emotions to support a better "feel" of the voice. Main targets are voice readers and the upcoming replacement for ebooks. Why read if you can listen to your book?...

You see where I'm going? ;)
Until now it was impossible to create a properly localized voice on a TT.
At least my TT now has a fully functional Loq engine with the possibility to change certain things for the sound output.
I could not override the names for places, streets and so on as defined inside the cphoneme.dat, but I could modify the engine for a perfect fit :)

How does the TT Loquendo system actually work?:
Since there is no documentation available it is hard to get down to the "secrets", but I will try to post my findings as accurately as possible.
Not even the guys at Opentom know more than you will when you have finished reading :)
I begin with some standards used on all Loq systems:
The API checks for a valid database and session configuration.
The API initialises the voice and language to bind them - this means the voice is changed with the found configurations and language settings.
The system gives the text to be spoken to the API.
This happens in two ways. First, by sending plain text, which is handled by the voice configuration as it is - meaning it checks if the word needs to be exchanged and then starts the output. Second, by sending the phonetic description of a word (mostly in SAMPA code, but others are supported too), which is spoken the way it is defined in the code.
After that the voice is unloaded and it starts again.

As you notice this is only the very simple noob description.
However it explains the problem of 32mb devices:
With the current voices a compressed voice is used. This means that with API calls like that, the device has to unpack the corresponding parts of the voice, check the databases, make changes, get all of it into the RAM and then process it for output. When done, the memory needs to be cleared and the voice unloaded.
Since this can happen basically every few seconds, a low-power device simply can't handle it. This is especially true if the voice is not completely defined and not all files are available (standard for TT :( )
On a TT it is pretty much the same but I'll try to explain in detail the parts that matter.
I assume a full voice like mine is used, as otherwise I would need several pages to go through all events during a voice event (which explains the often noticed lag in the timing of the voices in general). I won't go into the details of the API, only the parts that are relevant to voice and language handling.
The first event is the API call which checks for a valid config.
Since TT does not support anything except the defaults this always causes a tiny ignorable problem.
The default handle and session is loaded (default7.session in the Loq folder).
Now all standards for the selected voice are loaded and initialised.
This part is started by checking the data folder for the voice and language config.
Responsible for the settings are the files englishGb.lde, englishGb.ycf and englishGb.lcf for the language, while the Kate.vcf handles the voice.
The YCF is for the style, the LCF for the configuration and the LDE handles the lexicons, exceptions and other settings.
I'm not going into further details, so please check the files for what is loaded and used. I tried to document it, and most settings can be found in the official LoquendoTTS manuals.
After all this was successful the voice is loaded and "locked" to these settings.
Since most of the necessary files are missing in a normal install you can imagine the waste of CPU power and time if the TT can't find it!
However this part was only to satisfy the Loq engine and most settings are not used at all. But they need to be there and defined correctly for the engine to work properly!

Now the TT override kicks in for the first time, forcing the Loq engine to switch the output to the defaults.
The defaults are not what was defined before!!
All defaults are handled by the LDE, YCF and LCF files starting with "default" in the filename!
This is one weird thing on a TT that caused me a lot of trouble, since I have not yet found a way to override those defaults and to use the normal settings for a language.
So the procedure is the same as above, except for the fact that the .VCF is used as before.
Another thing to mention is that a TT mixes the normal language settings with the defaults at some point.
This is why both need to be defined correctly!
It does not really matter how much you try to change in the defaults, as with a correct install and a supported cphoneme.dat most things work automatically.
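
Since both the normal and the "default" definition sets have to be present, a quick check on the PC before copying to the TT can save a few reboots. Below is a minimal sketch in Python. The normal file names come from this post; the "default" names are only my assumption based on the defaultEnglishGb.lex naming, and the folder you pass in is whatever holds these files in your install.

```python
from pathlib import Path

def missing_definition_files(data_dir, language="englishGb", voice="Kate"):
    """Return the expected definition files that are not in data_dir.

    Names are compared case-insensitively because the post mixes
    'englishGb' and 'EnglishGb'. The 'default...' names are an assumption.
    """
    present = {p.name.lower() for p in Path(data_dir).iterdir() if p.is_file()}
    extensions = (".lde", ".ycf", ".lcf")
    expected = [language + ext for ext in extensions]
    # Assumed naming: "default" + language name with a capital first letter.
    default_name = "default" + language[0].upper() + language[1:]
    expected += [default_name + ext for ext in extensions]
    expected.append(voice + ".vcf")
    return [name for name in expected if name.lower() not in present]
```

An empty list means all seven files were found; anything else is the list of names to look for first.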
For all user changes it starts now.
The API gets the instructions to start the voice output with the defaults.
What happens now?
To make it short: check the mentioned files and follow the corresponding files and it will be clear to you. I don't want to go too deep here ;)
The defaultEnglishGb.lex is used to replace any word or single standing character in the text to be spoken.
The corresponding .rex file is used to handle the way certain things need to be changed, e.g.: "M one hundred and twenty-five" to "M one two five".
One thing to mention is that the output string can contain normal words as well as direct phonetic descriptions.
The words are the normal stuff the TT gives out, the phonetic descriptions are taken from the cphoneme.dat and replace the corresponding word with an override.
This means you cannot change the way a word inside the cphoneme.dat is pronounced. I have not checked if replacing the SAMPA string would work though, as the .lex can handle them properly.
So the engine cuts the stuff to be said into normal and phonetic parts and starts the output.
When done, everything is unloaded and it starts again with the next command.
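
To show what kind of rewrite the .rex rules do for road numbers, here is a small Python sketch of the "M one two five" idea. This is NOT the .rex syntax and not how the Loq engine does it internally; the function name and the pattern are made up for illustration only.

```python
import re

DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]

def read_road_number_by_digit(text):
    """Rewrite road numbers like 'M125' so each digit is read on its own."""
    def replace(match):
        digits = " ".join(DIGIT_WORDS[int(d)] for d in match.group(2))
        return match.group(1) + " " + digits
    # One capital letter, an optional space, then the digits of the road number.
    return re.sub(r"\b([A-Z])\s?(\d+)\b", replace, text)
```

Plain numbers such as the "100" in "After 100 metres" are left alone because the pattern needs the single capital letter in front.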


As I said, this is the short version and you would need to follow the mentioned files, check their contents and maybe check the !Log.txt to get a clearer and more detailed picture.
Since most users won't need any of the info in here I simply assume that anyone interested will have enough brains to use my guide and the files to follow what I have done.

For normal Loq engines it is possible to keep all parts loaded in RAM to speed up the output and processing quite a bit.
This however needs more CPU power too.
In theory it is possible to activate this feature on a TT too, which would stop the short delays between certain words, especially if a big phoneme and lexicon is used.
I would not recommend it, as most likely TT has some sort of override in place or it might eat up too much RAM. I calculated that a fully loaded voice with a long sentence can use up to 14mb if all parts are kept in RAM, which would explain why TT prefers to unload them and to clear the RAM after each output.

How to modify the voice:

A few things to know:
1. always use a Unix capable editor like Notepad++ for your mods! A normal editor will mess everything up and cause problems, apart from the fact that some files will be hard to edit with it.
2. before you start anything make sure the logging is activated and you have at least one complete set of current log files to compare.
This might be necessary for error fixing after you changed something so that you always have a reference, especially of the !Log.txt
3. only change one tiny thing at a time, check it on the running TT and if ok, continue. If you change a lot of stuff in one go you might not be able to locate the problem if something goes wrong! So learn first before you get too comfortable.
4. when doing bigger changes make many backups on the way and use a proper description so you know what was working and what was not.
In case the whole thing stops talking after a bigger change, you will lose the last changes and have to start over, but you get it back working fast and with the latest working changes!
5. always check for double entries inside the LEX files! They not only cause a lot of entries in the log but also slow the output quite badly.
6. you might notice that some of your changes have no effect at all, so check if it is something that TT overrides before you rip your hair out ;)
7. I don't recommend using the lexicon to change the voice, for example to switch between a male and a female one. It should work but will cause a lot of delay as the voices are loaded and unloaded completely between words.
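
For point 5, the check for double entries can be done by script instead of by eye. The sketch below is only that, a sketch: it assumes a plain-text lexicon with one "word" = "replacement" entry per line and ; for comments, like the lines shown from default7.session further down. If your LEX files look different, the pattern needs adjusting.

```python
import re
from collections import defaultdict

# Assumed entry layout: "key" = "value" (not confirmed for every LEX file).
ENTRY = re.compile(r'^\s*"([^"]+)"\s*=')

def find_duplicate_keys(lines):
    """Return {key: [line numbers]} for every key defined more than once."""
    seen = defaultdict(list)
    for number, line in enumerate(lines, start=1):
        if line.lstrip().startswith(";"):
            continue  # treated as a comment / disabled entry
        match = ENTRY.match(line)
        if match:
            seen[match.group(1)].append(number)
    return {key: numbers for key, numbers in seen.items() if len(numbers) > 1}
```

Feed it the lines of a lexicon, for example from open(path, encoding="latin-1"), and compare the reported line numbers with the double-entry messages in !Error.txt. The encoding here is a guess as well.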

Good luck! ;)

How to create a working voice for other languages:
I can't give a simple and easy to follow tutorial for that quest.
Why, you ask?
1. I have not checked what other voices in the correct version are available for use on a TT - if you check the bin files inside the folder for the voice you will know what I mean. Except for the voice name, the rest of the filename must match the ones used in my release.
If anyone has a collection of the original TT V7 voices (.cab) please upload.
2. You need the lexicon files (.lex) for the default language and the language used (they are the same but have different filenames), please check the "EnglishGb*" and "defaultEnglishGb*" files for reference.
If no working lexicons are available you can create your own by modifying mine. As an example, change Roundabout to Traffic Circle.
3. Except for English and German I would not be able to tell what is wrong or right, and only for Australia can I be fairly sure I know how something is pronounced or what bridge names have changed ;)

I know it sounds hard, but you already have a template. All I had to start with was misleading info and missing/incorrect files.
I had a piece of paper and a pen as I wanted to see how many reboots my TT has to make until I solved all errors (only for the install, not errors/changes for instructions) or until I had to give up again.
While uploading I started to count, one dot for each start of the TT.
Stopped counting at 287 as I had to go to the toilet urgently - that was about half way through the dots. So time how long it takes your TT to reboot, prepare and simulate a route and check if your changes are noticeable. I think you will agree that I've spent far too many hours on this project...
In case you want to create a voice I suggest starting from scratch, meaning with only the voice from the CAB file installed.
You need the folders "lib" and "bin" on your TT. They are only included in original TT voices and each voice has its own set.
If you get the voice working (should not be a problem) you can start to add the language support files and lexicons.
Do yourself a favour and activate the logging. This way you can check the error log for the file that's currently causing problems, fix it and move on to the next problem.
Again: it is basically just a matter of adjusting the paths and names inside the definition files (*.lde *.vde *.ycf *.lcf) and exchanging the lexicon files. The database files should match too. You might find suitable *.dbl *.gpr and *.phd files in the iGO voice section, but you have to check yourself.
You won't be able to get rid of errors relating to reading styles other than for the language you use and the default version of it, like Teleatlas for the AU maps. But if done correctly the cphoneme is used and the actual voice output always uses the default, not the one forced by the map.
Also, from time to time an error regarding the eng;.lcf pops up - you can ignore it as it does not affect anything.
And never work on more than one definition file at a time. Always delete the log files and restart the TT to check for changes/problems.
If the voice worked after the initial install with only the files from the CAB then it will work again once you have all the definitions right and the corresponding files in the right places.

Download of the new V7.3 Kate and instructions

Still to come - ok, only joking this time....

Please read the following carefully, completely and at least twice before installing the voice!

Before you start make sure you use an 885 map that matches what I say.
Before you install the new Kate, test the instructions and pronunciations with your current computer voice!!
Now install the new Kate and compare.


1. make a complete backup of your existing LoquendoTTS folder
2. check that you use English UK for the menu language on your TT and that your map actually supports the British English voice! It makes no sense at this stage to try the voice in other setups. If there is enough interest and help through feedback from users I will create a US version for Susan at a later stage. Final goal would be to complete all V7 voices for TT with all their regional differences.
3. check if your backup is complete - you can skip that if you have a working full backup of your device
4. delete the folder LoquendoTTS - if you have a device with internal and SD memory you should do your testing on the internal memory without an SD card present to prevent interference with the settings
5. unpack the archive for the new voice to the root of your device - all folders will be created
6. to make sure testing and checking is possible I activated the error and event logs for the Loq engine.
The logs are located inside the folder LoquendoTTS and are named !Error.txt and !Log.txt
!Log.txt shows how the engine works and how the actual output of a word is created - a must-have to find problems
!Error.txt shows all problems from the engine. This includes missing/incorrect files, version conflicts, double entries in lexicons and some other stuff.
I could not reduce the error rate to zero due to a few non-conforming things between TT's and Loquendo's view of definitions.
Also it was impossible to create the defaults for the Teleatlas pronunciation rules, which causes a constant flow of errors if a pronunciation from the cphoneme.dat is used.
But I was able to override this setting so that the same pronunciation is used together with the default rules of pronunciation.
To prevent your device from running out of free disk space within a few hours of use I highly recommend changing the following lines inside the file default7.session, located in the folder LoquendoTTS:
From
"LogFile" = "/mnt/sdcard/LoquendoTTS/!Error.txt"
"TraceFile" = "/mnt/sdcard/LoquendoTTS/!Trace.txt"

to
;"LogFile" = "/mnt/sdcard/LoquendoTTS/!Error.txt"
;"TraceFile" = "/mnt/sdcard/LoquendoTTS/!Trace.txt"

Note the ; at the beginning of the line - this will disable all logging.
Password for all is: downunder
7. Disconnect and start your TT :)
8. if you post that your device runs too slow, has a long pause between some instructions or even freezes:
I will delete your posting as I have to assume you did not follow step 6! Same for questions about the password or how to download.
Please help to keep this thread nice and clean!
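
If you switch the logging on and off a lot while testing, the edit from step 6 can be scripted. A minimal Python sketch, assuming the two keys look exactly like the lines quoted in step 6 (the function name is mine, nothing official):

```python
def disable_logging(session_text, keys=("LogFile", "TraceFile")):
    """Put a ';' in front of the logging lines of a default7.session text.

    Lines that already start with ';' are left alone, so running this
    twice does not add a second ';'.
    """
    result = []
    for line in session_text.splitlines(keepends=True):
        stripped = line.lstrip()
        if any(stripped.startswith('"' + key + '"') for key in keys):
            line = ";" + line
        result.append(line)
    return "".join(result)
```

When reading and writing the real file, open it with newline="" so Python does not touch the Unix line endings, and keep a copy of the original default7.session next to it.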

How to test this voice and post your feedback - please read and follow
To avoid unnecessary postings with useless information and time-consuming questions and problems, please follow this procedure:
1. install a suitable 885 map if not already on your device - if using several maps make sure the right one is selected
2. test the instructions and pronunciations, and compare the latter to your original voice setup. If you find wrong street types let me know.
3. first steps with new Kate:
I assume you use NC 9.400 or something similar out of the V9 range...
Click on the left pane to check the volume and if the voice is working - delete any active or planned routes first.
You would hear "Starting Demo" with my map and a normal voice - check what you have now!
Prepare and simulate a route where you know you had pronunciation problems with previous map releases.
Check if all instructions are correct and if the pronunciation of the problem words is corrected (or better than before).
Enjoy the new arrival note and don't forget to disable logging inside the default7.session ;)

Update1
Apart from Australian users, others might have noticed that the TT spells out certain things that actually don't need spelling.
For example "State Route 5" comes out as "S-T-A-T-E Route 5".
At first I thought it had to do with the exception rules for roads, freeways and highways, but no matter what I changed the output was the same.
After checking the logs I noticed that Tomtom actually thinks we need spelling for those names...
Here is the code string used for the actual voice output :
TTSEVT_TEXT ("\style=defaultEnglishGb After 100 metres, \pause=250 ,Turn left, \pause=250 , \style=ROADNREnglishAus \s National Highway M8 \style=defaultEnglishGb , towards ........
As you can see from this example, the default English lexicon is used for the normal instructions while the Australian one is used for the names.
The big problem lies in the little "\s" - it actually forces the engine to spell the next word/block of numbers.
You can turn off reading road numbers in the voice settings to stop it completely, but it is impossible to correct the spelling problem until TT decides to change it.
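
If you want to go through a lot of these TTSEVT_TEXT strings from the !Log.txt, a small script can show which style each part is spoken with and where a "\s" forces spelling. This Python sketch only knows the three tags visible in the example above (\style=, \pause= and \s); the function name is my own and other tags may exist that it does not handle.

```python
import re

# \style=NAME and \pause=N carry a value, the bare \s is the spelling switch.
TAG = re.compile(r"\\(style|pause)=(\S+)|\\s(?=\s)")

def split_by_style(event_text):
    """Return a list of (style, text, spelling_forced) tuples."""
    segments = []
    style, spell, position = None, False, 0
    for match in list(TAG.finditer(event_text)) + [None]:
        end = match.start() if match else len(event_text)
        text = event_text[position:end].strip(" ,")
        if text:
            segments.append((style, text, spell))
            spell = False  # \s only applies to the next piece of text
        if match is None:
            break
        position = match.end()
        if match.group(1) == "style":
            style = match.group(2)
        elif match.group(1) is None:
            spell = True  # bare \s found
    return segments

if __name__ == "__main__":
    sample = (r"\style=defaultEnglishGb Turn left, \pause=250 , "
              r"\style=ROADNREnglishAus \s National Highway M8")
    for segment in split_by_style(sample):
        print(segment)
```

Every tuple with True at the end is a part the TT will spell out, so you can quickly list which road names are affected on your map.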
I don't mind if you use my info for systems other than TT or if you post it elsewhere, as long as you give proper credit in your posting and refer to the original source.
Some people always think it is funny to use the hard work of others for their own gain and credit - I don't like it, so please respect me when using my work in other forums!

Of course we all would love to see more fully functional languages than GB in the future, so don't forget to share your working voices here!
For further improvements it is highly necessary to keep track of changes, so if you find or create better lexicons than the ones I used or TT has included : SHARE them!!!
It might be of big interest to have languages that are localised to a smaller area, for example one of the islands (no, not the big thing with the Queen in it...), as often the locals use completely different terms for stuff like road names or even simple things like "straight through the roundabout" instead of "across the roundabout".
If you know a way to safely edit the encrypted LEX files please let me know, especially for the nice big lexicons used for TT on Android.
TT already uses a newer and better Loq engine on Android but seems to do the full localisation using the lexicons instead of a huge phoneme.
Together they are smaller than a phoneme for TT but only contain local stuff like AU for the UK voice while all the UK stuff comes directly from the engine.
Getting these files decrypted would mean a big step forward for other small systems like those used on CE for iGO and even your own PC! Imagine you could use a cheap MS reader or free text reader that can actually handle most of the words in books. For example, try listening to a text about the British or Irish countryside without getting the sound of someone throwing up when it comes to some names and towns LOL


Your feedback is welcome and your help for testing and improving is needed!
Thanks for help and support.

Postby chas521 » Sun Feb 26, 2012 10:16 pm

DU,
What I really know about TT is very little. For those of you who don't know, I'm a moderator for the iGO section. As you members can read, pretty much all the different navs are based on a similar principle. While iGO is a WinCE format and TT is a Linux-based format, they really are related. Let me give you a short history lesson. Some years ago, a computer geek [I say that with love] developed TTS for iGO that was adapted from TT. So you can see the relationship. DU has put together an incredible package for you members. He has spent untold hours doing this for you members/guests - I know that for a fact. He is my hero. :D

Postby Downunder35m » Sun Feb 26, 2012 11:47 pm

Well, thanks for the credits!
I admit I was mostly driven by greed as I was getting more and more annoyed by the pronunciation problem.
Without TT finally sticking a bit closer to the phoneme settings used by current Loq voices it would not have been possible.
Of course you could always use a phoneme of a current 4021 map with an older 880 map but what would be the point? ;)
Next "project" is trying to get a similar setup working for the old 6.5 voice so that it can be used on 32mb devices without problems, but since the whole structure and definitions are different it might not be possible. At least not with the nice extras of the V7.3 system.
A conversion of other voices based on Loq7.1 - Loq7.3 should be no problem as long as no special characters are needed for the language, so no idea for the UTF-8 versions yet :(
This applies for example to the German, Spanish and French voices.
But if users are willing to test then I can help out on the way to create a new voice.
Depending on the TTS engine used the files can be used for other systems like iGO too. You still need to adjust the paths inside the definition files and use the databases for the OS in use, but at least you have all files needed for a correct and complete Loquendo installation.
In the next couple of days I might try to add Susan and the definitions for US English. Other languages might follow if there is enough interest and support, as I would have no clue how to create French or Spanish lexicons ROFL

Postby bangerdemon » Wed May 30, 2012 11:52 am

Many thanks for this.

I have it running with no apparent problems on a TT ONE v3, Navcore SE 9.430 and UK and ROI 890.4222 map.

Alan