
annex⁴ - Amazon Polly (TTS)


Rexabyte


I just completed preliminary work on an Amazon Polly (text to speech) driver and figured I'd ask the community for features they would like to see before I start a beta. At the moment the driver can synthesize text through a programming command and render it out over an announcement. We do have plans to add our 'reveal' connections to the driver to automatically render messages coming from other drivers as well.

Please note that users will not be required to purchase anything but the driver in order to synthesize speech; annex⁴ will be absorbing the recurring service costs.

What features would you like to see?



I should also have mentioned the voices. They are pulled from the API based on language: one property selects the language, and a second property selects a voice from those available for that language.

These are all of the languages to be supported, each with at least 1 voice but generally more:
cy-GB | da-DK | de-DE | en-AU | en-GB | en-GB-WLS | en-IN | en-US | es-ES | es-US | fr-CA | fr-FR | is-IS | it-IT | ja-JP | nb-NO | nl-NL | pl-PL | pt-BR | pt-PT | ro-RO | ru-RU | sv-SE | tr-TR


An update: dynamic variables are now included in the driver. We've added current date/time information as well as controller service status and Director uptime. The dynamic variables work similarly to Lua patterns, using %. Does anyone have any specific dynamic variables they would like included?

%DATE, %TIME, %UPTIME


To be used like this when creating your synthesized text:
 

It is %DATE at %TIME. Director has been up for %UPTIME.


Pulling information from other driver variables is done with a comma-separated value.
 

The temperature outside is %VAR1 degrees.

-- Where %VAR1 is a comma-separated value of device ID and variable ID.
-- 843,1001
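To make the substitution rules above concrete, here is a minimal Python sketch of how such tokens could be expanded. It is not the driver's actual code (the driver itself would be Lua); `render_template`, `var_refs`, and the `read_variable` callback are hypothetical names, and the date and time formats are arbitrary choices for illustration.

```python
import re
from datetime import datetime


def render_template(text, var_refs, read_variable, now=None, uptime="unknown"):
    """Expand %DATE, %TIME, %UPTIME and %VARn tokens in text.

    var_refs maps a 1-based index to a "device_id,variable_id" string.
    read_variable is a hypothetical callback returning a variable's value.
    """
    now = now or datetime.now()
    builtins = {
        "DATE": now.strftime("%Y-%m-%d"),
        "TIME": now.strftime("%H:%M"),
        "UPTIME": uptime,
    }

    def expand(match):
        name = match.group(1)
        if name in builtins:
            return builtins[name]
        if name.startswith("VAR") and name[3:].isdigit():
            ref = var_refs.get(int(name[3:]))
            if ref is not None:
                # Split "843,1001" into device ID and variable ID.
                device_id, variable_id = (int(p) for p in ref.split(","))
                return str(read_variable(device_id, variable_id))
        return match.group(0)  # leave unknown tokens untouched

    return re.sub(r"%([A-Z]+\d*)", expand, text)
```

Called with the text "The temperature outside is %VAR1 degrees." and `var_refs={1: "843,1001"}`, the sketch looks up variable 1001 on device 843 through the callback and splices the value into the sentence.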



On a side note, the voice selection is quite good. The following voices are available in English; you can find examples of some of these on the Amazon Polly page.

en-US:

  • Joanna
  • Salli
  • Kimberly
  • Kendra (Old)
  • Justin (Young)
  • Joey
  • Ivy (Young)

en-GB:

  • Emma
  • Brian
  • Amy

en-AU:

  • Russell
  • Nicole
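
The language and voice properties described earlier amount to grouping a voice list by language code. A rough Python sketch, using only the English voices listed above: the `Id` and `LanguageCode` field names are an assumption modeled on how a voice listing might be returned, and `voices_by_language` is a hypothetical helper, not part of the driver.

```python
from collections import defaultdict

# English voices as listed in this thread; field names are assumed.
VOICES = [
    {"Id": "Joanna", "LanguageCode": "en-US"},
    {"Id": "Salli", "LanguageCode": "en-US"},
    {"Id": "Kimberly", "LanguageCode": "en-US"},
    {"Id": "Kendra", "LanguageCode": "en-US"},
    {"Id": "Justin", "LanguageCode": "en-US"},
    {"Id": "Joey", "LanguageCode": "en-US"},
    {"Id": "Ivy", "LanguageCode": "en-US"},
    {"Id": "Emma", "LanguageCode": "en-GB"},
    {"Id": "Brian", "LanguageCode": "en-GB"},
    {"Id": "Amy", "LanguageCode": "en-GB"},
    {"Id": "Russell", "LanguageCode": "en-AU"},
    {"Id": "Nicole", "LanguageCode": "en-AU"},
]


def voices_by_language(voices):
    """Group voice IDs by language code, preserving list order."""
    grouped = defaultdict(list)
    for voice in voices:
        grouped[voice["LanguageCode"]].append(voice["Id"])
    return dict(grouped)
```

Picking a language in one property would then narrow the second property's choices to `voices_by_language(VOICES)[language]`.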

This sounds like a really nifty driver! I can't think of any other dynamic variables needed, especially as one can (as I understand it) bring in static variables using device ID and variable ID.

What would be nifty is if you could include one or two standard sound effects (e.g. Doorbell ring) for getting attention...

I could see the following being useful:

*VAR2 The front door bell has been pressed

Where *VAR2 could be *RING, *DINGDONG etc.

Then, I assume that you will allow some delay (user set) up front to allow for amps that need to power on etc.

When will this be available?

Definitely a driver I will buy. 

 

 


I'll see if I can get it in beta shape tomorrow. I'll PM when it's ready so you can give it a whirl.

As for the sound effects, I'll see what I can do. I would likely require sound effects to sit at the start or end of the text so I don't have to split the announcement.


  • 4 months later...

I have finally had some time to finalize this driver and get it ready for release. If you're interested in a bit of beta testing let me know either here or through PM and I can send the driver to you. I'll be providing free licenses to the first few beta testers. Note Amazon Polly currently supports over 50 voices.

There is also some extended functionality for those that want to do more custom text to speech with SSML. You can find more information about SSML here: https://docs.aws.amazon.com/polly/latest/dg/supported-ssml.html

An example below:

<speak>He was caught up in the game.<break time="1s"/> In the middle of the 10/3/2014 <sub alias="World Wide Web Consortium">W3C</sub> meeting he shouted, "Score!" quite loudly. When his boss stared at him, he repeated <amazon:effect name="whispered">"Score"</amazon:effect> in a whisper.</speak>
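
For anyone generating SSML strings programmatically rather than typing them by hand, a small Python sketch of helpers that assemble the tags used in the example above. The helper names (`speak`, `text`, `pause`, `whisper`) are made up for illustration; the only real library call is `xml.sax.saxutils.escape`, which escapes `&`, `<`, and `>` so user text cannot break the markup.

```python
from xml.sax.saxutils import escape


def text(s):
    """Plain text, escaped so it is safe inside SSML."""
    return escape(s)


def pause(seconds):
    """A <break> of the given length in seconds."""
    return f'<break time="{seconds}s"/>'


def whisper(s):
    """Wrap text in the whispered effect shown in the example above."""
    return f'<amazon:effect name="whispered">{escape(s)}</amazon:effect>'


def speak(*parts):
    """Join the parts inside a single <speak> root element."""
    return "<speak>" + "".join(parts) + "</speak>"
```

For example, `speak(text("Dinner is ready."), pause(1), whisper("Come quietly."))` produces one `<speak>` document that could be pasted into the text field of the programming command.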

The installation process is pretty simple. You just drag the driver into the project and create an announcement that matches what is set in the driver properties. You don't need to worry about any third party services, recurring charges, or logins for other services.


Honestly, I haven't played with many of them, just the en-US and en-AU ones. As far as I can tell there is nothing like Vader, but you might be able to find something close for Hal or C3PO.

This video contains quite a few of the voices, although there are some missing:

 


The latest version of the driver (0.3.103) is now up and has a few changes. It now includes some much-needed documentation as well as a new action to view variable IDs and names on drivers. This will make it easier to identify variables you want to inject into the text-to-speech strings. I also merged some of the programming commands to make things simpler.

From the feedback, it sounds like I will need to add voice selection to individual programming commands rather than only a global voice for all commands. This will be coming in the next beta version, which will likely be the last.

Thank you everyone for the feedback!


13 minutes ago, TheWizard said:

Or you can just download the new driver.

Can you download it if you're not a dealer?  Can I just pull the rtf file out of the c4z file on the controller?  Just use something like WinSCP to connect to the controller and copy it to my PC.


I had to do a Controller reboot to see the updated Actions in the Programming tab of Composer.

The driver is coming along nicely 

Do you truncate the number of variables from each device?  The Wunderground Weather Station driver has about 56 variables if I look under Agents - SNMP but when I look in the variable dropdown list I only see about 34.  And in those 34 I can't find CURRENT_TEMPERATURE or TEMPERATURE_C.  Maybe that is an issue in that Weather Underground driver as most of those variables are 0 or Undefined, at least for me.  But that driver is using a tstat proxy and many of these variables aren't relevant for a weather station, like Setpoints.

Can you do two Polly Actions in one script?  It isn't working when I try.

When you don't have any variables you still get (,,,,,), as in "Synthesize text - that is all for now (,,,,,)" in the Synthesize text command. Should you get that?


If you ran two Polly actions at once one of them would overwrite the other. I can queue them up in the driver so that they execute after each other sequentially. I can also likely add an event (OnAnnouncementFinished) for when an announcement finishes playing if you wanted to do something afterwards.
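
The queuing idea can be sketched roughly as follows, in Python for illustration only. `AnnouncementQueue`, the `play` callback, and `playback_finished` are hypothetical names; `on_finished` stands in for the proposed OnAnnouncementFinished event, and nothing here reflects how the driver actually implements it.

```python
from collections import deque


class AnnouncementQueue:
    """Play announcements one at a time instead of overwriting each other."""

    def __init__(self, play, on_finished=None):
        self._play = play                # hypothetical: starts playback of one text
        self._on_finished = on_finished  # hypothetical: fired after each announcement
        self._pending = deque()
        self._current = None

    def submit(self, text):
        """Queue a text; start it immediately if nothing is playing."""
        self._pending.append(text)
        if self._current is None:
            self._start_next()

    def playback_finished(self):
        """Call when the current announcement ends; starts the next one."""
        if self._current is None:
            return
        finished, self._current = self._current, None
        if self._on_finished:
            self._on_finished(finished)
        self._start_next()

    def _start_next(self):
        if self._pending:
            self._current = self._pending.popleft()
            self._play(self._current)
```

Two actions issued back to back would then play in order: the second stays in the queue until `playback_finished` is called for the first.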

The (,,,,,) portion is occurring because of the way I formatted the description of the command in programming. "that is all for now" would be the text you entered (PARAM1), and the stuff in brackets is all of the variables that have been entered, in this case nothing.

Synthesize text - PARAM1 ( PARAM2, PARAM3, PARAM4, PARAM5, PARAM6, PARAM7 )

I can completely remove this from the description but I figure it would be useful to have something there to show the parameters in programming.
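
As an illustration of why the empty parameters show up as (,,,,,), and of one possible middle ground between showing everything and removing it, here is a small Python sketch. Both functions are hypothetical and only mimic the description format quoted above; the exact spacing Composer renders may differ.

```python
def describe(text, params):
    """Always list every parameter slot, even when empty."""
    return f"Synthesize text - {text} ({','.join(params)})"


def describe_compact(text, params):
    """List only the parameters that were actually filled in."""
    used = [p for p in params if p]
    suffix = f" ({', '.join(used)})" if used else ""
    return f"Synthesize text - {text}{suffix}"
```

With six empty parameters, `describe` yields the five commas seen in programming, while `describe_compact` drops the brackets entirely until at least one variable is set.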

I've also noticed some of the variables missing from the drop-down, but the drop-down is a feature of Composer, so it's just not including them. This may be because of the variable type (STRING) or something else; I can take a look into it.


On 1/7/2018 at 12:08 AM, TheWizard said:


I've also noticed some of the variables missing from the drop-down, but the drop-down is a feature of Composer, so it's just not including them. This may be because of the variable type (STRING) or something else; I can take a look into it.

Great driver.  Yes, the drop down list seems to only be the string variables so number variables etc. can only (currently) be injected using the device ID method. Intuitively, it is much easier to “program” using the variables, so it would be nice to have all of them appear in the drop down list and to have this ordered alphabetically (possibly a C4 limitation?). I will PM some further comments in the next couple of days...

 


You could just generate the file, rename it on the controller and then select it in a different announcement. That may be a bit cumbersome for most. What if I include an action in the actions tab so you can create a synthesized file with whatever filename, text, and voice you want?


I was thinking along the lines of the driver being smart enough to handle it on its own. For example, the generated text could be "front door opened". If "front door opened.wav" exists in \\sambashare\, play the local copy; otherwise, generate the WAV on the NAS and then play it from there.
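
The check-then-generate idea could look something like this Python sketch. `cached_speech` and the `synthesize` callback (which returns audio bytes) are hypothetical. One deliberate difference from the suggestion above: the file is named by a hash of the voice and text rather than by the text itself, since arbitrary text may contain characters that are not valid in filenames.

```python
import hashlib
from pathlib import Path


def cached_speech(text, voice, cache_dir, synthesize):
    """Return the path to audio for text, synthesizing only on a cache miss.

    synthesize is a hypothetical callback: (text, voice) -> audio bytes.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Key on voice and text so the same phrase in another voice is separate.
    key = hashlib.sha256(f"{voice}\n{text}".encode("utf-8")).hexdigest()[:16]
    path = cache_dir / f"{key}.wav"
    if not path.exists():
        path.write_bytes(synthesize(text, voice))
    return path
```

Repeated phrases such as "front door opened" would then hit the service once and be served from the share afterwards.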


The driver has now been released. The latest change added support for queuing as well as an event for when announcements finish.

You can find the driver here: https://annex4.link/drivers/voice-genie

Anyone who had a beta version of the driver will have to install the new one fresh, as it underwent a filename change. It also has a new name to keep it service independent. If you beta tested and want a free license, shoot me a PM and I'll provide it. This new version is technically a whole new driver on the platform, so your 21-day trials will reset if you want to try out the last two features added.


Archived

This topic is now archived and is closed to further replies.
