Python - IBM Watson - Speech to Text

frostieff

New Member
May 12, 2021
Hi,

Aside from the pyscribe tool, I'm trying to use IBM Watson to transcribe audio using this tutorial.


This is the code he used:

from ibm_watson import SpeechToTextV1, LanguageTranslatorV3
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

ltapikey = 'YOUR LANGUAGE TRANSLATOR APIKEY'
lturl = 'YOUR LANGUAGE TRANSLATOR URL'
sttapikey = 'YOUR STT API KEY'
stturl = 'YOUR STT URL'

ltauthenticator = IAMAuthenticator(ltapikey)
lt = LanguageTranslatorV3(version='2018-05-01', authenticator=ltauthenticator)
lt.set_service_url(lturl)

sttauthenticator = IAMAuthenticator(sttapikey)
stt = SpeechToTextV1(authenticator=sttauthenticator)
stt.set_service_url(stturl)

with open('YOURFILENAME.mp3', 'rb') as f:
    res = stt.recognize(audio=f, content_type='audio/mp3', model='en-AU_NarrowbandModel', continuous=True).get_result()
voicetext = res['results'][0]['alternatives'][0]['transcript']
voicetext

greek = 'en-el'
chinese = 'en-zh'
hindi = 'en-hi'

translation = lt.translate(text=voicetext, model_id=hindi).get_result()
translatedtext = translation['translations'][0]['translation']
translatedtext

with open('result.txt', 'w') as f:
    f.write(translatedtext)



I get the following error:
TypeError: request() got an unexpected keyword argument 'continuous'

Any help would be greatly appreciated :)
 
Just remove ", continuous=True". It doesn't exist in the definition of recognize (select the text recognize and press Ctrl+I to bring up the contextual help and see all the options), at least with what I assume is the default install for all that stuff. It's a really bad tutorial for someone not familiar with this, and he doesn't explain much.
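
If you're not in JupyterLab, a rough equivalent of the Ctrl+I lookup is to ask Python which named parameters a function declares. This is only a sketch using the standard inspect module. fake_recognize is a made-up stand-in so the snippet runs without the IBM SDK, and its parameter list is an assumption, not the real signature. With the SDK installed you would pass stt.recognize instead.

```python
import inspect


def named_parameters(func):
    """Return the explicitly named parameters of func, skipping *args/**kwargs."""
    skipped = (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)
    return [
        p.name
        for p in inspect.signature(func).parameters.values()
        if p.kind not in skipped
    ]


# Hypothetical stand-in for stt.recognize; the real method has its own signature.
def fake_recognize(audio, content_type=None, model=None, **kwargs):
    return None


params = named_parameters(fake_recognize)
print(params)
print('continuous' in params)
```

If the keyword you want to pass isn't in that list, it's probably not a supported option for your installed version.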

Seems to work fine without it, at least up to that point:
SpeechToText_Tutorial.jpg


Edit: After finishing the whole script, the one obvious issue is that he doesn't iterate through the results, since his example has only one line. You'd have to implement that yourself or manually change the script for every line, which would be insane if your plan is to translate a whole movie.

I only get the translation for "ええ " from the example I used in the picture above, which translates to "-Yeah. ". Not sure why there's a "-" there.

Changing the line "voicetext = res['results'][0]['alternatives'][0]['transcript']" to use a 1 instead of the first 0 gives me the second line: "日本 全国 の 皆さん こんにちは " -> "Hello, everyone in Japan. "

My Python knowledge is very basic and I had never heard of JupyterLab before today, so I'm not sure how you'd automate iterating through all the results. You need to put everything after my screenshot in a loop and make sure the file writing appends the result to the text file (this example overwrites it), or you put all the translated results in an array and output that somehow.
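
Something along these lines might do the looping. It's a rough sketch, not tested against the real service: it assumes the response keeps the res['results'][i]['alternatives'][0]['transcript'] layout used in the tutorial, and collect_transcripts, translate_lines and write_lines are helper names I made up. sample_res is a hand-written stand-in built from the two lines I got back, so the snippet runs without any API key.

```python
def collect_transcripts(res):
    """Return the first alternative's transcript for every entry in res['results']."""
    transcripts = []
    for result in res.get('results', []):
        alternatives = result.get('alternatives', [])
        if alternatives:  # skip entries that came back with no alternatives
            transcripts.append(alternatives[0]['transcript'].strip())
    return transcripts


def translate_lines(lines, translate_one):
    """Apply translate_one (any function taking and returning a string) to each line."""
    return [translate_one(line) for line in lines]


def write_lines(path, lines):
    """Write all lines in one go, one per line, so nothing gets overwritten mid-loop."""
    with open(path, 'w', encoding='utf-8') as f:
        for line in lines:
            f.write(line + '\n')


# Hand-written stand-in for the dict returned by stt.recognize(...).get_result().
sample_res = {
    'results': [
        {'alternatives': [{'transcript': 'ええ '}]},
        {'alternatives': [{'transcript': '日本 全国 の 皆さん こんにちは '}]},
    ]
}

lines = collect_transcripts(sample_res)
print(lines)
```

With the real services you would pass res from the tutorial instead of sample_res, and for translate_one something like lambda text: lt.translate(text=text, model_id=hindi).get_result()['translations'][0]['translation'], then call write_lines('result.txt', translated). An empty 'results' list just gives an empty list back.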
 
Thanks a lot, you're still better than me though.

Now my results are showing a blank. Weird... Also, you can clean it up, but I forgot how.