Python - IBM Watson - speech to text

frostieff · Oct 18, 2021

Hi,

Aside from the pyscribe tool, I'm trying to use IBM Watson to transcribe, following this tutorial.

This is the code he used:

from ibm_watson import SpeechToTextV1, LanguageTranslatorV3
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

ltapikey = 'YOUR LANGUAGE TRANSLATOR APIKEY'
lturl = 'YOUR LANGUAGE TRANSLATOR URL'
sttapikey = 'YOUR STT API KEY'
stturl = 'YOUR STT URL'

ltauthenticator = IAMAuthenticator(ltapikey)
lt = LanguageTranslatorV3(version='2018-05-01', authenticator=ltauthenticator)
lt.set_service_url(lturl)

sttauthenticator = IAMAuthenticator(sttapikey)
stt = SpeechToTextV1(authenticator=sttauthenticator)
stt.set_service_url(stturl)

with open('YOURFILENAME.mp3', 'rb') as f:
    res = stt.recognize(audio=f, content_type='audio/mp3', model='en-AU_NarrowbandModel', continuous=True).get_result()
voicetext = res['results'][0]['alternatives'][0]['transcript']
voicetext

greek = 'en-el'
chinese = 'en-zh'
hindi = 'en-hi'

translation = lt.translate(text=voicetext, model_id=hindi).get_result()
translatedtext = translation['translations'][0]['translation']
translatedtext

with open('result.txt', 'w') as f:
    f.write(translatedtext)

I get the following error:
TypeError: request() got an unexpected keyword argument 'continuous'

Any help would be greatly appreciated.

SamKook · Oct 22, 2021

Just remove ", continuous=True". It doesn't exist in the definition of recognize (select the text "recognize" and press Ctrl+I to bring up the contextual help and see all the options), at least with what I assume is the default install of all that stuff. It's a really bad tutorial for someone not familiar with this, and he doesn't explain much.
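As for why the error names request() rather than recognize(), my guess is that recognize accepts extra keywords and forwards them to a lower-level request call that rejects them. Here is a tiny sketch of that mechanism. Both functions are made-up stand-ins, not the real ibm_watson code, so treat it as an illustration of the guess only:

```python
# Stand-ins only: they mimic how an unknown keyword could travel from
# recognize() down to request(). This is NOT the actual SDK source.
def request(method, url, data=None):
    return {'method': method, 'url': url}


def recognize(audio, content_type=None, model=None, **kwargs):
    # Any keyword not listed above lands in kwargs and is forwarded as-is.
    return request('POST', '/v1/recognize', data=audio, **kwargs)


message = None
try:
    recognize(b'fake audio bytes', content_type='audio/mp3', continuous=True)
except TypeError as err:
    message = str(err)
    print(message)

# Without the extra keyword the same call goes through.
ok = recognize(b'fake audio bytes', content_type='audio/mp3')
```

If that guess is right, any misspelled or outdated option passed to recognize would fail the same way.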

Seems to work fine without it, at least up to that point:

Edit: After finishing the whole script, the one obvious issue is that he doesn't iterate through the results, since his example has only one line. You'd have to implement that yourself or manually change the script for every line, which would be insane if your plan is to translate a whole movie.

I only get the translation for "ええ " from the example I used in the picture above, which translates to "-Yeah. ". Not sure why there's a "-" there.

Changing the first 0 to a 1 in this line, "voicetext = res['results'][0]['alternatives'][0]['transcript']", gives me the second line: "日本全国の皆さんこんにちは " -> "Hello, everyone in Japan. "

My Python knowledge is very basic and I had never heard of JupyterLab before today, so I'm not sure how you'd automate iterating through all the results. You need to put everything after my screenshot in a loop and make sure the file writing appends each result to the text file (this example overwrites it), or put all the translated results in an array and output that somehow.
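A rough sketch of that loop, assuming the response keeps the same shape as in the script (res['results'][i]['alternatives'][0]['transcript']). The helper names, the sample response and the fake translator are all made up so it runs without any API keys. With the real service, fake_translate would be swapped for something that wraps the tutorial's lt.translate(...) call:

```python
def collect_transcripts(res):
    """Return the first alternative's transcript for every entry in res['results']."""
    transcripts = []
    for result in res.get('results', []):
        alternatives = result.get('alternatives', [])
        if alternatives:
            transcripts.append(alternatives[0]['transcript'].strip())
    return transcripts


def translate_lines(lines, translate_one):
    """Translate each line with the supplied function and return a list."""
    return [translate_one(line) for line in lines]


def write_lines(path, lines):
    """Open the file once and write one translated line per row."""
    with open(path, 'w', encoding='utf-8') as f:
        for line in lines:
            f.write(line + '\n')


# Made-up response in the same shape as res in the script.
sample_res = {
    'results': [
        {'alternatives': [{'transcript': 'ええ '}]},
        {'alternatives': [{'transcript': '日本全国の皆さんこんにちは '}]},
    ]
}

# Stand-in translator so the sketch runs offline. With the real service this
# would presumably wrap the tutorial's call:
#   lt.translate(text=line, model_id=...).get_result()['translations'][0]['translation']
fake_translations = {
    'ええ': 'Yeah.',
    '日本全国の皆さんこんにちは': 'Hello, everyone in Japan.',
}


def fake_translate(line):
    return fake_translations[line]


lines = collect_transcripts(sample_res)
translated = translate_lines(lines, fake_translate)
write_lines('result.txt', translated)
```

Collecting everything in a list first and writing the file once avoids the overwrite problem without needing append mode. An empty 'results' list simply produces an empty file.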

frostieff · Oct 29, 2021

Thanks a lot, you're still better than me though.

Now my results are showing a blank. Weird... Also, you can clean it up, but I forgot how.
