Better ASR Captions

In many cases the ASR that is available in Panopto is not quite as accurate as we might hope, so we are starting to look into what it would take to get a more accurate ASR solution in place for media in Panopto. I have been testing the Azure Media Services offering for audio analysis. It seems to be quite accurate and very quick, and the pricing seems reasonable.

Beyond the Azure solution, what is available from AWS gives timings for each word, so it would allow us to reflow captions after edits (https://community.panopto.com/discussion/743/reflow-captions). It would even allow some infrastructure to be built out to translate the text and convert the translated text to audio. This sort of session (with multiple audio tracks a user can pick from) isn't available in Panopto today, but adopting something like this for ASR would open the door to that development in the future. Obviously there is some cost beyond the "out of the box" experience, but having this sort of thing available could enable other modes of operation in Panopto. (https://aws.amazon.com/blogs/machine-learning/create-video-subtitles-with-translation-using-machine-learning/)
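To make the reflow idea a bit more concrete, here is a rough sketch of how word-level timings could be regrouped into caption cues after the text has been edited. It assumes the provider's output has already been parsed into simple word/start/end records; the `Word` and `Cue` types, the `reflow` function, and the `max_chars` and `max_gap` limits are all hypothetical names and values of mine, not part of any Panopto, Azure, or AWS API. The output is written as SRT only because it is a widely used caption format.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the media
    end: float


@dataclass
class Cue:
    start: float
    end: float
    text: str


def _make_cue(words):
    # A cue spans from its first word's start to its last word's end.
    return Cue(words[0].start, words[-1].end, " ".join(w.text for w in words))


def reflow(words, max_chars=42, max_gap=1.0):
    """Group timed words into cues (assumed limits, tune as needed).

    A new cue starts when adding the next word would exceed max_chars,
    or when the pause before that word is longer than max_gap seconds.
    """
    cues, current = [], []
    for word in words:
        if current:
            candidate = " ".join(w.text for w in current) + " " + word.text
            gap = word.start - current[-1].end
            if len(candidate) > max_chars or gap > max_gap:
                cues.append(_make_cue(current))
                current = []
        current.append(word)
    if current:
        cues.append(_make_cue(current))
    return cues


def format_timestamp(seconds):
    # SRT timestamps use the form HH:MM:SS,mmm.
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(cues):
    blocks = [
        f"{i}\n{format_timestamp(c.start)} --> {format_timestamp(c.end)}\n{c.text}"
        for i, c in enumerate(cues, 1)
    ]
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    words = [
        Word("Welcome", 0.0, 0.4), Word("to", 0.45, 0.55),
        Word("the", 0.6, 0.7), Word("lecture", 0.75, 1.2),
        Word("Today", 3.0, 3.4),
    ]
    print(to_srt(reflow(words)))
```

Because each cue's start and end come straight from its first and last word, correcting a word's text only changes where the lines break, and the timings stay aligned with the audio. Words that are inserted or deleted during editing would still need timings assigned somehow, which this sketch does not attempt.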

For the feature request: would it be possible to tie in other ASR platforms, namely Azure Media Services (https://docs.microsoft.com/en-us/azure/media-services/latest/media-services-apis-overview)?

If there are plans to improve the quality of the ASR that is available in Panopto out of the box, that would be great. If not, giving us the flexibility to tie into other ASR providers would be a good alternative.
