Stream Insights: AI-Powered Transcription and Summarization

Solution Overview

The solution is powered by two AWS AI services, Amazon Transcribe and Amazon Translate, along with Amazon Bedrock, a fully managed service that allows you to build generative AI applications. The solution also uses Amazon Cognito user pools and identity pools for managing authentication and authorization of users, Amazon API Gateway REST APIs, AWS Lambda functions, and an Amazon Simple Storage Service (Amazon S3) bucket.

Features

Live transcription and translation – The Chrome extension transcribes and translates audio streams for you in real time using Amazon Transcribe, an automatic speech recognition service.
Summarization – The Chrome extension uses FMs such as Anthropic’s Claude 3 models on Amazon Bedrock to summarize content being transcribed, so you can grasp key ideas of your live stream by reading the summary.

Live Transcription and Translation

Live transcription is currently available in over 50 languages currently supported by Amazon Transcribe streaming (Chinese, English, French, German, Hindi, Italian, Japanese, Korean, Brazilian Portuguese, Spanish, and Thai), while translation is available in over 75 languages currently supported by Amazon Translate.

Architecture

The solution workflow includes the following steps:

A Chrome browser is used to access the desired live streamed content, and the extension is activated and displayed as a side panel.
The user signs in by entering a user name and a password. Authentication is performed against the Amazon Cognito user pool.
The extension interacts with Amazon Transcribe (StartStreamTranscription operation), Amazon Translate (TranslateText operation), and Amazon Bedrock (InvokeModel operation).
Interactions with Amazon Bedrock are handled by a Lambda function, which implements the application logic underlying an API made available using API Gateway.
The user is provided with the transcription, translation, and summary of the content playing inside the browser tab.

Prerequisites

For this walkthrough, you should have the following prerequisites:

Deploy the backend
Create a new Amazon Cognito user

Deploy the Backend

The first step consists of deploying an AWS Cloud Development Kit (AWS CDK) application that automatically provisions and configures the required AWS resources, including:

An Amazon Cognito user pool and identity pool that allow user authentication
An S3 bucket, where transcription summaries are stored
Lambda functions that interact with Amazon Bedrock to perform content summarization
IAM roles that are associated with the identity pool and have permissions required to access AWS services

Use the Extension

Now that the extension is set up, you can interact with it by completing the following steps:

On the browser tab, choose the Extensions.
Choose (right-click) on the Transcribe, translate and summarize live streams (powered by AWS) extension and choose Open side panel.
Log in using the credentials created in the Amazon Cognito user pool from the previous step.
Close the side panel.

Troubleshooting

If you receive the error “Extension has not been invoked for the current page (see activeTab permission). Chrome pages cannot be captured.”, check the following:

Make sure you’re using the extension on the tab where you first opened the side pane.
Make sure you have given permissions for audio recording in the web browser.

Conclusion

In this post, we showed you how to deploy a code sample that uses AWS AI and generative AI services to access features such as live transcription, translation, and summarization. You can follow the steps we provided to start experimenting with the browser extension.

FAQs

Q: What are the prerequisites for this walkthrough?
A: You should have the following prerequisites: deploy the backend and create a new Amazon Cognito user.

Q: How do I deploy the backend?
A: You can deploy the backend by following the steps provided in the Prerequisites section.

Q: What are the features of the Chrome extension?
A: The Chrome extension provides live transcription and translation, as well as summarization of live streams.

Q: How do I troubleshoot issues with the extension?
A: You can troubleshoot issues with the extension by checking the troubleshooting section provided in the article.

Q: Can I change the language of the transcript and summary after the recording has started?
A: No, you cannot change the language of the transcript and summary after the recording has started. You must choose the language before starting the recording.

Post Views: 49

Stream Insights: AI-Powered Transcription and Summarization

Engineering confidence to navigate uncertainty | MIT News

Generate single title from this title Best of MWC 2026: Live updates on phones, concepts, and robots we’re seeing in 100 -150 characters. And...

Featured video: Coding for underwater robotics | MIT News

Generate single title from this title Upgrading agentic AI for finance workflows in 100 -150 characters. And it must return only title i dont...

Generate single title from this title Making Softmax More Efficient with NVIDIA Blackwell Ultra in 100 -150 characters. And it must return only title...

Engineering confidence to navigate uncertainty | MIT News

Generate single title from this title Best of MWC 2026: Live updates on phones, concepts, and robots we’re seeing in 100 -150 characters. And...

Featured video: Coding for underwater robotics | MIT News

Generate single title from this title Upgrading agentic AI for finance workflows in 100 -150 characters. And it must return only title i dont...

Generate single title from this title Making Softmax More Efficient with NVIDIA Blackwell Ultra in 100 -150 characters. And it must return only title...

Generate single title from this title Nvidia shares fall as blockbuster results fail to dazzle in 100 -150 characters. And it must return only...

Generate single title from this title It exposed what was already broken in 100 -150 characters. And it must return only title i dont...

What is a Performance Review + Definition?

LEAVE A REPLY Cancel reply

Latest

Engineering confidence to navigate uncertainty | MIT News

Generate single title from this title Best of MWC 2026: Live updates on phones, concepts, and robots we’re seeing in 100 -150 characters. And...

Featured video: Coding for underwater robotics | MIT News

Categories

Useful Links

Our Newsletter