Solution Overview
The solution is powered by two AWS AI services, Amazon Transcribe and Amazon Translate, along with Amazon Bedrock, a fully managed service that allows you to build generative AI applications. The solution also uses Amazon Cognito user pools and identity pools for managing authentication and authorization of users, Amazon API Gateway REST APIs, AWS Lambda functions, and an Amazon Simple Storage Service (Amazon S3) bucket.
Features
- Live transcription and translation – The Chrome extension transcribes and translates audio streams for you in real time using Amazon Transcribe, an automatic speech recognition service.
- Summarization – The Chrome extension uses FMs such as Anthropic’s Claude 3 models on Amazon Bedrock to summarize content being transcribed, so you can grasp key ideas of your live stream by reading the summary.
Live Transcription and Translation
Live transcription is currently available in over 50 languages currently supported by Amazon Transcribe streaming (Chinese, English, French, German, Hindi, Italian, Japanese, Korean, Brazilian Portuguese, Spanish, and Thai), while translation is available in over 75 languages currently supported by Amazon Translate.
Architecture
The solution workflow includes the following steps:
- A Chrome browser is used to access the desired live streamed content, and the extension is activated and displayed as a side panel.
- The user signs in by entering a user name and a password. Authentication is performed against the Amazon Cognito user pool.
- The extension interacts with Amazon Transcribe (StartStreamTranscription operation), Amazon Translate (TranslateText operation), and Amazon Bedrock (InvokeModel operation).
- Interactions with Amazon Bedrock are handled by a Lambda function, which implements the application logic underlying an API made available using API Gateway.
- The user is provided with the transcription, translation, and summary of the content playing inside the browser tab.
Prerequisites
For this walkthrough, you should have the following prerequisites:
- Deploy the backend
- Create a new Amazon Cognito user
Deploy the Backend
The first step consists of deploying an AWS Cloud Development Kit (AWS CDK) application that automatically provisions and configures the required AWS resources, including:
- An Amazon Cognito user pool and identity pool that allow user authentication
- An S3 bucket, where transcription summaries are stored
- Lambda functions that interact with Amazon Bedrock to perform content summarization
- IAM roles that are associated with the identity pool and have permissions required to access AWS services
Use the Extension
Now that the extension is set up, you can interact with it by completing the following steps:
- On the browser tab, choose the Extensions.
- Choose (right-click) on the Transcribe, translate and summarize live streams (powered by AWS) extension and choose Open side panel.
- Log in using the credentials created in the Amazon Cognito user pool from the previous step.
- Close the side panel.
Troubleshooting
If you receive the error “Extension has not been invoked for the current page (see activeTab permission). Chrome pages cannot be captured.”, check the following:
- Make sure you’re using the extension on the tab where you first opened the side pane.
- Make sure you have given permissions for audio recording in the web browser.
Conclusion
In this post, we showed you how to deploy a code sample that uses AWS AI and generative AI services to access features such as live transcription, translation, and summarization. You can follow the steps we provided to start experimenting with the browser extension.
FAQs
Q: What are the prerequisites for this walkthrough?
A: You should have the following prerequisites: deploy the backend and create a new Amazon Cognito user.
Q: How do I deploy the backend?
A: You can deploy the backend by following the steps provided in the Prerequisites section.
Q: What are the features of the Chrome extension?
A: The Chrome extension provides live transcription and translation, as well as summarization of live streams.
Q: How do I troubleshoot issues with the extension?
A: You can troubleshoot issues with the extension by checking the troubleshooting section provided in the article.
Q: Can I change the language of the transcript and summary after the recording has started?
A: No, you cannot change the language of the transcript and summary after the recording has started. You must choose the language before starting the recording.

