AWS Machine Learning Blog

Tailoring foundation models for your business needs: A comprehensive guide to RAG, fine-tuning, and hybrid approaches

In this post, we show you how to implement and evaluate three powerful techniques for tailoring FMs to your business needs: RAG, fine-tuning, and a hybrid approach combining both methods. We provide ready-to-use code to help you experiment with these approaches and make informed decisions based on your specific use case and dataset.

How Rufus doubled their inference speed and handled Prime Day traffic with AWS AI chips and parallel decoding

Rufus, an AI-powered shopping assistant, relies on many components to deliver its customer experience, including a foundation LLM (for response generation) and a query planner (QP) model for query classification and retrieval enhancement. This post focuses on how the QP model used draft-centric speculative decoding (SD), also called parallel decoding, with AWS AI chips to meet the demands of Prime Day. By combining parallel decoding with AWS Trainium and Inferentia chips, Rufus achieved response times that were twice as fast, a 50% reduction in inference costs, and seamless scalability during peak traffic.
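For readers new to the idea, here is a minimal sketch of generic greedy draft-and-verify speculative decoding. It is not the Rufus implementation and does not use AWS AI chips: `toy_target`, `toy_draft`, and both decode functions are hypothetical names invented for illustration, and the "models" are simple arithmetic functions. It only shows why accepting several cheap draft tokens per verification round can reduce the number of sequential steps while producing the same output as plain greedy decoding.

```python
from typing import Callable, List, Tuple

Token = int
NextToken = Callable[[List[Token]], Token]


def toy_target(seq: List[Token]) -> Token:
    # Hypothetical stand-in for the large model's greedy next-token choice.
    return (sum(seq) * 31 + len(seq)) % 50


def toy_draft(seq: List[Token]) -> Token:
    # Hypothetical cheap draft: agrees with the target except on some prefixes.
    tok = toy_target(seq)
    return (tok + 1) % 50 if len(seq) % 5 == 0 else tok


def greedy_decode(target: NextToken, prompt: List[Token], max_new: int) -> List[Token]:
    # Baseline: one target step per generated token.
    seq = list(prompt)
    for _ in range(max_new):
        seq.append(target(seq))
    return seq


def speculative_decode(
    target: NextToken, draft: NextToken, prompt: List[Token], max_new: int, k: int = 4
) -> Tuple[List[Token], int]:
    seq = list(prompt)
    limit = len(prompt) + max_new
    rounds = 0  # number of verification rounds
    while len(seq) < limit:
        # 1) The draft proposes up to k tokens autoregressively.
        proposal: List[Token] = []
        for _ in range(min(k, limit - len(seq))):
            proposal.append(draft(seq + proposal))
        # 2) The target checks every proposed position. A real system would
        #    score all positions in one batched pass; this loop emulates that.
        rounds += 1
        accepted: List[Token] = []
        for tok in proposal:
            expected = target(seq + accepted)
            if tok == expected:
                accepted.append(tok)
            else:
                accepted.append(expected)  # replace the first mismatch
                break
        else:
            # Every draft token matched: take one extra target token if room.
            if len(seq) + len(accepted) < limit:
                accepted.append(target(seq + accepted))
        seq.extend(accepted)
    return seq, rounds


if __name__ == "__main__":
    out, rounds = speculative_decode(toy_target, toy_draft, [1, 2, 3], max_new=20)
    print(out == greedy_decode(toy_target, [1, 2, 3], 20), rounds)
```

Because mismatched draft tokens are always replaced by the target's own choice, the output is identical to plain greedy decoding; any speedup comes from needing fewer verification rounds than generated tokens.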

New Amazon Bedrock Data Automation capabilities streamline video and audio analysis

Amazon Bedrock Data Automation helps organizations streamline development and boost efficiency through customizable, multimodal analytics. It eliminates the heavy lifting of unstructured content processing at scale, whether for video or audio. The new capabilities make it faster to extract tailored, generative AI-powered insights like scene summaries, key topics, and customer intents from video and audio. This unlocks the value of unstructured content for use cases such as improving sales productivity and enhancing customer experience.

GuardianGamer scales family-safe cloud gaming with AWS

In this post, we share how GuardianGamer uses AWS services including Amazon Nova and Amazon Bedrock to deliver a scalable and efficient supervision platform. The team uses Amazon Nova for intelligent narrative generation to provide parents with meaningful insights into their children’s gaming activities and social interactions, while maintaining a non-intrusive approach to monitoring.

Principal Financial Group increases Voice Virtual Assistant performance using Genesys, Amazon Lex, and Amazon QuickSight

In this post, we explore how Principal used this opportunity to build an integrated voice VA reporting and analytics solution using an Amazon QuickSight dashboard.

Optimize query responses with user feedback using Amazon Bedrock embedding and few-shot prompting

This post demonstrates how Amazon Bedrock, combined with a user feedback dataset and few-shot prompting, can refine responses for higher user satisfaction. By using Amazon Titan Text Embeddings v2, we demonstrate a statistically significant improvement in response quality, making it a valuable tool for applications seeking accurate and personalized responses.
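As a rough illustration of the general pattern, the sketch below picks the most similar, highly rated past interactions and places them in a few-shot prompt. Everything in it is an assumption made for illustration: `toy_embed` is a bag-of-words stand-in for a real embedding model such as Amazon Titan Text Embeddings v2, the `FEEDBACK` rows are invented toy data, and the rating threshold and prompt wording are not taken from the post.

```python
import math
import re
from collections import Counter
from typing import Dict, List

# Hypothetical feedback dataset: past queries, responses, and user ratings.
FEEDBACK: List[Dict] = [
    {"query": "How do I reset my password?",
     "response": "Open Settings, choose Security, then select Reset password.",
     "rating": 5},
    {"query": "How do I change my billing address?",
     "response": "Go to Billing and edit the address on file.",
     "rating": 4},
    {"query": "How do I reset my router?",
     "response": "Try turning it off and on.",
     "rating": 1},
    {"query": "What are your support hours?",
     "response": "Support is available 24/7 via chat.",
     "rating": 5},
]


def toy_embed(text: str) -> Counter:
    # Stand-in embedding: word counts. A real system would call an
    # embedding model and get back a dense vector instead.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_examples(query: str, feedback: List[Dict], k: int = 2,
                    min_rating: int = 4) -> List[Dict]:
    # Keep only well-rated rows (an assumed policy), then rank by similarity.
    q_vec = toy_embed(query)
    liked = [row for row in feedback if row["rating"] >= min_rating]
    ranked = sorted(liked,
                    key=lambda row: cosine(q_vec, toy_embed(row["query"])),
                    reverse=True)
    return ranked[:k]


def build_prompt(query: str, examples: List[Dict]) -> str:
    # Assemble a few-shot prompt: instruction, examples, then the new query.
    parts = ["Answer in the style of the highly rated examples."]
    for ex in examples:
        parts.append(f"Q: {ex['query']}\nA: {ex['response']}")
    parts.append(f"Q: {query}\nA:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    question = "I forgot my password, how can I reset it?"
    print(build_prompt(question, select_examples(question, FEEDBACK)))
```

The resulting prompt string would then be sent to a text model; that call is omitted here because it depends on the model and account setup.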

Boosting team productivity with Amazon Q Business Microsoft 365 integrations for Microsoft 365 Outlook and Word

Amazon Q Business integration with Microsoft 365 applications offers powerful AI assistance directly within the tools that your team already uses daily. In this post, we explore how these integrations for Outlook and Word can transform your workflow.

Integrate Amazon Bedrock Agents with Slack

In this post, we present a solution to incorporate Amazon Bedrock Agents in your Slack workspace. We guide you through configuring a Slack workspace, deploying integration components in Amazon Web Services, and using this solution.

Secure distributed logging in scalable multi-account deployments using Amazon Bedrock and LangChain

In this post, we present a solution for secure distributed logging in multi-account deployments using Amazon Bedrock and LangChain.

Build a domain‐aware data preprocessing pipeline: A multi‐agent collaboration approach

In this post, we introduce a multi-agent collaboration pipeline for processing unstructured insurance data using Amazon Bedrock, featuring specialized agents for classification, conversion, and metadata extraction. We demonstrate how this domain-aware approach transforms diverse data formats like claims documents, videos, and audio files into metadata-rich outputs that enable fraud detection, customer 360-degree views, and advanced analytics.