AWS Storage Blog
Category: Analytics
Automatic monitoring of actions taken on objects in Amazon S3
Administrators may need to monitor and audit actions, like uploads, updates, and deletes, taken on files and other data to comply with regulations or company policies. A scalable and reliable method of tracking and saving actions taken on files can reduce manual work and operational overhead while helping to ensure compliance. An event-based fanout architecture […]
Automatically modify data you are querying with Amazon Athena using Amazon S3 Object Lambda
Enterprises may want to customize their data sets for different requesting applications. For example, if you run an e-commerce website, you may want to mask Personally Identifiable Information (PII) when querying your data for analytics. Although you can create and store multiple customized copies of your data, that can increase your storage cost. You can […]
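The masking described above happens inside an S3 Object Lambda function, between fetching the original object and returning the transformed copy. A minimal sketch of the redaction step is below; the column names (`email`) and the sample data are assumptions for illustration, and in a real deployment this function would be called from the Lambda handler before passing the result to `write_get_object_response`.

```python
import csv
import io

def mask_pii(csv_text: str, pii_columns: set[str]) -> str:
    """Return a copy of a CSV payload with the given PII columns redacted.

    Hypothetical helper: in an S3 Object Lambda function, this transformation
    would run on the object body fetched from the presigned inputS3Url before
    the response is returned to the requesting application (e.g. Athena).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        for col in pii_columns & set(row):
            row[col] = "***"  # redact the PII value before it leaves the function
        writer.writerow(row)
    return out.getvalue()

# Assumed sample data for illustration
orders = "order_id,email,total\n1001,alice@example.com,42.50\n"
print(mask_pii(orders, {"email"}))
```

Because the transformation is applied at read time, the bucket keeps a single unmasked copy of the data, avoiding the storage cost of maintaining multiple customized copies.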
How to enforce Amazon S3 Access Grants with Immuta
Amazon Simple Storage Service (Amazon S3) is the most popular object storage platform for modern data lakes. Organizations today have evolved to adopt a lake house architecture that combines the scalability and cost effectiveness of data lakes with the performance and ease of use of data warehouses. Likewise, Amazon S3 plays an increasingly important role as the foundational […]
Simplify querying your archive data in Amazon S3 with Amazon Athena
Today, customers increasingly choose to store data for longer because they recognize its future value potential. Storing data longer, coupled with exponential data growth, has led to customers placing a greater emphasis on storage cost optimization and using cost-effective storage classes. However, a modern data archiving strategy not only calls for optimizing storage costs, but […]
Getting visibility into storage usage in multi-tenant Amazon S3 buckets
SaaS providers with multi-tenant environments use cloud solutions to dynamically scale their workloads as customer demand increases. As their cloud footprint grows, having visibility into each end-customer’s storage consumption becomes important to distribute resources accordingly. An organization can use storage usage data per customer (tenant) to adjust its pricing model or better plan its budget. […]
Consolidate and query Amazon S3 Inventory reports for Region-wide object-level visibility
Organizations around the world store billions of objects and files representing terabytes to petabytes of data. Data is often owned by different teams, departments, or business units, spanning multiple locations. As the number of datastores, locations, and owners grows, you need a way to cost-effectively maintain visibility into important characteristics of your data, including based […]
Identify cold objects for archiving to Amazon S3 Glacier storage classes
Update (02/13/2024): Consider Amazon S3 Lifecycle transition fees that are charged based on the total number of objects being transitioned, the destination storage class (listed on the Amazon S3 pricing page), as well as the additional metadata charges applied. You can use the S3 pricing calculator to estimate the total upfront and monthly costs by […]
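Since the transition fee scales with the object count, a back-of-the-envelope estimate is just a per-1,000-requests rate multiplied by the number of objects. A minimal sketch, assuming a hypothetical rate; the actual per-request price varies by destination storage class and Region, so check the Amazon S3 pricing page or pricing calculator for real figures:

```python
def transition_cost(object_count: int, price_per_1000_requests: float) -> float:
    """Estimate the upfront S3 Lifecycle transition fee.

    The fee is charged per 1,000 lifecycle transition requests; the rate
    passed in here is an ASSUMPTION for illustration, not a published price.
    """
    return object_count / 1000 * price_per_1000_requests

# e.g. 10 million small objects at a hypothetical $0.05 per 1,000 requests
print(round(transition_cost(10_000_000, 0.05), 2))  # 500.0
```

The estimate illustrates why transitioning very large numbers of small objects can be expensive relative to the storage savings, which is the motivation for identifying genuinely cold objects before archiving.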
Migrate on-premises data to AWS for insightful visualizations
When migrating data from on premises, customers seek a data store that is scalable, durable, and cost effective. Equally important, business intelligence (BI) tools must support modern, interactive, and fast dashboards that can scale to tens of thousands of users seamlessly while providing the ability to create meaningful data visualizations for analysis. Visualization of on-premises business analytics […]
Disabling ACLs for existing Amazon S3 workloads with information in S3 server access logs and AWS CloudTrail
Access control lists (ACLs) are permission sets that define user access, and the operations users can take on specific resources. Amazon S3 was launched in 2006 with ACLs as its first authorization mechanism. Since 2011, Amazon S3 has also supported AWS Identity and Access Management (IAM) policies for managing access to S3 buckets, and recommends using […]
Maximizing price performance for big data workloads using Amazon EBS
Since the emergence of big data over a decade ago, Hadoop – an open-source framework that is used to efficiently store and process large datasets – has been crucial in storing, analyzing, and reducing that data to provide value for enterprises. Hadoop lets you store structured, semi-structured, or unstructured data of any kind across […]