Aragon Research

AWS re:Invent 2022—Focus on Zero-ETL for AWS

By: Craig Kennedy

AWS is holding its 11th annual re:Invent conference in Las Vegas this week (11/28/22 – 12/02/22). Amazon Web Services (AWS) CEO Adam Selipsky opened the event on Tuesday morning with the keynote and made a host of service announcements, including one specific to AWS’s database offerings.

Amazon’s Accumulation of Database Offerings

AWS has historically embraced the philosophy of providing a rich, and often redundant, toolset to its users. For example, AWS has twelve different database offerings available: four RDBMS, four NoSQL, one GraphDB, one Time Series DB, one in-memory DB, and one analytics DB.

Relational vs Analytics—Pick Your Database

Amazon Aurora is one of AWS’s RDBMS offerings. It is very efficient at handling transactional data, storing records and making them available through structured queries, but it is not well suited to analyzing large amounts of data across the entire database.

Amazon Redshift is AWS’s analytics database. It is very efficient at analyzing large amounts of data across the database, but it is not well suited to storing and retrieving discrete transactions.

Analyzing Transactional Data

Today, AWS customers who want to perform efficient analytics on data residing in Aurora need to extract the data from Aurora with SQL queries, transform it into a form suitable for an analytics DB, and then load it into Redshift.

This extract, transform, and load process is referred to as ETL, and it is the most time-consuming and complex part of performing data analytics on the contents of a transactional database. Running ETL on a large dataset can take hours to days, which significantly delays anything close to real-time analysis.
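The three steps described above can be sketched in a few lines of Python. This is a minimal illustration only: two in-memory SQLite databases stand in for Aurora and Redshift, and the table names, columns, and function names (orders, customer_totals, run_etl, and so on) are hypothetical, not taken from AWS or from the keynote.

```python
import sqlite3


def extract(source):
    """Extract: pull raw order rows out of the transactional store with SQL."""
    return source.execute(
        "SELECT order_id, customer, amount_cents FROM orders"
    ).fetchall()


def transform(rows):
    """Transform: roll per-order rows up into per-customer totals in dollars."""
    totals = {}
    for _order_id, customer, amount_cents in rows:
        totals[customer] = totals.get(customer, 0) + amount_cents
    return [(customer, cents / 100) for customer, cents in sorted(totals.items())]


def load(target, records):
    """Load: write the reshaped records into the analytics store."""
    target.execute(
        "CREATE TABLE IF NOT EXISTS customer_totals "
        "(customer TEXT PRIMARY KEY, total_dollars REAL)"
    )
    target.executemany(
        "INSERT OR REPLACE INTO customer_totals VALUES (?, ?)", records
    )
    target.commit()


def run_etl(source, target):
    """Run one full batch: every run re-reads and re-aggregates the source."""
    load(target, transform(extract(source)))


def demo():
    # Hypothetical stand-in for the transactional database (Aurora in the article).
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount_cents INTEGER)"
    )
    source.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "acme", 1250), (2, "globex", 400), (3, "acme", 750)],
    )
    # Hypothetical stand-in for the analytics database (Redshift in the article).
    target = sqlite3.connect(":memory:")
    run_etl(source, target)
    return target.execute(
        "SELECT customer, total_dollars FROM customer_totals ORDER BY customer"
    ).fetchall()
```

Even in this toy form, each run re-reads the whole source table, and the analytics copy is only as fresh as the last batch. That staleness is the delay the article describes.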

Amazon Aurora zero-ETL

 

One of Adam Selipsky’s announcements at re:Invent was a zero-ETL integration between Aurora and Redshift, which replaces the previously lengthy ETL process with what Amazon is calling ‘near real-time’ replication. This means that changes made to the Amazon Aurora RDBMS database are replicated in the Amazon Redshift database seconds after the Aurora updates.
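To make the contrast with batch ETL concrete, here is a toy sketch of incremental change replication in general. It is not a description of how AWS implements the zero-ETL integration, which the announcement does not detail; the Source, Replica, and Change classes are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Change:
    key: int
    value: Optional[str]  # None means the row was deleted


class Source:
    """Toy transactional store that records every write in a change log."""

    def __init__(self):
        self.rows = {}
        self.log = []

    def put(self, key, value):
        self.rows[key] = value
        self.log.append(Change(key, value))

    def delete(self, key):
        self.rows.pop(key, None)
        self.log.append(Change(key, None))


class Replica:
    """Toy analytics copy that catches up by replaying unseen log entries."""

    def __init__(self):
        self.rows = {}
        self.applied = 0  # number of log entries already replayed

    def sync(self, source):
        # Only the changes made since the last sync are applied,
        # so the cost tracks the number of recent writes, not table size.
        for change in source.log[self.applied:]:
            if change.value is None:
                self.rows.pop(change.key, None)
            else:
                self.rows[change.key] = change.value
        self.applied = len(source.log)
```

Because each sync touches only new changes, it can run every few seconds. In this simplified model, that frequency is what makes a copy ‘near real-time’ rather than hours or days behind.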

With this new capability, the tedious, time-consuming, and costly ETL process in AWS is essentially eliminated and analysis of transaction data can be performed in near real-time.

Bottom Line

Removing the ETL bottleneck is a game changer for AWS. This announcement, along with some of the others at re:Invent, appears to focus on solutions that benefit customers rather than on simply adding technology to the toolkit. If this is a new strategy for AWS, it should resonate well with its enterprise customer base.

 


SEE CRAIG LIVE AT HIS UPCOMING WEBINAR

Cloud Computing—Why Are Costs So Out of Control

In less than two weeks, Sr. Director of Research Craig Kennedy will host a live webinar on why cloud computing costs are out of control.

Technology tools are pricey these days, so enterprises need to consider carefully which ones are worth the cost. Craig will cover why cloud computing costs are so high and when cloud computing makes the most sense for the digital enterprise.

Don’t miss out on our live Q&A session to get your questions answered by Craig.

When: Tuesday, December 13th
Time: 10 AM PT / 1 PM ET

