
AWS Glue

Getting Started

Let's start by briefly introducing the key concepts we'll cover:

  • AWS Glue Data Catalog: A centralized metadata repository that stores metadata about data sources, transformations, and targets.
  • AWS Glue Database: A logical container that organizes tables, allowing for better data management.
  • AWS Glue Tables: Metadata definitions (schema, location, and format) that represent data in the AWS Glue Data Catalog.
  • AWS Glue Partitions: A way to organize data within a table based on the values of one or more columns.
  • AWS Glue Crawlers: Tools that scan various data stores, extract metadata, and create table definitions.
  • AWS Glue Connection: A resource that contains the properties needed to connect to your source or target data store.
  • AWS Glue Jobs: ETL processes that extract data from a source, transform it, and load it into a target.
  • AWS Glue Triggers: Schedules, events, or conditions that automatically start AWS Glue jobs and crawlers, either on their own or as part of a workflow.
  • AWS Glue Endpoints: URLs that allow external systems to call AWS Glue API operations.
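
To make the partition concept more concrete, here is a minimal Python sketch of the Hive-style `key=value` folder layout that is commonly used for partitioned data in Amazon S3 and that crawlers can typically map to partition columns. The bucket name, table location, and column names (`year`, `month`) are hypothetical, and the helper functions are illustrations only, not part of any AWS SDK.

```python
from __future__ import annotations


def partition_prefix(table_location: str, partition: dict[str, str]) -> str:
    """Build a Hive-style prefix, e.g. <location>/year=2024/month=01/."""
    # Each partition column becomes one "key=value" folder, in dict order.
    folders = [f"{key}={value}" for key, value in partition.items()]
    return table_location.rstrip("/") + "/" + "/".join(folders) + "/"


def parse_partition(object_key: str) -> dict[str, str]:
    """Recover partition values from the folder names in an object key."""
    values: dict[str, str] = {}
    # Drop the last path component (the file name) and inspect the folders.
    for segment in object_key.split("/")[:-1]:
        if "=" in segment:
            key, _, value = segment.partition("=")
            values[key] = value
    return values


if __name__ == "__main__":
    # Hypothetical table location and partition values.
    print(partition_prefix("s3://example-bucket/sales/", {"year": "2024", "month": "01"}))
    print(parse_partition("sales/year=2024/month=01/part-0000.parquet"))
```

With this layout, a query that filters on `year` and `month` only needs to read the matching folders rather than the whole table, which is the main reason to partition.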

Optimization

https://levelup.gitconnected.com/optimizing-and-reducing-aws-glue-costs-e7426fa732af

Read More