AWS Glue
Getting Started
Let's start by briefly introducing the key concepts we'll cover:
- AWS Glue Data Catalog: A centralized metadata repository that stores metadata about data sources, transformations, and targets.
- AWS Glue Database: A logical container that organizes tables, allowing for better data management.
- AWS Glue Tables: Metadata definitions in the AWS Glue Data Catalog that describe the schema, format, and location of a dataset. The data itself stays in the underlying data store.
- AWS Glue Partitions: A way to organize the data in a table based on the values of one or more columns (for example, year and month), so queries can skip data they do not need.
- AWS Glue Crawlers: Tools that scan various data stores, extract metadata, and create table definitions.
- AWS Glue Connection: A resource that contains the properties needed to connect to your source or target data store.
- AWS Glue Jobs: ETL processes that extract data from a source, transform it, and load it into a target.
- AWS Glue Triggers: Schedules, events, or conditions that automatically start AWS Glue jobs and crawlers, either on their own or as part of a workflow.
- AWS Glue Endpoints: URLs that allow external systems to call AWS Glue API operations.
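To make the partition concept from the list above more concrete, here is a minimal sketch of how partition columns map onto a Hive-style S3 layout, where folder names have the form key=value. Crawlers can pick up such folders as partition columns. The bucket name, the path, and the partition_values helper are made up for illustration; this is not how Glue implements it internally, only a way to see what ends up as a partition value.

```python
from urllib.parse import urlparse


def partition_values(s3_uri: str) -> dict:
    """Return the key=value folder segments of an S3 object URI as a dict.

    Hypothetical helper for illustration; not part of any AWS SDK.
    """
    # For "s3://bucket/sales/year=2024/file" the path is "/sales/year=2024/file".
    path = urlparse(s3_uri).path
    # Split into segments and drop the last one, which is the object (file) name.
    folders = path.strip("/").split("/")[:-1]
    partitions = {}
    for folder in folders:
        key, sep, value = folder.partition("=")
        # Keep only folders that really look like key=value.
        if sep and key and value:
            partitions[key] = value
    return partitions


# Hypothetical bucket and object names, for illustration only.
uri = "s3://example-bucket/sales/year=2024/month=01/part-0000.parquet"
print(partition_values(uri))
```

With a layout like this, a query that filters on year and month only has to read the matching folders instead of the whole table.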
Optimization
https://levelup.gitconnected.com/optimizing-and-reducing-aws-glue-costs-e7426fa732af