Learn how to load Alation’s data catalog supplement from CSV into Snowflake, create and manage supplemental metadata, and integrate it into the Harvester process.
In this guide, you’ll see how to supplement the Harvester’s catalog with additional data from Alation. This process involves uploading a CSV file containing Alation metadata, creating a new table in Snowflake to store that metadata, and then updating the Harvester’s data with the newly loaded information.
IMPORTANT: The CSV from Alation has headers that include colons in their names, and its values are delimited by semicolons. This format must be handled carefully when uploading and loading into Snowflake to avoid alignment issues.
catalog_supplement_loaded = 'FALSE'
to trigger the re-harvest process.Upload the Alation CSV to Streamlit
Use your Streamlit interface to upload the Alation CSV file. Ensure you specify the special delimiters (columns separated by semicolons, headers with colons) so it is parsed correctly. Confirm that each column is recognized and that there are no formatting errors during upload.
Save the CSV in a Snowflake Stage
Once uploaded, ask Eve to save this CSV file into a Snowflake stage. This temporary stage location allows Snowflake to later reference the file for loading into a table.
Create the ALATION_EXTRACT Table in Eve's Workspace
Ask Eve to run the following SQL in her workspace schema:
Load the CSV Into ALATION_EXTRACT
Instruct Eve to load the data from the Snowflake stage into the ALATION_EXTRACT table. Make sure the file format matches the CSV’s semicolon delimiters and handles quoted headers with colons. If any alignment issues arise, ask Eve to adjust the file format or column mapping accordingly.
Validate the Data
After loading, ask Eve to query a few rows, for example:
Confirm that each column from the CSV is in the correct corresponding column in ALATION_EXTRACT. If there are discrepancies, ask Eve to correct them and reload.
Create the Mapping Database Schema View
Next, create a view to merge the newly loaded Alation metadata with existing Harvester results. Ask Eve to run the following:
Update the HARVEST_RESULTS with Alation Descriptions
Instruct Eve to update the Harvester’s HARVEST_RESULTS table with the newly loaded Alation descriptions. For example:
This step populates the CATALOG_SUPPLEMENT
field with Alation’s DESCRIPTION, marking them to be reprocessed by the Harvester.
Wait for Harvester Reprocessing
Give the Harvester approximately 5 minutes to detect the changes and begin reprocessing the data. Once the new supplement data is integrated, the search_metadata
and explore_data
functions will include these enriched fields when querying tables.
NOTE: Why Wait 5 Minutes?
The Harvester regularly polls its data for updates. By waiting a few minutes, you give it time to detect and process the newly loaded Alation metadata, ensuring the updated catalog is available for further queries and exploration.
By following these steps, you’ll have successfully:
Genesis Tip
This workflow showcases how external catalog data (in this case, from Alation) can be integrated into the Harvester. By leveraging Eve’s ability to create and manage tables, stages, and views, you can unify multiple sources of metadata, leading to richer search and exploration for your data environment.
Learn how to load Alation’s data catalog supplement from CSV into Snowflake, create and manage supplemental metadata, and integrate it into the Harvester process.
In this guide, you’ll see how to supplement the Harvester’s catalog with additional data from Alation. This process involves uploading a CSV file containing Alation metadata, creating a new table in Snowflake to store that metadata, and then updating the Harvester’s data with the newly loaded information.
IMPORTANT: The CSV from Alation has headers that include colons in their names, and its values are delimited by semicolons. This format must be handled carefully when uploading and loading into Snowflake to avoid alignment issues.
catalog_supplement_loaded = 'FALSE'
to trigger the re-harvest process.Upload the Alation CSV to Streamlit
Use your Streamlit interface to upload the Alation CSV file. Ensure you specify the special delimiters (columns separated by semicolons, headers with colons) so it is parsed correctly. Confirm that each column is recognized and that there are no formatting errors during upload.
Save the CSV in a Snowflake Stage
Once uploaded, ask Eve to save this CSV file into a Snowflake stage. This temporary stage location allows Snowflake to later reference the file for loading into a table.
Create the ALATION_EXTRACT Table in Eve's Workspace
Ask Eve to run the following SQL in her workspace schema:
Load the CSV Into ALATION_EXTRACT
Instruct Eve to load the data from the Snowflake stage into the ALATION_EXTRACT table. Make sure the file format matches the CSV’s semicolon delimiters and handles quoted headers with colons. If any alignment issues arise, ask Eve to adjust the file format or column mapping accordingly.
Validate the Data
After loading, ask Eve to query a few rows, for example:
Confirm that each column from the CSV is in the correct corresponding column in ALATION_EXTRACT. If there are discrepancies, ask Eve to correct them and reload.
Create the Mapping Database Schema View
Next, create a view to merge the newly loaded Alation metadata with existing Harvester results. Ask Eve to run the following:
Update the HARVEST_RESULTS with Alation Descriptions
Instruct Eve to update the Harvester’s HARVEST_RESULTS table with the newly loaded Alation descriptions. For example:
This step populates the CATALOG_SUPPLEMENT
field with Alation’s DESCRIPTION, marking them to be reprocessed by the Harvester.
Wait for Harvester Reprocessing
Give the Harvester approximately 5 minutes to detect the changes and begin reprocessing the data. Once the new supplement data is integrated, the search_metadata
and explore_data
functions will include these enriched fields when querying tables.
NOTE: Why Wait 5 Minutes?
The Harvester regularly polls its data for updates. By waiting a few minutes, you give it time to detect and process the newly loaded Alation metadata, ensuring the updated catalog is available for further queries and exploration.
By following these steps, you’ll have successfully:
Genesis Tip
This workflow showcases how external catalog data (in this case, from Alation) can be integrated into the Harvester. By leveraging Eve’s ability to create and manage tables, stages, and views, you can unify multiple sources of metadata, leading to richer search and exploration for your data environment.