Bobcares

Azure Data Factory’s “Flatten Hierarchy” Copy Behavior

by | Jun 14, 2024

Learn more about Azure Data Factory Flatten Hierarchy Copy Behavior. Our Server Management Support team is here to help you with your questions and concerns.

Azure Data Factory (ADF) is a powerful tool for data integration. It offers seamless data movement across different storage systems.

One of its useful features is the “Flatten Hierarchy” copy behavior, which comes in handy when dealing with data that has a hierarchical structure.

Let’s dive into what this feature does and when to use it.

About Flatten Hierarchy

The “Flatten Hierarchy” copy behavior is designed for scenarios where the source data is organized in a hierarchical structure, like folders within folders, but the destination should hold every file at a single level, like one flat folder in a file-based sink. Here’s a closer look at its functionality:

  • Flattens the folder structure of the source data.
  • Copies all files from the source, regardless of their subfolder location, to the same level in the destination folder.
  • Does not preserve the original folder names and structure in the destination; the target files receive autogenerated names.
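To make the behavior above concrete, here is a minimal local-filesystem sketch in Python. It is not how ADF is implemented; the function name `flatten_copy` and the UUID-based naming are assumptions chosen only to illustrate that every file lands at one level and that the original folder names are lost.

```python
import shutil
import tempfile
import uuid
from pathlib import Path


def flatten_copy(source: Path, destination: Path) -> list[Path]:
    """Copy every file under `source` to the top level of `destination`."""
    destination.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(source.rglob("*")):
        if path.is_file():
            # A generated name avoids collisions between same-named files
            # from different subfolders. The UUID scheme is an assumption
            # for this sketch, not ADF's actual naming scheme.
            target = destination / f"{uuid.uuid4().hex}{path.suffix}"
            shutil.copy2(path, target)
            copied.append(target)
    return copied


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src, dst = Path(tmp) / "source", Path(tmp) / "destination"
        (src / "2024" / "01").mkdir(parents=True)
        (src / "2024" / "02").mkdir(parents=True)
        (src / "2024" / "01" / "data.csv").write_text("a\n")
        (src / "2024" / "02" / "data.csv").write_text("b\n")
        for copied_file in flatten_copy(src, dst):
            print(copied_file.name)  # two generated names, no subfolders
```

Both `data.csv` files survive because neither keeps its original name, which is also why a flattened destination cannot be mapped back to the source layout by name alone.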

Here are some use cases where the “Flatten Hierarchy” feature is useful:

  • When we need to move data from a structured folder system in cloud storage into a single flat landing folder, for example before loading it into a database table.
  • When consolidating files from various subfolders into a single destination folder.
  • If downstream processes do not require hierarchical data, simplifying the structure can make processing more efficient.

For instance, consider a source directory structured as follows:

[Image: source directory structure]

The destination, initially, looks like this:

[Image: destination directory]

ADF offers multiple copy behaviors to handle different scenarios:

  1. Merge Files:

    Combines data from all source files into a single file in the sink directory, placing it at the top level.

  2. Flatten Hierarchy:

    Takes files from the source directory and places them directly at the top level of the sink directory, ignoring the original folder structure.

  3. Preserve Hierarchy:

    Replicates the source directory structure in the sink directory, creating paths as needed to mirror the source.

These options are particularly useful when both the source and sink/destination are file systems.
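The difference between the three behaviors can be sketched with a small in-memory model, where a dictionary maps file paths to file contents. The function names, the `file_N` naming, and the `merged.txt` name are all hypothetical and only stand in for whatever names ADF generates.

```python
from pathlib import PurePosixPath


def preserve_hierarchy(files: dict[str, str]) -> dict[str, str]:
    """Keep every relative path exactly as it is in the source."""
    return dict(files)


def flatten_hierarchy(files: dict[str, str]) -> dict[str, str]:
    """Place every file at the top level under a generated name."""
    flat = {}
    for index, (path, content) in enumerate(sorted(files.items())):
        suffix = PurePosixPath(path).suffix
        flat[f"file_{index}{suffix}"] = content  # hypothetical naming
    return flat


def merge_files(files: dict[str, str], merged_name: str = "merged.txt") -> dict[str, str]:
    """Concatenate all source files into one top-level file."""
    merged = "".join(content for _, content in sorted(files.items()))
    return {merged_name: merged}


source = {"a/one.txt": "1\n", "a/b/two.txt": "2\n"}
print(preserve_hierarchy(source))  # paths unchanged
print(flatten_hierarchy(source))   # two top-level files, no folders
print(merge_files(source))         # one top-level file
```

In this model, Preserve Hierarchy keeps both the number of files and their paths, Flatten Hierarchy keeps the number of files but drops the paths, and Merge Files drops both.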

How to Configure Flatten Hierarchy in Azure Data Factory

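The copy behavior is set on the sink of a Copy activity through the `copyBehavior` property, usually together with a recursive read on the source so that files in subfolders are picked up. Flattening nested data inside a file is a related but separate task, handled for example by the Copy activity’s schema mapping or the Flatten transformation in mapping data flows. There, we need to specify how the nested elements should be mapped into columns. This involves defining the column names, data types, and handling missing or null values.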
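Setting the copy behavior can be sketched as the following Copy activity definition, built here as a Python dictionary and printed as pipeline JSON. It assumes Azure Blob Storage with Binary datasets on both sides; the activity name and the dataset reference names are hypothetical, and the exact settings types should be checked against the connector documentation for the stores in use.

```python
import json

# Sketch of a Copy activity that flattens the source folder structure.
copy_activity = {
    "name": "CopyWithFlattenHierarchy",  # hypothetical activity name
    "type": "Copy",
    "inputs": [{"referenceName": "SourceFolderDataset", "type": "DatasetReference"}],  # hypothetical
    "outputs": [{"referenceName": "SinkFolderDataset", "type": "DatasetReference"}],  # hypothetical
    "typeProperties": {
        "source": {
            "type": "BinarySource",
            "storeSettings": {
                "type": "AzureBlobStorageReadSettings",
                # Read files from subfolders as well as the top-level folder.
                "recursive": True,
            },
        },
        "sink": {
            "type": "BinarySink",
            "storeSettings": {
                "type": "AzureBlobStorageWriteSettings",
                # Other values: "PreserveHierarchy", "MergeFiles".
                "copyBehavior": "FlattenHierarchy",
            },
        },
    },
}

print(json.dumps(copy_activity, indent=2))
```

In the ADF authoring UI, the same choice appears as the copy behavior drop-down on the Copy activity’s Sink tab.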

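Here’s a practical example of flattening nested data:

Imagine we have a JSON file with a nested structure:

[Image: JSON file with a nested structure. Judging by the columns listed below, it contains `id`, `name`, and a nested `address` object.]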

Flattening this JSON hierarchy would transform it into a table with columns such as:

  • `id`
  • `name`
  • `address.street`
  • `address.city`
  • `address.zip`

Each nested level of the JSON becomes a separate column in the resulting table, making the data easier to work with in a non-hierarchical format.
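The mapping from nested keys to dotted column names can be sketched in plain Python. This only illustrates the idea and is not ADF’s implementation; the function name `flatten_record` and the sample values are made up, and only the field names come from the column list above.

```python
from typing import Any


def flatten_record(record: dict[str, Any], parent: str = "", sep: str = ".") -> dict[str, Any]:
    """Turn nested dictionaries into one flat dict with dotted keys."""
    flat: dict[str, Any] = {}
    for key, value in record.items():
        column = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the parent path along.
            flat.update(flatten_record(value, column, sep))
        else:
            flat[column] = value
    return flat


# Sample values are invented for illustration.
record = {
    "id": 1,
    "name": "Example",
    "address": {"street": "1 Main St", "city": "Springfield", "zip": "00000"},
}

print(list(flatten_record(record)))
# ['id', 'name', 'address.street', 'address.city', 'address.zip']
```

Arrays are left untouched in this sketch; unrolling them into multiple rows is a separate decision that a real mapping would also have to make.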

Conclusion

The “Flatten Hierarchy” feature in Azure Data Factory is a versatile tool for consolidating hierarchical data. By flattening hierarchical structures, it simplifies data integration processes. This helps streamline data workflows, so that data is organized and ready for analysis.

In brief, our Support Experts introduced us to Azure Data Factory’s “Flatten Hierarchy” Copy Behavior.
