Bobcares

Azure Data Factory’s “Flatten Hierarchy” Copy Behavior

by | Jun 14, 2024

Learn more about Azure Data Factory Flatten Hierarchy Copy Behavior. Our Server Management Support team is here to help you with your questions and concerns.

Azure Data Factory (ADF) is a powerful tool for data integration. It offers seamless data movement across different storage systems.

One of its useful features is the “Flatten Hierarchy” copy behavior, which comes in handy when dealing with data that has a hierarchical structure.

Let’s dive into what this feature does and when to use it.

About Flatten Hierarchy

The “Flatten Hierarchy” copy behavior is designed for scenarios where the source data is organized in a hierarchical structure, like folders within folders, but the destination should hold every file at a single level. It is a file-level setting: it controls where files land in the sink and does not change what is inside them. Here’s a closer look at its functionality:

  • Flattens the folder structure of the source data.
  • Copies all files from the source, regardless of their subfolder location, to the same level in the destination folder.
  • Does not preserve the original folder structure in the destination. The target files receive autogenerated names.

Here are some use cases where the “Flatten Hierarchy” feature is useful:

  • When we need to move files from a nested folder system in cloud storage into a single flat folder, for example a staging folder that a database load reads from.
  • When consolidating files from various subfolders into a single destination folder.
  • If downstream processes do not require hierarchical data, simplifying the structure can make processing more efficient.

For instance, consider a source directory structured as follows:

[Image: example source directory structure]

The destination, initially, looks like this:

[Image: destination directory]

ADF offers multiple copy behaviors to handle different scenarios:

  1. Merge Files:

    Combines data from all source files into a single file in the sink directory, placing it at the top level.

  2. Flatten Hierarchy:

    Takes files from the source directory and places them directly at the top level of the sink directory, ignoring the original folder structure.

  3. Preserve Hierarchy:

    Replicates the source directory structure in the sink directory, creating paths as needed to mirror the source.

These options apply when both the source and the sink/destination are file-based stores.
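To make the difference between the three behaviors concrete, here is a minimal Python sketch that simulates them on an in-memory set of files. It is an illustration only, not ADF code: the folder layout, the `file_N` naming scheme, and the `merged.csv` name are all assumptions made up for this example (ADF generates its own target file names).

```python
from itertools import count
from pathlib import PurePosixPath


def preserve_hierarchy(files):
    """Keep each file's relative path unchanged in the sink."""
    return dict(files)


def flatten_hierarchy(files):
    """Place every file at the sink's top level under a generated name."""
    counter = count(1)
    return {
        # Hypothetical naming scheme; ADF autogenerates its own names.
        f"file_{next(counter)}{PurePosixPath(path).suffix}": data
        for path, data in sorted(files.items())
    }


def merge_files(files, merged_name="merged.csv"):
    """Concatenate the contents of all files into one top-level file."""
    return {merged_name: "".join(data for _, data in sorted(files.items()))}


# Hypothetical source folder: relative path -> file content.
source = {
    "sales/2024/jan.csv": "a,1\n",
    "sales/2024/feb.csv": "b,2\n",
    "sales/summary.csv": "c,3\n",
}

preserved = preserve_hierarchy(source)
flat = flatten_hierarchy(source)
merged = merge_files(source)

print(sorted(preserved))  # same nested paths as the source
print(sorted(flat))       # three files, no folder separators
print(sorted(merged))     # a single file
```

Note that the flattened result still contains one output file per source file; only the folder information is dropped. Merging, by contrast, reduces everything to a single file.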

How to Configure Flatten Hierarchy in Azure Data Factory

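We set the copy behavior on the sink side of a copy activity, through the `copyBehavior` property with the value `FlattenHierarchy`. This setting only changes where files land; it does not reshape the contents of a file. Flattening nested data inside a file, such as a JSON document, is a related but separate task, handled through the copy activity’s schema mapping or the Flatten transformation in mapping data flows. There we need to specify how the nested elements should be mapped into columns. This involves defining the column names, data types, and handling missing or null values.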
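The copy behavior itself is set on the sink of a copy activity through the `copyBehavior` property. The sketch below builds such an activity definition as a Python dictionary and prints it as JSON. It assumes a binary copy between Azure Blob Storage folders; the activity name and the dataset names are hypothetical, and the source/sink type names will differ for other connectors and formats.

```python
import json

# Sketch of a copy activity definition. Names marked "hypothetical" are
# placeholders for this example, not values from a real pipeline.
copy_activity = {
    "name": "CopyAndFlatten",  # hypothetical activity name
    "type": "Copy",
    "inputs": [
        {"referenceName": "SourceFolderDataset", "type": "DatasetReference"}  # hypothetical
    ],
    "outputs": [
        {"referenceName": "SinkFolderDataset", "type": "DatasetReference"}  # hypothetical
    ],
    "typeProperties": {
        "source": {
            "type": "BinarySource",
            "storeSettings": {
                "type": "AzureBlobStorageReadSettings",
                # Read files from subfolders as well as the top folder.
                "recursive": True,
            },
        },
        "sink": {
            "type": "BinarySink",
            "storeSettings": {
                "type": "AzureBlobStorageWriteSettings",
                # Write every file to the first level of the target folder.
                "copyBehavior": "FlattenHierarchy",
            },
        },
    },
}

print(json.dumps(copy_activity, indent=2))
```

Setting `recursive` to true on the source matters here: without it, only files directly under the source folder would be picked up, and there would be nothing to flatten.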

Here’s a practical example:

Imagine we have a JSON file with the following nested structure:

[Image: JSON file with a nested structure]

Flattening this JSON hierarchy would transform it into a table with columns such as:

  • `id`
  • `name`
  • `address.street`
  • `address.city`
  • `address.zip`

Each leaf field of the nested JSON becomes a separate column in the resulting table, named after its path, making the data easier to work with in a non-hierarchical format.
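The column mapping above can be sketched in a few lines of Python. This is not how ADF implements it; it is just a small illustration of the idea, and the sample record values are invented for the example since only the field names are known.

```python
def flatten_record(record, parent_key="", sep="."):
    """Turn a nested dict into a flat dict keyed by dotted paths."""
    flat = {}
    for key, value in record.items():
        column = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the path prefix along.
            flat.update(flatten_record(value, column, sep))
        else:
            flat[column] = value
    return flat


# Hypothetical record with the same field names as the columns listed above.
record = {
    "id": 1,
    "name": "Jane Doe",
    "address": {"street": "1 Main St", "city": "Springfield", "zip": "12345"},
}

row = flatten_record(record)
print(list(row))
# ['id', 'name', 'address.street', 'address.city', 'address.zip']
```

Arrays are deliberately left out of this sketch. Deciding whether an array should become several rows or several columns is a separate design choice.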

Conclusion

The “Flatten Hierarchy” copy behavior in Azure Data Factory is a simple way to consolidate files from nested folders into a single folder level. By removing folder structure that downstream processes do not need, it simplifies data integration and helps keep data organized and ready for analysis.

In brief, our Support Experts introduced us to Azure Data Factory’s “Flatten Hierarchy” Copy Behavior.
