Bobcares

Overview of Lookup Activity Limitations in Azure Data Factory

Let’s take a quick look at the limitations of Lookup Activity in Azure Data Factory. Our Server Management Support team is here to help you with your questions and concerns.

The Lookup activity in Azure Data Factory (ADF) is a tool for querying data within data pipelines.

However, we need to be aware of its limitations to design effective and efficient pipelines. Here’s a breakdown of the key limitations and some workarounds to manage them.

Data Volume

  • Row Limit

    The Lookup activity can return at most 5,000 rows. If the result set contains more rows than that, only the first 5,000 are returned.

  • Size Limit

    The total size of the Lookup output, including all columns, cannot exceed 4 MB. This restriction applies regardless of the number of rows, and the activity fails if the output is larger.
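
To make the two limits concrete, here is a minimal Python sketch that checks a sample result set against them before a pipeline is designed around it. The helper name `fits_lookup_limits` is hypothetical, reading "4 MB" as 4 * 1024 * 1024 bytes is an assumption, and the JSON-encoded size is only an approximation of how ADF measures the output.

```python
import json

# Limits as described above. Interpreting "4 MB" as 4 * 1024 * 1024 bytes
# is an assumption made for this sketch.
MAX_ROWS = 5000
MAX_BYTES = 4 * 1024 * 1024


def fits_lookup_limits(rows):
    """Return True if `rows` (a list of dicts) stays within both limits.

    The JSON-encoded size is a rough stand-in for the real output size.
    """
    if len(rows) > MAX_ROWS:
        return False
    payload = json.dumps(rows).encode("utf-8")
    return len(payload) <= MAX_BYTES


if __name__ == "__main__":
    sample = [{"id": i, "name": f"customer-{i}"} for i in range(3000)]
    print(fits_lookup_limits(sample))
```

A check like this is useful on a sample export of the source data: wide rows can hit the size limit long before the row limit is reached.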

Other Limitations

  • Supported Sources

    While Lookup supports many data sources, each source may have its own limitations, for example in the queries it accepts.

  • Single Lookup per Activity

    Each Lookup activity can only perform a single retrieval operation. So, we cannot chain multiple lookups within a single activity.

Workarounds for Data Volume Limitations

  • Two-Level Pipeline Design

    Create a nested pipeline structure. The outer pipeline iterates through a loop, calling an inner pipeline that performs the Lookup activity on a limited data set (at most 5,000 rows and 4 MB). This allows processing larger datasets in chunks.

  • Batch Processing with ForEach Activity

    Use a ForEach activity to iterate through a list of smaller data subsets. Inside the loop, call the Lookup activity for each subset, ensuring it stays within the limits.

  • Pre-process Large Data Sets

    If possible, consider pre-processing the source data to filter or aggregate it before the Lookup activity in ADF. This can help reduce the overall data volume.
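
The chunking idea behind the first two workarounds can be sketched in Python as follows. This is only the planning logic, not ADF code: in a real pipeline the chunk list would feed a ForEach activity and the query would be built with a pipeline expression. The function names, the table `dbo.Orders`, and the column `OrderId` are hypothetical, and the `OFFSET ... FETCH NEXT` paging syntax assumes a SQL Server-style source.

```python
def plan_chunks(total_rows, chunk_size=5000):
    """Split `total_rows` into (offset, limit) pairs of at most `chunk_size` rows."""
    if total_rows < 0 or chunk_size <= 0:
        raise ValueError("total_rows must be >= 0 and chunk_size must be > 0")
    return [
        (offset, min(chunk_size, total_rows - offset))
        for offset in range(0, total_rows, chunk_size)
    ]


def build_chunk_query(table, order_column, offset, limit):
    """Build a paged query for one chunk (SQL Server-style paging is assumed)."""
    return (
        f"SELECT * FROM {table} ORDER BY {order_column} "
        f"OFFSET {offset} ROWS FETCH NEXT {limit} ROWS ONLY"
    )


if __name__ == "__main__":
    # Hypothetical source table with 12,000 rows.
    for offset, limit in plan_chunks(12000):
        print(build_chunk_query("dbo.Orders", "OrderId", offset, limit))
```

A stable `ORDER BY` column matters here, since without it the chunks may overlap or skip rows. If rows are wide, a smaller `chunk_size` may be needed so that each chunk also stays under the 4 MB size limit.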

Understanding the limitations of the Lookup activity in Azure Data Factory helps with designing robust data pipelines.

Conclusion

In brief, our Support Experts gave us a look at the limitations of Lookup Activity in Azure Data Factory.
