u/Bayernboy23

I’m working on ingesting data from APIs into a Fabric Lakehouse. Due to company security restrictions (DLP policies, etc.), I’m unable to call the API directly from a notebook using Python’s requests library.

I also attempted to use Power Query via the Web connector. While I was able to successfully retrieve both the bearer token and the API data, I ran into issues when the token expired during execution.

Given these constraints, I moved to using a Data Pipeline with the Web connector and a Copy activity. In this setup:

  • The bearer token is stored in a pipeline variable and used in the Copy activity’s authentication header.
  • Pagination works well using the “next page” URL returned in the API response.
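For illustration only, the logic of this setup can be sketched in plain Python. This is not pipeline code: `fetch_page` is a hypothetical callable standing in for the HTTP call, and the `next` field name is an assumption about the response shape.

```python
from typing import Callable, Dict, Iterator, Optional

# Hypothetical signature: takes a URL and headers, returns the parsed JSON page.
FetchPage = Callable[[str, Dict[str, str]], dict]


def iter_pages(first_url: str, token: str, fetch_page: FetchPage) -> Iterator[dict]:
    """Yield each page, following the "next page" URL until none is returned."""
    # The header is built once, mirroring a token held in a pipeline variable.
    headers = {"Authorization": f"Bearer {token}"}
    url: Optional[str] = first_url
    while url:
        page = fetch_page(url, headers)
        yield page
        # "next" is an assumed field name; the real API may call it something else.
        url = page.get("next")
```

Because the header is built once outside the loop, every page request carries the same token, no matter how long the run takes.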

However, the main issue is that the bearer token is retrieved only once, before the Copy activity starts. Because the token expires after 30 minutes, long-running operations (e.g., thousands of paginated API calls) fail once it expires.

I attempted to work around this by implementing a loop to:

  • Check whether the token's expiration time is still later than the current time (i.e., the token is still valid)
  • Refresh the token when needed
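A minimal sketch of that check-and-refresh logic, again in plain Python rather than pipeline expressions. `acquire` is a hypothetical function that returns a token and its expiration time, and the two-minute safety `margin` is an assumption added here so the refresh happens slightly before the actual expiry.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional, Tuple

# Hypothetical: calls the auth endpoint and returns (token, expires_at).
AcquireToken = Callable[[], Tuple[str, datetime]]


class TokenCache:
    def __init__(self, acquire: AcquireToken,
                 margin: timedelta = timedelta(minutes=2),
                 now: Callable[[], datetime] = lambda: datetime.now(timezone.utc)):
        self._acquire = acquire
        self._margin = margin  # assumed safety margin, not from the original setup
        self._now = now
        self._token: Optional[str] = None
        self._expires_at: Optional[datetime] = None

    def get(self) -> str:
        """Return a valid token, refreshing it if it has expired or is about to."""
        if self._token is None or self._expires_at - self._margin <= self._now():
            self._token, self._expires_at = self._acquire()
        return self._token
```

In this sketch the check runs every time `get()` is called, so it only helps if the caller asks for the token before each request.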

While I can successfully generate a new token and update the expiration time, the Copy activity itself runs as a single operation. This means it continues executing with the original token and does not re-evaluate or refresh the token mid-execution. As a result, I cannot inject an updated token into a running Copy activity.

Main Questions

  1. Is there a way (configuration or setting) for a Copy activity to periodically refresh or re-acquire a bearer token during execution? It appears that once the Copy activity starts, it cannot leverage conditional logic or variable updates until it completes.
  2. What is the recommended approach for handling bearer token reauthentication in long-running API ingestion scenarios (e.g., paginated calls over several hours)?

Update / Potential Approach

I’m considering splitting this into two concurrent processes:

  • Process 1: A timed process responsible for refreshing and updating the bearer token
  • Process 2: The Copy activity that performs paginated API ingestion using the token

The idea would be for the Copy activity to reference a shared variable that gets updated by the token refresh process.

However, I’m unsure whether pipeline variables can be updated and re-read dynamically during execution, or if they are effectively static once the Copy activity begins.
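To make the two-process idea concrete, here is a rough sketch in plain Python using threads. It only shows the pattern of a shared value that is re-read on every request; it says nothing about whether pipeline variables behave this way. `SharedToken`, `run_refresher`, `auth_header`, and the `acquire` callable are all hypothetical names.

```python
import threading
from typing import Callable


class SharedToken:
    """Thread-safe holder for the current bearer token."""

    def __init__(self, token: str):
        self._lock = threading.Lock()
        self._token = token

    def get(self) -> str:
        with self._lock:
            return self._token

    def set(self, token: str) -> None:
        with self._lock:
            self._token = token


def run_refresher(shared: SharedToken, acquire: Callable[[], str],
                  interval_s: float, stop: threading.Event) -> None:
    """Process 1: replace the token every interval_s seconds until stopped."""
    # Event.wait returns False on timeout and True once stop is set.
    while not stop.wait(interval_s):
        shared.set(acquire())


def auth_header(shared: SharedToken) -> dict:
    """Process 2 calls this before every request so it sees the latest token."""
    return {"Authorization": f"Bearer {shared.get()}"}
```

The pattern works here only because the consumer rebuilds the header on each request. If the header were built once up front, as with the single token in the Copy activity above, the refresher's updates would never be seen.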

u/Bayernboy23 — 15 days ago