We Run the Function Again
Azure Functions are powerful and convenient extension points for your Azure Data Factory pipelines. Put your custom processing logic behind an HTTP triggered Azure Function and you are good to go. Unfortunately, many people read the Azure documentation and assume they can happily run a Function for up to 10 minutes on a Consumption plan, or with no timeout on an App Service plan.
This assumption is flawed for two reasons:
- Processing times vary, whether due to variances in the data being processed or simply down to running on multi-tenanted infrastructure. Unless your workload is well below the documented threshold, the chances are that every now and again you will creep up to and exceed the maximum timeout. You may be tempted to switch over to an App Service plan, but that logic is also flawed because of reason #2.
- The advertised ten minute timeout, or unlimited timeout for App Service plans, does NOT apply to HTTP triggered functions. The actual timeout is 230 seconds. Why? Because the Azure Load Balancer imposes this limit, and we have no control over it.
Fortunately, there is a simple solution. Rather than the client waiting for an operation to complete, the server can respond with a `202 Accepted` status code, along with data that the client can use to determine if and when the operation has completed.
This sounds like a great idea, but we seem to have taken our simple asynchronous operation and added a fair amount of complexity. Firstly, our service logic now needs to expose a second endpoint that clients can use to check on progress, which in turn requires state to be persisted across calls. We need some sort of queuing that will allow our processing to run in an asynchronous manner, and we need to think about how we report error conditions to the client. The client itself now needs to implement some sort of polling so that it can respond when processing is complete.
The good news is that the Microsoft Azure Functions team and the Data Factory team have thought about all of this. Durable Functions enable us to easily build asynchronous APIs by managing the complexities of status endpoints and state management. Azure Data Factory automatically supports polling HTTP endpoints that return 202 status codes. So let's put this all together with a little example.
Durable Functions
The examples below are for Durable Functions 2.0.
Start by creating an HTTP triggered function to start the processing. To keep things simple we will pass a simple JSON payload which accepts a `secondsToWait` value that will be used to simulate a long running operation.
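The original code listing is not reproduced here, so the following is a minimal sketch of what such a starter function might look like with Durable Functions 2.x. The function, orchestrator, and variable names are illustrative, not taken from the original post:

```csharp
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Newtonsoft.Json.Linq;

public static class StartLongRunningTask
{
    [FunctionName("StartLongRunningTask")]
    public static async Task<HttpResponseMessage> Run(
        [HttpTrigger(AuthorizationLevel.Function, "post")] HttpRequestMessage req,
        [DurableClient] IDurableOrchestrationClient starter)
    {
        // Read { "secondsToWait": <n> } from the request body.
        JObject body = await req.Content.ReadAsAsync<JObject>();
        int secondsToWait = body.Value<int>("secondsToWait");

        // Start the orchestration (null instanceId lets the runtime generate one)
        // and return a 202 Accepted response containing the management endpoints,
        // including statusQueryGetUri.
        string instanceId = await starter.StartNewAsync(
            "LongRunningOrchestrator", null, (object)secondsToWait);

        return starter.CreateCheckStatusResponse(req, instanceId);
    }
}
```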
Note the `IDurableOrchestrationClient` argument; this is used to initiate the workflow and create the response.
Next, add an orchestrator function. This function coordinates the work to be done by other durable functions.
The orchestrator function doesn't do any of the processing itself; instead it delegates the work via `IDurableOrchestrationContext.CallActivityAsync(...)`.
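A minimal sketch of such an orchestrator, assuming the illustrative names used above:

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

public static class LongRunningOrchestrator
{
    [FunctionName("LongRunningOrchestrator")]
    public static async Task<string> Run(
        [OrchestrationTrigger] IDurableOrchestrationContext context)
    {
        // The orchestrator does no processing itself; it simply passes the
        // input on to an activity function and awaits the result.
        int secondsToWait = context.GetInput<int>();
        return await context.CallActivityAsync<string>("SimulateLongRunningWork", secondsToWait);
    }
}
```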
Finally, we'll add an activity function to do the actual processing. Note that this function will still honour the function timeout dictated by the hosting plan. Where possible, consider decomposing the work into multiple steps coordinated by the orchestrator function; this way you can avoid being subject to timeout limits. Durable Functions support a range of application patterns that you should familiarize yourself with.
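A sketch of the activity function, again with illustrative names; here the "processing" is simply a delay driven by `secondsToWait`:

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

public static class SimulateLongRunningWork
{
    [FunctionName("SimulateLongRunningWork")]
    public static async Task<string> Run([ActivityTrigger] int secondsToWait)
    {
        // Stand-in for real processing: just wait for the requested duration.
        // This function remains bound by the hosting plan's function timeout.
        await Task.Delay(TimeSpan.FromSeconds(secondsToWait));
        return $"Waited for {secondsToWait} seconds";
    }
}
```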
Publish your function and trigger it by sending a POST request. You should receive a response which looks something like this:
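The exact payload from the original post isn't reproduced here, but the response produced by `CreateCheckStatusResponse` has roughly this shape (URIs shortened and values illustrative):

```json
{
  "id": "7d8ab4...",
  "statusQueryGetUri": "https://<your-function-app>.azurewebsites.net/runtime/webhooks/durabletask/instances/7d8ab4...?taskHub=...&connection=Storage&code=...",
  "sendEventPostUri": "https://<your-function-app>.azurewebsites.net/runtime/webhooks/durabletask/instances/7d8ab4.../raiseEvent/{eventName}?taskHub=...&connection=Storage&code=...",
  "terminatePostUri": "https://<your-function-app>.azurewebsites.net/runtime/webhooks/durabletask/instances/7d8ab4.../terminate?reason={text}&taskHub=...&connection=Storage&code=...",
  "purgeHistoryDeleteUri": "https://<your-function-app>.azurewebsites.net/runtime/webhooks/durabletask/instances/7d8ab4...?taskHub=...&connection=Storage&code=..."
}
```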
The `statusQueryGetUri` provides information about the long running orchestration instance. If you follow this link you will receive a `runtimeStatus` that describes the state of the orchestration instance, along with some other useful information.
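For example, while the orchestration is still in flight, the status endpoint might return something like the following (field values are illustrative):

```json
{
  "name": "LongRunningOrchestrator",
  "instanceId": "7d8ab4...",
  "runtimeStatus": "Running",
  "input": 300,
  "customStatus": null,
  "output": null,
  "createdTime": "2019-09-05T10:46:39Z",
  "lastUpdatedTime": "2019-09-05T10:46:47Z"
}
```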
If you pay close attention you may also notice that the HTTP status code changes depending on the state of the orchestration.
| Status | HTTP Status Code |
|---|---|
| Pending or Running | 202 Accepted |
| Completed | 200 OK |
| Failed or Terminated | 500 Internal Server Error |
Bear this in mind, as it is key to how the next part works.
Azure Data Factory
Create an Azure Function linked service and point it at your deployed function app. Create a new pipeline and add an Azure Function activity which will call the asynchronous Function.
This activity will simply return the payload containing the `statusQueryGetUri` seen above.
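As a rough sketch, and assuming the illustrative names used earlier, the Azure Function activity in the pipeline JSON might look something like this (the activity, function, and linked service names are all placeholders):

```json
{
  "name": "Start Long Running Task",
  "type": "AzureFunctionActivity",
  "typeProperties": {
    "functionName": "StartLongRunningTask",
    "method": "POST",
    "body": { "secondsToWait": 300 }
  },
  "linkedServiceName": {
    "referenceName": "AzureFunctionLinkedService",
    "type": "LinkedServiceReference"
  }
}
```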
Next, we need to instruct Data Factory to wait until the long running operation has finished. Do this by adding a Web activity:
The important piece to note is the `url` property, which should be set to `@activity('Start Long Running Task').output.statusQueryGetUri`. When the pipeline runs, the Web activity will dynamically retrieve the status of the operation and will continue to poll the endpoint while the HTTP status code is `202 Accepted`. When the operation completes, the status URL will return either `200 OK`, indicating that the activity was successful, or `500 Internal Server Error`, which will cause the activity to fail.
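Assuming the Azure Function activity is named "Start Long Running Task" as above, a sketch of the corresponding Web activity definition might look like this:

```json
{
  "name": "Wait For Completion",
  "type": "WebActivity",
  "dependsOn": [
    {
      "activity": "Start Long Running Task",
      "dependencyConditions": [ "Succeeded" ]
    }
  ],
  "typeProperties": {
    "url": {
      "value": "@activity('Start Long Running Task').output.statusQueryGetUri",
      "type": "Expression"
    },
    "method": "GET"
  }
}
```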
At this point the pipeline consists of the Azure Function activity followed by the Web activity.
Run the pipeline and verify it behaves as expected. You will observe that the Web activity waits until the asynchronous function has completed, allowing you to trigger subsequent dependent ADF activities after the asynchronous operation has finished.
Summary
Durable Functions are a great way to implement custom long running data processing steps within Azure Data Factory without falling foul of the 230 second HTTP triggered Function timeout. The Data Factory Web activity has built-in support for polling APIs that return 202 status codes, making it trivial to integrate asynchronous APIs into your data pipelines.
Source: https://endjin.com/blog/2019/09/azure-data-factory-long-running-functions