Why We Chose Azure Functions Flex Consumption Over Databricks for Lightweight Integration APIs
When migrating integration APIs from a legacy system, the team had Databricks as the default platform. But for workloads that process a few hundred rows in under two seconds, a quick pricing analysis showed that Azure Functions Flex Consumption was orders of magnitude cheaper, even if it meant stepping outside the central governance model.
Tags: Azure Functions, Cost Optimization, Serverless, Databricks, Infrastructure as Code, Swiss Enterprise
Context
The cloud platform and data product infrastructure were already in place: governance, networking, identity, and dozens of isolated data products, all provisioned as code on Databricks. The next challenge was different: a set of integration APIs had to be migrated from a legacy system. These APIs powered pricing calculations, master data lookups, and order optimisation for downstream business applications.
Databricks was the default compute platform for the project. The natural instinct was to run these integrations there too, keeping everything under one governance model with Unity Catalog, centralised logging, and a single operational surface. But the team decided to run the numbers first.
Running the numbers: what does a two-second job actually cost?
The integration workloads were lightweight: a few hundred rows of pricing data, tabular master data lookups, SQL queries against an on-premises database. Each computation completed in one to two seconds. The question was simple: what is the cheapest way to run a two-second API call dozens of times per day?
On Databricks, even with serverless compute (which eliminates the classic five-to-ten-minute cluster startup), there is still a minimum billing window and per-DBU cost. Serverless Jobs on Azure Databricks Premium are billed at approximately $0.56 per DBU-hour in Switzerland North. Even the smallest workload consumes a fraction of a DBU, but the minimum billing granularity and startup overhead mean each invocation costs roughly one to two cents.
On Azure Functions Flex Consumption, the same two-second computation on a 2 GB instance costs approximately $0.00015 (roughly one and a half hundredths of a cent). Billing is per-millisecond at $0.000037 per GB-second in Switzerland North. On top of that, Flex Consumption includes a generous free tier: 100,000 GB-seconds and 250,000 executions per month at no charge. For the volume of calls these APIs handle, most months would fall entirely within the free tier.
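The arithmetic behind these estimates is simple enough to sketch directly; the rates and free-tier thresholds are the published figures quoted above, and the 100-calls-per-day volume is the illustrative load used throughout this article:

```python
# Flex Consumption cost for one 2-second call on a 2 GB instance,
# using the Switzerland North rate quoted above.
GB_SECOND_RATE = 0.000037   # USD per GB-second
INSTANCE_GB = 2.0
DURATION_S = 2.0

gb_seconds_per_call = INSTANCE_GB * DURATION_S          # 4 GB-seconds
cost_per_call = gb_seconds_per_call * GB_SECOND_RATE    # $0.000148

# Monthly volume: 100 calls/day over 30 days
calls_per_month = 100 * 30
monthly_gb_seconds = calls_per_month * gb_seconds_per_call  # 12,000 GB-s

# Free tier: 100,000 GB-seconds and 250,000 executions per month
within_free_tier = (monthly_gb_seconds <= 100_000
                    and calls_per_month <= 250_000)

print(f"cost per call: ${cost_per_call:.6f}")   # cost per call: $0.000148
print(f"within free tier: {within_free_tier}")  # within free tier: True
```

At this volume the workload sits well inside the free tier, which is why the article treats the Flex Consumption option as effectively free.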
The cost difference is not a few percent; it is orders of magnitude. For a workload that runs a two-second computation dozens of times per day, Flex Consumption is effectively free, while even Databricks serverless carries a measurable per-invocation cost. When the workload does not need Spark, Unity Catalog, or distributed compute, paying for them makes no sense.
The trade-off: stepping outside central governance
Choosing Flex Consumption meant these integration APIs would live outside the Databricks governance model. They would not appear in Unity Catalog. They would not benefit from Databricks' built-in lineage tracking or centralised compute policies. The team accepted this trade-off deliberately.
The functions could still reference Databricks data when needed, reading from the same storage accounts and querying the same on-premises databases through the shared VNet. What they lost was the single-pane-of-glass governance. What they gained was an ultra-cost-effective compute layer for lightweight APIs that did not need any of Databricks' capabilities.
This is not an either/or decision. Databricks remains the right choice for data engineering, ML pipelines, and anything that benefits from Spark or Unity Catalog. Flex Consumption is for the small integration APIs that would otherwise be over-provisioned. The two coexist on the same platform network.
What the functions do
A single Function App hosts all integration APIs migrated from the legacy system. The design started with one app for all workloads; separation into multiple apps would only happen if scale or isolation required it. For fewer than ten pieces of business logic, one app was sufficient. All functions are Python-based, using pandas for in-memory transformations and pyodbc for direct SQL Server access through the VNet.
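To make the shape of these workloads concrete, here is a minimal sketch of a pandas enrichment step of the kind described. Every column name, value, and rate below is invented for the example; the real APIs read their input from Blob Storage rather than an inline string:

```python
import io

import pandas as pd

# Hypothetical pricing data; the real functions load CSV/Parquet
# from Azure Blob Storage.
raw = io.StringIO(
    "article_id,base_price,weight_kg\n"
    "A100,12.50,2.0\n"
    "A200,7.90,0.5\n"
)
FREIGHT_PER_KG = 0.80  # assumed freight rate, for illustration only

df = pd.read_csv(raw)
df["freight_cost"] = df["weight_kg"] * FREIGHT_PER_KG
df["total_price"] = df["base_price"] + df["freight_cost"]

# Validation check of the kind mentioned above: no negative totals
assert (df["total_price"] > 0).all()

# Return shape: list of records, ready to serialise as JSON
result_json = df.to_dict(orient="records")
```

For a few hundred rows this kind of in-memory transformation completes in well under a second, which is exactly why Spark would be overkill here.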
Serverless Integration APIs
The Function App hosts three groups of workloads:
Pricing calculations
- Loads pricing data from Azure Blob Storage (CSV/Parquet)
- Enriches with product dimension codes
- Calculates logistics, freight, and handling costs
- Runs validation checks and returns JSON or CSV
Order optimisation
- Distributes yearly quantities across months
- Warns when orders deviate from forecasts
Master data lookups
- SQL queries against the on-premises database via VNet
- Sub-second response times
- Article lookups with dynamic filtering
- Customer discount calculations
- Direct SQL Server connection through the private network
- Consumed by Power Apps and BI tools
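One of the workloads above, distributing a yearly order quantity across months, is simple enough to sketch in a few lines. Even weighting is an assumption for this sketch; the real API may weight months by forecast:

```python
def distribute_yearly_quantity(total: int, months: int = 12) -> list[int]:
    """Split an integer yearly quantity into monthly buckets that sum
    exactly to the total, spreading any remainder over the first months.
    Even weighting is an assumption; forecast-based weights would work
    the same way with a weighted split."""
    base, remainder = divmod(total, months)
    return [base + (1 if i < remainder else 0) for i in range(months)]

monthly = distribute_yearly_quantity(1000)
print(monthly)       # four months of 84, then eight months of 83
print(sum(monthly))  # 1000 -- nothing lost to rounding
```

The point is not the algorithm itself but its scale: a dozen integer operations per request, which is the kind of computation the cost analysis above is about.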
All workloads run within a single Function App. If separation is needed later, splitting into multiple apps is a configuration change. All inbound traffic routes through private endpoints. Outbound traffic to the on-premises SQL Server flows through the hub firewall via VNet integration. Functions can read from the same storage accounts used by Databricks data products.
The pricing comparison
Here is what a single two-second API call costs on each platform. The Databricks estimate uses serverless Jobs Compute on Premium ($0.56/DBU-hour, Switzerland North) with minimum billing overhead. The Flex Consumption estimate uses a 2 GB instance at $0.000037/GB-second (Switzerland North).
Before: Databricks (serverless Jobs Compute)
- Pricing: ~$0.56 per DBU-hour (Premium, Switzerland North)
- Minimum billing window per invocation
- Est. cost per 2s call: ~$0.01–$0.02
- 100 calls/day ≈ $1–$2/day
- Includes Unity Catalog governance
- Spark overhead for tabular data
After: Flex Consumption (FC1, 2 GB instance)
- Pricing: $0.000037 per GB-second (Switzerland North)
- Per-millisecond billing granularity
- Est. cost per 2s call: ~$0.00015
- 100 calls/day ≈ $0.015/day
- Free tier: 100K GB-seconds + 250K executions/month
- Pandas for small datasets; right-sized
At 100 calls per day, the Databricks option costs roughly $30–60 per month. The Flex Consumption option costs approximately $0.45 per month at pay-as-you-go rates, and in most months falls entirely within the free tier (100,000 GB-seconds free). The cost difference is two orders of magnitude.
Private by default, authenticated per request
The Function App follows the same security model as the rest of the platform: zero public access. Private endpoints handle all inbound traffic. VNet integration routes outbound traffic through the hub firewall to reach on-premises SQL Server. Key Vault stores all secrets behind its own private endpoint.
Authentication uses JWT bearer tokens validated against Entra ID. Each API endpoint has its own allow list of caller identities; a service principal calling the pricing API cannot access the master data API. This per-API authorisation is configured through environment variables, making it auditable and changeable through the standard pull request workflow.
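The per-API allow-list check can be sketched as follows. The environment variable naming scheme and the comma-separated list of object IDs are assumptions for illustration; the JWT itself would already have been validated against Entra ID by a standard library before this step runs:

```python
import os


def is_caller_allowed(api_name: str, caller_object_id: str) -> bool:
    """Check an already-validated caller identity against the per-API
    allow list. The env var naming (ALLOWED_CALLERS_<API>) is an
    assumption for this sketch, not the real configuration keys."""
    raw = os.environ.get(f"ALLOWED_CALLERS_{api_name.upper()}", "")
    allowed = {oid.strip() for oid in raw.split(",") if oid.strip()}
    return caller_object_id in allowed


# Hypothetical configuration: the pricing API admits one service
# principal; the master data API has a separate (here empty) list.
os.environ["ALLOWED_CALLERS_PRICING"] = "11111111-aaaa, 22222222-bbbb"

print(is_caller_allowed("pricing", "11111111-aaaa"))     # True
print(is_caller_allowed("masterdata", "11111111-aaaa"))  # False
```

Because the allow lists live in environment variables managed by Terraform, granting a new caller access is a reviewable pull request rather than a portal change.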
- Caller sends request with JWT token
- Private endpoint routes to Function App
- JWT validated against Entra ID
- Caller identity checked against API allow list
- Function executes (Blob Storage / SQL via VNet)
- Response returned in sub-second time
Still fully as code
The entire Function App infrastructure is defined in Terraform: the Flex Consumption service plan, storage account, Key Vault, private endpoints, VNet integration, managed identity, RBAC assignments, and diagnostic settings. Adding a new integration API means writing the Python function and updating the Terraform configuration, the same pull request workflow as every other change on the platform.
This was the third initiative built on the dedicated tenant, after the cloud platform foundation and the data product layer. The same governance, networking, and identity infrastructure made it possible to ship a fully private Function App without any new architectural work.
Outcome
The team proved that not every workload belongs on the same platform. By running the numbers before defaulting to Databricks, they identified a set of integration APIs where Flex Consumption was the right tool: two orders of magnitude cheaper, with sub-second cold starts, full VNet integration, and per-millisecond billing. The trade-off was stepping outside Unity Catalog governance for these specific workloads. The gain was an ultra-cost-effective integration layer that still connects to the same data and runs on the same private network.
The pattern is now the default for the project: any new integration that does not need Spark or Unity Catalog goes to Flex Consumption. Everything else stays on Databricks. Right tool for the right job.
This is the type of engagement our Data Platform on Databricks use case covers. The cloud platform and data product foundations described in our companion articles made this possible. Whether you need the full stack, the data layer, or help choosing the right compute model for each workload, we have built and operate platforms like this in production.
Need this for your project?
We cover this exact scenario. Strategy, delivery, or both. See the use case or get in touch.