Data Engineering notes
Demystifying Eventhouse OneLake Availability: Why mirroring isn't real-time
This week, I collaborated on an interesting support case regarding Eventhouse in Fabric’s Real-Time Intelligence (RTI). A customer had enabled OneLake Availability on their Eventhouse tables. Naturally, they expected the feature to mirror their streaming data into their Lakehouse in a matter of seconds. Instead, they noticed it was taking several minutes for the data to surface in the Delta table. If you are encountering this same delay, I want to be entirely straightforward: at the time of writing, this is the expected behavior. It is easy to assume that because Eventhouse handles real-time ingestion so beautifully, its OneLake mirroring capabilities would do the same. However, the mechanics of how data is flushed to OneLake fundamentally shift the architectural goal from real-time streaming to optimized batch querying.
April 26, 2026
Understanding OPENROWSET statistics in Synapse Serverless SQL pool
Sunday morning here… instead of watching the news (pretty bad lately), I’m just sitting here with my second coffee and trying to find that hour I lost moving to UTC+1 summer time :) Anyway, since I’m already caffeinated, I think it is a good time to follow up on my previous post, where I discussed statistics for external tables in Synapse Serverless SQL pool. Today, let’s shift the focus to statistics for OPENROWSET queries.
March 29, 2026
Statistics strategy for external tables in Synapse Serverless SQL pool
What exactly is a SQL statistic? Before we dive into the details on Synapse Serverless SQL, let’s quickly cover what a SQL statistic actually is. Simply put, it is a small object inside the database that contains information about how the data is distributed in a specific column. It helps the database engine understand what the data looks like without having to scan the entire table first.
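To make the idea concrete, here is a minimal sketch in Python of what a column statistic does conceptually. It is not how Synapse Serverless SQL stores or builds statistics internally; the equal-width histogram and the function names `build_histogram` and `estimate_rows_le` are assumptions chosen purely for illustration.

```python
def build_histogram(values, buckets=4):
    """Summarise a column as equal-width buckets with a row count per bucket."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / buckets or 1  # avoid zero width when all values are equal
    counts = [0] * buckets
    for v in values:
        # Clamp so the maximum value falls into the last bucket.
        idx = min(int((v - lo) / width), buckets - 1)
        counts[idx] += 1
    return {"lo": lo, "width": width, "counts": counts}


def estimate_rows_le(hist, threshold):
    """Estimate how many rows satisfy `column <= threshold` from the histogram alone."""
    total = 0.0
    for i, count in enumerate(hist["counts"]):
        start = hist["lo"] + i * hist["width"]
        end = start + hist["width"]
        if threshold >= end:
            total += count  # whole bucket qualifies
        elif threshold > start:
            # Partial bucket: assume values are spread evenly inside it.
            total += count * (threshold - start) / hist["width"]
    return total


# Hypothetical column with 100 rows holding the values 0..99.
column = list(range(100))
hist = build_histogram(column)
estimate = estimate_rows_le(hist, 49.5)
```

The point of the sketch is that `estimate_rows_le` never touches `column` itself: once the small histogram exists, the row estimate comes from a handful of bucket counts rather than a scan of the data.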
March 27, 2026