To read the full article by Lawrence Hecht on The New Stack, click here.
According to Hecht, the data mesh concept continues to gain momentum as an approach in which domain-oriented teams own “data products” and use a self-serve data infrastructure platform that both delivers the data product to consumers and allows the data to be analyzed and consumed. A data fabric and a data mesh both provide an architecture for accessing data across multiple technologies and platforms; a data fabric is technology-centric, while a data mesh focuses on organizational change.
Hecht also explains that there is less agreement about data virtualization versus data fabric, but the former term is usually focused only on the abstraction of storage across multiple locations. Varada’s new data virtualization survey provides a bit of semantic clarity.
The survey polled 130+ data experts and executives from the US and Canada about the pace of change in data virtualization, the challenges, and the route to helping enterprises truly embrace the data lake architecture. When asked how they define data virtualization, 64% said it is the ability to seamlessly connect to any data source or platform, 19% defined it as the ability to run any query without the need to model data, and 17% thought of it as a data lake query engine.
Not surprisingly, the top benefit for 71% of companies with a 10TB+ data lake is reducing and simplifying data ops. For companies with less than 10TB, which have not yet experienced the challenges of truly massive amounts of data, the top benefit is the ability to run all queries on a single platform.
According to Ori Reshef, Varada VP of Products, data virtualization promotes data democratization by enabling anyone in the organization, subject to proper governance policies, to access any dataset in a data mesh. The end result is that more business units can monetize massive amounts of data. Hecht argues that this is an optimistic viewpoint and that there are also a few obstacles, most notably queries to the data platform that need to be rewritten for each and every domain-specific use case.
Indeed, when asked about challenges, 88% of survey respondents stated that they face challenges that impact their migration efforts to a data virtualization platform. Query rewrites and cost are top concerns. In addition, almost half of respondents indicated that performance is a critical challenge.