Businesses are awash with data, and they want to capitalize on it. Their goal is to monetize the data flowing in from smart devices and everywhere else, but doing so first requires a reassessment of traditional data management practices.

We are surrounded by data, but much of it sits in silos, which makes it difficult to consume. Meanwhile, our data-centric society's expectations for quality experiences are extremely high, making data, and the insights derived from it, one of a business's most important assets. Fortunately, when it comes to building a data architecture that suits a business's current and future needs, there is a plethora of options.

The Paradigm Shift

Data fabrics and data meshes have emerged over the last year as promising paradigms for helping organizations on their data journeys. The notion of a "single source of truth," made famous by traditional monolithic approaches to data management, is becoming increasingly impractical as data sources grow more dispersed and cloud-centric.

Essentially, building a data lake or moving your data warehouse to the cloud is like putting a shiny new interior in a 1972 Gremlin: the ride will likely be much more pleasant, but it still might not get you to your destination.

The data mesh approach seeks to address these shortcomings by offering distributed processing and governance at the point of data collection. Data fabrics offer a more integrated paradigm wherein processing is pushed to where the data resides while distributed, mission-critical data stores are purposefully woven and integrated through machine learning and automation.

The two concepts differ in fundamental ways, but they also share a great deal, especially in intent. It's important to note those differences, but just as important to understand where they overlap. Before looking at how they fit together in a hybrid model, though, let's define each.

What Is a Data Fabric?

Data fabric is a design concept and reference architecture geared toward addressing the complexity of data management and minimizing disruption to data consumers. Think of it as a web stretched across a large network of existing data and technology assets, connecting disparate data and applications wherever they live: on premises, with partners or in the public cloud. A data fabric provides the capabilities needed to discover, connect, integrate, transform, analyze, manage, utilize and store data assets, enabling the business to meet its myriad business goals faster and with less complexity than previous approaches, such as data lakes. An enterprise data fabric combines several data management technologies, including database management, data integration, data transformation, pipelining and API management.
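To make the concept concrete, here is a minimal sketch in Python of the virtual access layer a data fabric provides. The source names and connectors (SalesDBConnector, PartnerAPIConnector) are hypothetical stand-ins for real integration tooling, and a real fabric would push the query down to each source rather than simulate results in memory.

```python
# A minimal data fabric sketch: one uniform access layer over disparate,
# distributed sources. All source names here are invented for illustration.
from abc import ABC, abstractmethod


class Connector(ABC):
    """Adapter that hides where and how a data asset actually lives."""

    @abstractmethod
    def query(self, criteria: dict) -> list[dict]:
        ...


class SalesDBConnector(Connector):
    def query(self, criteria: dict) -> list[dict]:
        # In practice: push the query down to an on-premises database.
        return [{"source": "sales_db", "customer": criteria.get("customer")}]


class PartnerAPIConnector(Connector):
    def query(self, criteria: dict) -> list[dict]:
        # In practice: call a partner's cloud API and normalize the payload.
        return [{"source": "partner_api", "customer": criteria.get("customer")}]


class DataFabric:
    """Registry that lets consumers discover and query assets uniformly."""

    def __init__(self) -> None:
        self._connectors: dict[str, Connector] = {}

    def register(self, name: str, connector: Connector) -> None:
        self._connectors[name] = connector

    def query_all(self, criteria: dict) -> list[dict]:
        # Processing happens at each source; results are woven together here.
        results: list[dict] = []
        for connector in self._connectors.values():
            results.extend(connector.query(criteria))
        return results


fabric = DataFabric()
fabric.register("sales", SalesDBConnector())
fabric.register("partners", PartnerAPIConnector())
print(fabric.query_all({"customer": "acme"}))
```

The design choice worth noting is that consumers see one interface regardless of where an asset lives, which is what minimizes disruption when sources move or multiply.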

Smart data fabrics take the approach a step further, incorporating a wide range of analytics capabilities, including data exploration, business intelligence, natural language processing and machine learning. This enables organizations to gain new insights and create intelligent prescriptive services and applications.
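As a hedged illustration of that "smart" layer, the fragment below scores customers using data the fabric has already woven together; the churn heuristic is a toy stand-in for a trained machine learning model, not a description of any specific product capability.

```python
# A toy analytics step layered on the fabric sketch above. The scoring rule
# stands in for a real ML model, purely for illustration.
def churn_score(customer_rows: list[dict]) -> float:
    """Pretend heuristic: more touchpoints across sources -> lower churn risk."""
    touchpoints = len(customer_rows)
    return max(0.0, 1.0 - 0.2 * touchpoints)


rows = fabric.query_all({"customer": "acme"})  # `fabric` from the sketch above
print(f"churn risk for acme: {churn_score(rows):.2f}")
```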

What Is a Data Mesh?

A data mesh is focused on organizational change: domain teams own the delivery of data products, with the understanding that those teams are closest to their data and therefore understand it best. It's supported by an architecture that leverages a domain-oriented, self-serve design, enabling data consumers to discover, understand, trust and use data and data products to inform decisions and initiatives.

Similar to how engineering teams have adopted microservice architectures in place of monolithic applications, data teams view the data mesh as an opportunity to adopt data microservices that provide business-contextual services, rather than relying on monolithic data platforms.
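To sketch what that ownership might look like in code, here is a hypothetical domain-owned data product in Python. The "sales" domain, owner address and schema are invented; the point is the published contract and the self-serve access path.

```python
# A minimal, hypothetical data product: owned by one domain team, published
# with a contract, and served through a self-serve interface.
from dataclasses import dataclass
from typing import Callable


@dataclass
class DataProduct:
    name: str                        # discoverable identifier
    domain: str                      # the team that owns and understands the data
    owner: str                       # accountable contact for quality and SLAs
    schema: dict                     # published contract consumers can trust
    fetch: Callable[[], list[dict]]  # self-serve access, like a microservice endpoint


def fetch_orders() -> list[dict]:
    # The owning team decides how this is actually served (API, table, stream).
    return [{"order_id": 1, "amount": 99.0}]


orders_product = DataProduct(
    name="orders",
    domain="sales",
    owner="sales-data-team@example.com",
    schema={"order_id": "int", "amount": "float"},
    fetch=fetch_orders,
)
print(orders_product.fetch())
```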

Finding the Happy Medium

While much of the conversation around data fabrics and data mesh has been about understanding the principles of each to determine which approach or architecture is best suited for a business’s needs, the real value of these concepts is not rooted in an “either/or” decision.

When assessing the viability of these concepts, they need to be viewed as complementary. Just as today's microservices environment has enabled a "best-of-breed" approach to technology adoption, the question for organizations shouldn't be determining the fit of data fabric versus data mesh; it should be "what's the use case?"

Local processing and governance of distributed data (i.e., with a data mesh) enables sales and marketing teams to curate a 360-degree view of consumer behaviors and profiles from various systems and platforms, informing targeted campaigns and customer lifetime value calculations. For example, the infotainment system in a car tracks which buttons and functions the driver and passengers use; that telemetry becomes the team's "data product," informing the marketing, design and functionality of the next model.
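As a sketch of that flow, the snippet below reduces raw, invented button-press events into an aggregated usage product; in a real vehicle this aggregation would run in the domain team's own pipeline.

```python
# Hypothetical infotainment telemetry, aggregated locally into a usage
# "data product" that design and marketing teams can consume.
from collections import Counter

events = [
    {"vehicle": "vin-123", "control": "voice_assist"},
    {"vehicle": "vin-123", "control": "voice_assist"},
    {"vehicle": "vin-123", "control": "fm_radio"},
]


def build_usage_product(raw_events: list[dict]) -> dict:
    """Reduce raw events to the aggregate other teams actually need."""
    usage = Counter(e["control"] for e in raw_events)
    return {"vehicle": raw_events[0]["vehicle"], "feature_usage": dict(usage)}


print(build_usage_product(events))
# {'vehicle': 'vin-123', 'feature_usage': {'voice_assist': 2, 'fm_radio': 1}}
```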

That said, you also need enterprise processing and governance of distributed data (i.e., with a data fabric) to create the connectivity required for an enterprise view of your data assets. There is a common misconception that data fabric is solely about centralized processing. In reality, a data fabric embraces distributed processing where it makes sense, as with the tracking data in your car's infotainment system, but it also provides the connectivity that lets you go back to the source and connect the dots across those distributed processes.
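One way to picture that hybrid: domain products stay where they are, while a thin fabric layer stitches them into an enterprise view on demand. The registry below is a hypothetical stand-in for real catalog and integration tooling.

```python
# Domain-owned products (left in place), plus a thin enterprise layer that
# connects the dots across them on demand. All names are invented.
mesh_products = {
    "vehicle_usage": lambda: [{"vehicle": "vin-123", "feature_usage": {"voice_assist": 2}}],
    "orders": lambda: [{"order_id": 1, "amount": 99.0}],
}


def enterprise_view(product_names: list[str]) -> dict:
    """Pull each domain product and weave an enterprise-wide picture."""
    return {name: mesh_products[name]() for name in product_names}


print(enterprise_view(["vehicle_usage", "orders"]))
```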

At the end of the day, it’s all about streamlining and simplifying your architecture so that you can focus on productizing your data in a meaningful way. Far too many still see infrastructure as a cost center, but these new paradigms are opening the door to viewing data infrastructure as a profit center.

Feature image via Pixabay.