Today more than ever before, organizations are striving to gain a competitive edge, deliver more value to customers, reduce risk, and respond more quickly to the needs of the business. To achieve these goals, organizations need easy access to a single view of accurate, consistent, and trusted data – all in real time. However, growing volumes and complexities of data make this difficult to achieve in practice. As data grows, so does the prevalence of data silos, making it a challenge to integrate and leverage data from internal and external sources.
Recently, data fabrics have emerged as a much-needed architectural approach to providing accurate visibility across the entire business, without the problems associated with data warehouses, data lakes, and other approaches to data integration and data management. Data fabrics can transform and harmonize data from multiple sources on demand to create a single, trusted, accurate, and current source of truth to service all consumers of the data. Smart data fabrics take this approach a step further by incorporating a wide range of analytics capabilities, including data exploration, business intelligence, natural language processing, machine learning, and even generative AI, enabling organizations to gain new insights and power intelligent prescriptive services and applications.
We often get questions about how data fabrics can be applied to address specific industry challenges. Below are answers to eleven common questions.
Data governance and data fabric are related, but they are very different. Data governance is an overarching set of initiatives that strive to define and enforce the quality, usage, and security of data within an organization. It includes policies, standards, processes, rules, roles and responsibilities, privileges, and more. In contrast, a data fabric is an architectural pattern that strives to create consistency among all data and metadata in an organization to make data easy to find, access, and use. A data fabric can be a critical component of a successful data governance program.
Although financial services organizations come to us with a vast array of problems stemming from disparate data, the primary solution in every case is a modern data architecture that ensures all data consumers have access to a consistent set of accurate, current, trusted, and secure information.
In general, a data fabric provides a modern approach to creating a single source of truth from all the disconnected, disparate, and dissimilar data sources inside and outside the organization, one that feeds all consumers of the data, whether that’s business users, applications, data scientists, clients, or regulators. It also provides a consistent and overarching metadata layer, along with a semantic layer that maintains relationships among the various data and metadata. A data fabric can eliminate the errors and redundancies introduced by maintaining multiple individual data repositories that serve different consumers of the data. It should allow data to be either persisted or virtualized (accessed in place, without persistence), handle real-time streaming data as well as batch data at scale, natively manage a wide variety of data types, including unstructured data (multi-model), and embed analytics so that advanced analytic processing can run in real time without moving the data to a separate environment (a smart data fabric).
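To make the virtualization and on-demand harmonization ideas concrete, here is a minimal sketch in Python. The two sources (a CRM stand-in and a SQLite table standing in for a core banking system), the field names, and the canonical schema are all invented for illustration; a production fabric would use its own connectors and semantic layer rather than these stand-ins.

```python
# Minimal sketch of on-demand harmonization in a data fabric:
# records are fetched from disparate sources at query time
# (virtualized, not persisted) and mapped to one canonical schema.
import sqlite3

# Source 1: a CRM, represented here as an in-memory list (stand-in for an API).
crm_rows = [{"cust_id": "C100", "full_name": "Ada Lovelace", "seg": "retail"}]

# Source 2: a core banking system, simulated with SQLite.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE accounts (customer_id TEXT, balance REAL)")
db.execute("INSERT INTO accounts VALUES ('C100', 2500.0)")

def canonical_customer(cust_id: str) -> dict:
    """Harmonize one customer into the fabric's canonical schema, on demand."""
    crm = next(r for r in crm_rows if r["cust_id"] == cust_id)
    balance = db.execute(
        "SELECT SUM(balance) FROM accounts WHERE customer_id = ?", (cust_id,)
    ).fetchone()[0]
    # Field names below represent the fabric's canonical (semantic-layer) vocabulary.
    return {
        "customer_id": cust_id,
        "name": crm["full_name"],
        "segment": crm["seg"],
        "total_balance": balance,
    }

print(canonical_customer("C100"))
```

Nothing is copied or persisted here: the canonical record is assembled at request time, which is the essence of the virtualized (non-persisted) option described above.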
A key attribute of data fabrics is that they’re non-disruptive to an organization’s existing technical infrastructure. They connect the existing technologies, including applications, data streams, databases, data warehouses, and data lakes, without requiring any “rip-and-replace.” A good implementation approach is to define well-scoped projects that can provide measurable business value in the short term (a few weeks or months), expose and connect data that is ripe for reuse in future projects, and proceed incrementally, avoiding multi-year big-bang implementations. For those of us who have been around for a while, this is exactly how we approached service-oriented architecture initiatives in the late 1990s and early 2000s.
There are many ways to implement a data fabric. One is to integrate many different data management point solutions, for example for relational and non-relational database management, integration, caching, data cataloging, workflow, business intelligence, machine learning, and metadata and semantic data management. We’ve seen that organizations that try this approach usually end up with a complex architecture that is slow to deploy, difficult to maintain, lacking in performance, and inefficient in its use of infrastructure resources. Instead, a recommended approach is to look for data platform technology that provides most of the required functionality in a single product or platform. One of our customers, a $5B fintech software provider, was able to replace eight different technologies with our single product, gaining nine times better performance while running on only 30% of the infrastructure, and with a far simpler architecture.
Absolutely! Supply chains are a perfect domain for data fabrics because they are large, disparate, and complex, spanning many different organizations, each with its own dissimilar data and application stacks. Organizations require real-time visibility across the entire continuum, from supply through distribution, to easily understand the status of millions or potentially billions of components and to react to unexpected issues and disruptions as they occur.
Handling disruptions quickly and efficiently is the top issue in supply chain operations. Disruptions are a constant occurrence and among the most challenging supply chain issues an organization must deal with. For example, geopolitical events, labor shortages, supply failures, weather patterns, and rapidly changing consumer demand can all impact supply and demand. An intelligent control tower must not only provide real-time, end-to-end visibility, but must also deliver predictive insights regarding the likelihood of disruptions, calculate the impact on the business, and present a set of data-driven prescriptive options for preventing potential disruptions in advance, or for handling them in real time when they occur. Organizations can accelerate data-driven decision-making by leveraging a data fabric with embedded analytics to achieve a higher level of decision support and automation-driven outcomes.
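As one illustration of the kind of embedded predictive analytics described above, the following sketch scores the disruption risk of an in-flight shipment with a simple logistic regression. The features, training data, and alerting threshold are invented for illustration and are not drawn from any particular product; in a real deployment the fabric would assemble these features on demand from harmonized supplier, weather, and logistics sources.

```python
# Sketch: scoring shipment-disruption likelihood with a simple classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Columns (hypothetical features): supplier lead-time variance (days),
# weather severity (0-1), port congestion index (0-1).
X_train = np.array([
    [0.5, 0.1, 0.2],
    [4.0, 0.8, 0.9],
    [1.0, 0.2, 0.3],
    [5.5, 0.9, 0.7],
])
y_train = np.array([0, 1, 0, 1])  # 1 = shipment was disrupted

model = LogisticRegression().fit(X_train, y_train)

# Score an in-flight shipment and surface it to the control tower
# when the risk crosses a (hypothetical) alerting threshold.
risk = model.predict_proba([[3.5, 0.7, 0.8]])[0, 1]
if risk > 0.5:
    print(f"Disruption risk {risk:.0%}: recommend sourcing from alternate supplier")
```

The prescriptive step, recommending an alternate supplier, is where the control tower turns a prediction into a data-driven option a planner can act on.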
Yes, very much so. Most organizations are moving to an “analytics and decision intelligence” data platform strategy to meet their digital transformation goals in supply chain. Doing so requires a modern architecture that can harmonize and normalize data from any disparate data source in real time, simulate business processes, and provide AI and ML capabilities to enable dynamic, optimized decision-making at the line-of-business level. In practice, industry-standard digital maturity models can provide guidance. The progression starts with understanding organizational requirements and critical KPIs, then leverages a foundational data fabric architecture and develops processes to advance incrementally through the higher levels of digital maturity, culminating in a predictive, autonomous, and adaptive supply chain.
Of course. We have many examples of customers leveraging a smart data fabric in supply chain to achieve outstanding results. One of our customers is the largest wholesaler of drugs and cosmetics in Japan. They distribute 50,000 different products from 1,000 different manufacturers to 400 different retailers that operate more than 50,000 stores, a total of 3.5 billion products every year! Using this approach, they’re achieving 99.999% On Time In Full (OTIF) delivery accuracy, compared with an industry average of around 65%. That means that for every 100,000 products they deliver, 99,999 arrive at the customer both on time and in full. That’s an incredible achievement.
Industry 4.0 is all about digitizing the manufacturing environment and enabling OT/IT convergence to streamline the entire process chain and improve efficiency and responsiveness. And it’s about more than creating digital twins of the factory. A data fabric can span supply through manufacturing, assembly, and distribution, connecting SCP, MRP, MES, ERP, CRM, PLM, inventory management, and more, to provide true end-to-end visibility. And just as with supply chain, a smart data fabric with advanced analytics embedded within the fabric can provide predictive and prescriptive analytics, for example to inform predictive maintenance that keeps critical production lines running, to balance supply with predicted fluctuations in demand, and to optimize staffing.
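To give a flavor of what in-stream predictive maintenance can look like, here is a minimal rolling z-score anomaly check in Python, the kind of lightweight analytic a smart data fabric could run against live sensor data to trigger a maintenance work order. The window size, threshold, and vibration readings are illustrative assumptions, not a real production configuration.

```python
# Sketch: flagging anomalous sensor readings on a production line
# with a rolling z-score over a sliding window of recent values.
from collections import deque
from statistics import mean, stdev

WINDOW, THRESHOLD = 20, 3.0  # hypothetical tuning values
recent = deque(maxlen=WINDOW)

def check_reading(value: float) -> bool:
    """Return True if the reading deviates sharply from the recent window."""
    is_anomaly = False
    if len(recent) == WINDOW:
        mu, sigma = mean(recent), stdev(recent)
        if sigma > 0 and abs(value - mu) / sigma > THRESHOLD:
            is_anomaly = True
    recent.append(value)
    return is_anomaly

# Simulated vibration readings; the spike at the end should be flagged.
stream = [1.0 + 0.01 * i for i in range(25)] + [5.0]
for v in stream:
    if check_reading(v):
        print(f"Anomaly detected: {v} -> schedule maintenance check")
```

Because the check runs where the data already lives, there is no need to move sensor streams to a separate analytics environment, which is the point of embedding analytics in the fabric.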
We also see many practical applications of a data fabric with our healthcare customers. One of our customers, an academic medical center, needed a centralized entry point for internal and external consumers to access information distributed across the organization’s many data silos. The data fabric serves as an API service layer, allowing authorized end users and client applications to access, in real time, information distributed among their enterprise data warehouse, data lake, EMR, and other silos. To satisfy regulatory requirements, they’re using the data fabric as a FHIR façade for relational data that resides within their enterprise data warehouse. The medical center also benefits from using the smart data fabric as an analytics layer, allowing analysts and analytics toolsets to explore and report on data across different sources. By incorporating near-real-time information alongside data residing in traditional long-term storage, they get more up-to-date analytics and uncover insights and patterns that would otherwise have remained hidden within data silos.
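To illustrate the FHIR façade pattern in the abstract, the sketch below serves a relational patient row as a FHIR R4 Patient resource at request time, so consumers see standard FHIR while the data stays in the warehouse. The table, column names, and values are hypothetical, not the medical center’s actual schema or implementation.

```python
# Sketch of a FHIR facade: translating a relational patient record
# into a FHIR R4 Patient resource on demand.
import json
import sqlite3

# Hypothetical warehouse table standing in for the real relational store.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE patients
              (mrn TEXT, family TEXT, given TEXT, birth_date TEXT, sex TEXT)""")
db.execute("INSERT INTO patients VALUES "
           "('12345', 'Curie', 'Marie', '1867-11-07', 'female')")

def get_fhir_patient(mrn: str) -> dict:
    """Serve one warehouse row as a FHIR R4 Patient resource, on demand."""
    family, given, birth_date, sex = db.execute(
        "SELECT family, given, birth_date, sex FROM patients WHERE mrn = ?",
        (mrn,),
    ).fetchone()
    return {
        "resourceType": "Patient",
        "id": mrn,
        "name": [{"family": family, "given": [given]}],
        "birthDate": birth_date,
        "gender": sex,
    }

print(json.dumps(get_fhir_patient("12345"), indent=2))
```

In practice this mapping function would sit behind an authorized API endpoint, which is what lets the fabric act simultaneously as a service layer and a regulatory-compliant FHIR interface.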
Many industry analysts are promoting the data fabric architecture as the preferred approach for many use cases, especially where there is a lot of disparate and dissimilar data to be managed. However, getting started can be overwhelming. We recommend that the technical teams in an organization work closely with stakeholders in the lines of business to identify the use cases that can bring the most value to the organization, and then implement them in sprints that each deliver measurable business value. We also recommend working with a trusted partner that has proven experience with similar organizations and use cases to help with strategy, best practices, and implementation.