Data has a central function in the digital economy. It is collected in gigantic quantities so that automated or autonomous decisions can be made on the basis of data analysis. In this context, data security is achieved by technologies such as encryption, access protection, and tools that prevent attacks (e.g., SQL injection) on information. In contrast, data governance involves tools that encourage correct handling of data (e.g., managing personal and other sensitive information) and that implement the associated risk management with the use of, for example, metrics and control instruments.
The Basis: Data Overview
To implement data security, not just as a local solution for certain databases but to secure all or at least a large amount of relevant information, you first need to establish the basis. Data management informs you of the information you have, which is important because you can't use or protect what you don't know about.
A positive side effect is that this knowledge is needed not only for data security, but also for the efficient use of data and for data governance. More specifically, it's about metadata management and data catalogs. Metadata management is the functional approach, and data catalogs are the resulting repositories of information about the data held across the various data storage systems. Together they help you identify data worthy of protection, as well as the best data sources for the efficient and targeted use of information.
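To make the idea of a catalog as a repository across storage systems more concrete, the following minimal Python sketch keeps metadata entries in memory and looks them up by tag. The class names, field names, system names, and tags are all hypothetical and chosen only for illustration; commercial data catalogs offer far more than this.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CatalogEntry:
    """Metadata about one dataset; all field names are illustrative assumptions."""
    name: str
    system: str   # where the data lives, e.g., a database or a file share
    owner: str
    tags: frozenset = field(default_factory=frozenset)


class DataCatalog:
    """A tiny in-memory metadata repository spanning several storage systems."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        # Key by (system, name) so the same dataset name can exist in two systems.
        self._entries[(entry.system, entry.name)] = entry

    def find_by_tag(self, tag):
        """Return all entries carrying the tag, sorted by system and name."""
        hits = [e for e in self._entries.values() if tag in e.tags]
        return sorted(hits, key=lambda e: (e.system, e.name))


catalog = DataCatalog()
catalog.register(CatalogEntry("customers", "crm_postgres", "sales",
                              frozenset({"personal"})))
catalog.register(CatalogEntry("clickstream", "data_lake", "marketing",
                              frozenset({"behavioral"})))
catalog.register(CatalogEntry("payroll", "hr_fileshare", "hr",
                              frozenset({"personal", "financial"})))

# Which datasets are worthy of protection because they hold personal data?
protected = [f"{e.system}.{e.name}" for e in catalog.find_by_tag("personal")]
print(protected)  # ['crm_postgres.customers', 'hr_fileshare.payroll']
```

The same lookup answers both questions raised above: which data needs protection, and where a suitable source for a given purpose can be found.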
Data catalogs and metadata management have grown in importance, especially in the wake of the European Union General Data Protection Regulation (GDPR), with its stricter requirements for handling personal data. Some of the vendors in this market started with products for privacy management and then implemented data catalogs and metadata management as core components to serve requirements other than "simple" data protection.
The cost of data usage can be reduced if users access the most suitable sources in a targeted way. The potential added value increases to match. At the same time, better data governance and data security help mitigate security risks and avoid potential compliance violations.
With data volumes growing and with ever more systems, and ever more kinds of systems, storing and managing that data, consolidation tools (more specifically, metadata management and data catalogs) and integration systems are required to keep track of information. The days when IT managers primarily thought of relational databases and SQL when it came to data are long gone. Massively growing volumes of data add complexity to information handling and raise the bar for a comprehensive data architecture.
Data Fabric
Data security and governance are central components in a comprehensive data architecture, or data fabric (Figure 1). Data architecture in this context explicitly refers to managing and using systems and their architecture and interaction, not to the information itself and its structures. Whereas data architecture at the information level is conceivable for isolated use cases, organization-wide approaches in the sense of an enterprise data model failed more than a decade ago.
A model for data management builds on the various sources ranging from traditional databases to business applications and their data storage systems to analytic applications. All of these sources often generate information themselves, which, in turn, can serve as a source for other usage scenarios.
The first integrating layer is metadata management and the resulting data catalogs, which provide an overview of what data (and of what quality) can be found where. However, state-of-the-art tools also provide more detailed information on the data lineage (e.g., the origin of the data) and let users evaluate the information and collaborate on it.
One level above is data integration and quality. Data integration products that follow the extract, transform, load (ETL) or extract, load, transform (ELT) approach support the integration of information from different sources and the conversion of data formats.
Data quality tools check the quality of information to supplement or correct the data as needed. External sources such as address databases are often accessed for this purpose. Master data management (MDM) builds on this foundation and delivers function- and industry-specific applications for handling information such as product data.
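The interplay of transformation and quality checking described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the three quality rules (including the five-digit postal code), and all function names are hypothetical, and a real product would additionally consult external reference data such as address databases.

```python
import re

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def transform(record):
    """Normalize formats: trim whitespace and lowercase the email address."""
    return {
        "name": record.get("name", "").strip(),
        "email": record.get("email", "").strip().lower(),
        "postal_code": record.get("postal_code", "").strip(),
    }


def check_record(record):
    """Return a list of quality problems for one record (assumed rules)."""
    problems = []
    if not record["name"]:
        problems.append("missing name")
    if not EMAIL_RE.match(record["email"]):
        problems.append("invalid email")
    if not re.fullmatch(r"\d{5}", record["postal_code"]):
        problems.append("invalid postal code")
    return problems


def run_pipeline(extracted):
    """Transform every extracted record, then split into clean and rejected."""
    clean, rejected = [], []
    for raw in extracted:
        record = transform(raw)
        problems = check_record(record)
        if problems:
            rejected.append((record, problems))
        else:
            clean.append(record)
    return clean, rejected


source_rows = [
    {"name": " Ada Lovelace ", "email": "ADA@Example.org", "postal_code": "12345"},
    {"name": "", "email": "not-an-email", "postal_code": "1234"},
]
clean, rejected = run_pipeline(source_rows)
print(clean)
print(rejected[0][1])
```

Rejected records keep their list of problems, so that they can be corrected or supplemented later instead of being silently dropped.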
One level higher still are the analytical applications and functions for data usage (e.g., for serving up user-specific content in digital services or for decision support).
Specific Data Protection
The functions of data governance act as an overall theme across the layers of a data fabric. One central topic in this context is data security, which in recent years has developed beyond individual technical solutions for security of classic, relational databases.
Database security continues to be an important subarea in this context, protecting the integrity, confidentiality, and availability of databases. It primarily involves functions that protect the information stored and processed on database systems, the underlying server and network infrastructure, and access to the information.
However, as infrastructures and technologies for processing and storing data change -- especially given cloud-native tools and the resulting hybrid infrastructures of modern and legacy approaches -- the requirements change. Modern data security products cover the following core functional areas:
- Vulnerability assessment: identifying potential points of attack, configuration errors, and other dangers.
- Data discovery and classification: knowledge of the data and classification of the data in terms of sensitivity (e.g., personal information); tools ideally build on existing infrastructures for metadata management and data catalogs.
- Data protection: encryption, tagging, and other technologies for both storage and data transfer.
- Monitoring and analysis: continuous monitoring of access to and the use of data, and analysis to detect and respond to anomalies, including integration with security information and event management (SIEM) tools.
- Threat prevention: guarding against targeted attacks such as SQL injection.
- Access management: targeted protection of privileged user accounts and dynamic, policy-driven access control; often handled by specialized applications.
- Audit and compliance reporting: automatically generated and ad hoc reports and dashboards for an overview of the current security status.
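As an illustration of the threat prevention item in the list above, the following Python sketch contrasts a query built by string concatenation with a parameterized query. It uses the standard library sqlite3 module and an in-memory database; the table, the column names, and the attack string are made up for the example, and dedicated data security products work at a different level (e.g., by inspecting database traffic).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)

# Input an attacker might type into a login or search form.
malicious = "nobody' OR '1'='1"

# Vulnerable: the input becomes part of the SQL text, so the injected
# OR clause is evaluated and every row matches.
unsafe_sql = "SELECT name FROM users WHERE name = '" + malicious + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safer: the placeholder passes the input as a plain value, so the query
# only looks for a user whose name is literally the attack string.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (malicious,)
).fetchall()

print(len(unsafe_rows))  # 2
print(safe_rows)         # []
```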
These tools are fundamental building blocks of a modern, secure data fabric and must be designed to support complex hybrid environments and multicloud platforms.
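Data discovery and classification, one of the functional areas listed above, can be approximated with simple pattern matching over sample values. The Python sketch below is a deliberately simplified assumption of how such a scan might work: the two regular expressions, the 50 percent threshold, and the labels are hypothetical, and real tools combine many more detectors with catalog metadata.

```python
import re

# Simplified detectors; real products use many more and validate matches.
PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "iban": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
}


def classify_column(values, threshold=0.5):
    """Return sorted labels whose pattern matches at least `threshold`
    of the non-empty sample values."""
    values = [v for v in values if v]
    if not values:
        return []
    labels = []
    for label, pattern in PATTERNS.items():
        hits = sum(1 for v in values if pattern.search(v))
        if hits / len(values) >= threshold:
            labels.append(label)
    return sorted(labels)


def classify_table(columns):
    """Classify every column and derive a coarse sensitivity level."""
    result = {}
    for name, values in columns.items():
        labels = classify_column(values)
        result[name] = {
            "labels": labels,
            "sensitivity": "personal" if labels else "unclassified",
        }
    return result


sample = {
    "contact": ["ada@example.org", "bob@example.com", ""],
    "account": ["DE89370400440532013000", "n/a"],
    "comment": ["call back Monday", "prefers email contact"],
}
report = classify_table(sample)
for column, info in report.items():
    print(column, info)
```

The resulting labels are the kind of information that would ideally be written back to the data catalog so that protection and monitoring functions can build on it.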
Data Governance
As mentioned earlier, data governance is more than just data protection and is best defined as an umbrella term covering various functions of protecting and controlling data and data usage. Two other important functional areas in addition to data security are:
- Privacy management for handling information that falls under the scope of the GDPR. Like the entire topic of data management, it is no longer just about structured data, but also about unstructured data that needs to be analyzed, managed, and protected.
- Data governance and risk, a subarea that focuses on concrete metrics and control functions for monitoring and improving compliance with defined rules for handling data. On the one hand, this sphere includes regulations in the area of data protection such as the GDPR; on the other hand, it encompasses other requirements for handling sensitive information, as well as internal rules for handling and protecting particularly critical and valuable data. Such tools typically dovetail with IT governance, risk, and compliance (GRC) products to feed data into higher level risk management.
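What a concrete metric with an attached control might look like is sketched below in Python. Everything in this example is an assumption made for illustration: the inventory fields, the two controls, and the 90 percent target are hypothetical, and real governance tools derive such figures from catalog and security data instead of a hand-written list.

```python
def compliance_rate(datasets, control):
    """Share of sensitive datasets that satisfy one control (0.0 to 1.0)."""
    sensitive = [d for d in datasets if d["sensitive"]]
    if not sensitive:
        return 1.0  # nothing sensitive, so nothing can violate the control
    passed = sum(1 for d in sensitive if control(d))
    return passed / len(sensitive)


# Hypothetical controls derived from internal rules for handling data.
CONTROLS = {
    "encrypted_at_rest": lambda d: d["encrypted"],
    "owner_assigned": lambda d: bool(d["owner"]),
}


def evaluate(datasets, target=0.9):
    """Compute each metric and flag whether it meets the assumed target."""
    report = {}
    for name, control in CONTROLS.items():
        rate = compliance_rate(datasets, control)
        report[name] = {"rate": rate, "ok": rate >= target}
    return report


inventory = [
    {"name": "customers", "sensitive": True, "encrypted": True, "owner": "sales"},
    {"name": "payroll", "sensitive": True, "encrypted": False, "owner": "hr"},
    {"name": "clickstream", "sensitive": False, "encrypted": False, "owner": ""},
]
metrics = evaluate(inventory)
for name, result in metrics.items():
    print(name, result)
```

Figures like these are the sort of data that could be passed on to a GRC product for risk management at a higher level.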
A good strategy for data governance is always built on integration of the specific functions with data security, as well as the underlying metadata management and data catalogs.
Flexibility and Coordination
Perhaps the biggest challenge in implementing a data fabric -- and in subareas such as metadata management or data governance -- is that people work with data everywhere in organizations. The multiple areas of use and numerous stakeholders make it difficult to avoid a proliferation of initiatives and technical approaches.
However, precisely here, well-thought-out and comprehensive approaches can help, including an architecture with associated operating models (e.g., target operating models, TOMs) for the data fabric and service-oriented approaches for providing the technical implementation. Experience shows that very few divisions in the corporate environment actually want to implement their own tools if they can turn to a functionally useful service with a suitable operating model. In other words, with a correctly implemented data fabric as an internal service, a corporation has a good chance of containing a significant amount of undesirable growth.
Communication is important so that the different divisions in the company know which services are available. Because the users and areas involved come from both IT and business, advisory support is required on top of technical services. In many cases, today's tools in the wider data management environment support functions for assessing data usability and for collaboration among users, data stewards, and administrative and technical users.
Conclusions
The importance of data security means corporations need to move away from isolated tools that are expensive and often cover important security and data governance requirements only partially or not at all. An additional consideration is that continually introducing new local solutions simply takes too much time. However, solutions must also serve the quite different requirements of stakeholders from business, IT security, data protection, and other areas.
To deal with data efficiently and effectively, corporations need a strategy in which a holistic view of the required elements (e.g., a data fabric) plays a central role. Individual approaches in the area of analysis are not enough. IT managers need to implement both the foundations with metadata management and data catalogs and the interdisciplinary functions of data governance and data security correctly to be able to work optimally with data and generate the desired added value.
This article originally appeared in ADMIN magazine and is reprinted here with permission.