Data Integration
The business results that come from integrating cross-enterprise data into a single actionable view are substantial. Data integration drives real long-term cost reductions, improved operational efficiency, enhanced competitive advantage, streamlined regulatory compliance, and increased productivity.
Most businesses can benefit from achieving a single coherent view of their enterprise-wide data resources. This view must be based on a blend of heterogeneous data (relational, delimited text, etc.) from various sources and locations, including real-time, historical, and event-based data. Integrating cross-enterprise heterogeneous data opens up substantial untapped potential for innovation, because it enables the next generation of focused, more effective applications.
The ability to customize and propose real-time offers on the fly to customers who are calling in to change their billing information (for example, to change an address or update credit information) is a powerful competitive differentiator. Next-generation customer experience applications will require the near-real-time integration of distributed heterogeneous data sources. The need to achieve and maintain a distinctive competitive advantage is driving the move to implement cross-enterprise data integration. Beating the competition is key, and getting more out of your data in order to advance the business is a requirement. Additional data integration drivers include data divergence resulting from increasing globalization, increasing competition from all corners of the globe, government compliance requirements, risk mitigation, and the reduction of operational risk (lowering the risk of being in business). These are only a few of the reasons why IT and business professionals should be concerned with integrating complex, highly distributed, and heterogeneous cross-enterprise data flows into a single view.
Delivering the true business value that cross-enterprise data integration promises is a task that is too complex and difficult for the majority of stand-alone products and vendor-centric data hubs. Some point products can give you a degree of control, but at the end of the day you are still faced with a fragmented approach to building the requisite data flows. If all you are trying to do is something very simple, such as pushing data from transactional systems into a data mart, today's tools are adequate. Caveat emptor: as you start to build more sophisticated data flows, or want to change things as the business grows, in many cases you may not have the choice or flexibility you need. What works well on a departmental project basis is not typically the best enterprise-wide data integration approach.
Ideally, the people in an organization who are thinking about what the data is and how to leverage it should not have to worry about how to get it. People who build, deploy, and use applications know that the real objective of an application is converting data into actionable or useful information. There have been notable failures of large ERP and CRM applications where people did not get the benefit they wanted because of data-related problems. Data, the coin of the realm for most enterprises, brings with it a unique set of requirements, needs, and expectations. Applications, defined as the automation of business logic, undergo continual change as businesses evolve. When you start down the road to implementing SOA, abstracting data from applications is a requirement.
Executives are beginning to understand the benefit of having access to the complete range of corporate data without having to be exposed to its underlying complexity. In today's competitive environment you would be ill advised to hard-code a particular method of integration into a given application. If you are only dealing with historical data you may be fine, but what happens in the future when you need to add real-time data to the integration mix? For example, if you have a data warehouse with extremely clean data and then you acquire another company, you will need to blend the loose data that you don't trust from the new company with your existing clean data in order to generate a set of consolidated financials that meets government requirements. What you need is choice and flexibility in terms of what your data is and how you get it, without sacrificing the ease, speed, low cost, and low complexity that come with a simpler method.
What is required is an integrated, tool-driven approach in which you design the lifecycle of the data flow in a flexible way by modeling metadata and then reusing the models and the metadata themselves as required. In this way you can achieve better consistency and remove variability from the process, resulting in a faster, more controllable outcome while retaining the choice and flexibility that you need.
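As a rough illustration of reusing metadata across a data flow, the sketch below defines one mapping per source and runs both through the same generic function. All names here (CRM_MAPPING, BILLING_MAPPING, apply_mapping, the field names, and the sample records) are hypothetical and chosen only for illustration; a real tool would manage far richer metadata.

```python
import csv
import io
import sqlite3

# Hypothetical mapping metadata: target field -> (source field, converter).
# Each mapping is defined once and reused by the same generic flow.
CRM_MAPPING = {
    "customer_id": ("id", int),
    "name": ("full_name", str.strip),
    "city": ("city", str.title),
}
BILLING_MAPPING = {
    "customer_id": ("cust_no", int),
    "name": ("cust_name", str.strip),
    "city": ("town", str.title),
}


def apply_mapping(record, mapping):
    """Convert one source record (a dict) into the target shape."""
    return {target: convert(record[source])
            for target, (source, convert) in mapping.items()}


def read_delimited(text):
    """Yield records from delimited text as dicts keyed by header."""
    yield from csv.DictReader(io.StringIO(text))


def read_relational(conn, query):
    """Yield records from a relational source as dicts keyed by column."""
    cursor = conn.execute(query)
    columns = [d[0] for d in cursor.description]
    for row in cursor:
        yield dict(zip(columns, row))


def build_single_view():
    """Blend a delimited-text source and a relational source into one view."""
    crm_text = "id,full_name,city\n1, Ada Lovelace ,london\n"
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE billing (cust_no TEXT, cust_name TEXT, town TEXT)")
    conn.execute("INSERT INTO billing VALUES ('2', 'Alan Turing', 'manchester')")
    view = [apply_mapping(r, CRM_MAPPING) for r in read_delimited(crm_text)]
    view += [apply_mapping(r, BILLING_MAPPING)
             for r in read_relational(conn, "SELECT * FROM billing")]
    conn.close()
    return view
```

Adding a third source in this sketch would mean writing one more mapping and one more reader; the transformation logic itself would not change, which is the consistency benefit described above.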
In the future, users will no longer have to be limited by or concerned with the physical or virtual location of data, which application it comes from, or even whether the data is structured or unstructured. As data integration matures and evolves, it will have to accommodate an increasingly wide range of data types. Ultimately no data, regardless of type or location, will be beyond integration. Initially, operational data coming from transactional systems and historical data coming from data warehouses and data marts will be integrated into a single actionable view.
Techniques
For the most part, the full range of data integration challenges can be addressed through data consolidation, data distribution, or data federation.
Enterprises today are using various disconnected techniques for data integration, such as data federation, replication, and ETL. They will benefit from adopting a metadata approach to defining data transformation. Having defined data models, they are now beginning to focus on integrating these models into data flows, each of which moves one set of data into a different kind of structure. Replication is more real-time focused, whereas ETL is focused on data transformation, data validation, and data cleansing. All of the techniques share a common set of metadata.
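The difference between consolidation and federation can be sketched in a few lines. In this minimal example, which assumes two hypothetical in-memory sources (orders_eu and orders_us) and a shared transformation (tag) standing in for common metadata, consolidation copies the data into a target store, while federation leaves the data in place and reads it at query time.

```python
# Two hypothetical source systems, modeled as in-memory lists of dicts.
orders_eu = [{"order_id": 1, "amount": 120.0}]
orders_us = [{"order_id": 2, "amount": 80.0}]
SOURCES = {"eu": orders_eu, "us": orders_us}


def tag(record, region):
    """Shared transformation: add the originating region to a record."""
    return {**record, "region": region}


def consolidate(sources):
    """Consolidation: copy transformed records into one target store."""
    return [tag(r, region) for region, rows in sources.items() for r in rows]


def federate(sources):
    """Federation: return a query function that reads sources on demand."""
    def query(min_amount=0.0):
        return [tag(r, region)
                for region, rows in sources.items()
                for r in rows if r["amount"] >= min_amount]
    return query


warehouse = consolidate(SOURCES)      # snapshot taken now
federated_query = federate(SOURCES)   # nothing is copied yet

orders_us.append({"order_id": 3, "amount": 200.0})  # a source changes later

snapshot_count = len(warehouse)       # still reflects the earlier load
live_count = len(federated_query())   # includes the newly added order
```

The consolidated copy stays unchanged until the next load, while the federated query sees the new order immediately. Both paths reuse the same tag function, which mirrors the point that the techniques share a common set of metadata.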
Data Integration Solution Vendors
Data integration is a hot market space because solving this problem is one of IT’s top concerns and the solution offers significant value to customers. Because of this attraction, many vendors offer “point” solutions, and some of the larger players are investing in several data integration areas, including data hubs, master data management, integration suites, and metadata (modeling and adaptable data integration). In fact, any vendor involved in data or in any form of integration has some data integration capability. The market is quite fragmented.
Many vendors in the data integration space today offer what amounts to a single data integration technique rather than a total solution. For the next couple of years these companies may have reasonable businesses, because data integration is a hot area and enterprises are primarily working on individual data integration projects. However, since data integration techniques are on a convergence course, these point-solution vendors will have to be acquired, expand, or go out of business. Vendor-centric data hub solutions, while attractive in theory, are not always the best fit for companies with disjointed, heterogeneous data environments comprising a diverse mix of hardware systems, operating systems, and data types, because these typically single-vendor solutions do not easily accommodate the integration requirements of heterogeneous legacy information systems.
In the past, business intelligence decision-making was done on the basis of static historical data, where latency was not an issue. Today, operational analysis and decision-making are based more and more on near-real-time data, and the data that is needed comes from all across the organization. In the world of data convergence, data is collected in a wide range of formats. All of this data must be extracted, transformed, and loaded into a data warehouse. Once data is integrated and placed into a data warehouse, it needs to be made accessible through a relational format, a file system format, or a web services format.
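The three access formats mentioned above can be illustrated with a small sketch. It assumes an in-memory SQLite database standing in for the warehouse, a hypothetical sales table, and JSON as the web service payload; none of these choices come from the text, and a production warehouse would look quite different.

```python
import csv
import io
import json
import sqlite3


def load_warehouse(rows):
    """Load already-transformed rows into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, total REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn


def as_relational(conn):
    """Relational access: an ordinary SQL query."""
    return conn.execute(
        "SELECT region, total FROM sales ORDER BY region").fetchall()


def as_file(conn):
    """File access: the same rows rendered as delimited text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["region", "total"])
    writer.writerows(as_relational(conn))
    return buffer.getvalue()


def as_service_payload(conn):
    """Web service access: the same rows rendered as a JSON document."""
    return json.dumps(
        [{"region": r, "total": t} for r, t in as_relational(conn)])


conn = load_warehouse([("north", 10.5), ("south", 7.0)])
```

All three functions read from the same integrated table, so consumers can pick the format that suits them without the data being integrated more than once.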
Conclusion
Data integration is needed wherever enterprises have to bring multiple sources of data together for a business application. Data has been combined for many years for traditional data warehouses and business intelligence. However, new applications – especially those combining near-real-time and real-time information from multiple sources – are beginning to be critical to business success.
Increasingly, businesses will come to require a single coherent view of their enterprise-wide data resources based on a blend of heterogeneous data (relational, delimited text, etc.). The business innovations to be realized by integrating cross-enterprise heterogeneous data, and so enabling the next generation of focused, more effective applications, are substantial.
The people who will initially benefit the most from this drive to integrate are business users and business analysts. Through the use of effective data integration techniques and technologies, business users will benefit from a single view of customers and products across their organization. They will be able to pull together data from multiple sources in order to meet compliance requirements without creating more work and cost by pushing compliance down to individual divisions of the company. Business analysts who are looking for trends and correlations between products, sales, and promotions will be able to do a better job because the information they require for their analysis will be readily available to them. Business users who are tasked with creating better products and services in order to compete effectively for customers, market share, and revenue will benefit as a direct result of data integration. They will be able to identify the regions where they are doing well, the regions where they are not, and why, and then find correlations with product sales and promotions in order to drive effective sales and promotion solutions.