Sunday, March 22, 2009

Data Integration Tools

Data Integration Tools

Tags: Extract, transform and load (ETL), Enterprise data integration (EDI), Service-oriented architecture (SOA) data services, Data integration tools, VIEW ALL TAGS

These Apps Can Help You Know Your Customers

CRM (customer relationship management) is tricky business. First there are the CRM tools themselves: big, bulky systems from mega-vendors such as Siebel, Oracle, and SAP, or lesser players such as Vantive and Clarify. Installing them can be a major project; teaching users to use them can be even bigger.

As if that weren't enough, most firms store their data in different silos across the enterprise, each with its own format and subset of data. The result? Customer data—your lifeblood—is often fractured, flawed, and messy.

What's more, companies can't build unified views of data across their many systems, meaning they can't get a grasp on who their customers are, what they buy, when they buy it, how they buy it, and most crucially, why they buy it.

A Triad Of Tools

Of course, there are tools to help you cure your data disease. In fact, there are many, but the top three are ETL, EII, and DQ.

ETL. Short for extract, transform, and load, ETL tools are used to grab data from one database and merge it into another. The first step, extraction, reads data from the first database and stores it in memory. Next comes the transformation phase: The ETL tools convert the data into a form that matches the new database. Last is loading, in which the tool writes the data to the target database.

Of course, a tool that can do this has a broader range than simple data migration. ETL can be used to form data marts (a database for a single department or division) and even the odd data warehouse (a larger system that's used across the enterprise and often pools the data from several database systems at once).

EII. Next on the list of tools is EII, short for enterprise information integration. As the name implies, these tools are used to merge data from different systems across the enterprise and show the data in a single, unified (also called "federated") view, as though it came from a single source. Unlike its cousin ETL, EII does not write to a new database or change any data source. Instead, it holds the data in a cache that lets you read—but not write—different records.

DQ. Last is DQ, or data quality tools. These are the simplest of the three. They're used to "scrub" or "cleanse" dirty data (data that's missing fields or contains other errors, such as spelling mistakes).

DQ is utterly needed to keep a clean database that you can rely on (after all, it's awkward to send a form letter to a customer and spell his name wrong). But it's merely a first, and partial, step in cleaning your data. For quality business use, cleansed data needs to be carefully vetted by human eyes, a costly step that not every enterprise can afford.



The Goal

Here's yet another acronym for you to master: CDI. Short for Customer Data Integration, it refers to the process of building a customer hub: a single, central place to store and view data from multiple (and often disparate) systems across the enterprise. A customer hub is one of the seminal concepts of data integration, as it can be used as an end product in itself (a tool for users such as managers and business strategists to view and vet data) or a source for downstream applications (a "feeder" that sends data to other tools, such as analytics and finance software).

You can buy a soup-to-nuts CDI solution from several vendors, the most famous of which is Siperian (www.siperian.com), whose Master Reference Manager and Activity Manager let you troll internal and external silos for customer data and integrate that data into a single, sensible hub.

Or you can build one of your own using the tools mentioned above. After all, ETL, EII, and DQ are meant to gather and clean data. But do they work for such a daunting task as CDI? Remember, the average enterprise can have upward of 20 data silos in just as many departments.

The answer is yes: ETL, EII, and DQ tools can help you with CDI, but they're largely inefficient and ill-suited to the task. First, these tools support only one data modality (or method of moving data from place to place): batch or real-time. Yet most CDI apps need both because there are times when you'll need to update the hub with a high-volume batch process that loads data into it from a new silo, as well as times when you'll need to retrieve single records in real-time. And second, none of the ETL, EII, and DQ tools uses a metadata framework. Metadata, or data about the data in your database, lets data stewards and others quickly configure and extend your data systems without having to write reams of code.

Anurag Wadehra, vice president of marketing for Siperian, notes that using ETL, EII, and DQ for customer data integration is like "having a plumber build your house. Like plumbing in a house, the tools that push data through the pipes don't represent the overarching blueprint needed for CDI architecture."

It's a point well taken, and one you'll need to ponder as you begin to integrate your customer data. Proceed with care: It's one of the hardest tasks that IT can face.

by David Garrett

Data Lifecycle Management

Data, like everything else in this world, has a lifecycle, from the time it's first written to a database to the time it grows obsolete and gets deleted from the system.

Short for data lifecycle management, DLM tools let you manage data from youth to old age, storing newer, more vital data on faster servers or systems, and moving old, rarely used data to slower, cheaper storage, such as tape.

CDI (customer data integration) and DLM systems need to be linked in the enterprise, since the former unifies data from different silos into a customer hub, and the latter decides how and where to store the hub's information. When planning your CDI architecture, be sure to use experts who can take DLM into account.



CDI Vendors

CDI (customer data integration) is the process of building a single, central place to store and view data from many systems across the enterprise. Its major vendors include:

Acxiom | www.acxiom.com ; (501) 342-1000

Ascential | www.ascential.com ; (508) 366-3888

Journee | www.journee.com ; (512) 634-5111

Siebel | www.siebel.com ; (650) 295-5000

Siperian | www.siperian.com ; (650) 350-2200



RELATED CONTENT
Extract, transform and load (ETL)
Data integration costs hurting companies with poor negotiation skills
Wholesaler turns to data integration adapters for mainframe to Oracle migration
Pervasive the most persuasive, cost-effective data integration platform, study says
Integration competency centers centralize data integration projects
Data warehouse case study: Midsized insurer goes enterprise with data warehouse
Enterprise data integration quiz
Data migration planning: Key things to remember
The ETL process and MySQL
Gartner data integration Magic Quadrant 2007: Platforms, market expand
ETL tools and EDR tools: What's the difference?

Enterprise data integration (EDI)
Data integration costs hurting companies with poor negotiation skills
Wholesaler turns to data integration adapters for mainframe to Oracle migration
Exec explains IBM's Information On Demand (IOD) initiative
SaaS-based data quality and integration tools gaining momentum
Exploring key trends in data integration for data warehouses
Pervasive the most persuasive, cost-effective data integration platform, study says
IBM unveils new data management software, services
How to develop a sustainable data integration plan for business intelligence projects
Data integration certifications: Finding the value
Data mashups meet business intelligence: "Bashups" explained

Service-oriented architecture (SOA) data services
How to complete the MDM requirements-gathering analysis process
Event-driven architectures: Understanding concepts, benefits and the bottom line for data management
Event-driven architectures' implications for enterprises and data management professionals
What are the components of service-oriented architecture (SOA)?
What's the difference between SOA and Web services?
Pervasive the most persuasive, cost-effective data integration platform, study says
Master data management as a service-oriented architecture enabler
Understanding master data management and service-oriented architectures, with IBM's Dan Wolfson
SOA governance best practices
A growing integration service providers market gives customers better choices, lower prices

RELATED RESOURCES
2020software.com, trial software downloads for accounting software, ERP software, CRM software and business software systems
Search Bitpipe.com for the latest white papers and business webcasts
Whatis.com, the online computer dictionary