Wednesday, December 26, 2012

Customer Data Intergration Software

 Customer Data Integration: Creating a Single Version of the Truth

Effective customer management happens with online, accurate, integrated and up-to-date customer information. In most organizations, customer information is distributed and duplicated across various applications, and it is difficult to get a single version of the truth. To enhance customer management, organizations are increasingly investing time and money into customer data management, with the customer data integration (CDI) system. CDI involves the integrating and unifying of customer information from disperse and heterogeneous business applications. This integrated customer repository becomes the central repository of customer information being used by various applications, and produces the true view of the customer. 

The customer is the heart and soul of an organization. This is the central entity around which business of an organization revolves. The customer is the golden nugget on which the survival and prosperity of an organization depends. The entire universe may be considered as the customer base of an organization in a true sense. The customer can be viewed in various perspectives, three of which are shown in Figure 1.



Figure 1: Customer Classification
From the business interaction angle, customers can be:
  • Account holders: The existing customers possessing active, inactive, dormant or closed accounts.
  • Organizations: The internal organization can be analyzed at various levels of granularity and functions, e.g., the employees themselves can be account holders.
  • Partners: These are channels as well as business supporters. Partner behavior can be studied to understand their needs through the business events. This can be effectively utilized to earn loyalty as well as to grow business.
  • Competitors: In order to strategically frame the business, competitor activities need to be monitored.
Based on status, the customers can be classified as:
  • Active customer: One whose account is active and running.
  • Inactive customer: Customer having valid account but has not used in a period of time.
  • Dormant customer: Individual or corporation with whom organization did business in the past.
  • Prospective customer: An individual or corporate that can be targeted as a potential customer.
Based on the organization, the customer can be grouped as:
  • Individual: A specific person of interest to the organization, i.e., employee, agent or dealer.
  • Corporate: A group of individuals who have banded together for a commercial purpose.

CDI Overview

Organizations have to build strong customer relationships to stay competitive and grow in today's market. Effective customer management mandates easy and quick access to up-to-date and accurate customer information. To enhance customer management, organizations are investing time and money toward building an integrated central customer repository, which can provide the online, accurate, integrated and up-to-date customer information.
Customer data integration (CDI) is an approach toward integration and unification of the customer information from disperse and heterogeneous business applications. CDI processes consolidate customer information from all available sources, such as operational systems, call centers, customer relationship management (CRM) and data warehousing (DW) applications, and ensures the access of the current and complete view of customer information to the relevant departments/ business groups.



Figure 2: CDI Context Diagram
A successful CDI solution helps organizations in:
  • Effective customer management by providing a timely and accurate understanding of customer needs and behaviors.
  • Improved cross-selling and up-selling opportunities by understanding the prospective customers.
  • Removing duplication and misleading customer information and providing single version of truth across the various business units of an organization.
  • Providing effective campaign management.
  • Complying with legislation, regulations and privacy requirements.
  • Optimizing operational, maintenance and enhancement cost by having a central integrated environment (hardware, software).

Customer Data Integration - Challenges

In most organizations, since customer information is distributed across various applications, the unification and integration of customer information from heterogeneous and dispersed applications is a big challenge. Forester Research has found that though 92 percent companies say that having an integrated customer application is critical or important, only 2 percent have managed to achieve this. There are numerous challenges faced during customer data integration:
Duplicate Customer Data Duplicate customer records hinder the organization's ability to identify the customer uniquely and correctly. Duplicate records also cause problems in relating customer transactions to a single customer record. Also, it becomes difficult for the customer service representative to correctly understand the history of interactions made with a customer. The other significant drawback of duplicate records is that it causes duplicate campaigning. Key factors influencing data duplication issues are:
  • Local maintenance and storages of customer information in an individual application;
  • Inorganic growth of the organization (merger and acquisition) resulting in heterogeneous processes and systems to maintain and support customer information;
  • Different customer details fed through different channels (Web, telephone, etc.);
  • Data entry error;
  • Relaxed data entry service level agreements (SLA) and audit;
  • Lack of briefing, training and education to the customer service/front-end staff about the important and significance of customer data fields.
Individual systems have their own way of maintaining customer details, which lead to ambiguous and duplicate customer information. Data entry error or inconsistent data entry by customer service agents leads to an ambiguous data set. To compound the problem, the customer maintains different contact details when they are interacting through different channels. For example, the name and address can be captured in the various ways, which leads to duplicate or inconsistent customer details within and across the applications.
Inconsistent and Inaccurate Data Inconsistent and inaccurate customer data limits the organization's ability to understand and analyze the customer. This leads to poor decision-making that causes customer dissatisfaction. This inconsistent and inaccurate data set can generate a different version of the customer information and defeat the prime purpose of CDI, which is to produce a single version of the truth. It also leads to data reconciliation issues and affects the functioning of the business applications. Key factors influencing consistency and correctness issues are:
  • Lack of common metadata control: Distributed and disintegrated customer metadata across application can lead to the inconsistent definition of the customer.
  • Clerical errors (data entry error): Inaccurate and insufficient data entered by data entry operators or call center agents leads to data sufficiency and accuracy issues.
  • Lack of data ownership, infrequent audit and relaxed SLA. Based upon business needs, individual business groups (sales, operations, marketing, human resource, etc.) primarily focus on the given subset of customer information. The data fields not being used by given business groups can contain a default or meaningless value in the database. Inconsistent domain range and or default values definition can generate the data consistency issues, i.e., default values for birthdate can be different for different departments. The missing data fields cause data sufficiency issues.

With the increasing volume and velocity of data, managing data growth and maintaining the latest and accurate customer information is a challenging task. Decayed and old data contains no value to the business. The two major areas of data management are growth and latest data management.

Growth management. Business applications generate millions of customer records every year. Inefficient data management and storage can have an adverse affect on the performance and usability of the application. Factors contributing data growth are:
  • Nature of operational systems (business applications),
  • Inappropriate historical data management strategy,
  • Lack of data archival and housekeeping strategy, and
  • Inappropriate reference data management strategy.
Current data management. Customer information changes over time. The CDI application should track the changes and maintain the most current customer information. Factor contributing information changes are:
  • Changes in customer credentials;
  • Changes in customer contact details; and
  • Changes in customer demographic, psychographic and geographic details.

The CDI Solution Architecture

A customer data integration (CDI) system is the central application to capture, integrate and distribute customer information. The goal of a CDI application is to integrate customer information from different applications with minimum latency. Based on the needs of an organization and the dynamism of the customer data, the CDI architecture can be implemented either using batch processes (ETL - extract, transform and load) or using real-time messaging (EAI - enterprise application integration).
Figure 3: CDI Logical Architecture (Hub-and-Spoke Model)

A CDI system extracts customer information from disperse applications and performs data cleansing, customer matching (deduping) and integration as per the predefined cleansing, matching and integration rules. The central repository contains the integrated customer data with different views of customer information. The data access interface defines the data access mode, restriction and privileges. Business applications and user communities can access only that data set they are authorized to. Business rules (data cleansing, customer matching, data integration and data access rules) can be stored in the central metadata repository or reside in the individual tools repository.

Conceptual Data Model (CDM)
CDI is an application to store and distribute meaningful customer information. The CDI data model contains customer and related entities. The generic CDM is illustrated in the Figure 4.

Figure 4: CDI Conceptual Data Model

Customer and customer classification. A Customer is a Person or Organization of interest. Customers enter in a relationship with other customers. The nature of this involvement is used to determine whether a specific customer in an external customer, employee, supplier, partner or a competitor. The customer can be viewed from various perspectives as discussed in the beginning of this article.

Customer relationship. This entity stores relationship between two customers. Customer relationships can be categorized as personal or professional, e.g., Parent-Child, Employer-Employee, etc.

Customer contact. This entity captures the customer contact details. Customer contact details can be the physical address, telephone contact and electronic information. The postal address can be subgrouped as current address, permanent address, office address, bill-to-address and ship-to-address. Telephone contact consists of home phone number, office phone number, cell number, corporate office number and local office number. Electronic address consists of personal, office and corporate email ID.

Customer details. This entity captures the customer demographic, psychographic and geographic details. This information can be used for customer segmentation and analysis.

The demographic details to be captured are gender, age group, marital status, number of children, profession, income group, other financial details, etc. The psychographic information captured includes channel preference, privacy specifications, market research, etc. The geographic details to be captured are location (country, region), population groups, country development status, primary currency etc.

Customer accounts. A customer account is a contractual relationship between a customer and an organization and is associated with a given product or services. The account entity stores the account details, account type, account status and other related information.

Customer household and household details. Households are the collection of existing or prospect customers. Households and their demographic, psychographic and geographic information will help in understanding the associated patterns and defining the proactive campaign management.

CDI Processes
CDI processes facilitate the consolidation and unification of disparate customer data into integrated and meaningful customer information. The key driver for customer data integration is to provide the true view of customer. The process steps involved in the customer data integration are data acquisition, data cleansing, data integration and data management.


Figure 5: CDI Processes

Data Acquisition
The data acquisition phase helps in understanding the customer data and defining the data extraction strategy. It involves the identification, analysis and extraction of customer data from various business applications (operational systems). A detailed study of source data is performed to understand the data format, characteristics, pattern and usability. A data extraction strategy and approach is defined to extract the relevant customer information from source systems.

Data Cleansing
The data cleansing phase encompasses the processes and procedures for data correction and standardization. Data correction is the process of fixing, spelling and correcting the address, ZIP code, Social Security number and permanent account number. Once the data has been corrected, it needs to be standardized according to the predefined data format and structure through a data standardization process such as storing the Social Security number as 999-99-9999.

The data integration phase includes the processes for matching, merging and linking of customer information. This involves the following processes:
  • Customer matching and linking - Customer data is deduped to remove the duplicate customer records and generate a single customer record valid across the business applications (source systems). Also, customer records get linked with the other related records, i.e., households and organizations.
  • Data transformation and integration - Data will be transformed and integrated to produce the true view of customer. On a need basis, in-house customer information will be integrated with external third-party customer data set (e.g., Dun & Bradstreet, Experian) and produce the integrated customer database with various data access views.
Data Management
Data management includes the processes for monitoring and maintenance of customer data, which is dynamic by nature and changes over time. It requires periodic data monitoring and maintenance to keep the up-to-date customer information available.

Data monitoring processes periodically analyze customer data to understand any changes in the customer information. Data maintenance makes the latest information available and archives the old data set. The archived data set is required to reproduce the snapshot of customer information at any given point in time.

Customer Data Integration Architecture



As above figure depicts, building a customer data hub requires both bulk data movement from ERP, CRM and other operational systems as well as transaction level data validation and customer master management from the customer touch points.

In reality, most data is generated by the operational systems, such as an SAP R/3 system or a Siebel application. Customer name and address data will be maintained by the various operational components that need to communicate with the customer. These systems perform tasks such as invoicing, campaign execution and shipping – each of which provide customer touch points that can aggregate more customer information. One approach for maintaining data integrity would be to attack the problem at the operational system level. This seems to be a practical approach. After all, operational systems are the place where detailed transactions are completed. However, these applications are dedicated to performing one function that represents specific business requirements. The data collected in this environment is a by-product of the transactions that have been executed, and for the most part, the applications found here are not integrated with any other applications. Furthermore, each application is its own standalone environment and is optimized for the particular needs of the application. While this data is optimized for the operational system, to fully understand your customer, you need to consolidate that data, by customer, into a single customer-centric database. 

 
The goal of CDI is to provide the best information from the combination of the customer systems. By combining the systems, you know the customer at each touch point across every line of business. This requires an accurate, coherent customer view. Specifically, the goal is to:

Resolve customer data duplications and ambiguities throughout the entire enterprise.
Supplement gaps in the knowledge of customers from external sources.
Support customer data extraction and creation of an integrated customer database



Customer Data Intergration Techniques

Techniques for managing complexity

Attributes and their values can become extremely complex and dynamic due to the many changes individuals go through. Multiply all these fields by the millions of records a business or organization may have in its data sources, then factor in how quickly and how often this information changes. The Data Warehousing Institute (TDWI) says: “The problem with data is that its quality quickly degenerates over time. Experts say 2% of records in a customer file become obsolete in one month because customers die, divorce, marry and move.”[1]
To put this statistic into perspective, assume that a company or charity has 500,000 customers, donors or prospects in its databases. Cumulatively, if 2% of these records become obsolete in one month, 10,000 records go stale per month; or 120,000 records every year. Within two years about half of all the records may become obsolete if left unchecked.
Peppers and Rogers[who?] call the problem, "an ocean of data"[this quote needs a citation]. Jill Dyche and Evan Levy, gurus in this field[citation needed], have boiled the challenges down to five primary categories:
  1. completeness – organizations lack all the data required to make sound business or organizational decisions
  2. latency – it takes too long to make the data valuable: by the time of use, too much has become obsolete or outdated (slowed by operational systems or extraction methods)
  3. accuracy
  4. management – data integration, governance, stewardship, operations and distribution all combine to make-or-break data-value
  5. ownership – the more disparate the owners of the data-source owners, the more silos of data exist, and the more difficult it becomes to solve problems

History of customer data integration

In the late 1990s Acxiom and GartnerGroup coined the term "customer data integration" (CDI).[citation needed] The process of CDI, as Acxiom and Gartner described it, includes:
  1. cleansing, updating, completing contact-data
  2. consolidating the appropriate records, purging duplicates and linking records from disparate sources to enable customer or donor recognition at any touch-point
  3. enriching internal and transactional data with external knowledge and segmentation
  4. ensuring compliance with contact suppression to protect the individual and the organization
As of 2009, service providers deliver CDI as a hosted solution in batch volumes, on demand using a software as a service (SaaS) model, or on-site as licensed software in companies and organizations with the resources to drive their own data integration processing. CDI enables companies to optimize merchandizing (assortment, promotion, pricing and rotation) based on demographics, lifestyle and life-stage, to ensure inventory turn and to reduce waste.[citation needed] CDI also aids companies and organizations in choosing the best location for new branch offices or outlets.[citation needed]
CDI commonly supports both customer relationship management and master data management, and enables access from these enterprise applications to information confidently describing everything known about a customer, donor, or prospect, including all attributes and cross references, along with the critical definition and identification necessary to uniquely differentiate one customer from another and their individual needs.

Customer Data Integration Software

When companies and organizations wish to compile all of their customer or consumer information into one client, they use customer data integration software. Customer data integration software is used to integrate customer addresses, sales, demographics, customer needs, and features that would appeal to certain customers into one interface system so that the business entity can view all of this information at once in order to speed up production and make more sales. In this article, we will look at customer data integration software and various products available on the market.

In data processing, customer data integration (CDI) combines the technology, processes and services needed to set up and maintain an accurate, timely, complete and comprehensive representation of a customer across multiple channels, business-lines, and enterprises — typically from multiple sources of associated data in multiple application systems and databases. It applies data-integration techniques in this specific area.

What is Customer Data Integration Software
Customer data integration software is used to integrate customer information into one user-friendly console so that companies, organizations, and small businesses may review the information without wasting time by searching through large databases or files. Customer data integration software is generally top of the line but some software is more expensive than others. While many of these programs are similar in nature, they each have slightly different functions and display methods that may make it easier to view customer data. In order to find a customer data integration software that is good for you and your company, you will need to preview several customer data integration programs and select one that your company likes.

Popular Customer Data Integration Software
Customer data integration comes in various forms and is sometimes hard to distinguish from other forms of data integration such as application integration. Many companies that provide other forms of data integration, however, also provide customer data integration software. For this reason, we have compiled a small list of customer data integration software for you to review. The following is that list.

IBM WebSphere Customer Center
The IBM WebSphere Customer Center is a very powerful and user-friendly customer data integration software that will get the job done without taxing your patience. The IBM WebSphere Customer Center comes with over 500 individual services and functions to help you manage your customer information. The software is also based on open-source data so its services are constantly being updated and there is much support available on the Internet for this software. Some of the key features of this software is the ability to recognize and process duplicate customer data as well as the ability to integrate IBM WebSphere Customer Center with your other enterprise software.

Adeptia Customer Data Integration Accelerator
The Adeptia Customer Data Integration Accelerator is more than just a customer data integration software. It not only allows you to store and process information from customers but also from distributors, manufacturers, suppliers, and even your partner companies. The Adeptia Customer Data Integration Accelerator is able to process files from multiple sources and even in multiple formats so that you will never have to worry about converting your data to the same type of file. The Adeptia Customer Data Integration Accelerator works with databases, email clients, and a large number of applications which makes it capable of cross-platform functionality.

SAS
SAS is first and foremost a data analysis program but it can also be used as a customer data integration software. SAS is able to process large amounts of customer information and present it in readily-available reports. These reports allow you to not only view the information that you need but also to manage that data in a powerful way. SAS can be used to compile all customer information into one spot and then make predictions about your company's future.

Altova MapForce
Altova MapForce is a data mapping software that is also capable of processing and integrating data into one interface. Altova MapForce can also be used to create customer data integration software from a number of applications and integration methods. Altova MapForce can collect data from databases, spreadsheets, and other documents and sources. MapForce can then build an entirely customized customer data integration software by allowing the user to combine functions and tools. Altova MapForce can be combined with other services such as Visual Studio and Eclipse to build even more advanced software. Altova MapForce also allows the user to build web-based applications which can be used by the entire company from one online location.

Siperian
Siperian, also known as Informatica, is a source for customer data integration software as well as other forms of data integration. Siperian specializes in data integration and can be used by almost anyone, whether they have experience in customer data integration or not. Siperian offers many of the same features that other customer data integration software does but executes their services in a way that has more functionality and support.

Wednesday, December 19, 2012

SQL Transformation in Informatica


SQL Transformation is a connected transformation used to process SQL queries in the midstream of a pipeline. We can insert, update, delete and retrieve rows from the database at run time using the SQL transformation.

The SQL transformation processes external SQL scripts or SQL queries created in the SQL editor. You can also pass the database connection information to the SQL transformation as an input data at run time.

The following SQL statements can be used in the SQL transformation.
  • Data Definition Statements (CREATE, ALTER, DROP, TRUNCATE, RENAME)
  • DATA MANIPULATION statements (INSERT, UPDATE, DELETE, MERGE)
  • DATA Retrieval Statement (SELECT)
  • DATA Control Language Statements (GRANT, REVOKE)
  • Transaction Control Statements (COMMIT, ROLLBACK)

Configuring SQL Transformation


The following options can be used to configure an SQL transformation
  • Mode: SQL transformation runs either in script mode or query mode.
  • Active/Passive: By default, SQL transformation is an active transformation. You can configure it as passive transformation.
  • Database Type: The type of database that the SQL transformation connects to.
  • Connection type: You can pass database connection information or you can use a connection object.

We will see how to create an SQL transformation in script mode, query mode and passing the dynamic database connection with examples.

Creating SQL Transformation in Query Mode


Query Mode: The SQL transformation executes a query that defined in the query editor. You can pass parameters to the query to define dynamic queries. The SQL transformation can output multiple rows when the query has a select statement. In query mode, the SQL transformation acts as an active transformation.

You can create the following types of SQL queries

Static SQL query: The SQL query statement does not change, however you can pass parameters to the sql query. The integration service runs the query once and runs the same query for all the input rows.

Dynamic SQL query: The SQL query statement and the data can change. The integration service prepares the query for each input row and then runs the query.

Dynamic SQL query: A dynamic SQL query can execute different query statements for each input row. You can pass a full query or a partial query to the sql transformation input ports to execute the dynamic sql queries.

SQL Transformation in Informatica Example Using Static SQL query

Q1) Let’s say we have the products and Sales table with the below data.

Table Name: Products
PRODUCT 
-------
SAMSUNG
LG
IPhone

Table Name: Sales
PRODUCT QUANTITY PRICE
----------------------
SAMSUNG 2        100
LG      3        80
IPhone  5        200
SAMSUNG 5        50
 
Create a mapping to join the products ant sales table on product column using the SQL Transformation? The output will be

PRODUCT QUANTITY PRICE
----------------------
SAMSUNG 2        100
SAMSUNG 5        500
LG      3        80
 

SQL Transformation in Informatica Example Using Full Dynamic query

Dynamic SQL query: A dynamic SQL query can execute different query statements for each input row. You can pass a full query or a partial query to the sql transformation input ports to execute the dynamic sql queries.

Q2) I have the below source table which contains the below data.


Table Name: Del_Tab
Del_statement
------------------------------------------
Delete FROM Sales WHERE Product = 'LG'
Delete FROM products WHERE Product = 'LG'

Solution:

Just follow the same steps for creating the sql transformation in the example 1.
  • Now go to the "SQL Ports" tab of SQL transformation and create the input port as "Query_Port". Connect this input port to the Source Qualifier Transformation.
  • In the "SQL Ports" tab, enter the sql query as ~Query_Port~. The tilt indicates a variable substitution for the queries.
  • As we don’t need any output, just connect the SQLError port to the target.
  • Now create workflow and run the workflow.

SQL Transformation in Informatica Example Using Partial Dynamic query

Q3) In the example 2, you can see the delete statements are similar except Athe table name. Now we will pass only the table name to the sql transformation. The source table contains the below data.


Table Name: Del_Tab
Tab_Names
----------
sales
products

Solution:

Create the input port in the sql transformation as Table_Name and enter the below query in the SQL Query window.

Delete FROM ~Table_Name WHERE Product = 'LG'