Monday, August 27, 2012

The State of Dashboards in 2012: Pathetic

Over the last several months, my colleague VP and Research Director Tony Cosentino and I have been assessing vendors and products in the business intelligence market as part of our upcoming Value Index.

Tony recently wrote about the swirling world of business analytics, covering many of the dynamics of this industry. He and I have been reviewing the breadth and depth of more than 15 of these vendors using our Value Index methodology, which examines the products closely in terms of usability, adaptability, reliability, capability and manageability. As we have gone through this analysis, we have seen that the dashboard is the most common tool for displaying business intelligence. The early forms of dashboards appeared in the 1980s, but in my honest evaluation, today’s dashboards have not gotten much more intelligent in all those years. The graphics have gotten better, and we can now interact with charts in what is commonly called visual discovery, drilling into and paging through data to change its presentation. So some progress has been made, but the basic presentation of a number of charts on the screen has not improved significantly and, worse yet, neither has the usefulness of the charts. Let’s face it: It’s a big mistake to place several bar and pie charts on a screen side by side and assume that business viewers will know what they mean and what is important in them. We cannot assume that individuals in an audience have the ability to interpret charts and draw the right conclusions from them; a chart that is merely pretty or interactive will not communicate the desired message.

The lack of adoption of business intelligence that includes dashboards is notorious in this industry, and so are the billions of dollars that companies have spent on BI products in the last decade. It is not helpful to make a big statement that the technology has failed; we should look for the reasons that have held it back. Here we might start by questioning whether the tools present the right information in a useful form for business people, or whether organizations have properly configured the tools they have purchased. If the goal is to inform business people through dashboards, then maybe we need to make explicit what the dashboard or collection of charts actually means. Typically, this means describing in words the issues or priorities that need to be examined further. A little discipline in populating the dashboard could help, such as presenting only the charts that clearly point out issues that need attention and determining which ones to use by applying analytics. If we ask why Microsoft PowerPoint is so popular as a business intelligence tool, we probably would find that the answer is the descriptive text boxes that accompany charts, providing summary sentences or emphasizing specific bullets in a list on the slide. While many people do not like the static nature of Microsoft Excel-based charts in presentations or PDF versions of them, these charts, through human annotation and commentary, explain themselves better than dashboards do today. If we expect our organizations to move beyond personal productivity tools and work in a collaborative enterprise environment with dashboards, we had better understand how business intelligence should adapt to the way people work and operate, not the other way around. In this case the old saying that one picture is worth a thousand words may not hold, but a hundred or so words explaining the relevance of the chart could really help.

Many technology vendors believe they need to provide better context in their dashboards, so they try to align the charts to the geographic area of focus, or to the product line of responsibility or to management key performance indicators to make them more usable. Providing better role-based dashboards that are generated based on the individual’s level of responsibility and the business context is a good first step, though most business intelligence vendors do not provide this level of support. But just presenting charts tuned to the context of the individual’s role that may or may not require action is not enough. We need to prioritize the information and make it like the news, with headlines and stories that people can read to determine if they need to make decisions or take action. Whether you are reading the physical or the digital version of The Wall Street Journal or USA Today, newspapers have survived over the centuries as the main source of what humans read in formats they can comprehend. When is the last time you saw a dashboard that communicated the story of its charts and explained the analytics?

Well, once upon a time analytics and logic were applied to generate stories: in the early 1990s, a product called IRI CoverStory did exactly that. At the time it was classified as an expert system that programmatically created English sentences, based on its interpretation of the analytics, in a memo that the system produced. I would even be happy if charts had titles and subtitles that were dynamically created and guided an individual to the purpose of the chart. Many of the current business intelligence technologies do not even allow a free-form text box to be placed beside a chart, which is really sad, as this is one of the most basic methods used in business today. It would be great if dashboards could make these steps forward and make it easier to understand what is presented, but 20 years later, they have not.
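
To make the idea of a dynamically created chart title concrete, here is a minimal Python sketch. It is not how CoverStory or any BI product works; the function name `chart_headline`, its parameters and the 5 percent "flat" threshold are all assumptions chosen for illustration.

```python
def chart_headline(metric: str, current: float, previous: float,
                   threshold: float = 0.05) -> str:
    """Return a one-sentence headline comparing two periods.

    All names and the default threshold are hypothetical; a real
    dashboard would tune the wording and rules to its audience.
    """
    if previous == 0:
        # Without a baseline a percentage change is undefined.
        return f"{metric}: no prior-period baseline to compare against."
    change = (current - previous) / abs(previous)
    if abs(change) < threshold:
        # Small movements are reported as flat so they do not grab attention.
        return f"{metric} is roughly flat versus the prior period ({change:+.1%})."
    direction = "up" if change > 0 else "down"
    return (f"{metric} is {direction} {abs(change):.1%} "
            "versus the prior period and may need attention.")
```

Calling `chart_headline("Revenue", 120, 100)` yields a sentence that could sit above the chart as its title, which is the kind of plain-language guidance the paragraph above asks for.
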

Another thing dashboards need to do is help individuals take action based on the information they receive. My colleague Robert Kugel has written about action-oriented information technology frameworks and how they can help increase the productivity and effectiveness of our workers. To date, most developments of the notion of an action-enabled dashboard have focused on data discovery and supporting root-cause analysis; that can’t match the familiar people-oriented actions that happen in our organizations: collaboration through dialogue to address issues and opportunities.

Some of my industry colleagues have written books on dashboards to capitalize on the hype surrounding the topic. It’s about time for a set of books about the death of the dashboard or moving beyond dashboards; the current designs are not advancing the ability to take appropriate action on the information presented or provide the right level of guidance using analytics. We are entering the next wave of discussion on visual discovery, but so far much of this focus is just about using visualization on greater volumes and velocity of data, not making it more useful for the general population of business users. If we want to learn from the disappointing decades of business intelligence deployments, then we should find out what our business users really need to take action and make decisions on the information; delivering prettier charts won’t help. Until then, we are just perpetuating the past, and we know it has not had the best track record in advancing usefulness and adoption of business intelligence and dashboards.

I will follow up on this rant about the state of dashboards by writing about the lack of improvement in the types of metrics and indicators as they relate to overall business analytics, which is another source of the problems that underlie our current methods of delivering and providing access to analytics through business intelligence. We all can do a much better job in meeting the needs of business and truly advancing the usefulness of technology that still holds promise for significantly impacting organizations’ effectiveness.

This blog originally appeared at Ventana Research.

Principles of Data Visualization

Eight Principles of Data Visualization

Imagine you are walking out of the office after a long day and your phone buzzes with a new email. Taking a quick glance, you see that it’s from Joe in operations: "Hey, wondering if you could run me a few numbers and put them in a nice chart to show how well our new store layouts are doing along with the latest sale promo we started last week. Need to put it into a presentation for the executive team next Monday. Thanks."

What does Joe really need? Where do you start? For anyone in a business environment who collects or manages some kind of raw data, tasks that are becoming more pervasive, the need to process that data into a human-usable form is increasingly common.

Visualizations, like the chart Joe asked for, are a great way to accomplish this, but they can be difficult to do properly, as anyone who has sat through a slide show presentation with an unreadable pie chart or vague growth projection graph can attest. As available data becomes more complex and extensive, weaving it into a visualization that invites engagement, understanding and decision-making is a bigger challenge, with a bigger opportunity for payoff.

Some of the traditional business standbys, like a one-off pie chart or simple line graph, even if done well, may not offer enough data to answer multi-faceted questions like Joe's. (See Figures 1 and 2, at left.) How can we take visualizations to the next level, so they can take on the challenge of today's business complexity?

Get the Fundamentals Right

The first step is to back up and focus on the basics. If you have ever played a team sport with a good coach, you may recall that he or she spent a lot of time working on fundamentals. Trick plays or advanced moves don’t win a game without solid fundamentals supporting them, and data visualization is no different. The most complex, data-rich graphic is useless unless it follows basic principles of good visualization:

1. Understand the problem domain. If you are producing visualization for your own use or that of your department, chances are good you already understand the area you will be working in. But if, as in our scenario with Joe, the visualization is for another department, or even an external stakeholder such as a customer or partner, you may need to ask questions and do more research to understand what is involved. In this case, you should investigate when these initiatives started, whether any others are in progress at the same time and what metrics the executive team will use to determine success.

2. Get sound data. This may seem obvious, but good data is at the heart of any effective visualization. Make sure the data you select is as accurate as possible, and that you have a sense of how it was gathered and what errors or inadequacies may exist. For example, maybe our store sales data for Joe is only current as of the last close of business, thanks to an older cash register system. Make sure you get relevant data and enough of it. We probably want not only sales data after these changes, but also the month or quarter before and even the same period in past years for comparison purposes. Above all, to create an effective visualization, you need to understand the meaning of the data you are working with. This can be a challenge if it has been stored as raw numbers. In this case, we may need to determine the store visitor counting method being used to know what those numeric tallies mean.

3. Show the data and show comparisons. Picking the best type of visualization is an art and science; however, the basic rule of thumb is to choose a spatial metaphor that will show your data and the relationships within it, with minimum distractions or effort on the part of the viewer. As Eddie Breidenbach explains, most graphic arrangements fall into one of four categories or metaphors (see Figure 3, at left):
  • Network - to show connections, sometimes in a radial layout.
  • Linear - to show how something varies over time or in relation to another factor, often on an X/Y space.
  • Hierarchical - to show groupings and importance; these can come in many different layouts.
  • Parallel - to show reach, frequency or shares of a whole; these can come in many different layouts.
For Joe's chart, we can start with a well-labeled, linear line graph since we want to see how sales have been affected since introducing these new initiatives. (See Figure 4, at left.)

4. Incorporate visual design principles. Using sound visual design elements, like line, form, shape, value and color, with principles like balance and variety, make a visualization both more inviting and easier to read for trends and comparisons. (See Figure 5, at left.) This will become particularly important as we take our linear metaphor visualization to the next level.

Bring in More Dimensions

Once we have good data and a sound underlying spatial metaphor (in this case, a linear metaphor), it is time to take account of the complexity at play. Though it might seem like we have satisfied the initial question at face value (“Sales are up since changing the store layout and starting the new promo”), this answer is likely to spur more questions.

Based on our knowledge and research into the problem domain, we can come up with initial follow-up questions after looking at the simple linear metaphor visualization:
  1. We started both of these initiatives right before a holiday weekend. How do we know that this uptick in sales is not just a seasonal trend?
  2. Total sales are up, but has the new store layout succeeded in improving the performance of some departments that were struggling before?
  3. Are we succeeding in getting more customers into the store and not just selling more to existing ones?
  4. Are customers shopping more departments and buying a more diverse mix of items?
Asking these kinds of questions is a great exercise to begin taking a visualization to the next level because they prompt us to add more dimensions that allow viewers to explore and understand the subject from additional angles and in more detail. There are a variety of solid techniques that can help achieve this additional dimensionality. Below are the answers to these questions:

5. Add small multiples. As described by author Edward Tufte, small repeated variations of a graphic side-by-side allow for quick visual comparison. Whenever possible, scales should be kept the same and the axes of comparison aligned. Adding some small, stacked thumbnails of our chart next to the main one allows a comparison of sales trends for the same period last year, and the one before that. (See Figure 6, at left.) This answers our first question: sales do normally go up this time of year, but the increase seems to be quite a bit bigger this time, so it is probably not just the normal seasonal cycle.
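
Keeping scales the same across small multiples comes down to computing one axis range from all panels instead of letting each panel pick its own. The Python sketch below shows that step, plus a simple first-to-last change per panel for comparing years. The function names and the padding value are assumptions for illustration, not part of any charting library.

```python
def shared_y_limits(panels: dict[str, list[float]],
                    padding: float = 0.05) -> tuple[float, float]:
    """Return one (low, high) y-axis range that fits every panel.

    `panels` maps a panel label (for example a year) to its series.
    Using the same range for all panels keeps them visually comparable.
    """
    low = min(min(series) for series in panels.values())
    high = max(max(series) for series in panels.values())
    margin = (high - low) * padding  # a little breathing room at both ends
    return low - margin, high + margin


def relative_change(series: list[float]) -> float:
    """Relative change from the first to the last point of a series."""
    return (series[-1] - series[0]) / series[0]
```

With one panel per year, passing the result of `shared_y_limits` to every thumbnail and comparing `relative_change` across years gives a rough check on whether this year's increase is larger than the usual seasonal one.
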

6. Add layers. Adding extra levels of information, while preserving the high-level summary data, can make a graphic more flexible and useful. Next, we are going to break down the "top line" of total sales into departments. (See Figure 7, at left.) The resulting stacked area chart answers our second question, showing that sales from the appliances department have increased as a proportion of the whole, but media department sales have not improved much.
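
The "proportion of the whole" that a stacked area chart conveys can be computed directly. This short Python sketch turns per-department sales into shares of each period's total; the function name and the input layout (department name mapped to a list of per-period sales) are assumptions for illustration.

```python
def department_shares(sales_by_dept: dict[str, list[float]]) -> dict[str, list[float]]:
    """Convert per-department sales into each department's share of the total.

    Every department's list must cover the same periods in the same order.
    """
    periods = len(next(iter(sales_by_dept.values())))
    # Total sales per period: the "top line" that the layers add up to.
    totals = [sum(series[i] for series in sales_by_dept.values())
              for i in range(periods)]
    return {
        dept: [series[i] / totals[i] if totals[i] else 0.0
               for i in range(periods)]
        for dept, series in sales_by_dept.items()
    }
```

A department whose share rises from one period to the next is growing faster than the store as a whole, which is the comparison the second question asks for.
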

7. Add axes or coding patterns. Another way to get more dimensions in a graphic is to add additional patterns for coding information, such as varying the shape or color of points on a plot based on a variable. In some cases, an extra axis in space, alongside an existing one or in a new direction (for a 3D chart), can also be useful for showing new variables. It's important to be careful with this approach, as it can add clutter, but when used sparingly and with good design principles it can increase a graphic's usefulness. In Figure 8 (at left) we added an additional vertical axis on the right to show daily foot traffic into the store, with its scale overlaid carefully to be comparable but distinct. To answer question number three, “Yes; we have increased foot traffic, but only after the sales promotion.”

8. Combine metaphors. So far, we have used a linear metaphor for our visualization. However, to answer our last question, we want to add a network metaphor to show connections between product categories in purchases. A pair of circular relationship (chord) diagrams showing snapshots at the beginning and end of the time period under consideration can help compare these connections. Like a pie chart, each product category is assigned a section of the circle, by percentage of total sales, but the center of the circle is hollow. If a majority of purchases containing items in one category also included items in a second category, a line is drawn to that second category; line width is based on the average proportion of both categories in the mixed purchases. As shown in Figure 9 (at left), the increase in these chord lines from the first to second diagram suggests there are indeed more purchases that cross departments since our initiatives went into place.
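
The linking rule for the chord diagrams can be sketched in Python. This is one reading of the rule described above, not the code behind Figure 9: a link from category A to category B is drawn when more than half of the purchases containing A also contain B, and "average proportion of both categories" is interpreted as the mean share of each mixed purchase's value that comes from A and B combined. The data layout and function name are assumptions.

```python
from itertools import permutations

# One purchase: category name -> amount spent in that category (assumed > 0).
Purchase = dict[str, float]


def chord_links(purchases: list[Purchase]) -> dict[tuple[str, str], float]:
    """Return {(from_category, to_category): line_width} for a chord diagram."""
    categories = sorted({cat for purchase in purchases for cat in purchase})
    links: dict[tuple[str, str], float] = {}
    for a, b in permutations(categories, 2):
        with_a = [p for p in purchases if a in p]
        mixed = [p for p in with_a if b in p]
        # Draw a link only if a majority of purchases with A also include B.
        if len(mixed) * 2 > len(with_a):
            # Width: mean share of the purchase value coming from A and B.
            shares = [(p[a] + p[b]) / sum(p.values()) for p in mixed]
            links[(a, b)] = sum(shares) / len(shares)
    return links
```

Running this once on purchases from the start of the period and once on purchases from the end gives the two sets of links to compare.
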
This relationship data would be even better if we could see it at any chosen point in time (for example, to see what effect, if any, the layout change alone had, before the promotion started). A zoomed-in view of the chord diagrams for detailed study might be useful, too. Clearly, some presentation media lend themselves to these opportunities more than others. As our graphics increase in complexity and sophistication, we need to think more carefully about how to deliver them.

Consider New (and Old) Delivery Methods

The point of any visualization is to be viewed by the right people, in the right context. Unfortunately, many business visualizations have a fleeting life on a slide, up one minute on a low-resolution projector to be scanned from across the room, and nothing but a vague memory the next.
What if, instead of a “flash on a slide” with all of these limitations, Joe's final visualization was printed in high-resolution color on a handout? Everyone could refer back to it as a touchstone during the whole presentation, seeing how the data backs up Joe's conclusions. Afterward, they could tack it up on a whiteboard for further study and follow-up.
On the other hand, maybe Joe needs people at a remote site to see this graphic or he would just prefer not to kill so many trees. He might consider putting a high-resolution version on the Web (or corporate intranet) for viewing on a PC or tablet. This could be as simple as a static graphic like the paper copy, but it also opens all kinds of possibilities for interactivity. To give just a few examples, we could enable scrubbing through time (great for seeing more network metaphors), drilling down and zooming out for a bird's eye view, seeing new data live as it becomes available or even manipulating future variables to watch different scenarios play out.

For more ideas of what's possible, and a great tool for building these using HTML standards that will work on the boss’s iPad, the Data Driven Documents JavaScript library is a great place to start.

Toward the Future

As visualization moves toward delivery via electronic medium, complex data visualization is increasingly blending into the discipline of user experience design and programming. Business analysts, IT staff and knowledge workers will need more skills designing, building and using fluid, interactive, dynamic visualizations. Fortunately, there are great tools and great groups of people focused on user experience. The potential payoff for the investment is huge: visualizations invite us to explore, understand and decide, not as one-off disposable products, but rather as robust, enduring touchstones that customers and leaders return to for insight, conversation and connection.
Note: For more on visualization fundamentals, a good place to start is Edward Tufte's excellent series beginning with “The Visual Display of Quantitative Information.” Also see “Visual Design Fundamentals: A Digital Approach” by Alan Hashimoto.

Ryan Bell is a user interface developer for EffectiveUI, where he gets to employ his passion for building great user experiences and indulge his inner information-design enthusiast.

Principles for Enterprise Data Warehouse Design

Seven Principles for Enterprise Data Warehouse Design

This month, I'd like to narrow the focus to one particular aspect of the enterprise information management spectrum: data warehouse (DW) design.
Contrary to popular sentiment, data warehousing is not a moribund technology; it's alive and kicking. Indeed, most companies deploy data warehousing technology to some extent, and many have an enterprise-wide DW.
However, as with any technology, a DW can quickly become a quagmire if it's not designed, implemented and maintained properly. With this in mind, I'd like to discuss seven principles that I believe will help you start - and keep - your DW design and implementation on the road to achieving your desired results (see Figure 1). I'm including both business and IT principles because most IT issues really involve business and IT equally.

Business Principles 
Organizational Consensus
From the outset of the data warehousing effort, there should be a consensus-building process that helps guide the planning, design and implementation process. If your knowledge workers and managers see the DW as an unnecessary intrusion - or worse, a threatening intrusion - into their jobs, they won't like it and won't use it.
Make every effort to gain acceptance for, and minimize resistance to, the DW. If you involve the stakeholders early in the process, they're much more likely to embrace the DW, use it and, hopefully, champion it to the rest of the company.
Data Integrity
The brass ring of data warehousing - of any business intelligence (BI) project - is a single version of the truth about organizational data. The path to this brass ring begins with achieving data integrity in your DW.
Therefore, any design for your DW should begin by minimizing the chances for data replication and inconsistency. It should also promote data integration and standardization. Any reasonable methodology you choose to achieve data integrity should work, as long as you implement the methodology effectively with the end result in mind.
Implementation Efficiency
To help meet the needs of your company as early as possible and minimize project costs, the DW design should be straightforward and efficient to implement.  This is truly a fundamental design issue. You can design a technically elegant DW, but if that design is difficult to understand or implement or doesn't meet user needs, your DW project will be mired in difficulty and cost overruns almost from the start.
Opt for simplicity in your design plans and choose (to the most practical extent) function over beautiful form. This choice will help you stay within budgetary constraints, and it will go a long way toward meeting user needs effectively.
User Friendliness
User friendliness and ease of use issues, though they are addressed by the technical people, are really business issues. Why? Because, again, if the end business users don't like the DW or if they find it difficult to use, they won't use it, and all your work will be for naught.
To help achieve a user-friendly design, the DW should leverage a common front-end across the company - based on user roles and security levels, of course. It should also be intuitive enough to have a minimal learning curve for most users.  Of course, there will be exceptions, but your rule of thumb should be that even the least technical users will find the interface reasonably intuitive.
Operational Efficiency
This principle is really a corollary to the principle of implementation efficiency.  Once implemented, the data warehouse should be easy to support and facilitate rapid responses to business change requests. Errors and exceptions should also be easy to remedy, and support costs should be moderate over the life of the DW. 
The reason I say that this principle is a corollary to the implementation efficiency principle is that operational efficiency can be achieved only with a DW design that is easy to implement and maintain. Again, a technically elegant solution might be beautiful, but a practical, easy-to-maintain solution can yield better results in the long run.
IT Principles 
Scalability
Scalability is often a big problem with DW design. The solution is to build in scalability from the start. Choose toolsets and platforms that support future expansions of data volumes and types as well as changing business requirements.  It's also a good idea to look at toolsets and platforms that support integration of, and reporting on, unstructured content and document repositories.
Compliance with IT Standards
Perhaps the most important IT principle to keep in mind is to not reinvent the wheel when you build your DW. That is, the toolsets and platforms you choose to implement your DW should conform to and leverage existing IT standards.
You also want, as much as possible, to leverage existing skill sets of IT and business users. In a way, this is a corollary of the user friendliness principle. The more your users know going in, the easier they'll find the DW to use once they see it.


Following these principles won't guarantee you will always achieve your desired results in designing and implementing your DW. Beware of any vendors that tell you it's a slam-dunk if you follow their methodology. There will almost always be problems that seem intractable at first - and may eventually prove to be so. Nevertheless, if you build your DW following these seven principles, you should be in a better position to recognize and address potential problems before they turn into project killers.

Rich Cohen is a principal in Deloitte Consulting LLP's Information Dynamics practice where he is responsible for the strategy, development and implementation of data governance, data warehousing, decision support and data mining engagements to support the emergence of world-class business intelligence applications. Cohen has more than 27 years of experience in the design, development, implementation and support of information technology in a variety of industries. Over the last 18 years, he has had extensive experience in the creation of technology strategies, implementations and deployment of CRM and business intelligence solutions to drive improved business performance.

Monday, August 20, 2012

Informatica - Power Center - Transformations

Transformations



Power Center Transformations (partial list)

Source Qualifier: reads data from flat file and relational sources
Expression: performs row-level calculations
Filter: drops rows conditionally
Sorter: sorts data
Aggregator: performs aggregate calculations
Joiner: joins heterogeneous sources
Lookup: looks up values and passes them to other objects
Update Strategy: tags rows for insert, update, delete, reject
Router: routes rows conditionally
Transaction Control: allows data-driven commits and rollbacks
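
To contrast two of the transformations listed above, here is a plain-Python analogy for Filter and Router. It is not Power Center code. The behavior shown for the router (a row is sent to every group whose condition it meets, and rows meeting no condition go to a default group) is my understanding of the Router transformation and should be treated as an assumption; all function and group names are hypothetical.

```python
from typing import Callable

Row = dict
Condition = Callable[[Row], bool]


def filter_rows(rows: list[Row], keep: Condition) -> list[Row]:
    """Filter analogy: rows that fail the condition are dropped."""
    return [row for row in rows if keep(row)]


def route_rows(rows: list[Row],
               groups: dict[str, Condition]) -> dict[str, list[Row]]:
    """Router analogy: rows are sent to named groups; none are dropped."""
    routed: dict[str, list[Row]] = {name: [] for name in groups}
    routed["DEFAULT"] = []  # hypothetical name for the catch-all group
    for row in rows:
        matched = False
        for name, condition in groups.items():
            if condition(row):
                routed[name].append(row)
                matched = True
        if not matched:
            routed["DEFAULT"].append(row)
    return routed
```

The practical difference is that a filter loses the rejected rows, while a router keeps every row available to some downstream branch.
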

Advanced Power Center Transformations

Union: merges two or more data streams, like a SQL UNION ALL (duplicates are kept; it is not a join)
Java: allows Java syntax to be used within Power Center
Midstream XML Parser: reads XML from anywhere in mapping
Midstream XML Generator: writes XML to anywhere

More Source Qualifiers: read from XML, message queues and applications

Mapplet: a set of transformations that can be reused


Example : Data Sources Defined Outside Mapplet


Recap

1. ETL: Extract, transform and load data
2. Designer: Create mapping objects
3. Mapping: Logically defines the ETL process
4. Transformation: Generates or manipulates data
5. Mapplet: Set of transformations that can be reused in multiple mappings

Informatica - Power Center Basic Concepts

Power Center Introduction
  • Is a single, unified enterprise data integration platform that allows companies and government organizations of all sizes to access, discover, and integrate data from virtually any business system, in any format, and deliver that data throughout the enterprise at any speed
  • An ETL Tool (Extract, Transform and Load)
Power Center Client Applications


Designer Tools – Create mappings



Mapping



A mapping is a set of source and target definitions linked by transformation
objects that define the rules for data transformation. Mappings represent the
data flow between sources and targets. When the Integration Service runs a
session, it uses the instructions configured in the mapping to read,
transform, and write data.
• Every mapping must contain the following components:
  • Source definition. Describes the characteristics of a source table or file.
  • Transformation. Modifies data before writing it to targets. Use different transformation objects to perform different functions.
  • Target definition. Defines the target table or file.
  • Links. Connect sources, targets, and transformations so the Integration Service can move the data as it transforms it.
• A mapping can also contain one or more mapplets. A mapplet is a set of transformations that you

Example

Give me an Excel file with Total Order Amount per Customer. I also need to know when this data was extracted (date) and the customer type initial (first letter of the customer type).
• Define the sources
  • Orders
  • Customers
• Define any required transformation
  • Sum of order amount
  • Get extracted date
  • Get first letter of customer type
• Create the file
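
The steps above can be sketched in plain Python to show the logic the mapping has to express. This is an analogy, not Power Center itself: the field names (`customer_id`, `amount`, `customer_type`) are assumptions about what the Orders and Customers sources contain, and the output is CSV text, which Excel can open, rather than a native Excel file.

```python
import csv
import io
from collections import defaultdict
from datetime import date

# Column order of the target file (names are hypothetical).
FIELDS = ["customer_id", "total_order_amount",
          "extracted_date", "customer_type_initial"]


def build_report(orders: list[dict], customers: list[dict],
                 extracted_on: date) -> list[dict]:
    """Aggregate orders per customer and add the two derived columns."""
    # Aggregator step: sum of order amount per customer.
    totals: dict = defaultdict(float)
    for order in orders:
        totals[order["customer_id"]] += order["amount"]

    report = []
    for customer in customers:
        report.append({
            "customer_id": customer["customer_id"],
            "total_order_amount": totals.get(customer["customer_id"], 0.0),
            # Expression steps: extraction date and first letter of the type.
            "extracted_date": extracted_on.isoformat(),
            "customer_type_initial": customer["customer_type"][:1],
        })
    return report


def to_csv(report: list[dict]) -> str:
    """Target step: write the rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(report)
    return buffer.getvalue()
```

In mapping terms, the two input lists play the role of the source definitions, `build_report` stands in for the aggregation and expression logic, and `to_csv` stands in for the target definition.
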