Special Report: Data as a Profit Center

Article ID: 64156

Imagine walking up to the door of a large, seemingly unused building. In your hand is the key to the lock. Inserting the key into the slot, you open the door with a rusty squeal and peer inside. As your eyes adjust to the darkness, you first notice the vastness of the space, and then shafts of sunlight stabbing down from the high, dusty windows. You then notice piles of boxes stacked across the floor. With curiosity rising, you step inside and start to explore. To your amazement and delight, you realize each box is filled with something interesting and unexpected. As you wander around taking it all in, you finally understand, "Hey! This stuff is valuable." And with the wheels of entrepreneurialism now turning, you ponder the possibilities….

We can apply the same scenario to your data. Over months, years, and sometimes decades, your computer systems have acquired and stored data—maybe vast quantities of it. This data is placed in a storage system where it sits waiting to be discovered by some curious and clever explorer. And just like our earlier scenario, your boxes of data could prove to be extremely valuable.

Profit

To extract value from your stored data and ultimately make a profit, you need to first understand the meaning of "profit." According to Merriam-Webster, profit is defined as:

  1. A valuable return or gain
  2. The excess of returns over expenditure in a transaction or series of transactions, especially the excess of the selling price of goods over their cost
  3. Net income usually for a given period of time
  4. The ratio of profit for a given year to the amount of capital invested or to the value of sales
  5. The compensation accruing to entrepreneurs for the assumption of risk in business enterprise as distinguished from wages or rent

Of these definitions, let's focus on numbers 2 and 5, and apply them to the premise that your data is indeed valuable.

Because you've already acquired and stored your data, the costs and expenditures of these activities have already been borne. Any returns on the additional use of data will be considered "profit." Furthermore, any compensation accrued to entrepreneurs who make use of the data will also be considered "profit."

Numerous examples of this concept exist. For instance, consider your credit rating or credit score. As you purchase goods and services, the data associated with your transactions is captured and stored—a natural byproduct of your activities. As you borrow money and pay back loans, data about this activity is also captured and stored. After the data capture, your transaction is complete. But the data lives on and can be used to provide insight into your behavior, as well as your worthiness to conduct a future transaction. The data is valuable to the companies that are considering future business with you. And the data is profitable to the company providing the information.

So, how can you extract value from the data collected in your business to turn a profit? You need to consider several factors, including acquiring data, consolidating and organizing the data, turning the data into useful information, and making that information accessible to those who want it.

Acquiring Data

The first thing to consider in trying to make a profit from your data is how you acquire, capture, and store the data. Although these things seem obvious, you might be haphazardly saving some transactions and discarding others. Reviewing your recording processes from beginning to end can be very worthwhile. Trace the flow of your business data, starting with the source and ending with the target. Be sure to include the various devices that data can be copied to, such as USB-attached memory sticks and hard drives. The last thing you want is for a box full of valuable items to be shoved out the back door for the garbage collector or, worse yet, taken off the premises and used by someone else to make a profit!

Organizing the Data

After you acquire data, you need to consolidate and organize it. Many different information management and data processing systems are available in today's business environment. Data systems can be in-house or cloud-based, real or virtual, homogeneous or heterogeneous. To create, extend, and extract value from your data, you need to consolidate and organize the various transaction artifacts. In most cases this is done to provide information and insight that wouldn't otherwise be available. In other cases, you're trying to provide a timely view of the information, to meet a client's needs regarding availability and response time.

Another interesting scenario in which you'll want to consolidate data for profit is when you have different clients, customers, or partners who are in the same industry, business, or sector, as Figure 1 illustrates. Traditionally, acquiring or creating data for separate customers meant keeping each business's data separate and isolated. But what if you could extract data from each client, transform and cleanse the data to avoid issues of identity or confidentiality, and consolidate all the data into a new superset, as Figure 2 illustrates? You'd have a valuable new perspective with a much larger view of the marketplace, sector, or environment. Doing so would be like gathering up all the related boxes from our introductory scenario and organizing them into a cohesive collection to better evaluate their contents.
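
The extract, cleanse, and consolidate idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the client names, the record fields, and the pseudonymize and consolidate helpers are hypothetical, and salted hashing is just one possible way to mask identity. It does not by itself guarantee confidentiality.

```python
import hashlib

def pseudonymize(value: str, salt: str) -> str:
    """Replace an identifying value with a truncated, salted SHA-256 digest."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:12]

def consolidate(client_feeds: dict, salt: str) -> list:
    """Merge per-client records into one superset that carries no names."""
    superset = []
    for client, records in client_feeds.items():
        for rec in records:
            superset.append({
                # Hypothetical fields: the source and customer are masked,
                # while the business measures are carried over unchanged.
                "source": pseudonymize(client, salt),
                "customer": pseudonymize(client + ":" + rec["customer_name"], salt),
                "product": rec["product"],
                "amount": rec["amount"],
            })
    return superset

# Made-up sample feeds from two clients in the same sector.
feeds = {
    "client_a": [{"customer_name": "Pat Lee", "product": "shoes", "amount": 80.0}],
    "client_b": [{"customer_name": "Sam Roy", "product": "shoes", "amount": 95.0},
                 {"customer_name": "Pat Lee", "product": "dress", "amount": 120.0}],
}

combined = consolidate(feeds, salt="example-salt")
shoe_total = sum(r["amount"] for r in combined if r["product"] == "shoes")
print(len(combined), shoe_total)
```

The combined list gives a sector-wide view (for example, total shoe sales across both clients) that neither client's isolated data could provide.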

Fortunately, the current state of information technology provides many options for outfitting a company or organization with the proper gear, maps, and guidance to successfully meet this goal. Much of this technology exists under the umbrella phrase business intelligence (BI). Within the architecture of BI are the concepts of data warehousing and operational data stores, which are proven methods for consolidating, cleansing, and organizing data.

I sometimes encounter resistance when I suggest consolidating data into a data warehouse or operational data store. However, imagine a client knocking on the door of your storage building and asking for some useful data; you wouldn't open the door and simply point to all the various boxes of data and tell the client to "have at it." If you did, your client would be unlikely to find any value in the data, and you'd be equally unlikely to make any profit. Even if the client were willing to move, open, and explore each box for useful data, doing so would take excessive time and energy. To make data profitable, you have to consolidate it and organize it before a client asks for it.

Turning Useful Data Into Valuable Information

Data is not information. To make data profitable, you need to turn it into valuable information. At its lowest level, data is nothing more than ones and zeros. When these two digits are combined and rendered into characters, you begin to see something recognizable. When the characters are combined into words and numbers, you begin to understand. When the words and numbers are transformed into graphs, pictures, and diagrams, you discover insight and meaning. From here, you can take action.

You need to keep in mind that there's a difference between querying data and mining data. As a simple explanation and comparison, consider querying to be developing a premise, then asking the database management system if the premise is true or false. For example, consider the premise that high-heeled red shoes are sold on Saturday mornings. To confirm this premise, you could issue the query:

select   shoe_type,
         shoe_color,
         day_sold,
         time_period_sold,
         sum(sales) as total_sales
from     sales
where    shoe_type = 'high heels'
and      shoe_color = 'red'
and      day_sold = 'Saturday'
and      time_period_sold = 'AM'
group by shoe_type,
         shoe_color,
         day_sold,
         time_period_sold;

If the query comes back with a result, the premise is true. If no result is returned, the premise is false.
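
To try the premise test end to end, the following Python sketch runs the same kind of query against an in-memory SQLite database. The table layout and the sample rows are assumptions made up for illustration, and SQLite stands in for whatever database management system you actually use.

```python
import sqlite3

QUERY = """
select   shoe_type, shoe_color, day_sold, time_period_sold,
         sum(sales) as total_sales
from     sales
where    shoe_type = ?
and      shoe_color = ?
and      day_sold = ?
and      time_period_sold = ?
group by shoe_type, shoe_color, day_sold, time_period_sold
"""

def check_premise(conn, shoe_type, shoe_color, day_sold, time_period_sold):
    """Return total sales if the premise holds, or None if no rows match."""
    rows = conn.execute(
        QUERY, (shoe_type, shoe_color, day_sold, time_period_sold)
    ).fetchall()
    return rows[0][4] if rows else None

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table sales (shoe_type text, shoe_color text, "
    "day_sold text, time_period_sold text, sales real)"
)
# Hypothetical sample data.
conn.executemany(
    "insert into sales values (?, ?, ?, ?, ?)",
    [
        ("high heels", "red", "Saturday", "AM", 120.0),
        ("high heels", "red", "Saturday", "AM", 80.0),
        ("flats", "black", "Monday", "PM", 45.0),
    ],
)

premise_true = check_premise(conn, "high heels", "red", "Saturday", "AM")
premise_false = check_premise(conn, "flats", "red", "Saturday", "AM")
print(premise_true, premise_false)
```

Because the query groups its results, a premise with no matching sales returns no rows at all, which the helper reports as None.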

With data mining, no such premise exists. Rather, the mining algorithms can identify and highlight correlations and connections within the data that otherwise would be missed or difficult to comprehend. Think of this as a process of discovery. For example, mining might reveal that high-heeled red shoes are sold on Saturday mornings more often when long black dresses are sold on Friday evenings between 5:00 p.m. and 8:00 p.m., in the springtime. The connection between sales of dresses on April Friday evenings and red shoes on Saturday mornings was discovered in the data, not conceived beforehand.
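
One simple way to picture this kind of discovery is to score every pair of events by how much more often they occur together than chance would suggest, a measure commonly called lift. The Python sketch below is a toy illustration, not a production mining algorithm. The event labels, the sample "baskets" of events, and the rank_pairs_by_lift helper are all hypothetical.

```python
from collections import Counter
from itertools import combinations

def rank_pairs_by_lift(baskets, min_support=0.1):
    """Score every co-occurring pair of events; no premise is supplied up front."""
    n = len(baskets)
    item_counts = Counter(item for b in baskets for item in set(b))
    pair_counts = Counter(
        pair for b in baskets for pair in combinations(sorted(set(b)), 2)
    )
    results = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of baskets containing both events
        if support < min_support:
            continue
        # Lift above 1 means the pair occurs together more often than
        # it would if the two events were independent.
        lift = support / ((item_counts[a] / n) * (item_counts[b] / n))
        results.append((a, b, round(lift, 2)))
    return sorted(results, key=lambda r: r[2], reverse=True)

# Each set is the made-up activity of one customer over one weekend.
baskets = [
    {"black dress Fri PM", "red heels Sat AM"},
    {"black dress Fri PM", "red heels Sat AM"},
    {"black dress Fri PM", "red heels Sat AM", "umbrella"},
    {"umbrella", "sneakers"},
    {"sneakers"},
    {"umbrella"},
    {"red heels Sat AM", "sneakers"},
    {"black dress Fri PM"},
]

ranked = rank_pairs_by_lift(baskets)
print(ranked[0])
```

In this sample data, the dress and shoe pairing rises to the top of the ranking even though nobody asked about it in advance, which is the essential difference from the query approach.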

You most likely have both unstructured data and structured data within your organization. Think of structured data as the rows and columns found in a typical database table. Think of unstructured data as text documents, pictures, images, and other sets of data that have no discernible or consistent structure. You probably have numerous methods for reading, processing, and extracting information from the structured data. But what about the unstructured data? How do you efficiently and effectively read, analyze, and extract information from a pile of documents containing words and figures?

Along with some colleagues, I recently had the opportunity to lead a BI project involving the analysis of unstructured data. We needed to extract insight from a computer containing thousands of text documents. Each document represented a set of unstructured data describing a particular engagement or transaction. To the human mind, the text documents consisted of paragraphs, sentences, and words with meaning. To the computer system, the documents were nothing more than a stream of meaningless bits and bytes. Using the advanced algorithms in IBM Research's Business Insights Workbench, we were able to turn each text document into a "bag of words" and then analyze and compare these words across all documents, both in and out of context. The text mining application we used quickly and efficiently renders analysis results into a taxonomy. This taxonomy is refined and used to produce comparisons of words and phrases in various ranked categories and graphs, which lets the business analyst visualize conditions, connections, and trends that would otherwise be hidden or difficult to comprehend. Our team was able to provide useful information from unstructured data, and the business was able to extract value from the information. Text documents that were simply byproducts of completed business transactions became a profitable source of knowledge and insight.
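
The "bag of words" idea itself is easy to illustrate. The Python sketch below is a generic, minimal version and is not the algorithm used by the Business Insights Workbench, whose internals are not described here. The sample documents, the stop-word list, and the helper names are hypothetical.

```python
import re
from collections import Counter

# A tiny, illustrative stop-word list; real tools use far larger ones.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "was", "for", "in", "on"}

def bag_of_words(text):
    """Reduce a document to word counts, ignoring order and punctuation."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)

def shared_terms(bags, min_docs=2):
    """List the terms that appear in at least min_docs documents."""
    doc_freq = Counter()
    for bag in bags:
        doc_freq.update(bag.keys())  # count each term once per document
    return sorted(term for term, n in doc_freq.items() if n >= min_docs)

documents = [
    "The client upgrade was delayed for a storage shortage.",
    "Storage upgrade completed on schedule for the client.",
    "Performance review of the database upgrade.",
]

bags = [bag_of_words(d) for d in documents]
common = shared_terms(bags)
print(common)
```

Comparing the bags across documents surfaces recurring terms, which is the raw material for the categories and taxonomy that an analyst later refines.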

Making Information Accessible

The final step in extracting profit from data is to make the information accessible to buyers and consumers. No matter how valuable an item is, you can't make any profit without a buyer. Once you consolidate and organize your piles of data into insightful information, you must give potential buyers access to it. For this discussion, you can consider three avenues to your information: intranet, extranet, and Internet. Think of these as private, semi-private, and public, where each avenue brings the consumer to the appropriate set of information commensurate with his or her level of authority and need, as Figure 3 illustrates.

However, just providing access to information isn't enough to meet your customers' needs. You must present the information, and let customers interact with it, in ways that encourage use, foster understanding, and provide meaning and context. This can take the form of presenting traditional rows and columns but will probably also require clever and sophisticated visual devices. Think of it as leading an explorer to the most relevant and important information (e.g., X marks the spot for the treasure).

Your intranet is the most obvious choice for providing information to your local and most trusted colleagues. An extranet or the Internet is more appropriate for your business partners or the general public. Keep in mind that you don't want to give anyone access to the actual data, just information. It's up to you to extract the data, consolidate it to provide value, and turn it into useful information.

Consider the example of a travel agency. If you travel regularly, a travel agency might make your specific travel details available to you via the web. But suppose you want to know more about general travel behavior—that is, you want to see a "macro view" of travel metrics and trends based on time and geography. Of course the travel agency has data for activity associated with all its travelers; if this data is consolidated and turned into information, it might be valuable to you. The travel agency could turn this data into identity-less information and provide it to you as a trusted client or business partner, through a specific and secure extranet or Internet session. Providing this level of information would give the travel agency additional opportunities to add value and therefore make a profit from data it has already collected and stored. In addition, the travel agency might find new customers for this information, such as transportation service providers who want to better understand the marketplace.
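
The travel agency's "macro view" can be sketched as a simple aggregation that drops the traveler's identity and keeps only time and geography. The Python below is a minimal illustration. The trip records, field names, and macro_view helper are hypothetical, and a real offering would need further safeguards (for example, suppressing groups small enough to identify an individual).

```python
from collections import defaultdict

def macro_view(trips):
    """Summarize trips by month and destination, omitting traveler identity."""
    summary = defaultdict(lambda: {"trips": 0, "total_fare": 0.0})
    for trip in trips:
        # Key on year-month and destination only; the traveler is never copied.
        key = (trip["date"][:7], trip["destination"])
        summary[key]["trips"] += 1
        summary[key]["total_fare"] += trip["fare"]
    return dict(summary)

# Made-up detail records of the kind the agency already stores.
trips = [
    {"traveler": "Pat Lee", "date": "2024-03-04", "destination": "Chicago", "fare": 300.0},
    {"traveler": "Sam Roy", "date": "2024-03-18", "destination": "Chicago", "fare": 400.0},
    {"traveler": "Pat Lee", "date": "2024-04-02", "destination": "Denver", "fare": 250.0},
]

view = macro_view(trips)
print(view[("2024-03", "Chicago")])
```

The detail records stay inside the agency; only the summarized view would be exposed through the extranet or Internet session.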

Remember that anything of value is a likely target for thieves. As you begin to derive more and more value from your data, the risk of loss or theft will become greater and more costly. You need to take proper measures to understand and mitigate these risks, including identifying and meeting data availability and recovery requirements, as well as setting up adequate and effective authentication, security, and auditing mechanisms. You must start thinking of your data and the information derived from it as an asset. Close attention to detail is vital for protecting your data. Periodic security reviews and audits can go a long way toward identifying weaknesses and holes, and thus preventing a loss—not to mention making your auditors happy.

The Next Step

It is said that a journey of 1,000 miles begins with a single step. Now that you've taken that first step, you need to embark on the entire journey. Your next steps should include assessing what data is available in your organization that, if consolidated, cleaned up, and crystallized, could be worth something to someone else. You need to learn about data-centric application architectures and BI concepts, tools, and best practices. Talk to your clients and business partners about their requirements and where you can provide more value. Then you can truly embark on the journey of turning your data into information, and turning that information into profit.

Mike Cain is a senior technical staff member in the IBM Systems and Technology Group and team leader of the DB2 for i Center of Excellence. His team provides consulting, education, and services for clients and businesses worldwide.
