Where’s the Money? Part 9 of 18: Big Data

Andrew Cardno and Dr. Ralph Thomas
March 1, 2012

In this article, we will look at “big data,” what it means for the gaming industry and, specifically, what it means for the gaming floor. Big data is essentially ultra-large datasets that are updated frequently. These datasets are at the forefront of many of the biggest changes in the world today; Google, Facebook and Yahoo! are three canonical businesses at the epicenter of the big data explosion. The gaming industry has a deep history of exploiting data, and big data is unlikely to be the exception. This article will start to prepare you for the changes that big data might bring.

Data Then and Now
The last century’s data breakthrough was transactional data. In other words, the finest-grained data was collected by retailers tracking their sales and inventories. Transactional data was considered to be massive, and a highly specialized industry evolved to handle these huge data volumes. The shining example is Walmart and its 2.5 petabyte database that continues to drive its global Retail Link solution.1

This century’s superstar data, however, is interaction data. The graph in Figure 1, based on extrapolated International Data Corp. (IDC) estimates, shows how we are now crossing into the Zettabyte Age. In this data age, the majority of information comes from interactions with customers. The key attribute of interaction data is that it happens before and after the transaction. Quite simply, this means that the potential exists to see what customers are shopping for and then what they choose. Already there are companies where the main value driver for their business is using and exploiting this data. You do not have to look further than Google, Yahoo!, LinkedIn or Facebook to see companies that are pioneers in this developing space.

Today, the world is flooding with big data, and according to the IDC, unstructured data now accounts for about 90 percent of the data being captured.2 Over the next 10 years, the number of servers to hold all this data will grow by a factor of 10. Quite simply, the data infrastructure of the world is transforming, and the gaming industry is at the forefront of this change. Historically, the gaming industry has been a leading industry for data utilization, with near universal adoption of loyalty cards and a proliferation of data warehouses across the industry. These data warehouses have become a central part of the strategy of many gaming companies. The big question is, how can the gaming industry use these new sources of data to drive innovation?

Gaming data has been large for more than a decade. Sophisticated player tracking systems have told us a great deal about our customers. From standard RFM data (recency, frequency, monetary spend) to standard demographic data (age, home address), casinos have for years known more about their customers than most retail businesses. In addition, the detail data for each customer has been very specific—for example, a marketer can easily pull up a list of customers who play on Tuesday mornings with an average bet of more than $1.
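The marketer’s query above can be sketched in a few lines. This is a minimal illustration, assuming a simple in-memory list of player-session records; the field names and sample values are invented here and do not come from any real player tracking system.

```python
# Hypothetical player-session records (field names are illustrative only).
sessions = [
    {"player": "A1", "day": "Tuesday", "hour": 9,  "avg_bet": 2.50},
    {"player": "B2", "day": "Tuesday", "hour": 10, "avg_bet": 0.75},
    {"player": "C3", "day": "Friday",  "hour": 21, "avg_bet": 5.00},
]

# "Customers who play on Tuesday mornings with an average bet of more than $1."
tuesday_morning_players = {
    s["player"]
    for s in sessions
    if s["day"] == "Tuesday" and s["hour"] < 12 and s["avg_bet"] > 1.00
}
print(tuesday_morning_players)  # {'A1'}
```

In a real system this filter would of course run as a query against the player tracking database rather than over an in-memory list, but the selection logic is the same.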

More recently, companies that have installed enterprise-wide data warehouses (EDWs) have been able to add a second set of dimensions to their data—product selection. By combining the data from the player tracking system and the slot accounting system, casinos are now able to know every game played by every customer at every point in time.

Let’s see what this does to the size of the data, assuming our casino has 500,000 rated customers in its database and 2,000 slot machines on its floor. For the sake of this example, we’ll assume each customer visits four times per year and plays an average of three different games per visit.

Traditional Player Tracking Data
500,000 customers x 4 visits per year = 2,000,000 records per year

EDW Data
500,000 customers x 4 visits per year x 3 slot machines per visit = 6,000,000 records per year
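The two record counts above follow directly from the stated assumptions and can be reproduced with a short calculation:

```python
# Illustrative record-count arithmetic, using the assumptions from the example.
customers = 500_000      # rated customers in the database
visits_per_year = 4      # average visits per customer per year
games_per_visit = 3      # distinct slot machines played per visit

player_tracking_records = customers * visits_per_year
edw_records = customers * visits_per_year * games_per_visit

print(player_tracking_records)  # 2000000 records per year
print(edw_records)              # 6000000 records per year
```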

Dimensionality and Complexity
In a certain sense, the dimensionality of the data tells us how complex or “big” our data is. Dimensionality is relative to the problems we are trying to solve, and we could arbitrarily create big data sets if we wanted to. For example, imagine cross-referencing all of your customers against all possible three-letter acronyms in the English alphabet. There are 26 x 26 x 26 = 17,576 possible three-letter acronyms, and if we were suitably crazy, we could create a data set that contains a last name column and one column for each three-letter acronym. Each row would then contain the customer’s last name plus a 0 or a 1, depending on whether the three-letter acronym occurs within the customer’s last name. This is a massively complex and large data set. It is also completely useless. It does, however, show how the attribution of data can create far more complexity than the raw numbers suggest, and in a world of big data, we think there will be a lot of dimensions.
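The deliberately useless dataset described above is easy to construct, which is exactly the point. The sketch below builds one row of it; the sample last name is arbitrary.

```python
from itertools import product
import string

# All 26^3 = 17,576 possible three-letter acronyms.
acronyms = ["".join(t) for t in product(string.ascii_lowercase, repeat=3)]
assert len(acronyms) == 17_576

def acronym_row(last_name):
    """One row of the 'useless' dataset: a 0/1 flag per acronym,
    set to 1 if the acronym appears inside the last name."""
    name = last_name.lower()
    return {a: int(a in name) for a in acronyms}

row = acronym_row("Cardno")
print(sum(row.values()))  # 4 -- the distinct three-letter substrings of "cardno"
```

Each row carries 17,576 columns of which only a handful are nonzero, which is why the dataset is enormous and informationless at the same time.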

Let’s take another look at the complexity of our data from the standpoint of dimensionality, comparing traditional player tracking data to the newer EDW data sets. For this example, we assume that we are interested in looking at the play of our customers over a fixed time period (one month) and that there are 10 reasonably interesting gaming metrics to examine for this time period.

Traditional Player Tracking Data Dimensionality
500,000 players x 10 gaming metrics = 5,000,000

EDW Data Dimensionality
500,000 players x 10 gaming metrics x 2,000 slot machines = 10,000,000,000

With the addition of EDW, our data explodes from a matrix of 10 columns and 500,000 rows to one with 20,000 columns and 500,000 rows—a dimensionality of 10 billion!
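The dimensionality comparison above is again simple arithmetic over the stated assumptions:

```python
# Dimensionality of the two data sets, per the example above.
players = 500_000
metrics = 10
machines = 2_000

tracking_dim = players * metrics        # 5,000,000
edw_dim = players * metrics * machines  # 10,000,000,000

# In matrix terms, the EDW data has one column per (metric, machine) pair.
columns = metrics * machines            # 20,000 columns x 500,000 rows
print(tracking_dim, edw_dim, columns)
```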

From the CEM article “The Petabyte Era of Gaming Data” (Singh and Cardno, September 2008), we know that the move to detailed interaction data on the gaming device leads to a 480 times increase in the volume of data collected. Just 10 years ago, nearly every piece of data was touched or viewed by a human; today we only view 1 percent of the data generated. In another decade, humans will only see a tiny portion, probably less than 0.0001 percent of all data generated.

The last century’s data was a sample of what was occurring in the real world. These samples of the data, while still of considerable size, encouraged models that understood the nature of the data. In the future, when data volumes will essentially represent the entirety of all information, the challenge will not be building a model from a sample, but building a sample of the whole.

In addition, historically, the economic value of data has been gathered through careful analysis of transaction data. The operator controls the data, and the operator therefore controls the data’s power. In today’s world, “competing on analytics”3  is a well-established practice. This establishes analysis of transaction data as a requirement for business, and it is likely that this will expand to competing using the big data of the future.

Consider someone browsing online for a hotel to book—and, remember, with interaction data, we are able to see the customers’ actions during this search, not just at the time of booking. The hotel operator that is able to respond to this shopping event is going to be acting before their transaction-based analytics have even started. In this sense, the interaction data we are now able to collect is like a crystal ball that can show us what our customers are about to do before they transact with us.

Data Advantaged Customers
Customers are, in a way, also early adopters of this interaction data, both as its creators and as users of its applications. This information places the power of yield management in the hands of the customer. Consider a world where your customers have near-perfect insight into your competitors’ offers and can compare them directly; your customers know the marketing programs and the responses from the market. In this world, the customer has perfect information and operators are at an information disadvantage. Let’s call these customers “data advantaged.”

Data advantaged customers already exist today, and it seems reasonable to assume that the number of data advantaged customers will grow. This growth is more horizontal in nature, as social media permeates different aspects of society—Facebook has 845 million users, 161 million of whom are in the U.S.4

These customers are also more communicative online than ever. To use Facebook as an example again, it records 2.7 billion “likes” and comments daily.5 There are companies now betting their future on the likelihood that social media will replace e-mail.6 If these companies are correct, then not only will our traditional relationship with customers be fundamentally changed, but our method of communication will also be “shared” beyond our control by customers using social media.

If these changes do occur, then there will be a massive shift in information power toward the customer. In this environment, operators need to think about how to effectively take part in these interactions to try to gain back some of their information power. Two initiatives that organizations can follow to balance the power are locational intelligence and predictive analysis.

Locational Intelligence
Locational data, collected via GPS, proliferates in the world of big data; it accumulates on Facebook and through many iPhone® applications. The locational view is extremely powerful in deciphering the mass of communication. We can now understand customer behavior and target rewards appropriate not only to what we think the customer needs or wants in general, but to what the customer needs or wants at a specific GPS location.

GPS data aggregates to a location, and so vast amounts of location-specific data can be seen, understood and acted on. As an illustrative example, we could display all the patrons who look for fast food while inside the casino and see if there is a relationship between where patrons are playing and their desire for fast food. We could, for example, then show the conversion rates to the in-property Johnny Rockets. If the numbers are low, maybe something as simple as better signage could keep the patrons on site for their fast food desires.
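The conversion analysis described above might look like the sketch below. The zone names and counts are entirely invented for illustration; a real analysis would aggregate actual GPS-tagged search events and point-of-sale records by floor location.

```python
# Hypothetical counts: patrons who searched for fast food while on property,
# grouped by the floor zone they were in, versus those who then purchased
# at the on-site fast-food outlet. All numbers are made up for illustration.
searches = {"north_slots": 120, "south_slots": 45, "poker_room": 30}
purchases = {"north_slots": 18, "south_slots": 20, "poker_room": 3}

for zone in searches:
    rate = purchases[zone] / searches[zone]
    print(f"{zone}: {rate:.0%} conversion")
```

A zone with many searches but a low conversion rate is exactly the case where, as suggested above, something as simple as better signage might keep those patrons on site.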

Predictive Analysis
Amongst the massive amounts of big data, there are deep trends and deeper analytics, and there are companies driving this data to predict the future. For example, we can apply social media data to predict future hotel occupancy levels or the number of attendees at an upcoming convention. This data has the potential to show our upcoming business and to drive the interaction-based hotel yield management systems of the future.

Where is the Money?
The ocean of information that is interaction data only takes us to general trends. It is the transaction data that we hold that shows the conversion to real value. By combining the transaction data with the interaction data, we gain a chance of understanding both the behavior of our customers and the results of our actions. This transaction data is still growing at a tremendous rate, and its integration with interaction data is at best an unexplored space. There is simple reporting on this data, and then there is analysis; according to Davenport, “There’s analytics, which, to me, is explanatory and predictive. It’s why this happened and what might happen going forward.”7

Footnotes
1 Extracted from www.informationweek.com/news/software/info_management/228800661, January 2012.
2 Extracted from www.emc.com/collateral/demos/microsites/emc-digital-universe-2011/index.htm, January 2012.
3 Thomas H. Davenport and Jeanne G. Harris, Competing on Analytics: The New Science of Winning.
4 Extracted from www.itproportal.com/2012/02/02/facebook-releases-uvsage-statistics-845-m..., January 2012.
5 Extracted from www.itproportal.com/2012/02/02/facebook-releases-usage-statistics-845-mi..., January 2012.
6 Extracted from www.telegraph.co.uk/finance/newsbysector/mediatechnologyandtelecoms/digi..., January 2012.
7 Extracted from http://sloanreview.mit.edu/the-magazine/2010-fall/52102/are-you-ready-to....


Andrew Cardno has more than 16 years of experience in business analytics, ranging from modeling health care drive times to casino gaming floor analytics. He often presents on the future of analytics across the world and has spent the last seven years living in the United States and working with corporations around the world. He can be reached at andrewcardno[at]yahoo.com.

Dr. Ralph Thomas is Vice President of Strategic Analytics and Database Marketing for Seminole Gaming. During his years in the casino industry, Thomas has focused on maximizing profitability by applying statistical analysis to the company database. Previously, Thomas spent 15 years in academia, as both a student and a lecturer of mathematics. He can be reached at ralph.thomas[at]stofgaming.com.
