Market Basket Analysis, Part III: Using Demographics and Spatial Information

Article Author
Bart A. Lewin, Dr. A. K. Singh and Andrew Cardno
Publish Date
February 1, 2009

[Author’s Note: This is the third and final article in our Market Basket Analysis (MBA) series. In last month’s article1 we discussed how market basket analysis can be used not only to analyze the mix of products purchased by a casino patron in a single visit, but also to determine customer preferences for options and characteristics of modern multi-game machines. In a customer-centric organization, knowledge of customer preferences is key to building a good customer loyalty program2. Here, we round out the picture by showing how to incorporate customer demographics in the MBA, so we can see the relationships between behavior and a patron’s demographic profile.]

It is commonly thought in the gaming industry that it is difficult to find a usable profile of a gambler. There is, however, some literature available on the subject. According to a 1987 study in Canada, the typical Canadian gambler (at the time) was a blue-collar worker with a stalled career, had no criminal record, did not spend recklessly and did not tell the truth about his or her gambling activities.3 Perhaps the best study on the behavior of gamblers in the United States is Harrah’s Entertainment’s Profile of the American Casino Gambler (2002 – 2006).4 Other industries utilize customer preferences such as recency, frequency, product category and amount spent, together with demographic data (e.g., age, education, income, gender, ethnicity) and spatial (geographic) information, in their MBA to gain market share.5 For example, International House of Pancakes (IHOP), the popular restaurant chain, makes detailed studies of the demographics and other characteristics of an area before deciding where to place a new outlet.6 Also, Netflix has offered a $1 million prize to the team that improves the performance of its DVD rental preference model Cinematch by 10 percent.7 Cinematch uses information such as “Customer A hated Ishtar, and loved The Dark Knight” to come up with movie suggestions for its customers. A group of scientists at AT&T Labs and Yahoo! Research, and a team of researchers based in Austria, were recently awarded a consolation prize of $50,000 for a system that upgraded Cinematch’s abilities by more than 9 percent.8

Today’s Demographics and Casino Floor Optimization
Today, electronic slot machines typically offer a single game with options (e.g., amount bet, number of paylines, a range of bonuses, and multiples of a base bet). Tomorrow, several games are likely to be available on a single cabinet.

One heavily publicized example is CityCenter, the newest $9.2 billion MGM project in Las Vegas, featuring server-based slot machines. It is said that each gaming device will allow players to select from a large menu of games, order drinks, print show tickets and play progressive games.

Server-based games, those that allow multiple games to be played on a single slot machine cabinet, will likely be subject to innovative royalty structures that could, for example, be based on the number and types of games a casino makes available and how many times each game is played. Therefore, predicting the characteristics of games attractive to certain customer profiles could become a key component of casino marketing and casino floor optimization. Also, understanding how customers will play these games may provide insights into how the games could be configured to maximize profitability (e.g., make the default game for this customer a five multi-line game). MBA can also be used for casino floor optimization and casino marketing; the data in these cases has a spatial (geographical or location) component and requires the use of Spatial MBA.9 In the next sections, we provide some simple examples of using MBA with demographic and spatial data.

Market Basket Analysis for Demographic and Gaming Activity Data

In this simulated scenario, we have gaming preference and demographic data for 20 visits, depicted in Table 1. In the Age Range column, we have a classifier indicating whether the patron’s age is more than one standard deviation below the average age (Low), within one standard deviation of the average age (Med) or more than one standard deviation above the average age (High). The Income Range column uses a similar classification of household income. The Presence of Children column indicates whether or not there is a child living in the patron’s household, and the three Slot X / Attribute X columns represent whether or not a player played a minimum amount on a slot machine using a particular game attribute. An attribute might be a particular game option, such as a five-line game.
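The Low / Med / High classification described above can be sketched in a few lines of Python. This is a minimal illustration, not the calculation behind Table 1: the ages are made up, the function name `classify_by_std` is hypothetical, and the use of the population standard deviation is an assumption, since the article does not say which form was used.

```python
from statistics import mean, pstdev

def classify_by_std(values):
    """Label each value Low, Med or High relative to one standard deviation from the mean."""
    avg = mean(values)
    sd = pstdev(values)  # population standard deviation (an assumption)
    labels = []
    for v in values:
        if v < avg - sd:
            labels.append("Low")   # more than one SD below the average
        elif v > avg + sd:
            labels.append("High")  # more than one SD above the average
        else:
            labels.append("Med")   # within one SD of the average
    return labels

# Hypothetical patron ages, for illustration only
ages = [23, 35, 41, 44, 47, 52, 58, 79]
age_ranges = classify_by_std(ages)
print(list(zip(ages, age_ranges)))
```

The same function could be applied to household income to produce the Income Range column.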

We have run this data through an association rule calculator. As explained in the first article of this series, in classic market basket analysis, the association rules (Table 2) tell us which products are purchased with others. In this case, rather than associate products, we associated gaming behavior and demographic information, and this made for at least one very interesting observation.

For example, it is very clear that people with children like to play Slot 1 in the manner described by Attribute 1, but do not like to play Slot 2 using Attribute 1; the support and confidence measures (explained in the first article) are well above our minimum threshold.
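To make the idea of rules that clear a minimum threshold concrete, here is a small sketch that enumerates every pairwise rule in a set of visit records and keeps only those above chosen support and confidence thresholds. Everything in it is hypothetical: the eight visit records, the flag names (`children`, `slot1_attr1` and so on) and the threshold values are invented for illustration and are not the data of Table 1 or the rules of Table 2.

```python
from itertools import permutations

# Hypothetical visit records; each set holds the flags that were true for that visit.
visits = [
    {"children", "slot1_attr1"},
    {"children", "slot1_attr1"},
    {"children", "slot1_attr1", "slot3_attr3"},
    {"children", "slot1_attr1"},
    {"slot2_attr1"},
    {"slot2_attr1", "slot3_attr3"},
    {"children", "slot3_attr3"},
    {"slot2_attr1"},
]

MIN_SUPPORT = 0.3      # assumed threshold
MIN_CONFIDENCE = 0.7   # assumed threshold

def mine_pair_rules(records, min_support, min_confidence):
    """Return (antecedent, consequent, support, confidence) for rules above both thresholds."""
    n = len(records)
    items = sorted(set().union(*records))
    rules = []
    for a, b in permutations(items, 2):
        count_a = sum(1 for r in records if a in r)
        count_ab = sum(1 for r in records if a in r and b in r)
        sup = count_ab / n                            # P(a and b)
        conf = count_ab / count_a if count_a else 0.0  # P(b | a)
        if sup >= min_support and conf >= min_confidence:
            rules.append((a, b, sup, conf))
    return rules

rules = mine_pair_rules(visits, MIN_SUPPORT, MIN_CONFIDENCE)
for a, b, sup, conf in rules:
    print(f"{a} -> {b}: support={sup:.2f}, confidence={conf:.2f}")
```

In this made-up data, patrons with children never appear together with `slot2_attr1`, so no rule linking the two survives the thresholds, while the rule from `children` to `slot1_attr1` does.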

This information may be used in several ways. For example, when we are marketing slot tournaments to players who have children, we can alter our product offering to include products they are more likely to enjoy.

In the world of configurable games, when players known to have children insert their player card into a cabinet compatible with Slot 1, that game may be presented to them immediately, defaulting to Attribute 1. This customer intimacy, where the gaming product becomes reactive to the customers’ needs and desires, supports the operator’s goal of maximizing revenue. It is our view that if players are quickly provided the product offering they are known to like best, satisfaction and spending on casino entertainment would increase accordingly. (See Table 2.)

Our illustrative example with only six “products” is quite manageable. But the number increases quickly as more games or more precise demographic information become available. In addition, the demographic data can be extended to include census information. Using MBA for this analysis could result in thousands of good rules, making the requirement for sophisticated business intelligence a central theme in the management of active gaming floors. As we illustrated in our previous article, applying super graphics to the output provides a powerful way for humans to understand the massive complexity associated with full MBA-based gaming floor management.

Spatial Market Basket Analysis for Gaming Data

Attributes of a gaming machine are one aspect of the product offering; however, the location of the gaming machine relative to other gaming devices truly defines the whole product offering. Our managers have always known that it is the right product at the right price at the right place that defines the success of the floor. To take the next step in this analysis, we need massive computer power to crunch the association rule numbers.

The following section is important because understanding the basis for a model is a critical requirement in its management and the discovery of outliers.

Spatial data requires some additional processing. Consider the hypothesis that a slot machine that is close to the casino entrance gets more play than those that are not. As discussed in the first article of our MBA series, the quality of a rule produced by MBA is evaluated in terms of its support and confidence values. The support value of an association rule in the case of binary variables (a binary value is a yes or no value: Did someone buy a certain product or not?) is defined as the probability that someone buys one product together with another (e.g., turkey and yams). The confidence value is a conditional probability: the probability that someone who buys one product also buys the other.
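For binary data, the two definitions above reduce to simple counting. The following sketch shows the arithmetic for the rule “if turkey, then yams”; the five baskets are invented for illustration, and the function names are our own.

```python
def support(baskets, antecedent, consequent):
    """Fraction of all baskets that contain both items: P(antecedent and consequent)."""
    both = sum(1 for b in baskets if antecedent in b and consequent in b)
    return both / len(baskets)

def confidence(baskets, antecedent, consequent):
    """Fraction of baskets with the antecedent that also contain the consequent."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(1 for b in with_antecedent if consequent in b) / len(with_antecedent)

# Hypothetical shopping baskets
baskets = [
    {"turkey", "yams"},
    {"turkey", "yams", "stuffing"},
    {"turkey"},
    {"yams"},
    {"stuffing"},
]

rule_support = support(baskets, "turkey", "yams")        # 2 of 5 baskets
rule_confidence = confidence(baskets, "turkey", "yams")  # 2 of 3 turkey baskets
print(f"support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

Note that confidence is not symmetric: the rule “if yams, then turkey” has the same support but a confidence computed over the baskets containing yams.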

Because the number of possible distances from the entrance is unlimited, calculating these values for MBA requires the use of a threshold and a cut-off point (a theoretical maximum) to obtain meaningful support and confidence values. This threshold problem also arises in other situations; the term “high coin-in” also requires a cut-off point. There is some literature available on the use of fuzzy logic describing this type of calculation10, where a continuous variable (e.g., coin-in) is mapped to a score in the range [0,1]. Figure 1 shows an example of a distance-to-score conversion for a small data set with four slot games placed at different distances from the casino entrance. In this example, the maximum slot-to-entrance distance is 300 units; any slot within a distance of 60 units is assigned a score of 1, and the score for distances greater than 60 decreases linearly from 1 to 0, reaching 0 at the maximum distance. Once distances have been converted to their requisite scores, the spatial support and confidence for location-based rules can be calculated.11
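The distance-to-score conversion described above (a score of 1 within 60 units, falling linearly to 0 at 300 units) can be written as a short function. This is a sketch of that description only; the function name, parameter names and the four sample distances are hypothetical and are not the values plotted in Figure 1.

```python
def distance_score(distance, full_score_within=60.0, max_distance=300.0):
    """Map a distance to a closeness score in [0, 1]."""
    if distance <= full_score_within:
        return 1.0   # close enough to count fully as "close to the entrance"
    if distance >= max_distance:
        return 0.0   # at or beyond the cut-off point
    # Linear decrease from 1 (at full_score_within) to 0 (at max_distance)
    return (max_distance - distance) / (max_distance - full_score_within)

# Hypothetical distances of four slot games from the casino entrance
distances = [30, 120, 180, 300]
scores = [distance_score(d) for d in distances]
print(scores)
```

A “high coin-in” score can be built the same way, with the ramp rising rather than falling between its two cut-off points.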

We next show how to calculate the support and confidence for the spatial variable distance, using two Spatial Association Rules (SAR) for a small data set (Table 3).

SAR1: “If a slot machine is close to the casino entrance, then it has high coin-in.”

SAR2: “If a slot machine is close to the casino buffet, then it has high coin-in.”

In Table 4, the first column shows the distance of the slot game from the casino entrance; the second, the coin-in per day for the game; the third, the distance score; the fourth, the coin-in score; and the fifth, the product of the third and fourth columns. The sixth column shows the distance of the slot game from the casino buffet, and the remaining two columns show the calculations for spatial support and confidence for SAR2. Using the approach shown here, one can calculate the spatial support and spatial confidence for many association rules and then select the ones that are useful. For instance, if the hypothesis that higher revenue is associated with proximity to the casino entrance is verified, then a casino manager might place the slightly less popular games near the entrance and the more popular games farther away to draw customers in. Spatial MBA may also be used to study local versus non-local customers.
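The column-by-column calculation can be sketched in code, with one important caveat: the exact formula used for Table 4 is not reproduced in this text, and footnote 11 notes that its numerator was modified. The sketch below therefore assumes one common fuzzy formulation, in which spatial support is the average of the score products and spatial confidence is the sum of the products divided by the sum of the distance scores. The data, cut-off values and function names are all hypothetical.

```python
def ramp_down(x, full_until, zero_at):
    """Score of 1 up to full_until, falling linearly to 0 at zero_at (closeness)."""
    if x <= full_until:
        return 1.0
    if x >= zero_at:
        return 0.0
    return (zero_at - x) / (zero_at - full_until)

def ramp_up(x, zero_until, full_at):
    """Score of 0 up to zero_until, rising linearly to 1 at full_at (high coin-in)."""
    if x <= zero_until:
        return 0.0
    if x >= full_at:
        return 1.0
    return (x - zero_until) / (full_at - zero_until)

def spatial_support_confidence(distance_scores, coin_in_scores):
    """ASSUMED formulation: product of scores, averaged (support) or normalized
    by the total distance score (confidence). Not necessarily the article's formula."""
    products = [d * c for d, c in zip(distance_scores, coin_in_scores)]
    total = sum(products)
    sup = total / len(products)
    denom = sum(distance_scores)
    conf = total / denom if denom else 0.0
    return sup, conf

# Hypothetical data for SAR1: "close to the entrance" -> "high coin-in"
entrance_distance = [30, 120, 180, 300]      # units from the casino entrance
coin_in_per_day = [5000, 4000, 2000, 1000]   # dollars per day

d_scores = [ramp_down(d, 60, 300) for d in entrance_distance]
c_scores = [ramp_up(c, 1000, 5000) for c in coin_in_per_day]  # assumed cut-offs
sar1_support, sar1_confidence = spatial_support_confidence(d_scores, c_scores)
print(f"SAR1 support={sar1_support:.3f}, confidence={sar1_confidence:.3f}")
```

Repeating the calculation with buffet distances in place of entrance distances would give the corresponding values for SAR2, so the two rules can be compared on the same footing.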

Conclusions

This completes our series about market basket analysis. We have attempted to show the mathematical foundation and the reasoning for combining both behavioral and demographic information when applying MBA to casino marketing and operations. We believe that the application of MBA is a critical component in the management of the active gaming floor of the future, and that it has significant applications on the gaming floor of today. We have deliberately used simple examples because we believe that, in the end, computers will automate much of the work, but that this automation is dangerous without a human understanding of the fundamental theory.

Footnotes

1 Lewin, Bart A., Singh, A. K. and Cardno, Andrew. “Let’s Talk Turkey: Applying Retail Market Basket Analysis to Gaming”. Casino Enterprise Management, December 2008.
2 Hughes, Arthur Middleton (2008). Obtaining and Acting on Customer Preferences, http://www.crm2day.com/editorial/50092.php.
3 Brenner, G. A., and Brenner, R. (1987). “A Profile of Gamblers.” http://ideas.repec.org/p/mtl/montde/8716.html.
4 http://findarticles.com/p/articles/mi_m0EIN/is_/ai_89209832; http://www.americangaming.org/industry/faq_detail.cfv?id=60; http://www.hotel-online.com/News/PR2004_4th/Oct04_HarrahsSurvey.html; http://www.harrahs.com/harrahs-corporate/about-us.html.
5 Gordon, Larry (2008). Leading Practices in Market Basket Analysis – How Top Retailers Are Using Market Basket Analysis to Win Margin and Market Share. (http://www.irgintl.com/pdf2/1.pdf).
6 “The New Science of Siting Stores,” Business Week, July 6, 2005, http://www.businessweek.com/technology/content/jul2005/tc2005076_7033.htm.
7 http://www.netflixprize.com.
8 http://bits.blogs.nytimes.com/2008/12/10/computer-scientists-claim-50000....
9 http://www.springerlink.com/content/r613754xp4862768/.
10 Dubois, Didier, Hüllermeier, Eyke and Prade, Henri (2006). A Systematic Approach to the Assessment of Fuzzy Association Rules. Data Mining and Knowledge Discovery. Vol. 13(2), pp. 167 – 192.
11 http://sdh-sageo.teledetection.fr/index2.php?option=com_docman&task=doc_view&gid=46&Itemid=35. We have modified the numerator of the formula to better fit our needs.

Bart Lewin has more than 25 years of experience in the Engineering and Information Technology field, holding several technical and executive technical management positions. He is currently a technical and management consultant.
He can be reached at balewin@mac.com.

Dr. Ashok K. Singh has taught statistics, mathematics and operations research courses at New Mexico Tech, Socorro, N.M., and statistics and mathematics courses at University of Nevada, Las Vegas. He has over 75 publications in theoretical and applied statistics.

Andrew Cardno has more than 16 years of experience in business analytics, ranging from modeling healthcare drive times to casino gaming floor analytics. He often presents on the future of analytics across the world and has spent the last seven years living in the United States and working with corporations around the world.
