In Australia, the big four banks receive large volumes of Electronic Funds Transfer at Point of Sale (EFTPOS) transaction data every day, yet this information-rich data is neither stored nor analysed. Because EFTPOS data is both very large and very messy, it is difficult for the banks themselves to gain visibility of the characteristics of the data’s stakeholders.
That changed in 2014, when Dr. Grace Rumantir, a researcher in Monash’s Faculty of IT, approached us for help in gaining access to, or building, a secure analysis environment for a data mining project on a collection of commercially sensitive EFTPOS data obtained through an award-winning collaboration with the Australia and New Zealand Banking Group (ANZ). To our knowledge, this is the first time market segmentation analyses have been applied to such a large amount of EFTPOS data anywhere in the world.
As a pilot, ANZ collated five months of EFTPOS transaction records, with all customer and retailer identifying data redacted. Before this commercial-in-confidence data could be released for research purposes, ANZ produced a comprehensive list of requirements for its secure storage and processing. Securing the release of the data under ANZ’s Information Security protocol was a lengthy and difficult process. It succeeded largely because our team could demonstrate with confidence that the infrastructure we have in place at Monash meets these requirements.
Our team quickly built a workhorse yet appropriately secure environment on R@CMon, using specialist nodes because of the memory needed to process such a large dataset. The R@CMon environment already uses software-defined virtualisation technology, we sandbox servers, and R@CMon is housed in Monash’s own secure-access facility. All ingress and egress access was locked down to allow only a few known clients (Grace and her research students). Remote desktop software and several data-mining tools of interest were configured for use by the researchers. The data (in daily CSV samples) was stored in an encrypted volume file, which was uploaded to a R@CMon volume attached to the analysis server. Individual passwords were used to unlock and mount the encrypted data, with a strict usage protocol to ensure the data remained locked when not in use.
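To illustrate the "only a few known clients" rule, here is a minimal Python sketch of a default-deny allow-list check. It is not how the R@CMon lock-down was actually implemented, which this post does not describe. The `ALLOWED_CLIENTS` list and the `is_allowed` helper are hypothetical, and the addresses come from the reserved documentation ranges rather than any real client.

```python
import ipaddress

# Hypothetical allow-list (assumption): documentation-range addresses,
# standing in for the handful of known researcher machines.
ALLOWED_CLIENTS = [
    ipaddress.ip_network("192.0.2.10/32"),    # a single workstation
    ipaddress.ip_network("198.51.100.0/28"),  # a small lab subnet (.0 to .15)
]


def is_allowed(client_ip: str) -> bool:
    """Return True only if the client falls inside an allow-listed network.

    Anything not explicitly listed is refused (default deny).
    """
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in ALLOWED_CLIENTS)


if __name__ == "__main__":
    for candidate in ("192.0.2.10", "198.51.100.7", "203.0.113.5"):
        print(candidate, "allowed" if is_allowed(candidate) else "denied")
```

The point of the sketch is the default: a connection is refused unless its source is explicitly listed, rather than accepted unless it is explicitly blocked.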
A paper outlining our experience in acquiring, securely storing and processing the EFTPOS data can be found at:
Ashishkumar Singh, Grace Rumantir, Annie South, and Blair Bethwaite, Clustering Experiments on Big Transaction Data for Market Segmentation. In Proceedings of the 2014 International Conference on Big Data Science and Computing (BigDataScience ’14). ACM, New York, NY, USA, Article 16, DOI=http://dx.doi.org/10.1145/2640087.2644161
The market segmentation experiments on the retailers in the EFTPOS data first reduce the transaction data using the RFM (Recency, Frequency, Monetary) model and then apply clustering analysis. The resulting clusters show distinct combinations of retailers’ RFM values, which could point the bank to different marketing strategies for each retailer performance category. This ground-breaking revelation of retailer segments extracted from EFTPOS data won the Best Paper Award, Industry Track, at the Australasian Data Mining and Analytics Conference 2014.
Ashishkumar Singh, Grace Rumantir and Annie South, Market Segmentation of EFTPOS Retailers. In Proceedings of the 12th Australasian Data Mining Conference (AusDM 2014), Brisbane, Australia (http://ausdm14.ausdm.org/program) – Best Paper Award Industry Track
Ashishkumar Singh, Grace Rumantir. Two-tiered Clustering Classification Experiments for Market Segmentation of EFTPOS Retailers. Australasian Journal of Information Systems, [S.l.], v. 19, sep. 2015. ISSN 1449-8618. Available at: <http://journal.acs.org.au/index.php/ajis/article/view/1184>. Date accessed: 18 Oct. 2015. doi:http://dx.doi.org/10.3127/ajis.v19i0.1184.
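For readers unfamiliar with RFM, the reduction-then-clustering idea described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method used in the papers: the record layout, the made-up `TRANSACTIONS` sample, the min-max scaling step and the plain k-means routine are all assumptions introduced here, and the papers should be consulted for the actual algorithms and parameters.

```python
from collections import defaultdict
from datetime import date

# Made-up sample records (assumption): (retailer_id, transaction_date, amount).
TRANSACTIONS = [
    ("r1", date(2014, 1, 5), 20.0),
    ("r1", date(2014, 1, 30), 35.0),
    ("r1", date(2014, 2, 27), 45.0),
    ("r2", date(2014, 1, 10), 500.0),
    ("r3", date(2014, 2, 28), 15.0),
    ("r3", date(2014, 2, 28), 25.0),
]


def rfm_table(transactions, reference_date):
    """Reduce transactions to one (recency, frequency, monetary) row per retailer."""
    last_seen = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for retailer, day, amount in transactions:
        last_seen[retailer] = max(last_seen.get(retailer, day), day)
        frequency[retailer] += 1
        monetary[retailer] += amount
    # Recency = days between the retailer's latest transaction and the reference date.
    return {
        r: ((reference_date - last_seen[r]).days, frequency[r], monetary[r])
        for r in last_seen
    }


def min_max_scale(rows):
    """Scale each column to [0, 1] so no single RFM dimension dominates distances."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    spans = [(max(c) - min(c)) or 1.0 for c in columns]
    return [tuple((v - lo) / s for v, lo, s in zip(row, lows, spans)) for row in rows]


def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points seed the centroids so runs are repeatable."""
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Move each centroid to the mean of its members; keep it if the cluster is empty.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


table = rfm_table(TRANSACTIONS, reference_date=date(2014, 3, 1))
retailers = sorted(table)
labels = kmeans(min_max_scale([table[r] for r in retailers]), k=2)

for retailer, label in zip(retailers, labels):
    print(retailer, table[retailer], "cluster", label)
```

In this toy sample the two recently active retailers land in one cluster and the single large, older transaction lands in the other, which is the kind of separation that can then be read as a retailer performance category.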
This exciting result has been cited in financial industry publications as an important example of how academia can help businesses gain insights from their own massive amounts of data to inform their business decisions.
On the success of this collaborative project, Patrick Maes, ANZ Chief Technology Officer, writes:
“The key here is to find the data scientists who can work with these models, a skill not easy to find nowadays”
(see http://www.itnews.com.au/news/me-bank-hires-data-boss-in-it-exec-restructure-411908 and https://bluenotes.anz.com/posts/2015/03/big-data-from-customer-targeting-to-customer-centric ).
On lessons learnt from this important pilot project, Dr. Grace Rumantir says:
“There is a long standing gap between what research in academia can offer and the needs in the industry. This gap takes the form of mistrust on the part of the people in the industry that academics may not deliver a solution that is relevant to their business on a timely manner. The results of this ground breaking project using EFTPOS data shows that we do understand what business needs and come up with a practical solution that business can directly translate into business strategies which can give them an edge in the competitive business environment.
We are able to do this with our ability to talk in the same wavelength with our industry clients, with our research skills in bleeding edge technology and with the support of the world class research support and infrastructure that Monash has been investing heavily on.”