AusDM07

AusDM 2007 Program

A PDF version of the program is available.

Keynote Presentation

Professor Jaideep Srivastava

Title: Data Mining for Social Network Analysis

Abstract: A social network is defined as a social structure of individuals who are related to each other, directly or indirectly, through a common relation of interest, e.g. friendship, trust, etc. Social network analysis is the study of social networks to understand their structure and behavior. Social network analysis has gained prominence due to its use in different applications - from product marketing (e.g. viral marketing) to search engines and organizational dynamics (e.g. management). Recently there has been a rapid increase in interest regarding social network analysis in the data mining community. The basic motivation is the demand to exploit knowledge from the copious amounts of data collected on the social behavior of users in online environments. A prime example is the research effort dedicated to the Enron email dataset. Data mining based techniques are proving to be useful for analysis of social network data, especially for large datasets that cannot be handled by traditional methods.

This talk will provide an up-to-date introduction to the increasingly important field of data mining in social network analysis, and a brief overview of research directions in this field. We first provide an introduction to social network analysis and then briefly survey the research in this field. Next, an overview of emerging research in data mining for social network analysis is presented. Finally, we will present our own work in two areas: (i) data mining for socio-cognitive analysis of email networks, and (ii) data mining on logs from massively multi-player online (MMO) games to understand social and group dynamics amongst players.

Speaker Bio: Jaideep Srivastava is a professor at the University of Minnesota, where he has established and led a research laboratory which conducts research in the information and knowledge aspects of computing. He has supervised 24 Ph.D. dissertations and 50 M.S. theses, and authored or co-authored over 200 papers in refereed journals and conferences. Dr Srivastava has served on the editorial boards of various journals, including IEEE TPDS, IEEE TKDE, and the VLDB journal. He has also served as Program and Conference Chair for a number of prominent conferences, especially in the area of data mining, and is on the Steering Committee for the PAKDD series of conferences. He has delivered a number of keynote addresses, plenary talks, and invited tutorials at major conferences.

Dr Srivastava interacts actively with industry, in both consulting and executive roles. Specifically, during a two-year sabbatical in 1999-2001, he led a corporate data mining team at Amazon.com (www.amazon.com) and built a data analytics department at Yodlee (www.yodlee.com) from the ground up. More recently, he spent two years as the Chief Technology Officer for Persistent Systems, where he built an R&D division and oversaw the redesign of the training and technical vitalization program for 2,200+ engineers. He has provided technology and technology strategy advice to a number of large corporations including Cargill, United Technologies, IBM, Honeywell, 3M, and Eaton. He has served in an advisory capacity to a number of small companies, including Lancet Software and Infobionics.

Dr Srivastava has also played an active advisory role in the government sector. Specifically, he has served as the US federal government's expert witness in a nationally significant tax case. He is presently serving as Senior Technology Advisor to the State of Minnesota, and is on the Technology Advisory Council to the Chief Minister of Maharashtra, India. He is a Fellow of the IEEE, and has been an IEEE Distinguished Visitor.

Industry Keynote Presentation

Dr Bhavani Raskutti

Title: Partnership between research and industry for developing innovative data mining applications

Abstract: In this talk, Bhavani will argue for the need for strong and equal partnership between industry and research to develop innovative data mining solutions to solve real world problems. Bhavani will draw on examples from her experience as a researcher at Telstra to tease out the role that research and industry play during such a partnership, and highlight some of the factors that contribute to the successful implementation of innovative data mining solutions.

Speaker Bio: Dr Bhavani Raskutti is the manager of Westpac's Predictive Modelling Unit, where she has implemented a series of productivity improvements and automation to deliver more accurate models in a significantly shorter time frame. These improvements have resulted in more consistent leveraging of predictive models by the business for customer relationship management.

Prior to Westpac, Bhavani was a Lead Researcher for Telstra's Artificial Intelligence unit. Her areas of interest included natural language understanding, clustering, machine learning and text mining. Some of her commercial achievements at Telstra include: successful commercialisation of an internally developed search engine; improvement in marketing and sales effectiveness through the business's adoption of innovative predictive models to improve customer targeting for churn and up-sell; and implementation of an automated text-based analysis tool to improve customer complaint handling.

Other accolades include winning the global KDD (Knowledge Discovery and Data Mining) Cup in 2002, co-founding Insitex to implement data and text mining for bio-informatics, and authoring four patents and over 40 publications in the field of analytics.

Industry Keynote Presentation

Dr Eugene Dubossarsky

Title: Bridging the Divide - Appropriate Sophistication

Abstract: Industrial data mining gurus repeatedly emphasise demonstrated value, effective business communications, a focus on key business problems, and the political difficulties of selling their product and educating their market.

Academic data mining efforts are primarily driven by different imperatives, seeking to produce academically worthy innovation, measured by academically accepted criteria, over academically determined timeframes, resulting in Ph.D. theses, world-class journal articles and further research grants.

The careers of both groups reflect their different purposes. However, there is, and has always been, overlap between their activities.

What is often missing is an innovative eye to solving real-world problems, under real-world timeframes, political and business constraints, and commercial objectives.

In facing business reality, the analyst must at times present a different paradigm of problem specification to management, educating and challenging perceptions. The times to do this are rare, and must be treated with extreme caution, political sensitivity and highly effective communication.

Genuine innovation in the "real world" can often make a difference, but opportunities are frequently missed. Furthermore, these opportunities may not be well aligned with academic objectives.

The analytics professional of the future must be an independent innovator with an eye on business value, aware of recent academic developments and ready to adapt them as appropriate.

This is increasingly important as analytics becomes ubiquitous across a range of industries, and competition dictates an "arms race" in data driven decision making.

I will present some examples of appropriate sophistication and a framework for Appropriate Sophistication, addressing the tangible value add of methods, tools and techniques.

Speaker Bio: Eugene Dubossarsky is a founder and Fellow of the Institute of Analytics Professionals of Australia Limited (IAPA). He served as the Institute's Secretary in the first two years of its life, and is now Head of the NSW Chapter.

He is currently the owner of Alces Ronin Pty Ltd, an Analytics Services company specialising in vendor-neutral and freeware-based solutions, analytics strategy, capacity building, and advanced analytics including forecasting, small data set prediction and collective intelligence techniques such as prediction markets. His team's recent work has included marketing, forensics, financial forecasting and prediction markets.

Eugene is a Senior Visiting Fellow at the School of Mathematics, University of New South Wales.

He has worked in commercial analytics for over a decade, most recently as Director of Predictive Business Intelligence at Ernst & Young. He has worked in a range of industries and environments, applying analytics to a diverse range of problems including marketing, futures trading, mining optimisation, medical technology, web analytics, sales forecasting, general insurance forecasting, product design, direct marketing, human resources, and customer retention.

He has a strong interest in collective intelligence techniques ("tacit data mining"), the application of advanced analytics methods to real-world problems, the commercial use of data mining freeware and the appropriate configuration of business analytics efforts.

Eugene holds a Ph.D. in Computing Science from the University of Technology, Sydney and a B.Sc. (Hons 1) from UNSW.

Education Panel

As a first, there will be a panel discussion on data mining and analytics education issues from both academic and industry perspectives. The panel will follow a session on data mining education comprising two presentations. Panel members will include leading Australian data mining academics and industry participants.

Accepted Papers

SemGrAM - Integrating Semantic Graphs into Association Rule Mining
John Roddick, Peter Fule
Evaluation of a Graduate Level Data Mining Course with Industry Participants
Peter Christen
Are Zero-suppressed Binary Decision Diagrams Good for Mining Frequent Patterns in High Dimensional Datasets?
Elsa Loekito, James Bailey
Using Corpus Analysis to Inform Research into Opinion Detection in Blogs
Deanna Osman, John Yearwood, Peter Vamplew
Useful Clustering Outcomes from Meaningful Time Series Clustering
Jason Chen
Preference Networks: Probabilistic Models for Recommendation Systems
The Truyen Tran, Quoc Dinh Phung, Svetha Venkatesh
Termhood Determination for Automatic Term Recognition (Part 1): Domain Prevalence and Tendency
Wilson Wong, Wei Liu, Mohammed Bennamoun
Termhood Determination for Automatic Term Recognition (Part 2): A Probabilistic Framework
Wilson Wong, Wei Liu, Mohammed Bennamoun
Discovering Frequent Sets from Data Streams with CPU Constraint
Xuan Hong Dang, Wee-Keong Ng, Kok-Leong Ong, Vincent C S Lee
Mining for offender group detection and story of a police operation
Fatih Ozgul, Julian Bondy, Hakan Aksoy
A Two-Step Classification Approach to Unsupervised Record Linkage
Peter Christen
Measuring Data-Driven Ontology Changes using Text Mining
Majigsuren Enkhsaikhan, Wilson Wong, Wei Liu, Mark Reynolds
Exploratory Multilevel Hot Spot Analysis: Australian Taxation Office Case Study
Denny, Graham J. Williams, Peter Christen
A New Efficient Privacy-Preserving Scalar Product Protocol
Artak Amirbekyan, Vladimir Estivill-Castro
Adaptive Spike Detection for Resilient Data Stream Mining
Clifton Phua, Kate Smith-Miles, Vincent Lee, Ross Gayler
Analytics for Audit and Business Controls in Corporate Travel & Entertainment
Vijay Iyengar, Ioana Boier, Karen Kelley, Raymond Curatolo
News Aware Volatility Forecasting: Is the Content of News Important?
Calum Robertson, Shlomo Geva, Rodney Wolff
Classification for accuracy and insight: A weighted sum approach
Anthony Quinn, Andrew Stranieri, John Yearwood
Effectiveness of Using Quantified Intermarket Influence for Predicting Trading Signals of Stock Markets
Chandima Tilakaratne, Musa Mammadov, Sidney Morris
Establishing a Lineage for Medical Knowledge Discovery
Anna Shillabeer, John Roddick
An E-Market Framework to Determine the Strength of Business Relationships between Intelligent Agents
Khandaker Shahidul Islam
Predictive Model of Insolvency Risk for Australian Corporations
Rohan Baxter, Mark Gawler, Russell Ang
PCITMiner - Prefix-based Closed Induced Tree Miner for finding closed induced frequent subtrees
Sangeetha Kutty, Richi Nayak, Yuefeng Li
Customer Analytics Projects: Addressing Existing Problems with a Process that Leads to Success
Inna Kolyshkina, Simeon Simoff
The application of data mining techniques to characterize agricultural soil profiles
Leisa Armstrong, Dean Diepeveen, Rowan Maddern
Reflection on Development and Delivery of a Data Mining Unit
Bozena Stewart

Last modified: 2007-11-15 21:22:08 Graham Williams

Further information from Peter Christen, Paul Kennedy, Jiuyong (John) Li, or Richi Nayak.