AusDM 2007 Program

A PDF version of the program is available.

Keynote Presentation

Professor Jaideep Srivastava

Title: Data Mining for Social Network Analysis

Abstract: A social network is defined as a social structure of individuals who are related (directly or indirectly) to each other based on a common relation of interest, e.g. friendship, trust, etc. Social network analysis is the study of social networks to understand their structure and behavior. Social network analysis has gained prominence due to its use in different applications, from product marketing (e.g. viral marketing) to search engines and organizational dynamics (e.g. management). Recently there has been a rapid increase in interest regarding social network analysis in the data mining community. The basic motivation is the demand to exploit knowledge from the copious amounts of data collected on the social behavior of users in online environments. Prime examples of this are the research efforts dedicated to the Enron email dataset. Data mining based techniques are proving to be useful for analysis of social network data, especially for large datasets that cannot be handled by traditional methods.

This talk will provide an up-to-date introduction to the increasingly important field of data mining in social network analysis, and a brief overview of research directions in this field. We first provide an introduction to social network analysis and then briefly survey the research in this field. Next, an overview of emerging research in data mining for social network analysis is presented. Finally, we will present our own work in two areas: (i) data mining for socio-cognitive analysis of email networks, and (ii) data mining on logs from massively multi-player online (MMO) games to understand social and group dynamics amongst players.

Speaker Bio: Jaideep Srivastava is a professor at the University of Minnesota, where he has established and led a research laboratory which conducts research in the information and knowledge aspects of computing. He has supervised 24 Ph.D. dissertations and 50 M.S. theses, and authored or co-authored over 200 papers in refereed journals and conferences. Dr Srivastava has served on the editorial boards of various journals, including IEEE TPDS, IEEE TKDE, and the VLDB Journal. He has also served as Program and Conference Chair for a number of prominent conferences, especially in the area of data mining, and is on the Steering Committee for the PAKDD series of conferences. He has delivered a number of keynote addresses, plenary talks, and invited tutorials at major conferences.

Dr Srivastava has a very active interaction with the industry, in both consulting and executive roles. Specifically, in a 2-year sabbatical during 1999-2001, he led a corporate data mining team at Amazon.com (www.amazon.com) and built a data analytics department at Yodlee (www.yodlee.com) from the ground up. More recently, he spent two years as the Chief Technology Officer for Persistent Systems, where he built an R&D division and oversaw the redesign of the training and technical vitalization program for 2,200+ engineers. He has provided technology and technology strategy advice to a number of large corporations including Cargill, United Technologies, IBM, Honeywell, 3M, and Eaton. He has served in an advisory capacity to a number of small companies, including Lancet Software and Infobionics.

Dr Srivastava has also played an active advisory role in the government sector. Specifically, he has served as the US federal government's expert witness in a nationally significant tax case. He is presently serving as Senior Technology Advisor to the State of Minnesota, and is on the Technology Advisory Council to the Chief Minister of Maharashtra, India. He is a Fellow of the IEEE, and has been an IEEE Distinguished Visitor.

Industry Keynote Presentation

Dr Bhavani Raskutti

Title: Partnership between research and industry for developing innovative data mining applications

Abstract: In this talk, Bhavani will argue for the need for strong and equal partnership between industry and research to develop innovative data mining solutions to solve real world problems. Bhavani will draw on examples from her experience as a researcher at Telstra to tease out the role that research and industry play during such a partnership, and highlight some of the factors that contribute to the successful implementation of innovative data mining solutions.

Speaker Bio: Dr Bhavani Raskutti is the manager of Westpac's Predictive Modelling Unit, where she has implemented a series of productivity improvements and automation to deliver more accurate models in a significantly shorter time frame. These improvements have resulted in more consistent leveraging of predictive models by the business for customer relationship management.

Prior to Westpac, Bhavani was a Lead Researcher for Telstra's Artificial Intelligence unit. Her areas of interest included natural language understanding, clustering, machine learning and text mining. Some of her commercial achievements at Telstra include the successful commercialisation of an internally developed search engine, an improvement in marketing and sales effectiveness through the business's adoption of innovative predictive models to improve customer targeting for churn and up-sell, and the implementation of an automated text-based analysis tool to improve customer complaint handling.

Other accolades include winning the global KDD (Knowledge Discovery and Data Mining) Cup in 2002, co-founding Insitex to implement data and text mining for bio-informatics, and authoring four patents and over 40 publications in the field of analytics.

Industry Keynote Presentation

Dr Eugene Dubossarsky

Title: Bridging the Divide - Appropriate Sophistication

Abstract: Industrial data mining gurus repeatedly emphasise demonstrated value, effective business communications, a focus on key business problems, and the political difficulties of selling their product and educating their market.

Academic data mining efforts are primarily driven by different imperatives, seeking to produce academically worthy innovation, measured by academically accepted criteria, over academically determined timeframes, resulting in Ph.D. theses, world class journal articles and further research grants.

The careers of both groups reflect their different purposes. However, there is, and has always been, overlap between their activities.

What is often missing is an innovative eye to solving real-world problems, under real-world timeframes, political and business constraints, and commercial objectives.

In facing business reality, the analyst must at times present a different paradigm of problem specification to management, educating and challenging perceptions. The times to do this are rare, and must be treated with extreme caution, political sensitivity and highly effective communication.

Genuine innovation in the "real world" can often make a difference, but opportunities are frequently missed. Furthermore, these opportunities may not be well aligned with academic objectives.

The analytics professional of the future must be an independent innovator with an eye on business value, aware of recent academic developments and ready to adapt them as appropriate.

This is increasingly important as analytics becomes ubiquitous across a range of industries, and competition dictates an "arms race" in data driven decision making.

I will present some examples of appropriate sophistication and a framework for Appropriate Sophistication, addressing the tangible value add of methods, tools and techniques.

Speaker Bio: Eugene Dubossarsky is a founder and Fellow of the Institute of Analytics Professionals of Australia Limited (IAPA). He served as the Institute's Secretary in the first two years of its life, and is now Head of the NSW Chapter.

He is currently the owner of Alces Ronin Pty Ltd, an Analytics Services company specialising in vendor-neutral and freeware-based solutions, analytics strategy, capacity building, and advanced analytics including forecasting, small data set prediction and collective intelligence techniques such as prediction markets. His team's recent work has included marketing, forensics, financial forecasting and prediction markets.

Eugene is a Senior Visiting Fellow at the School of Mathematics, University of New South Wales.

He has worked in commercial analytics for over a decade, most recently as Director of Predictive Business Intelligence at Ernst & Young. He has worked in a range of industries and environments, applying analytics to a diverse range of problems including marketing, futures trading, mining optimisation, medical technology, web analytics, sales forecasting, general insurance forecasting, product design, direct marketing, human resources, and customer retention.

He has a strong interest in collective intelligence techniques ("tacit data mining"), the application of advanced analytics methods to real-world problems, the commercial use of data mining freeware and the appropriate configuration of business analytics efforts.

Eugene holds a Ph.D. in Computing Science from the University of Technology, Sydney and a B.Sc. (Hons 1) from UNSW.

Education Panel

As a first, there will be a panel discussion on data mining and analytics education issues from both the academic and industry perspectives. The panel will follow a session on data mining education with two presentations. Panel members will include leading Australian data mining academics and industry participants.

Accepted Papers

  1. SemGrAM - Integrating Semantic Graphs into Association Rule Mining
    John Roddick, Peter Fule

  2. Evaluation of a Graduate Level Data Mining Course with Industry Participants
    Peter Christen

  3. Are Zero-suppressed Binary Decision Diagrams Good for Mining Frequent Patterns in High Dimensional Datasets?
    Elsa Loekito, James Bailey

  4. Using Corpus Analysis to Inform Research into Opinion Detection in Blogs
    Deanna Osman, John Yearwood, Peter Vamplew

  5. Useful Clustering Outcomes from Meaningful Time Series Clustering
    Jason Chen

  6. Preference Networks: Probabilistic Models for Recommendation Systems
    The Truyen Tran, Quoc Dinh Phung, Svetha Venkatesh

  7. Termhood Determination for Automatic Term Recognition (Part 1): Domain Prevalence and Tendency
    Wilson Wong, Wei Liu, Mohammed Bennamoun

  8. Termhood Determination for Automatic Term Recognition (Part 2): A Probabilistic Framework
    Wilson Wong, Wei Liu, Mohammed Bennamoun

  9. Discovering Frequent Sets from Data Streams with CPU Constraint
    Xuan Hong Dang, Wee-Keong Ng, Kok-Leong Ong, Vincent C S Lee

  10. Mining for offender group detection and story of a police operation
    Fatih OZGUL, Julian Bondy, Hakan Aksoy

  11. A Two-Step Classification Approach to Unsupervised Record Linkage
    Peter Christen

  12. Measuring Data-Driven Ontology Changes using Text Mining
    Majigsuren Enkhsaikhan, Wilson Wong, Wei Liu, Mark Reynolds

  13. Exploratory Multilevel Hot Spot Analysis: Australian Taxation Office Case Study
    Denny, Graham J. Williams, Peter Christen

  14. A New Efficient Privacy-Preserving Scalar Product Protocol
    Artak Amirbekyan, Vladimir Estivill-Castro

  15. Adaptive Spike Detection for Resilient Data Stream Mining
    Clifton Phua, Kate Smith-Miles, Vincent Lee, Ross Gayler

  16. Analytics for Audit and Business Controls in Corporate Travel & Entertainment
    Vijay Iyengar, Ioana Boier, Karen Kelley, Raymond Curatolo

  17. News Aware Volatility Forecasting: Is the Content of News Important?
    Calum Robertson, Shlomo Geva, Rodney Wolff

  18. Classification for accuracy and insight: A weighted sum approach
    Anthony Quinn, Andrew Stranieri, John Yearwood

  19. Effectiveness of Using Quantified Intermarket Influence for Predicting Trading Signals of Stock Markets
    Chandima Tilakaratne, Musa Mammadov, Sidney Morris

  20. Establishing a Lineage for Medical Knowledge Discovery
    Anna Shillabeer, John Roddick

  21. An E-Market Framework to Determine the Strength of Business Relationships between Intelligent Agents
    Khandaker Shahidul Islam

  22. Predictive Model of Insolvency Risk for Australian Corporations
    Rohan Baxter, Mark Gawler, Russell Ang

  23. PCITMiner - Prefix-based Closed Induced Tree Miner for finding closed induced frequent subtrees
    Sangeetha Kutty, Richi Nayak, Yuefeng Li

  24. Customer Analytics Projects: Addressing Existing Problems with a Process that Leads to Success
    Inna Kolyshkina, Simeon Simoff

  25. The application of data mining techniques to characterize agricultural soil profiles
    Leisa Armstrong, Dean Diepeveen, Rowan Maddern

  26. Reflection on Development and Delivery of a Data Mining Unit
    Bozena Stewart

Last modified: 2007-11-15 21:22:08 Graham Williams

Further information from Peter Christen, Paul Kennedy, Jiuyong (John) Li, or Richi Nayak.