Tutorials

This year’s AusDM will boast a variety of interesting and engaging tutorials from leading researchers.

R and Data Mining with Dr. Yanchang Zhao

This is a tutorial on data mining with R. It covers the four sessions listed below. Each session runs for one hour, made up of a 40-minute presentation and a 20-minute lab. Participants should bring a laptop with R, RStudio and the required R packages installed. Detailed instructions can be found under the “Requirement” section here.

  • R Programming: basics of the R language and programming, parallel computing, data import and export, RStudio, Shiny and R Markdown
  • Association Rule Mining with R: mining and selecting interesting association rules, redundancy removal, and rule visualisation
  • Text Mining with R: text mining, word cloud, topic modelling, and sentiment analysis
  • Social Network Analysis with R: graph construction, graph query, centrality measures, and graph visualisation
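For participants new to association rules, the measures usually used to select interesting rules (support, confidence and lift) can be sketched in a few lines. The following is a minimal illustration in plain Python rather than R, and it is not taken from the tutorial material; the toy transactions and the function names are hypothetical.

```python
# Hypothetical toy transactions, invented for illustration only.
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter", "jam"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t)


def rule_metrics(lhs, rhs, transactions):
    """Return (support, confidence, lift) for the rule lhs -> rhs.

    support    = share of transactions containing lhs and rhs together
    confidence = share of transactions with lhs that also contain rhs
    lift       = confidence divided by the overall share containing rhs
    """
    n = len(transactions)
    n_lhs = count(lhs, transactions)
    n_rhs = count(rhs, transactions)
    n_both = count(set(lhs) | set(rhs), transactions)
    if n == 0 or n_lhs == 0 or n_rhs == 0:
        raise ValueError("lhs and rhs must each occur in at least one transaction")
    support = n_both / n
    confidence = n_both / n_lhs
    lift = confidence / (n_rhs / n)
    return support, confidence, lift


if __name__ == "__main__":
    s, c, l = rule_metrics({"bread"}, {"milk"}, TRANSACTIONS)
    print(f"bread -> milk: support={s:.2f}, confidence={c:.2f}, lift={l:.2f}")
```

A lift below 1 suggests the two sides of a rule occur together less often than they would if they were independent, which is one reason such rules are often filtered out.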

Dr. Yanchang Zhao, CSIRO

Dr. Yanchang Zhao is a Senior Research Scientist with Data61, CSIRO and an Adjunct Professor with the University of Canberra. Previously, he was a Data Analytics Lead with IBM Australia in 2017, a Senior Data Scientist with the Australian Government from 2009 to 2016 and a Postdoctoral Research Fellow with the University of Technology Sydney (UTS) from 2007 to 2009.

He has 11 years of hands-on experience in data analytics and has authored or co-authored 60+ publications (incl. 4 books) on data mining research and applications. His book “R and Data Mining – Examples and Case Studies” has been widely used as a reference for university courses and industry training. He is a Founder of RDataMining.com and the RDataMining LinkedIn group. He is a Senior Member of the IEEE and a Steering Committee Member of the AusDM Conference, and has been a Program Committee member for 100+ academic conferences on data mining and machine learning.

Accessible Machine Learning with Dr. Graham Williams

This tutorial will introduce a number of tools and technologies for easily building machine learning models in R/Python in the cloud. We will present the template-based approach to building predictive models described in the recent book Essentials of Data Science and highlight the use of cloud platforms for data science. After building predictive models, we will introduce a new platform for openly sharing pre-built models and guide you through the publishing process to share your models with the community. The tutorial will be hands-on, with participants using their own laptops to connect to a cloud platform where the computation is carried out.

  • Getting Started with Open Source Data Mining on the Cloud
  • Loading and Exploring Data
  • Building Classification/Predictive Models
  • Freely Publishing/Sharing Models on the Cloud

ML Hack for ML Hub

The ML Hub (mlhub.ai) is an open source initiative that aims to make ML accessible to more people. The intent is to build a repository of pre-built ML models, each of which aims to demonstrate a practical ML model in action within 5 minutes. We invite you to participate in a session to explore ML Hub and contribute new ideas and packages to the endeavour. It would be great to hack some public pre-built models and add them to mlhub.ai. Have a look at ML Hub as it is now and come join us to share the fruits of our ML communities with the wider world.

Dr. Graham Williams, Microsoft

Dr. Graham Williams is Microsoft’s Director of Data Science for the Asia Pacific region, based in Singapore, and is an Adjunct Fellow at ANU and an Adjunct Professor at the University of Canberra. Previously, he was Senior Director of Analytics with the Australian Government and Principal Research Scientist with CSIRO. He is the author of many research papers and several textbooks in Data Mining and Data Science, as well as widely used open source software packages dating from the 1980s, including packages for Data Mining, Unix and Linux, TeX and Emacs. Further details are available at https://togaware.com/about/bio/.