• Offered by School of Computing
  • ANU College: ANU College of Engineering, Computing & Cybernetics
  • Course subject: Computer Science
  • Areas of interest: Computer Science, Advanced Computing, Information-Intensive Computing, Algorithms and Data

Real-world data are commonly messy, distributed, and heterogeneous. This course introduces core concepts of data cleaning, standardisation, and data integration, which aim to convert and map raw data into formats that allow more efficient and convenient use and analysis. The course also discusses data quality, management, and storage issues relevant to data analytics.

Learning Outcomes

Upon successful completion, students will have the knowledge and skills to:

  1. Critically reflect upon different data sources, types, formats and structures,
  2. Justify and apply data cleaning, preprocessing, and standardisation for data analytics,
  3. Apply data integration concepts and techniques to heterogeneous and distributed data,
  4. Interpret, assess and discuss data quality measurements,
  5. Understand and be able to use advanced data wrangling, data integration, and database techniques as relevant to data analytics.

Other Information

Professional Skills Mapping: Mapping of Learning Outcomes to Assessment and Professional Competencies

Indicative Assessment

  1. Written and practical assignments (30%) [LO 1,2,3,4,5]
  2. Oral presentation and report (20%) [LO 1,2,3,4,5]
  3. Final examination (50%) [LO 1,2,3,4,5]

The ANU uses Turnitin to enhance student citation and referencing techniques, and to assess assignment submissions as a component of the University's approach to managing Academic Integrity. While the use of Turnitin is not mandatory, the ANU highly recommends Turnitin is used by both teaching staff and students. For additional information regarding Turnitin please visit the ANU Online website.

Workload

The workload for the course is around 130 hours, including reading, viewing online course material, participating in face-to-face lectures, practical labs and tutorials, and preparing for assessments.

Inherent Requirements

Information on inherent requirements for this course is currently not available.

Requisite and Incompatibility

Students are required to have completed introductory courses on databases, programming and algorithms. To enrol in this course you must have completed 6 units from COMP1030 or COMP1100 or COMP1130 or COMP1730; AND 6 units from COMP1040 or COMP1110 or COMP1140; AND COMP2400. Incompatible with COMP8430 and COMP8930.

Prescribed Texts

None

Preliminary Reading

Data Matching - Concepts and Techniques for Record Linkage, Entity Resolution and Duplicate Detection,
Peter Christen, Springer, 2012 
For more information see: http://users.cecs.anu.edu.au/~christen/data-matching-book-2012.html

Fees

Tuition fees are for the academic year indicated at the top of the page.  

Commonwealth Support (CSP) Students
If you have been offered a Commonwealth supported place, your fees are set by the Australian Government for each course. At ANU 1 EFTSL is 48 units (normally 8 x 6-unit courses). More information about your student contribution amount for each course is available at Fees.

Student Contribution Band: 2
Unit value: 6 units

If you are a domestic graduate coursework student with a Domestic Tuition Fee (DTF) place or international student you will be required to pay course tuition fees (see below). Course tuition fees are indexed annually. Further information for domestic and international students about tuition and other fees can be found at Fees.

Where there is a unit range displayed for this course, not all unit options below may be available.

Units    EFTSL
6.00     0.12500

Domestic fee paying students
Year    Fee
2024    $4980

International fee paying students
Year    Fee
2024    $6360

Note: Fee information is for the current year only.

Offerings, Dates and Class Summary Links

ANU utilises MyTimetable to enable students to view the timetable for their enrolled courses, browse, then self-allocate to small teaching activities / tutorials so they can better plan their time. Find out more on the Timetable webpage.

The list of offerings for future years is indicative only.
Class summaries, if available, can be accessed by clicking on the View link for the relevant class number.

Second Semester

Class number: 9185
Class start date: 22 Jul 2024
Last day to enrol: 29 Jul 2024
Census date: 31 Aug 2024
Class end date: 25 Oct 2024
Mode of delivery: In Person
Class summary: N/A
