Syllabus

CSC 467: Big Data Engineering

General Information

{!assets/text/instructor_info_summer.md!}


Course Information


Required Materials:


Resources and Accessibility:


Course Description

This course investigates engineering approaches to solving data-intensive and big data computing problems. Course topics include distributed tools and parallel algorithms that help with acquiring, cleaning, and mining very large amounts of data.


Learning Objectives

Course Student Learning Outcomes (CSLO):

  1. Be able to set up and deploy appropriate data engineering technologies to manage big data sets.
  2. Be able to understand MapReduce, one of the key enabling programming concepts.
  3. Be able to implement key data mining techniques using Spark programming libraries.

BS in CS Program Objectives (CSPO):

  1. Be able to apply theory, techniques, and methodologies to create and/or maintain high-quality computing systems that function effectively and reliably in the emerging and future information infrastructure.

ABET Objectives (APO):

  1. Analyze a complex computing problem and to apply principles of computing and other relevant disciplines to identify solutions (ABET 1).
  2. Design, implement, and evaluate a computing-based solution to meet a given set of computing requirements in the context of the program’s discipline (ABET 2).

Assessments and Grading:

Grade Scale:

Refer to the Undergraduate Catalog for a description of NG (No Grade), W, and other grades.

Assessments:

| Assessment | % of Final Grade | Course Objectives Assessed | Program Objectives Assessed | ABET Objectives |
|---|---|---|---|---|
| Assignments | 40% | 1,2,3 | 1 | 1 |
| Project | 30% | 1,2,3 | 1 | 2 |
| Quiz | 15% | 1 | 1 | 1 |
| Final | 15% | 1,2,3 | 1 | 1 |
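
As an illustration of how the weights above combine, here is a minimal Python sketch. It assumes each category is first reduced to a single percentage from 0 to 100; the function name `final_percentage` and the sample scores are hypothetical and not part of the course materials.

```python
# Weights taken from the assessment breakdown above (fractions of the final grade).
WEIGHTS = {"Assignments": 0.40, "Project": 0.30, "Quiz": 0.15, "Final": 0.15}


def final_percentage(scores):
    """Combine per-category percentages (0-100) into a weighted final percentage."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {"Assignments": 90, "Project": 80, "Quiz": 70, "Final": 85}
print(round(final_percentage(example), 2))
```

How the resulting percentage maps to a letter grade depends on the grade scale, which this sketch does not model.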

Lateness Policy:

Late individual assignments are assessed a penalty of 10% per day. Saturday and Sunday each count as a day. Late submissions are not accepted for team-based milestones.
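
The per-day penalty for individual assignments can be sketched in Python as follows. The policy does not say what the 10% is taken from or how partial days are counted, so this sketch assumes that the deduction is 10% of the assignment's maximum points per day, that any started day counts as a full day, and that the score cannot go below zero. The function names and the sample dates are hypothetical.

```python
import math
from datetime import datetime

PENALTY_PERCENT_PER_DAY = 10  # from the lateness policy


def days_late(due, submitted):
    """Days late, counting weekends; a started day counts as a full day (assumption)."""
    seconds = (submitted - due).total_seconds()
    return max(0, math.ceil(seconds / 86400))


def late_adjusted_score(raw_score, max_score, due, submitted):
    """Deduct 10% of max_score per day late, never going below zero (assumption)."""
    penalty = max_score * PENALTY_PERCENT_PER_DAY * days_late(due, submitted) / 100
    return max(0.0, raw_score - penalty)


# Hypothetical example: due Friday night, submitted Monday morning.
due = datetime(2024, 6, 7, 23, 59)
submitted = datetime(2024, 6, 10, 9, 0)
print(days_late(due, submitted))                      # Saturday, Sunday, Monday
print(late_adjusted_score(85, 100, due, submitted))
```

Because Saturday and Sunday each count, the Friday-to-Monday submission in this example is treated as three days late.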


Course Topics and Schedules

| Week | Topic | Assessments |
|---|---|---|
| 1 | Introduction to Big Data | - |
|   | MapReduce Programming Paradigm | - |
|   | Data Parallel Computing with Spark | - |
| 2 | Frequent Itemsets | - |
|   | Locality-Sensitive Hashing | - |
|   | Similar Items | - |
| 3 | Clustering | - |
|   | Dimensionality Reduction | - |
|   | Recommendation Systems | - |
| 4 | Link Analysis | - |
|   | PageRank | - |
| 5 | Decision Trees | - |

{!assets/text/policy.md!}


{!assets/text/distance_education.md!}