Syllabus

CSC 467: Big Data Engineering

General Information

Semester: Summer 2026
Class Meeting Time: N/A

{!assets/text/instructor_info_summer.md!}

Course Information

The course runs from June 30, 2025 through August 3, 2025. It is a fully online course.
- The course is 100% asynchronous.
- All class materials and links to the recorded lectures will be provided via D2L.

Required Materials:

Mining of Massive Datasets. Jure Leskovec, Anand Rajaraman, and Jeffrey David Ullman.
Author’s Textbook Download Page.

Resources and Accessibility:

For general technical support, students can contact WCU IT HelpDesk at 610-436-3350 or via email: helpdesk@wcupa.edu.
For distance education support, students can contact WCU Distance Education Services at 610-436-3373 or via email: distanceed@wcupa.edu.
A Discord server will be created, and an invitation link will be made available inside D2L. Technical questions specific to the online competition platforms used in the course can be sent via email to the instructor or posted on the Discord server.

Course Description

This course investigates engineering approaches to solving data-intensive and big data computing problems. Course topics include distributed tools and parallel algorithms that help with acquiring, cleaning, and mining very large amounts of data.

Learning Objectives

Course Student Learning Outcomes (CSLO):

Be able to set up and deploy appropriate data engineering technologies to manage big data sets.
Be able to understand MapReduce, one of the key enabling programming concepts.
Be able to implement key data mining techniques using Spark programming libraries.

BS in CS Program Objectives (CSPO):

Be able to apply theory, techniques, and methodologies to create and/or maintain high quality computing systems that function effectively and reliably in the emerging and future information infrastructure.

ABET Objectives (APO):

Analyze a complex computing problem and to apply principles of computing and other relevant disciplines to identify solutions (ABET 1).
Design, implement, and evaluate a computing-based solution to meet a given set of computing requirements in the context of the program’s discipline (ABET 2).

Assessments and Grading:

Grade Scale:

Grade	Quality Points	Numeric	Interpretation

Refer to the Undergraduate Catalog for description of NG (No Grade), W, & other grades.

Assessments:

Assessment	% of Final Grade	Course Objectives Assessed	Program Objectives Assessed	ABET Objectives
Assignments	40%	1,2,3	1	1
Project	30%	1,2,3	1	2
Quiz	15%	1	1	1
Final	15%	1,2,3	1	1
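
As an illustration of how the weights in the table above combine, here is a minimal Python sketch. The function name `weighted_final_score` and the sample scores are hypothetical. It assumes each category is first reduced to a single percentage between 0 and 100, which the syllabus does not spell out.

```python
# Weights (in percent) copied from the Assessments table.
WEIGHTS = {"Assignments": 40, "Project": 30, "Quiz": 15, "Final": 15}


def weighted_final_score(scores):
    """Combine per-category percentages (0-100) into one final percentage.

    Assumption: `scores` holds one overall percentage per category.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing categories: {sorted(missing)}")
    # Multiply each category score by its weight, then rescale to 0-100.
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical scores, for illustration only.
example = {"Assignments": 90, "Project": 80, "Quiz": 70, "Final": 100}
print(weighted_final_score(example))  # 85.5
```

With these sample scores the contributions are 36 + 24 + 10.5 + 15 = 85.5.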

Lateness Policy:

Late individual assignments are assessed a penalty of 10% per day. Saturday and Sunday each count as a day. Team-based milestones are not accepted late.
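
The per-day penalty can be sketched in Python as follows. This is only one possible reading of the policy. It assumes the penalty is 10 percentage points of the assignment's maximum per calendar day, counted in whole days, with the score floored at zero. The function name and the dates are hypothetical, and the instructor's actual calculation may differ.

```python
from datetime import date

PENALTY_PER_DAY = 10  # percentage points per calendar day late (assumed reading)


def late_adjusted_score(score, due, submitted):
    """Return the score (0-100) after the late penalty.

    Assumptions: every calendar day counts, including Saturday and Sunday,
    and the adjusted score never drops below zero.
    """
    days_late = max(0, (submitted - due).days)
    return max(0, score - PENALTY_PER_DAY * days_late)


# Hypothetical case: due Friday, submitted the following Monday (3 days late).
print(late_adjusted_score(90, date(2025, 7, 11), date(2025, 7, 14)))  # 60
```

Because the weekend days count, a Friday deadline met on Monday incurs three days of penalty rather than one.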

Course Topics and Schedules

Week	Topic	Assessments
1	Introduction to Big Data	-
	MapReduce Programming Paradigm	-
	Data Parallel Computing with Spark	-
2	Frequent Itemsets	-
	Locality-Sensitive Hashing	-
	Similar Items	-
3	Clustering	-
	Dimensionality Reduction	-
	Recommendation Systems	-
4	Link Analysis	-
	PageRank	-
5	Decision Trees	-

{!assets/text/policy.md!}

{!assets/text/distance_education.md!}