Syllabus

About CSC 4220/5220

This course begins by exploring the basics of data science using Python to understand the role of modeling. The relevant portion of the world can range from simple to outrageously complex. So, we must learn various types of models and explore how to evaluate and choose models, based on task type, complexity, and available data. We will delve into: simple linear regression (with and without input transformations), multiple linear regression, logistic regression, PCA, KNN clustering, decision trees, support vector machines, the perceptron, multi-layer perceptron, auto-encoders, and large language models. Additionally, we will discuss ensemble learning, reinforcement learning, and machine learning ethics.

The instructor is Dr. Jesse Roberts. To find out more about the course instructor and TAs, explore the course staff page.

Goals

  • Empower students to apply computational and inferential thinking to address real-world data science problems. (Learn practical data science theory and tools)
  • Prepare students for advanced courses in machine learning or data science by exploring the foundational models and context. (Learn foundations of machine learning theory)
  • Explore the tools and theory used to create and train complex deep learning models at scale. (Gain exposure to practical machine learning and tools)

Topic List and Tentative Schedule

Session Day Week Lecture
0 Wednesday, January 22, 2025 0 Introduction
1 Monday, January 27, 2025 1 Pandas 1
2 Wednesday, January 29, 2025 1 Pandas 2
3 Monday, February 3, 2025 2 Exploratory Data Analysis
4 Wednesday, February 5, 2025 2 Visualization
5 Monday, February 10, 2025 3 Statistics and Hypothesis Testing
6 Wednesday, February 12, 2025 3 Sampling Populations and Distributions
7 Monday, February 17, 2025 4 Intro to Modeling
8 Wednesday, February 19, 2025 4 Case studies
9 Monday, February 24, 2025 5 Multiple Linear Regression
10 Wednesday, February 26, 2025 5 SkLearn + Gradient Descent
11 Monday, March 3, 2025 6 Feature Engineering
12 Wednesday, March 5, 2025 6 Model Evaluation + Regularization
13 Monday, March 10, 2025 7 Estimators + Bias + Variance
14 Wednesday, March 12, 2025 7 Exam
  Monday, March 17, 2025   Spring Break
  Wednesday, March 19, 2025   Spring Break
15 Monday, March 24, 2025 8 Logistic Regression
16 Wednesday, March 26, 2025 8 Principal Component Analysis
17 Monday, March 31, 2025 9 Clustering
18 Wednesday, April 2, 2025 9 Decision Trees
19 Monday, April 7, 2025 10 Support Vector Machines
20 Wednesday, April 9, 2025 10 Perceptron
21 Monday, April 14, 2025 11 Neural Networks + Backpropagation
22 Wednesday, April 16, 2025 11 Auto-Encoders
23 Monday, April 21, 2025 12 Large Language Models
24 Wednesday, April 23, 2025 12 Reinforcement Learning
25 Monday, April 28, 2025 13 Ensemble Learning
26 Wednesday, April 30, 2025 13 Ethics
27     Final Exam

Prerequisites

  • Design of Algorithms and Computing: The rigorous design and analysis of algorithms is covered in CSC 2400. This is important to understanding the time and space complexity of various models. Additionally, this requirement ensures a sufficient background in practical programming.

  • Statistics: Machine Learning and Data Science are fields that hail from statistical analysis. To understand why we do what we do, the underlying statistical processes are indispensable. We will review the relevant statistics but depend upon a basic understanding acquired in previous courses like Math 3070 (or 3470 or 4470 or 5470).

  • Math: Linear Algebra (Math 2010). We will need some basic concepts like linear operators, projections, and optimization to analyze and derive new prediction algorithms.

Please consult the Resources page for additional resources for reviewing prerequisite material and diving deeper.

Textbook: This course has two official texts. Both texts are available for free online (legally) at the provided links.

Additionally, the course material will be supplemented by instructor notes and other external resources.

Course Culture

We hope to foster an inclusive and supportive learning environment based on curiosity rather than competition. All members of the course community — the instructors, students, and TAs — are expected to treat each other with courtesy and respect. Some of the responsibility for that lies with the staff, but a lot of it ultimately rests with you, the students.

Be Aware of Your Actions

Sometimes, the little things add up to create a culture that is unwelcoming for some students. For example, you and a friend may think you are sharing a private joke about other races, majors, genders, abilities, cultures, etc., but this can have adverse effects on classmates who overhear it. There is a great deal of research on something called “stereotype threat”: research finds that simply reminding someone that they belong to a particular culture or share a particular identity (on whatever dimension) can interfere with their course performance.

Stereotype threat works both ways: you can assume that a student will struggle based on who they appear to be, or you can assume that a student is doing great based on who they appear to be. Both are potentially harmful.

Bear in mind that diversity has many facets, some of which are not visible. Your classmates may have medical conditions (physical or mental), personal situations (financial, family, etc.), or interests that aren’t common to most students in the course. Another aspect of professionalism is avoiding comments that (likely unintentionally) put down colleagues for situations they cannot control. Bragging in open space that an assignment is easy or “crazy,” for example, can send subtle cues that discourage classmates who are dealing with issues that you can’t see. Please take care, so we can create a class in which all students feel supported and respected.

Be Respectful

Beyond the slips that many of us make unintentionally are a host of behaviors that the course staff, department, and university do not tolerate. These are generally classified under the term harassment; sexual harassment is a specific form that is governed by federal laws known as Title IX.

Communicate Issues with Course Staff and/or the Department

We take all complaints about unprofessional or discriminatory behavior seriously. Professionalism and respect for diversity are not just matters between students; they also apply to how the course staff treat the students. The staff of this course will treat you in a way that respects our differences. However, despite our best efforts, we might slip up, hopefully inadvertently. If you are concerned about classroom environment issues created by the staff or overall class dynamic, please feel free to talk to us about it. The instructors in particular welcome any comments or concerns regarding conduct of the course and the staff. See below for how to best reach us.

As course staff, we are committed to creating a learning environment welcoming of all students that supports a diversity of thoughts, perspectives and experiences and respects your identities and backgrounds (including race, ethnicity, nationality, gender identity, socioeconomic class, sexual orientation, language, religion, ability, and more.) To help accomplish this:

  • If you feel like your performance in the class is being affected by your experiences outside of class (e.g., family matters, current events), please don’t hesitate to come and talk with us. We want to be resources for you.

Course Components

Below is a high-level “typical week in the course” for Spring 2025.

Day Activities
Mo Lecture; Office Hours
Tu (none)
We Lecture; Office Hours
Th (none)
Fr Homework due and new homework released
  • All deadlines are subject to change.
  • The Office Hours schedule is on the Calendar page.
  • Lectures, assignments, projects, and exams are scheduled on the Home page.

Lecture

Lecture attendance is mandatory.

Please refer to Grading Scheme for a comprehensive grade breakdown.

Participation

Participation grades are 1 or 0 for each lecture. A score of 1 is earned by attending lecture and taking part in the Slido interactive questions.

Homework and Projects

Homeworks are week-long assignments that are designed to help students develop an in-depth understanding of both the theoretical and practical aspects of ideas presented in lecture. Projects are 2-week assignments (with a weekly checkpoint) that synthesize multiple topics.

  • All homeworks and projects must be submitted to Ilearn by their posted deadlines.
  • Homeworks and projects are graded by the TAs and developed test cases.
  • The primary form of support students will have for homeworks and projects is office hours.
  • Homeworks and projects must be completed individually, without the usage of any unauthorized resources (CourseHero, ChatGPT, code on the internet). See the Collaboration Policy for more details.

Homework and projects may take a number of forms. Often, they will require the ability to edit and run .ipynb files. There are resources available to help you get started with Jupyter notebooks.

Exams

There will be two exams in this course:

  • Midterm on Wednesday, March 12 (tentative), 1-2:15 PM Central Time.
  • Final on Tuesday, May 6, 1-3 PM Central Time.

Graduate Course

Students in the graduate version of the course will complete 2 additional assignments.

  • Reading and discussing a modern book on machine learning approved by the instructor (e.g., Genius Makers, The Alignment Problem, Human Compatible)
  • An original project involving data science and machine learning techniques

For more information, refer to the graduate course extension page.

Office Hours and Communication

We encourage you to discuss course content with your friends, classmates, and course staff throughout the semester, particularly during office hours!

  • All office hours will be updated on the Office Hours Calendar.
  • Instructor office hours are by appointment and will be held in the instructor's office or on Teams.
  • TA office hours are drop-in and will be held in Clement 402.
  • Hours are listed in the office hours link of the website.

Course Communication:

  • Ilearn is our course forum this semester. All course announcements will be made through Ilearn, so please check it regularly. It’s best to set Ilearn so that announcements are sent as emails; that way none will be missed.

  • Email is the primary way to contact the instructor or TAs outside of class and office hours. Typically, you will receive a response within 24 hours.

Policies

Grading Scheme

Category 4220 5220 Details
Participation 5% 5% Drop 2
Homeworks 25% 15% Drop 1
Projects 25% 20%  
Midterm Exam 20% 20%  
Final Exam 25% 20%  
Graduate Reading and Discussion Bonus 5% 10%  
Graduate Project   10%  
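
As a rough illustration of how the weights above combine, the following Python sketch computes a weighted course grade. The names `WEIGHTS`, `drop_lowest`, and `course_grade` are hypothetical and not part of any course tooling. It assumes that "Drop n" means the n lowest scores in a category are discarded before averaging, and it does not model the 5% bonus listed for 4220, since the table does not say how that bonus is applied.

```python
# Weights copied from the Grading Scheme table (percent of the final grade).
# The 4220 "Bonus 5%" row is intentionally left out (not specified how it applies).
WEIGHTS = {
    "4220": {"participation": 5, "homeworks": 25, "projects": 25,
             "midterm": 20, "final": 25},
    "5220": {"participation": 5, "homeworks": 15, "projects": 20,
             "midterm": 20, "final": 20,
             "graduate_reading": 10, "graduate_project": 10},
}


def drop_lowest(scores, n_drop):
    """Average the scores after discarding the n_drop lowest ones.

    Assumption: this is what "Drop n" in the Details column means.
    """
    kept = sorted(scores)[n_drop:]
    if not kept:
        raise ValueError("no scores left after dropping")
    return sum(kept) / len(kept)


def course_grade(section, category_scores):
    """Return the weighted grade (0-100) for section "4220" or "5220".

    category_scores maps each category name to a percentage in [0, 100].
    """
    weights = WEIGHTS[section]
    return sum(weights[c] * category_scores[c] for c in weights) / 100


if __name__ == "__main__":
    # Hypothetical student in 4220 with four homework scores, one dropped.
    scores = {
        "participation": 100,
        "homeworks": drop_lowest([100, 0, 80, 90], n_drop=1),
        "projects": 85,
        "midterm": 78,
        "final": 88,
    }
    print(round(course_grade("4220", scores), 2))
```

The official grade is whatever the course staff compute; this sketch is only meant to make the arithmetic of the table concrete.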

On-Time Submission

All assignments are due at 11:59 PM Central Time on the due date specified on the Home / Schedule page. The date and time of this deadline are firm. Submitting even a minute past is considered late.

All assignments have a 20% per day late penalty.

Slip Days

Each student gets an extension budget of 6 total slip days for the semester (so use them wisely). Slip days can be applied to homeworks and projects only.

Slip days are automatically applied based on the additional hours you take to submit any assignment after its given deadline. Slip days are rounded up to the next day. For instance, 1 minute late counts as 1 day late. We will use the submission time as displayed on Ilearn.

If all 6 slip days are used for the first three homework assignments (for example), you are out of slip days and cannot ask us to not consider one of the slip days previously used.

Each project or homework can have a maximum of 4 slip days applied. More than 3 days after the assignment due date, it is unlikely that we will be able to accept your submission unless you have additional accommodations. Slip days should be reserved for unforeseen circumstances. You should not plan to use your slip days regularly.
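
The rounding and budget rules above can be sketched in Python as follows. This is a minimal illustration, not the course's actual bookkeeping: the function names are hypothetical, the dates in the example are made up, and it assumes that any late days not covered by slip days are charged the 20% per day late penalty (the policy does not spell out how the two interact).

```python
import math
from datetime import datetime

SLIP_BUDGET = 6         # total slip days per student (from the policy)
MAX_SLIP_PER_ITEM = 4   # cap per homework or project (from the policy)
LATE_PENALTY = 0.20     # 20% per day late (from the policy)


def days_late(deadline, submitted):
    """Whole days late, rounded up, so 1 minute late counts as 1 day."""
    seconds = (submitted - deadline).total_seconds()
    return max(0, math.ceil(seconds / 86400))


def apply_slip_days(late_days, budget_left):
    """Return (slip_days_used, budget_remaining, score_multiplier).

    Assumption: late days beyond the slip days applied each cost 20%
    of the score, floored at zero.
    """
    used = min(late_days, MAX_SLIP_PER_ITEM, budget_left)
    uncovered = late_days - used
    multiplier = max(0.0, 1.0 - LATE_PENALTY * uncovered)
    return used, budget_left - used, multiplier


if __name__ == "__main__":
    # Hypothetical deadline and submission time, one minute apart.
    deadline = datetime(2025, 2, 7, 23, 59)
    submitted = datetime(2025, 2, 8, 0, 0)
    late = days_late(deadline, submitted)
    print(late, apply_slip_days(late, SLIP_BUDGET))
```

The submission time that counts is the one displayed on Ilearn, so treat any local calculation like this as an estimate.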

Collaboration Policy and Academic Honesty

If you misrepresent work as your own, disciplinary action will be taken, including a failing grade in the course.

Assignments. Data science is a collaborative activity. While you may talk with others about the homework and projects, we ask that you write your solutions individually in your own words. If you do discuss the assignments with others, please include their names at the top of your notebook. Restated, you and your peers are encouraged to discuss course content and approaches to problem-solving, but you are not allowed to share your code or answers with other students, nor are you allowed to post your assignment solutions publicly. Doing so will be considered academic misconduct.

The benefit to completing the work in this course is similar to the benefit one gets from running laps. If someone else does it for you, you don’t get the benefit. The value is in the doing, and not in having a completed assignment.

Exams. Students caught cheating on any exam will fail the course. No exceptions.

Plagiarism on any assignment, as well as other violations of TnTech’s Student Handbook, will be reported. Additionally, we reserve the right to give you an F in the course. It’s just not worth it!

Rather than copying, ask for help. You are not alone! The instructor and TAs are here to help you succeed. We expect that you will work with integrity and with respect for other members of the class, just as the course staff will work with integrity and respect for you.

Generative AI Usage

Use of AI-assisted methods, such as ChatGPT, to generate written or code solutions to assignments is prohibited. Usage of past assignment solutions is also prohibited.

What can you do with Generative AI?

You can ask questions to improve understanding, treat it as a documentation assistant, or even a debugging assistant.

Student Academic Integrity Policy

Maintaining high standards of academic integrity in every class is critical to the reputation of Tennessee Tech, its students, faculty, alumni, and the employers of Tennessee Tech graduates. Academic integrity is at the foundation of the educational process and key to student success. Students with academic integrity are committed to honesty, ethical behavior, and avoiding academic integrity violations. All students must read and understand Policy 216: Student Academic Integrity. Please see the Academic Integrity website for more information.

Disability Accommodation

Students with a disability requiring accommodations should contact the Accessible Education Center (AEC). An accommodation request (AR) should be completed as soon as possible, preferably by the end of the first week of the course. The AEC is located in the Roaden University Center, room 112; phone 931-372-6119. For details, view Tennessee Tech’s Policy 340 – Services for Students with Disabilities at Policy Central.

Additional Resources

Technical Help

If you are experiencing technical problems, visit the myTech IT Helpdesk for assistance.

If you are having trouble with one of the instructional technologies (e.g., Zoom, Teams, Qualtrics, Respondus, or any technology listed here), visit the Center for Innovation in Teaching and Learning (CITL) website or call 931-372-3675 for assistance.

For accessibility information and statements for our instructional technologies, visit the CITL’s Learner Success Resource webpage.

Tutoring

The university provides free tutoring to all Tennessee Tech students. Tutoring is available for any class or subject, as well as writing, test prep, study skills, and resume support. Tutoring is by appointment; visit the Learning Center website for more information.

Health and Wellness

Counseling Center

The Counseling Center offers brief, short-term, solution-focused therapeutic interventions for Tennessee Tech University students. The staff of the Counseling Center is available to assist students with their personal and social concerns in hopes of helping them achieve satisfying educational and life experiences. To learn more or schedule an appointment, visit the Counseling Center website.

Health Services

Health Services offers high-quality, affordable care that is accessible and promotes the health and wellness of our Tennessee Tech community. Visit the Health Services website to learn more.

Pandemic Protocols

Each student must take personal responsibility for knowing and following any University protocol related to pandemics and other public health events. Students are expected to follow all directives published by Tennessee Tech on its official webpage. As conditions related to the COVID-19 pandemic change, the University’s COVID-19 protocols are also likely to change. Students are expected to monitor the University’s official webpage to stay up to date on public health protocols.

Acknowledgements

This syllabus is adapted from the syllabus provided by the TnTech office of the provost and the Berkeley Data 100 course syllabus.