Algorithms for Data Science
This course introduces key algorithmic techniques for large-scale data science problems, with a focus on efficient data processing and analysis. Topics include frequent itemset mining, mining similar items, and data stream algorithms. Theoretical lectures alternate with practical lab sessions that reinforce the concepts through hands-on work. Students also complete a project applying these algorithms to real-world data, and the course ends with an exam.
Uploading your labs and project
Upload your lab work and your project at this link. Include your first name, your last name, and the lab number in the file name.
🚨 DO NOT USE EMAILS! 🚨
Course Structure
Week 1 (12/09/2024) - Intro, Frequent Itemset Mining
- intro
- frequent
- To download the lab, run:
wget https://phparis.net/uploads/m2_ds_algods_lab1_frequent.ipynb
Week 2 (19/09/2024) - Mining Similar Items
- similar
- To download the lab, run:
wget https://phparis.net/uploads/m2_ds_algods_lab2_similar.ipynb
Week 3 (26/09/2024) - Data Stream Algorithms
- stream I
- To download the lab, run:
wget https://phparis.net/uploads/m2_ds_algods_lab3_filtering.ipynb
Week 4 (03/10/2024) - Data Stream Algorithms (continued)
- stream II
- To download the lab, run:
wget https://phparis.net/uploads/m2_ds_algods_lab4_distinct.ipynb
Week 5 (10/10/2024) - Project
- recommendation
- Project
- To download the lab, run:
wget https://phparis.net/uploads/m2_ds_algods_lab5_moments.ipynb
Week 6 (17/10/2024) - Advertising on the Web
- ads
- To download the lab, run:
wget https://phparis.net/uploads/m2_ds_algods_lab6_adwords.ipynb
Week 7 (24/10/2024) - Exam
- Previous exams: 2020
References
- J. Leskovec, A. Rajaraman, J. Ullman. “Mining of Massive Datasets”. site