Algorithms for Data Science
This course introduces key algorithmic techniques for large-scale data science problems, with a focus on efficient data processing and analysis. Topics include frequent itemset mining, mining similar items, and data stream algorithms. Theoretical lectures are paired with practical lab sessions that reinforce each concept through hands-on work. Students also complete a project that applies these algorithms to real-world data, and the course concludes with an exam.
Course Structure
- Week 1 (12/09/2024) - Intro, Frequent Itemset Mining
- intro
- frequent
- To download the lab, run:
wget https://phparis.net/uploads/m2_ds_algods_lab1_frequent.ipynb
- Week 2 (19/09/2024) - Mining Similar Items
- similar
- To download the lab, run:
wget https://phparis.net/uploads/m2_ds_algods_lab2_similar.ipynb
- Week 3 (26/09/2024) - Data Stream Algorithms
- stream I
- To download the lab, run:
wget https://phparis.net/uploads/m2_ds_algods_lab3_filtering.ipynb
- Week 4 (03/10/2024) - Data Stream Algorithms (continued)
- stream II
- To download the lab, run:
wget https://phparis.net/uploads/m2_ds_algods_lab4_distinct.ipynb
- Week 5 (10/10/2024) - Project
- Week 6 (17/10/2024) - Advertising on the Web
- Week 7 (24/10/2024) - Exam
References
- J. Leskovec, A. Rajaraman, J. Ullman. “Mining of Massive Datasets”.