Instructor: Prof. O-Joun Lee
Office: Michael Hall T404
Office Hour:
e-mail: [email protected]
Website: https://nslab-cuk.github.io/
Lecture Time: Tuesday 7-8/Thursday 7
TA: Van Thuy Hoang
Lab: Sophie Barat Hall B348
Office Hour:
e-mail: [email protected]
TA: Sang Thanh Nguyen
Lab: Sophie Barat Hall B348
Office Hour:
e-mail: [email protected]
📜 Introduction
In this class, we will explore the techniques and algorithms used to analyze and mine large-scale graphs and networks. We will cover various types of graph-based data and the unique challenges they pose for data storage and processing. You will learn how to apply techniques such as graph traversal, centrality measures, and community detection to extract insights from large-scale graph data. We will also discuss applications of graph mining, including social network analysis, fraud detection, and recommendation systems. By the end of this class, you will have a solid understanding of the key concepts and techniques in graph mining and be able to apply them to real-world problems.
📚 Course Materials
References
🗓 Schedule
- Week 1: Introduction to Graph Mining
- Overview of graph mining
- Types of graphs and their applications
- Basic graph concepts (nodes, edges, degree, etc.)
- Sample code: Creating a simple graph using NetworkX
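
As a minimal sketch of the Week 1 sample code, the snippet below builds a small undirected graph and prints its basic properties. The node names and the helper `build_simple_graph` are illustrative assumptions, not course-provided code.

```python
import networkx as nx


def build_simple_graph():
    """Build a small undirected graph (node names are illustrative)."""
    G = nx.Graph()
    G.add_nodes_from(["A", "B", "C", "D"])
    G.add_edges_from([("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")])
    return G


G = build_simple_graph()
print("Nodes:", list(G.nodes()))
print("Edges:", list(G.edges()))
# The degree of a node is the number of edges attached to it.
print("Degrees:", dict(G.degree()))
```
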
- Week 2: Graph Representation and Storage
- Adjacency matrix and list
- Sparse matrix representations
- Graph databases and storage systems
- Sample code: Converting a NetworkX graph to a sparse matrix representation
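
The Week 2 representations can be sketched as follows, assuming SciPy is installed and a NetworkX release that provides `to_scipy_sparse_array` (older releases used a differently named function). The 4-node path graph is only an example input.

```python
import networkx as nx

# Example input: a path graph 0-1-2-3.
G = nx.path_graph(4)
nodes = sorted(G.nodes())

# Adjacency list: each node mapped to its neighbors.
adjacency_list = {n: sorted(G.neighbors(n)) for n in nodes}

# Sparse adjacency matrix in CSR format; only non-zero entries are stored.
A = nx.to_scipy_sparse_array(G, nodelist=nodes, format="csr")

print("Adjacency list:", adjacency_list)
print("Shape:", A.shape, "stored entries:", A.nnz)
print(A.toarray())  # dense view, only sensible for small graphs
```

An undirected edge appears twice in the matrix, once per direction, so the number of stored entries is twice the number of edges.
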
- Week 3: Centrality Measures
- Degree centrality
- Betweenness centrality
- Closeness centrality
- Eigenvector centrality
- PageRank and its application in web search
- Sample code: Calculating centrality measures in a graph using NetworkX
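
A possible sketch for Week 3 computes the five measures listed above on a toy graph. The graph itself is an assumption chosen so that one node ("C") is clearly the most central; `nx.pagerank` may require SciPy depending on the NetworkX version.

```python
import networkx as nx

# Toy graph (illustrative): "C" is a hub, and A-B adds one extra edge.
G = nx.Graph([("C", "A"), ("C", "B"), ("C", "D"), ("C", "E"), ("A", "B")])

measures = {
    "degree": nx.degree_centrality(G),
    "betweenness": nx.betweenness_centrality(G),
    "closeness": nx.closeness_centrality(G),
    "eigenvector": nx.eigenvector_centrality(G, max_iter=1000),
    "pagerank": nx.pagerank(G, alpha=0.85),
}

# Report the highest-scoring node under each measure.
top_nodes = {name: max(scores, key=scores.get) for name, scores in measures.items()}
for name, node in top_nodes.items():
    print(f"{name:12s} top node: {node} ({measures[name][node]:.3f})")
```
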
- Week 4: Graph Visualization
- Visualization techniques (spring-embedded, circular, etc.)
- Tools for graph exploration and visualization (Gephi, Cytoscape, etc.)
- Sample code: Visualizing a graph using NetworkX
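
One way to sketch the Week 4 sample is to draw the same graph with a spring layout and a circular layout side by side. This assumes Matplotlib is installed; the output file name and its location in the temporary directory are arbitrary choices for illustration.

```python
import os
import tempfile

import matplotlib.pyplot as plt
import networkx as nx

G = nx.cycle_graph(6)

# Two layouts for the same graph; the seed makes the spring layout repeatable.
layouts = {
    "spring": nx.spring_layout(G, seed=42),
    "circular": nx.circular_layout(G),
}

fig, axes = plt.subplots(1, 2, figsize=(8, 4))
for ax, (name, pos) in zip(axes, layouts.items()):
    nx.draw_networkx(G, pos=pos, ax=ax, node_color="lightblue")
    ax.set_title(name)
    ax.set_axis_off()

# Hypothetical output path; replace with plt.show() for interactive use.
output_path = os.path.join(tempfile.gettempdir(), "graph_layouts.png")
fig.savefig(output_path)
print("Saved figure to", output_path)
```
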
- Week 5: Community Detection
- Definition of communities and their properties
- Clustering techniques (k-means, hierarchical, etc.)
- Modularity and its variants
- Tools for community detection (Louvain, Infomap, etc.)
- Sample code: Detecting communities in a graph using the Louvain method in NetworkX
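
A hedged sketch of the Week 5 sample: it runs the Louvain method on two 5-node cliques joined by a single edge and scores the result with modularity. It assumes a recent NetworkX release in which `louvain_communities` is available; the barbell graph is an illustrative input only.

```python
import networkx as nx
from networkx.algorithms.community import louvain_communities, modularity

# Two 5-node cliques (nodes 0-4 and 5-9) connected by the edge 4-5.
G = nx.barbell_graph(5, 0)

# The seed fixes the random node ordering used by the algorithm.
communities = louvain_communities(G, seed=42)
score = modularity(G, communities)

for i, members in enumerate(communities):
    print(f"Community {i}: {sorted(members)}")
print(f"Modularity: {score:.3f}")
```
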
- Week 6: Link Prediction
- Types of links (positive, negative, neutral)
- Commonly used link prediction methods (common neighbors, Jaccard coefficient, etc.)
- Tools for link prediction
- Sample code: Predicting links in a graph using the common neighbors method in NetworkX
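
The common neighbors method from Week 6 can be sketched as below: every unconnected node pair is scored by the number of neighbors it shares, and higher scores suggest more likely links. The toy graph and the helper `common_neighbor_scores` are assumptions for illustration; the Jaccard coefficient is included for comparison.

```python
import networkx as nx

# Toy graph (illustrative).
G = nx.Graph([("A", "B"), ("A", "C"), ("B", "C"),
              ("B", "D"), ("C", "D"), ("D", "E")])


def common_neighbor_scores(G):
    """Score each non-adjacent pair by its number of shared neighbors."""
    scores = []
    for u, v in nx.non_edges(G):
        shared = len(list(nx.common_neighbors(G, u, v)))
        scores.append((u, v, shared))
    return sorted(scores, key=lambda item: item[2], reverse=True)


ranked = common_neighbor_scores(G)
for u, v, shared in ranked:
    print(f"{u}-{v}: {shared} common neighbors")

# Jaccard coefficient: shared neighbors divided by the size of the union.
jaccard = {frozenset((u, v)): p for u, v, p in nx.jaccard_coefficient(G)}
```
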
- Week 7: Subgraph Mining
- Frequent subgraph mining (FSM)
- FSM algorithms (gSpan, FSG, etc.)
- Tools for frequent subgraph mining (Traceminer, etc.)
- Sample code: Mining frequent subgraphs with the gSpan algorithm, using NetworkX graphs as input
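
NetworkX does not ship a gSpan implementation, so the sketch below is not gSpan itself. It only illustrates the support-counting step that frequent subgraph mining relies on: a pattern is frequent if it occurs in at least `min_support` graphs of a database. The tiny labeled graphs, the `label` attribute name, and the helpers `labeled_graph` and `support` are all assumptions made for illustration.

```python
import networkx as nx
from networkx.algorithms import isomorphism


def labeled_graph(labels, edges):
    """Build a graph whose nodes carry a 'label' attribute."""
    G = nx.Graph()
    for node, label in labels.items():
        G.add_node(node, label=label)
    G.add_edges_from(edges)
    return G


def support(pattern, database):
    """Count the database graphs that contain `pattern` as a subgraph."""
    node_match = isomorphism.categorical_node_match("label", None)
    count = 0
    for G in database:
        matcher = isomorphism.GraphMatcher(G, pattern, node_match=node_match)
        if matcher.subgraph_is_monomorphic():
            count += 1
    return count


# Illustrative database of three small labeled graphs.
database = [
    labeled_graph({1: "C", 2: "C", 3: "O"}, [(1, 2), (2, 3), (1, 3)]),
    labeled_graph({1: "C", 2: "C", 3: "N"}, [(1, 2), (2, 3)]),
    labeled_graph({1: "C", 2: "O"}, [(1, 2)]),
]

# Candidate patterns: single labeled edges.
patterns = {
    "C-C": labeled_graph({1: "C", 2: "C"}, [(1, 2)]),
    "C-O": labeled_graph({1: "C", 2: "O"}, [(1, 2)]),
    "C-N": labeled_graph({1: "C", 2: "N"}, [(1, 2)]),
}

min_support = 2
supports = {name: support(p, database) for name, p in patterns.items()}
frequent = [name for name, s in supports.items() if s >= min_support]
print("Supports:", supports)
print("Frequent patterns:", frequent)
```

A full miner such as gSpan would additionally grow frequent patterns edge by edge and prune duplicates using canonical DFS codes.
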
- Week 8: Mid-term Exam
- Week 9: Graph Kernels
- Overview of graph kernels
- Weisfeiler-Lehman (WL) relabeling process
- Applications of graph kernels
- Sample code: Computing graph kernels using the WL relabeling process with NetworkX
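
The WL relabeling idea can be sketched by hand: each node's label is repeatedly extended with the sorted labels of its neighbors, and two graphs are compared by counting the labels they share. This is a simplified sketch under stated assumptions: node degree is used as the initial label, labels are kept as uncompressed strings instead of being hashed, and the function names `wl_label_counts` and `wl_kernel` are illustrative.

```python
from collections import Counter

import networkx as nx


def wl_label_counts(G, iterations=2):
    """Count the node labels produced over all WL relabeling rounds."""
    # Assumption: the initial label of a node is its degree.
    labels = {n: str(G.degree(n)) for n in G}
    counts = Counter(labels.values())
    for _ in range(iterations):
        new_labels = {}
        for n in G:
            neighbor_labels = sorted(labels[m] for m in G[n])
            # New label = old label plus the sorted multiset of neighbor labels.
            new_labels[n] = labels[n] + "(" + ",".join(neighbor_labels) + ")"
        labels = new_labels
        counts.update(labels.values())
    return counts


def wl_kernel(G1, G2, iterations=2):
    """Dot product of the two WL label histograms."""
    c1 = wl_label_counts(G1, iterations)
    c2 = wl_label_counts(G2, iterations)
    return sum(c1[label] * c2[label] for label in c1.keys() & c2.keys())


path = nx.path_graph(4)
renamed_path = nx.relabel_nodes(path, {0: "a", 1: "b", 2: "c", 3: "d"})
star = nx.star_graph(3)

print("path vs path:        ", wl_kernel(path, path))
print("path vs renamed path:", wl_kernel(path, renamed_path))
print("path vs star:        ", wl_kernel(path, star))
```

Renaming nodes does not change the kernel value, because the labels depend only on graph structure. NetworkX also provides `nx.weisfeiler_lehman_graph_hash`, which applies the same relabeling to produce a single hash per graph.
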
- Week 10: Node Classification
- Overview of node classification
- Feature extraction methods
- Classification algorithms (SVM, Random Forest, etc.)
- Evaluation metrics
- Sample code: Classifying nodes in a graph using an SVM, with features extracted via NetworkX
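
A rough sketch of the Week 10 pipeline is shown below. NetworkX has no SVM of its own, so this assumes scikit-learn and NumPy are installed. The dataset (Zachary's karate club, bundled with NetworkX) and the four hand-picked features are illustrative assumptions, not a recommended feature set.

```python
import networkx as nx
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

G = nx.karate_club_graph()
nodes = list(G.nodes())

# Feature extraction (assumed features): degree, clustering coefficient,
# and hop distance to the two club leaders (nodes 0 and 33).
degree = dict(G.degree())
clustering = nx.clustering(G)
dist_to_0 = nx.single_source_shortest_path_length(G, 0)
dist_to_33 = nx.single_source_shortest_path_length(G, 33)

X = np.array([[degree[n], clustering[n], dist_to_0[n], dist_to_33[n]]
              for n in nodes])
# Label: which club each member joined after the split.
y = np.array([1 if G.nodes[n]["club"] == "Officer" else 0 for n in nodes])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

clf = SVC(kernel="rbf", gamma="scale")
clf.fit(X_train, y_train)
pred = clf.predict(X_test)
print("Test accuracy:", accuracy_score(y_test, pred))
```
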
- Week 11: Graph Applications in Social Networks
- Overview of graph applications in social networks
- Social network analysis
- Community detection and link prediction in social networks
- Sample code: Analyzing a social network graph using NetworkX
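
As an illustrative sketch for Week 11, the snippet below summarizes a small social network. The choice of Zachary's karate club dataset (bundled with NetworkX) and of these particular statistics is an assumption.

```python
import networkx as nx

G = nx.karate_club_graph()

summary = {
    "nodes": G.number_of_nodes(),
    "edges": G.number_of_edges(),
    "density": nx.density(G),
    "average_clustering": nx.average_clustering(G),
    "connected": nx.is_connected(G),
}
for key, value in summary.items():
    print(f"{key}: {value}")

# The three members with the most direct connections.
top_by_degree = sorted(G.degree(), key=lambda item: item[1], reverse=True)[:3]
print("Highest-degree members:", top_by_degree)
```
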
- Week 12: Graph Applications in Transportation Networks
- Overview of graph applications in transportation networks
- Shortest path algorithms
- Centrality measures in transportation networks
- Sample code: Finding the shortest path in a transportation network graph using NetworkX
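
A minimal sketch for Week 12, assuming a made-up road network in which each edge carries a travel time in minutes. The station names and the attribute name `minutes` are hypothetical; with a `weight` argument, `nx.shortest_path` uses Dijkstra's algorithm by default.

```python
import networkx as nx

# Hypothetical network: (from, to, travel time in minutes).
G = nx.Graph()
G.add_weighted_edges_from(
    [
        ("A", "B", 4),
        ("A", "C", 2),
        ("C", "B", 1),
        ("B", "D", 5),
        ("C", "D", 8),
        ("D", "E", 3),
    ],
    weight="minutes",
)

# Weighted shortest path: minimizes total travel time.
path = nx.shortest_path(G, "A", "E", weight="minutes")
minutes = nx.shortest_path_length(G, "A", "E", weight="minutes")

# Unweighted shortest path: minimizes the number of hops instead.
hops = nx.shortest_path_length(G, "A", "E")

print("Fastest route:", " -> ".join(path), f"({minutes} minutes)")
print("Fewest hops:", hops)
```

The fastest route here takes more hops than the shortest unweighted route, which shows why the choice of edge weight matters.
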
- Week 13: Graph Applications in Web Graphs
- Overview of graph applications in web graphs
- Web graph crawling
- Web graph analysis
- Sample code: Crawling a web graph and analyzing it using NetworkX
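
The crawling step can be sketched as a breadth-first traversal that records each hyperlink as a directed edge. To keep the example self-contained and offline, `fetch_links` reads from a hypothetical in-memory site (`FAKE_WEB`) instead of issuing HTTP requests; a real crawler would replace it with page downloading and link extraction. All URLs and function names here are assumptions.

```python
from collections import deque

import networkx as nx

# Hypothetical site: each URL mapped to the links found on that page.
FAKE_WEB = {
    "http://example.com/": ["http://example.com/a", "http://example.com/b"],
    "http://example.com/a": ["http://example.com/b"],
    "http://example.com/b": ["http://example.com/", "http://example.com/c"],
    "http://example.com/c": [],
}


def fetch_links(url):
    """Stand-in for downloading a page and extracting its links."""
    return FAKE_WEB.get(url, [])


def crawl(seed, max_pages=100):
    """Breadth-first crawl from `seed`, returning the link graph."""
    G = nx.DiGraph()
    G.add_node(seed)
    queue = deque([seed])
    visited = set()
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        if url in visited:
            continue
        visited.add(url)
        for target in fetch_links(url):
            G.add_edge(url, target)  # directed edge: page -> linked page
            if target not in visited:
                queue.append(target)
    return G


web = crawl("http://example.com/")
in_degree = dict(web.in_degree())
largest_scc = max(nx.strongly_connected_components(web), key=len)

print("Pages:", web.number_of_nodes(), "links:", web.number_of_edges())
print("Most linked-to page:", max(in_degree, key=in_degree.get))
print("Largest strongly connected component:", sorted(largest_scc))
```
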