Graduate

Advanced Data Mining

Data capture has become inexpensive and ubiquitous as a by-product of innovations such as the Internet, e-commerce, electronic banking, point-of-sale devices, bar-code readers, and intelligent machines. As a result, the amount of available data has been growing at an extraordinary rate.

“Data mining” refers to a collection of techniques for extracting interesting relationships and knowledge hidden in large volumes of data, in order to help managers and analysts make intelligent use of that data. Successful applications have been reported in areas such as credit rating, fraud detection, database marketing, customer relationship management, and stock market investment.

In this course, we will examine a variety of data mining techniques that have evolved from the disciplines of statistics and artificial intelligence (or machine learning), and we will practice applying them to recognizing patterns and making predictions from an application-oriented perspective. Case studies and hands-on experiments using easy-to-use software will be provided.

The course is aimed at providing students with the most important fundamentals and techniques of data mining:

  • Predictive modeling: classification and regression (Decision Trees, Neural Networks, Statistical Models, etc.)
  • Association rules and link analysis
  • Clustering
  • Anomaly detection
  • Visualization
  • Interpretation
  • Some advanced machine learning techniques

While the corresponding theory is presented, the course maintains a practical approach through the use of software packages such as Python, SAS E-miner, MATLAB, and MINITAB.

Advanced Machine Learning Theory

Machine learning is all about finding generalized patterns from data. The whole idea is to replace “humans writing code” with “humans supplying data” and then let the system figure out what it is that the person wants to do by looking at the examples. In recent years, many successful applications of machine learning have been developed, ranging from data-mining programs that learn to detect fraudulent credit-card transactions to autonomous vehicles that learn to drive on public highways. At the same time, there have been important advances in the theory and algorithms that form the foundation of this field. The goal of this class is to provide an overview of state-of-the-art algorithms used in machine learning from different perspectives, and to help students gain some understanding of what is coming next. We will discuss both the theoretical properties of these algorithms and their practical applications.

There are a few cool things about machine learning. The first is that it is broadly applicable. These techniques have led to significant advances in many fields, including marketing, stock trading, robotics, machine translation, computer vision, medicine, and more. The second is that there is a very close connection between theory and practice. While this course is more on …

  • Bayesian Decision Theory
  • Maximum-Likelihood and Bayesian Parameter Estimation
  • Nonparametric Techniques
  • Linear Discriminant Functions
  • Multilayer Neural Networks
  • Stochastic Methods
  • Nonmetric Methods
  • Algorithm-Independent Machine Learning
  • Unsupervised Learning and Clustering
  • Invited talks & term project presentations (tentative)

Machine Learning

Machine learning deals with the design and development of computer algorithms that can harness the vast amounts of data available nowadays and use this data in an intelligent way to solve a variety of real-world problems. A principal focus of machine learning research is to automatically discover rules and patterns from data. In recent years, machine learning techniques have led to significant advances in many fields, including marketing, fraud detection, stock trading, robotics, machine translation, computer vision, medicine, and bioinformatics. There is a very close connection between theory and practice in machine learning, and almost every application is accompanied by a substantial amount of theory. However, one of the appealing aspects of machine learning is that once you understand the basics of the technology, the field is very open, and rapid progress can be made by finding ways to formalize what we know about the world. The goal of this course is to provide an overview of state-of-the-art algorithms used in machine learning, to introduce different perspectives on them, and to help students gain some understanding of where the field is heading next. We will discuss both the theoretical properties of these algorithms and their practical applications.

The course is divided into two parts. The first half will focus on many “traditional,” tried-and-true techniques and algorithms in machine learning and pattern recognition. The second half will focus on more recent developments in machine learning. (Topics may be deleted and additional topics added depending on the interests of the participants and the available time.)

  • Linear Models for Regression and Classification
  • Probability Distributions and Density Estimation
  • Mixture Models and Expectation–Maximization
  • Kernel Methods (including SVMs)
  • Latent Variable Models
  • Sampling Methods (including MCMC)
  • Graphical Models
  • Models for Sequential Data (including Hidden Markov Models)

Critical Reviews in Machine Learning and Deep Learning

This course aims to cultivate critical reading, analytical reasoning, and scholarly communication skills through close readings of key research papers in machine learning and deep learning. Students will engage with seminal and contemporary journal articles, focusing on theoretical contributions, methodological soundness, and experimental design. The course emphasizes critical discussion, synthesis of research trends, and the development of students' own academic perspectives as future researchers.

By the end of the course, students will be able to:

  • Accurately summarize the key content and contributions of research papers in machine learning and deep learning.
  • Critically analyze the theoretical foundations and limitations of each paper.
  • Evaluate the soundness of methodologies and the validity of experimental designs.
  • Identify and connect current research trends across different subfields.
  • Develop academic literacy and critical thinking through oral and written communication.

Medical Artificial Intelligence

This course, Medical Artificial Intelligence, introduces medical students to the foundational concepts, methodologies, and analytical frameworks of modern artificial intelligence. As AI-driven technologies increasingly shape diagnostic reasoning, clinical decision support, and biomedical research, it is essential for future clinicians to understand not only how these systems work but also how to critically evaluate their strengths and limitations.

The portion of the course devoted to artificial intelligence provides a structured overview of key machine learning principles, covering both classical statistical models and contemporary computational approaches. Through conceptual lectures and practical examples from healthcare, students will explore how data-driven models learn patterns, make predictions, and support real-world medical decision-making.

The course emphasizes interpretability, critical thinking, and clinical relevance—ensuring that students develop the ability to understand, assess, and meaningfully utilize AI tools in future medical practice.

Undergraduate

Data Analysis & Practice

  • Basics of Data Analysis: Descriptive/Inferential Statistics, Sampling, Data Screening, etc.
  • Multivariate Analysis: Correlation/Regression/Factor/Discriminant Analysis
  • Time Series Analysis: AR(I)MA
  • Introduction to Up-to-date Data Mining Techniques
  • Practice with Python, SAS or MINITAB
  • Term Project

Statistics Applications

  • Understanding of “descriptive statistics”
  • Understanding of “discrete distributions and their applications”
  • Understanding of “continuous distributions and their applications”
  • Foundation of “statistical inference,” including parameter estimation and hypothesis testing
  • Understanding of “experimental design”
  • Foundation of “distribution-free statistics” (e.g., nonparametric hypothesis testing)
  • While the corresponding theory is presented, the course maintains a practical approach through the use of statistical software packages such as Python, MINITAB, EXCEL, SAS, and MATLAB

Operations Research and Practice I / II

  • Work scheduling
  • Production planning & Production process
  • Capital budgeting
  • Financial planning
  • Blending (e.g., oil refinery management)
  • Farm planning
  • Distribution
  • Multi-period decision problems
  • Inventory model
  • Financial models
  • Linear programming
  • Network programming
  • Integer programming
  • Nonlinear programming
  • Stochastic processes
  • Problem Identification
  • Formulation
  • Derivation of Optimal Solution(s)
  • Model Validation
  • Implementation
  • Interpretation

MATLAB: Computer Programming for Science Computation

  • Math and computation
  • Algorithm development
  • Modeling, simulation, and prototyping
  • Data analysis, exploration, and visualization
  • Scientific and engineering graphics
  • Etc.
