
Courses
Graduate
Advanced Data Mining
Data capture has become inexpensive and ubiquitous as a by-product of innovations such as the internet, e-commerce, electronic banking, point-of-sale devices, bar-code readers, and intelligent machines, and the amount of data collected has been increasing at an incredible rate due to technological advances. “Data mining” refers to a collection of techniques for extracting “interesting” relationships and knowledge hidden in a mountain of data, in order to help managers and analysts make intelligent use of that data. A number of successful applications have been reported in areas such as credit rating, fraud detection, database marketing, customer relationship management, and stock market investment. In this course, we will examine a variety of data mining techniques that evolved from the disciplines of statistics and artificial intelligence (or machine learning), and practice using them to recognize patterns and make predictions from an applications perspective. Application (or case) surveys and hands-on experimentation with easy-to-use software will be provided.
The course is aimed at providing students with the most important fundamentals and techniques of Data Mining:
– Predictive modeling: Classification, Regression (decision trees, neural networks, statistical models, etc.)
– Association rules and Link analysis
– Clustering
– Anomaly Detection
– Visualization
– Interpretations
– Some advanced Machine Learning techniques
– etc.
While the corresponding theory is presented, the course maintains a practical approach using statistical software packages (e.g., SAS E-miner, MATLAB, MINITAB).
Machine Learning
Machine learning deals with the design and development of computer algorithms that can harness the vast amounts of data available today and use that data intelligently to solve a variety of real-world problems. A principal focus of machine learning research is to automatically produce rules and patterns from data. In recent years, machine learning techniques have led to significant advances in many fields, including marketing, fraud detection, stock trading, robotics, machine translation, computer vision, medicine, and bioinformatics. There is a very close connection between theory and practice in machine learning, and almost every application is accompanied by a large body of theory. However, one of the appealing things about machine learning is that, once you understand the basics, it is a very open field in which progress can be made quickly by finding ways to formalize what we can figure out about the world. The goal of this course is to provide an overview of the state-of-the-art algorithms used in machine learning from different perspectives, and to give some understanding of where the field is heading next. We will discuss both the theoretical properties of these algorithms and their practical applications.
The course is divided into two parts. The first half will focus on many “traditional”, tried-and-true techniques and algorithms in machine learning and pattern recognition. The second half will focus on more recent developments in machine learning. (Topics may be removed or added depending on the interests of the participants and the available time.)
– Linear Models for Regression/Classification
– Probability Distributions and Density Estimation
– Mixture Models and Expectation Maximization
– Kernel Methods (including SVMs)
– Latent Variable Models
– Sampling Methods (including MCMC)
– Graphical Models
– Models for Sequential Data (including Hidden Markov Models)
– etc
Undergraduate
Data Analysis & Practice
- Basics of Data Analysis: Descriptive/Inferential Statistics, Sampling, Data Screening, etc.
- Multivariate Analysis: Correlation/Regression/Factor/Discriminant Analysis
- Time Series Analysis: AR(I)MA
- Introduction to Up-to-date Data Mining Techniques
- Practice with SAS or MINITAB
- Term Project
Statistics Applications
- Understanding of “descriptive statistics”
- Understanding of “discrete distributions and their applications”
- Understanding of “continuous distributions and their applications”
- Foundation of “statistical inference” including parameter estimation and hypothesis testing
- Understanding of “experimental design”
- Foundation of “distribution-free statistics” (e.g., nonparametric hypothesis testing), etc.
- While the corresponding theory is presented, the course maintains a practical approach using statistical software packages (e.g., MINITAB, EXCEL, SAS, MATLAB).
Operations Research and Practice I / II
- Work scheduling
- Production planning & Production process
- Capital budgeting
- Financial planning
- Blending (e.g., oil refinery management)
- Farm planning
- Distribution
- Multi-period decision problems
- Inventory model
- Financial models
- Linear programming
- Network programming
- Integer programming
- Nonlinear programming
- Stochastic processes
- Problem Identification
- Formulation
- Derivation of Optimal Solution(s)
- Model Validation
- Implementation
- Interpretation
MATLAB: Computer Programming for Science Computation
- Math and computation
- Algorithm development
- Modeling, simulation, and prototyping
- Data analysis, exploration, and visualization
- Scientific and engineering graphics
- Etc.