Lab Introduction

Welcome to the Machine Learning and Data Mining Lab at Ajou University

Data sets with millions of records and thousands of fields are increasingly common in business, engineering, medicine, and the sciences. With the amount of data doubling every few years, uncovering hidden patterns and extracting useful information from such data sets is becoming an important practical issue.

Research on this topic focuses on key questions such as how to build useful models that both make predictions and help us understand the underlying data-generating process. Research projects in our lab draw on theories and techniques at the intersection of computer science, statistics, and mathematics, including foundational ideas from algorithms and artificial intelligence (from computer science), multivariate data analysis, Bayesian estimation, and computational statistics (from statistics), and optimization and probability theory (from mathematics). Machine learning and pattern recognition, in particular, are central to our research, providing both a sound theoretical basis and a practical framework for developing useful data analysis algorithms.

Research activities in our lab span areas as diverse as hospital fraud detection, direct marketing in CRM, oil price prediction, and protein function prediction in bioinformatics. We hope you find our website useful and encourage you to explore its contents (publications, courses, seminars, and other information).

Research Areas

Theory of Machine Learning or Statistical Learning Algorithms

  1. Semi-Supervised Learning
    • Graph-based SSL
    • Transductive Learning
  2. Kernel Methods
    • Support Vector Machines (SVM)
    • Independent Component Analysis (ICA)
    • Kernel PCA, etc.
  3. Connectionist Methods
    • Feed-Forward Neural Networks
    • Autoencoders
    • Self-Organizing Map (SOM), etc.
  4. Graph-based Deep Learning
    • Graph Neural Networks (GNN)
    • Graph Convolutional Networks (GCN), etc.

Applications of Machine Learning Methods in Various Fields

  1. BioMedical Informatics
    • DNA/RNA/Protein Sequence Analysis
    • Protein Function Analysis
  2. Financial Engineering
    • Stock and Futures Trading System
  3. Customer Relationship Management
    • Customer Retention
    • Fraud Detection
    • Cross-sales
    • Direct Marketing

Courses

  1. Data Analysis & Practice (Undergraduate)
    • Basics of Data Analysis: Descriptive/Inferential Statistics, Sampling, Data Screening, etc.
    • Multivariate Analysis: Correlation/Regression/Factor/Discriminant Analysis
    • Time Series Analysis: AR(I)MA
    • Introduction to Up-to-date Data Mining Techniques
    • Practice with SAS or MINITAB
    • Term Project
  2. Statistics Applications (Undergraduate)
    • Understanding of descriptive statistics
    • Understanding of discrete distributions and their applications
    • Understanding of continuous distributions and their applications
    • Foundations of statistical inference, including parameter estimation and hypothesis testing
    • Understanding of experimental design
    • Foundations of distribution-free statistics (e.g., nonparametric hypothesis testing), etc.
    • Alongside the corresponding theory, the course keeps a practical approach using statistical software packages (e.g., MINITAB, Excel, SAS, MATLAB)

Projects

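  • [Global news-based early warning algorithm for supply chain crisis detection] Korea Institute of Science and Technology Information (KISTI) (2023.05 ~ 2023.11)
  • [Development of a Data Processing System for Comprehensive Analysis of VOC Components] Hyundai NGV (2023.04 ~ 2023.11.04)
  • [Development of a Voice Phishing Information Collection and Processing System and a Big Data-based Investigation Support System] Institute of Information & Communications Technology Planning & Evaluation (IITP), Korean National Police University (2022.01 ~ present)
  • [Development of Artificial Intelligence Models for Precision Medicine and Diversified Diagnosis of Dementia] National Research Foundation of Korea (NRF) (2022.01 ~ present)
  • [Research on Network-based Risk Inference for Persons/Organizations Using Linked Heterogeneous Public Safety Data and Performance Enhancement] Electronics and Telecommunications Research Institute (ETRI) (2021.04 ~ 2021.10)
  • [Strategy Consulting on Introducing Recognition Technology for Cultural Heritage Images and Building a Similar-Image Clustering Model] Korea Institute of Science and Technology Information (KISTI) (2020.04 ~ 2020.10)
  • [Power Mechanisms in Korean History Based on ‘Big Data’ Analysis] Ministry of Education (2015.09 ~ 2020.09)
  • [Development of a Cross-Domain Metric Learning Algorithm Using Deep Semi-Supervised Learning Networks] Ministry of Education (2018.06 ~ 2021.03)
  • [Research on a Machine Learning Model for Selecting Promising Companies for Technology Commercialization Support] Korea Institute of Science and Technology Information (KISTI) (2018.05 ~ 2018.11)
  • [Artificial Intelligence Recommendation System for Combination Drug Development] Ministry of Science, ICT and Future Planning (2017.05 ~ 2018.04)
  • [Customer Complaint Analysis through Text Mining] Hyundai NGV (2017.05 ~ 2017.10)
  • [Development of Networking and Linkage Methodologies for Diverse Biomedical Big Data] Ministry of Education (2015.11 ~ 2018.10)
  • [Development of a Hierarchical Integration Algorithm for Large-Scale, Diverse Biomedical Data] Ministry of Science, ICT and Future Planning (2013.06 ~ 2016.05)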
  • [Medical Fraud and Abuse Pattern Detection Algorithm Development] supported by the Korea Research Foundation Grant funded by the Korean Government (MOEHRD) (2009 – 2012)
  • [Oil Price Early Warning System] supported by the Korea Energy Economics Institute (KEEI) (2008 – 2009)
  • [Direct Marketing with Multiple Data Sources] (2006.12 – 2008.08)
  • [Medical Fee Review & Assessment based on Hospital Profiling Data] supported by the Health Insurance Review & Assessment Service (HIRA) (2007 – 2008)
  • [iDMS: Intelligent Digital Manufacturing System] sponsored by the grant for the Post Brain Korea 21 National Project (2007 – 2012)
  • [Graph-based Multiple Data Integration] funded by Ajou University (2006 – 2008)
  • [Protein Functional Class Prediction: Multiple Data Integration and Importance Ranking] supported by the Korea Research Foundation Grant funded by the Korean Government (MOEHRD) (2006 – 2007)
  • [Neural Network Modeling for Intelligent Novelty Detection] sponsored by the Korean Ministry of Science and Technology and by the Brain Korea 21 National Project (2001 – 2004)
  • [Sensory Information Processing Models Based on Brain Function: Learning and Evolution Algorithms for Neural Networks] supported by the Brain Science and Engineering Research Program, sponsored by the Korean Ministry of Science and Technology (1998 – 2000)
  • [Data Warehouse based Data Mining S/W Development] sponsored by the Korean Ministry of Information and Communication (1997 – 1998)
  • [Neural Network Intelligent Quality Systems] sponsored by Pohang Iron and Steel Company (POSCO) (1995 – 1996)