Research

Overview

High-Performance Computing:
- Scalable parallel computing with a focus on PETSc and scientific software frameworks
Machine Learning and Deep Learning:
- Pattern recognition, data mining, statistical learning, and neural network-based models
Computational Mathematics and Numerical Analysis:
- Numerical solutions of partial differential equations (PDEs), scientific computing, and optimization
Statistics and Data Science:
- Probability, time series modeling, statistical computing, and applications in business and economics
Dimensionality Reduction and Variable Selection:
- Methods for simplifying complex, high-dimensional data while preserving essential structure
Information and Complexity Theory:
- Exploring fundamental limits and efficient representations in data-driven modeling

Machine Learning (ML) is one of the most important fields in engineering and science today. It deals with the fundamental problem of using a data set to recreate the process that generated the data (hence that process is ‘learned’ from a set of observations). The name of the field distinguishes it from human learning. ML is also referred to as computational learning or statistical learning, and has significant overlap with data mining, pattern recognition, and parts of statistics. Over the past 5 years, there has been an explosion of interest in ML and its application to scientific research in different fields, to industrial products, and to financial and commercial systems.

We work on the theory, algorithms, and applications of ML. Our goal is to understand the principles of how learning works and to develop learning solutions for real-life problems. We have successfully applied ML to a variety of practical problems, including financial forecasting, medical image classification, industrial inspection, recommender systems, and credit approval. All of these problems, and many others, share the same premise: a data set generated by an underlying process. The process cannot be pinned down mathematically, so ML is used to infer what the process is from the available data.

ML can be viewed as an alternative approach to system design. Instead of the conventional way of mathematically modeling the task at hand and implementing the model as a system (a computer program or a piece of hardware), we let the learning algorithm do the work for us. We start with a generic model, such as a neural network, that has a number of “untuned” internal parameters. Depending on how we tune the parameters, the model can implement vastly different tasks. The role of learning is to take examples of the task, such as inputs together with their target outputs, and use this information to tune the parameters of the model to mimic the desired task.
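As a minimal sketch of this idea, the example below tunes a deliberately tiny generic model, y = w*x + b, from input/target examples using stochastic gradient descent. The function name `train`, the learning rate, the epoch count, and the two toy tasks are all illustrative assumptions, not a method taken from our research; a neural network follows the same pattern with many more parameters.

```python
import random


def train(examples, lr=0.05, epochs=500, seed=0):
    """Tune the parameters (w, b) of the generic model y = w*x + b.

    `examples` is a list of (input, target output) pairs.
    Hypothetical helper written for illustration only.
    """
    rng = random.Random(seed)
    # Start from "untuned" parameters.
    w, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
    for _ in range(epochs):
        for x, target in examples:
            error = (w * x + b) - target
            # Gradient step on the squared error 0.5 * error**2.
            w -= lr * error * x
            b -= lr * error
    return w, b


inputs = (-2.0, -1.0, 0.0, 1.0, 2.0)

# Two made-up tasks: the same generic model mimics either one,
# depending only on the examples it is shown.
task_a = [(x, 2.0 * x + 1.0) for x in inputs]
task_b = [(x, -3.0 * x + 0.5) for x in inputs]

w_a, b_a = train(task_a)
w_b, b_b = train(task_b)
print(round(w_a, 3), round(b_a, 3))
print(round(w_b, 3), round(b_b, 3))
```

The learning code never sees the formulas behind the two tasks. It sees only the examples, and the tuned parameters end up close to the values that generated them.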

One highlight of our research has been the use of hints in learning: in addition to the data, side information about the underlying process is used to enhance the learning process. We have also worked on the theory of learning, focusing on the case where the data is very noisy, with applications to computational finance. Some of our current research is in the area of recommender systems and collaborative filtering, where we are addressing a number of intriguing theoretical and algorithmic questions.

I am also broadly interested in computational social science, natural language processing, and artificial intelligence.

Current teaching:

EBIO 1240: General Biology Lab 2

Previous teaching:

CSCI 2270: Data Structures
CSCI 3022: Data Science with Probability and Statistics
CSCI 1320: Engineering Applications

To carry on the lost teachings of the sages of the past!