Artificial Intelligence, Machine Learning, and Data Science


This research cluster develops technology, processes, and software that enable effective access to, and use of, overwhelming amounts of information. It combines knowledge from database, machine learning, information retrieval, networking, and human-computer interaction research to create more intelligent information systems. Core strengths include collaborative filtering, probabilistic modeling, spatial databases, usability engineering, web-based interfaces, and wireless computing.

We seek to construct computer systems that can build models of their environments and apply those models to make reliable, rapid decisions. We develop new methods for statistical learning, data mining, and probabilistic reasoning and apply these to problems in environmental monitoring, ecological science, manufacturing engineering, space exploration, robot control, and web-based information systems. Anticipated impacts include cheaper and more accurate environmental monitoring, more cost-efficient factories, and easier access to information on the web.

Research Thrusts

AI/Machine Learning

  • Decision-Making & Reinforcement Learning
  • Machine Learning & Data Mining
  • Pattern Recognition
  • Probabilistic Representation & Reasoning

Natural Language Processing

  • Machine Translation
  • Syntax and Semantics
  • Information Extraction
  • Information Retrieval

Database Systems

  • Easy-to-use Data Exploration
  • Large Scale Data Analytics
  • Heterogeneous & Evolving Datasets
  • Information Extraction & Knowledge-base Construction

Related Courses

  • CS 275: Introduction to Databases
  • CS 450: Database Management System
  • CS 515: Algorithms and Data Structures
  • CS 517: Theory of Computation
  • CS 523: Advanced Algorithms
  • CS 531: Artificial Intelligence
  • CS 533: Advanced Artificial Intelligence
  • CS 534: Machine Learning
  • CS 536: Introduction to Graphical Models
  • CS 539: Special Topics in Artificial Intelligence
  • CS 550: Advanced Database Management System

Software Downloads

  • StratagusAI: an open-source modification of the Stratagus RTS game engine to support AI research
  • Java Library for Adaptation-Based Programming
  • Error Correcting Output Codes for multiclass learning problems
  • MAXQ Hierarchical Reinforcement Learning system
  • TreeCRF system for training conditional random fields via functional gradient ascent
  • PCBR region detector
  • Open Source SIFT Feature Detector
  • TaskTracer system


Tom Dietterich
Machine learning; intelligent systems; intelligent user interfaces; ecosystem informatics

Xiaoli Fern
Machine learning and data mining, specifically the subfields of unsupervised learning and pattern discovery

Liang Huang
Natural language processing, including parsing and translation; structured machine learning; programming languages; automata and formal language theory

Fuxin Li
Computer vision; deep learning; machine learning; segmentation-based object recognition and scene understanding; spatio-temporal video analysis

Raviv Raich
Adaptive sensing/sampling; manifold learning; sparse representations for signal processing

Prasad Tadepalli 
Artificial intelligence; machine learning; automated planning; natural language processing

Weng-Keen Wong 
Data mining; machine learning; anomaly detection; human-in-the-loop learning; ecosystem informatics

Alan Fern
Artificial intelligence, including machine learning, data mining, and automated planning/control

David Hendrix
Motif finding; non-coding RNA structure & function analysis; applications of machine learning to computational biology; deep sequencing data analysis

Rebecca Hutchinson
Machine learning; data mining; ecosystem informatics; computational sustainability

V John Mathews
Adaptation & learning; nonlinear signal processing; application of signal & information processing to neural engineering and biomedical applications, structural health monitoring, audio & communication systems

Stephen Ramsey
Machine learning; computational systems biology; bioinformatics; integrative computational methods to map gene regulatory networks

Arash Termehchy 
Data exploration and management; large scale data analytics; easy-to-use query interfaces; knowledge-base construction


Research Facilities

AI Laboratory
AI Computer Cluster: 34 12-core 64-bit processors (408 total cores)

Research Partners

Industry & Other Organizations

  • American Museum of Natural History
  • BBN Technologies
  • Bell Labs
  • Coelo Company of Design
  • Google
  • Lockheed Martin
  • Microsoft Research
  • Pacific Northwest National Laboratory
  • SAIC
  • SRI International
  • ViewPlus Technologies


  • Arizona State University
  • Brown University
  • Carnegie Mellon University
  • Case Western Reserve University
  • City University London
  • Cornell University
  • Georgia Tech
  • MIT
  • Northeastern University
  • Purdue University
  • Stanford University
  • Stony Brook University
  • Tufts University
  • University of Arkansas
  • University of California at Berkeley
  • University of California, Irvine
  • University of Illinois at Urbana–Champaign
  • University of Massachusetts Amherst
  • University of Michigan
  • University of Rochester
  • University of Southern California
  • University of Texas at Austin
  • University of Texas at Dallas
  • University of Washington
  • University of Wisconsin
  • Yale University

Selected Projects

Integrative methods to map gene regulatory networks in mammalian cells
(S. Ramsey)

  • Machine-learning approaches to identify cis-regulatory regions
  • Bayesian analysis of transcription factor binding site frequencies
  • Computational inference, modeling, and analysis of gene regulatory networks

Adaptation-Based Programming
(A. Fern, M. Erwig, T. Nguyen)

  • Integration of programming language and machine learning research
  • Allow programmers to leave difficult program choices as open choice points
  • Use machine learning techniques to automatically optimize choice points
  • Applications to software game agents and network optimization
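
The idea of an open choice point can be illustrated with a minimal sketch. It assumes a simple epsilon-greedy learner over a fixed set of options. The `ChoicePoint` class, its methods, and the `train` helper are hypothetical names invented for this illustration and are not the API of the group's Java library.

```python
import random


class ChoicePoint:
    """Hypothetical adaptive choice point: the programmer lists the options
    and lets a learner decide which one to use, based on observed rewards."""

    def __init__(self, options, epsilon=0.1, seed=0):
        self.options = list(options)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = {o: 0 for o in self.options}
        self.means = {o: 0.0 for o in self.options}

    def choose(self):
        # Explore a random option with probability epsilon;
        # otherwise exploit the option with the best average reward so far.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.options)
        return max(self.options, key=lambda o: self.means[o])

    def reward(self, option, value):
        # Incremental update of the running mean reward for this option.
        self.counts[option] += 1
        n = self.counts[option]
        self.means[option] += (value - self.means[option]) / n


def train(choice_point, payoff, episodes=2000):
    """Run the program repeatedly, feeding each outcome back to the choice point.
    Returns the option with the highest estimated reward after training."""
    for _ in range(episodes):
        option = choice_point.choose()
        choice_point.reward(option, payoff(option))
    return max(choice_point.options, key=lambda o: choice_point.means[o])
```

In this sketch the program supplies only the list of options and a reward signal; which option is best is learned from repeated runs rather than hard-coded.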

Automated Planning and Learning for Complex Environments
(A. Fern, P. Tadepalli)

  • Development of learning and planning algorithms for selecting actions in complex environments
  • New Monte Carlo planning algorithms
  • New algorithms for reinforcement learning
  • Applications to emergency response planning, game-playing agents, ecological conservation

Understanding Visual Scenes and Activity
(T. Dietterich, A. Fern, S. Todorovic)

  • BugID project: recognizing bug species from image data
  • The OSU Digital Scout Project: Computer Vision Meets American Football
  • Generic part-based object recognition and learning
  • Generic high-level event recognition and learning

Ecosystem Informatics
(T. Dietterich, X. Fern, W-K. Wong)

  • Learning models of species distribution, dispersal, and migration
  • Occupancy-detection-expertise models to understand both the distribution of species and the process of observing them
  • Recognition of bird species from songs and flight calls
  • Optimal active management of wildfires and invasive species

End-User Training of Machine Learning Systems
(M. Burnett, T. Dietterich, A. Fern, W-K. Wong)

  • Enable end-users to tune adaptive software systems
  • Investigating effective forms of communication between computer students and human teachers
  • Applications to labeling and ranking systems
  • Applications to teaching autonomous control agents

Learning from Natural Language Texts
(T. Dietterich, X. Fern, P. Tadepalli)

  • Learning general rules by reading text
  • Learning from incomplete and noisy data
  • Incorporating pragmatics of document generation in learning from texts

Intelligent User Interfaces: The TaskTracer Project
(T. Dietterich)

  • Windows add-on to support multi-tasking desktop workers
  • Methods for machine learning tagging of email, documents, and web pages
  • User interface support for information re-finding and interruption recovery

Genome Informatics
(W-K. Wong, T.C. Mockler)

  • Improved methods and tools for de novo assembly of high-throughput sequencing data
  • New methods and tools for automatic, accurate annotation of genomes in near real-time
  • Design and implementation of novel tools for interacting with massive biological datasets

Active Transfer Learning
(A. Fern, P. Tadepalli)

  • Algorithms for learning and transferring knowledge across related domains
  • Learning in sequential decision problems by asking questions
  • Active learning of transferable knowledge