image rights reserved by xkcd.com

Research Interests

Creation and use of benchmark data sets with ground truth.

Scientific Evaluation Methods for Visual Analytics Science and Technology (SEMVAST)

I participate in SEMVAST on a National Science Foundation grant.

Working with Dr. Catherine Plaisant, University of Maryland
Dr. Jean Scholtz, Battelle Memorial Institute
Dr. Georges Grinstein, University of Massachusetts Lowell

Below is an excerpt from the SEMVAST Website

"Visual analytics is the science of analytical reasoning facilitated by interactive visual interfaces. As new visual analytics methods and tools are developed an evaluation infrastructure is needed. There is currently no consensus on how to evaluate visual analytics systems as a whole. It is especially difficult to assess their effectiveness as they combine multiple low level components (analytical reasoning, visual representations, computer human interactions, data representations and algorithms, tools for communicating the results of such analyses) integrated in complex interactive systems that requires empirical user testing. Furthermore, it is difficult to assess the effectiveness without realistic data and tasks. Our long term goals are to:

Visual analytics builds on multiple core research fields (e.g. information visualization, knowledge discovery, data mining, cognitive science, intelligent user interfaces, human-computer interaction) and will impact many visual analytics application domains (e.g. intelligence and business analysis, bioengineering and genomic research, transportation, emergency response). A survey of visual analytics evaluation methodologies across those disciplines is needed. An evaluation infrastructure will be seeded with benchmark data sets with ground truth and corresponding tasks, metrics definitions and tools to automate measurements, and the beginning of an online community to encourage collaboration and sharing of qualitative and quantitative methods amongst researchers.

Community wide, systematic evaluations of visual analytic systems will produce better understanding of the issues in the core research fields involved in visual analytics as well as the issues that cross between those research fields. A sharable set of user centered evaluation methods, benchmarks and metrics will be developed, which will allow researchers to assess the utility of their own techniques, in their own application domain. New approaches to the preparation and use of datasets with ground truth for empirical evaluation will be devised. Tools will be available for facilitating measurements of utility."

The above was an excerpt from the SEMVAST Website

IEEE VAST 2008 Challenge

I am currently working on the VAST 2008 Challenge on social network analysis. Many things can be represented as a social network, and many tools already exist for creating and analyzing them. But how can you tell which tool is best? Is it the one that runs the fastest, or the one that does the best job? And what does "the best job" mean? I am creating a tool (focused specifically on the 2008 VAST Challenge) that compares two social networks and produces a measure of how closely they match. The VAST 2008 Challenge is unique in that its synthetic datasets contain a ground truth scenario, which contestants try to discover using various tools. My tool will be a web-based social network evaluator that takes in a contestant's guess at the answer social network, compares it with the ground truth, and gives it a score.
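As a rough illustration of what "compare two social networks and give a score" could look like, here is a minimal Python sketch. It is not the actual evaluator: it assumes each network is given as a list of undirected links between named people, and it assumes edge-level precision, recall, and F1 as the measure. The names `normalize` and `score_network` and the sample data are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

Edge = Tuple[str, str]


def normalize(edges: Iterable[Edge]) -> Set[FrozenSet[str]]:
    """Treat links as undirected; drop self-loops and duplicate links."""
    return {frozenset((a, b)) for a, b in edges if a != b}


def score_network(guess: Iterable[Edge], truth: Iterable[Edge]) -> Dict[str, float]:
    """Score a guessed network against the ground truth network.

    Assumed measure (not taken from the challenge rules):
    precision = fraction of guessed links that are in the ground truth,
    recall    = fraction of ground truth links that were guessed,
    f1        = harmonic mean of precision and recall.
    """
    g = normalize(guess)
    t = normalize(truth)
    matched = len(g & t)
    precision = matched / len(g) if g else 0.0
    recall = matched / len(t) if t else 0.0
    f1 = 2 * precision * recall / (precision + recall) if matched else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical example data: three true links, three guessed links.
truth = [("A", "B"), ("B", "C"), ("C", "D")]
guess = [("B", "A"), ("B", "C"), ("A", "D")]
result = score_network(guess, truth)
print(result)
```

In this example two of the three guessed links are correct and two of the three true links are found, so all three values come out to two thirds. A real evaluator would likely also need to handle name matching, link types, and missing or extra people, which this sketch ignores.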

I will be at Pacific Northwest National Laboratory to assist in judging from July 20th - 25th. I will be volunteering at and attending VisWeek in Columbus, Ohio from Oct 19th - 24th.

Universal Visualization Platform (UVP)

The UVP is a common Java-based software infrastructure for developing and deploying custom visualization and analysis tools. I currently provide basic support for the UVP.

Breast Cancer Risk Analysis

Supported on a Mass General Hospital grant from 2006 - 2007.

  • Worked as a member of a grant-supported team analyzing breast cancer risk. Duties included data mining and statistical analysis.
  • Created a program to produce Gail Model results to aid in the analysis of breast cancer data.
  • Used InforSense, a knowledge discovery environment software package focused on visual programming, to create visual workflows of data mining steps.
  • Integrated de-identified patient data into the current database.