1. Overview
Recent years have witnessed an explosive increase in electronic data. For example, International Data Corporation (IDC) estimated that the total amount of electronic data was 2.7 zettabytes by the end of 2012. The global size of "Big Data" in biomedicine stood at roughly 200 exabytes in 2012. Such big data have fundamentally changed business, research, and education. Understanding big data, and thereby transforming big data into smart data at the semantic level, is the key to exploring the full potential of data and to unraveling complex phenomena. Because of the scale and complexity of big data, this transformation presents a number of compelling computational and analytical challenges.
Our research interests span a variety of aspects of algorithms and software infrastructure for Big Data and Computational Intelligence, with a particular focus on scientific domains including: (1) data-intensive analytics for pervasive healthcare monitoring to assist the aging population and patients with chronic diseases; (2) scalable deep learning from extremely complex biomedical multimedia data; and (3) user-centered knowledge discovery and decision support from large-scale clinical data to evaluate and improve the quality of health care.
2. Research Areas
Research Area 1: Data-intensive Analytics for Pervasive Healthcare Monitoring to Assist Aging Population and Patients with Chronic Diseases: During the last five years, we have seen an explosive increase in new human-computer interaction devices (e.g., smartphones, Microsoft Kinect) and new sensors (e.g., wireless body sensors). These devices are equipped with inexpensive and unobtrusive sensors that can collect physiological data on chronic conditions in natural living environments. This opens an unprecedented opportunity to discover early predictors and novel biomarkers that support clinical decision making and reduce healthcare cost. The long-term objective of this research is to investigate, develop, and validate new data-intensive analytics models and algorithms that discover insights from physiological information collected with inexpensive and unobtrusive sensors. However, developing new analytics models and algorithms for these devices remains an open research problem with many challenging research questions. Currently, we center our efforts on data analytics for three types of physiological data: time series data from body sensors and smartphones; human motion data from Microsoft Kinect; and electroencephalography (EEG) data from EPOC neuroheadset consumer brain-computer interfaces (BCI), with immediate applications in assisted living environments for the aging population and in-home monitoring/rehabilitation for patients with chronic diseases. Our research results have been published in top journals and conferences, such as IEEE Transactions on Biomedical Engineering (TBME), Journal of Neural Computing & Applications (NCA) by Springer, Journal of Cognitive Neurodynamics by Springer, ACM Multimedia, ACM/IEEE BodyNets, and IEEE EMBS, among others. Real-world validation of our proposed approach is being conducted with our clinical collaborators at the University of Tennessee College of Medicine Chattanooga.
Research Area 2: Scalable Deep Learning from Extremely Complex Biomedical Multimedia Data: Tremendous amounts of biomedical multimedia data, such as CT images, cardiac ECHO videos, and textual patient records, are captured and recorded in digital format during daily clinical practice, medical research, and education. Important biomedical knowledge is embedded in this data. Automatic discovery of this knowledge through machine learning-based intelligent analysis is highly desirable. Unfortunately, the extremely complex nature of biomedical multimedia data (e.g., tens of millions of training samples) makes learning and analysis very challenging. This project includes two parts, both rooted in recent advances in deep learning, a promising area of machine learning research. In the first part of this project, we are developing automated data-intensive (petabyte-scale) content analysis techniques and software for medical images and videos captured during endoscopy procedures. We are among the pioneers in this area, and our pioneering work was awarded the ACG (American College of Gastroenterology) Governors Award for Excellence in Clinical Research for "the Best Scientific Paper". In the second part of this project, we aim to develop an intelligent and scalable multi-modal medical retrieval system to support medical diagnosis, research, and teaching. Radiology is a case in point in our study. Our research on medical multimedia analysis and retrieval has been published in top-ranked journals and conferences, such as IEEE TBME, ACM Multimedia, IEEE ISM, IEEE CIVR, IAPR/IEEE ICPR, and IEEE ICME, among others. Some of our proposed approaches are being evaluated and validated in clinical practice, with the support of our collaborators: Dr. J. Kalpathy-Cramer from Harvard Medical School and Piet C. De Groen, M.D., at Mayo Clinic.
Research Area 3: User-centered Knowledge Discovery and Decision Support from Large Scale Clinical Data to Evaluate and Improve the Quality of Health Care: With the full adoption of Health Information Technology, there will be a steady accumulation of large amounts of patient data that can be leveraged to evaluate and improve the quality, safety, and efficiency of care and to extend public health and research. Research in this area focuses on user-centered data analysis to determine the relative clinical effectiveness of different interventions. Specifically, we employ risk analysis for Acute Coronary Syndromes in chest pain patients as our application domain. Nuclear cardiac stress testing has been incorporated into chest pain unit (CPU) evaluation protocols for patients deemed at low to intermediate risk of acute coronary syndromes (ACS), defined as unstable angina or acute myocardial infarction (AMI). The objective of this project is to develop a computer-aided predictive model to investigate the effect of risk factors (e.g., age, sex, cardiac risk factors) on the incidence of ACS, for the purpose of developing a tool that may assist physicians in predicting ACS in chest pain patients. We have developed parallel, distributed, and scalable computer algorithms to handle real-world clinical data with a large volume of patient information and a very large number of variables. Our results have been reported in top biomedical journals and computer science conferences such as American Journal of Emergency Medicine, Annals of Emergency Medicine, ACM Multimedia, IEEE EMBC, and IEEE BioMed, among others. We have integrated some of our proposed approaches into the clinical workflow to provide computer-aided clinical decision support for medical professionals at the Emergency Department of Erlanger Hospital, Chattanooga, Tennessee. To the best of our knowledge, this is the first ACS risk calculator that has been employed in a real-world clinical environment.
3. Sample Research Projects
(Special Notes: The following sample projects represent a partial list of our past and current research projects. We cannot disclose everything here due to HIPAA and other regulatory compliance requirements. If you are interested in collaborating with us, please feel free to contact the PI, Dr. Yu Cao, at ycao@cs.uml.edu.)
Medical Video/Image Analysis and Retrieval for User-centered Decision Support
Overview
This project includes two parts. In the first part, we have developed automated data-intensive content analysis techniques and software for medical videos/images captured during endoscopy procedures [1-9]. In the second part, we aim to develop an intelligent and scalable multi-modal medical retrieval system to support medical diagnosis, research, and teaching. We are among the pioneers in these areas. Examples of current research efforts include investigating novel annotation, retrieval, and dimension reduction techniques to build a high-quality, large-scale, trusted medical archive [10-16], and developing new query-adaptive search strategies to retrieve the most relevant medical information [17, 18]. We are also developing new representations of biomedical data in a semantically rich, structured form that lends itself to automated search, retrieval, inference, and data-driven knowledge acquisition (e.g., using machine learning) [19]. Software tools and methods for biomedical data annotation/indexing are also explored [20, 21]. Real-world applications of our research include: (1) the first intelligent multimedia system to analyze, retrieve, and visualize important content in medical videos captured during endoscopy; (2) an intelligent medical search engine; (3) an instructional video search engine.
References
Software
Motion Tracking, Analyzing, and Visualization
Overview
Motion is a fundamental component of all organismal behavior. Motion research is one of the most active research topics in graphics and visual computing, driven by a wide range of promising applications in areas such as animation production, movement analysis, and industry. The long-term goal of our motion project is to develop a 3D video tracking and analysis system that can fully recover the unconstrained movements of a wide range of organisms that do not have easily trackable natural landmarks or markers placed by the experimenter. Tracking and analyzing a moving and deforming three-dimensional organism to derive detailed and accurate locomotory kinematics remains a challenging open problem. We are now investigating efficient techniques for multi-view stereo reconstruction, motion analysis, and 2D/3D visual tracking using a flexible geometric model, as well as a new visual learning paradigm that includes both geometric and appearance models for 3D tracking and analysis [22-25]. We envision that the new techniques will enable biomechanists and ethologists to model very large datasets with high accuracy.
References
Software
2D Fish Tracking, Analyzing, and Visualization Software
3D Fly Tracking Software (please email us for source code)
3D Fish Tracking and Analyzing Software (coming soon)
Context Awareness Data Analysis for Body Area Sensor Networks
Overview
The ultimate goal of this project is to develop a new context-aware data analysis framework that supports the development of next-generation pervasive healthcare monitoring based on wireless body area sensor networks (BodyNets). One of the pilot projects is to build a patient care center, a system for periodic and opportunistic patient data collection, analysis, and exchange for people living in rural areas. In this collaborative project, I am leading the data analysis efforts. Noisy sensor measurements, low-bandwidth and unreliable communication between sensors, and limited sensor storage and computation speed make data analysis for BodyNets very challenging. We propose a new data analysis framework to address these issues. We have obtained promising preliminary results, which have been reported in our recent papers [26, 27].
References
Software
Communication and Data Analysis for BodyNets (supports the BSN platform and can be compiled under TinyOS 2.x)
Risk Analysis for Acute Coronary Syndromes in Chest Pain Patients
Overview
Nuclear cardiac stress testing has been incorporated into chest pain unit (CPU) evaluation protocols for patients deemed at low to intermediate risk of acute coronary syndromes (ACS), defined as unstable angina or acute myocardial infarction (AMI). The objective of this project is to develop a computer-aided predictive model to investigate the effect of risk factors (e.g., age, sex, cardiac risk factors) on the incidence of ACS, for the purpose of developing a tool that may assist physicians in predicting ACS in chest pain patients.
References
Software Online Risk Calculator for ACS in Patients Undergoing Stress Testing
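As a rough illustration of what a risk-factor-based predictive model can look like, the sketch below fits a logistic regression with plain gradient descent. It is not the model behind the online calculator: the choice of logistic regression, the feature encoding (age divided by 100, sex as 0/1, number of cardiac risk factors divided by 5), and every data row are assumptions made up purely for illustration and carry no clinical meaning.

```python
import math

# Entirely synthetic toy records (hypothetical, not patient data):
# (age / 100, sex as 0 or 1, number of cardiac risk factors / 5)
ROWS = [
    (0.35, 0, 0.0), (0.42, 1, 0.2), (0.50, 0, 0.2), (0.55, 1, 0.4),
    (0.61, 1, 0.6), (0.66, 0, 0.6), (0.72, 1, 0.8), (0.80, 0, 1.0),
]
# Made-up outcome labels: 1 = ACS, 0 = no ACS
LABELS = [0, 0, 0, 0, 1, 1, 1, 1]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_risk(weights, bias, features):
    """Return the modeled probability of the positive class."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def fit_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit logistic regression by batch gradient descent on the log loss."""
    n_features = len(rows[0])
    n = len(rows)
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = predict_risk(weights, bias, x) - y  # d(loss)/d(z)
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


if __name__ == "__main__":
    weights, bias = fit_logistic(ROWS, LABELS)
    # Two hypothetical patients of the same sex
    younger = (0.40, 1, 0.0)
    older = (0.75, 1, 0.8)
    print(f"younger, no risk factors: {predict_risk(weights, bias, younger):.3f}")
    print(f"older, four risk factors: {predict_risk(weights, bias, older):.3f}")
```

A real clinical model would need validated cohort data, calibration, and far more variables; the sketch only shows how risk factors can be mapped to a probability.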
Protein Structure Classification from NMR Spectra
Overview
Knowledge of the three-dimensional structure of proteins is integral to understanding their functions and a necessity in the era of proteomics. The structural class of a protein lies at the top of any hierarchical characterization of its fold. The designation of class based on protein structure content has been extremely useful from both experimental and theoretical points of view. The objective of this project is to investigate effective and efficient data mining methods for classifying protein structure directly from Nuclear Magnetic Resonance (NMR) spectra using chemical shift information [28-31].
References
Software
4. Past and Current Research Collaborators
Working on high-impact interdisciplinary research projects that contribute to core areas of computer science has played a vital role in the success of our research. We have found that performing theoretical research combined with practical applications, especially where there is cross contribution between theory and practice as well as between different domains, is extremely productive. Our past and current collaboration experience puts us in an excellent position to initiate and solicit this kind of project. Feel free to contact us at ycao@cs.uml.edu if you are interested in collaborating with us.
Medical Image Analysis and Retrieval
Piet C. De Groen, M.D., Mayo Clinic, Rochester, MN
Charles E. Kahn Jr, MD, MS, Medical College of Wisconsin, Milwaukee, WI
Dr. Sanqing Hu, School of Biomedical Engineering, Science & Health Systems, Drexel University, Philadelphia, PA
Dr. Wallapak Tavanapong, Department of Computer Science, Iowa State University, Ames, IA
Dr. Johnny S. Wong, Department of Computer Science, Iowa State University, Ames, IA
Dr. Alex Liu, Department of Computer Science, California State University, Fresno, CA
Rodney Kent Hutson, Jr., M.D., Chief of the Department of Radiology at Erlanger Health System
Francis M. Fesmire, MD, FACEP, Research Director, Department of Emergency Medicine, University of Tennessee, School of Medicine Chattanooga, and Director of Heart-Stroke Center, Erlanger Medical Center
Dr. Tian Zhao, Associate Professor of Computer Science, University of Wisconsin-Milwaukee
Dr. Jayashree Kalpathy-Cramer, Research Scientist of Biomedical Informatics, Oregon Health & Science University
Motion Tracking, Analyzing, and Visualization
Dr. Ulrike Muller, Department of Biology, California State University, Fresno, CA
Dr. Joy Goto, Department of Chemistry, California State University, Fresno, CA
Dr. Ebraheem Fontaine, Department of Mechanical Engineering, California Institute of Technology, Pasadena, CA
Context Awareness Data Analysis for Body Area Sensor Networks
Dr. B. Prabhakaran, Department of Computer Science, University of Texas at Dallas, Dallas, TX
Dr. Sanqing Hu, School of Biomedical Engineering, Science & Health Systems, Drexel University, Philadelphia, PA
Dr. Ming Li, Department of Computer Science, California State University, Fresno, CA
Dr. Alex Liu, Department of Computer Science, California State University, Fresno, CA
Thomas Devlin, M.D., Medical Director, Erlanger Southeast Regional Stroke Center
Gregory Heath, Ph DHSc, MPH, FACSM, FAHA, Director of Research and Professor of Medicine at UTCOMC and Assistant Provost for Research and Engagement and Guerry Professor at UTC
Protein Structure Classification from NMR Spectra
Dr. Krish Krishnan, Department of Chemistry, California State University, Fresno, CA
Dr. Charles Yan, Department of Computer Science, Utah State University
5. People
Interacting with students and other researchers is always my favorite activity. My past tutoring and advising experiences have convinced me that different people learn in different ways and respond best to different approaches. My educational goal is to help students prepare for their professional careers. By helping students identify goals, prioritize tasks, execute plans, and evaluate results, I help them become the future leaders of science and engineering. We are always looking for self-motivated students and researchers to join our team. Please do not hesitate to contact us at ycao@cs.uml.edu for further information.
Faculty: Dr. Yu Cao
Current Graduate Students: Yii Li (Medical Image Retrieval), Swapna Philip (Data Analysis for BodyNets), Julie Pena (Medical Image Retrieval), Satmeet Ubhi (Motion Tracking and Visualization)
Current Undergraduate Students: Ronald John Lugge III (3D Motion Analysis and Game Development)
Graduate Alumni: Sung Baang, Sachin Raka, Mohamed Ali, Rehana Ferwin
Undergraduate Alumni: Matthew Calderaz, Thell Smith, Matthew Daniel Mclaughlin, Brandon Joseph Wilson
6. Sponsors
We are active researchers who explore creative solutions to several fundamental issues in building intelligent information systems and biomedical information systems. From an application point of view, such knowledge has great potential to make an impact in areas such as biological science, medical science, health care, education, homeland security, and public safety. We are grateful to the following organizations for their generous support of our past and current research projects. If you share our vision and are interested in funding our research to address the grand challenges facing society, please contact us at ycao@cs.uml.edu.
7. References (please refer to my publication page for more details)
[1] Y. Cao, W. Tavanapong, K. Kim, J. Wong, J. Oh, and P. C. de Groen, "A framework for parsing colonoscopy videos for semantic units," in Proceedings of the IEEE International Conference on Multimedia and Expo, Taipei, Taiwan, 2004.
[2] Y. Cao, W. Tavanapong, D. Li, J. Oh, P. C. de Groen, and J. Wong, "A Visual model approach for parsing colonoscopy videos," in Proceedings of the International Conference on Image and Video Retrieval, Dublin, Ireland, 2004.
[3] Y. Cao, D. Li, W. Tavanapong, J. Oh, J. Wong, and P. C. de Groen, "Parsing and browsing tools for colonoscopy videos," in Proc. of ACM Multimedia, New York, NY, USA, 2004.
[4] S. Hwang, J. Oh, J. Lee, Y. Cao, W. Tavanapong, D. Liu, J. Wong, and P. C. de Groen, "Automatic measurement of quality metrics for colonoscopy videos," in Proceedings of the Annual ACM International Conference on Multimedia, Singapore, 2005.
[5] Y. Cao, D. Liu, W. Tavanapong, J.-H. Oh, J. Wong, and P. C. de Groen, "Automatic classification of image with appendiceal orifice in colonoscopy videos," in Proceedings of the IEEE International Conference of the Engineering in Medicine and Biology Society, New York City, NY, USA, 2006.
[6] Y. Cao, D. Liu, W. Tavanapong, J. Wong, J. Oh, and P. C. de Groen, "Computer-aided Detection of Diagnostic and Therapeutic Operations in Colonoscopy Videos," IEEE Transactions on Biomedical Engineering, vol. 54, pp. 1268-1279, 2007.
[7] Y. Cao, S. Baang, S. Liu, M. Li, and S. Hu, "Audio-Visual Event Classification via Spatial-Temporal-Audio Words," in Proc. of IAPR/IEEE International Conference on Pattern Recognition (ICPR), Tampa, FL, USA, 2008.
[8] Y. Cao, S. Liu, M. Li, S. Hu, and S. Baang, "Medical Video Event Classification Using Shared Features," in Proc. of IEEE International Symposium on Multimedia (ISM), Berkeley, CA, USA, 2008.
[9] J. Oh, S. Hwang, Y. Cao, W. Tavanapong, D. Liu, J. Wong, and P. C. de Groen, "Measuring objective quality of colonoscopy," IEEE Transactions on Biomedical Engineering, vol. 56, pp. 2190-2196, 2009.
[10] "ARRS GoldMiner," http://goldminer.arrs.org, 2009.
[11] C. E. Kahn Jr. and C. Thao, "GoldMiner: a radiology image search engine," American Journal of Roentgenology, vol. 188, pp. 1475-1478, 2007.
[12] C. E. Kahn Jr. and D. L. Rubin, "Automated semantic indexing of figure captions to improve radiology image retrieval," Journal of the American Medical Informatics Association, vol. 16, pp. 380-386, 2009.
[13] H. Müller, J. Kalpathy-Cramer, C. E. Kahn Jr., W. Hatt, S. Bedrick, and W. Hersh, "Overview of the ImageCLEFmed 2008 Medical Image Retrieval Task," in 9th Workshop of the Cross-Language Evaluation Forum, 2008.
[14] H. Müller, J. Kalpathy-Cramer, I. Eggel, S. Bedrick, R. Said, B. Bakke, C. E. Kahn Jr., and W. Hersh, "Overview of the 2009 Medical Image Retrieval Task," in Working Notes of CLEF 2009 (Cross Language Evaluation Forum), 2009.
[15] Y. Cao, R. Troncy, B. Prabhakaran, and J. Gao, "Data Semantics for Multimedia Systems and Applications," in Proc. of IEEE International Symposium on Multimedia (ISM), San Diego, CA, USA, 2009 (To Appear, Invited).
[16] M.-L. Shyu, Y. Cao, J. Kong, M. Li, M. Lux, and J. Bao, "Introduction to the special issue on 'data semantics for multimedia systems'," Multimedia Tools and Applications, An International Journal from Springer, 2010 (Guest Editorial).
[17] S. Liu, Y. Cao, M. Li, P. Kilaru, T. Smith, and S. Toner, "A Semantics- and Data-Driven SOA for Biomedical Multimedia Systems," in Proc. of IEEE International Workshop on Data Semantics for Multimedia Systems and Applications (DSMSA), Berkeley, CA, USA, 2008.
[18] M.-L. Shyu, Y. Cao, M. Li, J. Kong, and J. Bao, "Introduction to the Special Issue on Data Semantics and Multimedia Information Management," Journal of Multimedia, 2010 (Guest Editorial).
[19] S.-H. Liu, Y. Cao, M. Li, T. Smith, J. Harris, J. Bao, B. R. Bryant, and J. Gray, "A SOA-Based Functional and QoS Semantics-Driven Biomedical Multimedia Processing," in Methodologies for Non-Functional Requirements in Service Oriented Architecture, J. Suzuki, Ed. Hershey, PA, USA: IGI Global (formerly Idea Group), 2010 (Accepted).
[20] J. Bao, Y. Cao, W. Tavanapong, and V. Honavar, "Integration of Domain-Specific and Domain-Independent Ontologies for Colonoscopy Video Database Annotation," in Proc. of the International Conference on Information and Knowledge Engineering, Las Vegas, Nevada, 2004.
[21] D. Liu, Y. Cao, W. Tavanapong, J. O. Johnny Wong, and P. C. de Groen, "Arthemis: A Case Study of Annotation Software in an Integrated Capturing and Analysis System for Colonoscopy," Computer Methods and Programs in Biomedicine, vol. 88, pp. 152-163, 2007.
[22] H. T. Kim, C. Saito, N. T. Mekdara, S. Choudhury, A. Goodarzi, F. Mazloomi, T. Sakha, M. Soltani, S. Ubhi, Y. Cao, J. Goto, and U. K. Muller, "The effects of the glutamate agonist BMAA on the walking behavior of adult fruit flies," in Annual Meeting of the Society for Integrative and Comparative Biology, Seattle, WA, USA, 2010 (Poster, Accepted).
[23] E. I. Fontaine, F. Zabala, M. H. Dickinson, and J. W. Burdick, "Wing and body motion during flight initiation in Drosophila revealed by automated visual tracking," Journal of Experimental Biology, vol. 212, pp. 1307-1323, 2009.
[24] E. I. Fontaine, D. Lentink, S. Kranenbarg, U. Müller, J. van Leeuwen, A. H. Barr, and J. W. Burdick, "Automated visual tracking for studying the ontogeny of zebrafish swimming," The Journal of Experimental Biology, pp. 1305-1316, 2008.
[25] Y. Cao, S. Read, S. Raka, and R. Nandamuri, "A Theoretic Framework for Object Class Tracking," in Proc. of IEEE International Conference on Networking, Sensing and Control (ICNSC), Hainan, China, 2008.
[26] M. Chen, M. Li, V. Leung, S. Prasad, S.-H. Liu, and Y. Cao, "Recent Advances in Body Sensor Networks: A Survey," Computer Communications, The International Journal for the Computer and Telecommunications Industry (Elsevier), 2010 (Accepted with major revision).
[28] S. P. Mielke and V. V. Krishnan, "Protein structural class identification directly from NMR spectra using averaged chemical shifts," Bioinformatics, vol. 19, pp. 2054-64, 2003.
[29] S. P. Mielke and V. V. Krishnan, "An evaluation of chemical shift index-based secondary structure determination in proteins: influence of random coil chemical shifts," Journal of Biomolecular NMR, vol. 30, pp. 143-53, 2004.
[30] S. P. Mielke and V. V. Krishnan, "Estimation of protein secondary structure content directly from NMR spectra using an improved empirical correlation with averaged chemical shift," Journal of Structural and Functional Genomics, vol. 6, pp. 281-285, 2005.
[31] S. P. Mielke and V. V. Krishnan, "Characterization of protein secondary structure from NMR chemical shifts," Progress in Nuclear Magnetic Resonance Spectroscopy, vol. 54, pp. 141-165, 2009.
[32] Buchheit RC, Fesmire FM, Cao Y, et al. Nuclear stress testing in the emergency department chest pain patients with suspected acute coronary syndrome: who should we stress? Ann Emerg Med 2011;58:S209.
[33] Fesmire FM, Hughes AD, Stout PK, et al. Selective dual nuclear scanning in low risk patients with chest pain to reliably identify and exclude acute coronary syndromes. Ann Emerg Med 2001;38:207-215.
[34] Fesmire FM, Hughes AD, Fody EP, Stout PK, et al. The Erlanger Protocol: A one year experience with serial 12-lead ECG monitoring, 2-hour delta serum marker measurements, and selective nuclear stress test to identify and exclude acute coronary syndromes. Ann Emerg Med 2002;40:584-594.