Sam Henry
PhD, Virginia Commonwealth University, 2019
Curriculum Vitae
Dissertation
Research interests: Natural Language Processing, Literature-Based Discovery, Hyperspectral Image Analysis, Machine Learning, Data Visualization
Contact
Physical Mail
- Sam Henry
Virginia Commonwealth University
School of Engineering
Department of Computer Science
601 West Main Street, Room 435
Richmond, Virginia 23284, USA
Academics
Publications
- 2019
- A Literature Based Discovery Visualization System with Hierarchical Clustering and Linking Set Associations, Sam Henry, Aliakbar Panahi, D. Shanaka Wijesinghe, Bridget McInnes, American Medical Informatics Association (AMIA) Informatics Summit, 2019
- 2018
- Association measures for estimating semantic similarity and relatedness between biomedical concepts, Sam Henry, Alex McQuilkin, Bridget McInnes, Artificial Intelligence in Medicine, 2018
- Application of natural language processing (NLP) to metabolomic/lipidomic data for new knowledge discovery from existing scientific literature, Ali Panahi, Sam Henry, Daniel Contaifer, Bridget T. McInnes, Dayanjan Wijesinghe, ASMS Conference on Mass Spectrometry and Allied Topics, 2018
- Vector representations of multi-word terms for semantic relatedness, Sam Henry, Clint Cuffy, Bridget McInnes, Journal of Biomedical Informatics 77, 2018
- 2017
- Linking term association: ranking implicit terms in literature based discovery (Presentation, Proposal), Sam Henry, AMIA NLP Working Group Pre-Symposium 2017
- Semantic Association for Literature Based Discovery [poster], Sam Henry, Bridget McInnes, American Medical Informatics Association (AMIA) 2017
- Literature Based Discovery: Models, methods, and trends, Sam Henry, Bridget McInnes, Journal of Biomedical Informatics 74, 2017
- Evaluating Feature Extraction Methods for Knowledge-based Biomedical Word Sense Disambiguation, Sam Henry, Clint Cuffy, Bridget McInnes, BioNLP 2017
- 2016
- Semantic Relatedness for Literature Based Discovery (Poster, Proposal), Sam Henry, Bridget McInnes, AMIA NLP Working Group Pre-Symposium 2016
- VRep at SemEval-2016 Task 1 and Task 2: A System for Interpretable Semantic Similarity, Sam Henry, Allison Sands, Proceedings of the 10th International Workshop on Semantic Evaluations (SemEval 2016)
- 2015
- Video rate multispectral imaging for camouflaged target detection, Henry, S., Proc. SPIE 9472, Algorithms and Technologies for Multispectral, Hyperspectral, and Ultraspectral Imagery XXI, 2015
- Hyperspectral imaging for differentiation of foreign materials from pinto beans, Mehrubeoglu, M., Zemlan, M., Henry, S., Proc. SPIE 9611, Imaging Spectrometry XX, 2015
- 2014
- Hyperspectral to Multispectral: Video Rate Spectral Imaging Applications, Henry, S., Jafolla, J., Hyperspectral Imaging and Applications (HSI) 2014 proceedings, 2014
- 2010
- Bayesian Classification of Flight Calls with a novel Dynamic Time Warping Kernel, Damoulas, T., Henry, S., Farnsworth, A., Lanzone, M., Gomes, C., ICMLA '10 Proceedings of the 2010 Ninth International Conference on Machine Learning and Applications, 2010
Patents
- 2015
- Hyperspectral Imaging System for Monitoring Agricultural Products during Processing and Manufacturing, Dante, H., Henry, S., Seetharama, D., United States Patent Publication, 2015
- On-line Oil and Foreign Matter Detection System and Method Employing Hyperspectral Imaging, Dante, H., Henry, S., Seetharama, D., United States Patent Publication, 2015
- 2014
- Blending of Agricultural Products via Hyperspectral Imaging and Analysis, Dante, H., Henry, S., Seetharama, D., United States Patent Publication, 2014
Downloads
Guides and Help
- WordNet Installation Guide - a guide for installing WordNet onto Linux machines
- RegEx Reference Sheet - a short list of useful RegExes and their meanings
- Useful Linux Commands - a short list of useful Linux commands
- Poster Printing Instructions - instructions on how to use the poster printer at VCU
- CPAN Update Instructions - a rough guide for updating CPAN modules
- SemEval 2016
- VRep Task 1 - Code for VRep at SemEval 2016 Task 1
- VRep Task 2 - Code for VRep at SemEval 2016 Task 2
- GitHub - my GitHub page with multiple projects
- Other
- UMLS::Association - a Perl package for quantifying the association between two UMLS concepts
- ALBD - Association Literature Based Discovery, a Perl module for performing and evaluating LBD that makes use of association measures
- MetaMap::DataStructures - a library package with data structures to store and manipulate MetaMap data
- Word Query - a user-friendly wrapper class of WordNet::QueryData which allows easy access and use of WordNet
Software and Code
Data
CUI Embeddings - Min Frequency of 5:
UMLS Concept (CUI) embeddings generated with Word2vec-Interface using a minimum frequency cutoff of 5, a window size of 8, a vector size of 200, the CBOW model, and defaults for all other parameters, trained on abstracts from the MetaMapped MEDLINE baseline 2015 with dates ranging from [1983-1985).
CUI Embeddings - No Min Frequency:
UMLS Concept (CUI) embeddings generated with Word2vec-Interface using no minimum frequency cutoff, a window size of 8, a vector size of 200, the CBOW model, and defaults for all other parameters, trained on abstracts from the MetaMapped MEDLINE baseline 2015 with dates ranging from [1975-2015], [1975-2000), and [1975-2010).
CUI Co-occurrence Matrices - threshold 1:
CUI Co-occurrence Matrix generated using UMLS::Association's Hadoop CUI Collector Tool with a window size of 8, word order enforced, and all co-occurrence counts of 1 removed, on titles and abstracts from the MetaMapped MEDLINE baseline 2015 with dates ranging from [1975-2015] and [1975-2010).
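As a rough illustration of what these settings mean, the sketch below counts ordered co-occurrences over a sequence of CUIs. It is not the Hadoop CUI Collector Tool itself: the function names are hypothetical, and it assumes that a window size of 8 means a pair is counted when the second CUI appears within the 8 positions following the first, and that "word order enforced" means (A, B) and (B, A) are counted separately.

```python
from collections import Counter


def count_cooccurrences(cuis, window=8):
    """Count ordered pairs (first, second) where `second` appears
    within `window` positions after `first` (assumed window definition)."""
    counts = Counter()
    for i, first in enumerate(cuis):
        # Look only forward, so order is enforced: (A, B) != (B, A).
        for second in cuis[i + 1 : i + 1 + window]:
            counts[(first, second)] += 1
    return counts


def drop_singletons(counts):
    """Remove every pair whose co-occurrence count is exactly 1."""
    return {pair: n for pair, n in counts.items() if n > 1}


if __name__ == "__main__":
    # Placeholder identifiers, not real UMLS CUIs.
    sequence = ["C1", "C2", "C1", "C2"]
    print(drop_singletons(count_cooccurrences(sequence, window=1)))
```

With a window of 1, the pair (C1, C2) is seen twice and (C2, C1) once, so only (C1, C2) survives the threshold.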
CUI Association Matrix:
CUI Association Matrix generated from CUI co-occurrences collected using UMLS::Association's Hadoop CUI Collector Tool with a window size of 8, word order enforced, and all co-occurrence counts of 1 removed, on titles and abstracts from the MetaMapped MEDLINE baseline 2015 with dates ranging from [1975-2015]. Association scores are calculated using the Pearson's chi-squared measure of UMLS::Association.
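For readers unfamiliar with the measure, here is a minimal sketch of Pearson's chi-squared computed from co-occurrence counts for one CUI pair. It uses the textbook 2x2 contingency-table formulation and is not taken from the UMLS::Association source. The function and argument names are hypothetical: n11 is the count of the pair, n1p the number of co-occurrences with the first CUI in first position, np1 the number with the second CUI in second position, and npp the total number of co-occurrences.

```python
def chi_squared(n11, n1p, np1, npp):
    """Pearson's chi-squared for a 2x2 contingency table built from
    pair count n11, marginal totals n1p and np1, and grand total npp."""
    # Remaining observed cells of the 2x2 table.
    n12 = n1p - n11
    n21 = np1 - n11
    n22 = npp - n1p - np1 + n11
    # Complementary marginal totals.
    n2p = npp - n1p
    np2 = npp - np1
    observed = [n11, n12, n21, n22]
    # Expected counts under independence: row total * column total / grand total.
    expected = [
        n1p * np1 / npp,
        n1p * np2 / npp,
        n2p * np1 / npp,
        n2p * np2 / npp,
    ]
    # Skip cells with zero expectation to avoid division by zero.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


if __name__ == "__main__":
    print(chi_squared(10, 20, 20, 100))
```

A pair that co-occurs exactly as often as independence predicts scores 0, and the score grows as the observed count departs from that expectation.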
Semantic Similarity and Relatedness Evaluation Datasets:
Evaluation datasets for semantic similarity and relatedness. This includes UMNSRS subsets tagged for relatedness and similarity, MiniMayoSRS graded by medical coders, and MiniMayoSRS graded by physicians.
UMNSRS Reference: S. Pakhomov, B. McInnes, T. Adam, Y. Liu, T. Pedersen, G. Melton, Semantic similarity and relatedness between clinical terms: An experimental study, in: Proceedings of the American Medical Informatics Association (AMIA) Symposium, Washington, DC, 2010, pp. 572–576.
- Lab - VCU Natural Language Processing Lab
- Advisor - Bridget McInnes
- Clinical NLP and Machine Learning Seminar - a seminar I gave with an introduction to clinical NLP
- RegExr - create RegExes and get immediate highlighted results
- Vassar Stats - useful for computing p-values and other statistics
- Matrix Calculator - a nice matrix operation calculator
- A more complete look at my past work is on my LinkedIn page
- I enjoy photography; here is my Flickr page
Lab and Colleagues
Other