
CCS/ITSRCI Seminar Series on AI in Practice – April 24, 2025
April 24, 2025, 12:15 PM – 1:15 PM
Speaker: Xin Liang, Computer Science Department, University of Kentucky
Where: 327 McVey Hall (Zoom link: https://uky.zoom.us/j/82467171189)
Title: Advancing Extreme-scale Data Science via Trust-driven Data Reduction
Abstract: Extreme-scale scientific simulations and experiments generate more data than can be stored, transmitted, and analyzed. The recently delivered exascale systems and high-resolution scientific instruments are exacerbating this problem, due to the imbalanced growth between storage systems and data. My research focuses on the development of efficient algorithms and scalable software for high-performance data management on massively parallel and heterogeneous systems. In this talk, I will present how we address the scientific data challenges arising from real-world scientific applications through trust-driven data reduction. Specifically, I will talk about 1) error-controlled and feature-preserving lossy data reduction; 2) data refactoring and error-controlled progressive…

CCS/ITSRCI Seminar Series on AI in Practice – March 13, 2025
*RESCHEDULED* March 13, 2025, 12:15 PM – 1:15 PM
Speaker: William Mattingly, Cultural Heritage Data Scientist, Yale University
Where: 327 McVey Hall (Virtual Speaker) (Zoom link: https://uky.zoom.us/j/82467171189)
Title: Evolution of Machine Learning Approaches in Human Rights Data Analysis
Abstract: Over the past five years, our approach to named entity recognition and semantic search has undergone significant changes. In this tutorial, we will compare the methods used during the initial phase of Bitter Aloe with the techniques currently in use.

CCS/ITSRCI Seminar Series on AI in Practice – March 6, 2025
March 6, 2025, 12:15 PM – 1:15 PM
Speaker: Stephen Davis, Department of History, University of Kentucky
Where: 327 McVey Hall (Zoom link: https://uky.zoom.us/j/82467171189)
Title: The Application of Machine Learning to Human Rights Data: Two Use Cases from the Bitter Aloe Project
Abstract: South Africa’s Truth and Reconciliation Commission (TRC) documented gross human rights violations during apartheid through individual testimonies and incident descriptions. While this approach provided an unprecedented view of political violence, limited planning for future accessibility has constrained researchers to keyword searches and detailed reading of individual transcripts. Bitter Aloe seeks to improve access to the TRC archive by applying two natural language processing techniques: named entity recognition and document embedding. This talk will explore the methodologies used in Bitter Aloe and demonstrate their potential as a broader application of machine learning to human rights…

CCS/ITSRCI Seminar Series on AI in Practice – February 20, 2025
February 20, 2025, 12:15 PM – 1:15 PM
Speakers: Mami Hayashida, ITS-RCI, University of Kentucky, and Vikram Gazula, CCS, University of Kentucky
Where: 327 McVey Hall (Zoom link: https://uky.zoom.us/j/82467171189)
Title: Introduction to Retrieval-Augmented Generation
Abstract: Despite their impressive performance and versatility, LLMs are fundamentally limited by the data they were initially trained on. For researchers who wish to incorporate LLMs into their work, this often becomes a challenge, as the available models trained on general knowledge fail to reference domain knowledge. Similarly, LLMs do not keep up with the most up-to-date information, as each LLM’s knowledge is “frozen”. Retrieval-Augmented Generation (RAG) is a simple technique to overcome this gap by feeding additional knowledge to the LLM as part of the workflow. When deployed effectively, the same LLMs will incorporate the added knowledge…

Dr. Xin Liang wins the NSF CAREER Award for the project Data Polymorphism: Enabling Fast and Adaptable Scientific Data Retrieval with Progressive Representations
Congratulations to Dr. Xin Liang from the UK Computer Science Department, a recipient of the prestigious NSF CAREER Award. The funded project is concerned with the concept of data polymorphism and aims to enable fast and adaptable scientific data retrieval. Scientific simulations and instruments produce an unprecedented amount of data that overwhelms the network and storage systems. Due to the limited capacity in high-end parallel file systems, such data must be stored at remote sites or moved to secondary storage for archival purposes. This poses challenges to fetching the data for post hoc data analytics, as the data movement bandwidth across wide area networks or from secondary systems is very limited. This project bridges this gap by developing scalable software to realize data polymorphism, a novel paradigm that allows for variable…

New NIH Genomics Data Management Policy
The National Institutes of Health (NIH) has issued updated data management policies, which introduce new security standards in the Genomic Data Sharing (GDS) Policy. Specifically, the “NIH Security Best Practices for Users of Controlled-Access Data” require that data be managed on institutional IT systems and third-party computing infrastructures that meet certain standards in accordance with NIST SP 800-171, “Protecting Controlled Unclassified Information in Nonfederal Information Systems and Organizations.” These new policies will be effective on January 25, 2025, at which point adherence to this standard will be included in new or renewed Data Use Certifications or similar agreements stipulating terms of access to controlled-access human genomic data, regardless of whether the Approved User is supported by NIH.

2024 IEEE-CS TCHPC Award Winner Dr. Xin Liang, Computer Science, University of Kentucky
CCS wishes to congratulate Dr. Xin Liang from the University of Kentucky Department of Computer Science. Dr. Liang is one of the three recipients of the IEEE-CS Technical Community on High Performance Computing (TCHPC) Early Career Researchers Award for Excellence in High Performance Computing. More information about the award can be found on the IEEE-CS TCHPC website. Dr. Liang’s research focuses on improving the speed and efficiency of high-performance computing systems, particularly in handling large amounts of scientific data. His work aims to reduce the time it takes to process and analyze data by optimizing operations such as input/output, data transfer, and data analysis. Through collaboration with experts in various fields, including computer science, applied mathematics, and data analysis, Dr. Liang has developed innovative solutions that help bridge the gap between theoretical research…

UK researchers’ exploration of hagfish genome published in Nature
CCS acknowledges Dr. Jeramiah Smith's impactful contribution to genomic research through his publication in Nature. His utilization of CCS resources underscores the vital role of computational support in advancing scientific understanding. Dr. Smith's collaborative effort sheds light on the early evolution of vertebrates through a comprehensive analysis of the hagfish genome.