Data & Storage
Research at the University of Kentucky often involves large datasets that must be stored, processed, and shared across computing systems. The Center for Computational Sciences (CCS) provides several storage platforms designed to support computational workflows, collaborative research, and long-term data retention.
These systems include high-performance parallel file systems for active computation, scalable object storage for large datasets, and network-attached storage for persistent research data. This page provides an overview of these storage models and how they are used within CCS environments.
On this page
- What is Research Storage?
- When should I use Research Storage?
- Storage Systems
- Moving and Sharing Data
- Data Protection Notice
What is Research Storage?
Research storage systems provide the infrastructure needed to manage large datasets generated by modern research workflows.
These systems support activities such as:
- storing simulation and analysis outputs
- managing large datasets used in computational workflows
- sharing research data within research groups or collaborations
- retaining datasets beyond active computation
Different storage platforms are designed to support different access patterns, performance needs, and data management goals.
When should I use Research Storage?
Research storage services may be appropriate if:
- your research generates datasets too large for local storage
- you need shared storage accessible across compute systems
- your workflows require high-throughput access during computation
- you need a scalable system for storing or sharing large datasets
- you need persistent storage for research data that must remain accessible over time
If you are unsure which storage platform is appropriate for your workflow, CCS can help evaluate your data management needs.
Storage Systems
The University of Kentucky’s research storage infrastructure comprises three types of platform, each suited to a different class of data workflow: parallel file systems for high-performance computation, object storage for scalable and programmatic data access, and network-attached storage for long-term dataset retention.
Parallel File Systems (GPFS)
GPFS is optimized for active computational workloads rather than long-term data retention.
Parallel file systems provide high-performance storage designed for active computation on HPC systems. CCS operates GPFS-based storage environments connected to the University’s research computing clusters, enabling fast access to data during simulations, analysis workflows, and other compute-intensive tasks. Each cluster maintains its own GPFS environment, so storage performance can be optimized for the workloads running on that system.
Typical uses include:
- active research data used during computation
- temporary workspace for simulations and analysis pipelines
- shared project directories used by research groups
Storage Quotas/Limits: LCC | MCC | ECC
How to check disk usage
Filesystem Basics
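As a rough illustration of checking how much space a project directory consumes, the following Python sketch sums file sizes under a directory tree. It assumes only that the storage is mounted as an ordinary POSIX path; the function names are hypothetical and are not part of any CCS tooling. Quotas are enforced by the filesystem itself, so the cluster documentation listed above remains the authoritative source for limits and current usage. Walking a very large tree generates many metadata requests, so a scan like this is best run sparingly.

```python
import os


def directory_usage_bytes(root):
    """Sum the sizes of regular files under root, without following symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if not os.path.islink(path):
                    total += os.lstat(path).st_size
            except OSError:
                # Files can vanish or be unreadable while scanning; skip them.
                continue
    return total


def human_readable(num_bytes):
    """Format a byte count using binary units (KiB, MiB, GiB, TiB)."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        # Stop at the first unit that fits, or at the largest unit available.
        if size < 1024 or unit == "TiB":
            return f"{size:.1f} {unit}"
        size /= 1024
```

Calling `human_readable(directory_usage_bytes(path))` on a project directory returns a short summary string. The total reflects apparent file sizes, which may differ from the figure the filesystem counts against a quota.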
Object Storage (Ceph)
Object storage is not intended to replace high-performance filesystems used by compute workloads.
Object storage provides scalable storage for large datasets and for applications that access data programmatically or through services. CCS operates Ceph-based object storage infrastructure that exposes S3-compatible storage services for research workflows built around object interfaces.
Typical uses include:
- storing large research datasets
- enabling data services or programmatic access to research data
- supporting applications designed to interact with object storage systems
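Because the service is S3-compatible, programmatic access could look something like the sketch below, which uploads a directory tree as objects. This is an illustration under stated assumptions rather than a CCS-provided procedure: it uses the third-party boto3 library as one possible S3 client, and the endpoint URL, bucket name, and credentials are placeholders that must come from CCS. The helper names `object_key_for` and `upload_tree` are hypothetical.

```python
from pathlib import Path, PurePosixPath


def object_key_for(local_path, local_root, prefix=""):
    """Map a local file path to an object key that preserves the relative layout."""
    relative = Path(local_path).relative_to(local_root)
    # Object keys use forward slashes regardless of the local platform.
    return str(PurePosixPath(prefix) / PurePosixPath(*relative.parts))


def upload_tree(local_root, bucket, endpoint_url, access_key, secret_key, prefix=""):
    """Upload every regular file under local_root to the given bucket.

    All connection arguments are placeholders supplied by the caller.
    """
    # Assumption: boto3 is installed. It is imported here so that
    # object_key_for stays usable without the dependency.
    import boto3

    client = boto3.client(
        "s3",
        endpoint_url=endpoint_url,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )
    uploaded = []
    for path in sorted(Path(local_root).rglob("*")):
        if path.is_file():
            key = object_key_for(path, local_root, prefix)
            client.upload_file(str(path), bucket, key)
            uploaded.append(key)
    return uploaded
```

Deriving keys from relative paths keeps the object layout recognizable to anyone familiar with the original directory tree, and a prefix such as a run or project name helps separate datasets within one bucket.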
Network-Attached Storage (NAS)
NAS systems prioritize accessibility and reliability rather than high-performance parallel I/O.
Network-attached storage (NAS) provides persistent storage for research datasets that are not actively being processed by HPC systems. NAS environments are deployed through condo storage purchases, allowing research groups to obtain dedicated storage capacity for long-term project data.
CCS periodically coordinates campus storage procurements to allow research groups to participate in shared purchases.
Typical uses include:
- retaining datasets after analysis workflows complete
- maintaining persistent project data repositories
- staging data before or after computational workflows
Moving and Sharing Data
Research computing workflows often require moving large datasets between systems, institutions, and collaborators.
CCS provides Globus endpoints on the data transfer nodes (DTNs) associated with each cluster. These endpoints allow researchers to efficiently transfer data between campus systems and external research infrastructure.
Through the institutional Globus service, CCS can also create Globus Guest Collections, allowing researchers to securely share datasets with collaborators without requiring direct system access.
In addition to campus storage infrastructure, CCS works with national research data services such as OURRstore to support long-term archival storage of large research datasets. These services provide durable tape-based storage designed for retaining data beyond the active phases of computational research.
Data Transfer Node Documentation
Globus User Documentation
OURRstore User Documentation
Data Protection Notice
CCS storage platforms are designed to support research workflows but are not intended to serve as the sole copy of important research data.
Researchers are responsible for maintaining appropriate redundant or off-site copies of critical datasets.
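One way to confirm that a redundant copy actually matches the primary data is to compare checksums. The sketch below is a minimal example using only the Python standard library; it assumes both copies are reachable as local directory paths, and the function names are hypothetical rather than part of any CCS tooling.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path relative to root onto its SHA-256 digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_copies(primary_root, copy_root):
    """Return (missing, mismatched): files absent from the copy, and files
    present in both locations whose contents differ."""
    primary = build_manifest(primary_root)
    copy = build_manifest(copy_root)
    missing = sorted(set(primary) - set(copy))
    mismatched = sorted(
        name for name in primary.keys() & copy.keys() if primary[name] != copy[name]
    )
    return missing, mismatched
```

If both returned lists are empty, every file in the primary location has an identical counterpart in the copy. Files that exist only in the copy are not reported.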

