CAOS

CAOS (Context-Aware Object Storage) is a data storage system built for the diverse and changing needs of modern biodiversity research.

Scalable Data Storage for Biodiversity Research

CAOS uses standard S3 object storage to manage millions of biodiversity-related multimedia objects, including specimen photographs, microscope images, lot specimen images, videos, trace electropherograms, and sequences in FASTA and FASTQ formats. Objects are categorized by type, and each object carries metadata that enables context-based operations. For instance, specimen images from robotic imagers are automatically cropped using subject isolation methods, and thumbnails are generated in multiple sizes. Each object type has its own functions, and the system is designed so that new object types can be added.
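One way to picture type-based operations is a registry that maps each object type to the functions that apply to it. The sketch below is illustrative only: `StoredObject`, `register_type`, `process`, and the type and operation names are hypothetical and are not taken from the CAOS API, and the operations return strings instead of doing real image work.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class StoredObject:
    """Hypothetical stand-in for an object held in S3-style storage."""
    key: str                      # object key, e.g. "plates/a01.jpg"
    object_type: str              # e.g. "specimen_image" or "fastq"
    metadata: Dict[str, str] = field(default_factory=dict)


Operation = Callable[[StoredObject], str]

# Maps an object type to the operations that run for that type.
OPERATIONS: Dict[str, List[Operation]] = {}


def register_type(object_type: str, *operations: Operation) -> None:
    """Add a new object type, or attach more operations to an existing one."""
    OPERATIONS.setdefault(object_type, []).extend(operations)


def crop_to_subject(obj: StoredObject) -> str:
    # Placeholder: a real step would isolate the subject and crop the image.
    return f"cropped {obj.key}"


def make_thumbnails(obj: StoredObject) -> str:
    # Placeholder: a real step would write several thumbnail sizes.
    return f"thumbnails for {obj.key}"


def process(obj: StoredObject) -> List[str]:
    """Run every operation registered for the object's type, in order."""
    if obj.object_type not in OPERATIONS:
        raise KeyError(f"unknown object type: {obj.object_type}")
    return [op(obj) for op in OPERATIONS[obj.object_type]]


# Assumed example types; new types are added by calling register_type again.
register_type("specimen_image", crop_to_subject, make_thumbnails)
register_type("fastq")  # stored, but no image operations apply
```

Under this design, supporting a new object type means registering it with its own operations, without changing `process`.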

Functionality:

CAOS operates independently and focuses on ingesting and distributing media files at large scale. This design lets labs that produce large volumes of images and barcodes register images or FASTA/FASTQ files first and associate them with records on BOLD later. CAOS also provides an API that supports data replication and distribution for mirrors and partner databases.
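The register-first, associate-later flow can be sketched as a small in-memory registry. Everything here is an assumption made for illustration: `MediaRegistry`, its methods, and the record identifier format are hypothetical and do not describe the actual CAOS or BOLD interfaces.

```python
from typing import Dict, List, Optional


class MediaRegistry:
    """Hypothetical registry: objects exist before they are linked to a record."""

    def __init__(self) -> None:
        # Object key -> associated record ID, or None while unlinked.
        self._links: Dict[str, Optional[str]] = {}

    def register(self, object_key: str) -> None:
        """Record that an object was ingested; it has no record yet."""
        self._links.setdefault(object_key, None)

    def associate(self, object_key: str, record_id: str) -> None:
        """Link a previously registered object to a record."""
        if object_key not in self._links:
            raise KeyError(f"object not registered: {object_key}")
        self._links[object_key] = record_id

    def unassociated(self) -> List[str]:
        """Return the keys of objects still waiting for a record."""
        return sorted(k for k, v in self._links.items() if v is None)


registry = MediaRegistry()
registry.register("plates/a01.jpg")
registry.register("runs/r1.fastq")
registry.associate("plates/a01.jpg", "RECORD-001")  # made-up record ID
```

After these calls, `registry.unassociated()` lists only `runs/r1.fastq`, the file that was ingested but not yet tied to a record.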

Multimedia Object Storage:

Data is stored in resilient, fault-tolerant S3 storage that protects it against loss.

Metadata Storage:

Each object type has defined metadata, including information on licensing, provenance, and quality.
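As a rough illustration of per-type metadata, the sketch below checks an object's metadata against a list of required fields for its type. The field names beyond licensing, provenance, and quality (such as `instrument`), the type names, and the `missing_fields` helper are assumptions; the actual CAOS schemas are not described here.

```python
from typing import Dict, List

# Hypothetical required fields per object type.
REQUIRED_FIELDS: Dict[str, List[str]] = {
    "specimen_image": ["license", "provenance", "quality"],
    "fastq": ["license", "provenance", "quality", "instrument"],
}


def missing_fields(object_type: str, metadata: Dict[str, str]) -> List[str]:
    """Return the required fields that are absent or empty, in schema order."""
    required = REQUIRED_FIELDS[object_type]
    return [name for name in required if not metadata.get(name)]


example = {"license": "CC-BY", "provenance": "Lab A", "quality": "raw"}
```

With this example, `missing_fields("specimen_image", example)` returns an empty list, while the same metadata checked as `"fastq"` reports that `instrument` is missing.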

Automated Processing:

When a media object is uploaded, CAOS queues it for processing by defined workflows. For reads that support a barcode, CAOS generates visualizations akin to trace electropherograms, which are accessible through the BOLD workbench, the portal, or third-party platforms.
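The upload-then-queue pattern can be sketched with Python's standard `queue` module. This is a single-process toy under stated assumptions: the workflow table, step names, and the functions `enqueue_upload` and `drain` are hypothetical, and a production system would more likely use a durable, distributed queue.

```python
import queue
from typing import Dict, List, Tuple

# Hypothetical workflows: object type -> ordered processing steps.
WORKFLOWS: Dict[str, List[str]] = {
    "specimen_image": ["crop_to_subject", "make_thumbnails"],
    "hts_reads": ["render_read_visualization"],
}

Job = Tuple[str, str]  # (object key, object type)


def enqueue_upload(jobs: "queue.Queue[Job]", key: str, object_type: str) -> None:
    """Called after an upload completes; processing happens later."""
    jobs.put((key, object_type))


def drain(jobs: "queue.Queue[Job]") -> List[Tuple[str, str]]:
    """Process queued jobs in FIFO order; return (key, step) pairs as a log."""
    completed: List[Tuple[str, str]] = []
    while True:
        try:
            key, object_type = jobs.get_nowait()
        except queue.Empty:
            break
        for step in WORKFLOWS.get(object_type, []):
            # Placeholder: a real worker would execute the step here.
            completed.append((key, step))
        jobs.task_done()
    return completed
```

Separating `enqueue_upload` from `drain` keeps uploads fast, since the expensive steps run after the object is already stored.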

HTS Barcode Support:

A critical requirement for advancing DNA barcoding is ensuring the provenance of HTS (High-Throughput Sequencing) barcodes. CAOS addresses this need by storing and processing supporting reads.

Accessing CAOS:

CAOS runs in the background as the object storage engine for BOLD, and it also supports institutions that generate large volumes of multimedia or sequence data. It provides an API for those developing applications on the BOLD framework.

To request an organizational account or additional information, please contact support.