Got questions? This way.

FAQs and quick fixes.

For New Users
For Registered Users
For High-Volume Data Users

New Users:

Essential Information for Getting Started with BOLD

1. What kind of sequences does BOLD accept?

BOLD accepts sequences from more than 150 genetic markers, including the four main DNA barcoding markers (COI-5P, ITS, matK, and rbcL) and dozens of others. For a full list of accepted markers, or to inquire about adding new markers to the BOLD database, contact support@boldsystems.org.

2. Can I take advantage of the tools on BOLD without creating an account?

Many tools including the BOLD Taxonomy Browser, the BOLD ID Engine, the BIN Database, and the BOLD Public Data Portal do not require an account. The BOLD Taxonomy Browser allows users to view information about the progress of DNA barcoding for any taxon on BOLD, while the Public Data Portal and BIN database allow users to view the actual records assembled in BOLD and used in scientific publications. Finally, the BOLD ID Engine allows users to compare their sequences against the DNA barcode library to infer specimen identifications.

3. Can I create an account and put my sequence on BOLD even if I am not part of the iBOL consortium?

Being a member of iBOL is not a prerequisite for using BOLD Systems. BOLD Systems is an open-access DNA barcoding workbench, and anyone is welcome to use it to store and analyze data. All BOLD users have access to the same tools and resources whether they are part of iBOL or not.

4. Can I request access to private data?

BOLD adheres to stringent security policies to ensure the privacy of its users. However, one of the goals of BOLD is to promote data sharing and collaboration. As such, you may request access to private data from BOLD by sending a message to support@boldsystems.org, and our support staff will contact the owner on your behalf. It is beneficial to include a description of what the data will be used for, as it will help data owners in their decision-making process.

5. What is the source of the identification in the BOLD ID Engine?

BOLD is a community-based website, so all identifications provided by the BOLD ID Engine are based on data submitted by other users of the system. In order to minimize uncertainty, the BOLD ID Engine offers four separate libraries for users to search from. These vary both in the number of records and type of data they possess. Read the descriptions offered for each library to make sure you select the most appropriate option for your search. BOLD also offers historical database searches, so you can search as far back as 2009 and see the historical results for your search. This tool is especially useful for users trying to replicate results from previous years.

6. I noticed an issue with one of the records in the BOLD Public Data Portal. Who should I contact?

If you notice an issue with one of the records in the BOLD Public Data Portal, you can log into BOLD to add a comment or a tag to that record. The owner of the record will be notified of your tag and comment so they know to take action. If you do not have a BOLD account, we recommend that you register for one. Otherwise, please contact support@boldsystems.org so our support staff can contact the record owner.

Registered Users:

Key Information and Support for Current BOLD Members

1. What is a BOLD Process ID?

BOLD Process IDs are unique codes automatically generated for each new record added to a project. They connect specimen information, such as taxonomy, collection data and images, to the DNA barcode sequence for that specimen. BOLD Process IDs follow a standard format: the project code and a sequential number, followed by the year the record was added to the database. For example, the first record uploaded to project PROJ in 2012 would be assigned BOLD Process ID PROJ001-12. This format ensures that BOLD Process IDs are always unique in the system and that each one identifies the year the record was uploaded and the project it was originally uploaded to.
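As an illustration of the format described above, the following minimal Python sketch splits a Process ID into its parts. The pattern is an assumption inferred only from the PROJ001-12 example (an uppercase project code, a run of digits, a hyphen, and a two-digit year); real project codes may follow other conventions, and parse_process_id is a hypothetical helper, not part of BOLD.

```python
import re

# Assumed pattern, inferred only from the example PROJ001-12:
# project code (uppercase letters), sequential number (digits),
# a hyphen, then the two-digit year the record was added.
PROCESS_ID_PATTERN = re.compile(
    r"^(?P<project>[A-Z]+)(?P<number>\d+)-(?P<year>\d{2})$"
)


def parse_process_id(process_id: str) -> dict:
    """Split a Process ID into project code, sequential number, and year suffix."""
    match = PROCESS_ID_PATTERN.match(process_id.strip())
    if match is None:
        raise ValueError(f"Not a recognised Process ID: {process_id!r}")
    return {
        "project": match.group("project"),
        "number": int(match.group("number")),
        "year_suffix": match.group("year"),
    }


print(parse_process_id("PROJ001-12"))
```

A helper like this can be handy for grouping records by original project or upload year when working with exported data.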

2. What is a verbatim field?

A verbatim (user) field is a duplicate of a key field on BOLD that typically uses a controlled vocabulary (e.g. taxonomic rank, identification method, geographic fields). It allows users to submit data directly to BOLD without delay, even if the entry doesn’t yet meet validation rules. In the background, this gives our team time to work with you to resolve any issues and bring the data into compliance. This approach helps prevent entire records from being held back due to a problem in just one field.

3. How can I make partial updates to records?

BOLD now supports batch updates for specific fields in specimen records without requiring resubmission of all fields.

Upload an Excel file (.xlsx or .xls) that contains the Sample ID column plus only the fields that require updating.

For example, to update Life Stage and Extra Info, include only the following columns: Sample ID, Life Stage, and Extra Info.

If the file contains multiple sheets, each sheet must contain the same set of records. Distinct record sets must be submitted as separate files.

Important:

Updates overwrite existing data.

Blank cells will clear the corresponding data in BOLD.

Start with a Data Spreadsheet

Select the records on the BOLD workbench to be updated.
From the Downloads menu, choose Data Spreadsheets.
Select the sheets containing the specimen data to be updated, then click Download.
Retain the Sample ID column and only the columns requiring modification. Field headers will be automatically mapped to the latest BCDM field names.

Full field definitions and accepted values are available in our documentation. Note special cases below.

Other options:

If an older BOLD template is used, field headers will also be automatically mapped to the latest BCDM field names.

Excel files exported from external databases may also be uploaded, provided that the column headers match the BCDM field names.

Special cases:

Taxonomic ranks:
Include columns Sample ID, all ranks as a set (Phylum to Subspecies), Identifier, and Identification Method.
Additional columns are optional. The Sample ID and Phylum fields must contain data. Other fields may be left blank, but a blank cell will overwrite (clear) any existing data.

Sample ID	Phylum	Class	Order	Family	Subfamily	Tribe	Genus	Species	Subspecies	Identifier	Identification Method

GPS coordinates:
Include columns Sample ID as well as Lat and Lon as a set. Additional columns are optional, but GPS Source is recommended if the project tracks GPS sources.
The Sample ID field must contain data. Other fields may be left blank, but a blank cell will overwrite (clear) any existing data.

Sample ID	Lat	Lon

Collection dates:
Include Sample ID along with both Collection Date Start and Collection Date End in YYYY-MM-DD format.
If the Data Spreadsheet being modified only has a single Collection Date column, there are two options:

Duplicate the Collection Date column to create both required fields, or
Include both Collection Date and Collection Date Accuracy columns and our conversion tool will do the rest.

When the collection date range is uncertain or not tracked, please provide the Collection Date Start and leave the Collection Date End blank.

If the start and end dates are the same, please enter the same date in both columns.

Sample ID	Collection Date Start	Collection Date End
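The column rules in the special cases above can be checked before a file is uploaded. The following Python sketch is one possible pre-flight check, not a BOLD tool: it works on a list of column headers and reports any column set (taxonomy, GPS, collection dates) that is only partly present. The function names are hypothetical, and the header spellings are taken from this page; the headers BOLD actually accepts are defined in its documentation.

```python
import re
from datetime import datetime

RANK_COLUMNS = ["Phylum", "Class", "Order", "Family", "Subfamily",
                "Tribe", "Genus", "Species", "Subspecies"]
# Column groups that this page says must be submitted together.
COLUMN_SETS = {
    "taxonomy": RANK_COLUMNS + ["Identifier", "Identification Method"],
    "GPS": ["Lat", "Lon"],
    "collection date": ["Collection Date Start", "Collection Date End"],
}


def check_update_columns(columns):
    """Return a list of problems found in the headers of an update sheet."""
    present = set(columns)
    problems = []
    if "Sample ID" not in present:
        problems.append("Sample ID column is required")
    for name, group in COLUMN_SETS.items():
        missing = [c for c in group if c not in present]
        # A set is only a problem when it is partly, not wholly, absent.
        if missing and len(missing) != len(group):
            problems.append(
                f"{name} columns must be submitted as a set; "
                f"missing: {', '.join(missing)}"
            )
    return problems


def is_valid_date(value):
    """True if value is blank or a real calendar date in YYYY-MM-DD format."""
    if value == "":
        return True
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
        return False
    try:
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False
    return True


print(check_update_columns(["Sample ID", "Lat"]))
```

Remember that this only checks headers and date formats. Blank cells in a submitted column still clear the corresponding data in BOLD.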

How to submit the update file:

Enter the project console on the BOLD workbench.
From the Uploads menu, select Specimen Data.
Click on the green Initiate Batch Submission box.
Select UPDATE Existing Records.

4. I noticed my records have been flagged. What does it mean, and how can I remove the flag after I fix the issue?

There are several reasons a record may be flagged, including the detection of a contaminated sequence or a misidentified species. Flags serve two purposes: they act as alerts to inform project managers that an issue has been detected in their records, and they prevent a record from being included in the BOLD ID Engine and Taxonomy Browser. In some cases, changing the taxonomy of the sample or re-editing the sequence can resolve the flags. Once the issue has been resolved, project managers can contact support@boldsystems.org to have the flag removed from their record(s).

5. Why doesn't my sequence have a BIN assignment?

BIN assignments are based on sequence divergence, and BINs are generated once per month. Not all sequences will receive a BIN assignment. Currently BINs are only assigned to records with COI sequences longer than 500bp that contain less than 1% ambiguous bases.
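To see whether a sequence is likely to meet the criteria quoted above, a rough local check can be sketched in Python. This is only an approximation under stated assumptions: any symbol other than A, C, G, or T counts as ambiguous, gap characters are ignored, and the marker test simply looks for a name starting with "COI". BOLD's own rules may differ in detail, and meets_bin_criteria is a hypothetical helper.

```python
def ambiguous_fraction(sequence: str) -> float:
    """Fraction of bases that are not an unambiguous A, C, G, or T.

    Assumption: gap characters ('-') are ignored, and every other
    non-ACGT symbol (N, R, Y, ...) counts as ambiguous.
    """
    bases = [b for b in sequence.upper() if b != "-"]
    if not bases:
        return 0.0
    ambiguous = sum(1 for b in bases if b not in "ACGT")
    return ambiguous / len(bases)


def meets_bin_criteria(marker: str, sequence: str) -> bool:
    """Check: COI marker, longer than 500 bp, under 1% ambiguous bases."""
    length = len(sequence.replace("-", ""))
    return (
        marker.upper().startswith("COI")
        and length > 500
        and ambiguous_fraction(sequence) < 0.01
    )


example = "ACGT" * 150  # 600 bp, no ambiguous bases
print(meets_bin_criteria("COI-5P", example))
```

A sequence that passes this check may still lack a BIN until the next monthly BIN generation has run.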

6. Why might the BOLD ID Engine not return any matches for my sequence?

If the BOLD ID Engine is not returning any identification matches to your sequence, there may be a few factors worth investigating. First, the genetic marker used must be supported by the database. BOLD currently supports COI for animal identifications, matK and rbcL for plant identifications and ITS for fungal identifications. Second, the sequenced region of the gene should match the marker used. For example, the barcode region for COI is located in the 5’ end. Although other gene regions may return results, most of the database is composed of sequences within the barcode region. Finally, you should ensure the length of your sequence is 180bp or longer. Short sequences and/or those containing a large number of ambiguous bases should be run in the full length database only. If the above factors have been examined and no identifications are returned, please contact support@boldsystems.org and our support staff will be happy to assist you further.

7. What improvements were made to the ID Engine since the transition to BOLD 5?

The BOLD version 5 Identification Engine has been completely overhauled to use an upgraded analytical compute cluster to provide faster identification and support a larger number of parallel requests. In addition to providing more reference databases, the databases themselves have also been extended to cover additional barcode markers and employ smarter sampling strategies to reduce duplicate sequences. Compared to v4 which supported COI-5P, ITS, and rbcL only, the v5 ID Engine databases may include COI-5P, rbcL, matK, ITS, ITS1, ITS2, 18S, 12S and CAD.

Mapping of reference libraries between the v4 and v5 ID Engine. For details about each reference library, please check the description on the submission form at id.boldsystems.org.

v4 Library Name	v5 Library Name	v5 Library Code*
Animal - All Barcode Records on BOLD	ANIMAL LIBRARY (PUBLIC+PRIVATE)	ANIMAL:PUBLIC+PRIVATE
Animal - Species Level Barcode Records	ANIMAL SPECIES-LEVEL LIBRARY (PUBLIC+PRIVATE)	ANIMAL:SPECIES
Animal - Public Record Barcode Database	ANIMAL LIBRARY (PUBLIC)	ANIMAL:PUBLIC
Animal - Full Length Record Barcode Database	N/A	N/A
Plant Sequences	PLANT LIBRARY (PUBLIC)	PLANT:PUBLIC
Fungal ITS Sequences	FUNGI LIBRARY (PUBLIC)	FUNGI:PUBLIC
N/A	VALIDATED CANADIAN ARTHROPOD LIBRARY	VALIDATED:CANREF22
N/A	ANIMAL SECONDARY MARKERS (PUBLIC)	ANIMAL:PUBLIC-SECONDARY
N/A	VALIDATED ANIMAL RED LIST LIBRARY	VALIDATED:IUCN

* Referenced on the Sequence Record Page for triggering the v5 ID Engine.
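For scripts or notes that still refer to v4 library names, the table above can be transcribed into a simple lookup. The Python sketch below does only that; the dictionary contents come from the table, while v5_code_for is a hypothetical helper name.

```python
# v4 library name -> v5 library code, transcribed from the table above.
# None marks a v4 library that has no v5 counterpart.
V4_TO_V5_CODE = {
    "Animal - All Barcode Records on BOLD": "ANIMAL:PUBLIC+PRIVATE",
    "Animal - Species Level Barcode Records": "ANIMAL:SPECIES",
    "Animal - Public Record Barcode Database": "ANIMAL:PUBLIC",
    "Animal - Full Length Record Barcode Database": None,
    "Plant Sequences": "PLANT:PUBLIC",
    "Fungal ITS Sequences": "FUNGI:PUBLIC",
}

# v5 libraries with no v4 equivalent, also from the table.
V5_ONLY_CODES = [
    "VALIDATED:CANREF22",
    "ANIMAL:PUBLIC-SECONDARY",
    "VALIDATED:IUCN",
]


def v5_code_for(v4_name: str) -> str:
    """Return the v5 library code for a v4 library name."""
    if v4_name not in V4_TO_V5_CODE:
        raise LookupError(f"Unknown v4 library: {v4_name!r}")
    code = V4_TO_V5_CODE[v4_name]
    if code is None:
        raise LookupError(f"No v5 equivalent for: {v4_name!r}")
    return code


print(v5_code_for("Plant Sequences"))
```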

The front-end web interface has also been changed to provide a more responsive and customizable user experience. The Operating Mode feature lets users choose an analysis parameter preset according to the speed they need, the similarity threshold cutoff, and the desired number of hits. The maximum number of sequences per submission has also increased from 50 (BOLD v4) to as many as 1000, depending on the Operating Mode selected.
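If a FASTA file holds more sequences than a single submission allows, it can be split into batches before submission. The Python sketch below shows one way to do this locally. The limit is passed in as a parameter because the actual maximum depends on the Operating Mode selected; both function names are hypothetical.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def split_into_batches(records, max_per_submission=1000):
    """Split records into consecutive batches of at most max_per_submission."""
    if max_per_submission < 1:
        raise ValueError("max_per_submission must be at least 1")
    return [
        records[i:i + max_per_submission]
        for i in range(0, len(records), max_per_submission)
    ]


fasta_text = ">seq1\nACGTAC\nGT\n>seq2\nGGCC\n"
batches = split_into_batches(read_fasta(fasta_text), max_per_submission=1)
print(len(batches))
```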

Another feature introduced with this version of the ID Engine is the ability to monitor progress live by associating each request with a unique submission ID. You can bookmark the progress page URL in your browser, which allows you to check back later for a progress update or retrieve the results when the run is complete. The result page, which includes downloadable data tables, will be accessible by this unique submission ID for up to 3 days after the results are generated.

8. How do I interpret the results of the Taxon ID Tree?

BOLD uses neighbour-joining trees, which group sequences together by the number of amino acid or nucleotide differences. The arrangement of the specimens in the tree is based on sequence similarity: the most similar sequences are placed closest together, and branch length indicates the degree of divergence. The percentage difference between sequences can be measured against the scale in the legend (usually 2%); the longer the branch, the greater the disparity between the sequences. Specimens of the same species are generally expected to have more similar sequences and to cluster closer together than specimens from different species. Unexpected outcomes can reveal interesting findings, which could be associated with biologically relevant patterns, or they can reveal errors such as misidentification or contamination of a sample. For more information on how to build a Taxon ID Tree, and the parameters you can select to tailor your tree, please refer to the BOLD Handbook. (Note: The BOLD Taxon ID Tree does not infer phylogenetic relationships. There may be many ways to interpret a tree, and BOLD encourages you to use your own discretion in drawing conclusions from the results.)
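To make the idea of "number of nucleotide differences" concrete, the Python sketch below computes a simple uncorrected distance (the proportion of differing sites) between two aligned sequences. This is an illustration only. The distance model BOLD applies when building a tree is chosen in the tree-building parameters and is not described here, and skipping sites with gaps or N is an assumption of this sketch.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Assumption: sites where either sequence has a gap ('-') or an N
    are skipped rather than counted as differences.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("Sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("No comparable sites")
    return differences / compared


# One difference across ten compared sites.
distance = p_distance("ACGTACGTAC", "ACGTACGTAA")
print(f"{distance:.1%}")
```

In a tree, a pair of sequences with a larger distance of this kind would be joined by longer branches than a pair with a smaller one.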

9. Does the length of my sequence influence the shape of my Taxon ID Tree?

Short sequences may influence the shape of the Taxon ID Tree based on the alignment algorithm selected while building the tree. The BOLD Aligner is amino acid based, so instead of comparing the nucleotides between the sequences, it compares the amino acid translations. Using this alignment algorithm, short sequences are less likely to align correctly with longer sequences. When building Taxon ID Trees containing short sequences (anything shorter than 200bp), it is recommended to use Muscle or Kalign algorithms. Independent of the alignment algorithm selected, whenever short sequences are included in an analysis, the results should be interpreted with caution.

10. What is a dataset and how is it different from a project?

A dataset is a virtual representation of records stored on BOLD. Records from multiple projects can be added to datasets allowing users to access the data while keeping the records in their original projects. Using datasets, records from multiple projects can be concatenated, analyzed, and even published without ever having to be moved from their original projects. For example, if you are performing a three-year biodiversity study, you may wish to store the records on BOLD in projects based on the year they were collected. If you want to look at all of the Hymenoptera collected over the three years, you can add the appropriate records to a dataset. The records will stay organized in their year-based projects but you can access them all at once and even publish them to GenBank from the dataset. To simplify the publication process, you may also request a DOI (Digital Object Identifier) from BOLD for public datasets. The DOI can be incorporated into the publication so readers will have quick and easy access to the data.

11. When is the best time to submit my sequences to GenBank?

Records submitted through BOLD to GenBank will remain private on GenBank for one year to allow time for submitters to publish their findings before the records are publicly accessible. BOLD recommends all users submit their sequences to GenBank while preparing their manuscript for publication. Once the manuscript is published, users are encouraged to make their data publicly available in BOLD. This can be accomplished by visiting the Modify Project Properties page in the Project Console.

12. Once a record has been submitted to GenBank, can I modify it? Will updating a record on BOLD automatically update the record on GenBank?

BOLD regularly submits record updates to GenBank, and GenBank will incorporate these changes on a periodic basis. If an update is required within a specific time period, please contact support@boldsystems.org and our support staff will help facilitate the synchronization of BOLD records with GenBank records.

13. How long will my data remain private before they are included in the BOLD Public Data Portal? Is there any way I can make my data public sooner?

BOLD Systems serves as a workbench for users to organize, analyze, and publish data, with its design and policies recognizing that specimen identification can be a time-consuming process. Consequently, there is no strict timeline for making records public. However, it is recommended that data generated with public funds be released as early as possible, in alignment with open science principles. Instructions for releasing data can be found in the documentation section.

14. How do I make a dataset or project available in the BOLD Public Data Portal?

Making a project or dataset public is a straightforward process, and project managers can choose to do so at any time. Click Modify Project Properties from within a project or dataset, then select the check box labelled “Make this project publicly visible”. See the BOLD Handbook for more information. After you click Save, your project will be publicly accessible through the workbench, and the records will appear in the BOLD Public Data Portal after its next update.

15. Can I delete images from my records?

You may request to have your images deleted from BOLD at any time by sending an email with your request to support@boldsystems.org.

16. My trace file has a failed status. Can I upload new copies of the traces with the same name?

No, the trace files must be given a new name. Once a trace file has been uploaded to BOLD, whether it succeeded or failed, another trace file with an identical name cannot be uploaded. However, only a small change is required, such as adding a letter or number to the end of the naming scheme of the trace file.
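Renaming can be automated when many traces need to be re-uploaded. The Python sketch below appends a numeric suffix before the file extension until the name no longer clashes with a set of names already used. The file name in the example and the "_2" suffix style are hypothetical; any small change to the name satisfies the rule described above.

```python
from pathlib import PurePath


def renamed_trace(filename: str, already_uploaded: set) -> str:
    """Return a file name that is not in already_uploaded.

    If the name is unused it is returned unchanged; otherwise _2, _3, ...
    is appended before the extension until a free name is found.
    """
    if filename not in already_uploaded:
        return filename
    path = PurePath(filename)
    counter = 2
    while True:
        candidate = f"{path.stem}_{counter}{path.suffix}"
        if candidate not in already_uploaded:
            return candidate
        counter += 1


used = {"PROJ001-12_F.ab1"}
print(renamed_trace("PROJ001-12_F.ab1", used))
```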

For High-Volume Data Users:

Common Support Questions for Users Looking to Access Large Volumes of Data

1. How can I access large amounts of BOLD data at one time?

To access large portions of the public BOLD database, we recommend downloading a BOLD data package and filtering it for the fields required. Non-registered users can request a free BOLD account by contacting support@boldsystems.org before downloading a BOLD data package.

2. What are BOLD Data Packages?

BOLD Data Packages are FAIR-compliant collections of public sample and sequence data that support data interoperability and data reuse.

BOLD offers three types of Data Packages, each serving distinct research needs:

Recent Data packages are provided with a dynamic DOI link that will always point to the latest BOLD public database snapshot. On a weekly basis, snapshots rotate through the latest and second latest DOI links, then become inaccessible. Recent Data packages are useful for programmatic access that always points to the latest BOLD public data. For a stable link to a BOLD public database snapshot, please use a Historical Data Package.
Historical Data packages are provided with a stable DOI link that will always point to the same BOLD public database snapshot. Archived on a quarterly basis, every March, June, September, and December, these packages help ensure the reproducibility of previous studies by providing access to standardized, well-structured historical datasets.
Project Data packages are provided with a stable DOI link and contain millions of specimen records from thousands of species from countries worldwide. These packages are tied to specific projects, such as the Canadian Barcode of Life Network or BARCODE 500K, and mark the Centre for Biodiversity Genomics’ (CBG) compliance with its Data Release Policy. The CBG data packages are currently released twice a year. For a full public BOLD database snapshot, please use a Historical Data Package.

Data packages contain a summary in JSON format, metadata in JSON format, sequences and metadata together in a compressed tab-delimited file, and sequences in FASTA format.
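Filtering a data package for the required fields can be done without loading the whole file into memory. The Python sketch below streams a compressed tab-delimited file and keeps only the requested columns. Several details are assumptions: that the file is gzip-compressed, that it has a header row, and that columns such as processid and marker_code exist under those names. Check the package's metadata JSON for the actual field names and the archive format.

```python
import csv
import gzip


def iter_selected_fields(path, fields):
    """Yield one dict per record, keeping only the requested columns.

    Assumes a gzip-compressed, tab-delimited file with a header row.
    """
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        missing = [f for f in fields if f not in (reader.fieldnames or [])]
        if missing:
            raise KeyError(f"Columns not found in package: {missing}")
        for row in reader:
            yield {f: row[f] for f in fields}


# Example use (hypothetical file and column names):
# for record in iter_selected_fields("package.tsv.gz", ["processid", "marker_code"]):
#     if record["marker_code"] == "COI-5P":
#         print(record["processid"])
```

Because the function is a generator, records are read one at a time, which keeps memory use low even for a full database snapshot.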

For information on citing BOLD data packages, please refer to the Citation page.
