____________________________________________________________________________

MIDORI2 Reference:

Files in this directory are MIDORI2 reference databases built from GenBank files downloaded from ftp://ftp.ncbi.nlm.nih.gov.

An explanation of the file names:

"sp" = databases that include sequences lacking a binomial species-level description, such as "sp.," "aff.," "nr.," "cf.," "complex," and "nomen nudum."
"AA" = amino acid sequence database
"NUC" = nucleic acid sequence database
"srRNA" = mitochondrial 12S ribosomal RNA
"lrRNA" = mitochondrial 16S ribosomal RNA
"UNIQ" = UNIQ files contain all unique haplotypes associated with each species.
"LONGEST" = LONGEST files contain the longest sequence for each species.
"TOTAL" = TOTAL files contain all sequences. TOTAL files are only available in RAW format.

Currently, we provide seven formats: MOTHUR (Schloss et al., Applied and Environmental Microbiology, 2009), QIIME (Caporaso et al., Nature Methods, 2010), RDP Classifier (Wang et al., Applied and Environmental Microbiology, 2007), SINTAX (Edgar, BioRxiv Preprint, 2016), SPINGO (Allard et al., BMC Bioinformatics, 2015), BLAST+ (Camacho et al., BMC Bioinformatics, 2009), and DADA2 (Callahan et al., Nature Methods, 2016). We also provide "RAW" files, which contain the complete taxonomy.

Since version GB 237, our reference has included all Eukaryote sequences, not only Metazoan sequences.

Since version GB 242, we have provided two types of databases: 1) with (sp) and 2) without sequences that lack a binomial species description, such as "sp.," "aff.," "nr.," "cf.," "complex," and "nomen nudum."

Since version GB 243, we have provided amino acid sequence databases in addition to nucleic acid sequence databases (folders with AA: amino acid).

The latest MIDORI reference databases are available from the folder "Databases."

Each taxonomic name is followed by a number. Those numbers are GenBank Taxonomy IDs. We inserted them to differentiate taxa that share the same name (ex. Phylum: Ctenophora = Ctenophora_10197, Diatom: Ctenophora = Ctenophora_1003038, Flies: Ctenophora = Ctenophora_1803).

In all formats except RAW files, we have filled in missing taxonomy by creating it from a lower taxonomic rank. In the following example, the class-level description was missing, so it was created from the order level:

>JF502242.1.7041.7724 root_1;Eukaryota_2759;Chordata_7711;class_Crocodylia_1294634;Crocodylia_1294634;Crocodylidae_8493;Crocodylus_8500;Crocodylus intermedius_184240

List files give the accession numbers that were collapsed into each sequence in the UNIQ and LONGEST files.

For more information, please see our manuscript: https://onlinelibrary.wiley.com/doi/full/10.1002/edn3.303, or contact us.

MIDORI2 Team
Last modified on August 27, 2022
____________________________________________________________________________
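
Note on reading the taxonomy strings: the following is a minimal Python sketch, not part of the MIDORI2 tools, of how a header laid out like the JF502242.1 example in this README could be split into taxon names and GenBank Taxonomy IDs. The function names (split_taxon, parse_header, filled_ranks) are hypothetical. The sketch assumes that the sequence ID and the lineage are separated by the first space, that ranks are separated by ";", and that the Taxonomy ID follows the last underscore of each entry. It also assumes, based only on that one example, that a filled-in rank carries the same Taxonomy ID as the lower rank it was created from. Other formats may lay out their headers differently.

```python
def split_taxon(field):
    """Split 'Name_TaxID' into (name, taxid); the ID follows the last underscore."""
    name, _, taxid = field.rpartition("_")
    return name, int(taxid)


def parse_header(header):
    """Return (sequence_id, [(name, taxid), ...]) for one header line."""
    # Assumption: the first space separates the sequence ID from the lineage.
    seq_id, lineage = header.lstrip(">").strip().split(" ", 1)
    return seq_id, [split_taxon(field) for field in lineage.split(";")]


def filled_ranks(lineage):
    """List entries that share a Taxonomy ID with the next lower rank.

    Assumption: this is how filled-in ranks look, judging from the
    single example in this README.
    """
    return [
        upper
        for upper, lower in zip(lineage, lineage[1:])
        if upper[1] == lower[1]
    ]


example = (
    ">JF502242.1.7041.7724 root_1;Eukaryota_2759;Chordata_7711;"
    "class_Crocodylia_1294634;Crocodylia_1294634;Crocodylidae_8493;"
    "Crocodylus_8500;Crocodylus intermedius_184240"
)
seq_id, lineage = parse_header(example)
species_name, species_taxid = lineage[-1]
created = filled_ranks(lineage)
```

Splitting on the last underscore rather than the first keeps entries such as class_Crocodylia_1294634 intact, and the numeric ID can then be used to tell apart taxa that share a name, such as the three Ctenophora entries listed in this README.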