Database formatting

16S taxonomic database

This database should be provided as a FASTA file.
Headers of 16S sequences should follow the format shown in the example below.

Example:

>AF093247.1.2007 Eukaryota;Amoebozoa;Mycetozoa;Myxogastria;;Hyperamoeba_sp._ATCC50750
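
As an illustration, a header in this format could be split into its parts roughly as follows. This is only a sketch: the function name parse_16s_header is hypothetical, and it assumes that the sequence identifier is separated from the taxonomy by whitespace and that taxonomy levels are separated by semicolons, as in the example above.

```python
def parse_16s_header(header):
    """Split a 16S FASTA header into (sequence_id, taxonomy_levels).

    Hypothetical helper; the layout is inferred from the example header.
    """
    # Drop the leading '>' and split on the first run of whitespace only.
    parts = header.lstrip(">").strip().split(maxsplit=1)
    if not parts:
        raise ValueError("empty FASTA header")
    seq_id = parts[0]
    lineage = parts[1] if len(parts) > 1 else ""
    # Empty levels (';;') are kept as empty strings so positions stay aligned.
    levels = lineage.split(";") if lineage else []
    return seq_id, levels


example = (">AF093247.1.2007 Eukaryota;Amoebozoa;Mycetozoa;"
           "Myxogastria;;Hyperamoeba_sp._ATCC50750")
seq_id, levels = parse_16s_header(example)
```

Keeping the empty level means that every entry of the list stays at the same position in the lineage from one header to the next.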

ITS taxonomic database

This database should be provided as a FASTA file.
Headers of ITS sequences use the same format as the UNITE database:

Example:

>DQ233785|uncultured ectomycorrhizal fungus|Fungi|Thelephora terrestris|Fungi; Basidiomycota; Agaricomycotina; Agaricomycetes; Incertae sedis; Thelephorales; Thelephoraceae; Thelephora; Thelephora terrestris
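
A header like this could be parsed with a few lines of code, as sketched below. The field names (accession, description, kingdom, species, lineage) are assumptions inferred from the example above, not names defined by this document, and parse_its_header is a hypothetical helper.

```python
def parse_its_header(header):
    """Split a UNITE-style ITS header into named fields.

    Hypothetical helper; field names are guessed from the example header.
    """
    fields = header.lstrip(">").strip().split("|")
    if len(fields) != 5:
        raise ValueError("expected 5 '|'-separated fields, got %d" % len(fields))
    accession, description, kingdom, species, lineage = fields
    # Lineage levels are separated by ';' with optional surrounding spaces.
    levels = [level.strip() for level in lineage.split(";")]
    return {
        "accession": accession,
        "description": description,
        "kingdom": kingdom,
        "species": species,
        "lineage": levels,
    }


its_example = (">DQ233785|uncultured ectomycorrhizal fungus|Fungi|"
               "Thelephora terrestris|Fungi; Basidiomycota; Agaricomycotina; "
               "Agaricomycetes; Incertae sedis; Thelephorales; Thelephoraceae; "
               "Thelephora; Thelephora terrestris")
its_record = parse_its_header(its_example)
```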

Database “GI → TaxID”

This database should be prepared in TSV (tab-separated values) format.
The first column is the GI, the second is the TaxID.

Example:

13   9913
15   9915
16   9771
17   9771
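
For illustration, such a file could be loaded into a dictionary as in the sketch below. The function name load_gi_to_taxid is hypothetical, and the code assumes that both columns hold integers and that the columns are separated by a single tab character.

```python
import csv
import io


def load_gi_to_taxid(handle):
    """Read a two-column TSV (GI, TaxID) into a dict mapping GI to TaxID.

    Hypothetical helper; assumes integer values in both columns.
    """
    mapping = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        mapping[int(row[0])] = int(row[1])
    return mapping


# In-memory stand-in for a real file opened with open(path, newline="").
sample = io.StringIO("13\t9913\n15\t9915\n16\t9771\n17\t9771\n")
gi_to_taxid = load_gi_to_taxid(sample)
```

Several GIs may point to the same TaxID, as GIs 16 and 17 do in the example, so the mapping only works in the GI to TaxID direction.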

Database “TaxID → scientific_name”

This database is extracted from a file formatted as in the example below:

2    |       prokaryotes     |       prokaryotes <Bacteria>  |       in-part |
6    |       Azorhizobium    |               |       scientific name |
6    |       Azorhizobium Dreyfus et al. 1988        |               |       synonym |
6    |       Azotirhizobium  |               |       equivalent name |
7    |       ATCC 43989      |               |       type material   |
7    |       Azorhizobium caulinodans        |               |       scientific name |

Only the TaxID and name from rows marked “scientific name” will be extracted:

TaxID scientific_name
6 Azorhizobium
7 Azorhizobium caulinodans
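
The extraction step could look roughly like the sketch below, which keeps only the rows whose last field is "scientific name". It assumes that fields are separated by "|" with surrounding whitespace, as in the example above; the function name extract_scientific_names is hypothetical, and this is not necessarily how the tool itself does it.

```python
def extract_scientific_names(lines):
    """Map TaxID to scientific name from '|'-separated name records.

    Hypothetical helper; the field layout is inferred from the example file.
    """
    names = {}
    for line in lines:
        fields = [field.strip() for field in line.split("|")]
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        taxid, name, _unique_name, name_class = fields[:4]
        if name_class == "scientific name":
            names[int(taxid)] = name
    return names


sample_lines = [
    "2\t|\tprokaryotes\t|\tprokaryotes <Bacteria>\t|\tin-part\t|",
    "6\t|\tAzorhizobium\t|\t\t|\tscientific name\t|",
    "6\t|\tAzotirhizobium\t|\t\t|\tequivalent name\t|",
    "7\t|\tATCC 43989\t|\t\t|\ttype material\t|",
    "7\t|\tAzorhizobium caulinodans\t|\t\t|\tscientific name\t|",
]
taxid_to_name = extract_scientific_names(sample_lines)
```

TaxIDs without a "scientific name" row, such as TaxID 2 in the example, are simply absent from the result.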