
download_ncbi_genome

Download genome-associated files (.fna, .gbk, .gff) from NCBI, rename the locus tags, and generate .ffn and .faa files.

Overview

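The download_ncbi_genome command automates downloading genome data from NCBI and preparing it for import into the Arx folder structure. It downloads the assembly FASTA (.fna), GenBank (.gbk), and GFF (.gff) files, replaces the locus tags with ones that use the prefix you supply, and generates the protein (.faa) and nucleotide (.ffn) FASTA files from the GenBank file.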

Usage

download_ncbi_genome \
  --assembly_name GCF_005864195.1 \
  --out_dir /path/to/outdir \
  --new_locus_tag_prefix FAM3257_

Parameters

  • --assembly_name: The NCBI assembly accession (e.g., GCF_005864195.1)
  • --out_dir: Directory where downloaded files will be saved
  • --new_locus_tag_prefix: Prefix for the new locus tags (e.g., FAM3257_)
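
To download several assemblies in one go, the command can be wrapped in a loop. The following is only a sketch: download_many and the tab-separated list file are hypothetical and not part of the tool, and the per-genome output directory (the prefix without its trailing underscore) is an assumed convention. It uses only the three parameters documented above. Whether the tool creates a missing output directory is not stated here, so you may need to create it first.

```shell
# Hypothetical wrapper, not part of download_ncbi_genome.
# Usage: download_many <list.tsv> <base_dir>
# Each line of list.tsv: <assembly accession><TAB><new locus tag prefix>
download_many() {
  list=$1
  base_dir=$2
  tab=$(printf '\t')
  # "|| [ -n ... ]" also processes a last line that lacks a trailing newline
  while IFS=$tab read -r accession prefix || [ -n "$accession" ]; do
    [ -n "$accession" ] || continue   # skip blank lines
    # Assumed layout: one directory per genome, named after the prefix
    # without its trailing underscore (FAM3257_ -> FAM3257).
    download_ncbi_genome \
      --assembly_name "$accession" \
      --out_dir "$base_dir/${prefix%_}" \
      --new_locus_tag_prefix "$prefix" </dev/null
  done < "$list"
}
```

Redirecting the command's standard input from /dev/null keeps it from consuming the remaining lines of the list.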

Output Files

For the example above, the command writes the following files to the output directory:

outdir
├── FAM3257.faa
├── FAM3257.ffn
├── FAM3257.fna
├── FAM3257.gbk
└── FAM3257.gff
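
Before importing, it can be useful to confirm that all five files exist and are not empty. This is a minimal sketch, not something the tool provides: check_genome_files is a hypothetical helper, and it assumes the file names follow the pattern shown in the tree above.

```shell
# Hypothetical helper, not part of download_ncbi_genome.
# Usage: check_genome_files <out_dir> <file_prefix>
# Example: check_genome_files /path/to/outdir FAM3257
check_genome_files() {
  dir=$1
  prefix=$2
  status=0
  for ext in faa ffn fna gbk gff; do
    file="$dir/$prefix.$ext"
    # -s is true only if the file exists and has a size greater than zero
    if [ ! -s "$file" ]; then
      echo "missing or empty: $file" >&2
      status=1
    fi
  done
  return "$status"
}
```

The function returns 0 when every file is present and 1 otherwise, so it can guard the import step with `&&`.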

File Descriptions

  • .fna: Assembly FASTA file (downloaded from NCBI)
  • .gbk: GenBank file (downloaded from NCBI)
  • .gff: General Feature Format file (downloaded from NCBI)
  • .faa: Protein sequences FASTA (generated from .gbk)
  • .ffn: Nucleotide sequences of the annotated genes, FASTA (generated from .gbk)
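
One way to spot-check the renaming is to count the records in a generated FASTA file whose header does not begin with the new prefix. This sketch rests on an assumption that the page does not confirm: that each header in the .faa and .ffn files starts with the locus tag (for example ">FAM3257_..."). count_unrenamed is a hypothetical helper name.

```shell
# Hypothetical helper, not part of download_ncbi_genome.
# Assumes each FASTA header starts with the locus tag, e.g. ">FAM3257_...".
# Usage: count_unrenamed <fasta_file> <locus_tag_prefix>
# Prints the number of records whose header does NOT start with the prefix.
count_unrenamed() {
  fasta=$1
  prefix=$2
  # index() does a literal match, so the prefix is not treated as a regex;
  # "n + 0" makes awk print 0 when no such record was found.
  awk -v p=">$prefix" '/^>/ && index($0, p) != 1 { n++ } END { print n + 0 }' "$fasta"
}
```

Under that assumption, a result of 0 for `count_unrenamed /path/to/outdir/FAM3257.faa FAM3257_` suggests that every record carries the new prefix.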

Next Steps

After downloading, you can import the genome into the Arx folder structure:

import_genome --import_dir=/path/to/outdir --organism FAM3257 --genome FAM3257

Help

download_ncbi_genome --help