init_folder_structure

Creates a basic Arx folder structure.

Overview

The init_folder_structure command creates the foundational directory structure required for Arx genome data organization. This is typically the first step in setting up a new Arx project.

Usage

export FOLDER_STRUCTURE=/path/to/folder_structure
init_folder_structure

Or specify the directory directly:

init_folder_structure --folder_structure_dir=/path/to/folder_structure
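Because the target directory can come from either the FOLDER_STRUCTURE variable or the --folder_structure_dir option, a wrapper script may want to fail early when neither is supplied. The sketch below is one way to do that. resolve_folder_structure is a hypothetical helper, not part of Arx, and its rule that an explicit argument wins over the environment variable is an assumption about the wrapper, not a statement about how init_folder_structure itself resolves the two.

```shell
# Hypothetical helper (not part of Arx): pick the target directory from an
# explicit argument, falling back to the FOLDER_STRUCTURE environment variable.
# Assumption: in this wrapper, the explicit argument takes precedence.
resolve_folder_structure() {
    dir=${1:-${FOLDER_STRUCTURE:-}}
    if [ -z "$dir" ]; then
        echo "error: set FOLDER_STRUCTURE or pass a directory" >&2
        return 1
    fi
    printf '%s\n' "$dir"
}

# Possible use in a script:
#   target=$(resolve_folder_structure "$@") || exit 1
#   init_folder_structure --folder_structure_dir="$target"
```

The helper only prints the chosen path, so it can be reused with other Arx commands that need the same directory.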

Created Structure

After running the command, you'll have the following folder structure:

folder_structure
├── organisms
├── annotations.json
├── annotation-descriptions
│   ├── SL.tsv
│   ├── KO.tsv
│   ├── KR.tsv
│   ├── EC.tsv
│   └── GO.tsv
├── orthologs
└── pathway-maps
    ├── type_dictionary.json
    └── svg
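To confirm that the command produced the layout shown above, a small check along these lines can help. It is a minimal sketch: check_arx_layout is a hypothetical helper, not an Arx command, and it tests only for the entries listed in the tree, not for their contents.

```shell
# Hypothetical helper (not part of Arx): report entries from the documented
# tree that are missing under the given root. Returns 0 if all are present.
check_arx_layout() {
    root=$1
    missing=0
    for dir in organisms annotation-descriptions orthologs \
               pathway-maps pathway-maps/svg; do
        if [ ! -d "$root/$dir" ]; then
            echo "missing directory: $dir"
            missing=1
        fi
    done
    for file in annotations.json \
                annotation-descriptions/SL.tsv \
                annotation-descriptions/KO.tsv \
                annotation-descriptions/KR.tsv \
                annotation-descriptions/EC.tsv \
                annotation-descriptions/GO.tsv \
                pathway-maps/type_dictionary.json; do
        if [ ! -f "$root/$file" ]; then
            echo "missing file: $file"
            missing=1
        fi
    done
    return "$missing"
}

# Possible use:
#   check_arx_layout "$FOLDER_STRUCTURE" && echo "layout looks complete"
```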

Directory Descriptions

  • organisms/: Contains individual organism directories with their genome data
  • annotations.json: Central annotation configuration file
  • annotation-descriptions/: Contains description files for different annotation types
      • SL.tsv: Subcellular localization annotations
      • KO.tsv: KEGG Orthology annotations
      • KR.tsv: KEGG Reaction annotations
      • EC.tsv: Enzyme Commission annotations
      • GO.tsv: Gene Ontology annotations
  • orthologs/: Stores orthology analysis results
  • pathway-maps/: Contains pathway visualization files
      • type_dictionary.json: Pathway type definitions
      • svg/: SVG pathway map files

Next Steps

Once the folder structure has been initialized:

  1. Use import_genome to add genomes to the folder structure
  2. Use download_ncbi_genome and import_genome to download and add genomes from NCBI
  3. When all genomes have been added, use init_orthofinder and import_orthofinder to calculate orthologs (optional)

Help

init_folder_structure --help