Genome analysis of Endozoicomonas species
Genome assembly of OTU5
Read preprocessing was conducted with fastp
fastp \
-i ${READ1}.fastq \
-I ${READ2}.fastq \
-w 20 \
-h ${report}.html \
-j ${report}.json \
-o ${READ1}.QC.fastq \
-O ${READ2}.QC.fastq
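As a quick sanity check after filtering, the number of reads before and after QC can be compared. The helper functions below are a minimal sketch and not part of the original pipeline; the names `count_reads` and `qc_retention` are hypothetical, and the code assumes uncompressed FASTQ files with exactly four lines per record.

```shell
# Count reads in an uncompressed FASTQ file (assumes 4 lines per record).
count_reads() {
    awk 'END { print NR / 4 }' "$1"
}

# Print raw read count, retained read count, and percent retained (tab-separated).
# Arguments: raw FASTQ, filtered FASTQ.
qc_retention() {
    raw=$(count_reads "$1")
    kept=$(count_reads "$2")
    awk -v r="$raw" -v k="$kept" \
        'BEGIN { printf "%d\t%d\t%.2f\n", r, k, (r > 0 ? 100 * k / r : 0) }'
}

# Example call, using the file names from the fastp command above:
# qc_retention "${READ1}.fastq" "${READ1}.QC.fastq"
```

The HTML and JSON reports written by fastp contain the same counts; this check is only useful when a plain-text summary is wanted in a script.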
De novo assembly
De novo assembly was performed with SPAdes
spades.py \
-1 ${READ1}.QC.fastq \
-2 ${READ2}.QC.fastq \
-k auto \
--careful \
-t 20 \
-o assembly_results
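SPAdes writes the assembled contigs to `contigs.fasta` in the output directory. A rough way to summarise such a file (contig count, total length, N50) is sketched below; the function name `assembly_stats` is hypothetical and this step is an illustration, not something the original workflow is known to have run.

```shell
# Summarise a FASTA assembly: number of contigs, total length, and N50.
# N50 is the contig length at which the cumulative length of contigs,
# sorted from longest to shortest, first reaches half of the assembly.
assembly_stats() {
    awk '
        /^>/ { if (len > 0) print len; len = 0; next }    # header: emit previous contig length
             { gsub(/[ \t\r]/, ""); len += length($0) }   # sequence line: accumulate length
        END  { if (len > 0) print len }
    ' "$1" | sort -rn | awk '
        { lens[NR] = $1; total += $1 }
        END {
            half = total / 2
            for (i = 1; i <= NR; i++) {
                cum += lens[i]
                if (cum >= half) { n50 = lens[i]; break }
            }
            printf "contigs=%d total_bp=%d N50=%d\n", NR, total, n50
        }
    '
}

# Example call, using the output directory from the SPAdes command above:
# assembly_stats assembly_results/contigs.fasta
```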
Gene annotation
Gene prediction
CDS regions were predicted from the genome sequences by Prokka.
prokka \
--outdir ../Prokka/${id}_prokka \
--kingdom Bacteria \
--gcode 11 \
--gram neg \
--cpus 20 \
--centre X \
--compliant \
--prefix ${id}_prokka \
--addgenes \
${id}
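Prokka writes a GFF3 file alongside its other outputs, and counting features by type gives a quick overview of what was predicted. The following is a minimal sketch under the assumption that the file follows the usual GFF3 layout (feature type in column 3, optional `##FASTA` section at the end); the function name `count_gff_features` is hypothetical.

```shell
# Count features by type (column 3) in a GFF3 file, e.g. CDS, gene, tRNA.
# Header/directive lines and the embedded ##FASTA section are skipped.
count_gff_features() {
    awk -F '\t' '
        /^##FASTA/ { exit }       # stop at the embedded sequence section
        /^#/       { next }       # skip header and directive lines
        NF >= 9    { n[$3]++ }    # tally well-formed feature lines by type
        END        { for (t in n) print t "\t" n[t] }
    ' "$1" | sort
}

# Example call, assuming the output naming implied by --outdir and --prefix above:
# count_gff_features "../Prokka/${id}_prokka/${id}_prokka.gff"
```

With `--addgenes`, each CDS is expected to be accompanied by a `gene` feature, so the two counts should be similar.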
Gene annotation against the KEGG database was performed with the KEGG Automatic Annotation Server (KAAS).
NCBI nr
A similarity search against NCBI nr was performed to identify the taxonomy of the most similar sequences
Preparation of DB
git clone
chmod +x ./
Similarity search with Diamond
diamond blastx \
-q ${input} \
-d ${db} \
-o ${input}.nr_diamond.txt \
-f 6 \
-p 8
Parse results
python \
-i ${input}.nr_diamond.txt \
-o ${input}.nr_diamond.parse.txt \
-threshold 1e-5
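The parsing script itself is not reproduced here, so its exact behaviour is unknown. As an illustration only, one common way to apply an e-value threshold to DIAMOND tabular output (`-f 6`, default 12 columns) and keep the best hit per query is sketched below. The function name `best_hits` and the best-hit-by-bit-score rule are assumptions, not the author's method.

```shell
# For each query, keep the hit with the highest bit score among hits whose
# e-value is at or below a threshold (default 1e-5).
# Input: tabular format 6 with the default columns
# (column 1 = query ID, column 11 = e-value, column 12 = bit score).
best_hits() {
    infile=$1
    threshold=${2:-1e-5}
    awk -F '\t' -v thr="$threshold" '
        ($11 + 0) <= (thr + 0) && (!($1 in best) || ($12 + 0) > best[$1]) {
            best[$1] = $12 + 0    # best bit score seen so far for this query
            line[$1] = $0         # remember the full hit line
        }
        END { for (q in line) print line[q] }
    ' "$infile" | sort -t "$(printf '\t')" -k1,1
}

# Example call, using the file names from the commands above:
# best_hits "${input}.nr_diamond.txt" 1e-5 > "${input}.nr_diamond.parse.txt"
```

Assigning taxonomy to the retained subject sequences would require an additional accession-to-taxon mapping, which is outside this sketch.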
Comparative genome analysis
Orthogroup identification
Orthologous gene groups were determined by OrthoFinder2
# ${faa_dir}: directory containing the protein (amino acid) FASTA files
orthofinder \
-f ${faa_dir} \
-t 16 \
-a 3
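OrthoFinder reports orthogroups in a tab-separated `Orthogroups.tsv` table, with one column per input proteome and comma-separated gene IDs in each cell. A possible way to count core and single-copy core orthogroups from that table is sketched below. The function name `summarise_orthogroups` is hypothetical, and the table layout is assumed from recent OrthoFinder 2 releases, so it should be checked against the actual results directory.

```shell
# Summarise an OrthoFinder Orthogroups.tsv table.
# Assumed layout: header row, column 1 = orthogroup ID, one column per species,
# genes within a cell separated by ", ", empty cell = no gene in that species.
summarise_orthogroups() {
    awk -F '\t' '
        NR == 1 { nsp = NF - 1; next }    # header: one column per species after the ID
        {
            present = 0; single = 0
            for (i = 2; i <= nsp + 1; i++) {
                cell = $i
                gsub(/\r/, "", cell)
                if (cell != "") {
                    present++
                    if (split(cell, genes, ", ") == 1) single++
                }
            }
            total++
            if (present == nsp) core++    # at least one gene in every species
            if (single == nsp) sc++       # exactly one gene in every species
        }
        END { printf "orthogroups=%d core=%d single_copy_core=%d\n", total, core, sc }
    ' "$1"
}

# Example call (the path is an assumption and depends on the OrthoFinder version and run date):
# summarise_orthogroups "${faa_dir}/OrthoFinder/Results_*/Orthogroups/Orthogroups.tsv"
```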
Genome alignment
Genome alignment was performed by progressiveMauve
progressiveMauve \
--output=progressiveMauve.all.xmfa \
${indir}/Endozoicomonas_acroporae_Acr-14.fasta_prokka.fna \
${indir}/Endozoicomonas_arenosclerae_ab112.fasta_prokka.fna \
${indir}/Endozoicomonas_ascidiicola_AVMART05.fasta_prokka.fna \
${indir}/Endozoicomonas_ascidiicola_KASP37.fasta_prokka.fna \
${indir}/Endozoicomonas_atrinae_WP70.fasta_prokka.fna \
${indir}/Endozoicomonas_culture.decontamination.fasta_prokka.fna \
${indir}/Endozoicomonas_elysicola_DSM_22380.fasta_prokka.fna \
${indir}/Endozoicomonas_montiporae_CL-33.fasta_prokka.fna \
${indir}/Endozoicomonas_montiporae_LMG_24815.fasta_prokka.fna \
${indir}/Endozoicomonas_numazuensis_DSM_25634.fasta_prokka.fna \