Command-line Usage

airrship [-h] [-v] -o OUT_NAME [--outdir OUTDIR] [--datadir DATADIR] 
         [-n NUMBER_SEQS] [--het PROP PROP PROP] [--shm] [--shm_multiplier SHM_MULTIPLIER] 
         [--shm_flat] [--mut_rate MUT_RATE | --mut_num MUT_NUM] [--shm_random] 
         [--all_alleles] [--locus LOCUS FILE] 
         [--flat_vdj {gene,family,False}] [--no_trim] 
         [--no_trim_v3] [--no_trim_d3] [--no_trim_d5] [--no_trim_j5] 
         [--no_np] [--no_np1] [--no_np2] [--non_productive]
         [--prop_non_productive PROP] [--seed SEED] [--species SPECIES]
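
As an illustration of the synopsis above, the following minimal Python sketch assembles an airrship invocation as an argument list and prints it. The helper name build_airrship_command and the example values are assumptions made for this sketch; only flags listed in the synopsis are used.

```python
import shlex


def build_airrship_command(out_name, number_seqs=1000, shm=False,
                           seed=None, outdir=None):
    """Assemble an argv list for the airrship CLI.

    Hypothetical helper for illustration; it is not part of airrship.
    """
    # -o is the only required option; -n defaults to 1000 in the CLI itself.
    cmd = ["airrship", "-o", out_name, "-n", str(number_seqs)]
    if outdir is not None:
        cmd += ["--outdir", outdir]
    if shm:
        cmd.append("--shm")
    if seed is not None:
        cmd += ["--seed", str(seed)]
    return cmd


cmd = build_airrship_command("demo_repertoire", number_seqs=5000,
                             shm=True, seed=42)
print(shlex.join(cmd))
```

To actually run the command, pass the list to subprocess.run(cmd, check=True); this requires airrship to be installed and on the PATH.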

Parameters

Option Details
-o, --outname <out_prefix> Name for repertoire files. Only required parameter.
--outdir <out_dir> Output directory. The current working directory is used if not specified.
--datadir <data_dir> Alternative input data directory. Defaults to airrship/data.
Data must be formatted as in the airrship data directory.
-n, --number_seqs <n_seqs> Number of sequences to simulate. Defaults to 1000.
--het < prop prop prop > Proportion of genes to be heterozygous, specify as V D J.
Values must be between 0 and 1. Not compatible with --all_alleles.
Not all genes have more than one allele. The proportion achieved may
therefore be lower than requested.
Defaults to 1 1 1 (the maximum possible proportion of
heterozygous genes using the included IMGT alleles).
--shm Hypermutate sequences according to experimental parameters.
Each base will be mutated according to its 5mer context
and the mutation frequency for each sequence will match observed
distributions. If not specified, sequences will not be mutated. Mutation
rates can be controlled by replacing the mut_freq_per_seq_per_family.csv
reference file or specifying a --shm_multiplier.
--shm_multiplier Multiplication factor to use on per sequence mutation rate distribution.
Defaults to 1, i.e. replicates frequencies from the
mut_freq_per_seq_per_family.csv reference file.
--shm_flat Mutate each sequence to the same degree (i.e. return a flat per
sequence mutation distribution). Specify degree of mutation
using --mut_num or --mut_rate. Will default to a mutation rate of 0.05.
--shm_random Do not mutate individual bases according to kmer context.
Each base will have an equal chance of being mutated.
--mut_rate <mut_rate> Mutation frequency for flat SHM only. Value between 0 and 0.6.
Not compatible with --mut_num. Defaults to 0.05.
--mut_num <number_muts> Number of mutations for flat SHM only.
Not compatible with --mut_rate.
--all_alleles Use all available alleles from all available genes,
i.e., do not generate a synthetic 'haplotype'. Not compatible with --het.
--locus <locus_file> Do not generate a new locus, instead specify path
to an existing csv file to use as locus for repertoire generation.
--flat_vdj {gene, family} Do not use experimental data to bias VDJ usage, instead
use all genes or families evenly.
--no_trim Don't trim any end of any VDJ genes during recombination.
--no_trim_v3 Don't trim 3' end of V genes during recombination.
--no_trim_d5 Don't trim 5' end of D genes during recombination.
--no_trim_d3 Don't trim 3' end of D genes during recombination.
--no_trim_j5 Don't trim 5' end of J genes during recombination.
--no_np Don't insert nucleotides at either gene junction,
i.e., do not create NP regions.
--no_np1 Don't insert nucleotides at the VD junction,
i.e., do not create NP1 regions.
--no_np2 Don't insert nucleotides at the DJ junction,
i.e., do not create NP2 regions.
--non_productive Include non-productive sequences in the output.
This includes sequences with out of frame V and J segments,
stop codons and/or missing junction anchor residues
(C-104 and W/F-118). The majority of sequences produced will be
non-productive (~75% using defaults without SHM, ~85% with SHM).
Specify --prop_non_productive to control this proportion.
--prop_non_productive <prop> Proportion of sequences to be non-productive.
Value between 0 and 1. Use with --non_productive.
--seed <seed> Set random seed.
--species <species> Specify if simulating non-human sequences.
Will be used to find the imgt_{species}_IGH[V/D/J].fasta
files in the specified --datadir. Default is human.
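
Several options in the table are documented as incompatible with each other or restricted to a range. The sketch below collects those documented constraints into a pre-flight check. It is not airrship's own validation; the function name check_options and the dictionary layout (long option names without leading dashes) are assumptions made for illustration.

```python
def check_options(opts):
    """Return a list of problems found in a dict of option values.

    Hypothetical helper: `opts` maps long option names (without the
    leading dashes) to values, e.g. {"het": [0.5, 0.5, 0.5]}.
    """
    problems = []

    # --het and --all_alleles are documented as incompatible.
    if "het" in opts and opts.get("all_alleles"):
        problems.append("--het is not compatible with --all_alleles")

    # --mut_rate and --mut_num are documented as incompatible.
    if "mut_rate" in opts and "mut_num" in opts:
        problems.append("--mut_rate is not compatible with --mut_num")

    # --het takes three proportions (V D J), each between 0 and 1.
    het = opts.get("het")
    if het is not None:
        if len(het) != 3 or any(not 0 <= p <= 1 for p in het):
            problems.append("--het needs three values (V D J) between 0 and 1")

    # --prop_non_productive is a proportion, used with --non_productive.
    prop = opts.get("prop_non_productive")
    if prop is not None:
        if not 0 <= prop <= 1:
            problems.append("--prop_non_productive must be between 0 and 1")
        if not opts.get("non_productive"):
            problems.append(
                "--prop_non_productive should be used with --non_productive")

    return problems


for problem in check_options({"het": [0.5, 0.5, 0.5], "all_alleles": True}):
    print(problem)
```

An empty list means none of the constraints covered by this sketch were violated; it does not guarantee that airrship itself will accept the arguments.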