Utilities

bsbolt.Utils.get_external_paths()

Get paths of external dependencies. Prints a warning if setup.py has not been run and the dependencies are not compiled.

Returns:

  • bwa (str): path to bwa executable
  • wgsim (str): path to wgsim executable

bsbolt.Utils.index_bam(bam_input=None)

Index bam file

Params:

  • bam_input (str): input bam file

bsbolt.Utils.reverse_complement(sequence)

Params:

  • sequence (str): DNA sequence; non-ATGC nucleotides are returned unaltered

Returns:

  • reversed_string.translate(_rc_trans) (str): reverse complement of input sequence
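
The return expression above suggests the sequence is reversed and then passed through a translation table. A minimal sketch of that technique follows; the table contents and the name reverse_complement_sketch are assumptions for illustration, not the library's actual code. Characters missing from the table (for example N) pass through str.translate unchanged.

```python
# Assumed translation table: only upper-case A, T, G, C are complemented.
# Any other character is left as-is by str.translate.
_RC_TABLE = str.maketrans("ATGC", "TACG")


def reverse_complement_sketch(sequence: str) -> str:
    """Reverse the sequence, then complement each A/T/G/C base."""
    reversed_string = sequence[::-1]
    return reversed_string.translate(_RC_TABLE)


if __name__ == "__main__":
    print(reverse_complement_sketch("ATGCN"))
```

Building the table once at module level avoids rebuilding it on every call, which matters when the function is applied to many reads.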

bsbolt.Utils.retrieve_iupac(nucleotide)

Params:

  • nucleotide (str): single character

Returns:

  • iupac_tuple (tuple): tuple of strings with possible bases
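
For illustration, a lookup of this kind can be sketched with a dictionary of the standard IUPAC nucleotide codes. The name retrieve_iupac_sketch, the ordering of bases within each tuple, and the upper-casing of the input are assumptions; the library's own table may differ in those details.

```python
# Standard IUPAC nucleotide ambiguity codes mapped to the bases they represent.
IUPAC_CODES = {
    "A": ("A",), "C": ("C",), "G": ("G",), "T": ("T",),
    "R": ("A", "G"), "Y": ("C", "T"), "S": ("G", "C"), "W": ("A", "T"),
    "K": ("G", "T"), "M": ("A", "C"),
    "B": ("C", "G", "T"), "D": ("A", "G", "T"),
    "H": ("A", "C", "T"), "V": ("A", "C", "G"),
    "N": ("A", "C", "G", "T"),
}


def retrieve_iupac_sketch(nucleotide: str) -> tuple:
    """Return the possible bases for a single IUPAC character.

    Raises KeyError for characters outside the table (an assumption;
    the real function may handle unknown characters differently).
    """
    return IUPAC_CODES[nucleotide.upper()]


if __name__ == "__main__":
    print(retrieve_iupac_sketch("R"))
```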

bsbolt.Utils.sort_bam(bam_output=None, bam_input=None)

Sort bam file

Params:

  • bam_output (str): output path for sorted bam file
  • bam_input (str): input bam file

class bsbolt.Utils.AlignmentEvaluator(duplicated_regions=None, matching_target_prop=0.95, verbose=False)

Evaluate alignment against simulated bisulfite sequencing data.

Params:

  • duplicated_regions (dict): regions duplicated in the simulation reference, [None]
  • matching_target_prop (float): proportion of alignment that must overlap with target region for a valid alignment to be called, [0.95]
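
One way to read matching_target_prop is as a threshold on the fraction of aligned bases that fall inside the simulated target region. The sketch below illustrates that interpretation only: the function names, the half-open coordinate convention, and the use of >= for the comparison are all assumptions, not the class's actual logic.

```python
def overlap_proportion(aln_start: int, aln_end: int,
                       target_start: int, target_end: int) -> float:
    """Fraction of the alignment [aln_start, aln_end) inside the target.

    Coordinates are treated as half-open intervals (an assumption).
    """
    aln_len = aln_end - aln_start
    if aln_len <= 0:
        return 0.0
    overlap = min(aln_end, target_end) - max(aln_start, target_start)
    return max(overlap, 0) / aln_len


def is_on_target(aln_start: int, aln_end: int,
                 target_start: int, target_end: int,
                 matching_target_prop: float = 0.95) -> bool:
    """Call the alignment valid when enough of it overlaps the target."""
    proportion = overlap_proportion(aln_start, aln_end,
                                    target_start, target_end)
    return proportion >= matching_target_prop


if __name__ == "__main__":
    # 96 of 100 aligned bases overlap the target region.
    print(is_on_target(100, 200, 104, 300))
```

Under this reading, a 100 bp alignment shifted 10 bp off its simulated origin would overlap by 90% and be rejected at the default threshold.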

assess_alignment(alignment, alignment_info)

Compare alignment against reference alignment

evaluate_alignment(alignment_file, fastq_files=None)

Params:

  • alignment_file (str): path to alignment file
  • fastq_files (list): list of paths to fastq files

Returns:

  • alignment_evaluations (dict): target alignment stats