seqdata.BAM

class seqdata.BAM(name, bams, samples, batch_size, n_jobs=1, threads_per_job=1, dtype=np.uint16, sample_dim=None, offset_tn5=False, count_method='depth-only')
__init__(name, bams, samples, batch_size, n_jobs=1, threads_per_job=1, dtype=np.uint16, sample_dim=None, offset_tn5=False, count_method='depth-only')

Reader for BAM files.

Parameters:
  • name (str) – Name of the array this reader will write.

  • bams (Union[str, Path, List[str], List[Path]]) – Path, or list of paths, to the BAM file(s).

  • samples (Union[str, List[str]]) – Sample name(s), one per BAM.

  • batch_size (int) – Number of sequences to write at a time. Note that this also sets the chunk size along the sequence dimension.

  • n_jobs (int, optional) – Number of BAMs to process in parallel, by default 1, which disables multiprocessing. Don’t set this higher than the number of BAMs or the number of available cores.

  • threads_per_job (int, optional) – Threads to use per job, by default 1. Make sure the number of available cores is >= n_jobs * threads_per_job.

  • dtype (Union[str, Type[np.number]], optional) – Data type to write the coverage as, by default np.uint16.

  • sample_dim (Optional[str], optional) – Name of the sample dimension, by default None.

  • offset_tn5 (bool, optional) – Whether to adjust read lengths to account for Tn5 binding, by default False.

  • count_method (Union[CountMethod, Literal["depth-only", "tn5-cutsite", "tn5-fragment"]], optional) – Count method, by default “depth-only”.
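
The limits on n_jobs and threads_per_job described above can be checked before the reader is constructed. The following is a minimal sketch, not part of seqdata: the helper pick_parallelism is hypothetical, and the BAM paths, sample names, array name, and batch size are placeholders. It only builds the keyword arguments, and the final constructor call is left commented out because it needs seqdata installed and real BAM files.

```python
import os
from pathlib import Path


def pick_parallelism(n_bams, n_cores=None, threads_per_job=1):
    """Hypothetical helper: choose (n_jobs, threads_per_job) such that
    n_jobs <= n_bams and n_jobs * threads_per_job <= n_cores."""
    if n_bams < 1:
        raise ValueError("need at least one BAM")
    if threads_per_job < 1:
        raise ValueError("threads_per_job must be >= 1")
    # Fall back to the detected core count (or 1 if it cannot be determined).
    cores = n_cores if n_cores is not None else (os.cpu_count() or 1)
    # Never request more threads per job than there are cores.
    threads_per_job = min(threads_per_job, cores)
    # One job per BAM at most, limited by how many jobs fit on the cores.
    n_jobs = max(1, min(n_bams, cores // threads_per_job))
    return n_jobs, threads_per_job


# Placeholder inputs, for illustration only.
bams = [Path("sample_a.bam"), Path("sample_b.bam")]
samples = ["sample_a", "sample_b"]

n_jobs, threads = pick_parallelism(len(bams), threads_per_job=2)

reader_kwargs = dict(
    name="cov",            # name of the array the reader will write
    bams=bams,
    samples=samples,       # one sample name per BAM
    batch_size=1024,       # also sets the chunk size along the sequence dimension
    n_jobs=n_jobs,
    threads_per_job=threads,
)

# With seqdata installed and the BAM files present:
# import seqdata
# reader = seqdata.BAM(**reader_kwargs)
```

Passing n_cores explicitly makes the helper deterministic, which is convenient on shared machines where fewer cores are allocated to the job than os.cpu_count() reports.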

Methods

__init__(name, bams, samples, batch_size[, ...])

Reader for BAM files.

Attributes

name