Annotated sequence data
API
Readers
These classes are designed to read data from a variety of file formats into a SeqData object.
Composing readers
These functions can be composed to read data from a variety of file formats into a single SeqData object.
Composable function to create a SeqData object from flat files.
Composable function to create a SeqData object from region-based files.
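The composable pattern can be pictured roughly as follows. This is a minimal sketch in plain Python and not SeqData's API: `compose_readers`, `fixed_sequences`, and `fixed_targets` are hypothetical names, and a plain dict stands in for the SeqData object. Each reader contributes one named variable, and the composing function merges them into a single container.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a SeqData object: variable name -> per-sequence values.
Container = Dict[str, List]
Reader = Callable[[], Container]


def fixed_sequences(name: str, seqs: List[str]) -> Reader:
    """Build a reader that contributes sequences under `name` (illustrative only)."""
    def read() -> Container:
        return {name: list(seqs)}
    return read


def fixed_targets(name: str, values: List[float]) -> Reader:
    """Build a reader that contributes one numeric value per sequence."""
    def read() -> Container:
        return {name: list(values)}
    return read


def compose_readers(*readers: Reader) -> Container:
    """Run each reader and merge the results into one container.

    Rejects duplicate variable names and variables whose length differs
    from the variables already merged.
    """
    merged: Container = {}
    for reader in readers:
        for key, values in reader().items():
            if key in merged:
                raise ValueError(f"duplicate variable name: {key!r}")
            if merged and len(values) != len(next(iter(merged.values()))):
                raise ValueError(f"{key!r} has a different number of sequences")
            merged[key] = values
    return merged


data = compose_readers(
    fixed_sequences("seq", ["ACGT", "TTGA"]),
    fixed_targets("activity", [0.7, 1.3]),
)
print(data)
```

In this sketch the readers stay independent of one another, so supporting a new file format only means writing one more reader.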
Default readers
These functions are special cases of the composable readers that cover common use cases.
Read in sequences with coverage from a BAM file.
Read a bigWig file and return a Dataset.
Reads sequences from a "flat" FASTA file into xarray.
Reads sequences from a "genome" FASTA file into xarray.
Reads sequences and metadata from tabular files.
Read a VCF file and return a Dataset.
Reads a bed-like (BED3+) file as a pandas DataFrame.
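To make the "flat" FASTA case concrete, here is a minimal sketch of parsing such a file, where each record is one complete sequence. It uses only the Python standard library and returns a plain dict rather than an xarray object. `parse_flat_fasta` is a hypothetical helper written for illustration, not the SeqData reader, and the uppercasing of sequences is an assumption of this sketch.

```python
import io
from typing import Dict, List, Optional, TextIO


def parse_flat_fasta(handle: TextIO) -> Dict[str, str]:
    """Map each record identifier to its full sequence.

    The identifier is the header text up to the first whitespace.
    Sequence lines belonging to one record are concatenated and uppercased.
    """
    records: Dict[str, str] = {}
    name: Optional[str] = None
    chunks: List[str] = []
    for raw in handle:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                records[name] = "".join(chunks)  # close the previous record
            parts = line[1:].split()
            if not parts:
                raise ValueError("FASTA header without an identifier")
            name = parts[0]
            chunks = []
        elif name is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line.upper())
    if name is not None:
        records[name] = "".join(chunks)  # close the final record
    return records


fasta_text = ">seq1 example description\nACGT\nacgt\n>seq2\nTTGA\n"
records = parse_flat_fasta(io.StringIO(fasta_text))
print(records)
```

A "genome" FASTA differs in that its records are whole chromosomes or contigs, so sequences of interest are usually extracted by region rather than read record by record.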
Zarr
SeqData reads and writes all datasets to disk as Zarr stores using the following functions:
Write an xarray object to disk as a Zarr store.
Open a SeqData object from disk.
PyTorch dataloading
SeqData provides a unified interface for converting SeqData objects into PyTorch dataloaders.
Get a PyTorch DataLoader for this SeqData.
Utilities
Utility functions for working with SeqData objects:
Add a BED-like DataFrame to a Dataset.
Label regions for binary or multitask classification based on whether they overlap with another set of regions.
Merge observations into a SeqData object along the sequence axis.
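The overlap-based labeling described above can be sketched as follows. This is an illustration in plain Python and not the SeqData utility: `overlaps`, `label_by_overlap`, and `label_multitask` are hypothetical names, and regions are assumed to be BED-style half-open intervals `(chrom, start, end)`. A region gets label 1 if it overlaps any target region by at least one base, and 0 otherwise. The multitask variant produces one label list per named target set.

```python
from typing import Dict, List, Tuple

# (chrom, start, end) with BED-style half-open coordinates: start included, end excluded.
Region = Tuple[str, int, int]


def overlaps(a: Region, b: Region) -> bool:
    """True if two half-open intervals on the same chromosome share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def label_by_overlap(regions: List[Region], targets: List[Region]) -> List[int]:
    """Binary labels: 1 if a region overlaps any target region, else 0."""
    return [int(any(overlaps(r, t) for t in targets)) for r in regions]


def label_multitask(
    regions: List[Region], target_sets: Dict[str, List[Region]]
) -> Dict[str, List[int]]:
    """One binary label list per named target set."""
    return {task: label_by_overlap(regions, t) for task, t in target_sets.items()}


regions = [("chr1", 0, 100), ("chr1", 100, 200), ("chr2", 0, 100)]
peaks = [("chr1", 50, 120), ("chr2", 100, 150)]

labels = label_by_overlap(regions, peaks)
print(labels)  # [1, 1, 0]

multi = label_multitask(regions, {"peaks": peaks, "none": []})
print(multi)
```

The third region gets label 0 because, with half-open coordinates, an interval ending at 100 and one starting at 100 only touch and do not overlap. The pairwise comparison is quadratic in the number of regions, so large region sets would normally use sorted intervals or an interval tree instead.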