seqdata.from_region_files

seqdata.from_region_files(*readers, path, fixed_length, bed, max_jitter=0, sequence_dim=None, length_dim=None, splice=False, overwrite=False)

Write a SeqData to disk from region-based files and open it lazily, without loading it into memory.

Parameters:
  • path (str, Path) – Path to save this SeqData to.

  • fixed_length (int, bool, optional) –

    int: use regions of this length, centered on the regions in the BED file.

    True: assume all sequences have the same length and try to infer it from the data.

    False: write variable-length sequences.

  • bed (str, Path, pd.DataFrame, optional) – BED file or DataFrame matching the BED3+ specification describing what regions to write.

  • max_jitter (int, optional) – Maximum amount of jitter the SeqData object should support, provided by writing additional flanking sequence around each region. Defaults to 0.

  • sequence_dim (str, optional) – Name of sequence dimension. Defaults to “_sequence”.

  • length_dim (str, optional) – Name of length dimension. Defaults to “_length”.

  • splice (bool, optional) – Whether to splice together regions that have the same name in the BED file, by default False

  • overwrite (bool, optional) – Whether to overwrite existing arrays of the SeqData at path, by default False
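
The interaction between an integer fixed_length and max_jitter can be illustrated with plain interval arithmetic. The sketch below is not the library's implementation: the Region type and resize_region helper are hypothetical, and it assumes that the window is centered on the midpoint of the BED interval (rounding down) and that max_jitter pads both sides equally. The library's exact rounding and padding may differ.

```python
from typing import NamedTuple


class Region(NamedTuple):
    """Hypothetical stand-in for one BED3 row."""
    chrom: str
    start: int  # 0-based, inclusive (BED convention)
    end: int    # 0-based, exclusive (BED convention)


def resize_region(region: Region, fixed_length: int, max_jitter: int = 0) -> Region:
    """Center a window of fixed_length on the region, then pad it by max_jitter.

    Assumption: the center is the integer midpoint of the interval and the
    jitter padding is applied symmetrically. Coordinates are not clamped to
    chromosome bounds here.
    """
    center = (region.start + region.end) // 2
    start = center - fixed_length // 2
    end = start + fixed_length
    return Region(region.chrom, start - max_jitter, end + max_jitter)


peak = Region("chr1", 1000, 1200)
window = resize_region(peak, fixed_length=500, max_jitter=16)
stored_length = window.end - window.start
print(window, stored_length)
```

Under these assumptions, a 200 bp interval requested at fixed_length=500 with max_jitter=16 occupies 532 bp on disk: the 500 bp window plus 16 bp of flanking sequence on each side.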

Return type:

xr.Dataset
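
A call might look like the following sketch. The write_regions wrapper and its argument names are hypothetical, and the reader shown (sd.GenomeFASTA with a name, a FASTA path, and a batch_size) is an assumption about the reader API that this page does not document. Check the reader's own reference entry before relying on it.

```python
def write_regions(fasta_path, bed_path, out_path):
    """Hypothetical wrapper around seqdata.from_region_files.

    The keyword arguments passed to from_region_files come from the
    signature documented above; the GenomeFASTA constructor arguments
    are an assumption.
    """
    # Imported lazily so this sketch can be read and loaded without the package.
    import seqdata as sd

    return sd.from_region_files(
        sd.GenomeFASTA("seq", fasta_path, batch_size=2048),  # assumed reader signature
        path=out_path,       # where the SeqData is written
        fixed_length=500,    # 500 bp windows centered on each BED region
        bed=bed_path,        # BED3+ file listing the regions to write
        max_jitter=16,       # also store flanking sequence for jittering
        overwrite=True,      # replace existing arrays at out_path
    )
```

The returned object is an xr.Dataset backed by the data at path, so arrays are read only when they are accessed.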