dissect.target.helpers.scrape

Module Contents

Functions

find_needles

Yields needles and their offsets found in the provided byte stream.

find_needle_chunks

Yields tuples with a needle, its offset and a byte chunk found in the provided byte stream.

scrape_chunks

Yields records scraped from chunks found in a provided byte stream.

recover_string

Recover the longest possible string from a byte buffer, forward or reverse.

Attributes

dissect.target.helpers.scrape.Needle
dissect.target.helpers.scrape.find_needles(fh: BinaryIO, needles: Needle | list[Needle], *, start: int | None = None, end: int | None = None, lock_seek: bool = True, block_size: int = io.DEFAULT_BUFFER_SIZE, progress: Callable[[int], None] | None = None) → collections.abc.Iterator[tuple[bytes, int]]

Yields needles and their offsets found in the provided byte stream.

Parameters:
  • fh – The byte stream to search for needles.

  • needles – A single needle, or a list of needles, to search for.

  • start – The offset to start searching from.

  • end – The offset to stop searching at.

  • lock_seek – Whether the file position is maintained by the scraper or the consumer. Setting this to False allows the consumer to move the file pointer, e.g. to skip forward.

  • block_size – The block size to use for reading from the byte stream.

  • progress – A function to call with the current offset.
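The block-wise search that these parameters describe can be illustrated with a small standard-library sketch. This is not the library's implementation: simple_find_needles is a hypothetical stand-in that only mirrors the documented start, end and block_size parameters, and it assumes that carrying over the last few bytes of each block is enough to catch needles that straddle a block boundary.

```python
from __future__ import annotations

import io
from typing import BinaryIO, Iterator


def simple_find_needles(
    fh: BinaryIO,
    needles: list[bytes],
    *,
    start: int = 0,
    end: int | None = None,
    block_size: int = io.DEFAULT_BUFFER_SIZE,
) -> Iterator[tuple[bytes, int]]:
    """Hypothetical stand-in: yield (needle, absolute offset) pairs."""
    # Keep enough trailing bytes to detect a needle split across two blocks.
    overlap = max(len(needle) for needle in needles) - 1
    fh.seek(start)
    pos = start  # absolute offset of the next unread byte
    tail = b""   # trailing bytes carried over from the previous buffer

    while end is None or pos < end:
        size = block_size if end is None else min(block_size, end - pos)
        block = fh.read(size)
        if not block:
            break

        buf = tail + block
        buf_offset = pos - len(tail)  # absolute offset of buf[0]

        hits = []
        for needle in needles:
            idx = buf.find(needle)
            while idx != -1:
                # Matches that lie fully inside the tail were already
                # reported while processing the previous buffer.
                if idx + len(needle) > len(tail):
                    hits.append((buf_offset + idx, needle))
                idx = buf.find(needle, idx + 1)

        for offset, needle in sorted(hits):
            yield needle, offset

        pos += len(block)
        tail = buf[-overlap:] if overlap else b""
```

Called on io.BytesIO(b"xxABCDxxxxEFxxABCD") with needles [b"ABCD", b"EF"] and block_size=4, the sketch reports both occurrences of b"ABCD" even though each one is split across two reads.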

dissect.target.helpers.scrape.find_needle_chunks(fh: BinaryIO, needle_chunk_size_map: dict[Needle, int], chunk_reader: Callable[[BinaryIO, Needle, int, int], bytes] | None = None, lock_seek: bool = True, block_size: int = io.DEFAULT_BUFFER_SIZE) → collections.abc.Iterator[tuple[bytes, int, bytes]]

Yields tuples with a needle, its offset and a byte chunk found in the provided byte stream.

Parameters:
  • fh – The byte stream to search for needles.

  • needle_chunk_size_map – A dictionary with needle bytes as keys and chunk sizes as values.

  • chunk_reader – A function to read a chunk from a byte stream for the provided needle, offset and chunk size.

  • lock_seek – Whether the file position is maintained by the scraper or the consumer. Setting this to False will allow the consumer to move the file pointer, e.g. to skip forward.

  • block_size – The block size to use for reading from the byte stream.
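A custom chunk_reader is useful when the needle does not sit at the very start of the structure to recover. The sketch below shows a callable matching the documented shape Callable[[BinaryIO, Needle, int, int], bytes], with the argument order assumed to be stream, needle, offset, chunk size. The function name, the HEADER_SIZE constant and the record layout are hypothetical; what the library does when chunk_reader is None is not stated here.

```python
from __future__ import annotations

import io
from typing import BinaryIO

# Hypothetical layout: a 4-byte header precedes the needle in each record.
HEADER_SIZE = 4


def read_chunk_with_header(fh: BinaryIO, needle: bytes, offset: int, chunk_size: int) -> bytes:
    """Read chunk_size bytes, starting HEADER_SIZE bytes before the needle."""
    # Clamp at zero so a needle near the start of the stream still works.
    start = max(0, offset - HEADER_SIZE)
    fh.seek(start)
    return fh.read(chunk_size)
```

Such a function could then be passed as chunk_reader=read_chunk_with_header, together with a needle_chunk_size_map whose chunk sizes include the extra header bytes.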

dissect.target.helpers.scrape.scrape_chunks(fh: BinaryIO, needle_chunk_size_map: dict[Needle, int], chunk_parser: Callable[[Needle, bytes], collections.abc.Iterator[dissect.target.helpers.record.TargetRecordDescriptor]], chunk_reader: Callable[[BinaryIO, Needle, int, int], bytes] | None = None, block_size: int = io.DEFAULT_BUFFER_SIZE, log: logging.Logger | None = None) → collections.abc.Iterator[dissect.target.helpers.record.TargetRecordDescriptor]

Yields records scraped from chunks found in a provided byte stream.

Parameters:
  • fh – The byte stream to search for needles.

  • needle_chunk_size_map – A dictionary with needle bytes as keys and chunk sizes as values.

  • chunk_parser – A function to parse a chunk and yield records.

  • chunk_reader – A function to read a chunk from a byte stream for the provided needle, offset and chunk size.

  • block_size – The block size to use for reading from the byte stream.

  • log – A logger to use for logging.
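The chunk_parser callable receives a needle and the chunk read for it, and yields zero or more records. A minimal sketch of such a parser follows, under stated assumptions: the binary layout (4-byte magic, little-endian uint32 timestamp, little-endian uint16 message length, then the message) is invented for illustration, and plain dictionaries stand in for the records a real parser would yield, so that the example needs only the standard library.

```python
from __future__ import annotations

import struct
from typing import Iterator

# Hypothetical record header: magic, timestamp, message length.
HEADER = struct.Struct("<4sIH")


def parse_chunk(needle: bytes, chunk: bytes) -> Iterator[dict]:
    """Yield a dict per valid record in the chunk; yield nothing on bad data."""
    if len(chunk) < HEADER.size:
        return  # chunk too short to hold a header

    magic, timestamp, length = HEADER.unpack_from(chunk)
    if magic != needle:
        return  # false positive: needle bytes in unrelated data

    message = chunk[HEADER.size:HEADER.size + length]
    if len(message) != length:
        return  # truncated record

    yield {
        "ts": timestamp,
        "message": message.decode("utf-8", errors="replace"),
    }
```

Because scraped data is often damaged or only superficially matches the needle, the sketch silently skips chunks that fail validation instead of raising.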

dissect.target.helpers.scrape.recover_string(buf: bytes, encoding: str, *, reverse: bool = False, ascii: bool = True) → str

Recover the longest possible string from a byte buffer, forward or reverse.

Parameters:
  • buf – The byte buffer to recover a string from.

  • encoding – The encoding to use for decoding the buffer.

  • reverse – Whether to recover the string from the end of the buffer.

  • ascii – Whether to recover only ASCII characters.
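The idea of recovering the longest decodable string from either end of a buffer can be sketched as follows. This is a simplified illustration and may differ from the library's actual behaviour: recover_string_sketch is a hypothetical name, the ascii parameter is renamed ascii_only to avoid shadowing the built-in, and "ASCII" is assumed to mean printable characters from space to tilde.

```python
from __future__ import annotations


def recover_string_sketch(
    buf: bytes,
    encoding: str,
    *,
    reverse: bool = False,
    ascii_only: bool = True,
) -> str:
    """Hypothetical stand-in: longest decodable prefix (or suffix if reverse)."""
    text = ""
    # Try progressively shorter slices until one decodes without error.
    for size in range(len(buf), 0, -1):
        piece = buf[-size:] if reverse else buf[:size]
        try:
            text = piece.decode(encoding)
            break
        except UnicodeDecodeError:
            continue

    if not ascii_only:
        return text

    # Keep the run of printable ASCII characters that touches the anchored
    # side: the start of the text, or the end of it when reverse is set.
    kept = []
    for ch in reversed(text) if reverse else text:
        if not (" " <= ch <= "~"):
            break
        kept.append(ch)
    if reverse:
        kept.reverse()
    return "".join(kept)
```

With reverse=True the string is anchored at the end of the buffer, which suits the case where a needle marks the end of a text field and the text before it has to be recovered.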