dissect.target.tools.dump

Module Contents

Classes

RecordStreamElement

Sink

DumpState

Compression

Supported compression types.

Serialization

Supported serialization methods.

JsonLinesWriter

SortedKeysJsonRecordPacker

Functions

get_targets

Return a generator with Target objects for provided paths.

execute_function

Execute the given function on the provided target and return a generator with the records produced.

produce_target_func_pairs

Return a generator with target and function pairs for execution.

execute_functions

For each target / function pair in the stream, execute the function on the target.

log_progress

Log the number of items that went through the generator stream after every N elements (N is configured in step_size).

sink_records

Persist records from the stream into appropriate sinks, per serialization, compression and record type.

persist_processing_state

Keep track of the pipeline state in a persistent state object.

configure_state

create_state

Create a DumpState instance with provided properties.

persisted_state

Return a context manager for persisting a DumpState instance.

load_state

Load a persisted DumpState instance from the provided output_dir path and perform sink validation.

serialize_obj

JSON serializer for object types not serializable by json library.

get_nested_attr

get_sink_dir_by_target

get_sink_dir_by_func

slugify_descriptor_name

get_sink_filename

Return a sink filename for provided record descriptor, serialization and compression.

get_relative_sink_path

Return a sink path relative to an output directory.

open_path

Open path using mode, with specified compression and return a file object.

get_sink_writer

cached_sink_writers

get_current_utc_time

parse_datetime_iso

execute_pipeline

Run the record generation, processing and sinking pipeline.

parse_arguments

main

Attributes

dissect.target.tools.dump.HAS_LZ4 = True
dissect.target.tools.dump.HAS_ZSTD = True
dissect.target.tools.dump.log
class dissect.target.tools.dump.RecordStreamElement
target: dissect.target.target.Target
func: dissect.target.plugin.FunctionDescriptor
record: flow.record.Record
end_pos: int | None = None
sink_path: pathlib.Path | None = None
dissect.target.tools.dump.get_targets(targets: list[str]) collections.abc.Iterator[dissect.target.target.Target]

Return a generator with Target objects for provided paths.

dissect.target.tools.dump.execute_function(target: dissect.target.target.Target, function: dissect.target.plugin.FunctionDescriptor, dry_run: bool, arguments: list[str]) collections.abc.Iterator[dissect.target.helpers.record.TargetRecordDescriptor]

Execute the given function on the provided target and return a generator with the records produced.

Only output type record is supported for plugin functions.

dissect.target.tools.dump.produce_target_func_pairs(targets: collections.abc.Iterable[dissect.target.target.Target], state: DumpState) collections.abc.Iterator[tuple[dissect.target.target.Target, dissect.target.plugin.FunctionDescriptor]]

Return a generator with target and function pairs for execution.

Target and function pairs that correspond to finished sinks in the provided state are skipped.
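The skipping behaviour can be illustrated with a minimal sketch. It does not use the real Target, FunctionDescriptor or DumpState types; plain strings and a set of finished (target, function) pairs stand in for them, and the name pending_pairs_sketch is hypothetical.

```python
from collections.abc import Iterable, Iterator


def pending_pairs_sketch(
    targets: Iterable[str],
    functions: Iterable[str],
    finished: set[tuple[str, str]],
) -> Iterator[tuple[str, str]]:
    """Yield (target, function) pairs, skipping the ones already finished.

    Assumption: `finished` plays the role of the finished sinks recorded in
    the state; the real implementation derives this from DumpState.
    """
    functions = list(functions)  # reused for every target
    for target in targets:
        for function in functions:
            if (target, function) in finished:
                continue  # already completed in a previous run
            yield target, function


already_done = {("host1.img", "users")}
remaining = list(
    pending_pairs_sketch(["host1.img", "host2.img"], ["users", "hostname"], already_done)
)
```

With this shape, re-running an interrupted dump only repeats the pairs that had not been completed.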

dissect.target.tools.dump.execute_functions(target_func_stream: collections.abc.Iterable[tuple[dissect.target.target.Target, dissect.target.plugin.FunctionDescriptor]], dry_run: bool, arguments: list[str]) collections.abc.Iterator[RecordStreamElement]

For each target / function pair in the stream, execute the function on the target.

Returns a generator of RecordStreamElement objects.

dissect.target.tools.dump.log_progress(stream: collections.abc.Iterable[Any], step_size: int = 1000) collections.abc.Iterator[Any]

Log the number of items that went through the generator stream after every N elements (N is configured in step_size).
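A pass-through generator of this kind might look roughly like the sketch below. The logger name, the log messages and the function name log_progress_sketch are assumptions made for illustration, not the module's actual implementation.

```python
import logging
from collections.abc import Iterable, Iterator
from typing import Any

# Hypothetical logger name, used only in this sketch.
log = logging.getLogger("dump-sketch")


def log_progress_sketch(stream: Iterable[Any], step_size: int = 1000) -> Iterator[Any]:
    """Yield every item unchanged, logging a running count every step_size items."""
    count = 0
    for count, item in enumerate(stream, start=1):
        if count % step_size == 0:
            log.info("Processed %d items", count)
        yield item
    # Final summary once the upstream generator is exhausted.
    log.info("Processed %d items in total", count)


passed_through = list(log_progress_sketch(range(5), step_size=2))
```

Because the items are yielded unchanged, such a stage can be placed anywhere in a generator pipeline without affecting downstream consumers.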

dissect.target.tools.dump.sink_records(record_stream: collections.abc.Iterable[RecordStreamElement], state: DumpState) collections.abc.Iterator[RecordStreamElement]

Persist records from the stream into appropriate sinks, per serialization, compression and record type.

dissect.target.tools.dump.persist_processing_state(record_stream: collections.abc.Iterable[RecordStreamElement], state: DumpState) collections.abc.Iterator[RecordStreamElement]

Keep track of the pipeline state in a persistent state object.

dissect.target.tools.dump.configure_state(args: argparse.Namespace) DumpState | None
dissect.target.tools.dump.STATE_FILE_NAME = 'target-dump.state.json'
dissect.target.tools.dump.PENDING_UPDATES_LIMIT = 10
class dissect.target.tools.dump.Sink
target_path: str
func: str
path: pathlib.Path
is_dirty: bool = True
record_count: int = 0
size_bytes: int = 0
__post_init__()
class dissect.target.tools.dump.DumpState
target_paths: list[str]
functions: str
excluded_functions: list[str]
serialization: str
compression: str
start_time: datetime.datetime
last_update_time: datetime.datetime
sinks: list[Sink] = []
output_dir: pathlib.Path | None = None
pending_updates_count: int | None = 0
property record_count: int
property finished_sinks: list[Sink]
property path: pathlib.Path
classmethod get_state_path(output_dir: pathlib.Path) pathlib.Path
get_full_sink_path(sink: Sink) pathlib.Path
get_sink(path: pathlib.Path) Sink | None
serialize() str

Serialize state instance into a JSON formatted string.

persist(fh: TextIO) None

Write the serialized state instance into the provided fh stream, overwriting it from the beginning.
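Overwriting a stream from the beginning usually means seeking to offset 0, writing the new content and truncating whatever is left over from a longer previous write. The sketch below shows that pattern on an in-memory text stream; persist_sketch and the dictionary payload are hypothetical stand-ins for the real method and the serialized DumpState.

```python
import io
import json
from typing import TextIO


def persist_sketch(state: dict, fh: TextIO) -> None:
    """Replace the full content of fh with the JSON form of state."""
    fh.seek(0)                                   # rewind to the start of the stream
    fh.write(json.dumps(state, sort_keys=True))
    fh.truncate()                                # drop leftovers of a longer, older state
    fh.flush()


buf = io.StringIO()
persist_sketch({"record_count": 1000, "sinks": ["a", "b"]}, buf)
persist_sketch({"record_count": 2}, buf)         # shorter than the first write
stored = buf.getvalue()
```

Without the truncate step, the tail of the first, longer write would remain in the stream and the content would no longer be valid JSON.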

mark_as_finished(target: dissect.target.target.Target, func: str) None

Mark sinks that match provided target and func pair as not dirty.

create_sink(sink_path: pathlib.Path, stream_element: RecordStreamElement) Sink

Create a sink instance for provided sink_path and stream_element (from which target and func properties are used).

update(stream_element: RecordStreamElement, fp_position: int) None

Update a sink instance for provided stream_element.

classmethod from_dict(state_dict: dict) typing_extensions.Self

Deserialize state instance from provided dictionary.

classmethod from_path(output_dir: pathlib.Path) typing_extensions.Self | None

Deserialize state instance from a file in the provided output directory path.

get_invalid_sinks() list[Sink]

Return sinks that have a mismatch between recorded size and a real file size.
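The size check described here can be sketched as follows. SinkSketch is a simplified, hypothetical stand-in for the Sink dataclass, and treating a missing file as invalid is an assumption of this sketch rather than documented behaviour.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class SinkSketch:
    """Simplified stand-in for Sink: a relative path and the recorded size."""

    path: Path
    size_bytes: int = 0


def find_invalid_sinks(sinks: list[SinkSketch], output_dir: Path) -> list[SinkSketch]:
    """Return sinks whose recorded size differs from the size of the file on disk."""
    invalid = []
    for sink in sinks:
        full_path = output_dir / sink.path
        # Assumption: a sink file that no longer exists also counts as invalid.
        if not full_path.exists() or full_path.stat().st_size != sink.size_bytes:
            invalid.append(sink)
    return invalid


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp)
    (out / "good.jsonl").write_bytes(b"12345")
    (out / "bad.jsonl").write_bytes(b"123")
    sinks = [
        SinkSketch(Path("good.jsonl"), 5),
        SinkSketch(Path("bad.jsonl"), 5),
        SinkSketch(Path("missing.jsonl"), 5),
    ]
    invalid_names = [s.path.name for s in find_invalid_sinks(sinks, out)]
```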

drop_invalid_sinks() None

Remove sinks that have a mismatch between recorded size and a real file size from the list of sinks.

drop_dirty_sinks() None

Drop sinks that are marked as “dirty” in the current state from the list of sinks.

dissect.target.tools.dump.create_state(*, output_dir: pathlib.Path, target_paths: list[str], functions: str, excluded_functions: list[str], serialization: Serialization, compression: Compression = None) DumpState

Create a DumpState instance with provided properties.

dissect.target.tools.dump.persisted_state(state: DumpState) collections.abc.Iterator[collections.abc.Callable]

Return a context manager for persisting a DumpState instance.

dissect.target.tools.dump.load_state(output_dir: pathlib.Path) DumpState | None

Load a persisted DumpState instance from the provided output_dir path and perform sink validation.

dissect.target.tools.dump.serialize_obj(obj: Any) str

JSON serializer for object types not serializable by json library.
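A serializer of this kind is typically passed as the default argument of json.dumps, which calls it for any object the json library cannot encode itself. The sketch below handles datetime and Path values; which types the real serialize_obj supports is not stated here, so that choice is an assumption, as is the name serialize_obj_sketch.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Any


def serialize_obj_sketch(obj: Any) -> str:
    """Fallback encoder for values json.dumps cannot handle on its own."""
    if isinstance(obj, datetime):
        return obj.isoformat()
    if isinstance(obj, Path):
        return str(obj)
    # Follow the json convention: signal unsupported types with TypeError.
    raise TypeError(f"Type {type(obj).__name__} is not JSON serializable")


payload = {
    "start_time": datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc),
    "output_dir": Path("out"),
}
encoded = json.dumps(payload, default=serialize_obj_sketch, sort_keys=True)
```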

class dissect.target.tools.dump.Compression

Bases: str, enum.Enum

Supported compression types.

BZIP2 = 'bzip2'
GZIP = 'gzip'
LZ4 = 'lz4'
ZSTD = 'zstandard'
NONE = None
class dissect.target.tools.dump.Serialization

Bases: str, enum.Enum

Supported serialization methods.

JSONLINES = 'jsonlines'
MSGPACK = 'msgpack'
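Because the enum derives from both str and enum.Enum, its members compare equal to plain strings and can be looked up by value, which is convenient for command-line choices. The sketch below re-declares the two members listed above under the hypothetical name SerializationSketch so that it runs without dissect.target installed.

```python
from enum import Enum


class SerializationSketch(str, Enum):
    """Stand-in mirroring the members listed for Serialization."""

    JSONLINES = "jsonlines"
    MSGPACK = "msgpack"


# Look a member up by its string value, e.g. one parsed from the command line.
chosen = SerializationSketch("msgpack")

# The str mixin makes members compare equal to their plain string value.
equals_plain_string = chosen == "msgpack"

# All values, for example to build a list of accepted choices.
choices = [member.value for member in SerializationSketch]
```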
dissect.target.tools.dump.COMPRESSION_TO_EXT
dissect.target.tools.dump.DEST_DIR_CACHE_SIZE = 10
dissect.target.tools.dump.DEST_FILENAME_CACHE_SIZE = 10
dissect.target.tools.dump.OPEN_WRITERS_LIMIT = 10
dissect.target.tools.dump.get_nested_attr(obj: Any, nested_attr: str) Any
dissect.target.tools.dump.get_sink_dir_by_target(target: dissect.target.target.Target, function: dissect.target.plugin.FunctionDescriptor) pathlib.Path
dissect.target.tools.dump.get_sink_dir_by_func(target: dissect.target.target.Target, function: dissect.target.plugin.FunctionDescriptor) pathlib.Path
dissect.target.tools.dump.slugify_descriptor_name(descriptor_name: str) str
dissect.target.tools.dump.get_sink_filename(record_descriptor: flow.record.RecordDescriptor, serialization: Serialization, compression: Compression | None = None) str

Return a sink filename for provided record descriptor, serialization and compression.

dissect.target.tools.dump.get_relative_sink_path(element: RecordStreamElement, serialization: str, compression: Compression | None = None) pathlib.Path

Return a sink path relative to an output directory.

dissect.target.tools.dump.open_path(path: pathlib.Path, mode: str, compression: Compression | None = None) BinaryIO

Open path using mode, with specified compression and return a file object.
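The dispatch on compression type can be sketched with the standard library alone. The sketch covers only gzip and bzip2 and takes the compression as a plain string; LZ4 and Zstandard depend on optional third-party packages (compare HAS_LZ4 and HAS_ZSTD) and are left out. The function name open_path_sketch is hypothetical.

```python
import bz2
import gzip
import tempfile
from pathlib import Path
from typing import BinaryIO, Optional


def open_path_sketch(path: Path, mode: str, compression: Optional[str] = None) -> BinaryIO:
    """Open path in a binary mode, transparently (de)compressing if requested."""
    if compression is None:
        return path.open(mode)
    if compression == "gzip":
        return gzip.open(path, mode)
    if compression == "bzip2":
        return bz2.open(path, mode)
    raise ValueError(f"Unsupported compression: {compression!r}")


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "records.jsonl.gz"
    with open_path_sketch(target, "wb", "gzip") as fh:
        fh.write(b'{"hello": "world"}\n')
    with open_path_sketch(target, "rb", "gzip") as fh:
        roundtrip = fh.read()
```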

class dissect.target.tools.dump.JsonLinesWriter(fp: TextIO, **kwargs)

Bases: flow.record.adapter.jsonfile.JsonfileWriter

fp
packer
flush() None

Flush any buffered writes.

close() None

Close the Writer, no more writes will be possible.

class dissect.target.tools.dump.SortedKeysJsonRecordPacker(indent: int | None = None, pack_descriptors: bool = True)

Bases: flow.record.jsonpacker.JsonRecordPacker

pack(obj: flow.record.Record | flow.record.RecordDescriptor) str
dissect.target.tools.dump.SERIALIZERS
dissect.target.tools.dump.get_sink_writer(full_sink_path: pathlib.Path, serialization: Serialization, compression: Compression | None = None, new_sink: bool = True) flow.record.adapter.jsonfile.JsonfileWriter | flow.record.RecordStreamWriter
dissect.target.tools.dump.cached_sink_writers(state: DumpState) collections.abc.Iterator[collections.abc.Callable]
dissect.target.tools.dump.get_current_utc_time() datetime.datetime
dissect.target.tools.dump.parse_datetime_iso(datetime_str: str) datetime.datetime
dissect.target.tools.dump.execute_pipeline(state: DumpState, targets: collections.abc.Iterator[dissect.target.target.Target], dry_run: bool, arguments: list[str], limit: int | None = None) None

Run the record generation, processing and sinking pipeline.
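The signatures above suggest a chain of generators in which each stage consumes the previous one. The toy sketch below shows that general shape with stand-in stages. How the real pipeline orders its stages and applies limit is not documented here, so the use of itertools.islice and all function names are assumptions.

```python
from collections import deque
from collections.abc import Iterable, Iterator
from itertools import islice


def produce_pairs(targets: Iterable[str], functions: list[str]) -> Iterator[tuple[str, str]]:
    """Stand-in for the stage that yields target / function pairs."""
    for target in targets:
        for function in functions:
            yield target, function


def execute(pairs: Iterable[tuple[str, str]]) -> Iterator[dict]:
    """Stand-in for function execution: emit three fake records per pair."""
    for target, function in pairs:
        for n in range(3):
            yield {"target": target, "func": function, "n": n}


def sink(records: Iterable[dict], written: list[dict]) -> Iterator[dict]:
    """Stand-in for the sinking stage: store each record, then pass it on."""
    for record in records:
        written.append(record)
        yield record


written: list[dict] = []
stream = produce_pairs(["t1", "t2"], ["f"])
stream = execute(stream)
stream = islice(stream, 4)          # assumed way of applying a record limit
stream = sink(stream, written)
deque(stream, maxlen=0)             # drain the lazy pipeline without keeping items
```

Nothing runs until the final stage is drained, so a limit applied mid-chain also stops the upstream stages from producing further records.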

dissect.target.tools.dump.parse_arguments() tuple[argparse.Namespace, list[str]]
dissect.target.tools.dump.main() None