dissect.vmfs

Submodules

Package Contents

Classes

DirEntry

Directory entry representation.

FileDescriptor

VMFS file descriptor implementation.

LVM

VMFS LVM implementation, supports LVM5 and LVM6.

Device

VMFS LVM device implementation.

Volume

Logical volume in a VMFS LVM.

VMFS

VMFS filesystem implementation.

class dissect.vmfs.DirEntry(vmfs: dissect.vmfs.vmfs.VMFS, address: int, name: str, type: dissect.vmfs.c_vmfs.FS3_DescriptorType, raw: dissect.vmfs.c_vmfs.c_vmfs.FS3_DirEntry | dissect.vmfs.c_vmfs.c_vmfs.FS6_DirEntry | None = None)

Directory entry representation.

Parameters:
  • vmfs – The VMFS instance this directory entry belongs to.

  • address – The address of the file descriptor of this directory entry.

  • name – The name of the directory entry.

  • type – The type of the directory entry.

  • raw – The raw directory entry struct, if available.

vmfs
address
name
type
raw = None
__repr__() str
property file_descriptor: FileDescriptor

Resolve this directory entry to its file descriptor.

fd
class dissect.vmfs.FileDescriptor(vmfs: dissect.vmfs.vmfs.VMFS, address: int)

VMFS file descriptor implementation.

See FileDescriptor5 and FileDescriptor6 for the VMFS5 and VMFS6 specific implementations.

File descriptors are basically the inodes of VMFS and are all stored in the .fdc.sf resource. They are the combination of a lock block, a metadata block, and a bit of space for data. They start with lock information, which allows multiple ESXi hosts to stay in sync and place locks. This is followed by the FS3_FileMetadata structure, which contains the fields that you would expect of an “inode”.

The file descriptor on disk roughly looks like the following:

struct FS3_FileDescriptor {
    FS3_DiskLock lockBlock;
    FS3_FileMetadata metaBlock;
    char data[N];
};

On VMFS5, each block is 512 bytes large, and there’s 1024 bytes of data. On VMFS6, the block size is determined by the metadata alignment. The entire file descriptor is two metadata blocks large, with the lock occupying the first metadata block, and the metadata and data occupying the second metadata block.
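The sizes described above can be sketched as a small calculation. This is only an illustration: descriptor_layout is a hypothetical helper, not part of dissect.vmfs, and the md_alignment value in the usage note is an example, not a value taken from a real filesystem.

```python
def descriptor_layout(is_vmfs6: bool, md_alignment: int = 512) -> dict[str, int]:
    """Return byte offsets within a file descriptor, per the description above.

    Hypothetical helper; `md_alignment` is only used for VMFS6.
    """
    if is_vmfs6:
        # VMFS6: two metadata blocks, the lock in the first one and the
        # metadata plus data in the second one.
        block = md_alignment
        total = 2 * block
    else:
        # VMFS5: a 512-byte lock block, a 512-byte metadata block,
        # followed by 1024 bytes of data.
        block = 512
        total = 2 * block + 1024
    return {"lock_offset": 0, "metadata_offset": block, "total_size": total}
```

For example, descriptor_layout(False) gives a total size of 2048 bytes, and descriptor_layout(True, md_alignment=4096) places the metadata at offset 4096.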

Data is stored in a way that is also similar to many Unix filesystems. There is some space at the end of the metadata for either a block pointer array, or some resident data. On VMFS5, the block pointer array and the data portion are stored in the same place, whereas on VMFS6, the block pointer array is aligned to the end of the file descriptor, and the data portion is aligned to the end of the metadata structure.

The “zeroLevelAddrType” (or ZLA) determines how to interpret the block pointer array. The types can generally be separated into two kinds: direct and indirect. Like other filesystems, direct blocks refer directly to filesystem blocks, or offsets on disk, that contain data. With indirect blocks, you first need to go through one or more layers of indirection to get to the final filesystem block. View the documentation of BlockStream for more information.
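The difference between direct and indirect pointers can be illustrated with a toy model. This sketch does not model the real VMFS address encoding or the actual ZLA values; resolve_blocks and the pointer_blocks mapping are hypothetical and only show a single layer of indirection.

```python
def resolve_blocks(pointers, indirect, pointer_blocks=None):
    """Flatten a pointer array into a list of data block addresses.

    Toy model: `pointer_blocks` maps the address of a pointer block to the
    list of addresses stored inside it.
    """
    if not indirect:
        # Direct: every entry already refers to a block that contains data.
        return list(pointers)

    # Indirect: every entry refers to a pointer block, which in turn
    # lists the blocks that contain data.
    resolved = []
    for address in pointers:
        resolved.extend(pointer_blocks[address])
    return resolved
```

With more layers of indirection, the same lookup would be repeated once per layer before reaching the data blocks.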

Directory entries are also stored very differently between VMFS5 and VMFS6. Refer to FileDescriptor5._iterdir() and FileDescriptor6._iterdir() for more information on how these work.

vmfs
address
__repr__() str
debug() str

Return a debug string for this file descriptor.

Mimics vmkfstools -D output.

static from_bytes(vmfs: dissect.vmfs.vmfs.VMFS, address: int, buf: bytes) FileDescriptor | FileDescriptor5 | FileDescriptor6

Create a FileDescriptor5 or FileDescriptor6 from a bytes buffer.

property raw: memoryview

The raw buffer of this file descriptor.

property lock_info: dissect.vmfs.c_vmfs.c_vmfs.FS3_DiskLock

The lock info of this file descriptor.

property metadata: dissect.vmfs.c_vmfs.c_vmfs.FS3_FileMetadata

The file metadata of this file descriptor.

property data: memoryview

The data portion of this file descriptor.

property blocks: list[int]

The block array of this file.

Also referred to as the pointer array, or data addresses.

On VMFS5, this is stored in the data portion of the file descriptor as an array of 32-bit integers. On VMFS6, it’s aligned to the end of the file descriptor, and is an array of 64-bit integers.
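A minimal sketch of how such an array could be unpacked from the raw descriptor buffer. It assumes little-endian integers and that the number of entries and the offset of the data portion are already known; parse_block_array, data_offset and count are hypothetical names, not the library's API.

```python
import struct


def parse_block_array(raw: bytes, data_offset: int, count: int, is_vmfs6: bool) -> list[int]:
    """Unpack `count` block addresses from a raw file descriptor buffer."""
    if is_vmfs6:
        # VMFS6: 64-bit entries, aligned to the end of the file descriptor.
        fmt = f"<{count}Q"
        start = len(raw) - struct.calcsize(fmt)
    else:
        # VMFS5: 32-bit entries, stored at the start of the data portion.
        fmt = f"<{count}I"
        start = data_offset
    return list(struct.unpack_from(fmt, raw, start))
```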

property rdm_mapping: dissect.vmfs.c_vmfs.c_vmfs.FS3_RawDiskMap

The RDM mapping of this file, if this file is an RDM file.

The RDM mapping is stored in the data portion of the file descriptor.

property parent: FileDescriptor | None

The parent file descriptor of this file, if it has one.

property size: int

The size of this file.

property type: int

The type of this descriptor. Not to be confused with the file type.

property zla: dissect.vmfs.c_vmfs.FS3_ZeroLevelAddrType

The “Zero Level Address” type of this file.

property mode: int

The file mode of this file.

The mode in the metadata only contains a type bit for directories, we add the appropriate type bits for regular files, symlinks and RDM files.

Access the mode through the metadata attribute to get the raw mode value.
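The idea of adding the missing type bits can be sketched with the standard stat module. This is an assumption-laden illustration: effective_mode and the kind strings are hypothetical, and the sketch leaves out RDM files because the type bits used for them are not stated here.

```python
import stat

# Hypothetical mapping from a descriptor kind to POSIX file type bits.
TYPE_BITS = {
    "regular": stat.S_IFREG,
    "symlink": stat.S_IFLNK,
    "directory": stat.S_IFDIR,
}


def effective_mode(raw_mode: int, kind: str) -> int:
    """Return `raw_mode` with file type bits added when they are missing."""
    if stat.S_IFMT(raw_mode):
        # A type bit is already present (the directory case), keep it as-is.
        return raw_mode
    return raw_mode | TYPE_BITS[kind]
```

With this, stat.S_ISREG(effective_mode(0o644, "regular")) is true, while a mode that already carries the directory bit is returned unchanged.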

property block_size: int

The file specific block size of this file.

property atime: datetime.datetime

The last access time of this file.

property mtime: datetime.datetime

The last modified time of this file.

property ctime: datetime.datetime

The creation time of this file.

The destination of this symlink, if this file descriptor is a symlink.

is_dir() bool

Return whether this file descriptor is a directory.

is_file() bool

Return whether this file descriptor is a regular file.

Return whether this file descriptor is a symlink.

is_system() bool

Return whether this file descriptor is a system file.

is_rdm() bool

Return whether this file descriptor is an RDM file.

listdir() dict[str, DirEntry]

A dictionary of the content of this directory, if this file descriptor is a directory.

iterdir() collections.abc.Iterator[DirEntry]

Iterate file descriptors of the directory entries, if this file descriptor is a directory.

get(name: str) DirEntry

Get a child directory entry by name.

Parameters:

name – The name of the directory entry to get.
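The directory methods above can be combined into a recursive walk. The sketch below relies only on listdir(), is_dir() and DirEntry.file_descriptor as documented here; walk is a hypothetical helper, and skipping "." and ".." is a precaution that assumes such entries may be listed.

```python
def walk(fd, prefix=""):
    """Yield (path, DirEntry) pairs for everything below the directory `fd`."""
    for name, entry in fd.listdir().items():
        if name in (".", ".."):
            # Avoid recursing into the directory itself or its parent.
            continue
        path = f"{prefix}/{name}"
        yield path, entry

        # Resolve the entry to its file descriptor to check for subdirectories.
        child = entry.file_descriptor
        if child.is_dir():
            yield from walk(child, path)
```

Given a VMFS instance, something like `for path, entry in walk(vmfs.root): print(path)` would list the whole tree, assuming root is the file descriptor of the root directory.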

open() BlockStream

Open a read-only stream for this file descriptor.

exception dissect.vmfs.Error

Bases: Exception

Common base class for all non-exit exceptions.

exception dissect.vmfs.FileNotFoundError

Bases: Error, FileNotFoundError

Common base class for all non-exit exceptions.

exception dissect.vmfs.InvalidHeader

Bases: Error

Common base class for all non-exit exceptions.

exception dissect.vmfs.NotADirectoryError

Bases: Error, NotADirectoryError

Common base class for all non-exit exceptions.

exception dissect.vmfs.NotASymlinkError

Bases: Error

Common base class for all non-exit exceptions.

class dissect.vmfs.LVM(fh: BinaryIO | Device | list[BinaryIO] | list[Device])

VMFS LVM implementation, supports LVM5 and LVM6.

VMFS LVM is a logical volume manager that allows multiple physical devices to be combined into a single logical volume. Technically, LVM supports multiple logical volumes: LVM3 started out supporting 1024, later versions decreased this to 512, and LVM6 only allows 1. In practice, only one logical volume is ever used.

Provide this class with file-like objects for all devices that make up the LVM, then access the volumes attribute to get a list of logical volumes. A Volume can be opened for reading by calling Volume.open().
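The workflow described above could look roughly like the following. The sketch only uses names documented on this page (LVM, its volumes attribute, Volume.is_valid(), Volume.open() and VMFS(volume=...)); first_valid_volume and open_vmfs are hypothetical helpers, and picking the first valid volume is an assumption based on the note that only one volume is used in practice.

```python
from typing import BinaryIO


def first_valid_volume(lvm):
    """Return the first volume of `lvm` that reports itself valid, or None."""
    for volume in lvm.volumes:
        if volume.is_valid():
            return volume
    return None


def open_vmfs(device_handles: list[BinaryIO]):
    """Build an LVM from open device handles and open the VMFS on it."""
    # Imported lazily so this sketch can be loaded without the package installed.
    from dissect.vmfs import LVM, VMFS

    lvm = LVM(device_handles)
    volume = first_valid_volume(lvm)
    if volume is None:
        raise ValueError("no valid logical volume found")
    return VMFS(volume=volume.open())
```

The device handles must stay open for as long as the returned filesystem object is in use.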

Parameters:

fh – A file-like object or a list of file-like objects that constitute an LVM.

devices: list[Device] = []

List of Device objects that make up the LVM.

volumes: list[Volume] = []

List of Volume objects that are in the LVM.

__repr__() str
class dissect.vmfs.Device(fh: BinaryIO)

VMFS LVM device implementation.

Represents a single device in the LVM.

An LVM device contains metadata that describes itself, as well as the logical volumes and physical extents it contains.

The metadata roughly looks like the following pseudo-structure:

struct LVM_DeviceHeader {
    LVM_DevMetadata     devMeta;
    LVM_VolTableEntry   volTable[LVM_MAX_VOLUMES_PER_DEV];
    char                reserved[LVM_RESERVED_SIZE];
    LVM_SDTableEntry    sdTable[FS_PLIST_DEF_MAX_PARTITIONS];
    uint8               peBitmap[LVM_PE_BITMAP_SIZE];
    LVM_PETableEntry    peTable[LVM_PES_PER_BITMAP];
};

On versions prior to LVM6, it looks like the structure sizes are largely respected when calculating offsets to other tables. However, since LVM6 a specific field in LVM_DevMetadata is often used for this calculation. Because the real name of this field is unknown, we have decided to call it mdAlignment within this project, since it appears to be used in a similar way as in VMFS.

The device metadata (devMeta) starts at a fixed offset (0x00100000), but since LVM6 it may reference extended metadata at other offsets. The volume table (volTable) starts directly after devMeta, at 0x00100000 + LVM_SIZEOF_LVM_DEVMETA, where LVM_SIZEOF_LVM_DEVMETA is either 512 or, since LVM6, the value of the mdAlignment field in the LVM_DevMetadata structure. Since LVM5 (I could not find evidence that LVM4 exists) there is also an sdTable, a table of device names, that starts at the end of the volume table. The peBitmap is a bitmap that describes which entries in the peTable are used, and starts at the end of the volTable/sdTable. The peTable is a table of physical extents, which starts at the end of the peBitmap. The peBitmap/peTable pair repeats numPEs times.
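The start of the volume table follows directly from this description. A small sketch, in which vol_table_offset is a hypothetical helper, a major version of 6 or higher is assumed to mean LVM6, and the mdAlignment value in the usage note is an example only:

```python
DEV_META_OFFSET = 0x00100000  # fixed offset of devMeta, as described above


def vol_table_offset(major_version: int, md_alignment: int = 0) -> int:
    """Return the offset of volTable on an LVM device."""
    # Before LVM6 the device metadata is 512 bytes large; since LVM6 its
    # size is given by the mdAlignment field of LVM_DevMetadata.
    devmeta_size = md_alignment if major_version >= 6 else 512
    return DEV_META_OFFSET + devmeta_size
```

For LVM5 this gives 0x00100200; for LVM6 with an mdAlignment of 4096 it gives 0x00101000. The offsets of the later tables depend on structure sizes that are not listed here, so they are left out.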

The device metadata LVM_DevMetadata contains information about the device, including some identifiers and number of volumes and physical extents. There are also timestamps when the device was created and last modified.

The volume descriptor LVM_VolDescriptor contains information about the logical volume, specific to that device. The LVM_VolMetadata structure contains metadata that is shared across all devices in the volume, but other fields in the descriptor are specific to that device (such as the first and last physical extent on that device).

The “SD table” (storage device? SCSI disk?) is a table of device names that are part of the volume, which is only present in LVM5 and later, and only on the first device in the LVM (internally referred to as “devZero”).

The physical extent descriptors (LVM_PEDescriptor) contain information about the physical extents on the device, including the logical offset, physical offset and length of the extent, as well as a reference to the volume it belongs to. A device can have multiple physical extents, and the logical volume is constructed from these physical extents across all devices in the LVM.
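How a logical volume offset could be translated to a device and physical offset is sketched below. Extent and locate are hypothetical stand-ins for illustration, not the library's data structures, and the extents are assumed not to overlap.

```python
from bisect import bisect_right
from typing import NamedTuple


class Extent(NamedTuple):
    logical_offset: int   # where the extent starts in the logical volume
    physical_offset: int  # where the extent starts on its device
    length: int
    device: int           # index of the device that holds the extent


def locate(extents: list[Extent], offset: int) -> tuple[int, int]:
    """Map a logical volume offset to (device index, physical offset)."""
    ordered = sorted(extents)  # sorts by logical_offset first
    starts = [extent.logical_offset for extent in ordered]

    # Find the last extent that starts at or before the requested offset.
    index = bisect_right(starts, offset) - 1
    if index < 0:
        raise ValueError("offset lies before the first extent")

    extent = ordered[index]
    delta = offset - extent.logical_offset
    if delta >= extent.length:
        raise ValueError("offset is not covered by any extent")
    return extent.device, extent.physical_offset + delta
```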

There can only be a maximum of 8 physical extent map/table pairs per metadata region (so 8 maps and 64k table entries). If more are needed, the device metadata will reference extended metadata regions, which are similarly laid out, but with a different offset. The extended metadata regions are linked together by the nextOffset field in the LVM_ExtDevMetadata structure.
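The arithmetic implied by these limits can be written out as follows. This assumes that “64k” means 65536 entries split evenly over the 8 tables, and that physical extents are numbered consecutively across metadata regions; locate_pe is a hypothetical helper.

```python
PAIRS_PER_REGION = 8
ENTRIES_PER_REGION = 64 * 1024  # "64k table entries", assumed to mean 65536
ENTRIES_PER_TABLE = ENTRIES_PER_REGION // PAIRS_PER_REGION  # 8192 per table


def locate_pe(index: int) -> tuple[int, int, int]:
    """Return (region, pair, slot) for the physical extent with this index.

    Region 0 is the primary metadata region, higher numbers are the
    extended metadata regions in the order they are linked together.
    """
    region, within_region = divmod(index, ENTRIES_PER_REGION)
    pair, slot = divmod(within_region, ENTRIES_PER_TABLE)
    return region, pair, slot
```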

Parameters:

fh – A file-like object of an LVM device.

fh
metadata
ext_metadata = []
major_version
minor_version
uuid
size
volumes
__repr__() str
class dissect.vmfs.Volume(uuid: str, snap_id: int, devices: list[Device])

Logical volume in a VMFS LVM.

Represents a logical volume that is constructed from one or more devices.

Parameters:
  • uuid – The UUID of the volume.

  • snap_id – The snapshot ID of the volume.

  • devices – A list of Device objects that make up the volume. Must contain at least one device.

uuid
snap_id
devices
size
generation
state
name
creation_ts
dataruns
__repr__() str
is_valid() bool

Check if the volume is valid and can be opened for reading.

open() VolumeStream

Open a read-only stream for the volume.

class dissect.vmfs.VMFS(volume: BinaryIO | None = None, vh: BinaryIO | None = None, fdc: BinaryIO | None = None, fbb: BinaryIO | None = None, sbc: BinaryIO | None = None, pbc: BinaryIO | None = None, pb2: BinaryIO | None = None, jbc: BinaryIO | None = None)

VMFS filesystem implementation.

The VMFS filesystem is a complex clustered filesystem used by VMware ESXi. This implementation aims to provide a read-only interface for reading VMFS filesystems, supporting VMFS5 and VMFS6. Locks and such are not implemented, so feel free to read any file to your heart’s content.

Within ESXi, the VMFS filesystem is tightly coupled with the VMFS LVM (Logical Volume Manager). You can have a raw LVM if you want, but VMFS must be placed on an LVM volume. The LVM is responsible for managing the physical storage across one or more physical disks (devices), while the VMFS filesystem is responsible for managing the files and directories. Both are multi-host aware, meaning that multiple ESXi hosts can access and claim locks on individual parts of both the LVM and the filesystem.

Within our implementation, we decouple the LVM and VMFS filesystem. The LVM implementation behaves like any other volume manager, providing raw volume access to any underlying storage. The VMFS filesystem implementation can be used on any file-like object that contains a VMFS filesystem, not technically requiring it to be a volume managed by the LVM. However, in practice, you will most often use the LVM implementation to access the VMFS filesystem. Unless you imaged a VMFS filesystem directly from /dev/lvm, for some reason 🤷‍♂️.

This implementation can be initialized with a file-like object of a VMFS volume, or from individual system files. When initialized from a volume, a VMFS LVM volume must already have been loaded. When initialized from individual system files, you can inspect most of the filesystem (including browsing most directories), but you won’t be able to access most file data directly.

Note

A lot of the math consists of bitwise shifts and masks, which translate to modulo or multiplication operations. For the sake of “maintainability” in relation to the original “code”, we keep this as bitwise masks, at the sacrifice of some human readability. Comments explaining as such are placed where appropriate.
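As a generic illustration of that equivalence (not code taken from this project): for power-of-two sizes, a mask is a modulo and a left shift is a multiplication. The function names are hypothetical.

```python
def offset_in_block(offset: int, block_size: int) -> int:
    """Equivalent to `offset % block_size` when block_size is a power of two."""
    if block_size <= 0 or block_size & (block_size - 1):
        raise ValueError("block_size must be a power of two")
    return offset & (block_size - 1)


def block_to_offset(block: int, shift: int) -> int:
    """Equivalent to `block * 2 ** shift`."""
    return block << shift
```

For example, offset_in_block(0x12345, 0x1000) gives 0x345, the same as 0x12345 % 0x1000.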

Parameters:
  • volume – A file-like object of a VMFS volume.

  • vh – An optional file-like object of the VMFS volume header file system file (.vh.sf).

  • fdc – An optional file-like object of the file descriptor cluster system file (.fdc.sf).

  • fbb – An optional file-like object of the file block system file (.fbb.sf).

  • sbc – An optional file-like object of the sub-block cluster system file (.sbc.sf).

  • pbc – An optional file-like object of the pointer block cluster system file (.pbc.sf).

  • pb2 – An optional file-like object of the pointer block 2 system file (.pb2.sf).

  • jbc – An optional file-like object of the journal block cluster system file (.jbc.sf).

fh = None
descriptor
md_alignment
file_block_size
sub_block_size
major_version
minor_version
uuid
label
resources
file_descriptor
root
property is_vmfs5: bool

Whether this is a VMFS5 filesystem.

property is_vmfs6: bool

Whether this is a VMFS6 filesystem.

property is_local: bool

Whether this is a “local” VMFS filesystem (VMFS-L).

get(path: str | int | dissect.vmfs.descriptor.DirEntry, node: dissect.vmfs.descriptor.FileDescriptor | None = None) dissect.vmfs.descriptor.FileDescriptor