NVMe Emulation¶
QEMU provides NVMe emulation through the nvme, nvme-ns and nvme-subsys devices.
See the following sections for specific information on:
- Adding NVMe Devices, additional namespaces and NVM subsystems.
- Configuration of Optional Features such as Controller Memory Buffer, Simple Copy, Zoned Namespaces, metadata and End-to-End Data Protection.
Adding NVMe Devices¶
Controller Emulation¶
The QEMU emulated NVMe controller implements version 1.4 of the NVM Express specification. All mandatory features are implemented, with the following exceptions and limitations:
- Accounting numbers in the SMART/Health log page are reset when the device is power cycled.
- Interrupt Coalescing is not supported and is disabled by default.
The simplest way to attach an NVMe controller on the QEMU PCI bus is to add the following parameters:
-drive file=nvm.img,if=none,id=nvm
-device nvme,serial=deadbeef,drive=nvm
There are a number of optional general parameters for the nvme device. Some are mentioned here, but see -device nvme,help to list all possible parameters.
max_ioqpairs=UINT32 (default: 64)
  Set the maximum number of allowed I/O queue pairs. This replaces the deprecated num_queues parameter.
msix_qsize=UINT16 (default: 65)
  The number of MSI-X vectors that the device should support.
mdts=UINT8 (default: 7)
  Set the Maximum Data Transfer Size of the device.
use-intel-id (default: off)
  Since QEMU 5.2, the device uses a QEMU allocated “Red Hat” PCI Device and Vendor ID. Set this to on to revert to the unallocated Intel ID previously used.
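The mdts value is an exponent rather than a byte count. As a rough sketch, assuming the NVMe convention that MDTS is a power of two in units of the controller's minimum memory page size, and assuming a 4 KiB minimum page size (an assumption, not something stated above), the resulting byte limit can be worked out with shell arithmetic. The helper name mdts_bytes is hypothetical.

```shell
# mdts_bytes MDTS [PAGE_SIZE]
# Print the maximum transfer size in bytes for a given mdts value.
# Assumption: MDTS is a power-of-two multiplier of the minimum memory
# page size, taken here to be 4096 bytes unless overridden.
mdts_bytes() {
    mdts=$1
    page_size=${2:-4096}
    echo $(( (1 << mdts) * page_size ))
}

mdts_bytes 7    # default mdts=7 -> 524288 bytes (512 KiB)
mdts_bytes 5    # mdts=5 -> 131072 bytes (128 KiB)
```

Under these assumptions, lowering mdts by one halves the largest transfer that the host may issue in a single command.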
Additional Namespaces¶
In the simplest possible invocation sketched above, the device only supports a single namespace with the namespace identifier 1. To support multiple namespaces and additional features, the nvme-ns device must be used.
-device nvme,id=nvme-ctrl-0,serial=deadbeef
-drive file=nvm-1.img,if=none,id=nvm-1
-device nvme-ns,drive=nvm-1
-drive file=nvm-2.img,if=none,id=nvm-2
-device nvme-ns,drive=nvm-2
The namespaces defined by the nvme-ns device will attach to the most recently defined nvme-bus that is created by the nvme device. Namespace identifiers are allocated automatically, starting from 1.
There are a number of parameters available:
nsid (default: 0)
  Explicitly set the namespace identifier.
uuid (default: autogenerated)
  Set the UUID of the namespace. This will be reported as a “Namespace UUID” descriptor in the Namespace Identification Descriptor List.
bus
  If more than one nvme device is defined, this parameter may be used to attach the namespace to a specific nvme device (identified by an id parameter on the controller device).
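As an illustration of these parameters, the following sketch defines two controllers and pins one namespace to each by controller id. The image file names, the serial numbers and the UUID are placeholders chosen for this example only.

```
-device nvme,id=nvme-ctrl-0,serial=deadbeef
-device nvme,id=nvme-ctrl-1,serial=cafebabe
-drive file=nvm-1.img,if=none,id=nvm-1
-device nvme-ns,drive=nvm-1,bus=nvme-ctrl-0,nsid=1
-drive file=nvm-2.img,if=none,id=nvm-2
-device nvme-ns,drive=nvm-2,bus=nvme-ctrl-1,nsid=2,uuid=00112233-4455-6677-8899-aabbccddeeff
```

Without the bus parameter, both namespaces would attach to nvme-ctrl-1, since it is the most recently defined controller.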
NVM Subsystems¶
Additional features become available if the controller device (nvme) is linked to an NVM Subsystem device (nvme-subsys).
The NVM Subsystem emulation allows features such as shared namespaces and multipath I/O.
-device nvme-subsys,id=nvme-subsys-0,nqn=subsys0
-device nvme,serial=a,subsys=nvme-subsys-0
-device nvme,serial=b,subsys=nvme-subsys-0
This will create an NVM subsystem with two controllers. Having controllers linked to an nvme-subsys device allows additional nvme-ns parameters:
shared (default: off)
  Specifies that the namespace will be attached to all controllers in the subsystem. If set to off (the default), the namespace will remain a private namespace and may only be attached to a single controller at a time.
detached (default: off)
  If set to on, the namespace will be available in the subsystem, but not attached to any controllers initially.
Thus, adding
-drive file=nvm-1.img,if=none,id=nvm-1
-device nvme-ns,drive=nvm-1,nsid=1,shared=on
-drive file=nvm-2.img,if=none,id=nvm-2
-device nvme-ns,drive=nvm-2,nsid=3,detached=on
will cause NSID 1 to be a shared namespace (due to shared=on) that is initially attached to both controllers. NSID 3 will be a private namespace (i.e. only attachable to a single controller at a time) and will not be attached to any controller initially (due to detached=on).
Optional Features¶
Controller Memory Buffer¶
The nvme device parameters related to Controller Memory Buffer support are:
cmb_size_mb=UINT32 (default: 0)
  This adds a Controller Memory Buffer of the given size at offset zero in BAR 2.
legacy-cmb (default: off)
  By default, the device uses the “v1.4 scheme” for the Controller Memory Buffer support (i.e., the CMB is initially disabled and must be explicitly enabled by the host). Set this to on to behave as a v1.3 device with respect to the CMB.
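As a minimal sketch, a controller with a 64 MiB Controller Memory Buffer could be defined as follows. The size of 64 and the image file name are arbitrary example values.

```
-drive file=nvm.img,if=none,id=nvm
-device nvme,serial=deadbeef,drive=nvm,cmb_size_mb=64
```

Appending legacy-cmb=on to the nvme device would keep the same buffer but make it behave as on a v1.3 device.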
Simple Copy¶
The device includes support for TP 4065 (“Simple Copy Command”). A number of additional nvme-ns device parameters may be used to control the Copy command limits:
mssrl=UINT16 (default: 128)
  Set the Maximum Single Source Range Length (MSSRL). This is the maximum number of logical blocks that may be specified in each source range.
mcl=UINT32 (default: 128)
  Set the Maximum Copy Length (MCL). This is the maximum number of logical blocks that may be specified in a Copy command (the total for all source ranges).
msrc=UINT8 (default: 127)
  Set the Maximum Source Range Count (MSRC). This is the maximum number of source ranges that may be used in a Copy command. This is a 0’s based value.
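The three limits interact, so a worked check may help. The following sketch tests whether a set of source range lengths fits within given limits. The helper name copy_fits is hypothetical, the lengths are plain logical block counts, and the logic only mirrors the definitions above rather than the device's actual validation code.

```shell
# copy_fits MSSRL MCL MSRC LEN...
# Succeed if the listed source range lengths (in logical blocks) fit
# within the given Copy command limits, fail otherwise.
copy_fits() {
    mssrl=$1
    mcl=$2
    msrc=$3
    shift 3
    # MSRC is a 0's based value: N permits up to N + 1 source ranges.
    [ "$#" -le $((msrc + 1)) ] || return 1
    total=0
    for len in "$@"; do
        # Each individual range is bounded by MSSRL.
        [ "$len" -le "$mssrl" ] || return 1
        total=$((total + len))
    done
    # The sum over all ranges is bounded by MCL.
    [ "$total" -le "$mcl" ]
}

# With the defaults (mssrl=128, mcl=128, msrc=127):
copy_fits 128 128 127 64 64   && echo "two ranges of 64 blocks fit"
copy_fits 128 128 127 100 100 || echo "200 blocks in total exceed mcl"
```

With the defaults, mcl is the binding limit: a single range of 128 blocks already exhausts the total, even though up to 128 ranges are permitted.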
Zoned Namespaces¶
A namespace may be “Zoned” as defined by TP 4053 (“Zoned Namespaces”). Set zoned=on on an nvme-ns device to configure it as a zoned namespace.
The namespace may be configured with additional parameters:
zoned.zone_size=SIZE (default: 128MiB)
  Define the zone size (ZSZE).
zoned.zone_capacity=SIZE (default: 0)
  Define the zone capacity (ZCAP). If left at the default (0), the zone capacity will equal the zone size.
zoned.descr_ext_size=UINT32 (default: 0)
  Set the Zone Descriptor Extension Size (ZDES). Must be a multiple of 64 bytes.
zoned.cross_read=BOOL (default: off)
  Set to on to allow reads to cross zone boundaries.
zoned.max_active=UINT32 (default: 0)
  Set the maximum number of active resources (MAR). The default (0) allows all zones to be active.
zoned.max_open=UINT32 (default: 0)
  Set the maximum number of open resources (MOR). The default (0) allows all zones to be open. If zoned.max_active is specified, this value must be less than or equal to that.
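A zoned namespace combining several of these parameters might be sketched as follows. The image file name and all numeric values are example choices, and the M size suffix is assumed to be accepted as it is for other QEMU size options.

```
-device nvme,id=nvme-ctrl-0,serial=deadbeef
-drive file=zns.img,if=none,id=zns
-device nvme-ns,drive=zns,zoned=on,zoned.zone_size=64M,zoned.zone_capacity=48M,zoned.max_active=32,zoned.max_open=16
```

Here each 64 MiB zone exposes 48 MiB of writable capacity, and zoned.max_open is kept at or below zoned.max_active as required.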
Metadata¶
The virtual namespace device supports LBA metadata in the form of separate metadata (MPTR-based) and extended LBAs.
ms=UINT16 (default: 0)
  Defines the number of metadata bytes per LBA.
mset=UINT8 (default: 0)
  Set to 1 to enable extended LBAs.
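For example, a namespace carrying 8 bytes of metadata per LBA, transferred as part of an extended LBA, could be sketched like this. The image file name and the choice of 8 bytes are illustrative.

```
-device nvme,id=nvme-ctrl-0,serial=deadbeef
-drive file=nvm-1.img,if=none,id=nvm-1
-device nvme-ns,drive=nvm-1,ms=8,mset=1
```

Leaving mset at 0 keeps the same 8 bytes per LBA but transfers them as separate, MPTR-based metadata.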
End-to-End Data Protection¶
The virtual namespace device supports DIF- and DIX-based protection information (depending on mset).
pi=UINT8 (default: 0)
  Enable protection information of the specified type (type 1, 2 or 3).
pil=UINT8 (default: 0)
  Controls the location of the protection information within the metadata. Set to 1 to transfer protection information as the first eight bytes of metadata. Otherwise, the protection information is transferred as the last eight bytes.
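The effect of pil can be worked through with a little arithmetic. The sketch below computes where the eight bytes of protection information start within the per-LBA metadata, given the ms and pil values. The helper name pi_offset is hypothetical, and the size check only reflects that eight bytes must fit in the metadata.

```shell
# pi_offset MS PIL
# Print the byte offset of the eight-byte protection information
# within MS bytes of per-LBA metadata.
pi_offset() {
    ms=$1
    pil=$2
    # The protection information needs eight bytes of metadata.
    if [ "$ms" -lt 8 ]; then
        echo "metadata too small for protection information" >&2
        return 1
    fi
    if [ "$pil" -eq 1 ]; then
        echo 0              # first eight bytes of metadata
    else
        echo $((ms - 8))    # last eight bytes of metadata
    fi
}

pi_offset 16 0    # prints 8
pi_offset 16 1    # prints 0
```

With ms=8 the two settings coincide, because the protection information occupies the entire metadata area.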