
Configure Milvus QueryNode with Local Disk

This article describes how to configure Milvus QueryNode to use local disk storage.

Overview

Milvus is an AI-focused vector database, purpose-built for efficiently storing and retrieving massive amounts of vector data. It is well suited to tasks such as image and video analysis, natural language processing, and recommendation systems. To ensure optimal performance, it is crucial to minimize disk read latency. Local NVMe SSDs are strongly recommended to prevent delays and keep the system stable.

Key features where local disk storage comes into play include:

  • Chunk cache: Preloads data into local disk cache for faster search.
  • MMap: Maps file contents directly into memory for better memory efficiency.
  • DiskANN Index: Requires disk storage for efficient index management.
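
All three features rely on the QueryNode's local storage path, so what matters most is that this path sits on an NVMe-backed filesystem. As a minimal sketch, the localStorage.path setting in milvus.yaml controls this path (verify the exact key against your Milvus version):

localStorage:
  path: /var/lib/milvus/data  # make sure this path is backed by a local NVMe disk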

This article focuses on deploying Milvus Distributed on cloud platforms and configuring the QueryNode to use NVMe disk storage. The following table lists the recommended machine types from various cloud providers.

Cloud Provider    Machine Type
AWS               R6id series
GCP               N2 series
Azure             Lsv3 series
Alibaba Cloud     i3 series
Tencent Cloud     IT5 series

These machine types provide NVMe disk storage. Run the lsblk command on an instance to confirm that NVMe disks are present; if they are, you can proceed to the next step.

$ lsblk | grep nvme
nvme0n1 259:0 0 250.0G 0 disk
nvme1n1 259:1 0 250.0G 0 disk
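
You can optionally inspect the devices in more detail; a ROTA value of 0 indicates a non-rotational (SSD/NVMe) device:

$ lsblk -d -o NAME,MODEL,SIZE,ROTA | grep nvme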

Configure Kubernetes to use local disk

To configure the QueryNode of Milvus Distributed to use NVMe disk storage, you need to configure the worker nodes of the target Kubernetes clusters to store the containers and images on an NVMe disk. The procedure for this varies depending on the cloud providers.

AWS

When using Amazon EKS, you can customize managed nodes with launch templates, in which you can specify configuration settings for your node groups. The following is an example of how to mount an NVMe disk on the worker nodes of your Amazon EKS cluster:

MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="==MYBOUNDARY=="

--==MYBOUNDARY==
Content-Type: text/x-shellscript; charset="us-ascii"

#!/bin/bash
echo "Running custom user data script"
if ( lsblk | fgrep -q nvme1n1 ); then
    mkdir -p /mnt/data /var/lib/kubelet /var/lib/docker
    mkfs.xfs /dev/nvme1n1
    mount /dev/nvme1n1 /mnt/data
    chmod 0755 /mnt/data
    mv /var/lib/kubelet /mnt/data/
    mv /var/lib/docker /mnt/data/
    ln -sf /mnt/data/kubelet /var/lib/kubelet
    ln -sf /mnt/data/docker /var/lib/docker
    UUID=$(lsblk -f | grep nvme1n1 | awk '{print $3}')
    echo "UUID=$UUID /mnt/data xfs defaults,noatime 1 1" >> /etc/fstab
fi
echo 10485760 > /proc/sys/fs/aio-max-nr

--==MYBOUNDARY==--

In the above example, we assume that the NVMe disk is /dev/nvme1n1. You need to modify the script to match your specific configuration.
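
If you manage launch templates with the AWS CLI, one way to attach this user data is shown below; the template name is hypothetical, and the MIME document above is assumed to be saved as user-data.mime:

# create a launch template whose user data is the MIME document above (hypothetical names)
aws ec2 create-launch-template \
    --launch-template-name milvus-querynode-nvme \
    --launch-template-data "{\"UserData\":\"$(base64 -w0 user-data.mime)\"}"

You can then reference this launch template when creating the managed node group for your QueryNode workers.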

For details, see Customizing managed nodes with launch templates.

GCP

To provision Local SSD storage on Google Kubernetes Engine (GKE) clusters and configure workloads to consume data from the Local SSD-backed ephemeral storage attached to the nodes in your cluster, run the following command:

gcloud container node-pools create ${POOL_NAME} \
    --cluster=${CLUSTER_NAME} \
    --ephemeral-storage-local-ssd count=${NUMBER_OF_DISKS} \
    --machine-type=${MACHINE_TYPE}
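
After the node pool is created, you can confirm that its nodes expose the Local SSD-backed ephemeral storage to Kubernetes; the cloud.google.com/gke-nodepool label is a standard GKE node label:

kubectl get nodes -l cloud.google.com/gke-nodepool=${POOL_NAME} \
    -o custom-columns=NAME:.metadata.name,EPHEMERAL-STORAGE:.status.allocatable.ephemeral-storage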

For details, see Provisioning Local SSD storage on GKE.

Azure

To create a virtual machine scale set (VMSS) with local NVMe disk storage, you need to pass custom data to the VM instances. The following is an example of how to attach an NVMe disk to the VM instances in the VMSS:

mdadm -Cv /dev/md0 -l0 -n2 /dev/nvme0n1 /dev/nvme1n1
mdadm -Ds > /etc/mdadm/mdadm.conf
update-initramfs -u

mkfs.xfs /dev/md0
mkdir -p /var/lib/kubelet
echo '/dev/md0 /var/lib/kubelet xfs defaults 0 0' >> /etc/fstab
mount -a

In the above example, we assume that the NVMe disks are /dev/nvme0n1 and /dev/nvme1n1. You need to modify the script to match your specific configuration.
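
For example, if the script above is saved as nvme-setup.sh, you can pass it as custom data when creating the scale set with the Azure CLI; the resource names below are hypothetical:

az vmss create \
    --resource-group milvus-rg \
    --name milvus-querynode-vmss \
    --image Ubuntu2204 \
    --vm-sku Standard_L8s_v3 \
    --custom-data nvme-setup.sh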

Alibaba Cloud & Tencent Cloud

To create a node pool that uses local SSD volumes, you need to pass custom data to the nodes. The following is an example of the custom data script.

#!/bin/bash
echo "nvme init start..."
mkfs.xfs /dev/nvme0n1
mkdir -p /mnt/data
echo '/dev/nvme0n1 /mnt/data/ xfs defaults 0 0' >> /etc/fstab
mount -a

mkdir -p /mnt/data/kubelet /mnt/data/containerd /mnt/data/log/pods
mkdir -p /var/lib/kubelet /var/lib/containerd /var/log/pods

echo '/mnt/data/kubelet /var/lib/kubelet none defaults,bind 0 0' >> /etc/fstab
echo '/mnt/data/containerd /var/lib/containerd none defaults,bind 0 0' >> /etc/fstab
echo '/mnt/data/log/pods /var/log/pods none defaults,bind 0 0' >> /etc/fstab
mount -a

echo "nvme init end..."

In the above example, we assume that the NVMe disk is /dev/nvme0n1. You need to modify the script to match your specific configuration.
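
After a node created with this custom data joins the cluster, you can verify that kubelet, containerd, and the pod logs are backed by the NVMe mount:

findmnt --target /var/lib/kubelet
findmnt --target /var/lib/containerd
df -h /mnt/data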

Your own IDC

If you run your own IDC and want containerd to store containers and images on a newly mounted NVMe disk by default, follow these steps:

  • Mount the NVMe disks.

    Ensure that your NVMe disk is properly mounted on your host machine. You can mount it to a directory of your choice. For instance, if you mount it to /mnt/nvme, make sure it is correctly set up and you can see the disk available at /mnt/nvme by running lsblk or df -h.

  • Update containerd configuration.

    Modify the containerd configuration to use the new mount as the root directory for container storage.

    sudo mkdir -p /mnt/nvme/containerd /mnt/nvme/containerd/state
    sudo vim /etc/containerd/config.toml

    Locate the [plugins."io.containerd.grpc.v1.cri".containerd] section, and modify the snapshotter, root, and state settings as follows:

    [plugins."io.containerd.grpc.v1.cri".containerd]
      snapshotter = "overlayfs"
      root = "/mnt/nvme/containerd"
      state = "/mnt/nvme/containerd/state"
  • Restart containerd.

    Restart the containerd service to apply the changes.

    sudo systemctl restart containerd
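
After restarting, you can optionally confirm that containerd picked up the new paths and that image data now lands on the NVMe-backed directory, for example:

# print the effective containerd configuration and check the root/state paths
sudo containerd config dump | grep -E '(root|state) ='

# image and container data should now accumulate under the NVMe mount
sudo du -sh /mnt/nvme/containerd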

Verify disk performance

You are advised to verify disk performance using Fio, a popular disk benchmarking tool. The following example shows how to run Fio to test disk performance.

  • Deploy the test pod to the node with the NVMe disk.

    kubectl create -f ubuntu.yaml

    The ubuntu.yaml file is as follows:

    apiVersion: v1
    kind: Pod
    metadata:
      name: ubuntu
    spec:
      containers:
      - name: ubuntu
        image: ubuntu:latest
        command: ["sleep", "86400"]
        volumeMounts:
        - name: data-volume
          mountPath: /data
      volumes:
      - name: data-volume
        emptyDir: {}
  • Run Fio to test the disk performance.

    # enter the container
    kubectl exec -it pod/ubuntu -- bash

    # in container
    apt-get update
    apt-get install fio -y

    # change to the mounted dir
    cd /data

    # write 10GB
    fio -direct=1 -iodepth=128 -rw=randwrite -ioengine=libaio -bs=4K -size=10G -numjobs=10 -runtime=600 -group_reporting -filename=test -name=Rand_Write_IOPS_Test

    # verify the read speed
    # compare with the disk performance indicators provided by various cloud providers.
    fio --filename=test --direct=1 --rw=randread --bs=4k --ioengine=libaio --iodepth=64 --runtime=120 --numjobs=128 --time_based --group_reporting --name=iops-test-job --eta-newline=1 --readonly

    The output should look similar to the following:

    Jobs: 128 (f=128): [r(128)][100.0%][r=1458MiB/s][r=373k IOPS][eta 00m:00s]
    iops-test-job: (groupid=0, jobs=128): err= 0: pid=768: Mon Jun 24 09:35:06 2024
    read: IOPS=349k, BW=1364MiB/s (1430MB/s)(160GiB/120067msec)
    slat (nsec): min=765, max=530621k, avg=365836.09, stdev=4765464.96
    clat (usec): min=35, max=1476.0k, avg=23096.78, stdev=45409.13
    lat (usec): min=36, max=1571.6k, avg=23462.62, stdev=46296.74
    clat percentiles (usec):
    | 1.00th=[ 69], 5.00th=[ 79], 10.00th=[ 85], 20.00th=[ 95],
    | 30.00th=[ 106], 40.00th=[ 123], 50.00th=[ 149], 60.00th=[ 11469],
    | 70.00th=[ 23462], 80.00th=[ 39584], 90.00th=[ 70779], 95.00th=[103285],
    | 99.00th=[189793], 99.50th=[244319], 99.90th=[497026], 99.95th=[591397],
    | 99.99th=[767558]
    bw ( MiB/s): min= 236, max= 4439, per=100.00%, avg=1365.82, stdev= 5.02, samples=30591
    iops : min=60447, max=1136488, avg=349640.62, stdev=1284.65, samples=30591
    lat (usec) : 50=0.01%, 100=24.90%, 250=30.47%, 500=0.09%, 750=0.31%
    lat (usec) : 1000=0.08%
    lat (msec) : 2=0.32%, 4=0.59%, 10=1.86%, 20=8.20%, 50=17.29%
    lat (msec) : 100=10.62%, 250=4.80%, 500=0.38%, 750=0.09%, 1000=0.01%
    lat (msec) : 2000=0.01%
    cpu : usr=0.20%, sys=0.48%, ctx=838085, majf=0, minf=9665
    IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
    submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
    complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
    issued rwts: total=41910256,0,0,0 short=0,0,0,0 dropped=0,0,0,0
    latency : target=0, window=0, percentile=100.00%, depth=64
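
Once you are done benchmarking, you can remove the test file and the test pod:

# clean up the benchmark artifacts
kubectl exec pod/ubuntu -- rm -f /data/test
kubectl delete pod ubuntu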

Deploy Milvus Distributed

Once the verification results are satisfactory, you can deploy Milvus Distributed. Keep the following tips in mind:

Tips for deploying Milvus Distributed using Helm

The QueryNode pod uses the NVMe disk as an EmptyDir volume by default. To ensure optimal performance, you are advised to mount the NVMe disk to /var/lib/milvus/data within the QueryNode pods.

For details on how to deploy Milvus Distributed using Helm, see Run Milvus in Kubernetes with Helm.
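
After deployment, you can verify from inside a QueryNode pod that /var/lib/milvus/data is backed by the NVMe disk. The label selector below assumes the Milvus Helm chart's component=querynode pod label; adjust it if your release uses different labels.

# pick a QueryNode pod and check which device backs its data path
QN_POD=$(kubectl get pods -l component=querynode -o jsonpath='{.items[0].metadata.name}')
kubectl exec -it "$QN_POD" -- df -h /var/lib/milvus/data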

Tips for deploying Milvus Distributed using Milvus Operator

The Milvus Operator automatically configures the QueryNode pod to use NVMe disks as EmptyDir volumes. You are advised to add the following configurations to the MilvusCluster custom resource:

...
spec:
  components:
    queryNode:
      volumeMounts:
      - mountPath: /var/lib/milvus/data
        name: data
      volumes:
      - name: data
        emptyDir: {}

This will ensure that the QueryNode pod uses the NVMe disk as the data volume. For details on how to deploy Milvus Distributed using Milvus Operator, see Run Milvus in Kubernetes with Milvus Operator.