On Sat, Sep 30, 2017 at 12:05:54PM +0800, Wei Wang wrote:
+static void ctrlq_send_cmd(struct virtio_balloon *vb,
+ struct virtio_balloon_ctrlq_cmd *cmd,
+ bool inbuf)
+{
+ struct virtqueue *vq = vb->ctrl_vq;
+
+ ctrlq_add_cmd(vq, cmd, inbuf);
+ if (!inbuf) {
+ /*
+ * All the input cmd buffers are replenished here.
+ * This is necessary because the input cmd buffers are lost
+ * after live migration. The device needs to rewind all of
+ * them from the ctrl_vq.
I'm confused. Does live migration somehow lose state? Why is that, and why is it
a good idea? And how do you even know this is migration?
It looks like all you know is that you got a free page end. That could happen for any reason.
+static void ctrlq_handle(struct virtqueue *vq)
+{
+ struct virtio_balloon *vb = vq->vdev->priv;
+ struct virtio_balloon_ctrlq_cmd *msg;
+ unsigned int class, cmd, len;
+
+ msg = (struct virtio_balloon_ctrlq_cmd *)virtqueue_get_buf(vq, &len);
+ if (unlikely(!msg))
+ return;
+
+ /* The outbuf is sent by the host for recycling, so just return. */
+ if (msg == &vb->free_page_cmd_out)
+ return;
+
+ class = virtio32_to_cpu(vb->vdev, msg->class);
+ cmd = virtio32_to_cpu(vb->vdev, msg->cmd);
+
+ switch (class) {
+ case VIRTIO_BALLOON_CTRLQ_CLASS_FREE_PAGE:
+ if (cmd == VIRTIO_BALLOON_FREE_PAGE_F_STOP) {
+ vb->report_free_page_stop = true;
+ } else if (cmd == VIRTIO_BALLOON_FREE_PAGE_F_START) {
+ vb->report_free_page_stop = false;
+ queue_work(vb->balloon_wq,
+ &vb->report_free_page_work);
+ }
+ vb->free_page_cmd_in.class =
+ VIRTIO_BALLOON_CTRLQ_CLASS_FREE_PAGE;
+ ctrlq_send_cmd(vb, &vb->free_page_cmd_in, true);
+ break;
+ default:
+ dev_warn(&vb->vdev->dev, "%s: cmd class not supported\n",
+ __func__);
+ }
Manipulating report_free_page_stop without any locks looks very suspicious.
Also, what if we get two start commands? We should restart from the beginning, should we not?
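To make both concerns concrete, here is a rough user-space sketch, not driver code. It assumes the stop flag and a start generation are packed into one atomic word, so a second start is visible to the worker as a generation change and the walk restarts from page 0. All names here (report_state, walker, handle_start, report_next) are made up for illustration; in the driver a lock around the flag would be the more conventional choice.

```c
#include <stdatomic.h>

#define RUNNING 1u	/* low bit: reporting is allowed */

/* Hypothetical model of the shared state; not the driver's structures. */
struct report_state {
	atomic_uint word;	/* bits 1..31: start generation, bit 0: RUNNING */
};

/* Per-walk state owned by the reporting worker. */
struct walker {
	unsigned int gen;	/* generation this walk belongs to */
	unsigned long next;	/* next page index to report */
};

static void report_state_init(struct report_state *s)
{
	atomic_init(&s->word, 0u);
}

/* START: set RUNNING and bump the generation in one atomic update. */
static void handle_start(struct report_state *s)
{
	unsigned int old = atomic_load(&s->word);
	unsigned int new;

	do {
		new = (old | RUNNING) + 2u;
	} while (!atomic_compare_exchange_weak(&s->word, &old, new));
}

/* STOP: clear RUNNING, keep the generation. */
static void handle_stop(struct report_state *s)
{
	atomic_fetch_and(&s->word, ~RUNNING);
}

/*
 * Returns the index of the next page to report, or -1 if reporting is
 * stopped. A generation change means a new START arrived, so the walk
 * restarts from the beginning.
 */
static long report_next(struct report_state *s, struct walker *w)
{
	unsigned int word = atomic_load(&s->word);

	if (!(word & RUNNING))
		return -1;
	if ((word >> 1) != w->gen) {
		w->gen = word >> 1;
		w->next = 0;
	}
	return (long)w->next++;
}
```

With this shape, two back-to-back starts simply leave the worker on the newest generation, and a stop takes effect at the next call to report_next().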
+/* Ctrlq commands related to VIRTIO_BALLOON_CTRLQ_CLASS_FREE_PAGE*/
+#define VIRTIO_BALLOON_FREE_PAGE_F_STOP 0
+#define VIRTIO_BALLOON_FREE_PAGE_F_START 1
+
#endif /* _LINUX_VIRTIO_BALLOON_H */
The stop command does not appear to be thought through.
Say you started migration and asked the guest for free pages.
Then you cancel it. There are a bunch of pages in the free page vq and more
are still arriving. You now want to start migration again. What do you do?
A bunch of vq flushing and waiting will maybe do the trick, but waiting on the guest
is never a great idea.
I previously suggested pushing the stop/start commands from guest to host on the
free page vq, and including an ID in both the host to guest and the guest to host
commands. This way the ctrl vq is used just for host to guest commands, and the
host can match commands and knows which command a given free page is a response to.
I still think it's a good idea, but go ahead and propose something else that works.
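A rough sketch of the ID matching, as a user-space model rather than a proposed ABI. It assumes the host picks a fresh ID for every start request and the guest echoes that ID with each free page it reports; the struct layouts and the names ctrl_cmd, free_page_msg, host_start and host_accept are invented for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

enum { CMD_STOP = 0, CMD_START = 1 };	/* illustrative values */

/* Hypothetical host to guest command on the ctrl vq. */
struct ctrl_cmd {
	uint32_t cmd;
	uint32_t id;	/* chosen by the host for each start request */
};

/* Hypothetical guest to host message on the free page vq. */
struct free_page_msg {
	uint32_t id;	/* ID of the start request being answered */
	uint64_t pfn;
};

struct host_state {
	uint32_t cur_id;	/* ID of the most recent start request */
	bool active;		/* false after a stop or cancel */
};

/* Begin a new round: any page tagged with an older ID is now stale. */
static struct ctrl_cmd host_start(struct host_state *h)
{
	struct ctrl_cmd c;

	h->cur_id++;
	h->active = true;
	c.cmd = CMD_START;
	c.id = h->cur_id;
	return c;
}

/* Cancel the current round without waiting for the guest to drain. */
static struct ctrl_cmd host_stop(struct host_state *h)
{
	struct ctrl_cmd c;

	h->active = false;
	c.cmd = CMD_STOP;
	c.id = h->cur_id;
	return c;
}

/* Accept a page only if it answers the round that is currently active. */
static bool host_accept(const struct host_state *h,
			const struct free_page_msg *m)
{
	return h->active && m->id == h->cur_id;
}
```

The point is that cancel and restart need no flushing: pages left over from the cancelled round carry the old ID and are dropped when they eventually arrive.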