- 26 Jul, 2019 1 commit
-
-
Kevin Pouget authored
This work was supported by the ExaNoDe project that has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 671578. The work presented in this paper reflects only authors’ view and the European Commission is not responsible for any use that may be made of the information it contains. [vosys] remove submodules for faster testing [vosys] add gitlab-ci migration/postcopy: define userfaultfd syscall number This patch adds the `userfaultfd` definition to the Qemu Linux ARM64 syscall list. migration/postcopy: update userfaultfd header file This patch updates Qemu's copy of Linux `userfaultfd.h` based on commit 25412491e9e43d27d9c50aea09a106e4876f108e from repository https://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git https://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git/tree/include/uapi/linux/userfaultfd.h?h=userfault&id=25412491e9e43d27d9c50aea09a106e4876f108e Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> migration/postcopy: improve postcopy helper functions This patch extends the postcopy toolset functions to provide/extend UserfaultFD functionalities. These functions are lightweight wrappers around UFFD syscalls: static int uffd_wake(int uffd, ram_addr_t region, size_t len); static int uffd_unregister_protection(int uffd, ram_addr_t region, size_t len); static int uffd_protection(int uffd, ram_addr_t page_addr, size_t len, int remove); These functions are their public interface: int postcopy_ram_register_wp(UserfaultState *us); int postcopy_ram_register_missing(UserfaultState *us); int postcopy_ram_wprotect_all(UserfaultState *us); int postcopy_ram_disable_notify(UserfaultState *us); int postcopy_ram_write_protect(UserfaultState *us); There is currently a functionality of UFFD that is not working as expected: when the write-protection of some pages of a region has been turned off, we currently must unregister the whole region from UFFD, then register it again, and activate the write protection. This is unfortunate, as it implies that the VM cannot be running between these two operations. The expected behavior (re-activate the protection of all the pages of a region) could be performed without stopping the VM. Set `UFFD_USE_UNREGISTER` to `1` to make `postcopy_ram_enable_notify` work; set it to `0` to switch to the version not working at the moment, but supposedly correct. Function `postcopy_ram_fault_thread` relies on the `UserfaultState *us` structure belonging to a `MigrationIncomingState` object (it accesses the `mis->postcopy_remote_fds` array). For checkpoint migration, we do not modify this behavior, and instead make sure that this structure is always in a valid state (i.e., it is not destroyed at the end of incoming migrations). Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> migration/hmp: allow printing migration capabilities without migration statuses The `hmp_info_migrate` function first prints the migration capabilities, then the migration status. This patch allows the display of the migration capabilities, even if there is no status currently set (for printing incremental information about incremental checkpointing dirty page tracking). Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> migration: add dedicated init function This patch adds a dedicated init function for the `migration` module. Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> migration/checkpoint: add capabilities and state query helpers Introduce 'live' and 'incremental' migration capabilities and helper functions, as well as snapshot state query functions: - full or incremental checkpoint ongoing? - inside an incremental checkpoint? Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> migration/checkpoint: add page-level atomic operations and bitmaps These operations and bitmaps track the state of the virtual machine RAM pages during the checkpointing. Pages can be marked as dirty, already saved, or under processing. These bitmaps are independent of the ones used for VM migrations. Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> migration/checkpoint: extend migration priority-page queue This patch extends the migration page queue to store the pages that should be saved as part of the on-going or next checkpoint. - unqueue_page: extended to correctly track the case where the virtual machine pages are smaller than the host page (eg, 1K vs 4K) - ram_update_page_in_queue: add shadow copy of the page content to a page already in the priority queue - ram_next_dirty_pages_to_priority_queue: at the end of a live and incremental checkpoint, put the pages that faulted during the current checkpoint into the priority queue of the next checkpoint. - ram_page_req_mutex: lock/unlock the priority queue mutex from other files (eg: postcopy-ram.c) - ram_save_queue_pages: add flags to indicate if the page should be added to the main priority queue, or in the priority queue of the next checkpoint - ram_pages_in_queue: indicates if the priority queue is currently empty Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> migration/checkpoint: extend the page fault handler This patch extends the UserfaultFD fault handler to track write-protection faults during live and incremental checkpointing. - postcopy_ram_fault_thread: handle write-protection fault in the VM memory: - mark the page as dirty, - push it to the right priority queue (current or next) - if necessary, take a shadow copy of its content before allowing the VM to continue its execution. - migration/postcopy-ram.o is moved to Makefile.target so that it can use the TARGET_PAGE_SIZE macro. Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> migration/checkpoint: add run_state transitions This patch adds the different `run_state` transitions that occur during live and incremental checkpointing. Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> migration/checkpoint: add checkpoint metadata and file-saving operations This patch introduces the structures and helper functions required to save incremental checkpoints to disk: - checkpoint_save_metadata: write (or update) `CHPT_META_MAGIC` and `checkpoint_file_state` at the beginning of the file `filename`. - file_start_outgoing_migration: make sure that the file already exist if saving a checkpoint increment. - qmp_migrate: add 'chpt:' prefix for checkpointing into a file, add initialize checkpoint state data. - snapshot_reset_increments: reset the checkpoint increment counters. - struct CheckpointState checkpoint_file_state[]: global array structure storing the meta data related to each of the current incremental checkpoint. - struct CheckpointState checkpoint_state: global structure storing the current state of the checkpointing. Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> migration/checkpoint: extends the snapshot_thread This patch extends the `snapshot_thread` function, which leads the live and incremental memory checkpointing. If the `live` migration capability is enabled, this snapshot_thread will run in parallel with the VM execution. The guest RAM will be write-protected, so that we can ensure that the pages touched by the guest system are copied to shadow memory before being modified. If the `incremental` migration capability is enabled, then ... 1a. if this is the first checkpoint, then a full memory checkpoint will be carried out. 1b. if this is not the first checkpoint, only the pages marked as dirty will be saved to disk. 2. at the end of the checkpoint, dirty page (write-protection) tracking will be enabled, in order to track the pages modified by the guest system. The page protection is handled inside `postcopy-ram.c`, which encapsulates the calls to UserfaultFD. Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> migration/checkpoint: add RAM checkpointing mechanisms This patch extends the RAM migration mechanisms to support live and incremental checkpointing. RAM checkpointing relies on a set of per-page atomic flags: - dirty: true if the page has been modified since the previous checkpoint, and hence needs to be saved in the next one. - sent: true if the page has already been saved to disk, as part of the currently ongoing checkpoint - under processing: flag used as mutex to ensure that a page cannot be at the same time copied to shadow memory and copied to disk (this can only happen during a incremental live checkpoint). Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> migration/checkpoint: introduce incremental checkpoint reloading This patch introduces the functions required to reload incremental checkpoints: - qemu_start_incoming_migration: add prefix 'chpt' to incoming migrations to trigger a checkpoint file reloading - file_start_incoming_checkpoint_reload: starts a "checkpoint file reloading" incoming migration. If the filename starts with "<n:int>:", then only the first n increments will be reloaded (`reload_stop_at` field of the `checkpoint_state` global state). - file_get_checkpoint_fd: returns a new file descriptor (`dup`licated from the main one), pointing at the beginning of the incremental migration stream to reload (`snapshot_number` parameter). - checkpoint_load_metadata: checks the checkpoint metadata magic number and loads the checkpoint metadata, at the beginning of the checkpoint file. - process_incoming_migration_co: updated to allow reloading multiple checkpoint increments, or a single/simple checkpoint if the checkpoint metadata magic number was not found. - incoming_migration_is_last_increment: indicate if the checkpoint increment currently being reloaded is the last one that will be loaded. - loadvm_load_checkpoint: triggers the actual reloading of a given checkpoint increment. - vmstate_load_state: updated to call `vmsd->post_load` only once, after the reloading of the last checkpoint increment. Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> migration/checkpoint: introduce checkpoint increment squashing This patch introduces the ability to squash multiple checkpoint increments into a single one. It is an extension of the checkpoint reloading incoming migration, using the prefix "chpt:squash[:N]", where N is the number of checkpoint increments to consider. Checkpoint squashing works by performing a normal checkpoint reload, but without restarting the VM after its completion. Instead, when the reloading has succeeded, a new, full, checkpoint is performed, creating a single checkpoint file. Once this full checkpoint has completed, Qemu exits. The goal of checkpoint squashing is to reduce disk size, as incremental checkpoints may grow large over time. Besides, reloading a single checkpoint is necessarily faster than reloading multiple increments. - qemu_start_incoming_migration: set the flag `do_squash` if the migration prefix is `chpt:squash:`. - process_incoming_migration_bh: trigger a new checkpoint migration to create the single-increment snapshot file, and set up a timer that will wait for the completion of the migration and terminate Qemu. - hmp_migrate_status_cb: in checkpoint squashing, do not report the progress of disks or block migration; inform the user about the squash Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> migration/checkpoint: introduce periodic checkpointing This patch adds the ability to trigger periodic checkpoints. It should be used in conjunction with the 'incremental' capability, but this is not mandatory. Periodic checkpointing works by setting a Qemu timer that triggers a new checkpoint migration when it elapses. The timer is restarted when the checkpoint completes successfully. A new parameter (`period`) is added to the HMP `migrate` command. It takes as value the period (in x10s) at which the checkpoint will be performed. A value of 0 disables any ongoing periodic checkpointing. - hmp-commands.hx::migrate: add `period` parameter to the HMP `migrate` command. - hmp_migrate: Idem. - qapi/migration.json: Idem. - qmp_migrate: Idem, and correctly initialize or disable the periodic checkpointing. - PERIODIC_CHECKPOINT_UNIT: Defines the unit of the `period` parameter of the `migrate` command. Current value: 10s. - migration_state_notifier: New migration state change notifier, that restarts the periodic timer if the migration succeeded, or cleans up the periodic structures if it failed. - struct MigrationState: extended to store the migration parameters and the periodic checkpoint timer. - struct MigrationParams: new structure to store the migration parameters, so that we can restart it later with the same options. - periodic_snapshot_cb: Callback triggered after the periodic checkpoint timer elapsed. Triggers a new migration if the previous one completed successfully, or deletes the timer if it failed. - periodic_snapshot_setup: Saves the checkpoint arguments (to be able to trigger it again later) and sets the periodic checkpoint timer; or deletes it if the requested period is 0. Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> migration/checkpoint: introduce partial reloading This patch improves the reloading of incremental checkpoints, by only reloading the RAM section of the incremental state (except for the last increment). - qemu_savevm_state_iterate: save the offset of the checkpoint file where the RAM begins. - qemu_loadvm_section_start_full: after having reloaded the RAM, if this is not the last checkpoint increment, interrupt the reloading (return `-EINTR`). - process_incoming_migration_co: detect that the reloading was interrupted because the RAM section has been reloaded (`err == -EINTR`). Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> [vosys] add debugging message commands Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> [vosys] add init hook Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> [vosys] add a VM id for Qemu: $VM_UID or $USER-$VM_ID Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> [vosys] force Qemu to name threads Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> [vosys] migration/checkpoint: introduce VOSYS_TRY_MMAP_SHADOW_COPY [optimization] Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> [vosys] migration/checkpoint: introduce checksum capability Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> [vosys] migration/checkpoint: add sanity checks Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> [vosys] migration/checkpoint: abort on sanity check failure [vosys] migration/checkpoint: add checkpoint statistics Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> [vosys] migration/checkpoint: add logging messages Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> [vosys] migration/checkpoint: checksum the pages when they are saved to disk [WARNING] This patch computes the checksum of the pages when they are being saved to disk. This is complementary to the ram_checksum capability, which computes the checksum when the checkpoint is requested, and compares it with the checksum when the VM is reloaded. This patch ensures that the RAM content saved to disk is identical to the RAM content at the time of the checkpoint request. However, for this capability to work with incremental checkpointing, we have to store the checksum for each of the RAM pages, and update it when performing an incremental checkpoint. WARNING: there is an 'off by one' error that appears sometimes so I comment out the abort() on failure ... Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> [vosys] migration: add set 'internal-dist' build parameters Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> [vosys] migration/checkpoint: allow switching between old/new UFFD Old is for Linux 4.4 Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> [vosys] migration/checkpoint: add CoW checkpointing capability Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> [vosys] migration/checkpoint: add FORCE_CONT_AFTER_RELOAD for FORTH Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> [vosys] makefile: add libfuse compilation Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> [vosys] migration: add QFS virtual fuse filesystem Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> [vosys] migration/checkpoint: add default value for checkpoint destination Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> [vosys] migration: introduce guest-inform module Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> [vosys] migration: add guest-inform binding to the QFS Signed-off-by:
Kevin Pouget <k.pouget@virtualopensystems.com> CONFIG_DEBUG_MUTEX
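The "add RAM checkpointing mechanisms" patch above describes three per-page atomic flags (dirty, sent, under processing), the last one acting as a mutex between the shadow-copy path and the save-to-disk path. A minimal sketch of that idea with C11 atomics follows. Every name in it (`ToyPage`, `TOY_PAGE_*`, `toy_page_*`) is hypothetical and chosen for illustration; the patch's actual data layout and helpers are not shown on this page, so this is not the author's implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bits, one set per guest page. */
#define TOY_PAGE_DIRTY 0x1u  /* modified since the previous checkpoint */
#define TOY_PAGE_SENT  0x2u  /* already saved in the ongoing checkpoint */
#define TOY_PAGE_BUSY  0x4u  /* "under processing": acts as a try-lock */

typedef struct {
    atomic_uint flags;
} ToyPage;

/* Try to take the per-page lock; returns false if someone else holds it. */
static bool toy_page_try_lock(ToyPage *p)
{
    unsigned old = atomic_fetch_or(&p->flags, TOY_PAGE_BUSY);
    return !(old & TOY_PAGE_BUSY);
}

/* Release the per-page lock; the caller must currently hold it. */
static void toy_page_unlock(ToyPage *p)
{
    assert(atomic_load(&p->flags) & TOY_PAGE_BUSY);
    atomic_fetch_and(&p->flags, ~TOY_PAGE_BUSY);
}

/* Fault-handler side: record that the guest wrote to the page. */
static void toy_page_mark_dirty(ToyPage *p)
{
    atomic_fetch_or(&p->flags, TOY_PAGE_DIRTY);
}

/*
 * Saver side: save the page at most once per checkpoint.
 * Returns true only if this call performed the save.
 */
static bool toy_page_save_once(ToyPage *p)
{
    if (!toy_page_try_lock(p)) {
        return false;  /* page is being shadow-copied or saved elsewhere */
    }
    bool saved = false;
    if (!(atomic_load(&p->flags) & TOY_PAGE_SENT)) {
        /* a real implementation would write the page content here */
        atomic_fetch_or(&p->flags, TOY_PAGE_SENT);
        saved = true;
    }
    toy_page_unlock(p);
    return saved;
}
```

In this sketch a saver that loses the race simply skips the page; whether the real code retries, queues the page, or spins is not stated in the commit message.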
-
- 11 Mar, 2019 1 commit
-
-
zhanghailiang authored
KP: Update for Qemu master branch, as of 2019-03-07 ------------------------------ postcopy/migration: Split fault related state into struct UserfaultState Split fault related state from MigrationIncomingState struct, and put them all into a new struct UserfaultState. We will add this state into struct MigrationState in later patch. We also fix some helper functions to use the new type. Signed-off-by:
zhanghailiang <zhang.zhanghailiang@huawei.com> ------------------------------ migration: Allow the migrate command to work on file: urls Usage: (qemu) migrate file:/path/to/vm_statefile Signed-off-by:
zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by:
Benoit Canet <benoit.canet@gmail.com> ------------------------------ migration: Allow -incoming to work on file: urls Usage: -incoming file:/path/to/vm_statefile Signed-off-by:
zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by:
Benoit Canet <benoit.canet@gmail.com> ------------------------------ migration: Create a snapshot thread to realize saving memory snapshot If users use migrate file:url command, we consider it as creating live memory snapshot command. Besides, we only support tcg accel for now. Signed-off-by:
zhanghailiang <zhang.zhanghailiang@huawei.com> ------------------------------ migration: implement initialization work for snapshot We re-use some migration helper functions to implement the setup work for the snapshot. Besides, we need to do some initialization work (for example, saving the VM's device state) while the VM is paused. Signed-off-by:
zhanghailiang <zhang.zhanghailiang@huawei.com> savevm: Split qemu_savevm_state_complete_precopy() into two helper functions We split qemu_savevm_state_complete_precopy() into two helper functions, qemu_savevm_section_full() and qemu_savevm_section_end(). The main reason for doing so is that we may sometimes want to perform these two tasks separately. Signed-off-by:
zhanghailiang <zhang.zhanghailiang@huawei.com> ------------------------------ snapshot: Save VM's device state into snapshot file For a live memory snapshot, we want to capture the VM's state at the time the snapshot command is received. So we need to save the VM's static state (here, the VM's device state) at the beginning of snapshot_thread(), but we can't do that while the VM is running. Besides, we can't save the device state into the snapshot file directly: because we want to re-use the migration's incoming process for snapshots, we need to keep the save sequence. So here, we save the VM's device state into a qsb temporarily in the SETUP stage while the VM is stopped, and save it into the snapshot file after we finish saving the VM's live state. Signed-off-by:
zhanghailiang <zhang.zhanghailiang@huawei.com>
-
- 06 Mar, 2019 38 commits
-
-
Peter Maydell authored
Machine queue, 2019-03-06 * qdev: Hotplug handler chaining (David Hildenbrand) * qdev: fix qbus_is_full() (Tony Krowiak) * hostmem: fix crash when querying empty host-nodes property via QMP (Igor Mammedov) # gpg: Signature made Wed 06 Mar 2019 18:39:29 GMT # gpg: using RSA key 2807936F984DC5A6 # gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>" [full] # Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6 * remotes/ehabkost/tags/machine-next-pull-request: qdev: Provide qdev_get_bus_hotplug_handler() qdev: Let machine hotplug handler to override bus hotplug handler qdev: Let the hotplug_handler_unplug() caller delete the device hostmem: fix crash when querying empty host-nodes property via QMP qdev/core: fix qbus_is_full() Signed-off-by:
Peter Maydell <peter.maydell@linaro.org>
-
David Hildenbrand authored
Let's use a wrapper instead of looking it up manually. This function can then be reused when we explicitly want to have the bus hotplug handler (e.g. when the bus hotplug handler was overwritten by the machine hotplug handler). Reviewed-by:
Igor Mammedov <imammedo@redhat.com> Signed-off-by:
David Hildenbrand <david@redhat.com> Message-Id: <20190228122849.4296-4-david@redhat.com> Signed-off-by:
Eduardo Habkost <ehabkost@redhat.com>
-
Igor Mammedov authored
it will allow to return another hotplug handler than the default one for a specific bus based device type. Which is needed to handle non trivial plug/unplug sequences that need the access to resources configured outside of bus where device is attached. That will allow for returned hotplug handler to orchestrate wiring in arbitrary order, by chaining other hotplug handlers when it's needed. PS: It could be used for hybrid virtio-mem and virtio-pmem devices where it will return machine as hotplug handler which will do necessary wiring at machine level and then pass control down the chain to bus specific hotplug handler. Example of top level hotplug handler override and custom plug sequence: some_machine_get_hotplug_handler(machine){ if (object_dynamic_cast(OBJECT(dev), TYPE_SOME_BUS_DEVICE)) { return HOTPLUG_HANDLER(machine); } return NULL; } some_machine_device_plug(hotplug_dev, dev) { if (object_dynamic_cast(OBJECT(dev), TYPE_SOME_BUS_DEVICE)) { /* do machine specific initialization */ some_machine_init_special_device(dev) /* pass control to bus specific handler */ hotplug_handler_plug(dev->parent_bus->hotplug_handler, dev) } } Reviewed-by:
David Gibson <david@gibson.dropbear.id.au> Signed-off-by:
Igor Mammedov <imammedo@redhat.com> Signed-off-by:
David Hildenbrand <david@redhat.com> Message-Id: <20190228122849.4296-3-david@redhat.com> Signed-off-by:
Eduardo Habkost <ehabkost@redhat.com>
-
David Hildenbrand authored
When unplugging a device, at one point the device will be destroyed via object_unparent(). This will, on the one hand, unrealize the removed device hierarchy, and on the other hand, destroy/free the device hierarchy. When chaining hotplug handlers, we want to overwrite a bus hotplug handler by the machine hotplug handler, to be able to perform some part of the plug/unplug and to forward the calls to the bus hotplug handler. For now, the bus hotplug handler would trigger an object_unparent(), not allowing us to perform some unplug action on a device after we forwarded the call to the bus hotplug handler. The device would be gone at that point. machine_unplug_handler(dev) /* eventually do unplug stuff */ bus_unplug_handler(dev) /* dev is gone, we can't do more unplug stuff */ So move the object_unparent() to the original caller of the unplug. For now, keep the unrealize() at the original places of the object_unparent(). For implicitly chained hotplug handlers (e.g. pc code calling acpi hotplug handlers), the object_unparent() has to be done by the outermost caller. So when calling hotplug_handler_unplug() from inside an unplug handler, nothing is to be done. hotplug_handler_unplug(dev) -> calls machine_unplug_handler() machine_unplug_handler(dev) { /* eventually do unplug stuff */ bus_unplug_handler(dev) -> calls unrealize(dev) /* we can do more unplug stuff but device already unrealized */ } object_unparent(dev) In the long run, every unplug action should be factored out of the unrealize() function into the unplug handler (especially for PCI). Then we can get rid of the additional unrealize() calls and object_unparent() will properly unrealize the device hierarchy after the device has been unplugged.
hotplug_handler_unplug(dev) -> calls machine_unplug_handler() machine_unplug_handler(dev) { /* eventually do unplug stuff */ bus_unplug_handler(dev) -> only unplugs, does not unrealize /* we can do more unplug stuff */ } object_unparent(dev) -> will unrealize The original approach was suggested by Igor Mammedov for the PCI part, but I extended it to all hotplug handlers. I consider this one step into the right direction. To summarize: - object_unparent() on synchronous unplugs is done by common code -- "Caller of hotplug_handler_unplug" - object_unparent() on asynchronous unplugs ("unplug requests") has to be done manually -- "Caller of hotplug_handler_unplug" Reviewed-by:
Igor Mammedov <imammedo@redhat.com> Acked-by:
Cornelia Huck <cohuck@redhat.com> Signed-off-by:
David Hildenbrand <david@redhat.com> Message-Id: <20190228122849.4296-2-david@redhat.com> Reviewed-by:
Greg Kurz <groug@kaod.org> Signed-off-by:
Eduardo Habkost <ehabkost@redhat.com>
-
Igor Mammedov authored
QEMU crashes with qapi/qobject-output-visitor.c:210: qobject_output_complete: Assertion `qov->root && ((&qov->stack)->slh_first == ((void *)0))' failed when trying to get the value of hostmem's "host-nodes" property while it is not set. In that case the HostMemoryBackend::host_nodes bitmap doesn't have any bits set in it, which leads to find_first_bit() returning MAX_NODES and consequently to an early return from host_memory_backend_get_host_nodes() without calling the visitor. Fix it by calling the visitor even if the "host-nodes" property wasn't set, so that the property getter returns a valid empty list before exiting. Signed-off-by:
Igor Mammedov <imammedo@redhat.com> Message-Id: <20190214105733.25643-1-imammedo@redhat.com> Reviewed-by:
Eric Blake <eblake@redhat.com> Reviewed-by:
Stefano Garzarella <sgarzare@redhat.com> Signed-off-by:
Eduardo Habkost <ehabkost@redhat.com>
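The "host-nodes" fix above comes down to one rule: a list getter must hand a list to its visitor even when the list is empty. The sketch below illustrates that rule with a toy bitmap and a toy visitor. All names (`ToyVisitor`, `toy_*`, `TOY_MAX_NODES`) are hypothetical stand-ins, not QEMU's visitor API, and the buggy variant is a simplified reconstruction of the early return described in the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_NODES 8u  /* hypothetical bitmap width */

/* Toy visitor: records whether it was called and what list it received. */
typedef struct {
    int calls;
    size_t count;
    unsigned values[TOY_MAX_NODES];
} ToyVisitor;

static void toy_visit_list(ToyVisitor *v, const unsigned *vals, size_t n)
{
    assert(n <= TOY_MAX_NODES);
    v->calls++;
    v->count = n;
    for (size_t i = 0; i < n; i++) {
        v->values[i] = vals[i];
    }
}

/* First set bit at or after 'start', or TOY_MAX_NODES if there is none. */
static unsigned toy_find_next_bit(unsigned bitmap, unsigned start)
{
    for (unsigned i = start; i < TOY_MAX_NODES; i++) {
        if (bitmap & (1u << i)) {
            return i;
        }
    }
    return TOY_MAX_NODES;
}

/* Fixed getter: the visitor is always called, possibly with an empty list. */
static void toy_get_host_nodes(unsigned bitmap, ToyVisitor *v)
{
    unsigned nodes[TOY_MAX_NODES];
    size_t n = 0;

    for (unsigned bit = toy_find_next_bit(bitmap, 0);
         bit < TOY_MAX_NODES;
         bit = toy_find_next_bit(bitmap, bit + 1)) {
        nodes[n++] = bit;
    }
    toy_visit_list(v, nodes, n);
}

/* Buggy getter: returns early on an empty bitmap, so no list is produced. */
static void toy_get_host_nodes_buggy(unsigned bitmap, ToyVisitor *v)
{
    if (toy_find_next_bit(bitmap, 0) == TOY_MAX_NODES) {
        return;
    }
    toy_get_host_nodes(bitmap, v);
}
```

With an empty bitmap, the fixed getter reports one visitor call carrying zero elements, while the buggy one reports no call at all, which is the state the output visitor's assertion rejects.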
-
Tony Krowiak authored
The qbus_is_full(BusState *bus) function (qdev_monitor.c) compares the max_index value of the BusState structure with the max_dev value of the BusClass structure to determine whether the maximum number of children has been reached for the bus. The problem is, the max_index field of the BusState structure does not necessarily reflect the number of devices that have been plugged into the bus. Whenever a child device is plugged into the bus, the bus's max_index value is assigned to the child device and then incremented. If the child is subsequently unplugged, the value of the max_index does not change and no longer reflects the number of children. When the bus's max_index value reaches the maximum number of devices allowed for the bus (i.e., the max_dev field in the BusClass structure), attempts to plug another device will be rejected claiming that the bus is full -- even if the bus is actually empty. To resolve the problem, a new 'num_children' field is being added to the BusState structure to keep track of the number of children plugged into the bus. It will be incremented when a child is plugged, and decremented when a child is unplugged. Signed-off-by:
Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by:
Halil Pasic <pasic@linux.ibm.com> Message-Id: <1545062250-7573-1-git-send-email-akrowiak@linux.ibm.com> Reviewed-by:
Igor Mammedov <imammedo@redhat.com> Signed-off-by:
Eduardo Habkost <ehabkost@redhat.com>
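The qbus_is_full() problem described above can be reproduced with a small model: an index that only ever grows versus a counter that follows plug and unplug. The types and helpers below (`ToyBusClass`, `ToyBus`, `toy_bus_*`) are simplified, hypothetical stand-ins for BusClass and BusState, written only to show why the counter gives the right answer; they are not the QEMU code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for BusClass. */
typedef struct {
    int max_dev;       /* maximum number of children; 0 means no limit */
} ToyBusClass;

/* Hypothetical, simplified stand-in for BusState. */
typedef struct {
    const ToyBusClass *klass;
    int max_index;     /* next index to hand out; never decremented */
    int num_children;  /* children currently plugged */
} ToyBus;

/* Plug a child: hand out the next index and count the child. */
static int toy_bus_plug(ToyBus *bus)
{
    bus->num_children++;
    return bus->max_index++;
}

/* Unplug a child: only the live count goes down, max_index stays put. */
static void toy_bus_unplug(ToyBus *bus)
{
    assert(bus->num_children > 0);
    bus->num_children--;
}

/* Old check: compares the ever-growing index with the limit. */
static bool toy_bus_is_full_old(const ToyBus *bus)
{
    return bus->klass->max_dev && bus->max_index >= bus->klass->max_dev;
}

/* Fixed check: compares the number of plugged children with the limit. */
static bool toy_bus_is_full(const ToyBus *bus)
{
    return bus->klass->max_dev && bus->num_children >= bus->klass->max_dev;
}
```

Plugging two devices into a bus limited to two and then unplugging both leaves `max_index` at 2, so the old check still reports a full bus, while the counter-based check reports room for two devices.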
-
Peter Maydell authored
Migration pull 2019-03-06 (This replaces the pull sent yesterday) a) 4 small fixes including the cancel problem that caused the ahci migration test to fail intermittently b) Yury's ignore-shared feature c) Juan's extra tests d) Wei Wang's free page hinting e) Some Colo fixes from Zhang Chen Diff from yesterday's pull: 1) A missing fix of mine (cleanup during exit) 2) Changes from Eric/Markus on 'Create socket-address parameter' # gpg: Signature made Wed 06 Mar 2019 11:39:53 GMT # gpg: using RSA key 0516331EBC5BFDE7 # gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>" [full] # Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A 9FA9 0516 331E BC5B FDE7 * remotes/dgilbert/tags/pull-migration-20190306a: (22 commits) qapi/migration.json: Remove a variable that doesn't exist in example Migration/colo.c: Make COLO node running after failover Migration/colo.c: Fix double close bug when occur COLO failover virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT migration/ram.c: add the free page optimization enable flag migration/ram.c: add a notifier chain for precopy migration: API to clear bits of guest free pages from the dirty bitmap migration: use bitmap_mutex in migration_bitmap_clear_dirty bitmap: bitmap_count_one_with_offset bitmap: fix bitmap_count_one tests: Add basic migration precopy tcp test migration: Create socket-address parameter tests: Add migration xbzrle test migration: Add capabilities validation tests/migration-test: Add a test for ignore-shared capability migration: Add an ability to ignore shared RAM blocks migration: Introduce ignore-shared capability exec: Change RAMBlockIterFunc definition migration/rdma: clang compilation fix migration: Cleanup during exit ... Signed-off-by:
Peter Maydell <peter.maydell@linaro.org>
-
Peter Maydell authored
trivial patches pull request (20190206) - acpi: remove unused functions/variables - tests: remove useless architecture checks - some typo fixes and documentation update - flash_cfi02: fix memory leak # gpg: Signature made Wed 06 Mar 2019 11:05:12 GMT # gpg: using RSA key F30C38BD3F2FBE3C # gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" [full] # gpg: aka "Laurent Vivier <laurent@vivier.eu>" [full] # gpg: aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" [full] # Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F 5173 F30C 38BD 3F2F BE3C * remotes/vivier2/tags/trivial-branch-pull-request: thunk: fix of malloc to g_new hostmem-file: simplify ifdef-s in file_backend_memory_alloc() build: Correct explanation of unnest-vars example bswap: Fix accessors syntax in comment doc: fix typos for documents in tree block/pflash_cfi02: Fix memory leak and potential use-after-free hw/acpi: remove unnecessary variable acpi_table_builtin hw/acpi: remove unused function acpi_table_add_builtin() hw/i386/pc.c: remove unused function pc_acpi_init() tests: Remove (mostly) useless architecture checks Signed-off-by:
Peter Maydell <peter.maydell@linaro.org>
-
Zhang Chen authored
Remove the "active" variable in example for query-colo-status. It is a doc bug from commit f56c0065 Signed-off-by:
Zhang Chen <chen.zhang@intel.com> Reviewed-by:
Eric Blake <eblake@redhat.com> Message-Id: <20190303145021.2962-6-chen.zhang@intel.com> Signed-off-by:
Dr. David Alan Gilbert <dgilbert@redhat.com>
-
Zhang Chen authored
Delay to close COLO for auto start VM after failover. Signed-off-by:
Zhang Chen <chen.zhang@intel.com> Reviewed-by:
Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190303145021.2962-4-chen.zhang@intel.com> Signed-off-by:
Dr. David Alan Gilbert <dgilbert@redhat.com>
-
Zhang Chen authored
migration_incoming_state_destroy() checks mis->to_src_file and closes it, which results in mis->to_src_file being closed twice when a COLO failover occurs. Signed-off-by:
Zhang Chen <chen.zhang@intel.com> Reviewed-by:
Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190303145021.2962-2-chen.zhang@intel.com> Signed-off-by:
Dr. David Alan Gilbert <dgilbert@redhat.com>
-
Wei Wang authored
The new feature enables the virtio-balloon device to receive hints of guest free pages from the free page vq. A notifier is registered to the migration precopy notifier chain. The notifier calls free_page_start after the migration thread syncs the dirty bitmap, so that the free page optimization starts to clear bits of free pages from the bitmap. It calls free_page_stop before the migration thread syncs the bitmap, which is the end of the current round of ram save. free_page_stop is also called to stop the optimization when an error occurs in the process of ram saving. Note: balloon will report pages which were free at the time of this call. As the reporting happens asynchronously, dirty bit logging must be enabled before this free_page_start call is made. Guest reporting must be disabled before the migration dirty bitmap is synchronized. Signed-off-by:
Wei Wang <wei.w.wang@intel.com> CC: Michael S. Tsirkin <mst@redhat.com> CC: Dr. David Alan Gilbert <dgilbert@redhat.com> CC: Juan Quintela <quintela@redhat.com> CC: Peter Xu <peterx@redhat.com> Message-Id: <1544516693-5395-8-git-send-email-wei.w.wang@intel.com> Reviewed-by:
Michael S. Tsirkin <mst@redhat.com> Signed-off-by:
Dr. David Alan Gilbert <dgilbert@redhat.com> dgilbert: Dropped kernel header update, fixed up CMD_ID_* name change
-
Wei Wang authored
This patch adds the free page optimization enable flag, and a function to set this flag. When the free page optimization is enabled, not all pages need to be sent in the bulk stage. Why use a new flag instead of directly disabling ram_bulk_stage when the optimization is running? Thanks to Peter Xu for the reminder that disabling ram_bulk_stage affects the use of compression. Please see save_page_use_compression. When xbzrle and compression are used, if the free page optimization causes ram_bulk_stage to be disabled, save_page_use_compression will return false, which disables the use of compression. That is, if free page optimization avoids the sending of half of the guest pages, the other half of the pages loses the benefits of compression in the meantime. Using a new flag to let migration_bitmap_find_dirty skip the free pages in the bulk stage avoids the above issue. Signed-off-by:
Wei Wang <wei.w.wang@intel.com> CC: Dr. David Alan Gilbert <dgilbert@redhat.com> CC: Juan Quintela <quintela@redhat.com> CC: Michael S. Tsirkin <mst@redhat.com> CC: Peter Xu <peterx@redhat.com> Message-Id: <1544516693-5395-7-git-send-email-wei.w.wang@intel.com> Reviewed-by:
Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by:
Dr. David Alan Gilbert <dgilbert@redhat.com>
-
Wei Wang authored
This patch adds a notifier chain for the memory precopy. This enables various precopy optimizations to be invoked at specific places. Signed-off-by:
Wei Wang <wei.w.wang@intel.com> CC: Dr. David Alan Gilbert <dgilbert@redhat.com> CC: Juan Quintela <quintela@redhat.com> CC: Michael S. Tsirkin <mst@redhat.com> CC: Peter Xu <peterx@redhat.com> Reviewed-by:
Peter Xu <peterx@redhat.com> Message-Id: <1544516693-5395-6-git-send-email-wei.w.wang@intel.com> Signed-off-by:
Dr. David Alan Gilbert <dgilbert@redhat.com>
-
Wei Wang authored
This patch adds an API to clear bits corresponding to guest free pages from the dirty bitmap. Split the free page block if it crosses the QEMU RAMBlock boundary. Signed-off-by:
Wei Wang <wei.w.wang@intel.com> CC: Dr. David Alan Gilbert <dgilbert@redhat.com> CC: Juan Quintela <quintela@redhat.com> CC: Michael S. Tsirkin <mst@redhat.com> CC: Peter Xu <peterx@redhat.com> Reviewed-by:
Peter Xu <peterx@redhat.com> Message-Id: <1544516693-5395-5-git-send-email-wei.w.wang@intel.com> Signed-off-by:
Dr. David Alan Gilbert <dgilbert@redhat.com>
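The boundary split described in this commit can be illustrated with a minimal sketch. It is not QEMU's code: `Region` is a hypothetical stand-in for a RAMBlock, and `split_at_boundaries` is an invented helper that cuts one address range into pieces that each stay inside a single region, so that each piece could then be cleared in that region's own bitmap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a RAMBlock: one contiguous address range. */
typedef struct {
    uint64_t start;
    uint64_t len;
} Region;

/*
 * Split [addr, addr + len) at region boundaries. Each piece written to
 * 'out' lies entirely inside one region. Returns the number of pieces;
 * stops early if an address is not covered or 'out' is full.
 */
static size_t split_at_boundaries(const Region *regions, size_t nregions,
                                  uint64_t addr, uint64_t len,
                                  Region *out, size_t out_cap)
{
    size_t n = 0;

    assert(regions != NULL && out != NULL);
    while (len > 0 && n < out_cap) {
        const Region *r = NULL;

        /* Find the region that contains the current address. */
        for (size_t i = 0; i < nregions; i++) {
            if (addr >= regions[i].start &&
                addr - regions[i].start < regions[i].len) {
                r = &regions[i];
                break;
            }
        }
        if (r == NULL) {
            break; /* address not backed by any region */
        }

        /* Take as much as fits before the end of this region. */
        uint64_t room = r->len - (addr - r->start);
        uint64_t chunk = len < room ? len : room;

        out[n].start = addr;
        out[n].len = chunk;
        n++;
        addr += chunk;
        len -= chunk;
    }
    return n;
}
```

With regions covering [0, 100) and [100, 150), a free range of 30 bytes starting at 90 comes back as two pieces: 10 bytes at 90 and 20 bytes at 100.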
-
Wei Wang authored
The bitmap mutex is used to synchronize threads to update the dirty bitmap and the migration_dirty_pages counter. For example, the free page optimization clears bits of free pages from the bitmap in an iothread context. This patch makes migration_bitmap_clear_dirty update the bitmap and counter under the mutex. Signed-off-by:
Wei Wang <wei.w.wang@intel.com> CC: Dr. David Alan Gilbert <dgilbert@redhat.com> CC: Juan Quintela <quintela@redhat.com> CC: Michael S. Tsirkin <mst@redhat.com> CC: Peter Xu <peterx@redhat.com> Reviewed-by:
Peter Xu <peterx@redhat.com> Message-Id: <1544516693-5395-4-git-send-email-wei.w.wang@intel.com> Signed-off-by:
Dr. David Alan Gilbert <dgilbert@redhat.com>
-
Wei Wang authored
Count the number of 1s in a bitmap starting from an offset. Signed-off-by:
Wei Wang <wei.w.wang@intel.com> CC: Dr. David Alan Gilbert <dgilbert@redhat.com> CC: Juan Quintela <quintela@redhat.com> CC: Michael S. Tsirkin <mst@redhat.com> Reviewed-by:
Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <1544516693-5395-3-git-send-email-wei.w.wang@intel.com> Signed-off-by:
Dr. David Alan Gilbert <dgilbert@redhat.com>
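As a rough illustration of counting set bits from an offset, here is a minimal sketch. The name `count_ones_from` and the bit-by-bit loop are assumptions made for readability; the patch itself may well work on whole words instead.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical helper: count the bits set in [offset, offset + nbits)
 * of a bitmap stored as an array of unsigned long, lowest bit first.
 */
static long count_ones_from(const unsigned long *map, size_t offset,
                            size_t nbits)
{
    long count = 0;

    assert(map != NULL);
    for (size_t i = offset; i < offset + nbits; i++) {
        if (map[i / BITS_PER_WORD] & (1UL << (i % BITS_PER_WORD))) {
            count++;
        }
    }
    return count;
}
```

The loop is deliberately simple; a production version would mask the first and last words and use a population count on the words in between.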
-
Wei Wang authored
BITMAP_LAST_WORD_MASK(nbits) returns 0xffffffff when "nbits=0", which makes bitmap_count_one fail to handle the "nbits=0" case. It appears to be preferred to keep BITMAP_LAST_WORD_MASK identical to the kernel implementation that it is ported from. So this patch fixes bitmap_count_one to handle the nbits=0 case. Initial Discussion Link: https://www.mail-archive.com/qemu-devel@nongnu.org/msg554316.html Signed-off-by:
Wei Wang <wei.w.wang@intel.com> CC: Juan Quintela <quintela@redhat.com> CC: Dr. David Alan Gilbert <dgilbert@redhat.com> CC: Peter Xu <peterx@redhat.com> Message-Id: <1544516693-5395-2-git-send-email-wei.w.wang@intel.com> Reviewed-by:
Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by:
Dr. David Alan Gilbert <dgilbert@redhat.com>
-
Juan Quintela authored
Not sharing code from precopy/unix because we have to read back the tcp parameter. Signed-off-by:
Juan Quintela <quintela@redhat.com> Reviewed-by:
Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by:
Peter Xu <peterx@redhat.com> Reviewed-by:
Thomas Huth <thuth@redhat.com> Message-Id: <20190227105128.1655-4-quintela@redhat.com> Signed-off-by:
Dr. David Alan Gilbert <dgilbert@redhat.com> dgilbert: Fixup for clash with Yury's
-
Juan Quintela authored
It will be used to store the uri parameters. We want this only for tcp, so we don't set it for other uris. We need it to know which port the migration is running on. Reviewed-by:
Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by:
Juan Quintela <quintela@redhat.com> Signed-off-by:
Dr. David Alan Gilbert <dgilbert@redhat.com> dgilbert: Removed DummyStruct as suggested by Eric & Markus --
-
Juan Quintela authored
Reviewed-by:
Peter Xu <peterx@redhat.com> Signed-off-by:
Juan Quintela <quintela@redhat.com> Message-Id: <20190227105128.1655-2-quintela@redhat.com> Signed-off-by:
Dr. David Alan Gilbert <dgilbert@redhat.com> dgilbert: Fixup for clash with Yury's series
-
Yury Kotov authored
Currently we don't check which capabilities are set in the source QEMU. We just expect that the target QEMU has the same enabled capabilities. Add explicit validation for capabilities to make sure that the target VM has them too. This is enabled only for new capabilities to keep compatibility. Signed-off-by:
Yury Kotov <yury-kotov@yandex-team.ru> Message-Id: <20190215174548.2630-6-yury-kotov@yandex-team.ru> Reviewed-by:
Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by:
Dr. David Alan Gilbert <dgilbert@redhat.com> dgilbert: Manual merge
-
Yury Kotov authored
Signed-off-by:
Yury Kotov <yury-kotov@yandex-team.ru> Message-Id: <20190215174548.2630-5-yury-kotov@yandex-team.ru> Reviewed-by:
Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by:
Dr. David Alan Gilbert <dgilbert@redhat.com> dgilbert: Disabled the test for now, not happy on aarch64
-
Yury Kotov authored
If ignore-shared capability is set then skip shared RAMBlocks during the RAM migration. Also, move qemu_ram_foreach_migratable_block (and rename) to the migration code, because it requires access to the migration capabilities. Signed-off-by:
Yury Kotov <yury-kotov@yandex-team.ru> Message-Id: <20190215174548.2630-4-yury-kotov@yandex-team.ru> Reviewed-by:
Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by:
Dr. David Alan Gilbert <dgilbert@redhat.com>
-
Yury Kotov authored
We want to use local migration to update QEMU for running guests. In this case we don't need to migrate shared (file backed) RAM. So, add a capability to ignore such blocks during live migration. Signed-off-by:
Yury Kotov <yury-kotov@yandex-team.ru> Message-Id: <20190215174548.2630-3-yury-kotov@yandex-team.ru> Reviewed-by:
Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by:
Dr. David Alan Gilbert <dgilbert@redhat.com>
-
Yury Kotov authored
Currently, qemu_ram_foreach_* calls RAMBlockIterFunc with many block-specific arguments, but the iter func often needs the RAMBlock* itself. This refactoring is needed for fast access to RAMBlock flags from qemu_ram_foreach_block's callback. The only way to achieve this now is to call qemu_ram_block_from_host (which also enumerates blocks). So, this patch reduces complexity of qemu_ram_foreach_block() -> cb() -> qemu_ram_block_from_host() from O(n^2) to O(n). Fix RAMBlockIterFunc definition and add some functions to read RAMBlock* fields which were passed. Signed-off-by:
Yury Kotov <yury-kotov@yandex-team.ru> Message-Id: <20190215174548.2630-2-yury-kotov@yandex-team.ru> Reviewed-by:
Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by:
Dr. David Alan Gilbert <dgilbert@redhat.com>
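The shape of this refactoring can be sketched as follows. All names here (`Block`, `BlockIterFunc`, `foreach_block`, `BLOCK_SHARED`) are hypothetical and do not mirror QEMU's definitions; the point is only that the callback receives the block pointer directly and so never has to search the block list again.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SHARED 0x1u /* hypothetical flag bit */

/* Hypothetical stand-in for a RAMBlock. */
typedef struct Block {
    const char *name;
    unsigned flags;
    size_t used_length;
} Block;

/* The callback gets the block itself rather than its unpacked fields. */
typedef int (*BlockIterFunc)(Block *block, void *opaque);

/* Visit every block once; stop at the first non-zero return value. */
static int foreach_block(Block *blocks, size_t n, BlockIterFunc fn,
                         void *opaque)
{
    assert(fn != NULL);
    for (size_t i = 0; i < n; i++) {
        int ret = fn(&blocks[i], opaque);

        if (ret) {
            return ret;
        }
    }
    return 0;
}

/* Example callback: reads the flags in O(1), with no second lookup. */
static int count_shared(Block *block, void *opaque)
{
    if (block->flags & BLOCK_SHARED) {
        (*(size_t *)opaque)++;
    }
    return 0;
}
```

Because each callback invocation is constant time, a full walk is linear in the number of blocks, instead of quadratic when every callback has to enumerate the list to find its block.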
-
Marcel Apfelbaum authored
Configuring QEMU with: ../configure --cc=clang --enable-rdma Leads to compilation error: CC migration/rdma.o CC migration/block.o qemu/migration/rdma.c:3615:58: error: taking address of packed member 'rkey' of class or structure 'RDMARegisterResult' may result in an unaligned pointer value [-Werror,-Waddress-of-packed-member] (uintptr_t)host_addr, NULL, &reg_result->rkey, ^~~~~~~~~~~~~~~~ Fix it by using a temp local variable. Signed-off-by:
Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Message-Id: <20190304184923.24215-1-marcel.apfelbaum@gmail.com> Signed-off-by:
Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by:
Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by:
Philippe Mathieu-Daudé <philmd@redhat.com>
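A minimal sketch of the temp-variable pattern, assuming a GCC or clang toolchain for `__attribute__((packed))`. `PackedResult`, `fill_key` and `register_key` are invented for illustration and are not the QEMU structures or functions named in the error message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packed struct: 'rkey' sits at offset 1, so it is unaligned. */
typedef struct __attribute__((packed)) {
    uint8_t tag;
    uint32_t rkey;
} PackedResult;

/* Stand-in for an API that writes its result through an aligned pointer. */
static void fill_key(uint32_t *out)
{
    *out = 0xabcd1234u;
}

static void register_key(PackedResult *res)
{
    uint32_t tmp_rkey; /* properly aligned local */

    assert(res != NULL);
    /* Passing &res->rkey here is what -Waddress-of-packed-member rejects. */
    fill_key(&tmp_rkey);
    res->rkey = tmp_rkey; /* the compiler emits a safe unaligned store */
}
```

Assigning through the struct member lets the compiler generate an access that is valid for the packed layout, whereas a plain `uint32_t *` carries no such information.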
-
Dr. David Alan Gilbert authored
Currently we clean up the migration object as we exit main after the main_loop finishes; however if there's a migration running things get messy and we can end up with the migration thread still trying to access freed structures. We now take a ref to the object around the migration thread itself, so the act of dropping the ref during exit doesn't cause us to lose the state until the thread quits. Cancelling the migration during migration also tries to get the thread to quit. We do this a bit earlier; so hopefully migration gets out of the way before all the devices etc are freed. Signed-off-by:
Dr. David Alan Gilbert <dgilbert@redhat.com> Tested-by:
Alex Bennée <alex.bennee@linaro.org> Message-Id: <20190227164900.16378-1-dgilbert@redhat.com> Reviewed-by:
Juan Quintela <quintela@redhat.com> Reviewed-by:
Alex Bennée <alex.bennee@linaro.org> Signed-off-by:
Dr. David Alan Gilbert <dgilbert@redhat.com>
-
Dr. David Alan Gilbert authored
If the migration fails before the channel is open (e.g. a bad address) we end up in the cleanup with rdma->channel==NULL. Spotted by Coverity: CID 1398634 Fixes: fbbaacab Signed-off-by:
Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190214185351.5927-1-dgilbert@redhat.com> Signed-off-by:
Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by:
Peter Xu <peterx@redhat.com> Reviewed-by:
Philippe Mathieu-Daudé <philmd@redhat.com>
-
Dr. David Alan Gilbert authored
During a cancelled migration there's a race where the fd can go into an error state before we get back around the migration loop and migration_detect_error transitions from cancelling->failed. Check for cancelled/cancelling and don't change the state. Red Hat bug: https://bugzilla.redhat.com/show_bug.cgi?id=1608649 Fixes: b23c2ade Signed-off-by:
Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190219195928.12289-1-dgilbert@redhat.com> Signed-off-by:
Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by:
Peter Xu <peterx@redhat.com> Reviewed-by:
Juan Quintela <quintela@redhat.com>
-
Aarushi Mehta authored
Note that since thunking occurs throughout the lifetime of the QEMU instance, there is no matching 'free' to correct. Signed-off-by:
Aarushi Mehta <mehta.aaru20@gmail.com> Reviewed-by:
Eric Blake <eblake@redhat.com> Reviewed-by:
Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <5310bd5d152fa36c1828a7cbd19fc893739d1609.camel@gmail.com> Signed-off-by:
Laurent Vivier <laurent@vivier.eu>
-
Igor Mammedov authored
Clean up file_backend_memory_alloc() by using one CONFIG_POSIX ifdef instead of several within the function, to make it simpler to follow. Signed-off-by:
Igor Mammedov <imammedo@redhat.com> Suggested-by:
Wei Yang <richardw.yang@linux.intel.com> Reviewed-by:
Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190213123858.24620-1-imammedo@redhat.com> Signed-off-by:
Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20190214031004.32522-2-stefanha@redhat.com> [lv: s/hostmem/hostmem-file/] Signed-off-by:
Laurent Vivier <laurent@vivier.eu>
-
Markus Armbruster authored
Cc: Fam Zheng <fam@euphon.net> Signed-off-by:
Markus Armbruster <armbru@redhat.com> Message-Id: <20190213130240.15492-1-armbru@redhat.com> Signed-off-by:
Laurent Vivier <laurent@vivier.eu>
-
Greg Kurz authored
All accessors that have an endian infix DO have an underscore between {size} and {endian}. Signed-off-by:
Greg Kurz <groug@kaod.org> Reviewed-by:
Richard Henderson <richard.henderson@linaro.org> Reviewed-by:
Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <155119086741.1037569.12734854713022304642.stgit@bahia.lan> Signed-off-by:
Laurent Vivier <laurent@vivier.eu>
-
Like Xu authored
Signed-off-by:
Like Xu <like.xu@linux.intel.com> Reviewed-by:
Eric Blake <eblake@redhat.com> Message-Id: <1550640446-18788-1-git-send-email-like.xu@linux.intel.com> Signed-off-by:
Laurent Vivier <laurent@vivier.eu>
-
Stephen Checkoway authored
Don't dynamically allocate the pflash's timer. But do use timer_del in an unrealize function to make sure that the timer can't fire after the pflash_t has been freed. Signed-off-by:
Stephen Checkoway <stephen.checkoway@oberlin.edu> Reviewed-by:
Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by:
Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190219153727.62279-1-stephen.checkoway@oberlin.edu> Signed-off-by:
Laurent Vivier <laurent@vivier.eu>
-
Wei Yang authored
acpi_table_builtin is now always false, so it is not necessary to check it again. This patch just removes it. Signed-off-by:
Wei Yang <richardw.yang@linux.intel.com> Reviewed-by:
Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by:
Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by:
Igor Mammedov <imammedo@redhat.com> Message-Id: <20190214084939.20640-4-richardw.yang@linux.intel.com> Signed-off-by:
Laurent Vivier <laurent@vivier.eu>
-
Wei Yang authored
Function acpi_table_add_builtin() is not used anymore. Remove the definition and declaration. Signed-off-by:
Wei Yang <richardw.yang@linux.intel.com> Reviewed-by:
Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by:
Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by:
Igor Mammedov <imammedo@redhat.com> Message-Id: <20190214084939.20640-3-richardw.yang@linux.intel.com> Signed-off-by:
Laurent Vivier <laurent@vivier.eu>
-