Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CDI importer converting image which is already RAW #3457

Open
waldner opened this issue Oct 11, 2024 · 9 comments
Open

CDI importer converting image which is already RAW #3457

waldner opened this issue Oct 11, 2024 · 9 comments
Assignees
Labels
kind/bug lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.

Comments

@waldner
Copy link

waldner commented Oct 11, 2024

What happened:

When creating a DV using an HTTP URL as source to import an image in RAW format, the importer runs a conversion pass on it. The relevant log lines:

...
I1011 13:00:14.111593       1 prometheus.go:78] 99.40
I1011 13:00:15.111696       1 prometheus.go:78] 99.80
I1011 13:00:16.111891       1 prometheus.go:78] 100.00
I1011 13:00:16.638904       1 data-processor.go:247] New phase: Convert
I1011 13:00:16.638960       1 data-processor.go:253] Validating image
E1011 13:00:16.655763       1 prlimit.go:156] failed to kill the process; os: process already finished
I1011 13:00:16.655975       1 qemu.go:354] Adding preallocation method: [-o preallocation=falloc]
I1011 13:00:16.656036       1 qemu.go:358] Attempting preallocation method, qemu-img args: [convert -o preallocation=falloc -t writeback -p -O raw /scratch/tmpimage /data/disk.img]
I1011 13:00:16.668451       1 qemu.go:273] 0.00
I1011 13:00:18.530061       1 qemu.go:273] 1.00
...

As said, the image is already RAW; the import over http is already extremely slow despite the http server being located on the same lan; it'd be better not to add yet even more extra time for a no-op conversion to the VM start time.

What you expected to happen:

The mage should be used as-is, or there should be a way to tell the importer to skip the (no-op) conversion.

Additional context:

Here's the DV template from the VM definition:

    dataVolumeTemplates:
    - metadata:
        creationTimestamp: null
        name: myvm-cdisk
      spec:
        preallocation: true
        pvc:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 50Gi
        source:
          http:
            url: http://10.11.12.10:9999/assets/rawimg.img

Environment:

  • CDI version (use kubectl get deployments cdi-deployment -o yaml): v1.60.3
  • Kubernetes version (use kubectl version): 1.31.1
  • DV specification: see above
@awels
Copy link
Member

awels commented Oct 11, 2024

If the source is already RAW, then the conversion is essentially a no-op, and it is just copying the data.

@waldner
Copy link
Author

waldner commented Oct 11, 2024

Sure, but it still wastes time before the VM can start. Since the initial image is rather large, the (already extremely slow) process of importing it becomes even slower.
Also I might be misremembering, but I think this was not happening in older versions.

@waldner
Copy link
Author

waldner commented Oct 14, 2024

In fact, why is a conversion needed at all? Qemu and libvirt can work with different image types, so even if the imported image were (say) QCOW2 it could still be used as-is.
What is the rationale behind the conversion process? I think the user should be given more control over it.

@awels
Copy link
Member

awels commented Oct 14, 2024

So we do the conversion because we want all the images to be raw. This allows us to directly use built in kubernetes APIs and worry about different formats. This also allows us to use the built in capabilities of the kubernets platform (snapshots, etc). We could use qcow2 format most likely, but now we need a different approach when we want to use that with block devices. It just makes everything more complex. This approach has served us really well and we are unlikely to diverge.

@waldner
Copy link
Author

waldner commented Oct 14, 2024

thanks for the explanation; still, I maintain that a raw-to-raw conversion is unneeded and can delay the VM startup by a few minutes if the image is big.
I don't know the storage details, but it might also waste space if both images need to be stored at the same time, at least for the duration of the conversion (I might be wrong on this last one though).

@awels
Copy link
Member

awels commented Oct 14, 2024

I agree with you, I just double checked the code as I was like, there is no reason to not write directly to the target once we determine this is a raw file. But in the code we write to scratch space first and then write to the target. I have to go look up the history of why it is like this. There might be some validation on sizes and stuff that we can only get from writing to scratch space first and then copying the data.

@aglitke
Copy link
Member

aglitke commented Oct 21, 2024

One reason we do the conversion unconditionally is so that we can respect the sparseness of the image on the target PVC. In the past we did not have a sparse writer and the so the only way to do this was with explicit conversion. Now @mhenriks has provided a sparse writer so it could be possible to write directly to the target while respecting sparseness. Alvaro is going to take a look and see if this can be done...

/assign @alromeros

@waldner
Copy link
Author

waldner commented Oct 21, 2024

Thanks. Either way, I would give the user more power to decide, if they want to trade optimizations for VM startup speed, they should be able to do it.

@kubevirt-bot
Copy link
Contributor

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@kubevirt-bot kubevirt-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 19, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.
Projects
None yet
Development

No branches or pull requests

5 participants