Couldn't deleted snapshot data #7826
Comments
The issue is still occurring on the latest version. It seems to be a timing issue: the snapshot is still in use. After a few seconds the command can be run by hand. |
Thanks for your feedback @rtjdamen ! |
Hi @rtjdamen, after a careful analysis with the XCP-ng team, we found that this error is raised when the XAPI fails to unplug the VDI after a non-modifiable delay of 4s. We patched your installation with a retry on the XO side. If it's OK with you, we'll monitor tonight's jobs and see if it's enough to handle this edge case. Regards |
Sometimes the XAPI takes too long to detach the VDI. In this case the timeout is fixed at 4s and is not modifiable. When the timeout is reached, the XAPI raises a VDI_IN_USE error; this is an internal process of the XAPI. This commit adds a retry on the XO side to give the XAPI more room to work through this process, as XO already does when destroying a VDI. Fixes #7826
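For readers following along, here is a minimal sketch of the retry idea described in the commit message above. It is not the actual XO patch (XO is written in JavaScript); `XapiError`, `retry_on_vdi_in_use`, the retry count and the delay are all hypothetical names and values chosen for illustration only.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class XapiError(Exception):
    """Hypothetical stand-in for an error returned by the XAPI."""

    def __init__(self, code: str) -> None:
        super().__init__(code)
        self.code = code


def retry_on_vdi_in_use(
    fn: Callable[[], T],
    retries: int = 3,          # assumed value, not taken from the real patch
    delay_s: float = 5.0,      # assumed value, not taken from the real patch
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying only when it fails with VDI_IN_USE.

    Any other error, or a VDI_IN_USE error after the last allowed
    retry, is re-raised to the caller unchanged.
    """
    attempt = 0
    while True:
        try:
            return fn()
        except XapiError as err:
            if err.code != "VDI_IN_USE" or attempt >= retries:
                raise
            attempt += 1
            # Give the XAPI time to finish detaching the VDI.
            sleep(delay_s)


if __name__ == "__main__":
    state = {"calls": 0}

    def fake_data_destroy() -> str:
        # Simulates a VDI that is still in use on the first call only.
        state["calls"] += 1
        if state["calls"] == 1:
            raise XapiError("VDI_IN_USE")
        return "destroyed"

    print(retry_on_vdi_in_use(fake_data_destroy, delay_s=0.0))
```

Only VDI_IN_USE is retried here, so unrelated failures still surface immediately instead of being hidden behind the delay.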
Yes, no problem, we will keep an eye on it too! |
Hi, Can you patch our installation too ASAP ? Thanks ! |
@fbeauchamp Unfortunately it seems the snapshot data is still not destroyed in every case. I think the 4s is still too short. Maybe we need to increase it to 10s to start with? |
The patch has been redeployed on the proxy. Waiting for tonight's run to be sure. |
Seems like that did the trick! No more orphan VDIs this morning! Also no more VDI_IN_USE destroy messages. Are these related or not? |
Yes, because the VDIs were not deleted (VDI_IN_USE) and stayed as orphans. Now we purge them correctly. |
So 2 issues fixed! |
The issue is not completely resolved: the original fix solved it, but the version now active in XOA does not. |
@rtjdamen Are you sure both your XOA and your XO Proxies are up-to-date on latest channel? If they are, we need to take a look at them. |
According to the gui they are. |
@rtjdamen After looking at your infra, it seems that there is still a major improvement: there are very few VDI_IN_USE errors now 🙂 The only problem we saw comes from the fact that one of your VDIs is still attached to the control domain, and our XCP-ng team is still investigating this issue. We will continue to monitor this problem. |
Ok, hope they find a solution for that issue soon! |
I just checked, but the one that failed yesterday is not attached to the control domain, so this is incorrect. |
Are you using XOA or XO from the sources?
XOA
Which release channel?
latest
Provide your commit number
No response
Describe the bug
A CBT snapshot backup finishes with the warning "Couldn't deleted snapshot data"
Error message
To reproduce
Random behavior; it seems to be related to specific VMs, as it recurs on the same VMs every time
Expected behavior
If vdi.data-destroy fails, I would expect a retry; a manual retry does work. Maybe a timing issue?
Screenshots
No response
Node
18.20.2
Hypervisor
XCP-ng 8.2
Additional context
Happens only on some VMs