vtpm fix : use domain uuid instead of full domain name #4270

Merged
merged 1 commit into from
Sep 24, 2024

Member

@shjala shjala commented Sep 18, 2024

The domain name contains the UUID, version, and app number. The version number might change, so use only the
UUID part to preserve the vTPM state.
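As a rough illustration of the idea (not the code in this PR), the sketch below derives a stable key from a domain name. It assumes the name has the form `<uuid>.<version>.<appnum>`, as in the `7418b0e3-96e3-4e0f-b977-3720d7464f1c.1.1` names seen in the test logs. The helper name `uuidFromDomainName` is hypothetical, and only the standard library is used.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// uuidPattern matches the canonical 8-4-4-4-12 hexadecimal UUID text form.
var uuidPattern = regexp.MustCompile(
	`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`)

// uuidFromDomainName is a hypothetical helper. It returns the UUID part of a
// domain name that is assumed to look like "<uuid>.<version>.<appnum>".
func uuidFromDomainName(domainName string) (string, error) {
	// A canonical UUID contains no dots, so everything before the first dot
	// is the candidate UUID.
	candidate, _, found := strings.Cut(domainName, ".")
	if !found {
		return "", fmt.Errorf("domain name %q has no version suffix", domainName)
	}
	if !uuidPattern.MatchString(candidate) {
		return "", fmt.Errorf("domain name %q does not start with a UUID", domainName)
	}
	return strings.ToLower(candidate), nil
}

func main() {
	// The same app before and after a config update bumps the version.
	names := []string{
		"7418b0e3-96e3-4e0f-b977-3720d7464f1c.1.1",
		"7418b0e3-96e3-4e0f-b977-3720d7464f1c.2.1",
	}
	for _, name := range names {
		id, err := uuidFromDomainName(name)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		// Both names yield the same key, so state stored under it survives
		// a version change.
		fmt.Printf("%s -> %s\n", name, id)
	}
}
```

Keying the vTPM state on the returned value rather than on the full name means a version bump no longer looks like a new instance.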


This PR is based on an observation in @OhmSpectator's code comment:

each time the configuration is changed, it updates the domain name, increasing the counter after UUID

I'd be happy if someone could point me to the part of the code that actually changes the app version number; I couldn't find it myself.

@shjala shjala added the bug Something isn't working label Sep 18, 2024
Member

@OhmSpectator OhmSpectator left a comment

Lucky you, the tests are fine... I've been fixing kvm_test for an hour now =D

@OhmSpectator
Member

I'd be happy if someone could point me to the part of the code that actually changes the app number; I couldn't find it myself.

Config Num comes from the cloud. They increase it with each config update.

@OhmSpectator
Member

The virtualization test suite has failed three times. Strange.

@OhmSpectator
Member

Yeah, it looks like something is broken...

[stdout]
Docker app's state test
=== RUN   TestAppStatus
apps: '[eclient]' state: 'RUNNING' secs: 1800
time: 2024-09-19T02:04:00.913252133Z out: 	appName eclient state changed to UNKNOWN
time: 2024-09-19T02:04:03.570037979Z out: 	appName eclient state changed to RESOLVING_TAG
time: 2024-09-19T02:04:04.570522093Z out: 	appName eclient state changed to DOWNLOAD_STARTED
time: 2024-09-19T02:04:05.571844887Z out: 	appName eclient state changed to DOWNLOAD_STARTED (0%)
time: 2024-09-19T02:04:08.574364348Z out: 	appName eclient state changed to DOWNLOAD_STARTED (11%)
time: 2024-09-19T02:04:08.574380459Z out: 	appName eclient state changed to DOWNLOAD_STARTED (23%)
time: 2024-09-19T02:04:09.57494185Z out: 	appName eclient state changed to DOWNLOAD_STARTED (64%)
time: 2024-09-19T02:04:09.574978076Z out: 	appName eclient state changed to LOADING
time: 2024-09-19T02:04:10.576055852Z out: 	appName eclient state changed to CREATING_VOLUME
time: 2024-09-19T02:04:14.578651073Z out: 	appName eclient state changed to INSTALLED
time: 2024-09-19T02:04:15.579136424Z out: 	appName eclient state changed to BOOTING
time: 2024-09-19T02:05:54.653641582Z out: 	appName eclient state changed to HALTING: [description:"Giving up waiting to connect to QEMU Monitor Protocol socket /run/hypervisor/kvm/7418b0e3-96e3-4e0f-b977-3720d7464f1c.1.1/qmp from VM 7418b0e3-96e3-4e0f-b977-3720d7464f1c.1.1, error: [attempt 1] qmp status failed for QMP socket '/run/hypervisor/kvm/7418b0e3-96e3-4e0f-b977-3720d7464f1c.1.1/qmp': err: 'dial unix /run/hypervisor/kvm/7418b0e3-96e3-4e0f-b977-3720d7464f1c.1.1/qmp: connect: connection refused'; (JSON response: ''); [attempt 2] qmp status failed for QMP socket '/run/hypervisor/kvm/7418b0e3-96e3-4e0f-b977-3720d7464f1c.1.1/qmp': err: 'dial unix /run/hypervisor/kvm/7418b0e3-96e3-4e0f-b977-3720d7464f1c.1.1/qmp: connect: connection refused'; (JSON response: ''); [attempt 3] qmp status failed for QMP socket '/run/hypervisor/kvm/7418b0e3-96e3-4e0f-b977-3720d7464f1c.1.1/qmp': err: 'dial unix /run/hypervisor/kvm/7418b0e3-96e3-4e0f-b977-3720d7464f1c.1.1/qmp: connect: connection refused'; (JSON response: '')" timestamp:{seconds:1726711530 nanos:262492703} severity:SEVERITY_ERROR]
time: 2024-09-19T02:15:47.104054215Z out: 	appName eclient state changed to BOOTING
time: 2024-09-19T02:17:26.189625019Z out: 	appName eclient state changed to HALTING: [description:"Giving up waiting to connect to QEMU Monitor Protocol socket /run/hypervisor/kvm/7418b0e3-96e3-4e0f-b977-3720d7464f1c.1.1/qmp from VM 7418b0e3-96e3-4e0f-b977-3720d7464f1c.1.1, error: [attempt 1] qmp status failed for QMP socket '/run/hypervisor/kvm/7418b0e3-96e3-4e0f-b977-3720d7464f1c.1.1/qmp': err: 'dial unix /run/hypervisor/kvm/7418b0e3-96e3-4e0f-b977-3720d7464f1c.1.1/qmp: connect: connection refused'; (JSON response: ''); [attempt 2] qmp status failed for QMP socket '/run/hypervisor/kvm/7418b0e3-96e3-4e0f-b977-3720d7464f1c.1.1/qmp': err: 'dial unix /run/hypervisor/kvm/7418b0e3-96e3-4e0f-b977-3720d7464f1c.1.1/qmp: connect: connection refused'; (JSON response: ''); [attempt 3] qmp status failed for QMP socket '/run/hypervisor/kvm/7418b0e3-96e3-4e0f-b977-3720d7464f1c.1.1/qmp': err: 'dial unix /run/hypervisor/kvm/7418b0e3-96e3-4e0f-b977-3720d7464f1c.1.1/qmp: connect: connection refused'; (JSON response: '')" timestamp:{seconds:1726712221 nanos:781452343} severity:SEVERITY_ERROR]
time: 2024-09-19T02:27:09.631665355Z out: 	appName eclient state changed to BOOTING
time: 2024-09-19T02:28:48.707869707Z out: 	appName eclient state changed to HALTING: [description:"Giving up waiting to connect to QEMU Monitor Protocol socket /run/hypervisor/kvm/7418b0e3-96e3-4e0f-b977-3720d7464f1c.1.1/qmp from VM 7418b0e3-96e3-4e0f-b977-3720d7464f1c.1.1, error: [attempt 1] qmp status failed for QMP socket '/run/hypervisor/kvm/7418b0e3-96e3-4e0f-b977-3720d7464f1c.1.1/qmp': err: 'dial unix /run/hypervisor/kvm/7418b0e3-96e3-4e0f-b977-3720d7464f1c.1.1/qmp: connect: connection refused'; (JSON response: ''); [attempt 2] qmp status failed for QMP socket '/run/hypervisor/kvm/7418b0e3-96e3-4e0f-b977-3720d7464f1c.1.1

@eriknordmark
Contributor

I'd be happy if someone could point me to the part of the code that actually changes the app number; I couldn't find it myself.

Config Num comes from the cloud. They increase it with each config update.

It is the version number (and not the app number) that changes, right? If so, it makes sense to update the PR description.

if err != nil {
	return fmt.Errorf("failed to extract UUID from domain name: %v", err)
}
return requestVtpmLaunch(domainUUID.String(), wk, swtpmTimeout)
Contributor

Nit, but if the argument type was uuid.UUID and not a string then the compiler would help check.
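To make the nit concrete, here is a minimal sketch of how a typed parameter lets the compiler reject a raw domain name. It is not the PR's code: `UUID` is a local stand-in for a library type such as `uuid.UUID`, and `parseUUID` and `launchVtpm` are hypothetical names used only for illustration.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strings"
)

// UUID is a local stand-in for a library type such as uuid.UUID.
type UUID [16]byte

// String renders the UUID in canonical 8-4-4-4-12 form.
func (u UUID) String() string {
	h := hex.EncodeToString(u[:])
	return h[0:8] + "-" + h[8:12] + "-" + h[12:16] + "-" + h[16:20] + "-" + h[20:32]
}

// parseUUID converts canonical UUID text into the typed value.
func parseUUID(s string) (UUID, error) {
	var u UUID
	if len(s) != 36 || s[8] != '-' || s[13] != '-' || s[18] != '-' || s[23] != '-' {
		return u, fmt.Errorf("invalid UUID %q", s)
	}
	raw, err := hex.DecodeString(strings.ReplaceAll(s, "-", ""))
	if err != nil || len(raw) != 16 {
		return u, fmt.Errorf("invalid UUID %q", s)
	}
	copy(u[:], raw)
	return u, nil
}

// launchVtpm is a hypothetical typed variant of a launch request. Because it
// takes a UUID rather than a string, a full domain name cannot be passed by
// mistake.
func launchVtpm(id UUID) string {
	return "launch " + id.String()
}

func main() {
	id, err := parseUUID("7418b0e3-96e3-4e0f-b977-3720d7464f1c")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(launchVtpm(id))

	// The following line would not compile, since a string is not a UUID:
	//   launchVtpm("7418b0e3-96e3-4e0f-b977-3720d7464f1c.1.1")
}
```

With a string parameter, the same mistake would only surface at run time, if at all.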

Member

@OhmSpectator OhmSpectator left a comment

Run Eden

@OhmSpectator
Member

@shjala, will you take a look at why the same Eden test fails all the time?

@shjala
Member Author

shjala commented Sep 23, 2024

@shjala, will you take a look at why the same Eden test fails all the time?

All are green except virtualization. Do you mean that one, or is there more?

@OhmSpectator
Member

Yeah, I mean that one. I don't see the same test failing in other PRs all the time. Maybe it's just a coincidence, but it's better to check.

Domain name contains UUID, version and app number,
the version number might get changed, so use only the
UUID part to preserve the vtpm state.

Signed-off-by: Shahriyar Jalayeri <[email protected]>
@shjala
Member Author

shjala commented Sep 23, 2024

@OhmSpectator should be fixed now, rerun?

@OhmSpectator
Member

All green! What exactly have you done? Just rebase?

@shjala
Member Author

shjala commented Sep 23, 2024

All green! What exactly have you done? Just rebase?

Eden rerun is appreciated.

@eriknordmark eriknordmark merged commit 115f553 into lf-edge:master Sep 24, 2024
43 checks passed
Labels
bug Something isn't working
3 participants