This repository has been archived by the owner on Jan 28, 2023. It is now read-only.

Fixed PAE issues #152

Merged
merged 2 commits into from
Jan 15, 2019

@nevilad nevilad commented Jan 8, 2019

Fixed two bugs

  1. PDPTEs were written to the VMCS even when they had not been read after the last VM exit. Since CR3 writes are not intercepted, a guest in PAE mode can change CR3, which makes the processor load new PDPTE values without updating the cached pae_pdpte array. At the next CR0/CR4 write where is_ept_pae in exit_cr_access is false, the new PDPTE values are not read, and the old ones are written to the PDPTE fields in the VMCS. This crashes the guest at the next instruction. Intercepting CR3 writes is also a feasible solution, but it carries a performance penalty due to the many additional VM exits.
  2. CR3 in PAE mode is 32-byte aligned. pw_perform_page_walk passed the page (pdpt_gpa >> PG_ORDER_4K) to gpa_space_map_page and got the hva of the page where CR3 points, not the hva of the gpa in CR3. Fixed by calculating the offset of the CR3 value from the start of its page and adding it to the page hva. This fix is only in the CONFIG_HAX_EPT2 branch and was tested only in pure PAE mode, not in LME (the modified code affects that branch too).

Both fixes were compiled only for Windows and were tested on a Win10 x64 host with a Win7 x32 guest. I was able to run the guest, so this fix may also fix #118.

@raphaelning raphaelning left a comment

Thanks for fixing these bugs and submitting the PR! Please let me know if you agree with the slightly different solutions I proposed. In addition, please revise the commit message of each patch:

  1. We require a Signed-off-by: line at the end: https://github.com/intel/haxm/blob/master/CONTRIBUTING.md
  2. The subject line needs to be concise:
    • For the first patch, how about "Avoid flushing outdated PDPTE0..3 cache to VMCS"?
    • For the second patch, how about "page_walker: Fix PAE PDPT pointer calculation"?
    • There are some helpful guidelines for writing a good commit message here: https://chris.beams.io/posts/git-commit/

pdpt_gpa >> PG_ORDER_4K,
&pdpt_kmap, NULL);

uint pdpt_offset_on_page = ((uint)pdpt_gpa) & ((1 << PG_ORDER_4K) - 1);
Contributor

nit: Instead of modifying the CONFIG_HAX_EPT2 branch, how about postponing the offset fix until after line 671:

// Quoting original code from line 671
#endif // CONFIG_HAX_EPT2
        if (pdpt_hva == NULL) {
            retval = TF_FAILED;
            goto out;
        }

        if (!is_lme) {
            // In PAE paging mode, pdpt_gpa is 32-byte aligned, not 4KB-aligned
            pdpt_hva += (uint)(pdpt_gpa & (PAGE_SIZE_4K - 1));
        }

The !is_lme check is optional, but makes sure the new logic is only applied to PAE Paging mode.
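
To make the arithmetic concrete, here is a minimal standalone sketch (not HAXM code) of how a 32-byte-aligned PDPT address splits into a 4KB page frame number and a byte offset within that page. The macro definitions, the helper names, and the sample address are assumptions for illustration; only the names PG_ORDER_4K and PAGE_SIZE_4K come from the code quoted above.

```c
#include <assert.h>
#include <stdint.h>

#define PG_ORDER_4K  12                   /* assumed value: log2(4096) */
#define PAGE_SIZE_4K (1u << PG_ORDER_4K)  /* assumed value: 4096 bytes */

/* Hypothetical helper: frame number of the 4KB guest page holding the PDPT. */
static uint64_t pdpt_gfn(uint64_t pdpt_gpa)
{
    return pdpt_gpa >> PG_ORDER_4K;
}

/* Hypothetical helper: byte offset of the PDPT inside its 4KB page. */
static uint32_t pdpt_page_offset(uint64_t pdpt_gpa)
{
    /* In PAE paging mode the PDPT base is 32-byte aligned. */
    assert((pdpt_gpa & 0x1f) == 0);
    return (uint32_t)(pdpt_gpa & (PAGE_SIZE_4K - 1));
}

/* Hypothetical helper: turn the mapped page address into the PDPT address. */
static uint8_t *pdpt_hva_from_page(uint8_t *page_hva, uint64_t pdpt_gpa)
{
    return page_hva + pdpt_page_offset(pdpt_gpa);
}
```

With a made-up pdpt_gpa of 0x1a8020, the frame number is 0x1a8 and the offset is 0x20, so the PDPT starts 32 bytes into the mapped page rather than at its start. Using the page address alone would read the wrong four entries.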

Contributor Author

I agree, pointer modifications should be done after checking the returned value.

core/vcpu.c Outdated
@@ -79,7 +79,7 @@ static void vcpu_prepare(struct vcpu_t *vcpu);
 static void vcpu_init_emulator(struct vcpu_t *vcpu);

 static void vmread_cr(struct vcpu_t *vcpu);
-static void vmwrite_cr(struct vcpu_t *vcpu);
+static void vmwrite_cr(struct vcpu_t *vcpu, bool can_update_pdpte);
Contributor

Adding an explicit Boolean switch works, but I think it slightly hurts code readability: people might be looking at all the vmwrite_cr(vcpu, false); calls and wonder what the second parameter is about. Since the sole purpose is to hack into the function and skip a few lines of PAE-specific code, I'd like to propose an alternative approach that preserves the signature of vmwrite_cr().

The idea is to move the Boolean to struct vcpu_t (core/include/vcpu.h), and make sure it's properly updated. In fact, vcpu_t already contains an unnamed bit field, where some of the existing bits serve a similar purpose:

// Quoting from vcpu.h, line 181
    struct {
        uint64_t paused                          : 1;
        uint64_t panicked                        : 1;
        // ...
        uint64_t debug_control_dirty             : 1;
        uint64_t dr_dirty                        : 1;
        uint64_t rflags_dirty                    : 1;
        uint64_t rip_dirty                       : 1;
        uint64_t fs_base_dirty                   : 1;
        uint64_t interruptibility_dirty          : 1;
        uint64_t pcpu_ctls_dirty                 : 1;
        uint64_t pae_pdpt_dirty                  : 1;
        uint64_t padding                         : 45;
    };

Then, we set pae_pdpt_dirty bit after updating vcpu_t::pae_pdptes[] (in vcpu_prepare_pae_pdpt()), and reset it after flushing these dirty values to VMCS.
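
The set-then-flush protocol described here can be sketched with a toy model. This is not HAXM code: the toy_ structures and functions are hypothetical stand-ins for vcpu_t, the VMCS, vcpu_prepare_pae_pdpt() and the flush in vmwrite_cr(), and plain memory copies stand in for VMWRITE.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for the four GUEST_PDPTEn fields of the VMCS. */
struct toy_vmcs {
    uint64_t guest_pdpte[4];
};

/* Toy stand-in for the relevant parts of vcpu_t. */
struct toy_vcpu {
    uint64_t pae_pdptes[4];       /* software cache of the PDPTEs */
    unsigned pae_pdpt_dirty : 1;  /* set when the cache is newer than the VMCS */
};

/* Models the CR0/CR4 write path: reload the cache and mark it dirty. */
static void toy_prepare_pae_pdpt(struct toy_vcpu *vcpu, const uint64_t fresh[4])
{
    assert(vcpu != NULL && fresh != NULL);
    memcpy(vcpu->pae_pdptes, fresh, sizeof(vcpu->pae_pdptes));
    vcpu->pae_pdpt_dirty = 1;
}

/* Models the flush before VM entry: write only when the cache is dirty. */
static void toy_flush_pdptes(struct toy_vcpu *vcpu, struct toy_vmcs *vmcs)
{
    assert(vcpu != NULL && vmcs != NULL);
    if (!vcpu->pae_pdpt_dirty)
        return;  /* the VMCS copy is current; the cache may be stale */
    memcpy(vmcs->guest_pdpte, vcpu->pae_pdptes, sizeof(vmcs->guest_pdpte));
    vcpu->pae_pdpt_dirty = 0;
}
```

In this model, a flush without a preceding toy_prepare_pae_pdpt() call leaves the VMCS values untouched, which is the behavior the first fix needs after a guest CR3 write.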

Contributor Author

For me both solutions are feasible and depend on the code style of the project. For many vmcs field modifications haxm uses dirty bits paradigm, but there are counterexamples - vmwrite_cr directly modifies SECONDARY_PROCESSOR_CONTROLS (PRIMARY_PROCESSOR_CONTROLS are modified using pcpu_ctls_dirty flag).

Contributor

Yes, #117 added pcpu_ctls_dirty along with several other dirty flags. It only covered the VMCS fields that would benefit a lot (in terms of performance) from delayed VMWRITE. There's also #130 (still WIP) which aims to convert more VMCS fields to use the cached VMREAD / delayed VMWRITE paradigm. So I think the "style" is moving towards more prevalent use of dirty bits, and adding pae_pdpt_dirty would be consistent with that.

core/vcpu.c Outdated
vmwrite(vcpu, GUEST_PDPTE1, vcpu->pae_pdptes[1]);
vmwrite(vcpu, GUEST_PDPTE2, vcpu->pae_pdptes[2]);
vmwrite(vcpu, GUEST_PDPTE3, vcpu->pae_pdptes[3]);
if( can_update_pdpte ) {
Contributor

With the alternative approach I proposed above, we just need a simple check here:

        if (vcpu->pae_pdpt_dirty) {
            // vcpu_prepare_pae_pdpt() has updated vcpu->pae_pdptes
            // Note that because we do not monitor guest writes to CR3, the only
            // case where vcpu->pae_pdptes is newer than VMCS GUEST_PDPTE{0..3}
            // is following a guest write to CR0 or CR4 that requires PDPTEs to
            // be reloaded, i.e. the pae_pdpt_dirty case. When the guest is in
            // PAE paging mode but !pae_pdpt_dirty, VMCS GUEST_PDPTE{0..3} are
            // already up-to-date following each VM exit (see Intel SDM Vol. 3C
            // 27.3.4), and we must not overwrite them with our cached values
            // (vcpu->pae_pdptes), which may be outdated.
            vmwrite(vcpu, GUEST_PDPTE0, vcpu->pae_pdptes[0]);
            vmwrite(vcpu, GUEST_PDPTE1, vcpu->pae_pdptes[1]);
            vmwrite(vcpu, GUEST_PDPTE2, vcpu->pae_pdptes[2]);
            vmwrite(vcpu, GUEST_PDPTE3, vcpu->pae_pdptes[3]);
            vcpu->pae_pdpt_dirty = 0;
        }

Note that I've also removed the outdated TODO comment and added a new comment.

Contributor Author

I agree. Interception of CR3 modifications complicates the solution and should be avoided.

@raphaelning
Contributor

BTW, how did you test the patches? We'd like to add PAE paging mode to our test coverage. If you could share your guest image with us, that would be great. Otherwise, we'll try to build a Linux i386 kernel with PAE enabled.

@nevilad
Contributor Author

nevilad commented Jan 9, 2019

I didn't configure my Windows guest to use PAE. I was even surprised that PAE mode was on, because I allocated only 600 MB of RAM to my VM. I read that Windows can decide to use PAE automatically, based on some internal logic. This means that my image isn't suitable for a test harness.
There were no special tests: I simply ran the guest, got a triple fault, found and fixed the first bug, ran again, got a vcpu_translate error in an ept_violation VM exit (due to MMIO), fixed it too, ran again and was able to log in and work in the guest.

@nevilad
Contributor Author

nevilad commented Jan 9, 2019

The subject line needs to be concise:
For the first patch, how about "Avoid flushing outdated PDPTE0..3 cache to VMCS"?
For the second patch, how about "page_walker: Fix PAE PDPT pointer calculation"?
OK.

@raphaelning
Contributor

I didn't configure my Windows guest to use PAE.

Ah, sorry, I overlooked the part about your use of a 32-bit Win7 guest. So you have enabled Windows to boot on HAXM for the first time, that's truly awesome!

#118 is about booting an x64 Windows guest, but even if it's still broken (likely due to lack of CR8 virtualization, which is only relevant to x64), we know we are now a lot closer to getting it working :)

@maronz

maronz commented Jan 10, 2019

I just wanted to chime in with my observations of a quick test of 88ce709 on a 32-bit Ubuntu 16.04 (Xenial) system: I was able to boot a Windows 7 (SP1) x86 system on QEMU (v3.0.0) with '-accel hax'. A further test of a Windows 8.1 x86 system was also successful.

The same can't be said when I then moved onto Windows 10 (1809), but this is still quite some progress. And of course any attempts using x64 boot media ended up in failure (e.g. 'VCPU shutdown request')

@raphaelning
Contributor

@maronz Thanks, that's great news! We're really looking forward to merging this PR.

@nevilad Please let me know if you have time to address the review comments. I can revise the code and update your branch, but I can't complete the DCO sign-off procedure on your behalf.

@HaHoYou
Contributor

HaHoYou commented Jan 11, 2019

FYI, it still fails with Windows 10. Here is the log from launching a Windows 10 32-bit ISO:
$ log show --predicate 'sender == "intelhaxm"' --style syslog --last 20m
Filtering the log data using "sender == "intelhaxm""
Skipping info and debug messages, pass --info and/or --debug to include.
Timestamp (process)[PID]
2019-01-11 13:26:08.116556+0800 localhost kernel[0]: (intelhaxm) haxm_warn: hax_vm_ioctl: Got HAX_VM_IOCTL_NOTIFY_QEMU_VERSION, pid=889 ('qemu-system-x86_')
2019-01-11 13:26:08.124255+0800 localhost kernel[0]: (intelhaxm) cvcpu vmid 0 vcpu_id 0
2019-01-11 13:29:19.100928+0800 localhost kernel[0]: (intelhaxm) haxm_panic: Unhandled vmx vmexit reason:9
2019-01-11 13:29:19.100933+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 4000 VMX_PIN_CONTROLS: 1f
2019-01-11 13:29:19.100937+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 4002 VMX_PRIMARY_PROCESSOR_CONTROLS: 960061fa
2019-01-11 13:29:19.100940+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 401e VMX_SECONDARY_PROCESSOR_CONTROLS: aa
2019-01-11 13:29:19.100943+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 4004 VMX_EXCEPTION_BITMAP: 40000
2019-01-11 13:29:19.100946+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 4006 VMX_PAGE_FAULT_ERROR_CODE_MASK: 0
2019-01-11 13:29:19.100949+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 4008 VMX_PAGE_FAULT_ERROR_CODE_MATCH: 0
2019-01-11 13:29:19.100952+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 400c VMX_EXIT_CONTROLS: 236fff
2019-01-11 13:29:19.100955+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 400e VMX_EXIT_MSR_STORE_COUNT: 0
2019-01-11 13:29:19.100958+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 4010 VMX_EXIT_MSR_LOAD_COUNT: 0
2019-01-11 13:29:19.100961+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 4012 VMX_ENTRY_CONTROLS: 11ff
2019-01-11 13:29:19.100964+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 4014 VMX_ENTRY_MSR_LOAD_COUNT: 0
2019-01-11 13:29:19.100967+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 4016 VMX_ENTRY_INTERRUPT_INFO: 8
2019-01-11 13:29:19.100970+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 4018 VMX_ENTRY_EXCEPTION_ERROR_CODE: 0
2019-01-11 13:29:19.100973+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 401a VMX_ENTRY_INSTRUCTION_LENGTH: 0
2019-01-11 13:29:19.100976+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 401c VMX_TPR_THRESHOLD: 0
2019-01-11 13:29:19.100978+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6000 VMX_CR0_MASK:
2019-01-11 13:29:19.100981+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6002 VMX_CR4_MASK:
2019-01-11 13:29:19.100984+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6004 VMX_CR0_READ_SHADOW: 80000033
2019-01-11 13:29:19.100987+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6006 VMX_CR4_READ_SHADOW: 220
2019-01-11 13:29:19.100990+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 400a VMX_CR3_TARGET_COUNT: 0
2019-01-11 13:29:19.100993+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6008 VMX_CR3_TARGET_VAL_BASE: 0
2019-01-11 13:29:19.100995+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 0000 VMX_VPID: 1
2019-01-11 13:29:19.100998+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 2000 VMX_IO_BITMAP_A: 105eb000
2019-01-11 13:29:19.101001+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 2002 VMX_IO_BITMAP_B: 105ea000
2019-01-11 13:29:19.101004+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 2004 VMX_MSR_BITMAP: 105e9000
2019-01-11 13:29:19.101013+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 2006 VMX_EXIT_MSR_STORE_ADDRESS: 0
2019-01-11 13:29:19.101016+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 2008 VMX_EXIT_MSR_LOAD_ADDRESS: 0
2019-01-11 13:29:19.101019+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 200a VMX_ENTRY_MSR_LOAD_ADDRESS: 0
2019-01-11 13:29:19.101023+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 2010 VMX_TSC_OFFSET: ffffeee965be579f
2019-01-11 13:29:19.101025+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 2012 VMX_VAPIC_PAGE: 0
2019-01-11 13:29:19.101028+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 2014 VMX_APIC_ACCESS_PAGE: 0
2019-01-11 13:29:19.101031+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 201a VMX_EPTP: 104dc01e
2019-01-11 13:29:19.101034+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 482e VMX_PREEMPTION_TIMER: 0
2019-01-11 13:29:19.101037+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 4400 VMX_INSTRUCTION_ERROR_CODE: 0
2019-01-11 13:29:19.101040+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 4402 VM_EXIT_INFO_REASON: 9
2019-01-11 13:29:19.101042+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 4404 VM_EXIT_INFO_INTERRUPT_INFO: 0
2019-01-11 13:29:19.101046+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 4406 VM_EXIT_INFO_EXCEPTION_ERROR_CODE: 0
2019-01-11 13:29:19.101049+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 4408 VM_EXIT_INFO_IDT_VECTORING: 80000008
2019-01-11 13:29:19.101052+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 440a VM_EXIT_INFO_IDT_VECTORING_ERROR_CODE: 0
2019-01-11 13:29:19.101055+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 440c VM_EXIT_INFO_INSTRUCTION_LENGTH: 3
2019-01-11 13:29:19.101058+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 440e VM_EXIT_INFO_INSTRUCTION_INFO: 0
2019-01-11 13:29:19.101062+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6400 VM_EXIT_INFO_QUALIFICATION: c0000050
2019-01-11 13:29:19.101066+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6402 VM_EXIT_INFO_IO_ECX: 31f
2019-01-11 13:29:19.101070+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6404 VM_EXIT_INFO_IO_ESI: 31f
2019-01-11 13:29:19.101074+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6406 VM_EXIT_INFO_IO_EDI:
2019-01-11 13:29:19.101078+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6408 VM_EXIT_INFO_IO_EIP: 8102edd4
2019-01-11 13:29:19.101082+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 640a VM_EXIT_INFO_GUEST_LINEAR_ADDRESS: 0
2019-01-11 13:29:19.101087+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 2400 VM_EXIT_INFO_GUEST_PHYSICAL_ADDRESS: 15a00000
2019-01-11 13:29:19.101090+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6c16 HOST_RIP:
2019-01-11 13:29:19.101092+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6c14 HOST_RSP:
2019-01-11 13:29:19.101094+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6c00 HOST_CR0: 80010033
2019-01-11 13:29:19.101097+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6c02 HOST_CR3: 25faa7111
2019-01-11 13:29:19.101100+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6c04 HOST_CR4: 3626e0
2019-01-11 13:29:19.101102+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 0c02 HOST_CS_SELECTOR: 8
2019-01-11 13:29:19.101105+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 0c06 HOST_DS_SELECTOR: 0
2019-01-11 13:29:19.101107+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 0c00 HOST_ES_SELECTOR: 0
2019-01-11 13:29:19.101110+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 0c08 HOST_FS_SELECTOR: 0
2019-01-11 13:29:19.101112+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 0c0a HOST_GS_SELECTOR: 0
2019-01-11 13:29:19.101115+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 0c04 HOST_SS_SELECTOR: 10
2019-01-11 13:29:19.101117+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 0c0c HOST_TR_SELECTOR: 40
2019-01-11 13:29:19.101120+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6c06 HOST_FS_BASE: 0
2019-01-11 13:29:19.101122+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6c08 HOST_GS_BASE:
2019-01-11 13:29:19.101125+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6c0a HOST_TR_BASE: fffffd000006c3c0
2019-01-11 13:29:19.101129+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6c0c HOST_GDTR_BASE: fffffd000006c320
2019-01-11 13:29:19.101132+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6c0e HOST_IDTR_BASE: fffffd0000007000
2019-01-11 13:29:19.101134+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 4c00 HOST_SYSENTER_CS: b
2019-01-11 13:29:19.101137+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6c10 HOST_SYSENTER_ESP: fffffd000006c630
2019-01-11 13:29:19.101141+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6c12 HOST_SYSENTER_EIP: fffffd0000094740
2019-01-11 13:29:19.101143+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 681e GUEST_RIP: 811ec30a
2019-01-11 13:29:19.101146+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6820 GUEST_RFLAGS: 200246
2019-01-11 13:29:19.101149+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 681c GUEST_RSP: 80f89d18
2019-01-11 13:29:19.101152+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6800 GUEST_CR0: 80010033
2019-01-11 13:29:19.101157+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6802 GUEST_CR3: 1a8000
2019-01-11 13:29:19.101160+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6804 GUEST_CR4: 2660
2019-01-11 13:29:19.101163+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 0800 GUEST_ES_SELECTOR: 23
2019-01-11 13:29:19.101166+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 0802 GUEST_CS_SELECTOR: 8
2019-01-11 13:29:19.101168+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 0804 GUEST_SS_SELECTOR: 10
2019-01-11 13:29:19.101171+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 0806 GUEST_DS_SELECTOR: 23
2019-01-11 13:29:19.101175+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 0808 GUEST_FS_SELECTOR: 30
2019-01-11 13:29:19.101178+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 080a GUEST_GS_SELECTOR: 0
2019-01-11 13:29:19.101181+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 080c GUEST_LDTR_SELECTOR: 0
2019-01-11 13:29:19.101183+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 080e GUEST_TR_SELECTOR: 28
2019-01-11 13:29:19.101186+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 4814 GUEST_ES_AR: c0f3
2019-01-11 13:29:19.101188+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 4816 GUEST_CS_AR: c09b
2019-01-11 13:29:19.101191+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 4818 GUEST_SS_AR: c093
2019-01-11 13:29:19.101193+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 481a GUEST_DS_AR: c0f3
2019-01-11 13:29:19.101196+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 481c GUEST_FS_AR: 4093
2019-01-11 13:29:19.101198+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 481e GUEST_GS_AR: 1c000
2019-01-11 13:29:19.101201+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 4820 GUEST_LDTR_AR: 1c000
2019-01-11 13:29:19.101203+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 4822 GUEST_TR_AR: 8b
2019-01-11 13:29:19.101205+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6806 GUEST_ES_BASE: 0
2019-01-11 13:29:19.101209+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6808 GUEST_CS_BASE: 0
2019-01-11 13:29:19.101212+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 680a GUEST_SS_BASE: 0
2019-01-11 13:29:19.101216+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 680c GUEST_DS_BASE: 0
2019-01-11 13:29:19.101219+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 680e GUEST_FS_BASE: 80ee6000
2019-01-11 13:29:19.101221+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6810 GUEST_GS_BASE: 0
2019-01-11 13:29:19.101224+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6812 GUEST_LDTR_BASE: 0
2019-01-11 13:29:19.101227+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6814 GUEST_TR_BASE: 80f93400
2019-01-11 13:29:19.101229+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6816 GUEST_GDTR_BASE: 80f93000
2019-01-11 13:29:19.101232+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6818 GUEST_IDTR_BASE: 80f96000
2019-01-11 13:29:19.101235+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 4800 GUEST_ES_LIMIT: ffffffff
2019-01-11 13:29:19.101237+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 4802 GUEST_CS_LIMIT: ffffffff
2019-01-11 13:29:19.101240+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 4804 GUEST_SS_LIMIT: ffffffff
2019-01-11 13:29:19.101243+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 4806 GUEST_DS_LIMIT: ffffffff
2019-01-11 13:29:19.101247+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 4808 GUEST_FS_LIMIT: 6020
2019-01-11 13:29:19.101251+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 480a GUEST_GS_LIMIT: ffffffff
2019-01-11 13:29:19.101254+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 480c GUEST_LDTR_LIMIT: ffffffff
2019-01-11 13:29:19.101257+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 480e GUEST_TR_LIMIT: 20ab
2019-01-11 13:29:19.101259+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 4810 GUEST_GDTR_LIMIT: 3ff
2019-01-11 13:29:19.101262+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 4812 GUEST_IDTR_LIMIT: 7ff
2019-01-11 13:29:19.101265+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 2800 GUEST_VMCS_LINK_PTR: ffffffffffffffff
2019-01-11 13:29:19.101267+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 2802 GUEST_DEBUGCTL: 0
2019-01-11 13:29:19.101269+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 2804 GUEST_PAT: 0
2019-01-11 13:29:19.101272+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 2806 GUEST_EFER: 800
2019-01-11 13:29:19.101274+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 2808 GUEST_PERF_GLOBAL_CTRL: 0
2019-01-11 13:29:19.101277+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 280a GUEST_PDPTE0: 1a9001
2019-01-11 13:29:19.101280+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 280c GUEST_PDPTE1: 1aa001
2019-01-11 13:29:19.101282+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 280e GUEST_PDPTE2: 1ab001
2019-01-11 13:29:19.101285+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 2810 GUEST_PDPTE3: 1ac001
2019-01-11 13:29:19.101287+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 681a GUEST_DR7: 400
2019-01-11 13:29:19.101290+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6822 GUEST_PENDING_DBE: 0
2019-01-11 13:29:19.101292+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 482a GUEST_SYSENTER_CS: 0
2019-01-11 13:29:19.101294+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6824 GUEST_SYSENTER_ESP: 0
2019-01-11 13:29:19.101297+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 6826 GUEST_SYSENTER_EIP: 0
2019-01-11 13:29:19.101299+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 4828 GUEST_SMBASE: 0
2019-01-11 13:29:19.101302+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 4824 GUEST_INTERRUPTIBILITY: 0
2019-01-11 13:29:19.101304+0800 localhost kernel[0]: (intelhaxm) haxm_warn: 4826 GUEST_ACTIVITY_STATE: 0
2019-01-11 13:29:19.101306+0800 localhost kernel[0]: (intelhaxm) haxm_error: vcpu has panicked, id:0
2019-01-11 13:29:19.101308+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_host_cr4_vmxe: 1
2019-01-11 13:29:19.101309+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_host_cr4 3626e0
2019-01-11 13:29:19.101311+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmxon_res 0
2019-01-11 13:29:19.101312+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmxon_addr 105e6000
2019-01-11 13:29:19.101314+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmxon_err_type1 0
2019-01-11 13:29:19.101315+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmxon_err_type2 0
2019-01-11 13:29:19.101317+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmxon_err_type3 0
2019-01-11 13:29:19.101318+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmclear_err 0
2019-01-11 13:29:19.101319+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmptrld_err 0
2019-01-11 13:29:19.101321+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmoff_no 0
2019-01-11 13:29:19.101322+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmxoff_res 0
2019-01-11 13:29:19.101325+0800 localhost kernel[0]: (intelhaxm) haxm_error: vcpu has panicked, id:0
2019-01-11 13:29:19.101327+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_host_cr4_vmxe: 1
2019-01-11 13:29:19.101328+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_host_cr4 3626e0
2019-01-11 13:29:19.101329+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmxon_res 0
2019-01-11 13:29:19.101331+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmxon_addr 105e6000
2019-01-11 13:29:19.101333+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmxon_err_type1 0
2019-01-11 13:29:19.101334+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmxon_err_type2 0
2019-01-11 13:29:19.101335+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmxon_err_type3 0
2019-01-11 13:29:19.101337+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmclear_err 0
2019-01-11 13:29:19.101338+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmptrld_err 0
2019-01-11 13:29:19.101339+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmoff_no 0
2019-01-11 13:29:19.101341+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmxoff_res 0
2019-01-11 13:29:19.101445+0800 localhost kernel[0]: (intelhaxm) haxm_error: vcpu has panicked, id:0
2019-01-11 13:29:19.101447+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_host_cr4_vmxe: 1
2019-01-11 13:29:19.101449+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_host_cr4 3626e0
2019-01-11 13:29:19.101451+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmxon_res 0
2019-01-11 13:29:19.101453+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmxon_addr 105e6000
2019-01-11 13:29:19.101455+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmxon_err_type1 0
2019-01-11 13:29:19.101457+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmxon_err_type2 0
2019-01-11 13:29:19.101459+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmxon_err_type3 0
2019-01-11 13:29:19.101461+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmclear_err 0
2019-01-11 13:29:19.101463+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmptrld_err 0
2019-01-11 13:29:19.101465+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmoff_no 0
2019-01-11 13:29:19.101467+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmxoff_res 0
2019-01-11 13:29:19.101469+0800 localhost kernel[0]: (intelhaxm) haxm_error: vcpu has panicked, id:0
2019-01-11 13:29:19.101471+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_host_cr4_vmxe: 1
2019-01-11 13:29:19.101473+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_host_cr4 3626e0
2019-01-11 13:29:19.101475+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmxon_res 0
2019-01-11 13:29:19.101477+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmxon_addr 105e6000
2019-01-11 13:29:19.101479+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmxon_err_type1 0
2019-01-11 13:29:19.101481+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmxon_err_type2 0
2019-01-11 13:29:19.101483+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmxon_err_type3 0
2019-01-11 13:29:19.101485+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmclear_err 0
2019-01-11 13:29:19.101487+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmptrld_err 0
2019-01-11 13:29:19.101489+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmoff_no 0
2019-01-11 13:29:19.101491+0800 localhost kernel[0]: (intelhaxm) haxm_error: log_vmxoff_res 0
2019-01-11 13:29:19.124327+0800 localhost kernel[0]: (intelhaxm) haxm_error:
...........hax_teardown_vm

@maronz

maronz commented Jan 11, 2019

I've now re-done the Win10 (1809) x86 test, and will attach the relevant part from the 'dmesg' output here. Whilst I can't interpret this log, I've also added in this attachment the "cleaned up" versions of the log from the Windows host posted above, and my test from a Linux host. Maybe someone else can "see" what's going on.
w10_x86__panic.zip

@raphaelning
Contributor

@maronz Thanks a lot. @HaHoYou was actually running the test on macOS, but apparently the host OS doesn't really matter in this case, because this fatal error appears in all the logs:

haxm_panic: Unhandled vmx vmexit reason:9

According to Intel SDM Vol. 3D Appendix C, VM exit reason 9 is:

Task switch. Guest software attempted a task switch.

which refers to the hardware task switching mechanism. Most OSes do not use this mechanism at all (in favor of the more flexible software task switching), and I think it's no longer available in 64-bit mode. But for some reason, Windows 10 x86 uses it, so we'll need to support it at some point.
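
For readers who want to dig further into such a log, the exit qualification of a task-switch VM exit can be decoded with a few bit operations. The sketch below is not HAXM code. It assumes the layout given in the Intel SDM for this exit qualification (bits 15:0 hold the selector of the target TSS, bits 31:30 hold the source of the task switch), and all type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define EXIT_REASON_TASK_SWITCH 9  /* basic exit reason reported in the log */

/* Source of the task switch, per the assumed encoding of bits 31:30. */
enum task_switch_source {
    TS_SRC_CALL = 0,          /* CALL instruction */
    TS_SRC_IRET = 1,          /* IRET instruction */
    TS_SRC_JMP = 2,           /* JMP instruction */
    TS_SRC_IDT_TASK_GATE = 3  /* task gate in the IDT */
};

/* Hypothetical container for the decoded fields. */
struct task_switch_info {
    uint16_t tss_selector;
    enum task_switch_source source;
};

/* Hypothetical helper: decode the qualification of a task-switch exit. */
static struct task_switch_info decode_task_switch(uint32_t exit_reason,
                                                  uint64_t qualification)
{
    struct task_switch_info info;

    /* The basic exit reason lives in bits 15:0 of the exit reason field. */
    assert((exit_reason & 0xffff) == EXIT_REASON_TASK_SWITCH);

    info.tss_selector = (uint16_t)(qualification & 0xffff);
    info.source = (enum task_switch_source)((qualification >> 30) & 0x3);
    return info;
}
```

Applied to the VM_EXIT_INFO_QUALIFICATION value c0000050 from the log above, this decoding gives TSS selector 0x50 and source 3, which under the assumed layout means the switch was initiated through a task gate in the IDT.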

@nevilad
Contributor Author

nevilad commented Jan 12, 2019

  1. I will update the code, test it again and commit the new version according to the rules.
  2. Windows 10 x64 and x86 still don't boot. This doesn't seem to be the result of this patch, but it seems that people are interested in running these OSes under HAXM. I will try to run these and fix the bugs.

@raphaelning
Contributor

raphaelning commented Jan 14, 2019

I will update the code, test it again and commit the new version according to the rules.

Thanks, the new patches look good! But before we can merge this PR, we need to remove the first two commits (e7d78b7 and 88ce709) from the branch, and only keep the last two. Could you rewrite the git history and force-push?

Windows 10 x64 and x86 still don't boot. [...] I will try to run these and fix the bugs.

That would be great! #118 can be used to track both issues, but if it makes sense, we can create a new one for Windows 10 x86 as well.

@nevilad nevilad closed this Jan 14, 2019
@nevilad
Contributor Author

nevilad commented Jan 14, 2019

I removed all commits, force-pushed the branch, committed again and pushed. I didn't close the PR. Github thought that by reverting all commits I closed the PR?

@nevilad nevilad reopened this Jan 14, 2019
@@ -654,22 +654,26 @@ uint32_t pw_perform_page_walk(
pdpt_gpa = first_table;
}

char* pdpt_page_hva;
Contributor

nit: There are a few cosmetic issues with this declaration:

  • All local variables should be declared at the beginning of the current scope, i.e. right below line 618 (C coding style).
  • char* p => char *p (C coding style).
  • Although sizeof(char) is always 1, I'd still prefer uint8_t, which is explicit about the size.

Sorry I overlooked them yesterday. Let me fix them and push a new commit to your branch.

Contributor

Hmm, I'm having some trouble pushing to your branch, due to an HTTPS authentication error. Hopefully I'll find a solution soon. Meanwhile, please feel free to amend your second commit yourself:

diff --git a/core/page_walker.c b/core/page_walker.c
index ffdea4d..9b0f1c5 100644
--- a/core/page_walker.c
+++ b/core/page_walker.c
@@ -616,6 +616,8 @@ uint32_t pw_perform_page_walk(
     first_table = pw_retrieve_table_from_cr3(cr3, is_pae, is_lme);

     if (is_pae) {
+        uint8_t *pdpt_page_hva;
+
         if (is_lme) {
             pml4t_gpa = first_table;
 #ifdef CONFIG_HAX_EPT2
@@ -654,7 +656,6 @@ uint32_t pw_perform_page_walk(
             pdpt_gpa = first_table;
         }

-        char* pdpt_page_hva;
 #ifdef CONFIG_HAX_EPT2
         pdpt_page_hva = gpa_space_map_page(&vcpu->vm->gpa_space,
                                            pdpt_gpa >> PG_ORDER_4K,

Contributor

Done. We'll start testing this PR now. Before I merge it, I'll merge the last two commits (git rebase -i with fixup) and force-push to your branch.

@raphaelning
Contributor

I removed all commits, force-pushed the branch, committed again and pushed. I didn't close the PR. Github thought that by reverting all commits I closed the PR?

Thanks. You didn't have to force-push right after removing all commits, because then the branch would be identical to master, which could confuse GitHub. Instead, all you need to do is clean up the git history locally (with git rebase -i, git reset --soft, etc.) and then do one force-push.

@raphaelning raphaelning merged commit e39e298 into intel:master Jan 15, 2019
Development

Successfully merging this pull request may close these issues.

No support for Windows guests
4 participants