L1CACHE.linesize is nothing on WSL2 Ubuntu making LoopVectorization.jl fail to precompile #27

Open
hsgg opened this issue Dec 14, 2020 · 7 comments

@hsgg

hsgg commented Dec 14, 2020

Not sure if this is a VectorizationBase.jl, LoopVectorization.jl, or Hwloc.jl bug.

L1CACHE.linesize=nothing on my system:

julia> VectorizationBase.L₁CACHE
(size = nothing, depth = nothing, linesize = nothing, associativity = nothing, type = nothing)

This causes LoopVectorization.jl to fail to precompile.

My system is WSL2 (the Windows Subsystem for Linux 2) running Ubuntu 20.04. The /proc/cpuinfo appears normal (happy to post on request); the CPU is an Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz.

Some more info:

julia> VectorizationBase.CACHE_COUNT
(0, 0, 0, 0)

julia> VectorizationBase.COUNTS
Dict{Symbol,Int64} with 19 entries:
  :Package    => 1
  :Error      => 0
  :PU         => 16
  :OS_Device  => 0
  :L5Cache    => 0
  :L4Cache    => 0
  :I1Cache    => 0
  :L3Cache    => 0
  :Core       => 8
  :Machine    => 1
  :I3Cache    => 0
  :PCI_Device => 0
  :L2Cache    => 0
  :NUMANode   => 0
  :Bridge     => 0
  :Group      => 0
  :Misc       => 0
  :L1Cache    => 0
  :I2Cache    => 0

julia> VectorizationBase.TOPOLOGY
D0: L0 P0 Machine  
    D1: L0 P0 Package  
        D2: L0 P0 Core  
            D3: L0 P0 PU  
            D3: L1 P1 PU  
        D2: L1 P1 Core  
            D3: L2 P2 PU  
            D3: L3 P3 PU  
        D2: L2 P2 Core  
            D3: L4 P4 PU  
            D3: L5 P5 PU  
        D2: L3 P3 Core  
            D3: L6 P6 PU  
            D3: L7 P7 PU  
        D2: L4 P4 Core  
            D3: L8 P8 PU  
            D3: L9 P9 PU  
        D2: L5 P5 Core  
            D3: L10 P10 PU  
            D3: L11 P11 PU  
        D2: L6 P6 Core  
            D3: L12 P12 PU  
            D3: L13 P13 PU  
        D2: L7 P7 Core  
            D3: L14 P14 PU  
            D3: L15 P15 PU  

@hsgg
Author

hsgg commented Dec 14, 2020

Just to follow up, this problem does not occur if I pin VectorizationBase to version 0.12.
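
For readers who want to apply the same workaround, pinning could look roughly like the sketch below. It assumes the package is installed from the default registry into the active environment and uses only the standard `Pkg` API; the exact 0.12.x patch release that gets resolved is not specified in this thread.

```julia
using Pkg

# Install the newest release in the 0.12 series (the series that used CpuId.jl).
Pkg.add(name = "VectorizationBase", version = "0.12")

# Pin it so that later `Pkg.update()` calls do not move it to a newer series.
Pkg.pin("VectorizationBase")
```

Running `Pkg.free("VectorizationBase")` later removes the pin once a fixed release is available.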

@chriselrod
Member

I think it's a Hwloc bug, but I don't know if it's supposed to work.

VectorizationBase 0.12 used CpuId.jl instead.
Maybe I should use both, with CpuId serving as a backup when Hwloc doesn't work. =/

@chriselrod
Member

FWIW, it should look more like this:

julia> VectorizationBase.CACHE_COUNT
(18, 18, 1, 0)

julia> VectorizationBase.COUNTS
Dict{Symbol, Int64} with 19 entries:
  :L3Cache    => 1
  :I2Cache    => 0
  :Package    => 1
  :Machine    => 1
  :I3Cache    => 0
  :PU         => 36
  :PCI_Device => 0
  :OS_Device  => 0
  :Error      => 0
  :L2Cache    => 18
  :NUMANode   => 0
  :Bridge     => 0
  :L5Cache    => 0
  :Group      => 0
  :Misc       => 0
  :L1Cache    => 18
  :L4Cache    => 0
  :I1Cache    => 0
  :Core       => 18

julia> VectorizationBase.TOPOLOGY
D0: L0 P0 Machine
    D1: L0 P0 Package
        D2: L0 P-1 L3Cache  Cache{size=25952256,depth=3,linesize=64,associativity=11,type=Unified}
            D3: L0 P-1 L2Cache  Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
                D4: L0 P-1 L1Cache  Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
                    D5: L0 P0 Core
                        D6: L0 P0 PU
                        D6: L1 P18 PU
            D3: L1 P-1 L2Cache  Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
                D4: L1 P-1 L1Cache  Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
                    D5: L1 P1 Core
                        D6: L2 P1 PU
                        D6: L3 P19 PU
            D3: L2 P-1 L2Cache  Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
                D4: L2 P-1 L1Cache  Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
                    D5: L2 P2 Core
                        D6: L4 P2 PU
                        D6: L5 P20 PU
            D3: L3 P-1 L2Cache  Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
                D4: L3 P-1 L1Cache  Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
                    D5: L3 P3 Core
                        D6: L6 P3 PU
                        D6: L7 P21 PU
            D3: L4 P-1 L2Cache  Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
                D4: L4 P-1 L1Cache  Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
                    D5: L4 P4 Core
                        D6: L8 P4 PU
                        D6: L9 P22 PU
            D3: L5 P-1 L2Cache  Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
                D4: L5 P-1 L1Cache  Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
                    D5: L5 P8 Core
                        D6: L10 P5 PU
                        D6: L11 P23 PU
            D3: L6 P-1 L2Cache  Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
                D4: L6 P-1 L1Cache  Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
                    D5: L6 P9 Core
                        D6: L12 P6 PU
                        D6: L13 P24 PU
            D3: L7 P-1 L2Cache  Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
                D4: L7 P-1 L1Cache  Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
                    D5: L7 P10 Core
                        D6: L14 P7 PU
                        D6: L15 P25 PU
            D3: L8 P-1 L2Cache  Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
                D4: L8 P-1 L1Cache  Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
                    D5: L8 P11 Core
                        D6: L16 P8 PU
                        D6: L17 P26 PU
            D3: L9 P-1 L2Cache  Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
                D4: L9 P-1 L1Cache  Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
                    D5: L9 P16 Core
                        D6: L18 P9 PU
                        D6: L19 P27 PU
            D3: L10 P-1 L2Cache  Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
                D4: L10 P-1 L1Cache  Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
                    D5: L10 P17 Core
                        D6: L20 P10 PU
                        D6: L21 P28 PU
            D3: L11 P-1 L2Cache  Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
                D4: L11 P-1 L1Cache  Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
                    D5: L11 P18 Core
                        D6: L22 P11 PU
                        D6: L23 P29 PU
            D3: L12 P-1 L2Cache  Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
                D4: L12 P-1 L1Cache  Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
                    D5: L12 P19 Core
                        D6: L24 P12 PU
                        D6: L25 P30 PU
            D3: L13 P-1 L2Cache  Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
                D4: L13 P-1 L1Cache  Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
                    D5: L13 P20 Core
                        D6: L26 P13 PU
                        D6: L27 P31 PU
            D3: L14 P-1 L2Cache  Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
                D4: L14 P-1 L1Cache  Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
                    D5: L14 P24 Core
                        D6: L28 P14 PU
                        D6: L29 P32 PU
            D3: L15 P-1 L2Cache  Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
                D4: L15 P-1 L1Cache  Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
                    D5: L15 P25 Core
                        D6: L30 P15 PU
                        D6: L31 P33 PU
            D3: L16 P-1 L2Cache  Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
                D4: L16 P-1 L1Cache  Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
                    D5: L16 P26 Core
                        D6: L32 P16 PU
                        D6: L33 P34 PU
            D3: L17 P-1 L2Cache  Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
                D4: L17 P-1 L1Cache  Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
                    D5: L17 P27 Core
                        D6: L34 P17 PU
                        D6: L35 P35 PU

Seems that everything pertaining to the cache is missing. While the topology has the 8 cores, they're all nested directly inside the package rather than the caches. =/

@hsgg
Author

hsgg commented Dec 14, 2020

Thanks for the response!

I made a Hwloc.jl bug report here, and have pinned VectorizationBase.jl to version 0.12 as a workaround.

@chriselrod
Member

Thanks for filing a report there, but the contributors at Hwloc.jl may forward you here:
https://github.com/open-mpi/hwloc

Note that VectorizationBase 0.12 doesn't support Julia 1.6. Julia 1.6 probably won't be out until early February, so there's time to find a solution. Or, failing that, I could implement a workaround.

@hsgg
Author

hsgg commented Dec 14, 2020

Thanks for the heads-up. I appreciate your help!

@chriselrod
Member

I forgot to update this issue to confirm that LoopVectorization has been using 64 as a default fallback for a while now.
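
The idea of a default fallback can be illustrated with a minimal sketch. This is not the actual LoopVectorization.jl or VectorizationBase.jl code: `DEFAULT_LINESIZE` and `cacheline_size` are hypothetical names, and the named tuples simply mirror the values printed earlier in this thread.

```julia
# Hypothetical sketch; `DEFAULT_LINESIZE` and `cacheline_size` are illustrative
# names, not part of VectorizationBase.jl or LoopVectorization.jl.
const DEFAULT_LINESIZE = 64  # bytes; the fallback value mentioned above

# Return the detected line size, or the default when detection yielded `nothing`.
cacheline_size(cache) = something(cache.linesize, DEFAULT_LINESIZE)

# The all-`nothing` tuple reported on WSL2 at the top of this issue.
wsl2_l1 = (size = nothing, depth = nothing, linesize = nothing,
           associativity = nothing, type = nothing)

# A successfully detected L1 data cache, as in the 18-core topology above.
detected_l1 = (size = 32768, depth = 1, linesize = 64,
               associativity = 8, type = :Data)

println(cacheline_size(wsl2_l1))      # falls back to the default
println(cacheline_size(detected_l1))  # uses the detected value
```

`something` from Julia's `Base` returns its first argument that is not `nothing`, so the default only applies when detection produced no value.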
