Question about 2-pass copy? #17
Comments
@wangtZJU It doesn't seem to help on most modern hardware, but it certainly could for some hardware. Examples of hardware design/behaviors that cause this include:
Evidently, on relatively recent x86 hardware these types of effects aren't enough (if they exist at all) to overcome the disadvantage of twice as many reads and writes (and usually twice as many instructions). This technique also has the disadvantage of having a larger impact on the existing contents of L1, which could be really bad for smaller copies (but is not easily picked up by a benchmark, since the cost is largely incurred by the code following the copy, which suffers increased misses). Finally, this technique doesn't really play nice with non-temporal reads and writes, which are the main trick to accelerate large copies on hardware that offers them.
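For concreteness, here is a minimal sketch of what a 2-pass copy could look like next to a direct copy. It is only one possible reading of the technique, not necessarily how the project implements it: the names `copy_direct` and `copy_two_pass`, the bounce buffer, and the 4 KiB `CHUNK_BYTES` value (assumed to fit comfortably in L1) are all hypothetical choices made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical chunk size, assumed small enough to stay resident in L1D. */
#define CHUNK_BYTES 4096

/* Direct copy: each byte is read once from src and written once to dst. */
void copy_direct(void *dst, const void *src, size_t n) {
    memcpy(dst, src, n);
}

/* 2-pass copy (sketch): each chunk goes source -> bounce buffer (in L1),
 * then bounce buffer -> destination, so every byte is read and written twice. */
void copy_two_pass(void *dst, const void *src, size_t n) {
    unsigned char bounce[CHUNK_BYTES];
    const unsigned char *s = (const unsigned char *)src;
    unsigned char *d = (unsigned char *)dst;

    while (n > 0) {
        size_t len = n < CHUNK_BYTES ? n : CHUNK_BYTES;
        memcpy(bounce, s, len);  /* pass 1: source -> L1-resident buffer */
        memcpy(d, bounce, len);  /* pass 2: L1-resident buffer -> destination */
        s += len;
        d += len;
        n -= len;
    }
}
```

The sketch makes the costs discussed above visible: the second function issues two `memcpy` calls per chunk, and the bounce buffer itself occupies L1 lines that other data might have been using.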
Hi, I want to know whether there is any possibility that a 2-pass copy runs faster than a direct copy. I have seen you say that a 2-pass copy means source -> L1 cache, then L1 cache -> destination. But I don't think that helps reduce cache misses on the source or the destination, does it?
Thanks~