使用 Litmus 驗證內(nèi)存重排
本文轉(zhuǎn)載自微信公眾號「碼農(nóng)桃花源」,作者曹春暉。轉(zhuǎn)載本文請聯(lián)系碼農(nóng)桃花源公眾號。
At the same time, x86 defines quite a strict memory model, which bans most possible reorderings, roughly summarized as follows:
Stores have a single global order of visibility, observed consistently by all CPUs, subject to one loosening of this rule below. Local load operations are never reordered with respect to other local load operations.
Local store operations are never reordered with respect to other local store operations (i.e., a store that appears earlier in the instruction stream always appears earlier in the global order).
Local load operations may be reordered with respect to earlier local store operations, such that the load appears to execute earlier wrt the global store order than the local store, but the reverse (earlier load, older store) is not true.
簡單概括一下,就是在 x86 平臺采用較強的內(nèi)存序,只有 store load 會發(fā)生亂序。
看各位八股文老仙們背的實在辛苦,本文提供一點可以直接實操證明這些問題的手段。
perfbook 一書在講 memory barrier 相關(guān)的概念時,都使用了一個叫 litmus 的工具,現(xiàn)在被集成在 herdtools[2] 中,安裝好 herdtools 就已經(jīng)有了 litmus,上面提到的所有讀寫重排/亂序的情況我們都可以進行測試。
讀寫亂序測試
- X86 RW
- { x=0; y=0; }
- P0 | P1 ;
- MOV EAX,[y] | MOV EAX,[x] ;
- MOV [x],$1 | MOV [y],$1 ;
- locations [x;y;]
- exists (0:EAX=1 /\ 1:EAX=1)
- %%%%%%%%%%%%%%%%%%%%%%%%%
- % Results for sb.litmus %
- %%%%%%%%%%%%%%%%%%%%%%%%%
- X86 OOO
- {x=0; y=0;}
- P0 | P1 ;
- MOV EAX,[y] | MOV EAX,[x] ;
- MOV [x],$1 | MOV [y],$1 ;
- locations [x; y;]
- exists (0:EAX=1 /\ 1:EAX=1)
- Generated assembler
- ##START _litmus_P0
- movl -4(%rsi,%rcx,4), %eax
- movl $1, -4(%rbx,%rcx,4)
- ##START _litmus_P1
- movl -4(%rbx,%rcx,4), %eax
- movl $1, -4(%rsi,%rcx,4)
- Test OOO Allowed
- Histogram (2 states)
- 500000:>0:EAX=1; 1:EAX=0; x=1; y=1;
- 500000:>0:EAX=0; 1:EAX=1; x=1; y=1;
- No
- Witnesses
- Positive: 0, Negative: 1000000
- Condition exists (0:EAX=1 /\ 1:EAX=1) is NOT validated
- Hash=7cdd62e8647b817c1615cf8eb9d2117b
- Observation OOO Never 0 1000000
- Time OOO 0.14
寫讀亂序測試
- X86 RW
- { x=0; y=0; }
- P0 | P1 ;
- MOV EAX,[y] | MOV EAX,[x] ;
- MOV [x],$1 | MOV [y],$1 ;
- locations [x;y;]
- exists (0:EAX=1 /\ 1:EAX=1)
- %%%%%%%%%%%%%%%%%%%%%%%%%%
- % Results for sb2.litmus %
- %%%%%%%%%%%%%%%%%%%%%%%%%%
- X86 OOO
- {x=0; y=0;}
- P0 | P1 ;
- MOV [x],$1 | MOV [y],$1 ;
- MOV EAX,[y] | MOV EAX,[x] ;
- locations [x; y;]
- exists (0:EAX=0 /\ 1:EAX=0)
- Generated assembler
- ##START _litmus_P0
- movl $1, -4(%rbx,%rcx,4)
- movl -4(%rsi,%rcx,4), %eax
- ##START _litmus_P1
- movl $1, -4(%rsi,%rcx,4)
- movl -4(%rbx,%rcx,4), %eax
- Test OOO Allowed
- Histogram (4 states)
- 2 *>0:EAX=0; 1:EAX=0; x=1; y=1;
- 499998:>0:EAX=1; 1:EAX=0; x=1; y=1;
- 499999:>0:EAX=0; 1:EAX=1; x=1; y=1;
- 1 :>0:EAX=1; 1:EAX=1; x=1; y=1;
- Ok
- Witnesses
- Positive: 2, Negative: 999998
- Condition exists (0:EAX=0 /\ 1:EAX=0) is validated
- Hash=2d53e83cd627ba17ab11c875525e078b
- Observation OOO Sometimes 2 999998
- Time OOO 0.12
讀讀和寫寫亂序測試
這里我沒想到太好的辦法,所以將讀讀和寫寫混在一起進行測試,無論是 WW 會發(fā)生重排,或是 RR 會發(fā)生重排,都可能會出現(xiàn)在 P0 中,EAX = 2,EBX = 0 的情況。
- X86 OOO
- { x=0; y=0; }
- P0 | P1 ;
- MOV EAX,[x] | MOV [y],$1 ;
- MOV EBX,[y] | MOV [x],$2 ;
- locations [x;y;]
- exists (0:EAX=2 /\ 0:EBX=0)
- %%%%%%%%%%%%%%%%%%%%%%%%%%
- % Results for sb3.litmus %
- %%%%%%%%%%%%%%%%%%%%%%%%%%
- X86 OOO
- {x=0; y=0;}
- P0 | P1 ;
- MOV EAX,[x] | MOV [y],$1 ;
- MOV EBX,[y] | MOV [x],$2 ;
- locations [x; y;]
- exists (0:EAX=2 /\ 0:EBX=0)
- Generated assembler
- ##START _litmus_P0
- movl -4(%rbx,%rcx,4), %eax
- movl -4(%rdx,%rcx,4), %r11d
- ##START _litmus_P1
- movl $1, -4(%rdi,%rax,4)
- movl $2, -4(%rcx,%rax,4)
- Test OOO Allowed
- Histogram (3 states)
- 500000:>0:EAX=0; 0:EBX=0; x=2; y=1;
- 1 :>0:EAX=0; 0:EBX=1; x=2; y=1;
- 499999:>0:EAX=2; 0:EBX=1; x=2; y=1;
- No
- Witnesses
- Positive: 0, Negative: 1000000
- Condition exists (0:EAX=2 /\ 0:EBX=0) is NOT validated
- Hash=74f6930f2a61d6cfec9fb5ea3132555e
- Observation OOO Never 0 1000000
- Time OOO 0.11
[1]
這么: https://stackoverflow.com/questions/50307693/does-an-x86-cpu-reorder-instructions
[2]
herdtools: https://github.com/herd/herdtools7