Linus: Why Object Reference Counts Must Be Atomic
Linus is ranting again! This time the target is the currently fashionable parallelism, and he states bluntly that parallel computing is a waste of everyone's time ("The whole 'let's parallelize' thing is a huge waste of everybody's time."). His point, roughly, is that the effort is better spent on fast out-of-order execution, bigger caches, and lower power. I have no intention of debating the merits of parallelism head-on here (far too grand a topic); instead, this post looks at another message in the same thread, where Linus uses object reference counting as an example of just how complicated concurrency really is.
Before Linus weighed in, someone had raised the question of reference-count atomicity in the case where the object already needs its own locking:
Since it is being accessed in a multi-threaded way, via multiple access paths, generally it needs its own mutex — otherwise, reference counting would not be required to be atomic and a lock of a higher-level object would suffice.
Linus saw it differently:
The problem with reference counts is that you often need to take them *before* you take the lock that protects the object data.
The thing is, you have two different cases:
- object *reference*
- object data
and they have completely different locking.
Object data locking is generally per-object. Well, unless you don’t have huge scalability issues, in which case you may have some external bigger lock (extreme case: one single global lock).
But object *referencing* is mostly about finding the object (and removing/freeing it). Is it on a hash chain? Is it in a tree? Linked list? When the reference count goes down to zero, it’s not the object data that you need to protect (the object is not used by anything else, so there’s nothing to protect!), it’s the ways to find the object you need to protect.
And the lock for the lookup operation cannot be in the object, because – by definition – you don’t know what the object is! You’re trying to look it up, after all.
So generally you have a lock that protects the lookup operation some way, and the reference count needs to be atomic with respect to that lock.
(Translator's note: the lookup lock is not owned by the object whose count is being changed, so it cannot on its own protect that object's reference count; the rest of the post explains why the object itself cannot simply be locked while its count changes.)
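To make the relationship between the lookup lock and the count concrete, here is a minimal C sketch (the translator's illustration, not code from the thread). A single global `lookup_lock` protects a toy linked list standing in for the hash chain; lookups take a reference under that lock, and the count itself is atomic because it can also be bumped through pointers held by other objects without the lookup lock, as in the graph walk Linus describes next.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* One global lock protecting the lookup structure (here: a single list). */
static pthread_mutex_t lookup_lock = PTHREAD_MUTEX_INITIALIZER;

struct obj {
    struct obj *next;        /* hash-chain/list linkage, protected by lookup_lock */
    int key;
    atomic_int refcount;     /* atomic: also bumped via pointers from other
                                objects, without holding lookup_lock */
    pthread_mutex_t lock;    /* per-object lock protecting the object's own data */
};

static struct obj *obj_list; /* the lookup structure itself */

/* Look an object up and take a reference, all under the lookup lock. */
struct obj *obj_get(int key)
{
    struct obj *o;

    pthread_mutex_lock(&lookup_lock);
    for (o = obj_list; o; o = o->next) {
        if (o->key == key) {
            atomic_fetch_add(&o->refcount, 1);
            break;
        }
    }
    pthread_mutex_unlock(&lookup_lock);
    return o;                /* NULL if not found */
}

/* Drop a reference.  When the count hits zero it is the lookup structure,
 * not the object data, that still needs protecting: unhash, then free. */
void obj_put(struct obj *o)
{
    pthread_mutex_lock(&lookup_lock);
    if (atomic_fetch_sub(&o->refcount, 1) == 1) {
        struct obj **pp = &obj_list;
        while (*pp != o)
            pp = &(*pp)->next;
        *pp = o->next;       /* nobody can find it through the list any more */
        pthread_mutex_unlock(&lookup_lock);
        free(o);
        return;
    }
    pthread_mutex_unlock(&lookup_lock);
}
```

The decrement-to-zero path is safe here only because of the invariant Linus spells out next: a pointer from another object implies a reference, so once the count reaches zero there is no remaining pointer anyone could use to resurrect the object.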
And yes, that lock may well be sufficient, and now you’re back to non-atomic reference counts. But you usually don’t have just one way to look things up: you might have pointers from other objects (and that pointer is protected by the object locking of the other object), but there may be multiple such objects that point to this (which is why you have a reference count in the first place!)
See what happens? There is no longer one single lock for lookup. Imagine walking a graph of objects, where objects have pointers to each other. Each pointer implies a reference to an object, but as you walk the graph, you have to release the lock from the source object, so you have to take a new reference to the object you are now entering.
And in order to avoid deadlocks, you can not in the general case take the lock of the new object first – you have to release the lock on the source object, because otherwise (in a complex graph), how do you avoid simple ABBA deadlock?
(Translator's note: imagine two threads, one walking A->B and the other B->A. If the first locks A while the second locks B, each then waits forever for the lock the other already holds.)
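The note above is the classic scenario; spelled out as a deliberately broken C sketch (again the translator's illustration):

```c
#include <pthread.h>

pthread_mutex_t A = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t B = PTHREAD_MUTEX_INITIALIZER;

void *walk_a_to_b(void *arg)     /* thread 1: holds A, then wants B */
{
    pthread_mutex_lock(&A);
    pthread_mutex_lock(&B);      /* blocks if thread 2 already holds B */
    pthread_mutex_unlock(&B);
    pthread_mutex_unlock(&A);
    return arg;
}

void *walk_b_to_a(void *arg)     /* thread 2: holds B, then wants A */
{
    pthread_mutex_lock(&B);
    pthread_mutex_lock(&A);      /* blocks if thread 1 already holds A */
    pthread_mutex_unlock(&A);
    pthread_mutex_unlock(&B);
    return arg;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, walk_a_to_b, NULL);
    pthread_create(&t2, NULL, walk_b_to_a, NULL);
    pthread_join(t1, NULL);      /* with unlucky timing, never returns: ABBA deadlock */
    pthread_join(t2, NULL);
    return 0;
}
```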
So atomic reference counts fix that. They work because when you move from object A to object B, you can do this:
(a) you have a reference count to A, and you can lock A
(b) once object A is locked, the pointer from A to B is stable, and you know you have a reference to B (because of that pointer from A to B)
(c) but you cannot take the object lock for B (ABBA deadlock) while holding the lock on A
(d) increment the atomic reference count on B
(e) now you can drop the lock on A (you’re “exiting” A)
(f) your reference count means that B cannot go away from under you despite unlocking A, so now you can lock B.
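Put together as a rough C sketch (the translator's rendering of steps (a)-(f), with a hypothetical `struct node` and field names, not code from the thread):

```c
#include <pthread.h>
#include <stdatomic.h>

struct node {
    pthread_mutex_t lock;      /* protects the node's data, including ->peer */
    atomic_int refcount;
    struct node *peer;         /* pointer to another node; the pointer implies a reference */
};

/* Walk from a node we already hold a reference to, over to the node its
 * ->peer pointer designates, following steps (a)-(f) above. */
struct node *step(struct node *a)
{
    struct node *b;

    pthread_mutex_lock(&a->lock);            /* (a) we hold a ref to A, so we may lock it */
    b = a->peer;                             /* (b) pointer is stable under A's lock       */
    if (b)                                   /* (c) do NOT lock B here: ABBA risk          */
        atomic_fetch_add(&b->refcount, 1);   /* (d) atomic bump keeps B alive              */
    pthread_mutex_unlock(&a->lock);          /* (e) "exit" A                               */
    if (b)
        pthread_mutex_lock(&b->lock);        /* (f) safe: our reference pins B             */
    return b;                                /* caller now owns a ref on B and its lock    */
}
```

The caller ends up holding both a reference and the lock on B; it is expected to unlock B when done and eventually drop the reference through a put operation like the one sketched earlier, and to release its reference on A the same way once the walk no longer needs it.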
See? Atomic reference counts make this kind of situation possible. Yes, you want to avoid the overhead if at all possible (for example, maybe you have a strict ordering of objects, so you know you can walk from A to B, and never walk from B to A, so there is no ABBA deadlock, and you can just lock B while still holding the lock on A).
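One general-purpose way to get such a forced ordering (the translator's example, using address order rather than anything specific from the thread) is to always take a pair of locks in a fixed global order:

```c
#include <pthread.h>
#include <stdint.h>

struct node {
    pthread_mutex_t lock;
};

/* With a total order on lock acquisition (here: by address), an ABBA cycle
 * cannot form, so the second lock may be taken while the first is still held. */
void lock_pair(struct node *x, struct node *y)
{
    if ((uintptr_t)x < (uintptr_t)y) {
        pthread_mutex_lock(&x->lock);
        pthread_mutex_lock(&y->lock);
    } else {
        pthread_mutex_lock(&y->lock);
        pthread_mutex_lock(&x->lock);
    }
}
```

Any total order works; the point is simply that a cycle of waiters can no longer form.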
But if you don’t have some kind of forced ordering, and if you have multiple ways to reach an object (and again – why have reference counts in the first place if that isn’t true!) then atomic reference counts really are the simple and sane answer.
If you think atomic refcounts are unnecessary, that’s a big flag that you don’t actually understand the complexities of locking.
Trust me, concurrency is hard. There’s a reason all the examples of “look how easy it is to parallelize things” tend to use simple arrays and don’t ever have allocations or freeing of the objects.
People who think that the future is highly parallel are invariably completely unaware of just how hard concurrency really is. They’ve seen Linpack, they’ve seen all those wonderful examples of sorting an array in parallel, they’ve seen all these things that have absolutely no actual real complexity – and often very limited real usefulness.
(Translator's note: of course, I have no intention of turning technology into religion by invoking the great man. Linus in fact gave a more rounded assessment of parallelism in another post, quoted below.)
Oh, I agree. My example was the simple case. The really complex cases are much worse.
I seriously don’t believe that the future is parallel. People who think you can solve it with compilers or programming languages (or better programmers) are so far out to lunch that it’s not even funny.
Parallelism works well in simplified cases with fairly clear interfaces and models. You find parallelism in servers with independent queries, in HPC, in kernels, in databases. And even there, people work really hard to make it work at all, and tend to expressly limit their models to be more amenable to it (eg databases do some things much better than others, so DB admins make sure that they lay out their data in order to cater to the limitations).
Of course, other programming models can work. Neural networks are inherently very parallel indeed. And you don’t need smarter programmers to program them either..
References
- Real World Technologies: the forum where Linus likes to hang out, devoted to future machine architecture (the name alone hints at his technical leanings, as did his earlier rant about virtualization)
- Atomicity of operations in multi-threaded programs: explains why i++ is not an atomic operation
- Concurrency Is Not Parallelism: slides in which Rob Pike, the father of Go, explains the conceptual difference between "concurrency" and "parallelism"