Talk:Spurious wakeup

Is this even a thing?

Any sleeping thread can be unblocked if another thread calls the wake syscall with a valid reference to the lock, and the caller is under no obligation to ensure that the sleeping thread is actually supposed to wake up. That's just the nature of how the sleep syscall works. This article makes the phenomenon seem unavoidable, but there's no sign that this is true. 205.211.94.118 (talk) 15:00, 31 May 2018 (UTC)

To answer that question, I always suggest the original "Worse Is Better" paper, which gives a brief synopsis of the problem that is understandable by someone with a computer-science background. https://www.jwz.org/doc/worse-is-better.html (Here, "PC" means "program counter", not "personal computer".) I just randomly googled this link, which appears to do a better job of explaining what's going on. http://blog.reverberate.org/2011/04/eintr-and-pc-loser-ing-is-better-case.html — Preceding unsigned comment added by 4.15.123.194 (talk) 00:08, 6 September 2018 (UTC)

I don't understand the rationale

According to the rationale given, the reason that a thread could spuriously wake up is because making it predictable would slow things down. However, to be able to accomplish any task you need it to be predictable, so you check the condition and wait again. Since everyone does this, that might as well have been done by the function itself. As well as being cleaner, that would lower the memory footprint. It would not cost any speed, and on systems where threads don't spuriously wake up, the checking could be done away with completely.

If the article is true, and I understand it correctly, then I must say that I find the Windows equivalent much cleaner. If you WaitForSingleObject you know (barring catastrophic failure like waiting on nonexistent objects) you'll get WAIT_ABANDONED, WAIT_OBJECT_0 or WAIT_TIMEOUT, which is returned directly and thus always accurately reflects what happened. WAIT_TIMEOUT can't happen if the timeout is INFINITE. WAIT_ABANDONED either signifies a fault of some sort or can be treated like WAIT_OBJECT_0, depending on the design of your program. Much cleaner. Shinobu (talk) 05:10, 21 November 2008 (UTC)

More than that: a thread could be pre-empted after it checks the invariant. The invariant can be changed at any time, so the checks by themselves guarantee nothing. —Preceding unsigned comment added by 87.248.233.35 (talk) 08:58, 19 June 2010 (UTC)

No, you're wrong. The mutex that is used together with the condition variable is supposed to protect the invariant from changing. So pre-empting the thread after it checks the invariant is not a problem, because the thread still holds the mutex. (Of course, another thread could simply ignore the mutex and modify the invariant anyway, but that would simply be a bug in your code, the kind that fails you in your Concurrent Programming 101 exam.) — Preceding unsigned comment added by 217.67.201.162 (talk) 08:59, 17 September 2012 (UTC)
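To make the point about the mutex concrete, here is a minimal C++ sketch of the usual pattern: the predicate is only read or written while the mutex is held, and the waiter re-checks it in a loop after every wakeup. The `Mailbox` type and the `demo_take` helper are hypothetical names invented for this illustration; they do not come from the article or from any library.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical mailbox used only for illustration.
struct Mailbox {
    std::mutex m;
    std::condition_variable cv;
    std::queue<int> items;  // the invariant's state, guarded by m

    void put(int v) {
        {
            std::lock_guard<std::mutex> lock(m);
            items.push(v);  // modified only while holding m
        }
        cv.notify_one();
    }

    int take() {
        std::unique_lock<std::mutex> lock(m);
        // Re-check the predicate after every wakeup. wait() releases m
        // while blocked and re-acquires it before returning.
        while (items.empty()) {
            cv.wait(lock);
        }
        // m is held here, so no other well-behaved thread can have
        // emptied the queue between the check and this line.
        assert(!items.empty());
        int v = items.front();
        items.pop();
        return v;
    }
};

// Hypothetical helper: one producer, one consumer.
int demo_take() {
    Mailbox box;
    std::thread producer([&box] { box.put(42); });
    int v = box.take();
    producer.join();
    return v;
}
```

The loop costs one extra predicate check per wakeup, and it gives the same result whether a wakeup was caused by `notify_one`, by a signal meant for a different state change, or by nothing at all.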

Personal correspondence

I removed the personal correspondence. Wikipedia:No original research requires all sources to be previously reliably published elsewhere. Superm401 - Talk 03:18, 14 October 2010 (UTC)

I understand the principle being invoked here, but can't help noticing the irony that the personal correspondence removed here answers exactly the question raised above. AllenDowney (talk) 17:08, 20 October 2012 (UTC)

Here is the story as originally present in the main article, for the benefit of people like me who want to know the story but (like me) don't enjoy digging through Wikipedia's "View history" feature to find it:

“

However, in later personal correspondence, David R. Butenhof admitted:

"Though there were indeed some members of the working group who argued that it was theoretically possible to imagine that there might be such an implementation, that wasn't really the reason. (And they were never able to prove it.) POSIX threads were the result of a lot of tension between pragmatic hard realtime programmers and largely academic researchers. Spurious wakeups are the mechanism of an academic computer scientist clique to make sure that everyone had to write clean code that checked and verified predicates!

"But the (perhaps) largely spurious (or at least arcanely philosophical) 'efficiency' argument went over better with the realtime people, and the real reason was usually relegated to second place in the rationale.

"I've thought many times about how you might construct a correct and practical implementation that would really have spurious wakeups. I've never managed to construct an example. Doesn't mean there isn't one, though, and it makes a good story."

”

This sounds slightly fishy to me, though, by the economist's fallacy: "If that thing on the ground were really a $20 bill, someone else would have picked it up by now." If the implementation of POSIX threads on, let's say, Ubuntu Linux, were really not subject to spurious wakeups, surely the man pages would mention that fact by now?

See also https://stackoverflow.com/questions/8594591/why-does-pthread-cond-wait-have-spurious-wakeups for more personal communications from David Butenhof. FWIW, none of the "personal communications" here nor there were to me. :) --Quuxplusone (talk) 01:21, 4 July 2017 (UTC)

Ah, indeed, it's not a $20 bill. Condition variables can be implemented via futexes on Linux, and FUTEX_WAIT is a system call, which means it will return with EINTR if the current process catches a signal. Vladimir Prus explains in this blog post: http://blog.vladimirprus.com/2005/07/spurious-wakeups.html --Quuxplusone (talk) 01:25, 4 July 2017 (UTC)

Other reasons for verifying the invariant

Please remove the "Other reasons for verifying the invariant" section, it is complete BS.

Upon wakeup, a thread is in the critical section of a lock that presumably guards the resource that is being checked, so no matter whether the thread gets scheduled immediately or not, it does own the resource now. If anyone else can modify the state of the resource at that time, i.e. before the woken-up thread has released ownership, it is a general problem with system design (or implementation) -- as pointed out above by 217.67.201.162 at 08:59, 17 September 2012 (UTC). Doing the check is certainly *NOT* the way to address it, and is in fact a textbook case of a race condition. — Preceding unsigned comment added by 78.83.51.215 (talk)

Not just spurious, but random wakeups? Really?

I've requested a better citation for the claim that a spurious wakeup can occur without the condition variable ever having been signaled. There can always exist a race between one thread waking from a signal and taking a lock to examine the condition and another thread simply taking the lock, changing the condition and unlocking. The result is that a thread can wake up on the cv but discover it's too late; the condition has already changed. But I think you need a race. If only one thread can be waiting on the cv and the condition has never been signaled, I don't believe any actual OS would generate not just spurious but completely random wakeups. The claim is sourced later in the article to a quote from a popular and likely very good programming how-to, but I question its authority on this particular claim. A better source might be a Win32 Event or Linux semaphore API man page documenting that this can happen. Can someone illuminate, please? Msnicki (talk) 17:16, 29 April 2020 (UTC)

If there are no objections or other follow-ups in the next few days, I will correct the article to indicate that spurious wakeups do not happen for no reason; they happen because of a race condition as I've described above. Msnicki (talk) 15:44, 1 May 2020 (UTC)

I've just rewritten the article. I invite comments. Msnicki (talk) 17:21, 9 May 2020 (UTC)
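The race described in this section can be sketched in C++ roughly as follows. This is only an illustration under stated assumptions: `Tokens`, `empty_wakeups` and `run_two_waiters` are hypothetical names, and `notify_all` is used deliberately so that more waiters may be woken than there are tokens to take. A waiter that wakes and finds nothing left counts the event and goes back to waiting.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical token pool used only for illustration.
struct Tokens {
    std::mutex m;
    std::condition_variable cv;
    int available = 0;      // guarded by m
    int empty_wakeups = 0;  // guarded by m: wakeups that found no token

    void release(int n) {
        {
            std::lock_guard<std::mutex> lock(m);
            available += n;
        }
        // Wakes every waiter, even when n is smaller than the number
        // of waiting threads.
        cv.notify_all();
    }

    void acquire() {
        std::unique_lock<std::mutex> lock(m);
        while (available == 0) {
            cv.wait(lock);
            if (available == 0) {
                // Woken, but another thread took the token first
                // (or the wakeup had no cause at all).
                ++empty_wakeups;
            }
        }
        assert(available > 0);
        --available;
    }
};

// Hypothetical driver: two waiters, two tokens released one at a time.
// Returns how many waiters eventually acquired a token.
int run_two_waiters() {
    Tokens t;
    std::mutex count_m;
    int acquired = 0;
    auto worker = [&] {
        t.acquire();
        std::lock_guard<std::mutex> lock(count_m);
        ++acquired;
    };
    std::thread a(worker), b(worker);
    t.release(1);
    t.release(1);
    a.join();
    b.join();
    return acquired;
}
```

How often `empty_wakeups` is incremented depends on scheduling, so the sketch makes no claim about its value; the only guaranteed outcome is that both waiters eventually get a token. The sketch also cannot tell a lost race apart from a wakeup with no cause, which is part of why the question raised above is hard to settle by experiment.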

Seems to be simply a bug in the runtime implementation

It seems to be simply a bug in the runtime implementation. Consider the code below. When debugging in Visual Studio, I occasionally see DebugBreak() called, which doesn't make sense since the condition variable is never signaled.

#include <windows.h>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

using namespace std;

struct ThreadContext {

 mutex canceledMutex;
 condition_variable canceled;

 static int _threadFunc(ThreadContext* ctx) {
   {
     unique_lock<mutex> isCanceledLock(ctx->canceledMutex);
     auto waitStatus = ctx->canceled.wait_for(isCanceledLock, 1ms);
     if (waitStatus == cv_status::no_timeout) {
       DebugBreak(); // !!! Should never get here !!!
     }
   }
   delete ctx;
   return 0;
 }

};

void repro() {

 for (int i = 0; i < 10000; i++) {
   auto ctx = new ThreadContext();
   thread _thread(ThreadContext::_threadFunc, ctx);
   _thread.detach();
 }

}

int main() {

 repro();
 return 0;

}
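For comparison, the C++ standard permits the predicate-less `wait_for` to unblock spuriously, and in that case it returns `cv_status::no_timeout`, so the repro above does not by itself show a runtime bug. A minimal sketch of the alternative follows. It assumes an explicit `canceled` flag guarded by the mutex, and the names `CancelFlag`, `wait_canceled`, `never_canceled` and `canceled_first` are hypothetical. The predicate overload of `wait_for` returns the value of the predicate, so a wakeup without a real cancellation is reported as `false`.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical cancellation flag used only for illustration.
struct CancelFlag {
    std::mutex m;
    std::condition_variable cv;
    bool canceled = false;  // guarded by m

    void cancel() {
        {
            std::lock_guard<std::mutex> lock(m);
            canceled = true;
        }
        cv.notify_all();
    }

    // Returns true only if cancel() has been called by the time the
    // wait ends; a wakeup with the flag still false is not reported
    // as a cancellation.
    bool wait_canceled(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(m);
        return cv.wait_for(lock, timeout, [this] { return canceled; });
    }
};

// Hypothetical helpers exercising both outcomes.
bool never_canceled() {
    CancelFlag f;
    return f.wait_canceled(std::chrono::milliseconds(1));
}

bool canceled_first() {
    CancelFlag f;
    f.cancel();
    return f.wait_canceled(std::chrono::milliseconds(1));
}

// Hypothetical self-check of the two cases.
void self_check() {
    assert(!never_canceled());
    assert(canceled_first());
}
```

With this shape the caller branches on the flag and not on `cv_status`, so the branch marked "Should never get here" in the repro would only run after an actual call to `cancel()`.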