Question:

Locking a C++11 std::unique_lock causes a deadlock exception

田柏

2023-03-14

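I am trying to use a C++11 std::condition_variable, but when I try to lock the unique_lock associated with it from a second thread, I get the exception "Resource deadlock avoided". The thread that created it can lock and unlock it, but the second thread cannot, even though I am quite sure the unique_lock should not already be locked at the point where the second thread tries to lock it.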

FWIW, I am using gcc 4.8.1 on Linux with -std=gnu++11.

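I have written a wrapper class around the condition_variable, unique_lock, and mutex, so nothing else in my code has direct access to them. Note the use of std::defer_lock; I have already fallen into that trap :-).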

class Cond {
private:
    std::condition_variable cCond;
    std::mutex cMutex;
    std::unique_lock<std::mutex> cULock;
public:
    Cond() : cULock(cMutex, std::defer_lock)
    {}

    void wait()
    {
        std::ostringstream id;
        id << std::this_thread::get_id();
        H_LOG_D("Cond %p waiting in thread %s", this, id.str().c_str());
        cCond.wait(cULock);
        H_LOG_D("Cond %p woke up in thread %s", this, id.str().c_str());
    }

    // Returns false on timeout
    bool waitTimeout(unsigned int ms)
    {
        std::ostringstream id;
        id << std::this_thread::get_id();
        H_LOG_D("Cond %p waiting (timed) in thread %s", this, id.str().c_str());
        bool result = cCond.wait_for(cULock, std::chrono::milliseconds(ms))
                == std::cv_status::no_timeout;
        H_LOG_D("Cond %p woke up in thread %s", this, id.str().c_str());
        return result;
    }

    void notify()
    {
        cCond.notify_one();
    }

    void notifyAll()
    {
        cCond.notify_all();
    }

    void lock()
    {
        std::ostringstream id;
        id << std::this_thread::get_id();
        H_LOG_D("Locking Cond %p in thread %s", this, id.str().c_str());
        cULock.lock();
    }

    void release()
    {
        std::ostringstream id;
        id << std::this_thread::get_id();
        H_LOG_D("Releasing Cond %p in thread %s", this, id.str().c_str());
        cULock.unlock();
    }
};

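My main thread creates a RenderContext, which has a thread associated with it. From the main thread's point of view, it uses a Cond to signal the render thread to perform an action, and it can also wait on a Cond for the render thread to complete that action. The render thread waits on a Cond for the main thread to send render requests, and uses the same Cond to tell the main thread that it has completed an action when necessary. The error occurs when the render thread tries to lock the Cond to check for or wait on render requests, at which point it should not be locked at all (because the main thread is waiting on it), let alone by the same thread. Here is the output: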

DEBUG: Created window
DEBUG: OpenGL 3.0 Mesa 9.1.4, GLSL 1.30
DEBUG: setScreen locking from thread 140564696819520
DEBUG: Locking Cond 0x13ec1e0 in thread 140564696819520
DEBUG: Releasing Cond 0x13ec1e0 in thread 140564696819520
DEBUG: Entering GLFW main loop
DEBUG: requestRender locking from thread 140564696819520
DEBUG: Locking Cond 0x13ec1e0 in thread 140564696819520
DEBUG: requestRender waiting
DEBUG: Cond 0x13ec1e0 waiting in thread 140564696819520
DEBUG: Running thread 'RenderThread' with id 140564575180544
DEBUG: render thread::run locking from thread 140564575180544
DEBUG: Locking Cond 0x13ec1e0 in thread 140564575180544
terminate called after throwing an instance of 'std::system_error'
  what():  Resource deadlock avoided

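To be honest, I don't really understand what a unique_lock is for, or why condition_variable needs one rather than using a mutex directly, so that may well be the cause of the problem. I can't find a good explanation of it online.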

慕璞

2023-03-14

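Foreword: an important thing to understand about condition variables is that they can be subject to random, spurious wakeups. In other words, a CV can return from wait() without anyone having called notify_*() first. Unfortunately, there is no way to distinguish such a spurious wakeup from a legitimate one, so the only solution is to have an additional resource (at the very least a boolean) that lets you tell whether the wake-up condition is actually met.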

This additional resource should also be protected by a mutex, usually the very same one that serves as the CV's companion.
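As a minimal sketch of that idea, the boolean can be checked through the predicate overload of wait(), which re-tests the condition after every wakeup. The names `flag_mutex`, `flag_cv`, `ready`, and the three functions are hypothetical and chosen only for illustration.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical names: `ready` is the extra boolean resource, guarded by `flag_mutex`.
std::mutex flag_mutex;
std::condition_variable flag_cv;
bool ready = false;

// Blocks until `ready` is true. A spurious wakeup is harmless because
// wait(lock, pred) keeps waiting until the predicate returns true.
void wait_until_ready() {
    std::unique_lock<std::mutex> lock(flag_mutex);
    flag_cv.wait(lock, [] { return ready; });
}

// Sets the flag while holding the mutex, then notifies the waiters.
void set_ready() {
    {
        std::lock_guard<std::mutex> lock(flag_mutex);
        ready = true;
    }
    flag_cv.notify_all();
}

// Runs one waiter and one notifier; returns the flag as seen by the waiter.
bool demo_flag_wait() {
    bool seen = false;
    std::thread waiter([&] {
        wait_until_ready();
        std::lock_guard<std::mutex> lock(flag_mutex);
        seen = ready;
    });
    set_ready();
    waiter.join();
    return seen;
}
```

The predicate form is equivalent to writing `while (!ready) flag_cv.wait(lock);` by hand.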

The typical usage of a CV/mutex pair is as follows:

std::mutex mutex;
std::condition_variable cv;
Resource resource;

void produce() {
    // note how the lock only protects the resource, not the notify() call
    // in practice this makes little difference, you just get to release the
    // lock a bit earlier which slightly improves concurrency
    {
        std::lock_guard<std::mutex> lock(mutex); // use the lightweight lock_guard
        make_ready(resource);
    }
    // the point is: notify_*() don't require a locked mutex
    cv.notify_one(); // or notify_all()
}

void consume() {
    std::unique_lock<std::mutex> lock(mutex);
    while (!is_ready(resource))
        cv.wait(lock);
    // note how the lock still protects the resource, in order to exclude other threads
    use(resource);
}

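Compared with your code, note how several threads can call produce()/consume() simultaneously without worrying about a shared unique_lock: the only shared things are mutex/cv/resource, and each thread gets its own unique_lock, which forces the thread to wait its turn if the mutex is already locked by something else.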
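A likely reading of the "Resource deadlock avoided" error, offered as an assumption rather than a confirmed diagnosis: a std::unique_lock records ownership inside the wrapper object itself, and lock() is specified to throw std::system_error with resource_deadlock_would_occur when the wrapper already believes it owns the mutex. If one unique_lock is shared between threads, a second lock() call on it can therefore throw even when the caller is a different thread. The single-threaded sketch below reproduces only that rule; the function name `relock_is_rejected` is hypothetical.

```cpp
#include <mutex>
#include <system_error>

// Locks the same std::unique_lock object twice and reports whether the
// second attempt was rejected with resource_deadlock_would_occur.
bool relock_is_rejected() {
    std::mutex m;
    std::unique_lock<std::mutex> shared_lock(m, std::defer_lock);
    shared_lock.lock();      // first lock: the wrapper now records ownership
    try {
        shared_lock.lock();  // second lock on the same wrapper object
    } catch (const std::system_error& e) {
        return e.code() == std::errc::resource_deadlock_would_occur;
    }
    return false;            // reached only if no exception was thrown
}
```

Giving each thread its own unique_lock, as in produce()/consume() above, avoids this because every wrapper tracks only its own thread's ownership.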

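As you can see, the resource can't really be separated from the CV/mutex pair, which is why I said in a comment that your wrapper class isn't really a good fit IMHO, since it does try to separate them.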

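The usual approach is not to make a wrapper for the CV/mutex pair as you tried to, but for the whole CV/mutex/resource trio. For example, a thread-safe message queue where consumer threads wait on the CV until the queue has messages ready to be consumed.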
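A minimal sketch of such a queue, assuming string messages; the class name `MessageQueue` and the helper `demo_queue` are hypothetical and not taken from the question's code.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

// Hypothetical class: bundles the mutex, the CV, and the resource they guard.
class MessageQueue {
public:
    // Adds a message under the mutex, then wakes one waiting consumer.
    void push(std::string msg) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(msg));
        }
        cv_.notify_one();
    }

    // Blocks until a message is available, then removes and returns it.
    std::string pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return !queue_.empty(); });
        std::string msg = std::move(queue_.front());
        queue_.pop();
        return msg;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::string> queue_;
};

// Usage: a consumer thread pops two messages pushed by the calling thread.
std::string demo_queue() {
    MessageQueue q;
    std::string received;
    std::thread consumer([&] {
        received = q.pop();
        received += q.pop();
    });
    q.push("render");
    q.push("-done");
    consumer.join();
    return received;
}
```

Callers never see the mutex or the CV, so there is no lock object that could be shared by mistake.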

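If you really want to wrap just the CV/mutex pair, you should get rid of your lock()/release() methods, which are unsafe from a RAII point of view, and replace them with a single lock() method returning a unique_lock:

std::unique_lock<std::mutex> lock() {
    // each caller gets its own lock object; it unlocks when it goes out of scope
    return std::unique_lock<std::mutex>(cMutex);
}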

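This way you can use your Cond wrapper class in much the same way as I showed above, provided wait() is changed to take the caller's lock by reference: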

Cond cond;
Resource resource;

void produce() {
    {
        auto lock = cond.lock();
        make_ready(resource);
    }
    cond.notify(); // or notifyAll()
}

void consume() {
    auto lock = cond.lock();
    while (!is_ready(resource))
        cond.wait(lock);
    use(resource);
}
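Put together, the reworked wrapper might look like the sketch below. It assumes wait() takes the caller's lock by reference, and it uses an int plus a bool as a hypothetical stand-in for Resource; `demo_cond` is likewise only an illustration.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Sketch of the reworked wrapper: it owns the mutex and the CV, while every
// caller holds its own std::unique_lock obtained from lock().
class Cond {
public:
    std::unique_lock<std::mutex> lock() {
        return std::unique_lock<std::mutex>(cMutex);
    }
    // The lock passed in must have been obtained from this object's lock().
    void wait(std::unique_lock<std::mutex>& l) { cCond.wait(l); }
    void notify() { cCond.notify_one(); }
    void notifyAll() { cCond.notify_all(); }

private:
    std::condition_variable cCond;
    std::mutex cMutex;
};

// Hypothetical stand-in for Resource: `value` and `is_set`, both guarded
// by the mutex inside `cond`.
int demo_cond() {
    Cond cond;
    int value = 0;
    bool is_set = false;
    std::thread producer([&] {
        {
            auto lock = cond.lock();
            value = 42;
            is_set = true;
        }
        cond.notify();
    });
    int seen = 0;
    {
        auto lock = cond.lock();
        while (!is_set)
            cond.wait(lock);
        seen = value;
    }
    producer.join();
    return seen;
}
```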

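But honestly, I'm not sure it's worth it: what if you want to use a recursive_mutex instead of a plain mutex? Then you have to turn your class into a template so that you can choose the mutex type (or write a second class altogether, hooray for code duplication). And in any case you don't gain much, since you still have to write pretty much the same code to manage the resource. A wrapper class for the CV/mutex pair alone is too thin a wrapper to be really useful. But as usual, YMMV.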
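For what the template route could look like, here is a rough sketch. It swaps in std::condition_variable_any, because std::condition_variable only accepts std::unique_lock<std::mutex>; the names `BasicCond` and `demo_recursive` are hypothetical.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical template: the mutex type is a parameter.
// std::condition_variable_any is used since it works with any lock type.
template <typename Mutex>
class BasicCond {
public:
    std::unique_lock<Mutex> lock() { return std::unique_lock<Mutex>(mutex_); }
    // Caveat: with a recursive mutex, wait() releases only one level of
    // ownership, so the calling thread must hold the mutex exactly once.
    void wait(std::unique_lock<Mutex>& l) { cv_.wait(l); }
    void notify() { cv_.notify_one(); }
    void notifyAll() { cv_.notify_all(); }

private:
    std::condition_variable_any cv_;
    Mutex mutex_;
};

// Usage with std::recursive_mutex; `is_set` plays the role of the resource.
bool demo_recursive() {
    BasicCond<std::recursive_mutex> cond;
    bool is_set = false;
    std::thread producer([&] {
        {
            auto lock = cond.lock();
            is_set = true;
        }
        cond.notify();
    });
    bool seen = false;
    {
        auto lock = cond.lock();
        while (!is_set)
            cond.wait(lock);
        seen = is_set;
    }
    producer.join();
    return seen;
}
```

Even in this form, the caller still writes the flag handling and the wait loop, which is the point made above about the wrapper being thin.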
