Author Topic: Rmanager mutex error  (Read 8925 times)

Anton

  • Newbie
  • *
  • Posts: 25
    • View Profile
Rmanager mutex error
« on: June 21, 2017, 09:55:02 AM »
Dear Colleagues!

Periodically I see in Yate's log following errors:
----------------------------------------------------
20170621143618.000069 <engine:MILD> Creating new message dispatching thread (4 running)
20170621143618.749978 <engine:WARN> Engine is congested: Call endpoint mutex busy
20170621143620.000069 <engine:MILD> Creating new message dispatching thread (5 running)
20170621143622.000079 <engine:MILD> Creating new message dispatching thread (6 running)
Supervisor: child 10193 died on signal 6
Yate (10247) is starting Wed Jun 21 14:36:25 2017
...

Mentioned core dump lines:
----------------------------------------------------
Thread 1 (Thread 0x7f2f47db6700 (LWP 10168)):
#0  0x00007f2f5b56e125 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007f2f5b5713a0 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007f2f5c0cb3fb in TelEngine::Debug(int, char const*, ...) () from /usr/lib/libyate.so.5.5.1
#3  0x00007f2f5c0ce148 in TelEngine::MutexPrivate::lock(long) () from /usr/lib/libyate.so.5.5.1
#4  0x00007f2f5aee19a0 in Lock (maxwait=-1, lck=0x7f2f5b11d7a0, this=<synthetic pointer>) at ../yateclass.h:5363
#5  complete (msg=..., this=0x7f2f48104c30, minimal=<optimized out>) at ysipchan.cpp:7376
#6  (anonymous namespace)::YateSIPConnection::complete (this=0x7f2f48104c30, msg=..., minimal=false) at ysipchan.cpp:7371
#7  0x00007f2f5c109ad6 in TelEngine::Channel::message(char const*, bool, bool) () from /usr/lib/libyate.so.5.5.1
#8  0x00007f2f5c109b3d in TelEngine::Channel::connected(char const*) () from /usr/lib/libyate.so.5.5.1
#9  0x00007f2f5c106f24 in TelEngine::CallEndpoint::connect(TelEngine::CallEndpoint*, char const*, bool) () from /usr/lib/libyate.so.5.5.1
#10 0x00007f2f5876cc88 in (anonymous namespace)::ForkModule::msgExecute (msg=..., this=<error reading variable: Unhandled dwarf expression opcode 0xfa>) at callfork.cpp:782
#11 0x00007f2f5876e7c8 in (anonymous namespace)::ForkModule::received (this=0x7f2f589709e0, msg=..., id=256) at callfork.cpp:854
#12 0x00007f2f5c0f50df in TelEngine::MessageDispatcher::dispatch(TelEngine::Message&) () from /usr/lib/libyate.so.5.5.1
#13 0x00007f2f5c107cb6 in TelEngine::Router::route() () from /usr/lib/libyate.so.5.5.1
#14 0x00007f2f5c10b89b in TelEngine::Router::run() () from /usr/lib/libyate.so.5.5.1
#15 0x00007f2f5c0cfc05 in TelEngine::ThreadPrivate::run() () from /usr/lib/libyate.so.5.5.1
#16 0x00007f2f5c0cfc49 in TelEngine::ThreadPrivate::startFunc(void*) () from /usr/lib/libyate.so.5.5.1
#17 0x00007f2f5c3e9b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#18 0x00007f2f5b619fbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#19 0x0000000000000000 in ?? ()

And probably here is part of problem code:
------------------------------------------------------
bool ForkModule::msgExecute(Message& msg)
{
    CallEndpoint* ch = YOBJECT(CallEndpoint,msg.userData());
    if (!ch)
   return false;
    String dest(msg.getParam("callto"));
    if (!dest.startSkip(MOD_PREFIX))
   return false;
    ObjList* targets = 0;
    if (dest)
   targets = dest.split(' ',false);
    else {
   for (int n = 1; true; n++) {
       String prefix;
       prefix << "callto." << n;
       NamedString* ns = msg.getParam(prefix);
       if (!ns)
      break;
       if (TelEngine::null(ns))
      continue;
       // Set target parameters from enclosed list
       // Override/add new params from message sub-params
       NamedPointer* np = YOBJECT(NamedPointer,ns);
       NamedList* target = YOBJECT(NamedList,np);
       if (target) {
      np->takeData();
      target->assign(*ns);
       }
       else
      target = new NamedList(*ns);
       target->copySubParams(msg,prefix + ".");
       if (!targets)
      targets = new ObjList;
       targets->append(target);
       // Clear from initial message
       msg.clearParam(prefix,'.');
   }
    }
    if (!(targets && targets->skipNull())) {
   msg.setParam("error","failure");
   TelEngine::destruct(targets);
   return false;
    }
    CallEndpoint::commonMutex().lock();
    ForkMaster* master = new ForkMaster(targets);
    bool ok = master->connect(ch,msg.getValue("reason")) && master->startCalling(msg); // <------------------- line 782
    CallEndpoint::commonMutex().unlock();
    master->deref();
    return ok;
}

Best regards!
« Last Edit: June 26, 2017, 05:25:12 AM by Anton »

Anton

  • Newbie
  • *
  • Posts: 25
    • View Profile
Re: Callfork mutex error
« Reply #1 on: June 21, 2017, 11:30:46 AM »
Can the Rmanager block the call?
20170621195218.415632 <engine:WARN> Engine is congested: Call endpoint mutex busy
20170621195219.000076 <engine:MILD> Creating new message dispatching thread (4 running)
20170621195221.000078 <engine:MILD> Creating new message dispatching thread (5 running)
20170621195223.000072 <engine:MILD> Creating new message dispatching thread (6 running)
20170621195223.413976 <FAIL> Thread 'Call Router' could not lock mutex 'Module' owned by 'RManager Connection' waited by 0 others for 10000000 usec!
Supervisor: child 10247 died on signal 6
Yate (2613) is starting Wed Jun 21 19:52:25 2017
...

Just received a new dump of a slightly different error.
Please see attached files.

Anton

  • Newbie
  • *
  • Posts: 25
    • View Profile
Rmanager blocking error
« Reply #2 on: June 26, 2017, 05:22:48 AM »
Experience has shown that the origination reason of calls’ blockings is in the realization of the command “status” in the rmanager console.
Probably this problem is quite rare thing and didn’t attract attention, but as my yate monitoring is realized through processing of command’s output “status” with an interval 1 minute, therefore this problem can come into service several times per day.
The problem has gone completely after commands “status” has not been in progress. I think it’s necessary to check in details where and how modules “ysip” and “yrtp” answer the “status” request.
Unfortunately I cannot do it by myself as don’t’ have C++ experience.
Kindly ask you to assist in the solving of this problem.

Best regards,
Anton

marian

  • Hero Member
  • *****
  • Posts: 513
    • View Profile
Re: Rmanager mutex error
« Reply #3 on: June 27, 2017, 12:42:06 AM »
Hi,
There is, indeed, a deadlock probability between call start by fork (possible some other module also) and status handling.
It may appear if full status is requested.
If you don't need full status you may use 'status overview'
 

Anton

  • Newbie
  • *
  • Posts: 25
    • View Profile
Re: Rmanager mutex error
« Reply #4 on: June 27, 2017, 03:48:39 AM »
Hello Marian!

Should I expect that this bug will be corrected in coming yate releases or I should look for another solution?

Thanks in advance,
Anton

marian

  • Hero Member
  • *****
  • Posts: 513
    • View Profile
Re: Rmanager mutex error
« Reply #5 on: June 28, 2017, 07:50:27 AM »
It will be fixed, of course.
Can't say when (as soon as possible!).