Author Topic: Yate does not behave correctly as a signal proxy (Read 19938 times)

FT-Tech · « **on:** August 16, 2016, 01:29:17 PM »

We are trying to use Yate as a signal proxy, but some calls are processed incorrectly. From Yate's perspective, call processing on the originating and terminating sides seems to be completely decoupled.

I tried to attach the call flow picture but keep getting the error below.

"The attachments upload directory is not writable. Your attachment or avatar cannot be saved."

So we will have to email the picture to anyone interested in helping, if we can get an email address.

What I see in the graph I could not attach:

- 180 Ringing was received by Yate but never forwarded to the originating end (why?)
- About 6 seconds later, the originating end issued a CANCEL, which was never forwarded to the terminating end (why?)
- 200 OK was received from the terminating end after the originating leg had already ended completely; Yate nevertheless sent an ACK for the 200 OK without forwarding the 200 OK to the originating end, even though no originating leg existed at that time.

This behavior is quite strange, and I wonder if anyone else has experienced the same and knows how to fix it. Thanks to all of you who care to help.

marian · « **Reply #1 on:** August 17, 2016, 12:07:55 AM »

Can you try posting a yate log?

FT-Tech · « **Reply #2 on:** August 18, 2016, 01:14:44 AM »

As I was not able to attach files, here are links to the Yate debug log and the Wireshark trace graph.

marian · « **Reply #3 on:** August 18, 2016, 01:31:19 AM »

I'm sorry, the log is incomplete: internal messages are missing (chan.rtp alone is not enough).
I can't say what is happening; I need the internal messages as well.

FT-Tech · « **Reply #4 on:** August 18, 2016, 05:53:06 AM »

The debug file is enormous, so please tell me which messages are needed and I will pick them out manually, or tell me how to filter them before sending them here.

marian · « **Reply #5 on:** August 18, 2016, 06:03:30 AM »

It would need chan and call messages.
You may filter sniffed messages using:
sniffer on filter ^chan\.\|call\.

When you identify the call with the issue, look at the call.execute message return (check the log for an entry saying something like: Returned true 'call.execute' delay...):
The 'id' parameter contains the id of the incoming call leg.
The 'peerid' parameter contains the id of the outgoing call leg.

You may filter messages by the 'id' parameter containing either the incoming or the outgoing call leg id.
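The filtering described above could be done offline on a saved log. The following is only a rough sketch: it assumes each log entry starts with a line such as `Sniffed 'call.route' ...` or `Returned true 'call.execute' delay=...`, followed by `param['name'] = 'value'` lines. The function names and the set of parameters checked are hypothetical and may need adjusting to the actual log layout.

```python
import re

# Assumed entry header, e.g.: Returned true 'call.execute' delay=0.012
# or: Sniffed 'call.route' ... (a timestamp prefix before it is tolerated)
HEADER = re.compile(r"(?:Sniffed|Returned (?:true|false)) '([^']+)'")
# Assumed parameter line, e.g.: param['id'] = 'sip/597'
PARAM = re.compile(r"^\s*param\['([^']+)'\] = '(.*)'\s*$")
# Parameters that may carry a call leg id (an assumption; extend as needed)
ID_PARAMS = {"id", "peerid", "targetid", "lastpeerid"}


def split_entries(lines):
    """Group log lines into entries, each starting at a header line."""
    entry = []
    for line in lines:
        is_header = HEADER.search(line) is not None
        if is_header and entry:
            yield entry
            entry = []
        if is_header or entry:
            entry.append(line.rstrip("\n"))
    if entry:
        yield entry


def entries_for_legs(lines, leg_ids):
    """Keep only the entries that mention one of the given call leg ids."""
    wanted = set(leg_ids)
    for entry in split_entries(lines):
        for line in entry:
            m = PARAM.match(line)
            if m and m.group(1) in ID_PARAMS and m.group(2) in wanted:
                yield entry
                break
```

For example, `entries_for_legs(open("yate.log"), ["sip/578", "sip/597"])` would yield only the entries for those two legs, where `yate.log` is a placeholder file name.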

FT-Tech · « **Reply #6 on:** August 18, 2016, 11:56:12 AM »

Thank you so much for the pointers on how to prepare the logs. Even so, capturing the proper set of logs when the issue happens and pre-processing them before they can be shared is still very time consuming.

It might be good to know that Yate uses a database connection (MySQL) for both configuration and routing, and that the problem happens intermittently. When it does happen, however, it creates a disaster because the duration of the originating leg ends up considerably lower than the duration of the terminating leg.

Interestingly, I found another discussion from 2013 between "jraditchkov" and "Monica Tepelus" that seems to describe exactly the same issue we are observing here, more than 3 years and many released versions later.

This discussion started at the link below.
http://yate.null.ro/archive/?action=show_msg&actionargs[]=81&actionargs[]=70

Monica asked for a full log and jraditchkov provided one, which you can find directly at the link below.
http://yate.null.ro/archive/?action=show_msg&actionargs[]=81&actionargs[]=72

Unfortunately, the discussion ended there with no conclusion or even a clue as to what the root cause of this misbehavior might be.

Perhaps a quick review of the logs already provided at the link above could help.

Appreciate your help on this.

marian · « **Reply #7 on:** August 19, 2016, 08:40:33 AM »

I can't say if it's related to it.
Take your time with the log.
Unfortunately I won't be available for the next 2 weeks.

Monica Tepelus · « **Reply #8 on:** August 22, 2016, 02:26:06 AM »

Hi,

The issue with uploading attachments was fixed. Please upload a full log.

Thanks

FT-Tech · « **Reply #9 on:** August 23, 2016, 11:52:10 AM »

Thanks for fixing the upload problem.

I am attaching two new logs for the same bizarre yate behavior.

The first is a graph of both legs of a call, which clearly shows that the handling of the call legs is completely decoupled. Instead of receiving SIP messages and passing them through to the other end, Yate processes each leg individually. As a result, the originating leg ends up as a zero-duration call, but the terminating leg ends up as a connected call with a duration that needs to be paid for.

The other attached file is the Yate log that marian asked for.

We are using Yate version 5.4.3 on CentOS 6.6 with routing via the mysqldb module.

Furthermore, I see that the last update, about the attachments fix, was made by Monica Tepelus, who participated in the other discussion with jraditchkov about almost the same problem. It would be great if Monica could comment on whether that conversation really ended where it is logged, with no clear conclusion. Having jraditchkov's email address would also help, so we could get in touch with him directly to find out whether he ever fixed his issue.

Your help in understanding what is causing Yate's very basic call handling to break is very much appreciated.

Does anyone use yate as a softswitch for routing calls of different wholesale clients among different available suppliers? Maybe yate is not the right choice for applications like this.

Thanks.

Monica Tepelus · « **Reply #10 on:** August 24, 2016, 02:19:28 AM »

You really need to try to send the full log or as much information as possible: SIP + internal messages. You also need to post the configuration files. How is the call routed? Are you also using a PHP script to route the calls? Are you using fork or dumbchannel, or a combination of various modules? Normally a Yate channel can't live by itself for long, so there is something that allows that leg to survive without the intended peer.

You can email this to monica@null and marian@null.ro and we'll try to debug this to see if there is a bug or configuration issue.

FT-Tech · « **Reply #11 on:** August 25, 2016, 10:39:02 AM »

The requested information was sent by email due to the large size of the files. Please kindly review and respond at your earliest convenience.

FT-Tech · « **Reply #12 on:** August 28, 2016, 09:03:03 AM »

It would be nice to have a quick note about what the normal response time in issues with Yate is. It's been more than three days with no response and we have no clue when one could even be expected.

Is there any other resource for professional yate support for serious applications with some expectation about progress on issues to sign up for?

Monica Tepelus · « **Reply #13 on:** August 29, 2016, 04:47:38 AM »

There is no "normal response time", especially not when answering forum posts with no support contract involved. This is done by the developers when they have free time and goodwill. Considering that this is the holiday season, the remaining people are already overworked and stretched over many things.

Concerning your issue, it looks to me like your system is under really heavy load. I see some call.cdr messages are delayed by more than 30 seconds.
My guess is that the SIP thread can't process all the traffic that is sent to it, so it doesn't send the chan.disconnected message in a timely manner. In the message pasted below you can see it's delayed by 40 seconds. You can run "top" and then press H to see if the SIP thread uses 100% of a CPU (when this issue happens).

Returned false 'chan.disconnected' delay=40.487222
thread=0xb7d110 'Engine Worker'
data=0x7f4ff401ad00
retval='(null)'
param['id'] = 'sip/597'
param['module'] = 'sip'
param['status'] = 'answered'
param['address'] = '218.213.249.2:5060'
param['targetid'] = 'sip/578'
param['billid'] = '1471838388-301'
param['lastpeerid'] = 'sip/578'
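To see how widespread such delays are, the log could be scanned for every returned message whose `delay=` value exceeds some limit. This is only a sketch: the header pattern is taken from the excerpt above, while the function name and the 1-second default threshold are hypothetical choices.

```python
import re

# Assumed header format, based on the excerpt above:
#   Returned false 'chan.disconnected' delay=40.487222
RETURNED = re.compile(
    r"Returned (?:true|false) '([^']+)' delay=([0-9]+(?:\.[0-9]+)?)"
)


def slow_messages(lines, threshold=1.0):
    """Return (message name, delay in seconds) for entries slower than threshold."""
    slow = []
    for line in lines:
        m = RETURNED.search(line)
        if m and float(m.group(2)) > threshold:
            slow.append((m.group(1), float(m.group(2))))
    return slow
```

Many slow entries spread across unrelated calls would point toward general congestion; a few isolated ones would suggest looking closer at those specific calls.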

Try using a more powerful machine or divide your traffic between multiple servers. You should also analyze the duration of your database functions/procedures to see if they need to be improved. You can wait for Marian to return from holiday to see if he can find something more, but overall your system is congested.

FT-Tech · « **Reply #14 on:** August 29, 2016, 03:34:47 PM »

Thank you for the response.

We understand that a quick response cannot be expected in a forum where participants give their time and expertise on a goodwill basis. Longer delays would definitely be more tolerable on educational or experimental platforms, as they are just for learning purposes, which obviously is not the subject of most posts in this forum. We were under the impression that this forum is managed by the creators of Yate for professional discussion about the open solution.

We looked for a paid support option for Yate but could not find anything. Let us know if there is something like this with an expected SLA in the contract that we could consider.

Regarding the posted response, we have two concerns.

CPU utilization was one of our concerns as well, hence we logged the values at the time of the problem.

When Yate was performing incorrectly, the CPU was 85% idle. Also, calls were hitting Yate at an average of 5 CPS at the time. There might have been longer delays in getting SQL query results at the time; however, that should have affected the routing but not the call sequence.

The second and main concern is the way Yate behaves when the machine is supposedly under heavy load. If the machine is under heavy load, then everything should slow down while keeping the same expected sequence. However, the previously provided graph makes it clear that Yate had no problem processing the call flow on the CLIENT and PROVIDER sides separately.

Yate, as a proxy sitting in the middle, should send some of the messages only once they are received from the other end. Below is part of the "SIP Call Flow Examples" draft, section 3.1.7, which can be accessed through the link below.

https://www.ietf.org/proceedings/51/I-D/draft-ietf-sip-call-flows-05.txt

IFT UA             Proxy           IFTGW UA
|     F1 INVITE      |                    |
|------------------->|     F2 INVITE      |
|                    |------------------->|
|   F3 100 Trying    |                    |
|<-------------------|   F4 100 Trying    |
|                    |<-------------------|
|                    |   F5 180 Ringing   |
|   F6 180 Ringing   |<-------------------|
|<-------------------|                    |
|                    |     F7 200 OK      |
|     F8 200 OK      |<-------------------|
|<-------------------|                    |
|       F9 ACK       |                    |
|------------------->|      F10 ACK       |
|                    |------------------->|
|     Both Way RTP Media Established      |
|<=======================================>|

As an example, the ACK for the 200 OK (message F10 in the graph above) could only be passed to IFTGW UA after it is received from IFT UA (message F9 in the graph above).
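The ordering expected here can be written down as a small check over a message trace. This is a sketch of the expectation stated in this post, not of how Yate works internally. The trace format (`leg`, `direction`, `message` tuples), the leg names and the function name are all hypothetical; 100 Trying is left out of the relayed set because it is generated hop by hop.

```python
from collections import Counter

OTHER = {"client": "provider", "provider": "client"}


def check_relay(trace, relayed=("INVITE", "180", "200", "ACK", "CANCEL")):
    """Check a trace, as seen at the middle box, against pass-through ordering.

    trace: ordered (leg, direction, message) tuples, where leg is "client"
    or "provider" and direction is "recv" or "sent".
    Returns (unmatched_sends, never_forwarded).
    """
    pending = Counter()   # messages received on a leg and not yet relayed
    unmatched_sends = []  # sends with no earlier receive on the other leg
    for leg, direction, message in trace:
        if message not in relayed:
            continue
        if direction == "recv":
            pending[(leg, message)] += 1
        elif pending[(OTHER[leg], message)] > 0:
            pending[(OTHER[leg], message)] -= 1
        else:
            unmatched_sends.append((leg, message))
    never_forwarded = sorted(key for key, n in pending.items() if n > 0)
    return unmatched_sends, never_forwarded


# The call reported in this thread, reduced to the messages named in the first post.
reported = [
    ("client", "recv", "INVITE"), ("provider", "sent", "INVITE"),
    ("provider", "recv", "180"),
    ("client", "recv", "CANCEL"),
    ("provider", "recv", "200"), ("provider", "sent", "ACK"),
]
print(check_relay(reported))
```

For the draft flow F1 to F10 both returned lists are empty. For the reported call, the ACK sent on the provider leg shows up as an unmatched send, and the 180, CANCEL and 200 show up as never forwarded.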

In the previously supplied graph for our call, you can see that Yate sent the ACK for the 200 OK on the PROVIDER side on its own, without receiving that message from the CLIENT side. The expected order was therefore not followed by Yate in our case. That was our main concern, as it left one unconnected call on the CLIENT side and one FALSELY connected call (based on incorrect SIP message handling) with billable duration on the PROVIDER side.

How can this behavior be explained by the machine being under heavy load when no lag can be seen in the SIP message exchange on either the Yate-to-CLIENT leg or the Yate-to-PROVIDER leg individually?

Doesn't sending this SIP message without receiving it from the other end look like an obvious bug, regardless of whether the machine is under heavy load or not?

Thank you.
