Channel: MSDN Blogs

Intermittent connectivity issues with TF Service - 8/31/2013 - Update


Update: We have identified the issue and taken memory dumps across our AT farm. Tomorrow morning we will analyze the dumps to understand the root cause. If need be, we will release a fix over the weekend to ensure the issue is fully resolved.

As for the issue itself: when we were first alerted, we troubleshot it as an intermittent connectivity problem, because the alerts were intermittent and we were observing slow performance across parts of the service. Once we had ruled out the network and dependent services, we observed a low, spiky pattern in memory utilization across our ATs. Troubleshooting further, we noticed some memory thrashing. Our AT code has a thread that goes around freeing memory caches, and in this case the cache is being freed more aggressively than we would like. This appears to be a bug that we don't fully understand yet, so we took memory dumps across all ATs. Recycling the worker processes did not solve the issue, so we will dig deeper in the morning. Given current traffic conditions the impact appears to be fairly low, but we'd like to understand the root cause and fix the issue ASAP.

We appreciate your patience and apologize for any inconvenience this may have caused you.

----

Update: Sat, Aug 31 2013 9:37 PM        

Our analysis indicates that users may intermittently experience slow performance while connecting to the service. Our DevOps team is engaged and actively working to isolate the problem. We apologize for the inconvenience and hope to get to the bottom of the issue as soon as possible.

----

Initial Update: Sat, Aug 31 2013 5:15 PM

We are currently investigating intermittent connectivity issues with TF Service, as alerted by our monitoring systems. The issue doesn't appear to be reproducible by manually navigating the service; however, we are troubleshooting the alerts to ensure all is well. We will keep you informed of our findings.

Thank you

