Wednesday, March 12, 2014

Application Insights Deep Dive Part 5 - Monitoring Performance & Using Diagnostics

This post is Part 5 of my 'Application Insights Deep Dive' series. If you haven't yet read over the previous posts, you can check them all out here:

Application Insights Deep Dive Part 1 - Getting Started

Application Insights Deep Dive Part 2 - Building A Demo Server

Application Insights Deep Dive Part 3 - Deploying A Demo Web Application

Application Insights Deep Dive Part 4 - Monitoring Availability

If you've been working through the various steps of my previous posts, you should now have your web application up and running in Application Insights and returning valuable availability data. It should go without saying that monitoring the performance of your applications is critical to providing an optimal end-user experience. If you have a heads up on problems before your users notice, then you go from being reactive to proactive in dealing with those problems.

In this post I'll focus on monitoring performance so that you can be sure that the application is performing within its defined thresholds. Once you get an overview of the performance of your application, I'll walk through the diagnostics capability to help you triage and find answers to your application problems.

Finding Performance Problems

Once you have your applications monitored in Application Insights, finding any performance-related problems inside them is pretty easy. Here's all you need to do:

Log on to Visual Studio Online, open your Application Insights console and then click on the Performance view.

The first thing you should see is the 'Response Time and Load vs Dependencies' chart (shown below) which is useful for determining how your application is scaling under different load types over a given period of time.


If you want to change the time period for the performance chart from its default of 24 Hours, then just click the 'Selected Date Range' over in the right-hand corner and choose the range that you require.

The Response Time Distribution chart tells you how many requests to your application have returned a high latency value.
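To make the idea behind a response time distribution a bit more concrete, here's a minimal Python sketch that sorts a handful of request durations into latency buckets. The bucket boundaries, labels and function name are my own assumptions for illustration; they are not taken from Application Insights, which does this work for you behind the chart.

```python
from bisect import bisect_right

# Hypothetical upper bounds (in milliseconds) separating the latency buckets.
BUCKET_BOUNDS_MS = [100, 500, 1000, 5000]
BUCKET_LABELS = ["<100ms", "100-500ms", "500ms-1s", "1-5s", ">=5s"]


def response_time_distribution(durations_ms):
    """Count how many requests fall into each latency bucket."""
    counts = {label: 0 for label in BUCKET_LABELS}
    for duration in durations_ms:
        # bisect_right gives the number of bounds that are <= duration,
        # which is also the index of the bucket the request belongs to.
        index = bisect_right(BUCKET_BOUNDS_MS, duration)
        counts[BUCKET_LABELS[index]] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample durations, purely for demonstration.
    sample = [42, 87, 120, 450, 980, 1500, 7200, 95]
    for label, count in response_time_distribution(sample).items():
        print(f"{label:>10}: {count}")
```

The buckets on the right-hand side of such a distribution are the ones to watch, as they represent the requests with the highest latency.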


From the same performance view, you can also check out the number of exceptions that are thrown by your application per second.


You can see how much CPU, network and memory the application is using.


You even get a handy list of the Top 10 slowest requests by issue count.
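If you're wondering what a ranking 'by issue count' might look like under the hood, here's a rough Python sketch. It assumes that an 'issue' is any request slower than a threshold and that requests are grouped by URL; the threshold value, the data shape and the function name are all hypothetical, and the real product may well define these differently.

```python
from collections import Counter

# Hypothetical threshold: a request slower than this counts as one "issue".
SLOW_THRESHOLD_MS = 1000


def top_slow_requests(requests, limit=10, threshold_ms=SLOW_THRESHOLD_MS):
    """Return (url, issue_count) pairs for the URLs with the most slow requests.

    `requests` is an iterable of (url, duration_ms) tuples.
    """
    issue_counts = Counter(
        url for url, duration_ms in requests if duration_ms > threshold_ms
    )
    # most_common sorts by count, highest first, and trims to `limit` entries.
    return issue_counts.most_common(limit)


if __name__ == "__main__":
    # Made-up request log, purely for demonstration.
    log = [
        ("/checkout", 2400),
        ("/checkout", 1800),
        ("/search", 1300),
        ("/home", 120),
        ("/checkout", 3100),
        ("/search", 90),
    ]
    for url, count in top_slow_requests(log):
        print(f"{url}: {count} slow request(s)")
```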


Digging in with Diagnostics

Having nice performance graphs is all well and good if you're looking for some eye candy to stick up on a dashboard or to show the CEO, but if you want to get information as to why you are having performance problems, then you'll need to use the Diagnostics built into Application Insights to give you the answers.

With the Performance view of one of your web applications open, click on a title link from any of the dashboards to dig deeper into the diagnostic information for your application (see below for examples of links to click).


I'll click on the 'Exceptions Rate' link as I notice a small blip on the graph there that I want to investigate. This opens up the Diagnostics view (pointed out below) which is scoped to my web application and I can clearly see the spike in my exception rate at approximately 3PM as shown in the screenshot.


If you scroll down to the bottom of this same page, you will see a number of tiles grouped into 'Events' and 'Insights'.


Click on one of those tiles (I'll choose the Exception Events one), and you'll be brought to the 'Event' view where you will see a list of all the events that the application has a problem with.


Double-click on one of the events in the list (here's where things start to become really familiar again if you've used the APM piece in SCOM 2012) and you'll see something like this:


The screenshot above gives a description of the event (in this case showing a spike in memory use), and if you click on the 'Performance Counters' view, you get additional information around timeframes and other components of the application that are affected.


From this view, you have the capability to download the memory snapshot by clicking on the link as shown below. This can be useful when you want to troubleshoot memory issues offline - similar to a memory dump, but much smaller and containing just instance counts and a reference graph.


Here's another example of an event generated about slow performance when accessing pages. You can see a breakdown of how long each execution and page load takes to run from one simple view.


If you want to extract event information out of Application Insights as an IntelliTrace download, then all you have to do while viewing an event is to click on the 'Download IntelliTrace' link as shown in the image below.


If you have a lot of events to sort through in the Diagnostics\Events view, then you can filter them by Event Type, by Problem or by Date Range using the links in the top left-hand side of the screen.
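For anyone who ends up working with event data outside the portal, the same three filters are easy to picture in code. The Python sketch below is only an illustration: the `Event` class, its field names and the sample values are assumptions of mine and don't reflect how Application Insights actually stores or exports events.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Event:
    """Hypothetical shape of a diagnostic event (not the real schema)."""
    event_type: str
    problem: str
    timestamp: datetime


def filter_events(events, event_type=None, problem=None, start=None, end=None):
    """Keep events that match every filter supplied; None means 'no filter'."""
    return [
        e for e in events
        if (event_type is None or e.event_type == event_type)
        and (problem is None or e.problem == problem)
        and (start is None or e.timestamp >= start)
        and (end is None or e.timestamp <= end)
    ]


if __name__ == "__main__":
    # Made-up events, purely for demonstration.
    events = [
        Event("Exception", "NullReferenceException", datetime(2014, 3, 11, 15, 2)),
        Event("Performance", "Slow page load", datetime(2014, 3, 11, 15, 5)),
        Event("Exception", "SqlException", datetime(2014, 3, 12, 9, 30)),
    ]
    matches = filter_events(
        events,
        event_type="Exception",
        start=datetime(2014, 3, 11),
        end=datetime(2014, 3, 11, 23, 59),
    )
    for event in matches:
        print(event.timestamp, event.problem)
```

Each filter is optional, so you can combine them in the same way as the links in the view: narrow by type first, then by problem or date range as needed.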


With this new understanding of how to monitor the performance and diagnose issues inside your web applications, you're ready to move on to Part 6, where I'll explain how to monitor usage data as well as how to create your own custom dashboards in Application Insights.

4 comments:

  1. Hi Kevin,

    Just need some advice: can we use Application Insights for an offline website/application, such as a TEST or Pre-production system?


    Thanks,
    Folk

    1. Hi,
      Application Insights needs to have a connection back to your Test or Pre-production website so it can collect data on it. You can specify how it connects though (either via agent or URL), so as long as there's an internet connection to those sites, it shouldn't be a problem.

      Kevin.

  2. Can we use Application Insights for websites hosted in a secure environment, basically internal websites which are not publicly available and where the web server VMs do not have internet connectivity?

    1. Hi Manish,
      Application Insights can only be used when an internet connection is available. If you need offline monitoring of your websites, I'd suggest using System Center Operations Manager (SCOM) to do the job.

      HTH,

      Kevin.
