Friday, July 6, 2018

Awarded Microsoft MVP 2018 for Cloud and Datacenter Management!

Last Sunday (1st July) I received a very welcome email in my inbox stating I'd been renewed as a Cloud and Datacenter Management MVP for 2018!


This email from Microsoft confirms that I'm now moving into my 7th year as an MVP and it's always a relief when it comes in as there's no guarantee that any of us will get renewed - no matter how much you think you've contributed to the community over the past year.

The MVP program enables me to network and interact with some of the best technical brains in the industry and I'm very lucky to work for an employer (Ergo) that supports me on this journey. Each year, they have given me the projects, tools and time that I need to work with Cloud and Datacenter technologies in the Microsoft space - which in turn helps me to contribute back to the community through this blog and my social media channels, and to attend and speak at conferences where I can keep learning.

Due to some internal changes to the MVP award program, this year is the first time my renewal has come up in July (I'm originally an October awardee) and as such, it's 18 months since my last renewal date. Over those last 18 months, I've kept myself busy in the community by presenting at conferences such as Experts Live Europe, the Cloud and Datacenter Conference Germany, Experts Live NL and SCOM Day Sweden. I've also spent some time authoring with some awesome MVP friends on the 'Inside the Microsoft Operations Management Suite (v2)' book.

Thanks to my family, to everyone in Microsoft and the MVP community for their help and advice over the last year and also thanks to my friends and work colleagues at Ergo for helping me get this far in the program!

Friday, June 15, 2018

Azure Monitor - Alerting Gets an Upgrade

Earlier this week, Microsoft announced some upgrades to the alerts experience inside Azure Monitor and if you've ever worked with SCOM, then a few of these changes will have a pretty familiar look about them.


New Alert Enumeration Experience
There's a new Alert Enumeration feature which delivers a centralized view of all the alerts that have occurred across your various Azure deployments. You can query alerts across multiple subscriptions and sort them based on severity, signal types, resource type, and even resolution state. The enhanced alert enumeration feature is a serious upgrade on the previous Azure Monitor Alerts experience shown in the following image...


To upgrade to the new feature, click the purple banner at the top of the old Monitor - Alerts view and you will be presented with the following new enhanced user interface...


When you've upgraded, the first thing you will notice (assuming you've already got a few alerts present across your subscriptions), is that Azure Monitor has gathered all of your alerts into a central view and sorted them by Severity.

Now, if you've used SCOM Alert Rules in the past, you'll be familiar with Microsoft's method of defining severity levels using integers (where Critical = 2, Warning = 1 and Informational = 0). In Azure Monitor, Microsoft use a similar mapping process; however, the lowest-numbered severity is the most important (which is the opposite of SCOM). You can read more about the exact Azure Monitor Alert Severity Mappings in my previous blog post here.

Clicking on any of the Severity links will then pivot you into the All Alerts page with a filter that's scoped to that particular severity.


Additional filters can then be applied to scope the view even further with options such as subscriptions, resource groups, time range and conditions to choose from.
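To make the filtering idea a bit more concrete, here's a minimal Python sketch of what scoping and sorting an alert list looks like in principle. This isn't the Azure Monitor API - the alert dictionaries, their field names and the filter_alerts function are all hypothetical and only exist for illustration.

```python
from typing import Any, Dict, List, Optional

# Hypothetical alert record: a plain dict with a few of the fields
# you can filter on in the portal. Not a real Azure SDK type.
Alert = Dict[str, Any]


def filter_alerts(
    alerts: List[Alert],
    severity: Optional[int] = None,
    subscription: Optional[str] = None,
    resource_group: Optional[str] = None,
) -> List[Alert]:
    """Return the alerts matching every filter that was supplied,
    sorted so the lowest severity number (most important) comes first."""
    matches = [
        a for a in alerts
        if (severity is None or a["severity"] == severity)
        and (subscription is None or a["subscription"] == subscription)
        and (resource_group is None or a["resource_group"] == resource_group)
    ]
    return sorted(matches, key=lambda a: a["severity"])
```

Leaving a filter as None means "don't scope on this field", which mirrors how the portal filters default to showing everything.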

Alert State Management

The next addition to Azure Monitor alerting is the new Alert State Management feature. Alert states are very similar to SCOM Alert Resolution States and in Azure Monitor, three alert resolution states are currently supported - New, Acknowledged and Closed.

You can manage the alert resolution state by drilling into an alert in the All Alerts view and clicking the Change Alert State button shown in the following image...


From there, you can use the drop-down menu to change the alert resolution state from New to either Acknowledged or Closed as shown here...


After that, you have the option to add a comment as to why you're changing the resolution state before then returning to the All Alerts view - where you should see the new Alert Resolution State assigned to your alert.

If you need to bulk-edit the resolution state of a number of alerts, then Microsoft have made this easy for you too. All you need to do is select each of the alerts that you need to modify, then hit the Change State button as shown in the following image...


Then modify your resolution state, add your comment and hit OK to return to the All Alerts view. Alert resolution states should now be easy to identify for all alerts that you've modified.

Something to keep in mind when working with these new Alert States is that they are completely separate from the Monitoring Condition, which supports two values - Fired and Resolved. The Monitoring Condition indicates whether or not the condition that created a metric alert has subsequently been resolved.

To define the Monitoring Condition, the metric alert rule samples a particular metric at regular intervals and if the criteria in the alert rule are met, a new alert is created with a condition of Fired. When the metric is sampled again and the criteria are still met, nothing happens. However, if the criteria are no longer met, the condition of the alert is changed to Resolved. The next time that the criteria are met, a new alert is created with a condition of Fired.
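The sampling behaviour described above can be sketched as a tiny state machine. This is only my own illustration in Python, assuming a simple "value is greater than a threshold" rule - the evaluate_samples function and its event strings are hypothetical and not how Azure Monitor is implemented internally.

```python
from typing import List, Optional


def evaluate_samples(samples: List[float], threshold: float) -> List[str]:
    """Walk through metric samples in order and return the alert events
    they would produce. Assumed rule: criteria are met when a sample
    is greater than the threshold."""
    events: List[str] = []
    condition: Optional[str] = None  # no alert exists yet

    for value in samples:
        criteria_met = value > threshold
        if criteria_met and condition != "Fired":
            # Criteria newly met: a new alert is created as Fired.
            condition = "Fired"
            events.append("new alert: Fired")
        elif not criteria_met and condition == "Fired":
            # Criteria no longer met: the existing alert becomes Resolved.
            condition = "Resolved"
            events.append("alert: Resolved")
        # Otherwise nothing changes, so nothing happens.

    return events
```

For example, sampling CPU values of 50, 95, 97, 60 and 99 against a threshold of 90 would fire an alert at 95, do nothing at 97, resolve at 60 and then fire a brand new alert at 99.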

Putting my SCOM hat back on again, the Monitoring Condition is a similar process to how SCOM Alert Monitors fire when a specific threshold is breached and then auto-close when that threshold is no longer breached.

One gotcha that might catch people out, however, is that even though the system may set the Monitoring Condition to Resolved, the alert state isn't changed until the user changes it manually, and vice-versa. For example, if I set the resolution state of a number of alerts to Closed, the Monitoring Condition will still show those alerts in a Fired state. The following image shows this exact scenario - where I've set the resolution state of a couple of my alerts to Closed, but as the condition that fired the alerts in the first place is still present, the alerts are still displaying a Monitoring Condition of Fired.
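A simple way to picture this gotcha is to think of every alert as carrying two independent fields. The Python sketch below is purely illustrative - the Alert class and both helper functions are hypothetical names of my own, not anything from an Azure SDK.

```python
from dataclasses import dataclass

VALID_STATES = ("New", "Acknowledged", "Closed")


@dataclass
class Alert:
    name: str
    monitoring_condition: str = "Fired"  # owned by the platform: Fired or Resolved
    alert_state: str = "New"             # owned by the user: New, Acknowledged or Closed


def user_sets_state(alert: Alert, new_state: str) -> None:
    """A user changing the resolution state never touches the condition."""
    if new_state not in VALID_STATES:
        raise ValueError(f"unsupported alert state: {new_state}")
    alert.alert_state = new_state


def platform_resolves(alert: Alert) -> None:
    """The platform resolving the condition never touches the user state."""
    alert.monitoring_condition = "Resolved"
```

So closing an alert leaves it showing Fired, and an alert that resolves by itself still sits in the New state until somebody changes it.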


Smart Groups

The final new alerting feature that I wanted to post about is Smart Groups. These contain alerts that were automatically grouped together based on either similarity, historical patterns or a combination of both. Smart Groups are automatically created using machine learning algorithms looking for similarity and co-occurrence patterns among alerts originating from a monitor service such as Log Analytics or across the rest of the Azure platform.

There are a couple of ways that you can view/access Smart Groups. The first method is to simply click the Smart Groups button from the All Alerts view in the new Alert Enumeration feature shown here...


The second method is to open the All Alerts view then click the blue banner as shown in this image...


Using Smart Groups, you can significantly reduce the number of alerts to analyze by focusing on only a handful of groups with some handy alert correlation in place.

As an example, if a performance counter such as CPU or RAM spikes on multiple virtual machines in your Azure subscription at the same time, this will generate a lot of alerts in Azure Monitor. When you click the Smart Groups feature, those alerts will get automatically grouped into a single Smart Group - offering up a much clearer picture of a common root cause.
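To give a rough feel for the correlation idea, here's a heavily simplified Python sketch. Smart Groups use machine learning, which I'm not attempting to reproduce - as a stand-in, this sketch just groups alerts that share the same signal and fire within the same time window. The field names, the five-minute window and the group_alerts function are all assumptions of mine.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_alerts(
    alerts: List[dict], window_minutes: int = 5
) -> Dict[Tuple[str, int], List[str]]:
    """Group alerts by (signal, time bucket) and return the affected
    resources for each group. A naive stand-in for real correlation."""
    groups: Dict[Tuple[str, int], List[str]] = defaultdict(list)
    for alert in alerts:
        # Alerts whose minute falls in the same window share a bucket.
        bucket = alert["minute"] // window_minutes
        groups[(alert["signal"], bucket)].append(alert["resource"])
    return dict(groups)
```

With this approach, CPU alerts from vm1 and vm2 that fire a couple of minutes apart land in one group, while a RAM alert from vm3 lands in its own.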

In the following image, you can see a Smart Group that Azure Monitor has automatically created in my subscription where it has correlated 25 alerts together based on the reason that they are very similar to other alerts that have fired. From here, I can change the alert resolution state of individual alerts or I can use the Change Smart Group State button to change the resolution state of all alerts contained in the group.


Microsoft kicked the tires with alert correlation in SCOM when they released the Exchange 2010 management pack a few years ago and although it was quite noisy, the event correlation engine it came with was a similar concept to what we now have with Smart Groups. I think this is a pretty handy feature to have in your Azure monitoring toolbox and along with all the other features that have just launched, things are looking good for the next generation of Microsoft monitoring!



Azure Monitor Alert Severity Mappings



When I first started using SCOM, one of the things that I had to quickly get my head around was how alerts that were generated by rules were defined with a Severity that mapped to an integer value (e.g. Critical = 2, Warning = 1, and Informational = 0).

With alerts in Azure Monitor, Microsoft have taken a similar approach where they have defined five alert severity levels - each one mapping to its own integer. These severity levels have been color-coded to help quickly identify alerts that should be treated as more important than others but for clarity, I've detailed the exact mappings as follows:

Azure Monitor Alert Severity Levels

Sev 0 = Critical
Sev 1 = Error
Sev 2 = Warning
Sev 3 = Informational
Sev 4 = Verbose


As you can see from the mappings above, in Azure, the lower the integer, the higher the severity - which is the opposite to alert rule severity mappings in SCOM. Hopefully this post will prove useful for any SCOM administrators who are dipping more into the Azure Monitor world over the coming year and might get slightly confused by the reverse numbering mapping between the two platforms.
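If you're scripting anything that moves alert data between the two platforms, the mappings above can be captured in a couple of lookup tables. This is a minimal Python sketch - the severity values come from this post, but matching the two platforms by severity name (and the scom_to_azure function itself) is just my own assumption about how you might want to translate them.

```python
# Azure Monitor: the lower the integer, the higher the severity.
AZURE_SEVERITY = {
    0: "Critical",
    1: "Error",
    2: "Warning",
    3: "Informational",
    4: "Verbose",
}

# SCOM alert rules: the higher the integer, the higher the severity.
SCOM_SEVERITY = {
    2: "Critical",
    1: "Warning",
    0: "Informational",
}


def scom_to_azure(scom_value: int) -> int:
    """Translate a SCOM severity integer to the Azure Monitor severity
    with the same name (assumed matching strategy)."""
    name = SCOM_SEVERITY[scom_value]
    for azure_value, azure_name in AZURE_SEVERITY.items():
        if azure_name == name:
            return azure_value
    raise KeyError(name)
```

Note that a SCOM Critical (2) comes out as Sev 0 and a SCOM Warning (1) comes out as Sev 2, which shows how easy it would be to mix the two up if you compared the raw numbers.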

If you'd like to read more about some newly announced feature enhancements in Azure Monitor, then check out my recent post here.

Wednesday, June 13, 2018

The OMS Portal is Moving to Azure

Over the last couple of years, I've worked a lot with the awesome Microsoft Operations Management Suite (aka OMS) and at one of the presentations I attended during Microsoft Ignite last year, it was announced that they would soon be retiring the OMS Portal and integrating all of its functionality directly into the Azure Portal.

Earlier this week, Microsoft confirmed that the OMS Portal would indeed be retired and all its functionality moved into the Azure Portal. The idea behind this move is to deliver a more centralized experience for monitoring and managing your on-premises and Azure-based workloads.

As it stands, nearly all of the existing OMS solutions have been available within the Azure Portal for a number of months and the only solutions still waiting to be ported over are as follows:
If you're using any of these solutions, then you'll still need to manage them within the original OMS Portal and Microsoft have committed to moving these solutions over to Azure by August 2018. When this happens, Microsoft will then communicate an official timeline for 'sunsetting' the original OMS Portal.

When this happens, the old OMS Portal that looks something like this (depending on which solutions you have enabled)...


Will then look like something similar to this in the Azure Portal...


As you can see from the two images above, they're not too dissimilar and in the Azure Portal, we get the added management benefit of being able to quickly pivot directly into Azure Resources using the navigation menu on the left or by simply drilling down into one of the dashboard widgets.

At the time of writing and along with the five OMS solutions mentioned earlier, there are still a few additional gaps that Microsoft need to address. These gaps are as follows:

  • To access a Log Analytics resource in Azure, the user must be granted access through Azure role-based access control.
  • Update schedules that were created with the OMS Portal may not be reflected in the scheduled update deployments or update job history of the Update Management dashboard in the Azure Portal. This gap is expected to be addressed by the end of June 2018.
  • The custom logs preview feature can only be enabled through the OMS Portal. By the end of June 2018, this will be automatically enabled for all workspaces.

You can read more about these gaps and the planned migration from the OMS portal to the Azure Portal in Microsoft's original post here.

They've also put together a useful FAQ post to help answer some common questions that you or your customers might have and you can access this post here.

All-in-all, I'm pretty happy with this move as I find that lately, I've been spending all of my time in the Azure Portal instead of the original OMS Portal. Having the additional management capabilities inside the Azure Portal definitely makes it a more seamless user experience and hopefully others will see the benefit of this too.

SCOM - New Community MP to Multi-Home Large Numbers of Agents

Microsoft's Kevin Holman has just released a very useful new community MP for SCOM that enables you to multi-home large numbers of agents in a phased and controlled time-frame. This is perfect for any large side-by-side migrations you might be planning from SCOM 2012 R2 to SCOM 2016 or the latest SCOM 180x release.


On earlier versions of SCOM, I've used the excellent 'Extended Agent Info Management Pack' from Jose Fehse and over the last year or so, I've been using Kevin Holman's 'SCOM Agent Management Pack' to meet the same requirement. Although both of these community MPs enable me to add or remove Management Group name references on agents (which essentially multi-homes the agent), it's still a manual task that needs to be kicked off from the console.

With Kevin's newest 'SCOM Multi-Home Management Pack', this process is made a lot easier through the use of a rule that runs periodically and which is targeted at eight pre-created SQL Query-based groups within the MP.

This means that in large environments (think thousands of agents), the management pack will query the SCOM database and then automatically distribute your agents across each of the pre-defined groups shown below.


The automatic assignment of agents to the different groups is configured by default to distribute in batches of 500 agents per group; however, you can modify this number by editing the group discovery prior to importing the MP into SCOM.
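To illustrate the batching idea, here's a minimal Python sketch of splitting a list of agents into eight groups of 500. The real MP does this with SQL query-based group discoveries, so this is only a conceptual stand-in - the assign_to_groups function, the sequential fill order and the way leftover agents are handled are all assumptions on my part.

```python
from typing import List, Tuple


def assign_to_groups(
    agents: List[str], batch_size: int = 500, group_count: int = 8
) -> Tuple[List[List[str]], List[str]]:
    """Fill each group in turn with up to batch_size agents.
    Returns the groups plus any agents that did not fit (assumed behaviour)."""
    groups = [
        agents[i * batch_size:(i + 1) * batch_size]
        for i in range(group_count)
    ]
    unassigned = agents[group_count * batch_size:]
    return groups, unassigned
```

With 1,200 agents and the defaults, the first two groups get 500 agents each, the third gets the remaining 200 and the other five groups stay empty.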

Once the groups have been populated, the MP will then perform a check once a day to validate if the agents have been multi-homed and if any haven't, then it will update those agents using a random time window - thus ensuring your OpsDB doesn't get hammered with the dreaded Event ID 2115 data insertion errors.
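The reason a random time window helps is that it spreads the work out instead of having every agent update at the same moment. Here's a small Python sketch of that general technique - it's not taken from the MP itself, and the schedule_updates function, the one-hour window and the seed parameter are hypothetical.

```python
import random
from typing import Dict, List, Optional


def schedule_updates(
    agents: List[str], window_seconds: int = 3600, seed: Optional[int] = None
) -> Dict[str, float]:
    """Give each agent a random delay (in seconds) inside the window
    so that the updates do not all land at the same time."""
    rng = random.Random(seed)  # seed only so a run can be reproduced
    return {agent: rng.uniform(0, window_seconds) for agent in agents}
```

Each agent would then wait for its own delay before making the change, which keeps the load on the database spread across the window.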

To conclude, if you're planning any side-by-side migrations that contain large numbers of agents in the near future, then you'll definitely want to try out this MP to make your job easier and to ensure your OpsDB stays healthy.

You can get the full lowdown on the MP from Kevin Holman's blog here and you can download it directly from the TechNet Gallery here.

Enjoy!

Thursday, May 10, 2018

SCOM - Security Monitoring MP has been Updated

Last year, Nathan Gau (Microsoft Premier Field Engineer) released an awesome free management pack to the community with the specific focus of enhancing your security monitoring capabilities with SCOM.

I've been using this management pack in our own environment and on customer sites for a while now and there are some really useful alerts that it can generate which give you an extra layer of security monitoring within your environment.

Some examples of the alerts include:

  • Active Directory Domain Admin/Enterprise Admin/Schema Admin group changes
  • Detecting the clearance of security logs
  • Detection of new services being created on Domain Controllers
  • Golden Ticket detection
  • AppLocker rules for detection of WCE, Mimikatz, PsExec, PowerSploit
  • Scheduled task creation

The management pack isn't designed to be the only security monitoring tool that you use and it should instead be an addition to complement your overall security alert management strategy.

Here's how the author has positioned the management pack on his blog:

"To be clear, this is not a foolproof management pack. It is another defense in depth strategy that can help an organization to determine if they are breached, potentially catching the attacker before data loss occurs. It will not catch every intrusion, so please do not assume that putting this in makes you secure. It is 100% dependent on good alert management process, a subject that I have written extensively. With that said, main goal in this design was to keep alert noise down to a minimum. The hope is that very little of this will fire out of the box. If this MP is generating alerts, they should be investigated."

Since its inception, there has been a lot of work put into this management pack with the list of contributors making up a 'who's-who' list of the best in the SCOM community.

If you're using SCOM, then I highly recommend you take this free community MP for a test drive and see for yourself the value it can add to your security monitoring arsenal.

You can get all the information you need on this MP (including the latest change log and a summary of all features) from Nathan's main blog post on it from the following link:

Introducing the Security Monitoring Management Pack for SCOM

Enjoy!

Thursday, April 26, 2018

SCOM 2016 Update Rollup 5 is Now Available

A couple of days ago, Microsoft announced the latest Update Rollup (UR5) for SCOM 2016.

The Fixes

Unlike the last UR4 release, this update comes with a raft of new bug fixes - including a handy one for when you want to run the SCOM and SCSM consoles side by side on the same server, along with a fix for a widely reported bug that occurs when performing an in-place upgrade of SCOM 2016 to the Semi-Annual Channel SCOM 1801.

Here's what you get with UR5:

  • The SCOM console and Service Manager console for PowerShell modules can now coexist on the same server. (Note: Both SCOM Update Rollup 5 (this update) and Service Manager Update Rollup 5 (update KB 4093685) must be installed to resolve this issue.)
  • Active Directory Integration rules are not visible or editable in an upgraded 2016 Management Group. This prevents the ongoing management of Active Directory integration assignment in the upgraded Management Group.
  • When the UNIX host name on the server is in lowercase, the OS and MonitoredBy information is displayed incorrectly in the Unix/Linux Computers view.
  • Active Directory integrated agents do not display correct failover server information.
  • Performance views in the web console do not persist the selection of counters after web console restart or refresh.
  • The PowerShell cmdlet Get-SCXAgent fails with error “This cmdlet requires PowerShell version 3.0 or greater.”
  • During the upgrade from SCOM 2016 to SCOM 1801, if the reporting server is installed on a server other than the management server, the upgrade fails. Additionally, you receive the error message, "The management server to which this component reports has not been upgraded."
  • If a group name has been changed through the operations console, the Get-SCOMGroup cmdlet does not retrieve the group data that includes the changed group name.
  • Error HTTP 500 occurs when you access Diagram view through the web console.
  • When you download a Linux management pack after you upgrade to SCOM 2016, the error "OpsMgr Management Configuration Service failed to process configuration request (Xml configuration file or management pack request)" occurs.
  • The SQLCommand Timeout property is exposed so that it can be dynamically adjusted by users to manage random and expected influx of data scenarios.
  • The MonitoringHost process crashes and returns the exception "System.OverflowException: Value was either too large or too small for an Int32."
  • When company knowledge is edited by using the Japanese version of Microsoft Office through the SCOM console, the error (translated in English) "Failed to launch Microsoft Word. Please make sure Microsoft Word is installed. Here is the error message: Item with specified name does not exist" occurs.
  • Accessing Silverlight dashboards displays the "Web Console Configuration Required" message because of a certificate issue.
  • Microsoft.SystemCenter.ManagementPack.Recommendations causes errors to be logged on instances of Microsoft SQL Server that have case-sensitive collations.
  • Deep monitoring displays error “Discovery_Not_Found” if the installation of JBoss application server is customized.
  • Adds support for the Lancer driver on IBM Power 8 Servers that use AIX.
  • The ComputerOptInCompatibleMonitor monitor is disabled in the Microsoft.SystemCenter.Advisor.Internal management pack. This monitor is no longer valid.

My Advice

As always, my advice for deploying this update is to head over to Kevin Holman's blog and wait for his handy step-by-step guide to get this up and running in your non-production environments first.

Monday, February 19, 2018

Speaking at the Global Azure Bootcamp 2018

This coming April 21st, I'll be presenting a session on Azure Monitoring at the Global Azure Bootcamp 2018 event in Dublin.


This annual event is now in its sixth year of running and is held on the same date in nearly 200 locations around the globe - bringing together some of the best speakers and contributors in the Azure community.

Organised as a free event by the Irish MVP community with support from the awesome people over at our local Microsoft team, we're running an agenda of three tracks side-by-side covering topics across Azure Infrastructure & Security (Track 1), Azure Compute/General (Track 2) and Azure Workshops/Lightning Talks (Track 3).

If you haven't attended one of these events before, here's the lowdown on what to expect (taken from our official event website):

"Welcome to Global Azure Bootcamp! All around the world, user groups and communities want to learn about Azure and Cloud Computing. On April 21, 2018, tech communities world-wide will come together once again in the sixth great Global Azure Bootcamp event!

In Dublin, we are organising the biggest community lead event yet, with two tracks and in-depth workshops during the day. Bootcamps are happening on the same day all over the world - come to Dublin and join in - please share your experience under the social hashtag #GlobalAzure!

It is important to point out, that while this event is *about* Azure, it is *not* a commercial event. Azure bootcamp Dublin is organised by the local MVP tech community - we are here to share our knowledge, not sell anything."


Registration is filling up fast and if you miss out on a seat at the first attempt, we've put a waiting list system in place to hopefully help you grab a cancellation spot. You can check out the full agenda and list of speakers on the day along with your free registration at our new website here - http://www.azurebootcampdublin.com/index.html

Hope to see some of you guys there!

Thursday, February 8, 2018

SCOM 1801 Has Just Been Released!

The latest release of SCOM (1801) has just been announced and it brings with it some major changes in licensing along with some nice additional features and enhancements compared to earlier versions.

Licensing Changes

This is the first release of SCOM in the new Semi-Annual Channel (SAC) model and it will enable Microsoft to deliver new capabilities to our favourite monitoring platform much faster than before - e.g. two product releases per year versus one every three or four years. Due to this faster release cadence, SAC releases only have an 18-month support policy, with the concept being similar to how we manage, deploy and get support for service packs to our operating systems and other applications.

If this short-term release cycle isn't something that you fancy, then you can still deploy SCOM using the Long-Term Servicing Channel (LTSC) model - which will provide new version releases at a much lower frequency and no new features will be added - mainly just bug fixes. With LTSC, you get up to 5 years of mainstream support followed by 5 more years of extended support - as has been standard with the versions of SCOM we've been using up to now.

Key Features

We get a number of new features with this release with my favourites being the new HTML 5 widgets, Service Map integration and the enhanced performance gains. Here's the full list of everything that's new:

  • Improved HTML5 dashboard experience 
  • Enhanced SDK performance 
  • Linux Logfile monitoring enhancements 
  • Linux Kerberos support 
  • GUI support for entering SCOM License key 
  • Service Map integration 
  • Updates and Recommendations for third-party vendor Management Packs 
  • System Center Visual Studio Authoring Extension (VSAE) support for Visual Studio 2017

The bits for this new release should start hitting your normal licensing channels for download around about now (if it's not there, give it a day or so to fully populate) and in the meantime, you can download an evaluation copy of SCOM 1801 from the Evaluation Center here.

I'll post back in the coming days with my thoughts on the new release and anything extra that I come across.

Enjoy!

SCOM 'Updates and Recommendations' Feature Now Supports External Partner MPs

Earlier this week Microsoft announced that the Updates and Recommendations feature (first introduced in SCOM 2016) will be extended for the new SCOM 1801 semi-annual release to include management pack recommendations from certified external partners - such as NiCE and Comtrade to name a few.

The screenshot below shows this new capability in action where you can see a mixture of external partner management packs offered alongside the typical Microsoft ones.


How It Works

The Updates component of this feature periodically checks for updates to the existing management packs that you've deployed into your environment and then suggests which ones to upgrade.

For the Recommendations component, a discovery scans your monitored servers for workloads/technologies that are supported for monitoring with a SCOM management pack and then suggests which ones you should download for a better monitoring experience. It will also detect and suggest any dependent management packs that you might need to bring in so you don't run into any partial import problems.
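The dependency handling is the part that's easiest to picture with a bit of code. Here's a minimal Python sketch of the general idea - recommending a pack for each discovered workload and pulling in its dependencies first. It's only a conceptual illustration and not how SCOM implements the feature; the recommend_packs function, the catalog and depends_on structures and every pack name used with it are hypothetical.

```python
from typing import Dict, List, Set


def recommend_packs(
    workloads: List[str],
    installed: Set[str],
    catalog: Dict[str, str],
    depends_on: Dict[str, List[str]],
) -> List[str]:
    """Return the packs to import, with dependencies listed before the
    packs that need them and already-installed packs left out."""
    recommended: List[str] = []
    seen: Set[str] = set(installed)

    def add(pack: str) -> None:
        if pack in seen:
            return
        seen.add(pack)  # mark first so a circular reference cannot loop
        for dependency in depends_on.get(pack, []):
            add(dependency)
        recommended.append(pack)

    for workload in workloads:
        pack = catalog.get(workload)  # None if no pack covers the workload
        if pack is not None:
            add(pack)
    return recommended
```

Listing the dependencies first means the packs can be imported in the order returned without hitting a missing reference.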

This image shows an example of how this all comes together...

I've used the Updates and Recommendations feature quite a few times in SCOM 2016 and it's definitely a big upgrade on the original 'Updates available for installed management packs' option that we had in SCOM 2012 R2 (which never really had a full up-to-date view of all current management packs anyway). This extended capability for external vendors can only be a good thing going forward.

Here's what Microsoft had to say in their original post on this new capability...

"We are extending this feature to support Management Packs authored and offered by several external technologies and consulting partners of SCOM. Partners have extended their support by signing up with the SCOM team to onboard their Management Packs to ease the Management Pack discovery problem solved by this feature. With the partner support, this feature is now able to recommend Management Packs for both Microsoft and non-Microsoft workloads."

SCOM 1801 is now generally available and you can read all about it here and download an evaluation copy of it from here.