Wednesday, July 25, 2018

SCOM - New Management Packs for Connecting to OMS

Microsoft have just announced the following three new management packs to connect your SCOM environments to Azure Log Analytics/OMS:

  • For SCOM 1801, download the management pack from here. 
  • For SCOM 2016, download the management pack from here. 
  • For SCOM 2012 R2, download the management packs from here.

These new MPs contain a new on-boarding wizard (shown below) that enables your SCOM environments to communicate with the new OMS/Azure APIs.


For all new SCOM to OMS connections, you need to import the relevant management pack to your environment first.

If you've already configured an OMS connection, then you don't need to deploy the new management pack for now. However, if you need to reconfigure that connection, you will then need to import it.


SCOM 1807 - What's New

Yesterday, Microsoft announced the General Availability of the latest release of our favourite monitoring platform - SCOM 1807.

SCOM 1807 is the second release this year in Microsoft's new Semi-Annual Channel licensing model and it follows hot on the heels of its predecessor SCOM 1801.

As promised by Microsoft, I found the in-place upgrade process from 1801 to 1807 seamless and not much different than deploying a typical Update Rollup to your SCOM environment.

Key Features

This release comes bundled with a lot of new useful features to play with. Here's a rundown on what you get:

New HTML5 dashboard PowerShell Widget

Use this new PowerShell widget to execute scripts for a more customised visualisation within your HTML5 dashboards.

Effective Configuration Web Console Dashboard Drill Down

Clicking on a monitored object from the HTML5 dashboard console now gives users the option to view the effective configuration information of specific rules or monitors.



Scheduled Maintenance Mode from the Web Console

This is a feature that people have been requesting for quite some time and since SCOM 2016, it has been possible to create and configure scheduled maintenance mode from the full console. With SCOM 1807, we now get this functionality in the HTML5 web console.



Create and Manage HTML5 Dashboards from My Workspace

If you wish to use the built-in Role Based Access Control feature of SCOM to restrict operator access to just the areas of monitoring they need to see, then with SCOM 1807, those operators can use the My Workspace area to configure user-specific dashboards that are only applicable to themselves.



Improved Network Monitoring from the Web Console

A key area of monitoring for most customers is to gain visibility of the health and performance of their network devices and although the full SCOM console has some very handy network monitoring capabilities, there was very little you could work with from the web console. This has changed in SCOM 1807 and we can now pivot from a monitored network device in one of our custom dashboards such as this...


To a new Network Summary dashboard like this....


Then from there, we can drill down even further to an interface performance dashboard like this...



Cleaner Alert Resolution Management from the Web Console

If you drill down into an alert from the HTML5 web console, you now get a cleaner management experience for changing the resolution state and viewing all properties of the alert from one screen.



Icon Sizing in the Topology Widget

This is a simple but very useful new feature for anyone who uses the 'Topology Widget' for their dashboards. With SCOM 1807, we can now re-size the icons that we use to display health status for our objects (small or large are the only two options at the moment)...



Enable/Disable the APM Feature During Agent Deployment

Now, this is something that could have really saved a lot of time and hassle when SCOM 2016 first launched. I blogged at the time about how the SCOM 2016 agent was crashing IIS application pools and this caused a lot of pain for us when we realised it was the APM feature that comes bundled into the agent installer and can only be removed using command line/scripting.

Although that issue was resolved in SCOM 2016 through an Update Rollup release, there are still a number of other reasons why you might not want to install the APM component of the agent onto your SCOM-monitored servers and with SCOM 1807, you can now enable or disable the APM feature during initial installation as shown here...



Linux Agent Log Rotation

In the past, customers have complained about the SCX log frequently filling up on their Linux agents - causing the system disk to run out of space and the system to become unresponsive. The only solution then was to manually clear out the logs but in SCOM 1807, Microsoft have introduced a logrotate feature to address this issue and stop the system disk from filling up.

SQL Support

If you're looking to deploy SCOM 1807 as a fresh installation and want to deploy the latest release of SQL alongside it, then you might be disappointed to know that you can't install it directly onto a fresh deployment of SQL 2017. Instead, you must first install SQL 2016 and then upgrade that installation to SQL 2017.

Also, if you're currently running SCOM 1801 with SQL 2016 and wish to upgrade to both SCOM 1807 and SQL 2017, then you must first carry out the SCOM 1801 to 1807 in-place upgrade and once that's complete, then you can upgrade SQL 2016 to SQL 2017.


Conclusion

I was already a big fan of SCOM 1801 and after working through the simple in-place upgrade to 1807 and playing around with all of these new features and enhancements, I'm really looking forward to working with our customers and getting this release deployed on a wider scale. This experience also bodes well for the Semi-Annual Channel licensing model as it's the first time I have performed an in-place upgrade of SCOM in production without it breaking anything!


Friday, July 6, 2018

Awarded Microsoft MVP 2018 for Cloud and Datacenter Management!

Last Sunday (1st July) I received a very welcome email into my inbox stating I'd been renewed as a Cloud and Datacenter Management MVP for 2018!


This email from Microsoft confirms that I'm now moving into my 7th year as an MVP and it's always a relief when it comes in as there's no guarantee that any of us will get renewed - no matter how much you think you've contributed to the community over the past year.

The MVP program enables me to network and interact with some of the best technical brains in the industry and I'm very lucky to work for an employer (Ergo) that supports me on this journey. Each year, they have given me the projects, tools and time that I need to work with Cloud and Datacenter technologies in the Microsoft space - which in turn, helps me to contribute back to the community through this blog and my social media channels, and to attend/speak at conferences where I can maximize my learning curve.

Due to some internal changes to the MVP award program, this year is the first time my renewal has come up in July (I'm originally an October awardee) and as such, it's 18 months since my last renewal date. Over those last 18 months, I've kept myself busy in the community by presenting at conferences such as Experts Live Europe, the Cloud and Datacenter Conference Germany, Experts Live NL and SCOM Day Sweden. I've also spent some time authoring with some awesome MVP friends on the 'Inside the Microsoft Operations Management Suite (v2)' book.

Thanks to my family, to everyone in Microsoft and the MVP community for their help and advice over the last year and also thanks to my friends and work colleagues at Ergo for helping me get this far in the program!

Friday, June 15, 2018

Azure Monitor - Alerting Gets an Upgrade

Earlier this week, Microsoft announced some upgrades to the alerts experience inside Azure Monitor and if you've ever worked with SCOM, then a few of these changes will have a pretty familiar look about them.


New Alert Enumeration Experience
There's a new Alert Enumeration feature which delivers a centralized view of all the alerts that have occurred across your various Azure deployments. You can query alerts across multiple subscriptions and sort them based on severity, signal types, resource type, and even resolution state. The enhanced alert enumeration feature is a serious upgrade on the previous Azure Monitor Alerts experience shown in the following image...


To upgrade to the new feature, click the purple banner at the top of the old Monitor - Alerts view and you will be presented with the following new enhanced user interface...


When you've upgraded, the first thing you will notice (assuming you've already got a few alerts present across your subscriptions), is that Azure Monitor has gathered all of your alerts into a central view and sorted them by Severity.

Now, if you've used SCOM Alert Rules in the past, you'll be familiar with Microsoft's method of defining severity levels using integers (where Critical = 2, Warning = 1 and Informational = 0). In Azure Monitor, Microsoft use a similar mapping process. However, the lower the number, the more important the severity (which is the opposite of SCOM). You can read more about the exact Azure Monitor Alert Severity Mappings in my previous blog post here.

Clicking on any of the Severity links will then pivot you into the All Alerts page with a filter that's scoped to that particular severity.


Additional filters can then be applied to scope the view even further with options such as subscriptions, resource groups, time range and conditions to choose from.

Alert State Management

The next addition to Azure Monitor alerting is the new Alert State Management feature. Alert states are essentially very similar to SCOM Alert Resolution States and in Azure Monitor, three alert resolution states are currently supported - New, Acknowledged and Closed.

You can manage the alert resolution state by drilling into an alert in the All Alerts view and clicking the Change Alert State button shown in the following image...


From there, you can use the drop-down menu to change the alert resolution state from New to either Acknowledged or Closed as shown here...


After that, you have the option to add a comment as to why you're changing the resolution state before then returning to the All Alerts view - where you should see the new Alert Resolution State assigned to your alert.

If you need to bulk-edit the resolution state of a number of alerts, then Microsoft have made this easy for you too. All you need to do is select each of the alerts that you need to modify, then hit the Change State button as shown in the following image...


Then modify your resolution state, add your comment and hit OK to return to the All Alerts view. Alert resolution states should now be easy to identify for all alerts that you've modified.

Something to keep in mind when working with these new Alert States is that they are completely separate from the Monitoring Condition - which supports two values - Fired and Resolved.  The Monitoring Condition indicates whether or not the condition that created a metric alert has subsequently been resolved.

To define the Monitoring Condition, the metric alert rules sample a particular metric at regular intervals and if the criteria in the alert rule are met, then a new alert is created with a condition of Fired. When the metric is sampled again and the criteria are still met, then nothing happens. However, if the criteria are no longer met, then the condition of the alert is changed to Resolved. The next time that the criteria are met, a new alert is created with a condition of Fired.

Putting my SCOM hat back on again, the Monitoring Condition is a similar process to how SCOM Alert Monitors fire when a specific threshold is breached and then auto-close when that threshold is no longer breached.

One gotcha that might catch people out, however, is that even though the system may set the Monitoring Condition to Resolved, the alert state isn't changed until the user changes it manually and vice-versa. For example, if I set the resolution state of a number of alerts to Closed, the Monitoring Condition will still show those alerts in a Fired state. The following image shows this exact scenario - where I've set the resolution state of a couple of my alerts to Closed, but as the metric condition that fired the alerts in the first place is still present, the alerts are still displaying a Monitoring Condition of Fired.
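To make the separation between the two fields a bit more concrete, here's a minimal Python sketch of the behaviour described above. The `Alert` class and the `evaluate` and `simulate` functions are my own illustrative names and are not part of any Azure SDK or API - the sketch just models one sampling pass of a metric alert rule and shows that the user-controlled alert state and the platform-controlled Monitoring Condition never touch each other.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    # Set by the platform each time the rule samples the metric.
    monitoring_condition: str = "Fired"
    # Set only by the user (New, Acknowledged or Closed).
    alert_state: str = "New"


def evaluate(alerts, criteria_met):
    """Apply one sampling pass of a hypothetical metric alert rule."""
    fired = next((a for a in alerts if a.monitoring_condition == "Fired"), None)
    if criteria_met and fired is None:
        # Criteria met and nothing currently fired: create a new alert.
        alerts.append(Alert())
    elif not criteria_met and fired is not None:
        # Criteria no longer met: the platform resolves the condition,
        # but deliberately leaves the user-facing alert state alone.
        fired.monitoring_condition = "Resolved"
    # Criteria still met with an alert already fired: nothing happens.
    return alerts


def simulate(samples):
    """Run a sequence of True/False sampling results through the rule."""
    alerts = []
    for criteria_met in samples:
        evaluate(alerts, criteria_met)
    return alerts
```

Running `simulate([True, True, False, True])` leaves two alerts behind: the first has a Monitoring Condition of Resolved but is still in the New state, and the second is Fired. Setting `alert_state = "Closed"` on that second alert by hand doesn't change its Monitoring Condition, which is exactly the gotcha described above.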


Smart Groups

The final new alerting feature that I wanted to post about is Smart Groups. These contain alerts that were automatically grouped together based on either similarity, historical patterns or a combination of both. Smart Groups are automatically created using machine learning algorithms looking for similarity and co-occurrence patterns among alerts originating from a monitor service such as Log Analytics or across the rest of the Azure platform.

There are a couple of ways that you can view/access Smart Groups. The first method is to simply click the Smart Groups button from the All Alerts view in the new Alert Enumeration feature shown here...


The second method is to open the All Alerts view then click the blue banner as shown in this image...


Using Smart Groups, you can significantly reduce the number of alerts to analyze by focusing on only a handful of groups with some handy alert correlation in place.

As an example, if a performance counter such as CPU or RAM spikes on multiple virtual machines in your Azure subscription at the same time, this will generate a lot of alerts in Azure Monitor. When you click the Smart Groups feature, those alerts will get automatically grouped into a single Smart Group - offering up a much clearer picture of a common root cause.
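As a rough illustration of the idea (and only the idea), the following Python sketch groups alerts that share the same signal and fire within the same time window. This is a simple rule-based stand-in that I've made up for this post - the real Smart Groups feature uses Microsoft's machine learning algorithms, which aren't reproduced here. The `group_alerts` function, the tuple layout and the five-minute window are all assumptions.

```python
from collections import defaultdict


def group_alerts(alerts, window_minutes=5):
    """Group (resource, signal, minute) tuples by signal and time bucket.

    Alerts for the same signal that fire inside the same window end up
    in one group, mimicking the co-occurrence idea in a very crude way.
    """
    groups = defaultdict(list)
    for resource, signal, minute in alerts:
        bucket = minute // window_minutes
        groups[(signal, bucket)].append(resource)
    return dict(groups)
```

For example, CPU alerts from three virtual machines that fire at minutes 0, 1 and 3 all land in the single `("cpu", 0)` group, while a disk alert at minute 40 ends up in a group of its own.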

In the following image, you can see a Smart Group that Azure Monitor has automatically created in my subscription where it has correlated 25 alerts together based on the reason that they are very similar to other alerts that have fired. From here, I can change the alert resolution state of individual alerts or I can use the Change Smart Group State button to change the resolution state of all alerts contained in the group.


Microsoft kicked the tires with alert correlation in SCOM when they released the Exchange 2010 management pack a few years ago and although it was quite noisy, the event correlation engine it came with was a similar concept to what we now have with Smart Groups. I think this is a pretty handy feature to have in your Azure monitoring toolbox and along with all the other features that have just launched, things are looking good for the next generation of Microsoft monitoring!



Azure Monitor Alert Severity Mappings



When I first started using SCOM, one of the things that I had to quickly get my head around was how alerts that were generated by rules were defined with a Severity that mapped to an integer value (e.g. Critical = 2, Warning = 1, and Informational = 0).

With alerts in Azure Monitor, Microsoft have taken a similar approach where they have defined five alert severity levels - each one mapping to its own integer. These severity levels have been color-coded to help quickly identify alerts that should be treated as more important than others but for clarity, I've detailed the exact mappings as follows:

Azure Monitor Alert Severity Levels

Sev 0 = Critical
Sev 1 = Error
Sev 2 = Warning
Sev 3 = Informational
Sev 4 = Verbose


As you can see from the mappings above, in Azure, the lower the integer, the higher the severity - which is the opposite to alert rule severity mappings in SCOM. Hopefully this post will prove useful for any SCOM administrators who are dipping more into the Azure Monitor world over the coming year and might get slightly confused by the reverse numbering mapping between the two platforms.
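If you ever need to translate between the two numbering schemes in a script, a small lookup table is all it takes. The Python sketch below uses the two sets of mappings listed in this post. How the five Azure levels fold into SCOM's three is my own assumption (Error treated as Critical, Verbose treated as Informational) and `azure_to_scom` is a hypothetical helper, not something that ships with either product.

```python
# Azure Monitor severities as listed above (lower number = more severe).
AZURE_SEVERITY = {
    0: "Critical",
    1: "Error",
    2: "Warning",
    3: "Informational",
    4: "Verbose",
}

# SCOM alert rule severities (higher number = more severe).
SCOM_SEVERITY = {2: "Critical", 1: "Warning", 0: "Informational"}

# Assumed fold-down from five Azure levels to three SCOM levels.
AZURE_TO_SCOM = {0: 2, 1: 2, 2: 1, 3: 0, 4: 0}


def azure_to_scom(azure_sev):
    """Return the SCOM severity integer for an Azure Monitor 'Sev N' value."""
    if azure_sev not in AZURE_TO_SCOM:
        raise ValueError(f"Unknown Azure Monitor severity: {azure_sev}")
    return AZURE_TO_SCOM[azure_sev]
```

With this in place, `SCOM_SEVERITY[azure_to_scom(0)]` gives back "Critical", even though the integer behind it flips from 0 on the Azure side to 2 on the SCOM side.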

If you'd like to read more about some newly announced feature enhancements in Azure Monitor, then check out my recent post here.

Wednesday, June 13, 2018

The OMS Portal is Moving to Azure

Over the last couple of years, I've worked a lot with the awesome Microsoft Operations Management Suite (aka OMS) and at one of the presentations I attended during Microsoft Ignite last year, it was announced that they would soon be retiring the OMS Portal and integrating all of its functionality directly into the Azure Portal.

Earlier this week, Microsoft confirmed that the OMS Portal would indeed be retired and all its functionality moved into the Azure Portal. The idea behind this move is to deliver a more centralized experience for monitoring and managing your on-premises and Azure-based workloads.

As it stands, nearly all of the existing OMS solutions have been available within the Azure Portal for a number of months and the only solutions still waiting to be ported over are as follows:

If you're using any of these solutions, then you'll still need to manage them within the original OMS Portal and Microsoft have committed to moving these solutions over to Azure by August 2018. When this happens, Microsoft will then communicate an official timeline for 'sunsetting' the original OMS Portal.

When this happens, the old OMS Portal that looks something like this (depending on which solutions you have enabled)...


Will then look like something similar to this in the Azure Portal...


As you can see from the two images above, they're not too dissimilar and in the Azure Portal, we get the added management benefit of being able to quickly pivot directly into Azure Resources using the navigation menu on the left or by simply drilling down into one of the dashboard widgets.

At the time of writing and along with the five OMS solutions mentioned earlier, there are still a few additional gaps that Microsoft need to address. These gaps are as follows:

  • To access Log Analytics resource in Azure, the user must be granted access through Azure role-based access.
  • Update schedules that were created with the OMS portal may not be reflected in the scheduled update deployments or update job history of the Update management dashboard in the Azure portal. This gap is expected to be addressed by the end of June 2018.
  • Custom logs preview feature can only be enabled through OMS Portal. By the end of June 2018, this will be automatically enabled for all work spaces.

You can read more about these gaps and the planned migration from the OMS portal to the Azure Portal in Microsoft's original post here.

They've also put together a useful FAQ post to help answer some common questions that you or your customers might have and you can access this post here.

All-in-all, I'm pretty happy with this move as I find that lately, I've been spending all of my time in the Azure Portal instead of the original OMS Portal. Having the additional management capabilities inside the Azure Portal definitely makes it a more seamless user experience and hopefully others will see the benefit of this too.

SCOM - New Community MP to Multi-Home Large Numbers of Agents

Microsoft's Kevin Holman has just released a very useful new community MP for SCOM that enables you to multi-home large numbers of agents in a phased and controlled time-frame. This is perfect for any large side-by-side migrations you might be planning from SCOM 2012 R2 to SCOM 2016 or the latest SCOM 180x release.


On earlier versions of SCOM, I've used the excellent 'Extended Agent Info Management Pack' from Jose Fehse and over the last year or so, I've been using Kevin Holman's 'SCOM Agent Management Pack' to meet the same requirement. Although both of these community MPs enable me to add or remove Management Group name references on agents (which essentially multi-homes the agent), it's still a manual task that needs to be kicked off from the console.

With Kevin's newest 'SCOM Multi-Home Management Pack', this process is made a lot easier through the use of a rule that runs periodically and which is targeted at eight pre-created SQL Query-based groups within the MP.

This means that in large environments (think thousands of agents), the management pack will query the SCOM database and then automatically distribute your agents across each of the pre-defined groups shown below.


The automatic assignment of agents to the different groups is configured by default to distribute in batches of 500 agents per group. However, you can modify this number by editing the group discovery prior to importing the MP into SCOM.
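To picture how that distribution works out, here's a minimal Python sketch that slices a list of agent names into eight groups of up to 500 each. This is not the MP's actual SQL query - `assign_to_groups` is a hypothetical function, the alphabetical ordering is my assumption, and I've simply returned any agents beyond the capacity of the eight groups as a leftover list because the post doesn't cover that case.

```python
def assign_to_groups(agents, batch_size=500, group_count=8):
    """Split agent names into numbered groups of at most batch_size each.

    Returns (groups, leftover), where groups maps 1..group_count to a
    list of agents and leftover holds agents that did not fit anywhere.
    """
    ordered = sorted(agents)
    groups = {
        number: ordered[(number - 1) * batch_size : number * batch_size]
        for number in range(1, group_count + 1)
    }
    leftover = ordered[group_count * batch_size :]
    return groups, leftover
```

With 1,200 agents and the default batch size, the first two groups hold 500 agents each, the third holds the remaining 200 and groups four to eight stay empty.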

Once the groups have been populated, the MP will then perform a check once a day to validate if the agents have been multi-homed and if any haven't, then it will update those agents using a random time window - thus ensuring your OpsDB doesn't get hammered with the dreaded Event ID 2115 data insertion errors.
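The reason a random time window helps is that it spreads the work out instead of having every agent reconfigure at the same moment. The following Python sketch shows the general technique only. The `schedule_updates` function, the one-hour default window and the `seed` parameter are my own assumptions for illustration, as the actual window used by the MP isn't covered in this post.

```python
import random


def schedule_updates(agents, window_seconds=3600, seed=None):
    """Give each agent a random start delay (in seconds) inside the window.

    Passing a seed makes the schedule repeatable, which is handy for testing.
    """
    rng = random.Random(seed)
    return {agent: rng.randrange(window_seconds) for agent in agents}
```

Each agent gets its own delay somewhere between zero and the end of the window, so the data arriving at the OpsDB is staggered rather than landing all at once.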

To conclude, if you're planning any side-by-side migrations that contain large numbers of agents in the near future, then you'll definitely want to try out this MP to make your job easier and to ensure your OpsDB stays healthy.

You can get the full lowdown on the MP from Kevin Holman's blog here and you can download it directly from the TechNet Gallery here.

Enjoy!

Thursday, May 10, 2018

SCOM - Security Monitoring MP has been Updated

Last year, Nathan Gau (Microsoft Premier Field Engineer) released an awesome free management pack to the community with the specific focus of enhancing your security monitoring capabilities with SCOM.

I've been using this management pack in our own environment and on customer sites for a while now and there are some really useful alerts that it can generate which give you an extra layer of security monitoring within your environment.

Some examples of the alerts include:

  • Active Directory Domain Admin/Enterprise Admin/Schema Admin group changes
  • Detecting the clearance of security logs
  • Detection of new services being created on Domain Controllers
  • Golden Ticket detection
  • App Locker rules for detection of WCE, Mimikatz, PSExec, Powersploit
  • Scheduled task creation

The management pack isn't designed to be the only security monitoring tool that you use and it should instead be an addition to complement your overall security alert management strategy.

Here's how the author has positioned the management pack on his blog:

"To be clear, this is not a foolproof management pack. It is another defense in depth strategy that can help an organization to determine if they are breached, potentially catching the attacker before data loss occurs. It will not catch every intrusion, so please do not assume that putting this in makes you secure. It is 100% dependent on good alert management process, a subject that I have written extensively. With that said, main goal in this design was to keep alert noise down to a minimum. The hope is that very little of this will fire out of the box. If this MP is generating alerts, they should be investigated."

Since its inception, there has been a lot of work put into this management pack with the list of contributors making up a 'who's-who' list of the best in the SCOM community.

If you're using SCOM, then I highly recommend you take this free community MP for a test drive and see for yourself the value it can add to your security monitoring arsenal.

You can get all the information you need on this MP (including the latest change log and a summary of all features) from Nathan's main blog post on it from the following link:

Introducing the Security Monitoring Management Pack for SCOM

Enjoy!

Thursday, April 26, 2018

SCOM 2016 Update Rollup 5 is Now Available

A couple of days ago, Microsoft announced the latest Update Rollup (UR5) for SCOM 2016.

The Fixes

Unlike the last UR4 release, this update comes with a raft of new bug fixes - including a handy one for when you want to co-exist the SCOM and SCSM consoles on the same server along with a fix for a widely reported bug that occurs when performing an in-place upgrade of SCOM 2016 to the Semi-Annual Channel SCOM 1801.

Here's what you get with UR5:

  • The SCOM console and Service Manager console for PowerShell modules can now coexist on the same server. (Note: Both SCOM Update Rollup 5 (this update) and Service Manager Update Rollup 5 (update KB 4093685) must be installed to resolve this issue.)
  • Active Directory Integration rules are not visible or editable in an upgraded 2016 Management Group. This prevents the ongoing management of Active Directory integration assignment in the upgraded Management Group.
  • When the UNIX host name on the server is in lowercase, the OS and MonitoredBy information is displayed incorrectly in the Unix/Linux Computers view.
  • Active Directory integrated agents do not display correct failover server information.
  • Performance views in the web console do not persist the selection of counters after web console restart or refresh.
  • The PowerShell cmdlet Get-SCXAgent fails with error “This cmdlet requires PowerShell version 3.0 or greater.”
  • During the upgrade from SCOM 2016 to SCOM 1801, if the reporting server is installed on a server other than the management server, the upgrade fails. Additionally, you receive the error message, "The management server to which this component reports has not been upgraded."
  • If a group name has been changed through the operations console, the Get-SCOMGroup cmdlet does not retrieve the group data that includes the changed group name.
  • Error HTTP 500 occurs when you access Diagram view through the web console.
  • When you download a Linux management pack after you upgrade to SCOM 2016, the error "OpsMgr Management Configuration Service failed to process configuration request (Xml configuration file or management pack request)" occurs.
  • The SQLCommand Timeout property is exposed so that it can be dynamically adjusted by users to manage random and expected influx of data scenarios.
  • The MonitoringHost process crashes and returns the exception "System.OverflowException: Value was either too large or too small for an Int32."
  • When company knowledge is edited by using the Japanese version of Microsoft Office through the SCOM console, the error (translated in English) "Failed to launch Microsoft Word. Please make sure Microsoft Word is installed. Here is the error message: Item with specified name does not exist" occurs.
  • Accessing Silverlight dashboards displays the "Web Console Configuration Required" message because of a certificate issue.
  • Microsoft.SystemCenter.ManagementPack.Recommendations causes errors to be logged on instances of Microsoft SQL Server that have case-sensitive collations.
  • Deep monitoring displays error “Discovery_Not_Found” if the installation of JBoss application server is customized.
  • Adds support for the Lancer driver on IBM Power 8 Servers that use AIX.
  • The ComputerOptInCompatibleMonitor monitor is disabled in the Microsoft.SystemCenter.Advisor.Internal management pack. This monitor is no longer valid.

My Advice

As always, my advice for deploying this update is to head over to Kevin Holman's blog and wait for his handy step-by-step guide to get this up and running in your non-production environments first.

Monday, February 19, 2018

Speaking at the Global Azure Bootcamp 2018

This coming April 21st, I'll be presenting a session on Azure Monitoring at the Global Azure Bootcamp 2018 event in Dublin.


This annual event is now in its sixth year of running and is held on the same date in nearly 200 locations around the globe - bringing together some of the best speakers and contributors in the Azure community.

Organised as a free event by the Irish MVP community with support from the awesome people over at our local Microsoft team, we're running an agenda of three tracks side-by-side covering topics across Azure Infrastructure & Security (Track 1), Azure Compute/General (Track 2) and Azure Workshops/Lightning Talks (Track 3).

If you haven't attended one of these events before, here's the lowdown on what to expect (taken from our official event website):

"Welcome to Global Azure Bootcamp! All around the world, user groups and communities want to learn about Azure and Cloud Computing. On April 21, 2018, tech communities world-wide will come together once again in the sixth great Global Azure Bootcamp event!

In Dublin, we are organising the biggest community lead event yet, with two tracks and in-depth workshops during the day. Bootcamps are happening on the same day all over the world - come to Dublin and join in - please share your experience under the social hashtag #GlobalAzure!

It is important to point out, that while this event is *about* Azure, it is *not* a commercial event. Azure bootcamp Dublin is organised by the local MVP tech community - we are here to share our knowledge, not sell anything."


Registration is filling up fast and if you miss out on a seat at the first attempt, we've put a waiting list system in place to hopefully help you grab a cancellation spot. You can check out the full agenda and list of speakers on the day along with your free registration at our new website here - http://www.azurebootcampdublin.com/index.html

Hope to see some of you guys there!