Fighting Application Drift with Self-Healing

The ability to conduct an on-demand or automatic self-healing repair of a corrupted PC and bring it back to the approved desired state is one of the many advantages Utopic’s Persystent Suite offers. It is easy to imagine a user clicking on a suspect link and infecting their system, adding an unauthorized application fraught with spyware and other performance-hindering elements, or accessing the registry and changing a standard setting. Persystent Suite is well suited to remediate these issues with a single reboot that returns the affected PC to the desired state.

Although these scenarios happen too often and are extremely problematic, the more likely service desk call will come from a user experiencing slow performance or an application behaving out of character. There are dozens of potential root causes, but a likely culprit is application drift. Drift means the affected PC is not running the most current version of the application, an update did not apply correctly during the last upgrade, or the application has, over time, become misconfigured relative to its standard. In some cases, it reflects unsanctioned modifications made by the end user or a third party.

The best practice is to get the user back up and running immediately. By conducting an on-demand reboot of the PC, the user can return to productivity in about a minute. From the end user’s perspective, it will be as if the issue never happened. The reboot restores the last known desired state. This desired state, maintained and controlled by IT, contains the proper version and the correct configuration not only for the application, but for the PC’s entire operating system. It also removes any unauthorized changes that may have contributed to the PC’s performance issue, yet leaves the end user’s files and profiles untouched. From a security, asset management and compliance standpoint, it ensures each user is running the appropriate version of an application they have the rights to access.

However, in terms of best practice (and compliance satisfaction), it is important to know why the PC was affected. Every time the desired state is reapplied, Persystent Suite creates a file change report. This report lists all the items that differed from the desired state settings. It allows IT to see and document (without having to sift through thousands of event logs) the delta between the state of the PC at the time of the issue and the approved state: policies, registries, applications, infrastructure configurations and so on. This way, IT can pinpoint root causes more clearly and adjust policies or procedures to prevent or mitigate such occurrences in the future.
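The idea behind such a change report can be illustrated with a minimal sketch. This is not Persystent's implementation or report format; it assumes, purely for illustration, that the desired state and the current state are each available as a manifest mapping a file path to a content hash. The names `ChangeReport` and `build_change_report` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeReport:
    added: list[str]     # present on the PC but not in the desired state
    removed: list[str]   # in the desired state but missing on the PC
    modified: list[str]  # present in both, but with different content hashes


def build_change_report(desired: dict[str, str], current: dict[str, str]) -> ChangeReport:
    """Compare two {path: content_hash} manifests and list the differences."""
    added = sorted(p for p in current if p not in desired)
    removed = sorted(p for p in desired if p not in current)
    modified = sorted(p for p in desired if p in current and desired[p] != current[p])
    return ChangeReport(added, removed, modified)


# Hypothetical manifests: one application binary drifted, one unknown DLL appeared.
desired = {
    r"C:\Windows\system.ini": "a1",
    r"C:\Program Files\App\app.exe": "b2",
}
current = {
    r"C:\Windows\system.ini": "a1",
    r"C:\Program Files\App\app.exe": "ff",
    r"C:\Program Files\Toolbar\spy.dll": "c3",
}

report = build_change_report(desired, current)
print("added:   ", report.added)
print("removed: ", report.removed)
print("modified:", report.modified)
```

A report of this shape gives IT a short list of what drifted, rather than a raw event log to search through.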

Self-healing automations allow CIOs to meet new challenges

According to CIO magazine, one of the “Top Actions CIOs should take in 2016” is to redesign the workforce. It says “to maximize IT talent and meet digital needs of the future, changes to the roles and responsibilities must be considered. IT work is evolving beyond managing programs and developing software…”

This has several ramifications. One is that the IT specialist must now (more than ever) wear multiple hats. However, this is not as burdensome as once thought. By automating certain tasks that were once reserved exclusively for hands-on attention, the IT specialist can move beyond being the plugger of cables or writer of code. The article agrees: “CIOs also need to develop a workforce strategy that reflects increased automation. Intelligent machines can augment higher-end skills and automate routine tasks and decisions.”

One area that could obviously benefit from greater (and smarter) automation is IT support. The help desk tends to be heavily dependent on hands-on techs (whether in-house or offshore). Many companies already implement some sort of self-service automation for Level One issues like forgotten passwords. However, it is with Level Two break/fix issues that significant improvements in time and resource management could allow CIOs to adopt the article’s stated best practice.

Sure, but computers can’t fix themselves.

In this landscape of automation, they can. In fact, a device can be self-healed in about 45 seconds, the time it takes a machine to reboot. It is part of an automated process that restores a desired image to a corrupted or damaged OS, or resolves other break/fix issues such as malware/ATP infestation, poor performance, and misconfiguration. The significant difference from existing platforms is that, in this automated repair, the profiles, files and settings remain intact. This isn’t a Day Zero rollback.

The automation comes with the development of an updating desired image. An IT team can schedule an update at any time. Most companies do so on a weekly or bi-weekly basis in conjunction with a Windows security patch or application update. Two time-saving automations ensure the most current image is available. First, the repair/restore/imaging component takes a snapshot of the approved image and only adds what has changed since the last update, so IT is not reimaging an entire workstation and the process is far faster. Second, any patches can be tested against a single image rather than every instance of the distributed image. This is possible because single-instance file storage ensures that only one copy of any one file is stored across the organization. You have many licenses for Word, but only a single instance is captured for an image. What this means is that the organization always has a recent authorized image that won’t take days to rebuild. It is restored in moments.
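The two ideas in this paragraph, storing each file's content only once and capturing only what changed since the last update, can be sketched in a few lines. This is an illustration under stated assumptions, not how Persystent stores images: the class `SingleInstanceStore`, the function `snapshot`, and the use of SHA-256 as the content key are all hypothetical choices, and the sketch ignores deleted files for brevity.

```python
import hashlib
from typing import Optional


class SingleInstanceStore:
    """Content-addressed store: identical file contents are kept only once."""

    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}

    def put(self, content: bytes) -> str:
        digest = hashlib.sha256(content).hexdigest()
        self.blobs.setdefault(digest, content)  # no-op if already stored
        return digest


def snapshot(
    files: dict[str, bytes],
    store: SingleInstanceStore,
    previous: Optional[dict[str, str]] = None,
) -> tuple[dict[str, str], list[str]]:
    """Return (manifest, changed_paths) for the given files.

    The manifest maps each path to the hash of its content. Only paths whose
    hash differs from the previous manifest are reported as changed.
    """
    previous = previous or {}
    manifest: dict[str, str] = {}
    changed: list[str] = []
    for path, content in files.items():
        digest = store.put(content)
        manifest[path] = digest
        if previous.get(path) != digest:
            changed.append(path)
    return manifest, changed


store = SingleInstanceStore()

# Two machines carry the same Word binary: one blob is stored, not two.
base, _ = snapshot(
    {"pc1/winword.exe": b"word-v1", "pc2/winword.exe": b"word-v1"}, store
)

# After a patch on pc1, only the changed file is new to the store.
patched, changed = snapshot(
    {"pc1/winword.exe": b"word-v2", "pc2/winword.exe": b"word-v1"}, store, base
)
print(len(store.blobs), changed)
```

Because unchanged content hashes to a key that is already stored, each update adds only the new blobs, which is why an incremental update is much cheaper than a full reimage.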

So if a workstation can heal itself, is there still a role for the IT help desk? Yes. The help desk is now involved in discovering root causes rather than only fixing symptoms. Fixing the symptom gets the user back to productivity almost instantaneously: when a user calls with a break/fix issue, the IT specialist can resolve it in seconds by initiating the reboot with the correct repair level. Once the user is back up and running, the IT team can investigate causes so that they can be prevented in the future. The solution includes granular reporting, so the logs can be reviewed and compared against the ideal state to show how and where the performance issue occurred.

The end result is a new baseline of automations that reduce the number of support incidents, reduce the time to resolution, increase uptime, and improve the integrity of the compute landscape. Most relevant to the CIO article, they create new bandwidth and greater visibility to allow for the changing roles and responsibilities of IT.

Of course, Utopic Software is happy to perform a demo of these automations in a live presentation to prove self-healing is not only possible, but generates the necessary ROI and bandwidth expansion a CIO needs to achieve their vision.

Maintaining control of the repair/recovery process

One of the hallmarks of a self-healing process is control over the breadth and depth of which parts of the operating system are automatically corrected back to an approved ideal state upon reboot. Based on requirements such as corporate policies, regulatory compliance, multi-user access and best practices, this setting varies from company to company, and it can also be easily modified for sub-groups within each organization.

To maintain proper control, an automated system must provide multiple levels of repair point control (for example, High, Medium and Low). A device will repair on every boot if it is assigned a Low, Medium or High level repair policy. However, an individual device (or group) can be assigned a “No Repair” setting to support an on-demand repair policy. This way, IT administration can control if and when a device needs to be returned to the last repair point. In fact, best practices suggest that repair should be implemented on demand rather than upon every reboot in order to maintain the continuous integrity of the permitted updates and allowable changes to the image assigned to that specific device.

When engaged to self-heal, Persystent always repairs the registry files (except for the keys excluded in filters). During boot up (after the BIOS loads), the process reapplies the approved image and the Repair Exempt filter. This way, any specific file and setting, such as Virus Definition Files, can be appropriately preserved. One of the chief benefits of the Persystent self-healing process is that the device is not being reset to a Zero-day state, but rather the last approved repair point. Additionally, the repair process only affects operating system and application files. The user’s data and files are not touched. A user profile is only impacted at the highest level of repair.

Further control of the repair process comes from the flexibility to change settings. The centralized WebUI allows IT administrators to change the repair level for any individual device or group at any time, simply by adjusting the policy setting. This includes adding or modifying excluded files or other policies that can be applied to a named group of devices (e.g., identified by a characteristic such as location, department, function or permission) or defined by an event (e.g., updates, or daily public usage by multiple users). Policies can also govern when returns to the ideal state are scheduled and enforced.

The three levels of Repair:

Low Level Repair

  • Repairs any operating system or application files that are either modified or deleted back to the repair point state.
  • Deletes any new files/folders added in operating system and application folders.
  • User profiles are left intact. All changes in the user’s profile are preserved and not repaired.
  • Any new files/folders created at the root of C:\ will be left intact.

Medium Level Repair

  • Repairs any operating system and application files that are either modified or deleted back to the repair point state.
  • Deletes any new files/folders added in operating system or application folders.
  • User profiles are left intact. All changes in the user’s profile are preserved and not repaired.
  • Any new files/folders created at the root of C:\ are deleted.

High Level Repair

  • Repairs any operating system or application files that are either modified or deleted back to the repair point state.
  • Deletes any new files/folders added in operating system or application folders.
  • User profiles are deleted so that new user profiles will be created when a user logs on.
  • Any new files/folders created at the root of C:\ will be deleted.
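The differences between the levels listed above can be summarized as a small lookup table. The sketch below is only a way of reading the lists side by side; the names `RepairLevel`, `RepairActions` and `ACTIONS` are hypothetical and do not correspond to any Persystent API. The “No Repair” entry models the on-demand setting described earlier as performing no automatic action on boot, which is an assumption of this sketch.

```python
from dataclasses import dataclass
from enum import Enum


class RepairLevel(Enum):
    NONE = "No Repair"
    LOW = "Low"
    MEDIUM = "Medium"
    HIGH = "High"


@dataclass(frozen=True)
class RepairActions:
    restore_system_files: bool     # modified/deleted OS and application files
    delete_new_system_files: bool  # new files/folders in OS and application folders
    delete_new_root_files: bool    # new files/folders at the root of the C: drive
    reset_user_profiles: bool      # delete profiles so they are recreated at logon


# What each level does automatically on boot, per the lists above.
ACTIONS = {
    RepairLevel.NONE: RepairActions(False, False, False, False),
    RepairLevel.LOW: RepairActions(True, True, False, False),
    RepairLevel.MEDIUM: RepairActions(True, True, True, False),
    RepairLevel.HIGH: RepairActions(True, True, True, True),
}

for level in RepairLevel:
    print(f"{level.value:>9}: {ACTIONS[level]}")
```

Read this way, each level adds exactly one behavior to the one below it: Medium adds cleanup of the drive root, and High adds the reset of user profiles.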

Which repair setting is best?

Each company has unique compliance, security, device performance and administrative needs. This is why settings can be adjusted to meet specific requirements. IT administrators can add policies that control the ability to add or manipulate certain registry keys, files, and services. Many companies enforce a variety of distinct policies that apply to a selection of their diverse user profiles. Most organizations use low and medium settings for individual PCs based on the considerations noted above. High level repair settings are typically reserved for publicly accessed devices, classroom machines, kiosks and other multi-user machines.

Who chooses the repair point?

You do. Persystent’s imaging capabilities facilitate the creation and management of an image. A snapshot of this image is reapplied during the reboot process. When the repair is initiated, the self-healing (automatic corrective action) follows the repair level rules, exemptions and filters associated with the individual device or group of devices and applies the last approved ideal state (image).

How often should a new repair point be created?

Best practices dictate that a new repair point should be taken immediately after authorized changes are made to the system. This can be automatically scheduled and executed with Persystent Suite. Such changes include Windows Updates, key application updates, installation of new applications, installation of new devices, and so on. Taking a new repair point preserves the authorized changes and ensures the integrity of the repair process. Many companies schedule updated images on a weekly basis, typically after the application of “Patch Tuesday” updates or a similar coordinated event. With Persystent, the process is considerably faster in that an entire image is not recreated. The process only identifies and incorporates the changes since the last approved repair point.
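For teams that align repair points with Patch Tuesday, the target date is easy to compute, since Microsoft's Patch Tuesday falls on the second Tuesday of each month. The sketch below only calculates dates; it does not drive Persystent's scheduler, and the function names and the one-day delay for applying and testing patches are assumptions made for illustration.

```python
import calendar
from datetime import date, timedelta


def patch_tuesday(year: int, month: int) -> date:
    """Return the second Tuesday of the given month."""
    first = date(year, month, 1)
    # Days from the 1st to the first Tuesday of the month (0 if the 1st is a Tuesday).
    offset = (calendar.TUESDAY - first.weekday()) % 7
    return first + timedelta(days=offset + 7)


def repair_point_date(year: int, month: int, delay_days: int = 1) -> date:
    """Hypothetical policy: take the new repair point a fixed delay after Patch Tuesday."""
    return patch_tuesday(year, month) + timedelta(days=delay_days)


print(patch_tuesday(2016, 1), repair_point_date(2016, 1))
```

The delay leaves room to apply and validate the patches before the updated state is approved as the new repair point.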

What exactly is repaired?

Depending on the level of control, the self-healing process applies corrective action to operating system and application files. However, this can optionally be expanded to include other files and folders that are not automatically part of the repair point by using our “Repair Point Include Filter” feature. The following is Persystent’s default repair point listing:

Default on Windows Vista, Windows 7/8/8.1
C:\Bootmgr
C:\Bootsect.bak

Captured on Windows Vista/Windows 7/8/8.1
C:\Windows (Excluding C:\Windows\CSC)
C:\Program Files
C:\Program Files (x86)\
C:\ProgramData
C:\Users\Public
C:\Users\Default
C:\Boot
C:\inetpub
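To see how the default listing, the include filter and the exempt filter could combine, here is a minimal sketch that decides whether a given path falls inside the repair point. It is not Persystent's filter engine: the function `in_repair_point`, its parameters, and the rule that exemptions take precedence over inclusions are assumptions made for illustration. Only the folder names come from the listing above.

```python
from pathlib import PureWindowsPath

# Default repair point locations, taken from the listing above.
DEFAULT_INCLUDED = [
    r"C:\Bootmgr",
    r"C:\Bootsect.bak",
    r"C:\Windows",
    r"C:\Program Files",
    r"C:\Program Files (x86)",
    r"C:\ProgramData",
    r"C:\Users\Public",
    r"C:\Users\Default",
    r"C:\Boot",
    r"C:\inetpub",
]
DEFAULT_EXCLUDED = [r"C:\Windows\CSC"]


def in_repair_point(path, extra_includes=(), exempt=()):
    """Return True if `path` would be covered by the repair point.

    extra_includes: hypothetical stand-in for a "Repair Point Include Filter".
    exempt: hypothetical stand-in for a "Repair Exempt" filter.
    Assumption: exclusions and exemptions win over inclusions.
    """
    p = PureWindowsPath(path)

    def under(roots):
        # Component-wise match, so C:\Bootmgr is not treated as being under C:\Boot.
        return any(p.is_relative_to(PureWindowsPath(root)) for root in roots)

    if under(DEFAULT_EXCLUDED) or under(exempt):
        return False
    return under(DEFAULT_INCLUDED) or under(extra_includes)


print(in_repair_point(r"C:\Windows\System32\drivers\etc\hosts"))
print(in_repair_point(r"C:\Windows\CSC\cache.dat"))
print(in_repair_point(r"C:\Users\alice\Documents\report.docx"))
print(in_repair_point(r"D:\Tools\agent.cfg", extra_includes=[r"D:\Tools"]))
print(in_repair_point(r"C:\ProgramData\AV\defs.dat", exempt=[r"C:\ProgramData\AV"]))
```

`PureWindowsPath` is used so the sketch handles Windows-style paths on any platform; `is_relative_to` requires Python 3.9 or later.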

The driving force behind Persystent’s multiple levels of repair is to allow for the maximum amount of control by IT while maintaining corporate standards of performance integrity. The flexibility of Persystent provides the right amount of protection, lifecycle expediency and compliance support for every machine under the enterprise umbrella.

With so many potential issues affecting critical systems, from user errors to malware infections to catastrophic failures (“blue screen of death”), IT departments constantly need to reimage machines from scratch or spend countless hours troubleshooting and repairing. The benefits of self-healing are clear. It reduces help desk calls, promotes faster resolution of issues and eliminates the need for lengthy manual intervention. Most importantly, it maintains a standard of performance through Persystent’s levels of repair.
