Fighting Application Drift with Self-Healing

The ability to conduct an on-demand or automatic self-healing repair of a corrupted PC and bring it back to the approved desired state is one of the many advantages Utopic’s Persystent Suite offers. It is easy to imagine a user clicking on a suspect link and infecting their system, adding an unauthorized application fraught with spyware and other performance-hindering elements, or accessing the registry and adjusting a standard setting. Persystent Suite is well suited to remediating these issues directly, returning the affected PC to the desired state on a single reboot of the system.

Although these scenarios happen too often and are extremely problematic, the more likely service desk call will be from a user experiencing slow performance or an application behaving out of character. There are dozens of potential root causes, but a likely culprit is application drift. Drift means the affected PC is not running the most current version of the application, an update did not apply correctly during the last upgrade, or the application has, over time, become misconfigured relative to its optimum standard. In some cases, it reflects unsanctioned modifications made by the end-user or a third party.

The best practice is to get the user back up and running immediately. By conducting an on-demand reboot of the PC, the user can return to productivity in about a minute. From the end-user’s perspective, it will be as if the issue never happened. The reboot restores the last known desired state. This desired state, maintained and controlled by IT, contains the proper version and the correct configuration not only for the application, but for the PC’s entire operating system. It also removes any unauthorized changes that may have contributed to the PC’s performance issue, yet leaves the end-user’s files and profiles untouched. From a security, asset management and compliance standpoint, it ensures each user is running the appropriate version of an application they have been given rights to access.

However, in terms of best practice (and compliance satisfaction), it is important to know why the PC was affected. Every time the desired state is reapplied, Persystent Suite creates a file change report. This report lists all the items that differed from the desired state settings. It allows IT to see and document (without having to sift through thousands of event logs) the delta between the approved policies, registries, applications, infrastructure configurations and so on, and the state of the PC at the time of the issue. This way, IT can pinpoint root causes more clearly and adjust policies or procedures to prevent or mitigate such occurrences in the future.
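The idea of a delta between the approved state and the state found at the time of the issue can be sketched in a few lines. This is only an illustration of the concept, not Persystent Suite's actual report format or API: the function name `change_report` and the sample settings are hypothetical, and a real desired state would cover files, registry keys and policies rather than a small dictionary.

```python
def change_report(desired, current):
    """Compare a snapshot of a PC against the approved desired state.

    Both arguments are plain dicts mapping a setting name to its value.
    Returns the items that were added, removed, or modified.
    """
    report = {"added": {}, "removed": {}, "modified": {}}
    for key, value in current.items():
        if key not in desired:
            report["added"][key] = value
        elif desired[key] != value:
            report["modified"][key] = {"desired": desired[key], "found": value}
    for key, value in desired.items():
        if key not in current:
            report["removed"][key] = value
    return report


# Hypothetical sample data, for illustration only.
desired_state = {
    "office_suite_version": "approved",
    "firewall": "enabled",
    "homepage": "intranet",
}
current_state = {
    "office_suite_version": "approved",
    "firewall": "disabled",            # changed by the user or by malware
    "toolbar_plugin": "installed",     # unauthorized addition
}

report = change_report(desired_state, current_state)
for category, items in report.items():
    print(category, items)
```

A report of this shape is what lets IT go straight to the changed items instead of searching event logs.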

Don’t forget to wipe! The keys to data sanitization and hard disk erasure

Every year, IT teams supporting a modest-sized enterprise (2,500 devices) will retire about 23% of their devices. That’s 575 machines a year containing sensitive information. Because many companies like to re-purpose these machines, the devices must first go through an end-of-lifecycle transition, from storage of data to reassignment, resale or donation. If the device is being reassigned from one department to another, it might require a new image, so the previous image, with its specific rights and application selection, must be cleared to provide a fresh tableau on which to build. If the device is leaving the organization, there can’t be any trace of its prior usage left. NIST agrees:

NIST Special Publication 800-88 Guidelines for Media Sanitization mandates that “in order for organizations to have appropriate controls on the information they are responsible for safeguarding, they must properly safeguard used media.” Taking control of old electronic media means disposing of it in a safe, secure, and compliant fashion.

The decommission process can be lengthy and, with all the daily fires requiring attention, is often considered a lower priority. This is why many companies either have a stack of old devices waiting for retirement in some storage room or outsource to companies that specialize in data sanitization and hard disk destruction.

This year, IT teams will be potentially inundated with retiring devices considering the sunsetting of Windows XP last April. Because of the cost, many companies have simply opted to invest in brand new machines with Windows 7 preinstalled rather than face the battle of OS migration. This leaves them to face the problem of decommissioning their old PCs in a way that prevents any significant leakage of sensitive information.

As noted, many companies use outside organizations to handle this aspect of their business. Using our modest-sized enterprise as a model, decommissioning 575 devices can be expensive. Based on industry research, this costs between $30 and $50 per device. For our example company, at the $40 midpoint, that is a budget line item of about $23,000 for the year. Unfortunately for this company, an additional 12% of their machines, still within their industry-accepted 4-year lifecycle, were XP machines. They opted for new units rather than upgrades. That is another 300 machines, or an additional $12,000. According to Microsoft (The Enterprise PC Lifecycle: Seeing the Big Picture for PC Fleet Management), the breakdown of the service is basically $46 (or as high as $375 per PC) including $12 for archiving data, $12 for sanitizing the hard drive, $8 for reloading the operating systems, and $12 to test the PCs. Granted, some of this cost is offset by the potential resale of these units. However, with older, unsupported OSs, donation is more likely.
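The arithmetic behind these figures can be checked with a short sketch. It uses only the numbers quoted above (a 2,500-device fleet, a 23% retirement rate, an extra 12% of XP machines, and $30 to $50 per device); the constant and function names are illustrative assumptions.

```python
FLEET_SIZE = 2500          # devices in the example enterprise
RETIREMENT_PERCENT = 23    # share of the fleet retired each year
XP_PERCENT = 12            # additional XP machines replaced early
COST_LOW, COST_HIGH = 30, 50   # outsourced cost per device, in dollars


def decommission_cost(devices, cost_per_device):
    """Total outsourced decommissioning cost for a number of devices."""
    return devices * cost_per_device


retired = FLEET_SIZE * RETIREMENT_PERCENT // 100   # 575 devices
xp_units = FLEET_SIZE * XP_PERCENT // 100          # 300 devices
midpoint = (COST_LOW + COST_HIGH) // 2             # $40 per device

print("Retired devices:", retired)
print("Low estimate:   $", decommission_cost(retired, COST_LOW))
print("High estimate:  $", decommission_cost(retired, COST_HIGH))
print("Midpoint:       $", decommission_cost(retired, midpoint))
print("XP units extra: $", decommission_cost(xp_units, midpoint))
```

At the $40 midpoint the regular retirements and the early XP replacements together come to $35,000 for the year.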

To validate these numbers, I spoke with the VP of IT of a well-known health care plan provider. They routinely spent $25,000 on top of the cost to recycle decommissioned machines to ensure the sensitive data that may still reside on hard drives was removed. This company is bound by very strict HIPAA compliance requirements in addition to the mandates of a dozen or more accreditation agencies.

If cost is prohibitive, the other option is to do it yourself. Without getting into soft costs and personnel time, there are two other potential hurdles that make this option complicated. First, it can be a fairly lengthy process. This means a resource has been reassigned from higher value tasks; not to mention the aforementioned daily emergencies. Secondly, it requires a degree of expertise. Every IT pro worth their salt knows simple file deletion or partitioning is insufficient. Companies must take action that will leave no trace of the previous image or data on a device.

Okay, one last thorn. Your company has the will and bandwidth to re-purpose/decommission end-of-lifecycle devices. Now you must invest in a separate software license to run the shredding/removal process. Besides having another SLA to manage, does the product actually make the process easier? Does it use recognized best practices to remove data, sanitize drives and replace old images with an approved, “clean” version? Can it accommodate multiple drives simultaneously (such as in a RAID) without having to break the array apart first? And does it allow you to provide certified evidence of data destruction?

It’s almost enough, as one IT pro wrote in a tech forum, “to take a sledge hammer, thermite, and go Office Space on 200 old hard drives. But I have other things to do.”

Whether re-purposing for use in another department, donating, reselling or smashing it to bits with a baseball bat, “wiping” the hard drive is a definitive part of the PC lifecycle. For companies that maintain any sensitive data on the drives (that’s most of them!), it rises to the level of necessity. Companies can reduce the financial impact if their sanitization process is included as a part of another indispensable infrastructure maintenance solution such as configuration or change management. For example, deploy one central solution that handles your entire automated configuration initiative: self-healing restoration, recovery, imaging and patching/updating.

But to make the whole thing effective, and to make it worth unifying sanitization with other configuration functions, it has to be fast (10 seconds per gigabyte or better). It has to be thorough. It must use one of the two recognized destruction techniques: degaussing, or overwriting every shred of data so that it is permanently unreadable. For re-purposing and donation, you can then apply a proper, clean and approved image to the “wiped” machine with confidence.
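To make the overwriting technique concrete, here is a minimal sketch that overwrites a single file in place with random bytes and estimates wipe time at the 10-seconds-per-gigabyte rate mentioned above. It is an illustration of the principle only, under stated assumptions: the function names are hypothetical, it works on one file rather than a whole drive, and file-level overwriting on SSDs or journaling file systems does not guarantee that every physical copy of the data is gone. A compliant process should follow NIST SP 800-88 and use a purpose-built tool.

```python
import os
import tempfile


def overwrite_file(path, passes=1, chunk_size=1024 * 1024):
    """Overwrite every byte of a file in place with random data.

    Returns the number of bytes overwritten per pass.
    """
    size = os.path.getsize(path)
    with open(path, "r+b") as handle:
        for _ in range(passes):
            handle.seek(0)
            remaining = size
            while remaining > 0:
                block = os.urandom(min(chunk_size, remaining))
                handle.write(block)
                remaining -= len(block)
            handle.flush()
            os.fsync(handle.fileno())  # push the new bytes to the device
    return size


def estimated_wipe_minutes(capacity_gb, seconds_per_gb=10):
    """Rough wipe duration at a given throughput."""
    return capacity_gb * seconds_per_gb / 60


# Demonstration on a throwaway temporary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"SENSITIVE" * 1000)
    demo_path = tmp.name

bytes_overwritten = overwrite_file(demo_path)
with open(demo_path, "rb") as handle:
    remaining_data = handle.read()
os.remove(demo_path)

print("Bytes overwritten:", bytes_overwritten)
print("Original text still present:", b"SENSITIVE" in remaining_data)
print("Minutes for a 500 GB drive:", round(estimated_wipe_minutes(500), 1))
```

At 10 seconds per gigabyte, a 500 GB drive takes a little under an hour and a half per pass, which is why throughput matters when hundreds of machines are waiting.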

Unification makes a great deal of sense since it leverages other components important to compliance and security. The ability to image/reimage a re-purposed machine without having to expend any more capital is a huge boon. It goes back to that often repeated CIO mantra, try to do more for less.

Persystent Suite, which currently facilitates restoration, recovery, imaging and patch/update migration capabilities in a single centralized solution, recently added “wipe” functionality to its suite in order to help larger enterprises fulfill compliance mandates related to data security and device control. See it here.

Becoming a trusted adviser: helping clients prevent unforeseen expenses

As a service provider of any kind, the ultimate compliment is to be considered a “trusted adviser” by your client. But this status is more than simply getting a good reference or getting a customer to renew their annual contract. By its very title, a trusted adviser is an outside insider for a company: a consultant an organization depends upon to provide valuable insight on how it can best achieve its stated and latent goals.

For managed service providers, whose very purpose is to ensure various IT infrastructure and applications provide the expected results and value for the client, to become a trusted adviser means you have the responsibility to continually identify and implement ways to improve performance, anticipate challenges and constantly adapt to the transformative nature of technology.

Sounds easy enough. That is what you do, right? But providing network support, security, help desk or a variety of other key services doesn’t immediately raise you to the level of trusted adviser. It simply means you provide an important service…and we assume you provide it very well.

Part of the trusted adviser’s job description is not only to improve performance, but to do so at the maximum level for minimal cost. The transition from service provider to trusted adviser means you are looking out for your client’s best interest, and not just for the services they can buy. To accomplish this, MSPs must address one of the biggest cost burdens that can affect the relationship: break/fix issues.

The labor required to manage this portion of the relationship is the biggest drain on margin. Regardless of whether a client purchases full coverage for a monthly fee, uses a capped block of hours or pays out of pocket for each issue, somebody’s margin is affected when things go sideways. It’s either money (margin) out of the MSP’s pocket or out of the client’s.

The problem is not that issues arise; it’s that the labor required to address them is unpredictable. It could be a five-minute fix or something that takes an application or network offline for an extended period of time while the issue is diagnosed and a fix is planned and applied.

Nothing erodes trusted adviser status faster than money. This is not to say an MSP should operate as a non-profit, but there are ways to proactively and automatically confront the break/fix issue without either side having to dig deep into profit margin. And, more importantly, provide a reliable means to attack unforeseen issues that eat time, upset productivity, and force reprioritization of potential revenue generating services. This is the road to trusted adviser status.

The ability to break out of “firefighter mode” is the first step to creating lasting value for clients. The less time spent with your hair on fire, the more you can concentrate on tasks that support client business (and add to an MSP partner’s credibility and differentiation). For many MSPs, services fall into six general areas of coverage:

  1. Network Support
  2. Backup and Recovery
  3. Security
  4. End User Support/Help Desk
  5. Compliance
  6. Extra consulting services

The one constant through each of these services is the likelihood that break/fix will occur sooner or later. Automated configuration can mitigate the risk associated with these problems and the labor required to properly diagnose and repair them.

This doesn’t suggest a simple recovery tool. Instead of applying hours diagnosing and repairing, systems can self-heal upon reboot. It takes the client’s ideal image and removes the service issue. It’s simple. It’s automatic. And it removes problems that would otherwise require manual intervention and desk side visits.

Of course this doesn’t solve every problem. But if it can remove 60-70% of user-inflicted issues, like changing critical settings, downloading malicious viruses, making unauthorized application changes, deleting necessary dll files, disabling BITS, and a thousand other actions that compromise infrastructure integrity, then significant dollars are saved, uptime and asset availability increase, and expensive personnel time is freed for higher value tasks.

There are several other benefits an MSP achieves by including automated self-healing as part of an overall package.

Scheduled versus variable labor: Labor costs take a huge bite out of the scope of service, especially when it comes to break/fix issues. An MSP and their client can create a more fiscally stable relationship through precision budgeting. The client knows how much it is going to pay each and every month, and the MSP gains stable recurring revenue. By using configuration automation and optimization, MSPs can reduce the specter of additional pass-along costs to the client or avoid absorbing the additional expensive labor costs. Now the conversation can move from “how much” to “how to improve” (from reactive to proactive).

Expand geographic reach: Many MSPs operate as regional entities because they do not have the personnel or the budget to adequately cover a larger (or even national) territory. From a cost perspective, self-healing eliminates a great many client visits. Typical on-site services like device restoration no longer require a warm body in the room. This, in turn, reduces the need to travel, along with the associated out-of-pocket time and costs. Without having to hop in a car or plane, you can provide effective service to a wider circle of clientele. Now when you visit a client, it is to provide proactive intellectual value and consulting expertise…or simply take them to dinner to thank them for the business.

Help Desk reduction: Resources show that self-healing and rebooting to an ideal state eliminates more than 34% of all inbound help desk issues without manual intervention. Consider that every time the help desk phone rings, it costs $20 (based on the national average). For more serious issues, such as catastrophic device failure, infected operating systems/applications or unauthorized downloads, the cost is obviously greater, and not just in terms of tech/admin intervention, but in lost productivity and potential loss of client trust. This doesn’t include scheduled maintenance tasks such as patching, updating and migration, which in themselves require a significant time and resource commitment. Adding a self-healing component to your existing slate of offerings reduces the number of help desk calls and, more importantly, allows an MSP’s help desk pros to uncover root causes rather than continually fix the symptoms.
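As a rough illustration of what those two figures imply, the sketch below applies the 34% reduction and the $20 per call to a monthly call volume. The volume of 1,000 calls is a hypothetical input chosen for the example, not a figure from any survey, and the function name is illustrative.

```python
COST_PER_CALL = 20        # dollars per help desk call, as quoted above
SELF_HEAL_RATE = 0.34     # share of inbound issues resolved by self-healing


def monthly_help_desk_savings(calls_per_month):
    """Calls avoided and dollars saved per month at the quoted rates."""
    calls_avoided = round(calls_per_month * SELF_HEAL_RATE)
    return calls_avoided, calls_avoided * COST_PER_CALL


# Hypothetical call volume, for illustration only.
avoided, savings = monthly_help_desk_savings(1000)
print("Calls avoided per month:", avoided)
print("Savings per month: $", savings)
```

Because the text says "more than 34%", the result is best read as a floor for routine calls; the costlier incidents described above are not included.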

Removal of malicious changes: Through maliciousness or carelessness, your client’s network is under constant attack from botnets, malware, viruses and a variety of other negative-impact influences. Although automatic configuration and reimaging can’t prevent Stan from sales from downloading a suspect app, or prevent an organized element in Eastern Europe from worming into a system, the continuous maintenance and reapplication of an ideal state can prevent lingering damage. Any time an unauthorized outside influence tries to change a registry, attach itself to a file, or embed itself in a supported application, the system rejects these modifications in favor of the ideal state…in real time. From an MSP perspective, this avoids the downtime needed to cleanse a network and helps preserve the continuity of critical information.

Of the six general service areas mentioned, it is obvious how configuration/recovery/repair/reimage automation can help with issues related to the network, backup and end users. However, some question its value to those who provide security and compliance services. The answer is simple. Although not a traditional security solution, it not only demonstrates control over network assets (as required in SANS, HIPAA, PCI and others), but also keeps the operating environment running smoothly over the course of the lifecycle.

Because a trusted adviser is more interested in a long term relationship than any short term gains, it is imperative that MSPs find and propose new and innovative solutions to include within their base services. If clients consider an MSP’s service a commodity, then it is very simple for them to find another provider.

The difference between an expert and a trusted adviser really comes down to a single attribute: an expert provides good answers. A trusted adviser asks good questions. Can you reduce costs while increasing your quality of service?

Learn how to answer that question at www.utopicsoftware.com

Utopic presents Top 5 benefits for self-healing IT

A video blog entry!

Utopic presents its Top 5 benefits for adopting a self-configuring, self-optimizing, self-protecting and, of course, self-healing process for an enterprise IT landscape. Self-healing describes the ability to perceive that an IT system or device is not operating correctly and, without human intervention, make the necessary adjustments to restore itself to normal operation.

If anti-virus is dead…then what?

How configuration automation fills the vulnerability gap.

Earlier this month, the progenitors of anti-virus software declared that “anti-virus is dead” (Wall Street Journal, May 4, 2014). According to Symantec and other industry-leading sources, the software designed to prevent malware, spyware and other intrusive tactics is doomed to failure. They say that anti-virus only catches 45% of the threats.

The battle is being lost because prevention and protection are always two steps behind. As fast as someone comes up with a preventive signature, six more even nastier bugs are developed and released on unsuspecting networks. It is said that 95% of all networks (source: FireEye and ThreatSTOP) have some sort of active infection.

To add fuel to the fire, IT security thought guru Eugene Kaspersky recently said: “The single-layer signature-based virus scanning is nowhere near a sufficient degree of protection – not for individuals, not for organizations large or small.”

The barbarians may be at the gates, but it’s not all doom and gloom. Many IT pros, especially those associated with mid- and larger-tier enterprises, recognize that security best practices are not tied solely to firewall protection, but rather to an interoperable combination of key functions.

The defenses may be in place, but the war is still not being won. An organization may be continuously monitoring, correlating, provisioning, authenticating, blocking, but too many companies are not taking advantage of what makes security more effective; more prolific across a wider enterprise expanse. What is missing is automation.

Let’s return to the company that depends heavily on anti-virus to prevent breaches and other negative impact events. If Symantec is a credible source, then this company needs a new and innovative way of maintaining a safe and secure environment. Let’s also assume that, even with a stack of other security tools, phishing, botnets, and malware will always find a way to breach the network. If multinationals like Citibank, eBay, Target and Sony struggle with breaches, then the likelihood is you do as well (sources say 78% experienced a breach in the past 2 years). What needs to happen is automatic protection.

In the absence of, or more likely in support of, anti-virus protection, initiating some sort of automated repair/recovery program is a progressive alternative growing in acceptance and popularity. It is based on the continuous maintenance of an ideal state. This way, any time an unauthorized outside influence tries to change a registry, attach itself to a file, or embed itself in a supported application, the system rejects these modifications in favor of the ideal state. After every PXE reboot of a workstation or device, the automated system reapplies the latest approved image.

Within this scenario, any infection introduced after the last boot up is eliminated. Case in point: an inside sales person uses your network and internet connection to reach their independent email account. They see a new email from a friend: “U should see this.” Thinking the friend is a trustworthy source, they open the email and click the link within. The website redirect seems harmless enough: a picture of puppies or a video of a skateboarder miserably miscalculating an airborne trick. However, on the next click a Secure Shield dialog box appears and lets the user know their network is in danger. Believing they are being a good steward of the company, the user clicks on the link to load the “security update.” And as fast as that, the ransom-ware makes their device a paperweight.

Without automation, a help desk tech will probably spend several hours diagnosing and then manually restoring the hacked registry. Even if a fresh image is available, there is still the necessary manual intervention of reapplying specific user settings, applications and privileges based on the business need, corporate policies and organizational role. Then there is a greater time commitment on investigating whether the issue has spread beyond the single device or has evolved into a greater threat. One moment of carelessness creates hours and hours of IT involvement, QA/testing, and re-ensuring compliance requirements. This doesn’t include the lost productivity, potential risk and cost this threat poses to the entire network.

The same scenario using repair/recovery automation doesn’t prevent the recklessness, but prevents the mistake from spreading further. All the user needs to do is turn the machine off and back on. This applies a fresh ideal state. The ransom-ware and any other unauthorized change are gone… automatically without IT intervention. More importantly, the ideal image is configured for the individual (or their role). The image maintains their applications, settings, latest updates and other unique components so the system lifecycle is perpetuated, uninterrupted and remains firmly under IT’s control.
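The reboot-to-ideal-state behavior can be modeled with a toy example. This is a conceptual sketch only, not how Persystent Suite or any imaging product is implemented: the machine is represented as a dictionary, the function name `reboot_to_ideal_state` and all sample values are hypothetical, and the separation of system state from user files is simplified to two keys.

```python
import copy


def reboot_to_ideal_state(machine, ideal_image):
    """Return the machine as it would look after the ideal state is reapplied.

    System settings are replaced wholesale by the approved image, while
    user files are carried over untouched. Also returns the names of the
    settings that had drifted and were reverted.
    """
    drifted = sorted(
        key
        for key in set(machine["system"]) | set(ideal_image)
        if machine["system"].get(key) != ideal_image.get(key)
    )
    restored = {
        "system": copy.deepcopy(ideal_image),
        "user_files": machine["user_files"],
    }
    return restored, drifted


# Hypothetical role-specific image and an infected workstation.
ideal_image = {
    "browser": "approved-build",
    "registry.shell": "explorer.exe",
    "antivirus": "enabled",
}
infected = {
    "system": {
        "browser": "approved-build",
        "registry.shell": "ransom.exe",
        "antivirus": "disabled",
        "fake_security_update": "installed",
    },
    "user_files": ["q3-forecast.xlsx"],
}

restored, reverted = reboot_to_ideal_state(infected, ideal_image)
print("Reverted:", reverted)
print("System matches ideal image:", restored["system"] == ideal_image)
print("User files kept:", restored["user_files"])
```

The list of reverted items is the same information a change report would capture, so the reboot both fixes the machine and documents what went wrong.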

Real time configuration management security also supports compliance, considering that several of the SANS critical controls (which serve as the basis for more than three dozen regulatory compliance agency mandates) are maintained through proper configuration and demonstrated control. For example, PCI/DSS requires: “2.2 Develop configuration standards for all system components. Assure that these standards address all known security vulnerabilities and are consistent with industry-accepted system hardening standards.” SANS simplifies this to mean “Secure Configurations for Hardware and Software on Laptops, Workstations, and Servers.” The ability to continuously maintain an ideal state for a variety of roles is the key to ensuring assets are only available to appropriate users. If each device is covered under a Repair/Recovery/Reimage configuration protocol, then (as HIPAA 11.0 demands) you are demonstrating control over data. The system cannot accept unauthorized changes (as detailed in your organization’s standards and policies) to registry, applications or files. This is not to say an organization can forgo provisioning, log archiving, firewall reinforcement or authentication, but automating configuration puts another proverbial brick in your defense wall.

Security requires attention 24/7…

“If I can cut that in half, we’re talking a staggering amount of money,” says Bruce Perrin, CIO for Florida-based Phenix Energy Group. “Seventy percent of what security professionals do could be done completely automatically, giving them more time to do things that are more important.”

Sixty-two percent of respondents in a recent IDG Research survey indicated they automate less than 30 percent of their security functions. For most companies, that turns out to be a great many manual personnel hours. Hackers don’t sleep, so why should your security? Unless an IT department is staffed around the clock, there is a certain amount of time that users are on their own. And the most blatant issues (the ones that gain headlines) don’t start as brute force attacks; they are sneaky, insidious threats that can lie dormant for days or months (like Heartbleed), so that middle-of-the-night emergency call may never come until it’s too late. By automating the configuration-break/fix process, organizations remove a significant burden.

For example, a unified school district in Central Florida manages a student computer lab of more than 2,000 PCs. They conservatively estimated that each PC experiences some sort of break/fix incident every 90 days (and 5% experience a catastrophic failure each year), and that each incident required one hour of manual intervention. This equated to approximately 7,450 man hours over the course of the year. Also, when considering the ROI, the average downtime of each machine was at least 4 hours from report to resolution. When they applied an automated process, break/fix issues were reduced by 90%. This saved 9,450 hours and produced a cost savings of slightly less than $16,000 per month ($191,121 per year).
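The labor estimate can be sanity-checked with a back-of-the-envelope sketch. It uses only the inputs stated above (2,000 PCs, one incident per PC every 90 days, one hour per incident, a 90% reduction), so it is an order-of-magnitude check and is not expected to reproduce the district's reported figures exactly; those presumably reflect details not given here. The function name is an illustrative assumption.

```python
def annual_break_fix_hours(pcs, days_between_incidents, hours_per_incident):
    """Manual labor hours per year for routine break/fix incidents."""
    incidents_per_pc = 365 / days_between_incidents
    return pcs * incidents_per_pc * hours_per_incident


baseline_hours = annual_break_fix_hours(
    pcs=2000, days_between_incidents=90, hours_per_incident=1
)
hours_saved = baseline_hours * 0.90    # 90% reduction after automation
monthly_savings = 191121 / 12          # reported annual savings per month

print("Baseline hours per year:", round(baseline_hours))
print("Hours saved at 90%:", round(hours_saved))
print("Reported savings per month: $", round(monthly_savings, 2))
```

The monthly figure confirms the "slightly less than $16,000" quoted above, and the hours estimate shows how quickly routine incidents add up across a lab of this size.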

 

Automation also promotes the ability to respond to higher value threats in a shorter amount of time. And if you can reduce the number of security incidents through automation, you reduce the risk of data loss, which again can amount to staggering amounts of money given the potential cost of a single breach.

Configuration (Repair/Recovery/Reimage) may not be a traditional security solution, but as an automated component in a larger initiative, it enables key security features that are not only compliance requirements, but also keep the operating environment running smoothly over the course of the lifecycle. And, for that reason alone, it should be included as part of any organization’s next generation security arsenal.

Learn more at www.utopicsoftware.com