Understand the aspects of disaster recovery: Disaster recovery is concerned with the recovery of critical systems in the event of a loss.
Be able to discuss the process of recovering a system in the event of a failure: A system recovery usually involves restoring the base operating systems, applications, and data files.
Be able to discuss the types of alternative sites available for disaster recovery: The three types of sites available for disaster recovery are hot sites, warm sites, and cold sites.
Be able to describe the needed components of an incident response policy: The incident response policy explains how incidents will be handled, including notification, resources, and escalation.
Full Backup: A full backup is a complete, comprehensive backup of all files on a disk or server.
Incremental Backup: An incremental backup is a partial backup that stores only the information that has changed since the last full or the last incremental backup. If a full backup were performed on a Sunday night, an incremental backup done on Monday night would contain only the information that changed since Sunday night. Incremental backups are usually the fastest backups to perform on most systems, and each incremental backup tape is relatively small.
Differential Backup: A differential backup is similar in function to an incremental backup, but it backs up every file that has been altered since the last full backup. As a result, it copies again any file that changed after the full backup, even if that file has not changed since the last differential backup. If a full backup were performed on Sunday night, a differential backup performed on Monday night would capture the information that was changed on Monday. A differential backup completed on Tuesday night would record the changes in any files from Monday and any changes in files on Tuesday. As you can see, during the week each differential backup would become larger; by Friday or Saturday night, it might be nearly as large as a full backup. At the conclusion of a differential backup, the archive bit is left on for those files so they are included again in the next backup.
When these backup methods are used in conjunction with each other, the risk of loss can be greatly reduced, but you should never combine incremental and differential backups in the same set.
At the main YaST System Backup dialog box, click Start Backup.
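The difference between the three backup types comes down to which files are selected and whether the archive bit is cleared afterward. The following is a minimal sketch of that logic, not a real backup tool. The `FileEntry` class, the `run_backup`, `modify`, and `simulate` functions, and the file names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class FileEntry:
    name: str
    archive_bit: bool = True  # set whenever a file is created or modified


def run_backup(files, kind):
    """Return the names copied by a backup of the given kind and update archive bits."""
    if kind == "full":
        selected = list(files)  # a full backup copies everything
    elif kind in ("incremental", "differential"):
        selected = [f for f in files if f.archive_bit]  # only flagged files
    else:
        raise ValueError(f"unknown backup kind: {kind}")

    # Full and incremental backups clear the archive bit.
    # A differential backup leaves it on, so the file is copied again next time.
    if kind in ("full", "incremental"):
        for f in selected:
            f.archive_bit = False
    return [f.name for f in selected]


def modify(files, name):
    """Simulate a user changing a file, which turns its archive bit back on."""
    for f in files:
        if f.name == name:
            f.archive_bit = True


def simulate(scheme):
    """Full backup on Sunday, then the chosen scheme on Monday and Tuesday."""
    files = [FileEntry("a.doc"), FileEntry("b.xls"), FileEntry("c.txt")]
    log = {"Sunday": run_backup(files, "full")}
    modify(files, "a.doc")  # changed on Monday
    log["Monday"] = run_backup(files, scheme)
    modify(files, "b.xls")  # changed on Tuesday
    log["Tuesday"] = run_backup(files, scheme)
    return log


if __name__ == "__main__":
    for scheme in ("incremental", "differential"):
        print(scheme, simulate(scheme))
```

In this sketch the Tuesday incremental backup contains only `b.xls`, while the Tuesday differential backup contains both `a.doc` and `b.xls`, which is why differential backups grow through the week.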
Essay 8-2
An important concept to keep in mind when working with incidents is the chain of custody, which covers how evidence is secured, where it is stored, and who has access to it. It is highly recommended that a log book be used to document every access and that visuals (pictures and video) be recorded to show how the evidence is secured. An easy way to distinguish an event from an incident is that an event is anything that happens, while an incident is any event that endangers a system or network. When a suspected incident arises, first responders are those who must ascertain whether it truly is an incident or a false alarm.
Depending on your organization, the first responder may be only the main security administrator or may be a team of network and system administrators. Once an incident has been confirmed, you must decide how to handle it. This process, called escalation, involves consulting policies, consulting appropriate management, and determining how best to conduct an investigation into the incident. When an incident occurs, who is responsible for managing the communications about the incident?
You periodically investigate log and audit files to determine the status of your systems and servers. Upon examining the email system, you notice that the outbound mail folder seems to be sending mail every second. You should investigate why the antivirus software is out-of-date, upgrade these systems as appropriate, and add server-based and mail-server virus-protection capabilities to your network.
Step two: Investigating the Incident
The process of investigating an incident involves searching logs, files, and any other sources of data about the nature and scope of the incident.
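As a rough illustration of searching logs for the scope of an incident, the sketch below filters log lines with a regular expression and counts the matches per host. The log format (timestamp, host, message), the sample entries, and the function names `search_log` and `count_by_host` are assumptions made for this example and do not correspond to any particular product's log layout.

```python
import re
from collections import Counter

# Hypothetical log lines in an assumed "timestamp host message" layout.
SAMPLE_LOG = [
    "2024-05-01T02:14:07 mail01 login failed for user admin",
    "2024-05-01T02:14:09 mail01 login failed for user admin",
    "2024-05-01T02:15:30 web02 login ok for user alice",
    "2024-05-01T02:16:02 db03 LOGIN FAILED for user root",
]


def search_log(lines, pattern):
    """Return the lines that match the pattern, ignoring case."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [line for line in lines if rx.search(line)]


def count_by_host(lines):
    """Count lines per host, taking the second whitespace-separated field as the host."""
    hosts = (line.split()[1] for line in lines if len(line.split()) >= 2)
    return Counter(hosts)


if __name__ == "__main__":
    hits = search_log(SAMPLE_LOG, r"login failed")
    for host, count in count_by_host(hits).most_common():
        print(host, count)
```

Grouping the matches by host gives a quick sense of how widely the activity is spread before you dig into individual systems.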
It is sad but true: one reason administrators don't put as much security on networks as they could is that they do not want to deal with the false positives. As a security administrator, you must seek a balance between being overwhelmed with too much unneeded information and knowing when something out of the ordinary is occurring. Although collecting as much information as possible is important, no one can be blamed for trying to protect their data. While it may be admirable to catch a crook deleting your data, if you can keep the data from being deleted, you will stand a much better chance of still being employed tomorrow.
Step three: Repairing the Damage
Most operating systems provide the ability to create a disaster-recovery process using distribution media or system state files.
The user updated all the programs on his computer and also updated his antivirus software; however, he's still reporting unusual behavior on his computer system. The user has probably contracted a worm that has infected the system files on his computer. When the scan is complete, help the user reinstall data files and scan the system again for viruses. ClamAV is an open source solution that was once available only for Unix-based systems and is now available for most operating systems.
Step four: Documenting and Reporting the Response
During the entire process of responding to an incident, you should document the steps you take to identify, detect, and repair the system or network. Emergency management (EM) personnel routinely stage fake emergencies to verify that they know what they should do in the event of an actual emergency. You should plan a fake incident at your site, inform all those who will be involved that it's coming, and then evaluate their response. Practice makes perfect, and there is no better time to practice your company's response to an emergency than before one really occurs.
Step five: Adjusting Procedures
The following questions might be included in a policy or procedure manual: How did the policies work or not work in this situation?
Act in Order of Volatility: When dealing with multiple issues, address them in order of volatility (OOV); always deal with the most volatile first. Capturing an image of the operating system in its exploited state can be helpful in revisiting the issue after the fact to learn more about it.
Record Time Offset: It is quite common for workstation times to be off slightly from actual time, and that can happen with servers as well. One way to account for this is to add an entry to a log file and note both the actual time at which you did so and the time the system recorded for it.
Take Hashes: It is important to collect as much data as possible to be able to illustrate the situation, and hashes must not be left out of the equation.
Talk to Witnesses: It is important to talk to as many witnesses to what happened as possible and as soon as possible.
Software vendors and hardware vendors are necessary elements in the process of building systems and applications. Agreements with these vendors help you protect yourself in the event that a software vendor goes out of business or you have a dispute with a maintenance provider for your systems. A service-level agreement (SLA) is an agreement between you or your company and a service provider, typically a technical support provider. Consider a medical practice that must grant an application vendor full access to all patient records in the spirit of being able to maintain the application. If a vendor promises to provide you with a response time of four hours, this means it will have someone involved and dedicated to resolving any difficulties you encounter within that time frame, whether that is a service technician in the field or a remote diagnostic process occurring on your system.
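Two of the evidence-handling steps above, taking hashes and recording the time offset, lend themselves to a short sketch. The example below uses Python's standard `hashlib` and `datetime` modules. The function names `sha256_of_file` and `time_offset_seconds` are hypothetical, and the choice of SHA-256 is an assumption, since the text does not name a hash algorithm.

```python
import hashlib
from datetime import datetime


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large evidence images need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def time_offset_seconds(system_time, reference_time):
    """Return how far the system clock is ahead of (+) or behind (-) the reference."""
    return (system_time - reference_time).total_seconds()


if __name__ == "__main__":
    # Hypothetical readings: the time shown on the workstation versus a trusted clock.
    shown = datetime(2024, 1, 1, 12, 0, 5)
    actual = datetime(2024, 1, 1, 12, 0, 0)
    print("offset in seconds:", time_offset_seconds(shown, actual))
```

Recording the hash when evidence is collected lets you show later that the file has not changed, and the recorded offset lets you translate log timestamps from that system into actual time.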
Recovery Time Objectives: The recovery time objective (RTO) is the maximum amount of time that a process or service is allowed to be down and the consequences still considered acceptable.
Mean Time between Failures: The mean time between failures (MTBF) is the measure of the anticipated incidence of failure for a system or component. If the MTBF of a cooling system is one year, you can anticipate that the system will last for a one-year period; this means you should be prepared to replace or rebuild the system once a year.
Mean Time to Restore: The mean time to restore (MTTR) is the measurement of how long it takes to repair a system or component once a failure occurs (this is often also referenced as mean time to repair). While MTTR is considered a common measure of maintainability, be careful when evaluating it because it doesn't typically include the time needed to acquire a component and have it shipped to your location.
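These measures can be combined in simple arithmetic. The sketch below uses the common steady-state formula availability = MTBF / (MTBF + MTTR), which is a standard reliability approximation and is not stated in the text above. It also compares the restore time against the RTO. The function names and the `logistics_hours` parameter (time to acquire and ship a part) are hypothetical.

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state availability as the fraction of time the system is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def meets_rto(mttr_hours, rto_hours, logistics_hours=0.0):
    """Check whether repair time plus any acquisition and shipping delay fits the RTO."""
    return mttr_hours + logistics_hours <= rto_hours


if __name__ == "__main__":
    # Assumed figures: MTBF of one year (8760 hours) and a 4-hour repair.
    print(f"availability: {availability(8760, 4):.5f}")
    print("meets 8-hour RTO:", meets_rto(4, 8))
    print("with 24-hour shipping delay:", meets_rto(4, 8, logistics_hours=24))
```

The last call shows why the caution about MTTR matters: a repair that fits comfortably within the RTO on paper can miss it once the time to obtain a replacement component is included.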
Essay 8-3
Periodic security audits of user access and rights can help determine whether privilege-granting processes are appropriate and whether computer usage and escalation processes are in place and working.
Privilege audits: The specifics may differ, but the following general steps should always be undertaken: Plan for the audit, conduct the audit, evaluate the results, communicate the results and needed changes, and follow up.
Failing to do so can result in privilege creep (also known as access creep, referenced earlier), which occurs when an individual accidentally gains a higher level of access than they would normally be entitled to or need.
Usage auditing: Verifies that systems and software are used appropriately and consistently with organizational policies. A usage audit may entail physically inspecting systems, verifying software configurations, and conducting other activities intended to prove that resources are being used appropriately. Periodically inspecting systems to ensure that software updates and patches are current and that only approved software is installed is a good idea.
Escalation audits: Help ensure that procedures and communications methods are working properly in the event of a problem or issue. These types of audits test your organization to ensure that it has the appropriate procedures, policies, and tools to deal with any problems in the event of an emergency, catastrophe, or other need for management intervention. Disaster recovery plans, business continuity plans, and other plans are tested and verified for accuracy.
To successfully complete your assignment, you'll need to inspect every user account and group to verify which user accounts belong to which groups.
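Verifying which user accounts belong to which groups can be sketched as inverting a group-to-members listing and comparing the result against what each user is approved to hold. The data, the group and user names, and the functions `groups_by_user` and `flag_unexpected` below are all hypothetical. A real audit would pull memberships from your directory service rather than from a hard-coded dictionary.

```python
def groups_by_user(group_members):
    """Invert a {group: [users]} mapping into {user: {groups}}."""
    result = {}
    for group, members in group_members.items():
        for user in members:
            result.setdefault(user, set()).add(group)
    return result


def flag_unexpected(user_groups, approved):
    """Return {user: {groups}} for memberships that are not on the approved list."""
    findings = {}
    for user, groups in user_groups.items():
        extra = groups - approved.get(user, set())
        if extra:
            findings[user] = extra
    return findings


if __name__ == "__main__":
    # Hypothetical memberships as exported from a directory.
    memberships = {
        "staff": ["alice", "bob"],
        "backup-operators": ["bob"],
        "domain-admins": ["bob", "carol"],
    }
    # Hypothetical approvals from the access-request records.
    approved = {
        "alice": {"staff"},
        "bob": {"staff", "backup-operators"},
        "carol": {"domain-admins"},
    }
    print(flag_unexpected(groups_by_user(memberships), approved))
```

Any membership reported here is a candidate for privilege creep and should be traced back to an access request or removed.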
Administrative auditing: The assets for which you must be able to demonstrate appropriate controls include everything related to personally identifiable information (PII). PII exists within your databases for all users, customers, vendors, and contacts and includes such things as phone numbers, addresses, credit card numbers, employee status, and so on.
Log file auditing: The DNS service, when running on Windows Server 2008 for example, writes entries to the log file that can be examined using Event Viewer. Just as you set the size and overwrite options for the Security Log object, you should take those same actions for the DNS Server logs as well.