[ temporary import ]
please note:
- the content below is remote from Wikipedia
- it has been imported raw for GetWiki
In information technology, a backup, or data backup, is a copy of computer data that is already in secondary storage, written into an archive file so that it may be used to restore the original after a data loss event; the term also refers to the process of making that copy. In contrast to everyday use of the term "archive", the data stored in an "archive file" is not necessarily old or of historical interest. [Joe Kissell, Take Control of Mac OS X Backups, Version 2.0, TidBITS Electronic Publishing, Ithaca, NY, 2007, ISBN 978-0-9759503-0-2, pp. 18–20 (The Archive), 24 (client-server), 82–83 (archive file), 112–114 (Off-site storage backup rotation scheme), 126–141 (old Retrospect terminology and GUI, still used in Windows variant), 165 (client-server), 128 (subvolume, later renamed Favorite Folder in Macintosh variant); retrieved 17 May 2019] The verb form is "back up" (a phrasal verb), whereas the noun and adjective form is "backup". ["back•up", The American Heritage Dictionary of the English Language, Houghton Mifflin Harcourt, 2018; retrieved 9 May 2018]
Backups are primarily used to recover data after its loss through deletion or corruption, and secondarily to recover data from an earlier time, based on a user-defined data retention policy. [S. Nelson, Pro Data Backup and Recovery, Chapter 1: Introduction to Backup and Recovery, Apress, 2011, pp. 1–16, ISBN 978-1-4302-2663-5; retrieved 8 May 2018] Though backups represent a simple form of disaster recovery and should be part of any disaster recovery plan, backups by themselves should not be considered a complete disaster recovery plan. One reason is that not all backup systems are able to reconstitute a computer system or other complex configuration, such as a computer cluster, active directory server, or database server, simply by restoring data from a backup. [D.J. Cougias, E.L. Heiberger, K. Koop, The Backup Book: Disaster Recovery from Desktop to Data Center, Chapter 1: What's a Disaster Without a Recovery?, Network Frontiers, 2003, pp. 1–14, ISBN 0-9729039-0-9]
Since a backup system contains at least one copy of all data considered worth saving, the data storage requirements can be significant. Organizing this storage space and managing the backup process can be a complicated undertaking. An information repository model may be used to provide structure to the storage. Many different types of data storage devices are useful for making backups, and these devices can be arranged in many different ways to provide geographic redundancy, data security, and portability.
Before data are sent to their storage locations, they are selected, extracted, and manipulated. Many different techniques have been developed to optimize the backup procedure. These include optimizations for dealing with open files and live data sources as well as compression, encryption, and de-duplication, among others. Every backup scheme should include dry runs that validate the reliability of the data being backed up. It is important to recognize the limitations and human factors involved in any backup scheme. [Terry Sullivan, "A Beginner's Guide to Backing Up Photos", The New York Times, 11 January 2018: "a hard drive ... an established company ... declared bankruptcy ... where many ... had ..."]
This article focuses on features found even in personal backup applications, as opposed to features found only in enterprise client-server backup applications. It also assumes at least a random access index to the secondary storage data to be backed up, and therefore does not discuss the decades-old practice of pure tape-to-tape copying.

Storage, the base of a backup system

Information repository models

Any backup strategy starts with a concept of an information repository, "a secondary storage space for data". [Mary McMahon, "What Is an Information Repository?", wiseGEEK, Conjecture Corporation, 1 April 2019; retrieved 8 May 2019] The backup data needs to be stored, so part of the model is the backup rotation scheme. The repository must be organized by some method; the method could be as simple as a sheet of paper with a list of all backup media (DVDs, etc.) and the dates they were produced. A more sophisticated method could include a computerized index, catalog, or relational database. Different backup methods have different advantages:

Backup methods

Unstructured : An unstructured repository may simply be a stack of tapes or DVD-Rs or external HDDs with minimal information about what was backed up and when. This method is the easiest to implement, but probably the least likely to achieve a high level of recoverability as it must lack automation.
Full only / System imaging : A repository using this backup method contains complete source data copies taken at one or more specific points in time. [Alex Mayer, "Backup Types Explained: Full, Incremental, Differential, Synthetic, and Forever-Incremental", Nakivo Blog, Nakivo, 6 November 2017; retrieved 17 May 2019] Because it copies system images, this technology is frequently used by computer technicians to record known good configurations. Imaging is generally more useful for deploying a standard configuration to many systems than as a tool for making ongoing backups of diverse systems. ["Five key questions to ask about your backup solution", 23 March 2014, archived 4 March 2016; retrieved 23 September 2015]


Incremental : An incremental backup stores data changed since a reference point in time. [Jessie Reed, "What Is Incremental Backup?", Nakivo Blog, Nakivo, 27 February 2018, sec. Reverse incremental, Multilevel incremental, Block-level; retrieved 17 May 2019] Duplicate copies of unchanged data are not made. Typically a full (usually non-image) backup of all files is made on one occasion (or at infrequent intervals), serving as the reference point for an incremental repository. After that, a number of incremental backups are made after successive time periods. A restore begins with the last full backup and then applies the incrementals. [Incremental Backup, archived 21 June 2016; retrieved 10 March 2006] Some backup systems can create a synthetic full backup from a series of incrementals, thus providing the equivalent of frequently doing a full backup. [James Pond, "How Time Machine Works its Magic", Apple OSX and Time Machine Tips, 31 August 2013, sec. File System Event Store, Hard Links; retrieved 19 May 2019] When this is done by modifying a single archive file, it speeds restores of recent versions of files even for personal backup applications. It should not be confused with a technique for creating a second archive file from a first, which is a capability of enterprise client-server backup applications.
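The restore order described above (last full backup first, then each incremental in sequence) can be sketched in a few lines of Python. This is a minimal in-memory illustration, not how any particular product stores its archive: the dictionary model, the use of `None` as a deletion marker, and the function names `make_incremental` and `restore` are all assumptions made for the example.

```python
from typing import Dict, List, Optional

# A backup is modeled as a mapping from path to file content.
# None marks a file deleted since the previous backup (a convention
# assumed for this sketch only).
Backup = Dict[str, Optional[bytes]]


def make_incremental(previous: Dict[str, bytes], current: Dict[str, bytes]) -> Backup:
    """Record only what changed between two states of the source data."""
    delta: Backup = {}
    for path, content in current.items():
        if previous.get(path) != content:
            delta[path] = content      # new or modified file
    for path in previous:
        if path not in current:
            delta[path] = None         # deletion marker
    return delta


def restore(full: Dict[str, bytes], incrementals: List[Backup]) -> Dict[str, bytes]:
    """Start from the full backup, then apply each incremental, oldest first."""
    state = dict(full)
    for delta in incrementals:
        for path, content in delta.items():
            if content is None:
                state.pop(path, None)
            else:
                state[path] = content
    return state
```

Unchanged files never appear in an incremental, which is why each one is small, but every incremental since the full backup is needed to reach the latest state.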


True Continuous Data Protection (CDP), allowing restoring data to any point in time, "is the gold standard: the most comprehensive and advanced data protection. But 'near CDP' technologies can deliver enough protection for many companies with less complexity and cost. For example, snapshots ["near-CDP" clarification two paragraphs down] can provide a reasonable near-CDP-level of protection for file shares, letting users directly access data on the file share at regular intervals, say, every half hour or 15 minutes. That's certainly a higher level of protection than tape-based or disk-based nightly backups and may be all you need." [Behzad Behtash, "Why Continuous Data Protection's Getting More Practical", Disaster recovery/business continuity, InformationWeek, 2010-05-10; retrieved 2011-11-12; the quotes after the first two sentences are now behind a registration wall] Because "near-CDP does this [copying] at pre-set time intervals", it is essentially incremental backup initiated by a timer instead of a script. ["Continuous data protection (CDP) explained: True CDP vs near-CDP", TechTarget, July 2010; retrieved 22 June 2019: "... copies data from a source to a target. True CDP does this every time a change is made, while so-called near-CDP does this at pre-set time intervals. Near-CDP is effectively the same as snapshotting.... True CDP systems record every write and copy them to the target where all changes are stored in a log. By contrast, near-CDP/snapshot systems copy files in a straightforward manner but require applications to be quiesced and made ready for backup, either via the application's backup mode or using, for example, Microsoft's Volume Shadow Copy Services (VSS)."]
Because in true CDP "backup write operations are executed at the level of the basic input/output system (BIOS) of the microcomputer in such a manner that normal use of the computer is unaffected", [Peter B. Malcolm, US Patent 5086502: Method of operating a data processing system, filed 13 November 1989, Google Patents; retrieved 29 November 2016: "... a backup system in which a copy of every change made to a storage medium is recorded as the change occurs ..."] true CDP backup must in practice be run in conjunction with a virtual machine [Victor Wu, "EMC RecoverPoint for Virtual Machine Overview", Victor Virtual, WuChiKin, 4 March 2017; retrieved 22 June 2019: "The splitter splits out the Write IOs to the VMDK/RDM of a VM and sends a copy to the production VMDK and also to the RecoverPoint for VMs cluster."] ["Zerto or Veeam?", RES-Q Services, March 2017; retrieved 7 July 2019: "Zerto doesn't use snapshot technology like Veeam. Instead, Zerto deploys small virtual machines on its physical hosts. These Zerto VMs capture the data as it is written to the host and then send a copy of that data to the replication site..... However, Veeam has the advantage of being able to more efficiently capture and store data for long-term retention needs. There is also a significant pricing difference, with Veeam being cheaper than Zerto."] or equivalent, ["Agent Related: What does the CloudEndure Agent do?", 2019; retrieved 3 July 2019: "The CloudEndure Agent performs an initial block-level read of the content of any volume attached to the server and replicates it to the Replication Server. The Agent then acts as an OS-level read filter to capture writes and synchronizes any block level modifications to the CloudEndure Replication Server, ensuring near-zero RPO."] which rules it out for ordinary personal backup applications. It is therefore discussed in the "Enterprise client-server backup" article, rather than in this article.
"Near-CDP" backup applications, frequently marketed as "CDP", only allow restores at fixed intervals such as 15 minutes, one hour, or 24 hours, because they automatically take incremental backups at those intervals. They use journaling; when the interval is shorter than one hour, [James Pond, "FAQ 13. How are [Time Machine] backups scheduled (and can I change that)?", Apple OSX and Time Machine Tips (as mirrored after James Pond died in 2013), 25 May 2013; retrieved 4 July 2019: "Time Machine was designed and optimized to do backups hourly.... You cannot change the schedule within Time Machine. You must use a 3rd-party app, or manually alter some system files."] "near-CDP" backup applications, for example Arq Backup, [Stefan Reitshamer (principal developer of Arq Backup), "Troubleshooting backing up open/locked files on Windows", Arq Blog, Haystack Software LLC, 5 July 2017; retrieved 25 June 2019: "Arq uses Windows Volume Shadow Copy Service (VSS) to back up files that are open/locked."] are typically based on periodic "snapshots" because "to avoid downtime, high-availability systems may instead perform the backup on a snapshot, a read-only copy of the data set frozen at a point in time, and allow applications to continue writing to their data". Interactive applications' coordinated snapshots are discussed in the "Enterprise client-server backup" article.
"Near-CDP", except for Apple Time Machine, [James Pond, "How Time Machine Works its Magic", Apple OSX and Time Machine Tips (as mirrored after James Pond died in 2013), 31 August 2013; retrieved 10 July 2019: "The File System Event Store is a hidden log that OSX keeps on each HFS+ formatted disk/partition of changes made to the data on it. It doesn't list every file that's changed, but each directory (folder) that's had anything changed inside it."] intent-logs every change on the host system, often by saving byte or block-level differences rather than file-level differences. [Brian Posey, "CDP technology offers organizations a steady data protection method", DataBackup, TechTarget, August 2016, sec. Other differentiators, Value in block-level backup; retrieved 10 May 2019] This backup method differs from simple disk mirroring in that it enables a roll-back of the log and thus a restoration of old images of data.
Intent logging allows proper precautions for the consistency of live data, so that captured changes can provide fine granularities of restorable objects ranging from crash-consistent images to logical objects such as files, databases and logs. ["An Overview of Continuous Data Protection"; retrieved 2011-11-12]

Reverse incremental

A reverse incremental backup method stores a recent archive file "mirror" of the source data and a series of differences between the "mirror" in its current state and its previous states. It starts with a non-image full backup. After the full backup is performed, the system periodically synchronizes the full backup with the live copy, while storing the data necessary to reconstruct older versions. This can be done either using hard links, as Apple Time Machine does, or using binary diffs. Reverse incremental works particularly well if most restores are of the latest versions.
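The hard-link variant can be illustrated with a short Python sketch. It is only an approximation of the idea under stated assumptions: the source is a flat directory, the destination filesystem supports hard links, and the function name `snapshot` is hypothetical. It is not a description of how Time Machine itself is implemented.

```python
import filecmp
import os
import shutil
from pathlib import Path
from typing import Optional


def snapshot(source: Path, dest: Path, previous: Optional[Path] = None) -> None:
    """Create a complete-looking copy of the flat directory `source` in `dest`.

    Files unchanged since the `previous` snapshot are hard-linked rather
    than copied, so their data is stored only once on disk.
    """
    dest.mkdir(parents=True)
    for item in sorted(source.iterdir()):
        if not item.is_file():
            continue  # subdirectories are skipped to keep the sketch short
        old = previous / item.name if previous is not None else None
        if old is not None and old.is_file() and filecmp.cmp(item, old, shallow=False):
            os.link(old, dest / item.name)        # unchanged: share stored data
        else:
            shutil.copy2(item, dest / item.name)  # new or modified: fresh copy
```

Every snapshot directory can be browsed and restored as if it were a full backup, and the newest one is always a complete mirror, which is why restores of the latest version are cheap.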


Differential : Each differential backup saves the data that has changed since the last full backup. This backup method has the advantage that at most two backups from the repository are needed to restore the data. One disadvantage, compared to the incremental backup method, is that as the time since the last full backup (and thus the accumulated changes in data) increases, so does the time to perform the differential backup. Restoring an entire system requires starting from the most recent full backup and then applying just the last differential backup.
By standard definition, a differential backup copies files that have been created or changed since the last full backup, regardless of whether any other differential backups have been made since then, whereas an incremental backup copies files that have been created or changed since the most recent backup of any type (full or incremental). Other variations of incremental backup include multi-level incrementals and block-level incrementals that compare parts of files instead of just entire files.
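The difference between the two definitions comes down to which reference time is used when selecting files. The following Python sketch makes that concrete; the file names, the day-numbered modification times, and the helper `files_to_copy` are invented for illustration.

```python
from typing import Dict, List


def files_to_copy(mtimes: Dict[str, int], since: int) -> List[str]:
    """Return the paths modified after the reference time `since`."""
    return sorted(path for path, mtime in mtimes.items() if mtime > since)


# Hypothetical modification times, in days after the full backup on day 0.
mtimes = {"report.doc": 1, "photo.jpg": 2, "notes.txt": 3, "old.cfg": 0}

last_full = 0                # reference point for a differential backup
last_backup_of_any_type = 2  # reference point for an incremental backup

differential = files_to_copy(mtimes, since=last_full)
incremental = files_to_copy(mtimes, since=last_backup_of_any_type)
```

On day 3 the differential backup copies every file changed since day 0, while the incremental backup copies only the one file changed since the backup on day 2.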

Storage media

Figure: From left to right, a DVD disc in plastic cover, a USB flash drive and an external hard drive.
Regardless of the repository model that is used, the data has to be copied onto some archive file data storage medium.
Magnetic tape : Lower prices for disk-to-disk backup now give magnetic tape, long the most commonly used medium for bulk data storage, backup, archiving, and interchange, less of a "clear price advantage." [Steve Gardner, "Disk to Disk Backup versus Tape – War or Truce?", 9 December 2004; retrieved 2019-05-26] Many tape formats have been proprietary or specific to certain markets like mainframes or a particular brand of personal computer, but by 2014 LTO was edging out two other remaining viable "super" formats, IBM 3592 (now also referred to as the TS11xx series) and Oracle StorageTek T10000, [Tom Coughlin, "Keeping Data for a Long Time", Forbes, Forbes Media LLC, 29 June 2014, para. Magnetic Tapes (popular formats, storage life), para. Hard Disk Drives (active archive), para. First consider flash memory in archiving (... may not have good media archive life); retrieved 19 April 2018] and further development of the smaller-capacity DDS format had been canceled. By 2017 Spectra Logic, which builds tape libraries for both the LTO and TS11xx formats, was predicting that "Linear Tape Open (LTO) technology has been and will continue to be the primary tape technology." ["Digital Data Storage Outlook 2017", Spectra, Spectra Logic, 2017, p. 14 (Tape); retrieved 11 July 2018] Tape is a sequential access medium, so even though access times may be poor, the rate of continuously writing or reading data can actually be very fast.

Hard disk : The capacity-to-price ratio of hard disks has been improving for many years, making them more competitive with magnetic tape as a bulk storage medium. The main advantages of hard disk storage are low access times, availability, capacity and ease of use. ["Bye Bye Tape, Hello 5.3TB eSATA"; retrieved 22 April 2007] External disks can be connected via local interfaces like SCSI, USB, FireWire, or eSATA, or via longer distance technologies like Ethernet, iSCSI, or Fibre Channel. Some disk-based backup systems, via Virtual Tape Libraries or otherwise, support data deduplication, which can dramatically reduce the amount of disk storage capacity consumed by daily and weekly backup data. [Retrospect® 12 Windows User's Guide, Retrospect Inc., 2017, pp. 30–31 (deduplication via "Snapshots", a Retrospect term which predates and is distinct from Snapshot (computer storage)), 31–32 (Dashboard), 41–43 (removable disk drives), 216–218 (selector as subset filter for synthetic full backups), 230–233 (Scripted Verification), 280 (Multiple Executions), 369 (Duplicate Execution Options), 420 (Startup Preferences: Launcher for auto-launch), 426–427 (E-mail), 433–434 (Open File Backup Tips: VSS snapshot at natural pause), 530–544 (SQL Server Agent: coordinating VSS snapshot), 545–566 (Exchange Server Agent: coordinating VSS snapshot); retrieved 2 September 2018] ["Symantec Shows Backup Exec a Little Dedupe Love; Lays out Source Side Deduplication Roadmap", DCIG, archived 4 March 2016; retrieved 26 February 2016] [Veritas NetBackup™ Deduplication Guide, Veritas Technologies LLC, 2016; retrieved 26 July 2018] One disadvantage of hard disk backups vis-a-vis tape is that hard drives are close-tolerance mechanical devices and may be more easily damaged, especially while being transported (e.g., for off-site backups). [John L. Jacobi, "Hard-core data preservation: The best media and methods for archiving your data", PC World, 29 Feb 2016, sec. External Hard Drives (on the shelf, magnetic properties, mechanical stresses, vulnerable to shocks), Tape; retrieved 19 April 2018] In the mid-2000s, several drive manufacturers began to produce portable drives employing ramp loading and accelerometer technology (sometimes termed a "shock sensor"), ["Ramp Load/Unload Technology in Hard Disk Drives", HGST, Western Digital, November 2007, p. 3 (sec. Enhanced Shock Tolerance); retrieved 29 June 2018] ["Toshiba Portable Hard Drive (Canvio® 3.0)", Toshiba Data Dynamics Singapore, Toshiba Data Dynamics Pte Ltd, 2018, sec. Overview (Internal shock sensor and ramp loading technology); retrieved 16 June 2018] and by 2010 the industry average in drop tests for drives with that technology showed drives remaining intact and working after a 36-inch non-operating drop onto industrial carpeting. ["Iomega® Drop Guard™ Technology", Hard Drive Storage Solutions, Iomega Corp., 20 September 2010, p. 2 (What is Drop Shock Technology?, What is Drop Guard Technology? (... features special internal cushioning .... 40% above the industry average)), p. 3 (*NOTE); retrieved 12 July 2018] The manufacturers do not, however, guarantee these results and note that a drive may fail to survive even a shorter drop. Some manufacturers also offer 'ruggedized' portable hard drives, which include a shock-absorbing case around the hard disk, and claim a range of higher drop specifications. [John Burek, "The Best Rugged Hard Drives and SSDs", PC Magazine, Ziff Davis, 15 May 2018, sec. What Exactly Makes a Drive Rugged? (When a drive is encased ... you're mostly at the mercy of the drive vendor to tell you the rated maximum drop distance for the drive); retrieved 4 August 2018] [Justin Krajeski, Kimber Streams, "The Best Portable Hard Drive", The New York Times, 20 March 2017, archived 31 March 2017; retrieved 4 August 2018] Another disadvantage is that over a period of years the stability of hard disk backups is shorter than that of tape backups. ["Best Long-Term Data Archive Solutions", Iron Mountain, Iron Mountain Inc., 2018, sec. More Reliable (average mean time between failure ... rates, best practice for migrating data); retrieved 19 April 2018]

Optical storage : Recordable CDs, DVDs, and Blu-ray Discs are commonly used with personal computers and generally have low media unit costs. However, the capacities and speeds of these and other optical discs have traditionally been lower than those of hard disks or tapes, though advances in optical media are slowly shrinking that gap. [S. Wan, Q. Cao, C. Xie, "Optical storage: An emerging option in long-term digital preservation", Frontiers of Optoelectronics 7 (4): 486–492, 2014, doi:10.1007/s12200-014-0442-2] ["High-capacity optical long data memory based on enhanced Young's modulus in nanoplasmonic hybrid glass composites", Nature Communications, 2018, p. 1183, authors include Z. Xia and M. Gu, doi:10.1038/s41467-018-03589-y, Bibcode 2018NatCo...9.1183Z] Many optical disk formats are WORM type, which makes them useful for archival purposes since the data cannot be changed. The use of an auto-changer or jukebox can make optical discs a feasible option for larger-scale backup systems. Some optical storage systems allow for cataloged data backups without human contact with the discs, allowing for longer data integrity. A 2008 French study indicated the lifespan of typically-sold CD-Rs was 2–10 years, [Gérard Poirier, Foued Berahou, Journal de 20 Heures, Institut national de l'audiovisuel, 3 March 2008, approximately minute 30 of the TV news broadcast] but one manufacturer later estimated the longevity of its CD-Rs with a gold-sputtered layer to be as high as 100 years. ["Archival Gold CD-R "300 Year Disc" Binder of 10 Discs with Scratch Armor Surface", Delkin Devices, Delkin Devices Inc.; retrieved 27 September 2013] Sony's Optical Disc Archive can achieve speeds of 250 Mbit/s.

SSD/Solid-state drive : Also known as flash memory, thumb drives, USB flash drives, CompactFlash, SmartMedia, Memory Stick, Secure Digital cards, etc., these devices are relatively expensive for their low capacity in comparison to hard disk drives, but are very convenient for backing up relatively low data volumes. A solid-state drive does not contain any movable parts, unlike its magnetic drive counterpart, making it less susceptible to physical damage, and can have high throughput on the order of 500 Mbit/s to 6 Gbit/s. The capacity offered by SSDs continues to grow and prices are gradually decreasing as they become more common. [R. Micheloni, P. Olivo, "Solid-State Drives (SSDs)", Proceedings of the IEEE 105 (9): 1586–88, 2017, doi:10.1109/JPROC.2017.2727228; retrieved 8 May 2018] Over a period of years the stability of flash memory backups is shorter than that of hard disk backups.

Remote backup service AKA cloud backup : Adding cloud-based backup to the benefits of local and offsite tape archiving adds a layer of data protection, because "Securing your data for posterity, i.e., archiving, requires a different approach, where shelved media life and future file compatibility trump the speed and convenience that make backup palatable to the average user." Off-site storage has historically been used to protect against events such as fires, floods, or earthquakes which could destroy locally stored backups. ["Remote Backup", EMC Glossary, Dell, Inc; retrieved 8 May 2018: "Effective remote backup requires that production data be regularly backed up to a location far enough away from the primary location so that both locations would not be affected by the same disruptive event."]

Factors for success include:
* initial seed loading / cloud seeding
* trusting a provider to maintain the privacy and integrity of the user's data (with confidentiality enhanced by encryption)
Floppy disk and its derivatives : While floppy disks were still commonly used during the 1980s and early 1990s, their limited capacity rendered them effectively obsolete. "Superfloppy" and related "non-floppy" devices provide greater storage capacity and are used by some developers.

Managing the information repository

Regardless of the information repository model, or data storage media used for backups, a balance needs to be struck between accessibility, security and cost. These media management methods are not mutually exclusive and are frequently combined to meet the user's needs. Using on-line disks for staging data before it is sent to a near-line tape library is a common example.
Information repository implementations [B. Stackpole, P. Hanrion, Software Deployment, Updating, and Patching, CRC Press, 2007, pp. 164–165, ISBN 978-1-4200-1329-0; retrieved 8 May 2018] [S. Gnanasundaram, A. Shrivastava, Information Storage and Management: Storing, Managing, and Protecting Digital Information in Classic, Virtualized, and Cloud Environments, John Wiley and Sons, 2012, p. 255, ISBN 978-1-118-23696-3; retrieved 8 May 2018] include:
On-line : On-line backup storage is typically the most accessible type of data storage, which can begin a restore in milliseconds. An internal hard disk or a disk array (maybe connected to SAN) is one example of an on-line backup. This type of storage is convenient and speedy, but is relatively expensive and is vulnerable to being deleted or overwritten, either by accident, by malevolent action, or in the wake of a data-deleting virus payload.
Near-line : Near-line storage is typically less accessible and less expensive than on-line storage, but still useful for backup data storage. A good example would be a tape library with restore times ranging from seconds to a few minutes. A mechanical device is usually used to move media units from storage into a drive where the data can be read or written. Generally it has safety properties similar to on-line storage.
Off-line : Off-line storage requires some direct action to provide access to the storage media: for example inserting a tape into a tape drive or plugging in a cable. Because the data are not accessible via any computer except during limited periods in which they are written or read back, they are largely immune to a whole class of on-line backup failure modes. Access time will vary depending on whether the media are on-site or off-site.
Off-site data protection: Backup media may be sent to an off-site vault to protect against a disaster or other site-specific problem. The vault can be as simple as a system administrator's home office or as sophisticated as a disaster-hardened, temperature-controlled, high-security bunker with facilities for backup media storage. Importantly a data replica can be off-site but also on-line (e.g., an off-site RAID mirror). Such a replica has fairly limited value as a backup, and should not be confused with an off-line backup.
Backup site or disaster recovery center (DR center): In the event of a disaster, the data on backup media will not be sufficient to recover. Computer systems onto which the data can be restored and properly configured networks are necessary too. Some organizations have their own data recovery centers that are equipped for this scenario. Other organizations contract this out to a third-party recovery center. Because a DR site is itself a huge investment, backing up is very rarely considered the preferred method of moving data to a DR site. A more typical way would be remote disk mirroring, which keeps the DR data as up to date as possible.

Selection and extraction of data

A successful backup job starts with selecting and extracting coherent units of data. Most data on modern computer systems is stored in discrete units, known as files. These files are organized into filesystems. Files that are actively being updated can be thought of as "live" and present a challenge to back up. It is also useful to save metadata that describes the computer or the filesystem being backed up.
Deciding what to back up at any given time involves tradeoffs. Backing up too much redundant data fills the information repository too quickly. Backing up an insufficient amount of data can eventually lead to the loss of critical information. [D. Lees, "What to backup – a critical look at your data", Irontree Blog, Irontree Internet Services CC, 25 January 2017; retrieved 8 May 2018]


Copying files : With a file-level approach, making copies of files is the simplest and most common way to perform a backup. A means to perform this basic function is included in all backup software and all operating systems.
Partial file copying: Instead of copying whole files, a backup may include only the blocks or bytes within a file that have changed in a given period of time. This technique can substantially reduce needed storage space, but requires a high level of sophistication to reconstruct files in a restore situation. Some implementations require integration with the source file system.
Deleted files : To prevent the unintentional restoration of files that have been intentionally deleted, a record of the deletion must be kept.
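Partial file copying can be sketched as comparing per-block hashes against those recorded at the previous backup and storing only the blocks that differ. The Python below is a simplified illustration: the fixed block size, the use of SHA-256, and the function names are assumptions for the example, and real implementations often work with file-system change tracking instead of rehashing.

```python
import hashlib
from typing import Dict, List

BLOCK_SIZE = 4  # unrealistically small so the example is easy to follow


def block_hashes(data: bytes) -> List[str]:
    """Hash each fixed-size block of the file content."""
    return [hashlib.sha256(data[i:i + BLOCK_SIZE]).hexdigest()
            for i in range(0, len(data), BLOCK_SIZE)]


def changed_blocks(old_hashes: List[str], new_data: bytes) -> Dict[int, bytes]:
    """Return {block index: block bytes} for blocks that differ from the last backup."""
    delta: Dict[int, bytes] = {}
    for index, digest in enumerate(block_hashes(new_data)):
        if index >= len(old_hashes) or old_hashes[index] != digest:
            start = index * BLOCK_SIZE
            delta[index] = new_data[start:start + BLOCK_SIZE]
    return delta


def apply_blocks(old_data: bytes, delta: Dict[int, bytes], new_length: int) -> bytes:
    """Rebuild the new version from the old version plus the changed blocks."""
    buf = bytearray(old_data[:new_length].ljust(new_length, b"\0"))
    for index, block in delta.items():
        start = index * BLOCK_SIZE
        buf[start:start + len(block)] = block
    return bytes(buf)
```

The restore side needs the old version, the delta, and the new length, which hints at why the text describes reconstruction as requiring more sophistication than whole-file copying.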


Filesystem dump : Instead of copying files within a file system, a block-level copy of the whole filesystem itself can be made. This is also known as a raw partition backup and is related to disk imaging. The process usually involves unmounting the filesystem and running a program like dd (Unix). [W.C. Preston, Backup & Recovery: Inexpensive Backup Solutions for Open Systems, O'Reilly Media, Inc, 2007, pp. 111–114, ISBN 978-0-596-55504-7; retrieved 8 May 2018] Because the disk is read sequentially and with large buffers, this type of backup can be much faster than reading every file normally, especially when the filesystem contains many small files, is highly fragmented, or is nearly full. But because this method also reads the free disk blocks that contain no useful data, it can also be slower than conventional reading, especially when the filesystem is nearly empty. Some filesystems, such as XFS, provide a "dump" utility that reads the disk sequentially for high performance while skipping unused sections. The corresponding restore utility can selectively restore individual files or the entire volume at the operator's choice. [W.C. Preston, Unix Backup & Recovery, O'Reilly Media, Inc, 1999, pp. 73–91, ISBN 978-1-56592-642-4; retrieved 8 May 2018]
Identification of changes: Some filesystems have an archive bit for each file that says it was recently changed. Some backup software looks at the date of the file and compares it with the last backup to determine whether the file was changed.
Versioning file system : A versioning filesystem tracks all changes to a file. The retained history can reach all the way back to the file's creation time, or cover a shorter period. The Wayback versioning filesystem for Linux is an example. [Wayback: A User-level V File System for Linux (2004), archived 6 April 2007, retrieved 10 March 2007]
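
The date-comparison approach to identifying changes can be sketched as follows. This is a minimal illustration, assuming that the time of the last backup is available as a POSIX timestamp; the function name `files_changed_since` is hypothetical.

```python
import os
from pathlib import Path


def files_changed_since(root: str, last_backup_time: float) -> list[Path]:
    """Return files under root whose modification time is later than last_backup_time.

    last_backup_time is a POSIX timestamp (seconds since the epoch).
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if path.stat().st_mtime > last_backup_time:
                changed.append(path)
    return sorted(changed)
```

A comparison like this cannot notice deleted files, and it can be fooled when a file's timestamp is reset, which is one reason some software relies on an archive bit or a separate record of deletions instead.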

Live data

A snapshot is an instantaneous function of some filesystems that presents a copy of the filesystem as if it were frozen at a specific point in time, often by a copy-on-write mechanism. An effective way to back up live data is to temporarily quiesce it (e.g., close all files), take a snapshot, and then resume live operations. At this point the snapshot can be backed up through normal methods. [Staimer, Marc, Using different types of storage snapshot technologies for data protection, TechTarget, TechTarget Inc., 2011, retrieved 4 December 2018] Snapshotting a file while it is being changed results in a corrupted file that is unusable, as most large files contain internal references between their various parts that must remain consistent throughout the file. This is also the case across interrelated files, as may be found in a conventional database or in applications such as Microsoft Exchange Server. The term fuzzy backup can be used to describe a backup of live data that looks like it ran correctly, but does not represent the state of the data at a single point in time. [Liotine, M., Mission-critical Network Planning, Artech House, 2003, p. 244, ISBN 978-1-58053-559-5, retrieved 8 May 2018] Backup options for data files that cannot be or are not quiesced include: [de Guise, P., Enterprise Systems Backup and Recovery: A Corporate Insurance Policy, CRC Press, 2008, pp. 50–54, ISBN 978-1-4200-7640-0]
Open file backup: Many backup software applications undertake to back up open files in an internally consistent state. [Open File Backup Software for Windows, Handy Backup, Novosoft LLC, 29 November 2018, 8 November 2018] File locking can regulate access to open files, but it may be inconvenient for the user. Some applications simply check whether open files are in use and try again later. Other applications exclude open files that are updated very frequently. [Reitshamer, Stefan, Troubleshooting backing up open/locked files on Windows, Arq Blog, Haystack Software, 29 November 2018, 5 July 2017; Stefan Reitshamer is the principal developer of Arq]
Interrelated database files backup: Some interrelated database file systems offer a means to generate a "hot backup" of the database while it is online and usable. [Boss, Nina, Oracle Tips Session #3: Oracle Backups, University of Wisconsin, 1 December 2018, 2 March 2007, 10 December 1997] This may include a snapshot of the data files plus a snapshotted log of changes made while the backup is running. Upon a restore, the changes in the log files are applied to bring the copy of the database up to the point in time at which the initial backup ended. [What is ARCHIVE-LOG and NO-ARCHIVE-LOG mode in Oracle and the advantages & disadvantages of these modes?, Arcserve Backup, Arcserve, 29 November 2018, 27 September 2018]


Not all information stored on the computer is stored in files. Accurately recovering a complete system from scratch requires keeping track of this non-file data too. [Grešovnik, Igor, Preparation of Bootable Media and Images, April 2016, 25 April 2016, 21 April 2016]
System description: System specifications are needed to procure an exact replacement after a disaster.
Boot sector : The boot sector can sometimes be recreated more easily than it can be saved. Still, it usually is not a normal file, and the system will not boot without it.
Partition layout: The layout of the original disk, as well as partition tables and filesystem settings, is needed to properly recreate the original system.
File metadata : Each file's permissions, owner, group, ACLs, and any other metadata need to be backed up for a restore to properly recreate the original environment.
System metadata: Different operating systems have different ways of storing configuration information. Microsoft Windows keeps a registry of system information that is more difficult to restore than a typical file.

Manipulation of data and dataset optimization

It is frequently useful or required to manipulate the data being backed up to optimize the backup process. These manipulations can provide many benefits, including improved backup speed, restore speed, data security, and media usage, as well as reduced bandwidth requirements.
Automated data grooming: Out-of-date data can be deleted automatically. In personal backup applications, as opposed to enterprise client-server backup applications, the most a user can do to control this feature is turn it off. [Pond, James, 12. Should I delete old backups? If so, How?, Time Machine, Green box, Gray box, 21 June 2019, 2 June 2012] [Kissell, Joe, The Best Online Cloud Backup Service, wirecutter, The New York Times, "Next, there's file retention.", 21 June 2019, 12 March 2019]
Compression : Various schemes can be employed to shrink the size of the source data to be stored so that it uses less storage space. Compression is frequently a built-in feature of tape drive hardware. [Cherry, D., Securing SQL Server: Protecting Your Database from Attackers, Syngress, 2015, pp. 306–308, ISBN 978-0-12-801375-5, retrieved 8 May 2018]
Deduplication : Redundancy that arises from backing up similarly configured workstations can be reduced by storing just one copy of identical data. This potentially massive reduction is called deduplication, and it can be applied at the file level or even the raw block level. It can occur on a server before any data moves to backup media, sometimes referred to as source/client side deduplication; this approach also reduces the bandwidth required to send backup data to its target media. The process can also occur at the target storage device, sometimes referred to as inline or back-end deduplication.
Duplication : Sometimes backup jobs are duplicated to a second set of storage media. This can be done to rearrange the backup images to optimize restore speed or to have a second copy at a different location or on a different storage medium.
Encryption : High-capacity removable storage media such as backup tapes present a data security risk if they are lost or stolen. [Backups tapes a backdoor for identity thieves (28 April 2004), archived 5 April 2016, retrieved 10 March 2007] Encrypting the data on these media can mitigate this problem, but presents new problems. Encryption is a CPU-intensive process that can slow down backup speeds, and the encrypted backups are only as secure as the key management policy that protects them.
Multiplexing : When there are many more computers to be backed up than there are destination storage devices, the ability to use a single storage device with several simultaneous backups can be useful. [Preston, W.C., Backup & Recovery: Inexpensive Backup Solutions for Open Systems, O'Reilly Media, Inc, 2007, pp. 219–220, ISBN 978-0-596-55504-7, retrieved 8 May 2018]
Refactoring: The process of rearranging the backup sets in an archive file is known as refactoring. For example, if a backup system uses a single tape each day to store the incremental backups for all the protected computers, restoring one of the computers could potentially require many tapes. Refactoring could be used to consolidate all the backups for a single computer onto a single tape. This is especially useful for backup systems that perform incremental-forever backups.
Staging : Sometimes backup jobs are copied to a staging disk before being copied to tape. This process is sometimes referred to as D2D2T, an acronym for Disk to Disk to Tape. This can be useful if there is a problem matching the speed of the final destination device with the source device as is frequently faced in network-based backup systems. It can also serve as a centralized location for applying other data manipulation techniques.
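
Block-level deduplication, as described in the list above, can be sketched as a toy content-addressed store. This is an illustration under stated assumptions, not how any particular product works: it uses fixed-size blocks keyed by SHA-256 digest, keeps everything in memory, and the class name `DedupStore` and its methods are hypothetical.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: each unique block is kept once, keyed by its SHA-256 digest."""

    def __init__(self, block_size: int = 4096):
        self.block_size = block_size
        self.blocks: dict[str, bytes] = {}

    def put(self, data: bytes) -> list[str]:
        """Store data and return the ordered list of block digests needed to rebuild it."""
        recipe = []
        for i in range(0, len(data), self.block_size):
            block = data[i:i + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            # A block already present under this digest is not stored a second time.
            self.blocks.setdefault(digest, block)
            recipe.append(digest)
        return recipe

    def get(self, recipe: list[str]) -> bytes:
        """Reassemble the original data from its list of block digests."""
        return b"".join(self.blocks[digest] for digest in recipe)

    def stored_bytes(self) -> int:
        """Total size of the unique blocks actually kept."""
        return sum(len(block) for block in self.blocks.values())
```

If two workstations back up files that share most of their blocks, the second `put` adds only the blocks that differ. Computing the digests on the client before sending anything corresponds to the source-side variant, which is also how the bandwidth saving arises.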


Recovery point objective (RPO) : The point in time that the restarted infrastructure will reflect. Essentially, this is the roll-back that will be experienced as a result of the recovery. The most desirable RPO would be the point just prior to the data loss event. Making a more recent recovery point achievable requires increasing the frequency of synchronization between the source data and the backup repository. [Definition of recovery point objective, archived 13 May 2007, retrieved 10 March 2007] [Top four things to consider in business continuity planning, 23 September 2015, archived 4 March 2016]
Recovery time objective (RTO) : The amount of time elapsed between disaster and restoration of business functions. [Definition of recovery time objective, archived 16 May 2007, retrieved 7 March 2007]
Data security : In addition to preserving access to data for its owners, data must be restricted from unauthorized access. Backups must be performed in a manner that does not compromise the original owner's undertaking. This can be achieved with data encryption and proper media handling policies. [Little, D.B., Implementing Backup and Recovery: The Readiness Guide for the Enterprise, Chapter 2: Business Requirements of Backup Systems, John Wiley and Sons, 2003, pp. 17–30, ISBN 978-0-471-48081-5, retrieved 8 May 2018]
Data retention period : Regulations and policy can lead to situations where backups are expected to be retained for a particular period, but not any further. Retaining backups after this period can lead to unwanted liability and sub-optimal use of storage media.
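
Two of the objectives above lend themselves to simple arithmetic. The sketch below is only an illustration: it assumes that the completion times of past backups are known, and the function names are hypothetical. The first function estimates the worst-case roll-back implied by a backup schedule (relevant to the RPO), and the second lists backups that have outlived a retention period.

```python
from datetime import datetime, timedelta


def worst_case_data_loss(backup_times: list[datetime]) -> timedelta:
    """Largest gap between consecutive backups.

    A failure just before the later backup of that pair would roll the data
    back by this much, so the gap bounds the recovery point that can be met.
    """
    ordered = sorted(backup_times)
    gaps = (later - earlier for earlier, later in zip(ordered, ordered[1:]))
    return max(gaps, default=timedelta(0))


def expired_backups(backup_times: list[datetime], now: datetime,
                    retention: timedelta) -> list[datetime]:
    """Backups older than the retention period, i.e. candidates for deletion."""
    return sorted(t for t in backup_times if now - t > retention)
```

Running backups more often shrinks the largest gap, which matches the observation that a more recent recovery point requires more frequent synchronization.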


An effective backup scheme will take into consideration the following situational limitations: [Nelson, S., Pro Data Backup and Recovery, Chapter 9: Putting It All Together: Sample Backup Environments, Apress, 2011, pp. 203–246, ISBN 978-1-4302-2663-5, retrieved 8 May 2018]
Backup window: The period of time when backups are permitted to run on a system is called the backup window. Techniques such as synthetic full backup can help reduce "the ever-growing backup window." [Mearian, Lucas, Incremental improvement: 'Synthetic' backups save time, money, Computerworld, May 2, 2005, 2019-05-26]
Performance impact: All backup schemes have some performance impact on the system being backed up. For example, for the period of time that a computer system is being backed up, the hard drive is busy reading files for the purpose of backing up, and its full bandwidth is no longer available for other tasks. Such impacts should be analyzed.
Costs of hardware, software, labor: All types of storage media have a finite capacity with a real cost. Matching the correct amount of storage capacity (over time) with the backup needs is an important part of the design of a backup scheme. Any backup scheme has some labor requirement, but complicated schemes have considerably higher labor requirements. The cost of commercial backup software can also be considerable.
Network bandwidth: Distributed backup systems can be affected by limited network bandwidth.


Meeting the defined objectives in the face of the above limitations can be a difficult task. The tools and concepts below can make that task more achievable.
Scheduling: Using a job scheduler can greatly improve the reliability and consistency of backups by removing part of the human element. Many backup software packages include this functionality.
Authentication: Over the course of regular operations, the user accounts and/or system agents that perform the backups need to be authenticated at some level. The power to copy all data off of or onto a system requires unrestricted access. Using an authentication mechanism is a good way to prevent the backup scheme from being used for unauthorized activity.
Chain of trust : Removable storage media are physical items and must only be handled by trusted individuals. Establishing a chain of trusted individuals (and vendors) is critical to defining the security of the data.

Managing the backup process

Those who perform or oversee backups need to know how successful the backups are.

Measuring the process

To ensure that the backup scheme is working as expected, the following best practices should be enacted: [Akhtar, A.N., Buchholtz, J., Ryan, M., Setty, K., Database Backup and Recovery Best Practices, ISACA Journal, vol. 1, pp. 1–6, 2012, retrieved 8 May 2018] [Dorion, Pierre, Why you need a data backup reporting tool, TechTarget, Tech Target Inc., 13 November 2017, June 2008] [Pritchard, S., Cloud-to-cloud backup: What it is and why you need it, Computer Weekly, TechTarget, December 2017, retrieved 8 May 2018]
Backup validation : (also known as "backup success validation") Provides information about the backup and demonstrates compliance to regulatory bodies outside the organization. For example, an insurance company in the USA might be required under HIPAA to demonstrate that its client data meet records retention requirements. [HIPAA Advisory, archived 11 April 2007, retrieved 10 March 2007] Disaster, data complexity, data value, and increasing dependence upon ever-growing volumes of data all contribute to the anxiety around, and dependence upon, successful backups to ensure business continuity. Thus many organizations rely on third-party or "independent" solutions to test, validate, and optimize their backup operations (backup reporting).
Reporting: In larger configurations, reports are useful for monitoring media usage, device status, errors, vault coordination and other information about the backup process.
Logging: In addition to the history of computer generated reports, activity and change logs are useful for monitoring backup system events.
Validation: Many backup programs use checksums or hashes to validate that the data was accurately copied. These offer several advantages. First, they allow data integrity to be verified without reference to the original file: if the file as copied to the archive file has the same checksum as the saved value, then it is very probably correct. Second, some backup programs can use checksums to avoid making redundant copies of files, and thus improve backup speed. This is particularly useful for the deduplication process.
Monitored backup: Backup processes can be monitored locally via a software dashboard or by a third-party monitoring center. Both alert users to any errors that occur during automated backups. Some third-party monitoring services also allow collection of historical metadata, which can be used for storage resource management purposes such as projecting data growth and locating redundant primary storage capacity and reclaimable backup capacity.
Enterprise client-server backup applications generally have an administration console, high-level/medium-term reports supplementing the administration console, e-mailing of notifications about operations to chosen recipients that contain extracts of logging, and integration with monitoring systems. Moreover, because such applications must have disk-to-disk-to-tape capabilities, they all have checksum or hash validation options.
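
Checksum validation, as described under "Validation" above, can be sketched as follows. This is a minimal example that assumes a manifest mapping each relative path to the SHA-256 digest saved at backup time. The manifest layout and the function names are hypothetical and do not follow the format of any specific backup program.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so that large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(manifest: dict[str, str], backup_root) -> list[str]:
    """Return the relative paths whose archived copy is missing or fails its saved checksum."""
    failures = []
    for relative_path, expected in manifest.items():
        candidate = Path(backup_root) / relative_path
        if not candidate.is_file() or sha256_of_file(candidate) != expected:
            failures.append(relative_path)
    return failures
```

Because the check compares each archived copy only against its saved digest, it can run long after the original files have changed or disappeared. An empty result means every listed file matched.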

- content above as imported from Wikipedia
- time: 2:04am EDT - Sat, Jul 20 2019
[ this remote article is provided by Wikipedia ]