The volume of data is growing by the minute and is estimated to reach 163 zettabytes (ZB) by 2025. This growth comes from many sources: devices generating data such as images, applications producing log files and reports, and more. Emerging technologies such as RFID tags, smart cards, smart chips, and other embedded and IoT (Internet of Things) devices add to the trend.
As data grows, so does the need to store it and back it up. The cost of this additional storage is becoming a concern for small and medium-sized businesses (SMBs).
Backups are essential for recovery, so despite rising storage costs they remain key to keeping operations running and avoiding downtime.
To reduce backup costs, SMBs often face the dilemma of which backups to delete and when to delete them.
This blog post offers some pointers to help you decide which backup sets to delete and when.
Backup vs. Archive
There is a difference between backups and archives. Backups are a necessity for timely recovery, so having a recent backup snapshot is essential. Archives, on the other hand, are data stored for a longer period, often to meet a regulation.
Archived data is not useful for point-in-time recovery but is required for audit purposes. Archive records have to be maintained for a defined life span, after which they can be deleted and replaced with a new set of records.
The retention period for an archive depends on the data type and the organisation’s policy. For backups, a common practice is to take a full backup monthly and an incremental or differential backup daily. A full backup can be deleted after two months, with the more recent backups used for any recovery. This approach also limits exposure to malware that may be lurking in old backups.
The backup frequency depends on each company’s backup policy, but the idea is the same whether backups run more or less frequently.
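The schedule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Backup` record, the `expired_backups` function, and the 60-day window (standing in for "two months") are hypothetical names and values, not part of any particular backup product. The sketch treats everything older than the oldest retained full backup as deletable, because an incremental cannot be restored without the full backup it builds on.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Backup:
    name: str      # hypothetical identifier for the backup set
    kind: str      # "full" or "incremental"
    created: date


def expired_backups(backups, today, keep_days=60):
    """Return the backups that fall outside the retention window.

    Assumption: full backups are kept for `keep_days`, and the newest
    full backup is always kept, even if it is older than the window.
    """
    cutoff = today - timedelta(days=keep_days)
    fulls = sorted((b for b in backups if b.kind == "full"),
                   key=lambda b: b.created)
    if not fulls:
        return []  # no full backup to restore from, so delete nothing

    # Full backups inside the window, or just the newest one if none are.
    kept_fulls = [b for b in fulls if b.created >= cutoff] or [fulls[-1]]
    oldest_kept = kept_fulls[0].created

    # Anything older than the oldest kept full backup depends on an
    # expired full backup and can go with it.
    return [b for b in backups if b.created < oldest_kept]
```

With monthly fulls taken on 1 January, 1 February, and 1 March and checked on 15 March, only the January full and the incrementals that followed it in January would be returned.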
Type of Data
Backups store various types of data, but not all of it is important or will remain relevant over time. Organisations need to identify the data sets that are crucial for operations and for regulatory purposes.
Medical data usually needs to be kept longer, so those backups cannot be deleted as often. Other types of backups, such as emails, documentation, or communication materials, can be deleted regularly: they have a validity period and most likely would not be useful for recovery. In fact, unwanted data can slow down restoration.
Ideally, backup storage should hold two full backups and the changes between them. You could provision more storage, but keeping redundant or obsolete data for longer adds little to no value to the organisation.
With more backup copies, storage is wasted, retrieval can be slower, and the cost of managing the expanded infrastructure increases.
The organisation’s retention policy should spell out the maximum size or number of copies to keep. This can be used as a guideline for removing old backups.
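As a rough illustration of how such a limit could be applied, the sketch below walks through the backups from newest to oldest and marks everything past the limit for removal. The function name `select_for_removal`, its parameters, and the tuple layout are assumptions made for this example; the newest copy is always kept so that a misconfigured limit cannot remove every backup.

```python
from datetime import date


def select_for_removal(backups, max_copies=None, max_total_bytes=None):
    """Return names of the oldest backups that exceed the policy limits.

    `backups` is a list of (name, created, size_bytes) tuples.
    A limit set to None is not enforced.
    """
    newest_first = sorted(backups, key=lambda b: b[1], reverse=True)
    kept_count, kept_bytes = 0, 0
    removing = False
    removed = []
    for name, _created, size in newest_first:
        if kept_count > 0 and not removing:
            over_count = max_copies is not None and kept_count >= max_copies
            over_size = (max_total_bytes is not None
                         and kept_bytes + size > max_total_bytes)
            # Once a limit is hit, every older backup is removed too,
            # so the kept set is always the most recent run of backups.
            removing = over_count or over_size
        if removing:
            removed.append(name)
        else:
            kept_count += 1
            kept_bytes += size
    return removed


# Example: keep at most two copies.
sets = [
    ("2024-01", date(2024, 1, 1), 100),
    ("2024-02", date(2024, 2, 1), 100),
    ("2024-03", date(2024, 3, 1), 100),
]
to_remove = select_for_removal(sets, max_copies=2)
```

Treat the result as a list of candidates: backups that are subject to legal or compliance retention should be checked before anything is actually deleted.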
Legal & Compliance
Some data must be retained to meet compliance requirements such as HIPAA. Medical records, as mentioned earlier, have to be kept for a stipulated duration.
Financial records are subject to regulations on how they must be safely disposed of. For security reasons, they are usually not retained for long.
Backups containing data required for legal or compliance purposes should not be deleted but, as mentioned above, archived where possible.
Backup Verification
Many companies skip backup verification, but you should always make sure that every backup works to avoid a “bad surprise” when you actually need to use one.
As part of a good backup policy, add a verification step: restore the backup or check it with a dedicated tool. A good tool should not only set a restoration point for backup verification but also provide a backup health check report and raise an alert if critical errors occur during the process. A backup copy with critical errors is useless and can be disposed of and replaced with a fresh copy.
Corrupted backups can contain invalid files or indexes, or even be infected by a virus, which is dangerous if the backup is restored during recovery. A virus-infected backup copy is not only dangerous during restoration; it may also infect other backup copies and pose a security threat to the entire infrastructure.
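One lightweight check that can sit alongside a test restore is comparing a backup file against a checksum recorded when the backup was taken. The sketch below uses Python's standard `hashlib` module; the function names and the idea of storing a SHA-256 digest next to each backup are assumptions for illustration. A matching checksum only shows that the file has not changed since it was written. It does not prove the backup can be restored, and it does not detect malware that was already present when the backup was made.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(path, expected_sha256):
    """Return True if the file exists and matches the recorded digest."""
    p = Path(path)
    if not p.is_file():
        return False  # a missing backup fails verification
    return sha256_of(p) == expected_sha256.lower()
```

A backup that fails this check can be flagged in the health report and replaced with a fresh copy.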
Cloud Backup Storage
It is important to store a backup in an external location, but only one with adequate security. Encryption and network security are a must when you use a cloud provider to store your data.
A good backup retention policy is key to cloud storage as well, since using a large amount of space can generate a huge bill.
Since space correlates with cost, removing old or irrelevant backups is necessary to save money. To do this, periodically verify each backup to check that it is healthy. Next, check whether the backup data is still relevant to the current state of operations. If the backup is healthy but obsolete, decide whether it should be archived or replaced with a fresh backup copy.
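The checks above amount to a small decision procedure, sketched here in Python. The `Action` enum and the `decide` function are hypothetical names, and using a compliance flag to choose between archiving and replacing an obsolete backup is an assumption drawn from the Legal & Compliance section rather than a fixed rule.

```python
from enum import Enum


class Action(Enum):
    KEEP = "keep"
    ARCHIVE = "archive"
    REPLACE = "replace with a fresh backup"


def decide(healthy, relevant, needed_for_compliance):
    """Suggest what to do with a backup after periodic verification."""
    if not healthy:
        # A backup with critical errors cannot be trusted for recovery.
        return Action.REPLACE
    if relevant:
        return Action.KEEP
    # Healthy but obsolete: archive it if it must be retained,
    # otherwise free the space and take a fresh copy.
    if needed_for_compliance:
        return Action.ARCHIVE
    return Action.REPLACE
```

For example, `decide(healthy=True, relevant=False, needed_for_compliance=True)` returns `Action.ARCHIVE`.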
On-Prem Backup Storage
Storing backups on-prem requires good hardware and a good network. This can be expensive: you not only need to buy a server, but also pay for ongoing electricity and air conditioning and have a physical place to put it.
Storing backups without a retention policy can lead to high disk space usage, which may in turn require more hardware, more electricity, and a bigger physical space.
Data stored on-premises is more vulnerable to virus and ransomware attacks. If backups are stored on-premises, it is important to verify them after a security breach to determine whether any were affected.
Organisations regularly clean the operational data in the system, but they often neglect the backup storage. With infected backups, a virus could stay dormant and then attack the system during restoration. Another possible threat is a virus travelling through the network from an infected file in the backups. It is therefore best to check for and remove any infected backups after a security breach or virus attack to keep the entire infrastructure safe.
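As a starting point for that check, it can help to list the backups taken since the suspected time of infection so they can be scanned before anyone relies on them. The sketch below is a minimal illustration: the function name, the dictionary layout, and the idea of working from an estimated infection time are all assumptions. Backups taken earlier are not guaranteed to be clean, since the estimate may be wrong.

```python
from datetime import datetime


def backups_to_quarantine(backups, suspected_infection_time):
    """Return names of backups created at or after the suspected infection.

    `backups` maps a backup name to its creation datetime.
    """
    return sorted(
        name
        for name, created in backups.items()
        if created >= suspected_infection_time
    )
```

The returned backups would then be scanned, and any that turn out to be infected removed and replaced with a fresh copy.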
In conclusion, the question of when to delete a backup depends on the organisation’s policies. The organisation has to consider the retention period, data types, data relevance, and regulatory compliance before removing a backup.
It is also important to know that a backup is healthy and not obsolete to avoid disappointment when it is used for recovery. Many organisations believe they are covered because they hold multiple copies, but in reality those copies may not be ready to restore.
Having multiple copies of an unhealthy backup is equivalent to having no backups at all. Verification is therefore an important part of determining when a backup should be destroyed and replaced. This saves space, avoids unwanted security attacks, and means the backup will not “pop a surprise” when it is restored during disaster recovery.