The CERN IT department provides a centralized service to back up CERN-owned server machines in the computer centre that contain critical data.
The IBM Tivoli Storage Manager product (TSM) is currently used to provide this function.
The service excludes all individual desktop machines, as these are expected to store users' files on one of the NICE/Windows or Unix/AFS services (which are also backed up). The target machines are the large-scale central production file, database and mail servers.
Machines using the backup service will, on a regular basis, send their changed data to the backup service over the network. A copy of these files is then kept on disk or tape. Restores can be performed by the service managers via command-line or GUI interfaces. The service manager is responsible for the client software installation and configuration, as well as for verifying the completeness and correctness of the backups.
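The idea of sending only changed data can be illustrated with a minimal Python sketch. It is not how the TSM client decides what to send; it simply assumes that a file counts as changed when its modification time is later than the previous backup. The function name changed_since and the timestamp parameter are hypothetical and exist only for this illustration.

```python
import os
from pathlib import Path


def changed_since(root, last_backup_time):
    """Return the files under root modified after last_backup_time.

    last_backup_time is in seconds since the epoch. Assumption: a newer
    modification time is a good enough signal that a file has changed.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                if path.stat().st_mtime > last_backup_time:
                    changed.append(path)
            except OSError:
                # The file disappeared or is unreadable; skip it here.
                continue
    return sorted(changed)
```

Calling changed_since("/data", timestamp_of_last_run) would list the candidate files for the next incremental run. Files that are open and being written during the scan are a known weak point of any such approach, which is one reason restore tests matter.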
All machines using the backup service must comply with the OC5 rules.
New installations are restricted to servers in the computer centre. End user desktops and private clusters are not eligible for the backup service.
Backup of the following environments is supported:
|Incremental backup of files|
Windows Server 2008
Windows Server 2012
|Incremental backup of files|
Any other client is entirely the responsibility of the service manager. The service parameters will be provided to allow configuration of the client. Backup clients can be downloaded from IBM.
- Providing the necessary information to register the client. In particular, users are responsible for providing a valid contact e-group.
- Installation of the client component of the software when first using the service. Instructions on how to install the client for CERN computer centre users are available at BackNcmClient. Other clients are the responsibility of the service managers (some guidelines are available, but not officially supported).
- Configuring the client file systems and include/exclude lists so that only critical data that is not available elsewhere is backed up. Linux system programs, log files and temporary files should not be backed up. A default configuration is made on first installation.
- Maintaining the client program within one software version of the server. Thus, if the server is running version 6.3, the clients must be running at least version 6.2. The server software is upgraded every 6 to 12 months. The version can be seen in the output of the dsmc i command.
- Validation of backups for completeness and correctness. The backup service cannot guarantee that all files needed to run the service are backed up correctly (e.g. open active files), or that this set of files is sufficient to recover the service in a reasonable time. Operating system files such as binaries in /usr/bin are explicitly excluded from backup. It is therefore strongly recommended to perform regular restore tests to ensure that all required data is backed up and that a restore can be performed within the time scales required by the client.
- In the event of a backup failing to complete, an e-mail will be sent to the contact informing them of the failure. The contact must then take action to resolve the problem. In the event of repeated failures, access to the backup service will be blocked. If the machine is not performing regular backups and the service manager is not responding to e-mails, the TSM Administrators may assume that the machine has been retired, and the data on the backup system can be deleted.
- When TSM servers are moved to new machines, the client configuration must be changed within one month. If this is not completed, access to the backup may be blocked and the previous backup data may no longer be available following the migration to new machines.
- Once the server is no longer in production, the user will inform TSM support, indicating how long they wish to keep the data.
- For any service requiring more than 10 TB of backup space, a report is required on a quarterly basis with an estimate of future backup usage, both in terms of daily backup volumes and amount of data to be kept (occupancy).
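The client version rule above (server at 6.3 means clients at 6.2 or later) can be checked with a short Python sketch. This is an illustration under stated assumptions, not an official tool: the function names are hypothetical, versions are assumed to be strings of the form major.minor with optional further components, and a client on a different major version from the server is conservatively treated as unsupported because the text does not cover that case.

```python
def parse_version(text):
    """Split a version string such as '6.3' or '6.3.0' into (major, minor)."""
    major, minor = text.strip().split(".")[:2]
    return int(major), int(minor)


def client_is_supported(client, server):
    """Return True if the client is at most one minor version behind the server.

    Assumption: only the major.minor pair matters, and a differing major
    version is rejected because the rule does not say how to handle it.
    """
    c_major, c_minor = parse_version(client)
    s_major, s_minor = parse_version(server)
    if c_major != s_major:
        return False
    return s_minor - c_minor <= 1
```

For example, client_is_supported("6.2", "6.3") is True, while client_is_supported("6.1", "6.3") is False. The version strings themselves would have to be read manually from the client output, since no machine-readable source is described here.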