Troubleshooting Tips to Resolve Backup Failures
In this article I am going to share some common troubleshooting tips for backup failures, which will help you find the root cause and fix the issue.
Let us start with the most common ones. Error codes are very important in Veritas NetBackup because they tell you what to check. The list of error codes is huge, so we will discuss only a few very common causes.
Step 1. Error Code 58 in Veritas NetBackup
You may encounter error code 58 (can't connect to client). This usually means the server, that is, the NetBackup client, is not responding. The possible reasons are below.
1. The client host may really be down.
2. The client host may be intentionally down.
3. The client host may be decommissioned.
Let us dig further. In case 1, if the host is really down, follow your usual process: notify the host's owners, bring the host back up, and make sure all backup-related processes are running and communication with the media server is intact.
In case 2, the client host may be kept down intentionally. A well-known example is cloud hosts: they are provisioned on a pay-as-you-go basis, brought up on demand, and shut down when not needed to save cost.
In major organizations, provisioning and deprovisioning happen all the time for Dev/QA/Test/PROD/DR and even POC environments, so it is always worth checking your inventory system to see whether the client host is really operational. That tells you whether the failure needs to be addressed at all.
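The inventory check can be scripted if your inventory system can export a host list. The sketch below is only an illustration: the CSV layout (`hostname,status`), the status values, and the function names are all assumptions, not part of NetBackup.

```shell
#!/bin/sh
# Hypothetical inventory export: one "hostname,status" pair per line,
# e.g. "app01,active" or "qa07,decommissioned". The format is an assumption.

# Print the recorded status of a client, or "unknown" if it is not listed.
client_status() {
    inventory_file=$1
    client=$2
    status=$(awk -F, -v h="$client" '$1 == h { print $2; exit }' "$inventory_file")
    echo "${status:-unknown}"
}

# Decide whether a status 58 failure for this client needs follow-up.
needs_attention() {
    case $(client_status "$1" "$2") in
        active)  echo "investigate: $2 should be up" ;;
        unknown) echo "verify: $2 is not in the inventory" ;;
        *)       echo "skip: $2 is not expected to be running" ;;
    esac
}
```

For example, `needs_attention inventory.csv qa07` would tell you whether a failed client is worth chasing before you spend time on it.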
Step 2. Error Code 150 in Veritas NetBackup
Another very common one is error code 150 (termination requested by administrator), which is expected when a backup is killed for any reason.
The backup may be killed by a backup administrator or system administrator while troubleshooting, or because of organizational configuration.
Backups are I/O intensive, which can hurt server performance and lead to business impact. Many organizations therefore have a policy, enforced by automation, of killing any backup that runs into the start of business hours.
Step 3. Troubleshooting NetBackup Issues Other Than Error Codes 58/150
3(i) Validating that bp.conf has all required media servers configured
So now you know the failure is not error code 58 or 150 and you really need to address it. Let us see our next course of action.
If you have access to the GUI console, first look up the specific Job ID and find out which media server was used for that job. Then make sure bp.conf on the client has an entry for that server and that communication is intact.
If it does not, take a backup of /usr/openv/netbackup/bp.conf, add the media server used by the job, check that the media server name resolves to its IP from the client, and rerun the backup from the policy rather than using a simple restart. In short:
1. Update bp.conf with the media server as described above.
2. Validate name resolution from the client to the media server using nslookup: # nslookup <media server>
3. Restart the backup job from the policy, not with a right-click restart, as a simple restart may fail again.
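The bp.conf update in step 1 can be sketched as a small helper. This is a minimal sketch that assumes media servers are listed as additional `SERVER = <host>` lines after the master server entry. The function name and the backup file naming are illustrative.

```shell
#!/bin/sh
# Add a media server to bp.conf if it is not already listed.
# Usage: add_media_server /usr/openv/netbackup/bp.conf <media server>
add_media_server() {
    conf=$1
    server=$2
    # Look for an existing "SERVER = <server>" line (whitespace-insensitive).
    if awk -F= -v s="$server" '
            { key = $1; val = $2
              gsub(/[[:space:]]/, "", key); gsub(/[[:space:]]/, "", val) }
            key == "SERVER" && val == s { found = 1 }
            END { exit !found }' "$conf"; then
        echo "already present: $server"
        return 0
    fi
    # Keep a timestamped copy before changing anything.
    cp -p "$conf" "$conf.$(date +%Y%m%d%H%M%S).bak" || return 1
    # Append at the end so the first SERVER line (the master) stays first.
    echo "SERVER = $server" >> "$conf"
    echo "added: $server"
}
```

Running it twice for the same server is harmless: the second call reports that the entry is already present and leaves the file untouched.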
3(ii) Validating bpcd, vnetd, vopied and bprd services are enabled
Please check if bpcd, vnetd, vopied and bprd are defined in the /etc/services file.
# cat /etc/services |egrep "bprd|vopied|vnetd|bpcd"
bprd 13720/tcp # BPRD (VERITAS NetBackup)
bprd 13720/udp # BPRD (VERITAS NetBackup)
vnetd 13724/tcp # Veritas Network Utility
vnetd 13724/udp # Veritas Network Utility
bpcd 13782/tcp # VERITAS NetBackup
bpcd 13782/udp # VERITAS NetBackup
vopied 13783/tcp # VOPIED Protocol
vopied 13783/udp # VOPIED Protocol
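If you check many clients, it can help to print only what is missing. The sketch below reports which of the four service names have no tcp entry in a services file. The function name is illustrative, and it checks only for the presence of an entry, not that the port number matches the listing above.

```shell
#!/bin/sh
# Print each NetBackup service name that has no tcp entry in a services file.
# Usage: missing_nb_services /etc/services
missing_nb_services() {
    services_file=${1:-/etc/services}
    for name in bprd vnetd bpcd vopied; do
        # A matching line starts with the name, followed by "<port>/tcp".
        if ! grep -Eq "^${name}[[:space:]]+[0-9]+/tcp" "$services_file"; then
            echo "$name"
        fi
    done
}
```

Empty output means all four services are defined.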
3(iii) Validating bpcd, vnetd services are listening
Validate that the bpcd and vnetd processes are listening on their required ports using the netstat command.
# netstat -a | egrep "bpcd|vnetd"
tcp 0 0 *:bpcd *:* LISTEN
tcp 0 0 *:vnetd *:* LISTEN
unix 2 [ ACC ] STREAM LISTENING 752107566 /usr/openv/var/vnetd/terminate_vnetd.uds
unix 2 [ ACC ] STREAM LISTENING 752107613 /usr/openv/var/vnetd/terminate_bpcd.uds
unix 2 [ ACC ] STREAM LISTENING 752107624 /usr/openv/var/vnetd/bpcd.uds
unix 3 [ ] STREAM CONNECTED 752107674 /var/VRTSpbx/root/PBXPIPEvnetd-ssa
unix 3 [ ] STREAM CONNECTED 752107650 /var/VRTSpbx/root/PBXPIPEvnetd-no-auth
unix 3 [ ] STREAM CONNECTED 752107638 /var/VRTSpbx/root/PBXPIPEbpcd
unix 3 [ ] STREAM CONNECTED 752107596 /var/VRTSpbx/root/PBXPIPEvnetd-auth-only
unix 3 [ ] STREAM CONNECTED 752107577 /var/VRTSpbx/root/PBXPIPEvnetd
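The netstat output mixes TCP listeners with unix sockets, so the two lines that matter are easy to overlook. As a rough sketch, the filter below reads `netstat -a` output and reports only whether bpcd and vnetd have a TCP socket in the LISTEN state. It assumes netstat prints service names, as above, rather than port numbers. The function name is illustrative.

```shell
#!/bin/sh
# Read "netstat -a" output on stdin and report whether each daemon
# has a TCP socket in the LISTEN state.
check_listeners() {
    awk '
        # Field 4 is the local address, e.g. "*:bpcd"; the last field is the state.
        $1 ~ /^tcp/ && $NF == "LISTEN" {
            if ($4 ~ /[:.]bpcd$/)  seen["bpcd"] = 1
            if ($4 ~ /[:.]vnetd$/) seen["vnetd"] = 1
        }
        END {
            print "bpcd: "  (("bpcd"  in seen) ? "listening" : "NOT listening")
            print "vnetd: " (("vnetd" in seen) ? "listening" : "NOT listening")
        }'
}
```

Use it as `netstat -a | check_listeners`.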
If they are not listening, you can restart the xinetd service using the service init scripts in /etc/init.d.
# service xinetd restart
Stopping xinetd: [FAILED]
Starting xinetd: [ OK ]
Revalidate using the netstat command above; it should now show bpcd and vnetd listening.
You can restart your backup job here, but let's do one more check to make sure communication between the client and the media server is OK.
3(iv) Perform telnet test to validate ports functionality
From Media Server
Log in to the media server and telnet to the client. At the same time, you can log in to the client where the backup failed and check whether the connection has been established.
# telnet <nb client> vnetd
If the connection is working, telnet reports that it is connected.
You can also validate the vnetd connection on the client side using netstat, as mentioned above.
# netstat -a |grep vnetd
From Client Server
You can perform the telnet test from the client to the media server as well, as shown below.
# telnet <media server> bpcd
Again, you can verify the same connection from the media server using the netstat command.
# netstat -a |grep bpcd
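Where telnet is not installed, the same port tests can be approximated from a script. The sketch below is one possible approach, not a NetBackup tool. It assumes bash and the coreutils `timeout` command are available, and it uses the default port numbers from the /etc/services listing earlier in this article. All function names are illustrative.

```shell
#!/bin/sh
# Default ports, taken from the /etc/services listing earlier in this article.
nb_port() {
    case $1 in
        bprd)   echo 13720 ;;
        vnetd)  echo 13724 ;;
        bpcd)   echo 13782 ;;
        vopied) echo 13783 ;;
        *)      echo "unknown service: $1" >&2; return 1 ;;
    esac
}

# Return 0 if a TCP connection to host:port succeeds within 5 seconds.
# Relies on bash's /dev/tcp pseudo-device and the coreutils timeout command.
port_open() {
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Check one NetBackup daemon on a remote host and print the result.
# Usage: check_nb_port <host> <service name>
check_nb_port() {
    port=$(nb_port "$2") || return 1
    if port_open "$1" "$port"; then
        echo "$2 ($port) on $1: reachable"
    else
        echo "$2 ($port) on $1: NOT reachable"
        return 1
    fi
}
```

From the media server you would run `check_nb_port <nb client> vnetd`, and from the client `check_nb_port <media server> bpcd`, mirroring the two telnet tests above.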
So we have tested telnet connectivity from the client as well as from the media server. If all looks good, you can restart the backup.
It is always better to restart the backup from the policy, as it is treated as a new backup initiation.
If none of the above works, you can try restarting all NetBackup services on the client and then rerun the backup from the policy.
That's it for troubleshooting tips to resolve backup failures in NetBackup. I hope you find this helpful when troubleshooting NetBackup job failures.