Access is denied when attempting to view or restore Volume Shadow Copy contents


Yesterday I set up our help desk users to restore documents on remote servers using Microsoft’s Volume Shadow Copy client. Everything worked just fine for me as an administrator, and for users who owned the files, but it didn’t work for the help desk folks. I found out they didn’t have NTFS rights to the files and folders, so I assumed all I had to do was assign them Change permissions, and they’d be able to do the restore.

I made the permission change, but when the help desk folks tried to view the contents of the shadow copy snapshots they received “Access Denied” errors.  I had them confirm they could UNC to the location where the snapshots were located, and they could create and delete files there.

After much Googling turned up few troubleshooting ideas, I decided to manually create a new snapshot of the same volume. I had them test again, and they were able to view the new snapshot’s contents and restore files. The underlying cause was that the help desk group didn’t have permissions within the original snapshot, which was presumably taken before my permission change, so they couldn’t see the files to restore them. Hope this helps someone else out.

Fix: Groupwise webaccess hanging with “Request aborted while waiting on locked conversation” in webaccess log file


Users of our external Groupwise 6.5.6 webaccess gateway have been experiencing problems with their sessions hanging when trying to open items. Users accessing our internal webaccess gateways, which are the same version and configuration as the external gateway, have not been experiencing this problem. All Groupwise servers were running on NetWare 6.5.5 and were patched up to FTF GW 6.5 post SP6 English only Agents Rev 6.

It did not matter which Internet browser the external users were using; the same problem was apparent for users of Firefox 2.x, 3.x, IE6, and IE7. They’d try to open an item, and their browser would appear to hang for anywhere from a few seconds to a few minutes. Rebooting the webaccess server did not have any effect on the problem.

The following message was seen in the webaccess log file:

Error: Request aborted while waiting on locked conversation

The only Novell TID that was relevant to this problem is TID #10023251. It describes this exact error message, but it specifically states the error only occurs when trying to view an attachment, which wasn’t the case for our users, since the error was logged when they were just navigating through the webaccess client.

The TID referenced above was not helpful, since it stated the only fix is to have the user perform the action again (i.e. resend the message) and there are no configurable parameters to increase this timeout.

Here’s what we did to troubleshoot this problem:

The first thing I did was verify the amount of free disk space on the sys volume of each server. The two well-behaved servers had at least 750MB free, while the failing server only had 500MB free. Java sometimes behaves poorly when it lacks an abundance of free disk space, so I cleared out 500MB of old log files and restarted the server. Unfortunately, no change in performance or error rate was noted.

I loaded config.nlm on the good internal webaccess servers and the problematic external webaccess server. I then used Winmerge to compare the resulting log files to check for differences in versions of drivers, nlms, and configuration files. One of the team members noticed a difference in how webaccess was being loaded in protected mode. We tried duplicating the change on the external server, but that didn’t have any impact on the situation.
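If you don’t have Winmerge handy, the same kind of comparison can be sketched in a few lines of Python. This is only an illustration: it assumes the two reports have been copied somewhere as plain text files, and the function names are my own.

```python
import difflib
from pathlib import Path


def diff_reports(good_text, bad_text, good_name="internal", bad_name="external"):
    """Return the unified-diff lines between two config report texts."""
    return list(difflib.unified_diff(
        good_text.splitlines(),
        bad_text.splitlines(),
        fromfile=good_name,
        tofile=bad_name,
        lineterm="",
        n=0,  # no context lines: show only what actually differs
    ))


def diff_report_files(good_path, bad_path):
    """Read two report files (paths supplied by the caller) and diff them."""
    good = Path(good_path).read_text(errors="replace")
    bad = Path(bad_path).read_text(errors="replace")
    return diff_reports(good, bad, good_path, bad_path)
```

Lines starting with "-" exist only in the good server’s report and lines starting with "+" only in the problem server’s report, which makes version mismatches in drivers or nlms easy to spot.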

Next I checked the versions of java and tomcat on all machines:
To get the Java version number: java -version
To view the running instances of Java: java -show
To see Java instance memory utilization: java -showmemoryID, where ID is the ID of the instance listed by java -show (no space between showmemory and the ID number).
Note: You have to switch to the NetWare console logger screen to see the output of these commands.

I didn’t see anything abnormal in the problem server’s Java configuration, so next I looked at the Sys:\Apache2\logs\mod_jk.log file. Inside it I saw the following messages, repeated frequently:

jk_ajp_common.c (1318)]: Error connecting to tomcat. Tomcat is probably not started or is listening on the wrong port. worker=ajp13admin failed errno = 54

jk_uri_worker_map.c (620)]: In jk_uri_worker_map_t::map_uri_to_worker, wrong parameters

jk_ajp_common.c (1483): Timeout with waiting reply from tomcat. Tomcat is down, stopped or network problems

jk_ajp_common.c (1503): Tomcat is down or refused connection. No response has been sent to the client (yet)
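To get a sense of how often each of these messages repeats, a quick tally helps. Here is a minimal Python sketch that counts log lines by the source file and line number token shown in the excerpts above. The regular expression is based only on those excerpts, so treat it as an assumption about the log format rather than a full mod_jk parser.

```python
import re
from collections import Counter

# Matches the "source_file.c (line)" token seen in the excerpts above.
SOURCE_RE = re.compile(r"(jk_\w+\.c) \((\d+)\)")


def count_jk_errors(lines):
    """Count log lines per (source file, line number) pair."""
    counts = Counter()
    for line in lines:
        match = SOURCE_RE.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts
```

Feeding it the lines of a copy of mod_jk.log and printing counts.most_common() shows which message dominates, and comparing a good server against the bad one makes the difference obvious.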

These messages made me think communication was definitely failing somewhere. The server administrator in charge of the NetWare servers replaced the server’s patch cable and moved it to another port on the switch, thinking that may help communication. It didn’t.

My co-worker was poking through the NetWare Console Monitor (LAN/WAN drivers, highlight the NIC, press Tab for stats) and noticed increasing Rx CRC errors, as well as other errors. He replaced the network card, and all of the webaccess errors went away!

Troubleshooting Exchange Error 4.4.7 Delivery Delay and Failures


One of our partners keeps receiving the following messages when trying to email certain domains:

This is an automatically generated Delivery Status Notification.

THIS IS A WARNING MESSAGE ONLY.

YOU DO NOT NEED TO RESEND YOUR MESSAGE.

Delivery to the following recipients has been delayed.

user@domain.com

Where user@domain.com is the address he’s trying to send the message to.

Eventually he receives the following message:

Your message did not reach some or all of the intended recipients.

The following recipient(s) could not be reached:

user@domain.com on 3/27/2008 9:11 AM

Could not deliver the message in the time limit specified. Please retry or contact your administrator.

<originating.mailserver.hostname #4.4.7>

He’s sending to addresses he’s previously sent to with no problems.

KB 284204 notes the following about the 4.4.7 error message:

Possible Cause: The message in the queue has expired. The sending server tried to relay or deliver the message, but the action was not completed before the message expiration time occurred. This NDR may also indicate that a message header limit has been reached on a remote server or that some other protocol timeout occurred during communication with the remote server.

Troubleshooting: This code typically indicates an issue on the receiving server. Verify the validity of the recipient address, and verify that the receiving server is configured to receive messages correctly. You may have to reduce the number of recipients in the header of the message for the host that you are receiving this NDR from. If you resend the message, it is placed in the queue again. If the receiving server is on line, the message is delivered.

You can see the problem is usually on the recipient’s server. Common causes are that the recipient’s mail server is offline or otherwise unreachable, possibly due to DNS problems.

One thing you can try on the originator’s mail server is to increase the SMTP Virtual Server’s Delay Notification and Expiration Timeout settings.

To access these settings in Exchange 2003, open System Manager and navigate to Servers – Your Mail Server’s Name – Protocols – SMTP. Right click on your SMTP Virtual Server – Properties – Delivery tab.

[Screenshot: SMTP Virtual Server Delivery Settings]

I changed my Delay notification from 12 hours to 18 hours, and the Expiration timeout from 2 days to 4 days. You will need to tweak these settings to what is appropriate for your particular environment.
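To get a feel for what those two settings mean for a given message, here is a small Python sketch of the arithmetic. It assumes both timers simply count from the moment the message is queued, which glosses over retry intervals. The function name and defaults are mine, not anything from Exchange.

```python
from datetime import datetime, timedelta


def delivery_timeline(queued_at,
                      delay_notification=timedelta(hours=12),
                      expiration=timedelta(days=2)):
    """Return (time of delay warning, time the message expires)."""
    return queued_at + delay_notification, queued_at + expiration


# Hypothetical message queued at 9:11 AM on 3/25/2008.
queued = datetime(2008, 3, 25, 9, 11)
old_warning, old_expiry = delivery_timeline(queued)
new_warning, new_expiry = delivery_timeline(
    queued, timedelta(hours=18), timedelta(days=4))
```

With the original 12 hour / 2 day values, that message would trigger the delay warning at 9:11 PM the same day and expire at 9:11 AM on 3/27/2008. With 18 hours / 4 days, the warning moves to 3:11 AM on 3/26 and the expiration to 9:11 AM on 3/29.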

Another reason you may see these errors, especially with AOL recipients, is that you don’t have a DNS PTR record (Reverse DNS record) for your mail server. AOL explains:

“AOL does require that all connecting Mail Transfer Agents have established reverse DNS, regardless of whether it matches the domain.”

This means if your mail server doesn’t have a Reverse DNS record, your messages sent to AOL will fail.

AOL has a page where you can enter your mail server’s IP address to determine if AOL can find its corresponding Reverse DNS record. If you’re not sure what the IP address of your mail server is, you can look it up based on your domain name.

Also note that setting up a Reverse DNS record is not the same process as creating a host name or other record. With forward (regular) DNS you set up your name servers with your domain registrar, like Network Solutions. With reverse DNS you must contact your ISP to have them create and host the record. The reason is that the ISP is ultimately responsible for your IP address, and only they can verify that your mail server does indeed reside at that particular IP address.
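If you’d rather check for a PTR record yourself, a rough Python sketch follows. It uses only the standard library; the function names are my own, and 192.0.2.25 is a documentation address standing in for your mail server’s real IP.

```python
import ipaddress
import socket


def ptr_query_name(ip):
    """Return the reverse-DNS name a resolver queries for this IP."""
    return ipaddress.ip_address(ip).reverse_pointer


def lookup_ptr(ip):
    """Return the host name from the PTR record, or None if there is none."""
    try:
        host, _aliases, _addresses = socket.gethostbyaddr(ip)
        return host
    except OSError:
        # Covers socket.herror / socket.gaierror: no PTR record or lookup failure.
        return None
```

For example, ptr_query_name("192.0.2.25") returns "25.2.0.192.in-addr.arpa", which is the record your ISP would need to create. If lookup_ptr returns None for your mail server’s public IP, the reverse record is most likely missing.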


Making Groupwise 7 and Blackberry Enterprise Server Communicate


I have a client who wanted to integrate Blackberry Enterprise Server (BES) version 4.1.3 with his Groupwise 7.0.1 system. He already had a Windows 2003 SP2 server ready for me to install BES onto, so I figured it would be a quick job. I was wrong.

The first hurdle appeared when I started to run the BES setup program on the Windows 2003 server. The installer refused to run because the server was running in Terminal Services Application mode, which is not a supported configuration.

We changed our plan and started running the BES installer on a different Windows 2003 SP2 server, but this time the installer quit because we did not have at least SQL 2000 SP3a on the server. Determining which version of SQL 2000 is installed is not the easiest thing to do, so we just went ahead and downloaded and installed SQL 2000 SP4.

After installing SQL 2000 SP4 we were able to install BES without problems, but BES was unable to communicate with Groupwise. We determined the problem was the version of the Groupwise client installed on the BES server was 7.0.1 IR1, which is not a supported configuration – we’d later learn we needed to be on versions 7.0.2 or 6.5.6 FTF4. Utilizing client 7.0.2 would have required upgrading the entire Groupwise system, so we decided to backrev to client 6.5.6.

I uninstalled the 7.0.1 IR1 client, rebooted the server, then installed GroupWise 6.5 Support Pack 6, Update 1 dated June 27, 2006. After rebooting the server again, BES and Groupwise still could not communicate.

We uninstalled the GroupWise 6.5 Support Pack 6, Update 1 client, rebooted, then tried GroupWise 6.5 Post SP6 Client Rev 4 dated November 10, 2006. We found this to be the required client version according to KB04164, but it didn’t work for us despite following the special installation instructions listed in TID 2974707. We kept receiving the following error message in gwenv1.dll when executing the client:

Entry point not found. WpfCheckAncestryAndRead

I figured the problem had to lie with gwenv1.dll, so I checked the file’s date. C:\windows\system32\gwenv1.dll was dated 6/16/2006, while the gwenv1.dll found in GroupWise 6.5 Post SP6 Client Rev 4 was dated 11/6/2006.
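The same date check can be scripted if you have several files to compare. This is a minimal Python sketch assuming you can reach both the installed copy and the copy from the client package as ordinary file paths; the function names are my own.

```python
import os
from datetime import datetime


def file_date(path):
    """Return a file's last-modified time as a datetime."""
    return datetime.fromtimestamp(os.path.getmtime(path))


def is_stale(installed_path, package_path):
    """True when the installed file is older than the one in the package."""
    return os.path.getmtime(installed_path) < os.path.getmtime(package_path)
```

Calling is_stale with the system32 copy of gwenv1.dll and the copy from the extracted client package would flag exactly the mismatch described above.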

I suspected the problem was that files from previous Groupwise client installations were not being overwritten by the new client installations. I uninstalled the GroupWise 6.5 Post SP6 Client Rev 4 client, rebooted, ran Messaging Architects’ GW CleanIT, rebooted, then reinstalled the GroupWise 6.5 Post SP6 Client Rev 4 client per the TID’s instructions.

We were finally able to communicate with Groupwise through the BES server!

In hindsight, I wish I had found Blackberry’s KB12662, “Perform basic troubleshooting steps for Novell GroupWise”, prior to beginning the BES installation. It probably would have saved us a few hours’ worth of work.

SonicWall ViewPoint Administration web site won’t load


About a month ago, my Sonicwall Viewpoint 4.1 administration web site stopped loading. The www service was running just fine on my Windows XP SP2 host, but when I double-clicked on the administration web site shortcut, http://localhost/sgms/login, the site never came up.

I tried replacing the localhost with the machine’s actual IP address and with 127.0.0.1, but those didn’t make any difference. No errors were seen in the Sonicwall firewall appliance or Windows XP event viewer, and no alerts were emailed to me from Viewpoint or the Sonicwall firewall device.

I found some interesting entries in the Viewpoint log files located at C:\ViewPoint4\MSDE\Data\MSSQL$SNWL\LOG\

  • 2008-02-06 11:58:15.56 spid51 CREATE/ALTER DATABASE failed because the resulting cumulative database size would exceed your licensed limit of 2048 MB per database.
  • 2008-02-06 11:58:15.71 spid51 Error: 1105, Severity: 17, State: 2
  • 2008-02-06 11:58:15.71 spid51 Could not allocate space for object ‘LOGS’ in database ‘sgmsdb’ because the ‘PRIMARY’ filegroup is full..
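To pull entries like these out of a long log, a small filter is enough. The Python sketch below picks out the "Error: N, Severity: N, State: N" lines. The pattern is inferred only from the three entries above, so it is an assumption about the log layout rather than a complete parser.

```python
import re

# Timestamp, spid, then the "Error: N, Severity: N, State: N" triple.
ERROR_RE = re.compile(
    r"^(\S+ \S+)\s+(\S+)\s+Error: (\d+), Severity: (\d+), State: (\d+)")


def find_sql_errors(lines):
    """Return one dict per error line found in the log."""
    found = []
    for line in lines:
        match = ERROR_RE.match(line.strip())
        if match:
            found.append({
                "timestamp": match.group(1),
                "spid": match.group(2),
                "error": int(match.group(3)),
                "severity": int(match.group(4)),
                "state": int(match.group(5)),
            })
    return found
```

The line that follows each match in the log usually carries the human-readable explanation, as it does in the entries above.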

I searched the Sonicwall knowledgebase and forums, but couldn’t find any information on any of these errors. I was hesitant to contact Sonicwall technical support because of the horrible experiences I’ve had every time I’ve contacted them in the past.

I checked the C:\ViewPoint4\syslogs directory and found that no new syslogs had been written since the problem started on 01-09-2008. Much to my dismay, that convinced me I had no recent syslog data.

I decided to focus on:

  1. Clearing out the old junk data and getting the program capturing new syslog information once again
  2. Fixing the access to the Viewpoint administration web site.

I found this post (registration required) on the Sonicwall forums where Stephanie recommended running the following from the SQL Query Analyzer, which is a part of SQL Enterprise Manager.

update sgmsdb.dbo.sgms_config
set paramValue = '02/05/2008 12:00:00'
where paramName = 'summarydaysLastDeleted';

I connected to the Sonicwall web/database server from my SQL server that had Enterprise Manager, and ran the above query, making sure the date in set paramValue = '02/05/2008 12:00:00' reflected yesterday’s date. This cleared out the old data from the database.
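Since the date has to be yesterday’s, it is easy to get wrong by hand. Here is a short Python sketch that builds the query text for any given day. It only produces the string to paste into Query Analyzer; the function name is my own, and the date format simply mirrors the one in the forum post.

```python
from datetime import date, timedelta


def build_cleanup_query(today):
    """Return the update statement with yesterday's date filled in."""
    yesterday = today - timedelta(days=1)
    stamp = yesterday.strftime("%m/%d/%Y") + " 12:00:00"
    return (
        "update sgmsdb.dbo.sgms_config\n"
        f"set paramValue = '{stamp}'\n"
        "where paramName = 'summarydaysLastDeleted';"
    )
```

Calling build_cleanup_query(date(2008, 2, 6)) gives the query with 02/05/2008 in it, and build_cleanup_query(date.today()) gives the one for whenever you happen to run it.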

I restarted the Viewpoint/web server machine, and was once again able to log in to the Sonicwall Viewpoint administration web site. I waited a few minutes, then manually summarized the fresh data, and was once again able to monitor the traffic on my network.

Troubleshooting when Groupwise GWIA won’t send out mail


The other day my Netware 6.5.7 / Groupwise 7.0.2 server decided to stop sending out email for no apparent reason. Some of the things I tried during the troubleshooting process were:

1) Checked the GWIA log files, which didn’t show any errors occurring even with verbose logging enabled. As a matter of fact, the logs didn’t show the messages ever getting to the GWIA for processing! The MTA and POA log files did show the messages being processed, though.

2) Cleared all the GWIA queue directories, but mail still wasn’t sending out even after restarting the server.

3) I toggled the GWIA subdirectory per TID 10091741.

4) I reinstalled GWIA per TID 3674238.

5) I created a route.cfg file per TID 10010997.

6) I made sure nothing weird was happening with DNS lookup on the Groupwise server.

7) I went through each step in TID 10061085, “How to troubleshoot GWIA”.

8) As a last ditch effort, I disabled Gwava (version 3.72), which we use as an inbound spam scanner. As soon as Gwava was disabled, mail started leaving the network. I was pretty stunned, since we only scan incoming mail, and we don’t use Gwava as a virus scanner. I verified in the Gwava config that outgoing mail wasn’t set to be scanned. I then re-enabled Gwava, and the mail started piling up again. I had found the culprit, but not the cause of the holdup.

I checked over the server’s Gwava log files and console screens and didn’t see any errors, but did notice a message regarding NGW-VSCAN-CONTROLLER when unloading the MTA. That led me to TID 10069173, which pointed to a corrupt message being stuck in the \domain\MSlocal\gwvscan directory. I unloaded GWIA, GWAVA, and the MTA, and renamed the \domain\mslocal directory. I restarted the server, which recreated the previously renamed directory, and mail started flowing out again.

In my case, I had a bad message stuck in the \domain\MSlocal\gwvscan\4 directory. I moved a few files at a time from the renamed directory to the new \domain\MSlocal\gwvscan\4 directory until mail stopped processing. I then downed Gwava and the MTA, deleted the problem message, then reloaded the MTA and Gwava, and mail flow returned to normal.

Identifying and Clearing Groupwise GWIA Queues of Corrupt Messages


When the Groupwise GWIA gateway has problems sending or receiving mail, it’s often the result of a corrupt message clogging up a queue. The easiest way to troubleshoot the problem and restore mail flow is often to down the GWIA and rename the queue folders.

To accomplish this on a Netware server you can stop the GWIA and MTA by pressing F7. Once they have unloaded, browse to the domain\wpgate\gwia directory and rename the following directories:

  • 000.PRC
  • DEFER
  • GWHOLD
  • GWPROB
  • RECEIVE
  • RESULT
  • SEND
  • WPCSIN
  • WPCSOUT
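If you do this often, the renaming can be scripted from a workstation. The Python sketch below assumes the GWIA and MTA are already unloaded and that the gwia directory is reachable as a normal path, for example through a mapped drive. The ".old" suffix and the function name are my own choices.

```python
import os

QUEUE_DIRS = ["000.PRC", "DEFER", "GWHOLD", "GWPROB", "RECEIVE",
              "RESULT", "SEND", "WPCSIN", "WPCSOUT"]


def rename_queue_dirs(gwia_dir, suffix=".old"):
    """Rename each existing queue directory; return (old, new) path pairs."""
    renamed = []
    for name in QUEUE_DIRS:
        src = os.path.join(gwia_dir, name)
        if not os.path.isdir(src):
            continue  # skip queues that do not exist on this system
        dst = src + suffix
        if os.path.exists(dst):
            raise FileExistsError(dst)  # never clobber an earlier backup
        os.rename(src, dst)
        renamed.append((src, dst))
    return renamed
```

The function refuses to overwrite an earlier renamed copy, so a second run with the same suffix stops instead of destroying the first set of queued messages.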

Restarting the GWIA and MTA will recreate these folders. If mail starts flowing again, you can bet that the cause of the problem was a bad message in one of the renamed folders. Move a few messages at a time from the renamed folders to their corresponding new folder. The message flow should continue until you find the corrupt message, which is often the oldest message.

Once the corrupt message is identified, delete it or move it to a different location. This should allow mail flow to resume as expected.
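The "few messages at a time" step can be sketched in Python as well. This assumes both the renamed folder and the recreated folder are reachable as normal paths. The function name and batch size are my own, and moving the newest files first is just one reasonable ordering given that the oldest message is the usual suspect.

```python
import os
import shutil


def move_batch(old_dir, new_dir, batch_size=5):
    """Move up to batch_size files, newest first; return the names moved."""
    names = [n for n in os.listdir(old_dir)
             if os.path.isfile(os.path.join(old_dir, n))]
    # Newest first, so the oldest (most suspect) messages are left for last.
    names.sort(key=lambda n: os.path.getmtime(os.path.join(old_dir, n)),
               reverse=True)
    moved = []
    for name in names[:batch_size]:
        shutil.move(os.path.join(old_dir, name), os.path.join(new_dir, name))
        moved.append(name)
    return moved
```

Run it once, watch whether mail keeps flowing, and repeat. When flow stops, the corrupt message is among the names returned by the last call.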

For additional details, see TID 10075205, TID 10054298 and TID 10008353.

In a worst case scenario you may need to delete and reinstall GWIA per TID 3674238. Don’t forget to apply any applicable patches.