Understanding the EMC VNX/Celerra AntiVirus Agent (CAVA): Part 2 – Common Errors

This is part 2 of my CAVA blog post series. In this post, I will go through common error messages you could see in the output of server_viruschk. For those of you who haven’t already, please check out part 1, where I go line by line through the output of the server_viruschk command.

 

Most of these errors have to do with the account used for CAVA. This account is set as the “Log on as” option for the EMC CAVA service in the “Services” section of Windows.

 

 

OBJECT_NAME_NOT_FOUND:

server_2 :
10 threads started.
1 Checker IP Address(es):
192.168.1.101        OFFLINE at Sat Aug 20 20:28:33 2011 (GMT-00:00)
                     MS-RPC over SMB, CAVA version: , ntStatus: OBJECT_NAME_NOT_FOUND
                     AV Engine:
                     Server Name: cava.thulin.local
                     No signature date


Description: ntStatus: OBJECT_NAME_NOT_FOUND means that the cava service is not running on the server.

Solution: Start the EMC CAVA service under the services menu on the AV server.

 

ERROR_AUTH 5:

server_2 :
10 threads started.
1 Checker IP Address(es):
192.168.1.101        ERROR_AUTH 5 at Sat Aug 20 21:00:10 2011 (GMT-00:00)
                     MS-RPC over SMB, CAVA version: 4.8.5.0, ntStatus: SUCCESS
                     AV Engine: Symantec AV
                     Server Name: cava.thulin.local
                     Last time signature updated: Tue May 17 05:55:23 2011 (GMT-00:00)


Description: ERROR_AUTH means that CAVA ran into an error when it went to connect to the “check$” folder on the CIFS server. In this case, ERROR_AUTH 5 means that the account does not have the virus checking privilege.

Resolution: Check to make sure that the EMC CAVA process is running under the cava network user and not the Local System account. If this is correct, verify that you gave the CAVA network account the Viruschecking Privilege in the MMC snap in.

 

AV_NOT_FOUND:

server_2 :
10 threads started.
1 Checker IP Address(es):
192.168.1.101        AV_NOT_FOUND at Sat Aug 20 20:29:59 2011 (GMT-00:00)
                     MS-RPC over SMB, CAVA version: 4.8.5.0, ntStatus: SUCCESS
                     AV Engine: Unknown third party antivirus software
                     Server Name: cava.thulin.local
                     Last time signature updated: Tue May 17 05:55:23 2011 (GMT-00:00)


Description: AV_NOT_FOUND means that CAVA cannot find a running AV process. By default, cava uses a privilege called “Debug Program Rights” to search for the following applications running in memory: SpntSvc.exe, rtvscan.exe, Mcshield.exe, InoRT.exe, SWEEPSRV.SYS, SavService.exe, NTRtScan.exe, and kavfs.exe

Solution: First check to make sure your antivirus software is installed and running. If it is, make sure the CAVA account has the Debug Program Rights. By default, this privilege is granted to all local administrators, so add the cava account to the local Administrators group.
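
As a quick sanity check for the first step, here is a minimal Python sketch that compares a list of running process names against the AV process names listed in the description above. The `sample` list is made up for illustration; in practice you would paste in the process names from the AV server yourself.

```python
# Process names taken from the list in the description above (lowercased).
KNOWN_AV_PROCESSES = {
    "spntsvc.exe", "rtvscan.exe", "mcshield.exe", "inort.exe",
    "sweepsrv.sys", "savservice.exe", "ntrtscan.exe", "kavfs.exe",
}


def find_av_processes(running):
    """Return the names in `running` that match a known AV process (case-insensitive)."""
    return [name for name in running if name.lower() in KNOWN_AV_PROCESSES]


# Hypothetical process list, for example copied from Task Manager on the AV server.
sample = ["svchost.exe", "Rtvscan.exe", "explorer.exe"]
print(find_av_processes(sample))
```

An empty result suggests the AV engine is not running; a non-empty result points toward the Debug Program Rights instead.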

 

INVALID_PARAMETER:

server_2 :
10 threads started.
1 Checker IP Address(es):
192.168.1.101        OFFLINE at Sun Aug 21 17:08:28 2011 (GMT-00:00)
                     MS-RPC over SMB, CAVA version: , ntStatus: INVALID_PARAMETER
                     AV Engine:
                     Server Name: cava.thulin.local
                     No signature date


Description: ntStatus is throwing an error trying to connect from the Cifs server to the Cava server. This error is caused when the CIFS server specified for CAVA is not joined to AD.

Resolution: Join the cifs server to AD and restart CAVA.

 

ERROR_AUTH 64:

server_2 :
10 threads started.
1 Checker IP Address(es):
192.168.1.101        ERROR_AUTH 64 at Sun Aug 21 18:16:05 2011 (GMT-00:00)
                     MS-RPC over SMB, CAVA version: 4.8.5.0, ntStatus: SUCCESS
                     AV Engine: Symantec AV
                     Server Name: cava.thulin.local
                     Last time signature updated: Tue May 17 05:55:23 2011 (GMT-00:00)


Description: ERROR_AUTH 64 indicates a Kerberos clock skew error: the clocks on the cava server and the data mover are too far apart.

Resolution: Make sure the time on the cava server is within 5 minutes of the data mover.
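
The 5 minute rule is easy to express in code. This is only a sketch in Python: the two timestamps are made-up values, and you would need to collect the real clock readings from the cava server and the data mover yourself.

```python
from datetime import datetime, timedelta

# Tolerance quoted in the resolution above.
MAX_SKEW = timedelta(minutes=5)


def within_kerberos_skew(cava_time, data_mover_time, max_skew=MAX_SKEW):
    """Return True when the two clocks differ by no more than max_skew."""
    return abs(cava_time - data_mover_time) <= max_skew


# Hypothetical clock readings taken at the same moment on both systems.
cava_clock = datetime(2011, 8, 21, 18, 16, 5)
data_mover_clock = datetime(2011, 8, 21, 18, 23, 30)
print(within_kerberos_skew(cava_clock, data_mover_clock))
```

Both readings must be in the same time zone (UTC is the safest choice) for the comparison to mean anything.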

 

ERROR_AUTH 86:

server_2 :
10 threads started.
1 Checker IP Address(es):
192.168.1.101        ERROR_AUTH 86 at Sun Aug 21 17:25:31 2011 (GMT-00:00)
                     MS-RPC over SMB, CAVA version: 4.8.5.0, ntStatus: SUCCESS
                     AV Engine: Symantec AV
                     Server Name: cava.thulin.local
                     Last time signature updated: Tue May 17 05:55:23 2011 (GMT-00:00)

Description: ERROR_AUTH 86 is caused when someone changes the password of the CAVA user in AD, but the cava software is using the old password.

Resolution: Update the password used for the cava account on each cava server. If you attempt to restart cava without updating, cava will fail to start with a logon failure error.

 

ERROR_AUTH 1265:

server_2 :
10 threads started.
1 Checker IP Address(es):
192.168.1.101        ERROR_AUTH 1265 at Sun Aug 21 16:04:33 2011 (GMT-00:00)
                     MS-RPC over SMB, CAVA version: 4.8.5.0, ntStatus: SUCCESS
                     AV Engine: Symantec AV
                     Server Name: cava.thulin.local
                     Last time signature updated: Tue May 17 05:55:23 2011 (GMT-00:00)

Description: ERROR_AUTH 1265 is caused when the cava user account has expired in AD. You can verify this if you attempt to login to a remote desktop with the cava user’s credentials.

Resolution: Have a domain admin reset the CAVA account and change it to never expire to keep this problem from returning.

 

ERROR_AUTH 1326:

server_2 :
10 threads started.
1 Checker IP Address(es):
192.168.1.101        ERROR_AUTH 1326 at Sun Aug 21 17:49:37 2011 (GMT-00:00)
                     MS-RPC over SMB, CAVA version: 4.8.5.0, ntStatus: SUCCESS
                     AV Engine: Symantec AV
                     Server Name: cava.thulin.local
                     Last time signature updated: Tue May 17 05:55:23 2011 (GMT-00:00)

Description: ERROR_AUTH 1326 occurs when the cava user’s password has expired in AD.

Resolution: Change the cava account password and have a domain admin set it to never expire.

 

ERROR_AUTH 1331:

server_2 :
10 threads started.
1 Checker IP Address(es):
192.168.1.101        ERROR_AUTH 1331 at Sun Aug 21 17:09:45 2011 (GMT-00:00)
                     MS-RPC over SMB, CAVA version: 4.8.5.0, ntStatus: SUCCESS
                     AV Engine: Symantec AV
                     Server Name: cava.thulin.local
                     Last time signature updated: Tue May 17 05:55:23 2011 (GMT-00:00)

Description: ERROR_AUTH 1331 is when the cava account object is disabled or logon hours have been put in place to deny logon.

Resolution: Have a domain admin enable the cava account object in AD and confirm that the cava account can logon at all hours of the day.

 

ERROR_AUTH 1909:

server_2 :
10 threads started.
1 Checker IP Address(es):
192.168.1.101        ERROR_AUTH 1909 at Sun Aug 21 17:57:17 2011 (GMT-00:00)
                     MS-RPC over SMB, CAVA version: 4.8.5.0, ntStatus: SUCCESS
                     AV Engine: Symantec AV
                     Server Name: cava.thulin.local
                     Last time signature updated: Tue May 17 05:55:23 2011 (GMT-00:00)

Description: ERROR_AUTH 1909 occurs when the cava user account has been locked out due to too many invalid logon attempts.

Resolution: Have an AD admin reset the lockout status on the cava network user.

 

This should cover most of the common errors you will find when cava is running. You may have to check the server logs on cava to see them in the event that cava is turned off. If you have experienced a problem and my resolution does not fix it, please let me know and also open a case with EMC Celerra support.
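
If you check several data movers regularly, the errors above can be turned into a small lookup. The following Python sketch scans server_viruschk output for the status codes covered in this post and prints a one-line hint for each. The hint text is my own summary of the sections above, not an official EMC error reference, and the regular expressions assume the output layout shown in the examples.

```python
import re

# Hints summarized from this post; not an official EMC error reference.
STATUS_HINTS = {
    "AV_NOT_FOUND": "No running AV engine found; check the AV software and Debug Program Rights.",
    "ERROR_AUTH 5": "CAVA account lacks the virus checking privilege.",
    "ERROR_AUTH 64": "Kerberos clock skew between the cava server and the data mover.",
    "ERROR_AUTH 86": "CAVA account password changed in AD; update it on each cava server.",
    "ERROR_AUTH 1265": "CAVA account has expired in AD.",
    "ERROR_AUTH 1326": "CAVA account password has expired in AD.",
    "ERROR_AUTH 1331": "CAVA account is disabled or blocked by logon hours.",
    "ERROR_AUTH 1909": "CAVA account is locked out.",
}
NTSTATUS_HINTS = {
    "OBJECT_NAME_NOT_FOUND": "CAVA service is not running on the AV server.",
    "INVALID_PARAMETER": "CIFS server used for CAVA is not joined to AD.",
}

# "<ip>   <status> at <timestamp>" and "... ntStatus: <value>", as in the samples above.
STATUS_RE = re.compile(r"^\s*\d{1,3}(?:\.\d{1,3}){3}\s+(.+?)\s+at\s")
NTSTATUS_RE = re.compile(r"ntStatus:\s*(\w+)")


def diagnose(output):
    """Return (code, hint) pairs for every known error found in the output."""
    findings = []
    for line in output.splitlines():
        status = STATUS_RE.match(line)
        if status and status.group(1) in STATUS_HINTS:
            findings.append((status.group(1), STATUS_HINTS[status.group(1)]))
        nt_status = NTSTATUS_RE.search(line)
        if nt_status and nt_status.group(1) in NTSTATUS_HINTS:
            findings.append((nt_status.group(1), NTSTATUS_HINTS[nt_status.group(1)]))
    return findings


SAMPLE = """\
server_2 :
10 threads started.
1 Checker IP Address(es):
192.168.1.101        ERROR_AUTH 1909 at Sun Aug 21 17:57:17 2011 (GMT-00:00)
                     MS-RPC over SMB, CAVA version: 4.8.5.0, ntStatus: SUCCESS
"""

for code, hint in diagnose(SAMPLE):
    print(f"{code}: {hint}")
```

Codes that are not in the tables are silently ignored, so an empty result does not prove the checker is healthy.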
 
On a side note, I want to also recognize Daniel Morris for his blog posts on CAVA. I urge you to read the following links to get a good understanding as well.

http://blog.planetchopstick.com/2010/10/18/what-is-emc-cava-celerra-anti-virus-agent/

http://blog.planetchopstick.com/2011/05/03/cava-considerations-and-basic-setup/

http://blog.planetchopstick.com/2011/05/05/cava-troubleshooting/

Configuring LDAP Authentication for Unisphere on the VNX

Whether you are configuring security for corporate compliance, or you want a central repository to manage user access, LDAP integration is becoming a major part of corporate infrastructure. Many of you may not realize this, but the VNX (as well as the older Clariion and Celerra) supports LDAP integration, and after reading this blog post you will too. During this post I will cover the different steps (with pictures) required to set up LDAP authentication for VNX for FILE, BLOCK, and Unified.

 

*UPDATE*  With the release of FILE OE 7.1 and BLOCK OE 5.32, All LDAP settings are now done in the Storage Domain section of Unisphere.  Just follow the directions here to setup LDAP.

 

To start this process we will need a few things:

  1. The IPs of two domain controllers
  2. The “distinguished name” and password of a service account that can do an LDAP lookup
  3. The name of an active directory group you want to give admin access to (no spaces please)
  4. An existing administrator account on the VNX (and the root password for FILE)

 

Before we begin, you may want to log in to the control station CLI as root and run the following command: “/nas/sbin/cst_setup -reset”. This command will regenerate the control station lockbox fingerprint and is usually required on systems where you may have changed the IP or name of the control station. I find it’s best to get this out of the way early rather than proceeding with configuration and finding out later that it needs to be done. It does not change any settings outside of the scope of this tutorial. More information on this can be found in Primus EMC260883.

 

Configuring LDAP on VNX for FILE

 

To start, we will need to log in with an administrator account such as nasadmin/sysadmin. Click on the “settings” tab. On the right hand side you will see a link to “Manage File LDAP Domain”; click it.

This section has several entries and is where we configure all the domain information. I have broken this down line by line as well as included a picture.

 

  1. Domain Name:
    • In this area you will put in the domain name. For this example, I used my domain “thulin.local”
  2.  Primary:
    • This is where you put in the IP address of the first domain controller
  3. Backup:
    •  This is where you put in the IP address of the second domain controller
  4. SSL Enabled:
    • Are you using SSL? If so, click the box. For this example I am not because I don’t have a certificate authority setup in the lab
  5. Port:
    • 389 for LDAP and 636 if you’re using LDAPS
  6. Directory Service Type:
    • Here you get 3 options (default, custom, and other). Default takes most of the guess work out, but will only work if the service account and all the users and groups exist in the “users” container. The custom option allows you to specify the exact container for the service accounts and the user and group search path. Other is used for non active directory setups (such as OpenLDAP servers). For this example we are using the custom option
  7. User Id Attribute:
    • This is the attribute that represents a user in LDAP, in 99% of Active Directory environments it is “samAccountName” and we will leave it as that here
  8. Distinguished Name:
    • This is where you put the distinguished name of the service account. For this example I just used the administrator account
  9. Account Password:
    • If this needs explaining then I have a nice etch-a-sketch you should be using instead of a VNX.
  10. User Search Path:
    • This is where you specify the path to search for users who will be logging in. If the user is not inside this path, they will not be granted access. I like to search the whole domain because a user cannot exist in more than one spot, and authentication won’t be affected by moving a user inside active directory
  11. User Name Attribute:
    • This is the attribute to search by, we will use “cn” (aka Common Name)
  12. Group Search Path:
    • This is just like above, but for groups instead. The same restrictions apply as well
  13. Group Name Attribute:
    • Again we want to search by the common name
  14. Group Class:
    • You want to search for the “group” class
  15. Group Member:
    • We are searching for a “member” of a group
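
If distinguished names are new to you, this short Python sketch shows how a DNS-style domain name maps to the search path for the whole domain, and what a service account DN looks like. It assumes the account lives in the default “Users” container of Active Directory; the account name “Administrator” matches my example, but your path will differ if the account sits in another container or OU.

```python
def domain_to_base_dn(domain):
    """Turn a DNS domain such as 'thulin.local' into 'DC=thulin,DC=local'."""
    labels = [label for label in domain.strip().split(".") if label]
    if not labels:
        raise ValueError("domain name is empty")
    return ",".join(f"DC={label}" for label in labels)


def account_dn_in_users_container(account, domain):
    """Build the DN of an account, assuming it lives in the default Users container."""
    return f"CN={account},CN=Users,{domain_to_base_dn(domain)}"


# Whole-domain search path for the User Search Path and Group Search Path fields.
print(domain_to_base_dn("thulin.local"))
# Distinguished Name of the service account (assumed location, see above).
print(account_dn_in_users_container("Administrator", "thulin.local"))
```

The first value is what I mean by searching the whole domain; narrowing the search means prefixing it with a specific container or OU.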

 

Once all the information has been populated, hit apply to save it (if you run into an error here, run the cst_setup command I mentioned before we began and start over). Once this is done we will need to test things, so hit the test button. If everything worked correctly it will say “Test Domain Settings. OK”. If you get a “Bind Failed” error, either your IP, Distinguished Name, or password is incorrect. If you get a user or group error, check the search path and try again.

 

Now that we have configured our authentication protocol, we need to assign a privilege to an AD group. This is done in the user management area, so go back to the settings tab, then click on security, then click on user management, and finally “User Customization for File”. This area will present you with 3 tabs: Users, Groups, and Roles. Click on groups and then click create at the bottom. You will now be presented with a screen to make a new group and map it to LDAP.

 

  1. Group Name:
    • This is a local name for the group. You can call it whatever you want because it ONLY exists on the VNX FILE control station. I chose the name LDAP_Admins
  2. GID:
    • This is where you can specify a GID or just have the system auto select one. I use the default of auto select
  3. Role:
    • This is where you give permissions to the group based on the role. Any user in this group will be given this role/permission level by default. For this example, I chose to give the users the Administrator role.
  4. Group Type:
    • This is where you would select “LDAP group mapped” and put in the name of the group (in this case serviceAdmins) and the domain name (thulin.local). The group name can’t have any spaces but does support underscores.
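
Since a space in the group name is an easy mistake to make, here is a tiny Python check for it. It only covers the one rule mentioned above (no spaces, underscores are fine); the VNX may enforce other restrictions that this sketch does not know about.

```python
def is_valid_mapped_group_name(name):
    """Check the one rule stated above: the group name must not be empty or contain spaces."""
    return bool(name) and not any(ch.isspace() for ch in name)


for candidate in ["serviceAdmins", "Service_Admins", "Service Admins"]:
    print(candidate, is_valid_mapped_group_name(candidate))
```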

 

At this point all the work on the VNX FILE side is done and it’s time to start on the BLOCK side.

 

Configuring LDAP on VNX for BLOCK

 

Setting up LDAP for Block is very similar to the way it was done on the Clariions. Just like with the File side, you will need the same 4 bits of information. To begin, click on the home button in the upper left, then click on the domain tab, and finally click on “Manage LDAP Domain for Block”. This will bring up a window where we can start configuring our LDAP settings. The block side requires you to setup individual domain controllers, and set all the settings on that one server, so click on the “add” button and we’ll get started. You will see several areas to input information and I will go through them:

 

  1. IP Address
    • This is where you put in the IP of the domain controller
  2. Port
    • 389 for LDAP, 636 for LDAPS
  3. Server Type
    • There are two options: LDAP Server and Active Directory. Make sure to choose “Active Directory” if you’re using an AD environment (most of you will be doing this)
  4. Protocol
    • LDAP or LDAPS
  5. BindDN
    • This is where you put in the Distinguished Name of the service account just like when setting it up for file.
  6. Bind Password
    • Password for the service account
  7. Confirm Bind Password
    • Make sure it matches
  8. User Search Path
    • Just like with File, this is where you would set the search scope to find your users
  9. Group Search Path
    • Just like with File, this is where you set the search scope to find your groups
  10. Add certificate
    • This is where you would upload a root CA certificate for LDAPS. Make sure it’s in base64 encoding

 

After you have put in all this information, click on the “Role Mapping” tab so we can map an AD group. Once in there you will want to select “Group” from the first pull down. Put in the name of the AD group (in this example I used “ServiceAdmins”), then select the Role from the second pull down (in this case I selected Administrator), and finally click “Add” to add the mapping. Once you have all your mappings, click ok and wait for the confirmation message. Then you want to do this all over again for the second domain controller. Once you have this all set, click “Synchronize”. And that is it!

 

Configuring LDAP on VNX for UNIFIED

 

Configuring LDAP for a unified box is no different than the Block and File side.  The only thing you need to remember is that you need to do both, because the authentication will check your LDAP account against both the control station and the service processor.  Both configurations will have to be working correctly to login properly.

 

Now it is time to test your LDAP login. Log out of Unisphere by clicking the door icon in the upper right. Open Unisphere again and this time put in your AD username and password. Be sure to select “Use LDAP” and click on “Login”. If all your configuration is correct, you will be brought back in to Unisphere. If you get an access denied message, check your username and password, as well as your user and group search paths.

*UPDATE*

I have included a youtube video published by EMC that shows exactly what I have demonstrated above.

I hope you enjoyed this tutorial and I hope this is the first of many. If you have any questions on what you’ve just seen, or if you have any suggestions for future write-ups, drop a message in the comments below.

Are you running MAC OS X 10.7 and have a Celerra? It may be time for an upgrade!

As just about everyone on the internet knows, on July 20th Apple released OS X 10.7 (aka Lion) to the public. $30 gets you a boatload of new features. One of these features is a completely rewritten CIFS client. For those of you who don’t know, CIFS is the protocol used for Windows file sharing and is a big part of the EMC Celerra / VNX product. We have identified an incompatibility within our code. The good news is that there is a fix available for all DART code families (5.6, 6.0, and 7.0) and we are encouraging everyone to upgrade as soon as possible.

 

On July 14th, EMC released ETA emc263721 (powerlink credentials required) to address this issue. An ETA (EMC Technical Advisory) is a way for EMC to notify customers proactively so they can address issues such as this before they happen in their environment. The ETA details the problem and states the current fix. For this issue, we have put the fix into the following code levels:

• 5.6.51.323 or higher

• 5.6.52.201 or higher

• 6.0.43.104 or higher

• 7.0.14.100 or higher

• 7.0.35.301 or higher

You can figure out your code version by running the following command from the CLI: “server_version ALL” (without the quotes). If your current version is the same or newer than the versions I listed above, then no action is required on your part and you are fine to deploy OS X 10.7 in your environment. If your code is below these levels I urge you to upgrade as soon as possible (especially if your environment contains a large number of Macintosh computers). To schedule an update, simply call EMC Support (800-782-4362), open a service request on powerlink, or speak with your local field resources.
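
To compare versions without squinting at the numbers, you could use something like the Python sketch below. It takes the dotted numeric version (strip any extra characters from the server_version output first) and compares it against the levels listed above. One assumption is mine: a listed level is only compared against versions whose first three numbers match it, and anything else returns None so that you check the ETA instead of trusting a guess.

```python
# Fix levels copied from the list above.
FIXED_LEVELS = ["5.6.51.323", "5.6.52.201", "6.0.43.104", "7.0.14.100", "7.0.35.301"]


def parse_version(text):
    """Turn '5.6.51.323' into the tuple (5, 6, 51, 323)."""
    return tuple(int(part) for part in text.strip().split("."))


def has_lion_fix(version):
    """Return True or False when the code branch is listed above, None when it is not."""
    current = parse_version(version)
    for level in FIXED_LEVELS:
        fixed = parse_version(level)
        # Assumption: only compare within the same x.y.z branch.
        if current[:3] == fixed[:3]:
            return current >= fixed
    return None


for version in ["5.6.51.323", "5.6.51.100", "6.0.43.200", "7.0.40.1"]:
    print(version, has_lion_fix(version))
```

A result of None means the branch is not in the list, so confirm against the ETA or with EMC Support.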

 

Understanding the EMC VNX/Celerra AntiVirus Agent (CAVA): Part 1 – server_viruschk

CAVA is one of the few parts of the Celerra/VNX that cannot be configured and monitored from the GUI.  Most, if not all, of the information you need about cava can be found in the command line.  Over the course of a few posts, I will start with a fully working cava setup, and then work backwards to break it so you can see common implementation problems and possible performance bottlenecks.  In this first post of the series, I will go line by line through the output of server_viruschk so that you can understand just what the output is saying.  For reference, this is the output I will be working with:
[nasadmin@UberCS ~]$ server_viruschk server_2
server_2 :
 10 threads started.
 1 Checker IP Address(es): 192.168.1.101     ONLINE at Thu May 26 19:41:13 2011 (GMT-00:00)
                        MS-RPC over SMB, CAVA version: 4.8.5.0, ntStatus: SUCCESS
                       AV Engine: Symantec AV
                       Server Name: cava.thulin.local
                        Last time signature updated: Tue May 17 05:55:23 2011 (GMT-00:00)

 1 File Mask(s):
 *.*
 5 Excluded File(s):
 ~$* >>>>>>>> *.PST *.TXT *.TMP
 Share \\UBERCIFS\CHECK$.
 RPC request timeout=25000 milliseconds.
 RPC retry timeout=5000 milliseconds.
 High water mark=200.
 Low water mark=50.
 Scan all virus checkers every 10 seconds.
 When all virus checkers are offline:
 Shutdown Virus Checking.
 Scan on read disable.
 Panic handler registered for 65 chunks.
 MS-RPC User: UBERCIFS$
 MS-RPC ClientName: ubercifs.THULIN.LOCAL

 

I will now go line by line starting with the first one.
  1. 10 threads started.
    • This is the number of threads for cava.  Each thread represents a file that can actively be scanned.  Cava will process up to 10 files at once to distribute across your available cava servers.  Any additional files will be put into a holding queue until cava can get to them.  This limit here is so that we don’t overwhelm the av software running on each cava server.  This limit is adjustable by the support lab if it is determined that this will solve a performance issue.
  2. 1 Checker IP Address(es):
    • This line tells you how many cava servers you have defined in your viruschecker.conf file.  In this example, I only have 1 server defined, but you should be running at least 2 servers.
  3. 192.168.1.101                                  ONLINE at Thu May 26 19:41:13 2011 (GMT-00:00)
    • This line tells you the IP address of your cava server as well as the status and the last time we checked it.  If that line says anything other than ONLINE, there is a problem with the connection from the windows server to the celerra and that server will not be used for scanning.  More information on possible errors will be in a later post.
  4. MS-RPC over SMB, CAVA version: 4.8.5.0, ntStatus: SUCCESS
    • This has 3 pieces of useful information.  The first is the connection method we use to send commands to the cava agent.  In this case, we are using the MSRPC protocol.  Older clients may use the ONCRPC protocol, but this is not supported on 64 bit systems.  The next part tells you the version of cava you are running.  As of writing this, I am using the latest version (VNX Event Enabler 4.8.5).  Like above where we reported the connection from windows back to the celerra, the ntStatus section reports the status of our initial connection to the windows server.
  5. AV Engine: Symantec AV
    • This tells you the AV software we detected to use for CAVA.  This can be helpful if you have more than one AV engine installed on the client.  In my case, I am using Symantec Endpoint.
  6. Last time signature updated: Tue May 17 05:55:23 2011 (GMT-00:00)
    • This is the last time you updated your AV definitions
  7. 1 File Mask(s):
    • The number of file masks you have set to scan for.  In this case, it’s just 1 mask.
  8. *.*
    • These are the file masks you have in place.  Any files that match the entries here will be processed for scanning.  In this case I have *.* (everything with a . in it), but you can cut down a lot of traffic if you’re only scanning for certain file types.
  9. 5 Excluded File(s):
    • This is how many file exclusion filters you have in place.  In this case I have 5.
  10. ~$* >>>>>>>> *.PST *.TXT *.TMP
    • These are the file filters I have in place.  There are a number of files that AV software just can’t scan (like database files).  I also have in place ~$* and >>>>>>>> to ignore Microsoft Office temporary files as they can become locked temporarily while being scanned and cause a loss of data in the office application.
  11. Share \\UBERCIFS\CHECK$.
    • This is the beginning of the UNC path that will be sent for file scan requests.  This is determined from the CIFSserver line in the viruschecker.conf and will change depending on whether you defined it with the IP, NetBIOS name, or FQDN.  The check$ folder is a hidden folder created just for CAVA.  The only account that can access this is the one granted the virus checking privilege.
  12. RPC request timeout=25000 milliseconds.
    • This is the amount of time we will wait for a file to be scanned before trying again.
  13. RPC retry timeout=5000 milliseconds.
    • This is the amount of time we wait for an acknowledgement of each RPC command.
  14. High water mark=200.
    • I spoke before about how we process 10 files at a time, and that additional files are put into a queue.  The high watermark is when we allocate additional resources to cava to process through AV files faster.  Hitting this high limit can cause a performance impact to your cifs servers, so try not to let the queue get this bad.  In my case, I have set the limit to the default of 200.
  15. Low water mark=50.
    • Just like the high watermark, this is a lower limit that starts to indicate that files are queuing up too fast.  This won’t cause a performance problem, but is an indicator of a possible problem to come.
  16. Scan all virus checkers every 10 seconds.
    • Every 10 seconds we will check the status of each cava server to make sure it’s still online and ready to take requests.
  17. When all virus checkers are offline:
    Shutdown Virus Checking.

    • This is the action we will take when all the cava servers are not marked as ONLINE.  This will shutdown cava so that files don’t continue to be queued and hit a high watermark.  The other options are to do nothing (a setting of ‘no’) or to shutdown cifs (what I like to call paranoia mode).
  18. Scan on read disable.
    • This means that scan on read is not enabled and that we are only processing scan on write.  If scan on read was enabled, the cutoff date and time would be listed in this place.
  19. Panic handler registered for 65 chunks.
    • This is mostly just for debug information and how many internal failures cava would survive before causing a panic.  Every process on the celerra has a panic handler and this information is of no use to basic cava troubleshooting.
  20. MS-RPC User: UBERCIFS$
    • Earlier I talked about how we use the MS-RPC protocol to connect to the cava agent servers.  This is the username we will use for the SMB connection.  In this case, we are using the compname of the cifs server for cava.
  21. MS-RPC ClientName: ubercifs.THULIN.LOCAL
    • This is the FQDN of the cifs server we are using for cava which is used as part of the MS-RPC process.
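
To make the file mask and exclusion lines more concrete, here is a rough Python sketch of how the two lists combine, using the masks and exclusions from my output. It assumes simple case-insensitive wildcard matching through Python’s fnmatch module; the data mover’s real matching rules may differ, so treat this as an illustration only.

```python
from fnmatch import fnmatchcase

# Values copied from the sample output above.
FILE_MASKS = ["*.*"]
EXCLUDED_FILES = ["~$*", ">>>>>>>>", "*.PST", "*.TXT", "*.TMP"]


def would_scan(filename, masks=FILE_MASKS, excluded=EXCLUDED_FILES):
    """Return True when the name matches a file mask and no exclusion filter."""
    name = filename.upper()  # assumption: matching is case-insensitive
    if any(fnmatchcase(name, pattern.upper()) for pattern in excluded):
        return False
    return any(fnmatchcase(name, pattern.upper()) for pattern in masks)


for name in ["report.docx", "~$report.docx", "mail.pst", "README"]:
    print(name, would_scan(name))
```

Note that README is skipped in this sketch only because it has no dot, which is the “everything with a . in it” behavior of the *.* mask.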
This concludes my line by line explanation of the cava output.  I hope you understand the output of cava a bit better.  In future posts on cava I will talk about some of the different information you might see when there is an error as well as the output of the -audit option.  Please feel free to ask questions in the comment section below.
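
If you want to track these settings across several data movers, the numeric values can be pulled out of the output with a few regular expressions. This Python sketch assumes the wording of each line matches my sample output exactly; other code levels may print the lines differently, and the key names in the result are my own.

```python
import re

# One pattern per numeric setting, based on the wording in the sample output.
PATTERNS = {
    "threads": r"(\d+) threads started",
    "rpc_request_timeout_ms": r"RPC request timeout=(\d+) milliseconds",
    "rpc_retry_timeout_ms": r"RPC retry timeout=(\d+) milliseconds",
    "high_water_mark": r"High water mark=(\d+)",
    "low_water_mark": r"Low water mark=(\d+)",
    "checker_poll_seconds": r"Scan all virus checkers every (\d+) seconds",
}


def parse_settings(output):
    """Return a dict of the numeric settings found in server_viruschk output."""
    settings = {}
    for key, pattern in PATTERNS.items():
        match = re.search(pattern, output)
        if match:
            settings[key] = int(match.group(1))
    return settings


SAMPLE = """\
server_2 :
 10 threads started.
 RPC request timeout=25000 milliseconds.
 RPC retry timeout=5000 milliseconds.
 High water mark=200.
 Low water mark=50.
 Scan all virus checkers every 10 seconds.
"""

print(parse_settings(SAMPLE))
```

Settings that are missing from the output are simply left out of the dict, so check for a key before using it.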