Troubleshooting Prepositioning on WAAS 4.1.1 and above

Jan 18, 2011 1:52 AM

Introduction

Prepositioning is a powerful tool on the WAAS platform, but it is not always easy to figure out why your jobs fail when trying to retrieve the files.

Here is a method that should help you to figure out the reason why they are not successful.

Throughout the document, X.X.X.X represents the IP address of the WAE interface and Y.Y.Y.Y represents the IP address of the server you are trying to preposition from.

Job stuck "In Progress" at 0 bytes

Here is what you can do if you are getting the following screen when you monitor the status of your preposition job:

Prepo0.jpg

a) Verify that the WAE has TCP access on ports 445 or 139 to the CIFS server

To make sure of this, you can use the telnet command on the WAE itself:

CDN-WAVE-474-1#telnet Y.Y.Y.Y 445
Trying Y.Y.Y.Y...
Connected to Y.Y.Y.Y.
Escape character is '^]'.
fff
Connection closed by foreign host.
CDN-WAVE-474-1#

CDN-WAVE-474-1#telnet Y.Y.Y.Y 139
Trying Y.Y.Y.Y...
Connected to Y.Y.Y.Y.
Escape character is '^]'.
ddd
Connection closed by foreign host.
CDN-WAVE-474-1#

If you see the "Connected to" message for at least one of the two attempts, connectivity is fine. If you don't, please verify the connectivity between the WAE and the CIFS server on those ports.
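
The same kind of check can be scripted from a workstation close to the WAE. The sketch below is only an illustration in Python: it does not run on the WAE itself, the function names are made up for this example, and a successful connection from another host does not prove that the WAE has the same access.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_cifs_ports(host, ports=(445, 139)):
    """Try each CIFS port and report which ones accept a TCP connection."""
    return {port: can_connect(host, port) for port in ports}
```

Calling check_cifs_ports with the address of your CIFS server returns a dictionary such as {445: True, 139: False}. As with telnet, one reachable port is enough.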


b) Verify that the traffic is going through the Core WAE

If you have access to the server on the CIFS port, you need to verify that the traffic is at least going through another WAE before reaching the hosting server.

If you have the following setup:

NotOk.jpg

With this setup, your jobs will stay stuck at 0 bytes. For preposition to work, the traffic has to cross another WAE before reaching the server, as in the following setup:

Ok.jpg

Because the new CIFSAO works transparently (Transparent CIFS, as opposed to the tunneling used by the legacy WAFS component), support for the first scenario could be added, so the following enhancement request has been opened:

CSCsv12937 WAAS 4.1.1 edge preposition directly to the server

There are a couple of ways to verify that the traffic is indeed going through another WAE:

1. Wireshark captures

You can use the following command on your WAE:

tethereal -w prepo -f "host Y.Y.Y.Y" -i eth1 -s 1600

Then start the preposition job. Once it has started, stop the capture by pressing CTRL+C and retrieve the capture file from the hard drive of your WAE. It is named prepo_00001_XXXXXXXXXXXXXX, where the X characters are replaced by the timestamp at which the capture was taken.

Open this file with Wireshark on your PC and check the TCP options of the SYN/ACK received when the WAE establishes the three-way handshake (3WHS) with the CIFS server:

WireReturnNo21.jpg

If you don't see option 0x21 there, it means that the traffic is not going through another WAE. If you do see it, as shown in the following capture, you can move to the next step:

WireReturn21.jpg
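
If you prefer to check the options outside of the Wireshark GUI, the logic can be sketched in Python as below. This is a minimal sketch that assumes you have already extracted the raw TCP options bytes of the SYN/ACK (for instance by copying them as hex from Wireshark); the function names are hypothetical. It only relies on the standard TCP option layout (kind 0 ends the list, kind 1 is a one-byte NOP, every other option is kind, length, data).

```python
WAAS_OPTION_KIND = 0x21  # the option this document looks for in the SYN/ACK


def tcp_option_kinds(options):
    """Return the list of option kinds found in a raw TCP options field."""
    kinds = []
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:  # End of Option List
            break
        if kind == 1:  # NOP, single byte, no length field
            kinds.append(kind)
            i += 1
            continue
        if i + 1 >= len(options):
            break  # truncated option, stop parsing
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            break  # malformed length, stop parsing
        kinds.append(kind)
        i += length
    return kinds


def has_waas_option(options):
    """True if option 0x21 is present in the raw TCP options field."""
    return WAAS_OPTION_KIND in tcp_option_kinds(options)
```

For example, has_waas_option(bytes.fromhex("020405b4010121040000")) returns True. The payload of option 0x21 in that hex string is made up for the example and does not reflect what a real WAE sends.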

2. Connection state

You can verify that the traffic is going through another WAE by looking at the output of "show statistics connection":

CDN-WAVE-474-1#sh statistics connection

Current Active Optimized Flows:                      0
   Current Active Optimized TCP Plus Flows:          0
   Current Active Optimized TCP Only Flows:          0
   Current Active Optimized TCP Preposition Flows:   0
Current Active Auto-Discovery Flows:                 1
Current Reserved Flows:                              10
Current Active Pass-Through Flows:                   1
Historical Flows:                                    103


O-ST: Origin State, T-ST: Terminal State
E: Established, S: Syn, A: Ack, F: Fin, R: Reset
s: sent, r: received, O: Options, P: Passthrough

Local IP:Port       Remote IP:Port      Peer ID           O-ST T-ST ConnType   
Y.Y.Y.Y:139      10.48.68.74:1813    N/A               Sr   Sso  EXTERNAL CLIENT

Local IP:Port         Remote IP:Port        Peer ID           ConnType         
X.X.X.X:1421       Y.Y.Y.Y:445        N/A               PT In Progress   
Y.Y.Y.Y:445        X.X.X.X:1421       N/A               PT In Progress   

CDN-WAVE-474-1#

If you see the connection twice (A to B and B to A) as "PT In Progress", this means that the traffic is not going through another WAE.
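
When the output is long, spotting the two mirrored entries by eye can be tedious. The following Python sketch looks for flows listed in both directions as "PT In Progress" in a saved copy of the output. It is only an illustration: the regular expression is modeled on the sample above and the function name is hypothetical, so other WAAS versions may need a different pattern.

```python
import re

# One pass-through row: local ip:port, remote ip:port, peer ID, "PT In Progress".
PT_LINE = re.compile(r"^(\S+:\d+)\s+(\S+:\d+)\s+\S+\s+PT In Progress\s*$")


def pass_through_pairs(output):
    """Return the flows that appear in both directions as 'PT In Progress'."""
    flows = set()
    for line in output.splitlines():
        match = PT_LINE.match(line.strip())
        if match:
            flows.add((match.group(1), match.group(2)))
    # Keep a flow only when its reverse direction is listed as well.
    return {tuple(sorted(flow)) for flow in flows if (flow[1], flow[0]) in flows}
```

A non-empty result matches the symptom described above: the connection is listed from A to B and from B to A as pass-through.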

3. Logs

There are two files that you can monitor to get more info on the status of your preposition jobs: /local1/errorlog/cifs/cifs_err.log and /local1/errorlog/cifsao-errorlog.current.

If you follow both files with the type-tail command while launching the preposition job, you will see entries like the following:

type-tail /local1/errorlog/cifsao-errorlog.current follow
01/16/2011 11:32:30.637(Local)(19336) TRCE (637592) Preposition ID  14605 started on Y.Y.Y.Y\Shared folder\. [AoShellLog.cpp:19]
01/16/2011 11:32:30.789(Local)(17967) ERRO (789865) Cannot get connection version, status=-1. Return error status. [AoShellWrapper.cpp:543]
01/16/2011 11:32:30.941(Local)(17967) ERRO (941307) Cannot get connection version, status=-1. Return error status. [AoShellWrapper.cpp:543]
01/16/2011 11:32:30.941(Local)(17967) NTCE (941955) Preposition ID  14605 failed, reason: network initialization error, retrying in 30 seconds. [AoShellLog.cpp:22]

01/16/2011 11:33:01.099(Local)(17967) ERRO (99306) Cannot get connection version, status=-1. Return error status. [AoShellWrapper.cpp:543]
01/16/2011 11:33:01.250(Local)(17967) ERRO (250692) Cannot get connection version, status=-1. Return error status. [AoShellWrapper.cpp:543]
01/16/2011 11:33:01.251(Local)(17967) NTCE (251188) Preposition ID  14605 failed, reason: network initialization error, retrying in 30 seconds. [AoShellLog.cpp:22]

type-tail /local1/errorlog/cifs/cifs_err.log follow
2011-01-16 11:32:30,637  INFO (actona.preposition.PrepositionController:385) prepositionPool-4 - Prep task 14605  Preposition ID  14605 started on Y.Y.Y.Y\Shared folder\.
2011-01-16 11:32:30,789  INFO (actona.aosh_io.AoShellWrapper:140) Thread-2 -  Connect failed to [ server: Y.Y.Y.Y, port: 445 ]
2011-01-16 11:32:30,790  INFO (actona.preposition.PrepositionController:963) Thread-2 -  Connection failed to /Y.Y.Y.Y:445, falling back to port 139
2011-01-16 11:32:30,941  INFO (actona.aosh_io.AoShellWrapper:140) Thread-2 -  Connect failed to [ server: Y.Y.Y.Y, port: 139 ]
2011-01-16 11:32:30,942  WARN (actona.preposition.PrepositionController:983) Thread-2 -  Preposition ID  14605 failed, reason: network initialization error, retrying in 30 seconds.

2011-01-16 11:33:01,099  INFO (actona.aosh_io.AoShellWrapper:140) Thread-2 -  Connect failed to [ server: Y.Y.Y.Y, port: 445 ]
2011-01-16 11:33:01,099  INFO (actona.preposition.PrepositionController:963) Thread-2 -  Connection failed to /Y.Y.Y.Y:445, falling back to port 139
2011-01-16 11:33:01,250  INFO (actona.aosh_io.AoShellWrapper:140) Thread-2 -  Connect failed to [ server: Y.Y.Y.Y, port: 139 ]
2011-01-16 11:33:01,251  WARN (actona.preposition.PrepositionController:983) Thread-2 -  Preposition ID  14605 failed, reason: network initialization error, retrying in 30 seconds.
2011-01-16 11:33:31,406  INFO (actona.aosh_io.AoShellWrapper:140) Thread-2 -  Connect failed to [ server: Y.Y.Y.Y, port: 445 ]
2011-01-16 11:33:31,407  INFO (actona.preposition.PrepositionController:963) Thread-2 -  Connection failed to /Y.Y.Y.Y:445, falling back to port 139
2011-01-16 11:33:31,558  INFO (actona.aosh_io.AoShellWrapper:140) Thread-2 -  Connect failed to [ server: Y.Y.Y.Y, port: 139 ]
2011-01-16 11:33:31,558  WARN (actona.preposition.PrepositionController:983) Thread-2 -  Preposition ID  14605 failed, reason: network initialization error, retrying in 30 seconds.
2011-01-16 11:34:01,717  INFO (actona.aosh_io.AoShellWrapper:140) Thread-2 -  Connect failed to [ server: Y.Y.Y.Y, port: 445 ]
2011-01-16 11:34:01,717  INFO (actona.preposition.PrepositionController:963) Thread-2 -  Connection failed to /Y.Y.Y.Y:445, falling back to port 139
2011-01-16 11:34:01,869  INFO (actona.aosh_io.AoShellWrapper:140) Thread-2 -  Connect failed to [ server: Y.Y.Y.Y, port: 139 ]
2011-01-16 11:34:01,869  WARN (actona.preposition.PrepositionController:983) Thread-2 -  Preposition ID  14605 failed, reason: network initialization error, retrying in 30 seconds.
2011-01-16 11:34:32,026  INFO (actona.aosh_io.AoShellWrapper:140) Thread-2 -  Connect failed to [ server: Y.Y.Y.Y, port: 445 ]

If you are 100% positive that the traffic is going through another WAE but you still see those symptoms, a device on the path between the two WAEs might be clearing option 0x21. If it is a PIX/ASA firewall, make sure that you have "inspect waas" configured. If you have an IPS, you should disable the following signatures: 1306, 1330.12, 1330.17, 1330.18, 1330.19, 3030, 5581.0
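
If you have copied cifs_err.log off the WAE, a short script can summarize how often each port failed and how many times a job was retried. The Python sketch below is an illustration only: the patterns are modeled on the log entries shown above and the function name is hypothetical.

```python
import re
from collections import Counter

CONNECT_FAILED = re.compile(r"Connect failed to \[ server: (\S+), port: (\d+) \]")
RETRY = re.compile(
    r"Preposition ID\s+(\d+) failed, reason: (.+?), retrying in (\d+) seconds"
)


def summarize_prepo_log(lines):
    """Count failed connection attempts per port and retries per (job, reason)."""
    failed_ports = Counter()
    retries = Counter()
    for line in lines:
        match = CONNECT_FAILED.search(line)
        if match:
            failed_ports[int(match.group(2))] += 1
        match = RETRY.search(line)
        if match:
            retries[(int(match.group(1)), match.group(2))] += 1
    return failed_ports, retries
```

A job that keeps failing on both 445 and 139 with "network initialization error" corresponds to the situation described in this section.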

c) Verify that the traffic is handled by the CIFSAO on the core side

If you can confirm that the return traffic is also going through the Core WAE, there is one last item that needs to be checked: this traffic needs to be handled by the CIFSAO for the preposition to be successful.

For instance, if you have the following configuration on your Core:

policy-engine application
   set-dscp copy
   name WAFS
   classifier CIFS
      match dst port eq 445
      match dst port eq 139
   exit
   map basic
      name WAFS classifier CIFS action optimize full
exit

Preposition will not work since the CIFS traffic will not be sent to the accelerator.
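
To audit a saved copy of a configuration for this problem, you could look for map entries of the CIFS classifier whose action lacks "accelerate cifs". The Python sketch below is a rough illustration that only understands the single-line map syntax shown in this document; the function name is hypothetical.

```python
def map_lines_missing_cifs_accel(config, classifier="CIFS"):
    """Return map entries for the classifier whose action lacks 'accelerate cifs'."""
    missing = []
    for raw in config.splitlines():
        tokens = raw.split()
        # Map entries look like: name <app> classifier <name> action <action ...>
        if "classifier" not in tokens or "action" not in tokens:
            continue
        cls_index = tokens.index("classifier") + 1
        if cls_index >= len(tokens) or tokens[cls_index] != classifier:
            continue
        action = " ".join(tokens[tokens.index("action") + 1:])
        if "accelerate cifs" not in action:
            missing.append(raw.strip())
    return missing
```

An empty list means every map entry found for the classifier sends the traffic to the CIFS accelerator.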

If you check the status of the connection on the Edge side, you will see the following:

CDN-WAVE-474-1#sh statistics connection

Current Active Optimized Flows:                      1
   Current Active Optimized TCP Plus Flows:          1
   Current Active Optimized TCP Only Flows:          0
   Current Active Optimized TCP Preposition Flows:   0
Current Active Auto-Discovery Flows:                 0
Current Reserved Flows:                              10
Current Active Pass-Through Flows:                   0
Historical Flows:                                    103


D:DRE,L:LZ,T:TCP Optimization RR:Total Reduction Ratio
A:AOIM,C:CIFS,E:EPM,G:GENERIC,H:HTTP,M:MAPI,N:NFS,S:SSL,V:VIDEO

ConnID        Source IP:Port          Dest IP:Port            PeerID Accel RR  
  1403      172.16.5.3:15779        Y.Y.Y.Y:445 00:22:64:96:eb:5c TDL   00.0%
CDN-WAVE-474-1#

As you can see, the connection is reported as optimized, but no traffic will actually go through it.

To solve this problem, you just need to change the classifier config of the Core side so that the CIFS traffic is handled by the accelerator:

policy-engine application
   set-dscp copy
   name WAFS
   classifier CIFS
      match dst port eq 445
      match dst port eq 139
   exit
   map basic
      name WAFS classifier CIFS action optimize full  accelerate cifs
exit

Once this is done, your preposition job should start to retrieve files. If the Edge is also configured to handle the traffic via the accelerator (this is not mandatory), the Accel value of the connection in the show statistics connection output will change from TDL to TCDL.
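
The Accel letters can be decoded with the legend printed at the top of the show statistics connection output. The Python sketch below is an illustration only: it assumes the six-column row layout shown in the sample above, and the function names are hypothetical.

```python
# Legend copied from the "show statistics connection" output shown earlier.
ACCEL_LEGEND = {
    "D": "DRE", "L": "LZ", "T": "TCP Optimization",
    "A": "AOIM", "C": "CIFS", "E": "EPM", "G": "GENERIC",
    "H": "HTTP", "M": "MAPI", "N": "NFS", "S": "SSL", "V": "VIDEO",
}


def decode_accel(flags):
    """Expand an Accel value such as 'TCDL' into the names from the legend."""
    return [ACCEL_LEGEND.get(flag, flag) for flag in flags]


def cifs_accelerated(row):
    """True if the Accel column of an optimized-connection row contains 'C'."""
    fields = row.split()
    # Expected columns: ConnID, Source, Dest, PeerID, Accel, RR
    return len(fields) >= 6 and "C" in fields[4]
```

With the rows discussed above, a connection shown as TDL is not handled by the CIFS accelerator on that device, while TCDL is.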

Job "Completed" but with errors

If you see the following status once you've launched your preposition job:

PrepoError.jpg

It is advised to check the /local1/errorlog/cifs/cifs_err.log and /local1/errorlog/cifsao-errorlog.current files, as they will contain the reason for those errors. For instance, in this case, I entered a share name that did not exist on the CIFS server:

type-tail /local1/errorlog/cifsao-errorlog.current follow
01/16/2011 12:08:44.254(Local)(18961) TRCE (254057) Preposition ID  14605 started on Y.Y.Y.Y\Does Not Exist\. [AoShellLog.cpp:19]
01/16/2011 12:08:45.068(Local)(19336) TRCE (68614) Prpositioned files under \\Y.Y.Y.Y\Does Not Exist\ (task 14605): Source root directory does not exist(0 shares with errors) - scanned 0 files, updated 0 files, 0 bytes 0 directories. [AoShellLog.cpp:19]
01/16/2011 12:08:45.069(Local)(18484) TRCE (69736) Preposition ID  14605 finished,  0 files prepositioned successfully,  0 files with errors [AoShellLog.cpp:19]

type-tail /local1/errorlog/cifs/cifs_err.log follow
2011-01-16 12:08:44,889  INFO (actona.preposition.ProtocolAdapter:217) prepositionPool-5 - Prep task 14605  starting scan task 14605
2011-01-16 12:08:45,067  WARN (actona.preposition.PrepositionController:1095) prepositionPool-4 - Prep task 14605  Prep controller got an API error: File Not Found
2011-01-16 12:08:45,068  INFO (actona.preposition.TaskStatus:251) prepositionPool-4 - Prep task 14605  Prpositioned files under \\Y.Y.Y.Y\Does Not Exist\ (task 14605): Source root directory does not exist(0 shares with errors) - scanned 0 files, updated 0 files, 0 bytes 0 directories.
2011-01-16 12:08:45,068  INFO (actona.preposition.ProtocolAdapter:179) prepositionPool-4 - Prep task 14605  Closing preposition channel, task 14605

Corrective action can then be taken either by changing the parameters of the preposition or by fixing the issue on the CIFS server itself.
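
To pull the outcome of a job out of a saved copy of these logs, something like the following Python sketch can help. It is an illustration only: the patterns are modeled on the two sample entries above (the "(task N): reason(M shares with errors)" line and the "finished" line), the function name is hypothetical, and other messages may be formatted differently.

```python
import re

TASK_RESULT = re.compile(r"\(task (\d+)\): (.+?)\((\d+) shares with errors\)")
FINISHED = re.compile(
    r"Preposition ID\s+(\d+) finished,\s+(\d+) files prepositioned successfully,"
    r"\s+(\d+) files with errors"
)


def parse_prepo_result(lines):
    """Collect the reason and the file counters reported for a preposition job."""
    result = {}
    for line in lines:
        match = TASK_RESULT.search(line)
        if match:
            result["task"] = int(match.group(1))
            result["reason"] = match.group(2).strip()
            result["shares_with_errors"] = int(match.group(3))
        match = FINISHED.search(line)
        if match:
            result["task"] = int(match.group(1))
            result["files_ok"] = int(match.group(2))
            result["files_with_errors"] = int(match.group(3))
    return result
```

In the example above, the reason extracted would be "Source root directory does not exist", which points at the share name configured in the preposition.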

Bugs

If you verified all of the above and still have problems with prepositioning, you might be hitting a software bug. Here is a list of all the bugs related to preposition which have been fixed since the 4.1.1 release. You might want to go through the list and see if one of those might apply to your setup:

Bug ID      Title                                                                     Fixed In
CSCsu25035  Sitemap not working for PP & Dynamic shares when secure store open on CM  4.1(1B)
CSCsu90033  Sitemap won't work when enabled secure store on Core                      4.1(1C)
CSCsr95819  CIFS AO: Not all the expected files are cached by preposition task        4.1(3)
CSCsw37661  Unable to browse directory with CIFS prepositioned content                4.1(3)
CSCsw39896  CIFS Preposition directive gets renamed when configured from CLI          4.1(3)
CSCsw80798  CIFS preposition tasks fail with NTLMv2 authentication                    4.1(3)
CSCsx54846  Preposition task may not complete under specific stress scenario          4.1(3A)
CSCsz01264  Preposition tasks fail while multiple WAEs scan same high-volume root     4.1(3A)
CSCsz78799  Concurrent preposition tasks may not fetch all the data for large files   4.1(3A)
CSCsw36112  File preposition may not complete when very large files are transferred   4.1(3A)
CSCsz53126  Preposition with large number of files in a root share may not complete   4.1(3B)
CSCsz84284  Preposition task may not complete for root shares with many files         4.1(5)
CSCsx96126  Exception if no share or "/" used as root for CIFS preposition (CLI/CM)   4.1(5C)
CSCtb89492  WAAS: Preposition task may fail due to resources being unavailable        4.1(5C)
CSCtb84428  Concurrent preposition tasks through a DC WAE fail when size exceed 10GB  4.1(7)
CSCta55041  Preposition task may terminate early when preposition size > cache size   4.1(7), 4.2(1)
CSCsz75060  CIFS preposition startup-config may not be applied to running-config      4.1(7A), 4.2(3B)
CSCsz79863  Preposition task may fail to fetch all files after network disruption     4.2(1)
CSCtb43432  Under certain conditions prepositions tasks may be deleted and added      4.2(1)
CSCsx66071  Preposition task may start at incorrect time in specific conditions       4.2(1)
CSCsz77214  CIFS preposition tasks does keep retrying if server FQDN is unreachable   4.2(3B)
CSCte86102  Rarely, preposition root share may get deleted without user intervention  4.2(3B)
CSCti98840  Preposition task may not restart in certain rare scenarios                4.2(3B)
CSCti33775  preposition not applied to devices if schedule set before assign devices  4.3(1)

Related Information:

Common WAAS/WCCP issues on interactions with Security Devices

GRE Redirection in WCCP Creates new tunnel interfaces

Comments

nfournie Wed, 01/19/2011 - 04:33

Comments on the document itself or on what could be added to it are more than welcome.

fearriet Tue, 11/01/2011 - 10:19

great document I'd suggest this reading to anybody having trouble with preposition, thanks for sharing.

dbooth@oxspring.com Wed, 09/12/2012 - 05:00

I found this document very helpful but have an odd preposition scenario. I get the 'Job completed but with errors' state and see the 'Source root directory does not exist(0 shares with errors)' error in the cifs_err.log, but host clients at the same site as the Edge WAE can access these shares without issue. The File Server Settings have been checked and double-checked and different shares on the server have been tried, but the same error is always produced. The other symptom is that in the Preposition Definition tab, if the Browse button is clicked to select Root Share and Directories, the error 'Could not access cifs://x.x.x.x/' is shown in the browsing window. Other prepositions on the same Edge WAE to another Core location work fine. Any thoughts would be much appreciated.

fearriet Thu, 09/13/2012 - 07:30 (reply to dbooth@oxspring.com)

WAAS version 5 now supports SMBv2 with digital signing, but try disabling digital signing and testing again to make sure that is actually your problem. There could be other issues as well, so I definitely suggest opening a TAC case.

good luck

dbooth@oxspring.com Thu, 09/13/2012 - 07:40

That's interesting about WAAS v5 Felix, thanks for that.  Would you just need the Core WAE (the one near the server doing the SMBv2 with DS) to be v5 or would the one at the client location need to be too?
