We recently replaced our switch with a 1 Gb/s Cisco switch, and everything seemed fine until my users started complaining that they could no longer access their email. I'm not sure whether it's the switch or the computers, but my first guess would be the switch.
Let me explain in detail, along with the tests I have done.
My network consists of one 48-port switch, to which all my client computers and servers are connected, one router for our WAN, and one switch/router for the local ISP.
All my computers and servers are updated with the latest MS patches.
The problem in detail, with some examples:
If I lose the connection to the mail server software, I also cannot ping the IP address of the server, which I think indicates that the problem is not in the software itself.
At the same time, another computer can still ping and access the mail server and its mail.
When I got this problem, I tried moving the affected computer to another port on the same switch, with no better result, so I moved it to the switch/router and saw an improvement. But I still lost connectivity to the server from time to time,
while I was still able to ping the switch/router itself.
To try to find out what happens when I lose the server, I started continuous pings (ping -t) from my computer to the server and to the switch. I also started pings from the mail server to the switch and to the file server.
The result is that I never lost a packet, even when other users reported problems to me, and the server itself never lost any packets either.
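Comparing "my pings never failed" with "users reported a problem" is easier when each probe carries a timestamp, which a plain ping -t window does not show. The following is a minimal sketch of how timestamped probe results could be grouped into outage windows and then matched against user reports. The sample data is hypothetical, and how the timestamps are collected is left out.

```python
from datetime import datetime, timedelta


def outage_windows(samples):
    """Group consecutive failed probes into (start, end, lost_count) windows.

    `samples` is an iterable of (timestamp, ok) pairs in chronological order.
    """
    windows = []
    start = last = None
    lost = 0
    for ts, ok in samples:
        if not ok:
            if start is None:
                # First lost probe of a new outage.
                start = ts
                lost = 0
            last = ts
            lost += 1
        elif start is not None:
            # A reply arrived: close the current outage.
            windows.append((start, last, lost))
            start = None
    if start is not None:
        # The log ended while the outage was still ongoing.
        windows.append((start, last, lost))
    return windows


# Hypothetical log: one probe per second, replies 3 to 5 are missing.
t0 = datetime(2006, 2, 24, 18, 0, 0)
samples = [(t0 + timedelta(seconds=i), i not in (3, 4, 5)) for i in range(10)]

for start, end, lost in outage_windows(samples):
    print(f"{start:%H:%M:%S} -> {end:%H:%M:%S}: {lost} lost")
```

Running one such log per client makes it possible to see whether outages hit several machines at the same moment or only one at a time.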
We have a small network, without much load between the servers and the clients.
The configuration of the switch hasn't been changed, and only one VLAN is enabled on the switch.
Does anyone have any idea (is it the switch, or the computer itself)?
Thanks for your help
If your computers are all patched up neat and tidy, it should not be a problem with the computers themselves.
Are all the computers connected to the switch at 1 Gbps? Are the ports all in auto-negotiation mode?
You can try the following:
When the problem occurs, telnet to the switch or console into it and run a "show interface module/port" command to see what kind of errors you are getting on that port (FCS errors, perhaps?).
Manually set the speed and duplex of the switch port connected to the server, and of all ports that seem to be affected by this symptom, to 100 Mbps full duplex, and then test whether the problem appears again.
Please rate posts that help.
OK, I have switched all the ports for the client computers to 100 Mb/s full duplex (and I'm wondering why I bought a 1 Gb switch :'( ).
I will see tomorrow whether it's still happening to my users.
Thanks for your help
I'm sure that you can switch back to 1 Gbps, but you probably need to upgrade the cables, reduce crosstalk, or fix some very technical detail like that. What kind of cables are you currently using?
I have experienced cases where cheap cables caused many errors, forcing me to throttle back to as low as 10 Mbps, after which it would work perfectly fine!
Do report back on whether switching down to 100 Mbps solved the problem, at least temporarily.
Meanwhile check out this link
which states that
Xmit errors (which are most prominent in your case) are indicative that the internal port transmit (Tx) and receive (Rx) buffers are full. A common cause of Xmit-Err is traffic from a high bandwidth link being switched to a lower bandwidth link, or traffic from multiple inbound links being switched to a single outbound link. For example, if a large amount of bursty traffic comes in on a gigabit port and is switched out to a 100Mbps port, this might cause the Xmit-Err field to increment on the 100Mbps port. This is because that port's output buffer is overwhelmed by the excess traffic due to the speed mismatch between the incoming and outgoing bandwidths.
So it's most probably an issue with speed / duplex mismatches.
On the NICs in the computers, could you try manually setting the parameters to 1000 Mbps / full duplex?
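The quoted explanation of Xmit-Err (a gigabit burst switched out of a 100 Mbps port) can be illustrated with a little arithmetic. This is a deliberately simplified sketch: it assumes a constant-rate burst, a single FIFO output buffer, and tail drop. The buffer size used in the example is a made-up figure, not a specification of any particular switch.

```python
def dropped_bits(in_mbps, out_mbps, burst_ms, buffer_kbit):
    """Bits dropped when a burst arrives faster than the egress port drains.

    Simplified model: constant-rate burst, one FIFO output buffer, tail drop.
    """
    excess_mbps = max(in_mbps - out_mbps, 0)
    # Mb/s multiplied by ms gives kilobits of backlog built up during the burst.
    backlog_kbit = excess_mbps * burst_ms
    # Whatever does not fit in the buffer is lost.
    return max(backlog_kbit - buffer_kbit, 0) * 1000


# Hypothetical 1000 kbit output buffer on the slower port.
print(dropped_bits(1000, 100, 10, 1000))  # 10 ms gigabit burst into a 100 Mbps port
print(dropped_bits(100, 100, 10, 1000))   # same burst length, matched speeds
```

In this model a 10 ms burst at 1000 Mbps towards a 100 Mbps port builds a 9000 kbit backlog, so anything beyond the buffer is dropped, while matched speeds build no backlog at all.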
We use shielded Cat5e from the wall plug up to the server bay; the rest is normal Cat5e cable.
The only thing which bothers me is the way they cabled the bay: they removed the shield and the plastic cover from all the cables and plugged them in as normal, except that the cables are uncoiled from the beginning of the 24-port bay up to the end...
Like this (hmm, I hope you understand this poor-quality drawing):
Most of the computers were quite stable at 1000 Mbps / full duplex (I mean they were not switching from 1000 Mb/s to 100 Mb/s and back to 1000 Mb/s).
I understand your point about the high-bandwidth link, but since the server itself has a 1 Gb network card and only transfers a small amount of data to each of those computers, it should be OK. Even if you look at the traffic, it seems to be quite far from this problem:
Traffic  Peak  Peak-Time
-------  ----  -------------------------
     0%    4%  Fri Feb 24 2006, 18:17:37
Anyway, it's still not working and I'm wondering what can cause this problem (should I burn the server?)...
Since we were running at 100 Mb/s full duplex and the file server is running at 1 Gb/s, I ran some tests to see how things were working. I tried to transfer 2 files of 480 MB each to the file server. It was quite a surprise to see that I was only using 0.50% of my 100 Mb/s bandwidth.
So I tried again at 1000 Mb/s with the same files, and then it was working at 25% of the capacity, which is a bit more understandable...
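To put those two utilization figures in perspective, here is a rough worked calculation. It assumes that the reported percentages are averages sustained over the whole transfer and that one MB is 10**6 bytes. Both are assumptions, since the post does not say how the figures were measured.

```python
def transfer_seconds(total_megabytes, link_mbps, utilization):
    """Seconds needed to move `total_megabytes` at a fraction of the link speed."""
    megabits = total_megabytes * 8
    return megabits / (link_mbps * utilization)


total_mb = 2 * 480  # two files of 480 MB each

slow = transfer_seconds(total_mb, 100, 0.005)  # 0.50% of 100 Mb/s
fast = transfer_seconds(total_mb, 1000, 0.25)  # 25% of 1000 Mb/s

print(f"100 Mb/s at 0.5%: {slow / 3600:.1f} hours")   # about 4.3 hours
print(f"1000 Mb/s at 25%: {fast:.0f} seconds")        # about 31 seconds
```

Under these assumptions the effective rate is 0.5 Mb/s in the first case and 250 Mb/s in the second, a factor of 500 apart rather than the factor of 10 that the link speeds alone would suggest.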
I'll try to change our patch panel and cable it in a better way, in the hope of solving this problem.
Throttle all devices on your network back down to 100 Mbps. Having the server at 1000 Mbps and the other devices at 100 Mbps would only aggravate the problem in most cases, as the server will try to pump data out at wire speed.
Also, you mention that the shielding has been removed. I don't quite understand why your cabling engineers would do this. Removing the shielding would increase crosstalk, and since you have only one 48-port switch, there is going to be quite a bit of crosstalk all over the place!
OK, I will do that and see what happens.
About the cabling guy: well, I'm not sure he was so knowledgeable about network cabling, since his work was mostly in telecom (where you only have 2 wires). Anyway, it seemed fine to me at the time, since he guaranteed that he had done the same for our local Internet provider, but he might have lied. In any case, I'll redo the cabling on the patch panel and see the result.
I have changed all the ports to 100 Mb/s, and the speed between my clients and the file server is better (better than when connected at 100 to the server at 1000), but I have a lot more errors than before.
See my attachment:
Before, I had no Single-Col or Multi-Coll in the statistics.
Which CatOS version are you using on it? Did you try upgrading the CatOS on the box? Try that and see if it makes any difference. Make sure that you upgrade to the latest version.
To check whether this is a crosstalk issue caused by too many cables in the wiring closet, keep only the server and one client physically plugged into the switch and then run a heavy file transfer.
You might consider doing this during an off-peak period.
That will be possible in one or two hours; then I'll be able to test again without too many users on the link.
I have tested connecting the server and one client computer directly to the switch and sending 2 big files; I still have errors (FCS & RCV).
I have removed all the clients except 2 computers and connected the servers directly to the switch.
The client I did the test with is also directly connected to the switch.
It's not really better. Please see the attached file for the error counters. I cleared the counters on both ports before transmitting 2 big files to the file server, and then I did the same with the mail server.
Port errors, mail server:

Port  Align-Err  FCS-Err    Xmit-Err   Rcv-Err    UnderSize
----- ---------- ---------- ---------- ---------- ---------
 2/41          -         13          0     182395         0

Port  Single-Col Multi-Coll Late-Coll  Excess-Col Carri-Sen Runts     Giants
----- ---------- ---------- ---------- ---------- --------- --------- ---------
 2/41          0          0          0          0         0    182382         0
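When these counters have to be compared before and after each change, it can help to read them programmatically. The sketch below parses counter text shaped like the output above (a header row starting with "Port", a dashed separator, then one value row). The parser and its function name are illustrative and only assume this layout; they are not a Cisco tool.

```python
SAMPLE = """\
Port Align-Err FCS-Err Xmit-Err Rcv-Err UnderSize
----- ---------- ---------- ---------- ---------- ---------
2/41 - 13 0 182395 0
Port Single-Col Multi-Coll Late-Coll Excess-Col Carri-Sen Runts Giants
----- ---------- ---------- ---------- ---------- --------- --------- ---------
2/41 0 0 0 0 0 182382 0
"""


def parse_port_counters(text):
    """Parse header/separator/value triples into {port: {counter: int or None}}."""
    counters = {}
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    for i, fields in enumerate(lines):
        # A header row starts with "Port" and is followed by a dashed separator.
        if fields[0] != "Port" or i + 2 >= len(lines):
            continue
        if not set(lines[i + 1][0]) <= {"-"}:
            continue
        names, values = fields[1:], lines[i + 2]
        row = counters.setdefault(values[0], {})
        for name, value in zip(names, values[1:]):
            # A "-" means the counter is not reported for this port.
            row[name] = int(value) if value.isdigit() else None
    return counters


ports = parse_port_counters(SAMPLE)
mail = ports["2/41"]
print(mail["FCS-Err"], mail["Runts"], mail["Rcv-Err"])
print(mail["FCS-Err"] + mail["Runts"] == mail["Rcv-Err"])
```

In this particular sample, FCS-Err plus Runts adds up exactly to Rcv-Err (13 + 182382 = 182395), so almost all receive errors on port 2/41 are runts. Whether the switch always computes Rcv-Err this way is not something the thread confirms.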
OK, last test for tonight, which did work this time :) Same configuration on the switch as in my last post (100 full duplex), but this time I also changed the client and the server from auto to 100 full duplex.
Why? Because while I was idling around trying to guess an answer, I saw on my client that the port, while configured on auto, was at 100 half duplex, which seems weird to me since I specified full on the switch. Well, anyway, I changed everyone to 100 full duplex and it works, fucking no errors anywhere -.-; Do I have to change this on all client computers by hand? T.T;
I hope I'm right. I'll see tomorrow how it works with everyone up.
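The half-duplex reading on the client is what standard autonegotiation behaviour would predict: an end left on auto that faces a partner with a forced setting can detect the speed but not the duplex, and falls back to half duplex. The sketch below is a simplified model of that rule for 10/100 links. It ignores speed and assumes both NICs support full duplex. The function names are illustrative.

```python
def resolved_duplex(side, partner):
    """Duplex that `side` ends up using, given both ends' configuration.

    Each configuration is "auto", "full" or "half".
    """
    if side != "auto":
        return side      # a forced end uses exactly what it was told
    if partner == "auto":
        return "full"    # both ends negotiate; assumes both support full duplex
    return "half"        # forced partner: speed is detected, duplex falls back to half


def link_state(a, b):
    """Return (duplex of a, duplex of b, whether they match)."""
    da, db = resolved_duplex(a, b), resolved_duplex(b, a)
    return da, db, da == db


print(link_state("full", "auto"))  # switch forced to full, client left on auto
print(link_state("full", "full"))  # both ends forced to full
print(link_state("auto", "auto"))  # both ends on auto
```

In this model the only mismatched case is a forced end facing an auto end, which matches the situation described in the post: switch port forced to 100 full, client on auto showing 100 half. Setting both ends the same way, either both forced or both auto, avoids it.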
No more errors, but still the same problem: temporary (1-2 min on average) loss of connection with client computers.
I know it might sound very difficult, but could you capture a show interface on an affected port when the workstation loses connectivity?
Also, now that there are no errors (hopefully), I think you may have successfully shown that the existing cabling infrastructure is not adequate to support gigabit Ethernet. You might want to consider overhauling the cabling for at least those clients that need the speed (one computer at a time! :) )
Did you get an explanation for why the shielding had to be removed?
First, I would like to remind you that the switch was not showing any errors before we changed it from auto to manual. I believe it was more of a duplex error (I'm not experienced enough to know why), and it didn't show any errors after I changed the client duplex setting from half to full...
I'm starting to think that it might be the server motherboard which is faulty and not the switch. The switch will resume its 1 Gb operation in a few days if I don't find any solution (I still have to do the backup of the personal files, and now it's quite a pain). The file server is not showing the same problem, but it is not continuously accessed by the clients; the email is some kind of IMAP (Lotus Notes), so every action is done on the server.
About the cabling: I redid it 2 days ago, so it should no longer be a problem. All the plugs are now connected with a fully shielded cable up to their own sockets.
Thanks for your help