Traceroute explanation

CiscogeekIND · ‎04-24-2009

I have a small doubt; something looks wrong here. Can anybody please explain it to me?

Tracing route to www.singnet.com.sg [165.21.74.59]

over a maximum of 30 hops:

1 1 ms 1 ms 1 ms cm171.delta115.maxonline.com.sg [59.189.115.171]

2 14 ms 10 ms 9 ms 10.43.128.1

3 25 ms 10 ms 10 ms 172.20.43.129

4 24 ms 7 ms 10 ms 172.26.43.1

5 17 ms 11 ms 14 ms 172.20.8.129

6 12 ms 12 ms 9 ms 203.117.161.253

7 149 ms 207 ms 211 ms vlan913-an-cat6k-ts-2-rsm1.starhub.net.sg [203.118.7.4]

8 16 ms 16 ms 15 ms vlan917-cat6k-ts2-r2.starhub.net.sg [203.118.1.235]

9 18 ms 16 ms 14 ms ge3-0-gsrts1.starhub.net.sg [203.118.1.6]

10 10 ms 12 ms 13 ms pos1-0-gsrtl1.starhub.net.sg [203.118.0.241]

11 12 ms 15 ms 32 ms 203.118.2.20

12 18 ms 17 ms 17 ms 165.21.49.109

13 255 ms 298 ms 364 ms FE-3-0.aljunied.singnet.com.sg [165.21.12.85]

14 35 ms 45 ms 15 ms www.singnet.com.sg [165.21.74.59]

Can someone explain why on hop 13

I am seeing 255 ms 298 ms 364 ms

and on hop 14

I am seeing 35 ms 45 ms 15 ms?

The response time on hop 14 should be higher than 255 ms, so why am I seeing 35/45/15 ms instead?

How exactly does traceroute work?

I took the above traceroute as an example.

My client is pointing at the highest delay in the middle of the traceroute and asking me to check with my service provider, whereas I am able to ping his gateway in only 40 ms.

How can I justify this?

Thanks in advance.

Giuseppe Larosa · ‎04-25-2009

Hello Suresh,

Some days ago someone showed a similar issue.

Traceroute is good at showing the path the packets use to reach a destination.

Notice that traceroute tells you about the path from you to the destination but tells you nothing about the return path.

The delays at each hop are dependent on several factors including the usage level of the router node.

If the node is busy, it will hold the traceroute packet before preparing the ICMP Time Exceeded message to be sent back.

As I noted above, you don't know what path is taken by the ICMP Time Exceeded message sent back by the node at hop 13.

The packet may go through a different link that is congested and so it has to sit on a queue waiting to be transmitted on the wire.
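A toy model may make this concrete. The sketch below is not a real traceroute: it assumes a symmetric return path and made-up delays, and the `Hop` class, `probe_rtt` function and all numbers are illustrative only, not taken from the trace above. It only shows how a hop that is slow to produce or return its own reply can report a high time while the hop behind it reports a low one.

```python
from dataclasses import dataclass


@dataclass
class Hop:
    name: str
    forward_ms: float  # one-way time to carry a transit packet across this hop
    reply_ms: float    # extra delay for this hop's OWN reply (busy CPU, congested return link)


def probe_rtt(path, ttl):
    """Round-trip time of a probe whose TTL expires at hop number `ttl` (1-based)."""
    crossed = path[:ttl]
    # Transit cost there and back (assumption: the reply retraces the same hops).
    transit = 2 * sum(h.forward_ms for h in crossed)
    # Only the hop that answers adds its own reply delay.
    return transit + path[ttl - 1].reply_ms


# Hypothetical three-hop path: the middle hop forwards fast but replies slowly.
path = [
    Hop("hop A", forward_ms=8.0, reply_ms=1.0),
    Hop("hop B", forward_ms=0.5, reply_ms=280.0),
    Hop("hop C", forward_ms=0.5, reply_ms=1.0),
]

for ttl in range(1, len(path) + 1):
    print(ttl, path[ttl - 1].name, probe_rtt(path, ttl), "ms")
# 1 hop A 17.0 ms
# 2 hop B 297.0 ms
# 3 hop C 19.0 ms
```

In this model the packet that reaches hop C also crossed hop B, yet it does not pay hop B's reply delay, which is the pattern seen between hops 13 and 14.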

Hope to help

Giuseppe

dsnixon · ‎05-05-2009

Hi Suresh,

It may also depend on what device is at hop 13. For example, certain firewalls may report a slow response time to ICMP, but perform much better when traffic is passed through them.

The reason is that ICMP response processing is low priority compared to the day-to-day functional task of checking and passing (or not passing) traffic.

Regards,

David

marikakis · ‎05-05-2009

Hello,

Explaining weird traceroutes is a very common task for network engineers. There are many possible reasons for weird outputs as others have already explained. I would add one that comes to my mind and has to do with possible MPLS paths that might cause the output to produce high times that do not correspond to the forwarding performance experienced by "real" traffic destined to the end destination (as opposed to traffic destined to the router itself).

Still, I could summarize some general advice when trying to troubleshoot such issues:

1. Do you see a problem when trying to interact with the end station? In this case, this simply translates to opening a browser and trying to interact with the site. I see no problem from my PC right now. Check whether proxies or similar devices are in the path, and try to confirm the result with as many people as you can.

2. Try to traceroute from UNIX-like, Windows and Cisco platforms with various source addresses if possible. Traceroute does not behave the same way on all platforms (I think you posted a tracert from Windows).

3. If you see the problem in the previous steps, go and try to resolve it. If you see no problem, report this to the client (i.e. you see no problem from your side right now) and ask the client if they are having difficulties accessing the site or any other performance issues. This step serves to ensure that a real problem exists and you are not troubleshooting the traceroute program instead of the network itself. This could have been your first step, BUT you can only be convincing if you did your homework first. If they have real issues, you need to dig deeper. Else you tell the story of traceroute not being a real measure for network performance for various reasons already mentioned.

Note also that an important parameter when real issues exist is the time and day the issues occur. You need to clarify this with the client. Real issues might well exist, but show up only at particular times (e.g. when backups are taken, when transient congestion occurs, or when known interface flaps have occurred). Also keep in mind that end servers could go down or be congested. In all such time-dependent cases you could arrange for the client to call you immediately when the issue reappears, so that you can troubleshoot it effectively. I would recommend tracking the performance of the site in graphs and establishing a baseline of the expected traceroute output when no issues exist, so that you can eliminate hops that have nothing to do with the real problem and show a high traceroute response even under normal circumstances.
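As a rough illustration of the baseline idea, the sketch below builds a per-hop baseline from traceroute runs taken when nobody was complaining, then reports only the hops that are slower than usual. The function names, the 30 ms tolerance and the sample numbers are all assumptions made for the example; how you collect the per-hop times is left open.

```python
from statistics import median


def build_baseline(runs):
    """runs: list of {hop_number: rtt_ms} dicts collected when no issue was reported."""
    hops = set().union(*runs)
    return {hop: median(r[hop] for r in runs if hop in r) for hop in hops}


def deviations(baseline, current, tolerance_ms=30.0):
    """Return {hop: extra_ms} for hops slower than their baseline by more than the tolerance."""
    return {
        hop: current[hop] - baseline[hop]
        for hop in sorted(current)
        if hop in baseline and current[hop] - baseline[hop] > tolerance_ms
    }


# Hypothetical history: hop 13 is always slow to reply, hop 14 (the site) is normally fast.
good_runs = [
    {13: 280, 14: 16},
    {13: 300, 14: 15},
    {13: 260, 14: 17},
]
baseline = build_baseline(good_runs)

# Hypothetical run taken while the client reports slowness.
during_complaint = {13: 290, 14: 120}
print(deviations(baseline, during_complaint))
```

With these sample numbers only hop 14 is reported: hop 13 is high, but no higher than it is on a good day, so it can be ruled out.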

Kind Regards,

M.

CiscogeekIND · ‎05-05-2009

Thanks everyone for the nice explanation.

Is it only ICMP packets that see the delay, or does general traffic face the delay as well?

As far as I know, a router can be configured to give low priority to ICMP packets. But I don't think ISP routers would add this much delay (because they use high-end routers).

What about ping? It is OK between the source and destination. Does that mean traffic between source and destination flows with the same delay that I see in the ping response?

One more simple question: my bandwidth utilization is just 40% and the client-end utilization is also less than 60%, but VPN tunnel traffic shows delay, whereas non-VPN traffic (ping from the VPN source device to the destination device) is normal.

route2null · ‎05-05-2009

To marikakis's 2nd point: don't get hung up troubleshooting the troubleshooting tool. I know it's interesting stuff, but you'd be better served learning about tracert in an environment that you control. Troubleshooting tracert over the Internet exposes you to so many variables beyond your control.

Just an opinion,

James

marikakis · ‎05-06-2009

In my opinion, engineers have to do the best they can when a complaint from a client comes. Performing a few traceroutes is not too much work if you have a client that is hard to convince. As I said before, the real issue is whether the client does face connectivity or performance issues for real traffic together with the troubleshooting tools showing something not easily explained. What was the reason for the client to decide to perform the traceroute? You cannot know if a real problem exists unless you explore it. You cannot just say "traceroute is worthless" to the client when problems occur, and use it at other times as a clue to convince the client everything is fine.

You cannot know if real packets face the delay reported by traceroute unless you interact with the site. In this case, things are easy. Delays in the order of 400 ms are noticeable by humans. If the site responds quickly when you click here and there in the web interface, then the delay is just traceroute delay. If the site responds slowly, you need to dig deeper, but over the Internet you still cannot know if this is a problem of the web server or a problem of the intermediate network, and that's why you need a baseline traceroute. If you have the same traceroute delays when the site is slow and when it is responding fast, then you have a clue pointing towards the site being the problem.

Also keep in mind that no matter how annoying a single client complaint about a single site might look, you can see it as help from the client that gives you a hint to diagnose early a potential issue that might affect other clients soon. In my own opinion, a client complaint should not be left unexplored. This procedure is part of the quality of the service you provide to the client.

Just because ISP routers are monsters this does not mean they are left unprotected to respond to every ping/traceroute their clients initiate. The real traffic they receive is also huge, so the numbers are relative. For example, they also receive huge amounts of troubleshooting traffic, which is not only their own direct clients' traffic, but also from all over the world.

You are right that ping delay is probably better, because it does not require responses from intermediate devices. Ping delay is closer to delay of forwarded traffic. As always, tools are tools. I could argue that real delay can be better than ping delay like this: When you download a file you can have many packets on the way between source and destination. The overall delay could be much lower than sending one packet and waiting for a response to send the next one. Of course, interactive traffic of non-file transfer applications might not be as good and depends more on end-to-end delay.
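A back-of-the-envelope calculation shows why a transfer with many packets in flight can finish far sooner than ping-style timing suggests. This is a simplified sketch: it ignores link bandwidth, packet loss and protocol details, and the function names, the 40 ms round trip and the window of 50 packets are assumptions chosen only for illustration.

```python
import math


def stop_and_wait_seconds(packets, rtt_ms):
    """Send one packet, wait a full round trip for the response, then send the next."""
    return packets * rtt_ms / 1000


def windowed_seconds(packets, rtt_ms, window):
    """Keep up to `window` packets in flight per round trip."""
    rounds = math.ceil(packets / window)
    return rounds * rtt_ms / 1000


packets, rtt_ms = 1000, 40
print(stop_and_wait_seconds(packets, rtt_ms))        # 40.0 seconds
print(windowed_seconds(packets, rtt_ms, window=50))  # 0.8 seconds
```

The round-trip time is identical in both cases; only the number of packets allowed in flight changes. Interactive traffic behaves more like the first case, which is why it depends more on end-to-end delay.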

Bandwidth utilization is important to know and cannot be very easily measured (many overheads might not be reported by routing devices). If we assume that your numbers are accurate enough, then congestion due to bandwidth might not be an issue in your case (though over the internet you never know either the utilization of all the links or the cpu utilizations of the devices or anything). The traceroute over the VPN might look slower in traceroute because packets from intermediate devices might follow different return path for the different source addresses you use (i.e. your source addresses are destination addresses from the point of view of the routers in the path). Anyway, the most important thing is to determine whether the client does have connectivity or performance issues when browsing the site. Application performance is your real measure of whether a problem does exist. Also try to see if client has issues with only this site or this is more general. This detail is easily forgotten and I also forgot to mention it earlier. If client cannot see a thing then we move troubleshooting more towards the client side.

kaushikvadali · ‎05-06-2009

Hi Suresh,

This scenario is common for many destinations on the Internet.

Traceroute sends batches of 3 packets, with the TTL value incremented by one for each successive batch.

The difference in latency depends on the hop's behavior when it comes to a

- hop passing the packet

- hop replying to the packet

As per your trace, the expected average response to the destination is around 15 ms.

The 7th and 13th hops appear congested in your network path: each takes a long time when it has to respond itself, yet passes packets with a higher TTL on to the next hop without that delay.
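This reading of the trace can be checked mechanically. The sketch below uses the round-trip times from the trace in the original post and flags every hop whose best time is much higher than the best time of some later hop, meaning the delay does not carry through to traffic that is forwarded past it. The function name and the 50 ms margin are assumptions for the example, not a standard rule.

```python
# Round-trip times in ms per hop, copied from the trace in the original post.
TRACE = {
    1: (1, 1, 1),        2: (14, 10, 9),      3: (25, 10, 10),
    4: (24, 7, 10),      5: (17, 11, 14),     6: (12, 12, 9),
    7: (149, 207, 211),  8: (16, 16, 15),     9: (18, 16, 14),
    10: (10, 12, 13),    11: (12, 15, 32),    12: (18, 17, 17),
    13: (255, 298, 364), 14: (35, 45, 15),
}


def reply_only_delays(trace, margin_ms=50):
    """Hops whose best RTT exceeds the best RTT of a later hop by more than margin_ms."""
    hops = sorted(trace)
    flagged = []
    for i, hop in enumerate(hops[:-1]):
        best_here = min(trace[hop])
        best_later = min(min(trace[h]) for h in hops[i + 1:])
        # A later hop answered faster, so packets were not delayed passing through this one.
        if best_here - best_later > margin_ms:
            flagged.append(hop)
    return flagged


print(reply_only_delays(TRACE))  # [7, 13]
```

Hops 7 and 13 are flagged because hops further along the same path answer within about 10 to 15 ms, so the high figures reflect how those two devices answer probes rather than how they forward traffic.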

Kaushik Vadali