Today, during a technical discussion, one of my colleagues posed an interesting question:
is there any specific reason to disable PHP (penultimate hop popping)?
I am keen to know. Please answer if anyone has a reason; I frankly admit that I haven't found one.
The only reason I have come across has to do with QoS, i.e. the MPLS EXP bits in the label header. Cisco gives this explanation in one of its docs:
A. In some cases, the PHP can expose the IP packet to the penultimate hop. This happens when an MPLS packet arriving at the penultimate hop has only one label. In this case, the penultimate LSR and the edge LSR do not have access to the EXP value that the packet carried before the MPLS header was removed. To preserve the EXP value in this case, the edge LSR needs to advertise an explicit NULL label (a label value of zero). The penultimate hop forwards MPLS packets with a NULL label instead of forwarding IP packets. An explicit NULL label is not needed when the penultimate hop receives MPLS packets with a label stack that contains at least two labels and PHP is performed. In that case, the inner label can still carry the EXP value needed by the penultimate and edge LSR to implement their QoS policy.
The only part of the above that confuses me is the last sentence, where it seems to say the P router could use the VPN label (or any other second-level label) for the QoS decision.
My understanding was that P routers only ever look at the top-level label, but that is not what the above seems to imply.
Sorry to raise another question instead of just answering yours.
It refers to PE (edge LSR) not the P router.
That was my understanding as well, but read the extract I posted: the final sentence says that not just the edge LSR (PE) but also the penultimate hop (P router) can use the second label to get the QoS value.
It could be the document is wrong.
Per my understanding, it is not the P router but the egress PE. When MPLS is implemented in the core without any services like VPN, the ingress PE will impose only one label. Depending on local policy, it can set any EXP value (I think we should stop saying EXP and start calling it TC ;)), which might differ from the IP DSCP value.
Now when PHP pops this (only) label, the EXP value is gone and the egress PE ends up treating the packet per its IP DSCP (or just classifying it into the default class). To avoid such issues, we use the explicit null label. But if the ingress PE imposes two labels, it copies the same EXP to both labels, which the egress PE can use to identify the packet and give it the right treatment.
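To make the one-label versus two-label case concrete, here is a rough Python sketch. It is only a toy model, not how any router is implemented: the `Label` and `Packet` classes, the function names, and the label values 100 and 2001 are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Label:
    value: int
    exp: int  # 3-bit EXP / Traffic Class field


@dataclass
class Packet:
    dscp: int  # DSCP carried in the IP header
    stack: List[Label] = field(default_factory=list)  # index 0 is the top label


def impose(packet: Packet, label_values: List[int], exp: int) -> Packet:
    """Ingress PE (toy model): push the labels, writing the same EXP into each."""
    packet.stack = [Label(v, exp) for v in label_values] + packet.stack
    return packet


def php_pop(packet: Packet) -> Packet:
    """Penultimate hop (toy model): remove the top label and do nothing else."""
    packet.stack.pop(0)
    return packet


def exp_seen_by_egress(packet: Packet) -> Optional[int]:
    """EXP available to the egress PE, or None if a bare IP packet arrives."""
    return packet.stack[0].exp if packet.stack else None


# Hypothetical label values: 100 is the transport label, 2001 a service label.
one_label = php_pop(impose(Packet(dscp=0), [100], exp=5))
two_labels = php_pop(impose(Packet(dscp=0), [100, 2001], exp=5))

print(exp_seen_by_egress(one_label))   # None: EXP was lost with the only label
print(exp_seen_by_egress(two_labels))  # 5: the inner label still carries it
```

With a single label the egress PE is left with only the DSCP in the IP header, which is the situation explicit null is meant to avoid.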
Also, disabling PHP plays a role in context identification. Though it is not in production, there are scenarios where we need to identify the context to perform the right table lookup (e.g. upstream label assignment).
Per my understanding, it is not the P router but the egress PE.
That was my understanding as well, but the extract from the Cisco document suggests that the penultimate P router can also use the second label.
Perhaps the document is wrong?
Hmm Thanks Jon,
Thanks for your answer and the question as well.
Is it something like this?
The P router performing PHP just copies the EXP value to the top label while popping the label, and only when an EXP value has been assigned; otherwise it does not look into the top label at all. If so, how does this happen?
Does it look into the top label in any case when it is performing PHP?
I need to clarify this, as it is eating my brain.
If the P router is performing PHP, it simply pops the top label and that's it. As far as I know, it does not look into the label for any EXP value. There would be no point, because it is not going to copy this value to a new label.
If the P router is not performing PHP then it would read the EXP value in the incoming label and write that value into the new label.
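The difference between the two cases can be sketched in a few lines of Python. This only models the behaviour as described in this post (a plain pop versus a swap that carries the EXP across); the tuple representation and the label values 100 and 200 are assumptions for illustration, and real platforms may offer other options.

```python
from typing import List, Tuple

# A label stack entry is modelled as (label_value, exp); index 0 is the top.
Entry = Tuple[int, int]


def swap(stack: List[Entry], out_label: int) -> List[Entry]:
    """Non-PHP hop: replace the top label, carrying the incoming EXP across."""
    _, exp = stack[0]
    return [(out_label, exp)] + stack[1:]


def pop(stack: List[Entry]) -> List[Entry]:
    """PHP hop: drop the top entry; its EXP is discarded along with it."""
    return stack[1:]


incoming = [(100, 5)]        # hypothetical label 100 with EXP 5
print(swap(incoming, 200))   # [(200, 5)]
print(pop(incoming))         # []
```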
Thanks Jon, I understand now.
You mean that whenever the PE router wants to read the EXP value, it should signal an explicit NULL, so that the EXP bits in the label stack are preserved throughout the MPLS network and are available for the QoS actions performed by the penultimate router.
Basically yes although there seem to be two different issues here.
By using the explicit null, the PE is signalling the P to put on a new label (the explicit null label, value 0) instead of simply popping; if there was an EXP value, it would be copied into that label. This means:
1) when the P router sends the packet to its outgoing interface for processing, there is an EXP value that it can use for any QoS treatment
2) the PE can also use the EXP value in its QoS treatment.
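For reference, this is roughly what an explicit null entry carrying an EXP value looks like on the wire. The sketch below packs and unpacks the standard 32-bit label stack entry (20-bit label, 3-bit EXP, 1-bit bottom-of-stack, 8-bit TTL); the helper names and the EXP and TTL values are my own choices for illustration.

```python
import struct

EXPLICIT_NULL_IPV4 = 0  # reserved label value 0
IMPLICIT_NULL = 3       # reserved label value 3: advertised to request PHP, never sent on the wire


def pack_entry(label: int, exp: int, bottom: bool, ttl: int) -> bytes:
    """Encode one label stack entry: label(20) | EXP(3) | S(1) | TTL(8)."""
    word = (label << 12) | (exp << 9) | (int(bottom) << 8) | ttl
    return struct.pack("!I", word)


def unpack_entry(data: bytes) -> dict:
    """Decode one 4-byte label stack entry into its fields."""
    (word,) = struct.unpack("!I", data)
    return {
        "label": word >> 12,
        "exp": (word >> 9) & 0x7,
        "bottom": bool((word >> 8) & 0x1),
        "ttl": word & 0xFF,
    }


# What the penultimate hop would send when the PE advertised explicit null
# and the incoming label carried EXP 5 (example values).
entry = pack_entry(EXPLICIT_NULL_IPV4, exp=5, bottom=True, ttl=254)
print(entry.hex())          # 00000bfe
print(unpack_entry(entry))
```

The label value itself is 0, so the entry carries no forwarding information; its only job here is to keep the EXP field in front of the IP packet until it reaches the PE.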
Nagendra seems to be suggesting that it is really only the PE that needs this value (apologies Nagendra if I have misrepresented what you have said).
But I thought it was relevant to both the penultimate P router and the egress PE router, i.e. for the entire end-to-end LSP.
Perhaps we need a bit more clarification.
Thinking about it logically, it would make sense if it was for both routers, because the LSP is end to end, i.e. from the ingress PE outbound interface to the egress PE inbound interface.
Routers obviously can do QoS on both the outbound and inbound interfaces, so both the P and PE router might well need to see the EXP value in the label, depending on how QoS has been set up.
You mean that if we have two Ps and two PEs, then all the routers have to see the EXP value because the LSP is end to end?
Please point me to any document that gives at least an overview of this.
Sorry for making this discussion longer, but I am concerned about this; otherwise I cannot even sleep.
Now, what I understand is that in environments where we want to use MPLS QoS values that differ from the IP DSCP/IP Precedence values, explicit NULL can be used.
But what about when we have two labels in the stack? How is the EXP value copied to the top label while PHP is being performed and we don’t have explicit null configured?
Is the router performing PHP also going to look into the top label in the stack? Please help.
So you mean that the EXP value imposed on the bottom label is also copied to the top label by the ingress router itself. Did I understand correctly?
I found this
I am confused now.
It is the same thing we are discussing. If classification is done at ingress using MPLS, you can reclassify based on the ToS value at egress, because the EXP value ceases to exist (the label is torn off but the IP header remains). So at the egress point you cannot reclassify based on the EXP value. Is this clear or not?
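Put as a small Python sketch, the egress decision looks something like this. It is a simplification under my own assumptions: the function name is hypothetical, the stack is a list of (label, exp) tuples, and IP precedence is taken as the top three bits of the DSCP.

```python
from typing import List, Tuple


def classify_at_egress(stack: List[Tuple[int, int]], dscp: int) -> Tuple[str, int]:
    """Return which field the egress router can classify on, and its value."""
    if stack:
        # A label (for example explicit null) is still present: EXP is usable.
        return ("exp", stack[0][1])
    # The label was popped upstream: only the IP header is left.
    return ("ip-precedence", dscp >> 3)


print(classify_at_egress([], dscp=46))        # ('ip-precedence', 5)
print(classify_at_egress([(0, 3)], dscp=46))  # ('exp', 3)
```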