Luc De Ghein
Cisco Employee

Size of OSPF Packets

Links on routers have an MTU. Outgoing packets, including OSPF packets, cannot be larger than the interface MTU. Let’s have a look at how OSPF behaves with respect to packet size.


This is what RFC 2328 (OSPF version 2 specification) says about OSPF packets and MTU.

    A.1 Encapsulation of OSPF packets

    OSPF runs directly over the Internet Protocol's network layer.  OSPF
    packets are therefore encapsulated solely by IP and local data-link
    headers.
 
    OSPF does not define a way to fragment its protocol packets, and
    depends on IP fragmentation when transmitting packets larger than
    the network MTU. If necessary, the length of OSPF packets can be up
    to 65,535 bytes (including the IP header).  The OSPF packet types
    that are likely to be large (Database Description Packets, Link
    State Request, Link State Update, and Link State Acknowledgment
    packets) can usually be split into several separate protocol
    packets, without loss of functionality.  This is recommended; IP
    fragmentation should be avoided whenever possible.

Remember that there could be one LSA in one Link State (LS) Update packet, but there can also be many LSAs in one LS Update packet. This is called packing LSAs into one LS Update packet.
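As a rough illustration of packing, the sketch below greedily fills LS Update packets with LSAs until the next LSA would no longer fit into one unfragmented IP datagram. The function name `pack_lsas` and the greedy strategy are assumptions made for this example, not a description of how any particular implementation packs. The header sizes are the standard ones: a 20-byte IPv4 header without options, a 24-byte OSPF header and the 4-byte "# LSAs" field of an LS Update.

```python
IP_HEADER = 20        # IPv4 header without options
OSPF_HEADER = 24      # common OSPF packet header (RFC 2328, A.3.1)
LSU_COUNT_FIELD = 4   # "# LSAs" field at the start of an LS Update body


def pack_lsas(lsa_sizes, mtu):
    """Greedily group LSA sizes (in bytes) into LS Update packets.

    Hypothetical helper: each returned list holds the LSA sizes of one packet.
    An LSA larger than the budget still gets its own packet, which would then
    rely on IP fragmentation.
    """
    budget = mtu - IP_HEADER - OSPF_HEADER - LSU_COUNT_FIELD
    packets, current, used = [], [], 0
    for size in lsa_sizes:
        # Start a new packet when the next LSA would exceed the budget.
        if current and used + size > budget:
            packets.append(current)
            current, used = [], 0
        current.append(size)
        used += size
    if current:
        packets.append(current)
    return packets


def ospf_packet_length(lsas_in_packet):
    """OSPF packet length (OSPF header included) for one LS Update."""
    return OSPF_HEADER + LSU_COUNT_FIELD + sum(lsas_in_packet)
```

With 100 AS-external LSAs of 36 bytes each and a 1500-byte limit, this sketch produces three packets, the first holding 40 LSAs for an OSPF packet length of 1468 bytes.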

MTU in DBD packet

Here’s a DBD (Database Description) packet, as specified in RFC 2328. This packet describes the contents of the OSPF link-state database.

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |   Version #   |       2       |         Packet length         |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                          Router ID                            |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                           Area ID                             |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |           Checksum            |             AuType            |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                       Authentication                          |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                       Authentication                          |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |         Interface MTU         |    Options    |0|0|0|0|0|I|M|MS
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                     DD sequence number                        |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       +-                                                             -+
       |                                                               |
       +-                      An LSA Header                          -+
       |                                                               |
       +-                                                             -+
       |                                                               |
       +-                                                             -+
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                              ...                              |

Interface MTU is defined as: “The size in bytes of the largest IP datagram that can be sent out the associated interface, without fragmentation”. So, routers attached to a link exchange their interface MTU value in DBD packets when the OSPF adjacency is initialized.
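To show where this field sits in practice, the sketch below reads the Interface MTU, Options, flags and DD sequence number from the raw bytes of a DBD packet, following the layout in the diagram above. The name `parse_dbd` and the dictionary keys are made up for this example. It assumes the bytes start at the OSPF header (the IP header has already been removed) and it does not verify the checksum or the authentication data.

```python
import struct

OSPF_HEADER_LEN = 24   # Version, Type, Length, Router ID, Area ID, Checksum, AuType, Authentication
DBD_FIXED_LEN = 8      # Interface MTU, Options, flags, DD sequence number


def parse_dbd(packet: bytes) -> dict:
    """Extract the fixed DBD fields from a raw OSPFv2 packet (hypothetical helper)."""
    if len(packet) < OSPF_HEADER_LEN + DBD_FIXED_LEN:
        raise ValueError("packet too short for a Database Description packet")
    version, pkt_type, _length = struct.unpack_from("!BBH", packet, 0)
    if version != 2 or pkt_type != 2:
        raise ValueError("not an OSPFv2 Database Description packet")
    # The DBD body starts right after the 24-byte OSPF header.
    mtu, options, flags, seq = struct.unpack_from("!HBBI", packet, OSPF_HEADER_LEN)
    return {
        "interface_mtu": mtu,
        "options": options,
        "flags": flags,          # low three bits are I, M and MS
        "dd_sequence": seq,
    }
```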

Section 10.6 of RFC 2328 says:

        If the Interface MTU field in the Database Description packet
        indicates an IP datagram size that is larger than the router can
        accept on the receiving interface without fragmentation, the
        Database Description packet is rejected. 

When "debug ip ospf adj" is turned on, we can see the arrival of these DBD packets. In the following example, we can see that there is a mismatch in MTU values between two OSPF neighbors. This router has MTU 1600, while the neighboring OSPF router has interface MTU 2000.



On this router:

OSPF: Rcv DBD from 10.100.1.2 on GigabitEthernet0/1 seq 0x2124 opt 0x52 flag 0x2 len 1452  mtu 2000 state EXSTART

OSPF: Nbr 10.100.1.2 has larger interface MTU

On the neighboring router:

OSPF: Rcv DBD from 10.100.100.1 on GigabitEthernet0/1 seq 0x89E opt 0x52 flag 0x7 len 32  mtu 1600 state EXCHANGE

OSPF: Nbr 10.100.100.1 has smaller interface MTU

The DBD packets are retransmitted continuously and eventually, the OSPF adjacency is torn down.


OSPF: Send DBD to 10.100.1.2 on GigabitEthernet0/1 seq 0x9E6 opt 0x52 flag 0x7 len 32

OSPF: Retransmitting DBD to 10.100.1.2 on GigabitEthernet0/1 [10]

OSPF: Send DBD to 10.100.1.2 on GigabitEthernet0/1 seq 0x9E6 opt 0x52 flag 0x7 len 32

OSPF: Retransmitting DBD to 10.100.1.2 on GigabitEthernet0/1 [11]


%OSPF-5-ADJCHG: Process 1, Nbr 10.100.1.2 on GigabitEthernet0/1 from EXSTART to DOWN, Neighbor Down: Too many retransmissions
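The comparison behind the "larger interface MTU" and "smaller interface MTU" messages above can be sketched as follows. The regular expression is modelled only on the "Rcv DBD" debug lines shown in this article, so other IOS releases may format the line differently. The function name `check_dbd_mtu` and the verdict strings are invented for the example.

```python
import re

# Pattern based on the "debug ip ospf adj" lines shown above (an assumption about the format).
DBD_LINE = re.compile(r"Rcv DBD from (?P<nbr>\S+) on (?P<intf>\S+) .*?\bmtu (?P<mtu>\d+)")


def check_dbd_mtu(debug_line, local_mtu):
    """Return (neighbor, neighbor_mtu, verdict) or None if the line is not a DBD line.

    verdict is "larger", "smaller" or "match", seen from the local router.
    RFC 2328 section 10.6 only requires rejecting the "larger" case.
    """
    match = DBD_LINE.search(debug_line)
    if match is None:
        return None
    nbr_mtu = int(match.group("mtu"))
    if nbr_mtu > local_mtu:
        verdict = "larger"
    elif nbr_mtu < local_mtu:
        verdict = "smaller"
    else:
        verdict = "match"
    return match.group("nbr"), nbr_mtu, verdict
```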

Behavior change of OSPF and packing LSAs into an LS Update packet

Before CSCse01519

Before CSCse01519, OSPF in IOS would build OSPF packets up to a maximum of 1500 bytes, regardless of the interface MTU. So, if the interface MTU is bigger than 1500 bytes, OSPF would still pack only up to 1500 bytes into an OSPF packet. This is somewhat inefficient because OSPF could send bigger packets on the link and achieve a greater throughput. There is one exception to this: if a single LSA is larger than 1500 bytes, then OSPF builds that packet, no matter what the size (OSPF cannot fragment one LSA). The IP stack of the router then fragments it to fit the MTU of the outgoing interface. This typically occurs when an OSPF router has many links and hence the router LSA becomes bigger than the link MTU.


Equally, if the MTU of the outgoing interface is smaller than 1500 bytes, the OSPF process would still build or pack OSPF packets up to 1500 bytes, and the IP stack of the router would fragment them into smaller IP packets to fit the MTU of the outgoing link. A typical example is an IPSec tunnel between two routers running OSPF. The added overhead of the tunnel encapsulation leads to an MTU which is lower than 1500 bytes. OSPF builds OSPF packets up to 1500 bytes and they then get fragmented before the router transmits them. This is another inefficiency.

After CSCse01519


After CSCse01519, OSPF in IOS can pack OSPF packets to be greater than 1500 bytes. This occurs if the MTU of the outgoing interface is greater than 1500 bytes. This makes the transmissions more efficient, as more information can be packed into one larger packet. For example, if one OSPF router needs to transmit a lot of external LSAs to an OSPF neighbor, it can pack more external LSAs into one LS Update packet, if that router runs IOS with CSCse01519 implemented.
CSCse01519 also allows OSPF to build packets smaller than 1500 bytes. In some scenarios, the MTU between two OSPF neighbors is lower than 1500 bytes. See the example above with an IPSec tunnel. In that case, OSPF transmits OSPF packets which are smaller than 1500 bytes, avoiding IP fragmentation, except in the case of one large LSA that is bigger than the interface MTU.
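The difference between the two behaviors can be summarized in a small sketch. The function name `packing_limit` and its return format are made up for illustration. The sketch covers only the packing of several LSAs into one packet, not the exception of a single LSA that is larger than the limit, which is built and fragmented by IP in both cases.

```python
LEGACY_LIMIT = 1500  # fixed packing limit in bytes before CSCse01519, as described above


def packing_limit(interface_mtu, has_cscse01519):
    """Return (limit_in_bytes, fragmented) for packed OSPF packets.

    limit_in_bytes: largest IP datagram OSPF fills when packing several LSAs.
    fragmented: True if such a full packet has to be fragmented by IP
                on the outgoing interface.
    """
    limit = interface_mtu if has_cscse01519 else LEGACY_LIMIT
    return limit, limit > interface_mtu
```

For an interface MTU of 2000, the limit goes from 1500 to 2000 bytes with the fix. For an interface MTU of 1400, it goes from 1500 bytes (fragmented) to 1400 bytes (not fragmented).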


Example of an issue due to the behavior change of OSPF and packing LSAs into an LS Update packet

Here's a specific example of what can go wrong when upgrading an OSPF router and discovering an OSPF MTU issue due to CSCse01519.


Many networks have OSPF neighbors which are connected through a Layer 2 switched network or a transport network, such as an L2VPN service or an SDH/SONET network. These transport networks can have different MTU settings than the routers running OSPF.

While the MTU setting should be correct on all routers, reflecting the true MTU, there are often mistakes and they can go unnoticed.


Here's an example network, with two routers R1 and R2 running OSPF and they are connected through a Layer 2 switch.

[Figure: routers R1 and R2 connected through a Layer 2 switch]


The issue occurs often when the routers have Ethernet interfaces with a configurable MTU, as is the case here. The interfaces are GigabitEthernet interfaces with the MTU set to 2000. The MTU of the Layer 2 switch is only 1500 bytes.
Assume that the data traffic is never bigger than 1500 bytes. Then there is no problem running IOS without CSCse01519, because the OSPF packets will never be larger than 1500 bytes. The exception is a single LSA which is larger than 1500 bytes, in which case the OSPF process on router R1 or R2 builds a Link State Update packet larger than 1500 bytes and transmits it. Assume this packet is 1800 bytes; it will then get dropped by the Layer 2 switch between the routers.


Assume we have an OSPF database on R2 with enough networks that the locally originated LSAs can make an LS Update packet larger than the interface MTU.
If these networks are originated by a covering network command, then the networks show up in the router LSA of R2. R2 builds a router LSA which is bigger than 2000 bytes and transmits it, and IP fragments the packet down to 2000 bytes, the interface MTU. The Layer 2 switch, however, will drop these packets. OSPF will then retransmit this packet endlessly and the OSPF adjacency never becomes full. So, the issue is immediately discovered, even when running IOS without CSCse01519.


If these networks are originated by "redistribute connected", then they will show up in external LSAs. OSPF will only try to pack external LSAs into one LS Update packet up to 1500 bytes big.

In this case, even though the interface MTU is 2000, the OSPF adjacency reaches the FULL state. The issue of the underlying MTU, which is not adequate, is not immediately discovered.
When we upgrade one router to IOS with CSCse01519, then the issue will be discovered.
Let's see what happens.


First, both routers run IOS without CSCse01519.
When the adjacency builds, we see that R1 never receives an OSPF packet bigger than 1500 bytes, even though the MTU of the interfaces is 2000.
We enable "debug ip ospf packet".

OSPF: rcv. v:2 t:1 l:48 rid:10.100.1.2
      aid:0.0.0.0 chk:72CF aut:0 auk: from GigabitEthernet0/1

...

OSPF: rcv. v:2 t:4 l:1468 rid:10.100.1.2
      aid:0.0.0.0 chk:8389 aut:0 auk: from GigabitEthernet0/1
OSPF: rcv. v:2 t:4 l:136 rid:10.100.1.2

...

l:xx in the debug output shows us the length of the OSPF packet. The biggest OSPF packet received was 1468 bytes.
t:4 means that the type of the OSPF packet is "Link State Update". Refer to this table from RFC 2328, section 4.3, for the different OSPF packet types.


             Type   Packet  name           Protocol  function
             __________________________________________________________
             1      Hello                  Discover/maintain  neighbors
             2      Database Description   Summarize database contents
             3      Link State Request     Database download
             4      Link State Update      Database update
             5      Link State Ack         Flooding acknowledgment
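When the debug output is long, a small script can pick out the packet types and lengths. The sketch below is only modelled on the "OSPF: rcv." lines shown in this article, so the line format is an assumption and may differ between IOS releases. The names `received_packets` and `largest_packet` are made up for the example.

```python
import re

# Packet type numbers from the RFC 2328 table above.
PACKET_TYPES = {
    1: "Hello",
    2: "Database Description",
    3: "Link State Request",
    4: "Link State Update",
    5: "Link State Ack",
}

# Matches lines such as: "OSPF: rcv. v:2 t:4 l:1468 rid:10.100.1.2"
RCV_LINE = re.compile(r"OSPF: rcv\. v:(\d+) t:(\d+) l:(\d+) rid:(\S+)")


def received_packets(debug_text):
    """Return a list of (type_name, length, router_id) for each received packet."""
    return [
        (PACKET_TYPES.get(int(ptype), "unknown"), int(length), rid)
        for _version, ptype, length, rid in RCV_LINE.findall(debug_text)
    ]


def largest_packet(debug_text):
    """Return the longest received packet, or None if no line matched."""
    packets = received_packets(debug_text)
    return max(packets, key=lambda p: p[1]) if packets else None
```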


We see that the OSPF adjacency reaches the full state.

R1#show ip ospf neighbor gigabitEthernet 0/1


Neighbor ID     Pri   State           Dead Time   Address         Interface
10.100.1.2        0   FULL/  -        00:00:34    10.1.1.2        GigabitEthernet0/1



R2#show ip ospf neighbor gigabitEthernet 0/1


Neighbor ID     Pri   State           Dead Time   Address         Interface
10.100.100.1      0   FULL/  -        00:00:34    10.1.1.1        GigabitEthernet0/1

We upgrade IOS on R2 to an IOS with CSCse01519.

R2#show ip ospf neighbor gigabitEthernet 0/1       


Neighbor ID     Pri   State           Dead Time   Address         Interface
10.100.100.1      0   LOADING/  -     00:00:33    10.1.1.1        GigabitEthernet0/1



R2#show ip ospf neighbor gigabitEthernet 0/1 detail
Neighbor 10.100.100.1, interface address 10.1.1.1
    In the area 0 via interface GigabitEthernet0/1
    Neighbor priority is 0, State is LOADING, 5 state changes
    DR is 0.0.0.0 BDR is 0.0.0.0
    Options is 0x12 in Hello (E-bit L-bit )
    Options is 0x52 in DBD (E-bit L-bit O-bit)
    LLS Options is 0x1 (LR)
    Dead timer due in 00:00:39
    Neighbor is up for 00:00:49
    Index 1/1, retransmission queue length 0, number of retransmission 0
    First 0x0(0)/0x0(0) Next 0x0(0)/0x0(0)
    Last retransmission scan length is 0, maximum is 0
    Last retransmission scan time is 0 msec, maximum is 0 msec
    Number of retransmissions for last link state request packet 9
    Poll due in 00:00:00



R2#show ip ospf neighbor gigabitEthernet 0/1 detail
Neighbor 10.100.100.1, interface address 10.1.1.1
    In the area 0 via interface GigabitEthernet0/1
    Neighbor priority is 0, State is LOADING, 5 state changes
    DR is 0.0.0.0 BDR is 0.0.0.0
    Options is 0x12 in Hello (E-bit L-bit )
    Options is 0x52 in DBD (E-bit L-bit O-bit)
    LLS Options is 0x1 (LR)
    Dead timer due in 00:00:33
    Neighbor is up for 00:02:06
    Index 1/1, retransmission queue length 0, number of retransmission 0
    First 0x0(0)/0x0(0) Next 0x0(0)/0x0(0)
    Last retransmission scan length is 0, maximum is 0
    Last retransmission scan time is 0 msec, maximum is 0 msec
    Number of retransmissions for last link state request packet 25
    Poll due in 00:00:03



%OSPF-5-ADJCHG: Process 1, Nbr 10.100.100.1 on GigabitEthernet0/1 from LOADING to DOWN, Neighbor Down: Too many retransmissions

The OSPF adjacency does not reach the FULL state. We see retransmissions, and the OSPF adjacency is stuck in the LOADING state. OSPF gives up after 25 retransmissions, after which it tries to establish the adjacency again but runs into the same issue. So, this continues endlessly.
We see that by upgrading only one router (R2) we uncover a previously hidden issue: the underlying MTU is smaller than the one used by the OSPF routers.
When the MTU of the switch is changed to pass 2000-byte packets, we see an OSPF packet which is bigger than 1500 bytes being transmitted fine.

R1#
OSPF: rcv. v:2 t:3 l:1980 rid:10.100.1.2
      aid:0.0.0.0 chk:AC5B aut:0 auk: from GigabitEthernet0/1


To check for underlying MTU issues, always ping the OSPF neighbor IP address with a size equal to the interface MTU and the DF bit set.


To discover the value of the underlying MTU, perform this ping and sweep the size. The size of the last successful echo is the real MTU: with a sweep interval of 1, it equals the sweep minimum plus the number of "!" in the output, minus one. In this case, the last echo reply we got back from the ping command has a size of 1500 bytes.

R2#ping
Protocol [ip]:
Target IP address: 10.1.1.1
Repeat count [5]: 1
Datagram size [100]:
Timeout in seconds [2]:
Extended commands [n]: yes
Source address or interface:
Type of service [0]:
Set DF bit in IP header? [no]: yes
Validate reply data? [no]:
Data pattern [0xABCD]:
Loose, Strict, Record, Timestamp, Verbose[none]:
Sweep range of sizes [n]: yes
Sweep min size [36]: 1460
Sweep max size [18024]: 1540
Sweep interval [1]:
Type escape sequence to abort.
Sending 81, [1460..1540]-byte ICMP Echos to 10.1.1.1, timeout is 2 seconds:
Packet sent with the DF bit set
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!.............................
...........
Success rate is 49 percent (40/81), round-trip min/avg/max = 1/1/4 ms
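The arithmetic for reading the sweep can be sketched in a few lines. The function name `last_successful_size` is made up for this example. It assumes a repeat count of 1 (one result character per size, as in the ping above) and treats every character other than "!" as a failure.

```python
def last_successful_size(sweep_min, results, interval=1):
    """Return the datagram size of the last "!" in a sweep ping, or None.

    results: the result characters printed by the ping; line breaks and
             spaces are ignored, so wrapped output can be pasted as-is.
    """
    marks = "".join(results.split())   # drop whitespace from wrapped lines
    last = None
    for index, mark in enumerate(marks):
        if mark == "!":
            # The n-th result character belongs to size sweep_min + n * interval.
            last = sweep_min + index * interval
    return last
```

For a sweep starting at 1460 in which the first 41 echoes succeed, the function returns 1500.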

Comments
Jan.Ferre
Level 1

A very good article describing the problems.

One subtle thing to notice is the definition of MTUs, especially on L3 switches:

  MTU size is for 10-100 Mbit interfaces

  Jumbo MTU is for 1-10 Gbit interfaces

  Routing MTU is for OSPF (guess for other routing protocols as well)

Especially on L3 switches, you may need an extended MTU for switching/trunking purposes while it may be necessary to reduce the routing MTU. This is especially important when mixing switch models such as the C3550, C3560 and C3750, as they behave differently.

Anyway, this article does give a good understanding of _why_ the problem exists.

Nandan Mathure
Level 1

Nice and helpful post. Thanks :-)

siddhartham
Level 4

great explanation, thanks for the article.

shreerampardhy
Level 1

Hello Luc,

I have a slightly different understanding of the above topic for type 1 and type 2 LSAs. I agree that the maximum size of a type 1 or type 2 LSA can be 65K (as the length field is 2 bytes). I also agree that the device needs to build the complete LSA (router and network) without fragmenting it. But I don't think that the IP layer can then fragment this packet if the interface MTU is less than the LSA size. There is no field in the type 1 and type 2 LSAs that can be used to reassemble a fragmented LSA.

For example, if the size of the type 1 LSA generated by the device is more than 1500 bytes and the link MTU is only 1500 bytes, then the IP layer cannot just fragment the packet. Even if it does, the LSA would be corrupted when it is received at the receiving end.

I think this can be done only in ISIS (just speaking about the link-state protocols). An LSP in ISIS supports a maximum of 255 fragments which can be reassembled at the receiving end. Since each fragment has its own checksum, they can also be individually verified.

Regards,

Shreeram

siddhartham
Level 4

I do have a question. If the IP layer can fragment the packets, then why do we get an OSPF adjacency issue when there is an interface MTU mismatch?

Luc De Ghein
Cisco Employee

Hi Shreeram,

IP can fragment OSPF packets.

Here are two routers, R1 and R2, both with MTU 1500 on the Ethernet interface between them.

R1 has many OSPF-enabled interfaces, so that the Router LSA of R1 becomes bigger than 1500 bytes.

R1#show ip int et 0/0
Ethernet0/0 is up, line protocol is up
  Internet address is 10.1.1.1/24
  Broadcast address is 255.255.255.255
  Address determined by setup command
  MTU is 1500 bytes      <<<<<<

A capture on the wire when OSPF exchanges the router LSA of R1 shows:

Frame 37 (1514 bytes on wire, 1514 bytes captured)
Ethernet II, Src: aa:bb:cc:00:01:00, Dst: 01:00:5e:00:00:05
Internet Protocol, Src Addr: 10.1.1.1 (10.1.1.1), Dst Addr: 224.0.0.5 (224.0.0.5)
    Version: 4
    Header length: 20 bytes
    Differentiated Services Field: 0xc0 (DSCP 0x30: Class Selector 6; ECN: 0x00)
    Total Length: 1500      <<<<<<
    Identification: 0x0184 (388)      <<<<<<
    Flags: 0x02
        .0.. = Don't fragment: Not set
        ..1. = More fragments: Set      <<<<<<
    Fragment offset: 0
    Time to live: 1
    Protocol: OSPF IGP (0x59)
    Header checksum: 0xa67e (correct)
    Source: 10.1.1.1 (10.1.1.1)
    Destination: 224.0.0.5 (224.0.0.5)
Open Shortest Path First
    OSPF Header
        OSPF Version: 2
        Message Type: LS Update (4)
        Packet Length: 1528      <<<<<<
        Source OSPF Router: 10.100.1.1 (10.100.1.1)
        Area ID: 0.0.0.0 (Backbone)
        Packet Checksum: 0xf490
        Auth Type: Null
        Auth Data (none)
[Unreassembled Packet: OSPF]

Frame 38 (82 bytes on wire, 82 bytes captured)
Ethernet II, Src: aa:bb:cc:00:01:00, Dst: 01:00:5e:00:00:05
Internet Protocol, Src Addr: 10.1.1.1 (10.1.1.1), Dst Addr: 224.0.0.5 (224.0.0.5)
    Version: 4
    Header length: 20 bytes
    Differentiated Services Field: 0xc0 (DSCP 0x30: Class Selector 6; ECN: 0x00)
    Total Length: 68
    Identification: 0x0184 (388)      <<<<<<
    Flags: 0x00
        .0.. = Don't fragment: Not set
        ..0. = More fragments: Not set
    Fragment offset: 1480
    Time to live: 1
    Protocol: OSPF IGP (0x59)
    Header checksum: 0xcb5d (correct)
    Source: 10.1.1.1 (10.1.1.1)
    Destination: 224.0.0.5 (224.0.0.5)
Data (48 bytes)
0000  0a c8 01 03 ff ff ff ff 03 00 00 01 0a c8 01 02   ................
0010  ff ff ff ff 03 00 00 01 0a c8 01 01 ff ff ff ff   ................
0020  03 00 00 01 0a 64 01 01 ff ff ff ff 03 00 00 01   .....d..........

The router LSA of R1 is bigger than 1500 bytes and was fragmented by IPv4.
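The two frames in the capture follow directly from IPv4 fragmentation arithmetic, which can be sketched as below. The function name `ip_fragments` is made up for the example, and it assumes a 20-byte IP header without options. Every fragment except the last must carry a multiple of 8 payload bytes, because the fragment offset field counts in 8-byte units.

```python
def ip_fragments(payload_len, mtu, ip_header=20):
    """Split an IP payload into fragments for a given MTU.

    Returns a list of (offset_bytes, payload_bytes, more_fragments) tuples.
    """
    # Largest payload per fragment, rounded down to a multiple of 8 bytes.
    per_fragment = (mtu - ip_header) // 8 * 8
    fragments = []
    offset = 0
    while offset < payload_len:
        size = min(per_fragment, payload_len - offset)
        more = offset + size < payload_len
        fragments.append((offset, size, more))
        offset += size
    return fragments


# The 1528-byte OSPF packet from the capture, sent over an MTU of 1500:
for offset, size, more in ip_fragments(1528, 1500):
    print(f"offset {offset}, IP total length {size + 20}, more fragments: {more}")
```

For the 1528-byte LS Update this gives a first fragment with IP total length 1500 and the More Fragments bit set, and a second fragment at offset 1480 with total length 68, matching frames 37 and 38.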

The router LSA of R1 will be stored on R2. We can see that the size of the LSA is bigger than 1500 bytes.

R2#show ip ospf database router 10.100.1.1

            OSPF Router with ID (10.100.1.2) (Process ID 1)

                Router Link States (Area 0)

  LS age: 4
  Options: (No TOS-capability, DC)
  LS Type: Router Links
  Link State ID: 10.100.1.1
  Advertising Router: 10.100.1.1
  LS Seq Number: 80000022
  Checksum: 0x2CF4
  Length: 1536      <<<<<<
  Number of Links: 126

The difference between OSPF and ISIS is that OSPF runs on top of IP, while ISIS runs directly on Layer 2.

ISIS builds one LSP per level per router. With OSPF, IP can fragment the packet.

The OSPF LSA header does have a checksum field. The re-assembled LSA can be verified.

RFC 2328:

4.3.  Routing protocol packets

        The OSPF protocol runs directly over IP, using IP protocol 89.
        OSPF does not provide any explicit fragmentation/reassembly
        support.  When fragmentation is necessary, IP
        fragmentation/reassembly is used.  OSPF protocol packets have
        been designed so that large protocol packets can generally be
        split into several smaller protocol packets.  This practice is
        recommended; IP fragmentation should be avoided whenever
        possible.

A.1 Encapsulation of OSPF packets

    OSPF does not define a way to fragment its protocol packets, and
    depends on IP fragmentation when transmitting packets larger than
    the network MTU. If necessary, the length of OSPF packets can be up
    to 65,535 bytes (including the IP header).

The issue of the OSPF adjacency not forming is related to a mismatch in MTU settings or to another problem with the MTU.

Either the MTU is set differently on the two sides of the link, or there is a Layer 2 device in the middle with a lower MTU than the routers have on their interfaces.

In the example above, the router LSA of R1 is fragmented, but the OSPF adjacency forms fine.

I hope this clarifies things.

Thanks,

Luc

shreerampardhy
Level 1

Hello Luc,

Many thanks for the detailed explanation. My confusion was whether type 1 and type 2 LSAs can be fragmented or not, as I was looking for a way for the OSPF packet itself to identify the fragments. Your outputs clarify this precisely. :)

Thank you once again for the explanation.

Regards,

Shreeram

csoto
Level 1

Hi Luc

Thanks for this document, it is very clear and precise in its explanation.

I tested this in the lab and could reproduce the failure several times.

Best regards

Christian

CSCO11508096
Level 1

Hi,

To resolve an MTU mismatch between the two interfaces, use the interface-level command "ip ospf mtu-ignore" on the router that has the lower MTU set on the interface.

Regards

Shashi
