This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.
This Internet-Draft will expire on April 22, 2004.
Copyright (C) The Internet Society (2003). All Rights Reserved.
This memo specifies the details of the Host Identity Protocol (HIP). The overall description of the protocol and the underlying architectural thinking are available in the separate HIP architecture specification. The Host Identity Protocol is used to establish a rapid authentication between two hosts and to provide continuity of communications between those hosts independent of the networking layer.
The various forms of the Host Identity, Host Identity Tag (HIT) and Local Scope Identifier (LSI), are covered in detail. It is described how they are used to support authentication and the establishment of keying material, which is then used by IPsec Encapsulated Security Payload (ESP) to establish a two-way secured communication channel between the hosts. The basic state machine for HIP provides a HIP compliant host with the resiliency to avoid many denial-of-service (DoS) attacks. The basic HIP exchange for two public hosts shows the actual packet flow. Other HIP exchanges, including those that work across NATs, are covered elsewhere.
The Host Identity Protocol (HIP) provides a rapid exchange of Host Identities between two hosts. The exchange also establishes a pair of IPsec Security Associations (SA), to be used with IPsec Encapsulated Security Payload (ESP)[15]. The HIP protocol is designed to be resistant to Denial-of-Service (DoS) and Man-in-the-middle (MitM) attacks, and when used to enable ESP, provides DoS and MitM protection for upper layer protocols, such as TCP and UDP.
The Host Identity Protocol introduces a new namespace, the Host Identity. The effects of this change are explained in the companion document, the HIP architecture[17] specification.
There are three representations of the Host Identity: the full Host Identifier (HI), the Host Identity Tag (HIT), and the Local Scope Identifier (LSI). Three representations are used because each meets a different design goal of HIP, and none of them can be removed without giving up one of these goals. The HI represents the Identity directly. It is a public key. Since different public key algorithms can be used with different key lengths, the HI is not well suited for use as a packet identifier, or as an index into the various operational tables needed to support HIP.
A hash of the HI, the Host Identity Tag (HIT), thus becomes the operational representation. It is 128 bits long. It is used in the HIP payloads, and it is intended to be used to index the corresponding state in the end hosts.
In many environments, 128 bits is still considered large. For example, currently used IPv4 based applications are constrained with 32 bit API fields. Thus, the third representation, the 32 bit LSI, is needed. The LSI provides a compression of the HIT with only a local scope so that it can be carried efficiently in any application level packet and used in API calls.
The base HIP exchange consists of four packets. The four-packet design helps to make HIP DoS resilient. The protocol exchanges Diffie-Hellman keys in the 2nd and 3rd packets, and authenticates the parties in the 3rd and 4th packets. Additionally, it starts the cookie exchange in the 2nd packet, completing it in the 3rd packet.
The exchange uses the Diffie-Hellman exchange to hide the Host Identity of the Initiator in packet 3. The Responder's Host Identity is not protected. It should be noted, however, that both the Initiator's and the Responder's HITs are transported as such (in cleartext) in the packets, allowing an eavesdropper with a priori knowledge about the parties to verify their identities.
Data packets start after the 4th packet. The 3rd and 4th HIP packets may carry a data payload in the future. However, the details of this are to be defined later as more implementation experience is gained.
Finally, HIP is designed as an end-to-end authentication and key establishment protocol. It lacks much of the fine-grain policy control found in IKE that allows IKE to support complex gateway policies. Thus, HIP is not a complete replacement for IKE.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC2119[3].
The structure of the Host Identifier (HI) is the public key of an asymmetric key pair. Correspondingly, the host itself is the entity that holds the private key from the key pair. See the HIP architecture specification[17] for more details about the difference between an identity and the corresponding identifier.
DSA is the MUST implement public key algorithm for all HIP implementations; other algorithms MAY be supported. DSA was chosen as the default algorithm due to its small signature size.
A Host Identity Tag (HIT) is used in protocols to represent the Host Identity. Another representation of the Host Identity, the Local Scope Identifier (LSI), can also be used in protocols and APIs. LSI's advantage over HIT is its size; its disadvantage is its local scope.
The Host Identity Tag is a 128 bit value -- a hash of the Host Identifier. There are two advantages of using a hash over the actual Identity in protocols. Firstly, its fixed length makes for easier protocol coding and also better manages the packet size cost of this technology. Secondly, it presents a consistent format to the protocol whatever underlying identity technology is used.
There are two types of HITs. HITs of the first type, called type 1 HIT, consist of an initial 2 bit prefix of 01, followed by 126 bits of the SHA-1 hash of the public key. HITs of the second type consist of an initial 2 bit prefix of 10, a Host Assigning Authority (HAA) field, and only the last 64 bits come from a SHA-1 hash of the Host Identity. This latter format for HIT is recommended for 'well known' systems. It is possible to support a resolution mechanism for these names in hierarchical directories, like the DNS. Another use of HAA is in policy controls, see HIP Policies.
This document fully specifies only type 1 HITs. HITs that consist of the HAA field and the hash are specified in [20].
Any conforming implementation MUST be able to deal with both types of HITs. When handling other than type 1 HITs, the implementation is RECOMMENDED to explicitly learn and record the binding between the Host Identifier and the HIT, as it may not be able to generate such HITs from Host Identifiers.
The 126 or 64 hash bits in a HIT MUST be generated by taking the least significant 126 or 64 bits of the SHA-1[18] hash of the Host Identifier as it is represented in the Host Identity field in a HIP payload packet.
For Identities that are DSA public keys, the HIT is formed as follows:
The following pseudo-code illustrates the process. The symbol := denotes assignment; the symbol += denotes appending. The pseudo-function encode_in_network_byte_order takes two parameters, an integer (bignum) and a length in bytes, and returns the integer encoded into a byte string of the given length.
buffer := encode_in_network_byte_order ( DSA.T , 1 )
buffer += encode_in_network_byte_order ( DSA.Q , 20 )
buffer += encode_in_network_byte_order ( DSA.P , 64 + 8 * DSA.T )
buffer += encode_in_network_byte_order ( DSA.G , 64 + 8 * DSA.T )
buffer += encode_in_network_byte_order ( DSA.Y , 64 + 8 * DSA.T )
digest := SHA-1 ( buffer )
hit_126 := concatenate ( 01 , low_order_bits ( digest, 126 ) )
hit_haa := concatenate ( 10 , HAA, low_order_bits ( digest, 64 ) )
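As an illustration only, the type 1 branch of the pseudo-code above could be sketched in Python as follows. The function names are hypothetical, the key parameters are passed as plain integers, and the sketch makes no attempt to validate that they form a real DSA key.

```python
import hashlib


def encode_in_network_byte_order(value: int, length: int) -> bytes:
    """Encode a non-negative integer as a big-endian byte string of the given length."""
    return value.to_bytes(length, "big")


def dsa_hit_type1(t: int, q: int, p: int, g: int, y: int) -> bytes:
    """Return a 128-bit type 1 HIT (prefix 01 + low 126 bits of SHA-1)."""
    size = 64 + 8 * t  # length of P, G and Y in bytes
    buffer = encode_in_network_byte_order(t, 1)
    buffer += encode_in_network_byte_order(q, 20)
    buffer += encode_in_network_byte_order(p, size)
    buffer += encode_in_network_byte_order(g, size)
    buffer += encode_in_network_byte_order(y, size)
    digest = int.from_bytes(hashlib.sha1(buffer).digest(), "big")
    low_126 = digest & ((1 << 126) - 1)   # least significant 126 bits
    hit = (0b01 << 126) | low_126         # 2 bit prefix 01
    return hit.to_bytes(16, "big")
```

The type 2 form is left out here because the layout of the HAA field is not specified in this document.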
LSIs are 32-bit localized representations of a Host Identity. The purpose of an LSI is to facilitate using Host Identities in existing IPv4 based protocols and APIs. The owner of the Host Identity does not set the LSI that other hosts use for it; each host selects the LSIs that it uses for representing its partners.
A local LSI is an LSI that a remote host has assigned to a host. In some implementations, local LSIs may be assigned to some interface as an IP address. A remote LSI is an LSI that the host has assigned to represent a remote host (and that the remote host has accepted).
The LSIs MUST be allocated from the 1.0.0.0/8 subnet. That makes it easier to differentiate between LSIs and IPv4 addresses at the API level. By default, the low order 24 bits of an LSI SHOULD be equal to the low order 24 bits of the corresponding HIT. That allows easier mapping between LSIs and HITs, and makes the LSI assigned to a host a fixed one.
If a host is forming a remote LSI for a HIT whose low order 24 bits are equal to those of another already existing remote LSI, the host MAY select another LSI to represent that host. If the low order 24 bits of a remote HIT are equal to the low order 24 bits of a local LSI, the host MAY select a different LSI to represent the remote host. In either case, the host SHOULD assign the low order 24 bits of the LSI randomly. All hosts SHOULD be prepared to handle local LSIs whose low order 24 bits do not match any of their own HITs. Note that any such mechanisms may be subject to implementation complications, see Backwards compatibility API issues.
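A minimal sketch of the default mapping, assuming the HIT is held as a 16-byte string in network byte order; the names LSI_PREFIX and default_lsi are illustrative and not defined by this specification.

```python
import ipaddress

LSI_PREFIX = 0x01000000  # 1.0.0.0/8


def default_lsi(hit: bytes) -> ipaddress.IPv4Address:
    """Combine the 1.0.0.0/8 prefix with the low order 24 bits of a HIT."""
    if len(hit) != 16:
        raise ValueError("a HIT is 128 bits long")
    low_24 = int.from_bytes(hit[-3:], "big")
    return ipaddress.IPv4Address(LSI_PREFIX | low_24)
```

For example, a HIT ending in the bytes 0x0a 0x0b 0x0c maps to the LSI 1.10.11.12 under this sketch.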
If the LSI assigned by a peer to represent a host is unacceptable, the host MAY terminate the HIP four-way handshake and start anew.
It is possible that the HITs of two remote hosts have equal low order 24 bits. Since HITs are basically random, if a host is communicating with 1000 other hosts, the risk of such collision is roughly 0.006%, and for a host communicating with 10000 other hosts, the risk is about 0.06%. However, given a population of 100000 hosts, each communicating with 1000 other hosts, the probability that there were no collisions at all is only about 2%. In other words, even though collisions are fairly rare events for any given host, they will happen, and there must be a way to handle them. However, this specification does not currently specify any such way. A future version of this specification is expected to include a definition; see also the discussion in Backwards compatibility API issues.
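The per-host figures above are consistent with treating the low order 24 bits of a HIT as uniformly random, so that a new peer collides with one of n already assigned values with probability of about n / 2^24. The following sketch assumes that reading; the function name is hypothetical, and the sketch does not attempt to reproduce the population-wide estimate.

```python
def new_peer_collision_probability(existing_peers: int, bits: int = 24) -> float:
    """Chance that a new random suffix equals one of the existing, distinct suffixes."""
    return min(1.0, existing_peers / 2 ** bits)


for peers in (1000, 10000):
    percent = 100 * new_peer_collision_probability(peers)
    print(f"{peers} peers: {percent:.4f}%")
```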
SPIs are used in ESP to find the right security association for received packets. The ESP SPIs have added significance when used with HIP; they are a compressed representation of the HITs in every packet. Thus, SPIs MAY be used by intermediary systems in providing services like address mapping. Note that since the SPI has significance at the receiver, only the < DST, SPI >, where DST is a destination IP address, uniquely identifies the receiver HIT at every given point of time. The same SPI value may be used by several hosts. A single < DST, SPI > value may denote different hosts at different points of time, depending on which host is currently reachable at the DST.
Each host selects for itself the SPI it wants to see in packets received from its peer. This allows it to select different SPIs for different peers. The SPI selection SHOULD be random; the rules of Section 2.1 of the ESP specification[15] must be followed. A different SPI SHOULD be used for each HIP exchange with a particular host; this is to avoid a replay attack. Additionally, when a host rekeys, the SPI MUST be changed. Furthermore, if a host changes over to use a different IP address, it MAY change the SPI.
One method for SPI creation that meets these criteria would be to concatenate the HIT with a 32 bit random or sequential number, hash this (using SHA-1), and then use the high order 32 bits as the SPI.
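The method just described could be sketched as follows. The function names are hypothetical. The retry loop that skips values below 256 reflects the reserved SPI range in Section 2.1 of the ESP specification; it is an addition of this sketch rather than part of the method as stated.

```python
import hashlib
import secrets


def spi_from_nonce(hit: bytes, nonce: int) -> int:
    """Hash HIT | 32-bit number with SHA-1 and keep the high order 32 bits."""
    digest = hashlib.sha1(hit + nonce.to_bytes(4, "big")).digest()
    return int.from_bytes(digest[:4], "big")


def pick_spi(hit: bytes) -> int:
    """Draw random 32-bit numbers until the resulting SPI is outside 0..255."""
    while True:
        spi = spi_from_nonce(hit, secrets.randbits(32))
        if spi >= 256:  # assumption: 0..255 are reserved and must not be chosen
            return spi
```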
The selected SPI is communicated to the peer in the third (I2) and fourth (R2) packets of the base HIP exchange. Changes in SPI are signalled with NES packets.
There is a subtle difference between an LSI and a SPI.
The LSI is designed to be relatively long-lived. A system selects the LSI it locally uses to represent its peer, and it SHOULD reuse a previous LSI for a HIT during a subsequent HIP exchange. This could be important in a timeout recovery situation. The LSI only appears in the 3rd and 4th HIP packets. The LSI is used anywhere in system processes where IP addresses have traditionally been used, like in TCBs, IPv4 API calls, and FTP PORT commands.
The SPI is short-lived. It changes with each HIP exchange and with a HIP rekey. A system notifies its peer of the SPI to use in ESP packets sent to it. Since the SPI is in all but the first two HIP packets, it can be used in intermediary systems to assist in address remapping.
When computing TCP and UDP checksums on sockets bound to HITs or LSIs, the IPv6 pseudo-header format [8] is used. Additionally, the HITs MUST be used in the place of the IPv6 addresses in the IPv6 pseudo-header. Note that the pseudo-header for actual HIP payloads is computed differently; see Checksum.
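As a sketch of what this means for an upper layer segment, the following assumes the standard IPv6 pseudo-header layout (source, destination, 32-bit upper layer length, three zero bytes, next header) with the two HITs substituted for the addresses. The function names are illustrative only, and the UDP rule for transmitting a zero checksum is not handled.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f">{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def hit_pseudo_header(src_hit: bytes, dst_hit: bytes,
                      upper_len: int, next_header: int) -> bytes:
    """IPv6-style pseudo-header with HITs in place of the IPv6 addresses."""
    return src_hit + dst_hit + struct.pack(">I3xB", upper_len, next_header)
```

A sender would compute internet_checksum over the pseudo-header followed by the segment with its checksum field set to zero; a receiver recomputing over the pseudo-header and the received segment should obtain zero.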
The Host Identity Protocol is IP protocol TBD. The HIP payload could be carried in every datagram. However, since HIP datagrams are relatively large (at least 40 bytes), and ESP already has all of the functionality to maintain and protect state, the HIP payload is 'compressed' into an ESP payload after the HIP exchange. Thus in practice, HIP packets only occur in datagrams to establish or change HIP state.
For testing purposes, the protocol number 99 is currently used.
The base HIP exchange serves to manage the establishment of state between an Initiator and a Responder. The Initiator first sends a trigger packet, I1, to the Responder. The second packet, R1, starts the actual exchange. It contains a puzzle, that is, a cryptographic challenge that the Initiator must solve before continuing the exchange. In its reply, I2, the Initiator must display the solution. Without a solution the I2 message is simply discarded.
The last three packets of the exchange, R1, I2, and R2, constitute a standard authenticated Diffie-Hellman key exchange. The base exchange is illustrated below.
       Initiator                              Responder

                    I1: trigger exchange
                  -------------------------->
                                              select pre-computed R1
                    R1: puzzle, D-H, key, sig
                  <-------------------------
    check sig                                 remain stateless
    solve puzzle
                  I2: solution, D-H, {key}, sig
                  -------------------------->
    compute D-H                               check cookie
                                              check puzzle
                                              check sig
                            R2: sig
                  <--------------------------
    check sig                                 compute D-H
In this section we cover the overall design of the base exchange. The details are the subject of the rest of this memo.
The purpose of the HIP cookie mechanism is to protect the Responder from a number of denial-of-service threats. It allows the Responder to delay state creation until receiving I2. Furthermore, the puzzle included in the cookie allows the Responder to use a fairly cheap calculation to check that the Initiator is "sincere" in the sense that it has churned CPU cycles in solving the puzzle.
The Cookie mechanism has been explicitly designed to give space for various implementation options. It allows a responder implementation to completely delay session specific state creation until a valid I2 is received. In such a case, a validly formatted I2 can be rejected only after the Responder has checked its validity by computing one hash function. On the other hand, the design also allows a responder implementation to keep state about received I1s, and match the received I2s against the state, thereby allowing the implementation to avoid the computational cost of the hash function. The drawback of this latter approach is the requirement of creating state. Finally, it also allows an implementation to use any combination of the space-saving and computation-saving mechanisms.
One possible way for a Responder to remain stateless but drop most spoofed I2s is to base the selection of the cookie on some function over the Initiator's Host Identity. The idea is that the Responder has a (perhaps varying) number of pre-calculated R1 packets, and it selects one of these based on the information carried in I1. When the Responder then later receives I2, it checks that the cookie in the I2 matches with the cookie sent in the R1, thereby making it impractical for the attacker to first exchange one I1/R1, and then generate a large number of spoofed I2s that seemingly come from different IP addresses or use different HITs. The method does not protect from an attacker that uses fixed IP addresses and HITs, though. Against such an attacker it is probably best to create a piece of local state, and remember that the puzzle check has previously failed. See Using responder cookies for one possible implementation. Note, however, that the implementations MUST NOT use the exact implementation given in the appendix, and SHOULD include sufficient randomness to the algorithm so that algorithm complexity attacks become impossible [22].
The Responder can set the puzzle difficulty for the Initiator based on how much it trusts the Initiator. The Responder SHOULD use heuristics to determine when it is under a denial-of-service attack, and set the difficulty value K appropriately.
The Responder starts the cookie exchange when it receives an I1. The Responder supplies a random number I, and requires the Initiator to find a number J. To select a proper J, the Initiator must create the concatenation of I, the HITs of the parties, and J, and take a SHA-1 hash over this concatenation. The lowest order K bits of the result MUST be zeros.
To generate a proper number J, the Initiator will have to generate a number of Js until one produces the hash target of zero. The Initiator SHOULD give up after trying 2^(K+2) times, and start over the exchange. (See Probabilities in the cookie calculation.) The Responder needs to re-create the concatenation of I, the HITs, and the provided J, and compute the hash once to prove that the Initiator did its assigned task.
To prevent pre-computation attacks, the Responder MUST select the number I in such a way that the Initiator cannot guess it. Furthermore, the construction MUST allow the Responder to verify that the value was indeed selected by it and not by the Initiator. See Using responder cookies for an example on how to implement this.
It is RECOMMENDED that the Responder generates a new cookie and a new R1 once every few minutes. Furthermore, it is RECOMMENDED that the Responder remembers an old cookie at least 60 seconds after it has been deprecated. These time values allow a slower Initiator to solve the cookie puzzle while limiting the usability that an old, solved cookie has to an attacker.
NOTE: The protocol developers explicitly considered whether R1 should include a timestamp in order to protect the Initiator from replay attacks. The decision was NOT to include a timestamp.
In R1, the values I and K are sent in network byte order. Similarly, in I2 the values I and J are sent in network byte order. The SHA-1 hash is created by concatenating, in network byte order, the following data, in the following order:
64-bit random value I, in network byte order, as appearing in R1 and I2.
128-bit initiator HIT, in network byte order, as appearing in the HIP Payload in R1 and I2.
128-bit responder HIT, in network byte order, as appearing in the HIP Payload in R1 and I2.
64-bit random value J, in network byte order, as appearing in I2.
In order to be a valid response cookie, the K low-order bits of the resulting SHA-1 digest must be zero.
Notes:
The length of the data to be hashed is 48 bytes.
All the data in the hash input MUST be in network byte order.
The order of the initiator and responder HITs is different in the R1 and I2 packets, see Payload format. Care must be taken to copy the values in the right order to the hash input.
- Precomputation by the Responder
- Sets up the challenge difficulty K.
Generates a random number I.
Creates a signed R1 and caches it.
- Responder
- Selects a suitable cached R1.
Sends I and K in a HIP Cookie in the R1.
Saves I and K for a Delta time.
- Initiator
- Generates repeated attempts to solve the challenge until a matching J is found:
Ltrunc( SHA-1( I | HIT-I | HIT-R | J ), K ) == 0
Sends I and J in a HIP Cookie in an I2.
- Responder
- Verifies that the received I is a saved one.
Finds the right K based on I.
Computes V := Ltrunc( SHA-1( I | HIT-I | HIT-R | J ), K )
Rejects if V != 0
Accepts if V == 0
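The Initiator and Responder steps above can be sketched in Python as follows. The function names are hypothetical, and trying consecutive values of J from a starting point is an assumption of this sketch; the text only requires that the Initiator generate candidate values until one matches. The give-up bound of 2^(K+2) attempts follows the recommendation given earlier.

```python
import hashlib
import struct
from typing import Optional


def ltrunc_sha1(i: int, hit_i: bytes, hit_r: bytes, j: int, k: int) -> int:
    """Return the K low-order bits of SHA-1( I | HIT-I | HIT-R | J )."""
    # 8 + 16 + 16 + 8 = 48 bytes, all in network byte order
    data = struct.pack(">Q", i) + hit_i + hit_r + struct.pack(">Q", j)
    digest = int.from_bytes(hashlib.sha1(data).digest(), "big")
    return digest & ((1 << k) - 1)


def solve_puzzle(i: int, hit_i: bytes, hit_r: bytes, k: int,
                 start: int = 0) -> Optional[int]:
    """Initiator: try up to 2^(K+2) values of J; return None when giving up."""
    for attempt in range(1 << (k + 2)):
        j = (start + attempt) & 0xFFFFFFFFFFFFFFFF  # keep J within 64 bits
        if ltrunc_sha1(i, hit_i, hit_r, j, k) == 0:
            return j
    return None


def verify_solution(i: int, hit_i: bytes, hit_r: bytes, j: int, k: int) -> bool:
    """Responder: a single hash computation decides accept or reject."""
    return ltrunc_sha1(i, hit_i, hit_r, j, k) == 0
```

The asymmetry is visible in the sketch: the Initiator loops over many hash computations, while the Responder performs exactly one.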
The packets R1, I2, and R2 implement a standard authenticated Diffie-Hellman exchange. The Responder sends its public Diffie-Hellman key and its public authentication key, i.e., its host identity, in R1. The signature in R1 allows the Initiator to verify that the R1 was at some point generated by the Responder. However, since it is precomputed and therefore does not cover all of the packet, it does not protect from replay attacks.
When the Initiator receives an R1, it computes the Diffie-Hellman session key. It creates a HIP security association using keying material from the session key (see HIP KEYMAT), and uses the security association to encrypt its public authentication key, i.e., host identity. The resulting I2 contains the Initiator's Diffie-Hellman key and its encrypted public authentication key. The signature in I2 covers all of the packet.
The Responder extracts the Initiator Diffie-Hellman public key from the I2, computes the Diffie-Hellman session key, creates a corresponding HIP security association, and decrypts the Initiator's public authentication key. It can then verify the signature using the authentication key.
The final message, R2, is needed to protect the Initiator from replay attacks.
The HIP Birthday is a reboot count used to manage state re-establishment when one peer rebooted or timed out its SA. The Birthday is increased every time the system boots. The Birthday also has to be increased in accordance with the system's SA timeout parameter. If the system has open SAs, it MUST increase its Birthday. This impacts a system's approach to precomputing R1 packets.
Birthday SHOULD be a counter. It MUST NOT be reset by the user and a system is unlikely to need a birthday larger than 2^64. Date-time in GMT MAY be used if a cross-boot counter is not possible, but it has a potential problem if the system time is set back by the user.
A future version of this document may define how to include ESP protected data on various HIP packets. However, currently the HIP header is a terminal header, and not followed by any other headers.
The OPTIONAL PAYLOAD packet (see PAYLOAD - the HIP Payload Packet) MAY be used to transfer data.
HIP includes a simple rekey mechanism, allowing the hosts to introduce new keying material at any time by introducing a new Diffie-Hellman public key; see NES - the HIP New SPI Packet. All conforming HIP implementations MUST support rekeying.
This memo defines an OPTIONAL HIP based bootstrap mechanism, intended for ad hoc like environments; see BOS - the HIP Bootstrap Packet. There is little operational experience of the usability of this mechanism, and it may be dropped or completely revised in some future protocol version.
HIP does not define how to use certificates. However, it does define a simple certificate transport mechanism that MAY be used to implement certificate based security policies. The certificate payload is defined in CERT, and the certificate packet in CER - the HIP Certificate Packet.
A typical HIP packet flow is shown below.
I --> Directory: lookup of R
I <-- Directory: return R's addresses, and HI and/or HIT
I1 I --> R (Hi. Here is my I1, let's talk HIP)
R1 I <-- R (OK. Here is my R1, handle this HIP cookie)
I2 I --> R (Compute, compute, here is my counter I2)
R2 I <-- R (OK. Let's finish HIP with my R2)
I --> R (ESP protected data)
I <-- R (ESP protected data)
The HIP protocol and state machine are designed to recover from one of the parties crashing and losing its state. The following scenarios describe the main use cases covered by the design.
No prior state between the two systems.
The system with data to send is the Initiator. The process follows the standard 4 packet base exchange, establishing the SAs.
The system with data to send has no state with the receiver, but the receiver has a residual SA.
The Initiator acts as in the no prior state case, sending I1 and getting R1. When the Receiver gets an I2, the old SAs are 'discovered' and deleted; the new SAs are established.
The system with data to send has an SA, but the receiver does not.
The receiver 'detects' the situation when it receives an ESP packet that contains an unknown SPI. The receiver sends an R1 with a NULL initiator HIT. The sender gets the R1 with a later Birthday, discards the old SA, and continues the base exchange to establish new SAs for sending data.
A peer determines that it needs to reset Sequence number or rekey.
It sends NES. Receiver sends NES response, establishes new SAs for peers.
A HIP aware host may choose not to accept a HIP exchange. If the host's policy is to only be an initiator, it should begin its own HIP exchange. A host MAY choose to have such a policy since only the initiator HI is protected in the exchange. There is a risk of a race condition if each host's policy is to only be an Initiator, at which point the HIP exchange will fail.
If the host's policy does not permit it to enter into a HIP exchange with the Initiator, it should send an ICMP 'Destination Unreachable, Administratively Prohibited' message. A more complex HIP packet is not used here as it actually opens up more potential DoS attacks than a simple ICMP message.
Simulating a loss of state is a potential DoS attack. The following process has been crafted to manage state recovery without presenting a DoS opportunity.
If a host reboots or times out, it has lost its HIP state. If the system that lost state has a datagram to deliver to its peer, it simply restarts the HIP exchange. The peer sends an R1 HIP packet, but does not reset its state until it receives the I2 HIP packet. The I2 packet MUST have a Birthday greater than the current SA's Birthday. This is to handle DoS attacks that simulate a reboot of a peer. Note that either the original Initiator or the Responder could end up restarting the exchange, becoming the new Initiator.
If a system receives an ESP packet for an unknown SPI, the assumption is that it has lost the state and its peer did not. In this case, the system treats the ESP packet like an I1 packet and sends an R1 packet. The initiator HIT is typically NULL in the R1, since the system usually does not know the peer's HIT any more.
The system receiving the R1 packet first checks to see if it has an established and recently used SA with the party sending the R1. If such an SA exists, the system checks the Birthday, and if the Birthday is greater than the current SA's Birthday, it processes the R1 packet, optionally queuing the payload packet(s) to be resent later. The peer system processes the I2 in the normal manner, and replies with an R2. This will re-establish state between the two peers. Note that the process will result in new ESP SAs being used, and therefore simply resending ESP packets is not sufficient.
Note that there is a potential DoS attack. If an attacker can simulate a situation where a large number of peers apparently lose their state at the same time, and all send R1 packets at once to a server, the server will choke on trying to solve all the puzzles at the same time. However, such an attack would require that the attacker has specific knowledge about the SAs being used, and an ability to trigger R1s as the SAs are used.
The HIP protocol itself has very little state. In the HIP base exchange, there is an Initiator and a Responder. Once the SAs are established, this distinction is lost. If the HIP state needs to be re-established, the controlling parameters are which peer still has state and which has a datagram to send to its peer. The following state machine attempts to capture these processes.
The state machine is presented in a single system view, representing either an Initiator or a Responder. There is not a complete overlap of processing logic here and in the packet definitions. Both are needed to completely implement HIP.
Implementors must understand that the state machine, as described here, is informational. Specific implementations are free to implement the actual functions differently.
- E0
- State machine start
- E1
- Initiating HIP
- E2
- Waiting to finish HIP
- E3
- HIP SA established
- E4
- HIP SA established, rekeying
- E-FAILED
- HIP SA establishment failed
+---------+
| E0 | Start state
+---------+
Datagram to send, send I1 and go to E1
Receive I1, send R1 and stay at E0
Receive I2, process
if successful, send R2 and go to E3
if fail, stay at E0
Receive ESP for unknown SA, send R1 and stay at E0
Receive ANYOTHER, drop and stay at E0
+---------+
| E1 | Initiating HIP
+---------+
Receive I1, send R1 and stay at E1
Receive I2, process
if successful, send R2 and go to E3
if fail, stay at E1
Receive R1, process
if successful, send I2 and go to E2
if fail, go to E-FAILED
Receive ANYOTHER, drop and stay at E1
Timeout, increment timeout counter
If counter is less than I1_RETRIES_MAX, send I1 and stay at E1
If counter is greater than I1_RETRIES_MAX, go to E-FAILED
+---------+
| E2 | Waiting to finish HIP
+---------+
Receive I1, send R1 and stay at E2
Receive I2, process
if successful, send R2 and go to E3
if fail, stay at E2
Receive R2, process
if successful, go to E3
if fail, go to E-FAILED
Receive ANYOTHER, drop and stay at E2
Timeout, increment timeout counter
If counter is less than I2_RETRIES_MAX, send I2 and stay at E2
If counter is greater than I2_RETRIES_MAX, go to E-FAILED
+---------+
| E3 | HIP SA established
+---------+
Receive I1, send R1 and stay at E3
Receive I2, process with Birthday check
if successful, send R2, drop old SA and cycle at E3
if fail, stay at E3
Receive R1, process with SA and Birthday check
if successful, send I2, prepare to drop old SA and cycle at E3
if fail, stay at E3
Receive R2, drop and stay at E3
Receive ESP for SA, process and stay at E3
Receive NES, process
if successful, send NES and stay at E3
if failed, stay at E3
Need rekey,
send NES and go to E4
+---------+
| E4 | HIP SA established, rekey pending
+---------+
Receive I1, send R1 and stay at E4
Receive I2, process with Birthday check
if successful, send R2, drop old SA and go to E3
if fail, stay at E4
Receive R1, process with SA and Birthday check
if successful, send I2, prepare to drop old SA and go to E3
if fail, stay at E4
Receive R2, drop and stay at E4
Receive ESP for SA, process and stay at E4
Receive NES, process
if successful, replace SAs and go to E3
if failed, stay at E4
Timeout, increment timeout counter
If counter is less than NES_RETRIES_MAX, send NES and stay at E4
If counter is greater than NES_RETRIES_MAX, go to E-FAILED
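Since the state machine is informational, one way to read the packet-driven transitions of E0, E1 and E2 listed above is as a lookup table. The following sketch is only an illustration: the event names (including the ":ok" and ":fail" suffixes that stand for the outcome of processing a packet) are hypothetical, and timeouts as well as the E3 and E4 states are deliberately left out.

```python
FAILED = "E-FAILED"

# (state, event) -> (action, next state); None means no packet is sent
TRANSITIONS = {
    ("E0", "send"): ("send I1", "E1"),        # datagram to send
    ("E0", "I1"): ("send R1", "E0"),
    ("E0", "I2:ok"): ("send R2", "E3"),
    ("E0", "I2:fail"): (None, "E0"),
    ("E0", "ESP-unknown"): ("send R1", "E0"),  # ESP for unknown SA
    ("E1", "I1"): ("send R1", "E1"),
    ("E1", "I2:ok"): ("send R2", "E3"),
    ("E1", "I2:fail"): (None, "E1"),
    ("E1", "R1:ok"): ("send I2", "E2"),
    ("E1", "R1:fail"): (None, FAILED),
    ("E2", "I1"): ("send R1", "E2"),
    ("E2", "I2:ok"): ("send R2", "E3"),
    ("E2", "I2:fail"): (None, "E2"),
    ("E2", "R2:ok"): (None, "E3"),
    ("E2", "R2:fail"): (None, FAILED),
}

MODELED_STATES = {"E0", "E1", "E2"}


def step(state: str, event: str):
    """Return (action, next_state); anything unlisted is dropped in place."""
    if state not in MODELED_STATES:
        raise ValueError(f"state {state} is not covered by this sketch")
    return TRANSITIONS.get((state, event), ("drop", state))
```

Following an Initiator through the table gives E0, E1, E2, E3 for the events "send", "R1:ok", "R2:ok", while a Responder stays at E0 on I1 and moves directly to E3 on a successfully processed I2.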
Receive packets cause a move to a new state.
+---------+
| E0 |>---+
+---------+ |
| ^ | |
| | | Dgm to |
+-+ | send |
I1 | | (note: ESP- means ESP with unknown SPI)
ESP- | |
v |
+---------+ |
| E1 |>---|----------+
+---------+ | |
| | |
| R1 | |
| |I2 |I2
v | |
+---------+ | |
| E2 |>---|----------|-----+
| | | | |
+---------+ | | |
| | | |
| R2 | | |I2
| | | |
v | | |
+---------+<---+ | |
| | | |
| E3 |<--------------+ |
| |<--------------------+
| |<-------+
| |----+ |
+---------+ | |
| ^ ^ | |
| | | | |
+--+ | | |
ESP, | rekey| |R1,I2
NES, | | |
I1, |NES | |
I2, | | |
R1 | | |
| | |
+---------+ | |
| |<---+ |
| E4 |>-------+
+---------+
| ^
| |
+--+
ESP,
I1,
All HIP packets start with a fixed header.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Next Header | Payload Len | Type | VER. | RES. |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Controls | Checksum |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sender's Host Identity Tag (HIT) |
| |
| |
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Receiver's Host Identity Tag (HIT) |
| |
| |
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
/ HIP Parameters /
/ /
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The HIP header is logically an IPv6 extension header. However, this document does not describe processing for Next Header values other than decimal 59, IPPROTO_NONE, the IPv6 no next header value. Future documents MAY do so. However, implementations MUST ignore trailing data if a Next Header value is received that is not implemented.
The Header Length field (labeled Payload Len in the diagram above) contains the combined length of the HIP Header and the HIP parameters in 8-byte units, excluding the first 8 bytes. Since all HIP headers MUST contain the sender's and receiver's HIT fields, the minimum value for this field is 4. Conversely, because the field is 8 bits wide, the maximum length of the HIP Parameters field is (255*8)-32 = 2008 bytes.
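The length arithmetic can be checked with a few lines of Python. This is a minimal sketch; the function and constant names are illustrative. It also uses the byte-length conversion (field + 1) * 8 that this document gives for the checksum pseudo-header.

```python
HIT_BYTES = 16           # each HIT is 128 bits
LENGTH_FIELD_MAX = 255   # the length field is 8 bits wide


def header_length_field(params_len):
    """Length-field value for a packet carrying params_len bytes of parameters."""
    if params_len % 8 != 0:
        raise ValueError("HIP parameters must occupy a multiple of 8 bytes")
    # The first 8 bytes are excluded; the two HITs and the parameters are counted.
    value = (2 * HIT_BYTES + params_len) // 8
    if value > LENGTH_FIELD_MAX:
        raise ValueError("parameters too long for the 8-bit length field")
    return value


def packet_length_bytes(length_field):
    """Total HIP packet length in bytes for a given length-field value."""
    return (length_field + 1) * 8
```

With no parameters the field is 4 (a 40-byte packet, as in I1), and 2008 bytes of parameters give the maximum value of 255.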
The Packet Type indicates the HIP packet type. The individual packet types are defined in the relevant sections. If a HIP host receives a HIP packet that contains an unknown packet type, it MUST drop the packet.
The HIP Version is four bits. The current version is 1. The version number is expected to be incremented only if there are incompatible changes to the protocol. Most extensions can be handled by defining new packet types, new parameter types, or new controls.
The following four bits are reserved for future use. They MUST be zero when sent, and they SHOULD be ignored when handling a received packet.
The HIT fields are always 128 bits (16 bytes) long.
The HIP control section transfers information about the structure of the packet and capabilities of the host.
The following fields have been defined:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| | | | | | | | | | | | | |C|E|A|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- C - Certificate
- One or more certificate packets (CER) follows this HIP packet (see CER - the HIP Certificate Packet).
- E - ESP sequence numbers
- The ESP transform requires 64-bit sequence numbers. See Sequence Number for processing this control.
- A - Anonymous
- If this is set, the sender's HI in this packet is anonymous, i.e., one not listed in a directory. Anonymous HIs SHOULD NOT be stored. This control is set in packets R1 and/or I2. The peer receiving an anonymous HI may choose to refuse it by silently dropping the exchange.
The rest of the fields are reserved for future use and MUST be set to zero on sent packets and ignored on received packets.
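As an illustration, the fixed header and the three defined control bits could be decoded as follows. This is a sketch only: the tuple and constant names are hypothetical, and the bit masks assume that C, E, and A occupy the three least significant bits of the 16-bit Controls field, as the diagram above suggests.

```python
import struct
from collections import namedtuple

HipHeader = namedtuple(
    "HipHeader",
    "next_header length packet_type version controls checksum "
    "sender_hit receiver_hit",
)

# Assumed masks: A is the least significant bit, then E, then C.
CTRL_A = 0x0001  # Anonymous
CTRL_E = 0x0002  # ESP 64-bit sequence numbers
CTRL_C = 0x0004  # Certificate packet(s) follow

FIXED_HEADER_LEN = 40  # 8 bytes of fields plus two 16-byte HITs


def parse_fixed_header(data):
    """Decode the 40-byte fixed HIP header from the start of `data`."""
    if len(data) < FIXED_HEADER_LEN:
        raise ValueError("truncated HIP header")
    (next_header, length, packet_type, ver_res,
     controls, checksum, sender_hit, receiver_hit) = struct.unpack(
        "!BBBBHH16s16s", data[:FIXED_HEADER_LEN])
    version = ver_res >> 4  # the low 4 bits are reserved and ignored here
    return HipHeader(next_header, length, packet_type, version,
                     controls, checksum, sender_hit, receiver_hit)
```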
The checksum field is located at the same location within the header as the checksum field in UDP packets, enabling hardware assisted checksum generation and verification. Note that since the checksum covers the source and destination addresses in the IP header, it must be recomputed on HIP based NAT boxes.
If IPv6 is used to carry the HIP packet, the pseudo-header [8] contains the source and destination IPv6 addresses, HIP packet length in the pseudo-header length field, a zero field, and the HIP protocol number (TBD) in the Next Header field. The length field is in bytes and can be calculated from the HIP header length field: (HIP Header Length + 1) * 8.
If IPv4 is used, the IPv4 UDP pseudo header format [1] is used. In the pseudo header, the source and destination addresses are those used in the IP header, the zero field is zero, the protocol is the HIP protocol number (TBD), and the length is calculated as in the IPv6 case.
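A sketch of the IPv6 case follows. It assumes the usual 16-bit one's-complement Internet checksum (the same algorithm UDP uses), which this document implies but does not spell out. Because the HIP protocol number is still TBD, it is passed in as the parameter `proto`; all function names are illustrative.

```python
import struct


def internet_checksum(data):
    """16-bit one's-complement checksum of `data` (zero-padded to even length)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                     # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def hip_checksum_ipv6(src_addr, dst_addr, hip_packet, proto):
    """Checksum over the IPv6 pseudo-header and the HIP packet.

    src_addr and dst_addr are 16-byte IPv6 addresses; proto is the
    (not yet assigned) HIP protocol number supplied by the caller.
    """
    length = (hip_packet[1] + 1) * 8       # (HIP Header Length + 1) * 8
    # Pseudo-header: addresses, 32-bit length, 3 zero bytes, Next Header.
    pseudo = src_addr + dst_addr + struct.pack("!I3xB", length, proto)
    # The checksum field (bytes 6-7) is zero while computing.
    zeroed = hip_packet[:6] + b"\x00\x00" + hip_packet[8:length]
    return internet_checksum(pseudo + zeroed)
```

A receiver can verify a packet by running the same sum over the pseudo-header and the packet as received; with a correct checksum in place the result is zero.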
The HIP Parameters are used to carry the public key associated with the sender's HIT, together with other related security information. The HIP Parameters consist of ordered parameters, encoded in TLV format.
The following parameter types are currently defined.
TLV               Type    Length     Data

SPI_LSI           0       16         Remote's SPI, Remote's LSI.
BIRTHDAY_COOKIE   2/4     32         System Boot Counter plus
                                     two 64-bit fields:
                                     - Random #I
                                     - K or random #J
DIFFIE_HELLMAN    6       variable   public key
NES_INFO          10                 Old SPI, New SPI and other
                                     info needed for NES
HIP_TRANSFORM     16      variable   HIP Encryption and Integrity
                                     Transform
ESP_TRANSFORM     18      variable   ESP Encryption and
                                     Authentication Transform
ENCRYPTED         20      variable   Encrypted part of I2 or CER
                                     packets
HOST_ID           32      variable   Host Identity
HOST_ID_FQDN      34      variable   Host Identity with Fully
                                     Qualified Domain Name
CERT              64      variable   HI certificate
HMAC              65500   24         HMAC based message
                                     authentication code, with
                                     key material from HIP_TRANSFORM
HIP_SIGNATURE_2   65532   variable   Signature of the R1 packet
HIP_SIGNATURE     65534   variable   Signature of the packet
The TLV encoded parameters are described in the following subsections. The type-field value also describes the order of these fields in the packet. The parameters MUST be included into the packet so that the types form an increasing order. If the order does not follow this rule, the packet is considered to be malformed and it MUST be discarded.
All the TLV parameters have a length which is a multiple of 8 bytes. When needed, padding MUST be added to the end of the parameter so that the total length becomes a multiple of 8 bytes. This rule ensures proper alignment of data. If padding is added, the Length field MUST NOT include the padding. Any added padding bytes MUST be set to zero by the sender, but their content SHOULD NOT be checked on the receiving end.
Consequently, the Length field indicates the length of the Contents field (in bytes). The total length of the TLV parameter (including Type, Length, Contents, and Padding) is related to the Length field according to the following formula:
Total Length = 11 + Length - (Length + 3) % 8;
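The formula rounds the 4 bytes of Type and Length plus the contents up to the next multiple of 8. A short sketch (function names are illustrative) makes this easy to check:

```python
TLV_HEADER_LEN = 4  # 2 bytes of Type (including the C bit) + 2 bytes of Length


def tlv_total_length(length):
    """Total on-the-wire size of a TLV whose Length field is `length`."""
    return 11 + length - (length + 3) % 8


def tlv_padding(length):
    """Number of padding bytes (0-7) that follow the contents."""
    return tlv_total_length(length) - TLV_HEADER_LEN - length
```

For example, a Length of 12 gives a total of 16 bytes with no padding, while a Length of 5 gives a total of 16 bytes with 7 bytes of padding.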
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type |C| Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
/ Contents /
/ +-+-+-+-+-+-+-+-+
| | Padding |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Type Type code for the parameter
C Critical. Zero if this parameter is critical, and
MUST be recognized by the recipient, one otherwise.
The C bit is considered to be a part of the Type field.
Consequently, critical parameters are always even
and non-critical ones have an odd value.
Length Length of the parameter, in bytes.
Contents Parameter specific, defined by Type
Padding Padding, 0-7 bytes, added if needed
Critical parameters MUST be recognized by the recipient. If a recipient encounters a critical parameter that it does not recognize, it MUST NOT process the packet any further.
Non-critical parameters MAY be safely ignored. If a recipient encounters a non-critical parameter that it does not recognize, it SHOULD proceed as if the parameter was not present in the received packet.
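The ordering and criticality rules can be combined into one parsing loop, sketched below. The function name and the `known_types` argument are illustrative. One assumption is labeled in the code: equal consecutive type codes are accepted, since a CER packet may carry several CERT parameters.

```python
import struct


def parse_tlvs(buf, known_types):
    """Split `buf` (the HIP Parameters area) into (type, contents) pairs.

    Raises ValueError for malformed input or for an unrecognized critical
    parameter; unrecognized non-critical parameters are skipped.
    """
    params = []
    offset, last_type = 0, -1
    while offset < len(buf):
        if len(buf) - offset < 4:
            raise ValueError("truncated TLV header")
        ptype, length = struct.unpack_from("!HH", buf, offset)
        # Assumption: repeated types are allowed, decreasing types are not.
        if ptype < last_type:
            raise ValueError("parameters not in increasing type order")
        total = 11 + length - (length + 3) % 8   # includes padding
        if offset + total > len(buf):
            raise ValueError("TLV runs past end of packet")
        contents = buf[offset + 4:offset + 4 + length]
        if ptype in known_types:
            params.append((ptype, contents))
        elif ptype % 2 == 0:                     # C bit is zero: critical
            raise ValueError("unrecognized critical parameter %d" % ptype)
        # Unrecognized non-critical parameters (odd type) fall through.
        last_type = ptype
        offset += total
    return params
```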
Future specifications may define new parameters as needed. When defining new parameters, care must be taken to ensure that the parameter type values are appropriate and leave suitable space for other future extensions. One must remember that the parameters MUST always be arranged in the increasing order by the type code, thereby limiting the order of parameters.
The following rules must be followed when defining new parameters.
- 0 - 511
- 65024 - 65535
- 32768 - 49141
The SPI_LSI parameter contains the SPI that the receiving host must use when sending data to the sending host, and the LSI that the receiving host must use to represent itself when talking to the sending host.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| SPI |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| LSI |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Type 0
Length 12
Reserved Zero when sent, ignored when received
SPI Security Parameter Index
LSI Local Scope Identifier
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Birthday, 8 bytes |
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Random # I, 8 bytes |
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| K or Random # J, 8 bytes |
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Type 2 (in R1) or 4 (in I2)
Length 28
Reserved Zero when sent, ignored when received
Birthday System boot counter
Random # I random number
K K is the number of verified bits (in R1 packet)
Random # J random number (in I2 packet)
Birthday, Random #I, K, and Random #J are represented as 64-bit integers, in network byte order.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Group ID | Public Value /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/ | padding |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Type 6
Length length in octets, excluding Type, Length, and padding
Group ID defines values for p and g
Public Value the sender's public Diffie-Hellman key
The following Group IDs have been defined:
Group Value
Reserved 0
OAKLEY well known group 1 1
OAKLEY well known group 2 2
1536-bit MODP group 3
2048-bit MODP group 4
3072-bit MODP group 5
4096-bit MODP group 6
6144-bit MODP group 7
8192-bit MODP group 8
The MODP Diffie-Hellman groups are defined in [14]. The OAKLEY groups are defined in [6]. The OAKLEY well known group 5 is the same as the 1536-bit MODP group.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Transform-ID #1 | Transform-ID #2 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Transform-ID #n | Padding |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Type 16
Length length in octets, excluding Type, Length, and padding
Transform-ID Defines the HIP Suite to be used
The Suite-IDs are identical to those defined in ESP_TRANSFORM.
There MUST NOT be more than six (6) HIP Suite-IDs in one HIP transform TLV. The limited number of transforms sets the maximum size of HIP_TRANSFORM TLV. The HIP_TRANSFORM TLV MUST contain at least one of the mandatory Suite-IDs.
Mandatory implementations: ENCR-3DES-CBC with HMAC-SHA1 and ENCR-NULL with HMAC-SHA1.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Suite-ID #1 | Suite-ID #2 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Suite-ID #n | Padding |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Type 18
Length length in octets, excluding Type, Length, and padding
Suite-ID Defines the ESP Suite to be used
The following Suite-IDs are defined ([16],[19]):
Suite-ID Value
RESERVED 0
ESP-AES-CBC with HMAC-SHA1 1
ESP-3DES-CBC with HMAC-SHA1 2
ESP-3DES-CBC with HMAC-MD5 3
ESP-BLOWFISH-CBC with HMAC-SHA1 4
ESP-NULL with HMAC-SHA1 5
ESP-NULL with HMAC-MD5 6
There MUST NOT be more than six (6) ESP Suite-IDs in one ESP_TRANSFORM TLV. The limited number of Suite-IDs sets the maximum size of ESP_TRANSFORM TLV. The ESP_TRANSFORM MUST contain at least one of the mandatory Suite-IDs.
Mandatory implementations: ESP-3DES-CBC with HMAC-SHA1 and ESP-NULL with HMAC-SHA1.
When the host sends a Host Identity to a peer, it MAY send the identity without any verification information or use certificates to prove the HI. If certificates are sent, they are sent in a separate HIP packet (CER).
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Host Identity /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/ | padding |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Type 32
Length length in octets, excluding Type, Length, and padding
Host Identity actual host identity
The Host Identity is represented in RFC2535[9] format. The algorithms used in RDATA format are the following:
Algorithms Values
RESERVED 0
DSA 3 [RFC2536] (REQUIRED)
RSA 5 [RFC3110] (OPTIONAL)
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| HI Length | FQDN Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Host Identity /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/ | FQDN /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/ | Padding |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Type 34
Length length in octets, excluding Type, Length, and padding
HI length length of the Host Identity
FQDN length length of the FQDN
Host Identity actual host identity
FQDN Fully Qualified Domain Name, in the binary format.
The Host Identity is represented in RFC2535[9] format; see above. The format for the FQDN is defined in RFC1035[2] Section 3.1.
If there is no FQDN, the HOST_ID TLV is sent instead.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Cert count | Cert ID | Cert type | /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/ Certificate /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/ | Padding |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Type 64
Length length in octets, excluding Type, Length, and padding
Cert count total count of certificates that are sent, possibly
in several consecutive CER packets
Cert ID the order number for this certificate
Cert Type describes the type of the certificate
The receiver must know the total number (Cert count) of certificates that it will receive from the sender, related to the R1 or I2. The Cert ID identifies the particular certificate and its order in the certificate chain. The numbering in Cert ID MUST go from 1 to Cert count.
The following certificate types are defined:
Cert format Type number
X.509 v3 1
The encoding format for X.509v3 certificate is defined in [11].
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
| HMAC |
| |
| |
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Type 65500
Length 20
HMAC
160 low order bits of the HMAC computed over the HIP
packet, excluding the HMAC parameter and any
following HIP_SIGNATURE or HIP_SIGNATURE2
parameters. The checksum field MUST be set to zero
and the HIP header length in the HIP common header
MUST be calculated not to cover any excluded
parameters when the HMAC is calculated.
HMAC calculation and verification process:
Packet sender:
Packet receiver:
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| SIG alg | Signature /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/ | padding |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Type 65534 (2^16-2)
Length length in octets, excluding Type, Length, and padding
SIG alg Signature algorithm
Signature the signature is calculated over the HIP packet,
excluding the HIP_SIGNATURE TLV field, but including
the HMAC field, if present. The checksum field MUST
be set to zero and the HIP header length in the HIP
common header MUST be calculated to the beginning of
the HIP_SIGNATURE TLV when the signature is
calculated.
Signature calculation and verification process:
Packet sender:
Packet receiver:
The signature algorithms are defined in HOST_ID. The signature in the Signature field is encoded using the proper method depending on the signature algorithm (e.g. in case of DSA, according to [10]).
The verification can use either the HI received from a HIP packet, the HI from a DNS query, if the FQDN has been received either in the HOST_ID_FQDN or in the CER packet, or one received by some other means.
The TLV structure is the same as in HIP_SIGNATURE. The fields are:
Type 65532 (2^16-4)
Length length in octets, excluding Type, Length, and padding
SIG alg Signature algorithm
Signature the signature is calculated over the R1 packet,
excluding the HIP_SIGNATURE_2 TLV field, but
including the HMAC field, if present. Initiator's
HIT and Checksum field MUST be set to zero and the
HIP packet length in the HIP header MUST be
calculated to the beginning of the HIP_SIGNATURE_2
TLV when the signature is calculated.
Zeroing the Initiator's HIT makes it possible to create R1 packets beforehand to minimize the effects of possible DoS attacks.
The signature calculation and verification process follows the process in HIP_SIGNATURE. The only differences are that the HIP_SIGNATURE_2 TLV is used instead of the HIP_SIGNATURE TLV, and that the Initiator's HIT is cleared (set to all zeros) before computing the signature.
The NES payload is used to reset Security Associations. It introduces a new SPI to be used when sending data to the sender of the NES packet. The keys for the new Security Association will be drawn from KEYMAT. If the packet contains a Diffie-Hellman parameter, the KEYMAT is first recomputed before drawing the new keys.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|R| Keymat Index | NES ID |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Old SPI |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| New SPI |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Type 10
Length 12
R One if the NES is a reply to another NES,
otherwise zero.
Keymat Index Index, in bytes, where to continue to draw ESP keys
from KEYMAT. If the packet includes a new
Diffie-Hellman key, the field MUST be zero. Note
that the length of this field limits the amount of
keying material that can be drawn from KEYMAT. If
that amount is exceeded, the NES packet MUST contain
a new Diffie-Hellman key.
NES ID Packet identifier. Used to tie NES packets
into pairs. Initialized to zero and incremented
for each NES.
Old SPI Old SPI for data sent to the source address of
this packet
New SPI New SPI for data sent to the source address of
this packet
Note that the NES_INFO used to include the SPI used in reverse direction, too. However, since NES packets are now always sent in pairs, that is not needed any more. Any middleboxes between the communicating hosts will learn the reverse mappings from the NES reply.
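For illustration, the NES_INFO parameter could be encoded and decoded as below. This is a sketch under stated assumptions: the field widths are read off the diagram and the Length of 12 (a 1-bit R flag, a 15-bit Keymat Index, a 16-bit NES ID, and two 32-bit SPIs), and the function names are hypothetical.

```python
import struct

NES_INFO_TYPE = 10
NES_INFO_LENGTH = 12


def pack_nes_info(reply, keymat_index, nes_id, old_spi, new_spi):
    """Build a NES_INFO TLV (16 bytes, so no padding is needed)."""
    if not 0 <= keymat_index < (1 << 15):
        raise ValueError("Keymat Index does not fit; a new DH key is needed")
    first = (0x8000 if reply else 0) | keymat_index   # R bit is the top bit
    return struct.pack("!HHHHII", NES_INFO_TYPE, NES_INFO_LENGTH,
                       first, nes_id, old_spi, new_spi)


def unpack_nes_info(tlv):
    """Return (reply, keymat_index, nes_id, old_spi, new_spi)."""
    ptype, length, first, nes_id, old_spi, new_spi = struct.unpack(
        "!HHHHII", tlv[:16])
    if ptype != NES_INFO_TYPE or length != NES_INFO_LENGTH:
        raise ValueError("not a NES_INFO parameter")
    return bool(first & 0x8000), first & 0x7FFF, nes_id, old_spi, new_spi
```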
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| IV /
/ /
/ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ /
/ Encrypted data /
/ /
/ +-------------------------------+
/ | Padding |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Type 20
Length length in octets, excluding Type, Length, and padding
Reserved zero when sent, ignored when received
IV Initialization vector, if needed, otherwise nonexisting.
The length of the IV is inferred from the HIP transform.
Encrypted The data is encrypted using an encryption algorithm as
data defined in HIP transform.
Padding Any Padding, if necessary, to make the TLV a multiple
of 8 bytes.
The encrypted data is in TLV format itself. Consequently, the first fields in the contents are Type and Length, allowing the contents to be easily parsed after decryption.
If the encryption algorithm requires the length of the data to be encrypted to be a multiple of the cipher algorithm block size, thereby necessitating padding, and if the encryption algorithm does not specify the padding contents, then an implementation MUST append the TLV parameter that is to be encrypted with an additional padding, so that the length of the resulting cleartext is a multiple of the cipher block size length. Such a padding MUST be constructed as specified in [15] Section 2.4. On the other hand, if the data to be encrypted is already a multiple of the block size, or if the encryption algorithm does specify padding as per [15] Section 2.4, then such additional padding SHOULD NOT be added.
The Length field in the inside, to be encrypted TLV does not include the padding. The Length field in the outside ENCRYPTED TLV is the length of the data after encryption (including the Reserved field, the IV field, and the output from the encryption process specified for that suite, but not any additional external padding). Note that the length of the cipher suite output may be smaller or larger than the length of the data to be encrypted, since the encryption process may compress the data or add additional padding to the data.
The ENCRYPTED payload may contain additional external padding, if the result of encryption, the TLV header and the IV is not a multiple of 8 bytes. The contents of this external padding MUST follow the rules given in Section TLV format.
There are eight basic HIP packets. Four are for the base HIP exchange, one is for rekeying, one is a broadcast for use when there is no IP addressing (e.g., before DHCP exchange), one is used to send certificates, and one is for sending unencrypted data.
Packets consist of the fixed header as described in Payload format, followed by the parameters. The parameter part, in turn, consists of zero or more TLV coded parameters.
In addition to the base packets, other packet types will be defined later in separate specifications. For example, support for mobility and multi-homing is not included in this specification.
Packet representation uses the following operations:
() parameter
x{y} operation x on content y
<x>i x exists i times
[] optional parameter
x | y x or y
In the future, an OPTIONAL upper layer payload MAY follow the HIP header. The payload proto field in the header indicates if there is additional data following the HIP header. The HIP packet, however, MUST NOT be fragmented. This limits the size of the possible additional data in the packet.
The HIP header values for the I1 packet:
Header:
Packet Type = 1
SRC HIT = Initiator's HIT
DST HIT = Responder's HIT, or NULL
IP(HIP())
The I1 packet contains only the fixed HIP header.
Valid control bits: None
The Initiator gets the Responder's HIT either from a DNS lookup of the Responder's FQDN, from some other repository, or from a local table. If the Initiator does not know the Responder's HIT, it may attempt opportunistic mode by using NULL (all zeros) as the Responder's HIT.
Since this packet is so easy to spoof even if it were signed, no attempt is made to add to its generation or processing cost.
Implementations MUST be able to handle a storm of received I1 packets, discarding those with common content that arrive within a small time delta.
The HIP header values for the R1 packet:
Header:
Packet Type = 2
SRC HIT = Responder's HIT
DST HIT = Initiator's HIT
IP ( HIP ( BIRTHDAY_COOKIE,
DIFFIE_HELLMAN,
HIP_TRANSFORM,
ESP_TRANSFORM,
( HOST_ID | HOST_ID_FQDN ),
HIP_SIGNATURE_2 ) )
Valid control bits: C, A
The R1 packet may be followed by one or more CER packets. In this case, the C-bit in the control field MUST be set.
If the responder HI is an anonymous one, the A control MUST be set.
The initiator HIT MUST match the one received in I1. If the R1 is a response to an ESP packet with an unknown SPI, the Initiator HIT SHOULD be zero. If the Responder has multiple HIs, the responder HIT used MUST match Initiator's request. If the Initiator used opportunistic mode, the Responder may select freely among its HIs.
The Birthday is a reboot count used to manage state re-establishment when one peer rebooted or timed out its SA.
The Cookie contains a random # I and the difficulty K. The difficulty K is the number of bits that the Initiator must get zero in the puzzle.
The Diffie-Hellman value is ephemeral, but can be reused over a number of connections. In fact, as a defense against I1 storms, an implementation MAY use the same Diffie-Hellman value for a period of time, for example, 15 minutes. By using a small number of different Cookies for a given Diffie-Hellman value, the R1 packets can be pre-computed and delivered as quickly as I1 packets arrive. A scavenger process should clean up unused DHs and Cookies.
The HIP_TRANSFORM contains the encryption and integrity algorithms supported by the Responder to protect the HI exchange, in the order of preference. All implementations MUST support the 3DES [7] with HMAC-SHA-1-96 [4].
The ESP_TRANSFORM contains the ESP modes supported by the Responder, in the order of preference. All implementations MUST support 3DES [7] with HMAC-SHA-1-96 [4].
The signature is calculated over the whole HIP envelope, after setting the initiator HIT and header checksum temporarily to zero. This allows the Responder to use precomputed R1s. The Initiator SHOULD validate this signature. It SHOULD check that the responder HI received matches with the one expected, if any.
The HIP header values for the I2 packet:
Header:
Packet Type = 3
SRC HIT = Initiator's HIT
DST HIT = Responder's HIT
IP ( HIP ( SPI_LSI,
BIRTHDAY_COOKIE,
DIFFIE_HELLMAN,
HIP_TRANSFORM,
ESP_TRANSFORM,
ENCRYPTED { HOST_ID | HOST_ID_FQDN },
HIP_SIGNATURE ) )
Valid control bits: C, E, A
The HITs used MUST match the ones used previously.
If the initiator HI is an anonymous one, the A control MUST be set.
The Birthday is a reboot count used to manage state re-establishment when one peer rebooted or timed out its SA.
The Cookie contains the random # I from R1 and the computed # J. The low order K bits of the SHA-1(I | ... | J) MUST be zero.
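The puzzle check can be sketched as follows. The hash input between I and J is elided in the text above, so the sketch takes it as an opaque `middle` argument supplied by the caller. Two further assumptions are made: I and J are serialized as 64-bit integers in network byte order, and "low order bits" refers to the digest read as a big-endian integer. All names are illustrative.

```python
import hashlib
import struct


def puzzle_ok(i, middle, j, k):
    """True if the low-order k bits of SHA-1(I | middle | J) are all zero."""
    data = struct.pack("!Q", i) + middle + struct.pack("!Q", j)
    digest = hashlib.sha1(data).digest()
    return int.from_bytes(digest, "big") & ((1 << k) - 1) == 0


def solve_puzzle(i, middle, k, start=0):
    """Initiator side: search for a J that satisfies the puzzle."""
    j = start
    while not puzzle_ok(i, middle, j, k):
        j = (j + 1) & 0xFFFFFFFFFFFFFFFF   # J is a 64-bit value
    return j
```

The Initiator expects about 2^K hash computations to find J, while the Responder verifies the answer with a single hash. This asymmetry is what lets the Responder raise K when under load.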
The Diffie-Hellman value is ephemeral. If precomputed, a scavenger process should clean up unused DHs.
The HIP_TRANSFORM contains the encryption and integrity algorithms selected by the Initiator to protect the HI exchange. All implementations MUST support the 3DES transform [7].
The Initiator's HI is encrypted using the HIP_TRANSFORM encryption algorithm. The keying material is derived from the Diffie-Hellman exchange as defined in HIP KEYMAT.
The ESP_TRANSFORM contains the ESP mode selected by the Initiator. All implementations MUST support 3DES [7] with HMAC-SHA-1-96 [4].
The signature is calculated over the whole HIP envelope. The Responder MUST validate this signature. It MAY use either the HI in the packet or the HI acquired by some other means.
The HIP header values for the R2 packet:
Header:
Packet Type = 4
SRC HIT = Responder's HIT
DST HIT = Initiator's HIT
IP ( HIP ( SPI_LSI, HMAC, HIP_SIGNATURE ) )
Valid control bits: E
The HMAC and signature are calculated over the whole HIP envelope. The Initiator MUST validate both the HMAC and the signature.
The NES packet is MANDATORY.
The NES serves three functions. Firstly, it provides the peer system with a new SPI to use when sending packets. Secondly, it optionally provides a new Diffie-Hellman key to produce new keying material. Thirdly, it provides any intermediate system with the mapping of the old SPI to the new one. This is important to systems like NATs [21] that use SPIs to maintain address translation state.
The NES packet is a HIP packet with NES_INFO and an optional DIFFIE_HELLMAN parameter in the HIP payload. The NES_INFO parameter contains the old and new SPI values. The NES ID carried in NES_INFO and the accompanying HMAC parameter provide DoS and replay protection. Each system must have a NES ID counter, initialized to zero and incremented on each NES.
The HIP header values for the NES packet:
Header:
Packet Type = 5
SRC HIT = Sender's HIT
DST HIT = Recipient's HIT
IP ( HIP ( [ DIFFIE_HELLMAN, ] NES_INFO , HMAC, HIP_SIGNATURE ) )
Valid control bits: None
During the life of an SA established by HIP, one of the hosts may need to reset the Sequence Number to one (to prevent wrapping) and rekey. The reason for rekeying might be an approaching sequence number wrap in ESP, or a local policy on use of a key. Rekeying ends the current SAs and starts new ones on both peers.
NES packets are always used in pairs, one in each direction, with identical NES IDs. If both parties decide to rekey at the same time, the result is four NES packets, two in each direction.
Intermediate systems that use the SPI will have to inspect HIP packets for a NES packet. The packet is signed for the benefit of the intermediate systems. Since intermediate systems may need the new SPI values, the contents of this packet cannot be encrypted.
Processing NES signatures is a potential DoS attack against intermediate systems.
The BOS packet is OPTIONAL.
In some situations, an Initiator may not be able to learn a Responder's information from DNS or another repository. Some examples of this are DHCP and NetBios servers. Thus, a packet is needed to provide information that would otherwise be gleaned from a repository. This HIP packet is either self-signed, in applications like SoHo, or comes from a trust anchor in large private or public deployments. This packet MAY be broadcast in IPv4 or multicast to the all-hosts multicast group in IPv6. The packet MUST NOT be sent more often than once per second. Implementations MAY ignore received BOS packets.
The HIP header values for the BOS packet:
Header:
Packet Type = 7
SRC HIT = Announcer's HIT
DST HIT = NULL
IP ( HIP ( ( HOST_ID | HOST_ID_FQDN ), HIP_SIGNATURE ) )
The BOS packet may be followed by a CER packet if the HI is signed. In this case, the C-bit in the control field MUST be set. If the BOS packet is broadcast or multicast, the following CER packet(s) MUST be broadcast or multicast to the same multicast group and scope, respectively.
Valid control bits: C, A
The CER packet is OPTIONAL.
The optional CER packets carry certificates over the Announcer's HI, issued by a higher level authority known to the Recipient. They give the Recipient an alternative method (besides DNSSEC or PKI) for trusting the Announcer's HI.
The HIP header values for CER packet:
Header:
Packet Type = 8
SRC HIT = Announcer's HIT
DST HIT = Recipient's HIT
IP ( HIP ( { <CERT>i }, HIP_SIGNATURE ) ) or
IP ( HIP ( ENCRYPTED { <CERT>i }, HIP_SIGNATURE ) )
Valid control bits: None
Certificates in the CER packet MAY be encrypted. The encryption algorithm is provided in the HIP transform of the previous (R1 or I2) packet.
The PAYLOAD packet is OPTIONAL.
The HIP header values for the PAYLOAD packet:
Header:
Packet Type = 64
SRC HIT = Sender's HIT
DST HIT = Recipient's HIT
IP ( HIP ( ), payload )
Valid control bits: None
The Payload Proto field in the Header MUST be set to the protocol number of the payload.
The PAYLOAD packet is used to carry data that is not protected by ESP. By using the HIP header we ensure interoperability with NAT and other middle boxes.
Processing rules of the PAYLOAD packet are the following:
- Receiving:
- If there is an existing HIP security association with the given HITs, and the IP addresses match the IP addresses associated with the HITs, pass the packet to the upper layer, tagged with metadata indicating that the packet was NOT integrity or confidentiality protected.
- Sending:
- If the IPsec SPD defines BYPASS for a given destination HIT, send it with the PAYLOAD packet. Otherwise use ESP as specified in the SPD.
Each host is assumed to have a separate HIP protocol implementation that manages the host's HIP associations and handles requests for new ones. Each HIP association is governed by a state machine, with states defined above in HIP State Machine. The HIP implementation can simultaneously maintain HIP associations with more than one host. Furthermore, the HIP implementation may have more than one active HIP association with another host; in this case, HIP associations are distinguished by their respective HITs and IPsec SPIs. It is not possible to have more than one HIP association between any given pair of HITs. Consequently, the only way for two hosts to have more than one parallel association is to use different HITs, at least at one end.
The processing of packets depends on the state of the HIP association(s) with respect to the authenticated or apparent originator of the packet. A HIP implementation determines whether it has an active association with the originator of the packet based on the HITs or the SPI of the packet.
In a HIP host, an application can send application level data using HITs or LSIs as source and destination identifiers. The HITs and LSIs may be specified via a backwards compatible API (see Backwards compatibility API issues) or a completely new API. However, whenever there is such outgoing data, the stack has to protect the data with ESP, and send the resulting datagram using appropriate source and destination IP addresses. Here, we specify the processing rules only for the base case where each host has only a single usable IP address; the multi-address multi-homing case will be specified separately.
If the IPv4 backward compatible APIs and therefore LSIs are supported, it is assumed that the LSIs will be converted into proper HITs somewhere in the stack. The exact location of the conversion is an implementation specific issue and not discussed here. The following conceptual algorithm discusses only HITs, with the assumption that the LSI-to-HIT conversion takes place somewhere.
The following steps define the conceptual processing rules for outgoing datagrams destined to a HIT.
Incoming HIP datagrams arrive as ESP protected packets. In the usual case the receiving host has a corresponding ESP security association, identified by the SPI and destination IP address in the packet. However, if the host has crashed or otherwise lost its HIP state, it may not have such an SA.
The following steps define the conceptual processing rules for incoming ESP protected datagrams targeted to an ESP security association created with HIP.
An implementation may originate a HIP exchange to another host based on a local policy decision, usually triggered by an application datagram, in much the same way that an IPsec IKE key exchange can dynamically create a Security Association. Alternatively, a system may initiate a HIP exchange if it has rebooted or timed out, or otherwise lost its HIP state, as described in Reboot and SA timeout restart of HIP.
The implementation prepares an I1 packet and sends it to the IP address that corresponds to the peer host. The IP address of the peer host may be obtained via conventional mechanisms, such as DNS lookup. The I1 contents are specified in I1 - the HIP initiator packet. The selection of which host identity to use, if a host has more than one to choose from, is typically also a policy decision.
The following steps define the conceptual processing rules for initiating a HIP exchange:
For the sake of minimizing the session establishment latency, an implementation MAY send the same I1 to more than one of the Responder's addresses. However, it MUST NOT send to more than three (3) addresses in parallel. Furthermore, upon timeout, the implementation MUST refrain from sending the same I1 packet to multiple addresses. These limitations are placed in order to avoid congestion of the network, and potential DoS attacks that might happen, e.g., because someone claims to have hundreds or thousands of addresses.
As the Responder is not guaranteed to distinguish the duplicate I1s it receives at several of its addresses (because it avoids storing state when it answers with an R1), the Initiator may receive several duplicate R1s.
The Initiator SHOULD then use the source address of the first received R1 as the destination address for the next I2, and discard subsequent R1s. This strategy seems to be quite good in terms of RTT.
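The address-selection rules above can be sketched as follows. This is a minimal illustration in Python, not part of the protocol definition: the function names are hypothetical, and the round-robin choice of a single address on retransmission is an assumption made for the example (the text only requires that a retransmitted I1 is not sent to multiple addresses).

```python
MAX_PARALLEL_I1 = 3  # upper bound on parallel I1 destinations stated in the text


def initial_i1_targets(responder_addresses):
    """Addresses to which the first I1 may be sent in parallel (at most three)."""
    return list(responder_addresses)[:MAX_PARALLEL_I1]


def retransmit_i1_target(responder_addresses, attempt):
    """After a timeout, pick exactly one address.

    Rotating through the known addresses is an assumption of this sketch.
    """
    addresses = list(responder_addresses)
    return addresses[attempt % len(addresses)]


def i2_destination(r1_source_addresses):
    """Use the source address of the first R1 received; later R1s are discarded."""
    return r1_source_addresses[0]
```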
A host may receive an ICMP Destination Protocol Unreachable message as a response to sending a HIP I1 packet. Such a packet may be an indication that the peer does not support HIP, or it may be an attempt to launch an attack by making the Initiator believe that the Responder does not support HIP.
When a system receives an ICMP Destination Protocol Unreachable message while it is waiting for an R1, it MUST NOT terminate the wait. It MAY continue as if it had not received the ICMP message, and send a few more I1s. Alternatively, it MAY take the ICMP message as a hint that the peer most probably does not support HIP, and return to state E0 earlier than otherwise. However, at minimum, it MUST continue waiting for an R1 for a reasonable time before returning to E0.
An implementation SHOULD reply to an I1 with an R1 packet, unless the implementation is unable or unwilling to set up a HIP association. If the implementation is unable to set up a HIP association, the host SHOULD send an ICMP Destination Protocol Unreachable, Administratively Prohibited, message to the I1 source address. If the implementation is unwilling to set up a HIP association, the host MAY ignore the I1. This latter case may occur during a DoS attack such as an I1 flood.
The implementation MUST be able to handle a storm of received I1 packets, discarding those with common content that arrive within a small time delta.
A spoofed I1 can result in an R1 attack on a system. An R1 sender MUST have a mechanism to rate limit R1s to an address.
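One possible shape for such a rate-limiting mechanism is a fixed-window counter per destination address. The sketch below is only an illustration: the class name, the limit of five R1s, and the one-second window are assumptions, since the text does not prescribe any particular mechanism or values.

```python
class R1RateLimiter:
    """Fixed-window limiter: at most `max_per_window` R1s per address per window."""

    def __init__(self, max_per_window=5, window_seconds=1.0):
        # Both defaults are hypothetical; the draft leaves them to the implementation.
        self.max_per_window = max_per_window
        self.window_seconds = window_seconds
        self.windows = {}  # address -> (window_start, r1_count)

    def allow(self, address, now):
        """Return True if an R1 may be sent to `address` at time `now` (seconds)."""
        start, count = self.windows.get(address, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0          # the old window expired, start a new one
        if count >= self.max_per_window:
            self.windows[address] = (start, count)
            return False                   # over the limit: do not send an R1
        self.windows[address] = (start, count + 1)
        return True
```

A caller would pass a monotonic clock reading, for example `limiter.allow(src_addr, time.monotonic())`. A real implementation would also need to bound the size of the table, since an unbounded per-address table could itself be exhausted by spoofed I1s.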
Under no circumstances does the HIP state machine transition upon sending an R1.
The following steps define the conceptual processing rules for responding to an I1 packet:
All compliant implementations MUST produce R1 packets. An R1 packet MAY be precomputed. An R1 packet MAY be reused for time Delta T, which is implementation dependent. R1 information MUST NOT be discarded until Delta S after T. Time S is the delay needed for the last I2 to arrive back to the responder.
An implementation MAY keep state about received I1s and match the received I2s against the state, as discussed in HIP Cookie Mechanism.
A system receiving an R1 MUST first check to see if it has sent an I1 to the originator of the R1 (i.e., it is in state E1). If so, it SHOULD process the R1 as described below, send an I2, and go to state E2, setting a timer to protect the I2. If the system is in state E0 or state E2 with respect to that host, it SHOULD silently drop the R1.
If the system is in state E3/E4, it SHOULD process with a Security Association and Birthday check as described in Reboot and SA timeout restart of HIP, before further processing. In this last case, if the R1 is successfully processed, the system sends an I2, sets a retransmit timer to protect the I2, prepares to replace its old Security Associations with the newly generated ones upon receiving a matching R2, and goes to state E3. Note that if the system was in state E4, it stops the rekey attempt and goes to state E3.
The following steps define the conceptual processing rules for responding to an R1 packet:
Upon receipt of an I2, the system MAY perform initial checks to determine whether the I2 corresponds to a recent R1 that has been sent out, if the Responder keeps such state. For example, the sender could check whether the I2 is from an address or HIT that has recently received an R1 from it. If the I2 is considered to be suspect, it MAY be silently discarded by the system.
Otherwise, the HIP implementation SHOULD process the I2. This includes validation of the cookie puzzle solution, generating the Diffie-Hellman key, decrypting the Initiator's Host Identity, verifying the signature, creating state, and finally sending an R2.
The following steps define the conceptual processing rules for responding to an I2 packet:
An R2 received in states E0, E1, E3 or E4 results in the R2 being dropped and the state machine staying in the same state. If an R2 is received in state E2, it SHOULD be processed.
The following steps define the conceptual processing rules for responding to an R2 packet:
A system may initiate the rekey procedure at any time. It MUST initiate a rekey if its incoming ESP sequence counter is about to overflow.
The following steps define the conceptual processing rules for initiating a rekey:
When a system receives a NES packet, its processing depends on whether the packet is a reply to a previously sent NES packet or the NES is a new packet.
The following steps define the conceptual processing rules for handling a received NES packet:
When a system receives an initial NES packet, i.e. one that does not have the R-bit set, it prepares new incoming and outgoing SAs, but does not change the outgoing SA yet. Once it has the new SAs in place, it sends a reply NES. The contents of the reply NES depend on whether the system was in state E3 or E4 upon receiving the initial NES packet.
The following steps define the conceptual processing rules for handling a received initial NES packet:
When a system receives a reply NES packet, i.e. one that has the R-bit set, it starts to use the new outgoing SA. It must also complete its new incoming SA.
The following steps define the conceptual processing rules for handling a received reply NES packet:
Processing BOS packets is OPTIONAL, and currently undefined.
Processing CER packets is OPTIONAL, and currently undefined.
Processing PAYLOAD packets is OPTIONAL, and currently undefined.
HIP keying material is derived from the Diffie-Hellman Kij produced during the base HIP exchange. The Initiator has Kij during the creation of the I2 packet, and the Responder has Kij once it receives the I2 packet. This is why I2 can already contain encrypted information.
The KEYMAT is derived by feeding Kij and the HITs into the following operation; the | operation denotes concatenation.
KEYMAT = K1 | K2 | K3 | ...
where
K1 = SHA-1( Kij | sort(HIT-I | HIT-R) | 0x01 )
K2 = SHA-1( Kij | K1 | 0x02 )
K3 = SHA-1( Kij | K2 | 0x03 )
...
K255 = SHA-1( Kij | K254 | 0xff )
K256 = SHA-1( Kij | K255 | 0x00 )
etc.
Sort(HIT-I | HIT-R) is defined as the numeric network byte order comparison of the HITs, with lower HIT preceding higher HIT, resulting in the concatenation of the HITs in the said order. The initial keys are drawn sequentially in the following order:
Initiator HIP encryption key
Initiator HIP integrity (HMAC) key
Responder HIP encryption key (currently unused)
Responder HIP integrity (HMAC) key
Initiator ESP encryption key
Initiator ESP authentication key
Responder ESP encryption key
Responder ESP authentication key
The number of bits drawn for a given algorithm is the "natural" size of the keys. For the mandatory algorithms, the following sizes apply:
- 3DES
- 192 bits
- SHA-1
- 160 bits
- NULL
- 0 bits
Subsequent rekeys without Diffie-Hellman just require drawing out more sets of ESP keys. In the situation where Kij is the result of a HIP rekey exchange with Diffie-Hellman, there is only the need for one set of ESP keys, without the HIP keys. These are then the only keys taken from the KEYMAT.
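The KEYMAT construction and the sequential key drawing can be illustrated with a short Python sketch. It is not normative. It assumes that Kij and the HITs are already available as byte strings (how the Diffie-Hellman result is serialized is not covered here), and the function names `sort_hits`, `keymat` and `draw_keys` are hypothetical.

```python
import hashlib


def sort_hits(hit_i: bytes, hit_r: bytes) -> bytes:
    """Concatenate the two HITs, numerically lower HIT first.

    For equal-length byte strings, bytewise comparison equals numeric
    comparison in network byte order.
    """
    return hit_i + hit_r if hit_i <= hit_r else hit_r + hit_i


def keymat(kij: bytes, hit_i: bytes, hit_r: bytes, nbytes: int) -> bytes:
    """Return the first `nbytes` octets of KEYMAT = K1 | K2 | K3 | ..."""
    out = b""
    prev = sort_hits(hit_i, hit_r)      # K1 hashes the sorted HITs
    counter = 1
    while len(out) < nbytes:
        prev = hashlib.sha1(kij + prev + bytes([counter])).digest()
        out += prev                     # Kn feeds into K(n+1)
        counter = (counter + 1) % 256   # 0xff is followed by 0x00
    return out[:nbytes]


def draw_keys(material: bytes, sizes_bits):
    """Cut consecutive keys of the given bit sizes out of the keying material."""
    keys, offset = [], 0
    for bits in sizes_bits:
        length = bits // 8
        keys.append(material[offset:offset + length])
        offset += length
    return keys
```

With 3DES (192 bits) and SHA-1 (160 bits) for both HIP and ESP, the eight initial keys consume 4 x (192 + 160) = 1408 bits, so `draw_keys(keymat(kij, hit_i, hit_r, 176), [192, 160] * 4)` would yield them in the order listed above.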
A HIP implementation must support IP fragmentation / reassembly. Fragment reassembly MUST be implemented in both IPv4 and IPv6, but fragment generation MUST be implemented only in IPv4 (IPv4 stacks and networks will usually do this by default) and SHOULD be implemented in IPv6. In the IPv6 world, the minimum MTU is larger, 1280 bytes, than in the IPv4 world. The larger MTU size is usually sufficient for most HIP packets, and therefore fragment generation may not be needed. If a host expects to send HIP packets that are larger than the minimum IPv6 MTU, it MUST implement fragment generation even for IPv6.
In the IPv4 world, HIP packets may encounter low MTUs along their routed path. Since HIP does not provide a mechanism to use multiple IP datagrams for a single HIP packet, support of path MTU discovery does not bring any value to HIP in the IPv4 world. HIP aware NAT systems MUST perform any IPv4 reassembly/fragmentation.
All HIP implementations MUST employ a reassembly algorithm that is sufficiently resistant against DoS attacks.
XXX: Since HIP is designed for host usage, not for gateways, only ESP transport mode is supported with HIP. The SA is not bound to an IP address; all internal control of the SA is by the HIT and LSI. XXX BEET mode.
Since HIP does not negotiate any lifetimes, all lifetimes are local policy. The only lifetimes a HIP implementation MUST support are sequence number rollover (for replay protection), and SA timeout. An SA times out if no packets are received using that SA. The default timeout value is 15 minutes. Implementations MAY support lifetimes for the various ESP transforms.
An SA pair is indexed by the 2 SPIs and 2 HITs (both HITs since a system can have more than one HIT). An inactivity timer is recommended for all SAs. If the state dictates the deletion of an SA, a timer is set to allow for any late arriving packets.
The SPIs in ESP provide a simple compression of the HIP data from all packets after the HIP exchange. This does require a per HIT-pair Security Association (and SPI), and a decrease of policy granularity over other Key Management Protocols like IKE.
When a host rekeys, it gets a new SPI from its partner.
All HIP implementations MUST support 3DES [7] and HMAC-SHA-1-96 [4]. If the Initiator does not support any of the transforms offered by the Responder in the R1 HIP packet, it MUST use 3DES and HMAC-SHA-1-96 and state so in the I2 HIP packet.
In addition to 3DES, all implementations MUST implement the ESP NULL encryption and authentication algorithms. These algorithms are provided mainly for debugging purposes, and SHOULD NOT be used in production environments. The default configuration in implementations MUST be to reject NULL encryption or authentication.
The Sequence Number field is MANDATORY in ESP. Anti-replay protection MUST be used in an ESP SA established with HIP.
This means that each host MUST rekey before its sequence number reaches 2^32, or if extended sequence numbers are used, 2^64. Note that in HIP rekeying, unlike IKE rekeying, only one Diffie-Hellman key can be changed, that of the rekeying host. However, if one host rekeys, the other host SHOULD rekey as well.
In some instances, a 32 bit sequence number is inadequate. In either the I2 or R2 packets, a peer MAY require that a 64 bit sequence number be used. In this case the higher 32 bits are NOT included in the ESP header, but are simply kept local to both peers. 64 bit sequence numbers must only be used for ciphers that will not be open to cryptanalysis as a result. AES is one such cipher.
There are a number of variables that will influence the HIP exchanges that each host must support. All HIP implementations MUST support more than one simultaneous HI, at least one of which SHOULD be reserved for anonymous usage. Although anonymous HIs will be rarely used as responder HIs, they will be common for Initiators. Support for more than two HIs is RECOMMENDED.
Many Initiators would want to use a different HI for different Responders. The implementations SHOULD provide for an ACL of initiator HIT to responder HIT. This ACL SHOULD also include preferred transform and local lifetimes. For HITs with HAAs, wildcarding SHOULD be supported. Thus if a Community of Interest, like Banking, gets an RAA, a single ACL could be used. A global wildcard would represent the general policy to be used. Policy selection would be from most specific to most general.
The value of K used in the HIP R1 packet can also vary by policy. K should never be greater than 20, but for trusted partners it could be as low as 0.
Responders would need a similar ACL, representing the hosts from which they accept HIP exchanges, and the preferred transform and local lifetimes. Wildcarding SHOULD be supported for this ACL also.
HIP is designed to provide secure authentication of hosts and to provide a fast key exchange for IPsec ESP. HIP also attempts to limit the exposure of the host to various denial-of-service and man-in-the-middle attacks. In so doing, HIP itself is subject to its own DoS and MitM attacks that potentially could be more damaging to a host's ability to conduct business as usual.
HIP enabled ESP is IP address independent. This might seem to make it easier for an attacker, but ESP with replay protection is already as well protected as possible, and the removal of the IP address as a check should not increase the exposure of ESP to DoS attacks. Furthermore, this is in line with the forthcoming revision of ESP.
Denial-of-service attacks take advantage of the cost of start of state for a protocol on the Responder compared to the 'cheapness' on the Initiator. HIP makes no attempt to increase the cost of the start of state on the Initiator, but makes an effort to reduce the cost to the Responder. This is done by having the Responder start the 3-way cookie exchange instead of the Initiator, making the HIP protocol 4 packets long. In doing this, packet 2 becomes a 'stock' packet that the Responder MAY use many times. The duration of use is a paranoia versus throughput concern. Using the same Diffie-Hellman values and random puzzle I has some risk. This risk needs to be balanced against a potential storm of HIP I1 packets.
This shifting of the start of state cost to the Initiator in creating the I2 HIP packet, presents another DoS attack. The attacker spoofs the I1 HIP packet and the Responder sends out the R1 HIP packet. This could conceivably tie up the 'initiator' with evaluating the R1 HIP packet, and creating the I2 HIP packet. The defense against this attack is to simply ignore any R1 packet where a corresponding I1 or ESP data was not sent.
A second form of DoS attack arrives in the I2 HIP packet. Once the attacking Initiator has solved the cookie challenge, it can send packets with spoofed IP source addresses with either an invalid encrypted HIP payload component or a bad HIP signature. This would consume resources on the Responder's part before it discovers that the I2 packet cannot be completely processed. The defense against this attack is that, after N bad I2 packets, the Responder discards any I2s that contain the given Initiator HIT. This will shut down the attack. The attacker would have to request another R1 and use that to launch a new attack. The Responder could up the value of K while under attack. On the downside, valid I2s might get dropped too.
A third form of DoS attack is emulating the restart of state after a reboot of one of the partners. To protect from such an attack, a system Birthday is included in the R1 and I2 packets to prove loss of state to a peer. The inclusion of the Birthday creates a very deterministic process for state restart. Any other action is a DoS attack.
A fourth form of DoS attack is emulating the end of state. HIP has no end of state packet. It relies on a local policy timer to end state.
Man-in-the-middle attacks are difficult to defend against, without third-party authentication. A skillful MitM could easily handle all parts of HIP; but HIP indirectly provides the following protection from a MitM attack. If the Responder's HI is retrieved from a signed DNS zone, a certificate, or through some other secure means, the Initiator can use this to validate the R1 HIP packet.
Likewise, if the Initiator's HI is in a secure DNS zone, a trusted certificate, or otherwise securely available, the Responder can retrieve it after it gets the I2 HIP packet and validate that. However, since an Initiator may choose to use an anonymous HI, it knowingly risks a MitM attack. The Responder may choose not to accept a HIP exchange with an anonymous Initiator.
Since not all hosts will ever support HIP, ICMP 'Destination Protocol Unreachable' messages are to be expected and present a DoS attack. Against an Initiator, the attack would look like the Responder does not support HIP, but shortly after receiving the ICMP message, the Initiator would receive a valid R1 HIP packet. Thus to protect from this attack, an Initiator should not react to an ICMP message until a reasonable time has passed for the real Responder's R1 HIP packet to arrive. A similar attack against the Responder is more involved. First an ICMP message is expected if the I1 was a DoS attack and the real owner of the spoofed IP address does not support HIP. The Responder SHOULD NOT act on this ICMP message to remove the minimal state from the R1 HIP packet (if it has one), but wait for either a valid I2 HIP packet or the natural timeout of the R1 HIP packet. This is to allow for a sophisticated attacker that is trying to break up the HIP exchange. Likewise, the Initiator should ignore any ICMP message while waiting for an R2 HIP packet, deleting state only after a natural timeout.
IANA has assigned IP Protocol number TBD to HIP.
The drive to create HIP came into being after attending the MALLOC meeting at IETF 43. Baiju Patel and Hilarie Orman really gave the original author, Bob Moskowitz, the assist to get HIP beyond 5 paragraphs of ideas. It has matured considerably since the early drafts thanks to extensive input from IETFers. Most importantly, its design goals are articulated and are different from other efforts in this direction. Particular mention goes to the members of the NameSpace Research Group of the IRTF. Noel Chiappa provided the framework for LSIs and Keith Moore the impetus to provide resolvability. Steve Deering provided encouragement to keep working, as a solid proposal can act as a proof of ideas for a research group.
Many others contributed; extensive security tips were provided by Steve Bellovin. Rob Austein kept the DNS parts on track. Paul Kocher taught Bob Moskowitz how to make the cookie exchange expensive for the Initiator to respond, but easy for the Responder to validate. Bill Sommerfeld supplied the Birthday concept to simplify reboot management. Rodney Thayer and Hugh Daniels provided extensive feedback. In the early times of this draft, John Gilmore kept Bob Moskowitz challenged to provide something of value.
During the later stages of this document, when the editing baton was transferred to Pekka Nikander, the input from the early implementors was invaluable. Without having actual implementations, this document would not be on the level it is now.
In the usual IETF fashion, a large number of people have contributed to the actual text or ideas. The list of these people includes Jeff Ahrenholz, Francis Dupont, Derek Fawcus, George Gross, Andrew McGregor, Julien Laganier, Miika Komu, Mika Kousa, Jan Melen, Henrik Petander, Michael Richardson, Tim Shepard, and Jukka Ylitalo. Our apologies to anyone whose name is missing.
| [19] | Bellovin, S. and W. Aiello, "Just Fast Keying (JFK)", draft-ietf-ipsec-jfk-04 (work in progress), July 2002. |
| [20] | Moskowitz, R. and P. Nikander, "Using Domain Name System (DNS) with Host Identity Protocol (HIP)", draft-nikander-hip-dns-00 (to be issued) (work in progress), June 2003. |
| [21] | Nikander, P., "SPI assisted NAT traversal (SPINAT) with Host Identity Protocol (HIP)", draft-nikander-hip-nat-00 (to be issued) (work in progress), June 2003. |
| [22] | Crosby, SA. and DS. Wallach, "Denial of Service via Algorithmic Complexity Attacks", in Proceedings of Usenix Security Symposium 2003, Washington, DC., August 2003. |
| Robert Moskowitz | |
| ICSAlabs, a Division of TruSecure Corporation | |
| 1000 Bent Creek Blvd, Suite 200 | |
| Mechanicsburg, PA | |
| USA | |
| EMail: | rgm@icsalabs.com |
| Pekka Nikander | |
| Ericsson Research Nomadiclab | |
| JORVAS FIN-02420 | |
| FINLAND | |
| Phone: | +358 9 299 1 |
| EMail: | pekka.nikander@nomadiclab.com |
| Petri Jokela | |
| Ericsson Research Nomadiclab | |
| JORVAS FIN-02420 | |
| FINLAND | |
| Phone: | +358 9 299 1 |
| EMail: | petri.jokela@nomadiclab.com |
| Thomas R. Henderson | |
| The Boeing Company | |
| P.O. Box 3707 | |
| Seattle, WA | |
| USA | |
| EMail: | thomas.r.henderson@boeing.com |
Tom Henderson has several times expressed the thought that the LSI could be completely local and does not need to be exchanged. Applications could continue to use IP addresses in socket calls, and the kernel does whatever NATting (including application NATting) is required. It was pointed out that this approach was going to be prone to some kinds of data flows escaping the HIP protection, unless the local housekeeping in an implementation was especially good. Example: FTP opens a control connection to an IP address. One or both parties move. FTP later opens a data connection to the old IP address. The kernel must identify that the application really means to connect to the host that was previously at that IP address -- but obviously if the old address is reused by another host, this becomes difficult.
Related to this, the discussion also opened up the question of DNS resolution. Should the HIT/LSI be returned to applications as a (spoofed) address in the resolution process, allowing apps to use the socket API with HIT or LSI values instead of an IP address? While this seems to be the original intention of LSIs, there are a couple of difficulties especially in the IPv4 case:
How does the kernel know whether the value being passed in a socket call is an IP address or an LSI? The fact that a name resolver library gave an application an LSI is no guarantee that the application will use that information in its socket call. It may also have cached some IP address from before or received an IP address as side information. This difficulty is now relieved as the LSIs are constrained to the well-known private subnet space.
Handing an LSI may confuse legacy applications that assume that what is being passed to them is an IP address. Good examples of this are diagnostic tools such as dig and ping. The conclusion is that HIP should most likely not be used with diagnostic applications.
What does the kernel do with an LSI that it cannot map to an address based on information that it has locally cached?
It seems that some modification to the resolver library (to explicitly convey HIP information rather than spoofing IP addresses), as well as modifications to socket API to explicitly let the kernel know that the application is HIP aware, are the cleanest long-term solution, but what to do about legacy applications?? -- still partially an open issue. The HUT team has been considering these problems.
In summary, there seem to be two schools of thought, and their approaches can be summarized as follows:
One way around might be to have a separate resolver call that returns an LSI, or enhance the data structure returned by resolver to include LSI in addition to IP address, but this then throws the burden on applications to be HIP-aware.
The birthday paradox sets a bound for the expectation of collisions. It is based on the square root of the number of values. A 64-bit hash, then, would put the chances of a collision at 50-50 with 2^32 hosts (4 billion). A 1% chance of collision would occur in a population of 640M and a .001% collision chance in a 20M population. A 128 bit hash will have the same .001% collision chance in a 9x10^16 population.
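These figures can be roughly reproduced with the standard birthday approximation, under which the collision probability among n random values drawn from N possibilities is about 1 - exp(-n^2 / (2N)). The Python sketch below solves that approximation for n; the function name is hypothetical, and the approximation itself is an assumption of the example rather than something stated in this memo. Its results agree with the rounded numbers above in order of magnitude, not digit for digit.

```python
import math


def population_for_collision(hash_bits: int, probability: float) -> float:
    """Approximate population size at which a collision occurs with `probability`.

    Solves p = 1 - exp(-n^2 / (2N)) for n, with N = 2^hash_bits.
    """
    n_values = 2.0 ** hash_bits
    return math.sqrt(2.0 * n_values * -math.log1p(-probability))


for bits, p in [(64, 0.5), (64, 0.01), (64, 0.00001), (128, 0.00001)]:
    print(f"{bits}-bit hash, p={p}: about {population_for_collision(bits, p):.3g} hosts")
```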
A question: Is it guaranteed that the Initiator is able to solve the puzzle in this way when the K value is large?
Answer: No, it is not guaranteed. But it is not guaranteed even in the old mechanism, since the Initiator may start far away from J and arrive at J after far too many steps. If we wanted to make sure that the Initiator finds a value, we would need to give some hint of a suitable J, and I don't think we want to do that.
In general, if we model the hash function with a random function, the probability that one iteration gives a result with K zero bits is 2^-K. Thus, the probability that one iteration does not give K zero bits is (1 - 2^-K). Consequently, the probability that 2^K iterations do not give K zero bits is (1 - 2^-K)^(2^K).
Since my calculus starts to be rusty, I made a small experiment and found out that
lim (1 - 2^-k)^(2^k) = 0.36788
k->inf
lim (1 - 2^-k)^(2^(k+1)) = 0.13534
k->inf
lim (1 - 2^-k)^(2^(k+2)) = 0.01832
k->inf
lim (1 - 2^-k)^(2^(k+3)) = 0.000335
k->inf
Thus, if hash functions were random functions, we would need about 2^(K+3) iterations to make sure that the probability of a failure is less than 1% (actually less than 0.04%). Now, since my perhaps flawed understanding of hash functions is that they are "flatter" than random functions, 2^(K+3) is probably an overkill. OTOH, the currently suggested 2^K is clearly too little. The draft has been changed to read 2^(K+2).
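The experiment can be repeated with a few lines of Python. This is only a sketch under the same random-function model used above; the function name is hypothetical. For large K the values approach e^-1, e^-2, e^-4 and e^-8, which match the four limits listed.

```python
import math


def failure_probability(k: int, extra: int) -> float:
    """Probability that 2^(k + extra) independent tries all fail to give K zero bits."""
    return (1.0 - 2.0 ** -k) ** (2.0 ** (k + extra))


for extra in range(4):
    observed = failure_probability(20, extra)
    limit = math.exp(-(2 ** extra))
    print(f"2^(K+{extra}) iterations: {observed:.6f} (limit {limit:.6f})")
```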
As mentioned in HIP Cookie Mechanism, the Responder may delay state creation and still reject most spoofed I2s by using a number of pre-calculated R1s and a local selection function. This appendix defines one possible implementation in detail. The purpose of this appendix is to give implementors an idea of how to implement the mechanism. The method described in this appendix SHOULD NOT be used in any real implementation. If the implementation is based on this appendix, it SHOULD contain some local modification that makes an attacker's task harder.
The basic idea is to create a cheap, varying local mapping function f:
f( IP-I, IP-R, HIT-I, HIT-R ) -> cookie-index
That is, given the Initiator's and Responder's IP addresses and HITs, the function returns an index to a cookie. When processing an I1, the cookie is embedded in a pre-computed R1, and the Responder simply sends that particular R1 to the Initiator. When processing an I2, the cookie may still be embedded in the R1, or the R1 may be deprecated (and replaced with a new one), but the cookie is still there. If the received cookie does not match with the R1 or saved cookie, the I2 is simply dropped. That prevents the Initiator from generating spoofed I2s with a probability that depends on the number of pre-computed R1s.
As a concrete example, let us assume that the Responder has an array of R1s. Each slot in the array contains a timestamp, an R1, and an old cookie that was sent in the previous R1 that occupied that particular slot. The Responder replaces one R1 in the array every few minutes, thereby replacing all the R1s gradually.
To create a varying mapping function, the Responder generates a random number every few minutes. The octets in the IP addresses and HITs are XORed together, and finally the result is XORed with the random number. Using pseudo-code, the function looks like the following.
Pre-computation:
r1 := random number
Index computation:
index := r1 XOR hit_r[0] XOR hit_r[1] XOR ... XOR hit_r[15]
index := index XOR hit_i[0] XOR hit_i[1] XOR ... XOR hit_i[15]
index := index XOR ip_r[0] XOR ip_r[1] XOR ... XOR ip_r[15]
index := index XOR ip_i[0] XOR ip_i[1] XOR ... XOR ip_i[15]
The index gives the slot used in the array.
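The index computation above might look like the following in Python. It is a sketch under stated assumptions: all four inputs are 16 octets long (as the pseudo-code implies), the random number is a single octet, and the final reduction modulo the array size is an addition made here so that arrays smaller than 256 slots can be used. The function names are hypothetical.

```python
import secrets
from functools import reduce
from operator import xor


def new_random_octet() -> int:
    """Pre-computation: draw the random number r1 (refreshed every few minutes)."""
    return secrets.randbelow(256)


def compute_index(r1: int, hit_i: bytes, hit_r: bytes,
                  ip_i: bytes, ip_r: bytes, slots: int = 256) -> int:
    """XOR every octet of both HITs and both IP addresses into r1."""
    for field in (hit_i, hit_r, ip_i, ip_r):
        if len(field) != 16:
            raise ValueError("expected 16-octet HITs and IP addresses")
    folded = reduce(xor, hit_r + hit_i + ip_r + ip_i, r1)
    # Assumption: map the folded value onto the array with a modulo.
    return folded % slots
```

Because XOR is its own inverse, changing any single octet of an address or HIT changes the folded value, which is the property the scheme relies on to detect spoofed addresses cheaply.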
It is possible that an Initiator receives an R1, and while it is computing the I2, the Responder deprecates that R1 and/or chooses a new random number for the mapping function. Therefore the Responder must remember the cookies used in deprecated R1s and the previous random number.
To check a received I2, the Responder can use a simple algorithm, expressed in pseudo-code as follows.
If I2.hit_r does not match my_hits, drop the packet.
index := compute_index(current_random_number, I2)
If current_cookie[index] == I2.cookie, go to cookie_check.
If previous_cookie[index] == I2.cookie, go to cookie_check.
index := compute_index(previous_random_number, I2)
If current_cookie[index] == I2.cookie, go to cookie_check.
If previous_cookie[index] == I2.cookie, go to cookie_check.
Drop the packet.
cookie_check:
V := Ltrunc( SHA-1( I2.I, I2.hit_i, I2.hit_r, I2.J ), K )
If V != 0, drop the packet.
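A runnable rendering of the I2 check is sketched below. Several details are assumptions made for the sake of a self-contained example: the cookie compared against the array is taken to be the value I, both I and J are encoded as 8-octet big-endian integers, the SHA-1 input is the plain concatenation of the four fields, and Ltrunc is taken to mean the K lowest-order bits of the digest. The names I2Packet, puzzle_solved and accept_i2 are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from functools import reduce
from operator import xor
from typing import Sequence, Set


@dataclass
class I2Packet:
    hit_i: bytes
    hit_r: bytes
    ip_i: bytes
    ip_r: bytes
    cookie_i: int    # the value I taken from the R1
    solution_j: int  # the value J computed by the Initiator


def compute_index(random_number: int, i2: I2Packet, slots: int) -> int:
    octets = i2.hit_r + i2.hit_i + i2.ip_r + i2.ip_i
    return reduce(xor, octets, random_number) % slots


def puzzle_solved(i2: I2Packet, k: int) -> bool:
    """Assumed Ltrunc: the K lowest-order bits of the SHA-1 digest must be zero."""
    data = (i2.cookie_i.to_bytes(8, "big") + i2.hit_i + i2.hit_r
            + i2.solution_j.to_bytes(8, "big"))
    digest = int.from_bytes(hashlib.sha1(data).digest(), "big")
    return digest & ((1 << k) - 1) == 0


def accept_i2(i2: I2Packet, my_hits: Set[bytes],
              current_cookie: Sequence[int], previous_cookie: Sequence[int],
              current_random: int, previous_random: int, k: int) -> bool:
    if i2.hit_r not in my_hits:
        return False  # not addressed to one of our HITs: drop
    # Cheap index check first, with the current and then the previous random number.
    for random_number in (current_random, previous_random):
        index = compute_index(random_number, i2, len(current_cookie))
        if i2.cookie_i in (current_cookie[index], previous_cookie[index]):
            return puzzle_solved(i2, k)  # only now pay for SHA-1
    return False  # failed the index check: drop
```

The SHA-1 computation is reached only after one of the four cheap cookie comparisons succeeds, which mirrors the ordering of the pseudo-code.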
Whenever the Responder receives an I2 that fails on the index check, it can simply drop the packet on the floor and forget about it. New I2s with the same or other spoofed parameters will get dropped with a reasonable probability and minimal effort.
If a Responder receives an I2 that passes the index check but fails the puzzle check, it should create a state indicating this. After two or three failures the Responder should stop checking the puzzle and drop the packets directly. This saves the Responder from the SHA-1 calculations. Such a block should not last long, however, or there would be a danger that a legitimate Initiator could be blocked from getting connections.
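One way to keep such short-lived failure state is sketched below. The class name, the choice of what identifies a peer (for example the source address or HIT-I), the threshold of three failures and the 30-second block are all assumptions picked for illustration; the text above only asks for "two or three failures" and a block that does not last long.

```python
import time
from typing import Callable, Dict, Hashable


class PuzzleFailureTracker:
    def __init__(self, max_failures: int = 3, block_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self.max_failures = max_failures
        self.block_seconds = block_seconds
        self.clock = clock
        self._failures: Dict[Hashable, int] = {}         # peer -> failure count
        self._blocked_until: Dict[Hashable, float] = {}  # peer -> block expiry

    def is_blocked(self, peer: Hashable) -> bool:
        """True while I2s from this peer should be dropped without SHA-1."""
        expiry = self._blocked_until.get(peer)
        if expiry is None:
            return False
        if self.clock() >= expiry:
            del self._blocked_until[peer]  # block expired: check puzzles again
            return False
        return True

    def record_failure(self, peer: Hashable) -> None:
        """Call when an I2 passes the index check but fails the puzzle check."""
        count = self._failures.get(peer, 0) + 1
        if count >= self.max_failures:
            self._failures.pop(peer, None)
            self._blocked_until[peer] = self.clock() + self.block_seconds
        else:
            self._failures[peer] = count
```

A Responder would consult is_blocked() before the puzzle check and call record_failure() after a failed one; the expiry keeps a legitimate Initiator from being locked out for long.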
A key for the success of the defined scheme is that the mapping function must be considerably cheaper than computing SHA-1. It also must detect any changes in the IP addresses, and preferably most changes in the HITs. Checking the HITs is not that essential, though, since HITs are included in the cookie computation, too.
The effectiveness of the method can be tuned by varying the size of the array containing pre-computed R1s. If the array is large, the probability that an I2 with a spoofed IP address or HIT happens to map to the same slot is fairly low. However, a large array means that each R1 has a fairly long lifetime, thereby allowing an attacker to utilize one solved puzzle for a longer time.
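The trade-off can be made concrete with a little arithmetic. The sketch below assumes that spoofed address/HIT combinations map uniformly onto the slots and that exactly one R1 is replaced per interval in round-robin order; neither assumption is stated in this appendix, and the function name is hypothetical.

```python
from typing import Tuple


def array_tradeoff(slots: int, replace_interval_s: float) -> Tuple[float, float]:
    """Return (chance a spoofed I2 hits the same slot, R1 lifetime in seconds)."""
    collision_probability = 1.0 / slots       # assumes uniform mapping
    r1_lifetime_s = slots * replace_interval_s  # assumes round-robin replacement
    return collision_probability, r1_lifetime_s


# Example: 256 slots, one R1 replaced every 120 seconds.
probability, lifetime = array_tradeoff(256, 120.0)
print(probability, lifetime)  # 0.00390625 30720.0
```

Under these assumptions, doubling the array halves the chance that a spoofed I2 survives the index check but doubles the time during which a solved puzzle remains usable.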
In the IPv4 world, with the deployed NAT devices, it may make sense to run HIP over UDP. When running HIP over UDP, the following packet structure is used. The structure is followed by the HITs, as usual. Both the Source and Destination port MUST be 272.
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+\
   |          Source port          |       Destination port        | \
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+  >UDP
   |            Length             |           Checksum            | /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<
   |         HIP Controls          | HIP pkt Type  | Ver.  | Res.  |  >HIP
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+/
It is currently undefined how the actual data transfer, using ESP, is handled. Plain ESP may not go through all NAT devices.
It is currently FORBIDDEN to use this packet format with IPv6.
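As an illustration of the layout above, the following sketch packs the UDP header and the first four octets of the HIP fields in network byte order. The function name and parameters are hypothetical; the checksum defaults to zero purely as a placeholder for a value computed elsewhere, and the Res. bits are assumed to be sent as zero.

```python
import struct

HIP_UDP_PORT = 272  # both Source and Destination port


def build_hip_udp_prefix(controls: int, pkt_type: int, version: int,
                         remaining_len: int, checksum: int = 0) -> bytes:
    """Pack the 8-octet UDP header followed by Controls, Type, Ver. and Res.

    remaining_len is the number of octets that follow these fields
    (the HITs and the rest of the HIP packet).
    """
    if not 0 <= controls <= 0xFFFF:
        raise ValueError("HIP Controls is a 16-bit field")
    if not 0 <= pkt_type <= 0xFF:
        raise ValueError("HIP pkt Type is an 8-bit field")
    if not 0 <= version <= 0x0F:
        raise ValueError("Ver. is a 4-bit field")
    ver_res = version << 4                # Ver. in the high nibble, Res. = 0
    udp_length = 8 + 4 + remaining_len    # UDP header + HIP fields + the rest
    return struct.pack("!HHHHHBB", HIP_UDP_PORT, HIP_UDP_PORT,
                       udp_length, checksum, controls, pkt_type, ver_res)
```

The returned 12 octets would be followed directly by the HITs, as the text above describes.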
The IETF takes no position regarding the validity or scope of any intellectual property or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; neither does it represent that it has made any effort to identify any such rights. Information on the IETF's procedures with respect to rights in standards-track and standards-related documentation can be found in BCP-11. Copies of claims of rights made available for publication and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementors or users of this specification can be obtained from the IETF Secretariat.
The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights which may cover technology that may be required to practice this standard. Please address the information to the IETF Executive Director.
Copyright (C) The Internet Society (2003). All Rights Reserved.
This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.
The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assignees.
This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Funding for the RFC Editor function is currently provided by the Internet Society.