doc/rfc2104.txt

   1
   2
   3
   4
   5
   6
   7 Network Working Group                                       H. Krawczyk
   8 Request for Comments: 2104                                          IBM
   9 Category: Informational                                      M. Bellare
  10                                                                    UCSD
  11                                                              R. Canetti
  12                                                                     IBM
  13                                                           February 1997
  14
  15
  16              HMAC: Keyed-Hashing for Message Authentication
  17
  18 Status of This Memo
  19
  20    This memo provides information for the Internet community.  This memo
  21    does not specify an Internet standard of any kind.  Distribution of
  22    this memo is unlimited.
  23
  24 Abstract
  25
  26    This document describes HMAC, a mechanism for message authentication
  27    using cryptographic hash functions. HMAC can be used with any
  28    iterative cryptographic hash function, e.g., MD5, SHA-1, in
  29    combination with a secret shared key.  The cryptographic strength of
  30    HMAC depends on the properties of the underlying hash function.
  31
  32 1. Introduction
  33
  34    Providing a way to check the integrity of information transmitted
  35    over or stored in an unreliable medium is a prime necessity in the
  36    world of open computing and communications. Mechanisms that provide
  37    such integrity check based on a secret key are usually called
  38    "message authentication codes" (MAC). Typically, message
  39    authentication codes are used between two parties that share a secret
  40    key in order to validate information transmitted between these
  41    parties. In this document we present such a MAC mechanism based on
  42    cryptographic hash functions. This mechanism, called HMAC, is based
  43    on work by the authors [BCK1] where the construction is presented and
  44    cryptographically analyzed. We refer to that work for the details on
  45    the rationale and security analysis of HMAC, and its comparison to
  46    other keyed-hash methods.
  47
  48
  49
  50
  51
  52
  53
  54
  55
  56
  57
  58 Krawczyk, et. al.            Informational                      [Page 1]
  59 \f
  60 RFC 2104                          HMAC                     February 1997
  61
  62
  63    HMAC can be used in combination with any iterated cryptographic hash
  64    function. MD5 and SHA-1 are examples of such hash functions. HMAC
  65    also uses a secret key for calculation and verification of the
  66    message authentication values. The main goals behind this
  67    construction are
  68
  69    * To use, without modifications, available hash functions.
  70      In particular, hash functions that perform well in software,
  71      and for which code is freely and widely available.
  72
  73    * To preserve the original performance of the hash function without
  74      incurring a significant degradation.
  75
  76    * To use and handle keys in a simple way.
  77
  78    * To have a well understood cryptographic analysis of the strength of
  79      the authentication mechanism based on reasonable assumptions on the
  80      underlying hash function.
  81
  82    * To allow for easy replaceability of the underlying hash function in
  83      case that faster or more secure hash functions are found or
  84      required.
  85
  86    This document specifies HMAC using a generic cryptographic hash
  87    function (denoted by H). Specific instantiations of HMAC need to
  88    define a particular hash function. Current candidates for such hash
  89    functions include SHA-1 [SHA], MD5 [MD5], RIPEMD-128/160 [RIPEMD].
  90    These different realizations of HMAC will be denoted by HMAC-SHA1,
  91    HMAC-MD5, HMAC-RIPEMD, etc.
  92
  93    Note: To the date of writing of this document MD5 and SHA-1 are the
  94    most widely used cryptographic hash functions. MD5 has been recently
  95    shown to be vulnerable to collision search attacks [Dobb].  This
  96    attack and other currently known weaknesses of MD5 do not compromise
  97    the use of MD5 within HMAC as specified in this document (see
  98    [Dobb]); however, SHA-1 appears to be a cryptographically stronger
  99    function. To this date, MD5 can be considered for use in HMAC for
 100    applications where the superior performance of MD5 is critical.   In
 101    any case, implementers and users need to be aware of possible
 102    cryptanalytic developments regarding any of these cryptographic hash
 103    functions, and the eventual need to replace the underlying hash
 104    function. (See section 6 for more information on the security of
 105    HMAC.)
 106
 107
 108
 109
 110
 111
 112
 113
 114 Krawczyk, et. al.            Informational                      [Page 2]
 115 \f
 116 RFC 2104                          HMAC                     February 1997
 117
 118
 119 2. Definition of HMAC
 120
 121    The definition of HMAC requires a cryptographic hash function, which
 122    we denote by H, and a secret key K. We assume H to be a cryptographic
 123    hash function where data is hashed by iterating a basic compression
 124    function on blocks of data.   We denote by B the byte-length of such
 125    blocks (B=64 for all the above mentioned examples of hash functions),
 126    and by L the byte-length of hash outputs (L=16 for MD5, L=20 for
 127    SHA-1).  The authentication key K can be of any length up to B, the
 128    block length of the hash function.  Applications that use keys longer
 129    than B bytes will first hash the key using H and then use the
 130    resultant L byte string as the actual key to HMAC. In any case the
 131    minimal recommended length for K is L bytes (as the hash output
 132    length). See section 3 for more information on keys.
 133
 134    We define two fixed and different strings ipad and opad as follows
 135    (the 'i' and 'o' are mnemonics for inner and outer):
 136
 137                    ipad = the byte 0x36 repeated B times
 138                   opad = the byte 0x5C repeated B times.
 139
 140    To compute HMAC over the data `text' we perform
 141
 142                     H(K XOR opad, H(K XOR ipad, text))
 143
 144    Namely,
 145
 146     (1) append zeros to the end of K to create a B byte string
 147         (e.g., if K is of length 20 bytes and B=64, then K will be
 148          appended with 44 zero bytes 0x00)
 149     (2) XOR (bitwise exclusive-OR) the B byte string computed in step
 150         (1) with ipad
 151     (3) append the stream of data 'text' to the B byte string resulting
 152         from step (2)
 153     (4) apply H to the stream generated in step (3)
 154     (5) XOR (bitwise exclusive-OR) the B byte string computed in
 155         step (1) with opad
 156     (6) append the H result from step (4) to the B byte string
 157         resulting from step (5)
 158     (7) apply H to the stream generated in step (6) and output
 159         the result
 160
 161    For illustration purposes, sample code based on MD5 is provided as an
 162    appendix.
 163
 164
 165
 166
 167
 168
 169
 170 Krawczyk, et. al.            Informational                      [Page 3]
 171 \f
 172 RFC 2104                          HMAC                     February 1997
 173
 174
 175 3. Keys
 176
 177    The key for HMAC can be of any length (keys longer than B bytes are
 178    first hashed using H).  However, less than L bytes is strongly
 179    discouraged as it would decrease the security strength of the
 180    function.  Keys longer than L bytes are acceptable but the extra
 181    length would not significantly increase the function strength. (A
 182    longer key may be advisable if the randomness of the key is
 183    considered weak.)
 184
 185    Keys need to be chosen at random (or using a cryptographically strong
 186    pseudo-random generator seeded with a random seed), and periodically
 187    refreshed.  (Current attacks do not indicate a specific recommended
 188    frequency for key changes as these attacks are practically
 189    infeasible.  However, periodic key refreshment is a fundamental
 190    security practice that helps against potential weaknesses of the
 191    function and keys, and limits the damage of an exposed key.)
 192
 193 4. Implementation Note
 194
 195    HMAC is defined in such a way that the underlying hash function H can
 196    be used with no modification to its code. In particular, it uses the
 197    function H with the pre-defined initial value IV (a fixed value
 198    specified by each iterative hash function to initialize its
 199    compression function).  However, if desired, a performance
 200    improvement can be achieved at the cost of (possibly) modifying the
 201    code of H to support variable IVs.
 202
 203    The idea is that the intermediate results of the compression function
 204    on the B-byte blocks (K XOR ipad) and (K XOR opad) can be precomputed
 205    only once at the time of generation of the key K, or before its first
 206    use. These intermediate results are stored and then used to
 207    initialize the IV of H each time that a message needs to be
 208    authenticated.  This method saves, for each authenticated message,
 209    the application of the compression function of H on two B-byte blocks
 210    (i.e., on (K XOR ipad) and (K XOR opad)). Such a savings may be
 211    significant when authenticating short streams of data.  We stress
 212    that the stored intermediate values need to be treated and protected
 213    the same as secret keys.
 214
 215    Choosing to implement HMAC in the above way is a decision of the
 216    local implementation and has no effect on inter-operability.
 217
 218
 219
 220
 221
 222
 223
 224
 225
 226 Krawczyk, et. al.            Informational                      [Page 4]
 227 \f
 228 RFC 2104                          HMAC                     February 1997
 229
 230
 231 5. Truncated output
 232
 233    A well-known practice with message authentication codes is to
 234    truncate the output of the MAC and output only part of the bits
 235    (e.g., [MM, ANSI]).  Preneel and van Oorschot [PV] show some
 236    analytical advantages of truncating the output of hash-based MAC
 237    functions. The results in this area are not absolute as for the
 238    overall security advantages of truncation. It has advantages (less
 239    information on the hash result available to an attacker) and
 240    disadvantages (less bits to predict for the attacker).  Applications
 241    of HMAC can choose to truncate the output of HMAC by outputting the t
 242    leftmost bits of the HMAC computation for some parameter t (namely,
 243    the computation is carried in the normal way as defined in section 2
 244    above but the end result is truncated to t bits). We recommend that
 245    the output length t be not less than half the length of the hash
 246    output (to match the birthday attack bound) and not less than 80 bits
 247    (a suitable lower bound on the number of bits that need to be
 248    predicted by an attacker).  We propose denoting a realization of HMAC
 249    that uses a hash function H with t bits of output as HMAC-H-t. For
 250    example, HMAC-SHA1-80 denotes HMAC computed using the SHA-1 function
 251    and with the output truncated to 80 bits.  (If the parameter t is not
 252    specified, e.g. HMAC-MD5, then it is assumed that all the bits of the
 253    hash are output.)
 254
 255 6. Security
 256
 257    The security of the message authentication mechanism presented here
 258    depends on cryptographic properties of the hash function H: the
 259    resistance to collision finding (limited to the case where the
 260    initial value is secret and random, and where the output of the
 261    function is not explicitly available to the attacker), and the
 262    message authentication property of the compression function of H when
 263    applied to single blocks (in HMAC these blocks are partially unknown
 264    to an attacker as they contain the result of the inner H computation
 265    and, in particular, cannot be fully chosen by the attacker).
 266
 267    These properties, and actually stronger ones, are commonly assumed
 268    for hash functions of the kind used with HMAC. In particular, a hash
 269    function for which the above properties do not hold would become
 270    unsuitable for most (probably, all) cryptographic applications,
 271    including alternative message authentication schemes based on such
 272    functions.  (For a complete analysis and rationale of the HMAC
 273    function the reader is referred to [BCK1].)
 274
 275
 276
 277
 278
 279
 280
 281
 282 Krawczyk, et. al.            Informational                      [Page 5]
 283 \f
 284 RFC 2104                          HMAC                     February 1997
 285
 286
 287    Given the limited confidence gained so far as for the cryptographic
 288    strength of candidate hash functions, it is important to observe the
 289    following two properties of the HMAC construction and its secure use
 290    for message authentication:
 291
 292    1. The construction is independent of the details of the particular
 293    hash function H in use and then the latter can be replaced by any
 294    other secure (iterative) cryptographic hash function.
 295
 296    2. Message authentication, as opposed to encryption, has a
 297    "transient" effect. A published breaking of a message authentication
 298    scheme would lead to the replacement of that scheme, but would have
 299    no adversarial effect on information authenticated in the past.  This
 300    is in sharp contrast with encryption, where information encrypted
 301    today may suffer from exposure in the future if, and when, the
 302    encryption algorithm is broken.
 303
 304    The strongest attack known against HMAC is based on the frequency of
 305    collisions for the hash function H ("birthday attack") [PV,BCK2], and
 306    is totally impractical for minimally reasonable hash functions.
 307
 308    As an example, if we consider a hash function like MD5 where the
 309    output length equals L=16 bytes (128 bits) the attacker needs to
 310    acquire the correct message authentication tags computed (with the
 311    _same_ secret key K!) on about 2**64 known plaintexts.  This would
 312    require the processing of at least 2**64 blocks under H, an
 313    impossible task in any realistic scenario (for a block length of 64
 314    bytes this would take 250,000 years in a continuous 1Gbps link, and
 315    without changing the secret key K during all this time).  This attack
 316    could become realistic only if serious flaws in the collision
 317    behavior of the function H are discovered (e.g.  collisions found
 318    after 2**30 messages). Such a discovery would determine the immediate
 319    replacement of the function H (the effects of such failure would be
 320    far more severe for the traditional uses of H in the context of
 321    digital signatures, public key certificates, etc.).
 322
 323    Note: this attack needs to be strongly contrasted with regular
 324    collision attacks on cryptographic hash functions where no secret key
 325    is involved and where 2**64 off-line parallelizable (!) operations
 326    suffice to find collisions.  The latter attack is approaching
 327    feasibility [VW] while the birthday attack on HMAC is totally
 328    impractical.  (In the above examples, if one uses a hash function
 329    with, say, 160 bit of output then 2**64 should be replaced by 2**80.)
 330
 331
 332
 333
 334
 335
 336
 337
 338 Krawczyk, et. al.            Informational                      [Page 6]
 339 \f
 340 RFC 2104                          HMAC                     February 1997
 341
 342
 343    A correct implementation of the above construction, the choice of
 344    random (or cryptographically pseudorandom) keys, a secure key
 345    exchange mechanism, frequent key refreshments, and good secrecy
 346    protection of keys are all essential ingredients for the security of
 347    the integrity verification mechanism provided by HMAC.
 348
 349
 350
 351
 352
 353
 354
 355
 356
 357
 358
 359
 360
 361
 362
 363
 364
 365
 366
 367
 368
 369
 370
 371
 372
 373
 374
 375
 376
 377
 378
 379
 380
 381
 382
 383
 384
 385
 386
 387
 388
 389
 390
 391
 392
 393
 394 Krawczyk, et. al.            Informational                      [Page 7]
 395 \f
 396 RFC 2104                          HMAC                     February 1997
 397
 398
 399 Appendix -- Sample Code
 400
 401    For the sake of illustration we provide the following sample code for
 402    the implementation of HMAC-MD5 as well as some corresponding test
 403    vectors (the code is based on MD5 code as described in [MD5]).
 404
 405 /*
 406 ** Function: hmac_md5
 407 */
 408
 409 void
 410 hmac_md5(text, text_len, key, key_len, digest)
 411 unsigned char*  text;                /* pointer to data stream */
 412 int             text_len;            /* length of data stream */
 413 unsigned char*  key;                 /* pointer to authentication key */
 414 int             key_len;             /* length of authentication key */
 415 caddr_t         digest;              /* caller digest to be filled in */
 416
 417 {
 418         MD5_CTX context;
 419         unsigned char k_ipad[65];    /* inner padding -
 420                                       * key XORd with ipad
 421                                       */
 422         unsigned char k_opad[65];    /* outer padding -
 423                                       * key XORd with opad
 424                                       */
 425         unsigned char tk[16];
 426         int i;
 427         /* if key is longer than 64 bytes reset it to key=MD5(key) */
 428         if (key_len > 64) {
 429
 430                 MD5_CTX      tctx;
 431
 432                 MD5Init(&tctx);
 433                 MD5Update(&tctx, key, key_len);
 434                 MD5Final(tk, &tctx);
 435
 436                 key = tk;
 437                 key_len = 16;
 438         }
 439
 440         /*
 441          * the HMAC_MD5 transform looks like:
 442          *
 443          * MD5(K XOR opad, MD5(K XOR ipad, text))
 444          *
 445          * where K is an n byte key
 446          * ipad is the byte 0x36 repeated 64 times
 447
 448
 449
 450 Krawczyk, et. al.            Informational                      [Page 8]
 451 \f
 452 RFC 2104                          HMAC                     February 1997
 453
 454
 455          * opad is the byte 0x5c repeated 64 times
 456          * and text is the data being protected
 457          */
 458
 459         /* start out by storing key in pads */
 460         bzero( k_ipad, sizeof k_ipad);
 461         bzero( k_opad, sizeof k_opad);
 462         bcopy( key, k_ipad, key_len);
 463         bcopy( key, k_opad, key_len);
 464
 465         /* XOR key with ipad and opad values */
 466         for (i=0; i<64; i++) {
 467                 k_ipad[i] ^= 0x36;
 468                 k_opad[i] ^= 0x5c;
 469         }
 470         /*
 471          * perform inner MD5
 472          */
 473         MD5Init(&context);                   /* init context for 1st
 474                                               * pass */
 475         MD5Update(&context, k_ipad, 64)      /* start with inner pad */
 476         MD5Update(&context, text, text_len); /* then text of datagram */
 477         MD5Final(digest, &context);          /* finish up 1st pass */
 478         /*
 479          * perform outer MD5
 480          */
 481         MD5Init(&context);                   /* init context for 2nd
 482                                               * pass */
 483         MD5Update(&context, k_opad, 64);     /* start with outer pad */
 484         MD5Update(&context, digest, 16);     /* then results of 1st
 485                                               * hash */
 486         MD5Final(digest, &context);          /* finish up 2nd pass */
 487 }
 488
 489 Test Vectors (Trailing '\0' of a character string not included in test):
 490
 491   key =         0x0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b
 492   key_len =     16 bytes
 493   data =        "Hi There"
 494   data_len =    8  bytes
 495   digest =      0x9294727a3638bb1c13f48ef8158bfc9d
 496
 497   key =         "Jefe"
 498   data =        "what do ya want for nothing?"
 499   data_len =    28 bytes
 500   digest =      0x750c783e6ab0b503eaa86e310a5db738
 501
 502   key =         0xAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
 503
 504
 505
 506 Krawczyk, et. al.            Informational                      [Page 9]
 507 \f
 508 RFC 2104                          HMAC                     February 1997
 509
 510
 511   key_len       16 bytes
 512   data =        0xDDDDDDDDDDDDDDDDDDDD...
 513                 ..DDDDDDDDDDDDDDDDDDDD...
 514                 ..DDDDDDDDDDDDDDDDDDDD...
 515                 ..DDDDDDDDDDDDDDDDDDDD...
 516                 ..DDDDDDDDDDDDDDDDDDDD
 517   data_len =    50 bytes
 518   digest =      0x56be34521d144c88dbb8c733f0e8b3f6
 519
 520 Acknowledgments
 521
 522    Pau-Chen Cheng, Jeff Kraemer, and Michael Oehler, have provided
 523    useful comments on early drafts, and ran the first interoperability
 524    tests of this specification. Jeff and Pau-Chen kindly provided the
 525    sample code and test vectors that appear in the appendix.  Burt
 526    Kaliski, Bart Preneel, Matt Robshaw, Adi Shamir, and Paul van
 527    Oorschot have provided useful comments and suggestions during the
 528    investigation of the HMAC construction.
 529
 530 References
 531
 532    [ANSI]  ANSI X9.9, "American National Standard for Financial
 533            Institution Message Authentication (Wholesale)," American
 534            Bankers Association, 1981.   Revised 1986.
 535
 536    [Atk]   Atkinson, R., "IP Authentication Header", RFC 1826, August
 537            1995.
 538
 539    [BCK1]  M. Bellare, R. Canetti, and H. Krawczyk,
 540            "Keyed Hash Functions and Message Authentication",
 541            Proceedings of Crypto'96, LNCS 1109, pp. 1-15.
 542            (http://www.research.ibm.com/security/keyed-md5.html)
 543
 544    [BCK2]  M. Bellare, R. Canetti, and H. Krawczyk,
 545            "Pseudorandom Functions Revisited: The Cascade Construction",
 546            Proceedings of FOCS'96.
 547
 548    [Dobb]  H. Dobbertin, "The Status of MD5  After a Recent Attack",
 549            RSA Labs' CryptoBytes, Vol. 2 No. 2, Summer 1996.
 550            http://www.rsa.com/rsalabs/pubs/cryptobytes.html
 551
 552    [PV]    B. Preneel and P. van Oorschot, "Building fast MACs from hash
 553            functions", Advances in Cryptology -- CRYPTO'95 Proceedings,
 554            Lecture Notes in Computer Science, Springer-Verlag Vol.963,
 555            1995, pp. 1-14.
 556
 557    [MD5]   Rivest, R., "The MD5 Message-Digest Algorithm",
 558            RFC 1321, April 1992.
 559
 560
 561
 562 Krawczyk, et. al.            Informational                     [Page 10]
 563 \f
 564 RFC 2104                          HMAC                     February 1997
 565
 566
 567    [MM]    Meyer, S. and Matyas, S.M., Cryptography, New York Wiley,
 568            1982.
 569
 570    [RIPEMD] H. Dobbertin, A. Bosselaers, and B. Preneel, "RIPEMD-160: A
 571             strengthened version of RIPEMD", Fast Software Encryption,
 572             LNCS Vol 1039, pp. 71-82.
 573             ftp://ftp.esat.kuleuven.ac.be/pub/COSIC/bosselae/ripemd/.
 574
 575    [SHA]   NIST, FIPS PUB 180-1: Secure Hash Standard, April 1995.
 576
 577    [Tsu]   G. Tsudik, "Message authentication with one-way hash
 578            functions", In Proceedings of Infocom'92, May 1992.
 579            (Also in "Access Control and Policy Enforcement in
 580             Internetworks", Ph.D. Dissertation, Computer Science
 581             Department, University of Southern California, April 1991.)
 582
 583    [VW]    P. van Oorschot and M. Wiener, "Parallel Collision
 584            Search with Applications to Hash Functions and Discrete
 585            Logarithms", Proceedings of the 2nd ACM Conf. Computer and
 586            Communications Security, Fairfax, VA, November 1994.
 587
 588 Authors' Addresses
 589
 590    Hugo Krawczyk
 591    IBM T.J. Watson Research Center
 592    P.O.Box 704
 593    Yorktown Heights, NY 10598
 594
 595    EMail: hugo@watson.ibm.com
 596
 597    Mihir Bellare
 598    Dept of Computer Science and Engineering
 599    Mail Code 0114
 600    University of California at San Diego
 601    9500 Gilman Drive
 602    La Jolla, CA 92093
 603
 604    EMail: mihir@cs.ucsd.edu
 605
 606    Ran Canetti
 607    IBM T.J. Watson Research Center
 608    P.O.Box 704
 609    Yorktown Heights, NY 10598
 610
 611    EMail: canetti@watson.ibm.com
 612
 613
 614
 615
 616
 617
 618 Krawczyk, et. al.            Informational                     [Page 11]
 619 \f