From: Peter Avalos Date: Sun, 25 Mar 2012 17:44:51 +0000 (-0700) Subject: Import OpenSSL-1.0.1. X-Git-Tag: v3.2.0~1169^2 X-Git-Url: https://gitweb.dragonflybsd.org/~tuxillo/dragonfly.git/commitdiff_plain/672590bc3265e9b8b2a848c90070e94c3db182f3 Import OpenSSL-1.0.1. Major changes between OpenSSL 1.0.0h and OpenSSL 1.0.1: o TLS/DTLS heartbeat support. o SCTP support. o RFC 5705 TLS key material exporter. o RFC 5764 DTLS-SRTP negotiation. o Next Protocol Negotiation. o PSS signatures in certificates, requests and CRLs. o Support for password based recipient info for CMS. o Support TLS v1.2 and TLS v1.1. o Preliminary FIPS capability for unvalidated 2.0 FIPS module. o SRP support. Major changes between OpenSSL 1.0.0g and OpenSSL 1.0.0h: o Fix for CMS/PKCS#7 MMA CVE-2012-0884 o Corrected fix for CVE-2011-4619 o Various DTLS fixes. --- diff --git a/crypto/openssl/CHANGES b/crypto/openssl/CHANGES index 67ff293f35..03d731451d 100644 --- a/crypto/openssl/CHANGES +++ b/crypto/openssl/CHANGES @@ -2,6 +2,296 @@ OpenSSL CHANGES _______________ + Changes between 1.0.0h and 1.0.1 [14 Mar 2012] + + *) Add compatibility with old MDC2 signatures which use an ASN1 OCTET + STRING form instead of a DigestInfo. + [Steve Henson] + + *) The format used for MDC2 RSA signatures is inconsistent between EVP + and the RSA_sign/RSA_verify functions. This was made more apparent when + OpenSSL used RSA_sign/RSA_verify for some RSA signatures in particular + those which went through EVP_PKEY_METHOD in 1.0.0 and later. Detect + the correct format in RSA_verify so both forms transparently work. + [Steve Henson] + + *) Some servers which support TLS 1.0 can choke if we initially indicate + support for TLS 1.2 and later renegotiate using TLS 1.0 in the RSA + encrypted premaster secret. As a workaround use the maximum pemitted + client version in client hello, this should keep such servers happy + and still work with previous versions of OpenSSL. + [Steve Henson] + + *) Add support for TLS/DTLS heartbeats. + [Robin Seggelmann ] + + *) Add support for SCTP. + [Robin Seggelmann ] + + *) Improved PRNG seeding for VOS. + [Paul Green ] + + *) Extensive assembler packs updates, most notably: + + - x86[_64]: AES-NI, PCLMULQDQ, RDRAND support; + - x86[_64]: SSSE3 support (SHA1, vector-permutation AES); + - x86_64: bit-sliced AES implementation; + - ARM: NEON support, contemporary platforms optimizations; + - s390x: z196 support; + - *: GHASH and GF(2^m) multiplication implementations; + + [Andy Polyakov] + + *) Make TLS-SRP code conformant with RFC 5054 API cleanup + (removal of unnecessary code) + [Peter Sylvester ] + + *) Add TLS key material exporter from RFC 5705. + [Eric Rescorla] + + *) Add DTLS-SRTP negotiation from RFC 5764. + [Eric Rescorla] + + *) Add Next Protocol Negotiation, + http://tools.ietf.org/html/draft-agl-tls-nextprotoneg-00. Can be + disabled with a no-npn flag to config or Configure. Code donated + by Google. + [Adam Langley and Ben Laurie] + + *) Add optional 64-bit optimized implementations of elliptic curves NIST-P224, + NIST-P256, NIST-P521, with constant-time single point multiplication on + typical inputs. Compiler support for the nonstandard type __uint128_t is + required to use this (present in gcc 4.4 and later, for 64-bit builds). + Code made available under Apache License version 2.0. + + Specify "enable-ec_nistp_64_gcc_128" on the Configure (or config) command + line to include this in your build of OpenSSL, and run "make depend" (or + "make update"). This enables the following EC_METHODs: + + EC_GFp_nistp224_method() + EC_GFp_nistp256_method() + EC_GFp_nistp521_method() + + EC_GROUP_new_by_curve_name() will automatically use these (while + EC_GROUP_new_curve_GFp() currently prefers the more flexible + implementations). + [Emilia Käsper, Adam Langley, Bodo Moeller (Google)] + + *) Use type ossl_ssize_t instad of ssize_t which isn't available on + all platforms. Move ssize_t definition from e_os.h to the public + header file e_os2.h as it now appears in public header file cms.h + [Steve Henson] + + *) New -sigopt option to the ca, req and x509 utilities. Additional + signature parameters can be passed using this option and in + particular PSS. + [Steve Henson] + + *) Add RSA PSS signing function. This will generate and set the + appropriate AlgorithmIdentifiers for PSS based on those in the + corresponding EVP_MD_CTX structure. No application support yet. + [Steve Henson] + + *) Support for companion algorithm specific ASN1 signing routines. + New function ASN1_item_sign_ctx() signs a pre-initialised + EVP_MD_CTX structure and sets AlgorithmIdentifiers based on + the appropriate parameters. + [Steve Henson] + + *) Add new algorithm specific ASN1 verification initialisation function + to EVP_PKEY_ASN1_METHOD: this is not in EVP_PKEY_METHOD since the ASN1 + handling will be the same no matter what EVP_PKEY_METHOD is used. + Add a PSS handler to support verification of PSS signatures: checked + against a number of sample certificates. + [Steve Henson] + + *) Add signature printing for PSS. Add PSS OIDs. + [Steve Henson, Martin Kaiser ] + + *) Add algorithm specific signature printing. An individual ASN1 method + can now print out signatures instead of the standard hex dump. + + More complex signatures (e.g. PSS) can print out more meaningful + information. Include DSA version that prints out the signature + parameters r, s. + [Steve Henson] + + *) Password based recipient info support for CMS library: implementing + RFC3211. + [Steve Henson] + + *) Split password based encryption into PBES2 and PBKDF2 functions. This + neatly separates the code into cipher and PBE sections and is required + for some algorithms that split PBES2 into separate pieces (such as + password based CMS). + [Steve Henson] + + *) Session-handling fixes: + - Fix handling of connections that are resuming with a session ID, + but also support Session Tickets. + - Fix a bug that suppressed issuing of a new ticket if the client + presented a ticket with an expired session. + - Try to set the ticket lifetime hint to something reasonable. + - Make tickets shorter by excluding irrelevant information. + - On the client side, don't ignore renewed tickets. + [Adam Langley, Bodo Moeller (Google)] + + *) Fix PSK session representation. + [Bodo Moeller] + + *) Add RC4-MD5 and AESNI-SHA1 "stitched" implementations. + + This work was sponsored by Intel. + [Andy Polyakov] + + *) Add GCM support to TLS library. Some custom code is needed to split + the IV between the fixed (from PRF) and explicit (from TLS record) + portions. This adds all GCM ciphersuites supported by RFC5288 and + RFC5289. Generalise some AES* cipherstrings to inlclude GCM and + add a special AESGCM string for GCM only. + [Steve Henson] + + *) Expand range of ctrls for AES GCM. Permit setting invocation + field on decrypt and retrieval of invocation field only on encrypt. + [Steve Henson] + + *) Add HMAC ECC ciphersuites from RFC5289. Include SHA384 PRF support. + As required by RFC5289 these ciphersuites cannot be used if for + versions of TLS earlier than 1.2. + [Steve Henson] + + *) For FIPS capable OpenSSL interpret a NULL default public key method + as unset and return the appopriate default but do *not* set the default. + This means we can return the appopriate method in applications that + swicth between FIPS and non-FIPS modes. + [Steve Henson] + + *) Redirect HMAC and CMAC operations to FIPS module in FIPS mode. If an + ENGINE is used then we cannot handle that in the FIPS module so we + keep original code iff non-FIPS operations are allowed. + [Steve Henson] + + *) Add -attime option to openssl utilities. + [Peter Eckersley , Ben Laurie and Steve Henson] + + *) Redirect DSA and DH operations to FIPS module in FIPS mode. + [Steve Henson] + + *) Redirect ECDSA and ECDH operations to FIPS module in FIPS mode. Also use + FIPS EC methods unconditionally for now. + [Steve Henson] + + *) New build option no-ec2m to disable characteristic 2 code. + [Steve Henson] + + *) Backport libcrypto audit of return value checking from 1.1.0-dev; not + all cases can be covered as some introduce binary incompatibilities. + [Steve Henson] + + *) Redirect RSA operations to FIPS module including keygen, + encrypt, decrypt, sign and verify. Block use of non FIPS RSA methods. + [Steve Henson] + + *) Add similar low level API blocking to ciphers. + [Steve Henson] + + *) Low level digest APIs are not approved in FIPS mode: any attempt + to use these will cause a fatal error. Applications that *really* want + to use them can use the private_* version instead. + [Steve Henson] + + *) Redirect cipher operations to FIPS module for FIPS builds. + [Steve Henson] + + *) Redirect digest operations to FIPS module for FIPS builds. + [Steve Henson] + + *) Update build system to add "fips" flag which will link in fipscanister.o + for static and shared library builds embedding a signature if needed. + [Steve Henson] + + *) Output TLS supported curves in preference order instead of numerical + order. This is currently hardcoded for the highest order curves first. + This should be configurable so applications can judge speed vs strength. + [Steve Henson] + + *) Add TLS v1.2 server support for client authentication. + [Steve Henson] + + *) Add support for FIPS mode in ssl library: disable SSLv3, non-FIPS ciphers + and enable MD5. + [Steve Henson] + + *) Functions FIPS_mode_set() and FIPS_mode() which call the underlying + FIPS modules versions. + [Steve Henson] + + *) Add TLS v1.2 client side support for client authentication. Keep cache + of handshake records longer as we don't know the hash algorithm to use + until after the certificate request message is received. + [Steve Henson] + + *) Initial TLS v1.2 client support. Add a default signature algorithms + extension including all the algorithms we support. Parse new signature + format in client key exchange. Relax some ECC signing restrictions for + TLS v1.2 as indicated in RFC5246. + [Steve Henson] + + *) Add server support for TLS v1.2 signature algorithms extension. Switch + to new signature format when needed using client digest preference. + All server ciphersuites should now work correctly in TLS v1.2. No client + support yet and no support for client certificates. + [Steve Henson] + + *) Initial TLS v1.2 support. Add new SHA256 digest to ssl code, switch + to SHA256 for PRF when using TLS v1.2 and later. Add new SHA256 based + ciphersuites. At present only RSA key exchange ciphersuites work with + TLS v1.2. Add new option for TLS v1.2 replacing the old and obsolete + SSL_OP_PKCS1_CHECK flags with SSL_OP_NO_TLSv1_2. New TLSv1.2 methods + and version checking. + [Steve Henson] + + *) New option OPENSSL_NO_SSL_INTERN. If an application can be compiled + with this defined it will not be affected by any changes to ssl internal + structures. Add several utility functions to allow openssl application + to work with OPENSSL_NO_SSL_INTERN defined. + [Steve Henson] + + *) Add SRP support. + [Tom Wu and Ben Laurie] + + *) Add functions to copy EVP_PKEY_METHOD and retrieve flags and id. + [Steve Henson] + + *) Permit abbreviated handshakes when renegotiating using the function + SSL_renegotiate_abbreviated(). + [Robin Seggelmann ] + + *) Add call to ENGINE_register_all_complete() to + ENGINE_load_builtin_engines(), so some implementations get used + automatically instead of needing explicit application support. + [Steve Henson] + + *) Add support for TLS key exporter as described in RFC5705. + [Robin Seggelmann , Steve Henson] + + *) Initial TLSv1.1 support. Since TLSv1.1 is very similar to TLS v1.0 only + a few changes are required: + + Add SSL_OP_NO_TLSv1_1 flag. + Add TLSv1_1 methods. + Update version checking logic to handle version 1.1. + Add explicit IV handling (ported from DTLS code). + Add command line options to s_client/s_server. + [Steve Henson] + + Changes between 1.0.0g and 1.0.0h [xx XXX xxxx] + + *) Fix CVE-2011-4619: make sure we really are receiving a + client hello before rejecting multiple SGC restarts. Thanks to + Ivan Nestlerode for discovering this bug. + [Steve Henson] + Changes between 1.0.0f and 1.0.0g [18 Jan 2012] *) Fix for DTLS DoS issue introduced by fix for CVE-2011-4109. diff --git a/crypto/openssl/FAQ b/crypto/openssl/FAQ index 2a271ed679..b9243a6104 100644 --- a/crypto/openssl/FAQ +++ b/crypto/openssl/FAQ @@ -82,7 +82,7 @@ OpenSSL - Frequently Asked Questions * Which is the current version of OpenSSL? The current version is available from . -OpenSSL 1.0.0g was released on Jan 18th, 2012. +OpenSSL 1.0.1 was released on Mar 14th, 2012. In addition to the current stable release, you can also access daily snapshots of the OpenSSL development version at = 0) X509_VERIFY_PARAM_set_depth(*pm, depth); + if (at_time) + X509_VERIFY_PARAM_set_time(*pm, at_time); + end: (*pargs)++; @@ -2693,6 +2720,50 @@ void jpake_server_auth(BIO *out, BIO *conn, const char *secret) #endif +#if !defined(OPENSSL_NO_TLSEXT) && !defined(OPENSSL_NO_NEXTPROTONEG) +/* next_protos_parse parses a comma separated list of strings into a string + * in a format suitable for passing to SSL_CTX_set_next_protos_advertised. + * outlen: (output) set to the length of the resulting buffer on success. + * err: (maybe NULL) on failure, an error message line is written to this BIO. + * in: a NUL termianted string like "abc,def,ghi" + * + * returns: a malloced buffer or NULL on failure. + */ +unsigned char *next_protos_parse(unsigned short *outlen, const char *in) + { + size_t len; + unsigned char *out; + size_t i, start = 0; + + len = strlen(in); + if (len >= 65535) + return NULL; + + out = OPENSSL_malloc(strlen(in) + 1); + if (!out) + return NULL; + + for (i = 0; i <= len; ++i) + { + if (i == len || in[i] == ',') + { + if (i - start > 255) + { + OPENSSL_free(out); + return NULL; + } + out[start] = i - start; + start = i + 1; + } + else + out[i+1] = in[i]; + } + + *outlen = len + 1; + return out; + } +#endif /* !OPENSSL_NO_TLSEXT && !OPENSSL_NO_NEXTPROTONEG */ + /* * Platform-specific sections */ diff --git a/crypto/openssl/apps/apps.h b/crypto/openssl/apps/apps.h index 596a39aceb..c1ca99da12 100644 --- a/crypto/openssl/apps/apps.h +++ b/crypto/openssl/apps/apps.h @@ -317,6 +317,12 @@ int bio_to_mem(unsigned char **out, int maxlen, BIO *in); int pkey_ctrl_string(EVP_PKEY_CTX *ctx, char *value); int init_gen_str(BIO *err, EVP_PKEY_CTX **pctx, const char *algname, ENGINE *e, int do_param); +int do_X509_sign(BIO *err, X509 *x, EVP_PKEY *pkey, const EVP_MD *md, + STACK_OF(OPENSSL_STRING) *sigopts); +int do_X509_REQ_sign(BIO *err, X509_REQ *x, EVP_PKEY *pkey, const EVP_MD *md, + STACK_OF(OPENSSL_STRING) *sigopts); +int do_X509_CRL_sign(BIO *err, X509_CRL *x, EVP_PKEY *pkey, const EVP_MD *md, + STACK_OF(OPENSSL_STRING) *sigopts); #ifndef OPENSSL_NO_PSK extern char *psk_key; #endif @@ -325,6 +331,10 @@ void jpake_client_auth(BIO *out, BIO *conn, const char *secret); void jpake_server_auth(BIO *out, BIO *conn, const char *secret); #endif +#if !defined(OPENSSL_NO_TLSEXT) && !defined(OPENSSL_NO_NEXTPROTONEG) +unsigned char *next_protos_parse(unsigned short *outlen, const char *in); +#endif /* !OPENSSL_NO_TLSEXT && !OPENSSL_NO_NEXTPROTONEG */ + #define FORMAT_UNDEF 0 #define FORMAT_ASN1 1 #define FORMAT_TEXT 2 @@ -357,4 +367,7 @@ int raw_write_stdout(const void *,int); #define TM_START 0 #define TM_STOP 1 double app_tminterval (int stop,int usertime); + +#define OPENSSL_NO_SSL_INTERN + #endif diff --git a/crypto/openssl/apps/ca.c b/crypto/openssl/apps/ca.c index 5d11948bb3..2a83d1936e 100644 --- a/crypto/openssl/apps/ca.c +++ b/crypto/openssl/apps/ca.c @@ -197,26 +197,30 @@ extern int EF_ALIGNMENT; static void lookup_fail(const char *name, const char *tag); static int certify(X509 **xret, char *infile,EVP_PKEY *pkey,X509 *x509, - const EVP_MD *dgst,STACK_OF(CONF_VALUE) *policy,CA_DB *db, + const EVP_MD *dgst,STACK_OF(OPENSSL_STRING) *sigopts, + STACK_OF(CONF_VALUE) *policy,CA_DB *db, BIGNUM *serial, char *subj,unsigned long chtype, int multirdn, int email_dn, char *startdate, char *enddate, long days, int batch, char *ext_sect, CONF *conf, int verbose, unsigned long certopt, unsigned long nameopt, int default_op, int ext_copy, int selfsign); static int certify_cert(X509 **xret, char *infile,EVP_PKEY *pkey,X509 *x509, - const EVP_MD *dgst,STACK_OF(CONF_VALUE) *policy, + const EVP_MD *dgst,STACK_OF(OPENSSL_STRING) *sigopts, + STACK_OF(CONF_VALUE) *policy, CA_DB *db, BIGNUM *serial, char *subj,unsigned long chtype, int multirdn, int email_dn, char *startdate, char *enddate, long days, int batch, char *ext_sect, CONF *conf,int verbose, unsigned long certopt, unsigned long nameopt, int default_op, int ext_copy, ENGINE *e); static int certify_spkac(X509 **xret, char *infile,EVP_PKEY *pkey,X509 *x509, - const EVP_MD *dgst,STACK_OF(CONF_VALUE) *policy, + const EVP_MD *dgst,STACK_OF(OPENSSL_STRING) *sigopts, + STACK_OF(CONF_VALUE) *policy, CA_DB *db, BIGNUM *serial,char *subj,unsigned long chtype, int multirdn, int email_dn, char *startdate, char *enddate, long days, char *ext_sect, CONF *conf, int verbose, unsigned long certopt, unsigned long nameopt, int default_op, int ext_copy); static void write_new_certificate(BIO *bp, X509 *x, int output_der, int notext); static int do_body(X509 **xret, EVP_PKEY *pkey, X509 *x509, const EVP_MD *dgst, + STACK_OF(OPENSSL_STRING) *sigopts, STACK_OF(CONF_VALUE) *policy, CA_DB *db, BIGNUM *serial,char *subj,unsigned long chtype, int multirdn, int email_dn, char *startdate, char *enddate, long days, int batch, int verbose, X509_REQ *req, char *ext_sect, CONF *conf, @@ -311,6 +315,7 @@ int MAIN(int argc, char **argv) const EVP_MD *dgst=NULL; STACK_OF(CONF_VALUE) *attribs=NULL; STACK_OF(X509) *cert_sk=NULL; + STACK_OF(OPENSSL_STRING) *sigopts = NULL; #undef BSIZE #define BSIZE 256 MS_STATIC char buf[3][BSIZE]; @@ -435,6 +440,15 @@ EF_ALIGNMENT=0; if (--argc < 1) goto bad; outdir= *(++argv); } + else if (strcmp(*argv,"-sigopt") == 0) + { + if (--argc < 1) + goto bad; + if (!sigopts) + sigopts = sk_OPENSSL_STRING_new_null(); + if (!sigopts || !sk_OPENSSL_STRING_push(sigopts, *(++argv))) + goto bad; + } else if (strcmp(*argv,"-notext") == 0) notext=1; else if (strcmp(*argv,"-batch") == 0) @@ -1170,8 +1184,9 @@ bad: if (spkac_file != NULL) { total++; - j=certify_spkac(&x,spkac_file,pkey,x509,dgst,attribs,db, - serial,subj,chtype,multirdn,email_dn,startdate,enddate,days,extensions, + j=certify_spkac(&x,spkac_file,pkey,x509,dgst,sigopts, + attribs,db, serial,subj,chtype,multirdn, + email_dn,startdate,enddate,days,extensions, conf,verbose,certopt,nameopt,default_op,ext_copy); if (j < 0) goto err; if (j > 0) @@ -1194,7 +1209,8 @@ bad: if (ss_cert_file != NULL) { total++; - j=certify_cert(&x,ss_cert_file,pkey,x509,dgst,attribs, + j=certify_cert(&x,ss_cert_file,pkey,x509,dgst,sigopts, + attribs, db,serial,subj,chtype,multirdn,email_dn,startdate,enddate,days,batch, extensions,conf,verbose, certopt, nameopt, default_op, ext_copy, e); @@ -1214,7 +1230,7 @@ bad: if (infile != NULL) { total++; - j=certify(&x,infile,pkey,x509p,dgst,attribs,db, + j=certify(&x,infile,pkey,x509p,dgst,sigopts, attribs,db, serial,subj,chtype,multirdn,email_dn,startdate,enddate,days,batch, extensions,conf,verbose, certopt, nameopt, default_op, ext_copy, selfsign); @@ -1234,7 +1250,7 @@ bad: for (i=0; iid; + unsigned long id = SSL_CIPHER_get_id(c); int id0 = (int)(id >> 24); int id1 = (int)((id >> 16) & 0xffL); int id2 = (int)((id >> 8) & 0xffL); diff --git a/crypto/openssl/apps/cms.c b/crypto/openssl/apps/cms.c index d15925a59f..d754140987 100644 --- a/crypto/openssl/apps/cms.c +++ b/crypto/openssl/apps/cms.c @@ -136,6 +136,7 @@ int MAIN(int argc, char **argv) char *engine=NULL; #endif unsigned char *secret_key = NULL, *secret_keyid = NULL; + unsigned char *pwri_pass = NULL, *pwri_tmp = NULL; size_t secret_keylen = 0, secret_keyidlen = 0; ASN1_OBJECT *econtent_type = NULL; @@ -326,6 +327,13 @@ int MAIN(int argc, char **argv) } secret_keyidlen = (size_t)ltmp; } + else if (!strcmp(*args,"-pwri_password")) + { + if (!args[1]) + goto argerr; + args++; + pwri_pass = (unsigned char *)*args; + } else if (!strcmp(*args,"-econtent_type")) { if (!args[1]) @@ -559,7 +567,7 @@ int MAIN(int argc, char **argv) else if (operation == SMIME_DECRYPT) { - if (!recipfile && !keyfile && !secret_key) + if (!recipfile && !keyfile && !secret_key && !pwri_pass) { BIO_printf(bio_err, "No recipient certificate or key specified\n"); badarg = 1; @@ -567,7 +575,7 @@ int MAIN(int argc, char **argv) } else if (operation == SMIME_ENCRYPT) { - if (!*args && !secret_key) + if (!*args && !secret_key && !pwri_pass) { BIO_printf(bio_err, "No recipient(s) certificate(s) specified\n"); badarg = 1; @@ -917,6 +925,17 @@ int MAIN(int argc, char **argv) secret_key = NULL; secret_keyid = NULL; } + if (pwri_pass) + { + pwri_tmp = (unsigned char *)BUF_strdup((char *)pwri_pass); + if (!pwri_tmp) + goto end; + if (!CMS_add0_recipient_password(cms, + -1, NID_undef, NID_undef, + pwri_tmp, -1, NULL)) + goto end; + pwri_tmp = NULL; + } if (!(flags & CMS_STREAM)) { if (!CMS_final(cms, in, NULL, flags)) @@ -1043,6 +1062,16 @@ int MAIN(int argc, char **argv) } } + if (pwri_pass) + { + if (!CMS_decrypt_set1_password(cms, pwri_pass, -1)) + { + BIO_puts(bio_err, + "Error decrypting CMS using password\n"); + goto end; + } + } + if (!CMS_decrypt(cms, NULL, NULL, indata, out, flags)) { BIO_printf(bio_err, "Error decrypting CMS structure\n"); @@ -1167,6 +1196,8 @@ end: OPENSSL_free(secret_key); if (secret_keyid) OPENSSL_free(secret_keyid); + if (pwri_tmp) + OPENSSL_free(pwri_tmp); if (econtent_type) ASN1_OBJECT_free(econtent_type); if (rr) diff --git a/crypto/openssl/apps/dgst.c b/crypto/openssl/apps/dgst.c index 9bf38ce73b..b08e9a7c78 100644 --- a/crypto/openssl/apps/dgst.c +++ b/crypto/openssl/apps/dgst.c @@ -127,6 +127,7 @@ int MAIN(int argc, char **argv) #endif char *hmac_key=NULL; char *mac_name=NULL; + int non_fips_allow = 0; STACK_OF(OPENSSL_STRING) *sigopts = NULL, *macopts = NULL; apps_startup(); @@ -215,6 +216,10 @@ int MAIN(int argc, char **argv) out_bin = 1; else if (strcmp(*argv,"-d") == 0) debug=1; + else if (strcmp(*argv,"-non-fips-allow") == 0) + non_fips_allow=1; + else if (!strcmp(*argv,"-fips-fingerprint")) + hmac_key = "etaonrishdlcupfm"; else if (!strcmp(*argv,"-hmac")) { if (--argc < 1) @@ -395,6 +400,13 @@ int MAIN(int argc, char **argv) goto end; } + if (non_fips_allow) + { + EVP_MD_CTX *md_ctx; + BIO_get_md_ctx(bmd,&md_ctx); + EVP_MD_CTX_set_flags(md_ctx, EVP_MD_CTX_FLAG_NON_FIPS_ALLOW); + } + if (hmac_key) { sigkey = EVP_PKEY_new_mac_key(EVP_PKEY_HMAC, e, diff --git a/crypto/openssl/apps/enc.c b/crypto/openssl/apps/enc.c index 076225c4cb..719acc3250 100644 --- a/crypto/openssl/apps/enc.c +++ b/crypto/openssl/apps/enc.c @@ -129,6 +129,7 @@ int MAIN(int argc, char **argv) char *engine = NULL; #endif const EVP_MD *dgst=NULL; + int non_fips_allow = 0; apps_startup(); @@ -281,6 +282,8 @@ int MAIN(int argc, char **argv) if (--argc < 1) goto bad; md= *(++argv); } + else if (strcmp(*argv,"-non-fips-allow") == 0) + non_fips_allow = 1; else if ((argv[0][0] == '-') && ((c=EVP_get_cipherbyname(&(argv[0][1]))) != NULL)) { @@ -589,6 +592,11 @@ bad: */ BIO_get_cipher_ctx(benc, &ctx); + + if (non_fips_allow) + EVP_CIPHER_CTX_set_flags(ctx, + EVP_CIPH_FLAG_NON_FIPS_ALLOW); + if (!EVP_CipherInit_ex(ctx, cipher, NULL, NULL, NULL, enc)) { BIO_printf(bio_err, "Error setting cipher %s\n", diff --git a/crypto/openssl/apps/openssl.c b/crypto/openssl/apps/openssl.c index dab057bbff..1c880d90ba 100644 --- a/crypto/openssl/apps/openssl.c +++ b/crypto/openssl/apps/openssl.c @@ -129,6 +129,9 @@ #include "progs.h" #include "s_apps.h" #include +#ifdef OPENSSL_FIPS +#include +#endif /* The LHASH callbacks ("hash" & "cmp") have been replaced by functions with the * base prototypes (we cast each variable inside the function to the required @@ -310,6 +313,19 @@ int main(int Argc, char *ARGV[]) CRYPTO_set_locking_callback(lock_dbg_cb); } + if(getenv("OPENSSL_FIPS")) { +#ifdef OPENSSL_FIPS + if (!FIPS_mode_set(1)) { + ERR_load_crypto_strings(); + ERR_print_errors(BIO_new_fp(stderr,BIO_NOCLOSE)); + EXIT(1); + } +#else + fprintf(stderr, "FIPS mode not supported.\n"); + EXIT(1); +#endif + } + apps_startup(); /* Lets load up our environment a little */ diff --git a/crypto/openssl/apps/progs.h b/crypto/openssl/apps/progs.h index 79e479a337..949e78066b 100644 --- a/crypto/openssl/apps/progs.h +++ b/crypto/openssl/apps/progs.h @@ -46,6 +46,7 @@ extern int engine_main(int argc,char *argv[]); extern int ocsp_main(int argc,char *argv[]); extern int prime_main(int argc,char *argv[]); extern int ts_main(int argc,char *argv[]); +extern int srp_main(int argc,char *argv[]); #define FUNC_TYPE_GENERAL 1 #define FUNC_TYPE_MD 2 @@ -147,6 +148,9 @@ FUNCTION functions[] = { #endif {FUNC_TYPE_GENERAL,"prime",prime_main}, {FUNC_TYPE_GENERAL,"ts",ts_main}, +#ifndef OPENSSL_NO_SRP + {FUNC_TYPE_GENERAL,"srp",srp_main}, +#endif #ifndef OPENSSL_NO_MD2 {FUNC_TYPE_MD,"md2",dgst_main}, #endif diff --git a/crypto/openssl/apps/req.c b/crypto/openssl/apps/req.c index 820cd18fc7..85526581ce 100644 --- a/crypto/openssl/apps/req.c +++ b/crypto/openssl/apps/req.c @@ -165,7 +165,7 @@ int MAIN(int argc, char **argv) EVP_PKEY_CTX *genctx = NULL; const char *keyalg = NULL; char *keyalgstr = NULL; - STACK_OF(OPENSSL_STRING) *pkeyopts = NULL; + STACK_OF(OPENSSL_STRING) *pkeyopts = NULL, *sigopts = NULL; EVP_PKEY *pkey=NULL; int i=0,badops=0,newreq=0,verbose=0,pkey_type=-1; long newkey = -1; @@ -310,6 +310,15 @@ int MAIN(int argc, char **argv) if (!pkeyopts || !sk_OPENSSL_STRING_push(pkeyopts, *(++argv))) goto bad; } + else if (strcmp(*argv,"-sigopt") == 0) + { + if (--argc < 1) + goto bad; + if (!sigopts) + sigopts = sk_OPENSSL_STRING_new_null(); + if (!sigopts || !sk_OPENSSL_STRING_push(sigopts, *(++argv))) + goto bad; + } else if (strcmp(*argv,"-batch") == 0) batch=1; else if (strcmp(*argv,"-newhdr") == 0) @@ -858,8 +867,9 @@ loop: extensions); goto end; } - - if (!(i=X509_sign(x509ss,pkey,digest))) + + i=do_X509_sign(bio_err, x509ss, pkey, digest, sigopts); + if (!i) { ERR_print_errors(bio_err); goto end; @@ -883,7 +893,8 @@ loop: req_exts); goto end; } - if (!(i=X509_REQ_sign(req,pkey,digest))) + i=do_X509_REQ_sign(bio_err, req, pkey, digest, sigopts); + if (!i) { ERR_print_errors(bio_err); goto end; @@ -1084,6 +1095,8 @@ end: EVP_PKEY_CTX_free(genctx); if (pkeyopts) sk_OPENSSL_STRING_free(pkeyopts); + if (sigopts) + sk_OPENSSL_STRING_free(sigopts); #ifndef OPENSSL_NO_ENGINE if (gen_eng) ENGINE_free(gen_eng); @@ -1756,3 +1769,68 @@ static int genpkey_cb(EVP_PKEY_CTX *ctx) #endif return 1; } + +static int do_sign_init(BIO *err, EVP_MD_CTX *ctx, EVP_PKEY *pkey, + const EVP_MD *md, STACK_OF(OPENSSL_STRING) *sigopts) + { + EVP_PKEY_CTX *pkctx = NULL; + int i; + EVP_MD_CTX_init(ctx); + if (!EVP_DigestSignInit(ctx, &pkctx, md, NULL, pkey)) + return 0; + for (i = 0; i < sk_OPENSSL_STRING_num(sigopts); i++) + { + char *sigopt = sk_OPENSSL_STRING_value(sigopts, i); + if (pkey_ctrl_string(pkctx, sigopt) <= 0) + { + BIO_printf(err, "parameter error \"%s\"\n", sigopt); + ERR_print_errors(bio_err); + return 0; + } + } + return 1; + } + +int do_X509_sign(BIO *err, X509 *x, EVP_PKEY *pkey, const EVP_MD *md, + STACK_OF(OPENSSL_STRING) *sigopts) + { + int rv; + EVP_MD_CTX mctx; + EVP_MD_CTX_init(&mctx); + rv = do_sign_init(err, &mctx, pkey, md, sigopts); + if (rv > 0) + rv = X509_sign_ctx(x, &mctx); + EVP_MD_CTX_cleanup(&mctx); + return rv > 0 ? 1 : 0; + } + + +int do_X509_REQ_sign(BIO *err, X509_REQ *x, EVP_PKEY *pkey, const EVP_MD *md, + STACK_OF(OPENSSL_STRING) *sigopts) + { + int rv; + EVP_MD_CTX mctx; + EVP_MD_CTX_init(&mctx); + rv = do_sign_init(err, &mctx, pkey, md, sigopts); + if (rv > 0) + rv = X509_REQ_sign_ctx(x, &mctx); + EVP_MD_CTX_cleanup(&mctx); + return rv > 0 ? 1 : 0; + } + + + +int do_X509_CRL_sign(BIO *err, X509_CRL *x, EVP_PKEY *pkey, const EVP_MD *md, + STACK_OF(OPENSSL_STRING) *sigopts) + { + int rv; + EVP_MD_CTX mctx; + EVP_MD_CTX_init(&mctx); + rv = do_sign_init(err, &mctx, pkey, md, sigopts); + if (rv > 0) + rv = X509_CRL_sign_ctx(x, &mctx); + EVP_MD_CTX_cleanup(&mctx); + return rv > 0 ? 1 : 0; + } + + diff --git a/crypto/openssl/apps/s_cb.c b/crypto/openssl/apps/s_cb.c index c4f5512247..2cd73376df 100644 --- a/crypto/openssl/apps/s_cb.c +++ b/crypto/openssl/apps/s_cb.c @@ -357,6 +357,12 @@ void MS_CALLBACK msg_cb(int write_p, int version, int content_type, const void * case TLS1_VERSION: str_version = "TLS 1.0 "; break; + case TLS1_1_VERSION: + str_version = "TLS 1.1 "; + break; + case TLS1_2_VERSION: + str_version = "TLS 1.2 "; + break; case DTLS1_VERSION: str_version = "DTLS 1.0 "; break; @@ -549,6 +555,9 @@ void MS_CALLBACK msg_cb(int write_p, int version, int content_type, const void * case 114: str_details2 = " bad_certificate_hash_value"; break; + case 115: + str_details2 = " unknown_psk_identity"; + break; } } } @@ -597,6 +606,26 @@ void MS_CALLBACK msg_cb(int write_p, int version, int content_type, const void * } } } + +#ifndef OPENSSL_NO_HEARTBEATS + if (content_type == 24) /* Heartbeat */ + { + str_details1 = ", Heartbeat"; + + if (len > 0) + { + switch (((const unsigned char*)buf)[0]) + { + case 1: + str_details1 = ", HeartbeatRequest"; + break; + case 2: + str_details1 = ", HeartbeatResponse"; + break; + } + } + } +#endif } BIO_printf(bio, "%s %s%s [length %04lx]%s%s\n", str_write_p, str_version, str_content_type, (unsigned long)len, str_details1, str_details2); @@ -657,6 +686,22 @@ void MS_CALLBACK tlsext_cb(SSL *s, int client_server, int type, extname = "status request"; break; + case TLSEXT_TYPE_user_mapping: + extname = "user mapping"; + break; + + case TLSEXT_TYPE_client_authz: + extname = "client authz"; + break; + + case TLSEXT_TYPE_server_authz: + extname = "server authz"; + break; + + case TLSEXT_TYPE_cert_type: + extname = "cert type"; + break; + case TLSEXT_TYPE_elliptic_curves: extname = "elliptic curves"; break; @@ -665,12 +710,28 @@ void MS_CALLBACK tlsext_cb(SSL *s, int client_server, int type, extname = "EC point formats"; break; + case TLSEXT_TYPE_srp: + extname = "SRP"; + break; + + case TLSEXT_TYPE_signature_algorithms: + extname = "signature algorithms"; + break; + + case TLSEXT_TYPE_use_srtp: + extname = "use SRTP"; + break; + + case TLSEXT_TYPE_heartbeat: + extname = "heartbeat"; + break; + case TLSEXT_TYPE_session_ticket: - extname = "server ticket"; + extname = "session ticket"; break; - case TLSEXT_TYPE_renegotiate: - extname = "renegotiate"; + case TLSEXT_TYPE_renegotiate: + extname = "renegotiation info"; break; #ifdef TLSEXT_TYPE_opaque_prf_input @@ -678,6 +739,11 @@ void MS_CALLBACK tlsext_cb(SSL *s, int client_server, int type, extname = "opaque PRF input"; break; #endif +#ifdef TLSEXT_TYPE_next_proto_neg + case TLSEXT_TYPE_next_proto_neg: + extname = "next protocol"; + break; +#endif default: extname = "unknown"; diff --git a/crypto/openssl/apps/s_client.c b/crypto/openssl/apps/s_client.c index 53be0f8f82..098cce27dd 100644 --- a/crypto/openssl/apps/s_client.c +++ b/crypto/openssl/apps/s_client.c @@ -163,6 +163,9 @@ typedef unsigned int u_int; #include #include #include +#ifndef OPENSSL_NO_SRP +#include +#endif #include "s_apps.h" #include "timeouts.h" @@ -203,6 +206,9 @@ static int c_status_req=0; static int c_msg=0; static int c_showcerts=0; +static char *keymatexportlabel=NULL; +static int keymatexportlen=20; + static void sc_usage(void); static void print_stuff(BIO *berr,SSL *con,int full); #ifndef OPENSSL_NO_TLSEXT @@ -315,13 +321,22 @@ static void sc_usage(void) # ifndef OPENSSL_NO_JPAKE BIO_printf(bio_err," -jpake arg - JPAKE secret to use\n"); # endif +#endif +#ifndef OPENSSL_NO_SRP + BIO_printf(bio_err," -srpuser user - SRP authentification for 'user'\n"); + BIO_printf(bio_err," -srppass arg - password for 'user'\n"); + BIO_printf(bio_err," -srp_lateuser - SRP username into second ClientHello message\n"); + BIO_printf(bio_err," -srp_moregroups - Tolerate other than the known g N values.\n"); + BIO_printf(bio_err," -srp_strength int - minimal mength in bits for N (default %d).\n",SRP_MINIMAL_N); #endif BIO_printf(bio_err," -ssl2 - just use SSLv2\n"); BIO_printf(bio_err," -ssl3 - just use SSLv3\n"); + BIO_printf(bio_err," -tls1_2 - just use TLSv1.2\n"); + BIO_printf(bio_err," -tls1_1 - just use TLSv1.1\n"); BIO_printf(bio_err," -tls1 - just use TLSv1\n"); BIO_printf(bio_err," -dtls1 - just use DTLSv1\n"); BIO_printf(bio_err," -mtu - set the link layer MTU\n"); - BIO_printf(bio_err," -no_tls1/-no_ssl3/-no_ssl2 - turn off that protocol\n"); + BIO_printf(bio_err," -no_tls1_2/-no_tls1_1/-no_tls1/-no_ssl3/-no_ssl2 - turn off that protocol\n"); BIO_printf(bio_err," -bugs - Switch on all SSL implementation bug workarounds\n"); BIO_printf(bio_err," -serverpref - Use server's cipher preferences (only SSLv2)\n"); BIO_printf(bio_err," -cipher - preferred cipher to use, use the 'openssl ciphers'\n"); @@ -342,8 +357,14 @@ static void sc_usage(void) BIO_printf(bio_err," -tlsextdebug - hex dump of all TLS extensions received\n"); BIO_printf(bio_err," -status - request certificate status from server\n"); BIO_printf(bio_err," -no_ticket - disable use of RFC4507bis session tickets\n"); +# if !defined(OPENSSL_NO_NEXTPROTONEG) + BIO_printf(bio_err," -nextprotoneg arg - enable NPN extension, considering named protocols supported (comma-separated list)\n"); +# endif #endif BIO_printf(bio_err," -legacy_renegotiation - enable use of legacy renegotiation (dangerous)\n"); + BIO_printf(bio_err," -use_srtp profiles - Offer SRTP key management with a colon-separated profile list\n"); + BIO_printf(bio_err," -keymatexport label - Export keying material using label\n"); + BIO_printf(bio_err," -keymatexportlen len - Export len bytes of keying material (default 20)\n"); } #ifndef OPENSSL_NO_TLSEXT @@ -366,6 +387,156 @@ static int MS_CALLBACK ssl_servername_cb(SSL *s, int *ad, void *arg) return SSL_TLSEXT_ERR_OK; } + +#ifndef OPENSSL_NO_SRP + +/* This is a context that we pass to all callbacks */ +typedef struct srp_arg_st + { + char *srppassin; + char *srplogin; + int msg; /* copy from c_msg */ + int debug; /* copy from c_debug */ + int amp; /* allow more groups */ + int strength /* minimal size for N */ ; + } SRP_ARG; + +#define SRP_NUMBER_ITERATIONS_FOR_PRIME 64 + +static int srp_Verify_N_and_g(BIGNUM *N, BIGNUM *g) + { + BN_CTX *bn_ctx = BN_CTX_new(); + BIGNUM *p = BN_new(); + BIGNUM *r = BN_new(); + int ret = + g != NULL && N != NULL && bn_ctx != NULL && BN_is_odd(N) && + BN_is_prime_ex(N, SRP_NUMBER_ITERATIONS_FOR_PRIME, bn_ctx, NULL) && + p != NULL && BN_rshift1(p, N) && + + /* p = (N-1)/2 */ + BN_is_prime_ex(p, SRP_NUMBER_ITERATIONS_FOR_PRIME, bn_ctx, NULL) && + r != NULL && + + /* verify g^((N-1)/2) == -1 (mod N) */ + BN_mod_exp(r, g, p, N, bn_ctx) && + BN_add_word(r, 1) && + BN_cmp(r, N) == 0; + + if(r) + BN_free(r); + if(p) + BN_free(p); + if(bn_ctx) + BN_CTX_free(bn_ctx); + return ret; + } + +/* This callback is used here for two purposes: + - extended debugging + - making some primality tests for unknown groups + The callback is only called for a non default group. + + An application does not need the call back at all if + only the stanard groups are used. In real life situations, + client and server already share well known groups, + thus there is no need to verify them. + Furthermore, in case that a server actually proposes a group that + is not one of those defined in RFC 5054, it is more appropriate + to add the group to a static list and then compare since + primality tests are rather cpu consuming. +*/ + +static int MS_CALLBACK ssl_srp_verify_param_cb(SSL *s, void *arg) + { + SRP_ARG *srp_arg = (SRP_ARG *)arg; + BIGNUM *N = NULL, *g = NULL; + if (!(N = SSL_get_srp_N(s)) || !(g = SSL_get_srp_g(s))) + return 0; + if (srp_arg->debug || srp_arg->msg || srp_arg->amp == 1) + { + BIO_printf(bio_err, "SRP parameters:\n"); + BIO_printf(bio_err,"\tN="); BN_print(bio_err,N); + BIO_printf(bio_err,"\n\tg="); BN_print(bio_err,g); + BIO_printf(bio_err,"\n"); + } + + if (SRP_check_known_gN_param(g,N)) + return 1; + + if (srp_arg->amp == 1) + { + if (srp_arg->debug) + BIO_printf(bio_err, "SRP param N and g are not known params, going to check deeper.\n"); + +/* The srp_moregroups is a real debugging feature. + Implementors should rather add the value to the known ones. + The minimal size has already been tested. +*/ + if (BN_num_bits(g) <= BN_BITS && srp_Verify_N_and_g(N,g)) + return 1; + } + BIO_printf(bio_err, "SRP param N and g rejected.\n"); + return 0; + } + +#define PWD_STRLEN 1024 + +static char * MS_CALLBACK ssl_give_srp_client_pwd_cb(SSL *s, void *arg) + { + SRP_ARG *srp_arg = (SRP_ARG *)arg; + char *pass = (char *)OPENSSL_malloc(PWD_STRLEN+1); + PW_CB_DATA cb_tmp; + int l; + + cb_tmp.password = (char *)srp_arg->srppassin; + cb_tmp.prompt_info = "SRP user"; + if ((l = password_callback(pass, PWD_STRLEN, 0, &cb_tmp))<0) + { + BIO_printf (bio_err, "Can't read Password\n"); + OPENSSL_free(pass); + return NULL; + } + *(pass+l)= '\0'; + + return pass; + } + +#endif + char *srtp_profiles = NULL; + +# ifndef OPENSSL_NO_NEXTPROTONEG +/* This the context that we pass to next_proto_cb */ +typedef struct tlsextnextprotoctx_st { + unsigned char *data; + unsigned short len; + int status; +} tlsextnextprotoctx; + +static tlsextnextprotoctx next_proto; + +static int next_proto_cb(SSL *s, unsigned char **out, unsigned char *outlen, const unsigned char *in, unsigned int inlen, void *arg) + { + tlsextnextprotoctx *ctx = arg; + + if (!c_quiet) + { + /* We can assume that |in| is syntactically valid. */ + unsigned i; + BIO_printf(bio_c_out, "Protocols advertised by server: "); + for (i = 0; i < inlen; ) + { + if (i) + BIO_write(bio_c_out, ", ", 2); + BIO_write(bio_c_out, &in[i + 1], in[i]); + i += in[i] + 1; + } + BIO_write(bio_c_out, "\n", 1); + } + + ctx->status = SSL_select_next_proto(out, outlen, in, inlen, ctx->data, ctx->len); + return SSL_TLSEXT_ERR_OK; + } +# endif #endif enum @@ -384,6 +555,9 @@ int MAIN(int argc, char **argv) { unsigned int off=0, clr=0; SSL *con=NULL; +#ifndef OPENSSL_NO_KRB5 + KSSL_CTX *kctx; +#endif int s,k,width,state=0; char *cbuf=NULL,*sbuf=NULL,*mbuf=NULL; int cbuf_len,cbuf_off; @@ -429,6 +603,9 @@ int MAIN(int argc, char **argv) char *servername = NULL; tlsextctx tlsextcbp = {NULL,0}; +# ifndef OPENSSL_NO_NEXTPROTONEG + const char *next_proto_neg_in = NULL; +# endif #endif char *sess_in = NULL; char *sess_out = NULL; @@ -439,6 +616,11 @@ int MAIN(int argc, char **argv) #ifndef OPENSSL_NO_JPAKE char *jpake_secret = NULL; #endif +#ifndef OPENSSL_NO_SRP + char * srppass = NULL; + int srp_lateuser = 0; + SRP_ARG srp_arg = {NULL,NULL,0,0,0,1024}; +#endif #if !defined(OPENSSL_NO_SSL2) && !defined(OPENSSL_NO_SSL3) meth=SSLv23_client_method(); @@ -588,6 +770,37 @@ int MAIN(int argc, char **argv) } } #endif +#ifndef OPENSSL_NO_SRP + else if (strcmp(*argv,"-srpuser") == 0) + { + if (--argc < 1) goto bad; + srp_arg.srplogin= *(++argv); + meth=TLSv1_client_method(); + } + else if (strcmp(*argv,"-srppass") == 0) + { + if (--argc < 1) goto bad; + srppass= *(++argv); + meth=TLSv1_client_method(); + } + else if (strcmp(*argv,"-srp_strength") == 0) + { + if (--argc < 1) goto bad; + srp_arg.strength=atoi(*(++argv)); + BIO_printf(bio_err,"SRP minimal length for N is %d\n",srp_arg.strength); + meth=TLSv1_client_method(); + } + else if (strcmp(*argv,"-srp_lateuser") == 0) + { + srp_lateuser= 1; + meth=TLSv1_client_method(); + } + else if (strcmp(*argv,"-srp_moregroups") == 0) + { + srp_arg.amp=1; + meth=TLSv1_client_method(); + } +#endif #ifndef OPENSSL_NO_SSL2 else if (strcmp(*argv,"-ssl2") == 0) meth=SSLv2_client_method(); @@ -597,6 +810,10 @@ int MAIN(int argc, char **argv) meth=SSLv3_client_method(); #endif #ifndef OPENSSL_NO_TLS1 + else if (strcmp(*argv,"-tls1_2") == 0) + meth=TLSv1_2_client_method(); + else if (strcmp(*argv,"-tls1_1") == 0) + meth=TLSv1_1_client_method(); else if (strcmp(*argv,"-tls1") == 0) meth=TLSv1_client_method(); #endif @@ -645,6 +862,10 @@ int MAIN(int argc, char **argv) if (--argc < 1) goto bad; CAfile= *(++argv); } + else if (strcmp(*argv,"-no_tls1_2") == 0) + off|=SSL_OP_NO_TLSv1_2; + else if (strcmp(*argv,"-no_tls1_1") == 0) + off|=SSL_OP_NO_TLSv1_1; else if (strcmp(*argv,"-no_tls1") == 0) off|=SSL_OP_NO_TLSv1; else if (strcmp(*argv,"-no_ssl3") == 0) @@ -656,6 +877,13 @@ int MAIN(int argc, char **argv) #ifndef OPENSSL_NO_TLSEXT else if (strcmp(*argv,"-no_ticket") == 0) { off|=SSL_OP_NO_TICKET; } +# ifndef OPENSSL_NO_NEXTPROTONEG + else if (strcmp(*argv,"-nextprotoneg") == 0) + { + if (--argc < 1) goto bad; + next_proto_neg_in = *(++argv); + } +# endif #endif else if (strcmp(*argv,"-serverpref") == 0) off|=SSL_OP_CIPHER_SERVER_PREFERENCE; @@ -723,7 +951,23 @@ int MAIN(int argc, char **argv) jpake_secret = *++argv; } #endif - else + else if (strcmp(*argv,"-use_srtp") == 0) + { + if (--argc < 1) goto bad; + srtp_profiles = *(++argv); + } + else if (strcmp(*argv,"-keymatexport") == 0) + { + if (--argc < 1) goto bad; + keymatexportlabel= *(++argv); + } + else if (strcmp(*argv,"-keymatexportlen") == 0) + { + if (--argc < 1) goto bad; + keymatexportlen=atoi(*(++argv)); + if (keymatexportlen == 0) goto bad; + } + else { BIO_printf(bio_err,"unknown option %s\n",*argv); badop=1; @@ -749,19 +993,33 @@ bad: goto end; } psk_identity = "JPAKE"; + if (cipher) + { + BIO_printf(bio_err, "JPAKE sets cipher to PSK\n"); + goto end; + } + cipher = "PSK"; } - - if (cipher) - { - BIO_printf(bio_err, "JPAKE sets cipher to PSK\n"); - goto end; - } - cipher = "PSK"; #endif OpenSSL_add_ssl_algorithms(); SSL_load_error_strings(); +#if !defined(OPENSSL_NO_TLSEXT) && !defined(OPENSSL_NO_NEXTPROTONEG) + next_proto.status = -1; + if (next_proto_neg_in) + { + next_proto.data = next_protos_parse(&next_proto.len, next_proto_neg_in); + if (next_proto.data == NULL) + { + BIO_printf(bio_err, "Error parsing -nextprotoneg argument\n"); + goto end; + } + } + else + next_proto.data = NULL; +#endif + #ifndef OPENSSL_NO_ENGINE e = setup_engine(bio_err, engine_id, 1); if (ssl_client_engine_id) @@ -835,6 +1093,14 @@ bad: } } +#ifndef OPENSSL_NO_SRP + if(!app_passwd(bio_err, srppass, NULL, &srp_arg.srppassin, NULL)) + { + BIO_printf(bio_err, "Error getting password\n"); + goto end; + } +#endif + ctx=SSL_CTX_new(meth); if (ctx == NULL) { @@ -870,6 +1136,8 @@ bad: BIO_printf(bio_c_out, "PSK key given or JPAKE in use, setting client callback\n"); SSL_CTX_set_psk_client_callback(ctx, psk_client_cb); } + if (srtp_profiles != NULL) + SSL_CTX_set_tlsext_use_srtp(ctx, srtp_profiles); #endif if (bugs) SSL_CTX_set_options(ctx,SSL_OP_ALL|off); @@ -883,6 +1151,11 @@ bad: */ if (socket_type == SOCK_DGRAM) SSL_CTX_set_read_ahead(ctx, 1); +#if !defined(OPENSSL_NO_TLSEXT) && !defined(OPENSSL_NO_NEXTPROTONEG) + if (next_proto.data) + SSL_CTX_set_next_proto_select_cb(ctx, next_proto_cb, &next_proto); +#endif + if (state) SSL_CTX_set_info_callback(ctx,apps_ssl_info_callback); if (cipher != NULL) if(!SSL_CTX_set_cipher_list(ctx,cipher)) { @@ -914,6 +1187,24 @@ bad: SSL_CTX_set_tlsext_servername_callback(ctx, ssl_servername_cb); SSL_CTX_set_tlsext_servername_arg(ctx, &tlsextcbp); } +#ifndef OPENSSL_NO_SRP + if (srp_arg.srplogin) + { + if (!srp_lateuser && !SSL_CTX_set_srp_username(ctx, srp_arg.srplogin)) + { + BIO_printf(bio_err,"Unable to set SRP username\n"); + goto end; + } + srp_arg.msg = c_msg; + srp_arg.debug = c_debug ; + SSL_CTX_set_srp_cb_arg(ctx,&srp_arg); + SSL_CTX_set_srp_client_pwd_callback(ctx, ssl_give_srp_client_pwd_cb); + SSL_CTX_set_srp_strength(ctx, srp_arg.strength); + if (c_msg || c_debug || srp_arg.amp == 0) + SSL_CTX_set_srp_verify_param_callback(ctx, ssl_srp_verify_param_cb); + } + +#endif #endif con=SSL_new(ctx); @@ -952,9 +1243,10 @@ bad: } #endif #ifndef OPENSSL_NO_KRB5 - if (con && (con->kssl_ctx = kssl_ctx_new()) != NULL) + if (con && (kctx = kssl_ctx_new()) != NULL) { - kssl_ctx_setstring(con->kssl_ctx, KSSL_SERVER, host); + SSL_set0_kssl_ctx(con, kctx); + kssl_ctx_setstring(kctx, KSSL_SERVER, host); } #endif /* OPENSSL_NO_KRB5 */ /* SSL_set_cipher_list(con,"RC4-MD5"); */ @@ -986,7 +1278,7 @@ re_start: } } #endif - if (c_Pause & 0x01) con->debug=1; + if (c_Pause & 0x01) SSL_set_debug(con, 1); if ( SSL_version(con) == DTLS1_VERSION) { @@ -1035,7 +1327,7 @@ re_start: if (c_debug) { - con->debug=1; + SSL_set_debug(con, 1); BIO_set_callback(sbio,bio_dump_callback); BIO_set_callback_arg(sbio,(char *)bio_c_out); } @@ -1569,6 +1861,14 @@ printf("read=%d pending=%d peek=%d\n",k,SSL_pending(con),SSL_peek(con,zbuf,10240 SSL_renegotiate(con); cbuf_len=0; } +#ifndef OPENSSL_NO_HEARTBEATS + else if ((!c_ign_eof) && (cbuf[0] == 'B')) + { + BIO_printf(bio_err,"HEARTBEATING\n"); + SSL_heartbeat(con); + cbuf_len=0; + } +#endif else { cbuf_len=i; @@ -1630,6 +1930,7 @@ static void print_stuff(BIO *bio, SSL *s, int full) #ifndef OPENSSL_NO_COMP const COMP_METHOD *comp, *expansion; #endif + unsigned char *exportedkeymat; if (full) { @@ -1720,7 +2021,7 @@ static void print_stuff(BIO *bio, SSL *s, int full) BIO_number_read(SSL_get_rbio(s)), BIO_number_written(SSL_get_wbio(s))); } - BIO_printf(bio,((s->hit)?"---\nReused, ":"---\nNew, ")); + BIO_printf(bio,(SSL_cache_hit(s)?"---\nReused, ":"---\nNew, ")); c=SSL_get_current_cipher(s); BIO_printf(bio,"%s, Cipher is %s\n", SSL_CIPHER_get_version(c), @@ -1742,7 +2043,66 @@ static void print_stuff(BIO *bio, SSL *s, int full) BIO_printf(bio,"Expansion: %s\n", expansion ? SSL_COMP_get_name(expansion) : "NONE"); #endif + +#ifdef SSL_DEBUG + { + /* Print out local port of connection: useful for debugging */ + int sock; + struct sockaddr_in ladd; + socklen_t ladd_size = sizeof(ladd); + sock = SSL_get_fd(s); + getsockname(sock, (struct sockaddr *)&ladd, &ladd_size); + BIO_printf(bio_c_out, "LOCAL PORT is %u\n", ntohs(ladd.sin_port)); + } +#endif + +#if !defined(OPENSSL_NO_TLSEXT) && !defined(OPENSSL_NO_NEXTPROTONEG) + if (next_proto.status != -1) { + const unsigned char *proto; + unsigned int proto_len; + SSL_get0_next_proto_negotiated(s, &proto, &proto_len); + BIO_printf(bio, "Next protocol: (%d) ", next_proto.status); + BIO_write(bio, proto, proto_len); + BIO_write(bio, "\n", 1); + } +#endif + + { + SRTP_PROTECTION_PROFILE *srtp_profile=SSL_get_selected_srtp_profile(s); + + if(srtp_profile) + BIO_printf(bio,"SRTP Extension negotiated, profile=%s\n", + srtp_profile->name); + } + SSL_SESSION_print(bio,SSL_get_session(s)); + if (keymatexportlabel != NULL) + { + BIO_printf(bio, "Keying material exporter:\n"); + BIO_printf(bio, " Label: '%s'\n", keymatexportlabel); + BIO_printf(bio, " Length: %i bytes\n", keymatexportlen); + exportedkeymat = OPENSSL_malloc(keymatexportlen); + if (exportedkeymat != NULL) + { + if (!SSL_export_keying_material(s, exportedkeymat, + keymatexportlen, + keymatexportlabel, + strlen(keymatexportlabel), + NULL, 0, 0)) + { + BIO_printf(bio, " Error\n"); + } + else + { + BIO_printf(bio, " Keying material: "); + for (i=0; i #endif +#ifndef OPENSSL_NO_SRP +#include +#endif #include "s_apps.h" #include "timeouts.h" @@ -290,6 +293,9 @@ static int cert_status_cb(SSL *s, void *arg); static int s_msg=0; static int s_quiet=0; +static char *keymatexportlabel=NULL; +static int keymatexportlen=20; + static int hack=0; #ifndef OPENSSL_NO_ENGINE static char *engine_id=NULL; @@ -302,6 +308,7 @@ static long socket_mtu; static int cert_chain = 0; #endif + #ifndef OPENSSL_NO_PSK static char *psk_identity="Client_identity"; char *psk_key=NULL; /* by default PSK is not used */ @@ -369,6 +376,52 @@ static unsigned int psk_server_cb(SSL *ssl, const char *identity, } #endif +#ifndef OPENSSL_NO_SRP +/* This is a context that we pass to callbacks */ +typedef struct srpsrvparm_st + { + char *login; + SRP_VBASE *vb; + SRP_user_pwd *user; + } srpsrvparm; + +/* This callback pretends to require some asynchronous logic in order to obtain + a verifier. When the callback is called for a new connection we return + with a negative value. This will provoke the accept etc to return with + an LOOKUP_X509. The main logic of the reinvokes the suspended call + (which would normally occur after a worker has finished) and we + set the user parameters. +*/ +static int MS_CALLBACK ssl_srp_server_param_cb(SSL *s, int *ad, void *arg) + { + srpsrvparm *p = (srpsrvparm *)arg; + if (p->login == NULL && p->user == NULL ) + { + p->login = SSL_get_srp_username(s); + BIO_printf(bio_err, "SRP username = \"%s\"\n", p->login); + return (-1) ; + } + + if (p->user == NULL) + { + BIO_printf(bio_err, "User %s doesn't exist\n", p->login); + return SSL3_AL_FATAL; + } + if (SSL_set_srp_server_param(s, p->user->N, p->user->g, p->user->s, p->user->v, + p->user->info) < 0) + { + *ad = SSL_AD_INTERNAL_ERROR; + return SSL3_AL_FATAL; + } + BIO_printf(bio_err, "SRP parameters set: username = \"%s\" info=\"%s\" \n", p->login,p->user->info); + /* need to check whether there are memory leaks */ + p->user = NULL; + p->login = NULL; + return SSL_ERROR_NONE; + } + +#endif + #ifdef MONOLITH static void s_server_init(void) { @@ -455,9 +508,15 @@ static void sv_usage(void) # ifndef OPENSSL_NO_JPAKE BIO_printf(bio_err," -jpake arg - JPAKE secret to use\n"); # endif +#endif +#ifndef OPENSSL_NO_SRP + BIO_printf(bio_err," -srpvfile file - The verifier file for SRP\n"); + BIO_printf(bio_err," -srpuserseed string - A seed string for a default user salt.\n"); #endif BIO_printf(bio_err," -ssl2 - Just talk SSLv2\n"); BIO_printf(bio_err," -ssl3 - Just talk SSLv3\n"); + BIO_printf(bio_err," -tls1_2 - Just talk TLSv1.2\n"); + BIO_printf(bio_err," -tls1_1 - Just talk TLSv1.1\n"); BIO_printf(bio_err," -tls1 - Just talk TLSv1\n"); BIO_printf(bio_err," -dtls1 - Just talk DTLSv1\n"); BIO_printf(bio_err," -timeout - Enable timeouts\n"); @@ -466,6 +525,8 @@ static void sv_usage(void) BIO_printf(bio_err," -no_ssl2 - Just disable SSLv2\n"); BIO_printf(bio_err," -no_ssl3 - Just disable SSLv3\n"); BIO_printf(bio_err," -no_tls1 - Just disable TLSv1\n"); + BIO_printf(bio_err," -no_tls1_1 - Just disable TLSv1.1\n"); + BIO_printf(bio_err," -no_tls1_2 - Just disable TLSv1.2\n"); #ifndef OPENSSL_NO_DH BIO_printf(bio_err," -no_dhe - Disable ephemeral DH\n"); #endif @@ -492,7 +553,13 @@ static void sv_usage(void) BIO_printf(bio_err," -tlsextdebug - hex dump of all TLS extensions received\n"); BIO_printf(bio_err," -no_ticket - disable use of RFC4507bis session tickets\n"); BIO_printf(bio_err," -legacy_renegotiation - enable use of legacy renegotiation (dangerous)\n"); +# ifndef OPENSSL_NO_NEXTPROTONEG + BIO_printf(bio_err," -nextprotoneg arg - set the advertised protocols for the NPN extension (comma-separated list)\n"); +# endif + BIO_printf(bio_err," -use_srtp profiles - Offer SRTP key management with a colon-separated profile list\n"); #endif + BIO_printf(bio_err," -keymatexport label - Export keying material using label\n"); + BIO_printf(bio_err," -keymatexportlen len - Export len bytes of keying material (default 20)\n"); } static int local_argc=0; @@ -826,6 +893,26 @@ BIO_printf(err, "cert_status: received %d ids\n", sk_OCSP_RESPID_num(ids)); ret = SSL_TLSEXT_ERR_ALERT_FATAL; goto done; } + +# ifndef OPENSSL_NO_NEXTPROTONEG +/* This is the context that we pass to next_proto_cb */ +typedef struct tlsextnextprotoctx_st { + unsigned char *data; + unsigned int len; +} tlsextnextprotoctx; + +static int next_proto_cb(SSL *s, const unsigned char **data, unsigned int *len, void *arg) + { + tlsextnextprotoctx *next_proto = arg; + + *data = next_proto->data; + *len = next_proto->len; + + return SSL_TLSEXT_ERR_OK; + } +# endif /* ndef OPENSSL_NO_NEXTPROTONEG */ + + #endif int MAIN(int, char **); @@ -833,6 +920,10 @@ int MAIN(int, char **); #ifndef OPENSSL_NO_JPAKE static char *jpake_secret = NULL; #endif +#ifndef OPENSSL_NO_SRP + static srpsrvparm srp_callback_parm; +#endif +static char *srtp_profiles = NULL; int MAIN(int argc, char *argv[]) { @@ -864,20 +955,30 @@ int MAIN(int argc, char *argv[]) #ifndef OPENSSL_NO_TLSEXT EVP_PKEY *s_key2 = NULL; X509 *s_cert2 = NULL; -#endif -#ifndef OPENSSL_NO_TLSEXT tlsextctx tlsextcbp = {NULL, NULL, SSL_TLSEXT_ERR_ALERT_WARNING}; +# ifndef OPENSSL_NO_NEXTPROTONEG + const char *next_proto_neg_in = NULL; + tlsextnextprotoctx next_proto; +# endif #endif #ifndef OPENSSL_NO_PSK /* by default do not send a PSK identity hint */ static char *psk_identity_hint=NULL; #endif +#ifndef OPENSSL_NO_SRP + char *srpuserseed = NULL; + char *srp_verifier_file = NULL; +#endif #if !defined(OPENSSL_NO_SSL2) && !defined(OPENSSL_NO_SSL3) meth=SSLv23_server_method(); #elif !defined(OPENSSL_NO_SSL3) meth=SSLv3_server_method(); #elif !defined(OPENSSL_NO_SSL2) meth=SSLv2_server_method(); +#elif !defined(OPENSSL_NO_TLS1) + meth=TLSv1_server_method(); +#else + /* #error no SSL version enabled */ #endif local_argc=argc; @@ -1109,6 +1210,20 @@ int MAIN(int argc, char *argv[]) goto bad; } } +#endif +#ifndef OPENSSL_NO_SRP + else if (strcmp(*argv, "-srpvfile") == 0) + { + if (--argc < 1) goto bad; + srp_verifier_file = *(++argv); + meth=TLSv1_server_method(); + } + else if (strcmp(*argv, "-srpuserseed") == 0) + { + if (--argc < 1) goto bad; + srpuserseed = *(++argv); + meth=TLSv1_server_method(); + } #endif else if (strcmp(*argv,"-www") == 0) { www=1; } @@ -1122,6 +1237,10 @@ int MAIN(int argc, char *argv[]) { off|=SSL_OP_NO_SSLv3; } else if (strcmp(*argv,"-no_tls1") == 0) { off|=SSL_OP_NO_TLSv1; } + else if (strcmp(*argv,"-no_tls1_1") == 0) + { off|=SSL_OP_NO_TLSv1_1; } + else if (strcmp(*argv,"-no_tls1_2") == 0) + { off|=SSL_OP_NO_TLSv1_2; } else if (strcmp(*argv,"-no_comp") == 0) { off|=SSL_OP_NO_COMPRESSION; } #ifndef OPENSSL_NO_TLSEXT @@ -1139,6 +1258,10 @@ int MAIN(int argc, char *argv[]) #ifndef OPENSSL_NO_TLS1 else if (strcmp(*argv,"-tls1") == 0) { meth=TLSv1_server_method(); } + else if (strcmp(*argv,"-tls1_1") == 0) + { meth=TLSv1_1_server_method(); } + else if (strcmp(*argv,"-tls1_2") == 0) + { meth=TLSv1_2_server_method(); } #endif #ifndef OPENSSL_NO_DTLS1 else if (strcmp(*argv,"-dtls1") == 0) @@ -1191,7 +1314,13 @@ int MAIN(int argc, char *argv[]) if (--argc < 1) goto bad; s_key_file2= *(++argv); } - +# ifndef OPENSSL_NO_NEXTPROTONEG + else if (strcmp(*argv,"-nextprotoneg") == 0) + { + if (--argc < 1) goto bad; + next_proto_neg_in = *(++argv); + } +# endif #endif #if !defined(OPENSSL_NO_JPAKE) && !defined(OPENSSL_NO_PSK) else if (strcmp(*argv,"-jpake") == 0) @@ -1200,6 +1329,22 @@ int MAIN(int argc, char *argv[]) jpake_secret = *(++argv); } #endif + else if (strcmp(*argv,"-use_srtp") == 0) + { + if (--argc < 1) goto bad; + srtp_profiles = *(++argv); + } + else if (strcmp(*argv,"-keymatexport") == 0) + { + if (--argc < 1) goto bad; + keymatexportlabel= *(++argv); + } + else if (strcmp(*argv,"-keymatexportlen") == 0) + { + if (--argc < 1) goto bad; + keymatexportlen=atoi(*(++argv)); + if (keymatexportlen == 0) goto bad; + } else { BIO_printf(bio_err,"unknown option %s\n",*argv); @@ -1296,6 +1441,22 @@ bad: goto end; } } + +# ifndef OPENSSL_NO_NEXTPROTONEG + if (next_proto_neg_in) + { + unsigned short len; + next_proto.data = next_protos_parse(&len, + next_proto_neg_in); + if (next_proto.data == NULL) + goto end; + next_proto.len = len; + } + else + { + next_proto.data = NULL; + } +# endif #endif } @@ -1399,6 +1560,9 @@ bad: else SSL_CTX_sess_set_cache_size(ctx,128); + if (srtp_profiles != NULL) + SSL_CTX_set_tlsext_use_srtp(ctx, srtp_profiles); + #if 0 if (cipher == NULL) cipher=getenv("SSL_CIPHER"); #endif @@ -1476,6 +1640,11 @@ bad: if (vpm) SSL_CTX_set1_param(ctx2, vpm); } + +# ifndef OPENSSL_NO_NEXTPROTONEG + if (next_proto.data) + SSL_CTX_set_next_protos_advertised_cb(ctx, next_proto_cb, &next_proto); +# endif #endif #ifndef OPENSSL_NO_DH @@ -1684,6 +1853,25 @@ bad: } #endif +#ifndef OPENSSL_NO_SRP + if (srp_verifier_file != NULL) + { + srp_callback_parm.vb = SRP_VBASE_new(srpuserseed); + srp_callback_parm.user = NULL; + srp_callback_parm.login = NULL; + if ((ret = SRP_VBASE_init(srp_callback_parm.vb, srp_verifier_file)) != SRP_NO_ERROR) + { + BIO_printf(bio_err, + "Cannot initialize SRP verifier file \"%s\":ret=%d\n", + srp_verifier_file, ret); + goto end; + } + SSL_CTX_set_verify(ctx, SSL_VERIFY_NONE,verify_callback); + SSL_CTX_set_srp_cb_arg(ctx, &srp_callback_parm); + SSL_CTX_set_srp_username_callback(ctx, ssl_srp_server_param_cb); + } + else +#endif if (CAfile != NULL) { SSL_CTX_set_client_CA_list(ctx,SSL_load_client_CA_file(CAfile)); @@ -1765,6 +1953,9 @@ static int sv_body(char *hostname, int s, unsigned char *context) unsigned long l; SSL *con=NULL; BIO *sbio; +#ifndef OPENSSL_NO_KRB5 + KSSL_CTX *kctx; +#endif struct timeval timeout; #if defined(OPENSSL_SYS_WINDOWS) || defined(OPENSSL_SYS_MSDOS) || defined(OPENSSL_SYS_NETWARE) || defined(OPENSSL_SYS_BEOS_R5) struct timeval tv; @@ -1805,12 +1996,11 @@ static int sv_body(char *hostname, int s, unsigned char *context) } #endif #ifndef OPENSSL_NO_KRB5 - if ((con->kssl_ctx = kssl_ctx_new()) != NULL) + if ((kctx = kssl_ctx_new()) != NULL) { - kssl_ctx_setstring(con->kssl_ctx, KSSL_SERVICE, - KRB5SVC); - kssl_ctx_setstring(con->kssl_ctx, KSSL_KEYTAB, - KRB5KEYTAB); + SSL_set0_kssl_ctx(con, kctx); + kssl_ctx_setstring(kctx, KSSL_SERVICE, KRB5SVC); + kssl_ctx_setstring(kctx, KSSL_KEYTAB, KRB5KEYTAB); } #endif /* OPENSSL_NO_KRB5 */ if(context) @@ -1873,7 +2063,7 @@ static int sv_body(char *hostname, int s, unsigned char *context) if (s_debug) { - con->debug=1; + SSL_set_debug(con, 1); BIO_set_callback(SSL_get_rbio(con),bio_dump_callback); BIO_set_callback_arg(SSL_get_rbio(con),(char *)bio_s_out); } @@ -2002,6 +2192,16 @@ static int sv_body(char *hostname, int s, unsigned char *context) goto err; } +#ifndef OPENSSL_NO_HEARTBEATS + if ((buf[0] == 'B') && + ((buf[1] == '\n') || (buf[1] == '\r'))) + { + BIO_printf(bio_err,"HEARTBEATING\n"); + SSL_heartbeat(con); + i=0; + continue; + } +#endif if ((buf[0] == 'r') && ((buf[1] == '\n') || (buf[1] == '\r'))) { @@ -2045,6 +2245,18 @@ static int sv_body(char *hostname, int s, unsigned char *context) { static count=0; if (++count == 100) { count=0; SSL_renegotiate(con); } } #endif k=SSL_write(con,&(buf[l]),(unsigned int)i); +#ifndef OPENSSL_NO_SRP + while (SSL_get_error(con,k) == SSL_ERROR_WANT_X509_LOOKUP) + { + BIO_printf(bio_s_out,"LOOKUP renego during write\n"); + srp_callback_parm.user = SRP_VBASE_get_by_user(srp_callback_parm.vb, srp_callback_parm.login); + if (srp_callback_parm.user) + BIO_printf(bio_s_out,"LOOKUP done %s\n",srp_callback_parm.user->info); + else + BIO_printf(bio_s_out,"LOOKUP not successful\n"); + k=SSL_write(con,&(buf[l]),(unsigned int)i); + } +#endif switch (SSL_get_error(con,k)) { case SSL_ERROR_NONE: @@ -2092,6 +2304,18 @@ static int sv_body(char *hostname, int s, unsigned char *context) { again: i=SSL_read(con,(char *)buf,bufsize); +#ifndef OPENSSL_NO_SRP + while (SSL_get_error(con,i) == SSL_ERROR_WANT_X509_LOOKUP) + { + BIO_printf(bio_s_out,"LOOKUP renego during read\n"); + srp_callback_parm.user = SRP_VBASE_get_by_user(srp_callback_parm.vb, srp_callback_parm.login); + if (srp_callback_parm.user) + BIO_printf(bio_s_out,"LOOKUP done %s\n",srp_callback_parm.user->info); + else + BIO_printf(bio_s_out,"LOOKUP not successful\n"); + i=SSL_read(con,(char *)buf,bufsize); + } +#endif switch (SSL_get_error(con,i)) { case SSL_ERROR_NONE: @@ -2104,7 +2328,6 @@ again: break; case SSL_ERROR_WANT_WRITE: case SSL_ERROR_WANT_READ: - case SSL_ERROR_WANT_X509_LOOKUP: BIO_printf(bio_s_out,"Read BLOCK\n"); break; case SSL_ERROR_SYSCALL: @@ -2159,8 +2382,30 @@ static int init_ssl_connection(SSL *con) X509 *peer; long verify_error; MS_STATIC char buf[BUFSIZ]; +#ifndef OPENSSL_NO_KRB5 + char *client_princ; +#endif +#if !defined(OPENSSL_NO_TLSEXT) && !defined(OPENSSL_NO_NEXTPROTONEG) + const unsigned char *next_proto_neg; + unsigned next_proto_neg_len; +#endif + unsigned char *exportedkeymat; - if ((i=SSL_accept(con)) <= 0) + + i=SSL_accept(con); +#ifndef OPENSSL_NO_SRP + while (i <= 0 && SSL_get_error(con,i) == SSL_ERROR_WANT_X509_LOOKUP) + { + BIO_printf(bio_s_out,"LOOKUP during accept %s\n",srp_callback_parm.login); + srp_callback_parm.user = SRP_VBASE_get_by_user(srp_callback_parm.vb, srp_callback_parm.login); + if (srp_callback_parm.user) + BIO_printf(bio_s_out,"LOOKUP done %s\n",srp_callback_parm.user->info); + else + BIO_printf(bio_s_out,"LOOKUP not successful\n"); + i=SSL_accept(con); + } +#endif + if (i <= 0) { if (BIO_sock_should_retry(i)) { @@ -2198,19 +2443,67 @@ static int init_ssl_connection(SSL *con) BIO_printf(bio_s_out,"Shared ciphers:%s\n",buf); str=SSL_CIPHER_get_name(SSL_get_current_cipher(con)); BIO_printf(bio_s_out,"CIPHER is %s\n",(str != NULL)?str:"(NONE)"); - if (con->hit) BIO_printf(bio_s_out,"Reused session-id\n"); +#if !defined(OPENSSL_NO_TLSEXT) && !defined(OPENSSL_NO_NEXTPROTONEG) + SSL_get0_next_proto_negotiated(con, &next_proto_neg, &next_proto_neg_len); + if (next_proto_neg) + { + BIO_printf(bio_s_out,"NEXTPROTO is "); + BIO_write(bio_s_out, next_proto_neg, next_proto_neg_len); + BIO_printf(bio_s_out, "\n"); + } +#endif + { + SRTP_PROTECTION_PROFILE *srtp_profile + = SSL_get_selected_srtp_profile(con); + + if(srtp_profile) + BIO_printf(bio_s_out,"SRTP Extension negotiated, profile=%s\n", + srtp_profile->name); + } + if (SSL_cache_hit(con)) BIO_printf(bio_s_out,"Reused session-id\n"); if (SSL_ctrl(con,SSL_CTRL_GET_FLAGS,0,NULL) & TLS1_FLAGS_TLS_PADDING_BUG) - BIO_printf(bio_s_out,"Peer has incorrect TLSv1 block padding\n"); + BIO_printf(bio_s_out, + "Peer has incorrect TLSv1 block padding\n"); #ifndef OPENSSL_NO_KRB5 - if (con->kssl_ctx->client_princ != NULL) + client_princ = kssl_ctx_get0_client_princ(SSL_get0_kssl_ctx(con)); + if (client_princ != NULL) { BIO_printf(bio_s_out,"Kerberos peer principal is %s\n", - con->kssl_ctx->client_princ); + client_princ); } #endif /* OPENSSL_NO_KRB5 */ BIO_printf(bio_s_out, "Secure Renegotiation IS%s supported\n", SSL_get_secure_renegotiation_support(con) ? "" : " NOT"); + if (keymatexportlabel != NULL) + { + BIO_printf(bio_s_out, "Keying material exporter:\n"); + BIO_printf(bio_s_out, " Label: '%s'\n", keymatexportlabel); + BIO_printf(bio_s_out, " Length: %i bytes\n", + keymatexportlen); + exportedkeymat = OPENSSL_malloc(keymatexportlen); + if (exportedkeymat != NULL) + { + if (!SSL_export_keying_material(con, exportedkeymat, + keymatexportlen, + keymatexportlabel, + strlen(keymatexportlabel), + NULL, 0, 0)) + { + BIO_printf(bio_s_out, " Error\n"); + } + else + { + BIO_printf(bio_s_out, " Keying material: "); + for (i=0; ikssl_ctx = kssl_ctx_new()) != NULL) + if ((kctx = kssl_ctx_new()) != NULL) { - kssl_ctx_setstring(con->kssl_ctx, KSSL_SERVICE, KRB5SVC); - kssl_ctx_setstring(con->kssl_ctx, KSSL_KEYTAB, KRB5KEYTAB); + kssl_ctx_setstring(kctx, KSSL_SERVICE, KRB5SVC); + kssl_ctx_setstring(kctx, KSSL_KEYTAB, KRB5KEYTAB); } #endif /* OPENSSL_NO_KRB5 */ if(context) SSL_set_session_id_context(con, context, @@ -2318,7 +2617,7 @@ static int www_body(char *hostname, int s, unsigned char *context) if (s_debug) { - con->debug=1; + SSL_set_debug(con, 1); BIO_set_callback(SSL_get_rbio(con),bio_dump_callback); BIO_set_callback_arg(SSL_get_rbio(con),(char *)bio_s_out); } @@ -2333,7 +2632,18 @@ static int www_body(char *hostname, int s, unsigned char *context) if (hack) { i=SSL_accept(con); - +#ifndef OPENSSL_NO_SRP + while (i <= 0 && SSL_get_error(con,i) == SSL_ERROR_WANT_X509_LOOKUP) + { + BIO_printf(bio_s_out,"LOOKUP during accept %s\n",srp_callback_parm.login); + srp_callback_parm.user = SRP_VBASE_get_by_user(srp_callback_parm.vb, srp_callback_parm.login); + if (srp_callback_parm.user) + BIO_printf(bio_s_out,"LOOKUP done %s\n",srp_callback_parm.user->info); + else + BIO_printf(bio_s_out,"LOOKUP not successful\n"); + i=SSL_accept(con); + } +#endif switch (SSL_get_error(con,i)) { case SSL_ERROR_NONE: @@ -2439,7 +2749,7 @@ static int www_body(char *hostname, int s, unsigned char *context) } BIO_puts(io,"\n"); } - BIO_printf(io,((con->hit) + BIO_printf(io,(SSL_cache_hit(con) ?"---\nReused, " :"---\nNew, ")); c=SSL_get_current_cipher(con); diff --git a/crypto/openssl/apps/s_socket.c b/crypto/openssl/apps/s_socket.c index c08544a13c..380efdb1b9 100644 --- a/crypto/openssl/apps/s_socket.c +++ b/crypto/openssl/apps/s_socket.c @@ -238,11 +238,10 @@ int init_client(int *sock, char *host, int port, int type) { unsigned char ip[4]; + memset(ip, '\0', sizeof ip); if (!host_ip(host,&(ip[0]))) - { - return(0); - } - return(init_client_ip(sock,ip,port,type)); + return 0; + return init_client_ip(sock,ip,port,type); } static int init_client_ip(int *sock, unsigned char ip[4], int port, int type) diff --git a/crypto/openssl/apps/sess_id.c b/crypto/openssl/apps/sess_id.c index b99179f276..b16686c26d 100644 --- a/crypto/openssl/apps/sess_id.c +++ b/crypto/openssl/apps/sess_id.c @@ -90,6 +90,7 @@ int MAIN(int, char **); int MAIN(int argc, char **argv) { SSL_SESSION *x=NULL; + X509 *peer = NULL; int ret=1,i,num,badops=0; BIO *out=NULL; int informat,outformat; @@ -163,16 +164,17 @@ bad: ERR_load_crypto_strings(); x=load_sess_id(infile,informat); if (x == NULL) { goto end; } + peer = SSL_SESSION_get0_peer(x); if(context) { - x->sid_ctx_length=strlen(context); - if(x->sid_ctx_length > SSL_MAX_SID_CTX_LENGTH) + size_t ctx_len = strlen(context); + if(ctx_len > SSL_MAX_SID_CTX_LENGTH) { BIO_printf(bio_err,"Context too long\n"); goto end; } - memcpy(x->sid_ctx,context,x->sid_ctx_length); + SSL_SESSION_set1_id_context(x, (unsigned char *)context, ctx_len); } #ifdef undef @@ -231,10 +233,10 @@ bad: if (cert) { - if (x->peer == NULL) + if (peer == NULL) BIO_puts(out,"No certificate present\n"); else - X509_print(out,x->peer); + X509_print(out,peer); } } @@ -253,12 +255,12 @@ bad: goto end; } } - else if (!noout && (x->peer != NULL)) /* just print the certificate */ + else if (!noout && (peer != NULL)) /* just print the certificate */ { if (outformat == FORMAT_ASN1) - i=(int)i2d_X509_bio(out,x->peer); + i=(int)i2d_X509_bio(out,peer); else if (outformat == FORMAT_PEM) - i=PEM_write_bio_X509(out,x->peer); + i=PEM_write_bio_X509(out,peer); else { BIO_printf(bio_err,"bad output format specified for outfile\n"); goto end; diff --git a/crypto/openssl/apps/speed.c b/crypto/openssl/apps/speed.c index 65f85fecf7..8358b12fdd 100644 --- a/crypto/openssl/apps/speed.c +++ b/crypto/openssl/apps/speed.c @@ -108,8 +108,14 @@ #include #endif -#ifdef _WIN32 +#if defined(_WIN32) || defined(__CYGWIN__) #include +# if defined(__CYGWIN__) && !defined(_WIN32) + /* should define _WIN32, which normally is mutually + * exclusive with __CYGWIN__, but if it didn't... */ +# define _WIN32 + /* this is done because Cygwin alarm() fails sometimes. */ +# endif #endif #include @@ -183,6 +189,25 @@ #ifndef OPENSSL_NO_ECDH #include #endif +#include + +#ifdef OPENSSL_FIPS +#ifdef OPENSSL_DOING_MAKEDEPEND +#undef AES_set_encrypt_key +#undef AES_set_decrypt_key +#undef DES_set_key_unchecked +#endif +#define BF_set_key private_BF_set_key +#define CAST_set_key private_CAST_set_key +#define idea_set_encrypt_key private_idea_set_encrypt_key +#define SEED_set_key private_SEED_set_key +#define RC2_set_key private_RC2_set_key +#define RC4_set_key private_RC4_set_key +#define DES_set_key_unchecked private_DES_set_key_unchecked +#define AES_set_encrypt_key private_AES_set_encrypt_key +#define AES_set_decrypt_key private_AES_set_decrypt_key +#define Camellia_set_key private_Camellia_set_key +#endif #ifndef HAVE_FORK # if defined(OPENSSL_SYS_VMS) || defined(OPENSSL_SYS_WINDOWS) || defined(OPENSSL_SYS_MACINTOSH_CLASSIC) || defined(OPENSSL_SYS_OS2) || defined(OPENSSL_SYS_NETWARE) @@ -214,7 +239,7 @@ static void print_result(int alg,int run_no,int count,double time_used); static int do_multi(int multi); #endif -#define ALGOR_NUM 29 +#define ALGOR_NUM 30 #define SIZE_NUM 5 #define RSA_NUM 4 #define DSA_NUM 3 @@ -229,7 +254,7 @@ static const char *names[ALGOR_NUM]={ "aes-128 cbc","aes-192 cbc","aes-256 cbc", "camellia-128 cbc","camellia-192 cbc","camellia-256 cbc", "evp","sha256","sha512","whirlpool", - "aes-128 ige","aes-192 ige","aes-256 ige"}; + "aes-128 ige","aes-192 ige","aes-256 ige","ghash"}; static double results[ALGOR_NUM][SIZE_NUM]; static int lengths[SIZE_NUM]={16,64,256,1024,8*1024}; #ifndef OPENSSL_NO_RSA @@ -273,9 +298,12 @@ static SIGRETTYPE sig_done(int sig) #if defined(_WIN32) +#if !defined(SIGALRM) #define SIGALRM +#endif static unsigned int lapse,schlock; -static void alarm(unsigned int secs) { lapse = secs*1000; } +static void alarm_win32(unsigned int secs) { lapse = secs*1000; } +#define alarm alarm_win32 static DWORD WINAPI sleepy(VOID *arg) { @@ -469,6 +497,7 @@ int MAIN(int argc, char **argv) #define D_IGE_128_AES 26 #define D_IGE_192_AES 27 #define D_IGE_256_AES 28 +#define D_GHASH 29 double d=0.0; long c[ALGOR_NUM][SIZE_NUM]; #define R_DSA_512 0 @@ -894,6 +923,10 @@ int MAIN(int argc, char **argv) doit[D_CBC_192_AES]=1; doit[D_CBC_256_AES]=1; } + else if (strcmp(*argv,"ghash") == 0) + { + doit[D_GHASH]=1; + } else #endif #ifndef OPENSSL_NO_CAMELLIA @@ -1264,6 +1297,7 @@ int MAIN(int argc, char **argv) c[D_IGE_128_AES][0]=count; c[D_IGE_192_AES][0]=count; c[D_IGE_256_AES][0]=count; + c[D_GHASH][0]=count; for (i=1; i + +#ifndef OPENSSL_NO_SRP +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "apps.h" + +#undef PROG +#define PROG srp_main + +#define BASE_SECTION "srp" +#define CONFIG_FILE "openssl.cnf" + +#define ENV_RANDFILE "RANDFILE" + +#define ENV_DATABASE "srpvfile" +#define ENV_DEFAULT_SRP "default_srp" + +static char *srp_usage[]={ +"usage: srp [args] [user] \n", +"\n", +" -verbose Talk alot while doing things\n", +" -config file A config file\n", +" -name arg The particular srp definition to use\n", +" -srpvfile arg The srp verifier file name\n", +" -add add an user and srp verifier\n", +" -modify modify the srp verifier of an existing user\n", +" -delete delete user from verifier file\n", +" -list list user\n", +" -gn arg g and N values to be used for new verifier\n", +" -userinfo arg additional info to be set for user\n", +" -passin arg input file pass phrase source\n", +" -passout arg output file pass phrase source\n", +#ifndef OPENSSL_NO_ENGINE +" -engine e - use engine e, possibly a hardware device.\n", +#endif +NULL +}; + +#ifdef EFENCE +extern int EF_PROTECT_FREE; +extern int EF_PROTECT_BELOW; +extern int EF_ALIGNMENT; +#endif + +static CONF *conf=NULL; +static char *section=NULL; + +#define VERBOSE if (verbose) +#define VVERBOSE if (verbose>1) + + +int MAIN(int, char **); + +static int get_index(CA_DB *db, char* id, char type) + { + char ** pp; + int i; + if (id == NULL) return -1; + if (type == DB_SRP_INDEX) + for (i = 0; i < sk_OPENSSL_PSTRING_num(db->db->data); i++) + { + pp = (char **)sk_OPENSSL_PSTRING_value(db->db->data, i); + if (pp[DB_srptype][0] == DB_SRP_INDEX && !strcmp(id, pp[DB_srpid])) + return i; + } + else for (i = 0; i < sk_OPENSSL_PSTRING_num(db->db->data); i++) + { + pp = (char **)sk_OPENSSL_PSTRING_value(db->db->data, i); + + if (pp[DB_srptype][0] != DB_SRP_INDEX && !strcmp(id,pp[DB_srpid])) + return i; + } + + return -1 ; + } + +static void print_entry(CA_DB *db, BIO *bio, int indx, int verbose, char *s) + { + if (indx >= 0 && verbose) + { + int j; + char **pp = (char **)sk_OPENSSL_PSTRING_value(db->db->data, indx); + BIO_printf(bio, "%s \"%s\"\n", s, pp[DB_srpid]); + for (j = 0; j < DB_NUMBER; j++) + { + BIO_printf(bio_err," %d = \"%s\"\n", j, pp[j]); + } + } + } + +static void print_index(CA_DB *db, BIO *bio, int indexindex, int verbose) + { + print_entry(db, bio, indexindex, verbose, "g N entry") ; + } + +static void print_user(CA_DB *db, BIO *bio, int userindex, int verbose) + { + if (verbose > 0) + { + char **pp = (char **)sk_OPENSSL_PSTRING_value(db->db->data, userindex); + + if (pp[DB_srptype][0] != 'I') + { + print_entry(db, bio, userindex, verbose, "User entry"); + print_entry(db, bio, get_index(db, pp[DB_srpgN], 'I'), verbose, "g N entry"); + } + + } + } + +static int update_index(CA_DB *db, BIO *bio, char **row) + { + char ** irow; + int i; + + if ((irow=(char **)OPENSSL_malloc(sizeof(char *)*(DB_NUMBER+1))) == NULL) + { + BIO_printf(bio_err,"Memory allocation failure\n"); + return 0; + } + + for (i=0; idb,irow)) + { + BIO_printf(bio,"failed to update srpvfile\n"); + BIO_printf(bio,"TXT_DB error number %ld\n",db->db->error); + OPENSSL_free(irow); + return 0; + } + return 1; + } + +static void lookup_fail(const char *name, char *tag) + { + BIO_printf(bio_err,"variable lookup failed for %s::%s\n",name,tag); + } + + +static char *srp_verify_user(const char *user, const char *srp_verifier, + char *srp_usersalt, const char *g, const char *N, + const char *passin, BIO *bio, int verbose) + { + char password[1024]; + PW_CB_DATA cb_tmp; + char *verifier = NULL; + char *gNid = NULL; + + cb_tmp.prompt_info = user; + cb_tmp.password = passin; + + if (password_callback(password, 1024, 0, &cb_tmp) >0) + { + VERBOSE BIO_printf(bio,"Validating\n user=\"%s\"\n srp_verifier=\"%s\"\n srp_usersalt=\"%s\"\n g=\"%s\"\n N=\"%s\"\n",user,srp_verifier,srp_usersalt, g, N); + BIO_printf(bio, "Pass %s\n", password); + + if (!(gNid=SRP_create_verifier(user, password, &srp_usersalt, &verifier, N, g))) + { + BIO_printf(bio, "Internal error validating SRP verifier\n"); + } + else + { + if (strcmp(verifier, srp_verifier)) + gNid = NULL; + OPENSSL_free(verifier); + } + } + return gNid; + } + +static char *srp_create_user(char *user, char **srp_verifier, + char **srp_usersalt, char *g, char *N, + char *passout, BIO *bio, int verbose) + { + char password[1024]; + PW_CB_DATA cb_tmp; + char *gNid = NULL; + char *salt = NULL; + cb_tmp.prompt_info = user; + cb_tmp.password = passout; + + if (password_callback(password,1024,1,&cb_tmp) >0) + { + VERBOSE BIO_printf(bio,"Creating\n user=\"%s\"\n g=\"%s\"\n N=\"%s\"\n",user,g,N); + if (!(gNid =SRP_create_verifier(user, password, &salt, srp_verifier, N, g))) + { + BIO_printf(bio,"Internal error creating SRP verifier\n"); + } + else + *srp_usersalt = salt; + VVERBOSE BIO_printf(bio,"gNid=%s salt =\"%s\"\n verifier =\"%s\"\n", gNid,salt, *srp_verifier); + + } + return gNid; + } + +int MAIN(int argc, char **argv) + { + int add_user = 0; + int list_user= 0; + int delete_user= 0; + int modify_user= 0; + char * user = NULL; + + char *passargin = NULL, *passargout = NULL; + char *passin = NULL, *passout = NULL; + char * gN = NULL; + int gNindex = -1; + char ** gNrow = NULL; + int maxgN = -1; + + char * userinfo = NULL; + + int badops=0; + int ret=1; + int errors=0; + int verbose=0; + int doupdatedb=0; + char *configfile=NULL; + char *dbfile=NULL; + CA_DB *db=NULL; + char **pp ; + int i; + long errorline = -1; + char *randfile=NULL; +#ifndef OPENSSL_NO_ENGINE + char *engine = NULL; +#endif + char *tofree=NULL; + DB_ATTR db_attr; + +#ifdef EFENCE +EF_PROTECT_FREE=1; +EF_PROTECT_BELOW=1; +EF_ALIGNMENT=0; +#endif + + apps_startup(); + + conf = NULL; + section = NULL; + + if (bio_err == NULL) + if ((bio_err=BIO_new(BIO_s_file())) != NULL) + BIO_set_fp(bio_err,stderr,BIO_NOCLOSE|BIO_FP_TEXT); + + argc--; + argv++; + while (argc >= 1 && badops == 0) + { + if (strcmp(*argv,"-verbose") == 0) + verbose++; + else if (strcmp(*argv,"-config") == 0) + { + if (--argc < 1) goto bad; + configfile= *(++argv); + } + else if (strcmp(*argv,"-name") == 0) + { + if (--argc < 1) goto bad; + section= *(++argv); + } + else if (strcmp(*argv,"-srpvfile") == 0) + { + if (--argc < 1) goto bad; + dbfile= *(++argv); + } + else if (strcmp(*argv,"-add") == 0) + add_user=1; + else if (strcmp(*argv,"-delete") == 0) + delete_user=1; + else if (strcmp(*argv,"-modify") == 0) + modify_user=1; + else if (strcmp(*argv,"-list") == 0) + list_user=1; + else if (strcmp(*argv,"-gn") == 0) + { + if (--argc < 1) goto bad; + gN= *(++argv); + } + else if (strcmp(*argv,"-userinfo") == 0) + { + if (--argc < 1) goto bad; + userinfo= *(++argv); + } + else if (strcmp(*argv,"-passin") == 0) + { + if (--argc < 1) goto bad; + passargin= *(++argv); + } + else if (strcmp(*argv,"-passout") == 0) + { + if (--argc < 1) goto bad; + passargout= *(++argv); + } +#ifndef OPENSSL_NO_ENGINE + else if (strcmp(*argv,"-engine") == 0) + { + if (--argc < 1) goto bad; + engine= *(++argv); + } +#endif + + else if (**argv == '-') + { +bad: + BIO_printf(bio_err,"unknown option %s\n",*argv); + badops=1; + break; + } + else + break; + + argc--; + argv++; + } + + if (dbfile && configfile) + { + BIO_printf(bio_err,"-dbfile and -configfile cannot be specified together.\n"); + badops = 1; + } + if (add_user+delete_user+modify_user+list_user != 1) + { + BIO_printf(bio_err,"Exactly one of the options -add, -delete, -modify -list must be specified.\n"); + badops = 1; + } + if (delete_user+modify_user+delete_user== 1 && argc <= 0) + { + BIO_printf(bio_err,"Need at least one user for options -add, -delete, -modify. \n"); + badops = 1; + } + if ((passin || passout) && argc != 1 ) + { + BIO_printf(bio_err,"-passin, -passout arguments only valid with one user.\n"); + badops = 1; + } + + if (badops) + { + for (pp=srp_usage; (*pp != NULL); pp++) + BIO_printf(bio_err,"%s",*pp); + + BIO_printf(bio_err," -rand file%cfile%c...\n", LIST_SEPARATOR_CHAR, LIST_SEPARATOR_CHAR); + BIO_printf(bio_err," load the file (or the files in the directory) into\n"); + BIO_printf(bio_err," the random number generator\n"); + goto err; + } + + ERR_load_crypto_strings(); + +#ifndef OPENSSL_NO_ENGINE + setup_engine(bio_err, engine, 0); +#endif + + if(!app_passwd(bio_err, passargin, passargout, &passin, &passout)) + { + BIO_printf(bio_err, "Error getting passwords\n"); + goto err; + } + + if (!dbfile) + { + + + /*****************************************************************/ + tofree=NULL; + if (configfile == NULL) configfile = getenv("OPENSSL_CONF"); + if (configfile == NULL) configfile = getenv("SSLEAY_CONF"); + if (configfile == NULL) + { + const char *s=X509_get_default_cert_area(); + size_t len; + +#ifdef OPENSSL_SYS_VMS + len = strlen(s)+sizeof(CONFIG_FILE); + tofree=OPENSSL_malloc(len); + strcpy(tofree,s); +#else + len = strlen(s)+sizeof(CONFIG_FILE)+1; + tofree=OPENSSL_malloc(len); + BUF_strlcpy(tofree,s,len); + BUF_strlcat(tofree,"/",len); +#endif + BUF_strlcat(tofree,CONFIG_FILE,len); + configfile=tofree; + } + + VERBOSE BIO_printf(bio_err,"Using configuration from %s\n",configfile); + conf = NCONF_new(NULL); + if (NCONF_load(conf,configfile,&errorline) <= 0) + { + if (errorline <= 0) + BIO_printf(bio_err,"error loading the config file '%s'\n", + configfile); + else + BIO_printf(bio_err,"error on line %ld of config file '%s'\n" + ,errorline,configfile); + goto err; + } + if(tofree) + { + OPENSSL_free(tofree); + tofree = NULL; + } + + if (!load_config(bio_err, conf)) + goto err; + + /* Lets get the config section we are using */ + if (section == NULL) + { + VERBOSE BIO_printf(bio_err,"trying to read " ENV_DEFAULT_SRP " in \" BASE_SECTION \"\n"); + + section=NCONF_get_string(conf,BASE_SECTION,ENV_DEFAULT_SRP); + if (section == NULL) + { + lookup_fail(BASE_SECTION,ENV_DEFAULT_SRP); + goto err; + } + } + + if (randfile == NULL && conf) + randfile = NCONF_get_string(conf, BASE_SECTION, "RANDFILE"); + + + VERBOSE BIO_printf(bio_err,"trying to read " ENV_DATABASE " in section \"%s\"\n",section); + + if ((dbfile=NCONF_get_string(conf,section,ENV_DATABASE)) == NULL) + { + lookup_fail(section,ENV_DATABASE); + goto err; + } + + } + if (randfile == NULL) + ERR_clear_error(); + else + app_RAND_load_file(randfile, bio_err, 0); + + VERBOSE BIO_printf(bio_err,"Trying to read SRP verifier file \"%s\"\n",dbfile); + + db = load_index(dbfile, &db_attr); + if (db == NULL) goto err; + + /* Lets check some fields */ + for (i = 0; i < sk_OPENSSL_PSTRING_num(db->db->data); i++) + { + pp = (char **)sk_OPENSSL_PSTRING_value(db->db->data, i); + + if (pp[DB_srptype][0] == DB_SRP_INDEX) + { + maxgN = i; + if (gNindex < 0 && gN != NULL && !strcmp(gN, pp[DB_srpid])) + gNindex = i; + + print_index(db, bio_err, i, verbose > 1); + } + } + + VERBOSE BIO_printf(bio_err, "Database initialised\n"); + + if (gNindex >= 0) + { + gNrow = (char **)sk_OPENSSL_PSTRING_value(db->db->data, gNindex); + print_entry(db, bio_err, gNindex, verbose > 1, "Default g and N") ; + } + else if (maxgN > 0 && !SRP_get_default_gN(gN)) + { + BIO_printf(bio_err, "No g and N value for index \"%s\"\n", gN); + goto err; + } + else + { + VERBOSE BIO_printf(bio_err, "Database has no g N information.\n"); + gNrow = NULL; + } + + + VVERBOSE BIO_printf(bio_err,"Starting user processing\n"); + + if (argc > 0) + user = *(argv++) ; + + while (list_user || user) + { + int userindex = -1; + if (user) + VVERBOSE BIO_printf(bio_err, "Processing user \"%s\"\n", user); + if ((userindex = get_index(db, user, 'U')) >= 0) + { + print_user(db, bio_err, userindex, (verbose > 0) || list_user); + } + + if (list_user) + { + if (user == NULL) + { + BIO_printf(bio_err,"List all users\n"); + + for (i = 0; i < sk_OPENSSL_PSTRING_num(db->db->data); i++) + { + print_user(db,bio_err, i, 1); + } + list_user = 0; + } + else if (userindex < 0) + { + BIO_printf(bio_err, "user \"%s\" does not exist, ignored. t\n", + user); + errors++; + } + } + else if (add_user) + { + if (userindex >= 0) + { + /* reactivation of a new user */ + char **row = (char **)sk_OPENSSL_PSTRING_value(db->db->data, userindex); + BIO_printf(bio_err, "user \"%s\" reactivated.\n", user); + row[DB_srptype][0] = 'V'; + + doupdatedb = 1; + } + else + { + char *row[DB_NUMBER] ; char *gNid; + row[DB_srpverifier] = NULL; + row[DB_srpsalt] = NULL; + row[DB_srpinfo] = NULL; + if (!(gNid = srp_create_user(user,&(row[DB_srpverifier]), &(row[DB_srpsalt]),gNrow?gNrow[DB_srpsalt]:gN,gNrow?gNrow[DB_srpverifier]:NULL, passout, bio_err,verbose))) + { + BIO_printf(bio_err, "Cannot create srp verifier for user \"%s\", operation abandoned .\n", user); + errors++; + goto err; + } + row[DB_srpid] = BUF_strdup(user); + row[DB_srptype] = BUF_strdup("v"); + row[DB_srpgN] = BUF_strdup(gNid); + + if (!row[DB_srpid] || !row[DB_srpgN] || !row[DB_srptype] || !row[DB_srpverifier] || !row[DB_srpsalt] || + (userinfo && (!(row[DB_srpinfo] = BUF_strdup(userinfo)))) || + !update_index(db, bio_err, row)) + { + if (row[DB_srpid]) OPENSSL_free(row[DB_srpid]); + if (row[DB_srpgN]) OPENSSL_free(row[DB_srpgN]); + if (row[DB_srpinfo]) OPENSSL_free(row[DB_srpinfo]); + if (row[DB_srptype]) OPENSSL_free(row[DB_srptype]); + if (row[DB_srpverifier]) OPENSSL_free(row[DB_srpverifier]); + if (row[DB_srpsalt]) OPENSSL_free(row[DB_srpsalt]); + goto err; + } + doupdatedb = 1; + } + } + else if (modify_user) + { + if (userindex < 0) + { + BIO_printf(bio_err,"user \"%s\" does not exist, operation ignored.\n",user); + errors++; + } + else + { + + char **row = (char **)sk_OPENSSL_PSTRING_value(db->db->data, userindex); + char type = row[DB_srptype][0]; + if (type == 'v') + { + BIO_printf(bio_err,"user \"%s\" already updated, operation ignored.\n",user); + errors++; + } + else + { + char *gNid; + + if (row[DB_srptype][0] == 'V') + { + int user_gN; + char **irow = NULL; + VERBOSE BIO_printf(bio_err,"Verifying password for user \"%s\"\n",user); + if ( (user_gN = get_index(db, row[DB_srpgN], DB_SRP_INDEX)) >= 0) + irow = (char **)sk_OPENSSL_PSTRING_value(db->db->data, userindex); + + if (!srp_verify_user(user, row[DB_srpverifier], row[DB_srpsalt], irow ? irow[DB_srpsalt] : row[DB_srpgN], irow ? irow[DB_srpverifier] : NULL, passin, bio_err, verbose)) + { + BIO_printf(bio_err, "Invalid password for user \"%s\", operation abandoned.\n", user); + errors++; + goto err; + } + } + VERBOSE BIO_printf(bio_err,"Password for user \"%s\" ok.\n",user); + + if (!(gNid=srp_create_user(user,&(row[DB_srpverifier]), &(row[DB_srpsalt]),gNrow?gNrow[DB_srpsalt]:NULL, gNrow?gNrow[DB_srpverifier]:NULL, passout, bio_err,verbose))) + { + BIO_printf(bio_err, "Cannot create srp verifier for user \"%s\", operation abandoned.\n", user); + errors++; + goto err; + } + + row[DB_srptype][0] = 'v'; + row[DB_srpgN] = BUF_strdup(gNid); + + if (!row[DB_srpid] || !row[DB_srpgN] || !row[DB_srptype] || !row[DB_srpverifier] || !row[DB_srpsalt] || + (userinfo && (!(row[DB_srpinfo] = BUF_strdup(userinfo))))) + goto err; + + doupdatedb = 1; + } + } + } + else if (delete_user) + { + if (userindex < 0) + { + BIO_printf(bio_err, "user \"%s\" does not exist, operation ignored. t\n", user); + errors++; + } + else + { + char **xpp = (char **)sk_OPENSSL_PSTRING_value(db->db->data, userindex); + BIO_printf(bio_err, "user \"%s\" revoked. t\n", user); + + xpp[DB_srptype][0] = 'R'; + + doupdatedb = 1; + } + } + if (--argc > 0) + user = *(argv++) ; + else + { + user = NULL; + list_user = 0; + } + } + + VERBOSE BIO_printf(bio_err,"User procession done.\n"); + + + if (doupdatedb) + { + /* Lets check some fields */ + for (i = 0; i < sk_OPENSSL_PSTRING_num(db->db->data); i++) + { + pp = (char **)sk_OPENSSL_PSTRING_value(db->db->data, i); + + if (pp[DB_srptype][0] == 'v') + { + pp[DB_srptype][0] = 'V'; + print_user(db, bio_err, i, verbose); + } + } + + VERBOSE BIO_printf(bio_err, "Trying to update srpvfile.\n"); + if (!save_index(dbfile, "new", db)) goto err; + + VERBOSE BIO_printf(bio_err, "Temporary srpvfile created.\n"); + if (!rotate_index(dbfile, "new", "old")) goto err; + + VERBOSE BIO_printf(bio_err, "srpvfile updated.\n"); + } + + ret = (errors != 0); +err: + if (errors != 0) + VERBOSE BIO_printf(bio_err,"User errors %d.\n",errors); + + VERBOSE BIO_printf(bio_err,"SRP terminating with code %d.\n",ret); + if(tofree) + OPENSSL_free(tofree); + if (ret) ERR_print_errors(bio_err); + if (randfile) app_RAND_write_file(randfile, bio_err); + if (conf) NCONF_free(conf); + if (db) free_index(db); + + OBJ_cleanup(); + apps_shutdown(); + OPENSSL_EXIT(ret); + } + + + +#endif + diff --git a/crypto/openssl/apps/verify.c b/crypto/openssl/apps/verify.c index 9163997e93..b9749dcd36 100644 --- a/crypto/openssl/apps/verify.c +++ b/crypto/openssl/apps/verify.c @@ -230,6 +230,7 @@ int MAIN(int argc, char **argv) end: if (ret == 1) { BIO_printf(bio_err,"usage: verify [-verbose] [-CApath path] [-CAfile file] [-purpose purpose] [-crl_check]"); + BIO_printf(bio_err," [-attime timestamp]"); #ifndef OPENSSL_NO_ENGINE BIO_printf(bio_err," [-engine e]"); #endif diff --git a/crypto/openssl/apps/x509.c b/crypto/openssl/apps/x509.c index 9f5eaeb6be..e6e5e0d4e5 100644 --- a/crypto/openssl/apps/x509.c +++ b/crypto/openssl/apps/x509.c @@ -157,9 +157,10 @@ static int MS_CALLBACK callb(int ok, X509_STORE_CTX *ctx); static int sign (X509 *x, EVP_PKEY *pkey,int days,int clrext, const EVP_MD *digest, CONF *conf, char *section); static int x509_certify (X509_STORE *ctx,char *CAfile,const EVP_MD *digest, - X509 *x,X509 *xca,EVP_PKEY *pkey,char *serial, - int create,int days, int clrext, CONF *conf, char *section, - ASN1_INTEGER *sno); + X509 *x,X509 *xca,EVP_PKEY *pkey, + STACK_OF(OPENSSL_STRING) *sigopts, + char *serial, int create ,int days, int clrext, + CONF *conf, char *section, ASN1_INTEGER *sno); static int purpose_print(BIO *bio, X509 *cert, X509_PURPOSE *pt); static int reqfile=0; @@ -172,6 +173,7 @@ int MAIN(int argc, char **argv) X509_REQ *req=NULL; X509 *x=NULL,*xca=NULL; ASN1_OBJECT *objtmp; + STACK_OF(OPENSSL_STRING) *sigopts = NULL; EVP_PKEY *Upkey=NULL,*CApkey=NULL; ASN1_INTEGER *sno = NULL; int i,num,badops=0; @@ -271,6 +273,15 @@ int MAIN(int argc, char **argv) if (--argc < 1) goto bad; CAkeyformat=str2fmt(*(++argv)); } + else if (strcmp(*argv,"-sigopt") == 0) + { + if (--argc < 1) + goto bad; + if (!sigopts) + sigopts = sk_OPENSSL_STRING_new_null(); + if (!sigopts || !sk_OPENSSL_STRING_push(sigopts, *(++argv))) + goto bad; + } else if (strcmp(*argv,"-days") == 0) { if (--argc < 1) goto bad; @@ -970,7 +981,8 @@ bad: assert(need_rand); if (!x509_certify(ctx,CAfile,digest,x,xca, - CApkey, CAserial,CA_createserial,days, clrext, + CApkey, sigopts, + CAserial,CA_createserial,days, clrext, extconf, extsect, sno)) goto end; } @@ -1081,6 +1093,8 @@ end: X509_free(xca); EVP_PKEY_free(Upkey); EVP_PKEY_free(CApkey); + if (sigopts) + sk_OPENSSL_STRING_free(sigopts); X509_REQ_free(rq); ASN1_INTEGER_free(sno); sk_ASN1_OBJECT_pop_free(trust, ASN1_OBJECT_free); @@ -1131,8 +1145,11 @@ static ASN1_INTEGER *x509_load_serial(char *CAfile, char *serialfile, int create } static int x509_certify(X509_STORE *ctx, char *CAfile, const EVP_MD *digest, - X509 *x, X509 *xca, EVP_PKEY *pkey, char *serialfile, int create, - int days, int clrext, CONF *conf, char *section, ASN1_INTEGER *sno) + X509 *x, X509 *xca, EVP_PKEY *pkey, + STACK_OF(OPENSSL_STRING) *sigopts, + char *serialfile, int create, + int days, int clrext, CONF *conf, char *section, + ASN1_INTEGER *sno) { int ret=0; ASN1_INTEGER *bs=NULL; @@ -1191,7 +1208,8 @@ static int x509_certify(X509_STORE *ctx, char *CAfile, const EVP_MD *digest, if (!X509V3_EXT_add_nconf(conf, &ctx2, section, x)) goto end; } - if (!X509_sign(x,pkey,digest)) goto end; + if (!do_X509_sign(bio_err, x, pkey, digest, sigopts)) + goto end; ret=1; end: X509_STORE_CTX_cleanup(&xsc); diff --git a/crypto/openssl/crypto/aes/aes.h b/crypto/openssl/crypto/aes/aes.h index d2c99730fe..031abf01b5 100644 --- a/crypto/openssl/crypto/aes/aes.h +++ b/crypto/openssl/crypto/aes/aes.h @@ -90,6 +90,11 @@ int AES_set_encrypt_key(const unsigned char *userKey, const int bits, int AES_set_decrypt_key(const unsigned char *userKey, const int bits, AES_KEY *key); +int private_AES_set_encrypt_key(const unsigned char *userKey, const int bits, + AES_KEY *key); +int private_AES_set_decrypt_key(const unsigned char *userKey, const int bits, + AES_KEY *key); + void AES_encrypt(const unsigned char *in, unsigned char *out, const AES_KEY *key); void AES_decrypt(const unsigned char *in, unsigned char *out, diff --git a/crypto/openssl/crypto/aes/aes_core.c b/crypto/openssl/crypto/aes/aes_core.c index a7ec54f4da..8f5210ac70 100644 --- a/crypto/openssl/crypto/aes/aes_core.c +++ b/crypto/openssl/crypto/aes/aes_core.c @@ -625,7 +625,7 @@ static const u32 rcon[] = { /** * Expand the cipher key into the encryption key schedule. */ -int AES_set_encrypt_key(const unsigned char *userKey, const int bits, +int private_AES_set_encrypt_key(const unsigned char *userKey, const int bits, AES_KEY *key) { u32 *rk; @@ -726,7 +726,7 @@ int AES_set_encrypt_key(const unsigned char *userKey, const int bits, /** * Expand the cipher key into the decryption key schedule. */ -int AES_set_decrypt_key(const unsigned char *userKey, const int bits, +int private_AES_set_decrypt_key(const unsigned char *userKey, const int bits, AES_KEY *key) { u32 *rk; @@ -734,7 +734,7 @@ int AES_set_decrypt_key(const unsigned char *userKey, const int bits, u32 temp; /* first, start with an encryption schedule */ - status = AES_set_encrypt_key(userKey, bits, key); + status = private_AES_set_encrypt_key(userKey, bits, key); if (status < 0) return status; @@ -1201,7 +1201,7 @@ static const u32 rcon[] = { /** * Expand the cipher key into the encryption key schedule. */ -int AES_set_encrypt_key(const unsigned char *userKey, const int bits, +int private_AES_set_encrypt_key(const unsigned char *userKey, const int bits, AES_KEY *key) { u32 *rk; int i = 0; @@ -1301,7 +1301,7 @@ int AES_set_encrypt_key(const unsigned char *userKey, const int bits, /** * Expand the cipher key into the decryption key schedule. */ -int AES_set_decrypt_key(const unsigned char *userKey, const int bits, +int private_AES_set_decrypt_key(const unsigned char *userKey, const int bits, AES_KEY *key) { u32 *rk; @@ -1309,7 +1309,7 @@ int AES_set_decrypt_key(const unsigned char *userKey, const int bits, u32 temp; /* first, start with an encryption schedule */ - status = AES_set_encrypt_key(userKey, bits, key); + status = private_AES_set_encrypt_key(userKey, bits, key); if (status < 0) return status; diff --git a/crypto/openssl/crypto/aes/aes_misc.c b/crypto/openssl/crypto/aes/aes_misc.c index 4fead1b4c7..f083488ecb 100644 --- a/crypto/openssl/crypto/aes/aes_misc.c +++ b/crypto/openssl/crypto/aes/aes_misc.c @@ -50,6 +50,7 @@ */ #include +#include #include #include "aes_locl.h" @@ -62,3 +63,23 @@ const char *AES_options(void) { return "aes(partial)"; #endif } + +/* FIPS wrapper functions to block low level AES calls in FIPS mode */ + +int AES_set_encrypt_key(const unsigned char *userKey, const int bits, + AES_KEY *key) + { +#ifdef OPENSSL_FIPS + fips_cipher_abort(AES); +#endif + return private_AES_set_encrypt_key(userKey, bits, key); + } + +int AES_set_decrypt_key(const unsigned char *userKey, const int bits, + AES_KEY *key) + { +#ifdef OPENSSL_FIPS + fips_cipher_abort(AES); +#endif + return private_AES_set_decrypt_key(userKey, bits, key); + } diff --git a/crypto/openssl/crypto/aes/asm/aes-586.pl b/crypto/openssl/crypto/aes/asm/aes-586.pl index aab40e6f1c..687ed811be 100755 --- a/crypto/openssl/crypto/aes/asm/aes-586.pl +++ b/crypto/openssl/crypto/aes/asm/aes-586.pl @@ -39,7 +39,7 @@ # but exhibits up to 10% improvement on other cores. # # Second version is "monolithic" replacement for aes_core.c, which in -# addition to AES_[de|en]crypt implements AES_set_[de|en]cryption_key. +# addition to AES_[de|en]crypt implements private_AES_set_[de|en]cryption_key. # This made it possible to implement little-endian variant of the # algorithm without modifying the base C code. Motivating factor for # the undertaken effort was that it appeared that in tight IA-32 @@ -2854,12 +2854,12 @@ sub enckey() &set_label("exit"); &function_end("_x86_AES_set_encrypt_key"); -# int AES_set_encrypt_key(const unsigned char *userKey, const int bits, +# int private_AES_set_encrypt_key(const unsigned char *userKey, const int bits, # AES_KEY *key) -&function_begin_B("AES_set_encrypt_key"); +&function_begin_B("private_AES_set_encrypt_key"); &call ("_x86_AES_set_encrypt_key"); &ret (); -&function_end_B("AES_set_encrypt_key"); +&function_end_B("private_AES_set_encrypt_key"); sub deckey() { my ($i,$key,$tp1,$tp2,$tp4,$tp8) = @_; @@ -2916,9 +2916,9 @@ sub deckey() &mov (&DWP(4*$i,$key),$tp1); } -# int AES_set_decrypt_key(const unsigned char *userKey, const int bits, +# int private_AES_set_decrypt_key(const unsigned char *userKey, const int bits, # AES_KEY *key) -&function_begin_B("AES_set_decrypt_key"); +&function_begin_B("private_AES_set_decrypt_key"); &call ("_x86_AES_set_encrypt_key"); &cmp ("eax",0); &je (&label("proceed")); @@ -2974,7 +2974,7 @@ sub deckey() &jb (&label("permute")); &xor ("eax","eax"); # return success -&function_end("AES_set_decrypt_key"); +&function_end("private_AES_set_decrypt_key"); &asciz("AES for x86, CRYPTOGAMS by "); &asm_finish(); diff --git a/crypto/openssl/crypto/aes/asm/aes-x86_64.pl b/crypto/openssl/crypto/aes/asm/aes-x86_64.pl index a545e892ae..48fa857d5b 100755 --- a/crypto/openssl/crypto/aes/asm/aes-x86_64.pl +++ b/crypto/openssl/crypto/aes/asm/aes-x86_64.pl @@ -588,6 +588,9 @@ $code.=<<___; .globl AES_encrypt .type AES_encrypt,\@function,3 .align 16 +.globl asm_AES_encrypt +.hidden asm_AES_encrypt +asm_AES_encrypt: AES_encrypt: push %rbx push %rbp @@ -1184,6 +1187,9 @@ $code.=<<___; .globl AES_decrypt .type AES_decrypt,\@function,3 .align 16 +.globl asm_AES_decrypt +.hidden asm_AES_decrypt +asm_AES_decrypt: AES_decrypt: push %rbx push %rbp @@ -1277,13 +1283,13 @@ $code.=<<___; ___ } -# int AES_set_encrypt_key(const unsigned char *userKey, const int bits, +# int private_AES_set_encrypt_key(const unsigned char *userKey, const int bits, # AES_KEY *key) $code.=<<___; -.globl AES_set_encrypt_key -.type AES_set_encrypt_key,\@function,3 +.globl private_AES_set_encrypt_key +.type private_AES_set_encrypt_key,\@function,3 .align 16 -AES_set_encrypt_key: +private_AES_set_encrypt_key: push %rbx push %rbp push %r12 # redundant, but allows to share @@ -1304,7 +1310,7 @@ AES_set_encrypt_key: add \$56,%rsp .Lenc_key_epilogue: ret -.size AES_set_encrypt_key,.-AES_set_encrypt_key +.size private_AES_set_encrypt_key,.-private_AES_set_encrypt_key .type _x86_64_AES_set_encrypt_key,\@abi-omnipotent .align 16 @@ -1547,13 +1553,13 @@ $code.=<<___; ___ } -# int AES_set_decrypt_key(const unsigned char *userKey, const int bits, +# int private_AES_set_decrypt_key(const unsigned char *userKey, const int bits, # AES_KEY *key) $code.=<<___; -.globl AES_set_decrypt_key -.type AES_set_decrypt_key,\@function,3 +.globl private_AES_set_decrypt_key +.type private_AES_set_decrypt_key,\@function,3 .align 16 -AES_set_decrypt_key: +private_AES_set_decrypt_key: push %rbx push %rbp push %r12 @@ -1622,7 +1628,7 @@ $code.=<<___; add \$56,%rsp .Ldec_key_epilogue: ret -.size AES_set_decrypt_key,.-AES_set_decrypt_key +.size private_AES_set_decrypt_key,.-private_AES_set_decrypt_key ___ # void AES_cbc_encrypt (const void char *inp, unsigned char *out, @@ -1648,6 +1654,9 @@ $code.=<<___; .type AES_cbc_encrypt,\@function,6 .align 16 .extern OPENSSL_ia32cap_P +.globl asm_AES_cbc_encrypt +.hidden asm_AES_cbc_encrypt +asm_AES_cbc_encrypt: AES_cbc_encrypt: cmp \$0,%rdx # check length je .Lcbc_epilogue @@ -2766,13 +2775,13 @@ cbc_se_handler: .rva .LSEH_end_AES_decrypt .rva .LSEH_info_AES_decrypt - .rva .LSEH_begin_AES_set_encrypt_key - .rva .LSEH_end_AES_set_encrypt_key - .rva .LSEH_info_AES_set_encrypt_key + .rva .LSEH_begin_private_AES_set_encrypt_key + .rva .LSEH_end_private_AES_set_encrypt_key + .rva .LSEH_info_private_AES_set_encrypt_key - .rva .LSEH_begin_AES_set_decrypt_key - .rva .LSEH_end_AES_set_decrypt_key - .rva .LSEH_info_AES_set_decrypt_key + .rva .LSEH_begin_private_AES_set_decrypt_key + .rva .LSEH_end_private_AES_set_decrypt_key + .rva .LSEH_info_private_AES_set_decrypt_key .rva .LSEH_begin_AES_cbc_encrypt .rva .LSEH_end_AES_cbc_encrypt @@ -2788,11 +2797,11 @@ cbc_se_handler: .byte 9,0,0,0 .rva block_se_handler .rva .Ldec_prologue,.Ldec_epilogue # HandlerData[] -.LSEH_info_AES_set_encrypt_key: +.LSEH_info_private_AES_set_encrypt_key: .byte 9,0,0,0 .rva key_se_handler .rva .Lenc_key_prologue,.Lenc_key_epilogue # HandlerData[] -.LSEH_info_AES_set_decrypt_key: +.LSEH_info_private_AES_set_decrypt_key: .byte 9,0,0,0 .rva key_se_handler .rva .Ldec_key_prologue,.Ldec_key_epilogue # HandlerData[] diff --git a/crypto/openssl/crypto/aes/asm/aesni-sha1-x86_64.pl b/crypto/openssl/crypto/aes/asm/aesni-sha1-x86_64.pl new file mode 100644 index 0000000000..c6f6b3334a --- /dev/null +++ b/crypto/openssl/crypto/aes/asm/aesni-sha1-x86_64.pl @@ -0,0 +1,1249 @@ +#!/usr/bin/env perl +# +# ==================================================================== +# Written by Andy Polyakov for the OpenSSL +# project. The module is, however, dual licensed under OpenSSL and +# CRYPTOGAMS licenses depending on where you obtain it. For further +# details see http://www.openssl.org/~appro/cryptogams/. +# ==================================================================== +# +# June 2011 +# +# This is AESNI-CBC+SHA1 "stitch" implementation. The idea, as spelled +# in http://download.intel.com/design/intarch/papers/323686.pdf, is +# that since AESNI-CBC encrypt exhibit *very* low instruction-level +# parallelism, interleaving it with another algorithm would allow to +# utilize processor resources better and achieve better performance. +# SHA1 instruction sequences(*) are taken from sha1-x86_64.pl and +# AESNI code is weaved into it. Below are performance numbers in +# cycles per processed byte, less is better, for standalone AESNI-CBC +# encrypt, sum of the latter and standalone SHA1, and "stitched" +# subroutine: +# +# AES-128-CBC +SHA1 stitch gain +# Westmere 3.77[+5.6] 9.37 6.65 +41% +# Sandy Bridge 5.05[+5.2(6.3)] 10.25(11.35) 6.16(7.08) +67%(+60%) +# +# AES-192-CBC +# Westmere 4.51 10.11 6.97 +45% +# Sandy Bridge 6.05 11.25(12.35) 6.34(7.27) +77%(+70%) +# +# AES-256-CBC +# Westmere 5.25 10.85 7.25 +50% +# Sandy Bridge 7.05 12.25(13.35) 7.06(7.70) +74%(+73%) +# +# (*) There are two code paths: SSSE3 and AVX. See sha1-568.pl for +# background information. Above numbers in parentheses are SSSE3 +# results collected on AVX-capable CPU, i.e. apply on OSes that +# don't support AVX. +# +# Needless to mention that it makes no sense to implement "stitched" +# *decrypt* subroutine. Because *both* AESNI-CBC decrypt and SHA1 +# fully utilize parallelism, so stitching would not give any gain +# anyway. Well, there might be some, e.g. because of better cache +# locality... For reference, here are performance results for +# standalone AESNI-CBC decrypt: +# +# AES-128-CBC AES-192-CBC AES-256-CBC +# Westmere 1.31 1.55 1.80 +# Sandy Bridge 0.93 1.06 1.22 + +$flavour = shift; +$output = shift; +if ($flavour =~ /\./) { $output = $flavour; undef $flavour; } + +$win64=0; $win64=1 if ($flavour =~ /[nm]asm|mingw64/ || $output =~ /\.asm$/); + +$0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1; +( $xlate="${dir}x86_64-xlate.pl" and -f $xlate ) or +( $xlate="${dir}../../perlasm/x86_64-xlate.pl" and -f $xlate) or +die "can't locate x86_64-xlate.pl"; + +$avx=1 if (`$ENV{CC} -Wa,-v -c -o /dev/null -x assembler /dev/null 2>&1` + =~ /GNU assembler version ([2-9]\.[0-9]+)/ && + $1>=2.19); +$avx=1 if (!$avx && $win64 && ($flavour =~ /nasm/ || $ENV{ASM} =~ /nasm/) && + `nasm -v 2>&1` =~ /NASM version ([2-9]\.[0-9]+)/ && + $1>=2.09); +$avx=1 if (!$avx && $win64 && ($flavour =~ /masm/ || $ENV{ASM} =~ /ml64/) && + `ml64 2>&1` =~ /Version ([0-9]+)\./ && + $1>=10); + +open STDOUT,"| $^X $xlate $flavour $output"; + +# void aesni_cbc_sha1_enc(const void *inp, +# void *out, +# size_t length, +# const AES_KEY *key, +# unsigned char *iv, +# SHA_CTX *ctx, +# const void *in0); + +$code.=<<___; +.text +.extern OPENSSL_ia32cap_P + +.globl aesni_cbc_sha1_enc +.type aesni_cbc_sha1_enc,\@abi-omnipotent +.align 16 +aesni_cbc_sha1_enc: + # caller should check for SSSE3 and AES-NI bits + mov OPENSSL_ia32cap_P+0(%rip),%r10d + mov OPENSSL_ia32cap_P+4(%rip),%r11d +___ +$code.=<<___ if ($avx); + and \$`1<<28`,%r11d # mask AVX bit + and \$`1<<30`,%r10d # mask "Intel CPU" bit + or %r11d,%r10d + cmp \$`1<<28|1<<30`,%r10d + je aesni_cbc_sha1_enc_avx +___ +$code.=<<___; + jmp aesni_cbc_sha1_enc_ssse3 + ret +.size aesni_cbc_sha1_enc,.-aesni_cbc_sha1_enc +___ + +my ($in0,$out,$len,$key,$ivp,$ctx,$inp)=("%rdi","%rsi","%rdx","%rcx","%r8","%r9","%r10"); + +my $Xi=4; +my @X=map("%xmm$_",(4..7,0..3)); +my @Tx=map("%xmm$_",(8..10)); +my @V=($A,$B,$C,$D,$E)=("%eax","%ebx","%ecx","%edx","%ebp"); # size optimization +my @T=("%esi","%edi"); +my $j=0; my $jj=0; my $r=0; my $sn=0; +my $K_XX_XX="%r11"; +my ($iv,$in,$rndkey0)=map("%xmm$_",(11..13)); +my @rndkey=("%xmm14","%xmm15"); + +sub AUTOLOAD() # thunk [simplified] 32-bit style perlasm +{ my $opcode = $AUTOLOAD; $opcode =~ s/.*:://; + my $arg = pop; + $arg = "\$$arg" if ($arg*1 eq $arg); + $code .= "\t$opcode\t".join(',',$arg,reverse @_)."\n"; +} + +my $_rol=sub { &rol(@_) }; +my $_ror=sub { &ror(@_) }; + +$code.=<<___; +.type aesni_cbc_sha1_enc_ssse3,\@function,6 +.align 16 +aesni_cbc_sha1_enc_ssse3: + mov `($win64?56:8)`(%rsp),$inp # load 7th argument + #shr \$6,$len # debugging artefact + #jz .Lepilogue_ssse3 # debugging artefact + push %rbx + push %rbp + push %r12 + push %r13 + push %r14 + push %r15 + lea `-104-($win64?10*16:0)`(%rsp),%rsp + #mov $in0,$inp # debugging artefact + #lea 64(%rsp),$ctx # debugging artefact +___ +$code.=<<___ if ($win64); + movaps %xmm6,96+0(%rsp) + movaps %xmm7,96+16(%rsp) + movaps %xmm8,96+32(%rsp) + movaps %xmm9,96+48(%rsp) + movaps %xmm10,96+64(%rsp) + movaps %xmm11,96+80(%rsp) + movaps %xmm12,96+96(%rsp) + movaps %xmm13,96+112(%rsp) + movaps %xmm14,96+128(%rsp) + movaps %xmm15,96+144(%rsp) +.Lprologue_ssse3: +___ +$code.=<<___; + mov $in0,%r12 # reassign arguments + mov $out,%r13 + mov $len,%r14 + mov $key,%r15 + movdqu ($ivp),$iv # load IV + mov $ivp,88(%rsp) # save $ivp +___ +my ($in0,$out,$len,$key)=map("%r$_",(12..15)); # reassign arguments +my $rounds="${ivp}d"; +$code.=<<___; + shl \$6,$len + sub $in0,$out + mov 240($key),$rounds + add $inp,$len # end of input + + lea K_XX_XX(%rip),$K_XX_XX + mov 0($ctx),$A # load context + mov 4($ctx),$B + mov 8($ctx),$C + mov 12($ctx),$D + mov $B,@T[0] # magic seed + mov 16($ctx),$E + + movdqa 64($K_XX_XX),@X[2] # pbswap mask + movdqa 0($K_XX_XX),@Tx[1] # K_00_19 + movdqu 0($inp),@X[-4&7] # load input to %xmm[0-3] + movdqu 16($inp),@X[-3&7] + movdqu 32($inp),@X[-2&7] + movdqu 48($inp),@X[-1&7] + pshufb @X[2],@X[-4&7] # byte swap + add \$64,$inp + pshufb @X[2],@X[-3&7] + pshufb @X[2],@X[-2&7] + pshufb @X[2],@X[-1&7] + paddd @Tx[1],@X[-4&7] # add K_00_19 + paddd @Tx[1],@X[-3&7] + paddd @Tx[1],@X[-2&7] + movdqa @X[-4&7],0(%rsp) # X[]+K xfer to IALU + psubd @Tx[1],@X[-4&7] # restore X[] + movdqa @X[-3&7],16(%rsp) + psubd @Tx[1],@X[-3&7] + movdqa @X[-2&7],32(%rsp) + psubd @Tx[1],@X[-2&7] + movups ($key),$rndkey0 # $key[0] + movups 16($key),$rndkey[0] # forward reference + jmp .Loop_ssse3 +___ + +my $aesenc=sub { + use integer; + my ($n,$k)=($r/10,$r%10); + if ($k==0) { + $code.=<<___; + movups `16*$n`($in0),$in # load input + xorps $rndkey0,$in +___ + $code.=<<___ if ($n); + movups $iv,`16*($n-1)`($out,$in0) # write output +___ + $code.=<<___; + xorps $in,$iv + aesenc $rndkey[0],$iv + movups `32+16*$k`($key),$rndkey[1] +___ + } elsif ($k==9) { + $sn++; + $code.=<<___; + cmp \$11,$rounds + jb .Laesenclast$sn + movups `32+16*($k+0)`($key),$rndkey[1] + aesenc $rndkey[0],$iv + movups `32+16*($k+1)`($key),$rndkey[0] + aesenc $rndkey[1],$iv + je .Laesenclast$sn + movups `32+16*($k+2)`($key),$rndkey[1] + aesenc $rndkey[0],$iv + movups `32+16*($k+3)`($key),$rndkey[0] + aesenc $rndkey[1],$iv +.Laesenclast$sn: + aesenclast $rndkey[0],$iv + movups 16($key),$rndkey[1] # forward reference +___ + } else { + $code.=<<___; + aesenc $rndkey[0],$iv + movups `32+16*$k`($key),$rndkey[1] +___ + } + $r++; unshift(@rndkey,pop(@rndkey)); +}; + +sub Xupdate_ssse3_16_31() # recall that $Xi starts wtih 4 +{ use integer; + my $body = shift; + my @insns = (&$body,&$body,&$body,&$body); # 40 instructions + my ($a,$b,$c,$d,$e); + + &movdqa (@X[0],@X[-3&7]); + eval(shift(@insns)); + eval(shift(@insns)); + &movdqa (@Tx[0],@X[-1&7]); + &palignr(@X[0],@X[-4&7],8); # compose "X[-14]" in "X[0]" + eval(shift(@insns)); + eval(shift(@insns)); + + &paddd (@Tx[1],@X[-1&7]); + eval(shift(@insns)); + eval(shift(@insns)); + &psrldq (@Tx[0],4); # "X[-3]", 3 dwords + eval(shift(@insns)); + eval(shift(@insns)); + &pxor (@X[0],@X[-4&7]); # "X[0]"^="X[-16]" + eval(shift(@insns)); + eval(shift(@insns)); + + &pxor (@Tx[0],@X[-2&7]); # "X[-3]"^"X[-8]" + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + + &pxor (@X[0],@Tx[0]); # "X[0]"^="X[-3]"^"X[-8]" + eval(shift(@insns)); + eval(shift(@insns)); + &movdqa (eval(16*(($Xi-1)&3))."(%rsp)",@Tx[1]); # X[]+K xfer to IALU + eval(shift(@insns)); + eval(shift(@insns)); + + &movdqa (@Tx[2],@X[0]); + &movdqa (@Tx[0],@X[0]); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + + &pslldq (@Tx[2],12); # "X[0]"<<96, extract one dword + &paddd (@X[0],@X[0]); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + + &psrld (@Tx[0],31); + eval(shift(@insns)); + eval(shift(@insns)); + &movdqa (@Tx[1],@Tx[2]); + eval(shift(@insns)); + eval(shift(@insns)); + + &psrld (@Tx[2],30); + &por (@X[0],@Tx[0]); # "X[0]"<<<=1 + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + + &pslld (@Tx[1],2); + &pxor (@X[0],@Tx[2]); + eval(shift(@insns)); + eval(shift(@insns)); + &movdqa (@Tx[2],eval(16*(($Xi)/5))."($K_XX_XX)"); # K_XX_XX + eval(shift(@insns)); + eval(shift(@insns)); + + &pxor (@X[0],@Tx[1]); # "X[0]"^=("X[0]">>96)<<<2 + + foreach (@insns) { eval; } # remaining instructions [if any] + + $Xi++; push(@X,shift(@X)); # "rotate" X[] + push(@Tx,shift(@Tx)); +} + +sub Xupdate_ssse3_32_79() +{ use integer; + my $body = shift; + my @insns = (&$body,&$body,&$body,&$body); # 32 to 48 instructions + my ($a,$b,$c,$d,$e); + + &movdqa (@Tx[0],@X[-1&7]) if ($Xi==8); + eval(shift(@insns)); # body_20_39 + &pxor (@X[0],@X[-4&7]); # "X[0]"="X[-32]"^"X[-16]" + &palignr(@Tx[0],@X[-2&7],8); # compose "X[-6]" + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); # rol + + &pxor (@X[0],@X[-7&7]); # "X[0]"^="X[-28]" + eval(shift(@insns)); + eval(shift(@insns)) if (@insns[0] !~ /&ro[rl]/); + if ($Xi%5) { + &movdqa (@Tx[2],@Tx[1]);# "perpetuate" K_XX_XX... + } else { # ... or load next one + &movdqa (@Tx[2],eval(16*($Xi/5))."($K_XX_XX)"); + } + &paddd (@Tx[1],@X[-1&7]); + eval(shift(@insns)); # ror + eval(shift(@insns)); + + &pxor (@X[0],@Tx[0]); # "X[0]"^="X[-6]" + eval(shift(@insns)); # body_20_39 + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); # rol + + &movdqa (@Tx[0],@X[0]); + &movdqa (eval(16*(($Xi-1)&3))."(%rsp)",@Tx[1]); # X[]+K xfer to IALU + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); # ror + eval(shift(@insns)); + + &pslld (@X[0],2); + eval(shift(@insns)); # body_20_39 + eval(shift(@insns)); + &psrld (@Tx[0],30); + eval(shift(@insns)); + eval(shift(@insns)); # rol + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); # ror + eval(shift(@insns)); + + &por (@X[0],@Tx[0]); # "X[0]"<<<=2 + eval(shift(@insns)); # body_20_39 + eval(shift(@insns)); + &movdqa (@Tx[1],@X[0]) if ($Xi<19); + eval(shift(@insns)); + eval(shift(@insns)); # rol + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); # rol + eval(shift(@insns)); + + foreach (@insns) { eval; } # remaining instructions + + $Xi++; push(@X,shift(@X)); # "rotate" X[] + push(@Tx,shift(@Tx)); +} + +sub Xuplast_ssse3_80() +{ use integer; + my $body = shift; + my @insns = (&$body,&$body,&$body,&$body); # 32 instructions + my ($a,$b,$c,$d,$e); + + eval(shift(@insns)); + &paddd (@Tx[1],@X[-1&7]); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + + &movdqa (eval(16*(($Xi-1)&3))."(%rsp)",@Tx[1]); # X[]+K xfer IALU + + foreach (@insns) { eval; } # remaining instructions + + &cmp ($inp,$len); + &je (".Ldone_ssse3"); + + unshift(@Tx,pop(@Tx)); + + &movdqa (@X[2],"64($K_XX_XX)"); # pbswap mask + &movdqa (@Tx[1],"0($K_XX_XX)"); # K_00_19 + &movdqu (@X[-4&7],"0($inp)"); # load input + &movdqu (@X[-3&7],"16($inp)"); + &movdqu (@X[-2&7],"32($inp)"); + &movdqu (@X[-1&7],"48($inp)"); + &pshufb (@X[-4&7],@X[2]); # byte swap + &add ($inp,64); + + $Xi=0; +} + +sub Xloop_ssse3() +{ use integer; + my $body = shift; + my @insns = (&$body,&$body,&$body,&$body); # 32 instructions + my ($a,$b,$c,$d,$e); + + eval(shift(@insns)); + eval(shift(@insns)); + &pshufb (@X[($Xi-3)&7],@X[2]); + eval(shift(@insns)); + eval(shift(@insns)); + &paddd (@X[($Xi-4)&7],@Tx[1]); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + &movdqa (eval(16*$Xi)."(%rsp)",@X[($Xi-4)&7]); # X[]+K xfer to IALU + eval(shift(@insns)); + eval(shift(@insns)); + &psubd (@X[($Xi-4)&7],@Tx[1]); + + foreach (@insns) { eval; } + $Xi++; +} + +sub Xtail_ssse3() +{ use integer; + my $body = shift; + my @insns = (&$body,&$body,&$body,&$body); # 32 instructions + my ($a,$b,$c,$d,$e); + + foreach (@insns) { eval; } +} + +sub body_00_19 () { + use integer; + my ($k,$n); + my @r=( + '($a,$b,$c,$d,$e)=@V;'. + '&add ($e,eval(4*($j&15))."(%rsp)");', # X[]+K xfer + '&xor ($c,$d);', + '&mov (@T[1],$a);', # $b in next round + '&$_rol ($a,5);', + '&and (@T[0],$c);', # ($b&($c^$d)) + '&xor ($c,$d);', # restore $c + '&xor (@T[0],$d);', + '&add ($e,$a);', + '&$_ror ($b,$j?7:2);', # $b>>>2 + '&add ($e,@T[0]);' .'$j++; unshift(@V,pop(@V)); unshift(@T,pop(@T));' + ); + $n = scalar(@r); + $k = (($jj+1)*12/20)*20*$n/12; # 12 aesencs per these 20 rounds + @r[$k%$n].='&$aesenc();' if ($jj==$k/$n); + $jj++; + return @r; +} + +sub body_20_39 () { + use integer; + my ($k,$n); + my @r=( + '($a,$b,$c,$d,$e)=@V;'. + '&add ($e,eval(4*($j++&15))."(%rsp)");', # X[]+K xfer + '&xor (@T[0],$d);', # ($b^$d) + '&mov (@T[1],$a);', # $b in next round + '&$_rol ($a,5);', + '&xor (@T[0],$c);', # ($b^$d^$c) + '&add ($e,$a);', + '&$_ror ($b,7);', # $b>>>2 + '&add ($e,@T[0]);' .'unshift(@V,pop(@V)); unshift(@T,pop(@T));' + ); + $n = scalar(@r); + $k = (($jj+1)*8/20)*20*$n/8; # 8 aesencs per these 20 rounds + @r[$k%$n].='&$aesenc();' if ($jj==$k/$n); + $jj++; + return @r; +} + +sub body_40_59 () { + use integer; + my ($k,$n); + my @r=( + '($a,$b,$c,$d,$e)=@V;'. + '&mov (@T[1],$c);', + '&xor ($c,$d);', + '&add ($e,eval(4*($j++&15))."(%rsp)");', # X[]+K xfer + '&and (@T[1],$d);', + '&and (@T[0],$c);', # ($b&($c^$d)) + '&$_ror ($b,7);', # $b>>>2 + '&add ($e,@T[1]);', + '&mov (@T[1],$a);', # $b in next round + '&$_rol ($a,5);', + '&add ($e,@T[0]);', + '&xor ($c,$d);', # restore $c + '&add ($e,$a);' .'unshift(@V,pop(@V)); unshift(@T,pop(@T));' + ); + $n = scalar(@r); + $k=(($jj+1)*12/20)*20*$n/12; # 12 aesencs per these 20 rounds + @r[$k%$n].='&$aesenc();' if ($jj==$k/$n); + $jj++; + return @r; +} +$code.=<<___; +.align 16 +.Loop_ssse3: +___ + &Xupdate_ssse3_16_31(\&body_00_19); + &Xupdate_ssse3_16_31(\&body_00_19); + &Xupdate_ssse3_16_31(\&body_00_19); + &Xupdate_ssse3_16_31(\&body_00_19); + &Xupdate_ssse3_32_79(\&body_00_19); + &Xupdate_ssse3_32_79(\&body_20_39); + &Xupdate_ssse3_32_79(\&body_20_39); + &Xupdate_ssse3_32_79(\&body_20_39); + &Xupdate_ssse3_32_79(\&body_20_39); + &Xupdate_ssse3_32_79(\&body_20_39); + &Xupdate_ssse3_32_79(\&body_40_59); + &Xupdate_ssse3_32_79(\&body_40_59); + &Xupdate_ssse3_32_79(\&body_40_59); + &Xupdate_ssse3_32_79(\&body_40_59); + &Xupdate_ssse3_32_79(\&body_40_59); + &Xupdate_ssse3_32_79(\&body_20_39); + &Xuplast_ssse3_80(\&body_20_39); # can jump to "done" + + $saved_j=$j; @saved_V=@V; + $saved_r=$r; @saved_rndkey=@rndkey; + + &Xloop_ssse3(\&body_20_39); + &Xloop_ssse3(\&body_20_39); + &Xloop_ssse3(\&body_20_39); + +$code.=<<___; + movups $iv,48($out,$in0) # write output + lea 64($in0),$in0 + + add 0($ctx),$A # update context + add 4($ctx),@T[0] + add 8($ctx),$C + add 12($ctx),$D + mov $A,0($ctx) + add 16($ctx),$E + mov @T[0],4($ctx) + mov @T[0],$B # magic seed + mov $C,8($ctx) + mov $D,12($ctx) + mov $E,16($ctx) + jmp .Loop_ssse3 + +.align 16 +.Ldone_ssse3: +___ + $jj=$j=$saved_j; @V=@saved_V; + $r=$saved_r; @rndkey=@saved_rndkey; + + &Xtail_ssse3(\&body_20_39); + &Xtail_ssse3(\&body_20_39); + &Xtail_ssse3(\&body_20_39); + +$code.=<<___; + movups $iv,48($out,$in0) # write output + mov 88(%rsp),$ivp # restore $ivp + + add 0($ctx),$A # update context + add 4($ctx),@T[0] + add 8($ctx),$C + mov $A,0($ctx) + add 12($ctx),$D + mov @T[0],4($ctx) + add 16($ctx),$E + mov $C,8($ctx) + mov $D,12($ctx) + mov $E,16($ctx) + movups $iv,($ivp) # write IV +___ +$code.=<<___ if ($win64); + movaps 96+0(%rsp),%xmm6 + movaps 96+16(%rsp),%xmm7 + movaps 96+32(%rsp),%xmm8 + movaps 96+48(%rsp),%xmm9 + movaps 96+64(%rsp),%xmm10 + movaps 96+80(%rsp),%xmm11 + movaps 96+96(%rsp),%xmm12 + movaps 96+112(%rsp),%xmm13 + movaps 96+128(%rsp),%xmm14 + movaps 96+144(%rsp),%xmm15 +___ +$code.=<<___; + lea `104+($win64?10*16:0)`(%rsp),%rsi + mov 0(%rsi),%r15 + mov 8(%rsi),%r14 + mov 16(%rsi),%r13 + mov 24(%rsi),%r12 + mov 32(%rsi),%rbp + mov 40(%rsi),%rbx + lea 48(%rsi),%rsp +.Lepilogue_ssse3: + ret +.size aesni_cbc_sha1_enc_ssse3,.-aesni_cbc_sha1_enc_ssse3 +___ + +$j=$jj=$r=$sn=0; + +if ($avx) { +my ($in0,$out,$len,$key,$ivp,$ctx,$inp)=("%rdi","%rsi","%rdx","%rcx","%r8","%r9","%r10"); + +my $Xi=4; +my @X=map("%xmm$_",(4..7,0..3)); +my @Tx=map("%xmm$_",(8..10)); +my @V=($A,$B,$C,$D,$E)=("%eax","%ebx","%ecx","%edx","%ebp"); # size optimization +my @T=("%esi","%edi"); + +my $_rol=sub { &shld(@_[0],@_) }; +my $_ror=sub { &shrd(@_[0],@_) }; + +$code.=<<___; +.type aesni_cbc_sha1_enc_avx,\@function,6 +.align 16 +aesni_cbc_sha1_enc_avx: + mov `($win64?56:8)`(%rsp),$inp # load 7th argument + #shr \$6,$len # debugging artefact + #jz .Lepilogue_avx # debugging artefact + push %rbx + push %rbp + push %r12 + push %r13 + push %r14 + push %r15 + lea `-104-($win64?10*16:0)`(%rsp),%rsp + #mov $in0,$inp # debugging artefact + #lea 64(%rsp),$ctx # debugging artefact +___ +$code.=<<___ if ($win64); + movaps %xmm6,96+0(%rsp) + movaps %xmm7,96+16(%rsp) + movaps %xmm8,96+32(%rsp) + movaps %xmm9,96+48(%rsp) + movaps %xmm10,96+64(%rsp) + movaps %xmm11,96+80(%rsp) + movaps %xmm12,96+96(%rsp) + movaps %xmm13,96+112(%rsp) + movaps %xmm14,96+128(%rsp) + movaps %xmm15,96+144(%rsp) +.Lprologue_avx: +___ +$code.=<<___; + vzeroall + mov $in0,%r12 # reassign arguments + mov $out,%r13 + mov $len,%r14 + mov $key,%r15 + vmovdqu ($ivp),$iv # load IV + mov $ivp,88(%rsp) # save $ivp +___ +my ($in0,$out,$len,$key)=map("%r$_",(12..15)); # reassign arguments +my $rounds="${ivp}d"; +$code.=<<___; + shl \$6,$len + sub $in0,$out + mov 240($key),$rounds + add \$112,$key # size optimization + add $inp,$len # end of input + + lea K_XX_XX(%rip),$K_XX_XX + mov 0($ctx),$A # load context + mov 4($ctx),$B + mov 8($ctx),$C + mov 12($ctx),$D + mov $B,@T[0] # magic seed + mov 16($ctx),$E + + vmovdqa 64($K_XX_XX),@X[2] # pbswap mask + vmovdqa 0($K_XX_XX),@Tx[1] # K_00_19 + vmovdqu 0($inp),@X[-4&7] # load input to %xmm[0-3] + vmovdqu 16($inp),@X[-3&7] + vmovdqu 32($inp),@X[-2&7] + vmovdqu 48($inp),@X[-1&7] + vpshufb @X[2],@X[-4&7],@X[-4&7] # byte swap + add \$64,$inp + vpshufb @X[2],@X[-3&7],@X[-3&7] + vpshufb @X[2],@X[-2&7],@X[-2&7] + vpshufb @X[2],@X[-1&7],@X[-1&7] + vpaddd @Tx[1],@X[-4&7],@X[0] # add K_00_19 + vpaddd @Tx[1],@X[-3&7],@X[1] + vpaddd @Tx[1],@X[-2&7],@X[2] + vmovdqa @X[0],0(%rsp) # X[]+K xfer to IALU + vmovdqa @X[1],16(%rsp) + vmovdqa @X[2],32(%rsp) + vmovups -112($key),$rndkey0 # $key[0] + vmovups 16-112($key),$rndkey[0] # forward reference + jmp .Loop_avx +___ + +my $aesenc=sub { + use integer; + my ($n,$k)=($r/10,$r%10); + if ($k==0) { + $code.=<<___; + vmovups `16*$n`($in0),$in # load input + vxorps $rndkey0,$in,$in +___ + $code.=<<___ if ($n); + vmovups $iv,`16*($n-1)`($out,$in0) # write output +___ + $code.=<<___; + vxorps $in,$iv,$iv + vaesenc $rndkey[0],$iv,$iv + vmovups `32+16*$k-112`($key),$rndkey[1] +___ + } elsif ($k==9) { + $sn++; + $code.=<<___; + cmp \$11,$rounds + jb .Lvaesenclast$sn + vaesenc $rndkey[0],$iv,$iv + vmovups `32+16*($k+0)-112`($key),$rndkey[1] + vaesenc $rndkey[1],$iv,$iv + vmovups `32+16*($k+1)-112`($key),$rndkey[0] + je .Lvaesenclast$sn + vaesenc $rndkey[0],$iv,$iv + vmovups `32+16*($k+2)-112`($key),$rndkey[1] + vaesenc $rndkey[1],$iv,$iv + vmovups `32+16*($k+3)-112`($key),$rndkey[0] +.Lvaesenclast$sn: + vaesenclast $rndkey[0],$iv,$iv + vmovups 16-112($key),$rndkey[1] # forward reference +___ + } else { + $code.=<<___; + vaesenc $rndkey[0],$iv,$iv + vmovups `32+16*$k-112`($key),$rndkey[1] +___ + } + $r++; unshift(@rndkey,pop(@rndkey)); +}; + +sub Xupdate_avx_16_31() # recall that $Xi starts wtih 4 +{ use integer; + my $body = shift; + my @insns = (&$body,&$body,&$body,&$body); # 40 instructions + my ($a,$b,$c,$d,$e); + + eval(shift(@insns)); + eval(shift(@insns)); + &vpalignr(@X[0],@X[-3&7],@X[-4&7],8); # compose "X[-14]" in "X[0]" + eval(shift(@insns)); + eval(shift(@insns)); + + &vpaddd (@Tx[1],@Tx[1],@X[-1&7]); + eval(shift(@insns)); + eval(shift(@insns)); + &vpsrldq(@Tx[0],@X[-1&7],4); # "X[-3]", 3 dwords + eval(shift(@insns)); + eval(shift(@insns)); + &vpxor (@X[0],@X[0],@X[-4&7]); # "X[0]"^="X[-16]" + eval(shift(@insns)); + eval(shift(@insns)); + + &vpxor (@Tx[0],@Tx[0],@X[-2&7]); # "X[-3]"^"X[-8]" + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + + &vpxor (@X[0],@X[0],@Tx[0]); # "X[0]"^="X[-3]"^"X[-8]" + eval(shift(@insns)); + eval(shift(@insns)); + &vmovdqa (eval(16*(($Xi-1)&3))."(%rsp)",@Tx[1]); # X[]+K xfer to IALU + eval(shift(@insns)); + eval(shift(@insns)); + + &vpsrld (@Tx[0],@X[0],31); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + + &vpslldq(@Tx[2],@X[0],12); # "X[0]"<<96, extract one dword + &vpaddd (@X[0],@X[0],@X[0]); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + + &vpsrld (@Tx[1],@Tx[2],30); + &vpor (@X[0],@X[0],@Tx[0]); # "X[0]"<<<=1 + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + + &vpslld (@Tx[2],@Tx[2],2); + &vpxor (@X[0],@X[0],@Tx[1]); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + + &vpxor (@X[0],@X[0],@Tx[2]); # "X[0]"^=("X[0]">>96)<<<2 + eval(shift(@insns)); + eval(shift(@insns)); + &vmovdqa (@Tx[2],eval(16*(($Xi)/5))."($K_XX_XX)"); # K_XX_XX + eval(shift(@insns)); + eval(shift(@insns)); + + + foreach (@insns) { eval; } # remaining instructions [if any] + + $Xi++; push(@X,shift(@X)); # "rotate" X[] + push(@Tx,shift(@Tx)); +} + +sub Xupdate_avx_32_79() +{ use integer; + my $body = shift; + my @insns = (&$body,&$body,&$body,&$body); # 32 to 48 instructions + my ($a,$b,$c,$d,$e); + + &vpalignr(@Tx[0],@X[-1&7],@X[-2&7],8); # compose "X[-6]" + &vpxor (@X[0],@X[0],@X[-4&7]); # "X[0]"="X[-32]"^"X[-16]" + eval(shift(@insns)); # body_20_39 + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); # rol + + &vpxor (@X[0],@X[0],@X[-7&7]); # "X[0]"^="X[-28]" + eval(shift(@insns)); + eval(shift(@insns)) if (@insns[0] !~ /&ro[rl]/); + if ($Xi%5) { + &vmovdqa (@Tx[2],@Tx[1]);# "perpetuate" K_XX_XX... + } else { # ... or load next one + &vmovdqa (@Tx[2],eval(16*($Xi/5))."($K_XX_XX)"); + } + &vpaddd (@Tx[1],@Tx[1],@X[-1&7]); + eval(shift(@insns)); # ror + eval(shift(@insns)); + + &vpxor (@X[0],@X[0],@Tx[0]); # "X[0]"^="X[-6]" + eval(shift(@insns)); # body_20_39 + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); # rol + + &vpsrld (@Tx[0],@X[0],30); + &vmovdqa (eval(16*(($Xi-1)&3))."(%rsp)",@Tx[1]); # X[]+K xfer to IALU + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); # ror + eval(shift(@insns)); + + &vpslld (@X[0],@X[0],2); + eval(shift(@insns)); # body_20_39 + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); # rol + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); # ror + eval(shift(@insns)); + + &vpor (@X[0],@X[0],@Tx[0]); # "X[0]"<<<=2 + eval(shift(@insns)); # body_20_39 + eval(shift(@insns)); + &vmovdqa (@Tx[1],@X[0]) if ($Xi<19); + eval(shift(@insns)); + eval(shift(@insns)); # rol + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); # rol + eval(shift(@insns)); + + foreach (@insns) { eval; } # remaining instructions + + $Xi++; push(@X,shift(@X)); # "rotate" X[] + push(@Tx,shift(@Tx)); +} + +sub Xuplast_avx_80() +{ use integer; + my $body = shift; + my @insns = (&$body,&$body,&$body,&$body); # 32 instructions + my ($a,$b,$c,$d,$e); + + eval(shift(@insns)); + &vpaddd (@Tx[1],@Tx[1],@X[-1&7]); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + + &movdqa (eval(16*(($Xi-1)&3))."(%rsp)",@Tx[1]); # X[]+K xfer IALU + + foreach (@insns) { eval; } # remaining instructions + + &cmp ($inp,$len); + &je (".Ldone_avx"); + + unshift(@Tx,pop(@Tx)); + + &vmovdqa(@X[2],"64($K_XX_XX)"); # pbswap mask + &vmovdqa(@Tx[1],"0($K_XX_XX)"); # K_00_19 + &vmovdqu(@X[-4&7],"0($inp)"); # load input + &vmovdqu(@X[-3&7],"16($inp)"); + &vmovdqu(@X[-2&7],"32($inp)"); + &vmovdqu(@X[-1&7],"48($inp)"); + &vpshufb(@X[-4&7],@X[-4&7],@X[2]); # byte swap + &add ($inp,64); + + $Xi=0; +} + +sub Xloop_avx() +{ use integer; + my $body = shift; + my @insns = (&$body,&$body,&$body,&$body); # 32 instructions + my ($a,$b,$c,$d,$e); + + eval(shift(@insns)); + eval(shift(@insns)); + &vpshufb(@X[($Xi-3)&7],@X[($Xi-3)&7],@X[2]); + eval(shift(@insns)); + eval(shift(@insns)); + &vpaddd (@X[$Xi&7],@X[($Xi-4)&7],@Tx[1]); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + &vmovdqa(eval(16*$Xi)."(%rsp)",@X[$Xi&7]); # X[]+K xfer to IALU + eval(shift(@insns)); + eval(shift(@insns)); + + foreach (@insns) { eval; } + $Xi++; +} + +sub Xtail_avx() +{ use integer; + my $body = shift; + my @insns = (&$body,&$body,&$body,&$body); # 32 instructions + my ($a,$b,$c,$d,$e); + + foreach (@insns) { eval; } +} + +$code.=<<___; +.align 16 +.Loop_avx: +___ + &Xupdate_avx_16_31(\&body_00_19); + &Xupdate_avx_16_31(\&body_00_19); + &Xupdate_avx_16_31(\&body_00_19); + &Xupdate_avx_16_31(\&body_00_19); + &Xupdate_avx_32_79(\&body_00_19); + &Xupdate_avx_32_79(\&body_20_39); + &Xupdate_avx_32_79(\&body_20_39); + &Xupdate_avx_32_79(\&body_20_39); + &Xupdate_avx_32_79(\&body_20_39); + &Xupdate_avx_32_79(\&body_20_39); + &Xupdate_avx_32_79(\&body_40_59); + &Xupdate_avx_32_79(\&body_40_59); + &Xupdate_avx_32_79(\&body_40_59); + &Xupdate_avx_32_79(\&body_40_59); + &Xupdate_avx_32_79(\&body_40_59); + &Xupdate_avx_32_79(\&body_20_39); + &Xuplast_avx_80(\&body_20_39); # can jump to "done" + + $saved_j=$j; @saved_V=@V; + $saved_r=$r; @saved_rndkey=@rndkey; + + &Xloop_avx(\&body_20_39); + &Xloop_avx(\&body_20_39); + &Xloop_avx(\&body_20_39); + +$code.=<<___; + vmovups $iv,48($out,$in0) # write output + lea 64($in0),$in0 + + add 0($ctx),$A # update context + add 4($ctx),@T[0] + add 8($ctx),$C + add 12($ctx),$D + mov $A,0($ctx) + add 16($ctx),$E + mov @T[0],4($ctx) + mov @T[0],$B # magic seed + mov $C,8($ctx) + mov $D,12($ctx) + mov $E,16($ctx) + jmp .Loop_avx + +.align 16 +.Ldone_avx: +___ + $jj=$j=$saved_j; @V=@saved_V; + $r=$saved_r; @rndkey=@saved_rndkey; + + &Xtail_avx(\&body_20_39); + &Xtail_avx(\&body_20_39); + &Xtail_avx(\&body_20_39); + +$code.=<<___; + vmovups $iv,48($out,$in0) # write output + mov 88(%rsp),$ivp # restore $ivp + + add 0($ctx),$A # update context + add 4($ctx),@T[0] + add 8($ctx),$C + mov $A,0($ctx) + add 12($ctx),$D + mov @T[0],4($ctx) + add 16($ctx),$E + mov $C,8($ctx) + mov $D,12($ctx) + mov $E,16($ctx) + vmovups $iv,($ivp) # write IV + vzeroall +___ +$code.=<<___ if ($win64); + movaps 96+0(%rsp),%xmm6 + movaps 96+16(%rsp),%xmm7 + movaps 96+32(%rsp),%xmm8 + movaps 96+48(%rsp),%xmm9 + movaps 96+64(%rsp),%xmm10 + movaps 96+80(%rsp),%xmm11 + movaps 96+96(%rsp),%xmm12 + movaps 96+112(%rsp),%xmm13 + movaps 96+128(%rsp),%xmm14 + movaps 96+144(%rsp),%xmm15 +___ +$code.=<<___; + lea `104+($win64?10*16:0)`(%rsp),%rsi + mov 0(%rsi),%r15 + mov 8(%rsi),%r14 + mov 16(%rsi),%r13 + mov 24(%rsi),%r12 + mov 32(%rsi),%rbp + mov 40(%rsi),%rbx + lea 48(%rsi),%rsp +.Lepilogue_avx: + ret +.size aesni_cbc_sha1_enc_avx,.-aesni_cbc_sha1_enc_avx +___ +} +$code.=<<___; +.align 64 +K_XX_XX: +.long 0x5a827999,0x5a827999,0x5a827999,0x5a827999 # K_00_19 +.long 0x6ed9eba1,0x6ed9eba1,0x6ed9eba1,0x6ed9eba1 # K_20_39 +.long 0x8f1bbcdc,0x8f1bbcdc,0x8f1bbcdc,0x8f1bbcdc # K_40_59 +.long 0xca62c1d6,0xca62c1d6,0xca62c1d6,0xca62c1d6 # K_60_79 +.long 0x00010203,0x04050607,0x08090a0b,0x0c0d0e0f # pbswap mask + +.asciz "AESNI-CBC+SHA1 stitch for x86_64, CRYPTOGAMS by " +.align 64 +___ + +# EXCEPTION_DISPOSITION handler (EXCEPTION_RECORD *rec,ULONG64 frame, +# CONTEXT *context,DISPATCHER_CONTEXT *disp) +if ($win64) { +$rec="%rcx"; +$frame="%rdx"; +$context="%r8"; +$disp="%r9"; + +$code.=<<___; +.extern __imp_RtlVirtualUnwind +.type ssse3_handler,\@abi-omnipotent +.align 16 +ssse3_handler: + push %rsi + push %rdi + push %rbx + push %rbp + push %r12 + push %r13 + push %r14 + push %r15 + pushfq + sub \$64,%rsp + + mov 120($context),%rax # pull context->Rax + mov 248($context),%rbx # pull context->Rip + + mov 8($disp),%rsi # disp->ImageBase + mov 56($disp),%r11 # disp->HandlerData + + mov 0(%r11),%r10d # HandlerData[0] + lea (%rsi,%r10),%r10 # prologue label + cmp %r10,%rbx # context->RipRsp + + mov 4(%r11),%r10d # HandlerData[1] + lea (%rsi,%r10),%r10 # epilogue label + cmp %r10,%rbx # context->Rip>=epilogue label + jae .Lcommon_seh_tail + + lea 96(%rax),%rsi + lea 512($context),%rdi # &context.Xmm6 + mov \$20,%ecx + .long 0xa548f3fc # cld; rep movsq + lea `104+10*16`(%rax),%rax # adjust stack pointer + + mov 0(%rax),%r15 + mov 8(%rax),%r14 + mov 16(%rax),%r13 + mov 24(%rax),%r12 + mov 32(%rax),%rbp + mov 40(%rax),%rbx + lea 48(%rax),%rax + mov %rbx,144($context) # restore context->Rbx + mov %rbp,160($context) # restore context->Rbp + mov %r12,216($context) # restore context->R12 + mov %r13,224($context) # restore context->R13 + mov %r14,232($context) # restore context->R14 + mov %r15,240($context) # restore context->R15 + +.Lcommon_seh_tail: + mov 8(%rax),%rdi + mov 16(%rax),%rsi + mov %rax,152($context) # restore context->Rsp + mov %rsi,168($context) # restore context->Rsi + mov %rdi,176($context) # restore context->Rdi + + mov 40($disp),%rdi # disp->ContextRecord + mov $context,%rsi # context + mov \$154,%ecx # sizeof(CONTEXT) + .long 0xa548f3fc # cld; rep movsq + + mov $disp,%rsi + xor %rcx,%rcx # arg1, UNW_FLAG_NHANDLER + mov 8(%rsi),%rdx # arg2, disp->ImageBase + mov 0(%rsi),%r8 # arg3, disp->ControlPc + mov 16(%rsi),%r9 # arg4, disp->FunctionEntry + mov 40(%rsi),%r10 # disp->ContextRecord + lea 56(%rsi),%r11 # &disp->HandlerData + lea 24(%rsi),%r12 # &disp->EstablisherFrame + mov %r10,32(%rsp) # arg5 + mov %r11,40(%rsp) # arg6 + mov %r12,48(%rsp) # arg7 + mov %rcx,56(%rsp) # arg8, (NULL) + call *__imp_RtlVirtualUnwind(%rip) + + mov \$1,%eax # ExceptionContinueSearch + add \$64,%rsp + popfq + pop %r15 + pop %r14 + pop %r13 + pop %r12 + pop %rbp + pop %rbx + pop %rdi + pop %rsi + ret +.size ssse3_handler,.-ssse3_handler + +.section .pdata +.align 4 + .rva .LSEH_begin_aesni_cbc_sha1_enc_ssse3 + .rva .LSEH_end_aesni_cbc_sha1_enc_ssse3 + .rva .LSEH_info_aesni_cbc_sha1_enc_ssse3 +___ +$code.=<<___ if ($avx); + .rva .LSEH_begin_aesni_cbc_sha1_enc_avx + .rva .LSEH_end_aesni_cbc_sha1_enc_avx + .rva .LSEH_info_aesni_cbc_sha1_enc_avx +___ +$code.=<<___; +.section .xdata +.align 8 +.LSEH_info_aesni_cbc_sha1_enc_ssse3: + .byte 9,0,0,0 + .rva ssse3_handler + .rva .Lprologue_ssse3,.Lepilogue_ssse3 # HandlerData[] +___ +$code.=<<___ if ($avx); +.LSEH_info_aesni_cbc_sha1_enc_avx: + .byte 9,0,0,0 + .rva ssse3_handler + .rva .Lprologue_avx,.Lepilogue_avx # HandlerData[] +___ +} + +#################################################################### +sub rex { + local *opcode=shift; + my ($dst,$src)=@_; + my $rex=0; + + $rex|=0x04 if($dst>=8); + $rex|=0x01 if($src>=8); + push @opcode,$rex|0x40 if($rex); +} + +sub aesni { + my $line=shift; + my @opcode=(0x66); + + if ($line=~/(aes[a-z]+)\s+%xmm([0-9]+),\s*%xmm([0-9]+)/) { + my %opcodelet = ( + "aesenc" => 0xdc, "aesenclast" => 0xdd + ); + return undef if (!defined($opcodelet{$1})); + rex(\@opcode,$3,$2); + push @opcode,0x0f,0x38,$opcodelet{$1}; + push @opcode,0xc0|($2&7)|(($3&7)<<3); # ModR/M + return ".byte\t".join(',',@opcode); + } + return $line; +} + +$code =~ s/\`([^\`]*)\`/eval($1)/gem; +$code =~ s/\b(aes.*%xmm[0-9]+).*$/aesni($1)/gem; + +print $code; +close STDOUT; diff --git a/crypto/openssl/crypto/aes/asm/aesni-x86.pl b/crypto/openssl/crypto/aes/asm/aesni-x86.pl new file mode 100644 index 0000000000..3dc345b585 --- /dev/null +++ b/crypto/openssl/crypto/aes/asm/aesni-x86.pl @@ -0,0 +1,2189 @@ +#!/usr/bin/env perl + +# ==================================================================== +# Written by Andy Polyakov for the OpenSSL +# project. The module is, however, dual licensed under OpenSSL and +# CRYPTOGAMS licenses depending on where you obtain it. For further +# details see http://www.openssl.org/~appro/cryptogams/. +# ==================================================================== +# +# This module implements support for Intel AES-NI extension. In +# OpenSSL context it's used with Intel engine, but can also be used as +# drop-in replacement for crypto/aes/asm/aes-586.pl [see below for +# details]. +# +# Performance. +# +# To start with see corresponding paragraph in aesni-x86_64.pl... +# Instead of filling table similar to one found there I've chosen to +# summarize *comparison* results for raw ECB, CTR and CBC benchmarks. +# The simplified table below represents 32-bit performance relative +# to 64-bit one in every given point. Ratios vary for different +# encryption modes, therefore interval values. +# +# 16-byte 64-byte 256-byte 1-KB 8-KB +# 53-67% 67-84% 91-94% 95-98% 97-99.5% +# +# Lower ratios for smaller block sizes are perfectly understandable, +# because function call overhead is higher in 32-bit mode. Largest +# 8-KB block performance is virtually same: 32-bit code is less than +# 1% slower for ECB, CBC and CCM, and ~3% slower otherwise. + +# January 2011 +# +# See aesni-x86_64.pl for details. Unlike x86_64 version this module +# interleaves at most 6 aes[enc|dec] instructions, because there are +# not enough registers for 8x interleave [which should be optimal for +# Sandy Bridge]. Actually, performance results for 6x interleave +# factor presented in aesni-x86_64.pl (except for CTR) are for this +# module. + +# April 2011 +# +# Add aesni_xts_[en|de]crypt. Westmere spends 1.50 cycles processing +# one byte out of 8KB with 128-bit key, Sandy Bridge - 1.09. + +$PREFIX="aesni"; # if $PREFIX is set to "AES", the script + # generates drop-in replacement for + # crypto/aes/asm/aes-586.pl:-) +$inline=1; # inline _aesni_[en|de]crypt + +$0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1; +push(@INC,"${dir}","${dir}../../perlasm"); +require "x86asm.pl"; + +&asm_init($ARGV[0],$0); + +if ($PREFIX eq "aesni") { $movekey=*movups; } +else { $movekey=*movups; } + +$len="eax"; +$rounds="ecx"; +$key="edx"; +$inp="esi"; +$out="edi"; +$rounds_="ebx"; # backup copy for $rounds +$key_="ebp"; # backup copy for $key + +$rndkey0="xmm0"; +$rndkey1="xmm1"; +$inout0="xmm2"; +$inout1="xmm3"; +$inout2="xmm4"; +$inout3="xmm5"; $in1="xmm5"; +$inout4="xmm6"; $in0="xmm6"; +$inout5="xmm7"; $ivec="xmm7"; + +# AESNI extenstion +sub aeskeygenassist +{ my($dst,$src,$imm)=@_; + if ("$dst:$src" =~ /xmm([0-7]):xmm([0-7])/) + { &data_byte(0x66,0x0f,0x3a,0xdf,0xc0|($1<<3)|$2,$imm); } +} +sub aescommon +{ my($opcodelet,$dst,$src)=@_; + if ("$dst:$src" =~ /xmm([0-7]):xmm([0-7])/) + { &data_byte(0x66,0x0f,0x38,$opcodelet,0xc0|($1<<3)|$2);} +} +sub aesimc { aescommon(0xdb,@_); } +sub aesenc { aescommon(0xdc,@_); } +sub aesenclast { aescommon(0xdd,@_); } +sub aesdec { aescommon(0xde,@_); } +sub aesdeclast { aescommon(0xdf,@_); } + +# Inline version of internal aesni_[en|de]crypt1 +{ my $sn; +sub aesni_inline_generate1 +{ my ($p,$inout,$ivec)=@_; $inout=$inout0 if (!defined($inout)); + $sn++; + + &$movekey ($rndkey0,&QWP(0,$key)); + &$movekey ($rndkey1,&QWP(16,$key)); + &xorps ($ivec,$rndkey0) if (defined($ivec)); + &lea ($key,&DWP(32,$key)); + &xorps ($inout,$ivec) if (defined($ivec)); + &xorps ($inout,$rndkey0) if (!defined($ivec)); + &set_label("${p}1_loop_$sn"); + eval"&aes${p} ($inout,$rndkey1)"; + &dec ($rounds); + &$movekey ($rndkey1,&QWP(0,$key)); + &lea ($key,&DWP(16,$key)); + &jnz (&label("${p}1_loop_$sn")); + eval"&aes${p}last ($inout,$rndkey1)"; +}} + +sub aesni_generate1 # fully unrolled loop +{ my ($p,$inout)=@_; $inout=$inout0 if (!defined($inout)); + + &function_begin_B("_aesni_${p}rypt1"); + &movups ($rndkey0,&QWP(0,$key)); + &$movekey ($rndkey1,&QWP(0x10,$key)); + &xorps ($inout,$rndkey0); + &$movekey ($rndkey0,&QWP(0x20,$key)); + &lea ($key,&DWP(0x30,$key)); + &cmp ($rounds,11); + &jb (&label("${p}128")); + &lea ($key,&DWP(0x20,$key)); + &je (&label("${p}192")); + &lea ($key,&DWP(0x20,$key)); + eval"&aes${p} ($inout,$rndkey1)"; + &$movekey ($rndkey1,&QWP(-0x40,$key)); + eval"&aes${p} ($inout,$rndkey0)"; + &$movekey ($rndkey0,&QWP(-0x30,$key)); + &set_label("${p}192"); + eval"&aes${p} ($inout,$rndkey1)"; + &$movekey ($rndkey1,&QWP(-0x20,$key)); + eval"&aes${p} ($inout,$rndkey0)"; + &$movekey ($rndkey0,&QWP(-0x10,$key)); + &set_label("${p}128"); + eval"&aes${p} ($inout,$rndkey1)"; + &$movekey ($rndkey1,&QWP(0,$key)); + eval"&aes${p} ($inout,$rndkey0)"; + &$movekey ($rndkey0,&QWP(0x10,$key)); + eval"&aes${p} ($inout,$rndkey1)"; + &$movekey ($rndkey1,&QWP(0x20,$key)); + eval"&aes${p} ($inout,$rndkey0)"; + &$movekey ($rndkey0,&QWP(0x30,$key)); + eval"&aes${p} ($inout,$rndkey1)"; + &$movekey ($rndkey1,&QWP(0x40,$key)); + eval"&aes${p} ($inout,$rndkey0)"; + &$movekey ($rndkey0,&QWP(0x50,$key)); + eval"&aes${p} ($inout,$rndkey1)"; + &$movekey ($rndkey1,&QWP(0x60,$key)); + eval"&aes${p} ($inout,$rndkey0)"; + &$movekey ($rndkey0,&QWP(0x70,$key)); + eval"&aes${p} ($inout,$rndkey1)"; + eval"&aes${p}last ($inout,$rndkey0)"; + &ret(); + &function_end_B("_aesni_${p}rypt1"); +} + +# void $PREFIX_encrypt (const void *inp,void *out,const AES_KEY *key); +&aesni_generate1("enc") if (!$inline); +&function_begin_B("${PREFIX}_encrypt"); + &mov ("eax",&wparam(0)); + &mov ($key,&wparam(2)); + &movups ($inout0,&QWP(0,"eax")); + &mov ($rounds,&DWP(240,$key)); + &mov ("eax",&wparam(1)); + if ($inline) + { &aesni_inline_generate1("enc"); } + else + { &call ("_aesni_encrypt1"); } + &movups (&QWP(0,"eax"),$inout0); + &ret (); +&function_end_B("${PREFIX}_encrypt"); + +# void $PREFIX_decrypt (const void *inp,void *out,const AES_KEY *key); +&aesni_generate1("dec") if(!$inline); +&function_begin_B("${PREFIX}_decrypt"); + &mov ("eax",&wparam(0)); + &mov ($key,&wparam(2)); + &movups ($inout0,&QWP(0,"eax")); + &mov ($rounds,&DWP(240,$key)); + &mov ("eax",&wparam(1)); + if ($inline) + { &aesni_inline_generate1("dec"); } + else + { &call ("_aesni_decrypt1"); } + &movups (&QWP(0,"eax"),$inout0); + &ret (); +&function_end_B("${PREFIX}_decrypt"); + +# _aesni_[en|de]cryptN are private interfaces, N denotes interleave +# factor. Why 3x subroutine were originally used in loops? Even though +# aes[enc|dec] latency was originally 6, it could be scheduled only +# every *2nd* cycle. Thus 3x interleave was the one providing optimal +# utilization, i.e. when subroutine's throughput is virtually same as +# of non-interleaved subroutine [for number of input blocks up to 3]. +# This is why it makes no sense to implement 2x subroutine. +# aes[enc|dec] latency in next processor generation is 8, but the +# instructions can be scheduled every cycle. Optimal interleave for +# new processor is therefore 8x, but it's unfeasible to accommodate it +# in XMM registers addreassable in 32-bit mode and therefore 6x is +# used instead... + +sub aesni_generate3 +{ my $p=shift; + + &function_begin_B("_aesni_${p}rypt3"); + &$movekey ($rndkey0,&QWP(0,$key)); + &shr ($rounds,1); + &$movekey ($rndkey1,&QWP(16,$key)); + &lea ($key,&DWP(32,$key)); + &xorps ($inout0,$rndkey0); + &pxor ($inout1,$rndkey0); + &pxor ($inout2,$rndkey0); + &$movekey ($rndkey0,&QWP(0,$key)); + + &set_label("${p}3_loop"); + eval"&aes${p} ($inout0,$rndkey1)"; + eval"&aes${p} ($inout1,$rndkey1)"; + &dec ($rounds); + eval"&aes${p} ($inout2,$rndkey1)"; + &$movekey ($rndkey1,&QWP(16,$key)); + eval"&aes${p} ($inout0,$rndkey0)"; + eval"&aes${p} ($inout1,$rndkey0)"; + &lea ($key,&DWP(32,$key)); + eval"&aes${p} ($inout2,$rndkey0)"; + &$movekey ($rndkey0,&QWP(0,$key)); + &jnz (&label("${p}3_loop")); + eval"&aes${p} ($inout0,$rndkey1)"; + eval"&aes${p} ($inout1,$rndkey1)"; + eval"&aes${p} ($inout2,$rndkey1)"; + eval"&aes${p}last ($inout0,$rndkey0)"; + eval"&aes${p}last ($inout1,$rndkey0)"; + eval"&aes${p}last ($inout2,$rndkey0)"; + &ret(); + &function_end_B("_aesni_${p}rypt3"); +} + +# 4x interleave is implemented to improve small block performance, +# most notably [and naturally] 4 block by ~30%. One can argue that one +# should have implemented 5x as well, but improvement would be <20%, +# so it's not worth it... +sub aesni_generate4 +{ my $p=shift; + + &function_begin_B("_aesni_${p}rypt4"); + &$movekey ($rndkey0,&QWP(0,$key)); + &$movekey ($rndkey1,&QWP(16,$key)); + &shr ($rounds,1); + &lea ($key,&DWP(32,$key)); + &xorps ($inout0,$rndkey0); + &pxor ($inout1,$rndkey0); + &pxor ($inout2,$rndkey0); + &pxor ($inout3,$rndkey0); + &$movekey ($rndkey0,&QWP(0,$key)); + + &set_label("${p}4_loop"); + eval"&aes${p} ($inout0,$rndkey1)"; + eval"&aes${p} ($inout1,$rndkey1)"; + &dec ($rounds); + eval"&aes${p} ($inout2,$rndkey1)"; + eval"&aes${p} ($inout3,$rndkey1)"; + &$movekey ($rndkey1,&QWP(16,$key)); + eval"&aes${p} ($inout0,$rndkey0)"; + eval"&aes${p} ($inout1,$rndkey0)"; + &lea ($key,&DWP(32,$key)); + eval"&aes${p} ($inout2,$rndkey0)"; + eval"&aes${p} ($inout3,$rndkey0)"; + &$movekey ($rndkey0,&QWP(0,$key)); + &jnz (&label("${p}4_loop")); + + eval"&aes${p} ($inout0,$rndkey1)"; + eval"&aes${p} ($inout1,$rndkey1)"; + eval"&aes${p} ($inout2,$rndkey1)"; + eval"&aes${p} ($inout3,$rndkey1)"; + eval"&aes${p}last ($inout0,$rndkey0)"; + eval"&aes${p}last ($inout1,$rndkey0)"; + eval"&aes${p}last ($inout2,$rndkey0)"; + eval"&aes${p}last ($inout3,$rndkey0)"; + &ret(); + &function_end_B("_aesni_${p}rypt4"); +} + +sub aesni_generate6 +{ my $p=shift; + + &function_begin_B("_aesni_${p}rypt6"); + &static_label("_aesni_${p}rypt6_enter"); + &$movekey ($rndkey0,&QWP(0,$key)); + &shr ($rounds,1); + &$movekey ($rndkey1,&QWP(16,$key)); + &lea ($key,&DWP(32,$key)); + &xorps ($inout0,$rndkey0); + &pxor ($inout1,$rndkey0); # pxor does better here + eval"&aes${p} ($inout0,$rndkey1)"; + &pxor ($inout2,$rndkey0); + eval"&aes${p} ($inout1,$rndkey1)"; + &pxor ($inout3,$rndkey0); + &dec ($rounds); + eval"&aes${p} ($inout2,$rndkey1)"; + &pxor ($inout4,$rndkey0); + eval"&aes${p} ($inout3,$rndkey1)"; + &pxor ($inout5,$rndkey0); + eval"&aes${p} ($inout4,$rndkey1)"; + &$movekey ($rndkey0,&QWP(0,$key)); + eval"&aes${p} ($inout5,$rndkey1)"; + &jmp (&label("_aesni_${p}rypt6_enter")); + + &set_label("${p}6_loop",16); + eval"&aes${p} ($inout0,$rndkey1)"; + eval"&aes${p} ($inout1,$rndkey1)"; + &dec ($rounds); + eval"&aes${p} ($inout2,$rndkey1)"; + eval"&aes${p} ($inout3,$rndkey1)"; + eval"&aes${p} ($inout4,$rndkey1)"; + eval"&aes${p} ($inout5,$rndkey1)"; + &set_label("_aesni_${p}rypt6_enter",16); + &$movekey ($rndkey1,&QWP(16,$key)); + eval"&aes${p} ($inout0,$rndkey0)"; + eval"&aes${p} ($inout1,$rndkey0)"; + &lea ($key,&DWP(32,$key)); + eval"&aes${p} ($inout2,$rndkey0)"; + eval"&aes${p} ($inout3,$rndkey0)"; + eval"&aes${p} ($inout4,$rndkey0)"; + eval"&aes${p} ($inout5,$rndkey0)"; + &$movekey ($rndkey0,&QWP(0,$key)); + &jnz (&label("${p}6_loop")); + + eval"&aes${p} ($inout0,$rndkey1)"; + eval"&aes${p} ($inout1,$rndkey1)"; + eval"&aes${p} ($inout2,$rndkey1)"; + eval"&aes${p} ($inout3,$rndkey1)"; + eval"&aes${p} ($inout4,$rndkey1)"; + eval"&aes${p} ($inout5,$rndkey1)"; + eval"&aes${p}last ($inout0,$rndkey0)"; + eval"&aes${p}last ($inout1,$rndkey0)"; + eval"&aes${p}last ($inout2,$rndkey0)"; + eval"&aes${p}last ($inout3,$rndkey0)"; + eval"&aes${p}last ($inout4,$rndkey0)"; + eval"&aes${p}last ($inout5,$rndkey0)"; + &ret(); + &function_end_B("_aesni_${p}rypt6"); +} +&aesni_generate3("enc") if ($PREFIX eq "aesni"); +&aesni_generate3("dec"); +&aesni_generate4("enc") if ($PREFIX eq "aesni"); +&aesni_generate4("dec"); +&aesni_generate6("enc") if ($PREFIX eq "aesni"); +&aesni_generate6("dec"); + +if ($PREFIX eq "aesni") { +###################################################################### +# void aesni_ecb_encrypt (const void *in, void *out, +# size_t length, const AES_KEY *key, +# int enc); +&function_begin("aesni_ecb_encrypt"); + &mov ($inp,&wparam(0)); + &mov ($out,&wparam(1)); + &mov ($len,&wparam(2)); + &mov ($key,&wparam(3)); + &mov ($rounds_,&wparam(4)); + &and ($len,-16); + &jz (&label("ecb_ret")); + &mov ($rounds,&DWP(240,$key)); + &test ($rounds_,$rounds_); + &jz (&label("ecb_decrypt")); + + &mov ($key_,$key); # backup $key + &mov ($rounds_,$rounds); # backup $rounds + &cmp ($len,0x60); + &jb (&label("ecb_enc_tail")); + + &movdqu ($inout0,&QWP(0,$inp)); + &movdqu ($inout1,&QWP(0x10,$inp)); + &movdqu ($inout2,&QWP(0x20,$inp)); + &movdqu ($inout3,&QWP(0x30,$inp)); + &movdqu ($inout4,&QWP(0x40,$inp)); + &movdqu ($inout5,&QWP(0x50,$inp)); + &lea ($inp,&DWP(0x60,$inp)); + &sub ($len,0x60); + &jmp (&label("ecb_enc_loop6_enter")); + +&set_label("ecb_enc_loop6",16); + &movups (&QWP(0,$out),$inout0); + &movdqu ($inout0,&QWP(0,$inp)); + &movups (&QWP(0x10,$out),$inout1); + &movdqu ($inout1,&QWP(0x10,$inp)); + &movups (&QWP(0x20,$out),$inout2); + &movdqu ($inout2,&QWP(0x20,$inp)); + &movups (&QWP(0x30,$out),$inout3); + &movdqu ($inout3,&QWP(0x30,$inp)); + &movups (&QWP(0x40,$out),$inout4); + &movdqu ($inout4,&QWP(0x40,$inp)); + &movups (&QWP(0x50,$out),$inout5); + &lea ($out,&DWP(0x60,$out)); + &movdqu ($inout5,&QWP(0x50,$inp)); + &lea ($inp,&DWP(0x60,$inp)); +&set_label("ecb_enc_loop6_enter"); + + &call ("_aesni_encrypt6"); + + &mov ($key,$key_); # restore $key + &mov ($rounds,$rounds_); # restore $rounds + &sub ($len,0x60); + &jnc (&label("ecb_enc_loop6")); + + &movups (&QWP(0,$out),$inout0); + &movups (&QWP(0x10,$out),$inout1); + &movups (&QWP(0x20,$out),$inout2); + &movups (&QWP(0x30,$out),$inout3); + &movups (&QWP(0x40,$out),$inout4); + &movups (&QWP(0x50,$out),$inout5); + &lea ($out,&DWP(0x60,$out)); + &add ($len,0x60); + &jz (&label("ecb_ret")); + +&set_label("ecb_enc_tail"); + &movups ($inout0,&QWP(0,$inp)); + &cmp ($len,0x20); + &jb (&label("ecb_enc_one")); + &movups ($inout1,&QWP(0x10,$inp)); + &je (&label("ecb_enc_two")); + &movups ($inout2,&QWP(0x20,$inp)); + &cmp ($len,0x40); + &jb (&label("ecb_enc_three")); + &movups ($inout3,&QWP(0x30,$inp)); + &je (&label("ecb_enc_four")); + &movups ($inout4,&QWP(0x40,$inp)); + &xorps ($inout5,$inout5); + &call ("_aesni_encrypt6"); + &movups (&QWP(0,$out),$inout0); + &movups (&QWP(0x10,$out),$inout1); + &movups (&QWP(0x20,$out),$inout2); + &movups (&QWP(0x30,$out),$inout3); + &movups (&QWP(0x40,$out),$inout4); + jmp (&label("ecb_ret")); + +&set_label("ecb_enc_one",16); + if ($inline) + { &aesni_inline_generate1("enc"); } + else + { &call ("_aesni_encrypt1"); } + &movups (&QWP(0,$out),$inout0); + &jmp (&label("ecb_ret")); + +&set_label("ecb_enc_two",16); + &xorps ($inout2,$inout2); + &call ("_aesni_encrypt3"); + &movups (&QWP(0,$out),$inout0); + &movups (&QWP(0x10,$out),$inout1); + &jmp (&label("ecb_ret")); + +&set_label("ecb_enc_three",16); + &call ("_aesni_encrypt3"); + &movups (&QWP(0,$out),$inout0); + &movups (&QWP(0x10,$out),$inout1); + &movups (&QWP(0x20,$out),$inout2); + &jmp (&label("ecb_ret")); + +&set_label("ecb_enc_four",16); + &call ("_aesni_encrypt4"); + &movups (&QWP(0,$out),$inout0); + &movups (&QWP(0x10,$out),$inout1); + &movups (&QWP(0x20,$out),$inout2); + &movups (&QWP(0x30,$out),$inout3); + &jmp (&label("ecb_ret")); +###################################################################### +&set_label("ecb_decrypt",16); + &mov ($key_,$key); # backup $key + &mov ($rounds_,$rounds); # backup $rounds + &cmp ($len,0x60); + &jb (&label("ecb_dec_tail")); + + &movdqu ($inout0,&QWP(0,$inp)); + &movdqu ($inout1,&QWP(0x10,$inp)); + &movdqu ($inout2,&QWP(0x20,$inp)); + &movdqu ($inout3,&QWP(0x30,$inp)); + &movdqu ($inout4,&QWP(0x40,$inp)); + &movdqu ($inout5,&QWP(0x50,$inp)); + &lea ($inp,&DWP(0x60,$inp)); + &sub ($len,0x60); + &jmp (&label("ecb_dec_loop6_enter")); + +&set_label("ecb_dec_loop6",16); + &movups (&QWP(0,$out),$inout0); + &movdqu ($inout0,&QWP(0,$inp)); + &movups (&QWP(0x10,$out),$inout1); + &movdqu ($inout1,&QWP(0x10,$inp)); + &movups (&QWP(0x20,$out),$inout2); + &movdqu ($inout2,&QWP(0x20,$inp)); + &movups (&QWP(0x30,$out),$inout3); + &movdqu ($inout3,&QWP(0x30,$inp)); + &movups (&QWP(0x40,$out),$inout4); + &movdqu ($inout4,&QWP(0x40,$inp)); + &movups (&QWP(0x50,$out),$inout5); + &lea ($out,&DWP(0x60,$out)); + &movdqu ($inout5,&QWP(0x50,$inp)); + &lea ($inp,&DWP(0x60,$inp)); +&set_label("ecb_dec_loop6_enter"); + + &call ("_aesni_decrypt6"); + + &mov ($key,$key_); # restore $key + &mov ($rounds,$rounds_); # restore $rounds + &sub ($len,0x60); + &jnc (&label("ecb_dec_loop6")); + + &movups (&QWP(0,$out),$inout0); + &movups (&QWP(0x10,$out),$inout1); + &movups (&QWP(0x20,$out),$inout2); + &movups (&QWP(0x30,$out),$inout3); + &movups (&QWP(0x40,$out),$inout4); + &movups (&QWP(0x50,$out),$inout5); + &lea ($out,&DWP(0x60,$out)); + &add ($len,0x60); + &jz (&label("ecb_ret")); + +&set_label("ecb_dec_tail"); + &movups ($inout0,&QWP(0,$inp)); + &cmp ($len,0x20); + &jb (&label("ecb_dec_one")); + &movups ($inout1,&QWP(0x10,$inp)); + &je (&label("ecb_dec_two")); + &movups ($inout2,&QWP(0x20,$inp)); + &cmp ($len,0x40); + &jb (&label("ecb_dec_three")); + &movups ($inout3,&QWP(0x30,$inp)); + &je (&label("ecb_dec_four")); + &movups ($inout4,&QWP(0x40,$inp)); + &xorps ($inout5,$inout5); + &call ("_aesni_decrypt6"); + &movups (&QWP(0,$out),$inout0); + &movups (&QWP(0x10,$out),$inout1); + &movups (&QWP(0x20,$out),$inout2); + &movups (&QWP(0x30,$out),$inout3); + &movups (&QWP(0x40,$out),$inout4); + &jmp (&label("ecb_ret")); + +&set_label("ecb_dec_one",16); + if ($inline) + { &aesni_inline_generate1("dec"); } + else + { &call ("_aesni_decrypt1"); } + &movups (&QWP(0,$out),$inout0); + &jmp (&label("ecb_ret")); + +&set_label("ecb_dec_two",16); + &xorps ($inout2,$inout2); + &call ("_aesni_decrypt3"); + &movups (&QWP(0,$out),$inout0); + &movups (&QWP(0x10,$out),$inout1); + &jmp (&label("ecb_ret")); + +&set_label("ecb_dec_three",16); + &call ("_aesni_decrypt3"); + &movups (&QWP(0,$out),$inout0); + &movups (&QWP(0x10,$out),$inout1); + &movups (&QWP(0x20,$out),$inout2); + &jmp (&label("ecb_ret")); + +&set_label("ecb_dec_four",16); + &call ("_aesni_decrypt4"); + &movups (&QWP(0,$out),$inout0); + &movups (&QWP(0x10,$out),$inout1); + &movups (&QWP(0x20,$out),$inout2); + &movups (&QWP(0x30,$out),$inout3); + +&set_label("ecb_ret"); +&function_end("aesni_ecb_encrypt"); + +###################################################################### +# void aesni_ccm64_[en|de]crypt_blocks (const void *in, void *out, +# size_t blocks, const AES_KEY *key, +# const char *ivec,char *cmac); +# +# Handles only complete blocks, operates on 64-bit counter and +# does not update *ivec! Nor does it finalize CMAC value +# (see engine/eng_aesni.c for details) +# +{ my $cmac=$inout1; +&function_begin("aesni_ccm64_encrypt_blocks"); + &mov ($inp,&wparam(0)); + &mov ($out,&wparam(1)); + &mov ($len,&wparam(2)); + &mov ($key,&wparam(3)); + &mov ($rounds_,&wparam(4)); + &mov ($rounds,&wparam(5)); + &mov ($key_,"esp"); + &sub ("esp",60); + &and ("esp",-16); # align stack + &mov (&DWP(48,"esp"),$key_); + + &movdqu ($ivec,&QWP(0,$rounds_)); # load ivec + &movdqu ($cmac,&QWP(0,$rounds)); # load cmac + &mov ($rounds,&DWP(240,$key)); + + # compose byte-swap control mask for pshufb on stack + &mov (&DWP(0,"esp"),0x0c0d0e0f); + &mov (&DWP(4,"esp"),0x08090a0b); + &mov (&DWP(8,"esp"),0x04050607); + &mov (&DWP(12,"esp"),0x00010203); + + # compose counter increment vector on stack + &mov ($rounds_,1); + &xor ($key_,$key_); + &mov (&DWP(16,"esp"),$rounds_); + &mov (&DWP(20,"esp"),$key_); + &mov (&DWP(24,"esp"),$key_); + &mov (&DWP(28,"esp"),$key_); + + &shr ($rounds,1); + &lea ($key_,&DWP(0,$key)); + &movdqa ($inout3,&QWP(0,"esp")); + &movdqa ($inout0,$ivec); + &mov ($rounds_,$rounds); + &pshufb ($ivec,$inout3); + +&set_label("ccm64_enc_outer"); + &$movekey ($rndkey0,&QWP(0,$key_)); + &mov ($rounds,$rounds_); + &movups ($in0,&QWP(0,$inp)); + + &xorps ($inout0,$rndkey0); + &$movekey ($rndkey1,&QWP(16,$key_)); + &xorps ($rndkey0,$in0); + &lea ($key,&DWP(32,$key_)); + &xorps ($cmac,$rndkey0); # cmac^=inp + &$movekey ($rndkey0,&QWP(0,$key)); + +&set_label("ccm64_enc2_loop"); + &aesenc ($inout0,$rndkey1); + &dec ($rounds); + &aesenc ($cmac,$rndkey1); + &$movekey ($rndkey1,&QWP(16,$key)); + &aesenc ($inout0,$rndkey0); + &lea ($key,&DWP(32,$key)); + &aesenc ($cmac,$rndkey0); + &$movekey ($rndkey0,&QWP(0,$key)); + &jnz (&label("ccm64_enc2_loop")); + &aesenc ($inout0,$rndkey1); + &aesenc ($cmac,$rndkey1); + &paddq ($ivec,&QWP(16,"esp")); + &aesenclast ($inout0,$rndkey0); + &aesenclast ($cmac,$rndkey0); + + &dec ($len); + &lea ($inp,&DWP(16,$inp)); + &xorps ($in0,$inout0); # inp^=E(ivec) + &movdqa ($inout0,$ivec); + &movups (&QWP(0,$out),$in0); # save output + &lea ($out,&DWP(16,$out)); + &pshufb ($inout0,$inout3); + &jnz (&label("ccm64_enc_outer")); + + &mov ("esp",&DWP(48,"esp")); + &mov ($out,&wparam(5)); + &movups (&QWP(0,$out),$cmac); +&function_end("aesni_ccm64_encrypt_blocks"); + +&function_begin("aesni_ccm64_decrypt_blocks"); + &mov ($inp,&wparam(0)); + &mov ($out,&wparam(1)); + &mov ($len,&wparam(2)); + &mov ($key,&wparam(3)); + &mov ($rounds_,&wparam(4)); + &mov ($rounds,&wparam(5)); + &mov ($key_,"esp"); + &sub ("esp",60); + &and ("esp",-16); # align stack + &mov (&DWP(48,"esp"),$key_); + + &movdqu ($ivec,&QWP(0,$rounds_)); # load ivec + &movdqu ($cmac,&QWP(0,$rounds)); # load cmac + &mov ($rounds,&DWP(240,$key)); + + # compose byte-swap control mask for pshufb on stack + &mov (&DWP(0,"esp"),0x0c0d0e0f); + &mov (&DWP(4,"esp"),0x08090a0b); + &mov (&DWP(8,"esp"),0x04050607); + &mov (&DWP(12,"esp"),0x00010203); + + # compose counter increment vector on stack + &mov ($rounds_,1); + &xor ($key_,$key_); + &mov (&DWP(16,"esp"),$rounds_); + &mov (&DWP(20,"esp"),$key_); + &mov (&DWP(24,"esp"),$key_); + &mov (&DWP(28,"esp"),$key_); + + &movdqa ($inout3,&QWP(0,"esp")); # bswap mask + &movdqa ($inout0,$ivec); + + &mov ($key_,$key); + &mov ($rounds_,$rounds); + + &pshufb ($ivec,$inout3); + if ($inline) + { &aesni_inline_generate1("enc"); } + else + { &call ("_aesni_encrypt1"); } + &movups ($in0,&QWP(0,$inp)); # load inp + &paddq ($ivec,&QWP(16,"esp")); + &lea ($inp,&QWP(16,$inp)); + &jmp (&label("ccm64_dec_outer")); + +&set_label("ccm64_dec_outer",16); + &xorps ($in0,$inout0); # inp ^= E(ivec) + &movdqa ($inout0,$ivec); + &mov ($rounds,$rounds_); + &movups (&QWP(0,$out),$in0); # save output + &lea ($out,&DWP(16,$out)); + &pshufb ($inout0,$inout3); + + &sub ($len,1); + &jz (&label("ccm64_dec_break")); + + &$movekey ($rndkey0,&QWP(0,$key_)); + &shr ($rounds,1); + &$movekey ($rndkey1,&QWP(16,$key_)); + &xorps ($in0,$rndkey0); + &lea ($key,&DWP(32,$key_)); + &xorps ($inout0,$rndkey0); + &xorps ($cmac,$in0); # cmac^=out + &$movekey ($rndkey0,&QWP(0,$key)); + +&set_label("ccm64_dec2_loop"); + &aesenc ($inout0,$rndkey1); + &dec ($rounds); + &aesenc ($cmac,$rndkey1); + &$movekey ($rndkey1,&QWP(16,$key)); + &aesenc ($inout0,$rndkey0); + &lea ($key,&DWP(32,$key)); + &aesenc ($cmac,$rndkey0); + &$movekey ($rndkey0,&QWP(0,$key)); + &jnz (&label("ccm64_dec2_loop")); + &movups ($in0,&QWP(0,$inp)); # load inp + &paddq ($ivec,&QWP(16,"esp")); + &aesenc ($inout0,$rndkey1); + &aesenc ($cmac,$rndkey1); + &lea ($inp,&QWP(16,$inp)); + &aesenclast ($inout0,$rndkey0); + &aesenclast ($cmac,$rndkey0); + &jmp (&label("ccm64_dec_outer")); + +&set_label("ccm64_dec_break",16); + &mov ($key,$key_); + if ($inline) + { &aesni_inline_generate1("enc",$cmac,$in0); } + else + { &call ("_aesni_encrypt1",$cmac); } + + &mov ("esp",&DWP(48,"esp")); + &mov ($out,&wparam(5)); + &movups (&QWP(0,$out),$cmac); +&function_end("aesni_ccm64_decrypt_blocks"); +} + +###################################################################### +# void aesni_ctr32_encrypt_blocks (const void *in, void *out, +# size_t blocks, const AES_KEY *key, +# const char *ivec); +# +# Handles only complete blocks, operates on 32-bit counter and +# does not update *ivec! (see engine/eng_aesni.c for details) +# +# stack layout: +# 0 pshufb mask +# 16 vector addend: 0,6,6,6 +# 32 counter-less ivec +# 48 1st triplet of counter vector +# 64 2nd triplet of counter vector +# 80 saved %esp + +&function_begin("aesni_ctr32_encrypt_blocks"); + &mov ($inp,&wparam(0)); + &mov ($out,&wparam(1)); + &mov ($len,&wparam(2)); + &mov ($key,&wparam(3)); + &mov ($rounds_,&wparam(4)); + &mov ($key_,"esp"); + &sub ("esp",88); + &and ("esp",-16); # align stack + &mov (&DWP(80,"esp"),$key_); + + &cmp ($len,1); + &je (&label("ctr32_one_shortcut")); + + &movdqu ($inout5,&QWP(0,$rounds_)); # load ivec + + # compose byte-swap control mask for pshufb on stack + &mov (&DWP(0,"esp"),0x0c0d0e0f); + &mov (&DWP(4,"esp"),0x08090a0b); + &mov (&DWP(8,"esp"),0x04050607); + &mov (&DWP(12,"esp"),0x00010203); + + # compose counter increment vector on stack + &mov ($rounds,6); + &xor ($key_,$key_); + &mov (&DWP(16,"esp"),$rounds); + &mov (&DWP(20,"esp"),$rounds); + &mov (&DWP(24,"esp"),$rounds); + &mov (&DWP(28,"esp"),$key_); + + &pextrd ($rounds_,$inout5,3); # pull 32-bit counter + &pinsrd ($inout5,$key_,3); # wipe 32-bit counter + + &mov ($rounds,&DWP(240,$key)); # key->rounds + + # compose 2 vectors of 3x32-bit counters + &bswap ($rounds_); + &pxor ($rndkey1,$rndkey1); + &pxor ($rndkey0,$rndkey0); + &movdqa ($inout0,&QWP(0,"esp")); # load byte-swap mask + &pinsrd ($rndkey1,$rounds_,0); + &lea ($key_,&DWP(3,$rounds_)); + &pinsrd ($rndkey0,$key_,0); + &inc ($rounds_); + &pinsrd ($rndkey1,$rounds_,1); + &inc ($key_); + &pinsrd ($rndkey0,$key_,1); + &inc ($rounds_); + &pinsrd ($rndkey1,$rounds_,2); + &inc ($key_); + &pinsrd ($rndkey0,$key_,2); + &movdqa (&QWP(48,"esp"),$rndkey1); # save 1st triplet + &pshufb ($rndkey1,$inout0); # byte swap + &movdqa (&QWP(64,"esp"),$rndkey0); # save 2nd triplet + &pshufb ($rndkey0,$inout0); # byte swap + + &pshufd ($inout0,$rndkey1,3<<6); # place counter to upper dword + &pshufd ($inout1,$rndkey1,2<<6); + &cmp ($len,6); + &jb (&label("ctr32_tail")); + &movdqa (&QWP(32,"esp"),$inout5); # save counter-less ivec + &shr ($rounds,1); + &mov ($key_,$key); # backup $key + &mov ($rounds_,$rounds); # backup $rounds + &sub ($len,6); + &jmp (&label("ctr32_loop6")); + +&set_label("ctr32_loop6",16); + &pshufd ($inout2,$rndkey1,1<<6); + &movdqa ($rndkey1,&QWP(32,"esp")); # pull counter-less ivec + &pshufd ($inout3,$rndkey0,3<<6); + &por ($inout0,$rndkey1); # merge counter-less ivec + &pshufd ($inout4,$rndkey0,2<<6); + &por ($inout1,$rndkey1); + &pshufd ($inout5,$rndkey0,1<<6); + &por ($inout2,$rndkey1); + &por ($inout3,$rndkey1); + &por ($inout4,$rndkey1); + &por ($inout5,$rndkey1); + + # inlining _aesni_encrypt6's prologue gives ~4% improvement... + &$movekey ($rndkey0,&QWP(0,$key_)); + &$movekey ($rndkey1,&QWP(16,$key_)); + &lea ($key,&DWP(32,$key_)); + &dec ($rounds); + &pxor ($inout0,$rndkey0); + &pxor ($inout1,$rndkey0); + &aesenc ($inout0,$rndkey1); + &pxor ($inout2,$rndkey0); + &aesenc ($inout1,$rndkey1); + &pxor ($inout3,$rndkey0); + &aesenc ($inout2,$rndkey1); + &pxor ($inout4,$rndkey0); + &aesenc ($inout3,$rndkey1); + &pxor ($inout5,$rndkey0); + &aesenc ($inout4,$rndkey1); + &$movekey ($rndkey0,&QWP(0,$key)); + &aesenc ($inout5,$rndkey1); + + &call (&label("_aesni_encrypt6_enter")); + + &movups ($rndkey1,&QWP(0,$inp)); + &movups ($rndkey0,&QWP(0x10,$inp)); + &xorps ($inout0,$rndkey1); + &movups ($rndkey1,&QWP(0x20,$inp)); + &xorps ($inout1,$rndkey0); + &movups (&QWP(0,$out),$inout0); + &movdqa ($rndkey0,&QWP(16,"esp")); # load increment + &xorps ($inout2,$rndkey1); + &movdqa ($rndkey1,&QWP(48,"esp")); # load 1st triplet + &movups (&QWP(0x10,$out),$inout1); + &movups (&QWP(0x20,$out),$inout2); + + &paddd ($rndkey1,$rndkey0); # 1st triplet increment + &paddd ($rndkey0,&QWP(64,"esp")); # 2nd triplet increment + &movdqa ($inout0,&QWP(0,"esp")); # load byte swap mask + + &movups ($inout1,&QWP(0x30,$inp)); + &movups ($inout2,&QWP(0x40,$inp)); + &xorps ($inout3,$inout1); + &movups ($inout1,&QWP(0x50,$inp)); + &lea ($inp,&DWP(0x60,$inp)); + &movdqa (&QWP(48,"esp"),$rndkey1); # save 1st triplet + &pshufb ($rndkey1,$inout0); # byte swap + &xorps ($inout4,$inout2); + &movups (&QWP(0x30,$out),$inout3); + &xorps ($inout5,$inout1); + &movdqa (&QWP(64,"esp"),$rndkey0); # save 2nd triplet + &pshufb ($rndkey0,$inout0); # byte swap + &movups (&QWP(0x40,$out),$inout4); + &pshufd ($inout0,$rndkey1,3<<6); + &movups (&QWP(0x50,$out),$inout5); + &lea ($out,&DWP(0x60,$out)); + + &mov ($rounds,$rounds_); + &pshufd ($inout1,$rndkey1,2<<6); + &sub ($len,6); + &jnc (&label("ctr32_loop6")); + + &add ($len,6); + &jz (&label("ctr32_ret")); + &mov ($key,$key_); + &lea ($rounds,&DWP(1,"",$rounds,2)); # restore $rounds + &movdqa ($inout5,&QWP(32,"esp")); # pull count-less ivec + +&set_label("ctr32_tail"); + &por ($inout0,$inout5); + &cmp ($len,2); + &jb (&label("ctr32_one")); + + &pshufd ($inout2,$rndkey1,1<<6); + &por ($inout1,$inout5); + &je (&label("ctr32_two")); + + &pshufd ($inout3,$rndkey0,3<<6); + &por ($inout2,$inout5); + &cmp ($len,4); + &jb (&label("ctr32_three")); + + &pshufd ($inout4,$rndkey0,2<<6); + &por ($inout3,$inout5); + &je (&label("ctr32_four")); + + &por ($inout4,$inout5); + &call ("_aesni_encrypt6"); + &movups ($rndkey1,&QWP(0,$inp)); + &movups ($rndkey0,&QWP(0x10,$inp)); + &xorps ($inout0,$rndkey1); + &movups ($rndkey1,&QWP(0x20,$inp)); + &xorps ($inout1,$rndkey0); + &movups ($rndkey0,&QWP(0x30,$inp)); + &xorps ($inout2,$rndkey1); + &movups ($rndkey1,&QWP(0x40,$inp)); + &xorps ($inout3,$rndkey0); + &movups (&QWP(0,$out),$inout0); + &xorps ($inout4,$rndkey1); + &movups (&QWP(0x10,$out),$inout1); + &movups (&QWP(0x20,$out),$inout2); + &movups (&QWP(0x30,$out),$inout3); + &movups (&QWP(0x40,$out),$inout4); + &jmp (&label("ctr32_ret")); + +&set_label("ctr32_one_shortcut",16); + &movups ($inout0,&QWP(0,$rounds_)); # load ivec + &mov ($rounds,&DWP(240,$key)); + +&set_label("ctr32_one"); + if ($inline) + { &aesni_inline_generate1("enc"); } + else + { &call ("_aesni_encrypt1"); } + &movups ($in0,&QWP(0,$inp)); + &xorps ($in0,$inout0); + &movups (&QWP(0,$out),$in0); + &jmp (&label("ctr32_ret")); + +&set_label("ctr32_two",16); + &call ("_aesni_encrypt3"); + &movups ($inout3,&QWP(0,$inp)); + &movups ($inout4,&QWP(0x10,$inp)); + &xorps ($inout0,$inout3); + &xorps ($inout1,$inout4); + &movups (&QWP(0,$out),$inout0); + &movups (&QWP(0x10,$out),$inout1); + &jmp (&label("ctr32_ret")); + +&set_label("ctr32_three",16); + &call ("_aesni_encrypt3"); + &movups ($inout3,&QWP(0,$inp)); + &movups ($inout4,&QWP(0x10,$inp)); + &xorps ($inout0,$inout3); + &movups ($inout5,&QWP(0x20,$inp)); + &xorps ($inout1,$inout4); + &movups (&QWP(0,$out),$inout0); + &xorps ($inout2,$inout5); + &movups (&QWP(0x10,$out),$inout1); + &movups (&QWP(0x20,$out),$inout2); + &jmp (&label("ctr32_ret")); + +&set_label("ctr32_four",16); + &call ("_aesni_encrypt4"); + &movups ($inout4,&QWP(0,$inp)); + &movups ($inout5,&QWP(0x10,$inp)); + &movups ($rndkey1,&QWP(0x20,$inp)); + &xorps ($inout0,$inout4); + &movups ($rndkey0,&QWP(0x30,$inp)); + &xorps ($inout1,$inout5); + &movups (&QWP(0,$out),$inout0); + &xorps ($inout2,$rndkey1); + &movups (&QWP(0x10,$out),$inout1); + &xorps ($inout3,$rndkey0); + &movups (&QWP(0x20,$out),$inout2); + &movups (&QWP(0x30,$out),$inout3); + +&set_label("ctr32_ret"); + &mov ("esp",&DWP(80,"esp")); +&function_end("aesni_ctr32_encrypt_blocks"); + +###################################################################### +# void aesni_xts_[en|de]crypt(const char *inp,char *out,size_t len, +# const AES_KEY *key1, const AES_KEY *key2 +# const unsigned char iv[16]); +# +{ my ($tweak,$twtmp,$twres,$twmask)=($rndkey1,$rndkey0,$inout0,$inout1); + +&function_begin("aesni_xts_encrypt"); + &mov ($key,&wparam(4)); # key2 + &mov ($inp,&wparam(5)); # clear-text tweak + + &mov ($rounds,&DWP(240,$key)); # key2->rounds + &movups ($inout0,&QWP(0,$inp)); + if ($inline) + { &aesni_inline_generate1("enc"); } + else + { &call ("_aesni_encrypt1"); } + + &mov ($inp,&wparam(0)); + &mov ($out,&wparam(1)); + &mov ($len,&wparam(2)); + &mov ($key,&wparam(3)); # key1 + + &mov ($key_,"esp"); + &sub ("esp",16*7+8); + &mov ($rounds,&DWP(240,$key)); # key1->rounds + &and ("esp",-16); # align stack + + &mov (&DWP(16*6+0,"esp"),0x87); # compose the magic constant + &mov (&DWP(16*6+4,"esp"),0); + &mov (&DWP(16*6+8,"esp"),1); + &mov (&DWP(16*6+12,"esp"),0); + &mov (&DWP(16*7+0,"esp"),$len); # save original $len + &mov (&DWP(16*7+4,"esp"),$key_); # save original %esp + + &movdqa ($tweak,$inout0); + &pxor ($twtmp,$twtmp); + &movdqa ($twmask,&QWP(6*16,"esp")); # 0x0...010...87 + &pcmpgtd($twtmp,$tweak); # broadcast upper bits + + &and ($len,-16); + &mov ($key_,$key); # backup $key + &mov ($rounds_,$rounds); # backup $rounds + &sub ($len,16*6); + &jc (&label("xts_enc_short")); + + &shr ($rounds,1); + &mov ($rounds_,$rounds); + &jmp (&label("xts_enc_loop6")); + +&set_label("xts_enc_loop6",16); + for ($i=0;$i<4;$i++) { + &pshufd ($twres,$twtmp,0x13); + &pxor ($twtmp,$twtmp); + &movdqa (&QWP(16*$i,"esp"),$tweak); + &paddq ($tweak,$tweak); # &psllq($tweak,1); + &pand ($twres,$twmask); # isolate carry and residue + &pcmpgtd ($twtmp,$tweak); # broadcast upper bits + &pxor ($tweak,$twres); + } + &pshufd ($inout5,$twtmp,0x13); + &movdqa (&QWP(16*$i++,"esp"),$tweak); + &paddq ($tweak,$tweak); # &psllq($tweak,1); + &$movekey ($rndkey0,&QWP(0,$key_)); + &pand ($inout5,$twmask); # isolate carry and residue + &movups ($inout0,&QWP(0,$inp)); # load input + &pxor ($inout5,$tweak); + + # inline _aesni_encrypt6 prologue and flip xor with tweak and key[0] + &movdqu ($inout1,&QWP(16*1,$inp)); + &xorps ($inout0,$rndkey0); # input^=rndkey[0] + &movdqu ($inout2,&QWP(16*2,$inp)); + &pxor ($inout1,$rndkey0); + &movdqu ($inout3,&QWP(16*3,$inp)); + &pxor ($inout2,$rndkey0); + &movdqu ($inout4,&QWP(16*4,$inp)); + &pxor ($inout3,$rndkey0); + &movdqu ($rndkey1,&QWP(16*5,$inp)); + &pxor ($inout4,$rndkey0); + &lea ($inp,&DWP(16*6,$inp)); + &pxor ($inout0,&QWP(16*0,"esp")); # input^=tweak + &movdqa (&QWP(16*$i,"esp"),$inout5); # save last tweak + &pxor ($inout5,$rndkey1); + + &$movekey ($rndkey1,&QWP(16,$key_)); + &lea ($key,&DWP(32,$key_)); + &pxor ($inout1,&QWP(16*1,"esp")); + &aesenc ($inout0,$rndkey1); + &pxor ($inout2,&QWP(16*2,"esp")); + &aesenc ($inout1,$rndkey1); + &pxor ($inout3,&QWP(16*3,"esp")); + &dec ($rounds); + &aesenc ($inout2,$rndkey1); + &pxor ($inout4,&QWP(16*4,"esp")); + &aesenc ($inout3,$rndkey1); + &pxor ($inout5,$rndkey0); + &aesenc ($inout4,$rndkey1); + &$movekey ($rndkey0,&QWP(0,$key)); + &aesenc ($inout5,$rndkey1); + &call (&label("_aesni_encrypt6_enter")); + + &movdqa ($tweak,&QWP(16*5,"esp")); # last tweak + &pxor ($twtmp,$twtmp); + &xorps ($inout0,&QWP(16*0,"esp")); # output^=tweak + &pcmpgtd ($twtmp,$tweak); # broadcast upper bits + &xorps ($inout1,&QWP(16*1,"esp")); + &movups (&QWP(16*0,$out),$inout0); # write output + &xorps ($inout2,&QWP(16*2,"esp")); + &movups (&QWP(16*1,$out),$inout1); + &xorps ($inout3,&QWP(16*3,"esp")); + &movups (&QWP(16*2,$out),$inout2); + &xorps ($inout4,&QWP(16*4,"esp")); + &movups (&QWP(16*3,$out),$inout3); + &xorps ($inout5,$tweak); + &movups (&QWP(16*4,$out),$inout4); + &pshufd ($twres,$twtmp,0x13); + &movups (&QWP(16*5,$out),$inout5); + &lea ($out,&DWP(16*6,$out)); + &movdqa ($twmask,&QWP(16*6,"esp")); # 0x0...010...87 + + &pxor ($twtmp,$twtmp); + &paddq ($tweak,$tweak); # &psllq($tweak,1); + &pand ($twres,$twmask); # isolate carry and residue + &pcmpgtd($twtmp,$tweak); # broadcast upper bits + &mov ($rounds,$rounds_); # restore $rounds + &pxor ($tweak,$twres); + + &sub ($len,16*6); + &jnc (&label("xts_enc_loop6")); + + &lea ($rounds,&DWP(1,"",$rounds,2)); # restore $rounds + &mov ($key,$key_); # restore $key + &mov ($rounds_,$rounds); + +&set_label("xts_enc_short"); + &add ($len,16*6); + &jz (&label("xts_enc_done6x")); + + &movdqa ($inout3,$tweak); # put aside previous tweak + &cmp ($len,0x20); + &jb (&label("xts_enc_one")); + + &pshufd ($twres,$twtmp,0x13); + &pxor ($twtmp,$twtmp); + &paddq ($tweak,$tweak); # &psllq($tweak,1); + &pand ($twres,$twmask); # isolate carry and residue + &pcmpgtd($twtmp,$tweak); # broadcast upper bits + &pxor ($tweak,$twres); + &je (&label("xts_enc_two")); + + &pshufd ($twres,$twtmp,0x13); + &pxor ($twtmp,$twtmp); + &movdqa ($inout4,$tweak); # put aside previous tweak + &paddq ($tweak,$tweak); # &psllq($tweak,1); + &pand ($twres,$twmask); # isolate carry and residue + &pcmpgtd($twtmp,$tweak); # broadcast upper bits + &pxor ($tweak,$twres); + &cmp ($len,0x40); + &jb (&label("xts_enc_three")); + + &pshufd ($twres,$twtmp,0x13); + &pxor ($twtmp,$twtmp); + &movdqa ($inout5,$tweak); # put aside previous tweak + &paddq ($tweak,$tweak); # &psllq($tweak,1); + &pand ($twres,$twmask); # isolate carry and residue + &pcmpgtd($twtmp,$tweak); # broadcast upper bits + &pxor ($tweak,$twres); + &movdqa (&QWP(16*0,"esp"),$inout3); + &movdqa (&QWP(16*1,"esp"),$inout4); + &je (&label("xts_enc_four")); + + &movdqa (&QWP(16*2,"esp"),$inout5); + &pshufd ($inout5,$twtmp,0x13); + &movdqa (&QWP(16*3,"esp"),$tweak); + &paddq ($tweak,$tweak); # &psllq($inout0,1); + &pand ($inout5,$twmask); # isolate carry and residue + &pxor ($inout5,$tweak); + + &movdqu ($inout0,&QWP(16*0,$inp)); # load input + &movdqu ($inout1,&QWP(16*1,$inp)); + &movdqu ($inout2,&QWP(16*2,$inp)); + &pxor ($inout0,&QWP(16*0,"esp")); # input^=tweak + &movdqu ($inout3,&QWP(16*3,$inp)); + &pxor ($inout1,&QWP(16*1,"esp")); + &movdqu ($inout4,&QWP(16*4,$inp)); + &pxor ($inout2,&QWP(16*2,"esp")); + &lea ($inp,&DWP(16*5,$inp)); + &pxor ($inout3,&QWP(16*3,"esp")); + &movdqa (&QWP(16*4,"esp"),$inout5); # save last tweak + &pxor ($inout4,$inout5); + + &call ("_aesni_encrypt6"); + + &movaps ($tweak,&QWP(16*4,"esp")); # last tweak + &xorps ($inout0,&QWP(16*0,"esp")); # output^=tweak + &xorps ($inout1,&QWP(16*1,"esp")); + &xorps ($inout2,&QWP(16*2,"esp")); + &movups (&QWP(16*0,$out),$inout0); # write output + &xorps ($inout3,&QWP(16*3,"esp")); + &movups (&QWP(16*1,$out),$inout1); + &xorps ($inout4,$tweak); + &movups (&QWP(16*2,$out),$inout2); + &movups (&QWP(16*3,$out),$inout3); + &movups (&QWP(16*4,$out),$inout4); + &lea ($out,&DWP(16*5,$out)); + &jmp (&label("xts_enc_done")); + +&set_label("xts_enc_one",16); + &movups ($inout0,&QWP(16*0,$inp)); # load input + &lea ($inp,&DWP(16*1,$inp)); + &xorps ($inout0,$inout3); # input^=tweak + if ($inline) + { &aesni_inline_generate1("enc"); } + else + { &call ("_aesni_encrypt1"); } + &xorps ($inout0,$inout3); # output^=tweak + &movups (&QWP(16*0,$out),$inout0); # write output + &lea ($out,&DWP(16*1,$out)); + + &movdqa ($tweak,$inout3); # last tweak + &jmp (&label("xts_enc_done")); + +&set_label("xts_enc_two",16); + &movaps ($inout4,$tweak); # put aside last tweak + + &movups ($inout0,&QWP(16*0,$inp)); # load input + &movups ($inout1,&QWP(16*1,$inp)); + &lea ($inp,&DWP(16*2,$inp)); + &xorps ($inout0,$inout3); # input^=tweak + &xorps ($inout1,$inout4); + &xorps ($inout2,$inout2); + + &call ("_aesni_encrypt3"); + + &xorps ($inout0,$inout3); # output^=tweak + &xorps ($inout1,$inout4); + &movups (&QWP(16*0,$out),$inout0); # write output + &movups (&QWP(16*1,$out),$inout1); + &lea ($out,&DWP(16*2,$out)); + + &movdqa ($tweak,$inout4); # last tweak + &jmp (&label("xts_enc_done")); + +&set_label("xts_enc_three",16); + &movaps ($inout5,$tweak); # put aside last tweak + &movups ($inout0,&QWP(16*0,$inp)); # load input + &movups ($inout1,&QWP(16*1,$inp)); + &movups ($inout2,&QWP(16*2,$inp)); + &lea ($inp,&DWP(16*3,$inp)); + &xorps ($inout0,$inout3); # input^=tweak + &xorps ($inout1,$inout4); + &xorps ($inout2,$inout5); + + &call ("_aesni_encrypt3"); + + &xorps ($inout0,$inout3); # output^=tweak + &xorps ($inout1,$inout4); + &xorps ($inout2,$inout5); + &movups (&QWP(16*0,$out),$inout0); # write output + &movups (&QWP(16*1,$out),$inout1); + &movups (&QWP(16*2,$out),$inout2); + &lea ($out,&DWP(16*3,$out)); + + &movdqa ($tweak,$inout5); # last tweak + &jmp (&label("xts_enc_done")); + +&set_label("xts_enc_four",16); + &movaps ($inout4,$tweak); # put aside last tweak + + &movups ($inout0,&QWP(16*0,$inp)); # load input + &movups ($inout1,&QWP(16*1,$inp)); + &movups ($inout2,&QWP(16*2,$inp)); + &xorps ($inout0,&QWP(16*0,"esp")); # input^=tweak + &movups ($inout3,&QWP(16*3,$inp)); + &lea ($inp,&DWP(16*4,$inp)); + &xorps ($inout1,&QWP(16*1,"esp")); + &xorps ($inout2,$inout5); + &xorps ($inout3,$inout4); + + &call ("_aesni_encrypt4"); + + &xorps ($inout0,&QWP(16*0,"esp")); # output^=tweak + &xorps ($inout1,&QWP(16*1,"esp")); + &xorps ($inout2,$inout5); + &movups (&QWP(16*0,$out),$inout0); # write output + &xorps ($inout3,$inout4); + &movups (&QWP(16*1,$out),$inout1); + &movups (&QWP(16*2,$out),$inout2); + &movups (&QWP(16*3,$out),$inout3); + &lea ($out,&DWP(16*4,$out)); + + &movdqa ($tweak,$inout4); # last tweak + &jmp (&label("xts_enc_done")); + +&set_label("xts_enc_done6x",16); # $tweak is pre-calculated + &mov ($len,&DWP(16*7+0,"esp")); # restore original $len + &and ($len,15); + &jz (&label("xts_enc_ret")); + &movdqa ($inout3,$tweak); + &mov (&DWP(16*7+0,"esp"),$len); # save $len%16 + &jmp (&label("xts_enc_steal")); + +&set_label("xts_enc_done",16); + &mov ($len,&DWP(16*7+0,"esp")); # restore original $len + &pxor ($twtmp,$twtmp); + &and ($len,15); + &jz (&label("xts_enc_ret")); + + &pcmpgtd($twtmp,$tweak); # broadcast upper bits + &mov (&DWP(16*7+0,"esp"),$len); # save $len%16 + &pshufd ($inout3,$twtmp,0x13); + &paddq ($tweak,$tweak); # &psllq($tweak,1); + &pand ($inout3,&QWP(16*6,"esp")); # isolate carry and residue + &pxor ($inout3,$tweak); + +&set_label("xts_enc_steal"); + &movz ($rounds,&BP(0,$inp)); + &movz ($key,&BP(-16,$out)); + &lea ($inp,&DWP(1,$inp)); + &mov (&BP(-16,$out),&LB($rounds)); + &mov (&BP(0,$out),&LB($key)); + &lea ($out,&DWP(1,$out)); + &sub ($len,1); + &jnz (&label("xts_enc_steal")); + + &sub ($out,&DWP(16*7+0,"esp")); # rewind $out + &mov ($key,$key_); # restore $key + &mov ($rounds,$rounds_); # restore $rounds + + &movups ($inout0,&QWP(-16,$out)); # load input + &xorps ($inout0,$inout3); # input^=tweak + if ($inline) + { &aesni_inline_generate1("enc"); } + else + { &call ("_aesni_encrypt1"); } + &xorps ($inout0,$inout3); # output^=tweak + &movups (&QWP(-16,$out),$inout0); # write output + +&set_label("xts_enc_ret"); + &mov ("esp",&DWP(16*7+4,"esp")); # restore %esp +&function_end("aesni_xts_encrypt"); + +&function_begin("aesni_xts_decrypt"); + &mov ($key,&wparam(4)); # key2 + &mov ($inp,&wparam(5)); # clear-text tweak + + &mov ($rounds,&DWP(240,$key)); # key2->rounds + &movups ($inout0,&QWP(0,$inp)); + if ($inline) + { &aesni_inline_generate1("enc"); } + else + { &call ("_aesni_encrypt1"); } + + &mov ($inp,&wparam(0)); + &mov ($out,&wparam(1)); + &mov ($len,&wparam(2)); + &mov ($key,&wparam(3)); # key1 + + &mov ($key_,"esp"); + &sub ("esp",16*7+8); + &and ("esp",-16); # align stack + + &xor ($rounds_,$rounds_); # if(len%16) len-=16; + &test ($len,15); + &setnz (&LB($rounds_)); + &shl ($rounds_,4); + &sub ($len,$rounds_); + + &mov (&DWP(16*6+0,"esp"),0x87); # compose the magic constant + &mov (&DWP(16*6+4,"esp"),0); + &mov (&DWP(16*6+8,"esp"),1); + &mov (&DWP(16*6+12,"esp"),0); + &mov (&DWP(16*7+0,"esp"),$len); # save original $len + &mov (&DWP(16*7+4,"esp"),$key_); # save original %esp + + &mov ($rounds,&DWP(240,$key)); # key1->rounds + &mov ($key_,$key); # backup $key + &mov ($rounds_,$rounds); # backup $rounds + + &movdqa ($tweak,$inout0); + &pxor ($twtmp,$twtmp); + &movdqa ($twmask,&QWP(6*16,"esp")); # 0x0...010...87 + &pcmpgtd($twtmp,$tweak); # broadcast upper bits + + &and ($len,-16); + &sub ($len,16*6); + &jc (&label("xts_dec_short")); + + &shr ($rounds,1); + &mov ($rounds_,$rounds); + &jmp (&label("xts_dec_loop6")); + +&set_label("xts_dec_loop6",16); + for ($i=0;$i<4;$i++) { + &pshufd ($twres,$twtmp,0x13); + &pxor ($twtmp,$twtmp); + &movdqa (&QWP(16*$i,"esp"),$tweak); + &paddq ($tweak,$tweak); # &psllq($tweak,1); + &pand ($twres,$twmask); # isolate carry and residue + &pcmpgtd ($twtmp,$tweak); # broadcast upper bits + &pxor ($tweak,$twres); + } + &pshufd ($inout5,$twtmp,0x13); + &movdqa (&QWP(16*$i++,"esp"),$tweak); + &paddq ($tweak,$tweak); # &psllq($tweak,1); + &$movekey ($rndkey0,&QWP(0,$key_)); + &pand ($inout5,$twmask); # isolate carry and residue + &movups ($inout0,&QWP(0,$inp)); # load input + &pxor ($inout5,$tweak); + + # inline _aesni_encrypt6 prologue and flip xor with tweak and key[0] + &movdqu ($inout1,&QWP(16*1,$inp)); + &xorps ($inout0,$rndkey0); # input^=rndkey[0] + &movdqu ($inout2,&QWP(16*2,$inp)); + &pxor ($inout1,$rndkey0); + &movdqu ($inout3,&QWP(16*3,$inp)); + &pxor ($inout2,$rndkey0); + &movdqu ($inout4,&QWP(16*4,$inp)); + &pxor ($inout3,$rndkey0); + &movdqu ($rndkey1,&QWP(16*5,$inp)); + &pxor ($inout4,$rndkey0); + &lea ($inp,&DWP(16*6,$inp)); + &pxor ($inout0,&QWP(16*0,"esp")); # input^=tweak + &movdqa (&QWP(16*$i,"esp"),$inout5); # save last tweak + &pxor ($inout5,$rndkey1); + + &$movekey ($rndkey1,&QWP(16,$key_)); + &lea ($key,&DWP(32,$key_)); + &pxor ($inout1,&QWP(16*1,"esp")); + &aesdec ($inout0,$rndkey1); + &pxor ($inout2,&QWP(16*2,"esp")); + &aesdec ($inout1,$rndkey1); + &pxor ($inout3,&QWP(16*3,"esp")); + &dec ($rounds); + &aesdec ($inout2,$rndkey1); + &pxor ($inout4,&QWP(16*4,"esp")); + &aesdec ($inout3,$rndkey1); + &pxor ($inout5,$rndkey0); + &aesdec ($inout4,$rndkey1); + &$movekey ($rndkey0,&QWP(0,$key)); + &aesdec ($inout5,$rndkey1); + &call (&label("_aesni_decrypt6_enter")); + + &movdqa ($tweak,&QWP(16*5,"esp")); # last tweak + &pxor ($twtmp,$twtmp); + &xorps ($inout0,&QWP(16*0,"esp")); # output^=tweak + &pcmpgtd ($twtmp,$tweak); # broadcast upper bits + &xorps ($inout1,&QWP(16*1,"esp")); + &movups (&QWP(16*0,$out),$inout0); # write output + &xorps ($inout2,&QWP(16*2,"esp")); + &movups (&QWP(16*1,$out),$inout1); + &xorps ($inout3,&QWP(16*3,"esp")); + &movups (&QWP(16*2,$out),$inout2); + &xorps ($inout4,&QWP(16*4,"esp")); + &movups (&QWP(16*3,$out),$inout3); + &xorps ($inout5,$tweak); + &movups (&QWP(16*4,$out),$inout4); + &pshufd ($twres,$twtmp,0x13); + &movups (&QWP(16*5,$out),$inout5); + &lea ($out,&DWP(16*6,$out)); + &movdqa ($twmask,&QWP(16*6,"esp")); # 0x0...010...87 + + &pxor ($twtmp,$twtmp); + &paddq ($tweak,$tweak); # &psllq($tweak,1); + &pand ($twres,$twmask); # isolate carry and residue + &pcmpgtd($twtmp,$tweak); # broadcast upper bits + &mov ($rounds,$rounds_); # restore $rounds + &pxor ($tweak,$twres); + + &sub ($len,16*6); + &jnc (&label("xts_dec_loop6")); + + &lea ($rounds,&DWP(1,"",$rounds,2)); # restore $rounds + &mov ($key,$key_); # restore $key + &mov ($rounds_,$rounds); + +&set_label("xts_dec_short"); + &add ($len,16*6); + &jz (&label("xts_dec_done6x")); + + &movdqa ($inout3,$tweak); # put aside previous tweak + &cmp ($len,0x20); + &jb (&label("xts_dec_one")); + + &pshufd ($twres,$twtmp,0x13); + &pxor ($twtmp,$twtmp); + &paddq ($tweak,$tweak); # &psllq($tweak,1); + &pand ($twres,$twmask); # isolate carry and residue + &pcmpgtd($twtmp,$tweak); # broadcast upper bits + &pxor ($tweak,$twres); + &je (&label("xts_dec_two")); + + &pshufd ($twres,$twtmp,0x13); + &pxor ($twtmp,$twtmp); + &movdqa ($inout4,$tweak); # put aside previous tweak + &paddq ($tweak,$tweak); # &psllq($tweak,1); + &pand ($twres,$twmask); # isolate carry and residue + &pcmpgtd($twtmp,$tweak); # broadcast upper bits + &pxor ($tweak,$twres); + &cmp ($len,0x40); + &jb (&label("xts_dec_three")); + + &pshufd ($twres,$twtmp,0x13); + &pxor ($twtmp,$twtmp); + &movdqa ($inout5,$tweak); # put aside previous tweak + &paddq ($tweak,$tweak); # &psllq($tweak,1); + &pand ($twres,$twmask); # isolate carry and residue + &pcmpgtd($twtmp,$tweak); # broadcast upper bits + &pxor ($tweak,$twres); + &movdqa (&QWP(16*0,"esp"),$inout3); + &movdqa (&QWP(16*1,"esp"),$inout4); + &je (&label("xts_dec_four")); + + &movdqa (&QWP(16*2,"esp"),$inout5); + &pshufd ($inout5,$twtmp,0x13); + &movdqa (&QWP(16*3,"esp"),$tweak); + &paddq ($tweak,$tweak); # &psllq($inout0,1); + &pand ($inout5,$twmask); # isolate carry and residue + &pxor ($inout5,$tweak); + + &movdqu ($inout0,&QWP(16*0,$inp)); # load input + &movdqu ($inout1,&QWP(16*1,$inp)); + &movdqu ($inout2,&QWP(16*2,$inp)); + &pxor ($inout0,&QWP(16*0,"esp")); # input^=tweak + &movdqu ($inout3,&QWP(16*3,$inp)); + &pxor ($inout1,&QWP(16*1,"esp")); + &movdqu ($inout4,&QWP(16*4,$inp)); + &pxor ($inout2,&QWP(16*2,"esp")); + &lea ($inp,&DWP(16*5,$inp)); + &pxor ($inout3,&QWP(16*3,"esp")); + &movdqa (&QWP(16*4,"esp"),$inout5); # save last tweak + &pxor ($inout4,$inout5); + + &call ("_aesni_decrypt6"); + + &movaps ($tweak,&QWP(16*4,"esp")); # last tweak + &xorps ($inout0,&QWP(16*0,"esp")); # output^=tweak + &xorps ($inout1,&QWP(16*1,"esp")); + &xorps ($inout2,&QWP(16*2,"esp")); + &movups (&QWP(16*0,$out),$inout0); # write output + &xorps ($inout3,&QWP(16*3,"esp")); + &movups (&QWP(16*1,$out),$inout1); + &xorps ($inout4,$tweak); + &movups (&QWP(16*2,$out),$inout2); + &movups (&QWP(16*3,$out),$inout3); + &movups (&QWP(16*4,$out),$inout4); + &lea ($out,&DWP(16*5,$out)); + &jmp (&label("xts_dec_done")); + +&set_label("xts_dec_one",16); + &movups ($inout0,&QWP(16*0,$inp)); # load input + &lea ($inp,&DWP(16*1,$inp)); + &xorps ($inout0,$inout3); # input^=tweak + if ($inline) + { &aesni_inline_generate1("dec"); } + else + { &call ("_aesni_decrypt1"); } + &xorps ($inout0,$inout3); # output^=tweak + &movups (&QWP(16*0,$out),$inout0); # write output + &lea ($out,&DWP(16*1,$out)); + + &movdqa ($tweak,$inout3); # last tweak + &jmp (&label("xts_dec_done")); + +&set_label("xts_dec_two",16); + &movaps ($inout4,$tweak); # put aside last tweak + + &movups ($inout0,&QWP(16*0,$inp)); # load input + &movups ($inout1,&QWP(16*1,$inp)); + &lea ($inp,&DWP(16*2,$inp)); + &xorps ($inout0,$inout3); # input^=tweak + &xorps ($inout1,$inout4); + + &call ("_aesni_decrypt3"); + + &xorps ($inout0,$inout3); # output^=tweak + &xorps ($inout1,$inout4); + &movups (&QWP(16*0,$out),$inout0); # write output + &movups (&QWP(16*1,$out),$inout1); + &lea ($out,&DWP(16*2,$out)); + + &movdqa ($tweak,$inout4); # last tweak + &jmp (&label("xts_dec_done")); + +&set_label("xts_dec_three",16); + &movaps ($inout5,$tweak); # put aside last tweak + &movups ($inout0,&QWP(16*0,$inp)); # load input + &movups ($inout1,&QWP(16*1,$inp)); + &movups ($inout2,&QWP(16*2,$inp)); + &lea ($inp,&DWP(16*3,$inp)); + &xorps ($inout0,$inout3); # input^=tweak + &xorps ($inout1,$inout4); + &xorps ($inout2,$inout5); + + &call ("_aesni_decrypt3"); + + &xorps ($inout0,$inout3); # output^=tweak + &xorps ($inout1,$inout4); + &xorps ($inout2,$inout5); + &movups (&QWP(16*0,$out),$inout0); # write output + &movups (&QWP(16*1,$out),$inout1); + &movups (&QWP(16*2,$out),$inout2); + &lea ($out,&DWP(16*3,$out)); + + &movdqa ($tweak,$inout5); # last tweak + &jmp (&label("xts_dec_done")); + +&set_label("xts_dec_four",16); + &movaps ($inout4,$tweak); # put aside last tweak + + &movups ($inout0,&QWP(16*0,$inp)); # load input + &movups ($inout1,&QWP(16*1,$inp)); + &movups ($inout2,&QWP(16*2,$inp)); + &xorps ($inout0,&QWP(16*0,"esp")); # input^=tweak + &movups ($inout3,&QWP(16*3,$inp)); + &lea ($inp,&DWP(16*4,$inp)); + &xorps ($inout1,&QWP(16*1,"esp")); + &xorps ($inout2,$inout5); + &xorps ($inout3,$inout4); + + &call ("_aesni_decrypt4"); + + &xorps ($inout0,&QWP(16*0,"esp")); # output^=tweak + &xorps ($inout1,&QWP(16*1,"esp")); + &xorps ($inout2,$inout5); + &movups (&QWP(16*0,$out),$inout0); # write output + &xorps ($inout3,$inout4); + &movups (&QWP(16*1,$out),$inout1); + &movups (&QWP(16*2,$out),$inout2); + &movups (&QWP(16*3,$out),$inout3); + &lea ($out,&DWP(16*4,$out)); + + &movdqa ($tweak,$inout4); # last tweak + &jmp (&label("xts_dec_done")); + +&set_label("xts_dec_done6x",16); # $tweak is pre-calculated + &mov ($len,&DWP(16*7+0,"esp")); # restore original $len + &and ($len,15); + &jz (&label("xts_dec_ret")); + &mov (&DWP(16*7+0,"esp"),$len); # save $len%16 + &jmp (&label("xts_dec_only_one_more")); + +&set_label("xts_dec_done",16); + &mov ($len,&DWP(16*7+0,"esp")); # restore original $len + &pxor ($twtmp,$twtmp); + &and ($len,15); + &jz (&label("xts_dec_ret")); + + &pcmpgtd($twtmp,$tweak); # broadcast upper bits + &mov (&DWP(16*7+0,"esp"),$len); # save $len%16 + &pshufd ($twres,$twtmp,0x13); + &pxor ($twtmp,$twtmp); + &movdqa ($twmask,&QWP(16*6,"esp")); + &paddq ($tweak,$tweak); # &psllq($tweak,1); + &pand ($twres,$twmask); # isolate carry and residue + &pcmpgtd($twtmp,$tweak); # broadcast upper bits + &pxor ($tweak,$twres); + +&set_label("xts_dec_only_one_more"); + &pshufd ($inout3,$twtmp,0x13); + &movdqa ($inout4,$tweak); # put aside previous tweak + &paddq ($tweak,$tweak); # &psllq($tweak,1); + &pand ($inout3,$twmask); # isolate carry and residue + &pxor ($inout3,$tweak); + + &mov ($key,$key_); # restore $key + &mov ($rounds,$rounds_); # restore $rounds + + &movups ($inout0,&QWP(0,$inp)); # load input + &xorps ($inout0,$inout3); # input^=tweak + if ($inline) + { &aesni_inline_generate1("dec"); } + else + { &call ("_aesni_decrypt1"); } + &xorps ($inout0,$inout3); # output^=tweak + &movups (&QWP(0,$out),$inout0); # write output + +&set_label("xts_dec_steal"); + &movz ($rounds,&BP(16,$inp)); + &movz ($key,&BP(0,$out)); + &lea ($inp,&DWP(1,$inp)); + &mov (&BP(0,$out),&LB($rounds)); + &mov (&BP(16,$out),&LB($key)); + &lea ($out,&DWP(1,$out)); + &sub ($len,1); + &jnz (&label("xts_dec_steal")); + + &sub ($out,&DWP(16*7+0,"esp")); # rewind $out + &mov ($key,$key_); # restore $key + &mov ($rounds,$rounds_); # restore $rounds + + &movups ($inout0,&QWP(0,$out)); # load input + &xorps ($inout0,$inout4); # input^=tweak + if ($inline) + { &aesni_inline_generate1("dec"); } + else + { &call ("_aesni_decrypt1"); } + &xorps ($inout0,$inout4); # output^=tweak + &movups (&QWP(0,$out),$inout0); # write output + +&set_label("xts_dec_ret"); + &mov ("esp",&DWP(16*7+4,"esp")); # restore %esp +&function_end("aesni_xts_decrypt"); +} +} + +###################################################################### +# void $PREFIX_cbc_encrypt (const void *inp, void *out, +# size_t length, const AES_KEY *key, +# unsigned char *ivp,const int enc); +&function_begin("${PREFIX}_cbc_encrypt"); + &mov ($inp,&wparam(0)); + &mov ($rounds_,"esp"); + &mov ($out,&wparam(1)); + &sub ($rounds_,24); + &mov ($len,&wparam(2)); + &and ($rounds_,-16); + &mov ($key,&wparam(3)); + &mov ($key_,&wparam(4)); + &test ($len,$len); + &jz (&label("cbc_abort")); + + &cmp (&wparam(5),0); + &xchg ($rounds_,"esp"); # alloca + &movups ($ivec,&QWP(0,$key_)); # load IV + &mov ($rounds,&DWP(240,$key)); + &mov ($key_,$key); # backup $key + &mov (&DWP(16,"esp"),$rounds_); # save original %esp + &mov ($rounds_,$rounds); # backup $rounds + &je (&label("cbc_decrypt")); + + &movaps ($inout0,$ivec); + &cmp ($len,16); + &jb (&label("cbc_enc_tail")); + &sub ($len,16); + &jmp (&label("cbc_enc_loop")); + +&set_label("cbc_enc_loop",16); + &movups ($ivec,&QWP(0,$inp)); # input actually + &lea ($inp,&DWP(16,$inp)); + if ($inline) + { &aesni_inline_generate1("enc",$inout0,$ivec); } + else + { &xorps($inout0,$ivec); &call("_aesni_encrypt1"); } + &mov ($rounds,$rounds_); # restore $rounds + &mov ($key,$key_); # restore $key + &movups (&QWP(0,$out),$inout0); # store output + &lea ($out,&DWP(16,$out)); + &sub ($len,16); + &jnc (&label("cbc_enc_loop")); + &add ($len,16); + &jnz (&label("cbc_enc_tail")); + &movaps ($ivec,$inout0); + &jmp (&label("cbc_ret")); + +&set_label("cbc_enc_tail"); + &mov ("ecx",$len); # zaps $rounds + &data_word(0xA4F3F689); # rep movsb + &mov ("ecx",16); # zero tail + &sub ("ecx",$len); + &xor ("eax","eax"); # zaps $len + &data_word(0xAAF3F689); # rep stosb + &lea ($out,&DWP(-16,$out)); # rewind $out by 1 block + &mov ($rounds,$rounds_); # restore $rounds + &mov ($inp,$out); # $inp and $out are the same + &mov ($key,$key_); # restore $key + &jmp (&label("cbc_enc_loop")); +###################################################################### +&set_label("cbc_decrypt",16); + &cmp ($len,0x50); + &jbe (&label("cbc_dec_tail")); + &movaps (&QWP(0,"esp"),$ivec); # save IV + &sub ($len,0x50); + &jmp (&label("cbc_dec_loop6_enter")); + +&set_label("cbc_dec_loop6",16); + &movaps (&QWP(0,"esp"),$rndkey0); # save IV + &movups (&QWP(0,$out),$inout5); + &lea ($out,&DWP(0x10,$out)); +&set_label("cbc_dec_loop6_enter"); + &movdqu ($inout0,&QWP(0,$inp)); + &movdqu ($inout1,&QWP(0x10,$inp)); + &movdqu ($inout2,&QWP(0x20,$inp)); + &movdqu ($inout3,&QWP(0x30,$inp)); + &movdqu ($inout4,&QWP(0x40,$inp)); + &movdqu ($inout5,&QWP(0x50,$inp)); + + &call ("_aesni_decrypt6"); + + &movups ($rndkey1,&QWP(0,$inp)); + &movups ($rndkey0,&QWP(0x10,$inp)); + &xorps ($inout0,&QWP(0,"esp")); # ^=IV + &xorps ($inout1,$rndkey1); + &movups ($rndkey1,&QWP(0x20,$inp)); + &xorps ($inout2,$rndkey0); + &movups ($rndkey0,&QWP(0x30,$inp)); + &xorps ($inout3,$rndkey1); + &movups ($rndkey1,&QWP(0x40,$inp)); + &xorps ($inout4,$rndkey0); + &movups ($rndkey0,&QWP(0x50,$inp)); # IV + &xorps ($inout5,$rndkey1); + &movups (&QWP(0,$out),$inout0); + &movups (&QWP(0x10,$out),$inout1); + &lea ($inp,&DWP(0x60,$inp)); + &movups (&QWP(0x20,$out),$inout2); + &mov ($rounds,$rounds_) # restore $rounds + &movups (&QWP(0x30,$out),$inout3); + &mov ($key,$key_); # restore $key + &movups (&QWP(0x40,$out),$inout4); + &lea ($out,&DWP(0x50,$out)); + &sub ($len,0x60); + &ja (&label("cbc_dec_loop6")); + + &movaps ($inout0,$inout5); + &movaps ($ivec,$rndkey0); + &add ($len,0x50); + &jle (&label("cbc_dec_tail_collected")); + &movups (&QWP(0,$out),$inout0); + &lea ($out,&DWP(0x10,$out)); +&set_label("cbc_dec_tail"); + &movups ($inout0,&QWP(0,$inp)); + &movaps ($in0,$inout0); + &cmp ($len,0x10); + &jbe (&label("cbc_dec_one")); + + &movups ($inout1,&QWP(0x10,$inp)); + &movaps ($in1,$inout1); + &cmp ($len,0x20); + &jbe (&label("cbc_dec_two")); + + &movups ($inout2,&QWP(0x20,$inp)); + &cmp ($len,0x30); + &jbe (&label("cbc_dec_three")); + + &movups ($inout3,&QWP(0x30,$inp)); + &cmp ($len,0x40); + &jbe (&label("cbc_dec_four")); + + &movups ($inout4,&QWP(0x40,$inp)); + &movaps (&QWP(0,"esp"),$ivec); # save IV + &movups ($inout0,&QWP(0,$inp)); + &xorps ($inout5,$inout5); + &call ("_aesni_decrypt6"); + &movups ($rndkey1,&QWP(0,$inp)); + &movups ($rndkey0,&QWP(0x10,$inp)); + &xorps ($inout0,&QWP(0,"esp")); # ^= IV + &xorps ($inout1,$rndkey1); + &movups ($rndkey1,&QWP(0x20,$inp)); + &xorps ($inout2,$rndkey0); + &movups ($rndkey0,&QWP(0x30,$inp)); + &xorps ($inout3,$rndkey1); + &movups ($ivec,&QWP(0x40,$inp)); # IV + &xorps ($inout4,$rndkey0); + &movups (&QWP(0,$out),$inout0); + &movups (&QWP(0x10,$out),$inout1); + &movups (&QWP(0x20,$out),$inout2); + &movups (&QWP(0x30,$out),$inout3); + &lea ($out,&DWP(0x40,$out)); + &movaps ($inout0,$inout4); + &sub ($len,0x50); + &jmp (&label("cbc_dec_tail_collected")); + +&set_label("cbc_dec_one",16); + if ($inline) + { &aesni_inline_generate1("dec"); } + else + { &call ("_aesni_decrypt1"); } + &xorps ($inout0,$ivec); + &movaps ($ivec,$in0); + &sub ($len,0x10); + &jmp (&label("cbc_dec_tail_collected")); + +&set_label("cbc_dec_two",16); + &xorps ($inout2,$inout2); + &call ("_aesni_decrypt3"); + &xorps ($inout0,$ivec); + &xorps ($inout1,$in0); + &movups (&QWP(0,$out),$inout0); + &movaps ($inout0,$inout1); + &lea ($out,&DWP(0x10,$out)); + &movaps ($ivec,$in1); + &sub ($len,0x20); + &jmp (&label("cbc_dec_tail_collected")); + +&set_label("cbc_dec_three",16); + &call ("_aesni_decrypt3"); + &xorps ($inout0,$ivec); + &xorps ($inout1,$in0); + &xorps ($inout2,$in1); + &movups (&QWP(0,$out),$inout0); + &movaps ($inout0,$inout2); + &movups (&QWP(0x10,$out),$inout1); + &lea ($out,&DWP(0x20,$out)); + &movups ($ivec,&QWP(0x20,$inp)); + &sub ($len,0x30); + &jmp (&label("cbc_dec_tail_collected")); + +&set_label("cbc_dec_four",16); + &call ("_aesni_decrypt4"); + &movups ($rndkey1,&QWP(0x10,$inp)); + &movups ($rndkey0,&QWP(0x20,$inp)); + &xorps ($inout0,$ivec); + &movups ($ivec,&QWP(0x30,$inp)); + &xorps ($inout1,$in0); + &movups (&QWP(0,$out),$inout0); + &xorps ($inout2,$rndkey1); + &movups (&QWP(0x10,$out),$inout1); + &xorps ($inout3,$rndkey0); + &movups (&QWP(0x20,$out),$inout2); + &lea ($out,&DWP(0x30,$out)); + &movaps ($inout0,$inout3); + &sub ($len,0x40); + +&set_label("cbc_dec_tail_collected"); + &and ($len,15); + &jnz (&label("cbc_dec_tail_partial")); + &movups (&QWP(0,$out),$inout0); + &jmp (&label("cbc_ret")); + +&set_label("cbc_dec_tail_partial",16); + &movaps (&QWP(0,"esp"),$inout0); + &mov ("ecx",16); + &mov ($inp,"esp"); + &sub ("ecx",$len); + &data_word(0xA4F3F689); # rep movsb + +&set_label("cbc_ret"); + &mov ("esp",&DWP(16,"esp")); # pull original %esp + &mov ($key_,&wparam(4)); + &movups (&QWP(0,$key_),$ivec); # output IV +&set_label("cbc_abort"); +&function_end("${PREFIX}_cbc_encrypt"); + +###################################################################### +# Mechanical port from aesni-x86_64.pl. +# +# _aesni_set_encrypt_key is private interface, +# input: +# "eax" const unsigned char *userKey +# $rounds int bits +# $key AES_KEY *key +# output: +# "eax" return code +# $round rounds + +&function_begin_B("_aesni_set_encrypt_key"); + &test ("eax","eax"); + &jz (&label("bad_pointer")); + &test ($key,$key); + &jz (&label("bad_pointer")); + + &movups ("xmm0",&QWP(0,"eax")); # pull first 128 bits of *userKey + &xorps ("xmm4","xmm4"); # low dword of xmm4 is assumed 0 + &lea ($key,&DWP(16,$key)); + &cmp ($rounds,256); + &je (&label("14rounds")); + &cmp ($rounds,192); + &je (&label("12rounds")); + &cmp ($rounds,128); + &jne (&label("bad_keybits")); + +&set_label("10rounds",16); + &mov ($rounds,9); + &$movekey (&QWP(-16,$key),"xmm0"); # round 0 + &aeskeygenassist("xmm1","xmm0",0x01); # round 1 + &call (&label("key_128_cold")); + &aeskeygenassist("xmm1","xmm0",0x2); # round 2 + &call (&label("key_128")); + &aeskeygenassist("xmm1","xmm0",0x04); # round 3 + &call (&label("key_128")); + &aeskeygenassist("xmm1","xmm0",0x08); # round 4 + &call (&label("key_128")); + &aeskeygenassist("xmm1","xmm0",0x10); # round 5 + &call (&label("key_128")); + &aeskeygenassist("xmm1","xmm0",0x20); # round 6 + &call (&label("key_128")); + &aeskeygenassist("xmm1","xmm0",0x40); # round 7 + &call (&label("key_128")); + &aeskeygenassist("xmm1","xmm0",0x80); # round 8 + &call (&label("key_128")); + &aeskeygenassist("xmm1","xmm0",0x1b); # round 9 + &call (&label("key_128")); + &aeskeygenassist("xmm1","xmm0",0x36); # round 10 + &call (&label("key_128")); + &$movekey (&QWP(0,$key),"xmm0"); + &mov (&DWP(80,$key),$rounds); + &xor ("eax","eax"); + &ret(); + +&set_label("key_128",16); + &$movekey (&QWP(0,$key),"xmm0"); + &lea ($key,&DWP(16,$key)); +&set_label("key_128_cold"); + &shufps ("xmm4","xmm0",0b00010000); + &xorps ("xmm0","xmm4"); + &shufps ("xmm4","xmm0",0b10001100); + &xorps ("xmm0","xmm4"); + &shufps ("xmm1","xmm1",0b11111111); # critical path + &xorps ("xmm0","xmm1"); + &ret(); + +&set_label("12rounds",16); + &movq ("xmm2",&QWP(16,"eax")); # remaining 1/3 of *userKey + &mov ($rounds,11); + &$movekey (&QWP(-16,$key),"xmm0") # round 0 + &aeskeygenassist("xmm1","xmm2",0x01); # round 1,2 + &call (&label("key_192a_cold")); + &aeskeygenassist("xmm1","xmm2",0x02); # round 2,3 + &call (&label("key_192b")); + &aeskeygenassist("xmm1","xmm2",0x04); # round 4,5 + &call (&label("key_192a")); + &aeskeygenassist("xmm1","xmm2",0x08); # round 5,6 + &call (&label("key_192b")); + &aeskeygenassist("xmm1","xmm2",0x10); # round 7,8 + &call (&label("key_192a")); + &aeskeygenassist("xmm1","xmm2",0x20); # round 8,9 + &call (&label("key_192b")); + &aeskeygenassist("xmm1","xmm2",0x40); # round 10,11 + &call (&label("key_192a")); + &aeskeygenassist("xmm1","xmm2",0x80); # round 11,12 + &call (&label("key_192b")); + &$movekey (&QWP(0,$key),"xmm0"); + &mov (&DWP(48,$key),$rounds); + &xor ("eax","eax"); + &ret(); + +&set_label("key_192a",16); + &$movekey (&QWP(0,$key),"xmm0"); + &lea ($key,&DWP(16,$key)); +&set_label("key_192a_cold",16); + &movaps ("xmm5","xmm2"); +&set_label("key_192b_warm"); + &shufps ("xmm4","xmm0",0b00010000); + &movdqa ("xmm3","xmm2"); + &xorps ("xmm0","xmm4"); + &shufps ("xmm4","xmm0",0b10001100); + &pslldq ("xmm3",4); + &xorps ("xmm0","xmm4"); + &pshufd ("xmm1","xmm1",0b01010101); # critical path + &pxor ("xmm2","xmm3"); + &pxor ("xmm0","xmm1"); + &pshufd ("xmm3","xmm0",0b11111111); + &pxor ("xmm2","xmm3"); + &ret(); + +&set_label("key_192b",16); + &movaps ("xmm3","xmm0"); + &shufps ("xmm5","xmm0",0b01000100); + &$movekey (&QWP(0,$key),"xmm5"); + &shufps ("xmm3","xmm2",0b01001110); + &$movekey (&QWP(16,$key),"xmm3"); + &lea ($key,&DWP(32,$key)); + &jmp (&label("key_192b_warm")); + +&set_label("14rounds",16); + &movups ("xmm2",&QWP(16,"eax")); # remaining half of *userKey + &mov ($rounds,13); + &lea ($key,&DWP(16,$key)); + &$movekey (&QWP(-32,$key),"xmm0"); # round 0 + &$movekey (&QWP(-16,$key),"xmm2"); # round 1 + &aeskeygenassist("xmm1","xmm2",0x01); # round 2 + &call (&label("key_256a_cold")); + &aeskeygenassist("xmm1","xmm0",0x01); # round 3 + &call (&label("key_256b")); + &aeskeygenassist("xmm1","xmm2",0x02); # round 4 + &call (&label("key_256a")); + &aeskeygenassist("xmm1","xmm0",0x02); # round 5 + &call (&label("key_256b")); + &aeskeygenassist("xmm1","xmm2",0x04); # round 6 + &call (&label("key_256a")); + &aeskeygenassist("xmm1","xmm0",0x04); # round 7 + &call (&label("key_256b")); + &aeskeygenassist("xmm1","xmm2",0x08); # round 8 + &call (&label("key_256a")); + &aeskeygenassist("xmm1","xmm0",0x08); # round 9 + &call (&label("key_256b")); + &aeskeygenassist("xmm1","xmm2",0x10); # round 10 + &call (&label("key_256a")); + &aeskeygenassist("xmm1","xmm0",0x10); # round 11 + &call (&label("key_256b")); + &aeskeygenassist("xmm1","xmm2",0x20); # round 12 + &call (&label("key_256a")); + &aeskeygenassist("xmm1","xmm0",0x20); # round 13 + &call (&label("key_256b")); + &aeskeygenassist("xmm1","xmm2",0x40); # round 14 + &call (&label("key_256a")); + &$movekey (&QWP(0,$key),"xmm0"); + &mov (&DWP(16,$key),$rounds); + &xor ("eax","eax"); + &ret(); + +&set_label("key_256a",16); + &$movekey (&QWP(0,$key),"xmm2"); + &lea ($key,&DWP(16,$key)); +&set_label("key_256a_cold"); + &shufps ("xmm4","xmm0",0b00010000); + &xorps ("xmm0","xmm4"); + &shufps ("xmm4","xmm0",0b10001100); + &xorps ("xmm0","xmm4"); + &shufps ("xmm1","xmm1",0b11111111); # critical path + &xorps ("xmm0","xmm1"); + &ret(); + +&set_label("key_256b",16); + &$movekey (&QWP(0,$key),"xmm0"); + &lea ($key,&DWP(16,$key)); + + &shufps ("xmm4","xmm2",0b00010000); + &xorps ("xmm2","xmm4"); + &shufps ("xmm4","xmm2",0b10001100); + &xorps ("xmm2","xmm4"); + &shufps ("xmm1","xmm1",0b10101010); # critical path + &xorps ("xmm2","xmm1"); + &ret(); + +&set_label("bad_pointer",4); + &mov ("eax",-1); + &ret (); +&set_label("bad_keybits",4); + &mov ("eax",-2); + &ret (); +&function_end_B("_aesni_set_encrypt_key"); + +# int $PREFIX_set_encrypt_key (const unsigned char *userKey, int bits, +# AES_KEY *key) +&function_begin_B("${PREFIX}_set_encrypt_key"); + &mov ("eax",&wparam(0)); + &mov ($rounds,&wparam(1)); + &mov ($key,&wparam(2)); + &call ("_aesni_set_encrypt_key"); + &ret (); +&function_end_B("${PREFIX}_set_encrypt_key"); + +# int $PREFIX_set_decrypt_key (const unsigned char *userKey, int bits, +# AES_KEY *key) +&function_begin_B("${PREFIX}_set_decrypt_key"); + &mov ("eax",&wparam(0)); + &mov ($rounds,&wparam(1)); + &mov ($key,&wparam(2)); + &call ("_aesni_set_encrypt_key"); + &mov ($key,&wparam(2)); + &shl ($rounds,4) # rounds-1 after _aesni_set_encrypt_key + &test ("eax","eax"); + &jnz (&label("dec_key_ret")); + &lea ("eax",&DWP(16,$key,$rounds)); # end of key schedule + + &$movekey ("xmm0",&QWP(0,$key)); # just swap + &$movekey ("xmm1",&QWP(0,"eax")); + &$movekey (&QWP(0,"eax"),"xmm0"); + &$movekey (&QWP(0,$key),"xmm1"); + &lea ($key,&DWP(16,$key)); + &lea ("eax",&DWP(-16,"eax")); + +&set_label("dec_key_inverse"); + &$movekey ("xmm0",&QWP(0,$key)); # swap and inverse + &$movekey ("xmm1",&QWP(0,"eax")); + &aesimc ("xmm0","xmm0"); + &aesimc ("xmm1","xmm1"); + &lea ($key,&DWP(16,$key)); + &lea ("eax",&DWP(-16,"eax")); + &$movekey (&QWP(16,"eax"),"xmm0"); + &$movekey (&QWP(-16,$key),"xmm1"); + &cmp ("eax",$key); + &ja (&label("dec_key_inverse")); + + &$movekey ("xmm0",&QWP(0,$key)); # inverse middle + &aesimc ("xmm0","xmm0"); + &$movekey (&QWP(0,$key),"xmm0"); + + &xor ("eax","eax"); # return success +&set_label("dec_key_ret"); + &ret (); +&function_end_B("${PREFIX}_set_decrypt_key"); +&asciz("AES for Intel AES-NI, CRYPTOGAMS by "); + +&asm_finish(); diff --git a/crypto/openssl/crypto/aes/asm/aesni-x86_64.pl b/crypto/openssl/crypto/aes/asm/aesni-x86_64.pl new file mode 100644 index 0000000000..499f3b3f42 --- /dev/null +++ b/crypto/openssl/crypto/aes/asm/aesni-x86_64.pl @@ -0,0 +1,3068 @@ +#!/usr/bin/env perl +# +# ==================================================================== +# Written by Andy Polyakov for the OpenSSL +# project. The module is, however, dual licensed under OpenSSL and +# CRYPTOGAMS licenses depending on where you obtain it. For further +# details see http://www.openssl.org/~appro/cryptogams/. +# ==================================================================== +# +# This module implements support for Intel AES-NI extension. In +# OpenSSL context it's used with Intel engine, but can also be used as +# drop-in replacement for crypto/aes/asm/aes-x86_64.pl [see below for +# details]. +# +# Performance. +# +# Given aes(enc|dec) instructions' latency asymptotic performance for +# non-parallelizable modes such as CBC encrypt is 3.75 cycles per byte +# processed with 128-bit key. And given their throughput asymptotic +# performance for parallelizable modes is 1.25 cycles per byte. Being +# asymptotic limit it's not something you commonly achieve in reality, +# but how close does one get? Below are results collected for +# different modes and block sized. Pairs of numbers are for en-/ +# decryption. +# +# 16-byte 64-byte 256-byte 1-KB 8-KB +# ECB 4.25/4.25 1.38/1.38 1.28/1.28 1.26/1.26 1.26/1.26 +# CTR 5.42/5.42 1.92/1.92 1.44/1.44 1.28/1.28 1.26/1.26 +# CBC 4.38/4.43 4.15/1.43 4.07/1.32 4.07/1.29 4.06/1.28 +# CCM 5.66/9.42 4.42/5.41 4.16/4.40 4.09/4.15 4.06/4.07 +# OFB 5.42/5.42 4.64/4.64 4.44/4.44 4.39/4.39 4.38/4.38 +# CFB 5.73/5.85 5.56/5.62 5.48/5.56 5.47/5.55 5.47/5.55 +# +# ECB, CTR, CBC and CCM results are free from EVP overhead. This means +# that otherwise used 'openssl speed -evp aes-128-??? -engine aesni +# [-decrypt]' will exhibit 10-15% worse results for smaller blocks. +# The results were collected with specially crafted speed.c benchmark +# in order to compare them with results reported in "Intel Advanced +# Encryption Standard (AES) New Instruction Set" White Paper Revision +# 3.0 dated May 2010. All above results are consistently better. This +# module also provides better performance for block sizes smaller than +# 128 bytes in points *not* represented in the above table. +# +# Looking at the results for 8-KB buffer. +# +# CFB and OFB results are far from the limit, because implementation +# uses "generic" CRYPTO_[c|o]fb128_encrypt interfaces relying on +# single-block aesni_encrypt, which is not the most optimal way to go. +# CBC encrypt result is unexpectedly high and there is no documented +# explanation for it. Seemingly there is a small penalty for feeding +# the result back to AES unit the way it's done in CBC mode. There is +# nothing one can do and the result appears optimal. CCM result is +# identical to CBC, because CBC-MAC is essentially CBC encrypt without +# saving output. CCM CTR "stays invisible," because it's neatly +# interleaved wih CBC-MAC. This provides ~30% improvement over +# "straghtforward" CCM implementation with CTR and CBC-MAC performed +# disjointly. Parallelizable modes practically achieve the theoretical +# limit. +# +# Looking at how results vary with buffer size. +# +# Curves are practically saturated at 1-KB buffer size. In most cases +# "256-byte" performance is >95%, and "64-byte" is ~90% of "8-KB" one. +# CTR curve doesn't follow this pattern and is "slowest" changing one +# with "256-byte" result being 87% of "8-KB." This is because overhead +# in CTR mode is most computationally intensive. Small-block CCM +# decrypt is slower than encrypt, because first CTR and last CBC-MAC +# iterations can't be interleaved. +# +# Results for 192- and 256-bit keys. +# +# EVP-free results were observed to scale perfectly with number of +# rounds for larger block sizes, i.e. 192-bit result being 10/12 times +# lower and 256-bit one - 10/14. Well, in CBC encrypt case differences +# are a tad smaller, because the above mentioned penalty biases all +# results by same constant value. In similar way function call +# overhead affects small-block performance, as well as OFB and CFB +# results. Differences are not large, most common coefficients are +# 10/11.7 and 10/13.4 (as opposite to 10/12.0 and 10/14.0), but one +# observe even 10/11.2 and 10/12.4 (CTR, OFB, CFB)... + +# January 2011 +# +# While Westmere processor features 6 cycles latency for aes[enc|dec] +# instructions, which can be scheduled every second cycle, Sandy +# Bridge spends 8 cycles per instruction, but it can schedule them +# every cycle. This means that code targeting Westmere would perform +# suboptimally on Sandy Bridge. Therefore this update. +# +# In addition, non-parallelizable CBC encrypt (as well as CCM) is +# optimized. Relative improvement might appear modest, 8% on Westmere, +# but in absolute terms it's 3.77 cycles per byte encrypted with +# 128-bit key on Westmere, and 5.07 - on Sandy Bridge. These numbers +# should be compared to asymptotic limits of 3.75 for Westmere and +# 5.00 for Sandy Bridge. Actually, the fact that they get this close +# to asymptotic limits is quite amazing. Indeed, the limit is +# calculated as latency times number of rounds, 10 for 128-bit key, +# and divided by 16, the number of bytes in block, or in other words +# it accounts *solely* for aesenc instructions. But there are extra +# instructions, and numbers so close to the asymptotic limits mean +# that it's as if it takes as little as *one* additional cycle to +# execute all of them. How is it possible? It is possible thanks to +# out-of-order execution logic, which manages to overlap post- +# processing of previous block, things like saving the output, with +# actual encryption of current block, as well as pre-processing of +# current block, things like fetching input and xor-ing it with +# 0-round element of the key schedule, with actual encryption of +# previous block. Keep this in mind... +# +# For parallelizable modes, such as ECB, CBC decrypt, CTR, higher +# performance is achieved by interleaving instructions working on +# independent blocks. In which case asymptotic limit for such modes +# can be obtained by dividing above mentioned numbers by AES +# instructions' interleave factor. Westmere can execute at most 3 +# instructions at a time, meaning that optimal interleave factor is 3, +# and that's where the "magic" number of 1.25 come from. "Optimal +# interleave factor" means that increase of interleave factor does +# not improve performance. The formula has proven to reflect reality +# pretty well on Westmere... Sandy Bridge on the other hand can +# execute up to 8 AES instructions at a time, so how does varying +# interleave factor affect the performance? Here is table for ECB +# (numbers are cycles per byte processed with 128-bit key): +# +# instruction interleave factor 3x 6x 8x +# theoretical asymptotic limit 1.67 0.83 0.625 +# measured performance for 8KB block 1.05 0.86 0.84 +# +# "as if" interleave factor 4.7x 5.8x 6.0x +# +# Further data for other parallelizable modes: +# +# CBC decrypt 1.16 0.93 0.93 +# CTR 1.14 0.91 n/a +# +# Well, given 3x column it's probably inappropriate to call the limit +# asymptotic, if it can be surpassed, isn't it? What happens there? +# Rewind to CBC paragraph for the answer. Yes, out-of-order execution +# magic is responsible for this. Processor overlaps not only the +# additional instructions with AES ones, but even AES instuctions +# processing adjacent triplets of independent blocks. In the 6x case +# additional instructions still claim disproportionally small amount +# of additional cycles, but in 8x case number of instructions must be +# a tad too high for out-of-order logic to cope with, and AES unit +# remains underutilized... As you can see 8x interleave is hardly +# justifiable, so there no need to feel bad that 32-bit aesni-x86.pl +# utilizies 6x interleave because of limited register bank capacity. +# +# Higher interleave factors do have negative impact on Westmere +# performance. While for ECB mode it's negligible ~1.5%, other +# parallelizables perform ~5% worse, which is outweighed by ~25% +# improvement on Sandy Bridge. To balance regression on Westmere +# CTR mode was implemented with 6x aesenc interleave factor. + +# April 2011 +# +# Add aesni_xts_[en|de]crypt. Westmere spends 1.33 cycles processing +# one byte out of 8KB with 128-bit key, Sandy Bridge - 0.97. Just like +# in CTR mode AES instruction interleave factor was chosen to be 6x. + +$PREFIX="aesni"; # if $PREFIX is set to "AES", the script + # generates drop-in replacement for + # crypto/aes/asm/aes-x86_64.pl:-) + +$flavour = shift; +$output = shift; +if ($flavour =~ /\./) { $output = $flavour; undef $flavour; } + +$win64=0; $win64=1 if ($flavour =~ /[nm]asm|mingw64/ || $output =~ /\.asm$/); + +$0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1; +( $xlate="${dir}x86_64-xlate.pl" and -f $xlate ) or +( $xlate="${dir}../../perlasm/x86_64-xlate.pl" and -f $xlate) or +die "can't locate x86_64-xlate.pl"; + +open STDOUT,"| $^X $xlate $flavour $output"; + +$movkey = $PREFIX eq "aesni" ? "movups" : "movups"; +@_4args=$win64? ("%rcx","%rdx","%r8", "%r9") : # Win64 order + ("%rdi","%rsi","%rdx","%rcx"); # Unix order + +$code=".text\n"; + +$rounds="%eax"; # input to and changed by aesni_[en|de]cryptN !!! +# this is natural Unix argument order for public $PREFIX_[ecb|cbc]_encrypt ... +$inp="%rdi"; +$out="%rsi"; +$len="%rdx"; +$key="%rcx"; # input to and changed by aesni_[en|de]cryptN !!! +$ivp="%r8"; # cbc, ctr, ... + +$rnds_="%r10d"; # backup copy for $rounds +$key_="%r11"; # backup copy for $key + +# %xmm register layout +$rndkey0="%xmm0"; $rndkey1="%xmm1"; +$inout0="%xmm2"; $inout1="%xmm3"; +$inout2="%xmm4"; $inout3="%xmm5"; +$inout4="%xmm6"; $inout5="%xmm7"; +$inout6="%xmm8"; $inout7="%xmm9"; + +$in2="%xmm6"; $in1="%xmm7"; # used in CBC decrypt, CTR, ... +$in0="%xmm8"; $iv="%xmm9"; + +# Inline version of internal aesni_[en|de]crypt1. +# +# Why folded loop? Because aes[enc|dec] is slow enough to accommodate +# cycles which take care of loop variables... +{ my $sn; +sub aesni_generate1 { +my ($p,$key,$rounds,$inout,$ivec)=@_; $inout=$inout0 if (!defined($inout)); +++$sn; +$code.=<<___; + $movkey ($key),$rndkey0 + $movkey 16($key),$rndkey1 +___ +$code.=<<___ if (defined($ivec)); + xorps $rndkey0,$ivec + lea 32($key),$key + xorps $ivec,$inout +___ +$code.=<<___ if (!defined($ivec)); + lea 32($key),$key + xorps $rndkey0,$inout +___ +$code.=<<___; +.Loop_${p}1_$sn: + aes${p} $rndkey1,$inout + dec $rounds + $movkey ($key),$rndkey1 + lea 16($key),$key + jnz .Loop_${p}1_$sn # loop body is 16 bytes + aes${p}last $rndkey1,$inout +___ +}} +# void $PREFIX_[en|de]crypt (const void *inp,void *out,const AES_KEY *key); +# +{ my ($inp,$out,$key) = @_4args; + +$code.=<<___; +.globl ${PREFIX}_encrypt +.type ${PREFIX}_encrypt,\@abi-omnipotent +.align 16 +${PREFIX}_encrypt: + movups ($inp),$inout0 # load input + mov 240($key),$rounds # key->rounds +___ + &aesni_generate1("enc",$key,$rounds); +$code.=<<___; + movups $inout0,($out) # output + ret +.size ${PREFIX}_encrypt,.-${PREFIX}_encrypt + +.globl ${PREFIX}_decrypt +.type ${PREFIX}_decrypt,\@abi-omnipotent +.align 16 +${PREFIX}_decrypt: + movups ($inp),$inout0 # load input + mov 240($key),$rounds # key->rounds +___ + &aesni_generate1("dec",$key,$rounds); +$code.=<<___; + movups $inout0,($out) # output + ret +.size ${PREFIX}_decrypt, .-${PREFIX}_decrypt +___ +} + +# _aesni_[en|de]cryptN are private interfaces, N denotes interleave +# factor. Why 3x subroutine were originally used in loops? Even though +# aes[enc|dec] latency was originally 6, it could be scheduled only +# every *2nd* cycle. Thus 3x interleave was the one providing optimal +# utilization, i.e. when subroutine's throughput is virtually same as +# of non-interleaved subroutine [for number of input blocks up to 3]. +# This is why it makes no sense to implement 2x subroutine. +# aes[enc|dec] latency in next processor generation is 8, but the +# instructions can be scheduled every cycle. Optimal interleave for +# new processor is therefore 8x... +sub aesni_generate3 { +my $dir=shift; +# As already mentioned it takes in $key and $rounds, which are *not* +# preserved. $inout[0-2] is cipher/clear text... +$code.=<<___; +.type _aesni_${dir}rypt3,\@abi-omnipotent +.align 16 +_aesni_${dir}rypt3: + $movkey ($key),$rndkey0 + shr \$1,$rounds + $movkey 16($key),$rndkey1 + lea 32($key),$key + xorps $rndkey0,$inout0 + xorps $rndkey0,$inout1 + xorps $rndkey0,$inout2 + $movkey ($key),$rndkey0 + +.L${dir}_loop3: + aes${dir} $rndkey1,$inout0 + aes${dir} $rndkey1,$inout1 + dec $rounds + aes${dir} $rndkey1,$inout2 + $movkey 16($key),$rndkey1 + aes${dir} $rndkey0,$inout0 + aes${dir} $rndkey0,$inout1 + lea 32($key),$key + aes${dir} $rndkey0,$inout2 + $movkey ($key),$rndkey0 + jnz .L${dir}_loop3 + + aes${dir} $rndkey1,$inout0 + aes${dir} $rndkey1,$inout1 + aes${dir} $rndkey1,$inout2 + aes${dir}last $rndkey0,$inout0 + aes${dir}last $rndkey0,$inout1 + aes${dir}last $rndkey0,$inout2 + ret +.size _aesni_${dir}rypt3,.-_aesni_${dir}rypt3 +___ +} +# 4x interleave is implemented to improve small block performance, +# most notably [and naturally] 4 block by ~30%. One can argue that one +# should have implemented 5x as well, but improvement would be <20%, +# so it's not worth it... +sub aesni_generate4 { +my $dir=shift; +# As already mentioned it takes in $key and $rounds, which are *not* +# preserved. $inout[0-3] is cipher/clear text... +$code.=<<___; +.type _aesni_${dir}rypt4,\@abi-omnipotent +.align 16 +_aesni_${dir}rypt4: + $movkey ($key),$rndkey0 + shr \$1,$rounds + $movkey 16($key),$rndkey1 + lea 32($key),$key + xorps $rndkey0,$inout0 + xorps $rndkey0,$inout1 + xorps $rndkey0,$inout2 + xorps $rndkey0,$inout3 + $movkey ($key),$rndkey0 + +.L${dir}_loop4: + aes${dir} $rndkey1,$inout0 + aes${dir} $rndkey1,$inout1 + dec $rounds + aes${dir} $rndkey1,$inout2 + aes${dir} $rndkey1,$inout3 + $movkey 16($key),$rndkey1 + aes${dir} $rndkey0,$inout0 + aes${dir} $rndkey0,$inout1 + lea 32($key),$key + aes${dir} $rndkey0,$inout2 + aes${dir} $rndkey0,$inout3 + $movkey ($key),$rndkey0 + jnz .L${dir}_loop4 + + aes${dir} $rndkey1,$inout0 + aes${dir} $rndkey1,$inout1 + aes${dir} $rndkey1,$inout2 + aes${dir} $rndkey1,$inout3 + aes${dir}last $rndkey0,$inout0 + aes${dir}last $rndkey0,$inout1 + aes${dir}last $rndkey0,$inout2 + aes${dir}last $rndkey0,$inout3 + ret +.size _aesni_${dir}rypt4,.-_aesni_${dir}rypt4 +___ +} +sub aesni_generate6 { +my $dir=shift; +# As already mentioned it takes in $key and $rounds, which are *not* +# preserved. $inout[0-5] is cipher/clear text... +$code.=<<___; +.type _aesni_${dir}rypt6,\@abi-omnipotent +.align 16 +_aesni_${dir}rypt6: + $movkey ($key),$rndkey0 + shr \$1,$rounds + $movkey 16($key),$rndkey1 + lea 32($key),$key + xorps $rndkey0,$inout0 + pxor $rndkey0,$inout1 + aes${dir} $rndkey1,$inout0 + pxor $rndkey0,$inout2 + aes${dir} $rndkey1,$inout1 + pxor $rndkey0,$inout3 + aes${dir} $rndkey1,$inout2 + pxor $rndkey0,$inout4 + aes${dir} $rndkey1,$inout3 + pxor $rndkey0,$inout5 + dec $rounds + aes${dir} $rndkey1,$inout4 + $movkey ($key),$rndkey0 + aes${dir} $rndkey1,$inout5 + jmp .L${dir}_loop6_enter +.align 16 +.L${dir}_loop6: + aes${dir} $rndkey1,$inout0 + aes${dir} $rndkey1,$inout1 + dec $rounds + aes${dir} $rndkey1,$inout2 + aes${dir} $rndkey1,$inout3 + aes${dir} $rndkey1,$inout4 + aes${dir} $rndkey1,$inout5 +.L${dir}_loop6_enter: # happens to be 16-byte aligned + $movkey 16($key),$rndkey1 + aes${dir} $rndkey0,$inout0 + aes${dir} $rndkey0,$inout1 + lea 32($key),$key + aes${dir} $rndkey0,$inout2 + aes${dir} $rndkey0,$inout3 + aes${dir} $rndkey0,$inout4 + aes${dir} $rndkey0,$inout5 + $movkey ($key),$rndkey0 + jnz .L${dir}_loop6 + + aes${dir} $rndkey1,$inout0 + aes${dir} $rndkey1,$inout1 + aes${dir} $rndkey1,$inout2 + aes${dir} $rndkey1,$inout3 + aes${dir} $rndkey1,$inout4 + aes${dir} $rndkey1,$inout5 + aes${dir}last $rndkey0,$inout0 + aes${dir}last $rndkey0,$inout1 + aes${dir}last $rndkey0,$inout2 + aes${dir}last $rndkey0,$inout3 + aes${dir}last $rndkey0,$inout4 + aes${dir}last $rndkey0,$inout5 + ret +.size _aesni_${dir}rypt6,.-_aesni_${dir}rypt6 +___ +} +sub aesni_generate8 { +my $dir=shift; +# As already mentioned it takes in $key and $rounds, which are *not* +# preserved. $inout[0-7] is cipher/clear text... +$code.=<<___; +.type _aesni_${dir}rypt8,\@abi-omnipotent +.align 16 +_aesni_${dir}rypt8: + $movkey ($key),$rndkey0 + shr \$1,$rounds + $movkey 16($key),$rndkey1 + lea 32($key),$key + xorps $rndkey0,$inout0 + xorps $rndkey0,$inout1 + aes${dir} $rndkey1,$inout0 + pxor $rndkey0,$inout2 + aes${dir} $rndkey1,$inout1 + pxor $rndkey0,$inout3 + aes${dir} $rndkey1,$inout2 + pxor $rndkey0,$inout4 + aes${dir} $rndkey1,$inout3 + pxor $rndkey0,$inout5 + dec $rounds + aes${dir} $rndkey1,$inout4 + pxor $rndkey0,$inout6 + aes${dir} $rndkey1,$inout5 + pxor $rndkey0,$inout7 + $movkey ($key),$rndkey0 + aes${dir} $rndkey1,$inout6 + aes${dir} $rndkey1,$inout7 + $movkey 16($key),$rndkey1 + jmp .L${dir}_loop8_enter +.align 16 +.L${dir}_loop8: + aes${dir} $rndkey1,$inout0 + aes${dir} $rndkey1,$inout1 + dec $rounds + aes${dir} $rndkey1,$inout2 + aes${dir} $rndkey1,$inout3 + aes${dir} $rndkey1,$inout4 + aes${dir} $rndkey1,$inout5 + aes${dir} $rndkey1,$inout6 + aes${dir} $rndkey1,$inout7 + $movkey 16($key),$rndkey1 +.L${dir}_loop8_enter: # happens to be 16-byte aligned + aes${dir} $rndkey0,$inout0 + aes${dir} $rndkey0,$inout1 + lea 32($key),$key + aes${dir} $rndkey0,$inout2 + aes${dir} $rndkey0,$inout3 + aes${dir} $rndkey0,$inout4 + aes${dir} $rndkey0,$inout5 + aes${dir} $rndkey0,$inout6 + aes${dir} $rndkey0,$inout7 + $movkey ($key),$rndkey0 + jnz .L${dir}_loop8 + + aes${dir} $rndkey1,$inout0 + aes${dir} $rndkey1,$inout1 + aes${dir} $rndkey1,$inout2 + aes${dir} $rndkey1,$inout3 + aes${dir} $rndkey1,$inout4 + aes${dir} $rndkey1,$inout5 + aes${dir} $rndkey1,$inout6 + aes${dir} $rndkey1,$inout7 + aes${dir}last $rndkey0,$inout0 + aes${dir}last $rndkey0,$inout1 + aes${dir}last $rndkey0,$inout2 + aes${dir}last $rndkey0,$inout3 + aes${dir}last $rndkey0,$inout4 + aes${dir}last $rndkey0,$inout5 + aes${dir}last $rndkey0,$inout6 + aes${dir}last $rndkey0,$inout7 + ret +.size _aesni_${dir}rypt8,.-_aesni_${dir}rypt8 +___ +} +&aesni_generate3("enc") if ($PREFIX eq "aesni"); +&aesni_generate3("dec"); +&aesni_generate4("enc") if ($PREFIX eq "aesni"); +&aesni_generate4("dec"); +&aesni_generate6("enc") if ($PREFIX eq "aesni"); +&aesni_generate6("dec"); +&aesni_generate8("enc") if ($PREFIX eq "aesni"); +&aesni_generate8("dec"); + +if ($PREFIX eq "aesni") { +######################################################################## +# void aesni_ecb_encrypt (const void *in, void *out, +# size_t length, const AES_KEY *key, +# int enc); +$code.=<<___; +.globl aesni_ecb_encrypt +.type aesni_ecb_encrypt,\@function,5 +.align 16 +aesni_ecb_encrypt: + and \$-16,$len + jz .Lecb_ret + + mov 240($key),$rounds # key->rounds + $movkey ($key),$rndkey0 + mov $key,$key_ # backup $key + mov $rounds,$rnds_ # backup $rounds + test %r8d,%r8d # 5th argument + jz .Lecb_decrypt +#--------------------------- ECB ENCRYPT ------------------------------# + cmp \$0x80,$len + jb .Lecb_enc_tail + + movdqu ($inp),$inout0 + movdqu 0x10($inp),$inout1 + movdqu 0x20($inp),$inout2 + movdqu 0x30($inp),$inout3 + movdqu 0x40($inp),$inout4 + movdqu 0x50($inp),$inout5 + movdqu 0x60($inp),$inout6 + movdqu 0x70($inp),$inout7 + lea 0x80($inp),$inp + sub \$0x80,$len + jmp .Lecb_enc_loop8_enter +.align 16 +.Lecb_enc_loop8: + movups $inout0,($out) + mov $key_,$key # restore $key + movdqu ($inp),$inout0 + mov $rnds_,$rounds # restore $rounds + movups $inout1,0x10($out) + movdqu 0x10($inp),$inout1 + movups $inout2,0x20($out) + movdqu 0x20($inp),$inout2 + movups $inout3,0x30($out) + movdqu 0x30($inp),$inout3 + movups $inout4,0x40($out) + movdqu 0x40($inp),$inout4 + movups $inout5,0x50($out) + movdqu 0x50($inp),$inout5 + movups $inout6,0x60($out) + movdqu 0x60($inp),$inout6 + movups $inout7,0x70($out) + lea 0x80($out),$out + movdqu 0x70($inp),$inout7 + lea 0x80($inp),$inp +.Lecb_enc_loop8_enter: + + call _aesni_encrypt8 + + sub \$0x80,$len + jnc .Lecb_enc_loop8 + + movups $inout0,($out) + mov $key_,$key # restore $key + movups $inout1,0x10($out) + mov $rnds_,$rounds # restore $rounds + movups $inout2,0x20($out) + movups $inout3,0x30($out) + movups $inout4,0x40($out) + movups $inout5,0x50($out) + movups $inout6,0x60($out) + movups $inout7,0x70($out) + lea 0x80($out),$out + add \$0x80,$len + jz .Lecb_ret + +.Lecb_enc_tail: + movups ($inp),$inout0 + cmp \$0x20,$len + jb .Lecb_enc_one + movups 0x10($inp),$inout1 + je .Lecb_enc_two + movups 0x20($inp),$inout2 + cmp \$0x40,$len + jb .Lecb_enc_three + movups 0x30($inp),$inout3 + je .Lecb_enc_four + movups 0x40($inp),$inout4 + cmp \$0x60,$len + jb .Lecb_enc_five + movups 0x50($inp),$inout5 + je .Lecb_enc_six + movdqu 0x60($inp),$inout6 + call _aesni_encrypt8 + movups $inout0,($out) + movups $inout1,0x10($out) + movups $inout2,0x20($out) + movups $inout3,0x30($out) + movups $inout4,0x40($out) + movups $inout5,0x50($out) + movups $inout6,0x60($out) + jmp .Lecb_ret +.align 16 +.Lecb_enc_one: +___ + &aesni_generate1("enc",$key,$rounds); +$code.=<<___; + movups $inout0,($out) + jmp .Lecb_ret +.align 16 +.Lecb_enc_two: + xorps $inout2,$inout2 + call _aesni_encrypt3 + movups $inout0,($out) + movups $inout1,0x10($out) + jmp .Lecb_ret +.align 16 +.Lecb_enc_three: + call _aesni_encrypt3 + movups $inout0,($out) + movups $inout1,0x10($out) + movups $inout2,0x20($out) + jmp .Lecb_ret +.align 16 +.Lecb_enc_four: + call _aesni_encrypt4 + movups $inout0,($out) + movups $inout1,0x10($out) + movups $inout2,0x20($out) + movups $inout3,0x30($out) + jmp .Lecb_ret +.align 16 +.Lecb_enc_five: + xorps $inout5,$inout5 + call _aesni_encrypt6 + movups $inout0,($out) + movups $inout1,0x10($out) + movups $inout2,0x20($out) + movups $inout3,0x30($out) + movups $inout4,0x40($out) + jmp .Lecb_ret +.align 16 +.Lecb_enc_six: + call _aesni_encrypt6 + movups $inout0,($out) + movups $inout1,0x10($out) + movups $inout2,0x20($out) + movups $inout3,0x30($out) + movups $inout4,0x40($out) + movups $inout5,0x50($out) + jmp .Lecb_ret + #--------------------------- ECB DECRYPT ------------------------------# +.align 16 +.Lecb_decrypt: + cmp \$0x80,$len + jb .Lecb_dec_tail + + movdqu ($inp),$inout0 + movdqu 0x10($inp),$inout1 + movdqu 0x20($inp),$inout2 + movdqu 0x30($inp),$inout3 + movdqu 0x40($inp),$inout4 + movdqu 0x50($inp),$inout5 + movdqu 0x60($inp),$inout6 + movdqu 0x70($inp),$inout7 + lea 0x80($inp),$inp + sub \$0x80,$len + jmp .Lecb_dec_loop8_enter +.align 16 +.Lecb_dec_loop8: + movups $inout0,($out) + mov $key_,$key # restore $key + movdqu ($inp),$inout0 + mov $rnds_,$rounds # restore $rounds + movups $inout1,0x10($out) + movdqu 0x10($inp),$inout1 + movups $inout2,0x20($out) + movdqu 0x20($inp),$inout2 + movups $inout3,0x30($out) + movdqu 0x30($inp),$inout3 + movups $inout4,0x40($out) + movdqu 0x40($inp),$inout4 + movups $inout5,0x50($out) + movdqu 0x50($inp),$inout5 + movups $inout6,0x60($out) + movdqu 0x60($inp),$inout6 + movups $inout7,0x70($out) + lea 0x80($out),$out + movdqu 0x70($inp),$inout7 + lea 0x80($inp),$inp +.Lecb_dec_loop8_enter: + + call _aesni_decrypt8 + + $movkey ($key_),$rndkey0 + sub \$0x80,$len + jnc .Lecb_dec_loop8 + + movups $inout0,($out) + mov $key_,$key # restore $key + movups $inout1,0x10($out) + mov $rnds_,$rounds # restore $rounds + movups $inout2,0x20($out) + movups $inout3,0x30($out) + movups $inout4,0x40($out) + movups $inout5,0x50($out) + movups $inout6,0x60($out) + movups $inout7,0x70($out) + lea 0x80($out),$out + add \$0x80,$len + jz .Lecb_ret + +.Lecb_dec_tail: + movups ($inp),$inout0 + cmp \$0x20,$len + jb .Lecb_dec_one + movups 0x10($inp),$inout1 + je .Lecb_dec_two + movups 0x20($inp),$inout2 + cmp \$0x40,$len + jb .Lecb_dec_three + movups 0x30($inp),$inout3 + je .Lecb_dec_four + movups 0x40($inp),$inout4 + cmp \$0x60,$len + jb .Lecb_dec_five + movups 0x50($inp),$inout5 + je .Lecb_dec_six + movups 0x60($inp),$inout6 + $movkey ($key),$rndkey0 + call _aesni_decrypt8 + movups $inout0,($out) + movups $inout1,0x10($out) + movups $inout2,0x20($out) + movups $inout3,0x30($out) + movups $inout4,0x40($out) + movups $inout5,0x50($out) + movups $inout6,0x60($out) + jmp .Lecb_ret +.align 16 +.Lecb_dec_one: +___ + &aesni_generate1("dec",$key,$rounds); +$code.=<<___; + movups $inout0,($out) + jmp .Lecb_ret +.align 16 +.Lecb_dec_two: + xorps $inout2,$inout2 + call _aesni_decrypt3 + movups $inout0,($out) + movups $inout1,0x10($out) + jmp .Lecb_ret +.align 16 +.Lecb_dec_three: + call _aesni_decrypt3 + movups $inout0,($out) + movups $inout1,0x10($out) + movups $inout2,0x20($out) + jmp .Lecb_ret +.align 16 +.Lecb_dec_four: + call _aesni_decrypt4 + movups $inout0,($out) + movups $inout1,0x10($out) + movups $inout2,0x20($out) + movups $inout3,0x30($out) + jmp .Lecb_ret +.align 16 +.Lecb_dec_five: + xorps $inout5,$inout5 + call _aesni_decrypt6 + movups $inout0,($out) + movups $inout1,0x10($out) + movups $inout2,0x20($out) + movups $inout3,0x30($out) + movups $inout4,0x40($out) + jmp .Lecb_ret +.align 16 +.Lecb_dec_six: + call _aesni_decrypt6 + movups $inout0,($out) + movups $inout1,0x10($out) + movups $inout2,0x20($out) + movups $inout3,0x30($out) + movups $inout4,0x40($out) + movups $inout5,0x50($out) + +.Lecb_ret: + ret +.size aesni_ecb_encrypt,.-aesni_ecb_encrypt +___ + +{ +###################################################################### +# void aesni_ccm64_[en|de]crypt_blocks (const void *in, void *out, +# size_t blocks, const AES_KEY *key, +# const char *ivec,char *cmac); +# +# Handles only complete blocks, operates on 64-bit counter and +# does not update *ivec! Nor does it finalize CMAC value +# (see engine/eng_aesni.c for details) +# +{ +my $cmac="%r9"; # 6th argument + +my $increment="%xmm6"; +my $bswap_mask="%xmm7"; + +$code.=<<___; +.globl aesni_ccm64_encrypt_blocks +.type aesni_ccm64_encrypt_blocks,\@function,6 +.align 16 +aesni_ccm64_encrypt_blocks: +___ +$code.=<<___ if ($win64); + lea -0x58(%rsp),%rsp + movaps %xmm6,(%rsp) + movaps %xmm7,0x10(%rsp) + movaps %xmm8,0x20(%rsp) + movaps %xmm9,0x30(%rsp) +.Lccm64_enc_body: +___ +$code.=<<___; + mov 240($key),$rounds # key->rounds + movdqu ($ivp),$iv + movdqa .Lincrement64(%rip),$increment + movdqa .Lbswap_mask(%rip),$bswap_mask + + shr \$1,$rounds + lea 0($key),$key_ + movdqu ($cmac),$inout1 + movdqa $iv,$inout0 + mov $rounds,$rnds_ + pshufb $bswap_mask,$iv + jmp .Lccm64_enc_outer +.align 16 +.Lccm64_enc_outer: + $movkey ($key_),$rndkey0 + mov $rnds_,$rounds + movups ($inp),$in0 # load inp + + xorps $rndkey0,$inout0 # counter + $movkey 16($key_),$rndkey1 + xorps $in0,$rndkey0 + lea 32($key_),$key + xorps $rndkey0,$inout1 # cmac^=inp + $movkey ($key),$rndkey0 + +.Lccm64_enc2_loop: + aesenc $rndkey1,$inout0 + dec $rounds + aesenc $rndkey1,$inout1 + $movkey 16($key),$rndkey1 + aesenc $rndkey0,$inout0 + lea 32($key),$key + aesenc $rndkey0,$inout1 + $movkey 0($key),$rndkey0 + jnz .Lccm64_enc2_loop + aesenc $rndkey1,$inout0 + aesenc $rndkey1,$inout1 + paddq $increment,$iv + aesenclast $rndkey0,$inout0 + aesenclast $rndkey0,$inout1 + + dec $len + lea 16($inp),$inp + xorps $inout0,$in0 # inp ^= E(iv) + movdqa $iv,$inout0 + movups $in0,($out) # save output + lea 16($out),$out + pshufb $bswap_mask,$inout0 + jnz .Lccm64_enc_outer + + movups $inout1,($cmac) +___ +$code.=<<___ if ($win64); + movaps (%rsp),%xmm6 + movaps 0x10(%rsp),%xmm7 + movaps 0x20(%rsp),%xmm8 + movaps 0x30(%rsp),%xmm9 + lea 0x58(%rsp),%rsp +.Lccm64_enc_ret: +___ +$code.=<<___; + ret +.size aesni_ccm64_encrypt_blocks,.-aesni_ccm64_encrypt_blocks +___ +###################################################################### +$code.=<<___; +.globl aesni_ccm64_decrypt_blocks +.type aesni_ccm64_decrypt_blocks,\@function,6 +.align 16 +aesni_ccm64_decrypt_blocks: +___ +$code.=<<___ if ($win64); + lea -0x58(%rsp),%rsp + movaps %xmm6,(%rsp) + movaps %xmm7,0x10(%rsp) + movaps %xmm8,0x20(%rsp) + movaps %xmm9,0x30(%rsp) +.Lccm64_dec_body: +___ +$code.=<<___; + mov 240($key),$rounds # key->rounds + movups ($ivp),$iv + movdqu ($cmac),$inout1 + movdqa .Lincrement64(%rip),$increment + movdqa .Lbswap_mask(%rip),$bswap_mask + + movaps $iv,$inout0 + mov $rounds,$rnds_ + mov $key,$key_ + pshufb $bswap_mask,$iv +___ + &aesni_generate1("enc",$key,$rounds); +$code.=<<___; + movups ($inp),$in0 # load inp + paddq $increment,$iv + lea 16($inp),$inp + jmp .Lccm64_dec_outer +.align 16 +.Lccm64_dec_outer: + xorps $inout0,$in0 # inp ^= E(iv) + movdqa $iv,$inout0 + mov $rnds_,$rounds + movups $in0,($out) # save output + lea 16($out),$out + pshufb $bswap_mask,$inout0 + + sub \$1,$len + jz .Lccm64_dec_break + + $movkey ($key_),$rndkey0 + shr \$1,$rounds + $movkey 16($key_),$rndkey1 + xorps $rndkey0,$in0 + lea 32($key_),$key + xorps $rndkey0,$inout0 + xorps $in0,$inout1 # cmac^=out + $movkey ($key),$rndkey0 + +.Lccm64_dec2_loop: + aesenc $rndkey1,$inout0 + dec $rounds + aesenc $rndkey1,$inout1 + $movkey 16($key),$rndkey1 + aesenc $rndkey0,$inout0 + lea 32($key),$key + aesenc $rndkey0,$inout1 + $movkey 0($key),$rndkey0 + jnz .Lccm64_dec2_loop + movups ($inp),$in0 # load inp + paddq $increment,$iv + aesenc $rndkey1,$inout0 + aesenc $rndkey1,$inout1 + lea 16($inp),$inp + aesenclast $rndkey0,$inout0 + aesenclast $rndkey0,$inout1 + jmp .Lccm64_dec_outer + +.align 16 +.Lccm64_dec_break: + #xorps $in0,$inout1 # cmac^=out +___ + &aesni_generate1("enc",$key_,$rounds,$inout1,$in0); +$code.=<<___; + movups $inout1,($cmac) +___ +$code.=<<___ if ($win64); + movaps (%rsp),%xmm6 + movaps 0x10(%rsp),%xmm7 + movaps 0x20(%rsp),%xmm8 + movaps 0x30(%rsp),%xmm9 + lea 0x58(%rsp),%rsp +.Lccm64_dec_ret: +___ +$code.=<<___; + ret +.size aesni_ccm64_decrypt_blocks,.-aesni_ccm64_decrypt_blocks +___ +} +###################################################################### +# void aesni_ctr32_encrypt_blocks (const void *in, void *out, +# size_t blocks, const AES_KEY *key, +# const char *ivec); +# +# Handles only complete blocks, operates on 32-bit counter and +# does not update *ivec! (see engine/eng_aesni.c for details) +# +{ +my $reserved = $win64?0:-0x28; +my ($in0,$in1,$in2,$in3)=map("%xmm$_",(8..11)); +my ($iv0,$iv1,$ivec)=("%xmm12","%xmm13","%xmm14"); +my $bswap_mask="%xmm15"; + +$code.=<<___; +.globl aesni_ctr32_encrypt_blocks +.type aesni_ctr32_encrypt_blocks,\@function,5 +.align 16 +aesni_ctr32_encrypt_blocks: +___ +$code.=<<___ if ($win64); + lea -0xc8(%rsp),%rsp + movaps %xmm6,0x20(%rsp) + movaps %xmm7,0x30(%rsp) + movaps %xmm8,0x40(%rsp) + movaps %xmm9,0x50(%rsp) + movaps %xmm10,0x60(%rsp) + movaps %xmm11,0x70(%rsp) + movaps %xmm12,0x80(%rsp) + movaps %xmm13,0x90(%rsp) + movaps %xmm14,0xa0(%rsp) + movaps %xmm15,0xb0(%rsp) +.Lctr32_body: +___ +$code.=<<___; + cmp \$1,$len + je .Lctr32_one_shortcut + + movdqu ($ivp),$ivec + movdqa .Lbswap_mask(%rip),$bswap_mask + xor $rounds,$rounds + pextrd \$3,$ivec,$rnds_ # pull 32-bit counter + pinsrd \$3,$rounds,$ivec # wipe 32-bit counter + + mov 240($key),$rounds # key->rounds + bswap $rnds_ + pxor $iv0,$iv0 # vector of 3 32-bit counters + pxor $iv1,$iv1 # vector of 3 32-bit counters + pinsrd \$0,$rnds_,$iv0 + lea 3($rnds_),$key_ + pinsrd \$0,$key_,$iv1 + inc $rnds_ + pinsrd \$1,$rnds_,$iv0 + inc $key_ + pinsrd \$1,$key_,$iv1 + inc $rnds_ + pinsrd \$2,$rnds_,$iv0 + inc $key_ + pinsrd \$2,$key_,$iv1 + movdqa $iv0,$reserved(%rsp) + pshufb $bswap_mask,$iv0 + movdqa $iv1,`$reserved+0x10`(%rsp) + pshufb $bswap_mask,$iv1 + + pshufd \$`3<<6`,$iv0,$inout0 # place counter to upper dword + pshufd \$`2<<6`,$iv0,$inout1 + pshufd \$`1<<6`,$iv0,$inout2 + cmp \$6,$len + jb .Lctr32_tail + shr \$1,$rounds + mov $key,$key_ # backup $key + mov $rounds,$rnds_ # backup $rounds + sub \$6,$len + jmp .Lctr32_loop6 + +.align 16 +.Lctr32_loop6: + pshufd \$`3<<6`,$iv1,$inout3 + por $ivec,$inout0 # merge counter-less ivec + $movkey ($key_),$rndkey0 + pshufd \$`2<<6`,$iv1,$inout4 + por $ivec,$inout1 + $movkey 16($key_),$rndkey1 + pshufd \$`1<<6`,$iv1,$inout5 + por $ivec,$inout2 + por $ivec,$inout3 + xorps $rndkey0,$inout0 + por $ivec,$inout4 + por $ivec,$inout5 + + # inline _aesni_encrypt6 and interleave last rounds + # with own code... + + pxor $rndkey0,$inout1 + aesenc $rndkey1,$inout0 + lea 32($key_),$key + pxor $rndkey0,$inout2 + aesenc $rndkey1,$inout1 + movdqa .Lincrement32(%rip),$iv1 + pxor $rndkey0,$inout3 + aesenc $rndkey1,$inout2 + movdqa $reserved(%rsp),$iv0 + pxor $rndkey0,$inout4 + aesenc $rndkey1,$inout3 + pxor $rndkey0,$inout5 + $movkey ($key),$rndkey0 + dec $rounds + aesenc $rndkey1,$inout4 + aesenc $rndkey1,$inout5 + jmp .Lctr32_enc_loop6_enter +.align 16 +.Lctr32_enc_loop6: + aesenc $rndkey1,$inout0 + aesenc $rndkey1,$inout1 + dec $rounds + aesenc $rndkey1,$inout2 + aesenc $rndkey1,$inout3 + aesenc $rndkey1,$inout4 + aesenc $rndkey1,$inout5 +.Lctr32_enc_loop6_enter: + $movkey 16($key),$rndkey1 + aesenc $rndkey0,$inout0 + aesenc $rndkey0,$inout1 + lea 32($key),$key + aesenc $rndkey0,$inout2 + aesenc $rndkey0,$inout3 + aesenc $rndkey0,$inout4 + aesenc $rndkey0,$inout5 + $movkey ($key),$rndkey0 + jnz .Lctr32_enc_loop6 + + aesenc $rndkey1,$inout0 + paddd $iv1,$iv0 # increment counter vector + aesenc $rndkey1,$inout1 + paddd `$reserved+0x10`(%rsp),$iv1 + aesenc $rndkey1,$inout2 + movdqa $iv0,$reserved(%rsp) # save counter vector + aesenc $rndkey1,$inout3 + movdqa $iv1,`$reserved+0x10`(%rsp) + aesenc $rndkey1,$inout4 + pshufb $bswap_mask,$iv0 # byte swap + aesenc $rndkey1,$inout5 + pshufb $bswap_mask,$iv1 + + aesenclast $rndkey0,$inout0 + movups ($inp),$in0 # load input + aesenclast $rndkey0,$inout1 + movups 0x10($inp),$in1 + aesenclast $rndkey0,$inout2 + movups 0x20($inp),$in2 + aesenclast $rndkey0,$inout3 + movups 0x30($inp),$in3 + aesenclast $rndkey0,$inout4 + movups 0x40($inp),$rndkey1 + aesenclast $rndkey0,$inout5 + movups 0x50($inp),$rndkey0 + lea 0x60($inp),$inp + + xorps $inout0,$in0 # xor + pshufd \$`3<<6`,$iv0,$inout0 + xorps $inout1,$in1 + pshufd \$`2<<6`,$iv0,$inout1 + movups $in0,($out) # store output + xorps $inout2,$in2 + pshufd \$`1<<6`,$iv0,$inout2 + movups $in1,0x10($out) + xorps $inout3,$in3 + movups $in2,0x20($out) + xorps $inout4,$rndkey1 + movups $in3,0x30($out) + xorps $inout5,$rndkey0 + movups $rndkey1,0x40($out) + movups $rndkey0,0x50($out) + lea 0x60($out),$out + mov $rnds_,$rounds + sub \$6,$len + jnc .Lctr32_loop6 + + add \$6,$len + jz .Lctr32_done + mov $key_,$key # restore $key + lea 1($rounds,$rounds),$rounds # restore original value + +.Lctr32_tail: + por $ivec,$inout0 + movups ($inp),$in0 + cmp \$2,$len + jb .Lctr32_one + + por $ivec,$inout1 + movups 0x10($inp),$in1 + je .Lctr32_two + + pshufd \$`3<<6`,$iv1,$inout3 + por $ivec,$inout2 + movups 0x20($inp),$in2 + cmp \$4,$len + jb .Lctr32_three + + pshufd \$`2<<6`,$iv1,$inout4 + por $ivec,$inout3 + movups 0x30($inp),$in3 + je .Lctr32_four + + por $ivec,$inout4 + xorps $inout5,$inout5 + + call _aesni_encrypt6 + + movups 0x40($inp),$rndkey1 + xorps $inout0,$in0 + xorps $inout1,$in1 + movups $in0,($out) + xorps $inout2,$in2 + movups $in1,0x10($out) + xorps $inout3,$in3 + movups $in2,0x20($out) + xorps $inout4,$rndkey1 + movups $in3,0x30($out) + movups $rndkey1,0x40($out) + jmp .Lctr32_done + +.align 16 +.Lctr32_one_shortcut: + movups ($ivp),$inout0 + movups ($inp),$in0 + mov 240($key),$rounds # key->rounds +.Lctr32_one: +___ + &aesni_generate1("enc",$key,$rounds); +$code.=<<___; + xorps $inout0,$in0 + movups $in0,($out) + jmp .Lctr32_done + +.align 16 +.Lctr32_two: + xorps $inout2,$inout2 + call _aesni_encrypt3 + xorps $inout0,$in0 + xorps $inout1,$in1 + movups $in0,($out) + movups $in1,0x10($out) + jmp .Lctr32_done + +.align 16 +.Lctr32_three: + call _aesni_encrypt3 + xorps $inout0,$in0 + xorps $inout1,$in1 + movups $in0,($out) + xorps $inout2,$in2 + movups $in1,0x10($out) + movups $in2,0x20($out) + jmp .Lctr32_done + +.align 16 +.Lctr32_four: + call _aesni_encrypt4 + xorps $inout0,$in0 + xorps $inout1,$in1 + movups $in0,($out) + xorps $inout2,$in2 + movups $in1,0x10($out) + xorps $inout3,$in3 + movups $in2,0x20($out) + movups $in3,0x30($out) + +.Lctr32_done: +___ +$code.=<<___ if ($win64); + movaps 0x20(%rsp),%xmm6 + movaps 0x30(%rsp),%xmm7 + movaps 0x40(%rsp),%xmm8 + movaps 0x50(%rsp),%xmm9 + movaps 0x60(%rsp),%xmm10 + movaps 0x70(%rsp),%xmm11 + movaps 0x80(%rsp),%xmm12 + movaps 0x90(%rsp),%xmm13 + movaps 0xa0(%rsp),%xmm14 + movaps 0xb0(%rsp),%xmm15 + lea 0xc8(%rsp),%rsp +.Lctr32_ret: +___ +$code.=<<___; + ret +.size aesni_ctr32_encrypt_blocks,.-aesni_ctr32_encrypt_blocks +___ +} + +###################################################################### +# void aesni_xts_[en|de]crypt(const char *inp,char *out,size_t len, +# const AES_KEY *key1, const AES_KEY *key2 +# const unsigned char iv[16]); +# +{ +my @tweak=map("%xmm$_",(10..15)); +my ($twmask,$twres,$twtmp)=("%xmm8","%xmm9",@tweak[4]); +my ($key2,$ivp,$len_)=("%r8","%r9","%r9"); +my $frame_size = 0x68 + ($win64?160:0); + +$code.=<<___; +.globl aesni_xts_encrypt +.type aesni_xts_encrypt,\@function,6 +.align 16 +aesni_xts_encrypt: + lea -$frame_size(%rsp),%rsp +___ +$code.=<<___ if ($win64); + movaps %xmm6,0x60(%rsp) + movaps %xmm7,0x70(%rsp) + movaps %xmm8,0x80(%rsp) + movaps %xmm9,0x90(%rsp) + movaps %xmm10,0xa0(%rsp) + movaps %xmm11,0xb0(%rsp) + movaps %xmm12,0xc0(%rsp) + movaps %xmm13,0xd0(%rsp) + movaps %xmm14,0xe0(%rsp) + movaps %xmm15,0xf0(%rsp) +.Lxts_enc_body: +___ +$code.=<<___; + movups ($ivp),@tweak[5] # load clear-text tweak + mov 240(%r8),$rounds # key2->rounds + mov 240($key),$rnds_ # key1->rounds +___ + # generate the tweak + &aesni_generate1("enc",$key2,$rounds,@tweak[5]); +$code.=<<___; + mov $key,$key_ # backup $key + mov $rnds_,$rounds # backup $rounds + mov $len,$len_ # backup $len + and \$-16,$len + + movdqa .Lxts_magic(%rip),$twmask + pxor $twtmp,$twtmp + pcmpgtd @tweak[5],$twtmp # broadcast upper bits +___ + for ($i=0;$i<4;$i++) { + $code.=<<___; + pshufd \$0x13,$twtmp,$twres + pxor $twtmp,$twtmp + movdqa @tweak[5],@tweak[$i] + paddq @tweak[5],@tweak[5] # psllq 1,$tweak + pand $twmask,$twres # isolate carry and residue + pcmpgtd @tweak[5],$twtmp # broadcat upper bits + pxor $twres,@tweak[5] +___ + } +$code.=<<___; + sub \$16*6,$len + jc .Lxts_enc_short + + shr \$1,$rounds + sub \$1,$rounds + mov $rounds,$rnds_ + jmp .Lxts_enc_grandloop + +.align 16 +.Lxts_enc_grandloop: + pshufd \$0x13,$twtmp,$twres + movdqa @tweak[5],@tweak[4] + paddq @tweak[5],@tweak[5] # psllq 1,$tweak + movdqu `16*0`($inp),$inout0 # load input + pand $twmask,$twres # isolate carry and residue + movdqu `16*1`($inp),$inout1 + pxor $twres,@tweak[5] + + movdqu `16*2`($inp),$inout2 + pxor @tweak[0],$inout0 # input^=tweak + movdqu `16*3`($inp),$inout3 + pxor @tweak[1],$inout1 + movdqu `16*4`($inp),$inout4 + pxor @tweak[2],$inout2 + movdqu `16*5`($inp),$inout5 + lea `16*6`($inp),$inp + pxor @tweak[3],$inout3 + $movkey ($key_),$rndkey0 + pxor @tweak[4],$inout4 + pxor @tweak[5],$inout5 + + # inline _aesni_encrypt6 and interleave first and last rounds + # with own code... + $movkey 16($key_),$rndkey1 + pxor $rndkey0,$inout0 + pxor $rndkey0,$inout1 + movdqa @tweak[0],`16*0`(%rsp) # put aside tweaks + aesenc $rndkey1,$inout0 + lea 32($key_),$key + pxor $rndkey0,$inout2 + movdqa @tweak[1],`16*1`(%rsp) + aesenc $rndkey1,$inout1 + pxor $rndkey0,$inout3 + movdqa @tweak[2],`16*2`(%rsp) + aesenc $rndkey1,$inout2 + pxor $rndkey0,$inout4 + movdqa @tweak[3],`16*3`(%rsp) + aesenc $rndkey1,$inout3 + pxor $rndkey0,$inout5 + $movkey ($key),$rndkey0 + dec $rounds + movdqa @tweak[4],`16*4`(%rsp) + aesenc $rndkey1,$inout4 + movdqa @tweak[5],`16*5`(%rsp) + aesenc $rndkey1,$inout5 + pxor $twtmp,$twtmp + pcmpgtd @tweak[5],$twtmp + jmp .Lxts_enc_loop6_enter + +.align 16 +.Lxts_enc_loop6: + aesenc $rndkey1,$inout0 + aesenc $rndkey1,$inout1 + dec $rounds + aesenc $rndkey1,$inout2 + aesenc $rndkey1,$inout3 + aesenc $rndkey1,$inout4 + aesenc $rndkey1,$inout5 +.Lxts_enc_loop6_enter: + $movkey 16($key),$rndkey1 + aesenc $rndkey0,$inout0 + aesenc $rndkey0,$inout1 + lea 32($key),$key + aesenc $rndkey0,$inout2 + aesenc $rndkey0,$inout3 + aesenc $rndkey0,$inout4 + aesenc $rndkey0,$inout5 + $movkey ($key),$rndkey0 + jnz .Lxts_enc_loop6 + + pshufd \$0x13,$twtmp,$twres + pxor $twtmp,$twtmp + paddq @tweak[5],@tweak[5] # psllq 1,$tweak + aesenc $rndkey1,$inout0 + pand $twmask,$twres # isolate carry and residue + aesenc $rndkey1,$inout1 + pcmpgtd @tweak[5],$twtmp # broadcast upper bits + aesenc $rndkey1,$inout2 + pxor $twres,@tweak[5] + aesenc $rndkey1,$inout3 + aesenc $rndkey1,$inout4 + aesenc $rndkey1,$inout5 + $movkey 16($key),$rndkey1 + + pshufd \$0x13,$twtmp,$twres + pxor $twtmp,$twtmp + movdqa @tweak[5],@tweak[0] + paddq @tweak[5],@tweak[5] # psllq 1,$tweak + aesenc $rndkey0,$inout0 + pand $twmask,$twres # isolate carry and residue + aesenc $rndkey0,$inout1 + pcmpgtd @tweak[5],$twtmp # broadcat upper bits + aesenc $rndkey0,$inout2 + pxor $twres,@tweak[5] + aesenc $rndkey0,$inout3 + aesenc $rndkey0,$inout4 + aesenc $rndkey0,$inout5 + $movkey 32($key),$rndkey0 + + pshufd \$0x13,$twtmp,$twres + pxor $twtmp,$twtmp + movdqa @tweak[5],@tweak[1] + paddq @tweak[5],@tweak[5] # psllq 1,$tweak + aesenc $rndkey1,$inout0 + pand $twmask,$twres # isolate carry and residue + aesenc $rndkey1,$inout1 + pcmpgtd @tweak[5],$twtmp # broadcat upper bits + aesenc $rndkey1,$inout2 + pxor $twres,@tweak[5] + aesenc $rndkey1,$inout3 + aesenc $rndkey1,$inout4 + aesenc $rndkey1,$inout5 + + pshufd \$0x13,$twtmp,$twres + pxor $twtmp,$twtmp + movdqa @tweak[5],@tweak[2] + paddq @tweak[5],@tweak[5] # psllq 1,$tweak + aesenclast $rndkey0,$inout0 + pand $twmask,$twres # isolate carry and residue + aesenclast $rndkey0,$inout1 + pcmpgtd @tweak[5],$twtmp # broadcat upper bits + aesenclast $rndkey0,$inout2 + pxor $twres,@tweak[5] + aesenclast $rndkey0,$inout3 + aesenclast $rndkey0,$inout4 + aesenclast $rndkey0,$inout5 + + pshufd \$0x13,$twtmp,$twres + pxor $twtmp,$twtmp + movdqa @tweak[5],@tweak[3] + paddq @tweak[5],@tweak[5] # psllq 1,$tweak + xorps `16*0`(%rsp),$inout0 # output^=tweak + pand $twmask,$twres # isolate carry and residue + xorps `16*1`(%rsp),$inout1 + pcmpgtd @tweak[5],$twtmp # broadcat upper bits + pxor $twres,@tweak[5] + + xorps `16*2`(%rsp),$inout2 + movups $inout0,`16*0`($out) # write output + xorps `16*3`(%rsp),$inout3 + movups $inout1,`16*1`($out) + xorps `16*4`(%rsp),$inout4 + movups $inout2,`16*2`($out) + xorps `16*5`(%rsp),$inout5 + movups $inout3,`16*3`($out) + mov $rnds_,$rounds # restore $rounds + movups $inout4,`16*4`($out) + movups $inout5,`16*5`($out) + lea `16*6`($out),$out + sub \$16*6,$len + jnc .Lxts_enc_grandloop + + lea 3($rounds,$rounds),$rounds # restore original value + mov $key_,$key # restore $key + mov $rounds,$rnds_ # backup $rounds + +.Lxts_enc_short: + add \$16*6,$len + jz .Lxts_enc_done + + cmp \$0x20,$len + jb .Lxts_enc_one + je .Lxts_enc_two + + cmp \$0x40,$len + jb .Lxts_enc_three + je .Lxts_enc_four + + pshufd \$0x13,$twtmp,$twres + movdqa @tweak[5],@tweak[4] + paddq @tweak[5],@tweak[5] # psllq 1,$tweak + movdqu ($inp),$inout0 + pand $twmask,$twres # isolate carry and residue + movdqu 16*1($inp),$inout1 + pxor $twres,@tweak[5] + + movdqu 16*2($inp),$inout2 + pxor @tweak[0],$inout0 + movdqu 16*3($inp),$inout3 + pxor @tweak[1],$inout1 + movdqu 16*4($inp),$inout4 + lea 16*5($inp),$inp + pxor @tweak[2],$inout2 + pxor @tweak[3],$inout3 + pxor @tweak[4],$inout4 + + call _aesni_encrypt6 + + xorps @tweak[0],$inout0 + movdqa @tweak[5],@tweak[0] + xorps @tweak[1],$inout1 + xorps @tweak[2],$inout2 + movdqu $inout0,($out) + xorps @tweak[3],$inout3 + movdqu $inout1,16*1($out) + xorps @tweak[4],$inout4 + movdqu $inout2,16*2($out) + movdqu $inout3,16*3($out) + movdqu $inout4,16*4($out) + lea 16*5($out),$out + jmp .Lxts_enc_done + +.align 16 +.Lxts_enc_one: + movups ($inp),$inout0 + lea 16*1($inp),$inp + xorps @tweak[0],$inout0 +___ + &aesni_generate1("enc",$key,$rounds); +$code.=<<___; + xorps @tweak[0],$inout0 + movdqa @tweak[1],@tweak[0] + movups $inout0,($out) + lea 16*1($out),$out + jmp .Lxts_enc_done + +.align 16 +.Lxts_enc_two: + movups ($inp),$inout0 + movups 16($inp),$inout1 + lea 32($inp),$inp + xorps @tweak[0],$inout0 + xorps @tweak[1],$inout1 + + call _aesni_encrypt3 + + xorps @tweak[0],$inout0 + movdqa @tweak[2],@tweak[0] + xorps @tweak[1],$inout1 + movups $inout0,($out) + movups $inout1,16*1($out) + lea 16*2($out),$out + jmp .Lxts_enc_done + +.align 16 +.Lxts_enc_three: + movups ($inp),$inout0 + movups 16*1($inp),$inout1 + movups 16*2($inp),$inout2 + lea 16*3($inp),$inp + xorps @tweak[0],$inout0 + xorps @tweak[1],$inout1 + xorps @tweak[2],$inout2 + + call _aesni_encrypt3 + + xorps @tweak[0],$inout0 + movdqa @tweak[3],@tweak[0] + xorps @tweak[1],$inout1 + xorps @tweak[2],$inout2 + movups $inout0,($out) + movups $inout1,16*1($out) + movups $inout2,16*2($out) + lea 16*3($out),$out + jmp .Lxts_enc_done + +.align 16 +.Lxts_enc_four: + movups ($inp),$inout0 + movups 16*1($inp),$inout1 + movups 16*2($inp),$inout2 + xorps @tweak[0],$inout0 + movups 16*3($inp),$inout3 + lea 16*4($inp),$inp + xorps @tweak[1],$inout1 + xorps @tweak[2],$inout2 + xorps @tweak[3],$inout3 + + call _aesni_encrypt4 + + xorps @tweak[0],$inout0 + movdqa @tweak[5],@tweak[0] + xorps @tweak[1],$inout1 + xorps @tweak[2],$inout2 + movups $inout0,($out) + xorps @tweak[3],$inout3 + movups $inout1,16*1($out) + movups $inout2,16*2($out) + movups $inout3,16*3($out) + lea 16*4($out),$out + jmp .Lxts_enc_done + +.align 16 +.Lxts_enc_done: + and \$15,$len_ + jz .Lxts_enc_ret + mov $len_,$len + +.Lxts_enc_steal: + movzb ($inp),%eax # borrow $rounds ... + movzb -16($out),%ecx # ... and $key + lea 1($inp),$inp + mov %al,-16($out) + mov %cl,0($out) + lea 1($out),$out + sub \$1,$len + jnz .Lxts_enc_steal + + sub $len_,$out # rewind $out + mov $key_,$key # restore $key + mov $rnds_,$rounds # restore $rounds + + movups -16($out),$inout0 + xorps @tweak[0],$inout0 +___ + &aesni_generate1("enc",$key,$rounds); +$code.=<<___; + xorps @tweak[0],$inout0 + movups $inout0,-16($out) + +.Lxts_enc_ret: +___ +$code.=<<___ if ($win64); + movaps 0x60(%rsp),%xmm6 + movaps 0x70(%rsp),%xmm7 + movaps 0x80(%rsp),%xmm8 + movaps 0x90(%rsp),%xmm9 + movaps 0xa0(%rsp),%xmm10 + movaps 0xb0(%rsp),%xmm11 + movaps 0xc0(%rsp),%xmm12 + movaps 0xd0(%rsp),%xmm13 + movaps 0xe0(%rsp),%xmm14 + movaps 0xf0(%rsp),%xmm15 +___ +$code.=<<___; + lea $frame_size(%rsp),%rsp +.Lxts_enc_epilogue: + ret +.size aesni_xts_encrypt,.-aesni_xts_encrypt +___ + +$code.=<<___; +.globl aesni_xts_decrypt +.type aesni_xts_decrypt,\@function,6 +.align 16 +aesni_xts_decrypt: + lea -$frame_size(%rsp),%rsp +___ +$code.=<<___ if ($win64); + movaps %xmm6,0x60(%rsp) + movaps %xmm7,0x70(%rsp) + movaps %xmm8,0x80(%rsp) + movaps %xmm9,0x90(%rsp) + movaps %xmm10,0xa0(%rsp) + movaps %xmm11,0xb0(%rsp) + movaps %xmm12,0xc0(%rsp) + movaps %xmm13,0xd0(%rsp) + movaps %xmm14,0xe0(%rsp) + movaps %xmm15,0xf0(%rsp) +.Lxts_dec_body: +___ +$code.=<<___; + movups ($ivp),@tweak[5] # load clear-text tweak + mov 240($key2),$rounds # key2->rounds + mov 240($key),$rnds_ # key1->rounds +___ + # generate the tweak + &aesni_generate1("enc",$key2,$rounds,@tweak[5]); +$code.=<<___; + xor %eax,%eax # if ($len%16) len-=16; + test \$15,$len + setnz %al + shl \$4,%rax + sub %rax,$len + + mov $key,$key_ # backup $key + mov $rnds_,$rounds # backup $rounds + mov $len,$len_ # backup $len + and \$-16,$len + + movdqa .Lxts_magic(%rip),$twmask + pxor $twtmp,$twtmp + pcmpgtd @tweak[5],$twtmp # broadcast upper bits +___ + for ($i=0;$i<4;$i++) { + $code.=<<___; + pshufd \$0x13,$twtmp,$twres + pxor $twtmp,$twtmp + movdqa @tweak[5],@tweak[$i] + paddq @tweak[5],@tweak[5] # psllq 1,$tweak + pand $twmask,$twres # isolate carry and residue + pcmpgtd @tweak[5],$twtmp # broadcat upper bits + pxor $twres,@tweak[5] +___ + } +$code.=<<___; + sub \$16*6,$len + jc .Lxts_dec_short + + shr \$1,$rounds + sub \$1,$rounds + mov $rounds,$rnds_ + jmp .Lxts_dec_grandloop + +.align 16 +.Lxts_dec_grandloop: + pshufd \$0x13,$twtmp,$twres + movdqa @tweak[5],@tweak[4] + paddq @tweak[5],@tweak[5] # psllq 1,$tweak + movdqu `16*0`($inp),$inout0 # load input + pand $twmask,$twres # isolate carry and residue + movdqu `16*1`($inp),$inout1 + pxor $twres,@tweak[5] + + movdqu `16*2`($inp),$inout2 + pxor @tweak[0],$inout0 # input^=tweak + movdqu `16*3`($inp),$inout3 + pxor @tweak[1],$inout1 + movdqu `16*4`($inp),$inout4 + pxor @tweak[2],$inout2 + movdqu `16*5`($inp),$inout5 + lea `16*6`($inp),$inp + pxor @tweak[3],$inout3 + $movkey ($key_),$rndkey0 + pxor @tweak[4],$inout4 + pxor @tweak[5],$inout5 + + # inline _aesni_decrypt6 and interleave first and last rounds + # with own code... + $movkey 16($key_),$rndkey1 + pxor $rndkey0,$inout0 + pxor $rndkey0,$inout1 + movdqa @tweak[0],`16*0`(%rsp) # put aside tweaks + aesdec $rndkey1,$inout0 + lea 32($key_),$key + pxor $rndkey0,$inout2 + movdqa @tweak[1],`16*1`(%rsp) + aesdec $rndkey1,$inout1 + pxor $rndkey0,$inout3 + movdqa @tweak[2],`16*2`(%rsp) + aesdec $rndkey1,$inout2 + pxor $rndkey0,$inout4 + movdqa @tweak[3],`16*3`(%rsp) + aesdec $rndkey1,$inout3 + pxor $rndkey0,$inout5 + $movkey ($key),$rndkey0 + dec $rounds + movdqa @tweak[4],`16*4`(%rsp) + aesdec $rndkey1,$inout4 + movdqa @tweak[5],`16*5`(%rsp) + aesdec $rndkey1,$inout5 + pxor $twtmp,$twtmp + pcmpgtd @tweak[5],$twtmp + jmp .Lxts_dec_loop6_enter + +.align 16 +.Lxts_dec_loop6: + aesdec $rndkey1,$inout0 + aesdec $rndkey1,$inout1 + dec $rounds + aesdec $rndkey1,$inout2 + aesdec $rndkey1,$inout3 + aesdec $rndkey1,$inout4 + aesdec $rndkey1,$inout5 +.Lxts_dec_loop6_enter: + $movkey 16($key),$rndkey1 + aesdec $rndkey0,$inout0 + aesdec $rndkey0,$inout1 + lea 32($key),$key + aesdec $rndkey0,$inout2 + aesdec $rndkey0,$inout3 + aesdec $rndkey0,$inout4 + aesdec $rndkey0,$inout5 + $movkey ($key),$rndkey0 + jnz .Lxts_dec_loop6 + + pshufd \$0x13,$twtmp,$twres + pxor $twtmp,$twtmp + paddq @tweak[5],@tweak[5] # psllq 1,$tweak + aesdec $rndkey1,$inout0 + pand $twmask,$twres # isolate carry and residue + aesdec $rndkey1,$inout1 + pcmpgtd @tweak[5],$twtmp # broadcast upper bits + aesdec $rndkey1,$inout2 + pxor $twres,@tweak[5] + aesdec $rndkey1,$inout3 + aesdec $rndkey1,$inout4 + aesdec $rndkey1,$inout5 + $movkey 16($key),$rndkey1 + + pshufd \$0x13,$twtmp,$twres + pxor $twtmp,$twtmp + movdqa @tweak[5],@tweak[0] + paddq @tweak[5],@tweak[5] # psllq 1,$tweak + aesdec $rndkey0,$inout0 + pand $twmask,$twres # isolate carry and residue + aesdec $rndkey0,$inout1 + pcmpgtd @tweak[5],$twtmp # broadcat upper bits + aesdec $rndkey0,$inout2 + pxor $twres,@tweak[5] + aesdec $rndkey0,$inout3 + aesdec $rndkey0,$inout4 + aesdec $rndkey0,$inout5 + $movkey 32($key),$rndkey0 + + pshufd \$0x13,$twtmp,$twres + pxor $twtmp,$twtmp + movdqa @tweak[5],@tweak[1] + paddq @tweak[5],@tweak[5] # psllq 1,$tweak + aesdec $rndkey1,$inout0 + pand $twmask,$twres # isolate carry and residue + aesdec $rndkey1,$inout1 + pcmpgtd @tweak[5],$twtmp # broadcat upper bits + aesdec $rndkey1,$inout2 + pxor $twres,@tweak[5] + aesdec $rndkey1,$inout3 + aesdec $rndkey1,$inout4 + aesdec $rndkey1,$inout5 + + pshufd \$0x13,$twtmp,$twres + pxor $twtmp,$twtmp + movdqa @tweak[5],@tweak[2] + paddq @tweak[5],@tweak[5] # psllq 1,$tweak + aesdeclast $rndkey0,$inout0 + pand $twmask,$twres # isolate carry and residue + aesdeclast $rndkey0,$inout1 + pcmpgtd @tweak[5],$twtmp # broadcat upper bits + aesdeclast $rndkey0,$inout2 + pxor $twres,@tweak[5] + aesdeclast $rndkey0,$inout3 + aesdeclast $rndkey0,$inout4 + aesdeclast $rndkey0,$inout5 + + pshufd \$0x13,$twtmp,$twres + pxor $twtmp,$twtmp + movdqa @tweak[5],@tweak[3] + paddq @tweak[5],@tweak[5] # psllq 1,$tweak + xorps `16*0`(%rsp),$inout0 # output^=tweak + pand $twmask,$twres # isolate carry and residue + xorps `16*1`(%rsp),$inout1 + pcmpgtd @tweak[5],$twtmp # broadcat upper bits + pxor $twres,@tweak[5] + + xorps `16*2`(%rsp),$inout2 + movups $inout0,`16*0`($out) # write output + xorps `16*3`(%rsp),$inout3 + movups $inout1,`16*1`($out) + xorps `16*4`(%rsp),$inout4 + movups $inout2,`16*2`($out) + xorps `16*5`(%rsp),$inout5 + movups $inout3,`16*3`($out) + mov $rnds_,$rounds # restore $rounds + movups $inout4,`16*4`($out) + movups $inout5,`16*5`($out) + lea `16*6`($out),$out + sub \$16*6,$len + jnc .Lxts_dec_grandloop + + lea 3($rounds,$rounds),$rounds # restore original value + mov $key_,$key # restore $key + mov $rounds,$rnds_ # backup $rounds + +.Lxts_dec_short: + add \$16*6,$len + jz .Lxts_dec_done + + cmp \$0x20,$len + jb .Lxts_dec_one + je .Lxts_dec_two + + cmp \$0x40,$len + jb .Lxts_dec_three + je .Lxts_dec_four + + pshufd \$0x13,$twtmp,$twres + movdqa @tweak[5],@tweak[4] + paddq @tweak[5],@tweak[5] # psllq 1,$tweak + movdqu ($inp),$inout0 + pand $twmask,$twres # isolate carry and residue + movdqu 16*1($inp),$inout1 + pxor $twres,@tweak[5] + + movdqu 16*2($inp),$inout2 + pxor @tweak[0],$inout0 + movdqu 16*3($inp),$inout3 + pxor @tweak[1],$inout1 + movdqu 16*4($inp),$inout4 + lea 16*5($inp),$inp + pxor @tweak[2],$inout2 + pxor @tweak[3],$inout3 + pxor @tweak[4],$inout4 + + call _aesni_decrypt6 + + xorps @tweak[0],$inout0 + xorps @tweak[1],$inout1 + xorps @tweak[2],$inout2 + movdqu $inout0,($out) + xorps @tweak[3],$inout3 + movdqu $inout1,16*1($out) + xorps @tweak[4],$inout4 + movdqu $inout2,16*2($out) + pxor $twtmp,$twtmp + movdqu $inout3,16*3($out) + pcmpgtd @tweak[5],$twtmp + movdqu $inout4,16*4($out) + lea 16*5($out),$out + pshufd \$0x13,$twtmp,@tweak[1] # $twres + and \$15,$len_ + jz .Lxts_dec_ret + + movdqa @tweak[5],@tweak[0] + paddq @tweak[5],@tweak[5] # psllq 1,$tweak + pand $twmask,@tweak[1] # isolate carry and residue + pxor @tweak[5],@tweak[1] + jmp .Lxts_dec_done2 + +.align 16 +.Lxts_dec_one: + movups ($inp),$inout0 + lea 16*1($inp),$inp + xorps @tweak[0],$inout0 +___ + &aesni_generate1("dec",$key,$rounds); +$code.=<<___; + xorps @tweak[0],$inout0 + movdqa @tweak[1],@tweak[0] + movups $inout0,($out) + movdqa @tweak[2],@tweak[1] + lea 16*1($out),$out + jmp .Lxts_dec_done + +.align 16 +.Lxts_dec_two: + movups ($inp),$inout0 + movups 16($inp),$inout1 + lea 32($inp),$inp + xorps @tweak[0],$inout0 + xorps @tweak[1],$inout1 + + call _aesni_decrypt3 + + xorps @tweak[0],$inout0 + movdqa @tweak[2],@tweak[0] + xorps @tweak[1],$inout1 + movdqa @tweak[3],@tweak[1] + movups $inout0,($out) + movups $inout1,16*1($out) + lea 16*2($out),$out + jmp .Lxts_dec_done + +.align 16 +.Lxts_dec_three: + movups ($inp),$inout0 + movups 16*1($inp),$inout1 + movups 16*2($inp),$inout2 + lea 16*3($inp),$inp + xorps @tweak[0],$inout0 + xorps @tweak[1],$inout1 + xorps @tweak[2],$inout2 + + call _aesni_decrypt3 + + xorps @tweak[0],$inout0 + movdqa @tweak[3],@tweak[0] + xorps @tweak[1],$inout1 + movdqa @tweak[5],@tweak[1] + xorps @tweak[2],$inout2 + movups $inout0,($out) + movups $inout1,16*1($out) + movups $inout2,16*2($out) + lea 16*3($out),$out + jmp .Lxts_dec_done + +.align 16 +.Lxts_dec_four: + pshufd \$0x13,$twtmp,$twres + movdqa @tweak[5],@tweak[4] + paddq @tweak[5],@tweak[5] # psllq 1,$tweak + movups ($inp),$inout0 + pand $twmask,$twres # isolate carry and residue + movups 16*1($inp),$inout1 + pxor $twres,@tweak[5] + + movups 16*2($inp),$inout2 + xorps @tweak[0],$inout0 + movups 16*3($inp),$inout3 + lea 16*4($inp),$inp + xorps @tweak[1],$inout1 + xorps @tweak[2],$inout2 + xorps @tweak[3],$inout3 + + call _aesni_decrypt4 + + xorps @tweak[0],$inout0 + movdqa @tweak[4],@tweak[0] + xorps @tweak[1],$inout1 + movdqa @tweak[5],@tweak[1] + xorps @tweak[2],$inout2 + movups $inout0,($out) + xorps @tweak[3],$inout3 + movups $inout1,16*1($out) + movups $inout2,16*2($out) + movups $inout3,16*3($out) + lea 16*4($out),$out + jmp .Lxts_dec_done + +.align 16 +.Lxts_dec_done: + and \$15,$len_ + jz .Lxts_dec_ret +.Lxts_dec_done2: + mov $len_,$len + mov $key_,$key # restore $key + mov $rnds_,$rounds # restore $rounds + + movups ($inp),$inout0 + xorps @tweak[1],$inout0 +___ + &aesni_generate1("dec",$key,$rounds); +$code.=<<___; + xorps @tweak[1],$inout0 + movups $inout0,($out) + +.Lxts_dec_steal: + movzb 16($inp),%eax # borrow $rounds ... + movzb ($out),%ecx # ... and $key + lea 1($inp),$inp + mov %al,($out) + mov %cl,16($out) + lea 1($out),$out + sub \$1,$len + jnz .Lxts_dec_steal + + sub $len_,$out # rewind $out + mov $key_,$key # restore $key + mov $rnds_,$rounds # restore $rounds + + movups ($out),$inout0 + xorps @tweak[0],$inout0 +___ + &aesni_generate1("dec",$key,$rounds); +$code.=<<___; + xorps @tweak[0],$inout0 + movups $inout0,($out) + +.Lxts_dec_ret: +___ +$code.=<<___ if ($win64); + movaps 0x60(%rsp),%xmm6 + movaps 0x70(%rsp),%xmm7 + movaps 0x80(%rsp),%xmm8 + movaps 0x90(%rsp),%xmm9 + movaps 0xa0(%rsp),%xmm10 + movaps 0xb0(%rsp),%xmm11 + movaps 0xc0(%rsp),%xmm12 + movaps 0xd0(%rsp),%xmm13 + movaps 0xe0(%rsp),%xmm14 + movaps 0xf0(%rsp),%xmm15 +___ +$code.=<<___; + lea $frame_size(%rsp),%rsp +.Lxts_dec_epilogue: + ret +.size aesni_xts_decrypt,.-aesni_xts_decrypt +___ +} }} + +######################################################################## +# void $PREFIX_cbc_encrypt (const void *inp, void *out, +# size_t length, const AES_KEY *key, +# unsigned char *ivp,const int enc); +{ +my $reserved = $win64?0x40:-0x18; # used in decrypt +$code.=<<___; +.globl ${PREFIX}_cbc_encrypt +.type ${PREFIX}_cbc_encrypt,\@function,6 +.align 16 +${PREFIX}_cbc_encrypt: + test $len,$len # check length + jz .Lcbc_ret + + mov 240($key),$rnds_ # key->rounds + mov $key,$key_ # backup $key + test %r9d,%r9d # 6th argument + jz .Lcbc_decrypt +#--------------------------- CBC ENCRYPT ------------------------------# + movups ($ivp),$inout0 # load iv as initial state + mov $rnds_,$rounds + cmp \$16,$len + jb .Lcbc_enc_tail + sub \$16,$len + jmp .Lcbc_enc_loop +.align 16 +.Lcbc_enc_loop: + movups ($inp),$inout1 # load input + lea 16($inp),$inp + #xorps $inout1,$inout0 +___ + &aesni_generate1("enc",$key,$rounds,$inout0,$inout1); +$code.=<<___; + mov $rnds_,$rounds # restore $rounds + mov $key_,$key # restore $key + movups $inout0,0($out) # store output + lea 16($out),$out + sub \$16,$len + jnc .Lcbc_enc_loop + add \$16,$len + jnz .Lcbc_enc_tail + movups $inout0,($ivp) + jmp .Lcbc_ret + +.Lcbc_enc_tail: + mov $len,%rcx # zaps $key + xchg $inp,$out # $inp is %rsi and $out is %rdi now + .long 0x9066A4F3 # rep movsb + mov \$16,%ecx # zero tail + sub $len,%rcx + xor %eax,%eax + .long 0x9066AAF3 # rep stosb + lea -16(%rdi),%rdi # rewind $out by 1 block + mov $rnds_,$rounds # restore $rounds + mov %rdi,%rsi # $inp and $out are the same + mov $key_,$key # restore $key + xor $len,$len # len=16 + jmp .Lcbc_enc_loop # one more spin + #--------------------------- CBC DECRYPT ------------------------------# +.align 16 +.Lcbc_decrypt: +___ +$code.=<<___ if ($win64); + lea -0x58(%rsp),%rsp + movaps %xmm6,(%rsp) + movaps %xmm7,0x10(%rsp) + movaps %xmm8,0x20(%rsp) + movaps %xmm9,0x30(%rsp) +.Lcbc_decrypt_body: +___ +$code.=<<___; + movups ($ivp),$iv + mov $rnds_,$rounds + cmp \$0x70,$len + jbe .Lcbc_dec_tail + shr \$1,$rnds_ + sub \$0x70,$len + mov $rnds_,$rounds + movaps $iv,$reserved(%rsp) + jmp .Lcbc_dec_loop8_enter +.align 16 +.Lcbc_dec_loop8: + movaps $rndkey0,$reserved(%rsp) # save IV + movups $inout7,($out) + lea 0x10($out),$out +.Lcbc_dec_loop8_enter: + $movkey ($key),$rndkey0 + movups ($inp),$inout0 # load input + movups 0x10($inp),$inout1 + $movkey 16($key),$rndkey1 + + lea 32($key),$key + movdqu 0x20($inp),$inout2 + xorps $rndkey0,$inout0 + movdqu 0x30($inp),$inout3 + xorps $rndkey0,$inout1 + movdqu 0x40($inp),$inout4 + aesdec $rndkey1,$inout0 + pxor $rndkey0,$inout2 + movdqu 0x50($inp),$inout5 + aesdec $rndkey1,$inout1 + pxor $rndkey0,$inout3 + movdqu 0x60($inp),$inout6 + aesdec $rndkey1,$inout2 + pxor $rndkey0,$inout4 + movdqu 0x70($inp),$inout7 + aesdec $rndkey1,$inout3 + pxor $rndkey0,$inout5 + dec $rounds + aesdec $rndkey1,$inout4 + pxor $rndkey0,$inout6 + aesdec $rndkey1,$inout5 + pxor $rndkey0,$inout7 + $movkey ($key),$rndkey0 + aesdec $rndkey1,$inout6 + aesdec $rndkey1,$inout7 + $movkey 16($key),$rndkey1 + + call .Ldec_loop8_enter + + movups ($inp),$rndkey1 # re-load input + movups 0x10($inp),$rndkey0 + xorps $reserved(%rsp),$inout0 # ^= IV + xorps $rndkey1,$inout1 + movups 0x20($inp),$rndkey1 + xorps $rndkey0,$inout2 + movups 0x30($inp),$rndkey0 + xorps $rndkey1,$inout3 + movups 0x40($inp),$rndkey1 + xorps $rndkey0,$inout4 + movups 0x50($inp),$rndkey0 + xorps $rndkey1,$inout5 + movups 0x60($inp),$rndkey1 + xorps $rndkey0,$inout6 + movups 0x70($inp),$rndkey0 # IV + xorps $rndkey1,$inout7 + movups $inout0,($out) + movups $inout1,0x10($out) + movups $inout2,0x20($out) + movups $inout3,0x30($out) + mov $rnds_,$rounds # restore $rounds + movups $inout4,0x40($out) + mov $key_,$key # restore $key + movups $inout5,0x50($out) + lea 0x80($inp),$inp + movups $inout6,0x60($out) + lea 0x70($out),$out + sub \$0x80,$len + ja .Lcbc_dec_loop8 + + movaps $inout7,$inout0 + movaps $rndkey0,$iv + add \$0x70,$len + jle .Lcbc_dec_tail_collected + movups $inout0,($out) + lea 1($rnds_,$rnds_),$rounds + lea 0x10($out),$out +.Lcbc_dec_tail: + movups ($inp),$inout0 + movaps $inout0,$in0 + cmp \$0x10,$len + jbe .Lcbc_dec_one + + movups 0x10($inp),$inout1 + movaps $inout1,$in1 + cmp \$0x20,$len + jbe .Lcbc_dec_two + + movups 0x20($inp),$inout2 + movaps $inout2,$in2 + cmp \$0x30,$len + jbe .Lcbc_dec_three + + movups 0x30($inp),$inout3 + cmp \$0x40,$len + jbe .Lcbc_dec_four + + movups 0x40($inp),$inout4 + cmp \$0x50,$len + jbe .Lcbc_dec_five + + movups 0x50($inp),$inout5 + cmp \$0x60,$len + jbe .Lcbc_dec_six + + movups 0x60($inp),$inout6 + movaps $iv,$reserved(%rsp) # save IV + call _aesni_decrypt8 + movups ($inp),$rndkey1 + movups 0x10($inp),$rndkey0 + xorps $reserved(%rsp),$inout0 # ^= IV + xorps $rndkey1,$inout1 + movups 0x20($inp),$rndkey1 + xorps $rndkey0,$inout2 + movups 0x30($inp),$rndkey0 + xorps $rndkey1,$inout3 + movups 0x40($inp),$rndkey1 + xorps $rndkey0,$inout4 + movups 0x50($inp),$rndkey0 + xorps $rndkey1,$inout5 + movups 0x60($inp),$iv # IV + xorps $rndkey0,$inout6 + movups $inout0,($out) + movups $inout1,0x10($out) + movups $inout2,0x20($out) + movups $inout3,0x30($out) + movups $inout4,0x40($out) + movups $inout5,0x50($out) + lea 0x60($out),$out + movaps $inout6,$inout0 + sub \$0x70,$len + jmp .Lcbc_dec_tail_collected +.align 16 +.Lcbc_dec_one: +___ + &aesni_generate1("dec",$key,$rounds); +$code.=<<___; + xorps $iv,$inout0 + movaps $in0,$iv + sub \$0x10,$len + jmp .Lcbc_dec_tail_collected +.align 16 +.Lcbc_dec_two: + xorps $inout2,$inout2 + call _aesni_decrypt3 + xorps $iv,$inout0 + xorps $in0,$inout1 + movups $inout0,($out) + movaps $in1,$iv + movaps $inout1,$inout0 + lea 0x10($out),$out + sub \$0x20,$len + jmp .Lcbc_dec_tail_collected +.align 16 +.Lcbc_dec_three: + call _aesni_decrypt3 + xorps $iv,$inout0 + xorps $in0,$inout1 + movups $inout0,($out) + xorps $in1,$inout2 + movups $inout1,0x10($out) + movaps $in2,$iv + movaps $inout2,$inout0 + lea 0x20($out),$out + sub \$0x30,$len + jmp .Lcbc_dec_tail_collected +.align 16 +.Lcbc_dec_four: + call _aesni_decrypt4 + xorps $iv,$inout0 + movups 0x30($inp),$iv + xorps $in0,$inout1 + movups $inout0,($out) + xorps $in1,$inout2 + movups $inout1,0x10($out) + xorps $in2,$inout3 + movups $inout2,0x20($out) + movaps $inout3,$inout0 + lea 0x30($out),$out + sub \$0x40,$len + jmp .Lcbc_dec_tail_collected +.align 16 +.Lcbc_dec_five: + xorps $inout5,$inout5 + call _aesni_decrypt6 + movups 0x10($inp),$rndkey1 + movups 0x20($inp),$rndkey0 + xorps $iv,$inout0 + xorps $in0,$inout1 + xorps $rndkey1,$inout2 + movups 0x30($inp),$rndkey1 + xorps $rndkey0,$inout3 + movups 0x40($inp),$iv + xorps $rndkey1,$inout4 + movups $inout0,($out) + movups $inout1,0x10($out) + movups $inout2,0x20($out) + movups $inout3,0x30($out) + lea 0x40($out),$out + movaps $inout4,$inout0 + sub \$0x50,$len + jmp .Lcbc_dec_tail_collected +.align 16 +.Lcbc_dec_six: + call _aesni_decrypt6 + movups 0x10($inp),$rndkey1 + movups 0x20($inp),$rndkey0 + xorps $iv,$inout0 + xorps $in0,$inout1 + xorps $rndkey1,$inout2 + movups 0x30($inp),$rndkey1 + xorps $rndkey0,$inout3 + movups 0x40($inp),$rndkey0 + xorps $rndkey1,$inout4 + movups 0x50($inp),$iv + xorps $rndkey0,$inout5 + movups $inout0,($out) + movups $inout1,0x10($out) + movups $inout2,0x20($out) + movups $inout3,0x30($out) + movups $inout4,0x40($out) + lea 0x50($out),$out + movaps $inout5,$inout0 + sub \$0x60,$len + jmp .Lcbc_dec_tail_collected +.align 16 +.Lcbc_dec_tail_collected: + and \$15,$len + movups $iv,($ivp) + jnz .Lcbc_dec_tail_partial + movups $inout0,($out) + jmp .Lcbc_dec_ret +.align 16 +.Lcbc_dec_tail_partial: + movaps $inout0,$reserved(%rsp) + mov \$16,%rcx + mov $out,%rdi + sub $len,%rcx + lea $reserved(%rsp),%rsi + .long 0x9066A4F3 # rep movsb + +.Lcbc_dec_ret: +___ +$code.=<<___ if ($win64); + movaps (%rsp),%xmm6 + movaps 0x10(%rsp),%xmm7 + movaps 0x20(%rsp),%xmm8 + movaps 0x30(%rsp),%xmm9 + lea 0x58(%rsp),%rsp +___ +$code.=<<___; +.Lcbc_ret: + ret +.size ${PREFIX}_cbc_encrypt,.-${PREFIX}_cbc_encrypt +___ +} +# int $PREFIX_set_[en|de]crypt_key (const unsigned char *userKey, +# int bits, AES_KEY *key) +{ my ($inp,$bits,$key) = @_4args; + $bits =~ s/%r/%e/; + +$code.=<<___; +.globl ${PREFIX}_set_decrypt_key +.type ${PREFIX}_set_decrypt_key,\@abi-omnipotent +.align 16 +${PREFIX}_set_decrypt_key: + .byte 0x48,0x83,0xEC,0x08 # sub rsp,8 + call __aesni_set_encrypt_key + shl \$4,$bits # rounds-1 after _aesni_set_encrypt_key + test %eax,%eax + jnz .Ldec_key_ret + lea 16($key,$bits),$inp # points at the end of key schedule + + $movkey ($key),%xmm0 # just swap + $movkey ($inp),%xmm1 + $movkey %xmm0,($inp) + $movkey %xmm1,($key) + lea 16($key),$key + lea -16($inp),$inp + +.Ldec_key_inverse: + $movkey ($key),%xmm0 # swap and inverse + $movkey ($inp),%xmm1 + aesimc %xmm0,%xmm0 + aesimc %xmm1,%xmm1 + lea 16($key),$key + lea -16($inp),$inp + $movkey %xmm0,16($inp) + $movkey %xmm1,-16($key) + cmp $key,$inp + ja .Ldec_key_inverse + + $movkey ($key),%xmm0 # inverse middle + aesimc %xmm0,%xmm0 + $movkey %xmm0,($inp) +.Ldec_key_ret: + add \$8,%rsp + ret +.LSEH_end_set_decrypt_key: +.size ${PREFIX}_set_decrypt_key,.-${PREFIX}_set_decrypt_key +___ + +# This is based on submission by +# +# Huang Ying +# Vinodh Gopal +# Kahraman Akdemir +# +# Agressively optimized in respect to aeskeygenassist's critical path +# and is contained in %xmm0-5 to meet Win64 ABI requirement. +# +$code.=<<___; +.globl ${PREFIX}_set_encrypt_key +.type ${PREFIX}_set_encrypt_key,\@abi-omnipotent +.align 16 +${PREFIX}_set_encrypt_key: +__aesni_set_encrypt_key: + .byte 0x48,0x83,0xEC,0x08 # sub rsp,8 + mov \$-1,%rax + test $inp,$inp + jz .Lenc_key_ret + test $key,$key + jz .Lenc_key_ret + + movups ($inp),%xmm0 # pull first 128 bits of *userKey + xorps %xmm4,%xmm4 # low dword of xmm4 is assumed 0 + lea 16($key),%rax + cmp \$256,$bits + je .L14rounds + cmp \$192,$bits + je .L12rounds + cmp \$128,$bits + jne .Lbad_keybits + +.L10rounds: + mov \$9,$bits # 10 rounds for 128-bit key + $movkey %xmm0,($key) # round 0 + aeskeygenassist \$0x1,%xmm0,%xmm1 # round 1 + call .Lkey_expansion_128_cold + aeskeygenassist \$0x2,%xmm0,%xmm1 # round 2 + call .Lkey_expansion_128 + aeskeygenassist \$0x4,%xmm0,%xmm1 # round 3 + call .Lkey_expansion_128 + aeskeygenassist \$0x8,%xmm0,%xmm1 # round 4 + call .Lkey_expansion_128 + aeskeygenassist \$0x10,%xmm0,%xmm1 # round 5 + call .Lkey_expansion_128 + aeskeygenassist \$0x20,%xmm0,%xmm1 # round 6 + call .Lkey_expansion_128 + aeskeygenassist \$0x40,%xmm0,%xmm1 # round 7 + call .Lkey_expansion_128 + aeskeygenassist \$0x80,%xmm0,%xmm1 # round 8 + call .Lkey_expansion_128 + aeskeygenassist \$0x1b,%xmm0,%xmm1 # round 9 + call .Lkey_expansion_128 + aeskeygenassist \$0x36,%xmm0,%xmm1 # round 10 + call .Lkey_expansion_128 + $movkey %xmm0,(%rax) + mov $bits,80(%rax) # 240(%rdx) + xor %eax,%eax + jmp .Lenc_key_ret + +.align 16 +.L12rounds: + movq 16($inp),%xmm2 # remaining 1/3 of *userKey + mov \$11,$bits # 12 rounds for 192 + $movkey %xmm0,($key) # round 0 + aeskeygenassist \$0x1,%xmm2,%xmm1 # round 1,2 + call .Lkey_expansion_192a_cold + aeskeygenassist \$0x2,%xmm2,%xmm1 # round 2,3 + call .Lkey_expansion_192b + aeskeygenassist \$0x4,%xmm2,%xmm1 # round 4,5 + call .Lkey_expansion_192a + aeskeygenassist \$0x8,%xmm2,%xmm1 # round 5,6 + call .Lkey_expansion_192b + aeskeygenassist \$0x10,%xmm2,%xmm1 # round 7,8 + call .Lkey_expansion_192a + aeskeygenassist \$0x20,%xmm2,%xmm1 # round 8,9 + call .Lkey_expansion_192b + aeskeygenassist \$0x40,%xmm2,%xmm1 # round 10,11 + call .Lkey_expansion_192a + aeskeygenassist \$0x80,%xmm2,%xmm1 # round 11,12 + call .Lkey_expansion_192b + $movkey %xmm0,(%rax) + mov $bits,48(%rax) # 240(%rdx) + xor %rax, %rax + jmp .Lenc_key_ret + +.align 16 +.L14rounds: + movups 16($inp),%xmm2 # remaning half of *userKey + mov \$13,$bits # 14 rounds for 256 + lea 16(%rax),%rax + $movkey %xmm0,($key) # round 0 + $movkey %xmm2,16($key) # round 1 + aeskeygenassist \$0x1,%xmm2,%xmm1 # round 2 + call .Lkey_expansion_256a_cold + aeskeygenassist \$0x1,%xmm0,%xmm1 # round 3 + call .Lkey_expansion_256b + aeskeygenassist \$0x2,%xmm2,%xmm1 # round 4 + call .Lkey_expansion_256a + aeskeygenassist \$0x2,%xmm0,%xmm1 # round 5 + call .Lkey_expansion_256b + aeskeygenassist \$0x4,%xmm2,%xmm1 # round 6 + call .Lkey_expansion_256a + aeskeygenassist \$0x4,%xmm0,%xmm1 # round 7 + call .Lkey_expansion_256b + aeskeygenassist \$0x8,%xmm2,%xmm1 # round 8 + call .Lkey_expansion_256a + aeskeygenassist \$0x8,%xmm0,%xmm1 # round 9 + call .Lkey_expansion_256b + aeskeygenassist \$0x10,%xmm2,%xmm1 # round 10 + call .Lkey_expansion_256a + aeskeygenassist \$0x10,%xmm0,%xmm1 # round 11 + call .Lkey_expansion_256b + aeskeygenassist \$0x20,%xmm2,%xmm1 # round 12 + call .Lkey_expansion_256a + aeskeygenassist \$0x20,%xmm0,%xmm1 # round 13 + call .Lkey_expansion_256b + aeskeygenassist \$0x40,%xmm2,%xmm1 # round 14 + call .Lkey_expansion_256a + $movkey %xmm0,(%rax) + mov $bits,16(%rax) # 240(%rdx) + xor %rax,%rax + jmp .Lenc_key_ret + +.align 16 +.Lbad_keybits: + mov \$-2,%rax +.Lenc_key_ret: + add \$8,%rsp + ret +.LSEH_end_set_encrypt_key: + +.align 16 +.Lkey_expansion_128: + $movkey %xmm0,(%rax) + lea 16(%rax),%rax +.Lkey_expansion_128_cold: + shufps \$0b00010000,%xmm0,%xmm4 + xorps %xmm4, %xmm0 + shufps \$0b10001100,%xmm0,%xmm4 + xorps %xmm4, %xmm0 + shufps \$0b11111111,%xmm1,%xmm1 # critical path + xorps %xmm1,%xmm0 + ret + +.align 16 +.Lkey_expansion_192a: + $movkey %xmm0,(%rax) + lea 16(%rax),%rax +.Lkey_expansion_192a_cold: + movaps %xmm2, %xmm5 +.Lkey_expansion_192b_warm: + shufps \$0b00010000,%xmm0,%xmm4 + movdqa %xmm2,%xmm3 + xorps %xmm4,%xmm0 + shufps \$0b10001100,%xmm0,%xmm4 + pslldq \$4,%xmm3 + xorps %xmm4,%xmm0 + pshufd \$0b01010101,%xmm1,%xmm1 # critical path + pxor %xmm3,%xmm2 + pxor %xmm1,%xmm0 + pshufd \$0b11111111,%xmm0,%xmm3 + pxor %xmm3,%xmm2 + ret + +.align 16 +.Lkey_expansion_192b: + movaps %xmm0,%xmm3 + shufps \$0b01000100,%xmm0,%xmm5 + $movkey %xmm5,(%rax) + shufps \$0b01001110,%xmm2,%xmm3 + $movkey %xmm3,16(%rax) + lea 32(%rax),%rax + jmp .Lkey_expansion_192b_warm + +.align 16 +.Lkey_expansion_256a: + $movkey %xmm2,(%rax) + lea 16(%rax),%rax +.Lkey_expansion_256a_cold: + shufps \$0b00010000,%xmm0,%xmm4 + xorps %xmm4,%xmm0 + shufps \$0b10001100,%xmm0,%xmm4 + xorps %xmm4,%xmm0 + shufps \$0b11111111,%xmm1,%xmm1 # critical path + xorps %xmm1,%xmm0 + ret + +.align 16 +.Lkey_expansion_256b: + $movkey %xmm0,(%rax) + lea 16(%rax),%rax + + shufps \$0b00010000,%xmm2,%xmm4 + xorps %xmm4,%xmm2 + shufps \$0b10001100,%xmm2,%xmm4 + xorps %xmm4,%xmm2 + shufps \$0b10101010,%xmm1,%xmm1 # critical path + xorps %xmm1,%xmm2 + ret +.size ${PREFIX}_set_encrypt_key,.-${PREFIX}_set_encrypt_key +.size __aesni_set_encrypt_key,.-__aesni_set_encrypt_key +___ +} + +$code.=<<___; +.align 64 +.Lbswap_mask: + .byte 15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0 +.Lincrement32: + .long 6,6,6,0 +.Lincrement64: + .long 1,0,0,0 +.Lxts_magic: + .long 0x87,0,1,0 + +.asciz "AES for Intel AES-NI, CRYPTOGAMS by " +.align 64 +___ + +# EXCEPTION_DISPOSITION handler (EXCEPTION_RECORD *rec,ULONG64 frame, +# CONTEXT *context,DISPATCHER_CONTEXT *disp) +if ($win64) { +$rec="%rcx"; +$frame="%rdx"; +$context="%r8"; +$disp="%r9"; + +$code.=<<___; +.extern __imp_RtlVirtualUnwind +___ +$code.=<<___ if ($PREFIX eq "aesni"); +.type ecb_se_handler,\@abi-omnipotent +.align 16 +ecb_se_handler: + push %rsi + push %rdi + push %rbx + push %rbp + push %r12 + push %r13 + push %r14 + push %r15 + pushfq + sub \$64,%rsp + + mov 152($context),%rax # pull context->Rsp + + jmp .Lcommon_seh_tail +.size ecb_se_handler,.-ecb_se_handler + +.type ccm64_se_handler,\@abi-omnipotent +.align 16 +ccm64_se_handler: + push %rsi + push %rdi + push %rbx + push %rbp + push %r12 + push %r13 + push %r14 + push %r15 + pushfq + sub \$64,%rsp + + mov 120($context),%rax # pull context->Rax + mov 248($context),%rbx # pull context->Rip + + mov 8($disp),%rsi # disp->ImageBase + mov 56($disp),%r11 # disp->HandlerData + + mov 0(%r11),%r10d # HandlerData[0] + lea (%rsi,%r10),%r10 # prologue label + cmp %r10,%rbx # context->RipRsp + + mov 4(%r11),%r10d # HandlerData[1] + lea (%rsi,%r10),%r10 # epilogue label + cmp %r10,%rbx # context->Rip>=epilogue label + jae .Lcommon_seh_tail + + lea 0(%rax),%rsi # %xmm save area + lea 512($context),%rdi # &context.Xmm6 + mov \$8,%ecx # 4*sizeof(%xmm0)/sizeof(%rax) + .long 0xa548f3fc # cld; rep movsq + lea 0x58(%rax),%rax # adjust stack pointer + + jmp .Lcommon_seh_tail +.size ccm64_se_handler,.-ccm64_se_handler + +.type ctr32_se_handler,\@abi-omnipotent +.align 16 +ctr32_se_handler: + push %rsi + push %rdi + push %rbx + push %rbp + push %r12 + push %r13 + push %r14 + push %r15 + pushfq + sub \$64,%rsp + + mov 120($context),%rax # pull context->Rax + mov 248($context),%rbx # pull context->Rip + + lea .Lctr32_body(%rip),%r10 + cmp %r10,%rbx # context->Rip<"prologue" label + jb .Lcommon_seh_tail + + mov 152($context),%rax # pull context->Rsp + + lea .Lctr32_ret(%rip),%r10 + cmp %r10,%rbx + jae .Lcommon_seh_tail + + lea 0x20(%rax),%rsi # %xmm save area + lea 512($context),%rdi # &context.Xmm6 + mov \$20,%ecx # 10*sizeof(%xmm0)/sizeof(%rax) + .long 0xa548f3fc # cld; rep movsq + lea 0xc8(%rax),%rax # adjust stack pointer + + jmp .Lcommon_seh_tail +.size ctr32_se_handler,.-ctr32_se_handler + +.type xts_se_handler,\@abi-omnipotent +.align 16 +xts_se_handler: + push %rsi + push %rdi + push %rbx + push %rbp + push %r12 + push %r13 + push %r14 + push %r15 + pushfq + sub \$64,%rsp + + mov 120($context),%rax # pull context->Rax + mov 248($context),%rbx # pull context->Rip + + mov 8($disp),%rsi # disp->ImageBase + mov 56($disp),%r11 # disp->HandlerData + + mov 0(%r11),%r10d # HandlerData[0] + lea (%rsi,%r10),%r10 # prologue lable + cmp %r10,%rbx # context->RipRsp + + mov 4(%r11),%r10d # HandlerData[1] + lea (%rsi,%r10),%r10 # epilogue label + cmp %r10,%rbx # context->Rip>=epilogue label + jae .Lcommon_seh_tail + + lea 0x60(%rax),%rsi # %xmm save area + lea 512($context),%rdi # & context.Xmm6 + mov \$20,%ecx # 10*sizeof(%xmm0)/sizeof(%rax) + .long 0xa548f3fc # cld; rep movsq + lea 0x68+160(%rax),%rax # adjust stack pointer + + jmp .Lcommon_seh_tail +.size xts_se_handler,.-xts_se_handler +___ +$code.=<<___; +.type cbc_se_handler,\@abi-omnipotent +.align 16 +cbc_se_handler: + push %rsi + push %rdi + push %rbx + push %rbp + push %r12 + push %r13 + push %r14 + push %r15 + pushfq + sub \$64,%rsp + + mov 152($context),%rax # pull context->Rsp + mov 248($context),%rbx # pull context->Rip + + lea .Lcbc_decrypt(%rip),%r10 + cmp %r10,%rbx # context->Rip<"prologue" label + jb .Lcommon_seh_tail + + lea .Lcbc_decrypt_body(%rip),%r10 + cmp %r10,%rbx # context->RipRip>="epilogue" label + jae .Lcommon_seh_tail + + lea 0(%rax),%rsi # top of stack + lea 512($context),%rdi # &context.Xmm6 + mov \$8,%ecx # 4*sizeof(%xmm0)/sizeof(%rax) + .long 0xa548f3fc # cld; rep movsq + lea 0x58(%rax),%rax # adjust stack pointer + jmp .Lcommon_seh_tail + +.Lrestore_cbc_rax: + mov 120($context),%rax + +.Lcommon_seh_tail: + mov 8(%rax),%rdi + mov 16(%rax),%rsi + mov %rax,152($context) # restore context->Rsp + mov %rsi,168($context) # restore context->Rsi + mov %rdi,176($context) # restore context->Rdi + + mov 40($disp),%rdi # disp->ContextRecord + mov $context,%rsi # context + mov \$154,%ecx # sizeof(CONTEXT) + .long 0xa548f3fc # cld; rep movsq + + mov $disp,%rsi + xor %rcx,%rcx # arg1, UNW_FLAG_NHANDLER + mov 8(%rsi),%rdx # arg2, disp->ImageBase + mov 0(%rsi),%r8 # arg3, disp->ControlPc + mov 16(%rsi),%r9 # arg4, disp->FunctionEntry + mov 40(%rsi),%r10 # disp->ContextRecord + lea 56(%rsi),%r11 # &disp->HandlerData + lea 24(%rsi),%r12 # &disp->EstablisherFrame + mov %r10,32(%rsp) # arg5 + mov %r11,40(%rsp) # arg6 + mov %r12,48(%rsp) # arg7 + mov %rcx,56(%rsp) # arg8, (NULL) + call *__imp_RtlVirtualUnwind(%rip) + + mov \$1,%eax # ExceptionContinueSearch + add \$64,%rsp + popfq + pop %r15 + pop %r14 + pop %r13 + pop %r12 + pop %rbp + pop %rbx + pop %rdi + pop %rsi + ret +.size cbc_se_handler,.-cbc_se_handler + +.section .pdata +.align 4 +___ +$code.=<<___ if ($PREFIX eq "aesni"); + .rva .LSEH_begin_aesni_ecb_encrypt + .rva .LSEH_end_aesni_ecb_encrypt + .rva .LSEH_info_ecb + + .rva .LSEH_begin_aesni_ccm64_encrypt_blocks + .rva .LSEH_end_aesni_ccm64_encrypt_blocks + .rva .LSEH_info_ccm64_enc + + .rva .LSEH_begin_aesni_ccm64_decrypt_blocks + .rva .LSEH_end_aesni_ccm64_decrypt_blocks + .rva .LSEH_info_ccm64_dec + + .rva .LSEH_begin_aesni_ctr32_encrypt_blocks + .rva .LSEH_end_aesni_ctr32_encrypt_blocks + .rva .LSEH_info_ctr32 + + .rva .LSEH_begin_aesni_xts_encrypt + .rva .LSEH_end_aesni_xts_encrypt + .rva .LSEH_info_xts_enc + + .rva .LSEH_begin_aesni_xts_decrypt + .rva .LSEH_end_aesni_xts_decrypt + .rva .LSEH_info_xts_dec +___ +$code.=<<___; + .rva .LSEH_begin_${PREFIX}_cbc_encrypt + .rva .LSEH_end_${PREFIX}_cbc_encrypt + .rva .LSEH_info_cbc + + .rva ${PREFIX}_set_decrypt_key + .rva .LSEH_end_set_decrypt_key + .rva .LSEH_info_key + + .rva ${PREFIX}_set_encrypt_key + .rva .LSEH_end_set_encrypt_key + .rva .LSEH_info_key +.section .xdata +.align 8 +___ +$code.=<<___ if ($PREFIX eq "aesni"); +.LSEH_info_ecb: + .byte 9,0,0,0 + .rva ecb_se_handler +.LSEH_info_ccm64_enc: + .byte 9,0,0,0 + .rva ccm64_se_handler + .rva .Lccm64_enc_body,.Lccm64_enc_ret # HandlerData[] +.LSEH_info_ccm64_dec: + .byte 9,0,0,0 + .rva ccm64_se_handler + .rva .Lccm64_dec_body,.Lccm64_dec_ret # HandlerData[] +.LSEH_info_ctr32: + .byte 9,0,0,0 + .rva ctr32_se_handler +.LSEH_info_xts_enc: + .byte 9,0,0,0 + .rva xts_se_handler + .rva .Lxts_enc_body,.Lxts_enc_epilogue # HandlerData[] +.LSEH_info_xts_dec: + .byte 9,0,0,0 + .rva xts_se_handler + .rva .Lxts_dec_body,.Lxts_dec_epilogue # HandlerData[] +___ +$code.=<<___; +.LSEH_info_cbc: + .byte 9,0,0,0 + .rva cbc_se_handler +.LSEH_info_key: + .byte 0x01,0x04,0x01,0x00 + .byte 0x04,0x02,0x00,0x00 # sub rsp,8 +___ +} + +sub rex { + local *opcode=shift; + my ($dst,$src)=@_; + my $rex=0; + + $rex|=0x04 if($dst>=8); + $rex|=0x01 if($src>=8); + push @opcode,$rex|0x40 if($rex); +} + +sub aesni { + my $line=shift; + my @opcode=(0x66); + + if ($line=~/(aeskeygenassist)\s+\$([x0-9a-f]+),\s*%xmm([0-9]+),\s*%xmm([0-9]+)/) { + rex(\@opcode,$4,$3); + push @opcode,0x0f,0x3a,0xdf; + push @opcode,0xc0|($3&7)|(($4&7)<<3); # ModR/M + my $c=$2; + push @opcode,$c=~/^0/?oct($c):$c; + return ".byte\t".join(',',@opcode); + } + elsif ($line=~/(aes[a-z]+)\s+%xmm([0-9]+),\s*%xmm([0-9]+)/) { + my %opcodelet = ( + "aesimc" => 0xdb, + "aesenc" => 0xdc, "aesenclast" => 0xdd, + "aesdec" => 0xde, "aesdeclast" => 0xdf + ); + return undef if (!defined($opcodelet{$1})); + rex(\@opcode,$3,$2); + push @opcode,0x0f,0x38,$opcodelet{$1}; + push @opcode,0xc0|($2&7)|(($3&7)<<3); # ModR/M + return ".byte\t".join(',',@opcode); + } + return $line; +} + +$code =~ s/\`([^\`]*)\`/eval($1)/gem; +$code =~ s/\b(aes.*%xmm[0-9]+).*$/aesni($1)/gem; + +print $code; + +close STDOUT; diff --git a/crypto/openssl/crypto/aes/asm/bsaes-x86_64.pl b/crypto/openssl/crypto/aes/asm/bsaes-x86_64.pl new file mode 100644 index 0000000000..ff7e3afe82 --- /dev/null +++ b/crypto/openssl/crypto/aes/asm/bsaes-x86_64.pl @@ -0,0 +1,3004 @@ +#!/usr/bin/env perl + +################################################################### +### AES-128 [originally in CTR mode] ### +### bitsliced implementation for Intel Core 2 processors ### +### requires support of SSE extensions up to SSSE3 ### +### Author: Emilia Käsper and Peter Schwabe ### +### Date: 2009-03-19 ### +### Public domain ### +### ### +### See http://homes.esat.kuleuven.be/~ekasper/#software for ### +### further information. ### +################################################################### +# +# September 2011. +# +# Started as transliteration to "perlasm" the original code has +# undergone following changes: +# +# - code was made position-independent; +# - rounds were folded into a loop resulting in >5x size reduction +# from 12.5KB to 2.2KB; +# - above was possibile thanks to mixcolumns() modification that +# allowed to feed its output back to aesenc[last], this was +# achieved at cost of two additional inter-registers moves; +# - some instruction reordering and interleaving; +# - this module doesn't implement key setup subroutine, instead it +# relies on conversion of "conventional" key schedule as returned +# by AES_set_encrypt_key (see discussion below); +# - first and last round keys are treated differently, which allowed +# to skip one shiftrows(), reduce bit-sliced key schedule and +# speed-up conversion by 22%; +# - support for 192- and 256-bit keys was added; +# +# Resulting performance in CPU cycles spent to encrypt one byte out +# of 4096-byte buffer with 128-bit key is: +# +# Emilia's this(*) difference +# +# Core 2 9.30 8.69 +7% +# Nehalem(**) 7.63 6.98 +9% +# Atom 17.1 17.4 -2%(***) +# +# (*) Comparison is not completely fair, because "this" is ECB, +# i.e. no extra processing such as counter values calculation +# and xor-ing input as in Emilia's CTR implementation is +# performed. However, the CTR calculations stand for not more +# than 1% of total time, so comparison is *rather* fair. +# +# (**) Results were collected on Westmere, which is considered to +# be equivalent to Nehalem for this code. +# +# (***) Slowdown on Atom is rather strange per se, because original +# implementation has a number of 9+-bytes instructions, which +# are bad for Atom front-end, and which I eliminated completely. +# In attempt to address deterioration sbox() was tested in FP +# SIMD "domain" (movaps instead of movdqa, xorps instead of +# pxor, etc.). While it resulted in nominal 4% improvement on +# Atom, it hurted Westmere by more than 2x factor. +# +# As for key schedule conversion subroutine. Interface to OpenSSL +# relies on per-invocation on-the-fly conversion. This naturally +# has impact on performance, especially for short inputs. Conversion +# time in CPU cycles and its ratio to CPU cycles spent in 8x block +# function is: +# +# conversion conversion/8x block +# Core 2 410 0.37 +# Nehalem 310 0.35 +# Atom 570 0.26 +# +# The ratio values mean that 128-byte blocks will be processed +# 21-27% slower, 256-byte blocks - 12-16%, 384-byte blocks - 8-11%, +# etc. Then keep in mind that input sizes not divisible by 128 are +# *effectively* slower, especially shortest ones, e.g. consecutive +# 144-byte blocks are processed 44% slower than one would expect, +# 272 - 29%, 400 - 22%, etc. Yet, despite all these "shortcomings" +# it's still faster than ["hyper-threading-safe" code path in] +# aes-x86_64.pl on all lengths above 64 bytes... +# +# October 2011. +# +# Add decryption procedure. Performance in CPU cycles spent to decrypt +# one byte out of 4096-byte buffer with 128-bit key is: +# +# Core 2 11.0 +# Nehalem 9.16 +# +# November 2011. +# +# Add bsaes_xts_[en|de]crypt. Less-than-80-bytes-block performance is +# suboptimal, but XTS is meant to be used with larger blocks... +# +# + +$flavour = shift; +$output = shift; +if ($flavour =~ /\./) { $output = $flavour; undef $flavour; } + +$win64=0; $win64=1 if ($flavour =~ /[nm]asm|mingw64/ || $output =~ /\.asm$/); + +$0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1; +( $xlate="${dir}x86_64-xlate.pl" and -f $xlate ) or +( $xlate="${dir}../../perlasm/x86_64-xlate.pl" and -f $xlate) or +die "can't locate x86_64-xlate.pl"; + +open STDOUT,"| $^X $xlate $flavour $output"; + +my ($inp,$out,$len,$key,$ivp)=("%rdi","%rsi","%rdx","%rcx"); +my @XMM=map("%xmm$_",(15,0..14)); # best on Atom, +10% over (0..15) +my $ecb=0; # suppress unreferenced ECB subroutines, spare some space... + +{ +my ($key,$rounds,$const)=("%rax","%r10d","%r11"); + +sub Sbox { +# input in lsb > [b0, b1, b2, b3, b4, b5, b6, b7] < msb +# output in lsb > [b0, b1, b4, b6, b3, b7, b2, b5] < msb +my @b=@_[0..7]; +my @t=@_[8..11]; +my @s=@_[12..15]; + &InBasisChange (@b); + &Inv_GF256 (@b[6,5,0,3,7,1,4,2],@t,@s); + &OutBasisChange (@b[7,1,4,2,6,5,0,3]); +} + +sub InBasisChange { +# input in lsb > [b0, b1, b2, b3, b4, b5, b6, b7] < msb +# output in lsb > [b6, b5, b0, b3, b7, b1, b4, b2] < msb +my @b=@_[0..7]; +$code.=<<___; + pxor @b[6], @b[5] + pxor @b[1], @b[2] + pxor @b[0], @b[3] + pxor @b[2], @b[6] + pxor @b[0], @b[5] + + pxor @b[3], @b[6] + pxor @b[7], @b[3] + pxor @b[5], @b[7] + pxor @b[4], @b[3] + pxor @b[5], @b[4] + pxor @b[1], @b[3] + + pxor @b[7], @b[2] + pxor @b[5], @b[1] +___ +} + +sub OutBasisChange { +# input in lsb > [b0, b1, b2, b3, b4, b5, b6, b7] < msb +# output in lsb > [b6, b1, b2, b4, b7, b0, b3, b5] < msb +my @b=@_[0..7]; +$code.=<<___; + pxor @b[6], @b[0] + pxor @b[4], @b[1] + pxor @b[0], @b[2] + pxor @b[6], @b[4] + pxor @b[1], @b[6] + + pxor @b[5], @b[1] + pxor @b[3], @b[5] + pxor @b[7], @b[3] + pxor @b[5], @b[7] + pxor @b[5], @b[2] + + pxor @b[7], @b[4] +___ +} + +sub InvSbox { +# input in lsb > [b0, b1, b2, b3, b4, b5, b6, b7] < msb +# output in lsb > [b0, b1, b6, b4, b2, b7, b3, b5] < msb +my @b=@_[0..7]; +my @t=@_[8..11]; +my @s=@_[12..15]; + &InvInBasisChange (@b); + &Inv_GF256 (@b[5,1,2,6,3,7,0,4],@t,@s); + &InvOutBasisChange (@b[3,7,0,4,5,1,2,6]); +} + +sub InvInBasisChange { # OutBasisChange in reverse +my @b=@_[5,1,2,6,3,7,0,4]; +$code.=<<___ + pxor @b[7], @b[4] + + pxor @b[5], @b[7] + pxor @b[5], @b[2] + pxor @b[7], @b[3] + pxor @b[3], @b[5] + pxor @b[5], @b[1] + + pxor @b[1], @b[6] + pxor @b[0], @b[2] + pxor @b[6], @b[4] + pxor @b[6], @b[0] + pxor @b[4], @b[1] +___ +} + +sub InvOutBasisChange { # InBasisChange in reverse +my @b=@_[2,5,7,3,6,1,0,4]; +$code.=<<___; + pxor @b[5], @b[1] + pxor @b[7], @b[2] + + pxor @b[1], @b[3] + pxor @b[5], @b[4] + pxor @b[5], @b[7] + pxor @b[4], @b[3] + pxor @b[0], @b[5] + pxor @b[7], @b[3] + pxor @b[2], @b[6] + pxor @b[1], @b[2] + pxor @b[3], @b[6] + + pxor @b[0], @b[3] + pxor @b[6], @b[5] +___ +} + +sub Mul_GF4 { +#;************************************************************* +#;* Mul_GF4: Input x0-x1,y0-y1 Output x0-x1 Temp t0 (8) * +#;************************************************************* +my ($x0,$x1,$y0,$y1,$t0)=@_; +$code.=<<___; + movdqa $y0, $t0 + pxor $y1, $t0 + pand $x0, $t0 + pxor $x1, $x0 + pand $y0, $x1 + pand $y1, $x0 + pxor $x1, $x0 + pxor $t0, $x1 +___ +} + +sub Mul_GF4_N { # not used, see next subroutine +# multiply and scale by N +my ($x0,$x1,$y0,$y1,$t0)=@_; +$code.=<<___; + movdqa $y0, $t0 + pxor $y1, $t0 + pand $x0, $t0 + pxor $x1, $x0 + pand $y0, $x1 + pand $y1, $x0 + pxor $x0, $x1 + pxor $t0, $x0 +___ +} + +sub Mul_GF4_N_GF4 { +# interleaved Mul_GF4_N and Mul_GF4 +my ($x0,$x1,$y0,$y1,$t0, + $x2,$x3,$y2,$y3,$t1)=@_; +$code.=<<___; + movdqa $y0, $t0 + movdqa $y2, $t1 + pxor $y1, $t0 + pxor $y3, $t1 + pand $x0, $t0 + pand $x2, $t1 + pxor $x1, $x0 + pxor $x3, $x2 + pand $y0, $x1 + pand $y2, $x3 + pand $y1, $x0 + pand $y3, $x2 + pxor $x0, $x1 + pxor $x3, $x2 + pxor $t0, $x0 + pxor $t1, $x3 +___ +} +sub Mul_GF16_2 { +my @x=@_[0..7]; +my @y=@_[8..11]; +my @t=@_[12..15]; +$code.=<<___; + movdqa @x[0], @t[0] + movdqa @x[1], @t[1] +___ + &Mul_GF4 (@x[0], @x[1], @y[0], @y[1], @t[2]); +$code.=<<___; + pxor @x[2], @t[0] + pxor @x[3], @t[1] + pxor @y[2], @y[0] + pxor @y[3], @y[1] +___ + Mul_GF4_N_GF4 (@t[0], @t[1], @y[0], @y[1], @t[3], + @x[2], @x[3], @y[2], @y[3], @t[2]); +$code.=<<___; + pxor @t[0], @x[0] + pxor @t[0], @x[2] + pxor @t[1], @x[1] + pxor @t[1], @x[3] + + movdqa @x[4], @t[0] + movdqa @x[5], @t[1] + pxor @x[6], @t[0] + pxor @x[7], @t[1] +___ + &Mul_GF4_N_GF4 (@t[0], @t[1], @y[0], @y[1], @t[3], + @x[6], @x[7], @y[2], @y[3], @t[2]); +$code.=<<___; + pxor @y[2], @y[0] + pxor @y[3], @y[1] +___ + &Mul_GF4 (@x[4], @x[5], @y[0], @y[1], @t[3]); +$code.=<<___; + pxor @t[0], @x[4] + pxor @t[0], @x[6] + pxor @t[1], @x[5] + pxor @t[1], @x[7] +___ +} +sub Inv_GF256 { +#;******************************************************************** +#;* Inv_GF256: Input x0-x7 Output x0-x7 Temp t0-t3,s0-s3 (144) * +#;******************************************************************** +my @x=@_[0..7]; +my @t=@_[8..11]; +my @s=@_[12..15]; +# direct optimizations from hardware +$code.=<<___; + movdqa @x[4], @t[3] + movdqa @x[5], @t[2] + movdqa @x[1], @t[1] + movdqa @x[7], @s[1] + movdqa @x[0], @s[0] + + pxor @x[6], @t[3] + pxor @x[7], @t[2] + pxor @x[3], @t[1] + movdqa @t[3], @s[2] + pxor @x[6], @s[1] + movdqa @t[2], @t[0] + pxor @x[2], @s[0] + movdqa @t[3], @s[3] + + por @t[1], @t[2] + por @s[0], @t[3] + pxor @t[0], @s[3] + pand @s[0], @s[2] + pxor @t[1], @s[0] + pand @t[1], @t[0] + pand @s[0], @s[3] + movdqa @x[3], @s[0] + pxor @x[2], @s[0] + pand @s[0], @s[1] + pxor @s[1], @t[3] + pxor @s[1], @t[2] + movdqa @x[4], @s[1] + movdqa @x[1], @s[0] + pxor @x[5], @s[1] + pxor @x[0], @s[0] + movdqa @s[1], @t[1] + pand @s[0], @s[1] + por @s[0], @t[1] + pxor @s[1], @t[0] + pxor @s[3], @t[3] + pxor @s[2], @t[2] + pxor @s[3], @t[1] + movdqa @x[7], @s[0] + pxor @s[2], @t[0] + movdqa @x[6], @s[1] + pxor @s[2], @t[1] + movdqa @x[5], @s[2] + pand @x[3], @s[0] + movdqa @x[4], @s[3] + pand @x[2], @s[1] + pand @x[1], @s[2] + por @x[0], @s[3] + pxor @s[0], @t[3] + pxor @s[1], @t[2] + pxor @s[2], @t[1] + pxor @s[3], @t[0] + + #Inv_GF16 \t0, \t1, \t2, \t3, \s0, \s1, \s2, \s3 + + # new smaller inversion + + movdqa @t[3], @s[0] + pand @t[1], @t[3] + pxor @t[2], @s[0] + + movdqa @t[0], @s[2] + movdqa @s[0], @s[3] + pxor @t[3], @s[2] + pand @s[2], @s[3] + + movdqa @t[1], @s[1] + pxor @t[2], @s[3] + pxor @t[0], @s[1] + + pxor @t[2], @t[3] + + pand @t[3], @s[1] + + movdqa @s[2], @t[2] + pxor @t[0], @s[1] + + pxor @s[1], @t[2] + pxor @s[1], @t[1] + + pand @t[0], @t[2] + + pxor @t[2], @s[2] + pxor @t[2], @t[1] + + pand @s[3], @s[2] + + pxor @s[0], @s[2] +___ +# output in s3, s2, s1, t1 + +# Mul_GF16_2 \x0, \x1, \x2, \x3, \x4, \x5, \x6, \x7, \t2, \t3, \t0, \t1, \s0, \s1, \s2, \s3 + +# Mul_GF16_2 \x0, \x1, \x2, \x3, \x4, \x5, \x6, \x7, \s3, \s2, \s1, \t1, \s0, \t0, \t2, \t3 + &Mul_GF16_2(@x,@s[3,2,1],@t[1],@s[0],@t[0,2,3]); + +### output msb > [x3,x2,x1,x0,x7,x6,x5,x4] < lsb +} + +# AES linear components + +sub ShiftRows { +my @x=@_[0..7]; +my $mask=pop; +$code.=<<___; + pxor 0x00($key),@x[0] + pxor 0x10($key),@x[1] + pshufb $mask,@x[0] + pxor 0x20($key),@x[2] + pshufb $mask,@x[1] + pxor 0x30($key),@x[3] + pshufb $mask,@x[2] + pxor 0x40($key),@x[4] + pshufb $mask,@x[3] + pxor 0x50($key),@x[5] + pshufb $mask,@x[4] + pxor 0x60($key),@x[6] + pshufb $mask,@x[5] + pxor 0x70($key),@x[7] + pshufb $mask,@x[6] + lea 0x80($key),$key + pshufb $mask,@x[7] +___ +} + +sub MixColumns { +# modified to emit output in order suitable for feeding back to aesenc[last] +my @x=@_[0..7]; +my @t=@_[8..15]; +$code.=<<___; + pshufd \$0x93, @x[0], @t[0] # x0 <<< 32 + pshufd \$0x93, @x[1], @t[1] + pxor @t[0], @x[0] # x0 ^ (x0 <<< 32) + pshufd \$0x93, @x[2], @t[2] + pxor @t[1], @x[1] + pshufd \$0x93, @x[3], @t[3] + pxor @t[2], @x[2] + pshufd \$0x93, @x[4], @t[4] + pxor @t[3], @x[3] + pshufd \$0x93, @x[5], @t[5] + pxor @t[4], @x[4] + pshufd \$0x93, @x[6], @t[6] + pxor @t[5], @x[5] + pshufd \$0x93, @x[7], @t[7] + pxor @t[6], @x[6] + pxor @t[7], @x[7] + + pxor @x[0], @t[1] + pxor @x[7], @t[0] + pxor @x[7], @t[1] + pshufd \$0x4E, @x[0], @x[0] # (x0 ^ (x0 <<< 32)) <<< 64) + pxor @x[1], @t[2] + pshufd \$0x4E, @x[1], @x[1] + pxor @x[4], @t[5] + pxor @t[0], @x[0] + pxor @x[5], @t[6] + pxor @t[1], @x[1] + pxor @x[3], @t[4] + pshufd \$0x4E, @x[4], @t[0] + pxor @x[6], @t[7] + pshufd \$0x4E, @x[5], @t[1] + pxor @x[2], @t[3] + pshufd \$0x4E, @x[3], @x[4] + pxor @x[7], @t[3] + pshufd \$0x4E, @x[7], @x[5] + pxor @x[7], @t[4] + pshufd \$0x4E, @x[6], @x[3] + pxor @t[4], @t[0] + pshufd \$0x4E, @x[2], @x[6] + pxor @t[5], @t[1] + + pxor @t[3], @x[4] + pxor @t[7], @x[5] + pxor @t[6], @x[3] + movdqa @t[0], @x[2] + pxor @t[2], @x[6] + movdqa @t[1], @x[7] +___ +} + +sub InvMixColumns { +my @x=@_[0..7]; +my @t=@_[8..15]; + +$code.=<<___; + # multiplication by 0x0e + pshufd \$0x93, @x[7], @t[7] + movdqa @x[2], @t[2] + pxor @x[5], @x[7] # 7 5 + pxor @x[5], @x[2] # 2 5 + pshufd \$0x93, @x[0], @t[0] + movdqa @x[5], @t[5] + pxor @x[0], @x[5] # 5 0 [1] + pxor @x[1], @x[0] # 0 1 + pshufd \$0x93, @x[1], @t[1] + pxor @x[2], @x[1] # 1 25 + pxor @x[6], @x[0] # 01 6 [2] + pxor @x[3], @x[1] # 125 3 [4] + pshufd \$0x93, @x[3], @t[3] + pxor @x[0], @x[2] # 25 016 [3] + pxor @x[7], @x[3] # 3 75 + pxor @x[6], @x[7] # 75 6 [0] + pshufd \$0x93, @x[6], @t[6] + movdqa @x[4], @t[4] + pxor @x[4], @x[6] # 6 4 + pxor @x[3], @x[4] # 4 375 [6] + pxor @x[7], @x[3] # 375 756=36 + pxor @t[5], @x[6] # 64 5 [7] + pxor @t[2], @x[3] # 36 2 + pxor @t[4], @x[3] # 362 4 [5] + pshufd \$0x93, @t[5], @t[5] +___ + my @y = @x[7,5,0,2,1,3,4,6]; +$code.=<<___; + # multiplication by 0x0b + pxor @y[0], @y[1] + pxor @t[0], @y[0] + pxor @t[1], @y[1] + pshufd \$0x93, @t[2], @t[2] + pxor @t[5], @y[0] + pxor @t[6], @y[1] + pxor @t[7], @y[0] + pshufd \$0x93, @t[4], @t[4] + pxor @t[6], @t[7] # clobber t[7] + pxor @y[0], @y[1] + + pxor @t[0], @y[3] + pshufd \$0x93, @t[0], @t[0] + pxor @t[1], @y[2] + pxor @t[1], @y[4] + pxor @t[2], @y[2] + pshufd \$0x93, @t[1], @t[1] + pxor @t[2], @y[3] + pxor @t[2], @y[5] + pxor @t[7], @y[2] + pshufd \$0x93, @t[2], @t[2] + pxor @t[3], @y[3] + pxor @t[3], @y[6] + pxor @t[3], @y[4] + pshufd \$0x93, @t[3], @t[3] + pxor @t[4], @y[7] + pxor @t[4], @y[5] + pxor @t[7], @y[7] + pxor @t[5], @y[3] + pxor @t[4], @y[4] + pxor @t[5], @t[7] # clobber t[7] even more + + pxor @t[7], @y[5] + pshufd \$0x93, @t[4], @t[4] + pxor @t[7], @y[6] + pxor @t[7], @y[4] + + pxor @t[5], @t[7] + pshufd \$0x93, @t[5], @t[5] + pxor @t[6], @t[7] # restore t[7] + + # multiplication by 0x0d + pxor @y[7], @y[4] + pxor @t[4], @y[7] + pshufd \$0x93, @t[6], @t[6] + pxor @t[0], @y[2] + pxor @t[5], @y[7] + pxor @t[2], @y[2] + pshufd \$0x93, @t[7], @t[7] + + pxor @y[1], @y[3] + pxor @t[1], @y[1] + pxor @t[0], @y[0] + pxor @t[0], @y[3] + pxor @t[5], @y[1] + pxor @t[5], @y[0] + pxor @t[7], @y[1] + pshufd \$0x93, @t[0], @t[0] + pxor @t[6], @y[0] + pxor @y[1], @y[3] + pxor @t[1], @y[4] + pshufd \$0x93, @t[1], @t[1] + + pxor @t[7], @y[7] + pxor @t[2], @y[4] + pxor @t[2], @y[5] + pshufd \$0x93, @t[2], @t[2] + pxor @t[6], @y[2] + pxor @t[3], @t[6] # clobber t[6] + pxor @y[7], @y[4] + pxor @t[6], @y[3] + + pxor @t[6], @y[6] + pxor @t[5], @y[5] + pxor @t[4], @y[6] + pshufd \$0x93, @t[4], @t[4] + pxor @t[6], @y[5] + pxor @t[7], @y[6] + pxor @t[3], @t[6] # restore t[6] + + pshufd \$0x93, @t[5], @t[5] + pshufd \$0x93, @t[6], @t[6] + pshufd \$0x93, @t[7], @t[7] + pshufd \$0x93, @t[3], @t[3] + + # multiplication by 0x09 + pxor @y[1], @y[4] + pxor @y[1], @t[1] # t[1]=y[1] + pxor @t[5], @t[0] # clobber t[0] + pxor @t[5], @t[1] + pxor @t[0], @y[3] + pxor @y[0], @t[0] # t[0]=y[0] + pxor @t[6], @t[1] + pxor @t[7], @t[6] # clobber t[6] + pxor @t[1], @y[4] + pxor @t[4], @y[7] + pxor @y[4], @t[4] # t[4]=y[4] + pxor @t[3], @y[6] + pxor @y[3], @t[3] # t[3]=y[3] + pxor @t[2], @y[5] + pxor @y[2], @t[2] # t[2]=y[2] + pxor @t[7], @t[3] + pxor @y[5], @t[5] # t[5]=y[5] + pxor @t[6], @t[2] + pxor @t[6], @t[5] + pxor @y[6], @t[6] # t[6]=y[6] + pxor @y[7], @t[7] # t[7]=y[7] + + movdqa @t[0],@XMM[0] + movdqa @t[1],@XMM[1] + movdqa @t[2],@XMM[2] + movdqa @t[3],@XMM[3] + movdqa @t[4],@XMM[4] + movdqa @t[5],@XMM[5] + movdqa @t[6],@XMM[6] + movdqa @t[7],@XMM[7] +___ +} + +sub aesenc { # not used +my @b=@_[0..7]; +my @t=@_[8..15]; +$code.=<<___; + movdqa 0x30($const),@t[0] # .LSR +___ + &ShiftRows (@b,@t[0]); + &Sbox (@b,@t); + &MixColumns (@b[0,1,4,6,3,7,2,5],@t); +} + +sub aesenclast { # not used +my @b=@_[0..7]; +my @t=@_[8..15]; +$code.=<<___; + movdqa 0x40($const),@t[0] # .LSRM0 +___ + &ShiftRows (@b,@t[0]); + &Sbox (@b,@t); +$code.=<<___ + pxor 0x00($key),@b[0] + pxor 0x10($key),@b[1] + pxor 0x20($key),@b[4] + pxor 0x30($key),@b[6] + pxor 0x40($key),@b[3] + pxor 0x50($key),@b[7] + pxor 0x60($key),@b[2] + pxor 0x70($key),@b[5] +___ +} + +sub swapmove { +my ($a,$b,$n,$mask,$t)=@_; +$code.=<<___; + movdqa $b,$t + psrlq \$$n,$b + pxor $a,$b + pand $mask,$b + pxor $b,$a + psllq \$$n,$b + pxor $t,$b +___ +} +sub swapmove2x { +my ($a0,$b0,$a1,$b1,$n,$mask,$t0,$t1)=@_; +$code.=<<___; + movdqa $b0,$t0 + psrlq \$$n,$b0 + movdqa $b1,$t1 + psrlq \$$n,$b1 + pxor $a0,$b0 + pxor $a1,$b1 + pand $mask,$b0 + pand $mask,$b1 + pxor $b0,$a0 + psllq \$$n,$b0 + pxor $b1,$a1 + psllq \$$n,$b1 + pxor $t0,$b0 + pxor $t1,$b1 +___ +} + +sub bitslice { +my @x=reverse(@_[0..7]); +my ($t0,$t1,$t2,$t3)=@_[8..11]; +$code.=<<___; + movdqa 0x00($const),$t0 # .LBS0 + movdqa 0x10($const),$t1 # .LBS1 +___ + &swapmove2x(@x[0,1,2,3],1,$t0,$t2,$t3); + &swapmove2x(@x[4,5,6,7],1,$t0,$t2,$t3); +$code.=<<___; + movdqa 0x20($const),$t0 # .LBS2 +___ + &swapmove2x(@x[0,2,1,3],2,$t1,$t2,$t3); + &swapmove2x(@x[4,6,5,7],2,$t1,$t2,$t3); + + &swapmove2x(@x[0,4,1,5],4,$t0,$t2,$t3); + &swapmove2x(@x[2,6,3,7],4,$t0,$t2,$t3); +} + +$code.=<<___; +.text + +.extern asm_AES_encrypt +.extern asm_AES_decrypt + +.type _bsaes_encrypt8,\@abi-omnipotent +.align 64 +_bsaes_encrypt8: + lea .LBS0(%rip), $const # constants table + + movdqa ($key), @XMM[9] # round 0 key + lea 0x10($key), $key + movdqa 0x60($const), @XMM[8] # .LM0SR + pxor @XMM[9], @XMM[0] # xor with round0 key + pxor @XMM[9], @XMM[1] + pshufb @XMM[8], @XMM[0] + pxor @XMM[9], @XMM[2] + pshufb @XMM[8], @XMM[1] + pxor @XMM[9], @XMM[3] + pshufb @XMM[8], @XMM[2] + pxor @XMM[9], @XMM[4] + pshufb @XMM[8], @XMM[3] + pxor @XMM[9], @XMM[5] + pshufb @XMM[8], @XMM[4] + pxor @XMM[9], @XMM[6] + pshufb @XMM[8], @XMM[5] + pxor @XMM[9], @XMM[7] + pshufb @XMM[8], @XMM[6] + pshufb @XMM[8], @XMM[7] +_bsaes_encrypt8_bitslice: +___ + &bitslice (@XMM[0..7, 8..11]); +$code.=<<___; + dec $rounds + jmp .Lenc_sbox +.align 16 +.Lenc_loop: +___ + &ShiftRows (@XMM[0..7, 8]); +$code.=".Lenc_sbox:\n"; + &Sbox (@XMM[0..7, 8..15]); +$code.=<<___; + dec $rounds + jl .Lenc_done +___ + &MixColumns (@XMM[0,1,4,6,3,7,2,5, 8..15]); +$code.=<<___; + movdqa 0x30($const), @XMM[8] # .LSR + jnz .Lenc_loop + movdqa 0x40($const), @XMM[8] # .LSRM0 + jmp .Lenc_loop +.align 16 +.Lenc_done: +___ + # output in lsb > [t0, t1, t4, t6, t3, t7, t2, t5] < msb + &bitslice (@XMM[0,1,4,6,3,7,2,5, 8..11]); +$code.=<<___; + movdqa ($key), @XMM[8] # last round key + pxor @XMM[8], @XMM[4] + pxor @XMM[8], @XMM[6] + pxor @XMM[8], @XMM[3] + pxor @XMM[8], @XMM[7] + pxor @XMM[8], @XMM[2] + pxor @XMM[8], @XMM[5] + pxor @XMM[8], @XMM[0] + pxor @XMM[8], @XMM[1] + ret +.size _bsaes_encrypt8,.-_bsaes_encrypt8 + +.type _bsaes_decrypt8,\@abi-omnipotent +.align 64 +_bsaes_decrypt8: + lea .LBS0(%rip), $const # constants table + + movdqa ($key), @XMM[9] # round 0 key + lea 0x10($key), $key + movdqa -0x30($const), @XMM[8] # .LM0ISR + pxor @XMM[9], @XMM[0] # xor with round0 key + pxor @XMM[9], @XMM[1] + pshufb @XMM[8], @XMM[0] + pxor @XMM[9], @XMM[2] + pshufb @XMM[8], @XMM[1] + pxor @XMM[9], @XMM[3] + pshufb @XMM[8], @XMM[2] + pxor @XMM[9], @XMM[4] + pshufb @XMM[8], @XMM[3] + pxor @XMM[9], @XMM[5] + pshufb @XMM[8], @XMM[4] + pxor @XMM[9], @XMM[6] + pshufb @XMM[8], @XMM[5] + pxor @XMM[9], @XMM[7] + pshufb @XMM[8], @XMM[6] + pshufb @XMM[8], @XMM[7] +___ + &bitslice (@XMM[0..7, 8..11]); +$code.=<<___; + dec $rounds + jmp .Ldec_sbox +.align 16 +.Ldec_loop: +___ + &ShiftRows (@XMM[0..7, 8]); +$code.=".Ldec_sbox:\n"; + &InvSbox (@XMM[0..7, 8..15]); +$code.=<<___; + dec $rounds + jl .Ldec_done +___ + &InvMixColumns (@XMM[0,1,6,4,2,7,3,5, 8..15]); +$code.=<<___; + movdqa -0x10($const), @XMM[8] # .LISR + jnz .Ldec_loop + movdqa -0x20($const), @XMM[8] # .LISRM0 + jmp .Ldec_loop +.align 16 +.Ldec_done: +___ + &bitslice (@XMM[0,1,6,4,2,7,3,5, 8..11]); +$code.=<<___; + movdqa ($key), @XMM[8] # last round key + pxor @XMM[8], @XMM[6] + pxor @XMM[8], @XMM[4] + pxor @XMM[8], @XMM[2] + pxor @XMM[8], @XMM[7] + pxor @XMM[8], @XMM[3] + pxor @XMM[8], @XMM[5] + pxor @XMM[8], @XMM[0] + pxor @XMM[8], @XMM[1] + ret +.size _bsaes_decrypt8,.-_bsaes_decrypt8 +___ +} +{ +my ($out,$inp,$rounds,$const)=("%rax","%rcx","%r10d","%r11"); + +sub bitslice_key { +my @x=reverse(@_[0..7]); +my ($bs0,$bs1,$bs2,$t2,$t3)=@_[8..12]; + + &swapmove (@x[0,1],1,$bs0,$t2,$t3); +$code.=<<___; + #&swapmove(@x[2,3],1,$t0,$t2,$t3); + movdqa @x[0], @x[2] + movdqa @x[1], @x[3] +___ + #&swapmove2x(@x[4,5,6,7],1,$t0,$t2,$t3); + + &swapmove2x (@x[0,2,1,3],2,$bs1,$t2,$t3); +$code.=<<___; + #&swapmove2x(@x[4,6,5,7],2,$t1,$t2,$t3); + movdqa @x[0], @x[4] + movdqa @x[2], @x[6] + movdqa @x[1], @x[5] + movdqa @x[3], @x[7] +___ + &swapmove2x (@x[0,4,1,5],4,$bs2,$t2,$t3); + &swapmove2x (@x[2,6,3,7],4,$bs2,$t2,$t3); +} + +$code.=<<___; +.type _bsaes_key_convert,\@abi-omnipotent +.align 16 +_bsaes_key_convert: + lea .LBS1(%rip), $const + movdqu ($inp), %xmm7 # load round 0 key + movdqa -0x10($const), %xmm8 # .LBS0 + movdqa 0x00($const), %xmm9 # .LBS1 + movdqa 0x10($const), %xmm10 # .LBS2 + movdqa 0x40($const), %xmm13 # .LM0 + movdqa 0x60($const), %xmm14 # .LNOT + + movdqu 0x10($inp), %xmm6 # load round 1 key + lea 0x10($inp), $inp + movdqa %xmm7, ($out) # save round 0 key + lea 0x10($out), $out + dec $rounds + jmp .Lkey_loop +.align 16 +.Lkey_loop: + pshufb %xmm13, %xmm6 # .LM0 + movdqa %xmm6, %xmm7 +___ + &bitslice_key (map("%xmm$_",(0..7, 8..12))); +$code.=<<___; + pxor %xmm14, %xmm5 # "pnot" + pxor %xmm14, %xmm6 + pxor %xmm14, %xmm0 + pxor %xmm14, %xmm1 + lea 0x10($inp), $inp + movdqa %xmm0, 0x00($out) # write bit-sliced round key + movdqa %xmm1, 0x10($out) + movdqa %xmm2, 0x20($out) + movdqa %xmm3, 0x30($out) + movdqa %xmm4, 0x40($out) + movdqa %xmm5, 0x50($out) + movdqa %xmm6, 0x60($out) + movdqa %xmm7, 0x70($out) + lea 0x80($out),$out + movdqu ($inp), %xmm6 # load next round key + dec $rounds + jnz .Lkey_loop + + movdqa 0x70($const), %xmm7 # .L63 + #movdqa %xmm6, ($out) # don't save last round key + ret +.size _bsaes_key_convert,.-_bsaes_key_convert +___ +} + +if (0 && !$win64) { # following four functions are unsupported interface + # used for benchmarking... +$code.=<<___; +.globl bsaes_enc_key_convert +.type bsaes_enc_key_convert,\@function,2 +.align 16 +bsaes_enc_key_convert: + mov 240($inp),%r10d # pass rounds + mov $inp,%rcx # pass key + mov $out,%rax # pass key schedule + call _bsaes_key_convert + pxor %xmm6,%xmm7 # fix up last round key + movdqa %xmm7,(%rax) # save last round key + ret +.size bsaes_enc_key_convert,.-bsaes_enc_key_convert + +.globl bsaes_encrypt_128 +.type bsaes_encrypt_128,\@function,4 +.align 16 +bsaes_encrypt_128: +.Lenc128_loop: + movdqu 0x00($inp), @XMM[0] # load input + movdqu 0x10($inp), @XMM[1] + movdqu 0x20($inp), @XMM[2] + movdqu 0x30($inp), @XMM[3] + movdqu 0x40($inp), @XMM[4] + movdqu 0x50($inp), @XMM[5] + movdqu 0x60($inp), @XMM[6] + movdqu 0x70($inp), @XMM[7] + mov $key, %rax # pass the $key + lea 0x80($inp), $inp + mov \$10,%r10d + + call _bsaes_encrypt8 + + movdqu @XMM[0], 0x00($out) # write output + movdqu @XMM[1], 0x10($out) + movdqu @XMM[4], 0x20($out) + movdqu @XMM[6], 0x30($out) + movdqu @XMM[3], 0x40($out) + movdqu @XMM[7], 0x50($out) + movdqu @XMM[2], 0x60($out) + movdqu @XMM[5], 0x70($out) + lea 0x80($out), $out + sub \$0x80,$len + ja .Lenc128_loop + ret +.size bsaes_encrypt_128,.-bsaes_encrypt_128 + +.globl bsaes_dec_key_convert +.type bsaes_dec_key_convert,\@function,2 +.align 16 +bsaes_dec_key_convert: + mov 240($inp),%r10d # pass rounds + mov $inp,%rcx # pass key + mov $out,%rax # pass key schedule + call _bsaes_key_convert + pxor ($out),%xmm7 # fix up round 0 key + movdqa %xmm6,(%rax) # save last round key + movdqa %xmm7,($out) + ret +.size bsaes_dec_key_convert,.-bsaes_dec_key_convert + +.globl bsaes_decrypt_128 +.type bsaes_decrypt_128,\@function,4 +.align 16 +bsaes_decrypt_128: +.Ldec128_loop: + movdqu 0x00($inp), @XMM[0] # load input + movdqu 0x10($inp), @XMM[1] + movdqu 0x20($inp), @XMM[2] + movdqu 0x30($inp), @XMM[3] + movdqu 0x40($inp), @XMM[4] + movdqu 0x50($inp), @XMM[5] + movdqu 0x60($inp), @XMM[6] + movdqu 0x70($inp), @XMM[7] + mov $key, %rax # pass the $key + lea 0x80($inp), $inp + mov \$10,%r10d + + call _bsaes_decrypt8 + + movdqu @XMM[0], 0x00($out) # write output + movdqu @XMM[1], 0x10($out) + movdqu @XMM[6], 0x20($out) + movdqu @XMM[4], 0x30($out) + movdqu @XMM[2], 0x40($out) + movdqu @XMM[7], 0x50($out) + movdqu @XMM[3], 0x60($out) + movdqu @XMM[5], 0x70($out) + lea 0x80($out), $out + sub \$0x80,$len + ja .Ldec128_loop + ret +.size bsaes_decrypt_128,.-bsaes_decrypt_128 +___ +} +{ +###################################################################### +# +# OpenSSL interface +# +my ($arg1,$arg2,$arg3,$arg4,$arg5,$arg6)=$win64 ? ("%rcx","%rdx","%r8","%r9","%r10","%r11d") + : ("%rdi","%rsi","%rdx","%rcx","%r8","%r9d"); +my ($inp,$out,$len,$key)=("%r12","%r13","%r14","%r15"); + +if ($ecb) { +$code.=<<___; +.globl bsaes_ecb_encrypt_blocks +.type bsaes_ecb_encrypt_blocks,\@abi-omnipotent +.align 16 +bsaes_ecb_encrypt_blocks: + mov %rsp, %rax +.Lecb_enc_prologue: + push %rbp + push %rbx + push %r12 + push %r13 + push %r14 + push %r15 + lea -0x48(%rsp),%rsp +___ +$code.=<<___ if ($win64); + lea -0xa0(%rsp), %rsp + movaps %xmm6, 0x40(%rsp) + movaps %xmm7, 0x50(%rsp) + movaps %xmm8, 0x60(%rsp) + movaps %xmm9, 0x70(%rsp) + movaps %xmm10, 0x80(%rsp) + movaps %xmm11, 0x90(%rsp) + movaps %xmm12, 0xa0(%rsp) + movaps %xmm13, 0xb0(%rsp) + movaps %xmm14, 0xc0(%rsp) + movaps %xmm15, 0xd0(%rsp) +.Lecb_enc_body: +___ +$code.=<<___; + mov %rsp,%rbp # backup %rsp + mov 240($arg4),%eax # rounds + mov $arg1,$inp # backup arguments + mov $arg2,$out + mov $arg3,$len + mov $arg4,$key + cmp \$8,$arg3 + jb .Lecb_enc_short + + mov %eax,%ebx # backup rounds + shl \$7,%rax # 128 bytes per inner round key + sub \$`128-32`,%rax # size of bit-sliced key schedule + sub %rax,%rsp + mov %rsp,%rax # pass key schedule + mov $key,%rcx # pass key + mov %ebx,%r10d # pass rounds + call _bsaes_key_convert + pxor %xmm6,%xmm7 # fix up last round key + movdqa %xmm7,(%rax) # save last round key + + sub \$8,$len +.Lecb_enc_loop: + movdqu 0x00($inp), @XMM[0] # load input + movdqu 0x10($inp), @XMM[1] + movdqu 0x20($inp), @XMM[2] + movdqu 0x30($inp), @XMM[3] + movdqu 0x40($inp), @XMM[4] + movdqu 0x50($inp), @XMM[5] + mov %rsp, %rax # pass key schedule + movdqu 0x60($inp), @XMM[6] + mov %ebx,%r10d # pass rounds + movdqu 0x70($inp), @XMM[7] + lea 0x80($inp), $inp + + call _bsaes_encrypt8 + + movdqu @XMM[0], 0x00($out) # write output + movdqu @XMM[1], 0x10($out) + movdqu @XMM[4], 0x20($out) + movdqu @XMM[6], 0x30($out) + movdqu @XMM[3], 0x40($out) + movdqu @XMM[7], 0x50($out) + movdqu @XMM[2], 0x60($out) + movdqu @XMM[5], 0x70($out) + lea 0x80($out), $out + sub \$8,$len + jnc .Lecb_enc_loop + + add \$8,$len + jz .Lecb_enc_done + + movdqu 0x00($inp), @XMM[0] # load input + mov %rsp, %rax # pass key schedule + mov %ebx,%r10d # pass rounds + cmp \$2,$len + jb .Lecb_enc_one + movdqu 0x10($inp), @XMM[1] + je .Lecb_enc_two + movdqu 0x20($inp), @XMM[2] + cmp \$4,$len + jb .Lecb_enc_three + movdqu 0x30($inp), @XMM[3] + je .Lecb_enc_four + movdqu 0x40($inp), @XMM[4] + cmp \$6,$len + jb .Lecb_enc_five + movdqu 0x50($inp), @XMM[5] + je .Lecb_enc_six + movdqu 0x60($inp), @XMM[6] + call _bsaes_encrypt8 + movdqu @XMM[0], 0x00($out) # write output + movdqu @XMM[1], 0x10($out) + movdqu @XMM[4], 0x20($out) + movdqu @XMM[6], 0x30($out) + movdqu @XMM[3], 0x40($out) + movdqu @XMM[7], 0x50($out) + movdqu @XMM[2], 0x60($out) + jmp .Lecb_enc_done +.align 16 +.Lecb_enc_six: + call _bsaes_encrypt8 + movdqu @XMM[0], 0x00($out) # write output + movdqu @XMM[1], 0x10($out) + movdqu @XMM[4], 0x20($out) + movdqu @XMM[6], 0x30($out) + movdqu @XMM[3], 0x40($out) + movdqu @XMM[7], 0x50($out) + jmp .Lecb_enc_done +.align 16 +.Lecb_enc_five: + call _bsaes_encrypt8 + movdqu @XMM[0], 0x00($out) # write output + movdqu @XMM[1], 0x10($out) + movdqu @XMM[4], 0x20($out) + movdqu @XMM[6], 0x30($out) + movdqu @XMM[3], 0x40($out) + jmp .Lecb_enc_done +.align 16 +.Lecb_enc_four: + call _bsaes_encrypt8 + movdqu @XMM[0], 0x00($out) # write output + movdqu @XMM[1], 0x10($out) + movdqu @XMM[4], 0x20($out) + movdqu @XMM[6], 0x30($out) + jmp .Lecb_enc_done +.align 16 +.Lecb_enc_three: + call _bsaes_encrypt8 + movdqu @XMM[0], 0x00($out) # write output + movdqu @XMM[1], 0x10($out) + movdqu @XMM[4], 0x20($out) + jmp .Lecb_enc_done +.align 16 +.Lecb_enc_two: + call _bsaes_encrypt8 + movdqu @XMM[0], 0x00($out) # write output + movdqu @XMM[1], 0x10($out) + jmp .Lecb_enc_done +.align 16 +.Lecb_enc_one: + call _bsaes_encrypt8 + movdqu @XMM[0], 0x00($out) # write output + jmp .Lecb_enc_done +.align 16 +.Lecb_enc_short: + lea ($inp), $arg1 + lea ($out), $arg2 + lea ($key), $arg3 + call asm_AES_encrypt + lea 16($inp), $inp + lea 16($out), $out + dec $len + jnz .Lecb_enc_short + +.Lecb_enc_done: + lea (%rsp),%rax + pxor %xmm0, %xmm0 +.Lecb_enc_bzero: # wipe key schedule [if any] + movdqa %xmm0, 0x00(%rax) + movdqa %xmm0, 0x10(%rax) + lea 0x20(%rax), %rax + cmp %rax, %rbp + jb .Lecb_enc_bzero + + lea (%rbp),%rsp # restore %rsp +___ +$code.=<<___ if ($win64); + movaps 0x40(%rbp), %xmm6 + movaps 0x50(%rbp), %xmm7 + movaps 0x60(%rbp), %xmm8 + movaps 0x70(%rbp), %xmm9 + movaps 0x80(%rbp), %xmm10 + movaps 0x90(%rbp), %xmm11 + movaps 0xa0(%rbp), %xmm12 + movaps 0xb0(%rbp), %xmm13 + movaps 0xc0(%rbp), %xmm14 + movaps 0xd0(%rbp), %xmm15 + lea 0xa0(%rbp), %rsp +___ +$code.=<<___; + mov 0x48(%rsp), %r15 + mov 0x50(%rsp), %r14 + mov 0x58(%rsp), %r13 + mov 0x60(%rsp), %r12 + mov 0x68(%rsp), %rbx + mov 0x70(%rsp), %rax + lea 0x78(%rsp), %rsp + mov %rax, %rbp +.Lecb_enc_epilogue: + ret +.size bsaes_ecb_encrypt_blocks,.-bsaes_ecb_encrypt_blocks + +.globl bsaes_ecb_decrypt_blocks +.type bsaes_ecb_decrypt_blocks,\@abi-omnipotent +.align 16 +bsaes_ecb_decrypt_blocks: + mov %rsp, %rax +.Lecb_dec_prologue: + push %rbp + push %rbx + push %r12 + push %r13 + push %r14 + push %r15 + lea -0x48(%rsp),%rsp +___ +$code.=<<___ if ($win64); + lea -0xa0(%rsp), %rsp + movaps %xmm6, 0x40(%rsp) + movaps %xmm7, 0x50(%rsp) + movaps %xmm8, 0x60(%rsp) + movaps %xmm9, 0x70(%rsp) + movaps %xmm10, 0x80(%rsp) + movaps %xmm11, 0x90(%rsp) + movaps %xmm12, 0xa0(%rsp) + movaps %xmm13, 0xb0(%rsp) + movaps %xmm14, 0xc0(%rsp) + movaps %xmm15, 0xd0(%rsp) +.Lecb_dec_body: +___ +$code.=<<___; + mov %rsp,%rbp # backup %rsp + mov 240($arg4),%eax # rounds + mov $arg1,$inp # backup arguments + mov $arg2,$out + mov $arg3,$len + mov $arg4,$key + cmp \$8,$arg3 + jb .Lecb_dec_short + + mov %eax,%ebx # backup rounds + shl \$7,%rax # 128 bytes per inner round key + sub \$`128-32`,%rax # size of bit-sliced key schedule + sub %rax,%rsp + mov %rsp,%rax # pass key schedule + mov $key,%rcx # pass key + mov %ebx,%r10d # pass rounds + call _bsaes_key_convert + pxor (%rsp),%xmm7 # fix up 0 round key + movdqa %xmm6,(%rax) # save last round key + movdqa %xmm7,(%rsp) + + sub \$8,$len +.Lecb_dec_loop: + movdqu 0x00($inp), @XMM[0] # load input + movdqu 0x10($inp), @XMM[1] + movdqu 0x20($inp), @XMM[2] + movdqu 0x30($inp), @XMM[3] + movdqu 0x40($inp), @XMM[4] + movdqu 0x50($inp), @XMM[5] + mov %rsp, %rax # pass key schedule + movdqu 0x60($inp), @XMM[6] + mov %ebx,%r10d # pass rounds + movdqu 0x70($inp), @XMM[7] + lea 0x80($inp), $inp + + call _bsaes_decrypt8 + + movdqu @XMM[0], 0x00($out) # write output + movdqu @XMM[1], 0x10($out) + movdqu @XMM[6], 0x20($out) + movdqu @XMM[4], 0x30($out) + movdqu @XMM[2], 0x40($out) + movdqu @XMM[7], 0x50($out) + movdqu @XMM[3], 0x60($out) + movdqu @XMM[5], 0x70($out) + lea 0x80($out), $out + sub \$8,$len + jnc .Lecb_dec_loop + + add \$8,$len + jz .Lecb_dec_done + + movdqu 0x00($inp), @XMM[0] # load input + mov %rsp, %rax # pass key schedule + mov %ebx,%r10d # pass rounds + cmp \$2,$len + jb .Lecb_dec_one + movdqu 0x10($inp), @XMM[1] + je .Lecb_dec_two + movdqu 0x20($inp), @XMM[2] + cmp \$4,$len + jb .Lecb_dec_three + movdqu 0x30($inp), @XMM[3] + je .Lecb_dec_four + movdqu 0x40($inp), @XMM[4] + cmp \$6,$len + jb .Lecb_dec_five + movdqu 0x50($inp), @XMM[5] + je .Lecb_dec_six + movdqu 0x60($inp), @XMM[6] + call _bsaes_decrypt8 + movdqu @XMM[0], 0x00($out) # write output + movdqu @XMM[1], 0x10($out) + movdqu @XMM[6], 0x20($out) + movdqu @XMM[4], 0x30($out) + movdqu @XMM[2], 0x40($out) + movdqu @XMM[7], 0x50($out) + movdqu @XMM[3], 0x60($out) + jmp .Lecb_dec_done +.align 16 +.Lecb_dec_six: + call _bsaes_decrypt8 + movdqu @XMM[0], 0x00($out) # write output + movdqu @XMM[1], 0x10($out) + movdqu @XMM[6], 0x20($out) + movdqu @XMM[4], 0x30($out) + movdqu @XMM[2], 0x40($out) + movdqu @XMM[7], 0x50($out) + jmp .Lecb_dec_done +.align 16 +.Lecb_dec_five: + call _bsaes_decrypt8 + movdqu @XMM[0], 0x00($out) # write output + movdqu @XMM[1], 0x10($out) + movdqu @XMM[6], 0x20($out) + movdqu @XMM[4], 0x30($out) + movdqu @XMM[2], 0x40($out) + jmp .Lecb_dec_done +.align 16 +.Lecb_dec_four: + call _bsaes_decrypt8 + movdqu @XMM[0], 0x00($out) # write output + movdqu @XMM[1], 0x10($out) + movdqu @XMM[6], 0x20($out) + movdqu @XMM[4], 0x30($out) + jmp .Lecb_dec_done +.align 16 +.Lecb_dec_three: + call _bsaes_decrypt8 + movdqu @XMM[0], 0x00($out) # write output + movdqu @XMM[1], 0x10($out) + movdqu @XMM[6], 0x20($out) + jmp .Lecb_dec_done +.align 16 +.Lecb_dec_two: + call _bsaes_decrypt8 + movdqu @XMM[0], 0x00($out) # write output + movdqu @XMM[1], 0x10($out) + jmp .Lecb_dec_done +.align 16 +.Lecb_dec_one: + call _bsaes_decrypt8 + movdqu @XMM[0], 0x00($out) # write output + jmp .Lecb_dec_done +.align 16 +.Lecb_dec_short: + lea ($inp), $arg1 + lea ($out), $arg2 + lea ($key), $arg3 + call asm_AES_decrypt + lea 16($inp), $inp + lea 16($out), $out + dec $len + jnz .Lecb_dec_short + +.Lecb_dec_done: + lea (%rsp),%rax + pxor %xmm0, %xmm0 +.Lecb_dec_bzero: # wipe key schedule [if any] + movdqa %xmm0, 0x00(%rax) + movdqa %xmm0, 0x10(%rax) + lea 0x20(%rax), %rax + cmp %rax, %rbp + jb .Lecb_dec_bzero + + lea (%rbp),%rsp # restore %rsp +___ +$code.=<<___ if ($win64); + movaps 0x40(%rbp), %xmm6 + movaps 0x50(%rbp), %xmm7 + movaps 0x60(%rbp), %xmm8 + movaps 0x70(%rbp), %xmm9 + movaps 0x80(%rbp), %xmm10 + movaps 0x90(%rbp), %xmm11 + movaps 0xa0(%rbp), %xmm12 + movaps 0xb0(%rbp), %xmm13 + movaps 0xc0(%rbp), %xmm14 + movaps 0xd0(%rbp), %xmm15 + lea 0xa0(%rbp), %rsp +___ +$code.=<<___; + mov 0x48(%rsp), %r15 + mov 0x50(%rsp), %r14 + mov 0x58(%rsp), %r13 + mov 0x60(%rsp), %r12 + mov 0x68(%rsp), %rbx + mov 0x70(%rsp), %rax + lea 0x78(%rsp), %rsp + mov %rax, %rbp +.Lecb_dec_epilogue: + ret +.size bsaes_ecb_decrypt_blocks,.-bsaes_ecb_decrypt_blocks +___ +} +$code.=<<___; +.extern asm_AES_cbc_encrypt +.globl bsaes_cbc_encrypt +.type bsaes_cbc_encrypt,\@abi-omnipotent +.align 16 +bsaes_cbc_encrypt: +___ +$code.=<<___ if ($win64); + mov 48(%rsp),$arg6 # pull direction flag +___ +$code.=<<___; + cmp \$0,$arg6 + jne asm_AES_cbc_encrypt + cmp \$128,$arg3 + jb asm_AES_cbc_encrypt + + mov %rsp, %rax +.Lcbc_dec_prologue: + push %rbp + push %rbx + push %r12 + push %r13 + push %r14 + push %r15 + lea -0x48(%rsp), %rsp +___ +$code.=<<___ if ($win64); + mov 0xa0(%rsp),$arg5 # pull ivp + lea -0xa0(%rsp), %rsp + movaps %xmm6, 0x40(%rsp) + movaps %xmm7, 0x50(%rsp) + movaps %xmm8, 0x60(%rsp) + movaps %xmm9, 0x70(%rsp) + movaps %xmm10, 0x80(%rsp) + movaps %xmm11, 0x90(%rsp) + movaps %xmm12, 0xa0(%rsp) + movaps %xmm13, 0xb0(%rsp) + movaps %xmm14, 0xc0(%rsp) + movaps %xmm15, 0xd0(%rsp) +.Lcbc_dec_body: +___ +$code.=<<___; + mov %rsp, %rbp # backup %rsp + mov 240($arg4), %eax # rounds + mov $arg1, $inp # backup arguments + mov $arg2, $out + mov $arg3, $len + mov $arg4, $key + mov $arg5, %rbx + shr \$4, $len # bytes to blocks + + mov %eax, %edx # rounds + shl \$7, %rax # 128 bytes per inner round key + sub \$`128-32`, %rax # size of bit-sliced key schedule + sub %rax, %rsp + + mov %rsp, %rax # pass key schedule + mov $key, %rcx # pass key + mov %edx, %r10d # pass rounds + call _bsaes_key_convert + pxor (%rsp),%xmm7 # fix up 0 round key + movdqa %xmm6,(%rax) # save last round key + movdqa %xmm7,(%rsp) + + movdqu (%rbx), @XMM[15] # load IV + sub \$8,$len +.Lcbc_dec_loop: + movdqu 0x00($inp), @XMM[0] # load input + movdqu 0x10($inp), @XMM[1] + movdqu 0x20($inp), @XMM[2] + movdqu 0x30($inp), @XMM[3] + movdqu 0x40($inp), @XMM[4] + movdqu 0x50($inp), @XMM[5] + mov %rsp, %rax # pass key schedule + movdqu 0x60($inp), @XMM[6] + mov %edx,%r10d # pass rounds + movdqu 0x70($inp), @XMM[7] + movdqa @XMM[15], 0x20(%rbp) # put aside IV + + call _bsaes_decrypt8 + + pxor 0x20(%rbp), @XMM[0] # ^= IV + movdqu 0x00($inp), @XMM[8] # re-load input + movdqu 0x10($inp), @XMM[9] + pxor @XMM[8], @XMM[1] + movdqu 0x20($inp), @XMM[10] + pxor @XMM[9], @XMM[6] + movdqu 0x30($inp), @XMM[11] + pxor @XMM[10], @XMM[4] + movdqu 0x40($inp), @XMM[12] + pxor @XMM[11], @XMM[2] + movdqu 0x50($inp), @XMM[13] + pxor @XMM[12], @XMM[7] + movdqu 0x60($inp), @XMM[14] + pxor @XMM[13], @XMM[3] + movdqu 0x70($inp), @XMM[15] # IV + pxor @XMM[14], @XMM[5] + movdqu @XMM[0], 0x00($out) # write output + lea 0x80($inp), $inp + movdqu @XMM[1], 0x10($out) + movdqu @XMM[6], 0x20($out) + movdqu @XMM[4], 0x30($out) + movdqu @XMM[2], 0x40($out) + movdqu @XMM[7], 0x50($out) + movdqu @XMM[3], 0x60($out) + movdqu @XMM[5], 0x70($out) + lea 0x80($out), $out + sub \$8,$len + jnc .Lcbc_dec_loop + + add \$8,$len + jz .Lcbc_dec_done + + movdqu 0x00($inp), @XMM[0] # load input + mov %rsp, %rax # pass key schedule + mov %edx, %r10d # pass rounds + cmp \$2,$len + jb .Lcbc_dec_one + movdqu 0x10($inp), @XMM[1] + je .Lcbc_dec_two + movdqu 0x20($inp), @XMM[2] + cmp \$4,$len + jb .Lcbc_dec_three + movdqu 0x30($inp), @XMM[3] + je .Lcbc_dec_four + movdqu 0x40($inp), @XMM[4] + cmp \$6,$len + jb .Lcbc_dec_five + movdqu 0x50($inp), @XMM[5] + je .Lcbc_dec_six + movdqu 0x60($inp), @XMM[6] + movdqa @XMM[15], 0x20(%rbp) # put aside IV + call _bsaes_decrypt8 + pxor 0x20(%rbp), @XMM[0] # ^= IV + movdqu 0x00($inp), @XMM[8] # re-load input + movdqu 0x10($inp), @XMM[9] + pxor @XMM[8], @XMM[1] + movdqu 0x20($inp), @XMM[10] + pxor @XMM[9], @XMM[6] + movdqu 0x30($inp), @XMM[11] + pxor @XMM[10], @XMM[4] + movdqu 0x40($inp), @XMM[12] + pxor @XMM[11], @XMM[2] + movdqu 0x50($inp), @XMM[13] + pxor @XMM[12], @XMM[7] + movdqu 0x60($inp), @XMM[15] # IV + pxor @XMM[13], @XMM[3] + movdqu @XMM[0], 0x00($out) # write output + movdqu @XMM[1], 0x10($out) + movdqu @XMM[6], 0x20($out) + movdqu @XMM[4], 0x30($out) + movdqu @XMM[2], 0x40($out) + movdqu @XMM[7], 0x50($out) + movdqu @XMM[3], 0x60($out) + jmp .Lcbc_dec_done +.align 16 +.Lcbc_dec_six: + movdqa @XMM[15], 0x20(%rbp) # put aside IV + call _bsaes_decrypt8 + pxor 0x20(%rbp), @XMM[0] # ^= IV + movdqu 0x00($inp), @XMM[8] # re-load input + movdqu 0x10($inp), @XMM[9] + pxor @XMM[8], @XMM[1] + movdqu 0x20($inp), @XMM[10] + pxor @XMM[9], @XMM[6] + movdqu 0x30($inp), @XMM[11] + pxor @XMM[10], @XMM[4] + movdqu 0x40($inp), @XMM[12] + pxor @XMM[11], @XMM[2] + movdqu 0x50($inp), @XMM[15] # IV + pxor @XMM[12], @XMM[7] + movdqu @XMM[0], 0x00($out) # write output + movdqu @XMM[1], 0x10($out) + movdqu @XMM[6], 0x20($out) + movdqu @XMM[4], 0x30($out) + movdqu @XMM[2], 0x40($out) + movdqu @XMM[7], 0x50($out) + jmp .Lcbc_dec_done +.align 16 +.Lcbc_dec_five: + movdqa @XMM[15], 0x20(%rbp) # put aside IV + call _bsaes_decrypt8 + pxor 0x20(%rbp), @XMM[0] # ^= IV + movdqu 0x00($inp), @XMM[8] # re-load input + movdqu 0x10($inp), @XMM[9] + pxor @XMM[8], @XMM[1] + movdqu 0x20($inp), @XMM[10] + pxor @XMM[9], @XMM[6] + movdqu 0x30($inp), @XMM[11] + pxor @XMM[10], @XMM[4] + movdqu 0x40($inp), @XMM[15] # IV + pxor @XMM[11], @XMM[2] + movdqu @XMM[0], 0x00($out) # write output + movdqu @XMM[1], 0x10($out) + movdqu @XMM[6], 0x20($out) + movdqu @XMM[4], 0x30($out) + movdqu @XMM[2], 0x40($out) + jmp .Lcbc_dec_done +.align 16 +.Lcbc_dec_four: + movdqa @XMM[15], 0x20(%rbp) # put aside IV + call _bsaes_decrypt8 + pxor 0x20(%rbp), @XMM[0] # ^= IV + movdqu 0x00($inp), @XMM[8] # re-load input + movdqu 0x10($inp), @XMM[9] + pxor @XMM[8], @XMM[1] + movdqu 0x20($inp), @XMM[10] + pxor @XMM[9], @XMM[6] + movdqu 0x30($inp), @XMM[15] # IV + pxor @XMM[10], @XMM[4] + movdqu @XMM[0], 0x00($out) # write output + movdqu @XMM[1], 0x10($out) + movdqu @XMM[6], 0x20($out) + movdqu @XMM[4], 0x30($out) + jmp .Lcbc_dec_done +.align 16 +.Lcbc_dec_three: + movdqa @XMM[15], 0x20(%rbp) # put aside IV + call _bsaes_decrypt8 + pxor 0x20(%rbp), @XMM[0] # ^= IV + movdqu 0x00($inp), @XMM[8] # re-load input + movdqu 0x10($inp), @XMM[9] + pxor @XMM[8], @XMM[1] + movdqu 0x20($inp), @XMM[15] # IV + pxor @XMM[9], @XMM[6] + movdqu @XMM[0], 0x00($out) # write output + movdqu @XMM[1], 0x10($out) + movdqu @XMM[6], 0x20($out) + jmp .Lcbc_dec_done +.align 16 +.Lcbc_dec_two: + movdqa @XMM[15], 0x20(%rbp) # put aside IV + call _bsaes_decrypt8 + pxor 0x20(%rbp), @XMM[0] # ^= IV + movdqu 0x00($inp), @XMM[8] # re-load input + movdqu 0x10($inp), @XMM[15] # IV + pxor @XMM[8], @XMM[1] + movdqu @XMM[0], 0x00($out) # write output + movdqu @XMM[1], 0x10($out) + jmp .Lcbc_dec_done +.align 16 +.Lcbc_dec_one: + lea ($inp), $arg1 + lea 0x20(%rbp), $arg2 # buffer output + lea ($key), $arg3 + call asm_AES_decrypt # doesn't touch %xmm + pxor 0x20(%rbp), @XMM[15] # ^= IV + movdqu @XMM[15], ($out) # write output + movdqa @XMM[0], @XMM[15] # IV + +.Lcbc_dec_done: + movdqu @XMM[15], (%rbx) # return IV + lea (%rsp), %rax + pxor %xmm0, %xmm0 +.Lcbc_dec_bzero: # wipe key schedule [if any] + movdqa %xmm0, 0x00(%rax) + movdqa %xmm0, 0x10(%rax) + lea 0x20(%rax), %rax + cmp %rax, %rbp + ja .Lcbc_dec_bzero + + lea (%rbp),%rsp # restore %rsp +___ +$code.=<<___ if ($win64); + movaps 0x40(%rbp), %xmm6 + movaps 0x50(%rbp), %xmm7 + movaps 0x60(%rbp), %xmm8 + movaps 0x70(%rbp), %xmm9 + movaps 0x80(%rbp), %xmm10 + movaps 0x90(%rbp), %xmm11 + movaps 0xa0(%rbp), %xmm12 + movaps 0xb0(%rbp), %xmm13 + movaps 0xc0(%rbp), %xmm14 + movaps 0xd0(%rbp), %xmm15 + lea 0xa0(%rbp), %rsp +___ +$code.=<<___; + mov 0x48(%rsp), %r15 + mov 0x50(%rsp), %r14 + mov 0x58(%rsp), %r13 + mov 0x60(%rsp), %r12 + mov 0x68(%rsp), %rbx + mov 0x70(%rsp), %rax + lea 0x78(%rsp), %rsp + mov %rax, %rbp +.Lcbc_dec_epilogue: + ret +.size bsaes_cbc_encrypt,.-bsaes_cbc_encrypt + +.globl bsaes_ctr32_encrypt_blocks +.type bsaes_ctr32_encrypt_blocks,\@abi-omnipotent +.align 16 +bsaes_ctr32_encrypt_blocks: + mov %rsp, %rax +.Lctr_enc_prologue: + push %rbp + push %rbx + push %r12 + push %r13 + push %r14 + push %r15 + lea -0x48(%rsp), %rsp +___ +$code.=<<___ if ($win64); + mov 0xa0(%rsp),$arg5 # pull ivp + lea -0xa0(%rsp), %rsp + movaps %xmm6, 0x40(%rsp) + movaps %xmm7, 0x50(%rsp) + movaps %xmm8, 0x60(%rsp) + movaps %xmm9, 0x70(%rsp) + movaps %xmm10, 0x80(%rsp) + movaps %xmm11, 0x90(%rsp) + movaps %xmm12, 0xa0(%rsp) + movaps %xmm13, 0xb0(%rsp) + movaps %xmm14, 0xc0(%rsp) + movaps %xmm15, 0xd0(%rsp) +.Lctr_enc_body: +___ +$code.=<<___; + mov %rsp, %rbp # backup %rsp + movdqu ($arg5), %xmm0 # load counter + mov 240($arg4), %eax # rounds + mov $arg1, $inp # backup arguments + mov $arg2, $out + mov $arg3, $len + mov $arg4, $key + movdqa %xmm0, 0x20(%rbp) # copy counter + cmp \$8, $arg3 + jb .Lctr_enc_short + + mov %eax, %ebx # rounds + shl \$7, %rax # 128 bytes per inner round key + sub \$`128-32`, %rax # size of bit-sliced key schedule + sub %rax, %rsp + + mov %rsp, %rax # pass key schedule + mov $key, %rcx # pass key + mov %ebx, %r10d # pass rounds + call _bsaes_key_convert + pxor %xmm6,%xmm7 # fix up last round key + movdqa %xmm7,(%rax) # save last round key + + movdqa (%rsp), @XMM[9] # load round0 key + lea .LADD1(%rip), %r11 + movdqa 0x20(%rbp), @XMM[0] # counter copy + movdqa -0x20(%r11), @XMM[8] # .LSWPUP + pshufb @XMM[8], @XMM[9] # byte swap upper part + pshufb @XMM[8], @XMM[0] + movdqa @XMM[9], (%rsp) # save adjusted round0 key + jmp .Lctr_enc_loop +.align 16 +.Lctr_enc_loop: + movdqa @XMM[0], 0x20(%rbp) # save counter + movdqa @XMM[0], @XMM[1] # prepare 8 counter values + movdqa @XMM[0], @XMM[2] + paddd 0x00(%r11), @XMM[1] # .LADD1 + movdqa @XMM[0], @XMM[3] + paddd 0x10(%r11), @XMM[2] # .LADD2 + movdqa @XMM[0], @XMM[4] + paddd 0x20(%r11), @XMM[3] # .LADD3 + movdqa @XMM[0], @XMM[5] + paddd 0x30(%r11), @XMM[4] # .LADD4 + movdqa @XMM[0], @XMM[6] + paddd 0x40(%r11), @XMM[5] # .LADD5 + movdqa @XMM[0], @XMM[7] + paddd 0x50(%r11), @XMM[6] # .LADD6 + paddd 0x60(%r11), @XMM[7] # .LADD7 + + # Borrow prologue from _bsaes_encrypt8 to use the opportunity + # to flip byte order in 32-bit counter + movdqa (%rsp), @XMM[9] # round 0 key + lea 0x10(%rsp), %rax # pass key schedule + movdqa -0x10(%r11), @XMM[8] # .LSWPUPM0SR + pxor @XMM[9], @XMM[0] # xor with round0 key + pxor @XMM[9], @XMM[1] + pshufb @XMM[8], @XMM[0] + pxor @XMM[9], @XMM[2] + pshufb @XMM[8], @XMM[1] + pxor @XMM[9], @XMM[3] + pshufb @XMM[8], @XMM[2] + pxor @XMM[9], @XMM[4] + pshufb @XMM[8], @XMM[3] + pxor @XMM[9], @XMM[5] + pshufb @XMM[8], @XMM[4] + pxor @XMM[9], @XMM[6] + pshufb @XMM[8], @XMM[5] + pxor @XMM[9], @XMM[7] + pshufb @XMM[8], @XMM[6] + lea .LBS0(%rip), %r11 # constants table + pshufb @XMM[8], @XMM[7] + mov %ebx,%r10d # pass rounds + + call _bsaes_encrypt8_bitslice + + sub \$8,$len + jc .Lctr_enc_loop_done + + movdqu 0x00($inp), @XMM[8] # load input + movdqu 0x10($inp), @XMM[9] + movdqu 0x20($inp), @XMM[10] + movdqu 0x30($inp), @XMM[11] + movdqu 0x40($inp), @XMM[12] + movdqu 0x50($inp), @XMM[13] + movdqu 0x60($inp), @XMM[14] + movdqu 0x70($inp), @XMM[15] + lea 0x80($inp),$inp + pxor @XMM[0], @XMM[8] + movdqa 0x20(%rbp), @XMM[0] # load counter + pxor @XMM[9], @XMM[1] + movdqu @XMM[8], 0x00($out) # write output + pxor @XMM[10], @XMM[4] + movdqu @XMM[1], 0x10($out) + pxor @XMM[11], @XMM[6] + movdqu @XMM[4], 0x20($out) + pxor @XMM[12], @XMM[3] + movdqu @XMM[6], 0x30($out) + pxor @XMM[13], @XMM[7] + movdqu @XMM[3], 0x40($out) + pxor @XMM[14], @XMM[2] + movdqu @XMM[7], 0x50($out) + pxor @XMM[15], @XMM[5] + movdqu @XMM[2], 0x60($out) + lea .LADD1(%rip), %r11 + movdqu @XMM[5], 0x70($out) + lea 0x80($out), $out + paddd 0x70(%r11), @XMM[0] # .LADD8 + jnz .Lctr_enc_loop + + jmp .Lctr_enc_done +.align 16 +.Lctr_enc_loop_done: + add \$8, $len + movdqu 0x00($inp), @XMM[8] # load input + pxor @XMM[8], @XMM[0] + movdqu @XMM[0], 0x00($out) # write output + cmp \$2,$len + jb .Lctr_enc_done + movdqu 0x10($inp), @XMM[9] + pxor @XMM[9], @XMM[1] + movdqu @XMM[1], 0x10($out) + je .Lctr_enc_done + movdqu 0x20($inp), @XMM[10] + pxor @XMM[10], @XMM[4] + movdqu @XMM[4], 0x20($out) + cmp \$4,$len + jb .Lctr_enc_done + movdqu 0x30($inp), @XMM[11] + pxor @XMM[11], @XMM[6] + movdqu @XMM[6], 0x30($out) + je .Lctr_enc_done + movdqu 0x40($inp), @XMM[12] + pxor @XMM[12], @XMM[3] + movdqu @XMM[3], 0x40($out) + cmp \$6,$len + jb .Lctr_enc_done + movdqu 0x50($inp), @XMM[13] + pxor @XMM[13], @XMM[7] + movdqu @XMM[7], 0x50($out) + je .Lctr_enc_done + movdqu 0x60($inp), @XMM[14] + pxor @XMM[14], @XMM[2] + movdqu @XMM[2], 0x60($out) + jmp .Lctr_enc_done + +.align 16 +.Lctr_enc_short: + lea 0x20(%rbp), $arg1 + lea 0x30(%rbp), $arg2 + lea ($key), $arg3 + call asm_AES_encrypt + movdqu ($inp), @XMM[1] + lea 16($inp), $inp + mov 0x2c(%rbp), %eax # load 32-bit counter + bswap %eax + pxor 0x30(%rbp), @XMM[1] + inc %eax # increment + movdqu @XMM[1], ($out) + bswap %eax + lea 16($out), $out + mov %eax, 0x2c(%rsp) # save 32-bit counter + dec $len + jnz .Lctr_enc_short + +.Lctr_enc_done: + lea (%rsp), %rax + pxor %xmm0, %xmm0 +.Lctr_enc_bzero: # wipe key schedule [if any] + movdqa %xmm0, 0x00(%rax) + movdqa %xmm0, 0x10(%rax) + lea 0x20(%rax), %rax + cmp %rax, %rbp + ja .Lctr_enc_bzero + + lea (%rbp),%rsp # restore %rsp +___ +$code.=<<___ if ($win64); + movaps 0x40(%rbp), %xmm6 + movaps 0x50(%rbp), %xmm7 + movaps 0x60(%rbp), %xmm8 + movaps 0x70(%rbp), %xmm9 + movaps 0x80(%rbp), %xmm10 + movaps 0x90(%rbp), %xmm11 + movaps 0xa0(%rbp), %xmm12 + movaps 0xb0(%rbp), %xmm13 + movaps 0xc0(%rbp), %xmm14 + movaps 0xd0(%rbp), %xmm15 + lea 0xa0(%rbp), %rsp +___ +$code.=<<___; + mov 0x48(%rsp), %r15 + mov 0x50(%rsp), %r14 + mov 0x58(%rsp), %r13 + mov 0x60(%rsp), %r12 + mov 0x68(%rsp), %rbx + mov 0x70(%rsp), %rax + lea 0x78(%rsp), %rsp + mov %rax, %rbp +.Lctr_enc_epilogue: + ret +.size bsaes_ctr32_encrypt_blocks,.-bsaes_ctr32_encrypt_blocks +___ +###################################################################### +# void bsaes_xts_[en|de]crypt(const char *inp,char *out,size_t len, +# const AES_KEY *key1, const AES_KEY *key2, +# const unsigned char iv[16]); +# +my ($twmask,$twres,$twtmp)=@XMM[13..15]; +$code.=<<___; +.globl bsaes_xts_encrypt +.type bsaes_xts_encrypt,\@abi-omnipotent +.align 16 +bsaes_xts_encrypt: + mov %rsp, %rax +.Lxts_enc_prologue: + push %rbp + push %rbx + push %r12 + push %r13 + push %r14 + push %r15 + lea -0x48(%rsp), %rsp +___ +$code.=<<___ if ($win64); + mov 0xa0(%rsp),$arg5 # pull key2 + mov 0xa8(%rsp),$arg6 # pull ivp + lea -0xa0(%rsp), %rsp + movaps %xmm6, 0x40(%rsp) + movaps %xmm7, 0x50(%rsp) + movaps %xmm8, 0x60(%rsp) + movaps %xmm9, 0x70(%rsp) + movaps %xmm10, 0x80(%rsp) + movaps %xmm11, 0x90(%rsp) + movaps %xmm12, 0xa0(%rsp) + movaps %xmm13, 0xb0(%rsp) + movaps %xmm14, 0xc0(%rsp) + movaps %xmm15, 0xd0(%rsp) +.Lxts_enc_body: +___ +$code.=<<___; + mov %rsp, %rbp # backup %rsp + mov $arg1, $inp # backup arguments + mov $arg2, $out + mov $arg3, $len + mov $arg4, $key + + lea ($arg6), $arg1 + lea 0x20(%rbp), $arg2 + lea ($arg5), $arg3 + call asm_AES_encrypt # generate initial tweak + + mov 240($key), %eax # rounds + mov $len, %rbx # backup $len + + mov %eax, %edx # rounds + shl \$7, %rax # 128 bytes per inner round key + sub \$`128-32`, %rax # size of bit-sliced key schedule + sub %rax, %rsp + + mov %rsp, %rax # pass key schedule + mov $key, %rcx # pass key + mov %edx, %r10d # pass rounds + call _bsaes_key_convert + pxor %xmm6, %xmm7 # fix up last round key + movdqa %xmm7, (%rax) # save last round key + + and \$-16, $len + sub \$0x80, %rsp # place for tweak[8] + movdqa 0x20(%rbp), @XMM[7] # initial tweak + + pxor $twtmp, $twtmp + movdqa .Lxts_magic(%rip), $twmask + pcmpgtd @XMM[7], $twtmp # broadcast upper bits + + sub \$0x80, $len + jc .Lxts_enc_short + jmp .Lxts_enc_loop + +.align 16 +.Lxts_enc_loop: +___ + for ($i=0;$i<7;$i++) { + $code.=<<___; + pshufd \$0x13, $twtmp, $twres + pxor $twtmp, $twtmp + movdqa @XMM[7], @XMM[$i] + movdqa @XMM[7], `0x10*$i`(%rsp)# save tweak[$i] + paddq @XMM[7], @XMM[7] # psllq 1,$tweak + pand $twmask, $twres # isolate carry and residue + pcmpgtd @XMM[7], $twtmp # broadcast upper bits + pxor $twres, @XMM[7] +___ + $code.=<<___ if ($i>=1); + movdqu `0x10*($i-1)`($inp), @XMM[8+$i-1] +___ + $code.=<<___ if ($i>=2); + pxor @XMM[8+$i-2], @XMM[$i-2]# input[] ^ tweak[] +___ + } +$code.=<<___; + movdqu 0x60($inp), @XMM[8+6] + pxor @XMM[8+5], @XMM[5] + movdqu 0x70($inp), @XMM[8+7] + lea 0x80($inp), $inp + movdqa @XMM[7], 0x70(%rsp) + pxor @XMM[8+6], @XMM[6] + lea 0x80(%rsp), %rax # pass key schedule + pxor @XMM[8+7], @XMM[7] + mov %edx, %r10d # pass rounds + + call _bsaes_encrypt8 + + pxor 0x00(%rsp), @XMM[0] # ^= tweak[] + pxor 0x10(%rsp), @XMM[1] + movdqu @XMM[0], 0x00($out) # write output + pxor 0x20(%rsp), @XMM[4] + movdqu @XMM[1], 0x10($out) + pxor 0x30(%rsp), @XMM[6] + movdqu @XMM[4], 0x20($out) + pxor 0x40(%rsp), @XMM[3] + movdqu @XMM[6], 0x30($out) + pxor 0x50(%rsp), @XMM[7] + movdqu @XMM[3], 0x40($out) + pxor 0x60(%rsp), @XMM[2] + movdqu @XMM[7], 0x50($out) + pxor 0x70(%rsp), @XMM[5] + movdqu @XMM[2], 0x60($out) + movdqu @XMM[5], 0x70($out) + lea 0x80($out), $out + + movdqa 0x70(%rsp), @XMM[7] # prepare next iteration tweak + pxor $twtmp, $twtmp + movdqa .Lxts_magic(%rip), $twmask + pcmpgtd @XMM[7], $twtmp + pshufd \$0x13, $twtmp, $twres + pxor $twtmp, $twtmp + paddq @XMM[7], @XMM[7] # psllq 1,$tweak + pand $twmask, $twres # isolate carry and residue + pcmpgtd @XMM[7], $twtmp # broadcast upper bits + pxor $twres, @XMM[7] + + sub \$0x80,$len + jnc .Lxts_enc_loop + +.Lxts_enc_short: + add \$0x80, $len + jz .Lxts_enc_done +___ + for ($i=0;$i<7;$i++) { + $code.=<<___; + pshufd \$0x13, $twtmp, $twres + pxor $twtmp, $twtmp + movdqa @XMM[7], @XMM[$i] + movdqa @XMM[7], `0x10*$i`(%rsp)# save tweak[$i] + paddq @XMM[7], @XMM[7] # psllq 1,$tweak + pand $twmask, $twres # isolate carry and residue + pcmpgtd @XMM[7], $twtmp # broadcast upper bits + pxor $twres, @XMM[7] +___ + $code.=<<___ if ($i>=1); + movdqu `0x10*($i-1)`($inp), @XMM[8+$i-1] + cmp \$`0x10*$i`,$len + je .Lxts_enc_$i +___ + $code.=<<___ if ($i>=2); + pxor @XMM[8+$i-2], @XMM[$i-2]# input[] ^ tweak[] +___ + } +$code.=<<___; + movdqu 0x60($inp), @XMM[8+6] + pxor @XMM[8+5], @XMM[5] + movdqa @XMM[7], 0x70(%rsp) + lea 0x70($inp), $inp + pxor @XMM[8+6], @XMM[6] + lea 0x80(%rsp), %rax # pass key schedule + mov %edx, %r10d # pass rounds + + call _bsaes_encrypt8 + + pxor 0x00(%rsp), @XMM[0] # ^= tweak[] + pxor 0x10(%rsp), @XMM[1] + movdqu @XMM[0], 0x00($out) # write output + pxor 0x20(%rsp), @XMM[4] + movdqu @XMM[1], 0x10($out) + pxor 0x30(%rsp), @XMM[6] + movdqu @XMM[4], 0x20($out) + pxor 0x40(%rsp), @XMM[3] + movdqu @XMM[6], 0x30($out) + pxor 0x50(%rsp), @XMM[7] + movdqu @XMM[3], 0x40($out) + pxor 0x60(%rsp), @XMM[2] + movdqu @XMM[7], 0x50($out) + movdqu @XMM[2], 0x60($out) + lea 0x70($out), $out + + movdqa 0x70(%rsp), @XMM[7] # next iteration tweak + jmp .Lxts_enc_done +.align 16 +.Lxts_enc_6: + pxor @XMM[8+4], @XMM[4] + lea 0x60($inp), $inp + pxor @XMM[8+5], @XMM[5] + lea 0x80(%rsp), %rax # pass key schedule + mov %edx, %r10d # pass rounds + + call _bsaes_encrypt8 + + pxor 0x00(%rsp), @XMM[0] # ^= tweak[] + pxor 0x10(%rsp), @XMM[1] + movdqu @XMM[0], 0x00($out) # write output + pxor 0x20(%rsp), @XMM[4] + movdqu @XMM[1], 0x10($out) + pxor 0x30(%rsp), @XMM[6] + movdqu @XMM[4], 0x20($out) + pxor 0x40(%rsp), @XMM[3] + movdqu @XMM[6], 0x30($out) + pxor 0x50(%rsp), @XMM[7] + movdqu @XMM[3], 0x40($out) + movdqu @XMM[7], 0x50($out) + lea 0x60($out), $out + + movdqa 0x60(%rsp), @XMM[7] # next iteration tweak + jmp .Lxts_enc_done +.align 16 +.Lxts_enc_5: + pxor @XMM[8+3], @XMM[3] + lea 0x50($inp), $inp + pxor @XMM[8+4], @XMM[4] + lea 0x80(%rsp), %rax # pass key schedule + mov %edx, %r10d # pass rounds + + call _bsaes_encrypt8 + + pxor 0x00(%rsp), @XMM[0] # ^= tweak[] + pxor 0x10(%rsp), @XMM[1] + movdqu @XMM[0], 0x00($out) # write output + pxor 0x20(%rsp), @XMM[4] + movdqu @XMM[1], 0x10($out) + pxor 0x30(%rsp), @XMM[6] + movdqu @XMM[4], 0x20($out) + pxor 0x40(%rsp), @XMM[3] + movdqu @XMM[6], 0x30($out) + movdqu @XMM[3], 0x40($out) + lea 0x50($out), $out + + movdqa 0x50(%rsp), @XMM[7] # next iteration tweak + jmp .Lxts_enc_done +.align 16 +.Lxts_enc_4: + pxor @XMM[8+2], @XMM[2] + lea 0x40($inp), $inp + pxor @XMM[8+3], @XMM[3] + lea 0x80(%rsp), %rax # pass key schedule + mov %edx, %r10d # pass rounds + + call _bsaes_encrypt8 + + pxor 0x00(%rsp), @XMM[0] # ^= tweak[] + pxor 0x10(%rsp), @XMM[1] + movdqu @XMM[0], 0x00($out) # write output + pxor 0x20(%rsp), @XMM[4] + movdqu @XMM[1], 0x10($out) + pxor 0x30(%rsp), @XMM[6] + movdqu @XMM[4], 0x20($out) + movdqu @XMM[6], 0x30($out) + lea 0x40($out), $out + + movdqa 0x40(%rsp), @XMM[7] # next iteration tweak + jmp .Lxts_enc_done +.align 16 +.Lxts_enc_3: + pxor @XMM[8+1], @XMM[1] + lea 0x30($inp), $inp + pxor @XMM[8+2], @XMM[2] + lea 0x80(%rsp), %rax # pass key schedule + mov %edx, %r10d # pass rounds + + call _bsaes_encrypt8 + + pxor 0x00(%rsp), @XMM[0] # ^= tweak[] + pxor 0x10(%rsp), @XMM[1] + movdqu @XMM[0], 0x00($out) # write output + pxor 0x20(%rsp), @XMM[4] + movdqu @XMM[1], 0x10($out) + movdqu @XMM[4], 0x20($out) + lea 0x30($out), $out + + movdqa 0x30(%rsp), @XMM[7] # next iteration tweak + jmp .Lxts_enc_done +.align 16 +.Lxts_enc_2: + pxor @XMM[8+0], @XMM[0] + lea 0x20($inp), $inp + pxor @XMM[8+1], @XMM[1] + lea 0x80(%rsp), %rax # pass key schedule + mov %edx, %r10d # pass rounds + + call _bsaes_encrypt8 + + pxor 0x00(%rsp), @XMM[0] # ^= tweak[] + pxor 0x10(%rsp), @XMM[1] + movdqu @XMM[0], 0x00($out) # write output + movdqu @XMM[1], 0x10($out) + lea 0x20($out), $out + + movdqa 0x20(%rsp), @XMM[7] # next iteration tweak + jmp .Lxts_enc_done +.align 16 +.Lxts_enc_1: + pxor @XMM[0], @XMM[8] + lea 0x10($inp), $inp + movdqa @XMM[8], 0x20(%rbp) + lea 0x20(%rbp), $arg1 + lea 0x20(%rbp), $arg2 + lea ($key), $arg3 + call asm_AES_encrypt # doesn't touch %xmm + pxor 0x20(%rbp), @XMM[0] # ^= tweak[] + #pxor @XMM[8], @XMM[0] + #lea 0x80(%rsp), %rax # pass key schedule + #mov %edx, %r10d # pass rounds + #call _bsaes_encrypt8 + #pxor 0x00(%rsp), @XMM[0] # ^= tweak[] + movdqu @XMM[0], 0x00($out) # write output + lea 0x10($out), $out + + movdqa 0x10(%rsp), @XMM[7] # next iteration tweak + +.Lxts_enc_done: + and \$15, %ebx + jz .Lxts_enc_ret + mov $out, %rdx + +.Lxts_enc_steal: + movzb ($inp), %eax + movzb -16(%rdx), %ecx + lea 1($inp), $inp + mov %al, -16(%rdx) + mov %cl, 0(%rdx) + lea 1(%rdx), %rdx + sub \$1,%ebx + jnz .Lxts_enc_steal + + movdqu -16($out), @XMM[0] + lea 0x20(%rbp), $arg1 + pxor @XMM[7], @XMM[0] + lea 0x20(%rbp), $arg2 + movdqa @XMM[0], 0x20(%rbp) + lea ($key), $arg3 + call asm_AES_encrypt # doesn't touch %xmm + pxor 0x20(%rbp), @XMM[7] + movdqu @XMM[7], -16($out) + +.Lxts_enc_ret: + lea (%rsp), %rax + pxor %xmm0, %xmm0 +.Lxts_enc_bzero: # wipe key schedule [if any] + movdqa %xmm0, 0x00(%rax) + movdqa %xmm0, 0x10(%rax) + lea 0x20(%rax), %rax + cmp %rax, %rbp + ja .Lxts_enc_bzero + + lea (%rbp),%rsp # restore %rsp +___ +$code.=<<___ if ($win64); + movaps 0x40(%rbp), %xmm6 + movaps 0x50(%rbp), %xmm7 + movaps 0x60(%rbp), %xmm8 + movaps 0x70(%rbp), %xmm9 + movaps 0x80(%rbp), %xmm10 + movaps 0x90(%rbp), %xmm11 + movaps 0xa0(%rbp), %xmm12 + movaps 0xb0(%rbp), %xmm13 + movaps 0xc0(%rbp), %xmm14 + movaps 0xd0(%rbp), %xmm15 + lea 0xa0(%rbp), %rsp +___ +$code.=<<___; + mov 0x48(%rsp), %r15 + mov 0x50(%rsp), %r14 + mov 0x58(%rsp), %r13 + mov 0x60(%rsp), %r12 + mov 0x68(%rsp), %rbx + mov 0x70(%rsp), %rax + lea 0x78(%rsp), %rsp + mov %rax, %rbp +.Lxts_enc_epilogue: + ret +.size bsaes_xts_encrypt,.-bsaes_xts_encrypt + +.globl bsaes_xts_decrypt +.type bsaes_xts_decrypt,\@abi-omnipotent +.align 16 +bsaes_xts_decrypt: + mov %rsp, %rax +.Lxts_dec_prologue: + push %rbp + push %rbx + push %r12 + push %r13 + push %r14 + push %r15 + lea -0x48(%rsp), %rsp +___ +$code.=<<___ if ($win64); + mov 0xa0(%rsp),$arg5 # pull key2 + mov 0xa8(%rsp),$arg6 # pull ivp + lea -0xa0(%rsp), %rsp + movaps %xmm6, 0x40(%rsp) + movaps %xmm7, 0x50(%rsp) + movaps %xmm8, 0x60(%rsp) + movaps %xmm9, 0x70(%rsp) + movaps %xmm10, 0x80(%rsp) + movaps %xmm11, 0x90(%rsp) + movaps %xmm12, 0xa0(%rsp) + movaps %xmm13, 0xb0(%rsp) + movaps %xmm14, 0xc0(%rsp) + movaps %xmm15, 0xd0(%rsp) +.Lxts_dec_body: +___ +$code.=<<___; + mov %rsp, %rbp # backup %rsp + mov $arg1, $inp # backup arguments + mov $arg2, $out + mov $arg3, $len + mov $arg4, $key + + lea ($arg6), $arg1 + lea 0x20(%rbp), $arg2 + lea ($arg5), $arg3 + call asm_AES_encrypt # generate initial tweak + + mov 240($key), %eax # rounds + mov $len, %rbx # backup $len + + mov %eax, %edx # rounds + shl \$7, %rax # 128 bytes per inner round key + sub \$`128-32`, %rax # size of bit-sliced key schedule + sub %rax, %rsp + + mov %rsp, %rax # pass key schedule + mov $key, %rcx # pass key + mov %edx, %r10d # pass rounds + call _bsaes_key_convert + pxor (%rsp), %xmm7 # fix up round 0 key + movdqa %xmm6, (%rax) # save last round key + movdqa %xmm7, (%rsp) + + xor %eax, %eax # if ($len%16) len-=16; + and \$-16, $len + test \$15, %ebx + setnz %al + shl \$4, %rax + sub %rax, $len + + sub \$0x80, %rsp # place for tweak[8] + movdqa 0x20(%rbp), @XMM[7] # initial tweak + + pxor $twtmp, $twtmp + movdqa .Lxts_magic(%rip), $twmask + pcmpgtd @XMM[7], $twtmp # broadcast upper bits + + sub \$0x80, $len + jc .Lxts_dec_short + jmp .Lxts_dec_loop + +.align 16 +.Lxts_dec_loop: +___ + for ($i=0;$i<7;$i++) { + $code.=<<___; + pshufd \$0x13, $twtmp, $twres + pxor $twtmp, $twtmp + movdqa @XMM[7], @XMM[$i] + movdqa @XMM[7], `0x10*$i`(%rsp)# save tweak[$i] + paddq @XMM[7], @XMM[7] # psllq 1,$tweak + pand $twmask, $twres # isolate carry and residue + pcmpgtd @XMM[7], $twtmp # broadcast upper bits + pxor $twres, @XMM[7] +___ + $code.=<<___ if ($i>=1); + movdqu `0x10*($i-1)`($inp), @XMM[8+$i-1] +___ + $code.=<<___ if ($i>=2); + pxor @XMM[8+$i-2], @XMM[$i-2]# input[] ^ tweak[] +___ + } +$code.=<<___; + movdqu 0x60($inp), @XMM[8+6] + pxor @XMM[8+5], @XMM[5] + movdqu 0x70($inp), @XMM[8+7] + lea 0x80($inp), $inp + movdqa @XMM[7], 0x70(%rsp) + pxor @XMM[8+6], @XMM[6] + lea 0x80(%rsp), %rax # pass key schedule + pxor @XMM[8+7], @XMM[7] + mov %edx, %r10d # pass rounds + + call _bsaes_decrypt8 + + pxor 0x00(%rsp), @XMM[0] # ^= tweak[] + pxor 0x10(%rsp), @XMM[1] + movdqu @XMM[0], 0x00($out) # write output + pxor 0x20(%rsp), @XMM[6] + movdqu @XMM[1], 0x10($out) + pxor 0x30(%rsp), @XMM[4] + movdqu @XMM[6], 0x20($out) + pxor 0x40(%rsp), @XMM[2] + movdqu @XMM[4], 0x30($out) + pxor 0x50(%rsp), @XMM[7] + movdqu @XMM[2], 0x40($out) + pxor 0x60(%rsp), @XMM[3] + movdqu @XMM[7], 0x50($out) + pxor 0x70(%rsp), @XMM[5] + movdqu @XMM[3], 0x60($out) + movdqu @XMM[5], 0x70($out) + lea 0x80($out), $out + + movdqa 0x70(%rsp), @XMM[7] # prepare next iteration tweak + pxor $twtmp, $twtmp + movdqa .Lxts_magic(%rip), $twmask + pcmpgtd @XMM[7], $twtmp + pshufd \$0x13, $twtmp, $twres + pxor $twtmp, $twtmp + paddq @XMM[7], @XMM[7] # psllq 1,$tweak + pand $twmask, $twres # isolate carry and residue + pcmpgtd @XMM[7], $twtmp # broadcast upper bits + pxor $twres, @XMM[7] + + sub \$0x80,$len + jnc .Lxts_dec_loop + +.Lxts_dec_short: + add \$0x80, $len + jz .Lxts_dec_done +___ + for ($i=0;$i<7;$i++) { + $code.=<<___; + pshufd \$0x13, $twtmp, $twres + pxor $twtmp, $twtmp + movdqa @XMM[7], @XMM[$i] + movdqa @XMM[7], `0x10*$i`(%rsp)# save tweak[$i] + paddq @XMM[7], @XMM[7] # psllq 1,$tweak + pand $twmask, $twres # isolate carry and residue + pcmpgtd @XMM[7], $twtmp # broadcast upper bits + pxor $twres, @XMM[7] +___ + $code.=<<___ if ($i>=1); + movdqu `0x10*($i-1)`($inp), @XMM[8+$i-1] + cmp \$`0x10*$i`,$len + je .Lxts_dec_$i +___ + $code.=<<___ if ($i>=2); + pxor @XMM[8+$i-2], @XMM[$i-2]# input[] ^ tweak[] +___ + } +$code.=<<___; + movdqu 0x60($inp), @XMM[8+6] + pxor @XMM[8+5], @XMM[5] + movdqa @XMM[7], 0x70(%rsp) + lea 0x70($inp), $inp + pxor @XMM[8+6], @XMM[6] + lea 0x80(%rsp), %rax # pass key schedule + mov %edx, %r10d # pass rounds + + call _bsaes_decrypt8 + + pxor 0x00(%rsp), @XMM[0] # ^= tweak[] + pxor 0x10(%rsp), @XMM[1] + movdqu @XMM[0], 0x00($out) # write output + pxor 0x20(%rsp), @XMM[6] + movdqu @XMM[1], 0x10($out) + pxor 0x30(%rsp), @XMM[4] + movdqu @XMM[6], 0x20($out) + pxor 0x40(%rsp), @XMM[2] + movdqu @XMM[4], 0x30($out) + pxor 0x50(%rsp), @XMM[7] + movdqu @XMM[2], 0x40($out) + pxor 0x60(%rsp), @XMM[3] + movdqu @XMM[7], 0x50($out) + movdqu @XMM[3], 0x60($out) + lea 0x70($out), $out + + movdqa 0x70(%rsp), @XMM[7] # next iteration tweak + jmp .Lxts_dec_done +.align 16 +.Lxts_dec_6: + pxor @XMM[8+4], @XMM[4] + lea 0x60($inp), $inp + pxor @XMM[8+5], @XMM[5] + lea 0x80(%rsp), %rax # pass key schedule + mov %edx, %r10d # pass rounds + + call _bsaes_decrypt8 + + pxor 0x00(%rsp), @XMM[0] # ^= tweak[] + pxor 0x10(%rsp), @XMM[1] + movdqu @XMM[0], 0x00($out) # write output + pxor 0x20(%rsp), @XMM[6] + movdqu @XMM[1], 0x10($out) + pxor 0x30(%rsp), @XMM[4] + movdqu @XMM[6], 0x20($out) + pxor 0x40(%rsp), @XMM[2] + movdqu @XMM[4], 0x30($out) + pxor 0x50(%rsp), @XMM[7] + movdqu @XMM[2], 0x40($out) + movdqu @XMM[7], 0x50($out) + lea 0x60($out), $out + + movdqa 0x60(%rsp), @XMM[7] # next iteration tweak + jmp .Lxts_dec_done +.align 16 +.Lxts_dec_5: + pxor @XMM[8+3], @XMM[3] + lea 0x50($inp), $inp + pxor @XMM[8+4], @XMM[4] + lea 0x80(%rsp), %rax # pass key schedule + mov %edx, %r10d # pass rounds + + call _bsaes_decrypt8 + + pxor 0x00(%rsp), @XMM[0] # ^= tweak[] + pxor 0x10(%rsp), @XMM[1] + movdqu @XMM[0], 0x00($out) # write output + pxor 0x20(%rsp), @XMM[6] + movdqu @XMM[1], 0x10($out) + pxor 0x30(%rsp), @XMM[4] + movdqu @XMM[6], 0x20($out) + pxor 0x40(%rsp), @XMM[2] + movdqu @XMM[4], 0x30($out) + movdqu @XMM[2], 0x40($out) + lea 0x50($out), $out + + movdqa 0x50(%rsp), @XMM[7] # next iteration tweak + jmp .Lxts_dec_done +.align 16 +.Lxts_dec_4: + pxor @XMM[8+2], @XMM[2] + lea 0x40($inp), $inp + pxor @XMM[8+3], @XMM[3] + lea 0x80(%rsp), %rax # pass key schedule + mov %edx, %r10d # pass rounds + + call _bsaes_decrypt8 + + pxor 0x00(%rsp), @XMM[0] # ^= tweak[] + pxor 0x10(%rsp), @XMM[1] + movdqu @XMM[0], 0x00($out) # write output + pxor 0x20(%rsp), @XMM[6] + movdqu @XMM[1], 0x10($out) + pxor 0x30(%rsp), @XMM[4] + movdqu @XMM[6], 0x20($out) + movdqu @XMM[4], 0x30($out) + lea 0x40($out), $out + + movdqa 0x40(%rsp), @XMM[7] # next iteration tweak + jmp .Lxts_dec_done +.align 16 +.Lxts_dec_3: + pxor @XMM[8+1], @XMM[1] + lea 0x30($inp), $inp + pxor @XMM[8+2], @XMM[2] + lea 0x80(%rsp), %rax # pass key schedule + mov %edx, %r10d # pass rounds + + call _bsaes_decrypt8 + + pxor 0x00(%rsp), @XMM[0] # ^= tweak[] + pxor 0x10(%rsp), @XMM[1] + movdqu @XMM[0], 0x00($out) # write output + pxor 0x20(%rsp), @XMM[6] + movdqu @XMM[1], 0x10($out) + movdqu @XMM[6], 0x20($out) + lea 0x30($out), $out + + movdqa 0x30(%rsp), @XMM[7] # next iteration tweak + jmp .Lxts_dec_done +.align 16 +.Lxts_dec_2: + pxor @XMM[8+0], @XMM[0] + lea 0x20($inp), $inp + pxor @XMM[8+1], @XMM[1] + lea 0x80(%rsp), %rax # pass key schedule + mov %edx, %r10d # pass rounds + + call _bsaes_decrypt8 + + pxor 0x00(%rsp), @XMM[0] # ^= tweak[] + pxor 0x10(%rsp), @XMM[1] + movdqu @XMM[0], 0x00($out) # write output + movdqu @XMM[1], 0x10($out) + lea 0x20($out), $out + + movdqa 0x20(%rsp), @XMM[7] # next iteration tweak + jmp .Lxts_dec_done +.align 16 +.Lxts_dec_1: + pxor @XMM[0], @XMM[8] + lea 0x10($inp), $inp + movdqa @XMM[8], 0x20(%rbp) + lea 0x20(%rbp), $arg1 + lea 0x20(%rbp), $arg2 + lea ($key), $arg3 + call asm_AES_decrypt # doesn't touch %xmm + pxor 0x20(%rbp), @XMM[0] # ^= tweak[] + #pxor @XMM[8], @XMM[0] + #lea 0x80(%rsp), %rax # pass key schedule + #mov %edx, %r10d # pass rounds + #call _bsaes_decrypt8 + #pxor 0x00(%rsp), @XMM[0] # ^= tweak[] + movdqu @XMM[0], 0x00($out) # write output + lea 0x10($out), $out + + movdqa 0x10(%rsp), @XMM[7] # next iteration tweak + +.Lxts_dec_done: + and \$15, %ebx + jz .Lxts_dec_ret + + pxor $twtmp, $twtmp + movdqa .Lxts_magic(%rip), $twmask + pcmpgtd @XMM[7], $twtmp + pshufd \$0x13, $twtmp, $twres + movdqa @XMM[7], @XMM[6] + paddq @XMM[7], @XMM[7] # psllq 1,$tweak + pand $twmask, $twres # isolate carry and residue + movdqu ($inp), @XMM[0] + pxor $twres, @XMM[7] + + lea 0x20(%rbp), $arg1 + pxor @XMM[7], @XMM[0] + lea 0x20(%rbp), $arg2 + movdqa @XMM[0], 0x20(%rbp) + lea ($key), $arg3 + call asm_AES_decrypt # doesn't touch %xmm + pxor 0x20(%rbp), @XMM[7] + mov $out, %rdx + movdqu @XMM[7], ($out) + +.Lxts_dec_steal: + movzb 16($inp), %eax + movzb (%rdx), %ecx + lea 1($inp), $inp + mov %al, (%rdx) + mov %cl, 16(%rdx) + lea 1(%rdx), %rdx + sub \$1,%ebx + jnz .Lxts_dec_steal + + movdqu ($out), @XMM[0] + lea 0x20(%rbp), $arg1 + pxor @XMM[6], @XMM[0] + lea 0x20(%rbp), $arg2 + movdqa @XMM[0], 0x20(%rbp) + lea ($key), $arg3 + call asm_AES_decrypt # doesn't touch %xmm + pxor 0x20(%rbp), @XMM[6] + movdqu @XMM[6], ($out) + +.Lxts_dec_ret: + lea (%rsp), %rax + pxor %xmm0, %xmm0 +.Lxts_dec_bzero: # wipe key schedule [if any] + movdqa %xmm0, 0x00(%rax) + movdqa %xmm0, 0x10(%rax) + lea 0x20(%rax), %rax + cmp %rax, %rbp + ja .Lxts_dec_bzero + + lea (%rbp),%rsp # restore %rsp +___ +$code.=<<___ if ($win64); + movaps 0x40(%rbp), %xmm6 + movaps 0x50(%rbp), %xmm7 + movaps 0x60(%rbp), %xmm8 + movaps 0x70(%rbp), %xmm9 + movaps 0x80(%rbp), %xmm10 + movaps 0x90(%rbp), %xmm11 + movaps 0xa0(%rbp), %xmm12 + movaps 0xb0(%rbp), %xmm13 + movaps 0xc0(%rbp), %xmm14 + movaps 0xd0(%rbp), %xmm15 + lea 0xa0(%rbp), %rsp +___ +$code.=<<___; + mov 0x48(%rsp), %r15 + mov 0x50(%rsp), %r14 + mov 0x58(%rsp), %r13 + mov 0x60(%rsp), %r12 + mov 0x68(%rsp), %rbx + mov 0x70(%rsp), %rax + lea 0x78(%rsp), %rsp + mov %rax, %rbp +.Lxts_dec_epilogue: + ret +.size bsaes_xts_decrypt,.-bsaes_xts_decrypt +___ +} +$code.=<<___; +.type _bsaes_const,\@object +.align 64 +_bsaes_const: +.LM0ISR: # InvShiftRows constants + .quad 0x0a0e0206070b0f03, 0x0004080c0d010509 +.LISRM0: + .quad 0x01040b0e0205080f, 0x0306090c00070a0d +.LISR: + .quad 0x0504070602010003, 0x0f0e0d0c080b0a09 +.LBS0: # bit-slice constants + .quad 0x5555555555555555, 0x5555555555555555 +.LBS1: + .quad 0x3333333333333333, 0x3333333333333333 +.LBS2: + .quad 0x0f0f0f0f0f0f0f0f, 0x0f0f0f0f0f0f0f0f +.LSR: # shiftrows constants + .quad 0x0504070600030201, 0x0f0e0d0c0a09080b +.LSRM0: + .quad 0x0304090e00050a0f, 0x01060b0c0207080d +.LM0: + .quad 0x02060a0e03070b0f, 0x0004080c0105090d +.LM0SR: + .quad 0x0a0e02060f03070b, 0x0004080c05090d01 +.LNOT: # magic constants + .quad 0xffffffffffffffff, 0xffffffffffffffff +.L63: + .quad 0x6363636363636363, 0x6363636363636363 +.LSWPUP: # byte-swap upper dword + .quad 0x0706050403020100, 0x0c0d0e0f0b0a0908 +.LSWPUPM0SR: + .quad 0x0a0d02060c03070b, 0x0004080f05090e01 +.LADD1: # counter increment constants + .quad 0x0000000000000000, 0x0000000100000000 +.LADD2: + .quad 0x0000000000000000, 0x0000000200000000 +.LADD3: + .quad 0x0000000000000000, 0x0000000300000000 +.LADD4: + .quad 0x0000000000000000, 0x0000000400000000 +.LADD5: + .quad 0x0000000000000000, 0x0000000500000000 +.LADD6: + .quad 0x0000000000000000, 0x0000000600000000 +.LADD7: + .quad 0x0000000000000000, 0x0000000700000000 +.LADD8: + .quad 0x0000000000000000, 0x0000000800000000 +.Lxts_magic: + .long 0x87,0,1,0 +.asciz "Bit-sliced AES for x86_64/SSSE3, Emilia Käsper, Peter Schwabe, Andy Polyakov" +.align 64 +.size _bsaes_const,.-_bsaes_const +___ + +# EXCEPTION_DISPOSITION handler (EXCEPTION_RECORD *rec,ULONG64 frame, +# CONTEXT *context,DISPATCHER_CONTEXT *disp) +if ($win64) { +$rec="%rcx"; +$frame="%rdx"; +$context="%r8"; +$disp="%r9"; + +$code.=<<___; +.extern __imp_RtlVirtualUnwind +.type se_handler,\@abi-omnipotent +.align 16 +se_handler: + push %rsi + push %rdi + push %rbx + push %rbp + push %r12 + push %r13 + push %r14 + push %r15 + pushfq + sub \$64,%rsp + + mov 120($context),%rax # pull context->Rax + mov 248($context),%rbx # pull context->Rip + + mov 8($disp),%rsi # disp->ImageBase + mov 56($disp),%r11 # disp->HandlerData + + mov 0(%r11),%r10d # HandlerData[0] + lea (%rsi,%r10),%r10 # prologue label + cmp %r10,%rbx # context->RipRsp + + mov 4(%r11),%r10d # HandlerData[1] + lea (%rsi,%r10),%r10 # epilogue label + cmp %r10,%rbx # context->Rip>=epilogue label + jae .Lin_prologue + + mov 160($context),%rax # pull context->Rbp + + lea 0x40(%rax),%rsi # %xmm save area + lea 512($context),%rdi # &context.Xmm6 + mov \$20,%ecx # 10*sizeof(%xmm0)/sizeof(%rax) + .long 0xa548f3fc # cld; rep movsq + lea 0xa0(%rax),%rax # adjust stack pointer + + mov 0x70(%rax),%rbp + mov 0x68(%rax),%rbx + mov 0x60(%rax),%r12 + mov 0x58(%rax),%r13 + mov 0x50(%rax),%r14 + mov 0x48(%rax),%r15 + lea 0x78(%rax),%rax # adjust stack pointer + mov %rbx,144($context) # restore context->Rbx + mov %rbp,160($context) # restore context->Rbp + mov %r12,216($context) # restore context->R12 + mov %r13,224($context) # restore context->R13 + mov %r14,232($context) # restore context->R14 + mov %r15,240($context) # restore context->R15 + +.Lin_prologue: + mov %rax,152($context) # restore context->Rsp + + mov 40($disp),%rdi # disp->ContextRecord + mov $context,%rsi # context + mov \$`1232/8`,%ecx # sizeof(CONTEXT) + .long 0xa548f3fc # cld; rep movsq + + mov $disp,%rsi + xor %rcx,%rcx # arg1, UNW_FLAG_NHANDLER + mov 8(%rsi),%rdx # arg2, disp->ImageBase + mov 0(%rsi),%r8 # arg3, disp->ControlPc + mov 16(%rsi),%r9 # arg4, disp->FunctionEntry + mov 40(%rsi),%r10 # disp->ContextRecord + lea 56(%rsi),%r11 # &disp->HandlerData + lea 24(%rsi),%r12 # &disp->EstablisherFrame + mov %r10,32(%rsp) # arg5 + mov %r11,40(%rsp) # arg6 + mov %r12,48(%rsp) # arg7 + mov %rcx,56(%rsp) # arg8, (NULL) + call *__imp_RtlVirtualUnwind(%rip) + + mov \$1,%eax # ExceptionContinueSearch + add \$64,%rsp + popfq + pop %r15 + pop %r14 + pop %r13 + pop %r12 + pop %rbp + pop %rbx + pop %rdi + pop %rsi + ret +.size se_handler,.-se_handler + +.section .pdata +.align 4 +___ +$code.=<<___ if ($ecb); + .rva .Lecb_enc_prologue + .rva .Lecb_enc_epilogue + .rva .Lecb_enc_info + + .rva .Lecb_dec_prologue + .rva .Lecb_dec_epilogue + .rva .Lecb_dec_info +___ +$code.=<<___; + .rva .Lcbc_dec_prologue + .rva .Lcbc_dec_epilogue + .rva .Lcbc_dec_info + + .rva .Lctr_enc_prologue + .rva .Lctr_enc_epilogue + .rva .Lctr_enc_info + + .rva .Lxts_enc_prologue + .rva .Lxts_enc_epilogue + .rva .Lxts_enc_info + + .rva .Lxts_dec_prologue + .rva .Lxts_dec_epilogue + .rva .Lxts_dec_info + +.section .xdata +.align 8 +___ +$code.=<<___ if ($ecb); +.Lecb_enc_info: + .byte 9,0,0,0 + .rva se_handler + .rva .Lecb_enc_body,.Lecb_enc_epilogue # HandlerData[] +.Lecb_dec_info: + .byte 9,0,0,0 + .rva se_handler + .rva .Lecb_dec_body,.Lecb_dec_epilogue # HandlerData[] +___ +$code.=<<___; +.Lcbc_dec_info: + .byte 9,0,0,0 + .rva se_handler + .rva .Lcbc_dec_body,.Lcbc_dec_epilogue # HandlerData[] +.Lctr_enc_info: + .byte 9,0,0,0 + .rva se_handler + .rva .Lctr_enc_body,.Lctr_enc_epilogue # HandlerData[] +.Lxts_enc_info: + .byte 9,0,0,0 + .rva se_handler + .rva .Lxts_enc_body,.Lxts_enc_epilogue # HandlerData[] +.Lxts_dec_info: + .byte 9,0,0,0 + .rva se_handler + .rva .Lxts_dec_body,.Lxts_dec_epilogue # HandlerData[] +___ +} + +$code =~ s/\`([^\`]*)\`/eval($1)/gem; + +print $code; + +close STDOUT; diff --git a/crypto/openssl/crypto/aes/asm/vpaes-x86.pl b/crypto/openssl/crypto/aes/asm/vpaes-x86.pl new file mode 100644 index 0000000000..84a6f6d336 --- /dev/null +++ b/crypto/openssl/crypto/aes/asm/vpaes-x86.pl @@ -0,0 +1,901 @@ +#!/usr/bin/env perl + +###################################################################### +## Constant-time SSSE3 AES core implementation. +## version 0.1 +## +## By Mike Hamburg (Stanford University), 2009 +## Public domain. +## +## For details see http://shiftleft.org/papers/vector_aes/ and +## http://crypto.stanford.edu/vpaes/. + +###################################################################### +# September 2011. +# +# Port vpaes-x86_64.pl as 32-bit "almost" drop-in replacement for +# aes-586.pl. "Almost" refers to the fact that AES_cbc_encrypt +# doesn't handle partial vectors (doesn't have to if called from +# EVP only). "Drop-in" implies that this module doesn't share key +# schedule structure with the original nor does it make assumption +# about its alignment... +# +# Performance summary. aes-586.pl column lists large-block CBC +# encrypt/decrypt/with-hyper-threading-off(*) results in cycles per +# byte processed with 128-bit key, and vpaes-x86.pl column - [also +# large-block CBC] encrypt/decrypt. +# +# aes-586.pl vpaes-x86.pl +# +# Core 2(**) 29.1/42.3/18.3 22.0/25.6(***) +# Nehalem 27.9/40.4/18.1 10.3/12.0 +# Atom 102./119./60.1 64.5/85.3(***) +# +# (*) "Hyper-threading" in the context refers rather to cache shared +# among multiple cores, than to specifically Intel HTT. As vast +# majority of contemporary cores share cache, slower code path +# is common place. In other words "with-hyper-threading-off" +# results are presented mostly for reference purposes. +# +# (**) "Core 2" refers to initial 65nm design, a.k.a. Conroe. +# +# (***) Less impressive improvement on Core 2 and Atom is due to slow +# pshufb, yet it's respectable +32%/65% improvement on Core 2 +# and +58%/40% on Atom (as implied, over "hyper-threading-safe" +# code path). +# +# + +$0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1; +push(@INC,"${dir}","${dir}../../perlasm"); +require "x86asm.pl"; + +&asm_init($ARGV[0],"vpaes-x86.pl",$x86only = $ARGV[$#ARGV] eq "386"); + +$PREFIX="vpaes"; + +my ($round, $base, $magic, $key, $const, $inp, $out)= + ("eax", "ebx", "ecx", "edx","ebp", "esi","edi"); + +&static_label("_vpaes_consts"); +&static_label("_vpaes_schedule_low_round"); + +&set_label("_vpaes_consts",64); +$k_inv=-0x30; # inv, inva + &data_word(0x0D080180,0x0E05060F,0x0A0B0C02,0x04070309); + &data_word(0x0F0B0780,0x01040A06,0x02050809,0x030D0E0C); + +$k_s0F=-0x10; # s0F + &data_word(0x0F0F0F0F,0x0F0F0F0F,0x0F0F0F0F,0x0F0F0F0F); + +$k_ipt=0x00; # input transform (lo, hi) + &data_word(0x5A2A7000,0xC2B2E898,0x52227808,0xCABAE090); + &data_word(0x317C4D00,0x4C01307D,0xB0FDCC81,0xCD80B1FC); + +$k_sb1=0x20; # sb1u, sb1t + &data_word(0xCB503E00,0xB19BE18F,0x142AF544,0xA5DF7A6E); + &data_word(0xFAE22300,0x3618D415,0x0D2ED9EF,0x3BF7CCC1); +$k_sb2=0x40; # sb2u, sb2t + &data_word(0x0B712400,0xE27A93C6,0xBC982FCD,0x5EB7E955); + &data_word(0x0AE12900,0x69EB8840,0xAB82234A,0xC2A163C8); +$k_sbo=0x60; # sbou, sbot + &data_word(0x6FBDC700,0xD0D26D17,0xC502A878,0x15AABF7A); + &data_word(0x5FBB6A00,0xCFE474A5,0x412B35FA,0x8E1E90D1); + +$k_mc_forward=0x80; # mc_forward + &data_word(0x00030201,0x04070605,0x080B0A09,0x0C0F0E0D); + &data_word(0x04070605,0x080B0A09,0x0C0F0E0D,0x00030201); + &data_word(0x080B0A09,0x0C0F0E0D,0x00030201,0x04070605); + &data_word(0x0C0F0E0D,0x00030201,0x04070605,0x080B0A09); + +$k_mc_backward=0xc0; # mc_backward + &data_word(0x02010003,0x06050407,0x0A09080B,0x0E0D0C0F); + &data_word(0x0E0D0C0F,0x02010003,0x06050407,0x0A09080B); + &data_word(0x0A09080B,0x0E0D0C0F,0x02010003,0x06050407); + &data_word(0x06050407,0x0A09080B,0x0E0D0C0F,0x02010003); + +$k_sr=0x100; # sr + &data_word(0x03020100,0x07060504,0x0B0A0908,0x0F0E0D0C); + &data_word(0x0F0A0500,0x030E0904,0x07020D08,0x0B06010C); + &data_word(0x0B020900,0x0F060D04,0x030A0108,0x070E050C); + &data_word(0x070A0D00,0x0B0E0104,0x0F020508,0x0306090C); + +$k_rcon=0x140; # rcon + &data_word(0xAF9DEEB6,0x1F8391B9,0x4D7C7D81,0x702A9808); + +$k_s63=0x150; # s63: all equal to 0x63 transformed + &data_word(0x5B5B5B5B,0x5B5B5B5B,0x5B5B5B5B,0x5B5B5B5B); + +$k_opt=0x160; # output transform + &data_word(0xD6B66000,0xFF9F4929,0xDEBE6808,0xF7974121); + &data_word(0x50BCEC00,0x01EDBD51,0xB05C0CE0,0xE10D5DB1); + +$k_deskew=0x180; # deskew tables: inverts the sbox's "skew" + &data_word(0x47A4E300,0x07E4A340,0x5DBEF91A,0x1DFEB95A); + &data_word(0x83EA6900,0x5F36B5DC,0xF49D1E77,0x2841C2AB); +## +## Decryption stuff +## Key schedule constants +## +$k_dksd=0x1a0; # decryption key schedule: invskew x*D + &data_word(0xA3E44700,0xFEB91A5D,0x5A1DBEF9,0x0740E3A4); + &data_word(0xB5368300,0x41C277F4,0xAB289D1E,0x5FDC69EA); +$k_dksb=0x1c0; # decryption key schedule: invskew x*B + &data_word(0x8550D500,0x9A4FCA1F,0x1CC94C99,0x03D65386); + &data_word(0xB6FC4A00,0x115BEDA7,0x7E3482C8,0xD993256F); +$k_dkse=0x1e0; # decryption key schedule: invskew x*E + 0x63 + &data_word(0x1FC9D600,0xD5031CCA,0x994F5086,0x53859A4C); + &data_word(0x4FDC7BE8,0xA2319605,0x20B31487,0xCD5EF96A); +$k_dks9=0x200; # decryption key schedule: invskew x*9 + &data_word(0x7ED9A700,0xB6116FC8,0x82255BFC,0x4AED9334); + &data_word(0x27143300,0x45765162,0xE9DAFDCE,0x8BB89FAC); + +## +## Decryption stuff +## Round function constants +## +$k_dipt=0x220; # decryption input transform + &data_word(0x0B545F00,0x0F505B04,0x114E451A,0x154A411E); + &data_word(0x60056500,0x86E383E6,0xF491F194,0x12771772); + +$k_dsb9=0x240; # decryption sbox output *9*u, *9*t + &data_word(0x9A86D600,0x851C0353,0x4F994CC9,0xCAD51F50); + &data_word(0xECD74900,0xC03B1789,0xB2FBA565,0x725E2C9E); +$k_dsbd=0x260; # decryption sbox output *D*u, *D*t + &data_word(0xE6B1A200,0x7D57CCDF,0x882A4439,0xF56E9B13); + &data_word(0x24C6CB00,0x3CE2FAF7,0x15DEEFD3,0x2931180D); +$k_dsbb=0x280; # decryption sbox output *B*u, *B*t + &data_word(0x96B44200,0xD0226492,0xB0F2D404,0x602646F6); + &data_word(0xCD596700,0xC19498A6,0x3255AA6B,0xF3FF0C3E); +$k_dsbe=0x2a0; # decryption sbox output *E*u, *E*t + &data_word(0x26D4D000,0x46F29296,0x64B4F6B0,0x22426004); + &data_word(0xFFAAC100,0x0C55A6CD,0x98593E32,0x9467F36B); +$k_dsbo=0x2c0; # decryption sbox final output + &data_word(0x7EF94000,0x1387EA53,0xD4943E2D,0xC7AA6DB9); + &data_word(0x93441D00,0x12D7560F,0xD8C58E9C,0xCA4B8159); +&asciz ("Vector Permutation AES for x86/SSSE3, Mike Hamburg (Stanford University)"); +&align (64); + +&function_begin_B("_vpaes_preheat"); + &add ($const,&DWP(0,"esp")); + &movdqa ("xmm7",&QWP($k_inv,$const)); + &movdqa ("xmm6",&QWP($k_s0F,$const)); + &ret (); +&function_end_B("_vpaes_preheat"); + +## +## _aes_encrypt_core +## +## AES-encrypt %xmm0. +## +## Inputs: +## %xmm0 = input +## %xmm6-%xmm7 as in _vpaes_preheat +## (%edx) = scheduled keys +## +## Output in %xmm0 +## Clobbers %xmm1-%xmm5, %eax, %ebx, %ecx, %edx +## +## +&function_begin_B("_vpaes_encrypt_core"); + &mov ($magic,16); + &mov ($round,&DWP(240,$key)); + &movdqa ("xmm1","xmm6") + &movdqa ("xmm2",&QWP($k_ipt,$const)); + &pandn ("xmm1","xmm0"); + &movdqu ("xmm5",&QWP(0,$key)); + &psrld ("xmm1",4); + &pand ("xmm0","xmm6"); + &pshufb ("xmm2","xmm0"); + &movdqa ("xmm0",&QWP($k_ipt+16,$const)); + &pshufb ("xmm0","xmm1"); + &pxor ("xmm2","xmm5"); + &pxor ("xmm0","xmm2"); + &add ($key,16); + &lea ($base,&DWP($k_mc_backward,$const)); + &jmp (&label("enc_entry")); + + +&set_label("enc_loop",16); + # middle of middle round + &movdqa ("xmm4",&QWP($k_sb1,$const)); # 4 : sb1u + &pshufb ("xmm4","xmm2"); # 4 = sb1u + &pxor ("xmm4","xmm5"); # 4 = sb1u + k + &movdqa ("xmm0",&QWP($k_sb1+16,$const));# 0 : sb1t + &pshufb ("xmm0","xmm3"); # 0 = sb1t + &pxor ("xmm0","xmm4"); # 0 = A + &movdqa ("xmm5",&QWP($k_sb2,$const)); # 4 : sb2u + &pshufb ("xmm5","xmm2"); # 4 = sb2u + &movdqa ("xmm1",&QWP(-0x40,$base,$magic));# .Lk_mc_forward[] + &movdqa ("xmm2",&QWP($k_sb2+16,$const));# 2 : sb2t + &pshufb ("xmm2","xmm3"); # 2 = sb2t + &pxor ("xmm2","xmm5"); # 2 = 2A + &movdqa ("xmm4",&QWP(0,$base,$magic)); # .Lk_mc_backward[] + &movdqa ("xmm3","xmm0"); # 3 = A + &pshufb ("xmm0","xmm1"); # 0 = B + &add ($key,16); # next key + &pxor ("xmm0","xmm2"); # 0 = 2A+B + &pshufb ("xmm3","xmm4"); # 3 = D + &add ($magic,16); # next mc + &pxor ("xmm3","xmm0"); # 3 = 2A+B+D + &pshufb ("xmm0","xmm1"); # 0 = 2B+C + &and ($magic,0x30); # ... mod 4 + &pxor ("xmm0","xmm3"); # 0 = 2A+3B+C+D + &sub ($round,1); # nr-- + +&set_label("enc_entry"); + # top of round + &movdqa ("xmm1","xmm6"); # 1 : i + &pandn ("xmm1","xmm0"); # 1 = i<<4 + &psrld ("xmm1",4); # 1 = i + &pand ("xmm0","xmm6"); # 0 = k + &movdqa ("xmm5",&QWP($k_inv+16,$const));# 2 : a/k + &pshufb ("xmm5","xmm0"); # 2 = a/k + &pxor ("xmm0","xmm1"); # 0 = j + &movdqa ("xmm3","xmm7"); # 3 : 1/i + &pshufb ("xmm3","xmm1"); # 3 = 1/i + &pxor ("xmm3","xmm5"); # 3 = iak = 1/i + a/k + &movdqa ("xmm4","xmm7"); # 4 : 1/j + &pshufb ("xmm4","xmm0"); # 4 = 1/j + &pxor ("xmm4","xmm5"); # 4 = jak = 1/j + a/k + &movdqa ("xmm2","xmm7"); # 2 : 1/iak + &pshufb ("xmm2","xmm3"); # 2 = 1/iak + &pxor ("xmm2","xmm0"); # 2 = io + &movdqa ("xmm3","xmm7"); # 3 : 1/jak + &movdqu ("xmm5",&QWP(0,$key)); + &pshufb ("xmm3","xmm4"); # 3 = 1/jak + &pxor ("xmm3","xmm1"); # 3 = jo + &jnz (&label("enc_loop")); + + # middle of last round + &movdqa ("xmm4",&QWP($k_sbo,$const)); # 3 : sbou .Lk_sbo + &movdqa ("xmm0",&QWP($k_sbo+16,$const));# 3 : sbot .Lk_sbo+16 + &pshufb ("xmm4","xmm2"); # 4 = sbou + &pxor ("xmm4","xmm5"); # 4 = sb1u + k + &pshufb ("xmm0","xmm3"); # 0 = sb1t + &movdqa ("xmm1",&QWP(0x40,$base,$magic));# .Lk_sr[] + &pxor ("xmm0","xmm4"); # 0 = A + &pshufb ("xmm0","xmm1"); + &ret (); +&function_end_B("_vpaes_encrypt_core"); + +## +## Decryption core +## +## Same API as encryption core. +## +&function_begin_B("_vpaes_decrypt_core"); + &mov ($round,&DWP(240,$key)); + &lea ($base,&DWP($k_dsbd,$const)); + &movdqa ("xmm1","xmm6"); + &movdqa ("xmm2",&QWP($k_dipt-$k_dsbd,$base)); + &pandn ("xmm1","xmm0"); + &mov ($magic,$round); + &psrld ("xmm1",4) + &movdqu ("xmm5",&QWP(0,$key)); + &shl ($magic,4); + &pand ("xmm0","xmm6"); + &pshufb ("xmm2","xmm0"); + &movdqa ("xmm0",&QWP($k_dipt-$k_dsbd+16,$base)); + &xor ($magic,0x30); + &pshufb ("xmm0","xmm1"); + &and ($magic,0x30); + &pxor ("xmm2","xmm5"); + &movdqa ("xmm5",&QWP($k_mc_forward+48,$const)); + &pxor ("xmm0","xmm2"); + &add ($key,16); + &lea ($magic,&DWP($k_sr-$k_dsbd,$base,$magic)); + &jmp (&label("dec_entry")); + +&set_label("dec_loop",16); +## +## Inverse mix columns +## + &movdqa ("xmm4",&QWP(-0x20,$base)); # 4 : sb9u + &pshufb ("xmm4","xmm2"); # 4 = sb9u + &pxor ("xmm4","xmm0"); + &movdqa ("xmm0",&QWP(-0x10,$base)); # 0 : sb9t + &pshufb ("xmm0","xmm3"); # 0 = sb9t + &pxor ("xmm0","xmm4"); # 0 = ch + &add ($key,16); # next round key + + &pshufb ("xmm0","xmm5"); # MC ch + &movdqa ("xmm4",&QWP(0,$base)); # 4 : sbdu + &pshufb ("xmm4","xmm2"); # 4 = sbdu + &pxor ("xmm4","xmm0"); # 4 = ch + &movdqa ("xmm0",&QWP(0x10,$base)); # 0 : sbdt + &pshufb ("xmm0","xmm3"); # 0 = sbdt + &pxor ("xmm0","xmm4"); # 0 = ch + &sub ($round,1); # nr-- + + &pshufb ("xmm0","xmm5"); # MC ch + &movdqa ("xmm4",&QWP(0x20,$base)); # 4 : sbbu + &pshufb ("xmm4","xmm2"); # 4 = sbbu + &pxor ("xmm4","xmm0"); # 4 = ch + &movdqa ("xmm0",&QWP(0x30,$base)); # 0 : sbbt + &pshufb ("xmm0","xmm3"); # 0 = sbbt + &pxor ("xmm0","xmm4"); # 0 = ch + + &pshufb ("xmm0","xmm5"); # MC ch + &movdqa ("xmm4",&QWP(0x40,$base)); # 4 : sbeu + &pshufb ("xmm4","xmm2"); # 4 = sbeu + &pxor ("xmm4","xmm0"); # 4 = ch + &movdqa ("xmm0",&QWP(0x50,$base)); # 0 : sbet + &pshufb ("xmm0","xmm3"); # 0 = sbet + &pxor ("xmm0","xmm4"); # 0 = ch + + &palignr("xmm5","xmm5",12); + +&set_label("dec_entry"); + # top of round + &movdqa ("xmm1","xmm6"); # 1 : i + &pandn ("xmm1","xmm0"); # 1 = i<<4 + &psrld ("xmm1",4); # 1 = i + &pand ("xmm0","xmm6"); # 0 = k + &movdqa ("xmm2",&QWP($k_inv+16,$const));# 2 : a/k + &pshufb ("xmm2","xmm0"); # 2 = a/k + &pxor ("xmm0","xmm1"); # 0 = j + &movdqa ("xmm3","xmm7"); # 3 : 1/i + &pshufb ("xmm3","xmm1"); # 3 = 1/i + &pxor ("xmm3","xmm2"); # 3 = iak = 1/i + a/k + &movdqa ("xmm4","xmm7"); # 4 : 1/j + &pshufb ("xmm4","xmm0"); # 4 = 1/j + &pxor ("xmm4","xmm2"); # 4 = jak = 1/j + a/k + &movdqa ("xmm2","xmm7"); # 2 : 1/iak + &pshufb ("xmm2","xmm3"); # 2 = 1/iak + &pxor ("xmm2","xmm0"); # 2 = io + &movdqa ("xmm3","xmm7"); # 3 : 1/jak + &pshufb ("xmm3","xmm4"); # 3 = 1/jak + &pxor ("xmm3","xmm1"); # 3 = jo + &movdqu ("xmm0",&QWP(0,$key)); + &jnz (&label("dec_loop")); + + # middle of last round + &movdqa ("xmm4",&QWP(0x60,$base)); # 3 : sbou + &pshufb ("xmm4","xmm2"); # 4 = sbou + &pxor ("xmm4","xmm0"); # 4 = sb1u + k + &movdqa ("xmm0",&QWP(0x70,$base)); # 0 : sbot + &movdqa ("xmm2",&QWP(0,$magic)); + &pshufb ("xmm0","xmm3"); # 0 = sb1t + &pxor ("xmm0","xmm4"); # 0 = A + &pshufb ("xmm0","xmm2"); + &ret (); +&function_end_B("_vpaes_decrypt_core"); + +######################################################## +## ## +## AES key schedule ## +## ## +######################################################## +&function_begin_B("_vpaes_schedule_core"); + &add ($const,&DWP(0,"esp")); + &movdqu ("xmm0",&QWP(0,$inp)); # load key (unaligned) + &movdqa ("xmm2",&QWP($k_rcon,$const)); # load rcon + + # input transform + &movdqa ("xmm3","xmm0"); + &lea ($base,&DWP($k_ipt,$const)); + &movdqa (&QWP(4,"esp"),"xmm2"); # xmm8 + &call ("_vpaes_schedule_transform"); + &movdqa ("xmm7","xmm0"); + + &test ($out,$out); + &jnz (&label("schedule_am_decrypting")); + + # encrypting, output zeroth round key after transform + &movdqu (&QWP(0,$key),"xmm0"); + &jmp (&label("schedule_go")); + +&set_label("schedule_am_decrypting"); + # decrypting, output zeroth round key after shiftrows + &movdqa ("xmm1",&QWP($k_sr,$const,$magic)); + &pshufb ("xmm3","xmm1"); + &movdqu (&QWP(0,$key),"xmm3"); + &xor ($magic,0x30); + +&set_label("schedule_go"); + &cmp ($round,192); + &ja (&label("schedule_256")); + &je (&label("schedule_192")); + # 128: fall though + +## +## .schedule_128 +## +## 128-bit specific part of key schedule. +## +## This schedule is really simple, because all its parts +## are accomplished by the subroutines. +## +&set_label("schedule_128"); + &mov ($round,10); + +&set_label("loop_schedule_128"); + &call ("_vpaes_schedule_round"); + &dec ($round); + &jz (&label("schedule_mangle_last")); + &call ("_vpaes_schedule_mangle"); # write output + &jmp (&label("loop_schedule_128")); + +## +## .aes_schedule_192 +## +## 192-bit specific part of key schedule. +## +## The main body of this schedule is the same as the 128-bit +## schedule, but with more smearing. The long, high side is +## stored in %xmm7 as before, and the short, low side is in +## the high bits of %xmm6. +## +## This schedule is somewhat nastier, however, because each +## round produces 192 bits of key material, or 1.5 round keys. +## Therefore, on each cycle we do 2 rounds and produce 3 round +## keys. +## +&set_label("schedule_192",16); + &movdqu ("xmm0",&QWP(8,$inp)); # load key part 2 (very unaligned) + &call ("_vpaes_schedule_transform"); # input transform + &movdqa ("xmm6","xmm0"); # save short part + &pxor ("xmm4","xmm4"); # clear 4 + &movhlps("xmm6","xmm4"); # clobber low side with zeros + &mov ($round,4); + +&set_label("loop_schedule_192"); + &call ("_vpaes_schedule_round"); + &palignr("xmm0","xmm6",8); + &call ("_vpaes_schedule_mangle"); # save key n + &call ("_vpaes_schedule_192_smear"); + &call ("_vpaes_schedule_mangle"); # save key n+1 + &call ("_vpaes_schedule_round"); + &dec ($round); + &jz (&label("schedule_mangle_last")); + &call ("_vpaes_schedule_mangle"); # save key n+2 + &call ("_vpaes_schedule_192_smear"); + &jmp (&label("loop_schedule_192")); + +## +## .aes_schedule_256 +## +## 256-bit specific part of key schedule. +## +## The structure here is very similar to the 128-bit +## schedule, but with an additional "low side" in +## %xmm6. The low side's rounds are the same as the +## high side's, except no rcon and no rotation. +## +&set_label("schedule_256",16); + &movdqu ("xmm0",&QWP(16,$inp)); # load key part 2 (unaligned) + &call ("_vpaes_schedule_transform"); # input transform + &mov ($round,7); + +&set_label("loop_schedule_256"); + &call ("_vpaes_schedule_mangle"); # output low result + &movdqa ("xmm6","xmm0"); # save cur_lo in xmm6 + + # high round + &call ("_vpaes_schedule_round"); + &dec ($round); + &jz (&label("schedule_mangle_last")); + &call ("_vpaes_schedule_mangle"); + + # low round. swap xmm7 and xmm6 + &pshufd ("xmm0","xmm0",0xFF); + &movdqa (&QWP(20,"esp"),"xmm7"); + &movdqa ("xmm7","xmm6"); + &call ("_vpaes_schedule_low_round"); + &movdqa ("xmm7",&QWP(20,"esp")); + + &jmp (&label("loop_schedule_256")); + +## +## .aes_schedule_mangle_last +## +## Mangler for last round of key schedule +## Mangles %xmm0 +## when encrypting, outputs out(%xmm0) ^ 63 +## when decrypting, outputs unskew(%xmm0) +## +## Always called right before return... jumps to cleanup and exits +## +&set_label("schedule_mangle_last",16); + # schedule last round key from xmm0 + &lea ($base,&DWP($k_deskew,$const)); + &test ($out,$out); + &jnz (&label("schedule_mangle_last_dec")); + + # encrypting + &movdqa ("xmm1",&QWP($k_sr,$const,$magic)); + &pshufb ("xmm0","xmm1"); # output permute + &lea ($base,&DWP($k_opt,$const)); # prepare to output transform + &add ($key,32); + +&set_label("schedule_mangle_last_dec"); + &add ($key,-16); + &pxor ("xmm0",&QWP($k_s63,$const)); + &call ("_vpaes_schedule_transform"); # output transform + &movdqu (&QWP(0,$key),"xmm0"); # save last key + + # cleanup + &pxor ("xmm0","xmm0"); + &pxor ("xmm1","xmm1"); + &pxor ("xmm2","xmm2"); + &pxor ("xmm3","xmm3"); + &pxor ("xmm4","xmm4"); + &pxor ("xmm5","xmm5"); + &pxor ("xmm6","xmm6"); + &pxor ("xmm7","xmm7"); + &ret (); +&function_end_B("_vpaes_schedule_core"); + +## +## .aes_schedule_192_smear +## +## Smear the short, low side in the 192-bit key schedule. +## +## Inputs: +## %xmm7: high side, b a x y +## %xmm6: low side, d c 0 0 +## %xmm13: 0 +## +## Outputs: +## %xmm6: b+c+d b+c 0 0 +## %xmm0: b+c+d b+c b a +## +&function_begin_B("_vpaes_schedule_192_smear"); + &pshufd ("xmm0","xmm6",0x80); # d c 0 0 -> c 0 0 0 + &pxor ("xmm6","xmm0"); # -> c+d c 0 0 + &pshufd ("xmm0","xmm7",0xFE); # b a _ _ -> b b b a + &pxor ("xmm6","xmm0"); # -> b+c+d b+c b a + &movdqa ("xmm0","xmm6"); + &pxor ("xmm1","xmm1"); + &movhlps("xmm6","xmm1"); # clobber low side with zeros + &ret (); +&function_end_B("_vpaes_schedule_192_smear"); + +## +## .aes_schedule_round +## +## Runs one main round of the key schedule on %xmm0, %xmm7 +## +## Specifically, runs subbytes on the high dword of %xmm0 +## then rotates it by one byte and xors into the low dword of +## %xmm7. +## +## Adds rcon from low byte of %xmm8, then rotates %xmm8 for +## next rcon. +## +## Smears the dwords of %xmm7 by xoring the low into the +## second low, result into third, result into highest. +## +## Returns results in %xmm7 = %xmm0. +## Clobbers %xmm1-%xmm5. +## +&function_begin_B("_vpaes_schedule_round"); + # extract rcon from xmm8 + &movdqa ("xmm2",&QWP(8,"esp")); # xmm8 + &pxor ("xmm1","xmm1"); + &palignr("xmm1","xmm2",15); + &palignr("xmm2","xmm2",15); + &pxor ("xmm7","xmm1"); + + # rotate + &pshufd ("xmm0","xmm0",0xFF); + &palignr("xmm0","xmm0",1); + + # fall through... + &movdqa (&QWP(8,"esp"),"xmm2"); # xmm8 + + # low round: same as high round, but no rotation and no rcon. +&set_label("_vpaes_schedule_low_round"); + # smear xmm7 + &movdqa ("xmm1","xmm7"); + &pslldq ("xmm7",4); + &pxor ("xmm7","xmm1"); + &movdqa ("xmm1","xmm7"); + &pslldq ("xmm7",8); + &pxor ("xmm7","xmm1"); + &pxor ("xmm7",&QWP($k_s63,$const)); + + # subbyte + &movdqa ("xmm4",&QWP($k_s0F,$const)); + &movdqa ("xmm5",&QWP($k_inv,$const)); # 4 : 1/j + &movdqa ("xmm1","xmm4"); + &pandn ("xmm1","xmm0"); + &psrld ("xmm1",4); # 1 = i + &pand ("xmm0","xmm4"); # 0 = k + &movdqa ("xmm2",&QWP($k_inv+16,$const));# 2 : a/k + &pshufb ("xmm2","xmm0"); # 2 = a/k + &pxor ("xmm0","xmm1"); # 0 = j + &movdqa ("xmm3","xmm5"); # 3 : 1/i + &pshufb ("xmm3","xmm1"); # 3 = 1/i + &pxor ("xmm3","xmm2"); # 3 = iak = 1/i + a/k + &movdqa ("xmm4","xmm5"); # 4 : 1/j + &pshufb ("xmm4","xmm0"); # 4 = 1/j + &pxor ("xmm4","xmm2"); # 4 = jak = 1/j + a/k + &movdqa ("xmm2","xmm5"); # 2 : 1/iak + &pshufb ("xmm2","xmm3"); # 2 = 1/iak + &pxor ("xmm2","xmm0"); # 2 = io + &movdqa ("xmm3","xmm5"); # 3 : 1/jak + &pshufb ("xmm3","xmm4"); # 3 = 1/jak + &pxor ("xmm3","xmm1"); # 3 = jo + &movdqa ("xmm4",&QWP($k_sb1,$const)); # 4 : sbou + &pshufb ("xmm4","xmm2"); # 4 = sbou + &movdqa ("xmm0",&QWP($k_sb1+16,$const));# 0 : sbot + &pshufb ("xmm0","xmm3"); # 0 = sb1t + &pxor ("xmm0","xmm4"); # 0 = sbox output + + # add in smeared stuff + &pxor ("xmm0","xmm7"); + &movdqa ("xmm7","xmm0"); + &ret (); +&function_end_B("_vpaes_schedule_round"); + +## +## .aes_schedule_transform +## +## Linear-transform %xmm0 according to tables at (%ebx) +## +## Output in %xmm0 +## Clobbers %xmm1, %xmm2 +## +&function_begin_B("_vpaes_schedule_transform"); + &movdqa ("xmm2",&QWP($k_s0F,$const)); + &movdqa ("xmm1","xmm2"); + &pandn ("xmm1","xmm0"); + &psrld ("xmm1",4); + &pand ("xmm0","xmm2"); + &movdqa ("xmm2",&QWP(0,$base)); + &pshufb ("xmm2","xmm0"); + &movdqa ("xmm0",&QWP(16,$base)); + &pshufb ("xmm0","xmm1"); + &pxor ("xmm0","xmm2"); + &ret (); +&function_end_B("_vpaes_schedule_transform"); + +## +## .aes_schedule_mangle +## +## Mangle xmm0 from (basis-transformed) standard version +## to our version. +## +## On encrypt, +## xor with 0x63 +## multiply by circulant 0,1,1,1 +## apply shiftrows transform +## +## On decrypt, +## xor with 0x63 +## multiply by "inverse mixcolumns" circulant E,B,D,9 +## deskew +## apply shiftrows transform +## +## +## Writes out to (%edx), and increments or decrements it +## Keeps track of round number mod 4 in %ecx +## Preserves xmm0 +## Clobbers xmm1-xmm5 +## +&function_begin_B("_vpaes_schedule_mangle"); + &movdqa ("xmm4","xmm0"); # save xmm0 for later + &movdqa ("xmm5",&QWP($k_mc_forward,$const)); + &test ($out,$out); + &jnz (&label("schedule_mangle_dec")); + + # encrypting + &add ($key,16); + &pxor ("xmm4",&QWP($k_s63,$const)); + &pshufb ("xmm4","xmm5"); + &movdqa ("xmm3","xmm4"); + &pshufb ("xmm4","xmm5"); + &pxor ("xmm3","xmm4"); + &pshufb ("xmm4","xmm5"); + &pxor ("xmm3","xmm4"); + + &jmp (&label("schedule_mangle_both")); + +&set_label("schedule_mangle_dec",16); + # inverse mix columns + &movdqa ("xmm2",&QWP($k_s0F,$const)); + &lea ($inp,&DWP($k_dksd,$const)); + &movdqa ("xmm1","xmm2"); + &pandn ("xmm1","xmm4"); + &psrld ("xmm1",4); # 1 = hi + &pand ("xmm4","xmm2"); # 4 = lo + + &movdqa ("xmm2",&QWP(0,$inp)); + &pshufb ("xmm2","xmm4"); + &movdqa ("xmm3",&QWP(0x10,$inp)); + &pshufb ("xmm3","xmm1"); + &pxor ("xmm3","xmm2"); + &pshufb ("xmm3","xmm5"); + + &movdqa ("xmm2",&QWP(0x20,$inp)); + &pshufb ("xmm2","xmm4"); + &pxor ("xmm2","xmm3"); + &movdqa ("xmm3",&QWP(0x30,$inp)); + &pshufb ("xmm3","xmm1"); + &pxor ("xmm3","xmm2"); + &pshufb ("xmm3","xmm5"); + + &movdqa ("xmm2",&QWP(0x40,$inp)); + &pshufb ("xmm2","xmm4"); + &pxor ("xmm2","xmm3"); + &movdqa ("xmm3",&QWP(0x50,$inp)); + &pshufb ("xmm3","xmm1"); + &pxor ("xmm3","xmm2"); + &pshufb ("xmm3","xmm5"); + + &movdqa ("xmm2",&QWP(0x60,$inp)); + &pshufb ("xmm2","xmm4"); + &pxor ("xmm2","xmm3"); + &movdqa ("xmm3",&QWP(0x70,$inp)); + &pshufb ("xmm3","xmm1"); + &pxor ("xmm3","xmm2"); + + &add ($key,-16); + +&set_label("schedule_mangle_both"); + &movdqa ("xmm1",&QWP($k_sr,$const,$magic)); + &pshufb ("xmm3","xmm1"); + &add ($magic,-16); + &and ($magic,0x30); + &movdqu (&QWP(0,$key),"xmm3"); + &ret (); +&function_end_B("_vpaes_schedule_mangle"); + +# +# Interface to OpenSSL +# +&function_begin("${PREFIX}_set_encrypt_key"); + &mov ($inp,&wparam(0)); # inp + &lea ($base,&DWP(-56,"esp")); + &mov ($round,&wparam(1)); # bits + &and ($base,-16); + &mov ($key,&wparam(2)); # key + &xchg ($base,"esp"); # alloca + &mov (&DWP(48,"esp"),$base); + + &mov ($base,$round); + &shr ($base,5); + &add ($base,5); + &mov (&DWP(240,$key),$base); # AES_KEY->rounds = nbits/32+5; + &mov ($magic,0x30); + &mov ($out,0); + + &lea ($const,&DWP(&label("_vpaes_consts")."+0x30-".&label("pic_point"))); + &call ("_vpaes_schedule_core"); +&set_label("pic_point"); + + &mov ("esp",&DWP(48,"esp")); + &xor ("eax","eax"); +&function_end("${PREFIX}_set_encrypt_key"); + +&function_begin("${PREFIX}_set_decrypt_key"); + &mov ($inp,&wparam(0)); # inp + &lea ($base,&DWP(-56,"esp")); + &mov ($round,&wparam(1)); # bits + &and ($base,-16); + &mov ($key,&wparam(2)); # key + &xchg ($base,"esp"); # alloca + &mov (&DWP(48,"esp"),$base); + + &mov ($base,$round); + &shr ($base,5); + &add ($base,5); + &mov (&DWP(240,$key),$base); # AES_KEY->rounds = nbits/32+5; + &shl ($base,4); + &lea ($key,&DWP(16,$key,$base)); + + &mov ($out,1); + &mov ($magic,$round); + &shr ($magic,1); + &and ($magic,32); + &xor ($magic,32); # nbist==192?0:32; + + &lea ($const,&DWP(&label("_vpaes_consts")."+0x30-".&label("pic_point"))); + &call ("_vpaes_schedule_core"); +&set_label("pic_point"); + + &mov ("esp",&DWP(48,"esp")); + &xor ("eax","eax"); +&function_end("${PREFIX}_set_decrypt_key"); + +&function_begin("${PREFIX}_encrypt"); + &lea ($const,&DWP(&label("_vpaes_consts")."+0x30-".&label("pic_point"))); + &call ("_vpaes_preheat"); +&set_label("pic_point"); + &mov ($inp,&wparam(0)); # inp + &lea ($base,&DWP(-56,"esp")); + &mov ($out,&wparam(1)); # out + &and ($base,-16); + &mov ($key,&wparam(2)); # key + &xchg ($base,"esp"); # alloca + &mov (&DWP(48,"esp"),$base); + + &movdqu ("xmm0",&QWP(0,$inp)); + &call ("_vpaes_encrypt_core"); + &movdqu (&QWP(0,$out),"xmm0"); + + &mov ("esp",&DWP(48,"esp")); +&function_end("${PREFIX}_encrypt"); + +&function_begin("${PREFIX}_decrypt"); + &lea ($const,&DWP(&label("_vpaes_consts")."+0x30-".&label("pic_point"))); + &call ("_vpaes_preheat"); +&set_label("pic_point"); + &mov ($inp,&wparam(0)); # inp + &lea ($base,&DWP(-56,"esp")); + &mov ($out,&wparam(1)); # out + &and ($base,-16); + &mov ($key,&wparam(2)); # key + &xchg ($base,"esp"); # alloca + &mov (&DWP(48,"esp"),$base); + + &movdqu ("xmm0",&QWP(0,$inp)); + &call ("_vpaes_decrypt_core"); + &movdqu (&QWP(0,$out),"xmm0"); + + &mov ("esp",&DWP(48,"esp")); +&function_end("${PREFIX}_decrypt"); + +&function_begin("${PREFIX}_cbc_encrypt"); + &mov ($inp,&wparam(0)); # inp + &mov ($out,&wparam(1)); # out + &mov ($round,&wparam(2)); # len + &mov ($key,&wparam(3)); # key + &lea ($base,&DWP(-56,"esp")); + &mov ($const,&wparam(4)); # ivp + &and ($base,-16); + &mov ($magic,&wparam(5)); # enc + &xchg ($base,"esp"); # alloca + &movdqu ("xmm1",&QWP(0,$const)); # load IV + &sub ($out,$inp); + &mov (&DWP(48,"esp"),$base); + + &mov (&DWP(0,"esp"),$out); # save out + &sub ($round,16); + &mov (&DWP(4,"esp"),$key) # save key + &mov (&DWP(8,"esp"),$const); # save ivp + &mov ($out,$round); # $out works as $len + + &lea ($const,&DWP(&label("_vpaes_consts")."+0x30-".&label("pic_point"))); + &call ("_vpaes_preheat"); +&set_label("pic_point"); + &cmp ($magic,0); + &je (&label("cbc_dec_loop")); + &jmp (&label("cbc_enc_loop")); + +&set_label("cbc_enc_loop",16); + &movdqu ("xmm0",&QWP(0,$inp)); # load input + &pxor ("xmm0","xmm1"); # inp^=iv + &call ("_vpaes_encrypt_core"); + &mov ($base,&DWP(0,"esp")); # restore out + &mov ($key,&DWP(4,"esp")); # restore key + &movdqa ("xmm1","xmm0"); + &movdqu (&QWP(0,$base,$inp),"xmm0"); # write output + &lea ($inp,&DWP(16,$inp)); + &sub ($out,16); + &jnc (&label("cbc_enc_loop")); + &jmp (&label("cbc_done")); + +&set_label("cbc_dec_loop",16); + &movdqu ("xmm0",&QWP(0,$inp)); # load input + &movdqa (&QWP(16,"esp"),"xmm1"); # save IV + &movdqa (&QWP(32,"esp"),"xmm0"); # save future IV + &call ("_vpaes_decrypt_core"); + &mov ($base,&DWP(0,"esp")); # restore out + &mov ($key,&DWP(4,"esp")); # restore key + &pxor ("xmm0",&QWP(16,"esp")); # out^=iv + &movdqa ("xmm1",&QWP(32,"esp")); # load next IV + &movdqu (&QWP(0,$base,$inp),"xmm0"); # write output + &lea ($inp,&DWP(16,$inp)); + &sub ($out,16); + &jnc (&label("cbc_dec_loop")); + +&set_label("cbc_done"); + &mov ($base,&DWP(8,"esp")); # restore ivp + &mov ("esp",&DWP(48,"esp")); + &movdqu (&QWP(0,$base),"xmm1"); # write IV +&function_end("${PREFIX}_cbc_encrypt"); + +&asm_finish(); diff --git a/crypto/openssl/crypto/aes/asm/vpaes-x86_64.pl b/crypto/openssl/crypto/aes/asm/vpaes-x86_64.pl new file mode 100644 index 0000000000..0254702230 --- /dev/null +++ b/crypto/openssl/crypto/aes/asm/vpaes-x86_64.pl @@ -0,0 +1,1204 @@ +#!/usr/bin/env perl + +###################################################################### +## Constant-time SSSE3 AES core implementation. +## version 0.1 +## +## By Mike Hamburg (Stanford University), 2009 +## Public domain. +## +## For details see http://shiftleft.org/papers/vector_aes/ and +## http://crypto.stanford.edu/vpaes/. + +###################################################################### +# September 2011. +# +# Interface to OpenSSL as "almost" drop-in replacement for +# aes-x86_64.pl. "Almost" refers to the fact that AES_cbc_encrypt +# doesn't handle partial vectors (doesn't have to if called from +# EVP only). "Drop-in" implies that this module doesn't share key +# schedule structure with the original nor does it make assumption +# about its alignment... +# +# Performance summary. aes-x86_64.pl column lists large-block CBC +# encrypt/decrypt/with-hyper-threading-off(*) results in cycles per +# byte processed with 128-bit key, and vpaes-x86_64.pl column - +# [also large-block CBC] encrypt/decrypt. +# +# aes-x86_64.pl vpaes-x86_64.pl +# +# Core 2(**) 30.5/43.7/14.3 21.8/25.7(***) +# Nehalem 30.5/42.2/14.6 9.8/11.8 +# Atom 63.9/79.0/32.1 64.0/84.8(***) +# +# (*) "Hyper-threading" in the context refers rather to cache shared +# among multiple cores, than to specifically Intel HTT. As vast +# majority of contemporary cores share cache, slower code path +# is common place. In other words "with-hyper-threading-off" +# results are presented mostly for reference purposes. +# +# (**) "Core 2" refers to initial 65nm design, a.k.a. Conroe. +# +# (***) Less impressive improvement on Core 2 and Atom is due to slow +# pshufb, yet it's respectable +40%/78% improvement on Core 2 +# (as implied, over "hyper-threading-safe" code path). +# +# + +$flavour = shift; +$output = shift; +if ($flavour =~ /\./) { $output = $flavour; undef $flavour; } + +$win64=0; $win64=1 if ($flavour =~ /[nm]asm|mingw64/ || $output =~ /\.asm$/); + +$0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1; +( $xlate="${dir}x86_64-xlate.pl" and -f $xlate ) or +( $xlate="${dir}../../perlasm/x86_64-xlate.pl" and -f $xlate) or +die "can't locate x86_64-xlate.pl"; + +open STDOUT,"| $^X $xlate $flavour $output"; + +$PREFIX="vpaes"; + +$code.=<<___; +.text + +## +## _aes_encrypt_core +## +## AES-encrypt %xmm0. +## +## Inputs: +## %xmm0 = input +## %xmm9-%xmm15 as in _vpaes_preheat +## (%rdx) = scheduled keys +## +## Output in %xmm0 +## Clobbers %xmm1-%xmm5, %r9, %r10, %r11, %rax +## Preserves %xmm6 - %xmm8 so you get some local vectors +## +## +.type _vpaes_encrypt_core,\@abi-omnipotent +.align 16 +_vpaes_encrypt_core: + mov %rdx, %r9 + mov \$16, %r11 + mov 240(%rdx),%eax + movdqa %xmm9, %xmm1 + movdqa .Lk_ipt(%rip), %xmm2 # iptlo + pandn %xmm0, %xmm1 + movdqu (%r9), %xmm5 # round0 key + psrld \$4, %xmm1 + pand %xmm9, %xmm0 + pshufb %xmm0, %xmm2 + movdqa .Lk_ipt+16(%rip), %xmm0 # ipthi + pshufb %xmm1, %xmm0 + pxor %xmm5, %xmm2 + pxor %xmm2, %xmm0 + add \$16, %r9 + lea .Lk_mc_backward(%rip),%r10 + jmp .Lenc_entry + +.align 16 +.Lenc_loop: + # middle of middle round + movdqa %xmm13, %xmm4 # 4 : sb1u + pshufb %xmm2, %xmm4 # 4 = sb1u + pxor %xmm5, %xmm4 # 4 = sb1u + k + movdqa %xmm12, %xmm0 # 0 : sb1t + pshufb %xmm3, %xmm0 # 0 = sb1t + pxor %xmm4, %xmm0 # 0 = A + movdqa %xmm15, %xmm5 # 4 : sb2u + pshufb %xmm2, %xmm5 # 4 = sb2u + movdqa -0x40(%r11,%r10), %xmm1 # .Lk_mc_forward[] + movdqa %xmm14, %xmm2 # 2 : sb2t + pshufb %xmm3, %xmm2 # 2 = sb2t + pxor %xmm5, %xmm2 # 2 = 2A + movdqa (%r11,%r10), %xmm4 # .Lk_mc_backward[] + movdqa %xmm0, %xmm3 # 3 = A + pshufb %xmm1, %xmm0 # 0 = B + add \$16, %r9 # next key + pxor %xmm2, %xmm0 # 0 = 2A+B + pshufb %xmm4, %xmm3 # 3 = D + add \$16, %r11 # next mc + pxor %xmm0, %xmm3 # 3 = 2A+B+D + pshufb %xmm1, %xmm0 # 0 = 2B+C + and \$0x30, %r11 # ... mod 4 + pxor %xmm3, %xmm0 # 0 = 2A+3B+C+D + sub \$1,%rax # nr-- + +.Lenc_entry: + # top of round + movdqa %xmm9, %xmm1 # 1 : i + pandn %xmm0, %xmm1 # 1 = i<<4 + psrld \$4, %xmm1 # 1 = i + pand %xmm9, %xmm0 # 0 = k + movdqa %xmm11, %xmm5 # 2 : a/k + pshufb %xmm0, %xmm5 # 2 = a/k + pxor %xmm1, %xmm0 # 0 = j + movdqa %xmm10, %xmm3 # 3 : 1/i + pshufb %xmm1, %xmm3 # 3 = 1/i + pxor %xmm5, %xmm3 # 3 = iak = 1/i + a/k + movdqa %xmm10, %xmm4 # 4 : 1/j + pshufb %xmm0, %xmm4 # 4 = 1/j + pxor %xmm5, %xmm4 # 4 = jak = 1/j + a/k + movdqa %xmm10, %xmm2 # 2 : 1/iak + pshufb %xmm3, %xmm2 # 2 = 1/iak + pxor %xmm0, %xmm2 # 2 = io + movdqa %xmm10, %xmm3 # 3 : 1/jak + movdqu (%r9), %xmm5 + pshufb %xmm4, %xmm3 # 3 = 1/jak + pxor %xmm1, %xmm3 # 3 = jo + jnz .Lenc_loop + + # middle of last round + movdqa -0x60(%r10), %xmm4 # 3 : sbou .Lk_sbo + movdqa -0x50(%r10), %xmm0 # 0 : sbot .Lk_sbo+16 + pshufb %xmm2, %xmm4 # 4 = sbou + pxor %xmm5, %xmm4 # 4 = sb1u + k + pshufb %xmm3, %xmm0 # 0 = sb1t + movdqa 0x40(%r11,%r10), %xmm1 # .Lk_sr[] + pxor %xmm4, %xmm0 # 0 = A + pshufb %xmm1, %xmm0 + ret +.size _vpaes_encrypt_core,.-_vpaes_encrypt_core + +## +## Decryption core +## +## Same API as encryption core. +## +.type _vpaes_decrypt_core,\@abi-omnipotent +.align 16 +_vpaes_decrypt_core: + mov %rdx, %r9 # load key + mov 240(%rdx),%eax + movdqa %xmm9, %xmm1 + movdqa .Lk_dipt(%rip), %xmm2 # iptlo + pandn %xmm0, %xmm1 + mov %rax, %r11 + psrld \$4, %xmm1 + movdqu (%r9), %xmm5 # round0 key + shl \$4, %r11 + pand %xmm9, %xmm0 + pshufb %xmm0, %xmm2 + movdqa .Lk_dipt+16(%rip), %xmm0 # ipthi + xor \$0x30, %r11 + lea .Lk_dsbd(%rip),%r10 + pshufb %xmm1, %xmm0 + and \$0x30, %r11 + pxor %xmm5, %xmm2 + movdqa .Lk_mc_forward+48(%rip), %xmm5 + pxor %xmm2, %xmm0 + add \$16, %r9 + add %r10, %r11 + jmp .Ldec_entry + +.align 16 +.Ldec_loop: +## +## Inverse mix columns +## + movdqa -0x20(%r10),%xmm4 # 4 : sb9u + pshufb %xmm2, %xmm4 # 4 = sb9u + pxor %xmm0, %xmm4 + movdqa -0x10(%r10),%xmm0 # 0 : sb9t + pshufb %xmm3, %xmm0 # 0 = sb9t + pxor %xmm4, %xmm0 # 0 = ch + add \$16, %r9 # next round key + + pshufb %xmm5, %xmm0 # MC ch + movdqa 0x00(%r10),%xmm4 # 4 : sbdu + pshufb %xmm2, %xmm4 # 4 = sbdu + pxor %xmm0, %xmm4 # 4 = ch + movdqa 0x10(%r10),%xmm0 # 0 : sbdt + pshufb %xmm3, %xmm0 # 0 = sbdt + pxor %xmm4, %xmm0 # 0 = ch + sub \$1,%rax # nr-- + + pshufb %xmm5, %xmm0 # MC ch + movdqa 0x20(%r10),%xmm4 # 4 : sbbu + pshufb %xmm2, %xmm4 # 4 = sbbu + pxor %xmm0, %xmm4 # 4 = ch + movdqa 0x30(%r10),%xmm0 # 0 : sbbt + pshufb %xmm3, %xmm0 # 0 = sbbt + pxor %xmm4, %xmm0 # 0 = ch + + pshufb %xmm5, %xmm0 # MC ch + movdqa 0x40(%r10),%xmm4 # 4 : sbeu + pshufb %xmm2, %xmm4 # 4 = sbeu + pxor %xmm0, %xmm4 # 4 = ch + movdqa 0x50(%r10),%xmm0 # 0 : sbet + pshufb %xmm3, %xmm0 # 0 = sbet + pxor %xmm4, %xmm0 # 0 = ch + + palignr \$12, %xmm5, %xmm5 + +.Ldec_entry: + # top of round + movdqa %xmm9, %xmm1 # 1 : i + pandn %xmm0, %xmm1 # 1 = i<<4 + psrld \$4, %xmm1 # 1 = i + pand %xmm9, %xmm0 # 0 = k + movdqa %xmm11, %xmm2 # 2 : a/k + pshufb %xmm0, %xmm2 # 2 = a/k + pxor %xmm1, %xmm0 # 0 = j + movdqa %xmm10, %xmm3 # 3 : 1/i + pshufb %xmm1, %xmm3 # 3 = 1/i + pxor %xmm2, %xmm3 # 3 = iak = 1/i + a/k + movdqa %xmm10, %xmm4 # 4 : 1/j + pshufb %xmm0, %xmm4 # 4 = 1/j + pxor %xmm2, %xmm4 # 4 = jak = 1/j + a/k + movdqa %xmm10, %xmm2 # 2 : 1/iak + pshufb %xmm3, %xmm2 # 2 = 1/iak + pxor %xmm0, %xmm2 # 2 = io + movdqa %xmm10, %xmm3 # 3 : 1/jak + pshufb %xmm4, %xmm3 # 3 = 1/jak + pxor %xmm1, %xmm3 # 3 = jo + movdqu (%r9), %xmm0 + jnz .Ldec_loop + + # middle of last round + movdqa 0x60(%r10), %xmm4 # 3 : sbou + pshufb %xmm2, %xmm4 # 4 = sbou + pxor %xmm0, %xmm4 # 4 = sb1u + k + movdqa 0x70(%r10), %xmm0 # 0 : sbot + movdqa .Lk_sr-.Lk_dsbd(%r11), %xmm2 + pshufb %xmm3, %xmm0 # 0 = sb1t + pxor %xmm4, %xmm0 # 0 = A + pshufb %xmm2, %xmm0 + ret +.size _vpaes_decrypt_core,.-_vpaes_decrypt_core + +######################################################## +## ## +## AES key schedule ## +## ## +######################################################## +.type _vpaes_schedule_core,\@abi-omnipotent +.align 16 +_vpaes_schedule_core: + # rdi = key + # rsi = size in bits + # rdx = buffer + # rcx = direction. 0=encrypt, 1=decrypt + + call _vpaes_preheat # load the tables + movdqa .Lk_rcon(%rip), %xmm8 # load rcon + movdqu (%rdi), %xmm0 # load key (unaligned) + + # input transform + movdqa %xmm0, %xmm3 + lea .Lk_ipt(%rip), %r11 + call _vpaes_schedule_transform + movdqa %xmm0, %xmm7 + + lea .Lk_sr(%rip),%r10 + test %rcx, %rcx + jnz .Lschedule_am_decrypting + + # encrypting, output zeroth round key after transform + movdqu %xmm0, (%rdx) + jmp .Lschedule_go + +.Lschedule_am_decrypting: + # decrypting, output zeroth round key after shiftrows + movdqa (%r8,%r10),%xmm1 + pshufb %xmm1, %xmm3 + movdqu %xmm3, (%rdx) + xor \$0x30, %r8 + +.Lschedule_go: + cmp \$192, %esi + ja .Lschedule_256 + je .Lschedule_192 + # 128: fall though + +## +## .schedule_128 +## +## 128-bit specific part of key schedule. +## +## This schedule is really simple, because all its parts +## are accomplished by the subroutines. +## +.Lschedule_128: + mov \$10, %esi + +.Loop_schedule_128: + call _vpaes_schedule_round + dec %rsi + jz .Lschedule_mangle_last + call _vpaes_schedule_mangle # write output + jmp .Loop_schedule_128 + +## +## .aes_schedule_192 +## +## 192-bit specific part of key schedule. +## +## The main body of this schedule is the same as the 128-bit +## schedule, but with more smearing. The long, high side is +## stored in %xmm7 as before, and the short, low side is in +## the high bits of %xmm6. +## +## This schedule is somewhat nastier, however, because each +## round produces 192 bits of key material, or 1.5 round keys. +## Therefore, on each cycle we do 2 rounds and produce 3 round +## keys. +## +.align 16 +.Lschedule_192: + movdqu 8(%rdi),%xmm0 # load key part 2 (very unaligned) + call _vpaes_schedule_transform # input transform + movdqa %xmm0, %xmm6 # save short part + pxor %xmm4, %xmm4 # clear 4 + movhlps %xmm4, %xmm6 # clobber low side with zeros + mov \$4, %esi + +.Loop_schedule_192: + call _vpaes_schedule_round + palignr \$8,%xmm6,%xmm0 + call _vpaes_schedule_mangle # save key n + call _vpaes_schedule_192_smear + call _vpaes_schedule_mangle # save key n+1 + call _vpaes_schedule_round + dec %rsi + jz .Lschedule_mangle_last + call _vpaes_schedule_mangle # save key n+2 + call _vpaes_schedule_192_smear + jmp .Loop_schedule_192 + +## +## .aes_schedule_256 +## +## 256-bit specific part of key schedule. +## +## The structure here is very similar to the 128-bit +## schedule, but with an additional "low side" in +## %xmm6. The low side's rounds are the same as the +## high side's, except no rcon and no rotation. +## +.align 16 +.Lschedule_256: + movdqu 16(%rdi),%xmm0 # load key part 2 (unaligned) + call _vpaes_schedule_transform # input transform + mov \$7, %esi + +.Loop_schedule_256: + call _vpaes_schedule_mangle # output low result + movdqa %xmm0, %xmm6 # save cur_lo in xmm6 + + # high round + call _vpaes_schedule_round + dec %rsi + jz .Lschedule_mangle_last + call _vpaes_schedule_mangle + + # low round. swap xmm7 and xmm6 + pshufd \$0xFF, %xmm0, %xmm0 + movdqa %xmm7, %xmm5 + movdqa %xmm6, %xmm7 + call _vpaes_schedule_low_round + movdqa %xmm5, %xmm7 + + jmp .Loop_schedule_256 + + +## +## .aes_schedule_mangle_last +## +## Mangler for last round of key schedule +## Mangles %xmm0 +## when encrypting, outputs out(%xmm0) ^ 63 +## when decrypting, outputs unskew(%xmm0) +## +## Always called right before return... jumps to cleanup and exits +## +.align 16 +.Lschedule_mangle_last: + # schedule last round key from xmm0 + lea .Lk_deskew(%rip),%r11 # prepare to deskew + test %rcx, %rcx + jnz .Lschedule_mangle_last_dec + + # encrypting + movdqa (%r8,%r10),%xmm1 + pshufb %xmm1, %xmm0 # output permute + lea .Lk_opt(%rip), %r11 # prepare to output transform + add \$32, %rdx + +.Lschedule_mangle_last_dec: + add \$-16, %rdx + pxor .Lk_s63(%rip), %xmm0 + call _vpaes_schedule_transform # output transform + movdqu %xmm0, (%rdx) # save last key + + # cleanup + pxor %xmm0, %xmm0 + pxor %xmm1, %xmm1 + pxor %xmm2, %xmm2 + pxor %xmm3, %xmm3 + pxor %xmm4, %xmm4 + pxor %xmm5, %xmm5 + pxor %xmm6, %xmm6 + pxor %xmm7, %xmm7 + ret +.size _vpaes_schedule_core,.-_vpaes_schedule_core + +## +## .aes_schedule_192_smear +## +## Smear the short, low side in the 192-bit key schedule. +## +## Inputs: +## %xmm7: high side, b a x y +## %xmm6: low side, d c 0 0 +## %xmm13: 0 +## +## Outputs: +## %xmm6: b+c+d b+c 0 0 +## %xmm0: b+c+d b+c b a +## +.type _vpaes_schedule_192_smear,\@abi-omnipotent +.align 16 +_vpaes_schedule_192_smear: + pshufd \$0x80, %xmm6, %xmm0 # d c 0 0 -> c 0 0 0 + pxor %xmm0, %xmm6 # -> c+d c 0 0 + pshufd \$0xFE, %xmm7, %xmm0 # b a _ _ -> b b b a + pxor %xmm0, %xmm6 # -> b+c+d b+c b a + movdqa %xmm6, %xmm0 + pxor %xmm1, %xmm1 + movhlps %xmm1, %xmm6 # clobber low side with zeros + ret +.size _vpaes_schedule_192_smear,.-_vpaes_schedule_192_smear + +## +## .aes_schedule_round +## +## Runs one main round of the key schedule on %xmm0, %xmm7 +## +## Specifically, runs subbytes on the high dword of %xmm0 +## then rotates it by one byte and xors into the low dword of +## %xmm7. +## +## Adds rcon from low byte of %xmm8, then rotates %xmm8 for +## next rcon. +## +## Smears the dwords of %xmm7 by xoring the low into the +## second low, result into third, result into highest. +## +## Returns results in %xmm7 = %xmm0. +## Clobbers %xmm1-%xmm4, %r11. +## +.type _vpaes_schedule_round,\@abi-omnipotent +.align 16 +_vpaes_schedule_round: + # extract rcon from xmm8 + pxor %xmm1, %xmm1 + palignr \$15, %xmm8, %xmm1 + palignr \$15, %xmm8, %xmm8 + pxor %xmm1, %xmm7 + + # rotate + pshufd \$0xFF, %xmm0, %xmm0 + palignr \$1, %xmm0, %xmm0 + + # fall through... + + # low round: same as high round, but no rotation and no rcon. +_vpaes_schedule_low_round: + # smear xmm7 + movdqa %xmm7, %xmm1 + pslldq \$4, %xmm7 + pxor %xmm1, %xmm7 + movdqa %xmm7, %xmm1 + pslldq \$8, %xmm7 + pxor %xmm1, %xmm7 + pxor .Lk_s63(%rip), %xmm7 + + # subbytes + movdqa %xmm9, %xmm1 + pandn %xmm0, %xmm1 + psrld \$4, %xmm1 # 1 = i + pand %xmm9, %xmm0 # 0 = k + movdqa %xmm11, %xmm2 # 2 : a/k + pshufb %xmm0, %xmm2 # 2 = a/k + pxor %xmm1, %xmm0 # 0 = j + movdqa %xmm10, %xmm3 # 3 : 1/i + pshufb %xmm1, %xmm3 # 3 = 1/i + pxor %xmm2, %xmm3 # 3 = iak = 1/i + a/k + movdqa %xmm10, %xmm4 # 4 : 1/j + pshufb %xmm0, %xmm4 # 4 = 1/j + pxor %xmm2, %xmm4 # 4 = jak = 1/j + a/k + movdqa %xmm10, %xmm2 # 2 : 1/iak + pshufb %xmm3, %xmm2 # 2 = 1/iak + pxor %xmm0, %xmm2 # 2 = io + movdqa %xmm10, %xmm3 # 3 : 1/jak + pshufb %xmm4, %xmm3 # 3 = 1/jak + pxor %xmm1, %xmm3 # 3 = jo + movdqa %xmm13, %xmm4 # 4 : sbou + pshufb %xmm2, %xmm4 # 4 = sbou + movdqa %xmm12, %xmm0 # 0 : sbot + pshufb %xmm3, %xmm0 # 0 = sb1t + pxor %xmm4, %xmm0 # 0 = sbox output + + # add in smeared stuff + pxor %xmm7, %xmm0 + movdqa %xmm0, %xmm7 + ret +.size _vpaes_schedule_round,.-_vpaes_schedule_round + +## +## .aes_schedule_transform +## +## Linear-transform %xmm0 according to tables at (%r11) +## +## Requires that %xmm9 = 0x0F0F... as in preheat +## Output in %xmm0 +## Clobbers %xmm1, %xmm2 +## +.type _vpaes_schedule_transform,\@abi-omnipotent +.align 16 +_vpaes_schedule_transform: + movdqa %xmm9, %xmm1 + pandn %xmm0, %xmm1 + psrld \$4, %xmm1 + pand %xmm9, %xmm0 + movdqa (%r11), %xmm2 # lo + pshufb %xmm0, %xmm2 + movdqa 16(%r11), %xmm0 # hi + pshufb %xmm1, %xmm0 + pxor %xmm2, %xmm0 + ret +.size _vpaes_schedule_transform,.-_vpaes_schedule_transform + +## +## .aes_schedule_mangle +## +## Mangle xmm0 from (basis-transformed) standard version +## to our version. +## +## On encrypt, +## xor with 0x63 +## multiply by circulant 0,1,1,1 +## apply shiftrows transform +## +## On decrypt, +## xor with 0x63 +## multiply by "inverse mixcolumns" circulant E,B,D,9 +## deskew +## apply shiftrows transform +## +## +## Writes out to (%rdx), and increments or decrements it +## Keeps track of round number mod 4 in %r8 +## Preserves xmm0 +## Clobbers xmm1-xmm5 +## +.type _vpaes_schedule_mangle,\@abi-omnipotent +.align 16 +_vpaes_schedule_mangle: + movdqa %xmm0, %xmm4 # save xmm0 for later + movdqa .Lk_mc_forward(%rip),%xmm5 + test %rcx, %rcx + jnz .Lschedule_mangle_dec + + # encrypting + add \$16, %rdx + pxor .Lk_s63(%rip),%xmm4 + pshufb %xmm5, %xmm4 + movdqa %xmm4, %xmm3 + pshufb %xmm5, %xmm4 + pxor %xmm4, %xmm3 + pshufb %xmm5, %xmm4 + pxor %xmm4, %xmm3 + + jmp .Lschedule_mangle_both +.align 16 +.Lschedule_mangle_dec: + # inverse mix columns + lea .Lk_dksd(%rip),%r11 + movdqa %xmm9, %xmm1 + pandn %xmm4, %xmm1 + psrld \$4, %xmm1 # 1 = hi + pand %xmm9, %xmm4 # 4 = lo + + movdqa 0x00(%r11), %xmm2 + pshufb %xmm4, %xmm2 + movdqa 0x10(%r11), %xmm3 + pshufb %xmm1, %xmm3 + pxor %xmm2, %xmm3 + pshufb %xmm5, %xmm3 + + movdqa 0x20(%r11), %xmm2 + pshufb %xmm4, %xmm2 + pxor %xmm3, %xmm2 + movdqa 0x30(%r11), %xmm3 + pshufb %xmm1, %xmm3 + pxor %xmm2, %xmm3 + pshufb %xmm5, %xmm3 + + movdqa 0x40(%r11), %xmm2 + pshufb %xmm4, %xmm2 + pxor %xmm3, %xmm2 + movdqa 0x50(%r11), %xmm3 + pshufb %xmm1, %xmm3 + pxor %xmm2, %xmm3 + pshufb %xmm5, %xmm3 + + movdqa 0x60(%r11), %xmm2 + pshufb %xmm4, %xmm2 + pxor %xmm3, %xmm2 + movdqa 0x70(%r11), %xmm3 + pshufb %xmm1, %xmm3 + pxor %xmm2, %xmm3 + + add \$-16, %rdx + +.Lschedule_mangle_both: + movdqa (%r8,%r10),%xmm1 + pshufb %xmm1,%xmm3 + add \$-16, %r8 + and \$0x30, %r8 + movdqu %xmm3, (%rdx) + ret +.size _vpaes_schedule_mangle,.-_vpaes_schedule_mangle + +# +# Interface to OpenSSL +# +.globl ${PREFIX}_set_encrypt_key +.type ${PREFIX}_set_encrypt_key,\@function,3 +.align 16 +${PREFIX}_set_encrypt_key: +___ +$code.=<<___ if ($win64); + lea -0xb8(%rsp),%rsp + movaps %xmm6,0x10(%rsp) + movaps %xmm7,0x20(%rsp) + movaps %xmm8,0x30(%rsp) + movaps %xmm9,0x40(%rsp) + movaps %xmm10,0x50(%rsp) + movaps %xmm11,0x60(%rsp) + movaps %xmm12,0x70(%rsp) + movaps %xmm13,0x80(%rsp) + movaps %xmm14,0x90(%rsp) + movaps %xmm15,0xa0(%rsp) +.Lenc_key_body: +___ +$code.=<<___; + mov %esi,%eax + shr \$5,%eax + add \$5,%eax + mov %eax,240(%rdx) # AES_KEY->rounds = nbits/32+5; + + mov \$0,%ecx + mov \$0x30,%r8d + call _vpaes_schedule_core +___ +$code.=<<___ if ($win64); + movaps 0x10(%rsp),%xmm6 + movaps 0x20(%rsp),%xmm7 + movaps 0x30(%rsp),%xmm8 + movaps 0x40(%rsp),%xmm9 + movaps 0x50(%rsp),%xmm10 + movaps 0x60(%rsp),%xmm11 + movaps 0x70(%rsp),%xmm12 + movaps 0x80(%rsp),%xmm13 + movaps 0x90(%rsp),%xmm14 + movaps 0xa0(%rsp),%xmm15 + lea 0xb8(%rsp),%rsp +.Lenc_key_epilogue: +___ +$code.=<<___; + xor %eax,%eax + ret +.size ${PREFIX}_set_encrypt_key,.-${PREFIX}_set_encrypt_key + +.globl ${PREFIX}_set_decrypt_key +.type ${PREFIX}_set_decrypt_key,\@function,3 +.align 16 +${PREFIX}_set_decrypt_key: +___ +$code.=<<___ if ($win64); + lea -0xb8(%rsp),%rsp + movaps %xmm6,0x10(%rsp) + movaps %xmm7,0x20(%rsp) + movaps %xmm8,0x30(%rsp) + movaps %xmm9,0x40(%rsp) + movaps %xmm10,0x50(%rsp) + movaps %xmm11,0x60(%rsp) + movaps %xmm12,0x70(%rsp) + movaps %xmm13,0x80(%rsp) + movaps %xmm14,0x90(%rsp) + movaps %xmm15,0xa0(%rsp) +.Ldec_key_body: +___ +$code.=<<___; + mov %esi,%eax + shr \$5,%eax + add \$5,%eax + mov %eax,240(%rdx) # AES_KEY->rounds = nbits/32+5; + shl \$4,%eax + lea 16(%rdx,%rax),%rdx + + mov \$1,%ecx + mov %esi,%r8d + shr \$1,%r8d + and \$32,%r8d + xor \$32,%r8d # nbits==192?0:32 + call _vpaes_schedule_core +___ +$code.=<<___ if ($win64); + movaps 0x10(%rsp),%xmm6 + movaps 0x20(%rsp),%xmm7 + movaps 0x30(%rsp),%xmm8 + movaps 0x40(%rsp),%xmm9 + movaps 0x50(%rsp),%xmm10 + movaps 0x60(%rsp),%xmm11 + movaps 0x70(%rsp),%xmm12 + movaps 0x80(%rsp),%xmm13 + movaps 0x90(%rsp),%xmm14 + movaps 0xa0(%rsp),%xmm15 + lea 0xb8(%rsp),%rsp +.Ldec_key_epilogue: +___ +$code.=<<___; + xor %eax,%eax + ret +.size ${PREFIX}_set_decrypt_key,.-${PREFIX}_set_decrypt_key + +.globl ${PREFIX}_encrypt +.type ${PREFIX}_encrypt,\@function,3 +.align 16 +${PREFIX}_encrypt: +___ +$code.=<<___ if ($win64); + lea -0xb8(%rsp),%rsp + movaps %xmm6,0x10(%rsp) + movaps %xmm7,0x20(%rsp) + movaps %xmm8,0x30(%rsp) + movaps %xmm9,0x40(%rsp) + movaps %xmm10,0x50(%rsp) + movaps %xmm11,0x60(%rsp) + movaps %xmm12,0x70(%rsp) + movaps %xmm13,0x80(%rsp) + movaps %xmm14,0x90(%rsp) + movaps %xmm15,0xa0(%rsp) +.Lenc_body: +___ +$code.=<<___; + movdqu (%rdi),%xmm0 + call _vpaes_preheat + call _vpaes_encrypt_core + movdqu %xmm0,(%rsi) +___ +$code.=<<___ if ($win64); + movaps 0x10(%rsp),%xmm6 + movaps 0x20(%rsp),%xmm7 + movaps 0x30(%rsp),%xmm8 + movaps 0x40(%rsp),%xmm9 + movaps 0x50(%rsp),%xmm10 + movaps 0x60(%rsp),%xmm11 + movaps 0x70(%rsp),%xmm12 + movaps 0x80(%rsp),%xmm13 + movaps 0x90(%rsp),%xmm14 + movaps 0xa0(%rsp),%xmm15 + lea 0xb8(%rsp),%rsp +.Lenc_epilogue: +___ +$code.=<<___; + ret +.size ${PREFIX}_encrypt,.-${PREFIX}_encrypt + +.globl ${PREFIX}_decrypt +.type ${PREFIX}_decrypt,\@function,3 +.align 16 +${PREFIX}_decrypt: +___ +$code.=<<___ if ($win64); + lea -0xb8(%rsp),%rsp + movaps %xmm6,0x10(%rsp) + movaps %xmm7,0x20(%rsp) + movaps %xmm8,0x30(%rsp) + movaps %xmm9,0x40(%rsp) + movaps %xmm10,0x50(%rsp) + movaps %xmm11,0x60(%rsp) + movaps %xmm12,0x70(%rsp) + movaps %xmm13,0x80(%rsp) + movaps %xmm14,0x90(%rsp) + movaps %xmm15,0xa0(%rsp) +.Ldec_body: +___ +$code.=<<___; + movdqu (%rdi),%xmm0 + call _vpaes_preheat + call _vpaes_decrypt_core + movdqu %xmm0,(%rsi) +___ +$code.=<<___ if ($win64); + movaps 0x10(%rsp),%xmm6 + movaps 0x20(%rsp),%xmm7 + movaps 0x30(%rsp),%xmm8 + movaps 0x40(%rsp),%xmm9 + movaps 0x50(%rsp),%xmm10 + movaps 0x60(%rsp),%xmm11 + movaps 0x70(%rsp),%xmm12 + movaps 0x80(%rsp),%xmm13 + movaps 0x90(%rsp),%xmm14 + movaps 0xa0(%rsp),%xmm15 + lea 0xb8(%rsp),%rsp +.Ldec_epilogue: +___ +$code.=<<___; + ret +.size ${PREFIX}_decrypt,.-${PREFIX}_decrypt +___ +{ +my ($inp,$out,$len,$key,$ivp,$enc)=("%rdi","%rsi","%rdx","%rcx","%r8","%r9"); +# void AES_cbc_encrypt (const void char *inp, unsigned char *out, +# size_t length, const AES_KEY *key, +# unsigned char *ivp,const int enc); +$code.=<<___; +.globl ${PREFIX}_cbc_encrypt +.type ${PREFIX}_cbc_encrypt,\@function,6 +.align 16 +${PREFIX}_cbc_encrypt: + xchg $key,$len +___ +($len,$key)=($key,$len); +$code.=<<___; +___ +$code.=<<___ if ($win64); + lea -0xb8(%rsp),%rsp + movaps %xmm6,0x10(%rsp) + movaps %xmm7,0x20(%rsp) + movaps %xmm8,0x30(%rsp) + movaps %xmm9,0x40(%rsp) + movaps %xmm10,0x50(%rsp) + movaps %xmm11,0x60(%rsp) + movaps %xmm12,0x70(%rsp) + movaps %xmm13,0x80(%rsp) + movaps %xmm14,0x90(%rsp) + movaps %xmm15,0xa0(%rsp) +.Lcbc_body: +___ +$code.=<<___; + movdqu ($ivp),%xmm6 # load IV + sub $inp,$out + sub \$16,$len + call _vpaes_preheat + cmp \$0,${enc}d + je .Lcbc_dec_loop + jmp .Lcbc_enc_loop +.align 16 +.Lcbc_enc_loop: + movdqu ($inp),%xmm0 + pxor %xmm6,%xmm0 + call _vpaes_encrypt_core + movdqa %xmm0,%xmm6 + movdqu %xmm0,($out,$inp) + lea 16($inp),$inp + sub \$16,$len + jnc .Lcbc_enc_loop + jmp .Lcbc_done +.align 16 +.Lcbc_dec_loop: + movdqu ($inp),%xmm0 + movdqa %xmm0,%xmm7 + call _vpaes_decrypt_core + pxor %xmm6,%xmm0 + movdqa %xmm7,%xmm6 + movdqu %xmm0,($out,$inp) + lea 16($inp),$inp + sub \$16,$len + jnc .Lcbc_dec_loop +.Lcbc_done: + movdqu %xmm6,($ivp) # save IV +___ +$code.=<<___ if ($win64); + movaps 0x10(%rsp),%xmm6 + movaps 0x20(%rsp),%xmm7 + movaps 0x30(%rsp),%xmm8 + movaps 0x40(%rsp),%xmm9 + movaps 0x50(%rsp),%xmm10 + movaps 0x60(%rsp),%xmm11 + movaps 0x70(%rsp),%xmm12 + movaps 0x80(%rsp),%xmm13 + movaps 0x90(%rsp),%xmm14 + movaps 0xa0(%rsp),%xmm15 + lea 0xb8(%rsp),%rsp +.Lcbc_epilogue: +___ +$code.=<<___; + ret +.size ${PREFIX}_cbc_encrypt,.-${PREFIX}_cbc_encrypt +___ +} +$code.=<<___; +## +## _aes_preheat +## +## Fills register %r10 -> .aes_consts (so you can -fPIC) +## and %xmm9-%xmm15 as specified below. +## +.type _vpaes_preheat,\@abi-omnipotent +.align 16 +_vpaes_preheat: + lea .Lk_s0F(%rip), %r10 + movdqa -0x20(%r10), %xmm10 # .Lk_inv + movdqa -0x10(%r10), %xmm11 # .Lk_inv+16 + movdqa 0x00(%r10), %xmm9 # .Lk_s0F + movdqa 0x30(%r10), %xmm13 # .Lk_sb1 + movdqa 0x40(%r10), %xmm12 # .Lk_sb1+16 + movdqa 0x50(%r10), %xmm15 # .Lk_sb2 + movdqa 0x60(%r10), %xmm14 # .Lk_sb2+16 + ret +.size _vpaes_preheat,.-_vpaes_preheat +######################################################## +## ## +## Constants ## +## ## +######################################################## +.type _vpaes_consts,\@object +.align 64 +_vpaes_consts: +.Lk_inv: # inv, inva + .quad 0x0E05060F0D080180, 0x040703090A0B0C02 + .quad 0x01040A060F0B0780, 0x030D0E0C02050809 + +.Lk_s0F: # s0F + .quad 0x0F0F0F0F0F0F0F0F, 0x0F0F0F0F0F0F0F0F + +.Lk_ipt: # input transform (lo, hi) + .quad 0xC2B2E8985A2A7000, 0xCABAE09052227808 + .quad 0x4C01307D317C4D00, 0xCD80B1FCB0FDCC81 + +.Lk_sb1: # sb1u, sb1t + .quad 0xB19BE18FCB503E00, 0xA5DF7A6E142AF544 + .quad 0x3618D415FAE22300, 0x3BF7CCC10D2ED9EF +.Lk_sb2: # sb2u, sb2t + .quad 0xE27A93C60B712400, 0x5EB7E955BC982FCD + .quad 0x69EB88400AE12900, 0xC2A163C8AB82234A +.Lk_sbo: # sbou, sbot + .quad 0xD0D26D176FBDC700, 0x15AABF7AC502A878 + .quad 0xCFE474A55FBB6A00, 0x8E1E90D1412B35FA + +.Lk_mc_forward: # mc_forward + .quad 0x0407060500030201, 0x0C0F0E0D080B0A09 + .quad 0x080B0A0904070605, 0x000302010C0F0E0D + .quad 0x0C0F0E0D080B0A09, 0x0407060500030201 + .quad 0x000302010C0F0E0D, 0x080B0A0904070605 + +.Lk_mc_backward:# mc_backward + .quad 0x0605040702010003, 0x0E0D0C0F0A09080B + .quad 0x020100030E0D0C0F, 0x0A09080B06050407 + .quad 0x0E0D0C0F0A09080B, 0x0605040702010003 + .quad 0x0A09080B06050407, 0x020100030E0D0C0F + +.Lk_sr: # sr + .quad 0x0706050403020100, 0x0F0E0D0C0B0A0908 + .quad 0x030E09040F0A0500, 0x0B06010C07020D08 + .quad 0x0F060D040B020900, 0x070E050C030A0108 + .quad 0x0B0E0104070A0D00, 0x0306090C0F020508 + +.Lk_rcon: # rcon + .quad 0x1F8391B9AF9DEEB6, 0x702A98084D7C7D81 + +.Lk_s63: # s63: all equal to 0x63 transformed + .quad 0x5B5B5B5B5B5B5B5B, 0x5B5B5B5B5B5B5B5B + +.Lk_opt: # output transform + .quad 0xFF9F4929D6B66000, 0xF7974121DEBE6808 + .quad 0x01EDBD5150BCEC00, 0xE10D5DB1B05C0CE0 + +.Lk_deskew: # deskew tables: inverts the sbox's "skew" + .quad 0x07E4A34047A4E300, 0x1DFEB95A5DBEF91A + .quad 0x5F36B5DC83EA6900, 0x2841C2ABF49D1E77 + +## +## Decryption stuff +## Key schedule constants +## +.Lk_dksd: # decryption key schedule: invskew x*D + .quad 0xFEB91A5DA3E44700, 0x0740E3A45A1DBEF9 + .quad 0x41C277F4B5368300, 0x5FDC69EAAB289D1E +.Lk_dksb: # decryption key schedule: invskew x*B + .quad 0x9A4FCA1F8550D500, 0x03D653861CC94C99 + .quad 0x115BEDA7B6FC4A00, 0xD993256F7E3482C8 +.Lk_dkse: # decryption key schedule: invskew x*E + 0x63 + .quad 0xD5031CCA1FC9D600, 0x53859A4C994F5086 + .quad 0xA23196054FDC7BE8, 0xCD5EF96A20B31487 +.Lk_dks9: # decryption key schedule: invskew x*9 + .quad 0xB6116FC87ED9A700, 0x4AED933482255BFC + .quad 0x4576516227143300, 0x8BB89FACE9DAFDCE + +## +## Decryption stuff +## Round function constants +## +.Lk_dipt: # decryption input transform + .quad 0x0F505B040B545F00, 0x154A411E114E451A + .quad 0x86E383E660056500, 0x12771772F491F194 + +.Lk_dsb9: # decryption sbox output *9*u, *9*t + .quad 0x851C03539A86D600, 0xCAD51F504F994CC9 + .quad 0xC03B1789ECD74900, 0x725E2C9EB2FBA565 +.Lk_dsbd: # decryption sbox output *D*u, *D*t + .quad 0x7D57CCDFE6B1A200, 0xF56E9B13882A4439 + .quad 0x3CE2FAF724C6CB00, 0x2931180D15DEEFD3 +.Lk_dsbb: # decryption sbox output *B*u, *B*t + .quad 0xD022649296B44200, 0x602646F6B0F2D404 + .quad 0xC19498A6CD596700, 0xF3FF0C3E3255AA6B +.Lk_dsbe: # decryption sbox output *E*u, *E*t + .quad 0x46F2929626D4D000, 0x2242600464B4F6B0 + .quad 0x0C55A6CDFFAAC100, 0x9467F36B98593E32 +.Lk_dsbo: # decryption sbox final output + .quad 0x1387EA537EF94000, 0xC7AA6DB9D4943E2D + .quad 0x12D7560F93441D00, 0xCA4B8159D8C58E9C +.asciz "Vector Permutaion AES for x86_64/SSSE3, Mike Hamburg (Stanford University)" +.align 64 +.size _vpaes_consts,.-_vpaes_consts +___ + +if ($win64) { +# EXCEPTION_DISPOSITION handler (EXCEPTION_RECORD *rec,ULONG64 frame, +# CONTEXT *context,DISPATCHER_CONTEXT *disp) +$rec="%rcx"; +$frame="%rdx"; +$context="%r8"; +$disp="%r9"; + +$code.=<<___; +.extern __imp_RtlVirtualUnwind +.type se_handler,\@abi-omnipotent +.align 16 +se_handler: + push %rsi + push %rdi + push %rbx + push %rbp + push %r12 + push %r13 + push %r14 + push %r15 + pushfq + sub \$64,%rsp + + mov 120($context),%rax # pull context->Rax + mov 248($context),%rbx # pull context->Rip + + mov 8($disp),%rsi # disp->ImageBase + mov 56($disp),%r11 # disp->HandlerData + + mov 0(%r11),%r10d # HandlerData[0] + lea (%rsi,%r10),%r10 # prologue label + cmp %r10,%rbx # context->RipRsp + + mov 4(%r11),%r10d # HandlerData[1] + lea (%rsi,%r10),%r10 # epilogue label + cmp %r10,%rbx # context->Rip>=epilogue label + jae .Lin_prologue + + lea 16(%rax),%rsi # %xmm save area + lea 512($context),%rdi # &context.Xmm6 + mov \$20,%ecx # 10*sizeof(%xmm0)/sizeof(%rax) + .long 0xa548f3fc # cld; rep movsq + lea 0xb8(%rax),%rax # adjust stack pointer + +.Lin_prologue: + mov 8(%rax),%rdi + mov 16(%rax),%rsi + mov %rax,152($context) # restore context->Rsp + mov %rsi,168($context) # restore context->Rsi + mov %rdi,176($context) # restore context->Rdi + + mov 40($disp),%rdi # disp->ContextRecord + mov $context,%rsi # context + mov \$`1232/8`,%ecx # sizeof(CONTEXT) + .long 0xa548f3fc # cld; rep movsq + + mov $disp,%rsi + xor %rcx,%rcx # arg1, UNW_FLAG_NHANDLER + mov 8(%rsi),%rdx # arg2, disp->ImageBase + mov 0(%rsi),%r8 # arg3, disp->ControlPc + mov 16(%rsi),%r9 # arg4, disp->FunctionEntry + mov 40(%rsi),%r10 # disp->ContextRecord + lea 56(%rsi),%r11 # &disp->HandlerData + lea 24(%rsi),%r12 # &disp->EstablisherFrame + mov %r10,32(%rsp) # arg5 + mov %r11,40(%rsp) # arg6 + mov %r12,48(%rsp) # arg7 + mov %rcx,56(%rsp) # arg8, (NULL) + call *__imp_RtlVirtualUnwind(%rip) + + mov \$1,%eax # ExceptionContinueSearch + add \$64,%rsp + popfq + pop %r15 + pop %r14 + pop %r13 + pop %r12 + pop %rbp + pop %rbx + pop %rdi + pop %rsi + ret +.size se_handler,.-se_handler + +.section .pdata +.align 4 + .rva .LSEH_begin_${PREFIX}_set_encrypt_key + .rva .LSEH_end_${PREFIX}_set_encrypt_key + .rva .LSEH_info_${PREFIX}_set_encrypt_key + + .rva .LSEH_begin_${PREFIX}_set_decrypt_key + .rva .LSEH_end_${PREFIX}_set_decrypt_key + .rva .LSEH_info_${PREFIX}_set_decrypt_key + + .rva .LSEH_begin_${PREFIX}_encrypt + .rva .LSEH_end_${PREFIX}_encrypt + .rva .LSEH_info_${PREFIX}_encrypt + + .rva .LSEH_begin_${PREFIX}_decrypt + .rva .LSEH_end_${PREFIX}_decrypt + .rva .LSEH_info_${PREFIX}_decrypt + + .rva .LSEH_begin_${PREFIX}_cbc_encrypt + .rva .LSEH_end_${PREFIX}_cbc_encrypt + .rva .LSEH_info_${PREFIX}_cbc_encrypt + +.section .xdata +.align 8 +.LSEH_info_${PREFIX}_set_encrypt_key: + .byte 9,0,0,0 + .rva se_handler + .rva .Lenc_key_body,.Lenc_key_epilogue # HandlerData[] +.LSEH_info_${PREFIX}_set_decrypt_key: + .byte 9,0,0,0 + .rva se_handler + .rva .Ldec_key_body,.Ldec_key_epilogue # HandlerData[] +.LSEH_info_${PREFIX}_encrypt: + .byte 9,0,0,0 + .rva se_handler + .rva .Lenc_body,.Lenc_epilogue # HandlerData[] +.LSEH_info_${PREFIX}_decrypt: + .byte 9,0,0,0 + .rva se_handler + .rva .Ldec_body,.Ldec_epilogue # HandlerData[] +.LSEH_info_${PREFIX}_cbc_encrypt: + .byte 9,0,0,0 + .rva se_handler + .rva .Lcbc_body,.Lcbc_epilogue # HandlerData[] +___ +} + +$code =~ s/\`([^\`]*)\`/eval($1)/gem; + +print $code; + +close STDOUT; diff --git a/crypto/openssl/crypto/asn1/a_digest.c b/crypto/openssl/crypto/asn1/a_digest.c index d00d9e22b1..cbdeea6ac0 100644 --- a/crypto/openssl/crypto/asn1/a_digest.c +++ b/crypto/openssl/crypto/asn1/a_digest.c @@ -87,7 +87,8 @@ int ASN1_digest(i2d_of_void *i2d, const EVP_MD *type, char *data, p=str; i2d(data,&p); - EVP_Digest(str, i, md, len, type, NULL); + if (!EVP_Digest(str, i, md, len, type, NULL)) + return 0; OPENSSL_free(str); return(1); } @@ -104,7 +105,8 @@ int ASN1_item_digest(const ASN1_ITEM *it, const EVP_MD *type, void *asn, i=ASN1_item_i2d(asn,&str, it); if (!str) return(0); - EVP_Digest(str, i, md, len, type, NULL); + if (!EVP_Digest(str, i, md, len, type, NULL)) + return 0; OPENSSL_free(str); return(1); } diff --git a/crypto/openssl/crypto/asn1/a_sign.c b/crypto/openssl/crypto/asn1/a_sign.c index ff63bfc7be..7b4a193d6b 100644 --- a/crypto/openssl/crypto/asn1/a_sign.c +++ b/crypto/openssl/crypto/asn1/a_sign.c @@ -184,9 +184,9 @@ int ASN1_sign(i2d_of_void *i2d, X509_ALGOR *algor1, X509_ALGOR *algor2, p=buf_in; i2d(data,&p); - EVP_SignInit_ex(&ctx,type, NULL); - EVP_SignUpdate(&ctx,(unsigned char *)buf_in,inl); - if (!EVP_SignFinal(&ctx,(unsigned char *)buf_out, + if (!EVP_SignInit_ex(&ctx,type, NULL) + || !EVP_SignUpdate(&ctx,(unsigned char *)buf_in,inl) + || !EVP_SignFinal(&ctx,(unsigned char *)buf_out, (unsigned int *)&outl,pkey)) { outl=0; @@ -218,65 +218,100 @@ int ASN1_item_sign(const ASN1_ITEM *it, X509_ALGOR *algor1, X509_ALGOR *algor2, const EVP_MD *type) { EVP_MD_CTX ctx; + EVP_MD_CTX_init(&ctx); + if (!EVP_DigestSignInit(&ctx, NULL, type, NULL, pkey)) + { + EVP_MD_CTX_cleanup(&ctx); + return 0; + } + return ASN1_item_sign_ctx(it, algor1, algor2, signature, asn, &ctx); + } + + +int ASN1_item_sign_ctx(const ASN1_ITEM *it, + X509_ALGOR *algor1, X509_ALGOR *algor2, + ASN1_BIT_STRING *signature, void *asn, EVP_MD_CTX *ctx) + { + const EVP_MD *type; + EVP_PKEY *pkey; unsigned char *buf_in=NULL,*buf_out=NULL; - int inl=0,outl=0,outll=0; + size_t inl=0,outl=0,outll=0; int signid, paramtype; + int rv; + + type = EVP_MD_CTX_md(ctx); + pkey = EVP_PKEY_CTX_get0_pkey(ctx->pctx); - if (type == NULL) + if (!type || !pkey) { - int def_nid; - if (EVP_PKEY_get_default_digest_nid(pkey, &def_nid) > 0) - type = EVP_get_digestbynid(def_nid); + ASN1err(ASN1_F_ASN1_ITEM_SIGN_CTX, ASN1_R_CONTEXT_NOT_INITIALISED); + return 0; } - if (type == NULL) + if (pkey->ameth->item_sign) { - ASN1err(ASN1_F_ASN1_ITEM_SIGN, ASN1_R_NO_DEFAULT_DIGEST); - return 0; + rv = pkey->ameth->item_sign(ctx, it, asn, algor1, algor2, + signature); + if (rv == 1) + outl = signature->length; + /* Return value meanings: + * <=0: error. + * 1: method does everything. + * 2: carry on as normal. + * 3: ASN1 method sets algorithm identifiers: just sign. + */ + if (rv <= 0) + ASN1err(ASN1_F_ASN1_ITEM_SIGN_CTX, ERR_R_EVP_LIB); + if (rv <= 1) + goto err; } + else + rv = 2; - if (type->flags & EVP_MD_FLAG_PKEY_METHOD_SIGNATURE) + if (rv == 2) { - if (!pkey->ameth || - !OBJ_find_sigid_by_algs(&signid, EVP_MD_nid(type), - pkey->ameth->pkey_id)) + if (type->flags & EVP_MD_FLAG_PKEY_METHOD_SIGNATURE) { - ASN1err(ASN1_F_ASN1_ITEM_SIGN, - ASN1_R_DIGEST_AND_KEY_TYPE_NOT_SUPPORTED); - return 0; + if (!pkey->ameth || + !OBJ_find_sigid_by_algs(&signid, + EVP_MD_nid(type), + pkey->ameth->pkey_id)) + { + ASN1err(ASN1_F_ASN1_ITEM_SIGN_CTX, + ASN1_R_DIGEST_AND_KEY_TYPE_NOT_SUPPORTED); + return 0; + } } - } - else - signid = type->pkey_type; + else + signid = type->pkey_type; - if (pkey->ameth->pkey_flags & ASN1_PKEY_SIGPARAM_NULL) - paramtype = V_ASN1_NULL; - else - paramtype = V_ASN1_UNDEF; + if (pkey->ameth->pkey_flags & ASN1_PKEY_SIGPARAM_NULL) + paramtype = V_ASN1_NULL; + else + paramtype = V_ASN1_UNDEF; - if (algor1) - X509_ALGOR_set0(algor1, OBJ_nid2obj(signid), paramtype, NULL); - if (algor2) - X509_ALGOR_set0(algor2, OBJ_nid2obj(signid), paramtype, NULL); + if (algor1) + X509_ALGOR_set0(algor1, OBJ_nid2obj(signid), paramtype, NULL); + if (algor2) + X509_ALGOR_set0(algor2, OBJ_nid2obj(signid), paramtype, NULL); + + } - EVP_MD_CTX_init(&ctx); inl=ASN1_item_i2d(asn,&buf_in, it); outll=outl=EVP_PKEY_size(pkey); - buf_out=(unsigned char *)OPENSSL_malloc((unsigned int)outl); + buf_out=OPENSSL_malloc((unsigned int)outl); if ((buf_in == NULL) || (buf_out == NULL)) { outl=0; - ASN1err(ASN1_F_ASN1_ITEM_SIGN,ERR_R_MALLOC_FAILURE); + ASN1err(ASN1_F_ASN1_ITEM_SIGN_CTX,ERR_R_MALLOC_FAILURE); goto err; } - EVP_SignInit_ex(&ctx,type, NULL); - EVP_SignUpdate(&ctx,(unsigned char *)buf_in,inl); - if (!EVP_SignFinal(&ctx,(unsigned char *)buf_out, - (unsigned int *)&outl,pkey)) + if (!EVP_DigestSignUpdate(ctx, buf_in, inl) + || !EVP_DigestSignFinal(ctx, buf_out, &outl)) { outl=0; - ASN1err(ASN1_F_ASN1_ITEM_SIGN,ERR_R_EVP_LIB); + ASN1err(ASN1_F_ASN1_ITEM_SIGN_CTX,ERR_R_EVP_LIB); goto err; } if (signature->data != NULL) OPENSSL_free(signature->data); @@ -289,7 +324,7 @@ int ASN1_item_sign(const ASN1_ITEM *it, X509_ALGOR *algor1, X509_ALGOR *algor2, signature->flags&= ~(ASN1_STRING_FLAG_BITS_LEFT|0x07); signature->flags|=ASN1_STRING_FLAG_BITS_LEFT; err: - EVP_MD_CTX_cleanup(&ctx); + EVP_MD_CTX_cleanup(ctx); if (buf_in != NULL) { OPENSSL_cleanse((char *)buf_in,(unsigned int)inl); OPENSSL_free(buf_in); } if (buf_out != NULL) diff --git a/crypto/openssl/crypto/asn1/a_verify.c b/crypto/openssl/crypto/asn1/a_verify.c index cecdb13c70..432722e409 100644 --- a/crypto/openssl/crypto/asn1/a_verify.c +++ b/crypto/openssl/crypto/asn1/a_verify.c @@ -101,8 +101,13 @@ int ASN1_verify(i2d_of_void *i2d, X509_ALGOR *a, ASN1_BIT_STRING *signature, p=buf_in; i2d(data,&p); - EVP_VerifyInit_ex(&ctx,type, NULL); - EVP_VerifyUpdate(&ctx,(unsigned char *)buf_in,inl); + if (!EVP_VerifyInit_ex(&ctx,type, NULL) + || !EVP_VerifyUpdate(&ctx,(unsigned char *)buf_in,inl)) + { + ASN1err(ASN1_F_ASN1_VERIFY,ERR_R_EVP_LIB); + ret=0; + goto err; + } OPENSSL_cleanse(buf_in,(unsigned int)inl); OPENSSL_free(buf_in); @@ -126,11 +131,10 @@ err: #endif -int ASN1_item_verify(const ASN1_ITEM *it, X509_ALGOR *a, ASN1_BIT_STRING *signature, - void *asn, EVP_PKEY *pkey) +int ASN1_item_verify(const ASN1_ITEM *it, X509_ALGOR *a, + ASN1_BIT_STRING *signature, void *asn, EVP_PKEY *pkey) { EVP_MD_CTX ctx; - const EVP_MD *type = NULL; unsigned char *buf_in=NULL; int ret= -1,inl; @@ -144,25 +148,47 @@ int ASN1_item_verify(const ASN1_ITEM *it, X509_ALGOR *a, ASN1_BIT_STRING *signat ASN1err(ASN1_F_ASN1_ITEM_VERIFY,ASN1_R_UNKNOWN_SIGNATURE_ALGORITHM); goto err; } - type=EVP_get_digestbynid(mdnid); - if (type == NULL) + if (mdnid == NID_undef) { - ASN1err(ASN1_F_ASN1_ITEM_VERIFY,ASN1_R_UNKNOWN_MESSAGE_DIGEST_ALGORITHM); - goto err; + if (!pkey->ameth || !pkey->ameth->item_verify) + { + ASN1err(ASN1_F_ASN1_ITEM_VERIFY,ASN1_R_UNKNOWN_SIGNATURE_ALGORITHM); + goto err; + } + ret = pkey->ameth->item_verify(&ctx, it, asn, a, + signature, pkey); + /* Return value of 2 means carry on, anything else means we + * exit straight away: either a fatal error of the underlying + * verification routine handles all verification. + */ + if (ret != 2) + goto err; + ret = -1; } - - /* Check public key OID matches public key type */ - if (EVP_PKEY_type(pknid) != pkey->ameth->pkey_id) + else { - ASN1err(ASN1_F_ASN1_ITEM_VERIFY,ASN1_R_WRONG_PUBLIC_KEY_TYPE); - goto err; - } + const EVP_MD *type; + type=EVP_get_digestbynid(mdnid); + if (type == NULL) + { + ASN1err(ASN1_F_ASN1_ITEM_VERIFY,ASN1_R_UNKNOWN_MESSAGE_DIGEST_ALGORITHM); + goto err; + } + + /* Check public key OID matches public key type */ + if (EVP_PKEY_type(pknid) != pkey->ameth->pkey_id) + { + ASN1err(ASN1_F_ASN1_ITEM_VERIFY,ASN1_R_WRONG_PUBLIC_KEY_TYPE); + goto err; + } + + if (!EVP_DigestVerifyInit(&ctx, NULL, type, NULL, pkey)) + { + ASN1err(ASN1_F_ASN1_ITEM_VERIFY,ERR_R_EVP_LIB); + ret=0; + goto err; + } - if (!EVP_VerifyInit_ex(&ctx,type, NULL)) - { - ASN1err(ASN1_F_ASN1_ITEM_VERIFY,ERR_R_EVP_LIB); - ret=0; - goto err; } inl = ASN1_item_i2d(asn, &buf_in, it); @@ -173,13 +199,18 @@ int ASN1_item_verify(const ASN1_ITEM *it, X509_ALGOR *a, ASN1_BIT_STRING *signat goto err; } - EVP_VerifyUpdate(&ctx,(unsigned char *)buf_in,inl); + if (!EVP_DigestVerifyUpdate(&ctx,buf_in,inl)) + { + ASN1err(ASN1_F_ASN1_ITEM_VERIFY,ERR_R_EVP_LIB); + ret=0; + goto err; + } OPENSSL_cleanse(buf_in,(unsigned int)inl); OPENSSL_free(buf_in); - if (EVP_VerifyFinal(&ctx,(unsigned char *)signature->data, - (unsigned int)signature->length,pkey) <= 0) + if (EVP_DigestVerifyFinal(&ctx,signature->data, + (size_t)signature->length) <= 0) { ASN1err(ASN1_F_ASN1_ITEM_VERIFY,ERR_R_EVP_LIB); ret=0; diff --git a/crypto/openssl/crypto/asn1/ameth_lib.c b/crypto/openssl/crypto/asn1/ameth_lib.c index 5a581b90ea..a19e058fca 100644 --- a/crypto/openssl/crypto/asn1/ameth_lib.c +++ b/crypto/openssl/crypto/asn1/ameth_lib.c @@ -69,6 +69,7 @@ extern const EVP_PKEY_ASN1_METHOD dsa_asn1_meths[]; extern const EVP_PKEY_ASN1_METHOD dh_asn1_meth; extern const EVP_PKEY_ASN1_METHOD eckey_asn1_meth; extern const EVP_PKEY_ASN1_METHOD hmac_asn1_meth; +extern const EVP_PKEY_ASN1_METHOD cmac_asn1_meth; /* Keep this sorted in type order !! */ static const EVP_PKEY_ASN1_METHOD *standard_methods[] = @@ -90,7 +91,8 @@ static const EVP_PKEY_ASN1_METHOD *standard_methods[] = #ifndef OPENSSL_NO_EC &eckey_asn1_meth, #endif - &hmac_asn1_meth + &hmac_asn1_meth, + &cmac_asn1_meth }; typedef int sk_cmp_fn_type(const char * const *a, const char * const *b); @@ -291,6 +293,8 @@ EVP_PKEY_ASN1_METHOD* EVP_PKEY_asn1_new(int id, int flags, if (!ameth) return NULL; + memset(ameth, 0, sizeof(EVP_PKEY_ASN1_METHOD)); + ameth->pkey_id = id; ameth->pkey_base_id = id; ameth->pkey_flags = flags | ASN1_PKEY_DYNAMIC; @@ -325,6 +329,9 @@ EVP_PKEY_ASN1_METHOD* EVP_PKEY_asn1_new(int id, int flags, ameth->old_priv_encode = 0; ameth->old_priv_decode = 0; + ameth->item_verify = 0; + ameth->item_sign = 0; + ameth->pkey_size = 0; ameth->pkey_bits = 0; @@ -376,6 +383,9 @@ void EVP_PKEY_asn1_copy(EVP_PKEY_ASN1_METHOD *dst, dst->pkey_free = src->pkey_free; dst->pkey_ctrl = src->pkey_ctrl; + dst->item_sign = src->item_sign; + dst->item_verify = src->item_verify; + } void EVP_PKEY_asn1_free(EVP_PKEY_ASN1_METHOD *ameth) diff --git a/crypto/openssl/crypto/asn1/asn1.h b/crypto/openssl/crypto/asn1/asn1.h index 59540e4e79..220a0c8c63 100644 --- a/crypto/openssl/crypto/asn1/asn1.h +++ b/crypto/openssl/crypto/asn1/asn1.h @@ -235,7 +235,7 @@ typedef struct asn1_object_st */ #define ASN1_STRING_FLAG_MSTRING 0x040 /* This is the base type that holds just about everything :-) */ -typedef struct asn1_string_st +struct asn1_string_st { int length; int type; @@ -245,7 +245,7 @@ typedef struct asn1_string_st * input data has a non-zero 'unused bits' value, it will be * handled correctly */ long flags; - } ASN1_STRING; + }; /* ASN1_ENCODING structure: this is used to save the received * encoding of an ASN1 type. This is useful to get round @@ -293,7 +293,6 @@ DECLARE_STACK_OF(ASN1_STRING_TABLE) * see asn1t.h */ typedef struct ASN1_TEMPLATE_st ASN1_TEMPLATE; -typedef struct ASN1_ITEM_st ASN1_ITEM; typedef struct ASN1_TLC_st ASN1_TLC; /* This is just an opaque pointer */ typedef struct ASN1_VALUE_st ASN1_VALUE; @@ -1194,6 +1193,7 @@ void ERR_load_ASN1_strings(void); #define ASN1_F_ASN1_ITEM_I2D_FP 193 #define ASN1_F_ASN1_ITEM_PACK 198 #define ASN1_F_ASN1_ITEM_SIGN 195 +#define ASN1_F_ASN1_ITEM_SIGN_CTX 220 #define ASN1_F_ASN1_ITEM_UNPACK 199 #define ASN1_F_ASN1_ITEM_VERIFY 197 #define ASN1_F_ASN1_MBSTRING_NCOPY 122 @@ -1266,6 +1266,7 @@ void ERR_load_ASN1_strings(void); #define ASN1_F_PKCS5_PBE2_SET_IV 167 #define ASN1_F_PKCS5_PBE_SET 202 #define ASN1_F_PKCS5_PBE_SET0_ALGOR 215 +#define ASN1_F_PKCS5_PBKDF2_SET 219 #define ASN1_F_SMIME_READ_ASN1 212 #define ASN1_F_SMIME_TEXT 213 #define ASN1_F_X509_CINF_NEW 168 @@ -1291,6 +1292,7 @@ void ERR_load_ASN1_strings(void); #define ASN1_R_BOOLEAN_IS_WRONG_LENGTH 106 #define ASN1_R_BUFFER_TOO_SMALL 107 #define ASN1_R_CIPHER_HAS_NO_OBJECT_IDENTIFIER 108 +#define ASN1_R_CONTEXT_NOT_INITIALISED 217 #define ASN1_R_DATA_IS_WRONG 109 #define ASN1_R_DECODE_ERROR 110 #define ASN1_R_DECODING_ERROR 111 diff --git a/crypto/openssl/crypto/asn1/asn1_err.c b/crypto/openssl/crypto/asn1/asn1_err.c index 6e04d08f31..1a30bf119b 100644 --- a/crypto/openssl/crypto/asn1/asn1_err.c +++ b/crypto/openssl/crypto/asn1/asn1_err.c @@ -1,6 +1,6 @@ /* crypto/asn1/asn1_err.c */ /* ==================================================================== - * Copyright (c) 1999-2009 The OpenSSL Project. All rights reserved. + * Copyright (c) 1999-2011 The OpenSSL Project. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions @@ -107,6 +107,7 @@ static ERR_STRING_DATA ASN1_str_functs[]= {ERR_FUNC(ASN1_F_ASN1_ITEM_I2D_FP), "ASN1_item_i2d_fp"}, {ERR_FUNC(ASN1_F_ASN1_ITEM_PACK), "ASN1_item_pack"}, {ERR_FUNC(ASN1_F_ASN1_ITEM_SIGN), "ASN1_item_sign"}, +{ERR_FUNC(ASN1_F_ASN1_ITEM_SIGN_CTX), "ASN1_item_sign_ctx"}, {ERR_FUNC(ASN1_F_ASN1_ITEM_UNPACK), "ASN1_item_unpack"}, {ERR_FUNC(ASN1_F_ASN1_ITEM_VERIFY), "ASN1_item_verify"}, {ERR_FUNC(ASN1_F_ASN1_MBSTRING_NCOPY), "ASN1_mbstring_ncopy"}, @@ -179,6 +180,7 @@ static ERR_STRING_DATA ASN1_str_functs[]= {ERR_FUNC(ASN1_F_PKCS5_PBE2_SET_IV), "PKCS5_pbe2_set_iv"}, {ERR_FUNC(ASN1_F_PKCS5_PBE_SET), "PKCS5_pbe_set"}, {ERR_FUNC(ASN1_F_PKCS5_PBE_SET0_ALGOR), "PKCS5_pbe_set0_algor"}, +{ERR_FUNC(ASN1_F_PKCS5_PBKDF2_SET), "PKCS5_pbkdf2_set"}, {ERR_FUNC(ASN1_F_SMIME_READ_ASN1), "SMIME_read_ASN1"}, {ERR_FUNC(ASN1_F_SMIME_TEXT), "SMIME_text"}, {ERR_FUNC(ASN1_F_X509_CINF_NEW), "X509_CINF_NEW"}, @@ -207,6 +209,7 @@ static ERR_STRING_DATA ASN1_str_reasons[]= {ERR_REASON(ASN1_R_BOOLEAN_IS_WRONG_LENGTH),"boolean is wrong length"}, {ERR_REASON(ASN1_R_BUFFER_TOO_SMALL) ,"buffer too small"}, {ERR_REASON(ASN1_R_CIPHER_HAS_NO_OBJECT_IDENTIFIER),"cipher has no object identifier"}, +{ERR_REASON(ASN1_R_CONTEXT_NOT_INITIALISED),"context not initialised"}, {ERR_REASON(ASN1_R_DATA_IS_WRONG) ,"data is wrong"}, {ERR_REASON(ASN1_R_DECODE_ERROR) ,"decode error"}, {ERR_REASON(ASN1_R_DECODING_ERROR) ,"decoding error"}, diff --git a/crypto/openssl/crypto/asn1/asn1_locl.h b/crypto/openssl/crypto/asn1/asn1_locl.h index 5aa65e28f5..9fcf0d9530 100644 --- a/crypto/openssl/crypto/asn1/asn1_locl.h +++ b/crypto/openssl/crypto/asn1/asn1_locl.h @@ -102,6 +102,10 @@ struct evp_pkey_asn1_method_st int (*param_cmp)(const EVP_PKEY *a, const EVP_PKEY *b); int (*param_print)(BIO *out, const EVP_PKEY *pkey, int indent, ASN1_PCTX *pctx); + int (*sig_print)(BIO *out, + const X509_ALGOR *sigalg, const ASN1_STRING *sig, + int indent, ASN1_PCTX *pctx); + void (*pkey_free)(EVP_PKEY *pkey); int (*pkey_ctrl)(EVP_PKEY *pkey, int op, long arg1, void *arg2); @@ -111,6 +115,13 @@ struct evp_pkey_asn1_method_st int (*old_priv_decode)(EVP_PKEY *pkey, const unsigned char **pder, int derlen); int (*old_priv_encode)(const EVP_PKEY *pkey, unsigned char **pder); + /* Custom ASN1 signature verification */ + int (*item_verify)(EVP_MD_CTX *ctx, const ASN1_ITEM *it, void *asn, + X509_ALGOR *a, ASN1_BIT_STRING *sig, + EVP_PKEY *pkey); + int (*item_sign)(EVP_MD_CTX *ctx, const ASN1_ITEM *it, void *asn, + X509_ALGOR *alg1, X509_ALGOR *alg2, + ASN1_BIT_STRING *sig); } /* EVP_PKEY_ASN1_METHOD */; diff --git a/crypto/openssl/crypto/asn1/asn_mime.c b/crypto/openssl/crypto/asn1/asn_mime.c index bbc4952918..54a704a969 100644 --- a/crypto/openssl/crypto/asn1/asn_mime.c +++ b/crypto/openssl/crypto/asn1/asn_mime.c @@ -377,8 +377,12 @@ static int asn1_output_data(BIO *out, BIO *data, ASN1_VALUE *val, int flags, BIO *tmpbio; const ASN1_AUX *aux = it->funcs; ASN1_STREAM_ARG sarg; + int rv = 1; - if (!(flags & SMIME_DETACHED)) + /* If data is not deteched or resigning then the output BIO is + * already set up to finalise when it is written through. + */ + if (!(flags & SMIME_DETACHED) || (flags & PKCS7_REUSE_DIGEST)) { SMIME_crlf_copy(data, out, flags); return 1; @@ -405,7 +409,7 @@ static int asn1_output_data(BIO *out, BIO *data, ASN1_VALUE *val, int flags, /* Finalize structure */ if (aux->asn1_cb(ASN1_OP_DETACHED_POST, &val, it, &sarg) <= 0) - return 0; + rv = 0; /* Now remove any digests prepended to the BIO */ @@ -416,7 +420,7 @@ static int asn1_output_data(BIO *out, BIO *data, ASN1_VALUE *val, int flags, sarg.ndef_bio = tmpbio; } - return 1; + return rv; } @@ -486,9 +490,9 @@ ASN1_VALUE *SMIME_read_ASN1(BIO *bio, BIO **bcont, const ASN1_ITEM *it) if(strcmp(hdr->value, "application/x-pkcs7-signature") && strcmp(hdr->value, "application/pkcs7-signature")) { - sk_MIME_HEADER_pop_free(headers, mime_hdr_free); ASN1err(ASN1_F_SMIME_READ_ASN1,ASN1_R_SIG_INVALID_MIME_TYPE); ERR_add_error_data(2, "type: ", hdr->value); + sk_MIME_HEADER_pop_free(headers, mime_hdr_free); sk_BIO_pop_free(parts, BIO_vfree); return NULL; } @@ -858,12 +862,17 @@ static int mime_hdr_addparam(MIME_HEADER *mhdr, char *name, char *value) static int mime_hdr_cmp(const MIME_HEADER * const *a, const MIME_HEADER * const *b) { + if (!(*a)->name || !(*b)->name) + return !!(*a)->name - !!(*b)->name; + return(strcmp((*a)->name, (*b)->name)); } static int mime_param_cmp(const MIME_PARAM * const *a, const MIME_PARAM * const *b) { + if (!(*a)->param_name || !(*b)->param_name) + return !!(*a)->param_name - !!(*b)->param_name; return(strcmp((*a)->param_name, (*b)->param_name)); } diff --git a/crypto/openssl/crypto/asn1/n_pkey.c b/crypto/openssl/crypto/asn1/n_pkey.c index e7d0439062..e251739933 100644 --- a/crypto/openssl/crypto/asn1/n_pkey.c +++ b/crypto/openssl/crypto/asn1/n_pkey.c @@ -129,6 +129,7 @@ int i2d_RSA_NET(const RSA *a, unsigned char **pp, unsigned char buf[256],*zz; unsigned char key[EVP_MAX_KEY_LENGTH]; EVP_CIPHER_CTX ctx; + EVP_CIPHER_CTX_init(&ctx); if (a == NULL) return(0); @@ -206,24 +207,28 @@ int i2d_RSA_NET(const RSA *a, unsigned char **pp, i = strlen((char *)buf); /* If the key is used for SGC the algorithm is modified a little. */ if(sgckey) { - EVP_Digest(buf, i, buf, NULL, EVP_md5(), NULL); + if (!EVP_Digest(buf, i, buf, NULL, EVP_md5(), NULL)) + goto err; memcpy(buf + 16, "SGCKEYSALT", 10); i = 26; } - EVP_BytesToKey(EVP_rc4(),EVP_md5(),NULL,buf,i,1,key,NULL); + if (!EVP_BytesToKey(EVP_rc4(),EVP_md5(),NULL,buf,i,1,key,NULL)) + goto err; OPENSSL_cleanse(buf,256); /* Encrypt private key in place */ zz = enckey->enckey->digest->data; - EVP_CIPHER_CTX_init(&ctx); - EVP_EncryptInit_ex(&ctx,EVP_rc4(),NULL,key,NULL); - EVP_EncryptUpdate(&ctx,zz,&i,zz,pkeylen); - EVP_EncryptFinal_ex(&ctx,zz + i,&j); - EVP_CIPHER_CTX_cleanup(&ctx); + if (!EVP_EncryptInit_ex(&ctx,EVP_rc4(),NULL,key,NULL)) + goto err; + if (!EVP_EncryptUpdate(&ctx,zz,&i,zz,pkeylen)) + goto err; + if (!EVP_EncryptFinal_ex(&ctx,zz + i,&j)) + goto err; ret = i2d_NETSCAPE_ENCRYPTED_PKEY(enckey, pp); err: + EVP_CIPHER_CTX_cleanup(&ctx); NETSCAPE_ENCRYPTED_PKEY_free(enckey); NETSCAPE_PKEY_free(pkey); return(ret); @@ -288,6 +293,7 @@ static RSA *d2i_RSA_NET_2(RSA **a, ASN1_OCTET_STRING *os, const unsigned char *zz; unsigned char key[EVP_MAX_KEY_LENGTH]; EVP_CIPHER_CTX ctx; + EVP_CIPHER_CTX_init(&ctx); i=cb((char *)buf,256,"Enter Private Key password:",0); if (i != 0) @@ -298,19 +304,22 @@ static RSA *d2i_RSA_NET_2(RSA **a, ASN1_OCTET_STRING *os, i = strlen((char *)buf); if(sgckey){ - EVP_Digest(buf, i, buf, NULL, EVP_md5(), NULL); + if (!EVP_Digest(buf, i, buf, NULL, EVP_md5(), NULL)) + goto err; memcpy(buf + 16, "SGCKEYSALT", 10); i = 26; } - EVP_BytesToKey(EVP_rc4(),EVP_md5(),NULL,buf,i,1,key,NULL); + if (!EVP_BytesToKey(EVP_rc4(),EVP_md5(),NULL,buf,i,1,key,NULL)) + goto err; OPENSSL_cleanse(buf,256); - EVP_CIPHER_CTX_init(&ctx); - EVP_DecryptInit_ex(&ctx,EVP_rc4(),NULL, key,NULL); - EVP_DecryptUpdate(&ctx,os->data,&i,os->data,os->length); - EVP_DecryptFinal_ex(&ctx,&(os->data[i]),&j); - EVP_CIPHER_CTX_cleanup(&ctx); + if (!EVP_DecryptInit_ex(&ctx,EVP_rc4(),NULL, key,NULL)) + goto err; + if (!EVP_DecryptUpdate(&ctx,os->data,&i,os->data,os->length)) + goto err; + if (!EVP_DecryptFinal_ex(&ctx,&(os->data[i]),&j)) + goto err; os->length=i+j; zz=os->data; @@ -328,6 +337,7 @@ static RSA *d2i_RSA_NET_2(RSA **a, ASN1_OCTET_STRING *os, goto err; } err: + EVP_CIPHER_CTX_cleanup(&ctx); NETSCAPE_PKEY_free(pkey); return(ret); } diff --git a/crypto/openssl/crypto/asn1/p5_pbev2.c b/crypto/openssl/crypto/asn1/p5_pbev2.c index cb49b6651d..4ea683036b 100644 --- a/crypto/openssl/crypto/asn1/p5_pbev2.c +++ b/crypto/openssl/crypto/asn1/p5_pbev2.c @@ -91,12 +91,10 @@ X509_ALGOR *PKCS5_pbe2_set_iv(const EVP_CIPHER *cipher, int iter, unsigned char *aiv, int prf_nid) { X509_ALGOR *scheme = NULL, *kalg = NULL, *ret = NULL; - int alg_nid; + int alg_nid, keylen; EVP_CIPHER_CTX ctx; unsigned char iv[EVP_MAX_IV_LENGTH]; - PBKDF2PARAM *kdf = NULL; PBE2PARAM *pbe2 = NULL; - ASN1_OCTET_STRING *osalt = NULL; ASN1_OBJECT *obj; alg_nid = EVP_CIPHER_type(cipher); @@ -127,7 +125,8 @@ X509_ALGOR *PKCS5_pbe2_set_iv(const EVP_CIPHER *cipher, int iter, EVP_CIPHER_CTX_init(&ctx); /* Dummy cipherinit to just setup the IV, and PRF */ - EVP_CipherInit_ex(&ctx, cipher, NULL, NULL, iv, 0); + if (!EVP_CipherInit_ex(&ctx, cipher, NULL, NULL, iv, 0)) + goto err; if(EVP_CIPHER_param_to_asn1(&ctx, scheme->parameter) < 0) { ASN1err(ASN1_F_PKCS5_PBE2_SET_IV, ASN1_R_ERROR_SETTING_CIPHER_PARAMS); @@ -145,55 +144,21 @@ X509_ALGOR *PKCS5_pbe2_set_iv(const EVP_CIPHER *cipher, int iter, } EVP_CIPHER_CTX_cleanup(&ctx); - if(!(kdf = PBKDF2PARAM_new())) goto merr; - if(!(osalt = M_ASN1_OCTET_STRING_new())) goto merr; - - if (!saltlen) saltlen = PKCS5_SALT_LEN; - if (!(osalt->data = OPENSSL_malloc (saltlen))) goto merr; - osalt->length = saltlen; - if (salt) memcpy (osalt->data, salt, saltlen); - else if (RAND_pseudo_bytes (osalt->data, saltlen) < 0) goto merr; - - if(iter <= 0) iter = PKCS5_DEFAULT_ITER; - if(!ASN1_INTEGER_set(kdf->iter, iter)) goto merr; - - /* Now include salt in kdf structure */ - kdf->salt->value.octet_string = osalt; - kdf->salt->type = V_ASN1_OCTET_STRING; - osalt = NULL; - /* If its RC2 then we'd better setup the key length */ - if(alg_nid == NID_rc2_cbc) { - if(!(kdf->keylength = M_ASN1_INTEGER_new())) goto merr; - if(!ASN1_INTEGER_set (kdf->keylength, - EVP_CIPHER_key_length(cipher))) goto merr; - } - - /* prf can stay NULL if we are using hmacWithSHA1 */ - if (prf_nid != NID_hmacWithSHA1) - { - kdf->prf = X509_ALGOR_new(); - if (!kdf->prf) - goto merr; - X509_ALGOR_set0(kdf->prf, OBJ_nid2obj(prf_nid), - V_ASN1_NULL, NULL); - } - - /* Now setup the PBE2PARAM keyfunc structure */ + if(alg_nid == NID_rc2_cbc) + keylen = EVP_CIPHER_key_length(cipher); + else + keylen = -1; - pbe2->keyfunc->algorithm = OBJ_nid2obj(NID_id_pbkdf2); + /* Setup keyfunc */ - /* Encode PBKDF2PARAM into parameter of pbe2 */ + X509_ALGOR_free(pbe2->keyfunc); - if(!(pbe2->keyfunc->parameter = ASN1_TYPE_new())) goto merr; + pbe2->keyfunc = PKCS5_pbkdf2_set(iter, salt, saltlen, prf_nid, keylen); - if(!ASN1_item_pack(kdf, ASN1_ITEM_rptr(PBKDF2PARAM), - &pbe2->keyfunc->parameter->value.sequence)) goto merr; - pbe2->keyfunc->parameter->type = V_ASN1_SEQUENCE; - - PBKDF2PARAM_free(kdf); - kdf = NULL; + if (!pbe2->keyfunc) + goto merr; /* Now set up top level AlgorithmIdentifier */ @@ -219,8 +184,6 @@ X509_ALGOR *PKCS5_pbe2_set_iv(const EVP_CIPHER *cipher, int iter, err: PBE2PARAM_free(pbe2); /* Note 'scheme' is freed as part of pbe2 */ - M_ASN1_OCTET_STRING_free(osalt); - PBKDF2PARAM_free(kdf); X509_ALGOR_free(kalg); X509_ALGOR_free(ret); @@ -233,3 +196,85 @@ X509_ALGOR *PKCS5_pbe2_set(const EVP_CIPHER *cipher, int iter, { return PKCS5_pbe2_set_iv(cipher, iter, salt, saltlen, NULL, -1); } + +X509_ALGOR *PKCS5_pbkdf2_set(int iter, unsigned char *salt, int saltlen, + int prf_nid, int keylen) + { + X509_ALGOR *keyfunc = NULL; + PBKDF2PARAM *kdf = NULL; + ASN1_OCTET_STRING *osalt = NULL; + + if(!(kdf = PBKDF2PARAM_new())) + goto merr; + if(!(osalt = M_ASN1_OCTET_STRING_new())) + goto merr; + + kdf->salt->value.octet_string = osalt; + kdf->salt->type = V_ASN1_OCTET_STRING; + + if (!saltlen) + saltlen = PKCS5_SALT_LEN; + if (!(osalt->data = OPENSSL_malloc (saltlen))) + goto merr; + + osalt->length = saltlen; + + if (salt) + memcpy (osalt->data, salt, saltlen); + else if (RAND_pseudo_bytes (osalt->data, saltlen) < 0) + goto merr; + + if(iter <= 0) + iter = PKCS5_DEFAULT_ITER; + + if(!ASN1_INTEGER_set(kdf->iter, iter)) + goto merr; + + /* If have a key len set it up */ + + if(keylen > 0) + { + if(!(kdf->keylength = M_ASN1_INTEGER_new())) + goto merr; + if(!ASN1_INTEGER_set (kdf->keylength, keylen)) + goto merr; + } + + /* prf can stay NULL if we are using hmacWithSHA1 */ + if (prf_nid > 0 && prf_nid != NID_hmacWithSHA1) + { + kdf->prf = X509_ALGOR_new(); + if (!kdf->prf) + goto merr; + X509_ALGOR_set0(kdf->prf, OBJ_nid2obj(prf_nid), + V_ASN1_NULL, NULL); + } + + /* Finally setup the keyfunc structure */ + + keyfunc = X509_ALGOR_new(); + if (!keyfunc) + goto merr; + + keyfunc->algorithm = OBJ_nid2obj(NID_id_pbkdf2); + + /* Encode PBKDF2PARAM into parameter of pbe2 */ + + if(!(keyfunc->parameter = ASN1_TYPE_new())) + goto merr; + + if(!ASN1_item_pack(kdf, ASN1_ITEM_rptr(PBKDF2PARAM), + &keyfunc->parameter->value.sequence)) + goto merr; + keyfunc->parameter->type = V_ASN1_SEQUENCE; + + PBKDF2PARAM_free(kdf); + return keyfunc; + + merr: + ASN1err(ASN1_F_PKCS5_PBKDF2_SET,ERR_R_MALLOC_FAILURE); + PBKDF2PARAM_free(kdf); + X509_ALGOR_free(keyfunc); + return NULL; + } + diff --git a/crypto/openssl/crypto/asn1/t_crl.c b/crypto/openssl/crypto/asn1/t_crl.c index ee5a687ce8..c61169208a 100644 --- a/crypto/openssl/crypto/asn1/t_crl.c +++ b/crypto/openssl/crypto/asn1/t_crl.c @@ -94,8 +94,7 @@ int X509_CRL_print(BIO *out, X509_CRL *x) l = X509_CRL_get_version(x); BIO_printf(out, "%8sVersion %lu (0x%lx)\n", "", l+1, l); i = OBJ_obj2nid(x->sig_alg->algorithm); - BIO_printf(out, "%8sSignature Algorithm: %s\n", "", - (i == NID_undef) ? "NONE" : OBJ_nid2ln(i)); + X509_signature_print(out, x->sig_alg, NULL); p=X509_NAME_oneline(X509_CRL_get_issuer(x),NULL,0); BIO_printf(out,"%8sIssuer: %s\n","",p); OPENSSL_free(p); diff --git a/crypto/openssl/crypto/asn1/t_x509.c b/crypto/openssl/crypto/asn1/t_x509.c index 89e7a7f546..edbb39a02f 100644 --- a/crypto/openssl/crypto/asn1/t_x509.c +++ b/crypto/openssl/crypto/asn1/t_x509.c @@ -72,6 +72,7 @@ #include #include #include +#include "asn1_locl.h" #ifndef OPENSSL_NO_FP_API int X509_print_fp(FILE *fp, X509 *x) @@ -137,7 +138,7 @@ int X509_print_ex(BIO *bp, X509 *x, unsigned long nmflags, unsigned long cflag) if (BIO_write(bp," Serial Number:",22) <= 0) goto err; bs=X509_get_serialNumber(x); - if (bs->length <= 4) + if (bs->length <= (int)sizeof(long)) { l=ASN1_INTEGER_get(bs); if (bs->type == V_ASN1_NEG_INTEGER) @@ -167,12 +168,16 @@ int X509_print_ex(BIO *bp, X509 *x, unsigned long nmflags, unsigned long cflag) if(!(cflag & X509_FLAG_NO_SIGNAME)) { + if(X509_signature_print(bp, x->sig_alg, NULL) <= 0) + goto err; +#if 0 if (BIO_printf(bp,"%8sSignature Algorithm: ","") <= 0) goto err; if (i2a_ASN1_OBJECT(bp, ci->signature->algorithm) <= 0) goto err; if (BIO_puts(bp, "\n") <= 0) goto err; +#endif } if(!(cflag & X509_FLAG_NO_ISSUER)) @@ -255,7 +260,8 @@ int X509_ocspid_print (BIO *bp, X509 *x) goto err; i2d_X509_NAME(x->cert_info->subject, &dertmp); - EVP_Digest(der, derlen, SHA1md, NULL, EVP_sha1(), NULL); + if (!EVP_Digest(der, derlen, SHA1md, NULL, EVP_sha1(), NULL)) + goto err; for (i=0; i < SHA_DIGEST_LENGTH; i++) { if (BIO_printf(bp,"%02X",SHA1md[i]) <= 0) goto err; @@ -268,8 +274,10 @@ int X509_ocspid_print (BIO *bp, X509 *x) if (BIO_printf(bp,"\n Public key OCSP hash: ") <= 0) goto err; - EVP_Digest(x->cert_info->key->public_key->data, - x->cert_info->key->public_key->length, SHA1md, NULL, EVP_sha1(), NULL); + if (!EVP_Digest(x->cert_info->key->public_key->data, + x->cert_info->key->public_key->length, + SHA1md, NULL, EVP_sha1(), NULL)) + goto err; for (i=0; i < SHA_DIGEST_LENGTH; i++) { if (BIO_printf(bp,"%02X",SHA1md[i]) <= 0) @@ -283,23 +291,50 @@ err: return(0); } -int X509_signature_print(BIO *bp, X509_ALGOR *sigalg, ASN1_STRING *sig) +int X509_signature_dump(BIO *bp, const ASN1_STRING *sig, int indent) { - unsigned char *s; + const unsigned char *s; int i, n; - if (BIO_puts(bp," Signature Algorithm: ") <= 0) return 0; - if (i2a_ASN1_OBJECT(bp, sigalg->algorithm) <= 0) return 0; n=sig->length; s=sig->data; for (i=0; ialgorithm) <= 0) return 0; + + sig_nid = OBJ_obj2nid(sigalg->algorithm); + if (sig_nid != NID_undef) + { + int pkey_nid, dig_nid; + const EVP_PKEY_ASN1_METHOD *ameth; + if (OBJ_find_sigid_algs(sig_nid, &dig_nid, &pkey_nid)) + { + ameth = EVP_PKEY_asn1_find(NULL, pkey_nid); + if (ameth && ameth->sig_print) + return ameth->sig_print(bp, sigalg, sig, 9, 0); + } + } + if (sig) + return X509_signature_dump(bp, sig, 9); + else if (BIO_puts(bp, "\n") <= 0) + return 0; return 1; } diff --git a/crypto/openssl/crypto/asn1/x_algor.c b/crypto/openssl/crypto/asn1/x_algor.c index 99e53429b7..274e456c73 100644 --- a/crypto/openssl/crypto/asn1/x_algor.c +++ b/crypto/openssl/crypto/asn1/x_algor.c @@ -128,3 +128,17 @@ void X509_ALGOR_get0(ASN1_OBJECT **paobj, int *pptype, void **ppval, } } +/* Set up an X509_ALGOR DigestAlgorithmIdentifier from an EVP_MD */ + +void X509_ALGOR_set_md(X509_ALGOR *alg, const EVP_MD *md) + { + int param_type; + + if (md->flags & EVP_MD_FLAG_DIGALGID_ABSENT) + param_type = V_ASN1_UNDEF; + else + param_type = V_ASN1_NULL; + + X509_ALGOR_set0(alg, OBJ_nid2obj(EVP_MD_type(md)), param_type, NULL); + + } diff --git a/crypto/openssl/crypto/asn1/x_name.c b/crypto/openssl/crypto/asn1/x_name.c index 49be08b4da..d7c2318693 100644 --- a/crypto/openssl/crypto/asn1/x_name.c +++ b/crypto/openssl/crypto/asn1/x_name.c @@ -399,8 +399,7 @@ static int asn1_string_canon(ASN1_STRING *out, ASN1_STRING *in) /* If type not in bitmask just copy string across */ if (!(ASN1_tag2bit(in->type) & ASN1_MASK_CANON)) { - out->type = in->type; - if (!ASN1_STRING_set(out, in->data, in->length)) + if (!ASN1_STRING_copy(out, in)) return 0; return 1; } diff --git a/crypto/openssl/crypto/asn1/x_pubkey.c b/crypto/openssl/crypto/asn1/x_pubkey.c index d42b6a2c54..627ec87f9f 100644 --- a/crypto/openssl/crypto/asn1/x_pubkey.c +++ b/crypto/openssl/crypto/asn1/x_pubkey.c @@ -171,7 +171,16 @@ EVP_PKEY *X509_PUBKEY_get(X509_PUBKEY *key) goto error; } - key->pkey = ret; + /* Check to see if another thread set key->pkey first */ + CRYPTO_w_lock(CRYPTO_LOCK_EVP_PKEY); + if (key->pkey) + { + EVP_PKEY_free(ret); + ret = key->pkey; + } + else + key->pkey = ret; + CRYPTO_w_unlock(CRYPTO_LOCK_EVP_PKEY); CRYPTO_add(&ret->references, 1, CRYPTO_LOCK_EVP_PKEY); return ret; diff --git a/crypto/openssl/crypto/bf/bf_skey.c b/crypto/openssl/crypto/bf/bf_skey.c index 3673cdee6e..3b0bca41ae 100644 --- a/crypto/openssl/crypto/bf/bf_skey.c +++ b/crypto/openssl/crypto/bf/bf_skey.c @@ -58,11 +58,19 @@ #include #include +#include #include #include "bf_locl.h" #include "bf_pi.h" void BF_set_key(BF_KEY *key, int len, const unsigned char *data) +#ifdef OPENSSL_FIPS + { + fips_cipher_abort(BLOWFISH); + private_BF_set_key(key, len, data); + } +void private_BF_set_key(BF_KEY *key, int len, const unsigned char *data) +#endif { int i; BF_LONG *p,ri,in[2]; diff --git a/crypto/openssl/crypto/bf/blowfish.h b/crypto/openssl/crypto/bf/blowfish.h index b97e76f9a3..4b6c8920a4 100644 --- a/crypto/openssl/crypto/bf/blowfish.h +++ b/crypto/openssl/crypto/bf/blowfish.h @@ -104,7 +104,9 @@ typedef struct bf_key_st BF_LONG S[4*256]; } BF_KEY; - +#ifdef OPENSSL_FIPS +void private_BF_set_key(BF_KEY *key, int len, const unsigned char *data); +#endif void BF_set_key(BF_KEY *key, int len, const unsigned char *data); void BF_encrypt(BF_LONG *data,const BF_KEY *key); diff --git a/crypto/openssl/crypto/bio/bio.h b/crypto/openssl/crypto/bio/bio.h index ab47abcf14..05699ab212 100644 --- a/crypto/openssl/crypto/bio/bio.h +++ b/crypto/openssl/crypto/bio/bio.h @@ -68,6 +68,14 @@ #include +#ifndef OPENSSL_NO_SCTP +# ifndef OPENSSL_SYS_VMS +# include +# else +# include +# endif +#endif + #ifdef __cplusplus extern "C" { #endif @@ -95,6 +103,9 @@ extern "C" { #define BIO_TYPE_BIO (19|0x0400) /* (half a) BIO pair */ #define BIO_TYPE_LINEBUFFER (20|0x0200) /* filter */ #define BIO_TYPE_DGRAM (21|0x0400|0x0100) +#ifndef OPENSSL_NO_SCTP +#define BIO_TYPE_DGRAM_SCTP (24|0x0400|0x0100) +#endif #define BIO_TYPE_ASN1 (22|0x0200) /* filter */ #define BIO_TYPE_COMP (23|0x0200) /* filter */ @@ -146,6 +157,7 @@ extern "C" { /* #endif */ #define BIO_CTRL_DGRAM_QUERY_MTU 40 /* as kernel for current MTU */ +#define BIO_CTRL_DGRAM_GET_FALLBACK_MTU 47 #define BIO_CTRL_DGRAM_GET_MTU 41 /* get cached value for MTU */ #define BIO_CTRL_DGRAM_SET_MTU 42 /* set cached value for * MTU. want to use this @@ -161,7 +173,22 @@ extern "C" { #define BIO_CTRL_DGRAM_SET_PEER 44 /* Destination for the data */ #define BIO_CTRL_DGRAM_SET_NEXT_TIMEOUT 45 /* Next DTLS handshake timeout to - * adjust socket timeouts */ + * adjust socket timeouts */ + +#ifndef OPENSSL_NO_SCTP +/* SCTP stuff */ +#define BIO_CTRL_DGRAM_SCTP_SET_IN_HANDSHAKE 50 +#define BIO_CTRL_DGRAM_SCTP_ADD_AUTH_KEY 51 +#define BIO_CTRL_DGRAM_SCTP_NEXT_AUTH_KEY 52 +#define BIO_CTRL_DGRAM_SCTP_AUTH_CCS_RCVD 53 +#define BIO_CTRL_DGRAM_SCTP_GET_SNDINFO 60 +#define BIO_CTRL_DGRAM_SCTP_SET_SNDINFO 61 +#define BIO_CTRL_DGRAM_SCTP_GET_RCVINFO 62 +#define BIO_CTRL_DGRAM_SCTP_SET_RCVINFO 63 +#define BIO_CTRL_DGRAM_SCTP_GET_PRINFO 64 +#define BIO_CTRL_DGRAM_SCTP_SET_PRINFO 65 +#define BIO_CTRL_DGRAM_SCTP_SAVE_SHUTDOWN 70 +#endif /* modifiers */ #define BIO_FP_READ 0x02 @@ -331,6 +358,34 @@ typedef struct bio_f_buffer_ctx_struct /* Prefix and suffix callback in ASN1 BIO */ typedef int asn1_ps_func(BIO *b, unsigned char **pbuf, int *plen, void *parg); +#ifndef OPENSSL_NO_SCTP +/* SCTP parameter structs */ +struct bio_dgram_sctp_sndinfo + { + uint16_t snd_sid; + uint16_t snd_flags; + uint32_t snd_ppid; + uint32_t snd_context; + }; + +struct bio_dgram_sctp_rcvinfo + { + uint16_t rcv_sid; + uint16_t rcv_ssn; + uint16_t rcv_flags; + uint32_t rcv_ppid; + uint32_t rcv_tsn; + uint32_t rcv_cumtsn; + uint32_t rcv_context; + }; + +struct bio_dgram_sctp_prinfo + { + uint16_t pr_policy; + uint32_t pr_value; + }; +#endif + /* connect BIO stuff */ #define BIO_CONN_S_BEFORE 1 #define BIO_CONN_S_GET_IP 2 @@ -628,6 +683,9 @@ BIO_METHOD *BIO_f_linebuffer(void); BIO_METHOD *BIO_f_nbio_test(void); #ifndef OPENSSL_NO_DGRAM BIO_METHOD *BIO_s_datagram(void); +#ifndef OPENSSL_NO_SCTP +BIO_METHOD *BIO_s_datagram_sctp(void); +#endif #endif /* BIO_METHOD *BIO_f_ber(void); */ @@ -670,6 +728,15 @@ int BIO_set_tcp_ndelay(int sock,int turn_on); BIO *BIO_new_socket(int sock, int close_flag); BIO *BIO_new_dgram(int fd, int close_flag); +#ifndef OPENSSL_NO_SCTP +BIO *BIO_new_dgram_sctp(int fd, int close_flag); +int BIO_dgram_is_sctp(BIO *bio); +int BIO_dgram_sctp_notification_cb(BIO *b, + void (*handle_notifications)(BIO *bio, void *context, void *buf), + void *context); +int BIO_dgram_sctp_wait_for_dry(BIO *b); +int BIO_dgram_sctp_msg_waiting(BIO *b); +#endif BIO *BIO_new_fd(int fd, int close_flag); BIO *BIO_new_connect(char *host_port); BIO *BIO_new_accept(char *host_port); @@ -734,6 +801,7 @@ void ERR_load_BIO_strings(void); #define BIO_F_BUFFER_CTRL 114 #define BIO_F_CONN_CTRL 127 #define BIO_F_CONN_STATE 115 +#define BIO_F_DGRAM_SCTP_READ 132 #define BIO_F_FILE_CTRL 116 #define BIO_F_FILE_READ 130 #define BIO_F_LINEBUFFER_CTRL 129 diff --git a/crypto/openssl/crypto/bio/bio_err.c b/crypto/openssl/crypto/bio/bio_err.c index a224edd5a0..0dbfbd80d3 100644 --- a/crypto/openssl/crypto/bio/bio_err.c +++ b/crypto/openssl/crypto/bio/bio_err.c @@ -1,6 +1,6 @@ /* crypto/bio/bio_err.c */ /* ==================================================================== - * Copyright (c) 1999-2006 The OpenSSL Project. All rights reserved. + * Copyright (c) 1999-2011 The OpenSSL Project. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions @@ -95,6 +95,7 @@ static ERR_STRING_DATA BIO_str_functs[]= {ERR_FUNC(BIO_F_BUFFER_CTRL), "BUFFER_CTRL"}, {ERR_FUNC(BIO_F_CONN_CTRL), "CONN_CTRL"}, {ERR_FUNC(BIO_F_CONN_STATE), "CONN_STATE"}, +{ERR_FUNC(BIO_F_DGRAM_SCTP_READ), "DGRAM_SCTP_READ"}, {ERR_FUNC(BIO_F_FILE_CTRL), "FILE_CTRL"}, {ERR_FUNC(BIO_F_FILE_READ), "FILE_READ"}, {ERR_FUNC(BIO_F_LINEBUFFER_CTRL), "LINEBUFFER_CTRL"}, diff --git a/crypto/openssl/crypto/bio/bss_bio.c b/crypto/openssl/crypto/bio/bss_bio.c index 76bd48e767..52ef0ebcb3 100644 --- a/crypto/openssl/crypto/bio/bss_bio.c +++ b/crypto/openssl/crypto/bio/bss_bio.c @@ -277,10 +277,10 @@ static int bio_read(BIO *bio, char *buf, int size_) */ /* WARNING: The non-copying interface is largely untested as of yet * and may contain bugs. */ -static ssize_t bio_nread0(BIO *bio, char **buf) +static ossl_ssize_t bio_nread0(BIO *bio, char **buf) { struct bio_bio_st *b, *peer_b; - ssize_t num; + ossl_ssize_t num; BIO_clear_retry_flags(bio); @@ -315,15 +315,15 @@ static ssize_t bio_nread0(BIO *bio, char **buf) return num; } -static ssize_t bio_nread(BIO *bio, char **buf, size_t num_) +static ossl_ssize_t bio_nread(BIO *bio, char **buf, size_t num_) { struct bio_bio_st *b, *peer_b; - ssize_t num, available; + ossl_ssize_t num, available; if (num_ > SSIZE_MAX) num = SSIZE_MAX; else - num = (ssize_t)num_; + num = (ossl_ssize_t)num_; available = bio_nread0(bio, buf); if (num > available) @@ -428,7 +428,7 @@ static int bio_write(BIO *bio, const char *buf, int num_) * (example usage: bio_nwrite0(), write to buffer, bio_nwrite() * or just bio_nwrite(), write to buffer) */ -static ssize_t bio_nwrite0(BIO *bio, char **buf) +static ossl_ssize_t bio_nwrite0(BIO *bio, char **buf) { struct bio_bio_st *b; size_t num; @@ -476,15 +476,15 @@ static ssize_t bio_nwrite0(BIO *bio, char **buf) return num; } -static ssize_t bio_nwrite(BIO *bio, char **buf, size_t num_) +static ossl_ssize_t bio_nwrite(BIO *bio, char **buf, size_t num_) { struct bio_bio_st *b; - ssize_t num, space; + ossl_ssize_t num, space; if (num_ > SSIZE_MAX) num = SSIZE_MAX; else - num = (ssize_t)num_; + num = (ossl_ssize_t)num_; space = bio_nwrite0(bio, buf); if (num > space) diff --git a/crypto/openssl/crypto/bio/bss_dgram.c b/crypto/openssl/crypto/bio/bss_dgram.c index 71ebe987b6..1b1e4bec81 100644 --- a/crypto/openssl/crypto/bio/bss_dgram.c +++ b/crypto/openssl/crypto/bio/bss_dgram.c @@ -70,6 +70,13 @@ #include #endif +#ifndef OPENSSL_NO_SCTP +#include +#include +#define OPENSSL_SCTP_DATA_CHUNK_TYPE 0x00 +#define OPENSSL_SCTP_FORWARD_CUM_TSN_CHUNK_TYPE 0xc0 +#endif + #ifdef OPENSSL_SYS_LINUX #define IP_MTU 14 /* linux is lame */ #endif @@ -88,6 +95,18 @@ static int dgram_new(BIO *h); static int dgram_free(BIO *data); static int dgram_clear(BIO *bio); +#ifndef OPENSSL_NO_SCTP +static int dgram_sctp_write(BIO *h, const char *buf, int num); +static int dgram_sctp_read(BIO *h, char *buf, int size); +static int dgram_sctp_puts(BIO *h, const char *str); +static long dgram_sctp_ctrl(BIO *h, int cmd, long arg1, void *arg2); +static int dgram_sctp_new(BIO *h); +static int dgram_sctp_free(BIO *data); +#ifdef SCTP_AUTHENTICATION_EVENT +static void dgram_sctp_handle_auth_free_key_event(BIO *b, union sctp_notification *snp); +#endif +#endif + static int BIO_dgram_should_retry(int s); static void get_current_time(struct timeval *t); @@ -106,6 +125,22 @@ static BIO_METHOD methods_dgramp= NULL, }; +#ifndef OPENSSL_NO_SCTP +static BIO_METHOD methods_dgramp_sctp= + { + BIO_TYPE_DGRAM_SCTP, + "datagram sctp socket", + dgram_sctp_write, + dgram_sctp_read, + dgram_sctp_puts, + NULL, /* dgram_gets, */ + dgram_sctp_ctrl, + dgram_sctp_new, + dgram_sctp_free, + NULL, + }; +#endif + typedef struct bio_dgram_data_st { union { @@ -122,6 +157,40 @@ typedef struct bio_dgram_data_st struct timeval socket_timeout; } bio_dgram_data; +#ifndef OPENSSL_NO_SCTP +typedef struct bio_dgram_sctp_save_message_st + { + BIO *bio; + char *data; + int length; + } bio_dgram_sctp_save_message; + +typedef struct bio_dgram_sctp_data_st + { + union { + struct sockaddr sa; + struct sockaddr_in sa_in; +#if OPENSSL_USE_IPV6 + struct sockaddr_in6 sa_in6; +#endif + } peer; + unsigned int connected; + unsigned int _errno; + unsigned int mtu; + struct bio_dgram_sctp_sndinfo sndinfo; + struct bio_dgram_sctp_rcvinfo rcvinfo; + struct bio_dgram_sctp_prinfo prinfo; + void (*handle_notifications)(BIO *bio, void *context, void *buf); + void* notification_context; + int in_handshake; + int ccs_rcvd; + int ccs_sent; + int save_shutdown; + int peer_auth_tested; + bio_dgram_sctp_save_message saved_message; + } bio_dgram_sctp_data; +#endif + BIO_METHOD *BIO_s_datagram(void) { return(&methods_dgramp); @@ -547,6 +616,27 @@ static long dgram_ctrl(BIO *b, int cmd, long num, void *ptr) ret = 0; #endif break; + case BIO_CTRL_DGRAM_GET_FALLBACK_MTU: + switch (data->peer.sa.sa_family) + { + case AF_INET: + ret = 576 - 20 - 8; + break; +#if OPENSSL_USE_IPV6 + case AF_INET6: +#ifdef IN6_IS_ADDR_V4MAPPED + if (IN6_IS_ADDR_V4MAPPED(&data->peer.sa_in6.sin6_addr)) + ret = 576 - 20 - 8; + else +#endif + ret = 1280 - 40 - 8; + break; +#endif + default: + ret = 576 - 20 - 8; + break; + } + break; case BIO_CTRL_DGRAM_GET_MTU: return data->mtu; break; @@ -738,6 +828,912 @@ static int dgram_puts(BIO *bp, const char *str) return(ret); } +#ifndef OPENSSL_NO_SCTP +BIO_METHOD *BIO_s_datagram_sctp(void) + { + return(&methods_dgramp_sctp); + } + +BIO *BIO_new_dgram_sctp(int fd, int close_flag) + { + BIO *bio; + int ret, optval = 20000; + int auth_data = 0, auth_forward = 0; + unsigned char *p; + struct sctp_authchunk auth; + struct sctp_authchunks *authchunks; + socklen_t sockopt_len; +#ifdef SCTP_AUTHENTICATION_EVENT +#ifdef SCTP_EVENT + struct sctp_event event; +#else + struct sctp_event_subscribe event; +#endif +#endif + + bio=BIO_new(BIO_s_datagram_sctp()); + if (bio == NULL) return(NULL); + BIO_set_fd(bio,fd,close_flag); + + /* Activate SCTP-AUTH for DATA and FORWARD-TSN chunks */ + auth.sauth_chunk = OPENSSL_SCTP_DATA_CHUNK_TYPE; + ret = setsockopt(fd, IPPROTO_SCTP, SCTP_AUTH_CHUNK, &auth, sizeof(struct sctp_authchunk)); + OPENSSL_assert(ret >= 0); + auth.sauth_chunk = OPENSSL_SCTP_FORWARD_CUM_TSN_CHUNK_TYPE; + ret = setsockopt(fd, IPPROTO_SCTP, SCTP_AUTH_CHUNK, &auth, sizeof(struct sctp_authchunk)); + OPENSSL_assert(ret >= 0); + + /* Test if activation was successful. When using accept(), + * SCTP-AUTH has to be activated for the listening socket + * already, otherwise the connected socket won't use it. */ + sockopt_len = (socklen_t)(sizeof(sctp_assoc_t) + 256 * sizeof(uint8_t)); + authchunks = OPENSSL_malloc(sockopt_len); + memset(authchunks, 0, sizeof(sockopt_len)); + ret = getsockopt(fd, IPPROTO_SCTP, SCTP_LOCAL_AUTH_CHUNKS, authchunks, &sockopt_len); + OPENSSL_assert(ret >= 0); + + for (p = (unsigned char*) authchunks + sizeof(sctp_assoc_t); + p < (unsigned char*) authchunks + sockopt_len; + p += sizeof(uint8_t)) + { + if (*p == OPENSSL_SCTP_DATA_CHUNK_TYPE) auth_data = 1; + if (*p == OPENSSL_SCTP_FORWARD_CUM_TSN_CHUNK_TYPE) auth_forward = 1; + } + + OPENSSL_free(authchunks); + + OPENSSL_assert(auth_data); + OPENSSL_assert(auth_forward); + +#ifdef SCTP_AUTHENTICATION_EVENT +#ifdef SCTP_EVENT + memset(&event, 0, sizeof(struct sctp_event)); + event.se_assoc_id = 0; + event.se_type = SCTP_AUTHENTICATION_EVENT; + event.se_on = 1; + ret = setsockopt(fd, IPPROTO_SCTP, SCTP_EVENT, &event, sizeof(struct sctp_event)); + OPENSSL_assert(ret >= 0); +#else + sockopt_len = (socklen_t) sizeof(struct sctp_event_subscribe); + ret = getsockopt(fd, IPPROTO_SCTP, SCTP_EVENTS, &event, &sockopt_len); + OPENSSL_assert(ret >= 0); + + event.sctp_authentication_event = 1; + + ret = setsockopt(fd, IPPROTO_SCTP, SCTP_EVENTS, &event, sizeof(struct sctp_event_subscribe)); + OPENSSL_assert(ret >= 0); +#endif +#endif + + /* Disable partial delivery by setting the min size + * larger than the max record size of 2^14 + 2048 + 13 + */ + ret = setsockopt(fd, IPPROTO_SCTP, SCTP_PARTIAL_DELIVERY_POINT, &optval, sizeof(optval)); + OPENSSL_assert(ret >= 0); + + return(bio); + } + +int BIO_dgram_is_sctp(BIO *bio) + { + return (BIO_method_type(bio) == BIO_TYPE_DGRAM_SCTP); + } + +static int dgram_sctp_new(BIO *bi) + { + bio_dgram_sctp_data *data = NULL; + + bi->init=0; + bi->num=0; + data = OPENSSL_malloc(sizeof(bio_dgram_sctp_data)); + if (data == NULL) + return 0; + memset(data, 0x00, sizeof(bio_dgram_sctp_data)); +#ifdef SCTP_PR_SCTP_NONE + data->prinfo.pr_policy = SCTP_PR_SCTP_NONE; +#endif + bi->ptr = data; + + bi->flags=0; + return(1); + } + +static int dgram_sctp_free(BIO *a) + { + bio_dgram_sctp_data *data; + + if (a == NULL) return(0); + if ( ! dgram_clear(a)) + return 0; + + data = (bio_dgram_sctp_data *)a->ptr; + if(data != NULL) OPENSSL_free(data); + + return(1); + } + +#ifdef SCTP_AUTHENTICATION_EVENT +void dgram_sctp_handle_auth_free_key_event(BIO *b, union sctp_notification *snp) + { + unsigned int sockopt_len = 0; + int ret; + struct sctp_authkey_event* authkeyevent = &snp->sn_auth_event; + + if (authkeyevent->auth_indication == SCTP_AUTH_FREE_KEY) + { + struct sctp_authkeyid authkeyid; + + /* delete key */ + authkeyid.scact_keynumber = authkeyevent->auth_keynumber; + sockopt_len = sizeof(struct sctp_authkeyid); + ret = setsockopt(b->num, IPPROTO_SCTP, SCTP_AUTH_DELETE_KEY, + &authkeyid, sockopt_len); + } + } +#endif + +static int dgram_sctp_read(BIO *b, char *out, int outl) + { + int ret = 0, n = 0, i, optval; + socklen_t optlen; + bio_dgram_sctp_data *data = (bio_dgram_sctp_data *)b->ptr; + union sctp_notification *snp; + struct msghdr msg; + struct iovec iov; + struct cmsghdr *cmsg; + char cmsgbuf[512]; + + if (out != NULL) + { + clear_socket_error(); + + do + { + memset(&data->rcvinfo, 0x00, sizeof(struct bio_dgram_sctp_rcvinfo)); + iov.iov_base = out; + iov.iov_len = outl; + msg.msg_name = NULL; + msg.msg_namelen = 0; + msg.msg_iov = &iov; + msg.msg_iovlen = 1; + msg.msg_control = cmsgbuf; + msg.msg_controllen = 512; + msg.msg_flags = 0; + n = recvmsg(b->num, &msg, 0); + + if (msg.msg_controllen > 0) + { + for (cmsg = CMSG_FIRSTHDR(&msg); cmsg; cmsg = CMSG_NXTHDR(&msg, cmsg)) + { + if (cmsg->cmsg_level != IPPROTO_SCTP) + continue; +#ifdef SCTP_RCVINFO + if (cmsg->cmsg_type == SCTP_RCVINFO) + { + struct sctp_rcvinfo *rcvinfo; + + rcvinfo = (struct sctp_rcvinfo *)CMSG_DATA(cmsg); + data->rcvinfo.rcv_sid = rcvinfo->rcv_sid; + data->rcvinfo.rcv_ssn = rcvinfo->rcv_ssn; + data->rcvinfo.rcv_flags = rcvinfo->rcv_flags; + data->rcvinfo.rcv_ppid = rcvinfo->rcv_ppid; + data->rcvinfo.rcv_tsn = rcvinfo->rcv_tsn; + data->rcvinfo.rcv_cumtsn = rcvinfo->rcv_cumtsn; + data->rcvinfo.rcv_context = rcvinfo->rcv_context; + } +#endif +#ifdef SCTP_SNDRCV + if (cmsg->cmsg_type == SCTP_SNDRCV) + { + struct sctp_sndrcvinfo *sndrcvinfo; + + sndrcvinfo = (struct sctp_sndrcvinfo *)CMSG_DATA(cmsg); + data->rcvinfo.rcv_sid = sndrcvinfo->sinfo_stream; + data->rcvinfo.rcv_ssn = sndrcvinfo->sinfo_ssn; + data->rcvinfo.rcv_flags = sndrcvinfo->sinfo_flags; + data->rcvinfo.rcv_ppid = sndrcvinfo->sinfo_ppid; + data->rcvinfo.rcv_tsn = sndrcvinfo->sinfo_tsn; + data->rcvinfo.rcv_cumtsn = sndrcvinfo->sinfo_cumtsn; + data->rcvinfo.rcv_context = sndrcvinfo->sinfo_context; + } +#endif + } + } + + if (n <= 0) + { + if (n < 0) + ret = n; + break; + } + + if (msg.msg_flags & MSG_NOTIFICATION) + { + snp = (union sctp_notification*) out; + if (snp->sn_header.sn_type == SCTP_SENDER_DRY_EVENT) + { +#ifdef SCTP_EVENT + struct sctp_event event; +#else + struct sctp_event_subscribe event; + socklen_t eventsize; +#endif + /* If a message has been delayed until the socket + * is dry, it can be sent now. + */ + if (data->saved_message.length > 0) + { + dgram_sctp_write(data->saved_message.bio, data->saved_message.data, + data->saved_message.length); + OPENSSL_free(data->saved_message.data); + data->saved_message.length = 0; + } + + /* disable sender dry event */ +#ifdef SCTP_EVENT + memset(&event, 0, sizeof(struct sctp_event)); + event.se_assoc_id = 0; + event.se_type = SCTP_SENDER_DRY_EVENT; + event.se_on = 0; + i = setsockopt(b->num, IPPROTO_SCTP, SCTP_EVENT, &event, sizeof(struct sctp_event)); + OPENSSL_assert(i >= 0); +#else + eventsize = sizeof(struct sctp_event_subscribe); + i = getsockopt(b->num, IPPROTO_SCTP, SCTP_EVENTS, &event, &eventsize); + OPENSSL_assert(i >= 0); + + event.sctp_sender_dry_event = 0; + + i = setsockopt(b->num, IPPROTO_SCTP, SCTP_EVENTS, &event, sizeof(struct sctp_event_subscribe)); + OPENSSL_assert(i >= 0); +#endif + } + +#ifdef SCTP_AUTHENTICATION_EVENT + if (snp->sn_header.sn_type == SCTP_AUTHENTICATION_EVENT) + dgram_sctp_handle_auth_free_key_event(b, snp); +#endif + + if (data->handle_notifications != NULL) + data->handle_notifications(b, data->notification_context, (void*) out); + + memset(out, 0, outl); + } + else + ret += n; + } + while ((msg.msg_flags & MSG_NOTIFICATION) && (msg.msg_flags & MSG_EOR) && (ret < outl)); + + if (ret > 0 && !(msg.msg_flags & MSG_EOR)) + { + /* Partial message read, this should never happen! */ + + /* The buffer was too small, this means the peer sent + * a message that was larger than allowed. */ + if (ret == outl) + return -1; + + /* Test if socket buffer can handle max record + * size (2^14 + 2048 + 13) + */ + optlen = (socklen_t) sizeof(int); + ret = getsockopt(b->num, SOL_SOCKET, SO_RCVBUF, &optval, &optlen); + OPENSSL_assert(ret >= 0); + OPENSSL_assert(optval >= 18445); + + /* Test if SCTP doesn't partially deliver below + * max record size (2^14 + 2048 + 13) + */ + optlen = (socklen_t) sizeof(int); + ret = getsockopt(b->num, IPPROTO_SCTP, SCTP_PARTIAL_DELIVERY_POINT, + &optval, &optlen); + OPENSSL_assert(ret >= 0); + OPENSSL_assert(optval >= 18445); + + /* Partially delivered notification??? Probably a bug.... */ + OPENSSL_assert(!(msg.msg_flags & MSG_NOTIFICATION)); + + /* Everything seems ok till now, so it's most likely + * a message dropped by PR-SCTP. + */ + memset(out, 0, outl); + BIO_set_retry_read(b); + return -1; + } + + BIO_clear_retry_flags(b); + if (ret < 0) + { + if (BIO_dgram_should_retry(ret)) + { + BIO_set_retry_read(b); + data->_errno = get_last_socket_error(); + } + } + + /* Test if peer uses SCTP-AUTH before continuing */ + if (!data->peer_auth_tested) + { + int ii, auth_data = 0, auth_forward = 0; + unsigned char *p; + struct sctp_authchunks *authchunks; + + optlen = (socklen_t)(sizeof(sctp_assoc_t) + 256 * sizeof(uint8_t)); + authchunks = OPENSSL_malloc(optlen); + memset(authchunks, 0, sizeof(optlen)); + ii = getsockopt(b->num, IPPROTO_SCTP, SCTP_PEER_AUTH_CHUNKS, authchunks, &optlen); + OPENSSL_assert(ii >= 0); + + for (p = (unsigned char*) authchunks + sizeof(sctp_assoc_t); + p < (unsigned char*) authchunks + optlen; + p += sizeof(uint8_t)) + { + if (*p == OPENSSL_SCTP_DATA_CHUNK_TYPE) auth_data = 1; + if (*p == OPENSSL_SCTP_FORWARD_CUM_TSN_CHUNK_TYPE) auth_forward = 1; + } + + OPENSSL_free(authchunks); + + if (!auth_data || !auth_forward) + { + BIOerr(BIO_F_DGRAM_SCTP_READ,BIO_R_CONNECT_ERROR); + return -1; + } + + data->peer_auth_tested = 1; + } + } + return(ret); + } + +static int dgram_sctp_write(BIO *b, const char *in, int inl) + { + int ret; + bio_dgram_sctp_data *data = (bio_dgram_sctp_data *)b->ptr; + struct bio_dgram_sctp_sndinfo *sinfo = &(data->sndinfo); + struct bio_dgram_sctp_prinfo *pinfo = &(data->prinfo); + struct bio_dgram_sctp_sndinfo handshake_sinfo; + struct iovec iov[1]; + struct msghdr msg; + struct cmsghdr *cmsg; +#if defined(SCTP_SNDINFO) && defined(SCTP_PRINFO) + char cmsgbuf[CMSG_SPACE(sizeof(struct sctp_sndinfo)) + CMSG_SPACE(sizeof(struct sctp_prinfo))]; + struct sctp_sndinfo *sndinfo; + struct sctp_prinfo *prinfo; +#else + char cmsgbuf[CMSG_SPACE(sizeof(struct sctp_sndrcvinfo))]; + struct sctp_sndrcvinfo *sndrcvinfo; +#endif + + clear_socket_error(); + + /* If we're send anything else than application data, + * disable all user parameters and flags. + */ + if (in[0] != 23) { + memset(&handshake_sinfo, 0x00, sizeof(struct bio_dgram_sctp_sndinfo)); +#ifdef SCTP_SACK_IMMEDIATELY + handshake_sinfo.snd_flags = SCTP_SACK_IMMEDIATELY; +#endif + sinfo = &handshake_sinfo; + } + + /* If we have to send a shutdown alert message and the + * socket is not dry yet, we have to save it and send it + * as soon as the socket gets dry. + */ + if (data->save_shutdown && !BIO_dgram_sctp_wait_for_dry(b)) + { + data->saved_message.bio = b; + data->saved_message.length = inl; + data->saved_message.data = OPENSSL_malloc(inl); + memcpy(data->saved_message.data, in, inl); + return inl; + } + + iov[0].iov_base = (char *)in; + iov[0].iov_len = inl; + msg.msg_name = NULL; + msg.msg_namelen = 0; + msg.msg_iov = iov; + msg.msg_iovlen = 1; + msg.msg_control = (caddr_t)cmsgbuf; + msg.msg_controllen = 0; + msg.msg_flags = 0; +#if defined(SCTP_SNDINFO) && defined(SCTP_PRINFO) + cmsg = (struct cmsghdr *)cmsgbuf; + cmsg->cmsg_level = IPPROTO_SCTP; + cmsg->cmsg_type = SCTP_SNDINFO; + cmsg->cmsg_len = CMSG_LEN(sizeof(struct sctp_sndinfo)); + sndinfo = (struct sctp_sndinfo *)CMSG_DATA(cmsg); + memset(sndinfo, 0, sizeof(struct sctp_sndinfo)); + sndinfo->snd_sid = sinfo->snd_sid; + sndinfo->snd_flags = sinfo->snd_flags; + sndinfo->snd_ppid = sinfo->snd_ppid; + sndinfo->snd_context = sinfo->snd_context; + msg.msg_controllen += CMSG_SPACE(sizeof(struct sctp_sndinfo)); + + cmsg = (struct cmsghdr *)&cmsgbuf[CMSG_SPACE(sizeof(struct sctp_sndinfo))]; + cmsg->cmsg_level = IPPROTO_SCTP; + cmsg->cmsg_type = SCTP_PRINFO; + cmsg->cmsg_len = CMSG_LEN(sizeof(struct sctp_prinfo)); + prinfo = (struct sctp_prinfo *)CMSG_DATA(cmsg); + memset(prinfo, 0, sizeof(struct sctp_prinfo)); + prinfo->pr_policy = pinfo->pr_policy; + prinfo->pr_value = pinfo->pr_value; + msg.msg_controllen += CMSG_SPACE(sizeof(struct sctp_prinfo)); +#else + cmsg = (struct cmsghdr *)cmsgbuf; + cmsg->cmsg_level = IPPROTO_SCTP; + cmsg->cmsg_type = SCTP_SNDRCV; + cmsg->cmsg_len = CMSG_LEN(sizeof(struct sctp_sndrcvinfo)); + sndrcvinfo = (struct sctp_sndrcvinfo *)CMSG_DATA(cmsg); + memset(sndrcvinfo, 0, sizeof(struct sctp_sndrcvinfo)); + sndrcvinfo->sinfo_stream = sinfo->snd_sid; + sndrcvinfo->sinfo_flags = sinfo->snd_flags; +#ifdef __FreeBSD__ + sndrcvinfo->sinfo_flags |= pinfo->pr_policy; +#endif + sndrcvinfo->sinfo_ppid = sinfo->snd_ppid; + sndrcvinfo->sinfo_context = sinfo->snd_context; + sndrcvinfo->sinfo_timetolive = pinfo->pr_value; + msg.msg_controllen += CMSG_SPACE(sizeof(struct sctp_sndrcvinfo)); +#endif + + ret = sendmsg(b->num, &msg, 0); + + BIO_clear_retry_flags(b); + if (ret <= 0) + { + if (BIO_dgram_should_retry(ret)) + { + BIO_set_retry_write(b); + data->_errno = get_last_socket_error(); + } + } + return(ret); + } + +static long dgram_sctp_ctrl(BIO *b, int cmd, long num, void *ptr) + { + long ret=1; + bio_dgram_sctp_data *data = NULL; + unsigned int sockopt_len = 0; + struct sctp_authkeyid authkeyid; + struct sctp_authkey *authkey; + + data = (bio_dgram_sctp_data *)b->ptr; + + switch (cmd) + { + case BIO_CTRL_DGRAM_QUERY_MTU: + /* Set to maximum (2^14) + * and ignore user input to enable transport + * protocol fragmentation. + * Returns always 2^14. + */ + data->mtu = 16384; + ret = data->mtu; + break; + case BIO_CTRL_DGRAM_SET_MTU: + /* Set to maximum (2^14) + * and ignore input to enable transport + * protocol fragmentation. + * Returns always 2^14. + */ + data->mtu = 16384; + ret = data->mtu; + break; + case BIO_CTRL_DGRAM_SET_CONNECTED: + case BIO_CTRL_DGRAM_CONNECT: + /* Returns always -1. */ + ret = -1; + break; + case BIO_CTRL_DGRAM_SET_NEXT_TIMEOUT: + /* SCTP doesn't need the DTLS timer + * Returns always 1. + */ + break; + case BIO_CTRL_DGRAM_SCTP_SET_IN_HANDSHAKE: + if (num > 0) + data->in_handshake = 1; + else + data->in_handshake = 0; + + ret = setsockopt(b->num, IPPROTO_SCTP, SCTP_NODELAY, &data->in_handshake, sizeof(int)); + break; + case BIO_CTRL_DGRAM_SCTP_ADD_AUTH_KEY: + /* New shared key for SCTP AUTH. + * Returns 0 on success, -1 otherwise. + */ + + /* Get active key */ + sockopt_len = sizeof(struct sctp_authkeyid); + ret = getsockopt(b->num, IPPROTO_SCTP, SCTP_AUTH_ACTIVE_KEY, &authkeyid, &sockopt_len); + if (ret < 0) break; + + /* Add new key */ + sockopt_len = sizeof(struct sctp_authkey) + 64 * sizeof(uint8_t); + authkey = OPENSSL_malloc(sockopt_len); + memset(authkey, 0x00, sockopt_len); + authkey->sca_keynumber = authkeyid.scact_keynumber + 1; +#ifndef __FreeBSD__ + /* This field is missing in FreeBSD 8.2 and earlier, + * and FreeBSD 8.3 and higher work without it. + */ + authkey->sca_keylength = 64; +#endif + memcpy(&authkey->sca_key[0], ptr, 64 * sizeof(uint8_t)); + + ret = setsockopt(b->num, IPPROTO_SCTP, SCTP_AUTH_KEY, authkey, sockopt_len); + if (ret < 0) break; + + /* Reset active key */ + ret = setsockopt(b->num, IPPROTO_SCTP, SCTP_AUTH_ACTIVE_KEY, + &authkeyid, sizeof(struct sctp_authkeyid)); + if (ret < 0) break; + + break; + case BIO_CTRL_DGRAM_SCTP_NEXT_AUTH_KEY: + /* Returns 0 on success, -1 otherwise. */ + + /* Get active key */ + sockopt_len = sizeof(struct sctp_authkeyid); + ret = getsockopt(b->num, IPPROTO_SCTP, SCTP_AUTH_ACTIVE_KEY, &authkeyid, &sockopt_len); + if (ret < 0) break; + + /* Set active key */ + authkeyid.scact_keynumber = authkeyid.scact_keynumber + 1; + ret = setsockopt(b->num, IPPROTO_SCTP, SCTP_AUTH_ACTIVE_KEY, + &authkeyid, sizeof(struct sctp_authkeyid)); + if (ret < 0) break; + + /* CCS has been sent, so remember that and fall through + * to check if we need to deactivate an old key + */ + data->ccs_sent = 1; + + case BIO_CTRL_DGRAM_SCTP_AUTH_CCS_RCVD: + /* Returns 0 on success, -1 otherwise. */ + + /* Has this command really been called or is this just a fall-through? */ + if (cmd == BIO_CTRL_DGRAM_SCTP_AUTH_CCS_RCVD) + data->ccs_rcvd = 1; + + /* CSS has been both, received and sent, so deactivate an old key */ + if (data->ccs_rcvd == 1 && data->ccs_sent == 1) + { + /* Get active key */ + sockopt_len = sizeof(struct sctp_authkeyid); + ret = getsockopt(b->num, IPPROTO_SCTP, SCTP_AUTH_ACTIVE_KEY, &authkeyid, &sockopt_len); + if (ret < 0) break; + + /* Deactivate key or delete second last key if + * SCTP_AUTHENTICATION_EVENT is not available. + */ + authkeyid.scact_keynumber = authkeyid.scact_keynumber - 1; +#ifdef SCTP_AUTH_DEACTIVATE_KEY + sockopt_len = sizeof(struct sctp_authkeyid); + ret = setsockopt(b->num, IPPROTO_SCTP, SCTP_AUTH_DEACTIVATE_KEY, + &authkeyid, sockopt_len); + if (ret < 0) break; +#endif +#ifndef SCTP_AUTHENTICATION_EVENT + if (authkeyid.scact_keynumber > 0) + { + authkeyid.scact_keynumber = authkeyid.scact_keynumber - 1; + ret = setsockopt(b->num, IPPROTO_SCTP, SCTP_AUTH_DELETE_KEY, + &authkeyid, sizeof(struct sctp_authkeyid)); + if (ret < 0) break; + } +#endif + + data->ccs_rcvd = 0; + data->ccs_sent = 0; + } + break; + case BIO_CTRL_DGRAM_SCTP_GET_SNDINFO: + /* Returns the size of the copied struct. */ + if (num > (long) sizeof(struct bio_dgram_sctp_sndinfo)) + num = sizeof(struct bio_dgram_sctp_sndinfo); + + memcpy(ptr, &(data->sndinfo), num); + ret = num; + break; + case BIO_CTRL_DGRAM_SCTP_SET_SNDINFO: + /* Returns the size of the copied struct. */ + if (num > (long) sizeof(struct bio_dgram_sctp_sndinfo)) + num = sizeof(struct bio_dgram_sctp_sndinfo); + + memcpy(&(data->sndinfo), ptr, num); + break; + case BIO_CTRL_DGRAM_SCTP_GET_RCVINFO: + /* Returns the size of the copied struct. */ + if (num > (long) sizeof(struct bio_dgram_sctp_rcvinfo)) + num = sizeof(struct bio_dgram_sctp_rcvinfo); + + memcpy(ptr, &data->rcvinfo, num); + + ret = num; + break; + case BIO_CTRL_DGRAM_SCTP_SET_RCVINFO: + /* Returns the size of the copied struct. */ + if (num > (long) sizeof(struct bio_dgram_sctp_rcvinfo)) + num = sizeof(struct bio_dgram_sctp_rcvinfo); + + memcpy(&(data->rcvinfo), ptr, num); + break; + case BIO_CTRL_DGRAM_SCTP_GET_PRINFO: + /* Returns the size of the copied struct. */ + if (num > (long) sizeof(struct bio_dgram_sctp_prinfo)) + num = sizeof(struct bio_dgram_sctp_prinfo); + + memcpy(ptr, &(data->prinfo), num); + ret = num; + break; + case BIO_CTRL_DGRAM_SCTP_SET_PRINFO: + /* Returns the size of the copied struct. */ + if (num > (long) sizeof(struct bio_dgram_sctp_prinfo)) + num = sizeof(struct bio_dgram_sctp_prinfo); + + memcpy(&(data->prinfo), ptr, num); + break; + case BIO_CTRL_DGRAM_SCTP_SAVE_SHUTDOWN: + /* Returns always 1. */ + if (num > 0) + data->save_shutdown = 1; + else + data->save_shutdown = 0; + break; + + default: + /* Pass to default ctrl function to + * process SCTP unspecific commands + */ + ret=dgram_ctrl(b, cmd, num, ptr); + break; + } + return(ret); + } + +int BIO_dgram_sctp_notification_cb(BIO *b, + void (*handle_notifications)(BIO *bio, void *context, void *buf), + void *context) + { + bio_dgram_sctp_data *data = (bio_dgram_sctp_data *) b->ptr; + + if (handle_notifications != NULL) + { + data->handle_notifications = handle_notifications; + data->notification_context = context; + } + else + return -1; + + return 0; + } + +int BIO_dgram_sctp_wait_for_dry(BIO *b) +{ + int is_dry = 0; + int n, sockflags, ret; + union sctp_notification snp; + struct msghdr msg; + struct iovec iov; +#ifdef SCTP_EVENT + struct sctp_event event; +#else + struct sctp_event_subscribe event; + socklen_t eventsize; +#endif + bio_dgram_sctp_data *data = (bio_dgram_sctp_data *)b->ptr; + + /* set sender dry event */ +#ifdef SCTP_EVENT + memset(&event, 0, sizeof(struct sctp_event)); + event.se_assoc_id = 0; + event.se_type = SCTP_SENDER_DRY_EVENT; + event.se_on = 1; + ret = setsockopt(b->num, IPPROTO_SCTP, SCTP_EVENT, &event, sizeof(struct sctp_event)); +#else + eventsize = sizeof(struct sctp_event_subscribe); + ret = getsockopt(b->num, IPPROTO_SCTP, SCTP_EVENTS, &event, &eventsize); + if (ret < 0) + return -1; + + event.sctp_sender_dry_event = 1; + + ret = setsockopt(b->num, IPPROTO_SCTP, SCTP_EVENTS, &event, sizeof(struct sctp_event_subscribe)); +#endif + if (ret < 0) + return -1; + + /* peek for notification */ + memset(&snp, 0x00, sizeof(union sctp_notification)); + iov.iov_base = (char *)&snp; + iov.iov_len = sizeof(union sctp_notification); + msg.msg_name = NULL; + msg.msg_namelen = 0; + msg.msg_iov = &iov; + msg.msg_iovlen = 1; + msg.msg_control = NULL; + msg.msg_controllen = 0; + msg.msg_flags = 0; + + n = recvmsg(b->num, &msg, MSG_PEEK); + if (n <= 0) + { + if ((n < 0) && (get_last_socket_error() != EAGAIN) && (get_last_socket_error() != EWOULDBLOCK)) + return -1; + else + return 0; + } + + /* if we find a notification, process it and try again if necessary */ + while (msg.msg_flags & MSG_NOTIFICATION) + { + memset(&snp, 0x00, sizeof(union sctp_notification)); + iov.iov_base = (char *)&snp; + iov.iov_len = sizeof(union sctp_notification); + msg.msg_name = NULL; + msg.msg_namelen = 0; + msg.msg_iov = &iov; + msg.msg_iovlen = 1; + msg.msg_control = NULL; + msg.msg_controllen = 0; + msg.msg_flags = 0; + + n = recvmsg(b->num, &msg, 0); + if (n <= 0) + { + if ((n < 0) && (get_last_socket_error() != EAGAIN) && (get_last_socket_error() != EWOULDBLOCK)) + return -1; + else + return is_dry; + } + + if (snp.sn_header.sn_type == SCTP_SENDER_DRY_EVENT) + { + is_dry = 1; + + /* disable sender dry event */ +#ifdef SCTP_EVENT + memset(&event, 0, sizeof(struct sctp_event)); + event.se_assoc_id = 0; + event.se_type = SCTP_SENDER_DRY_EVENT; + event.se_on = 0; + ret = setsockopt(b->num, IPPROTO_SCTP, SCTP_EVENT, &event, sizeof(struct sctp_event)); +#else + eventsize = (socklen_t) sizeof(struct sctp_event_subscribe); + ret = getsockopt(b->num, IPPROTO_SCTP, SCTP_EVENTS, &event, &eventsize); + if (ret < 0) + return -1; + + event.sctp_sender_dry_event = 0; + + ret = setsockopt(b->num, IPPROTO_SCTP, SCTP_EVENTS, &event, sizeof(struct sctp_event_subscribe)); +#endif + if (ret < 0) + return -1; + } + +#ifdef SCTP_AUTHENTICATION_EVENT + if (snp.sn_header.sn_type == SCTP_AUTHENTICATION_EVENT) + dgram_sctp_handle_auth_free_key_event(b, &snp); +#endif + + if (data->handle_notifications != NULL) + data->handle_notifications(b, data->notification_context, (void*) &snp); + + /* found notification, peek again */ + memset(&snp, 0x00, sizeof(union sctp_notification)); + iov.iov_base = (char *)&snp; + iov.iov_len = sizeof(union sctp_notification); + msg.msg_name = NULL; + msg.msg_namelen = 0; + msg.msg_iov = &iov; + msg.msg_iovlen = 1; + msg.msg_control = NULL; + msg.msg_controllen = 0; + msg.msg_flags = 0; + + /* if we have seen the dry already, don't wait */ + if (is_dry) + { + sockflags = fcntl(b->num, F_GETFL, 0); + fcntl(b->num, F_SETFL, O_NONBLOCK); + } + + n = recvmsg(b->num, &msg, MSG_PEEK); + + if (is_dry) + { + fcntl(b->num, F_SETFL, sockflags); + } + + if (n <= 0) + { + if ((n < 0) && (get_last_socket_error() != EAGAIN) && (get_last_socket_error() != EWOULDBLOCK)) + return -1; + else + return is_dry; + } + } + + /* read anything else */ + return is_dry; +} + +int BIO_dgram_sctp_msg_waiting(BIO *b) + { + int n, sockflags; + union sctp_notification snp; + struct msghdr msg; + struct iovec iov; + bio_dgram_sctp_data *data = (bio_dgram_sctp_data *)b->ptr; + + /* Check if there are any messages waiting to be read */ + do + { + memset(&snp, 0x00, sizeof(union sctp_notification)); + iov.iov_base = (char *)&snp; + iov.iov_len = sizeof(union sctp_notification); + msg.msg_name = NULL; + msg.msg_namelen = 0; + msg.msg_iov = &iov; + msg.msg_iovlen = 1; + msg.msg_control = NULL; + msg.msg_controllen = 0; + msg.msg_flags = 0; + + sockflags = fcntl(b->num, F_GETFL, 0); + fcntl(b->num, F_SETFL, O_NONBLOCK); + n = recvmsg(b->num, &msg, MSG_PEEK); + fcntl(b->num, F_SETFL, sockflags); + + /* if notification, process and try again */ + if (n > 0 && (msg.msg_flags & MSG_NOTIFICATION)) + { +#ifdef SCTP_AUTHENTICATION_EVENT + if (snp.sn_header.sn_type == SCTP_AUTHENTICATION_EVENT) + dgram_sctp_handle_auth_free_key_event(b, &snp); +#endif + + memset(&snp, 0x00, sizeof(union sctp_notification)); + iov.iov_base = (char *)&snp; + iov.iov_len = sizeof(union sctp_notification); + msg.msg_name = NULL; + msg.msg_namelen = 0; + msg.msg_iov = &iov; + msg.msg_iovlen = 1; + msg.msg_control = NULL; + msg.msg_controllen = 0; + msg.msg_flags = 0; + n = recvmsg(b->num, &msg, 0); + + if (data->handle_notifications != NULL) + data->handle_notifications(b, data->notification_context, (void*) &snp); + } + + } while (n > 0 && (msg.msg_flags & MSG_NOTIFICATION)); + + /* Return 1 if there is a message to be read, return 0 otherwise. */ + if (n > 0) + return 1; + else + return 0; + } + +static int dgram_sctp_puts(BIO *bp, const char *str) + { + int n,ret; + + n=strlen(str); + ret=dgram_sctp_write(bp,str,n); + return(ret); + } +#endif + static int BIO_dgram_should_retry(int i) { int err; diff --git a/crypto/openssl/crypto/bn/asm/modexp512-x86_64.pl b/crypto/openssl/crypto/bn/asm/modexp512-x86_64.pl new file mode 100644 index 0000000000..54aeb01921 --- /dev/null +++ b/crypto/openssl/crypto/bn/asm/modexp512-x86_64.pl @@ -0,0 +1,1496 @@ +#!/usr/bin/env perl +# +# Copyright (c) 2010-2011 Intel Corp. +# Author: Vinodh.Gopal@intel.com +# Jim Guilford +# Erdinc.Ozturk@intel.com +# Maxim.Perminov@intel.com +# +# More information about algorithm used can be found at: +# http://www.cse.buffalo.edu/srds2009/escs2009_submission_Gopal.pdf +# +# ==================================================================== +# Copyright (c) 2011 The OpenSSL Project. All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# +# 1. Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# +# 2. Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in +# the documentation and/or other materials provided with the +# distribution. +# +# 3. All advertising materials mentioning features or use of this +# software must display the following acknowledgment: +# "This product includes software developed by the OpenSSL Project +# for use in the OpenSSL Toolkit. (http://www.OpenSSL.org/)" +# +# 4. The names "OpenSSL Toolkit" and "OpenSSL Project" must not be used to +# endorse or promote products derived from this software without +# prior written permission. For written permission, please contact +# licensing@OpenSSL.org. +# +# 5. Products derived from this software may not be called "OpenSSL" +# nor may "OpenSSL" appear in their names without prior written +# permission of the OpenSSL Project. +# +# 6. Redistributions of any form whatsoever must retain the following +# acknowledgment: +# "This product includes software developed by the OpenSSL Project +# for use in the OpenSSL Toolkit (http://www.OpenSSL.org/)" +# +# THIS SOFTWARE IS PROVIDED BY THE OpenSSL PROJECT ``AS IS'' AND ANY +# EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE OpenSSL PROJECT OR +# ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT +# NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +# LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +# HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, +# STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +# ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED +# OF THE POSSIBILITY OF SUCH DAMAGE. +# ==================================================================== + +$flavour = shift; +$output = shift; +if ($flavour =~ /\./) { $output = $flavour; undef $flavour; } + +my $win64=0; $win64=1 if ($flavour =~ /[nm]asm|mingw64/ || $output =~ /\.asm$/); + +$0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1; +( $xlate="${dir}x86_64-xlate.pl" and -f $xlate ) or +( $xlate="${dir}../../perlasm/x86_64-xlate.pl" and -f $xlate) or +die "can't locate x86_64-xlate.pl"; + +open STDOUT,"| $^X $xlate $flavour $output"; + +use strict; +my $code=".text\n\n"; +my $m=0; + +# +# Define x512 macros +# + +#MULSTEP_512_ADD MACRO x7, x6, x5, x4, x3, x2, x1, x0, dst, src1, src2, add_src, tmp1, tmp2 +# +# uses rax, rdx, and args +sub MULSTEP_512_ADD +{ + my ($x, $DST, $SRC2, $ASRC, $OP, $TMP)=@_; + my @X=@$x; # make a copy +$code.=<<___; + mov (+8*0)($SRC2), %rax + mul $OP # rdx:rax = %OP * [0] + mov ($ASRC), $X[0] + add %rax, $X[0] + adc \$0, %rdx + mov $X[0], $DST +___ +for(my $i=1;$i<8;$i++) { +$code.=<<___; + mov %rdx, $TMP + + mov (+8*$i)($SRC2), %rax + mul $OP # rdx:rax = %OP * [$i] + mov (+8*$i)($ASRC), $X[$i] + add %rax, $X[$i] + adc \$0, %rdx + add $TMP, $X[$i] + adc \$0, %rdx +___ +} +$code.=<<___; + mov %rdx, $X[0] +___ +} + +#MULSTEP_512 MACRO x7, x6, x5, x4, x3, x2, x1, x0, dst, src2, src1_val, tmp +# +# uses rax, rdx, and args +sub MULSTEP_512 +{ + my ($x, $DST, $SRC2, $OP, $TMP)=@_; + my @X=@$x; # make a copy +$code.=<<___; + mov (+8*0)($SRC2), %rax + mul $OP # rdx:rax = %OP * [0] + add %rax, $X[0] + adc \$0, %rdx + mov $X[0], $DST +___ +for(my $i=1;$i<8;$i++) { +$code.=<<___; + mov %rdx, $TMP + + mov (+8*$i)($SRC2), %rax + mul $OP # rdx:rax = %OP * [$i] + add %rax, $X[$i] + adc \$0, %rdx + add $TMP, $X[$i] + adc \$0, %rdx +___ +} +$code.=<<___; + mov %rdx, $X[0] +___ +} + +# +# Swizzle Macros +# + +# macro to copy data from flat space to swizzled table +#MACRO swizzle pDst, pSrc, tmp1, tmp2 +# pDst and pSrc are modified +sub swizzle +{ + my ($pDst, $pSrc, $cnt, $d0)=@_; +$code.=<<___; + mov \$8, $cnt +loop_$m: + mov ($pSrc), $d0 + mov $d0#w, ($pDst) + shr \$16, $d0 + mov $d0#w, (+64*1)($pDst) + shr \$16, $d0 + mov $d0#w, (+64*2)($pDst) + shr \$16, $d0 + mov $d0#w, (+64*3)($pDst) + lea 8($pSrc), $pSrc + lea 64*4($pDst), $pDst + dec $cnt + jnz loop_$m +___ + + $m++; +} + +# macro to copy data from swizzled table to flat space +#MACRO unswizzle pDst, pSrc, tmp*3 +sub unswizzle +{ + my ($pDst, $pSrc, $cnt, $d0, $d1)=@_; +$code.=<<___; + mov \$4, $cnt +loop_$m: + movzxw (+64*3+256*0)($pSrc), $d0 + movzxw (+64*3+256*1)($pSrc), $d1 + shl \$16, $d0 + shl \$16, $d1 + mov (+64*2+256*0)($pSrc), $d0#w + mov (+64*2+256*1)($pSrc), $d1#w + shl \$16, $d0 + shl \$16, $d1 + mov (+64*1+256*0)($pSrc), $d0#w + mov (+64*1+256*1)($pSrc), $d1#w + shl \$16, $d0 + shl \$16, $d1 + mov (+64*0+256*0)($pSrc), $d0#w + mov (+64*0+256*1)($pSrc), $d1#w + mov $d0, (+8*0)($pDst) + mov $d1, (+8*1)($pDst) + lea 256*2($pSrc), $pSrc + lea 8*2($pDst), $pDst + sub \$1, $cnt + jnz loop_$m +___ + + $m++; +} + +# +# Data Structures +# + +# Reduce Data +# +# +# Offset Value +# 0C0 Carries +# 0B8 X2[10] +# 0B0 X2[9] +# 0A8 X2[8] +# 0A0 X2[7] +# 098 X2[6] +# 090 X2[5] +# 088 X2[4] +# 080 X2[3] +# 078 X2[2] +# 070 X2[1] +# 068 X2[0] +# 060 X1[12] P[10] +# 058 X1[11] P[9] Z[8] +# 050 X1[10] P[8] Z[7] +# 048 X1[9] P[7] Z[6] +# 040 X1[8] P[6] Z[5] +# 038 X1[7] P[5] Z[4] +# 030 X1[6] P[4] Z[3] +# 028 X1[5] P[3] Z[2] +# 020 X1[4] P[2] Z[1] +# 018 X1[3] P[1] Z[0] +# 010 X1[2] P[0] Y[2] +# 008 X1[1] Q[1] Y[1] +# 000 X1[0] Q[0] Y[0] + +my $X1_offset = 0; # 13 qwords +my $X2_offset = $X1_offset + 13*8; # 11 qwords +my $Carries_offset = $X2_offset + 11*8; # 1 qword +my $Q_offset = 0; # 2 qwords +my $P_offset = $Q_offset + 2*8; # 11 qwords +my $Y_offset = 0; # 3 qwords +my $Z_offset = $Y_offset + 3*8; # 9 qwords + +my $Red_Data_Size = $Carries_offset + 1*8; # (25 qwords) + +# +# Stack Frame +# +# +# offset value +# ... +# ... +# 280 Garray + +# 278 tmp16[15] +# ... ... +# 200 tmp16[0] + +# 1F8 tmp[7] +# ... ... +# 1C0 tmp[0] + +# 1B8 GT[7] +# ... ... +# 180 GT[0] + +# 178 Reduce Data +# ... ... +# 0B8 Reduce Data +# 0B0 reserved +# 0A8 reserved +# 0A0 reserved +# 098 reserved +# 090 reserved +# 088 reduce result addr +# 080 exp[8] + +# ... +# 048 exp[1] +# 040 exp[0] + +# 038 reserved +# 030 loop_idx +# 028 pg +# 020 i +# 018 pData ; arg 4 +# 010 pG ; arg 2 +# 008 pResult ; arg 1 +# 000 rsp ; stack pointer before subtract + +my $rsp_offset = 0; +my $pResult_offset = 8*1 + $rsp_offset; +my $pG_offset = 8*1 + $pResult_offset; +my $pData_offset = 8*1 + $pG_offset; +my $i_offset = 8*1 + $pData_offset; +my $pg_offset = 8*1 + $i_offset; +my $loop_idx_offset = 8*1 + $pg_offset; +my $reserved1_offset = 8*1 + $loop_idx_offset; +my $exp_offset = 8*1 + $reserved1_offset; +my $red_result_addr_offset= 8*9 + $exp_offset; +my $reserved2_offset = 8*1 + $red_result_addr_offset; +my $Reduce_Data_offset = 8*5 + $reserved2_offset; +my $GT_offset = $Red_Data_Size + $Reduce_Data_offset; +my $tmp_offset = 8*8 + $GT_offset; +my $tmp16_offset = 8*8 + $tmp_offset; +my $garray_offset = 8*16 + $tmp16_offset; +my $mem_size = 8*8*32 + $garray_offset; + +# +# Offsets within Reduce Data +# +# +# struct MODF_2FOLD_MONT_512_C1_DATA { +# UINT64 t[8][8]; +# UINT64 m[8]; +# UINT64 m1[8]; /* 2^768 % m */ +# UINT64 m2[8]; /* 2^640 % m */ +# UINT64 k1[2]; /* (- 1/m) % 2^128 */ +# }; + +my $T = 0; +my $M = 512; # = 8 * 8 * 8 +my $M1 = 576; # = 8 * 8 * 9 /* += 8 * 8 */ +my $M2 = 640; # = 8 * 8 * 10 /* += 8 * 8 */ +my $K1 = 704; # = 8 * 8 * 11 /* += 8 * 8 */ + +# +# FUNCTIONS +# + +{{{ +# +# MULADD_128x512 : Function to multiply 128-bits (2 qwords) by 512-bits (8 qwords) +# and add 512-bits (8 qwords) +# to get 640 bits (10 qwords) +# Input: 128-bit mul source: [rdi+8*1], rbp +# 512-bit mul source: [rsi+8*n] +# 512-bit add source: r15, r14, ..., r9, r8 +# Output: r9, r8, r15, r14, r13, r12, r11, r10, [rcx+8*1], [rcx+8*0] +# Clobbers all regs except: rcx, rsi, rdi +$code.=<<___; +.type MULADD_128x512,\@abi-omnipotent +.align 16 +MULADD_128x512: +___ + &MULSTEP_512([map("%r$_",(8..15))], "(+8*0)(%rcx)", "%rsi", "%rbp", "%rbx"); +$code.=<<___; + mov (+8*1)(%rdi), %rbp +___ + &MULSTEP_512([map("%r$_",(9..15,8))], "(+8*1)(%rcx)", "%rsi", "%rbp", "%rbx"); +$code.=<<___; + ret +.size MULADD_128x512,.-MULADD_128x512 +___ +}}} + +{{{ +#MULADD_256x512 MACRO pDst, pA, pB, OP, TMP, X7, X6, X5, X4, X3, X2, X1, X0 +# +# Inputs: pDst: Destination (768 bits, 12 qwords) +# pA: Multiplicand (1024 bits, 16 qwords) +# pB: Multiplicand (512 bits, 8 qwords) +# Dst = Ah * B + Al +# where Ah is (in qwords) A[15:12] (256 bits) and Al is A[7:0] (512 bits) +# Results in X3 X2 X1 X0 X7 X6 X5 X4 Dst[3:0] +# Uses registers: arguments, RAX, RDX +sub MULADD_256x512 +{ + my ($pDst, $pA, $pB, $OP, $TMP, $X)=@_; +$code.=<<___; + mov (+8*12)($pA), $OP +___ + &MULSTEP_512_ADD($X, "(+8*0)($pDst)", $pB, $pA, $OP, $TMP); + push(@$X,shift(@$X)); + +$code.=<<___; + mov (+8*13)($pA), $OP +___ + &MULSTEP_512($X, "(+8*1)($pDst)", $pB, $OP, $TMP); + push(@$X,shift(@$X)); + +$code.=<<___; + mov (+8*14)($pA), $OP +___ + &MULSTEP_512($X, "(+8*2)($pDst)", $pB, $OP, $TMP); + push(@$X,shift(@$X)); + +$code.=<<___; + mov (+8*15)($pA), $OP +___ + &MULSTEP_512($X, "(+8*3)($pDst)", $pB, $OP, $TMP); + push(@$X,shift(@$X)); +} + +# +# mont_reduce(UINT64 *x, /* 1024 bits, 16 qwords */ +# UINT64 *m, /* 512 bits, 8 qwords */ +# MODF_2FOLD_MONT_512_C1_DATA *data, +# UINT64 *r) /* 512 bits, 8 qwords */ +# Input: x (number to be reduced): tmp16 (Implicit) +# m (modulus): [pM] (Implicit) +# data (reduce data): [pData] (Implicit) +# Output: r (result): Address in [red_res_addr] +# result also in: r9, r8, r15, r14, r13, r12, r11, r10 + +my @X=map("%r$_",(8..15)); + +$code.=<<___; +.type mont_reduce,\@abi-omnipotent +.align 16 +mont_reduce: +___ + +my $STACK_DEPTH = 8; + # + # X1 = Xh * M1 + Xl +$code.=<<___; + lea (+$Reduce_Data_offset+$X1_offset+$STACK_DEPTH)(%rsp), %rdi # pX1 (Dst) 769 bits, 13 qwords + mov (+$pData_offset+$STACK_DEPTH)(%rsp), %rsi # pM1 (Bsrc) 512 bits, 8 qwords + add \$$M1, %rsi + lea (+$tmp16_offset+$STACK_DEPTH)(%rsp), %rcx # X (Asrc) 1024 bits, 16 qwords + +___ + + &MULADD_256x512("%rdi", "%rcx", "%rsi", "%rbp", "%rbx", \@X); # rotates @X 4 times + # results in r11, r10, r9, r8, r15, r14, r13, r12, X1[3:0] + +$code.=<<___; + xor %rax, %rax + # X1 += xl + add (+8*8)(%rcx), $X[4] + adc (+8*9)(%rcx), $X[5] + adc (+8*10)(%rcx), $X[6] + adc (+8*11)(%rcx), $X[7] + adc \$0, %rax + # X1 is now rax, r11-r8, r15-r12, tmp16[3:0] + + # + # check for carry ;; carry stored in rax + mov $X[4], (+8*8)(%rdi) # rdi points to X1 + mov $X[5], (+8*9)(%rdi) + mov $X[6], %rbp + mov $X[7], (+8*11)(%rdi) + + mov %rax, (+$Reduce_Data_offset+$Carries_offset+$STACK_DEPTH)(%rsp) + + mov (+8*0)(%rdi), $X[4] + mov (+8*1)(%rdi), $X[5] + mov (+8*2)(%rdi), $X[6] + mov (+8*3)(%rdi), $X[7] + + # X1 is now stored in: X1[11], rbp, X1[9:8], r15-r8 + # rdi -> X1 + # rsi -> M1 + + # + # X2 = Xh * M2 + Xl + # do first part (X2 = Xh * M2) + add \$8*10, %rdi # rdi -> pXh ; 128 bits, 2 qwords + # Xh is actually { [rdi+8*1], rbp } + add \$`$M2-$M1`, %rsi # rsi -> M2 + lea (+$Reduce_Data_offset+$X2_offset+$STACK_DEPTH)(%rsp), %rcx # rcx -> pX2 ; 641 bits, 11 qwords +___ + unshift(@X,pop(@X)); unshift(@X,pop(@X)); +$code.=<<___; + + call MULADD_128x512 # args in rcx, rdi / rbp, rsi, r15-r8 + # result in r9, r8, r15, r14, r13, r12, r11, r10, X2[1:0] + mov (+$Reduce_Data_offset+$Carries_offset+$STACK_DEPTH)(%rsp), %rax + + # X2 += Xl + add (+8*8-8*10)(%rdi), $X[6] # (-8*10) is to adjust rdi -> Xh to Xl + adc (+8*9-8*10)(%rdi), $X[7] + mov $X[6], (+8*8)(%rcx) + mov $X[7], (+8*9)(%rcx) + + adc %rax, %rax + mov %rax, (+$Reduce_Data_offset+$Carries_offset+$STACK_DEPTH)(%rsp) + + lea (+$Reduce_Data_offset+$Q_offset+$STACK_DEPTH)(%rsp), %rdi # rdi -> pQ ; 128 bits, 2 qwords + add \$`$K1-$M2`, %rsi # rsi -> pK1 ; 128 bits, 2 qwords + + # MUL_128x128t128 rdi, rcx, rsi ; Q = X2 * K1 (bottom half) + # B1:B0 = rsi[1:0] = K1[1:0] + # A1:A0 = rcx[1:0] = X2[1:0] + # Result = rdi[1],rbp = Q[1],rbp + mov (%rsi), %r8 # B0 + mov (+8*1)(%rsi), %rbx # B1 + + mov (%rcx), %rax # A0 + mul %r8 # B0 + mov %rax, %rbp + mov %rdx, %r9 + + mov (+8*1)(%rcx), %rax # A1 + mul %r8 # B0 + add %rax, %r9 + + mov (%rcx), %rax # A0 + mul %rbx # B1 + add %rax, %r9 + + mov %r9, (+8*1)(%rdi) + # end MUL_128x128t128 + + sub \$`$K1-$M`, %rsi + + mov (%rcx), $X[6] + mov (+8*1)(%rcx), $X[7] # r9:r8 = X2[1:0] + + call MULADD_128x512 # args in rcx, rdi / rbp, rsi, r15-r8 + # result in r9, r8, r15, r14, r13, r12, r11, r10, X2[1:0] + + # load first half of m to rdx, rdi, rbx, rax + # moved this here for efficiency + mov (+8*0)(%rsi), %rax + mov (+8*1)(%rsi), %rbx + mov (+8*2)(%rsi), %rdi + mov (+8*3)(%rsi), %rdx + + # continue with reduction + mov (+$Reduce_Data_offset+$Carries_offset+$STACK_DEPTH)(%rsp), %rbp + + add (+8*8)(%rcx), $X[6] + adc (+8*9)(%rcx), $X[7] + + #accumulate the final carry to rbp + adc %rbp, %rbp + + # Add in overflow corrections: R = (X2>>128) += T[overflow] + # R = {r9, r8, r15, r14, ..., r10} + shl \$3, %rbp + mov (+$pData_offset+$STACK_DEPTH)(%rsp), %rcx # rsi -> Data (and points to T) + add %rcx, %rbp # pT ; 512 bits, 8 qwords, spread out + + # rsi will be used to generate a mask after the addition + xor %rsi, %rsi + + add (+8*8*0)(%rbp), $X[0] + adc (+8*8*1)(%rbp), $X[1] + adc (+8*8*2)(%rbp), $X[2] + adc (+8*8*3)(%rbp), $X[3] + adc (+8*8*4)(%rbp), $X[4] + adc (+8*8*5)(%rbp), $X[5] + adc (+8*8*6)(%rbp), $X[6] + adc (+8*8*7)(%rbp), $X[7] + + # if there is a carry: rsi = 0xFFFFFFFFFFFFFFFF + # if carry is clear: rsi = 0x0000000000000000 + sbb \$0, %rsi + + # if carry is clear, subtract 0. Otherwise, subtract 256 bits of m + and %rsi, %rax + and %rsi, %rbx + and %rsi, %rdi + and %rsi, %rdx + + mov \$1, %rbp + sub %rax, $X[0] + sbb %rbx, $X[1] + sbb %rdi, $X[2] + sbb %rdx, $X[3] + + # if there is a borrow: rbp = 0 + # if there is no borrow: rbp = 1 + # this is used to save the borrows in between the first half and the 2nd half of the subtraction of m + sbb \$0, %rbp + + #load second half of m to rdx, rdi, rbx, rax + + add \$$M, %rcx + mov (+8*4)(%rcx), %rax + mov (+8*5)(%rcx), %rbx + mov (+8*6)(%rcx), %rdi + mov (+8*7)(%rcx), %rdx + + # use the rsi mask as before + # if carry is clear, subtract 0. Otherwise, subtract 256 bits of m + and %rsi, %rax + and %rsi, %rbx + and %rsi, %rdi + and %rsi, %rdx + + # if rbp = 0, there was a borrow before, it is moved to the carry flag + # if rbp = 1, there was not a borrow before, carry flag is cleared + sub \$1, %rbp + + sbb %rax, $X[4] + sbb %rbx, $X[5] + sbb %rdi, $X[6] + sbb %rdx, $X[7] + + # write R back to memory + + mov (+$red_result_addr_offset+$STACK_DEPTH)(%rsp), %rsi + mov $X[0], (+8*0)(%rsi) + mov $X[1], (+8*1)(%rsi) + mov $X[2], (+8*2)(%rsi) + mov $X[3], (+8*3)(%rsi) + mov $X[4], (+8*4)(%rsi) + mov $X[5], (+8*5)(%rsi) + mov $X[6], (+8*6)(%rsi) + mov $X[7], (+8*7)(%rsi) + + ret +.size mont_reduce,.-mont_reduce +___ +}}} + +{{{ +#MUL_512x512 MACRO pDst, pA, pB, x7, x6, x5, x4, x3, x2, x1, x0, tmp*2 +# +# Inputs: pDst: Destination (1024 bits, 16 qwords) +# pA: Multiplicand (512 bits, 8 qwords) +# pB: Multiplicand (512 bits, 8 qwords) +# Uses registers rax, rdx, args +# B operand in [pB] and also in x7...x0 +sub MUL_512x512 +{ + my ($pDst, $pA, $pB, $x, $OP, $TMP, $pDst_o)=@_; + my ($pDst, $pDst_o) = ($pDst =~ m/([^+]*)\+?(.*)?/); + my @X=@$x; # make a copy + +$code.=<<___; + mov (+8*0)($pA), $OP + + mov $X[0], %rax + mul $OP # rdx:rax = %OP * [0] + mov %rax, (+$pDst_o+8*0)($pDst) + mov %rdx, $X[0] +___ +for(my $i=1;$i<8;$i++) { +$code.=<<___; + mov $X[$i], %rax + mul $OP # rdx:rax = %OP * [$i] + add %rax, $X[$i-1] + adc \$0, %rdx + mov %rdx, $X[$i] +___ +} + +for(my $i=1;$i<8;$i++) { +$code.=<<___; + mov (+8*$i)($pA), $OP +___ + + &MULSTEP_512(\@X, "(+$pDst_o+8*$i)($pDst)", $pB, $OP, $TMP); + push(@X,shift(@X)); +} + +$code.=<<___; + mov $X[0], (+$pDst_o+8*8)($pDst) + mov $X[1], (+$pDst_o+8*9)($pDst) + mov $X[2], (+$pDst_o+8*10)($pDst) + mov $X[3], (+$pDst_o+8*11)($pDst) + mov $X[4], (+$pDst_o+8*12)($pDst) + mov $X[5], (+$pDst_o+8*13)($pDst) + mov $X[6], (+$pDst_o+8*14)($pDst) + mov $X[7], (+$pDst_o+8*15)($pDst) +___ +} + +# +# mont_mul_a3b : subroutine to compute (Src1 * Src2) % M (all 512-bits) +# Input: src1: Address of source 1: rdi +# src2: Address of source 2: rsi +# Output: dst: Address of destination: [red_res_addr] +# src2 and result also in: r9, r8, r15, r14, r13, r12, r11, r10 +# Temp: Clobbers [tmp16], all registers +$code.=<<___; +.type mont_mul_a3b,\@abi-omnipotent +.align 16 +mont_mul_a3b: + # + # multiply tmp = src1 * src2 + # For multiply: dst = rcx, src1 = rdi, src2 = rsi + # stack depth is extra 8 from call +___ + &MUL_512x512("%rsp+$tmp16_offset+8", "%rdi", "%rsi", [map("%r$_",(10..15,8..9))], "%rbp", "%rbx"); +$code.=<<___; + # + # Dst = tmp % m + # Call reduce(tmp, m, data, dst) + + # tail recursion optimization: jmp to mont_reduce and return from there + jmp mont_reduce + # call mont_reduce + # ret +.size mont_mul_a3b,.-mont_mul_a3b +___ +}}} + +{{{ +#SQR_512 MACRO pDest, pA, x7, x6, x5, x4, x3, x2, x1, x0, tmp*4 +# +# Input in memory [pA] and also in x7...x0 +# Uses all argument registers plus rax and rdx +# +# This version computes all of the off-diagonal terms into memory, +# and then it adds in the diagonal terms + +sub SQR_512 +{ + my ($pDst, $pA, $x, $A, $tmp, $x7, $x6, $pDst_o)=@_; + my ($pDst, $pDst_o) = ($pDst =~ m/([^+]*)\+?(.*)?/); + my @X=@$x; # make a copy +$code.=<<___; + # ------------------ + # first pass 01...07 + # ------------------ + mov $X[0], $A + + mov $X[1],%rax + mul $A + mov %rax, (+$pDst_o+8*1)($pDst) +___ +for(my $i=2;$i<8;$i++) { +$code.=<<___; + mov %rdx, $X[$i-2] + mov $X[$i],%rax + mul $A + add %rax, $X[$i-2] + adc \$0, %rdx +___ +} +$code.=<<___; + mov %rdx, $x7 + + mov $X[0], (+$pDst_o+8*2)($pDst) + + # ------------------ + # second pass 12...17 + # ------------------ + + mov (+8*1)($pA), $A + + mov (+8*2)($pA),%rax + mul $A + add %rax, $X[1] + adc \$0, %rdx + mov $X[1], (+$pDst_o+8*3)($pDst) + + mov %rdx, $X[0] + mov (+8*3)($pA),%rax + mul $A + add %rax, $X[2] + adc \$0, %rdx + add $X[0], $X[2] + adc \$0, %rdx + mov $X[2], (+$pDst_o+8*4)($pDst) + + mov %rdx, $X[0] + mov (+8*4)($pA),%rax + mul $A + add %rax, $X[3] + adc \$0, %rdx + add $X[0], $X[3] + adc \$0, %rdx + + mov %rdx, $X[0] + mov (+8*5)($pA),%rax + mul $A + add %rax, $X[4] + adc \$0, %rdx + add $X[0], $X[4] + adc \$0, %rdx + + mov %rdx, $X[0] + mov $X[6],%rax + mul $A + add %rax, $X[5] + adc \$0, %rdx + add $X[0], $X[5] + adc \$0, %rdx + + mov %rdx, $X[0] + mov $X[7],%rax + mul $A + add %rax, $x7 + adc \$0, %rdx + add $X[0], $x7 + adc \$0, %rdx + + mov %rdx, $X[1] + + # ------------------ + # third pass 23...27 + # ------------------ + mov (+8*2)($pA), $A + + mov (+8*3)($pA),%rax + mul $A + add %rax, $X[3] + adc \$0, %rdx + mov $X[3], (+$pDst_o+8*5)($pDst) + + mov %rdx, $X[0] + mov (+8*4)($pA),%rax + mul $A + add %rax, $X[4] + adc \$0, %rdx + add $X[0], $X[4] + adc \$0, %rdx + mov $X[4], (+$pDst_o+8*6)($pDst) + + mov %rdx, $X[0] + mov (+8*5)($pA),%rax + mul $A + add %rax, $X[5] + adc \$0, %rdx + add $X[0], $X[5] + adc \$0, %rdx + + mov %rdx, $X[0] + mov $X[6],%rax + mul $A + add %rax, $x7 + adc \$0, %rdx + add $X[0], $x7 + adc \$0, %rdx + + mov %rdx, $X[0] + mov $X[7],%rax + mul $A + add %rax, $X[1] + adc \$0, %rdx + add $X[0], $X[1] + adc \$0, %rdx + + mov %rdx, $X[2] + + # ------------------ + # fourth pass 34...37 + # ------------------ + + mov (+8*3)($pA), $A + + mov (+8*4)($pA),%rax + mul $A + add %rax, $X[5] + adc \$0, %rdx + mov $X[5], (+$pDst_o+8*7)($pDst) + + mov %rdx, $X[0] + mov (+8*5)($pA),%rax + mul $A + add %rax, $x7 + adc \$0, %rdx + add $X[0], $x7 + adc \$0, %rdx + mov $x7, (+$pDst_o+8*8)($pDst) + + mov %rdx, $X[0] + mov $X[6],%rax + mul $A + add %rax, $X[1] + adc \$0, %rdx + add $X[0], $X[1] + adc \$0, %rdx + + mov %rdx, $X[0] + mov $X[7],%rax + mul $A + add %rax, $X[2] + adc \$0, %rdx + add $X[0], $X[2] + adc \$0, %rdx + + mov %rdx, $X[5] + + # ------------------ + # fifth pass 45...47 + # ------------------ + mov (+8*4)($pA), $A + + mov (+8*5)($pA),%rax + mul $A + add %rax, $X[1] + adc \$0, %rdx + mov $X[1], (+$pDst_o+8*9)($pDst) + + mov %rdx, $X[0] + mov $X[6],%rax + mul $A + add %rax, $X[2] + adc \$0, %rdx + add $X[0], $X[2] + adc \$0, %rdx + mov $X[2], (+$pDst_o+8*10)($pDst) + + mov %rdx, $X[0] + mov $X[7],%rax + mul $A + add %rax, $X[5] + adc \$0, %rdx + add $X[0], $X[5] + adc \$0, %rdx + + mov %rdx, $X[1] + + # ------------------ + # sixth pass 56...57 + # ------------------ + mov (+8*5)($pA), $A + + mov $X[6],%rax + mul $A + add %rax, $X[5] + adc \$0, %rdx + mov $X[5], (+$pDst_o+8*11)($pDst) + + mov %rdx, $X[0] + mov $X[7],%rax + mul $A + add %rax, $X[1] + adc \$0, %rdx + add $X[0], $X[1] + adc \$0, %rdx + mov $X[1], (+$pDst_o+8*12)($pDst) + + mov %rdx, $X[2] + + # ------------------ + # seventh pass 67 + # ------------------ + mov $X[6], $A + + mov $X[7],%rax + mul $A + add %rax, $X[2] + adc \$0, %rdx + mov $X[2], (+$pDst_o+8*13)($pDst) + + mov %rdx, (+$pDst_o+8*14)($pDst) + + # start finalize (add in squares, and double off-terms) + mov (+$pDst_o+8*1)($pDst), $X[0] + mov (+$pDst_o+8*2)($pDst), $X[1] + mov (+$pDst_o+8*3)($pDst), $X[2] + mov (+$pDst_o+8*4)($pDst), $X[3] + mov (+$pDst_o+8*5)($pDst), $X[4] + mov (+$pDst_o+8*6)($pDst), $X[5] + + mov (+8*3)($pA), %rax + mul %rax + mov %rax, $x6 + mov %rdx, $X[6] + + add $X[0], $X[0] + adc $X[1], $X[1] + adc $X[2], $X[2] + adc $X[3], $X[3] + adc $X[4], $X[4] + adc $X[5], $X[5] + adc \$0, $X[6] + + mov (+8*0)($pA), %rax + mul %rax + mov %rax, (+$pDst_o+8*0)($pDst) + mov %rdx, $A + + mov (+8*1)($pA), %rax + mul %rax + + add $A, $X[0] + adc %rax, $X[1] + adc \$0, %rdx + + mov %rdx, $A + mov $X[0], (+$pDst_o+8*1)($pDst) + mov $X[1], (+$pDst_o+8*2)($pDst) + + mov (+8*2)($pA), %rax + mul %rax + + add $A, $X[2] + adc %rax, $X[3] + adc \$0, %rdx + + mov %rdx, $A + + mov $X[2], (+$pDst_o+8*3)($pDst) + mov $X[3], (+$pDst_o+8*4)($pDst) + + xor $tmp, $tmp + add $A, $X[4] + adc $x6, $X[5] + adc \$0, $tmp + + mov $X[4], (+$pDst_o+8*5)($pDst) + mov $X[5], (+$pDst_o+8*6)($pDst) + + # %%tmp has 0/1 in column 7 + # %%A6 has a full value in column 7 + + mov (+$pDst_o+8*7)($pDst), $X[0] + mov (+$pDst_o+8*8)($pDst), $X[1] + mov (+$pDst_o+8*9)($pDst), $X[2] + mov (+$pDst_o+8*10)($pDst), $X[3] + mov (+$pDst_o+8*11)($pDst), $X[4] + mov (+$pDst_o+8*12)($pDst), $X[5] + mov (+$pDst_o+8*13)($pDst), $x6 + mov (+$pDst_o+8*14)($pDst), $x7 + + mov $X[7], %rax + mul %rax + mov %rax, $X[7] + mov %rdx, $A + + add $X[0], $X[0] + adc $X[1], $X[1] + adc $X[2], $X[2] + adc $X[3], $X[3] + adc $X[4], $X[4] + adc $X[5], $X[5] + adc $x6, $x6 + adc $x7, $x7 + adc \$0, $A + + add $tmp, $X[0] + + mov (+8*4)($pA), %rax + mul %rax + + add $X[6], $X[0] + adc %rax, $X[1] + adc \$0, %rdx + + mov %rdx, $tmp + + mov $X[0], (+$pDst_o+8*7)($pDst) + mov $X[1], (+$pDst_o+8*8)($pDst) + + mov (+8*5)($pA), %rax + mul %rax + + add $tmp, $X[2] + adc %rax, $X[3] + adc \$0, %rdx + + mov %rdx, $tmp + + mov $X[2], (+$pDst_o+8*9)($pDst) + mov $X[3], (+$pDst_o+8*10)($pDst) + + mov (+8*6)($pA), %rax + mul %rax + + add $tmp, $X[4] + adc %rax, $X[5] + adc \$0, %rdx + + mov $X[4], (+$pDst_o+8*11)($pDst) + mov $X[5], (+$pDst_o+8*12)($pDst) + + add %rdx, $x6 + adc $X[7], $x7 + adc \$0, $A + + mov $x6, (+$pDst_o+8*13)($pDst) + mov $x7, (+$pDst_o+8*14)($pDst) + mov $A, (+$pDst_o+8*15)($pDst) +___ +} + +# +# sqr_reduce: subroutine to compute Result = reduce(Result * Result) +# +# input and result also in: r9, r8, r15, r14, r13, r12, r11, r10 +# +$code.=<<___; +.type sqr_reduce,\@abi-omnipotent +.align 16 +sqr_reduce: + mov (+$pResult_offset+8)(%rsp), %rcx +___ + &SQR_512("%rsp+$tmp16_offset+8", "%rcx", [map("%r$_",(10..15,8..9))], "%rbx", "%rbp", "%rsi", "%rdi"); +$code.=<<___; + # tail recursion optimization: jmp to mont_reduce and return from there + jmp mont_reduce + # call mont_reduce + # ret +.size sqr_reduce,.-sqr_reduce +___ +}}} + +# +# MAIN FUNCTION +# + +#mod_exp_512(UINT64 *result, /* 512 bits, 8 qwords */ +# UINT64 *g, /* 512 bits, 8 qwords */ +# UINT64 *exp, /* 512 bits, 8 qwords */ +# struct mod_ctx_512 *data) + +# window size = 5 +# table size = 2^5 = 32 +#table_entries equ 32 +#table_size equ table_entries * 8 +$code.=<<___; +.globl mod_exp_512 +.type mod_exp_512,\@function,4 +mod_exp_512: + push %rbp + push %rbx + push %r12 + push %r13 + push %r14 + push %r15 + + # adjust stack down and then align it with cache boundary + mov %rsp, %r8 + sub \$$mem_size, %rsp + and \$-64, %rsp + + # store previous stack pointer and arguments + mov %r8, (+$rsp_offset)(%rsp) + mov %rdi, (+$pResult_offset)(%rsp) + mov %rsi, (+$pG_offset)(%rsp) + mov %rcx, (+$pData_offset)(%rsp) +.Lbody: + # transform g into montgomery space + # GT = reduce(g * C2) = reduce(g * (2^256)) + # reduce expects to have the input in [tmp16] + pxor %xmm4, %xmm4 + movdqu (+16*0)(%rsi), %xmm0 + movdqu (+16*1)(%rsi), %xmm1 + movdqu (+16*2)(%rsi), %xmm2 + movdqu (+16*3)(%rsi), %xmm3 + movdqa %xmm4, (+$tmp16_offset+16*0)(%rsp) + movdqa %xmm4, (+$tmp16_offset+16*1)(%rsp) + movdqa %xmm4, (+$tmp16_offset+16*6)(%rsp) + movdqa %xmm4, (+$tmp16_offset+16*7)(%rsp) + movdqa %xmm0, (+$tmp16_offset+16*2)(%rsp) + movdqa %xmm1, (+$tmp16_offset+16*3)(%rsp) + movdqa %xmm2, (+$tmp16_offset+16*4)(%rsp) + movdqa %xmm3, (+$tmp16_offset+16*5)(%rsp) + + # load pExp before rdx gets blown away + movdqu (+16*0)(%rdx), %xmm0 + movdqu (+16*1)(%rdx), %xmm1 + movdqu (+16*2)(%rdx), %xmm2 + movdqu (+16*3)(%rdx), %xmm3 + + lea (+$GT_offset)(%rsp), %rbx + mov %rbx, (+$red_result_addr_offset)(%rsp) + call mont_reduce + + # Initialize tmp = C + lea (+$tmp_offset)(%rsp), %rcx + xor %rax, %rax + mov %rax, (+8*0)(%rcx) + mov %rax, (+8*1)(%rcx) + mov %rax, (+8*3)(%rcx) + mov %rax, (+8*4)(%rcx) + mov %rax, (+8*5)(%rcx) + mov %rax, (+8*6)(%rcx) + mov %rax, (+8*7)(%rcx) + mov %rax, (+$exp_offset+8*8)(%rsp) + movq \$1, (+8*2)(%rcx) + + lea (+$garray_offset)(%rsp), %rbp + mov %rcx, %rsi # pTmp + mov %rbp, %rdi # Garray[][0] +___ + + &swizzle("%rdi", "%rcx", "%rax", "%rbx"); + + # for (rax = 31; rax != 0; rax--) { + # tmp = reduce(tmp * G) + # swizzle(pg, tmp); + # pg += 2; } +$code.=<<___; + mov \$31, %rax + mov %rax, (+$i_offset)(%rsp) + mov %rbp, (+$pg_offset)(%rsp) + # rsi -> pTmp + mov %rsi, (+$red_result_addr_offset)(%rsp) + mov (+8*0)(%rsi), %r10 + mov (+8*1)(%rsi), %r11 + mov (+8*2)(%rsi), %r12 + mov (+8*3)(%rsi), %r13 + mov (+8*4)(%rsi), %r14 + mov (+8*5)(%rsi), %r15 + mov (+8*6)(%rsi), %r8 + mov (+8*7)(%rsi), %r9 +init_loop: + lea (+$GT_offset)(%rsp), %rdi + call mont_mul_a3b + lea (+$tmp_offset)(%rsp), %rsi + mov (+$pg_offset)(%rsp), %rbp + add \$2, %rbp + mov %rbp, (+$pg_offset)(%rsp) + mov %rsi, %rcx # rcx = rsi = addr of tmp +___ + + &swizzle("%rbp", "%rcx", "%rax", "%rbx"); +$code.=<<___; + mov (+$i_offset)(%rsp), %rax + sub \$1, %rax + mov %rax, (+$i_offset)(%rsp) + jne init_loop + + # + # Copy exponent onto stack + movdqa %xmm0, (+$exp_offset+16*0)(%rsp) + movdqa %xmm1, (+$exp_offset+16*1)(%rsp) + movdqa %xmm2, (+$exp_offset+16*2)(%rsp) + movdqa %xmm3, (+$exp_offset+16*3)(%rsp) + + + # + # Do exponentiation + # Initialize result to G[exp{511:507}] + mov (+$exp_offset+62)(%rsp), %eax + mov %rax, %rdx + shr \$11, %rax + and \$0x07FF, %edx + mov %edx, (+$exp_offset+62)(%rsp) + lea (+$garray_offset)(%rsp,%rax,2), %rsi + mov (+$pResult_offset)(%rsp), %rdx +___ + + &unswizzle("%rdx", "%rsi", "%rbp", "%rbx", "%rax"); + + # + # Loop variables + # rcx = [loop_idx] = index: 510-5 to 0 by 5 +$code.=<<___; + movq \$505, (+$loop_idx_offset)(%rsp) + + mov (+$pResult_offset)(%rsp), %rcx + mov %rcx, (+$red_result_addr_offset)(%rsp) + mov (+8*0)(%rcx), %r10 + mov (+8*1)(%rcx), %r11 + mov (+8*2)(%rcx), %r12 + mov (+8*3)(%rcx), %r13 + mov (+8*4)(%rcx), %r14 + mov (+8*5)(%rcx), %r15 + mov (+8*6)(%rcx), %r8 + mov (+8*7)(%rcx), %r9 + jmp sqr_2 + +main_loop_a3b: + call sqr_reduce + call sqr_reduce + call sqr_reduce +sqr_2: + call sqr_reduce + call sqr_reduce + + # + # Do multiply, first look up proper value in Garray + mov (+$loop_idx_offset)(%rsp), %rcx # bit index + mov %rcx, %rax + shr \$4, %rax # rax is word pointer + mov (+$exp_offset)(%rsp,%rax,2), %edx + and \$15, %rcx + shrq %cl, %rdx + and \$0x1F, %rdx + + lea (+$garray_offset)(%rsp,%rdx,2), %rsi + lea (+$tmp_offset)(%rsp), %rdx + mov %rdx, %rdi +___ + + &unswizzle("%rdx", "%rsi", "%rbp", "%rbx", "%rax"); + # rdi = tmp = pG + + # + # Call mod_mul_a1(pDst, pSrc1, pSrc2, pM, pData) + # result result pG M Data +$code.=<<___; + mov (+$pResult_offset)(%rsp), %rsi + call mont_mul_a3b + + # + # finish loop + mov (+$loop_idx_offset)(%rsp), %rcx + sub \$5, %rcx + mov %rcx, (+$loop_idx_offset)(%rsp) + jge main_loop_a3b + + # + +end_main_loop_a3b: + # transform result out of Montgomery space + # result = reduce(result) + mov (+$pResult_offset)(%rsp), %rdx + pxor %xmm4, %xmm4 + movdqu (+16*0)(%rdx), %xmm0 + movdqu (+16*1)(%rdx), %xmm1 + movdqu (+16*2)(%rdx), %xmm2 + movdqu (+16*3)(%rdx), %xmm3 + movdqa %xmm4, (+$tmp16_offset+16*4)(%rsp) + movdqa %xmm4, (+$tmp16_offset+16*5)(%rsp) + movdqa %xmm4, (+$tmp16_offset+16*6)(%rsp) + movdqa %xmm4, (+$tmp16_offset+16*7)(%rsp) + movdqa %xmm0, (+$tmp16_offset+16*0)(%rsp) + movdqa %xmm1, (+$tmp16_offset+16*1)(%rsp) + movdqa %xmm2, (+$tmp16_offset+16*2)(%rsp) + movdqa %xmm3, (+$tmp16_offset+16*3)(%rsp) + call mont_reduce + + # If result > m, subract m + # load result into r15:r8 + mov (+$pResult_offset)(%rsp), %rax + mov (+8*0)(%rax), %r8 + mov (+8*1)(%rax), %r9 + mov (+8*2)(%rax), %r10 + mov (+8*3)(%rax), %r11 + mov (+8*4)(%rax), %r12 + mov (+8*5)(%rax), %r13 + mov (+8*6)(%rax), %r14 + mov (+8*7)(%rax), %r15 + + # subtract m + mov (+$pData_offset)(%rsp), %rbx + add \$$M, %rbx + + sub (+8*0)(%rbx), %r8 + sbb (+8*1)(%rbx), %r9 + sbb (+8*2)(%rbx), %r10 + sbb (+8*3)(%rbx), %r11 + sbb (+8*4)(%rbx), %r12 + sbb (+8*5)(%rbx), %r13 + sbb (+8*6)(%rbx), %r14 + sbb (+8*7)(%rbx), %r15 + + # if Carry is clear, replace result with difference + mov (+8*0)(%rax), %rsi + mov (+8*1)(%rax), %rdi + mov (+8*2)(%rax), %rcx + mov (+8*3)(%rax), %rdx + cmovnc %r8, %rsi + cmovnc %r9, %rdi + cmovnc %r10, %rcx + cmovnc %r11, %rdx + mov %rsi, (+8*0)(%rax) + mov %rdi, (+8*1)(%rax) + mov %rcx, (+8*2)(%rax) + mov %rdx, (+8*3)(%rax) + + mov (+8*4)(%rax), %rsi + mov (+8*5)(%rax), %rdi + mov (+8*6)(%rax), %rcx + mov (+8*7)(%rax), %rdx + cmovnc %r12, %rsi + cmovnc %r13, %rdi + cmovnc %r14, %rcx + cmovnc %r15, %rdx + mov %rsi, (+8*4)(%rax) + mov %rdi, (+8*5)(%rax) + mov %rcx, (+8*6)(%rax) + mov %rdx, (+8*7)(%rax) + + mov (+$rsp_offset)(%rsp), %rsi + mov 0(%rsi),%r15 + mov 8(%rsi),%r14 + mov 16(%rsi),%r13 + mov 24(%rsi),%r12 + mov 32(%rsi),%rbx + mov 40(%rsi),%rbp + lea 48(%rsi),%rsp +.Lepilogue: + ret +.size mod_exp_512, . - mod_exp_512 +___ + +if ($win64) { +# EXCEPTION_DISPOSITION handler (EXCEPTION_RECORD *rec,ULONG64 frame, +# CONTEXT *context,DISPATCHER_CONTEXT *disp) +my $rec="%rcx"; +my $frame="%rdx"; +my $context="%r8"; +my $disp="%r9"; + +$code.=<<___; +.extern __imp_RtlVirtualUnwind +.type mod_exp_512_se_handler,\@abi-omnipotent +.align 16 +mod_exp_512_se_handler: + push %rsi + push %rdi + push %rbx + push %rbp + push %r12 + push %r13 + push %r14 + push %r15 + pushfq + sub \$64,%rsp + + mov 120($context),%rax # pull context->Rax + mov 248($context),%rbx # pull context->Rip + + lea .Lbody(%rip),%r10 + cmp %r10,%rbx # context->RipRsp + + lea .Lepilogue(%rip),%r10 + cmp %r10,%rbx # context->Rip>=epilogue label + jae .Lin_prologue + + mov $rsp_offset(%rax),%rax # pull saved Rsp + + mov 32(%rax),%rbx + mov 40(%rax),%rbp + mov 24(%rax),%r12 + mov 16(%rax),%r13 + mov 8(%rax),%r14 + mov 0(%rax),%r15 + lea 48(%rax),%rax + mov %rbx,144($context) # restore context->Rbx + mov %rbp,160($context) # restore context->Rbp + mov %r12,216($context) # restore context->R12 + mov %r13,224($context) # restore context->R13 + mov %r14,232($context) # restore context->R14 + mov %r15,240($context) # restore context->R15 + +.Lin_prologue: + mov 8(%rax),%rdi + mov 16(%rax),%rsi + mov %rax,152($context) # restore context->Rsp + mov %rsi,168($context) # restore context->Rsi + mov %rdi,176($context) # restore context->Rdi + + mov 40($disp),%rdi # disp->ContextRecord + mov $context,%rsi # context + mov \$154,%ecx # sizeof(CONTEXT) + .long 0xa548f3fc # cld; rep movsq + + mov $disp,%rsi + xor %rcx,%rcx # arg1, UNW_FLAG_NHANDLER + mov 8(%rsi),%rdx # arg2, disp->ImageBase + mov 0(%rsi),%r8 # arg3, disp->ControlPc + mov 16(%rsi),%r9 # arg4, disp->FunctionEntry + mov 40(%rsi),%r10 # disp->ContextRecord + lea 56(%rsi),%r11 # &disp->HandlerData + lea 24(%rsi),%r12 # &disp->EstablisherFrame + mov %r10,32(%rsp) # arg5 + mov %r11,40(%rsp) # arg6 + mov %r12,48(%rsp) # arg7 + mov %rcx,56(%rsp) # arg8, (NULL) + call *__imp_RtlVirtualUnwind(%rip) + + mov \$1,%eax # ExceptionContinueSearch + add \$64,%rsp + popfq + pop %r15 + pop %r14 + pop %r13 + pop %r12 + pop %rbp + pop %rbx + pop %rdi + pop %rsi + ret +.size mod_exp_512_se_handler,.-mod_exp_512_se_handler + +.section .pdata +.align 4 + .rva .LSEH_begin_mod_exp_512 + .rva .LSEH_end_mod_exp_512 + .rva .LSEH_info_mod_exp_512 + +.section .xdata +.align 8 +.LSEH_info_mod_exp_512: + .byte 9,0,0,0 + .rva mod_exp_512_se_handler +___ +} + +sub reg_part { +my ($reg,$conv)=@_; + if ($reg =~ /%r[0-9]+/) { $reg .= $conv; } + elsif ($conv eq "b") { $reg =~ s/%[er]([^x]+)x?/%$1l/; } + elsif ($conv eq "w") { $reg =~ s/%[er](.+)/%$1/; } + elsif ($conv eq "d") { $reg =~ s/%[er](.+)/%e$1/; } + return $reg; +} + +$code =~ s/(%[a-z0-9]+)#([bwd])/reg_part($1,$2)/gem; +$code =~ s/\`([^\`]*)\`/eval $1/gem; +$code =~ s/(\(\+[^)]+\))/eval $1/gem; +print $code; +close STDOUT; diff --git a/crypto/openssl/crypto/bn/asm/x86-gf2m.pl b/crypto/openssl/crypto/bn/asm/x86-gf2m.pl new file mode 100644 index 0000000000..808a1e5969 --- /dev/null +++ b/crypto/openssl/crypto/bn/asm/x86-gf2m.pl @@ -0,0 +1,313 @@ +#!/usr/bin/env perl +# +# ==================================================================== +# Written by Andy Polyakov for the OpenSSL +# project. The module is, however, dual licensed under OpenSSL and +# CRYPTOGAMS licenses depending on where you obtain it. For further +# details see http://www.openssl.org/~appro/cryptogams/. +# ==================================================================== +# +# May 2011 +# +# The module implements bn_GF2m_mul_2x2 polynomial multiplication used +# in bn_gf2m.c. It's kind of low-hanging mechanical port from C for +# the time being... Except that it has three code paths: pure integer +# code suitable for any x86 CPU, MMX code suitable for PIII and later +# and PCLMULQDQ suitable for Westmere and later. Improvement varies +# from one benchmark and µ-arch to another. Below are interval values +# for 163- and 571-bit ECDH benchmarks relative to compiler-generated +# code: +# +# PIII 16%-30% +# P4 12%-12% +# Opteron 18%-40% +# Core2 19%-44% +# Atom 38%-64% +# Westmere 53%-121%(PCLMULQDQ)/20%-32%(MMX) +# Sandy Bridge 72%-127%(PCLMULQDQ)/27%-23%(MMX) +# +# Note that above improvement coefficients are not coefficients for +# bn_GF2m_mul_2x2 itself. For example 120% ECDH improvement is result +# of bn_GF2m_mul_2x2 being >4x faster. As it gets faster, benchmark +# is more and more dominated by other subroutines, most notably by +# BN_GF2m_mod[_mul]_arr... + +$0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1; +push(@INC,"${dir}","${dir}../../perlasm"); +require "x86asm.pl"; + +&asm_init($ARGV[0],$0,$x86only = $ARGV[$#ARGV] eq "386"); + +$sse2=0; +for (@ARGV) { $sse2=1 if (/-DOPENSSL_IA32_SSE2/); } + +&external_label("OPENSSL_ia32cap_P") if ($sse2); + +$a="eax"; +$b="ebx"; +($a1,$a2,$a4)=("ecx","edx","ebp"); + +$R="mm0"; +@T=("mm1","mm2"); +($A,$B,$B30,$B31)=("mm2","mm3","mm4","mm5"); +@i=("esi","edi"); + + if (!$x86only) { +&function_begin_B("_mul_1x1_mmx"); + &sub ("esp",32+4); + &mov ($a1,$a); + &lea ($a2,&DWP(0,$a,$a)); + &and ($a1,0x3fffffff); + &lea ($a4,&DWP(0,$a2,$a2)); + &mov (&DWP(0*4,"esp"),0); + &and ($a2,0x7fffffff); + &movd ($A,$a); + &movd ($B,$b); + &mov (&DWP(1*4,"esp"),$a1); # a1 + &xor ($a1,$a2); # a1^a2 + &pxor ($B31,$B31); + &pxor ($B30,$B30); + &mov (&DWP(2*4,"esp"),$a2); # a2 + &xor ($a2,$a4); # a2^a4 + &mov (&DWP(3*4,"esp"),$a1); # a1^a2 + &pcmpgtd($B31,$A); # broadcast 31st bit + &paddd ($A,$A); # $A<<=1 + &xor ($a1,$a2); # a1^a4=a1^a2^a2^a4 + &mov (&DWP(4*4,"esp"),$a4); # a4 + &xor ($a4,$a2); # a2=a4^a2^a4 + &pand ($B31,$B); + &pcmpgtd($B30,$A); # broadcast 30th bit + &mov (&DWP(5*4,"esp"),$a1); # a1^a4 + &xor ($a4,$a1); # a1^a2^a4 + &psllq ($B31,31); + &pand ($B30,$B); + &mov (&DWP(6*4,"esp"),$a2); # a2^a4 + &mov (@i[0],0x7); + &mov (&DWP(7*4,"esp"),$a4); # a1^a2^a4 + &mov ($a4,@i[0]); + &and (@i[0],$b); + &shr ($b,3); + &mov (@i[1],$a4); + &psllq ($B30,30); + &and (@i[1],$b); + &shr ($b,3); + &movd ($R,&DWP(0,"esp",@i[0],4)); + &mov (@i[0],$a4); + &and (@i[0],$b); + &shr ($b,3); + for($n=1;$n<9;$n++) { + &movd (@T[1],&DWP(0,"esp",@i[1],4)); + &mov (@i[1],$a4); + &psllq (@T[1],3*$n); + &and (@i[1],$b); + &shr ($b,3); + &pxor ($R,@T[1]); + + push(@i,shift(@i)); push(@T,shift(@T)); + } + &movd (@T[1],&DWP(0,"esp",@i[1],4)); + &pxor ($R,$B30); + &psllq (@T[1],3*$n++); + &pxor ($R,@T[1]); + + &movd (@T[0],&DWP(0,"esp",@i[0],4)); + &pxor ($R,$B31); + &psllq (@T[0],3*$n); + &add ("esp",32+4); + &pxor ($R,@T[0]); + &ret (); +&function_end_B("_mul_1x1_mmx"); + } + +($lo,$hi)=("eax","edx"); +@T=("ecx","ebp"); + +&function_begin_B("_mul_1x1_ialu"); + &sub ("esp",32+4); + &mov ($a1,$a); + &lea ($a2,&DWP(0,$a,$a)); + &lea ($a4,&DWP(0,"",$a,4)); + &and ($a1,0x3fffffff); + &lea (@i[1],&DWP(0,$lo,$lo)); + &sar ($lo,31); # broadcast 31st bit + &mov (&DWP(0*4,"esp"),0); + &and ($a2,0x7fffffff); + &mov (&DWP(1*4,"esp"),$a1); # a1 + &xor ($a1,$a2); # a1^a2 + &mov (&DWP(2*4,"esp"),$a2); # a2 + &xor ($a2,$a4); # a2^a4 + &mov (&DWP(3*4,"esp"),$a1); # a1^a2 + &xor ($a1,$a2); # a1^a4=a1^a2^a2^a4 + &mov (&DWP(4*4,"esp"),$a4); # a4 + &xor ($a4,$a2); # a2=a4^a2^a4 + &mov (&DWP(5*4,"esp"),$a1); # a1^a4 + &xor ($a4,$a1); # a1^a2^a4 + &sar (@i[1],31); # broardcast 30th bit + &and ($lo,$b); + &mov (&DWP(6*4,"esp"),$a2); # a2^a4 + &and (@i[1],$b); + &mov (&DWP(7*4,"esp"),$a4); # a1^a2^a4 + &mov ($hi,$lo); + &shl ($lo,31); + &mov (@T[0],@i[1]); + &shr ($hi,1); + + &mov (@i[0],0x7); + &shl (@i[1],30); + &and (@i[0],$b); + &shr (@T[0],2); + &xor ($lo,@i[1]); + + &shr ($b,3); + &mov (@i[1],0x7); # 5-byte instruction!? + &and (@i[1],$b); + &shr ($b,3); + &xor ($hi,@T[0]); + &xor ($lo,&DWP(0,"esp",@i[0],4)); + &mov (@i[0],0x7); + &and (@i[0],$b); + &shr ($b,3); + for($n=1;$n<9;$n++) { + &mov (@T[1],&DWP(0,"esp",@i[1],4)); + &mov (@i[1],0x7); + &mov (@T[0],@T[1]); + &shl (@T[1],3*$n); + &and (@i[1],$b); + &shr (@T[0],32-3*$n); + &xor ($lo,@T[1]); + &shr ($b,3); + &xor ($hi,@T[0]); + + push(@i,shift(@i)); push(@T,shift(@T)); + } + &mov (@T[1],&DWP(0,"esp",@i[1],4)); + &mov (@T[0],@T[1]); + &shl (@T[1],3*$n); + &mov (@i[1],&DWP(0,"esp",@i[0],4)); + &shr (@T[0],32-3*$n); $n++; + &mov (@i[0],@i[1]); + &xor ($lo,@T[1]); + &shl (@i[1],3*$n); + &xor ($hi,@T[0]); + &shr (@i[0],32-3*$n); + &xor ($lo,@i[1]); + &xor ($hi,@i[0]); + + &add ("esp",32+4); + &ret (); +&function_end_B("_mul_1x1_ialu"); + +# void bn_GF2m_mul_2x2(BN_ULONG *r, BN_ULONG a1, BN_ULONG a0, BN_ULONG b1, BN_ULONG b0); +&function_begin_B("bn_GF2m_mul_2x2"); +if (!$x86only) { + &picmeup("edx","OPENSSL_ia32cap_P"); + &mov ("eax",&DWP(0,"edx")); + &mov ("edx",&DWP(4,"edx")); + &test ("eax",1<<23); # check MMX bit + &jz (&label("ialu")); +if ($sse2) { + &test ("eax",1<<24); # check FXSR bit + &jz (&label("mmx")); + &test ("edx",1<<1); # check PCLMULQDQ bit + &jz (&label("mmx")); + + &movups ("xmm0",&QWP(8,"esp")); + &shufps ("xmm0","xmm0",0b10110001); + &pclmulqdq ("xmm0","xmm0",1); + &mov ("eax",&DWP(4,"esp")); + &movups (&QWP(0,"eax"),"xmm0"); + &ret (); + +&set_label("mmx",16); +} + &push ("ebp"); + &push ("ebx"); + &push ("esi"); + &push ("edi"); + &mov ($a,&wparam(1)); + &mov ($b,&wparam(3)); + &call ("_mul_1x1_mmx"); # a1·b1 + &movq ("mm7",$R); + + &mov ($a,&wparam(2)); + &mov ($b,&wparam(4)); + &call ("_mul_1x1_mmx"); # a0·b0 + &movq ("mm6",$R); + + &mov ($a,&wparam(1)); + &mov ($b,&wparam(3)); + &xor ($a,&wparam(2)); + &xor ($b,&wparam(4)); + &call ("_mul_1x1_mmx"); # (a0+a1)·(b0+b1) + &pxor ($R,"mm7"); + &mov ($a,&wparam(0)); + &pxor ($R,"mm6"); # (a0+a1)·(b0+b1)-a1·b1-a0·b0 + + &movq ($A,$R); + &psllq ($R,32); + &pop ("edi"); + &psrlq ($A,32); + &pop ("esi"); + &pxor ($R,"mm6"); + &pop ("ebx"); + &pxor ($A,"mm7"); + &movq (&QWP(0,$a),$R); + &pop ("ebp"); + &movq (&QWP(8,$a),$A); + &emms (); + &ret (); +&set_label("ialu",16); +} + &push ("ebp"); + &push ("ebx"); + &push ("esi"); + &push ("edi"); + &stack_push(4+1); + + &mov ($a,&wparam(1)); + &mov ($b,&wparam(3)); + &call ("_mul_1x1_ialu"); # a1·b1 + &mov (&DWP(8,"esp"),$lo); + &mov (&DWP(12,"esp"),$hi); + + &mov ($a,&wparam(2)); + &mov ($b,&wparam(4)); + &call ("_mul_1x1_ialu"); # a0·b0 + &mov (&DWP(0,"esp"),$lo); + &mov (&DWP(4,"esp"),$hi); + + &mov ($a,&wparam(1)); + &mov ($b,&wparam(3)); + &xor ($a,&wparam(2)); + &xor ($b,&wparam(4)); + &call ("_mul_1x1_ialu"); # (a0+a1)·(b0+b1) + + &mov ("ebp",&wparam(0)); + @r=("ebx","ecx","edi","esi"); + &mov (@r[0],&DWP(0,"esp")); + &mov (@r[1],&DWP(4,"esp")); + &mov (@r[2],&DWP(8,"esp")); + &mov (@r[3],&DWP(12,"esp")); + + &xor ($lo,$hi); + &xor ($hi,@r[1]); + &xor ($lo,@r[0]); + &mov (&DWP(0,"ebp"),@r[0]); + &xor ($hi,@r[2]); + &mov (&DWP(12,"ebp"),@r[3]); + &xor ($lo,@r[3]); + &stack_pop(4+1); + &xor ($hi,@r[3]); + &pop ("edi"); + &xor ($lo,$hi); + &pop ("esi"); + &mov (&DWP(8,"ebp"),$hi); + &pop ("ebx"); + &mov (&DWP(4,"ebp"),$lo); + &pop ("ebp"); + &ret (); +&function_end_B("bn_GF2m_mul_2x2"); + +&asciz ("GF(2^m) Multiplication for x86, CRYPTOGAMS by "); + +&asm_finish(); diff --git a/crypto/openssl/crypto/bn/asm/x86_64-gf2m.pl b/crypto/openssl/crypto/bn/asm/x86_64-gf2m.pl new file mode 100644 index 0000000000..1658acbbdd --- /dev/null +++ b/crypto/openssl/crypto/bn/asm/x86_64-gf2m.pl @@ -0,0 +1,389 @@ +#!/usr/bin/env perl +# +# ==================================================================== +# Written by Andy Polyakov for the OpenSSL +# project. The module is, however, dual licensed under OpenSSL and +# CRYPTOGAMS licenses depending on where you obtain it. For further +# details see http://www.openssl.org/~appro/cryptogams/. +# ==================================================================== +# +# May 2011 +# +# The module implements bn_GF2m_mul_2x2 polynomial multiplication used +# in bn_gf2m.c. It's kind of low-hanging mechanical port from C for +# the time being... Except that it has two code paths: code suitable +# for any x86_64 CPU and PCLMULQDQ one suitable for Westmere and +# later. Improvement varies from one benchmark and µ-arch to another. +# Vanilla code path is at most 20% faster than compiler-generated code +# [not very impressive], while PCLMULQDQ - whole 85%-160% better on +# 163- and 571-bit ECDH benchmarks on Intel CPUs. Keep in mind that +# these coefficients are not ones for bn_GF2m_mul_2x2 itself, as not +# all CPU time is burnt in it... + +$flavour = shift; +$output = shift; +if ($flavour =~ /\./) { $output = $flavour; undef $flavour; } + +$win64=0; $win64=1 if ($flavour =~ /[nm]asm|mingw64/ || $output =~ /\.asm$/); + +$0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1; +( $xlate="${dir}x86_64-xlate.pl" and -f $xlate ) or +( $xlate="${dir}../../perlasm/x86_64-xlate.pl" and -f $xlate) or +die "can't locate x86_64-xlate.pl"; + +open STDOUT,"| $^X $xlate $flavour $output"; + +($lo,$hi)=("%rax","%rdx"); $a=$lo; +($i0,$i1)=("%rsi","%rdi"); +($t0,$t1)=("%rbx","%rcx"); +($b,$mask)=("%rbp","%r8"); +($a1,$a2,$a4,$a8,$a12,$a48)=map("%r$_",(9..15)); +($R,$Tx)=("%xmm0","%xmm1"); + +$code.=<<___; +.text + +.type _mul_1x1,\@abi-omnipotent +.align 16 +_mul_1x1: + sub \$128+8,%rsp + mov \$-1,$a1 + lea ($a,$a),$i0 + shr \$3,$a1 + lea (,$a,4),$i1 + and $a,$a1 # a1=a&0x1fffffffffffffff + lea (,$a,8),$a8 + sar \$63,$a # broadcast 63rd bit + lea ($a1,$a1),$a2 + sar \$63,$i0 # broadcast 62nd bit + lea (,$a1,4),$a4 + and $b,$a + sar \$63,$i1 # boardcast 61st bit + mov $a,$hi # $a is $lo + shl \$63,$lo + and $b,$i0 + shr \$1,$hi + mov $i0,$t1 + shl \$62,$i0 + and $b,$i1 + shr \$2,$t1 + xor $i0,$lo + mov $i1,$t0 + shl \$61,$i1 + xor $t1,$hi + shr \$3,$t0 + xor $i1,$lo + xor $t0,$hi + + mov $a1,$a12 + movq \$0,0(%rsp) # tab[0]=0 + xor $a2,$a12 # a1^a2 + mov $a1,8(%rsp) # tab[1]=a1 + mov $a4,$a48 + mov $a2,16(%rsp) # tab[2]=a2 + xor $a8,$a48 # a4^a8 + mov $a12,24(%rsp) # tab[3]=a1^a2 + + xor $a4,$a1 + mov $a4,32(%rsp) # tab[4]=a4 + xor $a4,$a2 + mov $a1,40(%rsp) # tab[5]=a1^a4 + xor $a4,$a12 + mov $a2,48(%rsp) # tab[6]=a2^a4 + xor $a48,$a1 # a1^a4^a4^a8=a1^a8 + mov $a12,56(%rsp) # tab[7]=a1^a2^a4 + xor $a48,$a2 # a2^a4^a4^a8=a1^a8 + + mov $a8,64(%rsp) # tab[8]=a8 + xor $a48,$a12 # a1^a2^a4^a4^a8=a1^a2^a8 + mov $a1,72(%rsp) # tab[9]=a1^a8 + xor $a4,$a1 # a1^a8^a4 + mov $a2,80(%rsp) # tab[10]=a2^a8 + xor $a4,$a2 # a2^a8^a4 + mov $a12,88(%rsp) # tab[11]=a1^a2^a8 + + xor $a4,$a12 # a1^a2^a8^a4 + mov $a48,96(%rsp) # tab[12]=a4^a8 + mov $mask,$i0 + mov $a1,104(%rsp) # tab[13]=a1^a4^a8 + and $b,$i0 + mov $a2,112(%rsp) # tab[14]=a2^a4^a8 + shr \$4,$b + mov $a12,120(%rsp) # tab[15]=a1^a2^a4^a8 + mov $mask,$i1 + and $b,$i1 + shr \$4,$b + + movq (%rsp,$i0,8),$R # half of calculations is done in SSE2 + mov $mask,$i0 + and $b,$i0 + shr \$4,$b +___ + for ($n=1;$n<8;$n++) { + $code.=<<___; + mov (%rsp,$i1,8),$t1 + mov $mask,$i1 + mov $t1,$t0 + shl \$`8*$n-4`,$t1 + and $b,$i1 + movq (%rsp,$i0,8),$Tx + shr \$`64-(8*$n-4)`,$t0 + xor $t1,$lo + pslldq \$$n,$Tx + mov $mask,$i0 + shr \$4,$b + xor $t0,$hi + and $b,$i0 + shr \$4,$b + pxor $Tx,$R +___ + } +$code.=<<___; + mov (%rsp,$i1,8),$t1 + mov $t1,$t0 + shl \$`8*$n-4`,$t1 + movq $R,$i0 + shr \$`64-(8*$n-4)`,$t0 + xor $t1,$lo + psrldq \$8,$R + xor $t0,$hi + movq $R,$i1 + xor $i0,$lo + xor $i1,$hi + + add \$128+8,%rsp + ret +.Lend_mul_1x1: +.size _mul_1x1,.-_mul_1x1 +___ + +($rp,$a1,$a0,$b1,$b0) = $win64? ("%rcx","%rdx","%r8", "%r9","%r10") : # Win64 order + ("%rdi","%rsi","%rdx","%rcx","%r8"); # Unix order + +$code.=<<___; +.extern OPENSSL_ia32cap_P +.globl bn_GF2m_mul_2x2 +.type bn_GF2m_mul_2x2,\@abi-omnipotent +.align 16 +bn_GF2m_mul_2x2: + mov OPENSSL_ia32cap_P(%rip),%rax + bt \$33,%rax + jnc .Lvanilla_mul_2x2 + + movq $a1,%xmm0 + movq $b1,%xmm1 + movq $a0,%xmm2 +___ +$code.=<<___ if ($win64); + movq 40(%rsp),%xmm3 +___ +$code.=<<___ if (!$win64); + movq $b0,%xmm3 +___ +$code.=<<___; + movdqa %xmm0,%xmm4 + movdqa %xmm1,%xmm5 + pclmulqdq \$0,%xmm1,%xmm0 # a1·b1 + pxor %xmm2,%xmm4 + pxor %xmm3,%xmm5 + pclmulqdq \$0,%xmm3,%xmm2 # a0·b0 + pclmulqdq \$0,%xmm5,%xmm4 # (a0+a1)·(b0+b1) + xorps %xmm0,%xmm4 + xorps %xmm2,%xmm4 # (a0+a1)·(b0+b1)-a0·b0-a1·b1 + movdqa %xmm4,%xmm5 + pslldq \$8,%xmm4 + psrldq \$8,%xmm5 + pxor %xmm4,%xmm2 + pxor %xmm5,%xmm0 + movdqu %xmm2,0($rp) + movdqu %xmm0,16($rp) + ret + +.align 16 +.Lvanilla_mul_2x2: + lea -8*17(%rsp),%rsp +___ +$code.=<<___ if ($win64); + mov `8*17+40`(%rsp),$b0 + mov %rdi,8*15(%rsp) + mov %rsi,8*16(%rsp) +___ +$code.=<<___; + mov %r14,8*10(%rsp) + mov %r13,8*11(%rsp) + mov %r12,8*12(%rsp) + mov %rbp,8*13(%rsp) + mov %rbx,8*14(%rsp) +.Lbody_mul_2x2: + mov $rp,32(%rsp) # save the arguments + mov $a1,40(%rsp) + mov $a0,48(%rsp) + mov $b1,56(%rsp) + mov $b0,64(%rsp) + + mov \$0xf,$mask + mov $a1,$a + mov $b1,$b + call _mul_1x1 # a1·b1 + mov $lo,16(%rsp) + mov $hi,24(%rsp) + + mov 48(%rsp),$a + mov 64(%rsp),$b + call _mul_1x1 # a0·b0 + mov $lo,0(%rsp) + mov $hi,8(%rsp) + + mov 40(%rsp),$a + mov 56(%rsp),$b + xor 48(%rsp),$a + xor 64(%rsp),$b + call _mul_1x1 # (a0+a1)·(b0+b1) +___ + @r=("%rbx","%rcx","%rdi","%rsi"); +$code.=<<___; + mov 0(%rsp),@r[0] + mov 8(%rsp),@r[1] + mov 16(%rsp),@r[2] + mov 24(%rsp),@r[3] + mov 32(%rsp),%rbp + + xor $hi,$lo + xor @r[1],$hi + xor @r[0],$lo + mov @r[0],0(%rbp) + xor @r[2],$hi + mov @r[3],24(%rbp) + xor @r[3],$lo + xor @r[3],$hi + xor $hi,$lo + mov $hi,16(%rbp) + mov $lo,8(%rbp) + + mov 8*10(%rsp),%r14 + mov 8*11(%rsp),%r13 + mov 8*12(%rsp),%r12 + mov 8*13(%rsp),%rbp + mov 8*14(%rsp),%rbx +___ +$code.=<<___ if ($win64); + mov 8*15(%rsp),%rdi + mov 8*16(%rsp),%rsi +___ +$code.=<<___; + lea 8*17(%rsp),%rsp + ret +.Lend_mul_2x2: +.size bn_GF2m_mul_2x2,.-bn_GF2m_mul_2x2 +.asciz "GF(2^m) Multiplication for x86_64, CRYPTOGAMS by " +.align 16 +___ + +# EXCEPTION_DISPOSITION handler (EXCEPTION_RECORD *rec,ULONG64 frame, +# CONTEXT *context,DISPATCHER_CONTEXT *disp) +if ($win64) { +$rec="%rcx"; +$frame="%rdx"; +$context="%r8"; +$disp="%r9"; + +$code.=<<___; +.extern __imp_RtlVirtualUnwind + +.type se_handler,\@abi-omnipotent +.align 16 +se_handler: + push %rsi + push %rdi + push %rbx + push %rbp + push %r12 + push %r13 + push %r14 + push %r15 + pushfq + sub \$64,%rsp + + mov 152($context),%rax # pull context->Rsp + mov 248($context),%rbx # pull context->Rip + + lea .Lbody_mul_2x2(%rip),%r10 + cmp %r10,%rbx # context->Rip<"prologue" label + jb .Lin_prologue + + mov 8*10(%rax),%r14 # mimic epilogue + mov 8*11(%rax),%r13 + mov 8*12(%rax),%r12 + mov 8*13(%rax),%rbp + mov 8*14(%rax),%rbx + mov 8*15(%rax),%rdi + mov 8*16(%rax),%rsi + + mov %rbx,144($context) # restore context->Rbx + mov %rbp,160($context) # restore context->Rbp + mov %rsi,168($context) # restore context->Rsi + mov %rdi,176($context) # restore context->Rdi + mov %r12,216($context) # restore context->R12 + mov %r13,224($context) # restore context->R13 + mov %r14,232($context) # restore context->R14 + +.Lin_prologue: + lea 8*17(%rax),%rax + mov %rax,152($context) # restore context->Rsp + + mov 40($disp),%rdi # disp->ContextRecord + mov $context,%rsi # context + mov \$154,%ecx # sizeof(CONTEXT) + .long 0xa548f3fc # cld; rep movsq + + mov $disp,%rsi + xor %rcx,%rcx # arg1, UNW_FLAG_NHANDLER + mov 8(%rsi),%rdx # arg2, disp->ImageBase + mov 0(%rsi),%r8 # arg3, disp->ControlPc + mov 16(%rsi),%r9 # arg4, disp->FunctionEntry + mov 40(%rsi),%r10 # disp->ContextRecord + lea 56(%rsi),%r11 # &disp->HandlerData + lea 24(%rsi),%r12 # &disp->EstablisherFrame + mov %r10,32(%rsp) # arg5 + mov %r11,40(%rsp) # arg6 + mov %r12,48(%rsp) # arg7 + mov %rcx,56(%rsp) # arg8, (NULL) + call *__imp_RtlVirtualUnwind(%rip) + + mov \$1,%eax # ExceptionContinueSearch + add \$64,%rsp + popfq + pop %r15 + pop %r14 + pop %r13 + pop %r12 + pop %rbp + pop %rbx + pop %rdi + pop %rsi + ret +.size se_handler,.-se_handler + +.section .pdata +.align 4 + .rva _mul_1x1 + .rva .Lend_mul_1x1 + .rva .LSEH_info_1x1 + + .rva .Lvanilla_mul_2x2 + .rva .Lend_mul_2x2 + .rva .LSEH_info_2x2 +.section .xdata +.align 8 +.LSEH_info_1x1: + .byte 0x01,0x07,0x02,0x00 + .byte 0x07,0x01,0x11,0x00 # sub rsp,128+8 +.LSEH_info_2x2: + .byte 9,0,0,0 + .rva se_handler +___ +} + +$code =~ s/\`([^\`]*)\`/eval($1)/gem; +print $code; +close STDOUT; diff --git a/crypto/openssl/crypto/bn/asm/x86_64-mont.pl b/crypto/openssl/crypto/bn/asm/x86_64-mont.pl index 3b7a6f243f..5d79b35e1c 100755 --- a/crypto/openssl/crypto/bn/asm/x86_64-mont.pl +++ b/crypto/openssl/crypto/bn/asm/x86_64-mont.pl @@ -1,7 +1,7 @@ #!/usr/bin/env perl # ==================================================================== -# Written by Andy Polyakov for the OpenSSL +# Written by Andy Polyakov for the OpenSSL # project. The module is, however, dual licensed under OpenSSL and # CRYPTOGAMS licenses depending on where you obtain it. For further # details see http://www.openssl.org/~appro/cryptogams/. @@ -15,6 +15,20 @@ # respectful 50%. It remains to be seen if loop unrolling and # dedicated squaring routine can provide further improvement... +# July 2011. +# +# Add dedicated squaring procedure. Performance improvement varies +# from platform to platform, but in average it's ~5%/15%/25%/33% +# for 512-/1024-/2048-/4096-bit RSA *sign* benchmarks respectively. + +# August 2011. +# +# Unroll and modulo-schedule inner loops in such manner that they +# are "fallen through" for input lengths of 8, which is critical for +# 1024-bit RSA *sign*. Average performance improvement in comparison +# to *initial* version of this module from 2005 is ~0%/30%/40%/45% +# for 512-/1024-/2048-/4096-bit RSA *sign* benchmarks respectively. + $flavour = shift; $output = shift; if ($flavour =~ /\./) { $output = $flavour; undef $flavour; } @@ -37,7 +51,6 @@ $n0="%r8"; # const BN_ULONG *n0, $num="%r9"; # int num); $lo0="%r10"; $hi0="%r11"; -$bp="%r12"; # reassign $bp $hi1="%r13"; $i="%r14"; $j="%r15"; @@ -51,6 +64,16 @@ $code=<<___; .type bn_mul_mont,\@function,6 .align 16 bn_mul_mont: + test \$3,${num}d + jnz .Lmul_enter + cmp \$8,${num}d + jb .Lmul_enter + cmp $ap,$bp + jne .Lmul4x_enter + jmp .Lsqr4x_enter + +.align 16 +.Lmul_enter: push %rbx push %rbp push %r12 @@ -66,48 +89,66 @@ bn_mul_mont: and \$-1024,%rsp # minimize TLB usage mov %r11,8(%rsp,$num,8) # tp[num+1]=%rsp -.Lprologue: - mov %rdx,$bp # $bp reassigned, remember? - +.Lmul_body: + mov $bp,%r12 # reassign $bp +___ + $bp="%r12"; +$code.=<<___; mov ($n0),$n0 # pull n0[0] value + mov ($bp),$m0 # m0=bp[0] + mov ($ap),%rax xor $i,$i # i=0 xor $j,$j # j=0 - mov ($bp),$m0 # m0=bp[0] - mov ($ap),%rax + mov $n0,$m1 mulq $m0 # ap[0]*bp[0] mov %rax,$lo0 - mov %rdx,$hi0 + mov ($np),%rax - imulq $n0,%rax # "tp[0]"*n0 - mov %rax,$m1 + imulq $lo0,$m1 # "tp[0]"*n0 + mov %rdx,$hi0 - mulq ($np) # np[0]*m1 - add $lo0,%rax # discarded + mulq $m1 # np[0]*m1 + add %rax,$lo0 # discarded + mov 8($ap),%rax adc \$0,%rdx mov %rdx,$hi1 lea 1($j),$j # j++ + jmp .L1st_enter + +.align 16 .L1st: + add %rax,$hi1 mov ($ap,$j,8),%rax - mulq $m0 # ap[j]*bp[0] - add $hi0,%rax adc \$0,%rdx - mov %rax,$lo0 + add $hi0,$hi1 # np[j]*m1+ap[j]*bp[0] + mov $lo0,$hi0 + adc \$0,%rdx + mov $hi1,-16(%rsp,$j,8) # tp[j-1] + mov %rdx,$hi1 + +.L1st_enter: + mulq $m0 # ap[j]*bp[0] + add %rax,$hi0 mov ($np,$j,8),%rax - mov %rdx,$hi0 + adc \$0,%rdx + lea 1($j),$j # j++ + mov %rdx,$lo0 mulq $m1 # np[j]*m1 - add $hi1,%rax - lea 1($j),$j # j++ + cmp $num,$j + jne .L1st + + add %rax,$hi1 + mov ($ap),%rax # ap[0] adc \$0,%rdx - add $lo0,%rax # np[j]*m1+ap[j]*bp[0] + add $hi0,$hi1 # np[j]*m1+ap[j]*bp[0] adc \$0,%rdx - mov %rax,-16(%rsp,$j,8) # tp[j-1] - cmp $num,$j + mov $hi1,-16(%rsp,$j,8) # tp[j-1] mov %rdx,$hi1 - jl .L1st + mov $lo0,$hi0 xor %rdx,%rdx add $hi0,$hi1 @@ -116,50 +157,64 @@ bn_mul_mont: mov %rdx,(%rsp,$num,8) # store upmost overflow bit lea 1($i),$i # i++ -.align 4 + jmp .Louter +.align 16 .Louter: - xor $j,$j # j=0 - mov ($bp,$i,8),$m0 # m0=bp[i] - mov ($ap),%rax # ap[0] + xor $j,$j # j=0 + mov $n0,$m1 + mov (%rsp),$lo0 mulq $m0 # ap[0]*bp[i] - add (%rsp),%rax # ap[0]*bp[i]+tp[0] + add %rax,$lo0 # ap[0]*bp[i]+tp[0] + mov ($np),%rax adc \$0,%rdx - mov %rax,$lo0 - mov %rdx,$hi0 - imulq $n0,%rax # tp[0]*n0 - mov %rax,$m1 + imulq $lo0,$m1 # tp[0]*n0 + mov %rdx,$hi0 - mulq ($np,$j,8) # np[0]*m1 - add $lo0,%rax # discarded - mov 8(%rsp),$lo0 # tp[1] + mulq $m1 # np[0]*m1 + add %rax,$lo0 # discarded + mov 8($ap),%rax adc \$0,%rdx + mov 8(%rsp),$lo0 # tp[1] mov %rdx,$hi1 lea 1($j),$j # j++ -.align 4 + jmp .Linner_enter + +.align 16 .Linner: + add %rax,$hi1 mov ($ap,$j,8),%rax - mulq $m0 # ap[j]*bp[i] - add $hi0,%rax adc \$0,%rdx - add %rax,$lo0 # ap[j]*bp[i]+tp[j] + add $lo0,$hi1 # np[j]*m1+ap[j]*bp[i]+tp[j] + mov (%rsp,$j,8),$lo0 + adc \$0,%rdx + mov $hi1,-16(%rsp,$j,8) # tp[j-1] + mov %rdx,$hi1 + +.Linner_enter: + mulq $m0 # ap[j]*bp[i] + add %rax,$hi0 mov ($np,$j,8),%rax adc \$0,%rdx + add $hi0,$lo0 # ap[j]*bp[i]+tp[j] mov %rdx,$hi0 + adc \$0,$hi0 + lea 1($j),$j # j++ mulq $m1 # np[j]*m1 - add $hi1,%rax - lea 1($j),$j # j++ - adc \$0,%rdx - add $lo0,%rax # np[j]*m1+ap[j]*bp[i]+tp[j] + cmp $num,$j + jne .Linner + + add %rax,$hi1 + mov ($ap),%rax # ap[0] adc \$0,%rdx + add $lo0,$hi1 # np[j]*m1+ap[j]*bp[i]+tp[j] mov (%rsp,$j,8),$lo0 - cmp $num,$j - mov %rax,-16(%rsp,$j,8) # tp[j-1] + adc \$0,%rdx + mov $hi1,-16(%rsp,$j,8) # tp[j-1] mov %rdx,$hi1 - jl .Linner xor %rdx,%rdx add $hi0,$hi1 @@ -173,35 +228,449 @@ bn_mul_mont: cmp $num,$i jl .Louter - lea (%rsp),$ap # borrow ap for tp - lea -1($num),$j # j=num-1 - - mov ($ap),%rax # tp[0] xor $i,$i # i=0 and clear CF! + mov (%rsp),%rax # tp[0] + lea (%rsp),$ap # borrow ap for tp + mov $num,$j # j=num jmp .Lsub .align 16 .Lsub: sbb ($np,$i,8),%rax mov %rax,($rp,$i,8) # rp[i]=tp[i]-np[i] - dec $j # doesn't affect CF! mov 8($ap,$i,8),%rax # tp[i+1] lea 1($i),$i # i++ - jge .Lsub + dec $j # doesnn't affect CF! + jnz .Lsub sbb \$0,%rax # handle upmost overflow bit + xor $i,$i and %rax,$ap not %rax mov $rp,$np and %rax,$np - lea -1($num),$j + mov $num,$j # j=num or $np,$ap # ap=borrow?tp:rp .align 16 .Lcopy: # copy or in-place refresh + mov ($ap,$i,8),%rax + mov $i,(%rsp,$i,8) # zap temporary vector + mov %rax,($rp,$i,8) # rp[i]=tp[i] + lea 1($i),$i + sub \$1,$j + jnz .Lcopy + + mov 8(%rsp,$num,8),%rsi # restore %rsp + mov \$1,%rax + mov (%rsi),%r15 + mov 8(%rsi),%r14 + mov 16(%rsi),%r13 + mov 24(%rsi),%r12 + mov 32(%rsi),%rbp + mov 40(%rsi),%rbx + lea 48(%rsi),%rsp +.Lmul_epilogue: + ret +.size bn_mul_mont,.-bn_mul_mont +___ +{{{ +my @A=("%r10","%r11"); +my @N=("%r13","%rdi"); +$code.=<<___; +.type bn_mul4x_mont,\@function,6 +.align 16 +bn_mul4x_mont: +.Lmul4x_enter: + push %rbx + push %rbp + push %r12 + push %r13 + push %r14 + push %r15 + + mov ${num}d,${num}d + lea 4($num),%r10 + mov %rsp,%r11 + neg %r10 + lea (%rsp,%r10,8),%rsp # tp=alloca(8*(num+4)) + and \$-1024,%rsp # minimize TLB usage + + mov %r11,8(%rsp,$num,8) # tp[num+1]=%rsp +.Lmul4x_body: + mov $rp,16(%rsp,$num,8) # tp[num+2]=$rp + mov %rdx,%r12 # reassign $bp +___ + $bp="%r12"; +$code.=<<___; + mov ($n0),$n0 # pull n0[0] value + mov ($bp),$m0 # m0=bp[0] + mov ($ap),%rax + + xor $i,$i # i=0 + xor $j,$j # j=0 + + mov $n0,$m1 + mulq $m0 # ap[0]*bp[0] + mov %rax,$A[0] + mov ($np),%rax + + imulq $A[0],$m1 # "tp[0]"*n0 + mov %rdx,$A[1] + + mulq $m1 # np[0]*m1 + add %rax,$A[0] # discarded + mov 8($ap),%rax + adc \$0,%rdx + mov %rdx,$N[1] + + mulq $m0 + add %rax,$A[1] + mov 8($np),%rax + adc \$0,%rdx + mov %rdx,$A[0] + + mulq $m1 + add %rax,$N[1] + mov 16($ap),%rax + adc \$0,%rdx + add $A[1],$N[1] + lea 4($j),$j # j++ + adc \$0,%rdx + mov $N[1],(%rsp) + mov %rdx,$N[0] + jmp .L1st4x +.align 16 +.L1st4x: + mulq $m0 # ap[j]*bp[0] + add %rax,$A[0] + mov -16($np,$j,8),%rax + adc \$0,%rdx + mov %rdx,$A[1] + + mulq $m1 # np[j]*m1 + add %rax,$N[0] + mov -8($ap,$j,8),%rax + adc \$0,%rdx + add $A[0],$N[0] # np[j]*m1+ap[j]*bp[0] + adc \$0,%rdx + mov $N[0],-24(%rsp,$j,8) # tp[j-1] + mov %rdx,$N[1] + + mulq $m0 # ap[j]*bp[0] + add %rax,$A[1] + mov -8($np,$j,8),%rax + adc \$0,%rdx + mov %rdx,$A[0] + + mulq $m1 # np[j]*m1 + add %rax,$N[1] mov ($ap,$j,8),%rax - mov %rax,($rp,$j,8) # rp[i]=tp[i] - mov $i,(%rsp,$j,8) # zap temporary vector + adc \$0,%rdx + add $A[1],$N[1] # np[j]*m1+ap[j]*bp[0] + adc \$0,%rdx + mov $N[1],-16(%rsp,$j,8) # tp[j-1] + mov %rdx,$N[0] + + mulq $m0 # ap[j]*bp[0] + add %rax,$A[0] + mov ($np,$j,8),%rax + adc \$0,%rdx + mov %rdx,$A[1] + + mulq $m1 # np[j]*m1 + add %rax,$N[0] + mov 8($ap,$j,8),%rax + adc \$0,%rdx + add $A[0],$N[0] # np[j]*m1+ap[j]*bp[0] + adc \$0,%rdx + mov $N[0],-8(%rsp,$j,8) # tp[j-1] + mov %rdx,$N[1] + + mulq $m0 # ap[j]*bp[0] + add %rax,$A[1] + mov 8($np,$j,8),%rax + adc \$0,%rdx + lea 4($j),$j # j++ + mov %rdx,$A[0] + + mulq $m1 # np[j]*m1 + add %rax,$N[1] + mov -16($ap,$j,8),%rax + adc \$0,%rdx + add $A[1],$N[1] # np[j]*m1+ap[j]*bp[0] + adc \$0,%rdx + mov $N[1],-32(%rsp,$j,8) # tp[j-1] + mov %rdx,$N[0] + cmp $num,$j + jl .L1st4x + + mulq $m0 # ap[j]*bp[0] + add %rax,$A[0] + mov -16($np,$j,8),%rax + adc \$0,%rdx + mov %rdx,$A[1] + + mulq $m1 # np[j]*m1 + add %rax,$N[0] + mov -8($ap,$j,8),%rax + adc \$0,%rdx + add $A[0],$N[0] # np[j]*m1+ap[j]*bp[0] + adc \$0,%rdx + mov $N[0],-24(%rsp,$j,8) # tp[j-1] + mov %rdx,$N[1] + + mulq $m0 # ap[j]*bp[0] + add %rax,$A[1] + mov -8($np,$j,8),%rax + adc \$0,%rdx + mov %rdx,$A[0] + + mulq $m1 # np[j]*m1 + add %rax,$N[1] + mov ($ap),%rax # ap[0] + adc \$0,%rdx + add $A[1],$N[1] # np[j]*m1+ap[j]*bp[0] + adc \$0,%rdx + mov $N[1],-16(%rsp,$j,8) # tp[j-1] + mov %rdx,$N[0] + + xor $N[1],$N[1] + add $A[0],$N[0] + adc \$0,$N[1] + mov $N[0],-8(%rsp,$j,8) + mov $N[1],(%rsp,$j,8) # store upmost overflow bit + + lea 1($i),$i # i++ +.align 4 +.Louter4x: + mov ($bp,$i,8),$m0 # m0=bp[i] + xor $j,$j # j=0 + mov (%rsp),$A[0] + mov $n0,$m1 + mulq $m0 # ap[0]*bp[i] + add %rax,$A[0] # ap[0]*bp[i]+tp[0] + mov ($np),%rax + adc \$0,%rdx + + imulq $A[0],$m1 # tp[0]*n0 + mov %rdx,$A[1] + + mulq $m1 # np[0]*m1 + add %rax,$A[0] # "$N[0]", discarded + mov 8($ap),%rax + adc \$0,%rdx + mov %rdx,$N[1] + + mulq $m0 # ap[j]*bp[i] + add %rax,$A[1] + mov 8($np),%rax + adc \$0,%rdx + add 8(%rsp),$A[1] # +tp[1] + adc \$0,%rdx + mov %rdx,$A[0] + + mulq $m1 # np[j]*m1 + add %rax,$N[1] + mov 16($ap),%rax + adc \$0,%rdx + add $A[1],$N[1] # np[j]*m1+ap[j]*bp[i]+tp[j] + lea 4($j),$j # j+=2 + adc \$0,%rdx + mov $N[1],(%rsp) # tp[j-1] + mov %rdx,$N[0] + jmp .Linner4x +.align 16 +.Linner4x: + mulq $m0 # ap[j]*bp[i] + add %rax,$A[0] + mov -16($np,$j,8),%rax + adc \$0,%rdx + add -16(%rsp,$j,8),$A[0] # ap[j]*bp[i]+tp[j] + adc \$0,%rdx + mov %rdx,$A[1] + + mulq $m1 # np[j]*m1 + add %rax,$N[0] + mov -8($ap,$j,8),%rax + adc \$0,%rdx + add $A[0],$N[0] + adc \$0,%rdx + mov $N[0],-24(%rsp,$j,8) # tp[j-1] + mov %rdx,$N[1] + + mulq $m0 # ap[j]*bp[i] + add %rax,$A[1] + mov -8($np,$j,8),%rax + adc \$0,%rdx + add -8(%rsp,$j,8),$A[1] + adc \$0,%rdx + mov %rdx,$A[0] + + mulq $m1 # np[j]*m1 + add %rax,$N[1] + mov ($ap,$j,8),%rax + adc \$0,%rdx + add $A[1],$N[1] + adc \$0,%rdx + mov $N[1],-16(%rsp,$j,8) # tp[j-1] + mov %rdx,$N[0] + + mulq $m0 # ap[j]*bp[i] + add %rax,$A[0] + mov ($np,$j,8),%rax + adc \$0,%rdx + add (%rsp,$j,8),$A[0] # ap[j]*bp[i]+tp[j] + adc \$0,%rdx + mov %rdx,$A[1] + + mulq $m1 # np[j]*m1 + add %rax,$N[0] + mov 8($ap,$j,8),%rax + adc \$0,%rdx + add $A[0],$N[0] + adc \$0,%rdx + mov $N[0],-8(%rsp,$j,8) # tp[j-1] + mov %rdx,$N[1] + + mulq $m0 # ap[j]*bp[i] + add %rax,$A[1] + mov 8($np,$j,8),%rax + adc \$0,%rdx + add 8(%rsp,$j,8),$A[1] + adc \$0,%rdx + lea 4($j),$j # j++ + mov %rdx,$A[0] + + mulq $m1 # np[j]*m1 + add %rax,$N[1] + mov -16($ap,$j,8),%rax + adc \$0,%rdx + add $A[1],$N[1] + adc \$0,%rdx + mov $N[1],-32(%rsp,$j,8) # tp[j-1] + mov %rdx,$N[0] + cmp $num,$j + jl .Linner4x + + mulq $m0 # ap[j]*bp[i] + add %rax,$A[0] + mov -16($np,$j,8),%rax + adc \$0,%rdx + add -16(%rsp,$j,8),$A[0] # ap[j]*bp[i]+tp[j] + adc \$0,%rdx + mov %rdx,$A[1] + + mulq $m1 # np[j]*m1 + add %rax,$N[0] + mov -8($ap,$j,8),%rax + adc \$0,%rdx + add $A[0],$N[0] + adc \$0,%rdx + mov $N[0],-24(%rsp,$j,8) # tp[j-1] + mov %rdx,$N[1] + + mulq $m0 # ap[j]*bp[i] + add %rax,$A[1] + mov -8($np,$j,8),%rax + adc \$0,%rdx + add -8(%rsp,$j,8),$A[1] + adc \$0,%rdx + lea 1($i),$i # i++ + mov %rdx,$A[0] + + mulq $m1 # np[j]*m1 + add %rax,$N[1] + mov ($ap),%rax # ap[0] + adc \$0,%rdx + add $A[1],$N[1] + adc \$0,%rdx + mov $N[1],-16(%rsp,$j,8) # tp[j-1] + mov %rdx,$N[0] + + xor $N[1],$N[1] + add $A[0],$N[0] + adc \$0,$N[1] + add (%rsp,$num,8),$N[0] # pull upmost overflow bit + adc \$0,$N[1] + mov $N[0],-8(%rsp,$j,8) + mov $N[1],(%rsp,$j,8) # store upmost overflow bit + + cmp $num,$i + jl .Louter4x +___ +{ +my @ri=("%rax","%rdx",$m0,$m1); +$code.=<<___; + mov 16(%rsp,$num,8),$rp # restore $rp + mov 0(%rsp),@ri[0] # tp[0] + pxor %xmm0,%xmm0 + mov 8(%rsp),@ri[1] # tp[1] + shr \$2,$num # num/=4 + lea (%rsp),$ap # borrow ap for tp + xor $i,$i # i=0 and clear CF! + + sub 0($np),@ri[0] + mov 16($ap),@ri[2] # tp[2] + mov 24($ap),@ri[3] # tp[3] + sbb 8($np),@ri[1] + lea -1($num),$j # j=num/4-1 + jmp .Lsub4x +.align 16 +.Lsub4x: + mov @ri[0],0($rp,$i,8) # rp[i]=tp[i]-np[i] + mov @ri[1],8($rp,$i,8) # rp[i]=tp[i]-np[i] + sbb 16($np,$i,8),@ri[2] + mov 32($ap,$i,8),@ri[0] # tp[i+1] + mov 40($ap,$i,8),@ri[1] + sbb 24($np,$i,8),@ri[3] + mov @ri[2],16($rp,$i,8) # rp[i]=tp[i]-np[i] + mov @ri[3],24($rp,$i,8) # rp[i]=tp[i]-np[i] + sbb 32($np,$i,8),@ri[0] + mov 48($ap,$i,8),@ri[2] + mov 56($ap,$i,8),@ri[3] + sbb 40($np,$i,8),@ri[1] + lea 4($i),$i # i++ + dec $j # doesnn't affect CF! + jnz .Lsub4x + + mov @ri[0],0($rp,$i,8) # rp[i]=tp[i]-np[i] + mov 32($ap,$i,8),@ri[0] # load overflow bit + sbb 16($np,$i,8),@ri[2] + mov @ri[1],8($rp,$i,8) # rp[i]=tp[i]-np[i] + sbb 24($np,$i,8),@ri[3] + mov @ri[2],16($rp,$i,8) # rp[i]=tp[i]-np[i] + + sbb \$0,@ri[0] # handle upmost overflow bit + mov @ri[3],24($rp,$i,8) # rp[i]=tp[i]-np[i] + xor $i,$i # i=0 + and @ri[0],$ap + not @ri[0] + mov $rp,$np + and @ri[0],$np + lea -1($num),$j + or $np,$ap # ap=borrow?tp:rp + + movdqu ($ap),%xmm1 + movdqa %xmm0,(%rsp) + movdqu %xmm1,($rp) + jmp .Lcopy4x +.align 16 +.Lcopy4x: # copy or in-place refresh + movdqu 16($ap,$i),%xmm2 + movdqu 32($ap,$i),%xmm1 + movdqa %xmm0,16(%rsp,$i) + movdqu %xmm2,16($rp,$i) + movdqa %xmm0,32(%rsp,$i) + movdqu %xmm1,32($rp,$i) + lea 32($i),$i dec $j - jge .Lcopy + jnz .Lcopy4x + shl \$2,$num + movdqu 16($ap,$i),%xmm2 + movdqa %xmm0,16(%rsp,$i) + movdqu %xmm2,16($rp,$i) +___ +} +$code.=<<___; mov 8(%rsp,$num,8),%rsi # restore %rsp mov \$1,%rax mov (%rsi),%r15 @@ -211,9 +680,823 @@ bn_mul_mont: mov 32(%rsi),%rbp mov 40(%rsi),%rbx lea 48(%rsi),%rsp -.Lepilogue: +.Lmul4x_epilogue: ret -.size bn_mul_mont,.-bn_mul_mont +.size bn_mul4x_mont,.-bn_mul4x_mont +___ +}}} + {{{ +###################################################################### +# void bn_sqr4x_mont( +my $rptr="%rdi"; # const BN_ULONG *rptr, +my $aptr="%rsi"; # const BN_ULONG *aptr, +my $bptr="%rdx"; # not used +my $nptr="%rcx"; # const BN_ULONG *nptr, +my $n0 ="%r8"; # const BN_ULONG *n0); +my $num ="%r9"; # int num, has to be divisible by 4 and + # not less than 8 + +my ($i,$j,$tptr)=("%rbp","%rcx",$rptr); +my @A0=("%r10","%r11"); +my @A1=("%r12","%r13"); +my ($a0,$a1,$ai)=("%r14","%r15","%rbx"); + +$code.=<<___; +.type bn_sqr4x_mont,\@function,6 +.align 16 +bn_sqr4x_mont: +.Lsqr4x_enter: + push %rbx + push %rbp + push %r12 + push %r13 + push %r14 + push %r15 + + shl \$3,${num}d # convert $num to bytes + xor %r10,%r10 + mov %rsp,%r11 # put aside %rsp + sub $num,%r10 # -$num + mov ($n0),$n0 # *n0 + lea -72(%rsp,%r10,2),%rsp # alloca(frame+2*$num) + and \$-1024,%rsp # minimize TLB usage + ############################################################## + # Stack layout + # + # +0 saved $num, used in reduction section + # +8 &t[2*$num], used in reduction section + # +32 saved $rptr + # +40 saved $nptr + # +48 saved *n0 + # +56 saved %rsp + # +64 t[2*$num] + # + mov $rptr,32(%rsp) # save $rptr + mov $nptr,40(%rsp) + mov $n0, 48(%rsp) + mov %r11, 56(%rsp) # save original %rsp +.Lsqr4x_body: + ############################################################## + # Squaring part: + # + # a) multiply-n-add everything but a[i]*a[i]; + # b) shift result of a) by 1 to the left and accumulate + # a[i]*a[i] products; + # + lea 32(%r10),$i # $i=-($num-32) + lea ($aptr,$num),$aptr # end of a[] buffer, ($aptr,$i)=&ap[2] + + mov $num,$j # $j=$num + + # comments apply to $num==8 case + mov -32($aptr,$i),$a0 # a[0] + lea 64(%rsp,$num,2),$tptr # end of tp[] buffer, &tp[2*$num] + mov -24($aptr,$i),%rax # a[1] + lea -32($tptr,$i),$tptr # end of tp[] window, &tp[2*$num-"$i"] + mov -16($aptr,$i),$ai # a[2] + mov %rax,$a1 + + mul $a0 # a[1]*a[0] + mov %rax,$A0[0] # a[1]*a[0] + mov $ai,%rax # a[2] + mov %rdx,$A0[1] + mov $A0[0],-24($tptr,$i) # t[1] + + xor $A0[0],$A0[0] + mul $a0 # a[2]*a[0] + add %rax,$A0[1] + mov $ai,%rax + adc %rdx,$A0[0] + mov $A0[1],-16($tptr,$i) # t[2] + + lea -16($i),$j # j=-16 + + + mov 8($aptr,$j),$ai # a[3] + mul $a1 # a[2]*a[1] + mov %rax,$A1[0] # a[2]*a[1]+t[3] + mov $ai,%rax + mov %rdx,$A1[1] + + xor $A0[1],$A0[1] + add $A1[0],$A0[0] + lea 16($j),$j + adc \$0,$A0[1] + mul $a0 # a[3]*a[0] + add %rax,$A0[0] # a[3]*a[0]+a[2]*a[1]+t[3] + mov $ai,%rax + adc %rdx,$A0[1] + mov $A0[0],-8($tptr,$j) # t[3] + jmp .Lsqr4x_1st + +.align 16 +.Lsqr4x_1st: + mov ($aptr,$j),$ai # a[4] + xor $A1[0],$A1[0] + mul $a1 # a[3]*a[1] + add %rax,$A1[1] # a[3]*a[1]+t[4] + mov $ai,%rax + adc %rdx,$A1[0] + + xor $A0[0],$A0[0] + add $A1[1],$A0[1] + adc \$0,$A0[0] + mul $a0 # a[4]*a[0] + add %rax,$A0[1] # a[4]*a[0]+a[3]*a[1]+t[4] + mov $ai,%rax # a[3] + adc %rdx,$A0[0] + mov $A0[1],($tptr,$j) # t[4] + + + mov 8($aptr,$j),$ai # a[5] + xor $A1[1],$A1[1] + mul $a1 # a[4]*a[3] + add %rax,$A1[0] # a[4]*a[3]+t[5] + mov $ai,%rax + adc %rdx,$A1[1] + + xor $A0[1],$A0[1] + add $A1[0],$A0[0] + adc \$0,$A0[1] + mul $a0 # a[5]*a[2] + add %rax,$A0[0] # a[5]*a[2]+a[4]*a[3]+t[5] + mov $ai,%rax + adc %rdx,$A0[1] + mov $A0[0],8($tptr,$j) # t[5] + + mov 16($aptr,$j),$ai # a[6] + xor $A1[0],$A1[0] + mul $a1 # a[5]*a[3] + add %rax,$A1[1] # a[5]*a[3]+t[6] + mov $ai,%rax + adc %rdx,$A1[0] + + xor $A0[0],$A0[0] + add $A1[1],$A0[1] + adc \$0,$A0[0] + mul $a0 # a[6]*a[2] + add %rax,$A0[1] # a[6]*a[2]+a[5]*a[3]+t[6] + mov $ai,%rax # a[3] + adc %rdx,$A0[0] + mov $A0[1],16($tptr,$j) # t[6] + + + mov 24($aptr,$j),$ai # a[7] + xor $A1[1],$A1[1] + mul $a1 # a[6]*a[5] + add %rax,$A1[0] # a[6]*a[5]+t[7] + mov $ai,%rax + adc %rdx,$A1[1] + + xor $A0[1],$A0[1] + add $A1[0],$A0[0] + lea 32($j),$j + adc \$0,$A0[1] + mul $a0 # a[7]*a[4] + add %rax,$A0[0] # a[7]*a[4]+a[6]*a[5]+t[6] + mov $ai,%rax + adc %rdx,$A0[1] + mov $A0[0],-8($tptr,$j) # t[7] + + cmp \$0,$j + jne .Lsqr4x_1st + + xor $A1[0],$A1[0] + add $A0[1],$A1[1] + adc \$0,$A1[0] + mul $a1 # a[7]*a[5] + add %rax,$A1[1] + adc %rdx,$A1[0] + + mov $A1[1],($tptr) # t[8] + lea 16($i),$i + mov $A1[0],8($tptr) # t[9] + jmp .Lsqr4x_outer + +.align 16 +.Lsqr4x_outer: # comments apply to $num==6 case + mov -32($aptr,$i),$a0 # a[0] + lea 64(%rsp,$num,2),$tptr # end of tp[] buffer, &tp[2*$num] + mov -24($aptr,$i),%rax # a[1] + lea -32($tptr,$i),$tptr # end of tp[] window, &tp[2*$num-"$i"] + mov -16($aptr,$i),$ai # a[2] + mov %rax,$a1 + + mov -24($tptr,$i),$A0[0] # t[1] + xor $A0[1],$A0[1] + mul $a0 # a[1]*a[0] + add %rax,$A0[0] # a[1]*a[0]+t[1] + mov $ai,%rax # a[2] + adc %rdx,$A0[1] + mov $A0[0],-24($tptr,$i) # t[1] + + xor $A0[0],$A0[0] + add -16($tptr,$i),$A0[1] # a[2]*a[0]+t[2] + adc \$0,$A0[0] + mul $a0 # a[2]*a[0] + add %rax,$A0[1] + mov $ai,%rax + adc %rdx,$A0[0] + mov $A0[1],-16($tptr,$i) # t[2] + + lea -16($i),$j # j=-16 + xor $A1[0],$A1[0] + + + mov 8($aptr,$j),$ai # a[3] + xor $A1[1],$A1[1] + add 8($tptr,$j),$A1[0] + adc \$0,$A1[1] + mul $a1 # a[2]*a[1] + add %rax,$A1[0] # a[2]*a[1]+t[3] + mov $ai,%rax + adc %rdx,$A1[1] + + xor $A0[1],$A0[1] + add $A1[0],$A0[0] + adc \$0,$A0[1] + mul $a0 # a[3]*a[0] + add %rax,$A0[0] # a[3]*a[0]+a[2]*a[1]+t[3] + mov $ai,%rax + adc %rdx,$A0[1] + mov $A0[0],8($tptr,$j) # t[3] + + lea 16($j),$j + jmp .Lsqr4x_inner + +.align 16 +.Lsqr4x_inner: + mov ($aptr,$j),$ai # a[4] + xor $A1[0],$A1[0] + add ($tptr,$j),$A1[1] + adc \$0,$A1[0] + mul $a1 # a[3]*a[1] + add %rax,$A1[1] # a[3]*a[1]+t[4] + mov $ai,%rax + adc %rdx,$A1[0] + + xor $A0[0],$A0[0] + add $A1[1],$A0[1] + adc \$0,$A0[0] + mul $a0 # a[4]*a[0] + add %rax,$A0[1] # a[4]*a[0]+a[3]*a[1]+t[4] + mov $ai,%rax # a[3] + adc %rdx,$A0[0] + mov $A0[1],($tptr,$j) # t[4] + + mov 8($aptr,$j),$ai # a[5] + xor $A1[1],$A1[1] + add 8($tptr,$j),$A1[0] + adc \$0,$A1[1] + mul $a1 # a[4]*a[3] + add %rax,$A1[0] # a[4]*a[3]+t[5] + mov $ai,%rax + adc %rdx,$A1[1] + + xor $A0[1],$A0[1] + add $A1[0],$A0[0] + lea 16($j),$j # j++ + adc \$0,$A0[1] + mul $a0 # a[5]*a[2] + add %rax,$A0[0] # a[5]*a[2]+a[4]*a[3]+t[5] + mov $ai,%rax + adc %rdx,$A0[1] + mov $A0[0],-8($tptr,$j) # t[5], "preloaded t[1]" below + + cmp \$0,$j + jne .Lsqr4x_inner + + xor $A1[0],$A1[0] + add $A0[1],$A1[1] + adc \$0,$A1[0] + mul $a1 # a[5]*a[3] + add %rax,$A1[1] + adc %rdx,$A1[0] + + mov $A1[1],($tptr) # t[6], "preloaded t[2]" below + mov $A1[0],8($tptr) # t[7], "preloaded t[3]" below + + add \$16,$i + jnz .Lsqr4x_outer + + # comments apply to $num==4 case + mov -32($aptr),$a0 # a[0] + lea 64(%rsp,$num,2),$tptr # end of tp[] buffer, &tp[2*$num] + mov -24($aptr),%rax # a[1] + lea -32($tptr,$i),$tptr # end of tp[] window, &tp[2*$num-"$i"] + mov -16($aptr),$ai # a[2] + mov %rax,$a1 + + xor $A0[1],$A0[1] + mul $a0 # a[1]*a[0] + add %rax,$A0[0] # a[1]*a[0]+t[1], preloaded t[1] + mov $ai,%rax # a[2] + adc %rdx,$A0[1] + mov $A0[0],-24($tptr) # t[1] + + xor $A0[0],$A0[0] + add $A1[1],$A0[1] # a[2]*a[0]+t[2], preloaded t[2] + adc \$0,$A0[0] + mul $a0 # a[2]*a[0] + add %rax,$A0[1] + mov $ai,%rax + adc %rdx,$A0[0] + mov $A0[1],-16($tptr) # t[2] + + mov -8($aptr),$ai # a[3] + mul $a1 # a[2]*a[1] + add %rax,$A1[0] # a[2]*a[1]+t[3], preloaded t[3] + mov $ai,%rax + adc \$0,%rdx + + xor $A0[1],$A0[1] + add $A1[0],$A0[0] + mov %rdx,$A1[1] + adc \$0,$A0[1] + mul $a0 # a[3]*a[0] + add %rax,$A0[0] # a[3]*a[0]+a[2]*a[1]+t[3] + mov $ai,%rax + adc %rdx,$A0[1] + mov $A0[0],-8($tptr) # t[3] + + xor $A1[0],$A1[0] + add $A0[1],$A1[1] + adc \$0,$A1[0] + mul $a1 # a[3]*a[1] + add %rax,$A1[1] + mov -16($aptr),%rax # a[2] + adc %rdx,$A1[0] + + mov $A1[1],($tptr) # t[4] + mov $A1[0],8($tptr) # t[5] + + mul $ai # a[2]*a[3] +___ +{ +my ($shift,$carry)=($a0,$a1); +my @S=(@A1,$ai,$n0); +$code.=<<___; + add \$16,$i + xor $shift,$shift + sub $num,$i # $i=16-$num + xor $carry,$carry + + add $A1[0],%rax # t[5] + adc \$0,%rdx + mov %rax,8($tptr) # t[5] + mov %rdx,16($tptr) # t[6] + mov $carry,24($tptr) # t[7] + + mov -16($aptr,$i),%rax # a[0] + lea 64(%rsp,$num,2),$tptr + xor $A0[0],$A0[0] # t[0] + mov -24($tptr,$i,2),$A0[1] # t[1] + + lea ($shift,$A0[0],2),$S[0] # t[2*i]<<1 | shift + shr \$63,$A0[0] + lea ($j,$A0[1],2),$S[1] # t[2*i+1]<<1 | + shr \$63,$A0[1] + or $A0[0],$S[1] # | t[2*i]>>63 + mov -16($tptr,$i,2),$A0[0] # t[2*i+2] # prefetch + mov $A0[1],$shift # shift=t[2*i+1]>>63 + mul %rax # a[i]*a[i] + neg $carry # mov $carry,cf + mov -8($tptr,$i,2),$A0[1] # t[2*i+2+1] # prefetch + adc %rax,$S[0] + mov -8($aptr,$i),%rax # a[i+1] # prefetch + mov $S[0],-32($tptr,$i,2) + adc %rdx,$S[1] + + lea ($shift,$A0[0],2),$S[2] # t[2*i]<<1 | shift + mov $S[1],-24($tptr,$i,2) + sbb $carry,$carry # mov cf,$carry + shr \$63,$A0[0] + lea ($j,$A0[1],2),$S[3] # t[2*i+1]<<1 | + shr \$63,$A0[1] + or $A0[0],$S[3] # | t[2*i]>>63 + mov 0($tptr,$i,2),$A0[0] # t[2*i+2] # prefetch + mov $A0[1],$shift # shift=t[2*i+1]>>63 + mul %rax # a[i]*a[i] + neg $carry # mov $carry,cf + mov 8($tptr,$i,2),$A0[1] # t[2*i+2+1] # prefetch + adc %rax,$S[2] + mov 0($aptr,$i),%rax # a[i+1] # prefetch + mov $S[2],-16($tptr,$i,2) + adc %rdx,$S[3] + lea 16($i),$i + mov $S[3],-40($tptr,$i,2) + sbb $carry,$carry # mov cf,$carry + jmp .Lsqr4x_shift_n_add + +.align 16 +.Lsqr4x_shift_n_add: + lea ($shift,$A0[0],2),$S[0] # t[2*i]<<1 | shift + shr \$63,$A0[0] + lea ($j,$A0[1],2),$S[1] # t[2*i+1]<<1 | + shr \$63,$A0[1] + or $A0[0],$S[1] # | t[2*i]>>63 + mov -16($tptr,$i,2),$A0[0] # t[2*i+2] # prefetch + mov $A0[1],$shift # shift=t[2*i+1]>>63 + mul %rax # a[i]*a[i] + neg $carry # mov $carry,cf + mov -8($tptr,$i,2),$A0[1] # t[2*i+2+1] # prefetch + adc %rax,$S[0] + mov -8($aptr,$i),%rax # a[i+1] # prefetch + mov $S[0],-32($tptr,$i,2) + adc %rdx,$S[1] + + lea ($shift,$A0[0],2),$S[2] # t[2*i]<<1 | shift + mov $S[1],-24($tptr,$i,2) + sbb $carry,$carry # mov cf,$carry + shr \$63,$A0[0] + lea ($j,$A0[1],2),$S[3] # t[2*i+1]<<1 | + shr \$63,$A0[1] + or $A0[0],$S[3] # | t[2*i]>>63 + mov 0($tptr,$i,2),$A0[0] # t[2*i+2] # prefetch + mov $A0[1],$shift # shift=t[2*i+1]>>63 + mul %rax # a[i]*a[i] + neg $carry # mov $carry,cf + mov 8($tptr,$i,2),$A0[1] # t[2*i+2+1] # prefetch + adc %rax,$S[2] + mov 0($aptr,$i),%rax # a[i+1] # prefetch + mov $S[2],-16($tptr,$i,2) + adc %rdx,$S[3] + + lea ($shift,$A0[0],2),$S[0] # t[2*i]<<1 | shift + mov $S[3],-8($tptr,$i,2) + sbb $carry,$carry # mov cf,$carry + shr \$63,$A0[0] + lea ($j,$A0[1],2),$S[1] # t[2*i+1]<<1 | + shr \$63,$A0[1] + or $A0[0],$S[1] # | t[2*i]>>63 + mov 16($tptr,$i,2),$A0[0] # t[2*i+2] # prefetch + mov $A0[1],$shift # shift=t[2*i+1]>>63 + mul %rax # a[i]*a[i] + neg $carry # mov $carry,cf + mov 24($tptr,$i,2),$A0[1] # t[2*i+2+1] # prefetch + adc %rax,$S[0] + mov 8($aptr,$i),%rax # a[i+1] # prefetch + mov $S[0],0($tptr,$i,2) + adc %rdx,$S[1] + + lea ($shift,$A0[0],2),$S[2] # t[2*i]<<1 | shift + mov $S[1],8($tptr,$i,2) + sbb $carry,$carry # mov cf,$carry + shr \$63,$A0[0] + lea ($j,$A0[1],2),$S[3] # t[2*i+1]<<1 | + shr \$63,$A0[1] + or $A0[0],$S[3] # | t[2*i]>>63 + mov 32($tptr,$i,2),$A0[0] # t[2*i+2] # prefetch + mov $A0[1],$shift # shift=t[2*i+1]>>63 + mul %rax # a[i]*a[i] + neg $carry # mov $carry,cf + mov 40($tptr,$i,2),$A0[1] # t[2*i+2+1] # prefetch + adc %rax,$S[2] + mov 16($aptr,$i),%rax # a[i+1] # prefetch + mov $S[2],16($tptr,$i,2) + adc %rdx,$S[3] + mov $S[3],24($tptr,$i,2) + sbb $carry,$carry # mov cf,$carry + add \$32,$i + jnz .Lsqr4x_shift_n_add + + lea ($shift,$A0[0],2),$S[0] # t[2*i]<<1 | shift + shr \$63,$A0[0] + lea ($j,$A0[1],2),$S[1] # t[2*i+1]<<1 | + shr \$63,$A0[1] + or $A0[0],$S[1] # | t[2*i]>>63 + mov -16($tptr),$A0[0] # t[2*i+2] # prefetch + mov $A0[1],$shift # shift=t[2*i+1]>>63 + mul %rax # a[i]*a[i] + neg $carry # mov $carry,cf + mov -8($tptr),$A0[1] # t[2*i+2+1] # prefetch + adc %rax,$S[0] + mov -8($aptr),%rax # a[i+1] # prefetch + mov $S[0],-32($tptr) + adc %rdx,$S[1] + + lea ($shift,$A0[0],2),$S[2] # t[2*i]<<1|shift + mov $S[1],-24($tptr) + sbb $carry,$carry # mov cf,$carry + shr \$63,$A0[0] + lea ($j,$A0[1],2),$S[3] # t[2*i+1]<<1 | + shr \$63,$A0[1] + or $A0[0],$S[3] # | t[2*i]>>63 + mul %rax # a[i]*a[i] + neg $carry # mov $carry,cf + adc %rax,$S[2] + adc %rdx,$S[3] + mov $S[2],-16($tptr) + mov $S[3],-8($tptr) +___ +} +############################################################## +# Montgomery reduction part, "word-by-word" algorithm. +# +{ +my ($topbit,$nptr)=("%rbp",$aptr); +my ($m0,$m1)=($a0,$a1); +my @Ni=("%rbx","%r9"); +$code.=<<___; + mov 40(%rsp),$nptr # restore $nptr + mov 48(%rsp),$n0 # restore *n0 + xor $j,$j + mov $num,0(%rsp) # save $num + sub $num,$j # $j=-$num + mov 64(%rsp),$A0[0] # t[0] # modsched # + mov $n0,$m0 # # modsched # + lea 64(%rsp,$num,2),%rax # end of t[] buffer + lea 64(%rsp,$num),$tptr # end of t[] window + mov %rax,8(%rsp) # save end of t[] buffer + lea ($nptr,$num),$nptr # end of n[] buffer + xor $topbit,$topbit # $topbit=0 + + mov 0($nptr,$j),%rax # n[0] # modsched # + mov 8($nptr,$j),$Ni[1] # n[1] # modsched # + imulq $A0[0],$m0 # m0=t[0]*n0 # modsched # + mov %rax,$Ni[0] # # modsched # + jmp .Lsqr4x_mont_outer + +.align 16 +.Lsqr4x_mont_outer: + xor $A0[1],$A0[1] + mul $m0 # n[0]*m0 + add %rax,$A0[0] # n[0]*m0+t[0] + mov $Ni[1],%rax + adc %rdx,$A0[1] + mov $n0,$m1 + + xor $A0[0],$A0[0] + add 8($tptr,$j),$A0[1] + adc \$0,$A0[0] + mul $m0 # n[1]*m0 + add %rax,$A0[1] # n[1]*m0+t[1] + mov $Ni[0],%rax + adc %rdx,$A0[0] + + imulq $A0[1],$m1 + + mov 16($nptr,$j),$Ni[0] # n[2] + xor $A1[1],$A1[1] + add $A0[1],$A1[0] + adc \$0,$A1[1] + mul $m1 # n[0]*m1 + add %rax,$A1[0] # n[0]*m1+"t[1]" + mov $Ni[0],%rax + adc %rdx,$A1[1] + mov $A1[0],8($tptr,$j) # "t[1]" + + xor $A0[1],$A0[1] + add 16($tptr,$j),$A0[0] + adc \$0,$A0[1] + mul $m0 # n[2]*m0 + add %rax,$A0[0] # n[2]*m0+t[2] + mov $Ni[1],%rax + adc %rdx,$A0[1] + + mov 24($nptr,$j),$Ni[1] # n[3] + xor $A1[0],$A1[0] + add $A0[0],$A1[1] + adc \$0,$A1[0] + mul $m1 # n[1]*m1 + add %rax,$A1[1] # n[1]*m1+"t[2]" + mov $Ni[1],%rax + adc %rdx,$A1[0] + mov $A1[1],16($tptr,$j) # "t[2]" + + xor $A0[0],$A0[0] + add 24($tptr,$j),$A0[1] + lea 32($j),$j + adc \$0,$A0[0] + mul $m0 # n[3]*m0 + add %rax,$A0[1] # n[3]*m0+t[3] + mov $Ni[0],%rax + adc %rdx,$A0[0] + jmp .Lsqr4x_mont_inner + +.align 16 +.Lsqr4x_mont_inner: + mov ($nptr,$j),$Ni[0] # n[4] + xor $A1[1],$A1[1] + add $A0[1],$A1[0] + adc \$0,$A1[1] + mul $m1 # n[2]*m1 + add %rax,$A1[0] # n[2]*m1+"t[3]" + mov $Ni[0],%rax + adc %rdx,$A1[1] + mov $A1[0],-8($tptr,$j) # "t[3]" + + xor $A0[1],$A0[1] + add ($tptr,$j),$A0[0] + adc \$0,$A0[1] + mul $m0 # n[4]*m0 + add %rax,$A0[0] # n[4]*m0+t[4] + mov $Ni[1],%rax + adc %rdx,$A0[1] + + mov 8($nptr,$j),$Ni[1] # n[5] + xor $A1[0],$A1[0] + add $A0[0],$A1[1] + adc \$0,$A1[0] + mul $m1 # n[3]*m1 + add %rax,$A1[1] # n[3]*m1+"t[4]" + mov $Ni[1],%rax + adc %rdx,$A1[0] + mov $A1[1],($tptr,$j) # "t[4]" + + xor $A0[0],$A0[0] + add 8($tptr,$j),$A0[1] + adc \$0,$A0[0] + mul $m0 # n[5]*m0 + add %rax,$A0[1] # n[5]*m0+t[5] + mov $Ni[0],%rax + adc %rdx,$A0[0] + + + mov 16($nptr,$j),$Ni[0] # n[6] + xor $A1[1],$A1[1] + add $A0[1],$A1[0] + adc \$0,$A1[1] + mul $m1 # n[4]*m1 + add %rax,$A1[0] # n[4]*m1+"t[5]" + mov $Ni[0],%rax + adc %rdx,$A1[1] + mov $A1[0],8($tptr,$j) # "t[5]" + + xor $A0[1],$A0[1] + add 16($tptr,$j),$A0[0] + adc \$0,$A0[1] + mul $m0 # n[6]*m0 + add %rax,$A0[0] # n[6]*m0+t[6] + mov $Ni[1],%rax + adc %rdx,$A0[1] + + mov 24($nptr,$j),$Ni[1] # n[7] + xor $A1[0],$A1[0] + add $A0[0],$A1[1] + adc \$0,$A1[0] + mul $m1 # n[5]*m1 + add %rax,$A1[1] # n[5]*m1+"t[6]" + mov $Ni[1],%rax + adc %rdx,$A1[0] + mov $A1[1],16($tptr,$j) # "t[6]" + + xor $A0[0],$A0[0] + add 24($tptr,$j),$A0[1] + lea 32($j),$j + adc \$0,$A0[0] + mul $m0 # n[7]*m0 + add %rax,$A0[1] # n[7]*m0+t[7] + mov $Ni[0],%rax + adc %rdx,$A0[0] + cmp \$0,$j + jne .Lsqr4x_mont_inner + + sub 0(%rsp),$j # $j=-$num # modsched # + mov $n0,$m0 # # modsched # + + xor $A1[1],$A1[1] + add $A0[1],$A1[0] + adc \$0,$A1[1] + mul $m1 # n[6]*m1 + add %rax,$A1[0] # n[6]*m1+"t[7]" + mov $Ni[1],%rax + adc %rdx,$A1[1] + mov $A1[0],-8($tptr) # "t[7]" + + xor $A0[1],$A0[1] + add ($tptr),$A0[0] # +t[8] + adc \$0,$A0[1] + mov 0($nptr,$j),$Ni[0] # n[0] # modsched # + add $topbit,$A0[0] + adc \$0,$A0[1] + + imulq 16($tptr,$j),$m0 # m0=t[0]*n0 # modsched # + xor $A1[0],$A1[0] + mov 8($nptr,$j),$Ni[1] # n[1] # modsched # + add $A0[0],$A1[1] + mov 16($tptr,$j),$A0[0] # t[0] # modsched # + adc \$0,$A1[0] + mul $m1 # n[7]*m1 + add %rax,$A1[1] # n[7]*m1+"t[8]" + mov $Ni[0],%rax # # modsched # + adc %rdx,$A1[0] + mov $A1[1],($tptr) # "t[8]" + + xor $topbit,$topbit + add 8($tptr),$A1[0] # +t[9] + adc $topbit,$topbit + add $A0[1],$A1[0] + lea 16($tptr),$tptr # "t[$num]>>128" + adc \$0,$topbit + mov $A1[0],-8($tptr) # "t[9]" + cmp 8(%rsp),$tptr # are we done? + jb .Lsqr4x_mont_outer + + mov 0(%rsp),$num # restore $num + mov $topbit,($tptr) # save $topbit +___ +} +############################################################## +# Post-condition, 4x unrolled copy from bn_mul_mont +# +{ +my ($tptr,$nptr)=("%rbx",$aptr); +my @ri=("%rax","%rdx","%r10","%r11"); +$code.=<<___; + mov 64(%rsp,$num),@ri[0] # tp[0] + lea 64(%rsp,$num),$tptr # upper half of t[2*$num] holds result + mov 40(%rsp),$nptr # restore $nptr + shr \$5,$num # num/4 + mov 8($tptr),@ri[1] # t[1] + xor $i,$i # i=0 and clear CF! + + mov 32(%rsp),$rptr # restore $rptr + sub 0($nptr),@ri[0] + mov 16($tptr),@ri[2] # t[2] + mov 24($tptr),@ri[3] # t[3] + sbb 8($nptr),@ri[1] + lea -1($num),$j # j=num/4-1 + jmp .Lsqr4x_sub +.align 16 +.Lsqr4x_sub: + mov @ri[0],0($rptr,$i,8) # rp[i]=tp[i]-np[i] + mov @ri[1],8($rptr,$i,8) # rp[i]=tp[i]-np[i] + sbb 16($nptr,$i,8),@ri[2] + mov 32($tptr,$i,8),@ri[0] # tp[i+1] + mov 40($tptr,$i,8),@ri[1] + sbb 24($nptr,$i,8),@ri[3] + mov @ri[2],16($rptr,$i,8) # rp[i]=tp[i]-np[i] + mov @ri[3],24($rptr,$i,8) # rp[i]=tp[i]-np[i] + sbb 32($nptr,$i,8),@ri[0] + mov 48($tptr,$i,8),@ri[2] + mov 56($tptr,$i,8),@ri[3] + sbb 40($nptr,$i,8),@ri[1] + lea 4($i),$i # i++ + dec $j # doesn't affect CF! + jnz .Lsqr4x_sub + + mov @ri[0],0($rptr,$i,8) # rp[i]=tp[i]-np[i] + mov 32($tptr,$i,8),@ri[0] # load overflow bit + sbb 16($nptr,$i,8),@ri[2] + mov @ri[1],8($rptr,$i,8) # rp[i]=tp[i]-np[i] + sbb 24($nptr,$i,8),@ri[3] + mov @ri[2],16($rptr,$i,8) # rp[i]=tp[i]-np[i] + + sbb \$0,@ri[0] # handle upmost overflow bit + mov @ri[3],24($rptr,$i,8) # rp[i]=tp[i]-np[i] + xor $i,$i # i=0 + and @ri[0],$tptr + not @ri[0] + mov $rptr,$nptr + and @ri[0],$nptr + lea -1($num),$j + or $nptr,$tptr # tp=borrow?tp:rp + + pxor %xmm0,%xmm0 + lea 64(%rsp,$num,8),$nptr + movdqu ($tptr),%xmm1 + lea ($nptr,$num,8),$nptr + movdqa %xmm0,64(%rsp) # zap lower half of temporary vector + movdqa %xmm0,($nptr) # zap upper half of temporary vector + movdqu %xmm1,($rptr) + jmp .Lsqr4x_copy +.align 16 +.Lsqr4x_copy: # copy or in-place refresh + movdqu 16($tptr,$i),%xmm2 + movdqu 32($tptr,$i),%xmm1 + movdqa %xmm0,80(%rsp,$i) # zap lower half of temporary vector + movdqa %xmm0,96(%rsp,$i) # zap lower half of temporary vector + movdqa %xmm0,16($nptr,$i) # zap upper half of temporary vector + movdqa %xmm0,32($nptr,$i) # zap upper half of temporary vector + movdqu %xmm2,16($rptr,$i) + movdqu %xmm1,32($rptr,$i) + lea 32($i),$i + dec $j + jnz .Lsqr4x_copy + + movdqu 16($tptr,$i),%xmm2 + movdqa %xmm0,80(%rsp,$i) # zap lower half of temporary vector + movdqa %xmm0,16($nptr,$i) # zap upper half of temporary vector + movdqu %xmm2,16($rptr,$i) +___ +} +$code.=<<___; + mov 56(%rsp),%rsi # restore %rsp + mov \$1,%rax + mov 0(%rsi),%r15 + mov 8(%rsi),%r14 + mov 16(%rsi),%r13 + mov 24(%rsi),%r12 + mov 32(%rsi),%rbp + mov 40(%rsi),%rbx + lea 48(%rsi),%rsp +.Lsqr4x_epilogue: + ret +.size bn_sqr4x_mont,.-bn_sqr4x_mont +___ +}}} +$code.=<<___; .asciz "Montgomery Multiplication for x86_64, CRYPTOGAMS by " .align 16 ___ @@ -228,9 +1511,9 @@ $disp="%r9"; $code.=<<___; .extern __imp_RtlVirtualUnwind -.type se_handler,\@abi-omnipotent +.type mul_handler,\@abi-omnipotent .align 16 -se_handler: +mul_handler: push %rsi push %rdi push %rbx @@ -245,15 +1528,20 @@ se_handler: mov 120($context),%rax # pull context->Rax mov 248($context),%rbx # pull context->Rip - lea .Lprologue(%rip),%r10 - cmp %r10,%rbx # context->Rip<.Lprologue - jb .Lin_prologue + mov 8($disp),%rsi # disp->ImageBase + mov 56($disp),%r11 # disp->HandlerData + + mov 0(%r11),%r10d # HandlerData[0] + lea (%rsi,%r10),%r10 # end of prologue label + cmp %r10,%rbx # context->RipRsp - lea .Lepilogue(%rip),%r10 - cmp %r10,%rbx # context->Rip>=.Lepilogue - jae .Lin_prologue + mov 4(%r11),%r10d # HandlerData[1] + lea (%rsi,%r10),%r10 # epilogue label + cmp %r10,%rbx # context->Rip>=epilogue label + jae .Lcommon_seh_tail mov 192($context),%r10 # pull $num mov 8(%rax,%r10,8),%rax # pull saved stack pointer @@ -272,7 +1560,53 @@ se_handler: mov %r14,232($context) # restore context->R14 mov %r15,240($context) # restore context->R15 -.Lin_prologue: + jmp .Lcommon_seh_tail +.size mul_handler,.-mul_handler + +.type sqr_handler,\@abi-omnipotent +.align 16 +sqr_handler: + push %rsi + push %rdi + push %rbx + push %rbp + push %r12 + push %r13 + push %r14 + push %r15 + pushfq + sub \$64,%rsp + + mov 120($context),%rax # pull context->Rax + mov 248($context),%rbx # pull context->Rip + + lea .Lsqr4x_body(%rip),%r10 + cmp %r10,%rbx # context->Rip<.Lsqr_body + jb .Lcommon_seh_tail + + mov 152($context),%rax # pull context->Rsp + + lea .Lsqr4x_epilogue(%rip),%r10 + cmp %r10,%rbx # context->Rip>=.Lsqr_epilogue + jae .Lcommon_seh_tail + + mov 56(%rax),%rax # pull saved stack pointer + lea 48(%rax),%rax + + mov -8(%rax),%rbx + mov -16(%rax),%rbp + mov -24(%rax),%r12 + mov -32(%rax),%r13 + mov -40(%rax),%r14 + mov -48(%rax),%r15 + mov %rbx,144($context) # restore context->Rbx + mov %rbp,160($context) # restore context->Rbp + mov %r12,216($context) # restore context->R12 + mov %r13,224($context) # restore context->R13 + mov %r14,232($context) # restore context->R14 + mov %r15,240($context) # restore context->R15 + +.Lcommon_seh_tail: mov 8(%rax),%rdi mov 16(%rax),%rsi mov %rax,152($context) # restore context->Rsp @@ -310,7 +1644,7 @@ se_handler: pop %rdi pop %rsi ret -.size se_handler,.-se_handler +.size sqr_handler,.-sqr_handler .section .pdata .align 4 @@ -318,11 +1652,27 @@ se_handler: .rva .LSEH_end_bn_mul_mont .rva .LSEH_info_bn_mul_mont + .rva .LSEH_begin_bn_mul4x_mont + .rva .LSEH_end_bn_mul4x_mont + .rva .LSEH_info_bn_mul4x_mont + + .rva .LSEH_begin_bn_sqr4x_mont + .rva .LSEH_end_bn_sqr4x_mont + .rva .LSEH_info_bn_sqr4x_mont + .section .xdata .align 8 .LSEH_info_bn_mul_mont: .byte 9,0,0,0 - .rva se_handler + .rva mul_handler + .rva .Lmul_body,.Lmul_epilogue # HandlerData[] +.LSEH_info_bn_mul4x_mont: + .byte 9,0,0,0 + .rva mul_handler + .rva .Lmul4x_body,.Lmul4x_epilogue # HandlerData[] +.LSEH_info_bn_sqr4x_mont: + .byte 9,0,0,0 + .rva sqr_handler ___ } diff --git a/crypto/openssl/crypto/bn/asm/x86_64-mont5.pl b/crypto/openssl/crypto/bn/asm/x86_64-mont5.pl new file mode 100755 index 0000000000..057cda28aa --- /dev/null +++ b/crypto/openssl/crypto/bn/asm/x86_64-mont5.pl @@ -0,0 +1,1070 @@ +#!/usr/bin/env perl + +# ==================================================================== +# Written by Andy Polyakov for the OpenSSL +# project. The module is, however, dual licensed under OpenSSL and +# CRYPTOGAMS licenses depending on where you obtain it. For further +# details see http://www.openssl.org/~appro/cryptogams/. +# ==================================================================== + +# August 2011. +# +# Companion to x86_64-mont.pl that optimizes cache-timing attack +# countermeasures. The subroutines are produced by replacing bp[i] +# references in their x86_64-mont.pl counterparts with cache-neutral +# references to powers table computed in BN_mod_exp_mont_consttime. +# In addition subroutine that scatters elements of the powers table +# is implemented, so that scatter-/gathering can be tuned without +# bn_exp.c modifications. + +$flavour = shift; +$output = shift; +if ($flavour =~ /\./) { $output = $flavour; undef $flavour; } + +$win64=0; $win64=1 if ($flavour =~ /[nm]asm|mingw64/ || $output =~ /\.asm$/); + +$0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1; +( $xlate="${dir}x86_64-xlate.pl" and -f $xlate ) or +( $xlate="${dir}../../perlasm/x86_64-xlate.pl" and -f $xlate) or +die "can't locate x86_64-xlate.pl"; + +open STDOUT,"| $^X $xlate $flavour $output"; + +# int bn_mul_mont_gather5( +$rp="%rdi"; # BN_ULONG *rp, +$ap="%rsi"; # const BN_ULONG *ap, +$bp="%rdx"; # const BN_ULONG *bp, +$np="%rcx"; # const BN_ULONG *np, +$n0="%r8"; # const BN_ULONG *n0, +$num="%r9"; # int num, + # int idx); # 0 to 2^5-1, "index" in $bp holding + # pre-computed powers of a', interlaced + # in such manner that b[0] is $bp[idx], + # b[1] is [2^5+idx], etc. +$lo0="%r10"; +$hi0="%r11"; +$hi1="%r13"; +$i="%r14"; +$j="%r15"; +$m0="%rbx"; +$m1="%rbp"; + +$code=<<___; +.text + +.globl bn_mul_mont_gather5 +.type bn_mul_mont_gather5,\@function,6 +.align 64 +bn_mul_mont_gather5: + test \$3,${num}d + jnz .Lmul_enter + cmp \$8,${num}d + jb .Lmul_enter + jmp .Lmul4x_enter + +.align 16 +.Lmul_enter: + mov ${num}d,${num}d + mov `($win64?56:8)`(%rsp),%r10d # load 7th argument + push %rbx + push %rbp + push %r12 + push %r13 + push %r14 + push %r15 +___ +$code.=<<___ if ($win64); + lea -0x28(%rsp),%rsp + movaps %xmm6,(%rsp) + movaps %xmm7,0x10(%rsp) +.Lmul_alloca: +___ +$code.=<<___; + mov %rsp,%rax + lea 2($num),%r11 + neg %r11 + lea (%rsp,%r11,8),%rsp # tp=alloca(8*(num+2)) + and \$-1024,%rsp # minimize TLB usage + + mov %rax,8(%rsp,$num,8) # tp[num+1]=%rsp +.Lmul_body: + mov $bp,%r12 # reassign $bp +___ + $bp="%r12"; + $STRIDE=2**5*8; # 5 is "window size" + $N=$STRIDE/4; # should match cache line size +$code.=<<___; + mov %r10,%r11 + shr \$`log($N/8)/log(2)`,%r10 + and \$`$N/8-1`,%r11 + not %r10 + lea .Lmagic_masks(%rip),%rax + and \$`2**5/($N/8)-1`,%r10 # 5 is "window size" + lea 96($bp,%r11,8),$bp # pointer within 1st cache line + movq 0(%rax,%r10,8),%xmm4 # set of masks denoting which + movq 8(%rax,%r10,8),%xmm5 # cache line contains element + movq 16(%rax,%r10,8),%xmm6 # denoted by 7th argument + movq 24(%rax,%r10,8),%xmm7 + + movq `0*$STRIDE/4-96`($bp),%xmm0 + movq `1*$STRIDE/4-96`($bp),%xmm1 + pand %xmm4,%xmm0 + movq `2*$STRIDE/4-96`($bp),%xmm2 + pand %xmm5,%xmm1 + movq `3*$STRIDE/4-96`($bp),%xmm3 + pand %xmm6,%xmm2 + por %xmm1,%xmm0 + pand %xmm7,%xmm3 + por %xmm2,%xmm0 + lea $STRIDE($bp),$bp + por %xmm3,%xmm0 + + movq %xmm0,$m0 # m0=bp[0] + + mov ($n0),$n0 # pull n0[0] value + mov ($ap),%rax + + xor $i,$i # i=0 + xor $j,$j # j=0 + + movq `0*$STRIDE/4-96`($bp),%xmm0 + movq `1*$STRIDE/4-96`($bp),%xmm1 + pand %xmm4,%xmm0 + movq `2*$STRIDE/4-96`($bp),%xmm2 + pand %xmm5,%xmm1 + + mov $n0,$m1 + mulq $m0 # ap[0]*bp[0] + mov %rax,$lo0 + mov ($np),%rax + + movq `3*$STRIDE/4-96`($bp),%xmm3 + pand %xmm6,%xmm2 + por %xmm1,%xmm0 + pand %xmm7,%xmm3 + + imulq $lo0,$m1 # "tp[0]"*n0 + mov %rdx,$hi0 + + por %xmm2,%xmm0 + lea $STRIDE($bp),$bp + por %xmm3,%xmm0 + + mulq $m1 # np[0]*m1 + add %rax,$lo0 # discarded + mov 8($ap),%rax + adc \$0,%rdx + mov %rdx,$hi1 + + lea 1($j),$j # j++ + jmp .L1st_enter + +.align 16 +.L1st: + add %rax,$hi1 + mov ($ap,$j,8),%rax + adc \$0,%rdx + add $hi0,$hi1 # np[j]*m1+ap[j]*bp[0] + mov $lo0,$hi0 + adc \$0,%rdx + mov $hi1,-16(%rsp,$j,8) # tp[j-1] + mov %rdx,$hi1 + +.L1st_enter: + mulq $m0 # ap[j]*bp[0] + add %rax,$hi0 + mov ($np,$j,8),%rax + adc \$0,%rdx + lea 1($j),$j # j++ + mov %rdx,$lo0 + + mulq $m1 # np[j]*m1 + cmp $num,$j + jne .L1st + + movq %xmm0,$m0 # bp[1] + + add %rax,$hi1 + mov ($ap),%rax # ap[0] + adc \$0,%rdx + add $hi0,$hi1 # np[j]*m1+ap[j]*bp[0] + adc \$0,%rdx + mov $hi1,-16(%rsp,$j,8) # tp[j-1] + mov %rdx,$hi1 + mov $lo0,$hi0 + + xor %rdx,%rdx + add $hi0,$hi1 + adc \$0,%rdx + mov $hi1,-8(%rsp,$num,8) + mov %rdx,(%rsp,$num,8) # store upmost overflow bit + + lea 1($i),$i # i++ + jmp .Louter +.align 16 +.Louter: + xor $j,$j # j=0 + mov $n0,$m1 + mov (%rsp),$lo0 + + movq `0*$STRIDE/4-96`($bp),%xmm0 + movq `1*$STRIDE/4-96`($bp),%xmm1 + pand %xmm4,%xmm0 + movq `2*$STRIDE/4-96`($bp),%xmm2 + pand %xmm5,%xmm1 + + mulq $m0 # ap[0]*bp[i] + add %rax,$lo0 # ap[0]*bp[i]+tp[0] + mov ($np),%rax + adc \$0,%rdx + + movq `3*$STRIDE/4-96`($bp),%xmm3 + pand %xmm6,%xmm2 + por %xmm1,%xmm0 + pand %xmm7,%xmm3 + + imulq $lo0,$m1 # tp[0]*n0 + mov %rdx,$hi0 + + por %xmm2,%xmm0 + lea $STRIDE($bp),$bp + por %xmm3,%xmm0 + + mulq $m1 # np[0]*m1 + add %rax,$lo0 # discarded + mov 8($ap),%rax + adc \$0,%rdx + mov 8(%rsp),$lo0 # tp[1] + mov %rdx,$hi1 + + lea 1($j),$j # j++ + jmp .Linner_enter + +.align 16 +.Linner: + add %rax,$hi1 + mov ($ap,$j,8),%rax + adc \$0,%rdx + add $lo0,$hi1 # np[j]*m1+ap[j]*bp[i]+tp[j] + mov (%rsp,$j,8),$lo0 + adc \$0,%rdx + mov $hi1,-16(%rsp,$j,8) # tp[j-1] + mov %rdx,$hi1 + +.Linner_enter: + mulq $m0 # ap[j]*bp[i] + add %rax,$hi0 + mov ($np,$j,8),%rax + adc \$0,%rdx + add $hi0,$lo0 # ap[j]*bp[i]+tp[j] + mov %rdx,$hi0 + adc \$0,$hi0 + lea 1($j),$j # j++ + + mulq $m1 # np[j]*m1 + cmp $num,$j + jne .Linner + + movq %xmm0,$m0 # bp[i+1] + + add %rax,$hi1 + mov ($ap),%rax # ap[0] + adc \$0,%rdx + add $lo0,$hi1 # np[j]*m1+ap[j]*bp[i]+tp[j] + mov (%rsp,$j,8),$lo0 + adc \$0,%rdx + mov $hi1,-16(%rsp,$j,8) # tp[j-1] + mov %rdx,$hi1 + + xor %rdx,%rdx + add $hi0,$hi1 + adc \$0,%rdx + add $lo0,$hi1 # pull upmost overflow bit + adc \$0,%rdx + mov $hi1,-8(%rsp,$num,8) + mov %rdx,(%rsp,$num,8) # store upmost overflow bit + + lea 1($i),$i # i++ + cmp $num,$i + jl .Louter + + xor $i,$i # i=0 and clear CF! + mov (%rsp),%rax # tp[0] + lea (%rsp),$ap # borrow ap for tp + mov $num,$j # j=num + jmp .Lsub +.align 16 +.Lsub: sbb ($np,$i,8),%rax + mov %rax,($rp,$i,8) # rp[i]=tp[i]-np[i] + mov 8($ap,$i,8),%rax # tp[i+1] + lea 1($i),$i # i++ + dec $j # doesnn't affect CF! + jnz .Lsub + + sbb \$0,%rax # handle upmost overflow bit + xor $i,$i + and %rax,$ap + not %rax + mov $rp,$np + and %rax,$np + mov $num,$j # j=num + or $np,$ap # ap=borrow?tp:rp +.align 16 +.Lcopy: # copy or in-place refresh + mov ($ap,$i,8),%rax + mov $i,(%rsp,$i,8) # zap temporary vector + mov %rax,($rp,$i,8) # rp[i]=tp[i] + lea 1($i),$i + sub \$1,$j + jnz .Lcopy + + mov 8(%rsp,$num,8),%rsi # restore %rsp + mov \$1,%rax +___ +$code.=<<___ if ($win64); + movaps (%rsi),%xmm6 + movaps 0x10(%rsi),%xmm7 + lea 0x28(%rsi),%rsi +___ +$code.=<<___; + mov (%rsi),%r15 + mov 8(%rsi),%r14 + mov 16(%rsi),%r13 + mov 24(%rsi),%r12 + mov 32(%rsi),%rbp + mov 40(%rsi),%rbx + lea 48(%rsi),%rsp +.Lmul_epilogue: + ret +.size bn_mul_mont_gather5,.-bn_mul_mont_gather5 +___ +{{{ +my @A=("%r10","%r11"); +my @N=("%r13","%rdi"); +$code.=<<___; +.type bn_mul4x_mont_gather5,\@function,6 +.align 16 +bn_mul4x_mont_gather5: +.Lmul4x_enter: + mov ${num}d,${num}d + mov `($win64?56:8)`(%rsp),%r10d # load 7th argument + push %rbx + push %rbp + push %r12 + push %r13 + push %r14 + push %r15 +___ +$code.=<<___ if ($win64); + lea -0x28(%rsp),%rsp + movaps %xmm6,(%rsp) + movaps %xmm7,0x10(%rsp) +.Lmul4x_alloca: +___ +$code.=<<___; + mov %rsp,%rax + lea 4($num),%r11 + neg %r11 + lea (%rsp,%r11,8),%rsp # tp=alloca(8*(num+4)) + and \$-1024,%rsp # minimize TLB usage + + mov %rax,8(%rsp,$num,8) # tp[num+1]=%rsp +.Lmul4x_body: + mov $rp,16(%rsp,$num,8) # tp[num+2]=$rp + mov %rdx,%r12 # reassign $bp +___ + $bp="%r12"; + $STRIDE=2**5*8; # 5 is "window size" + $N=$STRIDE/4; # should match cache line size +$code.=<<___; + mov %r10,%r11 + shr \$`log($N/8)/log(2)`,%r10 + and \$`$N/8-1`,%r11 + not %r10 + lea .Lmagic_masks(%rip),%rax + and \$`2**5/($N/8)-1`,%r10 # 5 is "window size" + lea 96($bp,%r11,8),$bp # pointer within 1st cache line + movq 0(%rax,%r10,8),%xmm4 # set of masks denoting which + movq 8(%rax,%r10,8),%xmm5 # cache line contains element + movq 16(%rax,%r10,8),%xmm6 # denoted by 7th argument + movq 24(%rax,%r10,8),%xmm7 + + movq `0*$STRIDE/4-96`($bp),%xmm0 + movq `1*$STRIDE/4-96`($bp),%xmm1 + pand %xmm4,%xmm0 + movq `2*$STRIDE/4-96`($bp),%xmm2 + pand %xmm5,%xmm1 + movq `3*$STRIDE/4-96`($bp),%xmm3 + pand %xmm6,%xmm2 + por %xmm1,%xmm0 + pand %xmm7,%xmm3 + por %xmm2,%xmm0 + lea $STRIDE($bp),$bp + por %xmm3,%xmm0 + + movq %xmm0,$m0 # m0=bp[0] + mov ($n0),$n0 # pull n0[0] value + mov ($ap),%rax + + xor $i,$i # i=0 + xor $j,$j # j=0 + + movq `0*$STRIDE/4-96`($bp),%xmm0 + movq `1*$STRIDE/4-96`($bp),%xmm1 + pand %xmm4,%xmm0 + movq `2*$STRIDE/4-96`($bp),%xmm2 + pand %xmm5,%xmm1 + + mov $n0,$m1 + mulq $m0 # ap[0]*bp[0] + mov %rax,$A[0] + mov ($np),%rax + + movq `3*$STRIDE/4-96`($bp),%xmm3 + pand %xmm6,%xmm2 + por %xmm1,%xmm0 + pand %xmm7,%xmm3 + + imulq $A[0],$m1 # "tp[0]"*n0 + mov %rdx,$A[1] + + por %xmm2,%xmm0 + lea $STRIDE($bp),$bp + por %xmm3,%xmm0 + + mulq $m1 # np[0]*m1 + add %rax,$A[0] # discarded + mov 8($ap),%rax + adc \$0,%rdx + mov %rdx,$N[1] + + mulq $m0 + add %rax,$A[1] + mov 8($np),%rax + adc \$0,%rdx + mov %rdx,$A[0] + + mulq $m1 + add %rax,$N[1] + mov 16($ap),%rax + adc \$0,%rdx + add $A[1],$N[1] + lea 4($j),$j # j++ + adc \$0,%rdx + mov $N[1],(%rsp) + mov %rdx,$N[0] + jmp .L1st4x +.align 16 +.L1st4x: + mulq $m0 # ap[j]*bp[0] + add %rax,$A[0] + mov -16($np,$j,8),%rax + adc \$0,%rdx + mov %rdx,$A[1] + + mulq $m1 # np[j]*m1 + add %rax,$N[0] + mov -8($ap,$j,8),%rax + adc \$0,%rdx + add $A[0],$N[0] # np[j]*m1+ap[j]*bp[0] + adc \$0,%rdx + mov $N[0],-24(%rsp,$j,8) # tp[j-1] + mov %rdx,$N[1] + + mulq $m0 # ap[j]*bp[0] + add %rax,$A[1] + mov -8($np,$j,8),%rax + adc \$0,%rdx + mov %rdx,$A[0] + + mulq $m1 # np[j]*m1 + add %rax,$N[1] + mov ($ap,$j,8),%rax + adc \$0,%rdx + add $A[1],$N[1] # np[j]*m1+ap[j]*bp[0] + adc \$0,%rdx + mov $N[1],-16(%rsp,$j,8) # tp[j-1] + mov %rdx,$N[0] + + mulq $m0 # ap[j]*bp[0] + add %rax,$A[0] + mov ($np,$j,8),%rax + adc \$0,%rdx + mov %rdx,$A[1] + + mulq $m1 # np[j]*m1 + add %rax,$N[0] + mov 8($ap,$j,8),%rax + adc \$0,%rdx + add $A[0],$N[0] # np[j]*m1+ap[j]*bp[0] + adc \$0,%rdx + mov $N[0],-8(%rsp,$j,8) # tp[j-1] + mov %rdx,$N[1] + + mulq $m0 # ap[j]*bp[0] + add %rax,$A[1] + mov 8($np,$j,8),%rax + adc \$0,%rdx + lea 4($j),$j # j++ + mov %rdx,$A[0] + + mulq $m1 # np[j]*m1 + add %rax,$N[1] + mov -16($ap,$j,8),%rax + adc \$0,%rdx + add $A[1],$N[1] # np[j]*m1+ap[j]*bp[0] + adc \$0,%rdx + mov $N[1],-32(%rsp,$j,8) # tp[j-1] + mov %rdx,$N[0] + cmp $num,$j + jl .L1st4x + + mulq $m0 # ap[j]*bp[0] + add %rax,$A[0] + mov -16($np,$j,8),%rax + adc \$0,%rdx + mov %rdx,$A[1] + + mulq $m1 # np[j]*m1 + add %rax,$N[0] + mov -8($ap,$j,8),%rax + adc \$0,%rdx + add $A[0],$N[0] # np[j]*m1+ap[j]*bp[0] + adc \$0,%rdx + mov $N[0],-24(%rsp,$j,8) # tp[j-1] + mov %rdx,$N[1] + + mulq $m0 # ap[j]*bp[0] + add %rax,$A[1] + mov -8($np,$j,8),%rax + adc \$0,%rdx + mov %rdx,$A[0] + + mulq $m1 # np[j]*m1 + add %rax,$N[1] + mov ($ap),%rax # ap[0] + adc \$0,%rdx + add $A[1],$N[1] # np[j]*m1+ap[j]*bp[0] + adc \$0,%rdx + mov $N[1],-16(%rsp,$j,8) # tp[j-1] + mov %rdx,$N[0] + + movq %xmm0,$m0 # bp[1] + + xor $N[1],$N[1] + add $A[0],$N[0] + adc \$0,$N[1] + mov $N[0],-8(%rsp,$j,8) + mov $N[1],(%rsp,$j,8) # store upmost overflow bit + + lea 1($i),$i # i++ +.align 4 +.Louter4x: + xor $j,$j # j=0 + movq `0*$STRIDE/4-96`($bp),%xmm0 + movq `1*$STRIDE/4-96`($bp),%xmm1 + pand %xmm4,%xmm0 + movq `2*$STRIDE/4-96`($bp),%xmm2 + pand %xmm5,%xmm1 + + mov (%rsp),$A[0] + mov $n0,$m1 + mulq $m0 # ap[0]*bp[i] + add %rax,$A[0] # ap[0]*bp[i]+tp[0] + mov ($np),%rax + adc \$0,%rdx + + movq `3*$STRIDE/4-96`($bp),%xmm3 + pand %xmm6,%xmm2 + por %xmm1,%xmm0 + pand %xmm7,%xmm3 + + imulq $A[0],$m1 # tp[0]*n0 + mov %rdx,$A[1] + + por %xmm2,%xmm0 + lea $STRIDE($bp),$bp + por %xmm3,%xmm0 + + mulq $m1 # np[0]*m1 + add %rax,$A[0] # "$N[0]", discarded + mov 8($ap),%rax + adc \$0,%rdx + mov %rdx,$N[1] + + mulq $m0 # ap[j]*bp[i] + add %rax,$A[1] + mov 8($np),%rax + adc \$0,%rdx + add 8(%rsp),$A[1] # +tp[1] + adc \$0,%rdx + mov %rdx,$A[0] + + mulq $m1 # np[j]*m1 + add %rax,$N[1] + mov 16($ap),%rax + adc \$0,%rdx + add $A[1],$N[1] # np[j]*m1+ap[j]*bp[i]+tp[j] + lea 4($j),$j # j+=2 + adc \$0,%rdx + mov %rdx,$N[0] + jmp .Linner4x +.align 16 +.Linner4x: + mulq $m0 # ap[j]*bp[i] + add %rax,$A[0] + mov -16($np,$j,8),%rax + adc \$0,%rdx + add -16(%rsp,$j,8),$A[0] # ap[j]*bp[i]+tp[j] + adc \$0,%rdx + mov %rdx,$A[1] + + mulq $m1 # np[j]*m1 + add %rax,$N[0] + mov -8($ap,$j,8),%rax + adc \$0,%rdx + add $A[0],$N[0] + adc \$0,%rdx + mov $N[1],-32(%rsp,$j,8) # tp[j-1] + mov %rdx,$N[1] + + mulq $m0 # ap[j]*bp[i] + add %rax,$A[1] + mov -8($np,$j,8),%rax + adc \$0,%rdx + add -8(%rsp,$j,8),$A[1] + adc \$0,%rdx + mov %rdx,$A[0] + + mulq $m1 # np[j]*m1 + add %rax,$N[1] + mov ($ap,$j,8),%rax + adc \$0,%rdx + add $A[1],$N[1] + adc \$0,%rdx + mov $N[0],-24(%rsp,$j,8) # tp[j-1] + mov %rdx,$N[0] + + mulq $m0 # ap[j]*bp[i] + add %rax,$A[0] + mov ($np,$j,8),%rax + adc \$0,%rdx + add (%rsp,$j,8),$A[0] # ap[j]*bp[i]+tp[j] + adc \$0,%rdx + mov %rdx,$A[1] + + mulq $m1 # np[j]*m1 + add %rax,$N[0] + mov 8($ap,$j,8),%rax + adc \$0,%rdx + add $A[0],$N[0] + adc \$0,%rdx + mov $N[1],-16(%rsp,$j,8) # tp[j-1] + mov %rdx,$N[1] + + mulq $m0 # ap[j]*bp[i] + add %rax,$A[1] + mov 8($np,$j,8),%rax + adc \$0,%rdx + add 8(%rsp,$j,8),$A[1] + adc \$0,%rdx + lea 4($j),$j # j++ + mov %rdx,$A[0] + + mulq $m1 # np[j]*m1 + add %rax,$N[1] + mov -16($ap,$j,8),%rax + adc \$0,%rdx + add $A[1],$N[1] + adc \$0,%rdx + mov $N[0],-40(%rsp,$j,8) # tp[j-1] + mov %rdx,$N[0] + cmp $num,$j + jl .Linner4x + + mulq $m0 # ap[j]*bp[i] + add %rax,$A[0] + mov -16($np,$j,8),%rax + adc \$0,%rdx + add -16(%rsp,$j,8),$A[0] # ap[j]*bp[i]+tp[j] + adc \$0,%rdx + mov %rdx,$A[1] + + mulq $m1 # np[j]*m1 + add %rax,$N[0] + mov -8($ap,$j,8),%rax + adc \$0,%rdx + add $A[0],$N[0] + adc \$0,%rdx + mov $N[1],-32(%rsp,$j,8) # tp[j-1] + mov %rdx,$N[1] + + mulq $m0 # ap[j]*bp[i] + add %rax,$A[1] + mov -8($np,$j,8),%rax + adc \$0,%rdx + add -8(%rsp,$j,8),$A[1] + adc \$0,%rdx + lea 1($i),$i # i++ + mov %rdx,$A[0] + + mulq $m1 # np[j]*m1 + add %rax,$N[1] + mov ($ap),%rax # ap[0] + adc \$0,%rdx + add $A[1],$N[1] + adc \$0,%rdx + mov $N[0],-24(%rsp,$j,8) # tp[j-1] + mov %rdx,$N[0] + + movq %xmm0,$m0 # bp[i+1] + mov $N[1],-16(%rsp,$j,8) # tp[j-1] + + xor $N[1],$N[1] + add $A[0],$N[0] + adc \$0,$N[1] + add (%rsp,$num,8),$N[0] # pull upmost overflow bit + adc \$0,$N[1] + mov $N[0],-8(%rsp,$j,8) + mov $N[1],(%rsp,$j,8) # store upmost overflow bit + + cmp $num,$i + jl .Louter4x +___ +{ +my @ri=("%rax","%rdx",$m0,$m1); +$code.=<<___; + mov 16(%rsp,$num,8),$rp # restore $rp + mov 0(%rsp),@ri[0] # tp[0] + pxor %xmm0,%xmm0 + mov 8(%rsp),@ri[1] # tp[1] + shr \$2,$num # num/=4 + lea (%rsp),$ap # borrow ap for tp + xor $i,$i # i=0 and clear CF! + + sub 0($np),@ri[0] + mov 16($ap),@ri[2] # tp[2] + mov 24($ap),@ri[3] # tp[3] + sbb 8($np),@ri[1] + lea -1($num),$j # j=num/4-1 + jmp .Lsub4x +.align 16 +.Lsub4x: + mov @ri[0],0($rp,$i,8) # rp[i]=tp[i]-np[i] + mov @ri[1],8($rp,$i,8) # rp[i]=tp[i]-np[i] + sbb 16($np,$i,8),@ri[2] + mov 32($ap,$i,8),@ri[0] # tp[i+1] + mov 40($ap,$i,8),@ri[1] + sbb 24($np,$i,8),@ri[3] + mov @ri[2],16($rp,$i,8) # rp[i]=tp[i]-np[i] + mov @ri[3],24($rp,$i,8) # rp[i]=tp[i]-np[i] + sbb 32($np,$i,8),@ri[0] + mov 48($ap,$i,8),@ri[2] + mov 56($ap,$i,8),@ri[3] + sbb 40($np,$i,8),@ri[1] + lea 4($i),$i # i++ + dec $j # doesnn't affect CF! + jnz .Lsub4x + + mov @ri[0],0($rp,$i,8) # rp[i]=tp[i]-np[i] + mov 32($ap,$i,8),@ri[0] # load overflow bit + sbb 16($np,$i,8),@ri[2] + mov @ri[1],8($rp,$i,8) # rp[i]=tp[i]-np[i] + sbb 24($np,$i,8),@ri[3] + mov @ri[2],16($rp,$i,8) # rp[i]=tp[i]-np[i] + + sbb \$0,@ri[0] # handle upmost overflow bit + mov @ri[3],24($rp,$i,8) # rp[i]=tp[i]-np[i] + xor $i,$i # i=0 + and @ri[0],$ap + not @ri[0] + mov $rp,$np + and @ri[0],$np + lea -1($num),$j + or $np,$ap # ap=borrow?tp:rp + + movdqu ($ap),%xmm1 + movdqa %xmm0,(%rsp) + movdqu %xmm1,($rp) + jmp .Lcopy4x +.align 16 +.Lcopy4x: # copy or in-place refresh + movdqu 16($ap,$i),%xmm2 + movdqu 32($ap,$i),%xmm1 + movdqa %xmm0,16(%rsp,$i) + movdqu %xmm2,16($rp,$i) + movdqa %xmm0,32(%rsp,$i) + movdqu %xmm1,32($rp,$i) + lea 32($i),$i + dec $j + jnz .Lcopy4x + + shl \$2,$num + movdqu 16($ap,$i),%xmm2 + movdqa %xmm0,16(%rsp,$i) + movdqu %xmm2,16($rp,$i) +___ +} +$code.=<<___; + mov 8(%rsp,$num,8),%rsi # restore %rsp + mov \$1,%rax +___ +$code.=<<___ if ($win64); + movaps (%rsi),%xmm6 + movaps 0x10(%rsi),%xmm7 + lea 0x28(%rsi),%rsi +___ +$code.=<<___; + mov (%rsi),%r15 + mov 8(%rsi),%r14 + mov 16(%rsi),%r13 + mov 24(%rsi),%r12 + mov 32(%rsi),%rbp + mov 40(%rsi),%rbx + lea 48(%rsi),%rsp +.Lmul4x_epilogue: + ret +.size bn_mul4x_mont_gather5,.-bn_mul4x_mont_gather5 +___ +}}} + +{ +my ($inp,$num,$tbl,$idx)=$win64?("%rcx","%rdx","%r8", "%r9") : # Win64 order + ("%rdi","%rsi","%rdx","%rcx"); # Unix order +my $out=$inp; +my $STRIDE=2**5*8; +my $N=$STRIDE/4; + +$code.=<<___; +.globl bn_scatter5 +.type bn_scatter5,\@abi-omnipotent +.align 16 +bn_scatter5: + cmp \$0, $num + jz .Lscatter_epilogue + lea ($tbl,$idx,8),$tbl +.Lscatter: + mov ($inp),%rax + lea 8($inp),$inp + mov %rax,($tbl) + lea 32*8($tbl),$tbl + sub \$1,$num + jnz .Lscatter +.Lscatter_epilogue: + ret +.size bn_scatter5,.-bn_scatter5 + +.globl bn_gather5 +.type bn_gather5,\@abi-omnipotent +.align 16 +bn_gather5: +___ +$code.=<<___ if ($win64); +.LSEH_begin_bn_gather5: + # I can't trust assembler to use specific encoding:-( + .byte 0x48,0x83,0xec,0x28 #sub \$0x28,%rsp + .byte 0x0f,0x29,0x34,0x24 #movaps %xmm6,(%rsp) + .byte 0x0f,0x29,0x7c,0x24,0x10 #movdqa %xmm7,0x10(%rsp) +___ +$code.=<<___; + mov $idx,%r11 + shr \$`log($N/8)/log(2)`,$idx + and \$`$N/8-1`,%r11 + not $idx + lea .Lmagic_masks(%rip),%rax + and \$`2**5/($N/8)-1`,$idx # 5 is "window size" + lea 96($tbl,%r11,8),$tbl # pointer within 1st cache line + movq 0(%rax,$idx,8),%xmm4 # set of masks denoting which + movq 8(%rax,$idx,8),%xmm5 # cache line contains element + movq 16(%rax,$idx,8),%xmm6 # denoted by 7th argument + movq 24(%rax,$idx,8),%xmm7 + jmp .Lgather +.align 16 +.Lgather: + movq `0*$STRIDE/4-96`($tbl),%xmm0 + movq `1*$STRIDE/4-96`($tbl),%xmm1 + pand %xmm4,%xmm0 + movq `2*$STRIDE/4-96`($tbl),%xmm2 + pand %xmm5,%xmm1 + movq `3*$STRIDE/4-96`($tbl),%xmm3 + pand %xmm6,%xmm2 + por %xmm1,%xmm0 + pand %xmm7,%xmm3 + por %xmm2,%xmm0 + lea $STRIDE($tbl),$tbl + por %xmm3,%xmm0 + + movq %xmm0,($out) # m0=bp[0] + lea 8($out),$out + sub \$1,$num + jnz .Lgather +___ +$code.=<<___ if ($win64); + movaps %xmm6,(%rsp) + movaps %xmm7,0x10(%rsp) + lea 0x28(%rsp),%rsp +___ +$code.=<<___; + ret +.LSEH_end_bn_gather5: +.size bn_gather5,.-bn_gather5 +___ +} +$code.=<<___; +.align 64 +.Lmagic_masks: + .long 0,0, 0,0, 0,0, -1,-1 + .long 0,0, 0,0, 0,0, 0,0 +.asciz "Montgomery Multiplication with scatter/gather for x86_64, CRYPTOGAMS by " +___ + +# EXCEPTION_DISPOSITION handler (EXCEPTION_RECORD *rec,ULONG64 frame, +# CONTEXT *context,DISPATCHER_CONTEXT *disp) +if ($win64) { +$rec="%rcx"; +$frame="%rdx"; +$context="%r8"; +$disp="%r9"; + +$code.=<<___; +.extern __imp_RtlVirtualUnwind +.type mul_handler,\@abi-omnipotent +.align 16 +mul_handler: + push %rsi + push %rdi + push %rbx + push %rbp + push %r12 + push %r13 + push %r14 + push %r15 + pushfq + sub \$64,%rsp + + mov 120($context),%rax # pull context->Rax + mov 248($context),%rbx # pull context->Rip + + mov 8($disp),%rsi # disp->ImageBase + mov 56($disp),%r11 # disp->HandlerData + + mov 0(%r11),%r10d # HandlerData[0] + lea (%rsi,%r10),%r10 # end of prologue label + cmp %r10,%rbx # context->RipRipRsp + + mov 8(%r11),%r10d # HandlerData[2] + lea (%rsi,%r10),%r10 # epilogue label + cmp %r10,%rbx # context->Rip>=epilogue label + jae .Lcommon_seh_tail + + mov 192($context),%r10 # pull $num + mov 8(%rax,%r10,8),%rax # pull saved stack pointer + + movaps (%rax),%xmm0 + movaps 16(%rax),%xmm1 + lea `40+48`(%rax),%rax + + mov -8(%rax),%rbx + mov -16(%rax),%rbp + mov -24(%rax),%r12 + mov -32(%rax),%r13 + mov -40(%rax),%r14 + mov -48(%rax),%r15 + mov %rbx,144($context) # restore context->Rbx + mov %rbp,160($context) # restore context->Rbp + mov %r12,216($context) # restore context->R12 + mov %r13,224($context) # restore context->R13 + mov %r14,232($context) # restore context->R14 + mov %r15,240($context) # restore context->R15 + movups %xmm0,512($context) # restore context->Xmm6 + movups %xmm1,528($context) # restore context->Xmm7 + +.Lcommon_seh_tail: + mov 8(%rax),%rdi + mov 16(%rax),%rsi + mov %rax,152($context) # restore context->Rsp + mov %rsi,168($context) # restore context->Rsi + mov %rdi,176($context) # restore context->Rdi + + mov 40($disp),%rdi # disp->ContextRecord + mov $context,%rsi # context + mov \$154,%ecx # sizeof(CONTEXT) + .long 0xa548f3fc # cld; rep movsq + + mov $disp,%rsi + xor %rcx,%rcx # arg1, UNW_FLAG_NHANDLER + mov 8(%rsi),%rdx # arg2, disp->ImageBase + mov 0(%rsi),%r8 # arg3, disp->ControlPc + mov 16(%rsi),%r9 # arg4, disp->FunctionEntry + mov 40(%rsi),%r10 # disp->ContextRecord + lea 56(%rsi),%r11 # &disp->HandlerData + lea 24(%rsi),%r12 # &disp->EstablisherFrame + mov %r10,32(%rsp) # arg5 + mov %r11,40(%rsp) # arg6 + mov %r12,48(%rsp) # arg7 + mov %rcx,56(%rsp) # arg8, (NULL) + call *__imp_RtlVirtualUnwind(%rip) + + mov \$1,%eax # ExceptionContinueSearch + add \$64,%rsp + popfq + pop %r15 + pop %r14 + pop %r13 + pop %r12 + pop %rbp + pop %rbx + pop %rdi + pop %rsi + ret +.size mul_handler,.-mul_handler + +.section .pdata +.align 4 + .rva .LSEH_begin_bn_mul_mont_gather5 + .rva .LSEH_end_bn_mul_mont_gather5 + .rva .LSEH_info_bn_mul_mont_gather5 + + .rva .LSEH_begin_bn_mul4x_mont_gather5 + .rva .LSEH_end_bn_mul4x_mont_gather5 + .rva .LSEH_info_bn_mul4x_mont_gather5 + + .rva .LSEH_begin_bn_gather5 + .rva .LSEH_end_bn_gather5 + .rva .LSEH_info_bn_gather5 + +.section .xdata +.align 8 +.LSEH_info_bn_mul_mont_gather5: + .byte 9,0,0,0 + .rva mul_handler + .rva .Lmul_alloca,.Lmul_body,.Lmul_epilogue # HandlerData[] +.align 8 +.LSEH_info_bn_mul4x_mont_gather5: + .byte 9,0,0,0 + .rva mul_handler + .rva .Lmul4x_alloca,.Lmul4x_body,.Lmul4x_epilogue # HandlerData[] +.align 8 +.LSEH_info_bn_gather5: + .byte 0x01,0x0d,0x05,0x00 + .byte 0x0d,0x78,0x01,0x00 #movaps 0x10(rsp),xmm7 + .byte 0x08,0x68,0x00,0x00 #movaps (rsp),xmm6 + .byte 0x04,0x42,0x00,0x00 #sub rsp,0x28 +.align 8 +___ +} + +$code =~ s/\`([^\`]*)\`/eval($1)/gem; + +print $code; +close STDOUT; diff --git a/crypto/openssl/crypto/bn/bn.h b/crypto/openssl/crypto/bn/bn.h index a0bc47837d..f34248ec4f 100644 --- a/crypto/openssl/crypto/bn/bn.h +++ b/crypto/openssl/crypto/bn/bn.h @@ -558,6 +558,17 @@ int BN_is_prime_ex(const BIGNUM *p,int nchecks, BN_CTX *ctx, BN_GENCB *cb); int BN_is_prime_fasttest_ex(const BIGNUM *p,int nchecks, BN_CTX *ctx, int do_trial_division, BN_GENCB *cb); +int BN_X931_generate_Xpq(BIGNUM *Xp, BIGNUM *Xq, int nbits, BN_CTX *ctx); + +int BN_X931_derive_prime_ex(BIGNUM *p, BIGNUM *p1, BIGNUM *p2, + const BIGNUM *Xp, const BIGNUM *Xp1, const BIGNUM *Xp2, + const BIGNUM *e, BN_CTX *ctx, BN_GENCB *cb); +int BN_X931_generate_prime_ex(BIGNUM *p, BIGNUM *p1, BIGNUM *p2, + BIGNUM *Xp1, BIGNUM *Xp2, + const BIGNUM *Xp, + const BIGNUM *e, BN_CTX *ctx, + BN_GENCB *cb); + BN_MONT_CTX *BN_MONT_CTX_new(void ); void BN_MONT_CTX_init(BN_MONT_CTX *ctx); int BN_mod_mul_montgomery(BIGNUM *r,const BIGNUM *a,const BIGNUM *b, @@ -612,6 +623,8 @@ int BN_mod_exp_recp(BIGNUM *r, const BIGNUM *a, const BIGNUM *p, int BN_div_recp(BIGNUM *dv, BIGNUM *rem, const BIGNUM *m, BN_RECP_CTX *recp, BN_CTX *ctx); +#ifndef OPENSSL_NO_EC2M + /* Functions for arithmetic over binary polynomials represented by BIGNUMs. * * The BIGNUM::neg property of BIGNUMs representing binary polynomials is @@ -663,6 +676,8 @@ int BN_GF2m_mod_solve_quad_arr(BIGNUM *r, const BIGNUM *a, int BN_GF2m_poly2arr(const BIGNUM *a, int p[], int max); int BN_GF2m_arr2poly(const int p[], BIGNUM *a); +#endif + /* faster mod functions for the 'NIST primes' * 0 <= a < p^2 */ int BN_nist_mod_192(BIGNUM *r, const BIGNUM *a, const BIGNUM *p, BN_CTX *ctx); diff --git a/crypto/openssl/crypto/bn/bn_div.c b/crypto/openssl/crypto/bn/bn_div.c index 802a43d642..52b3304293 100644 --- a/crypto/openssl/crypto/bn/bn_div.c +++ b/crypto/openssl/crypto/bn/bn_div.c @@ -169,15 +169,13 @@ int BN_div(BIGNUM *dv, BIGNUM *rem, const BIGNUM *m, const BIGNUM *d, #endif /* OPENSSL_NO_ASM */ -/* BN_div[_no_branch] computes dv := num / divisor, rounding towards +/* BN_div computes dv := num / divisor, rounding towards * zero, and sets up rm such that dv*divisor + rm = num holds. * Thus: * dv->neg == num->neg ^ divisor->neg (unless the result is zero) * rm->neg == num->neg (unless the remainder is zero) * If 'dv' or 'rm' is NULL, the respective value is not returned. */ -static int BN_div_no_branch(BIGNUM *dv, BIGNUM *rm, const BIGNUM *num, - const BIGNUM *divisor, BN_CTX *ctx); int BN_div(BIGNUM *dv, BIGNUM *rm, const BIGNUM *num, const BIGNUM *divisor, BN_CTX *ctx) { @@ -186,6 +184,7 @@ int BN_div(BIGNUM *dv, BIGNUM *rm, const BIGNUM *num, const BIGNUM *divisor, BN_ULONG *resp,*wnump; BN_ULONG d0,d1; int num_n,div_n; + int no_branch=0; /* Invalid zero-padding would have particularly bad consequences * in the case of 'num', so don't just rely on bn_check_top() for this one @@ -200,7 +199,7 @@ int BN_div(BIGNUM *dv, BIGNUM *rm, const BIGNUM *num, const BIGNUM *divisor, if ((BN_get_flags(num, BN_FLG_CONSTTIME) != 0) || (BN_get_flags(divisor, BN_FLG_CONSTTIME) != 0)) { - return BN_div_no_branch(dv, rm, num, divisor, ctx); + no_branch=1; } bn_check_top(dv); @@ -214,7 +213,7 @@ int BN_div(BIGNUM *dv, BIGNUM *rm, const BIGNUM *num, const BIGNUM *divisor, return(0); } - if (BN_ucmp(num,divisor) < 0) + if (!no_branch && BN_ucmp(num,divisor) < 0) { if (rm != NULL) { if (BN_copy(rm,num) == NULL) return(0); } @@ -239,242 +238,25 @@ int BN_div(BIGNUM *dv, BIGNUM *rm, const BIGNUM *num, const BIGNUM *divisor, norm_shift+=BN_BITS2; if (!(BN_lshift(snum,num,norm_shift))) goto err; snum->neg=0; - div_n=sdiv->top; - num_n=snum->top; - loop=num_n-div_n; - /* Lets setup a 'window' into snum - * This is the part that corresponds to the current - * 'area' being divided */ - wnum.neg = 0; - wnum.d = &(snum->d[loop]); - wnum.top = div_n; - /* only needed when BN_ucmp messes up the values between top and max */ - wnum.dmax = snum->dmax - loop; /* so we don't step out of bounds */ - - /* Get the top 2 words of sdiv */ - /* div_n=sdiv->top; */ - d0=sdiv->d[div_n-1]; - d1=(div_n == 1)?0:sdiv->d[div_n-2]; - - /* pointer to the 'top' of snum */ - wnump= &(snum->d[num_n-1]); - - /* Setup to 'res' */ - res->neg= (num->neg^divisor->neg); - if (!bn_wexpand(res,(loop+1))) goto err; - res->top=loop; - resp= &(res->d[loop-1]); - - /* space for temp */ - if (!bn_wexpand(tmp,(div_n+1))) goto err; - if (BN_ucmp(&wnum,sdiv) >= 0) + if (no_branch) { - /* If BN_DEBUG_RAND is defined BN_ucmp changes (via - * bn_pollute) the const bignum arguments => - * clean the values between top and max again */ - bn_clear_top2max(&wnum); - bn_sub_words(wnum.d, wnum.d, sdiv->d, div_n); - *resp=1; - } - else - res->top--; - /* if res->top == 0 then clear the neg value otherwise decrease - * the resp pointer */ - if (res->top == 0) - res->neg = 0; - else - resp--; - - for (i=0; i 0x%08X\n", - n0, n1, d0, q); -#endif -#endif - -#ifndef REMAINDER_IS_ALREADY_CALCULATED - /* - * rem doesn't have to be BN_ULLONG. The least we - * know it's less that d0, isn't it? - */ - rem=(n1-q*d0)&BN_MASK2; -#endif - t2=(BN_ULLONG)d1*q; - - for (;;) - { - if (t2 <= ((((BN_ULLONG)rem)< 0x%08X\n", - n0, n1, d0, q); -#endif -#ifndef REMAINDER_IS_ALREADY_CALCULATED - rem=(n1-q*d0)&BN_MASK2; -#endif - -#if defined(BN_UMULT_LOHI) - BN_UMULT_LOHI(t2l,t2h,d1,q); -#elif defined(BN_UMULT_HIGH) - t2l = d1 * q; - t2h = BN_UMULT_HIGH(d1,q); -#else + /* Since we don't know whether snum is larger than sdiv, + * we pad snum with enough zeroes without changing its + * value. + */ + if (snum->top <= sdiv->top+1) { - BN_ULONG ql, qh; - t2l=LBITS(d1); t2h=HBITS(d1); - ql =LBITS(q); qh =HBITS(q); - mul64(t2l,t2h,ql,qh); /* t2=(BN_ULLONG)d1*q; */ + if (bn_wexpand(snum, sdiv->top + 2) == NULL) goto err; + for (i = snum->top; i < sdiv->top + 2; i++) snum->d[i] = 0; + snum->top = sdiv->top + 2; } -#endif - - for (;;) - { - if ((t2h < rem) || - ((t2h == rem) && (t2l <= wnump[-2]))) - break; - q--; - rem += d0; - if (rem < d0) break; /* don't let rem overflow */ - if (t2l < d1) t2h--; t2l -= d1; - } -#endif /* !BN_LLONG */ - } -#endif /* !BN_DIV3W */ - - l0=bn_mul_words(tmp->d,sdiv->d,div_n,q); - tmp->d[div_n]=l0; - wnum.d--; - /* ingore top values of the bignums just sub the two - * BN_ULONG arrays with bn_sub_words */ - if (bn_sub_words(wnum.d, wnum.d, tmp->d, div_n+1)) + else { - /* Note: As we have considered only the leading - * two BN_ULONGs in the calculation of q, sdiv * q - * might be greater than wnum (but then (q-1) * sdiv - * is less or equal than wnum) - */ - q--; - if (bn_add_words(wnum.d, wnum.d, sdiv->d, div_n)) - /* we can't have an overflow here (assuming - * that q != 0, but if q == 0 then tmp is - * zero anyway) */ - (*wnump)++; + if (bn_wexpand(snum, snum->top + 1) == NULL) goto err; + snum->d[snum->top] = 0; + snum->top ++; } - /* store part of the result */ - *resp = q; - } - bn_correct_top(snum); - if (rm != NULL) - { - /* Keep a copy of the neg flag in num because if rm==num - * BN_rshift() will overwrite it. - */ - int neg = num->neg; - BN_rshift(rm,snum,norm_shift); - if (!BN_is_zero(rm)) - rm->neg = neg; - bn_check_top(rm); - } - BN_CTX_end(ctx); - return(1); -err: - bn_check_top(rm); - BN_CTX_end(ctx); - return(0); - } - - -/* BN_div_no_branch is a special version of BN_div. It does not contain - * branches that may leak sensitive information. - */ -static int BN_div_no_branch(BIGNUM *dv, BIGNUM *rm, const BIGNUM *num, - const BIGNUM *divisor, BN_CTX *ctx) - { - int norm_shift,i,loop; - BIGNUM *tmp,wnum,*snum,*sdiv,*res; - BN_ULONG *resp,*wnump; - BN_ULONG d0,d1; - int num_n,div_n; - - bn_check_top(dv); - bn_check_top(rm); - /* bn_check_top(num); */ /* 'num' has been checked in BN_div() */ - bn_check_top(divisor); - - if (BN_is_zero(divisor)) - { - BNerr(BN_F_BN_DIV_NO_BRANCH,BN_R_DIV_BY_ZERO); - return(0); - } - - BN_CTX_start(ctx); - tmp=BN_CTX_get(ctx); - snum=BN_CTX_get(ctx); - sdiv=BN_CTX_get(ctx); - if (dv == NULL) - res=BN_CTX_get(ctx); - else res=dv; - if (sdiv == NULL || res == NULL) goto err; - - /* First we normalise the numbers */ - norm_shift=BN_BITS2-((BN_num_bits(divisor))%BN_BITS2); - if (!(BN_lshift(sdiv,divisor,norm_shift))) goto err; - sdiv->neg=0; - norm_shift+=BN_BITS2; - if (!(BN_lshift(snum,num,norm_shift))) goto err; - snum->neg=0; - - /* Since we don't know whether snum is larger than sdiv, - * we pad snum with enough zeroes without changing its - * value. - */ - if (snum->top <= sdiv->top+1) - { - if (bn_wexpand(snum, sdiv->top + 2) == NULL) goto err; - for (i = snum->top; i < sdiv->top + 2; i++) snum->d[i] = 0; - snum->top = sdiv->top + 2; - } - else - { - if (bn_wexpand(snum, snum->top + 1) == NULL) goto err; - snum->d[snum->top] = 0; - snum->top ++; } div_n=sdiv->top; @@ -500,12 +282,27 @@ static int BN_div_no_branch(BIGNUM *dv, BIGNUM *rm, const BIGNUM *num, /* Setup to 'res' */ res->neg= (num->neg^divisor->neg); if (!bn_wexpand(res,(loop+1))) goto err; - res->top=loop-1; + res->top=loop-no_branch; resp= &(res->d[loop-1]); /* space for temp */ if (!bn_wexpand(tmp,(div_n+1))) goto err; + if (!no_branch) + { + if (BN_ucmp(&wnum,sdiv) >= 0) + { + /* If BN_DEBUG_RAND is defined BN_ucmp changes (via + * bn_pollute) the const bignum arguments => + * clean the values between top and max again */ + bn_clear_top2max(&wnum); + bn_sub_words(wnum.d, wnum.d, sdiv->d, div_n); + *resp=1; + } + else + res->top--; + } + /* if res->top == 0 then clear the neg value otherwise decrease * the resp pointer */ if (res->top == 0) @@ -638,7 +435,7 @@ X) -> 0x%08X\n", rm->neg = neg; bn_check_top(rm); } - bn_correct_top(res); + if (no_branch) bn_correct_top(res); BN_CTX_end(ctx); return(1); err: @@ -646,5 +443,4 @@ err: BN_CTX_end(ctx); return(0); } - #endif diff --git a/crypto/openssl/crypto/bn/bn_exp.c b/crypto/openssl/crypto/bn/bn_exp.c index d9b6c737fc..2abf6fd678 100644 --- a/crypto/openssl/crypto/bn/bn_exp.c +++ b/crypto/openssl/crypto/bn/bn_exp.c @@ -113,6 +113,18 @@ #include "cryptlib.h" #include "bn_lcl.h" +#include +#ifdef _WIN32 +# include +# ifndef alloca +# define alloca _alloca +# endif +#elif defined(__GNUC__) +# ifndef alloca +# define alloca(s) __builtin_alloca((s)) +# endif +#endif + /* maximum precomputation table size for *variable* sliding windows */ #define TABLE_SIZE 32 @@ -522,23 +534,17 @@ err: * as cache lines are concerned. The following functions are used to transfer a BIGNUM * from/to that table. */ -static int MOD_EXP_CTIME_COPY_TO_PREBUF(BIGNUM *b, int top, unsigned char *buf, int idx, int width) +static int MOD_EXP_CTIME_COPY_TO_PREBUF(const BIGNUM *b, int top, unsigned char *buf, int idx, int width) { size_t i, j; - if (bn_wexpand(b, top) == NULL) - return 0; - while (b->top < top) - { - b->d[b->top++] = 0; - } - + if (top > b->top) + top = b->top; /* this works because 'buf' is explicitly zeroed */ for (i = 0, j=idx; i < top * sizeof b->d[0]; i++, j+=width) { buf[j] = ((unsigned char*)b->d)[i]; } - bn_correct_top(b); return 1; } @@ -561,7 +567,7 @@ static int MOD_EXP_CTIME_COPY_FROM_PREBUF(BIGNUM *b, int top, unsigned char *buf /* Given a pointer value, compute the next address that is a cache line multiple. */ #define MOD_EXP_CTIME_ALIGN(x_) \ - ((unsigned char*)(x_) + (MOD_EXP_CTIME_MIN_CACHE_LINE_WIDTH - (((BN_ULONG)(x_)) & (MOD_EXP_CTIME_MIN_CACHE_LINE_MASK)))) + ((unsigned char*)(x_) + (MOD_EXP_CTIME_MIN_CACHE_LINE_WIDTH - (((size_t)(x_)) & (MOD_EXP_CTIME_MIN_CACHE_LINE_MASK)))) /* This variant of BN_mod_exp_mont() uses fixed windows and the special * precomputation memory layout to limit data-dependency to a minimum @@ -572,17 +578,15 @@ static int MOD_EXP_CTIME_COPY_FROM_PREBUF(BIGNUM *b, int top, unsigned char *buf int BN_mod_exp_mont_consttime(BIGNUM *rr, const BIGNUM *a, const BIGNUM *p, const BIGNUM *m, BN_CTX *ctx, BN_MONT_CTX *in_mont) { - int i,bits,ret=0,idx,window,wvalue; + int i,bits,ret=0,window,wvalue; int top; - BIGNUM *r; - const BIGNUM *aa; BN_MONT_CTX *mont=NULL; int numPowers; unsigned char *powerbufFree=NULL; int powerbufLen = 0; unsigned char *powerbuf=NULL; - BIGNUM *computeTemp=NULL, *am=NULL; + BIGNUM tmp, am; bn_check_top(a); bn_check_top(p); @@ -602,10 +606,7 @@ int BN_mod_exp_mont_consttime(BIGNUM *rr, const BIGNUM *a, const BIGNUM *p, return ret; } - /* Initialize BIGNUM context and allocate intermediate result */ BN_CTX_start(ctx); - r = BN_CTX_get(ctx); - if (r == NULL) goto err; /* Allocate a montgomery context if it was not supplied by the caller. * If this is not done, things will break in the montgomery part. @@ -620,40 +621,154 @@ int BN_mod_exp_mont_consttime(BIGNUM *rr, const BIGNUM *a, const BIGNUM *p, /* Get the window size to use with size of p. */ window = BN_window_bits_for_ctime_exponent_size(bits); +#if defined(OPENSSL_BN_ASM_MONT5) + if (window==6 && bits<=1024) window=5; /* ~5% improvement of 2048-bit RSA sign */ +#endif /* Allocate a buffer large enough to hold all of the pre-computed - * powers of a. + * powers of am, am itself and tmp. */ numPowers = 1 << window; - powerbufLen = sizeof(m->d[0])*top*numPowers; + powerbufLen = sizeof(m->d[0])*(top*numPowers + + ((2*top)>numPowers?(2*top):numPowers)); +#ifdef alloca + if (powerbufLen < 3072) + powerbufFree = alloca(powerbufLen+MOD_EXP_CTIME_MIN_CACHE_LINE_WIDTH); + else +#endif if ((powerbufFree=(unsigned char*)OPENSSL_malloc(powerbufLen+MOD_EXP_CTIME_MIN_CACHE_LINE_WIDTH)) == NULL) goto err; powerbuf = MOD_EXP_CTIME_ALIGN(powerbufFree); memset(powerbuf, 0, powerbufLen); - /* Initialize the intermediate result. Do this early to save double conversion, - * once each for a^0 and intermediate result. - */ - if (!BN_to_montgomery(r,BN_value_one(),mont,ctx)) goto err; - if (!MOD_EXP_CTIME_COPY_TO_PREBUF(r, top, powerbuf, 0, numPowers)) goto err; +#ifdef alloca + if (powerbufLen < 3072) + powerbufFree = NULL; +#endif - /* Initialize computeTemp as a^1 with montgomery precalcs */ - computeTemp = BN_CTX_get(ctx); - am = BN_CTX_get(ctx); - if (computeTemp==NULL || am==NULL) goto err; + /* lay down tmp and am right after powers table */ + tmp.d = (BN_ULONG *)(powerbuf + sizeof(m->d[0])*top*numPowers); + am.d = tmp.d + top; + tmp.top = am.top = 0; + tmp.dmax = am.dmax = top; + tmp.neg = am.neg = 0; + tmp.flags = am.flags = BN_FLG_STATIC_DATA; + + /* prepare a^0 in Montgomery domain */ +#if 1 + if (!BN_to_montgomery(&tmp,BN_value_one(),mont,ctx)) goto err; +#else + tmp.d[0] = (0-m->d[0])&BN_MASK2; /* 2^(top*BN_BITS2) - m */ + for (i=1;id[i])&BN_MASK2; + tmp.top = top; +#endif + /* prepare a^1 in Montgomery domain */ if (a->neg || BN_ucmp(a,m) >= 0) { - if (!BN_mod(am,a,m,ctx)) - goto err; - aa= am; + if (!BN_mod(&am,a,m,ctx)) goto err; + if (!BN_to_montgomery(&am,&am,mont,ctx)) goto err; } - else - aa=a; - if (!BN_to_montgomery(am,aa,mont,ctx)) goto err; - if (!BN_copy(computeTemp, am)) goto err; - if (!MOD_EXP_CTIME_COPY_TO_PREBUF(am, top, powerbuf, 1, numPowers)) goto err; + else if (!BN_to_montgomery(&am,a,mont,ctx)) goto err; + +#if defined(OPENSSL_BN_ASM_MONT5) + /* This optimization uses ideas from http://eprint.iacr.org/2011/239, + * specifically optimization of cache-timing attack countermeasures + * and pre-computation optimization. */ + + /* Dedicated window==4 case improves 512-bit RSA sign by ~15%, but as + * 512-bit RSA is hardly relevant, we omit it to spare size... */ + if (window==5) + { + void bn_mul_mont_gather5(BN_ULONG *rp,const BN_ULONG *ap, + const void *table,const BN_ULONG *np, + const BN_ULONG *n0,int num,int power); + void bn_scatter5(const BN_ULONG *inp,size_t num, + void *table,size_t power); + void bn_gather5(BN_ULONG *out,size_t num, + void *table,size_t power); + + BN_ULONG *np=mont->N.d, *n0=mont->n0; + + /* BN_to_montgomery can contaminate words above .top + * [in BN_DEBUG[_DEBUG] build]... */ + for (i=am.top; i=0; i--,bits--) + wvalue = (wvalue<<1)+BN_is_bit_set(p,bits); + bn_gather5(tmp.d,top,powerbuf,wvalue); + + /* Scan the exponent one window at a time starting from the most + * significant bits. + */ + while (bits >= 0) + { + for (wvalue=0, i=0; i<5; i++,bits--) + wvalue = (wvalue<<1)+BN_is_bit_set(p,bits); + + bn_mul_mont(tmp.d,tmp.d,tmp.d,np,n0,top); + bn_mul_mont(tmp.d,tmp.d,tmp.d,np,n0,top); + bn_mul_mont(tmp.d,tmp.d,tmp.d,np,n0,top); + bn_mul_mont(tmp.d,tmp.d,tmp.d,np,n0,top); + bn_mul_mont(tmp.d,tmp.d,tmp.d,np,n0,top); + bn_mul_mont_gather5(tmp.d,tmp.d,powerbuf,np,n0,top,wvalue); + } + + tmp.top=top; + bn_correct_top(&tmp); + } + else +#endif + { + if (!MOD_EXP_CTIME_COPY_TO_PREBUF(&tmp, top, powerbuf, 0, numPowers)) goto err; + if (!MOD_EXP_CTIME_COPY_TO_PREBUF(&am, top, powerbuf, 1, numPowers)) goto err; /* If the window size is greater than 1, then calculate * val[i=2..2^winsize-1]. Powers are computed as a*a^(i-1) @@ -662,62 +777,54 @@ int BN_mod_exp_mont_consttime(BIGNUM *rr, const BIGNUM *a, const BIGNUM *p, */ if (window > 1) { - for (i=2; i= 0) + bits--; + for (wvalue=0, i=bits%window; i>=0; i--,bits--) + wvalue = (wvalue<<1)+BN_is_bit_set(p,bits); + if (!MOD_EXP_CTIME_COPY_FROM_PREBUF(&tmp,top,powerbuf,wvalue,numPowers)) goto err; + + /* Scan the exponent one window at a time starting from the most + * significant bits. + */ + while (bits >= 0) { wvalue=0; /* The 'value' of the window */ /* Scan the window, squaring the result as we go */ - for (i=0; i> 4 & 0xF] << 8 | SQR_tb[(w) & 0xF] #endif +#if !defined(OPENSSL_BN_ASM_GF2m) /* Product of two polynomials a, b each with degree < BN_BITS2 - 1, * result is a polynomial r with degree < 2 * BN_BITS - 1 * The caller MUST ensure that the variables have the right amount @@ -216,7 +219,9 @@ static void bn_GF2m_mul_2x2(BN_ULONG *r, const BN_ULONG a1, const BN_ULONG a0, c r[2] ^= m1 ^ r[1] ^ r[3]; /* h0 ^= m1 ^ l1 ^ h1; */ r[1] = r[3] ^ r[2] ^ r[0] ^ m1 ^ m0; /* l1 ^= l0 ^ h0 ^ m0; */ } - +#else +void bn_GF2m_mul_2x2(BN_ULONG *r, BN_ULONG a1, BN_ULONG a0, BN_ULONG b1, BN_ULONG b0); +#endif /* Add polynomials a and b and store result in r; r could be a or b, a and b * could be equal; r is the bitwise XOR of a and b. @@ -360,21 +365,17 @@ int BN_GF2m_mod_arr(BIGNUM *r, const BIGNUM *a, const int p[]) int BN_GF2m_mod(BIGNUM *r, const BIGNUM *a, const BIGNUM *p) { int ret = 0; - const int max = BN_num_bits(p) + 1; - int *arr=NULL; + int arr[6]; bn_check_top(a); bn_check_top(p); - if ((arr = (int *)OPENSSL_malloc(sizeof(int) * max)) == NULL) goto err; - ret = BN_GF2m_poly2arr(p, arr, max); - if (!ret || ret > max) + ret = BN_GF2m_poly2arr(p, arr, sizeof(arr)/sizeof(arr[0])); + if (!ret || ret > (int)(sizeof(arr)/sizeof(arr[0]))) { BNerr(BN_F_BN_GF2M_MOD,BN_R_INVALID_LENGTH); - goto err; + return 0; } ret = BN_GF2m_mod_arr(r, a, arr); bn_check_top(r); -err: - if (arr) OPENSSL_free(arr); return ret; } @@ -521,7 +522,7 @@ err: */ int BN_GF2m_mod_inv(BIGNUM *r, const BIGNUM *a, const BIGNUM *p, BN_CTX *ctx) { - BIGNUM *b, *c, *u, *v, *tmp; + BIGNUM *b, *c = NULL, *u = NULL, *v = NULL, *tmp; int ret = 0; bn_check_top(a); @@ -529,18 +530,18 @@ int BN_GF2m_mod_inv(BIGNUM *r, const BIGNUM *a, const BIGNUM *p, BN_CTX *ctx) BN_CTX_start(ctx); - b = BN_CTX_get(ctx); - c = BN_CTX_get(ctx); - u = BN_CTX_get(ctx); - v = BN_CTX_get(ctx); - if (v == NULL) goto err; + if ((b = BN_CTX_get(ctx))==NULL) goto err; + if ((c = BN_CTX_get(ctx))==NULL) goto err; + if ((u = BN_CTX_get(ctx))==NULL) goto err; + if ((v = BN_CTX_get(ctx))==NULL) goto err; - if (!BN_one(b)) goto err; if (!BN_GF2m_mod(u, a, p)) goto err; - if (!BN_copy(v, p)) goto err; - if (BN_is_zero(u)) goto err; + if (!BN_copy(v, p)) goto err; +#if 0 + if (!BN_one(b)) goto err; + while (1) { while (!BN_is_odd(u)) @@ -565,13 +566,86 @@ int BN_GF2m_mod_inv(BIGNUM *r, const BIGNUM *a, const BIGNUM *p, BN_CTX *ctx) if (!BN_GF2m_add(u, u, v)) goto err; if (!BN_GF2m_add(b, b, c)) goto err; } +#else + { + int i, ubits = BN_num_bits(u), + vbits = BN_num_bits(v), /* v is copy of p */ + top = p->top; + BN_ULONG *udp,*bdp,*vdp,*cdp; + + bn_wexpand(u,top); udp = u->d; + for (i=u->top;itop = top; + bn_wexpand(b,top); bdp = b->d; + bdp[0] = 1; + for (i=1;itop = top; + bn_wexpand(c,top); cdp = c->d; + for (i=0;itop = top; + vdp = v->d; /* It pays off to "cache" *->d pointers, because + * it allows optimizer to be more aggressive. + * But we don't have to "cache" p->d, because *p + * is declared 'const'... */ + while (1) + { + while (ubits && !(udp[0]&1)) + { + BN_ULONG u0,u1,b0,b1,mask; + u0 = udp[0]; + b0 = bdp[0]; + mask = (BN_ULONG)0-(b0&1); + b0 ^= p->d[0]&mask; + for (i=0;i>1)|(u1<<(BN_BITS2-1)))&BN_MASK2; + u0 = u1; + b1 = bdp[i+1]^(p->d[i+1]&mask); + bdp[i] = ((b0>>1)|(b1<<(BN_BITS2-1)))&BN_MASK2; + b0 = b1; + } + udp[i] = u0>>1; + bdp[i] = b0>>1; + ubits--; + } + + if (ubits<=BN_BITS2 && udp[0]==1) break; + + if (ubitsd; + bdp = cdp; cdp = c->d; + } + for(i=0;i # define BN_UMULT_HIGH(a,b) (BN_ULONG)asm("umulh %a0,%a1,%v0",(a),(b)) -# elif defined(__GNUC__) +# elif defined(__GNUC__) && __GNUC__>=2 # define BN_UMULT_HIGH(a,b) ({ \ register BN_ULONG ret; \ asm ("umulh %1,%2,%0" \ @@ -247,7 +247,7 @@ extern "C" { ret; }) # endif /* compiler */ # elif defined(_ARCH_PPC) && defined(__64BIT__) && defined(SIXTY_FOUR_BIT_LONG) -# if defined(__GNUC__) +# if defined(__GNUC__) && __GNUC__>=2 # define BN_UMULT_HIGH(a,b) ({ \ register BN_ULONG ret; \ asm ("mulhdu %0,%1,%2" \ @@ -257,7 +257,7 @@ extern "C" { # endif /* compiler */ # elif (defined(__x86_64) || defined(__x86_64__)) && \ (defined(SIXTY_FOUR_BIT_LONG) || defined(SIXTY_FOUR_BIT)) -# if defined(__GNUC__) +# if defined(__GNUC__) && __GNUC__>=2 # define BN_UMULT_HIGH(a,b) ({ \ register BN_ULONG ret,discard; \ asm ("mulq %3" \ @@ -280,6 +280,19 @@ extern "C" { # define BN_UMULT_HIGH(a,b) __umulh((a),(b)) # define BN_UMULT_LOHI(low,high,a,b) ((low)=_umul128((a),(b),&(high))) # endif +# elif defined(__mips) && (defined(SIXTY_FOUR_BIT) || defined(SIXTY_FOUR_BIT_LONG)) +# if defined(__GNUC__) && __GNUC__>=2 +# define BN_UMULT_HIGH(a,b) ({ \ + register BN_ULONG ret; \ + asm ("dmultu %1,%2" \ + : "=h"(ret) \ + : "r"(a), "r"(b) : "l"); \ + ret; }) +# define BN_UMULT_LOHI(low,high,a,b) \ + asm ("dmultu %2,%3" \ + : "=l"(low),"=h"(high) \ + : "r"(a), "r"(b)); +# endif # endif /* cpu */ #endif /* OPENSSL_NO_ASM */ @@ -459,6 +472,10 @@ extern "C" { } #endif /* !BN_LLONG */ +#if defined(OPENSSL_DOING_MAKEDEPEND) && defined(OPENSSL_FIPS) +#undef bn_div_words +#endif + void bn_mul_normal(BN_ULONG *r,BN_ULONG *a,int na,BN_ULONG *b,int nb); void bn_mul_comba8(BN_ULONG *r,BN_ULONG *a,BN_ULONG *b); void bn_mul_comba4(BN_ULONG *r,BN_ULONG *a,BN_ULONG *b); diff --git a/crypto/openssl/crypto/bn/bn_lib.c b/crypto/openssl/crypto/bn/bn_lib.c index 5470fbe6ef..7a5676de69 100644 --- a/crypto/openssl/crypto/bn/bn_lib.c +++ b/crypto/openssl/crypto/bn/bn_lib.c @@ -139,25 +139,6 @@ const BIGNUM *BN_value_one(void) return(&const_one); } -char *BN_options(void) - { - static int init=0; - static char data[16]; - - if (!init) - { - init++; -#ifdef BN_LLONG - BIO_snprintf(data,sizeof data,"bn(%d,%d)", - (int)sizeof(BN_ULLONG)*8,(int)sizeof(BN_ULONG)*8); -#else - BIO_snprintf(data,sizeof data,"bn(%d,%d)", - (int)sizeof(BN_ULONG)*8,(int)sizeof(BN_ULONG)*8); -#endif - } - return(data); - } - int BN_num_bits_word(BN_ULONG l) { static const unsigned char bits[256]={ diff --git a/crypto/openssl/crypto/bn/bn_mont.c b/crypto/openssl/crypto/bn/bn_mont.c index 1a866880f5..427b5cf4df 100644 --- a/crypto/openssl/crypto/bn/bn_mont.c +++ b/crypto/openssl/crypto/bn/bn_mont.c @@ -177,31 +177,26 @@ err: static int BN_from_montgomery_word(BIGNUM *ret, BIGNUM *r, BN_MONT_CTX *mont) { BIGNUM *n; - BN_ULONG *ap,*np,*rp,n0,v,*nrp; - int al,nl,max,i,x,ri; + BN_ULONG *ap,*np,*rp,n0,v,carry; + int nl,max,i; n= &(mont->N); - /* mont->ri is the size of mont->N in bits (rounded up - to the word size) */ - al=ri=mont->ri/BN_BITS2; - nl=n->top; - if ((al == 0) || (nl == 0)) { ret->top=0; return(1); } + if (nl == 0) { ret->top=0; return(1); } - max=(nl+al+1); /* allow for overflow (no?) XXX */ + max=(2*nl); /* carry is stored separately */ if (bn_wexpand(r,max) == NULL) return(0); r->neg^=n->neg; np=n->d; rp=r->d; - nrp= &(r->d[nl]); /* clear the top words of T */ #if 1 for (i=r->top; id[i]=0; + rp[i]=0; #else - memset(&(r->d[r->top]),0,(max-r->top)*sizeof(BN_ULONG)); + memset(&(rp[r->top]),0,(max-r->top)*sizeof(BN_ULONG)); #endif r->top=max; @@ -210,7 +205,7 @@ static int BN_from_montgomery_word(BIGNUM *ret, BIGNUM *r, BN_MONT_CTX *mont) #ifdef BN_COUNT fprintf(stderr,"word BN_from_montgomery_word %d * %d\n",nl,nl); #endif - for (i=0; i= v) - continue; - else - { - if (((++nrp[0])&BN_MASK2) != 0) continue; - if (((++nrp[1])&BN_MASK2) != 0) continue; - for (x=2; (((++nrp[x])&BN_MASK2) == 0); x++) ; - } - } - bn_correct_top(r); - - /* mont->ri will be a multiple of the word size and below code - * is kind of BN_rshift(ret,r,mont->ri) equivalent */ - if (r->top <= ri) - { - ret->top=0; - return(1); + v = (v+carry+rp[nl])&BN_MASK2; + carry |= (v != rp[nl]); + carry &= (v <= rp[nl]); + rp[nl]=v; } - al=r->top-ri; -#define BRANCH_FREE 1 -#if BRANCH_FREE - if (bn_wexpand(ret,ri) == NULL) return(0); - x=0-(((al-ri)>>(sizeof(al)*8-1))&1); - ret->top=x=(ri&~x)|(al&x); /* min(ri,al) */ + if (bn_wexpand(ret,nl) == NULL) return(0); + ret->top=nl; ret->neg=r->neg; rp=ret->d; - ap=&(r->d[ri]); + ap=&(r->d[nl]); +#define BRANCH_FREE 1 +#if BRANCH_FREE { - size_t m1,m2; - - v=bn_sub_words(rp,ap,np,ri); - /* this ----------------^^ works even in alri) nrp=rp; else nrp=ap; */ - /* in other words if subtraction result is real, then + v=bn_sub_words(rp,ap,np,nl)-carry; + /* if subtraction result is real, then * trick unconditional memcpy below to perform in-place * "refresh" instead of actual copy. */ - m1=0-(size_t)(((al-ri)>>(sizeof(al)*8-1))&1); /* al>(sizeof(al)*8-1))&1); /* al>ri */ - m1|=m2; /* (al!=ri) */ - m1|=(0-(size_t)v); /* (al!=ri || v) */ - m1&=~m2; /* (al!=ri || v) && !al>ri */ - nrp=(BN_ULONG *)(((PTR_SIZE_INT)rp&~m1)|((PTR_SIZE_INT)ap&m1)); - } + m=(0-(size_t)v); + nrp=(BN_ULONG *)(((PTR_SIZE_INT)rp&~m)|((PTR_SIZE_INT)ap&m)); - /* 'itop=al; - ret->neg=r->neg; - - rp=ret->d; - ap=&(r->d[ri]); - al-=4; - for (i=0; iN)) >= 0) - { - if (!BN_usub(ret,ret,&(mont->N))) return(0); - } + if (bn_sub_words (rp,ap,np,nl)-carry) + memcpy(rp,ap,nl*sizeof(BN_ULONG)); #endif + bn_correct_top(r); + bn_correct_top(ret); bn_check_top(ret); return(1); diff --git a/crypto/openssl/crypto/bn/bn_nist.c b/crypto/openssl/crypto/bn/bn_nist.c index c6de032696..43caee4770 100644 --- a/crypto/openssl/crypto/bn/bn_nist.c +++ b/crypto/openssl/crypto/bn/bn_nist.c @@ -319,6 +319,13 @@ static void nist_cp_bn(BN_ULONG *buf, BN_ULONG *a, int top) :(to[(n)/2] =((m)&1)?(from[(m)/2]>>32):(from[(m)/2]&BN_MASK2l))) #define bn_32_set_0(to, n) (((n)&1)?(to[(n)/2]&=BN_MASK2l):(to[(n)/2]=0)); #define bn_cp_32(to,n,from,m) ((m)>=0)?bn_cp_32_naked(to,n,from,m):bn_32_set_0(to,n) +# if defined(L_ENDIAN) +# if defined(__arch64__) +# define NIST_INT64 long +# else +# define NIST_INT64 long long +# endif +# endif #else #define bn_cp_64(to, n, from, m) \ { \ @@ -330,13 +337,15 @@ static void nist_cp_bn(BN_ULONG *buf, BN_ULONG *a, int top) bn_32_set_0(to, (n)*2); \ bn_32_set_0(to, (n)*2+1); \ } -#if BN_BITS2 == 32 #define bn_cp_32(to, n, from, m) (to)[n] = (m>=0)?((from)[m]):0; #define bn_32_set_0(to, n) (to)[n] = (BN_ULONG)0; -#endif +# if defined(_WIN32) && !defined(__GNUC__) +# define NIST_INT64 __int64 +# elif defined(BN_LLONG) +# define NIST_INT64 long long +# endif #endif /* BN_BITS2 != 64 */ - #define nist_set_192(to, from, a1, a2, a3) \ { \ bn_cp_64(to, 0, from, (a3) - 3) \ @@ -350,9 +359,11 @@ int BN_nist_mod_192(BIGNUM *r, const BIGNUM *a, const BIGNUM *field, int top = a->top, i; int carry; register BN_ULONG *r_d, *a_d = a->d; - BN_ULONG t_d[BN_NIST_192_TOP], - buf[BN_NIST_192_TOP], - c_d[BN_NIST_192_TOP], + union { + BN_ULONG bn[BN_NIST_192_TOP]; + unsigned int ui[BN_NIST_192_TOP*sizeof(BN_ULONG)/sizeof(unsigned int)]; + } buf; + BN_ULONG c_d[BN_NIST_192_TOP], *res; PTR_SIZE_INT mask; static const BIGNUM _bignum_nist_p_192_sqr = { @@ -385,15 +396,48 @@ int BN_nist_mod_192(BIGNUM *r, const BIGNUM *a, const BIGNUM *field, else r_d = a_d; - nist_cp_bn_0(buf, a_d + BN_NIST_192_TOP, top - BN_NIST_192_TOP, BN_NIST_192_TOP); + nist_cp_bn_0(buf.bn, a_d + BN_NIST_192_TOP, top - BN_NIST_192_TOP, BN_NIST_192_TOP); + +#if defined(NIST_INT64) + { + NIST_INT64 acc; /* accumulator */ + unsigned int *rp=(unsigned int *)r_d; + const unsigned int *bp=(const unsigned int *)buf.ui; + + acc = rp[0]; acc += bp[3*2-6]; + acc += bp[5*2-6]; rp[0] = (unsigned int)acc; acc >>= 32; + + acc += rp[1]; acc += bp[3*2-5]; + acc += bp[5*2-5]; rp[1] = (unsigned int)acc; acc >>= 32; - nist_set_192(t_d, buf, 0, 3, 3); + acc += rp[2]; acc += bp[3*2-6]; + acc += bp[4*2-6]; + acc += bp[5*2-6]; rp[2] = (unsigned int)acc; acc >>= 32; + + acc += rp[3]; acc += bp[3*2-5]; + acc += bp[4*2-5]; + acc += bp[5*2-5]; rp[3] = (unsigned int)acc; acc >>= 32; + + acc += rp[4]; acc += bp[4*2-6]; + acc += bp[5*2-6]; rp[4] = (unsigned int)acc; acc >>= 32; + + acc += rp[5]; acc += bp[4*2-5]; + acc += bp[5*2-5]; rp[5] = (unsigned int)acc; + + carry = (int)(acc>>32); + } +#else + { + BN_ULONG t_d[BN_NIST_192_TOP]; + + nist_set_192(t_d, buf.bn, 0, 3, 3); carry = (int)bn_add_words(r_d, r_d, t_d, BN_NIST_192_TOP); - nist_set_192(t_d, buf, 4, 4, 0); + nist_set_192(t_d, buf.bn, 4, 4, 0); carry += (int)bn_add_words(r_d, r_d, t_d, BN_NIST_192_TOP); - nist_set_192(t_d, buf, 5, 5, 5) + nist_set_192(t_d, buf.bn, 5, 5, 5) carry += (int)bn_add_words(r_d, r_d, t_d, BN_NIST_192_TOP); - + } +#endif if (carry > 0) carry = (int)bn_sub_words(r_d,r_d,_nist_p_192[carry-1],BN_NIST_192_TOP); else @@ -435,8 +479,7 @@ int BN_nist_mod_224(BIGNUM *r, const BIGNUM *a, const BIGNUM *field, int top = a->top, i; int carry; BN_ULONG *r_d, *a_d = a->d; - BN_ULONG t_d[BN_NIST_224_TOP], - buf[BN_NIST_224_TOP], + BN_ULONG buf[BN_NIST_224_TOP], c_d[BN_NIST_224_TOP], *res; PTR_SIZE_INT mask; @@ -474,14 +517,54 @@ int BN_nist_mod_224(BIGNUM *r, const BIGNUM *a, const BIGNUM *field, #if BN_BITS2==64 /* copy upper 256 bits of 448 bit number ... */ - nist_cp_bn_0(t_d, a_d + (BN_NIST_224_TOP-1), top - (BN_NIST_224_TOP-1), BN_NIST_224_TOP); + nist_cp_bn_0(c_d, a_d + (BN_NIST_224_TOP-1), top - (BN_NIST_224_TOP-1), BN_NIST_224_TOP); /* ... and right shift by 32 to obtain upper 224 bits */ - nist_set_224(buf, t_d, 14, 13, 12, 11, 10, 9, 8); + nist_set_224(buf, c_d, 14, 13, 12, 11, 10, 9, 8); /* truncate lower part to 224 bits too */ r_d[BN_NIST_224_TOP-1] &= BN_MASK2l; #else nist_cp_bn_0(buf, a_d + BN_NIST_224_TOP, top - BN_NIST_224_TOP, BN_NIST_224_TOP); #endif + +#if defined(NIST_INT64) && BN_BITS2!=64 + { + NIST_INT64 acc; /* accumulator */ + unsigned int *rp=(unsigned int *)r_d; + const unsigned int *bp=(const unsigned int *)buf; + + acc = rp[0]; acc -= bp[7-7]; + acc -= bp[11-7]; rp[0] = (unsigned int)acc; acc >>= 32; + + acc += rp[1]; acc -= bp[8-7]; + acc -= bp[12-7]; rp[1] = (unsigned int)acc; acc >>= 32; + + acc += rp[2]; acc -= bp[9-7]; + acc -= bp[13-7]; rp[2] = (unsigned int)acc; acc >>= 32; + + acc += rp[3]; acc += bp[7-7]; + acc += bp[11-7]; + acc -= bp[10-7]; rp[3] = (unsigned int)acc; acc>>= 32; + + acc += rp[4]; acc += bp[8-7]; + acc += bp[12-7]; + acc -= bp[11-7]; rp[4] = (unsigned int)acc; acc >>= 32; + + acc += rp[5]; acc += bp[9-7]; + acc += bp[13-7]; + acc -= bp[12-7]; rp[5] = (unsigned int)acc; acc >>= 32; + + acc += rp[6]; acc += bp[10-7]; + acc -= bp[13-7]; rp[6] = (unsigned int)acc; + + carry = (int)(acc>>32); +# if BN_BITS2==64 + rp[7] = carry; +# endif + } +#else + { + BN_ULONG t_d[BN_NIST_224_TOP]; + nist_set_224(t_d, buf, 10, 9, 8, 7, 0, 0, 0); carry = (int)bn_add_words(r_d, r_d, t_d, BN_NIST_224_TOP); nist_set_224(t_d, buf, 0, 13, 12, 11, 0, 0, 0); @@ -493,6 +576,8 @@ int BN_nist_mod_224(BIGNUM *r, const BIGNUM *a, const BIGNUM *field, #if BN_BITS2==64 carry = (int)(r_d[BN_NIST_224_TOP-1]>>32); +#endif + } #endif u.f = bn_sub_words; if (carry > 0) @@ -548,9 +633,11 @@ int BN_nist_mod_256(BIGNUM *r, const BIGNUM *a, const BIGNUM *field, int i, top = a->top; int carry = 0; register BN_ULONG *a_d = a->d, *r_d; - BN_ULONG t_d[BN_NIST_256_TOP], - buf[BN_NIST_256_TOP], - c_d[BN_NIST_256_TOP], + union { + BN_ULONG bn[BN_NIST_256_TOP]; + unsigned int ui[BN_NIST_256_TOP*sizeof(BN_ULONG)/sizeof(unsigned int)]; + } buf; + BN_ULONG c_d[BN_NIST_256_TOP], *res; PTR_SIZE_INT mask; union { bn_addsub_f f; PTR_SIZE_INT p; } u; @@ -584,12 +671,87 @@ int BN_nist_mod_256(BIGNUM *r, const BIGNUM *a, const BIGNUM *field, else r_d = a_d; - nist_cp_bn_0(buf, a_d + BN_NIST_256_TOP, top - BN_NIST_256_TOP, BN_NIST_256_TOP); + nist_cp_bn_0(buf.bn, a_d + BN_NIST_256_TOP, top - BN_NIST_256_TOP, BN_NIST_256_TOP); + +#if defined(NIST_INT64) + { + NIST_INT64 acc; /* accumulator */ + unsigned int *rp=(unsigned int *)r_d; + const unsigned int *bp=(const unsigned int *)buf.ui; + + acc = rp[0]; acc += bp[8-8]; + acc += bp[9-8]; + acc -= bp[11-8]; + acc -= bp[12-8]; + acc -= bp[13-8]; + acc -= bp[14-8]; rp[0] = (unsigned int)acc; acc >>= 32; + + acc += rp[1]; acc += bp[9-8]; + acc += bp[10-8]; + acc -= bp[12-8]; + acc -= bp[13-8]; + acc -= bp[14-8]; + acc -= bp[15-8]; rp[1] = (unsigned int)acc; acc >>= 32; + + acc += rp[2]; acc += bp[10-8]; + acc += bp[11-8]; + acc -= bp[13-8]; + acc -= bp[14-8]; + acc -= bp[15-8]; rp[2] = (unsigned int)acc; acc >>= 32; + + acc += rp[3]; acc += bp[11-8]; + acc += bp[11-8]; + acc += bp[12-8]; + acc += bp[12-8]; + acc += bp[13-8]; + acc -= bp[15-8]; + acc -= bp[8-8]; + acc -= bp[9-8]; rp[3] = (unsigned int)acc; acc >>= 32; + + acc += rp[4]; acc += bp[12-8]; + acc += bp[12-8]; + acc += bp[13-8]; + acc += bp[13-8]; + acc += bp[14-8]; + acc -= bp[9-8]; + acc -= bp[10-8]; rp[4] = (unsigned int)acc; acc >>= 32; + + acc += rp[5]; acc += bp[13-8]; + acc += bp[13-8]; + acc += bp[14-8]; + acc += bp[14-8]; + acc += bp[15-8]; + acc -= bp[10-8]; + acc -= bp[11-8]; rp[5] = (unsigned int)acc; acc >>= 32; + + acc += rp[6]; acc += bp[14-8]; + acc += bp[14-8]; + acc += bp[15-8]; + acc += bp[15-8]; + acc += bp[14-8]; + acc += bp[13-8]; + acc -= bp[8-8]; + acc -= bp[9-8]; rp[6] = (unsigned int)acc; acc >>= 32; + + acc += rp[7]; acc += bp[15-8]; + acc += bp[15-8]; + acc += bp[15-8]; + acc += bp[8 -8]; + acc -= bp[10-8]; + acc -= bp[11-8]; + acc -= bp[12-8]; + acc -= bp[13-8]; rp[7] = (unsigned int)acc; + + carry = (int)(acc>>32); + } +#else + { + BN_ULONG t_d[BN_NIST_256_TOP]; /*S1*/ - nist_set_256(t_d, buf, 15, 14, 13, 12, 11, 0, 0, 0); + nist_set_256(t_d, buf.bn, 15, 14, 13, 12, 11, 0, 0, 0); /*S2*/ - nist_set_256(c_d, buf, 0, 15, 14, 13, 12, 0, 0, 0); + nist_set_256(c_d, buf.bn, 0, 15, 14, 13, 12, 0, 0, 0); carry = (int)bn_add_words(t_d, t_d, c_d, BN_NIST_256_TOP); /* left shift */ { @@ -607,24 +769,26 @@ int BN_nist_mod_256(BIGNUM *r, const BIGNUM *a, const BIGNUM *field, } carry += (int)bn_add_words(r_d, r_d, t_d, BN_NIST_256_TOP); /*S3*/ - nist_set_256(t_d, buf, 15, 14, 0, 0, 0, 10, 9, 8); + nist_set_256(t_d, buf.bn, 15, 14, 0, 0, 0, 10, 9, 8); carry += (int)bn_add_words(r_d, r_d, t_d, BN_NIST_256_TOP); /*S4*/ - nist_set_256(t_d, buf, 8, 13, 15, 14, 13, 11, 10, 9); + nist_set_256(t_d, buf.bn, 8, 13, 15, 14, 13, 11, 10, 9); carry += (int)bn_add_words(r_d, r_d, t_d, BN_NIST_256_TOP); /*D1*/ - nist_set_256(t_d, buf, 10, 8, 0, 0, 0, 13, 12, 11); + nist_set_256(t_d, buf.bn, 10, 8, 0, 0, 0, 13, 12, 11); carry -= (int)bn_sub_words(r_d, r_d, t_d, BN_NIST_256_TOP); /*D2*/ - nist_set_256(t_d, buf, 11, 9, 0, 0, 15, 14, 13, 12); + nist_set_256(t_d, buf.bn, 11, 9, 0, 0, 15, 14, 13, 12); carry -= (int)bn_sub_words(r_d, r_d, t_d, BN_NIST_256_TOP); /*D3*/ - nist_set_256(t_d, buf, 12, 0, 10, 9, 8, 15, 14, 13); + nist_set_256(t_d, buf.bn, 12, 0, 10, 9, 8, 15, 14, 13); carry -= (int)bn_sub_words(r_d, r_d, t_d, BN_NIST_256_TOP); /*D4*/ - nist_set_256(t_d, buf, 13, 0, 11, 10, 9, 0, 15, 14); + nist_set_256(t_d, buf.bn, 13, 0, 11, 10, 9, 0, 15, 14); carry -= (int)bn_sub_words(r_d, r_d, t_d, BN_NIST_256_TOP); + } +#endif /* see BN_nist_mod_224 for explanation */ u.f = bn_sub_words; if (carry > 0) @@ -672,9 +836,11 @@ int BN_nist_mod_384(BIGNUM *r, const BIGNUM *a, const BIGNUM *field, int i, top = a->top; int carry = 0; register BN_ULONG *r_d, *a_d = a->d; - BN_ULONG t_d[BN_NIST_384_TOP], - buf[BN_NIST_384_TOP], - c_d[BN_NIST_384_TOP], + union { + BN_ULONG bn[BN_NIST_384_TOP]; + unsigned int ui[BN_NIST_384_TOP*sizeof(BN_ULONG)/sizeof(unsigned int)]; + } buf; + BN_ULONG c_d[BN_NIST_384_TOP], *res; PTR_SIZE_INT mask; union { bn_addsub_f f; PTR_SIZE_INT p; } u; @@ -709,10 +875,100 @@ int BN_nist_mod_384(BIGNUM *r, const BIGNUM *a, const BIGNUM *field, else r_d = a_d; - nist_cp_bn_0(buf, a_d + BN_NIST_384_TOP, top - BN_NIST_384_TOP, BN_NIST_384_TOP); + nist_cp_bn_0(buf.bn, a_d + BN_NIST_384_TOP, top - BN_NIST_384_TOP, BN_NIST_384_TOP); + +#if defined(NIST_INT64) + { + NIST_INT64 acc; /* accumulator */ + unsigned int *rp=(unsigned int *)r_d; + const unsigned int *bp=(const unsigned int *)buf.ui; + + acc = rp[0]; acc += bp[12-12]; + acc += bp[21-12]; + acc += bp[20-12]; + acc -= bp[23-12]; rp[0] = (unsigned int)acc; acc >>= 32; + + acc += rp[1]; acc += bp[13-12]; + acc += bp[22-12]; + acc += bp[23-12]; + acc -= bp[12-12]; + acc -= bp[20-12]; rp[1] = (unsigned int)acc; acc >>= 32; + + acc += rp[2]; acc += bp[14-12]; + acc += bp[23-12]; + acc -= bp[13-12]; + acc -= bp[21-12]; rp[2] = (unsigned int)acc; acc >>= 32; + + acc += rp[3]; acc += bp[15-12]; + acc += bp[12-12]; + acc += bp[20-12]; + acc += bp[21-12]; + acc -= bp[14-12]; + acc -= bp[22-12]; + acc -= bp[23-12]; rp[3] = (unsigned int)acc; acc >>= 32; + + acc += rp[4]; acc += bp[21-12]; + acc += bp[21-12]; + acc += bp[16-12]; + acc += bp[13-12]; + acc += bp[12-12]; + acc += bp[20-12]; + acc += bp[22-12]; + acc -= bp[15-12]; + acc -= bp[23-12]; + acc -= bp[23-12]; rp[4] = (unsigned int)acc; acc >>= 32; + + acc += rp[5]; acc += bp[22-12]; + acc += bp[22-12]; + acc += bp[17-12]; + acc += bp[14-12]; + acc += bp[13-12]; + acc += bp[21-12]; + acc += bp[23-12]; + acc -= bp[16-12]; rp[5] = (unsigned int)acc; acc >>= 32; + + acc += rp[6]; acc += bp[23-12]; + acc += bp[23-12]; + acc += bp[18-12]; + acc += bp[15-12]; + acc += bp[14-12]; + acc += bp[22-12]; + acc -= bp[17-12]; rp[6] = (unsigned int)acc; acc >>= 32; + + acc += rp[7]; acc += bp[19-12]; + acc += bp[16-12]; + acc += bp[15-12]; + acc += bp[23-12]; + acc -= bp[18-12]; rp[7] = (unsigned int)acc; acc >>= 32; + + acc += rp[8]; acc += bp[20-12]; + acc += bp[17-12]; + acc += bp[16-12]; + acc -= bp[19-12]; rp[8] = (unsigned int)acc; acc >>= 32; + + acc += rp[9]; acc += bp[21-12]; + acc += bp[18-12]; + acc += bp[17-12]; + acc -= bp[20-12]; rp[9] = (unsigned int)acc; acc >>= 32; + + acc += rp[10]; acc += bp[22-12]; + acc += bp[19-12]; + acc += bp[18-12]; + acc -= bp[21-12]; rp[10] = (unsigned int)acc; acc >>= 32; + + acc += rp[11]; acc += bp[23-12]; + acc += bp[20-12]; + acc += bp[19-12]; + acc -= bp[22-12]; rp[11] = (unsigned int)acc; + + carry = (int)(acc>>32); + } +#else + { + BN_ULONG t_d[BN_NIST_384_TOP]; /*S1*/ - nist_set_256(t_d, buf, 0, 0, 0, 0, 0, 23-4, 22-4, 21-4); + nist_set_256(t_d, buf.bn, 0, 0, 0, 0, 0, 23-4, 22-4, 21-4); /* left shift */ { register BN_ULONG *ap,t,c; @@ -729,29 +985,31 @@ int BN_nist_mod_384(BIGNUM *r, const BIGNUM *a, const BIGNUM *field, carry = (int)bn_add_words(r_d+(128/BN_BITS2), r_d+(128/BN_BITS2), t_d, BN_NIST_256_TOP); /*S2 */ - carry += (int)bn_add_words(r_d, r_d, buf, BN_NIST_384_TOP); + carry += (int)bn_add_words(r_d, r_d, buf.bn, BN_NIST_384_TOP); /*S3*/ - nist_set_384(t_d,buf,20,19,18,17,16,15,14,13,12,23,22,21); + nist_set_384(t_d,buf.bn,20,19,18,17,16,15,14,13,12,23,22,21); carry += (int)bn_add_words(r_d, r_d, t_d, BN_NIST_384_TOP); /*S4*/ - nist_set_384(t_d,buf,19,18,17,16,15,14,13,12,20,0,23,0); + nist_set_384(t_d,buf.bn,19,18,17,16,15,14,13,12,20,0,23,0); carry += (int)bn_add_words(r_d, r_d, t_d, BN_NIST_384_TOP); /*S5*/ - nist_set_384(t_d, buf,0,0,0,0,23,22,21,20,0,0,0,0); + nist_set_384(t_d, buf.bn,0,0,0,0,23,22,21,20,0,0,0,0); carry += (int)bn_add_words(r_d, r_d, t_d, BN_NIST_384_TOP); /*S6*/ - nist_set_384(t_d,buf,0,0,0,0,0,0,23,22,21,0,0,20); + nist_set_384(t_d,buf.bn,0,0,0,0,0,0,23,22,21,0,0,20); carry += (int)bn_add_words(r_d, r_d, t_d, BN_NIST_384_TOP); /*D1*/ - nist_set_384(t_d,buf,22,21,20,19,18,17,16,15,14,13,12,23); + nist_set_384(t_d,buf.bn,22,21,20,19,18,17,16,15,14,13,12,23); carry -= (int)bn_sub_words(r_d, r_d, t_d, BN_NIST_384_TOP); /*D2*/ - nist_set_384(t_d,buf,0,0,0,0,0,0,0,23,22,21,20,0); + nist_set_384(t_d,buf.bn,0,0,0,0,0,0,0,23,22,21,20,0); carry -= (int)bn_sub_words(r_d, r_d, t_d, BN_NIST_384_TOP); /*D3*/ - nist_set_384(t_d,buf,0,0,0,0,0,0,0,23,23,0,0,0); + nist_set_384(t_d,buf.bn,0,0,0,0,0,0,0,23,23,0,0,0); carry -= (int)bn_sub_words(r_d, r_d, t_d, BN_NIST_384_TOP); + } +#endif /* see BN_nist_mod_224 for explanation */ u.f = bn_sub_words; if (carry > 0) diff --git a/crypto/openssl/crypto/bn/bn_print.c b/crypto/openssl/crypto/bn/bn_print.c index bebb466d08..1743b6a7e2 100644 --- a/crypto/openssl/crypto/bn/bn_print.c +++ b/crypto/openssl/crypto/bn/bn_print.c @@ -357,3 +357,22 @@ end: return(ret); } #endif + +char *BN_options(void) + { + static int init=0; + static char data[16]; + + if (!init) + { + init++; +#ifdef BN_LLONG + BIO_snprintf(data,sizeof data,"bn(%d,%d)", + (int)sizeof(BN_ULLONG)*8,(int)sizeof(BN_ULONG)*8); +#else + BIO_snprintf(data,sizeof data,"bn(%d,%d)", + (int)sizeof(BN_ULONG)*8,(int)sizeof(BN_ULONG)*8); +#endif + } + return(data); + } diff --git a/crypto/openssl/crypto/bn/bn_shift.c b/crypto/openssl/crypto/bn/bn_shift.c index c4d301afc4..a6fca2c424 100644 --- a/crypto/openssl/crypto/bn/bn_shift.c +++ b/crypto/openssl/crypto/bn/bn_shift.c @@ -99,7 +99,7 @@ int BN_lshift1(BIGNUM *r, const BIGNUM *a) int BN_rshift1(BIGNUM *r, const BIGNUM *a) { BN_ULONG *ap,*rp,t,c; - int i; + int i,j; bn_check_top(r); bn_check_top(a); @@ -109,22 +109,25 @@ int BN_rshift1(BIGNUM *r, const BIGNUM *a) BN_zero(r); return(1); } + i = a->top; + ap= a->d; + j = i-(ap[i-1]==1); if (a != r) { - if (bn_wexpand(r,a->top) == NULL) return(0); - r->top=a->top; + if (bn_wexpand(r,j) == NULL) return(0); r->neg=a->neg; } - ap=a->d; rp=r->d; - c=0; - for (i=a->top-1; i>=0; i--) + t=ap[--i]; + c=(t&1)?BN_TBIT:0; + if (t>>=1) rp[i]=t; + while (i>0) { - t=ap[i]; + t=ap[--i]; rp[i]=((t>>1)&BN_MASK2)|c; c=(t&1)?BN_TBIT:0; } - bn_correct_top(r); + r->top=j; bn_check_top(r); return(1); } @@ -182,10 +185,11 @@ int BN_rshift(BIGNUM *r, const BIGNUM *a, int n) BN_zero(r); return(1); } + i = (BN_num_bits(a)-n+(BN_BITS2-1))/BN_BITS2; if (r != a) { r->neg=a->neg; - if (bn_wexpand(r,a->top-nw+1) == NULL) return(0); + if (bn_wexpand(r,i) == NULL) return(0); } else { @@ -196,7 +200,7 @@ int BN_rshift(BIGNUM *r, const BIGNUM *a, int n) f= &(a->d[nw]); t=r->d; j=a->top-nw; - r->top=j; + r->top=i; if (rb == 0) { @@ -212,9 +216,8 @@ int BN_rshift(BIGNUM *r, const BIGNUM *a, int n) l= *(f++); *(t++) =(tmp|(l<>rb)&BN_MASK2; + if ((l = (l>>rb)&BN_MASK2)) *(t) = l; } - bn_correct_top(r); bn_check_top(r); return(1); } diff --git a/crypto/openssl/crypto/bn/bn_x931p.c b/crypto/openssl/crypto/bn/bn_x931p.c new file mode 100644 index 0000000000..04c5c874ec --- /dev/null +++ b/crypto/openssl/crypto/bn/bn_x931p.c @@ -0,0 +1,272 @@ +/* bn_x931p.c */ +/* Written by Dr Stephen N Henson (steve@openssl.org) for the OpenSSL + * project 2005. + */ +/* ==================================================================== + * Copyright (c) 2005 The OpenSSL Project. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * + * 3. All advertising materials mentioning features or use of this + * software must display the following acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit. (http://www.OpenSSL.org/)" + * + * 4. The names "OpenSSL Toolkit" and "OpenSSL Project" must not be used to + * endorse or promote products derived from this software without + * prior written permission. For written permission, please contact + * licensing@OpenSSL.org. + * + * 5. Products derived from this software may not be called "OpenSSL" + * nor may "OpenSSL" appear in their names without prior written + * permission of the OpenSSL Project. + * + * 6. Redistributions of any form whatsoever must retain the following + * acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit (http://www.OpenSSL.org/)" + * + * THIS SOFTWARE IS PROVIDED BY THE OpenSSL PROJECT ``AS IS'' AND ANY + * EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR + * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE OpenSSL PROJECT OR + * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED + * OF THE POSSIBILITY OF SUCH DAMAGE. + * ==================================================================== + * + * This product includes cryptographic software written by Eric Young + * (eay@cryptsoft.com). This product includes software written by Tim + * Hudson (tjh@cryptsoft.com). + * + */ + +#include +#include + +/* X9.31 routines for prime derivation */ + +/* X9.31 prime derivation. This is used to generate the primes pi + * (p1, p2, q1, q2) from a parameter Xpi by checking successive odd + * integers. + */ + +static int bn_x931_derive_pi(BIGNUM *pi, const BIGNUM *Xpi, BN_CTX *ctx, + BN_GENCB *cb) + { + int i = 0; + if (!BN_copy(pi, Xpi)) + return 0; + if (!BN_is_odd(pi) && !BN_add_word(pi, 1)) + return 0; + for(;;) + { + i++; + BN_GENCB_call(cb, 0, i); + /* NB 27 MR is specificed in X9.31 */ + if (BN_is_prime_fasttest_ex(pi, 27, ctx, 1, cb)) + break; + if (!BN_add_word(pi, 2)) + return 0; + } + BN_GENCB_call(cb, 2, i); + return 1; + } + +/* This is the main X9.31 prime derivation function. From parameters + * Xp1, Xp2 and Xp derive the prime p. If the parameters p1 or p2 are + * not NULL they will be returned too: this is needed for testing. + */ + +int BN_X931_derive_prime_ex(BIGNUM *p, BIGNUM *p1, BIGNUM *p2, + const BIGNUM *Xp, const BIGNUM *Xp1, const BIGNUM *Xp2, + const BIGNUM *e, BN_CTX *ctx, BN_GENCB *cb) + { + int ret = 0; + + BIGNUM *t, *p1p2, *pm1; + + /* Only even e supported */ + if (!BN_is_odd(e)) + return 0; + + BN_CTX_start(ctx); + if (!p1) + p1 = BN_CTX_get(ctx); + + if (!p2) + p2 = BN_CTX_get(ctx); + + t = BN_CTX_get(ctx); + + p1p2 = BN_CTX_get(ctx); + + pm1 = BN_CTX_get(ctx); + + if (!bn_x931_derive_pi(p1, Xp1, ctx, cb)) + goto err; + + if (!bn_x931_derive_pi(p2, Xp2, ctx, cb)) + goto err; + + if (!BN_mul(p1p2, p1, p2, ctx)) + goto err; + + /* First set p to value of Rp */ + + if (!BN_mod_inverse(p, p2, p1, ctx)) + goto err; + + if (!BN_mul(p, p, p2, ctx)) + goto err; + + if (!BN_mod_inverse(t, p1, p2, ctx)) + goto err; + + if (!BN_mul(t, t, p1, ctx)) + goto err; + + if (!BN_sub(p, p, t)) + goto err; + + if (p->neg && !BN_add(p, p, p1p2)) + goto err; + + /* p now equals Rp */ + + if (!BN_mod_sub(p, p, Xp, p1p2, ctx)) + goto err; + + if (!BN_add(p, p, Xp)) + goto err; + + /* p now equals Yp0 */ + + for (;;) + { + int i = 1; + BN_GENCB_call(cb, 0, i++); + if (!BN_copy(pm1, p)) + goto err; + if (!BN_sub_word(pm1, 1)) + goto err; + if (!BN_gcd(t, pm1, e, ctx)) + goto err; + if (BN_is_one(t) + /* X9.31 specifies 8 MR and 1 Lucas test or any prime test + * offering similar or better guarantees 50 MR is considerably + * better. + */ + && BN_is_prime_fasttest_ex(p, 50, ctx, 1, cb)) + break; + if (!BN_add(p, p, p1p2)) + goto err; + } + + BN_GENCB_call(cb, 3, 0); + + ret = 1; + + err: + + BN_CTX_end(ctx); + + return ret; + } + +/* Generate pair of paramters Xp, Xq for X9.31 prime generation. + * Note: nbits paramter is sum of number of bits in both. + */ + +int BN_X931_generate_Xpq(BIGNUM *Xp, BIGNUM *Xq, int nbits, BN_CTX *ctx) + { + BIGNUM *t; + int i; + /* Number of bits for each prime is of the form + * 512+128s for s = 0, 1, ... + */ + if ((nbits < 1024) || (nbits & 0xff)) + return 0; + nbits >>= 1; + /* The random value Xp must be between sqrt(2) * 2^(nbits-1) and + * 2^nbits - 1. By setting the top two bits we ensure that the lower + * bound is exceeded. + */ + if (!BN_rand(Xp, nbits, 1, 0)) + return 0; + + BN_CTX_start(ctx); + t = BN_CTX_get(ctx); + + for (i = 0; i < 1000; i++) + { + if (!BN_rand(Xq, nbits, 1, 0)) + return 0; + /* Check that |Xp - Xq| > 2^(nbits - 100) */ + BN_sub(t, Xp, Xq); + if (BN_num_bits(t) > (nbits - 100)) + break; + } + + BN_CTX_end(ctx); + + if (i < 1000) + return 1; + + return 0; + + } + +/* Generate primes using X9.31 algorithm. Of the values p, p1, p2, Xp1 + * and Xp2 only 'p' needs to be non-NULL. If any of the others are not NULL + * the relevant parameter will be stored in it. + * + * Due to the fact that |Xp - Xq| > 2^(nbits - 100) must be satisfied Xp and Xq + * are generated using the previous function and supplied as input. + */ + +int BN_X931_generate_prime_ex(BIGNUM *p, BIGNUM *p1, BIGNUM *p2, + BIGNUM *Xp1, BIGNUM *Xp2, + const BIGNUM *Xp, + const BIGNUM *e, BN_CTX *ctx, + BN_GENCB *cb) + { + int ret = 0; + + BN_CTX_start(ctx); + if (!Xp1) + Xp1 = BN_CTX_get(ctx); + if (!Xp2) + Xp2 = BN_CTX_get(ctx); + + if (!BN_rand(Xp1, 101, 0, 0)) + goto error; + if (!BN_rand(Xp2, 101, 0, 0)) + goto error; + if (!BN_X931_derive_prime_ex(p, p1, p2, Xp, Xp1, Xp2, e, ctx, cb)) + goto error; + + ret = 1; + + error: + BN_CTX_end(ctx); + + return ret; + + } + diff --git a/crypto/openssl/crypto/asn1/a_digest.c b/crypto/openssl/crypto/buffer/buf_str.c similarity index 76% copy from crypto/openssl/crypto/asn1/a_digest.c copy to crypto/openssl/crypto/buffer/buf_str.c index d00d9e22b1..151f5ea971 100644 --- a/crypto/openssl/crypto/asn1/a_digest.c +++ b/crypto/openssl/crypto/buffer/buf_str.c @@ -1,4 +1,4 @@ -/* crypto/asn1/a_digest.c */ +/* crypto/buffer/buffer.c */ /* Copyright (C) 1995-1998 Eric Young (eay@cryptsoft.com) * All rights reserved. * @@ -57,55 +57,63 @@ */ #include -#include - #include "cryptlib.h" - -#ifndef NO_SYS_TYPES_H -# include -#endif - -#include -#include #include -#include -#ifndef NO_ASN1_OLD +char *BUF_strdup(const char *str) + { + if (str == NULL) return(NULL); + return BUF_strndup(str, strlen(str)); + } -int ASN1_digest(i2d_of_void *i2d, const EVP_MD *type, char *data, - unsigned char *md, unsigned int *len) +char *BUF_strndup(const char *str, size_t siz) { - int i; - unsigned char *str,*p; + char *ret; - i=i2d(data,NULL); - if ((str=(unsigned char *)OPENSSL_malloc(i)) == NULL) + if (str == NULL) return(NULL); + + ret=OPENSSL_malloc(siz+1); + if (ret == NULL) { - ASN1err(ASN1_F_ASN1_DIGEST,ERR_R_MALLOC_FAILURE); - return(0); + BUFerr(BUF_F_BUF_STRNDUP,ERR_R_MALLOC_FAILURE); + return(NULL); } - p=str; - i2d(data,&p); - - EVP_Digest(str, i, md, len, type, NULL); - OPENSSL_free(str); - return(1); + BUF_strlcpy(ret,str,siz+1); + return(ret); } -#endif - - -int ASN1_item_digest(const ASN1_ITEM *it, const EVP_MD *type, void *asn, - unsigned char *md, unsigned int *len) +void *BUF_memdup(const void *data, size_t siz) { - int i; - unsigned char *str = NULL; + void *ret; - i=ASN1_item_i2d(asn,&str, it); - if (!str) return(0); + if (data == NULL) return(NULL); - EVP_Digest(str, i, md, len, type, NULL); - OPENSSL_free(str); - return(1); + ret=OPENSSL_malloc(siz); + if (ret == NULL) + { + BUFerr(BUF_F_BUF_MEMDUP,ERR_R_MALLOC_FAILURE); + return(NULL); + } + return memcpy(ret, data, siz); + } + +size_t BUF_strlcpy(char *dst, const char *src, size_t size) + { + size_t l = 0; + for(; size > 1 && *src; size--) + { + *dst++ = *src++; + l++; + } + if (size) + *dst = '\0'; + return l + strlen(src); } +size_t BUF_strlcat(char *dst, const char *src, size_t size) + { + size_t l = 0; + for(; size > 0 && *dst; size--, dst++) + l++; + return l + BUF_strlcpy(dst, src, size); + } diff --git a/crypto/openssl/crypto/buffer/buffer.c b/crypto/openssl/crypto/buffer/buffer.c index 620ea8d536..f4b358bbbd 100644 --- a/crypto/openssl/crypto/buffer/buffer.c +++ b/crypto/openssl/crypto/buffer/buffer.c @@ -162,64 +162,6 @@ int BUF_MEM_grow_clean(BUF_MEM *str, size_t len) return(len); } -char *BUF_strdup(const char *str) - { - if (str == NULL) return(NULL); - return BUF_strndup(str, strlen(str)); - } - -char *BUF_strndup(const char *str, size_t siz) - { - char *ret; - - if (str == NULL) return(NULL); - - ret=OPENSSL_malloc(siz+1); - if (ret == NULL) - { - BUFerr(BUF_F_BUF_STRNDUP,ERR_R_MALLOC_FAILURE); - return(NULL); - } - BUF_strlcpy(ret,str,siz+1); - return(ret); - } - -void *BUF_memdup(const void *data, size_t siz) - { - void *ret; - - if (data == NULL) return(NULL); - - ret=OPENSSL_malloc(siz); - if (ret == NULL) - { - BUFerr(BUF_F_BUF_MEMDUP,ERR_R_MALLOC_FAILURE); - return(NULL); - } - return memcpy(ret, data, siz); - } - -size_t BUF_strlcpy(char *dst, const char *src, size_t size) - { - size_t l = 0; - for(; size > 1 && *src; size--) - { - *dst++ = *src++; - l++; - } - if (size) - *dst = '\0'; - return l + strlen(src); - } - -size_t BUF_strlcat(char *dst, const char *src, size_t size) - { - size_t l = 0; - for(; size > 0 && *dst; size--, dst++) - l++; - return l + BUF_strlcpy(dst, src, size); - } - void BUF_reverse(unsigned char *out, unsigned char *in, size_t size) { size_t i; diff --git a/crypto/openssl/crypto/camellia/asm/cmll-x86.pl b/crypto/openssl/crypto/camellia/asm/cmll-x86.pl index 027302ac86..c314d62312 100644 --- a/crypto/openssl/crypto/camellia/asm/cmll-x86.pl +++ b/crypto/openssl/crypto/camellia/asm/cmll-x86.pl @@ -723,11 +723,11 @@ my $bias=int(@T[0])?shift(@T):0; &function_end("Camellia_Ekeygen"); if ($OPENSSL) { -# int Camellia_set_key ( +# int private_Camellia_set_key ( # const unsigned char *userKey, # int bits, # CAMELLIA_KEY *key) -&function_begin_B("Camellia_set_key"); +&function_begin_B("private_Camellia_set_key"); &push ("ebx"); &mov ("ecx",&wparam(0)); # pull arguments &mov ("ebx",&wparam(1)); @@ -760,7 +760,7 @@ if ($OPENSSL) { &set_label("done",4); &pop ("ebx"); &ret (); -&function_end_B("Camellia_set_key"); +&function_end_B("private_Camellia_set_key"); } @SBOX=( diff --git a/crypto/openssl/crypto/camellia/camellia.h b/crypto/openssl/crypto/camellia/camellia.h index cf0457dd97..67911e0adf 100644 --- a/crypto/openssl/crypto/camellia/camellia.h +++ b/crypto/openssl/crypto/camellia/camellia.h @@ -88,6 +88,10 @@ struct camellia_key_st }; typedef struct camellia_key_st CAMELLIA_KEY; +#ifdef OPENSSL_FIPS +int private_Camellia_set_key(const unsigned char *userKey, const int bits, + CAMELLIA_KEY *key); +#endif int Camellia_set_key(const unsigned char *userKey, const int bits, CAMELLIA_KEY *key); diff --git a/crypto/openssl/crypto/camellia/cmll_locl.h b/crypto/openssl/crypto/camellia/cmll_locl.h index 4a4d880d16..246b6ce1d8 100644 --- a/crypto/openssl/crypto/camellia/cmll_locl.h +++ b/crypto/openssl/crypto/camellia/cmll_locl.h @@ -71,7 +71,8 @@ typedef unsigned int u32; typedef unsigned char u8; -int Camellia_Ekeygen(int keyBitLength, const u8 *rawKey, KEY_TABLE_TYPE keyTable); +int Camellia_Ekeygen(int keyBitLength, const u8 *rawKey, + KEY_TABLE_TYPE keyTable); void Camellia_EncryptBlock_Rounds(int grandRounds, const u8 plaintext[], const KEY_TABLE_TYPE keyTable, u8 ciphertext[]); void Camellia_DecryptBlock_Rounds(int grandRounds, const u8 ciphertext[], @@ -80,4 +81,6 @@ void Camellia_EncryptBlock(int keyBitLength, const u8 plaintext[], const KEY_TABLE_TYPE keyTable, u8 ciphertext[]); void Camellia_DecryptBlock(int keyBitLength, const u8 ciphertext[], const KEY_TABLE_TYPE keyTable, u8 plaintext[]); +int private_Camellia_set_key(const unsigned char *userKey, const int bits, + CAMELLIA_KEY *key); #endif /* #ifndef HEADER_CAMELLIA_LOCL_H */ diff --git a/crypto/openssl/crypto/camellia/cmll_misc.c b/crypto/openssl/crypto/camellia/cmll_misc.c index f44689124b..f44d48564c 100644 --- a/crypto/openssl/crypto/camellia/cmll_misc.c +++ b/crypto/openssl/crypto/camellia/cmll_misc.c @@ -50,12 +50,13 @@ */ #include +#include #include #include "cmll_locl.h" const char CAMELLIA_version[]="CAMELLIA" OPENSSL_VERSION_PTEXT; -int Camellia_set_key(const unsigned char *userKey, const int bits, +int private_Camellia_set_key(const unsigned char *userKey, const int bits, CAMELLIA_KEY *key) { if(!userKey || !key) diff --git a/crypto/openssl/crypto/aes/aes_misc.c b/crypto/openssl/crypto/camellia/cmll_utl.c similarity index 85% copy from crypto/openssl/crypto/aes/aes_misc.c copy to crypto/openssl/crypto/camellia/cmll_utl.c index 4fead1b4c7..7a35711ec1 100644 --- a/crypto/openssl/crypto/aes/aes_misc.c +++ b/crypto/openssl/crypto/camellia/cmll_utl.c @@ -1,6 +1,6 @@ -/* crypto/aes/aes_misc.c -*- mode:C; c-file-style: "eay" -*- */ +/* crypto/camellia/cmll_utl.c -*- mode:C; c-file-style: "eay" -*- */ /* ==================================================================== - * Copyright (c) 1998-2002 The OpenSSL Project. All rights reserved. + * Copyright (c) 2011 The OpenSSL Project. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions @@ -48,17 +48,17 @@ * ==================================================================== * */ - + #include -#include -#include "aes_locl.h" - -const char AES_version[]="AES" OPENSSL_VERSION_PTEXT; +#include +#include +#include "cmll_locl.h" -const char *AES_options(void) { -#ifdef FULL_UNROLL - return "aes(full)"; -#else - return "aes(partial)"; +int Camellia_set_key(const unsigned char *userKey, const int bits, + CAMELLIA_KEY *key) + { +#ifdef OPENSSL_FIPS + fips_cipher_abort(Camellia); #endif -} + return private_Camellia_set_key(userKey, bits, key); + } diff --git a/crypto/openssl/crypto/cast/c_skey.c b/crypto/openssl/crypto/cast/c_skey.c index 76e40005c9..cb6bf9fee3 100644 --- a/crypto/openssl/crypto/cast/c_skey.c +++ b/crypto/openssl/crypto/cast/c_skey.c @@ -56,6 +56,7 @@ * [including the GNU Public Licence.] */ +#include #include #include "cast_lcl.h" #include "cast_s.h" @@ -71,8 +72,14 @@ #define S5 CAST_S_table5 #define S6 CAST_S_table6 #define S7 CAST_S_table7 - void CAST_set_key(CAST_KEY *key, int len, const unsigned char *data) +#ifdef OPENSSL_FIPS + { + fips_cipher_abort(CAST); + private_CAST_set_key(key, len, data); + } +void private_CAST_set_key(CAST_KEY *key, int len, const unsigned char *data) +#endif { CAST_LONG x[16]; CAST_LONG z[16]; diff --git a/crypto/openssl/crypto/cast/cast.h b/crypto/openssl/crypto/cast/cast.h index 1a264f8143..203922ea2b 100644 --- a/crypto/openssl/crypto/cast/cast.h +++ b/crypto/openssl/crypto/cast/cast.h @@ -83,7 +83,9 @@ typedef struct cast_key_st int short_key; /* Use reduced rounds for short key */ } CAST_KEY; - +#ifdef OPENSSL_FIPS +void private_CAST_set_key(CAST_KEY *key, int len, const unsigned char *data); +#endif void CAST_set_key(CAST_KEY *key, int len, const unsigned char *data); void CAST_ecb_encrypt(const unsigned char *in, unsigned char *out, const CAST_KEY *key, int enc); diff --git a/crypto/openssl/crypto/ecdh/ech_locl.h b/crypto/openssl/crypto/cmac/cm_ameth.c similarity index 70% copy from crypto/openssl/crypto/ecdh/ech_locl.h copy to crypto/openssl/crypto/cmac/cm_ameth.c index f658526a7e..0b8e5670b0 100644 --- a/crypto/openssl/crypto/ecdh/ech_locl.h +++ b/crypto/openssl/crypto/cmac/cm_ameth.c @@ -1,6 +1,8 @@ -/* crypto/ecdh/ech_locl.h */ +/* Written by Dr Stephen N Henson (steve@openssl.org) for the OpenSSL + * project 2010. + */ /* ==================================================================== - * Copyright (c) 2000-2005 The OpenSSL Project. All rights reserved. + * Copyright (c) 2010 The OpenSSL Project. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions @@ -46,49 +48,50 @@ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED * OF THE POSSIBILITY OF SUCH DAMAGE. * ==================================================================== - * - * This product includes cryptographic software written by Eric Young - * (eay@cryptsoft.com). This product includes software written by Tim - * Hudson (tjh@cryptsoft.com). - * */ -#ifndef HEADER_ECH_LOCL_H -#define HEADER_ECH_LOCL_H +#include +#include "cryptlib.h" +#include +#include +#include "asn1_locl.h" -#include +/* CMAC "ASN1" method. This is just here to indicate the + * maximum CMAC output length and to free up a CMAC + * key. + */ -#ifdef __cplusplus -extern "C" { -#endif +static int cmac_size(const EVP_PKEY *pkey) + { + return EVP_MAX_BLOCK_LENGTH; + } -struct ecdh_method +static void cmac_key_free(EVP_PKEY *pkey) { - const char *name; - int (*compute_key)(void *key, size_t outlen, const EC_POINT *pub_key, EC_KEY *ecdh, - void *(*KDF)(const void *in, size_t inlen, void *out, size_t *outlen)); -#if 0 - int (*init)(EC_KEY *eckey); - int (*finish)(EC_KEY *eckey); -#endif - int flags; - char *app_data; - }; + CMAC_CTX *cmctx = (CMAC_CTX *)pkey->pkey.ptr; + if (cmctx) + CMAC_CTX_free(cmctx); + } -typedef struct ecdh_data_st { - /* EC_KEY_METH_DATA part */ - int (*init)(EC_KEY *); - /* method specific part */ - ENGINE *engine; - int flags; - const ECDH_METHOD *meth; - CRYPTO_EX_DATA ex_data; -} ECDH_DATA; +const EVP_PKEY_ASN1_METHOD cmac_asn1_meth = + { + EVP_PKEY_CMAC, + EVP_PKEY_CMAC, + 0, + + "CMAC", + "OpenSSL CMAC method", + + 0,0,0,0, -ECDH_DATA *ecdh_check(EC_KEY *); + 0,0,0, -#ifdef __cplusplus -} -#endif + cmac_size, + 0, + 0,0,0,0,0,0,0, + + cmac_key_free, + 0, + 0,0 + }; -#endif /* HEADER_ECH_LOCL_H */ diff --git a/crypto/openssl/crypto/hmac/hm_pmeth.c b/crypto/openssl/crypto/cmac/cm_pmeth.c similarity index 54% copy from crypto/openssl/crypto/hmac/hm_pmeth.c copy to crypto/openssl/crypto/cmac/cm_pmeth.c index 71e8567a14..072228ec7f 100644 --- a/crypto/openssl/crypto/hmac/hm_pmeth.c +++ b/crypto/openssl/crypto/cmac/cm_pmeth.c @@ -1,8 +1,8 @@ /* Written by Dr Stephen N Henson (steve@openssl.org) for the OpenSSL - * project 2007. + * project 2010. */ /* ==================================================================== - * Copyright (c) 2007 The OpenSSL Project. All rights reserved. + * Copyright (c) 2010 The OpenSSL Project. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions @@ -48,11 +48,6 @@ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED * OF THE POSSIBILITY OF SUCH DAMAGE. * ==================================================================== - * - * This product includes cryptographic software written by Eric Young - * (eay@cryptsoft.com). This product includes software written by Tim - * Hudson (tjh@cryptsoft.com). - * */ #include @@ -60,140 +55,94 @@ #include #include #include -#include +#include #include "evp_locl.h" -/* HMAC pkey context structure */ - -typedef struct - { - const EVP_MD *md; /* MD for HMAC use */ - ASN1_OCTET_STRING ktmp; /* Temp storage for key */ - HMAC_CTX ctx; - } HMAC_PKEY_CTX; +/* The context structure and "key" is simply a CMAC_CTX */ -static int pkey_hmac_init(EVP_PKEY_CTX *ctx) +static int pkey_cmac_init(EVP_PKEY_CTX *ctx) { - HMAC_PKEY_CTX *hctx; - hctx = OPENSSL_malloc(sizeof(HMAC_PKEY_CTX)); - if (!hctx) + ctx->data = CMAC_CTX_new(); + if (!ctx->data) return 0; - hctx->md = NULL; - hctx->ktmp.data = NULL; - hctx->ktmp.length = 0; - hctx->ktmp.flags = 0; - hctx->ktmp.type = V_ASN1_OCTET_STRING; - HMAC_CTX_init(&hctx->ctx); - - ctx->data = hctx; ctx->keygen_info_count = 0; - return 1; } -static int pkey_hmac_copy(EVP_PKEY_CTX *dst, EVP_PKEY_CTX *src) +static int pkey_cmac_copy(EVP_PKEY_CTX *dst, EVP_PKEY_CTX *src) { - HMAC_PKEY_CTX *sctx, *dctx; - if (!pkey_hmac_init(dst)) + if (!pkey_cmac_init(dst)) + return 0; + if (!CMAC_CTX_copy(dst->data, src->data)) return 0; - sctx = src->data; - dctx = dst->data; - dctx->md = sctx->md; - HMAC_CTX_init(&dctx->ctx); - HMAC_CTX_copy(&dctx->ctx, &sctx->ctx); - if (sctx->ktmp.data) - { - if (!ASN1_OCTET_STRING_set(&dctx->ktmp, - sctx->ktmp.data, sctx->ktmp.length)) - return 0; - } return 1; } -static void pkey_hmac_cleanup(EVP_PKEY_CTX *ctx) +static void pkey_cmac_cleanup(EVP_PKEY_CTX *ctx) { - HMAC_PKEY_CTX *hctx = ctx->data; - HMAC_CTX_cleanup(&hctx->ctx); - if (hctx->ktmp.data) - { - if (hctx->ktmp.length) - OPENSSL_cleanse(hctx->ktmp.data, hctx->ktmp.length); - OPENSSL_free(hctx->ktmp.data); - hctx->ktmp.data = NULL; - } - OPENSSL_free(hctx); + CMAC_CTX_free(ctx->data); } -static int pkey_hmac_keygen(EVP_PKEY_CTX *ctx, EVP_PKEY *pkey) +static int pkey_cmac_keygen(EVP_PKEY_CTX *ctx, EVP_PKEY *pkey) { - ASN1_OCTET_STRING *hkey = NULL; - HMAC_PKEY_CTX *hctx = ctx->data; - if (!hctx->ktmp.data) + CMAC_CTX *cmkey = CMAC_CTX_new(); + CMAC_CTX *cmctx = ctx->data; + if (!cmkey) return 0; - hkey = ASN1_OCTET_STRING_dup(&hctx->ktmp); - if (!hkey) + if (!CMAC_CTX_copy(cmkey, cmctx)) + { + CMAC_CTX_free(cmkey); return 0; - EVP_PKEY_assign(pkey, EVP_PKEY_HMAC, hkey); + } + EVP_PKEY_assign(pkey, EVP_PKEY_CMAC, cmkey); return 1; } static int int_update(EVP_MD_CTX *ctx,const void *data,size_t count) { - HMAC_PKEY_CTX *hctx = ctx->pctx->data; - HMAC_Update(&hctx->ctx, data, count); + if (!CMAC_Update(ctx->pctx->data, data, count)) + return 0; return 1; } -static int hmac_signctx_init(EVP_PKEY_CTX *ctx, EVP_MD_CTX *mctx) +static int cmac_signctx_init(EVP_PKEY_CTX *ctx, EVP_MD_CTX *mctx) { - HMAC_PKEY_CTX *hctx = ctx->data; - HMAC_CTX_set_flags(&hctx->ctx, mctx->flags & ~EVP_MD_CTX_FLAG_NO_INIT); EVP_MD_CTX_set_flags(mctx, EVP_MD_CTX_FLAG_NO_INIT); mctx->update = int_update; return 1; } -static int hmac_signctx(EVP_PKEY_CTX *ctx, unsigned char *sig, size_t *siglen, +static int cmac_signctx(EVP_PKEY_CTX *ctx, unsigned char *sig, size_t *siglen, EVP_MD_CTX *mctx) { - unsigned int hlen; - HMAC_PKEY_CTX *hctx = ctx->data; - int l = EVP_MD_CTX_size(mctx); - - if (l < 0) - return 0; - *siglen = l; - if (!sig) - return 1; - - HMAC_Final(&hctx->ctx, sig, &hlen); - *siglen = (size_t)hlen; - return 1; + return CMAC_Final(ctx->data, sig, siglen); } -static int pkey_hmac_ctrl(EVP_PKEY_CTX *ctx, int type, int p1, void *p2) +static int pkey_cmac_ctrl(EVP_PKEY_CTX *ctx, int type, int p1, void *p2) { - HMAC_PKEY_CTX *hctx = ctx->data; - ASN1_OCTET_STRING *key; + CMAC_CTX *cmctx = ctx->data; switch (type) { case EVP_PKEY_CTRL_SET_MAC_KEY: - if ((!p2 && p1 > 0) || (p1 < -1)) + if (!p2 || p1 < 0) return 0; - if (!ASN1_OCTET_STRING_set(&hctx->ktmp, p2, p1)) + if (!CMAC_Init(cmctx, p2, p1, NULL, NULL)) return 0; break; - case EVP_PKEY_CTRL_MD: - hctx->md = p2; + case EVP_PKEY_CTRL_CIPHER: + if (!CMAC_Init(cmctx, NULL, 0, p2, ctx->engine)) + return 0; break; - case EVP_PKEY_CTRL_DIGESTINIT: - key = (ASN1_OCTET_STRING *)ctx->pkey->pkey.ptr; - HMAC_Init_ex(&hctx->ctx, key->data, key->length, hctx->md, - ctx->engine); + case EVP_PKEY_CTRL_MD: + if (ctx->pkey && !CMAC_CTX_copy(ctx->data, + (CMAC_CTX *)ctx->pkey->pkey.ptr)) + return 0; + if (!CMAC_Init(cmctx, NULL, 0, NULL, NULL)) + return 0; break; default: @@ -203,7 +152,7 @@ static int pkey_hmac_ctrl(EVP_PKEY_CTX *ctx, int type, int p1, void *p2) return 1; } -static int pkey_hmac_ctrl_str(EVP_PKEY_CTX *ctx, +static int pkey_cmac_ctrl_str(EVP_PKEY_CTX *ctx, const char *type, const char *value) { if (!value) @@ -213,8 +162,16 @@ static int pkey_hmac_ctrl_str(EVP_PKEY_CTX *ctx, if (!strcmp(type, "key")) { void *p = (void *)value; - return pkey_hmac_ctrl(ctx, EVP_PKEY_CTRL_SET_MAC_KEY, - -1, p); + return pkey_cmac_ctrl(ctx, EVP_PKEY_CTRL_SET_MAC_KEY, + strlen(p), p); + } + if (!strcmp(type, "cipher")) + { + const EVP_CIPHER *c; + c = EVP_get_cipherbyname(value); + if (!c) + return 0; + return pkey_cmac_ctrl(ctx, EVP_PKEY_CTRL_CIPHER, -1, (void *)c); } if (!strcmp(type, "hexkey")) { @@ -224,25 +181,25 @@ static int pkey_hmac_ctrl_str(EVP_PKEY_CTX *ctx, key = string_to_hex(value, &keylen); if (!key) return 0; - r = pkey_hmac_ctrl(ctx, EVP_PKEY_CTRL_SET_MAC_KEY, keylen, key); + r = pkey_cmac_ctrl(ctx, EVP_PKEY_CTRL_SET_MAC_KEY, keylen, key); OPENSSL_free(key); return r; } return -2; } -const EVP_PKEY_METHOD hmac_pkey_meth = +const EVP_PKEY_METHOD cmac_pkey_meth = { - EVP_PKEY_HMAC, - 0, - pkey_hmac_init, - pkey_hmac_copy, - pkey_hmac_cleanup, + EVP_PKEY_CMAC, + EVP_PKEY_FLAG_SIGCTX_CUSTOM, + pkey_cmac_init, + pkey_cmac_copy, + pkey_cmac_cleanup, 0, 0, 0, - pkey_hmac_keygen, + pkey_cmac_keygen, 0, 0, @@ -250,8 +207,8 @@ const EVP_PKEY_METHOD hmac_pkey_meth = 0,0, - hmac_signctx_init, - hmac_signctx, + cmac_signctx_init, + cmac_signctx, 0,0, @@ -261,7 +218,7 @@ const EVP_PKEY_METHOD hmac_pkey_meth = 0,0, - pkey_hmac_ctrl, - pkey_hmac_ctrl_str + pkey_cmac_ctrl, + pkey_cmac_ctrl_str }; diff --git a/crypto/openssl/crypto/cmac/cmac.c b/crypto/openssl/crypto/cmac/cmac.c new file mode 100644 index 0000000000..b58602680b --- /dev/null +++ b/crypto/openssl/crypto/cmac/cmac.c @@ -0,0 +1,306 @@ +/* crypto/cmac/cmac.c */ +/* Written by Dr Stephen N Henson (steve@openssl.org) for the OpenSSL + * project. + */ +/* ==================================================================== + * Copyright (c) 2010 The OpenSSL Project. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * + * 3. All advertising materials mentioning features or use of this + * software must display the following acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit. (http://www.OpenSSL.org/)" + * + * 4. The names "OpenSSL Toolkit" and "OpenSSL Project" must not be used to + * endorse or promote products derived from this software without + * prior written permission. For written permission, please contact + * licensing@OpenSSL.org. + * + * 5. Products derived from this software may not be called "OpenSSL" + * nor may "OpenSSL" appear in their names without prior written + * permission of the OpenSSL Project. + * + * 6. Redistributions of any form whatsoever must retain the following + * acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit (http://www.OpenSSL.org/)" + * + * THIS SOFTWARE IS PROVIDED BY THE OpenSSL PROJECT ``AS IS'' AND ANY + * EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR + * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE OpenSSL PROJECT OR + * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED + * OF THE POSSIBILITY OF SUCH DAMAGE. + * ==================================================================== + */ + +#include +#include +#include +#include "cryptlib.h" +#include + +#ifdef OPENSSL_FIPS +#include +#endif + +struct CMAC_CTX_st + { + /* Cipher context to use */ + EVP_CIPHER_CTX cctx; + /* Keys k1 and k2 */ + unsigned char k1[EVP_MAX_BLOCK_LENGTH]; + unsigned char k2[EVP_MAX_BLOCK_LENGTH]; + /* Temporary block */ + unsigned char tbl[EVP_MAX_BLOCK_LENGTH]; + /* Last (possibly partial) block */ + unsigned char last_block[EVP_MAX_BLOCK_LENGTH]; + /* Number of bytes in last block: -1 means context not initialised */ + int nlast_block; + }; + + +/* Make temporary keys K1 and K2 */ + +static void make_kn(unsigned char *k1, unsigned char *l, int bl) + { + int i; + /* Shift block to left, including carry */ + for (i = 0; i < bl; i++) + { + k1[i] = l[i] << 1; + if (i < bl - 1 && l[i + 1] & 0x80) + k1[i] |= 1; + } + /* If MSB set fixup with R */ + if (l[0] & 0x80) + k1[bl - 1] ^= bl == 16 ? 0x87 : 0x1b; + } + +CMAC_CTX *CMAC_CTX_new(void) + { + CMAC_CTX *ctx; + ctx = OPENSSL_malloc(sizeof(CMAC_CTX)); + if (!ctx) + return NULL; + EVP_CIPHER_CTX_init(&ctx->cctx); + ctx->nlast_block = -1; + return ctx; + } + +void CMAC_CTX_cleanup(CMAC_CTX *ctx) + { +#ifdef OPENSSL_FIPS + if (FIPS_mode() && !ctx->cctx.engine) + { + FIPS_cmac_ctx_cleanup(ctx); + return; + } +#endif + EVP_CIPHER_CTX_cleanup(&ctx->cctx); + OPENSSL_cleanse(ctx->tbl, EVP_MAX_BLOCK_LENGTH); + OPENSSL_cleanse(ctx->k1, EVP_MAX_BLOCK_LENGTH); + OPENSSL_cleanse(ctx->k2, EVP_MAX_BLOCK_LENGTH); + OPENSSL_cleanse(ctx->last_block, EVP_MAX_BLOCK_LENGTH); + ctx->nlast_block = -1; + } + +EVP_CIPHER_CTX *CMAC_CTX_get0_cipher_ctx(CMAC_CTX *ctx) + { + return &ctx->cctx; + } + +void CMAC_CTX_free(CMAC_CTX *ctx) + { + CMAC_CTX_cleanup(ctx); + OPENSSL_free(ctx); + } + +int CMAC_CTX_copy(CMAC_CTX *out, const CMAC_CTX *in) + { + int bl; + if (in->nlast_block == -1) + return 0; + if (!EVP_CIPHER_CTX_copy(&out->cctx, &in->cctx)) + return 0; + bl = EVP_CIPHER_CTX_block_size(&in->cctx); + memcpy(out->k1, in->k1, bl); + memcpy(out->k2, in->k2, bl); + memcpy(out->tbl, in->tbl, bl); + memcpy(out->last_block, in->last_block, bl); + out->nlast_block = in->nlast_block; + return 1; + } + +int CMAC_Init(CMAC_CTX *ctx, const void *key, size_t keylen, + const EVP_CIPHER *cipher, ENGINE *impl) + { + static unsigned char zero_iv[EVP_MAX_BLOCK_LENGTH]; +#ifdef OPENSSL_FIPS + if (FIPS_mode()) + { + /* If we have an ENGINE need to allow non FIPS */ + if ((impl || ctx->cctx.engine) + && !(ctx->cctx.flags & EVP_CIPH_FLAG_NON_FIPS_ALLOW)) + + { + EVPerr(EVP_F_CMAC_INIT, EVP_R_DISABLED_FOR_FIPS); + return 0; + } + /* Other algorithm blocking will be done in FIPS_cmac_init, + * via FIPS_cipherinit(). + */ + if (!impl && !ctx->cctx.engine) + return FIPS_cmac_init(ctx, key, keylen, cipher, NULL); + } +#endif + /* All zeros means restart */ + if (!key && !cipher && !impl && keylen == 0) + { + /* Not initialised */ + if (ctx->nlast_block == -1) + return 0; + if (!EVP_EncryptInit_ex(&ctx->cctx, NULL, NULL, NULL, zero_iv)) + return 0; + return 1; + } + /* Initialiase context */ + if (cipher && !EVP_EncryptInit_ex(&ctx->cctx, cipher, impl, NULL, NULL)) + return 0; + /* Non-NULL key means initialisation complete */ + if (key) + { + int bl; + if (!EVP_CIPHER_CTX_cipher(&ctx->cctx)) + return 0; + if (!EVP_CIPHER_CTX_set_key_length(&ctx->cctx, keylen)) + return 0; + if (!EVP_EncryptInit_ex(&ctx->cctx, NULL, NULL, key, zero_iv)) + return 0; + bl = EVP_CIPHER_CTX_block_size(&ctx->cctx); + if (!EVP_Cipher(&ctx->cctx, ctx->tbl, zero_iv, bl)) + return 0; + make_kn(ctx->k1, ctx->tbl, bl); + make_kn(ctx->k2, ctx->k1, bl); + OPENSSL_cleanse(ctx->tbl, bl); + /* Reset context again ready for first data block */ + if (!EVP_EncryptInit_ex(&ctx->cctx, NULL, NULL, NULL, zero_iv)) + return 0; + /* Zero tbl so resume works */ + memset(ctx->tbl, 0, bl); + ctx->nlast_block = 0; + } + return 1; + } + +int CMAC_Update(CMAC_CTX *ctx, const void *in, size_t dlen) + { + const unsigned char *data = in; + size_t bl; +#ifdef OPENSSL_FIPS + if (FIPS_mode() && !ctx->cctx.engine) + return FIPS_cmac_update(ctx, in, dlen); +#endif + if (ctx->nlast_block == -1) + return 0; + if (dlen == 0) + return 1; + bl = EVP_CIPHER_CTX_block_size(&ctx->cctx); + /* Copy into partial block if we need to */ + if (ctx->nlast_block > 0) + { + size_t nleft; + nleft = bl - ctx->nlast_block; + if (dlen < nleft) + nleft = dlen; + memcpy(ctx->last_block + ctx->nlast_block, data, nleft); + dlen -= nleft; + ctx->nlast_block += nleft; + /* If no more to process return */ + if (dlen == 0) + return 1; + data += nleft; + /* Else not final block so encrypt it */ + if (!EVP_Cipher(&ctx->cctx, ctx->tbl, ctx->last_block,bl)) + return 0; + } + /* Encrypt all but one of the complete blocks left */ + while(dlen > bl) + { + if (!EVP_Cipher(&ctx->cctx, ctx->tbl, data, bl)) + return 0; + dlen -= bl; + data += bl; + } + /* Copy any data left to last block buffer */ + memcpy(ctx->last_block, data, dlen); + ctx->nlast_block = dlen; + return 1; + + } + +int CMAC_Final(CMAC_CTX *ctx, unsigned char *out, size_t *poutlen) + { + int i, bl, lb; +#ifdef OPENSSL_FIPS + if (FIPS_mode() && !ctx->cctx.engine) + return FIPS_cmac_final(ctx, out, poutlen); +#endif + if (ctx->nlast_block == -1) + return 0; + bl = EVP_CIPHER_CTX_block_size(&ctx->cctx); + *poutlen = (size_t)bl; + if (!out) + return 1; + lb = ctx->nlast_block; + /* Is last block complete? */ + if (lb == bl) + { + for (i = 0; i < bl; i++) + out[i] = ctx->last_block[i] ^ ctx->k1[i]; + } + else + { + ctx->last_block[lb] = 0x80; + if (bl - lb > 1) + memset(ctx->last_block + lb + 1, 0, bl - lb - 1); + for (i = 0; i < bl; i++) + out[i] = ctx->last_block[i] ^ ctx->k2[i]; + } + if (!EVP_Cipher(&ctx->cctx, out, out, bl)) + { + OPENSSL_cleanse(out, bl); + return 0; + } + return 1; + } + +int CMAC_resume(CMAC_CTX *ctx) + { + if (ctx->nlast_block == -1) + return 0; + /* The buffer "tbl" containes the last fully encrypted block + * which is the last IV (or all zeroes if no last encrypted block). + * The last block has not been modified since CMAC_final(). + * So reinitliasing using the last decrypted block will allow + * CMAC to continue after calling CMAC_Final(). + */ + return EVP_EncryptInit_ex(&ctx->cctx, NULL, NULL, NULL, ctx->tbl); + } diff --git a/crypto/openssl/crypto/ecdh/ech_locl.h b/crypto/openssl/crypto/cmac/cmac.h similarity index 71% copy from crypto/openssl/crypto/ecdh/ech_locl.h copy to crypto/openssl/crypto/cmac/cmac.h index f658526a7e..712e92dced 100644 --- a/crypto/openssl/crypto/ecdh/ech_locl.h +++ b/crypto/openssl/crypto/cmac/cmac.h @@ -1,6 +1,9 @@ -/* crypto/ecdh/ech_locl.h */ +/* crypto/cmac/cmac.h */ +/* Written by Dr Stephen N Henson (steve@openssl.org) for the OpenSSL + * project. + */ /* ==================================================================== - * Copyright (c) 2000-2005 The OpenSSL Project. All rights reserved. + * Copyright (c) 2010 The OpenSSL Project. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions @@ -46,49 +49,34 @@ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED * OF THE POSSIBILITY OF SUCH DAMAGE. * ==================================================================== - * - * This product includes cryptographic software written by Eric Young - * (eay@cryptsoft.com). This product includes software written by Tim - * Hudson (tjh@cryptsoft.com). - * */ -#ifndef HEADER_ECH_LOCL_H -#define HEADER_ECH_LOCL_H -#include +#ifndef HEADER_CMAC_H +#define HEADER_CMAC_H -#ifdef __cplusplus +#ifdef __cplusplus extern "C" { #endif -struct ecdh_method - { - const char *name; - int (*compute_key)(void *key, size_t outlen, const EC_POINT *pub_key, EC_KEY *ecdh, - void *(*KDF)(const void *in, size_t inlen, void *out, size_t *outlen)); -#if 0 - int (*init)(EC_KEY *eckey); - int (*finish)(EC_KEY *eckey); -#endif - int flags; - char *app_data; - }; +#include + +/* Opaque */ +typedef struct CMAC_CTX_st CMAC_CTX; -typedef struct ecdh_data_st { - /* EC_KEY_METH_DATA part */ - int (*init)(EC_KEY *); - /* method specific part */ - ENGINE *engine; - int flags; - const ECDH_METHOD *meth; - CRYPTO_EX_DATA ex_data; -} ECDH_DATA; +CMAC_CTX *CMAC_CTX_new(void); +void CMAC_CTX_cleanup(CMAC_CTX *ctx); +void CMAC_CTX_free(CMAC_CTX *ctx); +EVP_CIPHER_CTX *CMAC_CTX_get0_cipher_ctx(CMAC_CTX *ctx); +int CMAC_CTX_copy(CMAC_CTX *out, const CMAC_CTX *in); -ECDH_DATA *ecdh_check(EC_KEY *); +int CMAC_Init(CMAC_CTX *ctx, const void *key, size_t keylen, + const EVP_CIPHER *cipher, ENGINE *impl); +int CMAC_Update(CMAC_CTX *ctx, const void *data, size_t dlen); +int CMAC_Final(CMAC_CTX *ctx, unsigned char *out, size_t *poutlen); +int CMAC_resume(CMAC_CTX *ctx); #ifdef __cplusplus } #endif - -#endif /* HEADER_ECH_LOCL_H */ +#endif diff --git a/crypto/openssl/crypto/cms/cms.h b/crypto/openssl/crypto/cms/cms.h index 09c45d0412..36994fa6a2 100644 --- a/crypto/openssl/crypto/cms/cms.h +++ b/crypto/openssl/crypto/cms/cms.h @@ -111,6 +111,7 @@ DECLARE_ASN1_PRINT_FUNCTION(CMS_ContentInfo) #define CMS_PARTIAL 0x4000 #define CMS_REUSE_DIGEST 0x8000 #define CMS_USE_KEYID 0x10000 +#define CMS_DEBUG_DECRYPT 0x20000 const ASN1_OBJECT *CMS_get0_type(CMS_ContentInfo *cms); @@ -184,6 +185,8 @@ int CMS_decrypt_set1_pkey(CMS_ContentInfo *cms, EVP_PKEY *pk, X509 *cert); int CMS_decrypt_set1_key(CMS_ContentInfo *cms, unsigned char *key, size_t keylen, unsigned char *id, size_t idlen); +int CMS_decrypt_set1_password(CMS_ContentInfo *cms, + unsigned char *pass, ossl_ssize_t passlen); STACK_OF(CMS_RecipientInfo) *CMS_get0_RecipientInfos(CMS_ContentInfo *cms); int CMS_RecipientInfo_type(CMS_RecipientInfo *ri); @@ -219,6 +222,16 @@ int CMS_RecipientInfo_set0_key(CMS_RecipientInfo *ri, int CMS_RecipientInfo_kekri_id_cmp(CMS_RecipientInfo *ri, const unsigned char *id, size_t idlen); +int CMS_RecipientInfo_set0_password(CMS_RecipientInfo *ri, + unsigned char *pass, + ossl_ssize_t passlen); + +CMS_RecipientInfo *CMS_add0_recipient_password(CMS_ContentInfo *cms, + int iter, int wrap_nid, int pbe_nid, + unsigned char *pass, + ossl_ssize_t passlen, + const EVP_CIPHER *kekciph); + int CMS_RecipientInfo_decrypt(CMS_ContentInfo *cms, CMS_RecipientInfo *ri); int CMS_uncompress(CMS_ContentInfo *cms, BIO *dcont, BIO *out, @@ -330,6 +343,7 @@ void ERR_load_CMS_strings(void); #define CMS_F_CHECK_CONTENT 99 #define CMS_F_CMS_ADD0_CERT 164 #define CMS_F_CMS_ADD0_RECIPIENT_KEY 100 +#define CMS_F_CMS_ADD0_RECIPIENT_PASSWORD 165 #define CMS_F_CMS_ADD1_RECEIPTREQUEST 158 #define CMS_F_CMS_ADD1_RECIPIENT_CERT 101 #define CMS_F_CMS_ADD1_SIGNER 102 @@ -344,6 +358,7 @@ void ERR_load_CMS_strings(void); #define CMS_F_CMS_DATAINIT 111 #define CMS_F_CMS_DECRYPT 112 #define CMS_F_CMS_DECRYPT_SET1_KEY 113 +#define CMS_F_CMS_DECRYPT_SET1_PASSWORD 166 #define CMS_F_CMS_DECRYPT_SET1_PKEY 114 #define CMS_F_CMS_DIGESTALGORITHM_FIND_CTX 115 #define CMS_F_CMS_DIGESTALGORITHM_INIT_BIO 116 @@ -378,7 +393,9 @@ void ERR_load_CMS_strings(void); #define CMS_F_CMS_RECIPIENTINFO_KTRI_ENCRYPT 141 #define CMS_F_CMS_RECIPIENTINFO_KTRI_GET0_ALGS 142 #define CMS_F_CMS_RECIPIENTINFO_KTRI_GET0_SIGNER_ID 143 +#define CMS_F_CMS_RECIPIENTINFO_PWRI_CRYPT 167 #define CMS_F_CMS_RECIPIENTINFO_SET0_KEY 144 +#define CMS_F_CMS_RECIPIENTINFO_SET0_PASSWORD 168 #define CMS_F_CMS_RECIPIENTINFO_SET0_PKEY 145 #define CMS_F_CMS_SET1_SIGNERIDENTIFIER 146 #define CMS_F_CMS_SET_DETACHED 147 @@ -419,6 +436,7 @@ void ERR_load_CMS_strings(void); #define CMS_R_ERROR_SETTING_KEY 115 #define CMS_R_ERROR_SETTING_RECIPIENTINFO 116 #define CMS_R_INVALID_ENCRYPTED_KEY_LENGTH 117 +#define CMS_R_INVALID_KEY_ENCRYPTION_PARAMETER 176 #define CMS_R_INVALID_KEY_LENGTH 118 #define CMS_R_MD_BIO_INIT_ERROR 119 #define CMS_R_MESSAGEDIGEST_ATTRIBUTE_WRONG_LENGTH 120 @@ -431,6 +449,7 @@ void ERR_load_CMS_strings(void); #define CMS_R_NOT_ENCRYPTED_DATA 122 #define CMS_R_NOT_KEK 123 #define CMS_R_NOT_KEY_TRANSPORT 124 +#define CMS_R_NOT_PWRI 177 #define CMS_R_NOT_SUPPORTED_FOR_THIS_KEY_TYPE 125 #define CMS_R_NO_CIPHER 126 #define CMS_R_NO_CONTENT 127 @@ -443,6 +462,7 @@ void ERR_load_CMS_strings(void); #define CMS_R_NO_MATCHING_RECIPIENT 132 #define CMS_R_NO_MATCHING_SIGNATURE 166 #define CMS_R_NO_MSGSIGDIGEST 167 +#define CMS_R_NO_PASSWORD 178 #define CMS_R_NO_PRIVATE_KEY 133 #define CMS_R_NO_PUBLIC_KEY 134 #define CMS_R_NO_RECEIPT_REQUEST 168 @@ -466,10 +486,12 @@ void ERR_load_CMS_strings(void); #define CMS_R_UNSUPPORTED_COMPRESSION_ALGORITHM 151 #define CMS_R_UNSUPPORTED_CONTENT_TYPE 152 #define CMS_R_UNSUPPORTED_KEK_ALGORITHM 153 +#define CMS_R_UNSUPPORTED_KEY_ENCRYPTION_ALGORITHM 179 #define CMS_R_UNSUPPORTED_RECIPIENT_TYPE 154 #define CMS_R_UNSUPPORTED_RECPIENTINFO_TYPE 155 #define CMS_R_UNSUPPORTED_TYPE 156 #define CMS_R_UNWRAP_ERROR 157 +#define CMS_R_UNWRAP_FAILURE 180 #define CMS_R_VERIFICATION_FAILURE 158 #define CMS_R_WRAP_ERROR 159 diff --git a/crypto/openssl/crypto/cms/cms_asn1.c b/crypto/openssl/crypto/cms/cms_asn1.c index fcba4dcbcc..cfe67fb6c1 100644 --- a/crypto/openssl/crypto/cms/cms_asn1.c +++ b/crypto/openssl/crypto/cms/cms_asn1.c @@ -237,6 +237,15 @@ static int cms_ri_cb(int operation, ASN1_VALUE **pval, const ASN1_ITEM *it, OPENSSL_free(kekri->key); } } + else if (ri->type == CMS_RECIPINFO_PASS) + { + CMS_PasswordRecipientInfo *pwri = ri->d.pwri; + if (pwri->pass) + { + OPENSSL_cleanse(pwri->pass, pwri->passlen); + OPENSSL_free(pwri->pass); + } + } } return 1; } diff --git a/crypto/openssl/crypto/cms/cms_enc.c b/crypto/openssl/crypto/cms/cms_enc.c index bab26235bd..580083b45f 100644 --- a/crypto/openssl/crypto/cms/cms_enc.c +++ b/crypto/openssl/crypto/cms/cms_enc.c @@ -73,6 +73,8 @@ BIO *cms_EncryptedContent_init_bio(CMS_EncryptedContentInfo *ec) const EVP_CIPHER *ciph; X509_ALGOR *calg = ec->contentEncryptionAlgorithm; unsigned char iv[EVP_MAX_IV_LENGTH], *piv = NULL; + unsigned char *tkey = NULL; + size_t tkeylen; int ok = 0; @@ -137,32 +139,57 @@ BIO *cms_EncryptedContent_init_bio(CMS_EncryptedContentInfo *ec) CMS_R_CIPHER_PARAMETER_INITIALISATION_ERROR); goto err; } - - - if (enc && !ec->key) + /* Generate random session key */ + if (!enc || !ec->key) { - /* Generate random key */ - if (!ec->keylen) - ec->keylen = EVP_CIPHER_CTX_key_length(ctx); - ec->key = OPENSSL_malloc(ec->keylen); - if (!ec->key) + tkeylen = EVP_CIPHER_CTX_key_length(ctx); + tkey = OPENSSL_malloc(tkeylen); + if (!tkey) { CMSerr(CMS_F_CMS_ENCRYPTEDCONTENT_INIT_BIO, ERR_R_MALLOC_FAILURE); goto err; } - if (EVP_CIPHER_CTX_rand_key(ctx, ec->key) <= 0) + if (EVP_CIPHER_CTX_rand_key(ctx, tkey) <= 0) goto err; - keep_key = 1; } - else if (ec->keylen != (unsigned int)EVP_CIPHER_CTX_key_length(ctx)) + + if (!ec->key) + { + ec->key = tkey; + ec->keylen = tkeylen; + tkey = NULL; + if (enc) + keep_key = 1; + else + ERR_clear_error(); + + } + + if (ec->keylen != tkeylen) { /* If necessary set key length */ if (EVP_CIPHER_CTX_set_key_length(ctx, ec->keylen) <= 0) { - CMSerr(CMS_F_CMS_ENCRYPTEDCONTENT_INIT_BIO, - CMS_R_INVALID_KEY_LENGTH); - goto err; + /* Only reveal failure if debugging so we don't + * leak information which may be useful in MMA. + */ + if (ec->debug) + { + CMSerr(CMS_F_CMS_ENCRYPTEDCONTENT_INIT_BIO, + CMS_R_INVALID_KEY_LENGTH); + goto err; + } + else + { + /* Use random key */ + OPENSSL_cleanse(ec->key, ec->keylen); + OPENSSL_free(ec->key); + ec->key = tkey; + ec->keylen = tkeylen; + tkey = NULL; + ERR_clear_error(); + } } } @@ -198,6 +225,11 @@ BIO *cms_EncryptedContent_init_bio(CMS_EncryptedContentInfo *ec) OPENSSL_free(ec->key); ec->key = NULL; } + if (tkey) + { + OPENSSL_cleanse(tkey, tkeylen); + OPENSSL_free(tkey); + } if (ok) return b; BIO_free(b); diff --git a/crypto/openssl/crypto/cms/cms_env.c b/crypto/openssl/crypto/cms/cms_env.c index b3237d4b94..be20b1c024 100644 --- a/crypto/openssl/crypto/cms/cms_env.c +++ b/crypto/openssl/crypto/cms/cms_env.c @@ -65,14 +65,13 @@ /* CMS EnvelopedData Utilities */ DECLARE_ASN1_ITEM(CMS_EnvelopedData) -DECLARE_ASN1_ITEM(CMS_RecipientInfo) DECLARE_ASN1_ITEM(CMS_KeyTransRecipientInfo) DECLARE_ASN1_ITEM(CMS_KEKRecipientInfo) DECLARE_ASN1_ITEM(CMS_OtherKeyAttribute) DECLARE_STACK_OF(CMS_RecipientInfo) -static CMS_EnvelopedData *cms_get0_enveloped(CMS_ContentInfo *cms) +CMS_EnvelopedData *cms_get0_enveloped(CMS_ContentInfo *cms) { if (OBJ_obj2nid(cms->contentType) != NID_pkcs7_enveloped) { @@ -371,6 +370,8 @@ static int cms_RecipientInfo_ktri_decrypt(CMS_ContentInfo *cms, unsigned char *ek = NULL; size_t eklen; int ret = 0; + CMS_EncryptedContentInfo *ec; + ec = cms->d.envelopedData->encryptedContentInfo; if (ktri->pkey == NULL) { @@ -417,8 +418,14 @@ static int cms_RecipientInfo_ktri_decrypt(CMS_ContentInfo *cms, ret = 1; - cms->d.envelopedData->encryptedContentInfo->key = ek; - cms->d.envelopedData->encryptedContentInfo->keylen = eklen; + if (ec->key) + { + OPENSSL_cleanse(ec->key, ec->keylen); + OPENSSL_free(ec->key); + } + + ec->key = ek; + ec->keylen = eklen; err: if (pctx) @@ -786,6 +793,9 @@ int CMS_RecipientInfo_decrypt(CMS_ContentInfo *cms, CMS_RecipientInfo *ri) case CMS_RECIPINFO_KEK: return cms_RecipientInfo_kekri_decrypt(cms, ri); + case CMS_RECIPINFO_PASS: + return cms_RecipientInfo_pwri_crypt(cms, ri, 0); + default: CMSerr(CMS_F_CMS_RECIPIENTINFO_DECRYPT, CMS_R_UNSUPPORTED_RECPIENTINFO_TYPE); @@ -829,6 +839,10 @@ BIO *cms_EnvelopedData_init_bio(CMS_ContentInfo *cms) r = cms_RecipientInfo_kekri_encrypt(cms, ri); break; + case CMS_RECIPINFO_PASS: + r = cms_RecipientInfo_pwri_crypt(cms, ri, 1); + break; + default: CMSerr(CMS_F_CMS_ENVELOPEDDATA_INIT_BIO, CMS_R_UNSUPPORTED_RECIPIENT_TYPE); diff --git a/crypto/openssl/crypto/cms/cms_err.c b/crypto/openssl/crypto/cms/cms_err.c index ff7b0309e5..8330ead7ed 100644 --- a/crypto/openssl/crypto/cms/cms_err.c +++ b/crypto/openssl/crypto/cms/cms_err.c @@ -1,6 +1,6 @@ /* crypto/cms/cms_err.c */ /* ==================================================================== - * Copyright (c) 1999-2007 The OpenSSL Project. All rights reserved. + * Copyright (c) 1999-2009 The OpenSSL Project. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions @@ -73,6 +73,7 @@ static ERR_STRING_DATA CMS_str_functs[]= {ERR_FUNC(CMS_F_CHECK_CONTENT), "CHECK_CONTENT"}, {ERR_FUNC(CMS_F_CMS_ADD0_CERT), "CMS_add0_cert"}, {ERR_FUNC(CMS_F_CMS_ADD0_RECIPIENT_KEY), "CMS_add0_recipient_key"}, +{ERR_FUNC(CMS_F_CMS_ADD0_RECIPIENT_PASSWORD), "CMS_add0_recipient_password"}, {ERR_FUNC(CMS_F_CMS_ADD1_RECEIPTREQUEST), "CMS_add1_ReceiptRequest"}, {ERR_FUNC(CMS_F_CMS_ADD1_RECIPIENT_CERT), "CMS_add1_recipient_cert"}, {ERR_FUNC(CMS_F_CMS_ADD1_SIGNER), "CMS_add1_signer"}, @@ -87,6 +88,7 @@ static ERR_STRING_DATA CMS_str_functs[]= {ERR_FUNC(CMS_F_CMS_DATAINIT), "CMS_dataInit"}, {ERR_FUNC(CMS_F_CMS_DECRYPT), "CMS_decrypt"}, {ERR_FUNC(CMS_F_CMS_DECRYPT_SET1_KEY), "CMS_decrypt_set1_key"}, +{ERR_FUNC(CMS_F_CMS_DECRYPT_SET1_PASSWORD), "CMS_decrypt_set1_password"}, {ERR_FUNC(CMS_F_CMS_DECRYPT_SET1_PKEY), "CMS_decrypt_set1_pkey"}, {ERR_FUNC(CMS_F_CMS_DIGESTALGORITHM_FIND_CTX), "cms_DigestAlgorithm_find_ctx"}, {ERR_FUNC(CMS_F_CMS_DIGESTALGORITHM_INIT_BIO), "cms_DigestAlgorithm_init_bio"}, @@ -105,7 +107,7 @@ static ERR_STRING_DATA CMS_str_functs[]= {ERR_FUNC(CMS_F_CMS_GET0_CERTIFICATE_CHOICES), "CMS_GET0_CERTIFICATE_CHOICES"}, {ERR_FUNC(CMS_F_CMS_GET0_CONTENT), "CMS_get0_content"}, {ERR_FUNC(CMS_F_CMS_GET0_ECONTENT_TYPE), "CMS_GET0_ECONTENT_TYPE"}, -{ERR_FUNC(CMS_F_CMS_GET0_ENVELOPED), "CMS_GET0_ENVELOPED"}, +{ERR_FUNC(CMS_F_CMS_GET0_ENVELOPED), "cms_get0_enveloped"}, {ERR_FUNC(CMS_F_CMS_GET0_REVOCATION_CHOICES), "CMS_GET0_REVOCATION_CHOICES"}, {ERR_FUNC(CMS_F_CMS_GET0_SIGNED), "CMS_GET0_SIGNED"}, {ERR_FUNC(CMS_F_CMS_MSGSIGDIGEST_ADD1), "cms_msgSigDigest_add1"}, @@ -121,7 +123,9 @@ static ERR_STRING_DATA CMS_str_functs[]= {ERR_FUNC(CMS_F_CMS_RECIPIENTINFO_KTRI_ENCRYPT), "CMS_RECIPIENTINFO_KTRI_ENCRYPT"}, {ERR_FUNC(CMS_F_CMS_RECIPIENTINFO_KTRI_GET0_ALGS), "CMS_RecipientInfo_ktri_get0_algs"}, {ERR_FUNC(CMS_F_CMS_RECIPIENTINFO_KTRI_GET0_SIGNER_ID), "CMS_RecipientInfo_ktri_get0_signer_id"}, +{ERR_FUNC(CMS_F_CMS_RECIPIENTINFO_PWRI_CRYPT), "cms_RecipientInfo_pwri_crypt"}, {ERR_FUNC(CMS_F_CMS_RECIPIENTINFO_SET0_KEY), "CMS_RecipientInfo_set0_key"}, +{ERR_FUNC(CMS_F_CMS_RECIPIENTINFO_SET0_PASSWORD), "CMS_RecipientInfo_set0_password"}, {ERR_FUNC(CMS_F_CMS_RECIPIENTINFO_SET0_PKEY), "CMS_RecipientInfo_set0_pkey"}, {ERR_FUNC(CMS_F_CMS_SET1_SIGNERIDENTIFIER), "cms_set1_SignerIdentifier"}, {ERR_FUNC(CMS_F_CMS_SET_DETACHED), "CMS_set_detached"}, @@ -165,6 +169,7 @@ static ERR_STRING_DATA CMS_str_reasons[]= {ERR_REASON(CMS_R_ERROR_SETTING_KEY) ,"error setting key"}, {ERR_REASON(CMS_R_ERROR_SETTING_RECIPIENTINFO),"error setting recipientinfo"}, {ERR_REASON(CMS_R_INVALID_ENCRYPTED_KEY_LENGTH),"invalid encrypted key length"}, +{ERR_REASON(CMS_R_INVALID_KEY_ENCRYPTION_PARAMETER),"invalid key encryption parameter"}, {ERR_REASON(CMS_R_INVALID_KEY_LENGTH) ,"invalid key length"}, {ERR_REASON(CMS_R_MD_BIO_INIT_ERROR) ,"md bio init error"}, {ERR_REASON(CMS_R_MESSAGEDIGEST_ATTRIBUTE_WRONG_LENGTH),"messagedigest attribute wrong length"}, @@ -177,6 +182,7 @@ static ERR_STRING_DATA CMS_str_reasons[]= {ERR_REASON(CMS_R_NOT_ENCRYPTED_DATA) ,"not encrypted data"}, {ERR_REASON(CMS_R_NOT_KEK) ,"not kek"}, {ERR_REASON(CMS_R_NOT_KEY_TRANSPORT) ,"not key transport"}, +{ERR_REASON(CMS_R_NOT_PWRI) ,"not pwri"}, {ERR_REASON(CMS_R_NOT_SUPPORTED_FOR_THIS_KEY_TYPE),"not supported for this key type"}, {ERR_REASON(CMS_R_NO_CIPHER) ,"no cipher"}, {ERR_REASON(CMS_R_NO_CONTENT) ,"no content"}, @@ -189,6 +195,7 @@ static ERR_STRING_DATA CMS_str_reasons[]= {ERR_REASON(CMS_R_NO_MATCHING_RECIPIENT) ,"no matching recipient"}, {ERR_REASON(CMS_R_NO_MATCHING_SIGNATURE) ,"no matching signature"}, {ERR_REASON(CMS_R_NO_MSGSIGDIGEST) ,"no msgsigdigest"}, +{ERR_REASON(CMS_R_NO_PASSWORD) ,"no password"}, {ERR_REASON(CMS_R_NO_PRIVATE_KEY) ,"no private key"}, {ERR_REASON(CMS_R_NO_PUBLIC_KEY) ,"no public key"}, {ERR_REASON(CMS_R_NO_RECEIPT_REQUEST) ,"no receipt request"}, @@ -212,10 +219,12 @@ static ERR_STRING_DATA CMS_str_reasons[]= {ERR_REASON(CMS_R_UNSUPPORTED_COMPRESSION_ALGORITHM),"unsupported compression algorithm"}, {ERR_REASON(CMS_R_UNSUPPORTED_CONTENT_TYPE),"unsupported content type"}, {ERR_REASON(CMS_R_UNSUPPORTED_KEK_ALGORITHM),"unsupported kek algorithm"}, +{ERR_REASON(CMS_R_UNSUPPORTED_KEY_ENCRYPTION_ALGORITHM),"unsupported key encryption algorithm"}, {ERR_REASON(CMS_R_UNSUPPORTED_RECIPIENT_TYPE),"unsupported recipient type"}, {ERR_REASON(CMS_R_UNSUPPORTED_RECPIENTINFO_TYPE),"unsupported recpientinfo type"}, {ERR_REASON(CMS_R_UNSUPPORTED_TYPE) ,"unsupported type"}, {ERR_REASON(CMS_R_UNWRAP_ERROR) ,"unwrap error"}, +{ERR_REASON(CMS_R_UNWRAP_FAILURE) ,"unwrap failure"}, {ERR_REASON(CMS_R_VERIFICATION_FAILURE) ,"verification failure"}, {ERR_REASON(CMS_R_WRAP_ERROR) ,"wrap error"}, {0,NULL} diff --git a/crypto/openssl/crypto/cms/cms_lcl.h b/crypto/openssl/crypto/cms/cms_lcl.h index c8ecfa724a..a9f9730157 100644 --- a/crypto/openssl/crypto/cms/cms_lcl.h +++ b/crypto/openssl/crypto/cms/cms_lcl.h @@ -175,6 +175,8 @@ struct CMS_EncryptedContentInfo_st const EVP_CIPHER *cipher; unsigned char *key; size_t keylen; + /* Set to 1 if we are debugging decrypt and don't fake keys for MMA */ + int debug; }; struct CMS_RecipientInfo_st @@ -273,6 +275,9 @@ struct CMS_PasswordRecipientInfo_st X509_ALGOR *keyDerivationAlgorithm; X509_ALGOR *keyEncryptionAlgorithm; ASN1_OCTET_STRING *encryptedKey; + /* Extra info: password to use */ + unsigned char *pass; + size_t passlen; }; struct CMS_OtherRecipientInfo_st @@ -411,6 +416,8 @@ DECLARE_ASN1_ITEM(CMS_SignerInfo) DECLARE_ASN1_ITEM(CMS_IssuerAndSerialNumber) DECLARE_ASN1_ITEM(CMS_Attributes_Sign) DECLARE_ASN1_ITEM(CMS_Attributes_Verify) +DECLARE_ASN1_ITEM(CMS_RecipientInfo) +DECLARE_ASN1_ITEM(CMS_PasswordRecipientInfo) DECLARE_ASN1_ALLOC_FUNCTIONS(CMS_IssuerAndSerialNumber) #define CMS_SIGNERINFO_ISSUER_SERIAL 0 @@ -454,6 +461,11 @@ int cms_msgSigDigest_add1(CMS_SignerInfo *dest, CMS_SignerInfo *src); ASN1_OCTET_STRING *cms_encode_Receipt(CMS_SignerInfo *si); BIO *cms_EnvelopedData_init_bio(CMS_ContentInfo *cms); +CMS_EnvelopedData *cms_get0_enveloped(CMS_ContentInfo *cms); + +/* PWRI routines */ +int cms_RecipientInfo_pwri_crypt(CMS_ContentInfo *cms, CMS_RecipientInfo *ri, + int en_de); #ifdef __cplusplus } diff --git a/crypto/openssl/crypto/cms/cms_lib.c b/crypto/openssl/crypto/cms/cms_lib.c index d00fe0f87b..f88e8f3b52 100644 --- a/crypto/openssl/crypto/cms/cms_lib.c +++ b/crypto/openssl/crypto/cms/cms_lib.c @@ -412,8 +412,7 @@ int cms_DigestAlgorithm_find_ctx(EVP_MD_CTX *mctx, BIO *chain, */ || EVP_MD_pkey_type(EVP_MD_CTX_md(mtmp)) == nid) { - EVP_MD_CTX_copy_ex(mctx, mtmp); - return 1; + return EVP_MD_CTX_copy_ex(mctx, mtmp); } chain = BIO_next(chain); } diff --git a/crypto/openssl/crypto/cms/cms_pwri.c b/crypto/openssl/crypto/cms/cms_pwri.c new file mode 100644 index 0000000000..b79612a12d --- /dev/null +++ b/crypto/openssl/crypto/cms/cms_pwri.c @@ -0,0 +1,454 @@ +/* crypto/cms/cms_pwri.c */ +/* Written by Dr Stephen N Henson (steve@openssl.org) for the OpenSSL + * project. + */ +/* ==================================================================== + * Copyright (c) 2009 The OpenSSL Project. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * + * 3. All advertising materials mentioning features or use of this + * software must display the following acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit. (http://www.OpenSSL.org/)" + * + * 4. The names "OpenSSL Toolkit" and "OpenSSL Project" must not be used to + * endorse or promote products derived from this software without + * prior written permission. For written permission, please contact + * licensing@OpenSSL.org. + * + * 5. Products derived from this software may not be called "OpenSSL" + * nor may "OpenSSL" appear in their names without prior written + * permission of the OpenSSL Project. + * + * 6. Redistributions of any form whatsoever must retain the following + * acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit (http://www.OpenSSL.org/)" + * + * THIS SOFTWARE IS PROVIDED BY THE OpenSSL PROJECT ``AS IS'' AND ANY + * EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR + * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE OpenSSL PROJECT OR + * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED + * OF THE POSSIBILITY OF SUCH DAMAGE. + * ==================================================================== + */ + +#include "cryptlib.h" +#include +#include +#include +#include +#include +#include +#include +#include "cms_lcl.h" +#include "asn1_locl.h" + +int CMS_RecipientInfo_set0_password(CMS_RecipientInfo *ri, + unsigned char *pass, ossl_ssize_t passlen) + { + CMS_PasswordRecipientInfo *pwri; + if (ri->type != CMS_RECIPINFO_PASS) + { + CMSerr(CMS_F_CMS_RECIPIENTINFO_SET0_PASSWORD, CMS_R_NOT_PWRI); + return 0; + } + + pwri = ri->d.pwri; + pwri->pass = pass; + if (pass && passlen < 0) + passlen = strlen((char *)pass); + pwri->passlen = passlen; + return 1; + } + +CMS_RecipientInfo *CMS_add0_recipient_password(CMS_ContentInfo *cms, + int iter, int wrap_nid, int pbe_nid, + unsigned char *pass, + ossl_ssize_t passlen, + const EVP_CIPHER *kekciph) + { + CMS_RecipientInfo *ri = NULL; + CMS_EnvelopedData *env; + CMS_PasswordRecipientInfo *pwri; + EVP_CIPHER_CTX ctx; + X509_ALGOR *encalg = NULL; + unsigned char iv[EVP_MAX_IV_LENGTH]; + int ivlen; + env = cms_get0_enveloped(cms); + if (!env) + goto err; + + if (wrap_nid <= 0) + wrap_nid = NID_id_alg_PWRI_KEK; + + if (pbe_nid <= 0) + pbe_nid = NID_id_pbkdf2; + + /* Get from enveloped data */ + if (kekciph == NULL) + kekciph = env->encryptedContentInfo->cipher; + + if (kekciph == NULL) + { + CMSerr(CMS_F_CMS_ADD0_RECIPIENT_PASSWORD, CMS_R_NO_CIPHER); + return NULL; + } + if (wrap_nid != NID_id_alg_PWRI_KEK) + { + CMSerr(CMS_F_CMS_ADD0_RECIPIENT_PASSWORD, + CMS_R_UNSUPPORTED_KEY_ENCRYPTION_ALGORITHM); + return NULL; + } + + /* Setup algorithm identifier for cipher */ + encalg = X509_ALGOR_new(); + EVP_CIPHER_CTX_init(&ctx); + + if (EVP_EncryptInit_ex(&ctx, kekciph, NULL, NULL, NULL) <= 0) + { + CMSerr(CMS_F_CMS_ADD0_RECIPIENT_PASSWORD, ERR_R_EVP_LIB); + goto err; + } + + ivlen = EVP_CIPHER_CTX_iv_length(&ctx); + + if (ivlen > 0) + { + if (RAND_pseudo_bytes(iv, ivlen) <= 0) + goto err; + if (EVP_EncryptInit_ex(&ctx, NULL, NULL, NULL, iv) <= 0) + { + CMSerr(CMS_F_CMS_ADD0_RECIPIENT_PASSWORD, + ERR_R_EVP_LIB); + goto err; + } + encalg->parameter = ASN1_TYPE_new(); + if (!encalg->parameter) + { + CMSerr(CMS_F_CMS_ADD0_RECIPIENT_PASSWORD, + ERR_R_MALLOC_FAILURE); + goto err; + } + if (EVP_CIPHER_param_to_asn1(&ctx, encalg->parameter) <= 0) + { + CMSerr(CMS_F_CMS_ADD0_RECIPIENT_PASSWORD, + CMS_R_CIPHER_PARAMETER_INITIALISATION_ERROR); + goto err; + } + } + + + encalg->algorithm = OBJ_nid2obj(EVP_CIPHER_CTX_type(&ctx)); + + EVP_CIPHER_CTX_cleanup(&ctx); + + /* Initialize recipient info */ + ri = M_ASN1_new_of(CMS_RecipientInfo); + if (!ri) + goto merr; + + ri->d.pwri = M_ASN1_new_of(CMS_PasswordRecipientInfo); + if (!ri->d.pwri) + goto merr; + ri->type = CMS_RECIPINFO_PASS; + + pwri = ri->d.pwri; + /* Since this is overwritten, free up empty structure already there */ + X509_ALGOR_free(pwri->keyEncryptionAlgorithm); + pwri->keyEncryptionAlgorithm = X509_ALGOR_new(); + if (!pwri->keyEncryptionAlgorithm) + goto merr; + pwri->keyEncryptionAlgorithm->algorithm = OBJ_nid2obj(wrap_nid); + pwri->keyEncryptionAlgorithm->parameter = ASN1_TYPE_new(); + if (!pwri->keyEncryptionAlgorithm->parameter) + goto merr; + + if(!ASN1_item_pack(encalg, ASN1_ITEM_rptr(X509_ALGOR), + &pwri->keyEncryptionAlgorithm->parameter->value.sequence)) + goto merr; + pwri->keyEncryptionAlgorithm->parameter->type = V_ASN1_SEQUENCE; + + X509_ALGOR_free(encalg); + encalg = NULL; + + /* Setup PBE algorithm */ + + pwri->keyDerivationAlgorithm = PKCS5_pbkdf2_set(iter, NULL, 0, -1, -1); + + if (!pwri->keyDerivationAlgorithm) + goto err; + + CMS_RecipientInfo_set0_password(ri, pass, passlen); + pwri->version = 0; + + if (!sk_CMS_RecipientInfo_push(env->recipientInfos, ri)) + goto merr; + + return ri; + + merr: + CMSerr(CMS_F_CMS_ADD0_RECIPIENT_PASSWORD, ERR_R_MALLOC_FAILURE); + err: + EVP_CIPHER_CTX_cleanup(&ctx); + if (ri) + M_ASN1_free_of(ri, CMS_RecipientInfo); + if (encalg) + X509_ALGOR_free(encalg); + return NULL; + + } + +/* This is an implementation of the key wrapping mechanism in RFC3211, + * at some point this should go into EVP. + */ + +static int kek_unwrap_key(unsigned char *out, size_t *outlen, + const unsigned char *in, size_t inlen, EVP_CIPHER_CTX *ctx) + { + size_t blocklen = EVP_CIPHER_CTX_block_size(ctx); + unsigned char *tmp; + int outl, rv = 0; + if (inlen < 2 * blocklen) + { + /* too small */ + return 0; + } + if (inlen % blocklen) + { + /* Invalid size */ + return 0; + } + tmp = OPENSSL_malloc(inlen); + /* setup IV by decrypting last two blocks */ + EVP_DecryptUpdate(ctx, tmp + inlen - 2 * blocklen, &outl, + in + inlen - 2 * blocklen, blocklen * 2); + /* Do a decrypt of last decrypted block to set IV to correct value + * output it to start of buffer so we don't corrupt decrypted block + * this works because buffer is at least two block lengths long. + */ + EVP_DecryptUpdate(ctx, tmp, &outl, + tmp + inlen - blocklen, blocklen); + /* Can now decrypt first n - 1 blocks */ + EVP_DecryptUpdate(ctx, tmp, &outl, in, inlen - blocklen); + + /* Reset IV to original value */ + EVP_DecryptInit_ex(ctx, NULL, NULL, NULL, NULL); + /* Decrypt again */ + EVP_DecryptUpdate(ctx, tmp, &outl, tmp, inlen); + /* Check check bytes */ + if (((tmp[1] ^ tmp[4]) & (tmp[2] ^ tmp[5]) & (tmp[3] ^ tmp[6])) != 0xff) + { + /* Check byte failure */ + goto err; + } + if (inlen < (size_t)(tmp[0] - 4 )) + { + /* Invalid length value */ + goto err; + } + *outlen = (size_t)tmp[0]; + memcpy(out, tmp + 4, *outlen); + rv = 1; + err: + OPENSSL_cleanse(tmp, inlen); + OPENSSL_free(tmp); + return rv; + + } + +static int kek_wrap_key(unsigned char *out, size_t *outlen, + const unsigned char *in, size_t inlen, EVP_CIPHER_CTX *ctx) + { + size_t blocklen = EVP_CIPHER_CTX_block_size(ctx); + size_t olen; + int dummy; + /* First decide length of output buffer: need header and round up to + * multiple of block length. + */ + olen = (inlen + 4 + blocklen - 1)/blocklen; + olen *= blocklen; + if (olen < 2 * blocklen) + { + /* Key too small */ + return 0; + } + if (inlen > 0xFF) + { + /* Key too large */ + return 0; + } + if (out) + { + /* Set header */ + out[0] = (unsigned char)inlen; + out[1] = in[0] ^ 0xFF; + out[2] = in[1] ^ 0xFF; + out[3] = in[2] ^ 0xFF; + memcpy(out + 4, in, inlen); + /* Add random padding to end */ + if (olen > inlen + 4) + RAND_pseudo_bytes(out + 4 + inlen, olen - 4 - inlen); + /* Encrypt twice */ + EVP_EncryptUpdate(ctx, out, &dummy, out, olen); + EVP_EncryptUpdate(ctx, out, &dummy, out, olen); + } + + *outlen = olen; + + return 1; + } + +/* Encrypt/Decrypt content key in PWRI recipient info */ + +int cms_RecipientInfo_pwri_crypt(CMS_ContentInfo *cms, CMS_RecipientInfo *ri, + int en_de) + { + CMS_EncryptedContentInfo *ec; + CMS_PasswordRecipientInfo *pwri; + const unsigned char *p = NULL; + int plen; + int r = 0; + X509_ALGOR *algtmp, *kekalg = NULL; + EVP_CIPHER_CTX kekctx; + const EVP_CIPHER *kekcipher; + unsigned char *key = NULL; + size_t keylen; + + ec = cms->d.envelopedData->encryptedContentInfo; + + pwri = ri->d.pwri; + EVP_CIPHER_CTX_init(&kekctx); + + if (!pwri->pass) + { + CMSerr(CMS_F_CMS_RECIPIENTINFO_PWRI_CRYPT, CMS_R_NO_PASSWORD); + return 0; + } + algtmp = pwri->keyEncryptionAlgorithm; + + if (!algtmp || OBJ_obj2nid(algtmp->algorithm) != NID_id_alg_PWRI_KEK) + { + CMSerr(CMS_F_CMS_RECIPIENTINFO_PWRI_CRYPT, + CMS_R_UNSUPPORTED_KEY_ENCRYPTION_ALGORITHM); + return 0; + } + + if (algtmp->parameter->type == V_ASN1_SEQUENCE) + { + p = algtmp->parameter->value.sequence->data; + plen = algtmp->parameter->value.sequence->length; + kekalg = d2i_X509_ALGOR(NULL, &p, plen); + } + if (kekalg == NULL) + { + CMSerr(CMS_F_CMS_RECIPIENTINFO_PWRI_CRYPT, + CMS_R_INVALID_KEY_ENCRYPTION_PARAMETER); + return 0; + } + + kekcipher = EVP_get_cipherbyobj(kekalg->algorithm); + + if(!kekcipher) + { + CMSerr(CMS_F_CMS_RECIPIENTINFO_PWRI_CRYPT, + CMS_R_UNKNOWN_CIPHER); + goto err; + } + + /* Fixup cipher based on AlgorithmIdentifier to set IV etc */ + if (!EVP_CipherInit_ex(&kekctx, kekcipher, NULL, NULL, NULL, en_de)) + goto err; + EVP_CIPHER_CTX_set_padding(&kekctx, 0); + if(EVP_CIPHER_asn1_to_param(&kekctx, kekalg->parameter) < 0) + { + CMSerr(CMS_F_CMS_RECIPIENTINFO_PWRI_CRYPT, + CMS_R_CIPHER_PARAMETER_INITIALISATION_ERROR); + goto err; + } + + algtmp = pwri->keyDerivationAlgorithm; + + /* Finish password based key derivation to setup key in "ctx" */ + + if (EVP_PBE_CipherInit(algtmp->algorithm, + (char *)pwri->pass, pwri->passlen, + algtmp->parameter, &kekctx, en_de) < 0) + { + CMSerr(CMS_F_CMS_RECIPIENTINFO_PWRI_CRYPT, ERR_R_EVP_LIB); + goto err; + } + + /* Finally wrap/unwrap the key */ + + if (en_de) + { + + if (!kek_wrap_key(NULL, &keylen, ec->key, ec->keylen, &kekctx)) + goto err; + + key = OPENSSL_malloc(keylen); + + if (!key) + goto err; + + if (!kek_wrap_key(key, &keylen, ec->key, ec->keylen, &kekctx)) + goto err; + pwri->encryptedKey->data = key; + pwri->encryptedKey->length = keylen; + } + else + { + key = OPENSSL_malloc(pwri->encryptedKey->length); + + if (!key) + { + CMSerr(CMS_F_CMS_RECIPIENTINFO_PWRI_CRYPT, + ERR_R_MALLOC_FAILURE); + goto err; + } + if (!kek_unwrap_key(key, &keylen, + pwri->encryptedKey->data, + pwri->encryptedKey->length, &kekctx)) + { + CMSerr(CMS_F_CMS_RECIPIENTINFO_PWRI_CRYPT, + CMS_R_UNWRAP_FAILURE); + goto err; + } + + ec->key = key; + ec->keylen = keylen; + + } + + r = 1; + + err: + + EVP_CIPHER_CTX_cleanup(&kekctx); + + if (!r && key) + OPENSSL_free(key); + X509_ALGOR_free(kekalg); + + return r; + + } diff --git a/crypto/openssl/crypto/cms/cms_sd.c b/crypto/openssl/crypto/cms/cms_sd.c index e3192b9c57..77fbd13596 100644 --- a/crypto/openssl/crypto/cms/cms_sd.c +++ b/crypto/openssl/crypto/cms/cms_sd.c @@ -641,7 +641,8 @@ static int cms_SignerInfo_content_sign(CMS_ContentInfo *cms, cms->d.signedData->encapContentInfo->eContentType; unsigned char md[EVP_MAX_MD_SIZE]; unsigned int mdlen; - EVP_DigestFinal_ex(&mctx, md, &mdlen); + if (!EVP_DigestFinal_ex(&mctx, md, &mdlen)) + goto err; if (!CMS_signed_add1_attr_by_NID(si, NID_pkcs9_messageDigest, V_ASN1_OCTET_STRING, md, mdlen)) diff --git a/crypto/openssl/crypto/cms/cms_smime.c b/crypto/openssl/crypto/cms/cms_smime.c index 4a799eb897..8c56e3a852 100644 --- a/crypto/openssl/crypto/cms/cms_smime.c +++ b/crypto/openssl/crypto/cms/cms_smime.c @@ -611,7 +611,10 @@ int CMS_decrypt_set1_pkey(CMS_ContentInfo *cms, EVP_PKEY *pk, X509 *cert) STACK_OF(CMS_RecipientInfo) *ris; CMS_RecipientInfo *ri; int i, r; + int debug = 0; ris = CMS_get0_RecipientInfos(cms); + if (ris) + debug = cms->d.envelopedData->encryptedContentInfo->debug; for (i = 0; i < sk_CMS_RecipientInfo_num(ris); i++) { ri = sk_CMS_RecipientInfo_value(ris, i); @@ -625,17 +628,38 @@ int CMS_decrypt_set1_pkey(CMS_ContentInfo *cms, EVP_PKEY *pk, X509 *cert) CMS_RecipientInfo_set0_pkey(ri, pk); r = CMS_RecipientInfo_decrypt(cms, ri); CMS_RecipientInfo_set0_pkey(ri, NULL); - if (r > 0) - return 1; if (cert) { + /* If not debugging clear any error and + * return success to avoid leaking of + * information useful to MMA + */ + if (!debug) + { + ERR_clear_error(); + return 1; + } + if (r > 0) + return 1; CMSerr(CMS_F_CMS_DECRYPT_SET1_PKEY, CMS_R_DECRYPT_ERROR); return 0; } - ERR_clear_error(); + /* If no cert and not debugging don't leave loop + * after first successful decrypt. Always attempt + * to decrypt all recipients to avoid leaking timing + * of a successful decrypt. + */ + else if (r > 0 && debug) + return 1; } } + /* If no cert and not debugging always return success */ + if (!cert && !debug) + { + ERR_clear_error(); + return 1; + } CMSerr(CMS_F_CMS_DECRYPT_SET1_PKEY, CMS_R_NO_MATCHING_RECIPIENT); return 0; @@ -680,6 +704,30 @@ int CMS_decrypt_set1_key(CMS_ContentInfo *cms, return 0; } + +int CMS_decrypt_set1_password(CMS_ContentInfo *cms, + unsigned char *pass, ossl_ssize_t passlen) + { + STACK_OF(CMS_RecipientInfo) *ris; + CMS_RecipientInfo *ri; + int i, r; + ris = CMS_get0_RecipientInfos(cms); + for (i = 0; i < sk_CMS_RecipientInfo_num(ris); i++) + { + ri = sk_CMS_RecipientInfo_value(ris, i); + if (CMS_RecipientInfo_type(ri) != CMS_RECIPINFO_PASS) + continue; + CMS_RecipientInfo_set0_password(ri, pass, passlen); + r = CMS_RecipientInfo_decrypt(cms, ri); + CMS_RecipientInfo_set0_password(ri, NULL, 0); + if (r > 0) + return 1; + } + + CMSerr(CMS_F_CMS_DECRYPT_SET1_PASSWORD, CMS_R_NO_MATCHING_RECIPIENT); + return 0; + + } int CMS_decrypt(CMS_ContentInfo *cms, EVP_PKEY *pk, X509 *cert, BIO *dcont, BIO *out, @@ -694,9 +742,14 @@ int CMS_decrypt(CMS_ContentInfo *cms, EVP_PKEY *pk, X509 *cert, } if (!dcont && !check_content(cms)) return 0; + if (flags & CMS_DEBUG_DECRYPT) + cms->d.envelopedData->encryptedContentInfo->debug = 1; + else + cms->d.envelopedData->encryptedContentInfo->debug = 0; + if (!pk && !cert && !dcont && !out) + return 1; if (pk && !CMS_decrypt_set1_pkey(cms, pk, cert)) return 0; - cont = CMS_dataInit(cms, dcont); if (!cont) return 0; diff --git a/crypto/openssl/crypto/comp/c_rle.c b/crypto/openssl/crypto/comp/c_rle.c index 18bceae51e..47dfb67fbd 100644 --- a/crypto/openssl/crypto/comp/c_rle.c +++ b/crypto/openssl/crypto/comp/c_rle.c @@ -30,7 +30,7 @@ static int rle_compress_block(COMP_CTX *ctx, unsigned char *out, { /* int i; */ - if (olen < (ilen+1)) + if (ilen == 0 || olen < (ilen-1)) { /* ZZZZZZZZZZZZZZZZZZZZZZ */ return(-1); @@ -46,7 +46,7 @@ static int rle_expand_block(COMP_CTX *ctx, unsigned char *out, { int i; - if (ilen == 0 || olen < (ilen-1)) + if (olen < (ilen-1)) { /* ZZZZZZZZZZZZZZZZZZZZZZ */ return(-1); diff --git a/crypto/openssl/crypto/cpt_err.c b/crypto/openssl/crypto/cpt_err.c index 139b9284e4..289005f662 100644 --- a/crypto/openssl/crypto/cpt_err.c +++ b/crypto/openssl/crypto/cpt_err.c @@ -1,6 +1,6 @@ /* crypto/cpt_err.c */ /* ==================================================================== - * Copyright (c) 1999-2006 The OpenSSL Project. All rights reserved. + * Copyright (c) 1999-2011 The OpenSSL Project. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions @@ -76,6 +76,7 @@ static ERR_STRING_DATA CRYPTO_str_functs[]= {ERR_FUNC(CRYPTO_F_CRYPTO_SET_EX_DATA), "CRYPTO_set_ex_data"}, {ERR_FUNC(CRYPTO_F_DEF_ADD_INDEX), "DEF_ADD_INDEX"}, {ERR_FUNC(CRYPTO_F_DEF_GET_CLASS), "DEF_GET_CLASS"}, +{ERR_FUNC(CRYPTO_F_FIPS_MODE_SET), "FIPS_mode_set"}, {ERR_FUNC(CRYPTO_F_INT_DUP_EX_DATA), "INT_DUP_EX_DATA"}, {ERR_FUNC(CRYPTO_F_INT_FREE_EX_DATA), "INT_FREE_EX_DATA"}, {ERR_FUNC(CRYPTO_F_INT_NEW_EX_DATA), "INT_NEW_EX_DATA"}, @@ -84,6 +85,7 @@ static ERR_STRING_DATA CRYPTO_str_functs[]= static ERR_STRING_DATA CRYPTO_str_reasons[]= { +{ERR_REASON(CRYPTO_R_FIPS_MODE_NOT_SUPPORTED),"fips mode not supported"}, {ERR_REASON(CRYPTO_R_NO_DYNLOCK_CREATE_CALLBACK),"no dynlock create callback"}, {0,NULL} }; diff --git a/crypto/openssl/crypto/cryptlib.c b/crypto/openssl/crypto/cryptlib.c index 24fe123e14..766ea8cac7 100644 --- a/crypto/openssl/crypto/cryptlib.c +++ b/crypto/openssl/crypto/cryptlib.c @@ -409,6 +409,10 @@ int (*CRYPTO_get_add_lock_callback(void))(int *num,int mount,int type, void CRYPTO_set_locking_callback(void (*func)(int mode,int type, const char *file,int line)) { + /* Calling this here ensures initialisation before any threads + * are started. + */ + OPENSSL_init(); locking_callback=func; } @@ -661,28 +665,52 @@ const char *CRYPTO_get_lock_name(int type) defined(__INTEL__) || \ defined(__x86_64) || defined(__x86_64__) || defined(_M_AMD64) || defined(_M_X64) -unsigned long OPENSSL_ia32cap_P=0; -unsigned long *OPENSSL_ia32cap_loc(void) { return &OPENSSL_ia32cap_P; } +unsigned int OPENSSL_ia32cap_P[2]; +unsigned long *OPENSSL_ia32cap_loc(void) +{ if (sizeof(long)==4) + /* + * If 32-bit application pulls address of OPENSSL_ia32cap_P[0] + * clear second element to maintain the illusion that vector + * is 32-bit. + */ + OPENSSL_ia32cap_P[1]=0; + return (unsigned long *)OPENSSL_ia32cap_P; +} #if defined(OPENSSL_CPUID_OBJ) && !defined(OPENSSL_NO_ASM) && !defined(I386_ONLY) #define OPENSSL_CPUID_SETUP +#if defined(_WIN32) +typedef unsigned __int64 IA32CAP; +#else +typedef unsigned long long IA32CAP; +#endif void OPENSSL_cpuid_setup(void) { static int trigger=0; - unsigned long OPENSSL_ia32_cpuid(void); + IA32CAP OPENSSL_ia32_cpuid(void); + IA32CAP vec; char *env; if (trigger) return; trigger=1; - if ((env=getenv("OPENSSL_ia32cap"))) - OPENSSL_ia32cap_P = strtoul(env,NULL,0)|(1<<10); + if ((env=getenv("OPENSSL_ia32cap"))) { + int off = (env[0]=='~')?1:0; +#if defined(_WIN32) + if (!sscanf(env+off,"%I64i",&vec)) vec = strtoul(env+off,NULL,0); +#else + if (!sscanf(env+off,"%lli",(long long *)&vec)) vec = strtoul(env+off,NULL,0); +#endif + if (off) vec = OPENSSL_ia32_cpuid()&~vec; + } else - OPENSSL_ia32cap_P = OPENSSL_ia32_cpuid()|(1<<10); + vec = OPENSSL_ia32_cpuid(); /* * |(1<<10) sets a reserved bit to signal that variable * was initialized already... This is to avoid interference * with cpuid snippets in ELF .init segment. */ + OPENSSL_ia32cap_P[0] = (unsigned int)vec|(1<<10); + OPENSSL_ia32cap_P[1] = (unsigned int)(vec>>32); } #endif diff --git a/crypto/openssl/crypto/cryptlib.h b/crypto/openssl/crypto/cryptlib.h index fc249c57f3..1761f6b668 100644 --- a/crypto/openssl/crypto/cryptlib.h +++ b/crypto/openssl/crypto/cryptlib.h @@ -99,7 +99,7 @@ extern "C" { #define HEX_SIZE(type) (sizeof(type)*2) void OPENSSL_cpuid_setup(void); -extern unsigned long OPENSSL_ia32cap_P; +extern unsigned int OPENSSL_ia32cap_P[]; void OPENSSL_showfatal(const char *,...); void *OPENSSL_stderr(void); extern int OPENSSL_NONPIC_relocated; diff --git a/crypto/openssl/crypto/crypto.h b/crypto/openssl/crypto/crypto.h index b0360cec51..6aeda0a9ac 100644 --- a/crypto/openssl/crypto/crypto.h +++ b/crypto/openssl/crypto/crypto.h @@ -547,6 +547,33 @@ unsigned long *OPENSSL_ia32cap_loc(void); #define OPENSSL_ia32cap (*(OPENSSL_ia32cap_loc())) int OPENSSL_isservice(void); +int FIPS_mode(void); +int FIPS_mode_set(int r); + +void OPENSSL_init(void); + +#define fips_md_init(alg) fips_md_init_ctx(alg, alg) + +#ifdef OPENSSL_FIPS +#define fips_md_init_ctx(alg, cx) \ + int alg##_Init(cx##_CTX *c) \ + { \ + if (FIPS_mode()) OpenSSLDie(__FILE__, __LINE__, \ + "Low level API call to digest " #alg " forbidden in FIPS mode!"); \ + return private_##alg##_Init(c); \ + } \ + int private_##alg##_Init(cx##_CTX *c) + +#define fips_cipher_abort(alg) \ + if (FIPS_mode()) OpenSSLDie(__FILE__, __LINE__, \ + "Low level API call to cipher " #alg " forbidden in FIPS mode!") + +#else +#define fips_md_init_ctx(alg, cx) \ + int alg##_Init(cx##_CTX *c) +#define fips_cipher_abort(alg) while(0) +#endif + /* BEGIN ERROR CODES */ /* The following lines are auto generated by the script mkerr.pl. Any changes * made after this point may be overwritten when the script is next run. @@ -562,11 +589,13 @@ void ERR_load_CRYPTO_strings(void); #define CRYPTO_F_CRYPTO_SET_EX_DATA 102 #define CRYPTO_F_DEF_ADD_INDEX 104 #define CRYPTO_F_DEF_GET_CLASS 105 +#define CRYPTO_F_FIPS_MODE_SET 109 #define CRYPTO_F_INT_DUP_EX_DATA 106 #define CRYPTO_F_INT_FREE_EX_DATA 107 #define CRYPTO_F_INT_NEW_EX_DATA 108 /* Reason codes. */ +#define CRYPTO_R_FIPS_MODE_NOT_SUPPORTED 101 #define CRYPTO_R_NO_DYNLOCK_CREATE_CALLBACK 100 #ifdef __cplusplus diff --git a/crypto/openssl/crypto/des/des.h b/crypto/openssl/crypto/des/des.h index 92b6663599..1eaedcbd24 100644 --- a/crypto/openssl/crypto/des/des.h +++ b/crypto/openssl/crypto/des/des.h @@ -224,6 +224,9 @@ int DES_set_key(const_DES_cblock *key,DES_key_schedule *schedule); int DES_key_sched(const_DES_cblock *key,DES_key_schedule *schedule); int DES_set_key_checked(const_DES_cblock *key,DES_key_schedule *schedule); void DES_set_key_unchecked(const_DES_cblock *key,DES_key_schedule *schedule); +#ifdef OPENSSL_FIPS +void private_DES_set_key_unchecked(const_DES_cblock *key,DES_key_schedule *schedule); +#endif void DES_string_to_key(const char *str,DES_cblock *key); void DES_string_to_2keys(const char *str,DES_cblock *key1,DES_cblock *key2); void DES_cfb64_encrypt(const unsigned char *in,unsigned char *out,long length, diff --git a/crypto/openssl/crypto/des/set_key.c b/crypto/openssl/crypto/des/set_key.c index 3004cc3ab3..d3e69ca8b5 100644 --- a/crypto/openssl/crypto/des/set_key.c +++ b/crypto/openssl/crypto/des/set_key.c @@ -65,6 +65,8 @@ */ #include "des_locl.h" +#include + OPENSSL_IMPLEMENT_GLOBAL(int,DES_check_key,0) /* defaults to false */ static const unsigned char odd_parity[256]={ @@ -335,6 +337,13 @@ int DES_set_key_checked(const_DES_cblock *key, DES_key_schedule *schedule) } void DES_set_key_unchecked(const_DES_cblock *key, DES_key_schedule *schedule) +#ifdef OPENSSL_FIPS + { + fips_cipher_abort(DES); + private_DES_set_key_unchecked(key, schedule); + } +void private_DES_set_key_unchecked(const_DES_cblock *key, DES_key_schedule *schedule) +#endif { static const int shifts2[16]={0,0,1,1,1,1,1,1,0,1,1,1,1,1,1,0}; register DES_LONG c,d,t,s,t2; diff --git a/crypto/openssl/crypto/dh/dh.h b/crypto/openssl/crypto/dh/dh.h index 849309a489..ea59e610ef 100644 --- a/crypto/openssl/crypto/dh/dh.h +++ b/crypto/openssl/crypto/dh/dh.h @@ -86,6 +86,21 @@ * be used for all exponents. */ +/* If this flag is set the DH method is FIPS compliant and can be used + * in FIPS mode. This is set in the validated module method. If an + * application sets this flag in its own methods it is its reposibility + * to ensure the result is compliant. + */ + +#define DH_FLAG_FIPS_METHOD 0x0400 + +/* If this flag is set the operations normally disabled in FIPS mode are + * permitted it is then the applications responsibility to ensure that the + * usage is compliant. + */ + +#define DH_FLAG_NON_FIPS_ALLOW 0x0400 + #ifdef __cplusplus extern "C" { #endif @@ -230,6 +245,9 @@ void ERR_load_DH_strings(void); #define DH_F_COMPUTE_KEY 102 #define DH_F_DHPARAMS_PRINT_FP 101 #define DH_F_DH_BUILTIN_GENPARAMS 106 +#define DH_F_DH_COMPUTE_KEY 114 +#define DH_F_DH_GENERATE_KEY 115 +#define DH_F_DH_GENERATE_PARAMETERS_EX 116 #define DH_F_DH_NEW_METHOD 105 #define DH_F_DH_PARAM_DECODE 107 #define DH_F_DH_PRIV_DECODE 110 @@ -249,7 +267,9 @@ void ERR_load_DH_strings(void); #define DH_R_DECODE_ERROR 104 #define DH_R_INVALID_PUBKEY 102 #define DH_R_KEYS_NOT_SET 108 +#define DH_R_KEY_SIZE_TOO_SMALL 110 #define DH_R_MODULUS_TOO_LARGE 103 +#define DH_R_NON_FIPS_METHOD 111 #define DH_R_NO_PARAMETERS_SET 107 #define DH_R_NO_PRIVATE_VALUE 100 #define DH_R_PARAMETER_ENCODING_ERROR 105 diff --git a/crypto/openssl/crypto/dh/dh_ameth.c b/crypto/openssl/crypto/dh/dh_ameth.c index 377caf96c9..02ec2d47b4 100644 --- a/crypto/openssl/crypto/dh/dh_ameth.c +++ b/crypto/openssl/crypto/dh/dh_ameth.c @@ -493,6 +493,7 @@ const EVP_PKEY_ASN1_METHOD dh_asn1_meth = dh_copy_parameters, dh_cmp_parameters, dh_param_print, + 0, int_dh_free, 0 diff --git a/crypto/openssl/crypto/dh/dh_err.c b/crypto/openssl/crypto/dh/dh_err.c index d5cf0c22a3..56d3df7356 100644 --- a/crypto/openssl/crypto/dh/dh_err.c +++ b/crypto/openssl/crypto/dh/dh_err.c @@ -1,6 +1,6 @@ /* crypto/dh/dh_err.c */ /* ==================================================================== - * Copyright (c) 1999-2006 The OpenSSL Project. All rights reserved. + * Copyright (c) 1999-2011 The OpenSSL Project. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions @@ -73,6 +73,9 @@ static ERR_STRING_DATA DH_str_functs[]= {ERR_FUNC(DH_F_COMPUTE_KEY), "COMPUTE_KEY"}, {ERR_FUNC(DH_F_DHPARAMS_PRINT_FP), "DHparams_print_fp"}, {ERR_FUNC(DH_F_DH_BUILTIN_GENPARAMS), "DH_BUILTIN_GENPARAMS"}, +{ERR_FUNC(DH_F_DH_COMPUTE_KEY), "DH_compute_key"}, +{ERR_FUNC(DH_F_DH_GENERATE_KEY), "DH_generate_key"}, +{ERR_FUNC(DH_F_DH_GENERATE_PARAMETERS_EX), "DH_generate_parameters_ex"}, {ERR_FUNC(DH_F_DH_NEW_METHOD), "DH_new_method"}, {ERR_FUNC(DH_F_DH_PARAM_DECODE), "DH_PARAM_DECODE"}, {ERR_FUNC(DH_F_DH_PRIV_DECODE), "DH_PRIV_DECODE"}, @@ -95,7 +98,9 @@ static ERR_STRING_DATA DH_str_reasons[]= {ERR_REASON(DH_R_DECODE_ERROR) ,"decode error"}, {ERR_REASON(DH_R_INVALID_PUBKEY) ,"invalid public key"}, {ERR_REASON(DH_R_KEYS_NOT_SET) ,"keys not set"}, +{ERR_REASON(DH_R_KEY_SIZE_TOO_SMALL) ,"key size too small"}, {ERR_REASON(DH_R_MODULUS_TOO_LARGE) ,"modulus too large"}, +{ERR_REASON(DH_R_NON_FIPS_METHOD) ,"non fips method"}, {ERR_REASON(DH_R_NO_PARAMETERS_SET) ,"no parameters set"}, {ERR_REASON(DH_R_NO_PRIVATE_VALUE) ,"no private value"}, {ERR_REASON(DH_R_PARAMETER_ENCODING_ERROR),"parameter encoding error"}, diff --git a/crypto/openssl/crypto/dh/dh_gen.c b/crypto/openssl/crypto/dh/dh_gen.c index cfd5b11868..7b1fe9c9cb 100644 --- a/crypto/openssl/crypto/dh/dh_gen.c +++ b/crypto/openssl/crypto/dh/dh_gen.c @@ -66,12 +66,29 @@ #include #include +#ifdef OPENSSL_FIPS +#include +#endif + static int dh_builtin_genparams(DH *ret, int prime_len, int generator, BN_GENCB *cb); int DH_generate_parameters_ex(DH *ret, int prime_len, int generator, BN_GENCB *cb) { +#ifdef OPENSSL_FIPS + if (FIPS_mode() && !(ret->meth->flags & DH_FLAG_FIPS_METHOD) + && !(ret->flags & DH_FLAG_NON_FIPS_ALLOW)) + { + DHerr(DH_F_DH_GENERATE_PARAMETERS_EX, DH_R_NON_FIPS_METHOD); + return 0; + } +#endif if(ret->meth->generate_params) return ret->meth->generate_params(ret, prime_len, generator, cb); +#ifdef OPENSSL_FIPS + if (FIPS_mode()) + return FIPS_dh_generate_parameters_ex(ret, prime_len, + generator, cb); +#endif return dh_builtin_genparams(ret, prime_len, generator, cb); } diff --git a/crypto/openssl/crypto/dh/dh_key.c b/crypto/openssl/crypto/dh/dh_key.c index e7db440342..89a74db4e6 100644 --- a/crypto/openssl/crypto/dh/dh_key.c +++ b/crypto/openssl/crypto/dh/dh_key.c @@ -73,11 +73,27 @@ static int dh_finish(DH *dh); int DH_generate_key(DH *dh) { +#ifdef OPENSSL_FIPS + if (FIPS_mode() && !(dh->meth->flags & DH_FLAG_FIPS_METHOD) + && !(dh->flags & DH_FLAG_NON_FIPS_ALLOW)) + { + DHerr(DH_F_DH_GENERATE_KEY, DH_R_NON_FIPS_METHOD); + return 0; + } +#endif return dh->meth->generate_key(dh); } int DH_compute_key(unsigned char *key, const BIGNUM *pub_key, DH *dh) { +#ifdef OPENSSL_FIPS + if (FIPS_mode() && !(dh->meth->flags & DH_FLAG_FIPS_METHOD) + && !(dh->flags & DH_FLAG_NON_FIPS_ALLOW)) + { + DHerr(DH_F_DH_COMPUTE_KEY, DH_R_NON_FIPS_METHOD); + return 0; + } +#endif return dh->meth->compute_key(key, pub_key, dh); } @@ -138,8 +154,21 @@ static int generate_key(DH *dh) if (generate_new_key) { - l = dh->length ? dh->length : BN_num_bits(dh->p)-1; /* secret exponent length */ - if (!BN_rand(priv_key, l, 0, 0)) goto err; + if (dh->q) + { + do + { + if (!BN_rand_range(priv_key, dh->q)) + goto err; + } + while (BN_is_zero(priv_key) || BN_is_one(priv_key)); + } + else + { + /* secret exponent length */ + l = dh->length ? dh->length : BN_num_bits(dh->p)-1; + if (!BN_rand(priv_key, l, 0, 0)) goto err; + } } { diff --git a/crypto/openssl/crypto/dh/dh_lib.c b/crypto/openssl/crypto/dh/dh_lib.c index 7aef080e7a..00218f2b92 100644 --- a/crypto/openssl/crypto/dh/dh_lib.c +++ b/crypto/openssl/crypto/dh/dh_lib.c @@ -64,6 +64,10 @@ #include #endif +#ifdef OPENSSL_FIPS +#include +#endif + const char DH_version[]="Diffie-Hellman" OPENSSL_VERSION_PTEXT; static const DH_METHOD *default_DH_method = NULL; @@ -76,7 +80,16 @@ void DH_set_default_method(const DH_METHOD *meth) const DH_METHOD *DH_get_default_method(void) { if(!default_DH_method) + { +#ifdef OPENSSL_FIPS + if (FIPS_mode()) + return FIPS_dh_openssl(); + else + return DH_OpenSSL(); +#else default_DH_method = DH_OpenSSL(); +#endif + } return default_DH_method; } @@ -156,7 +169,7 @@ DH *DH_new_method(ENGINE *engine) ret->counter = NULL; ret->method_mont_p=NULL; ret->references = 1; - ret->flags=ret->meth->flags; + ret->flags=ret->meth->flags & ~DH_FLAG_NON_FIPS_ALLOW; CRYPTO_new_ex_data(CRYPTO_EX_INDEX_DH, ret, &ret->ex_data); if ((ret->meth->init != NULL) && !ret->meth->init(ret)) { diff --git a/crypto/openssl/crypto/dsa/dsa.h b/crypto/openssl/crypto/dsa/dsa.h index ac50a5c846..a6f6d0b0b2 100644 --- a/crypto/openssl/crypto/dsa/dsa.h +++ b/crypto/openssl/crypto/dsa/dsa.h @@ -97,6 +97,21 @@ * be used for all exponents. */ +/* If this flag is set the DSA method is FIPS compliant and can be used + * in FIPS mode. This is set in the validated module method. If an + * application sets this flag in its own methods it is its reposibility + * to ensure the result is compliant. + */ + +#define DSA_FLAG_FIPS_METHOD 0x0400 + +/* If this flag is set the operations normally disabled in FIPS mode are + * permitted it is then the applications responsibility to ensure that the + * usage is compliant. + */ + +#define DSA_FLAG_NON_FIPS_ALLOW 0x0400 + #ifdef __cplusplus extern "C" { #endif @@ -272,6 +287,8 @@ void ERR_load_DSA_strings(void); #define DSA_F_DSAPARAMS_PRINT_FP 101 #define DSA_F_DSA_DO_SIGN 112 #define DSA_F_DSA_DO_VERIFY 113 +#define DSA_F_DSA_GENERATE_KEY 124 +#define DSA_F_DSA_GENERATE_PARAMETERS_EX 123 #define DSA_F_DSA_NEW_METHOD 103 #define DSA_F_DSA_PARAM_DECODE 119 #define DSA_F_DSA_PRINT_FP 105 @@ -282,6 +299,7 @@ void ERR_load_DSA_strings(void); #define DSA_F_DSA_SIGN 106 #define DSA_F_DSA_SIGN_SETUP 107 #define DSA_F_DSA_SIG_NEW 109 +#define DSA_F_DSA_SIG_PRINT 125 #define DSA_F_DSA_VERIFY 108 #define DSA_F_I2D_DSA_SIG 111 #define DSA_F_OLD_DSA_PRIV_DECODE 122 @@ -298,6 +316,8 @@ void ERR_load_DSA_strings(void); #define DSA_R_INVALID_DIGEST_TYPE 106 #define DSA_R_MISSING_PARAMETERS 101 #define DSA_R_MODULUS_TOO_LARGE 103 +#define DSA_R_NEED_NEW_SETUP_VALUES 110 +#define DSA_R_NON_FIPS_DSA_METHOD 111 #define DSA_R_NO_PARAMETERS_SET 107 #define DSA_R_PARAMETER_ENCODING_ERROR 105 diff --git a/crypto/openssl/crypto/dsa/dsa_ameth.c b/crypto/openssl/crypto/dsa/dsa_ameth.c index 6413aae46e..376156ec5e 100644 --- a/crypto/openssl/crypto/dsa/dsa_ameth.c +++ b/crypto/openssl/crypto/dsa/dsa_ameth.c @@ -542,6 +542,52 @@ static int old_dsa_priv_encode(const EVP_PKEY *pkey, unsigned char **pder) return i2d_DSAPrivateKey(pkey->pkey.dsa, pder); } +static int dsa_sig_print(BIO *bp, const X509_ALGOR *sigalg, + const ASN1_STRING *sig, + int indent, ASN1_PCTX *pctx) + { + DSA_SIG *dsa_sig; + const unsigned char *p; + if (!sig) + { + if (BIO_puts(bp, "\n") <= 0) + return 0; + else + return 1; + } + p = sig->data; + dsa_sig = d2i_DSA_SIG(NULL, &p, sig->length); + if (dsa_sig) + { + int rv = 0; + size_t buf_len = 0; + unsigned char *m=NULL; + update_buflen(dsa_sig->r, &buf_len); + update_buflen(dsa_sig->s, &buf_len); + m = OPENSSL_malloc(buf_len+10); + if (m == NULL) + { + DSAerr(DSA_F_DSA_SIG_PRINT,ERR_R_MALLOC_FAILURE); + goto err; + } + + if (BIO_write(bp, "\n", 1) != 1) + goto err; + + if (!ASN1_bn_print(bp,"r: ",dsa_sig->r,m,indent)) + goto err; + if (!ASN1_bn_print(bp,"s: ",dsa_sig->s,m,indent)) + goto err; + rv = 1; + err: + if (m) + OPENSSL_free(m); + DSA_SIG_free(dsa_sig); + return rv; + } + return X509_signature_dump(bp, sig, indent); + } + static int dsa_pkey_ctrl(EVP_PKEY *pkey, int op, long arg1, void *arg2) { switch (op) @@ -647,6 +693,7 @@ const EVP_PKEY_ASN1_METHOD dsa_asn1_meths[] = dsa_copy_parameters, dsa_cmp_parameters, dsa_param_print, + dsa_sig_print, int_dsa_free, dsa_pkey_ctrl, diff --git a/crypto/openssl/crypto/dsa/dsa_asn1.c b/crypto/openssl/crypto/dsa/dsa_asn1.c index c37460b2d6..6058534374 100644 --- a/crypto/openssl/crypto/dsa/dsa_asn1.c +++ b/crypto/openssl/crypto/dsa/dsa_asn1.c @@ -61,6 +61,7 @@ #include #include #include +#include /* Override the default new methods */ static int sig_cb(int operation, ASN1_VALUE **pval, const ASN1_ITEM *it, @@ -87,7 +88,7 @@ ASN1_SEQUENCE_cb(DSA_SIG, sig_cb) = { ASN1_SIMPLE(DSA_SIG, s, CBIGNUM) } ASN1_SEQUENCE_END_cb(DSA_SIG, DSA_SIG) -IMPLEMENT_ASN1_FUNCTIONS_const(DSA_SIG) +IMPLEMENT_ASN1_ENCODE_FUNCTIONS_const_fname(DSA_SIG, DSA_SIG, DSA_SIG) /* Override the default free and new methods */ static int dsa_cb(int operation, ASN1_VALUE **pval, const ASN1_ITEM *it, @@ -148,3 +149,40 @@ DSA *DSAparams_dup(DSA *dsa) { return ASN1_item_dup(ASN1_ITEM_rptr(DSAparams), dsa); } + +int DSA_sign(int type, const unsigned char *dgst, int dlen, unsigned char *sig, + unsigned int *siglen, DSA *dsa) + { + DSA_SIG *s; + RAND_seed(dgst, dlen); + s=DSA_do_sign(dgst,dlen,dsa); + if (s == NULL) + { + *siglen=0; + return(0); + } + *siglen=i2d_DSA_SIG(s,&sig); + DSA_SIG_free(s); + return(1); + } + +/* data has already been hashed (probably with SHA or SHA-1). */ +/* returns + * 1: correct signature + * 0: incorrect signature + * -1: error + */ +int DSA_verify(int type, const unsigned char *dgst, int dgst_len, + const unsigned char *sigbuf, int siglen, DSA *dsa) + { + DSA_SIG *s; + int ret=-1; + + s = DSA_SIG_new(); + if (s == NULL) return(ret); + if (d2i_DSA_SIG(&s,&sigbuf,siglen) == NULL) goto err; + ret=DSA_do_verify(dgst,dgst_len,s,dsa); +err: + DSA_SIG_free(s); + return(ret); + } diff --git a/crypto/openssl/crypto/dsa/dsa_err.c b/crypto/openssl/crypto/dsa/dsa_err.c index bba984e92e..00545b7b9f 100644 --- a/crypto/openssl/crypto/dsa/dsa_err.c +++ b/crypto/openssl/crypto/dsa/dsa_err.c @@ -1,6 +1,6 @@ /* crypto/dsa/dsa_err.c */ /* ==================================================================== - * Copyright (c) 1999-2006 The OpenSSL Project. All rights reserved. + * Copyright (c) 1999-2011 The OpenSSL Project. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions @@ -76,6 +76,8 @@ static ERR_STRING_DATA DSA_str_functs[]= {ERR_FUNC(DSA_F_DSAPARAMS_PRINT_FP), "DSAparams_print_fp"}, {ERR_FUNC(DSA_F_DSA_DO_SIGN), "DSA_do_sign"}, {ERR_FUNC(DSA_F_DSA_DO_VERIFY), "DSA_do_verify"}, +{ERR_FUNC(DSA_F_DSA_GENERATE_KEY), "DSA_generate_key"}, +{ERR_FUNC(DSA_F_DSA_GENERATE_PARAMETERS_EX), "DSA_generate_parameters_ex"}, {ERR_FUNC(DSA_F_DSA_NEW_METHOD), "DSA_new_method"}, {ERR_FUNC(DSA_F_DSA_PARAM_DECODE), "DSA_PARAM_DECODE"}, {ERR_FUNC(DSA_F_DSA_PRINT_FP), "DSA_print_fp"}, @@ -86,6 +88,7 @@ static ERR_STRING_DATA DSA_str_functs[]= {ERR_FUNC(DSA_F_DSA_SIGN), "DSA_sign"}, {ERR_FUNC(DSA_F_DSA_SIGN_SETUP), "DSA_sign_setup"}, {ERR_FUNC(DSA_F_DSA_SIG_NEW), "DSA_SIG_new"}, +{ERR_FUNC(DSA_F_DSA_SIG_PRINT), "DSA_SIG_PRINT"}, {ERR_FUNC(DSA_F_DSA_VERIFY), "DSA_verify"}, {ERR_FUNC(DSA_F_I2D_DSA_SIG), "i2d_DSA_SIG"}, {ERR_FUNC(DSA_F_OLD_DSA_PRIV_DECODE), "OLD_DSA_PRIV_DECODE"}, @@ -105,6 +108,8 @@ static ERR_STRING_DATA DSA_str_reasons[]= {ERR_REASON(DSA_R_INVALID_DIGEST_TYPE) ,"invalid digest type"}, {ERR_REASON(DSA_R_MISSING_PARAMETERS) ,"missing parameters"}, {ERR_REASON(DSA_R_MODULUS_TOO_LARGE) ,"modulus too large"}, +{ERR_REASON(DSA_R_NEED_NEW_SETUP_VALUES) ,"need new setup values"}, +{ERR_REASON(DSA_R_NON_FIPS_DSA_METHOD) ,"non fips dsa method"}, {ERR_REASON(DSA_R_NO_PARAMETERS_SET) ,"no parameters set"}, {ERR_REASON(DSA_R_PARAMETER_ENCODING_ERROR),"parameter encoding error"}, {0,NULL} diff --git a/crypto/openssl/crypto/dsa/dsa_gen.c b/crypto/openssl/crypto/dsa/dsa_gen.c index cb0b4538a4..c398761d0d 100644 --- a/crypto/openssl/crypto/dsa/dsa_gen.c +++ b/crypto/openssl/crypto/dsa/dsa_gen.c @@ -81,13 +81,33 @@ #include #include "dsa_locl.h" +#ifdef OPENSSL_FIPS +#include +#endif + int DSA_generate_parameters_ex(DSA *ret, int bits, const unsigned char *seed_in, int seed_len, int *counter_ret, unsigned long *h_ret, BN_GENCB *cb) { +#ifdef OPENSSL_FIPS + if (FIPS_mode() && !(ret->meth->flags & DSA_FLAG_FIPS_METHOD) + && !(ret->flags & DSA_FLAG_NON_FIPS_ALLOW)) + { + DSAerr(DSA_F_DSA_GENERATE_PARAMETERS_EX, DSA_R_NON_FIPS_DSA_METHOD); + return 0; + } +#endif if(ret->meth->dsa_paramgen) return ret->meth->dsa_paramgen(ret, bits, seed_in, seed_len, counter_ret, h_ret, cb); +#ifdef OPENSSL_FIPS + else if (FIPS_mode()) + { + return FIPS_dsa_generate_parameters_ex(ret, bits, + seed_in, seed_len, + counter_ret, h_ret, cb); + } +#endif else { const EVP_MD *evpmd; @@ -105,12 +125,13 @@ int DSA_generate_parameters_ex(DSA *ret, int bits, } return dsa_builtin_paramgen(ret, bits, qbits, evpmd, - seed_in, seed_len, counter_ret, h_ret, cb); + seed_in, seed_len, NULL, counter_ret, h_ret, cb); } } int dsa_builtin_paramgen(DSA *ret, size_t bits, size_t qbits, const EVP_MD *evpmd, const unsigned char *seed_in, size_t seed_len, + unsigned char *seed_out, int *counter_ret, unsigned long *h_ret, BN_GENCB *cb) { int ok=0; @@ -201,8 +222,10 @@ int dsa_builtin_paramgen(DSA *ret, size_t bits, size_t qbits, } /* step 2 */ - EVP_Digest(seed, qsize, md, NULL, evpmd, NULL); - EVP_Digest(buf, qsize, buf2, NULL, evpmd, NULL); + if (!EVP_Digest(seed, qsize, md, NULL, evpmd, NULL)) + goto err; + if (!EVP_Digest(buf, qsize, buf2, NULL, evpmd, NULL)) + goto err; for (i = 0; i < qsize; i++) md[i]^=buf2[i]; @@ -251,7 +274,9 @@ int dsa_builtin_paramgen(DSA *ret, size_t bits, size_t qbits, break; } - EVP_Digest(buf, qsize, md ,NULL, evpmd, NULL); + if (!EVP_Digest(buf, qsize, md ,NULL, evpmd, + NULL)) + goto err; /* step 8 */ if (!BN_bin2bn(md, qsize, r0)) @@ -332,6 +357,8 @@ err: } if (counter_ret != NULL) *counter_ret=counter; if (h_ret != NULL) *h_ret=h; + if (seed_out) + memcpy(seed_out, seed, qsize); } if(ctx) { diff --git a/crypto/openssl/crypto/dsa/dsa_key.c b/crypto/openssl/crypto/dsa/dsa_key.c index c4aa86bc6d..9cf669b921 100644 --- a/crypto/openssl/crypto/dsa/dsa_key.c +++ b/crypto/openssl/crypto/dsa/dsa_key.c @@ -64,12 +64,28 @@ #include #include +#ifdef OPENSSL_FIPS +#include +#endif + static int dsa_builtin_keygen(DSA *dsa); int DSA_generate_key(DSA *dsa) { +#ifdef OPENSSL_FIPS + if (FIPS_mode() && !(dsa->meth->flags & DSA_FLAG_FIPS_METHOD) + && !(dsa->flags & DSA_FLAG_NON_FIPS_ALLOW)) + { + DSAerr(DSA_F_DSA_GENERATE_KEY, DSA_R_NON_FIPS_DSA_METHOD); + return 0; + } +#endif if(dsa->meth->dsa_keygen) return dsa->meth->dsa_keygen(dsa); +#ifdef OPENSSL_FIPS + if (FIPS_mode()) + return FIPS_dsa_generate_key(dsa); +#endif return dsa_builtin_keygen(dsa); } diff --git a/crypto/openssl/crypto/dsa/dsa_lib.c b/crypto/openssl/crypto/dsa/dsa_lib.c index e9b75902db..96d8d0c4b4 100644 --- a/crypto/openssl/crypto/dsa/dsa_lib.c +++ b/crypto/openssl/crypto/dsa/dsa_lib.c @@ -70,6 +70,10 @@ #include #endif +#ifdef OPENSSL_FIPS +#include +#endif + const char DSA_version[]="DSA" OPENSSL_VERSION_PTEXT; static const DSA_METHOD *default_DSA_method = NULL; @@ -82,7 +86,16 @@ void DSA_set_default_method(const DSA_METHOD *meth) const DSA_METHOD *DSA_get_default_method(void) { if(!default_DSA_method) + { +#ifdef OPENSSL_FIPS + if (FIPS_mode()) + return FIPS_dsa_openssl(); + else + return DSA_OpenSSL(); +#else default_DSA_method = DSA_OpenSSL(); +#endif + } return default_DSA_method; } @@ -163,7 +176,7 @@ DSA *DSA_new_method(ENGINE *engine) ret->method_mont_p=NULL; ret->references=1; - ret->flags=ret->meth->flags; + ret->flags=ret->meth->flags & ~DSA_FLAG_NON_FIPS_ALLOW; CRYPTO_new_ex_data(CRYPTO_EX_INDEX_DSA, ret, &ret->ex_data); if ((ret->meth->init != NULL) && !ret->meth->init(ret)) { @@ -276,7 +289,8 @@ void *DSA_get_ex_data(DSA *d, int idx) DH *DSA_dup_DH(const DSA *r) { /* DSA has p, q, g, optional pub_key, optional priv_key. - * DH has p, optional length, g, optional pub_key, optional priv_key. + * DH has p, optional length, g, optional pub_key, optional priv_key, + * optional q. */ DH *ret = NULL; @@ -290,7 +304,11 @@ DH *DSA_dup_DH(const DSA *r) if ((ret->p = BN_dup(r->p)) == NULL) goto err; if (r->q != NULL) + { ret->length = BN_num_bits(r->q); + if ((ret->q = BN_dup(r->q)) == NULL) + goto err; + } if (r->g != NULL) if ((ret->g = BN_dup(r->g)) == NULL) goto err; diff --git a/crypto/openssl/crypto/dsa/dsa_locl.h b/crypto/openssl/crypto/dsa/dsa_locl.h index 2b8cfee3db..21e2e45242 100644 --- a/crypto/openssl/crypto/dsa/dsa_locl.h +++ b/crypto/openssl/crypto/dsa/dsa_locl.h @@ -56,4 +56,5 @@ int dsa_builtin_paramgen(DSA *ret, size_t bits, size_t qbits, const EVP_MD *evpmd, const unsigned char *seed_in, size_t seed_len, + unsigned char *seed_out, int *counter_ret, unsigned long *h_ret, BN_GENCB *cb); diff --git a/crypto/openssl/crypto/dsa/dsa_ossl.c b/crypto/openssl/crypto/dsa/dsa_ossl.c index a3ddd7d281..b3d78e524c 100644 --- a/crypto/openssl/crypto/dsa/dsa_ossl.c +++ b/crypto/openssl/crypto/dsa/dsa_ossl.c @@ -136,6 +136,7 @@ static DSA_SIG *dsa_do_sign(const unsigned char *dgst, int dlen, DSA *dsa) BN_CTX *ctx=NULL; int reason=ERR_R_BN_LIB; DSA_SIG *ret=NULL; + int noredo = 0; BN_init(&m); BN_init(&xr); @@ -150,7 +151,7 @@ static DSA_SIG *dsa_do_sign(const unsigned char *dgst, int dlen, DSA *dsa) if (s == NULL) goto err; ctx=BN_CTX_new(); if (ctx == NULL) goto err; - +redo: if ((dsa->kinv == NULL) || (dsa->r == NULL)) { if (!DSA_sign_setup(dsa,ctx,&kinv,&r)) goto err; @@ -161,6 +162,7 @@ static DSA_SIG *dsa_do_sign(const unsigned char *dgst, int dlen, DSA *dsa) dsa->kinv=NULL; r=dsa->r; dsa->r=NULL; + noredo = 1; } @@ -181,6 +183,18 @@ static DSA_SIG *dsa_do_sign(const unsigned char *dgst, int dlen, DSA *dsa) ret=DSA_SIG_new(); if (ret == NULL) goto err; + /* Redo if r or s is zero as required by FIPS 186-3: this is + * very unlikely. + */ + if (BN_is_zero(r) || BN_is_zero(s)) + { + if (noredo) + { + reason = DSA_R_NEED_NEW_SETUP_VALUES; + goto err; + } + goto redo; + } ret->r = r; ret->s = s; diff --git a/crypto/openssl/crypto/dsa/dsa_pmeth.c b/crypto/openssl/crypto/dsa/dsa_pmeth.c index e2df54fec6..715d8d675b 100644 --- a/crypto/openssl/crypto/dsa/dsa_pmeth.c +++ b/crypto/openssl/crypto/dsa/dsa_pmeth.c @@ -189,7 +189,9 @@ static int pkey_dsa_ctrl(EVP_PKEY_CTX *ctx, int type, int p1, void *p2) EVP_MD_type((const EVP_MD *)p2) != NID_dsa && EVP_MD_type((const EVP_MD *)p2) != NID_dsaWithSHA && EVP_MD_type((const EVP_MD *)p2) != NID_sha224 && - EVP_MD_type((const EVP_MD *)p2) != NID_sha256) + EVP_MD_type((const EVP_MD *)p2) != NID_sha256 && + EVP_MD_type((const EVP_MD *)p2) != NID_sha384 && + EVP_MD_type((const EVP_MD *)p2) != NID_sha512) { DSAerr(DSA_F_PKEY_DSA_CTRL, DSA_R_INVALID_DIGEST_TYPE); return 0; @@ -253,7 +255,7 @@ static int pkey_dsa_paramgen(EVP_PKEY_CTX *ctx, EVP_PKEY *pkey) if (!dsa) return 0; ret = dsa_builtin_paramgen(dsa, dctx->nbits, dctx->qbits, dctx->pmd, - NULL, 0, NULL, NULL, pcb); + NULL, 0, NULL, NULL, NULL, pcb); if (ret) EVP_PKEY_assign_DSA(pkey, dsa); else diff --git a/crypto/openssl/crypto/dsa/dsa_sign.c b/crypto/openssl/crypto/dsa/dsa_sign.c index 17555e5892..c3cc3642ce 100644 --- a/crypto/openssl/crypto/dsa/dsa_sign.c +++ b/crypto/openssl/crypto/dsa/dsa_sign.c @@ -61,30 +61,54 @@ #include "cryptlib.h" #include #include +#include DSA_SIG * DSA_do_sign(const unsigned char *dgst, int dlen, DSA *dsa) { +#ifdef OPENSSL_FIPS + if (FIPS_mode() && !(dsa->meth->flags & DSA_FLAG_FIPS_METHOD) + && !(dsa->flags & DSA_FLAG_NON_FIPS_ALLOW)) + { + DSAerr(DSA_F_DSA_DO_SIGN, DSA_R_NON_FIPS_DSA_METHOD); + return NULL; + } +#endif return dsa->meth->dsa_do_sign(dgst, dlen, dsa); } -int DSA_sign(int type, const unsigned char *dgst, int dlen, unsigned char *sig, - unsigned int *siglen, DSA *dsa) +int DSA_sign_setup(DSA *dsa, BN_CTX *ctx_in, BIGNUM **kinvp, BIGNUM **rp) { - DSA_SIG *s; - RAND_seed(dgst, dlen); - s=DSA_do_sign(dgst,dlen,dsa); - if (s == NULL) +#ifdef OPENSSL_FIPS + if (FIPS_mode() && !(dsa->meth->flags & DSA_FLAG_FIPS_METHOD) + && !(dsa->flags & DSA_FLAG_NON_FIPS_ALLOW)) { - *siglen=0; - return(0); + DSAerr(DSA_F_DSA_SIGN_SETUP, DSA_R_NON_FIPS_DSA_METHOD); + return 0; } - *siglen=i2d_DSA_SIG(s,&sig); - DSA_SIG_free(s); - return(1); +#endif + return dsa->meth->dsa_sign_setup(dsa, ctx_in, kinvp, rp); } -int DSA_sign_setup(DSA *dsa, BN_CTX *ctx_in, BIGNUM **kinvp, BIGNUM **rp) +DSA_SIG *DSA_SIG_new(void) { - return dsa->meth->dsa_sign_setup(dsa, ctx_in, kinvp, rp); + DSA_SIG *sig; + sig = OPENSSL_malloc(sizeof(DSA_SIG)); + if (!sig) + return NULL; + sig->r = NULL; + sig->s = NULL; + return sig; + } + +void DSA_SIG_free(DSA_SIG *sig) + { + if (sig) + { + if (sig->r) + BN_free(sig->r); + if (sig->s) + BN_free(sig->s); + OPENSSL_free(sig); + } } diff --git a/crypto/openssl/crypto/dsa/dsa_vrf.c b/crypto/openssl/crypto/dsa/dsa_vrf.c index 226a75ff3f..674cb5fa5f 100644 --- a/crypto/openssl/crypto/dsa/dsa_vrf.c +++ b/crypto/openssl/crypto/dsa/dsa_vrf.c @@ -64,26 +64,13 @@ int DSA_do_verify(const unsigned char *dgst, int dgst_len, DSA_SIG *sig, DSA *dsa) { +#ifdef OPENSSL_FIPS + if (FIPS_mode() && !(dsa->meth->flags & DSA_FLAG_FIPS_METHOD) + && !(dsa->flags & DSA_FLAG_NON_FIPS_ALLOW)) + { + DSAerr(DSA_F_DSA_DO_VERIFY, DSA_R_NON_FIPS_DSA_METHOD); + return -1; + } +#endif return dsa->meth->dsa_do_verify(dgst, dgst_len, sig, dsa); } - -/* data has already been hashed (probably with SHA or SHA-1). */ -/* returns - * 1: correct signature - * 0: incorrect signature - * -1: error - */ -int DSA_verify(int type, const unsigned char *dgst, int dgst_len, - const unsigned char *sigbuf, int siglen, DSA *dsa) - { - DSA_SIG *s; - int ret=-1; - - s = DSA_SIG_new(); - if (s == NULL) return(ret); - if (d2i_DSA_SIG(&s,&sigbuf,siglen) == NULL) goto err; - ret=DSA_do_verify(dgst,dgst_len,s,dsa); -err: - DSA_SIG_free(s); - return(ret); - } diff --git a/crypto/openssl/crypto/dso/dso_dlfcn.c b/crypto/openssl/crypto/dso/dso_dlfcn.c index c2bc61760b..5f2254806c 100644 --- a/crypto/openssl/crypto/dso/dso_dlfcn.c +++ b/crypto/openssl/crypto/dso/dso_dlfcn.c @@ -86,7 +86,8 @@ DSO_METHOD *DSO_METHOD_dlfcn(void) # if defined(_AIX) || defined(__CYGWIN__) || \ defined(__SCO_VERSION__) || defined(_SCO_ELF) || \ (defined(__osf__) && !defined(RTLD_NEXT)) || \ - (defined(__OpenBSD__) && !defined(RTLD_SELF)) + (defined(__OpenBSD__) && !defined(RTLD_SELF)) || \ + defined(__ANDROID__) # undef HAVE_DLINFO # endif #endif diff --git a/crypto/openssl/crypto/ec/ec.h b/crypto/openssl/crypto/ec/ec.h index ee7078130c..9d01325af3 100644 --- a/crypto/openssl/crypto/ec/ec.h +++ b/crypto/openssl/crypto/ec/ec.h @@ -151,7 +151,24 @@ const EC_METHOD *EC_GFp_mont_method(void); */ const EC_METHOD *EC_GFp_nist_method(void); +#ifndef OPENSSL_NO_EC_NISTP_64_GCC_128 +/** Returns 64-bit optimized methods for nistp224 + * \return EC_METHOD object + */ +const EC_METHOD *EC_GFp_nistp224_method(void); + +/** Returns 64-bit optimized methods for nistp256 + * \return EC_METHOD object + */ +const EC_METHOD *EC_GFp_nistp256_method(void); + +/** Returns 64-bit optimized methods for nistp521 + * \return EC_METHOD object + */ +const EC_METHOD *EC_GFp_nistp521_method(void); +#endif +#ifndef OPENSSL_NO_EC2M /********************************************************************/ /* EC_METHOD for curves over GF(2^m) */ /********************************************************************/ @@ -161,6 +178,8 @@ const EC_METHOD *EC_GFp_nist_method(void); */ const EC_METHOD *EC_GF2m_simple_method(void); +#endif + /********************************************************************/ /* EC_GROUP functions */ @@ -282,6 +301,7 @@ int EC_GROUP_set_curve_GFp(EC_GROUP *group, const BIGNUM *p, const BIGNUM *a, co */ int EC_GROUP_get_curve_GFp(const EC_GROUP *group, BIGNUM *p, BIGNUM *a, BIGNUM *b, BN_CTX *ctx); +#ifndef OPENSSL_NO_EC2M /** Sets the parameter of a ec over GF2m defined by y^2 + x*y = x^3 + a*x^2 + b * \param group EC_GROUP object * \param p BIGNUM with the polynomial defining the underlying field @@ -301,7 +321,7 @@ int EC_GROUP_set_curve_GF2m(EC_GROUP *group, const BIGNUM *p, const BIGNUM *a, c * \return 1 on success and 0 if an error occured */ int EC_GROUP_get_curve_GF2m(const EC_GROUP *group, BIGNUM *p, BIGNUM *a, BIGNUM *b, BN_CTX *ctx); - +#endif /** Returns the number of bits needed to represent a field element * \param group EC_GROUP object * \return number of bits needed to represent a field element @@ -342,7 +362,7 @@ int EC_GROUP_cmp(const EC_GROUP *a, const EC_GROUP *b, BN_CTX *ctx); * \return newly created EC_GROUP object with the specified parameters */ EC_GROUP *EC_GROUP_new_curve_GFp(const BIGNUM *p, const BIGNUM *a, const BIGNUM *b, BN_CTX *ctx); - +#ifndef OPENSSL_NO_EC2M /** Creates a new EC_GROUP object with the specified parameters defined * over GF2m (defined by the equation y^2 + x*y = x^3 + a*x^2 + b) * \param p BIGNUM with the polynomial defining the underlying field @@ -352,7 +372,7 @@ EC_GROUP *EC_GROUP_new_curve_GFp(const BIGNUM *p, const BIGNUM *a, const BIGNUM * \return newly created EC_GROUP object with the specified parameters */ EC_GROUP *EC_GROUP_new_curve_GF2m(const BIGNUM *p, const BIGNUM *a, const BIGNUM *b, BN_CTX *ctx); - +#endif /** Creates a EC_GROUP object with a curve specified by a NID * \param nid NID of the OID of the curve name * \return newly created EC_GROUP object with specified curve or NULL @@ -481,7 +501,7 @@ int EC_POINT_get_affine_coordinates_GFp(const EC_GROUP *group, */ int EC_POINT_set_compressed_coordinates_GFp(const EC_GROUP *group, EC_POINT *p, const BIGNUM *x, int y_bit, BN_CTX *ctx); - +#ifndef OPENSSL_NO_EC2M /** Sets the affine coordinates of a EC_POINT over GF2m * \param group underlying EC_GROUP object * \param p EC_POINT object @@ -514,7 +534,7 @@ int EC_POINT_get_affine_coordinates_GF2m(const EC_GROUP *group, */ int EC_POINT_set_compressed_coordinates_GF2m(const EC_GROUP *group, EC_POINT *p, const BIGNUM *x, int y_bit, BN_CTX *ctx); - +#endif /** Encodes a EC_POINT object to a octet string * \param group underlying EC_GROUP object * \param p EC_POINT object @@ -653,9 +673,11 @@ int EC_GROUP_have_precompute_mult(const EC_GROUP *group); /* EC_GROUP_get_basis_type() returns the NID of the basis type * used to represent the field elements */ int EC_GROUP_get_basis_type(const EC_GROUP *); +#ifndef OPENSSL_NO_EC2M int EC_GROUP_get_trinomial_basis(const EC_GROUP *, unsigned int *k); int EC_GROUP_get_pentanomial_basis(const EC_GROUP *, unsigned int *k1, unsigned int *k2, unsigned int *k3); +#endif #define OPENSSL_EC_NAMED_CURVE 0x001 @@ -689,11 +711,21 @@ typedef struct ec_key_st EC_KEY; #define EC_PKEY_NO_PARAMETERS 0x001 #define EC_PKEY_NO_PUBKEY 0x002 +/* some values for the flags field */ +#define EC_FLAG_NON_FIPS_ALLOW 0x1 +#define EC_FLAG_FIPS_CHECKED 0x2 + /** Creates a new EC_KEY object. * \return EC_KEY object or NULL if an error occurred. */ EC_KEY *EC_KEY_new(void); +int EC_KEY_get_flags(const EC_KEY *key); + +void EC_KEY_set_flags(EC_KEY *key, int flags); + +void EC_KEY_clear_flags(EC_KEY *key, int flags); + /** Creates a new EC_KEY object using a named curve as underlying * EC_GROUP object. * \param nid NID of the named curve. @@ -799,6 +831,15 @@ int EC_KEY_generate_key(EC_KEY *key); */ int EC_KEY_check_key(const EC_KEY *key); +/** Sets a public key from affine coordindates performing + * neccessary NIST PKV tests. + * \param key the EC_KEY object + * \param x public key x coordinate + * \param y public key y coordinate + * \return 1 on success and 0 otherwise. + */ +int EC_KEY_set_public_key_affine_coordinates(EC_KEY *key, BIGNUM *x, BIGNUM *y); + /********************************************************************/ /* de- and encoding functions for SEC1 ECPrivateKey */ @@ -926,6 +967,7 @@ void ERR_load_EC_strings(void); /* Error codes for the EC functions. */ /* Function codes. */ +#define EC_F_BN_TO_FELEM 224 #define EC_F_COMPUTE_WNAF 143 #define EC_F_D2I_ECPARAMETERS 144 #define EC_F_D2I_ECPKPARAMETERS 145 @@ -968,6 +1010,15 @@ void ERR_load_EC_strings(void); #define EC_F_EC_GFP_MONT_FIELD_SQR 132 #define EC_F_EC_GFP_MONT_GROUP_SET_CURVE 189 #define EC_F_EC_GFP_MONT_GROUP_SET_CURVE_GFP 135 +#define EC_F_EC_GFP_NISTP224_GROUP_SET_CURVE 225 +#define EC_F_EC_GFP_NISTP224_POINTS_MUL 228 +#define EC_F_EC_GFP_NISTP224_POINT_GET_AFFINE_COORDINATES 226 +#define EC_F_EC_GFP_NISTP256_GROUP_SET_CURVE 230 +#define EC_F_EC_GFP_NISTP256_POINTS_MUL 231 +#define EC_F_EC_GFP_NISTP256_POINT_GET_AFFINE_COORDINATES 232 +#define EC_F_EC_GFP_NISTP521_GROUP_SET_CURVE 233 +#define EC_F_EC_GFP_NISTP521_POINTS_MUL 234 +#define EC_F_EC_GFP_NISTP521_POINT_GET_AFFINE_COORDINATES 235 #define EC_F_EC_GFP_NIST_FIELD_MUL 200 #define EC_F_EC_GFP_NIST_FIELD_SQR 201 #define EC_F_EC_GFP_NIST_GROUP_SET_CURVE 202 @@ -1010,6 +1061,7 @@ void ERR_load_EC_strings(void); #define EC_F_EC_KEY_NEW 182 #define EC_F_EC_KEY_PRINT 180 #define EC_F_EC_KEY_PRINT_FP 181 +#define EC_F_EC_KEY_SET_PUBLIC_KEY_AFFINE_COORDINATES 229 #define EC_F_EC_POINTS_MAKE_AFFINE 136 #define EC_F_EC_POINT_ADD 112 #define EC_F_EC_POINT_CMP 113 @@ -1040,6 +1092,9 @@ void ERR_load_EC_strings(void); #define EC_F_I2D_ECPKPARAMETERS 191 #define EC_F_I2D_ECPRIVATEKEY 192 #define EC_F_I2O_ECPUBLICKEY 151 +#define EC_F_NISTP224_PRE_COMP_NEW 227 +#define EC_F_NISTP256_PRE_COMP_NEW 236 +#define EC_F_NISTP521_PRE_COMP_NEW 237 #define EC_F_O2I_ECPUBLICKEY 152 #define EC_F_OLD_EC_PRIV_DECODE 222 #define EC_F_PKEY_EC_CTRL 197 @@ -1052,12 +1107,15 @@ void ERR_load_EC_strings(void); /* Reason codes. */ #define EC_R_ASN1_ERROR 115 #define EC_R_ASN1_UNKNOWN_FIELD 116 +#define EC_R_BIGNUM_OUT_OF_RANGE 144 #define EC_R_BUFFER_TOO_SMALL 100 +#define EC_R_COORDINATES_OUT_OF_RANGE 146 #define EC_R_D2I_ECPKPARAMETERS_FAILURE 117 #define EC_R_DECODE_ERROR 142 #define EC_R_DISCRIMINANT_IS_ZERO 118 #define EC_R_EC_GROUP_NEW_BY_NAME_FAILURE 119 #define EC_R_FIELD_TOO_LARGE 143 +#define EC_R_GF2M_NOT_SUPPORTED 147 #define EC_R_GROUP2PKPARAMETERS_FAILURE 120 #define EC_R_I2D_ECPKPARAMETERS_FAILURE 121 #define EC_R_INCOMPATIBLE_OBJECTS 101 @@ -1092,6 +1150,7 @@ void ERR_load_EC_strings(void); #define EC_R_UNKNOWN_GROUP 129 #define EC_R_UNKNOWN_ORDER 114 #define EC_R_UNSUPPORTED_FIELD 131 +#define EC_R_WRONG_CURVE_PARAMETERS 145 #define EC_R_WRONG_ORDER 130 #ifdef __cplusplus diff --git a/crypto/openssl/crypto/ec/ec2_mult.c b/crypto/openssl/crypto/ec/ec2_mult.c index e12b9b284a..26f4a783fc 100644 --- a/crypto/openssl/crypto/ec/ec2_mult.c +++ b/crypto/openssl/crypto/ec/ec2_mult.c @@ -71,6 +71,8 @@ #include "ec_lcl.h" +#ifndef OPENSSL_NO_EC2M + /* Compute the x-coordinate x/z for the point 2*(x/z) in Montgomery projective * coordinates. @@ -384,3 +386,5 @@ int ec_GF2m_have_precompute_mult(const EC_GROUP *group) { return ec_wNAF_have_precompute_mult(group); } + +#endif diff --git a/crypto/openssl/crypto/ec/ec2_oct.c b/crypto/openssl/crypto/ec/ec2_oct.c new file mode 100644 index 0000000000..f1d75e5ddf --- /dev/null +++ b/crypto/openssl/crypto/ec/ec2_oct.c @@ -0,0 +1,407 @@ +/* crypto/ec/ec2_oct.c */ +/* ==================================================================== + * Copyright 2002 Sun Microsystems, Inc. ALL RIGHTS RESERVED. + * + * The Elliptic Curve Public-Key Crypto Library (ECC Code) included + * herein is developed by SUN MICROSYSTEMS, INC., and is contributed + * to the OpenSSL project. + * + * The ECC Code is licensed pursuant to the OpenSSL open source + * license provided below. + * + * The software is originally written by Sheueling Chang Shantz and + * Douglas Stebila of Sun Microsystems Laboratories. + * + */ +/* ==================================================================== + * Copyright (c) 1998-2005 The OpenSSL Project. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * + * 3. All advertising materials mentioning features or use of this + * software must display the following acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit. (http://www.openssl.org/)" + * + * 4. The names "OpenSSL Toolkit" and "OpenSSL Project" must not be used to + * endorse or promote products derived from this software without + * prior written permission. For written permission, please contact + * openssl-core@openssl.org. + * + * 5. Products derived from this software may not be called "OpenSSL" + * nor may "OpenSSL" appear in their names without prior written + * permission of the OpenSSL Project. + * + * 6. Redistributions of any form whatsoever must retain the following + * acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit (http://www.openssl.org/)" + * + * THIS SOFTWARE IS PROVIDED BY THE OpenSSL PROJECT ``AS IS'' AND ANY + * EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR + * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE OpenSSL PROJECT OR + * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED + * OF THE POSSIBILITY OF SUCH DAMAGE. + * ==================================================================== + * + * This product includes cryptographic software written by Eric Young + * (eay@cryptsoft.com). This product includes software written by Tim + * Hudson (tjh@cryptsoft.com). + * + */ + +#include + +#include "ec_lcl.h" + +#ifndef OPENSSL_NO_EC2M + +/* Calculates and sets the affine coordinates of an EC_POINT from the given + * compressed coordinates. Uses algorithm 2.3.4 of SEC 1. + * Note that the simple implementation only uses affine coordinates. + * + * The method is from the following publication: + * + * Harper, Menezes, Vanstone: + * "Public-Key Cryptosystems with Very Small Key Lengths", + * EUROCRYPT '92, Springer-Verlag LNCS 658, + * published February 1993 + * + * US Patents 6,141,420 and 6,618,483 (Vanstone, Mullin, Agnew) describe + * the same method, but claim no priority date earlier than July 29, 1994 + * (and additionally fail to cite the EUROCRYPT '92 publication as prior art). + */ +int ec_GF2m_simple_set_compressed_coordinates(const EC_GROUP *group, EC_POINT *point, + const BIGNUM *x_, int y_bit, BN_CTX *ctx) + { + BN_CTX *new_ctx = NULL; + BIGNUM *tmp, *x, *y, *z; + int ret = 0, z0; + + /* clear error queue */ + ERR_clear_error(); + + if (ctx == NULL) + { + ctx = new_ctx = BN_CTX_new(); + if (ctx == NULL) + return 0; + } + + y_bit = (y_bit != 0) ? 1 : 0; + + BN_CTX_start(ctx); + tmp = BN_CTX_get(ctx); + x = BN_CTX_get(ctx); + y = BN_CTX_get(ctx); + z = BN_CTX_get(ctx); + if (z == NULL) goto err; + + if (!BN_GF2m_mod_arr(x, x_, group->poly)) goto err; + if (BN_is_zero(x)) + { + if (!BN_GF2m_mod_sqrt_arr(y, &group->b, group->poly, ctx)) goto err; + } + else + { + if (!group->meth->field_sqr(group, tmp, x, ctx)) goto err; + if (!group->meth->field_div(group, tmp, &group->b, tmp, ctx)) goto err; + if (!BN_GF2m_add(tmp, &group->a, tmp)) goto err; + if (!BN_GF2m_add(tmp, x, tmp)) goto err; + if (!BN_GF2m_mod_solve_quad_arr(z, tmp, group->poly, ctx)) + { + unsigned long err = ERR_peek_last_error(); + + if (ERR_GET_LIB(err) == ERR_LIB_BN && ERR_GET_REASON(err) == BN_R_NO_SOLUTION) + { + ERR_clear_error(); + ECerr(EC_F_EC_GF2M_SIMPLE_SET_COMPRESSED_COORDINATES, EC_R_INVALID_COMPRESSED_POINT); + } + else + ECerr(EC_F_EC_GF2M_SIMPLE_SET_COMPRESSED_COORDINATES, ERR_R_BN_LIB); + goto err; + } + z0 = (BN_is_odd(z)) ? 1 : 0; + if (!group->meth->field_mul(group, y, x, z, ctx)) goto err; + if (z0 != y_bit) + { + if (!BN_GF2m_add(y, y, x)) goto err; + } + } + + if (!EC_POINT_set_affine_coordinates_GF2m(group, point, x, y, ctx)) goto err; + + ret = 1; + + err: + BN_CTX_end(ctx); + if (new_ctx != NULL) + BN_CTX_free(new_ctx); + return ret; + } + + +/* Converts an EC_POINT to an octet string. + * If buf is NULL, the encoded length will be returned. + * If the length len of buf is smaller than required an error will be returned. + */ +size_t ec_GF2m_simple_point2oct(const EC_GROUP *group, const EC_POINT *point, point_conversion_form_t form, + unsigned char *buf, size_t len, BN_CTX *ctx) + { + size_t ret; + BN_CTX *new_ctx = NULL; + int used_ctx = 0; + BIGNUM *x, *y, *yxi; + size_t field_len, i, skip; + + if ((form != POINT_CONVERSION_COMPRESSED) + && (form != POINT_CONVERSION_UNCOMPRESSED) + && (form != POINT_CONVERSION_HYBRID)) + { + ECerr(EC_F_EC_GF2M_SIMPLE_POINT2OCT, EC_R_INVALID_FORM); + goto err; + } + + if (EC_POINT_is_at_infinity(group, point)) + { + /* encodes to a single 0 octet */ + if (buf != NULL) + { + if (len < 1) + { + ECerr(EC_F_EC_GF2M_SIMPLE_POINT2OCT, EC_R_BUFFER_TOO_SMALL); + return 0; + } + buf[0] = 0; + } + return 1; + } + + + /* ret := required output buffer length */ + field_len = (EC_GROUP_get_degree(group) + 7) / 8; + ret = (form == POINT_CONVERSION_COMPRESSED) ? 1 + field_len : 1 + 2*field_len; + + /* if 'buf' is NULL, just return required length */ + if (buf != NULL) + { + if (len < ret) + { + ECerr(EC_F_EC_GF2M_SIMPLE_POINT2OCT, EC_R_BUFFER_TOO_SMALL); + goto err; + } + + if (ctx == NULL) + { + ctx = new_ctx = BN_CTX_new(); + if (ctx == NULL) + return 0; + } + + BN_CTX_start(ctx); + used_ctx = 1; + x = BN_CTX_get(ctx); + y = BN_CTX_get(ctx); + yxi = BN_CTX_get(ctx); + if (yxi == NULL) goto err; + + if (!EC_POINT_get_affine_coordinates_GF2m(group, point, x, y, ctx)) goto err; + + buf[0] = form; + if ((form != POINT_CONVERSION_UNCOMPRESSED) && !BN_is_zero(x)) + { + if (!group->meth->field_div(group, yxi, y, x, ctx)) goto err; + if (BN_is_odd(yxi)) buf[0]++; + } + + i = 1; + + skip = field_len - BN_num_bytes(x); + if (skip > field_len) + { + ECerr(EC_F_EC_GF2M_SIMPLE_POINT2OCT, ERR_R_INTERNAL_ERROR); + goto err; + } + while (skip > 0) + { + buf[i++] = 0; + skip--; + } + skip = BN_bn2bin(x, buf + i); + i += skip; + if (i != 1 + field_len) + { + ECerr(EC_F_EC_GF2M_SIMPLE_POINT2OCT, ERR_R_INTERNAL_ERROR); + goto err; + } + + if (form == POINT_CONVERSION_UNCOMPRESSED || form == POINT_CONVERSION_HYBRID) + { + skip = field_len - BN_num_bytes(y); + if (skip > field_len) + { + ECerr(EC_F_EC_GF2M_SIMPLE_POINT2OCT, ERR_R_INTERNAL_ERROR); + goto err; + } + while (skip > 0) + { + buf[i++] = 0; + skip--; + } + skip = BN_bn2bin(y, buf + i); + i += skip; + } + + if (i != ret) + { + ECerr(EC_F_EC_GF2M_SIMPLE_POINT2OCT, ERR_R_INTERNAL_ERROR); + goto err; + } + } + + if (used_ctx) + BN_CTX_end(ctx); + if (new_ctx != NULL) + BN_CTX_free(new_ctx); + return ret; + + err: + if (used_ctx) + BN_CTX_end(ctx); + if (new_ctx != NULL) + BN_CTX_free(new_ctx); + return 0; + } + + +/* Converts an octet string representation to an EC_POINT. + * Note that the simple implementation only uses affine coordinates. + */ +int ec_GF2m_simple_oct2point(const EC_GROUP *group, EC_POINT *point, + const unsigned char *buf, size_t len, BN_CTX *ctx) + { + point_conversion_form_t form; + int y_bit; + BN_CTX *new_ctx = NULL; + BIGNUM *x, *y, *yxi; + size_t field_len, enc_len; + int ret = 0; + + if (len == 0) + { + ECerr(EC_F_EC_GF2M_SIMPLE_OCT2POINT, EC_R_BUFFER_TOO_SMALL); + return 0; + } + form = buf[0]; + y_bit = form & 1; + form = form & ~1U; + if ((form != 0) && (form != POINT_CONVERSION_COMPRESSED) + && (form != POINT_CONVERSION_UNCOMPRESSED) + && (form != POINT_CONVERSION_HYBRID)) + { + ECerr(EC_F_EC_GF2M_SIMPLE_OCT2POINT, EC_R_INVALID_ENCODING); + return 0; + } + if ((form == 0 || form == POINT_CONVERSION_UNCOMPRESSED) && y_bit) + { + ECerr(EC_F_EC_GF2M_SIMPLE_OCT2POINT, EC_R_INVALID_ENCODING); + return 0; + } + + if (form == 0) + { + if (len != 1) + { + ECerr(EC_F_EC_GF2M_SIMPLE_OCT2POINT, EC_R_INVALID_ENCODING); + return 0; + } + + return EC_POINT_set_to_infinity(group, point); + } + + field_len = (EC_GROUP_get_degree(group) + 7) / 8; + enc_len = (form == POINT_CONVERSION_COMPRESSED) ? 1 + field_len : 1 + 2*field_len; + + if (len != enc_len) + { + ECerr(EC_F_EC_GF2M_SIMPLE_OCT2POINT, EC_R_INVALID_ENCODING); + return 0; + } + + if (ctx == NULL) + { + ctx = new_ctx = BN_CTX_new(); + if (ctx == NULL) + return 0; + } + + BN_CTX_start(ctx); + x = BN_CTX_get(ctx); + y = BN_CTX_get(ctx); + yxi = BN_CTX_get(ctx); + if (yxi == NULL) goto err; + + if (!BN_bin2bn(buf + 1, field_len, x)) goto err; + if (BN_ucmp(x, &group->field) >= 0) + { + ECerr(EC_F_EC_GF2M_SIMPLE_OCT2POINT, EC_R_INVALID_ENCODING); + goto err; + } + + if (form == POINT_CONVERSION_COMPRESSED) + { + if (!EC_POINT_set_compressed_coordinates_GF2m(group, point, x, y_bit, ctx)) goto err; + } + else + { + if (!BN_bin2bn(buf + 1 + field_len, field_len, y)) goto err; + if (BN_ucmp(y, &group->field) >= 0) + { + ECerr(EC_F_EC_GF2M_SIMPLE_OCT2POINT, EC_R_INVALID_ENCODING); + goto err; + } + if (form == POINT_CONVERSION_HYBRID) + { + if (!group->meth->field_div(group, yxi, y, x, ctx)) goto err; + if (y_bit != BN_is_odd(yxi)) + { + ECerr(EC_F_EC_GF2M_SIMPLE_OCT2POINT, EC_R_INVALID_ENCODING); + goto err; + } + } + + if (!EC_POINT_set_affine_coordinates_GF2m(group, point, x, y, ctx)) goto err; + } + + if (!EC_POINT_is_on_curve(group, point, ctx)) /* test required by X9.62 */ + { + ECerr(EC_F_EC_GF2M_SIMPLE_OCT2POINT, EC_R_POINT_IS_NOT_ON_CURVE); + goto err; + } + + ret = 1; + + err: + BN_CTX_end(ctx); + if (new_ctx != NULL) + BN_CTX_free(new_ctx); + return ret; + } +#endif diff --git a/crypto/openssl/crypto/ec/ec2_smpl.c b/crypto/openssl/crypto/ec/ec2_smpl.c index 03deae6674..e0e59c7d82 100644 --- a/crypto/openssl/crypto/ec/ec2_smpl.c +++ b/crypto/openssl/crypto/ec/ec2_smpl.c @@ -71,10 +71,20 @@ #include "ec_lcl.h" +#ifndef OPENSSL_NO_EC2M + +#ifdef OPENSSL_FIPS +#include +#endif + const EC_METHOD *EC_GF2m_simple_method(void) { +#ifdef OPENSSL_FIPS + return fips_ec_gf2m_simple_method(); +#else static const EC_METHOD ret = { + EC_FLAGS_DEFAULT_OCT, NID_X9_62_characteristic_two_field, ec_GF2m_simple_group_init, ec_GF2m_simple_group_finish, @@ -93,9 +103,7 @@ const EC_METHOD *EC_GF2m_simple_method(void) 0 /* get_Jprojective_coordinates_GFp */, ec_GF2m_simple_point_set_affine_coordinates, ec_GF2m_simple_point_get_affine_coordinates, - ec_GF2m_simple_set_compressed_coordinates, - ec_GF2m_simple_point2oct, - ec_GF2m_simple_oct2point, + 0,0,0, ec_GF2m_simple_add, ec_GF2m_simple_dbl, ec_GF2m_simple_invert, @@ -118,6 +126,7 @@ const EC_METHOD *EC_GF2m_simple_method(void) 0 /* field_set_to_one */ }; return &ret; +#endif } @@ -405,340 +414,6 @@ int ec_GF2m_simple_point_get_affine_coordinates(const EC_GROUP *group, const EC_ return ret; } - -/* Calculates and sets the affine coordinates of an EC_POINT from the given - * compressed coordinates. Uses algorithm 2.3.4 of SEC 1. - * Note that the simple implementation only uses affine coordinates. - * - * The method is from the following publication: - * - * Harper, Menezes, Vanstone: - * "Public-Key Cryptosystems with Very Small Key Lengths", - * EUROCRYPT '92, Springer-Verlag LNCS 658, - * published February 1993 - * - * US Patents 6,141,420 and 6,618,483 (Vanstone, Mullin, Agnew) describe - * the same method, but claim no priority date earlier than July 29, 1994 - * (and additionally fail to cite the EUROCRYPT '92 publication as prior art). - */ -int ec_GF2m_simple_set_compressed_coordinates(const EC_GROUP *group, EC_POINT *point, - const BIGNUM *x_, int y_bit, BN_CTX *ctx) - { - BN_CTX *new_ctx = NULL; - BIGNUM *tmp, *x, *y, *z; - int ret = 0, z0; - - /* clear error queue */ - ERR_clear_error(); - - if (ctx == NULL) - { - ctx = new_ctx = BN_CTX_new(); - if (ctx == NULL) - return 0; - } - - y_bit = (y_bit != 0) ? 1 : 0; - - BN_CTX_start(ctx); - tmp = BN_CTX_get(ctx); - x = BN_CTX_get(ctx); - y = BN_CTX_get(ctx); - z = BN_CTX_get(ctx); - if (z == NULL) goto err; - - if (!BN_GF2m_mod_arr(x, x_, group->poly)) goto err; - if (BN_is_zero(x)) - { - if (!BN_GF2m_mod_sqrt_arr(y, &group->b, group->poly, ctx)) goto err; - } - else - { - if (!group->meth->field_sqr(group, tmp, x, ctx)) goto err; - if (!group->meth->field_div(group, tmp, &group->b, tmp, ctx)) goto err; - if (!BN_GF2m_add(tmp, &group->a, tmp)) goto err; - if (!BN_GF2m_add(tmp, x, tmp)) goto err; - if (!BN_GF2m_mod_solve_quad_arr(z, tmp, group->poly, ctx)) - { - unsigned long err = ERR_peek_last_error(); - - if (ERR_GET_LIB(err) == ERR_LIB_BN && ERR_GET_REASON(err) == BN_R_NO_SOLUTION) - { - ERR_clear_error(); - ECerr(EC_F_EC_GF2M_SIMPLE_SET_COMPRESSED_COORDINATES, EC_R_INVALID_COMPRESSED_POINT); - } - else - ECerr(EC_F_EC_GF2M_SIMPLE_SET_COMPRESSED_COORDINATES, ERR_R_BN_LIB); - goto err; - } - z0 = (BN_is_odd(z)) ? 1 : 0; - if (!group->meth->field_mul(group, y, x, z, ctx)) goto err; - if (z0 != y_bit) - { - if (!BN_GF2m_add(y, y, x)) goto err; - } - } - - if (!EC_POINT_set_affine_coordinates_GF2m(group, point, x, y, ctx)) goto err; - - ret = 1; - - err: - BN_CTX_end(ctx); - if (new_ctx != NULL) - BN_CTX_free(new_ctx); - return ret; - } - - -/* Converts an EC_POINT to an octet string. - * If buf is NULL, the encoded length will be returned. - * If the length len of buf is smaller than required an error will be returned. - */ -size_t ec_GF2m_simple_point2oct(const EC_GROUP *group, const EC_POINT *point, point_conversion_form_t form, - unsigned char *buf, size_t len, BN_CTX *ctx) - { - size_t ret; - BN_CTX *new_ctx = NULL; - int used_ctx = 0; - BIGNUM *x, *y, *yxi; - size_t field_len, i, skip; - - if ((form != POINT_CONVERSION_COMPRESSED) - && (form != POINT_CONVERSION_UNCOMPRESSED) - && (form != POINT_CONVERSION_HYBRID)) - { - ECerr(EC_F_EC_GF2M_SIMPLE_POINT2OCT, EC_R_INVALID_FORM); - goto err; - } - - if (EC_POINT_is_at_infinity(group, point)) - { - /* encodes to a single 0 octet */ - if (buf != NULL) - { - if (len < 1) - { - ECerr(EC_F_EC_GF2M_SIMPLE_POINT2OCT, EC_R_BUFFER_TOO_SMALL); - return 0; - } - buf[0] = 0; - } - return 1; - } - - - /* ret := required output buffer length */ - field_len = (EC_GROUP_get_degree(group) + 7) / 8; - ret = (form == POINT_CONVERSION_COMPRESSED) ? 1 + field_len : 1 + 2*field_len; - - /* if 'buf' is NULL, just return required length */ - if (buf != NULL) - { - if (len < ret) - { - ECerr(EC_F_EC_GF2M_SIMPLE_POINT2OCT, EC_R_BUFFER_TOO_SMALL); - goto err; - } - - if (ctx == NULL) - { - ctx = new_ctx = BN_CTX_new(); - if (ctx == NULL) - return 0; - } - - BN_CTX_start(ctx); - used_ctx = 1; - x = BN_CTX_get(ctx); - y = BN_CTX_get(ctx); - yxi = BN_CTX_get(ctx); - if (yxi == NULL) goto err; - - if (!EC_POINT_get_affine_coordinates_GF2m(group, point, x, y, ctx)) goto err; - - buf[0] = form; - if ((form != POINT_CONVERSION_UNCOMPRESSED) && !BN_is_zero(x)) - { - if (!group->meth->field_div(group, yxi, y, x, ctx)) goto err; - if (BN_is_odd(yxi)) buf[0]++; - } - - i = 1; - - skip = field_len - BN_num_bytes(x); - if (skip > field_len) - { - ECerr(EC_F_EC_GF2M_SIMPLE_POINT2OCT, ERR_R_INTERNAL_ERROR); - goto err; - } - while (skip > 0) - { - buf[i++] = 0; - skip--; - } - skip = BN_bn2bin(x, buf + i); - i += skip; - if (i != 1 + field_len) - { - ECerr(EC_F_EC_GF2M_SIMPLE_POINT2OCT, ERR_R_INTERNAL_ERROR); - goto err; - } - - if (form == POINT_CONVERSION_UNCOMPRESSED || form == POINT_CONVERSION_HYBRID) - { - skip = field_len - BN_num_bytes(y); - if (skip > field_len) - { - ECerr(EC_F_EC_GF2M_SIMPLE_POINT2OCT, ERR_R_INTERNAL_ERROR); - goto err; - } - while (skip > 0) - { - buf[i++] = 0; - skip--; - } - skip = BN_bn2bin(y, buf + i); - i += skip; - } - - if (i != ret) - { - ECerr(EC_F_EC_GF2M_SIMPLE_POINT2OCT, ERR_R_INTERNAL_ERROR); - goto err; - } - } - - if (used_ctx) - BN_CTX_end(ctx); - if (new_ctx != NULL) - BN_CTX_free(new_ctx); - return ret; - - err: - if (used_ctx) - BN_CTX_end(ctx); - if (new_ctx != NULL) - BN_CTX_free(new_ctx); - return 0; - } - - -/* Converts an octet string representation to an EC_POINT. - * Note that the simple implementation only uses affine coordinates. - */ -int ec_GF2m_simple_oct2point(const EC_GROUP *group, EC_POINT *point, - const unsigned char *buf, size_t len, BN_CTX *ctx) - { - point_conversion_form_t form; - int y_bit; - BN_CTX *new_ctx = NULL; - BIGNUM *x, *y, *yxi; - size_t field_len, enc_len; - int ret = 0; - - if (len == 0) - { - ECerr(EC_F_EC_GF2M_SIMPLE_OCT2POINT, EC_R_BUFFER_TOO_SMALL); - return 0; - } - form = buf[0]; - y_bit = form & 1; - form = form & ~1U; - if ((form != 0) && (form != POINT_CONVERSION_COMPRESSED) - && (form != POINT_CONVERSION_UNCOMPRESSED) - && (form != POINT_CONVERSION_HYBRID)) - { - ECerr(EC_F_EC_GF2M_SIMPLE_OCT2POINT, EC_R_INVALID_ENCODING); - return 0; - } - if ((form == 0 || form == POINT_CONVERSION_UNCOMPRESSED) && y_bit) - { - ECerr(EC_F_EC_GF2M_SIMPLE_OCT2POINT, EC_R_INVALID_ENCODING); - return 0; - } - - if (form == 0) - { - if (len != 1) - { - ECerr(EC_F_EC_GF2M_SIMPLE_OCT2POINT, EC_R_INVALID_ENCODING); - return 0; - } - - return EC_POINT_set_to_infinity(group, point); - } - - field_len = (EC_GROUP_get_degree(group) + 7) / 8; - enc_len = (form == POINT_CONVERSION_COMPRESSED) ? 1 + field_len : 1 + 2*field_len; - - if (len != enc_len) - { - ECerr(EC_F_EC_GF2M_SIMPLE_OCT2POINT, EC_R_INVALID_ENCODING); - return 0; - } - - if (ctx == NULL) - { - ctx = new_ctx = BN_CTX_new(); - if (ctx == NULL) - return 0; - } - - BN_CTX_start(ctx); - x = BN_CTX_get(ctx); - y = BN_CTX_get(ctx); - yxi = BN_CTX_get(ctx); - if (yxi == NULL) goto err; - - if (!BN_bin2bn(buf + 1, field_len, x)) goto err; - if (BN_ucmp(x, &group->field) >= 0) - { - ECerr(EC_F_EC_GF2M_SIMPLE_OCT2POINT, EC_R_INVALID_ENCODING); - goto err; - } - - if (form == POINT_CONVERSION_COMPRESSED) - { - if (!EC_POINT_set_compressed_coordinates_GF2m(group, point, x, y_bit, ctx)) goto err; - } - else - { - if (!BN_bin2bn(buf + 1 + field_len, field_len, y)) goto err; - if (BN_ucmp(y, &group->field) >= 0) - { - ECerr(EC_F_EC_GF2M_SIMPLE_OCT2POINT, EC_R_INVALID_ENCODING); - goto err; - } - if (form == POINT_CONVERSION_HYBRID) - { - if (!group->meth->field_div(group, yxi, y, x, ctx)) goto err; - if (y_bit != BN_is_odd(yxi)) - { - ECerr(EC_F_EC_GF2M_SIMPLE_OCT2POINT, EC_R_INVALID_ENCODING); - goto err; - } - } - - if (!EC_POINT_set_affine_coordinates_GF2m(group, point, x, y, ctx)) goto err; - } - - if (!EC_POINT_is_on_curve(group, point, ctx)) /* test required by X9.62 */ - { - ECerr(EC_F_EC_GF2M_SIMPLE_OCT2POINT, EC_R_POINT_IS_NOT_ON_CURVE); - goto err; - } - - ret = 1; - - err: - BN_CTX_end(ctx); - if (new_ctx != NULL) - BN_CTX_free(new_ctx); - return ret; - } - - /* Computes a + b and stores the result in r. r could be a or b, a could be b. * Uses algorithm A.10.2 of IEEE P1363. */ @@ -1040,3 +715,5 @@ int ec_GF2m_simple_field_div(const EC_GROUP *group, BIGNUM *r, const BIGNUM *a, { return BN_GF2m_mod_div(r, a, b, &group->field, ctx); } + +#endif diff --git a/crypto/openssl/crypto/ec/ec_ameth.c b/crypto/openssl/crypto/ec/ec_ameth.c index c00f7d746c..83909c1853 100644 --- a/crypto/openssl/crypto/ec/ec_ameth.c +++ b/crypto/openssl/crypto/ec/ec_ameth.c @@ -651,6 +651,7 @@ const EVP_PKEY_ASN1_METHOD eckey_asn1_meth = ec_copy_parameters, ec_cmp_parameters, eckey_param_print, + 0, int_ec_free, ec_pkey_ctrl, diff --git a/crypto/openssl/crypto/ec/ec_asn1.c b/crypto/openssl/crypto/ec/ec_asn1.c index ae55539859..175eec5342 100644 --- a/crypto/openssl/crypto/ec/ec_asn1.c +++ b/crypto/openssl/crypto/ec/ec_asn1.c @@ -83,7 +83,7 @@ int EC_GROUP_get_basis_type(const EC_GROUP *group) /* everything else is currently not supported */ return 0; } - +#ifndef OPENSSL_NO_EC2M int EC_GROUP_get_trinomial_basis(const EC_GROUP *group, unsigned int *k) { if (group == NULL) @@ -101,7 +101,6 @@ int EC_GROUP_get_trinomial_basis(const EC_GROUP *group, unsigned int *k) return 1; } - int EC_GROUP_get_pentanomial_basis(const EC_GROUP *group, unsigned int *k1, unsigned int *k2, unsigned int *k3) { @@ -124,7 +123,7 @@ int EC_GROUP_get_pentanomial_basis(const EC_GROUP *group, unsigned int *k1, return 1; } - +#endif /* some structures needed for the asn1 encoding */ @@ -340,6 +339,12 @@ static int ec_asn1_group2fieldid(const EC_GROUP *group, X9_62_FIELDID *field) } } else /* nid == NID_X9_62_characteristic_two_field */ +#ifdef OPENSSL_NO_EC2M + { + ECerr(EC_F_EC_ASN1_GROUP2FIELDID, EC_R_GF2M_NOT_SUPPORTED); + goto err; + } +#else { int field_type; X9_62_CHARACTERISTIC_TWO *char_two; @@ -419,6 +424,7 @@ static int ec_asn1_group2fieldid(const EC_GROUP *group, X9_62_FIELDID *field) } } } +#endif ok = 1; @@ -456,6 +462,7 @@ static int ec_asn1_group2curve(const EC_GROUP *group, X9_62_CURVE *curve) goto err; } } +#ifndef OPENSSL_NO_EC2M else /* nid == NID_X9_62_characteristic_two_field */ { if (!EC_GROUP_get_curve_GF2m(group, NULL, tmp_1, tmp_2, NULL)) @@ -464,7 +471,7 @@ static int ec_asn1_group2curve(const EC_GROUP *group, X9_62_CURVE *curve) goto err; } } - +#endif len_1 = (size_t)BN_num_bytes(tmp_1); len_2 = (size_t)BN_num_bytes(tmp_2); @@ -775,8 +782,13 @@ static EC_GROUP *ec_asn1_parameters2group(const ECPARAMETERS *params) /* get the field parameters */ tmp = OBJ_obj2nid(params->fieldID->fieldType); - if (tmp == NID_X9_62_characteristic_two_field) +#ifdef OPENSSL_NO_EC2M + { + ECerr(EC_F_EC_ASN1_PARAMETERS2GROUP, EC_R_GF2M_NOT_SUPPORTED); + goto err; + } +#else { X9_62_CHARACTERISTIC_TWO *char_two; @@ -862,6 +874,7 @@ static EC_GROUP *ec_asn1_parameters2group(const ECPARAMETERS *params) /* create the EC_GROUP structure */ ret = EC_GROUP_new_curve_GF2m(p, a, b, NULL); } +#endif else if (tmp == NID_X9_62_prime_field) { /* we have a curve over a prime field */ @@ -1065,6 +1078,7 @@ EC_GROUP *d2i_ECPKParameters(EC_GROUP **a, const unsigned char **in, long len) if ((group = ec_asn1_pkparameters2group(params)) == NULL) { ECerr(EC_F_D2I_ECPKPARAMETERS, EC_R_PKPARAMETERS2GROUP_FAILURE); + ECPKPARAMETERS_free(params); return NULL; } diff --git a/crypto/openssl/crypto/ec/ec_curve.c b/crypto/openssl/crypto/ec/ec_curve.c index 23274e4031..c72fb2697c 100644 --- a/crypto/openssl/crypto/ec/ec_curve.c +++ b/crypto/openssl/crypto/ec/ec_curve.c @@ -3,7 +3,7 @@ * Written by Nils Larsch for the OpenSSL project. */ /* ==================================================================== - * Copyright (c) 1998-2004 The OpenSSL Project. All rights reserved. + * Copyright (c) 1998-2010 The OpenSSL Project. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions @@ -72,6 +72,7 @@ #include "ec_lcl.h" #include #include +#include typedef struct { int field_type, /* either NID_X9_62_prime_field or @@ -703,6 +704,8 @@ static const struct { EC_CURVE_DATA h; unsigned char data[0+28*6]; } 0x13,0xDD,0x29,0x45,0x5C,0x5C,0x2A,0x3D } }; +#ifndef OPENSSL_NO_EC2M + /* characteristic two curves */ static const struct { EC_CURVE_DATA h; unsigned char data[20+15*6]; } _EC_SECG_CHAR2_113R1 = { @@ -1300,7 +1303,7 @@ static const struct { EC_CURVE_DATA h; unsigned char data[20+21*6]; } { 0x53,0x81,0x4C,0x05,0x0D,0x44,0xD6,0x96,0xE6,0x76, /* seed */ 0x87,0x56,0x15,0x17,0x58,0x0C,0xA4,0xE2,0x9F,0xFD, - 0x08,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00, /* p */ + 0x08,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00, /* p */ 0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x01, 0x07, 0x01,0x08,0xB3,0x9E,0x77,0xC4,0xB1,0x08,0xBE,0xD9, /* a */ @@ -1817,103 +1820,128 @@ static const struct { EC_CURVE_DATA h; unsigned char data[0+24*6]; } 0xBA,0xFC,0xA7,0x5E } }; +#endif + typedef struct _ec_list_element_st { int nid; const EC_CURVE_DATA *data; + const EC_METHOD *(*meth)(void); const char *comment; } ec_list_element; static const ec_list_element curve_list[] = { - /* prime field curves */ + /* prime field curves */ /* secg curves */ - { NID_secp112r1, &_EC_SECG_PRIME_112R1.h, "SECG/WTLS curve over a 112 bit prime field"}, - { NID_secp112r2, &_EC_SECG_PRIME_112R2.h, "SECG curve over a 112 bit prime field"}, - { NID_secp128r1, &_EC_SECG_PRIME_128R1.h, "SECG curve over a 128 bit prime field"}, - { NID_secp128r2, &_EC_SECG_PRIME_128R2.h, "SECG curve over a 128 bit prime field"}, - { NID_secp160k1, &_EC_SECG_PRIME_160K1.h, "SECG curve over a 160 bit prime field"}, - { NID_secp160r1, &_EC_SECG_PRIME_160R1.h, "SECG curve over a 160 bit prime field"}, - { NID_secp160r2, &_EC_SECG_PRIME_160R2.h, "SECG/WTLS curve over a 160 bit prime field"}, + { NID_secp112r1, &_EC_SECG_PRIME_112R1.h, 0, "SECG/WTLS curve over a 112 bit prime field" }, + { NID_secp112r2, &_EC_SECG_PRIME_112R2.h, 0, "SECG curve over a 112 bit prime field" }, + { NID_secp128r1, &_EC_SECG_PRIME_128R1.h, 0, "SECG curve over a 128 bit prime field" }, + { NID_secp128r2, &_EC_SECG_PRIME_128R2.h, 0, "SECG curve over a 128 bit prime field" }, + { NID_secp160k1, &_EC_SECG_PRIME_160K1.h, 0, "SECG curve over a 160 bit prime field" }, + { NID_secp160r1, &_EC_SECG_PRIME_160R1.h, 0, "SECG curve over a 160 bit prime field" }, + { NID_secp160r2, &_EC_SECG_PRIME_160R2.h, 0, "SECG/WTLS curve over a 160 bit prime field" }, /* SECG secp192r1 is the same as X9.62 prime192v1 and hence omitted */ - { NID_secp192k1, &_EC_SECG_PRIME_192K1.h, "SECG curve over a 192 bit prime field"}, - { NID_secp224k1, &_EC_SECG_PRIME_224K1.h, "SECG curve over a 224 bit prime field"}, - { NID_secp224r1, &_EC_NIST_PRIME_224.h, "NIST/SECG curve over a 224 bit prime field"}, - { NID_secp256k1, &_EC_SECG_PRIME_256K1.h, "SECG curve over a 256 bit prime field"}, + { NID_secp192k1, &_EC_SECG_PRIME_192K1.h, 0, "SECG curve over a 192 bit prime field" }, + { NID_secp224k1, &_EC_SECG_PRIME_224K1.h, 0, "SECG curve over a 224 bit prime field" }, +#ifndef OPENSSL_NO_EC_NISTP_64_GCC_128 + { NID_secp224r1, &_EC_NIST_PRIME_224.h, EC_GFp_nistp224_method, "NIST/SECG curve over a 224 bit prime field" }, +#else + { NID_secp224r1, &_EC_NIST_PRIME_224.h, 0, "NIST/SECG curve over a 224 bit prime field" }, +#endif + { NID_secp256k1, &_EC_SECG_PRIME_256K1.h, 0, "SECG curve over a 256 bit prime field" }, /* SECG secp256r1 is the same as X9.62 prime256v1 and hence omitted */ - { NID_secp384r1, &_EC_NIST_PRIME_384.h, "NIST/SECG curve over a 384 bit prime field"}, - { NID_secp521r1, &_EC_NIST_PRIME_521.h, "NIST/SECG curve over a 521 bit prime field"}, + { NID_secp384r1, &_EC_NIST_PRIME_384.h, 0, "NIST/SECG curve over a 384 bit prime field" }, +#ifndef OPENSSL_NO_EC_NISTP_64_GCC_128 + { NID_secp521r1, &_EC_NIST_PRIME_521.h, EC_GFp_nistp521_method, "NIST/SECG curve over a 521 bit prime field" }, +#else + { NID_secp521r1, &_EC_NIST_PRIME_521.h, 0, "NIST/SECG curve over a 521 bit prime field" }, +#endif /* X9.62 curves */ - { NID_X9_62_prime192v1, &_EC_NIST_PRIME_192.h, "NIST/X9.62/SECG curve over a 192 bit prime field"}, - { NID_X9_62_prime192v2, &_EC_X9_62_PRIME_192V2.h, "X9.62 curve over a 192 bit prime field"}, - { NID_X9_62_prime192v3, &_EC_X9_62_PRIME_192V3.h, "X9.62 curve over a 192 bit prime field"}, - { NID_X9_62_prime239v1, &_EC_X9_62_PRIME_239V1.h, "X9.62 curve over a 239 bit prime field"}, - { NID_X9_62_prime239v2, &_EC_X9_62_PRIME_239V2.h, "X9.62 curve over a 239 bit prime field"}, - { NID_X9_62_prime239v3, &_EC_X9_62_PRIME_239V3.h, "X9.62 curve over a 239 bit prime field"}, - { NID_X9_62_prime256v1, &_EC_X9_62_PRIME_256V1.h, "X9.62/SECG curve over a 256 bit prime field"}, + { NID_X9_62_prime192v1, &_EC_NIST_PRIME_192.h, 0, "NIST/X9.62/SECG curve over a 192 bit prime field" }, + { NID_X9_62_prime192v2, &_EC_X9_62_PRIME_192V2.h, 0, "X9.62 curve over a 192 bit prime field" }, + { NID_X9_62_prime192v3, &_EC_X9_62_PRIME_192V3.h, 0, "X9.62 curve over a 192 bit prime field" }, + { NID_X9_62_prime239v1, &_EC_X9_62_PRIME_239V1.h, 0, "X9.62 curve over a 239 bit prime field" }, + { NID_X9_62_prime239v2, &_EC_X9_62_PRIME_239V2.h, 0, "X9.62 curve over a 239 bit prime field" }, + { NID_X9_62_prime239v3, &_EC_X9_62_PRIME_239V3.h, 0, "X9.62 curve over a 239 bit prime field" }, +#ifndef OPENSSL_NO_EC_NISTP_64_GCC_128 + { NID_X9_62_prime256v1, &_EC_X9_62_PRIME_256V1.h, EC_GFp_nistp256_method, "X9.62/SECG curve over a 256 bit prime field" }, +#else + { NID_X9_62_prime256v1, &_EC_X9_62_PRIME_256V1.h, 0, "X9.62/SECG curve over a 256 bit prime field" }, +#endif +#ifndef OPENSSL_NO_EC2M /* characteristic two field curves */ /* NIST/SECG curves */ - { NID_sect113r1, &_EC_SECG_CHAR2_113R1.h, "SECG curve over a 113 bit binary field"}, - { NID_sect113r2, &_EC_SECG_CHAR2_113R2.h, "SECG curve over a 113 bit binary field"}, - { NID_sect131r1, &_EC_SECG_CHAR2_131R1.h, "SECG/WTLS curve over a 131 bit binary field"}, - { NID_sect131r2, &_EC_SECG_CHAR2_131R2.h, "SECG curve over a 131 bit binary field"}, - { NID_sect163k1, &_EC_NIST_CHAR2_163K.h, "NIST/SECG/WTLS curve over a 163 bit binary field" }, - { NID_sect163r1, &_EC_SECG_CHAR2_163R1.h, "SECG curve over a 163 bit binary field"}, - { NID_sect163r2, &_EC_NIST_CHAR2_163B.h, "NIST/SECG curve over a 163 bit binary field" }, - { NID_sect193r1, &_EC_SECG_CHAR2_193R1.h, "SECG curve over a 193 bit binary field"}, - { NID_sect193r2, &_EC_SECG_CHAR2_193R2.h, "SECG curve over a 193 bit binary field"}, - { NID_sect233k1, &_EC_NIST_CHAR2_233K.h, "NIST/SECG/WTLS curve over a 233 bit binary field" }, - { NID_sect233r1, &_EC_NIST_CHAR2_233B.h, "NIST/SECG/WTLS curve over a 233 bit binary field" }, - { NID_sect239k1, &_EC_SECG_CHAR2_239K1.h, "SECG curve over a 239 bit binary field"}, - { NID_sect283k1, &_EC_NIST_CHAR2_283K.h, "NIST/SECG curve over a 283 bit binary field" }, - { NID_sect283r1, &_EC_NIST_CHAR2_283B.h, "NIST/SECG curve over a 283 bit binary field" }, - { NID_sect409k1, &_EC_NIST_CHAR2_409K.h, "NIST/SECG curve over a 409 bit binary field" }, - { NID_sect409r1, &_EC_NIST_CHAR2_409B.h, "NIST/SECG curve over a 409 bit binary field" }, - { NID_sect571k1, &_EC_NIST_CHAR2_571K.h, "NIST/SECG curve over a 571 bit binary field" }, - { NID_sect571r1, &_EC_NIST_CHAR2_571B.h, "NIST/SECG curve over a 571 bit binary field" }, + { NID_sect113r1, &_EC_SECG_CHAR2_113R1.h, 0, "SECG curve over a 113 bit binary field" }, + { NID_sect113r2, &_EC_SECG_CHAR2_113R2.h, 0, "SECG curve over a 113 bit binary field" }, + { NID_sect131r1, &_EC_SECG_CHAR2_131R1.h, 0, "SECG/WTLS curve over a 131 bit binary field" }, + { NID_sect131r2, &_EC_SECG_CHAR2_131R2.h, 0, "SECG curve over a 131 bit binary field" }, + { NID_sect163k1, &_EC_NIST_CHAR2_163K.h, 0, "NIST/SECG/WTLS curve over a 163 bit binary field" }, + { NID_sect163r1, &_EC_SECG_CHAR2_163R1.h, 0, "SECG curve over a 163 bit binary field" }, + { NID_sect163r2, &_EC_NIST_CHAR2_163B.h, 0, "NIST/SECG curve over a 163 bit binary field" }, + { NID_sect193r1, &_EC_SECG_CHAR2_193R1.h, 0, "SECG curve over a 193 bit binary field" }, + { NID_sect193r2, &_EC_SECG_CHAR2_193R2.h, 0, "SECG curve over a 193 bit binary field" }, + { NID_sect233k1, &_EC_NIST_CHAR2_233K.h, 0, "NIST/SECG/WTLS curve over a 233 bit binary field" }, + { NID_sect233r1, &_EC_NIST_CHAR2_233B.h, 0, "NIST/SECG/WTLS curve over a 233 bit binary field" }, + { NID_sect239k1, &_EC_SECG_CHAR2_239K1.h, 0, "SECG curve over a 239 bit binary field" }, + { NID_sect283k1, &_EC_NIST_CHAR2_283K.h, 0, "NIST/SECG curve over a 283 bit binary field" }, + { NID_sect283r1, &_EC_NIST_CHAR2_283B.h, 0, "NIST/SECG curve over a 283 bit binary field" }, + { NID_sect409k1, &_EC_NIST_CHAR2_409K.h, 0, "NIST/SECG curve over a 409 bit binary field" }, + { NID_sect409r1, &_EC_NIST_CHAR2_409B.h, 0, "NIST/SECG curve over a 409 bit binary field" }, + { NID_sect571k1, &_EC_NIST_CHAR2_571K.h, 0, "NIST/SECG curve over a 571 bit binary field" }, + { NID_sect571r1, &_EC_NIST_CHAR2_571B.h, 0, "NIST/SECG curve over a 571 bit binary field" }, /* X9.62 curves */ - { NID_X9_62_c2pnb163v1, &_EC_X9_62_CHAR2_163V1.h, "X9.62 curve over a 163 bit binary field"}, - { NID_X9_62_c2pnb163v2, &_EC_X9_62_CHAR2_163V2.h, "X9.62 curve over a 163 bit binary field"}, - { NID_X9_62_c2pnb163v3, &_EC_X9_62_CHAR2_163V3.h, "X9.62 curve over a 163 bit binary field"}, - { NID_X9_62_c2pnb176v1, &_EC_X9_62_CHAR2_176V1.h, "X9.62 curve over a 176 bit binary field"}, - { NID_X9_62_c2tnb191v1, &_EC_X9_62_CHAR2_191V1.h, "X9.62 curve over a 191 bit binary field"}, - { NID_X9_62_c2tnb191v2, &_EC_X9_62_CHAR2_191V2.h, "X9.62 curve over a 191 bit binary field"}, - { NID_X9_62_c2tnb191v3, &_EC_X9_62_CHAR2_191V3.h, "X9.62 curve over a 191 bit binary field"}, - { NID_X9_62_c2pnb208w1, &_EC_X9_62_CHAR2_208W1.h, "X9.62 curve over a 208 bit binary field"}, - { NID_X9_62_c2tnb239v1, &_EC_X9_62_CHAR2_239V1.h, "X9.62 curve over a 239 bit binary field"}, - { NID_X9_62_c2tnb239v2, &_EC_X9_62_CHAR2_239V2.h, "X9.62 curve over a 239 bit binary field"}, - { NID_X9_62_c2tnb239v3, &_EC_X9_62_CHAR2_239V3.h, "X9.62 curve over a 239 bit binary field"}, - { NID_X9_62_c2pnb272w1, &_EC_X9_62_CHAR2_272W1.h, "X9.62 curve over a 272 bit binary field"}, - { NID_X9_62_c2pnb304w1, &_EC_X9_62_CHAR2_304W1.h, "X9.62 curve over a 304 bit binary field"}, - { NID_X9_62_c2tnb359v1, &_EC_X9_62_CHAR2_359V1.h, "X9.62 curve over a 359 bit binary field"}, - { NID_X9_62_c2pnb368w1, &_EC_X9_62_CHAR2_368W1.h, "X9.62 curve over a 368 bit binary field"}, - { NID_X9_62_c2tnb431r1, &_EC_X9_62_CHAR2_431R1.h, "X9.62 curve over a 431 bit binary field"}, + { NID_X9_62_c2pnb163v1, &_EC_X9_62_CHAR2_163V1.h, 0, "X9.62 curve over a 163 bit binary field" }, + { NID_X9_62_c2pnb163v2, &_EC_X9_62_CHAR2_163V2.h, 0, "X9.62 curve over a 163 bit binary field" }, + { NID_X9_62_c2pnb163v3, &_EC_X9_62_CHAR2_163V3.h, 0, "X9.62 curve over a 163 bit binary field" }, + { NID_X9_62_c2pnb176v1, &_EC_X9_62_CHAR2_176V1.h, 0, "X9.62 curve over a 176 bit binary field" }, + { NID_X9_62_c2tnb191v1, &_EC_X9_62_CHAR2_191V1.h, 0, "X9.62 curve over a 191 bit binary field" }, + { NID_X9_62_c2tnb191v2, &_EC_X9_62_CHAR2_191V2.h, 0, "X9.62 curve over a 191 bit binary field" }, + { NID_X9_62_c2tnb191v3, &_EC_X9_62_CHAR2_191V3.h, 0, "X9.62 curve over a 191 bit binary field" }, + { NID_X9_62_c2pnb208w1, &_EC_X9_62_CHAR2_208W1.h, 0, "X9.62 curve over a 208 bit binary field" }, + { NID_X9_62_c2tnb239v1, &_EC_X9_62_CHAR2_239V1.h, 0, "X9.62 curve over a 239 bit binary field" }, + { NID_X9_62_c2tnb239v2, &_EC_X9_62_CHAR2_239V2.h, 0, "X9.62 curve over a 239 bit binary field" }, + { NID_X9_62_c2tnb239v3, &_EC_X9_62_CHAR2_239V3.h, 0, "X9.62 curve over a 239 bit binary field" }, + { NID_X9_62_c2pnb272w1, &_EC_X9_62_CHAR2_272W1.h, 0, "X9.62 curve over a 272 bit binary field" }, + { NID_X9_62_c2pnb304w1, &_EC_X9_62_CHAR2_304W1.h, 0, "X9.62 curve over a 304 bit binary field" }, + { NID_X9_62_c2tnb359v1, &_EC_X9_62_CHAR2_359V1.h, 0, "X9.62 curve over a 359 bit binary field" }, + { NID_X9_62_c2pnb368w1, &_EC_X9_62_CHAR2_368W1.h, 0, "X9.62 curve over a 368 bit binary field" }, + { NID_X9_62_c2tnb431r1, &_EC_X9_62_CHAR2_431R1.h, 0, "X9.62 curve over a 431 bit binary field" }, /* the WAP/WTLS curves * [unlike SECG, spec has its own OIDs for curves from X9.62] */ - { NID_wap_wsg_idm_ecid_wtls1, &_EC_WTLS_1.h, "WTLS curve over a 113 bit binary field"}, - { NID_wap_wsg_idm_ecid_wtls3, &_EC_NIST_CHAR2_163K.h, "NIST/SECG/WTLS curve over a 163 bit binary field"}, - { NID_wap_wsg_idm_ecid_wtls4, &_EC_SECG_CHAR2_113R1.h, "SECG curve over a 113 bit binary field"}, - { NID_wap_wsg_idm_ecid_wtls5, &_EC_X9_62_CHAR2_163V1.h, "X9.62 curve over a 163 bit binary field"}, - { NID_wap_wsg_idm_ecid_wtls6, &_EC_SECG_PRIME_112R1.h, "SECG/WTLS curve over a 112 bit prime field"}, - { NID_wap_wsg_idm_ecid_wtls7, &_EC_SECG_PRIME_160R2.h, "SECG/WTLS curve over a 160 bit prime field"}, - { NID_wap_wsg_idm_ecid_wtls8, &_EC_WTLS_8.h, "WTLS curve over a 112 bit prime field"}, - { NID_wap_wsg_idm_ecid_wtls9, &_EC_WTLS_9.h, "WTLS curve over a 160 bit prime field" }, - { NID_wap_wsg_idm_ecid_wtls10, &_EC_NIST_CHAR2_233K.h, "NIST/SECG/WTLS curve over a 233 bit binary field"}, - { NID_wap_wsg_idm_ecid_wtls11, &_EC_NIST_CHAR2_233B.h, "NIST/SECG/WTLS curve over a 233 bit binary field"}, - { NID_wap_wsg_idm_ecid_wtls12, &_EC_WTLS_12.h, "WTLS curvs over a 224 bit prime field"}, + { NID_wap_wsg_idm_ecid_wtls1, &_EC_WTLS_1.h, 0, "WTLS curve over a 113 bit binary field" }, + { NID_wap_wsg_idm_ecid_wtls3, &_EC_NIST_CHAR2_163K.h, 0, "NIST/SECG/WTLS curve over a 163 bit binary field" }, + { NID_wap_wsg_idm_ecid_wtls4, &_EC_SECG_CHAR2_113R1.h, 0, "SECG curve over a 113 bit binary field" }, + { NID_wap_wsg_idm_ecid_wtls5, &_EC_X9_62_CHAR2_163V1.h, 0, "X9.62 curve over a 163 bit binary field" }, +#endif + { NID_wap_wsg_idm_ecid_wtls6, &_EC_SECG_PRIME_112R1.h, 0, "SECG/WTLS curve over a 112 bit prime field" }, + { NID_wap_wsg_idm_ecid_wtls7, &_EC_SECG_PRIME_160R2.h, 0, "SECG/WTLS curve over a 160 bit prime field" }, + { NID_wap_wsg_idm_ecid_wtls8, &_EC_WTLS_8.h, 0, "WTLS curve over a 112 bit prime field" }, + { NID_wap_wsg_idm_ecid_wtls9, &_EC_WTLS_9.h, 0, "WTLS curve over a 160 bit prime field" }, +#ifndef OPENSSL_NO_EC2M + { NID_wap_wsg_idm_ecid_wtls10, &_EC_NIST_CHAR2_233K.h, 0, "NIST/SECG/WTLS curve over a 233 bit binary field" }, + { NID_wap_wsg_idm_ecid_wtls11, &_EC_NIST_CHAR2_233B.h, 0, "NIST/SECG/WTLS curve over a 233 bit binary field" }, +#endif + { NID_wap_wsg_idm_ecid_wtls12, &_EC_WTLS_12.h, 0, "WTLS curvs over a 224 bit prime field" }, +#ifndef OPENSSL_NO_EC2M /* IPSec curves */ - { NID_ipsec3, &_EC_IPSEC_155_ID3.h, "\n\tIPSec/IKE/Oakley curve #3 over a 155 bit binary field.\n""\tNot suitable for ECDSA.\n\tQuestionable extension field!"}, - { NID_ipsec4, &_EC_IPSEC_185_ID4.h, "\n\tIPSec/IKE/Oakley curve #4 over a 185 bit binary field.\n""\tNot suitable for ECDSA.\n\tQuestionable extension field!"}, + { NID_ipsec3, &_EC_IPSEC_155_ID3.h, 0, "\n\tIPSec/IKE/Oakley curve #3 over a 155 bit binary field.\n" + "\tNot suitable for ECDSA.\n\tQuestionable extension field!" }, + { NID_ipsec4, &_EC_IPSEC_185_ID4.h, 0, "\n\tIPSec/IKE/Oakley curve #4 over a 185 bit binary field.\n" + "\tNot suitable for ECDSA.\n\tQuestionable extension field!" }, +#endif }; #define curve_list_length (sizeof(curve_list)/sizeof(ec_list_element)) -static EC_GROUP *ec_group_new_from_data(const EC_CURVE_DATA *data) +static EC_GROUP *ec_group_new_from_data(const ec_list_element curve) { EC_GROUP *group=NULL; EC_POINT *P=NULL; BN_CTX *ctx=NULL; - BIGNUM *p=NULL, *a=NULL, *b=NULL, *x=NULL, *y=NULL, *order=NULL; + BIGNUM *p=NULL, *a=NULL, *b=NULL, *x=NULL, *y=NULL, *order=NULL; int ok=0; int seed_len,param_len; + const EC_METHOD *meth; + const EC_CURVE_DATA *data; const unsigned char *params; if ((ctx = BN_CTX_new()) == NULL) @@ -1922,10 +1950,11 @@ static EC_GROUP *ec_group_new_from_data(const EC_CURVE_DATA *data) goto err; } + data = curve.data; seed_len = data->seed_len; param_len = data->param_len; - params = (const unsigned char *)(data+1); /* skip header */ - params += seed_len; /* skip seed */ + params = (const unsigned char *)(data+1); /* skip header */ + params += seed_len; /* skip seed */ if (!(p = BN_bin2bn(params+0*param_len, param_len, NULL)) || !(a = BN_bin2bn(params+1*param_len, param_len, NULL)) @@ -1935,7 +1964,17 @@ static EC_GROUP *ec_group_new_from_data(const EC_CURVE_DATA *data) goto err; } - if (data->field_type == NID_X9_62_prime_field) + if (curve.meth != 0) + { + meth = curve.meth(); + if (((group = EC_GROUP_new(meth)) == NULL) || + (!(group->meth->group_set_curve(group, p, a, b, ctx)))) + { + ECerr(EC_F_EC_GROUP_NEW_FROM_DATA, ERR_R_EC_LIB); + goto err; + } + } + else if (data->field_type == NID_X9_62_prime_field) { if ((group = EC_GROUP_new_curve_GFp(p, a, b, ctx)) == NULL) { @@ -1943,6 +1982,7 @@ static EC_GROUP *ec_group_new_from_data(const EC_CURVE_DATA *data) goto err; } } +#ifndef OPENSSL_NO_EC2M else /* field_type == NID_X9_62_characteristic_two_field */ { if ((group = EC_GROUP_new_curve_GF2m(p, a, b, ctx)) == NULL) @@ -1951,20 +1991,21 @@ static EC_GROUP *ec_group_new_from_data(const EC_CURVE_DATA *data) goto err; } } +#endif if ((P = EC_POINT_new(group)) == NULL) { ECerr(EC_F_EC_GROUP_NEW_FROM_DATA, ERR_R_EC_LIB); goto err; } - + if (!(x = BN_bin2bn(params+3*param_len, param_len, NULL)) || !(y = BN_bin2bn(params+4*param_len, param_len, NULL))) { ECerr(EC_F_EC_GROUP_NEW_FROM_DATA, ERR_R_BN_LIB); goto err; } - if (!EC_POINT_set_affine_coordinates_GF2m(group, P, x, y, ctx)) + if (!EC_POINT_set_affine_coordinates_GFp(group, P, x, y, ctx)) { ECerr(EC_F_EC_GROUP_NEW_FROM_DATA, ERR_R_EC_LIB); goto err; @@ -2025,7 +2066,7 @@ EC_GROUP *EC_GROUP_new_by_curve_name(int nid) for (i=0; i + */ + meth = EC_GFp_mont_method(); +#else meth = EC_GFp_nist_method(); +#endif ret = EC_GROUP_new(meth); if (ret == NULL) @@ -122,7 +147,7 @@ EC_GROUP *EC_GROUP_new_curve_GFp(const BIGNUM *p, const BIGNUM *a, const BIGNUM return ret; } - +#ifndef OPENSSL_NO_EC2M EC_GROUP *EC_GROUP_new_curve_GF2m(const BIGNUM *p, const BIGNUM *a, const BIGNUM *b, BN_CTX *ctx) { const EC_METHOD *meth; @@ -142,3 +167,4 @@ EC_GROUP *EC_GROUP_new_curve_GF2m(const BIGNUM *p, const BIGNUM *a, const BIGNUM return ret; } +#endif diff --git a/crypto/openssl/crypto/ec/ec_err.c b/crypto/openssl/crypto/ec/ec_err.c index 84b4833371..0d19398731 100644 --- a/crypto/openssl/crypto/ec/ec_err.c +++ b/crypto/openssl/crypto/ec/ec_err.c @@ -1,6 +1,6 @@ /* crypto/ec/ec_err.c */ /* ==================================================================== - * Copyright (c) 1999-2007 The OpenSSL Project. All rights reserved. + * Copyright (c) 1999-2011 The OpenSSL Project. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions @@ -70,6 +70,7 @@ static ERR_STRING_DATA EC_str_functs[]= { +{ERR_FUNC(EC_F_BN_TO_FELEM), "BN_TO_FELEM"}, {ERR_FUNC(EC_F_COMPUTE_WNAF), "COMPUTE_WNAF"}, {ERR_FUNC(EC_F_D2I_ECPARAMETERS), "d2i_ECParameters"}, {ERR_FUNC(EC_F_D2I_ECPKPARAMETERS), "d2i_ECPKParameters"}, @@ -112,6 +113,15 @@ static ERR_STRING_DATA EC_str_functs[]= {ERR_FUNC(EC_F_EC_GFP_MONT_FIELD_SQR), "ec_GFp_mont_field_sqr"}, {ERR_FUNC(EC_F_EC_GFP_MONT_GROUP_SET_CURVE), "ec_GFp_mont_group_set_curve"}, {ERR_FUNC(EC_F_EC_GFP_MONT_GROUP_SET_CURVE_GFP), "EC_GFP_MONT_GROUP_SET_CURVE_GFP"}, +{ERR_FUNC(EC_F_EC_GFP_NISTP224_GROUP_SET_CURVE), "ec_GFp_nistp224_group_set_curve"}, +{ERR_FUNC(EC_F_EC_GFP_NISTP224_POINTS_MUL), "ec_GFp_nistp224_points_mul"}, +{ERR_FUNC(EC_F_EC_GFP_NISTP224_POINT_GET_AFFINE_COORDINATES), "ec_GFp_nistp224_point_get_affine_coordinates"}, +{ERR_FUNC(EC_F_EC_GFP_NISTP256_GROUP_SET_CURVE), "ec_GFp_nistp256_group_set_curve"}, +{ERR_FUNC(EC_F_EC_GFP_NISTP256_POINTS_MUL), "ec_GFp_nistp256_points_mul"}, +{ERR_FUNC(EC_F_EC_GFP_NISTP256_POINT_GET_AFFINE_COORDINATES), "ec_GFp_nistp256_point_get_affine_coordinates"}, +{ERR_FUNC(EC_F_EC_GFP_NISTP521_GROUP_SET_CURVE), "ec_GFp_nistp521_group_set_curve"}, +{ERR_FUNC(EC_F_EC_GFP_NISTP521_POINTS_MUL), "ec_GFp_nistp521_points_mul"}, +{ERR_FUNC(EC_F_EC_GFP_NISTP521_POINT_GET_AFFINE_COORDINATES), "ec_GFp_nistp521_point_get_affine_coordinates"}, {ERR_FUNC(EC_F_EC_GFP_NIST_FIELD_MUL), "ec_GFp_nist_field_mul"}, {ERR_FUNC(EC_F_EC_GFP_NIST_FIELD_SQR), "ec_GFp_nist_field_sqr"}, {ERR_FUNC(EC_F_EC_GFP_NIST_GROUP_SET_CURVE), "ec_GFp_nist_group_set_curve"}, @@ -154,6 +164,7 @@ static ERR_STRING_DATA EC_str_functs[]= {ERR_FUNC(EC_F_EC_KEY_NEW), "EC_KEY_new"}, {ERR_FUNC(EC_F_EC_KEY_PRINT), "EC_KEY_print"}, {ERR_FUNC(EC_F_EC_KEY_PRINT_FP), "EC_KEY_print_fp"}, +{ERR_FUNC(EC_F_EC_KEY_SET_PUBLIC_KEY_AFFINE_COORDINATES), "EC_KEY_set_public_key_affine_coordinates"}, {ERR_FUNC(EC_F_EC_POINTS_MAKE_AFFINE), "EC_POINTs_make_affine"}, {ERR_FUNC(EC_F_EC_POINT_ADD), "EC_POINT_add"}, {ERR_FUNC(EC_F_EC_POINT_CMP), "EC_POINT_cmp"}, @@ -184,6 +195,9 @@ static ERR_STRING_DATA EC_str_functs[]= {ERR_FUNC(EC_F_I2D_ECPKPARAMETERS), "i2d_ECPKParameters"}, {ERR_FUNC(EC_F_I2D_ECPRIVATEKEY), "i2d_ECPrivateKey"}, {ERR_FUNC(EC_F_I2O_ECPUBLICKEY), "i2o_ECPublicKey"}, +{ERR_FUNC(EC_F_NISTP224_PRE_COMP_NEW), "NISTP224_PRE_COMP_NEW"}, +{ERR_FUNC(EC_F_NISTP256_PRE_COMP_NEW), "NISTP256_PRE_COMP_NEW"}, +{ERR_FUNC(EC_F_NISTP521_PRE_COMP_NEW), "NISTP521_PRE_COMP_NEW"}, {ERR_FUNC(EC_F_O2I_ECPUBLICKEY), "o2i_ECPublicKey"}, {ERR_FUNC(EC_F_OLD_EC_PRIV_DECODE), "OLD_EC_PRIV_DECODE"}, {ERR_FUNC(EC_F_PKEY_EC_CTRL), "PKEY_EC_CTRL"}, @@ -199,12 +213,15 @@ static ERR_STRING_DATA EC_str_reasons[]= { {ERR_REASON(EC_R_ASN1_ERROR) ,"asn1 error"}, {ERR_REASON(EC_R_ASN1_UNKNOWN_FIELD) ,"asn1 unknown field"}, +{ERR_REASON(EC_R_BIGNUM_OUT_OF_RANGE) ,"bignum out of range"}, {ERR_REASON(EC_R_BUFFER_TOO_SMALL) ,"buffer too small"}, +{ERR_REASON(EC_R_COORDINATES_OUT_OF_RANGE),"coordinates out of range"}, {ERR_REASON(EC_R_D2I_ECPKPARAMETERS_FAILURE),"d2i ecpkparameters failure"}, {ERR_REASON(EC_R_DECODE_ERROR) ,"decode error"}, {ERR_REASON(EC_R_DISCRIMINANT_IS_ZERO) ,"discriminant is zero"}, {ERR_REASON(EC_R_EC_GROUP_NEW_BY_NAME_FAILURE),"ec group new by name failure"}, {ERR_REASON(EC_R_FIELD_TOO_LARGE) ,"field too large"}, +{ERR_REASON(EC_R_GF2M_NOT_SUPPORTED) ,"gf2m not supported"}, {ERR_REASON(EC_R_GROUP2PKPARAMETERS_FAILURE),"group2pkparameters failure"}, {ERR_REASON(EC_R_I2D_ECPKPARAMETERS_FAILURE),"i2d ecpkparameters failure"}, {ERR_REASON(EC_R_INCOMPATIBLE_OBJECTS) ,"incompatible objects"}, @@ -239,6 +256,7 @@ static ERR_STRING_DATA EC_str_reasons[]= {ERR_REASON(EC_R_UNKNOWN_GROUP) ,"unknown group"}, {ERR_REASON(EC_R_UNKNOWN_ORDER) ,"unknown order"}, {ERR_REASON(EC_R_UNSUPPORTED_FIELD) ,"unsupported field"}, +{ERR_REASON(EC_R_WRONG_CURVE_PARAMETERS) ,"wrong curve parameters"}, {ERR_REASON(EC_R_WRONG_ORDER) ,"wrong order"}, {0,NULL} }; diff --git a/crypto/openssl/crypto/ec/ec_key.c b/crypto/openssl/crypto/ec/ec_key.c index 522802c07a..bf9fd2dc2c 100644 --- a/crypto/openssl/crypto/ec/ec_key.c +++ b/crypto/openssl/crypto/ec/ec_key.c @@ -64,7 +64,9 @@ #include #include "ec_lcl.h" #include -#include +#ifdef OPENSSL_FIPS +#include +#endif EC_KEY *EC_KEY_new(void) { @@ -78,6 +80,7 @@ EC_KEY *EC_KEY_new(void) } ret->version = 1; + ret->flags = 0; ret->group = NULL; ret->pub_key = NULL; ret->priv_key= NULL; @@ -197,6 +200,7 @@ EC_KEY *EC_KEY_copy(EC_KEY *dest, const EC_KEY *src) dest->enc_flag = src->enc_flag; dest->conv_form = src->conv_form; dest->version = src->version; + dest->flags = src->flags; return dest; } @@ -237,6 +241,11 @@ int EC_KEY_generate_key(EC_KEY *eckey) BIGNUM *priv_key = NULL, *order = NULL; EC_POINT *pub_key = NULL; +#ifdef OPENSSL_FIPS + if (FIPS_mode()) + return FIPS_ec_key_generate_key(eckey); +#endif + if (!eckey || !eckey->group) { ECerr(EC_F_EC_KEY_GENERATE_KEY, ERR_R_PASSED_NULL_PARAMETER); @@ -371,6 +380,82 @@ err: return(ok); } +int EC_KEY_set_public_key_affine_coordinates(EC_KEY *key, BIGNUM *x, BIGNUM *y) + { + BN_CTX *ctx = NULL; + BIGNUM *tx, *ty; + EC_POINT *point = NULL; + int ok = 0, tmp_nid, is_char_two = 0; + + if (!key || !key->group || !x || !y) + { + ECerr(EC_F_EC_KEY_SET_PUBLIC_KEY_AFFINE_COORDINATES, + ERR_R_PASSED_NULL_PARAMETER); + return 0; + } + ctx = BN_CTX_new(); + if (!ctx) + goto err; + + point = EC_POINT_new(key->group); + + if (!point) + goto err; + + tmp_nid = EC_METHOD_get_field_type(EC_GROUP_method_of(key->group)); + + if (tmp_nid == NID_X9_62_characteristic_two_field) + is_char_two = 1; + + tx = BN_CTX_get(ctx); + ty = BN_CTX_get(ctx); +#ifndef OPENSSL_NO_EC2M + if (is_char_two) + { + if (!EC_POINT_set_affine_coordinates_GF2m(key->group, point, + x, y, ctx)) + goto err; + if (!EC_POINT_get_affine_coordinates_GF2m(key->group, point, + tx, ty, ctx)) + goto err; + } + else +#endif + { + if (!EC_POINT_set_affine_coordinates_GFp(key->group, point, + x, y, ctx)) + goto err; + if (!EC_POINT_get_affine_coordinates_GFp(key->group, point, + tx, ty, ctx)) + goto err; + } + /* Check if retrieved coordinates match originals: if not values + * are out of range. + */ + if (BN_cmp(x, tx) || BN_cmp(y, ty)) + { + ECerr(EC_F_EC_KEY_SET_PUBLIC_KEY_AFFINE_COORDINATES, + EC_R_COORDINATES_OUT_OF_RANGE); + goto err; + } + + if (!EC_KEY_set_public_key(key, point)) + goto err; + + if (EC_KEY_check_key(key) == 0) + goto err; + + ok = 1; + + err: + if (ctx) + BN_CTX_free(ctx); + if (point) + EC_POINT_free(point); + return ok; + + } + const EC_GROUP *EC_KEY_get0_group(const EC_KEY *key) { return key->group; @@ -461,3 +546,18 @@ int EC_KEY_precompute_mult(EC_KEY *key, BN_CTX *ctx) return 0; return EC_GROUP_precompute_mult(key->group, ctx); } + +int EC_KEY_get_flags(const EC_KEY *key) + { + return key->flags; + } + +void EC_KEY_set_flags(EC_KEY *key, int flags) + { + key->flags |= flags; + } + +void EC_KEY_clear_flags(EC_KEY *key, int flags) + { + key->flags &= ~flags; + } diff --git a/crypto/openssl/crypto/ec/ec_lcl.h b/crypto/openssl/crypto/ec/ec_lcl.h index 3e2c34b0bc..da7967df38 100644 --- a/crypto/openssl/crypto/ec/ec_lcl.h +++ b/crypto/openssl/crypto/ec/ec_lcl.h @@ -3,7 +3,7 @@ * Originally written by Bodo Moeller for the OpenSSL project. */ /* ==================================================================== - * Copyright (c) 1998-2003 The OpenSSL Project. All rights reserved. + * Copyright (c) 1998-2010 The OpenSSL Project. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions @@ -82,10 +82,15 @@ # endif #endif +/* Use default functions for poin2oct, oct2point and compressed coordinates */ +#define EC_FLAGS_DEFAULT_OCT 0x1 + /* Structure details are not part of the exported interface, * so all this may change in future versions. */ struct ec_method_st { + /* Various method flags */ + int flags; /* used by EC_METHOD_get_field_type: */ int field_type; /* a NID */ @@ -244,6 +249,7 @@ struct ec_key_st { point_conversion_form_t conv_form; int references; + int flags; EC_EXTRA_DATA *method_data; } /* EC_KEY */; @@ -391,3 +397,50 @@ int ec_GF2m_simple_mul(const EC_GROUP *group, EC_POINT *r, const BIGNUM *scalar, size_t num, const EC_POINT *points[], const BIGNUM *scalars[], BN_CTX *); int ec_GF2m_precompute_mult(EC_GROUP *group, BN_CTX *ctx); int ec_GF2m_have_precompute_mult(const EC_GROUP *group); + +/* method functions in ec2_mult.c */ +int ec_GF2m_simple_mul(const EC_GROUP *group, EC_POINT *r, const BIGNUM *scalar, + size_t num, const EC_POINT *points[], const BIGNUM *scalars[], BN_CTX *); +int ec_GF2m_precompute_mult(EC_GROUP *group, BN_CTX *ctx); +int ec_GF2m_have_precompute_mult(const EC_GROUP *group); + +#ifndef OPENSSL_EC_NISTP_64_GCC_128 +/* method functions in ecp_nistp224.c */ +int ec_GFp_nistp224_group_init(EC_GROUP *group); +int ec_GFp_nistp224_group_set_curve(EC_GROUP *group, const BIGNUM *p, const BIGNUM *a, const BIGNUM *n, BN_CTX *); +int ec_GFp_nistp224_point_get_affine_coordinates(const EC_GROUP *group, const EC_POINT *point, BIGNUM *x, BIGNUM *y, BN_CTX *ctx); +int ec_GFp_nistp224_mul(const EC_GROUP *group, EC_POINT *r, const BIGNUM *scalar, size_t num, const EC_POINT *points[], const BIGNUM *scalars[], BN_CTX *); +int ec_GFp_nistp224_points_mul(const EC_GROUP *group, EC_POINT *r, const BIGNUM *scalar, size_t num, const EC_POINT *points[], const BIGNUM *scalars[], BN_CTX *ctx); +int ec_GFp_nistp224_precompute_mult(EC_GROUP *group, BN_CTX *ctx); +int ec_GFp_nistp224_have_precompute_mult(const EC_GROUP *group); + +/* method functions in ecp_nistp256.c */ +int ec_GFp_nistp256_group_init(EC_GROUP *group); +int ec_GFp_nistp256_group_set_curve(EC_GROUP *group, const BIGNUM *p, const BIGNUM *a, const BIGNUM *n, BN_CTX *); +int ec_GFp_nistp256_point_get_affine_coordinates(const EC_GROUP *group, const EC_POINT *point, BIGNUM *x, BIGNUM *y, BN_CTX *ctx); +int ec_GFp_nistp256_mul(const EC_GROUP *group, EC_POINT *r, const BIGNUM *scalar, size_t num, const EC_POINT *points[], const BIGNUM *scalars[], BN_CTX *); +int ec_GFp_nistp256_points_mul(const EC_GROUP *group, EC_POINT *r, const BIGNUM *scalar, size_t num, const EC_POINT *points[], const BIGNUM *scalars[], BN_CTX *ctx); +int ec_GFp_nistp256_precompute_mult(EC_GROUP *group, BN_CTX *ctx); +int ec_GFp_nistp256_have_precompute_mult(const EC_GROUP *group); + +/* method functions in ecp_nistp521.c */ +int ec_GFp_nistp521_group_init(EC_GROUP *group); +int ec_GFp_nistp521_group_set_curve(EC_GROUP *group, const BIGNUM *p, const BIGNUM *a, const BIGNUM *n, BN_CTX *); +int ec_GFp_nistp521_point_get_affine_coordinates(const EC_GROUP *group, const EC_POINT *point, BIGNUM *x, BIGNUM *y, BN_CTX *ctx); +int ec_GFp_nistp521_mul(const EC_GROUP *group, EC_POINT *r, const BIGNUM *scalar, size_t num, const EC_POINT *points[], const BIGNUM *scalars[], BN_CTX *); +int ec_GFp_nistp521_points_mul(const EC_GROUP *group, EC_POINT *r, const BIGNUM *scalar, size_t num, const EC_POINT *points[], const BIGNUM *scalars[], BN_CTX *ctx); +int ec_GFp_nistp521_precompute_mult(EC_GROUP *group, BN_CTX *ctx); +int ec_GFp_nistp521_have_precompute_mult(const EC_GROUP *group); + +/* utility functions in ecp_nistputil.c */ +void ec_GFp_nistp_points_make_affine_internal(size_t num, void *point_array, + size_t felem_size, void *tmp_felems, + void (*felem_one)(void *out), + int (*felem_is_zero)(const void *in), + void (*felem_assign)(void *out, const void *in), + void (*felem_square)(void *out, const void *in), + void (*felem_mul)(void *out, const void *in1, const void *in2), + void (*felem_inv)(void *out, const void *in), + void (*felem_contract)(void *out, const void *in)); +void ec_GFp_nistp_recode_scalar_bits(unsigned char *sign, unsigned char *digit, unsigned char in); +#endif diff --git a/crypto/openssl/crypto/ec/ec_lib.c b/crypto/openssl/crypto/ec/ec_lib.c index dd7da0fcf9..25247b5803 100644 --- a/crypto/openssl/crypto/ec/ec_lib.c +++ b/crypto/openssl/crypto/ec/ec_lib.c @@ -425,7 +425,7 @@ int EC_GROUP_get_curve_GFp(const EC_GROUP *group, BIGNUM *p, BIGNUM *a, BIGNUM * return group->meth->group_get_curve(group, p, a, b, ctx); } - +#ifndef OPENSSL_NO_EC2M int EC_GROUP_set_curve_GF2m(EC_GROUP *group, const BIGNUM *p, const BIGNUM *a, const BIGNUM *b, BN_CTX *ctx) { if (group->meth->group_set_curve == 0) @@ -446,7 +446,7 @@ int EC_GROUP_get_curve_GF2m(const EC_GROUP *group, BIGNUM *p, BIGNUM *a, BIGNUM } return group->meth->group_get_curve(group, p, a, b, ctx); } - +#endif int EC_GROUP_get_degree(const EC_GROUP *group) { @@ -856,7 +856,7 @@ int EC_POINT_set_affine_coordinates_GFp(const EC_GROUP *group, EC_POINT *point, return group->meth->point_set_affine_coordinates(group, point, x, y, ctx); } - +#ifndef OPENSSL_NO_EC2M int EC_POINT_set_affine_coordinates_GF2m(const EC_GROUP *group, EC_POINT *point, const BIGNUM *x, const BIGNUM *y, BN_CTX *ctx) { @@ -872,7 +872,7 @@ int EC_POINT_set_affine_coordinates_GF2m(const EC_GROUP *group, EC_POINT *point, } return group->meth->point_set_affine_coordinates(group, point, x, y, ctx); } - +#endif int EC_POINT_get_affine_coordinates_GFp(const EC_GROUP *group, const EC_POINT *point, BIGNUM *x, BIGNUM *y, BN_CTX *ctx) @@ -890,7 +890,7 @@ int EC_POINT_get_affine_coordinates_GFp(const EC_GROUP *group, const EC_POINT *p return group->meth->point_get_affine_coordinates(group, point, x, y, ctx); } - +#ifndef OPENSSL_NO_EC2M int EC_POINT_get_affine_coordinates_GF2m(const EC_GROUP *group, const EC_POINT *point, BIGNUM *x, BIGNUM *y, BN_CTX *ctx) { @@ -906,75 +906,7 @@ int EC_POINT_get_affine_coordinates_GF2m(const EC_GROUP *group, const EC_POINT * } return group->meth->point_get_affine_coordinates(group, point, x, y, ctx); } - - -int EC_POINT_set_compressed_coordinates_GFp(const EC_GROUP *group, EC_POINT *point, - const BIGNUM *x, int y_bit, BN_CTX *ctx) - { - if (group->meth->point_set_compressed_coordinates == 0) - { - ECerr(EC_F_EC_POINT_SET_COMPRESSED_COORDINATES_GFP, ERR_R_SHOULD_NOT_HAVE_BEEN_CALLED); - return 0; - } - if (group->meth != point->meth) - { - ECerr(EC_F_EC_POINT_SET_COMPRESSED_COORDINATES_GFP, EC_R_INCOMPATIBLE_OBJECTS); - return 0; - } - return group->meth->point_set_compressed_coordinates(group, point, x, y_bit, ctx); - } - - -int EC_POINT_set_compressed_coordinates_GF2m(const EC_GROUP *group, EC_POINT *point, - const BIGNUM *x, int y_bit, BN_CTX *ctx) - { - if (group->meth->point_set_compressed_coordinates == 0) - { - ECerr(EC_F_EC_POINT_SET_COMPRESSED_COORDINATES_GF2M, ERR_R_SHOULD_NOT_HAVE_BEEN_CALLED); - return 0; - } - if (group->meth != point->meth) - { - ECerr(EC_F_EC_POINT_SET_COMPRESSED_COORDINATES_GF2M, EC_R_INCOMPATIBLE_OBJECTS); - return 0; - } - return group->meth->point_set_compressed_coordinates(group, point, x, y_bit, ctx); - } - - -size_t EC_POINT_point2oct(const EC_GROUP *group, const EC_POINT *point, point_conversion_form_t form, - unsigned char *buf, size_t len, BN_CTX *ctx) - { - if (group->meth->point2oct == 0) - { - ECerr(EC_F_EC_POINT_POINT2OCT, ERR_R_SHOULD_NOT_HAVE_BEEN_CALLED); - return 0; - } - if (group->meth != point->meth) - { - ECerr(EC_F_EC_POINT_POINT2OCT, EC_R_INCOMPATIBLE_OBJECTS); - return 0; - } - return group->meth->point2oct(group, point, form, buf, len, ctx); - } - - -int EC_POINT_oct2point(const EC_GROUP *group, EC_POINT *point, - const unsigned char *buf, size_t len, BN_CTX *ctx) - { - if (group->meth->oct2point == 0) - { - ECerr(EC_F_EC_POINT_OCT2POINT, ERR_R_SHOULD_NOT_HAVE_BEEN_CALLED); - return 0; - } - if (group->meth != point->meth) - { - ECerr(EC_F_EC_POINT_OCT2POINT, EC_R_INCOMPATIBLE_OBJECTS); - return 0; - } - return group->meth->oct2point(group, point, buf, len, ctx); - } - +#endif int EC_POINT_add(const EC_GROUP *group, EC_POINT *r, const EC_POINT *a, const EC_POINT *b, BN_CTX *ctx) { diff --git a/crypto/openssl/crypto/ec/ec_oct.c b/crypto/openssl/crypto/ec/ec_oct.c new file mode 100644 index 0000000000..fd9db0798d --- /dev/null +++ b/crypto/openssl/crypto/ec/ec_oct.c @@ -0,0 +1,199 @@ +/* crypto/ec/ec_lib.c */ +/* + * Originally written by Bodo Moeller for the OpenSSL project. + */ +/* ==================================================================== + * Copyright (c) 1998-2003 The OpenSSL Project. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * + * 3. All advertising materials mentioning features or use of this + * software must display the following acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit. (http://www.openssl.org/)" + * + * 4. The names "OpenSSL Toolkit" and "OpenSSL Project" must not be used to + * endorse or promote products derived from this software without + * prior written permission. For written permission, please contact + * openssl-core@openssl.org. + * + * 5. Products derived from this software may not be called "OpenSSL" + * nor may "OpenSSL" appear in their names without prior written + * permission of the OpenSSL Project. + * + * 6. Redistributions of any form whatsoever must retain the following + * acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit (http://www.openssl.org/)" + * + * THIS SOFTWARE IS PROVIDED BY THE OpenSSL PROJECT ``AS IS'' AND ANY + * EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR + * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE OpenSSL PROJECT OR + * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED + * OF THE POSSIBILITY OF SUCH DAMAGE. + * ==================================================================== + * + * This product includes cryptographic software written by Eric Young + * (eay@cryptsoft.com). This product includes software written by Tim + * Hudson (tjh@cryptsoft.com). + * + */ +/* ==================================================================== + * Copyright 2002 Sun Microsystems, Inc. ALL RIGHTS RESERVED. + * Binary polynomial ECC support in OpenSSL originally developed by + * SUN MICROSYSTEMS, INC., and contributed to the OpenSSL project. + */ + +#include + +#include +#include + +#include "ec_lcl.h" + +int EC_POINT_set_compressed_coordinates_GFp(const EC_GROUP *group, EC_POINT *point, + const BIGNUM *x, int y_bit, BN_CTX *ctx) + { + if (group->meth->point_set_compressed_coordinates == 0 + && !(group->meth->flags & EC_FLAGS_DEFAULT_OCT)) + { + ECerr(EC_F_EC_POINT_SET_COMPRESSED_COORDINATES_GFP, ERR_R_SHOULD_NOT_HAVE_BEEN_CALLED); + return 0; + } + if (group->meth != point->meth) + { + ECerr(EC_F_EC_POINT_SET_COMPRESSED_COORDINATES_GFP, EC_R_INCOMPATIBLE_OBJECTS); + return 0; + } + if(group->meth->flags & EC_FLAGS_DEFAULT_OCT) + { + if (group->meth->field_type == NID_X9_62_prime_field) + return ec_GFp_simple_set_compressed_coordinates( + group, point, x, y_bit, ctx); + else +#ifdef OPENSSL_NO_EC2M + { + ECerr(EC_F_EC_POINT_SET_COMPRESSED_COORDINATES_GFP, EC_R_GF2M_NOT_SUPPORTED); + return 0; + } +#else + return ec_GF2m_simple_set_compressed_coordinates( + group, point, x, y_bit, ctx); +#endif + } + return group->meth->point_set_compressed_coordinates(group, point, x, y_bit, ctx); + } + +#ifndef OPENSSL_NO_EC2M +int EC_POINT_set_compressed_coordinates_GF2m(const EC_GROUP *group, EC_POINT *point, + const BIGNUM *x, int y_bit, BN_CTX *ctx) + { + if (group->meth->point_set_compressed_coordinates == 0 + && !(group->meth->flags & EC_FLAGS_DEFAULT_OCT)) + { + ECerr(EC_F_EC_POINT_SET_COMPRESSED_COORDINATES_GF2M, ERR_R_SHOULD_NOT_HAVE_BEEN_CALLED); + return 0; + } + if (group->meth != point->meth) + { + ECerr(EC_F_EC_POINT_SET_COMPRESSED_COORDINATES_GF2M, EC_R_INCOMPATIBLE_OBJECTS); + return 0; + } + if(group->meth->flags & EC_FLAGS_DEFAULT_OCT) + { + if (group->meth->field_type == NID_X9_62_prime_field) + return ec_GFp_simple_set_compressed_coordinates( + group, point, x, y_bit, ctx); + else + return ec_GF2m_simple_set_compressed_coordinates( + group, point, x, y_bit, ctx); + } + return group->meth->point_set_compressed_coordinates(group, point, x, y_bit, ctx); + } +#endif + +size_t EC_POINT_point2oct(const EC_GROUP *group, const EC_POINT *point, point_conversion_form_t form, + unsigned char *buf, size_t len, BN_CTX *ctx) + { + if (group->meth->point2oct == 0 + && !(group->meth->flags & EC_FLAGS_DEFAULT_OCT)) + { + ECerr(EC_F_EC_POINT_POINT2OCT, ERR_R_SHOULD_NOT_HAVE_BEEN_CALLED); + return 0; + } + if (group->meth != point->meth) + { + ECerr(EC_F_EC_POINT_POINT2OCT, EC_R_INCOMPATIBLE_OBJECTS); + return 0; + } + if(group->meth->flags & EC_FLAGS_DEFAULT_OCT) + { + if (group->meth->field_type == NID_X9_62_prime_field) + return ec_GFp_simple_point2oct(group, point, + form, buf, len, ctx); + else +#ifdef OPENSSL_NO_EC2M + { + ECerr(EC_F_EC_POINT_POINT2OCT, EC_R_GF2M_NOT_SUPPORTED); + return 0; + } +#else + return ec_GF2m_simple_point2oct(group, point, + form, buf, len, ctx); +#endif + } + + return group->meth->point2oct(group, point, form, buf, len, ctx); + } + + +int EC_POINT_oct2point(const EC_GROUP *group, EC_POINT *point, + const unsigned char *buf, size_t len, BN_CTX *ctx) + { + if (group->meth->oct2point == 0 + && !(group->meth->flags & EC_FLAGS_DEFAULT_OCT)) + { + ECerr(EC_F_EC_POINT_OCT2POINT, ERR_R_SHOULD_NOT_HAVE_BEEN_CALLED); + return 0; + } + if (group->meth != point->meth) + { + ECerr(EC_F_EC_POINT_OCT2POINT, EC_R_INCOMPATIBLE_OBJECTS); + return 0; + } + if(group->meth->flags & EC_FLAGS_DEFAULT_OCT) + { + if (group->meth->field_type == NID_X9_62_prime_field) + return ec_GFp_simple_oct2point(group, point, + buf, len, ctx); + else +#ifdef OPENSSL_NO_EC2M + { + ECerr(EC_F_EC_POINT_OCT2POINT, EC_R_GF2M_NOT_SUPPORTED); + return 0; + } +#else + return ec_GF2m_simple_oct2point(group, point, + buf, len, ctx); +#endif + } + return group->meth->oct2point(group, point, buf, len, ctx); + } + diff --git a/crypto/openssl/crypto/ec/ec_pmeth.c b/crypto/openssl/crypto/ec/ec_pmeth.c index f433076ca1..d1ed66c37e 100644 --- a/crypto/openssl/crypto/ec/ec_pmeth.c +++ b/crypto/openssl/crypto/ec/ec_pmeth.c @@ -221,6 +221,7 @@ static int pkey_ec_ctrl(EVP_PKEY_CTX *ctx, int type, int p1, void *p2) case EVP_PKEY_CTRL_MD: if (EVP_MD_type((const EVP_MD *)p2) != NID_sha1 && + EVP_MD_type((const EVP_MD *)p2) != NID_ecdsa_with_SHA1 && EVP_MD_type((const EVP_MD *)p2) != NID_sha224 && EVP_MD_type((const EVP_MD *)p2) != NID_sha256 && EVP_MD_type((const EVP_MD *)p2) != NID_sha384 && diff --git a/crypto/openssl/crypto/ec/eck_prn.c b/crypto/openssl/crypto/ec/eck_prn.c index 7d3e175ae7..06de8f3959 100644 --- a/crypto/openssl/crypto/ec/eck_prn.c +++ b/crypto/openssl/crypto/ec/eck_prn.c @@ -207,7 +207,7 @@ int ECPKParameters_print(BIO *bp, const EC_GROUP *x, int off) reason = ERR_R_MALLOC_FAILURE; goto err; } - +#ifndef OPENSSL_NO_EC2M if (is_char_two) { if (!EC_GROUP_get_curve_GF2m(x, p, a, b, ctx)) @@ -217,6 +217,7 @@ int ECPKParameters_print(BIO *bp, const EC_GROUP *x, int off) } } else /* prime field */ +#endif { if (!EC_GROUP_get_curve_GFp(x, p, a, b, ctx)) { diff --git a/crypto/openssl/crypto/ec/ecp_mont.c b/crypto/openssl/crypto/ec/ecp_mont.c index 9fc4a466a5..079e47431b 100644 --- a/crypto/openssl/crypto/ec/ecp_mont.c +++ b/crypto/openssl/crypto/ec/ecp_mont.c @@ -63,12 +63,20 @@ #include +#ifdef OPENSSL_FIPS +#include +#endif + #include "ec_lcl.h" const EC_METHOD *EC_GFp_mont_method(void) { +#ifdef OPENSSL_FIPS + return fips_ec_gfp_mont_method(); +#else static const EC_METHOD ret = { + EC_FLAGS_DEFAULT_OCT, NID_X9_62_prime_field, ec_GFp_mont_group_init, ec_GFp_mont_group_finish, @@ -87,9 +95,7 @@ const EC_METHOD *EC_GFp_mont_method(void) ec_GFp_simple_get_Jprojective_coordinates_GFp, ec_GFp_simple_point_set_affine_coordinates, ec_GFp_simple_point_get_affine_coordinates, - ec_GFp_simple_set_compressed_coordinates, - ec_GFp_simple_point2oct, - ec_GFp_simple_oct2point, + 0,0,0, ec_GFp_simple_add, ec_GFp_simple_dbl, ec_GFp_simple_invert, @@ -108,7 +114,9 @@ const EC_METHOD *EC_GFp_mont_method(void) ec_GFp_mont_field_decode, ec_GFp_mont_field_set_to_one }; + return &ret; +#endif } diff --git a/crypto/openssl/crypto/ec/ecp_nist.c b/crypto/openssl/crypto/ec/ecp_nist.c index 2a5682ea41..aad2d5f443 100644 --- a/crypto/openssl/crypto/ec/ecp_nist.c +++ b/crypto/openssl/crypto/ec/ecp_nist.c @@ -67,9 +67,17 @@ #include #include "ec_lcl.h" +#ifdef OPENSSL_FIPS +#include +#endif + const EC_METHOD *EC_GFp_nist_method(void) { +#ifdef OPENSSL_FIPS + return fips_ec_gfp_nist_method(); +#else static const EC_METHOD ret = { + EC_FLAGS_DEFAULT_OCT, NID_X9_62_prime_field, ec_GFp_simple_group_init, ec_GFp_simple_group_finish, @@ -88,9 +96,7 @@ const EC_METHOD *EC_GFp_nist_method(void) ec_GFp_simple_get_Jprojective_coordinates_GFp, ec_GFp_simple_point_set_affine_coordinates, ec_GFp_simple_point_get_affine_coordinates, - ec_GFp_simple_set_compressed_coordinates, - ec_GFp_simple_point2oct, - ec_GFp_simple_oct2point, + 0,0,0, ec_GFp_simple_add, ec_GFp_simple_dbl, ec_GFp_simple_invert, @@ -110,6 +116,7 @@ const EC_METHOD *EC_GFp_nist_method(void) 0 /* field_set_to_one */ }; return &ret; +#endif } int ec_GFp_nist_group_copy(EC_GROUP *dest, const EC_GROUP *src) diff --git a/crypto/openssl/crypto/ec/ecp_nistp224.c b/crypto/openssl/crypto/ec/ecp_nistp224.c new file mode 100644 index 0000000000..b5ff56c252 --- /dev/null +++ b/crypto/openssl/crypto/ec/ecp_nistp224.c @@ -0,0 +1,1658 @@ +/* crypto/ec/ecp_nistp224.c */ +/* + * Written by Emilia Kasper (Google) for the OpenSSL project. + */ +/* Copyright 2011 Google Inc. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +/* + * A 64-bit implementation of the NIST P-224 elliptic curve point multiplication + * + * Inspired by Daniel J. Bernstein's public domain nistp224 implementation + * and Adam Langley's public domain 64-bit C implementation of curve25519 + */ + +#include +#ifndef OPENSSL_NO_EC_NISTP_64_GCC_128 + +#ifndef OPENSSL_SYS_VMS +#include +#else +#include +#endif + +#include +#include +#include "ec_lcl.h" + +#if defined(__GNUC__) && (__GNUC__ > 3 || (__GNUC__ == 3 && __GNUC_MINOR__ >= 1)) + /* even with gcc, the typedef won't work for 32-bit platforms */ + typedef __uint128_t uint128_t; /* nonstandard; implemented by gcc on 64-bit platforms */ +#else + #error "Need GCC 3.1 or later to define type uint128_t" +#endif + +typedef uint8_t u8; +typedef uint64_t u64; +typedef int64_t s64; + + +/******************************************************************************/ +/* INTERNAL REPRESENTATION OF FIELD ELEMENTS + * + * Field elements are represented as a_0 + 2^56*a_1 + 2^112*a_2 + 2^168*a_3 + * using 64-bit coefficients called 'limbs', + * and sometimes (for multiplication results) as + * b_0 + 2^56*b_1 + 2^112*b_2 + 2^168*b_3 + 2^224*b_4 + 2^280*b_5 + 2^336*b_6 + * using 128-bit coefficients called 'widelimbs'. + * A 4-limb representation is an 'felem'; + * a 7-widelimb representation is a 'widefelem'. + * Even within felems, bits of adjacent limbs overlap, and we don't always + * reduce the representations: we ensure that inputs to each felem + * multiplication satisfy a_i < 2^60, so outputs satisfy b_i < 4*2^60*2^60, + * and fit into a 128-bit word without overflow. The coefficients are then + * again partially reduced to obtain an felem satisfying a_i < 2^57. + * We only reduce to the unique minimal representation at the end of the + * computation. + */ + +typedef uint64_t limb; +typedef uint128_t widelimb; + +typedef limb felem[4]; +typedef widelimb widefelem[7]; + +/* Field element represented as a byte arrary. + * 28*8 = 224 bits is also the group order size for the elliptic curve, + * and we also use this type for scalars for point multiplication. + */ +typedef u8 felem_bytearray[28]; + +static const felem_bytearray nistp224_curve_params[5] = { + {0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF, /* p */ + 0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0x00,0x00,0x00,0x00, + 0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x01}, + {0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF, /* a */ + 0xFF,0xFF,0xFF,0xFF,0xFF,0xFE,0xFF,0xFF,0xFF,0xFF, + 0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFE}, + {0xB4,0x05,0x0A,0x85,0x0C,0x04,0xB3,0xAB,0xF5,0x41, /* b */ + 0x32,0x56,0x50,0x44,0xB0,0xB7,0xD7,0xBF,0xD8,0xBA, + 0x27,0x0B,0x39,0x43,0x23,0x55,0xFF,0xB4}, + {0xB7,0x0E,0x0C,0xBD,0x6B,0xB4,0xBF,0x7F,0x32,0x13, /* x */ + 0x90,0xB9,0x4A,0x03,0xC1,0xD3,0x56,0xC2,0x11,0x22, + 0x34,0x32,0x80,0xD6,0x11,0x5C,0x1D,0x21}, + {0xbd,0x37,0x63,0x88,0xb5,0xf7,0x23,0xfb,0x4c,0x22, /* y */ + 0xdf,0xe6,0xcd,0x43,0x75,0xa0,0x5a,0x07,0x47,0x64, + 0x44,0xd5,0x81,0x99,0x85,0x00,0x7e,0x34} +}; + +/* Precomputed multiples of the standard generator + * Points are given in coordinates (X, Y, Z) where Z normally is 1 + * (0 for the point at infinity). + * For each field element, slice a_0 is word 0, etc. + * + * The table has 2 * 16 elements, starting with the following: + * index | bits | point + * ------+---------+------------------------------ + * 0 | 0 0 0 0 | 0G + * 1 | 0 0 0 1 | 1G + * 2 | 0 0 1 0 | 2^56G + * 3 | 0 0 1 1 | (2^56 + 1)G + * 4 | 0 1 0 0 | 2^112G + * 5 | 0 1 0 1 | (2^112 + 1)G + * 6 | 0 1 1 0 | (2^112 + 2^56)G + * 7 | 0 1 1 1 | (2^112 + 2^56 + 1)G + * 8 | 1 0 0 0 | 2^168G + * 9 | 1 0 0 1 | (2^168 + 1)G + * 10 | 1 0 1 0 | (2^168 + 2^56)G + * 11 | 1 0 1 1 | (2^168 + 2^56 + 1)G + * 12 | 1 1 0 0 | (2^168 + 2^112)G + * 13 | 1 1 0 1 | (2^168 + 2^112 + 1)G + * 14 | 1 1 1 0 | (2^168 + 2^112 + 2^56)G + * 15 | 1 1 1 1 | (2^168 + 2^112 + 2^56 + 1)G + * followed by a copy of this with each element multiplied by 2^28. + * + * The reason for this is so that we can clock bits into four different + * locations when doing simple scalar multiplies against the base point, + * and then another four locations using the second 16 elements. + */ +static const felem gmul[2][16][3] = +{{{{0, 0, 0, 0}, + {0, 0, 0, 0}, + {0, 0, 0, 0}}, + {{0x3280d6115c1d21, 0xc1d356c2112234, 0x7f321390b94a03, 0xb70e0cbd6bb4bf}, + {0xd5819985007e34, 0x75a05a07476444, 0xfb4c22dfe6cd43, 0xbd376388b5f723}, + {1, 0, 0, 0}}, + {{0xfd9675666ebbe9, 0xbca7664d40ce5e, 0x2242df8d8a2a43, 0x1f49bbb0f99bc5}, + {0x29e0b892dc9c43, 0xece8608436e662, 0xdc858f185310d0, 0x9812dd4eb8d321}, + {1, 0, 0, 0}}, + {{0x6d3e678d5d8eb8, 0x559eed1cb362f1, 0x16e9a3bbce8a3f, 0xeedcccd8c2a748}, + {0xf19f90ed50266d, 0xabf2b4bf65f9df, 0x313865468fafec, 0x5cb379ba910a17}, + {1, 0, 0, 0}}, + {{0x0641966cab26e3, 0x91fb2991fab0a0, 0xefec27a4e13a0b, 0x0499aa8a5f8ebe}, + {0x7510407766af5d, 0x84d929610d5450, 0x81d77aae82f706, 0x6916f6d4338c5b}, + {1, 0, 0, 0}}, + {{0xea95ac3b1f15c6, 0x086000905e82d4, 0xdd323ae4d1c8b1, 0x932b56be7685a3}, + {0x9ef93dea25dbbf, 0x41665960f390f0, 0xfdec76dbe2a8a7, 0x523e80f019062a}, + {1, 0, 0, 0}}, + {{0x822fdd26732c73, 0xa01c83531b5d0f, 0x363f37347c1ba4, 0xc391b45c84725c}, + {0xbbd5e1b2d6ad24, 0xddfbcde19dfaec, 0xc393da7e222a7f, 0x1efb7890ede244}, + {1, 0, 0, 0}}, + {{0x4c9e90ca217da1, 0xd11beca79159bb, 0xff8d33c2c98b7c, 0x2610b39409f849}, + {0x44d1352ac64da0, 0xcdbb7b2c46b4fb, 0x966c079b753c89, 0xfe67e4e820b112}, + {1, 0, 0, 0}}, + {{0xe28cae2df5312d, 0xc71b61d16f5c6e, 0x79b7619a3e7c4c, 0x05c73240899b47}, + {0x9f7f6382c73e3a, 0x18615165c56bda, 0x641fab2116fd56, 0x72855882b08394}, + {1, 0, 0, 0}}, + {{0x0469182f161c09, 0x74a98ca8d00fb5, 0xb89da93489a3e0, 0x41c98768fb0c1d}, + {0xe5ea05fb32da81, 0x3dce9ffbca6855, 0x1cfe2d3fbf59e6, 0x0e5e03408738a7}, + {1, 0, 0, 0}}, + {{0xdab22b2333e87f, 0x4430137a5dd2f6, 0xe03ab9f738beb8, 0xcb0c5d0dc34f24}, + {0x764a7df0c8fda5, 0x185ba5c3fa2044, 0x9281d688bcbe50, 0xc40331df893881}, + {1, 0, 0, 0}}, + {{0xb89530796f0f60, 0xade92bd26909a3, 0x1a0c83fb4884da, 0x1765bf22a5a984}, + {0x772a9ee75db09e, 0x23bc6c67cec16f, 0x4c1edba8b14e2f, 0xe2a215d9611369}, + {1, 0, 0, 0}}, + {{0x571e509fb5efb3, 0xade88696410552, 0xc8ae85fada74fe, 0x6c7e4be83bbde3}, + {0xff9f51160f4652, 0xb47ce2495a6539, 0xa2946c53b582f4, 0x286d2db3ee9a60}, + {1, 0, 0, 0}}, + {{0x40bbd5081a44af, 0x0995183b13926c, 0xbcefba6f47f6d0, 0x215619e9cc0057}, + {0x8bc94d3b0df45e, 0xf11c54a3694f6f, 0x8631b93cdfe8b5, 0xe7e3f4b0982db9}, + {1, 0, 0, 0}}, + {{0xb17048ab3e1c7b, 0xac38f36ff8a1d8, 0x1c29819435d2c6, 0xc813132f4c07e9}, + {0x2891425503b11f, 0x08781030579fea, 0xf5426ba5cc9674, 0x1e28ebf18562bc}, + {1, 0, 0, 0}}, + {{0x9f31997cc864eb, 0x06cd91d28b5e4c, 0xff17036691a973, 0xf1aef351497c58}, + {0xdd1f2d600564ff, 0xdead073b1402db, 0x74a684435bd693, 0xeea7471f962558}, + {1, 0, 0, 0}}}, + {{{0, 0, 0, 0}, + {0, 0, 0, 0}, + {0, 0, 0, 0}}, + {{0x9665266dddf554, 0x9613d78b60ef2d, 0xce27a34cdba417, 0xd35ab74d6afc31}, + {0x85ccdd22deb15e, 0x2137e5783a6aab, 0xa141cffd8c93c6, 0x355a1830e90f2d}, + {1, 0, 0, 0}}, + {{0x1a494eadaade65, 0xd6da4da77fe53c, 0xe7992996abec86, 0x65c3553c6090e3}, + {0xfa610b1fb09346, 0xf1c6540b8a4aaf, 0xc51a13ccd3cbab, 0x02995b1b18c28a}, + {1, 0, 0, 0}}, + {{0x7874568e7295ef, 0x86b419fbe38d04, 0xdc0690a7550d9a, 0xd3966a44beac33}, + {0x2b7280ec29132f, 0xbeaa3b6a032df3, 0xdc7dd88ae41200, 0xd25e2513e3a100}, + {1, 0, 0, 0}}, + {{0x924857eb2efafd, 0xac2bce41223190, 0x8edaa1445553fc, 0x825800fd3562d5}, + {0x8d79148ea96621, 0x23a01c3dd9ed8d, 0xaf8b219f9416b5, 0xd8db0cc277daea}, + {1, 0, 0, 0}}, + {{0x76a9c3b1a700f0, 0xe9acd29bc7e691, 0x69212d1a6b0327, 0x6322e97fe154be}, + {0x469fc5465d62aa, 0x8d41ed18883b05, 0x1f8eae66c52b88, 0xe4fcbe9325be51}, + {1, 0, 0, 0}}, + {{0x825fdf583cac16, 0x020b857c7b023a, 0x683c17744b0165, 0x14ffd0a2daf2f1}, + {0x323b36184218f9, 0x4944ec4e3b47d4, 0xc15b3080841acf, 0x0bced4b01a28bb}, + {1, 0, 0, 0}}, + {{0x92ac22230df5c4, 0x52f33b4063eda8, 0xcb3f19870c0c93, 0x40064f2ba65233}, + {0xfe16f0924f8992, 0x012da25af5b517, 0x1a57bb24f723a6, 0x06f8bc76760def}, + {1, 0, 0, 0}}, + {{0x4a7084f7817cb9, 0xbcab0738ee9a78, 0x3ec11e11d9c326, 0xdc0fe90e0f1aae}, + {0xcf639ea5f98390, 0x5c350aa22ffb74, 0x9afae98a4047b7, 0x956ec2d617fc45}, + {1, 0, 0, 0}}, + {{0x4306d648c1be6a, 0x9247cd8bc9a462, 0xf5595e377d2f2e, 0xbd1c3caff1a52e}, + {0x045e14472409d0, 0x29f3e17078f773, 0x745a602b2d4f7d, 0x191837685cdfbb}, + {1, 0, 0, 0}}, + {{0x5b6ee254a8cb79, 0x4953433f5e7026, 0xe21faeb1d1def4, 0xc4c225785c09de}, + {0x307ce7bba1e518, 0x31b125b1036db8, 0x47e91868839e8f, 0xc765866e33b9f3}, + {1, 0, 0, 0}}, + {{0x3bfece24f96906, 0x4794da641e5093, 0xde5df64f95db26, 0x297ecd89714b05}, + {0x701bd3ebb2c3aa, 0x7073b4f53cb1d5, 0x13c5665658af16, 0x9895089d66fe58}, + {1, 0, 0, 0}}, + {{0x0fef05f78c4790, 0x2d773633b05d2e, 0x94229c3a951c94, 0xbbbd70df4911bb}, + {0xb2c6963d2c1168, 0x105f47a72b0d73, 0x9fdf6111614080, 0x7b7e94b39e67b0}, + {1, 0, 0, 0}}, + {{0xad1a7d6efbe2b3, 0xf012482c0da69d, 0x6b3bdf12438345, 0x40d7558d7aa4d9}, + {0x8a09fffb5c6d3d, 0x9a356e5d9ffd38, 0x5973f15f4f9b1c, 0xdcd5f59f63c3ea}, + {1, 0, 0, 0}}, + {{0xacf39f4c5ca7ab, 0x4c8071cc5fd737, 0xc64e3602cd1184, 0x0acd4644c9abba}, + {0x6c011a36d8bf6e, 0xfecd87ba24e32a, 0x19f6f56574fad8, 0x050b204ced9405}, + {1, 0, 0, 0}}, + {{0xed4f1cae7d9a96, 0x5ceef7ad94c40a, 0x778e4a3bf3ef9b, 0x7405783dc3b55e}, + {0x32477c61b6e8c6, 0xb46a97570f018b, 0x91176d0a7e95d1, 0x3df90fbc4c7d0e}, + {1, 0, 0, 0}}}}; + +/* Precomputation for the group generator. */ +typedef struct { + felem g_pre_comp[2][16][3]; + int references; +} NISTP224_PRE_COMP; + +const EC_METHOD *EC_GFp_nistp224_method(void) + { + static const EC_METHOD ret = { + EC_FLAGS_DEFAULT_OCT, + NID_X9_62_prime_field, + ec_GFp_nistp224_group_init, + ec_GFp_simple_group_finish, + ec_GFp_simple_group_clear_finish, + ec_GFp_nist_group_copy, + ec_GFp_nistp224_group_set_curve, + ec_GFp_simple_group_get_curve, + ec_GFp_simple_group_get_degree, + ec_GFp_simple_group_check_discriminant, + ec_GFp_simple_point_init, + ec_GFp_simple_point_finish, + ec_GFp_simple_point_clear_finish, + ec_GFp_simple_point_copy, + ec_GFp_simple_point_set_to_infinity, + ec_GFp_simple_set_Jprojective_coordinates_GFp, + ec_GFp_simple_get_Jprojective_coordinates_GFp, + ec_GFp_simple_point_set_affine_coordinates, + ec_GFp_nistp224_point_get_affine_coordinates, + 0 /* point_set_compressed_coordinates */, + 0 /* point2oct */, + 0 /* oct2point */, + ec_GFp_simple_add, + ec_GFp_simple_dbl, + ec_GFp_simple_invert, + ec_GFp_simple_is_at_infinity, + ec_GFp_simple_is_on_curve, + ec_GFp_simple_cmp, + ec_GFp_simple_make_affine, + ec_GFp_simple_points_make_affine, + ec_GFp_nistp224_points_mul, + ec_GFp_nistp224_precompute_mult, + ec_GFp_nistp224_have_precompute_mult, + ec_GFp_nist_field_mul, + ec_GFp_nist_field_sqr, + 0 /* field_div */, + 0 /* field_encode */, + 0 /* field_decode */, + 0 /* field_set_to_one */ }; + + return &ret; + } + +/* Helper functions to convert field elements to/from internal representation */ +static void bin28_to_felem(felem out, const u8 in[28]) + { + out[0] = *((const uint64_t *)(in)) & 0x00ffffffffffffff; + out[1] = (*((const uint64_t *)(in+7))) & 0x00ffffffffffffff; + out[2] = (*((const uint64_t *)(in+14))) & 0x00ffffffffffffff; + out[3] = (*((const uint64_t *)(in+21))) & 0x00ffffffffffffff; + } + +static void felem_to_bin28(u8 out[28], const felem in) + { + unsigned i; + for (i = 0; i < 7; ++i) + { + out[i] = in[0]>>(8*i); + out[i+7] = in[1]>>(8*i); + out[i+14] = in[2]>>(8*i); + out[i+21] = in[3]>>(8*i); + } + } + +/* To preserve endianness when using BN_bn2bin and BN_bin2bn */ +static void flip_endian(u8 *out, const u8 *in, unsigned len) + { + unsigned i; + for (i = 0; i < len; ++i) + out[i] = in[len-1-i]; + } + +/* From OpenSSL BIGNUM to internal representation */ +static int BN_to_felem(felem out, const BIGNUM *bn) + { + felem_bytearray b_in; + felem_bytearray b_out; + unsigned num_bytes; + + /* BN_bn2bin eats leading zeroes */ + memset(b_out, 0, sizeof b_out); + num_bytes = BN_num_bytes(bn); + if (num_bytes > sizeof b_out) + { + ECerr(EC_F_BN_TO_FELEM, EC_R_BIGNUM_OUT_OF_RANGE); + return 0; + } + if (BN_is_negative(bn)) + { + ECerr(EC_F_BN_TO_FELEM, EC_R_BIGNUM_OUT_OF_RANGE); + return 0; + } + num_bytes = BN_bn2bin(bn, b_in); + flip_endian(b_out, b_in, num_bytes); + bin28_to_felem(out, b_out); + return 1; + } + +/* From internal representation to OpenSSL BIGNUM */ +static BIGNUM *felem_to_BN(BIGNUM *out, const felem in) + { + felem_bytearray b_in, b_out; + felem_to_bin28(b_in, in); + flip_endian(b_out, b_in, sizeof b_out); + return BN_bin2bn(b_out, sizeof b_out, out); + } + +/******************************************************************************/ +/* FIELD OPERATIONS + * + * Field operations, using the internal representation of field elements. + * NB! These operations are specific to our point multiplication and cannot be + * expected to be correct in general - e.g., multiplication with a large scalar + * will cause an overflow. + * + */ + +static void felem_one(felem out) + { + out[0] = 1; + out[1] = 0; + out[2] = 0; + out[3] = 0; + } + +static void felem_assign(felem out, const felem in) + { + out[0] = in[0]; + out[1] = in[1]; + out[2] = in[2]; + out[3] = in[3]; + } + +/* Sum two field elements: out += in */ +static void felem_sum(felem out, const felem in) + { + out[0] += in[0]; + out[1] += in[1]; + out[2] += in[2]; + out[3] += in[3]; + } + +/* Get negative value: out = -in */ +/* Assumes in[i] < 2^57 */ +static void felem_neg(felem out, const felem in) + { + static const limb two58p2 = (((limb) 1) << 58) + (((limb) 1) << 2); + static const limb two58m2 = (((limb) 1) << 58) - (((limb) 1) << 2); + static const limb two58m42m2 = (((limb) 1) << 58) - + (((limb) 1) << 42) - (((limb) 1) << 2); + + /* Set to 0 mod 2^224-2^96+1 to ensure out > in */ + out[0] = two58p2 - in[0]; + out[1] = two58m42m2 - in[1]; + out[2] = two58m2 - in[2]; + out[3] = two58m2 - in[3]; + } + +/* Subtract field elements: out -= in */ +/* Assumes in[i] < 2^57 */ +static void felem_diff(felem out, const felem in) + { + static const limb two58p2 = (((limb) 1) << 58) + (((limb) 1) << 2); + static const limb two58m2 = (((limb) 1) << 58) - (((limb) 1) << 2); + static const limb two58m42m2 = (((limb) 1) << 58) - + (((limb) 1) << 42) - (((limb) 1) << 2); + + /* Add 0 mod 2^224-2^96+1 to ensure out > in */ + out[0] += two58p2; + out[1] += two58m42m2; + out[2] += two58m2; + out[3] += two58m2; + + out[0] -= in[0]; + out[1] -= in[1]; + out[2] -= in[2]; + out[3] -= in[3]; + } + +/* Subtract in unreduced 128-bit mode: out -= in */ +/* Assumes in[i] < 2^119 */ +static void widefelem_diff(widefelem out, const widefelem in) + { + static const widelimb two120 = ((widelimb) 1) << 120; + static const widelimb two120m64 = (((widelimb) 1) << 120) - + (((widelimb) 1) << 64); + static const widelimb two120m104m64 = (((widelimb) 1) << 120) - + (((widelimb) 1) << 104) - (((widelimb) 1) << 64); + + /* Add 0 mod 2^224-2^96+1 to ensure out > in */ + out[0] += two120; + out[1] += two120m64; + out[2] += two120m64; + out[3] += two120; + out[4] += two120m104m64; + out[5] += two120m64; + out[6] += two120m64; + + out[0] -= in[0]; + out[1] -= in[1]; + out[2] -= in[2]; + out[3] -= in[3]; + out[4] -= in[4]; + out[5] -= in[5]; + out[6] -= in[6]; + } + +/* Subtract in mixed mode: out128 -= in64 */ +/* in[i] < 2^63 */ +static void felem_diff_128_64(widefelem out, const felem in) + { + static const widelimb two64p8 = (((widelimb) 1) << 64) + + (((widelimb) 1) << 8); + static const widelimb two64m8 = (((widelimb) 1) << 64) - + (((widelimb) 1) << 8); + static const widelimb two64m48m8 = (((widelimb) 1) << 64) - + (((widelimb) 1) << 48) - (((widelimb) 1) << 8); + + /* Add 0 mod 2^224-2^96+1 to ensure out > in */ + out[0] += two64p8; + out[1] += two64m48m8; + out[2] += two64m8; + out[3] += two64m8; + + out[0] -= in[0]; + out[1] -= in[1]; + out[2] -= in[2]; + out[3] -= in[3]; + } + +/* Multiply a field element by a scalar: out = out * scalar + * The scalars we actually use are small, so results fit without overflow */ +static void felem_scalar(felem out, const limb scalar) + { + out[0] *= scalar; + out[1] *= scalar; + out[2] *= scalar; + out[3] *= scalar; + } + +/* Multiply an unreduced field element by a scalar: out = out * scalar + * The scalars we actually use are small, so results fit without overflow */ +static void widefelem_scalar(widefelem out, const widelimb scalar) + { + out[0] *= scalar; + out[1] *= scalar; + out[2] *= scalar; + out[3] *= scalar; + out[4] *= scalar; + out[5] *= scalar; + out[6] *= scalar; + } + +/* Square a field element: out = in^2 */ +static void felem_square(widefelem out, const felem in) + { + limb tmp0, tmp1, tmp2; + tmp0 = 2 * in[0]; tmp1 = 2 * in[1]; tmp2 = 2 * in[2]; + out[0] = ((widelimb) in[0]) * in[0]; + out[1] = ((widelimb) in[0]) * tmp1; + out[2] = ((widelimb) in[0]) * tmp2 + ((widelimb) in[1]) * in[1]; + out[3] = ((widelimb) in[3]) * tmp0 + + ((widelimb) in[1]) * tmp2; + out[4] = ((widelimb) in[3]) * tmp1 + ((widelimb) in[2]) * in[2]; + out[5] = ((widelimb) in[3]) * tmp2; + out[6] = ((widelimb) in[3]) * in[3]; + } + +/* Multiply two field elements: out = in1 * in2 */ +static void felem_mul(widefelem out, const felem in1, const felem in2) + { + out[0] = ((widelimb) in1[0]) * in2[0]; + out[1] = ((widelimb) in1[0]) * in2[1] + ((widelimb) in1[1]) * in2[0]; + out[2] = ((widelimb) in1[0]) * in2[2] + ((widelimb) in1[1]) * in2[1] + + ((widelimb) in1[2]) * in2[0]; + out[3] = ((widelimb) in1[0]) * in2[3] + ((widelimb) in1[1]) * in2[2] + + ((widelimb) in1[2]) * in2[1] + ((widelimb) in1[3]) * in2[0]; + out[4] = ((widelimb) in1[1]) * in2[3] + ((widelimb) in1[2]) * in2[2] + + ((widelimb) in1[3]) * in2[1]; + out[5] = ((widelimb) in1[2]) * in2[3] + ((widelimb) in1[3]) * in2[2]; + out[6] = ((widelimb) in1[3]) * in2[3]; + } + +/* Reduce seven 128-bit coefficients to four 64-bit coefficients. + * Requires in[i] < 2^126, + * ensures out[0] < 2^56, out[1] < 2^56, out[2] < 2^56, out[3] <= 2^56 + 2^16 */ +static void felem_reduce(felem out, const widefelem in) + { + static const widelimb two127p15 = (((widelimb) 1) << 127) + + (((widelimb) 1) << 15); + static const widelimb two127m71 = (((widelimb) 1) << 127) - + (((widelimb) 1) << 71); + static const widelimb two127m71m55 = (((widelimb) 1) << 127) - + (((widelimb) 1) << 71) - (((widelimb) 1) << 55); + widelimb output[5]; + + /* Add 0 mod 2^224-2^96+1 to ensure all differences are positive */ + output[0] = in[0] + two127p15; + output[1] = in[1] + two127m71m55; + output[2] = in[2] + two127m71; + output[3] = in[3]; + output[4] = in[4]; + + /* Eliminate in[4], in[5], in[6] */ + output[4] += in[6] >> 16; + output[3] += (in[6] & 0xffff) << 40; + output[2] -= in[6]; + + output[3] += in[5] >> 16; + output[2] += (in[5] & 0xffff) << 40; + output[1] -= in[5]; + + output[2] += output[4] >> 16; + output[1] += (output[4] & 0xffff) << 40; + output[0] -= output[4]; + + /* Carry 2 -> 3 -> 4 */ + output[3] += output[2] >> 56; + output[2] &= 0x00ffffffffffffff; + + output[4] = output[3] >> 56; + output[3] &= 0x00ffffffffffffff; + + /* Now output[2] < 2^56, output[3] < 2^56, output[4] < 2^72 */ + + /* Eliminate output[4] */ + output[2] += output[4] >> 16; + /* output[2] < 2^56 + 2^56 = 2^57 */ + output[1] += (output[4] & 0xffff) << 40; + output[0] -= output[4]; + + /* Carry 0 -> 1 -> 2 -> 3 */ + output[1] += output[0] >> 56; + out[0] = output[0] & 0x00ffffffffffffff; + + output[2] += output[1] >> 56; + /* output[2] < 2^57 + 2^72 */ + out[1] = output[1] & 0x00ffffffffffffff; + output[3] += output[2] >> 56; + /* output[3] <= 2^56 + 2^16 */ + out[2] = output[2] & 0x00ffffffffffffff; + + /* out[0] < 2^56, out[1] < 2^56, out[2] < 2^56, + * out[3] <= 2^56 + 2^16 (due to final carry), + * so out < 2*p */ + out[3] = output[3]; + } + +static void felem_square_reduce(felem out, const felem in) + { + widefelem tmp; + felem_square(tmp, in); + felem_reduce(out, tmp); + } + +static void felem_mul_reduce(felem out, const felem in1, const felem in2) + { + widefelem tmp; + felem_mul(tmp, in1, in2); + felem_reduce(out, tmp); + } + +/* Reduce to unique minimal representation. + * Requires 0 <= in < 2*p (always call felem_reduce first) */ +static void felem_contract(felem out, const felem in) + { + static const int64_t two56 = ((limb) 1) << 56; + /* 0 <= in < 2*p, p = 2^224 - 2^96 + 1 */ + /* if in > p , reduce in = in - 2^224 + 2^96 - 1 */ + int64_t tmp[4], a; + tmp[0] = in[0]; + tmp[1] = in[1]; + tmp[2] = in[2]; + tmp[3] = in[3]; + /* Case 1: a = 1 iff in >= 2^224 */ + a = (in[3] >> 56); + tmp[0] -= a; + tmp[1] += a << 40; + tmp[3] &= 0x00ffffffffffffff; + /* Case 2: a = 0 iff p <= in < 2^224, i.e., + * the high 128 bits are all 1 and the lower part is non-zero */ + a = ((in[3] & in[2] & (in[1] | 0x000000ffffffffff)) + 1) | + (((int64_t)(in[0] + (in[1] & 0x000000ffffffffff)) - 1) >> 63); + a &= 0x00ffffffffffffff; + /* turn a into an all-one mask (if a = 0) or an all-zero mask */ + a = (a - 1) >> 63; + /* subtract 2^224 - 2^96 + 1 if a is all-one*/ + tmp[3] &= a ^ 0xffffffffffffffff; + tmp[2] &= a ^ 0xffffffffffffffff; + tmp[1] &= (a ^ 0xffffffffffffffff) | 0x000000ffffffffff; + tmp[0] -= 1 & a; + + /* eliminate negative coefficients: if tmp[0] is negative, tmp[1] must + * be non-zero, so we only need one step */ + a = tmp[0] >> 63; + tmp[0] += two56 & a; + tmp[1] -= 1 & a; + + /* carry 1 -> 2 -> 3 */ + tmp[2] += tmp[1] >> 56; + tmp[1] &= 0x00ffffffffffffff; + + tmp[3] += tmp[2] >> 56; + tmp[2] &= 0x00ffffffffffffff; + + /* Now 0 <= out < p */ + out[0] = tmp[0]; + out[1] = tmp[1]; + out[2] = tmp[2]; + out[3] = tmp[3]; + } + +/* Zero-check: returns 1 if input is 0, and 0 otherwise. + * We know that field elements are reduced to in < 2^225, + * so we only need to check three cases: 0, 2^224 - 2^96 + 1, + * and 2^225 - 2^97 + 2 */ +static limb felem_is_zero(const felem in) + { + limb zero, two224m96p1, two225m97p2; + + zero = in[0] | in[1] | in[2] | in[3]; + zero = (((int64_t)(zero) - 1) >> 63) & 1; + two224m96p1 = (in[0] ^ 1) | (in[1] ^ 0x00ffff0000000000) + | (in[2] ^ 0x00ffffffffffffff) | (in[3] ^ 0x00ffffffffffffff); + two224m96p1 = (((int64_t)(two224m96p1) - 1) >> 63) & 1; + two225m97p2 = (in[0] ^ 2) | (in[1] ^ 0x00fffe0000000000) + | (in[2] ^ 0x00ffffffffffffff) | (in[3] ^ 0x01ffffffffffffff); + two225m97p2 = (((int64_t)(two225m97p2) - 1) >> 63) & 1; + return (zero | two224m96p1 | two225m97p2); + } + +static limb felem_is_zero_int(const felem in) + { + return (int) (felem_is_zero(in) & ((limb)1)); + } + +/* Invert a field element */ +/* Computation chain copied from djb's code */ +static void felem_inv(felem out, const felem in) + { + felem ftmp, ftmp2, ftmp3, ftmp4; + widefelem tmp; + unsigned i; + + felem_square(tmp, in); felem_reduce(ftmp, tmp); /* 2 */ + felem_mul(tmp, in, ftmp); felem_reduce(ftmp, tmp); /* 2^2 - 1 */ + felem_square(tmp, ftmp); felem_reduce(ftmp, tmp); /* 2^3 - 2 */ + felem_mul(tmp, in, ftmp); felem_reduce(ftmp, tmp); /* 2^3 - 1 */ + felem_square(tmp, ftmp); felem_reduce(ftmp2, tmp); /* 2^4 - 2 */ + felem_square(tmp, ftmp2); felem_reduce(ftmp2, tmp); /* 2^5 - 4 */ + felem_square(tmp, ftmp2); felem_reduce(ftmp2, tmp); /* 2^6 - 8 */ + felem_mul(tmp, ftmp2, ftmp); felem_reduce(ftmp, tmp); /* 2^6 - 1 */ + felem_square(tmp, ftmp); felem_reduce(ftmp2, tmp); /* 2^7 - 2 */ + for (i = 0; i < 5; ++i) /* 2^12 - 2^6 */ + { + felem_square(tmp, ftmp2); felem_reduce(ftmp2, tmp); + } + felem_mul(tmp, ftmp2, ftmp); felem_reduce(ftmp2, tmp); /* 2^12 - 1 */ + felem_square(tmp, ftmp2); felem_reduce(ftmp3, tmp); /* 2^13 - 2 */ + for (i = 0; i < 11; ++i) /* 2^24 - 2^12 */ + { + felem_square(tmp, ftmp3); felem_reduce(ftmp3, tmp); + } + felem_mul(tmp, ftmp3, ftmp2); felem_reduce(ftmp2, tmp); /* 2^24 - 1 */ + felem_square(tmp, ftmp2); felem_reduce(ftmp3, tmp); /* 2^25 - 2 */ + for (i = 0; i < 23; ++i) /* 2^48 - 2^24 */ + { + felem_square(tmp, ftmp3); felem_reduce(ftmp3, tmp); + } + felem_mul(tmp, ftmp3, ftmp2); felem_reduce(ftmp3, tmp); /* 2^48 - 1 */ + felem_square(tmp, ftmp3); felem_reduce(ftmp4, tmp); /* 2^49 - 2 */ + for (i = 0; i < 47; ++i) /* 2^96 - 2^48 */ + { + felem_square(tmp, ftmp4); felem_reduce(ftmp4, tmp); + } + felem_mul(tmp, ftmp3, ftmp4); felem_reduce(ftmp3, tmp); /* 2^96 - 1 */ + felem_square(tmp, ftmp3); felem_reduce(ftmp4, tmp); /* 2^97 - 2 */ + for (i = 0; i < 23; ++i) /* 2^120 - 2^24 */ + { + felem_square(tmp, ftmp4); felem_reduce(ftmp4, tmp); + } + felem_mul(tmp, ftmp2, ftmp4); felem_reduce(ftmp2, tmp); /* 2^120 - 1 */ + for (i = 0; i < 6; ++i) /* 2^126 - 2^6 */ + { + felem_square(tmp, ftmp2); felem_reduce(ftmp2, tmp); + } + felem_mul(tmp, ftmp2, ftmp); felem_reduce(ftmp, tmp); /* 2^126 - 1 */ + felem_square(tmp, ftmp); felem_reduce(ftmp, tmp); /* 2^127 - 2 */ + felem_mul(tmp, ftmp, in); felem_reduce(ftmp, tmp); /* 2^127 - 1 */ + for (i = 0; i < 97; ++i) /* 2^224 - 2^97 */ + { + felem_square(tmp, ftmp); felem_reduce(ftmp, tmp); + } + felem_mul(tmp, ftmp, ftmp3); felem_reduce(out, tmp); /* 2^224 - 2^96 - 1 */ + } + +/* Copy in constant time: + * if icopy == 1, copy in to out, + * if icopy == 0, copy out to itself. */ +static void +copy_conditional(felem out, const felem in, limb icopy) + { + unsigned i; + /* icopy is a (64-bit) 0 or 1, so copy is either all-zero or all-one */ + const limb copy = -icopy; + for (i = 0; i < 4; ++i) + { + const limb tmp = copy & (in[i] ^ out[i]); + out[i] ^= tmp; + } + } + +/******************************************************************************/ +/* ELLIPTIC CURVE POINT OPERATIONS + * + * Points are represented in Jacobian projective coordinates: + * (X, Y, Z) corresponds to the affine point (X/Z^2, Y/Z^3), + * or to the point at infinity if Z == 0. + * + */ + +/* Double an elliptic curve point: + * (X', Y', Z') = 2 * (X, Y, Z), where + * X' = (3 * (X - Z^2) * (X + Z^2))^2 - 8 * X * Y^2 + * Y' = 3 * (X - Z^2) * (X + Z^2) * (4 * X * Y^2 - X') - 8 * Y^2 + * Z' = (Y + Z)^2 - Y^2 - Z^2 = 2 * Y * Z + * Outputs can equal corresponding inputs, i.e., x_out == x_in is allowed, + * while x_out == y_in is not (maybe this works, but it's not tested). */ +static void +point_double(felem x_out, felem y_out, felem z_out, + const felem x_in, const felem y_in, const felem z_in) + { + widefelem tmp, tmp2; + felem delta, gamma, beta, alpha, ftmp, ftmp2; + + felem_assign(ftmp, x_in); + felem_assign(ftmp2, x_in); + + /* delta = z^2 */ + felem_square(tmp, z_in); + felem_reduce(delta, tmp); + + /* gamma = y^2 */ + felem_square(tmp, y_in); + felem_reduce(gamma, tmp); + + /* beta = x*gamma */ + felem_mul(tmp, x_in, gamma); + felem_reduce(beta, tmp); + + /* alpha = 3*(x-delta)*(x+delta) */ + felem_diff(ftmp, delta); + /* ftmp[i] < 2^57 + 2^58 + 2 < 2^59 */ + felem_sum(ftmp2, delta); + /* ftmp2[i] < 2^57 + 2^57 = 2^58 */ + felem_scalar(ftmp2, 3); + /* ftmp2[i] < 3 * 2^58 < 2^60 */ + felem_mul(tmp, ftmp, ftmp2); + /* tmp[i] < 2^60 * 2^59 * 4 = 2^121 */ + felem_reduce(alpha, tmp); + + /* x' = alpha^2 - 8*beta */ + felem_square(tmp, alpha); + /* tmp[i] < 4 * 2^57 * 2^57 = 2^116 */ + felem_assign(ftmp, beta); + felem_scalar(ftmp, 8); + /* ftmp[i] < 8 * 2^57 = 2^60 */ + felem_diff_128_64(tmp, ftmp); + /* tmp[i] < 2^116 + 2^64 + 8 < 2^117 */ + felem_reduce(x_out, tmp); + + /* z' = (y + z)^2 - gamma - delta */ + felem_sum(delta, gamma); + /* delta[i] < 2^57 + 2^57 = 2^58 */ + felem_assign(ftmp, y_in); + felem_sum(ftmp, z_in); + /* ftmp[i] < 2^57 + 2^57 = 2^58 */ + felem_square(tmp, ftmp); + /* tmp[i] < 4 * 2^58 * 2^58 = 2^118 */ + felem_diff_128_64(tmp, delta); + /* tmp[i] < 2^118 + 2^64 + 8 < 2^119 */ + felem_reduce(z_out, tmp); + + /* y' = alpha*(4*beta - x') - 8*gamma^2 */ + felem_scalar(beta, 4); + /* beta[i] < 4 * 2^57 = 2^59 */ + felem_diff(beta, x_out); + /* beta[i] < 2^59 + 2^58 + 2 < 2^60 */ + felem_mul(tmp, alpha, beta); + /* tmp[i] < 4 * 2^57 * 2^60 = 2^119 */ + felem_square(tmp2, gamma); + /* tmp2[i] < 4 * 2^57 * 2^57 = 2^116 */ + widefelem_scalar(tmp2, 8); + /* tmp2[i] < 8 * 2^116 = 2^119 */ + widefelem_diff(tmp, tmp2); + /* tmp[i] < 2^119 + 2^120 < 2^121 */ + felem_reduce(y_out, tmp); + } + +/* Add two elliptic curve points: + * (X_1, Y_1, Z_1) + (X_2, Y_2, Z_2) = (X_3, Y_3, Z_3), where + * X_3 = (Z_1^3 * Y_2 - Z_2^3 * Y_1)^2 - (Z_1^2 * X_2 - Z_2^2 * X_1)^3 - + * 2 * Z_2^2 * X_1 * (Z_1^2 * X_2 - Z_2^2 * X_1)^2 + * Y_3 = (Z_1^3 * Y_2 - Z_2^3 * Y_1) * (Z_2^2 * X_1 * (Z_1^2 * X_2 - Z_2^2 * X_1)^2 - X_3) - + * Z_2^3 * Y_1 * (Z_1^2 * X_2 - Z_2^2 * X_1)^3 + * Z_3 = (Z_1^2 * X_2 - Z_2^2 * X_1) * (Z_1 * Z_2) + * + * This runs faster if 'mixed' is set, which requires Z_2 = 1 or Z_2 = 0. + */ + +/* This function is not entirely constant-time: + * it includes a branch for checking whether the two input points are equal, + * (while not equal to the point at infinity). + * This case never happens during single point multiplication, + * so there is no timing leak for ECDH or ECDSA signing. */ +static void point_add(felem x3, felem y3, felem z3, + const felem x1, const felem y1, const felem z1, + const int mixed, const felem x2, const felem y2, const felem z2) + { + felem ftmp, ftmp2, ftmp3, ftmp4, ftmp5, x_out, y_out, z_out; + widefelem tmp, tmp2; + limb z1_is_zero, z2_is_zero, x_equal, y_equal; + + if (!mixed) + { + /* ftmp2 = z2^2 */ + felem_square(tmp, z2); + felem_reduce(ftmp2, tmp); + + /* ftmp4 = z2^3 */ + felem_mul(tmp, ftmp2, z2); + felem_reduce(ftmp4, tmp); + + /* ftmp4 = z2^3*y1 */ + felem_mul(tmp2, ftmp4, y1); + felem_reduce(ftmp4, tmp2); + + /* ftmp2 = z2^2*x1 */ + felem_mul(tmp2, ftmp2, x1); + felem_reduce(ftmp2, tmp2); + } + else + { + /* We'll assume z2 = 1 (special case z2 = 0 is handled later) */ + + /* ftmp4 = z2^3*y1 */ + felem_assign(ftmp4, y1); + + /* ftmp2 = z2^2*x1 */ + felem_assign(ftmp2, x1); + } + + /* ftmp = z1^2 */ + felem_square(tmp, z1); + felem_reduce(ftmp, tmp); + + /* ftmp3 = z1^3 */ + felem_mul(tmp, ftmp, z1); + felem_reduce(ftmp3, tmp); + + /* tmp = z1^3*y2 */ + felem_mul(tmp, ftmp3, y2); + /* tmp[i] < 4 * 2^57 * 2^57 = 2^116 */ + + /* ftmp3 = z1^3*y2 - z2^3*y1 */ + felem_diff_128_64(tmp, ftmp4); + /* tmp[i] < 2^116 + 2^64 + 8 < 2^117 */ + felem_reduce(ftmp3, tmp); + + /* tmp = z1^2*x2 */ + felem_mul(tmp, ftmp, x2); + /* tmp[i] < 4 * 2^57 * 2^57 = 2^116 */ + + /* ftmp = z1^2*x2 - z2^2*x1 */ + felem_diff_128_64(tmp, ftmp2); + /* tmp[i] < 2^116 + 2^64 + 8 < 2^117 */ + felem_reduce(ftmp, tmp); + + /* the formulae are incorrect if the points are equal + * so we check for this and do doubling if this happens */ + x_equal = felem_is_zero(ftmp); + y_equal = felem_is_zero(ftmp3); + z1_is_zero = felem_is_zero(z1); + z2_is_zero = felem_is_zero(z2); + /* In affine coordinates, (X_1, Y_1) == (X_2, Y_2) */ + if (x_equal && y_equal && !z1_is_zero && !z2_is_zero) + { + point_double(x3, y3, z3, x1, y1, z1); + return; + } + + /* ftmp5 = z1*z2 */ + if (!mixed) + { + felem_mul(tmp, z1, z2); + felem_reduce(ftmp5, tmp); + } + else + { + /* special case z2 = 0 is handled later */ + felem_assign(ftmp5, z1); + } + + /* z_out = (z1^2*x2 - z2^2*x1)*(z1*z2) */ + felem_mul(tmp, ftmp, ftmp5); + felem_reduce(z_out, tmp); + + /* ftmp = (z1^2*x2 - z2^2*x1)^2 */ + felem_assign(ftmp5, ftmp); + felem_square(tmp, ftmp); + felem_reduce(ftmp, tmp); + + /* ftmp5 = (z1^2*x2 - z2^2*x1)^3 */ + felem_mul(tmp, ftmp, ftmp5); + felem_reduce(ftmp5, tmp); + + /* ftmp2 = z2^2*x1*(z1^2*x2 - z2^2*x1)^2 */ + felem_mul(tmp, ftmp2, ftmp); + felem_reduce(ftmp2, tmp); + + /* tmp = z2^3*y1*(z1^2*x2 - z2^2*x1)^3 */ + felem_mul(tmp, ftmp4, ftmp5); + /* tmp[i] < 4 * 2^57 * 2^57 = 2^116 */ + + /* tmp2 = (z1^3*y2 - z2^3*y1)^2 */ + felem_square(tmp2, ftmp3); + /* tmp2[i] < 4 * 2^57 * 2^57 < 2^116 */ + + /* tmp2 = (z1^3*y2 - z2^3*y1)^2 - (z1^2*x2 - z2^2*x1)^3 */ + felem_diff_128_64(tmp2, ftmp5); + /* tmp2[i] < 2^116 + 2^64 + 8 < 2^117 */ + + /* ftmp5 = 2*z2^2*x1*(z1^2*x2 - z2^2*x1)^2 */ + felem_assign(ftmp5, ftmp2); + felem_scalar(ftmp5, 2); + /* ftmp5[i] < 2 * 2^57 = 2^58 */ + + /* x_out = (z1^3*y2 - z2^3*y1)^2 - (z1^2*x2 - z2^2*x1)^3 - + 2*z2^2*x1*(z1^2*x2 - z2^2*x1)^2 */ + felem_diff_128_64(tmp2, ftmp5); + /* tmp2[i] < 2^117 + 2^64 + 8 < 2^118 */ + felem_reduce(x_out, tmp2); + + /* ftmp2 = z2^2*x1*(z1^2*x2 - z2^2*x1)^2 - x_out */ + felem_diff(ftmp2, x_out); + /* ftmp2[i] < 2^57 + 2^58 + 2 < 2^59 */ + + /* tmp2 = (z1^3*y2 - z2^3*y1)*(z2^2*x1*(z1^2*x2 - z2^2*x1)^2 - x_out) */ + felem_mul(tmp2, ftmp3, ftmp2); + /* tmp2[i] < 4 * 2^57 * 2^59 = 2^118 */ + + /* y_out = (z1^3*y2 - z2^3*y1)*(z2^2*x1*(z1^2*x2 - z2^2*x1)^2 - x_out) - + z2^3*y1*(z1^2*x2 - z2^2*x1)^3 */ + widefelem_diff(tmp2, tmp); + /* tmp2[i] < 2^118 + 2^120 < 2^121 */ + felem_reduce(y_out, tmp2); + + /* the result (x_out, y_out, z_out) is incorrect if one of the inputs is + * the point at infinity, so we need to check for this separately */ + + /* if point 1 is at infinity, copy point 2 to output, and vice versa */ + copy_conditional(x_out, x2, z1_is_zero); + copy_conditional(x_out, x1, z2_is_zero); + copy_conditional(y_out, y2, z1_is_zero); + copy_conditional(y_out, y1, z2_is_zero); + copy_conditional(z_out, z2, z1_is_zero); + copy_conditional(z_out, z1, z2_is_zero); + felem_assign(x3, x_out); + felem_assign(y3, y_out); + felem_assign(z3, z_out); + } + +/* select_point selects the |idx|th point from a precomputation table and + * copies it to out. */ +static void select_point(const u64 idx, unsigned int size, const felem pre_comp[/*size*/][3], felem out[3]) + { + unsigned i, j; + limb *outlimbs = &out[0][0]; + memset(outlimbs, 0, 3 * sizeof(felem)); + + for (i = 0; i < size; i++) + { + const limb *inlimbs = &pre_comp[i][0][0]; + u64 mask = i ^ idx; + mask |= mask >> 4; + mask |= mask >> 2; + mask |= mask >> 1; + mask &= 1; + mask--; + for (j = 0; j < 4 * 3; j++) + outlimbs[j] |= inlimbs[j] & mask; + } + } + +/* get_bit returns the |i|th bit in |in| */ +static char get_bit(const felem_bytearray in, unsigned i) + { + if (i >= 224) + return 0; + return (in[i >> 3] >> (i & 7)) & 1; + } + +/* Interleaved point multiplication using precomputed point multiples: + * The small point multiples 0*P, 1*P, ..., 16*P are in pre_comp[], + * the scalars in scalars[]. If g_scalar is non-NULL, we also add this multiple + * of the generator, using certain (large) precomputed multiples in g_pre_comp. + * Output point (X, Y, Z) is stored in x_out, y_out, z_out */ +static void batch_mul(felem x_out, felem y_out, felem z_out, + const felem_bytearray scalars[], const unsigned num_points, const u8 *g_scalar, + const int mixed, const felem pre_comp[][17][3], const felem g_pre_comp[2][16][3]) + { + int i, skip; + unsigned num; + unsigned gen_mul = (g_scalar != NULL); + felem nq[3], tmp[4]; + u64 bits; + u8 sign, digit; + + /* set nq to the point at infinity */ + memset(nq, 0, 3 * sizeof(felem)); + + /* Loop over all scalars msb-to-lsb, interleaving additions + * of multiples of the generator (two in each of the last 28 rounds) + * and additions of other points multiples (every 5th round). + */ + skip = 1; /* save two point operations in the first round */ + for (i = (num_points ? 220 : 27); i >= 0; --i) + { + /* double */ + if (!skip) + point_double(nq[0], nq[1], nq[2], nq[0], nq[1], nq[2]); + + /* add multiples of the generator */ + if (gen_mul && (i <= 27)) + { + /* first, look 28 bits upwards */ + bits = get_bit(g_scalar, i + 196) << 3; + bits |= get_bit(g_scalar, i + 140) << 2; + bits |= get_bit(g_scalar, i + 84) << 1; + bits |= get_bit(g_scalar, i + 28); + /* select the point to add, in constant time */ + select_point(bits, 16, g_pre_comp[1], tmp); + + if (!skip) + { + point_add(nq[0], nq[1], nq[2], + nq[0], nq[1], nq[2], + 1 /* mixed */, tmp[0], tmp[1], tmp[2]); + } + else + { + memcpy(nq, tmp, 3 * sizeof(felem)); + skip = 0; + } + + /* second, look at the current position */ + bits = get_bit(g_scalar, i + 168) << 3; + bits |= get_bit(g_scalar, i + 112) << 2; + bits |= get_bit(g_scalar, i + 56) << 1; + bits |= get_bit(g_scalar, i); + /* select the point to add, in constant time */ + select_point(bits, 16, g_pre_comp[0], tmp); + point_add(nq[0], nq[1], nq[2], + nq[0], nq[1], nq[2], + 1 /* mixed */, tmp[0], tmp[1], tmp[2]); + } + + /* do other additions every 5 doublings */ + if (num_points && (i % 5 == 0)) + { + /* loop over all scalars */ + for (num = 0; num < num_points; ++num) + { + bits = get_bit(scalars[num], i + 4) << 5; + bits |= get_bit(scalars[num], i + 3) << 4; + bits |= get_bit(scalars[num], i + 2) << 3; + bits |= get_bit(scalars[num], i + 1) << 2; + bits |= get_bit(scalars[num], i) << 1; + bits |= get_bit(scalars[num], i - 1); + ec_GFp_nistp_recode_scalar_bits(&sign, &digit, bits); + + /* select the point to add or subtract */ + select_point(digit, 17, pre_comp[num], tmp); + felem_neg(tmp[3], tmp[1]); /* (X, -Y, Z) is the negative point */ + copy_conditional(tmp[1], tmp[3], sign); + + if (!skip) + { + point_add(nq[0], nq[1], nq[2], + nq[0], nq[1], nq[2], + mixed, tmp[0], tmp[1], tmp[2]); + } + else + { + memcpy(nq, tmp, 3 * sizeof(felem)); + skip = 0; + } + } + } + } + felem_assign(x_out, nq[0]); + felem_assign(y_out, nq[1]); + felem_assign(z_out, nq[2]); + } + +/******************************************************************************/ +/* FUNCTIONS TO MANAGE PRECOMPUTATION + */ + +static NISTP224_PRE_COMP *nistp224_pre_comp_new() + { + NISTP224_PRE_COMP *ret = NULL; + ret = (NISTP224_PRE_COMP *) OPENSSL_malloc(sizeof *ret); + if (!ret) + { + ECerr(EC_F_NISTP224_PRE_COMP_NEW, ERR_R_MALLOC_FAILURE); + return ret; + } + memset(ret->g_pre_comp, 0, sizeof(ret->g_pre_comp)); + ret->references = 1; + return ret; + } + +static void *nistp224_pre_comp_dup(void *src_) + { + NISTP224_PRE_COMP *src = src_; + + /* no need to actually copy, these objects never change! */ + CRYPTO_add(&src->references, 1, CRYPTO_LOCK_EC_PRE_COMP); + + return src_; + } + +static void nistp224_pre_comp_free(void *pre_) + { + int i; + NISTP224_PRE_COMP *pre = pre_; + + if (!pre) + return; + + i = CRYPTO_add(&pre->references, -1, CRYPTO_LOCK_EC_PRE_COMP); + if (i > 0) + return; + + OPENSSL_free(pre); + } + +static void nistp224_pre_comp_clear_free(void *pre_) + { + int i; + NISTP224_PRE_COMP *pre = pre_; + + if (!pre) + return; + + i = CRYPTO_add(&pre->references, -1, CRYPTO_LOCK_EC_PRE_COMP); + if (i > 0) + return; + + OPENSSL_cleanse(pre, sizeof *pre); + OPENSSL_free(pre); + } + +/******************************************************************************/ +/* OPENSSL EC_METHOD FUNCTIONS + */ + +int ec_GFp_nistp224_group_init(EC_GROUP *group) + { + int ret; + ret = ec_GFp_simple_group_init(group); + group->a_is_minus3 = 1; + return ret; + } + +int ec_GFp_nistp224_group_set_curve(EC_GROUP *group, const BIGNUM *p, + const BIGNUM *a, const BIGNUM *b, BN_CTX *ctx) + { + int ret = 0; + BN_CTX *new_ctx = NULL; + BIGNUM *curve_p, *curve_a, *curve_b; + + if (ctx == NULL) + if ((ctx = new_ctx = BN_CTX_new()) == NULL) return 0; + BN_CTX_start(ctx); + if (((curve_p = BN_CTX_get(ctx)) == NULL) || + ((curve_a = BN_CTX_get(ctx)) == NULL) || + ((curve_b = BN_CTX_get(ctx)) == NULL)) goto err; + BN_bin2bn(nistp224_curve_params[0], sizeof(felem_bytearray), curve_p); + BN_bin2bn(nistp224_curve_params[1], sizeof(felem_bytearray), curve_a); + BN_bin2bn(nistp224_curve_params[2], sizeof(felem_bytearray), curve_b); + if ((BN_cmp(curve_p, p)) || (BN_cmp(curve_a, a)) || + (BN_cmp(curve_b, b))) + { + ECerr(EC_F_EC_GFP_NISTP224_GROUP_SET_CURVE, + EC_R_WRONG_CURVE_PARAMETERS); + goto err; + } + group->field_mod_func = BN_nist_mod_224; + ret = ec_GFp_simple_group_set_curve(group, p, a, b, ctx); +err: + BN_CTX_end(ctx); + if (new_ctx != NULL) + BN_CTX_free(new_ctx); + return ret; + } + +/* Takes the Jacobian coordinates (X, Y, Z) of a point and returns + * (X', Y') = (X/Z^2, Y/Z^3) */ +int ec_GFp_nistp224_point_get_affine_coordinates(const EC_GROUP *group, + const EC_POINT *point, BIGNUM *x, BIGNUM *y, BN_CTX *ctx) + { + felem z1, z2, x_in, y_in, x_out, y_out; + widefelem tmp; + + if (EC_POINT_is_at_infinity(group, point)) + { + ECerr(EC_F_EC_GFP_NISTP224_POINT_GET_AFFINE_COORDINATES, + EC_R_POINT_AT_INFINITY); + return 0; + } + if ((!BN_to_felem(x_in, &point->X)) || (!BN_to_felem(y_in, &point->Y)) || + (!BN_to_felem(z1, &point->Z))) return 0; + felem_inv(z2, z1); + felem_square(tmp, z2); felem_reduce(z1, tmp); + felem_mul(tmp, x_in, z1); felem_reduce(x_in, tmp); + felem_contract(x_out, x_in); + if (x != NULL) + { + if (!felem_to_BN(x, x_out)) { + ECerr(EC_F_EC_GFP_NISTP224_POINT_GET_AFFINE_COORDINATES, + ERR_R_BN_LIB); + return 0; + } + } + felem_mul(tmp, z1, z2); felem_reduce(z1, tmp); + felem_mul(tmp, y_in, z1); felem_reduce(y_in, tmp); + felem_contract(y_out, y_in); + if (y != NULL) + { + if (!felem_to_BN(y, y_out)) { + ECerr(EC_F_EC_GFP_NISTP224_POINT_GET_AFFINE_COORDINATES, + ERR_R_BN_LIB); + return 0; + } + } + return 1; + } + +static void make_points_affine(size_t num, felem points[/*num*/][3], felem tmp_felems[/*num+1*/]) + { + /* Runs in constant time, unless an input is the point at infinity + * (which normally shouldn't happen). */ + ec_GFp_nistp_points_make_affine_internal( + num, + points, + sizeof(felem), + tmp_felems, + (void (*)(void *)) felem_one, + (int (*)(const void *)) felem_is_zero_int, + (void (*)(void *, const void *)) felem_assign, + (void (*)(void *, const void *)) felem_square_reduce, + (void (*)(void *, const void *, const void *)) felem_mul_reduce, + (void (*)(void *, const void *)) felem_inv, + (void (*)(void *, const void *)) felem_contract); + } + +/* Computes scalar*generator + \sum scalars[i]*points[i], ignoring NULL values + * Result is stored in r (r can equal one of the inputs). */ +int ec_GFp_nistp224_points_mul(const EC_GROUP *group, EC_POINT *r, + const BIGNUM *scalar, size_t num, const EC_POINT *points[], + const BIGNUM *scalars[], BN_CTX *ctx) + { + int ret = 0; + int j; + unsigned i; + int mixed = 0; + BN_CTX *new_ctx = NULL; + BIGNUM *x, *y, *z, *tmp_scalar; + felem_bytearray g_secret; + felem_bytearray *secrets = NULL; + felem (*pre_comp)[17][3] = NULL; + felem *tmp_felems = NULL; + felem_bytearray tmp; + unsigned num_bytes; + int have_pre_comp = 0; + size_t num_points = num; + felem x_in, y_in, z_in, x_out, y_out, z_out; + NISTP224_PRE_COMP *pre = NULL; + const felem (*g_pre_comp)[16][3] = NULL; + EC_POINT *generator = NULL; + const EC_POINT *p = NULL; + const BIGNUM *p_scalar = NULL; + + if (ctx == NULL) + if ((ctx = new_ctx = BN_CTX_new()) == NULL) return 0; + BN_CTX_start(ctx); + if (((x = BN_CTX_get(ctx)) == NULL) || + ((y = BN_CTX_get(ctx)) == NULL) || + ((z = BN_CTX_get(ctx)) == NULL) || + ((tmp_scalar = BN_CTX_get(ctx)) == NULL)) + goto err; + + if (scalar != NULL) + { + pre = EC_EX_DATA_get_data(group->extra_data, + nistp224_pre_comp_dup, nistp224_pre_comp_free, + nistp224_pre_comp_clear_free); + if (pre) + /* we have precomputation, try to use it */ + g_pre_comp = (const felem (*)[16][3]) pre->g_pre_comp; + else + /* try to use the standard precomputation */ + g_pre_comp = &gmul[0]; + generator = EC_POINT_new(group); + if (generator == NULL) + goto err; + /* get the generator from precomputation */ + if (!felem_to_BN(x, g_pre_comp[0][1][0]) || + !felem_to_BN(y, g_pre_comp[0][1][1]) || + !felem_to_BN(z, g_pre_comp[0][1][2])) + { + ECerr(EC_F_EC_GFP_NISTP224_POINTS_MUL, ERR_R_BN_LIB); + goto err; + } + if (!EC_POINT_set_Jprojective_coordinates_GFp(group, + generator, x, y, z, ctx)) + goto err; + if (0 == EC_POINT_cmp(group, generator, group->generator, ctx)) + /* precomputation matches generator */ + have_pre_comp = 1; + else + /* we don't have valid precomputation: + * treat the generator as a random point */ + num_points = num_points + 1; + } + + if (num_points > 0) + { + if (num_points >= 3) + { + /* unless we precompute multiples for just one or two points, + * converting those into affine form is time well spent */ + mixed = 1; + } + secrets = OPENSSL_malloc(num_points * sizeof(felem_bytearray)); + pre_comp = OPENSSL_malloc(num_points * 17 * 3 * sizeof(felem)); + if (mixed) + tmp_felems = OPENSSL_malloc((num_points * 17 + 1) * sizeof(felem)); + if ((secrets == NULL) || (pre_comp == NULL) || (mixed && (tmp_felems == NULL))) + { + ECerr(EC_F_EC_GFP_NISTP224_POINTS_MUL, ERR_R_MALLOC_FAILURE); + goto err; + } + + /* we treat NULL scalars as 0, and NULL points as points at infinity, + * i.e., they contribute nothing to the linear combination */ + memset(secrets, 0, num_points * sizeof(felem_bytearray)); + memset(pre_comp, 0, num_points * 17 * 3 * sizeof(felem)); + for (i = 0; i < num_points; ++i) + { + if (i == num) + /* the generator */ + { + p = EC_GROUP_get0_generator(group); + p_scalar = scalar; + } + else + /* the i^th point */ + { + p = points[i]; + p_scalar = scalars[i]; + } + if ((p_scalar != NULL) && (p != NULL)) + { + /* reduce scalar to 0 <= scalar < 2^224 */ + if ((BN_num_bits(p_scalar) > 224) || (BN_is_negative(p_scalar))) + { + /* this is an unusual input, and we don't guarantee + * constant-timeness */ + if (!BN_nnmod(tmp_scalar, p_scalar, &group->order, ctx)) + { + ECerr(EC_F_EC_GFP_NISTP224_POINTS_MUL, ERR_R_BN_LIB); + goto err; + } + num_bytes = BN_bn2bin(tmp_scalar, tmp); + } + else + num_bytes = BN_bn2bin(p_scalar, tmp); + flip_endian(secrets[i], tmp, num_bytes); + /* precompute multiples */ + if ((!BN_to_felem(x_out, &p->X)) || + (!BN_to_felem(y_out, &p->Y)) || + (!BN_to_felem(z_out, &p->Z))) goto err; + felem_assign(pre_comp[i][1][0], x_out); + felem_assign(pre_comp[i][1][1], y_out); + felem_assign(pre_comp[i][1][2], z_out); + for (j = 2; j <= 16; ++j) + { + if (j & 1) + { + point_add( + pre_comp[i][j][0], pre_comp[i][j][1], pre_comp[i][j][2], + pre_comp[i][1][0], pre_comp[i][1][1], pre_comp[i][1][2], + 0, pre_comp[i][j-1][0], pre_comp[i][j-1][1], pre_comp[i][j-1][2]); + } + else + { + point_double( + pre_comp[i][j][0], pre_comp[i][j][1], pre_comp[i][j][2], + pre_comp[i][j/2][0], pre_comp[i][j/2][1], pre_comp[i][j/2][2]); + } + } + } + } + if (mixed) + make_points_affine(num_points * 17, pre_comp[0], tmp_felems); + } + + /* the scalar for the generator */ + if ((scalar != NULL) && (have_pre_comp)) + { + memset(g_secret, 0, sizeof g_secret); + /* reduce scalar to 0 <= scalar < 2^224 */ + if ((BN_num_bits(scalar) > 224) || (BN_is_negative(scalar))) + { + /* this is an unusual input, and we don't guarantee + * constant-timeness */ + if (!BN_nnmod(tmp_scalar, scalar, &group->order, ctx)) + { + ECerr(EC_F_EC_GFP_NISTP224_POINTS_MUL, ERR_R_BN_LIB); + goto err; + } + num_bytes = BN_bn2bin(tmp_scalar, tmp); + } + else + num_bytes = BN_bn2bin(scalar, tmp); + flip_endian(g_secret, tmp, num_bytes); + /* do the multiplication with generator precomputation*/ + batch_mul(x_out, y_out, z_out, + (const felem_bytearray (*)) secrets, num_points, + g_secret, + mixed, (const felem (*)[17][3]) pre_comp, + g_pre_comp); + } + else + /* do the multiplication without generator precomputation */ + batch_mul(x_out, y_out, z_out, + (const felem_bytearray (*)) secrets, num_points, + NULL, mixed, (const felem (*)[17][3]) pre_comp, NULL); + /* reduce the output to its unique minimal representation */ + felem_contract(x_in, x_out); + felem_contract(y_in, y_out); + felem_contract(z_in, z_out); + if ((!felem_to_BN(x, x_in)) || (!felem_to_BN(y, y_in)) || + (!felem_to_BN(z, z_in))) + { + ECerr(EC_F_EC_GFP_NISTP224_POINTS_MUL, ERR_R_BN_LIB); + goto err; + } + ret = EC_POINT_set_Jprojective_coordinates_GFp(group, r, x, y, z, ctx); + +err: + BN_CTX_end(ctx); + if (generator != NULL) + EC_POINT_free(generator); + if (new_ctx != NULL) + BN_CTX_free(new_ctx); + if (secrets != NULL) + OPENSSL_free(secrets); + if (pre_comp != NULL) + OPENSSL_free(pre_comp); + if (tmp_felems != NULL) + OPENSSL_free(tmp_felems); + return ret; + } + +int ec_GFp_nistp224_precompute_mult(EC_GROUP *group, BN_CTX *ctx) + { + int ret = 0; + NISTP224_PRE_COMP *pre = NULL; + int i, j; + BN_CTX *new_ctx = NULL; + BIGNUM *x, *y; + EC_POINT *generator = NULL; + felem tmp_felems[32]; + + /* throw away old precomputation */ + EC_EX_DATA_free_data(&group->extra_data, nistp224_pre_comp_dup, + nistp224_pre_comp_free, nistp224_pre_comp_clear_free); + if (ctx == NULL) + if ((ctx = new_ctx = BN_CTX_new()) == NULL) return 0; + BN_CTX_start(ctx); + if (((x = BN_CTX_get(ctx)) == NULL) || + ((y = BN_CTX_get(ctx)) == NULL)) + goto err; + /* get the generator */ + if (group->generator == NULL) goto err; + generator = EC_POINT_new(group); + if (generator == NULL) + goto err; + BN_bin2bn(nistp224_curve_params[3], sizeof (felem_bytearray), x); + BN_bin2bn(nistp224_curve_params[4], sizeof (felem_bytearray), y); + if (!EC_POINT_set_affine_coordinates_GFp(group, generator, x, y, ctx)) + goto err; + if ((pre = nistp224_pre_comp_new()) == NULL) + goto err; + /* if the generator is the standard one, use built-in precomputation */ + if (0 == EC_POINT_cmp(group, generator, group->generator, ctx)) + { + memcpy(pre->g_pre_comp, gmul, sizeof(pre->g_pre_comp)); + ret = 1; + goto err; + } + if ((!BN_to_felem(pre->g_pre_comp[0][1][0], &group->generator->X)) || + (!BN_to_felem(pre->g_pre_comp[0][1][1], &group->generator->Y)) || + (!BN_to_felem(pre->g_pre_comp[0][1][2], &group->generator->Z))) + goto err; + /* compute 2^56*G, 2^112*G, 2^168*G for the first table, + * 2^28*G, 2^84*G, 2^140*G, 2^196*G for the second one + */ + for (i = 1; i <= 8; i <<= 1) + { + point_double( + pre->g_pre_comp[1][i][0], pre->g_pre_comp[1][i][1], pre->g_pre_comp[1][i][2], + pre->g_pre_comp[0][i][0], pre->g_pre_comp[0][i][1], pre->g_pre_comp[0][i][2]); + for (j = 0; j < 27; ++j) + { + point_double( + pre->g_pre_comp[1][i][0], pre->g_pre_comp[1][i][1], pre->g_pre_comp[1][i][2], + pre->g_pre_comp[1][i][0], pre->g_pre_comp[1][i][1], pre->g_pre_comp[1][i][2]); + } + if (i == 8) + break; + point_double( + pre->g_pre_comp[0][2*i][0], pre->g_pre_comp[0][2*i][1], pre->g_pre_comp[0][2*i][2], + pre->g_pre_comp[1][i][0], pre->g_pre_comp[1][i][1], pre->g_pre_comp[1][i][2]); + for (j = 0; j < 27; ++j) + { + point_double( + pre->g_pre_comp[0][2*i][0], pre->g_pre_comp[0][2*i][1], pre->g_pre_comp[0][2*i][2], + pre->g_pre_comp[0][2*i][0], pre->g_pre_comp[0][2*i][1], pre->g_pre_comp[0][2*i][2]); + } + } + for (i = 0; i < 2; i++) + { + /* g_pre_comp[i][0] is the point at infinity */ + memset(pre->g_pre_comp[i][0], 0, sizeof(pre->g_pre_comp[i][0])); + /* the remaining multiples */ + /* 2^56*G + 2^112*G resp. 2^84*G + 2^140*G */ + point_add( + pre->g_pre_comp[i][6][0], pre->g_pre_comp[i][6][1], + pre->g_pre_comp[i][6][2], pre->g_pre_comp[i][4][0], + pre->g_pre_comp[i][4][1], pre->g_pre_comp[i][4][2], + 0, pre->g_pre_comp[i][2][0], pre->g_pre_comp[i][2][1], + pre->g_pre_comp[i][2][2]); + /* 2^56*G + 2^168*G resp. 2^84*G + 2^196*G */ + point_add( + pre->g_pre_comp[i][10][0], pre->g_pre_comp[i][10][1], + pre->g_pre_comp[i][10][2], pre->g_pre_comp[i][8][0], + pre->g_pre_comp[i][8][1], pre->g_pre_comp[i][8][2], + 0, pre->g_pre_comp[i][2][0], pre->g_pre_comp[i][2][1], + pre->g_pre_comp[i][2][2]); + /* 2^112*G + 2^168*G resp. 2^140*G + 2^196*G */ + point_add( + pre->g_pre_comp[i][12][0], pre->g_pre_comp[i][12][1], + pre->g_pre_comp[i][12][2], pre->g_pre_comp[i][8][0], + pre->g_pre_comp[i][8][1], pre->g_pre_comp[i][8][2], + 0, pre->g_pre_comp[i][4][0], pre->g_pre_comp[i][4][1], + pre->g_pre_comp[i][4][2]); + /* 2^56*G + 2^112*G + 2^168*G resp. 2^84*G + 2^140*G + 2^196*G */ + point_add( + pre->g_pre_comp[i][14][0], pre->g_pre_comp[i][14][1], + pre->g_pre_comp[i][14][2], pre->g_pre_comp[i][12][0], + pre->g_pre_comp[i][12][1], pre->g_pre_comp[i][12][2], + 0, pre->g_pre_comp[i][2][0], pre->g_pre_comp[i][2][1], + pre->g_pre_comp[i][2][2]); + for (j = 1; j < 8; ++j) + { + /* odd multiples: add G resp. 2^28*G */ + point_add( + pre->g_pre_comp[i][2*j+1][0], pre->g_pre_comp[i][2*j+1][1], + pre->g_pre_comp[i][2*j+1][2], pre->g_pre_comp[i][2*j][0], + pre->g_pre_comp[i][2*j][1], pre->g_pre_comp[i][2*j][2], + 0, pre->g_pre_comp[i][1][0], pre->g_pre_comp[i][1][1], + pre->g_pre_comp[i][1][2]); + } + } + make_points_affine(31, &(pre->g_pre_comp[0][1]), tmp_felems); + + if (!EC_EX_DATA_set_data(&group->extra_data, pre, nistp224_pre_comp_dup, + nistp224_pre_comp_free, nistp224_pre_comp_clear_free)) + goto err; + ret = 1; + pre = NULL; + err: + BN_CTX_end(ctx); + if (generator != NULL) + EC_POINT_free(generator); + if (new_ctx != NULL) + BN_CTX_free(new_ctx); + if (pre) + nistp224_pre_comp_free(pre); + return ret; + } + +int ec_GFp_nistp224_have_precompute_mult(const EC_GROUP *group) + { + if (EC_EX_DATA_get_data(group->extra_data, nistp224_pre_comp_dup, + nistp224_pre_comp_free, nistp224_pre_comp_clear_free) + != NULL) + return 1; + else + return 0; + } + +#else +static void *dummy=&dummy; +#endif diff --git a/crypto/openssl/crypto/ec/ecp_nistp256.c b/crypto/openssl/crypto/ec/ecp_nistp256.c new file mode 100644 index 0000000000..4bc0f5dce0 --- /dev/null +++ b/crypto/openssl/crypto/ec/ecp_nistp256.c @@ -0,0 +1,2171 @@ +/* crypto/ec/ecp_nistp256.c */ +/* + * Written by Adam Langley (Google) for the OpenSSL project + */ +/* Copyright 2011 Google Inc. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +/* + * A 64-bit implementation of the NIST P-256 elliptic curve point multiplication + * + * OpenSSL integration was taken from Emilia Kasper's work in ecp_nistp224.c. + * Otherwise based on Emilia's P224 work, which was inspired by my curve25519 + * work which got its smarts from Daniel J. Bernstein's work on the same. + */ + +#include +#ifndef OPENSSL_NO_EC_NISTP_64_GCC_128 + +#ifndef OPENSSL_SYS_VMS +#include +#else +#include +#endif + +#include +#include +#include "ec_lcl.h" + +#if defined(__GNUC__) && (__GNUC__ > 3 || (__GNUC__ == 3 && __GNUC_MINOR__ >= 1)) + /* even with gcc, the typedef won't work for 32-bit platforms */ + typedef __uint128_t uint128_t; /* nonstandard; implemented by gcc on 64-bit platforms */ + typedef __int128_t int128_t; +#else + #error "Need GCC 3.1 or later to define type uint128_t" +#endif + +typedef uint8_t u8; +typedef uint32_t u32; +typedef uint64_t u64; +typedef int64_t s64; + +/* The underlying field. + * + * P256 operates over GF(2^256-2^224+2^192+2^96-1). We can serialise an element + * of this field into 32 bytes. We call this an felem_bytearray. */ + +typedef u8 felem_bytearray[32]; + +/* These are the parameters of P256, taken from FIPS 186-3, page 86. These + * values are big-endian. */ +static const felem_bytearray nistp256_curve_params[5] = { + {0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x01, /* p */ + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff}, + {0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x01, /* a = -3 */ + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xfc}, /* b */ + {0x5a, 0xc6, 0x35, 0xd8, 0xaa, 0x3a, 0x93, 0xe7, + 0xb3, 0xeb, 0xbd, 0x55, 0x76, 0x98, 0x86, 0xbc, + 0x65, 0x1d, 0x06, 0xb0, 0xcc, 0x53, 0xb0, 0xf6, + 0x3b, 0xce, 0x3c, 0x3e, 0x27, 0xd2, 0x60, 0x4b}, + {0x6b, 0x17, 0xd1, 0xf2, 0xe1, 0x2c, 0x42, 0x47, /* x */ + 0xf8, 0xbc, 0xe6, 0xe5, 0x63, 0xa4, 0x40, 0xf2, + 0x77, 0x03, 0x7d, 0x81, 0x2d, 0xeb, 0x33, 0xa0, + 0xf4, 0xa1, 0x39, 0x45, 0xd8, 0x98, 0xc2, 0x96}, + {0x4f, 0xe3, 0x42, 0xe2, 0xfe, 0x1a, 0x7f, 0x9b, /* y */ + 0x8e, 0xe7, 0xeb, 0x4a, 0x7c, 0x0f, 0x9e, 0x16, + 0x2b, 0xce, 0x33, 0x57, 0x6b, 0x31, 0x5e, 0xce, + 0xcb, 0xb6, 0x40, 0x68, 0x37, 0xbf, 0x51, 0xf5} +}; + +/* The representation of field elements. + * ------------------------------------ + * + * We represent field elements with either four 128-bit values, eight 128-bit + * values, or four 64-bit values. The field element represented is: + * v[0]*2^0 + v[1]*2^64 + v[2]*2^128 + v[3]*2^192 (mod p) + * or: + * v[0]*2^0 + v[1]*2^64 + v[2]*2^128 + ... + v[8]*2^512 (mod p) + * + * 128-bit values are called 'limbs'. Since the limbs are spaced only 64 bits + * apart, but are 128-bits wide, the most significant bits of each limb overlap + * with the least significant bits of the next. + * + * A field element with four limbs is an 'felem'. One with eight limbs is a + * 'longfelem' + * + * A field element with four, 64-bit values is called a 'smallfelem'. Small + * values are used as intermediate values before multiplication. + */ + +#define NLIMBS 4 + +typedef uint128_t limb; +typedef limb felem[NLIMBS]; +typedef limb longfelem[NLIMBS * 2]; +typedef u64 smallfelem[NLIMBS]; + +/* This is the value of the prime as four 64-bit words, little-endian. */ +static const u64 kPrime[4] = { 0xfffffffffffffffful, 0xffffffff, 0, 0xffffffff00000001ul }; +static const limb bottom32bits = 0xffffffff; +static const u64 bottom63bits = 0x7ffffffffffffffful; + +/* bin32_to_felem takes a little-endian byte array and converts it into felem + * form. This assumes that the CPU is little-endian. */ +static void bin32_to_felem(felem out, const u8 in[32]) + { + out[0] = *((u64*) &in[0]); + out[1] = *((u64*) &in[8]); + out[2] = *((u64*) &in[16]); + out[3] = *((u64*) &in[24]); + } + +/* smallfelem_to_bin32 takes a smallfelem and serialises into a little endian, + * 32 byte array. This assumes that the CPU is little-endian. */ +static void smallfelem_to_bin32(u8 out[32], const smallfelem in) + { + *((u64*) &out[0]) = in[0]; + *((u64*) &out[8]) = in[1]; + *((u64*) &out[16]) = in[2]; + *((u64*) &out[24]) = in[3]; + } + +/* To preserve endianness when using BN_bn2bin and BN_bin2bn */ +static void flip_endian(u8 *out, const u8 *in, unsigned len) + { + unsigned i; + for (i = 0; i < len; ++i) + out[i] = in[len-1-i]; + } + +/* BN_to_felem converts an OpenSSL BIGNUM into an felem */ +static int BN_to_felem(felem out, const BIGNUM *bn) + { + felem_bytearray b_in; + felem_bytearray b_out; + unsigned num_bytes; + + /* BN_bn2bin eats leading zeroes */ + memset(b_out, 0, sizeof b_out); + num_bytes = BN_num_bytes(bn); + if (num_bytes > sizeof b_out) + { + ECerr(EC_F_BN_TO_FELEM, EC_R_BIGNUM_OUT_OF_RANGE); + return 0; + } + if (BN_is_negative(bn)) + { + ECerr(EC_F_BN_TO_FELEM, EC_R_BIGNUM_OUT_OF_RANGE); + return 0; + } + num_bytes = BN_bn2bin(bn, b_in); + flip_endian(b_out, b_in, num_bytes); + bin32_to_felem(out, b_out); + return 1; + } + +/* felem_to_BN converts an felem into an OpenSSL BIGNUM */ +static BIGNUM *smallfelem_to_BN(BIGNUM *out, const smallfelem in) + { + felem_bytearray b_in, b_out; + smallfelem_to_bin32(b_in, in); + flip_endian(b_out, b_in, sizeof b_out); + return BN_bin2bn(b_out, sizeof b_out, out); + } + + +/* Field operations + * ---------------- */ + +static void smallfelem_one(smallfelem out) + { + out[0] = 1; + out[1] = 0; + out[2] = 0; + out[3] = 0; + } + +static void smallfelem_assign(smallfelem out, const smallfelem in) + { + out[0] = in[0]; + out[1] = in[1]; + out[2] = in[2]; + out[3] = in[3]; + } + +static void felem_assign(felem out, const felem in) + { + out[0] = in[0]; + out[1] = in[1]; + out[2] = in[2]; + out[3] = in[3]; + } + +/* felem_sum sets out = out + in. */ +static void felem_sum(felem out, const felem in) + { + out[0] += in[0]; + out[1] += in[1]; + out[2] += in[2]; + out[3] += in[3]; + } + +/* felem_small_sum sets out = out + in. */ +static void felem_small_sum(felem out, const smallfelem in) + { + out[0] += in[0]; + out[1] += in[1]; + out[2] += in[2]; + out[3] += in[3]; + } + +/* felem_scalar sets out = out * scalar */ +static void felem_scalar(felem out, const u64 scalar) + { + out[0] *= scalar; + out[1] *= scalar; + out[2] *= scalar; + out[3] *= scalar; + } + +/* longfelem_scalar sets out = out * scalar */ +static void longfelem_scalar(longfelem out, const u64 scalar) + { + out[0] *= scalar; + out[1] *= scalar; + out[2] *= scalar; + out[3] *= scalar; + out[4] *= scalar; + out[5] *= scalar; + out[6] *= scalar; + out[7] *= scalar; + } + +#define two105m41m9 (((limb)1) << 105) - (((limb)1) << 41) - (((limb)1) << 9) +#define two105 (((limb)1) << 105) +#define two105m41p9 (((limb)1) << 105) - (((limb)1) << 41) + (((limb)1) << 9) + +/* zero105 is 0 mod p */ +static const felem zero105 = { two105m41m9, two105, two105m41p9, two105m41p9 }; + +/* smallfelem_neg sets |out| to |-small| + * On exit: + * out[i] < out[i] + 2^105 + */ +static void smallfelem_neg(felem out, const smallfelem small) + { + /* In order to prevent underflow, we subtract from 0 mod p. */ + out[0] = zero105[0] - small[0]; + out[1] = zero105[1] - small[1]; + out[2] = zero105[2] - small[2]; + out[3] = zero105[3] - small[3]; + } + +/* felem_diff subtracts |in| from |out| + * On entry: + * in[i] < 2^104 + * On exit: + * out[i] < out[i] + 2^105 + */ +static void felem_diff(felem out, const felem in) + { + /* In order to prevent underflow, we add 0 mod p before subtracting. */ + out[0] += zero105[0]; + out[1] += zero105[1]; + out[2] += zero105[2]; + out[3] += zero105[3]; + + out[0] -= in[0]; + out[1] -= in[1]; + out[2] -= in[2]; + out[3] -= in[3]; + } + +#define two107m43m11 (((limb)1) << 107) - (((limb)1) << 43) - (((limb)1) << 11) +#define two107 (((limb)1) << 107) +#define two107m43p11 (((limb)1) << 107) - (((limb)1) << 43) + (((limb)1) << 11) + +/* zero107 is 0 mod p */ +static const felem zero107 = { two107m43m11, two107, two107m43p11, two107m43p11 }; + +/* An alternative felem_diff for larger inputs |in| + * felem_diff_zero107 subtracts |in| from |out| + * On entry: + * in[i] < 2^106 + * On exit: + * out[i] < out[i] + 2^107 + */ +static void felem_diff_zero107(felem out, const felem in) + { + /* In order to prevent underflow, we add 0 mod p before subtracting. */ + out[0] += zero107[0]; + out[1] += zero107[1]; + out[2] += zero107[2]; + out[3] += zero107[3]; + + out[0] -= in[0]; + out[1] -= in[1]; + out[2] -= in[2]; + out[3] -= in[3]; + } + +/* longfelem_diff subtracts |in| from |out| + * On entry: + * in[i] < 7*2^67 + * On exit: + * out[i] < out[i] + 2^70 + 2^40 + */ +static void longfelem_diff(longfelem out, const longfelem in) + { + static const limb two70m8p6 = (((limb)1) << 70) - (((limb)1) << 8) + (((limb)1) << 6); + static const limb two70p40 = (((limb)1) << 70) + (((limb)1) << 40); + static const limb two70 = (((limb)1) << 70); + static const limb two70m40m38p6 = (((limb)1) << 70) - (((limb)1) << 40) - (((limb)1) << 38) + (((limb)1) << 6); + static const limb two70m6 = (((limb)1) << 70) - (((limb)1) << 6); + + /* add 0 mod p to avoid underflow */ + out[0] += two70m8p6; + out[1] += two70p40; + out[2] += two70; + out[3] += two70m40m38p6; + out[4] += two70m6; + out[5] += two70m6; + out[6] += two70m6; + out[7] += two70m6; + + /* in[i] < 7*2^67 < 2^70 - 2^40 - 2^38 + 2^6 */ + out[0] -= in[0]; + out[1] -= in[1]; + out[2] -= in[2]; + out[3] -= in[3]; + out[4] -= in[4]; + out[5] -= in[5]; + out[6] -= in[6]; + out[7] -= in[7]; + } + +#define two64m0 (((limb)1) << 64) - 1 +#define two110p32m0 (((limb)1) << 110) + (((limb)1) << 32) - 1 +#define two64m46 (((limb)1) << 64) - (((limb)1) << 46) +#define two64m32 (((limb)1) << 64) - (((limb)1) << 32) + +/* zero110 is 0 mod p */ +static const felem zero110 = { two64m0, two110p32m0, two64m46, two64m32 }; + +/* felem_shrink converts an felem into a smallfelem. The result isn't quite + * minimal as the value may be greater than p. + * + * On entry: + * in[i] < 2^109 + * On exit: + * out[i] < 2^64 + */ +static void felem_shrink(smallfelem out, const felem in) + { + felem tmp; + u64 a, b, mask; + s64 high, low; + static const u64 kPrime3Test = 0x7fffffff00000001ul; /* 2^63 - 2^32 + 1 */ + + /* Carry 2->3 */ + tmp[3] = zero110[3] + in[3] + ((u64) (in[2] >> 64)); + /* tmp[3] < 2^110 */ + + tmp[2] = zero110[2] + (u64) in[2]; + tmp[0] = zero110[0] + in[0]; + tmp[1] = zero110[1] + in[1]; + /* tmp[0] < 2**110, tmp[1] < 2^111, tmp[2] < 2**65 */ + + /* We perform two partial reductions where we eliminate the + * high-word of tmp[3]. We don't update the other words till the end. + */ + a = tmp[3] >> 64; /* a < 2^46 */ + tmp[3] = (u64) tmp[3]; + tmp[3] -= a; + tmp[3] += ((limb)a) << 32; + /* tmp[3] < 2^79 */ + + b = a; + a = tmp[3] >> 64; /* a < 2^15 */ + b += a; /* b < 2^46 + 2^15 < 2^47 */ + tmp[3] = (u64) tmp[3]; + tmp[3] -= a; + tmp[3] += ((limb)a) << 32; + /* tmp[3] < 2^64 + 2^47 */ + + /* This adjusts the other two words to complete the two partial + * reductions. */ + tmp[0] += b; + tmp[1] -= (((limb)b) << 32); + + /* In order to make space in tmp[3] for the carry from 2 -> 3, we + * conditionally subtract kPrime if tmp[3] is large enough. */ + high = tmp[3] >> 64; + /* As tmp[3] < 2^65, high is either 1 or 0 */ + high <<= 63; + high >>= 63; + /* high is: + * all ones if the high word of tmp[3] is 1 + * all zeros if the high word of tmp[3] if 0 */ + low = tmp[3]; + mask = low >> 63; + /* mask is: + * all ones if the MSB of low is 1 + * all zeros if the MSB of low if 0 */ + low &= bottom63bits; + low -= kPrime3Test; + /* if low was greater than kPrime3Test then the MSB is zero */ + low = ~low; + low >>= 63; + /* low is: + * all ones if low was > kPrime3Test + * all zeros if low was <= kPrime3Test */ + mask = (mask & low) | high; + tmp[0] -= mask & kPrime[0]; + tmp[1] -= mask & kPrime[1]; + /* kPrime[2] is zero, so omitted */ + tmp[3] -= mask & kPrime[3]; + /* tmp[3] < 2**64 - 2**32 + 1 */ + + tmp[1] += ((u64) (tmp[0] >> 64)); tmp[0] = (u64) tmp[0]; + tmp[2] += ((u64) (tmp[1] >> 64)); tmp[1] = (u64) tmp[1]; + tmp[3] += ((u64) (tmp[2] >> 64)); tmp[2] = (u64) tmp[2]; + /* tmp[i] < 2^64 */ + + out[0] = tmp[0]; + out[1] = tmp[1]; + out[2] = tmp[2]; + out[3] = tmp[3]; + } + +/* smallfelem_expand converts a smallfelem to an felem */ +static void smallfelem_expand(felem out, const smallfelem in) + { + out[0] = in[0]; + out[1] = in[1]; + out[2] = in[2]; + out[3] = in[3]; + } + +/* smallfelem_square sets |out| = |small|^2 + * On entry: + * small[i] < 2^64 + * On exit: + * out[i] < 7 * 2^64 < 2^67 + */ +static void smallfelem_square(longfelem out, const smallfelem small) + { + limb a; + u64 high, low; + + a = ((uint128_t) small[0]) * small[0]; + low = a; + high = a >> 64; + out[0] = low; + out[1] = high; + + a = ((uint128_t) small[0]) * small[1]; + low = a; + high = a >> 64; + out[1] += low; + out[1] += low; + out[2] = high; + + a = ((uint128_t) small[0]) * small[2]; + low = a; + high = a >> 64; + out[2] += low; + out[2] *= 2; + out[3] = high; + + a = ((uint128_t) small[0]) * small[3]; + low = a; + high = a >> 64; + out[3] += low; + out[4] = high; + + a = ((uint128_t) small[1]) * small[2]; + low = a; + high = a >> 64; + out[3] += low; + out[3] *= 2; + out[4] += high; + + a = ((uint128_t) small[1]) * small[1]; + low = a; + high = a >> 64; + out[2] += low; + out[3] += high; + + a = ((uint128_t) small[1]) * small[3]; + low = a; + high = a >> 64; + out[4] += low; + out[4] *= 2; + out[5] = high; + + a = ((uint128_t) small[2]) * small[3]; + low = a; + high = a >> 64; + out[5] += low; + out[5] *= 2; + out[6] = high; + out[6] += high; + + a = ((uint128_t) small[2]) * small[2]; + low = a; + high = a >> 64; + out[4] += low; + out[5] += high; + + a = ((uint128_t) small[3]) * small[3]; + low = a; + high = a >> 64; + out[6] += low; + out[7] = high; + } + +/* felem_square sets |out| = |in|^2 + * On entry: + * in[i] < 2^109 + * On exit: + * out[i] < 7 * 2^64 < 2^67 + */ +static void felem_square(longfelem out, const felem in) + { + u64 small[4]; + felem_shrink(small, in); + smallfelem_square(out, small); + } + +/* smallfelem_mul sets |out| = |small1| * |small2| + * On entry: + * small1[i] < 2^64 + * small2[i] < 2^64 + * On exit: + * out[i] < 7 * 2^64 < 2^67 + */ +static void smallfelem_mul(longfelem out, const smallfelem small1, const smallfelem small2) + { + limb a; + u64 high, low; + + a = ((uint128_t) small1[0]) * small2[0]; + low = a; + high = a >> 64; + out[0] = low; + out[1] = high; + + + a = ((uint128_t) small1[0]) * small2[1]; + low = a; + high = a >> 64; + out[1] += low; + out[2] = high; + + a = ((uint128_t) small1[1]) * small2[0]; + low = a; + high = a >> 64; + out[1] += low; + out[2] += high; + + + a = ((uint128_t) small1[0]) * small2[2]; + low = a; + high = a >> 64; + out[2] += low; + out[3] = high; + + a = ((uint128_t) small1[1]) * small2[1]; + low = a; + high = a >> 64; + out[2] += low; + out[3] += high; + + a = ((uint128_t) small1[2]) * small2[0]; + low = a; + high = a >> 64; + out[2] += low; + out[3] += high; + + + a = ((uint128_t) small1[0]) * small2[3]; + low = a; + high = a >> 64; + out[3] += low; + out[4] = high; + + a = ((uint128_t) small1[1]) * small2[2]; + low = a; + high = a >> 64; + out[3] += low; + out[4] += high; + + a = ((uint128_t) small1[2]) * small2[1]; + low = a; + high = a >> 64; + out[3] += low; + out[4] += high; + + a = ((uint128_t) small1[3]) * small2[0]; + low = a; + high = a >> 64; + out[3] += low; + out[4] += high; + + + a = ((uint128_t) small1[1]) * small2[3]; + low = a; + high = a >> 64; + out[4] += low; + out[5] = high; + + a = ((uint128_t) small1[2]) * small2[2]; + low = a; + high = a >> 64; + out[4] += low; + out[5] += high; + + a = ((uint128_t) small1[3]) * small2[1]; + low = a; + high = a >> 64; + out[4] += low; + out[5] += high; + + + a = ((uint128_t) small1[2]) * small2[3]; + low = a; + high = a >> 64; + out[5] += low; + out[6] = high; + + a = ((uint128_t) small1[3]) * small2[2]; + low = a; + high = a >> 64; + out[5] += low; + out[6] += high; + + + a = ((uint128_t) small1[3]) * small2[3]; + low = a; + high = a >> 64; + out[6] += low; + out[7] = high; + } + +/* felem_mul sets |out| = |in1| * |in2| + * On entry: + * in1[i] < 2^109 + * in2[i] < 2^109 + * On exit: + * out[i] < 7 * 2^64 < 2^67 + */ +static void felem_mul(longfelem out, const felem in1, const felem in2) + { + smallfelem small1, small2; + felem_shrink(small1, in1); + felem_shrink(small2, in2); + smallfelem_mul(out, small1, small2); + } + +/* felem_small_mul sets |out| = |small1| * |in2| + * On entry: + * small1[i] < 2^64 + * in2[i] < 2^109 + * On exit: + * out[i] < 7 * 2^64 < 2^67 + */ +static void felem_small_mul(longfelem out, const smallfelem small1, const felem in2) + { + smallfelem small2; + felem_shrink(small2, in2); + smallfelem_mul(out, small1, small2); + } + +#define two100m36m4 (((limb)1) << 100) - (((limb)1) << 36) - (((limb)1) << 4) +#define two100 (((limb)1) << 100) +#define two100m36p4 (((limb)1) << 100) - (((limb)1) << 36) + (((limb)1) << 4) +/* zero100 is 0 mod p */ +static const felem zero100 = { two100m36m4, two100, two100m36p4, two100m36p4 }; + +/* Internal function for the different flavours of felem_reduce. + * felem_reduce_ reduces the higher coefficients in[4]-in[7]. + * On entry: + * out[0] >= in[6] + 2^32*in[6] + in[7] + 2^32*in[7] + * out[1] >= in[7] + 2^32*in[4] + * out[2] >= in[5] + 2^32*in[5] + * out[3] >= in[4] + 2^32*in[5] + 2^32*in[6] + * On exit: + * out[0] <= out[0] + in[4] + 2^32*in[5] + * out[1] <= out[1] + in[5] + 2^33*in[6] + * out[2] <= out[2] + in[7] + 2*in[6] + 2^33*in[7] + * out[3] <= out[3] + 2^32*in[4] + 3*in[7] + */ +static void felem_reduce_(felem out, const longfelem in) + { + int128_t c; + /* combine common terms from below */ + c = in[4] + (in[5] << 32); + out[0] += c; + out[3] -= c; + + c = in[5] - in[7]; + out[1] += c; + out[2] -= c; + + /* the remaining terms */ + /* 256: [(0,1),(96,-1),(192,-1),(224,1)] */ + out[1] -= (in[4] << 32); + out[3] += (in[4] << 32); + + /* 320: [(32,1),(64,1),(128,-1),(160,-1),(224,-1)] */ + out[2] -= (in[5] << 32); + + /* 384: [(0,-1),(32,-1),(96,2),(128,2),(224,-1)] */ + out[0] -= in[6]; + out[0] -= (in[6] << 32); + out[1] += (in[6] << 33); + out[2] += (in[6] * 2); + out[3] -= (in[6] << 32); + + /* 448: [(0,-1),(32,-1),(64,-1),(128,1),(160,2),(192,3)] */ + out[0] -= in[7]; + out[0] -= (in[7] << 32); + out[2] += (in[7] << 33); + out[3] += (in[7] * 3); + } + +/* felem_reduce converts a longfelem into an felem. + * To be called directly after felem_square or felem_mul. + * On entry: + * in[0] < 2^64, in[1] < 3*2^64, in[2] < 5*2^64, in[3] < 7*2^64 + * in[4] < 7*2^64, in[5] < 5*2^64, in[6] < 3*2^64, in[7] < 2*64 + * On exit: + * out[i] < 2^101 + */ +static void felem_reduce(felem out, const longfelem in) + { + out[0] = zero100[0] + in[0]; + out[1] = zero100[1] + in[1]; + out[2] = zero100[2] + in[2]; + out[3] = zero100[3] + in[3]; + + felem_reduce_(out, in); + + /* out[0] > 2^100 - 2^36 - 2^4 - 3*2^64 - 3*2^96 - 2^64 - 2^96 > 0 + * out[1] > 2^100 - 2^64 - 7*2^96 > 0 + * out[2] > 2^100 - 2^36 + 2^4 - 5*2^64 - 5*2^96 > 0 + * out[3] > 2^100 - 2^36 + 2^4 - 7*2^64 - 5*2^96 - 3*2^96 > 0 + * + * out[0] < 2^100 + 2^64 + 7*2^64 + 5*2^96 < 2^101 + * out[1] < 2^100 + 3*2^64 + 5*2^64 + 3*2^97 < 2^101 + * out[2] < 2^100 + 5*2^64 + 2^64 + 3*2^65 + 2^97 < 2^101 + * out[3] < 2^100 + 7*2^64 + 7*2^96 + 3*2^64 < 2^101 + */ + } + +/* felem_reduce_zero105 converts a larger longfelem into an felem. + * On entry: + * in[0] < 2^71 + * On exit: + * out[i] < 2^106 + */ +static void felem_reduce_zero105(felem out, const longfelem in) + { + out[0] = zero105[0] + in[0]; + out[1] = zero105[1] + in[1]; + out[2] = zero105[2] + in[2]; + out[3] = zero105[3] + in[3]; + + felem_reduce_(out, in); + + /* out[0] > 2^105 - 2^41 - 2^9 - 2^71 - 2^103 - 2^71 - 2^103 > 0 + * out[1] > 2^105 - 2^71 - 2^103 > 0 + * out[2] > 2^105 - 2^41 + 2^9 - 2^71 - 2^103 > 0 + * out[3] > 2^105 - 2^41 + 2^9 - 2^71 - 2^103 - 2^103 > 0 + * + * out[0] < 2^105 + 2^71 + 2^71 + 2^103 < 2^106 + * out[1] < 2^105 + 2^71 + 2^71 + 2^103 < 2^106 + * out[2] < 2^105 + 2^71 + 2^71 + 2^71 + 2^103 < 2^106 + * out[3] < 2^105 + 2^71 + 2^103 + 2^71 < 2^106 + */ + } + +/* subtract_u64 sets *result = *result - v and *carry to one if the subtraction + * underflowed. */ +static void subtract_u64(u64* result, u64* carry, u64 v) + { + uint128_t r = *result; + r -= v; + *carry = (r >> 64) & 1; + *result = (u64) r; + } + +/* felem_contract converts |in| to its unique, minimal representation. + * On entry: + * in[i] < 2^109 + */ +static void felem_contract(smallfelem out, const felem in) + { + unsigned i; + u64 all_equal_so_far = 0, result = 0, carry; + + felem_shrink(out, in); + /* small is minimal except that the value might be > p */ + + all_equal_so_far--; + /* We are doing a constant time test if out >= kPrime. We need to + * compare each u64, from most-significant to least significant. For + * each one, if all words so far have been equal (m is all ones) then a + * non-equal result is the answer. Otherwise we continue. */ + for (i = 3; i < 4; i--) + { + u64 equal; + uint128_t a = ((uint128_t) kPrime[i]) - out[i]; + /* if out[i] > kPrime[i] then a will underflow and the high + * 64-bits will all be set. */ + result |= all_equal_so_far & ((u64) (a >> 64)); + + /* if kPrime[i] == out[i] then |equal| will be all zeros and + * the decrement will make it all ones. */ + equal = kPrime[i] ^ out[i]; + equal--; + equal &= equal << 32; + equal &= equal << 16; + equal &= equal << 8; + equal &= equal << 4; + equal &= equal << 2; + equal &= equal << 1; + equal = ((s64) equal) >> 63; + + all_equal_so_far &= equal; + } + + /* if all_equal_so_far is still all ones then the two values are equal + * and so out >= kPrime is true. */ + result |= all_equal_so_far; + + /* if out >= kPrime then we subtract kPrime. */ + subtract_u64(&out[0], &carry, result & kPrime[0]); + subtract_u64(&out[1], &carry, carry); + subtract_u64(&out[2], &carry, carry); + subtract_u64(&out[3], &carry, carry); + + subtract_u64(&out[1], &carry, result & kPrime[1]); + subtract_u64(&out[2], &carry, carry); + subtract_u64(&out[3], &carry, carry); + + subtract_u64(&out[2], &carry, result & kPrime[2]); + subtract_u64(&out[3], &carry, carry); + + subtract_u64(&out[3], &carry, result & kPrime[3]); + } + +static void smallfelem_square_contract(smallfelem out, const smallfelem in) + { + longfelem longtmp; + felem tmp; + + smallfelem_square(longtmp, in); + felem_reduce(tmp, longtmp); + felem_contract(out, tmp); + } + +static void smallfelem_mul_contract(smallfelem out, const smallfelem in1, const smallfelem in2) + { + longfelem longtmp; + felem tmp; + + smallfelem_mul(longtmp, in1, in2); + felem_reduce(tmp, longtmp); + felem_contract(out, tmp); + } + +/* felem_is_zero returns a limb with all bits set if |in| == 0 (mod p) and 0 + * otherwise. + * On entry: + * small[i] < 2^64 + */ +static limb smallfelem_is_zero(const smallfelem small) + { + limb result; + u64 is_p; + + u64 is_zero = small[0] | small[1] | small[2] | small[3]; + is_zero--; + is_zero &= is_zero << 32; + is_zero &= is_zero << 16; + is_zero &= is_zero << 8; + is_zero &= is_zero << 4; + is_zero &= is_zero << 2; + is_zero &= is_zero << 1; + is_zero = ((s64) is_zero) >> 63; + + is_p = (small[0] ^ kPrime[0]) | + (small[1] ^ kPrime[1]) | + (small[2] ^ kPrime[2]) | + (small[3] ^ kPrime[3]); + is_p--; + is_p &= is_p << 32; + is_p &= is_p << 16; + is_p &= is_p << 8; + is_p &= is_p << 4; + is_p &= is_p << 2; + is_p &= is_p << 1; + is_p = ((s64) is_p) >> 63; + + is_zero |= is_p; + + result = is_zero; + result |= ((limb) is_zero) << 64; + return result; + } + +static int smallfelem_is_zero_int(const smallfelem small) + { + return (int) (smallfelem_is_zero(small) & ((limb)1)); + } + +/* felem_inv calculates |out| = |in|^{-1} + * + * Based on Fermat's Little Theorem: + * a^p = a (mod p) + * a^{p-1} = 1 (mod p) + * a^{p-2} = a^{-1} (mod p) + */ +static void felem_inv(felem out, const felem in) + { + felem ftmp, ftmp2; + /* each e_I will hold |in|^{2^I - 1} */ + felem e2, e4, e8, e16, e32, e64; + longfelem tmp; + unsigned i; + + felem_square(tmp, in); felem_reduce(ftmp, tmp); /* 2^1 */ + felem_mul(tmp, in, ftmp); felem_reduce(ftmp, tmp); /* 2^2 - 2^0 */ + felem_assign(e2, ftmp); + felem_square(tmp, ftmp); felem_reduce(ftmp, tmp); /* 2^3 - 2^1 */ + felem_square(tmp, ftmp); felem_reduce(ftmp, tmp); /* 2^4 - 2^2 */ + felem_mul(tmp, ftmp, e2); felem_reduce(ftmp, tmp); /* 2^4 - 2^0 */ + felem_assign(e4, ftmp); + felem_square(tmp, ftmp); felem_reduce(ftmp, tmp); /* 2^5 - 2^1 */ + felem_square(tmp, ftmp); felem_reduce(ftmp, tmp); /* 2^6 - 2^2 */ + felem_square(tmp, ftmp); felem_reduce(ftmp, tmp); /* 2^7 - 2^3 */ + felem_square(tmp, ftmp); felem_reduce(ftmp, tmp); /* 2^8 - 2^4 */ + felem_mul(tmp, ftmp, e4); felem_reduce(ftmp, tmp); /* 2^8 - 2^0 */ + felem_assign(e8, ftmp); + for (i = 0; i < 8; i++) { + felem_square(tmp, ftmp); felem_reduce(ftmp, tmp); + } /* 2^16 - 2^8 */ + felem_mul(tmp, ftmp, e8); felem_reduce(ftmp, tmp); /* 2^16 - 2^0 */ + felem_assign(e16, ftmp); + for (i = 0; i < 16; i++) { + felem_square(tmp, ftmp); felem_reduce(ftmp, tmp); + } /* 2^32 - 2^16 */ + felem_mul(tmp, ftmp, e16); felem_reduce(ftmp, tmp); /* 2^32 - 2^0 */ + felem_assign(e32, ftmp); + for (i = 0; i < 32; i++) { + felem_square(tmp, ftmp); felem_reduce(ftmp, tmp); + } /* 2^64 - 2^32 */ + felem_assign(e64, ftmp); + felem_mul(tmp, ftmp, in); felem_reduce(ftmp, tmp); /* 2^64 - 2^32 + 2^0 */ + for (i = 0; i < 192; i++) { + felem_square(tmp, ftmp); felem_reduce(ftmp, tmp); + } /* 2^256 - 2^224 + 2^192 */ + + felem_mul(tmp, e64, e32); felem_reduce(ftmp2, tmp); /* 2^64 - 2^0 */ + for (i = 0; i < 16; i++) { + felem_square(tmp, ftmp2); felem_reduce(ftmp2, tmp); + } /* 2^80 - 2^16 */ + felem_mul(tmp, ftmp2, e16); felem_reduce(ftmp2, tmp); /* 2^80 - 2^0 */ + for (i = 0; i < 8; i++) { + felem_square(tmp, ftmp2); felem_reduce(ftmp2, tmp); + } /* 2^88 - 2^8 */ + felem_mul(tmp, ftmp2, e8); felem_reduce(ftmp2, tmp); /* 2^88 - 2^0 */ + for (i = 0; i < 4; i++) { + felem_square(tmp, ftmp2); felem_reduce(ftmp2, tmp); + } /* 2^92 - 2^4 */ + felem_mul(tmp, ftmp2, e4); felem_reduce(ftmp2, tmp); /* 2^92 - 2^0 */ + felem_square(tmp, ftmp2); felem_reduce(ftmp2, tmp); /* 2^93 - 2^1 */ + felem_square(tmp, ftmp2); felem_reduce(ftmp2, tmp); /* 2^94 - 2^2 */ + felem_mul(tmp, ftmp2, e2); felem_reduce(ftmp2, tmp); /* 2^94 - 2^0 */ + felem_square(tmp, ftmp2); felem_reduce(ftmp2, tmp); /* 2^95 - 2^1 */ + felem_square(tmp, ftmp2); felem_reduce(ftmp2, tmp); /* 2^96 - 2^2 */ + felem_mul(tmp, ftmp2, in); felem_reduce(ftmp2, tmp); /* 2^96 - 3 */ + + felem_mul(tmp, ftmp2, ftmp); felem_reduce(out, tmp); /* 2^256 - 2^224 + 2^192 + 2^96 - 3 */ + } + +static void smallfelem_inv_contract(smallfelem out, const smallfelem in) + { + felem tmp; + + smallfelem_expand(tmp, in); + felem_inv(tmp, tmp); + felem_contract(out, tmp); + } + +/* Group operations + * ---------------- + * + * Building on top of the field operations we have the operations on the + * elliptic curve group itself. Points on the curve are represented in Jacobian + * coordinates */ + +/* point_double calculates 2*(x_in, y_in, z_in) + * + * The method is taken from: + * http://hyperelliptic.org/EFD/g1p/auto-shortw-jacobian-3.html#doubling-dbl-2001-b + * + * Outputs can equal corresponding inputs, i.e., x_out == x_in is allowed. + * while x_out == y_in is not (maybe this works, but it's not tested). */ +static void +point_double(felem x_out, felem y_out, felem z_out, + const felem x_in, const felem y_in, const felem z_in) + { + longfelem tmp, tmp2; + felem delta, gamma, beta, alpha, ftmp, ftmp2; + smallfelem small1, small2; + + felem_assign(ftmp, x_in); + /* ftmp[i] < 2^106 */ + felem_assign(ftmp2, x_in); + /* ftmp2[i] < 2^106 */ + + /* delta = z^2 */ + felem_square(tmp, z_in); + felem_reduce(delta, tmp); + /* delta[i] < 2^101 */ + + /* gamma = y^2 */ + felem_square(tmp, y_in); + felem_reduce(gamma, tmp); + /* gamma[i] < 2^101 */ + felem_shrink(small1, gamma); + + /* beta = x*gamma */ + felem_small_mul(tmp, small1, x_in); + felem_reduce(beta, tmp); + /* beta[i] < 2^101 */ + + /* alpha = 3*(x-delta)*(x+delta) */ + felem_diff(ftmp, delta); + /* ftmp[i] < 2^105 + 2^106 < 2^107 */ + felem_sum(ftmp2, delta); + /* ftmp2[i] < 2^105 + 2^106 < 2^107 */ + felem_scalar(ftmp2, 3); + /* ftmp2[i] < 3 * 2^107 < 2^109 */ + felem_mul(tmp, ftmp, ftmp2); + felem_reduce(alpha, tmp); + /* alpha[i] < 2^101 */ + felem_shrink(small2, alpha); + + /* x' = alpha^2 - 8*beta */ + smallfelem_square(tmp, small2); + felem_reduce(x_out, tmp); + felem_assign(ftmp, beta); + felem_scalar(ftmp, 8); + /* ftmp[i] < 8 * 2^101 = 2^104 */ + felem_diff(x_out, ftmp); + /* x_out[i] < 2^105 + 2^101 < 2^106 */ + + /* z' = (y + z)^2 - gamma - delta */ + felem_sum(delta, gamma); + /* delta[i] < 2^101 + 2^101 = 2^102 */ + felem_assign(ftmp, y_in); + felem_sum(ftmp, z_in); + /* ftmp[i] < 2^106 + 2^106 = 2^107 */ + felem_square(tmp, ftmp); + felem_reduce(z_out, tmp); + felem_diff(z_out, delta); + /* z_out[i] < 2^105 + 2^101 < 2^106 */ + + /* y' = alpha*(4*beta - x') - 8*gamma^2 */ + felem_scalar(beta, 4); + /* beta[i] < 4 * 2^101 = 2^103 */ + felem_diff_zero107(beta, x_out); + /* beta[i] < 2^107 + 2^103 < 2^108 */ + felem_small_mul(tmp, small2, beta); + /* tmp[i] < 7 * 2^64 < 2^67 */ + smallfelem_square(tmp2, small1); + /* tmp2[i] < 7 * 2^64 */ + longfelem_scalar(tmp2, 8); + /* tmp2[i] < 8 * 7 * 2^64 = 7 * 2^67 */ + longfelem_diff(tmp, tmp2); + /* tmp[i] < 2^67 + 2^70 + 2^40 < 2^71 */ + felem_reduce_zero105(y_out, tmp); + /* y_out[i] < 2^106 */ + } + +/* point_double_small is the same as point_double, except that it operates on + * smallfelems */ +static void +point_double_small(smallfelem x_out, smallfelem y_out, smallfelem z_out, + const smallfelem x_in, const smallfelem y_in, const smallfelem z_in) + { + felem felem_x_out, felem_y_out, felem_z_out; + felem felem_x_in, felem_y_in, felem_z_in; + + smallfelem_expand(felem_x_in, x_in); + smallfelem_expand(felem_y_in, y_in); + smallfelem_expand(felem_z_in, z_in); + point_double(felem_x_out, felem_y_out, felem_z_out, + felem_x_in, felem_y_in, felem_z_in); + felem_shrink(x_out, felem_x_out); + felem_shrink(y_out, felem_y_out); + felem_shrink(z_out, felem_z_out); + } + +/* copy_conditional copies in to out iff mask is all ones. */ +static void +copy_conditional(felem out, const felem in, limb mask) + { + unsigned i; + for (i = 0; i < NLIMBS; ++i) + { + const limb tmp = mask & (in[i] ^ out[i]); + out[i] ^= tmp; + } + } + +/* copy_small_conditional copies in to out iff mask is all ones. */ +static void +copy_small_conditional(felem out, const smallfelem in, limb mask) + { + unsigned i; + const u64 mask64 = mask; + for (i = 0; i < NLIMBS; ++i) + { + out[i] = ((limb) (in[i] & mask64)) | (out[i] & ~mask); + } + } + +/* point_add calcuates (x1, y1, z1) + (x2, y2, z2) + * + * The method is taken from: + * http://hyperelliptic.org/EFD/g1p/auto-shortw-jacobian-3.html#addition-add-2007-bl, + * adapted for mixed addition (z2 = 1, or z2 = 0 for the point at infinity). + * + * This function includes a branch for checking whether the two input points + * are equal, (while not equal to the point at infinity). This case never + * happens during single point multiplication, so there is no timing leak for + * ECDH or ECDSA signing. */ +static void point_add(felem x3, felem y3, felem z3, + const felem x1, const felem y1, const felem z1, + const int mixed, const smallfelem x2, const smallfelem y2, const smallfelem z2) + { + felem ftmp, ftmp2, ftmp3, ftmp4, ftmp5, ftmp6, x_out, y_out, z_out; + longfelem tmp, tmp2; + smallfelem small1, small2, small3, small4, small5; + limb x_equal, y_equal, z1_is_zero, z2_is_zero; + + felem_shrink(small3, z1); + + z1_is_zero = smallfelem_is_zero(small3); + z2_is_zero = smallfelem_is_zero(z2); + + /* ftmp = z1z1 = z1**2 */ + smallfelem_square(tmp, small3); + felem_reduce(ftmp, tmp); + /* ftmp[i] < 2^101 */ + felem_shrink(small1, ftmp); + + if(!mixed) + { + /* ftmp2 = z2z2 = z2**2 */ + smallfelem_square(tmp, z2); + felem_reduce(ftmp2, tmp); + /* ftmp2[i] < 2^101 */ + felem_shrink(small2, ftmp2); + + felem_shrink(small5, x1); + + /* u1 = ftmp3 = x1*z2z2 */ + smallfelem_mul(tmp, small5, small2); + felem_reduce(ftmp3, tmp); + /* ftmp3[i] < 2^101 */ + + /* ftmp5 = z1 + z2 */ + felem_assign(ftmp5, z1); + felem_small_sum(ftmp5, z2); + /* ftmp5[i] < 2^107 */ + + /* ftmp5 = (z1 + z2)**2 - (z1z1 + z2z2) = 2z1z2 */ + felem_square(tmp, ftmp5); + felem_reduce(ftmp5, tmp); + /* ftmp2 = z2z2 + z1z1 */ + felem_sum(ftmp2, ftmp); + /* ftmp2[i] < 2^101 + 2^101 = 2^102 */ + felem_diff(ftmp5, ftmp2); + /* ftmp5[i] < 2^105 + 2^101 < 2^106 */ + + /* ftmp2 = z2 * z2z2 */ + smallfelem_mul(tmp, small2, z2); + felem_reduce(ftmp2, tmp); + + /* s1 = ftmp2 = y1 * z2**3 */ + felem_mul(tmp, y1, ftmp2); + felem_reduce(ftmp6, tmp); + /* ftmp6[i] < 2^101 */ + } + else + { + /* We'll assume z2 = 1 (special case z2 = 0 is handled later) */ + + /* u1 = ftmp3 = x1*z2z2 */ + felem_assign(ftmp3, x1); + /* ftmp3[i] < 2^106 */ + + /* ftmp5 = 2z1z2 */ + felem_assign(ftmp5, z1); + felem_scalar(ftmp5, 2); + /* ftmp5[i] < 2*2^106 = 2^107 */ + + /* s1 = ftmp2 = y1 * z2**3 */ + felem_assign(ftmp6, y1); + /* ftmp6[i] < 2^106 */ + } + + /* u2 = x2*z1z1 */ + smallfelem_mul(tmp, x2, small1); + felem_reduce(ftmp4, tmp); + + /* h = ftmp4 = u2 - u1 */ + felem_diff_zero107(ftmp4, ftmp3); + /* ftmp4[i] < 2^107 + 2^101 < 2^108 */ + felem_shrink(small4, ftmp4); + + x_equal = smallfelem_is_zero(small4); + + /* z_out = ftmp5 * h */ + felem_small_mul(tmp, small4, ftmp5); + felem_reduce(z_out, tmp); + /* z_out[i] < 2^101 */ + + /* ftmp = z1 * z1z1 */ + smallfelem_mul(tmp, small1, small3); + felem_reduce(ftmp, tmp); + + /* s2 = tmp = y2 * z1**3 */ + felem_small_mul(tmp, y2, ftmp); + felem_reduce(ftmp5, tmp); + + /* r = ftmp5 = (s2 - s1)*2 */ + felem_diff_zero107(ftmp5, ftmp6); + /* ftmp5[i] < 2^107 + 2^107 = 2^108*/ + felem_scalar(ftmp5, 2); + /* ftmp5[i] < 2^109 */ + felem_shrink(small1, ftmp5); + y_equal = smallfelem_is_zero(small1); + + if (x_equal && y_equal && !z1_is_zero && !z2_is_zero) + { + point_double(x3, y3, z3, x1, y1, z1); + return; + } + + /* I = ftmp = (2h)**2 */ + felem_assign(ftmp, ftmp4); + felem_scalar(ftmp, 2); + /* ftmp[i] < 2*2^108 = 2^109 */ + felem_square(tmp, ftmp); + felem_reduce(ftmp, tmp); + + /* J = ftmp2 = h * I */ + felem_mul(tmp, ftmp4, ftmp); + felem_reduce(ftmp2, tmp); + + /* V = ftmp4 = U1 * I */ + felem_mul(tmp, ftmp3, ftmp); + felem_reduce(ftmp4, tmp); + + /* x_out = r**2 - J - 2V */ + smallfelem_square(tmp, small1); + felem_reduce(x_out, tmp); + felem_assign(ftmp3, ftmp4); + felem_scalar(ftmp4, 2); + felem_sum(ftmp4, ftmp2); + /* ftmp4[i] < 2*2^101 + 2^101 < 2^103 */ + felem_diff(x_out, ftmp4); + /* x_out[i] < 2^105 + 2^101 */ + + /* y_out = r(V-x_out) - 2 * s1 * J */ + felem_diff_zero107(ftmp3, x_out); + /* ftmp3[i] < 2^107 + 2^101 < 2^108 */ + felem_small_mul(tmp, small1, ftmp3); + felem_mul(tmp2, ftmp6, ftmp2); + longfelem_scalar(tmp2, 2); + /* tmp2[i] < 2*2^67 = 2^68 */ + longfelem_diff(tmp, tmp2); + /* tmp[i] < 2^67 + 2^70 + 2^40 < 2^71 */ + felem_reduce_zero105(y_out, tmp); + /* y_out[i] < 2^106 */ + + copy_small_conditional(x_out, x2, z1_is_zero); + copy_conditional(x_out, x1, z2_is_zero); + copy_small_conditional(y_out, y2, z1_is_zero); + copy_conditional(y_out, y1, z2_is_zero); + copy_small_conditional(z_out, z2, z1_is_zero); + copy_conditional(z_out, z1, z2_is_zero); + felem_assign(x3, x_out); + felem_assign(y3, y_out); + felem_assign(z3, z_out); + } + +/* point_add_small is the same as point_add, except that it operates on + * smallfelems */ +static void point_add_small(smallfelem x3, smallfelem y3, smallfelem z3, + smallfelem x1, smallfelem y1, smallfelem z1, + smallfelem x2, smallfelem y2, smallfelem z2) + { + felem felem_x3, felem_y3, felem_z3; + felem felem_x1, felem_y1, felem_z1; + smallfelem_expand(felem_x1, x1); + smallfelem_expand(felem_y1, y1); + smallfelem_expand(felem_z1, z1); + point_add(felem_x3, felem_y3, felem_z3, felem_x1, felem_y1, felem_z1, 0, x2, y2, z2); + felem_shrink(x3, felem_x3); + felem_shrink(y3, felem_y3); + felem_shrink(z3, felem_z3); + } + +/* Base point pre computation + * -------------------------- + * + * Two different sorts of precomputed tables are used in the following code. + * Each contain various points on the curve, where each point is three field + * elements (x, y, z). + * + * For the base point table, z is usually 1 (0 for the point at infinity). + * This table has 2 * 16 elements, starting with the following: + * index | bits | point + * ------+---------+------------------------------ + * 0 | 0 0 0 0 | 0G + * 1 | 0 0 0 1 | 1G + * 2 | 0 0 1 0 | 2^64G + * 3 | 0 0 1 1 | (2^64 + 1)G + * 4 | 0 1 0 0 | 2^128G + * 5 | 0 1 0 1 | (2^128 + 1)G + * 6 | 0 1 1 0 | (2^128 + 2^64)G + * 7 | 0 1 1 1 | (2^128 + 2^64 + 1)G + * 8 | 1 0 0 0 | 2^192G + * 9 | 1 0 0 1 | (2^192 + 1)G + * 10 | 1 0 1 0 | (2^192 + 2^64)G + * 11 | 1 0 1 1 | (2^192 + 2^64 + 1)G + * 12 | 1 1 0 0 | (2^192 + 2^128)G + * 13 | 1 1 0 1 | (2^192 + 2^128 + 1)G + * 14 | 1 1 1 0 | (2^192 + 2^128 + 2^64)G + * 15 | 1 1 1 1 | (2^192 + 2^128 + 2^64 + 1)G + * followed by a copy of this with each element multiplied by 2^32. + * + * The reason for this is so that we can clock bits into four different + * locations when doing simple scalar multiplies against the base point, + * and then another four locations using the second 16 elements. + * + * Tables for other points have table[i] = iG for i in 0 .. 16. */ + +/* gmul is the table of precomputed base points */ +static const smallfelem gmul[2][16][3] = +{{{{0, 0, 0, 0}, + {0, 0, 0, 0}, + {0, 0, 0, 0}}, + {{0xf4a13945d898c296, 0x77037d812deb33a0, 0xf8bce6e563a440f2, 0x6b17d1f2e12c4247}, + {0xcbb6406837bf51f5, 0x2bce33576b315ece, 0x8ee7eb4a7c0f9e16, 0x4fe342e2fe1a7f9b}, + {1, 0, 0, 0}}, + {{0x90e75cb48e14db63, 0x29493baaad651f7e, 0x8492592e326e25de, 0x0fa822bc2811aaa5}, + {0xe41124545f462ee7, 0x34b1a65050fe82f5, 0x6f4ad4bcb3df188b, 0xbff44ae8f5dba80d}, + {1, 0, 0, 0}}, + {{0x93391ce2097992af, 0xe96c98fd0d35f1fa, 0xb257c0de95e02789, 0x300a4bbc89d6726f}, + {0xaa54a291c08127a0, 0x5bb1eeada9d806a5, 0x7f1ddb25ff1e3c6f, 0x72aac7e0d09b4644}, + {1, 0, 0, 0}}, + {{0x57c84fc9d789bd85, 0xfc35ff7dc297eac3, 0xfb982fd588c6766e, 0x447d739beedb5e67}, + {0x0c7e33c972e25b32, 0x3d349b95a7fae500, 0xe12e9d953a4aaff7, 0x2d4825ab834131ee}, + {1, 0, 0, 0}}, + {{0x13949c932a1d367f, 0xef7fbd2b1a0a11b7, 0xddc6068bb91dfc60, 0xef9519328a9c72ff}, + {0x196035a77376d8a8, 0x23183b0895ca1740, 0xc1ee9807022c219c, 0x611e9fc37dbb2c9b}, + {1, 0, 0, 0}}, + {{0xcae2b1920b57f4bc, 0x2936df5ec6c9bc36, 0x7dea6482e11238bf, 0x550663797b51f5d8}, + {0x44ffe216348a964c, 0x9fb3d576dbdefbe1, 0x0afa40018d9d50e5, 0x157164848aecb851}, + {1, 0, 0, 0}}, + {{0xe48ecafffc5cde01, 0x7ccd84e70d715f26, 0xa2e8f483f43e4391, 0xeb5d7745b21141ea}, + {0xcac917e2731a3479, 0x85f22cfe2844b645, 0x0990e6a158006cee, 0xeafd72ebdbecc17b}, + {1, 0, 0, 0}}, + {{0x6cf20ffb313728be, 0x96439591a3c6b94a, 0x2736ff8344315fc5, 0xa6d39677a7849276}, + {0xf2bab833c357f5f4, 0x824a920c2284059b, 0x66b8babd2d27ecdf, 0x674f84749b0b8816}, + {1, 0, 0, 0}}, + {{0x2df48c04677c8a3e, 0x74e02f080203a56b, 0x31855f7db8c7fedb, 0x4e769e7672c9ddad}, + {0xa4c36165b824bbb0, 0xfb9ae16f3b9122a5, 0x1ec0057206947281, 0x42b99082de830663}, + {1, 0, 0, 0}}, + {{0x6ef95150dda868b9, 0xd1f89e799c0ce131, 0x7fdc1ca008a1c478, 0x78878ef61c6ce04d}, + {0x9c62b9121fe0d976, 0x6ace570ebde08d4f, 0xde53142c12309def, 0xb6cb3f5d7b72c321}, + {1, 0, 0, 0}}, + {{0x7f991ed2c31a3573, 0x5b82dd5bd54fb496, 0x595c5220812ffcae, 0x0c88bc4d716b1287}, + {0x3a57bf635f48aca8, 0x7c8181f4df2564f3, 0x18d1b5b39c04e6aa, 0xdd5ddea3f3901dc6}, + {1, 0, 0, 0}}, + {{0xe96a79fb3e72ad0c, 0x43a0a28c42ba792f, 0xefe0a423083e49f3, 0x68f344af6b317466}, + {0xcdfe17db3fb24d4a, 0x668bfc2271f5c626, 0x604ed93c24d67ff3, 0x31b9c405f8540a20}, + {1, 0, 0, 0}}, + {{0xd36b4789a2582e7f, 0x0d1a10144ec39c28, 0x663c62c3edbad7a0, 0x4052bf4b6f461db9}, + {0x235a27c3188d25eb, 0xe724f33999bfcc5b, 0x862be6bd71d70cc8, 0xfecf4d5190b0fc61}, + {1, 0, 0, 0}}, + {{0x74346c10a1d4cfac, 0xafdf5cc08526a7a4, 0x123202a8f62bff7a, 0x1eddbae2c802e41a}, + {0x8fa0af2dd603f844, 0x36e06b7e4c701917, 0x0c45f45273db33a0, 0x43104d86560ebcfc}, + {1, 0, 0, 0}}, + {{0x9615b5110d1d78e5, 0x66b0de3225c4744b, 0x0a4a46fb6aaf363a, 0xb48e26b484f7a21c}, + {0x06ebb0f621a01b2d, 0xc004e4048b7b0f98, 0x64131bcdfed6f668, 0xfac015404d4d3dab}, + {1, 0, 0, 0}}}, + {{{0, 0, 0, 0}, + {0, 0, 0, 0}, + {0, 0, 0, 0}}, + {{0x3a5a9e22185a5943, 0x1ab919365c65dfb6, 0x21656b32262c71da, 0x7fe36b40af22af89}, + {0xd50d152c699ca101, 0x74b3d5867b8af212, 0x9f09f40407dca6f1, 0xe697d45825b63624}, + {1, 0, 0, 0}}, + {{0xa84aa9397512218e, 0xe9a521b074ca0141, 0x57880b3a18a2e902, 0x4a5b506612a677a6}, + {0x0beada7a4c4f3840, 0x626db15419e26d9d, 0xc42604fbe1627d40, 0xeb13461ceac089f1}, + {1, 0, 0, 0}}, + {{0xf9faed0927a43281, 0x5e52c4144103ecbc, 0xc342967aa815c857, 0x0781b8291c6a220a}, + {0x5a8343ceeac55f80, 0x88f80eeee54a05e3, 0x97b2a14f12916434, 0x690cde8df0151593}, + {1, 0, 0, 0}}, + {{0xaee9c75df7f82f2a, 0x9e4c35874afdf43a, 0xf5622df437371326, 0x8a535f566ec73617}, + {0xc5f9a0ac223094b7, 0xcde533864c8c7669, 0x37e02819085a92bf, 0x0455c08468b08bd7}, + {1, 0, 0, 0}}, + {{0x0c0a6e2c9477b5d9, 0xf9a4bf62876dc444, 0x5050a949b6cdc279, 0x06bada7ab77f8276}, + {0xc8b4aed1ea48dac9, 0xdebd8a4b7ea1070f, 0x427d49101366eb70, 0x5b476dfd0e6cb18a}, + {1, 0, 0, 0}}, + {{0x7c5c3e44278c340a, 0x4d54606812d66f3b, 0x29a751b1ae23c5d8, 0x3e29864e8a2ec908}, + {0x142d2a6626dbb850, 0xad1744c4765bd780, 0x1f150e68e322d1ed, 0x239b90ea3dc31e7e}, + {1, 0, 0, 0}}, + {{0x78c416527a53322a, 0x305dde6709776f8e, 0xdbcab759f8862ed4, 0x820f4dd949f72ff7}, + {0x6cc544a62b5debd4, 0x75be5d937b4e8cc4, 0x1b481b1b215c14d3, 0x140406ec783a05ec}, + {1, 0, 0, 0}}, + {{0x6a703f10e895df07, 0xfd75f3fa01876bd8, 0xeb5b06e70ce08ffe, 0x68f6b8542783dfee}, + {0x90c76f8a78712655, 0xcf5293d2f310bf7f, 0xfbc8044dfda45028, 0xcbe1feba92e40ce6}, + {1, 0, 0, 0}}, + {{0xe998ceea4396e4c1, 0xfc82ef0b6acea274, 0x230f729f2250e927, 0xd0b2f94d2f420109}, + {0x4305adddb38d4966, 0x10b838f8624c3b45, 0x7db2636658954e7a, 0x971459828b0719e5}, + {1, 0, 0, 0}}, + {{0x4bd6b72623369fc9, 0x57f2929e53d0b876, 0xc2d5cba4f2340687, 0x961610004a866aba}, + {0x49997bcd2e407a5e, 0x69ab197d92ddcb24, 0x2cf1f2438fe5131c, 0x7acb9fadcee75e44}, + {1, 0, 0, 0}}, + {{0x254e839423d2d4c0, 0xf57f0c917aea685b, 0xa60d880f6f75aaea, 0x24eb9acca333bf5b}, + {0xe3de4ccb1cda5dea, 0xfeef9341c51a6b4f, 0x743125f88bac4c4d, 0x69f891c5acd079cc}, + {1, 0, 0, 0}}, + {{0xeee44b35702476b5, 0x7ed031a0e45c2258, 0xb422d1e7bd6f8514, 0xe51f547c5972a107}, + {0xa25bcd6fc9cf343d, 0x8ca922ee097c184e, 0xa62f98b3a9fe9a06, 0x1c309a2b25bb1387}, + {1, 0, 0, 0}}, + {{0x9295dbeb1967c459, 0xb00148833472c98e, 0xc504977708011828, 0x20b87b8aa2c4e503}, + {0x3063175de057c277, 0x1bd539338fe582dd, 0x0d11adef5f69a044, 0xf5c6fa49919776be}, + {1, 0, 0, 0}}, + {{0x8c944e760fd59e11, 0x3876cba1102fad5f, 0xa454c3fad83faa56, 0x1ed7d1b9332010b9}, + {0xa1011a270024b889, 0x05e4d0dcac0cd344, 0x52b520f0eb6a2a24, 0x3a2b03f03217257a}, + {1, 0, 0, 0}}, + {{0xf20fc2afdf1d043d, 0xf330240db58d5a62, 0xfc7d229ca0058c3b, 0x15fee545c78dd9f6}, + {0x501e82885bc98cda, 0x41ef80e5d046ac04, 0x557d9f49461210fb, 0x4ab5b6b2b8753f81}, + {1, 0, 0, 0}}}}; + +/* select_point selects the |idx|th point from a precomputation table and + * copies it to out. */ +static void select_point(const u64 idx, unsigned int size, const smallfelem pre_comp[16][3], smallfelem out[3]) + { + unsigned i, j; + u64 *outlimbs = &out[0][0]; + memset(outlimbs, 0, 3 * sizeof(smallfelem)); + + for (i = 0; i < size; i++) + { + const u64 *inlimbs = (u64*) &pre_comp[i][0][0]; + u64 mask = i ^ idx; + mask |= mask >> 4; + mask |= mask >> 2; + mask |= mask >> 1; + mask &= 1; + mask--; + for (j = 0; j < NLIMBS * 3; j++) + outlimbs[j] |= inlimbs[j] & mask; + } + } + +/* get_bit returns the |i|th bit in |in| */ +static char get_bit(const felem_bytearray in, int i) + { + if ((i < 0) || (i >= 256)) + return 0; + return (in[i >> 3] >> (i & 7)) & 1; + } + +/* Interleaved point multiplication using precomputed point multiples: + * The small point multiples 0*P, 1*P, ..., 17*P are in pre_comp[], + * the scalars in scalars[]. If g_scalar is non-NULL, we also add this multiple + * of the generator, using certain (large) precomputed multiples in g_pre_comp. + * Output point (X, Y, Z) is stored in x_out, y_out, z_out */ +static void batch_mul(felem x_out, felem y_out, felem z_out, + const felem_bytearray scalars[], const unsigned num_points, const u8 *g_scalar, + const int mixed, const smallfelem pre_comp[][17][3], const smallfelem g_pre_comp[2][16][3]) + { + int i, skip; + unsigned num, gen_mul = (g_scalar != NULL); + felem nq[3], ftmp; + smallfelem tmp[3]; + u64 bits; + u8 sign, digit; + + /* set nq to the point at infinity */ + memset(nq, 0, 3 * sizeof(felem)); + + /* Loop over all scalars msb-to-lsb, interleaving additions + * of multiples of the generator (two in each of the last 32 rounds) + * and additions of other points multiples (every 5th round). + */ + skip = 1; /* save two point operations in the first round */ + for (i = (num_points ? 255 : 31); i >= 0; --i) + { + /* double */ + if (!skip) + point_double(nq[0], nq[1], nq[2], nq[0], nq[1], nq[2]); + + /* add multiples of the generator */ + if (gen_mul && (i <= 31)) + { + /* first, look 32 bits upwards */ + bits = get_bit(g_scalar, i + 224) << 3; + bits |= get_bit(g_scalar, i + 160) << 2; + bits |= get_bit(g_scalar, i + 96) << 1; + bits |= get_bit(g_scalar, i + 32); + /* select the point to add, in constant time */ + select_point(bits, 16, g_pre_comp[1], tmp); + + if (!skip) + { + point_add(nq[0], nq[1], nq[2], + nq[0], nq[1], nq[2], + 1 /* mixed */, tmp[0], tmp[1], tmp[2]); + } + else + { + smallfelem_expand(nq[0], tmp[0]); + smallfelem_expand(nq[1], tmp[1]); + smallfelem_expand(nq[2], tmp[2]); + skip = 0; + } + + /* second, look at the current position */ + bits = get_bit(g_scalar, i + 192) << 3; + bits |= get_bit(g_scalar, i + 128) << 2; + bits |= get_bit(g_scalar, i + 64) << 1; + bits |= get_bit(g_scalar, i); + /* select the point to add, in constant time */ + select_point(bits, 16, g_pre_comp[0], tmp); + point_add(nq[0], nq[1], nq[2], + nq[0], nq[1], nq[2], + 1 /* mixed */, tmp[0], tmp[1], tmp[2]); + } + + /* do other additions every 5 doublings */ + if (num_points && (i % 5 == 0)) + { + /* loop over all scalars */ + for (num = 0; num < num_points; ++num) + { + bits = get_bit(scalars[num], i + 4) << 5; + bits |= get_bit(scalars[num], i + 3) << 4; + bits |= get_bit(scalars[num], i + 2) << 3; + bits |= get_bit(scalars[num], i + 1) << 2; + bits |= get_bit(scalars[num], i) << 1; + bits |= get_bit(scalars[num], i - 1); + ec_GFp_nistp_recode_scalar_bits(&sign, &digit, bits); + + /* select the point to add or subtract, in constant time */ + select_point(digit, 17, pre_comp[num], tmp); + smallfelem_neg(ftmp, tmp[1]); /* (X, -Y, Z) is the negative point */ + copy_small_conditional(ftmp, tmp[1], (((limb) sign) - 1)); + felem_contract(tmp[1], ftmp); + + if (!skip) + { + point_add(nq[0], nq[1], nq[2], + nq[0], nq[1], nq[2], + mixed, tmp[0], tmp[1], tmp[2]); + } + else + { + smallfelem_expand(nq[0], tmp[0]); + smallfelem_expand(nq[1], tmp[1]); + smallfelem_expand(nq[2], tmp[2]); + skip = 0; + } + } + } + } + felem_assign(x_out, nq[0]); + felem_assign(y_out, nq[1]); + felem_assign(z_out, nq[2]); + } + +/* Precomputation for the group generator. */ +typedef struct { + smallfelem g_pre_comp[2][16][3]; + int references; +} NISTP256_PRE_COMP; + +const EC_METHOD *EC_GFp_nistp256_method(void) + { + static const EC_METHOD ret = { + EC_FLAGS_DEFAULT_OCT, + NID_X9_62_prime_field, + ec_GFp_nistp256_group_init, + ec_GFp_simple_group_finish, + ec_GFp_simple_group_clear_finish, + ec_GFp_nist_group_copy, + ec_GFp_nistp256_group_set_curve, + ec_GFp_simple_group_get_curve, + ec_GFp_simple_group_get_degree, + ec_GFp_simple_group_check_discriminant, + ec_GFp_simple_point_init, + ec_GFp_simple_point_finish, + ec_GFp_simple_point_clear_finish, + ec_GFp_simple_point_copy, + ec_GFp_simple_point_set_to_infinity, + ec_GFp_simple_set_Jprojective_coordinates_GFp, + ec_GFp_simple_get_Jprojective_coordinates_GFp, + ec_GFp_simple_point_set_affine_coordinates, + ec_GFp_nistp256_point_get_affine_coordinates, + 0 /* point_set_compressed_coordinates */, + 0 /* point2oct */, + 0 /* oct2point */, + ec_GFp_simple_add, + ec_GFp_simple_dbl, + ec_GFp_simple_invert, + ec_GFp_simple_is_at_infinity, + ec_GFp_simple_is_on_curve, + ec_GFp_simple_cmp, + ec_GFp_simple_make_affine, + ec_GFp_simple_points_make_affine, + ec_GFp_nistp256_points_mul, + ec_GFp_nistp256_precompute_mult, + ec_GFp_nistp256_have_precompute_mult, + ec_GFp_nist_field_mul, + ec_GFp_nist_field_sqr, + 0 /* field_div */, + 0 /* field_encode */, + 0 /* field_decode */, + 0 /* field_set_to_one */ }; + + return &ret; + } + +/******************************************************************************/ +/* FUNCTIONS TO MANAGE PRECOMPUTATION + */ + +static NISTP256_PRE_COMP *nistp256_pre_comp_new() + { + NISTP256_PRE_COMP *ret = NULL; + ret = (NISTP256_PRE_COMP *) OPENSSL_malloc(sizeof *ret); + if (!ret) + { + ECerr(EC_F_NISTP256_PRE_COMP_NEW, ERR_R_MALLOC_FAILURE); + return ret; + } + memset(ret->g_pre_comp, 0, sizeof(ret->g_pre_comp)); + ret->references = 1; + return ret; + } + +static void *nistp256_pre_comp_dup(void *src_) + { + NISTP256_PRE_COMP *src = src_; + + /* no need to actually copy, these objects never change! */ + CRYPTO_add(&src->references, 1, CRYPTO_LOCK_EC_PRE_COMP); + + return src_; + } + +static void nistp256_pre_comp_free(void *pre_) + { + int i; + NISTP256_PRE_COMP *pre = pre_; + + if (!pre) + return; + + i = CRYPTO_add(&pre->references, -1, CRYPTO_LOCK_EC_PRE_COMP); + if (i > 0) + return; + + OPENSSL_free(pre); + } + +static void nistp256_pre_comp_clear_free(void *pre_) + { + int i; + NISTP256_PRE_COMP *pre = pre_; + + if (!pre) + return; + + i = CRYPTO_add(&pre->references, -1, CRYPTO_LOCK_EC_PRE_COMP); + if (i > 0) + return; + + OPENSSL_cleanse(pre, sizeof *pre); + OPENSSL_free(pre); + } + +/******************************************************************************/ +/* OPENSSL EC_METHOD FUNCTIONS + */ + +int ec_GFp_nistp256_group_init(EC_GROUP *group) + { + int ret; + ret = ec_GFp_simple_group_init(group); + group->a_is_minus3 = 1; + return ret; + } + +int ec_GFp_nistp256_group_set_curve(EC_GROUP *group, const BIGNUM *p, + const BIGNUM *a, const BIGNUM *b, BN_CTX *ctx) + { + int ret = 0; + BN_CTX *new_ctx = NULL; + BIGNUM *curve_p, *curve_a, *curve_b; + + if (ctx == NULL) + if ((ctx = new_ctx = BN_CTX_new()) == NULL) return 0; + BN_CTX_start(ctx); + if (((curve_p = BN_CTX_get(ctx)) == NULL) || + ((curve_a = BN_CTX_get(ctx)) == NULL) || + ((curve_b = BN_CTX_get(ctx)) == NULL)) goto err; + BN_bin2bn(nistp256_curve_params[0], sizeof(felem_bytearray), curve_p); + BN_bin2bn(nistp256_curve_params[1], sizeof(felem_bytearray), curve_a); + BN_bin2bn(nistp256_curve_params[2], sizeof(felem_bytearray), curve_b); + if ((BN_cmp(curve_p, p)) || (BN_cmp(curve_a, a)) || + (BN_cmp(curve_b, b))) + { + ECerr(EC_F_EC_GFP_NISTP256_GROUP_SET_CURVE, + EC_R_WRONG_CURVE_PARAMETERS); + goto err; + } + group->field_mod_func = BN_nist_mod_256; + ret = ec_GFp_simple_group_set_curve(group, p, a, b, ctx); +err: + BN_CTX_end(ctx); + if (new_ctx != NULL) + BN_CTX_free(new_ctx); + return ret; + } + +/* Takes the Jacobian coordinates (X, Y, Z) of a point and returns + * (X', Y') = (X/Z^2, Y/Z^3) */ +int ec_GFp_nistp256_point_get_affine_coordinates(const EC_GROUP *group, + const EC_POINT *point, BIGNUM *x, BIGNUM *y, BN_CTX *ctx) + { + felem z1, z2, x_in, y_in; + smallfelem x_out, y_out; + longfelem tmp; + + if (EC_POINT_is_at_infinity(group, point)) + { + ECerr(EC_F_EC_GFP_NISTP256_POINT_GET_AFFINE_COORDINATES, + EC_R_POINT_AT_INFINITY); + return 0; + } + if ((!BN_to_felem(x_in, &point->X)) || (!BN_to_felem(y_in, &point->Y)) || + (!BN_to_felem(z1, &point->Z))) return 0; + felem_inv(z2, z1); + felem_square(tmp, z2); felem_reduce(z1, tmp); + felem_mul(tmp, x_in, z1); felem_reduce(x_in, tmp); + felem_contract(x_out, x_in); + if (x != NULL) + { + if (!smallfelem_to_BN(x, x_out)) { + ECerr(EC_F_EC_GFP_NISTP256_POINT_GET_AFFINE_COORDINATES, + ERR_R_BN_LIB); + return 0; + } + } + felem_mul(tmp, z1, z2); felem_reduce(z1, tmp); + felem_mul(tmp, y_in, z1); felem_reduce(y_in, tmp); + felem_contract(y_out, y_in); + if (y != NULL) + { + if (!smallfelem_to_BN(y, y_out)) + { + ECerr(EC_F_EC_GFP_NISTP256_POINT_GET_AFFINE_COORDINATES, + ERR_R_BN_LIB); + return 0; + } + } + return 1; + } + +static void make_points_affine(size_t num, smallfelem points[/* num */][3], smallfelem tmp_smallfelems[/* num+1 */]) + { + /* Runs in constant time, unless an input is the point at infinity + * (which normally shouldn't happen). */ + ec_GFp_nistp_points_make_affine_internal( + num, + points, + sizeof(smallfelem), + tmp_smallfelems, + (void (*)(void *)) smallfelem_one, + (int (*)(const void *)) smallfelem_is_zero_int, + (void (*)(void *, const void *)) smallfelem_assign, + (void (*)(void *, const void *)) smallfelem_square_contract, + (void (*)(void *, const void *, const void *)) smallfelem_mul_contract, + (void (*)(void *, const void *)) smallfelem_inv_contract, + (void (*)(void *, const void *)) smallfelem_assign /* nothing to contract */); + } + +/* Computes scalar*generator + \sum scalars[i]*points[i], ignoring NULL values + * Result is stored in r (r can equal one of the inputs). */ +int ec_GFp_nistp256_points_mul(const EC_GROUP *group, EC_POINT *r, + const BIGNUM *scalar, size_t num, const EC_POINT *points[], + const BIGNUM *scalars[], BN_CTX *ctx) + { + int ret = 0; + int j; + int mixed = 0; + BN_CTX *new_ctx = NULL; + BIGNUM *x, *y, *z, *tmp_scalar; + felem_bytearray g_secret; + felem_bytearray *secrets = NULL; + smallfelem (*pre_comp)[17][3] = NULL; + smallfelem *tmp_smallfelems = NULL; + felem_bytearray tmp; + unsigned i, num_bytes; + int have_pre_comp = 0; + size_t num_points = num; + smallfelem x_in, y_in, z_in; + felem x_out, y_out, z_out; + NISTP256_PRE_COMP *pre = NULL; + const smallfelem (*g_pre_comp)[16][3] = NULL; + EC_POINT *generator = NULL; + const EC_POINT *p = NULL; + const BIGNUM *p_scalar = NULL; + + if (ctx == NULL) + if ((ctx = new_ctx = BN_CTX_new()) == NULL) return 0; + BN_CTX_start(ctx); + if (((x = BN_CTX_get(ctx)) == NULL) || + ((y = BN_CTX_get(ctx)) == NULL) || + ((z = BN_CTX_get(ctx)) == NULL) || + ((tmp_scalar = BN_CTX_get(ctx)) == NULL)) + goto err; + + if (scalar != NULL) + { + pre = EC_EX_DATA_get_data(group->extra_data, + nistp256_pre_comp_dup, nistp256_pre_comp_free, + nistp256_pre_comp_clear_free); + if (pre) + /* we have precomputation, try to use it */ + g_pre_comp = (const smallfelem (*)[16][3]) pre->g_pre_comp; + else + /* try to use the standard precomputation */ + g_pre_comp = &gmul[0]; + generator = EC_POINT_new(group); + if (generator == NULL) + goto err; + /* get the generator from precomputation */ + if (!smallfelem_to_BN(x, g_pre_comp[0][1][0]) || + !smallfelem_to_BN(y, g_pre_comp[0][1][1]) || + !smallfelem_to_BN(z, g_pre_comp[0][1][2])) + { + ECerr(EC_F_EC_GFP_NISTP256_POINTS_MUL, ERR_R_BN_LIB); + goto err; + } + if (!EC_POINT_set_Jprojective_coordinates_GFp(group, + generator, x, y, z, ctx)) + goto err; + if (0 == EC_POINT_cmp(group, generator, group->generator, ctx)) + /* precomputation matches generator */ + have_pre_comp = 1; + else + /* we don't have valid precomputation: + * treat the generator as a random point */ + num_points++; + } + if (num_points > 0) + { + if (num_points >= 3) + { + /* unless we precompute multiples for just one or two points, + * converting those into affine form is time well spent */ + mixed = 1; + } + secrets = OPENSSL_malloc(num_points * sizeof(felem_bytearray)); + pre_comp = OPENSSL_malloc(num_points * 17 * 3 * sizeof(smallfelem)); + if (mixed) + tmp_smallfelems = OPENSSL_malloc((num_points * 17 + 1) * sizeof(smallfelem)); + if ((secrets == NULL) || (pre_comp == NULL) || (mixed && (tmp_smallfelems == NULL))) + { + ECerr(EC_F_EC_GFP_NISTP256_POINTS_MUL, ERR_R_MALLOC_FAILURE); + goto err; + } + + /* we treat NULL scalars as 0, and NULL points as points at infinity, + * i.e., they contribute nothing to the linear combination */ + memset(secrets, 0, num_points * sizeof(felem_bytearray)); + memset(pre_comp, 0, num_points * 17 * 3 * sizeof(smallfelem)); + for (i = 0; i < num_points; ++i) + { + if (i == num) + /* we didn't have a valid precomputation, so we pick + * the generator */ + { + p = EC_GROUP_get0_generator(group); + p_scalar = scalar; + } + else + /* the i^th point */ + { + p = points[i]; + p_scalar = scalars[i]; + } + if ((p_scalar != NULL) && (p != NULL)) + { + /* reduce scalar to 0 <= scalar < 2^256 */ + if ((BN_num_bits(p_scalar) > 256) || (BN_is_negative(p_scalar))) + { + /* this is an unusual input, and we don't guarantee + * constant-timeness */ + if (!BN_nnmod(tmp_scalar, p_scalar, &group->order, ctx)) + { + ECerr(EC_F_EC_GFP_NISTP256_POINTS_MUL, ERR_R_BN_LIB); + goto err; + } + num_bytes = BN_bn2bin(tmp_scalar, tmp); + } + else + num_bytes = BN_bn2bin(p_scalar, tmp); + flip_endian(secrets[i], tmp, num_bytes); + /* precompute multiples */ + if ((!BN_to_felem(x_out, &p->X)) || + (!BN_to_felem(y_out, &p->Y)) || + (!BN_to_felem(z_out, &p->Z))) goto err; + felem_shrink(pre_comp[i][1][0], x_out); + felem_shrink(pre_comp[i][1][1], y_out); + felem_shrink(pre_comp[i][1][2], z_out); + for (j = 2; j <= 16; ++j) + { + if (j & 1) + { + point_add_small( + pre_comp[i][j][0], pre_comp[i][j][1], pre_comp[i][j][2], + pre_comp[i][1][0], pre_comp[i][1][1], pre_comp[i][1][2], + pre_comp[i][j-1][0], pre_comp[i][j-1][1], pre_comp[i][j-1][2]); + } + else + { + point_double_small( + pre_comp[i][j][0], pre_comp[i][j][1], pre_comp[i][j][2], + pre_comp[i][j/2][0], pre_comp[i][j/2][1], pre_comp[i][j/2][2]); + } + } + } + } + if (mixed) + make_points_affine(num_points * 17, pre_comp[0], tmp_smallfelems); + } + + /* the scalar for the generator */ + if ((scalar != NULL) && (have_pre_comp)) + { + memset(g_secret, 0, sizeof(g_secret)); + /* reduce scalar to 0 <= scalar < 2^256 */ + if ((BN_num_bits(scalar) > 256) || (BN_is_negative(scalar))) + { + /* this is an unusual input, and we don't guarantee + * constant-timeness */ + if (!BN_nnmod(tmp_scalar, scalar, &group->order, ctx)) + { + ECerr(EC_F_EC_GFP_NISTP256_POINTS_MUL, ERR_R_BN_LIB); + goto err; + } + num_bytes = BN_bn2bin(tmp_scalar, tmp); + } + else + num_bytes = BN_bn2bin(scalar, tmp); + flip_endian(g_secret, tmp, num_bytes); + /* do the multiplication with generator precomputation*/ + batch_mul(x_out, y_out, z_out, + (const felem_bytearray (*)) secrets, num_points, + g_secret, + mixed, (const smallfelem (*)[17][3]) pre_comp, + g_pre_comp); + } + else + /* do the multiplication without generator precomputation */ + batch_mul(x_out, y_out, z_out, + (const felem_bytearray (*)) secrets, num_points, + NULL, mixed, (const smallfelem (*)[17][3]) pre_comp, NULL); + /* reduce the output to its unique minimal representation */ + felem_contract(x_in, x_out); + felem_contract(y_in, y_out); + felem_contract(z_in, z_out); + if ((!smallfelem_to_BN(x, x_in)) || (!smallfelem_to_BN(y, y_in)) || + (!smallfelem_to_BN(z, z_in))) + { + ECerr(EC_F_EC_GFP_NISTP256_POINTS_MUL, ERR_R_BN_LIB); + goto err; + } + ret = EC_POINT_set_Jprojective_coordinates_GFp(group, r, x, y, z, ctx); + +err: + BN_CTX_end(ctx); + if (generator != NULL) + EC_POINT_free(generator); + if (new_ctx != NULL) + BN_CTX_free(new_ctx); + if (secrets != NULL) + OPENSSL_free(secrets); + if (pre_comp != NULL) + OPENSSL_free(pre_comp); + if (tmp_smallfelems != NULL) + OPENSSL_free(tmp_smallfelems); + return ret; + } + +int ec_GFp_nistp256_precompute_mult(EC_GROUP *group, BN_CTX *ctx) + { + int ret = 0; + NISTP256_PRE_COMP *pre = NULL; + int i, j; + BN_CTX *new_ctx = NULL; + BIGNUM *x, *y; + EC_POINT *generator = NULL; + smallfelem tmp_smallfelems[32]; + felem x_tmp, y_tmp, z_tmp; + + /* throw away old precomputation */ + EC_EX_DATA_free_data(&group->extra_data, nistp256_pre_comp_dup, + nistp256_pre_comp_free, nistp256_pre_comp_clear_free); + if (ctx == NULL) + if ((ctx = new_ctx = BN_CTX_new()) == NULL) return 0; + BN_CTX_start(ctx); + if (((x = BN_CTX_get(ctx)) == NULL) || + ((y = BN_CTX_get(ctx)) == NULL)) + goto err; + /* get the generator */ + if (group->generator == NULL) goto err; + generator = EC_POINT_new(group); + if (generator == NULL) + goto err; + BN_bin2bn(nistp256_curve_params[3], sizeof (felem_bytearray), x); + BN_bin2bn(nistp256_curve_params[4], sizeof (felem_bytearray), y); + if (!EC_POINT_set_affine_coordinates_GFp(group, generator, x, y, ctx)) + goto err; + if ((pre = nistp256_pre_comp_new()) == NULL) + goto err; + /* if the generator is the standard one, use built-in precomputation */ + if (0 == EC_POINT_cmp(group, generator, group->generator, ctx)) + { + memcpy(pre->g_pre_comp, gmul, sizeof(pre->g_pre_comp)); + ret = 1; + goto err; + } + if ((!BN_to_felem(x_tmp, &group->generator->X)) || + (!BN_to_felem(y_tmp, &group->generator->Y)) || + (!BN_to_felem(z_tmp, &group->generator->Z))) + goto err; + felem_shrink(pre->g_pre_comp[0][1][0], x_tmp); + felem_shrink(pre->g_pre_comp[0][1][1], y_tmp); + felem_shrink(pre->g_pre_comp[0][1][2], z_tmp); + /* compute 2^64*G, 2^128*G, 2^192*G for the first table, + * 2^32*G, 2^96*G, 2^160*G, 2^224*G for the second one + */ + for (i = 1; i <= 8; i <<= 1) + { + point_double_small( + pre->g_pre_comp[1][i][0], pre->g_pre_comp[1][i][1], pre->g_pre_comp[1][i][2], + pre->g_pre_comp[0][i][0], pre->g_pre_comp[0][i][1], pre->g_pre_comp[0][i][2]); + for (j = 0; j < 31; ++j) + { + point_double_small( + pre->g_pre_comp[1][i][0], pre->g_pre_comp[1][i][1], pre->g_pre_comp[1][i][2], + pre->g_pre_comp[1][i][0], pre->g_pre_comp[1][i][1], pre->g_pre_comp[1][i][2]); + } + if (i == 8) + break; + point_double_small( + pre->g_pre_comp[0][2*i][0], pre->g_pre_comp[0][2*i][1], pre->g_pre_comp[0][2*i][2], + pre->g_pre_comp[1][i][0], pre->g_pre_comp[1][i][1], pre->g_pre_comp[1][i][2]); + for (j = 0; j < 31; ++j) + { + point_double_small( + pre->g_pre_comp[0][2*i][0], pre->g_pre_comp[0][2*i][1], pre->g_pre_comp[0][2*i][2], + pre->g_pre_comp[0][2*i][0], pre->g_pre_comp[0][2*i][1], pre->g_pre_comp[0][2*i][2]); + } + } + for (i = 0; i < 2; i++) + { + /* g_pre_comp[i][0] is the point at infinity */ + memset(pre->g_pre_comp[i][0], 0, sizeof(pre->g_pre_comp[i][0])); + /* the remaining multiples */ + /* 2^64*G + 2^128*G resp. 2^96*G + 2^160*G */ + point_add_small( + pre->g_pre_comp[i][6][0], pre->g_pre_comp[i][6][1], pre->g_pre_comp[i][6][2], + pre->g_pre_comp[i][4][0], pre->g_pre_comp[i][4][1], pre->g_pre_comp[i][4][2], + pre->g_pre_comp[i][2][0], pre->g_pre_comp[i][2][1], pre->g_pre_comp[i][2][2]); + /* 2^64*G + 2^192*G resp. 2^96*G + 2^224*G */ + point_add_small( + pre->g_pre_comp[i][10][0], pre->g_pre_comp[i][10][1], pre->g_pre_comp[i][10][2], + pre->g_pre_comp[i][8][0], pre->g_pre_comp[i][8][1], pre->g_pre_comp[i][8][2], + pre->g_pre_comp[i][2][0], pre->g_pre_comp[i][2][1], pre->g_pre_comp[i][2][2]); + /* 2^128*G + 2^192*G resp. 2^160*G + 2^224*G */ + point_add_small( + pre->g_pre_comp[i][12][0], pre->g_pre_comp[i][12][1], pre->g_pre_comp[i][12][2], + pre->g_pre_comp[i][8][0], pre->g_pre_comp[i][8][1], pre->g_pre_comp[i][8][2], + pre->g_pre_comp[i][4][0], pre->g_pre_comp[i][4][1], pre->g_pre_comp[i][4][2]); + /* 2^64*G + 2^128*G + 2^192*G resp. 2^96*G + 2^160*G + 2^224*G */ + point_add_small( + pre->g_pre_comp[i][14][0], pre->g_pre_comp[i][14][1], pre->g_pre_comp[i][14][2], + pre->g_pre_comp[i][12][0], pre->g_pre_comp[i][12][1], pre->g_pre_comp[i][12][2], + pre->g_pre_comp[i][2][0], pre->g_pre_comp[i][2][1], pre->g_pre_comp[i][2][2]); + for (j = 1; j < 8; ++j) + { + /* odd multiples: add G resp. 2^32*G */ + point_add_small( + pre->g_pre_comp[i][2*j+1][0], pre->g_pre_comp[i][2*j+1][1], pre->g_pre_comp[i][2*j+1][2], + pre->g_pre_comp[i][2*j][0], pre->g_pre_comp[i][2*j][1], pre->g_pre_comp[i][2*j][2], + pre->g_pre_comp[i][1][0], pre->g_pre_comp[i][1][1], pre->g_pre_comp[i][1][2]); + } + } + make_points_affine(31, &(pre->g_pre_comp[0][1]), tmp_smallfelems); + + if (!EC_EX_DATA_set_data(&group->extra_data, pre, nistp256_pre_comp_dup, + nistp256_pre_comp_free, nistp256_pre_comp_clear_free)) + goto err; + ret = 1; + pre = NULL; + err: + BN_CTX_end(ctx); + if (generator != NULL) + EC_POINT_free(generator); + if (new_ctx != NULL) + BN_CTX_free(new_ctx); + if (pre) + nistp256_pre_comp_free(pre); + return ret; + } + +int ec_GFp_nistp256_have_precompute_mult(const EC_GROUP *group) + { + if (EC_EX_DATA_get_data(group->extra_data, nistp256_pre_comp_dup, + nistp256_pre_comp_free, nistp256_pre_comp_clear_free) + != NULL) + return 1; + else + return 0; + } +#else +static void *dummy=&dummy; +#endif diff --git a/crypto/openssl/crypto/ec/ecp_nistp521.c b/crypto/openssl/crypto/ec/ecp_nistp521.c new file mode 100644 index 0000000000..178b655f7f --- /dev/null +++ b/crypto/openssl/crypto/ec/ecp_nistp521.c @@ -0,0 +1,2025 @@ +/* crypto/ec/ecp_nistp521.c */ +/* + * Written by Adam Langley (Google) for the OpenSSL project + */ +/* Copyright 2011 Google Inc. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +/* + * A 64-bit implementation of the NIST P-521 elliptic curve point multiplication + * + * OpenSSL integration was taken from Emilia Kasper's work in ecp_nistp224.c. + * Otherwise based on Emilia's P224 work, which was inspired by my curve25519 + * work which got its smarts from Daniel J. Bernstein's work on the same. + */ + +#include +#ifndef OPENSSL_NO_EC_NISTP_64_GCC_128 + +#ifndef OPENSSL_SYS_VMS +#include +#else +#include +#endif + +#include +#include +#include "ec_lcl.h" + +#if defined(__GNUC__) && (__GNUC__ > 3 || (__GNUC__ == 3 && __GNUC_MINOR__ >= 1)) + /* even with gcc, the typedef won't work for 32-bit platforms */ + typedef __uint128_t uint128_t; /* nonstandard; implemented by gcc on 64-bit platforms */ +#else + #error "Need GCC 3.1 or later to define type uint128_t" +#endif + +typedef uint8_t u8; +typedef uint64_t u64; +typedef int64_t s64; + +/* The underlying field. + * + * P521 operates over GF(2^521-1). We can serialise an element of this field + * into 66 bytes where the most significant byte contains only a single bit. We + * call this an felem_bytearray. */ + +typedef u8 felem_bytearray[66]; + +/* These are the parameters of P521, taken from FIPS 186-3, section D.1.2.5. + * These values are big-endian. */ +static const felem_bytearray nistp521_curve_params[5] = + { + {0x01, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, /* p */ + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff}, + {0x01, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, /* a = -3 */ + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xfc}, + {0x00, 0x51, 0x95, 0x3e, 0xb9, 0x61, 0x8e, 0x1c, /* b */ + 0x9a, 0x1f, 0x92, 0x9a, 0x21, 0xa0, 0xb6, 0x85, + 0x40, 0xee, 0xa2, 0xda, 0x72, 0x5b, 0x99, 0xb3, + 0x15, 0xf3, 0xb8, 0xb4, 0x89, 0x91, 0x8e, 0xf1, + 0x09, 0xe1, 0x56, 0x19, 0x39, 0x51, 0xec, 0x7e, + 0x93, 0x7b, 0x16, 0x52, 0xc0, 0xbd, 0x3b, 0xb1, + 0xbf, 0x07, 0x35, 0x73, 0xdf, 0x88, 0x3d, 0x2c, + 0x34, 0xf1, 0xef, 0x45, 0x1f, 0xd4, 0x6b, 0x50, + 0x3f, 0x00}, + {0x00, 0xc6, 0x85, 0x8e, 0x06, 0xb7, 0x04, 0x04, /* x */ + 0xe9, 0xcd, 0x9e, 0x3e, 0xcb, 0x66, 0x23, 0x95, + 0xb4, 0x42, 0x9c, 0x64, 0x81, 0x39, 0x05, 0x3f, + 0xb5, 0x21, 0xf8, 0x28, 0xaf, 0x60, 0x6b, 0x4d, + 0x3d, 0xba, 0xa1, 0x4b, 0x5e, 0x77, 0xef, 0xe7, + 0x59, 0x28, 0xfe, 0x1d, 0xc1, 0x27, 0xa2, 0xff, + 0xa8, 0xde, 0x33, 0x48, 0xb3, 0xc1, 0x85, 0x6a, + 0x42, 0x9b, 0xf9, 0x7e, 0x7e, 0x31, 0xc2, 0xe5, + 0xbd, 0x66}, + {0x01, 0x18, 0x39, 0x29, 0x6a, 0x78, 0x9a, 0x3b, /* y */ + 0xc0, 0x04, 0x5c, 0x8a, 0x5f, 0xb4, 0x2c, 0x7d, + 0x1b, 0xd9, 0x98, 0xf5, 0x44, 0x49, 0x57, 0x9b, + 0x44, 0x68, 0x17, 0xaf, 0xbd, 0x17, 0x27, 0x3e, + 0x66, 0x2c, 0x97, 0xee, 0x72, 0x99, 0x5e, 0xf4, + 0x26, 0x40, 0xc5, 0x50, 0xb9, 0x01, 0x3f, 0xad, + 0x07, 0x61, 0x35, 0x3c, 0x70, 0x86, 0xa2, 0x72, + 0xc2, 0x40, 0x88, 0xbe, 0x94, 0x76, 0x9f, 0xd1, + 0x66, 0x50} + }; + +/* The representation of field elements. + * ------------------------------------ + * + * We represent field elements with nine values. These values are either 64 or + * 128 bits and the field element represented is: + * v[0]*2^0 + v[1]*2^58 + v[2]*2^116 + ... + v[8]*2^464 (mod p) + * Each of the nine values is called a 'limb'. Since the limbs are spaced only + * 58 bits apart, but are greater than 58 bits in length, the most significant + * bits of each limb overlap with the least significant bits of the next. + * + * A field element with 64-bit limbs is an 'felem'. One with 128-bit limbs is a + * 'largefelem' */ + +#define NLIMBS 9 + +typedef uint64_t limb; +typedef limb felem[NLIMBS]; +typedef uint128_t largefelem[NLIMBS]; + +static const limb bottom57bits = 0x1ffffffffffffff; +static const limb bottom58bits = 0x3ffffffffffffff; + +/* bin66_to_felem takes a little-endian byte array and converts it into felem + * form. This assumes that the CPU is little-endian. */ +static void bin66_to_felem(felem out, const u8 in[66]) + { + out[0] = (*((limb*) &in[0])) & bottom58bits; + out[1] = (*((limb*) &in[7]) >> 2) & bottom58bits; + out[2] = (*((limb*) &in[14]) >> 4) & bottom58bits; + out[3] = (*((limb*) &in[21]) >> 6) & bottom58bits; + out[4] = (*((limb*) &in[29])) & bottom58bits; + out[5] = (*((limb*) &in[36]) >> 2) & bottom58bits; + out[6] = (*((limb*) &in[43]) >> 4) & bottom58bits; + out[7] = (*((limb*) &in[50]) >> 6) & bottom58bits; + out[8] = (*((limb*) &in[58])) & bottom57bits; + } + +/* felem_to_bin66 takes an felem and serialises into a little endian, 66 byte + * array. This assumes that the CPU is little-endian. */ +static void felem_to_bin66(u8 out[66], const felem in) + { + memset(out, 0, 66); + (*((limb*) &out[0])) = in[0]; + (*((limb*) &out[7])) |= in[1] << 2; + (*((limb*) &out[14])) |= in[2] << 4; + (*((limb*) &out[21])) |= in[3] << 6; + (*((limb*) &out[29])) = in[4]; + (*((limb*) &out[36])) |= in[5] << 2; + (*((limb*) &out[43])) |= in[6] << 4; + (*((limb*) &out[50])) |= in[7] << 6; + (*((limb*) &out[58])) = in[8]; + } + +/* To preserve endianness when using BN_bn2bin and BN_bin2bn */ +static void flip_endian(u8 *out, const u8 *in, unsigned len) + { + unsigned i; + for (i = 0; i < len; ++i) + out[i] = in[len-1-i]; + } + +/* BN_to_felem converts an OpenSSL BIGNUM into an felem */ +static int BN_to_felem(felem out, const BIGNUM *bn) + { + felem_bytearray b_in; + felem_bytearray b_out; + unsigned num_bytes; + + /* BN_bn2bin eats leading zeroes */ + memset(b_out, 0, sizeof b_out); + num_bytes = BN_num_bytes(bn); + if (num_bytes > sizeof b_out) + { + ECerr(EC_F_BN_TO_FELEM, EC_R_BIGNUM_OUT_OF_RANGE); + return 0; + } + if (BN_is_negative(bn)) + { + ECerr(EC_F_BN_TO_FELEM, EC_R_BIGNUM_OUT_OF_RANGE); + return 0; + } + num_bytes = BN_bn2bin(bn, b_in); + flip_endian(b_out, b_in, num_bytes); + bin66_to_felem(out, b_out); + return 1; + } + +/* felem_to_BN converts an felem into an OpenSSL BIGNUM */ +static BIGNUM *felem_to_BN(BIGNUM *out, const felem in) + { + felem_bytearray b_in, b_out; + felem_to_bin66(b_in, in); + flip_endian(b_out, b_in, sizeof b_out); + return BN_bin2bn(b_out, sizeof b_out, out); + } + + +/* Field operations + * ---------------- */ + +static void felem_one(felem out) + { + out[0] = 1; + out[1] = 0; + out[2] = 0; + out[3] = 0; + out[4] = 0; + out[5] = 0; + out[6] = 0; + out[7] = 0; + out[8] = 0; + } + +static void felem_assign(felem out, const felem in) + { + out[0] = in[0]; + out[1] = in[1]; + out[2] = in[2]; + out[3] = in[3]; + out[4] = in[4]; + out[5] = in[5]; + out[6] = in[6]; + out[7] = in[7]; + out[8] = in[8]; + } + +/* felem_sum64 sets out = out + in. */ +static void felem_sum64(felem out, const felem in) + { + out[0] += in[0]; + out[1] += in[1]; + out[2] += in[2]; + out[3] += in[3]; + out[4] += in[4]; + out[5] += in[5]; + out[6] += in[6]; + out[7] += in[7]; + out[8] += in[8]; + } + +/* felem_scalar sets out = in * scalar */ +static void felem_scalar(felem out, const felem in, limb scalar) + { + out[0] = in[0] * scalar; + out[1] = in[1] * scalar; + out[2] = in[2] * scalar; + out[3] = in[3] * scalar; + out[4] = in[4] * scalar; + out[5] = in[5] * scalar; + out[6] = in[6] * scalar; + out[7] = in[7] * scalar; + out[8] = in[8] * scalar; + } + +/* felem_scalar64 sets out = out * scalar */ +static void felem_scalar64(felem out, limb scalar) + { + out[0] *= scalar; + out[1] *= scalar; + out[2] *= scalar; + out[3] *= scalar; + out[4] *= scalar; + out[5] *= scalar; + out[6] *= scalar; + out[7] *= scalar; + out[8] *= scalar; + } + +/* felem_scalar128 sets out = out * scalar */ +static void felem_scalar128(largefelem out, limb scalar) + { + out[0] *= scalar; + out[1] *= scalar; + out[2] *= scalar; + out[3] *= scalar; + out[4] *= scalar; + out[5] *= scalar; + out[6] *= scalar; + out[7] *= scalar; + out[8] *= scalar; + } + +/* felem_neg sets |out| to |-in| + * On entry: + * in[i] < 2^59 + 2^14 + * On exit: + * out[i] < 2^62 + */ +static void felem_neg(felem out, const felem in) + { + /* In order to prevent underflow, we subtract from 0 mod p. */ + static const limb two62m3 = (((limb)1) << 62) - (((limb)1) << 5); + static const limb two62m2 = (((limb)1) << 62) - (((limb)1) << 4); + + out[0] = two62m3 - in[0]; + out[1] = two62m2 - in[1]; + out[2] = two62m2 - in[2]; + out[3] = two62m2 - in[3]; + out[4] = two62m2 - in[4]; + out[5] = two62m2 - in[5]; + out[6] = two62m2 - in[6]; + out[7] = two62m2 - in[7]; + out[8] = two62m2 - in[8]; + } + +/* felem_diff64 subtracts |in| from |out| + * On entry: + * in[i] < 2^59 + 2^14 + * On exit: + * out[i] < out[i] + 2^62 + */ +static void felem_diff64(felem out, const felem in) + { + /* In order to prevent underflow, we add 0 mod p before subtracting. */ + static const limb two62m3 = (((limb)1) << 62) - (((limb)1) << 5); + static const limb two62m2 = (((limb)1) << 62) - (((limb)1) << 4); + + out[0] += two62m3 - in[0]; + out[1] += two62m2 - in[1]; + out[2] += two62m2 - in[2]; + out[3] += two62m2 - in[3]; + out[4] += two62m2 - in[4]; + out[5] += two62m2 - in[5]; + out[6] += two62m2 - in[6]; + out[7] += two62m2 - in[7]; + out[8] += two62m2 - in[8]; + } + +/* felem_diff_128_64 subtracts |in| from |out| + * On entry: + * in[i] < 2^62 + 2^17 + * On exit: + * out[i] < out[i] + 2^63 + */ +static void felem_diff_128_64(largefelem out, const felem in) + { + /* In order to prevent underflow, we add 0 mod p before subtracting. */ + static const limb two63m6 = (((limb)1) << 62) - (((limb)1) << 5); + static const limb two63m5 = (((limb)1) << 62) - (((limb)1) << 4); + + out[0] += two63m6 - in[0]; + out[1] += two63m5 - in[1]; + out[2] += two63m5 - in[2]; + out[3] += two63m5 - in[3]; + out[4] += two63m5 - in[4]; + out[5] += two63m5 - in[5]; + out[6] += two63m5 - in[6]; + out[7] += two63m5 - in[7]; + out[8] += two63m5 - in[8]; + } + +/* felem_diff_128_64 subtracts |in| from |out| + * On entry: + * in[i] < 2^126 + * On exit: + * out[i] < out[i] + 2^127 - 2^69 + */ +static void felem_diff128(largefelem out, const largefelem in) + { + /* In order to prevent underflow, we add 0 mod p before subtracting. */ + static const uint128_t two127m70 = (((uint128_t)1) << 127) - (((uint128_t)1) << 70); + static const uint128_t two127m69 = (((uint128_t)1) << 127) - (((uint128_t)1) << 69); + + out[0] += (two127m70 - in[0]); + out[1] += (two127m69 - in[1]); + out[2] += (two127m69 - in[2]); + out[3] += (two127m69 - in[3]); + out[4] += (two127m69 - in[4]); + out[5] += (two127m69 - in[5]); + out[6] += (two127m69 - in[6]); + out[7] += (two127m69 - in[7]); + out[8] += (two127m69 - in[8]); + } + +/* felem_square sets |out| = |in|^2 + * On entry: + * in[i] < 2^62 + * On exit: + * out[i] < 17 * max(in[i]) * max(in[i]) + */ +static void felem_square(largefelem out, const felem in) + { + felem inx2, inx4; + felem_scalar(inx2, in, 2); + felem_scalar(inx4, in, 4); + + /* We have many cases were we want to do + * in[x] * in[y] + + * in[y] * in[x] + * This is obviously just + * 2 * in[x] * in[y] + * However, rather than do the doubling on the 128 bit result, we + * double one of the inputs to the multiplication by reading from + * |inx2| */ + + out[0] = ((uint128_t) in[0]) * in[0]; + out[1] = ((uint128_t) in[0]) * inx2[1]; + out[2] = ((uint128_t) in[0]) * inx2[2] + + ((uint128_t) in[1]) * in[1]; + out[3] = ((uint128_t) in[0]) * inx2[3] + + ((uint128_t) in[1]) * inx2[2]; + out[4] = ((uint128_t) in[0]) * inx2[4] + + ((uint128_t) in[1]) * inx2[3] + + ((uint128_t) in[2]) * in[2]; + out[5] = ((uint128_t) in[0]) * inx2[5] + + ((uint128_t) in[1]) * inx2[4] + + ((uint128_t) in[2]) * inx2[3]; + out[6] = ((uint128_t) in[0]) * inx2[6] + + ((uint128_t) in[1]) * inx2[5] + + ((uint128_t) in[2]) * inx2[4] + + ((uint128_t) in[3]) * in[3]; + out[7] = ((uint128_t) in[0]) * inx2[7] + + ((uint128_t) in[1]) * inx2[6] + + ((uint128_t) in[2]) * inx2[5] + + ((uint128_t) in[3]) * inx2[4]; + out[8] = ((uint128_t) in[0]) * inx2[8] + + ((uint128_t) in[1]) * inx2[7] + + ((uint128_t) in[2]) * inx2[6] + + ((uint128_t) in[3]) * inx2[5] + + ((uint128_t) in[4]) * in[4]; + + /* The remaining limbs fall above 2^521, with the first falling at + * 2^522. They correspond to locations one bit up from the limbs + * produced above so we would have to multiply by two to align them. + * Again, rather than operate on the 128-bit result, we double one of + * the inputs to the multiplication. If we want to double for both this + * reason, and the reason above, then we end up multiplying by four. */ + + /* 9 */ + out[0] += ((uint128_t) in[1]) * inx4[8] + + ((uint128_t) in[2]) * inx4[7] + + ((uint128_t) in[3]) * inx4[6] + + ((uint128_t) in[4]) * inx4[5]; + + /* 10 */ + out[1] += ((uint128_t) in[2]) * inx4[8] + + ((uint128_t) in[3]) * inx4[7] + + ((uint128_t) in[4]) * inx4[6] + + ((uint128_t) in[5]) * inx2[5]; + + /* 11 */ + out[2] += ((uint128_t) in[3]) * inx4[8] + + ((uint128_t) in[4]) * inx4[7] + + ((uint128_t) in[5]) * inx4[6]; + + /* 12 */ + out[3] += ((uint128_t) in[4]) * inx4[8] + + ((uint128_t) in[5]) * inx4[7] + + ((uint128_t) in[6]) * inx2[6]; + + /* 13 */ + out[4] += ((uint128_t) in[5]) * inx4[8] + + ((uint128_t) in[6]) * inx4[7]; + + /* 14 */ + out[5] += ((uint128_t) in[6]) * inx4[8] + + ((uint128_t) in[7]) * inx2[7]; + + /* 15 */ + out[6] += ((uint128_t) in[7]) * inx4[8]; + + /* 16 */ + out[7] += ((uint128_t) in[8]) * inx2[8]; + } + +/* felem_mul sets |out| = |in1| * |in2| + * On entry: + * in1[i] < 2^64 + * in2[i] < 2^63 + * On exit: + * out[i] < 17 * max(in1[i]) * max(in2[i]) + */ +static void felem_mul(largefelem out, const felem in1, const felem in2) + { + felem in2x2; + felem_scalar(in2x2, in2, 2); + + out[0] = ((uint128_t) in1[0]) * in2[0]; + + out[1] = ((uint128_t) in1[0]) * in2[1] + + ((uint128_t) in1[1]) * in2[0]; + + out[2] = ((uint128_t) in1[0]) * in2[2] + + ((uint128_t) in1[1]) * in2[1] + + ((uint128_t) in1[2]) * in2[0]; + + out[3] = ((uint128_t) in1[0]) * in2[3] + + ((uint128_t) in1[1]) * in2[2] + + ((uint128_t) in1[2]) * in2[1] + + ((uint128_t) in1[3]) * in2[0]; + + out[4] = ((uint128_t) in1[0]) * in2[4] + + ((uint128_t) in1[1]) * in2[3] + + ((uint128_t) in1[2]) * in2[2] + + ((uint128_t) in1[3]) * in2[1] + + ((uint128_t) in1[4]) * in2[0]; + + out[5] = ((uint128_t) in1[0]) * in2[5] + + ((uint128_t) in1[1]) * in2[4] + + ((uint128_t) in1[2]) * in2[3] + + ((uint128_t) in1[3]) * in2[2] + + ((uint128_t) in1[4]) * in2[1] + + ((uint128_t) in1[5]) * in2[0]; + + out[6] = ((uint128_t) in1[0]) * in2[6] + + ((uint128_t) in1[1]) * in2[5] + + ((uint128_t) in1[2]) * in2[4] + + ((uint128_t) in1[3]) * in2[3] + + ((uint128_t) in1[4]) * in2[2] + + ((uint128_t) in1[5]) * in2[1] + + ((uint128_t) in1[6]) * in2[0]; + + out[7] = ((uint128_t) in1[0]) * in2[7] + + ((uint128_t) in1[1]) * in2[6] + + ((uint128_t) in1[2]) * in2[5] + + ((uint128_t) in1[3]) * in2[4] + + ((uint128_t) in1[4]) * in2[3] + + ((uint128_t) in1[5]) * in2[2] + + ((uint128_t) in1[6]) * in2[1] + + ((uint128_t) in1[7]) * in2[0]; + + out[8] = ((uint128_t) in1[0]) * in2[8] + + ((uint128_t) in1[1]) * in2[7] + + ((uint128_t) in1[2]) * in2[6] + + ((uint128_t) in1[3]) * in2[5] + + ((uint128_t) in1[4]) * in2[4] + + ((uint128_t) in1[5]) * in2[3] + + ((uint128_t) in1[6]) * in2[2] + + ((uint128_t) in1[7]) * in2[1] + + ((uint128_t) in1[8]) * in2[0]; + + /* See comment in felem_square about the use of in2x2 here */ + + out[0] += ((uint128_t) in1[1]) * in2x2[8] + + ((uint128_t) in1[2]) * in2x2[7] + + ((uint128_t) in1[3]) * in2x2[6] + + ((uint128_t) in1[4]) * in2x2[5] + + ((uint128_t) in1[5]) * in2x2[4] + + ((uint128_t) in1[6]) * in2x2[3] + + ((uint128_t) in1[7]) * in2x2[2] + + ((uint128_t) in1[8]) * in2x2[1]; + + out[1] += ((uint128_t) in1[2]) * in2x2[8] + + ((uint128_t) in1[3]) * in2x2[7] + + ((uint128_t) in1[4]) * in2x2[6] + + ((uint128_t) in1[5]) * in2x2[5] + + ((uint128_t) in1[6]) * in2x2[4] + + ((uint128_t) in1[7]) * in2x2[3] + + ((uint128_t) in1[8]) * in2x2[2]; + + out[2] += ((uint128_t) in1[3]) * in2x2[8] + + ((uint128_t) in1[4]) * in2x2[7] + + ((uint128_t) in1[5]) * in2x2[6] + + ((uint128_t) in1[6]) * in2x2[5] + + ((uint128_t) in1[7]) * in2x2[4] + + ((uint128_t) in1[8]) * in2x2[3]; + + out[3] += ((uint128_t) in1[4]) * in2x2[8] + + ((uint128_t) in1[5]) * in2x2[7] + + ((uint128_t) in1[6]) * in2x2[6] + + ((uint128_t) in1[7]) * in2x2[5] + + ((uint128_t) in1[8]) * in2x2[4]; + + out[4] += ((uint128_t) in1[5]) * in2x2[8] + + ((uint128_t) in1[6]) * in2x2[7] + + ((uint128_t) in1[7]) * in2x2[6] + + ((uint128_t) in1[8]) * in2x2[5]; + + out[5] += ((uint128_t) in1[6]) * in2x2[8] + + ((uint128_t) in1[7]) * in2x2[7] + + ((uint128_t) in1[8]) * in2x2[6]; + + out[6] += ((uint128_t) in1[7]) * in2x2[8] + + ((uint128_t) in1[8]) * in2x2[7]; + + out[7] += ((uint128_t) in1[8]) * in2x2[8]; + } + +static const limb bottom52bits = 0xfffffffffffff; + +/* felem_reduce converts a largefelem to an felem. + * On entry: + * in[i] < 2^128 + * On exit: + * out[i] < 2^59 + 2^14 + */ +static void felem_reduce(felem out, const largefelem in) + { + u64 overflow1, overflow2; + + out[0] = ((limb) in[0]) & bottom58bits; + out[1] = ((limb) in[1]) & bottom58bits; + out[2] = ((limb) in[2]) & bottom58bits; + out[3] = ((limb) in[3]) & bottom58bits; + out[4] = ((limb) in[4]) & bottom58bits; + out[5] = ((limb) in[5]) & bottom58bits; + out[6] = ((limb) in[6]) & bottom58bits; + out[7] = ((limb) in[7]) & bottom58bits; + out[8] = ((limb) in[8]) & bottom58bits; + + /* out[i] < 2^58 */ + + out[1] += ((limb) in[0]) >> 58; + out[1] += (((limb) (in[0] >> 64)) & bottom52bits) << 6; + /* out[1] < 2^58 + 2^6 + 2^58 + * = 2^59 + 2^6 */ + out[2] += ((limb) (in[0] >> 64)) >> 52; + + out[2] += ((limb) in[1]) >> 58; + out[2] += (((limb) (in[1] >> 64)) & bottom52bits) << 6; + out[3] += ((limb) (in[1] >> 64)) >> 52; + + out[3] += ((limb) in[2]) >> 58; + out[3] += (((limb) (in[2] >> 64)) & bottom52bits) << 6; + out[4] += ((limb) (in[2] >> 64)) >> 52; + + out[4] += ((limb) in[3]) >> 58; + out[4] += (((limb) (in[3] >> 64)) & bottom52bits) << 6; + out[5] += ((limb) (in[3] >> 64)) >> 52; + + out[5] += ((limb) in[4]) >> 58; + out[5] += (((limb) (in[4] >> 64)) & bottom52bits) << 6; + out[6] += ((limb) (in[4] >> 64)) >> 52; + + out[6] += ((limb) in[5]) >> 58; + out[6] += (((limb) (in[5] >> 64)) & bottom52bits) << 6; + out[7] += ((limb) (in[5] >> 64)) >> 52; + + out[7] += ((limb) in[6]) >> 58; + out[7] += (((limb) (in[6] >> 64)) & bottom52bits) << 6; + out[8] += ((limb) (in[6] >> 64)) >> 52; + + out[8] += ((limb) in[7]) >> 58; + out[8] += (((limb) (in[7] >> 64)) & bottom52bits) << 6; + /* out[x > 1] < 2^58 + 2^6 + 2^58 + 2^12 + * < 2^59 + 2^13 */ + overflow1 = ((limb) (in[7] >> 64)) >> 52; + + overflow1 += ((limb) in[8]) >> 58; + overflow1 += (((limb) (in[8] >> 64)) & bottom52bits) << 6; + overflow2 = ((limb) (in[8] >> 64)) >> 52; + + overflow1 <<= 1; /* overflow1 < 2^13 + 2^7 + 2^59 */ + overflow2 <<= 1; /* overflow2 < 2^13 */ + + out[0] += overflow1; /* out[0] < 2^60 */ + out[1] += overflow2; /* out[1] < 2^59 + 2^6 + 2^13 */ + + out[1] += out[0] >> 58; out[0] &= bottom58bits; + /* out[0] < 2^58 + * out[1] < 2^59 + 2^6 + 2^13 + 2^2 + * < 2^59 + 2^14 */ + } + +static void felem_square_reduce(felem out, const felem in) + { + largefelem tmp; + felem_square(tmp, in); + felem_reduce(out, tmp); + } + +static void felem_mul_reduce(felem out, const felem in1, const felem in2) + { + largefelem tmp; + felem_mul(tmp, in1, in2); + felem_reduce(out, tmp); + } + +/* felem_inv calculates |out| = |in|^{-1} + * + * Based on Fermat's Little Theorem: + * a^p = a (mod p) + * a^{p-1} = 1 (mod p) + * a^{p-2} = a^{-1} (mod p) + */ +static void felem_inv(felem out, const felem in) + { + felem ftmp, ftmp2, ftmp3, ftmp4; + largefelem tmp; + unsigned i; + + felem_square(tmp, in); felem_reduce(ftmp, tmp); /* 2^1 */ + felem_mul(tmp, in, ftmp); felem_reduce(ftmp, tmp); /* 2^2 - 2^0 */ + felem_assign(ftmp2, ftmp); + felem_square(tmp, ftmp); felem_reduce(ftmp, tmp); /* 2^3 - 2^1 */ + felem_mul(tmp, in, ftmp); felem_reduce(ftmp, tmp); /* 2^3 - 2^0 */ + felem_square(tmp, ftmp); felem_reduce(ftmp, tmp); /* 2^4 - 2^1 */ + + felem_square(tmp, ftmp2); felem_reduce(ftmp3, tmp); /* 2^3 - 2^1 */ + felem_square(tmp, ftmp3); felem_reduce(ftmp3, tmp); /* 2^4 - 2^2 */ + felem_mul(tmp, ftmp3, ftmp2); felem_reduce(ftmp3, tmp); /* 2^4 - 2^0 */ + + felem_assign(ftmp2, ftmp3); + felem_square(tmp, ftmp3); felem_reduce(ftmp3, tmp); /* 2^5 - 2^1 */ + felem_square(tmp, ftmp3); felem_reduce(ftmp3, tmp); /* 2^6 - 2^2 */ + felem_square(tmp, ftmp3); felem_reduce(ftmp3, tmp); /* 2^7 - 2^3 */ + felem_square(tmp, ftmp3); felem_reduce(ftmp3, tmp); /* 2^8 - 2^4 */ + felem_assign(ftmp4, ftmp3); + felem_mul(tmp, ftmp3, ftmp); felem_reduce(ftmp4, tmp); /* 2^8 - 2^1 */ + felem_square(tmp, ftmp4); felem_reduce(ftmp4, tmp); /* 2^9 - 2^2 */ + felem_mul(tmp, ftmp3, ftmp2); felem_reduce(ftmp3, tmp); /* 2^8 - 2^0 */ + felem_assign(ftmp2, ftmp3); + + for (i = 0; i < 8; i++) + { + felem_square(tmp, ftmp3); felem_reduce(ftmp3, tmp); /* 2^16 - 2^8 */ + } + felem_mul(tmp, ftmp3, ftmp2); felem_reduce(ftmp3, tmp); /* 2^16 - 2^0 */ + felem_assign(ftmp2, ftmp3); + + for (i = 0; i < 16; i++) + { + felem_square(tmp, ftmp3); felem_reduce(ftmp3, tmp); /* 2^32 - 2^16 */ + } + felem_mul(tmp, ftmp3, ftmp2); felem_reduce(ftmp3, tmp); /* 2^32 - 2^0 */ + felem_assign(ftmp2, ftmp3); + + for (i = 0; i < 32; i++) + { + felem_square(tmp, ftmp3); felem_reduce(ftmp3, tmp); /* 2^64 - 2^32 */ + } + felem_mul(tmp, ftmp3, ftmp2); felem_reduce(ftmp3, tmp); /* 2^64 - 2^0 */ + felem_assign(ftmp2, ftmp3); + + for (i = 0; i < 64; i++) + { + felem_square(tmp, ftmp3); felem_reduce(ftmp3, tmp); /* 2^128 - 2^64 */ + } + felem_mul(tmp, ftmp3, ftmp2); felem_reduce(ftmp3, tmp); /* 2^128 - 2^0 */ + felem_assign(ftmp2, ftmp3); + + for (i = 0; i < 128; i++) + { + felem_square(tmp, ftmp3); felem_reduce(ftmp3, tmp); /* 2^256 - 2^128 */ + } + felem_mul(tmp, ftmp3, ftmp2); felem_reduce(ftmp3, tmp); /* 2^256 - 2^0 */ + felem_assign(ftmp2, ftmp3); + + for (i = 0; i < 256; i++) + { + felem_square(tmp, ftmp3); felem_reduce(ftmp3, tmp); /* 2^512 - 2^256 */ + } + felem_mul(tmp, ftmp3, ftmp2); felem_reduce(ftmp3, tmp); /* 2^512 - 2^0 */ + + for (i = 0; i < 9; i++) + { + felem_square(tmp, ftmp3); felem_reduce(ftmp3, tmp); /* 2^521 - 2^9 */ + } + felem_mul(tmp, ftmp3, ftmp4); felem_reduce(ftmp3, tmp); /* 2^512 - 2^2 */ + felem_mul(tmp, ftmp3, in); felem_reduce(out, tmp); /* 2^512 - 3 */ +} + +/* This is 2^521-1, expressed as an felem */ +static const felem kPrime = + { + 0x03ffffffffffffff, 0x03ffffffffffffff, 0x03ffffffffffffff, + 0x03ffffffffffffff, 0x03ffffffffffffff, 0x03ffffffffffffff, + 0x03ffffffffffffff, 0x03ffffffffffffff, 0x01ffffffffffffff + }; + +/* felem_is_zero returns a limb with all bits set if |in| == 0 (mod p) and 0 + * otherwise. + * On entry: + * in[i] < 2^59 + 2^14 + */ +static limb felem_is_zero(const felem in) + { + felem ftmp; + limb is_zero, is_p; + felem_assign(ftmp, in); + + ftmp[0] += ftmp[8] >> 57; ftmp[8] &= bottom57bits; + /* ftmp[8] < 2^57 */ + ftmp[1] += ftmp[0] >> 58; ftmp[0] &= bottom58bits; + ftmp[2] += ftmp[1] >> 58; ftmp[1] &= bottom58bits; + ftmp[3] += ftmp[2] >> 58; ftmp[2] &= bottom58bits; + ftmp[4] += ftmp[3] >> 58; ftmp[3] &= bottom58bits; + ftmp[5] += ftmp[4] >> 58; ftmp[4] &= bottom58bits; + ftmp[6] += ftmp[5] >> 58; ftmp[5] &= bottom58bits; + ftmp[7] += ftmp[6] >> 58; ftmp[6] &= bottom58bits; + ftmp[8] += ftmp[7] >> 58; ftmp[7] &= bottom58bits; + /* ftmp[8] < 2^57 + 4 */ + + /* The ninth limb of 2*(2^521-1) is 0x03ffffffffffffff, which is + * greater than our bound for ftmp[8]. Therefore we only have to check + * if the zero is zero or 2^521-1. */ + + is_zero = 0; + is_zero |= ftmp[0]; + is_zero |= ftmp[1]; + is_zero |= ftmp[2]; + is_zero |= ftmp[3]; + is_zero |= ftmp[4]; + is_zero |= ftmp[5]; + is_zero |= ftmp[6]; + is_zero |= ftmp[7]; + is_zero |= ftmp[8]; + + is_zero--; + /* We know that ftmp[i] < 2^63, therefore the only way that the top bit + * can be set is if is_zero was 0 before the decrement. */ + is_zero = ((s64) is_zero) >> 63; + + is_p = ftmp[0] ^ kPrime[0]; + is_p |= ftmp[1] ^ kPrime[1]; + is_p |= ftmp[2] ^ kPrime[2]; + is_p |= ftmp[3] ^ kPrime[3]; + is_p |= ftmp[4] ^ kPrime[4]; + is_p |= ftmp[5] ^ kPrime[5]; + is_p |= ftmp[6] ^ kPrime[6]; + is_p |= ftmp[7] ^ kPrime[7]; + is_p |= ftmp[8] ^ kPrime[8]; + + is_p--; + is_p = ((s64) is_p) >> 63; + + is_zero |= is_p; + return is_zero; + } + +static int felem_is_zero_int(const felem in) + { + return (int) (felem_is_zero(in) & ((limb)1)); + } + +/* felem_contract converts |in| to its unique, minimal representation. + * On entry: + * in[i] < 2^59 + 2^14 + */ +static void felem_contract(felem out, const felem in) + { + limb is_p, is_greater, sign; + static const limb two58 = ((limb)1) << 58; + + felem_assign(out, in); + + out[0] += out[8] >> 57; out[8] &= bottom57bits; + /* out[8] < 2^57 */ + out[1] += out[0] >> 58; out[0] &= bottom58bits; + out[2] += out[1] >> 58; out[1] &= bottom58bits; + out[3] += out[2] >> 58; out[2] &= bottom58bits; + out[4] += out[3] >> 58; out[3] &= bottom58bits; + out[5] += out[4] >> 58; out[4] &= bottom58bits; + out[6] += out[5] >> 58; out[5] &= bottom58bits; + out[7] += out[6] >> 58; out[6] &= bottom58bits; + out[8] += out[7] >> 58; out[7] &= bottom58bits; + /* out[8] < 2^57 + 4 */ + + /* If the value is greater than 2^521-1 then we have to subtract + * 2^521-1 out. See the comments in felem_is_zero regarding why we + * don't test for other multiples of the prime. */ + + /* First, if |out| is equal to 2^521-1, we subtract it out to get zero. */ + + is_p = out[0] ^ kPrime[0]; + is_p |= out[1] ^ kPrime[1]; + is_p |= out[2] ^ kPrime[2]; + is_p |= out[3] ^ kPrime[3]; + is_p |= out[4] ^ kPrime[4]; + is_p |= out[5] ^ kPrime[5]; + is_p |= out[6] ^ kPrime[6]; + is_p |= out[7] ^ kPrime[7]; + is_p |= out[8] ^ kPrime[8]; + + is_p--; + is_p &= is_p << 32; + is_p &= is_p << 16; + is_p &= is_p << 8; + is_p &= is_p << 4; + is_p &= is_p << 2; + is_p &= is_p << 1; + is_p = ((s64) is_p) >> 63; + is_p = ~is_p; + + /* is_p is 0 iff |out| == 2^521-1 and all ones otherwise */ + + out[0] &= is_p; + out[1] &= is_p; + out[2] &= is_p; + out[3] &= is_p; + out[4] &= is_p; + out[5] &= is_p; + out[6] &= is_p; + out[7] &= is_p; + out[8] &= is_p; + + /* In order to test that |out| >= 2^521-1 we need only test if out[8] + * >> 57 is greater than zero as (2^521-1) + x >= 2^522 */ + is_greater = out[8] >> 57; + is_greater |= is_greater << 32; + is_greater |= is_greater << 16; + is_greater |= is_greater << 8; + is_greater |= is_greater << 4; + is_greater |= is_greater << 2; + is_greater |= is_greater << 1; + is_greater = ((s64) is_greater) >> 63; + + out[0] -= kPrime[0] & is_greater; + out[1] -= kPrime[1] & is_greater; + out[2] -= kPrime[2] & is_greater; + out[3] -= kPrime[3] & is_greater; + out[4] -= kPrime[4] & is_greater; + out[5] -= kPrime[5] & is_greater; + out[6] -= kPrime[6] & is_greater; + out[7] -= kPrime[7] & is_greater; + out[8] -= kPrime[8] & is_greater; + + /* Eliminate negative coefficients */ + sign = -(out[0] >> 63); out[0] += (two58 & sign); out[1] -= (1 & sign); + sign = -(out[1] >> 63); out[1] += (two58 & sign); out[2] -= (1 & sign); + sign = -(out[2] >> 63); out[2] += (two58 & sign); out[3] -= (1 & sign); + sign = -(out[3] >> 63); out[3] += (two58 & sign); out[4] -= (1 & sign); + sign = -(out[4] >> 63); out[4] += (two58 & sign); out[5] -= (1 & sign); + sign = -(out[0] >> 63); out[5] += (two58 & sign); out[6] -= (1 & sign); + sign = -(out[6] >> 63); out[6] += (two58 & sign); out[7] -= (1 & sign); + sign = -(out[7] >> 63); out[7] += (two58 & sign); out[8] -= (1 & sign); + sign = -(out[5] >> 63); out[5] += (two58 & sign); out[6] -= (1 & sign); + sign = -(out[6] >> 63); out[6] += (two58 & sign); out[7] -= (1 & sign); + sign = -(out[7] >> 63); out[7] += (two58 & sign); out[8] -= (1 & sign); + } + +/* Group operations + * ---------------- + * + * Building on top of the field operations we have the operations on the + * elliptic curve group itself. Points on the curve are represented in Jacobian + * coordinates */ + +/* point_double calcuates 2*(x_in, y_in, z_in) + * + * The method is taken from: + * http://hyperelliptic.org/EFD/g1p/auto-shortw-jacobian-3.html#doubling-dbl-2001-b + * + * Outputs can equal corresponding inputs, i.e., x_out == x_in is allowed. + * while x_out == y_in is not (maybe this works, but it's not tested). */ +static void +point_double(felem x_out, felem y_out, felem z_out, + const felem x_in, const felem y_in, const felem z_in) + { + largefelem tmp, tmp2; + felem delta, gamma, beta, alpha, ftmp, ftmp2; + + felem_assign(ftmp, x_in); + felem_assign(ftmp2, x_in); + + /* delta = z^2 */ + felem_square(tmp, z_in); + felem_reduce(delta, tmp); /* delta[i] < 2^59 + 2^14 */ + + /* gamma = y^2 */ + felem_square(tmp, y_in); + felem_reduce(gamma, tmp); /* gamma[i] < 2^59 + 2^14 */ + + /* beta = x*gamma */ + felem_mul(tmp, x_in, gamma); + felem_reduce(beta, tmp); /* beta[i] < 2^59 + 2^14 */ + + /* alpha = 3*(x-delta)*(x+delta) */ + felem_diff64(ftmp, delta); + /* ftmp[i] < 2^61 */ + felem_sum64(ftmp2, delta); + /* ftmp2[i] < 2^60 + 2^15 */ + felem_scalar64(ftmp2, 3); + /* ftmp2[i] < 3*2^60 + 3*2^15 */ + felem_mul(tmp, ftmp, ftmp2); + /* tmp[i] < 17(3*2^121 + 3*2^76) + * = 61*2^121 + 61*2^76 + * < 64*2^121 + 64*2^76 + * = 2^127 + 2^82 + * < 2^128 */ + felem_reduce(alpha, tmp); + + /* x' = alpha^2 - 8*beta */ + felem_square(tmp, alpha); + /* tmp[i] < 17*2^120 + * < 2^125 */ + felem_assign(ftmp, beta); + felem_scalar64(ftmp, 8); + /* ftmp[i] < 2^62 + 2^17 */ + felem_diff_128_64(tmp, ftmp); + /* tmp[i] < 2^125 + 2^63 + 2^62 + 2^17 */ + felem_reduce(x_out, tmp); + + /* z' = (y + z)^2 - gamma - delta */ + felem_sum64(delta, gamma); + /* delta[i] < 2^60 + 2^15 */ + felem_assign(ftmp, y_in); + felem_sum64(ftmp, z_in); + /* ftmp[i] < 2^60 + 2^15 */ + felem_square(tmp, ftmp); + /* tmp[i] < 17(2^122) + * < 2^127 */ + felem_diff_128_64(tmp, delta); + /* tmp[i] < 2^127 + 2^63 */ + felem_reduce(z_out, tmp); + + /* y' = alpha*(4*beta - x') - 8*gamma^2 */ + felem_scalar64(beta, 4); + /* beta[i] < 2^61 + 2^16 */ + felem_diff64(beta, x_out); + /* beta[i] < 2^61 + 2^60 + 2^16 */ + felem_mul(tmp, alpha, beta); + /* tmp[i] < 17*((2^59 + 2^14)(2^61 + 2^60 + 2^16)) + * = 17*(2^120 + 2^75 + 2^119 + 2^74 + 2^75 + 2^30) + * = 17*(2^120 + 2^119 + 2^76 + 2^74 + 2^30) + * < 2^128 */ + felem_square(tmp2, gamma); + /* tmp2[i] < 17*(2^59 + 2^14)^2 + * = 17*(2^118 + 2^74 + 2^28) */ + felem_scalar128(tmp2, 8); + /* tmp2[i] < 8*17*(2^118 + 2^74 + 2^28) + * = 2^125 + 2^121 + 2^81 + 2^77 + 2^35 + 2^31 + * < 2^126 */ + felem_diff128(tmp, tmp2); + /* tmp[i] < 2^127 - 2^69 + 17(2^120 + 2^119 + 2^76 + 2^74 + 2^30) + * = 2^127 + 2^124 + 2^122 + 2^120 + 2^118 + 2^80 + 2^78 + 2^76 + + * 2^74 + 2^69 + 2^34 + 2^30 + * < 2^128 */ + felem_reduce(y_out, tmp); + } + +/* copy_conditional copies in to out iff mask is all ones. */ +static void +copy_conditional(felem out, const felem in, limb mask) + { + unsigned i; + for (i = 0; i < NLIMBS; ++i) + { + const limb tmp = mask & (in[i] ^ out[i]); + out[i] ^= tmp; + } + } + +/* point_add calcuates (x1, y1, z1) + (x2, y2, z2) + * + * The method is taken from + * http://hyperelliptic.org/EFD/g1p/auto-shortw-jacobian-3.html#addition-add-2007-bl, + * adapted for mixed addition (z2 = 1, or z2 = 0 for the point at infinity). + * + * This function includes a branch for checking whether the two input points + * are equal (while not equal to the point at infinity). This case never + * happens during single point multiplication, so there is no timing leak for + * ECDH or ECDSA signing. */ +static void point_add(felem x3, felem y3, felem z3, + const felem x1, const felem y1, const felem z1, + const int mixed, const felem x2, const felem y2, const felem z2) + { + felem ftmp, ftmp2, ftmp3, ftmp4, ftmp5, ftmp6, x_out, y_out, z_out; + largefelem tmp, tmp2; + limb x_equal, y_equal, z1_is_zero, z2_is_zero; + + z1_is_zero = felem_is_zero(z1); + z2_is_zero = felem_is_zero(z2); + + /* ftmp = z1z1 = z1**2 */ + felem_square(tmp, z1); + felem_reduce(ftmp, tmp); + + if (!mixed) + { + /* ftmp2 = z2z2 = z2**2 */ + felem_square(tmp, z2); + felem_reduce(ftmp2, tmp); + + /* u1 = ftmp3 = x1*z2z2 */ + felem_mul(tmp, x1, ftmp2); + felem_reduce(ftmp3, tmp); + + /* ftmp5 = z1 + z2 */ + felem_assign(ftmp5, z1); + felem_sum64(ftmp5, z2); + /* ftmp5[i] < 2^61 */ + + /* ftmp5 = (z1 + z2)**2 - z1z1 - z2z2 = 2*z1z2 */ + felem_square(tmp, ftmp5); + /* tmp[i] < 17*2^122 */ + felem_diff_128_64(tmp, ftmp); + /* tmp[i] < 17*2^122 + 2^63 */ + felem_diff_128_64(tmp, ftmp2); + /* tmp[i] < 17*2^122 + 2^64 */ + felem_reduce(ftmp5, tmp); + + /* ftmp2 = z2 * z2z2 */ + felem_mul(tmp, ftmp2, z2); + felem_reduce(ftmp2, tmp); + + /* s1 = ftmp6 = y1 * z2**3 */ + felem_mul(tmp, y1, ftmp2); + felem_reduce(ftmp6, tmp); + } + else + { + /* We'll assume z2 = 1 (special case z2 = 0 is handled later) */ + + /* u1 = ftmp3 = x1*z2z2 */ + felem_assign(ftmp3, x1); + + /* ftmp5 = 2*z1z2 */ + felem_scalar(ftmp5, z1, 2); + + /* s1 = ftmp6 = y1 * z2**3 */ + felem_assign(ftmp6, y1); + } + + /* u2 = x2*z1z1 */ + felem_mul(tmp, x2, ftmp); + /* tmp[i] < 17*2^120 */ + + /* h = ftmp4 = u2 - u1 */ + felem_diff_128_64(tmp, ftmp3); + /* tmp[i] < 17*2^120 + 2^63 */ + felem_reduce(ftmp4, tmp); + + x_equal = felem_is_zero(ftmp4); + + /* z_out = ftmp5 * h */ + felem_mul(tmp, ftmp5, ftmp4); + felem_reduce(z_out, tmp); + + /* ftmp = z1 * z1z1 */ + felem_mul(tmp, ftmp, z1); + felem_reduce(ftmp, tmp); + + /* s2 = tmp = y2 * z1**3 */ + felem_mul(tmp, y2, ftmp); + /* tmp[i] < 17*2^120 */ + + /* r = ftmp5 = (s2 - s1)*2 */ + felem_diff_128_64(tmp, ftmp6); + /* tmp[i] < 17*2^120 + 2^63 */ + felem_reduce(ftmp5, tmp); + y_equal = felem_is_zero(ftmp5); + felem_scalar64(ftmp5, 2); + /* ftmp5[i] < 2^61 */ + + if (x_equal && y_equal && !z1_is_zero && !z2_is_zero) + { + point_double(x3, y3, z3, x1, y1, z1); + return; + } + + /* I = ftmp = (2h)**2 */ + felem_assign(ftmp, ftmp4); + felem_scalar64(ftmp, 2); + /* ftmp[i] < 2^61 */ + felem_square(tmp, ftmp); + /* tmp[i] < 17*2^122 */ + felem_reduce(ftmp, tmp); + + /* J = ftmp2 = h * I */ + felem_mul(tmp, ftmp4, ftmp); + felem_reduce(ftmp2, tmp); + + /* V = ftmp4 = U1 * I */ + felem_mul(tmp, ftmp3, ftmp); + felem_reduce(ftmp4, tmp); + + /* x_out = r**2 - J - 2V */ + felem_square(tmp, ftmp5); + /* tmp[i] < 17*2^122 */ + felem_diff_128_64(tmp, ftmp2); + /* tmp[i] < 17*2^122 + 2^63 */ + felem_assign(ftmp3, ftmp4); + felem_scalar64(ftmp4, 2); + /* ftmp4[i] < 2^61 */ + felem_diff_128_64(tmp, ftmp4); + /* tmp[i] < 17*2^122 + 2^64 */ + felem_reduce(x_out, tmp); + + /* y_out = r(V-x_out) - 2 * s1 * J */ + felem_diff64(ftmp3, x_out); + /* ftmp3[i] < 2^60 + 2^60 + * = 2^61 */ + felem_mul(tmp, ftmp5, ftmp3); + /* tmp[i] < 17*2^122 */ + felem_mul(tmp2, ftmp6, ftmp2); + /* tmp2[i] < 17*2^120 */ + felem_scalar128(tmp2, 2); + /* tmp2[i] < 17*2^121 */ + felem_diff128(tmp, tmp2); + /* tmp[i] < 2^127 - 2^69 + 17*2^122 + * = 2^126 - 2^122 - 2^6 - 2^2 - 1 + * < 2^127 */ + felem_reduce(y_out, tmp); + + copy_conditional(x_out, x2, z1_is_zero); + copy_conditional(x_out, x1, z2_is_zero); + copy_conditional(y_out, y2, z1_is_zero); + copy_conditional(y_out, y1, z2_is_zero); + copy_conditional(z_out, z2, z1_is_zero); + copy_conditional(z_out, z1, z2_is_zero); + felem_assign(x3, x_out); + felem_assign(y3, y_out); + felem_assign(z3, z_out); + } + +/* Base point pre computation + * -------------------------- + * + * Two different sorts of precomputed tables are used in the following code. + * Each contain various points on the curve, where each point is three field + * elements (x, y, z). + * + * For the base point table, z is usually 1 (0 for the point at infinity). + * This table has 16 elements: + * index | bits | point + * ------+---------+------------------------------ + * 0 | 0 0 0 0 | 0G + * 1 | 0 0 0 1 | 1G + * 2 | 0 0 1 0 | 2^130G + * 3 | 0 0 1 1 | (2^130 + 1)G + * 4 | 0 1 0 0 | 2^260G + * 5 | 0 1 0 1 | (2^260 + 1)G + * 6 | 0 1 1 0 | (2^260 + 2^130)G + * 7 | 0 1 1 1 | (2^260 + 2^130 + 1)G + * 8 | 1 0 0 0 | 2^390G + * 9 | 1 0 0 1 | (2^390 + 1)G + * 10 | 1 0 1 0 | (2^390 + 2^130)G + * 11 | 1 0 1 1 | (2^390 + 2^130 + 1)G + * 12 | 1 1 0 0 | (2^390 + 2^260)G + * 13 | 1 1 0 1 | (2^390 + 2^260 + 1)G + * 14 | 1 1 1 0 | (2^390 + 2^260 + 2^130)G + * 15 | 1 1 1 1 | (2^390 + 2^260 + 2^130 + 1)G + * + * The reason for this is so that we can clock bits into four different + * locations when doing simple scalar multiplies against the base point. + * + * Tables for other points have table[i] = iG for i in 0 .. 16. */ + +/* gmul is the table of precomputed base points */ +static const felem gmul[16][3] = + {{{0, 0, 0, 0, 0, 0, 0, 0, 0}, + {0, 0, 0, 0, 0, 0, 0, 0, 0}, + {0, 0, 0, 0, 0, 0, 0, 0, 0}}, + {{0x017e7e31c2e5bd66, 0x022cf0615a90a6fe, 0x00127a2ffa8de334, + 0x01dfbf9d64a3f877, 0x006b4d3dbaa14b5e, 0x014fed487e0a2bd8, + 0x015b4429c6481390, 0x03a73678fb2d988e, 0x00c6858e06b70404}, + {0x00be94769fd16650, 0x031c21a89cb09022, 0x039013fad0761353, + 0x02657bd099031542, 0x03273e662c97ee72, 0x01e6d11a05ebef45, + 0x03d1bd998f544495, 0x03001172297ed0b1, 0x011839296a789a3b}, + {1, 0, 0, 0, 0, 0, 0, 0, 0}}, + {{0x0373faacbc875bae, 0x00f325023721c671, 0x00f666fd3dbde5ad, + 0x01a6932363f88ea7, 0x01fc6d9e13f9c47b, 0x03bcbffc2bbf734e, + 0x013ee3c3647f3a92, 0x029409fefe75d07d, 0x00ef9199963d85e5}, + {0x011173743ad5b178, 0x02499c7c21bf7d46, 0x035beaeabb8b1a58, + 0x00f989c4752ea0a3, 0x0101e1de48a9c1a3, 0x01a20076be28ba6c, + 0x02f8052e5eb2de95, 0x01bfe8f82dea117c, 0x0160074d3c36ddb7}, + {1, 0, 0, 0, 0, 0, 0, 0, 0}}, + {{0x012f3fc373393b3b, 0x03d3d6172f1419fa, 0x02adc943c0b86873, + 0x00d475584177952b, 0x012a4d1673750ee2, 0x00512517a0f13b0c, + 0x02b184671a7b1734, 0x0315b84236f1a50a, 0x00a4afc472edbdb9}, + {0x00152a7077f385c4, 0x03044007d8d1c2ee, 0x0065829d61d52b52, + 0x00494ff6b6631d0d, 0x00a11d94d5f06bcf, 0x02d2f89474d9282e, + 0x0241c5727c06eeb9, 0x0386928710fbdb9d, 0x01f883f727b0dfbe}, + {1, 0, 0, 0, 0, 0, 0, 0, 0}}, + {{0x019b0c3c9185544d, 0x006243a37c9d97db, 0x02ee3cbe030a2ad2, + 0x00cfdd946bb51e0d, 0x0271c00932606b91, 0x03f817d1ec68c561, + 0x03f37009806a369c, 0x03c1f30baf184fd5, 0x01091022d6d2f065}, + {0x0292c583514c45ed, 0x0316fca51f9a286c, 0x00300af507c1489a, + 0x0295f69008298cf1, 0x02c0ed8274943d7b, 0x016509b9b47a431e, + 0x02bc9de9634868ce, 0x005b34929bffcb09, 0x000c1a0121681524}, + {1, 0, 0, 0, 0, 0, 0, 0, 0}}, + {{0x0286abc0292fb9f2, 0x02665eee9805b3f7, 0x01ed7455f17f26d6, + 0x0346355b83175d13, 0x006284944cd0a097, 0x0191895bcdec5e51, + 0x02e288370afda7d9, 0x03b22312bfefa67a, 0x01d104d3fc0613fe}, + {0x0092421a12f7e47f, 0x0077a83fa373c501, 0x03bd25c5f696bd0d, + 0x035c41e4d5459761, 0x01ca0d1742b24f53, 0x00aaab27863a509c, + 0x018b6de47df73917, 0x025c0b771705cd01, 0x01fd51d566d760a7}, + {1, 0, 0, 0, 0, 0, 0, 0, 0}}, + {{0x01dd92ff6b0d1dbd, 0x039c5e2e8f8afa69, 0x0261ed13242c3b27, + 0x0382c6e67026e6a0, 0x01d60b10be2089f9, 0x03c15f3dce86723f, + 0x03c764a32d2a062d, 0x017307eac0fad056, 0x018207c0b96c5256}, + {0x0196a16d60e13154, 0x03e6ce74c0267030, 0x00ddbf2b4e52a5aa, + 0x012738241bbf31c8, 0x00ebe8dc04685a28, 0x024c2ad6d380d4a2, + 0x035ee062a6e62d0e, 0x0029ed74af7d3a0f, 0x00eef32aec142ebd}, + {1, 0, 0, 0, 0, 0, 0, 0, 0}}, + {{0x00c31ec398993b39, 0x03a9f45bcda68253, 0x00ac733c24c70890, + 0x00872b111401ff01, 0x01d178c23195eafb, 0x03bca2c816b87f74, + 0x0261a9af46fbad7a, 0x0324b2a8dd3d28f9, 0x00918121d8f24e23}, + {0x032bc8c1ca983cd7, 0x00d869dfb08fc8c6, 0x01693cb61fce1516, + 0x012a5ea68f4e88a8, 0x010869cab88d7ae3, 0x009081ad277ceee1, + 0x033a77166d064cdc, 0x03955235a1fb3a95, 0x01251a4a9b25b65e}, + {1, 0, 0, 0, 0, 0, 0, 0, 0}}, + {{0x00148a3a1b27f40b, 0x0123186df1b31fdc, 0x00026e7beaad34ce, + 0x01db446ac1d3dbba, 0x0299c1a33437eaec, 0x024540610183cbb7, + 0x0173bb0e9ce92e46, 0x02b937e43921214b, 0x01ab0436a9bf01b5}, + {0x0383381640d46948, 0x008dacbf0e7f330f, 0x03602122bcc3f318, + 0x01ee596b200620d6, 0x03bd0585fda430b3, 0x014aed77fd123a83, + 0x005ace749e52f742, 0x0390fe041da2b842, 0x0189a8ceb3299242}, + {1, 0, 0, 0, 0, 0, 0, 0, 0}}, + {{0x012a19d6b3282473, 0x00c0915918b423ce, 0x023a954eb94405ae, + 0x00529f692be26158, 0x0289fa1b6fa4b2aa, 0x0198ae4ceea346ef, + 0x0047d8cdfbdedd49, 0x00cc8c8953f0f6b8, 0x001424abbff49203}, + {0x0256732a1115a03a, 0x0351bc38665c6733, 0x03f7b950fb4a6447, + 0x000afffa94c22155, 0x025763d0a4dab540, 0x000511e92d4fc283, + 0x030a7e9eda0ee96c, 0x004c3cd93a28bf0a, 0x017edb3a8719217f}, + {1, 0, 0, 0, 0, 0, 0, 0, 0}}, + {{0x011de5675a88e673, 0x031d7d0f5e567fbe, 0x0016b2062c970ae5, + 0x03f4a2be49d90aa7, 0x03cef0bd13822866, 0x03f0923dcf774a6c, + 0x0284bebc4f322f72, 0x016ab2645302bb2c, 0x01793f95dace0e2a}, + {0x010646e13527a28f, 0x01ca1babd59dc5e7, 0x01afedfd9a5595df, + 0x01f15785212ea6b1, 0x0324e5d64f6ae3f4, 0x02d680f526d00645, + 0x0127920fadf627a7, 0x03b383f75df4f684, 0x0089e0057e783b0a}, + {1, 0, 0, 0, 0, 0, 0, 0, 0}}, + {{0x00f334b9eb3c26c6, 0x0298fdaa98568dce, 0x01c2d24843a82292, + 0x020bcb24fa1b0711, 0x02cbdb3d2b1875e6, 0x0014907598f89422, + 0x03abe3aa43b26664, 0x02cbf47f720bc168, 0x0133b5e73014b79b}, + {0x034aab5dab05779d, 0x00cdc5d71fee9abb, 0x0399f16bd4bd9d30, + 0x03582fa592d82647, 0x02be1cdfb775b0e9, 0x0034f7cea32e94cb, + 0x0335a7f08f56f286, 0x03b707e9565d1c8b, 0x0015c946ea5b614f}, + {1, 0, 0, 0, 0, 0, 0, 0, 0}}, + {{0x024676f6cff72255, 0x00d14625cac96378, 0x00532b6008bc3767, + 0x01fc16721b985322, 0x023355ea1b091668, 0x029de7afdc0317c3, + 0x02fc8a7ca2da037c, 0x02de1217d74a6f30, 0x013f7173175b73bf}, + {0x0344913f441490b5, 0x0200f9e272b61eca, 0x0258a246b1dd55d2, + 0x03753db9ea496f36, 0x025e02937a09c5ef, 0x030cbd3d14012692, + 0x01793a67e70dc72a, 0x03ec1d37048a662e, 0x006550f700c32a8d}, + {1, 0, 0, 0, 0, 0, 0, 0, 0}}, + {{0x00d3f48a347eba27, 0x008e636649b61bd8, 0x00d3b93716778fb3, + 0x004d1915757bd209, 0x019d5311a3da44e0, 0x016d1afcbbe6aade, + 0x0241bf5f73265616, 0x0384672e5d50d39b, 0x005009fee522b684}, + {0x029b4fab064435fe, 0x018868ee095bbb07, 0x01ea3d6936cc92b8, + 0x000608b00f78a2f3, 0x02db911073d1c20f, 0x018205938470100a, + 0x01f1e4964cbe6ff2, 0x021a19a29eed4663, 0x01414485f42afa81}, + {1, 0, 0, 0, 0, 0, 0, 0, 0}}, + {{0x01612b3a17f63e34, 0x03813992885428e6, 0x022b3c215b5a9608, + 0x029b4057e19f2fcb, 0x0384059a587af7e6, 0x02d6400ace6fe610, + 0x029354d896e8e331, 0x00c047ee6dfba65e, 0x0037720542e9d49d}, + {0x02ce9eed7c5e9278, 0x0374ed703e79643b, 0x01316c54c4072006, + 0x005aaa09054b2ee8, 0x002824000c840d57, 0x03d4eba24771ed86, + 0x0189c50aabc3bdae, 0x0338c01541e15510, 0x00466d56e38eed42}, + {1, 0, 0, 0, 0, 0, 0, 0, 0}}, + {{0x007efd8330ad8bd6, 0x02465ed48047710b, 0x0034c6606b215e0c, + 0x016ae30c53cbf839, 0x01fa17bd37161216, 0x018ead4e61ce8ab9, + 0x005482ed5f5dee46, 0x037543755bba1d7f, 0x005e5ac7e70a9d0f}, + {0x0117e1bb2fdcb2a2, 0x03deea36249f40c4, 0x028d09b4a6246cb7, + 0x03524b8855bcf756, 0x023d7d109d5ceb58, 0x0178e43e3223ef9c, + 0x0154536a0c6e966a, 0x037964d1286ee9fe, 0x0199bcd90e125055}, + {1, 0, 0, 0, 0, 0, 0, 0, 0}}}; + +/* select_point selects the |idx|th point from a precomputation table and + * copies it to out. */ +static void select_point(const limb idx, unsigned int size, const felem pre_comp[/* size */][3], + felem out[3]) + { + unsigned i, j; + limb *outlimbs = &out[0][0]; + memset(outlimbs, 0, 3 * sizeof(felem)); + + for (i = 0; i < size; i++) + { + const limb *inlimbs = &pre_comp[i][0][0]; + limb mask = i ^ idx; + mask |= mask >> 4; + mask |= mask >> 2; + mask |= mask >> 1; + mask &= 1; + mask--; + for (j = 0; j < NLIMBS * 3; j++) + outlimbs[j] |= inlimbs[j] & mask; + } + } + +/* get_bit returns the |i|th bit in |in| */ +static char get_bit(const felem_bytearray in, int i) + { + if (i < 0) + return 0; + return (in[i >> 3] >> (i & 7)) & 1; + } + +/* Interleaved point multiplication using precomputed point multiples: + * The small point multiples 0*P, 1*P, ..., 16*P are in pre_comp[], + * the scalars in scalars[]. If g_scalar is non-NULL, we also add this multiple + * of the generator, using certain (large) precomputed multiples in g_pre_comp. + * Output point (X, Y, Z) is stored in x_out, y_out, z_out */ +static void batch_mul(felem x_out, felem y_out, felem z_out, + const felem_bytearray scalars[], const unsigned num_points, const u8 *g_scalar, + const int mixed, const felem pre_comp[][17][3], const felem g_pre_comp[16][3]) + { + int i, skip; + unsigned num, gen_mul = (g_scalar != NULL); + felem nq[3], tmp[4]; + limb bits; + u8 sign, digit; + + /* set nq to the point at infinity */ + memset(nq, 0, 3 * sizeof(felem)); + + /* Loop over all scalars msb-to-lsb, interleaving additions + * of multiples of the generator (last quarter of rounds) + * and additions of other points multiples (every 5th round). + */ + skip = 1; /* save two point operations in the first round */ + for (i = (num_points ? 520 : 130); i >= 0; --i) + { + /* double */ + if (!skip) + point_double(nq[0], nq[1], nq[2], nq[0], nq[1], nq[2]); + + /* add multiples of the generator */ + if (gen_mul && (i <= 130)) + { + bits = get_bit(g_scalar, i + 390) << 3; + if (i < 130) + { + bits |= get_bit(g_scalar, i + 260) << 2; + bits |= get_bit(g_scalar, i + 130) << 1; + bits |= get_bit(g_scalar, i); + } + /* select the point to add, in constant time */ + select_point(bits, 16, g_pre_comp, tmp); + if (!skip) + { + point_add(nq[0], nq[1], nq[2], + nq[0], nq[1], nq[2], + 1 /* mixed */, tmp[0], tmp[1], tmp[2]); + } + else + { + memcpy(nq, tmp, 3 * sizeof(felem)); + skip = 0; + } + } + + /* do other additions every 5 doublings */ + if (num_points && (i % 5 == 0)) + { + /* loop over all scalars */ + for (num = 0; num < num_points; ++num) + { + bits = get_bit(scalars[num], i + 4) << 5; + bits |= get_bit(scalars[num], i + 3) << 4; + bits |= get_bit(scalars[num], i + 2) << 3; + bits |= get_bit(scalars[num], i + 1) << 2; + bits |= get_bit(scalars[num], i) << 1; + bits |= get_bit(scalars[num], i - 1); + ec_GFp_nistp_recode_scalar_bits(&sign, &digit, bits); + + /* select the point to add or subtract, in constant time */ + select_point(digit, 17, pre_comp[num], tmp); + felem_neg(tmp[3], tmp[1]); /* (X, -Y, Z) is the negative point */ + copy_conditional(tmp[1], tmp[3], (-(limb) sign)); + + if (!skip) + { + point_add(nq[0], nq[1], nq[2], + nq[0], nq[1], nq[2], + mixed, tmp[0], tmp[1], tmp[2]); + } + else + { + memcpy(nq, tmp, 3 * sizeof(felem)); + skip = 0; + } + } + } + } + felem_assign(x_out, nq[0]); + felem_assign(y_out, nq[1]); + felem_assign(z_out, nq[2]); + } + + +/* Precomputation for the group generator. */ +typedef struct { + felem g_pre_comp[16][3]; + int references; +} NISTP521_PRE_COMP; + +const EC_METHOD *EC_GFp_nistp521_method(void) + { + static const EC_METHOD ret = { + EC_FLAGS_DEFAULT_OCT, + NID_X9_62_prime_field, + ec_GFp_nistp521_group_init, + ec_GFp_simple_group_finish, + ec_GFp_simple_group_clear_finish, + ec_GFp_nist_group_copy, + ec_GFp_nistp521_group_set_curve, + ec_GFp_simple_group_get_curve, + ec_GFp_simple_group_get_degree, + ec_GFp_simple_group_check_discriminant, + ec_GFp_simple_point_init, + ec_GFp_simple_point_finish, + ec_GFp_simple_point_clear_finish, + ec_GFp_simple_point_copy, + ec_GFp_simple_point_set_to_infinity, + ec_GFp_simple_set_Jprojective_coordinates_GFp, + ec_GFp_simple_get_Jprojective_coordinates_GFp, + ec_GFp_simple_point_set_affine_coordinates, + ec_GFp_nistp521_point_get_affine_coordinates, + 0 /* point_set_compressed_coordinates */, + 0 /* point2oct */, + 0 /* oct2point */, + ec_GFp_simple_add, + ec_GFp_simple_dbl, + ec_GFp_simple_invert, + ec_GFp_simple_is_at_infinity, + ec_GFp_simple_is_on_curve, + ec_GFp_simple_cmp, + ec_GFp_simple_make_affine, + ec_GFp_simple_points_make_affine, + ec_GFp_nistp521_points_mul, + ec_GFp_nistp521_precompute_mult, + ec_GFp_nistp521_have_precompute_mult, + ec_GFp_nist_field_mul, + ec_GFp_nist_field_sqr, + 0 /* field_div */, + 0 /* field_encode */, + 0 /* field_decode */, + 0 /* field_set_to_one */ }; + + return &ret; + } + + +/******************************************************************************/ +/* FUNCTIONS TO MANAGE PRECOMPUTATION + */ + +static NISTP521_PRE_COMP *nistp521_pre_comp_new() + { + NISTP521_PRE_COMP *ret = NULL; + ret = (NISTP521_PRE_COMP *)OPENSSL_malloc(sizeof(NISTP521_PRE_COMP)); + if (!ret) + { + ECerr(EC_F_NISTP521_PRE_COMP_NEW, ERR_R_MALLOC_FAILURE); + return ret; + } + memset(ret->g_pre_comp, 0, sizeof(ret->g_pre_comp)); + ret->references = 1; + return ret; + } + +static void *nistp521_pre_comp_dup(void *src_) + { + NISTP521_PRE_COMP *src = src_; + + /* no need to actually copy, these objects never change! */ + CRYPTO_add(&src->references, 1, CRYPTO_LOCK_EC_PRE_COMP); + + return src_; + } + +static void nistp521_pre_comp_free(void *pre_) + { + int i; + NISTP521_PRE_COMP *pre = pre_; + + if (!pre) + return; + + i = CRYPTO_add(&pre->references, -1, CRYPTO_LOCK_EC_PRE_COMP); + if (i > 0) + return; + + OPENSSL_free(pre); + } + +static void nistp521_pre_comp_clear_free(void *pre_) + { + int i; + NISTP521_PRE_COMP *pre = pre_; + + if (!pre) + return; + + i = CRYPTO_add(&pre->references, -1, CRYPTO_LOCK_EC_PRE_COMP); + if (i > 0) + return; + + OPENSSL_cleanse(pre, sizeof(*pre)); + OPENSSL_free(pre); + } + +/******************************************************************************/ +/* OPENSSL EC_METHOD FUNCTIONS + */ + +int ec_GFp_nistp521_group_init(EC_GROUP *group) + { + int ret; + ret = ec_GFp_simple_group_init(group); + group->a_is_minus3 = 1; + return ret; + } + +int ec_GFp_nistp521_group_set_curve(EC_GROUP *group, const BIGNUM *p, + const BIGNUM *a, const BIGNUM *b, BN_CTX *ctx) + { + int ret = 0; + BN_CTX *new_ctx = NULL; + BIGNUM *curve_p, *curve_a, *curve_b; + + if (ctx == NULL) + if ((ctx = new_ctx = BN_CTX_new()) == NULL) return 0; + BN_CTX_start(ctx); + if (((curve_p = BN_CTX_get(ctx)) == NULL) || + ((curve_a = BN_CTX_get(ctx)) == NULL) || + ((curve_b = BN_CTX_get(ctx)) == NULL)) goto err; + BN_bin2bn(nistp521_curve_params[0], sizeof(felem_bytearray), curve_p); + BN_bin2bn(nistp521_curve_params[1], sizeof(felem_bytearray), curve_a); + BN_bin2bn(nistp521_curve_params[2], sizeof(felem_bytearray), curve_b); + if ((BN_cmp(curve_p, p)) || (BN_cmp(curve_a, a)) || + (BN_cmp(curve_b, b))) + { + ECerr(EC_F_EC_GFP_NISTP521_GROUP_SET_CURVE, + EC_R_WRONG_CURVE_PARAMETERS); + goto err; + } + group->field_mod_func = BN_nist_mod_521; + ret = ec_GFp_simple_group_set_curve(group, p, a, b, ctx); +err: + BN_CTX_end(ctx); + if (new_ctx != NULL) + BN_CTX_free(new_ctx); + return ret; + } + +/* Takes the Jacobian coordinates (X, Y, Z) of a point and returns + * (X', Y') = (X/Z^2, Y/Z^3) */ +int ec_GFp_nistp521_point_get_affine_coordinates(const EC_GROUP *group, + const EC_POINT *point, BIGNUM *x, BIGNUM *y, BN_CTX *ctx) + { + felem z1, z2, x_in, y_in, x_out, y_out; + largefelem tmp; + + if (EC_POINT_is_at_infinity(group, point)) + { + ECerr(EC_F_EC_GFP_NISTP521_POINT_GET_AFFINE_COORDINATES, + EC_R_POINT_AT_INFINITY); + return 0; + } + if ((!BN_to_felem(x_in, &point->X)) || (!BN_to_felem(y_in, &point->Y)) || + (!BN_to_felem(z1, &point->Z))) return 0; + felem_inv(z2, z1); + felem_square(tmp, z2); felem_reduce(z1, tmp); + felem_mul(tmp, x_in, z1); felem_reduce(x_in, tmp); + felem_contract(x_out, x_in); + if (x != NULL) + { + if (!felem_to_BN(x, x_out)) + { + ECerr(EC_F_EC_GFP_NISTP521_POINT_GET_AFFINE_COORDINATES, ERR_R_BN_LIB); + return 0; + } + } + felem_mul(tmp, z1, z2); felem_reduce(z1, tmp); + felem_mul(tmp, y_in, z1); felem_reduce(y_in, tmp); + felem_contract(y_out, y_in); + if (y != NULL) + { + if (!felem_to_BN(y, y_out)) + { + ECerr(EC_F_EC_GFP_NISTP521_POINT_GET_AFFINE_COORDINATES, ERR_R_BN_LIB); + return 0; + } + } + return 1; + } + +static void make_points_affine(size_t num, felem points[/* num */][3], felem tmp_felems[/* num+1 */]) + { + /* Runs in constant time, unless an input is the point at infinity + * (which normally shouldn't happen). */ + ec_GFp_nistp_points_make_affine_internal( + num, + points, + sizeof(felem), + tmp_felems, + (void (*)(void *)) felem_one, + (int (*)(const void *)) felem_is_zero_int, + (void (*)(void *, const void *)) felem_assign, + (void (*)(void *, const void *)) felem_square_reduce, + (void (*)(void *, const void *, const void *)) felem_mul_reduce, + (void (*)(void *, const void *)) felem_inv, + (void (*)(void *, const void *)) felem_contract); + } + +/* Computes scalar*generator + \sum scalars[i]*points[i], ignoring NULL values + * Result is stored in r (r can equal one of the inputs). */ +int ec_GFp_nistp521_points_mul(const EC_GROUP *group, EC_POINT *r, + const BIGNUM *scalar, size_t num, const EC_POINT *points[], + const BIGNUM *scalars[], BN_CTX *ctx) + { + int ret = 0; + int j; + int mixed = 0; + BN_CTX *new_ctx = NULL; + BIGNUM *x, *y, *z, *tmp_scalar; + felem_bytearray g_secret; + felem_bytearray *secrets = NULL; + felem (*pre_comp)[17][3] = NULL; + felem *tmp_felems = NULL; + felem_bytearray tmp; + unsigned i, num_bytes; + int have_pre_comp = 0; + size_t num_points = num; + felem x_in, y_in, z_in, x_out, y_out, z_out; + NISTP521_PRE_COMP *pre = NULL; + felem (*g_pre_comp)[3] = NULL; + EC_POINT *generator = NULL; + const EC_POINT *p = NULL; + const BIGNUM *p_scalar = NULL; + + if (ctx == NULL) + if ((ctx = new_ctx = BN_CTX_new()) == NULL) return 0; + BN_CTX_start(ctx); + if (((x = BN_CTX_get(ctx)) == NULL) || + ((y = BN_CTX_get(ctx)) == NULL) || + ((z = BN_CTX_get(ctx)) == NULL) || + ((tmp_scalar = BN_CTX_get(ctx)) == NULL)) + goto err; + + if (scalar != NULL) + { + pre = EC_EX_DATA_get_data(group->extra_data, + nistp521_pre_comp_dup, nistp521_pre_comp_free, + nistp521_pre_comp_clear_free); + if (pre) + /* we have precomputation, try to use it */ + g_pre_comp = &pre->g_pre_comp[0]; + else + /* try to use the standard precomputation */ + g_pre_comp = (felem (*)[3]) gmul; + generator = EC_POINT_new(group); + if (generator == NULL) + goto err; + /* get the generator from precomputation */ + if (!felem_to_BN(x, g_pre_comp[1][0]) || + !felem_to_BN(y, g_pre_comp[1][1]) || + !felem_to_BN(z, g_pre_comp[1][2])) + { + ECerr(EC_F_EC_GFP_NISTP521_POINTS_MUL, ERR_R_BN_LIB); + goto err; + } + if (!EC_POINT_set_Jprojective_coordinates_GFp(group, + generator, x, y, z, ctx)) + goto err; + if (0 == EC_POINT_cmp(group, generator, group->generator, ctx)) + /* precomputation matches generator */ + have_pre_comp = 1; + else + /* we don't have valid precomputation: + * treat the generator as a random point */ + num_points++; + } + + if (num_points > 0) + { + if (num_points >= 2) + { + /* unless we precompute multiples for just one point, + * converting those into affine form is time well spent */ + mixed = 1; + } + secrets = OPENSSL_malloc(num_points * sizeof(felem_bytearray)); + pre_comp = OPENSSL_malloc(num_points * 17 * 3 * sizeof(felem)); + if (mixed) + tmp_felems = OPENSSL_malloc((num_points * 17 + 1) * sizeof(felem)); + if ((secrets == NULL) || (pre_comp == NULL) || (mixed && (tmp_felems == NULL))) + { + ECerr(EC_F_EC_GFP_NISTP521_POINTS_MUL, ERR_R_MALLOC_FAILURE); + goto err; + } + + /* we treat NULL scalars as 0, and NULL points as points at infinity, + * i.e., they contribute nothing to the linear combination */ + memset(secrets, 0, num_points * sizeof(felem_bytearray)); + memset(pre_comp, 0, num_points * 17 * 3 * sizeof(felem)); + for (i = 0; i < num_points; ++i) + { + if (i == num) + /* we didn't have a valid precomputation, so we pick + * the generator */ + { + p = EC_GROUP_get0_generator(group); + p_scalar = scalar; + } + else + /* the i^th point */ + { + p = points[i]; + p_scalar = scalars[i]; + } + if ((p_scalar != NULL) && (p != NULL)) + { + /* reduce scalar to 0 <= scalar < 2^521 */ + if ((BN_num_bits(p_scalar) > 521) || (BN_is_negative(p_scalar))) + { + /* this is an unusual input, and we don't guarantee + * constant-timeness */ + if (!BN_nnmod(tmp_scalar, p_scalar, &group->order, ctx)) + { + ECerr(EC_F_EC_GFP_NISTP521_POINTS_MUL, ERR_R_BN_LIB); + goto err; + } + num_bytes = BN_bn2bin(tmp_scalar, tmp); + } + else + num_bytes = BN_bn2bin(p_scalar, tmp); + flip_endian(secrets[i], tmp, num_bytes); + /* precompute multiples */ + if ((!BN_to_felem(x_out, &p->X)) || + (!BN_to_felem(y_out, &p->Y)) || + (!BN_to_felem(z_out, &p->Z))) goto err; + memcpy(pre_comp[i][1][0], x_out, sizeof(felem)); + memcpy(pre_comp[i][1][1], y_out, sizeof(felem)); + memcpy(pre_comp[i][1][2], z_out, sizeof(felem)); + for (j = 2; j <= 16; ++j) + { + if (j & 1) + { + point_add( + pre_comp[i][j][0], pre_comp[i][j][1], pre_comp[i][j][2], + pre_comp[i][1][0], pre_comp[i][1][1], pre_comp[i][1][2], + 0, pre_comp[i][j-1][0], pre_comp[i][j-1][1], pre_comp[i][j-1][2]); + } + else + { + point_double( + pre_comp[i][j][0], pre_comp[i][j][1], pre_comp[i][j][2], + pre_comp[i][j/2][0], pre_comp[i][j/2][1], pre_comp[i][j/2][2]); + } + } + } + } + if (mixed) + make_points_affine(num_points * 17, pre_comp[0], tmp_felems); + } + + /* the scalar for the generator */ + if ((scalar != NULL) && (have_pre_comp)) + { + memset(g_secret, 0, sizeof(g_secret)); + /* reduce scalar to 0 <= scalar < 2^521 */ + if ((BN_num_bits(scalar) > 521) || (BN_is_negative(scalar))) + { + /* this is an unusual input, and we don't guarantee + * constant-timeness */ + if (!BN_nnmod(tmp_scalar, scalar, &group->order, ctx)) + { + ECerr(EC_F_EC_GFP_NISTP521_POINTS_MUL, ERR_R_BN_LIB); + goto err; + } + num_bytes = BN_bn2bin(tmp_scalar, tmp); + } + else + num_bytes = BN_bn2bin(scalar, tmp); + flip_endian(g_secret, tmp, num_bytes); + /* do the multiplication with generator precomputation*/ + batch_mul(x_out, y_out, z_out, + (const felem_bytearray (*)) secrets, num_points, + g_secret, + mixed, (const felem (*)[17][3]) pre_comp, + (const felem (*)[3]) g_pre_comp); + } + else + /* do the multiplication without generator precomputation */ + batch_mul(x_out, y_out, z_out, + (const felem_bytearray (*)) secrets, num_points, + NULL, mixed, (const felem (*)[17][3]) pre_comp, NULL); + /* reduce the output to its unique minimal representation */ + felem_contract(x_in, x_out); + felem_contract(y_in, y_out); + felem_contract(z_in, z_out); + if ((!felem_to_BN(x, x_in)) || (!felem_to_BN(y, y_in)) || + (!felem_to_BN(z, z_in))) + { + ECerr(EC_F_EC_GFP_NISTP521_POINTS_MUL, ERR_R_BN_LIB); + goto err; + } + ret = EC_POINT_set_Jprojective_coordinates_GFp(group, r, x, y, z, ctx); + +err: + BN_CTX_end(ctx); + if (generator != NULL) + EC_POINT_free(generator); + if (new_ctx != NULL) + BN_CTX_free(new_ctx); + if (secrets != NULL) + OPENSSL_free(secrets); + if (pre_comp != NULL) + OPENSSL_free(pre_comp); + if (tmp_felems != NULL) + OPENSSL_free(tmp_felems); + return ret; + } + +int ec_GFp_nistp521_precompute_mult(EC_GROUP *group, BN_CTX *ctx) + { + int ret = 0; + NISTP521_PRE_COMP *pre = NULL; + int i, j; + BN_CTX *new_ctx = NULL; + BIGNUM *x, *y; + EC_POINT *generator = NULL; + felem tmp_felems[16]; + + /* throw away old precomputation */ + EC_EX_DATA_free_data(&group->extra_data, nistp521_pre_comp_dup, + nistp521_pre_comp_free, nistp521_pre_comp_clear_free); + if (ctx == NULL) + if ((ctx = new_ctx = BN_CTX_new()) == NULL) return 0; + BN_CTX_start(ctx); + if (((x = BN_CTX_get(ctx)) == NULL) || + ((y = BN_CTX_get(ctx)) == NULL)) + goto err; + /* get the generator */ + if (group->generator == NULL) goto err; + generator = EC_POINT_new(group); + if (generator == NULL) + goto err; + BN_bin2bn(nistp521_curve_params[3], sizeof (felem_bytearray), x); + BN_bin2bn(nistp521_curve_params[4], sizeof (felem_bytearray), y); + if (!EC_POINT_set_affine_coordinates_GFp(group, generator, x, y, ctx)) + goto err; + if ((pre = nistp521_pre_comp_new()) == NULL) + goto err; + /* if the generator is the standard one, use built-in precomputation */ + if (0 == EC_POINT_cmp(group, generator, group->generator, ctx)) + { + memcpy(pre->g_pre_comp, gmul, sizeof(pre->g_pre_comp)); + ret = 1; + goto err; + } + if ((!BN_to_felem(pre->g_pre_comp[1][0], &group->generator->X)) || + (!BN_to_felem(pre->g_pre_comp[1][1], &group->generator->Y)) || + (!BN_to_felem(pre->g_pre_comp[1][2], &group->generator->Z))) + goto err; + /* compute 2^130*G, 2^260*G, 2^390*G */ + for (i = 1; i <= 4; i <<= 1) + { + point_double(pre->g_pre_comp[2*i][0], pre->g_pre_comp[2*i][1], + pre->g_pre_comp[2*i][2], pre->g_pre_comp[i][0], + pre->g_pre_comp[i][1], pre->g_pre_comp[i][2]); + for (j = 0; j < 129; ++j) + { + point_double(pre->g_pre_comp[2*i][0], + pre->g_pre_comp[2*i][1], + pre->g_pre_comp[2*i][2], + pre->g_pre_comp[2*i][0], + pre->g_pre_comp[2*i][1], + pre->g_pre_comp[2*i][2]); + } + } + /* g_pre_comp[0] is the point at infinity */ + memset(pre->g_pre_comp[0], 0, sizeof(pre->g_pre_comp[0])); + /* the remaining multiples */ + /* 2^130*G + 2^260*G */ + point_add(pre->g_pre_comp[6][0], pre->g_pre_comp[6][1], + pre->g_pre_comp[6][2], pre->g_pre_comp[4][0], + pre->g_pre_comp[4][1], pre->g_pre_comp[4][2], + 0, pre->g_pre_comp[2][0], pre->g_pre_comp[2][1], + pre->g_pre_comp[2][2]); + /* 2^130*G + 2^390*G */ + point_add(pre->g_pre_comp[10][0], pre->g_pre_comp[10][1], + pre->g_pre_comp[10][2], pre->g_pre_comp[8][0], + pre->g_pre_comp[8][1], pre->g_pre_comp[8][2], + 0, pre->g_pre_comp[2][0], pre->g_pre_comp[2][1], + pre->g_pre_comp[2][2]); + /* 2^260*G + 2^390*G */ + point_add(pre->g_pre_comp[12][0], pre->g_pre_comp[12][1], + pre->g_pre_comp[12][2], pre->g_pre_comp[8][0], + pre->g_pre_comp[8][1], pre->g_pre_comp[8][2], + 0, pre->g_pre_comp[4][0], pre->g_pre_comp[4][1], + pre->g_pre_comp[4][2]); + /* 2^130*G + 2^260*G + 2^390*G */ + point_add(pre->g_pre_comp[14][0], pre->g_pre_comp[14][1], + pre->g_pre_comp[14][2], pre->g_pre_comp[12][0], + pre->g_pre_comp[12][1], pre->g_pre_comp[12][2], + 0, pre->g_pre_comp[2][0], pre->g_pre_comp[2][1], + pre->g_pre_comp[2][2]); + for (i = 1; i < 8; ++i) + { + /* odd multiples: add G */ + point_add(pre->g_pre_comp[2*i+1][0], pre->g_pre_comp[2*i+1][1], + pre->g_pre_comp[2*i+1][2], pre->g_pre_comp[2*i][0], + pre->g_pre_comp[2*i][1], pre->g_pre_comp[2*i][2], + 0, pre->g_pre_comp[1][0], pre->g_pre_comp[1][1], + pre->g_pre_comp[1][2]); + } + make_points_affine(15, &(pre->g_pre_comp[1]), tmp_felems); + + if (!EC_EX_DATA_set_data(&group->extra_data, pre, nistp521_pre_comp_dup, + nistp521_pre_comp_free, nistp521_pre_comp_clear_free)) + goto err; + ret = 1; + pre = NULL; + err: + BN_CTX_end(ctx); + if (generator != NULL) + EC_POINT_free(generator); + if (new_ctx != NULL) + BN_CTX_free(new_ctx); + if (pre) + nistp521_pre_comp_free(pre); + return ret; + } + +int ec_GFp_nistp521_have_precompute_mult(const EC_GROUP *group) + { + if (EC_EX_DATA_get_data(group->extra_data, nistp521_pre_comp_dup, + nistp521_pre_comp_free, nistp521_pre_comp_clear_free) + != NULL) + return 1; + else + return 0; + } + +#else +static void *dummy=&dummy; +#endif diff --git a/crypto/openssl/crypto/ec/ecp_nistputil.c b/crypto/openssl/crypto/ec/ecp_nistputil.c new file mode 100644 index 0000000000..c8140c807f --- /dev/null +++ b/crypto/openssl/crypto/ec/ecp_nistputil.c @@ -0,0 +1,197 @@ +/* crypto/ec/ecp_nistputil.c */ +/* + * Written by Bodo Moeller for the OpenSSL project. + */ +/* Copyright 2011 Google Inc. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#include +#ifndef OPENSSL_NO_EC_NISTP_64_GCC_128 + +/* + * Common utility functions for ecp_nistp224.c, ecp_nistp256.c, ecp_nistp521.c. + */ + +#include +#include "ec_lcl.h" + +/* Convert an array of points into affine coordinates. + * (If the point at infinity is found (Z = 0), it remains unchanged.) + * This function is essentially an equivalent to EC_POINTs_make_affine(), but + * works with the internal representation of points as used by ecp_nistp###.c + * rather than with (BIGNUM-based) EC_POINT data structures. + * + * point_array is the input/output buffer ('num' points in projective form, + * i.e. three coordinates each), based on an internal representation of + * field elements of size 'felem_size'. + * + * tmp_felems needs to point to a temporary array of 'num'+1 field elements + * for storage of intermediate values. + */ +void ec_GFp_nistp_points_make_affine_internal(size_t num, void *point_array, + size_t felem_size, void *tmp_felems, + void (*felem_one)(void *out), + int (*felem_is_zero)(const void *in), + void (*felem_assign)(void *out, const void *in), + void (*felem_square)(void *out, const void *in), + void (*felem_mul)(void *out, const void *in1, const void *in2), + void (*felem_inv)(void *out, const void *in), + void (*felem_contract)(void *out, const void *in)) + { + int i = 0; + +#define tmp_felem(I) (&((char *)tmp_felems)[(I) * felem_size]) +#define X(I) (&((char *)point_array)[3*(I) * felem_size]) +#define Y(I) (&((char *)point_array)[(3*(I) + 1) * felem_size]) +#define Z(I) (&((char *)point_array)[(3*(I) + 2) * felem_size]) + + if (!felem_is_zero(Z(0))) + felem_assign(tmp_felem(0), Z(0)); + else + felem_one(tmp_felem(0)); + for (i = 1; i < (int)num; i++) + { + if (!felem_is_zero(Z(i))) + felem_mul(tmp_felem(i), tmp_felem(i-1), Z(i)); + else + felem_assign(tmp_felem(i), tmp_felem(i-1)); + } + /* Now each tmp_felem(i) is the product of Z(0) .. Z(i), skipping any zero-valued factors: + * if Z(i) = 0, we essentially pretend that Z(i) = 1 */ + + felem_inv(tmp_felem(num-1), tmp_felem(num-1)); + for (i = num - 1; i >= 0; i--) + { + if (i > 0) + /* tmp_felem(i-1) is the product of Z(0) .. Z(i-1), + * tmp_felem(i) is the inverse of the product of Z(0) .. Z(i) + */ + felem_mul(tmp_felem(num), tmp_felem(i-1), tmp_felem(i)); /* 1/Z(i) */ + else + felem_assign(tmp_felem(num), tmp_felem(0)); /* 1/Z(0) */ + + if (!felem_is_zero(Z(i))) + { + if (i > 0) + /* For next iteration, replace tmp_felem(i-1) by its inverse */ + felem_mul(tmp_felem(i-1), tmp_felem(i), Z(i)); + + /* Convert point (X, Y, Z) into affine form (X/(Z^2), Y/(Z^3), 1) */ + felem_square(Z(i), tmp_felem(num)); /* 1/(Z^2) */ + felem_mul(X(i), X(i), Z(i)); /* X/(Z^2) */ + felem_mul(Z(i), Z(i), tmp_felem(num)); /* 1/(Z^3) */ + felem_mul(Y(i), Y(i), Z(i)); /* Y/(Z^3) */ + felem_contract(X(i), X(i)); + felem_contract(Y(i), Y(i)); + felem_one(Z(i)); + } + else + { + if (i > 0) + /* For next iteration, replace tmp_felem(i-1) by its inverse */ + felem_assign(tmp_felem(i-1), tmp_felem(i)); + } + } + } + +/* + * This function looks at 5+1 scalar bits (5 current, 1 adjacent less + * significant bit), and recodes them into a signed digit for use in fast point + * multiplication: the use of signed rather than unsigned digits means that + * fewer points need to be precomputed, given that point inversion is easy + * (a precomputed point dP makes -dP available as well). + * + * BACKGROUND: + * + * Signed digits for multiplication were introduced by Booth ("A signed binary + * multiplication technique", Quart. Journ. Mech. and Applied Math., vol. IV, + * pt. 2 (1951), pp. 236-240), in that case for multiplication of integers. + * Booth's original encoding did not generally improve the density of nonzero + * digits over the binary representation, and was merely meant to simplify the + * handling of signed factors given in two's complement; but it has since been + * shown to be the basis of various signed-digit representations that do have + * further advantages, including the wNAF, using the following general approach: + * + * (1) Given a binary representation + * + * b_k ... b_2 b_1 b_0, + * + * of a nonnegative integer (b_k in {0, 1}), rewrite it in digits 0, 1, -1 + * by using bit-wise subtraction as follows: + * + * b_k b_(k-1) ... b_2 b_1 b_0 + * - b_k ... b_3 b_2 b_1 b_0 + * ------------------------------------- + * s_k b_(k-1) ... s_3 s_2 s_1 s_0 + * + * A left-shift followed by subtraction of the original value yields a new + * representation of the same value, using signed bits s_i = b_(i+1) - b_i. + * This representation from Booth's paper has since appeared in the + * literature under a variety of different names including "reversed binary + * form", "alternating greedy expansion", "mutual opposite form", and + * "sign-alternating {+-1}-representation". + * + * An interesting property is that among the nonzero bits, values 1 and -1 + * strictly alternate. + * + * (2) Various window schemes can be applied to the Booth representation of + * integers: for example, right-to-left sliding windows yield the wNAF + * (a signed-digit encoding independently discovered by various researchers + * in the 1990s), and left-to-right sliding windows yield a left-to-right + * equivalent of the wNAF (independently discovered by various researchers + * around 2004). + * + * To prevent leaking information through side channels in point multiplication, + * we need to recode the given integer into a regular pattern: sliding windows + * as in wNAFs won't do, we need their fixed-window equivalent -- which is a few + * decades older: we'll be using the so-called "modified Booth encoding" due to + * MacSorley ("High-speed arithmetic in binary computers", Proc. IRE, vol. 49 + * (1961), pp. 67-91), in a radix-2^5 setting. That is, we always combine five + * signed bits into a signed digit: + * + * s_(4j + 4) s_(4j + 3) s_(4j + 2) s_(4j + 1) s_(4j) + * + * The sign-alternating property implies that the resulting digit values are + * integers from -16 to 16. + * + * Of course, we don't actually need to compute the signed digits s_i as an + * intermediate step (that's just a nice way to see how this scheme relates + * to the wNAF): a direct computation obtains the recoded digit from the + * six bits b_(4j + 4) ... b_(4j - 1). + * + * This function takes those five bits as an integer (0 .. 63), writing the + * recoded digit to *sign (0 for positive, 1 for negative) and *digit (absolute + * value, in the range 0 .. 8). Note that this integer essentially provides the + * input bits "shifted to the left" by one position: for example, the input to + * compute the least significant recoded digit, given that there's no bit b_-1, + * has to be b_4 b_3 b_2 b_1 b_0 0. + * + */ +void ec_GFp_nistp_recode_scalar_bits(unsigned char *sign, unsigned char *digit, unsigned char in) + { + unsigned char s, d; + + s = ~((in >> 5) - 1); /* sets all bits to MSB(in), 'in' seen as 6-bit value */ + d = (1 << 6) - in - 1; + d = (d & s) | (in & ~s); + d = (d >> 1) + (d & 1); + + *sign = s & 1; + *digit = d; + } +#else +static void *dummy=&dummy; +#endif diff --git a/crypto/openssl/crypto/ec/ecp_oct.c b/crypto/openssl/crypto/ec/ecp_oct.c new file mode 100644 index 0000000000..374a0ee731 --- /dev/null +++ b/crypto/openssl/crypto/ec/ecp_oct.c @@ -0,0 +1,433 @@ +/* crypto/ec/ecp_oct.c */ +/* Includes code written by Lenka Fibikova + * for the OpenSSL project. + * Includes code written by Bodo Moeller for the OpenSSL project. +*/ +/* ==================================================================== + * Copyright (c) 1998-2002 The OpenSSL Project. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * + * 3. All advertising materials mentioning features or use of this + * software must display the following acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit. (http://www.openssl.org/)" + * + * 4. The names "OpenSSL Toolkit" and "OpenSSL Project" must not be used to + * endorse or promote products derived from this software without + * prior written permission. For written permission, please contact + * openssl-core@openssl.org. + * + * 5. Products derived from this software may not be called "OpenSSL" + * nor may "OpenSSL" appear in their names without prior written + * permission of the OpenSSL Project. + * + * 6. Redistributions of any form whatsoever must retain the following + * acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit (http://www.openssl.org/)" + * + * THIS SOFTWARE IS PROVIDED BY THE OpenSSL PROJECT ``AS IS'' AND ANY + * EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR + * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE OpenSSL PROJECT OR + * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED + * OF THE POSSIBILITY OF SUCH DAMAGE. + * ==================================================================== + * + * This product includes cryptographic software written by Eric Young + * (eay@cryptsoft.com). This product includes software written by Tim + * Hudson (tjh@cryptsoft.com). + * + */ +/* ==================================================================== + * Copyright 2002 Sun Microsystems, Inc. ALL RIGHTS RESERVED. + * Portions of this software developed by SUN MICROSYSTEMS, INC., + * and contributed to the OpenSSL project. + */ + +#include +#include + +#include "ec_lcl.h" + +int ec_GFp_simple_set_compressed_coordinates(const EC_GROUP *group, EC_POINT *point, + const BIGNUM *x_, int y_bit, BN_CTX *ctx) + { + BN_CTX *new_ctx = NULL; + BIGNUM *tmp1, *tmp2, *x, *y; + int ret = 0; + + /* clear error queue*/ + ERR_clear_error(); + + if (ctx == NULL) + { + ctx = new_ctx = BN_CTX_new(); + if (ctx == NULL) + return 0; + } + + y_bit = (y_bit != 0); + + BN_CTX_start(ctx); + tmp1 = BN_CTX_get(ctx); + tmp2 = BN_CTX_get(ctx); + x = BN_CTX_get(ctx); + y = BN_CTX_get(ctx); + if (y == NULL) goto err; + + /* Recover y. We have a Weierstrass equation + * y^2 = x^3 + a*x + b, + * so y is one of the square roots of x^3 + a*x + b. + */ + + /* tmp1 := x^3 */ + if (!BN_nnmod(x, x_, &group->field,ctx)) goto err; + if (group->meth->field_decode == 0) + { + /* field_{sqr,mul} work on standard representation */ + if (!group->meth->field_sqr(group, tmp2, x_, ctx)) goto err; + if (!group->meth->field_mul(group, tmp1, tmp2, x_, ctx)) goto err; + } + else + { + if (!BN_mod_sqr(tmp2, x_, &group->field, ctx)) goto err; + if (!BN_mod_mul(tmp1, tmp2, x_, &group->field, ctx)) goto err; + } + + /* tmp1 := tmp1 + a*x */ + if (group->a_is_minus3) + { + if (!BN_mod_lshift1_quick(tmp2, x, &group->field)) goto err; + if (!BN_mod_add_quick(tmp2, tmp2, x, &group->field)) goto err; + if (!BN_mod_sub_quick(tmp1, tmp1, tmp2, &group->field)) goto err; + } + else + { + if (group->meth->field_decode) + { + if (!group->meth->field_decode(group, tmp2, &group->a, ctx)) goto err; + if (!BN_mod_mul(tmp2, tmp2, x, &group->field, ctx)) goto err; + } + else + { + /* field_mul works on standard representation */ + if (!group->meth->field_mul(group, tmp2, &group->a, x, ctx)) goto err; + } + + if (!BN_mod_add_quick(tmp1, tmp1, tmp2, &group->field)) goto err; + } + + /* tmp1 := tmp1 + b */ + if (group->meth->field_decode) + { + if (!group->meth->field_decode(group, tmp2, &group->b, ctx)) goto err; + if (!BN_mod_add_quick(tmp1, tmp1, tmp2, &group->field)) goto err; + } + else + { + if (!BN_mod_add_quick(tmp1, tmp1, &group->b, &group->field)) goto err; + } + + if (!BN_mod_sqrt(y, tmp1, &group->field, ctx)) + { + unsigned long err = ERR_peek_last_error(); + + if (ERR_GET_LIB(err) == ERR_LIB_BN && ERR_GET_REASON(err) == BN_R_NOT_A_SQUARE) + { + ERR_clear_error(); + ECerr(EC_F_EC_GFP_SIMPLE_SET_COMPRESSED_COORDINATES, EC_R_INVALID_COMPRESSED_POINT); + } + else + ECerr(EC_F_EC_GFP_SIMPLE_SET_COMPRESSED_COORDINATES, ERR_R_BN_LIB); + goto err; + } + + if (y_bit != BN_is_odd(y)) + { + if (BN_is_zero(y)) + { + int kron; + + kron = BN_kronecker(x, &group->field, ctx); + if (kron == -2) goto err; + + if (kron == 1) + ECerr(EC_F_EC_GFP_SIMPLE_SET_COMPRESSED_COORDINATES, EC_R_INVALID_COMPRESSION_BIT); + else + /* BN_mod_sqrt() should have cought this error (not a square) */ + ECerr(EC_F_EC_GFP_SIMPLE_SET_COMPRESSED_COORDINATES, EC_R_INVALID_COMPRESSED_POINT); + goto err; + } + if (!BN_usub(y, &group->field, y)) goto err; + } + if (y_bit != BN_is_odd(y)) + { + ECerr(EC_F_EC_GFP_SIMPLE_SET_COMPRESSED_COORDINATES, ERR_R_INTERNAL_ERROR); + goto err; + } + + if (!EC_POINT_set_affine_coordinates_GFp(group, point, x, y, ctx)) goto err; + + ret = 1; + + err: + BN_CTX_end(ctx); + if (new_ctx != NULL) + BN_CTX_free(new_ctx); + return ret; + } + + +size_t ec_GFp_simple_point2oct(const EC_GROUP *group, const EC_POINT *point, point_conversion_form_t form, + unsigned char *buf, size_t len, BN_CTX *ctx) + { + size_t ret; + BN_CTX *new_ctx = NULL; + int used_ctx = 0; + BIGNUM *x, *y; + size_t field_len, i, skip; + + if ((form != POINT_CONVERSION_COMPRESSED) + && (form != POINT_CONVERSION_UNCOMPRESSED) + && (form != POINT_CONVERSION_HYBRID)) + { + ECerr(EC_F_EC_GFP_SIMPLE_POINT2OCT, EC_R_INVALID_FORM); + goto err; + } + + if (EC_POINT_is_at_infinity(group, point)) + { + /* encodes to a single 0 octet */ + if (buf != NULL) + { + if (len < 1) + { + ECerr(EC_F_EC_GFP_SIMPLE_POINT2OCT, EC_R_BUFFER_TOO_SMALL); + return 0; + } + buf[0] = 0; + } + return 1; + } + + + /* ret := required output buffer length */ + field_len = BN_num_bytes(&group->field); + ret = (form == POINT_CONVERSION_COMPRESSED) ? 1 + field_len : 1 + 2*field_len; + + /* if 'buf' is NULL, just return required length */ + if (buf != NULL) + { + if (len < ret) + { + ECerr(EC_F_EC_GFP_SIMPLE_POINT2OCT, EC_R_BUFFER_TOO_SMALL); + goto err; + } + + if (ctx == NULL) + { + ctx = new_ctx = BN_CTX_new(); + if (ctx == NULL) + return 0; + } + + BN_CTX_start(ctx); + used_ctx = 1; + x = BN_CTX_get(ctx); + y = BN_CTX_get(ctx); + if (y == NULL) goto err; + + if (!EC_POINT_get_affine_coordinates_GFp(group, point, x, y, ctx)) goto err; + + if ((form == POINT_CONVERSION_COMPRESSED || form == POINT_CONVERSION_HYBRID) && BN_is_odd(y)) + buf[0] = form + 1; + else + buf[0] = form; + + i = 1; + + skip = field_len - BN_num_bytes(x); + if (skip > field_len) + { + ECerr(EC_F_EC_GFP_SIMPLE_POINT2OCT, ERR_R_INTERNAL_ERROR); + goto err; + } + while (skip > 0) + { + buf[i++] = 0; + skip--; + } + skip = BN_bn2bin(x, buf + i); + i += skip; + if (i != 1 + field_len) + { + ECerr(EC_F_EC_GFP_SIMPLE_POINT2OCT, ERR_R_INTERNAL_ERROR); + goto err; + } + + if (form == POINT_CONVERSION_UNCOMPRESSED || form == POINT_CONVERSION_HYBRID) + { + skip = field_len - BN_num_bytes(y); + if (skip > field_len) + { + ECerr(EC_F_EC_GFP_SIMPLE_POINT2OCT, ERR_R_INTERNAL_ERROR); + goto err; + } + while (skip > 0) + { + buf[i++] = 0; + skip--; + } + skip = BN_bn2bin(y, buf + i); + i += skip; + } + + if (i != ret) + { + ECerr(EC_F_EC_GFP_SIMPLE_POINT2OCT, ERR_R_INTERNAL_ERROR); + goto err; + } + } + + if (used_ctx) + BN_CTX_end(ctx); + if (new_ctx != NULL) + BN_CTX_free(new_ctx); + return ret; + + err: + if (used_ctx) + BN_CTX_end(ctx); + if (new_ctx != NULL) + BN_CTX_free(new_ctx); + return 0; + } + + +int ec_GFp_simple_oct2point(const EC_GROUP *group, EC_POINT *point, + const unsigned char *buf, size_t len, BN_CTX *ctx) + { + point_conversion_form_t form; + int y_bit; + BN_CTX *new_ctx = NULL; + BIGNUM *x, *y; + size_t field_len, enc_len; + int ret = 0; + + if (len == 0) + { + ECerr(EC_F_EC_GFP_SIMPLE_OCT2POINT, EC_R_BUFFER_TOO_SMALL); + return 0; + } + form = buf[0]; + y_bit = form & 1; + form = form & ~1U; + if ((form != 0) && (form != POINT_CONVERSION_COMPRESSED) + && (form != POINT_CONVERSION_UNCOMPRESSED) + && (form != POINT_CONVERSION_HYBRID)) + { + ECerr(EC_F_EC_GFP_SIMPLE_OCT2POINT, EC_R_INVALID_ENCODING); + return 0; + } + if ((form == 0 || form == POINT_CONVERSION_UNCOMPRESSED) && y_bit) + { + ECerr(EC_F_EC_GFP_SIMPLE_OCT2POINT, EC_R_INVALID_ENCODING); + return 0; + } + + if (form == 0) + { + if (len != 1) + { + ECerr(EC_F_EC_GFP_SIMPLE_OCT2POINT, EC_R_INVALID_ENCODING); + return 0; + } + + return EC_POINT_set_to_infinity(group, point); + } + + field_len = BN_num_bytes(&group->field); + enc_len = (form == POINT_CONVERSION_COMPRESSED) ? 1 + field_len : 1 + 2*field_len; + + if (len != enc_len) + { + ECerr(EC_F_EC_GFP_SIMPLE_OCT2POINT, EC_R_INVALID_ENCODING); + return 0; + } + + if (ctx == NULL) + { + ctx = new_ctx = BN_CTX_new(); + if (ctx == NULL) + return 0; + } + + BN_CTX_start(ctx); + x = BN_CTX_get(ctx); + y = BN_CTX_get(ctx); + if (y == NULL) goto err; + + if (!BN_bin2bn(buf + 1, field_len, x)) goto err; + if (BN_ucmp(x, &group->field) >= 0) + { + ECerr(EC_F_EC_GFP_SIMPLE_OCT2POINT, EC_R_INVALID_ENCODING); + goto err; + } + + if (form == POINT_CONVERSION_COMPRESSED) + { + if (!EC_POINT_set_compressed_coordinates_GFp(group, point, x, y_bit, ctx)) goto err; + } + else + { + if (!BN_bin2bn(buf + 1 + field_len, field_len, y)) goto err; + if (BN_ucmp(y, &group->field) >= 0) + { + ECerr(EC_F_EC_GFP_SIMPLE_OCT2POINT, EC_R_INVALID_ENCODING); + goto err; + } + if (form == POINT_CONVERSION_HYBRID) + { + if (y_bit != BN_is_odd(y)) + { + ECerr(EC_F_EC_GFP_SIMPLE_OCT2POINT, EC_R_INVALID_ENCODING); + goto err; + } + } + + if (!EC_POINT_set_affine_coordinates_GFp(group, point, x, y, ctx)) goto err; + } + + if (!EC_POINT_is_on_curve(group, point, ctx)) /* test required by X9.62 */ + { + ECerr(EC_F_EC_GFP_SIMPLE_OCT2POINT, EC_R_POINT_IS_NOT_ON_CURVE); + goto err; + } + + ret = 1; + + err: + BN_CTX_end(ctx); + if (new_ctx != NULL) + BN_CTX_free(new_ctx); + return ret; + } + diff --git a/crypto/openssl/crypto/ec/ecp_smpl.c b/crypto/openssl/crypto/ec/ecp_smpl.c index 66a92e2a90..7cbb321f9a 100644 --- a/crypto/openssl/crypto/ec/ecp_smpl.c +++ b/crypto/openssl/crypto/ec/ecp_smpl.c @@ -65,11 +65,19 @@ #include #include +#ifdef OPENSSL_FIPS +#include +#endif + #include "ec_lcl.h" const EC_METHOD *EC_GFp_simple_method(void) { +#ifdef OPENSSL_FIPS + return fips_ec_gfp_simple_method(); +#else static const EC_METHOD ret = { + EC_FLAGS_DEFAULT_OCT, NID_X9_62_prime_field, ec_GFp_simple_group_init, ec_GFp_simple_group_finish, @@ -88,9 +96,7 @@ const EC_METHOD *EC_GFp_simple_method(void) ec_GFp_simple_get_Jprojective_coordinates_GFp, ec_GFp_simple_point_set_affine_coordinates, ec_GFp_simple_point_get_affine_coordinates, - ec_GFp_simple_set_compressed_coordinates, - ec_GFp_simple_point2oct, - ec_GFp_simple_oct2point, + 0,0,0, ec_GFp_simple_add, ec_GFp_simple_dbl, ec_GFp_simple_invert, @@ -110,6 +116,7 @@ const EC_METHOD *EC_GFp_simple_method(void) 0 /* field_set_to_one */ }; return &ret; +#endif } @@ -633,372 +640,6 @@ int ec_GFp_simple_point_get_affine_coordinates(const EC_GROUP *group, const EC_P return ret; } - -int ec_GFp_simple_set_compressed_coordinates(const EC_GROUP *group, EC_POINT *point, - const BIGNUM *x_, int y_bit, BN_CTX *ctx) - { - BN_CTX *new_ctx = NULL; - BIGNUM *tmp1, *tmp2, *x, *y; - int ret = 0; - - /* clear error queue*/ - ERR_clear_error(); - - if (ctx == NULL) - { - ctx = new_ctx = BN_CTX_new(); - if (ctx == NULL) - return 0; - } - - y_bit = (y_bit != 0); - - BN_CTX_start(ctx); - tmp1 = BN_CTX_get(ctx); - tmp2 = BN_CTX_get(ctx); - x = BN_CTX_get(ctx); - y = BN_CTX_get(ctx); - if (y == NULL) goto err; - - /* Recover y. We have a Weierstrass equation - * y^2 = x^3 + a*x + b, - * so y is one of the square roots of x^3 + a*x + b. - */ - - /* tmp1 := x^3 */ - if (!BN_nnmod(x, x_, &group->field,ctx)) goto err; - if (group->meth->field_decode == 0) - { - /* field_{sqr,mul} work on standard representation */ - if (!group->meth->field_sqr(group, tmp2, x_, ctx)) goto err; - if (!group->meth->field_mul(group, tmp1, tmp2, x_, ctx)) goto err; - } - else - { - if (!BN_mod_sqr(tmp2, x_, &group->field, ctx)) goto err; - if (!BN_mod_mul(tmp1, tmp2, x_, &group->field, ctx)) goto err; - } - - /* tmp1 := tmp1 + a*x */ - if (group->a_is_minus3) - { - if (!BN_mod_lshift1_quick(tmp2, x, &group->field)) goto err; - if (!BN_mod_add_quick(tmp2, tmp2, x, &group->field)) goto err; - if (!BN_mod_sub_quick(tmp1, tmp1, tmp2, &group->field)) goto err; - } - else - { - if (group->meth->field_decode) - { - if (!group->meth->field_decode(group, tmp2, &group->a, ctx)) goto err; - if (!BN_mod_mul(tmp2, tmp2, x, &group->field, ctx)) goto err; - } - else - { - /* field_mul works on standard representation */ - if (!group->meth->field_mul(group, tmp2, &group->a, x, ctx)) goto err; - } - - if (!BN_mod_add_quick(tmp1, tmp1, tmp2, &group->field)) goto err; - } - - /* tmp1 := tmp1 + b */ - if (group->meth->field_decode) - { - if (!group->meth->field_decode(group, tmp2, &group->b, ctx)) goto err; - if (!BN_mod_add_quick(tmp1, tmp1, tmp2, &group->field)) goto err; - } - else - { - if (!BN_mod_add_quick(tmp1, tmp1, &group->b, &group->field)) goto err; - } - - if (!BN_mod_sqrt(y, tmp1, &group->field, ctx)) - { - unsigned long err = ERR_peek_last_error(); - - if (ERR_GET_LIB(err) == ERR_LIB_BN && ERR_GET_REASON(err) == BN_R_NOT_A_SQUARE) - { - ERR_clear_error(); - ECerr(EC_F_EC_GFP_SIMPLE_SET_COMPRESSED_COORDINATES, EC_R_INVALID_COMPRESSED_POINT); - } - else - ECerr(EC_F_EC_GFP_SIMPLE_SET_COMPRESSED_COORDINATES, ERR_R_BN_LIB); - goto err; - } - - if (y_bit != BN_is_odd(y)) - { - if (BN_is_zero(y)) - { - int kron; - - kron = BN_kronecker(x, &group->field, ctx); - if (kron == -2) goto err; - - if (kron == 1) - ECerr(EC_F_EC_GFP_SIMPLE_SET_COMPRESSED_COORDINATES, EC_R_INVALID_COMPRESSION_BIT); - else - /* BN_mod_sqrt() should have cought this error (not a square) */ - ECerr(EC_F_EC_GFP_SIMPLE_SET_COMPRESSED_COORDINATES, EC_R_INVALID_COMPRESSED_POINT); - goto err; - } - if (!BN_usub(y, &group->field, y)) goto err; - } - if (y_bit != BN_is_odd(y)) - { - ECerr(EC_F_EC_GFP_SIMPLE_SET_COMPRESSED_COORDINATES, ERR_R_INTERNAL_ERROR); - goto err; - } - - if (!EC_POINT_set_affine_coordinates_GFp(group, point, x, y, ctx)) goto err; - - ret = 1; - - err: - BN_CTX_end(ctx); - if (new_ctx != NULL) - BN_CTX_free(new_ctx); - return ret; - } - - -size_t ec_GFp_simple_point2oct(const EC_GROUP *group, const EC_POINT *point, point_conversion_form_t form, - unsigned char *buf, size_t len, BN_CTX *ctx) - { - size_t ret; - BN_CTX *new_ctx = NULL; - int used_ctx = 0; - BIGNUM *x, *y; - size_t field_len, i, skip; - - if ((form != POINT_CONVERSION_COMPRESSED) - && (form != POINT_CONVERSION_UNCOMPRESSED) - && (form != POINT_CONVERSION_HYBRID)) - { - ECerr(EC_F_EC_GFP_SIMPLE_POINT2OCT, EC_R_INVALID_FORM); - goto err; - } - - if (EC_POINT_is_at_infinity(group, point)) - { - /* encodes to a single 0 octet */ - if (buf != NULL) - { - if (len < 1) - { - ECerr(EC_F_EC_GFP_SIMPLE_POINT2OCT, EC_R_BUFFER_TOO_SMALL); - return 0; - } - buf[0] = 0; - } - return 1; - } - - - /* ret := required output buffer length */ - field_len = BN_num_bytes(&group->field); - ret = (form == POINT_CONVERSION_COMPRESSED) ? 1 + field_len : 1 + 2*field_len; - - /* if 'buf' is NULL, just return required length */ - if (buf != NULL) - { - if (len < ret) - { - ECerr(EC_F_EC_GFP_SIMPLE_POINT2OCT, EC_R_BUFFER_TOO_SMALL); - goto err; - } - - if (ctx == NULL) - { - ctx = new_ctx = BN_CTX_new(); - if (ctx == NULL) - return 0; - } - - BN_CTX_start(ctx); - used_ctx = 1; - x = BN_CTX_get(ctx); - y = BN_CTX_get(ctx); - if (y == NULL) goto err; - - if (!EC_POINT_get_affine_coordinates_GFp(group, point, x, y, ctx)) goto err; - - if ((form == POINT_CONVERSION_COMPRESSED || form == POINT_CONVERSION_HYBRID) && BN_is_odd(y)) - buf[0] = form + 1; - else - buf[0] = form; - - i = 1; - - skip = field_len - BN_num_bytes(x); - if (skip > field_len) - { - ECerr(EC_F_EC_GFP_SIMPLE_POINT2OCT, ERR_R_INTERNAL_ERROR); - goto err; - } - while (skip > 0) - { - buf[i++] = 0; - skip--; - } - skip = BN_bn2bin(x, buf + i); - i += skip; - if (i != 1 + field_len) - { - ECerr(EC_F_EC_GFP_SIMPLE_POINT2OCT, ERR_R_INTERNAL_ERROR); - goto err; - } - - if (form == POINT_CONVERSION_UNCOMPRESSED || form == POINT_CONVERSION_HYBRID) - { - skip = field_len - BN_num_bytes(y); - if (skip > field_len) - { - ECerr(EC_F_EC_GFP_SIMPLE_POINT2OCT, ERR_R_INTERNAL_ERROR); - goto err; - } - while (skip > 0) - { - buf[i++] = 0; - skip--; - } - skip = BN_bn2bin(y, buf + i); - i += skip; - } - - if (i != ret) - { - ECerr(EC_F_EC_GFP_SIMPLE_POINT2OCT, ERR_R_INTERNAL_ERROR); - goto err; - } - } - - if (used_ctx) - BN_CTX_end(ctx); - if (new_ctx != NULL) - BN_CTX_free(new_ctx); - return ret; - - err: - if (used_ctx) - BN_CTX_end(ctx); - if (new_ctx != NULL) - BN_CTX_free(new_ctx); - return 0; - } - - -int ec_GFp_simple_oct2point(const EC_GROUP *group, EC_POINT *point, - const unsigned char *buf, size_t len, BN_CTX *ctx) - { - point_conversion_form_t form; - int y_bit; - BN_CTX *new_ctx = NULL; - BIGNUM *x, *y; - size_t field_len, enc_len; - int ret = 0; - - if (len == 0) - { - ECerr(EC_F_EC_GFP_SIMPLE_OCT2POINT, EC_R_BUFFER_TOO_SMALL); - return 0; - } - form = buf[0]; - y_bit = form & 1; - form = form & ~1U; - if ((form != 0) && (form != POINT_CONVERSION_COMPRESSED) - && (form != POINT_CONVERSION_UNCOMPRESSED) - && (form != POINT_CONVERSION_HYBRID)) - { - ECerr(EC_F_EC_GFP_SIMPLE_OCT2POINT, EC_R_INVALID_ENCODING); - return 0; - } - if ((form == 0 || form == POINT_CONVERSION_UNCOMPRESSED) && y_bit) - { - ECerr(EC_F_EC_GFP_SIMPLE_OCT2POINT, EC_R_INVALID_ENCODING); - return 0; - } - - if (form == 0) - { - if (len != 1) - { - ECerr(EC_F_EC_GFP_SIMPLE_OCT2POINT, EC_R_INVALID_ENCODING); - return 0; - } - - return EC_POINT_set_to_infinity(group, point); - } - - field_len = BN_num_bytes(&group->field); - enc_len = (form == POINT_CONVERSION_COMPRESSED) ? 1 + field_len : 1 + 2*field_len; - - if (len != enc_len) - { - ECerr(EC_F_EC_GFP_SIMPLE_OCT2POINT, EC_R_INVALID_ENCODING); - return 0; - } - - if (ctx == NULL) - { - ctx = new_ctx = BN_CTX_new(); - if (ctx == NULL) - return 0; - } - - BN_CTX_start(ctx); - x = BN_CTX_get(ctx); - y = BN_CTX_get(ctx); - if (y == NULL) goto err; - - if (!BN_bin2bn(buf + 1, field_len, x)) goto err; - if (BN_ucmp(x, &group->field) >= 0) - { - ECerr(EC_F_EC_GFP_SIMPLE_OCT2POINT, EC_R_INVALID_ENCODING); - goto err; - } - - if (form == POINT_CONVERSION_COMPRESSED) - { - if (!EC_POINT_set_compressed_coordinates_GFp(group, point, x, y_bit, ctx)) goto err; - } - else - { - if (!BN_bin2bn(buf + 1 + field_len, field_len, y)) goto err; - if (BN_ucmp(y, &group->field) >= 0) - { - ECerr(EC_F_EC_GFP_SIMPLE_OCT2POINT, EC_R_INVALID_ENCODING); - goto err; - } - if (form == POINT_CONVERSION_HYBRID) - { - if (y_bit != BN_is_odd(y)) - { - ECerr(EC_F_EC_GFP_SIMPLE_OCT2POINT, EC_R_INVALID_ENCODING); - goto err; - } - } - - if (!EC_POINT_set_affine_coordinates_GFp(group, point, x, y, ctx)) goto err; - } - - if (!EC_POINT_is_on_curve(group, point, ctx)) /* test required by X9.62 */ - { - ECerr(EC_F_EC_GFP_SIMPLE_OCT2POINT, EC_R_POINT_IS_NOT_ON_CURVE); - goto err; - } - - ret = 1; - - err: - BN_CTX_end(ctx); - if (new_ctx != NULL) - BN_CTX_free(new_ctx); - return ret; - } - - int ec_GFp_simple_add(const EC_GROUP *group, EC_POINT *r, const EC_POINT *a, const EC_POINT *b, BN_CTX *ctx) { int (*field_mul)(const EC_GROUP *, BIGNUM *, const BIGNUM *, const BIGNUM *, BN_CTX *); diff --git a/crypto/openssl/crypto/ecdh/ecdh.h b/crypto/openssl/crypto/ecdh/ecdh.h index b4b58ee65b..8887102c0b 100644 --- a/crypto/openssl/crypto/ecdh/ecdh.h +++ b/crypto/openssl/crypto/ecdh/ecdh.h @@ -109,11 +109,13 @@ void ERR_load_ECDH_strings(void); /* Error codes for the ECDH functions. */ /* Function codes. */ +#define ECDH_F_ECDH_CHECK 102 #define ECDH_F_ECDH_COMPUTE_KEY 100 #define ECDH_F_ECDH_DATA_NEW_METHOD 101 /* Reason codes. */ #define ECDH_R_KDF_FAILED 102 +#define ECDH_R_NON_FIPS_METHOD 103 #define ECDH_R_NO_PRIVATE_VALUE 100 #define ECDH_R_POINT_ARITHMETIC_FAILURE 101 diff --git a/crypto/openssl/crypto/ecdh/ech_err.c b/crypto/openssl/crypto/ecdh/ech_err.c index 6f4b0c9953..3bd247398d 100644 --- a/crypto/openssl/crypto/ecdh/ech_err.c +++ b/crypto/openssl/crypto/ecdh/ech_err.c @@ -1,6 +1,6 @@ /* crypto/ecdh/ech_err.c */ /* ==================================================================== - * Copyright (c) 1999-2006 The OpenSSL Project. All rights reserved. + * Copyright (c) 1999-2011 The OpenSSL Project. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions @@ -70,6 +70,7 @@ static ERR_STRING_DATA ECDH_str_functs[]= { +{ERR_FUNC(ECDH_F_ECDH_CHECK), "ECDH_CHECK"}, {ERR_FUNC(ECDH_F_ECDH_COMPUTE_KEY), "ECDH_compute_key"}, {ERR_FUNC(ECDH_F_ECDH_DATA_NEW_METHOD), "ECDH_DATA_new_method"}, {0,NULL} @@ -78,6 +79,7 @@ static ERR_STRING_DATA ECDH_str_functs[]= static ERR_STRING_DATA ECDH_str_reasons[]= { {ERR_REASON(ECDH_R_KDF_FAILED) ,"KDF failed"}, +{ERR_REASON(ECDH_R_NON_FIPS_METHOD) ,"non fips method"}, {ERR_REASON(ECDH_R_NO_PRIVATE_VALUE) ,"no private value"}, {ERR_REASON(ECDH_R_POINT_ARITHMETIC_FAILURE),"point arithmetic failure"}, {0,NULL} diff --git a/crypto/openssl/crypto/ecdh/ech_lib.c b/crypto/openssl/crypto/ecdh/ech_lib.c index 4d8ea03d3d..dadbfd3c49 100644 --- a/crypto/openssl/crypto/ecdh/ech_lib.c +++ b/crypto/openssl/crypto/ecdh/ech_lib.c @@ -73,6 +73,9 @@ #include #endif #include +#ifdef OPENSSL_FIPS +#include +#endif const char ECDH_version[]="ECDH" OPENSSL_VERSION_PTEXT; @@ -90,7 +93,16 @@ void ECDH_set_default_method(const ECDH_METHOD *meth) const ECDH_METHOD *ECDH_get_default_method(void) { if(!default_ECDH_method) + { +#ifdef OPENSSL_FIPS + if (FIPS_mode()) + return FIPS_ecdh_openssl(); + else + return ECDH_OpenSSL(); +#else default_ECDH_method = ECDH_OpenSSL(); +#endif + } return default_ECDH_method; } @@ -215,6 +227,14 @@ ECDH_DATA *ecdh_check(EC_KEY *key) } else ecdh_data = (ECDH_DATA *)data; +#ifdef OPENSSL_FIPS + if (FIPS_mode() && !(ecdh_data->flags & ECDH_FLAG_FIPS_METHOD) + && !(EC_KEY_get_flags(key) & EC_FLAG_NON_FIPS_ALLOW)) + { + ECDHerr(ECDH_F_ECDH_CHECK, ECDH_R_NON_FIPS_METHOD); + return NULL; + } +#endif return ecdh_data; diff --git a/crypto/openssl/crypto/ecdh/ech_locl.h b/crypto/openssl/crypto/ecdh/ech_locl.h index f658526a7e..f6cad6a894 100644 --- a/crypto/openssl/crypto/ecdh/ech_locl.h +++ b/crypto/openssl/crypto/ecdh/ech_locl.h @@ -75,6 +75,14 @@ struct ecdh_method char *app_data; }; +/* If this flag is set the ECDH method is FIPS compliant and can be used + * in FIPS mode. This is set in the validated module method. If an + * application sets this flag in its own methods it is its responsibility + * to ensure the result is compliant. + */ + +#define ECDH_FLAG_FIPS_METHOD 0x1 + typedef struct ecdh_data_st { /* EC_KEY_METH_DATA part */ int (*init)(EC_KEY *); diff --git a/crypto/openssl/crypto/ecdh/ech_ossl.c b/crypto/openssl/crypto/ecdh/ech_ossl.c index 2a40ff12df..4a30628fbc 100644 --- a/crypto/openssl/crypto/ecdh/ech_ossl.c +++ b/crypto/openssl/crypto/ecdh/ech_ossl.c @@ -157,6 +157,7 @@ static int ecdh_compute_key(void *out, size_t outlen, const EC_POINT *pub_key, goto err; } } +#ifndef OPENSSL_NO_EC2M else { if (!EC_POINT_get_affine_coordinates_GF2m(group, tmp, x, y, ctx)) @@ -165,6 +166,7 @@ static int ecdh_compute_key(void *out, size_t outlen, const EC_POINT *pub_key, goto err; } } +#endif buflen = (EC_GROUP_get_degree(group) + 7)/8; len = BN_num_bytes(x); diff --git a/crypto/openssl/crypto/ecdsa/ecdsa.h b/crypto/openssl/crypto/ecdsa/ecdsa.h index e61c539812..7fb5254b62 100644 --- a/crypto/openssl/crypto/ecdsa/ecdsa.h +++ b/crypto/openssl/crypto/ecdsa/ecdsa.h @@ -238,6 +238,7 @@ void ERR_load_ECDSA_strings(void); /* Error codes for the ECDSA functions. */ /* Function codes. */ +#define ECDSA_F_ECDSA_CHECK 104 #define ECDSA_F_ECDSA_DATA_NEW_METHOD 100 #define ECDSA_F_ECDSA_DO_SIGN 101 #define ECDSA_F_ECDSA_DO_VERIFY 102 @@ -249,6 +250,7 @@ void ERR_load_ECDSA_strings(void); #define ECDSA_R_ERR_EC_LIB 102 #define ECDSA_R_MISSING_PARAMETERS 103 #define ECDSA_R_NEED_NEW_SETUP_VALUES 106 +#define ECDSA_R_NON_FIPS_METHOD 107 #define ECDSA_R_RANDOM_NUMBER_GENERATION_FAILED 104 #define ECDSA_R_SIGNATURE_MALLOC_FAILED 105 diff --git a/crypto/openssl/crypto/ecdsa/ecs_err.c b/crypto/openssl/crypto/ecdsa/ecs_err.c index 98e38d537f..81542e6d15 100644 --- a/crypto/openssl/crypto/ecdsa/ecs_err.c +++ b/crypto/openssl/crypto/ecdsa/ecs_err.c @@ -1,6 +1,6 @@ /* crypto/ecdsa/ecs_err.c */ /* ==================================================================== - * Copyright (c) 1999-2006 The OpenSSL Project. All rights reserved. + * Copyright (c) 1999-2011 The OpenSSL Project. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions @@ -70,6 +70,7 @@ static ERR_STRING_DATA ECDSA_str_functs[]= { +{ERR_FUNC(ECDSA_F_ECDSA_CHECK), "ECDSA_CHECK"}, {ERR_FUNC(ECDSA_F_ECDSA_DATA_NEW_METHOD), "ECDSA_DATA_NEW_METHOD"}, {ERR_FUNC(ECDSA_F_ECDSA_DO_SIGN), "ECDSA_do_sign"}, {ERR_FUNC(ECDSA_F_ECDSA_DO_VERIFY), "ECDSA_do_verify"}, @@ -84,6 +85,7 @@ static ERR_STRING_DATA ECDSA_str_reasons[]= {ERR_REASON(ECDSA_R_ERR_EC_LIB) ,"err ec lib"}, {ERR_REASON(ECDSA_R_MISSING_PARAMETERS) ,"missing parameters"}, {ERR_REASON(ECDSA_R_NEED_NEW_SETUP_VALUES),"need new setup values"}, +{ERR_REASON(ECDSA_R_NON_FIPS_METHOD) ,"non fips method"}, {ERR_REASON(ECDSA_R_RANDOM_NUMBER_GENERATION_FAILED),"random number generation failed"}, {ERR_REASON(ECDSA_R_SIGNATURE_MALLOC_FAILED),"signature malloc failed"}, {0,NULL} diff --git a/crypto/openssl/crypto/ecdsa/ecs_lib.c b/crypto/openssl/crypto/ecdsa/ecs_lib.c index 2ebae3aa27..e477da430b 100644 --- a/crypto/openssl/crypto/ecdsa/ecs_lib.c +++ b/crypto/openssl/crypto/ecdsa/ecs_lib.c @@ -60,6 +60,9 @@ #endif #include #include +#ifdef OPENSSL_FIPS +#include +#endif const char ECDSA_version[]="ECDSA" OPENSSL_VERSION_PTEXT; @@ -77,7 +80,16 @@ void ECDSA_set_default_method(const ECDSA_METHOD *meth) const ECDSA_METHOD *ECDSA_get_default_method(void) { if(!default_ECDSA_method) + { +#ifdef OPENSSL_FIPS + if (FIPS_mode()) + return FIPS_ecdsa_openssl(); + else + return ECDSA_OpenSSL(); +#else default_ECDSA_method = ECDSA_OpenSSL(); +#endif + } return default_ECDSA_method; } @@ -193,7 +205,14 @@ ECDSA_DATA *ecdsa_check(EC_KEY *key) } else ecdsa_data = (ECDSA_DATA *)data; - +#ifdef OPENSSL_FIPS + if (FIPS_mode() && !(ecdsa_data->flags & ECDSA_FLAG_FIPS_METHOD) + && !(EC_KEY_get_flags(key) & EC_FLAG_NON_FIPS_ALLOW)) + { + ECDSAerr(ECDSA_F_ECDSA_CHECK, ECDSA_R_NON_FIPS_METHOD); + return NULL; + } +#endif return ecdsa_data; } diff --git a/crypto/openssl/crypto/ecdsa/ecs_locl.h b/crypto/openssl/crypto/ecdsa/ecs_locl.h index 3a69a840e2..cb3be13cfc 100644 --- a/crypto/openssl/crypto/ecdsa/ecs_locl.h +++ b/crypto/openssl/crypto/ecdsa/ecs_locl.h @@ -82,6 +82,14 @@ struct ecdsa_method char *app_data; }; +/* If this flag is set the ECDSA method is FIPS compliant and can be used + * in FIPS mode. This is set in the validated module method. If an + * application sets this flag in its own methods it is its responsibility + * to ensure the result is compliant. + */ + +#define ECDSA_FLAG_FIPS_METHOD 0x1 + typedef struct ecdsa_data_st { /* EC_KEY_METH_DATA part */ int (*init)(EC_KEY *); diff --git a/crypto/openssl/crypto/ecdsa/ecs_ossl.c b/crypto/openssl/crypto/ecdsa/ecs_ossl.c index 1bbf328de5..7725935610 100644 --- a/crypto/openssl/crypto/ecdsa/ecs_ossl.c +++ b/crypto/openssl/crypto/ecdsa/ecs_ossl.c @@ -167,6 +167,7 @@ static int ecdsa_sign_setup(EC_KEY *eckey, BN_CTX *ctx_in, BIGNUM **kinvp, goto err; } } +#ifndef OPENSSL_NO_EC2M else /* NID_X9_62_characteristic_two_field */ { if (!EC_POINT_get_affine_coordinates_GF2m(group, @@ -176,6 +177,7 @@ static int ecdsa_sign_setup(EC_KEY *eckey, BN_CTX *ctx_in, BIGNUM **kinvp, goto err; } } +#endif if (!BN_nnmod(r, X, order, ctx)) { ECDSAerr(ECDSA_F_ECDSA_SIGN_SETUP, ERR_R_BN_LIB); @@ -454,6 +456,7 @@ static int ecdsa_do_verify(const unsigned char *dgst, int dgst_len, goto err; } } +#ifndef OPENSSL_NO_EC2M else /* NID_X9_62_characteristic_two_field */ { if (!EC_POINT_get_affine_coordinates_GF2m(group, @@ -463,7 +466,7 @@ static int ecdsa_do_verify(const unsigned char *dgst, int dgst_len, goto err; } } - +#endif if (!BN_nnmod(u1, X, order, ctx)) { ECDSAerr(ECDSA_F_ECDSA_DO_VERIFY, ERR_R_BN_LIB); diff --git a/crypto/openssl/crypto/engine/eng_all.c b/crypto/openssl/crypto/engine/eng_all.c index 22c120454f..6093376df4 100644 --- a/crypto/openssl/crypto/engine/eng_all.c +++ b/crypto/openssl/crypto/engine/eng_all.c @@ -61,6 +61,8 @@ void ENGINE_load_builtin_engines(void) { + /* Some ENGINEs need this */ + OPENSSL_cpuid_setup(); #if 0 /* There's no longer any need for an "openssl" ENGINE unless, one day, * it is the *only* way for standard builtin implementations to be be @@ -70,6 +72,12 @@ void ENGINE_load_builtin_engines(void) #endif #if !defined(OPENSSL_NO_HW) && (defined(__OpenBSD__) || defined(__FreeBSD__) || defined(HAVE_CRYPTODEV)) ENGINE_load_cryptodev(); +#endif +#ifndef OPENSSL_NO_RSAX + ENGINE_load_rsax(); +#endif +#ifndef OPENSSL_NO_RDRAND + ENGINE_load_rdrand(); #endif ENGINE_load_dynamic(); #ifndef OPENSSL_NO_STATIC_ENGINE @@ -112,6 +120,7 @@ void ENGINE_load_builtin_engines(void) ENGINE_load_capi(); #endif #endif + ENGINE_register_all_complete(); } #if defined(__OpenBSD__) || defined(__FreeBSD__) || defined(HAVE_CRYPTODEV) diff --git a/crypto/openssl/crypto/engine/eng_cryptodev.c b/crypto/openssl/crypto/engine/eng_cryptodev.c index 52f4ca3901..5a715aca4f 100644 --- a/crypto/openssl/crypto/engine/eng_cryptodev.c +++ b/crypto/openssl/crypto/engine/eng_cryptodev.c @@ -79,8 +79,6 @@ struct dev_crypto_state { unsigned char digest_res[HASH_MAX_LEN]; char *mac_data; int mac_len; - - int copy; #endif }; @@ -200,6 +198,7 @@ get_dev_crypto(void) if ((fd = open_dev_crypto()) == -1) return (-1); +#ifndef CRIOGET_NOT_NEEDED if (ioctl(fd, CRIOGET, &retfd) == -1) return (-1); @@ -208,9 +207,19 @@ get_dev_crypto(void) close(retfd); return (-1); } +#else + retfd = fd; +#endif return (retfd); } +static void put_dev_crypto(int fd) +{ +#ifndef CRIOGET_NOT_NEEDED + close(fd); +#endif +} + /* Caching version for asym operations */ static int get_asym_dev_crypto(void) @@ -252,7 +261,7 @@ get_cryptodev_ciphers(const int **cnids) ioctl(fd, CIOCFSESSION, &sess.ses) != -1) nids[count++] = ciphers[i].nid; } - close(fd); + put_dev_crypto(fd); if (count > 0) *cnids = nids; @@ -291,7 +300,7 @@ get_cryptodev_digests(const int **cnids) ioctl(fd, CIOCFSESSION, &sess.ses) != -1) nids[count++] = digests[i].nid; } - close(fd); + put_dev_crypto(fd); if (count > 0) *cnids = nids; @@ -436,7 +445,7 @@ cryptodev_init_key(EVP_CIPHER_CTX *ctx, const unsigned char *key, sess->cipher = cipher; if (ioctl(state->d_fd, CIOCGSESSION, sess) == -1) { - close(state->d_fd); + put_dev_crypto(state->d_fd); state->d_fd = -1; return (0); } @@ -473,7 +482,7 @@ cryptodev_cleanup(EVP_CIPHER_CTX *ctx) } else { ret = 1; } - close(state->d_fd); + put_dev_crypto(state->d_fd); state->d_fd = -1; return (ret); @@ -686,7 +695,7 @@ static int cryptodev_digest_init(EVP_MD_CTX *ctx) sess->mac = digest; if (ioctl(state->d_fd, CIOCGSESSION, sess) < 0) { - close(state->d_fd); + put_dev_crypto(state->d_fd); state->d_fd = -1; printf("cryptodev_digest_init: Open session failed\n"); return (0); @@ -758,14 +767,12 @@ static int cryptodev_digest_final(EVP_MD_CTX *ctx, unsigned char *md) if (! (ctx->flags & EVP_MD_CTX_FLAG_ONESHOT) ) { /* if application doesn't support one buffer */ memset(&cryp, 0, sizeof(cryp)); - cryp.ses = sess->ses; cryp.flags = 0; cryp.len = state->mac_len; cryp.src = state->mac_data; cryp.dst = NULL; cryp.mac = (caddr_t)md; - if (ioctl(state->d_fd, CIOCCRYPT, &cryp) < 0) { printf("cryptodev_digest_final: digest failed\n"); return (0); @@ -786,6 +793,9 @@ static int cryptodev_digest_cleanup(EVP_MD_CTX *ctx) struct dev_crypto_state *state = ctx->md_data; struct session_op *sess = &state->d_sess; + if (state == NULL) + return 0; + if (state->d_fd < 0) { printf("cryptodev_digest_cleanup: illegal input\n"); return (0); @@ -797,16 +807,13 @@ static int cryptodev_digest_cleanup(EVP_MD_CTX *ctx) state->mac_len = 0; } - if (state->copy) - return 1; - if (ioctl(state->d_fd, CIOCFSESSION, &sess->ses) < 0) { printf("cryptodev_digest_cleanup: failed to close session\n"); ret = 0; } else { ret = 1; } - close(state->d_fd); + put_dev_crypto(state->d_fd); state->d_fd = -1; return (ret); @@ -816,15 +823,39 @@ static int cryptodev_digest_copy(EVP_MD_CTX *to,const EVP_MD_CTX *from) { struct dev_crypto_state *fstate = from->md_data; struct dev_crypto_state *dstate = to->md_data; + struct session_op *sess; + int digest; - memcpy(dstate, fstate, sizeof(struct dev_crypto_state)); + if (dstate == NULL || fstate == NULL) + return 1; - if (fstate->mac_len != 0) { - dstate->mac_data = OPENSSL_malloc(fstate->mac_len); - memcpy(dstate->mac_data, fstate->mac_data, fstate->mac_len); + memcpy(dstate, fstate, sizeof(struct dev_crypto_state)); + + sess = &dstate->d_sess; + + digest = digest_nid_to_cryptodev(to->digest->type); + + sess->mackey = dstate->dummy_mac_key; + sess->mackeylen = digest_key_length(to->digest->type); + sess->mac = digest; + + dstate->d_fd = get_dev_crypto(); + + if (ioctl(dstate->d_fd, CIOCGSESSION, sess) < 0) { + put_dev_crypto(dstate->d_fd); + dstate->d_fd = -1; + printf("cryptodev_digest_init: Open session failed\n"); + return (0); } - dstate->copy = 1; + if (fstate->mac_len != 0) { + if (fstate->mac_data != NULL) + { + dstate->mac_data = OPENSSL_malloc(fstate->mac_len); + memcpy(dstate->mac_data, fstate->mac_data, fstate->mac_len); + dstate->mac_len = fstate->mac_len; + } + } return 1; } @@ -1347,11 +1378,11 @@ ENGINE_load_cryptodev(void) * find out what asymmetric crypto algorithms we support */ if (ioctl(fd, CIOCASYMFEAT, &cryptodev_asymfeat) == -1) { - close(fd); + put_dev_crypto(fd); ENGINE_free(engine); return; } - close(fd); + put_dev_crypto(fd); if (!ENGINE_set_id(engine, "cryptodev") || !ENGINE_set_name(engine, "BSD cryptodev engine") || diff --git a/crypto/openssl/crypto/engine/eng_fat.c b/crypto/openssl/crypto/engine/eng_fat.c index db66e62350..789b8d57e5 100644 --- a/crypto/openssl/crypto/engine/eng_fat.c +++ b/crypto/openssl/crypto/engine/eng_fat.c @@ -176,6 +176,7 @@ int ENGINE_register_all_complete(void) ENGINE *e; for(e=ENGINE_get_first() ; e ; e=ENGINE_get_next(e)) - ENGINE_register_complete(e); + if (!(e->flags & ENGINE_FLAGS_NO_REGISTER_ALL)) + ENGINE_register_complete(e); return 1; } diff --git a/crypto/openssl/crypto/rand/rand_err.c b/crypto/openssl/crypto/engine/eng_rdrand.c similarity index 57% copy from crypto/openssl/crypto/rand/rand_err.c copy to crypto/openssl/crypto/engine/eng_rdrand.c index 03cda4dd92..a9ba5ae6f9 100644 --- a/crypto/openssl/crypto/rand/rand_err.c +++ b/crypto/openssl/crypto/engine/eng_rdrand.c @@ -1,13 +1,12 @@ -/* crypto/rand/rand_err.c */ /* ==================================================================== - * Copyright (c) 1999-2006 The OpenSSL Project. All rights reserved. + * Copyright (c) 2011 The OpenSSL Project. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright - * notice, this list of conditions and the following disclaimer. + * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in @@ -22,7 +21,7 @@ * 4. The names "OpenSSL Toolkit" and "OpenSSL Project" must not be used to * endorse or promote products derived from this software without * prior written permission. For written permission, please contact - * openssl-core@OpenSSL.org. + * licensing@OpenSSL.org. * * 5. Products derived from this software may not be called "OpenSSL" * nor may "OpenSSL" appear in their names without prior written @@ -46,51 +45,98 @@ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED * OF THE POSSIBILITY OF SUCH DAMAGE. * ==================================================================== - * - * This product includes cryptographic software written by Eric Young - * (eay@cryptsoft.com). This product includes software written by Tim - * Hudson (tjh@cryptsoft.com). - * */ -/* NOTE: this file was auto generated by the mkerr.pl script: any changes - * made to it will be overwritten when the script next updates this file, - * only reason strings will be preserved. - */ +#include #include -#include +#include +#include #include +#include -/* BEGIN ERROR CODES */ -#ifndef OPENSSL_NO_ERR +#if (defined(__i386) || defined(__i386__) || defined(_M_IX86) || \ + defined(__x86_64) || defined(__x86_64__) || \ + defined(_M_AMD64) || defined (_M_X64)) && defined(OPENSSL_CPUID_OBJ) -#define ERR_FUNC(func) ERR_PACK(ERR_LIB_RAND,func,0) -#define ERR_REASON(reason) ERR_PACK(ERR_LIB_RAND,0,reason) +size_t OPENSSL_ia32_rdrand(void); -static ERR_STRING_DATA RAND_str_functs[]= +static int get_random_bytes (unsigned char *buf, int num) { -{ERR_FUNC(RAND_F_RAND_GET_RAND_METHOD), "RAND_get_rand_method"}, -{ERR_FUNC(RAND_F_SSLEAY_RAND_BYTES), "SSLEAY_RAND_BYTES"}, -{0,NULL} - }; + size_t rnd; + + while (num>=(int)sizeof(size_t)) { + if ((rnd = OPENSSL_ia32_rdrand()) == 0) return 0; + + *((size_t *)buf) = rnd; + buf += sizeof(size_t); + num -= sizeof(size_t); + } + if (num) { + if ((rnd = OPENSSL_ia32_rdrand()) == 0) return 0; + + memcpy (buf,&rnd,num); + } + + return 1; + } + +static int random_status (void) +{ return 1; } -static ERR_STRING_DATA RAND_str_reasons[]= +static RAND_METHOD rdrand_meth = { -{ERR_REASON(RAND_R_PRNG_NOT_SEEDED) ,"PRNG not seeded"}, -{0,NULL} + NULL, /* seed */ + get_random_bytes, + NULL, /* cleanup */ + NULL, /* add */ + get_random_bytes, + random_status, }; -#endif +static int rdrand_init(ENGINE *e) +{ return 1; } + +static const char *engine_e_rdrand_id = "rdrand"; +static const char *engine_e_rdrand_name = "Intel RDRAND engine"; -void ERR_load_RAND_strings(void) +static int bind_helper(ENGINE *e) { -#ifndef OPENSSL_NO_ERR + if (!ENGINE_set_id(e, engine_e_rdrand_id) || + !ENGINE_set_name(e, engine_e_rdrand_name) || + !ENGINE_set_init_function(e, rdrand_init) || + !ENGINE_set_RAND(e, &rdrand_meth) ) + return 0; + + return 1; + } - if (ERR_func_error_string(RAND_str_functs[0].error) == NULL) +static ENGINE *ENGINE_rdrand(void) + { + ENGINE *ret = ENGINE_new(); + if(!ret) + return NULL; + if(!bind_helper(ret)) { - ERR_load_strings(0,RAND_str_functs); - ERR_load_strings(0,RAND_str_reasons); + ENGINE_free(ret); + return NULL; } -#endif + return ret; } + +void ENGINE_load_rdrand (void) + { + extern unsigned int OPENSSL_ia32cap_P[]; + + if (OPENSSL_ia32cap_P[1] & (1<<(62-32))) + { + ENGINE *toadd = ENGINE_rdrand(); + if(!toadd) return; + ENGINE_add(toadd); + ENGINE_free(toadd); + ERR_clear_error(); + } + } +#else +void ENGINE_load_rdrand (void) {} +#endif diff --git a/crypto/openssl/crypto/engine/eng_rsax.c b/crypto/openssl/crypto/engine/eng_rsax.c new file mode 100644 index 0000000000..96e63477ee --- /dev/null +++ b/crypto/openssl/crypto/engine/eng_rsax.c @@ -0,0 +1,668 @@ +/* crypto/engine/eng_rsax.c */ +/* Copyright (c) 2010-2010 Intel Corp. + * Author: Vinodh.Gopal@intel.com + * Jim Guilford + * Erdinc.Ozturk@intel.com + * Maxim.Perminov@intel.com + * Ying.Huang@intel.com + * + * More information about algorithm used can be found at: + * http://www.cse.buffalo.edu/srds2009/escs2009_submission_Gopal.pdf + */ +/* ==================================================================== + * Copyright (c) 1999-2001 The OpenSSL Project. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * + * 3. All advertising materials mentioning features or use of this + * software must display the following acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit. (http://www.OpenSSL.org/)" + * + * 4. The names "OpenSSL Toolkit" and "OpenSSL Project" must not be used to + * endorse or promote products derived from this software without + * prior written permission. For written permission, please contact + * licensing@OpenSSL.org. + * + * 5. Products derived from this software may not be called "OpenSSL" + * nor may "OpenSSL" appear in their names without prior written + * permission of the OpenSSL Project. + * + * 6. Redistributions of any form whatsoever must retain the following + * acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit (http://www.OpenSSL.org/)" + * + * THIS SOFTWARE IS PROVIDED BY THE OpenSSL PROJECT ``AS IS'' AND ANY + * EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR + * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE OpenSSL PROJECT OR + * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED + * OF THE POSSIBILITY OF SUCH DAMAGE. + * ==================================================================== + * + * This product includes cryptographic software written by Eric Young + * (eay@cryptsoft.com). This product includes software written by Tim + * Hudson (tjh@cryptsoft.com). + */ + +#include + +#include +#include +#include +#include +#include +#ifndef OPENSSL_NO_RSA +#include +#endif +#include +#include + +/* RSAX is available **ONLY* on x86_64 CPUs */ +#undef COMPILE_RSAX + +#if (defined(__x86_64) || defined(__x86_64__) || \ + defined(_M_AMD64) || defined (_M_X64)) && !defined(OPENSSL_NO_ASM) +#define COMPILE_RSAX +static ENGINE *ENGINE_rsax (void); +#endif + +void ENGINE_load_rsax (void) + { +/* On non-x86 CPUs it just returns. */ +#ifdef COMPILE_RSAX + ENGINE *toadd = ENGINE_rsax(); + if(!toadd) return; + ENGINE_add(toadd); + ENGINE_free(toadd); + ERR_clear_error(); +#endif + } + +#ifdef COMPILE_RSAX +#define E_RSAX_LIB_NAME "rsax engine" + +static int e_rsax_destroy(ENGINE *e); +static int e_rsax_init(ENGINE *e); +static int e_rsax_finish(ENGINE *e); +static int e_rsax_ctrl(ENGINE *e, int cmd, long i, void *p, void (*f)(void)); + +#ifndef OPENSSL_NO_RSA +/* RSA stuff */ +static int e_rsax_rsa_mod_exp(BIGNUM *r, const BIGNUM *I, RSA *rsa, BN_CTX *ctx); +static int e_rsax_rsa_finish(RSA *r); +#endif + +static const ENGINE_CMD_DEFN e_rsax_cmd_defns[] = { + {0, NULL, NULL, 0} + }; + +#ifndef OPENSSL_NO_RSA +/* Our internal RSA_METHOD that we provide pointers to */ +static RSA_METHOD e_rsax_rsa = + { + "Intel RSA-X method", + NULL, + NULL, + NULL, + NULL, + e_rsax_rsa_mod_exp, + NULL, + NULL, + e_rsax_rsa_finish, + RSA_FLAG_CACHE_PUBLIC|RSA_FLAG_CACHE_PRIVATE, + NULL, + NULL, + NULL + }; +#endif + +/* Constants used when creating the ENGINE */ +static const char *engine_e_rsax_id = "rsax"; +static const char *engine_e_rsax_name = "RSAX engine support"; + +/* This internal function is used by ENGINE_rsax() */ +static int bind_helper(ENGINE *e) + { +#ifndef OPENSSL_NO_RSA + const RSA_METHOD *meth1; +#endif + if(!ENGINE_set_id(e, engine_e_rsax_id) || + !ENGINE_set_name(e, engine_e_rsax_name) || +#ifndef OPENSSL_NO_RSA + !ENGINE_set_RSA(e, &e_rsax_rsa) || +#endif + !ENGINE_set_destroy_function(e, e_rsax_destroy) || + !ENGINE_set_init_function(e, e_rsax_init) || + !ENGINE_set_finish_function(e, e_rsax_finish) || + !ENGINE_set_ctrl_function(e, e_rsax_ctrl) || + !ENGINE_set_cmd_defns(e, e_rsax_cmd_defns)) + return 0; + +#ifndef OPENSSL_NO_RSA + meth1 = RSA_PKCS1_SSLeay(); + e_rsax_rsa.rsa_pub_enc = meth1->rsa_pub_enc; + e_rsax_rsa.rsa_pub_dec = meth1->rsa_pub_dec; + e_rsax_rsa.rsa_priv_enc = meth1->rsa_priv_enc; + e_rsax_rsa.rsa_priv_dec = meth1->rsa_priv_dec; + e_rsax_rsa.bn_mod_exp = meth1->bn_mod_exp; +#endif + return 1; + } + +static ENGINE *ENGINE_rsax(void) + { + ENGINE *ret = ENGINE_new(); + if(!ret) + return NULL; + if(!bind_helper(ret)) + { + ENGINE_free(ret); + return NULL; + } + return ret; + } + +#ifndef OPENSSL_NO_RSA +/* Used to attach our own key-data to an RSA structure */ +static int rsax_ex_data_idx = -1; +#endif + +static int e_rsax_destroy(ENGINE *e) + { + return 1; + } + +/* (de)initialisation functions. */ +static int e_rsax_init(ENGINE *e) + { +#ifndef OPENSSL_NO_RSA + if (rsax_ex_data_idx == -1) + rsax_ex_data_idx = RSA_get_ex_new_index(0, + NULL, + NULL, NULL, NULL); +#endif + if (rsax_ex_data_idx == -1) + return 0; + return 1; + } + +static int e_rsax_finish(ENGINE *e) + { + return 1; + } + +static int e_rsax_ctrl(ENGINE *e, int cmd, long i, void *p, void (*f)(void)) + { + int to_return = 1; + + switch(cmd) + { + /* The command isn't understood by this engine */ + default: + to_return = 0; + break; + } + + return to_return; + } + + +#ifndef OPENSSL_NO_RSA + +#ifdef _WIN32 +typedef unsigned __int64 UINT64; +#else +typedef unsigned long long UINT64; +#endif +typedef unsigned short UINT16; + +/* Table t is interleaved in the following manner: + * The order in memory is t[0][0], t[0][1], ..., t[0][7], t[1][0], ... + * A particular 512-bit value is stored in t[][index] rather than the more + * normal t[index][]; i.e. the qwords of a particular entry in t are not + * adjacent in memory + */ + +/* Init BIGNUM b from the interleaved UINT64 array */ +static int interleaved_array_to_bn_512(BIGNUM* b, UINT64 *array); + +/* Extract array elements from BIGNUM b + * To set the whole array from b, call with n=8 + */ +static int bn_extract_to_array_512(const BIGNUM* b, unsigned int n, UINT64 *array); + +struct mod_ctx_512 { + UINT64 t[8][8]; + UINT64 m[8]; + UINT64 m1[8]; /* 2^278 % m */ + UINT64 m2[8]; /* 2^640 % m */ + UINT64 k1[2]; /* (- 1/m) % 2^128 */ +}; + +static int mod_exp_pre_compute_data_512(UINT64 *m, struct mod_ctx_512 *data); + +void mod_exp_512(UINT64 *result, /* 512 bits, 8 qwords */ + UINT64 *g, /* 512 bits, 8 qwords */ + UINT64 *exp, /* 512 bits, 8 qwords */ + struct mod_ctx_512 *data); + +typedef struct st_e_rsax_mod_ctx +{ + UINT64 type; + union { + struct mod_ctx_512 b512; + } ctx; + +} E_RSAX_MOD_CTX; + +static E_RSAX_MOD_CTX *e_rsax_get_ctx(RSA *rsa, int idx, BIGNUM* m) +{ + E_RSAX_MOD_CTX *hptr; + + if (idx < 0 || idx > 2) + return NULL; + + hptr = RSA_get_ex_data(rsa, rsax_ex_data_idx); + if (!hptr) { + hptr = OPENSSL_malloc(3*sizeof(E_RSAX_MOD_CTX)); + if (!hptr) return NULL; + hptr[2].type = hptr[1].type= hptr[0].type = 0; + RSA_set_ex_data(rsa, rsax_ex_data_idx, hptr); + } + + if (hptr[idx].type == (UINT64)BN_num_bits(m)) + return hptr+idx; + + if (BN_num_bits(m) == 512) { + UINT64 _m[8]; + bn_extract_to_array_512(m, 8, _m); + memset( &hptr[idx].ctx.b512, 0, sizeof(struct mod_ctx_512)); + mod_exp_pre_compute_data_512(_m, &hptr[idx].ctx.b512); + } + + hptr[idx].type = BN_num_bits(m); + return hptr+idx; +} + +static int e_rsax_rsa_finish(RSA *rsa) + { + E_RSAX_MOD_CTX *hptr = RSA_get_ex_data(rsa, rsax_ex_data_idx); + if(hptr) + { + OPENSSL_free(hptr); + RSA_set_ex_data(rsa, rsax_ex_data_idx, NULL); + } + if (rsa->_method_mod_n) + BN_MONT_CTX_free(rsa->_method_mod_n); + if (rsa->_method_mod_p) + BN_MONT_CTX_free(rsa->_method_mod_p); + if (rsa->_method_mod_q) + BN_MONT_CTX_free(rsa->_method_mod_q); + return 1; + } + + +static int e_rsax_bn_mod_exp(BIGNUM *r, const BIGNUM *g, const BIGNUM *e, + const BIGNUM *m, BN_CTX *ctx, BN_MONT_CTX *in_mont, E_RSAX_MOD_CTX* rsax_mod_ctx ) +{ + if (rsax_mod_ctx && BN_get_flags(e, BN_FLG_CONSTTIME) != 0) { + if (BN_num_bits(m) == 512) { + UINT64 _r[8]; + UINT64 _g[8]; + UINT64 _e[8]; + + /* Init the arrays from the BIGNUMs */ + bn_extract_to_array_512(g, 8, _g); + bn_extract_to_array_512(e, 8, _e); + + mod_exp_512(_r, _g, _e, &rsax_mod_ctx->ctx.b512); + /* Return the result in the BIGNUM */ + interleaved_array_to_bn_512(r, _r); + return 1; + } + } + + return BN_mod_exp_mont(r, g, e, m, ctx, in_mont); +} + +/* Declares for the Intel CIAP 512-bit / CRT / 1024 bit RSA modular + * exponentiation routine precalculations and a structure to hold the + * necessary values. These files are meant to live in crypto/rsa/ in + * the target openssl. + */ + +/* + * Local method: extracts a piece from a BIGNUM, to fit it into + * an array. Call with n=8 to extract an entire 512-bit BIGNUM + */ +static int bn_extract_to_array_512(const BIGNUM* b, unsigned int n, UINT64 *array) +{ + int i; + UINT64 tmp; + unsigned char bn_buff[64]; + memset(bn_buff, 0, 64); + if (BN_num_bytes(b) > 64) { + printf ("Can't support this byte size\n"); + return 0; } + if (BN_num_bytes(b)!=0) { + if (!BN_bn2bin(b, bn_buff+(64-BN_num_bytes(b)))) { + printf ("Error's in bn2bin\n"); + /* We have to error, here */ + return 0; } } + while (n-- > 0) { + array[n] = 0; + for (i=7; i>=0; i--) { + tmp = bn_buff[63-(n*8+i)]; + array[n] |= tmp << (8*i); } } + return 1; +} + +/* Init a 512-bit BIGNUM from the UINT64*_ (8 * 64) interleaved array */ +static int interleaved_array_to_bn_512(BIGNUM* b, UINT64 *array) +{ + unsigned char tmp[64]; + int n=8; + int i; + while (n-- > 0) { + for (i = 7; i>=0; i--) { + tmp[63-(n*8+i)] = (unsigned char)(array[n]>>(8*i)); } } + BN_bin2bn(tmp, 64, b); + return 0; +} + + +/* The main 512bit precompute call */ +static int mod_exp_pre_compute_data_512(UINT64 *m, struct mod_ctx_512 *data) + { + BIGNUM two_768, two_640, two_128, two_512, tmp, _m, tmp2; + + /* We need a BN_CTX for the modulo functions */ + BN_CTX* ctx; + /* Some tmps */ + UINT64 _t[8]; + int i, j, ret = 0; + + /* Init _m with m */ + BN_init(&_m); + interleaved_array_to_bn_512(&_m, m); + memset(_t, 0, 64); + + /* Inits */ + BN_init(&two_768); + BN_init(&two_640); + BN_init(&two_128); + BN_init(&two_512); + BN_init(&tmp); + BN_init(&tmp2); + + /* Create our context */ + if ((ctx=BN_CTX_new()) == NULL) { goto err; } + BN_CTX_start(ctx); + + /* + * For production, if you care, these only need to be set once, + * and may be made constants. + */ + BN_lshift(&two_768, BN_value_one(), 768); + BN_lshift(&two_640, BN_value_one(), 640); + BN_lshift(&two_128, BN_value_one(), 128); + BN_lshift(&two_512, BN_value_one(), 512); + + if (0 == (m[7] & 0x8000000000000000)) { + exit(1); + } + if (0 == (m[0] & 0x1)) { /* Odd modulus required for Mont */ + exit(1); + } + + /* Precompute m1 */ + BN_mod(&tmp, &two_768, &_m, ctx); + if (!bn_extract_to_array_512(&tmp, 8, &data->m1[0])) { + goto err; } + + /* Precompute m2 */ + BN_mod(&tmp, &two_640, &_m, ctx); + if (!bn_extract_to_array_512(&tmp, 8, &data->m2[0])) { + goto err; + } + + /* + * Precompute k1, a 128b number = ((-1)* m-1 ) mod 2128; k1 should + * be non-negative. + */ + BN_mod_inverse(&tmp, &_m, &two_128, ctx); + if (!BN_is_zero(&tmp)) { BN_sub(&tmp, &two_128, &tmp); } + if (!bn_extract_to_array_512(&tmp, 2, &data->k1[0])) { + goto err; } + + /* Precompute t */ + for (i=0; i<8; i++) { + BN_zero(&tmp); + if (i & 1) { BN_add(&tmp, &two_512, &tmp); } + if (i & 2) { BN_add(&tmp, &two_512, &tmp); } + if (i & 4) { BN_add(&tmp, &two_640, &tmp); } + + BN_nnmod(&tmp2, &tmp, &_m, ctx); + if (!bn_extract_to_array_512(&tmp2, 8, _t)) { + goto err; } + for (j=0; j<8; j++) data->t[j][i] = _t[j]; } + + /* Precompute m */ + for (i=0; i<8; i++) { + data->m[i] = m[i]; } + + ret = 1; + +err: + /* Cleanup */ + if (ctx != NULL) { + BN_CTX_end(ctx); BN_CTX_free(ctx); } + BN_free(&two_768); + BN_free(&two_640); + BN_free(&two_128); + BN_free(&two_512); + BN_free(&tmp); + BN_free(&tmp2); + BN_free(&_m); + + return ret; +} + + +static int e_rsax_rsa_mod_exp(BIGNUM *r0, const BIGNUM *I, RSA *rsa, BN_CTX *ctx) + { + BIGNUM *r1,*m1,*vrfy; + BIGNUM local_dmp1,local_dmq1,local_c,local_r1; + BIGNUM *dmp1,*dmq1,*c,*pr1; + int ret=0; + + BN_CTX_start(ctx); + r1 = BN_CTX_get(ctx); + m1 = BN_CTX_get(ctx); + vrfy = BN_CTX_get(ctx); + + { + BIGNUM local_p, local_q; + BIGNUM *p = NULL, *q = NULL; + int error = 0; + + /* Make sure BN_mod_inverse in Montgomery + * intialization uses the BN_FLG_CONSTTIME flag + * (unless RSA_FLAG_NO_CONSTTIME is set) + */ + if (!(rsa->flags & RSA_FLAG_NO_CONSTTIME)) + { + BN_init(&local_p); + p = &local_p; + BN_with_flags(p, rsa->p, BN_FLG_CONSTTIME); + + BN_init(&local_q); + q = &local_q; + BN_with_flags(q, rsa->q, BN_FLG_CONSTTIME); + } + else + { + p = rsa->p; + q = rsa->q; + } + + if (rsa->flags & RSA_FLAG_CACHE_PRIVATE) + { + if (!BN_MONT_CTX_set_locked(&rsa->_method_mod_p, CRYPTO_LOCK_RSA, p, ctx)) + error = 1; + if (!BN_MONT_CTX_set_locked(&rsa->_method_mod_q, CRYPTO_LOCK_RSA, q, ctx)) + error = 1; + } + + /* clean up */ + if (!(rsa->flags & RSA_FLAG_NO_CONSTTIME)) + { + BN_free(&local_p); + BN_free(&local_q); + } + if ( error ) + goto err; + } + + if (rsa->flags & RSA_FLAG_CACHE_PUBLIC) + if (!BN_MONT_CTX_set_locked(&rsa->_method_mod_n, CRYPTO_LOCK_RSA, rsa->n, ctx)) + goto err; + + /* compute I mod q */ + if (!(rsa->flags & RSA_FLAG_NO_CONSTTIME)) + { + c = &local_c; + BN_with_flags(c, I, BN_FLG_CONSTTIME); + if (!BN_mod(r1,c,rsa->q,ctx)) goto err; + } + else + { + if (!BN_mod(r1,I,rsa->q,ctx)) goto err; + } + + /* compute r1^dmq1 mod q */ + if (!(rsa->flags & RSA_FLAG_NO_CONSTTIME)) + { + dmq1 = &local_dmq1; + BN_with_flags(dmq1, rsa->dmq1, BN_FLG_CONSTTIME); + } + else + dmq1 = rsa->dmq1; + + if (!e_rsax_bn_mod_exp(m1,r1,dmq1,rsa->q,ctx, + rsa->_method_mod_q, e_rsax_get_ctx(rsa, 0, rsa->q) )) goto err; + + /* compute I mod p */ + if (!(rsa->flags & RSA_FLAG_NO_CONSTTIME)) + { + c = &local_c; + BN_with_flags(c, I, BN_FLG_CONSTTIME); + if (!BN_mod(r1,c,rsa->p,ctx)) goto err; + } + else + { + if (!BN_mod(r1,I,rsa->p,ctx)) goto err; + } + + /* compute r1^dmp1 mod p */ + if (!(rsa->flags & RSA_FLAG_NO_CONSTTIME)) + { + dmp1 = &local_dmp1; + BN_with_flags(dmp1, rsa->dmp1, BN_FLG_CONSTTIME); + } + else + dmp1 = rsa->dmp1; + + if (!e_rsax_bn_mod_exp(r0,r1,dmp1,rsa->p,ctx, + rsa->_method_mod_p, e_rsax_get_ctx(rsa, 1, rsa->p) )) goto err; + + if (!BN_sub(r0,r0,m1)) goto err; + /* This will help stop the size of r0 increasing, which does + * affect the multiply if it optimised for a power of 2 size */ + if (BN_is_negative(r0)) + if (!BN_add(r0,r0,rsa->p)) goto err; + + if (!BN_mul(r1,r0,rsa->iqmp,ctx)) goto err; + + /* Turn BN_FLG_CONSTTIME flag on before division operation */ + if (!(rsa->flags & RSA_FLAG_NO_CONSTTIME)) + { + pr1 = &local_r1; + BN_with_flags(pr1, r1, BN_FLG_CONSTTIME); + } + else + pr1 = r1; + if (!BN_mod(r0,pr1,rsa->p,ctx)) goto err; + + /* If p < q it is occasionally possible for the correction of + * adding 'p' if r0 is negative above to leave the result still + * negative. This can break the private key operations: the following + * second correction should *always* correct this rare occurrence. + * This will *never* happen with OpenSSL generated keys because + * they ensure p > q [steve] + */ + if (BN_is_negative(r0)) + if (!BN_add(r0,r0,rsa->p)) goto err; + if (!BN_mul(r1,r0,rsa->q,ctx)) goto err; + if (!BN_add(r0,r1,m1)) goto err; + + if (rsa->e && rsa->n) + { + if (!e_rsax_bn_mod_exp(vrfy,r0,rsa->e,rsa->n,ctx,rsa->_method_mod_n, e_rsax_get_ctx(rsa, 2, rsa->n) )) + goto err; + + /* If 'I' was greater than (or equal to) rsa->n, the operation + * will be equivalent to using 'I mod n'. However, the result of + * the verify will *always* be less than 'n' so we don't check + * for absolute equality, just congruency. */ + if (!BN_sub(vrfy, vrfy, I)) goto err; + if (!BN_mod(vrfy, vrfy, rsa->n, ctx)) goto err; + if (BN_is_negative(vrfy)) + if (!BN_add(vrfy, vrfy, rsa->n)) goto err; + if (!BN_is_zero(vrfy)) + { + /* 'I' and 'vrfy' aren't congruent mod n. Don't leak + * miscalculated CRT output, just do a raw (slower) + * mod_exp and return that instead. */ + + BIGNUM local_d; + BIGNUM *d = NULL; + + if (!(rsa->flags & RSA_FLAG_NO_CONSTTIME)) + { + d = &local_d; + BN_with_flags(d, rsa->d, BN_FLG_CONSTTIME); + } + else + d = rsa->d; + if (!e_rsax_bn_mod_exp(r0,I,d,rsa->n,ctx, + rsa->_method_mod_n, e_rsax_get_ctx(rsa, 2, rsa->n) )) goto err; + } + } + ret=1; + +err: + BN_CTX_end(ctx); + + return ret; + } +#endif /* !OPENSSL_NO_RSA */ +#endif /* !COMPILE_RSAX */ diff --git a/crypto/openssl/crypto/engine/engine.h b/crypto/openssl/crypto/engine/engine.h index 943aeae215..f8be497724 100644 --- a/crypto/openssl/crypto/engine/engine.h +++ b/crypto/openssl/crypto/engine/engine.h @@ -141,6 +141,13 @@ extern "C" { * the existing ENGINE's structural reference count. */ #define ENGINE_FLAGS_BY_ID_COPY (int)0x0004 +/* This flag if for an ENGINE that does not want its methods registered as + * part of ENGINE_register_all_complete() for example if the methods are + * not usable as default methods. + */ + +#define ENGINE_FLAGS_NO_REGISTER_ALL (int)0x0008 + /* ENGINEs can support their own command types, and these flags are used in * ENGINE_CTRL_GET_CMD_FLAGS to indicate to the caller what kind of input each * command expects. Currently only numeric and string input is supported. If a @@ -344,6 +351,8 @@ void ENGINE_load_gost(void); #endif #endif void ENGINE_load_cryptodev(void); +void ENGINE_load_rsax(void); +void ENGINE_load_rdrand(void); void ENGINE_load_builtin_engines(void); /* Get and set global flags (ENGINE_TABLE_FLAG_***) for the implementation diff --git a/crypto/openssl/crypto/err/err.c b/crypto/openssl/crypto/err/err.c index 69713a6e2f..fcdb244008 100644 --- a/crypto/openssl/crypto/err/err.c +++ b/crypto/openssl/crypto/err/err.c @@ -1066,6 +1066,13 @@ void ERR_set_error_data(char *data, int flags) void ERR_add_error_data(int num, ...) { va_list args; + va_start(args, num); + ERR_add_error_vdata(num, args); + va_end(args); + } + +void ERR_add_error_vdata(int num, va_list args) + { int i,n,s; char *str,*p,*a; @@ -1074,7 +1081,6 @@ void ERR_add_error_data(int num, ...) if (str == NULL) return; str[0]='\0'; - va_start(args, num); n=0; for (i=0; i +#ifdef OPENSSL_FIPS +#include +#endif + void ERR_load_crypto_strings(void) { #ifndef OPENSSL_NO_ERR @@ -156,5 +160,8 @@ void ERR_load_crypto_strings(void) ERR_load_JPAKE_strings(); #endif ERR_load_COMP_strings(); +#endif +#ifdef OPENSSL_FIPS + ERR_load_FIPS_strings(); #endif } diff --git a/crypto/openssl/crypto/evp/bio_md.c b/crypto/openssl/crypto/evp/bio_md.c index 9841e32e1a..144fdfd56a 100644 --- a/crypto/openssl/crypto/evp/bio_md.c +++ b/crypto/openssl/crypto/evp/bio_md.c @@ -153,8 +153,12 @@ static int md_write(BIO *b, const char *in, int inl) { if (ret > 0) { - EVP_DigestUpdate(ctx,(const unsigned char *)in, - (unsigned int)ret); + if (!EVP_DigestUpdate(ctx,(const unsigned char *)in, + (unsigned int)ret)) + { + BIO_clear_retry_flags(b); + return 0; + } } } if(b->next_bio != NULL) @@ -220,7 +224,8 @@ static long md_ctrl(BIO *b, int cmd, long num, void *ptr) case BIO_CTRL_DUP: dbio=ptr; dctx=dbio->ptr; - EVP_MD_CTX_copy_ex(dctx,ctx); + if (!EVP_MD_CTX_copy_ex(dctx,ctx)) + return 0; b->init=1; break; default: diff --git a/crypto/openssl/crypto/evp/bio_ok.c b/crypto/openssl/crypto/evp/bio_ok.c index 98bc1ab409..e64335353f 100644 --- a/crypto/openssl/crypto/evp/bio_ok.c +++ b/crypto/openssl/crypto/evp/bio_ok.c @@ -133,10 +133,10 @@ static int ok_new(BIO *h); static int ok_free(BIO *data); static long ok_callback_ctrl(BIO *h, int cmd, bio_info_cb *fp); -static void sig_out(BIO* b); -static void sig_in(BIO* b); -static void block_out(BIO* b); -static void block_in(BIO* b); +static int sig_out(BIO* b); +static int sig_in(BIO* b); +static int block_out(BIO* b); +static int block_in(BIO* b); #define OK_BLOCK_SIZE (1024*4) #define OK_BLOCK_BLOCK 4 #define IOBS (OK_BLOCK_SIZE+ OK_BLOCK_BLOCK+ 3*EVP_MAX_MD_SIZE) @@ -266,10 +266,24 @@ static int ok_read(BIO *b, char *out, int outl) ctx->buf_len+= i; /* no signature yet -- check if we got one */ - if (ctx->sigio == 1) sig_in(b); + if (ctx->sigio == 1) + { + if (!sig_in(b)) + { + BIO_clear_retry_flags(b); + return 0; + } + } /* signature ok -- check if we got block */ - if (ctx->sigio == 0) block_in(b); + if (ctx->sigio == 0) + { + if (!block_in(b)) + { + BIO_clear_retry_flags(b); + return 0; + } + } /* invalid block -- cancel */ if (ctx->cont <= 0) break; @@ -293,7 +307,8 @@ static int ok_write(BIO *b, const char *in, int inl) if ((ctx == NULL) || (b->next_bio == NULL) || (b->init == 0)) return(0); - if(ctx->sigio) sig_out(b); + if(ctx->sigio && !sig_out(b)) + return 0; do{ BIO_clear_retry_flags(b); @@ -332,7 +347,11 @@ static int ok_write(BIO *b, const char *in, int inl) if(ctx->buf_len >= OK_BLOCK_SIZE+ OK_BLOCK_BLOCK) { - block_out(b); + if (!block_out(b)) + { + BIO_clear_retry_flags(b); + return 0; + } } }while(inl > 0); @@ -379,7 +398,8 @@ static long ok_ctrl(BIO *b, int cmd, long num, void *ptr) case BIO_CTRL_FLUSH: /* do a final write */ if(ctx->blockout == 0) - block_out(b); + if (!block_out(b)) + return 0; while (ctx->blockout) { @@ -408,7 +428,8 @@ static long ok_ctrl(BIO *b, int cmd, long num, void *ptr) break; case BIO_C_SET_MD: md=ptr; - EVP_DigestInit_ex(&ctx->md, md, NULL); + if (!EVP_DigestInit_ex(&ctx->md, md, NULL)) + return 0; b->init=1; break; case BIO_C_GET_MD: @@ -455,7 +476,7 @@ static void longswap(void *_ptr, size_t len) } } -static void sig_out(BIO* b) +static int sig_out(BIO* b) { BIO_OK_CTX *ctx; EVP_MD_CTX *md; @@ -463,9 +484,10 @@ static void sig_out(BIO* b) ctx=b->ptr; md=&ctx->md; - if(ctx->buf_len+ 2* md->digest->md_size > OK_BLOCK_SIZE) return; + if(ctx->buf_len+ 2* md->digest->md_size > OK_BLOCK_SIZE) return 1; - EVP_DigestInit_ex(md, md->digest, NULL); + if (!EVP_DigestInit_ex(md, md->digest, NULL)) + goto berr; /* FIXME: there's absolutely no guarantee this makes any sense at all, * particularly now EVP_MD_CTX has been restructured. */ @@ -474,14 +496,20 @@ static void sig_out(BIO* b) longswap(&(ctx->buf[ctx->buf_len]), md->digest->md_size); ctx->buf_len+= md->digest->md_size; - EVP_DigestUpdate(md, WELLKNOWN, strlen(WELLKNOWN)); - EVP_DigestFinal_ex(md, &(ctx->buf[ctx->buf_len]), NULL); + if (!EVP_DigestUpdate(md, WELLKNOWN, strlen(WELLKNOWN))) + goto berr; + if (!EVP_DigestFinal_ex(md, &(ctx->buf[ctx->buf_len]), NULL)) + goto berr; ctx->buf_len+= md->digest->md_size; ctx->blockout= 1; ctx->sigio= 0; + return 1; + berr: + BIO_clear_retry_flags(b); + return 0; } -static void sig_in(BIO* b) +static int sig_in(BIO* b) { BIO_OK_CTX *ctx; EVP_MD_CTX *md; @@ -491,15 +519,18 @@ static void sig_in(BIO* b) ctx=b->ptr; md=&ctx->md; - if((int)(ctx->buf_len-ctx->buf_off) < 2*md->digest->md_size) return; + if((int)(ctx->buf_len-ctx->buf_off) < 2*md->digest->md_size) return 1; - EVP_DigestInit_ex(md, md->digest, NULL); + if (!EVP_DigestInit_ex(md, md->digest, NULL)) + goto berr; memcpy(md->md_data, &(ctx->buf[ctx->buf_off]), md->digest->md_size); longswap(md->md_data, md->digest->md_size); ctx->buf_off+= md->digest->md_size; - EVP_DigestUpdate(md, WELLKNOWN, strlen(WELLKNOWN)); - EVP_DigestFinal_ex(md, tmp, NULL); + if (!EVP_DigestUpdate(md, WELLKNOWN, strlen(WELLKNOWN))) + goto berr; + if (!EVP_DigestFinal_ex(md, tmp, NULL)) + goto berr; ret= memcmp(&(ctx->buf[ctx->buf_off]), tmp, md->digest->md_size) == 0; ctx->buf_off+= md->digest->md_size; if(ret == 1) @@ -516,9 +547,13 @@ static void sig_in(BIO* b) { ctx->cont= 0; } + return 1; + berr: + BIO_clear_retry_flags(b); + return 0; } -static void block_out(BIO* b) +static int block_out(BIO* b) { BIO_OK_CTX *ctx; EVP_MD_CTX *md; @@ -532,13 +567,20 @@ static void block_out(BIO* b) ctx->buf[1]=(unsigned char)(tl>>16); ctx->buf[2]=(unsigned char)(tl>>8); ctx->buf[3]=(unsigned char)(tl); - EVP_DigestUpdate(md, (unsigned char*) &(ctx->buf[OK_BLOCK_BLOCK]), tl); - EVP_DigestFinal_ex(md, &(ctx->buf[ctx->buf_len]), NULL); + if (!EVP_DigestUpdate(md, + (unsigned char*) &(ctx->buf[OK_BLOCK_BLOCK]), tl)) + goto berr; + if (!EVP_DigestFinal_ex(md, &(ctx->buf[ctx->buf_len]), NULL)) + goto berr; ctx->buf_len+= md->digest->md_size; ctx->blockout= 1; + return 1; + berr: + BIO_clear_retry_flags(b); + return 0; } -static void block_in(BIO* b) +static int block_in(BIO* b) { BIO_OK_CTX *ctx; EVP_MD_CTX *md; @@ -554,10 +596,13 @@ static void block_in(BIO* b) tl|=ctx->buf[2]; tl<<=8; tl|=ctx->buf[3]; - if (ctx->buf_len < tl+ OK_BLOCK_BLOCK+ md->digest->md_size) return; + if (ctx->buf_len < tl+ OK_BLOCK_BLOCK+ md->digest->md_size) return 1; - EVP_DigestUpdate(md, (unsigned char*) &(ctx->buf[OK_BLOCK_BLOCK]), tl); - EVP_DigestFinal_ex(md, tmp, NULL); + if (!EVP_DigestUpdate(md, + (unsigned char*) &(ctx->buf[OK_BLOCK_BLOCK]), tl)) + goto berr; + if (!EVP_DigestFinal_ex(md, tmp, NULL)) + goto berr; if(memcmp(&(ctx->buf[tl+ OK_BLOCK_BLOCK]), tmp, md->digest->md_size) == 0) { /* there might be parts from next block lurking around ! */ @@ -571,5 +616,9 @@ static void block_in(BIO* b) { ctx->cont= 0; } + return 1; + berr: + BIO_clear_retry_flags(b); + return 0; } diff --git a/crypto/openssl/crypto/evp/c_allc.c b/crypto/openssl/crypto/evp/c_allc.c index c5f9268378..2a45d435e5 100644 --- a/crypto/openssl/crypto/evp/c_allc.c +++ b/crypto/openssl/crypto/evp/c_allc.c @@ -98,6 +98,9 @@ void OpenSSL_add_all_ciphers(void) #ifndef OPENSSL_NO_RC4 EVP_add_cipher(EVP_rc4()); EVP_add_cipher(EVP_rc4_40()); +#ifndef OPENSSL_NO_MD5 + EVP_add_cipher(EVP_rc4_hmac_md5()); +#endif #endif #ifndef OPENSSL_NO_IDEA @@ -166,9 +169,9 @@ void OpenSSL_add_all_ciphers(void) EVP_add_cipher(EVP_aes_128_cfb1()); EVP_add_cipher(EVP_aes_128_cfb8()); EVP_add_cipher(EVP_aes_128_ofb()); -#if 0 EVP_add_cipher(EVP_aes_128_ctr()); -#endif + EVP_add_cipher(EVP_aes_128_gcm()); + EVP_add_cipher(EVP_aes_128_xts()); EVP_add_cipher_alias(SN_aes_128_cbc,"AES128"); EVP_add_cipher_alias(SN_aes_128_cbc,"aes128"); EVP_add_cipher(EVP_aes_192_ecb()); @@ -177,9 +180,8 @@ void OpenSSL_add_all_ciphers(void) EVP_add_cipher(EVP_aes_192_cfb1()); EVP_add_cipher(EVP_aes_192_cfb8()); EVP_add_cipher(EVP_aes_192_ofb()); -#if 0 EVP_add_cipher(EVP_aes_192_ctr()); -#endif + EVP_add_cipher(EVP_aes_192_gcm()); EVP_add_cipher_alias(SN_aes_192_cbc,"AES192"); EVP_add_cipher_alias(SN_aes_192_cbc,"aes192"); EVP_add_cipher(EVP_aes_256_ecb()); @@ -188,11 +190,15 @@ void OpenSSL_add_all_ciphers(void) EVP_add_cipher(EVP_aes_256_cfb1()); EVP_add_cipher(EVP_aes_256_cfb8()); EVP_add_cipher(EVP_aes_256_ofb()); -#if 0 EVP_add_cipher(EVP_aes_256_ctr()); -#endif + EVP_add_cipher(EVP_aes_256_gcm()); + EVP_add_cipher(EVP_aes_256_xts()); EVP_add_cipher_alias(SN_aes_256_cbc,"AES256"); EVP_add_cipher_alias(SN_aes_256_cbc,"aes256"); +#if !defined(OPENSSL_NO_SHA) && !defined(OPENSSL_NO_SHA1) + EVP_add_cipher(EVP_aes_128_cbc_hmac_sha1()); + EVP_add_cipher(EVP_aes_256_cbc_hmac_sha1()); +#endif #endif #ifndef OPENSSL_NO_CAMELLIA diff --git a/crypto/openssl/crypto/evp/digest.c b/crypto/openssl/crypto/evp/digest.c index 982ba2b136..467e6b5ae9 100644 --- a/crypto/openssl/crypto/evp/digest.c +++ b/crypto/openssl/crypto/evp/digest.c @@ -117,6 +117,10 @@ #include #endif +#ifdef OPENSSL_FIPS +#include +#endif + void EVP_MD_CTX_init(EVP_MD_CTX *ctx) { memset(ctx,'\0',sizeof *ctx); @@ -225,12 +229,26 @@ skip_to_init: } if (ctx->flags & EVP_MD_CTX_FLAG_NO_INIT) return 1; +#ifdef OPENSSL_FIPS + if (FIPS_mode()) + { + if (FIPS_digestinit(ctx, type)) + return 1; + OPENSSL_free(ctx->md_data); + ctx->md_data = NULL; + return 0; + } +#endif return ctx->digest->init(ctx); } int EVP_DigestUpdate(EVP_MD_CTX *ctx, const void *data, size_t count) { +#ifdef OPENSSL_FIPS + return FIPS_digestupdate(ctx, data, count); +#else return ctx->update(ctx,data,count); +#endif } /* The caller can assume that this removes any secret data from the context */ @@ -245,8 +263,10 @@ int EVP_DigestFinal(EVP_MD_CTX *ctx, unsigned char *md, unsigned int *size) /* The caller can assume that this removes any secret data from the context */ int EVP_DigestFinal_ex(EVP_MD_CTX *ctx, unsigned char *md, unsigned int *size) { +#ifdef OPENSSL_FIPS + return FIPS_digestfinal(ctx, md, size); +#else int ret; - OPENSSL_assert(ctx->digest->md_size <= EVP_MAX_MD_SIZE); ret=ctx->digest->final(ctx,md); if (size != NULL) @@ -258,6 +278,7 @@ int EVP_DigestFinal_ex(EVP_MD_CTX *ctx, unsigned char *md, unsigned int *size) } memset(ctx->md_data,0,ctx->digest->ctx_size); return ret; +#endif } int EVP_MD_CTX_copy(EVP_MD_CTX *out, const EVP_MD_CTX *in) @@ -351,6 +372,7 @@ void EVP_MD_CTX_destroy(EVP_MD_CTX *ctx) /* This call frees resources associated with the context */ int EVP_MD_CTX_cleanup(EVP_MD_CTX *ctx) { +#ifndef OPENSSL_FIPS /* Don't assume ctx->md_data was cleaned in EVP_Digest_Final, * because sometimes only copies of the context are ever finalised. */ @@ -363,6 +385,7 @@ int EVP_MD_CTX_cleanup(EVP_MD_CTX *ctx) OPENSSL_cleanse(ctx->md_data,ctx->digest->ctx_size); OPENSSL_free(ctx->md_data); } +#endif if (ctx->pctx) EVP_PKEY_CTX_free(ctx->pctx); #ifndef OPENSSL_NO_ENGINE @@ -370,6 +393,9 @@ int EVP_MD_CTX_cleanup(EVP_MD_CTX *ctx) /* The EVP_MD we used belongs to an ENGINE, release the * functional reference we held for this reason. */ ENGINE_finish(ctx->engine); +#endif +#ifdef OPENSSL_FIPS + FIPS_md_ctx_cleanup(ctx); #endif memset(ctx,'\0',sizeof *ctx); diff --git a/crypto/openssl/crypto/evp/e_aes.c b/crypto/openssl/crypto/evp/e_aes.c index bd6c0a3a62..1e4af0cb75 100644 --- a/crypto/openssl/crypto/evp/e_aes.c +++ b/crypto/openssl/crypto/evp/e_aes.c @@ -1,5 +1,5 @@ /* ==================================================================== - * Copyright (c) 2001 The OpenSSL Project. All rights reserved. + * Copyright (c) 2001-2011 The OpenSSL Project. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions @@ -56,57 +56,511 @@ #include #include #include "evp_locl.h" - -static int aes_init_key(EVP_CIPHER_CTX *ctx, const unsigned char *key, - const unsigned char *iv, int enc); +#ifndef OPENSSL_FIPS +#include "modes_lcl.h" +#include typedef struct { AES_KEY ks; + block128_f block; + union { + cbc128_f cbc; + ctr128_f ctr; + } stream; } EVP_AES_KEY; -#define data(ctx) EVP_C_DATA(EVP_AES_KEY,ctx) - -IMPLEMENT_BLOCK_CIPHER(aes_128, ks, AES, EVP_AES_KEY, - NID_aes_128, 16, 16, 16, 128, - 0, aes_init_key, NULL, - EVP_CIPHER_set_asn1_iv, - EVP_CIPHER_get_asn1_iv, - NULL) -IMPLEMENT_BLOCK_CIPHER(aes_192, ks, AES, EVP_AES_KEY, - NID_aes_192, 16, 24, 16, 128, - 0, aes_init_key, NULL, - EVP_CIPHER_set_asn1_iv, - EVP_CIPHER_get_asn1_iv, - NULL) -IMPLEMENT_BLOCK_CIPHER(aes_256, ks, AES, EVP_AES_KEY, - NID_aes_256, 16, 32, 16, 128, - 0, aes_init_key, NULL, - EVP_CIPHER_set_asn1_iv, - EVP_CIPHER_get_asn1_iv, - NULL) - -#define IMPLEMENT_AES_CFBR(ksize,cbits) IMPLEMENT_CFBR(aes,AES,EVP_AES_KEY,ks,ksize,cbits,16) - -IMPLEMENT_AES_CFBR(128,1) -IMPLEMENT_AES_CFBR(192,1) -IMPLEMENT_AES_CFBR(256,1) - -IMPLEMENT_AES_CFBR(128,8) -IMPLEMENT_AES_CFBR(192,8) -IMPLEMENT_AES_CFBR(256,8) +typedef struct + { + AES_KEY ks; /* AES key schedule to use */ + int key_set; /* Set if key initialised */ + int iv_set; /* Set if an iv is set */ + GCM128_CONTEXT gcm; + unsigned char *iv; /* Temporary IV store */ + int ivlen; /* IV length */ + int taglen; + int iv_gen; /* It is OK to generate IVs */ + int tls_aad_len; /* TLS AAD length */ + ctr128_f ctr; + } EVP_AES_GCM_CTX; + +typedef struct + { + AES_KEY ks1, ks2; /* AES key schedules to use */ + XTS128_CONTEXT xts; + void (*stream)(const unsigned char *in, + unsigned char *out, size_t length, + const AES_KEY *key1, const AES_KEY *key2, + const unsigned char iv[16]); + } EVP_AES_XTS_CTX; + +typedef struct + { + AES_KEY ks; /* AES key schedule to use */ + int key_set; /* Set if key initialised */ + int iv_set; /* Set if an iv is set */ + int tag_set; /* Set if tag is valid */ + int len_set; /* Set if message length set */ + int L, M; /* L and M parameters from RFC3610 */ + CCM128_CONTEXT ccm; + ccm128_f str; + } EVP_AES_CCM_CTX; + +#define MAXBITCHUNK ((size_t)1<<(sizeof(size_t)*8-4)) + +#ifdef VPAES_ASM +int vpaes_set_encrypt_key(const unsigned char *userKey, int bits, + AES_KEY *key); +int vpaes_set_decrypt_key(const unsigned char *userKey, int bits, + AES_KEY *key); + +void vpaes_encrypt(const unsigned char *in, unsigned char *out, + const AES_KEY *key); +void vpaes_decrypt(const unsigned char *in, unsigned char *out, + const AES_KEY *key); + +void vpaes_cbc_encrypt(const unsigned char *in, + unsigned char *out, + size_t length, + const AES_KEY *key, + unsigned char *ivec, int enc); +#endif +#ifdef BSAES_ASM +void bsaes_cbc_encrypt(const unsigned char *in, unsigned char *out, + size_t length, const AES_KEY *key, + unsigned char ivec[16], int enc); +void bsaes_ctr32_encrypt_blocks(const unsigned char *in, unsigned char *out, + size_t len, const AES_KEY *key, + const unsigned char ivec[16]); +void bsaes_xts_encrypt(const unsigned char *inp, unsigned char *out, + size_t len, const AES_KEY *key1, + const AES_KEY *key2, const unsigned char iv[16]); +void bsaes_xts_decrypt(const unsigned char *inp, unsigned char *out, + size_t len, const AES_KEY *key1, + const AES_KEY *key2, const unsigned char iv[16]); +#endif +#ifdef AES_CTR_ASM +void AES_ctr32_encrypt(const unsigned char *in, unsigned char *out, + size_t blocks, const AES_KEY *key, + const unsigned char ivec[AES_BLOCK_SIZE]); +#endif +#ifdef AES_XTS_ASM +void AES_xts_encrypt(const char *inp,char *out,size_t len, + const AES_KEY *key1, const AES_KEY *key2, + const unsigned char iv[16]); +void AES_xts_decrypt(const char *inp,char *out,size_t len, + const AES_KEY *key1, const AES_KEY *key2, + const unsigned char iv[16]); +#endif + +#if defined(AES_ASM) && !defined(I386_ONLY) && ( \ + ((defined(__i386) || defined(__i386__) || \ + defined(_M_IX86)) && defined(OPENSSL_IA32_SSE2))|| \ + defined(__x86_64) || defined(__x86_64__) || \ + defined(_M_AMD64) || defined(_M_X64) || \ + defined(__INTEL__) ) + +extern unsigned int OPENSSL_ia32cap_P[2]; + +#ifdef VPAES_ASM +#define VPAES_CAPABLE (OPENSSL_ia32cap_P[1]&(1<<(41-32))) +#endif +#ifdef BSAES_ASM +#define BSAES_CAPABLE VPAES_CAPABLE +#endif +/* + * AES-NI section + */ +#define AESNI_CAPABLE (OPENSSL_ia32cap_P[1]&(1<<(57-32))) + +int aesni_set_encrypt_key(const unsigned char *userKey, int bits, + AES_KEY *key); +int aesni_set_decrypt_key(const unsigned char *userKey, int bits, + AES_KEY *key); + +void aesni_encrypt(const unsigned char *in, unsigned char *out, + const AES_KEY *key); +void aesni_decrypt(const unsigned char *in, unsigned char *out, + const AES_KEY *key); + +void aesni_ecb_encrypt(const unsigned char *in, + unsigned char *out, + size_t length, + const AES_KEY *key, + int enc); +void aesni_cbc_encrypt(const unsigned char *in, + unsigned char *out, + size_t length, + const AES_KEY *key, + unsigned char *ivec, int enc); + +void aesni_ctr32_encrypt_blocks(const unsigned char *in, + unsigned char *out, + size_t blocks, + const void *key, + const unsigned char *ivec); + +void aesni_xts_encrypt(const unsigned char *in, + unsigned char *out, + size_t length, + const AES_KEY *key1, const AES_KEY *key2, + const unsigned char iv[16]); + +void aesni_xts_decrypt(const unsigned char *in, + unsigned char *out, + size_t length, + const AES_KEY *key1, const AES_KEY *key2, + const unsigned char iv[16]); + +void aesni_ccm64_encrypt_blocks (const unsigned char *in, + unsigned char *out, + size_t blocks, + const void *key, + const unsigned char ivec[16], + unsigned char cmac[16]); + +void aesni_ccm64_decrypt_blocks (const unsigned char *in, + unsigned char *out, + size_t blocks, + const void *key, + const unsigned char ivec[16], + unsigned char cmac[16]); + +static int aesni_init_key(EVP_CIPHER_CTX *ctx, const unsigned char *key, + const unsigned char *iv, int enc) + { + int ret, mode; + EVP_AES_KEY *dat = (EVP_AES_KEY *)ctx->cipher_data; + + mode = ctx->cipher->flags & EVP_CIPH_MODE; + if ((mode == EVP_CIPH_ECB_MODE || mode == EVP_CIPH_CBC_MODE) + && !enc) + { + ret = aesni_set_decrypt_key(key, ctx->key_len*8, ctx->cipher_data); + dat->block = (block128_f)aesni_decrypt; + dat->stream.cbc = mode==EVP_CIPH_CBC_MODE ? + (cbc128_f)aesni_cbc_encrypt : + NULL; + } + else { + ret = aesni_set_encrypt_key(key, ctx->key_len*8, ctx->cipher_data); + dat->block = (block128_f)aesni_encrypt; + if (mode==EVP_CIPH_CBC_MODE) + dat->stream.cbc = (cbc128_f)aesni_cbc_encrypt; + else if (mode==EVP_CIPH_CTR_MODE) + dat->stream.ctr = (ctr128_f)aesni_ctr32_encrypt_blocks; + else + dat->stream.cbc = NULL; + } + + if(ret < 0) + { + EVPerr(EVP_F_AESNI_INIT_KEY,EVP_R_AES_KEY_SETUP_FAILED); + return 0; + } + + return 1; + } + +static int aesni_cbc_cipher(EVP_CIPHER_CTX *ctx,unsigned char *out, + const unsigned char *in, size_t len) +{ + aesni_cbc_encrypt(in,out,len,ctx->cipher_data,ctx->iv,ctx->encrypt); + + return 1; +} + +static int aesni_ecb_cipher(EVP_CIPHER_CTX *ctx,unsigned char *out, + const unsigned char *in, size_t len) +{ + size_t bl = ctx->cipher->block_size; + + if (lencipher_data,ctx->encrypt); + + return 1; +} + +#define aesni_ofb_cipher aes_ofb_cipher +static int aesni_ofb_cipher(EVP_CIPHER_CTX *ctx,unsigned char *out, + const unsigned char *in,size_t len); + +#define aesni_cfb_cipher aes_cfb_cipher +static int aesni_cfb_cipher(EVP_CIPHER_CTX *ctx,unsigned char *out, + const unsigned char *in,size_t len); + +#define aesni_cfb8_cipher aes_cfb8_cipher +static int aesni_cfb8_cipher(EVP_CIPHER_CTX *ctx,unsigned char *out, + const unsigned char *in,size_t len); + +#define aesni_cfb1_cipher aes_cfb1_cipher +static int aesni_cfb1_cipher(EVP_CIPHER_CTX *ctx,unsigned char *out, + const unsigned char *in,size_t len); + +#define aesni_ctr_cipher aes_ctr_cipher +static int aesni_ctr_cipher(EVP_CIPHER_CTX *ctx, unsigned char *out, + const unsigned char *in, size_t len); + +static int aesni_gcm_init_key(EVP_CIPHER_CTX *ctx, const unsigned char *key, + const unsigned char *iv, int enc) + { + EVP_AES_GCM_CTX *gctx = ctx->cipher_data; + if (!iv && !key) + return 1; + if (key) + { + aesni_set_encrypt_key(key, ctx->key_len * 8, &gctx->ks); + CRYPTO_gcm128_init(&gctx->gcm, &gctx->ks, + (block128_f)aesni_encrypt); + gctx->ctr = (ctr128_f)aesni_ctr32_encrypt_blocks; + /* If we have an iv can set it directly, otherwise use + * saved IV. + */ + if (iv == NULL && gctx->iv_set) + iv = gctx->iv; + if (iv) + { + CRYPTO_gcm128_setiv(&gctx->gcm, iv, gctx->ivlen); + gctx->iv_set = 1; + } + gctx->key_set = 1; + } + else + { + /* If key set use IV, otherwise copy */ + if (gctx->key_set) + CRYPTO_gcm128_setiv(&gctx->gcm, iv, gctx->ivlen); + else + memcpy(gctx->iv, iv, gctx->ivlen); + gctx->iv_set = 1; + gctx->iv_gen = 0; + } + return 1; + } + +#define aesni_gcm_cipher aes_gcm_cipher +static int aesni_gcm_cipher(EVP_CIPHER_CTX *ctx, unsigned char *out, + const unsigned char *in, size_t len); + +static int aesni_xts_init_key(EVP_CIPHER_CTX *ctx, const unsigned char *key, + const unsigned char *iv, int enc) + { + EVP_AES_XTS_CTX *xctx = ctx->cipher_data; + if (!iv && !key) + return 1; + + if (key) + { + /* key_len is two AES keys */ + if (enc) + { + aesni_set_encrypt_key(key, ctx->key_len * 4, &xctx->ks1); + xctx->xts.block1 = (block128_f)aesni_encrypt; + xctx->stream = aesni_xts_encrypt; + } + else + { + aesni_set_decrypt_key(key, ctx->key_len * 4, &xctx->ks1); + xctx->xts.block1 = (block128_f)aesni_decrypt; + xctx->stream = aesni_xts_decrypt; + } + + aesni_set_encrypt_key(key + ctx->key_len/2, + ctx->key_len * 4, &xctx->ks2); + xctx->xts.block2 = (block128_f)aesni_encrypt; + + xctx->xts.key1 = &xctx->ks1; + } + + if (iv) + { + xctx->xts.key2 = &xctx->ks2; + memcpy(ctx->iv, iv, 16); + } + + return 1; + } + +#define aesni_xts_cipher aes_xts_cipher +static int aesni_xts_cipher(EVP_CIPHER_CTX *ctx, unsigned char *out, + const unsigned char *in, size_t len); + +static int aesni_ccm_init_key(EVP_CIPHER_CTX *ctx, const unsigned char *key, + const unsigned char *iv, int enc) + { + EVP_AES_CCM_CTX *cctx = ctx->cipher_data; + if (!iv && !key) + return 1; + if (key) + { + aesni_set_encrypt_key(key, ctx->key_len * 8, &cctx->ks); + CRYPTO_ccm128_init(&cctx->ccm, cctx->M, cctx->L, + &cctx->ks, (block128_f)aesni_encrypt); + cctx->str = enc?(ccm128_f)aesni_ccm64_encrypt_blocks : + (ccm128_f)aesni_ccm64_decrypt_blocks; + cctx->key_set = 1; + } + if (iv) + { + memcpy(ctx->iv, iv, 15 - cctx->L); + cctx->iv_set = 1; + } + return 1; + } + +#define aesni_ccm_cipher aes_ccm_cipher +static int aesni_ccm_cipher(EVP_CIPHER_CTX *ctx, unsigned char *out, + const unsigned char *in, size_t len); + +#define BLOCK_CIPHER_generic(nid,keylen,blocksize,ivlen,nmode,mode,MODE,flags) \ +static const EVP_CIPHER aesni_##keylen##_##mode = { \ + nid##_##keylen##_##nmode,blocksize,keylen/8,ivlen, \ + flags|EVP_CIPH_##MODE##_MODE, \ + aesni_init_key, \ + aesni_##mode##_cipher, \ + NULL, \ + sizeof(EVP_AES_KEY), \ + NULL,NULL,NULL,NULL }; \ +static const EVP_CIPHER aes_##keylen##_##mode = { \ + nid##_##keylen##_##nmode,blocksize, \ + keylen/8,ivlen, \ + flags|EVP_CIPH_##MODE##_MODE, \ + aes_init_key, \ + aes_##mode##_cipher, \ + NULL, \ + sizeof(EVP_AES_KEY), \ + NULL,NULL,NULL,NULL }; \ +const EVP_CIPHER *EVP_aes_##keylen##_##mode(void) \ +{ return AESNI_CAPABLE?&aesni_##keylen##_##mode:&aes_##keylen##_##mode; } + +#define BLOCK_CIPHER_custom(nid,keylen,blocksize,ivlen,mode,MODE,flags) \ +static const EVP_CIPHER aesni_##keylen##_##mode = { \ + nid##_##keylen##_##mode,blocksize, \ + (EVP_CIPH_##MODE##_MODE==EVP_CIPH_XTS_MODE?2:1)*keylen/8, ivlen, \ + flags|EVP_CIPH_##MODE##_MODE, \ + aesni_##mode##_init_key, \ + aesni_##mode##_cipher, \ + aes_##mode##_cleanup, \ + sizeof(EVP_AES_##MODE##_CTX), \ + NULL,NULL,aes_##mode##_ctrl,NULL }; \ +static const EVP_CIPHER aes_##keylen##_##mode = { \ + nid##_##keylen##_##mode,blocksize, \ + (EVP_CIPH_##MODE##_MODE==EVP_CIPH_XTS_MODE?2:1)*keylen/8, ivlen, \ + flags|EVP_CIPH_##MODE##_MODE, \ + aes_##mode##_init_key, \ + aes_##mode##_cipher, \ + aes_##mode##_cleanup, \ + sizeof(EVP_AES_##MODE##_CTX), \ + NULL,NULL,aes_##mode##_ctrl,NULL }; \ +const EVP_CIPHER *EVP_aes_##keylen##_##mode(void) \ +{ return AESNI_CAPABLE?&aesni_##keylen##_##mode:&aes_##keylen##_##mode; } + +#else + +#define BLOCK_CIPHER_generic(nid,keylen,blocksize,ivlen,nmode,mode,MODE,flags) \ +static const EVP_CIPHER aes_##keylen##_##mode = { \ + nid##_##keylen##_##nmode,blocksize,keylen/8,ivlen, \ + flags|EVP_CIPH_##MODE##_MODE, \ + aes_init_key, \ + aes_##mode##_cipher, \ + NULL, \ + sizeof(EVP_AES_KEY), \ + NULL,NULL,NULL,NULL }; \ +const EVP_CIPHER *EVP_aes_##keylen##_##mode(void) \ +{ return &aes_##keylen##_##mode; } + +#define BLOCK_CIPHER_custom(nid,keylen,blocksize,ivlen,mode,MODE,flags) \ +static const EVP_CIPHER aes_##keylen##_##mode = { \ + nid##_##keylen##_##mode,blocksize, \ + (EVP_CIPH_##MODE##_MODE==EVP_CIPH_XTS_MODE?2:1)*keylen/8, ivlen, \ + flags|EVP_CIPH_##MODE##_MODE, \ + aes_##mode##_init_key, \ + aes_##mode##_cipher, \ + aes_##mode##_cleanup, \ + sizeof(EVP_AES_##MODE##_CTX), \ + NULL,NULL,aes_##mode##_ctrl,NULL }; \ +const EVP_CIPHER *EVP_aes_##keylen##_##mode(void) \ +{ return &aes_##keylen##_##mode; } +#endif + +#define BLOCK_CIPHER_generic_pack(nid,keylen,flags) \ + BLOCK_CIPHER_generic(nid,keylen,16,16,cbc,cbc,CBC,flags|EVP_CIPH_FLAG_DEFAULT_ASN1) \ + BLOCK_CIPHER_generic(nid,keylen,16,0,ecb,ecb,ECB,flags|EVP_CIPH_FLAG_DEFAULT_ASN1) \ + BLOCK_CIPHER_generic(nid,keylen,1,16,ofb128,ofb,OFB,flags|EVP_CIPH_FLAG_DEFAULT_ASN1) \ + BLOCK_CIPHER_generic(nid,keylen,1,16,cfb128,cfb,CFB,flags|EVP_CIPH_FLAG_DEFAULT_ASN1) \ + BLOCK_CIPHER_generic(nid,keylen,1,16,cfb1,cfb1,CFB,flags) \ + BLOCK_CIPHER_generic(nid,keylen,1,16,cfb8,cfb8,CFB,flags) \ + BLOCK_CIPHER_generic(nid,keylen,1,16,ctr,ctr,CTR,flags) static int aes_init_key(EVP_CIPHER_CTX *ctx, const unsigned char *key, const unsigned char *iv, int enc) { - int ret; + int ret, mode; + EVP_AES_KEY *dat = (EVP_AES_KEY *)ctx->cipher_data; - if ((ctx->cipher->flags & EVP_CIPH_MODE) == EVP_CIPH_CFB_MODE - || (ctx->cipher->flags & EVP_CIPH_MODE) == EVP_CIPH_OFB_MODE - || enc) - ret=AES_set_encrypt_key(key, ctx->key_len * 8, ctx->cipher_data); + mode = ctx->cipher->flags & EVP_CIPH_MODE; + if ((mode == EVP_CIPH_ECB_MODE || mode == EVP_CIPH_CBC_MODE) + && !enc) +#ifdef BSAES_CAPABLE + if (BSAES_CAPABLE && mode==EVP_CIPH_CBC_MODE) + { + ret = AES_set_decrypt_key(key,ctx->key_len*8,&dat->ks); + dat->block = (block128_f)AES_decrypt; + dat->stream.cbc = (cbc128_f)bsaes_cbc_encrypt; + } + else +#endif +#ifdef VPAES_CAPABLE + if (VPAES_CAPABLE) + { + ret = vpaes_set_decrypt_key(key,ctx->key_len*8,&dat->ks); + dat->block = (block128_f)vpaes_decrypt; + dat->stream.cbc = mode==EVP_CIPH_CBC_MODE ? + (cbc128_f)vpaes_cbc_encrypt : + NULL; + } + else +#endif + { + ret = AES_set_decrypt_key(key,ctx->key_len*8,&dat->ks); + dat->block = (block128_f)AES_decrypt; + dat->stream.cbc = mode==EVP_CIPH_CBC_MODE ? + (cbc128_f)AES_cbc_encrypt : + NULL; + } else - ret=AES_set_decrypt_key(key, ctx->key_len * 8, ctx->cipher_data); +#ifdef BSAES_CAPABLE + if (BSAES_CAPABLE && mode==EVP_CIPH_CTR_MODE) + { + ret = AES_set_encrypt_key(key,ctx->key_len*8,&dat->ks); + dat->block = (block128_f)AES_encrypt; + dat->stream.ctr = (ctr128_f)bsaes_ctr32_encrypt_blocks; + } + else +#endif +#ifdef VPAES_CAPABLE + if (VPAES_CAPABLE) + { + ret = vpaes_set_encrypt_key(key,ctx->key_len*8,&dat->ks); + dat->block = (block128_f)vpaes_encrypt; + dat->stream.cbc = mode==EVP_CIPH_CBC_MODE ? + (cbc128_f)vpaes_cbc_encrypt : + NULL; + } + else +#endif + { + ret = AES_set_encrypt_key(key,ctx->key_len*8,&dat->ks); + dat->block = (block128_f)AES_encrypt; + dat->stream.cbc = mode==EVP_CIPH_CBC_MODE ? + (cbc128_f)AES_cbc_encrypt : + NULL; +#ifdef AES_CTR_ASM + if (mode==EVP_CIPH_CTR_MODE) + dat->stream.ctr = (ctr128_f)AES_ctr32_encrypt; +#endif + } if(ret < 0) { @@ -117,4 +571,743 @@ static int aes_init_key(EVP_CIPHER_CTX *ctx, const unsigned char *key, return 1; } +static int aes_cbc_cipher(EVP_CIPHER_CTX *ctx,unsigned char *out, + const unsigned char *in, size_t len) +{ + EVP_AES_KEY *dat = (EVP_AES_KEY *)ctx->cipher_data; + + if (dat->stream.cbc) + (*dat->stream.cbc)(in,out,len,&dat->ks,ctx->iv,ctx->encrypt); + else if (ctx->encrypt) + CRYPTO_cbc128_encrypt(in,out,len,&dat->ks,ctx->iv,dat->block); + else + CRYPTO_cbc128_encrypt(in,out,len,&dat->ks,ctx->iv,dat->block); + + return 1; +} + +static int aes_ecb_cipher(EVP_CIPHER_CTX *ctx,unsigned char *out, + const unsigned char *in, size_t len) +{ + size_t bl = ctx->cipher->block_size; + size_t i; + EVP_AES_KEY *dat = (EVP_AES_KEY *)ctx->cipher_data; + + if (lenblock)(in+i,out+i,&dat->ks); + + return 1; +} + +static int aes_ofb_cipher(EVP_CIPHER_CTX *ctx,unsigned char *out, + const unsigned char *in,size_t len) +{ + EVP_AES_KEY *dat = (EVP_AES_KEY *)ctx->cipher_data; + + CRYPTO_ofb128_encrypt(in,out,len,&dat->ks, + ctx->iv,&ctx->num,dat->block); + return 1; +} + +static int aes_cfb_cipher(EVP_CIPHER_CTX *ctx,unsigned char *out, + const unsigned char *in,size_t len) +{ + EVP_AES_KEY *dat = (EVP_AES_KEY *)ctx->cipher_data; + + CRYPTO_cfb128_encrypt(in,out,len,&dat->ks, + ctx->iv,&ctx->num,ctx->encrypt,dat->block); + return 1; +} + +static int aes_cfb8_cipher(EVP_CIPHER_CTX *ctx,unsigned char *out, + const unsigned char *in,size_t len) +{ + EVP_AES_KEY *dat = (EVP_AES_KEY *)ctx->cipher_data; + + CRYPTO_cfb128_8_encrypt(in,out,len,&dat->ks, + ctx->iv,&ctx->num,ctx->encrypt,dat->block); + return 1; +} + +static int aes_cfb1_cipher(EVP_CIPHER_CTX *ctx,unsigned char *out, + const unsigned char *in,size_t len) +{ + EVP_AES_KEY *dat = (EVP_AES_KEY *)ctx->cipher_data; + + if (ctx->flags&EVP_CIPH_FLAG_LENGTH_BITS) { + CRYPTO_cfb128_1_encrypt(in,out,len,&dat->ks, + ctx->iv,&ctx->num,ctx->encrypt,dat->block); + return 1; + } + + while (len>=MAXBITCHUNK) { + CRYPTO_cfb128_1_encrypt(in,out,MAXBITCHUNK*8,&dat->ks, + ctx->iv,&ctx->num,ctx->encrypt,dat->block); + len-=MAXBITCHUNK; + } + if (len) + CRYPTO_cfb128_1_encrypt(in,out,len*8,&dat->ks, + ctx->iv,&ctx->num,ctx->encrypt,dat->block); + + return 1; +} + +static int aes_ctr_cipher (EVP_CIPHER_CTX *ctx, unsigned char *out, + const unsigned char *in, size_t len) +{ + unsigned int num = ctx->num; + EVP_AES_KEY *dat = (EVP_AES_KEY *)ctx->cipher_data; + + if (dat->stream.ctr) + CRYPTO_ctr128_encrypt_ctr32(in,out,len,&dat->ks, + ctx->iv,ctx->buf,&num,dat->stream.ctr); + else + CRYPTO_ctr128_encrypt(in,out,len,&dat->ks, + ctx->iv,ctx->buf,&num,dat->block); + ctx->num = (size_t)num; + return 1; +} + +BLOCK_CIPHER_generic_pack(NID_aes,128,EVP_CIPH_FLAG_FIPS) +BLOCK_CIPHER_generic_pack(NID_aes,192,EVP_CIPH_FLAG_FIPS) +BLOCK_CIPHER_generic_pack(NID_aes,256,EVP_CIPH_FLAG_FIPS) + +static int aes_gcm_cleanup(EVP_CIPHER_CTX *c) + { + EVP_AES_GCM_CTX *gctx = c->cipher_data; + OPENSSL_cleanse(&gctx->gcm, sizeof(gctx->gcm)); + if (gctx->iv != c->iv) + OPENSSL_free(gctx->iv); + return 1; + } + +/* increment counter (64-bit int) by 1 */ +static void ctr64_inc(unsigned char *counter) { + int n=8; + unsigned char c; + + do { + --n; + c = counter[n]; + ++c; + counter[n] = c; + if (c) return; + } while (n); +} + +static int aes_gcm_ctrl(EVP_CIPHER_CTX *c, int type, int arg, void *ptr) + { + EVP_AES_GCM_CTX *gctx = c->cipher_data; + switch (type) + { + case EVP_CTRL_INIT: + gctx->key_set = 0; + gctx->iv_set = 0; + gctx->ivlen = c->cipher->iv_len; + gctx->iv = c->iv; + gctx->taglen = -1; + gctx->iv_gen = 0; + gctx->tls_aad_len = -1; + return 1; + + case EVP_CTRL_GCM_SET_IVLEN: + if (arg <= 0) + return 0; +#ifdef OPENSSL_FIPS + if (FIPS_module_mode() && !(c->flags & EVP_CIPH_FLAG_NON_FIPS_ALLOW) + && arg < 12) + return 0; +#endif + /* Allocate memory for IV if needed */ + if ((arg > EVP_MAX_IV_LENGTH) && (arg > gctx->ivlen)) + { + if (gctx->iv != c->iv) + OPENSSL_free(gctx->iv); + gctx->iv = OPENSSL_malloc(arg); + if (!gctx->iv) + return 0; + } + gctx->ivlen = arg; + return 1; + + case EVP_CTRL_GCM_SET_TAG: + if (arg <= 0 || arg > 16 || c->encrypt) + return 0; + memcpy(c->buf, ptr, arg); + gctx->taglen = arg; + return 1; + + case EVP_CTRL_GCM_GET_TAG: + if (arg <= 0 || arg > 16 || !c->encrypt || gctx->taglen < 0) + return 0; + memcpy(ptr, c->buf, arg); + return 1; + + case EVP_CTRL_GCM_SET_IV_FIXED: + /* Special case: -1 length restores whole IV */ + if (arg == -1) + { + memcpy(gctx->iv, ptr, gctx->ivlen); + gctx->iv_gen = 1; + return 1; + } + /* Fixed field must be at least 4 bytes and invocation field + * at least 8. + */ + if ((arg < 4) || (gctx->ivlen - arg) < 8) + return 0; + if (arg) + memcpy(gctx->iv, ptr, arg); + if (c->encrypt && + RAND_bytes(gctx->iv + arg, gctx->ivlen - arg) <= 0) + return 0; + gctx->iv_gen = 1; + return 1; + + case EVP_CTRL_GCM_IV_GEN: + if (gctx->iv_gen == 0 || gctx->key_set == 0) + return 0; + CRYPTO_gcm128_setiv(&gctx->gcm, gctx->iv, gctx->ivlen); + if (arg <= 0 || arg > gctx->ivlen) + arg = gctx->ivlen; + memcpy(ptr, gctx->iv + gctx->ivlen - arg, arg); + /* Invocation field will be at least 8 bytes in size and + * so no need to check wrap around or increment more than + * last 8 bytes. + */ + ctr64_inc(gctx->iv + gctx->ivlen - 8); + gctx->iv_set = 1; + return 1; + + case EVP_CTRL_GCM_SET_IV_INV: + if (gctx->iv_gen == 0 || gctx->key_set == 0 || c->encrypt) + return 0; + memcpy(gctx->iv + gctx->ivlen - arg, ptr, arg); + CRYPTO_gcm128_setiv(&gctx->gcm, gctx->iv, gctx->ivlen); + gctx->iv_set = 1; + return 1; + + case EVP_CTRL_AEAD_TLS1_AAD: + /* Save the AAD for later use */ + if (arg != 13) + return 0; + memcpy(c->buf, ptr, arg); + gctx->tls_aad_len = arg; + { + unsigned int len=c->buf[arg-2]<<8|c->buf[arg-1]; + /* Correct length for explicit IV */ + len -= EVP_GCM_TLS_EXPLICIT_IV_LEN; + /* If decrypting correct for tag too */ + if (!c->encrypt) + len -= EVP_GCM_TLS_TAG_LEN; + c->buf[arg-2] = len>>8; + c->buf[arg-1] = len & 0xff; + } + /* Extra padding: tag appended to record */ + return EVP_GCM_TLS_TAG_LEN; + + default: + return -1; + + } + } + +static int aes_gcm_init_key(EVP_CIPHER_CTX *ctx, const unsigned char *key, + const unsigned char *iv, int enc) + { + EVP_AES_GCM_CTX *gctx = ctx->cipher_data; + if (!iv && !key) + return 1; + if (key) + { do { +#ifdef BSAES_CAPABLE + if (BSAES_CAPABLE) + { + AES_set_encrypt_key(key,ctx->key_len*8,&gctx->ks); + CRYPTO_gcm128_init(&gctx->gcm,&gctx->ks, + (block128_f)AES_encrypt); + gctx->ctr = (ctr128_f)bsaes_ctr32_encrypt_blocks; + break; + } + else +#endif +#ifdef VPAES_CAPABLE + if (VPAES_CAPABLE) + { + vpaes_set_encrypt_key(key,ctx->key_len*8,&gctx->ks); + CRYPTO_gcm128_init(&gctx->gcm,&gctx->ks, + (block128_f)vpaes_encrypt); + gctx->ctr = NULL; + break; + } +#endif + AES_set_encrypt_key(key, ctx->key_len * 8, &gctx->ks); + CRYPTO_gcm128_init(&gctx->gcm, &gctx->ks, (block128_f)AES_encrypt); +#ifdef AES_CTR_ASM + gctx->ctr = (ctr128_f)AES_ctr32_encrypt; +#else + gctx->ctr = NULL; +#endif + } while (0); + + /* If we have an iv can set it directly, otherwise use + * saved IV. + */ + if (iv == NULL && gctx->iv_set) + iv = gctx->iv; + if (iv) + { + CRYPTO_gcm128_setiv(&gctx->gcm, iv, gctx->ivlen); + gctx->iv_set = 1; + } + gctx->key_set = 1; + } + else + { + /* If key set use IV, otherwise copy */ + if (gctx->key_set) + CRYPTO_gcm128_setiv(&gctx->gcm, iv, gctx->ivlen); + else + memcpy(gctx->iv, iv, gctx->ivlen); + gctx->iv_set = 1; + gctx->iv_gen = 0; + } + return 1; + } + +/* Handle TLS GCM packet format. This consists of the last portion of the IV + * followed by the payload and finally the tag. On encrypt generate IV, + * encrypt payload and write the tag. On verify retrieve IV, decrypt payload + * and verify tag. + */ + +static int aes_gcm_tls_cipher(EVP_CIPHER_CTX *ctx, unsigned char *out, + const unsigned char *in, size_t len) + { + EVP_AES_GCM_CTX *gctx = ctx->cipher_data; + int rv = -1; + /* Encrypt/decrypt must be performed in place */ + if (out != in || len < (EVP_GCM_TLS_EXPLICIT_IV_LEN+EVP_GCM_TLS_TAG_LEN)) + return -1; + /* Set IV from start of buffer or generate IV and write to start + * of buffer. + */ + if (EVP_CIPHER_CTX_ctrl(ctx, ctx->encrypt ? + EVP_CTRL_GCM_IV_GEN : EVP_CTRL_GCM_SET_IV_INV, + EVP_GCM_TLS_EXPLICIT_IV_LEN, out) <= 0) + goto err; + /* Use saved AAD */ + if (CRYPTO_gcm128_aad(&gctx->gcm, ctx->buf, gctx->tls_aad_len)) + goto err; + /* Fix buffer and length to point to payload */ + in += EVP_GCM_TLS_EXPLICIT_IV_LEN; + out += EVP_GCM_TLS_EXPLICIT_IV_LEN; + len -= EVP_GCM_TLS_EXPLICIT_IV_LEN + EVP_GCM_TLS_TAG_LEN; + if (ctx->encrypt) + { + /* Encrypt payload */ + if (gctx->ctr) + { + if (CRYPTO_gcm128_encrypt_ctr32(&gctx->gcm, + in, out, len, + gctx->ctr)) + goto err; + } + else { + if (CRYPTO_gcm128_encrypt(&gctx->gcm, in, out, len)) + goto err; + } + out += len; + /* Finally write tag */ + CRYPTO_gcm128_tag(&gctx->gcm, out, EVP_GCM_TLS_TAG_LEN); + rv = len + EVP_GCM_TLS_EXPLICIT_IV_LEN + EVP_GCM_TLS_TAG_LEN; + } + else + { + /* Decrypt */ + if (gctx->ctr) + { + if (CRYPTO_gcm128_decrypt_ctr32(&gctx->gcm, + in, out, len, + gctx->ctr)) + goto err; + } + else { + if (CRYPTO_gcm128_decrypt(&gctx->gcm, in, out, len)) + goto err; + } + /* Retrieve tag */ + CRYPTO_gcm128_tag(&gctx->gcm, ctx->buf, + EVP_GCM_TLS_TAG_LEN); + /* If tag mismatch wipe buffer */ + if (memcmp(ctx->buf, in + len, EVP_GCM_TLS_TAG_LEN)) + { + OPENSSL_cleanse(out, len); + goto err; + } + rv = len; + } + + err: + gctx->iv_set = 0; + gctx->tls_aad_len = -1; + return rv; + } + +static int aes_gcm_cipher(EVP_CIPHER_CTX *ctx, unsigned char *out, + const unsigned char *in, size_t len) + { + EVP_AES_GCM_CTX *gctx = ctx->cipher_data; + /* If not set up, return error */ + if (!gctx->key_set) + return -1; + + if (gctx->tls_aad_len >= 0) + return aes_gcm_tls_cipher(ctx, out, in, len); + + if (!gctx->iv_set) + return -1; + if (!ctx->encrypt && gctx->taglen < 0) + return -1; + if (in) + { + if (out == NULL) + { + if (CRYPTO_gcm128_aad(&gctx->gcm, in, len)) + return -1; + } + else if (ctx->encrypt) + { + if (gctx->ctr) + { + if (CRYPTO_gcm128_encrypt_ctr32(&gctx->gcm, + in, out, len, + gctx->ctr)) + return -1; + } + else { + if (CRYPTO_gcm128_encrypt(&gctx->gcm, in, out, len)) + return -1; + } + } + else + { + if (gctx->ctr) + { + if (CRYPTO_gcm128_decrypt_ctr32(&gctx->gcm, + in, out, len, + gctx->ctr)) + return -1; + } + else { + if (CRYPTO_gcm128_decrypt(&gctx->gcm, in, out, len)) + return -1; + } + } + return len; + } + else + { + if (!ctx->encrypt) + { + if (CRYPTO_gcm128_finish(&gctx->gcm, + ctx->buf, gctx->taglen) != 0) + return -1; + gctx->iv_set = 0; + return 0; + } + CRYPTO_gcm128_tag(&gctx->gcm, ctx->buf, 16); + gctx->taglen = 16; + /* Don't reuse the IV */ + gctx->iv_set = 0; + return 0; + } + + } + +#define CUSTOM_FLAGS (EVP_CIPH_FLAG_DEFAULT_ASN1 \ + | EVP_CIPH_CUSTOM_IV | EVP_CIPH_FLAG_CUSTOM_CIPHER \ + | EVP_CIPH_ALWAYS_CALL_INIT | EVP_CIPH_CTRL_INIT) + +BLOCK_CIPHER_custom(NID_aes,128,1,12,gcm,GCM, + EVP_CIPH_FLAG_FIPS|EVP_CIPH_FLAG_AEAD_CIPHER|CUSTOM_FLAGS) +BLOCK_CIPHER_custom(NID_aes,192,1,12,gcm,GCM, + EVP_CIPH_FLAG_FIPS|EVP_CIPH_FLAG_AEAD_CIPHER|CUSTOM_FLAGS) +BLOCK_CIPHER_custom(NID_aes,256,1,12,gcm,GCM, + EVP_CIPH_FLAG_FIPS|EVP_CIPH_FLAG_AEAD_CIPHER|CUSTOM_FLAGS) + +static int aes_xts_ctrl(EVP_CIPHER_CTX *c, int type, int arg, void *ptr) + { + EVP_AES_XTS_CTX *xctx = c->cipher_data; + if (type != EVP_CTRL_INIT) + return -1; + /* key1 and key2 are used as an indicator both key and IV are set */ + xctx->xts.key1 = NULL; + xctx->xts.key2 = NULL; + return 1; + } + +static int aes_xts_init_key(EVP_CIPHER_CTX *ctx, const unsigned char *key, + const unsigned char *iv, int enc) + { + EVP_AES_XTS_CTX *xctx = ctx->cipher_data; + if (!iv && !key) + return 1; + + if (key) do + { +#ifdef AES_XTS_ASM + xctx->stream = enc ? AES_xts_encrypt : AES_xts_decrypt; +#else + xctx->stream = NULL; +#endif + /* key_len is two AES keys */ +#ifdef BSAES_CAPABLE + if (BSAES_CAPABLE) + xctx->stream = enc ? bsaes_xts_encrypt : bsaes_xts_decrypt; + else +#endif +#ifdef VPAES_CAPABLE + if (VPAES_CAPABLE) + { + if (enc) + { + vpaes_set_encrypt_key(key, ctx->key_len * 4, &xctx->ks1); + xctx->xts.block1 = (block128_f)vpaes_encrypt; + } + else + { + vpaes_set_decrypt_key(key, ctx->key_len * 4, &xctx->ks1); + xctx->xts.block1 = (block128_f)vpaes_decrypt; + } + + vpaes_set_encrypt_key(key + ctx->key_len/2, + ctx->key_len * 4, &xctx->ks2); + xctx->xts.block2 = (block128_f)vpaes_encrypt; + + xctx->xts.key1 = &xctx->ks1; + break; + } +#endif + if (enc) + { + AES_set_encrypt_key(key, ctx->key_len * 4, &xctx->ks1); + xctx->xts.block1 = (block128_f)AES_encrypt; + } + else + { + AES_set_decrypt_key(key, ctx->key_len * 4, &xctx->ks1); + xctx->xts.block1 = (block128_f)AES_decrypt; + } + + AES_set_encrypt_key(key + ctx->key_len/2, + ctx->key_len * 4, &xctx->ks2); + xctx->xts.block2 = (block128_f)AES_encrypt; + + xctx->xts.key1 = &xctx->ks1; + } while (0); + + if (iv) + { + xctx->xts.key2 = &xctx->ks2; + memcpy(ctx->iv, iv, 16); + } + + return 1; + } + +static int aes_xts_cipher(EVP_CIPHER_CTX *ctx, unsigned char *out, + const unsigned char *in, size_t len) + { + EVP_AES_XTS_CTX *xctx = ctx->cipher_data; + if (!xctx->xts.key1 || !xctx->xts.key2) + return 0; + if (!out || !in || lenflags & EVP_CIPH_FLAG_NON_FIPS_ALLOW) && + (len > (1UL<<20)*16)) + { + EVPerr(EVP_F_AES_XTS_CIPHER, EVP_R_TOO_LARGE); + return 0; + } +#endif + if (xctx->stream) + (*xctx->stream)(in, out, len, + xctx->xts.key1, xctx->xts.key2, ctx->iv); + else if (CRYPTO_xts128_encrypt(&xctx->xts, ctx->iv, in, out, len, + ctx->encrypt)) + return 0; + return 1; + } + +#define aes_xts_cleanup NULL + +#define XTS_FLAGS (EVP_CIPH_FLAG_DEFAULT_ASN1 | EVP_CIPH_CUSTOM_IV \ + | EVP_CIPH_ALWAYS_CALL_INIT | EVP_CIPH_CTRL_INIT) + +BLOCK_CIPHER_custom(NID_aes,128,1,16,xts,XTS,EVP_CIPH_FLAG_FIPS|XTS_FLAGS) +BLOCK_CIPHER_custom(NID_aes,256,1,16,xts,XTS,EVP_CIPH_FLAG_FIPS|XTS_FLAGS) + +static int aes_ccm_ctrl(EVP_CIPHER_CTX *c, int type, int arg, void *ptr) + { + EVP_AES_CCM_CTX *cctx = c->cipher_data; + switch (type) + { + case EVP_CTRL_INIT: + cctx->key_set = 0; + cctx->iv_set = 0; + cctx->L = 8; + cctx->M = 12; + cctx->tag_set = 0; + cctx->len_set = 0; + return 1; + + case EVP_CTRL_CCM_SET_IVLEN: + arg = 15 - arg; + case EVP_CTRL_CCM_SET_L: + if (arg < 2 || arg > 8) + return 0; + cctx->L = arg; + return 1; + + case EVP_CTRL_CCM_SET_TAG: + if ((arg & 1) || arg < 4 || arg > 16) + return 0; + if ((c->encrypt && ptr) || (!c->encrypt && !ptr)) + return 0; + if (ptr) + { + cctx->tag_set = 1; + memcpy(c->buf, ptr, arg); + } + cctx->M = arg; + return 1; + + case EVP_CTRL_CCM_GET_TAG: + if (!c->encrypt || !cctx->tag_set) + return 0; + if(!CRYPTO_ccm128_tag(&cctx->ccm, ptr, (size_t)arg)) + return 0; + cctx->tag_set = 0; + cctx->iv_set = 0; + cctx->len_set = 0; + return 1; + + default: + return -1; + + } + } + +static int aes_ccm_init_key(EVP_CIPHER_CTX *ctx, const unsigned char *key, + const unsigned char *iv, int enc) + { + EVP_AES_CCM_CTX *cctx = ctx->cipher_data; + if (!iv && !key) + return 1; + if (key) do + { +#ifdef VPAES_CAPABLE + if (VPAES_CAPABLE) + { + vpaes_set_encrypt_key(key, ctx->key_len*8, &cctx->ks); + CRYPTO_ccm128_init(&cctx->ccm, cctx->M, cctx->L, + &cctx->ks, (block128_f)vpaes_encrypt); + cctx->key_set = 1; + break; + } +#endif + AES_set_encrypt_key(key, ctx->key_len * 8, &cctx->ks); + CRYPTO_ccm128_init(&cctx->ccm, cctx->M, cctx->L, + &cctx->ks, (block128_f)AES_encrypt); + cctx->str = NULL; + cctx->key_set = 1; + } while (0); + if (iv) + { + memcpy(ctx->iv, iv, 15 - cctx->L); + cctx->iv_set = 1; + } + return 1; + } + +static int aes_ccm_cipher(EVP_CIPHER_CTX *ctx, unsigned char *out, + const unsigned char *in, size_t len) + { + EVP_AES_CCM_CTX *cctx = ctx->cipher_data; + CCM128_CONTEXT *ccm = &cctx->ccm; + /* If not set up, return error */ + if (!cctx->iv_set && !cctx->key_set) + return -1; + if (!ctx->encrypt && !cctx->tag_set) + return -1; + if (!out) + { + if (!in) + { + if (CRYPTO_ccm128_setiv(ccm, ctx->iv, 15 - cctx->L,len)) + return -1; + cctx->len_set = 1; + return len; + } + /* If have AAD need message length */ + if (!cctx->len_set && len) + return -1; + CRYPTO_ccm128_aad(ccm, in, len); + return len; + } + /* EVP_*Final() doesn't return any data */ + if (!in) + return 0; + /* If not set length yet do it */ + if (!cctx->len_set) + { + if (CRYPTO_ccm128_setiv(ccm, ctx->iv, 15 - cctx->L, len)) + return -1; + cctx->len_set = 1; + } + if (ctx->encrypt) + { + if (cctx->str ? CRYPTO_ccm128_encrypt_ccm64(ccm, in, out, len, + cctx->str) : + CRYPTO_ccm128_encrypt(ccm, in, out, len)) + return -1; + cctx->tag_set = 1; + return len; + } + else + { + int rv = -1; + if (cctx->str ? !CRYPTO_ccm128_decrypt_ccm64(ccm, in, out, len, + cctx->str) : + !CRYPTO_ccm128_decrypt(ccm, in, out, len)) + { + unsigned char tag[16]; + if (CRYPTO_ccm128_tag(ccm, tag, cctx->M)) + { + if (!memcmp(tag, ctx->buf, cctx->M)) + rv = len; + } + } + if (rv == -1) + OPENSSL_cleanse(out, len); + cctx->iv_set = 0; + cctx->tag_set = 0; + cctx->len_set = 0; + return rv; + } + + } + +#define aes_ccm_cleanup NULL + +BLOCK_CIPHER_custom(NID_aes,128,1,12,ccm,CCM,EVP_CIPH_FLAG_FIPS|CUSTOM_FLAGS) +BLOCK_CIPHER_custom(NID_aes,192,1,12,ccm,CCM,EVP_CIPH_FLAG_FIPS|CUSTOM_FLAGS) +BLOCK_CIPHER_custom(NID_aes,256,1,12,ccm,CCM,EVP_CIPH_FLAG_FIPS|CUSTOM_FLAGS) + +#endif #endif diff --git a/crypto/openssl/crypto/evp/e_aes_cbc_hmac_sha1.c b/crypto/openssl/crypto/evp/e_aes_cbc_hmac_sha1.c new file mode 100644 index 0000000000..278c6caa28 --- /dev/null +++ b/crypto/openssl/crypto/evp/e_aes_cbc_hmac_sha1.c @@ -0,0 +1,404 @@ +/* ==================================================================== + * Copyright (c) 2011 The OpenSSL Project. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * + * 3. All advertising materials mentioning features or use of this + * software must display the following acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit. (http://www.OpenSSL.org/)" + * + * 4. The names "OpenSSL Toolkit" and "OpenSSL Project" must not be used to + * endorse or promote products derived from this software without + * prior written permission. For written permission, please contact + * licensing@OpenSSL.org. + * + * 5. Products derived from this software may not be called "OpenSSL" + * nor may "OpenSSL" appear in their names without prior written + * permission of the OpenSSL Project. + * + * 6. Redistributions of any form whatsoever must retain the following + * acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit (http://www.OpenSSL.org/)" + * + * THIS SOFTWARE IS PROVIDED BY THE OpenSSL PROJECT ``AS IS'' AND ANY + * EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR + * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE OpenSSL PROJECT OR + * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED + * OF THE POSSIBILITY OF SUCH DAMAGE. + * ==================================================================== + */ + +#include + +#include +#include + +#if !defined(OPENSSL_NO_AES) && !defined(OPENSSL_NO_SHA1) + +#include +#include +#include +#include +#include "evp_locl.h" + +#ifndef EVP_CIPH_FLAG_AEAD_CIPHER +#define EVP_CIPH_FLAG_AEAD_CIPHER 0x200000 +#define EVP_CTRL_AEAD_TLS1_AAD 0x16 +#define EVP_CTRL_AEAD_SET_MAC_KEY 0x17 +#endif + +#if !defined(EVP_CIPH_FLAG_DEFAULT_ASN1) +#define EVP_CIPH_FLAG_DEFAULT_ASN1 0 +#endif + +#define TLS1_1_VERSION 0x0302 + +typedef struct + { + AES_KEY ks; + SHA_CTX head,tail,md; + size_t payload_length; /* AAD length in decrypt case */ + union { + unsigned int tls_ver; + unsigned char tls_aad[16]; /* 13 used */ + } aux; + } EVP_AES_HMAC_SHA1; + +#if defined(AES_ASM) && ( \ + defined(__x86_64) || defined(__x86_64__) || \ + defined(_M_AMD64) || defined(_M_X64) || \ + defined(__INTEL__) ) + +extern unsigned int OPENSSL_ia32cap_P[2]; +#define AESNI_CAPABLE (1<<(57-32)) + +int aesni_set_encrypt_key(const unsigned char *userKey, int bits, + AES_KEY *key); +int aesni_set_decrypt_key(const unsigned char *userKey, int bits, + AES_KEY *key); + +void aesni_cbc_encrypt(const unsigned char *in, + unsigned char *out, + size_t length, + const AES_KEY *key, + unsigned char *ivec, int enc); + +void aesni_cbc_sha1_enc (const void *inp, void *out, size_t blocks, + const AES_KEY *key, unsigned char iv[16], + SHA_CTX *ctx,const void *in0); + +#define data(ctx) ((EVP_AES_HMAC_SHA1 *)(ctx)->cipher_data) + +static int aesni_cbc_hmac_sha1_init_key(EVP_CIPHER_CTX *ctx, + const unsigned char *inkey, + const unsigned char *iv, int enc) + { + EVP_AES_HMAC_SHA1 *key = data(ctx); + int ret; + + if (enc) + ret=aesni_set_encrypt_key(inkey,ctx->key_len*8,&key->ks); + else + ret=aesni_set_decrypt_key(inkey,ctx->key_len*8,&key->ks); + + SHA1_Init(&key->head); /* handy when benchmarking */ + key->tail = key->head; + key->md = key->head; + + key->payload_length = 0; + + return ret<0?0:1; + } + +#define STITCHED_CALL + +#if !defined(STITCHED_CALL) +#define aes_off 0 +#endif + +void sha1_block_data_order (void *c,const void *p,size_t len); + +static void sha1_update(SHA_CTX *c,const void *data,size_t len) +{ const unsigned char *ptr = data; + size_t res; + + if ((res = c->num)) { + res = SHA_CBLOCK-res; + if (lenNh += len>>29; + c->Nl += len<<=3; + if (c->Nl<(unsigned int)len) c->Nh++; + } + + if (res) + SHA1_Update(c,ptr,res); +} + +#define SHA1_Update sha1_update + +static int aesni_cbc_hmac_sha1_cipher(EVP_CIPHER_CTX *ctx, unsigned char *out, + const unsigned char *in, size_t len) + { + EVP_AES_HMAC_SHA1 *key = data(ctx); + unsigned int l; + size_t plen = key->payload_length, + iv = 0, /* explicit IV in TLS 1.1 and later */ + sha_off = 0; +#if defined(STITCHED_CALL) + size_t aes_off = 0, + blocks; + + sha_off = SHA_CBLOCK-key->md.num; +#endif + + if (len%AES_BLOCK_SIZE) return 0; + + if (ctx->encrypt) { + if (plen==0) + plen = len; + else if (len!=((plen+SHA_DIGEST_LENGTH+AES_BLOCK_SIZE)&-AES_BLOCK_SIZE)) + return 0; + else if (key->aux.tls_ver >= TLS1_1_VERSION) + iv = AES_BLOCK_SIZE; + +#if defined(STITCHED_CALL) + if (plen>(sha_off+iv) && (blocks=(plen-(sha_off+iv))/SHA_CBLOCK)) { + SHA1_Update(&key->md,in+iv,sha_off); + + aesni_cbc_sha1_enc(in,out,blocks,&key->ks, + ctx->iv,&key->md,in+iv+sha_off); + blocks *= SHA_CBLOCK; + aes_off += blocks; + sha_off += blocks; + key->md.Nh += blocks>>29; + key->md.Nl += blocks<<=3; + if (key->md.Nl<(unsigned int)blocks) key->md.Nh++; + } else { + sha_off = 0; + } +#endif + sha_off += iv; + SHA1_Update(&key->md,in+sha_off,plen-sha_off); + + if (plen!=len) { /* "TLS" mode of operation */ + if (in!=out) + memcpy(out+aes_off,in+aes_off,plen-aes_off); + + /* calculate HMAC and append it to payload */ + SHA1_Final(out+plen,&key->md); + key->md = key->tail; + SHA1_Update(&key->md,out+plen,SHA_DIGEST_LENGTH); + SHA1_Final(out+plen,&key->md); + + /* pad the payload|hmac */ + plen += SHA_DIGEST_LENGTH; + for (l=len-plen-1;plenks,ctx->iv,1); + } else { + aesni_cbc_encrypt(in+aes_off,out+aes_off,len-aes_off, + &key->ks,ctx->iv,1); + } + } else { + unsigned char mac[SHA_DIGEST_LENGTH]; + + /* decrypt HMAC|padding at once */ + aesni_cbc_encrypt(in,out,len, + &key->ks,ctx->iv,0); + + if (plen) { /* "TLS" mode of operation */ + /* figure out payload length */ + if (len<(size_t)(out[len-1]+1+SHA_DIGEST_LENGTH)) + return 0; + + len -= (out[len-1]+1+SHA_DIGEST_LENGTH); + + if ((key->aux.tls_aad[plen-4]<<8|key->aux.tls_aad[plen-3]) + >= TLS1_1_VERSION) { + len -= AES_BLOCK_SIZE; + iv = AES_BLOCK_SIZE; + } + + key->aux.tls_aad[plen-2] = len>>8; + key->aux.tls_aad[plen-1] = len; + + /* calculate HMAC and verify it */ + key->md = key->head; + SHA1_Update(&key->md,key->aux.tls_aad,plen); + SHA1_Update(&key->md,out+iv,len); + SHA1_Final(mac,&key->md); + + key->md = key->tail; + SHA1_Update(&key->md,mac,SHA_DIGEST_LENGTH); + SHA1_Final(mac,&key->md); + + if (memcmp(out+iv+len,mac,SHA_DIGEST_LENGTH)) + return 0; + } else { + SHA1_Update(&key->md,out,len); + } + } + + key->payload_length = 0; + + return 1; + } + +static int aesni_cbc_hmac_sha1_ctrl(EVP_CIPHER_CTX *ctx, int type, int arg, void *ptr) + { + EVP_AES_HMAC_SHA1 *key = data(ctx); + + switch (type) + { + case EVP_CTRL_AEAD_SET_MAC_KEY: + { + unsigned int i; + unsigned char hmac_key[64]; + + memset (hmac_key,0,sizeof(hmac_key)); + + if (arg > (int)sizeof(hmac_key)) { + SHA1_Init(&key->head); + SHA1_Update(&key->head,ptr,arg); + SHA1_Final(hmac_key,&key->head); + } else { + memcpy(hmac_key,ptr,arg); + } + + for (i=0;ihead); + SHA1_Update(&key->head,hmac_key,sizeof(hmac_key)); + + for (i=0;itail); + SHA1_Update(&key->tail,hmac_key,sizeof(hmac_key)); + + return 1; + } + case EVP_CTRL_AEAD_TLS1_AAD: + { + unsigned char *p=ptr; + unsigned int len=p[arg-2]<<8|p[arg-1]; + + if (ctx->encrypt) + { + key->payload_length = len; + if ((key->aux.tls_ver=p[arg-4]<<8|p[arg-3]) >= TLS1_1_VERSION) { + len -= AES_BLOCK_SIZE; + p[arg-2] = len>>8; + p[arg-1] = len; + } + key->md = key->head; + SHA1_Update(&key->md,p,arg); + + return (int)(((len+SHA_DIGEST_LENGTH+AES_BLOCK_SIZE)&-AES_BLOCK_SIZE) + - len); + } + else + { + if (arg>13) arg = 13; + memcpy(key->aux.tls_aad,ptr,arg); + key->payload_length = arg; + + return SHA_DIGEST_LENGTH; + } + } + default: + return -1; + } + } + +static EVP_CIPHER aesni_128_cbc_hmac_sha1_cipher = + { +#ifdef NID_aes_128_cbc_hmac_sha1 + NID_aes_128_cbc_hmac_sha1, +#else + NID_undef, +#endif + 16,16,16, + EVP_CIPH_CBC_MODE|EVP_CIPH_FLAG_DEFAULT_ASN1|EVP_CIPH_FLAG_AEAD_CIPHER, + aesni_cbc_hmac_sha1_init_key, + aesni_cbc_hmac_sha1_cipher, + NULL, + sizeof(EVP_AES_HMAC_SHA1), + EVP_CIPH_FLAG_DEFAULT_ASN1?NULL:EVP_CIPHER_set_asn1_iv, + EVP_CIPH_FLAG_DEFAULT_ASN1?NULL:EVP_CIPHER_get_asn1_iv, + aesni_cbc_hmac_sha1_ctrl, + NULL + }; + +static EVP_CIPHER aesni_256_cbc_hmac_sha1_cipher = + { +#ifdef NID_aes_256_cbc_hmac_sha1 + NID_aes_256_cbc_hmac_sha1, +#else + NID_undef, +#endif + 16,32,16, + EVP_CIPH_CBC_MODE|EVP_CIPH_FLAG_DEFAULT_ASN1|EVP_CIPH_FLAG_AEAD_CIPHER, + aesni_cbc_hmac_sha1_init_key, + aesni_cbc_hmac_sha1_cipher, + NULL, + sizeof(EVP_AES_HMAC_SHA1), + EVP_CIPH_FLAG_DEFAULT_ASN1?NULL:EVP_CIPHER_set_asn1_iv, + EVP_CIPH_FLAG_DEFAULT_ASN1?NULL:EVP_CIPHER_get_asn1_iv, + aesni_cbc_hmac_sha1_ctrl, + NULL + }; + +const EVP_CIPHER *EVP_aes_128_cbc_hmac_sha1(void) + { + return(OPENSSL_ia32cap_P[1]&AESNI_CAPABLE? + &aesni_128_cbc_hmac_sha1_cipher:NULL); + } + +const EVP_CIPHER *EVP_aes_256_cbc_hmac_sha1(void) + { + return(OPENSSL_ia32cap_P[1]&AESNI_CAPABLE? + &aesni_256_cbc_hmac_sha1_cipher:NULL); + } +#else +const EVP_CIPHER *EVP_aes_128_cbc_hmac_sha1(void) + { + return NULL; + } +const EVP_CIPHER *EVP_aes_256_cbc_hmac_sha1(void) + { + return NULL; + } +#endif +#endif diff --git a/crypto/openssl/crypto/evp/e_des3.c b/crypto/openssl/crypto/evp/e_des3.c index 3232cfe024..1e69972662 100644 --- a/crypto/openssl/crypto/evp/e_des3.c +++ b/crypto/openssl/crypto/evp/e_des3.c @@ -65,6 +65,8 @@ #include #include +#ifndef OPENSSL_FIPS + static int des_ede_init_key(EVP_CIPHER_CTX *ctx, const unsigned char *key, const unsigned char *iv,int enc); @@ -311,3 +313,4 @@ const EVP_CIPHER *EVP_des_ede3(void) return &des_ede3_ecb; } #endif +#endif diff --git a/crypto/openssl/crypto/evp/e_null.c b/crypto/openssl/crypto/evp/e_null.c index 7cf50e1416..f0c1f78b5f 100644 --- a/crypto/openssl/crypto/evp/e_null.c +++ b/crypto/openssl/crypto/evp/e_null.c @@ -61,6 +61,8 @@ #include #include +#ifndef OPENSSL_FIPS + static int null_init_key(EVP_CIPHER_CTX *ctx, const unsigned char *key, const unsigned char *iv,int enc); static int null_cipher(EVP_CIPHER_CTX *ctx, unsigned char *out, @@ -99,4 +101,4 @@ static int null_cipher(EVP_CIPHER_CTX *ctx, unsigned char *out, memcpy((char *)out,(const char *)in,inl); return 1; } - +#endif diff --git a/crypto/openssl/crypto/ec/ec_cvt.c b/crypto/openssl/crypto/evp/e_old.c similarity index 53% copy from crypto/openssl/crypto/ec/ec_cvt.c copy to crypto/openssl/crypto/evp/e_old.c index d45640bab9..1642af4869 100644 --- a/crypto/openssl/crypto/ec/ec_cvt.c +++ b/crypto/openssl/crypto/evp/e_old.c @@ -1,9 +1,9 @@ -/* crypto/ec/ec_cvt.c */ -/* - * Originally written by Bodo Moeller for the OpenSSL project. +/* crypto/evp/e_old.c -*- mode:C; c-file-style: "eay" -*- */ +/* Written by Richard Levitte (richard@levitte.org) for the OpenSSL + * project 2004. */ /* ==================================================================== - * Copyright (c) 1998-2002 The OpenSSL Project. All rights reserved. + * Copyright (c) 2004 The OpenSSL Project. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions @@ -55,90 +55,71 @@ * Hudson (tjh@cryptsoft.com). * */ -/* ==================================================================== - * Copyright 2002 Sun Microsystems, Inc. ALL RIGHTS RESERVED. - * - * Portions of the attached software ("Contribution") are developed by - * SUN MICROSYSTEMS, INC., and are contributed to the OpenSSL project. - * - * The Contribution is licensed pursuant to the OpenSSL open source - * license provided above. - * - * The elliptic curve binary polynomial software is originally written by - * Sheueling Chang Shantz and Douglas Stebila of Sun Microsystems Laboratories. - * - */ - -#include -#include "ec_lcl.h" - - -EC_GROUP *EC_GROUP_new_curve_GFp(const BIGNUM *p, const BIGNUM *a, const BIGNUM *b, BN_CTX *ctx) - { - const EC_METHOD *meth; - EC_GROUP *ret; - - meth = EC_GFp_nist_method(); - - ret = EC_GROUP_new(meth); - if (ret == NULL) - return NULL; - if (!EC_GROUP_set_curve_GFp(ret, p, a, b, ctx)) - { - unsigned long err; - - err = ERR_peek_last_error(); +#ifdef OPENSSL_NO_DEPRECATED +static void *dummy = &dummy; +#else - if (!(ERR_GET_LIB(err) == ERR_LIB_EC && - ((ERR_GET_REASON(err) == EC_R_NOT_A_NIST_PRIME) || - (ERR_GET_REASON(err) == EC_R_NOT_A_SUPPORTED_NIST_PRIME)))) - { - /* real error */ - - EC_GROUP_clear_free(ret); - return NULL; - } - - - /* not an actual error, we just cannot use EC_GFp_nist_method */ +#include - ERR_clear_error(); +/* Define some deprecated functions, so older programs + don't crash and burn too quickly. On Windows and VMS, + these will never be used, since functions and variables + in shared libraries are selected by entry point location, + not by name. */ - EC_GROUP_clear_free(ret); - meth = EC_GFp_mont_method(); +#ifndef OPENSSL_NO_BF +#undef EVP_bf_cfb +const EVP_CIPHER *EVP_bf_cfb(void); +const EVP_CIPHER *EVP_bf_cfb(void) { return EVP_bf_cfb64(); } +#endif - ret = EC_GROUP_new(meth); - if (ret == NULL) - return NULL; +#ifndef OPENSSL_NO_DES +#undef EVP_des_cfb +const EVP_CIPHER *EVP_des_cfb(void); +const EVP_CIPHER *EVP_des_cfb(void) { return EVP_des_cfb64(); } +#undef EVP_des_ede3_cfb +const EVP_CIPHER *EVP_des_ede3_cfb(void); +const EVP_CIPHER *EVP_des_ede3_cfb(void) { return EVP_des_ede3_cfb64(); } +#undef EVP_des_ede_cfb +const EVP_CIPHER *EVP_des_ede_cfb(void); +const EVP_CIPHER *EVP_des_ede_cfb(void) { return EVP_des_ede_cfb64(); } +#endif - if (!EC_GROUP_set_curve_GFp(ret, p, a, b, ctx)) - { - EC_GROUP_clear_free(ret); - return NULL; - } - } +#ifndef OPENSSL_NO_IDEA +#undef EVP_idea_cfb +const EVP_CIPHER *EVP_idea_cfb(void); +const EVP_CIPHER *EVP_idea_cfb(void) { return EVP_idea_cfb64(); } +#endif - return ret; - } +#ifndef OPENSSL_NO_RC2 +#undef EVP_rc2_cfb +const EVP_CIPHER *EVP_rc2_cfb(void); +const EVP_CIPHER *EVP_rc2_cfb(void) { return EVP_rc2_cfb64(); } +#endif +#ifndef OPENSSL_NO_CAST +#undef EVP_cast5_cfb +const EVP_CIPHER *EVP_cast5_cfb(void); +const EVP_CIPHER *EVP_cast5_cfb(void) { return EVP_cast5_cfb64(); } +#endif -EC_GROUP *EC_GROUP_new_curve_GF2m(const BIGNUM *p, const BIGNUM *a, const BIGNUM *b, BN_CTX *ctx) - { - const EC_METHOD *meth; - EC_GROUP *ret; - - meth = EC_GF2m_simple_method(); - - ret = EC_GROUP_new(meth); - if (ret == NULL) - return NULL; +#ifndef OPENSSL_NO_RC5 +#undef EVP_rc5_32_12_16_cfb +const EVP_CIPHER *EVP_rc5_32_12_16_cfb(void); +const EVP_CIPHER *EVP_rc5_32_12_16_cfb(void) { return EVP_rc5_32_12_16_cfb64(); } +#endif - if (!EC_GROUP_set_curve_GF2m(ret, p, a, b, ctx)) - { - EC_GROUP_clear_free(ret); - return NULL; - } +#ifndef OPENSSL_NO_AES +#undef EVP_aes_128_cfb +const EVP_CIPHER *EVP_aes_128_cfb(void); +const EVP_CIPHER *EVP_aes_128_cfb(void) { return EVP_aes_128_cfb128(); } +#undef EVP_aes_192_cfb +const EVP_CIPHER *EVP_aes_192_cfb(void); +const EVP_CIPHER *EVP_aes_192_cfb(void) { return EVP_aes_192_cfb128(); } +#undef EVP_aes_256_cfb +const EVP_CIPHER *EVP_aes_256_cfb(void); +const EVP_CIPHER *EVP_aes_256_cfb(void) { return EVP_aes_256_cfb128(); } +#endif - return ret; - } +#endif diff --git a/crypto/openssl/crypto/evp/e_rc2.c b/crypto/openssl/crypto/evp/e_rc2.c index f78d781129..d4c33b58d4 100644 --- a/crypto/openssl/crypto/evp/e_rc2.c +++ b/crypto/openssl/crypto/evp/e_rc2.c @@ -183,7 +183,8 @@ static int rc2_get_asn1_type_and_iv(EVP_CIPHER_CTX *c, ASN1_TYPE *type) key_bits =rc2_magic_to_meth((int)num); if (!key_bits) return(-1); - if(i > 0) EVP_CipherInit_ex(c, NULL, NULL, NULL, iv, -1); + if(i > 0 && !EVP_CipherInit_ex(c, NULL, NULL, NULL, iv, -1)) + return -1; EVP_CIPHER_CTX_ctrl(c, EVP_CTRL_SET_RC2_KEY_BITS, key_bits, NULL); EVP_CIPHER_CTX_set_key_length(c, key_bits / 8); } diff --git a/crypto/openssl/crypto/evp/e_rc4.c b/crypto/openssl/crypto/evp/e_rc4.c index 8b5175e0fd..b4f6bda82d 100644 --- a/crypto/openssl/crypto/evp/e_rc4.c +++ b/crypto/openssl/crypto/evp/e_rc4.c @@ -62,6 +62,7 @@ #ifndef OPENSSL_NO_RC4 #include +#include "evp_locl.h" #include #include diff --git a/crypto/openssl/crypto/evp/e_rc4_hmac_md5.c b/crypto/openssl/crypto/evp/e_rc4_hmac_md5.c new file mode 100644 index 0000000000..eaa7a5312c --- /dev/null +++ b/crypto/openssl/crypto/evp/e_rc4_hmac_md5.c @@ -0,0 +1,293 @@ +/* ==================================================================== + * Copyright (c) 2011 The OpenSSL Project. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * + * 3. All advertising materials mentioning features or use of this + * software must display the following acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit. (http://www.OpenSSL.org/)" + * + * 4. The names "OpenSSL Toolkit" and "OpenSSL Project" must not be used to + * endorse or promote products derived from this software without + * prior written permission. For written permission, please contact + * licensing@OpenSSL.org. + * + * 5. Products derived from this software may not be called "OpenSSL" + * nor may "OpenSSL" appear in their names without prior written + * permission of the OpenSSL Project. + * + * 6. Redistributions of any form whatsoever must retain the following + * acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit (http://www.OpenSSL.org/)" + * + * THIS SOFTWARE IS PROVIDED BY THE OpenSSL PROJECT ``AS IS'' AND ANY + * EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR + * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE OpenSSL PROJECT OR + * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED + * OF THE POSSIBILITY OF SUCH DAMAGE. + * ==================================================================== + */ + +#include + +#include +#include + +#if !defined(OPENSSL_NO_RC4) && !defined(OPENSSL_NO_MD5) + +#include +#include +#include +#include + +#ifndef EVP_CIPH_FLAG_AEAD_CIPHER +#define EVP_CIPH_FLAG_AEAD_CIPHER 0x200000 +#define EVP_CTRL_AEAD_TLS1_AAD 0x16 +#define EVP_CTRL_AEAD_SET_MAC_KEY 0x17 +#endif + +/* FIXME: surely this is available elsewhere? */ +#define EVP_RC4_KEY_SIZE 16 + +typedef struct + { + RC4_KEY ks; + MD5_CTX head,tail,md; + size_t payload_length; + } EVP_RC4_HMAC_MD5; + +void rc4_md5_enc (RC4_KEY *key, const void *in0, void *out, + MD5_CTX *ctx,const void *inp,size_t blocks); + +#define data(ctx) ((EVP_RC4_HMAC_MD5 *)(ctx)->cipher_data) + +static int rc4_hmac_md5_init_key(EVP_CIPHER_CTX *ctx, + const unsigned char *inkey, + const unsigned char *iv, int enc) + { + EVP_RC4_HMAC_MD5 *key = data(ctx); + + RC4_set_key(&key->ks,EVP_CIPHER_CTX_key_length(ctx), + inkey); + + MD5_Init(&key->head); /* handy when benchmarking */ + key->tail = key->head; + key->md = key->head; + + key->payload_length = 0; + + return 1; + } + +#if !defined(OPENSSL_NO_ASM) && ( \ + defined(__x86_64) || defined(__x86_64__) || \ + defined(_M_AMD64) || defined(_M_X64) || \ + defined(__INTEL__) ) && \ + !(defined(__APPLE__) && defined(__MACH__)) +#define STITCHED_CALL +#endif + +#if !defined(STITCHED_CALL) +#define rc4_off 0 +#define md5_off 0 +#endif + +static int rc4_hmac_md5_cipher(EVP_CIPHER_CTX *ctx, unsigned char *out, + const unsigned char *in, size_t len) + { + EVP_RC4_HMAC_MD5 *key = data(ctx); +#if defined(STITCHED_CALL) + size_t rc4_off = 32-1-(key->ks.x&(32-1)), /* 32 is $MOD from rc4_md5-x86_64.pl */ + md5_off = MD5_CBLOCK-key->md.num, + blocks; + unsigned int l; +#endif + size_t plen = key->payload_length; + + if (plen && len!=(plen+MD5_DIGEST_LENGTH)) return 0; + + if (ctx->encrypt) { + if (plen==0) plen = len; +#if defined(STITCHED_CALL) + /* cipher has to "fall behind" */ + if (rc4_off>md5_off) md5_off+=MD5_CBLOCK; + + if (plen>md5_off && (blocks=(plen-md5_off)/MD5_CBLOCK)) { + MD5_Update(&key->md,in,md5_off); + RC4(&key->ks,rc4_off,in,out); + + rc4_md5_enc(&key->ks,in+rc4_off,out+rc4_off, + &key->md,in+md5_off,blocks); + blocks *= MD5_CBLOCK; + rc4_off += blocks; + md5_off += blocks; + key->md.Nh += blocks>>29; + key->md.Nl += blocks<<=3; + if (key->md.Nl<(unsigned int)blocks) key->md.Nh++; + } else { + rc4_off = 0; + md5_off = 0; + } +#endif + MD5_Update(&key->md,in+md5_off,plen-md5_off); + + if (plen!=len) { /* "TLS" mode of operation */ + if (in!=out) + memcpy(out+rc4_off,in+rc4_off,plen-rc4_off); + + /* calculate HMAC and append it to payload */ + MD5_Final(out+plen,&key->md); + key->md = key->tail; + MD5_Update(&key->md,out+plen,MD5_DIGEST_LENGTH); + MD5_Final(out+plen,&key->md); + /* encrypt HMAC at once */ + RC4(&key->ks,len-rc4_off,out+rc4_off,out+rc4_off); + } else { + RC4(&key->ks,len-rc4_off,in+rc4_off,out+rc4_off); + } + } else { + unsigned char mac[MD5_DIGEST_LENGTH]; +#if defined(STITCHED_CALL) + /* digest has to "fall behind" */ + if (md5_off>rc4_off) rc4_off += 2*MD5_CBLOCK; + else rc4_off += MD5_CBLOCK; + + if (len>rc4_off && (blocks=(len-rc4_off)/MD5_CBLOCK)) { + RC4(&key->ks,rc4_off,in,out); + MD5_Update(&key->md,out,md5_off); + + rc4_md5_enc(&key->ks,in+rc4_off,out+rc4_off, + &key->md,out+md5_off,blocks); + blocks *= MD5_CBLOCK; + rc4_off += blocks; + md5_off += blocks; + l = (key->md.Nl+(blocks<<3))&0xffffffffU; + if (lmd.Nl) key->md.Nh++; + key->md.Nl = l; + key->md.Nh += blocks>>29; + } else { + md5_off=0; + rc4_off=0; + } +#endif + /* decrypt HMAC at once */ + RC4(&key->ks,len-rc4_off,in+rc4_off,out+rc4_off); + if (plen) { /* "TLS" mode of operation */ + MD5_Update(&key->md,out+md5_off,plen-md5_off); + + /* calculate HMAC and verify it */ + MD5_Final(mac,&key->md); + key->md = key->tail; + MD5_Update(&key->md,mac,MD5_DIGEST_LENGTH); + MD5_Final(mac,&key->md); + + if (memcmp(out+plen,mac,MD5_DIGEST_LENGTH)) + return 0; + } else { + MD5_Update(&key->md,out+md5_off,len-md5_off); + } + } + + key->payload_length = 0; + + return 1; + } + +static int rc4_hmac_md5_ctrl(EVP_CIPHER_CTX *ctx, int type, int arg, void *ptr) + { + EVP_RC4_HMAC_MD5 *key = data(ctx); + + switch (type) + { + case EVP_CTRL_AEAD_SET_MAC_KEY: + { + unsigned int i; + unsigned char hmac_key[64]; + + memset (hmac_key,0,sizeof(hmac_key)); + + if (arg > (int)sizeof(hmac_key)) { + MD5_Init(&key->head); + MD5_Update(&key->head,ptr,arg); + MD5_Final(hmac_key,&key->head); + } else { + memcpy(hmac_key,ptr,arg); + } + + for (i=0;ihead); + MD5_Update(&key->head,hmac_key,sizeof(hmac_key)); + + for (i=0;itail); + MD5_Update(&key->tail,hmac_key,sizeof(hmac_key)); + + return 1; + } + case EVP_CTRL_AEAD_TLS1_AAD: + { + unsigned char *p=ptr; + unsigned int len=p[arg-2]<<8|p[arg-1]; + + if (!ctx->encrypt) + { + len -= MD5_DIGEST_LENGTH; + p[arg-2] = len>>8; + p[arg-1] = len; + } + key->payload_length=len; + key->md = key->head; + MD5_Update(&key->md,p,arg); + + return MD5_DIGEST_LENGTH; + } + default: + return -1; + } + } + +static EVP_CIPHER r4_hmac_md5_cipher= + { +#ifdef NID_rc4_hmac_md5 + NID_rc4_hmac_md5, +#else + NID_undef, +#endif + 1,EVP_RC4_KEY_SIZE,0, + EVP_CIPH_STREAM_CIPHER|EVP_CIPH_VARIABLE_LENGTH|EVP_CIPH_FLAG_AEAD_CIPHER, + rc4_hmac_md5_init_key, + rc4_hmac_md5_cipher, + NULL, + sizeof(EVP_RC4_HMAC_MD5), + NULL, + NULL, + rc4_hmac_md5_ctrl, + NULL + }; + +const EVP_CIPHER *EVP_rc4_hmac_md5(void) + { + return(&r4_hmac_md5_cipher); + } +#endif diff --git a/crypto/openssl/crypto/evp/evp.h b/crypto/openssl/crypto/evp/evp.h index 9f9795e2d9..0d1b20a7d3 100644 --- a/crypto/openssl/crypto/evp/evp.h +++ b/crypto/openssl/crypto/evp/evp.h @@ -83,7 +83,7 @@ #define EVP_RC5_32_12_16_KEY_SIZE 16 */ #define EVP_MAX_MD_SIZE 64 /* longest known is SHA512 */ -#define EVP_MAX_KEY_LENGTH 32 +#define EVP_MAX_KEY_LENGTH 64 #define EVP_MAX_IV_LENGTH 16 #define EVP_MAX_BLOCK_LENGTH 32 @@ -116,6 +116,7 @@ #define EVP_PKEY_DH NID_dhKeyAgreement #define EVP_PKEY_EC NID_X9_62_id_ecPublicKey #define EVP_PKEY_HMAC NID_hmac +#define EVP_PKEY_CMAC NID_cmac #ifdef __cplusplus extern "C" { @@ -216,6 +217,8 @@ typedef int evp_verify_method(int type,const unsigned char *m, #define EVP_MD_FLAG_DIGALGID_CUSTOM 0x0018 +#define EVP_MD_FLAG_FIPS 0x0400 /* Note if suitable for use in FIPS mode */ + /* Digest ctrls */ #define EVP_MD_CTRL_DIGALGID 0x1 @@ -325,6 +328,10 @@ struct evp_cipher_st #define EVP_CIPH_CBC_MODE 0x2 #define EVP_CIPH_CFB_MODE 0x3 #define EVP_CIPH_OFB_MODE 0x4 +#define EVP_CIPH_CTR_MODE 0x5 +#define EVP_CIPH_GCM_MODE 0x6 +#define EVP_CIPH_CCM_MODE 0x7 +#define EVP_CIPH_XTS_MODE 0x10001 #define EVP_CIPH_MODE 0xF0007 /* Set if variable length cipher */ #define EVP_CIPH_VARIABLE_LENGTH 0x8 @@ -346,6 +353,15 @@ struct evp_cipher_st #define EVP_CIPH_FLAG_DEFAULT_ASN1 0x1000 /* Buffer length in bits not bytes: CFB1 mode only */ #define EVP_CIPH_FLAG_LENGTH_BITS 0x2000 +/* Note if suitable for use in FIPS mode */ +#define EVP_CIPH_FLAG_FIPS 0x4000 +/* Allow non FIPS cipher in FIPS mode */ +#define EVP_CIPH_FLAG_NON_FIPS_ALLOW 0x8000 +/* Cipher handles any and all padding logic as well + * as finalisation. + */ +#define EVP_CIPH_FLAG_CUSTOM_CIPHER 0x100000 +#define EVP_CIPH_FLAG_AEAD_CIPHER 0x200000 /* ctrl() values */ @@ -358,6 +374,34 @@ struct evp_cipher_st #define EVP_CTRL_RAND_KEY 0x6 #define EVP_CTRL_PBE_PRF_NID 0x7 #define EVP_CTRL_COPY 0x8 +#define EVP_CTRL_GCM_SET_IVLEN 0x9 +#define EVP_CTRL_GCM_GET_TAG 0x10 +#define EVP_CTRL_GCM_SET_TAG 0x11 +#define EVP_CTRL_GCM_SET_IV_FIXED 0x12 +#define EVP_CTRL_GCM_IV_GEN 0x13 +#define EVP_CTRL_CCM_SET_IVLEN EVP_CTRL_GCM_SET_IVLEN +#define EVP_CTRL_CCM_GET_TAG EVP_CTRL_GCM_GET_TAG +#define EVP_CTRL_CCM_SET_TAG EVP_CTRL_GCM_SET_TAG +#define EVP_CTRL_CCM_SET_L 0x14 +#define EVP_CTRL_CCM_SET_MSGLEN 0x15 +/* AEAD cipher deduces payload length and returns number of bytes + * required to store MAC and eventual padding. Subsequent call to + * EVP_Cipher even appends/verifies MAC. + */ +#define EVP_CTRL_AEAD_TLS1_AAD 0x16 +/* Used by composite AEAD ciphers, no-op in GCM, CCM... */ +#define EVP_CTRL_AEAD_SET_MAC_KEY 0x17 +/* Set the GCM invocation field, decrypt only */ +#define EVP_CTRL_GCM_SET_IV_INV 0x18 + +/* GCM TLS constants */ +/* Length of fixed part of IV derived from PRF */ +#define EVP_GCM_TLS_FIXED_IV_LEN 4 +/* Length of explicit part of IV part of TLS records */ +#define EVP_GCM_TLS_EXPLICIT_IV_LEN 8 +/* Length of tag for TLS */ +#define EVP_GCM_TLS_TAG_LEN 16 + typedef struct evp_cipher_info_st { @@ -375,7 +419,7 @@ struct evp_cipher_ctx_st unsigned char oiv[EVP_MAX_IV_LENGTH]; /* original iv */ unsigned char iv[EVP_MAX_IV_LENGTH]; /* working iv */ unsigned char buf[EVP_MAX_BLOCK_LENGTH];/* saved partial block */ - int num; /* used by cfb/ofb mode */ + int num; /* used by cfb/ofb/ctr mode */ void *app_data; /* application stuff */ int key_len; /* May change for variable length cipher */ @@ -695,6 +739,9 @@ const EVP_MD *EVP_dev_crypto_md5(void); #ifndef OPENSSL_NO_RC4 const EVP_CIPHER *EVP_rc4(void); const EVP_CIPHER *EVP_rc4_40(void); +#ifndef OPENSSL_NO_MD5 +const EVP_CIPHER *EVP_rc4_hmac_md5(void); +#endif #endif #ifndef OPENSSL_NO_IDEA const EVP_CIPHER *EVP_idea_ecb(void); @@ -741,9 +788,10 @@ const EVP_CIPHER *EVP_aes_128_cfb8(void); const EVP_CIPHER *EVP_aes_128_cfb128(void); # define EVP_aes_128_cfb EVP_aes_128_cfb128 const EVP_CIPHER *EVP_aes_128_ofb(void); -#if 0 const EVP_CIPHER *EVP_aes_128_ctr(void); -#endif +const EVP_CIPHER *EVP_aes_128_gcm(void); +const EVP_CIPHER *EVP_aes_128_ccm(void); +const EVP_CIPHER *EVP_aes_128_xts(void); const EVP_CIPHER *EVP_aes_192_ecb(void); const EVP_CIPHER *EVP_aes_192_cbc(void); const EVP_CIPHER *EVP_aes_192_cfb1(void); @@ -751,9 +799,9 @@ const EVP_CIPHER *EVP_aes_192_cfb8(void); const EVP_CIPHER *EVP_aes_192_cfb128(void); # define EVP_aes_192_cfb EVP_aes_192_cfb128 const EVP_CIPHER *EVP_aes_192_ofb(void); -#if 0 const EVP_CIPHER *EVP_aes_192_ctr(void); -#endif +const EVP_CIPHER *EVP_aes_192_gcm(void); +const EVP_CIPHER *EVP_aes_192_ccm(void); const EVP_CIPHER *EVP_aes_256_ecb(void); const EVP_CIPHER *EVP_aes_256_cbc(void); const EVP_CIPHER *EVP_aes_256_cfb1(void); @@ -761,8 +809,13 @@ const EVP_CIPHER *EVP_aes_256_cfb8(void); const EVP_CIPHER *EVP_aes_256_cfb128(void); # define EVP_aes_256_cfb EVP_aes_256_cfb128 const EVP_CIPHER *EVP_aes_256_ofb(void); -#if 0 const EVP_CIPHER *EVP_aes_256_ctr(void); +const EVP_CIPHER *EVP_aes_256_gcm(void); +const EVP_CIPHER *EVP_aes_256_ccm(void); +const EVP_CIPHER *EVP_aes_256_xts(void); +#if !defined(OPENSSL_NO_SHA) && !defined(OPENSSL_NO_SHA1) +const EVP_CIPHER *EVP_aes_128_cbc_hmac_sha1(void); +const EVP_CIPHER *EVP_aes_256_cbc_hmac_sha1(void); #endif #endif #ifndef OPENSSL_NO_CAMELLIA @@ -1047,13 +1100,22 @@ void EVP_PKEY_asn1_set_ctrl(EVP_PKEY_ASN1_METHOD *ameth, #define EVP_PKEY_CTRL_CMS_DECRYPT 10 #define EVP_PKEY_CTRL_CMS_SIGN 11 +#define EVP_PKEY_CTRL_CIPHER 12 + #define EVP_PKEY_ALG_CTRL 0x1000 #define EVP_PKEY_FLAG_AUTOARGLEN 2 +/* Method handles all operations: don't assume any digest related + * defaults. + */ +#define EVP_PKEY_FLAG_SIGCTX_CUSTOM 4 const EVP_PKEY_METHOD *EVP_PKEY_meth_find(int type); EVP_PKEY_METHOD* EVP_PKEY_meth_new(int id, int flags); +void EVP_PKEY_meth_get0_info(int *ppkey_id, int *pflags, + const EVP_PKEY_METHOD *meth); +void EVP_PKEY_meth_copy(EVP_PKEY_METHOD *dst, const EVP_PKEY_METHOD *src); void EVP_PKEY_meth_free(EVP_PKEY_METHOD *pmeth); int EVP_PKEY_meth_add0(const EVP_PKEY_METHOD *pmeth); @@ -1071,7 +1133,7 @@ int EVP_PKEY_CTX_get_operation(EVP_PKEY_CTX *ctx); void EVP_PKEY_CTX_set0_keygen_info(EVP_PKEY_CTX *ctx, int *dat, int datlen); EVP_PKEY *EVP_PKEY_new_mac_key(int type, ENGINE *e, - unsigned char *key, int keylen); + const unsigned char *key, int keylen); void EVP_PKEY_CTX_set_data(EVP_PKEY_CTX *ctx, void *data); void *EVP_PKEY_CTX_get_data(EVP_PKEY_CTX *ctx); @@ -1190,8 +1252,13 @@ void ERR_load_EVP_strings(void); /* Error codes for the EVP functions. */ /* Function codes. */ +#define EVP_F_AESNI_INIT_KEY 165 +#define EVP_F_AESNI_XTS_CIPHER 176 #define EVP_F_AES_INIT_KEY 133 +#define EVP_F_AES_XTS 172 +#define EVP_F_AES_XTS_CIPHER 175 #define EVP_F_CAMELLIA_INIT_KEY 159 +#define EVP_F_CMAC_INIT 173 #define EVP_F_D2I_PKEY 100 #define EVP_F_DO_SIGVER_INIT 161 #define EVP_F_DSAPKEY2PKCS8 134 @@ -1246,15 +1313,24 @@ void ERR_load_EVP_strings(void); #define EVP_F_EVP_RIJNDAEL 126 #define EVP_F_EVP_SIGNFINAL 107 #define EVP_F_EVP_VERIFYFINAL 108 +#define EVP_F_FIPS_CIPHERINIT 166 +#define EVP_F_FIPS_CIPHER_CTX_COPY 170 +#define EVP_F_FIPS_CIPHER_CTX_CTRL 167 +#define EVP_F_FIPS_CIPHER_CTX_SET_KEY_LENGTH 171 +#define EVP_F_FIPS_DIGESTINIT 168 +#define EVP_F_FIPS_MD_CTX_COPY 169 +#define EVP_F_HMAC_INIT_EX 174 #define EVP_F_INT_CTX_NEW 157 #define EVP_F_PKCS5_PBE_KEYIVGEN 117 #define EVP_F_PKCS5_V2_PBE_KEYIVGEN 118 +#define EVP_F_PKCS5_V2_PBKDF2_KEYIVGEN 164 #define EVP_F_PKCS8_SET_BROKEN 112 #define EVP_F_PKEY_SET_TYPE 158 #define EVP_F_RC2_MAGIC_TO_METH 109 #define EVP_F_RC5_CTRL 125 /* Reason codes. */ +#define EVP_R_AES_IV_SETUP_FAILED 162 #define EVP_R_AES_KEY_SETUP_FAILED 143 #define EVP_R_ASN1_LIB 140 #define EVP_R_BAD_BLOCK_LENGTH 136 @@ -1272,6 +1348,7 @@ void ERR_load_EVP_strings(void); #define EVP_R_DECODE_ERROR 114 #define EVP_R_DIFFERENT_KEY_TYPES 101 #define EVP_R_DIFFERENT_PARAMETERS 153 +#define EVP_R_DISABLED_FOR_FIPS 163 #define EVP_R_ENCODE_ERROR 115 #define EVP_R_EVP_PBE_CIPHERINIT_ERROR 119 #define EVP_R_EXPECTING_AN_RSA_KEY 127 @@ -1303,6 +1380,7 @@ void ERR_load_EVP_strings(void); #define EVP_R_PRIVATE_KEY_DECODE_ERROR 145 #define EVP_R_PRIVATE_KEY_ENCODE_ERROR 146 #define EVP_R_PUBLIC_KEY_NOT_RSA 106 +#define EVP_R_TOO_LARGE 164 #define EVP_R_UNKNOWN_CIPHER 160 #define EVP_R_UNKNOWN_DIGEST 161 #define EVP_R_UNKNOWN_PBE_ALGORITHM 121 diff --git a/crypto/openssl/crypto/evp/evp_enc.c b/crypto/openssl/crypto/evp/evp_enc.c index c268d25cb4..691072655b 100644 --- a/crypto/openssl/crypto/evp/evp_enc.c +++ b/crypto/openssl/crypto/evp/evp_enc.c @@ -64,8 +64,18 @@ #ifndef OPENSSL_NO_ENGINE #include #endif +#ifdef OPENSSL_FIPS +#include +#endif #include "evp_locl.h" +#ifdef OPENSSL_FIPS +#define M_do_cipher(ctx, out, in, inl) FIPS_cipher(ctx, out, in, inl) +#else +#define M_do_cipher(ctx, out, in, inl) ctx->cipher->do_cipher(ctx, out, in, inl) +#endif + + const char EVP_version[]="EVP" OPENSSL_VERSION_PTEXT; void EVP_CIPHER_CTX_init(EVP_CIPHER_CTX *ctx) @@ -115,10 +125,14 @@ int EVP_CipherInit_ex(EVP_CIPHER_CTX *ctx, const EVP_CIPHER *cipher, ENGINE *imp /* Ensure a context left lying around from last time is cleared * (the previous check attempted to avoid this if the same * ENGINE and EVP_CIPHER could be used). */ - EVP_CIPHER_CTX_cleanup(ctx); - - /* Restore encrypt field: it is zeroed by cleanup */ - ctx->encrypt = enc; + if (ctx->cipher) + { + unsigned long flags = ctx->flags; + EVP_CIPHER_CTX_cleanup(ctx); + /* Restore encrypt and flags */ + ctx->encrypt = enc; + ctx->flags = flags; + } #ifndef OPENSSL_NO_ENGINE if(impl) { @@ -155,6 +169,9 @@ int EVP_CipherInit_ex(EVP_CIPHER_CTX *ctx, const EVP_CIPHER *cipher, ENGINE *imp ctx->engine = NULL; #endif +#ifdef OPENSSL_FIPS + return FIPS_cipherinit(ctx, cipher, key, iv, enc); +#else ctx->cipher=cipher; if (ctx->cipher->ctx_size) { @@ -179,6 +196,7 @@ int EVP_CipherInit_ex(EVP_CIPHER_CTX *ctx, const EVP_CIPHER *cipher, ENGINE *imp return 0; } } +#endif } else if(!ctx->cipher) { @@ -188,6 +206,9 @@ int EVP_CipherInit_ex(EVP_CIPHER_CTX *ctx, const EVP_CIPHER *cipher, ENGINE *imp #ifndef OPENSSL_NO_ENGINE skip_to_init: #endif +#ifdef OPENSSL_FIPS + return FIPS_cipherinit(ctx, cipher, key, iv, enc); +#else /* we assume block size is a power of 2 in *cryptUpdate */ OPENSSL_assert(ctx->cipher->block_size == 1 || ctx->cipher->block_size == 8 @@ -214,6 +235,13 @@ skip_to_init: memcpy(ctx->iv, ctx->oiv, EVP_CIPHER_CTX_iv_length(ctx)); break; + case EVP_CIPH_CTR_MODE: + ctx->num = 0; + /* Don't reuse IV for CTR mode */ + if(iv) + memcpy(ctx->iv, iv, EVP_CIPHER_CTX_iv_length(ctx)); + break; + default: return 0; break; @@ -227,6 +255,7 @@ skip_to_init: ctx->final_used=0; ctx->block_mask=ctx->cipher->block_size-1; return 1; +#endif } int EVP_CipherUpdate(EVP_CIPHER_CTX *ctx, unsigned char *out, int *outl, @@ -280,6 +309,16 @@ int EVP_EncryptUpdate(EVP_CIPHER_CTX *ctx, unsigned char *out, int *outl, { int i,j,bl; + if (ctx->cipher->flags & EVP_CIPH_FLAG_CUSTOM_CIPHER) + { + i = M_do_cipher(ctx, out, in, inl); + if (i < 0) + return 0; + else + *outl = i; + return 1; + } + if (inl <= 0) { *outl = 0; @@ -288,7 +327,7 @@ int EVP_EncryptUpdate(EVP_CIPHER_CTX *ctx, unsigned char *out, int *outl, if(ctx->buf_len == 0 && (inl&(ctx->block_mask)) == 0) { - if(ctx->cipher->do_cipher(ctx,out,in,inl)) + if(M_do_cipher(ctx,out,in,inl)) { *outl=inl; return 1; @@ -315,7 +354,7 @@ int EVP_EncryptUpdate(EVP_CIPHER_CTX *ctx, unsigned char *out, int *outl, { j=bl-i; memcpy(&(ctx->buf[i]),in,j); - if(!ctx->cipher->do_cipher(ctx,out,ctx->buf,bl)) return 0; + if(!M_do_cipher(ctx,out,ctx->buf,bl)) return 0; inl-=j; in+=j; out+=bl; @@ -328,7 +367,7 @@ int EVP_EncryptUpdate(EVP_CIPHER_CTX *ctx, unsigned char *out, int *outl, inl-=i; if (inl > 0) { - if(!ctx->cipher->do_cipher(ctx,out,in,inl)) return 0; + if(!M_do_cipher(ctx,out,in,inl)) return 0; *outl+=inl; } @@ -350,6 +389,16 @@ int EVP_EncryptFinal_ex(EVP_CIPHER_CTX *ctx, unsigned char *out, int *outl) int n,ret; unsigned int i, b, bl; + if (ctx->cipher->flags & EVP_CIPH_FLAG_CUSTOM_CIPHER) + { + ret = M_do_cipher(ctx, out, NULL, 0); + if (ret < 0) + return 0; + else + *outl = ret; + return 1; + } + b=ctx->cipher->block_size; OPENSSL_assert(b <= sizeof ctx->buf); if (b == 1) @@ -372,7 +421,7 @@ int EVP_EncryptFinal_ex(EVP_CIPHER_CTX *ctx, unsigned char *out, int *outl) n=b-bl; for (i=bl; ibuf[i]=n; - ret=ctx->cipher->do_cipher(ctx,out,ctx->buf,b); + ret=M_do_cipher(ctx,out,ctx->buf,b); if(ret) @@ -387,6 +436,19 @@ int EVP_DecryptUpdate(EVP_CIPHER_CTX *ctx, unsigned char *out, int *outl, int fix_len; unsigned int b; + if (ctx->cipher->flags & EVP_CIPH_FLAG_CUSTOM_CIPHER) + { + fix_len = M_do_cipher(ctx, out, in, inl); + if (fix_len < 0) + { + *outl = 0; + return 0; + } + else + *outl = fix_len; + return 1; + } + if (inl <= 0) { *outl = 0; @@ -440,8 +502,18 @@ int EVP_DecryptFinal_ex(EVP_CIPHER_CTX *ctx, unsigned char *out, int *outl) { int i,n; unsigned int b; - *outl=0; + + if (ctx->cipher->flags & EVP_CIPH_FLAG_CUSTOM_CIPHER) + { + i = M_do_cipher(ctx, out, NULL, 0); + if (i < 0) + return 0; + else + *outl = i; + return 1; + } + b=ctx->cipher->block_size; if (ctx->flags & EVP_CIPH_NO_PADDING) { @@ -496,6 +568,7 @@ void EVP_CIPHER_CTX_free(EVP_CIPHER_CTX *ctx) int EVP_CIPHER_CTX_cleanup(EVP_CIPHER_CTX *c) { +#ifndef OPENSSL_FIPS if (c->cipher != NULL) { if(c->cipher->cleanup && !c->cipher->cleanup(c)) @@ -506,11 +579,15 @@ int EVP_CIPHER_CTX_cleanup(EVP_CIPHER_CTX *c) } if (c->cipher_data) OPENSSL_free(c->cipher_data); +#endif #ifndef OPENSSL_NO_ENGINE if (c->engine) /* The EVP_CIPHER we used belongs to an ENGINE, release the * functional reference we held for this reason. */ ENGINE_finish(c->engine); +#endif +#ifdef OPENSSL_FIPS + FIPS_cipher_ctx_cleanup(c); #endif memset(c,0,sizeof(EVP_CIPHER_CTX)); return 1; diff --git a/crypto/openssl/crypto/evp/evp_err.c b/crypto/openssl/crypto/evp/evp_err.c index d8bfec0959..db0f76d59b 100644 --- a/crypto/openssl/crypto/evp/evp_err.c +++ b/crypto/openssl/crypto/evp/evp_err.c @@ -1,6 +1,6 @@ /* crypto/evp/evp_err.c */ /* ==================================================================== - * Copyright (c) 1999-2008 The OpenSSL Project. All rights reserved. + * Copyright (c) 1999-2011 The OpenSSL Project. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions @@ -70,8 +70,13 @@ static ERR_STRING_DATA EVP_str_functs[]= { +{ERR_FUNC(EVP_F_AESNI_INIT_KEY), "AESNI_INIT_KEY"}, +{ERR_FUNC(EVP_F_AESNI_XTS_CIPHER), "AESNI_XTS_CIPHER"}, {ERR_FUNC(EVP_F_AES_INIT_KEY), "AES_INIT_KEY"}, +{ERR_FUNC(EVP_F_AES_XTS), "AES_XTS"}, +{ERR_FUNC(EVP_F_AES_XTS_CIPHER), "AES_XTS_CIPHER"}, {ERR_FUNC(EVP_F_CAMELLIA_INIT_KEY), "CAMELLIA_INIT_KEY"}, +{ERR_FUNC(EVP_F_CMAC_INIT), "CMAC_INIT"}, {ERR_FUNC(EVP_F_D2I_PKEY), "D2I_PKEY"}, {ERR_FUNC(EVP_F_DO_SIGVER_INIT), "DO_SIGVER_INIT"}, {ERR_FUNC(EVP_F_DSAPKEY2PKCS8), "DSAPKEY2PKCS8"}, @@ -86,7 +91,7 @@ static ERR_STRING_DATA EVP_str_functs[]= {ERR_FUNC(EVP_F_EVP_DIGESTINIT_EX), "EVP_DigestInit_ex"}, {ERR_FUNC(EVP_F_EVP_ENCRYPTFINAL_EX), "EVP_EncryptFinal_ex"}, {ERR_FUNC(EVP_F_EVP_MD_CTX_COPY_EX), "EVP_MD_CTX_copy_ex"}, -{ERR_FUNC(EVP_F_EVP_MD_SIZE), "EVP_MD_SIZE"}, +{ERR_FUNC(EVP_F_EVP_MD_SIZE), "EVP_MD_size"}, {ERR_FUNC(EVP_F_EVP_OPENINIT), "EVP_OpenInit"}, {ERR_FUNC(EVP_F_EVP_PBE_ALG_ADD), "EVP_PBE_alg_add"}, {ERR_FUNC(EVP_F_EVP_PBE_ALG_ADD_TYPE), "EVP_PBE_alg_add_type"}, @@ -126,9 +131,17 @@ static ERR_STRING_DATA EVP_str_functs[]= {ERR_FUNC(EVP_F_EVP_RIJNDAEL), "EVP_RIJNDAEL"}, {ERR_FUNC(EVP_F_EVP_SIGNFINAL), "EVP_SignFinal"}, {ERR_FUNC(EVP_F_EVP_VERIFYFINAL), "EVP_VerifyFinal"}, +{ERR_FUNC(EVP_F_FIPS_CIPHERINIT), "FIPS_CIPHERINIT"}, +{ERR_FUNC(EVP_F_FIPS_CIPHER_CTX_COPY), "FIPS_CIPHER_CTX_COPY"}, +{ERR_FUNC(EVP_F_FIPS_CIPHER_CTX_CTRL), "FIPS_CIPHER_CTX_CTRL"}, +{ERR_FUNC(EVP_F_FIPS_CIPHER_CTX_SET_KEY_LENGTH), "FIPS_CIPHER_CTX_SET_KEY_LENGTH"}, +{ERR_FUNC(EVP_F_FIPS_DIGESTINIT), "FIPS_DIGESTINIT"}, +{ERR_FUNC(EVP_F_FIPS_MD_CTX_COPY), "FIPS_MD_CTX_COPY"}, +{ERR_FUNC(EVP_F_HMAC_INIT_EX), "HMAC_Init_ex"}, {ERR_FUNC(EVP_F_INT_CTX_NEW), "INT_CTX_NEW"}, {ERR_FUNC(EVP_F_PKCS5_PBE_KEYIVGEN), "PKCS5_PBE_keyivgen"}, {ERR_FUNC(EVP_F_PKCS5_V2_PBE_KEYIVGEN), "PKCS5_v2_PBE_keyivgen"}, +{ERR_FUNC(EVP_F_PKCS5_V2_PBKDF2_KEYIVGEN), "PKCS5_V2_PBKDF2_KEYIVGEN"}, {ERR_FUNC(EVP_F_PKCS8_SET_BROKEN), "PKCS8_set_broken"}, {ERR_FUNC(EVP_F_PKEY_SET_TYPE), "PKEY_SET_TYPE"}, {ERR_FUNC(EVP_F_RC2_MAGIC_TO_METH), "RC2_MAGIC_TO_METH"}, @@ -138,6 +151,7 @@ static ERR_STRING_DATA EVP_str_functs[]= static ERR_STRING_DATA EVP_str_reasons[]= { +{ERR_REASON(EVP_R_AES_IV_SETUP_FAILED) ,"aes iv setup failed"}, {ERR_REASON(EVP_R_AES_KEY_SETUP_FAILED) ,"aes key setup failed"}, {ERR_REASON(EVP_R_ASN1_LIB) ,"asn1 lib"}, {ERR_REASON(EVP_R_BAD_BLOCK_LENGTH) ,"bad block length"}, @@ -155,6 +169,7 @@ static ERR_STRING_DATA EVP_str_reasons[]= {ERR_REASON(EVP_R_DECODE_ERROR) ,"decode error"}, {ERR_REASON(EVP_R_DIFFERENT_KEY_TYPES) ,"different key types"}, {ERR_REASON(EVP_R_DIFFERENT_PARAMETERS) ,"different parameters"}, +{ERR_REASON(EVP_R_DISABLED_FOR_FIPS) ,"disabled for fips"}, {ERR_REASON(EVP_R_ENCODE_ERROR) ,"encode error"}, {ERR_REASON(EVP_R_EVP_PBE_CIPHERINIT_ERROR),"evp pbe cipherinit error"}, {ERR_REASON(EVP_R_EXPECTING_AN_RSA_KEY) ,"expecting an rsa key"}, @@ -186,6 +201,7 @@ static ERR_STRING_DATA EVP_str_reasons[]= {ERR_REASON(EVP_R_PRIVATE_KEY_DECODE_ERROR),"private key decode error"}, {ERR_REASON(EVP_R_PRIVATE_KEY_ENCODE_ERROR),"private key encode error"}, {ERR_REASON(EVP_R_PUBLIC_KEY_NOT_RSA) ,"public key not rsa"}, +{ERR_REASON(EVP_R_TOO_LARGE) ,"too large"}, {ERR_REASON(EVP_R_UNKNOWN_CIPHER) ,"unknown cipher"}, {ERR_REASON(EVP_R_UNKNOWN_DIGEST) ,"unknown digest"}, {ERR_REASON(EVP_R_UNKNOWN_PBE_ALGORITHM) ,"unknown pbe algorithm"}, diff --git a/crypto/openssl/crypto/evp/evp_fips.c b/crypto/openssl/crypto/evp/evp_fips.c new file mode 100644 index 0000000000..cb7f4fc0fa --- /dev/null +++ b/crypto/openssl/crypto/evp/evp_fips.c @@ -0,0 +1,113 @@ +/* crypto/evp/evp_fips.c */ +/* Written by Dr Stephen N Henson (steve@openssl.org) for the OpenSSL + * project. + */ +/* ==================================================================== + * Copyright (c) 2011 The OpenSSL Project. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * + * 3. All advertising materials mentioning features or use of this + * software must display the following acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit. (http://www.OpenSSL.org/)" + * + * 4. The names "OpenSSL Toolkit" and "OpenSSL Project" must not be used to + * endorse or promote products derived from this software without + * prior written permission. For written permission, please contact + * licensing@OpenSSL.org. + * + * 5. Products derived from this software may not be called "OpenSSL" + * nor may "OpenSSL" appear in their names without prior written + * permission of the OpenSSL Project. + * + * 6. Redistributions of any form whatsoever must retain the following + * acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit (http://www.OpenSSL.org/)" + * + * THIS SOFTWARE IS PROVIDED BY THE OpenSSL PROJECT ``AS IS'' AND ANY + * EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR + * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE OpenSSL PROJECT OR + * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED + * OF THE POSSIBILITY OF SUCH DAMAGE. + * ==================================================================== + */ + + +#include + +#ifdef OPENSSL_FIPS +#include + +const EVP_CIPHER *EVP_aes_128_cbc(void) { return FIPS_evp_aes_128_cbc(); } +const EVP_CIPHER *EVP_aes_128_ccm(void) { return FIPS_evp_aes_128_ccm(); } +const EVP_CIPHER *EVP_aes_128_cfb1(void) { return FIPS_evp_aes_128_cfb1(); } +const EVP_CIPHER *EVP_aes_128_cfb128(void) { return FIPS_evp_aes_128_cfb128(); } +const EVP_CIPHER *EVP_aes_128_cfb8(void) { return FIPS_evp_aes_128_cfb8(); } +const EVP_CIPHER *EVP_aes_128_ctr(void) { return FIPS_evp_aes_128_ctr(); } +const EVP_CIPHER *EVP_aes_128_ecb(void) { return FIPS_evp_aes_128_ecb(); } +const EVP_CIPHER *EVP_aes_128_gcm(void) { return FIPS_evp_aes_128_gcm(); } +const EVP_CIPHER *EVP_aes_128_ofb(void) { return FIPS_evp_aes_128_ofb(); } +const EVP_CIPHER *EVP_aes_128_xts(void) { return FIPS_evp_aes_128_xts(); } +const EVP_CIPHER *EVP_aes_192_cbc(void) { return FIPS_evp_aes_192_cbc(); } +const EVP_CIPHER *EVP_aes_192_ccm(void) { return FIPS_evp_aes_192_ccm(); } +const EVP_CIPHER *EVP_aes_192_cfb1(void) { return FIPS_evp_aes_192_cfb1(); } +const EVP_CIPHER *EVP_aes_192_cfb128(void) { return FIPS_evp_aes_192_cfb128(); } +const EVP_CIPHER *EVP_aes_192_cfb8(void) { return FIPS_evp_aes_192_cfb8(); } +const EVP_CIPHER *EVP_aes_192_ctr(void) { return FIPS_evp_aes_192_ctr(); } +const EVP_CIPHER *EVP_aes_192_ecb(void) { return FIPS_evp_aes_192_ecb(); } +const EVP_CIPHER *EVP_aes_192_gcm(void) { return FIPS_evp_aes_192_gcm(); } +const EVP_CIPHER *EVP_aes_192_ofb(void) { return FIPS_evp_aes_192_ofb(); } +const EVP_CIPHER *EVP_aes_256_cbc(void) { return FIPS_evp_aes_256_cbc(); } +const EVP_CIPHER *EVP_aes_256_ccm(void) { return FIPS_evp_aes_256_ccm(); } +const EVP_CIPHER *EVP_aes_256_cfb1(void) { return FIPS_evp_aes_256_cfb1(); } +const EVP_CIPHER *EVP_aes_256_cfb128(void) { return FIPS_evp_aes_256_cfb128(); } +const EVP_CIPHER *EVP_aes_256_cfb8(void) { return FIPS_evp_aes_256_cfb8(); } +const EVP_CIPHER *EVP_aes_256_ctr(void) { return FIPS_evp_aes_256_ctr(); } +const EVP_CIPHER *EVP_aes_256_ecb(void) { return FIPS_evp_aes_256_ecb(); } +const EVP_CIPHER *EVP_aes_256_gcm(void) { return FIPS_evp_aes_256_gcm(); } +const EVP_CIPHER *EVP_aes_256_ofb(void) { return FIPS_evp_aes_256_ofb(); } +const EVP_CIPHER *EVP_aes_256_xts(void) { return FIPS_evp_aes_256_xts(); } +const EVP_CIPHER *EVP_des_ede(void) { return FIPS_evp_des_ede(); } +const EVP_CIPHER *EVP_des_ede3(void) { return FIPS_evp_des_ede3(); } +const EVP_CIPHER *EVP_des_ede3_cbc(void) { return FIPS_evp_des_ede3_cbc(); } +const EVP_CIPHER *EVP_des_ede3_cfb1(void) { return FIPS_evp_des_ede3_cfb1(); } +const EVP_CIPHER *EVP_des_ede3_cfb64(void) { return FIPS_evp_des_ede3_cfb64(); } +const EVP_CIPHER *EVP_des_ede3_cfb8(void) { return FIPS_evp_des_ede3_cfb8(); } +const EVP_CIPHER *EVP_des_ede3_ecb(void) { return FIPS_evp_des_ede3_ecb(); } +const EVP_CIPHER *EVP_des_ede3_ofb(void) { return FIPS_evp_des_ede3_ofb(); } +const EVP_CIPHER *EVP_des_ede_cbc(void) { return FIPS_evp_des_ede_cbc(); } +const EVP_CIPHER *EVP_des_ede_cfb64(void) { return FIPS_evp_des_ede_cfb64(); } +const EVP_CIPHER *EVP_des_ede_ecb(void) { return FIPS_evp_des_ede_ecb(); } +const EVP_CIPHER *EVP_des_ede_ofb(void) { return FIPS_evp_des_ede_ofb(); } +const EVP_CIPHER *EVP_enc_null(void) { return FIPS_evp_enc_null(); } + +const EVP_MD *EVP_sha1(void) { return FIPS_evp_sha1(); } +const EVP_MD *EVP_sha224(void) { return FIPS_evp_sha224(); } +const EVP_MD *EVP_sha256(void) { return FIPS_evp_sha256(); } +const EVP_MD *EVP_sha384(void) { return FIPS_evp_sha384(); } +const EVP_MD *EVP_sha512(void) { return FIPS_evp_sha512(); } + +const EVP_MD *EVP_dss(void) { return FIPS_evp_dss(); } +const EVP_MD *EVP_dss1(void) { return FIPS_evp_dss1(); } +const EVP_MD *EVP_ecdsa(void) { return FIPS_evp_ecdsa(); } + +#endif diff --git a/crypto/openssl/crypto/evp/evp_key.c b/crypto/openssl/crypto/evp/evp_key.c index 839d6a3a16..7961fbebf2 100644 --- a/crypto/openssl/crypto/evp/evp_key.c +++ b/crypto/openssl/crypto/evp/evp_key.c @@ -120,7 +120,7 @@ int EVP_BytesToKey(const EVP_CIPHER *type, const EVP_MD *md, unsigned char md_buf[EVP_MAX_MD_SIZE]; int niv,nkey,addmd=0; unsigned int mds=0,i; - + int rv = 0; nkey=type->key_len; niv=type->iv_len; OPENSSL_assert(nkey <= EVP_MAX_KEY_LENGTH); @@ -134,17 +134,24 @@ int EVP_BytesToKey(const EVP_CIPHER *type, const EVP_MD *md, if (!EVP_DigestInit_ex(&c,md, NULL)) return 0; if (addmd++) - EVP_DigestUpdate(&c,&(md_buf[0]),mds); - EVP_DigestUpdate(&c,data,datal); + if (!EVP_DigestUpdate(&c,&(md_buf[0]),mds)) + goto err; + if (!EVP_DigestUpdate(&c,data,datal)) + goto err; if (salt != NULL) - EVP_DigestUpdate(&c,salt,PKCS5_SALT_LEN); - EVP_DigestFinal_ex(&c,&(md_buf[0]),&mds); + if (!EVP_DigestUpdate(&c,salt,PKCS5_SALT_LEN)) + goto err; + if (!EVP_DigestFinal_ex(&c,&(md_buf[0]),&mds)) + goto err; for (i=1; i<(unsigned int)count; i++) { - EVP_DigestInit_ex(&c,md, NULL); - EVP_DigestUpdate(&c,&(md_buf[0]),mds); - EVP_DigestFinal_ex(&c,&(md_buf[0]),&mds); + if (!EVP_DigestInit_ex(&c,md, NULL)) + goto err; + if (!EVP_DigestUpdate(&c,&(md_buf[0]),mds)) + goto err; + if (!EVP_DigestFinal_ex(&c,&(md_buf[0]),&mds)) + goto err; } i=0; if (nkey) @@ -173,8 +180,10 @@ int EVP_BytesToKey(const EVP_CIPHER *type, const EVP_MD *md, } if ((nkey == 0) && (niv == 0)) break; } + rv = type->key_len; + err: EVP_MD_CTX_cleanup(&c); OPENSSL_cleanse(&(md_buf[0]),EVP_MAX_MD_SIZE); - return(type->key_len); + return rv; } diff --git a/crypto/openssl/crypto/evp/evp_lib.c b/crypto/openssl/crypto/evp/evp_lib.c index 40951a04f0..b180e4828a 100644 --- a/crypto/openssl/crypto/evp/evp_lib.c +++ b/crypto/openssl/crypto/evp/evp_lib.c @@ -67,6 +67,8 @@ int EVP_CIPHER_param_to_asn1(EVP_CIPHER_CTX *c, ASN1_TYPE *type) if (c->cipher->set_asn1_parameters != NULL) ret=c->cipher->set_asn1_parameters(c,type); + else if (c->cipher->flags & EVP_CIPH_FLAG_DEFAULT_ASN1) + ret=EVP_CIPHER_set_asn1_iv(c, type); else ret=-1; return(ret); @@ -78,6 +80,8 @@ int EVP_CIPHER_asn1_to_param(EVP_CIPHER_CTX *c, ASN1_TYPE *type) if (c->cipher->get_asn1_parameters != NULL) ret=c->cipher->get_asn1_parameters(c,type); + else if (c->cipher->flags & EVP_CIPH_FLAG_DEFAULT_ASN1) + ret=EVP_CIPHER_get_asn1_iv(c, type); else ret=-1; return(ret); diff --git a/crypto/openssl/crypto/evp/evp_locl.h b/crypto/openssl/crypto/evp/evp_locl.h index 292d74c188..08c0a66d39 100644 --- a/crypto/openssl/crypto/evp/evp_locl.h +++ b/crypto/openssl/crypto/evp/evp_locl.h @@ -343,3 +343,43 @@ struct evp_pkey_method_st } /* EVP_PKEY_METHOD */; void evp_pkey_set_cb_translate(BN_GENCB *cb, EVP_PKEY_CTX *ctx); + +int PKCS5_v2_PBKDF2_keyivgen(EVP_CIPHER_CTX *ctx, const char *pass, int passlen, + ASN1_TYPE *param, + const EVP_CIPHER *c, const EVP_MD *md, int en_de); + +#ifdef OPENSSL_FIPS + +#ifdef OPENSSL_DOING_MAKEDEPEND +#undef SHA1_Init +#undef SHA1_Update +#undef SHA224_Init +#undef SHA256_Init +#undef SHA384_Init +#undef SHA512_Init +#undef DES_set_key_unchecked +#endif + +#define RIPEMD160_Init private_RIPEMD160_Init +#define WHIRLPOOL_Init private_WHIRLPOOL_Init +#define MD5_Init private_MD5_Init +#define MD4_Init private_MD4_Init +#define MD2_Init private_MD2_Init +#define MDC2_Init private_MDC2_Init +#define SHA_Init private_SHA_Init +#define SHA1_Init private_SHA1_Init +#define SHA224_Init private_SHA224_Init +#define SHA256_Init private_SHA256_Init +#define SHA384_Init private_SHA384_Init +#define SHA512_Init private_SHA512_Init + +#define BF_set_key private_BF_set_key +#define CAST_set_key private_CAST_set_key +#define idea_set_encrypt_key private_idea_set_encrypt_key +#define SEED_set_key private_SEED_set_key +#define RC2_set_key private_RC2_set_key +#define RC4_set_key private_RC4_set_key +#define DES_set_key_unchecked private_DES_set_key_unchecked +#define Camellia_set_key private_Camellia_set_key + +#endif diff --git a/crypto/openssl/crypto/evp/evp_pbe.c b/crypto/openssl/crypto/evp/evp_pbe.c index c9d932d205..f8c32d825e 100644 --- a/crypto/openssl/crypto/evp/evp_pbe.c +++ b/crypto/openssl/crypto/evp/evp_pbe.c @@ -61,6 +61,7 @@ #include #include #include +#include "evp_locl.h" /* Password based encryption (PBE) functions */ @@ -87,6 +88,10 @@ static const EVP_PBE_CTL builtin_pbe[] = {EVP_PBE_TYPE_OUTER, NID_pbeWithSHA1AndRC2_CBC, NID_rc2_64_cbc, NID_sha1, PKCS5_PBE_keyivgen}, +#ifndef OPENSSL_NO_HMAC + {EVP_PBE_TYPE_OUTER, NID_id_pbkdf2, -1, -1, PKCS5_v2_PBKDF2_keyivgen}, +#endif + {EVP_PBE_TYPE_OUTER, NID_pbe_WithSHA1And128BitRC4, NID_rc4, NID_sha1, PKCS12_PBE_keyivgen}, {EVP_PBE_TYPE_OUTER, NID_pbe_WithSHA1And40BitRC4, diff --git a/crypto/openssl/crypto/evp/m_dss.c b/crypto/openssl/crypto/evp/m_dss.c index 48c2689504..4ad63ada6f 100644 --- a/crypto/openssl/crypto/evp/m_dss.c +++ b/crypto/openssl/crypto/evp/m_dss.c @@ -66,6 +66,7 @@ #endif #ifndef OPENSSL_NO_SHA +#ifndef OPENSSL_FIPS static int init(EVP_MD_CTX *ctx) { return SHA1_Init(ctx->md_data); } @@ -97,3 +98,4 @@ const EVP_MD *EVP_dss(void) return(&dsa_md); } #endif +#endif diff --git a/crypto/openssl/crypto/evp/m_dss1.c b/crypto/openssl/crypto/evp/m_dss1.c index 4f03fb70e0..f80170efeb 100644 --- a/crypto/openssl/crypto/evp/m_dss1.c +++ b/crypto/openssl/crypto/evp/m_dss1.c @@ -68,6 +68,8 @@ #include #endif +#ifndef OPENSSL_FIPS + static int init(EVP_MD_CTX *ctx) { return SHA1_Init(ctx->md_data); } @@ -98,3 +100,4 @@ const EVP_MD *EVP_dss1(void) return(&dss1_md); } #endif +#endif diff --git a/crypto/openssl/crypto/evp/m_ecdsa.c b/crypto/openssl/crypto/evp/m_ecdsa.c index 8d87a49ebe..4b15fb0f6c 100644 --- a/crypto/openssl/crypto/evp/m_ecdsa.c +++ b/crypto/openssl/crypto/evp/m_ecdsa.c @@ -116,6 +116,8 @@ #include #ifndef OPENSSL_NO_SHA +#ifndef OPENSSL_FIPS + static int init(EVP_MD_CTX *ctx) { return SHA1_Init(ctx->md_data); } @@ -146,3 +148,4 @@ const EVP_MD *EVP_ecdsa(void) return(&ecdsa_md); } #endif +#endif diff --git a/crypto/openssl/crypto/evp/m_md4.c b/crypto/openssl/crypto/evp/m_md4.c index 1e0b7c5b42..6d47f61b27 100644 --- a/crypto/openssl/crypto/evp/m_md4.c +++ b/crypto/openssl/crypto/evp/m_md4.c @@ -69,6 +69,8 @@ #include #endif +#include "evp_locl.h" + static int init(EVP_MD_CTX *ctx) { return MD4_Init(ctx->md_data); } diff --git a/crypto/openssl/crypto/evp/m_md5.c b/crypto/openssl/crypto/evp/m_md5.c index 63c142119e..9a8bae0258 100644 --- a/crypto/openssl/crypto/evp/m_md5.c +++ b/crypto/openssl/crypto/evp/m_md5.c @@ -68,6 +68,7 @@ #ifndef OPENSSL_NO_RSA #include #endif +#include "evp_locl.h" static int init(EVP_MD_CTX *ctx) { return MD5_Init(ctx->md_data); } diff --git a/crypto/openssl/crypto/evp/m_mdc2.c b/crypto/openssl/crypto/evp/m_mdc2.c index b08d559803..3602bed316 100644 --- a/crypto/openssl/crypto/evp/m_mdc2.c +++ b/crypto/openssl/crypto/evp/m_mdc2.c @@ -69,6 +69,8 @@ #include #endif +#include "evp_locl.h" + static int init(EVP_MD_CTX *ctx) { return MDC2_Init(ctx->md_data); } diff --git a/crypto/openssl/crypto/evp/m_ripemd.c b/crypto/openssl/crypto/evp/m_ripemd.c index a1d60ee78d..7bf4804cf8 100644 --- a/crypto/openssl/crypto/evp/m_ripemd.c +++ b/crypto/openssl/crypto/evp/m_ripemd.c @@ -68,6 +68,7 @@ #ifndef OPENSSL_NO_RSA #include #endif +#include "evp_locl.h" static int init(EVP_MD_CTX *ctx) { return RIPEMD160_Init(ctx->md_data); } diff --git a/crypto/openssl/crypto/evp/m_sha.c b/crypto/openssl/crypto/evp/m_sha.c index acccc8f92d..8769cdd42f 100644 --- a/crypto/openssl/crypto/evp/m_sha.c +++ b/crypto/openssl/crypto/evp/m_sha.c @@ -67,6 +67,7 @@ #ifndef OPENSSL_NO_RSA #include #endif +#include "evp_locl.h" static int init(EVP_MD_CTX *ctx) { return SHA_Init(ctx->md_data); } diff --git a/crypto/openssl/crypto/evp/m_sha1.c b/crypto/openssl/crypto/evp/m_sha1.c index 9a2790fdea..3cb11f1ebb 100644 --- a/crypto/openssl/crypto/evp/m_sha1.c +++ b/crypto/openssl/crypto/evp/m_sha1.c @@ -59,6 +59,8 @@ #include #include "cryptlib.h" +#ifndef OPENSSL_FIPS + #ifndef OPENSSL_NO_SHA #include @@ -68,6 +70,7 @@ #include #endif + static int init(EVP_MD_CTX *ctx) { return SHA1_Init(ctx->md_data); } @@ -202,3 +205,5 @@ static const EVP_MD sha512_md= const EVP_MD *EVP_sha512(void) { return(&sha512_md); } #endif /* ifndef OPENSSL_NO_SHA512 */ + +#endif diff --git a/crypto/openssl/crypto/evp/m_wp.c b/crypto/openssl/crypto/evp/m_wp.c index 1ce47c040b..c51bc2d5d1 100644 --- a/crypto/openssl/crypto/evp/m_wp.c +++ b/crypto/openssl/crypto/evp/m_wp.c @@ -9,6 +9,7 @@ #include #include #include +#include "evp_locl.h" static int init(EVP_MD_CTX *ctx) { return WHIRLPOOL_Init(ctx->md_data); } diff --git a/crypto/openssl/crypto/evp/names.c b/crypto/openssl/crypto/evp/names.c index f2869f5c78..6311ad7cfb 100644 --- a/crypto/openssl/crypto/evp/names.c +++ b/crypto/openssl/crypto/evp/names.c @@ -66,6 +66,10 @@ int EVP_add_cipher(const EVP_CIPHER *c) { int r; + if (c == NULL) return 0; + + OPENSSL_init(); + r=OBJ_NAME_add(OBJ_nid2sn(c->nid),OBJ_NAME_TYPE_CIPHER_METH,(const char *)c); if (r == 0) return(0); check_defer(c->nid); @@ -78,6 +82,7 @@ int EVP_add_digest(const EVP_MD *md) { int r; const char *name; + OPENSSL_init(); name=OBJ_nid2sn(md->type); r=OBJ_NAME_add(name,OBJ_NAME_TYPE_MD_METH,(const char *)md); diff --git a/crypto/openssl/crypto/evp/p5_crpt.c b/crypto/openssl/crypto/evp/p5_crpt.c index 7ecfa8dad9..7d9c1f0123 100644 --- a/crypto/openssl/crypto/evp/p5_crpt.c +++ b/crypto/openssl/crypto/evp/p5_crpt.c @@ -82,6 +82,8 @@ int PKCS5_PBE_keyivgen(EVP_CIPHER_CTX *cctx, const char *pass, int passlen, unsigned char *salt; const unsigned char *pbuf; int mdsize; + int rv = 0; + EVP_MD_CTX_init(&ctx); /* Extract useful info from parameter */ if (param == NULL || param->type != V_ASN1_SEQUENCE || @@ -104,29 +106,37 @@ int PKCS5_PBE_keyivgen(EVP_CIPHER_CTX *cctx, const char *pass, int passlen, if(!pass) passlen = 0; else if(passlen == -1) passlen = strlen(pass); - EVP_MD_CTX_init(&ctx); - EVP_DigestInit_ex(&ctx, md, NULL); - EVP_DigestUpdate(&ctx, pass, passlen); - EVP_DigestUpdate(&ctx, salt, saltlen); + if (!EVP_DigestInit_ex(&ctx, md, NULL)) + goto err; + if (!EVP_DigestUpdate(&ctx, pass, passlen)) + goto err; + if (!EVP_DigestUpdate(&ctx, salt, saltlen)) + goto err; PBEPARAM_free(pbe); - EVP_DigestFinal_ex(&ctx, md_tmp, NULL); + if (!EVP_DigestFinal_ex(&ctx, md_tmp, NULL)) + goto err; mdsize = EVP_MD_size(md); if (mdsize < 0) return 0; for (i = 1; i < iter; i++) { - EVP_DigestInit_ex(&ctx, md, NULL); - EVP_DigestUpdate(&ctx, md_tmp, mdsize); - EVP_DigestFinal_ex (&ctx, md_tmp, NULL); + if (!EVP_DigestInit_ex(&ctx, md, NULL)) + goto err; + if (!EVP_DigestUpdate(&ctx, md_tmp, mdsize)) + goto err; + if (!EVP_DigestFinal_ex (&ctx, md_tmp, NULL)) + goto err; } - EVP_MD_CTX_cleanup(&ctx); OPENSSL_assert(EVP_CIPHER_key_length(cipher) <= (int)sizeof(md_tmp)); memcpy(key, md_tmp, EVP_CIPHER_key_length(cipher)); OPENSSL_assert(EVP_CIPHER_iv_length(cipher) <= 16); memcpy(iv, md_tmp + (16 - EVP_CIPHER_iv_length(cipher)), EVP_CIPHER_iv_length(cipher)); - EVP_CipherInit_ex(cctx, cipher, NULL, key, iv, en_de); + if (!EVP_CipherInit_ex(cctx, cipher, NULL, key, iv, en_de)) + goto err; OPENSSL_cleanse(md_tmp, EVP_MAX_MD_SIZE); OPENSSL_cleanse(key, EVP_MAX_KEY_LENGTH); OPENSSL_cleanse(iv, EVP_MAX_IV_LENGTH); - return 1; + rv = 1; + err: + return rv; } diff --git a/crypto/openssl/crypto/evp/p5_crpt2.c b/crypto/openssl/crypto/evp/p5_crpt2.c index 334379f310..975d004df4 100644 --- a/crypto/openssl/crypto/evp/p5_crpt2.c +++ b/crypto/openssl/crypto/evp/p5_crpt2.c @@ -62,6 +62,7 @@ #include #include #include +#include "evp_locl.h" /* set this to print out info about the keygen algorithm */ /* #define DEBUG_PKCS5V2 */ @@ -110,10 +111,14 @@ int PKCS5_PBKDF2_HMAC(const char *pass, int passlen, itmp[1] = (unsigned char)((i >> 16) & 0xff); itmp[2] = (unsigned char)((i >> 8) & 0xff); itmp[3] = (unsigned char)(i & 0xff); - HMAC_Init_ex(&hctx, pass, passlen, digest, NULL); - HMAC_Update(&hctx, salt, saltlen); - HMAC_Update(&hctx, itmp, 4); - HMAC_Final(&hctx, digtmp, NULL); + if (!HMAC_Init_ex(&hctx, pass, passlen, digest, NULL) + || !HMAC_Update(&hctx, salt, saltlen) + || !HMAC_Update(&hctx, itmp, 4) + || !HMAC_Final(&hctx, digtmp, NULL)) + { + HMAC_CTX_cleanup(&hctx); + return 0; + } memcpy(p, digtmp, cplen); for(j = 1; j < iter; j++) { @@ -168,27 +173,24 @@ int PKCS5_v2_PBE_keyivgen(EVP_CIPHER_CTX *ctx, const char *pass, int passlen, ASN1_TYPE *param, const EVP_CIPHER *c, const EVP_MD *md, int en_de) { - unsigned char *salt, key[EVP_MAX_KEY_LENGTH]; const unsigned char *pbuf; - int saltlen, iter, plen; - unsigned int keylen; + int plen; PBE2PARAM *pbe2 = NULL; const EVP_CIPHER *cipher; - PBKDF2PARAM *kdf = NULL; - const EVP_MD *prfmd; - int prf_nid, hmac_md_nid; + + int rv = 0; if (param == NULL || param->type != V_ASN1_SEQUENCE || param->value.sequence == NULL) { EVPerr(EVP_F_PKCS5_V2_PBE_KEYIVGEN,EVP_R_DECODE_ERROR); - return 0; + goto err; } pbuf = param->value.sequence->data; plen = param->value.sequence->length; if(!(pbe2 = d2i_PBE2PARAM(NULL, &pbuf, plen))) { EVPerr(EVP_F_PKCS5_V2_PBE_KEYIVGEN,EVP_R_DECODE_ERROR); - return 0; + goto err; } /* See if we recognise the key derivation function */ @@ -211,38 +213,63 @@ int PKCS5_v2_PBE_keyivgen(EVP_CIPHER_CTX *ctx, const char *pass, int passlen, } /* Fixup cipher based on AlgorithmIdentifier */ - EVP_CipherInit_ex(ctx, cipher, NULL, NULL, NULL, en_de); + if (!EVP_CipherInit_ex(ctx, cipher, NULL, NULL, NULL, en_de)) + goto err; if(EVP_CIPHER_asn1_to_param(ctx, pbe2->encryption->parameter) < 0) { EVPerr(EVP_F_PKCS5_V2_PBE_KEYIVGEN, EVP_R_CIPHER_PARAMETER_ERROR); goto err; } + rv = PKCS5_v2_PBKDF2_keyivgen(ctx, pass, passlen, + pbe2->keyfunc->parameter, c, md, en_de); + err: + PBE2PARAM_free(pbe2); + return rv; +} + +int PKCS5_v2_PBKDF2_keyivgen(EVP_CIPHER_CTX *ctx, const char *pass, int passlen, + ASN1_TYPE *param, + const EVP_CIPHER *c, const EVP_MD *md, int en_de) +{ + unsigned char *salt, key[EVP_MAX_KEY_LENGTH]; + const unsigned char *pbuf; + int saltlen, iter, plen; + int rv = 0; + unsigned int keylen = 0; + int prf_nid, hmac_md_nid; + PBKDF2PARAM *kdf = NULL; + const EVP_MD *prfmd; + + if (EVP_CIPHER_CTX_cipher(ctx) == NULL) + { + EVPerr(EVP_F_PKCS5_V2_PBKDF2_KEYIVGEN,EVP_R_NO_CIPHER_SET); + goto err; + } keylen = EVP_CIPHER_CTX_key_length(ctx); OPENSSL_assert(keylen <= sizeof key); - /* Now decode key derivation function */ + /* Decode parameter */ - if(!pbe2->keyfunc->parameter || - (pbe2->keyfunc->parameter->type != V_ASN1_SEQUENCE)) + if(!param || (param->type != V_ASN1_SEQUENCE)) { - EVPerr(EVP_F_PKCS5_V2_PBE_KEYIVGEN,EVP_R_DECODE_ERROR); + EVPerr(EVP_F_PKCS5_V2_PBKDF2_KEYIVGEN,EVP_R_DECODE_ERROR); goto err; } - pbuf = pbe2->keyfunc->parameter->value.sequence->data; - plen = pbe2->keyfunc->parameter->value.sequence->length; + pbuf = param->value.sequence->data; + plen = param->value.sequence->length; + if(!(kdf = d2i_PBKDF2PARAM(NULL, &pbuf, plen)) ) { - EVPerr(EVP_F_PKCS5_V2_PBE_KEYIVGEN,EVP_R_DECODE_ERROR); + EVPerr(EVP_F_PKCS5_V2_PBKDF2_KEYIVGEN,EVP_R_DECODE_ERROR); goto err; } - PBE2PARAM_free(pbe2); - pbe2 = NULL; + keylen = EVP_CIPHER_CTX_key_length(ctx); /* Now check the parameters of the kdf */ if(kdf->keylength && (ASN1_INTEGER_get(kdf->keylength) != (int)keylen)){ - EVPerr(EVP_F_PKCS5_V2_PBE_KEYIVGEN, + EVPerr(EVP_F_PKCS5_V2_PBKDF2_KEYIVGEN, EVP_R_UNSUPPORTED_KEYLENGTH); goto err; } @@ -254,19 +281,19 @@ int PKCS5_v2_PBE_keyivgen(EVP_CIPHER_CTX *ctx, const char *pass, int passlen, if (!EVP_PBE_find(EVP_PBE_TYPE_PRF, prf_nid, NULL, &hmac_md_nid, 0)) { - EVPerr(EVP_F_PKCS5_V2_PBE_KEYIVGEN, EVP_R_UNSUPPORTED_PRF); + EVPerr(EVP_F_PKCS5_V2_PBKDF2_KEYIVGEN, EVP_R_UNSUPPORTED_PRF); goto err; } prfmd = EVP_get_digestbynid(hmac_md_nid); if (prfmd == NULL) { - EVPerr(EVP_F_PKCS5_V2_PBE_KEYIVGEN, EVP_R_UNSUPPORTED_PRF); + EVPerr(EVP_F_PKCS5_V2_PBKDF2_KEYIVGEN, EVP_R_UNSUPPORTED_PRF); goto err; } if(kdf->salt->type != V_ASN1_OCTET_STRING) { - EVPerr(EVP_F_PKCS5_V2_PBE_KEYIVGEN, + EVPerr(EVP_F_PKCS5_V2_PBKDF2_KEYIVGEN, EVP_R_UNSUPPORTED_SALT_TYPE); goto err; } @@ -278,15 +305,11 @@ int PKCS5_v2_PBE_keyivgen(EVP_CIPHER_CTX *ctx, const char *pass, int passlen, if(!PKCS5_PBKDF2_HMAC(pass, passlen, salt, saltlen, iter, prfmd, keylen, key)) goto err; - EVP_CipherInit_ex(ctx, NULL, NULL, key, NULL, en_de); - OPENSSL_cleanse(key, keylen); - PBKDF2PARAM_free(kdf); - return 1; - + rv = EVP_CipherInit_ex(ctx, NULL, NULL, key, NULL, en_de); err: - PBE2PARAM_free(pbe2); + OPENSSL_cleanse(key, keylen); PBKDF2PARAM_free(kdf); - return 0; + return rv; } #ifdef DEBUG_PKCS5V2 diff --git a/crypto/openssl/crypto/evp/p_open.c b/crypto/openssl/crypto/evp/p_open.c index 53a59a295c..c748fbea87 100644 --- a/crypto/openssl/crypto/evp/p_open.c +++ b/crypto/openssl/crypto/evp/p_open.c @@ -115,7 +115,8 @@ int EVP_OpenFinal(EVP_CIPHER_CTX *ctx, unsigned char *out, int *outl) int i; i=EVP_DecryptFinal_ex(ctx,out,outl); - EVP_DecryptInit_ex(ctx,NULL,NULL,NULL,NULL); + if (i) + i = EVP_DecryptInit_ex(ctx,NULL,NULL,NULL,NULL); return(i); } #else /* !OPENSSL_NO_RSA */ diff --git a/crypto/openssl/crypto/evp/p_seal.c b/crypto/openssl/crypto/evp/p_seal.c index d8324526e7..e5919b0fbf 100644 --- a/crypto/openssl/crypto/evp/p_seal.c +++ b/crypto/openssl/crypto/evp/p_seal.c @@ -110,6 +110,7 @@ int EVP_SealFinal(EVP_CIPHER_CTX *ctx, unsigned char *out, int *outl) { int i; i = EVP_EncryptFinal_ex(ctx,out,outl); - EVP_EncryptInit_ex(ctx,NULL,NULL,NULL,NULL); + if (i) + i = EVP_EncryptInit_ex(ctx,NULL,NULL,NULL,NULL); return i; } diff --git a/crypto/openssl/crypto/evp/p_sign.c b/crypto/openssl/crypto/evp/p_sign.c index bb893f5bde..dfa48c157c 100644 --- a/crypto/openssl/crypto/evp/p_sign.c +++ b/crypto/openssl/crypto/evp/p_sign.c @@ -80,18 +80,20 @@ int EVP_SignFinal(EVP_MD_CTX *ctx, unsigned char *sigret, unsigned int *siglen, { unsigned char m[EVP_MAX_MD_SIZE]; unsigned int m_len; - int i,ok=0,v; + int i=0,ok=0,v; EVP_MD_CTX tmp_ctx; + EVP_PKEY_CTX *pkctx = NULL; *siglen=0; EVP_MD_CTX_init(&tmp_ctx); - EVP_MD_CTX_copy_ex(&tmp_ctx,ctx); - EVP_DigestFinal_ex(&tmp_ctx,&(m[0]),&m_len); + if (!EVP_MD_CTX_copy_ex(&tmp_ctx,ctx)) + goto err; + if (!EVP_DigestFinal_ex(&tmp_ctx,&(m[0]),&m_len)) + goto err; EVP_MD_CTX_cleanup(&tmp_ctx); if (ctx->digest->flags & EVP_MD_FLAG_PKEY_METHOD_SIGNATURE) { - EVP_PKEY_CTX *pkctx = NULL; size_t sltmp = (size_t)EVP_PKEY_size(pkey); i = 0; pkctx = EVP_PKEY_CTX_new(pkey, NULL); diff --git a/crypto/openssl/crypto/evp/p_verify.c b/crypto/openssl/crypto/evp/p_verify.c index 41d4b67130..5f5c409f45 100644 --- a/crypto/openssl/crypto/evp/p_verify.c +++ b/crypto/openssl/crypto/evp/p_verify.c @@ -67,17 +67,19 @@ int EVP_VerifyFinal(EVP_MD_CTX *ctx, const unsigned char *sigbuf, { unsigned char m[EVP_MAX_MD_SIZE]; unsigned int m_len; - int i,ok=0,v; + int i=-1,ok=0,v; EVP_MD_CTX tmp_ctx; + EVP_PKEY_CTX *pkctx = NULL; EVP_MD_CTX_init(&tmp_ctx); - EVP_MD_CTX_copy_ex(&tmp_ctx,ctx); - EVP_DigestFinal_ex(&tmp_ctx,&(m[0]),&m_len); + if (!EVP_MD_CTX_copy_ex(&tmp_ctx,ctx)) + goto err; + if (!EVP_DigestFinal_ex(&tmp_ctx,&(m[0]),&m_len)) + goto err; EVP_MD_CTX_cleanup(&tmp_ctx); if (ctx->digest->flags & EVP_MD_FLAG_PKEY_METHOD_SIGNATURE) { - EVP_PKEY_CTX *pkctx = NULL; i = -1; pkctx = EVP_PKEY_CTX_new(pkey, NULL); if (!pkctx) diff --git a/crypto/openssl/crypto/evp/pmeth_gn.c b/crypto/openssl/crypto/evp/pmeth_gn.c index 5d74161a09..4651c81370 100644 --- a/crypto/openssl/crypto/evp/pmeth_gn.c +++ b/crypto/openssl/crypto/evp/pmeth_gn.c @@ -199,7 +199,7 @@ int EVP_PKEY_CTX_get_keygen_info(EVP_PKEY_CTX *ctx, int idx) } EVP_PKEY *EVP_PKEY_new_mac_key(int type, ENGINE *e, - unsigned char *key, int keylen) + const unsigned char *key, int keylen) { EVP_PKEY_CTX *mac_ctx = NULL; EVP_PKEY *mac_key = NULL; @@ -209,7 +209,8 @@ EVP_PKEY *EVP_PKEY_new_mac_key(int type, ENGINE *e, if (EVP_PKEY_keygen_init(mac_ctx) <= 0) goto merr; if (EVP_PKEY_CTX_ctrl(mac_ctx, -1, EVP_PKEY_OP_KEYGEN, - EVP_PKEY_CTRL_SET_MAC_KEY, keylen, key) <= 0) + EVP_PKEY_CTRL_SET_MAC_KEY, + keylen, (void *)key) <= 0) goto merr; if (EVP_PKEY_keygen(mac_ctx, &mac_key) <= 0) goto merr; diff --git a/crypto/openssl/crypto/evp/pmeth_lib.c b/crypto/openssl/crypto/evp/pmeth_lib.c index 5481d4b8a5..acfa7b6f87 100644 --- a/crypto/openssl/crypto/evp/pmeth_lib.c +++ b/crypto/openssl/crypto/evp/pmeth_lib.c @@ -73,7 +73,7 @@ DECLARE_STACK_OF(EVP_PKEY_METHOD) STACK_OF(EVP_PKEY_METHOD) *app_pkey_methods = NULL; extern const EVP_PKEY_METHOD rsa_pkey_meth, dh_pkey_meth, dsa_pkey_meth; -extern const EVP_PKEY_METHOD ec_pkey_meth, hmac_pkey_meth; +extern const EVP_PKEY_METHOD ec_pkey_meth, hmac_pkey_meth, cmac_pkey_meth; static const EVP_PKEY_METHOD *standard_methods[] = { @@ -90,6 +90,7 @@ static const EVP_PKEY_METHOD *standard_methods[] = &ec_pkey_meth, #endif &hmac_pkey_meth, + &cmac_pkey_meth }; DECLARE_OBJ_BSEARCH_CMP_FN(const EVP_PKEY_METHOD *, const EVP_PKEY_METHOD *, @@ -203,6 +204,8 @@ EVP_PKEY_METHOD* EVP_PKEY_meth_new(int id, int flags) if (!pmeth) return NULL; + memset(pmeth, 0, sizeof(EVP_PKEY_METHOD)); + pmeth->pkey_id = id; pmeth->flags = flags | EVP_PKEY_FLAG_DYNAMIC; @@ -235,6 +238,56 @@ EVP_PKEY_METHOD* EVP_PKEY_meth_new(int id, int flags) return pmeth; } +void EVP_PKEY_meth_get0_info(int *ppkey_id, int *pflags, + const EVP_PKEY_METHOD *meth) + { + if (ppkey_id) + *ppkey_id = meth->pkey_id; + if (pflags) + *pflags = meth->flags; + } + +void EVP_PKEY_meth_copy(EVP_PKEY_METHOD *dst, const EVP_PKEY_METHOD *src) + { + + dst->init = src->init; + dst->copy = src->copy; + dst->cleanup = src->cleanup; + + dst->paramgen_init = src->paramgen_init; + dst->paramgen = src->paramgen; + + dst->keygen_init = src->keygen_init; + dst->keygen = src->keygen; + + dst->sign_init = src->sign_init; + dst->sign = src->sign; + + dst->verify_init = src->verify_init; + dst->verify = src->verify; + + dst->verify_recover_init = src->verify_recover_init; + dst->verify_recover = src->verify_recover; + + dst->signctx_init = src->signctx_init; + dst->signctx = src->signctx; + + dst->verifyctx_init = src->verifyctx_init; + dst->verifyctx = src->verifyctx; + + dst->encrypt_init = src->encrypt_init; + dst->encrypt = src->encrypt; + + dst->decrypt_init = src->decrypt_init; + dst->decrypt = src->decrypt; + + dst->derive_init = src->derive_init; + dst->derive = src->derive; + + dst->ctrl = src->ctrl; + dst->ctrl_str = src->ctrl_str; + } + void EVP_PKEY_meth_free(EVP_PKEY_METHOD *pmeth) { if (pmeth && (pmeth->flags & EVP_PKEY_FLAG_DYNAMIC)) diff --git a/crypto/openssl/crypto/fips_ers.c b/crypto/openssl/crypto/fips_ers.c new file mode 100644 index 0000000000..09f11748f6 --- /dev/null +++ b/crypto/openssl/crypto/fips_ers.c @@ -0,0 +1,7 @@ +#include + +#ifdef OPENSSL_FIPS +# include "fips_err.h" +#else +static void *dummy=&dummy; +#endif diff --git a/crypto/openssl/crypto/hmac/hm_ameth.c b/crypto/openssl/crypto/hmac/hm_ameth.c index 6d8a89149e..e03f24aeda 100644 --- a/crypto/openssl/crypto/hmac/hm_ameth.c +++ b/crypto/openssl/crypto/hmac/hm_ameth.c @@ -153,7 +153,7 @@ const EVP_PKEY_ASN1_METHOD hmac_asn1_meth = hmac_size, 0, - 0,0,0,0,0,0, + 0,0,0,0,0,0,0, hmac_key_free, hmac_pkey_ctrl, diff --git a/crypto/openssl/crypto/hmac/hm_pmeth.c b/crypto/openssl/crypto/hmac/hm_pmeth.c index 71e8567a14..0daa44511d 100644 --- a/crypto/openssl/crypto/hmac/hm_pmeth.c +++ b/crypto/openssl/crypto/hmac/hm_pmeth.c @@ -100,7 +100,8 @@ static int pkey_hmac_copy(EVP_PKEY_CTX *dst, EVP_PKEY_CTX *src) dctx = dst->data; dctx->md = sctx->md; HMAC_CTX_init(&dctx->ctx); - HMAC_CTX_copy(&dctx->ctx, &sctx->ctx); + if (!HMAC_CTX_copy(&dctx->ctx, &sctx->ctx)) + return 0; if (sctx->ktmp.data) { if (!ASN1_OCTET_STRING_set(&dctx->ktmp, @@ -141,7 +142,8 @@ static int pkey_hmac_keygen(EVP_PKEY_CTX *ctx, EVP_PKEY *pkey) static int int_update(EVP_MD_CTX *ctx,const void *data,size_t count) { HMAC_PKEY_CTX *hctx = ctx->pctx->data; - HMAC_Update(&hctx->ctx, data, count); + if (!HMAC_Update(&hctx->ctx, data, count)) + return 0; return 1; } @@ -167,7 +169,8 @@ static int hmac_signctx(EVP_PKEY_CTX *ctx, unsigned char *sig, size_t *siglen, if (!sig) return 1; - HMAC_Final(&hctx->ctx, sig, &hlen); + if (!HMAC_Final(&hctx->ctx, sig, &hlen)) + return 0; *siglen = (size_t)hlen; return 1; } @@ -192,8 +195,9 @@ static int pkey_hmac_ctrl(EVP_PKEY_CTX *ctx, int type, int p1, void *p2) case EVP_PKEY_CTRL_DIGESTINIT: key = (ASN1_OCTET_STRING *)ctx->pkey->pkey.ptr; - HMAC_Init_ex(&hctx->ctx, key->data, key->length, hctx->md, - ctx->engine); + if (!HMAC_Init_ex(&hctx->ctx, key->data, key->length, hctx->md, + ctx->engine)) + return 0; break; default: diff --git a/crypto/openssl/crypto/hmac/hmac.c b/crypto/openssl/crypto/hmac/hmac.c index 6c98fc43a3..ba27cbf56f 100644 --- a/crypto/openssl/crypto/hmac/hmac.c +++ b/crypto/openssl/crypto/hmac/hmac.c @@ -61,12 +61,34 @@ #include "cryptlib.h" #include +#ifdef OPENSSL_FIPS +#include +#endif + int HMAC_Init_ex(HMAC_CTX *ctx, const void *key, int len, const EVP_MD *md, ENGINE *impl) { int i,j,reset=0; unsigned char pad[HMAC_MAX_MD_CBLOCK]; +#ifdef OPENSSL_FIPS + if (FIPS_mode()) + { + /* If we have an ENGINE need to allow non FIPS */ + if ((impl || ctx->i_ctx.engine) + && !(ctx->i_ctx.flags & EVP_CIPH_FLAG_NON_FIPS_ALLOW)) + { + EVPerr(EVP_F_HMAC_INIT_EX, EVP_R_DISABLED_FOR_FIPS); + return 0; + } + /* Other algorithm blocking will be done in FIPS_cmac_init, + * via FIPS_hmac_init_ex(). + */ + if (!impl && !ctx->i_ctx.engine) + return FIPS_hmac_init_ex(ctx, key, len, md, NULL); + } +#endif + if (md != NULL) { reset=1; @@ -133,6 +155,10 @@ int HMAC_Init(HMAC_CTX *ctx, const void *key, int len, const EVP_MD *md) int HMAC_Update(HMAC_CTX *ctx, const unsigned char *data, size_t len) { +#ifdef OPENSSL_FIPS + if (FIPS_mode() && !ctx->i_ctx.engine) + return FIPS_hmac_update(ctx, data, len); +#endif return EVP_DigestUpdate(&ctx->md_ctx,data,len); } @@ -140,6 +166,10 @@ int HMAC_Final(HMAC_CTX *ctx, unsigned char *md, unsigned int *len) { unsigned int i; unsigned char buf[EVP_MAX_MD_SIZE]; +#ifdef OPENSSL_FIPS + if (FIPS_mode() && !ctx->i_ctx.engine) + return FIPS_hmac_final(ctx, md, len); +#endif if (!EVP_DigestFinal_ex(&ctx->md_ctx,buf,&i)) goto err; @@ -179,6 +209,13 @@ int HMAC_CTX_copy(HMAC_CTX *dctx, HMAC_CTX *sctx) void HMAC_CTX_cleanup(HMAC_CTX *ctx) { +#ifdef OPENSSL_FIPS + if (FIPS_mode() && !ctx->i_ctx.engine) + { + FIPS_hmac_ctx_cleanup(ctx); + return; + } +#endif EVP_MD_CTX_cleanup(&ctx->i_ctx); EVP_MD_CTX_cleanup(&ctx->o_ctx); EVP_MD_CTX_cleanup(&ctx->md_ctx); diff --git a/crypto/openssl/crypto/idea/i_skey.c b/crypto/openssl/crypto/idea/i_skey.c index 1c95bc9c7b..afb830964d 100644 --- a/crypto/openssl/crypto/idea/i_skey.c +++ b/crypto/openssl/crypto/idea/i_skey.c @@ -56,11 +56,19 @@ * [including the GNU Public Licence.] */ +#include #include #include "idea_lcl.h" static IDEA_INT inverse(unsigned int xin); void idea_set_encrypt_key(const unsigned char *key, IDEA_KEY_SCHEDULE *ks) +#ifdef OPENSSL_FIPS + { + fips_cipher_abort(IDEA); + private_idea_set_encrypt_key(key, ks); + } +void private_idea_set_encrypt_key(const unsigned char *key, IDEA_KEY_SCHEDULE *ks) +#endif { int i; register IDEA_INT *kt,*kf,r0,r1,r2; diff --git a/crypto/openssl/crypto/idea/idea.h b/crypto/openssl/crypto/idea/idea.h index 5782e54b0f..e9a1e7f1a5 100644 --- a/crypto/openssl/crypto/idea/idea.h +++ b/crypto/openssl/crypto/idea/idea.h @@ -83,6 +83,9 @@ typedef struct idea_key_st const char *idea_options(void); void idea_ecb_encrypt(const unsigned char *in, unsigned char *out, IDEA_KEY_SCHEDULE *ks); +#ifdef OPENSSL_FIPS +void private_idea_set_encrypt_key(const unsigned char *key, IDEA_KEY_SCHEDULE *ks); +#endif void idea_set_encrypt_key(const unsigned char *key, IDEA_KEY_SCHEDULE *ks); void idea_set_decrypt_key(IDEA_KEY_SCHEDULE *ek, IDEA_KEY_SCHEDULE *dk); void idea_cbc_encrypt(const unsigned char *in, unsigned char *out, diff --git a/crypto/openssl/crypto/md4/md4.h b/crypto/openssl/crypto/md4/md4.h index c3ed9b3f75..a55368a790 100644 --- a/crypto/openssl/crypto/md4/md4.h +++ b/crypto/openssl/crypto/md4/md4.h @@ -105,6 +105,9 @@ typedef struct MD4state_st unsigned int num; } MD4_CTX; +#ifdef OPENSSL_FIPS +int private_MD4_Init(MD4_CTX *c); +#endif int MD4_Init(MD4_CTX *c); int MD4_Update(MD4_CTX *c, const void *data, size_t len); int MD4_Final(unsigned char *md, MD4_CTX *c); diff --git a/crypto/openssl/crypto/md4/md4_dgst.c b/crypto/openssl/crypto/md4/md4_dgst.c index e0c42e8596..82c2cb2d98 100644 --- a/crypto/openssl/crypto/md4/md4_dgst.c +++ b/crypto/openssl/crypto/md4/md4_dgst.c @@ -57,8 +57,9 @@ */ #include -#include "md4_locl.h" #include +#include +#include "md4_locl.h" const char MD4_version[]="MD4" OPENSSL_VERSION_PTEXT; @@ -70,7 +71,7 @@ const char MD4_version[]="MD4" OPENSSL_VERSION_PTEXT; #define INIT_DATA_C (unsigned long)0x98badcfeL #define INIT_DATA_D (unsigned long)0x10325476L -int MD4_Init(MD4_CTX *c) +fips_md_init(MD4) { memset (c,0,sizeof(*c)); c->A=INIT_DATA_A; diff --git a/crypto/openssl/crypto/md5/md5.h b/crypto/openssl/crypto/md5/md5.h index 4cbf84386b..541cc925fe 100644 --- a/crypto/openssl/crypto/md5/md5.h +++ b/crypto/openssl/crypto/md5/md5.h @@ -105,6 +105,9 @@ typedef struct MD5state_st unsigned int num; } MD5_CTX; +#ifdef OPENSSL_FIPS +int private_MD5_Init(MD5_CTX *c); +#endif int MD5_Init(MD5_CTX *c); int MD5_Update(MD5_CTX *c, const void *data, size_t len); int MD5_Final(unsigned char *md, MD5_CTX *c); diff --git a/crypto/openssl/crypto/md5/md5_dgst.c b/crypto/openssl/crypto/md5/md5_dgst.c index beace632e3..265890de52 100644 --- a/crypto/openssl/crypto/md5/md5_dgst.c +++ b/crypto/openssl/crypto/md5/md5_dgst.c @@ -59,6 +59,7 @@ #include #include "md5_locl.h" #include +#include const char MD5_version[]="MD5" OPENSSL_VERSION_PTEXT; @@ -70,7 +71,7 @@ const char MD5_version[]="MD5" OPENSSL_VERSION_PTEXT; #define INIT_DATA_C (unsigned long)0x98badcfeL #define INIT_DATA_D (unsigned long)0x10325476L -int MD5_Init(MD5_CTX *c) +fips_md_init(MD5) { memset (c,0,sizeof(*c)); c->A=INIT_DATA_A; diff --git a/crypto/openssl/crypto/mdc2/mdc2.h b/crypto/openssl/crypto/mdc2/mdc2.h index 72778a5212..f3e8e579d2 100644 --- a/crypto/openssl/crypto/mdc2/mdc2.h +++ b/crypto/openssl/crypto/mdc2/mdc2.h @@ -81,6 +81,9 @@ typedef struct mdc2_ctx_st } MDC2_CTX; +#ifdef OPENSSL_FIPS +int private_MDC2_Init(MDC2_CTX *c); +#endif int MDC2_Init(MDC2_CTX *c); int MDC2_Update(MDC2_CTX *c, const unsigned char *data, size_t len); int MDC2_Final(unsigned char *md, MDC2_CTX *c); diff --git a/crypto/openssl/crypto/mdc2/mdc2dgst.c b/crypto/openssl/crypto/mdc2/mdc2dgst.c index 4aa406edc3..b74bb1a759 100644 --- a/crypto/openssl/crypto/mdc2/mdc2dgst.c +++ b/crypto/openssl/crypto/mdc2/mdc2dgst.c @@ -61,6 +61,7 @@ #include #include #include +#include #undef c2l #define c2l(c,l) (l =((DES_LONG)(*((c)++))) , \ @@ -75,7 +76,7 @@ *((c)++)=(unsigned char)(((l)>>24L)&0xff)) static void mdc2_body(MDC2_CTX *c, const unsigned char *in, size_t len); -int MDC2_Init(MDC2_CTX *c) +fips_md_init(MDC2) { c->num=0; c->pad_type=1; diff --git a/crypto/openssl/crypto/mem.c b/crypto/openssl/crypto/mem.c index 6f80dd33eb..8f736c3b1f 100644 --- a/crypto/openssl/crypto/mem.c +++ b/crypto/openssl/crypto/mem.c @@ -125,6 +125,7 @@ static long (*get_debug_options_func)(void) = NULL; int CRYPTO_set_mem_functions(void *(*m)(size_t), void *(*r)(void *, size_t), void (*f)(void *)) { + OPENSSL_init(); if (!allow_customize) return 0; if ((m == 0) || (r == 0) || (f == 0)) @@ -186,6 +187,7 @@ int CRYPTO_set_mem_debug_functions(void (*m)(void *,int,const char *,int,int), { if (!allow_customize_debug) return 0; + OPENSSL_init(); malloc_debug_func=m; realloc_debug_func=r; free_debug_func=f; diff --git a/crypto/openssl/crypto/modes/asm/ghash-x86.pl b/crypto/openssl/crypto/modes/asm/ghash-x86.pl new file mode 100644 index 0000000000..6b09669d47 --- /dev/null +++ b/crypto/openssl/crypto/modes/asm/ghash-x86.pl @@ -0,0 +1,1342 @@ +#!/usr/bin/env perl +# +# ==================================================================== +# Written by Andy Polyakov for the OpenSSL +# project. The module is, however, dual licensed under OpenSSL and +# CRYPTOGAMS licenses depending on where you obtain it. For further +# details see http://www.openssl.org/~appro/cryptogams/. +# ==================================================================== +# +# March, May, June 2010 +# +# The module implements "4-bit" GCM GHASH function and underlying +# single multiplication operation in GF(2^128). "4-bit" means that it +# uses 256 bytes per-key table [+64/128 bytes fixed table]. It has two +# code paths: vanilla x86 and vanilla MMX. Former will be executed on +# 486 and Pentium, latter on all others. MMX GHASH features so called +# "528B" variant of "4-bit" method utilizing additional 256+16 bytes +# of per-key storage [+512 bytes shared table]. Performance results +# are for streamed GHASH subroutine and are expressed in cycles per +# processed byte, less is better: +# +# gcc 2.95.3(*) MMX assembler x86 assembler +# +# Pentium 105/111(**) - 50 +# PIII 68 /75 12.2 24 +# P4 125/125 17.8 84(***) +# Opteron 66 /70 10.1 30 +# Core2 54 /67 8.4 18 +# +# (*) gcc 3.4.x was observed to generate few percent slower code, +# which is one of reasons why 2.95.3 results were chosen, +# another reason is lack of 3.4.x results for older CPUs; +# comparison with MMX results is not completely fair, because C +# results are for vanilla "256B" implementation, while +# assembler results are for "528B";-) +# (**) second number is result for code compiled with -fPIC flag, +# which is actually more relevant, because assembler code is +# position-independent; +# (***) see comment in non-MMX routine for further details; +# +# To summarize, it's >2-5 times faster than gcc-generated code. To +# anchor it to something else SHA1 assembler processes one byte in +# 11-13 cycles on contemporary x86 cores. As for choice of MMX in +# particular, see comment at the end of the file... + +# May 2010 +# +# Add PCLMULQDQ version performing at 2.10 cycles per processed byte. +# The question is how close is it to theoretical limit? The pclmulqdq +# instruction latency appears to be 14 cycles and there can't be more +# than 2 of them executing at any given time. This means that single +# Karatsuba multiplication would take 28 cycles *plus* few cycles for +# pre- and post-processing. Then multiplication has to be followed by +# modulo-reduction. Given that aggregated reduction method [see +# "Carry-less Multiplication and Its Usage for Computing the GCM Mode" +# white paper by Intel] allows you to perform reduction only once in +# a while we can assume that asymptotic performance can be estimated +# as (28+Tmod/Naggr)/16, where Tmod is time to perform reduction +# and Naggr is the aggregation factor. +# +# Before we proceed to this implementation let's have closer look at +# the best-performing code suggested by Intel in their white paper. +# By tracing inter-register dependencies Tmod is estimated as ~19 +# cycles and Naggr chosen by Intel is 4, resulting in 2.05 cycles per +# processed byte. As implied, this is quite optimistic estimate, +# because it does not account for Karatsuba pre- and post-processing, +# which for a single multiplication is ~5 cycles. Unfortunately Intel +# does not provide performance data for GHASH alone. But benchmarking +# AES_GCM_encrypt ripped out of Fig. 15 of the white paper with aadt +# alone resulted in 2.46 cycles per byte of out 16KB buffer. Note that +# the result accounts even for pre-computing of degrees of the hash +# key H, but its portion is negligible at 16KB buffer size. +# +# Moving on to the implementation in question. Tmod is estimated as +# ~13 cycles and Naggr is 2, giving asymptotic performance of ... +# 2.16. How is it possible that measured performance is better than +# optimistic theoretical estimate? There is one thing Intel failed +# to recognize. By serializing GHASH with CTR in same subroutine +# former's performance is really limited to above (Tmul + Tmod/Naggr) +# equation. But if GHASH procedure is detached, the modulo-reduction +# can be interleaved with Naggr-1 multiplications at instruction level +# and under ideal conditions even disappear from the equation. So that +# optimistic theoretical estimate for this implementation is ... +# 28/16=1.75, and not 2.16. Well, it's probably way too optimistic, +# at least for such small Naggr. I'd argue that (28+Tproc/Naggr), +# where Tproc is time required for Karatsuba pre- and post-processing, +# is more realistic estimate. In this case it gives ... 1.91 cycles. +# Or in other words, depending on how well we can interleave reduction +# and one of the two multiplications the performance should be betwen +# 1.91 and 2.16. As already mentioned, this implementation processes +# one byte out of 8KB buffer in 2.10 cycles, while x86_64 counterpart +# - in 2.02. x86_64 performance is better, because larger register +# bank allows to interleave reduction and multiplication better. +# +# Does it make sense to increase Naggr? To start with it's virtually +# impossible in 32-bit mode, because of limited register bank +# capacity. Otherwise improvement has to be weighed agiainst slower +# setup, as well as code size and complexity increase. As even +# optimistic estimate doesn't promise 30% performance improvement, +# there are currently no plans to increase Naggr. +# +# Special thanks to David Woodhouse for +# providing access to a Westmere-based system on behalf of Intel +# Open Source Technology Centre. + +# January 2010 +# +# Tweaked to optimize transitions between integer and FP operations +# on same XMM register, PCLMULQDQ subroutine was measured to process +# one byte in 2.07 cycles on Sandy Bridge, and in 2.12 - on Westmere. +# The minor regression on Westmere is outweighed by ~15% improvement +# on Sandy Bridge. Strangely enough attempt to modify 64-bit code in +# similar manner resulted in almost 20% degradation on Sandy Bridge, +# where original 64-bit code processes one byte in 1.95 cycles. + +$0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1; +push(@INC,"${dir}","${dir}../../perlasm"); +require "x86asm.pl"; + +&asm_init($ARGV[0],"ghash-x86.pl",$x86only = $ARGV[$#ARGV] eq "386"); + +$sse2=0; +for (@ARGV) { $sse2=1 if (/-DOPENSSL_IA32_SSE2/); } + +($Zhh,$Zhl,$Zlh,$Zll) = ("ebp","edx","ecx","ebx"); +$inp = "edi"; +$Htbl = "esi"; + +$unroll = 0; # Affects x86 loop. Folded loop performs ~7% worse + # than unrolled, which has to be weighted against + # 2.5x x86-specific code size reduction. + +sub x86_loop { + my $off = shift; + my $rem = "eax"; + + &mov ($Zhh,&DWP(4,$Htbl,$Zll)); + &mov ($Zhl,&DWP(0,$Htbl,$Zll)); + &mov ($Zlh,&DWP(12,$Htbl,$Zll)); + &mov ($Zll,&DWP(8,$Htbl,$Zll)); + &xor ($rem,$rem); # avoid partial register stalls on PIII + + # shrd practically kills P4, 2.5x deterioration, but P4 has + # MMX code-path to execute. shrd runs tad faster [than twice + # the shifts, move's and or's] on pre-MMX Pentium (as well as + # PIII and Core2), *but* minimizes code size, spares register + # and thus allows to fold the loop... + if (!$unroll) { + my $cnt = $inp; + &mov ($cnt,15); + &jmp (&label("x86_loop")); + &set_label("x86_loop",16); + for($i=1;$i<=2;$i++) { + &mov (&LB($rem),&LB($Zll)); + &shrd ($Zll,$Zlh,4); + &and (&LB($rem),0xf); + &shrd ($Zlh,$Zhl,4); + &shrd ($Zhl,$Zhh,4); + &shr ($Zhh,4); + &xor ($Zhh,&DWP($off+16,"esp",$rem,4)); + + &mov (&LB($rem),&BP($off,"esp",$cnt)); + if ($i&1) { + &and (&LB($rem),0xf0); + } else { + &shl (&LB($rem),4); + } + + &xor ($Zll,&DWP(8,$Htbl,$rem)); + &xor ($Zlh,&DWP(12,$Htbl,$rem)); + &xor ($Zhl,&DWP(0,$Htbl,$rem)); + &xor ($Zhh,&DWP(4,$Htbl,$rem)); + + if ($i&1) { + &dec ($cnt); + &js (&label("x86_break")); + } else { + &jmp (&label("x86_loop")); + } + } + &set_label("x86_break",16); + } else { + for($i=1;$i<32;$i++) { + &comment($i); + &mov (&LB($rem),&LB($Zll)); + &shrd ($Zll,$Zlh,4); + &and (&LB($rem),0xf); + &shrd ($Zlh,$Zhl,4); + &shrd ($Zhl,$Zhh,4); + &shr ($Zhh,4); + &xor ($Zhh,&DWP($off+16,"esp",$rem,4)); + + if ($i&1) { + &mov (&LB($rem),&BP($off+15-($i>>1),"esp")); + &and (&LB($rem),0xf0); + } else { + &mov (&LB($rem),&BP($off+15-($i>>1),"esp")); + &shl (&LB($rem),4); + } + + &xor ($Zll,&DWP(8,$Htbl,$rem)); + &xor ($Zlh,&DWP(12,$Htbl,$rem)); + &xor ($Zhl,&DWP(0,$Htbl,$rem)); + &xor ($Zhh,&DWP(4,$Htbl,$rem)); + } + } + &bswap ($Zll); + &bswap ($Zlh); + &bswap ($Zhl); + if (!$x86only) { + &bswap ($Zhh); + } else { + &mov ("eax",$Zhh); + &bswap ("eax"); + &mov ($Zhh,"eax"); + } +} + +if ($unroll) { + &function_begin_B("_x86_gmult_4bit_inner"); + &x86_loop(4); + &ret (); + &function_end_B("_x86_gmult_4bit_inner"); +} + +sub deposit_rem_4bit { + my $bias = shift; + + &mov (&DWP($bias+0, "esp"),0x0000<<16); + &mov (&DWP($bias+4, "esp"),0x1C20<<16); + &mov (&DWP($bias+8, "esp"),0x3840<<16); + &mov (&DWP($bias+12,"esp"),0x2460<<16); + &mov (&DWP($bias+16,"esp"),0x7080<<16); + &mov (&DWP($bias+20,"esp"),0x6CA0<<16); + &mov (&DWP($bias+24,"esp"),0x48C0<<16); + &mov (&DWP($bias+28,"esp"),0x54E0<<16); + &mov (&DWP($bias+32,"esp"),0xE100<<16); + &mov (&DWP($bias+36,"esp"),0xFD20<<16); + &mov (&DWP($bias+40,"esp"),0xD940<<16); + &mov (&DWP($bias+44,"esp"),0xC560<<16); + &mov (&DWP($bias+48,"esp"),0x9180<<16); + &mov (&DWP($bias+52,"esp"),0x8DA0<<16); + &mov (&DWP($bias+56,"esp"),0xA9C0<<16); + &mov (&DWP($bias+60,"esp"),0xB5E0<<16); +} + +$suffix = $x86only ? "" : "_x86"; + +&function_begin("gcm_gmult_4bit".$suffix); + &stack_push(16+4+1); # +1 for stack alignment + &mov ($inp,&wparam(0)); # load Xi + &mov ($Htbl,&wparam(1)); # load Htable + + &mov ($Zhh,&DWP(0,$inp)); # load Xi[16] + &mov ($Zhl,&DWP(4,$inp)); + &mov ($Zlh,&DWP(8,$inp)); + &mov ($Zll,&DWP(12,$inp)); + + &deposit_rem_4bit(16); + + &mov (&DWP(0,"esp"),$Zhh); # copy Xi[16] on stack + &mov (&DWP(4,"esp"),$Zhl); + &mov (&DWP(8,"esp"),$Zlh); + &mov (&DWP(12,"esp"),$Zll); + &shr ($Zll,20); + &and ($Zll,0xf0); + + if ($unroll) { + &call ("_x86_gmult_4bit_inner"); + } else { + &x86_loop(0); + &mov ($inp,&wparam(0)); + } + + &mov (&DWP(12,$inp),$Zll); + &mov (&DWP(8,$inp),$Zlh); + &mov (&DWP(4,$inp),$Zhl); + &mov (&DWP(0,$inp),$Zhh); + &stack_pop(16+4+1); +&function_end("gcm_gmult_4bit".$suffix); + +&function_begin("gcm_ghash_4bit".$suffix); + &stack_push(16+4+1); # +1 for 64-bit alignment + &mov ($Zll,&wparam(0)); # load Xi + &mov ($Htbl,&wparam(1)); # load Htable + &mov ($inp,&wparam(2)); # load in + &mov ("ecx",&wparam(3)); # load len + &add ("ecx",$inp); + &mov (&wparam(3),"ecx"); + + &mov ($Zhh,&DWP(0,$Zll)); # load Xi[16] + &mov ($Zhl,&DWP(4,$Zll)); + &mov ($Zlh,&DWP(8,$Zll)); + &mov ($Zll,&DWP(12,$Zll)); + + &deposit_rem_4bit(16); + + &set_label("x86_outer_loop",16); + &xor ($Zll,&DWP(12,$inp)); # xor with input + &xor ($Zlh,&DWP(8,$inp)); + &xor ($Zhl,&DWP(4,$inp)); + &xor ($Zhh,&DWP(0,$inp)); + &mov (&DWP(12,"esp"),$Zll); # dump it on stack + &mov (&DWP(8,"esp"),$Zlh); + &mov (&DWP(4,"esp"),$Zhl); + &mov (&DWP(0,"esp"),$Zhh); + + &shr ($Zll,20); + &and ($Zll,0xf0); + + if ($unroll) { + &call ("_x86_gmult_4bit_inner"); + } else { + &x86_loop(0); + &mov ($inp,&wparam(2)); + } + &lea ($inp,&DWP(16,$inp)); + &cmp ($inp,&wparam(3)); + &mov (&wparam(2),$inp) if (!$unroll); + &jb (&label("x86_outer_loop")); + + &mov ($inp,&wparam(0)); # load Xi + &mov (&DWP(12,$inp),$Zll); + &mov (&DWP(8,$inp),$Zlh); + &mov (&DWP(4,$inp),$Zhl); + &mov (&DWP(0,$inp),$Zhh); + &stack_pop(16+4+1); +&function_end("gcm_ghash_4bit".$suffix); + +if (!$x86only) {{{ + +&static_label("rem_4bit"); + +if (!$sse2) {{ # pure-MMX "May" version... + +$S=12; # shift factor for rem_4bit + +&function_begin_B("_mmx_gmult_4bit_inner"); +# MMX version performs 3.5 times better on P4 (see comment in non-MMX +# routine for further details), 100% better on Opteron, ~70% better +# on Core2 and PIII... In other words effort is considered to be well +# spent... Since initial release the loop was unrolled in order to +# "liberate" register previously used as loop counter. Instead it's +# used to optimize critical path in 'Z.hi ^= rem_4bit[Z.lo&0xf]'. +# The path involves move of Z.lo from MMX to integer register, +# effective address calculation and finally merge of value to Z.hi. +# Reference to rem_4bit is scheduled so late that I had to >>4 +# rem_4bit elements. This resulted in 20-45% procent improvement +# on contemporary µ-archs. +{ + my $cnt; + my $rem_4bit = "eax"; + my @rem = ($Zhh,$Zll); + my $nhi = $Zhl; + my $nlo = $Zlh; + + my ($Zlo,$Zhi) = ("mm0","mm1"); + my $tmp = "mm2"; + + &xor ($nlo,$nlo); # avoid partial register stalls on PIII + &mov ($nhi,$Zll); + &mov (&LB($nlo),&LB($nhi)); + &shl (&LB($nlo),4); + &and ($nhi,0xf0); + &movq ($Zlo,&QWP(8,$Htbl,$nlo)); + &movq ($Zhi,&QWP(0,$Htbl,$nlo)); + &movd ($rem[0],$Zlo); + + for ($cnt=28;$cnt>=-2;$cnt--) { + my $odd = $cnt&1; + my $nix = $odd ? $nlo : $nhi; + + &shl (&LB($nlo),4) if ($odd); + &psrlq ($Zlo,4); + &movq ($tmp,$Zhi); + &psrlq ($Zhi,4); + &pxor ($Zlo,&QWP(8,$Htbl,$nix)); + &mov (&LB($nlo),&BP($cnt/2,$inp)) if (!$odd && $cnt>=0); + &psllq ($tmp,60); + &and ($nhi,0xf0) if ($odd); + &pxor ($Zhi,&QWP(0,$rem_4bit,$rem[1],8)) if ($cnt<28); + &and ($rem[0],0xf); + &pxor ($Zhi,&QWP(0,$Htbl,$nix)); + &mov ($nhi,$nlo) if (!$odd && $cnt>=0); + &movd ($rem[1],$Zlo); + &pxor ($Zlo,$tmp); + + push (@rem,shift(@rem)); # "rotate" registers + } + + &mov ($inp,&DWP(4,$rem_4bit,$rem[1],8)); # last rem_4bit[rem] + + &psrlq ($Zlo,32); # lower part of Zlo is already there + &movd ($Zhl,$Zhi); + &psrlq ($Zhi,32); + &movd ($Zlh,$Zlo); + &movd ($Zhh,$Zhi); + &shl ($inp,4); # compensate for rem_4bit[i] being >>4 + + &bswap ($Zll); + &bswap ($Zhl); + &bswap ($Zlh); + &xor ($Zhh,$inp); + &bswap ($Zhh); + + &ret (); +} +&function_end_B("_mmx_gmult_4bit_inner"); + +&function_begin("gcm_gmult_4bit_mmx"); + &mov ($inp,&wparam(0)); # load Xi + &mov ($Htbl,&wparam(1)); # load Htable + + &call (&label("pic_point")); + &set_label("pic_point"); + &blindpop("eax"); + &lea ("eax",&DWP(&label("rem_4bit")."-".&label("pic_point"),"eax")); + + &movz ($Zll,&BP(15,$inp)); + + &call ("_mmx_gmult_4bit_inner"); + + &mov ($inp,&wparam(0)); # load Xi + &emms (); + &mov (&DWP(12,$inp),$Zll); + &mov (&DWP(4,$inp),$Zhl); + &mov (&DWP(8,$inp),$Zlh); + &mov (&DWP(0,$inp),$Zhh); +&function_end("gcm_gmult_4bit_mmx"); + +# Streamed version performs 20% better on P4, 7% on Opteron, +# 10% on Core2 and PIII... +&function_begin("gcm_ghash_4bit_mmx"); + &mov ($Zhh,&wparam(0)); # load Xi + &mov ($Htbl,&wparam(1)); # load Htable + &mov ($inp,&wparam(2)); # load in + &mov ($Zlh,&wparam(3)); # load len + + &call (&label("pic_point")); + &set_label("pic_point"); + &blindpop("eax"); + &lea ("eax",&DWP(&label("rem_4bit")."-".&label("pic_point"),"eax")); + + &add ($Zlh,$inp); + &mov (&wparam(3),$Zlh); # len to point at the end of input + &stack_push(4+1); # +1 for stack alignment + + &mov ($Zll,&DWP(12,$Zhh)); # load Xi[16] + &mov ($Zhl,&DWP(4,$Zhh)); + &mov ($Zlh,&DWP(8,$Zhh)); + &mov ($Zhh,&DWP(0,$Zhh)); + &jmp (&label("mmx_outer_loop")); + + &set_label("mmx_outer_loop",16); + &xor ($Zll,&DWP(12,$inp)); + &xor ($Zhl,&DWP(4,$inp)); + &xor ($Zlh,&DWP(8,$inp)); + &xor ($Zhh,&DWP(0,$inp)); + &mov (&wparam(2),$inp); + &mov (&DWP(12,"esp"),$Zll); + &mov (&DWP(4,"esp"),$Zhl); + &mov (&DWP(8,"esp"),$Zlh); + &mov (&DWP(0,"esp"),$Zhh); + + &mov ($inp,"esp"); + &shr ($Zll,24); + + &call ("_mmx_gmult_4bit_inner"); + + &mov ($inp,&wparam(2)); + &lea ($inp,&DWP(16,$inp)); + &cmp ($inp,&wparam(3)); + &jb (&label("mmx_outer_loop")); + + &mov ($inp,&wparam(0)); # load Xi + &emms (); + &mov (&DWP(12,$inp),$Zll); + &mov (&DWP(4,$inp),$Zhl); + &mov (&DWP(8,$inp),$Zlh); + &mov (&DWP(0,$inp),$Zhh); + + &stack_pop(4+1); +&function_end("gcm_ghash_4bit_mmx"); + +}} else {{ # "June" MMX version... + # ... has slower "April" gcm_gmult_4bit_mmx with folded + # loop. This is done to conserve code size... +$S=16; # shift factor for rem_4bit + +sub mmx_loop() { +# MMX version performs 2.8 times better on P4 (see comment in non-MMX +# routine for further details), 40% better on Opteron and Core2, 50% +# better on PIII... In other words effort is considered to be well +# spent... + my $inp = shift; + my $rem_4bit = shift; + my $cnt = $Zhh; + my $nhi = $Zhl; + my $nlo = $Zlh; + my $rem = $Zll; + + my ($Zlo,$Zhi) = ("mm0","mm1"); + my $tmp = "mm2"; + + &xor ($nlo,$nlo); # avoid partial register stalls on PIII + &mov ($nhi,$Zll); + &mov (&LB($nlo),&LB($nhi)); + &mov ($cnt,14); + &shl (&LB($nlo),4); + &and ($nhi,0xf0); + &movq ($Zlo,&QWP(8,$Htbl,$nlo)); + &movq ($Zhi,&QWP(0,$Htbl,$nlo)); + &movd ($rem,$Zlo); + &jmp (&label("mmx_loop")); + + &set_label("mmx_loop",16); + &psrlq ($Zlo,4); + &and ($rem,0xf); + &movq ($tmp,$Zhi); + &psrlq ($Zhi,4); + &pxor ($Zlo,&QWP(8,$Htbl,$nhi)); + &mov (&LB($nlo),&BP(0,$inp,$cnt)); + &psllq ($tmp,60); + &pxor ($Zhi,&QWP(0,$rem_4bit,$rem,8)); + &dec ($cnt); + &movd ($rem,$Zlo); + &pxor ($Zhi,&QWP(0,$Htbl,$nhi)); + &mov ($nhi,$nlo); + &pxor ($Zlo,$tmp); + &js (&label("mmx_break")); + + &shl (&LB($nlo),4); + &and ($rem,0xf); + &psrlq ($Zlo,4); + &and ($nhi,0xf0); + &movq ($tmp,$Zhi); + &psrlq ($Zhi,4); + &pxor ($Zlo,&QWP(8,$Htbl,$nlo)); + &psllq ($tmp,60); + &pxor ($Zhi,&QWP(0,$rem_4bit,$rem,8)); + &movd ($rem,$Zlo); + &pxor ($Zhi,&QWP(0,$Htbl,$nlo)); + &pxor ($Zlo,$tmp); + &jmp (&label("mmx_loop")); + + &set_label("mmx_break",16); + &shl (&LB($nlo),4); + &and ($rem,0xf); + &psrlq ($Zlo,4); + &and ($nhi,0xf0); + &movq ($tmp,$Zhi); + &psrlq ($Zhi,4); + &pxor ($Zlo,&QWP(8,$Htbl,$nlo)); + &psllq ($tmp,60); + &pxor ($Zhi,&QWP(0,$rem_4bit,$rem,8)); + &movd ($rem,$Zlo); + &pxor ($Zhi,&QWP(0,$Htbl,$nlo)); + &pxor ($Zlo,$tmp); + + &psrlq ($Zlo,4); + &and ($rem,0xf); + &movq ($tmp,$Zhi); + &psrlq ($Zhi,4); + &pxor ($Zlo,&QWP(8,$Htbl,$nhi)); + &psllq ($tmp,60); + &pxor ($Zhi,&QWP(0,$rem_4bit,$rem,8)); + &movd ($rem,$Zlo); + &pxor ($Zhi,&QWP(0,$Htbl,$nhi)); + &pxor ($Zlo,$tmp); + + &psrlq ($Zlo,32); # lower part of Zlo is already there + &movd ($Zhl,$Zhi); + &psrlq ($Zhi,32); + &movd ($Zlh,$Zlo); + &movd ($Zhh,$Zhi); + + &bswap ($Zll); + &bswap ($Zhl); + &bswap ($Zlh); + &bswap ($Zhh); +} + +&function_begin("gcm_gmult_4bit_mmx"); + &mov ($inp,&wparam(0)); # load Xi + &mov ($Htbl,&wparam(1)); # load Htable + + &call (&label("pic_point")); + &set_label("pic_point"); + &blindpop("eax"); + &lea ("eax",&DWP(&label("rem_4bit")."-".&label("pic_point"),"eax")); + + &movz ($Zll,&BP(15,$inp)); + + &mmx_loop($inp,"eax"); + + &emms (); + &mov (&DWP(12,$inp),$Zll); + &mov (&DWP(4,$inp),$Zhl); + &mov (&DWP(8,$inp),$Zlh); + &mov (&DWP(0,$inp),$Zhh); +&function_end("gcm_gmult_4bit_mmx"); + +###################################################################### +# Below subroutine is "528B" variant of "4-bit" GCM GHASH function +# (see gcm128.c for details). It provides further 20-40% performance +# improvement over above mentioned "May" version. + +&static_label("rem_8bit"); + +&function_begin("gcm_ghash_4bit_mmx"); +{ my ($Zlo,$Zhi) = ("mm7","mm6"); + my $rem_8bit = "esi"; + my $Htbl = "ebx"; + + # parameter block + &mov ("eax",&wparam(0)); # Xi + &mov ("ebx",&wparam(1)); # Htable + &mov ("ecx",&wparam(2)); # inp + &mov ("edx",&wparam(3)); # len + &mov ("ebp","esp"); # original %esp + &call (&label("pic_point")); + &set_label ("pic_point"); + &blindpop ($rem_8bit); + &lea ($rem_8bit,&DWP(&label("rem_8bit")."-".&label("pic_point"),$rem_8bit)); + + &sub ("esp",512+16+16); # allocate stack frame... + &and ("esp",-64); # ...and align it + &sub ("esp",16); # place for (u8)(H[]<<4) + + &add ("edx","ecx"); # pointer to the end of input + &mov (&DWP(528+16+0,"esp"),"eax"); # save Xi + &mov (&DWP(528+16+8,"esp"),"edx"); # save inp+len + &mov (&DWP(528+16+12,"esp"),"ebp"); # save original %esp + + { my @lo = ("mm0","mm1","mm2"); + my @hi = ("mm3","mm4","mm5"); + my @tmp = ("mm6","mm7"); + my $off1=0,$off2=0,$i; + + &add ($Htbl,128); # optimize for size + &lea ("edi",&DWP(16+128,"esp")); + &lea ("ebp",&DWP(16+256+128,"esp")); + + # decompose Htable (low and high parts are kept separately), + # generate Htable[]>>4, (u8)(Htable[]<<4), save to stack... + for ($i=0;$i<18;$i++) { + + &mov ("edx",&DWP(16*$i+8-128,$Htbl)) if ($i<16); + &movq ($lo[0],&QWP(16*$i+8-128,$Htbl)) if ($i<16); + &psllq ($tmp[1],60) if ($i>1); + &movq ($hi[0],&QWP(16*$i+0-128,$Htbl)) if ($i<16); + &por ($lo[2],$tmp[1]) if ($i>1); + &movq (&QWP($off1-128,"edi"),$lo[1]) if ($i>0 && $i<17); + &psrlq ($lo[1],4) if ($i>0 && $i<17); + &movq (&QWP($off1,"edi"),$hi[1]) if ($i>0 && $i<17); + &movq ($tmp[0],$hi[1]) if ($i>0 && $i<17); + &movq (&QWP($off2-128,"ebp"),$lo[2]) if ($i>1); + &psrlq ($hi[1],4) if ($i>0 && $i<17); + &movq (&QWP($off2,"ebp"),$hi[2]) if ($i>1); + &shl ("edx",4) if ($i<16); + &mov (&BP($i,"esp"),&LB("edx")) if ($i<16); + + unshift (@lo,pop(@lo)); # "rotate" registers + unshift (@hi,pop(@hi)); + unshift (@tmp,pop(@tmp)); + $off1 += 8 if ($i>0); + $off2 += 8 if ($i>1); + } + } + + &movq ($Zhi,&QWP(0,"eax")); + &mov ("ebx",&DWP(8,"eax")); + &mov ("edx",&DWP(12,"eax")); # load Xi + +&set_label("outer",16); + { my $nlo = "eax"; + my $dat = "edx"; + my @nhi = ("edi","ebp"); + my @rem = ("ebx","ecx"); + my @red = ("mm0","mm1","mm2"); + my $tmp = "mm3"; + + &xor ($dat,&DWP(12,"ecx")); # merge input data + &xor ("ebx",&DWP(8,"ecx")); + &pxor ($Zhi,&QWP(0,"ecx")); + &lea ("ecx",&DWP(16,"ecx")); # inp+=16 + #&mov (&DWP(528+12,"esp"),$dat); # save inp^Xi + &mov (&DWP(528+8,"esp"),"ebx"); + &movq (&QWP(528+0,"esp"),$Zhi); + &mov (&DWP(528+16+4,"esp"),"ecx"); # save inp + + &xor ($nlo,$nlo); + &rol ($dat,8); + &mov (&LB($nlo),&LB($dat)); + &mov ($nhi[1],$nlo); + &and (&LB($nlo),0x0f); + &shr ($nhi[1],4); + &pxor ($red[0],$red[0]); + &rol ($dat,8); # next byte + &pxor ($red[1],$red[1]); + &pxor ($red[2],$red[2]); + + # Just like in "May" verson modulo-schedule for critical path in + # 'Z.hi ^= rem_8bit[Z.lo&0xff^((u8)H[nhi]<<4)]<<48'. Final 'pxor' + # is scheduled so late that rem_8bit[] has to be shifted *right* + # by 16, which is why last argument to pinsrw is 2, which + # corresponds to <<32=<<48>>16... + for ($j=11,$i=0;$i<15;$i++) { + + if ($i>0) { + &pxor ($Zlo,&QWP(16,"esp",$nlo,8)); # Z^=H[nlo] + &rol ($dat,8); # next byte + &pxor ($Zhi,&QWP(16+128,"esp",$nlo,8)); + + &pxor ($Zlo,$tmp); + &pxor ($Zhi,&QWP(16+256+128,"esp",$nhi[0],8)); + &xor (&LB($rem[1]),&BP(0,"esp",$nhi[0])); # rem^(H[nhi]<<4) + } else { + &movq ($Zlo,&QWP(16,"esp",$nlo,8)); + &movq ($Zhi,&QWP(16+128,"esp",$nlo,8)); + } + + &mov (&LB($nlo),&LB($dat)); + &mov ($dat,&DWP(528+$j,"esp")) if (--$j%4==0); + + &movd ($rem[0],$Zlo); + &movz ($rem[1],&LB($rem[1])) if ($i>0); + &psrlq ($Zlo,8); # Z>>=8 + + &movq ($tmp,$Zhi); + &mov ($nhi[0],$nlo); + &psrlq ($Zhi,8); + + &pxor ($Zlo,&QWP(16+256+0,"esp",$nhi[1],8)); # Z^=H[nhi]>>4 + &and (&LB($nlo),0x0f); + &psllq ($tmp,56); + + &pxor ($Zhi,$red[1]) if ($i>1); + &shr ($nhi[0],4); + &pinsrw ($red[0],&WP(0,$rem_8bit,$rem[1],2),2) if ($i>0); + + unshift (@red,pop(@red)); # "rotate" registers + unshift (@rem,pop(@rem)); + unshift (@nhi,pop(@nhi)); + } + + &pxor ($Zlo,&QWP(16,"esp",$nlo,8)); # Z^=H[nlo] + &pxor ($Zhi,&QWP(16+128,"esp",$nlo,8)); + &xor (&LB($rem[1]),&BP(0,"esp",$nhi[0])); # rem^(H[nhi]<<4) + + &pxor ($Zlo,$tmp); + &pxor ($Zhi,&QWP(16+256+128,"esp",$nhi[0],8)); + &movz ($rem[1],&LB($rem[1])); + + &pxor ($red[2],$red[2]); # clear 2nd word + &psllq ($red[1],4); + + &movd ($rem[0],$Zlo); + &psrlq ($Zlo,4); # Z>>=4 + + &movq ($tmp,$Zhi); + &psrlq ($Zhi,4); + &shl ($rem[0],4); # rem<<4 + + &pxor ($Zlo,&QWP(16,"esp",$nhi[1],8)); # Z^=H[nhi] + &psllq ($tmp,60); + &movz ($rem[0],&LB($rem[0])); + + &pxor ($Zlo,$tmp); + &pxor ($Zhi,&QWP(16+128,"esp",$nhi[1],8)); + + &pinsrw ($red[0],&WP(0,$rem_8bit,$rem[1],2),2); + &pxor ($Zhi,$red[1]); + + &movd ($dat,$Zlo); + &pinsrw ($red[2],&WP(0,$rem_8bit,$rem[0],2),3); # last is <<48 + + &psllq ($red[0],12); # correct by <<16>>4 + &pxor ($Zhi,$red[0]); + &psrlq ($Zlo,32); + &pxor ($Zhi,$red[2]); + + &mov ("ecx",&DWP(528+16+4,"esp")); # restore inp + &movd ("ebx",$Zlo); + &movq ($tmp,$Zhi); # 01234567 + &psllw ($Zhi,8); # 1.3.5.7. + &psrlw ($tmp,8); # .0.2.4.6 + &por ($Zhi,$tmp); # 10325476 + &bswap ($dat); + &pshufw ($Zhi,$Zhi,0b00011011); # 76543210 + &bswap ("ebx"); + + &cmp ("ecx",&DWP(528+16+8,"esp")); # are we done? + &jne (&label("outer")); + } + + &mov ("eax",&DWP(528+16+0,"esp")); # restore Xi + &mov (&DWP(12,"eax"),"edx"); + &mov (&DWP(8,"eax"),"ebx"); + &movq (&QWP(0,"eax"),$Zhi); + + &mov ("esp",&DWP(528+16+12,"esp")); # restore original %esp + &emms (); +} +&function_end("gcm_ghash_4bit_mmx"); +}} + +if ($sse2) {{ +###################################################################### +# PCLMULQDQ version. + +$Xip="eax"; +$Htbl="edx"; +$const="ecx"; +$inp="esi"; +$len="ebx"; + +($Xi,$Xhi)=("xmm0","xmm1"); $Hkey="xmm2"; +($T1,$T2,$T3)=("xmm3","xmm4","xmm5"); +($Xn,$Xhn)=("xmm6","xmm7"); + +&static_label("bswap"); + +sub clmul64x64_T2 { # minimal "register" pressure +my ($Xhi,$Xi,$Hkey)=@_; + + &movdqa ($Xhi,$Xi); # + &pshufd ($T1,$Xi,0b01001110); + &pshufd ($T2,$Hkey,0b01001110); + &pxor ($T1,$Xi); # + &pxor ($T2,$Hkey); + + &pclmulqdq ($Xi,$Hkey,0x00); ####### + &pclmulqdq ($Xhi,$Hkey,0x11); ####### + &pclmulqdq ($T1,$T2,0x00); ####### + &xorps ($T1,$Xi); # + &xorps ($T1,$Xhi); # + + &movdqa ($T2,$T1); # + &psrldq ($T1,8); + &pslldq ($T2,8); # + &pxor ($Xhi,$T1); + &pxor ($Xi,$T2); # +} + +sub clmul64x64_T3 { +# Even though this subroutine offers visually better ILP, it +# was empirically found to be a tad slower than above version. +# At least in gcm_ghash_clmul context. But it's just as well, +# because loop modulo-scheduling is possible only thanks to +# minimized "register" pressure... +my ($Xhi,$Xi,$Hkey)=@_; + + &movdqa ($T1,$Xi); # + &movdqa ($Xhi,$Xi); + &pclmulqdq ($Xi,$Hkey,0x00); ####### + &pclmulqdq ($Xhi,$Hkey,0x11); ####### + &pshufd ($T2,$T1,0b01001110); # + &pshufd ($T3,$Hkey,0b01001110); + &pxor ($T2,$T1); # + &pxor ($T3,$Hkey); + &pclmulqdq ($T2,$T3,0x00); ####### + &pxor ($T2,$Xi); # + &pxor ($T2,$Xhi); # + + &movdqa ($T3,$T2); # + &psrldq ($T2,8); + &pslldq ($T3,8); # + &pxor ($Xhi,$T2); + &pxor ($Xi,$T3); # +} + +if (1) { # Algorithm 9 with <<1 twist. + # Reduction is shorter and uses only two + # temporary registers, which makes it better + # candidate for interleaving with 64x64 + # multiplication. Pre-modulo-scheduled loop + # was found to be ~20% faster than Algorithm 5 + # below. Algorithm 9 was therefore chosen for + # further optimization... + +sub reduction_alg9 { # 17/13 times faster than Intel version +my ($Xhi,$Xi) = @_; + + # 1st phase + &movdqa ($T1,$Xi) # + &psllq ($Xi,1); + &pxor ($Xi,$T1); # + &psllq ($Xi,5); # + &pxor ($Xi,$T1); # + &psllq ($Xi,57); # + &movdqa ($T2,$Xi); # + &pslldq ($Xi,8); + &psrldq ($T2,8); # + &pxor ($Xi,$T1); + &pxor ($Xhi,$T2); # + + # 2nd phase + &movdqa ($T2,$Xi); + &psrlq ($Xi,5); + &pxor ($Xi,$T2); # + &psrlq ($Xi,1); # + &pxor ($Xi,$T2); # + &pxor ($T2,$Xhi); + &psrlq ($Xi,1); # + &pxor ($Xi,$T2); # +} + +&function_begin_B("gcm_init_clmul"); + &mov ($Htbl,&wparam(0)); + &mov ($Xip,&wparam(1)); + + &call (&label("pic")); +&set_label("pic"); + &blindpop ($const); + &lea ($const,&DWP(&label("bswap")."-".&label("pic"),$const)); + + &movdqu ($Hkey,&QWP(0,$Xip)); + &pshufd ($Hkey,$Hkey,0b01001110);# dword swap + + # <<1 twist + &pshufd ($T2,$Hkey,0b11111111); # broadcast uppermost dword + &movdqa ($T1,$Hkey); + &psllq ($Hkey,1); + &pxor ($T3,$T3); # + &psrlq ($T1,63); + &pcmpgtd ($T3,$T2); # broadcast carry bit + &pslldq ($T1,8); + &por ($Hkey,$T1); # H<<=1 + + # magic reduction + &pand ($T3,&QWP(16,$const)); # 0x1c2_polynomial + &pxor ($Hkey,$T3); # if(carry) H^=0x1c2_polynomial + + # calculate H^2 + &movdqa ($Xi,$Hkey); + &clmul64x64_T2 ($Xhi,$Xi,$Hkey); + &reduction_alg9 ($Xhi,$Xi); + + &movdqu (&QWP(0,$Htbl),$Hkey); # save H + &movdqu (&QWP(16,$Htbl),$Xi); # save H^2 + + &ret (); +&function_end_B("gcm_init_clmul"); + +&function_begin_B("gcm_gmult_clmul"); + &mov ($Xip,&wparam(0)); + &mov ($Htbl,&wparam(1)); + + &call (&label("pic")); +&set_label("pic"); + &blindpop ($const); + &lea ($const,&DWP(&label("bswap")."-".&label("pic"),$const)); + + &movdqu ($Xi,&QWP(0,$Xip)); + &movdqa ($T3,&QWP(0,$const)); + &movups ($Hkey,&QWP(0,$Htbl)); + &pshufb ($Xi,$T3); + + &clmul64x64_T2 ($Xhi,$Xi,$Hkey); + &reduction_alg9 ($Xhi,$Xi); + + &pshufb ($Xi,$T3); + &movdqu (&QWP(0,$Xip),$Xi); + + &ret (); +&function_end_B("gcm_gmult_clmul"); + +&function_begin("gcm_ghash_clmul"); + &mov ($Xip,&wparam(0)); + &mov ($Htbl,&wparam(1)); + &mov ($inp,&wparam(2)); + &mov ($len,&wparam(3)); + + &call (&label("pic")); +&set_label("pic"); + &blindpop ($const); + &lea ($const,&DWP(&label("bswap")."-".&label("pic"),$const)); + + &movdqu ($Xi,&QWP(0,$Xip)); + &movdqa ($T3,&QWP(0,$const)); + &movdqu ($Hkey,&QWP(0,$Htbl)); + &pshufb ($Xi,$T3); + + &sub ($len,0x10); + &jz (&label("odd_tail")); + + ####### + # Xi+2 =[H*(Ii+1 + Xi+1)] mod P = + # [(H*Ii+1) + (H*Xi+1)] mod P = + # [(H*Ii+1) + H^2*(Ii+Xi)] mod P + # + &movdqu ($T1,&QWP(0,$inp)); # Ii + &movdqu ($Xn,&QWP(16,$inp)); # Ii+1 + &pshufb ($T1,$T3); + &pshufb ($Xn,$T3); + &pxor ($Xi,$T1); # Ii+Xi + + &clmul64x64_T2 ($Xhn,$Xn,$Hkey); # H*Ii+1 + &movups ($Hkey,&QWP(16,$Htbl)); # load H^2 + + &lea ($inp,&DWP(32,$inp)); # i+=2 + &sub ($len,0x20); + &jbe (&label("even_tail")); + +&set_label("mod_loop"); + &clmul64x64_T2 ($Xhi,$Xi,$Hkey); # H^2*(Ii+Xi) + &movdqu ($T1,&QWP(0,$inp)); # Ii + &movups ($Hkey,&QWP(0,$Htbl)); # load H + + &pxor ($Xi,$Xn); # (H*Ii+1) + H^2*(Ii+Xi) + &pxor ($Xhi,$Xhn); + + &movdqu ($Xn,&QWP(16,$inp)); # Ii+1 + &pshufb ($T1,$T3); + &pshufb ($Xn,$T3); + + &movdqa ($T3,$Xn); #&clmul64x64_TX ($Xhn,$Xn,$Hkey); H*Ii+1 + &movdqa ($Xhn,$Xn); + &pxor ($Xhi,$T1); # "Ii+Xi", consume early + + &movdqa ($T1,$Xi) #&reduction_alg9($Xhi,$Xi); 1st phase + &psllq ($Xi,1); + &pxor ($Xi,$T1); # + &psllq ($Xi,5); # + &pxor ($Xi,$T1); # + &pclmulqdq ($Xn,$Hkey,0x00); ####### + &psllq ($Xi,57); # + &movdqa ($T2,$Xi); # + &pslldq ($Xi,8); + &psrldq ($T2,8); # + &pxor ($Xi,$T1); + &pshufd ($T1,$T3,0b01001110); + &pxor ($Xhi,$T2); # + &pxor ($T1,$T3); + &pshufd ($T3,$Hkey,0b01001110); + &pxor ($T3,$Hkey); # + + &pclmulqdq ($Xhn,$Hkey,0x11); ####### + &movdqa ($T2,$Xi); # 2nd phase + &psrlq ($Xi,5); + &pxor ($Xi,$T2); # + &psrlq ($Xi,1); # + &pxor ($Xi,$T2); # + &pxor ($T2,$Xhi); + &psrlq ($Xi,1); # + &pxor ($Xi,$T2); # + + &pclmulqdq ($T1,$T3,0x00); ####### + &movups ($Hkey,&QWP(16,$Htbl)); # load H^2 + &xorps ($T1,$Xn); # + &xorps ($T1,$Xhn); # + + &movdqa ($T3,$T1); # + &psrldq ($T1,8); + &pslldq ($T3,8); # + &pxor ($Xhn,$T1); + &pxor ($Xn,$T3); # + &movdqa ($T3,&QWP(0,$const)); + + &lea ($inp,&DWP(32,$inp)); + &sub ($len,0x20); + &ja (&label("mod_loop")); + +&set_label("even_tail"); + &clmul64x64_T2 ($Xhi,$Xi,$Hkey); # H^2*(Ii+Xi) + + &pxor ($Xi,$Xn); # (H*Ii+1) + H^2*(Ii+Xi) + &pxor ($Xhi,$Xhn); + + &reduction_alg9 ($Xhi,$Xi); + + &test ($len,$len); + &jnz (&label("done")); + + &movups ($Hkey,&QWP(0,$Htbl)); # load H +&set_label("odd_tail"); + &movdqu ($T1,&QWP(0,$inp)); # Ii + &pshufb ($T1,$T3); + &pxor ($Xi,$T1); # Ii+Xi + + &clmul64x64_T2 ($Xhi,$Xi,$Hkey); # H*(Ii+Xi) + &reduction_alg9 ($Xhi,$Xi); + +&set_label("done"); + &pshufb ($Xi,$T3); + &movdqu (&QWP(0,$Xip),$Xi); +&function_end("gcm_ghash_clmul"); + +} else { # Algorith 5. Kept for reference purposes. + +sub reduction_alg5 { # 19/16 times faster than Intel version +my ($Xhi,$Xi)=@_; + + # <<1 + &movdqa ($T1,$Xi); # + &movdqa ($T2,$Xhi); + &pslld ($Xi,1); + &pslld ($Xhi,1); # + &psrld ($T1,31); + &psrld ($T2,31); # + &movdqa ($T3,$T1); + &pslldq ($T1,4); + &psrldq ($T3,12); # + &pslldq ($T2,4); + &por ($Xhi,$T3); # + &por ($Xi,$T1); + &por ($Xhi,$T2); # + + # 1st phase + &movdqa ($T1,$Xi); + &movdqa ($T2,$Xi); + &movdqa ($T3,$Xi); # + &pslld ($T1,31); + &pslld ($T2,30); + &pslld ($Xi,25); # + &pxor ($T1,$T2); + &pxor ($T1,$Xi); # + &movdqa ($T2,$T1); # + &pslldq ($T1,12); + &psrldq ($T2,4); # + &pxor ($T3,$T1); + + # 2nd phase + &pxor ($Xhi,$T3); # + &movdqa ($Xi,$T3); + &movdqa ($T1,$T3); + &psrld ($Xi,1); # + &psrld ($T1,2); + &psrld ($T3,7); # + &pxor ($Xi,$T1); + &pxor ($Xhi,$T2); + &pxor ($Xi,$T3); # + &pxor ($Xi,$Xhi); # +} + +&function_begin_B("gcm_init_clmul"); + &mov ($Htbl,&wparam(0)); + &mov ($Xip,&wparam(1)); + + &call (&label("pic")); +&set_label("pic"); + &blindpop ($const); + &lea ($const,&DWP(&label("bswap")."-".&label("pic"),$const)); + + &movdqu ($Hkey,&QWP(0,$Xip)); + &pshufd ($Hkey,$Hkey,0b01001110);# dword swap + + # calculate H^2 + &movdqa ($Xi,$Hkey); + &clmul64x64_T3 ($Xhi,$Xi,$Hkey); + &reduction_alg5 ($Xhi,$Xi); + + &movdqu (&QWP(0,$Htbl),$Hkey); # save H + &movdqu (&QWP(16,$Htbl),$Xi); # save H^2 + + &ret (); +&function_end_B("gcm_init_clmul"); + +&function_begin_B("gcm_gmult_clmul"); + &mov ($Xip,&wparam(0)); + &mov ($Htbl,&wparam(1)); + + &call (&label("pic")); +&set_label("pic"); + &blindpop ($const); + &lea ($const,&DWP(&label("bswap")."-".&label("pic"),$const)); + + &movdqu ($Xi,&QWP(0,$Xip)); + &movdqa ($Xn,&QWP(0,$const)); + &movdqu ($Hkey,&QWP(0,$Htbl)); + &pshufb ($Xi,$Xn); + + &clmul64x64_T3 ($Xhi,$Xi,$Hkey); + &reduction_alg5 ($Xhi,$Xi); + + &pshufb ($Xi,$Xn); + &movdqu (&QWP(0,$Xip),$Xi); + + &ret (); +&function_end_B("gcm_gmult_clmul"); + +&function_begin("gcm_ghash_clmul"); + &mov ($Xip,&wparam(0)); + &mov ($Htbl,&wparam(1)); + &mov ($inp,&wparam(2)); + &mov ($len,&wparam(3)); + + &call (&label("pic")); +&set_label("pic"); + &blindpop ($const); + &lea ($const,&DWP(&label("bswap")."-".&label("pic"),$const)); + + &movdqu ($Xi,&QWP(0,$Xip)); + &movdqa ($T3,&QWP(0,$const)); + &movdqu ($Hkey,&QWP(0,$Htbl)); + &pshufb ($Xi,$T3); + + &sub ($len,0x10); + &jz (&label("odd_tail")); + + ####### + # Xi+2 =[H*(Ii+1 + Xi+1)] mod P = + # [(H*Ii+1) + (H*Xi+1)] mod P = + # [(H*Ii+1) + H^2*(Ii+Xi)] mod P + # + &movdqu ($T1,&QWP(0,$inp)); # Ii + &movdqu ($Xn,&QWP(16,$inp)); # Ii+1 + &pshufb ($T1,$T3); + &pshufb ($Xn,$T3); + &pxor ($Xi,$T1); # Ii+Xi + + &clmul64x64_T3 ($Xhn,$Xn,$Hkey); # H*Ii+1 + &movdqu ($Hkey,&QWP(16,$Htbl)); # load H^2 + + &sub ($len,0x20); + &lea ($inp,&DWP(32,$inp)); # i+=2 + &jbe (&label("even_tail")); + +&set_label("mod_loop"); + &clmul64x64_T3 ($Xhi,$Xi,$Hkey); # H^2*(Ii+Xi) + &movdqu ($Hkey,&QWP(0,$Htbl)); # load H + + &pxor ($Xi,$Xn); # (H*Ii+1) + H^2*(Ii+Xi) + &pxor ($Xhi,$Xhn); + + &reduction_alg5 ($Xhi,$Xi); + + ####### + &movdqa ($T3,&QWP(0,$const)); + &movdqu ($T1,&QWP(0,$inp)); # Ii + &movdqu ($Xn,&QWP(16,$inp)); # Ii+1 + &pshufb ($T1,$T3); + &pshufb ($Xn,$T3); + &pxor ($Xi,$T1); # Ii+Xi + + &clmul64x64_T3 ($Xhn,$Xn,$Hkey); # H*Ii+1 + &movdqu ($Hkey,&QWP(16,$Htbl)); # load H^2 + + &sub ($len,0x20); + &lea ($inp,&DWP(32,$inp)); + &ja (&label("mod_loop")); + +&set_label("even_tail"); + &clmul64x64_T3 ($Xhi,$Xi,$Hkey); # H^2*(Ii+Xi) + + &pxor ($Xi,$Xn); # (H*Ii+1) + H^2*(Ii+Xi) + &pxor ($Xhi,$Xhn); + + &reduction_alg5 ($Xhi,$Xi); + + &movdqa ($T3,&QWP(0,$const)); + &test ($len,$len); + &jnz (&label("done")); + + &movdqu ($Hkey,&QWP(0,$Htbl)); # load H +&set_label("odd_tail"); + &movdqu ($T1,&QWP(0,$inp)); # Ii + &pshufb ($T1,$T3); + &pxor ($Xi,$T1); # Ii+Xi + + &clmul64x64_T3 ($Xhi,$Xi,$Hkey); # H*(Ii+Xi) + &reduction_alg5 ($Xhi,$Xi); + + &movdqa ($T3,&QWP(0,$const)); +&set_label("done"); + &pshufb ($Xi,$T3); + &movdqu (&QWP(0,$Xip),$Xi); +&function_end("gcm_ghash_clmul"); + +} + +&set_label("bswap",64); + &data_byte(15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0); + &data_byte(1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0xc2); # 0x1c2_polynomial +}} # $sse2 + +&set_label("rem_4bit",64); + &data_word(0,0x0000<<$S,0,0x1C20<<$S,0,0x3840<<$S,0,0x2460<<$S); + &data_word(0,0x7080<<$S,0,0x6CA0<<$S,0,0x48C0<<$S,0,0x54E0<<$S); + &data_word(0,0xE100<<$S,0,0xFD20<<$S,0,0xD940<<$S,0,0xC560<<$S); + &data_word(0,0x9180<<$S,0,0x8DA0<<$S,0,0xA9C0<<$S,0,0xB5E0<<$S); +&set_label("rem_8bit",64); + &data_short(0x0000,0x01C2,0x0384,0x0246,0x0708,0x06CA,0x048C,0x054E); + &data_short(0x0E10,0x0FD2,0x0D94,0x0C56,0x0918,0x08DA,0x0A9C,0x0B5E); + &data_short(0x1C20,0x1DE2,0x1FA4,0x1E66,0x1B28,0x1AEA,0x18AC,0x196E); + &data_short(0x1230,0x13F2,0x11B4,0x1076,0x1538,0x14FA,0x16BC,0x177E); + &data_short(0x3840,0x3982,0x3BC4,0x3A06,0x3F48,0x3E8A,0x3CCC,0x3D0E); + &data_short(0x3650,0x3792,0x35D4,0x3416,0x3158,0x309A,0x32DC,0x331E); + &data_short(0x2460,0x25A2,0x27E4,0x2626,0x2368,0x22AA,0x20EC,0x212E); + &data_short(0x2A70,0x2BB2,0x29F4,0x2836,0x2D78,0x2CBA,0x2EFC,0x2F3E); + &data_short(0x7080,0x7142,0x7304,0x72C6,0x7788,0x764A,0x740C,0x75CE); + &data_short(0x7E90,0x7F52,0x7D14,0x7CD6,0x7998,0x785A,0x7A1C,0x7BDE); + &data_short(0x6CA0,0x6D62,0x6F24,0x6EE6,0x6BA8,0x6A6A,0x682C,0x69EE); + &data_short(0x62B0,0x6372,0x6134,0x60F6,0x65B8,0x647A,0x663C,0x67FE); + &data_short(0x48C0,0x4902,0x4B44,0x4A86,0x4FC8,0x4E0A,0x4C4C,0x4D8E); + &data_short(0x46D0,0x4712,0x4554,0x4496,0x41D8,0x401A,0x425C,0x439E); + &data_short(0x54E0,0x5522,0x5764,0x56A6,0x53E8,0x522A,0x506C,0x51AE); + &data_short(0x5AF0,0x5B32,0x5974,0x58B6,0x5DF8,0x5C3A,0x5E7C,0x5FBE); + &data_short(0xE100,0xE0C2,0xE284,0xE346,0xE608,0xE7CA,0xE58C,0xE44E); + &data_short(0xEF10,0xEED2,0xEC94,0xED56,0xE818,0xE9DA,0xEB9C,0xEA5E); + &data_short(0xFD20,0xFCE2,0xFEA4,0xFF66,0xFA28,0xFBEA,0xF9AC,0xF86E); + &data_short(0xF330,0xF2F2,0xF0B4,0xF176,0xF438,0xF5FA,0xF7BC,0xF67E); + &data_short(0xD940,0xD882,0xDAC4,0xDB06,0xDE48,0xDF8A,0xDDCC,0xDC0E); + &data_short(0xD750,0xD692,0xD4D4,0xD516,0xD058,0xD19A,0xD3DC,0xD21E); + &data_short(0xC560,0xC4A2,0xC6E4,0xC726,0xC268,0xC3AA,0xC1EC,0xC02E); + &data_short(0xCB70,0xCAB2,0xC8F4,0xC936,0xCC78,0xCDBA,0xCFFC,0xCE3E); + &data_short(0x9180,0x9042,0x9204,0x93C6,0x9688,0x974A,0x950C,0x94CE); + &data_short(0x9F90,0x9E52,0x9C14,0x9DD6,0x9898,0x995A,0x9B1C,0x9ADE); + &data_short(0x8DA0,0x8C62,0x8E24,0x8FE6,0x8AA8,0x8B6A,0x892C,0x88EE); + &data_short(0x83B0,0x8272,0x8034,0x81F6,0x84B8,0x857A,0x873C,0x86FE); + &data_short(0xA9C0,0xA802,0xAA44,0xAB86,0xAEC8,0xAF0A,0xAD4C,0xAC8E); + &data_short(0xA7D0,0xA612,0xA454,0xA596,0xA0D8,0xA11A,0xA35C,0xA29E); + &data_short(0xB5E0,0xB422,0xB664,0xB7A6,0xB2E8,0xB32A,0xB16C,0xB0AE); + &data_short(0xBBF0,0xBA32,0xB874,0xB9B6,0xBCF8,0xBD3A,0xBF7C,0xBEBE); +}}} # !$x86only + +&asciz("GHASH for x86, CRYPTOGAMS by "); +&asm_finish(); + +# A question was risen about choice of vanilla MMX. Or rather why wasn't +# SSE2 chosen instead? In addition to the fact that MMX runs on legacy +# CPUs such as PIII, "4-bit" MMX version was observed to provide better +# performance than *corresponding* SSE2 one even on contemporary CPUs. +# SSE2 results were provided by Peter-Michael Hager. He maintains SSE2 +# implementation featuring full range of lookup-table sizes, but with +# per-invocation lookup table setup. Latter means that table size is +# chosen depending on how much data is to be hashed in every given call, +# more data - larger table. Best reported result for Core2 is ~4 cycles +# per processed byte out of 64KB block. This number accounts even for +# 64KB table setup overhead. As discussed in gcm128.c we choose to be +# more conservative in respect to lookup table sizes, but how do the +# results compare? Minimalistic "256B" MMX version delivers ~11 cycles +# on same platform. As also discussed in gcm128.c, next in line "8-bit +# Shoup's" or "4KB" method should deliver twice the performance of +# "256B" one, in other words not worse than ~6 cycles per byte. It +# should be also be noted that in SSE2 case improvement can be "super- +# linear," i.e. more than twice, mostly because >>8 maps to single +# instruction on SSE2 register. This is unlike "4-bit" case when >>4 +# maps to same amount of instructions in both MMX and SSE2 cases. +# Bottom line is that switch to SSE2 is considered to be justifiable +# only in case we choose to implement "8-bit" method... diff --git a/crypto/openssl/crypto/modes/asm/ghash-x86_64.pl b/crypto/openssl/crypto/modes/asm/ghash-x86_64.pl new file mode 100644 index 0000000000..a5ae180882 --- /dev/null +++ b/crypto/openssl/crypto/modes/asm/ghash-x86_64.pl @@ -0,0 +1,805 @@ +#!/usr/bin/env perl +# +# ==================================================================== +# Written by Andy Polyakov for the OpenSSL +# project. The module is, however, dual licensed under OpenSSL and +# CRYPTOGAMS licenses depending on where you obtain it. For further +# details see http://www.openssl.org/~appro/cryptogams/. +# ==================================================================== +# +# March, June 2010 +# +# The module implements "4-bit" GCM GHASH function and underlying +# single multiplication operation in GF(2^128). "4-bit" means that +# it uses 256 bytes per-key table [+128 bytes shared table]. GHASH +# function features so called "528B" variant utilizing additional +# 256+16 bytes of per-key storage [+512 bytes shared table]. +# Performance results are for this streamed GHASH subroutine and are +# expressed in cycles per processed byte, less is better: +# +# gcc 3.4.x(*) assembler +# +# P4 28.6 14.0 +100% +# Opteron 19.3 7.7 +150% +# Core2 17.8 8.1(**) +120% +# +# (*) comparison is not completely fair, because C results are +# for vanilla "256B" implementation, while assembler results +# are for "528B";-) +# (**) it's mystery [to me] why Core2 result is not same as for +# Opteron; + +# May 2010 +# +# Add PCLMULQDQ version performing at 2.02 cycles per processed byte. +# See ghash-x86.pl for background information and details about coding +# techniques. +# +# Special thanks to David Woodhouse for +# providing access to a Westmere-based system on behalf of Intel +# Open Source Technology Centre. + +$flavour = shift; +$output = shift; +if ($flavour =~ /\./) { $output = $flavour; undef $flavour; } + +$win64=0; $win64=1 if ($flavour =~ /[nm]asm|mingw64/ || $output =~ /\.asm$/); + +$0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1; +( $xlate="${dir}x86_64-xlate.pl" and -f $xlate ) or +( $xlate="${dir}../../perlasm/x86_64-xlate.pl" and -f $xlate) or +die "can't locate x86_64-xlate.pl"; + +open STDOUT,"| $^X $xlate $flavour $output"; + +# common register layout +$nlo="%rax"; +$nhi="%rbx"; +$Zlo="%r8"; +$Zhi="%r9"; +$tmp="%r10"; +$rem_4bit = "%r11"; + +$Xi="%rdi"; +$Htbl="%rsi"; + +# per-function register layout +$cnt="%rcx"; +$rem="%rdx"; + +sub LB() { my $r=shift; $r =~ s/%[er]([a-d])x/%\1l/ or + $r =~ s/%[er]([sd]i)/%\1l/ or + $r =~ s/%[er](bp)/%\1l/ or + $r =~ s/%(r[0-9]+)[d]?/%\1b/; $r; } + +sub AUTOLOAD() # thunk [simplified] 32-bit style perlasm +{ my $opcode = $AUTOLOAD; $opcode =~ s/.*:://; + my $arg = pop; + $arg = "\$$arg" if ($arg*1 eq $arg); + $code .= "\t$opcode\t".join(',',$arg,reverse @_)."\n"; +} + +{ my $N; + sub loop() { + my $inp = shift; + + $N++; +$code.=<<___; + xor $nlo,$nlo + xor $nhi,$nhi + mov `&LB("$Zlo")`,`&LB("$nlo")` + mov `&LB("$Zlo")`,`&LB("$nhi")` + shl \$4,`&LB("$nlo")` + mov \$14,$cnt + mov 8($Htbl,$nlo),$Zlo + mov ($Htbl,$nlo),$Zhi + and \$0xf0,`&LB("$nhi")` + mov $Zlo,$rem + jmp .Loop$N + +.align 16 +.Loop$N: + shr \$4,$Zlo + and \$0xf,$rem + mov $Zhi,$tmp + mov ($inp,$cnt),`&LB("$nlo")` + shr \$4,$Zhi + xor 8($Htbl,$nhi),$Zlo + shl \$60,$tmp + xor ($Htbl,$nhi),$Zhi + mov `&LB("$nlo")`,`&LB("$nhi")` + xor ($rem_4bit,$rem,8),$Zhi + mov $Zlo,$rem + shl \$4,`&LB("$nlo")` + xor $tmp,$Zlo + dec $cnt + js .Lbreak$N + + shr \$4,$Zlo + and \$0xf,$rem + mov $Zhi,$tmp + shr \$4,$Zhi + xor 8($Htbl,$nlo),$Zlo + shl \$60,$tmp + xor ($Htbl,$nlo),$Zhi + and \$0xf0,`&LB("$nhi")` + xor ($rem_4bit,$rem,8),$Zhi + mov $Zlo,$rem + xor $tmp,$Zlo + jmp .Loop$N + +.align 16 +.Lbreak$N: + shr \$4,$Zlo + and \$0xf,$rem + mov $Zhi,$tmp + shr \$4,$Zhi + xor 8($Htbl,$nlo),$Zlo + shl \$60,$tmp + xor ($Htbl,$nlo),$Zhi + and \$0xf0,`&LB("$nhi")` + xor ($rem_4bit,$rem,8),$Zhi + mov $Zlo,$rem + xor $tmp,$Zlo + + shr \$4,$Zlo + and \$0xf,$rem + mov $Zhi,$tmp + shr \$4,$Zhi + xor 8($Htbl,$nhi),$Zlo + shl \$60,$tmp + xor ($Htbl,$nhi),$Zhi + xor $tmp,$Zlo + xor ($rem_4bit,$rem,8),$Zhi + + bswap $Zlo + bswap $Zhi +___ +}} + +$code=<<___; +.text + +.globl gcm_gmult_4bit +.type gcm_gmult_4bit,\@function,2 +.align 16 +gcm_gmult_4bit: + push %rbx + push %rbp # %rbp and %r12 are pushed exclusively in + push %r12 # order to reuse Win64 exception handler... +.Lgmult_prologue: + + movzb 15($Xi),$Zlo + lea .Lrem_4bit(%rip),$rem_4bit +___ + &loop ($Xi); +$code.=<<___; + mov $Zlo,8($Xi) + mov $Zhi,($Xi) + + mov 16(%rsp),%rbx + lea 24(%rsp),%rsp +.Lgmult_epilogue: + ret +.size gcm_gmult_4bit,.-gcm_gmult_4bit +___ + +# per-function register layout +$inp="%rdx"; +$len="%rcx"; +$rem_8bit=$rem_4bit; + +$code.=<<___; +.globl gcm_ghash_4bit +.type gcm_ghash_4bit,\@function,4 +.align 16 +gcm_ghash_4bit: + push %rbx + push %rbp + push %r12 + push %r13 + push %r14 + push %r15 + sub \$280,%rsp +.Lghash_prologue: + mov $inp,%r14 # reassign couple of args + mov $len,%r15 +___ +{ my $inp="%r14"; + my $dat="%edx"; + my $len="%r15"; + my @nhi=("%ebx","%ecx"); + my @rem=("%r12","%r13"); + my $Hshr4="%rbp"; + + &sub ($Htbl,-128); # size optimization + &lea ($Hshr4,"16+128(%rsp)"); + { my @lo =($nlo,$nhi); + my @hi =($Zlo,$Zhi); + + &xor ($dat,$dat); + for ($i=0,$j=-2;$i<18;$i++,$j++) { + &mov ("$j(%rsp)",&LB($dat)) if ($i>1); + &or ($lo[0],$tmp) if ($i>1); + &mov (&LB($dat),&LB($lo[1])) if ($i>0 && $i<17); + &shr ($lo[1],4) if ($i>0 && $i<17); + &mov ($tmp,$hi[1]) if ($i>0 && $i<17); + &shr ($hi[1],4) if ($i>0 && $i<17); + &mov ("8*$j($Hshr4)",$hi[0]) if ($i>1); + &mov ($hi[0],"16*$i+0-128($Htbl)") if ($i<16); + &shl (&LB($dat),4) if ($i>0 && $i<17); + &mov ("8*$j-128($Hshr4)",$lo[0]) if ($i>1); + &mov ($lo[0],"16*$i+8-128($Htbl)") if ($i<16); + &shl ($tmp,60) if ($i>0 && $i<17); + + push (@lo,shift(@lo)); + push (@hi,shift(@hi)); + } + } + &add ($Htbl,-128); + &mov ($Zlo,"8($Xi)"); + &mov ($Zhi,"0($Xi)"); + &add ($len,$inp); # pointer to the end of data + &lea ($rem_8bit,".Lrem_8bit(%rip)"); + &jmp (".Louter_loop"); + +$code.=".align 16\n.Louter_loop:\n"; + &xor ($Zhi,"($inp)"); + &mov ("%rdx","8($inp)"); + &lea ($inp,"16($inp)"); + &xor ("%rdx",$Zlo); + &mov ("($Xi)",$Zhi); + &mov ("8($Xi)","%rdx"); + &shr ("%rdx",32); + + &xor ($nlo,$nlo); + &rol ($dat,8); + &mov (&LB($nlo),&LB($dat)); + &movz ($nhi[0],&LB($dat)); + &shl (&LB($nlo),4); + &shr ($nhi[0],4); + + for ($j=11,$i=0;$i<15;$i++) { + &rol ($dat,8); + &xor ($Zlo,"8($Htbl,$nlo)") if ($i>0); + &xor ($Zhi,"($Htbl,$nlo)") if ($i>0); + &mov ($Zlo,"8($Htbl,$nlo)") if ($i==0); + &mov ($Zhi,"($Htbl,$nlo)") if ($i==0); + + &mov (&LB($nlo),&LB($dat)); + &xor ($Zlo,$tmp) if ($i>0); + &movzw ($rem[1],"($rem_8bit,$rem[1],2)") if ($i>0); + + &movz ($nhi[1],&LB($dat)); + &shl (&LB($nlo),4); + &movzb ($rem[0],"(%rsp,$nhi[0])"); + + &shr ($nhi[1],4) if ($i<14); + &and ($nhi[1],0xf0) if ($i==14); + &shl ($rem[1],48) if ($i>0); + &xor ($rem[0],$Zlo); + + &mov ($tmp,$Zhi); + &xor ($Zhi,$rem[1]) if ($i>0); + &shr ($Zlo,8); + + &movz ($rem[0],&LB($rem[0])); + &mov ($dat,"$j($Xi)") if (--$j%4==0); + &shr ($Zhi,8); + + &xor ($Zlo,"-128($Hshr4,$nhi[0],8)"); + &shl ($tmp,56); + &xor ($Zhi,"($Hshr4,$nhi[0],8)"); + + unshift (@nhi,pop(@nhi)); # "rotate" registers + unshift (@rem,pop(@rem)); + } + &movzw ($rem[1],"($rem_8bit,$rem[1],2)"); + &xor ($Zlo,"8($Htbl,$nlo)"); + &xor ($Zhi,"($Htbl,$nlo)"); + + &shl ($rem[1],48); + &xor ($Zlo,$tmp); + + &xor ($Zhi,$rem[1]); + &movz ($rem[0],&LB($Zlo)); + &shr ($Zlo,4); + + &mov ($tmp,$Zhi); + &shl (&LB($rem[0]),4); + &shr ($Zhi,4); + + &xor ($Zlo,"8($Htbl,$nhi[0])"); + &movzw ($rem[0],"($rem_8bit,$rem[0],2)"); + &shl ($tmp,60); + + &xor ($Zhi,"($Htbl,$nhi[0])"); + &xor ($Zlo,$tmp); + &shl ($rem[0],48); + + &bswap ($Zlo); + &xor ($Zhi,$rem[0]); + + &bswap ($Zhi); + &cmp ($inp,$len); + &jb (".Louter_loop"); +} +$code.=<<___; + mov $Zlo,8($Xi) + mov $Zhi,($Xi) + + lea 280(%rsp),%rsi + mov 0(%rsi),%r15 + mov 8(%rsi),%r14 + mov 16(%rsi),%r13 + mov 24(%rsi),%r12 + mov 32(%rsi),%rbp + mov 40(%rsi),%rbx + lea 48(%rsi),%rsp +.Lghash_epilogue: + ret +.size gcm_ghash_4bit,.-gcm_ghash_4bit +___ + +###################################################################### +# PCLMULQDQ version. + +@_4args=$win64? ("%rcx","%rdx","%r8", "%r9") : # Win64 order + ("%rdi","%rsi","%rdx","%rcx"); # Unix order + +($Xi,$Xhi)=("%xmm0","%xmm1"); $Hkey="%xmm2"; +($T1,$T2,$T3)=("%xmm3","%xmm4","%xmm5"); + +sub clmul64x64_T2 { # minimal register pressure +my ($Xhi,$Xi,$Hkey,$modulo)=@_; + +$code.=<<___ if (!defined($modulo)); + movdqa $Xi,$Xhi # + pshufd \$0b01001110,$Xi,$T1 + pshufd \$0b01001110,$Hkey,$T2 + pxor $Xi,$T1 # + pxor $Hkey,$T2 +___ +$code.=<<___; + pclmulqdq \$0x00,$Hkey,$Xi ####### + pclmulqdq \$0x11,$Hkey,$Xhi ####### + pclmulqdq \$0x00,$T2,$T1 ####### + pxor $Xi,$T1 # + pxor $Xhi,$T1 # + + movdqa $T1,$T2 # + psrldq \$8,$T1 + pslldq \$8,$T2 # + pxor $T1,$Xhi + pxor $T2,$Xi # +___ +} + +sub reduction_alg9 { # 17/13 times faster than Intel version +my ($Xhi,$Xi) = @_; + +$code.=<<___; + # 1st phase + movdqa $Xi,$T1 # + psllq \$1,$Xi + pxor $T1,$Xi # + psllq \$5,$Xi # + pxor $T1,$Xi # + psllq \$57,$Xi # + movdqa $Xi,$T2 # + pslldq \$8,$Xi + psrldq \$8,$T2 # + pxor $T1,$Xi + pxor $T2,$Xhi # + + # 2nd phase + movdqa $Xi,$T2 + psrlq \$5,$Xi + pxor $T2,$Xi # + psrlq \$1,$Xi # + pxor $T2,$Xi # + pxor $Xhi,$T2 + psrlq \$1,$Xi # + pxor $T2,$Xi # +___ +} + +{ my ($Htbl,$Xip)=@_4args; + +$code.=<<___; +.globl gcm_init_clmul +.type gcm_init_clmul,\@abi-omnipotent +.align 16 +gcm_init_clmul: + movdqu ($Xip),$Hkey + pshufd \$0b01001110,$Hkey,$Hkey # dword swap + + # <<1 twist + pshufd \$0b11111111,$Hkey,$T2 # broadcast uppermost dword + movdqa $Hkey,$T1 + psllq \$1,$Hkey + pxor $T3,$T3 # + psrlq \$63,$T1 + pcmpgtd $T2,$T3 # broadcast carry bit + pslldq \$8,$T1 + por $T1,$Hkey # H<<=1 + + # magic reduction + pand .L0x1c2_polynomial(%rip),$T3 + pxor $T3,$Hkey # if(carry) H^=0x1c2_polynomial + + # calculate H^2 + movdqa $Hkey,$Xi +___ + &clmul64x64_T2 ($Xhi,$Xi,$Hkey); + &reduction_alg9 ($Xhi,$Xi); +$code.=<<___; + movdqu $Hkey,($Htbl) # save H + movdqu $Xi,16($Htbl) # save H^2 + ret +.size gcm_init_clmul,.-gcm_init_clmul +___ +} + +{ my ($Xip,$Htbl)=@_4args; + +$code.=<<___; +.globl gcm_gmult_clmul +.type gcm_gmult_clmul,\@abi-omnipotent +.align 16 +gcm_gmult_clmul: + movdqu ($Xip),$Xi + movdqa .Lbswap_mask(%rip),$T3 + movdqu ($Htbl),$Hkey + pshufb $T3,$Xi +___ + &clmul64x64_T2 ($Xhi,$Xi,$Hkey); + &reduction_alg9 ($Xhi,$Xi); +$code.=<<___; + pshufb $T3,$Xi + movdqu $Xi,($Xip) + ret +.size gcm_gmult_clmul,.-gcm_gmult_clmul +___ +} + +{ my ($Xip,$Htbl,$inp,$len)=@_4args; + my $Xn="%xmm6"; + my $Xhn="%xmm7"; + my $Hkey2="%xmm8"; + my $T1n="%xmm9"; + my $T2n="%xmm10"; + +$code.=<<___; +.globl gcm_ghash_clmul +.type gcm_ghash_clmul,\@abi-omnipotent +.align 16 +gcm_ghash_clmul: +___ +$code.=<<___ if ($win64); +.LSEH_begin_gcm_ghash_clmul: + # I can't trust assembler to use specific encoding:-( + .byte 0x48,0x83,0xec,0x58 #sub \$0x58,%rsp + .byte 0x0f,0x29,0x34,0x24 #movaps %xmm6,(%rsp) + .byte 0x0f,0x29,0x7c,0x24,0x10 #movdqa %xmm7,0x10(%rsp) + .byte 0x44,0x0f,0x29,0x44,0x24,0x20 #movaps %xmm8,0x20(%rsp) + .byte 0x44,0x0f,0x29,0x4c,0x24,0x30 #movaps %xmm9,0x30(%rsp) + .byte 0x44,0x0f,0x29,0x54,0x24,0x40 #movaps %xmm10,0x40(%rsp) +___ +$code.=<<___; + movdqa .Lbswap_mask(%rip),$T3 + + movdqu ($Xip),$Xi + movdqu ($Htbl),$Hkey + pshufb $T3,$Xi + + sub \$0x10,$len + jz .Lodd_tail + + movdqu 16($Htbl),$Hkey2 + ####### + # Xi+2 =[H*(Ii+1 + Xi+1)] mod P = + # [(H*Ii+1) + (H*Xi+1)] mod P = + # [(H*Ii+1) + H^2*(Ii+Xi)] mod P + # + movdqu ($inp),$T1 # Ii + movdqu 16($inp),$Xn # Ii+1 + pshufb $T3,$T1 + pshufb $T3,$Xn + pxor $T1,$Xi # Ii+Xi +___ + &clmul64x64_T2 ($Xhn,$Xn,$Hkey); # H*Ii+1 +$code.=<<___; + movdqa $Xi,$Xhi # + pshufd \$0b01001110,$Xi,$T1 + pshufd \$0b01001110,$Hkey2,$T2 + pxor $Xi,$T1 # + pxor $Hkey2,$T2 + + lea 32($inp),$inp # i+=2 + sub \$0x20,$len + jbe .Leven_tail + +.Lmod_loop: +___ + &clmul64x64_T2 ($Xhi,$Xi,$Hkey2,1); # H^2*(Ii+Xi) +$code.=<<___; + movdqu ($inp),$T1 # Ii + pxor $Xn,$Xi # (H*Ii+1) + H^2*(Ii+Xi) + pxor $Xhn,$Xhi + + movdqu 16($inp),$Xn # Ii+1 + pshufb $T3,$T1 + pshufb $T3,$Xn + + movdqa $Xn,$Xhn # + pshufd \$0b01001110,$Xn,$T1n + pshufd \$0b01001110,$Hkey,$T2n + pxor $Xn,$T1n # + pxor $Hkey,$T2n + pxor $T1,$Xhi # "Ii+Xi", consume early + + movdqa $Xi,$T1 # 1st phase + psllq \$1,$Xi + pxor $T1,$Xi # + psllq \$5,$Xi # + pxor $T1,$Xi # + pclmulqdq \$0x00,$Hkey,$Xn ####### + psllq \$57,$Xi # + movdqa $Xi,$T2 # + pslldq \$8,$Xi + psrldq \$8,$T2 # + pxor $T1,$Xi + pxor $T2,$Xhi # + + pclmulqdq \$0x11,$Hkey,$Xhn ####### + movdqa $Xi,$T2 # 2nd phase + psrlq \$5,$Xi + pxor $T2,$Xi # + psrlq \$1,$Xi # + pxor $T2,$Xi # + pxor $Xhi,$T2 + psrlq \$1,$Xi # + pxor $T2,$Xi # + + pclmulqdq \$0x00,$T2n,$T1n ####### + movdqa $Xi,$Xhi # + pshufd \$0b01001110,$Xi,$T1 + pshufd \$0b01001110,$Hkey2,$T2 + pxor $Xi,$T1 # + pxor $Hkey2,$T2 + + pxor $Xn,$T1n # + pxor $Xhn,$T1n # + movdqa $T1n,$T2n # + psrldq \$8,$T1n + pslldq \$8,$T2n # + pxor $T1n,$Xhn + pxor $T2n,$Xn # + + lea 32($inp),$inp + sub \$0x20,$len + ja .Lmod_loop + +.Leven_tail: +___ + &clmul64x64_T2 ($Xhi,$Xi,$Hkey2,1); # H^2*(Ii+Xi) +$code.=<<___; + pxor $Xn,$Xi # (H*Ii+1) + H^2*(Ii+Xi) + pxor $Xhn,$Xhi +___ + &reduction_alg9 ($Xhi,$Xi); +$code.=<<___; + test $len,$len + jnz .Ldone + +.Lodd_tail: + movdqu ($inp),$T1 # Ii + pshufb $T3,$T1 + pxor $T1,$Xi # Ii+Xi +___ + &clmul64x64_T2 ($Xhi,$Xi,$Hkey); # H*(Ii+Xi) + &reduction_alg9 ($Xhi,$Xi); +$code.=<<___; +.Ldone: + pshufb $T3,$Xi + movdqu $Xi,($Xip) +___ +$code.=<<___ if ($win64); + movaps (%rsp),%xmm6 + movaps 0x10(%rsp),%xmm7 + movaps 0x20(%rsp),%xmm8 + movaps 0x30(%rsp),%xmm9 + movaps 0x40(%rsp),%xmm10 + add \$0x58,%rsp +___ +$code.=<<___; + ret +.LSEH_end_gcm_ghash_clmul: +.size gcm_ghash_clmul,.-gcm_ghash_clmul +___ +} + +$code.=<<___; +.align 64 +.Lbswap_mask: + .byte 15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0 +.L0x1c2_polynomial: + .byte 1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0xc2 +.align 64 +.type .Lrem_4bit,\@object +.Lrem_4bit: + .long 0,`0x0000<<16`,0,`0x1C20<<16`,0,`0x3840<<16`,0,`0x2460<<16` + .long 0,`0x7080<<16`,0,`0x6CA0<<16`,0,`0x48C0<<16`,0,`0x54E0<<16` + .long 0,`0xE100<<16`,0,`0xFD20<<16`,0,`0xD940<<16`,0,`0xC560<<16` + .long 0,`0x9180<<16`,0,`0x8DA0<<16`,0,`0xA9C0<<16`,0,`0xB5E0<<16` +.type .Lrem_8bit,\@object +.Lrem_8bit: + .value 0x0000,0x01C2,0x0384,0x0246,0x0708,0x06CA,0x048C,0x054E + .value 0x0E10,0x0FD2,0x0D94,0x0C56,0x0918,0x08DA,0x0A9C,0x0B5E + .value 0x1C20,0x1DE2,0x1FA4,0x1E66,0x1B28,0x1AEA,0x18AC,0x196E + .value 0x1230,0x13F2,0x11B4,0x1076,0x1538,0x14FA,0x16BC,0x177E + .value 0x3840,0x3982,0x3BC4,0x3A06,0x3F48,0x3E8A,0x3CCC,0x3D0E + .value 0x3650,0x3792,0x35D4,0x3416,0x3158,0x309A,0x32DC,0x331E + .value 0x2460,0x25A2,0x27E4,0x2626,0x2368,0x22AA,0x20EC,0x212E + .value 0x2A70,0x2BB2,0x29F4,0x2836,0x2D78,0x2CBA,0x2EFC,0x2F3E + .value 0x7080,0x7142,0x7304,0x72C6,0x7788,0x764A,0x740C,0x75CE + .value 0x7E90,0x7F52,0x7D14,0x7CD6,0x7998,0x785A,0x7A1C,0x7BDE + .value 0x6CA0,0x6D62,0x6F24,0x6EE6,0x6BA8,0x6A6A,0x682C,0x69EE + .value 0x62B0,0x6372,0x6134,0x60F6,0x65B8,0x647A,0x663C,0x67FE + .value 0x48C0,0x4902,0x4B44,0x4A86,0x4FC8,0x4E0A,0x4C4C,0x4D8E + .value 0x46D0,0x4712,0x4554,0x4496,0x41D8,0x401A,0x425C,0x439E + .value 0x54E0,0x5522,0x5764,0x56A6,0x53E8,0x522A,0x506C,0x51AE + .value 0x5AF0,0x5B32,0x5974,0x58B6,0x5DF8,0x5C3A,0x5E7C,0x5FBE + .value 0xE100,0xE0C2,0xE284,0xE346,0xE608,0xE7CA,0xE58C,0xE44E + .value 0xEF10,0xEED2,0xEC94,0xED56,0xE818,0xE9DA,0xEB9C,0xEA5E + .value 0xFD20,0xFCE2,0xFEA4,0xFF66,0xFA28,0xFBEA,0xF9AC,0xF86E + .value 0xF330,0xF2F2,0xF0B4,0xF176,0xF438,0xF5FA,0xF7BC,0xF67E + .value 0xD940,0xD882,0xDAC4,0xDB06,0xDE48,0xDF8A,0xDDCC,0xDC0E + .value 0xD750,0xD692,0xD4D4,0xD516,0xD058,0xD19A,0xD3DC,0xD21E + .value 0xC560,0xC4A2,0xC6E4,0xC726,0xC268,0xC3AA,0xC1EC,0xC02E + .value 0xCB70,0xCAB2,0xC8F4,0xC936,0xCC78,0xCDBA,0xCFFC,0xCE3E + .value 0x9180,0x9042,0x9204,0x93C6,0x9688,0x974A,0x950C,0x94CE + .value 0x9F90,0x9E52,0x9C14,0x9DD6,0x9898,0x995A,0x9B1C,0x9ADE + .value 0x8DA0,0x8C62,0x8E24,0x8FE6,0x8AA8,0x8B6A,0x892C,0x88EE + .value 0x83B0,0x8272,0x8034,0x81F6,0x84B8,0x857A,0x873C,0x86FE + .value 0xA9C0,0xA802,0xAA44,0xAB86,0xAEC8,0xAF0A,0xAD4C,0xAC8E + .value 0xA7D0,0xA612,0xA454,0xA596,0xA0D8,0xA11A,0xA35C,0xA29E + .value 0xB5E0,0xB422,0xB664,0xB7A6,0xB2E8,0xB32A,0xB16C,0xB0AE + .value 0xBBF0,0xBA32,0xB874,0xB9B6,0xBCF8,0xBD3A,0xBF7C,0xBEBE + +.asciz "GHASH for x86_64, CRYPTOGAMS by " +.align 64 +___ + +# EXCEPTION_DISPOSITION handler (EXCEPTION_RECORD *rec,ULONG64 frame, +# CONTEXT *context,DISPATCHER_CONTEXT *disp) +if ($win64) { +$rec="%rcx"; +$frame="%rdx"; +$context="%r8"; +$disp="%r9"; + +$code.=<<___; +.extern __imp_RtlVirtualUnwind +.type se_handler,\@abi-omnipotent +.align 16 +se_handler: + push %rsi + push %rdi + push %rbx + push %rbp + push %r12 + push %r13 + push %r14 + push %r15 + pushfq + sub \$64,%rsp + + mov 120($context),%rax # pull context->Rax + mov 248($context),%rbx # pull context->Rip + + mov 8($disp),%rsi # disp->ImageBase + mov 56($disp),%r11 # disp->HandlerData + + mov 0(%r11),%r10d # HandlerData[0] + lea (%rsi,%r10),%r10 # prologue label + cmp %r10,%rbx # context->RipRsp + + mov 4(%r11),%r10d # HandlerData[1] + lea (%rsi,%r10),%r10 # epilogue label + cmp %r10,%rbx # context->Rip>=epilogue label + jae .Lin_prologue + + lea 24(%rax),%rax # adjust "rsp" + + mov -8(%rax),%rbx + mov -16(%rax),%rbp + mov -24(%rax),%r12 + mov %rbx,144($context) # restore context->Rbx + mov %rbp,160($context) # restore context->Rbp + mov %r12,216($context) # restore context->R12 + +.Lin_prologue: + mov 8(%rax),%rdi + mov 16(%rax),%rsi + mov %rax,152($context) # restore context->Rsp + mov %rsi,168($context) # restore context->Rsi + mov %rdi,176($context) # restore context->Rdi + + mov 40($disp),%rdi # disp->ContextRecord + mov $context,%rsi # context + mov \$`1232/8`,%ecx # sizeof(CONTEXT) + .long 0xa548f3fc # cld; rep movsq + + mov $disp,%rsi + xor %rcx,%rcx # arg1, UNW_FLAG_NHANDLER + mov 8(%rsi),%rdx # arg2, disp->ImageBase + mov 0(%rsi),%r8 # arg3, disp->ControlPc + mov 16(%rsi),%r9 # arg4, disp->FunctionEntry + mov 40(%rsi),%r10 # disp->ContextRecord + lea 56(%rsi),%r11 # &disp->HandlerData + lea 24(%rsi),%r12 # &disp->EstablisherFrame + mov %r10,32(%rsp) # arg5 + mov %r11,40(%rsp) # arg6 + mov %r12,48(%rsp) # arg7 + mov %rcx,56(%rsp) # arg8, (NULL) + call *__imp_RtlVirtualUnwind(%rip) + + mov \$1,%eax # ExceptionContinueSearch + add \$64,%rsp + popfq + pop %r15 + pop %r14 + pop %r13 + pop %r12 + pop %rbp + pop %rbx + pop %rdi + pop %rsi + ret +.size se_handler,.-se_handler + +.section .pdata +.align 4 + .rva .LSEH_begin_gcm_gmult_4bit + .rva .LSEH_end_gcm_gmult_4bit + .rva .LSEH_info_gcm_gmult_4bit + + .rva .LSEH_begin_gcm_ghash_4bit + .rva .LSEH_end_gcm_ghash_4bit + .rva .LSEH_info_gcm_ghash_4bit + + .rva .LSEH_begin_gcm_ghash_clmul + .rva .LSEH_end_gcm_ghash_clmul + .rva .LSEH_info_gcm_ghash_clmul + +.section .xdata +.align 8 +.LSEH_info_gcm_gmult_4bit: + .byte 9,0,0,0 + .rva se_handler + .rva .Lgmult_prologue,.Lgmult_epilogue # HandlerData +.LSEH_info_gcm_ghash_4bit: + .byte 9,0,0,0 + .rva se_handler + .rva .Lghash_prologue,.Lghash_epilogue # HandlerData +.LSEH_info_gcm_ghash_clmul: + .byte 0x01,0x1f,0x0b,0x00 + .byte 0x1f,0xa8,0x04,0x00 #movaps 0x40(rsp),xmm10 + .byte 0x19,0x98,0x03,0x00 #movaps 0x30(rsp),xmm9 + .byte 0x13,0x88,0x02,0x00 #movaps 0x20(rsp),xmm8 + .byte 0x0d,0x78,0x01,0x00 #movaps 0x10(rsp),xmm7 + .byte 0x08,0x68,0x00,0x00 #movaps (rsp),xmm6 + .byte 0x04,0xa2,0x00,0x00 #sub rsp,0x58 +___ +} + +$code =~ s/\`([^\`]*)\`/eval($1)/gem; + +print $code; + +close STDOUT; diff --git a/crypto/openssl/crypto/modes/cbc128.c b/crypto/openssl/crypto/modes/cbc128.c index 8f8bd563b9..3d3782cbe1 100644 --- a/crypto/openssl/crypto/modes/cbc128.c +++ b/crypto/openssl/crypto/modes/cbc128.c @@ -48,7 +48,8 @@ * */ -#include "modes.h" +#include +#include "modes_lcl.h" #include #ifndef MODES_DEBUG @@ -58,12 +59,7 @@ #endif #include -#define STRICT_ALIGNMENT 1 -#if defined(__i386) || defined(__i386__) || \ - defined(__x86_64) || defined(__x86_64__) || \ - defined(_M_IX86) || defined(_M_AMD64) || defined(_M_X64) || \ - defined(__s390__) || defined(__s390x__) -# undef STRICT_ALIGNMENT +#ifndef STRICT_ALIGNMENT # define STRICT_ALIGNMENT 0 #endif diff --git a/crypto/openssl/crypto/modes/ccm128.c b/crypto/openssl/crypto/modes/ccm128.c new file mode 100644 index 0000000000..c9b35e5b35 --- /dev/null +++ b/crypto/openssl/crypto/modes/ccm128.c @@ -0,0 +1,441 @@ +/* ==================================================================== + * Copyright (c) 2011 The OpenSSL Project. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * + * 3. All advertising materials mentioning features or use of this + * software must display the following acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit. (http://www.openssl.org/)" + * + * 4. The names "OpenSSL Toolkit" and "OpenSSL Project" must not be used to + * endorse or promote products derived from this software without + * prior written permission. For written permission, please contact + * openssl-core@openssl.org. + * + * 5. Products derived from this software may not be called "OpenSSL" + * nor may "OpenSSL" appear in their names without prior written + * permission of the OpenSSL Project. + * + * 6. Redistributions of any form whatsoever must retain the following + * acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit (http://www.openssl.org/)" + * + * THIS SOFTWARE IS PROVIDED BY THE OpenSSL PROJECT ``AS IS'' AND ANY + * EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR + * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE OpenSSL PROJECT OR + * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED + * OF THE POSSIBILITY OF SUCH DAMAGE. + * ==================================================================== + */ + +#include +#include "modes_lcl.h" +#include + +#ifndef MODES_DEBUG +# ifndef NDEBUG +# define NDEBUG +# endif +#endif +#include + +/* First you setup M and L parameters and pass the key schedule. + * This is called once per session setup... */ +void CRYPTO_ccm128_init(CCM128_CONTEXT *ctx, + unsigned int M,unsigned int L,void *key,block128_f block) +{ + memset(ctx->nonce.c,0,sizeof(ctx->nonce.c)); + ctx->nonce.c[0] = ((u8)(L-1)&7) | (u8)(((M-2)/2)&7)<<3; + ctx->blocks = 0; + ctx->block = block; + ctx->key = key; +} + +/* !!! Following interfaces are to be called *once* per packet !!! */ + +/* Then you setup per-message nonce and pass the length of the message */ +int CRYPTO_ccm128_setiv(CCM128_CONTEXT *ctx, + const unsigned char *nonce,size_t nlen,size_t mlen) +{ + unsigned int L = ctx->nonce.c[0]&7; /* the L parameter */ + + if (nlen<(14-L)) return -1; /* nonce is too short */ + + if (sizeof(mlen)==8 && L>=3) { + ctx->nonce.c[8] = (u8)(mlen>>(56%(sizeof(mlen)*8))); + ctx->nonce.c[9] = (u8)(mlen>>(48%(sizeof(mlen)*8))); + ctx->nonce.c[10] = (u8)(mlen>>(40%(sizeof(mlen)*8))); + ctx->nonce.c[11] = (u8)(mlen>>(32%(sizeof(mlen)*8))); + } + else + *(u32*)(&ctx->nonce.c[8]) = 0; + + ctx->nonce.c[12] = (u8)(mlen>>24); + ctx->nonce.c[13] = (u8)(mlen>>16); + ctx->nonce.c[14] = (u8)(mlen>>8); + ctx->nonce.c[15] = (u8)mlen; + + ctx->nonce.c[0] &= ~0x40; /* clear Adata flag */ + memcpy(&ctx->nonce.c[1],nonce,14-L); + + return 0; +} + +/* Then you pass additional authentication data, this is optional */ +void CRYPTO_ccm128_aad(CCM128_CONTEXT *ctx, + const unsigned char *aad,size_t alen) +{ unsigned int i; + block128_f block = ctx->block; + + if (alen==0) return; + + ctx->nonce.c[0] |= 0x40; /* set Adata flag */ + (*block)(ctx->nonce.c,ctx->cmac.c,ctx->key), + ctx->blocks++; + + if (alen<(0x10000-0x100)) { + ctx->cmac.c[0] ^= (u8)(alen>>8); + ctx->cmac.c[1] ^= (u8)alen; + i=2; + } + else if (sizeof(alen)==8 && alen>=(size_t)1<<(32%(sizeof(alen)*8))) { + ctx->cmac.c[0] ^= 0xFF; + ctx->cmac.c[1] ^= 0xFF; + ctx->cmac.c[2] ^= (u8)(alen>>(56%(sizeof(alen)*8))); + ctx->cmac.c[3] ^= (u8)(alen>>(48%(sizeof(alen)*8))); + ctx->cmac.c[4] ^= (u8)(alen>>(40%(sizeof(alen)*8))); + ctx->cmac.c[5] ^= (u8)(alen>>(32%(sizeof(alen)*8))); + ctx->cmac.c[6] ^= (u8)(alen>>24); + ctx->cmac.c[7] ^= (u8)(alen>>16); + ctx->cmac.c[8] ^= (u8)(alen>>8); + ctx->cmac.c[9] ^= (u8)alen; + i=10; + } + else { + ctx->cmac.c[0] ^= 0xFF; + ctx->cmac.c[1] ^= 0xFE; + ctx->cmac.c[2] ^= (u8)(alen>>24); + ctx->cmac.c[3] ^= (u8)(alen>>16); + ctx->cmac.c[4] ^= (u8)(alen>>8); + ctx->cmac.c[5] ^= (u8)alen; + i=6; + } + + do { + for(;i<16 && alen;++i,++aad,--alen) + ctx->cmac.c[i] ^= *aad; + (*block)(ctx->cmac.c,ctx->cmac.c,ctx->key), + ctx->blocks++; + i=0; + } while (alen); +} + +/* Finally you encrypt or decrypt the message */ + +/* counter part of nonce may not be larger than L*8 bits, + * L is not larger than 8, therefore 64-bit counter... */ +static void ctr64_inc(unsigned char *counter) { + unsigned int n=8; + u8 c; + + counter += 8; + do { + --n; + c = counter[n]; + ++c; + counter[n] = c; + if (c) return; + } while (n); +} + +int CRYPTO_ccm128_encrypt(CCM128_CONTEXT *ctx, + const unsigned char *inp, unsigned char *out, + size_t len) +{ + size_t n; + unsigned int i,L; + unsigned char flags0 = ctx->nonce.c[0]; + block128_f block = ctx->block; + void * key = ctx->key; + union { u64 u[2]; u8 c[16]; } scratch; + + if (!(flags0&0x40)) + (*block)(ctx->nonce.c,ctx->cmac.c,key), + ctx->blocks++; + + ctx->nonce.c[0] = L = flags0&7; + for (n=0,i=15-L;i<15;++i) { + n |= ctx->nonce.c[i]; + ctx->nonce.c[i]=0; + n <<= 8; + } + n |= ctx->nonce.c[15]; /* reconstructed length */ + ctx->nonce.c[15]=1; + + if (n!=len) return -1; /* length mismatch */ + + ctx->blocks += ((len+15)>>3)|1; + if (ctx->blocks > (U64(1)<<61)) return -2; /* too much data */ + + while (len>=16) { +#if defined(STRICT_ALIGNMENT) + union { u64 u[2]; u8 c[16]; } temp; + + memcpy (temp.c,inp,16); + ctx->cmac.u[0] ^= temp.u[0]; + ctx->cmac.u[1] ^= temp.u[1]; +#else + ctx->cmac.u[0] ^= ((u64*)inp)[0]; + ctx->cmac.u[1] ^= ((u64*)inp)[1]; +#endif + (*block)(ctx->cmac.c,ctx->cmac.c,key); + (*block)(ctx->nonce.c,scratch.c,key); + ctr64_inc(ctx->nonce.c); +#if defined(STRICT_ALIGNMENT) + temp.u[0] ^= scratch.u[0]; + temp.u[1] ^= scratch.u[1]; + memcpy(out,temp.c,16); +#else + ((u64*)out)[0] = scratch.u[0]^((u64*)inp)[0]; + ((u64*)out)[1] = scratch.u[1]^((u64*)inp)[1]; +#endif + inp += 16; + out += 16; + len -= 16; + } + + if (len) { + for (i=0; icmac.c[i] ^= inp[i]; + (*block)(ctx->cmac.c,ctx->cmac.c,key); + (*block)(ctx->nonce.c,scratch.c,key); + for (i=0; inonce.c[i]=0; + + (*block)(ctx->nonce.c,scratch.c,key); + ctx->cmac.u[0] ^= scratch.u[0]; + ctx->cmac.u[1] ^= scratch.u[1]; + + ctx->nonce.c[0] = flags0; + + return 0; +} + +int CRYPTO_ccm128_decrypt(CCM128_CONTEXT *ctx, + const unsigned char *inp, unsigned char *out, + size_t len) +{ + size_t n; + unsigned int i,L; + unsigned char flags0 = ctx->nonce.c[0]; + block128_f block = ctx->block; + void * key = ctx->key; + union { u64 u[2]; u8 c[16]; } scratch; + + if (!(flags0&0x40)) + (*block)(ctx->nonce.c,ctx->cmac.c,key); + + ctx->nonce.c[0] = L = flags0&7; + for (n=0,i=15-L;i<15;++i) { + n |= ctx->nonce.c[i]; + ctx->nonce.c[i]=0; + n <<= 8; + } + n |= ctx->nonce.c[15]; /* reconstructed length */ + ctx->nonce.c[15]=1; + + if (n!=len) return -1; + + while (len>=16) { +#if defined(STRICT_ALIGNMENT) + union { u64 u[2]; u8 c[16]; } temp; +#endif + (*block)(ctx->nonce.c,scratch.c,key); + ctr64_inc(ctx->nonce.c); +#if defined(STRICT_ALIGNMENT) + memcpy (temp.c,inp,16); + ctx->cmac.u[0] ^= (scratch.u[0] ^= temp.u[0]); + ctx->cmac.u[1] ^= (scratch.u[1] ^= temp.u[1]); + memcpy (out,scratch.c,16); +#else + ctx->cmac.u[0] ^= (((u64*)out)[0] = scratch.u[0]^((u64*)inp)[0]); + ctx->cmac.u[1] ^= (((u64*)out)[1] = scratch.u[1]^((u64*)inp)[1]); +#endif + (*block)(ctx->cmac.c,ctx->cmac.c,key); + + inp += 16; + out += 16; + len -= 16; + } + + if (len) { + (*block)(ctx->nonce.c,scratch.c,key); + for (i=0; icmac.c[i] ^= (out[i] = scratch.c[i]^inp[i]); + (*block)(ctx->cmac.c,ctx->cmac.c,key); + } + + for (i=15-L;i<16;++i) + ctx->nonce.c[i]=0; + + (*block)(ctx->nonce.c,scratch.c,key); + ctx->cmac.u[0] ^= scratch.u[0]; + ctx->cmac.u[1] ^= scratch.u[1]; + + ctx->nonce.c[0] = flags0; + + return 0; +} + +static void ctr64_add (unsigned char *counter,size_t inc) +{ size_t n=8, val=0; + + counter += 8; + do { + --n; + val += counter[n] + (inc&0xff); + counter[n] = (unsigned char)val; + val >>= 8; /* carry bit */ + inc >>= 8; + } while(n && (inc || val)); +} + +int CRYPTO_ccm128_encrypt_ccm64(CCM128_CONTEXT *ctx, + const unsigned char *inp, unsigned char *out, + size_t len,ccm128_f stream) +{ + size_t n; + unsigned int i,L; + unsigned char flags0 = ctx->nonce.c[0]; + block128_f block = ctx->block; + void * key = ctx->key; + union { u64 u[2]; u8 c[16]; } scratch; + + if (!(flags0&0x40)) + (*block)(ctx->nonce.c,ctx->cmac.c,key), + ctx->blocks++; + + ctx->nonce.c[0] = L = flags0&7; + for (n=0,i=15-L;i<15;++i) { + n |= ctx->nonce.c[i]; + ctx->nonce.c[i]=0; + n <<= 8; + } + n |= ctx->nonce.c[15]; /* reconstructed length */ + ctx->nonce.c[15]=1; + + if (n!=len) return -1; /* length mismatch */ + + ctx->blocks += ((len+15)>>3)|1; + if (ctx->blocks > (U64(1)<<61)) return -2; /* too much data */ + + if ((n=len/16)) { + (*stream)(inp,out,n,key,ctx->nonce.c,ctx->cmac.c); + n *= 16; + inp += n; + out += n; + len -= n; + if (len) ctr64_add(ctx->nonce.c,n/16); + } + + if (len) { + for (i=0; icmac.c[i] ^= inp[i]; + (*block)(ctx->cmac.c,ctx->cmac.c,key); + (*block)(ctx->nonce.c,scratch.c,key); + for (i=0; inonce.c[i]=0; + + (*block)(ctx->nonce.c,scratch.c,key); + ctx->cmac.u[0] ^= scratch.u[0]; + ctx->cmac.u[1] ^= scratch.u[1]; + + ctx->nonce.c[0] = flags0; + + return 0; +} + +int CRYPTO_ccm128_decrypt_ccm64(CCM128_CONTEXT *ctx, + const unsigned char *inp, unsigned char *out, + size_t len,ccm128_f stream) +{ + size_t n; + unsigned int i,L; + unsigned char flags0 = ctx->nonce.c[0]; + block128_f block = ctx->block; + void * key = ctx->key; + union { u64 u[2]; u8 c[16]; } scratch; + + if (!(flags0&0x40)) + (*block)(ctx->nonce.c,ctx->cmac.c,key); + + ctx->nonce.c[0] = L = flags0&7; + for (n=0,i=15-L;i<15;++i) { + n |= ctx->nonce.c[i]; + ctx->nonce.c[i]=0; + n <<= 8; + } + n |= ctx->nonce.c[15]; /* reconstructed length */ + ctx->nonce.c[15]=1; + + if (n!=len) return -1; + + if ((n=len/16)) { + (*stream)(inp,out,n,key,ctx->nonce.c,ctx->cmac.c); + n *= 16; + inp += n; + out += n; + len -= n; + if (len) ctr64_add(ctx->nonce.c,n/16); + } + + if (len) { + (*block)(ctx->nonce.c,scratch.c,key); + for (i=0; icmac.c[i] ^= (out[i] = scratch.c[i]^inp[i]); + (*block)(ctx->cmac.c,ctx->cmac.c,key); + } + + for (i=15-L;i<16;++i) + ctx->nonce.c[i]=0; + + (*block)(ctx->nonce.c,scratch.c,key); + ctx->cmac.u[0] ^= scratch.u[0]; + ctx->cmac.u[1] ^= scratch.u[1]; + + ctx->nonce.c[0] = flags0; + + return 0; +} + +size_t CRYPTO_ccm128_tag(CCM128_CONTEXT *ctx,unsigned char *tag,size_t len) +{ unsigned int M = (ctx->nonce.c[0]>>3)&7; /* the M parameter */ + + M *= 2; M += 2; + if (lencmac.c,M); + return M; +} diff --git a/crypto/openssl/crypto/modes/cfb128.c b/crypto/openssl/crypto/modes/cfb128.c index e5938c6137..4e6f5d35e1 100644 --- a/crypto/openssl/crypto/modes/cfb128.c +++ b/crypto/openssl/crypto/modes/cfb128.c @@ -48,7 +48,8 @@ * */ -#include "modes.h" +#include +#include "modes_lcl.h" #include #ifndef MODES_DEBUG @@ -58,14 +59,6 @@ #endif #include -#define STRICT_ALIGNMENT -#if defined(__i386) || defined(__i386__) || \ - defined(__x86_64) || defined(__x86_64__) || \ - defined(_M_IX86) || defined(_M_AMD64) || defined(_M_X64) || \ - defined(__s390__) || defined(__s390x__) -# undef STRICT_ALIGNMENT -#endif - /* The input and output encrypted as though 128bit cfb mode is being * used. The extra state information to record how much of the * 128bit block we have used is contained in *num; diff --git a/crypto/openssl/crypto/modes/ctr128.c b/crypto/openssl/crypto/modes/ctr128.c index 932037f551..ee642c5863 100644 --- a/crypto/openssl/crypto/modes/ctr128.c +++ b/crypto/openssl/crypto/modes/ctr128.c @@ -48,7 +48,8 @@ * */ -#include "modes.h" +#include +#include "modes_lcl.h" #include #ifndef MODES_DEBUG @@ -58,17 +59,6 @@ #endif #include -typedef unsigned int u32; -typedef unsigned char u8; - -#define STRICT_ALIGNMENT -#if defined(__i386) || defined(__i386__) || \ - defined(__x86_64) || defined(__x86_64__) || \ - defined(_M_IX86) || defined(_M_AMD64) || defined(_M_X64) || \ - defined(__s390__) || defined(__s390x__) -# undef STRICT_ALIGNMENT -#endif - /* NOTE: the IV/counter CTR mode is big-endian. The code itself * is endian-neutral. */ @@ -182,3 +172,81 @@ void CRYPTO_ctr128_encrypt(const unsigned char *in, unsigned char *out, *num=n; } + +/* increment upper 96 bits of 128-bit counter by 1 */ +static void ctr96_inc(unsigned char *counter) { + u32 n=12; + u8 c; + + do { + --n; + c = counter[n]; + ++c; + counter[n] = c; + if (c) return; + } while (n); +} + +void CRYPTO_ctr128_encrypt_ctr32(const unsigned char *in, unsigned char *out, + size_t len, const void *key, + unsigned char ivec[16], unsigned char ecount_buf[16], + unsigned int *num, ctr128_f func) +{ + unsigned int n,ctr32; + + assert(in && out && key && ecount_buf && num); + assert(*num < 16); + + n = *num; + + while (n && len) { + *(out++) = *(in++) ^ ecount_buf[n]; + --len; + n = (n+1) % 16; + } + + ctr32 = GETU32(ivec+12); + while (len>=16) { + size_t blocks = len/16; + /* + * 1<<28 is just a not-so-small yet not-so-large number... + * Below condition is practically never met, but it has to + * be checked for code correctness. + */ + if (sizeof(size_t)>sizeof(unsigned int) && blocks>(1U<<28)) + blocks = (1U<<28); + /* + * As (*func) operates on 32-bit counter, caller + * has to handle overflow. 'if' below detects the + * overflow, which is then handled by limiting the + * amount of blocks to the exact overflow point... + */ + ctr32 += (u32)blocks; + if (ctr32 < blocks) { + blocks -= ctr32; + ctr32 = 0; + } + (*func)(in,out,blocks,key,ivec); + /* (*ctr) does not update ivec, caller does: */ + PUTU32(ivec+12,ctr32); + /* ... overflow was detected, propogate carry. */ + if (ctr32 == 0) ctr96_inc(ivec); + blocks *= 16; + len -= blocks; + out += blocks; + in += blocks; + } + if (len) { + memset(ecount_buf,0,16); + (*func)(ecount_buf,ecount_buf,1,key,ivec); + ++ctr32; + PUTU32(ivec+12,ctr32); + if (ctr32 == 0) ctr96_inc(ivec); + while (len--) { + out[n] = in[n] ^ ecount_buf[n]; + ++n; + } + } + + *num=n; +} diff --git a/crypto/openssl/crypto/modes/cts128.c b/crypto/openssl/crypto/modes/cts128.c index e0430f9fdc..c0e1f3696c 100644 --- a/crypto/openssl/crypto/modes/cts128.c +++ b/crypto/openssl/crypto/modes/cts128.c @@ -5,7 +5,8 @@ * forms are granted according to the OpenSSL license. */ -#include "modes.h" +#include +#include "modes_lcl.h" #include #ifndef MODES_DEBUG @@ -23,8 +24,9 @@ * deviates from mentioned RFCs. Most notably it allows input to be * of block length and it doesn't flip the order of the last two * blocks. CTS is being discussed even in ECB context, but it's not - * adopted for any known application. This implementation complies - * with mentioned RFCs and [as such] extends CBC mode. + * adopted for any known application. This implementation provides + * two interfaces: one compliant with above mentioned RFCs and one + * compliant with the NIST proposal, both extending CBC mode. */ size_t CRYPTO_cts128_encrypt_block(const unsigned char *in, unsigned char *out, @@ -54,6 +56,34 @@ size_t CRYPTO_cts128_encrypt_block(const unsigned char *in, unsigned char *out, return len+residue; } +size_t CRYPTO_nistcts128_encrypt_block(const unsigned char *in, unsigned char *out, + size_t len, const void *key, + unsigned char ivec[16], block128_f block) +{ size_t residue, n; + + assert (in && out && key && ivec); + + if (len < 16) return 0; + + residue=len%16; + + len -= residue; + + CRYPTO_cbc128_encrypt(in,out,len,key,ivec,block); + + if (residue==0) return len; + + in += len; + out += len; + + for (n=0; n +#include "modes_lcl.h" +#include + +#ifndef MODES_DEBUG +# ifndef NDEBUG +# define NDEBUG +# endif +#endif +#include + +#if defined(BSWAP4) && defined(STRICT_ALIGNMENT) +/* redefine, because alignment is ensured */ +#undef GETU32 +#define GETU32(p) BSWAP4(*(const u32 *)(p)) +#undef PUTU32 +#define PUTU32(p,v) *(u32 *)(p) = BSWAP4(v) +#endif + +#define PACK(s) ((size_t)(s)<<(sizeof(size_t)*8-16)) +#define REDUCE1BIT(V) do { \ + if (sizeof(size_t)==8) { \ + u64 T = U64(0xe100000000000000) & (0-(V.lo&1)); \ + V.lo = (V.hi<<63)|(V.lo>>1); \ + V.hi = (V.hi>>1 )^T; \ + } \ + else { \ + u32 T = 0xe1000000U & (0-(u32)(V.lo&1)); \ + V.lo = (V.hi<<63)|(V.lo>>1); \ + V.hi = (V.hi>>1 )^((u64)T<<32); \ + } \ +} while(0) + +/* + * Even though permitted values for TABLE_BITS are 8, 4 and 1, it should + * never be set to 8. 8 is effectively reserved for testing purposes. + * TABLE_BITS>1 are lookup-table-driven implementations referred to as + * "Shoup's" in GCM specification. In other words OpenSSL does not cover + * whole spectrum of possible table driven implementations. Why? In + * non-"Shoup's" case memory access pattern is segmented in such manner, + * that it's trivial to see that cache timing information can reveal + * fair portion of intermediate hash value. Given that ciphertext is + * always available to attacker, it's possible for him to attempt to + * deduce secret parameter H and if successful, tamper with messages + * [which is nothing but trivial in CTR mode]. In "Shoup's" case it's + * not as trivial, but there is no reason to believe that it's resistant + * to cache-timing attack. And the thing about "8-bit" implementation is + * that it consumes 16 (sixteen) times more memory, 4KB per individual + * key + 1KB shared. Well, on pros side it should be twice as fast as + * "4-bit" version. And for gcc-generated x86[_64] code, "8-bit" version + * was observed to run ~75% faster, closer to 100% for commercial + * compilers... Yet "4-bit" procedure is preferred, because it's + * believed to provide better security-performance balance and adequate + * all-round performance. "All-round" refers to things like: + * + * - shorter setup time effectively improves overall timing for + * handling short messages; + * - larger table allocation can become unbearable because of VM + * subsystem penalties (for example on Windows large enough free + * results in VM working set trimming, meaning that consequent + * malloc would immediately incur working set expansion); + * - larger table has larger cache footprint, which can affect + * performance of other code paths (not necessarily even from same + * thread in Hyper-Threading world); + * + * Value of 1 is not appropriate for performance reasons. + */ +#if TABLE_BITS==8 + +static void gcm_init_8bit(u128 Htable[256], u64 H[2]) +{ + int i, j; + u128 V; + + Htable[0].hi = 0; + Htable[0].lo = 0; + V.hi = H[0]; + V.lo = H[1]; + + for (Htable[128]=V, i=64; i>0; i>>=1) { + REDUCE1BIT(V); + Htable[i] = V; + } + + for (i=2; i<256; i<<=1) { + u128 *Hi = Htable+i, H0 = *Hi; + for (j=1; j>8); + Z.hi = (Z.hi>>8); + if (sizeof(size_t)==8) + Z.hi ^= rem_8bit[rem]; + else + Z.hi ^= (u64)rem_8bit[rem]<<32; + } + + if (is_endian.little) { +#ifdef BSWAP8 + Xi[0] = BSWAP8(Z.hi); + Xi[1] = BSWAP8(Z.lo); +#else + u8 *p = (u8 *)Xi; + u32 v; + v = (u32)(Z.hi>>32); PUTU32(p,v); + v = (u32)(Z.hi); PUTU32(p+4,v); + v = (u32)(Z.lo>>32); PUTU32(p+8,v); + v = (u32)(Z.lo); PUTU32(p+12,v); +#endif + } + else { + Xi[0] = Z.hi; + Xi[1] = Z.lo; + } +} +#define GCM_MUL(ctx,Xi) gcm_gmult_8bit(ctx->Xi.u,ctx->Htable) + +#elif TABLE_BITS==4 + +static void gcm_init_4bit(u128 Htable[16], u64 H[2]) +{ + u128 V; +#if defined(OPENSSL_SMALL_FOOTPRINT) + int i; +#endif + + Htable[0].hi = 0; + Htable[0].lo = 0; + V.hi = H[0]; + V.lo = H[1]; + +#if defined(OPENSSL_SMALL_FOOTPRINT) + for (Htable[8]=V, i=4; i>0; i>>=1) { + REDUCE1BIT(V); + Htable[i] = V; + } + + for (i=2; i<16; i<<=1) { + u128 *Hi = Htable+i; + int j; + for (V=*Hi, j=1; j>32; + Htable[j].lo = V.hi<<32|V.hi>>32; + } + } +#endif +} + +#ifndef GHASH_ASM +static const size_t rem_4bit[16] = { + PACK(0x0000), PACK(0x1C20), PACK(0x3840), PACK(0x2460), + PACK(0x7080), PACK(0x6CA0), PACK(0x48C0), PACK(0x54E0), + PACK(0xE100), PACK(0xFD20), PACK(0xD940), PACK(0xC560), + PACK(0x9180), PACK(0x8DA0), PACK(0xA9C0), PACK(0xB5E0) }; + +static void gcm_gmult_4bit(u64 Xi[2], const u128 Htable[16]) +{ + u128 Z; + int cnt = 15; + size_t rem, nlo, nhi; + const union { long one; char little; } is_endian = {1}; + + nlo = ((const u8 *)Xi)[15]; + nhi = nlo>>4; + nlo &= 0xf; + + Z.hi = Htable[nlo].hi; + Z.lo = Htable[nlo].lo; + + while (1) { + rem = (size_t)Z.lo&0xf; + Z.lo = (Z.hi<<60)|(Z.lo>>4); + Z.hi = (Z.hi>>4); + if (sizeof(size_t)==8) + Z.hi ^= rem_4bit[rem]; + else + Z.hi ^= (u64)rem_4bit[rem]<<32; + + Z.hi ^= Htable[nhi].hi; + Z.lo ^= Htable[nhi].lo; + + if (--cnt<0) break; + + nlo = ((const u8 *)Xi)[cnt]; + nhi = nlo>>4; + nlo &= 0xf; + + rem = (size_t)Z.lo&0xf; + Z.lo = (Z.hi<<60)|(Z.lo>>4); + Z.hi = (Z.hi>>4); + if (sizeof(size_t)==8) + Z.hi ^= rem_4bit[rem]; + else + Z.hi ^= (u64)rem_4bit[rem]<<32; + + Z.hi ^= Htable[nlo].hi; + Z.lo ^= Htable[nlo].lo; + } + + if (is_endian.little) { +#ifdef BSWAP8 + Xi[0] = BSWAP8(Z.hi); + Xi[1] = BSWAP8(Z.lo); +#else + u8 *p = (u8 *)Xi; + u32 v; + v = (u32)(Z.hi>>32); PUTU32(p,v); + v = (u32)(Z.hi); PUTU32(p+4,v); + v = (u32)(Z.lo>>32); PUTU32(p+8,v); + v = (u32)(Z.lo); PUTU32(p+12,v); +#endif + } + else { + Xi[0] = Z.hi; + Xi[1] = Z.lo; + } +} + +#if !defined(OPENSSL_SMALL_FOOTPRINT) +/* + * Streamed gcm_mult_4bit, see CRYPTO_gcm128_[en|de]crypt for + * details... Compiler-generated code doesn't seem to give any + * performance improvement, at least not on x86[_64]. It's here + * mostly as reference and a placeholder for possible future + * non-trivial optimization[s]... + */ +static void gcm_ghash_4bit(u64 Xi[2],const u128 Htable[16], + const u8 *inp,size_t len) +{ + u128 Z; + int cnt; + size_t rem, nlo, nhi; + const union { long one; char little; } is_endian = {1}; + +#if 1 + do { + cnt = 15; + nlo = ((const u8 *)Xi)[15]; + nlo ^= inp[15]; + nhi = nlo>>4; + nlo &= 0xf; + + Z.hi = Htable[nlo].hi; + Z.lo = Htable[nlo].lo; + + while (1) { + rem = (size_t)Z.lo&0xf; + Z.lo = (Z.hi<<60)|(Z.lo>>4); + Z.hi = (Z.hi>>4); + if (sizeof(size_t)==8) + Z.hi ^= rem_4bit[rem]; + else + Z.hi ^= (u64)rem_4bit[rem]<<32; + + Z.hi ^= Htable[nhi].hi; + Z.lo ^= Htable[nhi].lo; + + if (--cnt<0) break; + + nlo = ((const u8 *)Xi)[cnt]; + nlo ^= inp[cnt]; + nhi = nlo>>4; + nlo &= 0xf; + + rem = (size_t)Z.lo&0xf; + Z.lo = (Z.hi<<60)|(Z.lo>>4); + Z.hi = (Z.hi>>4); + if (sizeof(size_t)==8) + Z.hi ^= rem_4bit[rem]; + else + Z.hi ^= (u64)rem_4bit[rem]<<32; + + Z.hi ^= Htable[nlo].hi; + Z.lo ^= Htable[nlo].lo; + } +#else + /* + * Extra 256+16 bytes per-key plus 512 bytes shared tables + * [should] give ~50% improvement... One could have PACK()-ed + * the rem_8bit even here, but the priority is to minimize + * cache footprint... + */ + u128 Hshr4[16]; /* Htable shifted right by 4 bits */ + u8 Hshl4[16]; /* Htable shifted left by 4 bits */ + static const unsigned short rem_8bit[256] = { + 0x0000, 0x01C2, 0x0384, 0x0246, 0x0708, 0x06CA, 0x048C, 0x054E, + 0x0E10, 0x0FD2, 0x0D94, 0x0C56, 0x0918, 0x08DA, 0x0A9C, 0x0B5E, + 0x1C20, 0x1DE2, 0x1FA4, 0x1E66, 0x1B28, 0x1AEA, 0x18AC, 0x196E, + 0x1230, 0x13F2, 0x11B4, 0x1076, 0x1538, 0x14FA, 0x16BC, 0x177E, + 0x3840, 0x3982, 0x3BC4, 0x3A06, 0x3F48, 0x3E8A, 0x3CCC, 0x3D0E, + 0x3650, 0x3792, 0x35D4, 0x3416, 0x3158, 0x309A, 0x32DC, 0x331E, + 0x2460, 0x25A2, 0x27E4, 0x2626, 0x2368, 0x22AA, 0x20EC, 0x212E, + 0x2A70, 0x2BB2, 0x29F4, 0x2836, 0x2D78, 0x2CBA, 0x2EFC, 0x2F3E, + 0x7080, 0x7142, 0x7304, 0x72C6, 0x7788, 0x764A, 0x740C, 0x75CE, + 0x7E90, 0x7F52, 0x7D14, 0x7CD6, 0x7998, 0x785A, 0x7A1C, 0x7BDE, + 0x6CA0, 0x6D62, 0x6F24, 0x6EE6, 0x6BA8, 0x6A6A, 0x682C, 0x69EE, + 0x62B0, 0x6372, 0x6134, 0x60F6, 0x65B8, 0x647A, 0x663C, 0x67FE, + 0x48C0, 0x4902, 0x4B44, 0x4A86, 0x4FC8, 0x4E0A, 0x4C4C, 0x4D8E, + 0x46D0, 0x4712, 0x4554, 0x4496, 0x41D8, 0x401A, 0x425C, 0x439E, + 0x54E0, 0x5522, 0x5764, 0x56A6, 0x53E8, 0x522A, 0x506C, 0x51AE, + 0x5AF0, 0x5B32, 0x5974, 0x58B6, 0x5DF8, 0x5C3A, 0x5E7C, 0x5FBE, + 0xE100, 0xE0C2, 0xE284, 0xE346, 0xE608, 0xE7CA, 0xE58C, 0xE44E, + 0xEF10, 0xEED2, 0xEC94, 0xED56, 0xE818, 0xE9DA, 0xEB9C, 0xEA5E, + 0xFD20, 0xFCE2, 0xFEA4, 0xFF66, 0xFA28, 0xFBEA, 0xF9AC, 0xF86E, + 0xF330, 0xF2F2, 0xF0B4, 0xF176, 0xF438, 0xF5FA, 0xF7BC, 0xF67E, + 0xD940, 0xD882, 0xDAC4, 0xDB06, 0xDE48, 0xDF8A, 0xDDCC, 0xDC0E, + 0xD750, 0xD692, 0xD4D4, 0xD516, 0xD058, 0xD19A, 0xD3DC, 0xD21E, + 0xC560, 0xC4A2, 0xC6E4, 0xC726, 0xC268, 0xC3AA, 0xC1EC, 0xC02E, + 0xCB70, 0xCAB2, 0xC8F4, 0xC936, 0xCC78, 0xCDBA, 0xCFFC, 0xCE3E, + 0x9180, 0x9042, 0x9204, 0x93C6, 0x9688, 0x974A, 0x950C, 0x94CE, + 0x9F90, 0x9E52, 0x9C14, 0x9DD6, 0x9898, 0x995A, 0x9B1C, 0x9ADE, + 0x8DA0, 0x8C62, 0x8E24, 0x8FE6, 0x8AA8, 0x8B6A, 0x892C, 0x88EE, + 0x83B0, 0x8272, 0x8034, 0x81F6, 0x84B8, 0x857A, 0x873C, 0x86FE, + 0xA9C0, 0xA802, 0xAA44, 0xAB86, 0xAEC8, 0xAF0A, 0xAD4C, 0xAC8E, + 0xA7D0, 0xA612, 0xA454, 0xA596, 0xA0D8, 0xA11A, 0xA35C, 0xA29E, + 0xB5E0, 0xB422, 0xB664, 0xB7A6, 0xB2E8, 0xB32A, 0xB16C, 0xB0AE, + 0xBBF0, 0xBA32, 0xB874, 0xB9B6, 0xBCF8, 0xBD3A, 0xBF7C, 0xBEBE }; + /* + * This pre-processing phase slows down procedure by approximately + * same time as it makes each loop spin faster. In other words + * single block performance is approximately same as straightforward + * "4-bit" implementation, and then it goes only faster... + */ + for (cnt=0; cnt<16; ++cnt) { + Z.hi = Htable[cnt].hi; + Z.lo = Htable[cnt].lo; + Hshr4[cnt].lo = (Z.hi<<60)|(Z.lo>>4); + Hshr4[cnt].hi = (Z.hi>>4); + Hshl4[cnt] = (u8)(Z.lo<<4); + } + + do { + for (Z.lo=0, Z.hi=0, cnt=15; cnt; --cnt) { + nlo = ((const u8 *)Xi)[cnt]; + nlo ^= inp[cnt]; + nhi = nlo>>4; + nlo &= 0xf; + + Z.hi ^= Htable[nlo].hi; + Z.lo ^= Htable[nlo].lo; + + rem = (size_t)Z.lo&0xff; + + Z.lo = (Z.hi<<56)|(Z.lo>>8); + Z.hi = (Z.hi>>8); + + Z.hi ^= Hshr4[nhi].hi; + Z.lo ^= Hshr4[nhi].lo; + Z.hi ^= (u64)rem_8bit[rem^Hshl4[nhi]]<<48; + } + + nlo = ((const u8 *)Xi)[0]; + nlo ^= inp[0]; + nhi = nlo>>4; + nlo &= 0xf; + + Z.hi ^= Htable[nlo].hi; + Z.lo ^= Htable[nlo].lo; + + rem = (size_t)Z.lo&0xf; + + Z.lo = (Z.hi<<60)|(Z.lo>>4); + Z.hi = (Z.hi>>4); + + Z.hi ^= Htable[nhi].hi; + Z.lo ^= Htable[nhi].lo; + Z.hi ^= ((u64)rem_8bit[rem<<4])<<48; +#endif + + if (is_endian.little) { +#ifdef BSWAP8 + Xi[0] = BSWAP8(Z.hi); + Xi[1] = BSWAP8(Z.lo); +#else + u8 *p = (u8 *)Xi; + u32 v; + v = (u32)(Z.hi>>32); PUTU32(p,v); + v = (u32)(Z.hi); PUTU32(p+4,v); + v = (u32)(Z.lo>>32); PUTU32(p+8,v); + v = (u32)(Z.lo); PUTU32(p+12,v); +#endif + } + else { + Xi[0] = Z.hi; + Xi[1] = Z.lo; + } + } while (inp+=16, len-=16); +} +#endif +#else +void gcm_gmult_4bit(u64 Xi[2],const u128 Htable[16]); +void gcm_ghash_4bit(u64 Xi[2],const u128 Htable[16],const u8 *inp,size_t len); +#endif + +#define GCM_MUL(ctx,Xi) gcm_gmult_4bit(ctx->Xi.u,ctx->Htable) +#if defined(GHASH_ASM) || !defined(OPENSSL_SMALL_FOOTPRINT) +#define GHASH(ctx,in,len) gcm_ghash_4bit((ctx)->Xi.u,(ctx)->Htable,in,len) +/* GHASH_CHUNK is "stride parameter" missioned to mitigate cache + * trashing effect. In other words idea is to hash data while it's + * still in L1 cache after encryption pass... */ +#define GHASH_CHUNK (3*1024) +#endif + +#else /* TABLE_BITS */ + +static void gcm_gmult_1bit(u64 Xi[2],const u64 H[2]) +{ + u128 V,Z = { 0,0 }; + long X; + int i,j; + const long *xi = (const long *)Xi; + const union { long one; char little; } is_endian = {1}; + + V.hi = H[0]; /* H is in host byte order, no byte swapping */ + V.lo = H[1]; + + for (j=0; j<16/sizeof(long); ++j) { + if (is_endian.little) { + if (sizeof(long)==8) { +#ifdef BSWAP8 + X = (long)(BSWAP8(xi[j])); +#else + const u8 *p = (const u8 *)(xi+j); + X = (long)((u64)GETU32(p)<<32|GETU32(p+4)); +#endif + } + else { + const u8 *p = (const u8 *)(xi+j); + X = (long)GETU32(p); + } + } + else + X = xi[j]; + + for (i=0; i<8*sizeof(long); ++i, X<<=1) { + u64 M = (u64)(X>>(8*sizeof(long)-1)); + Z.hi ^= V.hi&M; + Z.lo ^= V.lo&M; + + REDUCE1BIT(V); + } + } + + if (is_endian.little) { +#ifdef BSWAP8 + Xi[0] = BSWAP8(Z.hi); + Xi[1] = BSWAP8(Z.lo); +#else + u8 *p = (u8 *)Xi; + u32 v; + v = (u32)(Z.hi>>32); PUTU32(p,v); + v = (u32)(Z.hi); PUTU32(p+4,v); + v = (u32)(Z.lo>>32); PUTU32(p+8,v); + v = (u32)(Z.lo); PUTU32(p+12,v); +#endif + } + else { + Xi[0] = Z.hi; + Xi[1] = Z.lo; + } +} +#define GCM_MUL(ctx,Xi) gcm_gmult_1bit(ctx->Xi.u,ctx->H.u) + +#endif + +#if TABLE_BITS==4 && defined(GHASH_ASM) +# if !defined(I386_ONLY) && \ + (defined(__i386) || defined(__i386__) || \ + defined(__x86_64) || defined(__x86_64__) || \ + defined(_M_IX86) || defined(_M_AMD64) || defined(_M_X64)) +# define GHASH_ASM_X86_OR_64 +# define GCM_FUNCREF_4BIT +extern unsigned int OPENSSL_ia32cap_P[2]; + +void gcm_init_clmul(u128 Htable[16],const u64 Xi[2]); +void gcm_gmult_clmul(u64 Xi[2],const u128 Htable[16]); +void gcm_ghash_clmul(u64 Xi[2],const u128 Htable[16],const u8 *inp,size_t len); + +# if defined(__i386) || defined(__i386__) || defined(_M_IX86) +# define GHASH_ASM_X86 +void gcm_gmult_4bit_mmx(u64 Xi[2],const u128 Htable[16]); +void gcm_ghash_4bit_mmx(u64 Xi[2],const u128 Htable[16],const u8 *inp,size_t len); + +void gcm_gmult_4bit_x86(u64 Xi[2],const u128 Htable[16]); +void gcm_ghash_4bit_x86(u64 Xi[2],const u128 Htable[16],const u8 *inp,size_t len); +# endif +# elif defined(__arm__) || defined(__arm) +# include "arm_arch.h" +# if __ARM_ARCH__>=7 +# define GHASH_ASM_ARM +# define GCM_FUNCREF_4BIT +void gcm_gmult_neon(u64 Xi[2],const u128 Htable[16]); +void gcm_ghash_neon(u64 Xi[2],const u128 Htable[16],const u8 *inp,size_t len); +# endif +# endif +#endif + +#ifdef GCM_FUNCREF_4BIT +# undef GCM_MUL +# define GCM_MUL(ctx,Xi) (*gcm_gmult_p)(ctx->Xi.u,ctx->Htable) +# ifdef GHASH +# undef GHASH +# define GHASH(ctx,in,len) (*gcm_ghash_p)(ctx->Xi.u,ctx->Htable,in,len) +# endif +#endif + +void CRYPTO_gcm128_init(GCM128_CONTEXT *ctx,void *key,block128_f block) +{ + const union { long one; char little; } is_endian = {1}; + + memset(ctx,0,sizeof(*ctx)); + ctx->block = block; + ctx->key = key; + + (*block)(ctx->H.c,ctx->H.c,key); + + if (is_endian.little) { + /* H is stored in host byte order */ +#ifdef BSWAP8 + ctx->H.u[0] = BSWAP8(ctx->H.u[0]); + ctx->H.u[1] = BSWAP8(ctx->H.u[1]); +#else + u8 *p = ctx->H.c; + u64 hi,lo; + hi = (u64)GETU32(p) <<32|GETU32(p+4); + lo = (u64)GETU32(p+8)<<32|GETU32(p+12); + ctx->H.u[0] = hi; + ctx->H.u[1] = lo; +#endif + } + +#if TABLE_BITS==8 + gcm_init_8bit(ctx->Htable,ctx->H.u); +#elif TABLE_BITS==4 +# if defined(GHASH_ASM_X86_OR_64) +# if !defined(GHASH_ASM_X86) || defined(OPENSSL_IA32_SSE2) + if (OPENSSL_ia32cap_P[0]&(1<<24) && /* check FXSR bit */ + OPENSSL_ia32cap_P[1]&(1<<1) ) { /* check PCLMULQDQ bit */ + gcm_init_clmul(ctx->Htable,ctx->H.u); + ctx->gmult = gcm_gmult_clmul; + ctx->ghash = gcm_ghash_clmul; + return; + } +# endif + gcm_init_4bit(ctx->Htable,ctx->H.u); +# if defined(GHASH_ASM_X86) /* x86 only */ +# if defined(OPENSSL_IA32_SSE2) + if (OPENSSL_ia32cap_P[0]&(1<<25)) { /* check SSE bit */ +# else + if (OPENSSL_ia32cap_P[0]&(1<<23)) { /* check MMX bit */ +# endif + ctx->gmult = gcm_gmult_4bit_mmx; + ctx->ghash = gcm_ghash_4bit_mmx; + } else { + ctx->gmult = gcm_gmult_4bit_x86; + ctx->ghash = gcm_ghash_4bit_x86; + } +# else + ctx->gmult = gcm_gmult_4bit; + ctx->ghash = gcm_ghash_4bit; +# endif +# elif defined(GHASH_ASM_ARM) + if (OPENSSL_armcap_P & ARMV7_NEON) { + ctx->gmult = gcm_gmult_neon; + ctx->ghash = gcm_ghash_neon; + } else { + gcm_init_4bit(ctx->Htable,ctx->H.u); + ctx->gmult = gcm_gmult_4bit; + ctx->ghash = gcm_ghash_4bit; + } +# else + gcm_init_4bit(ctx->Htable,ctx->H.u); +# endif +#endif +} + +void CRYPTO_gcm128_setiv(GCM128_CONTEXT *ctx,const unsigned char *iv,size_t len) +{ + const union { long one; char little; } is_endian = {1}; + unsigned int ctr; +#ifdef GCM_FUNCREF_4BIT + void (*gcm_gmult_p)(u64 Xi[2],const u128 Htable[16]) = ctx->gmult; +#endif + + ctx->Yi.u[0] = 0; + ctx->Yi.u[1] = 0; + ctx->Xi.u[0] = 0; + ctx->Xi.u[1] = 0; + ctx->len.u[0] = 0; /* AAD length */ + ctx->len.u[1] = 0; /* message length */ + ctx->ares = 0; + ctx->mres = 0; + + if (len==12) { + memcpy(ctx->Yi.c,iv,12); + ctx->Yi.c[15]=1; + ctr=1; + } + else { + size_t i; + u64 len0 = len; + + while (len>=16) { + for (i=0; i<16; ++i) ctx->Yi.c[i] ^= iv[i]; + GCM_MUL(ctx,Yi); + iv += 16; + len -= 16; + } + if (len) { + for (i=0; iYi.c[i] ^= iv[i]; + GCM_MUL(ctx,Yi); + } + len0 <<= 3; + if (is_endian.little) { +#ifdef BSWAP8 + ctx->Yi.u[1] ^= BSWAP8(len0); +#else + ctx->Yi.c[8] ^= (u8)(len0>>56); + ctx->Yi.c[9] ^= (u8)(len0>>48); + ctx->Yi.c[10] ^= (u8)(len0>>40); + ctx->Yi.c[11] ^= (u8)(len0>>32); + ctx->Yi.c[12] ^= (u8)(len0>>24); + ctx->Yi.c[13] ^= (u8)(len0>>16); + ctx->Yi.c[14] ^= (u8)(len0>>8); + ctx->Yi.c[15] ^= (u8)(len0); +#endif + } + else + ctx->Yi.u[1] ^= len0; + + GCM_MUL(ctx,Yi); + + if (is_endian.little) + ctr = GETU32(ctx->Yi.c+12); + else + ctr = ctx->Yi.d[3]; + } + + (*ctx->block)(ctx->Yi.c,ctx->EK0.c,ctx->key); + ++ctr; + if (is_endian.little) + PUTU32(ctx->Yi.c+12,ctr); + else + ctx->Yi.d[3] = ctr; +} + +int CRYPTO_gcm128_aad(GCM128_CONTEXT *ctx,const unsigned char *aad,size_t len) +{ + size_t i; + unsigned int n; + u64 alen = ctx->len.u[0]; +#ifdef GCM_FUNCREF_4BIT + void (*gcm_gmult_p)(u64 Xi[2],const u128 Htable[16]) = ctx->gmult; +# ifdef GHASH + void (*gcm_ghash_p)(u64 Xi[2],const u128 Htable[16], + const u8 *inp,size_t len) = ctx->ghash; +# endif +#endif + + if (ctx->len.u[1]) return -2; + + alen += len; + if (alen>(U64(1)<<61) || (sizeof(len)==8 && alenlen.u[0] = alen; + + n = ctx->ares; + if (n) { + while (n && len) { + ctx->Xi.c[n] ^= *(aad++); + --len; + n = (n+1)%16; + } + if (n==0) GCM_MUL(ctx,Xi); + else { + ctx->ares = n; + return 0; + } + } + +#ifdef GHASH + if ((i = (len&(size_t)-16))) { + GHASH(ctx,aad,i); + aad += i; + len -= i; + } +#else + while (len>=16) { + for (i=0; i<16; ++i) ctx->Xi.c[i] ^= aad[i]; + GCM_MUL(ctx,Xi); + aad += 16; + len -= 16; + } +#endif + if (len) { + n = (unsigned int)len; + for (i=0; iXi.c[i] ^= aad[i]; + } + + ctx->ares = n; + return 0; +} + +int CRYPTO_gcm128_encrypt(GCM128_CONTEXT *ctx, + const unsigned char *in, unsigned char *out, + size_t len) +{ + const union { long one; char little; } is_endian = {1}; + unsigned int n, ctr; + size_t i; + u64 mlen = ctx->len.u[1]; + block128_f block = ctx->block; + void *key = ctx->key; +#ifdef GCM_FUNCREF_4BIT + void (*gcm_gmult_p)(u64 Xi[2],const u128 Htable[16]) = ctx->gmult; +# ifdef GHASH + void (*gcm_ghash_p)(u64 Xi[2],const u128 Htable[16], + const u8 *inp,size_t len) = ctx->ghash; +# endif +#endif + +#if 0 + n = (unsigned int)mlen%16; /* alternative to ctx->mres */ +#endif + mlen += len; + if (mlen>((U64(1)<<36)-32) || (sizeof(len)==8 && mlenlen.u[1] = mlen; + + if (ctx->ares) { + /* First call to encrypt finalizes GHASH(AAD) */ + GCM_MUL(ctx,Xi); + ctx->ares = 0; + } + + if (is_endian.little) + ctr = GETU32(ctx->Yi.c+12); + else + ctr = ctx->Yi.d[3]; + + n = ctx->mres; +#if !defined(OPENSSL_SMALL_FOOTPRINT) + if (16%sizeof(size_t) == 0) do { /* always true actually */ + if (n) { + while (n && len) { + ctx->Xi.c[n] ^= *(out++) = *(in++)^ctx->EKi.c[n]; + --len; + n = (n+1)%16; + } + if (n==0) GCM_MUL(ctx,Xi); + else { + ctx->mres = n; + return 0; + } + } +#if defined(STRICT_ALIGNMENT) + if (((size_t)in|(size_t)out)%sizeof(size_t) != 0) + break; +#endif +#if defined(GHASH) && defined(GHASH_CHUNK) + while (len>=GHASH_CHUNK) { + size_t j=GHASH_CHUNK; + + while (j) { + (*block)(ctx->Yi.c,ctx->EKi.c,key); + ++ctr; + if (is_endian.little) + PUTU32(ctx->Yi.c+12,ctr); + else + ctx->Yi.d[3] = ctr; + for (i=0; i<16; i+=sizeof(size_t)) + *(size_t *)(out+i) = + *(size_t *)(in+i)^*(size_t *)(ctx->EKi.c+i); + out += 16; + in += 16; + j -= 16; + } + GHASH(ctx,out-GHASH_CHUNK,GHASH_CHUNK); + len -= GHASH_CHUNK; + } + if ((i = (len&(size_t)-16))) { + size_t j=i; + + while (len>=16) { + (*block)(ctx->Yi.c,ctx->EKi.c,key); + ++ctr; + if (is_endian.little) + PUTU32(ctx->Yi.c+12,ctr); + else + ctx->Yi.d[3] = ctr; + for (i=0; i<16; i+=sizeof(size_t)) + *(size_t *)(out+i) = + *(size_t *)(in+i)^*(size_t *)(ctx->EKi.c+i); + out += 16; + in += 16; + len -= 16; + } + GHASH(ctx,out-j,j); + } +#else + while (len>=16) { + (*block)(ctx->Yi.c,ctx->EKi.c,key); + ++ctr; + if (is_endian.little) + PUTU32(ctx->Yi.c+12,ctr); + else + ctx->Yi.d[3] = ctr; + for (i=0; i<16; i+=sizeof(size_t)) + *(size_t *)(ctx->Xi.c+i) ^= + *(size_t *)(out+i) = + *(size_t *)(in+i)^*(size_t *)(ctx->EKi.c+i); + GCM_MUL(ctx,Xi); + out += 16; + in += 16; + len -= 16; + } +#endif + if (len) { + (*block)(ctx->Yi.c,ctx->EKi.c,key); + ++ctr; + if (is_endian.little) + PUTU32(ctx->Yi.c+12,ctr); + else + ctx->Yi.d[3] = ctr; + while (len--) { + ctx->Xi.c[n] ^= out[n] = in[n]^ctx->EKi.c[n]; + ++n; + } + } + + ctx->mres = n; + return 0; + } while(0); +#endif + for (i=0;iYi.c,ctx->EKi.c,key); + ++ctr; + if (is_endian.little) + PUTU32(ctx->Yi.c+12,ctr); + else + ctx->Yi.d[3] = ctr; + } + ctx->Xi.c[n] ^= out[i] = in[i]^ctx->EKi.c[n]; + n = (n+1)%16; + if (n==0) + GCM_MUL(ctx,Xi); + } + + ctx->mres = n; + return 0; +} + +int CRYPTO_gcm128_decrypt(GCM128_CONTEXT *ctx, + const unsigned char *in, unsigned char *out, + size_t len) +{ + const union { long one; char little; } is_endian = {1}; + unsigned int n, ctr; + size_t i; + u64 mlen = ctx->len.u[1]; + block128_f block = ctx->block; + void *key = ctx->key; +#ifdef GCM_FUNCREF_4BIT + void (*gcm_gmult_p)(u64 Xi[2],const u128 Htable[16]) = ctx->gmult; +# ifdef GHASH + void (*gcm_ghash_p)(u64 Xi[2],const u128 Htable[16], + const u8 *inp,size_t len) = ctx->ghash; +# endif +#endif + + mlen += len; + if (mlen>((U64(1)<<36)-32) || (sizeof(len)==8 && mlenlen.u[1] = mlen; + + if (ctx->ares) { + /* First call to decrypt finalizes GHASH(AAD) */ + GCM_MUL(ctx,Xi); + ctx->ares = 0; + } + + if (is_endian.little) + ctr = GETU32(ctx->Yi.c+12); + else + ctr = ctx->Yi.d[3]; + + n = ctx->mres; +#if !defined(OPENSSL_SMALL_FOOTPRINT) + if (16%sizeof(size_t) == 0) do { /* always true actually */ + if (n) { + while (n && len) { + u8 c = *(in++); + *(out++) = c^ctx->EKi.c[n]; + ctx->Xi.c[n] ^= c; + --len; + n = (n+1)%16; + } + if (n==0) GCM_MUL (ctx,Xi); + else { + ctx->mres = n; + return 0; + } + } +#if defined(STRICT_ALIGNMENT) + if (((size_t)in|(size_t)out)%sizeof(size_t) != 0) + break; +#endif +#if defined(GHASH) && defined(GHASH_CHUNK) + while (len>=GHASH_CHUNK) { + size_t j=GHASH_CHUNK; + + GHASH(ctx,in,GHASH_CHUNK); + while (j) { + (*block)(ctx->Yi.c,ctx->EKi.c,key); + ++ctr; + if (is_endian.little) + PUTU32(ctx->Yi.c+12,ctr); + else + ctx->Yi.d[3] = ctr; + for (i=0; i<16; i+=sizeof(size_t)) + *(size_t *)(out+i) = + *(size_t *)(in+i)^*(size_t *)(ctx->EKi.c+i); + out += 16; + in += 16; + j -= 16; + } + len -= GHASH_CHUNK; + } + if ((i = (len&(size_t)-16))) { + GHASH(ctx,in,i); + while (len>=16) { + (*block)(ctx->Yi.c,ctx->EKi.c,key); + ++ctr; + if (is_endian.little) + PUTU32(ctx->Yi.c+12,ctr); + else + ctx->Yi.d[3] = ctr; + for (i=0; i<16; i+=sizeof(size_t)) + *(size_t *)(out+i) = + *(size_t *)(in+i)^*(size_t *)(ctx->EKi.c+i); + out += 16; + in += 16; + len -= 16; + } + } +#else + while (len>=16) { + (*block)(ctx->Yi.c,ctx->EKi.c,key); + ++ctr; + if (is_endian.little) + PUTU32(ctx->Yi.c+12,ctr); + else + ctx->Yi.d[3] = ctr; + for (i=0; i<16; i+=sizeof(size_t)) { + size_t c = *(size_t *)(in+i); + *(size_t *)(out+i) = c^*(size_t *)(ctx->EKi.c+i); + *(size_t *)(ctx->Xi.c+i) ^= c; + } + GCM_MUL(ctx,Xi); + out += 16; + in += 16; + len -= 16; + } +#endif + if (len) { + (*block)(ctx->Yi.c,ctx->EKi.c,key); + ++ctr; + if (is_endian.little) + PUTU32(ctx->Yi.c+12,ctr); + else + ctx->Yi.d[3] = ctr; + while (len--) { + u8 c = in[n]; + ctx->Xi.c[n] ^= c; + out[n] = c^ctx->EKi.c[n]; + ++n; + } + } + + ctx->mres = n; + return 0; + } while(0); +#endif + for (i=0;iYi.c,ctx->EKi.c,key); + ++ctr; + if (is_endian.little) + PUTU32(ctx->Yi.c+12,ctr); + else + ctx->Yi.d[3] = ctr; + } + c = in[i]; + out[i] = c^ctx->EKi.c[n]; + ctx->Xi.c[n] ^= c; + n = (n+1)%16; + if (n==0) + GCM_MUL(ctx,Xi); + } + + ctx->mres = n; + return 0; +} + +int CRYPTO_gcm128_encrypt_ctr32(GCM128_CONTEXT *ctx, + const unsigned char *in, unsigned char *out, + size_t len, ctr128_f stream) +{ + const union { long one; char little; } is_endian = {1}; + unsigned int n, ctr; + size_t i; + u64 mlen = ctx->len.u[1]; + void *key = ctx->key; +#ifdef GCM_FUNCREF_4BIT + void (*gcm_gmult_p)(u64 Xi[2],const u128 Htable[16]) = ctx->gmult; +# ifdef GHASH + void (*gcm_ghash_p)(u64 Xi[2],const u128 Htable[16], + const u8 *inp,size_t len) = ctx->ghash; +# endif +#endif + + mlen += len; + if (mlen>((U64(1)<<36)-32) || (sizeof(len)==8 && mlenlen.u[1] = mlen; + + if (ctx->ares) { + /* First call to encrypt finalizes GHASH(AAD) */ + GCM_MUL(ctx,Xi); + ctx->ares = 0; + } + + if (is_endian.little) + ctr = GETU32(ctx->Yi.c+12); + else + ctr = ctx->Yi.d[3]; + + n = ctx->mres; + if (n) { + while (n && len) { + ctx->Xi.c[n] ^= *(out++) = *(in++)^ctx->EKi.c[n]; + --len; + n = (n+1)%16; + } + if (n==0) GCM_MUL(ctx,Xi); + else { + ctx->mres = n; + return 0; + } + } +#if defined(GHASH) && !defined(OPENSSL_SMALL_FOOTPRINT) + while (len>=GHASH_CHUNK) { + (*stream)(in,out,GHASH_CHUNK/16,key,ctx->Yi.c); + ctr += GHASH_CHUNK/16; + if (is_endian.little) + PUTU32(ctx->Yi.c+12,ctr); + else + ctx->Yi.d[3] = ctr; + GHASH(ctx,out,GHASH_CHUNK); + out += GHASH_CHUNK; + in += GHASH_CHUNK; + len -= GHASH_CHUNK; + } +#endif + if ((i = (len&(size_t)-16))) { + size_t j=i/16; + + (*stream)(in,out,j,key,ctx->Yi.c); + ctr += (unsigned int)j; + if (is_endian.little) + PUTU32(ctx->Yi.c+12,ctr); + else + ctx->Yi.d[3] = ctr; + in += i; + len -= i; +#if defined(GHASH) + GHASH(ctx,out,i); + out += i; +#else + while (j--) { + for (i=0;i<16;++i) ctx->Xi.c[i] ^= out[i]; + GCM_MUL(ctx,Xi); + out += 16; + } +#endif + } + if (len) { + (*ctx->block)(ctx->Yi.c,ctx->EKi.c,key); + ++ctr; + if (is_endian.little) + PUTU32(ctx->Yi.c+12,ctr); + else + ctx->Yi.d[3] = ctr; + while (len--) { + ctx->Xi.c[n] ^= out[n] = in[n]^ctx->EKi.c[n]; + ++n; + } + } + + ctx->mres = n; + return 0; +} + +int CRYPTO_gcm128_decrypt_ctr32(GCM128_CONTEXT *ctx, + const unsigned char *in, unsigned char *out, + size_t len,ctr128_f stream) +{ + const union { long one; char little; } is_endian = {1}; + unsigned int n, ctr; + size_t i; + u64 mlen = ctx->len.u[1]; + void *key = ctx->key; +#ifdef GCM_FUNCREF_4BIT + void (*gcm_gmult_p)(u64 Xi[2],const u128 Htable[16]) = ctx->gmult; +# ifdef GHASH + void (*gcm_ghash_p)(u64 Xi[2],const u128 Htable[16], + const u8 *inp,size_t len) = ctx->ghash; +# endif +#endif + + mlen += len; + if (mlen>((U64(1)<<36)-32) || (sizeof(len)==8 && mlenlen.u[1] = mlen; + + if (ctx->ares) { + /* First call to decrypt finalizes GHASH(AAD) */ + GCM_MUL(ctx,Xi); + ctx->ares = 0; + } + + if (is_endian.little) + ctr = GETU32(ctx->Yi.c+12); + else + ctr = ctx->Yi.d[3]; + + n = ctx->mres; + if (n) { + while (n && len) { + u8 c = *(in++); + *(out++) = c^ctx->EKi.c[n]; + ctx->Xi.c[n] ^= c; + --len; + n = (n+1)%16; + } + if (n==0) GCM_MUL (ctx,Xi); + else { + ctx->mres = n; + return 0; + } + } +#if defined(GHASH) && !defined(OPENSSL_SMALL_FOOTPRINT) + while (len>=GHASH_CHUNK) { + GHASH(ctx,in,GHASH_CHUNK); + (*stream)(in,out,GHASH_CHUNK/16,key,ctx->Yi.c); + ctr += GHASH_CHUNK/16; + if (is_endian.little) + PUTU32(ctx->Yi.c+12,ctr); + else + ctx->Yi.d[3] = ctr; + out += GHASH_CHUNK; + in += GHASH_CHUNK; + len -= GHASH_CHUNK; + } +#endif + if ((i = (len&(size_t)-16))) { + size_t j=i/16; + +#if defined(GHASH) + GHASH(ctx,in,i); +#else + while (j--) { + size_t k; + for (k=0;k<16;++k) ctx->Xi.c[k] ^= in[k]; + GCM_MUL(ctx,Xi); + in += 16; + } + j = i/16; + in -= i; +#endif + (*stream)(in,out,j,key,ctx->Yi.c); + ctr += (unsigned int)j; + if (is_endian.little) + PUTU32(ctx->Yi.c+12,ctr); + else + ctx->Yi.d[3] = ctr; + out += i; + in += i; + len -= i; + } + if (len) { + (*ctx->block)(ctx->Yi.c,ctx->EKi.c,key); + ++ctr; + if (is_endian.little) + PUTU32(ctx->Yi.c+12,ctr); + else + ctx->Yi.d[3] = ctr; + while (len--) { + u8 c = in[n]; + ctx->Xi.c[n] ^= c; + out[n] = c^ctx->EKi.c[n]; + ++n; + } + } + + ctx->mres = n; + return 0; +} + +int CRYPTO_gcm128_finish(GCM128_CONTEXT *ctx,const unsigned char *tag, + size_t len) +{ + const union { long one; char little; } is_endian = {1}; + u64 alen = ctx->len.u[0]<<3; + u64 clen = ctx->len.u[1]<<3; +#ifdef GCM_FUNCREF_4BIT + void (*gcm_gmult_p)(u64 Xi[2],const u128 Htable[16]) = ctx->gmult; +#endif + + if (ctx->mres) + GCM_MUL(ctx,Xi); + + if (is_endian.little) { +#ifdef BSWAP8 + alen = BSWAP8(alen); + clen = BSWAP8(clen); +#else + u8 *p = ctx->len.c; + + ctx->len.u[0] = alen; + ctx->len.u[1] = clen; + + alen = (u64)GETU32(p) <<32|GETU32(p+4); + clen = (u64)GETU32(p+8)<<32|GETU32(p+12); +#endif + } + + ctx->Xi.u[0] ^= alen; + ctx->Xi.u[1] ^= clen; + GCM_MUL(ctx,Xi); + + ctx->Xi.u[0] ^= ctx->EK0.u[0]; + ctx->Xi.u[1] ^= ctx->EK0.u[1]; + + if (tag && len<=sizeof(ctx->Xi)) + return memcmp(ctx->Xi.c,tag,len); + else + return -1; +} + +void CRYPTO_gcm128_tag(GCM128_CONTEXT *ctx, unsigned char *tag, size_t len) +{ + CRYPTO_gcm128_finish(ctx, NULL, 0); + memcpy(tag, ctx->Xi.c, len<=sizeof(ctx->Xi.c)?len:sizeof(ctx->Xi.c)); +} + +GCM128_CONTEXT *CRYPTO_gcm128_new(void *key, block128_f block) +{ + GCM128_CONTEXT *ret; + + if ((ret = (GCM128_CONTEXT *)OPENSSL_malloc(sizeof(GCM128_CONTEXT)))) + CRYPTO_gcm128_init(ret,key,block); + + return ret; +} + +void CRYPTO_gcm128_release(GCM128_CONTEXT *ctx) +{ + if (ctx) { + OPENSSL_cleanse(ctx,sizeof(*ctx)); + OPENSSL_free(ctx); + } +} + +#if defined(SELFTEST) +#include +#include + +/* Test Case 1 */ +static const u8 K1[16], + *P1=NULL, + *A1=NULL, + IV1[12], + *C1=NULL, + T1[]= {0x58,0xe2,0xfc,0xce,0xfa,0x7e,0x30,0x61,0x36,0x7f,0x1d,0x57,0xa4,0xe7,0x45,0x5a}; + +/* Test Case 2 */ +#define K2 K1 +#define A2 A1 +#define IV2 IV1 +static const u8 P2[16], + C2[]= {0x03,0x88,0xda,0xce,0x60,0xb6,0xa3,0x92,0xf3,0x28,0xc2,0xb9,0x71,0xb2,0xfe,0x78}, + T2[]= {0xab,0x6e,0x47,0xd4,0x2c,0xec,0x13,0xbd,0xf5,0x3a,0x67,0xb2,0x12,0x57,0xbd,0xdf}; + +/* Test Case 3 */ +#define A3 A2 +static const u8 K3[]= {0xfe,0xff,0xe9,0x92,0x86,0x65,0x73,0x1c,0x6d,0x6a,0x8f,0x94,0x67,0x30,0x83,0x08}, + P3[]= {0xd9,0x31,0x32,0x25,0xf8,0x84,0x06,0xe5,0xa5,0x59,0x09,0xc5,0xaf,0xf5,0x26,0x9a, + 0x86,0xa7,0xa9,0x53,0x15,0x34,0xf7,0xda,0x2e,0x4c,0x30,0x3d,0x8a,0x31,0x8a,0x72, + 0x1c,0x3c,0x0c,0x95,0x95,0x68,0x09,0x53,0x2f,0xcf,0x0e,0x24,0x49,0xa6,0xb5,0x25, + 0xb1,0x6a,0xed,0xf5,0xaa,0x0d,0xe6,0x57,0xba,0x63,0x7b,0x39,0x1a,0xaf,0xd2,0x55}, + IV3[]= {0xca,0xfe,0xba,0xbe,0xfa,0xce,0xdb,0xad,0xde,0xca,0xf8,0x88}, + C3[]= {0x42,0x83,0x1e,0xc2,0x21,0x77,0x74,0x24,0x4b,0x72,0x21,0xb7,0x84,0xd0,0xd4,0x9c, + 0xe3,0xaa,0x21,0x2f,0x2c,0x02,0xa4,0xe0,0x35,0xc1,0x7e,0x23,0x29,0xac,0xa1,0x2e, + 0x21,0xd5,0x14,0xb2,0x54,0x66,0x93,0x1c,0x7d,0x8f,0x6a,0x5a,0xac,0x84,0xaa,0x05, + 0x1b,0xa3,0x0b,0x39,0x6a,0x0a,0xac,0x97,0x3d,0x58,0xe0,0x91,0x47,0x3f,0x59,0x85}, + T3[]= {0x4d,0x5c,0x2a,0xf3,0x27,0xcd,0x64,0xa6,0x2c,0xf3,0x5a,0xbd,0x2b,0xa6,0xfa,0xb4}; + +/* Test Case 4 */ +#define K4 K3 +#define IV4 IV3 +static const u8 P4[]= {0xd9,0x31,0x32,0x25,0xf8,0x84,0x06,0xe5,0xa5,0x59,0x09,0xc5,0xaf,0xf5,0x26,0x9a, + 0x86,0xa7,0xa9,0x53,0x15,0x34,0xf7,0xda,0x2e,0x4c,0x30,0x3d,0x8a,0x31,0x8a,0x72, + 0x1c,0x3c,0x0c,0x95,0x95,0x68,0x09,0x53,0x2f,0xcf,0x0e,0x24,0x49,0xa6,0xb5,0x25, + 0xb1,0x6a,0xed,0xf5,0xaa,0x0d,0xe6,0x57,0xba,0x63,0x7b,0x39}, + A4[]= {0xfe,0xed,0xfa,0xce,0xde,0xad,0xbe,0xef,0xfe,0xed,0xfa,0xce,0xde,0xad,0xbe,0xef, + 0xab,0xad,0xda,0xd2}, + C4[]= {0x42,0x83,0x1e,0xc2,0x21,0x77,0x74,0x24,0x4b,0x72,0x21,0xb7,0x84,0xd0,0xd4,0x9c, + 0xe3,0xaa,0x21,0x2f,0x2c,0x02,0xa4,0xe0,0x35,0xc1,0x7e,0x23,0x29,0xac,0xa1,0x2e, + 0x21,0xd5,0x14,0xb2,0x54,0x66,0x93,0x1c,0x7d,0x8f,0x6a,0x5a,0xac,0x84,0xaa,0x05, + 0x1b,0xa3,0x0b,0x39,0x6a,0x0a,0xac,0x97,0x3d,0x58,0xe0,0x91}, + T4[]= {0x5b,0xc9,0x4f,0xbc,0x32,0x21,0xa5,0xdb,0x94,0xfa,0xe9,0x5a,0xe7,0x12,0x1a,0x47}; + +/* Test Case 5 */ +#define K5 K4 +#define P5 P4 +#define A5 A4 +static const u8 IV5[]= {0xca,0xfe,0xba,0xbe,0xfa,0xce,0xdb,0xad}, + C5[]= {0x61,0x35,0x3b,0x4c,0x28,0x06,0x93,0x4a,0x77,0x7f,0xf5,0x1f,0xa2,0x2a,0x47,0x55, + 0x69,0x9b,0x2a,0x71,0x4f,0xcd,0xc6,0xf8,0x37,0x66,0xe5,0xf9,0x7b,0x6c,0x74,0x23, + 0x73,0x80,0x69,0x00,0xe4,0x9f,0x24,0xb2,0x2b,0x09,0x75,0x44,0xd4,0x89,0x6b,0x42, + 0x49,0x89,0xb5,0xe1,0xeb,0xac,0x0f,0x07,0xc2,0x3f,0x45,0x98}, + T5[]= {0x36,0x12,0xd2,0xe7,0x9e,0x3b,0x07,0x85,0x56,0x1b,0xe1,0x4a,0xac,0xa2,0xfc,0xcb}; + +/* Test Case 6 */ +#define K6 K5 +#define P6 P5 +#define A6 A5 +static const u8 IV6[]= {0x93,0x13,0x22,0x5d,0xf8,0x84,0x06,0xe5,0x55,0x90,0x9c,0x5a,0xff,0x52,0x69,0xaa, + 0x6a,0x7a,0x95,0x38,0x53,0x4f,0x7d,0xa1,0xe4,0xc3,0x03,0xd2,0xa3,0x18,0xa7,0x28, + 0xc3,0xc0,0xc9,0x51,0x56,0x80,0x95,0x39,0xfc,0xf0,0xe2,0x42,0x9a,0x6b,0x52,0x54, + 0x16,0xae,0xdb,0xf5,0xa0,0xde,0x6a,0x57,0xa6,0x37,0xb3,0x9b}, + C6[]= {0x8c,0xe2,0x49,0x98,0x62,0x56,0x15,0xb6,0x03,0xa0,0x33,0xac,0xa1,0x3f,0xb8,0x94, + 0xbe,0x91,0x12,0xa5,0xc3,0xa2,0x11,0xa8,0xba,0x26,0x2a,0x3c,0xca,0x7e,0x2c,0xa7, + 0x01,0xe4,0xa9,0xa4,0xfb,0xa4,0x3c,0x90,0xcc,0xdc,0xb2,0x81,0xd4,0x8c,0x7c,0x6f, + 0xd6,0x28,0x75,0xd2,0xac,0xa4,0x17,0x03,0x4c,0x34,0xae,0xe5}, + T6[]= {0x61,0x9c,0xc5,0xae,0xff,0xfe,0x0b,0xfa,0x46,0x2a,0xf4,0x3c,0x16,0x99,0xd0,0x50}; + +/* Test Case 7 */ +static const u8 K7[24], + *P7=NULL, + *A7=NULL, + IV7[12], + *C7=NULL, + T7[]= {0xcd,0x33,0xb2,0x8a,0xc7,0x73,0xf7,0x4b,0xa0,0x0e,0xd1,0xf3,0x12,0x57,0x24,0x35}; + +/* Test Case 8 */ +#define K8 K7 +#define IV8 IV7 +#define A8 A7 +static const u8 P8[16], + C8[]= {0x98,0xe7,0x24,0x7c,0x07,0xf0,0xfe,0x41,0x1c,0x26,0x7e,0x43,0x84,0xb0,0xf6,0x00}, + T8[]= {0x2f,0xf5,0x8d,0x80,0x03,0x39,0x27,0xab,0x8e,0xf4,0xd4,0x58,0x75,0x14,0xf0,0xfb}; + +/* Test Case 9 */ +#define A9 A8 +static const u8 K9[]= {0xfe,0xff,0xe9,0x92,0x86,0x65,0x73,0x1c,0x6d,0x6a,0x8f,0x94,0x67,0x30,0x83,0x08, + 0xfe,0xff,0xe9,0x92,0x86,0x65,0x73,0x1c}, + P9[]= {0xd9,0x31,0x32,0x25,0xf8,0x84,0x06,0xe5,0xa5,0x59,0x09,0xc5,0xaf,0xf5,0x26,0x9a, + 0x86,0xa7,0xa9,0x53,0x15,0x34,0xf7,0xda,0x2e,0x4c,0x30,0x3d,0x8a,0x31,0x8a,0x72, + 0x1c,0x3c,0x0c,0x95,0x95,0x68,0x09,0x53,0x2f,0xcf,0x0e,0x24,0x49,0xa6,0xb5,0x25, + 0xb1,0x6a,0xed,0xf5,0xaa,0x0d,0xe6,0x57,0xba,0x63,0x7b,0x39,0x1a,0xaf,0xd2,0x55}, + IV9[]= {0xca,0xfe,0xba,0xbe,0xfa,0xce,0xdb,0xad,0xde,0xca,0xf8,0x88}, + C9[]= {0x39,0x80,0xca,0x0b,0x3c,0x00,0xe8,0x41,0xeb,0x06,0xfa,0xc4,0x87,0x2a,0x27,0x57, + 0x85,0x9e,0x1c,0xea,0xa6,0xef,0xd9,0x84,0x62,0x85,0x93,0xb4,0x0c,0xa1,0xe1,0x9c, + 0x7d,0x77,0x3d,0x00,0xc1,0x44,0xc5,0x25,0xac,0x61,0x9d,0x18,0xc8,0x4a,0x3f,0x47, + 0x18,0xe2,0x44,0x8b,0x2f,0xe3,0x24,0xd9,0xcc,0xda,0x27,0x10,0xac,0xad,0xe2,0x56}, + T9[]= {0x99,0x24,0xa7,0xc8,0x58,0x73,0x36,0xbf,0xb1,0x18,0x02,0x4d,0xb8,0x67,0x4a,0x14}; + +/* Test Case 10 */ +#define K10 K9 +#define IV10 IV9 +static const u8 P10[]= {0xd9,0x31,0x32,0x25,0xf8,0x84,0x06,0xe5,0xa5,0x59,0x09,0xc5,0xaf,0xf5,0x26,0x9a, + 0x86,0xa7,0xa9,0x53,0x15,0x34,0xf7,0xda,0x2e,0x4c,0x30,0x3d,0x8a,0x31,0x8a,0x72, + 0x1c,0x3c,0x0c,0x95,0x95,0x68,0x09,0x53,0x2f,0xcf,0x0e,0x24,0x49,0xa6,0xb5,0x25, + 0xb1,0x6a,0xed,0xf5,0xaa,0x0d,0xe6,0x57,0xba,0x63,0x7b,0x39}, + A10[]= {0xfe,0xed,0xfa,0xce,0xde,0xad,0xbe,0xef,0xfe,0xed,0xfa,0xce,0xde,0xad,0xbe,0xef, + 0xab,0xad,0xda,0xd2}, + C10[]= {0x39,0x80,0xca,0x0b,0x3c,0x00,0xe8,0x41,0xeb,0x06,0xfa,0xc4,0x87,0x2a,0x27,0x57, + 0x85,0x9e,0x1c,0xea,0xa6,0xef,0xd9,0x84,0x62,0x85,0x93,0xb4,0x0c,0xa1,0xe1,0x9c, + 0x7d,0x77,0x3d,0x00,0xc1,0x44,0xc5,0x25,0xac,0x61,0x9d,0x18,0xc8,0x4a,0x3f,0x47, + 0x18,0xe2,0x44,0x8b,0x2f,0xe3,0x24,0xd9,0xcc,0xda,0x27,0x10}, + T10[]= {0x25,0x19,0x49,0x8e,0x80,0xf1,0x47,0x8f,0x37,0xba,0x55,0xbd,0x6d,0x27,0x61,0x8c}; + +/* Test Case 11 */ +#define K11 K10 +#define P11 P10 +#define A11 A10 +static const u8 IV11[]={0xca,0xfe,0xba,0xbe,0xfa,0xce,0xdb,0xad}, + C11[]= {0x0f,0x10,0xf5,0x99,0xae,0x14,0xa1,0x54,0xed,0x24,0xb3,0x6e,0x25,0x32,0x4d,0xb8, + 0xc5,0x66,0x63,0x2e,0xf2,0xbb,0xb3,0x4f,0x83,0x47,0x28,0x0f,0xc4,0x50,0x70,0x57, + 0xfd,0xdc,0x29,0xdf,0x9a,0x47,0x1f,0x75,0xc6,0x65,0x41,0xd4,0xd4,0xda,0xd1,0xc9, + 0xe9,0x3a,0x19,0xa5,0x8e,0x8b,0x47,0x3f,0xa0,0xf0,0x62,0xf7}, + T11[]= {0x65,0xdc,0xc5,0x7f,0xcf,0x62,0x3a,0x24,0x09,0x4f,0xcc,0xa4,0x0d,0x35,0x33,0xf8}; + +/* Test Case 12 */ +#define K12 K11 +#define P12 P11 +#define A12 A11 +static const u8 IV12[]={0x93,0x13,0x22,0x5d,0xf8,0x84,0x06,0xe5,0x55,0x90,0x9c,0x5a,0xff,0x52,0x69,0xaa, + 0x6a,0x7a,0x95,0x38,0x53,0x4f,0x7d,0xa1,0xe4,0xc3,0x03,0xd2,0xa3,0x18,0xa7,0x28, + 0xc3,0xc0,0xc9,0x51,0x56,0x80,0x95,0x39,0xfc,0xf0,0xe2,0x42,0x9a,0x6b,0x52,0x54, + 0x16,0xae,0xdb,0xf5,0xa0,0xde,0x6a,0x57,0xa6,0x37,0xb3,0x9b}, + C12[]= {0xd2,0x7e,0x88,0x68,0x1c,0xe3,0x24,0x3c,0x48,0x30,0x16,0x5a,0x8f,0xdc,0xf9,0xff, + 0x1d,0xe9,0xa1,0xd8,0xe6,0xb4,0x47,0xef,0x6e,0xf7,0xb7,0x98,0x28,0x66,0x6e,0x45, + 0x81,0xe7,0x90,0x12,0xaf,0x34,0xdd,0xd9,0xe2,0xf0,0x37,0x58,0x9b,0x29,0x2d,0xb3, + 0xe6,0x7c,0x03,0x67,0x45,0xfa,0x22,0xe7,0xe9,0xb7,0x37,0x3b}, + T12[]= {0xdc,0xf5,0x66,0xff,0x29,0x1c,0x25,0xbb,0xb8,0x56,0x8f,0xc3,0xd3,0x76,0xa6,0xd9}; + +/* Test Case 13 */ +static const u8 K13[32], + *P13=NULL, + *A13=NULL, + IV13[12], + *C13=NULL, + T13[]={0x53,0x0f,0x8a,0xfb,0xc7,0x45,0x36,0xb9,0xa9,0x63,0xb4,0xf1,0xc4,0xcb,0x73,0x8b}; + +/* Test Case 14 */ +#define K14 K13 +#define A14 A13 +static const u8 P14[16], + IV14[12], + C14[]= {0xce,0xa7,0x40,0x3d,0x4d,0x60,0x6b,0x6e,0x07,0x4e,0xc5,0xd3,0xba,0xf3,0x9d,0x18}, + T14[]= {0xd0,0xd1,0xc8,0xa7,0x99,0x99,0x6b,0xf0,0x26,0x5b,0x98,0xb5,0xd4,0x8a,0xb9,0x19}; + +/* Test Case 15 */ +#define A15 A14 +static const u8 K15[]= {0xfe,0xff,0xe9,0x92,0x86,0x65,0x73,0x1c,0x6d,0x6a,0x8f,0x94,0x67,0x30,0x83,0x08, + 0xfe,0xff,0xe9,0x92,0x86,0x65,0x73,0x1c,0x6d,0x6a,0x8f,0x94,0x67,0x30,0x83,0x08}, + P15[]= {0xd9,0x31,0x32,0x25,0xf8,0x84,0x06,0xe5,0xa5,0x59,0x09,0xc5,0xaf,0xf5,0x26,0x9a, + 0x86,0xa7,0xa9,0x53,0x15,0x34,0xf7,0xda,0x2e,0x4c,0x30,0x3d,0x8a,0x31,0x8a,0x72, + 0x1c,0x3c,0x0c,0x95,0x95,0x68,0x09,0x53,0x2f,0xcf,0x0e,0x24,0x49,0xa6,0xb5,0x25, + 0xb1,0x6a,0xed,0xf5,0xaa,0x0d,0xe6,0x57,0xba,0x63,0x7b,0x39,0x1a,0xaf,0xd2,0x55}, + IV15[]={0xca,0xfe,0xba,0xbe,0xfa,0xce,0xdb,0xad,0xde,0xca,0xf8,0x88}, + C15[]= {0x52,0x2d,0xc1,0xf0,0x99,0x56,0x7d,0x07,0xf4,0x7f,0x37,0xa3,0x2a,0x84,0x42,0x7d, + 0x64,0x3a,0x8c,0xdc,0xbf,0xe5,0xc0,0xc9,0x75,0x98,0xa2,0xbd,0x25,0x55,0xd1,0xaa, + 0x8c,0xb0,0x8e,0x48,0x59,0x0d,0xbb,0x3d,0xa7,0xb0,0x8b,0x10,0x56,0x82,0x88,0x38, + 0xc5,0xf6,0x1e,0x63,0x93,0xba,0x7a,0x0a,0xbc,0xc9,0xf6,0x62,0x89,0x80,0x15,0xad}, + T15[]= {0xb0,0x94,0xda,0xc5,0xd9,0x34,0x71,0xbd,0xec,0x1a,0x50,0x22,0x70,0xe3,0xcc,0x6c}; + +/* Test Case 16 */ +#define K16 K15 +#define IV16 IV15 +static const u8 P16[]= {0xd9,0x31,0x32,0x25,0xf8,0x84,0x06,0xe5,0xa5,0x59,0x09,0xc5,0xaf,0xf5,0x26,0x9a, + 0x86,0xa7,0xa9,0x53,0x15,0x34,0xf7,0xda,0x2e,0x4c,0x30,0x3d,0x8a,0x31,0x8a,0x72, + 0x1c,0x3c,0x0c,0x95,0x95,0x68,0x09,0x53,0x2f,0xcf,0x0e,0x24,0x49,0xa6,0xb5,0x25, + 0xb1,0x6a,0xed,0xf5,0xaa,0x0d,0xe6,0x57,0xba,0x63,0x7b,0x39}, + A16[]= {0xfe,0xed,0xfa,0xce,0xde,0xad,0xbe,0xef,0xfe,0xed,0xfa,0xce,0xde,0xad,0xbe,0xef, + 0xab,0xad,0xda,0xd2}, + C16[]= {0x52,0x2d,0xc1,0xf0,0x99,0x56,0x7d,0x07,0xf4,0x7f,0x37,0xa3,0x2a,0x84,0x42,0x7d, + 0x64,0x3a,0x8c,0xdc,0xbf,0xe5,0xc0,0xc9,0x75,0x98,0xa2,0xbd,0x25,0x55,0xd1,0xaa, + 0x8c,0xb0,0x8e,0x48,0x59,0x0d,0xbb,0x3d,0xa7,0xb0,0x8b,0x10,0x56,0x82,0x88,0x38, + 0xc5,0xf6,0x1e,0x63,0x93,0xba,0x7a,0x0a,0xbc,0xc9,0xf6,0x62}, + T16[]= {0x76,0xfc,0x6e,0xce,0x0f,0x4e,0x17,0x68,0xcd,0xdf,0x88,0x53,0xbb,0x2d,0x55,0x1b}; + +/* Test Case 17 */ +#define K17 K16 +#define P17 P16 +#define A17 A16 +static const u8 IV17[]={0xca,0xfe,0xba,0xbe,0xfa,0xce,0xdb,0xad}, + C17[]= {0xc3,0x76,0x2d,0xf1,0xca,0x78,0x7d,0x32,0xae,0x47,0xc1,0x3b,0xf1,0x98,0x44,0xcb, + 0xaf,0x1a,0xe1,0x4d,0x0b,0x97,0x6a,0xfa,0xc5,0x2f,0xf7,0xd7,0x9b,0xba,0x9d,0xe0, + 0xfe,0xb5,0x82,0xd3,0x39,0x34,0xa4,0xf0,0x95,0x4c,0xc2,0x36,0x3b,0xc7,0x3f,0x78, + 0x62,0xac,0x43,0x0e,0x64,0xab,0xe4,0x99,0xf4,0x7c,0x9b,0x1f}, + T17[]= {0x3a,0x33,0x7d,0xbf,0x46,0xa7,0x92,0xc4,0x5e,0x45,0x49,0x13,0xfe,0x2e,0xa8,0xf2}; + +/* Test Case 18 */ +#define K18 K17 +#define P18 P17 +#define A18 A17 +static const u8 IV18[]={0x93,0x13,0x22,0x5d,0xf8,0x84,0x06,0xe5,0x55,0x90,0x9c,0x5a,0xff,0x52,0x69,0xaa, + 0x6a,0x7a,0x95,0x38,0x53,0x4f,0x7d,0xa1,0xe4,0xc3,0x03,0xd2,0xa3,0x18,0xa7,0x28, + 0xc3,0xc0,0xc9,0x51,0x56,0x80,0x95,0x39,0xfc,0xf0,0xe2,0x42,0x9a,0x6b,0x52,0x54, + 0x16,0xae,0xdb,0xf5,0xa0,0xde,0x6a,0x57,0xa6,0x37,0xb3,0x9b}, + C18[]= {0x5a,0x8d,0xef,0x2f,0x0c,0x9e,0x53,0xf1,0xf7,0x5d,0x78,0x53,0x65,0x9e,0x2a,0x20, + 0xee,0xb2,0xb2,0x2a,0xaf,0xde,0x64,0x19,0xa0,0x58,0xab,0x4f,0x6f,0x74,0x6b,0xf4, + 0x0f,0xc0,0xc3,0xb7,0x80,0xf2,0x44,0x45,0x2d,0xa3,0xeb,0xf1,0xc5,0xd8,0x2c,0xde, + 0xa2,0x41,0x89,0x97,0x20,0x0e,0xf8,0x2e,0x44,0xae,0x7e,0x3f}, + T18[]= {0xa4,0x4a,0x82,0x66,0xee,0x1c,0x8e,0xb0,0xc8,0xb5,0xd4,0xcf,0x5a,0xe9,0xf1,0x9a}; + +#define TEST_CASE(n) do { \ + u8 out[sizeof(P##n)]; \ + AES_set_encrypt_key(K##n,sizeof(K##n)*8,&key); \ + CRYPTO_gcm128_init(&ctx,&key,(block128_f)AES_encrypt); \ + CRYPTO_gcm128_setiv(&ctx,IV##n,sizeof(IV##n)); \ + memset(out,0,sizeof(out)); \ + if (A##n) CRYPTO_gcm128_aad(&ctx,A##n,sizeof(A##n)); \ + if (P##n) CRYPTO_gcm128_encrypt(&ctx,P##n,out,sizeof(out)); \ + if (CRYPTO_gcm128_finish(&ctx,T##n,16) || \ + (C##n && memcmp(out,C##n,sizeof(out)))) \ + ret++, printf ("encrypt test#%d failed.\n",n); \ + CRYPTO_gcm128_setiv(&ctx,IV##n,sizeof(IV##n)); \ + memset(out,0,sizeof(out)); \ + if (A##n) CRYPTO_gcm128_aad(&ctx,A##n,sizeof(A##n)); \ + if (C##n) CRYPTO_gcm128_decrypt(&ctx,C##n,out,sizeof(out)); \ + if (CRYPTO_gcm128_finish(&ctx,T##n,16) || \ + (P##n && memcmp(out,P##n,sizeof(out)))) \ + ret++, printf ("decrypt test#%d failed.\n",n); \ + } while(0) + +int main() +{ + GCM128_CONTEXT ctx; + AES_KEY key; + int ret=0; + + TEST_CASE(1); + TEST_CASE(2); + TEST_CASE(3); + TEST_CASE(4); + TEST_CASE(5); + TEST_CASE(6); + TEST_CASE(7); + TEST_CASE(8); + TEST_CASE(9); + TEST_CASE(10); + TEST_CASE(11); + TEST_CASE(12); + TEST_CASE(13); + TEST_CASE(14); + TEST_CASE(15); + TEST_CASE(16); + TEST_CASE(17); + TEST_CASE(18); + +#ifdef OPENSSL_CPUID_OBJ + { + size_t start,stop,gcm_t,ctr_t,OPENSSL_rdtsc(); + union { u64 u; u8 c[1024]; } buf; + int i; + + AES_set_encrypt_key(K1,sizeof(K1)*8,&key); + CRYPTO_gcm128_init(&ctx,&key,(block128_f)AES_encrypt); + CRYPTO_gcm128_setiv(&ctx,IV1,sizeof(IV1)); + + CRYPTO_gcm128_encrypt(&ctx,buf.c,buf.c,sizeof(buf)); + start = OPENSSL_rdtsc(); + CRYPTO_gcm128_encrypt(&ctx,buf.c,buf.c,sizeof(buf)); + gcm_t = OPENSSL_rdtsc() - start; + + CRYPTO_ctr128_encrypt(buf.c,buf.c,sizeof(buf), + &key,ctx.Yi.c,ctx.EKi.c,&ctx.mres, + (block128_f)AES_encrypt); + start = OPENSSL_rdtsc(); + CRYPTO_ctr128_encrypt(buf.c,buf.c,sizeof(buf), + &key,ctx.Yi.c,ctx.EKi.c,&ctx.mres, + (block128_f)AES_encrypt); + ctr_t = OPENSSL_rdtsc() - start; + + printf("%.2f-%.2f=%.2f\n", + gcm_t/(double)sizeof(buf), + ctr_t/(double)sizeof(buf), + (gcm_t-ctr_t)/(double)sizeof(buf)); +#ifdef GHASH + GHASH(&ctx,buf.c,sizeof(buf)); + start = OPENSSL_rdtsc(); + for (i=0;i<100;++i) GHASH(&ctx,buf.c,sizeof(buf)); + gcm_t = OPENSSL_rdtsc() - start; + printf("%.2f\n",gcm_t/(double)sizeof(buf)/(double)i); +#endif + } +#endif + + return ret; +} +#endif diff --git a/crypto/openssl/crypto/modes/modes.h b/crypto/openssl/crypto/modes/modes.h index af8d97d795..f18215bb2b 100644 --- a/crypto/openssl/crypto/modes/modes.h +++ b/crypto/openssl/crypto/modes/modes.h @@ -15,6 +15,14 @@ typedef void (*cbc128_f)(const unsigned char *in, unsigned char *out, size_t len, const void *key, unsigned char ivec[16], int enc); +typedef void (*ctr128_f)(const unsigned char *in, unsigned char *out, + size_t blocks, const void *key, + const unsigned char ivec[16]); + +typedef void (*ccm128_f)(const unsigned char *in, unsigned char *out, + size_t blocks, const void *key, + const unsigned char ivec[16],unsigned char cmac[16]); + void CRYPTO_cbc128_encrypt(const unsigned char *in, unsigned char *out, size_t len, const void *key, unsigned char ivec[16], block128_f block); @@ -27,6 +35,11 @@ void CRYPTO_ctr128_encrypt(const unsigned char *in, unsigned char *out, unsigned char ivec[16], unsigned char ecount_buf[16], unsigned int *num, block128_f block); +void CRYPTO_ctr128_encrypt_ctr32(const unsigned char *in, unsigned char *out, + size_t len, const void *key, + unsigned char ivec[16], unsigned char ecount_buf[16], + unsigned int *num, ctr128_f ctr); + void CRYPTO_ofb128_encrypt(const unsigned char *in, unsigned char *out, size_t len, const void *key, unsigned char ivec[16], int *num, @@ -57,3 +70,66 @@ size_t CRYPTO_cts128_decrypt_block(const unsigned char *in, unsigned char *out, size_t CRYPTO_cts128_decrypt(const unsigned char *in, unsigned char *out, size_t len, const void *key, unsigned char ivec[16], cbc128_f cbc); + +size_t CRYPTO_nistcts128_encrypt_block(const unsigned char *in, unsigned char *out, + size_t len, const void *key, + unsigned char ivec[16], block128_f block); +size_t CRYPTO_nistcts128_encrypt(const unsigned char *in, unsigned char *out, + size_t len, const void *key, + unsigned char ivec[16], cbc128_f cbc); +size_t CRYPTO_nistcts128_decrypt_block(const unsigned char *in, unsigned char *out, + size_t len, const void *key, + unsigned char ivec[16], block128_f block); +size_t CRYPTO_nistcts128_decrypt(const unsigned char *in, unsigned char *out, + size_t len, const void *key, + unsigned char ivec[16], cbc128_f cbc); + +typedef struct gcm128_context GCM128_CONTEXT; + +GCM128_CONTEXT *CRYPTO_gcm128_new(void *key, block128_f block); +void CRYPTO_gcm128_init(GCM128_CONTEXT *ctx,void *key,block128_f block); +void CRYPTO_gcm128_setiv(GCM128_CONTEXT *ctx, const unsigned char *iv, + size_t len); +int CRYPTO_gcm128_aad(GCM128_CONTEXT *ctx, const unsigned char *aad, + size_t len); +int CRYPTO_gcm128_encrypt(GCM128_CONTEXT *ctx, + const unsigned char *in, unsigned char *out, + size_t len); +int CRYPTO_gcm128_decrypt(GCM128_CONTEXT *ctx, + const unsigned char *in, unsigned char *out, + size_t len); +int CRYPTO_gcm128_encrypt_ctr32(GCM128_CONTEXT *ctx, + const unsigned char *in, unsigned char *out, + size_t len, ctr128_f stream); +int CRYPTO_gcm128_decrypt_ctr32(GCM128_CONTEXT *ctx, + const unsigned char *in, unsigned char *out, + size_t len, ctr128_f stream); +int CRYPTO_gcm128_finish(GCM128_CONTEXT *ctx,const unsigned char *tag, + size_t len); +void CRYPTO_gcm128_tag(GCM128_CONTEXT *ctx, unsigned char *tag, size_t len); +void CRYPTO_gcm128_release(GCM128_CONTEXT *ctx); + +typedef struct ccm128_context CCM128_CONTEXT; + +void CRYPTO_ccm128_init(CCM128_CONTEXT *ctx, + unsigned int M, unsigned int L, void *key,block128_f block); +int CRYPTO_ccm128_setiv(CCM128_CONTEXT *ctx, + const unsigned char *nonce, size_t nlen, size_t mlen); +void CRYPTO_ccm128_aad(CCM128_CONTEXT *ctx, + const unsigned char *aad, size_t alen); +int CRYPTO_ccm128_encrypt(CCM128_CONTEXT *ctx, + const unsigned char *inp, unsigned char *out, size_t len); +int CRYPTO_ccm128_decrypt(CCM128_CONTEXT *ctx, + const unsigned char *inp, unsigned char *out, size_t len); +int CRYPTO_ccm128_encrypt_ccm64(CCM128_CONTEXT *ctx, + const unsigned char *inp, unsigned char *out, size_t len, + ccm128_f stream); +int CRYPTO_ccm128_decrypt_ccm64(CCM128_CONTEXT *ctx, + const unsigned char *inp, unsigned char *out, size_t len, + ccm128_f stream); +size_t CRYPTO_ccm128_tag(CCM128_CONTEXT *ctx, unsigned char *tag, size_t len); + +typedef struct xts128_context XTS128_CONTEXT; + +int CRYPTO_xts128_encrypt(const XTS128_CONTEXT *ctx, const unsigned char iv[16], + const unsigned char *inp, unsigned char *out, size_t len, int enc); diff --git a/crypto/openssl/crypto/modes/modes_lcl.h b/crypto/openssl/crypto/modes/modes_lcl.h new file mode 100644 index 0000000000..7a82a981ca --- /dev/null +++ b/crypto/openssl/crypto/modes/modes_lcl.h @@ -0,0 +1,131 @@ +/* ==================================================================== + * Copyright (c) 2010 The OpenSSL Project. All rights reserved. + * + * Redistribution and use is governed by OpenSSL license. + * ==================================================================== + */ + +#include + + +#if (defined(_WIN32) || defined(_WIN64)) && !defined(__MINGW32__) +typedef __int64 i64; +typedef unsigned __int64 u64; +#define U64(C) C##UI64 +#elif defined(__arch64__) +typedef long i64; +typedef unsigned long u64; +#define U64(C) C##UL +#else +typedef long long i64; +typedef unsigned long long u64; +#define U64(C) C##ULL +#endif + +typedef unsigned int u32; +typedef unsigned char u8; + +#define STRICT_ALIGNMENT 1 +#if defined(__i386) || defined(__i386__) || \ + defined(__x86_64) || defined(__x86_64__) || \ + defined(_M_IX86) || defined(_M_AMD64) || defined(_M_X64) || \ + defined(__s390__) || defined(__s390x__) || \ + ( (defined(__arm__) || defined(__arm)) && \ + (defined(__ARM_ARCH_7__) || defined(__ARM_ARCH_7A__) || \ + defined(__ARM_ARCH_7R__) || defined(__ARM_ARCH_7M__)) ) +# undef STRICT_ALIGNMENT +#endif + +#if !defined(PEDANTIC) && !defined(OPENSSL_NO_ASM) && !defined(OPENSSL_NO_INLINE_ASM) +#if defined(__GNUC__) && __GNUC__>=2 +# if defined(__x86_64) || defined(__x86_64__) +# define BSWAP8(x) ({ u64 ret=(x); \ + asm ("bswapq %0" \ + : "+r"(ret)); ret; }) +# define BSWAP4(x) ({ u32 ret=(x); \ + asm ("bswapl %0" \ + : "+r"(ret)); ret; }) +# elif (defined(__i386) || defined(__i386__)) +# define BSWAP8(x) ({ u32 lo=(u64)(x)>>32,hi=(x); \ + asm ("bswapl %0; bswapl %1" \ + : "+r"(hi),"+r"(lo)); \ + (u64)hi<<32|lo; }) +# define BSWAP4(x) ({ u32 ret=(x); \ + asm ("bswapl %0" \ + : "+r"(ret)); ret; }) +# elif (defined(__arm__) || defined(__arm)) && !defined(STRICT_ALIGNMENT) +# define BSWAP8(x) ({ u32 lo=(u64)(x)>>32,hi=(x); \ + asm ("rev %0,%0; rev %1,%1" \ + : "+r"(hi),"+r"(lo)); \ + (u64)hi<<32|lo; }) +# define BSWAP4(x) ({ u32 ret; \ + asm ("rev %0,%1" \ + : "=r"(ret) : "r"((u32)(x))); \ + ret; }) +# endif +#elif defined(_MSC_VER) +# if _MSC_VER>=1300 +# pragma intrinsic(_byteswap_uint64,_byteswap_ulong) +# define BSWAP8(x) _byteswap_uint64((u64)(x)) +# define BSWAP4(x) _byteswap_ulong((u32)(x)) +# elif defined(_M_IX86) + __inline u32 _bswap4(u32 val) { + _asm mov eax,val + _asm bswap eax + } +# define BSWAP4(x) _bswap4(x) +# endif +#endif +#endif + +#if defined(BSWAP4) && !defined(STRICT_ALIGNMENT) +#define GETU32(p) BSWAP4(*(const u32 *)(p)) +#define PUTU32(p,v) *(u32 *)(p) = BSWAP4(v) +#else +#define GETU32(p) ((u32)(p)[0]<<24|(u32)(p)[1]<<16|(u32)(p)[2]<<8|(u32)(p)[3]) +#define PUTU32(p,v) ((p)[0]=(u8)((v)>>24),(p)[1]=(u8)((v)>>16),(p)[2]=(u8)((v)>>8),(p)[3]=(u8)(v)) +#endif + +/* GCM definitions */ + +typedef struct { u64 hi,lo; } u128; + +#ifdef TABLE_BITS +#undef TABLE_BITS +#endif +/* + * Even though permitted values for TABLE_BITS are 8, 4 and 1, it should + * never be set to 8 [or 1]. For further information see gcm128.c. + */ +#define TABLE_BITS 4 + +struct gcm128_context { + /* Following 6 names follow names in GCM specification */ + union { u64 u[2]; u32 d[4]; u8 c[16]; } Yi,EKi,EK0,len, + Xi,H; + /* Relative position of Xi, H and pre-computed Htable is used + * in some assembler modules, i.e. don't change the order! */ +#if TABLE_BITS==8 + u128 Htable[256]; +#else + u128 Htable[16]; + void (*gmult)(u64 Xi[2],const u128 Htable[16]); + void (*ghash)(u64 Xi[2],const u128 Htable[16],const u8 *inp,size_t len); +#endif + unsigned int mres, ares; + block128_f block; + void *key; +}; + +struct xts128_context { + void *key1, *key2; + block128_f block1,block2; +}; + +struct ccm128_context { + union { u64 u[2]; u8 c[16]; } nonce, cmac; + u64 blocks; + block128_f block; + void *key; +}; + diff --git a/crypto/openssl/crypto/modes/ofb128.c b/crypto/openssl/crypto/modes/ofb128.c index c732e2ec58..01c01702c4 100644 --- a/crypto/openssl/crypto/modes/ofb128.c +++ b/crypto/openssl/crypto/modes/ofb128.c @@ -48,7 +48,8 @@ * */ -#include "modes.h" +#include +#include "modes_lcl.h" #include #ifndef MODES_DEBUG @@ -58,14 +59,6 @@ #endif #include -#define STRICT_ALIGNMENT -#if defined(__i386) || defined(__i386__) || \ - defined(__x86_64) || defined(__x86_64__) || \ - defined(_M_IX86) || defined(_M_AMD64) || defined(_M_X64) || \ - defined(__s390__) || defined(__s390x__) -# undef STRICT_ALIGNMENT -#endif - /* The input and output encrypted as though 128bit ofb mode is being * used. The extra state information to record how much of the * 128bit block we have used is contained in *num; diff --git a/crypto/openssl/crypto/modes/xts128.c b/crypto/openssl/crypto/modes/xts128.c new file mode 100644 index 0000000000..9cf27a25e9 --- /dev/null +++ b/crypto/openssl/crypto/modes/xts128.c @@ -0,0 +1,187 @@ +/* ==================================================================== + * Copyright (c) 2011 The OpenSSL Project. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * + * 3. All advertising materials mentioning features or use of this + * software must display the following acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit. (http://www.openssl.org/)" + * + * 4. The names "OpenSSL Toolkit" and "OpenSSL Project" must not be used to + * endorse or promote products derived from this software without + * prior written permission. For written permission, please contact + * openssl-core@openssl.org. + * + * 5. Products derived from this software may not be called "OpenSSL" + * nor may "OpenSSL" appear in their names without prior written + * permission of the OpenSSL Project. + * + * 6. Redistributions of any form whatsoever must retain the following + * acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit (http://www.openssl.org/)" + * + * THIS SOFTWARE IS PROVIDED BY THE OpenSSL PROJECT ``AS IS'' AND ANY + * EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR + * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE OpenSSL PROJECT OR + * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED + * OF THE POSSIBILITY OF SUCH DAMAGE. + * ==================================================================== + */ + +#include +#include "modes_lcl.h" +#include + +#ifndef MODES_DEBUG +# ifndef NDEBUG +# define NDEBUG +# endif +#endif +#include + +int CRYPTO_xts128_encrypt(const XTS128_CONTEXT *ctx, const unsigned char iv[16], + const unsigned char *inp, unsigned char *out, + size_t len, int enc) +{ + const union { long one; char little; } is_endian = {1}; + union { u64 u[2]; u32 d[4]; u8 c[16]; } tweak, scratch; + unsigned int i; + + if (len<16) return -1; + + memcpy(tweak.c, iv, 16); + + (*ctx->block2)(tweak.c,tweak.c,ctx->key2); + + if (!enc && (len%16)) len-=16; + + while (len>=16) { +#if defined(STRICT_ALIGNMENT) + memcpy(scratch.c,inp,16); + scratch.u[0] ^= tweak.u[0]; + scratch.u[1] ^= tweak.u[1]; +#else + scratch.u[0] = ((u64*)inp)[0]^tweak.u[0]; + scratch.u[1] = ((u64*)inp)[1]^tweak.u[1]; +#endif + (*ctx->block1)(scratch.c,scratch.c,ctx->key1); +#if defined(STRICT_ALIGNMENT) + scratch.u[0] ^= tweak.u[0]; + scratch.u[1] ^= tweak.u[1]; + memcpy(out,scratch.c,16); +#else + ((u64*)out)[0] = scratch.u[0]^=tweak.u[0]; + ((u64*)out)[1] = scratch.u[1]^=tweak.u[1]; +#endif + inp += 16; + out += 16; + len -= 16; + + if (len==0) return 0; + + if (is_endian.little) { + unsigned int carry,res; + + res = 0x87&(((int)tweak.d[3])>>31); + carry = (unsigned int)(tweak.u[0]>>63); + tweak.u[0] = (tweak.u[0]<<1)^res; + tweak.u[1] = (tweak.u[1]<<1)|carry; + } + else { + size_t c; + + for (c=0,i=0;i<16;++i) { + /*+ substitutes for |, because c is 1 bit */ + c += ((size_t)tweak.c[i])<<1; + tweak.c[i] = (u8)c; + c = c>>8; + } + tweak.c[0] ^= (u8)(0x87&(0-c)); + } + } + if (enc) { + for (i=0;iblock1)(scratch.c,scratch.c,ctx->key1); + scratch.u[0] ^= tweak.u[0]; + scratch.u[1] ^= tweak.u[1]; + memcpy(out-16,scratch.c,16); + } + else { + union { u64 u[2]; u8 c[16]; } tweak1; + + if (is_endian.little) { + unsigned int carry,res; + + res = 0x87&(((int)tweak.d[3])>>31); + carry = (unsigned int)(tweak.u[0]>>63); + tweak1.u[0] = (tweak.u[0]<<1)^res; + tweak1.u[1] = (tweak.u[1]<<1)|carry; + } + else { + size_t c; + + for (c=0,i=0;i<16;++i) { + /*+ substitutes for |, because c is 1 bit */ + c += ((size_t)tweak.c[i])<<1; + tweak1.c[i] = (u8)c; + c = c>>8; + } + tweak1.c[0] ^= (u8)(0x87&(0-c)); + } +#if defined(STRICT_ALIGNMENT) + memcpy(scratch.c,inp,16); + scratch.u[0] ^= tweak1.u[0]; + scratch.u[1] ^= tweak1.u[1]; +#else + scratch.u[0] = ((u64*)inp)[0]^tweak1.u[0]; + scratch.u[1] = ((u64*)inp)[1]^tweak1.u[1]; +#endif + (*ctx->block1)(scratch.c,scratch.c,ctx->key1); + scratch.u[0] ^= tweak1.u[0]; + scratch.u[1] ^= tweak1.u[1]; + + for (i=0;iblock1)(scratch.c,scratch.c,ctx->key1); +#if defined(STRICT_ALIGNMENT) + scratch.u[0] ^= tweak.u[0]; + scratch.u[1] ^= tweak.u[1]; + memcpy (out,scratch.c,16); +#else + ((u64*)out)[0] = scratch.u[0]^tweak.u[0]; + ((u64*)out)[1] = scratch.u[1]^tweak.u[1]; +#endif + } + + return 0; +} diff --git a/crypto/openssl/crypto/dsa/dsa_locl.h b/crypto/openssl/crypto/o_fips.c similarity index 75% copy from crypto/openssl/crypto/dsa/dsa_locl.h copy to crypto/openssl/crypto/o_fips.c index 2b8cfee3db..6a82395750 100644 --- a/crypto/openssl/crypto/dsa/dsa_locl.h +++ b/crypto/openssl/crypto/o_fips.c @@ -1,5 +1,8 @@ +/* Written by Stephen henson (steve@openssl.org) for the OpenSSL + * project 2011. + */ /* ==================================================================== - * Copyright (c) 2007 The OpenSSL Project. All rights reserved. + * Copyright (c) 2011 The OpenSSL Project. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions @@ -52,8 +55,41 @@ * */ -#include +#include "cryptlib.h" +#ifdef OPENSSL_FIPS +#include +#include +#include +#endif + +int FIPS_mode(void) + { +#ifdef OPENSSL_FIPS + return FIPS_module_mode(); +#else + return 0; +#endif + } + +int FIPS_mode_set(int r) + { + OPENSSL_init(); +#ifdef OPENSSL_FIPS +#ifndef FIPS_AUTH_USER_PASS +#define FIPS_AUTH_USER_PASS "Default FIPS Crypto User Password" +#endif + if (!FIPS_module_mode_set(r, FIPS_AUTH_USER_PASS)) + return 0; + if (r) + RAND_set_rand_method(FIPS_rand_get_method()); + else + RAND_set_rand_method(NULL); + return 1; +#else + if (r == 0) + return 1; + CRYPTOerr(CRYPTO_F_FIPS_MODE_SET, CRYPTO_R_FIPS_MODE_NOT_SUPPORTED); + return 0; +#endif + } -int dsa_builtin_paramgen(DSA *ret, size_t bits, size_t qbits, - const EVP_MD *evpmd, const unsigned char *seed_in, size_t seed_len, - int *counter_ret, unsigned long *h_ret, BN_GENCB *cb); diff --git a/crypto/openssl/crypto/aes/aes_misc.c b/crypto/openssl/crypto/o_init.c similarity index 76% copy from crypto/openssl/crypto/aes/aes_misc.c copy to crypto/openssl/crypto/o_init.c index 4fead1b4c7..db4cdc443b 100644 --- a/crypto/openssl/crypto/aes/aes_misc.c +++ b/crypto/openssl/crypto/o_init.c @@ -1,6 +1,9 @@ -/* crypto/aes/aes_misc.c -*- mode:C; c-file-style: "eay" -*- */ +/* o_init.c */ +/* Written by Dr Stephen N Henson (steve@openssl.org) for the OpenSSL + * project. + */ /* ==================================================================== - * Copyright (c) 1998-2002 The OpenSSL Project. All rights reserved. + * Copyright (c) 2011 The OpenSSL Project. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions @@ -49,16 +52,31 @@ * */ -#include -#include -#include "aes_locl.h" +#include +#include +#ifdef OPENSSL_FIPS +#include +#include +#endif -const char AES_version[]="AES" OPENSSL_VERSION_PTEXT; +/* Perform any essential OpenSSL initialization operations. + * Currently only sets FIPS callbacks + */ -const char *AES_options(void) { -#ifdef FULL_UNROLL - return "aes(full)"; -#else - return "aes(partial)"; +void OPENSSL_init(void) + { + static int done = 0; + if (done) + return; + done = 1; +#ifdef OPENSSL_FIPS + FIPS_set_locking_callbacks(CRYPTO_lock, CRYPTO_add_lock); + FIPS_set_error_callbacks(ERR_put_error, ERR_add_error_vdata); + FIPS_set_malloc_callbacks(CRYPTO_malloc, CRYPTO_free); + RAND_init_fips(); #endif -} +#if 0 + fprintf(stderr, "Called OPENSSL_init\n"); +#endif + } + diff --git a/crypto/openssl/crypto/objects/obj_dat.h b/crypto/openssl/crypto/objects/obj_dat.h index 6449be6071..d404ad07c9 100644 --- a/crypto/openssl/crypto/objects/obj_dat.h +++ b/crypto/openssl/crypto/objects/obj_dat.h @@ -62,12 +62,12 @@ * [including the GNU Public Licence.] */ -#define NUM_NID 893 -#define NUM_SN 886 -#define NUM_LN 886 -#define NUM_OBJ 840 +#define NUM_NID 920 +#define NUM_SN 913 +#define NUM_LN 913 +#define NUM_OBJ 857 -static const unsigned char lvalues[5824]={ +static const unsigned char lvalues[5980]={ 0x00, /* [ 0] OBJ_undef */ 0x2A,0x86,0x48,0x86,0xF7,0x0D, /* [ 1] OBJ_rsadsi */ 0x2A,0x86,0x48,0x86,0xF7,0x0D,0x01, /* [ 7] OBJ_pkcs */ @@ -908,6 +908,23 @@ static const unsigned char lvalues[5824]={ 0x55,0x04,0x34, /* [5814] OBJ_supportedAlgorithms */ 0x55,0x04,0x35, /* [5817] OBJ_deltaRevocationList */ 0x55,0x04,0x36, /* [5820] OBJ_dmdName */ +0x2A,0x86,0x48,0x86,0xF7,0x0D,0x01,0x09,0x10,0x03,0x09,/* [5823] OBJ_id_alg_PWRI_KEK */ +0x60,0x86,0x48,0x01,0x65,0x03,0x04,0x01,0x06,/* [5834] OBJ_aes_128_gcm */ +0x60,0x86,0x48,0x01,0x65,0x03,0x04,0x01,0x07,/* [5843] OBJ_aes_128_ccm */ +0x60,0x86,0x48,0x01,0x65,0x03,0x04,0x01,0x08,/* [5852] OBJ_id_aes128_wrap_pad */ +0x60,0x86,0x48,0x01,0x65,0x03,0x04,0x01,0x1A,/* [5861] OBJ_aes_192_gcm */ +0x60,0x86,0x48,0x01,0x65,0x03,0x04,0x01,0x1B,/* [5870] OBJ_aes_192_ccm */ +0x60,0x86,0x48,0x01,0x65,0x03,0x04,0x01,0x1C,/* [5879] OBJ_id_aes192_wrap_pad */ +0x60,0x86,0x48,0x01,0x65,0x03,0x04,0x01,0x2E,/* [5888] OBJ_aes_256_gcm */ +0x60,0x86,0x48,0x01,0x65,0x03,0x04,0x01,0x2F,/* [5897] OBJ_aes_256_ccm */ +0x60,0x86,0x48,0x01,0x65,0x03,0x04,0x01,0x30,/* [5906] OBJ_id_aes256_wrap_pad */ +0x2A,0x83,0x08,0x8C,0x9A,0x4B,0x3D,0x01,0x01,0x03,0x02,/* [5915] OBJ_id_camellia128_wrap */ +0x2A,0x83,0x08,0x8C,0x9A,0x4B,0x3D,0x01,0x01,0x03,0x03,/* [5926] OBJ_id_camellia192_wrap */ +0x2A,0x83,0x08,0x8C,0x9A,0x4B,0x3D,0x01,0x01,0x03,0x04,/* [5937] OBJ_id_camellia256_wrap */ +0x55,0x1D,0x25,0x00, /* [5948] OBJ_anyExtendedKeyUsage */ +0x2A,0x86,0x48,0x86,0xF7,0x0D,0x01,0x01,0x08,/* [5952] OBJ_mgf1 */ +0x2A,0x86,0x48,0x86,0xF7,0x0D,0x01,0x01,0x0A,/* [5961] OBJ_rsassaPss */ +0x2A,0x86,0x48,0x86,0xF7,0x0D,0x01,0x01,0x07,/* [5970] OBJ_rsaesOaep */ }; static const ASN1_OBJECT nid_objs[NUM_NID]={ @@ -2351,28 +2368,74 @@ static const ASN1_OBJECT nid_objs[NUM_NID]={ {"deltaRevocationList","deltaRevocationList",NID_deltaRevocationList, 3,&(lvalues[5817]),0}, {"dmdName","dmdName",NID_dmdName,3,&(lvalues[5820]),0}, +{"id-alg-PWRI-KEK","id-alg-PWRI-KEK",NID_id_alg_PWRI_KEK,11, + &(lvalues[5823]),0}, +{"CMAC","cmac",NID_cmac,0,NULL,0}, +{"id-aes128-GCM","aes-128-gcm",NID_aes_128_gcm,9,&(lvalues[5834]),0}, +{"id-aes128-CCM","aes-128-ccm",NID_aes_128_ccm,9,&(lvalues[5843]),0}, +{"id-aes128-wrap-pad","id-aes128-wrap-pad",NID_id_aes128_wrap_pad,9, + &(lvalues[5852]),0}, +{"id-aes192-GCM","aes-192-gcm",NID_aes_192_gcm,9,&(lvalues[5861]),0}, +{"id-aes192-CCM","aes-192-ccm",NID_aes_192_ccm,9,&(lvalues[5870]),0}, +{"id-aes192-wrap-pad","id-aes192-wrap-pad",NID_id_aes192_wrap_pad,9, + &(lvalues[5879]),0}, +{"id-aes256-GCM","aes-256-gcm",NID_aes_256_gcm,9,&(lvalues[5888]),0}, +{"id-aes256-CCM","aes-256-ccm",NID_aes_256_ccm,9,&(lvalues[5897]),0}, +{"id-aes256-wrap-pad","id-aes256-wrap-pad",NID_id_aes256_wrap_pad,9, + &(lvalues[5906]),0}, +{"AES-128-CTR","aes-128-ctr",NID_aes_128_ctr,0,NULL,0}, +{"AES-192-CTR","aes-192-ctr",NID_aes_192_ctr,0,NULL,0}, +{"AES-256-CTR","aes-256-ctr",NID_aes_256_ctr,0,NULL,0}, +{"id-camellia128-wrap","id-camellia128-wrap",NID_id_camellia128_wrap, + 11,&(lvalues[5915]),0}, +{"id-camellia192-wrap","id-camellia192-wrap",NID_id_camellia192_wrap, + 11,&(lvalues[5926]),0}, +{"id-camellia256-wrap","id-camellia256-wrap",NID_id_camellia256_wrap, + 11,&(lvalues[5937]),0}, +{"anyExtendedKeyUsage","Any Extended Key Usage", + NID_anyExtendedKeyUsage,4,&(lvalues[5948]),0}, +{"MGF1","mgf1",NID_mgf1,9,&(lvalues[5952]),0}, +{"RSASSA-PSS","rsassaPss",NID_rsassaPss,9,&(lvalues[5961]),0}, +{"AES-128-XTS","aes-128-xts",NID_aes_128_xts,0,NULL,0}, +{"AES-256-XTS","aes-256-xts",NID_aes_256_xts,0,NULL,0}, +{"RC4-HMAC-MD5","rc4-hmac-md5",NID_rc4_hmac_md5,0,NULL,0}, +{"AES-128-CBC-HMAC-SHA1","aes-128-cbc-hmac-sha1", + NID_aes_128_cbc_hmac_sha1,0,NULL,0}, +{"AES-192-CBC-HMAC-SHA1","aes-192-cbc-hmac-sha1", + NID_aes_192_cbc_hmac_sha1,0,NULL,0}, +{"AES-256-CBC-HMAC-SHA1","aes-256-cbc-hmac-sha1", + NID_aes_256_cbc_hmac_sha1,0,NULL,0}, +{"RSAES-OAEP","rsaesOaep",NID_rsaesOaep,9,&(lvalues[5970]),0}, }; static const unsigned int sn_objs[NUM_SN]={ 364, /* "AD_DVCS" */ 419, /* "AES-128-CBC" */ +916, /* "AES-128-CBC-HMAC-SHA1" */ 421, /* "AES-128-CFB" */ 650, /* "AES-128-CFB1" */ 653, /* "AES-128-CFB8" */ +904, /* "AES-128-CTR" */ 418, /* "AES-128-ECB" */ 420, /* "AES-128-OFB" */ +913, /* "AES-128-XTS" */ 423, /* "AES-192-CBC" */ +917, /* "AES-192-CBC-HMAC-SHA1" */ 425, /* "AES-192-CFB" */ 651, /* "AES-192-CFB1" */ 654, /* "AES-192-CFB8" */ +905, /* "AES-192-CTR" */ 422, /* "AES-192-ECB" */ 424, /* "AES-192-OFB" */ 427, /* "AES-256-CBC" */ +918, /* "AES-256-CBC-HMAC-SHA1" */ 429, /* "AES-256-CFB" */ 652, /* "AES-256-CFB1" */ 655, /* "AES-256-CFB8" */ +906, /* "AES-256-CTR" */ 426, /* "AES-256-ECB" */ 428, /* "AES-256-OFB" */ +914, /* "AES-256-XTS" */ 91, /* "BF-CBC" */ 93, /* "BF-CFB" */ 92, /* "BF-ECB" */ @@ -2400,6 +2463,7 @@ static const unsigned int sn_objs[NUM_SN]={ 110, /* "CAST5-CFB" */ 109, /* "CAST5-ECB" */ 111, /* "CAST5-OFB" */ +894, /* "CMAC" */ 13, /* "CN" */ 141, /* "CRLReason" */ 417, /* "CSPName" */ @@ -2451,6 +2515,7 @@ static const unsigned int sn_objs[NUM_SN]={ 4, /* "MD5" */ 114, /* "MD5-SHA1" */ 95, /* "MDC2" */ +911, /* "MGF1" */ 388, /* "Mail" */ 393, /* "NULL" */ 404, /* "NULL" */ @@ -2487,6 +2552,7 @@ static const unsigned int sn_objs[NUM_SN]={ 40, /* "RC2-OFB" */ 5, /* "RC4" */ 97, /* "RC4-40" */ +915, /* "RC4-HMAC-MD5" */ 120, /* "RC5-CBC" */ 122, /* "RC5-CFB" */ 121, /* "RC5-ECB" */ @@ -2507,6 +2573,8 @@ static const unsigned int sn_objs[NUM_SN]={ 668, /* "RSA-SHA256" */ 669, /* "RSA-SHA384" */ 670, /* "RSA-SHA512" */ +919, /* "RSAES-OAEP" */ +912, /* "RSASSA-PSS" */ 777, /* "SEED-CBC" */ 779, /* "SEED-CFB" */ 776, /* "SEED-ECB" */ @@ -2540,6 +2608,7 @@ static const unsigned int sn_objs[NUM_SN]={ 363, /* "ad_timestamping" */ 376, /* "algorithm" */ 405, /* "ansi-X9-62" */ +910, /* "anyExtendedKeyUsage" */ 746, /* "anyPolicy" */ 370, /* "archiveCutoff" */ 484, /* "associatedDomain" */ @@ -2716,14 +2785,27 @@ static const unsigned int sn_objs[NUM_SN]={ 357, /* "id-aca-group" */ 358, /* "id-aca-role" */ 176, /* "id-ad" */ +896, /* "id-aes128-CCM" */ +895, /* "id-aes128-GCM" */ 788, /* "id-aes128-wrap" */ +897, /* "id-aes128-wrap-pad" */ +899, /* "id-aes192-CCM" */ +898, /* "id-aes192-GCM" */ 789, /* "id-aes192-wrap" */ +900, /* "id-aes192-wrap-pad" */ +902, /* "id-aes256-CCM" */ +901, /* "id-aes256-GCM" */ 790, /* "id-aes256-wrap" */ +903, /* "id-aes256-wrap-pad" */ 262, /* "id-alg" */ +893, /* "id-alg-PWRI-KEK" */ 323, /* "id-alg-des40" */ 326, /* "id-alg-dh-pop" */ 325, /* "id-alg-dh-sig-hmac-sha1" */ 324, /* "id-alg-noSignature" */ +907, /* "id-camellia128-wrap" */ +908, /* "id-camellia192-wrap" */ +909, /* "id-camellia256-wrap" */ 268, /* "id-cct" */ 361, /* "id-cct-PKIData" */ 362, /* "id-cct-PKIResponse" */ @@ -3246,6 +3328,7 @@ static const unsigned int ln_objs[NUM_LN]={ 363, /* "AD Time Stamping" */ 405, /* "ANSI X9.62" */ 368, /* "Acceptable OCSP Responses" */ +910, /* "Any Extended Key Usage" */ 664, /* "Any language" */ 177, /* "Authority Information Access" */ 365, /* "Basic OCSP Response" */ @@ -3386,23 +3469,37 @@ static const unsigned int ln_objs[NUM_LN]={ 364, /* "ad dvcs" */ 606, /* "additional verification" */ 419, /* "aes-128-cbc" */ +916, /* "aes-128-cbc-hmac-sha1" */ +896, /* "aes-128-ccm" */ 421, /* "aes-128-cfb" */ 650, /* "aes-128-cfb1" */ 653, /* "aes-128-cfb8" */ +904, /* "aes-128-ctr" */ 418, /* "aes-128-ecb" */ +895, /* "aes-128-gcm" */ 420, /* "aes-128-ofb" */ +913, /* "aes-128-xts" */ 423, /* "aes-192-cbc" */ +917, /* "aes-192-cbc-hmac-sha1" */ +899, /* "aes-192-ccm" */ 425, /* "aes-192-cfb" */ 651, /* "aes-192-cfb1" */ 654, /* "aes-192-cfb8" */ +905, /* "aes-192-ctr" */ 422, /* "aes-192-ecb" */ +898, /* "aes-192-gcm" */ 424, /* "aes-192-ofb" */ 427, /* "aes-256-cbc" */ +918, /* "aes-256-cbc-hmac-sha1" */ +902, /* "aes-256-ccm" */ 429, /* "aes-256-cfb" */ 652, /* "aes-256-cfb1" */ 655, /* "aes-256-cfb8" */ +906, /* "aes-256-ctr" */ 426, /* "aes-256-ecb" */ +901, /* "aes-256-gcm" */ 428, /* "aes-256-ofb" */ +914, /* "aes-256-xts" */ 376, /* "algorithm" */ 484, /* "associatedDomain" */ 485, /* "associatedName" */ @@ -3467,6 +3564,7 @@ static const unsigned int ln_objs[NUM_LN]={ 407, /* "characteristic-two-field" */ 395, /* "clearance" */ 633, /* "cleartext track 2" */ +894, /* "cmac" */ 13, /* "commonName" */ 513, /* "content types" */ 50, /* "contentType" */ @@ -3602,13 +3700,20 @@ static const unsigned int ln_objs[NUM_LN]={ 358, /* "id-aca-role" */ 176, /* "id-ad" */ 788, /* "id-aes128-wrap" */ +897, /* "id-aes128-wrap-pad" */ 789, /* "id-aes192-wrap" */ +900, /* "id-aes192-wrap-pad" */ 790, /* "id-aes256-wrap" */ +903, /* "id-aes256-wrap-pad" */ 262, /* "id-alg" */ +893, /* "id-alg-PWRI-KEK" */ 323, /* "id-alg-des40" */ 326, /* "id-alg-dh-pop" */ 325, /* "id-alg-dh-sig-hmac-sha1" */ 324, /* "id-alg-noSignature" */ +907, /* "id-camellia128-wrap" */ +908, /* "id-camellia192-wrap" */ +909, /* "id-camellia256-wrap" */ 268, /* "id-cct" */ 361, /* "id-cct-PKIData" */ 362, /* "id-cct-PKIResponse" */ @@ -3806,6 +3911,7 @@ static const unsigned int ln_objs[NUM_LN]={ 602, /* "merchant initiated auth" */ 514, /* "message extensions" */ 51, /* "messageDigest" */ +911, /* "mgf1" */ 506, /* "mime-mhs-bodies" */ 505, /* "mime-mhs-headings" */ 488, /* "mobileTelephoneNumber" */ @@ -3889,6 +3995,7 @@ static const unsigned int ln_objs[NUM_LN]={ 40, /* "rc2-ofb" */ 5, /* "rc4" */ 97, /* "rc4-40" */ +915, /* "rc4-hmac-md5" */ 120, /* "rc5-cbc" */ 122, /* "rc5-cfb" */ 121, /* "rc5-ecb" */ @@ -3905,6 +4012,8 @@ static const unsigned int ln_objs[NUM_LN]={ 6, /* "rsaEncryption" */ 644, /* "rsaOAEPEncryptionSET" */ 377, /* "rsaSignature" */ +919, /* "rsaesOaep" */ +912, /* "rsassaPss" */ 124, /* "run length compression" */ 482, /* "sOARecord" */ 155, /* "safeContentsBag" */ @@ -4254,6 +4363,7 @@ static const unsigned int obj_objs[NUM_OBJ]={ 96, /* OBJ_mdc2WithRSA 2 5 8 3 100 */ 95, /* OBJ_mdc2 2 5 8 3 101 */ 746, /* OBJ_any_policy 2 5 29 32 0 */ +910, /* OBJ_anyExtendedKeyUsage 2 5 29 37 0 */ 519, /* OBJ_setct_PANData 2 23 42 0 0 */ 520, /* OBJ_setct_PANToken 2 23 42 0 1 */ 521, /* OBJ_setct_PANOnly 2 23 42 0 2 */ @@ -4720,6 +4830,9 @@ static const unsigned int obj_objs[NUM_OBJ]={ 8, /* OBJ_md5WithRSAEncryption 1 2 840 113549 1 1 4 */ 65, /* OBJ_sha1WithRSAEncryption 1 2 840 113549 1 1 5 */ 644, /* OBJ_rsaOAEPEncryptionSET 1 2 840 113549 1 1 6 */ +919, /* OBJ_rsaesOaep 1 2 840 113549 1 1 7 */ +911, /* OBJ_mgf1 1 2 840 113549 1 1 8 */ +912, /* OBJ_rsassaPss 1 2 840 113549 1 1 10 */ 668, /* OBJ_sha256WithRSAEncryption 1 2 840 113549 1 1 11 */ 669, /* OBJ_sha384WithRSAEncryption 1 2 840 113549 1 1 12 */ 670, /* OBJ_sha512WithRSAEncryption 1 2 840 113549 1 1 13 */ @@ -4785,16 +4898,25 @@ static const unsigned int obj_objs[NUM_OBJ]={ 420, /* OBJ_aes_128_ofb128 2 16 840 1 101 3 4 1 3 */ 421, /* OBJ_aes_128_cfb128 2 16 840 1 101 3 4 1 4 */ 788, /* OBJ_id_aes128_wrap 2 16 840 1 101 3 4 1 5 */ +895, /* OBJ_aes_128_gcm 2 16 840 1 101 3 4 1 6 */ +896, /* OBJ_aes_128_ccm 2 16 840 1 101 3 4 1 7 */ +897, /* OBJ_id_aes128_wrap_pad 2 16 840 1 101 3 4 1 8 */ 422, /* OBJ_aes_192_ecb 2 16 840 1 101 3 4 1 21 */ 423, /* OBJ_aes_192_cbc 2 16 840 1 101 3 4 1 22 */ 424, /* OBJ_aes_192_ofb128 2 16 840 1 101 3 4 1 23 */ 425, /* OBJ_aes_192_cfb128 2 16 840 1 101 3 4 1 24 */ 789, /* OBJ_id_aes192_wrap 2 16 840 1 101 3 4 1 25 */ +898, /* OBJ_aes_192_gcm 2 16 840 1 101 3 4 1 26 */ +899, /* OBJ_aes_192_ccm 2 16 840 1 101 3 4 1 27 */ +900, /* OBJ_id_aes192_wrap_pad 2 16 840 1 101 3 4 1 28 */ 426, /* OBJ_aes_256_ecb 2 16 840 1 101 3 4 1 41 */ 427, /* OBJ_aes_256_cbc 2 16 840 1 101 3 4 1 42 */ 428, /* OBJ_aes_256_ofb128 2 16 840 1 101 3 4 1 43 */ 429, /* OBJ_aes_256_cfb128 2 16 840 1 101 3 4 1 44 */ 790, /* OBJ_id_aes256_wrap 2 16 840 1 101 3 4 1 45 */ +901, /* OBJ_aes_256_gcm 2 16 840 1 101 3 4 1 46 */ +902, /* OBJ_aes_256_ccm 2 16 840 1 101 3 4 1 47 */ +903, /* OBJ_id_aes256_wrap_pad 2 16 840 1 101 3 4 1 48 */ 672, /* OBJ_sha256 2 16 840 1 101 3 4 2 1 */ 673, /* OBJ_sha384 2 16 840 1 101 3 4 2 2 */ 674, /* OBJ_sha512 2 16 840 1 101 3 4 2 3 */ @@ -4901,6 +5023,9 @@ static const unsigned int obj_objs[NUM_OBJ]={ 751, /* OBJ_camellia_128_cbc 1 2 392 200011 61 1 1 1 2 */ 752, /* OBJ_camellia_192_cbc 1 2 392 200011 61 1 1 1 3 */ 753, /* OBJ_camellia_256_cbc 1 2 392 200011 61 1 1 1 4 */ +907, /* OBJ_id_camellia128_wrap 1 2 392 200011 61 1 1 3 2 */ +908, /* OBJ_id_camellia192_wrap 1 2 392 200011 61 1 1 3 3 */ +909, /* OBJ_id_camellia256_wrap 1 2 392 200011 61 1 1 3 4 */ 196, /* OBJ_id_smime_mod_cms 1 2 840 113549 1 9 16 0 1 */ 197, /* OBJ_id_smime_mod_ess 1 2 840 113549 1 9 16 0 2 */ 198, /* OBJ_id_smime_mod_oid 1 2 840 113549 1 9 16 0 3 */ @@ -4956,6 +5081,7 @@ static const unsigned int obj_objs[NUM_OBJ]={ 246, /* OBJ_id_smime_alg_CMS3DESwrap 1 2 840 113549 1 9 16 3 6 */ 247, /* OBJ_id_smime_alg_CMSRC2wrap 1 2 840 113549 1 9 16 3 7 */ 125, /* OBJ_zlib_compression 1 2 840 113549 1 9 16 3 8 */ +893, /* OBJ_id_alg_PWRI_KEK 1 2 840 113549 1 9 16 3 9 */ 248, /* OBJ_id_smime_cd_ldap 1 2 840 113549 1 9 16 4 1 */ 249, /* OBJ_id_smime_spq_ets_sqt_uri 1 2 840 113549 1 9 16 5 1 */ 250, /* OBJ_id_smime_spq_ets_sqt_unotice 1 2 840 113549 1 9 16 5 2 */ diff --git a/crypto/openssl/crypto/objects/obj_mac.h b/crypto/openssl/crypto/objects/obj_mac.h index 282f11a8a8..b5ea7cdab4 100644 --- a/crypto/openssl/crypto/objects/obj_mac.h +++ b/crypto/openssl/crypto/objects/obj_mac.h @@ -580,6 +580,21 @@ #define NID_sha1WithRSAEncryption 65 #define OBJ_sha1WithRSAEncryption OBJ_pkcs1,5L +#define SN_rsaesOaep "RSAES-OAEP" +#define LN_rsaesOaep "rsaesOaep" +#define NID_rsaesOaep 919 +#define OBJ_rsaesOaep OBJ_pkcs1,7L + +#define SN_mgf1 "MGF1" +#define LN_mgf1 "mgf1" +#define NID_mgf1 911 +#define OBJ_mgf1 OBJ_pkcs1,8L + +#define SN_rsassaPss "RSASSA-PSS" +#define LN_rsassaPss "rsassaPss" +#define NID_rsassaPss 912 +#define OBJ_rsassaPss OBJ_pkcs1,10L + #define SN_sha256WithRSAEncryption "RSA-SHA256" #define LN_sha256WithRSAEncryption "sha256WithRSAEncryption" #define NID_sha256WithRSAEncryption 668 @@ -981,6 +996,10 @@ #define NID_id_smime_alg_CMSRC2wrap 247 #define OBJ_id_smime_alg_CMSRC2wrap OBJ_id_smime_alg,7L +#define SN_id_alg_PWRI_KEK "id-alg-PWRI-KEK" +#define NID_id_alg_PWRI_KEK 893 +#define OBJ_id_alg_PWRI_KEK OBJ_id_smime_alg,9L + #define SN_id_smime_cd_ldap "id-smime-cd-ldap" #define NID_id_smime_cd_ldap 248 #define OBJ_id_smime_cd_ldap OBJ_id_smime_cd,1L @@ -2399,6 +2418,11 @@ #define NID_no_rev_avail 403 #define OBJ_no_rev_avail OBJ_id_ce,56L +#define SN_anyExtendedKeyUsage "anyExtendedKeyUsage" +#define LN_anyExtendedKeyUsage "Any Extended Key Usage" +#define NID_anyExtendedKeyUsage 910 +#define OBJ_anyExtendedKeyUsage OBJ_ext_key_usage,0L + #define SN_netscape "Netscape" #define LN_netscape "Netscape Communications Corp." #define NID_netscape 57 @@ -2586,6 +2610,24 @@ #define NID_aes_128_cfb128 421 #define OBJ_aes_128_cfb128 OBJ_aes,4L +#define SN_id_aes128_wrap "id-aes128-wrap" +#define NID_id_aes128_wrap 788 +#define OBJ_id_aes128_wrap OBJ_aes,5L + +#define SN_aes_128_gcm "id-aes128-GCM" +#define LN_aes_128_gcm "aes-128-gcm" +#define NID_aes_128_gcm 895 +#define OBJ_aes_128_gcm OBJ_aes,6L + +#define SN_aes_128_ccm "id-aes128-CCM" +#define LN_aes_128_ccm "aes-128-ccm" +#define NID_aes_128_ccm 896 +#define OBJ_aes_128_ccm OBJ_aes,7L + +#define SN_id_aes128_wrap_pad "id-aes128-wrap-pad" +#define NID_id_aes128_wrap_pad 897 +#define OBJ_id_aes128_wrap_pad OBJ_aes,8L + #define SN_aes_192_ecb "AES-192-ECB" #define LN_aes_192_ecb "aes-192-ecb" #define NID_aes_192_ecb 422 @@ -2606,6 +2648,24 @@ #define NID_aes_192_cfb128 425 #define OBJ_aes_192_cfb128 OBJ_aes,24L +#define SN_id_aes192_wrap "id-aes192-wrap" +#define NID_id_aes192_wrap 789 +#define OBJ_id_aes192_wrap OBJ_aes,25L + +#define SN_aes_192_gcm "id-aes192-GCM" +#define LN_aes_192_gcm "aes-192-gcm" +#define NID_aes_192_gcm 898 +#define OBJ_aes_192_gcm OBJ_aes,26L + +#define SN_aes_192_ccm "id-aes192-CCM" +#define LN_aes_192_ccm "aes-192-ccm" +#define NID_aes_192_ccm 899 +#define OBJ_aes_192_ccm OBJ_aes,27L + +#define SN_id_aes192_wrap_pad "id-aes192-wrap-pad" +#define NID_id_aes192_wrap_pad 900 +#define OBJ_id_aes192_wrap_pad OBJ_aes,28L + #define SN_aes_256_ecb "AES-256-ECB" #define LN_aes_256_ecb "aes-256-ecb" #define NID_aes_256_ecb 426 @@ -2626,6 +2686,24 @@ #define NID_aes_256_cfb128 429 #define OBJ_aes_256_cfb128 OBJ_aes,44L +#define SN_id_aes256_wrap "id-aes256-wrap" +#define NID_id_aes256_wrap 790 +#define OBJ_id_aes256_wrap OBJ_aes,45L + +#define SN_aes_256_gcm "id-aes256-GCM" +#define LN_aes_256_gcm "aes-256-gcm" +#define NID_aes_256_gcm 901 +#define OBJ_aes_256_gcm OBJ_aes,46L + +#define SN_aes_256_ccm "id-aes256-CCM" +#define LN_aes_256_ccm "aes-256-ccm" +#define NID_aes_256_ccm 902 +#define OBJ_aes_256_ccm OBJ_aes,47L + +#define SN_id_aes256_wrap_pad "id-aes256-wrap-pad" +#define NID_id_aes256_wrap_pad 903 +#define OBJ_id_aes256_wrap_pad OBJ_aes,48L + #define SN_aes_128_cfb1 "AES-128-CFB1" #define LN_aes_128_cfb1 "aes-128-cfb1" #define NID_aes_128_cfb1 650 @@ -2650,6 +2728,26 @@ #define LN_aes_256_cfb8 "aes-256-cfb8" #define NID_aes_256_cfb8 655 +#define SN_aes_128_ctr "AES-128-CTR" +#define LN_aes_128_ctr "aes-128-ctr" +#define NID_aes_128_ctr 904 + +#define SN_aes_192_ctr "AES-192-CTR" +#define LN_aes_192_ctr "aes-192-ctr" +#define NID_aes_192_ctr 905 + +#define SN_aes_256_ctr "AES-256-CTR" +#define LN_aes_256_ctr "aes-256-ctr" +#define NID_aes_256_ctr 906 + +#define SN_aes_128_xts "AES-128-XTS" +#define LN_aes_128_xts "aes-128-xts" +#define NID_aes_128_xts 913 + +#define SN_aes_256_xts "AES-256-XTS" +#define LN_aes_256_xts "aes-256-xts" +#define NID_aes_256_xts 914 + #define SN_des_cfb1 "DES-CFB1" #define LN_des_cfb1 "des-cfb1" #define NID_des_cfb1 656 @@ -2666,18 +2764,6 @@ #define LN_des_ede3_cfb8 "des-ede3-cfb8" #define NID_des_ede3_cfb8 659 -#define SN_id_aes128_wrap "id-aes128-wrap" -#define NID_id_aes128_wrap 788 -#define OBJ_id_aes128_wrap OBJ_aes,5L - -#define SN_id_aes192_wrap "id-aes192-wrap" -#define NID_id_aes192_wrap 789 -#define OBJ_id_aes192_wrap OBJ_aes,25L - -#define SN_id_aes256_wrap "id-aes256-wrap" -#define NID_id_aes256_wrap 790 -#define OBJ_id_aes256_wrap OBJ_aes,45L - #define OBJ_nist_hashalgs OBJ_nistAlgorithms,2L #define SN_sha256 "SHA256" @@ -3810,6 +3896,18 @@ #define NID_camellia_256_cbc 753 #define OBJ_camellia_256_cbc 1L,2L,392L,200011L,61L,1L,1L,1L,4L +#define SN_id_camellia128_wrap "id-camellia128-wrap" +#define NID_id_camellia128_wrap 907 +#define OBJ_id_camellia128_wrap 1L,2L,392L,200011L,61L,1L,1L,3L,2L + +#define SN_id_camellia192_wrap "id-camellia192-wrap" +#define NID_id_camellia192_wrap 908 +#define OBJ_id_camellia192_wrap 1L,2L,392L,200011L,61L,1L,1L,3L,3L + +#define SN_id_camellia256_wrap "id-camellia256-wrap" +#define NID_id_camellia256_wrap 909 +#define OBJ_id_camellia256_wrap 1L,2L,392L,200011L,61L,1L,1L,3L,4L + #define OBJ_ntt_ds 0L,3L,4401L,5L #define OBJ_camellia OBJ_ntt_ds,3L,1L,9L @@ -3912,3 +4010,23 @@ #define LN_hmac "hmac" #define NID_hmac 855 +#define SN_cmac "CMAC" +#define LN_cmac "cmac" +#define NID_cmac 894 + +#define SN_rc4_hmac_md5 "RC4-HMAC-MD5" +#define LN_rc4_hmac_md5 "rc4-hmac-md5" +#define NID_rc4_hmac_md5 915 + +#define SN_aes_128_cbc_hmac_sha1 "AES-128-CBC-HMAC-SHA1" +#define LN_aes_128_cbc_hmac_sha1 "aes-128-cbc-hmac-sha1" +#define NID_aes_128_cbc_hmac_sha1 916 + +#define SN_aes_192_cbc_hmac_sha1 "AES-192-CBC-HMAC-SHA1" +#define LN_aes_192_cbc_hmac_sha1 "aes-192-cbc-hmac-sha1" +#define NID_aes_192_cbc_hmac_sha1 917 + +#define SN_aes_256_cbc_hmac_sha1 "AES-256-CBC-HMAC-SHA1" +#define LN_aes_256_cbc_hmac_sha1 "aes-256-cbc-hmac-sha1" +#define NID_aes_256_cbc_hmac_sha1 918 + diff --git a/crypto/openssl/crypto/objects/obj_xref.c b/crypto/openssl/crypto/objects/obj_xref.c index 152eca5c67..9f744bcede 100644 --- a/crypto/openssl/crypto/objects/obj_xref.c +++ b/crypto/openssl/crypto/objects/obj_xref.c @@ -110,8 +110,10 @@ int OBJ_find_sigid_algs(int signid, int *pdig_nid, int *ppkey_nid) #endif if (rv == NULL) return 0; - *pdig_nid = rv->hash_id; - *ppkey_nid = rv->pkey_id; + if (pdig_nid) + *pdig_nid = rv->hash_id; + if (ppkey_nid) + *ppkey_nid = rv->pkey_id; return 1; } @@ -144,7 +146,8 @@ int OBJ_find_sigid_by_algs(int *psignid, int dig_nid, int pkey_nid) #endif if (rv == NULL) return 0; - *psignid = (*rv)->sign_id; + if (psignid) + *psignid = (*rv)->sign_id; return 1; } diff --git a/crypto/openssl/crypto/objects/obj_xref.h b/crypto/openssl/crypto/objects/obj_xref.h index d5b9b8e198..e23938c296 100644 --- a/crypto/openssl/crypto/objects/obj_xref.h +++ b/crypto/openssl/crypto/objects/obj_xref.h @@ -38,10 +38,12 @@ static const nid_triple sigoid_srt[] = {NID_id_GostR3411_94_with_GostR3410_94, NID_id_GostR3411_94, NID_id_GostR3410_94}, {NID_id_GostR3411_94_with_GostR3410_94_cc, NID_id_GostR3411_94, NID_id_GostR3410_94_cc}, {NID_id_GostR3411_94_with_GostR3410_2001_cc, NID_id_GostR3411_94, NID_id_GostR3410_2001_cc}, + {NID_rsassaPss, NID_undef, NID_rsaEncryption}, }; static const nid_triple * const sigoid_srt_xref[] = { + &sigoid_srt[29], &sigoid_srt[17], &sigoid_srt[18], &sigoid_srt[0], diff --git a/crypto/openssl/crypto/ocsp/ocsp_lib.c b/crypto/openssl/crypto/ocsp/ocsp_lib.c index e92b86c060..a94dc838ee 100644 --- a/crypto/openssl/crypto/ocsp/ocsp_lib.c +++ b/crypto/openssl/crypto/ocsp/ocsp_lib.c @@ -124,7 +124,8 @@ OCSP_CERTID *OCSP_cert_id_new(const EVP_MD *dgst, if (!(ASN1_OCTET_STRING_set(cid->issuerNameHash, md, i))) goto err; /* Calculate the issuerKey hash, excluding tag and length */ - EVP_Digest(issuerKey->data, issuerKey->length, md, &i, dgst, NULL); + if (!EVP_Digest(issuerKey->data, issuerKey->length, md, &i, dgst, NULL)) + goto err; if (!(ASN1_OCTET_STRING_set(cid->issuerKeyHash, md, i))) goto err; diff --git a/crypto/openssl/crypto/opensslv.h b/crypto/openssl/crypto/opensslv.h index a368f6fc78..bf42556070 100644 --- a/crypto/openssl/crypto/opensslv.h +++ b/crypto/openssl/crypto/opensslv.h @@ -25,11 +25,11 @@ * (Prior to 0.9.5a beta1, a different scheme was used: MMNNFFRBB for * major minor fix final patch/beta) */ -#define OPENSSL_VERSION_NUMBER 0x1000007fL +#define OPENSSL_VERSION_NUMBER 0x1000100fL #ifdef OPENSSL_FIPS -#define OPENSSL_VERSION_TEXT "OpenSSL 1.0.0g-fips 18 Jan 2012" +#define OPENSSL_VERSION_TEXT "OpenSSL 1.0.1-fips 14 Mar 2012" #else -#define OPENSSL_VERSION_TEXT "OpenSSL 1.0.0g 18 Jan 2012" +#define OPENSSL_VERSION_TEXT "OpenSSL 1.0.1 14 Mar 2012" #endif #define OPENSSL_VERSION_PTEXT " part of " OPENSSL_VERSION_TEXT diff --git a/crypto/openssl/crypto/ossl_typ.h b/crypto/openssl/crypto/ossl_typ.h index 12bd7014de..ea9227f6f9 100644 --- a/crypto/openssl/crypto/ossl_typ.h +++ b/crypto/openssl/crypto/ossl_typ.h @@ -91,10 +91,12 @@ typedef struct asn1_string_st ASN1_TIME; typedef struct asn1_string_st ASN1_GENERALIZEDTIME; typedef struct asn1_string_st ASN1_VISIBLESTRING; typedef struct asn1_string_st ASN1_UTF8STRING; +typedef struct asn1_string_st ASN1_STRING; typedef int ASN1_BOOLEAN; typedef int ASN1_NULL; #endif +typedef struct ASN1_ITEM_st ASN1_ITEM; typedef struct asn1_pctx_st ASN1_PCTX; #ifdef OPENSSL_SYS_WIN32 diff --git a/crypto/openssl/crypto/pem/pvkfmt.c b/crypto/openssl/crypto/pem/pvkfmt.c index 5f130c4528..b1bf71a5da 100644 --- a/crypto/openssl/crypto/pem/pvkfmt.c +++ b/crypto/openssl/crypto/pem/pvkfmt.c @@ -709,13 +709,16 @@ static int derive_pvk_key(unsigned char *key, const unsigned char *pass, int passlen) { EVP_MD_CTX mctx; + int rv = 1; EVP_MD_CTX_init(&mctx); - EVP_DigestInit_ex(&mctx, EVP_sha1(), NULL); - EVP_DigestUpdate(&mctx, salt, saltlen); - EVP_DigestUpdate(&mctx, pass, passlen); - EVP_DigestFinal_ex(&mctx, key, NULL); + if (!EVP_DigestInit_ex(&mctx, EVP_sha1(), NULL) + || !EVP_DigestUpdate(&mctx, salt, saltlen) + || !EVP_DigestUpdate(&mctx, pass, passlen) + || !EVP_DigestFinal_ex(&mctx, key, NULL)) + rv = 0; + EVP_MD_CTX_cleanup(&mctx); - return 1; + return rv; } @@ -727,11 +730,12 @@ static EVP_PKEY *do_PVK_body(const unsigned char **in, const unsigned char *p = *in; unsigned int magic; unsigned char *enctmp = NULL, *q; + EVP_CIPHER_CTX cctx; + EVP_CIPHER_CTX_init(&cctx); if (saltlen) { char psbuf[PEM_BUFSIZE]; unsigned char keybuf[20]; - EVP_CIPHER_CTX cctx; int enctmplen, inlen; if (cb) inlen=cb(psbuf,PEM_BUFSIZE,0,u); @@ -757,37 +761,41 @@ static EVP_PKEY *do_PVK_body(const unsigned char **in, p += 8; inlen = keylen - 8; q = enctmp + 8; - EVP_CIPHER_CTX_init(&cctx); - EVP_DecryptInit_ex(&cctx, EVP_rc4(), NULL, keybuf, NULL); - EVP_DecryptUpdate(&cctx, q, &enctmplen, p, inlen); - EVP_DecryptFinal_ex(&cctx, q + enctmplen, &enctmplen); + if (!EVP_DecryptInit_ex(&cctx, EVP_rc4(), NULL, keybuf, NULL)) + goto err; + if (!EVP_DecryptUpdate(&cctx, q, &enctmplen, p, inlen)) + goto err; + if (!EVP_DecryptFinal_ex(&cctx, q + enctmplen, &enctmplen)) + goto err; magic = read_ledword((const unsigned char **)&q); if (magic != MS_RSA2MAGIC && magic != MS_DSS2MAGIC) { q = enctmp + 8; memset(keybuf + 5, 0, 11); - EVP_DecryptInit_ex(&cctx, EVP_rc4(), NULL, keybuf, - NULL); + if (!EVP_DecryptInit_ex(&cctx, EVP_rc4(), NULL, keybuf, + NULL)) + goto err; OPENSSL_cleanse(keybuf, 20); - EVP_DecryptUpdate(&cctx, q, &enctmplen, p, inlen); - EVP_DecryptFinal_ex(&cctx, q + enctmplen, - &enctmplen); + if (!EVP_DecryptUpdate(&cctx, q, &enctmplen, p, inlen)) + goto err; + if (!EVP_DecryptFinal_ex(&cctx, q + enctmplen, + &enctmplen)) + goto err; magic = read_ledword((const unsigned char **)&q); if (magic != MS_RSA2MAGIC && magic != MS_DSS2MAGIC) { - EVP_CIPHER_CTX_cleanup(&cctx); PEMerr(PEM_F_DO_PVK_BODY, PEM_R_BAD_DECRYPT); goto err; } } else OPENSSL_cleanse(keybuf, 20); - EVP_CIPHER_CTX_cleanup(&cctx); p = enctmp; } ret = b2i_PrivateKey(&p, keylen); err: + EVP_CIPHER_CTX_cleanup(&cctx); if (enctmp && saltlen) OPENSSL_free(enctmp); return ret; @@ -841,6 +849,8 @@ static int i2b_PVK(unsigned char **out, EVP_PKEY*pk, int enclevel, { int outlen = 24, pklen; unsigned char *p, *salt = NULL; + EVP_CIPHER_CTX cctx; + EVP_CIPHER_CTX_init(&cctx); if (enclevel) outlen += PVK_SALTLEN; pklen = do_i2b(NULL, pk, 0); @@ -885,7 +895,6 @@ static int i2b_PVK(unsigned char **out, EVP_PKEY*pk, int enclevel, { char psbuf[PEM_BUFSIZE]; unsigned char keybuf[20]; - EVP_CIPHER_CTX cctx; int enctmplen, inlen; if (cb) inlen=cb(psbuf,PEM_BUFSIZE,1,u); @@ -902,16 +911,19 @@ static int i2b_PVK(unsigned char **out, EVP_PKEY*pk, int enclevel, if (enclevel == 1) memset(keybuf + 5, 0, 11); p = salt + PVK_SALTLEN + 8; - EVP_CIPHER_CTX_init(&cctx); - EVP_EncryptInit_ex(&cctx, EVP_rc4(), NULL, keybuf, NULL); + if (!EVP_EncryptInit_ex(&cctx, EVP_rc4(), NULL, keybuf, NULL)) + goto error; OPENSSL_cleanse(keybuf, 20); - EVP_DecryptUpdate(&cctx, p, &enctmplen, p, pklen - 8); - EVP_DecryptFinal_ex(&cctx, p + enctmplen, &enctmplen); - EVP_CIPHER_CTX_cleanup(&cctx); + if (!EVP_DecryptUpdate(&cctx, p, &enctmplen, p, pklen - 8)) + goto error; + if (!EVP_DecryptFinal_ex(&cctx, p + enctmplen, &enctmplen)) + goto error; } + EVP_CIPHER_CTX_cleanup(&cctx); return outlen; error: + EVP_CIPHER_CTX_cleanup(&cctx); return -1; } diff --git a/crypto/openssl/crypto/perlasm/x86_64-xlate.pl b/crypto/openssl/crypto/perlasm/x86_64-xlate.pl index e47116b74b..56d9b64b6f 100755 --- a/crypto/openssl/crypto/perlasm/x86_64-xlate.pl +++ b/crypto/openssl/crypto/perlasm/x86_64-xlate.pl @@ -62,12 +62,8 @@ my $flavour = shift; my $output = shift; if ($flavour =~ /\./) { $output = $flavour; undef $flavour; } -{ my ($stddev,$stdino,@junk)=stat(STDOUT); - my ($outdev,$outino,@junk)=stat($output); - - open STDOUT,">$output" || die "can't open $output: $!" - if ($stddev!=$outdev || $stdino!=$outino); -} +open STDOUT,">$output" || die "can't open $output: $!" + if (defined($output)); my $gas=1; $gas=0 if ($output =~ /\.asm$/); my $elf=1; $elf=0 if (!$gas); @@ -116,12 +112,16 @@ my %globals; $line = substr($line,@+[0]); $line =~ s/^\s+//; undef $self->{sz}; - if ($self->{op} =~ /^(movz)b.*/) { # movz is pain... + if ($self->{op} =~ /^(movz)x?([bw]).*/) { # movz is pain... $self->{op} = $1; - $self->{sz} = "b"; + $self->{sz} = $2; } elsif ($self->{op} =~ /call|jmp/) { $self->{sz} = ""; - } elsif ($self->{op} =~ /^p/ && $' !~ /^(ush|op)/) { # SSEn + } elsif ($self->{op} =~ /^p/ && $' !~ /^(ush|op|insrw)/) { # SSEn + $self->{sz} = ""; + } elsif ($self->{op} =~ /^v/) { # VEX + $self->{sz} = ""; + } elsif ($self->{op} =~ /movq/ && $line =~ /%xmm/) { $self->{sz} = ""; } elsif ($self->{op} =~ /([a-z]{3,})([qlwb])$/) { $self->{op} = $1; @@ -246,35 +246,39 @@ my %globals; $self->{index} =~ s/^[er](.?[0-9xpi])[d]?$/r\1/; $self->{base} =~ s/^[er](.?[0-9xpi])[d]?$/r\1/; + # Solaris /usr/ccs/bin/as can't handle multiplications + # in $self->{label}, new gas requires sign extension... + use integer; + $self->{label} =~ s/(?{label} =~ s/([0-9]+\s*[\*\/\%]\s*[0-9]+)/eval($1)/eg; + $self->{label} =~ s/([0-9]+)/$1<<32>>32/eg; + if ($gas) { - # Solaris /usr/ccs/bin/as can't handle multiplications - # in $self->{label}, new gas requires sign extension... - use integer; - $self->{label} =~ s/(?{label} =~ s/([0-9]+\s*[\*\/\%]\s*[0-9]+)/eval($1)/eg; - $self->{label} =~ s/([0-9]+)/$1<<32>>32/eg; $self->{label} =~ s/^___imp_/__imp__/ if ($flavour eq "mingw64"); if (defined($self->{index})) { - sprintf "%s%s(%%%s,%%%s,%d)",$self->{asterisk}, - $self->{label},$self->{base}, + sprintf "%s%s(%s,%%%s,%d)",$self->{asterisk}, + $self->{label}, + $self->{base}?"%$self->{base}":"", $self->{index},$self->{scale}; } else { sprintf "%s%s(%%%s)", $self->{asterisk},$self->{label},$self->{base}; } } else { - %szmap = ( b=>"BYTE$PTR", w=>"WORD$PTR", l=>"DWORD$PTR", q=>"QWORD$PTR" ); + %szmap = ( b=>"BYTE$PTR", w=>"WORD$PTR", l=>"DWORD$PTR", + q=>"QWORD$PTR",o=>"OWORD$PTR",x=>"XMMWORD$PTR" ); $self->{label} =~ s/\./\$/g; $self->{label} =~ s/(?{label} = "($self->{label})" if ($self->{label} =~ /[\*\+\-\/]/); - $sz="q" if ($self->{asterisk}); + $sz="q" if ($self->{asterisk} || opcode->mnemonic() eq "movq"); + $sz="l" if (opcode->mnemonic() eq "movd"); if (defined($self->{index})) { - sprintf "%s[%s%s*%d+%s]",$szmap{$sz}, + sprintf "%s[%s%s*%d%s]",$szmap{$sz}, $self->{label}?"$self->{label}+":"", $self->{index},$self->{scale}, - $self->{base}; + $self->{base}?"+$self->{base}":""; } elsif ($self->{base} eq "rip") { sprintf "%s[%s]",$szmap{$sz},$self->{label}; } else { @@ -506,6 +510,12 @@ my %globals; } } elsif ($dir =~ /\.(text|data)/) { $current_segment=".$1"; + } elsif ($dir =~ /\.hidden/) { + if ($flavour eq "macosx") { $self->{value} = ".private_extern\t$prefix$line"; } + elsif ($flavour eq "mingw64") { $self->{value} = ""; } + } elsif ($dir =~ /\.comm/) { + $self->{value} = "$dir\t$prefix$line"; + $self->{value} =~ s|,([0-9]+),([0-9]+)$|",$1,".log($2)/log(2)|e if ($flavour eq "macosx"); } $line = ""; return $self; @@ -555,7 +565,8 @@ my %globals; $v.=" READONLY"; $v.=" ALIGN(".($1 eq "p" ? 4 : 8).")" if ($masm>=$masmref); } elsif ($line=~/\.CRT\$/i) { - $v.=" READONLY DWORD"; + $v.=" READONLY "; + $v.=$masm>=$masmref ? "ALIGN(8)" : "DWORD"; } } $current_segment = $line; @@ -577,7 +588,7 @@ my %globals; $self->{value}="${decor}SEH_end_$current_function->{name}:"; $self->{value}.=":\n" if($masm); } - $self->{value}.="$current_function->{name}\tENDP" if($masm); + $self->{value}.="$current_function->{name}\tENDP" if($masm && $current_function->{name}); undef $current_function; } last; @@ -613,6 +624,19 @@ my %globals; .join(",",@str) if (@str); last; }; + /\.comm/ && do { my @str=split(/,\s*/,$line); + my $v=undef; + if ($nasm) { + $v.="common $prefix@str[0] @str[1]"; + } else { + $v="$current_segment\tENDS\n" if ($current_segment); + $current_segment = "_DATA"; + $v.="$current_segment\tSEGMENT\n"; + $v.="COMM @str[0]:DWORD:".@str[1]/4; + } + $self->{value} = $v; + last; + }; } $line = ""; } @@ -625,9 +649,133 @@ my %globals; } } +sub rex { + local *opcode=shift; + my ($dst,$src,$rex)=@_; + + $rex|=0x04 if($dst>=8); + $rex|=0x01 if($src>=8); + push @opcode,($rex|0x40) if ($rex); +} + +# older gas and ml64 don't handle SSE>2 instructions +my %regrm = ( "%eax"=>0, "%ecx"=>1, "%edx"=>2, "%ebx"=>3, + "%esp"=>4, "%ebp"=>5, "%esi"=>6, "%edi"=>7 ); + +my $movq = sub { # elderly gas can't handle inter-register movq + my $arg = shift; + my @opcode=(0x66); + if ($arg =~ /%xmm([0-9]+),\s*%r(\w+)/) { + my ($src,$dst)=($1,$2); + if ($dst !~ /[0-9]+/) { $dst = $regrm{"%e$dst"}; } + rex(\@opcode,$src,$dst,0x8); + push @opcode,0x0f,0x7e; + push @opcode,0xc0|(($src&7)<<3)|($dst&7); # ModR/M + @opcode; + } elsif ($arg =~ /%r(\w+),\s*%xmm([0-9]+)/) { + my ($src,$dst)=($2,$1); + if ($dst !~ /[0-9]+/) { $dst = $regrm{"%e$dst"}; } + rex(\@opcode,$src,$dst,0x8); + push @opcode,0x0f,0x6e; + push @opcode,0xc0|(($src&7)<<3)|($dst&7); # ModR/M + @opcode; + } else { + (); + } +}; + +my $pextrd = sub { + if (shift =~ /\$([0-9]+),\s*%xmm([0-9]+),\s*(%\w+)/) { + my @opcode=(0x66); + $imm=$1; + $src=$2; + $dst=$3; + if ($dst =~ /%r([0-9]+)d/) { $dst = $1; } + elsif ($dst =~ /%e/) { $dst = $regrm{$dst}; } + rex(\@opcode,$src,$dst); + push @opcode,0x0f,0x3a,0x16; + push @opcode,0xc0|(($src&7)<<3)|($dst&7); # ModR/M + push @opcode,$imm; + @opcode; + } else { + (); + } +}; + +my $pinsrd = sub { + if (shift =~ /\$([0-9]+),\s*(%\w+),\s*%xmm([0-9]+)/) { + my @opcode=(0x66); + $imm=$1; + $src=$2; + $dst=$3; + if ($src =~ /%r([0-9]+)/) { $src = $1; } + elsif ($src =~ /%e/) { $src = $regrm{$src}; } + rex(\@opcode,$dst,$src); + push @opcode,0x0f,0x3a,0x22; + push @opcode,0xc0|(($dst&7)<<3)|($src&7); # ModR/M + push @opcode,$imm; + @opcode; + } else { + (); + } +}; + +my $pshufb = sub { + if (shift =~ /%xmm([0-9]+),\s*%xmm([0-9]+)/) { + my @opcode=(0x66); + rex(\@opcode,$2,$1); + push @opcode,0x0f,0x38,0x00; + push @opcode,0xc0|($1&7)|(($2&7)<<3); # ModR/M + @opcode; + } else { + (); + } +}; + +my $palignr = sub { + if (shift =~ /\$([0-9]+),\s*%xmm([0-9]+),\s*%xmm([0-9]+)/) { + my @opcode=(0x66); + rex(\@opcode,$3,$2); + push @opcode,0x0f,0x3a,0x0f; + push @opcode,0xc0|($2&7)|(($3&7)<<3); # ModR/M + push @opcode,$1; + @opcode; + } else { + (); + } +}; + +my $pclmulqdq = sub { + if (shift =~ /\$([x0-9a-f]+),\s*%xmm([0-9]+),\s*%xmm([0-9]+)/) { + my @opcode=(0x66); + rex(\@opcode,$3,$2); + push @opcode,0x0f,0x3a,0x44; + push @opcode,0xc0|($2&7)|(($3&7)<<3); # ModR/M + my $c=$1; + push @opcode,$c=~/^0/?oct($c):$c; + @opcode; + } else { + (); + } +}; + +my $rdrand = sub { + if (shift =~ /%[er](\w+)/) { + my @opcode=(); + my $dst=$1; + if ($dst !~ /[0-9]+/) { $dst = $regrm{"%e$dst"}; } + rex(\@opcode,0,$1,8); + push @opcode,0x0f,0xc7,0xf0|($dst&7); + @opcode; + } else { + (); + } +}; + if ($nasm) { print <<___; default rel +%define XMMWORD ___ } elsif ($masm) { print <<___; @@ -644,14 +792,22 @@ while($line=<>) { undef $label; undef $opcode; - undef $sz; undef @args; if ($label=label->re(\$line)) { print $label->out(); } if (directive->re(\$line)) { printf "%s",directive->out(); - } elsif ($opcode=opcode->re(\$line)) { ARGUMENT: while (1) { + } elsif ($opcode=opcode->re(\$line)) { + my $asm = eval("\$".$opcode->mnemonic()); + undef @bytes; + + if ((ref($asm) eq 'CODE') && scalar(@bytes=&$asm($line))) { + print $gas?".byte\t":"DB\t",join(',',@bytes),"\n"; + next; + } + + ARGUMENT: while (1) { my $arg; if ($arg=register->re(\$line)) { opcode->size($arg->size()); } @@ -667,19 +823,26 @@ while($line=<>) { $line =~ s/^,\s*//; } # ARGUMENT: - $sz=opcode->size(); - if ($#args>=0) { my $insn; + my $sz=opcode->size(); + if ($gas) { $insn = $opcode->out($#args>=1?$args[$#args]->size():$sz); + @args = map($_->out($sz),@args); + printf "\t%s\t%s",$insn,join(",",@args); } else { $insn = $opcode->out(); - $insn .= $sz if (map($_->out() =~ /x?mm/,@args)); + foreach (@args) { + my $arg = $_->out(); + # $insn.=$sz compensates for movq, pinsrw, ... + if ($arg =~ /^xmm[0-9]+$/) { $insn.=$sz; $sz="x" if(!$sz); last; } + if ($arg =~ /^mm[0-9]+$/) { $insn.=$sz; $sz="q" if(!$sz); last; } + } @args = reverse(@args); undef $sz if ($nasm && $opcode->mnemonic() eq "lea"); + printf "\t%s\t%s",$insn,join(",",map($_->out($sz),@args)); } - printf "\t%s\t%s",$insn,join(",",map($_->out($sz),@args)); } else { printf "\t%s",$opcode->out(); } diff --git a/crypto/openssl/crypto/perlasm/x86asm.pl b/crypto/openssl/crypto/perlasm/x86asm.pl index 28080caaa6..eb543db2f6 100644 --- a/crypto/openssl/crypto/perlasm/x86asm.pl +++ b/crypto/openssl/crypto/perlasm/x86asm.pl @@ -80,6 +80,57 @@ sub ::movq { &::generic("movq",@_); } } +# SSE>2 instructions +my %regrm = ( "eax"=>0, "ecx"=>1, "edx"=>2, "ebx"=>3, + "esp"=>4, "ebp"=>5, "esi"=>6, "edi"=>7 ); +sub ::pextrd +{ my($dst,$src,$imm)=@_; + if ("$dst:$src" =~ /(e[a-dsd][ixp]):xmm([0-7])/) + { &::data_byte(0x66,0x0f,0x3a,0x16,0xc0|($2<<3)|$regrm{$1},$imm); } + else + { &::generic("pextrd",@_); } +} + +sub ::pinsrd +{ my($dst,$src,$imm)=@_; + if ("$dst:$src" =~ /xmm([0-7]):(e[a-dsd][ixp])/) + { &::data_byte(0x66,0x0f,0x3a,0x22,0xc0|($1<<3)|$regrm{$2},$imm); } + else + { &::generic("pinsrd",@_); } +} + +sub ::pshufb +{ my($dst,$src)=@_; + if ("$dst:$src" =~ /xmm([0-7]):xmm([0-7])/) + { &data_byte(0x66,0x0f,0x38,0x00,0xc0|($1<<3)|$2); } + else + { &::generic("pshufb",@_); } +} + +sub ::palignr +{ my($dst,$src,$imm)=@_; + if ("$dst:$src" =~ /xmm([0-7]):xmm([0-7])/) + { &::data_byte(0x66,0x0f,0x3a,0x0f,0xc0|($1<<3)|$2,$imm); } + else + { &::generic("palignr",@_); } +} + +sub ::pclmulqdq +{ my($dst,$src,$imm)=@_; + if ("$dst:$src" =~ /xmm([0-7]):xmm([0-7])/) + { &::data_byte(0x66,0x0f,0x3a,0x44,0xc0|($1<<3)|$2,$imm); } + else + { &::generic("pclmulqdq",@_); } +} + +sub ::rdrand +{ my ($dst)=@_; + if ($dst =~ /(e[a-dsd][ixp])/) + { &::data_byte(0x0f,0xc7,0xf0|$regrm{$dst}); } + else + { &::generic("rdrand",@_); } +} + # label management $lbdecor="L"; # local label decoration, set by package $label="000"; @@ -167,7 +218,7 @@ sub ::asm_init $filename=$fn; $i386=$cpu; - $elf=$cpp=$coff=$aout=$macosx=$win32=$netware=$mwerks=0; + $elf=$cpp=$coff=$aout=$macosx=$win32=$netware=$mwerks=$android=0; if (($type eq "elf")) { $elf=1; require "x86gas.pl"; } elsif (($type eq "a\.out")) @@ -184,6 +235,8 @@ sub ::asm_init { $win32=1; require "x86masm.pl"; } elsif (($type eq "macosx")) { $aout=1; $macosx=1; require "x86gas.pl"; } + elsif (($type eq "android")) + { $elf=1; $android=1; require "x86gas.pl"; } else { print STDERR <<"EOF"; Pick one target type from diff --git a/crypto/openssl/crypto/perlasm/x86gas.pl b/crypto/openssl/crypto/perlasm/x86gas.pl index 6eab727fd4..682a3a3163 100644 --- a/crypto/openssl/crypto/perlasm/x86gas.pl +++ b/crypto/openssl/crypto/perlasm/x86gas.pl @@ -45,9 +45,8 @@ sub ::generic undef $suffix if ($dst =~ m/^%[xm]/o || $src =~ m/^%[xm]/o); if ($#_==0) { &::emit($opcode); } - elsif ($opcode =~ m/^j/o && $#_==1) { &::emit($opcode,@arg); } - elsif ($opcode eq "call" && $#_==1) { &::emit($opcode,@arg); } - elsif ($opcode =~ m/^set/&& $#_==1) { &::emit($opcode,@arg); } + elsif ($#_==1 && $opcode =~ m/^(call|clflush|j|loop|set)/o) + { &::emit($opcode,@arg); } else { &::emit($opcode.$suffix,@arg);} 1; @@ -91,6 +90,7 @@ sub ::DWP } sub ::QWP { &::DWP(@_); } sub ::BP { &::DWP(@_); } +sub ::WP { &::DWP(@_); } sub ::BC { @_; } sub ::DWC { @_; } @@ -149,22 +149,24 @@ sub ::public_label { push(@out,".globl\t".&::LABEL($_[0],$nmdecor.$_[0])."\n"); } sub ::file_end -{ if (grep {/\b${nmdecor}OPENSSL_ia32cap_P\b/i} @out) { - my $tmp=".comm\t${nmdecor}OPENSSL_ia32cap_P,4"; - if ($::elf) { push (@out,"$tmp,4\n"); } - else { push (@out,"$tmp\n"); } - } - if ($::macosx) +{ if ($::macosx) { if (%non_lazy_ptr) { push(@out,".section __IMPORT,__pointers,non_lazy_symbol_pointers\n"); foreach $i (keys %non_lazy_ptr) { push(@out,"$non_lazy_ptr{$i}:\n.indirect_symbol\t$i\n.long\t0\n"); } } } + if (grep {/\b${nmdecor}OPENSSL_ia32cap_P\b/i} @out) { + my $tmp=".comm\t${nmdecor}OPENSSL_ia32cap_P,8"; + if ($::macosx) { push (@out,"$tmp,2\n"); } + elsif ($::elf) { push (@out,"$tmp,4\n"); } + else { push (@out,"$tmp\n"); } + } push(@out,$initseg) if ($initseg); } sub ::data_byte { push(@out,".byte\t".join(',',@_)."\n"); } +sub ::data_short{ push(@out,".value\t".join(',',@_)."\n"); } sub ::data_word { push(@out,".long\t".join(',',@_)."\n"); } sub ::align @@ -180,7 +182,7 @@ sub ::align sub ::picmeup { my($dst,$sym,$base,$reflabel)=@_; - if ($::pic && ($::elf || $::aout)) + if (($::pic && ($::elf || $::aout)) || $::macosx) { if (!defined($base)) { &::call(&::label("PIC_me_up")); &::set_label("PIC_me_up"); @@ -206,13 +208,17 @@ sub ::picmeup sub ::initseg { my $f=$nmdecor.shift; - if ($::elf) + if ($::android) + { $initseg.=<<___; +.section .init_array +.align 4 +.long $f +___ + } + elsif ($::elf) { $initseg.=<<___; .section .init call $f - jmp .Linitalign -.align $align -.Linitalign: ___ } elsif ($::coff) diff --git a/crypto/openssl/crypto/pkcs12/p12_decr.c b/crypto/openssl/crypto/pkcs12/p12_decr.c index ba77dbbe32..9d3557e8d7 100644 --- a/crypto/openssl/crypto/pkcs12/p12_decr.c +++ b/crypto/openssl/crypto/pkcs12/p12_decr.c @@ -89,7 +89,14 @@ unsigned char * PKCS12_pbe_crypt(X509_ALGOR *algor, const char *pass, goto err; } - EVP_CipherUpdate(&ctx, out, &i, in, inlen); + if (!EVP_CipherUpdate(&ctx, out, &i, in, inlen)) + { + OPENSSL_free(out); + out = NULL; + PKCS12err(PKCS12_F_PKCS12_PBE_CRYPT,ERR_R_EVP_LIB); + goto err; + } + outlen = i; if(!EVP_CipherFinal_ex(&ctx, out + i, &i)) { OPENSSL_free(out); diff --git a/crypto/openssl/crypto/pkcs12/p12_key.c b/crypto/openssl/crypto/pkcs12/p12_key.c index 424203f648..c55c7b60b3 100644 --- a/crypto/openssl/crypto/pkcs12/p12_key.c +++ b/crypto/openssl/crypto/pkcs12/p12_key.c @@ -152,14 +152,16 @@ int PKCS12_key_gen_uni(unsigned char *pass, int passlen, unsigned char *salt, for (i = 0; i < Slen; i++) *p++ = salt[i % saltlen]; for (i = 0; i < Plen; i++) *p++ = pass[i % passlen]; for (;;) { - EVP_DigestInit_ex(&ctx, md_type, NULL); - EVP_DigestUpdate(&ctx, D, v); - EVP_DigestUpdate(&ctx, I, Ilen); - EVP_DigestFinal_ex(&ctx, Ai, NULL); + if (!EVP_DigestInit_ex(&ctx, md_type, NULL) + || !EVP_DigestUpdate(&ctx, D, v) + || !EVP_DigestUpdate(&ctx, I, Ilen) + || !EVP_DigestFinal_ex(&ctx, Ai, NULL)) + goto err; for (j = 1; j < iter; j++) { - EVP_DigestInit_ex(&ctx, md_type, NULL); - EVP_DigestUpdate(&ctx, Ai, u); - EVP_DigestFinal_ex(&ctx, Ai, NULL); + if (!EVP_DigestInit_ex(&ctx, md_type, NULL) + || !EVP_DigestUpdate(&ctx, Ai, u) + || !EVP_DigestFinal_ex(&ctx, Ai, NULL)) + goto err; } memcpy (out, Ai, min (n, u)); if (u >= n) { diff --git a/crypto/openssl/crypto/pkcs12/p12_kiss.c b/crypto/openssl/crypto/pkcs12/p12_kiss.c index 292cc3ed4a..206b1b0b18 100644 --- a/crypto/openssl/crypto/pkcs12/p12_kiss.c +++ b/crypto/openssl/crypto/pkcs12/p12_kiss.c @@ -167,7 +167,7 @@ int PKCS12_parse(PKCS12 *p12, const char *pass, EVP_PKEY **pkey, X509 **cert, if (cert && *cert) X509_free(*cert); if (x) - X509_free(*cert); + X509_free(x); if (ocerts) sk_X509_pop_free(ocerts, X509_free); return 0; diff --git a/crypto/openssl/crypto/pkcs12/p12_mutl.c b/crypto/openssl/crypto/pkcs12/p12_mutl.c index 9ab740d51f..96de1bd11e 100644 --- a/crypto/openssl/crypto/pkcs12/p12_mutl.c +++ b/crypto/openssl/crypto/pkcs12/p12_mutl.c @@ -97,10 +97,14 @@ int PKCS12_gen_mac(PKCS12 *p12, const char *pass, int passlen, return 0; } HMAC_CTX_init(&hmac); - HMAC_Init_ex(&hmac, key, md_size, md_type, NULL); - HMAC_Update(&hmac, p12->authsafes->d.data->data, - p12->authsafes->d.data->length); - HMAC_Final(&hmac, mac, maclen); + if (!HMAC_Init_ex(&hmac, key, md_size, md_type, NULL) + || !HMAC_Update(&hmac, p12->authsafes->d.data->data, + p12->authsafes->d.data->length) + || !HMAC_Final(&hmac, mac, maclen)) + { + HMAC_CTX_cleanup(&hmac); + return 0; + } HMAC_CTX_cleanup(&hmac); return 1; } diff --git a/crypto/openssl/crypto/pkcs7/pk7_doit.c b/crypto/openssl/crypto/pkcs7/pk7_doit.c index 3bf1a367bb..fae8eda46c 100644 --- a/crypto/openssl/crypto/pkcs7/pk7_doit.c +++ b/crypto/openssl/crypto/pkcs7/pk7_doit.c @@ -204,11 +204,11 @@ static int pkcs7_decrypt_rinfo(unsigned char **pek, int *peklen, unsigned char *ek = NULL; size_t eklen; - int ret = 0; + int ret = -1; pctx = EVP_PKEY_CTX_new(pkey, NULL); if (!pctx) - return 0; + return -1; if (EVP_PKEY_decrypt_init(pctx) <= 0) goto err; @@ -235,12 +235,19 @@ static int pkcs7_decrypt_rinfo(unsigned char **pek, int *peklen, if (EVP_PKEY_decrypt(pctx, ek, &eklen, ri->enc_key->data, ri->enc_key->length) <= 0) { + ret = 0; PKCS7err(PKCS7_F_PKCS7_DECRYPT_RINFO, ERR_R_EVP_LIB); goto err; } ret = 1; + if (*pek) + { + OPENSSL_cleanse(*pek, *peklen); + OPENSSL_free(*pek); + } + *pek = ek; *peklen = eklen; @@ -500,8 +507,8 @@ BIO *PKCS7_dataDecode(PKCS7 *p7, EVP_PKEY *pkey, BIO *in_bio, X509 *pcert) int max; X509_OBJECT ret; #endif - unsigned char *ek = NULL; - int eklen; + unsigned char *ek = NULL, *tkey = NULL; + int eklen, tkeylen; if ((etmp=BIO_new(BIO_f_cipher())) == NULL) { @@ -534,29 +541,28 @@ BIO *PKCS7_dataDecode(PKCS7 *p7, EVP_PKEY *pkey, BIO *in_bio, X509 *pcert) } /* If we haven't got a certificate try each ri in turn */ - if (pcert == NULL) { + /* Always attempt to decrypt all rinfo even + * after sucess as a defence against MMA timing + * attacks. + */ for (i=0; i 0) - break; + ri, pkey) < 0) + goto err; ERR_clear_error(); - ri = NULL; - } - if (ri == NULL) - { - PKCS7err(PKCS7_F_PKCS7_DATADECODE, - PKCS7_R_NO_RECIPIENT_MATCHES_KEY); - goto err; } } else { - if (pkcs7_decrypt_rinfo(&ek, &eklen, ri, pkey) <= 0) + /* Only exit on fatal errors, not decrypt failure */ + if (pkcs7_decrypt_rinfo(&ek, &eklen, ri, pkey) < 0) goto err; + ERR_clear_error(); } evp_ctx=NULL; @@ -565,6 +571,19 @@ BIO *PKCS7_dataDecode(PKCS7 *p7, EVP_PKEY *pkey, BIO *in_bio, X509 *pcert) goto err; if (EVP_CIPHER_asn1_to_param(evp_ctx,enc_alg->parameter) < 0) goto err; + /* Generate random key as MMA defence */ + tkeylen = EVP_CIPHER_CTX_key_length(evp_ctx); + tkey = OPENSSL_malloc(tkeylen); + if (!tkey) + goto err; + if (EVP_CIPHER_CTX_rand_key(evp_ctx, tkey) <= 0) + goto err; + if (ek == NULL) + { + ek = tkey; + eklen = tkeylen; + tkey = NULL; + } if (eklen != EVP_CIPHER_CTX_key_length(evp_ctx)) { /* Some S/MIME clients don't use the same key @@ -573,11 +592,16 @@ BIO *PKCS7_dataDecode(PKCS7 *p7, EVP_PKEY *pkey, BIO *in_bio, X509 *pcert) */ if(!EVP_CIPHER_CTX_set_key_length(evp_ctx, eklen)) { - PKCS7err(PKCS7_F_PKCS7_DATADECODE, - PKCS7_R_DECRYPTED_KEY_IS_WRONG_LENGTH); - goto err; + /* Use random key as MMA defence */ + OPENSSL_cleanse(ek, eklen); + OPENSSL_free(ek); + ek = tkey; + eklen = tkeylen; + tkey = NULL; } } + /* Clear errors so we don't leak information useful in MMA */ + ERR_clear_error(); if (EVP_CipherInit_ex(evp_ctx,NULL,NULL,ek,NULL,0) <= 0) goto err; @@ -586,6 +610,11 @@ BIO *PKCS7_dataDecode(PKCS7 *p7, EVP_PKEY *pkey, BIO *in_bio, X509 *pcert) OPENSSL_cleanse(ek,eklen); OPENSSL_free(ek); } + if (tkey) + { + OPENSSL_cleanse(tkey,tkeylen); + OPENSSL_free(tkey); + } if (out == NULL) out=etmp; @@ -676,7 +705,11 @@ static int do_pkcs7_signed_attrib(PKCS7_SIGNER_INFO *si, EVP_MD_CTX *mctx) } /* Add digest */ - EVP_DigestFinal_ex(mctx, md_data,&md_len); + if (!EVP_DigestFinal_ex(mctx, md_data,&md_len)) + { + PKCS7err(PKCS7_F_DO_PKCS7_SIGNED_ATTRIB, ERR_R_EVP_LIB); + return 0; + } if (!PKCS7_add1_attrib_digest(si, md_data, md_len)) { PKCS7err(PKCS7_F_DO_PKCS7_SIGNED_ATTRIB, ERR_R_MALLOC_FAILURE); @@ -784,7 +817,8 @@ int PKCS7_dataFinal(PKCS7 *p7, BIO *bio) /* We now have the EVP_MD_CTX, lets do the * signing. */ - EVP_MD_CTX_copy_ex(&ctx_tmp,mdc); + if (!EVP_MD_CTX_copy_ex(&ctx_tmp,mdc)) + goto err; sk=si->auth_attr; @@ -822,7 +856,8 @@ int PKCS7_dataFinal(PKCS7 *p7, BIO *bio) if (!PKCS7_find_digest(&mdc, bio, OBJ_obj2nid(p7->d.digest->md->algorithm))) goto err; - EVP_DigestFinal_ex(mdc,md_data,&md_len); + if (!EVP_DigestFinal_ex(mdc,md_data,&md_len)) + goto err; M_ASN1_OCTET_STRING_set(p7->d.digest->digest, md_data, md_len); } @@ -1015,7 +1050,8 @@ int PKCS7_signatureVerify(BIO *bio, PKCS7 *p7, PKCS7_SIGNER_INFO *si, /* mdc is the digest ctx that we want, unless there are attributes, * in which case the digest is the signed attributes */ - EVP_MD_CTX_copy_ex(&mdc_tmp,mdc); + if (!EVP_MD_CTX_copy_ex(&mdc_tmp,mdc)) + goto err; sk=si->auth_attr; if ((sk != NULL) && (sk_X509_ATTRIBUTE_num(sk) != 0)) @@ -1025,7 +1061,8 @@ int PKCS7_signatureVerify(BIO *bio, PKCS7 *p7, PKCS7_SIGNER_INFO *si, int alen; ASN1_OCTET_STRING *message_digest; - EVP_DigestFinal_ex(&mdc_tmp,md_dat,&md_len); + if (!EVP_DigestFinal_ex(&mdc_tmp,md_dat,&md_len)) + goto err; message_digest=PKCS7_digest_from_attributes(sk); if (!message_digest) { @@ -1050,7 +1087,8 @@ for (ii=0; ii 0 && BIO_method_type(tmpmem) == BIO_TYPE_CIPHER) + { + if (!BIO_get_cipher_status(tmpmem)) + ret = 0; + } BIO_free_all(bread); return ret; } else { for(;;) { i = BIO_read(tmpmem, buf, sizeof(buf)); - if(i <= 0) break; - BIO_write(data, buf, i); + if(i <= 0) + { + ret = 1; + if (BIO_method_type(tmpmem) == BIO_TYPE_CIPHER) + { + if (!BIO_get_cipher_status(tmpmem)) + ret = 0; + } + + break; + } + if (BIO_write(data, buf, i) != i) + { + ret = 0; + break; + } } BIO_free_all(tmpmem); - return 1; + return ret; } } diff --git a/crypto/openssl/crypto/rand/md_rand.c b/crypto/openssl/crypto/rand/md_rand.c index b2f04ff13e..fcdd3f2a84 100644 --- a/crypto/openssl/crypto/rand/md_rand.c +++ b/crypto/openssl/crypto/rand/md_rand.c @@ -109,6 +109,8 @@ * */ +#define OPENSSL_FIPSEVP + #ifdef MD_RAND_DEBUG # ifndef NDEBUG # define NDEBUG @@ -157,13 +159,14 @@ const char RAND_version[]="RAND" OPENSSL_VERSION_PTEXT; static void ssleay_rand_cleanup(void); static void ssleay_rand_seed(const void *buf, int num); static void ssleay_rand_add(const void *buf, int num, double add_entropy); -static int ssleay_rand_bytes(unsigned char *buf, int num); +static int ssleay_rand_bytes(unsigned char *buf, int num, int pseudo); +static int ssleay_rand_nopseudo_bytes(unsigned char *buf, int num); static int ssleay_rand_pseudo_bytes(unsigned char *buf, int num); static int ssleay_rand_status(void); RAND_METHOD rand_ssleay_meth={ ssleay_rand_seed, - ssleay_rand_bytes, + ssleay_rand_nopseudo_bytes, ssleay_rand_cleanup, ssleay_rand_add, ssleay_rand_pseudo_bytes, @@ -328,7 +331,7 @@ static void ssleay_rand_seed(const void *buf, int num) ssleay_rand_add(buf, num, (double)num); } -static int ssleay_rand_bytes(unsigned char *buf, int num) +static int ssleay_rand_bytes(unsigned char *buf, int num, int pseudo) { static volatile int stirred_pool = 0; int i,j,k,st_num,st_idx; @@ -517,7 +520,9 @@ static int ssleay_rand_bytes(unsigned char *buf, int num) EVP_MD_CTX_cleanup(&m); if (ok) return(1); - else + else if (pseudo) + return 0; + else { RANDerr(RAND_F_SSLEAY_RAND_BYTES,RAND_R_PRNG_NOT_SEEDED); ERR_add_error_data(1, "You need to read the OpenSSL FAQ, " @@ -526,22 +531,16 @@ static int ssleay_rand_bytes(unsigned char *buf, int num) } } +static int ssleay_rand_nopseudo_bytes(unsigned char *buf, int num) + { + return ssleay_rand_bytes(buf, num, 0); + } + /* pseudo-random bytes that are guaranteed to be unique but not unpredictable */ static int ssleay_rand_pseudo_bytes(unsigned char *buf, int num) { - int ret; - unsigned long err; - - ret = RAND_bytes(buf, num); - if (ret == 0) - { - err = ERR_peek_error(); - if (ERR_GET_LIB(err) == ERR_LIB_RAND && - ERR_GET_REASON(err) == RAND_R_PRNG_NOT_SEEDED) - ERR_clear_error(); - } - return (ret); + return ssleay_rand_bytes(buf, num, 1); } static int ssleay_rand_status(void) diff --git a/crypto/openssl/crypto/rand/rand.h b/crypto/openssl/crypto/rand/rand.h index ac6c021763..dc8fcf94c5 100644 --- a/crypto/openssl/crypto/rand/rand.h +++ b/crypto/openssl/crypto/rand/rand.h @@ -119,6 +119,11 @@ int RAND_event(UINT, WPARAM, LPARAM); #endif +#ifdef OPENSSL_FIPS +void RAND_set_fips_drbg_type(int type, int flags); +int RAND_init_fips(void); +#endif + /* BEGIN ERROR CODES */ /* The following lines are auto generated by the script mkerr.pl. Any changes * made after this point may be overwritten when the script is next run. @@ -129,9 +134,13 @@ void ERR_load_RAND_strings(void); /* Function codes. */ #define RAND_F_RAND_GET_RAND_METHOD 101 +#define RAND_F_RAND_INIT_FIPS 102 #define RAND_F_SSLEAY_RAND_BYTES 100 /* Reason codes. */ +#define RAND_R_ERROR_INITIALISING_DRBG 102 +#define RAND_R_ERROR_INSTANTIATING_DRBG 103 +#define RAND_R_NO_FIPS_RANDOM_METHOD_SET 101 #define RAND_R_PRNG_NOT_SEEDED 100 #ifdef __cplusplus diff --git a/crypto/openssl/crypto/rand/rand_err.c b/crypto/openssl/crypto/rand/rand_err.c index 03cda4dd92..b8586c8f4a 100644 --- a/crypto/openssl/crypto/rand/rand_err.c +++ b/crypto/openssl/crypto/rand/rand_err.c @@ -1,6 +1,6 @@ /* crypto/rand/rand_err.c */ /* ==================================================================== - * Copyright (c) 1999-2006 The OpenSSL Project. All rights reserved. + * Copyright (c) 1999-2011 The OpenSSL Project. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions @@ -71,12 +71,16 @@ static ERR_STRING_DATA RAND_str_functs[]= { {ERR_FUNC(RAND_F_RAND_GET_RAND_METHOD), "RAND_get_rand_method"}, +{ERR_FUNC(RAND_F_RAND_INIT_FIPS), "RAND_init_fips"}, {ERR_FUNC(RAND_F_SSLEAY_RAND_BYTES), "SSLEAY_RAND_BYTES"}, {0,NULL} }; static ERR_STRING_DATA RAND_str_reasons[]= { +{ERR_REASON(RAND_R_ERROR_INITIALISING_DRBG),"error initialising drbg"}, +{ERR_REASON(RAND_R_ERROR_INSTANTIATING_DRBG),"error instantiating drbg"}, +{ERR_REASON(RAND_R_NO_FIPS_RANDOM_METHOD_SET),"no fips random method set"}, {ERR_REASON(RAND_R_PRNG_NOT_SEEDED) ,"PRNG not seeded"}, {0,NULL} }; diff --git a/crypto/openssl/crypto/rand/rand_lib.c b/crypto/openssl/crypto/rand/rand_lib.c index 513e338985..daf1dab973 100644 --- a/crypto/openssl/crypto/rand/rand_lib.c +++ b/crypto/openssl/crypto/rand/rand_lib.c @@ -60,10 +60,16 @@ #include #include "cryptlib.h" #include + #ifndef OPENSSL_NO_ENGINE #include #endif +#ifdef OPENSSL_FIPS +#include +#include +#endif + #ifndef OPENSSL_NO_ENGINE /* non-NULL if default_RAND_meth is ENGINE-provided */ static ENGINE *funct_ref =NULL; @@ -174,3 +180,116 @@ int RAND_status(void) return meth->status(); return 0; } + +#ifdef OPENSSL_FIPS + +/* FIPS DRBG initialisation code. This sets up the DRBG for use by the + * rest of OpenSSL. + */ + +/* Entropy gatherer: use standard OpenSSL PRNG to seed (this will gather + * entropy internally through RAND_poll(). + */ + +static size_t drbg_get_entropy(DRBG_CTX *ctx, unsigned char **pout, + int entropy, size_t min_len, size_t max_len) + { + /* Round up request to multiple of block size */ + min_len = ((min_len + 19) / 20) * 20; + *pout = OPENSSL_malloc(min_len); + if (!*pout) + return 0; + if (RAND_SSLeay()->bytes(*pout, min_len) <= 0) + { + OPENSSL_free(*pout); + *pout = NULL; + return 0; + } + return min_len; + } + +static void drbg_free_entropy(DRBG_CTX *ctx, unsigned char *out, size_t olen) + { + OPENSSL_cleanse(out, olen); + OPENSSL_free(out); + } + +/* Set "additional input" when generating random data. This uses the + * current PID, a time value and a counter. + */ + +static size_t drbg_get_adin(DRBG_CTX *ctx, unsigned char **pout) + { + /* Use of static variables is OK as this happens under a lock */ + static unsigned char buf[16]; + static unsigned long counter; + FIPS_get_timevec(buf, &counter); + *pout = buf; + return sizeof(buf); + } + +/* RAND_add() and RAND_seed() pass through to OpenSSL PRNG so it is + * correctly seeded by RAND_poll(). + */ + +static int drbg_rand_add(DRBG_CTX *ctx, const void *in, int inlen, + double entropy) + { + RAND_SSLeay()->add(in, inlen, entropy); + return 1; + } + +static int drbg_rand_seed(DRBG_CTX *ctx, const void *in, int inlen) + { + RAND_SSLeay()->seed(in, inlen); + return 1; + } + +#ifndef OPENSSL_DRBG_DEFAULT_TYPE +#define OPENSSL_DRBG_DEFAULT_TYPE NID_aes_256_ctr +#endif +#ifndef OPENSSL_DRBG_DEFAULT_FLAGS +#define OPENSSL_DRBG_DEFAULT_FLAGS DRBG_FLAG_CTR_USE_DF +#endif + +static int fips_drbg_type = OPENSSL_DRBG_DEFAULT_TYPE; +static int fips_drbg_flags = OPENSSL_DRBG_DEFAULT_FLAGS; + +void RAND_set_fips_drbg_type(int type, int flags) + { + fips_drbg_type = type; + fips_drbg_flags = flags; + } + +int RAND_init_fips(void) + { + DRBG_CTX *dctx; + size_t plen; + unsigned char pers[32], *p; + dctx = FIPS_get_default_drbg(); + if (FIPS_drbg_init(dctx, fips_drbg_type, fips_drbg_flags) <= 0) + { + RANDerr(RAND_F_RAND_INIT_FIPS, RAND_R_ERROR_INITIALISING_DRBG); + return 0; + } + + FIPS_drbg_set_callbacks(dctx, + drbg_get_entropy, drbg_free_entropy, 20, + drbg_get_entropy, drbg_free_entropy); + FIPS_drbg_set_rand_callbacks(dctx, drbg_get_adin, 0, + drbg_rand_seed, drbg_rand_add); + /* Personalisation string: a string followed by date time vector */ + strcpy((char *)pers, "OpenSSL DRBG2.0"); + plen = drbg_get_adin(dctx, &p); + memcpy(pers + 16, p, plen); + + if (FIPS_drbg_instantiate(dctx, pers, sizeof(pers)) <= 0) + { + RANDerr(RAND_F_RAND_INIT_FIPS, RAND_R_ERROR_INSTANTIATING_DRBG); + return 0; + } + FIPS_rand_set_method(FIPS_drbg_method()); + return 1; + } + +#endif diff --git a/crypto/openssl/crypto/rc2/rc2.h b/crypto/openssl/crypto/rc2/rc2.h index 34c8362317..e542ec94ff 100644 --- a/crypto/openssl/crypto/rc2/rc2.h +++ b/crypto/openssl/crypto/rc2/rc2.h @@ -79,7 +79,9 @@ typedef struct rc2_key_st RC2_INT data[64]; } RC2_KEY; - +#ifdef OPENSSL_FIPS +void private_RC2_set_key(RC2_KEY *key, int len, const unsigned char *data,int bits); +#endif void RC2_set_key(RC2_KEY *key, int len, const unsigned char *data,int bits); void RC2_ecb_encrypt(const unsigned char *in,unsigned char *out,RC2_KEY *key, int enc); diff --git a/crypto/openssl/crypto/rc2/rc2_skey.c b/crypto/openssl/crypto/rc2/rc2_skey.c index 0150b0e035..6668ac011f 100644 --- a/crypto/openssl/crypto/rc2/rc2_skey.c +++ b/crypto/openssl/crypto/rc2/rc2_skey.c @@ -56,6 +56,7 @@ * [including the GNU Public Licence.] */ +#include #include #include "rc2_locl.h" @@ -95,6 +96,13 @@ static const unsigned char key_table[256]={ * the same as specifying 1024 for the 'bits' parameter. Bsafe uses * a version where the bits parameter is the same as len*8 */ void RC2_set_key(RC2_KEY *key, int len, const unsigned char *data, int bits) +#ifdef OPENSSL_FIPS + { + fips_cipher_abort(RC2); + private_RC2_set_key(key, len, data, bits); + } +void private_RC2_set_key(RC2_KEY *key, int len, const unsigned char *data, int bits) +#endif { int i,j; unsigned char *k; diff --git a/crypto/openssl/crypto/rc4/asm/rc4-586.pl b/crypto/openssl/crypto/rc4/asm/rc4-586.pl index 38a44a70ef..5c9ac6ad28 100644 --- a/crypto/openssl/crypto/rc4/asm/rc4-586.pl +++ b/crypto/openssl/crypto/rc4/asm/rc4-586.pl @@ -28,6 +28,34 @@ # # +# May 2011 +# +# Optimize for Core2 and Westmere [and incidentally Opteron]. Current +# performance in cycles per processed byte (less is better) and +# improvement relative to previous version of this module is: +# +# Pentium 10.2 # original numbers +# Pentium III 7.8(*) +# Intel P4 7.5 +# +# Opteron 6.1/+20% # new MMX numbers +# Core2 5.3/+67%(**) +# Westmere 5.1/+94%(**) +# Sandy Bridge 5.0/+8% +# Atom 12.6/+6% +# +# (*) PIII can actually deliver 6.6 cycles per byte with MMX code, +# but this specific code performs poorly on Core2. And vice +# versa, below MMX/SSE code delivering 5.8/7.1 on Core2 performs +# poorly on PIII, at 8.0/14.5:-( As PIII is not a "hot" CPU +# [anymore], I chose to discard PIII-specific code path and opt +# for original IALU-only code, which is why MMX/SSE code path +# is guarded by SSE2 bit (see below), not MMX/SSE. +# (**) Performance vs. block size on Core2 and Westmere had a maximum +# at ... 64 bytes block size. And it was quite a maximum, 40-60% +# in comparison to largest 8KB block size. Above improvement +# coefficients are for the largest block size. + $0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1; push(@INC,"${dir}","${dir}../../perlasm"); require "x86asm.pl"; @@ -62,6 +90,68 @@ sub RC4_loop { &$func ($out,&DWP(0,$dat,$ty,4)); } +if ($alt=0) { + # >20% faster on Atom and Sandy Bridge[!], 8% faster on Opteron, + # but ~40% slower on Core2 and Westmere... Attempt to add movz + # brings down Opteron by 25%, Atom and Sandy Bridge by 15%, yet + # on Core2 with movz it's almost 20% slower than below alternative + # code... Yes, it's a total mess... + my @XX=($xx,$out); + $RC4_loop_mmx = sub { # SSE actually... + my $i=shift; + my $j=$i<=0?0:$i>>1; + my $mm=$i<=0?"mm0":"mm".($i&1); + + &add (&LB($yy),&LB($tx)); + &lea (@XX[1],&DWP(1,@XX[0])); + &pxor ("mm2","mm0") if ($i==0); + &psllq ("mm1",8) if ($i==0); + &and (@XX[1],0xff); + &pxor ("mm0","mm0") if ($i<=0); + &mov ($ty,&DWP(0,$dat,$yy,4)); + &mov (&DWP(0,$dat,$yy,4),$tx); + &pxor ("mm1","mm2") if ($i==0); + &mov (&DWP(0,$dat,$XX[0],4),$ty); + &add (&LB($ty),&LB($tx)); + &movd (@XX[0],"mm7") if ($i==0); + &mov ($tx,&DWP(0,$dat,@XX[1],4)); + &pxor ("mm1","mm1") if ($i==1); + &movq ("mm2",&QWP(0,$inp)) if ($i==1); + &movq (&QWP(-8,(@XX[0],$inp)),"mm1") if ($i==0); + &pinsrw ($mm,&DWP(0,$dat,$ty,4),$j); + + push (@XX,shift(@XX)) if ($i>=0); + } +} else { + # Using pinsrw here improves performane on Intel CPUs by 2-3%, but + # brings down AMD by 7%... + $RC4_loop_mmx = sub { + my $i=shift; + + &add (&LB($yy),&LB($tx)); + &psllq ("mm1",8*(($i-1)&7)) if (abs($i)!=1); + &mov ($ty,&DWP(0,$dat,$yy,4)); + &mov (&DWP(0,$dat,$yy,4),$tx); + &mov (&DWP(0,$dat,$xx,4),$ty); + &inc ($xx); + &add ($ty,$tx); + &movz ($xx,&LB($xx)); # (*) + &movz ($ty,&LB($ty)); # (*) + &pxor ("mm2",$i==1?"mm0":"mm1") if ($i>=0); + &movq ("mm0",&QWP(0,$inp)) if ($i<=0); + &movq (&QWP(-8,($out,$inp)),"mm2") if ($i==0); + &mov ($tx,&DWP(0,$dat,$xx,4)); + &movd ($i>0?"mm1":"mm2",&DWP(0,$dat,$ty,4)); + + # (*) This is the key to Core2 and Westmere performance. + # Whithout movz out-of-order execution logic confuses + # itself and fails to reorder loads and stores. Problem + # appears to be fixed in Sandy Bridge... + } +} + +&external_label("OPENSSL_ia32cap_P"); + # void RC4(RC4_KEY *key,size_t len,const unsigned char *inp,unsigned char *out); &function_begin("RC4"); &mov ($dat,&wparam(0)); # load key schedule pointer @@ -94,11 +184,56 @@ sub RC4_loop { &and ($ty,-4); # how many 4-byte chunks? &jz (&label("loop1")); + &test ($ty,-8); + &mov (&wparam(3),$out); # $out as accumulator in these loops + &jz (&label("go4loop4")); + + &picmeup($out,"OPENSSL_ia32cap_P"); + &bt (&DWP(0,$out),26); # check SSE2 bit [could have been MMX] + &jnc (&label("go4loop4")); + + &mov ($out,&wparam(3)) if (!$alt); + &movd ("mm7",&wparam(3)) if ($alt); + &and ($ty,-8); + &lea ($ty,&DWP(-8,$inp,$ty)); + &mov (&DWP(-4,$dat),$ty); # save input+(len/8)*8-8 + + &$RC4_loop_mmx(-1); + &jmp(&label("loop_mmx_enter")); + + &set_label("loop_mmx",16); + &$RC4_loop_mmx(0); + &set_label("loop_mmx_enter"); + for ($i=1;$i<8;$i++) { &$RC4_loop_mmx($i); } + &mov ($ty,$yy); + &xor ($yy,$yy); # this is second key to Core2 + &mov (&LB($yy),&LB($ty)); # and Westmere performance... + &cmp ($inp,&DWP(-4,$dat)); + &lea ($inp,&DWP(8,$inp)); + &jb (&label("loop_mmx")); + + if ($alt) { + &movd ($out,"mm7"); + &pxor ("mm2","mm0"); + &psllq ("mm1",8); + &pxor ("mm1","mm2"); + &movq (&QWP(-8,$out,$inp),"mm1"); + } else { + &psllq ("mm1",56); + &pxor ("mm2","mm1"); + &movq (&QWP(-8,$out,$inp),"mm2"); + } + &emms (); + + &cmp ($inp,&wparam(1)); # compare to input+len + &je (&label("done")); + &jmp (&label("loop1")); + +&set_label("go4loop4",16); &lea ($ty,&DWP(-4,$inp,$ty)); &mov (&wparam(2),$ty); # save input+(len/4)*4-4 - &mov (&wparam(3),$out); # $out as accumulator in this loop - &set_label("loop4",16); + &set_label("loop4"); for ($i=0;$i<4;$i++) { RC4_loop($i); } &ror ($out,8); &xor ($out,&DWP(0,$inp)); @@ -151,7 +286,7 @@ sub RC4_loop { &set_label("done"); &dec (&LB($xx)); - &mov (&BP(-4,$dat),&LB($yy)); # save key->y + &mov (&DWP(-4,$dat),$yy); # save key->y &mov (&BP(-8,$dat),&LB($xx)); # save key->x &set_label("abort"); &function_end("RC4"); @@ -164,10 +299,8 @@ $idi="ebp"; $ido="ecx"; $idx="edx"; -&external_label("OPENSSL_ia32cap_P"); - # void RC4_set_key(RC4_KEY *key,int len,const unsigned char *data); -&function_begin("RC4_set_key"); +&function_begin("private_RC4_set_key"); &mov ($out,&wparam(0)); # load key &mov ($idi,&wparam(1)); # load len &mov ($inp,&wparam(2)); # load data @@ -245,7 +378,7 @@ $idx="edx"; &xor ("eax","eax"); &mov (&DWP(-8,$out),"eax"); # key->x=0; &mov (&DWP(-4,$out),"eax"); # key->y=0; -&function_end("RC4_set_key"); +&function_end("private_RC4_set_key"); # const char *RC4_options(void); &function_begin_B("RC4_options"); @@ -254,14 +387,21 @@ $idx="edx"; &blindpop("eax"); &lea ("eax",&DWP(&label("opts")."-".&label("pic_point"),"eax")); &picmeup("edx","OPENSSL_ia32cap_P"); - &bt (&DWP(0,"edx"),20); - &jnc (&label("skip")); - &add ("eax",12); - &set_label("skip"); + &mov ("edx",&DWP(0,"edx")); + &bt ("edx",20); + &jc (&label("1xchar")); + &bt ("edx",26); + &jnc (&label("ret")); + &add ("eax",25); + &ret (); +&set_label("1xchar"); + &add ("eax",12); +&set_label("ret"); &ret (); &set_label("opts",64); &asciz ("rc4(4x,int)"); &asciz ("rc4(1x,char)"); +&asciz ("rc4(8x,mmx)"); &asciz ("RC4 for x86, CRYPTOGAMS by "); &align (64); &function_end_B("RC4_options"); diff --git a/crypto/openssl/crypto/rc4/asm/rc4-md5-x86_64.pl b/crypto/openssl/crypto/rc4/asm/rc4-md5-x86_64.pl new file mode 100644 index 0000000000..7f684092d4 --- /dev/null +++ b/crypto/openssl/crypto/rc4/asm/rc4-md5-x86_64.pl @@ -0,0 +1,631 @@ +#!/usr/bin/env perl +# +# ==================================================================== +# Written by Andy Polyakov for the OpenSSL +# project. The module is, however, dual licensed under OpenSSL and +# CRYPTOGAMS licenses depending on where you obtain it. For further +# details see http://www.openssl.org/~appro/cryptogams/. +# ==================================================================== + +# June 2011 +# +# This is RC4+MD5 "stitch" implementation. The idea, as spelled in +# http://download.intel.com/design/intarch/papers/323686.pdf, is that +# since both algorithms exhibit instruction-level parallelism, ILP, +# below theoretical maximum, interleaving them would allow to utilize +# processor resources better and achieve better performance. RC4 +# instruction sequence is virtually identical to rc4-x86_64.pl, which +# is heavily based on submission by Maxim Perminov, Maxim Locktyukhin +# and Jim Guilford of Intel. MD5 is fresh implementation aiming to +# minimize register usage, which was used as "main thread" with RC4 +# weaved into it, one RC4 round per one MD5 round. In addition to the +# stiched subroutine the script can generate standalone replacement +# md5_block_asm_data_order and RC4. Below are performance numbers in +# cycles per processed byte, less is better, for these the standalone +# subroutines, sum of them, and stitched one: +# +# RC4 MD5 RC4+MD5 stitch gain +# Opteron 6.5(*) 5.4 11.9 7.0 +70%(*) +# Core2 6.5 5.8 12.3 7.7 +60% +# Westmere 4.3 5.2 9.5 7.0 +36% +# Sandy Bridge 4.2 5.5 9.7 6.8 +43% +# Atom 9.3 6.5 15.8 11.1 +42% +# +# (*) rc4-x86_64.pl delivers 5.3 on Opteron, so real improvement +# is +53%... + +my ($rc4,$md5)=(1,1); # what to generate? +my $D="#" if (!$md5); # if set to "#", MD5 is stitched into RC4(), + # but its result is discarded. Idea here is + # to be able to use 'openssl speed rc4' for + # benchmarking the stitched subroutine... + +my $flavour = shift; +my $output = shift; +if ($flavour =~ /\./) { $output = $flavour; undef $flavour; } + +my $win64=0; $win64=1 if ($flavour =~ /[nm]asm|mingw64/ || $output =~ /\.asm$/); + +$0 =~ m/(.*[\/\\])[^\/\\]+$/; my $dir=$1; my $xlate; +( $xlate="${dir}x86_64-xlate.pl" and -f $xlate ) or +( $xlate="${dir}../../perlasm/x86_64-xlate.pl" and -f $xlate) or +die "can't locate x86_64-xlate.pl"; + +open STDOUT,"| $^X $xlate $flavour $output"; + +my ($dat,$in0,$out,$ctx,$inp,$len, $func,$nargs); + +if ($rc4 && !$md5) { + ($dat,$len,$in0,$out) = ("%rdi","%rsi","%rdx","%rcx"); + $func="RC4"; $nargs=4; +} elsif ($md5 && !$rc4) { + ($ctx,$inp,$len) = ("%rdi","%rsi","%rdx"); + $func="md5_block_asm_data_order"; $nargs=3; +} else { + ($dat,$in0,$out,$ctx,$inp,$len) = ("%rdi","%rsi","%rdx","%rcx","%r8","%r9"); + $func="rc4_md5_enc"; $nargs=6; + # void rc4_md5_enc( + # RC4_KEY *key, # + # const void *in0, # RC4 input + # void *out, # RC4 output + # MD5_CTX *ctx, # + # const void *inp, # MD5 input + # size_t len); # number of 64-byte blocks +} + +my @K=( 0xd76aa478,0xe8c7b756,0x242070db,0xc1bdceee, + 0xf57c0faf,0x4787c62a,0xa8304613,0xfd469501, + 0x698098d8,0x8b44f7af,0xffff5bb1,0x895cd7be, + 0x6b901122,0xfd987193,0xa679438e,0x49b40821, + + 0xf61e2562,0xc040b340,0x265e5a51,0xe9b6c7aa, + 0xd62f105d,0x02441453,0xd8a1e681,0xe7d3fbc8, + 0x21e1cde6,0xc33707d6,0xf4d50d87,0x455a14ed, + 0xa9e3e905,0xfcefa3f8,0x676f02d9,0x8d2a4c8a, + + 0xfffa3942,0x8771f681,0x6d9d6122,0xfde5380c, + 0xa4beea44,0x4bdecfa9,0xf6bb4b60,0xbebfbc70, + 0x289b7ec6,0xeaa127fa,0xd4ef3085,0x04881d05, + 0xd9d4d039,0xe6db99e5,0x1fa27cf8,0xc4ac5665, + + 0xf4292244,0x432aff97,0xab9423a7,0xfc93a039, + 0x655b59c3,0x8f0ccc92,0xffeff47d,0x85845dd1, + 0x6fa87e4f,0xfe2ce6e0,0xa3014314,0x4e0811a1, + 0xf7537e82,0xbd3af235,0x2ad7d2bb,0xeb86d391 ); + +my @V=("%r8d","%r9d","%r10d","%r11d"); # MD5 registers +my $tmp="%r12d"; + +my @XX=("%rbp","%rsi"); # RC4 registers +my @TX=("%rax","%rbx"); +my $YY="%rcx"; +my $TY="%rdx"; + +my $MOD=32; # 16, 32 or 64 + +$code.=<<___; +.text +.align 16 + +.globl $func +.type $func,\@function,$nargs +$func: + cmp \$0,$len + je .Labort + push %rbx + push %rbp + push %r12 + push %r13 + push %r14 + push %r15 + sub \$40,%rsp +.Lbody: +___ +if ($rc4) { +$code.=<<___; +$D#md5# mov $ctx,%r11 # reassign arguments + mov $len,%r12 + mov $in0,%r13 + mov $out,%r14 +$D#md5# mov $inp,%r15 +___ + $ctx="%r11" if ($md5); # reassign arguments + $len="%r12"; + $in0="%r13"; + $out="%r14"; + $inp="%r15" if ($md5); + $inp=$in0 if (!$md5); +$code.=<<___; + xor $XX[0],$XX[0] + xor $YY,$YY + + lea 8($dat),$dat + mov -8($dat),$XX[0]#b + mov -4($dat),$YY#b + + inc $XX[0]#b + sub $in0,$out + movl ($dat,$XX[0],4),$TX[0]#d +___ +$code.=<<___ if (!$md5); + xor $TX[1],$TX[1] + test \$-128,$len + jz .Loop1 + sub $XX[0],$TX[1] + and \$`$MOD-1`,$TX[1] + jz .Loop${MOD}_is_hot + sub $TX[1],$len +.Loop${MOD}_warmup: + add $TX[0]#b,$YY#b + movl ($dat,$YY,4),$TY#d + movl $TX[0]#d,($dat,$YY,4) + movl $TY#d,($dat,$XX[0],4) + add $TY#b,$TX[0]#b + inc $XX[0]#b + movl ($dat,$TX[0],4),$TY#d + movl ($dat,$XX[0],4),$TX[0]#d + xorb ($in0),$TY#b + movb $TY#b,($out,$in0) + lea 1($in0),$in0 + dec $TX[1] + jnz .Loop${MOD}_warmup + + mov $YY,$TX[1] + xor $YY,$YY + mov $TX[1]#b,$YY#b + +.Loop${MOD}_is_hot: + mov $len,32(%rsp) # save original $len + shr \$6,$len # number of 64-byte blocks +___ + if ($D && !$md5) { # stitch in dummy MD5 + $md5=1; + $ctx="%r11"; + $inp="%r15"; + $code.=<<___; + mov %rsp,$ctx + mov $in0,$inp +___ + } +} +$code.=<<___; +#rc4# add $TX[0]#b,$YY#b +#rc4# lea ($dat,$XX[0],4),$XX[1] + shl \$6,$len + add $inp,$len # pointer to the end of input + mov $len,16(%rsp) + +#md5# mov $ctx,24(%rsp) # save pointer to MD5_CTX +#md5# mov 0*4($ctx),$V[0] # load current hash value from MD5_CTX +#md5# mov 1*4($ctx),$V[1] +#md5# mov 2*4($ctx),$V[2] +#md5# mov 3*4($ctx),$V[3] + jmp .Loop + +.align 16 +.Loop: +#md5# mov $V[0],0*4(%rsp) # put aside current hash value +#md5# mov $V[1],1*4(%rsp) +#md5# mov $V[2],2*4(%rsp) +#md5# mov $V[3],$tmp # forward reference +#md5# mov $V[3],3*4(%rsp) +___ + +sub R0 { + my ($i,$a,$b,$c,$d)=@_; + my @rot0=(7,12,17,22); + my $j=$i%16; + my $k=$i%$MOD; + my $xmm="%xmm".($j&1); + $code.=" movdqu ($in0),%xmm2\n" if ($rc4 && $j==15); + $code.=" add \$$MOD,$XX[0]#b\n" if ($rc4 && $j==15 && $k==$MOD-1); + $code.=" pxor $xmm,$xmm\n" if ($rc4 && $j<=1); + $code.=<<___; +#rc4# movl ($dat,$YY,4),$TY#d +#md5# xor $c,$tmp +#rc4# movl $TX[0]#d,($dat,$YY,4) +#md5# and $b,$tmp +#md5# add 4*`$j`($inp),$a +#rc4# add $TY#b,$TX[0]#b +#rc4# movl `4*(($k+1)%$MOD)`(`$k==$MOD-1?"$dat,$XX[0],4":"$XX[1]"`),$TX[1]#d +#md5# add \$$K[$i],$a +#md5# xor $d,$tmp +#rc4# movz $TX[0]#b,$TX[0]#d +#rc4# movl $TY#d,4*$k($XX[1]) +#md5# add $tmp,$a +#rc4# add $TX[1]#b,$YY#b +#md5# rol \$$rot0[$j%4],$a +#md5# mov `$j==15?"$b":"$c"`,$tmp # forward reference +#rc4# pinsrw \$`($j>>1)&7`,($dat,$TX[0],4),$xmm\n +#md5# add $b,$a +___ + $code.=<<___ if ($rc4 && $j==15 && $k==$MOD-1); + mov $YY,$XX[1] + xor $YY,$YY # keyword to partial register + mov $XX[1]#b,$YY#b + lea ($dat,$XX[0],4),$XX[1] +___ + $code.=<<___ if ($rc4 && $j==15); + psllq \$8,%xmm1 + pxor %xmm0,%xmm2 + pxor %xmm1,%xmm2 +___ +} +sub R1 { + my ($i,$a,$b,$c,$d)=@_; + my @rot1=(5,9,14,20); + my $j=$i%16; + my $k=$i%$MOD; + my $xmm="%xmm".($j&1); + $code.=" movdqu 16($in0),%xmm3\n" if ($rc4 && $j==15); + $code.=" add \$$MOD,$XX[0]#b\n" if ($rc4 && $j==15 && $k==$MOD-1); + $code.=" pxor $xmm,$xmm\n" if ($rc4 && $j<=1); + $code.=<<___; +#rc4# movl ($dat,$YY,4),$TY#d +#md5# xor $b,$tmp +#rc4# movl $TX[0]#d,($dat,$YY,4) +#md5# and $d,$tmp +#md5# add 4*`((1+5*$j)%16)`($inp),$a +#rc4# add $TY#b,$TX[0]#b +#rc4# movl `4*(($k+1)%$MOD)`(`$k==$MOD-1?"$dat,$XX[0],4":"$XX[1]"`),$TX[1]#d +#md5# add \$$K[$i],$a +#md5# xor $c,$tmp +#rc4# movz $TX[0]#b,$TX[0]#d +#rc4# movl $TY#d,4*$k($XX[1]) +#md5# add $tmp,$a +#rc4# add $TX[1]#b,$YY#b +#md5# rol \$$rot1[$j%4],$a +#md5# mov `$j==15?"$c":"$b"`,$tmp # forward reference +#rc4# pinsrw \$`($j>>1)&7`,($dat,$TX[0],4),$xmm\n +#md5# add $b,$a +___ + $code.=<<___ if ($rc4 && $j==15 && $k==$MOD-1); + mov $YY,$XX[1] + xor $YY,$YY # keyword to partial register + mov $XX[1]#b,$YY#b + lea ($dat,$XX[0],4),$XX[1] +___ + $code.=<<___ if ($rc4 && $j==15); + psllq \$8,%xmm1 + pxor %xmm0,%xmm3 + pxor %xmm1,%xmm3 +___ +} +sub R2 { + my ($i,$a,$b,$c,$d)=@_; + my @rot2=(4,11,16,23); + my $j=$i%16; + my $k=$i%$MOD; + my $xmm="%xmm".($j&1); + $code.=" movdqu 32($in0),%xmm4\n" if ($rc4 && $j==15); + $code.=" add \$$MOD,$XX[0]#b\n" if ($rc4 && $j==15 && $k==$MOD-1); + $code.=" pxor $xmm,$xmm\n" if ($rc4 && $j<=1); + $code.=<<___; +#rc4# movl ($dat,$YY,4),$TY#d +#md5# xor $c,$tmp +#rc4# movl $TX[0]#d,($dat,$YY,4) +#md5# xor $b,$tmp +#md5# add 4*`((5+3*$j)%16)`($inp),$a +#rc4# add $TY#b,$TX[0]#b +#rc4# movl `4*(($k+1)%$MOD)`(`$k==$MOD-1?"$dat,$XX[0],4":"$XX[1]"`),$TX[1]#d +#md5# add \$$K[$i],$a +#rc4# movz $TX[0]#b,$TX[0]#d +#md5# add $tmp,$a +#rc4# movl $TY#d,4*$k($XX[1]) +#rc4# add $TX[1]#b,$YY#b +#md5# rol \$$rot2[$j%4],$a +#md5# mov `$j==15?"\\\$-1":"$c"`,$tmp # forward reference +#rc4# pinsrw \$`($j>>1)&7`,($dat,$TX[0],4),$xmm\n +#md5# add $b,$a +___ + $code.=<<___ if ($rc4 && $j==15 && $k==$MOD-1); + mov $YY,$XX[1] + xor $YY,$YY # keyword to partial register + mov $XX[1]#b,$YY#b + lea ($dat,$XX[0],4),$XX[1] +___ + $code.=<<___ if ($rc4 && $j==15); + psllq \$8,%xmm1 + pxor %xmm0,%xmm4 + pxor %xmm1,%xmm4 +___ +} +sub R3 { + my ($i,$a,$b,$c,$d)=@_; + my @rot3=(6,10,15,21); + my $j=$i%16; + my $k=$i%$MOD; + my $xmm="%xmm".($j&1); + $code.=" movdqu 48($in0),%xmm5\n" if ($rc4 && $j==15); + $code.=" add \$$MOD,$XX[0]#b\n" if ($rc4 && $j==15 && $k==$MOD-1); + $code.=" pxor $xmm,$xmm\n" if ($rc4 && $j<=1); + $code.=<<___; +#rc4# movl ($dat,$YY,4),$TY#d +#md5# xor $d,$tmp +#rc4# movl $TX[0]#d,($dat,$YY,4) +#md5# or $b,$tmp +#md5# add 4*`((7*$j)%16)`($inp),$a +#rc4# add $TY#b,$TX[0]#b +#rc4# movl `4*(($k+1)%$MOD)`(`$k==$MOD-1?"$dat,$XX[0],4":"$XX[1]"`),$TX[1]#d +#md5# add \$$K[$i],$a +#rc4# movz $TX[0]#b,$TX[0]#d +#md5# xor $c,$tmp +#rc4# movl $TY#d,4*$k($XX[1]) +#md5# add $tmp,$a +#rc4# add $TX[1]#b,$YY#b +#md5# rol \$$rot3[$j%4],$a +#md5# mov \$-1,$tmp # forward reference +#rc4# pinsrw \$`($j>>1)&7`,($dat,$TX[0],4),$xmm\n +#md5# add $b,$a +___ + $code.=<<___ if ($rc4 && $j==15); + mov $XX[0],$XX[1] + xor $XX[0],$XX[0] # keyword to partial register + mov $XX[1]#b,$XX[0]#b + mov $YY,$XX[1] + xor $YY,$YY # keyword to partial register + mov $XX[1]#b,$YY#b + lea ($dat,$XX[0],4),$XX[1] + psllq \$8,%xmm1 + pxor %xmm0,%xmm5 + pxor %xmm1,%xmm5 +___ +} + +my $i=0; +for(;$i<16;$i++) { R0($i,@V); unshift(@V,pop(@V)); push(@TX,shift(@TX)); } +for(;$i<32;$i++) { R1($i,@V); unshift(@V,pop(@V)); push(@TX,shift(@TX)); } +for(;$i<48;$i++) { R2($i,@V); unshift(@V,pop(@V)); push(@TX,shift(@TX)); } +for(;$i<64;$i++) { R3($i,@V); unshift(@V,pop(@V)); push(@TX,shift(@TX)); } + +$code.=<<___; +#md5# add 0*4(%rsp),$V[0] # accumulate hash value +#md5# add 1*4(%rsp),$V[1] +#md5# add 2*4(%rsp),$V[2] +#md5# add 3*4(%rsp),$V[3] + +#rc4# movdqu %xmm2,($out,$in0) # write RC4 output +#rc4# movdqu %xmm3,16($out,$in0) +#rc4# movdqu %xmm4,32($out,$in0) +#rc4# movdqu %xmm5,48($out,$in0) +#md5# lea 64($inp),$inp +#rc4# lea 64($in0),$in0 + cmp 16(%rsp),$inp # are we done? + jb .Loop + +#md5# mov 24(%rsp),$len # restore pointer to MD5_CTX +#rc4# sub $TX[0]#b,$YY#b # correct $YY +#md5# mov $V[0],0*4($len) # write MD5_CTX +#md5# mov $V[1],1*4($len) +#md5# mov $V[2],2*4($len) +#md5# mov $V[3],3*4($len) +___ +$code.=<<___ if ($rc4 && (!$md5 || $D)); + mov 32(%rsp),$len # restore original $len + and \$63,$len # remaining bytes + jnz .Loop1 + jmp .Ldone + +.align 16 +.Loop1: + add $TX[0]#b,$YY#b + movl ($dat,$YY,4),$TY#d + movl $TX[0]#d,($dat,$YY,4) + movl $TY#d,($dat,$XX[0],4) + add $TY#b,$TX[0]#b + inc $XX[0]#b + movl ($dat,$TX[0],4),$TY#d + movl ($dat,$XX[0],4),$TX[0]#d + xorb ($in0),$TY#b + movb $TY#b,($out,$in0) + lea 1($in0),$in0 + dec $len + jnz .Loop1 + +.Ldone: +___ +$code.=<<___; +#rc4# sub \$1,$XX[0]#b +#rc4# movl $XX[0]#d,-8($dat) +#rc4# movl $YY#d,-4($dat) + + mov 40(%rsp),%r15 + mov 48(%rsp),%r14 + mov 56(%rsp),%r13 + mov 64(%rsp),%r12 + mov 72(%rsp),%rbp + mov 80(%rsp),%rbx + lea 88(%rsp),%rsp +.Lepilogue: +.Labort: + ret +.size $func,.-$func +___ + +if ($rc4 && $D) { # sole purpose of this section is to provide + # option to use the generated module as drop-in + # replacement for rc4-x86_64.pl for debugging + # and testing purposes... +my ($idx,$ido)=("%r8","%r9"); +my ($dat,$len,$inp)=("%rdi","%rsi","%rdx"); + +$code.=<<___; +.globl RC4_set_key +.type RC4_set_key,\@function,3 +.align 16 +RC4_set_key: + lea 8($dat),$dat + lea ($inp,$len),$inp + neg $len + mov $len,%rcx + xor %eax,%eax + xor $ido,$ido + xor %r10,%r10 + xor %r11,%r11 + jmp .Lw1stloop + +.align 16 +.Lw1stloop: + mov %eax,($dat,%rax,4) + add \$1,%al + jnc .Lw1stloop + + xor $ido,$ido + xor $idx,$idx +.align 16 +.Lw2ndloop: + mov ($dat,$ido,4),%r10d + add ($inp,$len,1),$idx#b + add %r10b,$idx#b + add \$1,$len + mov ($dat,$idx,4),%r11d + cmovz %rcx,$len + mov %r10d,($dat,$idx,4) + mov %r11d,($dat,$ido,4) + add \$1,$ido#b + jnc .Lw2ndloop + + xor %eax,%eax + mov %eax,-8($dat) + mov %eax,-4($dat) + ret +.size RC4_set_key,.-RC4_set_key + +.globl RC4_options +.type RC4_options,\@abi-omnipotent +.align 16 +RC4_options: + lea .Lopts(%rip),%rax + ret +.align 64 +.Lopts: +.asciz "rc4(64x,int)" +.align 64 +.size RC4_options,.-RC4_options +___ +} +# EXCEPTION_DISPOSITION handler (EXCEPTION_RECORD *rec,ULONG64 frame, +# CONTEXT *context,DISPATCHER_CONTEXT *disp) +if ($win64) { +my $rec="%rcx"; +my $frame="%rdx"; +my $context="%r8"; +my $disp="%r9"; + +$code.=<<___; +.extern __imp_RtlVirtualUnwind +.type se_handler,\@abi-omnipotent +.align 16 +se_handler: + push %rsi + push %rdi + push %rbx + push %rbp + push %r12 + push %r13 + push %r14 + push %r15 + pushfq + sub \$64,%rsp + + mov 120($context),%rax # pull context->Rax + mov 248($context),%rbx # pull context->Rip + + lea .Lbody(%rip),%r10 + cmp %r10,%rbx # context->Rip<.Lbody + jb .Lin_prologue + + mov 152($context),%rax # pull context->Rsp + + lea .Lepilogue(%rip),%r10 + cmp %r10,%rbx # context->Rip>=.Lepilogue + jae .Lin_prologue + + mov 40(%rax),%r15 + mov 48(%rax),%r14 + mov 56(%rax),%r13 + mov 64(%rax),%r12 + mov 72(%rax),%rbp + mov 80(%rax),%rbx + lea 88(%rax),%rax + + mov %rbx,144($context) # restore context->Rbx + mov %rbp,160($context) # restore context->Rbp + mov %r12,216($context) # restore context->R12 + mov %r13,224($context) # restore context->R12 + mov %r14,232($context) # restore context->R14 + mov %r15,240($context) # restore context->R15 + +.Lin_prologue: + mov 8(%rax),%rdi + mov 16(%rax),%rsi + mov %rax,152($context) # restore context->Rsp + mov %rsi,168($context) # restore context->Rsi + mov %rdi,176($context) # restore context->Rdi + + mov 40($disp),%rdi # disp->ContextRecord + mov $context,%rsi # context + mov \$154,%ecx # sizeof(CONTEXT) + .long 0xa548f3fc # cld; rep movsq + + mov $disp,%rsi + xor %rcx,%rcx # arg1, UNW_FLAG_NHANDLER + mov 8(%rsi),%rdx # arg2, disp->ImageBase + mov 0(%rsi),%r8 # arg3, disp->ControlPc + mov 16(%rsi),%r9 # arg4, disp->FunctionEntry + mov 40(%rsi),%r10 # disp->ContextRecord + lea 56(%rsi),%r11 # &disp->HandlerData + lea 24(%rsi),%r12 # &disp->EstablisherFrame + mov %r10,32(%rsp) # arg5 + mov %r11,40(%rsp) # arg6 + mov %r12,48(%rsp) # arg7 + mov %rcx,56(%rsp) # arg8, (NULL) + call *__imp_RtlVirtualUnwind(%rip) + + mov \$1,%eax # ExceptionContinueSearch + add \$64,%rsp + popfq + pop %r15 + pop %r14 + pop %r13 + pop %r12 + pop %rbp + pop %rbx + pop %rdi + pop %rsi + ret +.size se_handler,.-se_handler + +.section .pdata +.align 4 + .rva .LSEH_begin_$func + .rva .LSEH_end_$func + .rva .LSEH_info_$func + +.section .xdata +.align 8 +.LSEH_info_$func: + .byte 9,0,0,0 + .rva se_handler +___ +} + +sub reg_part { +my ($reg,$conv)=@_; + if ($reg =~ /%r[0-9]+/) { $reg .= $conv; } + elsif ($conv eq "b") { $reg =~ s/%[er]([^x]+)x?/%$1l/; } + elsif ($conv eq "w") { $reg =~ s/%[er](.+)/%$1/; } + elsif ($conv eq "d") { $reg =~ s/%[er](.+)/%e$1/; } + return $reg; +} + +$code =~ s/(%[a-z0-9]+)#([bwd])/reg_part($1,$2)/gem; +$code =~ s/\`([^\`]*)\`/eval $1/gem; +$code =~ s/pinsrw\s+\$0,/movd /gm; + +$code =~ s/#md5#//gm if ($md5); +$code =~ s/#rc4#//gm if ($rc4); + +print $code; + +close STDOUT; diff --git a/crypto/openssl/crypto/rc4/asm/rc4-x86_64.pl b/crypto/openssl/crypto/rc4/asm/rc4-x86_64.pl index 677be5fe25..d6eac205e9 100755 --- a/crypto/openssl/crypto/rc4/asm/rc4-x86_64.pl +++ b/crypto/openssl/crypto/rc4/asm/rc4-x86_64.pl @@ -7,6 +7,8 @@ # details see http://www.openssl.org/~appro/cryptogams/. # ==================================================================== # +# July 2004 +# # 2.22x RC4 tune-up:-) It should be noted though that my hand [as in # "hand-coded assembler"] doesn't stand for the whole improvement # coefficient. It turned out that eliminating RC4_CHAR from config @@ -19,6 +21,8 @@ # to operate on partial registers, it turned out to be the best bet. # At least for AMD... How IA32E would perform remains to be seen... +# November 2004 +# # As was shown by Marc Bevand reordering of couple of load operations # results in even higher performance gain of 3.3x:-) At least on # Opteron... For reference, 1x in this case is RC4_CHAR C-code @@ -26,6 +30,8 @@ # Latter means that if you want to *estimate* what to expect from # *your* Opteron, then multiply 54 by 3.3 and clock frequency in GHz. +# November 2004 +# # Intel P4 EM64T core was found to run the AMD64 code really slow... # The only way to achieve comparable performance on P4 was to keep # RC4_CHAR. Kind of ironic, huh? As it's apparently impossible to @@ -33,10 +39,14 @@ # on either AMD and Intel platforms, I implement both cases. See # rc4_skey.c for further details... +# April 2005 +# # P4 EM64T core appears to be "allergic" to 64-bit inc/dec. Replacing # those with add/sub results in 50% performance improvement of folded # loop... +# May 2005 +# # As was shown by Zou Nanhai loop unrolling can improve Intel EM64T # performance by >30% [unlike P4 32-bit case that is]. But this is # provided that loads are reordered even more aggressively! Both code @@ -50,6 +60,8 @@ # is not implemented, then this final RC4_CHAR code-path should be # preferred, as it provides better *all-round* performance]. +# March 2007 +# # Intel Core2 was observed to perform poorly on both code paths:-( It # apparently suffers from some kind of partial register stall, which # occurs in 64-bit mode only [as virtually identical 32-bit loop was @@ -58,6 +70,37 @@ # fit for Core2 and therefore the code was modified to skip cloop8 on # this CPU. +# May 2010 +# +# Intel Westmere was observed to perform suboptimally. Adding yet +# another movzb to cloop1 improved performance by almost 50%! Core2 +# performance is improved too, but nominally... + +# May 2011 +# +# The only code path that was not modified is P4-specific one. Non-P4 +# Intel code path optimization is heavily based on submission by Maxim +# Perminov, Maxim Locktyukhin and Jim Guilford of Intel. I've used +# some of the ideas even in attempt to optmize the original RC4_INT +# code path... Current performance in cycles per processed byte (less +# is better) and improvement coefficients relative to previous +# version of this module are: +# +# Opteron 5.3/+0%(*) +# P4 6.5 +# Core2 6.2/+15%(**) +# Westmere 4.2/+60% +# Sandy Bridge 4.2/+120% +# Atom 9.3/+80% +# +# (*) But corresponding loop has less instructions, which should have +# positive effect on upcoming Bulldozer, which has one less ALU. +# For reference, Intel code runs at 6.8 cpb rate on Opteron. +# (**) Note that Core2 result is ~15% lower than corresponding result +# for 32-bit code, meaning that it's possible to improve it, +# but more than likely at the cost of the others (see rc4-586.pl +# to get the idea)... + $flavour = shift; $output = shift; if ($flavour =~ /\./) { $output = $flavour; undef $flavour; } @@ -76,13 +119,10 @@ $len="%rsi"; # arg2 $inp="%rdx"; # arg3 $out="%rcx"; # arg4 -@XX=("%r8","%r10"); -@TX=("%r9","%r11"); -$YY="%r12"; -$TY="%r13"; - +{ $code=<<___; .text +.extern OPENSSL_ia32cap_P .globl RC4 .type RC4,\@function,4 @@ -95,48 +135,173 @@ RC4: or $len,$len push %r12 push %r13 .Lprologue: + mov $len,%r11 + mov $inp,%r12 + mov $out,%r13 +___ +my $len="%r11"; # reassign input arguments +my $inp="%r12"; +my $out="%r13"; - add \$8,$dat - movl -8($dat),$XX[0]#d - movl -4($dat),$YY#d +my @XX=("%r10","%rsi"); +my @TX=("%rax","%rbx"); +my $YY="%rcx"; +my $TY="%rdx"; + +$code.=<<___; + xor $XX[0],$XX[0] + xor $YY,$YY + + lea 8($dat),$dat + mov -8($dat),$XX[0]#b + mov -4($dat),$YY#b cmpl \$-1,256($dat) je .LRC4_CHAR + mov OPENSSL_ia32cap_P(%rip),%r8d + xor $TX[1],$TX[1] inc $XX[0]#b + sub $XX[0],$TX[1] + sub $inp,$out movl ($dat,$XX[0],4),$TX[0]#d - test \$-8,$len + test \$-16,$len jz .Lloop1 - jmp .Lloop8 + bt \$30,%r8d # Intel CPU? + jc .Lintel + and \$7,$TX[1] + lea 1($XX[0]),$XX[1] + jz .Loop8 + sub $TX[1],$len +.Loop8_warmup: + add $TX[0]#b,$YY#b + movl ($dat,$YY,4),$TY#d + movl $TX[0]#d,($dat,$YY,4) + movl $TY#d,($dat,$XX[0],4) + add $TY#b,$TX[0]#b + inc $XX[0]#b + movl ($dat,$TX[0],4),$TY#d + movl ($dat,$XX[0],4),$TX[0]#d + xorb ($inp),$TY#b + movb $TY#b,($out,$inp) + lea 1($inp),$inp + dec $TX[1] + jnz .Loop8_warmup + + lea 1($XX[0]),$XX[1] + jmp .Loop8 .align 16 -.Lloop8: +.Loop8: ___ for ($i=0;$i<8;$i++) { +$code.=<<___ if ($i==7); + add \$8,$XX[1]#b +___ $code.=<<___; add $TX[0]#b,$YY#b - mov $XX[0],$XX[1] movl ($dat,$YY,4),$TY#d - ror \$8,%rax # ror is redundant when $i=0 - inc $XX[1]#b - movl ($dat,$XX[1],4),$TX[1]#d - cmp $XX[1],$YY movl $TX[0]#d,($dat,$YY,4) - cmove $TX[0],$TX[1] - movl $TY#d,($dat,$XX[0],4) + movl `4*($i==7?-1:$i)`($dat,$XX[1],4),$TX[1]#d + ror \$8,%r8 # ror is redundant when $i=0 + movl $TY#d,4*$i($dat,$XX[0],4) add $TX[0]#b,$TY#b - movb ($dat,$TY,4),%al + movb ($dat,$TY,4),%r8b ___ -push(@TX,shift(@TX)); push(@XX,shift(@XX)); # "rotate" registers +push(@TX,shift(@TX)); #push(@XX,shift(@XX)); # "rotate" registers } $code.=<<___; - ror \$8,%rax + add \$8,$XX[0]#b + ror \$8,%r8 sub \$8,$len - xor ($inp),%rax - add \$8,$inp - mov %rax,($out) - add \$8,$out + xor ($inp),%r8 + mov %r8,($out,$inp) + lea 8($inp),$inp test \$-8,$len - jnz .Lloop8 + jnz .Loop8 + cmp \$0,$len + jne .Lloop1 + jmp .Lexit + +.align 16 +.Lintel: + test \$-32,$len + jz .Lloop1 + and \$15,$TX[1] + jz .Loop16_is_hot + sub $TX[1],$len +.Loop16_warmup: + add $TX[0]#b,$YY#b + movl ($dat,$YY,4),$TY#d + movl $TX[0]#d,($dat,$YY,4) + movl $TY#d,($dat,$XX[0],4) + add $TY#b,$TX[0]#b + inc $XX[0]#b + movl ($dat,$TX[0],4),$TY#d + movl ($dat,$XX[0],4),$TX[0]#d + xorb ($inp),$TY#b + movb $TY#b,($out,$inp) + lea 1($inp),$inp + dec $TX[1] + jnz .Loop16_warmup + + mov $YY,$TX[1] + xor $YY,$YY + mov $TX[1]#b,$YY#b + +.Loop16_is_hot: + lea ($dat,$XX[0],4),$XX[1] +___ +sub RC4_loop { + my $i=shift; + my $j=$i<0?0:$i; + my $xmm="%xmm".($j&1); + + $code.=" add \$16,$XX[0]#b\n" if ($i==15); + $code.=" movdqu ($inp),%xmm2\n" if ($i==15); + $code.=" add $TX[0]#b,$YY#b\n" if ($i<=0); + $code.=" movl ($dat,$YY,4),$TY#d\n"; + $code.=" pxor %xmm0,%xmm2\n" if ($i==0); + $code.=" psllq \$8,%xmm1\n" if ($i==0); + $code.=" pxor $xmm,$xmm\n" if ($i<=1); + $code.=" movl $TX[0]#d,($dat,$YY,4)\n"; + $code.=" add $TY#b,$TX[0]#b\n"; + $code.=" movl `4*($j+1)`($XX[1]),$TX[1]#d\n" if ($i<15); + $code.=" movz $TX[0]#b,$TX[0]#d\n"; + $code.=" movl $TY#d,4*$j($XX[1])\n"; + $code.=" pxor %xmm1,%xmm2\n" if ($i==0); + $code.=" lea ($dat,$XX[0],4),$XX[1]\n" if ($i==15); + $code.=" add $TX[1]#b,$YY#b\n" if ($i<15); + $code.=" pinsrw \$`($j>>1)&7`,($dat,$TX[0],4),$xmm\n"; + $code.=" movdqu %xmm2,($out,$inp)\n" if ($i==0); + $code.=" lea 16($inp),$inp\n" if ($i==0); + $code.=" movl ($XX[1]),$TX[1]#d\n" if ($i==15); +} + RC4_loop(-1); +$code.=<<___; + jmp .Loop16_enter +.align 16 +.Loop16: +___ + +for ($i=0;$i<16;$i++) { + $code.=".Loop16_enter:\n" if ($i==1); + RC4_loop($i); + push(@TX,shift(@TX)); # "rotate" registers +} +$code.=<<___; + mov $YY,$TX[1] + xor $YY,$YY # keyword to partial register + sub \$16,$len + mov $TX[1]#b,$YY#b + test \$-16,$len + jnz .Loop16 + + psllq \$8,%xmm1 + pxor %xmm0,%xmm2 + pxor %xmm1,%xmm2 + movdqu %xmm2,($out,$inp) + lea 16($inp),$inp + cmp \$0,$len jne .Lloop1 jmp .Lexit @@ -152,9 +317,8 @@ $code.=<<___; movl ($dat,$TX[0],4),$TY#d movl ($dat,$XX[0],4),$TX[0]#d xorb ($inp),$TY#b - inc $inp - movb $TY#b,($out) - inc $out + movb $TY#b,($out,$inp) + lea 1($inp),$inp dec $len jnz .Lloop1 jmp .Lexit @@ -165,13 +329,11 @@ $code.=<<___; movzb ($dat,$XX[0]),$TX[0]#d test \$-8,$len jz .Lcloop1 - cmpl \$0,260($dat) - jnz .Lcloop1 jmp .Lcloop8 .align 16 .Lcloop8: - mov ($inp),%eax - mov 4($inp),%ebx + mov ($inp),%r8d + mov 4($inp),%r9d ___ # unroll 2x4-wise, because 64-bit rotates kill Intel P4... for ($i=0;$i<4;$i++) { @@ -188,8 +350,8 @@ $code.=<<___; mov $TX[0],$TX[1] .Lcmov$i: add $TX[0]#b,$TY#b - xor ($dat,$TY),%al - ror \$8,%eax + xor ($dat,$TY),%r8b + ror \$8,%r8d ___ push(@TX,shift(@TX)); push(@XX,shift(@XX)); # "rotate" registers } @@ -207,16 +369,16 @@ $code.=<<___; mov $TX[0],$TX[1] .Lcmov$i: add $TX[0]#b,$TY#b - xor ($dat,$TY),%bl - ror \$8,%ebx + xor ($dat,$TY),%r9b + ror \$8,%r9d ___ push(@TX,shift(@TX)); push(@XX,shift(@XX)); # "rotate" registers } $code.=<<___; lea -8($len),$len - mov %eax,($out) + mov %r8d,($out) lea 8($inp),$inp - mov %ebx,4($out) + mov %r9d,4($out) lea 8($out),$out test \$-8,$len @@ -229,6 +391,7 @@ $code.=<<___; .align 16 .Lcloop1: add $TX[0]#b,$YY#b + movzb $YY#b,$YY#d movzb ($dat,$YY),$TY#d movb $TX[0]#b,($dat,$YY) movb $TY#b,($dat,$XX[0]) @@ -260,16 +423,16 @@ $code.=<<___; ret .size RC4,.-RC4 ___ +} $idx="%r8"; $ido="%r9"; $code.=<<___; -.extern OPENSSL_ia32cap_P -.globl RC4_set_key -.type RC4_set_key,\@function,3 +.globl private_RC4_set_key +.type private_RC4_set_key,\@function,3 .align 16 -RC4_set_key: +private_RC4_set_key: lea 8($dat),$dat lea ($inp,$len),$inp neg $len @@ -280,12 +443,9 @@ RC4_set_key: xor %r11,%r11 mov OPENSSL_ia32cap_P(%rip),$idx#d - bt \$20,$idx#d - jnc .Lw1stloop - bt \$30,$idx#d - setc $ido#b - mov $ido#d,260($dat) - jmp .Lc1stloop + bt \$20,$idx#d # RC4_CHAR? + jc .Lc1stloop + jmp .Lw1stloop .align 16 .Lw1stloop: @@ -339,7 +499,7 @@ RC4_set_key: mov %eax,-8($dat) mov %eax,-4($dat) ret -.size RC4_set_key,.-RC4_set_key +.size private_RC4_set_key,.-private_RC4_set_key .globl RC4_options .type RC4_options,\@abi-omnipotent @@ -348,18 +508,20 @@ RC4_options: lea .Lopts(%rip),%rax mov OPENSSL_ia32cap_P(%rip),%edx bt \$20,%edx - jnc .Ldone - add \$12,%rax + jc .L8xchar bt \$30,%edx jnc .Ldone - add \$13,%rax + add \$25,%rax + ret +.L8xchar: + add \$12,%rax .Ldone: ret .align 64 .Lopts: .asciz "rc4(8x,int)" .asciz "rc4(8x,char)" -.asciz "rc4(1x,char)" +.asciz "rc4(16x,int)" .asciz "RC4 for x86_64, CRYPTOGAMS by " .align 64 .size RC4_options,.-RC4_options @@ -482,22 +644,32 @@ key_se_handler: .rva .LSEH_end_RC4 .rva .LSEH_info_RC4 - .rva .LSEH_begin_RC4_set_key - .rva .LSEH_end_RC4_set_key - .rva .LSEH_info_RC4_set_key + .rva .LSEH_begin_private_RC4_set_key + .rva .LSEH_end_private_RC4_set_key + .rva .LSEH_info_private_RC4_set_key .section .xdata .align 8 .LSEH_info_RC4: .byte 9,0,0,0 .rva stream_se_handler -.LSEH_info_RC4_set_key: +.LSEH_info_private_RC4_set_key: .byte 9,0,0,0 .rva key_se_handler ___ } -$code =~ s/#([bwd])/$1/gm; +sub reg_part { +my ($reg,$conv)=@_; + if ($reg =~ /%r[0-9]+/) { $reg .= $conv; } + elsif ($conv eq "b") { $reg =~ s/%[er]([^x]+)x?/%$1l/; } + elsif ($conv eq "w") { $reg =~ s/%[er](.+)/%$1/; } + elsif ($conv eq "d") { $reg =~ s/%[er](.+)/%e$1/; } + return $reg; +} + +$code =~ s/(%[a-z0-9]+)#([bwd])/reg_part($1,$2)/gem; +$code =~ s/\`([^\`]*)\`/eval $1/gem; print $code; diff --git a/crypto/openssl/crypto/rc4/rc4.h b/crypto/openssl/crypto/rc4/rc4.h index 29d1acccf5..88ceb46bc5 100644 --- a/crypto/openssl/crypto/rc4/rc4.h +++ b/crypto/openssl/crypto/rc4/rc4.h @@ -79,6 +79,7 @@ typedef struct rc4_key_st const char *RC4_options(void); void RC4_set_key(RC4_KEY *key, int len, const unsigned char *data); +void private_RC4_set_key(RC4_KEY *key, int len, const unsigned char *data); void RC4(RC4_KEY *key, size_t len, const unsigned char *indata, unsigned char *outdata); diff --git a/crypto/openssl/crypto/rc4/rc4_skey.c b/crypto/openssl/crypto/rc4/rc4_skey.c index b22c40b0bd..fda27636e7 100644 --- a/crypto/openssl/crypto/rc4/rc4_skey.c +++ b/crypto/openssl/crypto/rc4/rc4_skey.c @@ -85,7 +85,7 @@ const char *RC4_options(void) * Date: Wed, 14 Sep 1994 06:35:31 GMT */ -void RC4_set_key(RC4_KEY *key, int len, const unsigned char *data) +void private_RC4_set_key(RC4_KEY *key, int len, const unsigned char *data) { register RC4_INT tmp; register int id1,id2; @@ -104,40 +104,6 @@ void RC4_set_key(RC4_KEY *key, int len, const unsigned char *data) d[(n)]=d[id2]; \ d[id2]=tmp; } -#if defined(OPENSSL_CPUID_OBJ) && !defined(OPENSSL_NO_ASM) -# if defined(__i386) || defined(__i386__) || defined(_M_IX86) || \ - defined(__INTEL__) || \ - defined(__x86_64) || defined(__x86_64__) || defined(_M_AMD64) - if (sizeof(RC4_INT) > 1) { - /* - * Unlike all other x86 [and x86_64] implementations, - * Intel P4 core [including EM64T] was found to perform - * poorly with wider RC4_INT. Performance improvement - * for IA-32 hand-coded assembler turned out to be 2.8x - * if re-coded for RC4_CHAR! It's however inappropriate - * to just switch to RC4_CHAR for x86[_64], as non-P4 - * implementations suffer from significant performance - * losses then, e.g. PIII exhibits >2x deterioration, - * and so does Opteron. In order to assure optimal - * all-round performance, let us [try to] detect P4 at - * run-time by checking upon HTT bit in CPU capability - * vector and set up compressed key schedule, which is - * recognized by correspondingly updated assembler - * module... - * - */ - if (OPENSSL_ia32cap_P & (1<<28)) { - unsigned char *cp=(unsigned char *)d; - - for (i=0;i<256;i++) cp[i]=i; - for (i=0;i<256;i++) SK_LOOP(cp,i); - /* mark schedule as compressed! */ - d[256/sizeof(RC4_INT)]=-1; - return; - } - } -# endif -#endif for (i=0; i < 256; i++) d[i]=i; for (i=0; i < 256; i+=4) { diff --git a/crypto/openssl/crypto/aes/aes_misc.c b/crypto/openssl/crypto/rc4/rc4_utl.c similarity index 87% copy from crypto/openssl/crypto/aes/aes_misc.c copy to crypto/openssl/crypto/rc4/rc4_utl.c index 4fead1b4c7..ab3f02fe6a 100644 --- a/crypto/openssl/crypto/aes/aes_misc.c +++ b/crypto/openssl/crypto/rc4/rc4_utl.c @@ -1,6 +1,6 @@ -/* crypto/aes/aes_misc.c -*- mode:C; c-file-style: "eay" -*- */ +/* crypto/rc4/rc4_utl.c -*- mode:C; c-file-style: "eay" -*- */ /* ==================================================================== - * Copyright (c) 1998-2002 The OpenSSL Project. All rights reserved. + * Copyright (c) 2011 The OpenSSL Project. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions @@ -50,15 +50,13 @@ */ #include -#include -#include "aes_locl.h" +#include +#include -const char AES_version[]="AES" OPENSSL_VERSION_PTEXT; - -const char *AES_options(void) { -#ifdef FULL_UNROLL - return "aes(full)"; -#else - return "aes(partial)"; +void RC4_set_key(RC4_KEY *key, int len, const unsigned char *data) + { +#ifdef OPENSSL_FIPS + fips_cipher_abort(RC4); #endif -} + private_RC4_set_key(key, len, data); + } diff --git a/crypto/openssl/crypto/ripemd/ripemd.h b/crypto/openssl/crypto/ripemd/ripemd.h index 5942eb6180..189bd8c90e 100644 --- a/crypto/openssl/crypto/ripemd/ripemd.h +++ b/crypto/openssl/crypto/ripemd/ripemd.h @@ -91,6 +91,9 @@ typedef struct RIPEMD160state_st unsigned int num; } RIPEMD160_CTX; +#ifdef OPENSSL_FIPS +int private_RIPEMD160_Init(RIPEMD160_CTX *c); +#endif int RIPEMD160_Init(RIPEMD160_CTX *c); int RIPEMD160_Update(RIPEMD160_CTX *c, const void *data, size_t len); int RIPEMD160_Final(unsigned char *md, RIPEMD160_CTX *c); diff --git a/crypto/openssl/crypto/ripemd/rmd_dgst.c b/crypto/openssl/crypto/ripemd/rmd_dgst.c index 59b017f8c0..63f0d983f7 100644 --- a/crypto/openssl/crypto/ripemd/rmd_dgst.c +++ b/crypto/openssl/crypto/ripemd/rmd_dgst.c @@ -59,6 +59,7 @@ #include #include "rmd_locl.h" #include +#include const char RMD160_version[]="RIPE-MD160" OPENSSL_VERSION_PTEXT; @@ -69,7 +70,7 @@ const char RMD160_version[]="RIPE-MD160" OPENSSL_VERSION_PTEXT; void ripemd160_block(RIPEMD160_CTX *c, unsigned long *p,size_t num); # endif -int RIPEMD160_Init(RIPEMD160_CTX *c) +fips_md_init(RIPEMD160) { memset (c,0,sizeof(*c)); c->A=RIPEMD160_A; diff --git a/crypto/openssl/crypto/rsa/rsa.h b/crypto/openssl/crypto/rsa/rsa.h index cf74343657..4814a2fc15 100644 --- a/crypto/openssl/crypto/rsa/rsa.h +++ b/crypto/openssl/crypto/rsa/rsa.h @@ -222,12 +222,22 @@ struct rsa_st EVP_PKEY_CTX_ctrl(ctx, EVP_PKEY_RSA, -1, EVP_PKEY_CTRL_RSA_PADDING, \ pad, NULL) +#define EVP_PKEY_CTX_get_rsa_padding(ctx, ppad) \ + EVP_PKEY_CTX_ctrl(ctx, EVP_PKEY_RSA, -1, \ + EVP_PKEY_CTRL_GET_RSA_PADDING, 0, ppad) + #define EVP_PKEY_CTX_set_rsa_pss_saltlen(ctx, len) \ EVP_PKEY_CTX_ctrl(ctx, EVP_PKEY_RSA, \ (EVP_PKEY_OP_SIGN|EVP_PKEY_OP_VERIFY), \ EVP_PKEY_CTRL_RSA_PSS_SALTLEN, \ len, NULL) +#define EVP_PKEY_CTX_get_rsa_pss_saltlen(ctx, plen) \ + EVP_PKEY_CTX_ctrl(ctx, EVP_PKEY_RSA, \ + (EVP_PKEY_OP_SIGN|EVP_PKEY_OP_VERIFY), \ + EVP_PKEY_CTRL_GET_RSA_PSS_SALTLEN, \ + 0, plen) + #define EVP_PKEY_CTX_set_rsa_keygen_bits(ctx, bits) \ EVP_PKEY_CTX_ctrl(ctx, EVP_PKEY_RSA, EVP_PKEY_OP_KEYGEN, \ EVP_PKEY_CTRL_RSA_KEYGEN_BITS, bits, NULL) @@ -236,11 +246,24 @@ struct rsa_st EVP_PKEY_CTX_ctrl(ctx, EVP_PKEY_RSA, EVP_PKEY_OP_KEYGEN, \ EVP_PKEY_CTRL_RSA_KEYGEN_PUBEXP, 0, pubexp) +#define EVP_PKEY_CTX_set_rsa_mgf1_md(ctx, md) \ + EVP_PKEY_CTX_ctrl(ctx, EVP_PKEY_RSA, EVP_PKEY_OP_TYPE_SIG, \ + EVP_PKEY_CTRL_RSA_MGF1_MD, 0, (void *)md) + +#define EVP_PKEY_CTX_get_rsa_mgf1_md(ctx, pmd) \ + EVP_PKEY_CTX_ctrl(ctx, EVP_PKEY_RSA, EVP_PKEY_OP_TYPE_SIG, \ + EVP_PKEY_CTRL_GET_RSA_MGF1_MD, 0, (void *)pmd) + #define EVP_PKEY_CTRL_RSA_PADDING (EVP_PKEY_ALG_CTRL + 1) #define EVP_PKEY_CTRL_RSA_PSS_SALTLEN (EVP_PKEY_ALG_CTRL + 2) #define EVP_PKEY_CTRL_RSA_KEYGEN_BITS (EVP_PKEY_ALG_CTRL + 3) #define EVP_PKEY_CTRL_RSA_KEYGEN_PUBEXP (EVP_PKEY_ALG_CTRL + 4) +#define EVP_PKEY_CTRL_RSA_MGF1_MD (EVP_PKEY_ALG_CTRL + 5) + +#define EVP_PKEY_CTRL_GET_RSA_PADDING (EVP_PKEY_ALG_CTRL + 6) +#define EVP_PKEY_CTRL_GET_RSA_PSS_SALTLEN (EVP_PKEY_ALG_CTRL + 7) +#define EVP_PKEY_CTRL_GET_RSA_MGF1_MD (EVP_PKEY_ALG_CTRL + 8) #define RSA_PKCS1_PADDING 1 #define RSA_SSLV23_PADDING 2 @@ -300,6 +323,16 @@ const RSA_METHOD *RSA_null_method(void); DECLARE_ASN1_ENCODE_FUNCTIONS_const(RSA, RSAPublicKey) DECLARE_ASN1_ENCODE_FUNCTIONS_const(RSA, RSAPrivateKey) +typedef struct rsa_pss_params_st + { + X509_ALGOR *hashAlgorithm; + X509_ALGOR *maskGenAlgorithm; + ASN1_INTEGER *saltLength; + ASN1_INTEGER *trailerField; + } RSA_PSS_PARAMS; + +DECLARE_ASN1_FUNCTIONS(RSA_PSS_PARAMS) + #ifndef OPENSSL_NO_FP_API int RSA_print_fp(FILE *fp, const RSA *r,int offset); #endif @@ -380,6 +413,14 @@ int RSA_padding_add_PKCS1_PSS(RSA *rsa, unsigned char *EM, const unsigned char *mHash, const EVP_MD *Hash, int sLen); +int RSA_verify_PKCS1_PSS_mgf1(RSA *rsa, const unsigned char *mHash, + const EVP_MD *Hash, const EVP_MD *mgf1Hash, + const unsigned char *EM, int sLen); + +int RSA_padding_add_PKCS1_PSS_mgf1(RSA *rsa, unsigned char *EM, + const unsigned char *mHash, + const EVP_MD *Hash, const EVP_MD *mgf1Hash, int sLen); + int RSA_get_ex_new_index(long argl, void *argp, CRYPTO_EX_new *new_func, CRYPTO_EX_dup *dup_func, CRYPTO_EX_free *free_func); int RSA_set_ex_data(RSA *r,int idx,void *arg); @@ -388,6 +429,25 @@ void *RSA_get_ex_data(const RSA *r, int idx); RSA *RSAPublicKey_dup(RSA *rsa); RSA *RSAPrivateKey_dup(RSA *rsa); +/* If this flag is set the RSA method is FIPS compliant and can be used + * in FIPS mode. This is set in the validated module method. If an + * application sets this flag in its own methods it is its responsibility + * to ensure the result is compliant. + */ + +#define RSA_FLAG_FIPS_METHOD 0x0400 + +/* If this flag is set the operations normally disabled in FIPS mode are + * permitted it is then the applications responsibility to ensure that the + * usage is compliant. + */ + +#define RSA_FLAG_NON_FIPS_ALLOW 0x0400 +/* Application has decided PRNG is good enough to generate a key: don't + * check. + */ +#define RSA_FLAG_CHECKED 0x0800 + /* BEGIN ERROR CODES */ /* The following lines are auto generated by the script mkerr.pl. Any changes * made after this point may be overwritten when the script is next run. @@ -405,6 +465,7 @@ void ERR_load_RSA_strings(void); #define RSA_F_PKEY_RSA_CTRL 143 #define RSA_F_PKEY_RSA_CTRL_STR 144 #define RSA_F_PKEY_RSA_SIGN 142 +#define RSA_F_PKEY_RSA_VERIFY 154 #define RSA_F_PKEY_RSA_VERIFYRECOVER 141 #define RSA_F_RSA_BUILTIN_KEYGEN 129 #define RSA_F_RSA_CHECK_KEY 123 @@ -413,6 +474,8 @@ void ERR_load_RSA_strings(void); #define RSA_F_RSA_EAY_PUBLIC_DECRYPT 103 #define RSA_F_RSA_EAY_PUBLIC_ENCRYPT 104 #define RSA_F_RSA_GENERATE_KEY 105 +#define RSA_F_RSA_GENERATE_KEY_EX 155 +#define RSA_F_RSA_ITEM_VERIFY 156 #define RSA_F_RSA_MEMORY_LOCK 130 #define RSA_F_RSA_NEW_METHOD 106 #define RSA_F_RSA_NULL 124 @@ -424,6 +487,7 @@ void ERR_load_RSA_strings(void); #define RSA_F_RSA_PADDING_ADD_NONE 107 #define RSA_F_RSA_PADDING_ADD_PKCS1_OAEP 121 #define RSA_F_RSA_PADDING_ADD_PKCS1_PSS 125 +#define RSA_F_RSA_PADDING_ADD_PKCS1_PSS_MGF1 148 #define RSA_F_RSA_PADDING_ADD_PKCS1_TYPE_1 108 #define RSA_F_RSA_PADDING_ADD_PKCS1_TYPE_2 109 #define RSA_F_RSA_PADDING_ADD_SSLV23 110 @@ -436,8 +500,12 @@ void ERR_load_RSA_strings(void); #define RSA_F_RSA_PADDING_CHECK_X931 128 #define RSA_F_RSA_PRINT 115 #define RSA_F_RSA_PRINT_FP 116 +#define RSA_F_RSA_PRIVATE_DECRYPT 150 +#define RSA_F_RSA_PRIVATE_ENCRYPT 151 #define RSA_F_RSA_PRIV_DECODE 137 #define RSA_F_RSA_PRIV_ENCODE 138 +#define RSA_F_RSA_PUBLIC_DECRYPT 152 +#define RSA_F_RSA_PUBLIC_ENCRYPT 153 #define RSA_F_RSA_PUB_DECODE 139 #define RSA_F_RSA_SETUP_BLINDING 136 #define RSA_F_RSA_SIGN 117 @@ -445,6 +513,7 @@ void ERR_load_RSA_strings(void); #define RSA_F_RSA_VERIFY 119 #define RSA_F_RSA_VERIFY_ASN1_OCTET_STRING 120 #define RSA_F_RSA_VERIFY_PKCS1_PSS 126 +#define RSA_F_RSA_VERIFY_PKCS1_PSS_MGF1 149 /* Reason codes. */ #define RSA_R_ALGORITHM_MISMATCH 100 @@ -470,19 +539,24 @@ void ERR_load_RSA_strings(void); #define RSA_R_INVALID_HEADER 137 #define RSA_R_INVALID_KEYBITS 145 #define RSA_R_INVALID_MESSAGE_LENGTH 131 +#define RSA_R_INVALID_MGF1_MD 156 #define RSA_R_INVALID_PADDING 138 #define RSA_R_INVALID_PADDING_MODE 141 +#define RSA_R_INVALID_PSS_PARAMETERS 149 #define RSA_R_INVALID_PSS_SALTLEN 146 +#define RSA_R_INVALID_SALT_LENGTH 150 #define RSA_R_INVALID_TRAILER 139 #define RSA_R_INVALID_X931_DIGEST 142 #define RSA_R_IQMP_NOT_INVERSE_OF_Q 126 #define RSA_R_KEY_SIZE_TOO_SMALL 120 #define RSA_R_LAST_OCTET_INVALID 134 #define RSA_R_MODULUS_TOO_LARGE 105 +#define RSA_R_NON_FIPS_RSA_METHOD 157 #define RSA_R_NO_PUBLIC_EXPONENT 140 #define RSA_R_NULL_BEFORE_BLOCK_MISSING 113 #define RSA_R_N_DOES_NOT_EQUAL_P_Q 127 #define RSA_R_OAEP_DECODING_ERROR 121 +#define RSA_R_OPERATION_NOT_ALLOWED_IN_FIPS_MODE 158 #define RSA_R_OPERATION_NOT_SUPPORTED_FOR_THIS_KEYTYPE 148 #define RSA_R_PADDING_CHECK_FAILED 114 #define RSA_R_P_NOT_PRIME 128 @@ -493,7 +567,12 @@ void ERR_load_RSA_strings(void); #define RSA_R_SSLV3_ROLLBACK_ATTACK 115 #define RSA_R_THE_ASN1_OBJECT_IDENTIFIER_IS_NOT_KNOWN_FOR_THIS_MD 116 #define RSA_R_UNKNOWN_ALGORITHM_TYPE 117 +#define RSA_R_UNKNOWN_MASK_DIGEST 151 #define RSA_R_UNKNOWN_PADDING_TYPE 118 +#define RSA_R_UNKNOWN_PSS_DIGEST 152 +#define RSA_R_UNSUPPORTED_MASK_ALGORITHM 153 +#define RSA_R_UNSUPPORTED_MASK_PARAMETER 154 +#define RSA_R_UNSUPPORTED_SIGNATURE_TYPE 155 #define RSA_R_VALUE_MISSING 147 #define RSA_R_WRONG_SIGNATURE_LENGTH 119 diff --git a/crypto/openssl/crypto/rsa/rsa_ameth.c b/crypto/openssl/crypto/rsa/rsa_ameth.c index 8c3209885e..2460910ab2 100644 --- a/crypto/openssl/crypto/rsa/rsa_ameth.c +++ b/crypto/openssl/crypto/rsa/rsa_ameth.c @@ -265,6 +265,147 @@ static int rsa_priv_print(BIO *bp, const EVP_PKEY *pkey, int indent, return do_rsa_print(bp, pkey->pkey.rsa, indent, 1); } +static RSA_PSS_PARAMS *rsa_pss_decode(const X509_ALGOR *alg, + X509_ALGOR **pmaskHash) + { + const unsigned char *p; + int plen; + RSA_PSS_PARAMS *pss; + + *pmaskHash = NULL; + + if (!alg->parameter || alg->parameter->type != V_ASN1_SEQUENCE) + return NULL; + p = alg->parameter->value.sequence->data; + plen = alg->parameter->value.sequence->length; + pss = d2i_RSA_PSS_PARAMS(NULL, &p, plen); + + if (!pss) + return NULL; + + if (pss->maskGenAlgorithm) + { + ASN1_TYPE *param = pss->maskGenAlgorithm->parameter; + if (OBJ_obj2nid(pss->maskGenAlgorithm->algorithm) == NID_mgf1 + && param->type == V_ASN1_SEQUENCE) + { + p = param->value.sequence->data; + plen = param->value.sequence->length; + *pmaskHash = d2i_X509_ALGOR(NULL, &p, plen); + } + } + + return pss; + } + +static int rsa_pss_param_print(BIO *bp, RSA_PSS_PARAMS *pss, + X509_ALGOR *maskHash, int indent) + { + int rv = 0; + if (!pss) + { + if (BIO_puts(bp, " (INVALID PSS PARAMETERS)\n") <= 0) + return 0; + return 1; + } + if (BIO_puts(bp, "\n") <= 0) + goto err; + if (!BIO_indent(bp, indent, 128)) + goto err; + if (BIO_puts(bp, "Hash Algorithm: ") <= 0) + goto err; + + if (pss->hashAlgorithm) + { + if (i2a_ASN1_OBJECT(bp, pss->hashAlgorithm->algorithm) <= 0) + goto err; + } + else if (BIO_puts(bp, "sha1 (default)") <= 0) + goto err; + + if (BIO_puts(bp, "\n") <= 0) + goto err; + + if (!BIO_indent(bp, indent, 128)) + goto err; + + if (BIO_puts(bp, "Mask Algorithm: ") <= 0) + goto err; + if (pss->maskGenAlgorithm) + { + if (i2a_ASN1_OBJECT(bp, pss->maskGenAlgorithm->algorithm) <= 0) + goto err; + if (BIO_puts(bp, " with ") <= 0) + goto err; + if (maskHash) + { + if (i2a_ASN1_OBJECT(bp, maskHash->algorithm) <= 0) + goto err; + } + else if (BIO_puts(bp, "INVALID") <= 0) + goto err; + } + else if (BIO_puts(bp, "mgf1 with sha1 (default)") <= 0) + goto err; + BIO_puts(bp, "\n"); + + if (!BIO_indent(bp, indent, 128)) + goto err; + if (BIO_puts(bp, "Salt Length: ") <= 0) + goto err; + if (pss->saltLength) + { + if (i2a_ASN1_INTEGER(bp, pss->saltLength) <= 0) + goto err; + } + else if (BIO_puts(bp, "20 (default)") <= 0) + goto err; + BIO_puts(bp, "\n"); + + if (!BIO_indent(bp, indent, 128)) + goto err; + if (BIO_puts(bp, "Trailer Field: ") <= 0) + goto err; + if (pss->trailerField) + { + if (i2a_ASN1_INTEGER(bp, pss->trailerField) <= 0) + goto err; + } + else if (BIO_puts(bp, "0xbc (default)") <= 0) + goto err; + BIO_puts(bp, "\n"); + + rv = 1; + + err: + return rv; + + } + +static int rsa_sig_print(BIO *bp, const X509_ALGOR *sigalg, + const ASN1_STRING *sig, + int indent, ASN1_PCTX *pctx) + { + if (OBJ_obj2nid(sigalg->algorithm) == NID_rsassaPss) + { + int rv; + RSA_PSS_PARAMS *pss; + X509_ALGOR *maskHash; + pss = rsa_pss_decode(sigalg, &maskHash); + rv = rsa_pss_param_print(bp, pss, maskHash, indent); + if (pss) + RSA_PSS_PARAMS_free(pss); + if (maskHash) + X509_ALGOR_free(maskHash); + if (!rv) + return 0; + } + else if (!sig && BIO_puts(bp, "\n") <= 0) + return 0; + if (sig) + return X509_signature_dump(bp, sig, indent); + return 1; + } static int rsa_pkey_ctrl(EVP_PKEY *pkey, int op, long arg1, void *arg2) { @@ -310,6 +451,211 @@ static int rsa_pkey_ctrl(EVP_PKEY *pkey, int op, long arg1, void *arg2) } +/* Customised RSA item verification routine. This is called + * when a signature is encountered requiring special handling. We + * currently only handle PSS. + */ + + +static int rsa_item_verify(EVP_MD_CTX *ctx, const ASN1_ITEM *it, void *asn, + X509_ALGOR *sigalg, ASN1_BIT_STRING *sig, + EVP_PKEY *pkey) + { + int rv = -1; + int saltlen; + const EVP_MD *mgf1md = NULL, *md = NULL; + RSA_PSS_PARAMS *pss; + X509_ALGOR *maskHash; + EVP_PKEY_CTX *pkctx; + /* Sanity check: make sure it is PSS */ + if (OBJ_obj2nid(sigalg->algorithm) != NID_rsassaPss) + { + RSAerr(RSA_F_RSA_ITEM_VERIFY, RSA_R_UNSUPPORTED_SIGNATURE_TYPE); + return -1; + } + /* Decode PSS parameters */ + pss = rsa_pss_decode(sigalg, &maskHash); + + if (pss == NULL) + { + RSAerr(RSA_F_RSA_ITEM_VERIFY, RSA_R_INVALID_PSS_PARAMETERS); + goto err; + } + /* Check mask and lookup mask hash algorithm */ + if (pss->maskGenAlgorithm) + { + if (OBJ_obj2nid(pss->maskGenAlgorithm->algorithm) != NID_mgf1) + { + RSAerr(RSA_F_RSA_ITEM_VERIFY, RSA_R_UNSUPPORTED_MASK_ALGORITHM); + goto err; + } + if (!maskHash) + { + RSAerr(RSA_F_RSA_ITEM_VERIFY, RSA_R_UNSUPPORTED_MASK_PARAMETER); + goto err; + } + mgf1md = EVP_get_digestbyobj(maskHash->algorithm); + if (mgf1md == NULL) + { + RSAerr(RSA_F_RSA_ITEM_VERIFY, RSA_R_UNKNOWN_MASK_DIGEST); + goto err; + } + } + else + mgf1md = EVP_sha1(); + + if (pss->hashAlgorithm) + { + md = EVP_get_digestbyobj(pss->hashAlgorithm->algorithm); + if (md == NULL) + { + RSAerr(RSA_F_RSA_ITEM_VERIFY, RSA_R_UNKNOWN_PSS_DIGEST); + goto err; + } + } + else + md = EVP_sha1(); + + if (pss->saltLength) + { + saltlen = ASN1_INTEGER_get(pss->saltLength); + + /* Could perform more salt length sanity checks but the main + * RSA routines will trap other invalid values anyway. + */ + if (saltlen < 0) + { + RSAerr(RSA_F_RSA_ITEM_VERIFY, RSA_R_INVALID_SALT_LENGTH); + goto err; + } + } + else + saltlen = 20; + + /* low-level routines support only trailer field 0xbc (value 1) + * and PKCS#1 says we should reject any other value anyway. + */ + if (pss->trailerField && ASN1_INTEGER_get(pss->trailerField) != 1) + { + RSAerr(RSA_F_RSA_ITEM_VERIFY, RSA_R_INVALID_TRAILER); + goto err; + } + + /* We have all parameters now set up context */ + + if (!EVP_DigestVerifyInit(ctx, &pkctx, md, NULL, pkey)) + goto err; + + if (EVP_PKEY_CTX_set_rsa_padding(pkctx, RSA_PKCS1_PSS_PADDING) <= 0) + goto err; + + if (EVP_PKEY_CTX_set_rsa_pss_saltlen(pkctx, saltlen) <= 0) + goto err; + + if (EVP_PKEY_CTX_set_rsa_mgf1_md(pkctx, mgf1md) <= 0) + goto err; + /* Carry on */ + rv = 2; + + err: + RSA_PSS_PARAMS_free(pss); + if (maskHash) + X509_ALGOR_free(maskHash); + return rv; + } + +static int rsa_item_sign(EVP_MD_CTX *ctx, const ASN1_ITEM *it, void *asn, + X509_ALGOR *alg1, X509_ALGOR *alg2, + ASN1_BIT_STRING *sig) + { + int pad_mode; + EVP_PKEY_CTX *pkctx = ctx->pctx; + if (EVP_PKEY_CTX_get_rsa_padding(pkctx, &pad_mode) <= 0) + return 0; + if (pad_mode == RSA_PKCS1_PADDING) + return 2; + if (pad_mode == RSA_PKCS1_PSS_PADDING) + { + const EVP_MD *sigmd, *mgf1md; + RSA_PSS_PARAMS *pss = NULL; + X509_ALGOR *mgf1alg = NULL; + ASN1_STRING *os1 = NULL, *os2 = NULL; + EVP_PKEY *pk = EVP_PKEY_CTX_get0_pkey(pkctx); + int saltlen, rv = 0; + sigmd = EVP_MD_CTX_md(ctx); + if (EVP_PKEY_CTX_get_rsa_mgf1_md(pkctx, &mgf1md) <= 0) + goto err; + if (!EVP_PKEY_CTX_get_rsa_pss_saltlen(pkctx, &saltlen)) + goto err; + if (saltlen == -1) + saltlen = EVP_MD_size(sigmd); + else if (saltlen == -2) + { + saltlen = EVP_PKEY_size(pk) - EVP_MD_size(sigmd) - 2; + if (((EVP_PKEY_bits(pk) - 1) & 0x7) == 0) + saltlen--; + } + pss = RSA_PSS_PARAMS_new(); + if (!pss) + goto err; + if (saltlen != 20) + { + pss->saltLength = ASN1_INTEGER_new(); + if (!pss->saltLength) + goto err; + if (!ASN1_INTEGER_set(pss->saltLength, saltlen)) + goto err; + } + if (EVP_MD_type(sigmd) != NID_sha1) + { + pss->hashAlgorithm = X509_ALGOR_new(); + if (!pss->hashAlgorithm) + goto err; + X509_ALGOR_set_md(pss->hashAlgorithm, sigmd); + } + if (EVP_MD_type(mgf1md) != NID_sha1) + { + ASN1_STRING *stmp = NULL; + /* need to embed algorithm ID inside another */ + mgf1alg = X509_ALGOR_new(); + X509_ALGOR_set_md(mgf1alg, mgf1md); + if (!ASN1_item_pack(mgf1alg, ASN1_ITEM_rptr(X509_ALGOR), + &stmp)) + goto err; + pss->maskGenAlgorithm = X509_ALGOR_new(); + if (!pss->maskGenAlgorithm) + goto err; + X509_ALGOR_set0(pss->maskGenAlgorithm, + OBJ_nid2obj(NID_mgf1), + V_ASN1_SEQUENCE, stmp); + } + /* Finally create string with pss parameter encoding. */ + if (!ASN1_item_pack(pss, ASN1_ITEM_rptr(RSA_PSS_PARAMS), &os1)) + goto err; + if (alg2) + { + os2 = ASN1_STRING_dup(os1); + if (!os2) + goto err; + X509_ALGOR_set0(alg2, OBJ_nid2obj(NID_rsassaPss), + V_ASN1_SEQUENCE, os2); + } + X509_ALGOR_set0(alg1, OBJ_nid2obj(NID_rsassaPss), + V_ASN1_SEQUENCE, os1); + os1 = os2 = NULL; + rv = 3; + err: + if (mgf1alg) + X509_ALGOR_free(mgf1alg); + if (pss) + RSA_PSS_PARAMS_free(pss); + if (os1) + ASN1_STRING_free(os1); + return rv; + + } + return 2; + } const EVP_PKEY_ASN1_METHOD rsa_asn1_meths[] = { @@ -335,10 +681,13 @@ const EVP_PKEY_ASN1_METHOD rsa_asn1_meths[] = 0,0,0,0,0,0, + rsa_sig_print, int_rsa_free, rsa_pkey_ctrl, old_rsa_priv_decode, - old_rsa_priv_encode + old_rsa_priv_encode, + rsa_item_verify, + rsa_item_sign }, { diff --git a/crypto/openssl/crypto/rsa/rsa_asn1.c b/crypto/openssl/crypto/rsa/rsa_asn1.c index 4efca8cdc8..6ed5de3db4 100644 --- a/crypto/openssl/crypto/rsa/rsa_asn1.c +++ b/crypto/openssl/crypto/rsa/rsa_asn1.c @@ -60,6 +60,7 @@ #include "cryptlib.h" #include #include +#include #include /* Override the default free and new methods */ @@ -96,6 +97,15 @@ ASN1_SEQUENCE_cb(RSAPublicKey, rsa_cb) = { ASN1_SIMPLE(RSA, e, BIGNUM), } ASN1_SEQUENCE_END_cb(RSA, RSAPublicKey) +ASN1_SEQUENCE(RSA_PSS_PARAMS) = { + ASN1_EXP_OPT(RSA_PSS_PARAMS, hashAlgorithm, X509_ALGOR,0), + ASN1_EXP_OPT(RSA_PSS_PARAMS, maskGenAlgorithm, X509_ALGOR,1), + ASN1_EXP_OPT(RSA_PSS_PARAMS, saltLength, ASN1_INTEGER,2), + ASN1_EXP_OPT(RSA_PSS_PARAMS, trailerField, ASN1_INTEGER,3) +} ASN1_SEQUENCE_END(RSA_PSS_PARAMS) + +IMPLEMENT_ASN1_FUNCTIONS(RSA_PSS_PARAMS) + IMPLEMENT_ASN1_ENCODE_FUNCTIONS_const_fname(RSA, RSAPrivateKey, RSAPrivateKey) IMPLEMENT_ASN1_ENCODE_FUNCTIONS_const_fname(RSA, RSAPublicKey, RSAPublicKey) diff --git a/crypto/openssl/crypto/rsa/rsa_lib.c b/crypto/openssl/crypto/rsa/rsa_crpt.c similarity index 56% copy from crypto/openssl/crypto/rsa/rsa_lib.c copy to crypto/openssl/crypto/rsa/rsa_crpt.c index de45088d76..d3e44785dc 100644 --- a/crypto/openssl/crypto/rsa/rsa_lib.c +++ b/crypto/openssl/crypto/rsa/rsa_crpt.c @@ -67,219 +67,6 @@ #include #endif -const char RSA_version[]="RSA" OPENSSL_VERSION_PTEXT; - -static const RSA_METHOD *default_RSA_meth=NULL; - -RSA *RSA_new(void) - { - RSA *r=RSA_new_method(NULL); - - return r; - } - -void RSA_set_default_method(const RSA_METHOD *meth) - { - default_RSA_meth = meth; - } - -const RSA_METHOD *RSA_get_default_method(void) - { - if (default_RSA_meth == NULL) - { -#ifdef RSA_NULL - default_RSA_meth=RSA_null_method(); -#else -#if 0 /* was: #ifdef RSAref */ - default_RSA_meth=RSA_PKCS1_RSAref(); -#else - default_RSA_meth=RSA_PKCS1_SSLeay(); -#endif -#endif - } - - return default_RSA_meth; - } - -const RSA_METHOD *RSA_get_method(const RSA *rsa) - { - return rsa->meth; - } - -int RSA_set_method(RSA *rsa, const RSA_METHOD *meth) - { - /* NB: The caller is specifically setting a method, so it's not up to us - * to deal with which ENGINE it comes from. */ - const RSA_METHOD *mtmp; - mtmp = rsa->meth; - if (mtmp->finish) mtmp->finish(rsa); -#ifndef OPENSSL_NO_ENGINE - if (rsa->engine) - { - ENGINE_finish(rsa->engine); - rsa->engine = NULL; - } -#endif - rsa->meth = meth; - if (meth->init) meth->init(rsa); - return 1; - } - -RSA *RSA_new_method(ENGINE *engine) - { - RSA *ret; - - ret=(RSA *)OPENSSL_malloc(sizeof(RSA)); - if (ret == NULL) - { - RSAerr(RSA_F_RSA_NEW_METHOD,ERR_R_MALLOC_FAILURE); - return NULL; - } - - ret->meth = RSA_get_default_method(); -#ifndef OPENSSL_NO_ENGINE - if (engine) - { - if (!ENGINE_init(engine)) - { - RSAerr(RSA_F_RSA_NEW_METHOD, ERR_R_ENGINE_LIB); - OPENSSL_free(ret); - return NULL; - } - ret->engine = engine; - } - else - ret->engine = ENGINE_get_default_RSA(); - if(ret->engine) - { - ret->meth = ENGINE_get_RSA(ret->engine); - if(!ret->meth) - { - RSAerr(RSA_F_RSA_NEW_METHOD, - ERR_R_ENGINE_LIB); - ENGINE_finish(ret->engine); - OPENSSL_free(ret); - return NULL; - } - } -#endif - - ret->pad=0; - ret->version=0; - ret->n=NULL; - ret->e=NULL; - ret->d=NULL; - ret->p=NULL; - ret->q=NULL; - ret->dmp1=NULL; - ret->dmq1=NULL; - ret->iqmp=NULL; - ret->references=1; - ret->_method_mod_n=NULL; - ret->_method_mod_p=NULL; - ret->_method_mod_q=NULL; - ret->blinding=NULL; - ret->mt_blinding=NULL; - ret->bignum_data=NULL; - ret->flags=ret->meth->flags; - if (!CRYPTO_new_ex_data(CRYPTO_EX_INDEX_RSA, ret, &ret->ex_data)) - { -#ifndef OPENSSL_NO_ENGINE - if (ret->engine) - ENGINE_finish(ret->engine); -#endif - OPENSSL_free(ret); - return(NULL); - } - - if ((ret->meth->init != NULL) && !ret->meth->init(ret)) - { -#ifndef OPENSSL_NO_ENGINE - if (ret->engine) - ENGINE_finish(ret->engine); -#endif - CRYPTO_free_ex_data(CRYPTO_EX_INDEX_RSA, ret, &ret->ex_data); - OPENSSL_free(ret); - ret=NULL; - } - return(ret); - } - -void RSA_free(RSA *r) - { - int i; - - if (r == NULL) return; - - i=CRYPTO_add(&r->references,-1,CRYPTO_LOCK_RSA); -#ifdef REF_PRINT - REF_PRINT("RSA",r); -#endif - if (i > 0) return; -#ifdef REF_CHECK - if (i < 0) - { - fprintf(stderr,"RSA_free, bad reference count\n"); - abort(); - } -#endif - - if (r->meth->finish) - r->meth->finish(r); -#ifndef OPENSSL_NO_ENGINE - if (r->engine) - ENGINE_finish(r->engine); -#endif - - CRYPTO_free_ex_data(CRYPTO_EX_INDEX_RSA, r, &r->ex_data); - - if (r->n != NULL) BN_clear_free(r->n); - if (r->e != NULL) BN_clear_free(r->e); - if (r->d != NULL) BN_clear_free(r->d); - if (r->p != NULL) BN_clear_free(r->p); - if (r->q != NULL) BN_clear_free(r->q); - if (r->dmp1 != NULL) BN_clear_free(r->dmp1); - if (r->dmq1 != NULL) BN_clear_free(r->dmq1); - if (r->iqmp != NULL) BN_clear_free(r->iqmp); - if (r->blinding != NULL) BN_BLINDING_free(r->blinding); - if (r->mt_blinding != NULL) BN_BLINDING_free(r->mt_blinding); - if (r->bignum_data != NULL) OPENSSL_free_locked(r->bignum_data); - OPENSSL_free(r); - } - -int RSA_up_ref(RSA *r) - { - int i = CRYPTO_add(&r->references, 1, CRYPTO_LOCK_RSA); -#ifdef REF_PRINT - REF_PRINT("RSA",r); -#endif -#ifdef REF_CHECK - if (i < 2) - { - fprintf(stderr, "RSA_up_ref, bad reference count\n"); - abort(); - } -#endif - return ((i > 1) ? 1 : 0); - } - -int RSA_get_ex_new_index(long argl, void *argp, CRYPTO_EX_new *new_func, - CRYPTO_EX_dup *dup_func, CRYPTO_EX_free *free_func) - { - return CRYPTO_get_ex_new_index(CRYPTO_EX_INDEX_RSA, argl, argp, - new_func, dup_func, free_func); - } - -int RSA_set_ex_data(RSA *r, int idx, void *arg) - { - return(CRYPTO_set_ex_data(&r->ex_data,idx,arg)); - } - -void *RSA_get_ex_data(const RSA *r, int idx) - { - return(CRYPTO_get_ex_data(&r->ex_data,idx)); - } - int RSA_size(const RSA *r) { return(BN_num_bytes(r->n)); @@ -288,24 +75,56 @@ int RSA_size(const RSA *r) int RSA_public_encrypt(int flen, const unsigned char *from, unsigned char *to, RSA *rsa, int padding) { +#ifdef OPENSSL_FIPS + if (FIPS_mode() && !(rsa->meth->flags & RSA_FLAG_FIPS_METHOD) + && !(rsa->flags & RSA_FLAG_NON_FIPS_ALLOW)) + { + RSAerr(RSA_F_RSA_PUBLIC_ENCRYPT, RSA_R_NON_FIPS_RSA_METHOD); + return -1; + } +#endif return(rsa->meth->rsa_pub_enc(flen, from, to, rsa, padding)); } int RSA_private_encrypt(int flen, const unsigned char *from, unsigned char *to, RSA *rsa, int padding) { +#ifdef OPENSSL_FIPS + if (FIPS_mode() && !(rsa->meth->flags & RSA_FLAG_FIPS_METHOD) + && !(rsa->flags & RSA_FLAG_NON_FIPS_ALLOW)) + { + RSAerr(RSA_F_RSA_PRIVATE_ENCRYPT, RSA_R_NON_FIPS_RSA_METHOD); + return -1; + } +#endif return(rsa->meth->rsa_priv_enc(flen, from, to, rsa, padding)); } int RSA_private_decrypt(int flen, const unsigned char *from, unsigned char *to, RSA *rsa, int padding) { +#ifdef OPENSSL_FIPS + if (FIPS_mode() && !(rsa->meth->flags & RSA_FLAG_FIPS_METHOD) + && !(rsa->flags & RSA_FLAG_NON_FIPS_ALLOW)) + { + RSAerr(RSA_F_RSA_PRIVATE_DECRYPT, RSA_R_NON_FIPS_RSA_METHOD); + return -1; + } +#endif return(rsa->meth->rsa_priv_dec(flen, from, to, rsa, padding)); } int RSA_public_decrypt(int flen, const unsigned char *from, unsigned char *to, RSA *rsa, int padding) { +#ifdef OPENSSL_FIPS + if (FIPS_mode() && !(rsa->meth->flags & RSA_FLAG_FIPS_METHOD) + && !(rsa->flags & RSA_FLAG_NON_FIPS_ALLOW)) + { + RSAerr(RSA_F_RSA_PUBLIC_DECRYPT, RSA_R_NON_FIPS_RSA_METHOD); + return -1; + } +#endif return(rsa->meth->rsa_pub_dec(flen, from, to, rsa, padding)); } @@ -436,48 +255,3 @@ err: return ret; } - -int RSA_memory_lock(RSA *r) - { - int i,j,k,off; - char *p; - BIGNUM *bn,**t[6],*b; - BN_ULONG *ul; - - if (r->d == NULL) return(1); - t[0]= &r->d; - t[1]= &r->p; - t[2]= &r->q; - t[3]= &r->dmp1; - t[4]= &r->dmq1; - t[5]= &r->iqmp; - k=sizeof(BIGNUM)*6; - off=k/sizeof(BN_ULONG)+1; - j=1; - for (i=0; i<6; i++) - j+= (*t[i])->top; - if ((p=OPENSSL_malloc_locked((off+j)*sizeof(BN_ULONG))) == NULL) - { - RSAerr(RSA_F_RSA_MEMORY_LOCK,ERR_R_MALLOC_FAILURE); - return(0); - } - bn=(BIGNUM *)p; - ul=(BN_ULONG *)&(p[off]); - for (i=0; i<6; i++) - { - b= *(t[i]); - *(t[i])= &(bn[i]); - memcpy((char *)&(bn[i]),(char *)b,sizeof(BIGNUM)); - bn[i].flags=BN_FLG_STATIC_DATA; - bn[i].d=ul; - memcpy((char *)ul,b->d,sizeof(BN_ULONG)*b->top); - ul+=b->top; - BN_clear_free(b); - } - - /* I should fix this so it can still be done */ - r->flags&= ~(RSA_FLAG_CACHE_PRIVATE|RSA_FLAG_CACHE_PUBLIC); - - r->bignum_data=p; - return(1); - } diff --git a/crypto/openssl/crypto/rsa/rsa_err.c b/crypto/openssl/crypto/rsa/rsa_err.c index cf9f1106b0..46e0bf9980 100644 --- a/crypto/openssl/crypto/rsa/rsa_err.c +++ b/crypto/openssl/crypto/rsa/rsa_err.c @@ -1,6 +1,6 @@ /* crypto/rsa/rsa_err.c */ /* ==================================================================== - * Copyright (c) 1999-2008 The OpenSSL Project. All rights reserved. + * Copyright (c) 1999-2011 The OpenSSL Project. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions @@ -78,6 +78,7 @@ static ERR_STRING_DATA RSA_str_functs[]= {ERR_FUNC(RSA_F_PKEY_RSA_CTRL), "PKEY_RSA_CTRL"}, {ERR_FUNC(RSA_F_PKEY_RSA_CTRL_STR), "PKEY_RSA_CTRL_STR"}, {ERR_FUNC(RSA_F_PKEY_RSA_SIGN), "PKEY_RSA_SIGN"}, +{ERR_FUNC(RSA_F_PKEY_RSA_VERIFY), "PKEY_RSA_VERIFY"}, {ERR_FUNC(RSA_F_PKEY_RSA_VERIFYRECOVER), "PKEY_RSA_VERIFYRECOVER"}, {ERR_FUNC(RSA_F_RSA_BUILTIN_KEYGEN), "RSA_BUILTIN_KEYGEN"}, {ERR_FUNC(RSA_F_RSA_CHECK_KEY), "RSA_check_key"}, @@ -86,6 +87,8 @@ static ERR_STRING_DATA RSA_str_functs[]= {ERR_FUNC(RSA_F_RSA_EAY_PUBLIC_DECRYPT), "RSA_EAY_PUBLIC_DECRYPT"}, {ERR_FUNC(RSA_F_RSA_EAY_PUBLIC_ENCRYPT), "RSA_EAY_PUBLIC_ENCRYPT"}, {ERR_FUNC(RSA_F_RSA_GENERATE_KEY), "RSA_generate_key"}, +{ERR_FUNC(RSA_F_RSA_GENERATE_KEY_EX), "RSA_generate_key_ex"}, +{ERR_FUNC(RSA_F_RSA_ITEM_VERIFY), "RSA_ITEM_VERIFY"}, {ERR_FUNC(RSA_F_RSA_MEMORY_LOCK), "RSA_memory_lock"}, {ERR_FUNC(RSA_F_RSA_NEW_METHOD), "RSA_new_method"}, {ERR_FUNC(RSA_F_RSA_NULL), "RSA_NULL"}, @@ -97,6 +100,7 @@ static ERR_STRING_DATA RSA_str_functs[]= {ERR_FUNC(RSA_F_RSA_PADDING_ADD_NONE), "RSA_padding_add_none"}, {ERR_FUNC(RSA_F_RSA_PADDING_ADD_PKCS1_OAEP), "RSA_padding_add_PKCS1_OAEP"}, {ERR_FUNC(RSA_F_RSA_PADDING_ADD_PKCS1_PSS), "RSA_padding_add_PKCS1_PSS"}, +{ERR_FUNC(RSA_F_RSA_PADDING_ADD_PKCS1_PSS_MGF1), "RSA_padding_add_PKCS1_PSS_mgf1"}, {ERR_FUNC(RSA_F_RSA_PADDING_ADD_PKCS1_TYPE_1), "RSA_padding_add_PKCS1_type_1"}, {ERR_FUNC(RSA_F_RSA_PADDING_ADD_PKCS1_TYPE_2), "RSA_padding_add_PKCS1_type_2"}, {ERR_FUNC(RSA_F_RSA_PADDING_ADD_SSLV23), "RSA_padding_add_SSLv23"}, @@ -109,8 +113,12 @@ static ERR_STRING_DATA RSA_str_functs[]= {ERR_FUNC(RSA_F_RSA_PADDING_CHECK_X931), "RSA_padding_check_X931"}, {ERR_FUNC(RSA_F_RSA_PRINT), "RSA_print"}, {ERR_FUNC(RSA_F_RSA_PRINT_FP), "RSA_print_fp"}, +{ERR_FUNC(RSA_F_RSA_PRIVATE_DECRYPT), "RSA_private_decrypt"}, +{ERR_FUNC(RSA_F_RSA_PRIVATE_ENCRYPT), "RSA_private_encrypt"}, {ERR_FUNC(RSA_F_RSA_PRIV_DECODE), "RSA_PRIV_DECODE"}, {ERR_FUNC(RSA_F_RSA_PRIV_ENCODE), "RSA_PRIV_ENCODE"}, +{ERR_FUNC(RSA_F_RSA_PUBLIC_DECRYPT), "RSA_public_decrypt"}, +{ERR_FUNC(RSA_F_RSA_PUBLIC_ENCRYPT), "RSA_public_encrypt"}, {ERR_FUNC(RSA_F_RSA_PUB_DECODE), "RSA_PUB_DECODE"}, {ERR_FUNC(RSA_F_RSA_SETUP_BLINDING), "RSA_setup_blinding"}, {ERR_FUNC(RSA_F_RSA_SIGN), "RSA_sign"}, @@ -118,6 +126,7 @@ static ERR_STRING_DATA RSA_str_functs[]= {ERR_FUNC(RSA_F_RSA_VERIFY), "RSA_verify"}, {ERR_FUNC(RSA_F_RSA_VERIFY_ASN1_OCTET_STRING), "RSA_verify_ASN1_OCTET_STRING"}, {ERR_FUNC(RSA_F_RSA_VERIFY_PKCS1_PSS), "RSA_verify_PKCS1_PSS"}, +{ERR_FUNC(RSA_F_RSA_VERIFY_PKCS1_PSS_MGF1), "RSA_verify_PKCS1_PSS_mgf1"}, {0,NULL} }; @@ -146,19 +155,24 @@ static ERR_STRING_DATA RSA_str_reasons[]= {ERR_REASON(RSA_R_INVALID_HEADER) ,"invalid header"}, {ERR_REASON(RSA_R_INVALID_KEYBITS) ,"invalid keybits"}, {ERR_REASON(RSA_R_INVALID_MESSAGE_LENGTH),"invalid message length"}, +{ERR_REASON(RSA_R_INVALID_MGF1_MD) ,"invalid mgf1 md"}, {ERR_REASON(RSA_R_INVALID_PADDING) ,"invalid padding"}, {ERR_REASON(RSA_R_INVALID_PADDING_MODE) ,"invalid padding mode"}, +{ERR_REASON(RSA_R_INVALID_PSS_PARAMETERS),"invalid pss parameters"}, {ERR_REASON(RSA_R_INVALID_PSS_SALTLEN) ,"invalid pss saltlen"}, +{ERR_REASON(RSA_R_INVALID_SALT_LENGTH) ,"invalid salt length"}, {ERR_REASON(RSA_R_INVALID_TRAILER) ,"invalid trailer"}, {ERR_REASON(RSA_R_INVALID_X931_DIGEST) ,"invalid x931 digest"}, {ERR_REASON(RSA_R_IQMP_NOT_INVERSE_OF_Q) ,"iqmp not inverse of q"}, {ERR_REASON(RSA_R_KEY_SIZE_TOO_SMALL) ,"key size too small"}, {ERR_REASON(RSA_R_LAST_OCTET_INVALID) ,"last octet invalid"}, {ERR_REASON(RSA_R_MODULUS_TOO_LARGE) ,"modulus too large"}, +{ERR_REASON(RSA_R_NON_FIPS_RSA_METHOD) ,"non fips rsa method"}, {ERR_REASON(RSA_R_NO_PUBLIC_EXPONENT) ,"no public exponent"}, {ERR_REASON(RSA_R_NULL_BEFORE_BLOCK_MISSING),"null before block missing"}, {ERR_REASON(RSA_R_N_DOES_NOT_EQUAL_P_Q) ,"n does not equal p q"}, {ERR_REASON(RSA_R_OAEP_DECODING_ERROR) ,"oaep decoding error"}, +{ERR_REASON(RSA_R_OPERATION_NOT_ALLOWED_IN_FIPS_MODE),"operation not allowed in fips mode"}, {ERR_REASON(RSA_R_OPERATION_NOT_SUPPORTED_FOR_THIS_KEYTYPE),"operation not supported for this keytype"}, {ERR_REASON(RSA_R_PADDING_CHECK_FAILED) ,"padding check failed"}, {ERR_REASON(RSA_R_P_NOT_PRIME) ,"p not prime"}, @@ -169,7 +183,12 @@ static ERR_STRING_DATA RSA_str_reasons[]= {ERR_REASON(RSA_R_SSLV3_ROLLBACK_ATTACK) ,"sslv3 rollback attack"}, {ERR_REASON(RSA_R_THE_ASN1_OBJECT_IDENTIFIER_IS_NOT_KNOWN_FOR_THIS_MD),"the asn1 object identifier is not known for this md"}, {ERR_REASON(RSA_R_UNKNOWN_ALGORITHM_TYPE),"unknown algorithm type"}, +{ERR_REASON(RSA_R_UNKNOWN_MASK_DIGEST) ,"unknown mask digest"}, {ERR_REASON(RSA_R_UNKNOWN_PADDING_TYPE) ,"unknown padding type"}, +{ERR_REASON(RSA_R_UNKNOWN_PSS_DIGEST) ,"unknown pss digest"}, +{ERR_REASON(RSA_R_UNSUPPORTED_MASK_ALGORITHM),"unsupported mask algorithm"}, +{ERR_REASON(RSA_R_UNSUPPORTED_MASK_PARAMETER),"unsupported mask parameter"}, +{ERR_REASON(RSA_R_UNSUPPORTED_SIGNATURE_TYPE),"unsupported signature type"}, {ERR_REASON(RSA_R_VALUE_MISSING) ,"value missing"}, {ERR_REASON(RSA_R_WRONG_SIGNATURE_LENGTH),"wrong signature length"}, {0,NULL} diff --git a/crypto/openssl/crypto/rsa/rsa_gen.c b/crypto/openssl/crypto/rsa/rsa_gen.c index 767f7ab682..42290cce66 100644 --- a/crypto/openssl/crypto/rsa/rsa_gen.c +++ b/crypto/openssl/crypto/rsa/rsa_gen.c @@ -67,6 +67,9 @@ #include "cryptlib.h" #include #include +#ifdef OPENSSL_FIPS +#include +#endif static int rsa_builtin_keygen(RSA *rsa, int bits, BIGNUM *e_value, BN_GENCB *cb); @@ -77,8 +80,20 @@ static int rsa_builtin_keygen(RSA *rsa, int bits, BIGNUM *e_value, BN_GENCB *cb) * now just because key-generation is part of RSA_METHOD. */ int RSA_generate_key_ex(RSA *rsa, int bits, BIGNUM *e_value, BN_GENCB *cb) { +#ifdef OPENSSL_FIPS + if (FIPS_mode() && !(rsa->meth->flags & RSA_FLAG_FIPS_METHOD) + && !(rsa->flags & RSA_FLAG_NON_FIPS_ALLOW)) + { + RSAerr(RSA_F_RSA_GENERATE_KEY_EX, RSA_R_NON_FIPS_RSA_METHOD); + return 0; + } +#endif if(rsa->meth->rsa_keygen) return rsa->meth->rsa_keygen(rsa, bits, e_value, cb); +#ifdef OPENSSL_FIPS + if (FIPS_mode()) + return FIPS_rsa_generate_key_ex(rsa, bits, e_value, cb); +#endif return rsa_builtin_keygen(rsa, bits, e_value, cb); } diff --git a/crypto/openssl/crypto/rsa/rsa_lib.c b/crypto/openssl/crypto/rsa/rsa_lib.c index de45088d76..c95ceafc82 100644 --- a/crypto/openssl/crypto/rsa/rsa_lib.c +++ b/crypto/openssl/crypto/rsa/rsa_lib.c @@ -67,6 +67,10 @@ #include #endif +#ifdef OPENSSL_FIPS +#include +#endif + const char RSA_version[]="RSA" OPENSSL_VERSION_PTEXT; static const RSA_METHOD *default_RSA_meth=NULL; @@ -87,11 +91,14 @@ const RSA_METHOD *RSA_get_default_method(void) { if (default_RSA_meth == NULL) { +#ifdef OPENSSL_FIPS + if (FIPS_mode()) + return FIPS_rsa_pkcs1_ssleay(); + else + return RSA_PKCS1_SSLeay(); +#else #ifdef RSA_NULL default_RSA_meth=RSA_null_method(); -#else -#if 0 /* was: #ifdef RSAref */ - default_RSA_meth=RSA_PKCS1_RSAref(); #else default_RSA_meth=RSA_PKCS1_SSLeay(); #endif @@ -181,7 +188,7 @@ RSA *RSA_new_method(ENGINE *engine) ret->blinding=NULL; ret->mt_blinding=NULL; ret->bignum_data=NULL; - ret->flags=ret->meth->flags; + ret->flags=ret->meth->flags & ~RSA_FLAG_NON_FIPS_ALLOW; if (!CRYPTO_new_ex_data(CRYPTO_EX_INDEX_RSA, ret, &ret->ex_data)) { #ifndef OPENSSL_NO_ENGINE @@ -280,163 +287,6 @@ void *RSA_get_ex_data(const RSA *r, int idx) return(CRYPTO_get_ex_data(&r->ex_data,idx)); } -int RSA_size(const RSA *r) - { - return(BN_num_bytes(r->n)); - } - -int RSA_public_encrypt(int flen, const unsigned char *from, unsigned char *to, - RSA *rsa, int padding) - { - return(rsa->meth->rsa_pub_enc(flen, from, to, rsa, padding)); - } - -int RSA_private_encrypt(int flen, const unsigned char *from, unsigned char *to, - RSA *rsa, int padding) - { - return(rsa->meth->rsa_priv_enc(flen, from, to, rsa, padding)); - } - -int RSA_private_decrypt(int flen, const unsigned char *from, unsigned char *to, - RSA *rsa, int padding) - { - return(rsa->meth->rsa_priv_dec(flen, from, to, rsa, padding)); - } - -int RSA_public_decrypt(int flen, const unsigned char *from, unsigned char *to, - RSA *rsa, int padding) - { - return(rsa->meth->rsa_pub_dec(flen, from, to, rsa, padding)); - } - -int RSA_flags(const RSA *r) - { - return((r == NULL)?0:r->meth->flags); - } - -void RSA_blinding_off(RSA *rsa) - { - if (rsa->blinding != NULL) - { - BN_BLINDING_free(rsa->blinding); - rsa->blinding=NULL; - } - rsa->flags &= ~RSA_FLAG_BLINDING; - rsa->flags |= RSA_FLAG_NO_BLINDING; - } - -int RSA_blinding_on(RSA *rsa, BN_CTX *ctx) - { - int ret=0; - - if (rsa->blinding != NULL) - RSA_blinding_off(rsa); - - rsa->blinding = RSA_setup_blinding(rsa, ctx); - if (rsa->blinding == NULL) - goto err; - - rsa->flags |= RSA_FLAG_BLINDING; - rsa->flags &= ~RSA_FLAG_NO_BLINDING; - ret=1; -err: - return(ret); - } - -static BIGNUM *rsa_get_public_exp(const BIGNUM *d, const BIGNUM *p, - const BIGNUM *q, BN_CTX *ctx) -{ - BIGNUM *ret = NULL, *r0, *r1, *r2; - - if (d == NULL || p == NULL || q == NULL) - return NULL; - - BN_CTX_start(ctx); - r0 = BN_CTX_get(ctx); - r1 = BN_CTX_get(ctx); - r2 = BN_CTX_get(ctx); - if (r2 == NULL) - goto err; - - if (!BN_sub(r1, p, BN_value_one())) goto err; - if (!BN_sub(r2, q, BN_value_one())) goto err; - if (!BN_mul(r0, r1, r2, ctx)) goto err; - - ret = BN_mod_inverse(NULL, d, r0, ctx); -err: - BN_CTX_end(ctx); - return ret; -} - -BN_BLINDING *RSA_setup_blinding(RSA *rsa, BN_CTX *in_ctx) -{ - BIGNUM local_n; - BIGNUM *e,*n; - BN_CTX *ctx; - BN_BLINDING *ret = NULL; - - if (in_ctx == NULL) - { - if ((ctx = BN_CTX_new()) == NULL) return 0; - } - else - ctx = in_ctx; - - BN_CTX_start(ctx); - e = BN_CTX_get(ctx); - if (e == NULL) - { - RSAerr(RSA_F_RSA_SETUP_BLINDING, ERR_R_MALLOC_FAILURE); - goto err; - } - - if (rsa->e == NULL) - { - e = rsa_get_public_exp(rsa->d, rsa->p, rsa->q, ctx); - if (e == NULL) - { - RSAerr(RSA_F_RSA_SETUP_BLINDING, RSA_R_NO_PUBLIC_EXPONENT); - goto err; - } - } - else - e = rsa->e; - - - if ((RAND_status() == 0) && rsa->d != NULL && rsa->d->d != NULL) - { - /* if PRNG is not properly seeded, resort to secret - * exponent as unpredictable seed */ - RAND_add(rsa->d->d, rsa->d->dmax * sizeof rsa->d->d[0], 0.0); - } - - if (!(rsa->flags & RSA_FLAG_NO_CONSTTIME)) - { - /* Set BN_FLG_CONSTTIME flag */ - n = &local_n; - BN_with_flags(n, rsa->n, BN_FLG_CONSTTIME); - } - else - n = rsa->n; - - ret = BN_BLINDING_create_param(NULL, e, n, ctx, - rsa->meth->bn_mod_exp, rsa->_method_mod_n); - if (ret == NULL) - { - RSAerr(RSA_F_RSA_SETUP_BLINDING, ERR_R_BN_LIB); - goto err; - } - CRYPTO_THREADID_current(BN_BLINDING_thread_id(ret)); -err: - BN_CTX_end(ctx); - if (in_ctx == NULL) - BN_CTX_free(ctx); - if(rsa->e == NULL) - BN_free(e); - - return ret; -} - int RSA_memory_lock(RSA *r) { int i,j,k,off; diff --git a/crypto/openssl/crypto/rsa/rsa_oaep.c b/crypto/openssl/crypto/rsa/rsa_oaep.c index 18d307ea9e..553d212ebe 100644 --- a/crypto/openssl/crypto/rsa/rsa_oaep.c +++ b/crypto/openssl/crypto/rsa/rsa_oaep.c @@ -56,7 +56,8 @@ int RSA_padding_add_PKCS1_OAEP(unsigned char *to, int tlen, seed = to + 1; db = to + SHA_DIGEST_LENGTH + 1; - EVP_Digest((void *)param, plen, db, NULL, EVP_sha1(), NULL); + if (!EVP_Digest((void *)param, plen, db, NULL, EVP_sha1(), NULL)) + return 0; memset(db + SHA_DIGEST_LENGTH, 0, emlen - flen - 2 * SHA_DIGEST_LENGTH - 1); db[emlen - flen - SHA_DIGEST_LENGTH - 1] = 0x01; @@ -145,7 +146,8 @@ int RSA_padding_check_PKCS1_OAEP(unsigned char *to, int tlen, for (i = 0; i < dblen; i++) db[i] ^= maskeddb[i]; - EVP_Digest((void *)param, plen, phash, NULL, EVP_sha1(), NULL); + if (!EVP_Digest((void *)param, plen, phash, NULL, EVP_sha1(), NULL)) + return -1; if (memcmp(db, phash, SHA_DIGEST_LENGTH) != 0 || bad) goto decoding_err; diff --git a/crypto/openssl/crypto/rsa/rsa_pmeth.c b/crypto/openssl/crypto/rsa/rsa_pmeth.c index c6892ecd09..5b2ecf56ad 100644 --- a/crypto/openssl/crypto/rsa/rsa_pmeth.c +++ b/crypto/openssl/crypto/rsa/rsa_pmeth.c @@ -63,6 +63,12 @@ #include #include #include +#ifndef OPENSSL_NO_CMS +#include +#endif +#ifdef OPENSSL_FIPS +#include +#endif #include "evp_locl.h" #include "rsa_locl.h" @@ -79,6 +85,8 @@ typedef struct int pad_mode; /* message digest */ const EVP_MD *md; + /* message digest for MGF1 */ + const EVP_MD *mgf1md; /* PSS/OAEP salt length */ int saltlen; /* Temp buffer */ @@ -95,6 +103,7 @@ static int pkey_rsa_init(EVP_PKEY_CTX *ctx) rctx->pub_exp = NULL; rctx->pad_mode = RSA_PKCS1_PADDING; rctx->md = NULL; + rctx->mgf1md = NULL; rctx->tbuf = NULL; rctx->saltlen = -2; @@ -147,6 +156,31 @@ static void pkey_rsa_cleanup(EVP_PKEY_CTX *ctx) OPENSSL_free(rctx); } } +#ifdef OPENSSL_FIPS +/* FIP checker. Return value indicates status of context parameters: + * 1 : redirect to FIPS. + * 0 : don't redirect to FIPS. + * -1 : illegal operation in FIPS mode. + */ + +static int pkey_fips_check_ctx(EVP_PKEY_CTX *ctx) + { + RSA_PKEY_CTX *rctx = ctx->data; + RSA *rsa = ctx->pkey->pkey.rsa; + int rv = -1; + if (!FIPS_mode()) + return 0; + if (rsa->flags & RSA_FLAG_NON_FIPS_ALLOW) + rv = 0; + if (!(rsa->meth->flags & RSA_FLAG_FIPS_METHOD) && rv) + return -1; + if (rctx->md && !(rctx->md->flags & EVP_MD_FLAG_FIPS)) + return rv; + if (rctx->mgf1md && !(rctx->mgf1md->flags & EVP_MD_FLAG_FIPS)) + return rv; + return 1; + } +#endif static int pkey_rsa_sign(EVP_PKEY_CTX *ctx, unsigned char *sig, size_t *siglen, const unsigned char *tbs, size_t tbslen) @@ -155,6 +189,15 @@ static int pkey_rsa_sign(EVP_PKEY_CTX *ctx, unsigned char *sig, size_t *siglen, RSA_PKEY_CTX *rctx = ctx->data; RSA *rsa = ctx->pkey->pkey.rsa; +#ifdef OPENSSL_FIPS + ret = pkey_fips_check_ctx(ctx); + if (ret < 0) + { + RSAerr(RSA_F_PKEY_RSA_SIGN, RSA_R_OPERATION_NOT_ALLOWED_IN_FIPS_MODE); + return -1; + } +#endif + if (rctx->md) { if (tbslen != (size_t)EVP_MD_size(rctx->md)) @@ -163,7 +206,36 @@ static int pkey_rsa_sign(EVP_PKEY_CTX *ctx, unsigned char *sig, size_t *siglen, RSA_R_INVALID_DIGEST_LENGTH); return -1; } - if (rctx->pad_mode == RSA_X931_PADDING) +#ifdef OPENSSL_FIPS + if (ret > 0) + { + unsigned int slen; + ret = FIPS_rsa_sign_digest(rsa, tbs, tbslen, rctx->md, + rctx->pad_mode, + rctx->saltlen, + rctx->mgf1md, + sig, &slen); + if (ret > 0) + *siglen = slen; + else + *siglen = 0; + return ret; + } +#endif + + if (EVP_MD_type(rctx->md) == NID_mdc2) + { + unsigned int sltmp; + if (rctx->pad_mode != RSA_PKCS1_PADDING) + return -1; + ret = RSA_sign_ASN1_OCTET_STRING(NID_mdc2, + tbs, tbslen, sig, &sltmp, rsa); + + if (ret <= 0) + return ret; + ret = sltmp; + } + else if (rctx->pad_mode == RSA_X931_PADDING) { if (!setup_tbuf(rctx, ctx)) return -1; @@ -186,8 +258,10 @@ static int pkey_rsa_sign(EVP_PKEY_CTX *ctx, unsigned char *sig, size_t *siglen, { if (!setup_tbuf(rctx, ctx)) return -1; - if (!RSA_padding_add_PKCS1_PSS(rsa, rctx->tbuf, tbs, - rctx->md, rctx->saltlen)) + if (!RSA_padding_add_PKCS1_PSS_mgf1(rsa, + rctx->tbuf, tbs, + rctx->md, rctx->mgf1md, + rctx->saltlen)) return -1; ret = RSA_private_encrypt(RSA_size(rsa), rctx->tbuf, sig, rsa, RSA_NO_PADDING); @@ -269,8 +343,30 @@ static int pkey_rsa_verify(EVP_PKEY_CTX *ctx, RSA_PKEY_CTX *rctx = ctx->data; RSA *rsa = ctx->pkey->pkey.rsa; size_t rslen; +#ifdef OPENSSL_FIPS + int rv; + rv = pkey_fips_check_ctx(ctx); + if (rv < 0) + { + RSAerr(RSA_F_PKEY_RSA_VERIFY, RSA_R_OPERATION_NOT_ALLOWED_IN_FIPS_MODE); + return -1; + } +#endif if (rctx->md) { +#ifdef OPENSSL_FIPS + if (rv > 0) + { + return FIPS_rsa_verify_digest(rsa, + tbs, tbslen, + rctx->md, + rctx->pad_mode, + rctx->saltlen, + rctx->mgf1md, + sig, siglen); + + } +#endif if (rctx->pad_mode == RSA_PKCS1_PADDING) return RSA_verify(EVP_MD_type(rctx->md), tbs, tbslen, sig, siglen, rsa); @@ -289,7 +385,8 @@ static int pkey_rsa_verify(EVP_PKEY_CTX *ctx, rsa, RSA_NO_PADDING); if (ret <= 0) return 0; - ret = RSA_verify_PKCS1_PSS(rsa, tbs, rctx->md, + ret = RSA_verify_PKCS1_PSS_mgf1(rsa, tbs, + rctx->md, rctx->mgf1md, rctx->tbuf, rctx->saltlen); if (ret <= 0) return 0; @@ -403,15 +500,25 @@ static int pkey_rsa_ctrl(EVP_PKEY_CTX *ctx, int type, int p1, void *p2) RSA_R_ILLEGAL_OR_UNSUPPORTED_PADDING_MODE); return -2; + case EVP_PKEY_CTRL_GET_RSA_PADDING: + *(int *)p2 = rctx->pad_mode; + return 1; + case EVP_PKEY_CTRL_RSA_PSS_SALTLEN: - if (p1 < -2) - return -2; + case EVP_PKEY_CTRL_GET_RSA_PSS_SALTLEN: if (rctx->pad_mode != RSA_PKCS1_PSS_PADDING) { RSAerr(RSA_F_PKEY_RSA_CTRL, RSA_R_INVALID_PSS_SALTLEN); return -2; } - rctx->saltlen = p1; + if (type == EVP_PKEY_CTRL_GET_RSA_PSS_SALTLEN) + *(int *)p2 = rctx->saltlen; + else + { + if (p1 < -2) + return -2; + rctx->saltlen = p1; + } return 1; case EVP_PKEY_CTRL_RSA_KEYGEN_BITS: @@ -435,16 +542,45 @@ static int pkey_rsa_ctrl(EVP_PKEY_CTX *ctx, int type, int p1, void *p2) rctx->md = p2; return 1; + case EVP_PKEY_CTRL_RSA_MGF1_MD: + case EVP_PKEY_CTRL_GET_RSA_MGF1_MD: + if (rctx->pad_mode != RSA_PKCS1_PSS_PADDING) + { + RSAerr(RSA_F_PKEY_RSA_CTRL, RSA_R_INVALID_MGF1_MD); + return -2; + } + if (type == EVP_PKEY_CTRL_GET_RSA_MGF1_MD) + { + if (rctx->mgf1md) + *(const EVP_MD **)p2 = rctx->mgf1md; + else + *(const EVP_MD **)p2 = rctx->md; + } + else + rctx->mgf1md = p2; + return 1; + case EVP_PKEY_CTRL_DIGESTINIT: case EVP_PKEY_CTRL_PKCS7_ENCRYPT: case EVP_PKEY_CTRL_PKCS7_DECRYPT: case EVP_PKEY_CTRL_PKCS7_SIGN: + return 1; #ifndef OPENSSL_NO_CMS - case EVP_PKEY_CTRL_CMS_ENCRYPT: case EVP_PKEY_CTRL_CMS_DECRYPT: + { + X509_ALGOR *alg = NULL; + ASN1_OBJECT *encalg = NULL; + if (p2) + CMS_RecipientInfo_ktri_get0_algs(p2, NULL, NULL, &alg); + if (alg) + X509_ALGOR_get0(&encalg, NULL, NULL, alg); + if (encalg && OBJ_obj2nid(encalg) == NID_rsaesOaep) + rctx->pad_mode = RSA_PKCS1_OAEP_PADDING; + } + case EVP_PKEY_CTRL_CMS_ENCRYPT: case EVP_PKEY_CTRL_CMS_SIGN: -#endif return 1; +#endif case EVP_PKEY_CTRL_PEER_KEY: RSAerr(RSA_F_PKEY_RSA_CTRL, RSA_R_OPERATION_NOT_SUPPORTED_FOR_THIS_KEYTYPE); diff --git a/crypto/openssl/crypto/rsa/rsa_pss.c b/crypto/openssl/crypto/rsa/rsa_pss.c index ac211e2ffe..5f9f533d0c 100644 --- a/crypto/openssl/crypto/rsa/rsa_pss.c +++ b/crypto/openssl/crypto/rsa/rsa_pss.c @@ -73,6 +73,13 @@ static const unsigned char zeroes[] = {0,0,0,0,0,0,0,0}; int RSA_verify_PKCS1_PSS(RSA *rsa, const unsigned char *mHash, const EVP_MD *Hash, const unsigned char *EM, int sLen) { + return RSA_verify_PKCS1_PSS_mgf1(rsa, mHash, Hash, NULL, EM, sLen); + } + +int RSA_verify_PKCS1_PSS_mgf1(RSA *rsa, const unsigned char *mHash, + const EVP_MD *Hash, const EVP_MD *mgf1Hash, + const unsigned char *EM, int sLen) + { int i; int ret = 0; int hLen, maskedDBLen, MSBits, emLen; @@ -80,6 +87,10 @@ int RSA_verify_PKCS1_PSS(RSA *rsa, const unsigned char *mHash, unsigned char *DB = NULL; EVP_MD_CTX ctx; unsigned char H_[EVP_MAX_MD_SIZE]; + EVP_MD_CTX_init(&ctx); + + if (mgf1Hash == NULL) + mgf1Hash = Hash; hLen = EVP_MD_size(Hash); if (hLen < 0) @@ -94,7 +105,7 @@ int RSA_verify_PKCS1_PSS(RSA *rsa, const unsigned char *mHash, else if (sLen == -2) sLen = -2; else if (sLen < -2) { - RSAerr(RSA_F_RSA_VERIFY_PKCS1_PSS, RSA_R_SLEN_CHECK_FAILED); + RSAerr(RSA_F_RSA_VERIFY_PKCS1_PSS_MGF1, RSA_R_SLEN_CHECK_FAILED); goto err; } @@ -102,7 +113,7 @@ int RSA_verify_PKCS1_PSS(RSA *rsa, const unsigned char *mHash, emLen = RSA_size(rsa); if (EM[0] & (0xFF << MSBits)) { - RSAerr(RSA_F_RSA_VERIFY_PKCS1_PSS, RSA_R_FIRST_OCTET_INVALID); + RSAerr(RSA_F_RSA_VERIFY_PKCS1_PSS_MGF1, RSA_R_FIRST_OCTET_INVALID); goto err; } if (MSBits == 0) @@ -112,12 +123,12 @@ int RSA_verify_PKCS1_PSS(RSA *rsa, const unsigned char *mHash, } if (emLen < (hLen + sLen + 2)) /* sLen can be small negative */ { - RSAerr(RSA_F_RSA_VERIFY_PKCS1_PSS, RSA_R_DATA_TOO_LARGE); + RSAerr(RSA_F_RSA_VERIFY_PKCS1_PSS_MGF1, RSA_R_DATA_TOO_LARGE); goto err; } if (EM[emLen - 1] != 0xbc) { - RSAerr(RSA_F_RSA_VERIFY_PKCS1_PSS, RSA_R_LAST_OCTET_INVALID); + RSAerr(RSA_F_RSA_VERIFY_PKCS1_PSS_MGF1, RSA_R_LAST_OCTET_INVALID); goto err; } maskedDBLen = emLen - hLen - 1; @@ -125,10 +136,10 @@ int RSA_verify_PKCS1_PSS(RSA *rsa, const unsigned char *mHash, DB = OPENSSL_malloc(maskedDBLen); if (!DB) { - RSAerr(RSA_F_RSA_VERIFY_PKCS1_PSS, ERR_R_MALLOC_FAILURE); + RSAerr(RSA_F_RSA_VERIFY_PKCS1_PSS_MGF1, ERR_R_MALLOC_FAILURE); goto err; } - if (PKCS1_MGF1(DB, maskedDBLen, H, hLen, Hash) < 0) + if (PKCS1_MGF1(DB, maskedDBLen, H, hLen, mgf1Hash) < 0) goto err; for (i = 0; i < maskedDBLen; i++) DB[i] ^= EM[i]; @@ -137,25 +148,28 @@ int RSA_verify_PKCS1_PSS(RSA *rsa, const unsigned char *mHash, for (i = 0; DB[i] == 0 && i < (maskedDBLen-1); i++) ; if (DB[i++] != 0x1) { - RSAerr(RSA_F_RSA_VERIFY_PKCS1_PSS, RSA_R_SLEN_RECOVERY_FAILED); + RSAerr(RSA_F_RSA_VERIFY_PKCS1_PSS_MGF1, RSA_R_SLEN_RECOVERY_FAILED); goto err; } if (sLen >= 0 && (maskedDBLen - i) != sLen) { - RSAerr(RSA_F_RSA_VERIFY_PKCS1_PSS, RSA_R_SLEN_CHECK_FAILED); + RSAerr(RSA_F_RSA_VERIFY_PKCS1_PSS_MGF1, RSA_R_SLEN_CHECK_FAILED); goto err; } - EVP_MD_CTX_init(&ctx); - EVP_DigestInit_ex(&ctx, Hash, NULL); - EVP_DigestUpdate(&ctx, zeroes, sizeof zeroes); - EVP_DigestUpdate(&ctx, mHash, hLen); + if (!EVP_DigestInit_ex(&ctx, Hash, NULL) + || !EVP_DigestUpdate(&ctx, zeroes, sizeof zeroes) + || !EVP_DigestUpdate(&ctx, mHash, hLen)) + goto err; if (maskedDBLen - i) - EVP_DigestUpdate(&ctx, DB + i, maskedDBLen - i); - EVP_DigestFinal(&ctx, H_, NULL); - EVP_MD_CTX_cleanup(&ctx); + { + if (!EVP_DigestUpdate(&ctx, DB + i, maskedDBLen - i)) + goto err; + } + if (!EVP_DigestFinal_ex(&ctx, H_, NULL)) + goto err; if (memcmp(H_, H, hLen)) { - RSAerr(RSA_F_RSA_VERIFY_PKCS1_PSS, RSA_R_BAD_SIGNATURE); + RSAerr(RSA_F_RSA_VERIFY_PKCS1_PSS_MGF1, RSA_R_BAD_SIGNATURE); ret = 0; } else @@ -164,6 +178,7 @@ int RSA_verify_PKCS1_PSS(RSA *rsa, const unsigned char *mHash, err: if (DB) OPENSSL_free(DB); + EVP_MD_CTX_cleanup(&ctx); return ret; @@ -173,12 +188,22 @@ int RSA_padding_add_PKCS1_PSS(RSA *rsa, unsigned char *EM, const unsigned char *mHash, const EVP_MD *Hash, int sLen) { + return RSA_padding_add_PKCS1_PSS_mgf1(rsa, EM, mHash, Hash, NULL, sLen); + } + +int RSA_padding_add_PKCS1_PSS_mgf1(RSA *rsa, unsigned char *EM, + const unsigned char *mHash, + const EVP_MD *Hash, const EVP_MD *mgf1Hash, int sLen) + { int i; int ret = 0; int hLen, maskedDBLen, MSBits, emLen; unsigned char *H, *salt = NULL, *p; EVP_MD_CTX ctx; + if (mgf1Hash == NULL) + mgf1Hash = Hash; + hLen = EVP_MD_size(Hash); if (hLen < 0) goto err; @@ -192,7 +217,7 @@ int RSA_padding_add_PKCS1_PSS(RSA *rsa, unsigned char *EM, else if (sLen == -2) sLen = -2; else if (sLen < -2) { - RSAerr(RSA_F_RSA_PADDING_ADD_PKCS1_PSS, RSA_R_SLEN_CHECK_FAILED); + RSAerr(RSA_F_RSA_PADDING_ADD_PKCS1_PSS_MGF1, RSA_R_SLEN_CHECK_FAILED); goto err; } @@ -209,8 +234,7 @@ int RSA_padding_add_PKCS1_PSS(RSA *rsa, unsigned char *EM, } else if (emLen < (hLen + sLen + 2)) { - RSAerr(RSA_F_RSA_PADDING_ADD_PKCS1_PSS, - RSA_R_DATA_TOO_LARGE_FOR_KEY_SIZE); + RSAerr(RSA_F_RSA_PADDING_ADD_PKCS1_PSS_MGF1,RSA_R_DATA_TOO_LARGE_FOR_KEY_SIZE); goto err; } if (sLen > 0) @@ -218,8 +242,7 @@ int RSA_padding_add_PKCS1_PSS(RSA *rsa, unsigned char *EM, salt = OPENSSL_malloc(sLen); if (!salt) { - RSAerr(RSA_F_RSA_PADDING_ADD_PKCS1_PSS, - ERR_R_MALLOC_FAILURE); + RSAerr(RSA_F_RSA_PADDING_ADD_PKCS1_PSS_MGF1,ERR_R_MALLOC_FAILURE); goto err; } if (RAND_bytes(salt, sLen) <= 0) @@ -228,16 +251,18 @@ int RSA_padding_add_PKCS1_PSS(RSA *rsa, unsigned char *EM, maskedDBLen = emLen - hLen - 1; H = EM + maskedDBLen; EVP_MD_CTX_init(&ctx); - EVP_DigestInit_ex(&ctx, Hash, NULL); - EVP_DigestUpdate(&ctx, zeroes, sizeof zeroes); - EVP_DigestUpdate(&ctx, mHash, hLen); - if (sLen) - EVP_DigestUpdate(&ctx, salt, sLen); - EVP_DigestFinal(&ctx, H, NULL); + if (!EVP_DigestInit_ex(&ctx, Hash, NULL) + || !EVP_DigestUpdate(&ctx, zeroes, sizeof zeroes) + || !EVP_DigestUpdate(&ctx, mHash, hLen)) + goto err; + if (sLen && !EVP_DigestUpdate(&ctx, salt, sLen)) + goto err; + if (!EVP_DigestFinal_ex(&ctx, H, NULL)) + goto err; EVP_MD_CTX_cleanup(&ctx); /* Generate dbMask in place then perform XOR on it */ - if (PKCS1_MGF1(EM, maskedDBLen, H, hLen, Hash)) + if (PKCS1_MGF1(EM, maskedDBLen, H, hLen, mgf1Hash)) goto err; p = EM; diff --git a/crypto/openssl/crypto/rsa/rsa_sign.c b/crypto/openssl/crypto/rsa/rsa_sign.c index 0be4ec7fb0..b6f6037ae0 100644 --- a/crypto/openssl/crypto/rsa/rsa_sign.c +++ b/crypto/openssl/crypto/rsa/rsa_sign.c @@ -77,6 +77,14 @@ int RSA_sign(int type, const unsigned char *m, unsigned int m_len, const unsigned char *s = NULL; X509_ALGOR algor; ASN1_OCTET_STRING digest; +#ifdef OPENSSL_FIPS + if (FIPS_mode() && !(rsa->meth->flags & RSA_FLAG_FIPS_METHOD) + && !(rsa->flags & RSA_FLAG_NON_FIPS_ALLOW)) + { + RSAerr(RSA_F_RSA_SIGN, RSA_R_NON_FIPS_RSA_METHOD); + return 0; + } +#endif if((rsa->flags & RSA_FLAG_SIGN_VER) && rsa->meth->rsa_sign) { return rsa->meth->rsa_sign(type, m, m_len, @@ -153,6 +161,15 @@ int int_rsa_verify(int dtype, const unsigned char *m, unsigned char *s; X509_SIG *sig=NULL; +#ifdef OPENSSL_FIPS + if (FIPS_mode() && !(rsa->meth->flags & RSA_FLAG_FIPS_METHOD) + && !(rsa->flags & RSA_FLAG_NON_FIPS_ALLOW)) + { + RSAerr(RSA_F_INT_RSA_VERIFY, RSA_R_NON_FIPS_RSA_METHOD); + return 0; + } +#endif + if (siglen != (unsigned int)RSA_size(rsa)) { RSAerr(RSA_F_INT_RSA_VERIFY,RSA_R_WRONG_SIGNATURE_LENGTH); @@ -182,6 +199,22 @@ int int_rsa_verify(int dtype, const unsigned char *m, i=RSA_public_decrypt((int)siglen,sigbuf,s,rsa,RSA_PKCS1_PADDING); if (i <= 0) goto err; + /* Oddball MDC2 case: signature can be OCTET STRING. + * check for correct tag and length octets. + */ + if (dtype == NID_mdc2 && i == 18 && s[0] == 0x04 && s[1] == 0x10) + { + if (rm) + { + memcpy(rm, s + 2, 16); + *prm_len = 16; + ret = 1; + } + else if(memcmp(m, s + 2, 16)) + RSAerr(RSA_F_INT_RSA_VERIFY,RSA_R_BAD_SIGNATURE); + else + ret = 1; + } /* Special case: SSL signature */ if(dtype == NID_md5_sha1) { diff --git a/crypto/openssl/crypto/seed/seed.c b/crypto/openssl/crypto/seed/seed.c index 2bc384a19f..3e675a8d75 100644 --- a/crypto/openssl/crypto/seed/seed.c +++ b/crypto/openssl/crypto/seed/seed.c @@ -32,9 +32,14 @@ #include #endif +#include #include #include "seed_locl.h" +#ifdef SS /* can get defined on Solaris by inclusion of */ +#undef SS +#endif + static const seed_word SS[4][256] = { { 0x2989a1a8, 0x05858184, 0x16c6d2d4, 0x13c3d3d0, 0x14445054, 0x1d0d111c, 0x2c8ca0ac, 0x25052124, 0x1d4d515c, 0x03434340, 0x18081018, 0x1e0e121c, 0x11415150, 0x3cccf0fc, 0x0acac2c8, 0x23436360, @@ -192,8 +197,14 @@ static const seed_word KC[] = { KC0, KC1, KC2, KC3, KC4, KC5, KC6, KC7, KC8, KC9, KC10, KC11, KC12, KC13, KC14, KC15 }; #endif - void SEED_set_key(const unsigned char rawkey[SEED_KEY_LENGTH], SEED_KEY_SCHEDULE *ks) +#ifdef OPENSSL_FIPS + { + fips_cipher_abort(SEED); + private_SEED_set_key(rawkey, ks); + } +void private_SEED_set_key(const unsigned char rawkey[SEED_KEY_LENGTH], SEED_KEY_SCHEDULE *ks) +#endif { seed_word x1, x2, x3, x4; seed_word t0, t1; diff --git a/crypto/openssl/crypto/seed/seed.h b/crypto/openssl/crypto/seed/seed.h index 6ffa5f024e..c50fdd3607 100644 --- a/crypto/openssl/crypto/seed/seed.h +++ b/crypto/openssl/crypto/seed/seed.h @@ -116,7 +116,9 @@ typedef struct seed_key_st { #endif } SEED_KEY_SCHEDULE; - +#ifdef OPENSSL_FIPS +void private_SEED_set_key(const unsigned char rawkey[SEED_KEY_LENGTH], SEED_KEY_SCHEDULE *ks); +#endif void SEED_set_key(const unsigned char rawkey[SEED_KEY_LENGTH], SEED_KEY_SCHEDULE *ks); void SEED_encrypt(const unsigned char s[SEED_BLOCK_SIZE], unsigned char d[SEED_BLOCK_SIZE], const SEED_KEY_SCHEDULE *ks); diff --git a/crypto/openssl/crypto/sha/asm/sha1-586.pl b/crypto/openssl/crypto/sha/asm/sha1-586.pl index a1f876281a..1084d227fe 100644 --- a/crypto/openssl/crypto/sha/asm/sha1-586.pl +++ b/crypto/openssl/crypto/sha/asm/sha1-586.pl @@ -12,6 +12,8 @@ # commentary below], and in 2006 the rest was rewritten in order to # gain freedom to liberate licensing terms. +# January, September 2004. +# # It was noted that Intel IA-32 C compiler generates code which # performs ~30% *faster* on P4 CPU than original *hand-coded* # SHA1 assembler implementation. To address this problem (and @@ -31,12 +33,92 @@ # ---------------------------------------------------------------- # +# August 2009. +# +# George Spelvin has tipped that F_40_59(b,c,d) can be rewritten as +# '(c&d) + (b&(c^d))', which allows to accumulate partial results +# and lighten "pressure" on scratch registers. This resulted in +# >12% performance improvement on contemporary AMD cores (with no +# degradation on other CPUs:-). Also, the code was revised to maximize +# "distance" between instructions producing input to 'lea' instruction +# and the 'lea' instruction itself, which is essential for Intel Atom +# core and resulted in ~15% improvement. + +# October 2010. +# +# Add SSSE3, Supplemental[!] SSE3, implementation. The idea behind it +# is to offload message schedule denoted by Wt in NIST specification, +# or Xupdate in OpenSSL source, to SIMD unit. The idea is not novel, +# and in SSE2 context was first explored by Dean Gaudet in 2004, see +# http://arctic.org/~dean/crypto/sha1.html. Since then several things +# have changed that made it interesting again: +# +# a) XMM units became faster and wider; +# b) instruction set became more versatile; +# c) an important observation was made by Max Locktykhin, which made +# it possible to reduce amount of instructions required to perform +# the operation in question, for further details see +# http://software.intel.com/en-us/articles/improving-the-performance-of-the-secure-hash-algorithm-1/. + +# April 2011. +# +# Add AVX code path, probably most controversial... The thing is that +# switch to AVX alone improves performance by as little as 4% in +# comparison to SSSE3 code path. But below result doesn't look like +# 4% improvement... Trouble is that Sandy Bridge decodes 'ro[rl]' as +# pair of µ-ops, and it's the additional µ-ops, two per round, that +# make it run slower than Core2 and Westmere. But 'sh[rl]d' is decoded +# as single µ-op by Sandy Bridge and it's replacing 'ro[rl]' with +# equivalent 'sh[rl]d' that is responsible for the impressive 5.1 +# cycles per processed byte. But 'sh[rl]d' is not something that used +# to be fast, nor does it appear to be fast in upcoming Bulldozer +# [according to its optimization manual]. Which is why AVX code path +# is guarded by *both* AVX and synthetic bit denoting Intel CPUs. +# One can argue that it's unfair to AMD, but without 'sh[rl]d' it +# makes no sense to keep the AVX code path. If somebody feels that +# strongly, it's probably more appropriate to discuss possibility of +# using vector rotate XOP on AMD... + +###################################################################### +# Current performance is summarized in following table. Numbers are +# CPU clock cycles spent to process single byte (less is better). +# +# x86 SSSE3 AVX +# Pentium 15.7 - +# PIII 11.5 - +# P4 10.6 - +# AMD K8 7.1 - +# Core2 7.3 6.1/+20% - +# Atom 12.5 9.5(*)/+32% - +# Westmere 7.3 5.6/+30% - +# Sandy Bridge 8.8 6.2/+40% 5.1(**)/+70% +# +# (*) Loop is 1056 instructions long and expected result is ~8.25. +# It remains mystery [to me] why ILP is limited to 1.7. +# +# (**) As per above comment, the result is for AVX *plus* sh[rl]d. + $0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1; push(@INC,"${dir}","${dir}../../perlasm"); require "x86asm.pl"; &asm_init($ARGV[0],"sha1-586.pl",$ARGV[$#ARGV] eq "386"); +$xmm=$ymm=0; +for (@ARGV) { $xmm=1 if (/-DOPENSSL_IA32_SSE2/); } + +$ymm=1 if ($xmm && + `$ENV{CC} -Wa,-v -c -o /dev/null -x assembler /dev/null 2>&1` + =~ /GNU assembler version ([2-9]\.[0-9]+)/ && + $1>=2.19); # first version supporting AVX + +$ymm=1 if ($xmm && !$ymm && $ARGV[0] eq "win32n" && + `nasm -v 2>&1` =~ /NASM version ([2-9]\.[0-9]+)/ && + $1>=2.03); # first version supporting AVX + +&external_label("OPENSSL_ia32cap_P") if ($xmm); + + $A="eax"; $B="ebx"; $C="ecx"; @@ -47,6 +129,10 @@ $tmp1="ebp"; @V=($A,$B,$C,$D,$E,$T); +$alt=0; # 1 denotes alternative IALU implementation, which performs + # 8% *worse* on P4, same on Westmere and Atom, 2% better on + # Sandy Bridge... + sub BODY_00_15 { local($n,$a,$b,$c,$d,$e,$f)=@_; @@ -59,16 +145,18 @@ sub BODY_00_15 &rotl($tmp1,5); # tmp1=ROTATE(a,5) &xor($f,$d); &add($tmp1,$e); # tmp1+=e; - &and($f,$b); - &mov($e,&swtmp($n%16)); # e becomes volatile and is loaded + &mov($e,&swtmp($n%16)); # e becomes volatile and is loaded # with xi, also note that e becomes # f in next round... - &xor($f,$d); # f holds F_00_19(b,c,d) + &and($f,$b); &rotr($b,2); # b=ROTATE(b,30) - &lea($tmp1,&DWP(0x5a827999,$tmp1,$e)); # tmp1+=K_00_19+xi + &xor($f,$d); # f holds F_00_19(b,c,d) + &lea($tmp1,&DWP(0x5a827999,$tmp1,$e)); # tmp1+=K_00_19+xi - if ($n==15) { &add($f,$tmp1); } # f+=tmp1 + if ($n==15) { &mov($e,&swtmp(($n+1)%16));# pre-fetch f for next round + &add($f,$tmp1); } # f+=tmp1 else { &add($tmp1,$f); } # f becomes a in next round + &mov($tmp1,$a) if ($alt && $n==15); } sub BODY_16_19 @@ -77,22 +165,41 @@ sub BODY_16_19 &comment("16_19 $n"); - &mov($f,&swtmp($n%16)); # f to hold Xupdate(xi,xa,xb,xc,xd) - &mov($tmp1,$c); # tmp1 to hold F_00_19(b,c,d) - &xor($f,&swtmp(($n+2)%16)); - &xor($tmp1,$d); - &xor($f,&swtmp(($n+8)%16)); - &and($tmp1,$b); # tmp1 holds F_00_19(b,c,d) - &rotr($b,2); # b=ROTATE(b,30) +if ($alt) { + &xor($c,$d); + &xor($f,&swtmp(($n+2)%16)); # f to hold Xupdate(xi,xa,xb,xc,xd) + &and($tmp1,$c); # tmp1 to hold F_00_19(b,c,d), b&=c^d + &xor($f,&swtmp(($n+8)%16)); + &xor($tmp1,$d); # tmp1=F_00_19(b,c,d) + &xor($f,&swtmp(($n+13)%16)); # f holds xa^xb^xc^xd + &rotl($f,1); # f=ROTATE(f,1) + &add($e,$tmp1); # e+=F_00_19(b,c,d) + &xor($c,$d); # restore $c + &mov($tmp1,$a); # b in next round + &rotr($b,$n==16?2:7); # b=ROTATE(b,30) + &mov(&swtmp($n%16),$f); # xi=f + &rotl($a,5); # ROTATE(a,5) + &lea($f,&DWP(0x5a827999,$f,$e));# f+=F_00_19(b,c,d)+e + &mov($e,&swtmp(($n+1)%16)); # pre-fetch f for next round + &add($f,$a); # f+=ROTATE(a,5) +} else { + &mov($tmp1,$c); # tmp1 to hold F_00_19(b,c,d) + &xor($f,&swtmp(($n+2)%16)); # f to hold Xupdate(xi,xa,xb,xc,xd) + &xor($tmp1,$d); + &xor($f,&swtmp(($n+8)%16)); + &and($tmp1,$b); &xor($f,&swtmp(($n+13)%16)); # f holds xa^xb^xc^xd &rotl($f,1); # f=ROTATE(f,1) &xor($tmp1,$d); # tmp1=F_00_19(b,c,d) - &mov(&swtmp($n%16),$f); # xi=f - &lea($f,&DWP(0x5a827999,$f,$e));# f+=K_00_19+e - &mov($e,$a); # e becomes volatile - &rotl($e,5); # e=ROTATE(a,5) - &add($f,$tmp1); # f+=F_00_19(b,c,d) - &add($f,$e); # f+=ROTATE(a,5) + &add($e,$tmp1); # e+=F_00_19(b,c,d) + &mov($tmp1,$a); + &rotr($b,2); # b=ROTATE(b,30) + &mov(&swtmp($n%16),$f); # xi=f + &rotl($tmp1,5); # ROTATE(a,5) + &lea($f,&DWP(0x5a827999,$f,$e));# f+=F_00_19(b,c,d)+e + &mov($e,&swtmp(($n+1)%16)); # pre-fetch f for next round + &add($f,$tmp1); # f+=ROTATE(a,5) +} } sub BODY_20_39 @@ -102,21 +209,41 @@ sub BODY_20_39 &comment("20_39 $n"); +if ($alt) { + &xor($tmp1,$c); # tmp1 to hold F_20_39(b,c,d), b^=c + &xor($f,&swtmp(($n+2)%16)); # f to hold Xupdate(xi,xa,xb,xc,xd) + &xor($tmp1,$d); # tmp1 holds F_20_39(b,c,d) + &xor($f,&swtmp(($n+8)%16)); + &add($e,$tmp1); # e+=F_20_39(b,c,d) + &xor($f,&swtmp(($n+13)%16)); # f holds xa^xb^xc^xd + &rotl($f,1); # f=ROTATE(f,1) + &mov($tmp1,$a); # b in next round + &rotr($b,7); # b=ROTATE(b,30) + &mov(&swtmp($n%16),$f) if($n<77);# xi=f + &rotl($a,5); # ROTATE(a,5) + &xor($b,$c) if($n==39);# warm up for BODY_40_59 + &and($tmp1,$b) if($n==39); + &lea($f,&DWP($K,$f,$e)); # f+=e+K_XX_YY + &mov($e,&swtmp(($n+1)%16)) if($n<79);# pre-fetch f for next round + &add($f,$a); # f+=ROTATE(a,5) + &rotr($a,5) if ($n==79); +} else { &mov($tmp1,$b); # tmp1 to hold F_20_39(b,c,d) - &mov($f,&swtmp($n%16)); # f to hold Xupdate(xi,xa,xb,xc,xd) - &rotr($b,2); # b=ROTATE(b,30) - &xor($f,&swtmp(($n+2)%16)); + &xor($f,&swtmp(($n+2)%16)); # f to hold Xupdate(xi,xa,xb,xc,xd) &xor($tmp1,$c); &xor($f,&swtmp(($n+8)%16)); &xor($tmp1,$d); # tmp1 holds F_20_39(b,c,d) &xor($f,&swtmp(($n+13)%16)); # f holds xa^xb^xc^xd &rotl($f,1); # f=ROTATE(f,1) - &add($tmp1,$e); - &mov(&swtmp($n%16),$f); # xi=f - &mov($e,$a); # e becomes volatile - &rotl($e,5); # e=ROTATE(a,5) - &lea($f,&DWP($K,$f,$tmp1)); # f+=K_20_39+e - &add($f,$e); # f+=ROTATE(a,5) + &add($e,$tmp1); # e+=F_20_39(b,c,d) + &rotr($b,2); # b=ROTATE(b,30) + &mov($tmp1,$a); + &rotl($tmp1,5); # ROTATE(a,5) + &mov(&swtmp($n%16),$f) if($n<77);# xi=f + &lea($f,&DWP($K,$f,$e)); # f+=e+K_XX_YY + &mov($e,&swtmp(($n+1)%16)) if($n<79);# pre-fetch f for next round + &add($f,$tmp1); # f+=ROTATE(a,5) +} } sub BODY_40_59 @@ -125,41 +252,86 @@ sub BODY_40_59 &comment("40_59 $n"); - &mov($f,&swtmp($n%16)); # f to hold Xupdate(xi,xa,xb,xc,xd) - &mov($tmp1,&swtmp(($n+2)%16)); - &xor($f,$tmp1); - &mov($tmp1,&swtmp(($n+8)%16)); - &xor($f,$tmp1); - &mov($tmp1,&swtmp(($n+13)%16)); - &xor($f,$tmp1); # f holds xa^xb^xc^xd - &mov($tmp1,$b); # tmp1 to hold F_40_59(b,c,d) +if ($alt) { + &add($e,$tmp1); # e+=b&(c^d) + &xor($f,&swtmp(($n+2)%16)); # f to hold Xupdate(xi,xa,xb,xc,xd) + &mov($tmp1,$d); + &xor($f,&swtmp(($n+8)%16)); + &xor($c,$d); # restore $c + &xor($f,&swtmp(($n+13)%16)); # f holds xa^xb^xc^xd &rotl($f,1); # f=ROTATE(f,1) - &or($tmp1,$c); - &mov(&swtmp($n%16),$f); # xi=f - &and($tmp1,$d); - &lea($f,&DWP(0x8f1bbcdc,$f,$e));# f+=K_40_59+e - &mov($e,$b); # e becomes volatile and is used - # to calculate F_40_59(b,c,d) + &and($tmp1,$c); + &rotr($b,7); # b=ROTATE(b,30) + &add($e,$tmp1); # e+=c&d + &mov($tmp1,$a); # b in next round + &mov(&swtmp($n%16),$f); # xi=f + &rotl($a,5); # ROTATE(a,5) + &xor($b,$c) if ($n<59); + &and($tmp1,$b) if ($n<59);# tmp1 to hold F_40_59(b,c,d) + &lea($f,&DWP(0x8f1bbcdc,$f,$e));# f+=K_40_59+e+(b&(c^d)) + &mov($e,&swtmp(($n+1)%16)); # pre-fetch f for next round + &add($f,$a); # f+=ROTATE(a,5) +} else { + &mov($tmp1,$c); # tmp1 to hold F_40_59(b,c,d) + &xor($f,&swtmp(($n+2)%16)); # f to hold Xupdate(xi,xa,xb,xc,xd) + &xor($tmp1,$d); + &xor($f,&swtmp(($n+8)%16)); + &and($tmp1,$b); + &xor($f,&swtmp(($n+13)%16)); # f holds xa^xb^xc^xd + &rotl($f,1); # f=ROTATE(f,1) + &add($tmp1,$e); # b&(c^d)+=e &rotr($b,2); # b=ROTATE(b,30) - &and($e,$c); - &or($tmp1,$e); # tmp1 holds F_40_59(b,c,d) - &mov($e,$a); - &rotl($e,5); # e=ROTATE(a,5) - &add($f,$tmp1); # f+=tmp1; + &mov($e,$a); # e becomes volatile + &rotl($e,5); # ROTATE(a,5) + &mov(&swtmp($n%16),$f); # xi=f + &lea($f,&DWP(0x8f1bbcdc,$f,$tmp1));# f+=K_40_59+e+(b&(c^d)) + &mov($tmp1,$c); &add($f,$e); # f+=ROTATE(a,5) + &and($tmp1,$d); + &mov($e,&swtmp(($n+1)%16)); # pre-fetch f for next round + &add($f,$tmp1); # f+=c&d +} } &function_begin("sha1_block_data_order"); +if ($xmm) { + &static_label("ssse3_shortcut"); + &static_label("avx_shortcut") if ($ymm); + &static_label("K_XX_XX"); + + &call (&label("pic_point")); # make it PIC! + &set_label("pic_point"); + &blindpop($tmp1); + &picmeup($T,"OPENSSL_ia32cap_P",$tmp1,&label("pic_point")); + &lea ($tmp1,&DWP(&label("K_XX_XX")."-".&label("pic_point"),$tmp1)); + + &mov ($A,&DWP(0,$T)); + &mov ($D,&DWP(4,$T)); + &test ($D,1<<9); # check SSSE3 bit + &jz (&label("x86")); + &test ($A,1<<24); # check FXSR bit + &jz (&label("x86")); + if ($ymm) { + &and ($D,1<<28); # mask AVX bit + &and ($A,1<<30); # mask "Intel CPU" bit + &or ($A,$D); + &cmp ($A,1<<28|1<<30); + &je (&label("avx_shortcut")); + } + &jmp (&label("ssse3_shortcut")); + &set_label("x86",16); +} &mov($tmp1,&wparam(0)); # SHA_CTX *c &mov($T,&wparam(1)); # const void *input &mov($A,&wparam(2)); # size_t num - &stack_push(16); # allocate X[16] + &stack_push(16+3); # allocate X[16] &shl($A,6); &add($A,$T); &mov(&wparam(2),$A); # pointer beyond the end of input &mov($E,&DWP(16,$tmp1));# pre-load E + &jmp(&label("loop")); - &set_label("loop",16); +&set_label("loop",16); # copy input chunk to X, but reversing byte order! for ($i=0; $i<16; $i+=4) @@ -213,8 +385,845 @@ sub BODY_40_59 &mov(&DWP(16,$tmp1),$C); &jb(&label("loop")); - &stack_pop(16); + &stack_pop(16+3); &function_end("sha1_block_data_order"); + +if ($xmm) { +###################################################################### +# The SSSE3 implementation. +# +# %xmm[0-7] are used as ring @X[] buffer containing quadruples of last +# 32 elements of the message schedule or Xupdate outputs. First 4 +# quadruples are simply byte-swapped input, next 4 are calculated +# according to method originally suggested by Dean Gaudet (modulo +# being implemented in SSSE3). Once 8 quadruples or 32 elements are +# collected, it switches to routine proposed by Max Locktyukhin. +# +# Calculations inevitably require temporary reqisters, and there are +# no %xmm registers left to spare. For this reason part of the ring +# buffer, X[2..4] to be specific, is offloaded to 3 quadriples ring +# buffer on the stack. Keep in mind that X[2] is alias X[-6], X[3] - +# X[-5], and X[4] - X[-4]... +# +# Another notable optimization is aggressive stack frame compression +# aiming to minimize amount of 9-byte instructions... +# +# Yet another notable optimization is "jumping" $B variable. It means +# that there is no register permanently allocated for $B value. This +# allowed to eliminate one instruction from body_20_39... +# +my $Xi=4; # 4xSIMD Xupdate round, start pre-seeded +my @X=map("xmm$_",(4..7,0..3)); # pre-seeded for $Xi=4 +my @V=($A,$B,$C,$D,$E); +my $j=0; # hash round +my @T=($T,$tmp1); +my $inp; + +my $_rol=sub { &rol(@_) }; +my $_ror=sub { &ror(@_) }; + +&function_begin("_sha1_block_data_order_ssse3"); + &call (&label("pic_point")); # make it PIC! + &set_label("pic_point"); + &blindpop($tmp1); + &lea ($tmp1,&DWP(&label("K_XX_XX")."-".&label("pic_point"),$tmp1)); +&set_label("ssse3_shortcut"); + + &movdqa (@X[3],&QWP(0,$tmp1)); # K_00_19 + &movdqa (@X[4],&QWP(16,$tmp1)); # K_20_39 + &movdqa (@X[5],&QWP(32,$tmp1)); # K_40_59 + &movdqa (@X[6],&QWP(48,$tmp1)); # K_60_79 + &movdqa (@X[2],&QWP(64,$tmp1)); # pbswap mask + + &mov ($E,&wparam(0)); # load argument block + &mov ($inp=@T[1],&wparam(1)); + &mov ($D,&wparam(2)); + &mov (@T[0],"esp"); + + # stack frame layout + # + # +0 X[0]+K X[1]+K X[2]+K X[3]+K # XMM->IALU xfer area + # X[4]+K X[5]+K X[6]+K X[7]+K + # X[8]+K X[9]+K X[10]+K X[11]+K + # X[12]+K X[13]+K X[14]+K X[15]+K + # + # +64 X[0] X[1] X[2] X[3] # XMM->XMM backtrace area + # X[4] X[5] X[6] X[7] + # X[8] X[9] X[10] X[11] # even borrowed for K_00_19 + # + # +112 K_20_39 K_20_39 K_20_39 K_20_39 # constants + # K_40_59 K_40_59 K_40_59 K_40_59 + # K_60_79 K_60_79 K_60_79 K_60_79 + # K_00_19 K_00_19 K_00_19 K_00_19 + # pbswap mask + # + # +192 ctx # argument block + # +196 inp + # +200 end + # +204 esp + &sub ("esp",208); + &and ("esp",-64); + + &movdqa (&QWP(112+0,"esp"),@X[4]); # copy constants + &movdqa (&QWP(112+16,"esp"),@X[5]); + &movdqa (&QWP(112+32,"esp"),@X[6]); + &shl ($D,6); # len*64 + &movdqa (&QWP(112+48,"esp"),@X[3]); + &add ($D,$inp); # end of input + &movdqa (&QWP(112+64,"esp"),@X[2]); + &add ($inp,64); + &mov (&DWP(192+0,"esp"),$E); # save argument block + &mov (&DWP(192+4,"esp"),$inp); + &mov (&DWP(192+8,"esp"),$D); + &mov (&DWP(192+12,"esp"),@T[0]); # save original %esp + + &mov ($A,&DWP(0,$E)); # load context + &mov ($B,&DWP(4,$E)); + &mov ($C,&DWP(8,$E)); + &mov ($D,&DWP(12,$E)); + &mov ($E,&DWP(16,$E)); + &mov (@T[0],$B); # magic seed + + &movdqu (@X[-4&7],&QWP(-64,$inp)); # load input to %xmm[0-3] + &movdqu (@X[-3&7],&QWP(-48,$inp)); + &movdqu (@X[-2&7],&QWP(-32,$inp)); + &movdqu (@X[-1&7],&QWP(-16,$inp)); + &pshufb (@X[-4&7],@X[2]); # byte swap + &pshufb (@X[-3&7],@X[2]); + &pshufb (@X[-2&7],@X[2]); + &movdqa (&QWP(112-16,"esp"),@X[3]); # borrow last backtrace slot + &pshufb (@X[-1&7],@X[2]); + &paddd (@X[-4&7],@X[3]); # add K_00_19 + &paddd (@X[-3&7],@X[3]); + &paddd (@X[-2&7],@X[3]); + &movdqa (&QWP(0,"esp"),@X[-4&7]); # X[]+K xfer to IALU + &psubd (@X[-4&7],@X[3]); # restore X[] + &movdqa (&QWP(0+16,"esp"),@X[-3&7]); + &psubd (@X[-3&7],@X[3]); + &movdqa (&QWP(0+32,"esp"),@X[-2&7]); + &psubd (@X[-2&7],@X[3]); + &movdqa (@X[0],@X[-3&7]); + &jmp (&label("loop")); + +###################################################################### +# SSE instruction sequence is first broken to groups of indepentent +# instructions, independent in respect to their inputs and shifter +# (not all architectures have more than one). Then IALU instructions +# are "knitted in" between the SSE groups. Distance is maintained for +# SSE latency of 2 in hope that it fits better upcoming AMD Bulldozer +# [which allegedly also implements SSSE3]... +# +# Temporary registers usage. X[2] is volatile at the entry and at the +# end is restored from backtrace ring buffer. X[3] is expected to +# contain current K_XX_XX constant and is used to caclulate X[-1]+K +# from previous round, it becomes volatile the moment the value is +# saved to stack for transfer to IALU. X[4] becomes volatile whenever +# X[-4] is accumulated and offloaded to backtrace ring buffer, at the +# end it is loaded with next K_XX_XX [which becomes X[3] in next +# round]... +# +sub Xupdate_ssse3_16_31() # recall that $Xi starts wtih 4 +{ use integer; + my $body = shift; + my @insns = (&$body,&$body,&$body,&$body); # 40 instructions + my ($a,$b,$c,$d,$e); + + eval(shift(@insns)); + eval(shift(@insns)); + &palignr(@X[0],@X[-4&7],8); # compose "X[-14]" in "X[0]" + &movdqa (@X[2],@X[-1&7]); + eval(shift(@insns)); + eval(shift(@insns)); + + &paddd (@X[3],@X[-1&7]); + &movdqa (&QWP(64+16*(($Xi-4)%3),"esp"),@X[-4&7]);# save X[] to backtrace buffer + eval(shift(@insns)); + eval(shift(@insns)); + &psrldq (@X[2],4); # "X[-3]", 3 dwords + eval(shift(@insns)); + eval(shift(@insns)); + &pxor (@X[0],@X[-4&7]); # "X[0]"^="X[-16]" + eval(shift(@insns)); + eval(shift(@insns)); + + &pxor (@X[2],@X[-2&7]); # "X[-3]"^"X[-8]" + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + + &pxor (@X[0],@X[2]); # "X[0]"^="X[-3]"^"X[-8]" + eval(shift(@insns)); + eval(shift(@insns)); + &movdqa (&QWP(0+16*(($Xi-1)&3),"esp"),@X[3]); # X[]+K xfer to IALU + eval(shift(@insns)); + eval(shift(@insns)); + + &movdqa (@X[4],@X[0]); + &movdqa (@X[2],@X[0]); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + + &pslldq (@X[4],12); # "X[0]"<<96, extract one dword + &paddd (@X[0],@X[0]); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + + &psrld (@X[2],31); + eval(shift(@insns)); + eval(shift(@insns)); + &movdqa (@X[3],@X[4]); + eval(shift(@insns)); + eval(shift(@insns)); + + &psrld (@X[4],30); + &por (@X[0],@X[2]); # "X[0]"<<<=1 + eval(shift(@insns)); + eval(shift(@insns)); + &movdqa (@X[2],&QWP(64+16*(($Xi-6)%3),"esp")) if ($Xi>5); # restore X[] from backtrace buffer + eval(shift(@insns)); + eval(shift(@insns)); + + &pslld (@X[3],2); + &pxor (@X[0],@X[4]); + eval(shift(@insns)); + eval(shift(@insns)); + &movdqa (@X[4],&QWP(112-16+16*(($Xi)/5),"esp")); # K_XX_XX + eval(shift(@insns)); + eval(shift(@insns)); + + &pxor (@X[0],@X[3]); # "X[0]"^=("X[0]"<<96)<<<2 + &movdqa (@X[1],@X[-2&7]) if ($Xi<7); + eval(shift(@insns)); + eval(shift(@insns)); + + foreach (@insns) { eval; } # remaining instructions [if any] + + $Xi++; push(@X,shift(@X)); # "rotate" X[] +} + +sub Xupdate_ssse3_32_79() +{ use integer; + my $body = shift; + my @insns = (&$body,&$body,&$body,&$body); # 32 to 48 instructions + my ($a,$b,$c,$d,$e); + + &movdqa (@X[2],@X[-1&7]) if ($Xi==8); + eval(shift(@insns)); # body_20_39 + &pxor (@X[0],@X[-4&7]); # "X[0]"="X[-32]"^"X[-16]" + &palignr(@X[2],@X[-2&7],8); # compose "X[-6]" + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); # rol + + &pxor (@X[0],@X[-7&7]); # "X[0]"^="X[-28]" + &movdqa (&QWP(64+16*(($Xi-4)%3),"esp"),@X[-4&7]); # save X[] to backtrace buffer + eval(shift(@insns)); + eval(shift(@insns)); + if ($Xi%5) { + &movdqa (@X[4],@X[3]); # "perpetuate" K_XX_XX... + } else { # ... or load next one + &movdqa (@X[4],&QWP(112-16+16*($Xi/5),"esp")); + } + &paddd (@X[3],@X[-1&7]); + eval(shift(@insns)); # ror + eval(shift(@insns)); + + &pxor (@X[0],@X[2]); # "X[0]"^="X[-6]" + eval(shift(@insns)); # body_20_39 + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); # rol + + &movdqa (@X[2],@X[0]); + &movdqa (&QWP(0+16*(($Xi-1)&3),"esp"),@X[3]); # X[]+K xfer to IALU + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); # ror + eval(shift(@insns)); + + &pslld (@X[0],2); + eval(shift(@insns)); # body_20_39 + eval(shift(@insns)); + &psrld (@X[2],30); + eval(shift(@insns)); + eval(shift(@insns)); # rol + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); # ror + eval(shift(@insns)); + + &por (@X[0],@X[2]); # "X[0]"<<<=2 + eval(shift(@insns)); # body_20_39 + eval(shift(@insns)); + &movdqa (@X[2],&QWP(64+16*(($Xi-6)%3),"esp")) if($Xi<19); # restore X[] from backtrace buffer + eval(shift(@insns)); + eval(shift(@insns)); # rol + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); # ror + &movdqa (@X[3],@X[0]) if ($Xi<19); + eval(shift(@insns)); + + foreach (@insns) { eval; } # remaining instructions + + $Xi++; push(@X,shift(@X)); # "rotate" X[] +} + +sub Xuplast_ssse3_80() +{ use integer; + my $body = shift; + my @insns = (&$body,&$body,&$body,&$body); # 32 instructions + my ($a,$b,$c,$d,$e); + + eval(shift(@insns)); + &paddd (@X[3],@X[-1&7]); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + + &movdqa (&QWP(0+16*(($Xi-1)&3),"esp"),@X[3]); # X[]+K xfer IALU + + foreach (@insns) { eval; } # remaining instructions + + &mov ($inp=@T[1],&DWP(192+4,"esp")); + &cmp ($inp,&DWP(192+8,"esp")); + &je (&label("done")); + + &movdqa (@X[3],&QWP(112+48,"esp")); # K_00_19 + &movdqa (@X[2],&QWP(112+64,"esp")); # pbswap mask + &movdqu (@X[-4&7],&QWP(0,$inp)); # load input + &movdqu (@X[-3&7],&QWP(16,$inp)); + &movdqu (@X[-2&7],&QWP(32,$inp)); + &movdqu (@X[-1&7],&QWP(48,$inp)); + &add ($inp,64); + &pshufb (@X[-4&7],@X[2]); # byte swap + &mov (&DWP(192+4,"esp"),$inp); + &movdqa (&QWP(112-16,"esp"),@X[3]); # borrow last backtrace slot + + $Xi=0; +} + +sub Xloop_ssse3() +{ use integer; + my $body = shift; + my @insns = (&$body,&$body,&$body,&$body); # 32 instructions + my ($a,$b,$c,$d,$e); + + eval(shift(@insns)); + eval(shift(@insns)); + &pshufb (@X[($Xi-3)&7],@X[2]); + eval(shift(@insns)); + eval(shift(@insns)); + &paddd (@X[($Xi-4)&7],@X[3]); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + &movdqa (&QWP(0+16*$Xi,"esp"),@X[($Xi-4)&7]); # X[]+K xfer to IALU + eval(shift(@insns)); + eval(shift(@insns)); + &psubd (@X[($Xi-4)&7],@X[3]); + + foreach (@insns) { eval; } + $Xi++; +} + +sub Xtail_ssse3() +{ use integer; + my $body = shift; + my @insns = (&$body,&$body,&$body,&$body); # 32 instructions + my ($a,$b,$c,$d,$e); + + foreach (@insns) { eval; } +} + +sub body_00_19 () { + ( + '($a,$b,$c,$d,$e)=@V;'. + '&add ($e,&DWP(4*($j&15),"esp"));', # X[]+K xfer + '&xor ($c,$d);', + '&mov (@T[1],$a);', # $b in next round + '&$_rol ($a,5);', + '&and (@T[0],$c);', # ($b&($c^$d)) + '&xor ($c,$d);', # restore $c + '&xor (@T[0],$d);', + '&add ($e,$a);', + '&$_ror ($b,$j?7:2);', # $b>>>2 + '&add ($e,@T[0]);' .'$j++; unshift(@V,pop(@V)); unshift(@T,pop(@T));' + ); +} + +sub body_20_39 () { + ( + '($a,$b,$c,$d,$e)=@V;'. + '&add ($e,&DWP(4*($j++&15),"esp"));', # X[]+K xfer + '&xor (@T[0],$d);', # ($b^$d) + '&mov (@T[1],$a);', # $b in next round + '&$_rol ($a,5);', + '&xor (@T[0],$c);', # ($b^$d^$c) + '&add ($e,$a);', + '&$_ror ($b,7);', # $b>>>2 + '&add ($e,@T[0]);' .'unshift(@V,pop(@V)); unshift(@T,pop(@T));' + ); +} + +sub body_40_59 () { + ( + '($a,$b,$c,$d,$e)=@V;'. + '&mov (@T[1],$c);', + '&xor ($c,$d);', + '&add ($e,&DWP(4*($j++&15),"esp"));', # X[]+K xfer + '&and (@T[1],$d);', + '&and (@T[0],$c);', # ($b&($c^$d)) + '&$_ror ($b,7);', # $b>>>2 + '&add ($e,@T[1]);', + '&mov (@T[1],$a);', # $b in next round + '&$_rol ($a,5);', + '&add ($e,@T[0]);', + '&xor ($c,$d);', # restore $c + '&add ($e,$a);' .'unshift(@V,pop(@V)); unshift(@T,pop(@T));' + ); +} + +&set_label("loop",16); + &Xupdate_ssse3_16_31(\&body_00_19); + &Xupdate_ssse3_16_31(\&body_00_19); + &Xupdate_ssse3_16_31(\&body_00_19); + &Xupdate_ssse3_16_31(\&body_00_19); + &Xupdate_ssse3_32_79(\&body_00_19); + &Xupdate_ssse3_32_79(\&body_20_39); + &Xupdate_ssse3_32_79(\&body_20_39); + &Xupdate_ssse3_32_79(\&body_20_39); + &Xupdate_ssse3_32_79(\&body_20_39); + &Xupdate_ssse3_32_79(\&body_20_39); + &Xupdate_ssse3_32_79(\&body_40_59); + &Xupdate_ssse3_32_79(\&body_40_59); + &Xupdate_ssse3_32_79(\&body_40_59); + &Xupdate_ssse3_32_79(\&body_40_59); + &Xupdate_ssse3_32_79(\&body_40_59); + &Xupdate_ssse3_32_79(\&body_20_39); + &Xuplast_ssse3_80(\&body_20_39); # can jump to "done" + + $saved_j=$j; @saved_V=@V; + + &Xloop_ssse3(\&body_20_39); + &Xloop_ssse3(\&body_20_39); + &Xloop_ssse3(\&body_20_39); + + &mov (@T[1],&DWP(192,"esp")); # update context + &add ($A,&DWP(0,@T[1])); + &add (@T[0],&DWP(4,@T[1])); # $b + &add ($C,&DWP(8,@T[1])); + &mov (&DWP(0,@T[1]),$A); + &add ($D,&DWP(12,@T[1])); + &mov (&DWP(4,@T[1]),@T[0]); + &add ($E,&DWP(16,@T[1])); + &mov (&DWP(8,@T[1]),$C); + &mov ($B,@T[0]); + &mov (&DWP(12,@T[1]),$D); + &mov (&DWP(16,@T[1]),$E); + &movdqa (@X[0],@X[-3&7]); + + &jmp (&label("loop")); + +&set_label("done",16); $j=$saved_j; @V=@saved_V; + + &Xtail_ssse3(\&body_20_39); + &Xtail_ssse3(\&body_20_39); + &Xtail_ssse3(\&body_20_39); + + &mov (@T[1],&DWP(192,"esp")); # update context + &add ($A,&DWP(0,@T[1])); + &mov ("esp",&DWP(192+12,"esp")); # restore %esp + &add (@T[0],&DWP(4,@T[1])); # $b + &add ($C,&DWP(8,@T[1])); + &mov (&DWP(0,@T[1]),$A); + &add ($D,&DWP(12,@T[1])); + &mov (&DWP(4,@T[1]),@T[0]); + &add ($E,&DWP(16,@T[1])); + &mov (&DWP(8,@T[1]),$C); + &mov (&DWP(12,@T[1]),$D); + &mov (&DWP(16,@T[1]),$E); + +&function_end("_sha1_block_data_order_ssse3"); + +if ($ymm) { +my $Xi=4; # 4xSIMD Xupdate round, start pre-seeded +my @X=map("xmm$_",(4..7,0..3)); # pre-seeded for $Xi=4 +my @V=($A,$B,$C,$D,$E); +my $j=0; # hash round +my @T=($T,$tmp1); +my $inp; + +my $_rol=sub { &shld(@_[0],@_) }; +my $_ror=sub { &shrd(@_[0],@_) }; + +&function_begin("_sha1_block_data_order_avx"); + &call (&label("pic_point")); # make it PIC! + &set_label("pic_point"); + &blindpop($tmp1); + &lea ($tmp1,&DWP(&label("K_XX_XX")."-".&label("pic_point"),$tmp1)); +&set_label("avx_shortcut"); + &vzeroall(); + + &vmovdqa(@X[3],&QWP(0,$tmp1)); # K_00_19 + &vmovdqa(@X[4],&QWP(16,$tmp1)); # K_20_39 + &vmovdqa(@X[5],&QWP(32,$tmp1)); # K_40_59 + &vmovdqa(@X[6],&QWP(48,$tmp1)); # K_60_79 + &vmovdqa(@X[2],&QWP(64,$tmp1)); # pbswap mask + + &mov ($E,&wparam(0)); # load argument block + &mov ($inp=@T[1],&wparam(1)); + &mov ($D,&wparam(2)); + &mov (@T[0],"esp"); + + # stack frame layout + # + # +0 X[0]+K X[1]+K X[2]+K X[3]+K # XMM->IALU xfer area + # X[4]+K X[5]+K X[6]+K X[7]+K + # X[8]+K X[9]+K X[10]+K X[11]+K + # X[12]+K X[13]+K X[14]+K X[15]+K + # + # +64 X[0] X[1] X[2] X[3] # XMM->XMM backtrace area + # X[4] X[5] X[6] X[7] + # X[8] X[9] X[10] X[11] # even borrowed for K_00_19 + # + # +112 K_20_39 K_20_39 K_20_39 K_20_39 # constants + # K_40_59 K_40_59 K_40_59 K_40_59 + # K_60_79 K_60_79 K_60_79 K_60_79 + # K_00_19 K_00_19 K_00_19 K_00_19 + # pbswap mask + # + # +192 ctx # argument block + # +196 inp + # +200 end + # +204 esp + &sub ("esp",208); + &and ("esp",-64); + + &vmovdqa(&QWP(112+0,"esp"),@X[4]); # copy constants + &vmovdqa(&QWP(112+16,"esp"),@X[5]); + &vmovdqa(&QWP(112+32,"esp"),@X[6]); + &shl ($D,6); # len*64 + &vmovdqa(&QWP(112+48,"esp"),@X[3]); + &add ($D,$inp); # end of input + &vmovdqa(&QWP(112+64,"esp"),@X[2]); + &add ($inp,64); + &mov (&DWP(192+0,"esp"),$E); # save argument block + &mov (&DWP(192+4,"esp"),$inp); + &mov (&DWP(192+8,"esp"),$D); + &mov (&DWP(192+12,"esp"),@T[0]); # save original %esp + + &mov ($A,&DWP(0,$E)); # load context + &mov ($B,&DWP(4,$E)); + &mov ($C,&DWP(8,$E)); + &mov ($D,&DWP(12,$E)); + &mov ($E,&DWP(16,$E)); + &mov (@T[0],$B); # magic seed + + &vmovdqu(@X[-4&7],&QWP(-64,$inp)); # load input to %xmm[0-3] + &vmovdqu(@X[-3&7],&QWP(-48,$inp)); + &vmovdqu(@X[-2&7],&QWP(-32,$inp)); + &vmovdqu(@X[-1&7],&QWP(-16,$inp)); + &vpshufb(@X[-4&7],@X[-4&7],@X[2]); # byte swap + &vpshufb(@X[-3&7],@X[-3&7],@X[2]); + &vpshufb(@X[-2&7],@X[-2&7],@X[2]); + &vmovdqa(&QWP(112-16,"esp"),@X[3]); # borrow last backtrace slot + &vpshufb(@X[-1&7],@X[-1&7],@X[2]); + &vpaddd (@X[0],@X[-4&7],@X[3]); # add K_00_19 + &vpaddd (@X[1],@X[-3&7],@X[3]); + &vpaddd (@X[2],@X[-2&7],@X[3]); + &vmovdqa(&QWP(0,"esp"),@X[0]); # X[]+K xfer to IALU + &vmovdqa(&QWP(0+16,"esp"),@X[1]); + &vmovdqa(&QWP(0+32,"esp"),@X[2]); + &jmp (&label("loop")); + +sub Xupdate_avx_16_31() # recall that $Xi starts wtih 4 +{ use integer; + my $body = shift; + my @insns = (&$body,&$body,&$body,&$body); # 40 instructions + my ($a,$b,$c,$d,$e); + + eval(shift(@insns)); + eval(shift(@insns)); + &vpalignr(@X[0],@X[-3&7],@X[-4&7],8); # compose "X[-14]" in "X[0]" + eval(shift(@insns)); + eval(shift(@insns)); + + &vpaddd (@X[3],@X[3],@X[-1&7]); + &vmovdqa (&QWP(64+16*(($Xi-4)%3),"esp"),@X[-4&7]);# save X[] to backtrace buffer + eval(shift(@insns)); + eval(shift(@insns)); + &vpsrldq(@X[2],@X[-1&7],4); # "X[-3]", 3 dwords + eval(shift(@insns)); + eval(shift(@insns)); + &vpxor (@X[0],@X[0],@X[-4&7]); # "X[0]"^="X[-16]" + eval(shift(@insns)); + eval(shift(@insns)); + + &vpxor (@X[2],@X[2],@X[-2&7]); # "X[-3]"^"X[-8]" + eval(shift(@insns)); + eval(shift(@insns)); + &vmovdqa (&QWP(0+16*(($Xi-1)&3),"esp"),@X[3]); # X[]+K xfer to IALU + eval(shift(@insns)); + eval(shift(@insns)); + + &vpxor (@X[0],@X[0],@X[2]); # "X[0]"^="X[-3]"^"X[-8]" + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + + &vpsrld (@X[2],@X[0],31); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + + &vpslldq(@X[4],@X[0],12); # "X[0]"<<96, extract one dword + &vpaddd (@X[0],@X[0],@X[0]); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + + &vpsrld (@X[3],@X[4],30); + &vpor (@X[0],@X[0],@X[2]); # "X[0]"<<<=1 + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + + &vpslld (@X[4],@X[4],2); + &vmovdqa (@X[2],&QWP(64+16*(($Xi-6)%3),"esp")) if ($Xi>5); # restore X[] from backtrace buffer + eval(shift(@insns)); + eval(shift(@insns)); + &vpxor (@X[0],@X[0],@X[3]); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + + &vpxor (@X[0],@X[0],@X[4]); # "X[0]"^=("X[0]"<<96)<<<2 + eval(shift(@insns)); + eval(shift(@insns)); + &vmovdqa (@X[4],&QWP(112-16+16*(($Xi)/5),"esp")); # K_XX_XX + eval(shift(@insns)); + eval(shift(@insns)); + + foreach (@insns) { eval; } # remaining instructions [if any] + + $Xi++; push(@X,shift(@X)); # "rotate" X[] +} + +sub Xupdate_avx_32_79() +{ use integer; + my $body = shift; + my @insns = (&$body,&$body,&$body,&$body); # 32 to 48 instructions + my ($a,$b,$c,$d,$e); + + &vpalignr(@X[2],@X[-1&7],@X[-2&7],8); # compose "X[-6]" + &vpxor (@X[0],@X[0],@X[-4&7]); # "X[0]"="X[-32]"^"X[-16]" + eval(shift(@insns)); # body_20_39 + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); # rol + + &vpxor (@X[0],@X[0],@X[-7&7]); # "X[0]"^="X[-28]" + &vmovdqa (&QWP(64+16*(($Xi-4)%3),"esp"),@X[-4&7]); # save X[] to backtrace buffer + eval(shift(@insns)); + eval(shift(@insns)); + if ($Xi%5) { + &vmovdqa (@X[4],@X[3]); # "perpetuate" K_XX_XX... + } else { # ... or load next one + &vmovdqa (@X[4],&QWP(112-16+16*($Xi/5),"esp")); + } + &vpaddd (@X[3],@X[3],@X[-1&7]); + eval(shift(@insns)); # ror + eval(shift(@insns)); + + &vpxor (@X[0],@X[0],@X[2]); # "X[0]"^="X[-6]" + eval(shift(@insns)); # body_20_39 + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); # rol + + &vpsrld (@X[2],@X[0],30); + &vmovdqa (&QWP(0+16*(($Xi-1)&3),"esp"),@X[3]); # X[]+K xfer to IALU + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); # ror + eval(shift(@insns)); + + &vpslld (@X[0],@X[0],2); + eval(shift(@insns)); # body_20_39 + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); # rol + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); # ror + eval(shift(@insns)); + + &vpor (@X[0],@X[0],@X[2]); # "X[0]"<<<=2 + eval(shift(@insns)); # body_20_39 + eval(shift(@insns)); + &vmovdqa (@X[2],&QWP(64+16*(($Xi-6)%3),"esp")) if($Xi<19); # restore X[] from backtrace buffer + eval(shift(@insns)); + eval(shift(@insns)); # rol + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); # ror + eval(shift(@insns)); + + foreach (@insns) { eval; } # remaining instructions + + $Xi++; push(@X,shift(@X)); # "rotate" X[] +} + +sub Xuplast_avx_80() +{ use integer; + my $body = shift; + my @insns = (&$body,&$body,&$body,&$body); # 32 instructions + my ($a,$b,$c,$d,$e); + + eval(shift(@insns)); + &vpaddd (@X[3],@X[3],@X[-1&7]); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + + &vmovdqa (&QWP(0+16*(($Xi-1)&3),"esp"),@X[3]); # X[]+K xfer IALU + + foreach (@insns) { eval; } # remaining instructions + + &mov ($inp=@T[1],&DWP(192+4,"esp")); + &cmp ($inp,&DWP(192+8,"esp")); + &je (&label("done")); + + &vmovdqa(@X[3],&QWP(112+48,"esp")); # K_00_19 + &vmovdqa(@X[2],&QWP(112+64,"esp")); # pbswap mask + &vmovdqu(@X[-4&7],&QWP(0,$inp)); # load input + &vmovdqu(@X[-3&7],&QWP(16,$inp)); + &vmovdqu(@X[-2&7],&QWP(32,$inp)); + &vmovdqu(@X[-1&7],&QWP(48,$inp)); + &add ($inp,64); + &vpshufb(@X[-4&7],@X[-4&7],@X[2]); # byte swap + &mov (&DWP(192+4,"esp"),$inp); + &vmovdqa(&QWP(112-16,"esp"),@X[3]); # borrow last backtrace slot + + $Xi=0; +} + +sub Xloop_avx() +{ use integer; + my $body = shift; + my @insns = (&$body,&$body,&$body,&$body); # 32 instructions + my ($a,$b,$c,$d,$e); + + eval(shift(@insns)); + eval(shift(@insns)); + &vpshufb (@X[($Xi-3)&7],@X[($Xi-3)&7],@X[2]); + eval(shift(@insns)); + eval(shift(@insns)); + &vpaddd (@X[$Xi&7],@X[($Xi-4)&7],@X[3]); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + &vmovdqa (&QWP(0+16*$Xi,"esp"),@X[$Xi&7]); # X[]+K xfer to IALU + eval(shift(@insns)); + eval(shift(@insns)); + + foreach (@insns) { eval; } + $Xi++; +} + +sub Xtail_avx() +{ use integer; + my $body = shift; + my @insns = (&$body,&$body,&$body,&$body); # 32 instructions + my ($a,$b,$c,$d,$e); + + foreach (@insns) { eval; } +} + +&set_label("loop",16); + &Xupdate_avx_16_31(\&body_00_19); + &Xupdate_avx_16_31(\&body_00_19); + &Xupdate_avx_16_31(\&body_00_19); + &Xupdate_avx_16_31(\&body_00_19); + &Xupdate_avx_32_79(\&body_00_19); + &Xupdate_avx_32_79(\&body_20_39); + &Xupdate_avx_32_79(\&body_20_39); + &Xupdate_avx_32_79(\&body_20_39); + &Xupdate_avx_32_79(\&body_20_39); + &Xupdate_avx_32_79(\&body_20_39); + &Xupdate_avx_32_79(\&body_40_59); + &Xupdate_avx_32_79(\&body_40_59); + &Xupdate_avx_32_79(\&body_40_59); + &Xupdate_avx_32_79(\&body_40_59); + &Xupdate_avx_32_79(\&body_40_59); + &Xupdate_avx_32_79(\&body_20_39); + &Xuplast_avx_80(\&body_20_39); # can jump to "done" + + $saved_j=$j; @saved_V=@V; + + &Xloop_avx(\&body_20_39); + &Xloop_avx(\&body_20_39); + &Xloop_avx(\&body_20_39); + + &mov (@T[1],&DWP(192,"esp")); # update context + &add ($A,&DWP(0,@T[1])); + &add (@T[0],&DWP(4,@T[1])); # $b + &add ($C,&DWP(8,@T[1])); + &mov (&DWP(0,@T[1]),$A); + &add ($D,&DWP(12,@T[1])); + &mov (&DWP(4,@T[1]),@T[0]); + &add ($E,&DWP(16,@T[1])); + &mov (&DWP(8,@T[1]),$C); + &mov ($B,@T[0]); + &mov (&DWP(12,@T[1]),$D); + &mov (&DWP(16,@T[1]),$E); + + &jmp (&label("loop")); + +&set_label("done",16); $j=$saved_j; @V=@saved_V; + + &Xtail_avx(\&body_20_39); + &Xtail_avx(\&body_20_39); + &Xtail_avx(\&body_20_39); + + &vzeroall(); + + &mov (@T[1],&DWP(192,"esp")); # update context + &add ($A,&DWP(0,@T[1])); + &mov ("esp",&DWP(192+12,"esp")); # restore %esp + &add (@T[0],&DWP(4,@T[1])); # $b + &add ($C,&DWP(8,@T[1])); + &mov (&DWP(0,@T[1]),$A); + &add ($D,&DWP(12,@T[1])); + &mov (&DWP(4,@T[1]),@T[0]); + &add ($E,&DWP(16,@T[1])); + &mov (&DWP(8,@T[1]),$C); + &mov (&DWP(12,@T[1]),$D); + &mov (&DWP(16,@T[1]),$E); +&function_end("_sha1_block_data_order_avx"); +} +&set_label("K_XX_XX",64); +&data_word(0x5a827999,0x5a827999,0x5a827999,0x5a827999); # K_00_19 +&data_word(0x6ed9eba1,0x6ed9eba1,0x6ed9eba1,0x6ed9eba1); # K_20_39 +&data_word(0x8f1bbcdc,0x8f1bbcdc,0x8f1bbcdc,0x8f1bbcdc); # K_40_59 +&data_word(0xca62c1d6,0xca62c1d6,0xca62c1d6,0xca62c1d6); # K_60_79 +&data_word(0x00010203,0x04050607,0x08090a0b,0x0c0d0e0f); # pbswap mask +} &asciz("SHA1 block transform for x86, CRYPTOGAMS by "); &asm_finish(); diff --git a/crypto/openssl/crypto/sha/asm/sha1-x86_64.pl b/crypto/openssl/crypto/sha/asm/sha1-x86_64.pl index 4edc5ea9ad..f27c1e3fb0 100755 --- a/crypto/openssl/crypto/sha/asm/sha1-x86_64.pl +++ b/crypto/openssl/crypto/sha/asm/sha1-x86_64.pl @@ -16,7 +16,7 @@ # There was suggestion to mechanically translate 32-bit code, but I # dismissed it, reasoning that x86_64 offers enough register bank # capacity to fully utilize SHA-1 parallelism. Therefore this fresh -# implementation:-) However! While 64-bit code does performs better +# implementation:-) However! While 64-bit code does perform better # on Opteron, I failed to beat 32-bit assembler on EM64T core. Well, # x86_64 does offer larger *addressable* bank, but out-of-order core # reaches for even more registers through dynamic aliasing, and EM64T @@ -29,6 +29,38 @@ # Xeon P4 +65% +0% 9.9 # Core2 +60% +10% 7.0 +# August 2009. +# +# The code was revised to minimize code size and to maximize +# "distance" between instructions producing input to 'lea' +# instruction and the 'lea' instruction itself, which is essential +# for Intel Atom core. + +# October 2010. +# +# Add SSSE3, Supplemental[!] SSE3, implementation. The idea behind it +# is to offload message schedule denoted by Wt in NIST specification, +# or Xupdate in OpenSSL source, to SIMD unit. See sha1-586.pl module +# for background and implementation details. The only difference from +# 32-bit code is that 64-bit code doesn't have to spill @X[] elements +# to free temporary registers. + +# April 2011. +# +# Add AVX code path. See sha1-586.pl for further information. + +###################################################################### +# Current performance is summarized in following table. Numbers are +# CPU clock cycles spent to process single byte (less is better). +# +# x86_64 SSSE3 AVX +# P4 9.8 - +# Opteron 6.6 - +# Core2 6.7 6.1/+10% - +# Atom 11.0 9.7/+13% - +# Westmere 7.1 5.6/+27% - +# Sandy Bridge 7.9 6.3/+25% 5.2/+51% + $flavour = shift; $output = shift; if ($flavour =~ /\./) { $output = $flavour; undef $flavour; } @@ -40,6 +72,16 @@ $0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1; ( $xlate="${dir}../../perlasm/x86_64-xlate.pl" and -f $xlate) or die "can't locate x86_64-xlate.pl"; +$avx=1 if (`$ENV{CC} -Wa,-v -c -o /dev/null -x assembler /dev/null 2>&1` + =~ /GNU assembler version ([2-9]\.[0-9]+)/ && + $1>=2.19); +$avx=1 if (!$avx && $win64 && ($flavour =~ /nasm/ || $ENV{ASM} =~ /nasm/) && + `nasm -v 2>&1` =~ /NASM version ([2-9]\.[0-9]+)/ && + $1>=2.09); +$avx=1 if (!$avx && $win64 && ($flavour =~ /masm/ || $ENV{ASM} =~ /ml64/) && + `ml64 2>&1` =~ /Version ([0-9]+)\./ && + $1>=10); + open STDOUT,"| $^X $xlate $flavour $output"; $ctx="%rdi"; # 1st arg @@ -51,196 +93,994 @@ $ctx="%r8"; $inp="%r9"; $num="%r10"; -$xi="%eax"; -$t0="%ebx"; -$t1="%ecx"; -$A="%edx"; -$B="%esi"; -$C="%edi"; -$D="%ebp"; -$E="%r11d"; -$T="%r12d"; - -@V=($A,$B,$C,$D,$E,$T); +$t0="%eax"; +$t1="%ebx"; +$t2="%ecx"; +@xi=("%edx","%ebp"); +$A="%esi"; +$B="%edi"; +$C="%r11d"; +$D="%r12d"; +$E="%r13d"; -sub PROLOGUE { -my $func=shift; -$code.=<<___; -.globl $func -.type $func,\@function,3 -.align 16 -$func: - push %rbx - push %rbp - push %r12 - mov %rsp,%r11 - mov %rdi,$ctx # reassigned argument - sub \$`8+16*4`,%rsp - mov %rsi,$inp # reassigned argument - and \$-64,%rsp - mov %rdx,$num # reassigned argument - mov %r11,`16*4`(%rsp) -.Lprologue: - - mov 0($ctx),$A - mov 4($ctx),$B - mov 8($ctx),$C - mov 12($ctx),$D - mov 16($ctx),$E -___ -} - -sub EPILOGUE { -my $func=shift; -$code.=<<___; - mov `16*4`(%rsp),%rsi - mov (%rsi),%r12 - mov 8(%rsi),%rbp - mov 16(%rsi),%rbx - lea 24(%rsi),%rsp -.Lepilogue: - ret -.size $func,.-$func -___ -} +@V=($A,$B,$C,$D,$E); sub BODY_00_19 { -my ($i,$a,$b,$c,$d,$e,$f,$host)=@_; +my ($i,$a,$b,$c,$d,$e)=@_; my $j=$i+1; $code.=<<___ if ($i==0); - mov `4*$i`($inp),$xi - `"bswap $xi" if(!defined($host))` - mov $xi,`4*$i`(%rsp) + mov `4*$i`($inp),$xi[0] + bswap $xi[0] + mov $xi[0],`4*$i`(%rsp) ___ $code.=<<___ if ($i<15); - lea 0x5a827999($xi,$e),$f mov $c,$t0 - mov `4*$j`($inp),$xi - mov $a,$e + mov `4*$j`($inp),$xi[1] + mov $a,$t2 xor $d,$t0 - `"bswap $xi" if(!defined($host))` - rol \$5,$e + bswap $xi[1] + rol \$5,$t2 + lea 0x5a827999($xi[0],$e),$e and $b,$t0 - mov $xi,`4*$j`(%rsp) - add $e,$f + mov $xi[1],`4*$j`(%rsp) + add $t2,$e xor $d,$t0 rol \$30,$b - add $t0,$f + add $t0,$e ___ $code.=<<___ if ($i>=15); - lea 0x5a827999($xi,$e),$f - mov `4*($j%16)`(%rsp),$xi + mov `4*($j%16)`(%rsp),$xi[1] mov $c,$t0 - mov $a,$e - xor `4*(($j+2)%16)`(%rsp),$xi + mov $a,$t2 + xor `4*(($j+2)%16)`(%rsp),$xi[1] xor $d,$t0 - rol \$5,$e - xor `4*(($j+8)%16)`(%rsp),$xi + rol \$5,$t2 + xor `4*(($j+8)%16)`(%rsp),$xi[1] and $b,$t0 - add $e,$f - xor `4*(($j+13)%16)`(%rsp),$xi + lea 0x5a827999($xi[0],$e),$e + xor `4*(($j+13)%16)`(%rsp),$xi[1] xor $d,$t0 + rol \$1,$xi[1] + add $t2,$e rol \$30,$b - add $t0,$f - rol \$1,$xi - mov $xi,`4*($j%16)`(%rsp) + mov $xi[1],`4*($j%16)`(%rsp) + add $t0,$e ___ +unshift(@xi,pop(@xi)); } sub BODY_20_39 { -my ($i,$a,$b,$c,$d,$e,$f)=@_; +my ($i,$a,$b,$c,$d,$e)=@_; my $j=$i+1; my $K=($i<40)?0x6ed9eba1:0xca62c1d6; $code.=<<___ if ($i<79); - lea $K($xi,$e),$f - mov `4*($j%16)`(%rsp),$xi + mov `4*($j%16)`(%rsp),$xi[1] mov $c,$t0 - mov $a,$e - xor `4*(($j+2)%16)`(%rsp),$xi + mov $a,$t2 + xor `4*(($j+2)%16)`(%rsp),$xi[1] xor $b,$t0 - rol \$5,$e - xor `4*(($j+8)%16)`(%rsp),$xi + rol \$5,$t2 + lea $K($xi[0],$e),$e + xor `4*(($j+8)%16)`(%rsp),$xi[1] xor $d,$t0 - add $e,$f - xor `4*(($j+13)%16)`(%rsp),$xi + add $t2,$e + xor `4*(($j+13)%16)`(%rsp),$xi[1] rol \$30,$b - add $t0,$f - rol \$1,$xi + add $t0,$e + rol \$1,$xi[1] ___ $code.=<<___ if ($i<76); - mov $xi,`4*($j%16)`(%rsp) + mov $xi[1],`4*($j%16)`(%rsp) ___ $code.=<<___ if ($i==79); - lea $K($xi,$e),$f mov $c,$t0 - mov $a,$e + mov $a,$t2 xor $b,$t0 - rol \$5,$e + lea $K($xi[0],$e),$e + rol \$5,$t2 xor $d,$t0 - add $e,$f + add $t2,$e rol \$30,$b - add $t0,$f + add $t0,$e ___ +unshift(@xi,pop(@xi)); } sub BODY_40_59 { -my ($i,$a,$b,$c,$d,$e,$f)=@_; +my ($i,$a,$b,$c,$d,$e)=@_; my $j=$i+1; $code.=<<___; - lea 0x8f1bbcdc($xi,$e),$f - mov `4*($j%16)`(%rsp),$xi - mov $b,$t0 - mov $b,$t1 - xor `4*(($j+2)%16)`(%rsp),$xi - mov $a,$e - and $c,$t0 - xor `4*(($j+8)%16)`(%rsp),$xi - or $c,$t1 - rol \$5,$e - xor `4*(($j+13)%16)`(%rsp),$xi - and $d,$t1 - add $e,$f - rol \$1,$xi - or $t1,$t0 + mov `4*($j%16)`(%rsp),$xi[1] + mov $c,$t0 + mov $c,$t1 + xor `4*(($j+2)%16)`(%rsp),$xi[1] + and $d,$t0 + mov $a,$t2 + xor `4*(($j+8)%16)`(%rsp),$xi[1] + xor $d,$t1 + lea 0x8f1bbcdc($xi[0],$e),$e + rol \$5,$t2 + xor `4*(($j+13)%16)`(%rsp),$xi[1] + add $t0,$e + and $b,$t1 + rol \$1,$xi[1] + add $t1,$e rol \$30,$b - mov $xi,`4*($j%16)`(%rsp) - add $t0,$f + mov $xi[1],`4*($j%16)`(%rsp) + add $t2,$e ___ +unshift(@xi,pop(@xi)); } -$code=".text\n"; +$code.=<<___; +.text +.extern OPENSSL_ia32cap_P -&PROLOGUE("sha1_block_data_order"); -$code.=".align 4\n.Lloop:\n"; +.globl sha1_block_data_order +.type sha1_block_data_order,\@function,3 +.align 16 +sha1_block_data_order: + mov OPENSSL_ia32cap_P+0(%rip),%r9d + mov OPENSSL_ia32cap_P+4(%rip),%r8d + test \$`1<<9`,%r8d # check SSSE3 bit + jz .Lialu +___ +$code.=<<___ if ($avx); + and \$`1<<28`,%r8d # mask AVX bit + and \$`1<<30`,%r9d # mask "Intel CPU" bit + or %r9d,%r8d + cmp \$`1<<28|1<<30`,%r8d + je _avx_shortcut +___ +$code.=<<___; + jmp _ssse3_shortcut + +.align 16 +.Lialu: + push %rbx + push %rbp + push %r12 + push %r13 + mov %rsp,%r11 + mov %rdi,$ctx # reassigned argument + sub \$`8+16*4`,%rsp + mov %rsi,$inp # reassigned argument + and \$-64,%rsp + mov %rdx,$num # reassigned argument + mov %r11,`16*4`(%rsp) +.Lprologue: + + mov 0($ctx),$A + mov 4($ctx),$B + mov 8($ctx),$C + mov 12($ctx),$D + mov 16($ctx),$E + jmp .Lloop + +.align 16 +.Lloop: +___ for($i=0;$i<20;$i++) { &BODY_00_19($i,@V); unshift(@V,pop(@V)); } for(;$i<40;$i++) { &BODY_20_39($i,@V); unshift(@V,pop(@V)); } for(;$i<60;$i++) { &BODY_40_59($i,@V); unshift(@V,pop(@V)); } for(;$i<80;$i++) { &BODY_20_39($i,@V); unshift(@V,pop(@V)); } $code.=<<___; - add 0($ctx),$E - add 4($ctx),$T - add 8($ctx),$A - add 12($ctx),$B - add 16($ctx),$C - mov $E,0($ctx) - mov $T,4($ctx) - mov $A,8($ctx) - mov $B,12($ctx) - mov $C,16($ctx) - - xchg $E,$A # mov $E,$A - xchg $T,$B # mov $T,$B - xchg $E,$C # mov $A,$C - xchg $T,$D # mov $B,$D - # mov $C,$E - lea `16*4`($inp),$inp + add 0($ctx),$A + add 4($ctx),$B + add 8($ctx),$C + add 12($ctx),$D + add 16($ctx),$E + mov $A,0($ctx) + mov $B,4($ctx) + mov $C,8($ctx) + mov $D,12($ctx) + mov $E,16($ctx) + sub \$1,$num + lea `16*4`($inp),$inp jnz .Lloop + + mov `16*4`(%rsp),%rsi + mov (%rsi),%r13 + mov 8(%rsi),%r12 + mov 16(%rsi),%rbp + mov 24(%rsi),%rbx + lea 32(%rsi),%rsp +.Lepilogue: + ret +.size sha1_block_data_order,.-sha1_block_data_order ___ -&EPILOGUE("sha1_block_data_order"); +{{{ +my $Xi=4; +my @X=map("%xmm$_",(4..7,0..3)); +my @Tx=map("%xmm$_",(8..10)); +my @V=($A,$B,$C,$D,$E)=("%eax","%ebx","%ecx","%edx","%ebp"); # size optimization +my @T=("%esi","%edi"); +my $j=0; +my $K_XX_XX="%r11"; + +my $_rol=sub { &rol(@_) }; +my $_ror=sub { &ror(@_) }; + +$code.=<<___; +.type sha1_block_data_order_ssse3,\@function,3 +.align 16 +sha1_block_data_order_ssse3: +_ssse3_shortcut: + push %rbx + push %rbp + push %r12 + lea `-64-($win64?5*16:0)`(%rsp),%rsp +___ +$code.=<<___ if ($win64); + movaps %xmm6,64+0(%rsp) + movaps %xmm7,64+16(%rsp) + movaps %xmm8,64+32(%rsp) + movaps %xmm9,64+48(%rsp) + movaps %xmm10,64+64(%rsp) +.Lprologue_ssse3: +___ +$code.=<<___; + mov %rdi,$ctx # reassigned argument + mov %rsi,$inp # reassigned argument + mov %rdx,$num # reassigned argument + + shl \$6,$num + add $inp,$num + lea K_XX_XX(%rip),$K_XX_XX + + mov 0($ctx),$A # load context + mov 4($ctx),$B + mov 8($ctx),$C + mov 12($ctx),$D + mov $B,@T[0] # magic seed + mov 16($ctx),$E + + movdqa 64($K_XX_XX),@X[2] # pbswap mask + movdqa 0($K_XX_XX),@Tx[1] # K_00_19 + movdqu 0($inp),@X[-4&7] # load input to %xmm[0-3] + movdqu 16($inp),@X[-3&7] + movdqu 32($inp),@X[-2&7] + movdqu 48($inp),@X[-1&7] + pshufb @X[2],@X[-4&7] # byte swap + add \$64,$inp + pshufb @X[2],@X[-3&7] + pshufb @X[2],@X[-2&7] + pshufb @X[2],@X[-1&7] + paddd @Tx[1],@X[-4&7] # add K_00_19 + paddd @Tx[1],@X[-3&7] + paddd @Tx[1],@X[-2&7] + movdqa @X[-4&7],0(%rsp) # X[]+K xfer to IALU + psubd @Tx[1],@X[-4&7] # restore X[] + movdqa @X[-3&7],16(%rsp) + psubd @Tx[1],@X[-3&7] + movdqa @X[-2&7],32(%rsp) + psubd @Tx[1],@X[-2&7] + jmp .Loop_ssse3 +___ + +sub AUTOLOAD() # thunk [simplified] 32-bit style perlasm +{ my $opcode = $AUTOLOAD; $opcode =~ s/.*:://; + my $arg = pop; + $arg = "\$$arg" if ($arg*1 eq $arg); + $code .= "\t$opcode\t".join(',',$arg,reverse @_)."\n"; +} + +sub Xupdate_ssse3_16_31() # recall that $Xi starts wtih 4 +{ use integer; + my $body = shift; + my @insns = (&$body,&$body,&$body,&$body); # 40 instructions + my ($a,$b,$c,$d,$e); + + &movdqa (@X[0],@X[-3&7]); + eval(shift(@insns)); + eval(shift(@insns)); + &movdqa (@Tx[0],@X[-1&7]); + &palignr(@X[0],@X[-4&7],8); # compose "X[-14]" in "X[0]" + eval(shift(@insns)); + eval(shift(@insns)); + + &paddd (@Tx[1],@X[-1&7]); + eval(shift(@insns)); + eval(shift(@insns)); + &psrldq (@Tx[0],4); # "X[-3]", 3 dwords + eval(shift(@insns)); + eval(shift(@insns)); + &pxor (@X[0],@X[-4&7]); # "X[0]"^="X[-16]" + eval(shift(@insns)); + eval(shift(@insns)); + + &pxor (@Tx[0],@X[-2&7]); # "X[-3]"^"X[-8]" + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + + &pxor (@X[0],@Tx[0]); # "X[0]"^="X[-3]"^"X[-8]" + eval(shift(@insns)); + eval(shift(@insns)); + &movdqa (eval(16*(($Xi-1)&3))."(%rsp)",@Tx[1]); # X[]+K xfer to IALU + eval(shift(@insns)); + eval(shift(@insns)); + + &movdqa (@Tx[2],@X[0]); + &movdqa (@Tx[0],@X[0]); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + + &pslldq (@Tx[2],12); # "X[0]"<<96, extract one dword + &paddd (@X[0],@X[0]); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + + &psrld (@Tx[0],31); + eval(shift(@insns)); + eval(shift(@insns)); + &movdqa (@Tx[1],@Tx[2]); + eval(shift(@insns)); + eval(shift(@insns)); + + &psrld (@Tx[2],30); + &por (@X[0],@Tx[0]); # "X[0]"<<<=1 + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + + &pslld (@Tx[1],2); + &pxor (@X[0],@Tx[2]); + eval(shift(@insns)); + eval(shift(@insns)); + &movdqa (@Tx[2],eval(16*(($Xi)/5))."($K_XX_XX)"); # K_XX_XX + eval(shift(@insns)); + eval(shift(@insns)); + + &pxor (@X[0],@Tx[1]); # "X[0]"^=("X[0]">>96)<<<2 + + foreach (@insns) { eval; } # remaining instructions [if any] + + $Xi++; push(@X,shift(@X)); # "rotate" X[] + push(@Tx,shift(@Tx)); +} + +sub Xupdate_ssse3_32_79() +{ use integer; + my $body = shift; + my @insns = (&$body,&$body,&$body,&$body); # 32 to 48 instructions + my ($a,$b,$c,$d,$e); + + &movdqa (@Tx[0],@X[-1&7]) if ($Xi==8); + eval(shift(@insns)); # body_20_39 + &pxor (@X[0],@X[-4&7]); # "X[0]"="X[-32]"^"X[-16]" + &palignr(@Tx[0],@X[-2&7],8); # compose "X[-6]" + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); # rol + + &pxor (@X[0],@X[-7&7]); # "X[0]"^="X[-28]" + eval(shift(@insns)); + eval(shift(@insns)) if (@insns[0] !~ /&ro[rl]/); + if ($Xi%5) { + &movdqa (@Tx[2],@Tx[1]);# "perpetuate" K_XX_XX... + } else { # ... or load next one + &movdqa (@Tx[2],eval(16*($Xi/5))."($K_XX_XX)"); + } + &paddd (@Tx[1],@X[-1&7]); + eval(shift(@insns)); # ror + eval(shift(@insns)); + + &pxor (@X[0],@Tx[0]); # "X[0]"^="X[-6]" + eval(shift(@insns)); # body_20_39 + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); # rol + + &movdqa (@Tx[0],@X[0]); + &movdqa (eval(16*(($Xi-1)&3))."(%rsp)",@Tx[1]); # X[]+K xfer to IALU + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); # ror + eval(shift(@insns)); + + &pslld (@X[0],2); + eval(shift(@insns)); # body_20_39 + eval(shift(@insns)); + &psrld (@Tx[0],30); + eval(shift(@insns)); + eval(shift(@insns)); # rol + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); # ror + eval(shift(@insns)); + + &por (@X[0],@Tx[0]); # "X[0]"<<<=2 + eval(shift(@insns)); # body_20_39 + eval(shift(@insns)); + &movdqa (@Tx[1],@X[0]) if ($Xi<19); + eval(shift(@insns)); + eval(shift(@insns)); # rol + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); # rol + eval(shift(@insns)); + + foreach (@insns) { eval; } # remaining instructions + + $Xi++; push(@X,shift(@X)); # "rotate" X[] + push(@Tx,shift(@Tx)); +} + +sub Xuplast_ssse3_80() +{ use integer; + my $body = shift; + my @insns = (&$body,&$body,&$body,&$body); # 32 instructions + my ($a,$b,$c,$d,$e); + + eval(shift(@insns)); + &paddd (@Tx[1],@X[-1&7]); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + + &movdqa (eval(16*(($Xi-1)&3))."(%rsp)",@Tx[1]); # X[]+K xfer IALU + + foreach (@insns) { eval; } # remaining instructions + + &cmp ($inp,$num); + &je (".Ldone_ssse3"); + + unshift(@Tx,pop(@Tx)); + + &movdqa (@X[2],"64($K_XX_XX)"); # pbswap mask + &movdqa (@Tx[1],"0($K_XX_XX)"); # K_00_19 + &movdqu (@X[-4&7],"0($inp)"); # load input + &movdqu (@X[-3&7],"16($inp)"); + &movdqu (@X[-2&7],"32($inp)"); + &movdqu (@X[-1&7],"48($inp)"); + &pshufb (@X[-4&7],@X[2]); # byte swap + &add ($inp,64); + + $Xi=0; +} + +sub Xloop_ssse3() +{ use integer; + my $body = shift; + my @insns = (&$body,&$body,&$body,&$body); # 32 instructions + my ($a,$b,$c,$d,$e); + + eval(shift(@insns)); + eval(shift(@insns)); + &pshufb (@X[($Xi-3)&7],@X[2]); + eval(shift(@insns)); + eval(shift(@insns)); + &paddd (@X[($Xi-4)&7],@Tx[1]); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + &movdqa (eval(16*$Xi)."(%rsp)",@X[($Xi-4)&7]); # X[]+K xfer to IALU + eval(shift(@insns)); + eval(shift(@insns)); + &psubd (@X[($Xi-4)&7],@Tx[1]); + + foreach (@insns) { eval; } + $Xi++; +} + +sub Xtail_ssse3() +{ use integer; + my $body = shift; + my @insns = (&$body,&$body,&$body,&$body); # 32 instructions + my ($a,$b,$c,$d,$e); + + foreach (@insns) { eval; } +} + +sub body_00_19 () { + ( + '($a,$b,$c,$d,$e)=@V;'. + '&add ($e,eval(4*($j&15))."(%rsp)");', # X[]+K xfer + '&xor ($c,$d);', + '&mov (@T[1],$a);', # $b in next round + '&$_rol ($a,5);', + '&and (@T[0],$c);', # ($b&($c^$d)) + '&xor ($c,$d);', # restore $c + '&xor (@T[0],$d);', + '&add ($e,$a);', + '&$_ror ($b,$j?7:2);', # $b>>>2 + '&add ($e,@T[0]);' .'$j++; unshift(@V,pop(@V)); unshift(@T,pop(@T));' + ); +} + +sub body_20_39 () { + ( + '($a,$b,$c,$d,$e)=@V;'. + '&add ($e,eval(4*($j++&15))."(%rsp)");', # X[]+K xfer + '&xor (@T[0],$d);', # ($b^$d) + '&mov (@T[1],$a);', # $b in next round + '&$_rol ($a,5);', + '&xor (@T[0],$c);', # ($b^$d^$c) + '&add ($e,$a);', + '&$_ror ($b,7);', # $b>>>2 + '&add ($e,@T[0]);' .'unshift(@V,pop(@V)); unshift(@T,pop(@T));' + ); +} + +sub body_40_59 () { + ( + '($a,$b,$c,$d,$e)=@V;'. + '&mov (@T[1],$c);', + '&xor ($c,$d);', + '&add ($e,eval(4*($j++&15))."(%rsp)");', # X[]+K xfer + '&and (@T[1],$d);', + '&and (@T[0],$c);', # ($b&($c^$d)) + '&$_ror ($b,7);', # $b>>>2 + '&add ($e,@T[1]);', + '&mov (@T[1],$a);', # $b in next round + '&$_rol ($a,5);', + '&add ($e,@T[0]);', + '&xor ($c,$d);', # restore $c + '&add ($e,$a);' .'unshift(@V,pop(@V)); unshift(@T,pop(@T));' + ); +} $code.=<<___; -.asciz "SHA1 block transform for x86_64, CRYPTOGAMS by " .align 16 +.Loop_ssse3: +___ + &Xupdate_ssse3_16_31(\&body_00_19); + &Xupdate_ssse3_16_31(\&body_00_19); + &Xupdate_ssse3_16_31(\&body_00_19); + &Xupdate_ssse3_16_31(\&body_00_19); + &Xupdate_ssse3_32_79(\&body_00_19); + &Xupdate_ssse3_32_79(\&body_20_39); + &Xupdate_ssse3_32_79(\&body_20_39); + &Xupdate_ssse3_32_79(\&body_20_39); + &Xupdate_ssse3_32_79(\&body_20_39); + &Xupdate_ssse3_32_79(\&body_20_39); + &Xupdate_ssse3_32_79(\&body_40_59); + &Xupdate_ssse3_32_79(\&body_40_59); + &Xupdate_ssse3_32_79(\&body_40_59); + &Xupdate_ssse3_32_79(\&body_40_59); + &Xupdate_ssse3_32_79(\&body_40_59); + &Xupdate_ssse3_32_79(\&body_20_39); + &Xuplast_ssse3_80(\&body_20_39); # can jump to "done" + + $saved_j=$j; @saved_V=@V; + + &Xloop_ssse3(\&body_20_39); + &Xloop_ssse3(\&body_20_39); + &Xloop_ssse3(\&body_20_39); + +$code.=<<___; + add 0($ctx),$A # update context + add 4($ctx),@T[0] + add 8($ctx),$C + add 12($ctx),$D + mov $A,0($ctx) + add 16($ctx),$E + mov @T[0],4($ctx) + mov @T[0],$B # magic seed + mov $C,8($ctx) + mov $D,12($ctx) + mov $E,16($ctx) + jmp .Loop_ssse3 + +.align 16 +.Ldone_ssse3: +___ + $j=$saved_j; @V=@saved_V; + + &Xtail_ssse3(\&body_20_39); + &Xtail_ssse3(\&body_20_39); + &Xtail_ssse3(\&body_20_39); + +$code.=<<___; + add 0($ctx),$A # update context + add 4($ctx),@T[0] + add 8($ctx),$C + mov $A,0($ctx) + add 12($ctx),$D + mov @T[0],4($ctx) + add 16($ctx),$E + mov $C,8($ctx) + mov $D,12($ctx) + mov $E,16($ctx) +___ +$code.=<<___ if ($win64); + movaps 64+0(%rsp),%xmm6 + movaps 64+16(%rsp),%xmm7 + movaps 64+32(%rsp),%xmm8 + movaps 64+48(%rsp),%xmm9 + movaps 64+64(%rsp),%xmm10 +___ +$code.=<<___; + lea `64+($win64?5*16:0)`(%rsp),%rsi + mov 0(%rsi),%r12 + mov 8(%rsi),%rbp + mov 16(%rsi),%rbx + lea 24(%rsi),%rsp +.Lepilogue_ssse3: + ret +.size sha1_block_data_order_ssse3,.-sha1_block_data_order_ssse3 +___ + +if ($avx) { +my $Xi=4; +my @X=map("%xmm$_",(4..7,0..3)); +my @Tx=map("%xmm$_",(8..10)); +my @V=($A,$B,$C,$D,$E)=("%eax","%ebx","%ecx","%edx","%ebp"); # size optimization +my @T=("%esi","%edi"); +my $j=0; +my $K_XX_XX="%r11"; + +my $_rol=sub { &shld(@_[0],@_) }; +my $_ror=sub { &shrd(@_[0],@_) }; + +$code.=<<___; +.type sha1_block_data_order_avx,\@function,3 +.align 16 +sha1_block_data_order_avx: +_avx_shortcut: + push %rbx + push %rbp + push %r12 + lea `-64-($win64?5*16:0)`(%rsp),%rsp +___ +$code.=<<___ if ($win64); + movaps %xmm6,64+0(%rsp) + movaps %xmm7,64+16(%rsp) + movaps %xmm8,64+32(%rsp) + movaps %xmm9,64+48(%rsp) + movaps %xmm10,64+64(%rsp) +.Lprologue_avx: +___ +$code.=<<___; + mov %rdi,$ctx # reassigned argument + mov %rsi,$inp # reassigned argument + mov %rdx,$num # reassigned argument + vzeroall + + shl \$6,$num + add $inp,$num + lea K_XX_XX(%rip),$K_XX_XX + + mov 0($ctx),$A # load context + mov 4($ctx),$B + mov 8($ctx),$C + mov 12($ctx),$D + mov $B,@T[0] # magic seed + mov 16($ctx),$E + + vmovdqa 64($K_XX_XX),@X[2] # pbswap mask + vmovdqa 0($K_XX_XX),@Tx[1] # K_00_19 + vmovdqu 0($inp),@X[-4&7] # load input to %xmm[0-3] + vmovdqu 16($inp),@X[-3&7] + vmovdqu 32($inp),@X[-2&7] + vmovdqu 48($inp),@X[-1&7] + vpshufb @X[2],@X[-4&7],@X[-4&7] # byte swap + add \$64,$inp + vpshufb @X[2],@X[-3&7],@X[-3&7] + vpshufb @X[2],@X[-2&7],@X[-2&7] + vpshufb @X[2],@X[-1&7],@X[-1&7] + vpaddd @Tx[1],@X[-4&7],@X[0] # add K_00_19 + vpaddd @Tx[1],@X[-3&7],@X[1] + vpaddd @Tx[1],@X[-2&7],@X[2] + vmovdqa @X[0],0(%rsp) # X[]+K xfer to IALU + vmovdqa @X[1],16(%rsp) + vmovdqa @X[2],32(%rsp) + jmp .Loop_avx +___ + +sub Xupdate_avx_16_31() # recall that $Xi starts wtih 4 +{ use integer; + my $body = shift; + my @insns = (&$body,&$body,&$body,&$body); # 40 instructions + my ($a,$b,$c,$d,$e); + + eval(shift(@insns)); + eval(shift(@insns)); + &vpalignr(@X[0],@X[-3&7],@X[-4&7],8); # compose "X[-14]" in "X[0]" + eval(shift(@insns)); + eval(shift(@insns)); + + &vpaddd (@Tx[1],@Tx[1],@X[-1&7]); + eval(shift(@insns)); + eval(shift(@insns)); + &vpsrldq(@Tx[0],@X[-1&7],4); # "X[-3]", 3 dwords + eval(shift(@insns)); + eval(shift(@insns)); + &vpxor (@X[0],@X[0],@X[-4&7]); # "X[0]"^="X[-16]" + eval(shift(@insns)); + eval(shift(@insns)); + + &vpxor (@Tx[0],@Tx[0],@X[-2&7]); # "X[-3]"^"X[-8]" + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + + &vpxor (@X[0],@X[0],@Tx[0]); # "X[0]"^="X[-3]"^"X[-8]" + eval(shift(@insns)); + eval(shift(@insns)); + &vmovdqa (eval(16*(($Xi-1)&3))."(%rsp)",@Tx[1]); # X[]+K xfer to IALU + eval(shift(@insns)); + eval(shift(@insns)); + + &vpsrld (@Tx[0],@X[0],31); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + + &vpslldq(@Tx[2],@X[0],12); # "X[0]"<<96, extract one dword + &vpaddd (@X[0],@X[0],@X[0]); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + + &vpsrld (@Tx[1],@Tx[2],30); + &vpor (@X[0],@X[0],@Tx[0]); # "X[0]"<<<=1 + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + + &vpslld (@Tx[2],@Tx[2],2); + &vpxor (@X[0],@X[0],@Tx[1]); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + + &vpxor (@X[0],@X[0],@Tx[2]); # "X[0]"^=("X[0]">>96)<<<2 + eval(shift(@insns)); + eval(shift(@insns)); + &vmovdqa (@Tx[2],eval(16*(($Xi)/5))."($K_XX_XX)"); # K_XX_XX + eval(shift(@insns)); + eval(shift(@insns)); + + + foreach (@insns) { eval; } # remaining instructions [if any] + + $Xi++; push(@X,shift(@X)); # "rotate" X[] + push(@Tx,shift(@Tx)); +} + +sub Xupdate_avx_32_79() +{ use integer; + my $body = shift; + my @insns = (&$body,&$body,&$body,&$body); # 32 to 48 instructions + my ($a,$b,$c,$d,$e); + + &vpalignr(@Tx[0],@X[-1&7],@X[-2&7],8); # compose "X[-6]" + &vpxor (@X[0],@X[0],@X[-4&7]); # "X[0]"="X[-32]"^"X[-16]" + eval(shift(@insns)); # body_20_39 + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); # rol + + &vpxor (@X[0],@X[0],@X[-7&7]); # "X[0]"^="X[-28]" + eval(shift(@insns)); + eval(shift(@insns)) if (@insns[0] !~ /&ro[rl]/); + if ($Xi%5) { + &vmovdqa (@Tx[2],@Tx[1]);# "perpetuate" K_XX_XX... + } else { # ... or load next one + &vmovdqa (@Tx[2],eval(16*($Xi/5))."($K_XX_XX)"); + } + &vpaddd (@Tx[1],@Tx[1],@X[-1&7]); + eval(shift(@insns)); # ror + eval(shift(@insns)); + + &vpxor (@X[0],@X[0],@Tx[0]); # "X[0]"^="X[-6]" + eval(shift(@insns)); # body_20_39 + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); # rol + + &vpsrld (@Tx[0],@X[0],30); + &vmovdqa (eval(16*(($Xi-1)&3))."(%rsp)",@Tx[1]); # X[]+K xfer to IALU + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); # ror + eval(shift(@insns)); + + &vpslld (@X[0],@X[0],2); + eval(shift(@insns)); # body_20_39 + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); # rol + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); # ror + eval(shift(@insns)); + + &vpor (@X[0],@X[0],@Tx[0]); # "X[0]"<<<=2 + eval(shift(@insns)); # body_20_39 + eval(shift(@insns)); + &vmovdqa (@Tx[1],@X[0]) if ($Xi<19); + eval(shift(@insns)); + eval(shift(@insns)); # rol + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); # rol + eval(shift(@insns)); + + foreach (@insns) { eval; } # remaining instructions + + $Xi++; push(@X,shift(@X)); # "rotate" X[] + push(@Tx,shift(@Tx)); +} + +sub Xuplast_avx_80() +{ use integer; + my $body = shift; + my @insns = (&$body,&$body,&$body,&$body); # 32 instructions + my ($a,$b,$c,$d,$e); + + eval(shift(@insns)); + &vpaddd (@Tx[1],@Tx[1],@X[-1&7]); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + + &movdqa (eval(16*(($Xi-1)&3))."(%rsp)",@Tx[1]); # X[]+K xfer IALU + + foreach (@insns) { eval; } # remaining instructions + + &cmp ($inp,$num); + &je (".Ldone_avx"); + + unshift(@Tx,pop(@Tx)); + + &vmovdqa(@X[2],"64($K_XX_XX)"); # pbswap mask + &vmovdqa(@Tx[1],"0($K_XX_XX)"); # K_00_19 + &vmovdqu(@X[-4&7],"0($inp)"); # load input + &vmovdqu(@X[-3&7],"16($inp)"); + &vmovdqu(@X[-2&7],"32($inp)"); + &vmovdqu(@X[-1&7],"48($inp)"); + &vpshufb(@X[-4&7],@X[-4&7],@X[2]); # byte swap + &add ($inp,64); + + $Xi=0; +} + +sub Xloop_avx() +{ use integer; + my $body = shift; + my @insns = (&$body,&$body,&$body,&$body); # 32 instructions + my ($a,$b,$c,$d,$e); + + eval(shift(@insns)); + eval(shift(@insns)); + &vpshufb(@X[($Xi-3)&7],@X[($Xi-3)&7],@X[2]); + eval(shift(@insns)); + eval(shift(@insns)); + &vpaddd (@X[$Xi&7],@X[($Xi-4)&7],@Tx[1]); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + eval(shift(@insns)); + &vmovdqa(eval(16*$Xi)."(%rsp)",@X[$Xi&7]); # X[]+K xfer to IALU + eval(shift(@insns)); + eval(shift(@insns)); + + foreach (@insns) { eval; } + $Xi++; +} + +sub Xtail_avx() +{ use integer; + my $body = shift; + my @insns = (&$body,&$body,&$body,&$body); # 32 instructions + my ($a,$b,$c,$d,$e); + + foreach (@insns) { eval; } +} + +$code.=<<___; +.align 16 +.Loop_avx: +___ + &Xupdate_avx_16_31(\&body_00_19); + &Xupdate_avx_16_31(\&body_00_19); + &Xupdate_avx_16_31(\&body_00_19); + &Xupdate_avx_16_31(\&body_00_19); + &Xupdate_avx_32_79(\&body_00_19); + &Xupdate_avx_32_79(\&body_20_39); + &Xupdate_avx_32_79(\&body_20_39); + &Xupdate_avx_32_79(\&body_20_39); + &Xupdate_avx_32_79(\&body_20_39); + &Xupdate_avx_32_79(\&body_20_39); + &Xupdate_avx_32_79(\&body_40_59); + &Xupdate_avx_32_79(\&body_40_59); + &Xupdate_avx_32_79(\&body_40_59); + &Xupdate_avx_32_79(\&body_40_59); + &Xupdate_avx_32_79(\&body_40_59); + &Xupdate_avx_32_79(\&body_20_39); + &Xuplast_avx_80(\&body_20_39); # can jump to "done" + + $saved_j=$j; @saved_V=@V; + + &Xloop_avx(\&body_20_39); + &Xloop_avx(\&body_20_39); + &Xloop_avx(\&body_20_39); + +$code.=<<___; + add 0($ctx),$A # update context + add 4($ctx),@T[0] + add 8($ctx),$C + add 12($ctx),$D + mov $A,0($ctx) + add 16($ctx),$E + mov @T[0],4($ctx) + mov @T[0],$B # magic seed + mov $C,8($ctx) + mov $D,12($ctx) + mov $E,16($ctx) + jmp .Loop_avx + +.align 16 +.Ldone_avx: +___ + $j=$saved_j; @V=@saved_V; + + &Xtail_avx(\&body_20_39); + &Xtail_avx(\&body_20_39); + &Xtail_avx(\&body_20_39); + +$code.=<<___; + vzeroall + + add 0($ctx),$A # update context + add 4($ctx),@T[0] + add 8($ctx),$C + mov $A,0($ctx) + add 12($ctx),$D + mov @T[0],4($ctx) + add 16($ctx),$E + mov $C,8($ctx) + mov $D,12($ctx) + mov $E,16($ctx) +___ +$code.=<<___ if ($win64); + movaps 64+0(%rsp),%xmm6 + movaps 64+16(%rsp),%xmm7 + movaps 64+32(%rsp),%xmm8 + movaps 64+48(%rsp),%xmm9 + movaps 64+64(%rsp),%xmm10 +___ +$code.=<<___; + lea `64+($win64?5*16:0)`(%rsp),%rsi + mov 0(%rsi),%r12 + mov 8(%rsi),%rbp + mov 16(%rsi),%rbx + lea 24(%rsi),%rsp +.Lepilogue_avx: + ret +.size sha1_block_data_order_avx,.-sha1_block_data_order_avx +___ +} +$code.=<<___; +.align 64 +K_XX_XX: +.long 0x5a827999,0x5a827999,0x5a827999,0x5a827999 # K_00_19 +.long 0x6ed9eba1,0x6ed9eba1,0x6ed9eba1,0x6ed9eba1 # K_20_39 +.long 0x8f1bbcdc,0x8f1bbcdc,0x8f1bbcdc,0x8f1bbcdc # K_40_59 +.long 0xca62c1d6,0xca62c1d6,0xca62c1d6,0xca62c1d6 # K_60_79 +.long 0x00010203,0x04050607,0x08090a0b,0x0c0d0e0f # pbswap mask +___ +}}} +$code.=<<___; +.asciz "SHA1 block transform for x86_64, CRYPTOGAMS by " +.align 64 ___ # EXCEPTION_DISPOSITION handler (EXCEPTION_RECORD *rec,ULONG64 frame, @@ -272,25 +1112,75 @@ se_handler: lea .Lprologue(%rip),%r10 cmp %r10,%rbx # context->Rip<.Lprologue - jb .Lin_prologue + jb .Lcommon_seh_tail mov 152($context),%rax # pull context->Rsp lea .Lepilogue(%rip),%r10 cmp %r10,%rbx # context->Rip>=.Lepilogue - jae .Lin_prologue + jae .Lcommon_seh_tail mov `16*4`(%rax),%rax # pull saved stack pointer - lea 24(%rax),%rax + lea 32(%rax),%rax mov -8(%rax),%rbx mov -16(%rax),%rbp mov -24(%rax),%r12 + mov -32(%rax),%r13 mov %rbx,144($context) # restore context->Rbx mov %rbp,160($context) # restore context->Rbp mov %r12,216($context) # restore context->R12 + mov %r13,224($context) # restore context->R13 + + jmp .Lcommon_seh_tail +.size se_handler,.-se_handler -.Lin_prologue: +.type ssse3_handler,\@abi-omnipotent +.align 16 +ssse3_handler: + push %rsi + push %rdi + push %rbx + push %rbp + push %r12 + push %r13 + push %r14 + push %r15 + pushfq + sub \$64,%rsp + + mov 120($context),%rax # pull context->Rax + mov 248($context),%rbx # pull context->Rip + + mov 8($disp),%rsi # disp->ImageBase + mov 56($disp),%r11 # disp->HandlerData + + mov 0(%r11),%r10d # HandlerData[0] + lea (%rsi,%r10),%r10 # prologue label + cmp %r10,%rbx # context->RipRsp + + mov 4(%r11),%r10d # HandlerData[1] + lea (%rsi,%r10),%r10 # epilogue label + cmp %r10,%rbx # context->Rip>=epilogue label + jae .Lcommon_seh_tail + + lea 64(%rax),%rsi + lea 512($context),%rdi # &context.Xmm6 + mov \$10,%ecx + .long 0xa548f3fc # cld; rep movsq + lea `24+64+5*16`(%rax),%rax # adjust stack pointer + + mov -8(%rax),%rbx + mov -16(%rax),%rbp + mov -24(%rax),%r12 + mov %rbx,144($context) # restore context->Rbx + mov %rbp,160($context) # restore context->Rbp + mov %r12,216($context) # restore cotnext->R12 + +.Lcommon_seh_tail: mov 8(%rax),%rdi mov 16(%rax),%rsi mov %rax,152($context) # restore context->Rsp @@ -328,19 +1218,38 @@ se_handler: pop %rdi pop %rsi ret -.size se_handler,.-se_handler +.size ssse3_handler,.-ssse3_handler .section .pdata .align 4 .rva .LSEH_begin_sha1_block_data_order .rva .LSEH_end_sha1_block_data_order .rva .LSEH_info_sha1_block_data_order - + .rva .LSEH_begin_sha1_block_data_order_ssse3 + .rva .LSEH_end_sha1_block_data_order_ssse3 + .rva .LSEH_info_sha1_block_data_order_ssse3 +___ +$code.=<<___ if ($avx); + .rva .LSEH_begin_sha1_block_data_order_avx + .rva .LSEH_end_sha1_block_data_order_avx + .rva .LSEH_info_sha1_block_data_order_avx +___ +$code.=<<___; .section .xdata .align 8 .LSEH_info_sha1_block_data_order: .byte 9,0,0,0 .rva se_handler +.LSEH_info_sha1_block_data_order_ssse3: + .byte 9,0,0,0 + .rva ssse3_handler + .rva .Lprologue_ssse3,.Lepilogue_ssse3 # HandlerData[] +___ +$code.=<<___ if ($avx); +.LSEH_info_sha1_block_data_order_avx: + .byte 9,0,0,0 + .rva ssse3_handler + .rva .Lprologue_avx,.Lepilogue_avx # HandlerData[] ___ } diff --git a/crypto/openssl/crypto/sha/asm/sha256-586.pl b/crypto/openssl/crypto/sha/asm/sha256-586.pl index ecc8b69c75..928ec53123 100644 --- a/crypto/openssl/crypto/sha/asm/sha256-586.pl +++ b/crypto/openssl/crypto/sha/asm/sha256-586.pl @@ -14,8 +14,8 @@ # Pentium PIII P4 AMD K8 Core2 # gcc 46 36 41 27 26 # icc 57 33 38 25 23 -# x86 asm 40 30 35 20 20 -# x86_64 asm(*) - - 21 15.8 16.5 +# x86 asm 40 30 33 20 18 +# x86_64 asm(*) - - 21 16 16 # # (*) x86_64 assembler performance is presented for reference # purposes. @@ -48,20 +48,19 @@ sub BODY_00_15() { my $in_16_63=shift; &mov ("ecx",$E); - &add ($T,&DWP(4*(8+15+16-9),"esp")) if ($in_16_63); # T += X[-7] - &ror ("ecx",6); - &mov ("edi",$E); - &ror ("edi",11); + &add ($T,"edi") if ($in_16_63); # T += sigma1(X[-2]) + &ror ("ecx",25-11); &mov ("esi",$Foff); - &xor ("ecx","edi"); - &ror ("edi",25-11); + &xor ("ecx",$E); + &ror ("ecx",11-6); &mov (&DWP(4*(8+15),"esp"),$T) if ($in_16_63); # save X[0] - &xor ("ecx","edi"); # Sigma1(e) + &xor ("ecx",$E); + &ror ("ecx",6); # Sigma1(e) &mov ("edi",$Goff); &add ($T,"ecx"); # T += Sigma1(e) - &mov ($Eoff,$E); # modulo-scheduled &xor ("esi","edi"); + &mov ($Eoff,$E); # modulo-scheduled &mov ("ecx",$A); &and ("esi",$E); &mov ($E,$Doff); # e becomes d, which is e in next iteration @@ -69,14 +68,14 @@ sub BODY_00_15() { &mov ("edi",$A); &add ($T,"esi"); # T += Ch(e,f,g) - &ror ("ecx",2); + &ror ("ecx",22-13); &add ($T,$Hoff); # T += h - &ror ("edi",13); + &xor ("ecx",$A); + &ror ("ecx",13-2); &mov ("esi",$Boff); - &xor ("ecx","edi"); - &ror ("edi",22-13); + &xor ("ecx",$A); + &ror ("ecx",2); # Sigma0(a) &add ($E,$T); # d += T - &xor ("ecx","edi"); # Sigma0(a) &mov ("edi",$Coff); &add ($T,"ecx"); # T += Sigma0(a) @@ -168,23 +167,22 @@ sub BODY_00_15() { &set_label("16_63",16); &mov ("esi",$T); &mov ("ecx",&DWP(4*(8+15+16-14),"esp")); - &shr ($T,3); - &ror ("esi",7); - &xor ($T,"esi"); &ror ("esi",18-7); &mov ("edi","ecx"); - &xor ($T,"esi"); # T = sigma0(X[-15]) + &xor ("esi",$T); + &ror ("esi",7); + &shr ($T,3); - &shr ("ecx",10); - &mov ("esi",&DWP(4*(8+15+16),"esp")); - &ror ("edi",17); - &xor ("ecx","edi"); &ror ("edi",19-17); - &add ($T,"esi"); # T += X[-16] - &xor ("edi","ecx") # sigma1(X[-2]) + &xor ($T,"esi"); # T = sigma0(X[-15]) + &xor ("edi","ecx"); + &ror ("edi",17); + &shr ("ecx",10); + &add ($T,&DWP(4*(8+15+16),"esp")); # T += X[-16] + &xor ("edi","ecx"); # sigma1(X[-2]) - &add ($T,"edi"); # T += sigma1(X[-2]) - # &add ($T,&DWP(4*(8+15+16-9),"esp")); # T += X[-7], moved to BODY_00_15(1) + &add ($T,&DWP(4*(8+15+16-9),"esp")); # T += X[-7] + # &add ($T,"edi"); # T += sigma1(X[-2]) # &mov (&DWP(4*(8+15),"esp"),$T); # save X[0] &BODY_00_15(1); diff --git a/crypto/openssl/crypto/sha/asm/sha512-x86_64.pl b/crypto/openssl/crypto/sha/asm/sha512-x86_64.pl index e6643f8cf6..f611a2d898 100755 --- a/crypto/openssl/crypto/sha/asm/sha512-x86_64.pl +++ b/crypto/openssl/crypto/sha/asm/sha512-x86_64.pl @@ -95,50 +95,44 @@ sub ROUND_00_15() { my ($i,$a,$b,$c,$d,$e,$f,$g,$h) = @_; $code.=<<___; - mov $e,$a0 - mov $e,$a1 + ror \$`$Sigma1[2]-$Sigma1[1]`,$a0 mov $f,$a2 + mov $T1,`$SZ*($i&0xf)`(%rsp) - ror \$$Sigma1[0],$a0 - ror \$$Sigma1[1],$a1 + ror \$`$Sigma0[2]-$Sigma0[1]`,$a1 + xor $e,$a0 xor $g,$a2 # f^g - xor $a1,$a0 - ror \$`$Sigma1[2]-$Sigma1[1]`,$a1 + ror \$`$Sigma1[1]-$Sigma1[0]`,$a0 + add $h,$T1 # T1+=h + xor $a,$a1 + + add ($Tbl,$round,$SZ),$T1 # T1+=K[round] and $e,$a2 # (f^g)&e - mov $T1,`$SZ*($i&0xf)`(%rsp) + mov $b,$h - xor $a1,$a0 # Sigma1(e) + ror \$`$Sigma0[1]-$Sigma0[0]`,$a1 + xor $e,$a0 xor $g,$a2 # Ch(e,f,g)=((f^g)&e)^g - add $h,$T1 # T1+=h - - mov $a,$h - add $a0,$T1 # T1+=Sigma1(e) + xor $c,$h # b^c + xor $a,$a1 add $a2,$T1 # T1+=Ch(e,f,g) - mov $a,$a0 - mov $a,$a1 + mov $b,$a2 - ror \$$Sigma0[0],$h - ror \$$Sigma0[1],$a0 - mov $a,$a2 - add ($Tbl,$round,$SZ),$T1 # T1+=K[round] + ror \$$Sigma1[0],$a0 # Sigma1(e) + and $a,$h # h=(b^c)&a + and $c,$a2 # b&c - xor $a0,$h - ror \$`$Sigma0[2]-$Sigma0[1]`,$a0 - or $c,$a1 # a|c + ror \$$Sigma0[0],$a1 # Sigma0(a) + add $a0,$T1 # T1+=Sigma1(e) + add $a2,$h # h+=b&c (completes +=Maj(a,b,c) - xor $a0,$h # h=Sigma0(a) - and $c,$a2 # a&c add $T1,$d # d+=T1 - - and $b,$a1 # (a|c)&b add $T1,$h # h+=T1 - - or $a2,$a1 # Maj(a,b,c)=((a|c)&b)|(a&c) lea 1($round),$round # round++ + add $a1,$h # h+=Sigma0(a) - add $a1,$h # h+=Maj(a,b,c) ___ } @@ -147,32 +141,30 @@ sub ROUND_16_XX() $code.=<<___; mov `$SZ*(($i+1)&0xf)`(%rsp),$a0 - mov `$SZ*(($i+14)&0xf)`(%rsp),$T1 - - mov $a0,$a2 + mov `$SZ*(($i+14)&0xf)`(%rsp),$a1 + mov $a0,$T1 + mov $a1,$a2 + ror \$`$sigma0[1]-$sigma0[0]`,$T1 + xor $a0,$T1 shr \$$sigma0[2],$a0 - ror \$$sigma0[0],$a2 - - xor $a2,$a0 - ror \$`$sigma0[1]-$sigma0[0]`,$a2 - xor $a2,$a0 # sigma0(X[(i+1)&0xf]) - mov $T1,$a1 + ror \$$sigma0[0],$T1 + xor $T1,$a0 # sigma0(X[(i+1)&0xf]) + mov `$SZ*(($i+9)&0xf)`(%rsp),$T1 - shr \$$sigma1[2],$T1 - ror \$$sigma1[0],$a1 - - xor $a1,$T1 - ror \$`$sigma1[1]-$sigma1[0]`,$a1 - - xor $a1,$T1 # sigma1(X[(i+14)&0xf]) + ror \$`$sigma1[1]-$sigma1[0]`,$a2 + xor $a1,$a2 + shr \$$sigma1[2],$a1 + ror \$$sigma1[0],$a2 add $a0,$T1 - - add `$SZ*(($i+9)&0xf)`(%rsp),$T1 + xor $a2,$a1 # sigma1(X[(i+14)&0xf]) add `$SZ*($i&0xf)`(%rsp),$T1 + mov $e,$a0 + add $a1,$T1 + mov $a,$a1 ___ &ROUND_00_15(@_); } @@ -219,6 +211,8 @@ $func: ___ for($i=0;$i<16;$i++) { $code.=" mov $SZ*$i($inp),$T1\n"; + $code.=" mov @ROT[4],$a0\n"; + $code.=" mov @ROT[0],$a1\n"; $code.=" bswap $T1\n"; &ROUND_00_15($i,@ROT); unshift(@ROT,pop(@ROT)); diff --git a/crypto/openssl/crypto/sha/sha.h b/crypto/openssl/crypto/sha/sha.h index 16cacf9fc0..8a6bf4bbbb 100644 --- a/crypto/openssl/crypto/sha/sha.h +++ b/crypto/openssl/crypto/sha/sha.h @@ -106,6 +106,9 @@ typedef struct SHAstate_st } SHA_CTX; #ifndef OPENSSL_NO_SHA0 +#ifdef OPENSSL_FIPS +int private_SHA_Init(SHA_CTX *c); +#endif int SHA_Init(SHA_CTX *c); int SHA_Update(SHA_CTX *c, const void *data, size_t len); int SHA_Final(unsigned char *md, SHA_CTX *c); @@ -113,6 +116,9 @@ unsigned char *SHA(const unsigned char *d, size_t n, unsigned char *md); void SHA_Transform(SHA_CTX *c, const unsigned char *data); #endif #ifndef OPENSSL_NO_SHA1 +#ifdef OPENSSL_FIPS +int private_SHA1_Init(SHA_CTX *c); +#endif int SHA1_Init(SHA_CTX *c); int SHA1_Update(SHA_CTX *c, const void *data, size_t len); int SHA1_Final(unsigned char *md, SHA_CTX *c); @@ -135,6 +141,10 @@ typedef struct SHA256state_st } SHA256_CTX; #ifndef OPENSSL_NO_SHA256 +#ifdef OPENSSL_FIPS +int private_SHA224_Init(SHA256_CTX *c); +int private_SHA256_Init(SHA256_CTX *c); +#endif int SHA224_Init(SHA256_CTX *c); int SHA224_Update(SHA256_CTX *c, const void *data, size_t len); int SHA224_Final(unsigned char *md, SHA256_CTX *c); @@ -182,6 +192,10 @@ typedef struct SHA512state_st #endif #ifndef OPENSSL_NO_SHA512 +#ifdef OPENSSL_FIPS +int private_SHA384_Init(SHA512_CTX *c); +int private_SHA512_Init(SHA512_CTX *c); +#endif int SHA384_Init(SHA512_CTX *c); int SHA384_Update(SHA512_CTX *c, const void *data, size_t len); int SHA384_Final(unsigned char *md, SHA512_CTX *c); diff --git a/crypto/openssl/crypto/sha/sha1dgst.c b/crypto/openssl/crypto/sha/sha1dgst.c index 50d1925cde..81219af088 100644 --- a/crypto/openssl/crypto/sha/sha1dgst.c +++ b/crypto/openssl/crypto/sha/sha1dgst.c @@ -57,6 +57,7 @@ */ #include +#include #if !defined(OPENSSL_NO_SHA1) && !defined(OPENSSL_NO_SHA) #undef SHA_0 diff --git a/crypto/openssl/crypto/sha/sha256.c b/crypto/openssl/crypto/sha/sha256.c index 8952d87673..f88d3d6dad 100644 --- a/crypto/openssl/crypto/sha/sha256.c +++ b/crypto/openssl/crypto/sha/sha256.c @@ -16,7 +16,7 @@ const char SHA256_version[]="SHA-256" OPENSSL_VERSION_PTEXT; -int SHA224_Init (SHA256_CTX *c) +fips_md_init_ctx(SHA224, SHA256) { memset (c,0,sizeof(*c)); c->h[0]=0xc1059ed8UL; c->h[1]=0x367cd507UL; @@ -27,7 +27,7 @@ int SHA224_Init (SHA256_CTX *c) return 1; } -int SHA256_Init (SHA256_CTX *c) +fips_md_init(SHA256) { memset (c,0,sizeof(*c)); c->h[0]=0x6a09e667UL; c->h[1]=0xbb67ae85UL; diff --git a/crypto/openssl/crypto/sha/sha512.c b/crypto/openssl/crypto/sha/sha512.c index cbc0e58c48..50dd7dc744 100644 --- a/crypto/openssl/crypto/sha/sha512.c +++ b/crypto/openssl/crypto/sha/sha512.c @@ -59,21 +59,8 @@ const char SHA512_version[]="SHA-512" OPENSSL_VERSION_PTEXT; #define SHA512_BLOCK_CAN_MANAGE_UNALIGNED_DATA #endif -int SHA384_Init (SHA512_CTX *c) +fips_md_init_ctx(SHA384, SHA512) { -#if defined(SHA512_ASM) && (defined(__arm__) || defined(__arm)) - /* maintain dword order required by assembler module */ - unsigned int *h = (unsigned int *)c->h; - - h[0] = 0xcbbb9d5d; h[1] = 0xc1059ed8; - h[2] = 0x629a292a; h[3] = 0x367cd507; - h[4] = 0x9159015a; h[5] = 0x3070dd17; - h[6] = 0x152fecd8; h[7] = 0xf70e5939; - h[8] = 0x67332667; h[9] = 0xffc00b31; - h[10] = 0x8eb44a87; h[11] = 0x68581511; - h[12] = 0xdb0c2e0d; h[13] = 0x64f98fa7; - h[14] = 0x47b5481d; h[15] = 0xbefa4fa4; -#else c->h[0]=U64(0xcbbb9d5dc1059ed8); c->h[1]=U64(0x629a292a367cd507); c->h[2]=U64(0x9159015a3070dd17); @@ -82,27 +69,14 @@ int SHA384_Init (SHA512_CTX *c) c->h[5]=U64(0x8eb44a8768581511); c->h[6]=U64(0xdb0c2e0d64f98fa7); c->h[7]=U64(0x47b5481dbefa4fa4); -#endif + c->Nl=0; c->Nh=0; c->num=0; c->md_len=SHA384_DIGEST_LENGTH; return 1; } -int SHA512_Init (SHA512_CTX *c) +fips_md_init(SHA512) { -#if defined(SHA512_ASM) && (defined(__arm__) || defined(__arm)) - /* maintain dword order required by assembler module */ - unsigned int *h = (unsigned int *)c->h; - - h[0] = 0x6a09e667; h[1] = 0xf3bcc908; - h[2] = 0xbb67ae85; h[3] = 0x84caa73b; - h[4] = 0x3c6ef372; h[5] = 0xfe94f82b; - h[6] = 0xa54ff53a; h[7] = 0x5f1d36f1; - h[8] = 0x510e527f; h[9] = 0xade682d1; - h[10] = 0x9b05688c; h[11] = 0x2b3e6c1f; - h[12] = 0x1f83d9ab; h[13] = 0xfb41bd6b; - h[14] = 0x5be0cd19; h[15] = 0x137e2179; -#else c->h[0]=U64(0x6a09e667f3bcc908); c->h[1]=U64(0xbb67ae8584caa73b); c->h[2]=U64(0x3c6ef372fe94f82b); @@ -111,7 +85,7 @@ int SHA512_Init (SHA512_CTX *c) c->h[5]=U64(0x9b05688c2b3e6c1f); c->h[6]=U64(0x1f83d9abfb41bd6b); c->h[7]=U64(0x5be0cd19137e2179); -#endif + c->Nl=0; c->Nh=0; c->num=0; c->md_len=SHA512_DIGEST_LENGTH; return 1; @@ -160,24 +134,6 @@ int SHA512_Final (unsigned char *md, SHA512_CTX *c) if (md==0) return 0; -#if defined(SHA512_ASM) && (defined(__arm__) || defined(__arm)) - /* recall assembler dword order... */ - n = c->md_len; - if (n == SHA384_DIGEST_LENGTH || n == SHA512_DIGEST_LENGTH) - { - unsigned int *h = (unsigned int *)c->h, t; - - for (n/=4;n;n--) - { - t = *(h++); - *(md++) = (unsigned char)(t>>24); - *(md++) = (unsigned char)(t>>16); - *(md++) = (unsigned char)(t>>8); - *(md++) = (unsigned char)(t); - } - } - else return 0; -#else switch (c->md_len) { /* Let compiler decide if it's appropriate to unroll... */ @@ -214,7 +170,7 @@ int SHA512_Final (unsigned char *md, SHA512_CTX *c) /* ... as well as make sure md_len is not abused. */ default: return 0; } -#endif + return 1; } diff --git a/crypto/openssl/crypto/sha/sha_dgst.c b/crypto/openssl/crypto/sha/sha_dgst.c index 70eb56032c..c946ad827d 100644 --- a/crypto/openssl/crypto/sha/sha_dgst.c +++ b/crypto/openssl/crypto/sha/sha_dgst.c @@ -57,6 +57,7 @@ */ #include +#include #if !defined(OPENSSL_NO_SHA0) && !defined(OPENSSL_NO_SHA) #undef SHA_1 diff --git a/crypto/openssl/crypto/sha/sha_locl.h b/crypto/openssl/crypto/sha/sha_locl.h index 672c26eee1..7a0c3ca8d8 100644 --- a/crypto/openssl/crypto/sha/sha_locl.h +++ b/crypto/openssl/crypto/sha/sha_locl.h @@ -122,7 +122,11 @@ void sha1_block_data_order (SHA_CTX *c, const void *p,size_t num); #define INIT_DATA_h3 0x10325476UL #define INIT_DATA_h4 0xc3d2e1f0UL -int HASH_INIT (SHA_CTX *c) +#ifdef SHA_0 +fips_md_init(SHA) +#else +fips_md_init_ctx(SHA1, SHA) +#endif { memset (c,0,sizeof(*c)); c->h0=INIT_DATA_h0; diff --git a/crypto/openssl/crypto/ecdh/ecdh.h b/crypto/openssl/crypto/srp/srp.h similarity index 50% copy from crypto/openssl/crypto/ecdh/ecdh.h copy to crypto/openssl/crypto/srp/srp.h index b4b58ee65b..7ec7825cad 100644 --- a/crypto/openssl/crypto/ecdh/ecdh.h +++ b/crypto/openssl/crypto/srp/srp.h @@ -1,20 +1,10 @@ -/* crypto/ecdh/ecdh.h */ -/* ==================================================================== - * Copyright 2002 Sun Microsystems, Inc. ALL RIGHTS RESERVED. - * - * The Elliptic Curve Public-Key Crypto Library (ECC Code) included - * herein is developed by SUN MICROSYSTEMS, INC., and is contributed - * to the OpenSSL project. - * - * The ECC Code is licensed pursuant to the OpenSSL open source - * license provided below. - * - * The ECDH software is originally written by Douglas Stebila of - * Sun Microsystems Laboratories. - * +/* crypto/srp/srp.h */ +/* Written by Christophe Renou (christophe.renou@edelweb.fr) with + * the precious help of Peter Sylvester (peter.sylvester@edelweb.fr) + * for the EdelKey project and contributed to the OpenSSL project 2004. */ /* ==================================================================== - * Copyright (c) 2000-2002 The OpenSSL Project. All rights reserved. + * Copyright (c) 2004 The OpenSSL Project. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions @@ -66,58 +56,117 @@ * Hudson (tjh@cryptsoft.com). * */ -#ifndef HEADER_ECDH_H -#define HEADER_ECDH_H +#ifndef __SRP_H__ +#define __SRP_H__ + +#ifndef OPENSSL_NO_SRP -#include +#include +#include -#ifdef OPENSSL_NO_ECDH -#error ECDH is disabled. +#ifdef __cplusplus +extern "C" { #endif -#include -#include -#ifndef OPENSSL_NO_DEPRECATED +#include #include -#endif +#include -#ifdef __cplusplus -extern "C" { -#endif +typedef struct SRP_gN_cache_st + { + char *b64_bn; + BIGNUM *bn; + } SRP_gN_cache; -const ECDH_METHOD *ECDH_OpenSSL(void); -void ECDH_set_default_method(const ECDH_METHOD *); -const ECDH_METHOD *ECDH_get_default_method(void); -int ECDH_set_method(EC_KEY *, const ECDH_METHOD *); +DECLARE_STACK_OF(SRP_gN_cache) -int ECDH_compute_key(void *out, size_t outlen, const EC_POINT *pub_key, EC_KEY *ecdh, - void *(*KDF)(const void *in, size_t inlen, void *out, size_t *outlen)); +typedef struct SRP_user_pwd_st + { + char *id; + BIGNUM *s; + BIGNUM *v; + const BIGNUM *g; + const BIGNUM *N; + char *info; + } SRP_user_pwd; -int ECDH_get_ex_new_index(long argl, void *argp, CRYPTO_EX_new - *new_func, CRYPTO_EX_dup *dup_func, CRYPTO_EX_free *free_func); -int ECDH_set_ex_data(EC_KEY *d, int idx, void *arg); -void *ECDH_get_ex_data(EC_KEY *d, int idx); +DECLARE_STACK_OF(SRP_user_pwd) +typedef struct SRP_VBASE_st + { + STACK_OF(SRP_user_pwd) *users_pwd; + STACK_OF(SRP_gN_cache) *gN_cache; +/* to simulate a user */ + char *seed_key; + BIGNUM *default_g; + BIGNUM *default_N; + } SRP_VBASE; -/* BEGIN ERROR CODES */ -/* The following lines are auto generated by the script mkerr.pl. Any changes - * made after this point may be overwritten when the script is next run. - */ -void ERR_load_ECDH_strings(void); -/* Error codes for the ECDH functions. */ +/*Structure interne pour retenir les couples N et g*/ +typedef struct SRP_gN_st + { + char *id; + BIGNUM *g; + BIGNUM *N; + } SRP_gN; + +DECLARE_STACK_OF(SRP_gN) -/* Function codes. */ -#define ECDH_F_ECDH_COMPUTE_KEY 100 -#define ECDH_F_ECDH_DATA_NEW_METHOD 101 +SRP_VBASE *SRP_VBASE_new(char *seed_key); +int SRP_VBASE_free(SRP_VBASE *vb); +int SRP_VBASE_init(SRP_VBASE *vb, char * verifier_file); +SRP_user_pwd *SRP_VBASE_get_by_user(SRP_VBASE *vb, char *username); +char *SRP_create_verifier(const char *user, const char *pass, char **salt, + char **verifier, const char *N, const char *g); +int SRP_create_verifier_BN(const char *user, const char *pass, BIGNUM **salt, BIGNUM **verifier, BIGNUM *N, BIGNUM *g); -/* Reason codes. */ -#define ECDH_R_KDF_FAILED 102 -#define ECDH_R_NO_PRIVATE_VALUE 100 -#define ECDH_R_POINT_ARITHMETIC_FAILURE 101 + +#define SRP_NO_ERROR 0 +#define SRP_ERR_VBASE_INCOMPLETE_FILE 1 +#define SRP_ERR_VBASE_BN_LIB 2 +#define SRP_ERR_OPEN_FILE 3 +#define SRP_ERR_MEMORY 4 + +#define DB_srptype 0 +#define DB_srpverifier 1 +#define DB_srpsalt 2 +#define DB_srpid 3 +#define DB_srpgN 4 +#define DB_srpinfo 5 +#undef DB_NUMBER +#define DB_NUMBER 6 + +#define DB_SRP_INDEX 'I' +#define DB_SRP_VALID 'V' +#define DB_SRP_REVOKED 'R' +#define DB_SRP_MODIF 'v' + + +/* see srp.c */ +char * SRP_check_known_gN_param(BIGNUM* g, BIGNUM* N); +SRP_gN *SRP_get_default_gN(const char * id) ; + +/* server side .... */ +BIGNUM *SRP_Calc_server_key(BIGNUM *A, BIGNUM *v, BIGNUM *u, BIGNUM *b, BIGNUM *N); +BIGNUM *SRP_Calc_B(BIGNUM *b, BIGNUM *N, BIGNUM *g, BIGNUM *v); +int SRP_Verify_A_mod_N(BIGNUM *A, BIGNUM *N); +BIGNUM *SRP_Calc_u(BIGNUM *A, BIGNUM *B, BIGNUM *N) ; + + + +/* client side .... */ +BIGNUM *SRP_Calc_x(BIGNUM *s, const char *user, const char *pass); +BIGNUM *SRP_Calc_A(BIGNUM *a, BIGNUM *N, BIGNUM *g); +BIGNUM *SRP_Calc_client_key(BIGNUM *N, BIGNUM *B, BIGNUM *g, BIGNUM *x, BIGNUM *a, BIGNUM *u); +int SRP_Verify_B_mod_N(BIGNUM *B, BIGNUM *N); + +#define SRP_MINIMAL_N 1024 #ifdef __cplusplus } +#endif + #endif #endif diff --git a/crypto/openssl/crypto/srp/srp_grps.h b/crypto/openssl/crypto/srp/srp_grps.h new file mode 100644 index 0000000000..d77c9fff4b --- /dev/null +++ b/crypto/openssl/crypto/srp/srp_grps.h @@ -0,0 +1,517 @@ +/* start of generated data */ + +static BN_ULONG bn_group_1024_value[] = { + bn_pack4(9FC6,1D2F,C0EB,06E3), + bn_pack4(FD51,38FE,8376,435B), + bn_pack4(2FD4,CBF4,976E,AA9A), + bn_pack4(68ED,BC3C,0572,6CC0), + bn_pack4(C529,F566,660E,57EC), + bn_pack4(8255,9B29,7BCF,1885), + bn_pack4(CE8E,F4AD,69B1,5D49), + bn_pack4(5DC7,D7B4,6154,D6B6), + bn_pack4(8E49,5C1D,6089,DAD1), + bn_pack4(E0D5,D8E2,50B9,8BE4), + bn_pack4(383B,4813,D692,C6E0), + bn_pack4(D674,DF74,96EA,81D3), + bn_pack4(9EA2,314C,9C25,6576), + bn_pack4(6072,6187,75FF,3C0B), + bn_pack4(9C33,F80A,FA8F,C5E8), + bn_pack4(EEAF,0AB9,ADB3,8DD6) +}; +static BIGNUM bn_group_1024 = { + bn_group_1024_value, + (sizeof bn_group_1024_value)/sizeof(BN_ULONG), + (sizeof bn_group_1024_value)/sizeof(BN_ULONG), + 0, + BN_FLG_STATIC_DATA +}; + +static BN_ULONG bn_group_1536_value[] = { + bn_pack4(CF76,E3FE,D135,F9BB), + bn_pack4(1518,0F93,499A,234D), + bn_pack4(8CE7,A28C,2442,C6F3), + bn_pack4(5A02,1FFF,5E91,479E), + bn_pack4(7F8A,2FE9,B8B5,292E), + bn_pack4(837C,264A,E3A9,BEB8), + bn_pack4(E442,734A,F7CC,B7AE), + bn_pack4(6577,2E43,7D6C,7F8C), + bn_pack4(DB2F,D53D,24B7,C486), + bn_pack4(6EDF,0195,3934,9627), + bn_pack4(158B,FD3E,2B9C,8CF5), + bn_pack4(764E,3F4B,53DD,9DA1), + bn_pack4(4754,8381,DBC5,B1FC), + bn_pack4(9B60,9E0B,E3BA,B63D), + bn_pack4(8134,B1C8,B979,8914), + bn_pack4(DF02,8A7C,EC67,F0D0), + bn_pack4(80B6,55BB,9A22,E8DC), + bn_pack4(1558,903B,A0D0,F843), + bn_pack4(51C6,A94B,E460,7A29), + bn_pack4(5F4F,5F55,6E27,CBDE), + bn_pack4(BEEE,A961,4B19,CC4D), + bn_pack4(DBA5,1DF4,99AC,4C80), + bn_pack4(B1F1,2A86,17A4,7BBB), + bn_pack4(9DEF,3CAF,B939,277A) +}; +static BIGNUM bn_group_1536 = { + bn_group_1536_value, + (sizeof bn_group_1536_value)/sizeof(BN_ULONG), + (sizeof bn_group_1536_value)/sizeof(BN_ULONG), + 0, + BN_FLG_STATIC_DATA +}; + +static BN_ULONG bn_group_2048_value[] = { + bn_pack4(0FA7,111F,9E4A,FF73), + bn_pack4(9B65,E372,FCD6,8EF2), + bn_pack4(35DE,236D,525F,5475), + bn_pack4(94B5,C803,D89F,7AE4), + bn_pack4(71AE,35F8,E9DB,FBB6), + bn_pack4(2A56,98F3,A8D0,C382), + bn_pack4(9CCC,041C,7BC3,08D8), + bn_pack4(AF87,4E73,03CE,5329), + bn_pack4(6160,2790,04E5,7AE6), + bn_pack4(032C,FBDB,F52F,B378), + bn_pack4(5EA7,7A27,75D2,ECFA), + bn_pack4(5445,23B5,24B0,D57D), + bn_pack4(5B9D,32E6,88F8,7748), + bn_pack4(F1D2,B907,8717,461A), + bn_pack4(76BD,207A,436C,6481), + bn_pack4(CA97,B43A,23FB,8016), + bn_pack4(1D28,1E44,6B14,773B), + bn_pack4(7359,D041,D5C3,3EA7), + bn_pack4(A80D,740A,DBF4,FF74), + bn_pack4(55F9,7993,EC97,5EEA), + bn_pack4(2918,A996,2F0B,93B8), + bn_pack4(661A,05FB,D5FA,AAE8), + bn_pack4(CF60,9517,9A16,3AB3), + bn_pack4(E808,3969,EDB7,67B0), + bn_pack4(CD7F,48A9,DA04,FD50), + bn_pack4(D523,12AB,4B03,310D), + bn_pack4(8193,E075,7767,A13D), + bn_pack4(A373,29CB,B4A0,99ED), + bn_pack4(FC31,9294,3DB5,6050), + bn_pack4(AF72,B665,1987,EE07), + bn_pack4(F166,DE5E,1389,582F), + bn_pack4(AC6B,DB41,324A,9A9B) +}; +static BIGNUM bn_group_2048 = { + bn_group_2048_value, + (sizeof bn_group_2048_value)/sizeof(BN_ULONG), + (sizeof bn_group_2048_value)/sizeof(BN_ULONG), + 0, + BN_FLG_STATIC_DATA +}; + +static BN_ULONG bn_group_3072_value[] = { + bn_pack4(FFFF,FFFF,FFFF,FFFF), + bn_pack4(4B82,D120,A93A,D2CA), + bn_pack4(43DB,5BFC,E0FD,108E), + bn_pack4(08E2,4FA0,74E5,AB31), + bn_pack4(7709,88C0,BAD9,46E2), + bn_pack4(BBE1,1757,7A61,5D6C), + bn_pack4(521F,2B18,177B,200C), + bn_pack4(D876,0273,3EC8,6A64), + bn_pack4(F12F,FA06,D98A,0864), + bn_pack4(CEE3,D226,1AD2,EE6B), + bn_pack4(1E8C,94E0,4A25,619D), + bn_pack4(ABF5,AE8C,DB09,33D7), + bn_pack4(B397,0F85,A6E1,E4C7), + bn_pack4(8AEA,7157,5D06,0C7D), + bn_pack4(ECFB,8504,58DB,EF0A), + bn_pack4(A855,21AB,DF1C,BA64), + bn_pack4(AD33,170D,0450,7A33), + bn_pack4(1572,8E5A,8AAA,C42D), + bn_pack4(15D2,2618,98FA,0510), + bn_pack4(3995,497C,EA95,6AE5), + bn_pack4(DE2B,CBF6,9558,1718), + bn_pack4(B5C5,5DF0,6F4C,52C9), + bn_pack4(9B27,83A2,EC07,A28F), + bn_pack4(E39E,772C,180E,8603), + bn_pack4(3290,5E46,2E36,CE3B), + bn_pack4(F174,6C08,CA18,217C), + bn_pack4(670C,354E,4ABC,9804), + bn_pack4(9ED5,2907,7096,966D), + bn_pack4(1C62,F356,2085,52BB), + bn_pack4(8365,5D23,DCA3,AD96), + bn_pack4(6916,3FA8,FD24,CF5F), + bn_pack4(98DA,4836,1C55,D39A), + bn_pack4(C200,7CB8,A163,BF05), + bn_pack4(4928,6651,ECE4,5B3D), + bn_pack4(AE9F,2411,7C4B,1FE6), + bn_pack4(EE38,6BFB,5A89,9FA5), + bn_pack4(0BFF,5CB6,F406,B7ED), + bn_pack4(F44C,42E9,A637,ED6B), + bn_pack4(E485,B576,625E,7EC6), + bn_pack4(4FE1,356D,6D51,C245), + bn_pack4(302B,0A6D,F25F,1437), + bn_pack4(EF95,19B3,CD3A,431B), + bn_pack4(514A,0879,8E34,04DD), + bn_pack4(020B,BEA6,3B13,9B22), + bn_pack4(2902,4E08,8A67,CC74), + bn_pack4(C4C6,628B,80DC,1CD1), + bn_pack4(C90F,DAA2,2168,C234), + bn_pack4(FFFF,FFFF,FFFF,FFFF) +}; +static BIGNUM bn_group_3072 = { + bn_group_3072_value, + (sizeof bn_group_3072_value)/sizeof(BN_ULONG), + (sizeof bn_group_3072_value)/sizeof(BN_ULONG), + 0, + BN_FLG_STATIC_DATA +}; + +static BN_ULONG bn_group_4096_value[] = { + bn_pack4(FFFF,FFFF,FFFF,FFFF), + bn_pack4(4DF4,35C9,3406,3199), + bn_pack4(86FF,B7DC,90A6,C08F), + bn_pack4(93B4,EA98,8D8F,DDC1), + bn_pack4(D006,9127,D5B0,5AA9), + bn_pack4(B81B,DD76,2170,481C), + bn_pack4(1F61,2970,CEE2,D7AF), + bn_pack4(233B,A186,515B,E7ED), + bn_pack4(99B2,964F,A090,C3A2), + bn_pack4(287C,5947,4E6B,C05D), + bn_pack4(2E8E,FC14,1FBE,CAA6), + bn_pack4(DBBB,C2DB,04DE,8EF9), + bn_pack4(2583,E9CA,2AD4,4CE8), + bn_pack4(1A94,6834,B615,0BDA), + bn_pack4(99C3,2718,6AF4,E23C), + bn_pack4(8871,9A10,BDBA,5B26), + bn_pack4(1A72,3C12,A787,E6D7), + bn_pack4(4B82,D120,A921,0801), + bn_pack4(43DB,5BFC,E0FD,108E), + bn_pack4(08E2,4FA0,74E5,AB31), + bn_pack4(7709,88C0,BAD9,46E2), + bn_pack4(BBE1,1757,7A61,5D6C), + bn_pack4(521F,2B18,177B,200C), + bn_pack4(D876,0273,3EC8,6A64), + bn_pack4(F12F,FA06,D98A,0864), + bn_pack4(CEE3,D226,1AD2,EE6B), + bn_pack4(1E8C,94E0,4A25,619D), + bn_pack4(ABF5,AE8C,DB09,33D7), + bn_pack4(B397,0F85,A6E1,E4C7), + bn_pack4(8AEA,7157,5D06,0C7D), + bn_pack4(ECFB,8504,58DB,EF0A), + bn_pack4(A855,21AB,DF1C,BA64), + bn_pack4(AD33,170D,0450,7A33), + bn_pack4(1572,8E5A,8AAA,C42D), + bn_pack4(15D2,2618,98FA,0510), + bn_pack4(3995,497C,EA95,6AE5), + bn_pack4(DE2B,CBF6,9558,1718), + bn_pack4(B5C5,5DF0,6F4C,52C9), + bn_pack4(9B27,83A2,EC07,A28F), + bn_pack4(E39E,772C,180E,8603), + bn_pack4(3290,5E46,2E36,CE3B), + bn_pack4(F174,6C08,CA18,217C), + bn_pack4(670C,354E,4ABC,9804), + bn_pack4(9ED5,2907,7096,966D), + bn_pack4(1C62,F356,2085,52BB), + bn_pack4(8365,5D23,DCA3,AD96), + bn_pack4(6916,3FA8,FD24,CF5F), + bn_pack4(98DA,4836,1C55,D39A), + bn_pack4(C200,7CB8,A163,BF05), + bn_pack4(4928,6651,ECE4,5B3D), + bn_pack4(AE9F,2411,7C4B,1FE6), + bn_pack4(EE38,6BFB,5A89,9FA5), + bn_pack4(0BFF,5CB6,F406,B7ED), + bn_pack4(F44C,42E9,A637,ED6B), + bn_pack4(E485,B576,625E,7EC6), + bn_pack4(4FE1,356D,6D51,C245), + bn_pack4(302B,0A6D,F25F,1437), + bn_pack4(EF95,19B3,CD3A,431B), + bn_pack4(514A,0879,8E34,04DD), + bn_pack4(020B,BEA6,3B13,9B22), + bn_pack4(2902,4E08,8A67,CC74), + bn_pack4(C4C6,628B,80DC,1CD1), + bn_pack4(C90F,DAA2,2168,C234), + bn_pack4(FFFF,FFFF,FFFF,FFFF) +}; +static BIGNUM bn_group_4096 = { + bn_group_4096_value, + (sizeof bn_group_4096_value)/sizeof(BN_ULONG), + (sizeof bn_group_4096_value)/sizeof(BN_ULONG), + 0, + BN_FLG_STATIC_DATA +}; + +static BN_ULONG bn_group_6144_value[] = { + bn_pack4(FFFF,FFFF,FFFF,FFFF), + bn_pack4(E694,F91E,6DCC,4024), + bn_pack4(12BF,2D5B,0B74,74D6), + bn_pack4(043E,8F66,3F48,60EE), + bn_pack4(387F,E8D7,6E3C,0468), + bn_pack4(DA56,C9EC,2EF2,9632), + bn_pack4(EB19,CCB1,A313,D55C), + bn_pack4(F550,AA3D,8A1F,BFF0), + bn_pack4(06A1,D58B,B7C5,DA76), + bn_pack4(A797,15EE,F29B,E328), + bn_pack4(14CC,5ED2,0F80,37E0), + bn_pack4(CC8F,6D7E,BF48,E1D8), + bn_pack4(4BD4,07B2,2B41,54AA), + bn_pack4(0F1D,45B7,FF58,5AC5), + bn_pack4(23A9,7A7E,36CC,88BE), + bn_pack4(59E7,C97F,BEC7,E8F3), + bn_pack4(B5A8,4031,900B,1C9E), + bn_pack4(D55E,702F,4698,0C82), + bn_pack4(F482,D7CE,6E74,FEF6), + bn_pack4(F032,EA15,D172,1D03), + bn_pack4(5983,CA01,C64B,92EC), + bn_pack4(6FB8,F401,378C,D2BF), + bn_pack4(3320,5151,2BD7,AF42), + bn_pack4(DB7F,1447,E6CC,254B), + bn_pack4(44CE,6CBA,CED4,BB1B), + bn_pack4(DA3E,DBEB,CF9B,14ED), + bn_pack4(1797,27B0,865A,8918), + bn_pack4(B06A,53ED,9027,D831), + bn_pack4(E5DB,382F,4130,01AE), + bn_pack4(F8FF,9406,AD9E,530E), + bn_pack4(C975,1E76,3DBA,37BD), + bn_pack4(C1D4,DCB2,6026,46DE), + bn_pack4(36C3,FAB4,D27C,7026), + bn_pack4(4DF4,35C9,3402,8492), + bn_pack4(86FF,B7DC,90A6,C08F), + bn_pack4(93B4,EA98,8D8F,DDC1), + bn_pack4(D006,9127,D5B0,5AA9), + bn_pack4(B81B,DD76,2170,481C), + bn_pack4(1F61,2970,CEE2,D7AF), + bn_pack4(233B,A186,515B,E7ED), + bn_pack4(99B2,964F,A090,C3A2), + bn_pack4(287C,5947,4E6B,C05D), + bn_pack4(2E8E,FC14,1FBE,CAA6), + bn_pack4(DBBB,C2DB,04DE,8EF9), + bn_pack4(2583,E9CA,2AD4,4CE8), + bn_pack4(1A94,6834,B615,0BDA), + bn_pack4(99C3,2718,6AF4,E23C), + bn_pack4(8871,9A10,BDBA,5B26), + bn_pack4(1A72,3C12,A787,E6D7), + bn_pack4(4B82,D120,A921,0801), + bn_pack4(43DB,5BFC,E0FD,108E), + bn_pack4(08E2,4FA0,74E5,AB31), + bn_pack4(7709,88C0,BAD9,46E2), + bn_pack4(BBE1,1757,7A61,5D6C), + bn_pack4(521F,2B18,177B,200C), + bn_pack4(D876,0273,3EC8,6A64), + bn_pack4(F12F,FA06,D98A,0864), + bn_pack4(CEE3,D226,1AD2,EE6B), + bn_pack4(1E8C,94E0,4A25,619D), + bn_pack4(ABF5,AE8C,DB09,33D7), + bn_pack4(B397,0F85,A6E1,E4C7), + bn_pack4(8AEA,7157,5D06,0C7D), + bn_pack4(ECFB,8504,58DB,EF0A), + bn_pack4(A855,21AB,DF1C,BA64), + bn_pack4(AD33,170D,0450,7A33), + bn_pack4(1572,8E5A,8AAA,C42D), + bn_pack4(15D2,2618,98FA,0510), + bn_pack4(3995,497C,EA95,6AE5), + bn_pack4(DE2B,CBF6,9558,1718), + bn_pack4(B5C5,5DF0,6F4C,52C9), + bn_pack4(9B27,83A2,EC07,A28F), + bn_pack4(E39E,772C,180E,8603), + bn_pack4(3290,5E46,2E36,CE3B), + bn_pack4(F174,6C08,CA18,217C), + bn_pack4(670C,354E,4ABC,9804), + bn_pack4(9ED5,2907,7096,966D), + bn_pack4(1C62,F356,2085,52BB), + bn_pack4(8365,5D23,DCA3,AD96), + bn_pack4(6916,3FA8,FD24,CF5F), + bn_pack4(98DA,4836,1C55,D39A), + bn_pack4(C200,7CB8,A163,BF05), + bn_pack4(4928,6651,ECE4,5B3D), + bn_pack4(AE9F,2411,7C4B,1FE6), + bn_pack4(EE38,6BFB,5A89,9FA5), + bn_pack4(0BFF,5CB6,F406,B7ED), + bn_pack4(F44C,42E9,A637,ED6B), + bn_pack4(E485,B576,625E,7EC6), + bn_pack4(4FE1,356D,6D51,C245), + bn_pack4(302B,0A6D,F25F,1437), + bn_pack4(EF95,19B3,CD3A,431B), + bn_pack4(514A,0879,8E34,04DD), + bn_pack4(020B,BEA6,3B13,9B22), + bn_pack4(2902,4E08,8A67,CC74), + bn_pack4(C4C6,628B,80DC,1CD1), + bn_pack4(C90F,DAA2,2168,C234), + bn_pack4(FFFF,FFFF,FFFF,FFFF) +}; +static BIGNUM bn_group_6144 = { + bn_group_6144_value, + (sizeof bn_group_6144_value)/sizeof(BN_ULONG), + (sizeof bn_group_6144_value)/sizeof(BN_ULONG), + 0, + BN_FLG_STATIC_DATA +}; + +static BN_ULONG bn_group_8192_value[] = { + bn_pack4(FFFF,FFFF,FFFF,FFFF), + bn_pack4(60C9,80DD,98ED,D3DF), + bn_pack4(C81F,56E8,80B9,6E71), + bn_pack4(9E30,50E2,7656,94DF), + bn_pack4(9558,E447,5677,E9AA), + bn_pack4(C919,0DA6,FC02,6E47), + bn_pack4(889A,002E,D5EE,382B), + bn_pack4(4009,438B,481C,6CD7), + bn_pack4(3590,46F4,EB87,9F92), + bn_pack4(FAF3,6BC3,1ECF,A268), + bn_pack4(B1D5,10BD,7EE7,4D73), + bn_pack4(F9AB,4819,5DED,7EA1), + bn_pack4(64F3,1CC5,0846,851D), + bn_pack4(4597,E899,A025,5DC1), + bn_pack4(DF31,0EE0,74AB,6A36), + bn_pack4(6D2A,13F8,3F44,F82D), + bn_pack4(062B,3CF5,B3A2,78A6), + bn_pack4(7968,3303,ED5B,DD3A), + bn_pack4(FA9D,4B7F,A2C0,87E8), + bn_pack4(4BCB,C886,2F83,85DD), + bn_pack4(3473,FC64,6CEA,306B), + bn_pack4(13EB,57A8,1A23,F0C7), + bn_pack4(2222,2E04,A403,7C07), + bn_pack4(E3FD,B8BE,FC84,8AD9), + bn_pack4(238F,16CB,E39D,652D), + bn_pack4(3423,B474,2BF1,C978), + bn_pack4(3AAB,639C,5AE4,F568), + bn_pack4(2576,F693,6BA4,2466), + bn_pack4(741F,A7BF,8AFC,47ED), + bn_pack4(3BC8,32B6,8D9D,D300), + bn_pack4(D8BE,C4D0,73B9,31BA), + bn_pack4(3877,7CB6,A932,DF8C), + bn_pack4(74A3,926F,12FE,E5E4), + bn_pack4(E694,F91E,6DBE,1159), + bn_pack4(12BF,2D5B,0B74,74D6), + bn_pack4(043E,8F66,3F48,60EE), + bn_pack4(387F,E8D7,6E3C,0468), + bn_pack4(DA56,C9EC,2EF2,9632), + bn_pack4(EB19,CCB1,A313,D55C), + bn_pack4(F550,AA3D,8A1F,BFF0), + bn_pack4(06A1,D58B,B7C5,DA76), + bn_pack4(A797,15EE,F29B,E328), + bn_pack4(14CC,5ED2,0F80,37E0), + bn_pack4(CC8F,6D7E,BF48,E1D8), + bn_pack4(4BD4,07B2,2B41,54AA), + bn_pack4(0F1D,45B7,FF58,5AC5), + bn_pack4(23A9,7A7E,36CC,88BE), + bn_pack4(59E7,C97F,BEC7,E8F3), + bn_pack4(B5A8,4031,900B,1C9E), + bn_pack4(D55E,702F,4698,0C82), + bn_pack4(F482,D7CE,6E74,FEF6), + bn_pack4(F032,EA15,D172,1D03), + bn_pack4(5983,CA01,C64B,92EC), + bn_pack4(6FB8,F401,378C,D2BF), + bn_pack4(3320,5151,2BD7,AF42), + bn_pack4(DB7F,1447,E6CC,254B), + bn_pack4(44CE,6CBA,CED4,BB1B), + bn_pack4(DA3E,DBEB,CF9B,14ED), + bn_pack4(1797,27B0,865A,8918), + bn_pack4(B06A,53ED,9027,D831), + bn_pack4(E5DB,382F,4130,01AE), + bn_pack4(F8FF,9406,AD9E,530E), + bn_pack4(C975,1E76,3DBA,37BD), + bn_pack4(C1D4,DCB2,6026,46DE), + bn_pack4(36C3,FAB4,D27C,7026), + bn_pack4(4DF4,35C9,3402,8492), + bn_pack4(86FF,B7DC,90A6,C08F), + bn_pack4(93B4,EA98,8D8F,DDC1), + bn_pack4(D006,9127,D5B0,5AA9), + bn_pack4(B81B,DD76,2170,481C), + bn_pack4(1F61,2970,CEE2,D7AF), + bn_pack4(233B,A186,515B,E7ED), + bn_pack4(99B2,964F,A090,C3A2), + bn_pack4(287C,5947,4E6B,C05D), + bn_pack4(2E8E,FC14,1FBE,CAA6), + bn_pack4(DBBB,C2DB,04DE,8EF9), + bn_pack4(2583,E9CA,2AD4,4CE8), + bn_pack4(1A94,6834,B615,0BDA), + bn_pack4(99C3,2718,6AF4,E23C), + bn_pack4(8871,9A10,BDBA,5B26), + bn_pack4(1A72,3C12,A787,E6D7), + bn_pack4(4B82,D120,A921,0801), + bn_pack4(43DB,5BFC,E0FD,108E), + bn_pack4(08E2,4FA0,74E5,AB31), + bn_pack4(7709,88C0,BAD9,46E2), + bn_pack4(BBE1,1757,7A61,5D6C), + bn_pack4(521F,2B18,177B,200C), + bn_pack4(D876,0273,3EC8,6A64), + bn_pack4(F12F,FA06,D98A,0864), + bn_pack4(CEE3,D226,1AD2,EE6B), + bn_pack4(1E8C,94E0,4A25,619D), + bn_pack4(ABF5,AE8C,DB09,33D7), + bn_pack4(B397,0F85,A6E1,E4C7), + bn_pack4(8AEA,7157,5D06,0C7D), + bn_pack4(ECFB,8504,58DB,EF0A), + bn_pack4(A855,21AB,DF1C,BA64), + bn_pack4(AD33,170D,0450,7A33), + bn_pack4(1572,8E5A,8AAA,C42D), + bn_pack4(15D2,2618,98FA,0510), + bn_pack4(3995,497C,EA95,6AE5), + bn_pack4(DE2B,CBF6,9558,1718), + bn_pack4(B5C5,5DF0,6F4C,52C9), + bn_pack4(9B27,83A2,EC07,A28F), + bn_pack4(E39E,772C,180E,8603), + bn_pack4(3290,5E46,2E36,CE3B), + bn_pack4(F174,6C08,CA18,217C), + bn_pack4(670C,354E,4ABC,9804), + bn_pack4(9ED5,2907,7096,966D), + bn_pack4(1C62,F356,2085,52BB), + bn_pack4(8365,5D23,DCA3,AD96), + bn_pack4(6916,3FA8,FD24,CF5F), + bn_pack4(98DA,4836,1C55,D39A), + bn_pack4(C200,7CB8,A163,BF05), + bn_pack4(4928,6651,ECE4,5B3D), + bn_pack4(AE9F,2411,7C4B,1FE6), + bn_pack4(EE38,6BFB,5A89,9FA5), + bn_pack4(0BFF,5CB6,F406,B7ED), + bn_pack4(F44C,42E9,A637,ED6B), + bn_pack4(E485,B576,625E,7EC6), + bn_pack4(4FE1,356D,6D51,C245), + bn_pack4(302B,0A6D,F25F,1437), + bn_pack4(EF95,19B3,CD3A,431B), + bn_pack4(514A,0879,8E34,04DD), + bn_pack4(020B,BEA6,3B13,9B22), + bn_pack4(2902,4E08,8A67,CC74), + bn_pack4(C4C6,628B,80DC,1CD1), + bn_pack4(C90F,DAA2,2168,C234), + bn_pack4(FFFF,FFFF,FFFF,FFFF) +}; +static BIGNUM bn_group_8192 = { + bn_group_8192_value, + (sizeof bn_group_8192_value)/sizeof(BN_ULONG), + (sizeof bn_group_8192_value)/sizeof(BN_ULONG), + 0, + BN_FLG_STATIC_DATA +}; + +static BN_ULONG bn_generator_19_value[] = {19} ; +static BIGNUM bn_generator_19 = { + bn_generator_19_value, + 1, + 1, + 0, + BN_FLG_STATIC_DATA +}; +static BN_ULONG bn_generator_5_value[] = {5} ; +static BIGNUM bn_generator_5 = { + bn_generator_5_value, + 1, + 1, + 0, + BN_FLG_STATIC_DATA +}; +static BN_ULONG bn_generator_2_value[] = {2} ; +static BIGNUM bn_generator_2 = { + bn_generator_2_value, + 1, + 1, + 0, + BN_FLG_STATIC_DATA +}; + +static SRP_gN knowngN[] = { + {"8192",&bn_generator_19 , &bn_group_8192}, + {"6144",&bn_generator_5 , &bn_group_6144}, + {"4096",&bn_generator_5 , &bn_group_4096}, + {"3072",&bn_generator_5 , &bn_group_3072}, + {"2048",&bn_generator_2 , &bn_group_2048}, + {"1536",&bn_generator_2 , &bn_group_1536}, + {"1024",&bn_generator_2 , &bn_group_1024}, +}; +#define KNOWN_GN_NUMBER sizeof(knowngN) / sizeof(SRP_gN) + +/* end of generated data */ diff --git a/crypto/openssl/crypto/ecdh/ech_locl.h b/crypto/openssl/crypto/srp/srp_lcl.h similarity index 77% copy from crypto/openssl/crypto/ecdh/ech_locl.h copy to crypto/openssl/crypto/srp/srp_lcl.h index f658526a7e..42bda3f148 100644 --- a/crypto/openssl/crypto/ecdh/ech_locl.h +++ b/crypto/openssl/crypto/srp/srp_lcl.h @@ -1,6 +1,9 @@ -/* crypto/ecdh/ech_locl.h */ +/* crypto/srp/srp_lcl.h */ +/* Written by Peter Sylvester (peter.sylvester@edelweb.fr) + * for the EdelKey project and contributed to the OpenSSL project 2004. + */ /* ==================================================================== - * Copyright (c) 2000-2005 The OpenSSL Project. All rights reserved. + * Copyright (c) 2004 The OpenSSL Project. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions @@ -52,43 +55,29 @@ * Hudson (tjh@cryptsoft.com). * */ +#ifndef HEADER_SRP_LCL_H +#define HEADER_SRP_LCL_H + +#include +#include + +#if 0 +#define srp_bn_print(a) {fprintf(stderr, #a "="); BN_print_fp(stderr,a); \ + fprintf(stderr,"\n");} +#else +#define srp_bn_print(a) +#endif -#ifndef HEADER_ECH_LOCL_H -#define HEADER_ECH_LOCL_H -#include #ifdef __cplusplus extern "C" { #endif -struct ecdh_method - { - const char *name; - int (*compute_key)(void *key, size_t outlen, const EC_POINT *pub_key, EC_KEY *ecdh, - void *(*KDF)(const void *in, size_t inlen, void *out, size_t *outlen)); -#if 0 - int (*init)(EC_KEY *eckey); - int (*finish)(EC_KEY *eckey); -#endif - int flags; - char *app_data; - }; -typedef struct ecdh_data_st { - /* EC_KEY_METH_DATA part */ - int (*init)(EC_KEY *); - /* method specific part */ - ENGINE *engine; - int flags; - const ECDH_METHOD *meth; - CRYPTO_EX_DATA ex_data; -} ECDH_DATA; - -ECDH_DATA *ecdh_check(EC_KEY *); #ifdef __cplusplus } #endif -#endif /* HEADER_ECH_LOCL_H */ +#endif diff --git a/crypto/openssl/crypto/srp/srp_lib.c b/crypto/openssl/crypto/srp/srp_lib.c new file mode 100644 index 0000000000..92cea98dcd --- /dev/null +++ b/crypto/openssl/crypto/srp/srp_lib.c @@ -0,0 +1,357 @@ +/* crypto/srp/srp_lib.c */ +/* Written by Christophe Renou (christophe.renou@edelweb.fr) with + * the precious help of Peter Sylvester (peter.sylvester@edelweb.fr) + * for the EdelKey project and contributed to the OpenSSL project 2004. + */ +/* ==================================================================== + * Copyright (c) 2004 The OpenSSL Project. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * + * 3. All advertising materials mentioning features or use of this + * software must display the following acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit. (http://www.OpenSSL.org/)" + * + * 4. The names "OpenSSL Toolkit" and "OpenSSL Project" must not be used to + * endorse or promote products derived from this software without + * prior written permission. For written permission, please contact + * licensing@OpenSSL.org. + * + * 5. Products derived from this software may not be called "OpenSSL" + * nor may "OpenSSL" appear in their names without prior written + * permission of the OpenSSL Project. + * + * 6. Redistributions of any form whatsoever must retain the following + * acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit (http://www.OpenSSL.org/)" + * + * THIS SOFTWARE IS PROVIDED BY THE OpenSSL PROJECT ``AS IS'' AND ANY + * EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR + * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE OpenSSL PROJECT OR + * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED + * OF THE POSSIBILITY OF SUCH DAMAGE. + * ==================================================================== + * + * This product includes cryptographic software written by Eric Young + * (eay@cryptsoft.com). This product includes software written by Tim + * Hudson (tjh@cryptsoft.com). + * + */ +#ifndef OPENSSL_NO_SRP +#include "cryptlib.h" +#include "srp_lcl.h" +#include +#include + +#if (BN_BYTES == 8) +#define bn_pack4(a1,a2,a3,a4) 0x##a1##a2##a3##a4##ul +#endif +#if (BN_BYTES == 4) +#define bn_pack4(a1,a2,a3,a4) 0x##a3##a4##ul, 0x##a1##a2##ul +#endif +#if (BN_BYTES == 2) +#define bn_pack4(a1,a2,a3,a4) 0x##a4##u,0x##a3##u,0x##a2##u,0x##a1##u +#endif + + +#include "srp_grps.h" + +static BIGNUM *srp_Calc_k(BIGNUM *N, BIGNUM *g) + { + /* k = SHA1(N | PAD(g)) -- tls-srp draft 8 */ + + unsigned char digest[SHA_DIGEST_LENGTH]; + unsigned char *tmp; + EVP_MD_CTX ctxt; + int longg ; + int longN = BN_num_bytes(N); + + if ((tmp = OPENSSL_malloc(longN)) == NULL) + return NULL; + BN_bn2bin(N,tmp) ; + + EVP_MD_CTX_init(&ctxt); + EVP_DigestInit_ex(&ctxt, EVP_sha1(), NULL); + EVP_DigestUpdate(&ctxt, tmp, longN); + + memset(tmp, 0, longN); + longg = BN_bn2bin(g,tmp) ; + /* use the zeros behind to pad on left */ + EVP_DigestUpdate(&ctxt, tmp + longg, longN-longg); + EVP_DigestUpdate(&ctxt, tmp, longg); + OPENSSL_free(tmp); + + EVP_DigestFinal_ex(&ctxt, digest, NULL); + EVP_MD_CTX_cleanup(&ctxt); + return BN_bin2bn(digest, sizeof(digest), NULL); + } + +BIGNUM *SRP_Calc_u(BIGNUM *A, BIGNUM *B, BIGNUM *N) + { + /* k = SHA1(PAD(A) || PAD(B) ) -- tls-srp draft 8 */ + + BIGNUM *u; + unsigned char cu[SHA_DIGEST_LENGTH]; + unsigned char *cAB; + EVP_MD_CTX ctxt; + int longN; + if ((A == NULL) ||(B == NULL) || (N == NULL)) + return NULL; + + longN= BN_num_bytes(N); + + if ((cAB = OPENSSL_malloc(2*longN)) == NULL) + return NULL; + + memset(cAB, 0, longN); + + EVP_MD_CTX_init(&ctxt); + EVP_DigestInit_ex(&ctxt, EVP_sha1(), NULL); + EVP_DigestUpdate(&ctxt, cAB + BN_bn2bin(A,cAB+longN), longN); + EVP_DigestUpdate(&ctxt, cAB + BN_bn2bin(B,cAB+longN), longN); + OPENSSL_free(cAB); + EVP_DigestFinal_ex(&ctxt, cu, NULL); + EVP_MD_CTX_cleanup(&ctxt); + + if (!(u = BN_bin2bn(cu, sizeof(cu), NULL))) + return NULL; + if (!BN_is_zero(u)) + return u; + BN_free(u); + return NULL; +} + +BIGNUM *SRP_Calc_server_key(BIGNUM *A, BIGNUM *v, BIGNUM *u, BIGNUM *b, BIGNUM *N) + { + BIGNUM *tmp = NULL, *S = NULL; + BN_CTX *bn_ctx; + + if (u == NULL || A == NULL || v == NULL || b == NULL || N == NULL) + return NULL; + + if ((bn_ctx = BN_CTX_new()) == NULL || + (tmp = BN_new()) == NULL || + (S = BN_new()) == NULL ) + goto err; + + /* S = (A*v**u) ** b */ + + if (!BN_mod_exp(tmp,v,u,N,bn_ctx)) + goto err; + if (!BN_mod_mul(tmp,A,tmp,N,bn_ctx)) + goto err; + if (!BN_mod_exp(S,tmp,b,N,bn_ctx)) + goto err; +err: + BN_CTX_free(bn_ctx); + BN_clear_free(tmp); + return S; + } + +BIGNUM *SRP_Calc_B(BIGNUM *b, BIGNUM *N, BIGNUM *g, BIGNUM *v) + { + BIGNUM *kv = NULL, *gb = NULL; + BIGNUM *B = NULL, *k = NULL; + BN_CTX *bn_ctx; + + if (b == NULL || N == NULL || g == NULL || v == NULL || + (bn_ctx = BN_CTX_new()) == NULL) + return NULL; + + if ( (kv = BN_new()) == NULL || + (gb = BN_new()) == NULL || + (B = BN_new())== NULL) + goto err; + + /* B = g**b + k*v */ + + if (!BN_mod_exp(gb,g,b,N,bn_ctx) || + !(k = srp_Calc_k(N,g)) || + !BN_mod_mul(kv,v,k,N,bn_ctx) || + !BN_mod_add(B,gb,kv,N,bn_ctx)) + { + BN_free(B); + B = NULL; + } +err: + BN_CTX_free(bn_ctx); + BN_clear_free(kv); + BN_clear_free(gb); + BN_free(k); + return B; + } + +BIGNUM *SRP_Calc_x(BIGNUM *s, const char *user, const char *pass) + { + unsigned char dig[SHA_DIGEST_LENGTH]; + EVP_MD_CTX ctxt; + unsigned char *cs; + + if ((s == NULL) || + (user == NULL) || + (pass == NULL)) + return NULL; + + if ((cs = OPENSSL_malloc(BN_num_bytes(s))) == NULL) + return NULL; + + EVP_MD_CTX_init(&ctxt); + EVP_DigestInit_ex(&ctxt, EVP_sha1(), NULL); + EVP_DigestUpdate(&ctxt, user, strlen(user)); + EVP_DigestUpdate(&ctxt, ":", 1); + EVP_DigestUpdate(&ctxt, pass, strlen(pass)); + EVP_DigestFinal_ex(&ctxt, dig, NULL); + + EVP_DigestInit_ex(&ctxt, EVP_sha1(), NULL); + BN_bn2bin(s,cs); + EVP_DigestUpdate(&ctxt, cs, BN_num_bytes(s)); + OPENSSL_free(cs); + EVP_DigestUpdate(&ctxt, dig, sizeof(dig)); + EVP_DigestFinal_ex(&ctxt, dig, NULL); + EVP_MD_CTX_cleanup(&ctxt); + + return BN_bin2bn(dig, sizeof(dig), NULL); + } + +BIGNUM *SRP_Calc_A(BIGNUM *a, BIGNUM *N, BIGNUM *g) + { + BN_CTX *bn_ctx; + BIGNUM * A = NULL; + + if (a == NULL || N == NULL || g == NULL || + (bn_ctx = BN_CTX_new()) == NULL) + return NULL; + + if ((A = BN_new()) != NULL && + !BN_mod_exp(A,g,a,N,bn_ctx)) + { + BN_free(A); + A = NULL; + } + BN_CTX_free(bn_ctx); + return A; + } + + +BIGNUM *SRP_Calc_client_key(BIGNUM *N, BIGNUM *B, BIGNUM *g, BIGNUM *x, BIGNUM *a, BIGNUM *u) + { + BIGNUM *tmp = NULL, *tmp2 = NULL, *tmp3 = NULL , *k = NULL, *K = NULL; + BN_CTX *bn_ctx; + + if (u == NULL || B == NULL || N == NULL || g == NULL || x == NULL || a == NULL || + (bn_ctx = BN_CTX_new()) == NULL) + return NULL; + + if ((tmp = BN_new()) == NULL || + (tmp2 = BN_new())== NULL || + (tmp3 = BN_new())== NULL || + (K = BN_new()) == NULL) + goto err; + + if (!BN_mod_exp(tmp,g,x,N,bn_ctx)) + goto err; + if (!(k = srp_Calc_k(N,g))) + goto err; + if (!BN_mod_mul(tmp2,tmp,k,N,bn_ctx)) + goto err; + if (!BN_mod_sub(tmp,B,tmp2,N,bn_ctx)) + goto err; + + if (!BN_mod_mul(tmp3,u,x,N,bn_ctx)) + goto err; + if (!BN_mod_add(tmp2,a,tmp3,N,bn_ctx)) + goto err; + if (!BN_mod_exp(K,tmp,tmp2,N,bn_ctx)) + goto err; + +err : + BN_CTX_free(bn_ctx); + BN_clear_free(tmp); + BN_clear_free(tmp2); + BN_clear_free(tmp3); + BN_free(k); + return K; + } + +int SRP_Verify_B_mod_N(BIGNUM *B, BIGNUM *N) + { + BIGNUM *r; + BN_CTX *bn_ctx; + int ret = 0; + + if (B == NULL || N == NULL || + (bn_ctx = BN_CTX_new()) == NULL) + return 0; + + if ((r = BN_new()) == NULL) + goto err; + /* Checks if B % N == 0 */ + if (!BN_nnmod(r,B,N,bn_ctx)) + goto err; + ret = !BN_is_zero(r); +err: + BN_CTX_free(bn_ctx); + BN_free(r); + return ret; + } + +int SRP_Verify_A_mod_N(BIGNUM *A, BIGNUM *N) + { + /* Checks if A % N == 0 */ + return SRP_Verify_B_mod_N(A,N) ; + } + + +/* Check if G and N are kwown parameters. + The values have been generated from the ietf-tls-srp draft version 8 +*/ +char *SRP_check_known_gN_param(BIGNUM *g, BIGNUM *N) + { + size_t i; + if ((g == NULL) || (N == NULL)) + return 0; + + srp_bn_print(g); + srp_bn_print(N); + + for(i = 0; i < KNOWN_GN_NUMBER; i++) + { + if (BN_cmp(knowngN[i].g, g) == 0 && BN_cmp(knowngN[i].N, N) == 0) + return knowngN[i].id; + } + return NULL; + } + +SRP_gN *SRP_get_default_gN(const char *id) + { + size_t i; + + if (id == NULL) + return knowngN; + for(i = 0; i < KNOWN_GN_NUMBER; i++) + { + if (strcmp(knowngN[i].id, id)==0) + return knowngN + i; + } + return NULL; + } +#endif diff --git a/crypto/openssl/crypto/srp/srp_vfy.c b/crypto/openssl/crypto/srp/srp_vfy.c new file mode 100644 index 0000000000..c8be907d7f --- /dev/null +++ b/crypto/openssl/crypto/srp/srp_vfy.c @@ -0,0 +1,657 @@ +/* crypto/srp/srp_vfy.c */ +/* Written by Christophe Renou (christophe.renou@edelweb.fr) with + * the precious help of Peter Sylvester (peter.sylvester@edelweb.fr) + * for the EdelKey project and contributed to the OpenSSL project 2004. + */ +/* ==================================================================== + * Copyright (c) 2004 The OpenSSL Project. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * + * 3. All advertising materials mentioning features or use of this + * software must display the following acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit. (http://www.OpenSSL.org/)" + * + * 4. The names "OpenSSL Toolkit" and "OpenSSL Project" must not be used to + * endorse or promote products derived from this software without + * prior written permission. For written permission, please contact + * licensing@OpenSSL.org. + * + * 5. Products derived from this software may not be called "OpenSSL" + * nor may "OpenSSL" appear in their names without prior written + * permission of the OpenSSL Project. + * + * 6. Redistributions of any form whatsoever must retain the following + * acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit (http://www.OpenSSL.org/)" + * + * THIS SOFTWARE IS PROVIDED BY THE OpenSSL PROJECT ``AS IS'' AND ANY + * EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR + * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE OpenSSL PROJECT OR + * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED + * OF THE POSSIBILITY OF SUCH DAMAGE. + * ==================================================================== + * + * This product includes cryptographic software written by Eric Young + * (eay@cryptsoft.com). This product includes software written by Tim + * Hudson (tjh@cryptsoft.com). + * + */ +#ifndef OPENSSL_NO_SRP +#include "cryptlib.h" +#include "srp_lcl.h" +#include +#include +#include +#include +#include + +#define SRP_RANDOM_SALT_LEN 20 +#define MAX_LEN 2500 + +static char b64table[] = + "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz./"; + +/* the following two conversion routines have been inspired by code from Stanford */ + +/* + * Convert a base64 string into raw byte array representation. + */ +static int t_fromb64(unsigned char *a, const char *src) + { + char *loc; + int i, j; + int size; + + while(*src && (*src == ' ' || *src == '\t' || *src == '\n')) + ++src; + size = strlen(src); + i = 0; + while(i < size) + { + loc = strchr(b64table, src[i]); + if(loc == (char *) 0) break; + else a[i] = loc - b64table; + ++i; + } + size = i; + i = size - 1; + j = size; + while(1) + { + a[j] = a[i]; + if(--i < 0) break; + a[j] |= (a[i] & 3) << 6; + --j; + a[j] = (unsigned char) ((a[i] & 0x3c) >> 2); + if(--i < 0) break; + a[j] |= (a[i] & 0xf) << 4; + --j; + a[j] = (unsigned char) ((a[i] & 0x30) >> 4); + if(--i < 0) break; + a[j] |= (a[i] << 2); + + a[--j] = 0; + if(--i < 0) break; + } + while(a[j] == 0 && j <= size) ++j; + i = 0; + while (j <= size) a[i++] = a[j++]; + return i; + } + + +/* + * Convert a raw byte string into a null-terminated base64 ASCII string. + */ +static char *t_tob64(char *dst, const unsigned char *src, int size) + { + int c, pos = size % 3; + unsigned char b0 = 0, b1 = 0, b2 = 0, notleading = 0; + char *olddst = dst; + + switch(pos) + { + case 1: + b2 = src[0]; + break; + case 2: + b1 = src[0]; + b2 = src[1]; + break; + } + + while(1) + { + c = (b0 & 0xfc) >> 2; + if(notleading || c != 0) + { + *dst++ = b64table[c]; + notleading = 1; + } + c = ((b0 & 3) << 4) | ((b1 & 0xf0) >> 4); + if(notleading || c != 0) + { + *dst++ = b64table[c]; + notleading = 1; + } + c = ((b1 & 0xf) << 2) | ((b2 & 0xc0) >> 6); + if(notleading || c != 0) + { + *dst++ = b64table[c]; + notleading = 1; + } + c = b2 & 0x3f; + if(notleading || c != 0) + { + *dst++ = b64table[c]; + notleading = 1; + } + if(pos >= size) break; + else + { + b0 = src[pos++]; + b1 = src[pos++]; + b2 = src[pos++]; + } + } + + *dst++ = '\0'; + return olddst; + } + +static void SRP_user_pwd_free(SRP_user_pwd *user_pwd) + { + if (user_pwd == NULL) + return; + BN_free(user_pwd->s); + BN_clear_free(user_pwd->v); + OPENSSL_free(user_pwd->id); + OPENSSL_free(user_pwd->info); + OPENSSL_free(user_pwd); + } + +static SRP_user_pwd *SRP_user_pwd_new() + { + SRP_user_pwd *ret = OPENSSL_malloc(sizeof(SRP_user_pwd)); + if (ret == NULL) + return NULL; + ret->N = NULL; + ret->g = NULL; + ret->s = NULL; + ret->v = NULL; + ret->id = NULL ; + ret->info = NULL; + return ret; + } + +static void SRP_user_pwd_set_gN(SRP_user_pwd *vinfo, const BIGNUM *g, + const BIGNUM *N) + { + vinfo->N = N; + vinfo->g = g; + } + +static int SRP_user_pwd_set_ids(SRP_user_pwd *vinfo, const char *id, + const char *info) + { + if (id != NULL && NULL == (vinfo->id = BUF_strdup(id))) + return 0; + return (info == NULL || NULL != (vinfo->info = BUF_strdup(info))) ; + } + +static int SRP_user_pwd_set_sv(SRP_user_pwd *vinfo, const char *s, + const char *v) + { + unsigned char tmp[MAX_LEN]; + int len; + + if (strlen(s) > MAX_LEN || strlen(v) > MAX_LEN) + return 0; + len = t_fromb64(tmp, v); + if (NULL == (vinfo->v = BN_bin2bn(tmp, len, NULL)) ) + return 0; + len = t_fromb64(tmp, s); + return ((vinfo->s = BN_bin2bn(tmp, len, NULL)) != NULL) ; + } + +static int SRP_user_pwd_set_sv_BN(SRP_user_pwd *vinfo, BIGNUM *s, BIGNUM *v) + { + vinfo->v = v; + vinfo->s = s; + return (vinfo->s != NULL && vinfo->v != NULL) ; + } + +SRP_VBASE *SRP_VBASE_new(char *seed_key) + { + SRP_VBASE *vb = (SRP_VBASE *) OPENSSL_malloc(sizeof(SRP_VBASE)); + + if (vb == NULL) + return NULL; + if (!(vb->users_pwd = sk_SRP_user_pwd_new_null()) || + !(vb->gN_cache = sk_SRP_gN_cache_new_null())) + { + OPENSSL_free(vb); + return NULL; + } + vb->default_g = NULL; + vb->default_N = NULL; + vb->seed_key = NULL; + if ((seed_key != NULL) && + (vb->seed_key = BUF_strdup(seed_key)) == NULL) + { + sk_SRP_user_pwd_free(vb->users_pwd); + sk_SRP_gN_cache_free(vb->gN_cache); + OPENSSL_free(vb); + return NULL; + } + return vb; + } + + +int SRP_VBASE_free(SRP_VBASE *vb) + { + sk_SRP_user_pwd_pop_free(vb->users_pwd,SRP_user_pwd_free); + sk_SRP_gN_cache_free(vb->gN_cache); + OPENSSL_free(vb->seed_key); + OPENSSL_free(vb); + return 0; + } + + +static SRP_gN_cache *SRP_gN_new_init(const char *ch) + { + unsigned char tmp[MAX_LEN]; + int len; + + SRP_gN_cache *newgN = (SRP_gN_cache *)OPENSSL_malloc(sizeof(SRP_gN_cache)); + if (newgN == NULL) + return NULL; + + if ((newgN->b64_bn = BUF_strdup(ch)) == NULL) + goto err; + + len = t_fromb64(tmp, ch); + if ((newgN->bn = BN_bin2bn(tmp, len, NULL))) + return newgN; + + OPENSSL_free(newgN->b64_bn); +err: + OPENSSL_free(newgN); + return NULL; + } + + +static void SRP_gN_free(SRP_gN_cache *gN_cache) + { + if (gN_cache == NULL) + return; + OPENSSL_free(gN_cache->b64_bn); + BN_free(gN_cache->bn); + OPENSSL_free(gN_cache); + } + +static SRP_gN *SRP_get_gN_by_id(const char *id, STACK_OF(SRP_gN) *gN_tab) + { + int i; + + SRP_gN *gN; + if (gN_tab != NULL) + for(i = 0; i < sk_SRP_gN_num(gN_tab); i++) + { + gN = sk_SRP_gN_value(gN_tab, i); + if (gN && (id == NULL || strcmp(gN->id,id)==0)) + return gN; + } + + return SRP_get_default_gN(id); + } + +static BIGNUM *SRP_gN_place_bn(STACK_OF(SRP_gN_cache) *gN_cache, char *ch) + { + int i; + if (gN_cache == NULL) + return NULL; + + /* search if we have already one... */ + for(i = 0; i < sk_SRP_gN_cache_num(gN_cache); i++) + { + SRP_gN_cache *cache = sk_SRP_gN_cache_value(gN_cache, i); + if (strcmp(cache->b64_bn,ch)==0) + return cache->bn; + } + { /* it is the first time that we find it */ + SRP_gN_cache *newgN = SRP_gN_new_init(ch); + if (newgN) + { + if (sk_SRP_gN_cache_insert(gN_cache,newgN,0)>0) + return newgN->bn; + SRP_gN_free(newgN); + } + } + return NULL; + } + +/* this function parses verifier file. Format is: + * string(index):base64(N):base64(g):0 + * string(username):base64(v):base64(salt):int(index) + */ + + +int SRP_VBASE_init(SRP_VBASE *vb, char *verifier_file) + { + int error_code ; + STACK_OF(SRP_gN) *SRP_gN_tab = sk_SRP_gN_new_null(); + char *last_index = NULL; + int i; + char **pp; + + SRP_gN *gN = NULL; + SRP_user_pwd *user_pwd = NULL ; + + TXT_DB *tmpdb = NULL; + BIO *in = BIO_new(BIO_s_file()); + + error_code = SRP_ERR_OPEN_FILE; + + if (in == NULL || BIO_read_filename(in,verifier_file) <= 0) + goto err; + + error_code = SRP_ERR_VBASE_INCOMPLETE_FILE; + + if ((tmpdb =TXT_DB_read(in,DB_NUMBER)) == NULL) + goto err; + + error_code = SRP_ERR_MEMORY; + + + if (vb->seed_key) + { + last_index = SRP_get_default_gN(NULL)->id; + } + for (i = 0; i < sk_OPENSSL_PSTRING_num(tmpdb->data); i++) + { + pp = (char **)sk_OPENSSL_PSTRING_value(tmpdb->data,i); + if (pp[DB_srptype][0] == DB_SRP_INDEX) + { + /*we add this couple in the internal Stack */ + + if ((gN = (SRP_gN *)OPENSSL_malloc(sizeof(SRP_gN))) == NULL) + goto err; + + if (!(gN->id = BUF_strdup(pp[DB_srpid])) + || !(gN->N = SRP_gN_place_bn(vb->gN_cache,pp[DB_srpverifier])) + || !(gN->g = SRP_gN_place_bn(vb->gN_cache,pp[DB_srpsalt])) + || sk_SRP_gN_insert(SRP_gN_tab,gN,0) == 0) + goto err; + + gN = NULL; + + if (vb->seed_key != NULL) + { + last_index = pp[DB_srpid]; + } + } + else if (pp[DB_srptype][0] == DB_SRP_VALID) + { + /* it is a user .... */ + SRP_gN *lgN; + if ((lgN = SRP_get_gN_by_id(pp[DB_srpgN],SRP_gN_tab))!=NULL) + { + error_code = SRP_ERR_MEMORY; + if ((user_pwd = SRP_user_pwd_new()) == NULL) + goto err; + + SRP_user_pwd_set_gN(user_pwd,lgN->g,lgN->N); + if (!SRP_user_pwd_set_ids(user_pwd, pp[DB_srpid],pp[DB_srpinfo])) + goto err; + + error_code = SRP_ERR_VBASE_BN_LIB; + if (!SRP_user_pwd_set_sv(user_pwd, pp[DB_srpsalt],pp[DB_srpverifier])) + goto err; + + if (sk_SRP_user_pwd_insert(vb->users_pwd, user_pwd, 0) == 0) + goto err; + user_pwd = NULL; /* abandon responsability */ + } + } + } + + if (last_index != NULL) + { + /* this means that we want to simulate a default user */ + + if (((gN = SRP_get_gN_by_id(last_index,SRP_gN_tab))==NULL)) + { + error_code = SRP_ERR_VBASE_BN_LIB; + goto err; + } + vb->default_g = gN->g ; + vb->default_N = gN->N ; + gN = NULL ; + } + error_code = SRP_NO_ERROR; + + err: + /* there may be still some leaks to fix, if this fails, the application terminates most likely */ + + if (gN != NULL) + { + OPENSSL_free(gN->id); + OPENSSL_free(gN); + } + + SRP_user_pwd_free(user_pwd); + + if (tmpdb) TXT_DB_free(tmpdb); + if (in) BIO_free_all(in); + + sk_SRP_gN_free(SRP_gN_tab); + + return error_code; + + } + + +SRP_user_pwd *SRP_VBASE_get_by_user(SRP_VBASE *vb, char *username) + { + int i; + SRP_user_pwd *user; + unsigned char digv[SHA_DIGEST_LENGTH]; + unsigned char digs[SHA_DIGEST_LENGTH]; + EVP_MD_CTX ctxt; + + if (vb == NULL) + return NULL; + for(i = 0; i < sk_SRP_user_pwd_num(vb->users_pwd); i++) + { + user = sk_SRP_user_pwd_value(vb->users_pwd, i); + if (strcmp(user->id,username)==0) + return user; + } + if ((vb->seed_key == NULL) || + (vb->default_g == NULL) || + (vb->default_N == NULL)) + return NULL; + +/* if the user is unknown we set parameters as well if we have a seed_key */ + + if ((user = SRP_user_pwd_new()) == NULL) + return NULL; + + SRP_user_pwd_set_gN(user,vb->default_g,vb->default_N); + + if (!SRP_user_pwd_set_ids(user,username,NULL)) + goto err; + + RAND_pseudo_bytes(digv, SHA_DIGEST_LENGTH); + EVP_MD_CTX_init(&ctxt); + EVP_DigestInit_ex(&ctxt, EVP_sha1(), NULL); + EVP_DigestUpdate(&ctxt, vb->seed_key, strlen(vb->seed_key)); + EVP_DigestUpdate(&ctxt, username, strlen(username)); + EVP_DigestFinal_ex(&ctxt, digs, NULL); + EVP_MD_CTX_cleanup(&ctxt); + if (SRP_user_pwd_set_sv_BN(user, BN_bin2bn(digs,SHA_DIGEST_LENGTH,NULL), BN_bin2bn(digv,SHA_DIGEST_LENGTH, NULL))) + return user; + +err: SRP_user_pwd_free(user); + return NULL; + } + + +/* + create a verifier (*salt,*verifier,g and N are in base64) +*/ +char *SRP_create_verifier(const char *user, const char *pass, char **salt, + char **verifier, const char *N, const char *g) + { + int len; + char * result=NULL; + char *vf; + BIGNUM *N_bn = NULL, *g_bn = NULL, *s = NULL, *v = NULL; + unsigned char tmp[MAX_LEN]; + unsigned char tmp2[MAX_LEN]; + char * defgNid = NULL; + + if ((user == NULL)|| + (pass == NULL)|| + (salt == NULL)|| + (verifier == NULL)) + goto err; + + if (N) + { + if (!(len = t_fromb64(tmp, N))) goto err; + N_bn = BN_bin2bn(tmp, len, NULL); + if (!(len = t_fromb64(tmp, g))) goto err; + g_bn = BN_bin2bn(tmp, len, NULL); + defgNid = "*"; + } + else + { + SRP_gN * gN = SRP_get_gN_by_id(g, NULL) ; + if (gN == NULL) + goto err; + N_bn = gN->N; + g_bn = gN->g; + defgNid = gN->id; + } + + if (*salt == NULL) + { + RAND_pseudo_bytes(tmp2, SRP_RANDOM_SALT_LEN); + + s = BN_bin2bn(tmp2, SRP_RANDOM_SALT_LEN, NULL); + } + else + { + if (!(len = t_fromb64(tmp2, *salt))) + goto err; + s = BN_bin2bn(tmp2, len, NULL); + } + + + if(!SRP_create_verifier_BN(user, pass, &s, &v, N_bn, g_bn)) goto err; + + BN_bn2bin(v,tmp); + if (((vf = OPENSSL_malloc(BN_num_bytes(v)*2)) == NULL)) + goto err; + t_tob64(vf, tmp, BN_num_bytes(v)); + + *verifier = vf; + if (*salt == NULL) + { + char *tmp_salt; + if ((tmp_salt = (char *)OPENSSL_malloc(SRP_RANDOM_SALT_LEN * 2)) == NULL) + { + OPENSSL_free(vf); + goto err; + } + t_tob64(tmp_salt, tmp2, SRP_RANDOM_SALT_LEN); + *salt = tmp_salt; + } + + result=defgNid; + +err: + if(N) + { + BN_free(N_bn); + BN_free(g_bn); + } + return result; + } + +/* + create a verifier (*salt,*verifier,g and N are BIGNUMs) +*/ +int SRP_create_verifier_BN(const char *user, const char *pass, BIGNUM **salt, BIGNUM **verifier, BIGNUM *N, BIGNUM *g) + { + int result=0; + BIGNUM *x = NULL; + BN_CTX *bn_ctx = BN_CTX_new(); + unsigned char tmp2[MAX_LEN]; + + if ((user == NULL)|| + (pass == NULL)|| + (salt == NULL)|| + (verifier == NULL)|| + (N == NULL)|| + (g == NULL)|| + (bn_ctx == NULL)) + goto err; + + srp_bn_print(N); + srp_bn_print(g); + + if (*salt == NULL) + { + RAND_pseudo_bytes(tmp2, SRP_RANDOM_SALT_LEN); + + *salt = BN_bin2bn(tmp2,SRP_RANDOM_SALT_LEN,NULL); + } + + x = SRP_Calc_x(*salt,user,pass); + + *verifier = BN_new(); + if(*verifier == NULL) goto err; + + if (!BN_mod_exp(*verifier,g,x,N,bn_ctx)) + { + BN_clear_free(*verifier); + goto err; + } + + srp_bn_print(*verifier); + + result=1; + +err: + + BN_clear_free(x); + BN_CTX_free(bn_ctx); + return result; + } + + + +#endif diff --git a/crypto/openssl/crypto/stack/safestack.h b/crypto/openssl/crypto/stack/safestack.h index 3e76aa58f5..ea3aa0d800 100644 --- a/crypto/openssl/crypto/stack/safestack.h +++ b/crypto/openssl/crypto/stack/safestack.h @@ -1459,6 +1459,94 @@ DECLARE_SPECIAL_STACK_OF(OPENSSL_BLOCK, void) #define sk_POLICY_MAPPING_sort(st) SKM_sk_sort(POLICY_MAPPING, (st)) #define sk_POLICY_MAPPING_is_sorted(st) SKM_sk_is_sorted(POLICY_MAPPING, (st)) +#define sk_SRP_gN_new(cmp) SKM_sk_new(SRP_gN, (cmp)) +#define sk_SRP_gN_new_null() SKM_sk_new_null(SRP_gN) +#define sk_SRP_gN_free(st) SKM_sk_free(SRP_gN, (st)) +#define sk_SRP_gN_num(st) SKM_sk_num(SRP_gN, (st)) +#define sk_SRP_gN_value(st, i) SKM_sk_value(SRP_gN, (st), (i)) +#define sk_SRP_gN_set(st, i, val) SKM_sk_set(SRP_gN, (st), (i), (val)) +#define sk_SRP_gN_zero(st) SKM_sk_zero(SRP_gN, (st)) +#define sk_SRP_gN_push(st, val) SKM_sk_push(SRP_gN, (st), (val)) +#define sk_SRP_gN_unshift(st, val) SKM_sk_unshift(SRP_gN, (st), (val)) +#define sk_SRP_gN_find(st, val) SKM_sk_find(SRP_gN, (st), (val)) +#define sk_SRP_gN_find_ex(st, val) SKM_sk_find_ex(SRP_gN, (st), (val)) +#define sk_SRP_gN_delete(st, i) SKM_sk_delete(SRP_gN, (st), (i)) +#define sk_SRP_gN_delete_ptr(st, ptr) SKM_sk_delete_ptr(SRP_gN, (st), (ptr)) +#define sk_SRP_gN_insert(st, val, i) SKM_sk_insert(SRP_gN, (st), (val), (i)) +#define sk_SRP_gN_set_cmp_func(st, cmp) SKM_sk_set_cmp_func(SRP_gN, (st), (cmp)) +#define sk_SRP_gN_dup(st) SKM_sk_dup(SRP_gN, st) +#define sk_SRP_gN_pop_free(st, free_func) SKM_sk_pop_free(SRP_gN, (st), (free_func)) +#define sk_SRP_gN_shift(st) SKM_sk_shift(SRP_gN, (st)) +#define sk_SRP_gN_pop(st) SKM_sk_pop(SRP_gN, (st)) +#define sk_SRP_gN_sort(st) SKM_sk_sort(SRP_gN, (st)) +#define sk_SRP_gN_is_sorted(st) SKM_sk_is_sorted(SRP_gN, (st)) + +#define sk_SRP_gN_cache_new(cmp) SKM_sk_new(SRP_gN_cache, (cmp)) +#define sk_SRP_gN_cache_new_null() SKM_sk_new_null(SRP_gN_cache) +#define sk_SRP_gN_cache_free(st) SKM_sk_free(SRP_gN_cache, (st)) +#define sk_SRP_gN_cache_num(st) SKM_sk_num(SRP_gN_cache, (st)) +#define sk_SRP_gN_cache_value(st, i) SKM_sk_value(SRP_gN_cache, (st), (i)) +#define sk_SRP_gN_cache_set(st, i, val) SKM_sk_set(SRP_gN_cache, (st), (i), (val)) +#define sk_SRP_gN_cache_zero(st) SKM_sk_zero(SRP_gN_cache, (st)) +#define sk_SRP_gN_cache_push(st, val) SKM_sk_push(SRP_gN_cache, (st), (val)) +#define sk_SRP_gN_cache_unshift(st, val) SKM_sk_unshift(SRP_gN_cache, (st), (val)) +#define sk_SRP_gN_cache_find(st, val) SKM_sk_find(SRP_gN_cache, (st), (val)) +#define sk_SRP_gN_cache_find_ex(st, val) SKM_sk_find_ex(SRP_gN_cache, (st), (val)) +#define sk_SRP_gN_cache_delete(st, i) SKM_sk_delete(SRP_gN_cache, (st), (i)) +#define sk_SRP_gN_cache_delete_ptr(st, ptr) SKM_sk_delete_ptr(SRP_gN_cache, (st), (ptr)) +#define sk_SRP_gN_cache_insert(st, val, i) SKM_sk_insert(SRP_gN_cache, (st), (val), (i)) +#define sk_SRP_gN_cache_set_cmp_func(st, cmp) SKM_sk_set_cmp_func(SRP_gN_cache, (st), (cmp)) +#define sk_SRP_gN_cache_dup(st) SKM_sk_dup(SRP_gN_cache, st) +#define sk_SRP_gN_cache_pop_free(st, free_func) SKM_sk_pop_free(SRP_gN_cache, (st), (free_func)) +#define sk_SRP_gN_cache_shift(st) SKM_sk_shift(SRP_gN_cache, (st)) +#define sk_SRP_gN_cache_pop(st) SKM_sk_pop(SRP_gN_cache, (st)) +#define sk_SRP_gN_cache_sort(st) SKM_sk_sort(SRP_gN_cache, (st)) +#define sk_SRP_gN_cache_is_sorted(st) SKM_sk_is_sorted(SRP_gN_cache, (st)) + +#define sk_SRP_user_pwd_new(cmp) SKM_sk_new(SRP_user_pwd, (cmp)) +#define sk_SRP_user_pwd_new_null() SKM_sk_new_null(SRP_user_pwd) +#define sk_SRP_user_pwd_free(st) SKM_sk_free(SRP_user_pwd, (st)) +#define sk_SRP_user_pwd_num(st) SKM_sk_num(SRP_user_pwd, (st)) +#define sk_SRP_user_pwd_value(st, i) SKM_sk_value(SRP_user_pwd, (st), (i)) +#define sk_SRP_user_pwd_set(st, i, val) SKM_sk_set(SRP_user_pwd, (st), (i), (val)) +#define sk_SRP_user_pwd_zero(st) SKM_sk_zero(SRP_user_pwd, (st)) +#define sk_SRP_user_pwd_push(st, val) SKM_sk_push(SRP_user_pwd, (st), (val)) +#define sk_SRP_user_pwd_unshift(st, val) SKM_sk_unshift(SRP_user_pwd, (st), (val)) +#define sk_SRP_user_pwd_find(st, val) SKM_sk_find(SRP_user_pwd, (st), (val)) +#define sk_SRP_user_pwd_find_ex(st, val) SKM_sk_find_ex(SRP_user_pwd, (st), (val)) +#define sk_SRP_user_pwd_delete(st, i) SKM_sk_delete(SRP_user_pwd, (st), (i)) +#define sk_SRP_user_pwd_delete_ptr(st, ptr) SKM_sk_delete_ptr(SRP_user_pwd, (st), (ptr)) +#define sk_SRP_user_pwd_insert(st, val, i) SKM_sk_insert(SRP_user_pwd, (st), (val), (i)) +#define sk_SRP_user_pwd_set_cmp_func(st, cmp) SKM_sk_set_cmp_func(SRP_user_pwd, (st), (cmp)) +#define sk_SRP_user_pwd_dup(st) SKM_sk_dup(SRP_user_pwd, st) +#define sk_SRP_user_pwd_pop_free(st, free_func) SKM_sk_pop_free(SRP_user_pwd, (st), (free_func)) +#define sk_SRP_user_pwd_shift(st) SKM_sk_shift(SRP_user_pwd, (st)) +#define sk_SRP_user_pwd_pop(st) SKM_sk_pop(SRP_user_pwd, (st)) +#define sk_SRP_user_pwd_sort(st) SKM_sk_sort(SRP_user_pwd, (st)) +#define sk_SRP_user_pwd_is_sorted(st) SKM_sk_is_sorted(SRP_user_pwd, (st)) + +#define sk_SRTP_PROTECTION_PROFILE_new(cmp) SKM_sk_new(SRTP_PROTECTION_PROFILE, (cmp)) +#define sk_SRTP_PROTECTION_PROFILE_new_null() SKM_sk_new_null(SRTP_PROTECTION_PROFILE) +#define sk_SRTP_PROTECTION_PROFILE_free(st) SKM_sk_free(SRTP_PROTECTION_PROFILE, (st)) +#define sk_SRTP_PROTECTION_PROFILE_num(st) SKM_sk_num(SRTP_PROTECTION_PROFILE, (st)) +#define sk_SRTP_PROTECTION_PROFILE_value(st, i) SKM_sk_value(SRTP_PROTECTION_PROFILE, (st), (i)) +#define sk_SRTP_PROTECTION_PROFILE_set(st, i, val) SKM_sk_set(SRTP_PROTECTION_PROFILE, (st), (i), (val)) +#define sk_SRTP_PROTECTION_PROFILE_zero(st) SKM_sk_zero(SRTP_PROTECTION_PROFILE, (st)) +#define sk_SRTP_PROTECTION_PROFILE_push(st, val) SKM_sk_push(SRTP_PROTECTION_PROFILE, (st), (val)) +#define sk_SRTP_PROTECTION_PROFILE_unshift(st, val) SKM_sk_unshift(SRTP_PROTECTION_PROFILE, (st), (val)) +#define sk_SRTP_PROTECTION_PROFILE_find(st, val) SKM_sk_find(SRTP_PROTECTION_PROFILE, (st), (val)) +#define sk_SRTP_PROTECTION_PROFILE_find_ex(st, val) SKM_sk_find_ex(SRTP_PROTECTION_PROFILE, (st), (val)) +#define sk_SRTP_PROTECTION_PROFILE_delete(st, i) SKM_sk_delete(SRTP_PROTECTION_PROFILE, (st), (i)) +#define sk_SRTP_PROTECTION_PROFILE_delete_ptr(st, ptr) SKM_sk_delete_ptr(SRTP_PROTECTION_PROFILE, (st), (ptr)) +#define sk_SRTP_PROTECTION_PROFILE_insert(st, val, i) SKM_sk_insert(SRTP_PROTECTION_PROFILE, (st), (val), (i)) +#define sk_SRTP_PROTECTION_PROFILE_set_cmp_func(st, cmp) SKM_sk_set_cmp_func(SRTP_PROTECTION_PROFILE, (st), (cmp)) +#define sk_SRTP_PROTECTION_PROFILE_dup(st) SKM_sk_dup(SRTP_PROTECTION_PROFILE, st) +#define sk_SRTP_PROTECTION_PROFILE_pop_free(st, free_func) SKM_sk_pop_free(SRTP_PROTECTION_PROFILE, (st), (free_func)) +#define sk_SRTP_PROTECTION_PROFILE_shift(st) SKM_sk_shift(SRTP_PROTECTION_PROFILE, (st)) +#define sk_SRTP_PROTECTION_PROFILE_pop(st) SKM_sk_pop(SRTP_PROTECTION_PROFILE, (st)) +#define sk_SRTP_PROTECTION_PROFILE_sort(st) SKM_sk_sort(SRTP_PROTECTION_PROFILE, (st)) +#define sk_SRTP_PROTECTION_PROFILE_is_sorted(st) SKM_sk_is_sorted(SRTP_PROTECTION_PROFILE, (st)) + #define sk_SSL_CIPHER_new(cmp) SKM_sk_new(SSL_CIPHER, (cmp)) #define sk_SSL_CIPHER_new_null() SKM_sk_new_null(SSL_CIPHER) #define sk_SSL_CIPHER_free(st) SKM_sk_free(SSL_CIPHER, (st)) @@ -2056,31 +2144,6 @@ DECLARE_SPECIAL_STACK_OF(OPENSSL_BLOCK, void) #define sk_OPENSSL_STRING_is_sorted(st) SKM_sk_is_sorted(OPENSSL_STRING, (st)) -#define sk_OPENSSL_PSTRING_new(cmp) ((STACK_OF(OPENSSL_PSTRING) *)sk_new(CHECKED_SK_CMP_FUNC(OPENSSL_STRING, cmp))) -#define sk_OPENSSL_PSTRING_new_null() ((STACK_OF(OPENSSL_PSTRING) *)sk_new_null()) -#define sk_OPENSSL_PSTRING_push(st, val) sk_push(CHECKED_STACK_OF(OPENSSL_PSTRING, st), CHECKED_PTR_OF(OPENSSL_STRING, val)) -#define sk_OPENSSL_PSTRING_find(st, val) sk_find(CHECKED_STACK_OF(OPENSSL_PSTRING, st), CHECKED_PTR_OF(OPENSSL_STRING, val)) -#define sk_OPENSSL_PSTRING_value(st, i) ((OPENSSL_PSTRING)sk_value(CHECKED_STACK_OF(OPENSSL_PSTRING, st), i)) -#define sk_OPENSSL_PSTRING_num(st) SKM_sk_num(OPENSSL_PSTRING, st) -#define sk_OPENSSL_PSTRING_pop_free(st, free_func) sk_pop_free(CHECKED_STACK_OF(OPENSSL_PSTRING, st), CHECKED_SK_FREE_FUNC2(OPENSSL_PSTRING, free_func)) -#define sk_OPENSSL_PSTRING_insert(st, val, i) sk_insert(CHECKED_STACK_OF(OPENSSL_PSTRING, st), CHECKED_PTR_OF(OPENSSL_STRING, val), i) -#define sk_OPENSSL_PSTRING_free(st) SKM_sk_free(OPENSSL_PSTRING, st) -#define sk_OPENSSL_PSTRING_set(st, i, val) sk_set(CHECKED_STACK_OF(OPENSSL_PSTRING, st), i, CHECKED_PTR_OF(OPENSSL_STRING, val)) -#define sk_OPENSSL_PSTRING_zero(st) SKM_sk_zero(OPENSSL_PSTRING, (st)) -#define sk_OPENSSL_PSTRING_unshift(st, val) sk_unshift(CHECKED_STACK_OF(OPENSSL_PSTRING, st), CHECKED_PTR_OF(OPENSSL_STRING, val)) -#define sk_OPENSSL_PSTRING_find_ex(st, val) sk_find_ex((_STACK *)CHECKED_CONST_PTR_OF(STACK_OF(OPENSSL_PSTRING), st), CHECKED_CONST_PTR_OF(OPENSSL_STRING, val)) -#define sk_OPENSSL_PSTRING_delete(st, i) SKM_sk_delete(OPENSSL_PSTRING, (st), (i)) -#define sk_OPENSSL_PSTRING_delete_ptr(st, ptr) (OPENSSL_PSTRING *)sk_delete_ptr(CHECKED_STACK_OF(OPENSSL_PSTRING, st), CHECKED_PTR_OF(OPENSSL_STRING, ptr)) -#define sk_OPENSSL_PSTRING_set_cmp_func(st, cmp) \ - ((int (*)(const OPENSSL_STRING * const *,const OPENSSL_STRING * const *)) \ - sk_set_cmp_func(CHECKED_STACK_OF(OPENSSL_PSTRING, st), CHECKED_SK_CMP_FUNC(OPENSSL_STRING, cmp))) -#define sk_OPENSSL_PSTRING_dup(st) SKM_sk_dup(OPENSSL_PSTRING, st) -#define sk_OPENSSL_PSTRING_shift(st) SKM_sk_shift(OPENSSL_PSTRING, (st)) -#define sk_OPENSSL_PSTRING_pop(st) (OPENSSL_STRING *)sk_pop(CHECKED_STACK_OF(OPENSSL_PSTRING, st)) -#define sk_OPENSSL_PSTRING_sort(st) SKM_sk_sort(OPENSSL_PSTRING, (st)) -#define sk_OPENSSL_PSTRING_is_sorted(st) SKM_sk_is_sorted(OPENSSL_PSTRING, (st)) - - #define sk_OPENSSL_BLOCK_new(cmp) ((STACK_OF(OPENSSL_BLOCK) *)sk_new(CHECKED_SK_CMP_FUNC(void, cmp))) #define sk_OPENSSL_BLOCK_new_null() ((STACK_OF(OPENSSL_BLOCK) *)sk_new_null()) #define sk_OPENSSL_BLOCK_push(st, val) sk_push(CHECKED_STACK_OF(OPENSSL_BLOCK, st), CHECKED_PTR_OF(void, val)) @@ -2106,6 +2169,31 @@ DECLARE_SPECIAL_STACK_OF(OPENSSL_BLOCK, void) #define sk_OPENSSL_BLOCK_is_sorted(st) SKM_sk_is_sorted(OPENSSL_BLOCK, (st)) +#define sk_OPENSSL_PSTRING_new(cmp) ((STACK_OF(OPENSSL_PSTRING) *)sk_new(CHECKED_SK_CMP_FUNC(OPENSSL_STRING, cmp))) +#define sk_OPENSSL_PSTRING_new_null() ((STACK_OF(OPENSSL_PSTRING) *)sk_new_null()) +#define sk_OPENSSL_PSTRING_push(st, val) sk_push(CHECKED_STACK_OF(OPENSSL_PSTRING, st), CHECKED_PTR_OF(OPENSSL_STRING, val)) +#define sk_OPENSSL_PSTRING_find(st, val) sk_find(CHECKED_STACK_OF(OPENSSL_PSTRING, st), CHECKED_PTR_OF(OPENSSL_STRING, val)) +#define sk_OPENSSL_PSTRING_value(st, i) ((OPENSSL_PSTRING)sk_value(CHECKED_STACK_OF(OPENSSL_PSTRING, st), i)) +#define sk_OPENSSL_PSTRING_num(st) SKM_sk_num(OPENSSL_PSTRING, st) +#define sk_OPENSSL_PSTRING_pop_free(st, free_func) sk_pop_free(CHECKED_STACK_OF(OPENSSL_PSTRING, st), CHECKED_SK_FREE_FUNC2(OPENSSL_PSTRING, free_func)) +#define sk_OPENSSL_PSTRING_insert(st, val, i) sk_insert(CHECKED_STACK_OF(OPENSSL_PSTRING, st), CHECKED_PTR_OF(OPENSSL_STRING, val), i) +#define sk_OPENSSL_PSTRING_free(st) SKM_sk_free(OPENSSL_PSTRING, st) +#define sk_OPENSSL_PSTRING_set(st, i, val) sk_set(CHECKED_STACK_OF(OPENSSL_PSTRING, st), i, CHECKED_PTR_OF(OPENSSL_STRING, val)) +#define sk_OPENSSL_PSTRING_zero(st) SKM_sk_zero(OPENSSL_PSTRING, (st)) +#define sk_OPENSSL_PSTRING_unshift(st, val) sk_unshift(CHECKED_STACK_OF(OPENSSL_PSTRING, st), CHECKED_PTR_OF(OPENSSL_STRING, val)) +#define sk_OPENSSL_PSTRING_find_ex(st, val) sk_find_ex((_STACK *)CHECKED_CONST_PTR_OF(STACK_OF(OPENSSL_PSTRING), st), CHECKED_CONST_PTR_OF(OPENSSL_STRING, val)) +#define sk_OPENSSL_PSTRING_delete(st, i) SKM_sk_delete(OPENSSL_PSTRING, (st), (i)) +#define sk_OPENSSL_PSTRING_delete_ptr(st, ptr) (OPENSSL_PSTRING *)sk_delete_ptr(CHECKED_STACK_OF(OPENSSL_PSTRING, st), CHECKED_PTR_OF(OPENSSL_STRING, ptr)) +#define sk_OPENSSL_PSTRING_set_cmp_func(st, cmp) \ + ((int (*)(const OPENSSL_STRING * const *,const OPENSSL_STRING * const *)) \ + sk_set_cmp_func(CHECKED_STACK_OF(OPENSSL_PSTRING, st), CHECKED_SK_CMP_FUNC(OPENSSL_STRING, cmp))) +#define sk_OPENSSL_PSTRING_dup(st) SKM_sk_dup(OPENSSL_PSTRING, st) +#define sk_OPENSSL_PSTRING_shift(st) SKM_sk_shift(OPENSSL_PSTRING, (st)) +#define sk_OPENSSL_PSTRING_pop(st) (OPENSSL_STRING *)sk_pop(CHECKED_STACK_OF(OPENSSL_PSTRING, st)) +#define sk_OPENSSL_PSTRING_sort(st) SKM_sk_sort(OPENSSL_PSTRING, (st)) +#define sk_OPENSSL_PSTRING_is_sorted(st) SKM_sk_is_sorted(OPENSSL_PSTRING, (st)) + + #define d2i_ASN1_SET_OF_ACCESS_DESCRIPTION(st, pp, length, d2i_func, free_func, ex_tag, ex_class) \ SKM_ASN1_SET_OF_d2i(ACCESS_DESCRIPTION, (st), (pp), (length), (d2i_func), (free_func), (ex_tag), (ex_class)) #define i2d_ASN1_SET_OF_ACCESS_DESCRIPTION(st, pp, i2d_func, ex_tag, ex_class, is_set) \ diff --git a/crypto/openssl/crypto/symhacks.h b/crypto/openssl/crypto/symhacks.h index 3fd4a81692..403f592dcd 100644 --- a/crypto/openssl/crypto/symhacks.h +++ b/crypto/openssl/crypto/symhacks.h @@ -176,7 +176,6 @@ #define SSL_CTX_set_default_passwd_cb_userdata SSL_CTX_set_def_passwd_cb_ud #undef SSL_COMP_get_compression_methods #define SSL_COMP_get_compression_methods SSL_COMP_get_compress_methods - #undef ssl_add_clienthello_renegotiate_ext #define ssl_add_clienthello_renegotiate_ext ssl_add_clienthello_reneg_ext #undef ssl_add_serverhello_renegotiate_ext @@ -185,6 +184,26 @@ #define ssl_parse_clienthello_renegotiate_ext ssl_parse_clienthello_reneg_ext #undef ssl_parse_serverhello_renegotiate_ext #define ssl_parse_serverhello_renegotiate_ext ssl_parse_serverhello_reneg_ext +#undef SSL_srp_server_param_with_username +#define SSL_srp_server_param_with_username SSL_srp_server_param_with_un +#undef SSL_CTX_set_srp_client_pwd_callback +#define SSL_CTX_set_srp_client_pwd_callback SSL_CTX_set_srp_client_pwd_cb +#undef SSL_CTX_set_srp_verify_param_callback +#define SSL_CTX_set_srp_verify_param_callback SSL_CTX_set_srp_vfy_param_cb +#undef SSL_CTX_set_srp_username_callback +#define SSL_CTX_set_srp_username_callback SSL_CTX_set_srp_un_cb +#undef ssl_add_clienthello_use_srtp_ext +#define ssl_add_clienthello_use_srtp_ext ssl_add_clihello_use_srtp_ext +#undef ssl_add_serverhello_use_srtp_ext +#define ssl_add_serverhello_use_srtp_ext ssl_add_serhello_use_srtp_ext +#undef ssl_parse_clienthello_use_srtp_ext +#define ssl_parse_clienthello_use_srtp_ext ssl_parse_clihello_use_srtp_ext +#undef ssl_parse_serverhello_use_srtp_ext +#define ssl_parse_serverhello_use_srtp_ext ssl_parse_serhello_use_srtp_ext +#undef SSL_CTX_set_next_protos_advertised_cb +#define SSL_CTX_set_next_protos_advertised_cb SSL_CTX_set_next_protos_adv_cb +#undef SSL_CTX_set_next_proto_select_cb +#define SSL_CTX_set_next_proto_select_cb SSL_CTX_set_next_proto_sel_cb /* Hack some long ENGINE names */ #undef ENGINE_get_default_BN_mod_exp_crt @@ -238,6 +257,9 @@ #define EC_GROUP_get_point_conversion_form EC_GROUP_get_point_conv_form #undef EC_GROUP_clear_free_all_extra_data #define EC_GROUP_clear_free_all_extra_data EC_GROUP_clr_free_all_xtra_data +#undef EC_KEY_set_public_key_affine_coordinates +#define EC_KEY_set_public_key_affine_coordinates \ + EC_KEY_set_pub_key_aff_coords #undef EC_POINT_set_Jprojective_coordinates_GFp #define EC_POINT_set_Jprojective_coordinates_GFp \ EC_POINT_set_Jproj_coords_GFp @@ -399,6 +421,12 @@ #undef dtls1_retransmit_buffered_messages #define dtls1_retransmit_buffered_messages dtls1_retransmit_buffered_msgs +/* Hack some long SRP names */ +#undef SRP_generate_server_master_secret +#define SRP_generate_server_master_secret SRP_gen_server_master_secret +#undef SRP_generate_client_master_secret +#define SRP_generate_client_master_secret SRP_gen_client_master_secret + /* Hack some long UI names */ #undef UI_method_get_prompt_constructor #define UI_method_get_prompt_constructor UI_method_get_prompt_constructr diff --git a/crypto/openssl/crypto/ts/ts.h b/crypto/openssl/crypto/ts/ts.h index 190e8a1bf2..c2448e3c3b 100644 --- a/crypto/openssl/crypto/ts/ts.h +++ b/crypto/openssl/crypto/ts/ts.h @@ -86,9 +86,6 @@ #include #endif -#include - - #ifdef __cplusplus extern "C" { #endif diff --git a/crypto/openssl/crypto/ts/ts_rsp_verify.c b/crypto/openssl/crypto/ts/ts_rsp_verify.c index e1f3b534af..afe16afbe4 100644 --- a/crypto/openssl/crypto/ts/ts_rsp_verify.c +++ b/crypto/openssl/crypto/ts/ts_rsp_verify.c @@ -614,12 +614,15 @@ static int TS_compute_imprint(BIO *data, TS_TST_INFO *tst_info, goto err; } - EVP_DigestInit(&md_ctx, md); + if (!EVP_DigestInit(&md_ctx, md)) + goto err; while ((length = BIO_read(data, buffer, sizeof(buffer))) > 0) { - EVP_DigestUpdate(&md_ctx, buffer, length); + if (!EVP_DigestUpdate(&md_ctx, buffer, length)) + goto err; } - EVP_DigestFinal(&md_ctx, *imprint, NULL); + if (!EVP_DigestFinal(&md_ctx, *imprint, NULL)) + goto err; return 1; err: diff --git a/crypto/openssl/crypto/ui/ui.h b/crypto/openssl/crypto/ui/ui.h index 2b1cfa2289..bd78aa413f 100644 --- a/crypto/openssl/crypto/ui/ui.h +++ b/crypto/openssl/crypto/ui/ui.h @@ -316,7 +316,7 @@ int (*UI_method_get_writer(UI_METHOD *method))(UI*,UI_STRING*); int (*UI_method_get_flusher(UI_METHOD *method))(UI*); int (*UI_method_get_reader(UI_METHOD *method))(UI*,UI_STRING*); int (*UI_method_get_closer(UI_METHOD *method))(UI*); -char* (*UI_method_get_prompt_constructor(UI_METHOD *method))(UI*, const char*, const char*); +char * (*UI_method_get_prompt_constructor(UI_METHOD *method))(UI*, const char*, const char*); /* The following functions are helpers for method writers to access relevant data from a UI_STRING. */ diff --git a/crypto/openssl/crypto/ui/ui_openssl.c b/crypto/openssl/crypto/ui/ui_openssl.c index 1bc25f48d5..5832a73cf5 100644 --- a/crypto/openssl/crypto/ui/ui_openssl.c +++ b/crypto/openssl/crypto/ui/ui_openssl.c @@ -122,7 +122,7 @@ * sigaction and fileno included. -pedantic would be more appropriate for * the intended purposes, but we can't prevent users from adding -ansi. */ -#ifndef _POSIX_C_SOURCE +#if !defined(_POSIX_C_SOURCE) && defined(OPENSSL_SYS_VMS) #define _POSIX_C_SOURCE 2 #endif #include diff --git a/crypto/openssl/crypto/whrlpool/whrlpool.h b/crypto/openssl/crypto/whrlpool/whrlpool.h index 03c91da115..9e01f5b076 100644 --- a/crypto/openssl/crypto/whrlpool/whrlpool.h +++ b/crypto/openssl/crypto/whrlpool/whrlpool.h @@ -24,6 +24,9 @@ typedef struct { } WHIRLPOOL_CTX; #ifndef OPENSSL_NO_WHIRLPOOL +#ifdef OPENSSL_FIPS +int private_WHIRLPOOL_Init(WHIRLPOOL_CTX *c); +#endif int WHIRLPOOL_Init (WHIRLPOOL_CTX *c); int WHIRLPOOL_Update (WHIRLPOOL_CTX *c,const void *inp,size_t bytes); void WHIRLPOOL_BitUpdate(WHIRLPOOL_CTX *c,const void *inp,size_t bits); diff --git a/crypto/openssl/crypto/whrlpool/wp_block.c b/crypto/openssl/crypto/whrlpool/wp_block.c index 221f6cc59f..824ed1827c 100644 --- a/crypto/openssl/crypto/whrlpool/wp_block.c +++ b/crypto/openssl/crypto/whrlpool/wp_block.c @@ -68,9 +68,9 @@ typedef unsigned long long u64; CPUs this is actually faster! */ # endif # define GO_FOR_MMX(ctx,inp,num) do { \ - extern unsigned long OPENSSL_ia32cap_P; \ + extern unsigned int OPENSSL_ia32cap_P[]; \ void whirlpool_block_mmx(void *,const void *,size_t); \ - if (!(OPENSSL_ia32cap_P & (1<<23))) break; \ + if (!(OPENSSL_ia32cap_P[0] & (1<<23))) break; \ whirlpool_block_mmx(ctx->H.c,inp,num); return; \ } while (0) # endif diff --git a/crypto/openssl/crypto/whrlpool/wp_dgst.c b/crypto/openssl/crypto/whrlpool/wp_dgst.c index ee5c5c1bf3..7e28bef51d 100644 --- a/crypto/openssl/crypto/whrlpool/wp_dgst.c +++ b/crypto/openssl/crypto/whrlpool/wp_dgst.c @@ -52,9 +52,10 @@ */ #include "wp_locl.h" +#include #include -int WHIRLPOOL_Init (WHIRLPOOL_CTX *c) +fips_md_init(WHIRLPOOL) { memset (c,0,sizeof(*c)); return(1); diff --git a/crypto/openssl/crypto/x509/x509.h b/crypto/openssl/crypto/x509/x509.h index e6f8a40395..092dd7450d 100644 --- a/crypto/openssl/crypto/x509/x509.h +++ b/crypto/openssl/crypto/x509/x509.h @@ -657,11 +657,15 @@ int NETSCAPE_SPKI_set_pubkey(NETSCAPE_SPKI *x, EVP_PKEY *pkey); int NETSCAPE_SPKI_print(BIO *out, NETSCAPE_SPKI *spki); +int X509_signature_dump(BIO *bp,const ASN1_STRING *sig, int indent); int X509_signature_print(BIO *bp,X509_ALGOR *alg, ASN1_STRING *sig); int X509_sign(X509 *x, EVP_PKEY *pkey, const EVP_MD *md); +int X509_sign_ctx(X509 *x, EVP_MD_CTX *ctx); int X509_REQ_sign(X509_REQ *x, EVP_PKEY *pkey, const EVP_MD *md); +int X509_REQ_sign_ctx(X509_REQ *x, EVP_MD_CTX *ctx); int X509_CRL_sign(X509_CRL *x, EVP_PKEY *pkey, const EVP_MD *md); +int X509_CRL_sign_ctx(X509_CRL *x, EVP_MD_CTX *ctx); int NETSCAPE_SPKI_sign(NETSCAPE_SPKI *x, EVP_PKEY *pkey, const EVP_MD *md); int X509_pubkey_digest(const X509 *data,const EVP_MD *type, @@ -763,6 +767,7 @@ X509_ALGOR *X509_ALGOR_dup(X509_ALGOR *xn); int X509_ALGOR_set0(X509_ALGOR *alg, ASN1_OBJECT *aobj, int ptype, void *pval); void X509_ALGOR_get0(ASN1_OBJECT **paobj, int *pptype, void **ppval, X509_ALGOR *algor); +void X509_ALGOR_set_md(X509_ALGOR *alg, const EVP_MD *md); X509_NAME *X509_NAME_dup(X509_NAME *xn); X509_NAME_ENTRY *X509_NAME_ENTRY_dup(X509_NAME_ENTRY *ne); @@ -896,6 +901,9 @@ int ASN1_item_verify(const ASN1_ITEM *it, X509_ALGOR *algor1, int ASN1_item_sign(const ASN1_ITEM *it, X509_ALGOR *algor1, X509_ALGOR *algor2, ASN1_BIT_STRING *signature, void *data, EVP_PKEY *pkey, const EVP_MD *type); +int ASN1_item_sign_ctx(const ASN1_ITEM *it, + X509_ALGOR *algor1, X509_ALGOR *algor2, + ASN1_BIT_STRING *signature, void *asn, EVP_MD_CTX *ctx); #endif int X509_set_version(X509 *x,long version); @@ -1161,6 +1169,9 @@ X509_ALGOR *PKCS5_pbe2_set_iv(const EVP_CIPHER *cipher, int iter, unsigned char *salt, int saltlen, unsigned char *aiv, int prf_nid); +X509_ALGOR *PKCS5_pbkdf2_set(int iter, unsigned char *salt, int saltlen, + int prf_nid, int keylen); + /* PKCS#8 utilities */ DECLARE_ASN1_FUNCTIONS(PKCS8_PRIV_KEY_INFO) diff --git a/crypto/openssl/crypto/x509/x509_cmp.c b/crypto/openssl/crypto/x509/x509_cmp.c index 4bc9da07e0..7c2aaee2e9 100644 --- a/crypto/openssl/crypto/x509/x509_cmp.c +++ b/crypto/openssl/crypto/x509/x509_cmp.c @@ -87,15 +87,20 @@ unsigned long X509_issuer_and_serial_hash(X509 *a) EVP_MD_CTX_init(&ctx); f=X509_NAME_oneline(a->cert_info->issuer,NULL,0); ret=strlen(f); - EVP_DigestInit_ex(&ctx, EVP_md5(), NULL); - EVP_DigestUpdate(&ctx,(unsigned char *)f,ret); + if (!EVP_DigestInit_ex(&ctx, EVP_md5(), NULL)) + goto err; + if (!EVP_DigestUpdate(&ctx,(unsigned char *)f,ret)) + goto err; OPENSSL_free(f); - EVP_DigestUpdate(&ctx,(unsigned char *)a->cert_info->serialNumber->data, - (unsigned long)a->cert_info->serialNumber->length); - EVP_DigestFinal_ex(&ctx,&(md[0]),NULL); + if(!EVP_DigestUpdate(&ctx,(unsigned char *)a->cert_info->serialNumber->data, + (unsigned long)a->cert_info->serialNumber->length)) + goto err; + if (!EVP_DigestFinal_ex(&ctx,&(md[0]),NULL)) + goto err; ret=( ((unsigned long)md[0] )|((unsigned long)md[1]<<8L)| ((unsigned long)md[2]<<16L)|((unsigned long)md[3]<<24L) )&0xffffffffL; + err: EVP_MD_CTX_cleanup(&ctx); return(ret); } @@ -219,7 +224,9 @@ unsigned long X509_NAME_hash(X509_NAME *x) /* Make sure X509_NAME structure contains valid cached encoding */ i2d_X509_NAME(x,NULL); - EVP_Digest(x->canon_enc, x->canon_enclen, md, NULL, EVP_sha1(), NULL); + if (!EVP_Digest(x->canon_enc, x->canon_enclen, md, NULL, EVP_sha1(), + NULL)) + return 0; ret=( ((unsigned long)md[0] )|((unsigned long)md[1]<<8L)| ((unsigned long)md[2]<<16L)|((unsigned long)md[3]<<24L) @@ -234,12 +241,18 @@ unsigned long X509_NAME_hash(X509_NAME *x) unsigned long X509_NAME_hash_old(X509_NAME *x) { + EVP_MD_CTX md_ctx; unsigned long ret=0; unsigned char md[16]; /* Make sure X509_NAME structure contains valid cached encoding */ i2d_X509_NAME(x,NULL); - EVP_Digest(x->bytes->data, x->bytes->length, md, NULL, EVP_md5(), NULL); + EVP_MD_CTX_init(&md_ctx); + EVP_MD_CTX_set_flags(&md_ctx, EVP_MD_CTX_FLAG_NON_FIPS_ALLOW); + EVP_DigestInit_ex(&md_ctx, EVP_md5(), NULL); + EVP_DigestUpdate(&md_ctx, x->bytes->data, x->bytes->length); + EVP_DigestFinal_ex(&md_ctx,md,NULL); + EVP_MD_CTX_cleanup(&md_ctx); ret=( ((unsigned long)md[0] )|((unsigned long)md[1]<<8L)| ((unsigned long)md[2]<<16L)|((unsigned long)md[3]<<24L) diff --git a/crypto/openssl/crypto/x509/x509_vfy.c b/crypto/openssl/crypto/x509/x509_vfy.c index 701ec565e9..b0779db023 100644 --- a/crypto/openssl/crypto/x509/x509_vfy.c +++ b/crypto/openssl/crypto/x509/x509_vfy.c @@ -153,7 +153,6 @@ static int x509_subject_cmp(X509 **a, X509 **b) int X509_verify_cert(X509_STORE_CTX *ctx) { X509 *x,*xtmp,*chain_ss=NULL; - X509_NAME *xn; int bad_chain = 0; X509_VERIFY_PARAM *param = ctx->param; int depth,i,ok=0; @@ -205,7 +204,6 @@ int X509_verify_cert(X509_STORE_CTX *ctx) */ /* If we are self signed, we break */ - xn=X509_get_issuer_name(x); if (ctx->check_issued(ctx, x,x)) break; /* If we were passed a cert chain, use it first */ @@ -242,7 +240,6 @@ int X509_verify_cert(X509_STORE_CTX *ctx) i=sk_X509_num(ctx->chain); x=sk_X509_value(ctx->chain,i-1); - xn = X509_get_subject_name(x); if (ctx->check_issued(ctx, x, x)) { /* we have a self signed certificate */ @@ -291,7 +288,6 @@ int X509_verify_cert(X509_STORE_CTX *ctx) if (depth < num) break; /* If we are self signed, we break */ - xn=X509_get_issuer_name(x); if (ctx->check_issued(ctx,x,x)) break; ok = ctx->get_issuer(&xtmp, ctx, x); @@ -310,7 +306,6 @@ int X509_verify_cert(X509_STORE_CTX *ctx) } /* we now have our chain, lets check it... */ - xn=X509_get_issuer_name(x); /* Is last certificate looked up self signed? */ if (!ctx->check_issued(ctx,x,x)) diff --git a/crypto/openssl/crypto/x509/x509type.c b/crypto/openssl/crypto/x509/x509type.c index 3385ad3f67..9702ec5310 100644 --- a/crypto/openssl/crypto/x509/x509type.c +++ b/crypto/openssl/crypto/x509/x509type.c @@ -100,20 +100,26 @@ int X509_certificate_type(X509 *x, EVP_PKEY *pkey) break; } - i=X509_get_signature_type(x); - switch (i) + i=OBJ_obj2nid(x->sig_alg->algorithm); + if (i && OBJ_find_sigid_algs(i, NULL, &i)) { - case EVP_PKEY_RSA: - ret|=EVP_PKS_RSA; - break; - case EVP_PKEY_DSA: - ret|=EVP_PKS_DSA; - break; - case EVP_PKEY_EC: - ret|=EVP_PKS_EC; - break; - default: - break; + + switch (i) + { + case NID_rsaEncryption: + case NID_rsa: + ret|=EVP_PKS_RSA; + break; + case NID_dsa: + case NID_dsa_2: + ret|=EVP_PKS_DSA; + break; + case NID_X9_62_id_ecPublicKey: + ret|=EVP_PKS_EC; + break; + default: + break; + } } if (EVP_PKEY_size(pk) <= 1024/8)/* /8 because it's 1024 bits we look diff --git a/crypto/openssl/crypto/x509/x_all.c b/crypto/openssl/crypto/x509/x_all.c index 8ec88c215a..b94aeeb873 100644 --- a/crypto/openssl/crypto/x509/x_all.c +++ b/crypto/openssl/crypto/x509/x_all.c @@ -95,12 +95,25 @@ int X509_sign(X509 *x, EVP_PKEY *pkey, const EVP_MD *md) x->sig_alg, x->signature, x->cert_info,pkey,md)); } +int X509_sign_ctx(X509 *x, EVP_MD_CTX *ctx) + { + return ASN1_item_sign_ctx(ASN1_ITEM_rptr(X509_CINF), + x->cert_info->signature, + x->sig_alg, x->signature, x->cert_info, ctx); + } + int X509_REQ_sign(X509_REQ *x, EVP_PKEY *pkey, const EVP_MD *md) { return(ASN1_item_sign(ASN1_ITEM_rptr(X509_REQ_INFO),x->sig_alg, NULL, x->signature, x->req_info,pkey,md)); } +int X509_REQ_sign_ctx(X509_REQ *x, EVP_MD_CTX *ctx) + { + return ASN1_item_sign_ctx(ASN1_ITEM_rptr(X509_REQ_INFO), + x->sig_alg, NULL, x->signature, x->req_info, ctx); + } + int X509_CRL_sign(X509_CRL *x, EVP_PKEY *pkey, const EVP_MD *md) { x->crl->enc.modified = 1; @@ -108,6 +121,12 @@ int X509_CRL_sign(X509_CRL *x, EVP_PKEY *pkey, const EVP_MD *md) x->sig_alg, x->signature, x->crl,pkey,md)); } +int X509_CRL_sign_ctx(X509_CRL *x, EVP_MD_CTX *ctx) + { + return ASN1_item_sign_ctx(ASN1_ITEM_rptr(X509_CRL_INFO), + x->crl->sig_alg, x->sig_alg, x->signature, x->crl, ctx); + } + int NETSCAPE_SPKI_sign(NETSCAPE_SPKI *x, EVP_PKEY *pkey, const EVP_MD *md) { return(ASN1_item_sign(ASN1_ITEM_rptr(NETSCAPE_SPKAC), x->sig_algor,NULL, diff --git a/crypto/openssl/crypto/x509v3/v3_asid.c b/crypto/openssl/crypto/x509v3/v3_asid.c index 3f434c0603..1587e8ed72 100644 --- a/crypto/openssl/crypto/x509v3/v3_asid.c +++ b/crypto/openssl/crypto/x509v3/v3_asid.c @@ -358,6 +358,20 @@ static int ASIdentifierChoice_is_canonical(ASIdentifierChoice *choice) goto done; } + /* + * Check for inverted range. + */ + i = sk_ASIdOrRange_num(choice->u.asIdsOrRanges) - 1; + { + ASIdOrRange *a = sk_ASIdOrRange_value(choice->u.asIdsOrRanges, i); + ASN1_INTEGER *a_min, *a_max; + if (a != NULL && a->type == ASIdOrRange_range) { + extract_min_max(a, &a_min, &a_max); + if (ASN1_INTEGER_cmp(a_min, a_max) > 0) + goto done; + } + } + ret = 1; done: @@ -392,9 +406,18 @@ static int ASIdentifierChoice_canonize(ASIdentifierChoice *choice) return 1; /* - * We have a list. Sort it. + * If not a list, or if empty list, it's broken. + */ + if (choice->type != ASIdentifierChoice_asIdsOrRanges || + sk_ASIdOrRange_num(choice->u.asIdsOrRanges) == 0) { + X509V3err(X509V3_F_ASIDENTIFIERCHOICE_CANONIZE, + X509V3_R_EXTENSION_VALUE_ERROR); + return 0; + } + + /* + * We have a non-empty list. Sort it. */ - OPENSSL_assert(choice->type == ASIdentifierChoice_asIdsOrRanges); sk_ASIdOrRange_sort(choice->u.asIdsOrRanges); /* @@ -414,6 +437,13 @@ static int ASIdentifierChoice_canonize(ASIdentifierChoice *choice) */ OPENSSL_assert(ASN1_INTEGER_cmp(a_min, b_min) <= 0); + /* + * Punt inverted ranges. + */ + if (ASN1_INTEGER_cmp(a_min, a_max) > 0 || + ASN1_INTEGER_cmp(b_min, b_max) > 0) + goto done; + /* * Check for overlaps. */ @@ -465,12 +495,26 @@ static int ASIdentifierChoice_canonize(ASIdentifierChoice *choice) break; } ASIdOrRange_free(b); - sk_ASIdOrRange_delete(choice->u.asIdsOrRanges, i + 1); + (void) sk_ASIdOrRange_delete(choice->u.asIdsOrRanges, i + 1); i--; continue; } } + /* + * Check for final inverted range. + */ + i = sk_ASIdOrRange_num(choice->u.asIdsOrRanges) - 1; + { + ASIdOrRange *a = sk_ASIdOrRange_value(choice->u.asIdsOrRanges, i); + ASN1_INTEGER *a_min, *a_max; + if (a != NULL && a->type == ASIdOrRange_range) { + extract_min_max(a, &a_min, &a_max); + if (ASN1_INTEGER_cmp(a_min, a_max) > 0) + goto done; + } + } + OPENSSL_assert(ASIdentifierChoice_is_canonical(choice)); /* Paranoia */ ret = 1; @@ -498,6 +542,7 @@ static void *v2i_ASIdentifiers(const struct v3_ext_method *method, struct v3_ext_ctx *ctx, STACK_OF(CONF_VALUE) *values) { + ASN1_INTEGER *min = NULL, *max = NULL; ASIdentifiers *asid = NULL; int i; @@ -508,7 +553,6 @@ static void *v2i_ASIdentifiers(const struct v3_ext_method *method, for (i = 0; i < sk_CONF_VALUE_num(values); i++) { CONF_VALUE *val = sk_CONF_VALUE_value(values, i); - ASN1_INTEGER *min = NULL, *max = NULL; int i1, i2, i3, is_range, which; /* @@ -578,18 +622,19 @@ static void *v2i_ASIdentifiers(const struct v3_ext_method *method, max = s2i_ASN1_INTEGER(NULL, s + i2); OPENSSL_free(s); if (min == NULL || max == NULL) { - ASN1_INTEGER_free(min); - ASN1_INTEGER_free(max); X509V3err(X509V3_F_V2I_ASIDENTIFIERS, ERR_R_MALLOC_FAILURE); goto err; } + if (ASN1_INTEGER_cmp(min, max) > 0) { + X509V3err(X509V3_F_V2I_ASIDENTIFIERS, X509V3_R_EXTENSION_VALUE_ERROR); + goto err; + } } if (!v3_asid_add_id_or_range(asid, which, min, max)) { - ASN1_INTEGER_free(min); - ASN1_INTEGER_free(max); X509V3err(X509V3_F_V2I_ASIDENTIFIERS, ERR_R_MALLOC_FAILURE); goto err; } + min = max = NULL; } /* @@ -601,6 +646,8 @@ static void *v2i_ASIdentifiers(const struct v3_ext_method *method, err: ASIdentifiers_free(asid); + ASN1_INTEGER_free(min); + ASN1_INTEGER_free(max); return NULL; } diff --git a/crypto/openssl/crypto/x509v3/v3_skey.c b/crypto/openssl/crypto/x509v3/v3_skey.c index 202c9e4896..0a984fbaa8 100644 --- a/crypto/openssl/crypto/x509v3/v3_skey.c +++ b/crypto/openssl/crypto/x509v3/v3_skey.c @@ -129,7 +129,8 @@ static ASN1_OCTET_STRING *s2i_skey_id(X509V3_EXT_METHOD *method, goto err; } - EVP_Digest(pk->data, pk->length, pkey_dig, &diglen, EVP_sha1(), NULL); + if (!EVP_Digest(pk->data, pk->length, pkey_dig, &diglen, EVP_sha1(), NULL)) + goto err; if(!M_ASN1_OCTET_STRING_set(oct, pkey_dig, diglen)) { X509V3err(X509V3_F_S2I_SKEY_ID,ERR_R_MALLOC_FAILURE); diff --git a/crypto/openssl/crypto/x86_64cpuid.pl b/crypto/openssl/crypto/x86_64cpuid.pl index c96821a3c8..7b7b93b223 100644 --- a/crypto/openssl/crypto/x86_64cpuid.pl +++ b/crypto/openssl/crypto/x86_64cpuid.pl @@ -7,15 +7,24 @@ if ($flavour =~ /\./) { $output = $flavour; undef $flavour; } $win64=0; $win64=1 if ($flavour =~ /[nm]asm|mingw64/ || $output =~ /\.asm$/); $0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1; -open STDOUT,"| $^X ${dir}perlasm/x86_64-xlate.pl $flavour $output"; +( $xlate="${dir}x86_64-xlate.pl" and -f $xlate ) or +( $xlate="${dir}perlasm/x86_64-xlate.pl" and -f $xlate) or +die "can't locate x86_64-xlate.pl"; + +open STDOUT,"| $^X $xlate $flavour $output"; + +($arg1,$arg2,$arg3,$arg4)=$win64?("%rcx","%rdx","%r8", "%r9") : # Win64 order + ("%rdi","%rsi","%rdx","%rcx"); # Unix order -if ($win64) { $arg1="%rcx"; $arg2="%rdx"; } -else { $arg1="%rdi"; $arg2="%rsi"; } print<<___; .extern OPENSSL_cpuid_setup +.hidden OPENSSL_cpuid_setup .section .init call OPENSSL_cpuid_setup +.hidden OPENSSL_ia32cap_P +.comm OPENSSL_ia32cap_P,8,4 + .text .globl OPENSSL_atomic_add @@ -46,7 +55,7 @@ OPENSSL_rdtsc: .type OPENSSL_ia32_cpuid,\@abi-omnipotent .align 16 OPENSSL_ia32_cpuid: - mov %rbx,%r8 + mov %rbx,%r8 # save %rbx xor %eax,%eax cpuid @@ -78,7 +87,15 @@ OPENSSL_ia32_cpuid: # AMD specific mov \$0x80000000,%eax cpuid - cmp \$0x80000008,%eax + cmp \$0x80000001,%eax + jb .Lintel + mov %eax,%r10d + mov \$0x80000001,%eax + cpuid + or %ecx,%r9d + and \$0x00000801,%r9d # isolate AMD XOP bit, 1<<11 + + cmp \$0x80000008,%r10d jb .Lintel mov \$0x80000008,%eax @@ -89,12 +106,12 @@ OPENSSL_ia32_cpuid: mov \$1,%eax cpuid bt \$28,%edx # test hyper-threading bit - jnc .Ldone + jnc .Lgeneric shr \$16,%ebx # number of logical processors cmp %r10b,%bl - ja .Ldone + ja .Lgeneric and \$0xefffffff,%edx # ~(1<<28) - jmp .Ldone + jmp .Lgeneric .Lintel: cmp \$4,%r11d @@ -111,30 +128,47 @@ OPENSSL_ia32_cpuid: .Lnocacheinfo: mov \$1,%eax cpuid + and \$0xbfefffff,%edx # force reserved bits to 0 cmp \$0,%r9d jne .Lnotintel - or \$0x00100000,%edx # use reserved 20th bit to engage RC4_CHAR + or \$0x40000000,%edx # set reserved bit#30 on Intel CPUs and \$15,%ah cmp \$15,%ah # examine Family ID - je .Lnotintel - or \$0x40000000,%edx # use reserved bit to skip unrolled loop + jne .Lnotintel + or \$0x00100000,%edx # set reserved bit#20 to engage RC4_CHAR .Lnotintel: bt \$28,%edx # test hyper-threading bit - jnc .Ldone + jnc .Lgeneric and \$0xefffffff,%edx # ~(1<<28) cmp \$0,%r10d - je .Ldone + je .Lgeneric or \$0x10000000,%edx # 1<<28 shr \$16,%ebx cmp \$1,%bl # see if cache is shared - ja .Ldone + ja .Lgeneric and \$0xefffffff,%edx # ~(1<<28) +.Lgeneric: + and \$0x00000800,%r9d # isolate AMD XOP flag + and \$0xfffff7ff,%ecx + or %ecx,%r9d # merge AMD XOP flag + + mov %edx,%r10d # %r9d:%r10d is copy of %ecx:%edx + bt \$27,%r9d # check OSXSAVE bit + jnc .Lclear_avx + xor %ecx,%ecx # XCR0 + .byte 0x0f,0x01,0xd0 # xgetbv + and \$6,%eax # isolate XMM and YMM state support + cmp \$6,%eax + je .Ldone +.Lclear_avx: + mov \$0xefffe7ff,%eax # ~(1<<28|1<<12|1<<11) + and %eax,%r9d # clear AVX, FMA and AMD XOP bits .Ldone: - shl \$32,%rcx - mov %edx,%eax - mov %r8,%rbx - or %rcx,%rax + shl \$32,%r9 + mov %r10d,%eax + mov %r8,%rbx # restore %rbx + or %r9,%rax ret .size OPENSSL_ia32_cpuid,.-OPENSSL_ia32_cpuid @@ -229,4 +263,21 @@ OPENSSL_wipe_cpu: .size OPENSSL_wipe_cpu,.-OPENSSL_wipe_cpu ___ +print<<___; +.globl OPENSSL_ia32_rdrand +.type OPENSSL_ia32_rdrand,\@abi-omnipotent +.align 16 +OPENSSL_ia32_rdrand: + mov \$8,%ecx +.Loop_rdrand: + rdrand %rax + jc .Lbreak_rdrand + loop .Loop_rdrand +.Lbreak_rdrand: + cmp \$0,%rax + cmove %rcx,%rax + ret +.size OPENSSL_ia32_rdrand,.-OPENSSL_ia32_rdrand +___ + close STDOUT; # flush diff --git a/crypto/openssl/crypto/x86cpuid.pl b/crypto/openssl/crypto/x86cpuid.pl index a7464af19b..39fd8f2293 100644 --- a/crypto/openssl/crypto/x86cpuid.pl +++ b/crypto/openssl/crypto/x86cpuid.pl @@ -19,9 +19,9 @@ for (@ARGV) { $sse2=1 if (/-DOPENSSL_IA32_SSE2/); } &pushf (); &pop ("eax"); &xor ("ecx","eax"); - &bt ("ecx",21); - &jnc (&label("done")); &xor ("eax","eax"); + &bt ("ecx",21); + &jnc (&label("nocpuid")); &cpuid (); &mov ("edi","eax"); # max value for standard query level @@ -51,7 +51,14 @@ for (@ARGV) { $sse2=1 if (/-DOPENSSL_IA32_SSE2/); } # AMD specific &mov ("eax",0x80000000); &cpuid (); - &cmp ("eax",0x80000008); + &cmp ("eax",0x80000001); + &jb (&label("intel")); + &mov ("esi","eax"); + &mov ("eax",0x80000001); + &cpuid (); + &or ("ebp","ecx"); + &and ("ebp",1<<11|1); # isolate XOP bit + &cmp ("esi",0x80000008); &jb (&label("intel")); &mov ("eax",0x80000008); @@ -62,13 +69,13 @@ for (@ARGV) { $sse2=1 if (/-DOPENSSL_IA32_SSE2/); } &mov ("eax",1); &cpuid (); &bt ("edx",28); - &jnc (&label("done")); + &jnc (&label("generic")); &shr ("ebx",16); &and ("ebx",0xff); &cmp ("ebx","esi"); - &ja (&label("done")); + &ja (&label("generic")); &and ("edx",0xefffffff); # clear hyper-threading bit - &jmp (&label("done")); + &jmp (&label("generic")); &set_label("intel"); &cmp ("edi",4); @@ -85,27 +92,51 @@ for (@ARGV) { $sse2=1 if (/-DOPENSSL_IA32_SSE2/); } &set_label("nocacheinfo"); &mov ("eax",1); &cpuid (); + &and ("edx",0xbfefffff); # force reserved bits #20, #30 to 0 &cmp ("ebp",0); - &jne (&label("notP4")); + &jne (&label("notintel")); + &or ("edx",1<<30); # set reserved bit#30 on Intel CPUs &and (&HB("eax"),15); # familiy ID &cmp (&HB("eax"),15); # P4? - &jne (&label("notP4")); - &or ("edx",1<<20); # use reserved bit to engage RC4_CHAR -&set_label("notP4"); + &jne (&label("notintel")); + &or ("edx",1<<20); # set reserved bit#20 to engage RC4_CHAR +&set_label("notintel"); &bt ("edx",28); # test hyper-threading bit - &jnc (&label("done")); + &jnc (&label("generic")); &and ("edx",0xefffffff); &cmp ("edi",0); - &je (&label("done")); + &je (&label("generic")); &or ("edx",0x10000000); &shr ("ebx",16); &cmp (&LB("ebx"),1); - &ja (&label("done")); + &ja (&label("generic")); &and ("edx",0xefffffff); # clear hyper-threading bit if not + +&set_label("generic"); + &and ("ebp",1<<11); # isolate AMD XOP flag + &and ("ecx",0xfffff7ff); # force 11th bit to 0 + &mov ("esi","edx"); + &or ("ebp","ecx"); # merge AMD XOP flag + + &bt ("ecx",27); # check OSXSAVE bit + &jnc (&label("clear_avx")); + &xor ("ecx","ecx"); + &data_byte(0x0f,0x01,0xd0); # xgetbv + &and ("eax",6); + &cmp ("eax",6); + &je (&label("done")); + &cmp ("eax",2); + &je (&label("clear_avx")); +&set_label("clear_xmm"); + &and ("ebp",0xfdfffffd); # clear AESNI and PCLMULQDQ bits + &and ("esi",0xfeffffff); # clear FXSR +&set_label("clear_avx"); + &and ("ebp",0xefffe7ff); # clear AVX, FMA and AMD XOP bits &set_label("done"); - &mov ("eax","edx"); - &mov ("edx","ecx"); + &mov ("eax","esi"); + &mov ("edx","ebp"); +&set_label("nocpuid"); &function_end("OPENSSL_ia32_cpuid"); &external_label("OPENSSL_ia32cap_P"); @@ -199,8 +230,9 @@ for (@ARGV) { $sse2=1 if (/-DOPENSSL_IA32_SSE2/); } &bt (&DWP(0,"ecx"),1); &jnc (&label("no_x87")); if ($sse2) { - &bt (&DWP(0,"ecx"),26); - &jnc (&label("no_sse2")); + &and ("ecx",1<<26|1<<24); # check SSE2 and FXSR bits + &cmp ("ecx",1<<26|1<<24); + &jne (&label("no_sse2")); &pxor ("xmm0","xmm0"); &pxor ("xmm1","xmm1"); &pxor ("xmm2","xmm2"); @@ -307,6 +339,18 @@ for (@ARGV) { $sse2=1 if (/-DOPENSSL_IA32_SSE2/); } &ret (); &function_end_B("OPENSSL_cleanse"); +&function_begin_B("OPENSSL_ia32_rdrand"); + &mov ("ecx",8); +&set_label("loop"); + &rdrand ("eax"); + &jc (&label("break")); + &loop (&label("loop")); +&set_label("break"); + &cmp ("eax",0); + &cmove ("eax","ecx"); + &ret (); +&function_end_B("OPENSSL_ia32_rdrand"); + &initseg("OPENSSL_cpuid_setup"); &asm_finish(); diff --git a/crypto/openssl/doc/apps/genpkey.pod b/crypto/openssl/doc/apps/genpkey.pod index 1611b5ca78..c74d097fb3 100644 --- a/crypto/openssl/doc/apps/genpkey.pod +++ b/crypto/openssl/doc/apps/genpkey.pod @@ -114,6 +114,8 @@ hexadecimal value if preceded by B<0x>. Default value is 65537. The number of bits in the generated parameters. If not specified 1024 is used. +=back + =head1 DH PARAMETER GENERATION OPTIONS =over 4 diff --git a/crypto/openssl/doc/apps/openssl.pod b/crypto/openssl/doc/apps/openssl.pod index 738142e9ff..64a160c20a 100644 --- a/crypto/openssl/doc/apps/openssl.pod +++ b/crypto/openssl/doc/apps/openssl.pod @@ -287,8 +287,6 @@ SHA Digest SHA-1 Digest -=back - =item B SHA-224 Digest @@ -305,6 +303,8 @@ SHA-384 Digest SHA-512 Digest +=back + =head2 ENCODING AND CIPHER COMMANDS =over 10 diff --git a/crypto/openssl/doc/ssl/SSL_alert_type_string.pod b/crypto/openssl/doc/ssl/SSL_alert_type_string.pod index 94e28cc307..0329c34869 100644 --- a/crypto/openssl/doc/ssl/SSL_alert_type_string.pod +++ b/crypto/openssl/doc/ssl/SSL_alert_type_string.pod @@ -214,6 +214,11 @@ satisfy a request; the process might receive security parameters difficult to communicate changes to these parameters after that point. This message is always a warning. +=item "UP"/"unknown PSK identity" + +Sent by the server to indicate that it does not recognize a PSK +identity or an SRP identity. + =item "UK"/"unknown" This indicates that no description is available for this alert type. diff --git a/crypto/openssl/e_os.h b/crypto/openssl/e_os.h index 5ceeeeb950..79c1392573 100644 --- a/crypto/openssl/e_os.h +++ b/crypto/openssl/e_os.h @@ -99,7 +99,6 @@ extern "C" { # ifndef MAC_OS_GUSI_SOURCE # define MAC_OS_pre_X # define NO_SYS_TYPES_H - typedef long ssize_t; # endif # define NO_SYS_PARAM_H # define NO_CHMOD @@ -340,8 +339,6 @@ static unsigned int _strlen31(const char *str) # define OPENSSL_NO_POSIX_IO # endif -# define ssize_t long - # if defined (__BORLANDC__) # define _setmode setmode # define _O_TEXT O_TEXT @@ -456,9 +453,6 @@ static unsigned int _strlen31(const char *str) * (unless when compiling with -D_POSIX_SOURCE, * which doesn't work for us) */ # endif -# if defined(NeXT) || defined(OPENSSL_SYS_NEWS4) || defined(OPENSSL_SYS_SUNOS) -# define ssize_t int /* ditto */ -# endif # ifdef OPENSSL_SYS_NEWS4 /* setvbuf is missing on mips-sony-bsd */ # define setvbuf(a, b, c, d) setbuffer((a), (b), (d)) typedef unsigned long clock_t; @@ -637,12 +631,6 @@ static unsigned int _strlen31(const char *str) #endif -#if defined(__ultrix) -# ifndef ssize_t -# define ssize_t int -# endif -#endif - #if defined(sun) && !defined(__svr4__) && !defined(__SVR4) /* include headers first, so our defines don't break it */ #include diff --git a/crypto/openssl/e_os2.h b/crypto/openssl/e_os2.h index d30724d304..d22c0368f8 100644 --- a/crypto/openssl/e_os2.h +++ b/crypto/openssl/e_os2.h @@ -289,6 +289,26 @@ extern "C" { # define OPENSSL_GLOBAL_REF(name) _shadow_##name #endif +#if defined(OPENSSL_SYS_MACINTOSH_CLASSIC) && macintosh==1 && !defined(MAC_OS_GUSI_SOURCE) +# define ossl_ssize_t long +#endif + +#ifdef OPENSSL_SYS_MSDOS +# define ossl_ssize_t long +#endif + +#if defined(NeXT) || defined(OPENSSL_SYS_NEWS4) || defined(OPENSSL_SYS_SUNOS) +# define ssize_t int +#endif + +#if defined(__ultrix) && !defined(ssize_t) +# define ossl_ssize_t int +#endif + +#ifndef ossl_ssize_t +# define ossl_ssize_t ssize_t +#endif + #ifdef __cplusplus } #endif diff --git a/crypto/openssl/engines/ccgost/gost_ameth.c b/crypto/openssl/engines/ccgost/gost_ameth.c index e6c2839e5f..2cde1fcfd9 100644 --- a/crypto/openssl/engines/ccgost/gost_ameth.c +++ b/crypto/openssl/engines/ccgost/gost_ameth.c @@ -13,6 +13,9 @@ #include #include #include +#ifndef OPENSSL_NO_CMS +#include +#endif #include "gost_params.h" #include "gost_lcl.h" #include "e_gost_err.h" @@ -230,6 +233,24 @@ static int pkey_ctrl_gost(EVP_PKEY *pkey, int op, X509_ALGOR_set0(alg2, OBJ_nid2obj(nid), V_ASN1_NULL, 0); } return 1; +#ifndef OPENSSL_NO_CMS + case ASN1_PKEY_CTRL_CMS_SIGN: + if (arg1 == 0) + { + X509_ALGOR *alg1 = NULL, *alg2 = NULL; + int nid = EVP_PKEY_base_id(pkey); + CMS_SignerInfo_get0_algs((CMS_SignerInfo *)arg2, + NULL, NULL, &alg1, &alg2); + X509_ALGOR_set0(alg1, OBJ_nid2obj(NID_id_GostR3411_94), + V_ASN1_NULL, 0); + if (nid == NID_undef) + { + return (-1); + } + X509_ALGOR_set0(alg2, OBJ_nid2obj(nid), V_ASN1_NULL, 0); + } + return 1; +#endif case ASN1_PKEY_CTRL_PKCS7_ENCRYPT: if (arg1 == 0) { @@ -244,6 +265,22 @@ static int pkey_ctrl_gost(EVP_PKEY *pkey, int op, V_ASN1_SEQUENCE, params); } return 1; +#ifndef OPENSSL_NO_CMS + case ASN1_PKEY_CTRL_CMS_ENVELOPE: + if (arg1 == 0) + { + X509_ALGOR *alg; + ASN1_STRING * params = encode_gost_algor_params(pkey); + if (!params) + { + return -1; + } + CMS_RecipientInfo_ktri_get0_algs((CMS_RecipientInfo *)arg2, NULL, NULL, &alg); + X509_ALGOR_set0(alg, OBJ_nid2obj(pkey->type), + V_ASN1_SEQUENCE, params); + } + return 1; +#endif case ASN1_PKEY_CTRL_DEFAULT_MD_NID: *(int *)arg2 = NID_id_GostR3411_94; return 2; diff --git a/crypto/openssl/engines/ccgost/gost_pmeth.c b/crypto/openssl/engines/ccgost/gost_pmeth.c index 4a05853e55..f91c9b1939 100644 --- a/crypto/openssl/engines/ccgost/gost_pmeth.c +++ b/crypto/openssl/engines/ccgost/gost_pmeth.c @@ -89,6 +89,12 @@ static int pkey_gost_ctrl(EVP_PKEY_CTX *ctx, int type, int p1, void *p2) case EVP_PKEY_CTRL_PKCS7_ENCRYPT: case EVP_PKEY_CTRL_PKCS7_DECRYPT: case EVP_PKEY_CTRL_PKCS7_SIGN: + case EVP_PKEY_CTRL_DIGESTINIT: +#ifndef OPENSSL_NO_CMS + case EVP_PKEY_CTRL_CMS_ENCRYPT: + case EVP_PKEY_CTRL_CMS_DECRYPT: + case EVP_PKEY_CTRL_CMS_SIGN: +#endif return 1; case EVP_PKEY_CTRL_GOST_PARAMSET: @@ -521,6 +527,7 @@ static int pkey_gost_mac_ctrl_str(EVP_PKEY_CTX *ctx, { GOSTerr(GOST_F_PKEY_GOST_MAC_CTRL_STR, GOST_R_INVALID_MAC_KEY_LENGTH); + OPENSSL_free(keybuf); return 0; } ret= pkey_gost_mac_ctrl(ctx, EVP_PKEY_CTRL_SET_MAC_KEY, diff --git a/crypto/openssl/engines/e_aep.c b/crypto/openssl/engines/e_aep.c index d7f89e5156..1953f0643c 100644 --- a/crypto/openssl/engines/e_aep.c +++ b/crypto/openssl/engines/e_aep.c @@ -85,7 +85,6 @@ extern int GetThreadID(void); #ifndef OPENSSL_NO_DH #include #endif -#include #ifndef OPENSSL_NO_HW #ifndef OPENSSL_NO_HW_AEP diff --git a/crypto/openssl/engines/e_capi.c b/crypto/openssl/engines/e_capi.c index 24b620fc07..bfedde0eb0 100644 --- a/crypto/openssl/engines/e_capi.c +++ b/crypto/openssl/engines/e_capi.c @@ -442,28 +442,36 @@ static int capi_init(ENGINE *e) CAPI_CTX *ctx; const RSA_METHOD *ossl_rsa_meth; const DSA_METHOD *ossl_dsa_meth; - capi_idx = ENGINE_get_ex_new_index(0, NULL, NULL, NULL, 0); - cert_capi_idx = X509_get_ex_new_index(0, NULL, NULL, NULL, 0); + + if (capi_idx < 0) + { + capi_idx = ENGINE_get_ex_new_index(0, NULL, NULL, NULL, 0); + if (capi_idx < 0) + goto memerr; + + cert_capi_idx = X509_get_ex_new_index(0, NULL, NULL, NULL, 0); + + /* Setup RSA_METHOD */ + rsa_capi_idx = RSA_get_ex_new_index(0, NULL, NULL, NULL, 0); + ossl_rsa_meth = RSA_PKCS1_SSLeay(); + capi_rsa_method.rsa_pub_enc = ossl_rsa_meth->rsa_pub_enc; + capi_rsa_method.rsa_pub_dec = ossl_rsa_meth->rsa_pub_dec; + capi_rsa_method.rsa_mod_exp = ossl_rsa_meth->rsa_mod_exp; + capi_rsa_method.bn_mod_exp = ossl_rsa_meth->bn_mod_exp; + + /* Setup DSA Method */ + dsa_capi_idx = DSA_get_ex_new_index(0, NULL, NULL, NULL, 0); + ossl_dsa_meth = DSA_OpenSSL(); + capi_dsa_method.dsa_do_verify = ossl_dsa_meth->dsa_do_verify; + capi_dsa_method.dsa_mod_exp = ossl_dsa_meth->dsa_mod_exp; + capi_dsa_method.bn_mod_exp = ossl_dsa_meth->bn_mod_exp; + } ctx = capi_ctx_new(); - if (!ctx || (capi_idx < 0)) + if (!ctx) goto memerr; ENGINE_set_ex_data(e, capi_idx, ctx); - /* Setup RSA_METHOD */ - rsa_capi_idx = RSA_get_ex_new_index(0, NULL, NULL, NULL, 0); - ossl_rsa_meth = RSA_PKCS1_SSLeay(); - capi_rsa_method.rsa_pub_enc = ossl_rsa_meth->rsa_pub_enc; - capi_rsa_method.rsa_pub_dec = ossl_rsa_meth->rsa_pub_dec; - capi_rsa_method.rsa_mod_exp = ossl_rsa_meth->rsa_mod_exp; - capi_rsa_method.bn_mod_exp = ossl_rsa_meth->bn_mod_exp; - - /* Setup DSA Method */ - dsa_capi_idx = DSA_get_ex_new_index(0, NULL, NULL, NULL, 0); - ossl_dsa_meth = DSA_OpenSSL(); - capi_dsa_method.dsa_do_verify = ossl_dsa_meth->dsa_do_verify; - capi_dsa_method.dsa_mod_exp = ossl_dsa_meth->dsa_mod_exp; - capi_dsa_method.bn_mod_exp = ossl_dsa_meth->bn_mod_exp; #ifdef OPENSSL_CAPIENG_DIALOG { @@ -522,6 +530,7 @@ static int bind_capi(ENGINE *e) { if (!ENGINE_set_id(e, engine_capi_id) || !ENGINE_set_name(e, engine_capi_name) + || !ENGINE_set_flags(e, ENGINE_FLAGS_NO_REGISTER_ALL) || !ENGINE_set_init_function(e, capi_init) || !ENGINE_set_finish_function(e, capi_finish) || !ENGINE_set_destroy_function(e, capi_destroy) @@ -1155,6 +1164,7 @@ static int capi_list_containers(CAPI_CTX *ctx, BIO *out) { CAPIerr(CAPI_F_CAPI_LIST_CONTAINERS, CAPI_R_ENUMCONTAINERS_ERROR); capi_addlasterror(); + CryptReleaseContext(hprov, 0); return 0; } CAPI_trace(ctx, "Got max container len %d\n", buflen); @@ -1572,6 +1582,8 @@ static int capi_ctx_set_provname(CAPI_CTX *ctx, LPSTR pname, DWORD type, int che } CryptReleaseContext(hprov, 0); } + if (ctx->cspname) + OPENSSL_free(ctx->cspname); ctx->cspname = BUF_strdup(pname); ctx->csptype = type; return 1; @@ -1581,9 +1593,12 @@ static int capi_ctx_set_provname_idx(CAPI_CTX *ctx, int idx) { LPSTR pname; DWORD type; + int res; if (capi_get_provname(ctx, &pname, &type, idx) != 1) return 0; - return capi_ctx_set_provname(ctx, pname, type, 0); + res = capi_ctx_set_provname(ctx, pname, type, 0); + OPENSSL_free(pname); + return res; } static int cert_issuer_match(STACK_OF(X509_NAME) *ca_dn, X509 *x) diff --git a/crypto/openssl/engines/e_padlock.c b/crypto/openssl/engines/e_padlock.c index 7d09419804..9f7a85a8da 100644 --- a/crypto/openssl/engines/e_padlock.c +++ b/crypto/openssl/engines/e_padlock.c @@ -104,11 +104,13 @@ # if (defined(__GNUC__) && (defined(__i386__) || defined(__i386))) || \ (defined(_MSC_VER) && defined(_M_IX86)) # define COMPILE_HW_PADLOCK -static ENGINE *ENGINE_padlock (void); # endif #endif #ifdef OPENSSL_NO_DYNAMIC_ENGINE +#ifdef COMPILE_HW_PADLOCK +static ENGINE *ENGINE_padlock (void); +#endif void ENGINE_load_padlock (void) { @@ -197,6 +199,8 @@ padlock_bind_helper(ENGINE *e) return 1; } +#ifdef OPENSSL_NO_DYNAMIC_ENGINE + /* Constructor */ static ENGINE * ENGINE_padlock(void) @@ -215,6 +219,8 @@ ENGINE_padlock(void) return eng; } +#endif + /* Check availability of the engine */ static int padlock_init(ENGINE *e) diff --git a/crypto/openssl/ssl/d1_both.c b/crypto/openssl/ssl/d1_both.c index 9f898d6997..de8bab873f 100644 --- a/crypto/openssl/ssl/d1_both.c +++ b/crypto/openssl/ssl/d1_both.c @@ -227,14 +227,14 @@ int dtls1_do_write(SSL *s, int type) unsigned int len, frag_off, mac_size, blocksize; /* AHA! Figure out the MTU, and stick to the right size */ - if ( ! (SSL_get_options(s) & SSL_OP_NO_QUERY_MTU)) + if (s->d1->mtu < dtls1_min_mtu() && !(SSL_get_options(s) & SSL_OP_NO_QUERY_MTU)) { s->d1->mtu = BIO_ctrl(SSL_get_wbio(s), BIO_CTRL_DGRAM_QUERY_MTU, 0, NULL); /* I've seen the kernel return bogus numbers when it doesn't know * (initial write), so just make sure we have a reasonable number */ - if ( s->d1->mtu < dtls1_min_mtu()) + if (s->d1->mtu < dtls1_min_mtu()) { s->d1->mtu = 0; s->d1->mtu = dtls1_guess_mtu(s->d1->mtu); @@ -1084,7 +1084,11 @@ int dtls1_read_failed(SSL *s, int code) return code; } - if ( ! SSL_in_init(s)) /* done, no need to send a retransmit */ +#ifndef OPENSSL_NO_HEARTBEATS + if (!SSL_in_init(s) && !s->tlsext_hb_pending) /* done, no need to send a retransmit */ +#else + if (!SSL_in_init(s)) /* done, no need to send a retransmit */ +#endif { BIO_set_flags(SSL_get_rbio(s), BIO_FLAGS_READ); return code; @@ -1417,3 +1421,171 @@ dtls1_get_ccs_header(unsigned char *data, struct ccs_header_st *ccs_hdr) ccs_hdr->type = *(data++); } + +int dtls1_shutdown(SSL *s) + { + int ret; +#ifndef OPENSSL_NO_SCTP + if (BIO_dgram_is_sctp(SSL_get_wbio(s)) && + !(s->shutdown & SSL_SENT_SHUTDOWN)) + { + ret = BIO_dgram_sctp_wait_for_dry(SSL_get_wbio(s)); + if (ret < 0) return -1; + + if (ret == 0) + BIO_ctrl(SSL_get_wbio(s), BIO_CTRL_DGRAM_SCTP_SAVE_SHUTDOWN, 1, NULL); + } +#endif + ret = ssl3_shutdown(s); +#ifndef OPENSSL_NO_SCTP + BIO_ctrl(SSL_get_wbio(s), BIO_CTRL_DGRAM_SCTP_SAVE_SHUTDOWN, 0, NULL); +#endif + return ret; + } + +#ifndef OPENSSL_NO_HEARTBEATS +int +dtls1_process_heartbeat(SSL *s) + { + unsigned char *p = &s->s3->rrec.data[0], *pl; + unsigned short hbtype; + unsigned int payload; + unsigned int padding = 16; /* Use minimum padding */ + + /* Read type and payload length first */ + hbtype = *p++; + n2s(p, payload); + pl = p; + + if (s->msg_callback) + s->msg_callback(0, s->version, TLS1_RT_HEARTBEAT, + &s->s3->rrec.data[0], s->s3->rrec.length, + s, s->msg_callback_arg); + + if (hbtype == TLS1_HB_REQUEST) + { + unsigned char *buffer, *bp; + int r; + + /* Allocate memory for the response, size is 1 byte + * message type, plus 2 bytes payload length, plus + * payload, plus padding + */ + buffer = OPENSSL_malloc(1 + 2 + payload + padding); + bp = buffer; + + /* Enter response type, length and copy payload */ + *bp++ = TLS1_HB_RESPONSE; + s2n(payload, bp); + memcpy(bp, pl, payload); + bp += payload; + /* Random padding */ + RAND_pseudo_bytes(bp, padding); + + r = dtls1_write_bytes(s, TLS1_RT_HEARTBEAT, buffer, 3 + payload + padding); + + if (r >= 0 && s->msg_callback) + s->msg_callback(1, s->version, TLS1_RT_HEARTBEAT, + buffer, 3 + payload + padding, + s, s->msg_callback_arg); + + OPENSSL_free(buffer); + + if (r < 0) + return r; + } + else if (hbtype == TLS1_HB_RESPONSE) + { + unsigned int seq; + + /* We only send sequence numbers (2 bytes unsigned int), + * and 16 random bytes, so we just try to read the + * sequence number */ + n2s(pl, seq); + + if (payload == 18 && seq == s->tlsext_hb_seq) + { + dtls1_stop_timer(s); + s->tlsext_hb_seq++; + s->tlsext_hb_pending = 0; + } + } + + return 0; + } + +int +dtls1_heartbeat(SSL *s) + { + unsigned char *buf, *p; + int ret; + unsigned int payload = 18; /* Sequence number + random bytes */ + unsigned int padding = 16; /* Use minimum padding */ + + /* Only send if peer supports and accepts HB requests... */ + if (!(s->tlsext_heartbeat & SSL_TLSEXT_HB_ENABLED) || + s->tlsext_heartbeat & SSL_TLSEXT_HB_DONT_SEND_REQUESTS) + { + SSLerr(SSL_F_DTLS1_HEARTBEAT,SSL_R_TLS_HEARTBEAT_PEER_DOESNT_ACCEPT); + return -1; + } + + /* ...and there is none in flight yet... */ + if (s->tlsext_hb_pending) + { + SSLerr(SSL_F_DTLS1_HEARTBEAT,SSL_R_TLS_HEARTBEAT_PENDING); + return -1; + } + + /* ...and no handshake in progress. */ + if (SSL_in_init(s) || s->in_handshake) + { + SSLerr(SSL_F_DTLS1_HEARTBEAT,SSL_R_UNEXPECTED_MESSAGE); + return -1; + } + + /* Check if padding is too long, payload and padding + * must not exceed 2^14 - 3 = 16381 bytes in total. + */ + OPENSSL_assert(payload + padding <= 16381); + + /* Create HeartBeat message, we just use a sequence number + * as payload to distuingish different messages and add + * some random stuff. + * - Message Type, 1 byte + * - Payload Length, 2 bytes (unsigned int) + * - Payload, the sequence number (2 bytes uint) + * - Payload, random bytes (16 bytes uint) + * - Padding + */ + buf = OPENSSL_malloc(1 + 2 + payload + padding); + p = buf; + /* Message Type */ + *p++ = TLS1_HB_REQUEST; + /* Payload length (18 bytes here) */ + s2n(payload, p); + /* Sequence number */ + s2n(s->tlsext_hb_seq, p); + /* 16 random bytes */ + RAND_pseudo_bytes(p, 16); + p += 16; + /* Random padding */ + RAND_pseudo_bytes(p, padding); + + ret = dtls1_write_bytes(s, TLS1_RT_HEARTBEAT, buf, 3 + payload + padding); + if (ret >= 0) + { + if (s->msg_callback) + s->msg_callback(1, s->version, TLS1_RT_HEARTBEAT, + buf, 3 + payload + padding, + s, s->msg_callback_arg); + + dtls1_start_timer(s); + s->tlsext_hb_pending = 1; + } + + OPENSSL_free(buf); + + return ret; + } +#endif diff --git a/crypto/openssl/ssl/d1_clnt.c b/crypto/openssl/ssl/d1_clnt.c index 089fa4c7f8..a6ed09c51d 100644 --- a/crypto/openssl/ssl/d1_clnt.c +++ b/crypto/openssl/ssl/d1_clnt.c @@ -150,7 +150,11 @@ int dtls1_connect(SSL *s) unsigned long Time=(unsigned long)time(NULL); void (*cb)(const SSL *ssl,int type,int val)=NULL; int ret= -1; - int new_state,state,skip=0;; + int new_state,state,skip=0; +#ifndef OPENSSL_NO_SCTP + unsigned char sctpauthkey[64]; + char labelbuffer[sizeof(DTLS1_SCTP_AUTH_LABEL)]; +#endif RAND_add(&Time,sizeof(Time),0); ERR_clear_error(); @@ -164,6 +168,27 @@ int dtls1_connect(SSL *s) s->in_handshake++; if (!SSL_in_init(s) || SSL_in_before(s)) SSL_clear(s); +#ifndef OPENSSL_NO_SCTP + /* Notify SCTP BIO socket to enter handshake + * mode and prevent stream identifier other + * than 0. Will be ignored if no SCTP is used. + */ + BIO_ctrl(SSL_get_wbio(s), BIO_CTRL_DGRAM_SCTP_SET_IN_HANDSHAKE, s->in_handshake, NULL); +#endif + +#ifndef OPENSSL_NO_HEARTBEATS + /* If we're awaiting a HeartbeatResponse, pretend we + * already got and don't await it anymore, because + * Heartbeats don't make sense during handshakes anyway. + */ + if (s->tlsext_hb_pending) + { + dtls1_stop_timer(s); + s->tlsext_hb_pending = 0; + s->tlsext_hb_seq++; + } +#endif + for (;;) { state=s->state; @@ -171,7 +196,7 @@ int dtls1_connect(SSL *s) switch(s->state) { case SSL_ST_RENEGOTIATE: - s->new_session=1; + s->renegotiate=1; s->state=SSL_ST_CONNECT; s->ctx->stats.sess_connect_renegotiate++; /* break */ @@ -226,6 +251,42 @@ int dtls1_connect(SSL *s) s->hit = 0; break; +#ifndef OPENSSL_NO_SCTP + case DTLS1_SCTP_ST_CR_READ_SOCK: + + if (BIO_dgram_sctp_msg_waiting(SSL_get_rbio(s))) + { + s->s3->in_read_app_data=2; + s->rwstate=SSL_READING; + BIO_clear_retry_flags(SSL_get_rbio(s)); + BIO_set_retry_read(SSL_get_rbio(s)); + ret = -1; + goto end; + } + + s->state=s->s3->tmp.next_state; + break; + + case DTLS1_SCTP_ST_CW_WRITE_SOCK: + /* read app data until dry event */ + + ret = BIO_dgram_sctp_wait_for_dry(SSL_get_wbio(s)); + if (ret < 0) goto end; + + if (ret == 0) + { + s->s3->in_read_app_data=2; + s->rwstate=SSL_READING; + BIO_clear_retry_flags(SSL_get_rbio(s)); + BIO_set_retry_read(SSL_get_rbio(s)); + ret = -1; + goto end; + } + + s->state=s->d1->next_state; + break; +#endif + case SSL3_ST_CW_CLNT_HELLO_A: case SSL3_ST_CW_CLNT_HELLO_B: @@ -248,9 +309,17 @@ int dtls1_connect(SSL *s) s->init_num=0; - /* turn on buffering for the next lot of output */ - if (s->bbio != s->wbio) - s->wbio=BIO_push(s->bbio,s->wbio); +#ifndef OPENSSL_NO_SCTP + /* Disable buffering for SCTP */ + if (!BIO_dgram_is_sctp(SSL_get_wbio(s))) + { +#endif + /* turn on buffering for the next lot of output */ + if (s->bbio != s->wbio) + s->wbio=BIO_push(s->bbio,s->wbio); +#ifndef OPENSSL_NO_SCTP + } +#endif break; @@ -260,9 +329,25 @@ int dtls1_connect(SSL *s) if (ret <= 0) goto end; else { - dtls1_stop_timer(s); if (s->hit) + { +#ifndef OPENSSL_NO_SCTP + /* Add new shared key for SCTP-Auth, + * will be ignored if no SCTP used. + */ + snprintf((char*) labelbuffer, sizeof(DTLS1_SCTP_AUTH_LABEL), + DTLS1_SCTP_AUTH_LABEL); + + SSL_export_keying_material(s, sctpauthkey, + sizeof(sctpauthkey), labelbuffer, + sizeof(labelbuffer), NULL, 0, 0); + + BIO_ctrl(SSL_get_wbio(s), BIO_CTRL_DGRAM_SCTP_ADD_AUTH_KEY, + sizeof(sctpauthkey), sctpauthkey); +#endif + s->state=SSL3_ST_CR_FINISHED_A; + } else s->state=DTLS1_ST_CR_HELLO_VERIFY_REQUEST_A; } @@ -354,12 +439,20 @@ int dtls1_connect(SSL *s) case SSL3_ST_CR_SRVR_DONE_B: ret=ssl3_get_server_done(s); if (ret <= 0) goto end; + dtls1_stop_timer(s); if (s->s3->tmp.cert_req) - s->state=SSL3_ST_CW_CERT_A; + s->s3->tmp.next_state=SSL3_ST_CW_CERT_A; else - s->state=SSL3_ST_CW_KEY_EXCH_A; + s->s3->tmp.next_state=SSL3_ST_CW_KEY_EXCH_A; s->init_num=0; +#ifndef OPENSSL_NO_SCTP + if (BIO_dgram_is_sctp(SSL_get_wbio(s)) && + state == SSL_ST_RENEGOTIATE) + s->state=DTLS1_SCTP_ST_CR_READ_SOCK; + else +#endif + s->state=s->s3->tmp.next_state; break; case SSL3_ST_CW_CERT_A: @@ -378,6 +471,22 @@ int dtls1_connect(SSL *s) dtls1_start_timer(s); ret=dtls1_send_client_key_exchange(s); if (ret <= 0) goto end; + +#ifndef OPENSSL_NO_SCTP + /* Add new shared key for SCTP-Auth, + * will be ignored if no SCTP used. + */ + snprintf((char*) labelbuffer, sizeof(DTLS1_SCTP_AUTH_LABEL), + DTLS1_SCTP_AUTH_LABEL); + + SSL_export_keying_material(s, sctpauthkey, + sizeof(sctpauthkey), labelbuffer, + sizeof(labelbuffer), NULL, 0, 0); + + BIO_ctrl(SSL_get_wbio(s), BIO_CTRL_DGRAM_SCTP_ADD_AUTH_KEY, + sizeof(sctpauthkey), sctpauthkey); +#endif + /* EAY EAY EAY need to check for DH fix cert * sent back */ /* For TLS, cert_req is set to 2, so a cert chain @@ -388,7 +497,15 @@ int dtls1_connect(SSL *s) } else { - s->state=SSL3_ST_CW_CHANGE_A; +#ifndef OPENSSL_NO_SCTP + if (BIO_dgram_is_sctp(SSL_get_wbio(s))) + { + s->d1->next_state=SSL3_ST_CW_CHANGE_A; + s->state=DTLS1_SCTP_ST_CW_WRITE_SOCK; + } + else +#endif + s->state=SSL3_ST_CW_CHANGE_A; s->s3->change_cipher_spec=0; } @@ -400,7 +517,15 @@ int dtls1_connect(SSL *s) dtls1_start_timer(s); ret=dtls1_send_client_verify(s); if (ret <= 0) goto end; - s->state=SSL3_ST_CW_CHANGE_A; +#ifndef OPENSSL_NO_SCTP + if (BIO_dgram_is_sctp(SSL_get_wbio(s))) + { + s->d1->next_state=SSL3_ST_CW_CHANGE_A; + s->state=DTLS1_SCTP_ST_CW_WRITE_SOCK; + } + else +#endif + s->state=SSL3_ST_CW_CHANGE_A; s->init_num=0; s->s3->change_cipher_spec=0; break; @@ -412,6 +537,14 @@ int dtls1_connect(SSL *s) ret=dtls1_send_change_cipher_spec(s, SSL3_ST_CW_CHANGE_A,SSL3_ST_CW_CHANGE_B); if (ret <= 0) goto end; + +#ifndef OPENSSL_NO_SCTP + /* Change to new shared key of SCTP-Auth, + * will be ignored if no SCTP used. + */ + BIO_ctrl(SSL_get_wbio(s), BIO_CTRL_DGRAM_SCTP_NEXT_AUTH_KEY, 0, NULL); +#endif + s->state=SSL3_ST_CW_FINISHED_A; s->init_num=0; @@ -457,9 +590,23 @@ int dtls1_connect(SSL *s) if (s->hit) { s->s3->tmp.next_state=SSL_ST_OK; +#ifndef OPENSSL_NO_SCTP + if (BIO_dgram_is_sctp(SSL_get_wbio(s))) + { + s->d1->next_state = s->s3->tmp.next_state; + s->s3->tmp.next_state=DTLS1_SCTP_ST_CW_WRITE_SOCK; + } +#endif if (s->s3->flags & SSL3_FLAGS_DELAY_CLIENT_FINISHED) { s->state=SSL_ST_OK; +#ifndef OPENSSL_NO_SCTP + if (BIO_dgram_is_sctp(SSL_get_wbio(s))) + { + s->d1->next_state = SSL_ST_OK; + s->state=DTLS1_SCTP_ST_CW_WRITE_SOCK; + } +#endif s->s3->flags|=SSL3_FLAGS_POP_BUFFER; s->s3->delay_buf_pop_ret=0; } @@ -508,6 +655,16 @@ int dtls1_connect(SSL *s) s->state=SSL3_ST_CW_CHANGE_A; else s->state=SSL_ST_OK; + +#ifndef OPENSSL_NO_SCTP + if (BIO_dgram_is_sctp(SSL_get_wbio(s)) && + state == SSL_ST_RENEGOTIATE) + { + s->d1->next_state=s->state; + s->state=DTLS1_SCTP_ST_CW_WRITE_SOCK; + } +#endif + s->init_num=0; break; @@ -515,6 +672,13 @@ int dtls1_connect(SSL *s) s->rwstate=SSL_WRITING; if (BIO_flush(s->wbio) <= 0) { + /* If the write error was fatal, stop trying */ + if (!BIO_should_retry(s->wbio)) + { + s->rwstate=SSL_NOTHING; + s->state=s->s3->tmp.next_state; + } + ret= -1; goto end; } @@ -541,6 +705,7 @@ int dtls1_connect(SSL *s) /* else do it later in ssl3_write */ s->init_num=0; + s->renegotiate=0; s->new_session=0; ssl_update_cache(s,SSL_SESS_CACHE_CLIENT); @@ -587,6 +752,15 @@ int dtls1_connect(SSL *s) } end: s->in_handshake--; + +#ifndef OPENSSL_NO_SCTP + /* Notify SCTP BIO socket to leave handshake + * mode and allow stream identifier other + * than 0. Will be ignored if no SCTP is used. + */ + BIO_ctrl(SSL_get_wbio(s), BIO_CTRL_DGRAM_SCTP_SET_IN_HANDSHAKE, s->in_handshake, NULL); +#endif + if (buf != NULL) BUF_MEM_free(buf); if (cb != NULL) diff --git a/crypto/openssl/ssl/d1_lib.c b/crypto/openssl/ssl/d1_lib.c index c3b77c889b..56f62530e5 100644 --- a/crypto/openssl/ssl/d1_lib.c +++ b/crypto/openssl/ssl/d1_lib.c @@ -82,6 +82,7 @@ SSL3_ENC_METHOD DTLSv1_enc_data={ TLS_MD_CLIENT_FINISH_CONST,TLS_MD_CLIENT_FINISH_CONST_SIZE, TLS_MD_SERVER_FINISH_CONST,TLS_MD_SERVER_FINISH_CONST_SIZE, tls1_alert_code, + tls1_export_keying_material, }; long dtls1_default_timeout(void) @@ -291,6 +292,15 @@ const SSL_CIPHER *dtls1_get_cipher(unsigned int u) void dtls1_start_timer(SSL *s) { +#ifndef OPENSSL_NO_SCTP + /* Disable timer for SCTP */ + if (BIO_dgram_is_sctp(SSL_get_wbio(s))) + { + memset(&(s->d1->next_timeout), 0, sizeof(struct timeval)); + return; + } +#endif + /* If timer is not set, initialize duration with 1 second */ if (s->d1->next_timeout.tv_sec == 0 && s->d1->next_timeout.tv_usec == 0) { @@ -381,6 +391,7 @@ void dtls1_double_timeout(SSL *s) void dtls1_stop_timer(SSL *s) { /* Reset everything */ + memset(&(s->d1->timeout), 0, sizeof(struct dtls1_timeout_st)); memset(&(s->d1->next_timeout), 0, sizeof(struct timeval)); s->d1->timeout_duration = 1; BIO_ctrl(SSL_get_rbio(s), BIO_CTRL_DGRAM_SET_NEXT_TIMEOUT, 0, &(s->d1->next_timeout)); @@ -388,10 +399,28 @@ void dtls1_stop_timer(SSL *s) dtls1_clear_record_buffer(s); } -int dtls1_handle_timeout(SSL *s) +int dtls1_check_timeout_num(SSL *s) { - DTLS1_STATE *state; + s->d1->timeout.num_alerts++; + /* Reduce MTU after 2 unsuccessful retransmissions */ + if (s->d1->timeout.num_alerts > 2) + { + s->d1->mtu = BIO_ctrl(SSL_get_wbio(s), BIO_CTRL_DGRAM_GET_FALLBACK_MTU, 0, NULL); + } + + if (s->d1->timeout.num_alerts > DTLS1_TMO_ALERT_COUNT) + { + /* fail the connection, enough alerts have been sent */ + SSLerr(SSL_F_DTLS1_HANDLE_TIMEOUT,SSL_R_READ_TIMEOUT_EXPIRED); + return -1; + } + + return 0; + } + +int dtls1_handle_timeout(SSL *s) + { /* if no timer is expired, don't do anything */ if (!dtls1_is_timer_expired(s)) { @@ -399,20 +428,23 @@ int dtls1_handle_timeout(SSL *s) } dtls1_double_timeout(s); - state = s->d1; - state->timeout.num_alerts++; - if ( state->timeout.num_alerts > DTLS1_TMO_ALERT_COUNT) - { - /* fail the connection, enough alerts have been sent */ - SSLerr(SSL_F_DTLS1_HANDLE_TIMEOUT,SSL_R_READ_TIMEOUT_EXPIRED); + + if (dtls1_check_timeout_num(s) < 0) return -1; + + s->d1->timeout.read_timeouts++; + if (s->d1->timeout.read_timeouts > DTLS1_TMO_READ_COUNT) + { + s->d1->timeout.read_timeouts = 1; } - state->timeout.read_timeouts++; - if ( state->timeout.read_timeouts > DTLS1_TMO_READ_COUNT) +#ifndef OPENSSL_NO_HEARTBEATS + if (s->tlsext_hb_pending) { - state->timeout.read_timeouts = 1; + s->tlsext_hb_pending = 0; + return dtls1_heartbeat(s); } +#endif dtls1_start_timer(s); return dtls1_retransmit_buffered_messages(s); diff --git a/crypto/openssl/ssl/d1_pkt.c b/crypto/openssl/ssl/d1_pkt.c index de30a505a6..987af60835 100644 --- a/crypto/openssl/ssl/d1_pkt.c +++ b/crypto/openssl/ssl/d1_pkt.c @@ -179,7 +179,6 @@ static int dtls1_record_needs_buffering(SSL *s, SSL3_RECORD *rr, static int dtls1_buffer_record(SSL *s, record_pqueue *q, unsigned char *priority); static int dtls1_process_record(SSL *s); -static void dtls1_clear_timeouts(SSL *s); /* copy buffered record into SSL structure */ static int @@ -232,6 +231,14 @@ dtls1_buffer_record(SSL *s, record_pqueue *queue, unsigned char *priority) item->data = rdata; +#ifndef OPENSSL_NO_SCTP + /* Store bio_dgram_sctp_rcvinfo struct */ + if (BIO_dgram_is_sctp(SSL_get_rbio(s)) && + (s->state == SSL3_ST_SR_FINISHED_A || s->state == SSL3_ST_CR_FINISHED_A)) { + BIO_ctrl(SSL_get_rbio(s), BIO_CTRL_DGRAM_SCTP_GET_RCVINFO, sizeof(rdata->recordinfo), &rdata->recordinfo); + } +#endif + /* insert should not fail, since duplicates are dropped */ if (pqueue_insert(queue->q, item) == NULL) { @@ -641,20 +648,28 @@ again: goto again; /* get another record */ } - /* Check whether this is a repeat, or aged record. - * Don't check if we're listening and this message is - * a ClientHello. They can look as if they're replayed, - * since they arrive from different connections and - * would be dropped unnecessarily. - */ - if (!(s->d1->listen && rr->type == SSL3_RT_HANDSHAKE && - *p == SSL3_MT_CLIENT_HELLO) && - !dtls1_record_replay_check(s, bitmap)) - { - rr->length = 0; - s->packet_length=0; /* dump this record */ - goto again; /* get another record */ - } +#ifndef OPENSSL_NO_SCTP + /* Only do replay check if no SCTP bio */ + if (!BIO_dgram_is_sctp(SSL_get_rbio(s))) + { +#endif + /* Check whether this is a repeat, or aged record. + * Don't check if we're listening and this message is + * a ClientHello. They can look as if they're replayed, + * since they arrive from different connections and + * would be dropped unnecessarily. + */ + if (!(s->d1->listen && rr->type == SSL3_RT_HANDSHAKE && + *p == SSL3_MT_CLIENT_HELLO) && + !dtls1_record_replay_check(s, bitmap)) + { + rr->length = 0; + s->packet_length=0; /* dump this record */ + goto again; /* get another record */ + } +#ifndef OPENSSL_NO_SCTP + } +#endif /* just read a 0 length packet */ if (rr->length == 0) goto again; @@ -682,7 +697,6 @@ again: goto again; /* get another record */ } - dtls1_clear_timeouts(s); /* done waiting */ return(1); } @@ -740,7 +754,17 @@ int dtls1_read_bytes(SSL *s, int type, unsigned char *buf, int len, int peek) /* Now s->d1->handshake_fragment_len == 0 if type == SSL3_RT_HANDSHAKE. */ +#ifndef OPENSSL_NO_SCTP + /* Continue handshake if it had to be interrupted to read + * app data with SCTP. + */ + if ((!s->in_handshake && SSL_in_init(s)) || + (BIO_dgram_is_sctp(SSL_get_rbio(s)) && + (s->state == DTLS1_SCTP_ST_SR_READ_SOCK || s->state == DTLS1_SCTP_ST_CR_READ_SOCK) && + s->s3->in_read_app_data != 2)) +#else if (!s->in_handshake && SSL_in_init(s)) +#endif { /* type == SSL3_RT_APPLICATION_DATA */ i=s->handshake_func(s); @@ -771,6 +795,15 @@ start: item = pqueue_pop(s->d1->buffered_app_data.q); if (item) { +#ifndef OPENSSL_NO_SCTP + /* Restore bio_dgram_sctp_rcvinfo struct */ + if (BIO_dgram_is_sctp(SSL_get_rbio(s))) + { + DTLS1_RECORD_DATA *rdata = (DTLS1_RECORD_DATA *) item->data; + BIO_ctrl(SSL_get_rbio(s), BIO_CTRL_DGRAM_SCTP_SET_RCVINFO, sizeof(rdata->recordinfo), &rdata->recordinfo); + } +#endif + dtls1_copy_record(s, item); OPENSSL_free(item->data); @@ -853,6 +886,31 @@ start: rr->off=0; } } + +#ifndef OPENSSL_NO_SCTP + /* We were about to renegotiate but had to read + * belated application data first, so retry. + */ + if (BIO_dgram_is_sctp(SSL_get_rbio(s)) && + rr->type == SSL3_RT_APPLICATION_DATA && + (s->state == DTLS1_SCTP_ST_SR_READ_SOCK || s->state == DTLS1_SCTP_ST_CR_READ_SOCK)) + { + s->rwstate=SSL_READING; + BIO_clear_retry_flags(SSL_get_rbio(s)); + BIO_set_retry_read(SSL_get_rbio(s)); + } + + /* We might had to delay a close_notify alert because + * of reordered app data. If there was an alert and there + * is no message to read anymore, finally set shutdown. + */ + if (BIO_dgram_is_sctp(SSL_get_rbio(s)) && + s->d1->shutdown_received && !BIO_dgram_sctp_msg_waiting(SSL_get_rbio(s))) + { + s->shutdown |= SSL_RECEIVED_SHUTDOWN; + return(0); + } +#endif return(n); } @@ -880,6 +938,19 @@ start: dest = s->d1->alert_fragment; dest_len = &s->d1->alert_fragment_len; } +#ifndef OPENSSL_NO_HEARTBEATS + else if (rr->type == TLS1_RT_HEARTBEAT) + { + dtls1_process_heartbeat(s); + + /* Exit and notify application to read again */ + rr->length = 0; + s->rwstate=SSL_READING; + BIO_clear_retry_flags(SSL_get_rbio(s)); + BIO_set_retry_read(SSL_get_rbio(s)); + return(-1); + } +#endif /* else it's a CCS message, or application data or wrong */ else if (rr->type != SSL3_RT_CHANGE_CIPHER_SPEC) { @@ -963,6 +1034,7 @@ start: !(s->s3->flags & SSL3_FLAGS_NO_RENEGOTIATE_CIPHERS) && !s->s3->renegotiate) { + s->new_session = 1; ssl3_renegotiate(s); if (ssl3_renegotiate_check(s)) { @@ -1024,6 +1096,21 @@ start: s->s3->warn_alert = alert_descr; if (alert_descr == SSL_AD_CLOSE_NOTIFY) { +#ifndef OPENSSL_NO_SCTP + /* With SCTP and streams the socket may deliver app data + * after a close_notify alert. We have to check this + * first so that nothing gets discarded. + */ + if (BIO_dgram_is_sctp(SSL_get_rbio(s)) && + BIO_dgram_sctp_msg_waiting(SSL_get_rbio(s))) + { + s->d1->shutdown_received = 1; + s->rwstate=SSL_READING; + BIO_clear_retry_flags(SSL_get_rbio(s)); + BIO_set_retry_read(SSL_get_rbio(s)); + return -1; + } +#endif s->shutdown |= SSL_RECEIVED_SHUTDOWN; return(0); } @@ -1130,6 +1217,15 @@ start: if (s->version == DTLS1_BAD_VER) s->d1->handshake_read_seq++; +#ifndef OPENSSL_NO_SCTP + /* Remember that a CCS has been received, + * so that an old key of SCTP-Auth can be + * deleted when a CCS is sent. Will be ignored + * if no SCTP is used + */ + BIO_ctrl(SSL_get_wbio(s), BIO_CTRL_DGRAM_SCTP_AUTH_CCS_RCVD, 1, NULL); +#endif + goto start; } @@ -1152,6 +1248,9 @@ start: */ if (msg_hdr.type == SSL3_MT_FINISHED) { + if (dtls1_check_timeout_num(s) < 0) + return -1; + dtls1_retransmit_buffered_messages(s); rr->length = 0; goto start; @@ -1169,6 +1268,7 @@ start: #else s->state = s->server ? SSL_ST_ACCEPT : SSL_ST_CONNECT; #endif + s->renegotiate=1; s->new_session=1; } i=s->handshake_func(s); @@ -1265,7 +1365,16 @@ dtls1_write_app_data_bytes(SSL *s, int type, const void *buf_, int len) { int i; - if (SSL_in_init(s) && !s->in_handshake) +#ifndef OPENSSL_NO_SCTP + /* Check if we have to continue an interrupted handshake + * for reading belated app data with SCTP. + */ + if ((SSL_in_init(s) && !s->in_handshake) || + (BIO_dgram_is_sctp(SSL_get_wbio(s)) && + (s->state == DTLS1_SCTP_ST_SR_READ_SOCK || s->state == DTLS1_SCTP_ST_CR_READ_SOCK))) +#else + if (SSL_in_init(s) && !s->in_handshake) +#endif { i=s->handshake_func(s); if (i < 0) return(i); @@ -1765,10 +1874,3 @@ dtls1_reset_seq_numbers(SSL *s, int rw) memset(seq, 0x00, seq_bytes); } - - -static void -dtls1_clear_timeouts(SSL *s) - { - memset(&(s->d1->timeout), 0x00, sizeof(struct dtls1_timeout_st)); - } diff --git a/crypto/openssl/ssl/d1_srtp.c b/crypto/openssl/ssl/d1_srtp.c new file mode 100644 index 0000000000..928935bd8b --- /dev/null +++ b/crypto/openssl/ssl/d1_srtp.c @@ -0,0 +1,493 @@ +/* ssl/t1_lib.c */ +/* Copyright (C) 1995-1998 Eric Young (eay@cryptsoft.com) + * All rights reserved. + * + * This package is an SSL implementation written + * by Eric Young (eay@cryptsoft.com). + * The implementation was written so as to conform with Netscapes SSL. + * + * This library is free for commercial and non-commercial use as long as + * the following conditions are aheared to. The following conditions + * apply to all code found in this distribution, be it the RC4, RSA, + * lhash, DES, etc., code; not just the SSL code. The SSL documentation + * included with this distribution is covered by the same copyright terms + * except that the holder is Tim Hudson (tjh@cryptsoft.com). + * + * Copyright remains Eric Young's, and as such any Copyright notices in + * the code are not to be removed. + * If this package is used in a product, Eric Young should be given attribution + * as the author of the parts of the library used. + * This can be in the form of a textual message at program startup or + * in documentation (online or textual) provided with the package. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 3. All advertising materials mentioning features or use of this software + * must display the following acknowledgement: + * "This product includes cryptographic software written by + * Eric Young (eay@cryptsoft.com)" + * The word 'cryptographic' can be left out if the rouines from the library + * being used are not cryptographic related :-). + * 4. If you include any Windows specific code (or a derivative thereof) from + * the apps directory (application code) you must include an acknowledgement: + * "This product includes software written by Tim Hudson (tjh@cryptsoft.com)" + * + * THIS SOFTWARE IS PROVIDED BY ERIC YOUNG ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * The licence and distribution terms for any publically available version or + * derivative of this code cannot be changed. i.e. this code cannot simply be + * copied and put under another distribution licence + * [including the GNU Public Licence.] + */ +/* ==================================================================== + * Copyright (c) 1998-2006 The OpenSSL Project. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * + * 3. All advertising materials mentioning features or use of this + * software must display the following acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit. (http://www.openssl.org/)" + * + * 4. The names "OpenSSL Toolkit" and "OpenSSL Project" must not be used to + * endorse or promote products derived from this software without + * prior written permission. For written permission, please contact + * openssl-core@openssl.org. + * + * 5. Products derived from this software may not be called "OpenSSL" + * nor may "OpenSSL" appear in their names without prior written + * permission of the OpenSSL Project. + * + * 6. Redistributions of any form whatsoever must retain the following + * acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit (http://www.openssl.org/)" + * + * THIS SOFTWARE IS PROVIDED BY THE OpenSSL PROJECT ``AS IS'' AND ANY + * EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR + * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE OpenSSL PROJECT OR + * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED + * OF THE POSSIBILITY OF SUCH DAMAGE. + * ==================================================================== + * + * This product includes cryptographic software written by Eric Young + * (eay@cryptsoft.com). This product includes software written by Tim + * Hudson (tjh@cryptsoft.com). + * + */ +/* + DTLS code by Eric Rescorla + + Copyright (C) 2006, Network Resonance, Inc. + Copyright (C) 2011, RTFM, Inc. +*/ + +#ifndef OPENSSL_NO_SRTP + +#include +#include +#include "ssl_locl.h" +#include "srtp.h" + + +static SRTP_PROTECTION_PROFILE srtp_known_profiles[]= + { + { + "SRTP_AES128_CM_SHA1_80", + SRTP_AES128_CM_SHA1_80, + }, + { + "SRTP_AES128_CM_SHA1_32", + SRTP_AES128_CM_SHA1_32, + }, +#if 0 + { + "SRTP_NULL_SHA1_80", + SRTP_NULL_SHA1_80, + }, + { + "SRTP_NULL_SHA1_32", + SRTP_NULL_SHA1_32, + }, +#endif + {0} + }; + +static int find_profile_by_name(char *profile_name, + SRTP_PROTECTION_PROFILE **pptr,unsigned len) + { + SRTP_PROTECTION_PROFILE *p; + + p=srtp_known_profiles; + while(p->name) + { + if((len == strlen(p->name)) && !strncmp(p->name,profile_name, + len)) + { + *pptr=p; + return 0; + } + + p++; + } + + return 1; + } + +static int find_profile_by_num(unsigned profile_num, + SRTP_PROTECTION_PROFILE **pptr) + { + SRTP_PROTECTION_PROFILE *p; + + p=srtp_known_profiles; + while(p->name) + { + if(p->id == profile_num) + { + *pptr=p; + return 0; + } + p++; + } + + return 1; + } + +static int ssl_ctx_make_profiles(const char *profiles_string,STACK_OF(SRTP_PROTECTION_PROFILE) **out) + { + STACK_OF(SRTP_PROTECTION_PROFILE) *profiles; + + char *col; + char *ptr=(char *)profiles_string; + + SRTP_PROTECTION_PROFILE *p; + + if(!(profiles=sk_SRTP_PROTECTION_PROFILE_new_null())) + { + SSLerr(SSL_F_SSL_CTX_MAKE_PROFILES, SSL_R_SRTP_COULD_NOT_ALLOCATE_PROFILES); + return 1; + } + + do + { + col=strchr(ptr,':'); + + if(!find_profile_by_name(ptr,&p, + col ? col-ptr : (int)strlen(ptr))) + { + sk_SRTP_PROTECTION_PROFILE_push(profiles,p); + } + else + { + SSLerr(SSL_F_SSL_CTX_MAKE_PROFILES,SSL_R_SRTP_UNKNOWN_PROTECTION_PROFILE); + return 1; + } + + if(col) ptr=col+1; + } while (col); + + *out=profiles; + + return 0; + } + +int SSL_CTX_set_tlsext_use_srtp(SSL_CTX *ctx,const char *profiles) + { + return ssl_ctx_make_profiles(profiles,&ctx->srtp_profiles); + } + +int SSL_set_tlsext_use_srtp(SSL *s,const char *profiles) + { + return ssl_ctx_make_profiles(profiles,&s->srtp_profiles); + } + + +STACK_OF(SRTP_PROTECTION_PROFILE) *SSL_get_srtp_profiles(SSL *s) + { + if(s != NULL) + { + if(s->srtp_profiles != NULL) + { + return s->srtp_profiles; + } + else if((s->ctx != NULL) && + (s->ctx->srtp_profiles != NULL)) + { + return s->ctx->srtp_profiles; + } + } + + return NULL; + } + +SRTP_PROTECTION_PROFILE *SSL_get_selected_srtp_profile(SSL *s) + { + return s->srtp_profile; + } + +/* Note: this function returns 0 length if there are no + profiles specified */ +int ssl_add_clienthello_use_srtp_ext(SSL *s, unsigned char *p, int *len, int maxlen) + { + int ct=0; + int i; + STACK_OF(SRTP_PROTECTION_PROFILE) *clnt=0; + SRTP_PROTECTION_PROFILE *prof; + + clnt=SSL_get_srtp_profiles(s); + ct=sk_SRTP_PROTECTION_PROFILE_num(clnt); /* -1 if clnt == 0 */ + + if(p) + { + if(ct==0) + { + SSLerr(SSL_F_SSL_ADD_CLIENTHELLO_USE_SRTP_EXT,SSL_R_EMPTY_SRTP_PROTECTION_PROFILE_LIST); + return 1; + } + + if((2 + ct*2 + 1) > maxlen) + { + SSLerr(SSL_F_SSL_ADD_CLIENTHELLO_USE_SRTP_EXT,SSL_R_SRTP_PROTECTION_PROFILE_LIST_TOO_LONG); + return 1; + } + + /* Add the length */ + s2n(ct * 2, p); + for(i=0;iid,p); + } + + /* Add an empty use_mki value */ + *p++ = 0; + } + + *len=2 + ct*2 + 1; + + return 0; + } + + +int ssl_parse_clienthello_use_srtp_ext(SSL *s, unsigned char *d, int len,int *al) + { + SRTP_PROTECTION_PROFILE *cprof,*sprof; + STACK_OF(SRTP_PROTECTION_PROFILE) *clnt=0,*srvr; + int ct; + int mki_len; + int i,j; + int id; + int ret; + + /* Length value + the MKI length */ + if(len < 3) + { + SSLerr(SSL_F_SSL_PARSE_CLIENTHELLO_USE_SRTP_EXT,SSL_R_BAD_SRTP_PROTECTION_PROFILE_LIST); + *al=SSL_AD_DECODE_ERROR; + return 1; + } + + /* Pull off the length of the cipher suite list */ + n2s(d, ct); + len -= 2; + + /* Check that it is even */ + if(ct%2) + { + SSLerr(SSL_F_SSL_PARSE_CLIENTHELLO_USE_SRTP_EXT,SSL_R_BAD_SRTP_PROTECTION_PROFILE_LIST); + *al=SSL_AD_DECODE_ERROR; + return 1; + } + + /* Check that lengths are consistent */ + if(len < (ct + 1)) + { + SSLerr(SSL_F_SSL_PARSE_CLIENTHELLO_USE_SRTP_EXT,SSL_R_BAD_SRTP_PROTECTION_PROFILE_LIST); + *al=SSL_AD_DECODE_ERROR; + return 1; + } + + + clnt=sk_SRTP_PROTECTION_PROFILE_new_null(); + + while(ct) + { + n2s(d,id); + ct-=2; + len-=2; + + if(!find_profile_by_num(id,&cprof)) + { + sk_SRTP_PROTECTION_PROFILE_push(clnt,cprof); + } + else + { + ; /* Ignore */ + } + } + + /* Now extract the MKI value as a sanity check, but discard it for now */ + mki_len = *d; + d++; len--; + + if (mki_len != len) + { + SSLerr(SSL_F_SSL_PARSE_CLIENTHELLO_USE_SRTP_EXT,SSL_R_BAD_SRTP_MKI_VALUE); + *al=SSL_AD_DECODE_ERROR; + return 1; + } + + srvr=SSL_get_srtp_profiles(s); + + /* Pick our most preferred profile. If no profiles have been + configured then the outer loop doesn't run + (sk_SRTP_PROTECTION_PROFILE_num() = -1) + and so we just return without doing anything */ + for(i=0;iid==sprof->id) + { + s->srtp_profile=sprof; + *al=0; + ret=0; + goto done; + } + } + } + + ret=0; + +done: + if(clnt) sk_SRTP_PROTECTION_PROFILE_free(clnt); + + return ret; + } + +int ssl_add_serverhello_use_srtp_ext(SSL *s, unsigned char *p, int *len, int maxlen) + { + if(p) + { + if(maxlen < 5) + { + SSLerr(SSL_F_SSL_ADD_SERVERHELLO_USE_SRTP_EXT,SSL_R_SRTP_PROTECTION_PROFILE_LIST_TOO_LONG); + return 1; + } + + if(s->srtp_profile==0) + { + SSLerr(SSL_F_SSL_ADD_SERVERHELLO_USE_SRTP_EXT,SSL_R_USE_SRTP_NOT_NEGOTIATED); + return 1; + } + s2n(2, p); + s2n(s->srtp_profile->id,p); + *p++ = 0; + } + *len=5; + + return 0; + } + + +int ssl_parse_serverhello_use_srtp_ext(SSL *s, unsigned char *d, int len,int *al) + { + unsigned id; + int i; + int ct; + + STACK_OF(SRTP_PROTECTION_PROFILE) *clnt; + SRTP_PROTECTION_PROFILE *prof; + + if(len!=5) + { + SSLerr(SSL_F_SSL_PARSE_SERVERHELLO_USE_SRTP_EXT,SSL_R_BAD_SRTP_PROTECTION_PROFILE_LIST); + *al=SSL_AD_DECODE_ERROR; + return 1; + } + + n2s(d, ct); + if(ct!=2) + { + SSLerr(SSL_F_SSL_PARSE_SERVERHELLO_USE_SRTP_EXT,SSL_R_BAD_SRTP_PROTECTION_PROFILE_LIST); + *al=SSL_AD_DECODE_ERROR; + return 1; + } + + n2s(d,id); + if (*d) /* Must be no MKI, since we never offer one */ + { + SSLerr(SSL_F_SSL_PARSE_SERVERHELLO_USE_SRTP_EXT,SSL_R_BAD_SRTP_MKI_VALUE); + *al=SSL_AD_ILLEGAL_PARAMETER; + return 1; + } + + clnt=SSL_get_srtp_profiles(s); + + /* Throw an error if the server gave us an unsolicited extension */ + if (clnt == NULL) + { + SSLerr(SSL_F_SSL_PARSE_SERVERHELLO_USE_SRTP_EXT,SSL_R_NO_SRTP_PROFILES); + *al=SSL_AD_DECODE_ERROR; + return 1; + } + + /* Check to see if the server gave us something we support + (and presumably offered) + */ + for(i=0;iid == id) + { + s->srtp_profile=prof; + *al=0; + return 0; + } + } + + SSLerr(SSL_F_SSL_PARSE_SERVERHELLO_USE_SRTP_EXT,SSL_R_BAD_SRTP_PROTECTION_PROFILE_LIST); + *al=SSL_AD_DECODE_ERROR; + return 1; + } + + +#endif diff --git a/crypto/openssl/ssl/d1_srvr.c b/crypto/openssl/ssl/d1_srvr.c index 149983be30..5822379d10 100644 --- a/crypto/openssl/ssl/d1_srvr.c +++ b/crypto/openssl/ssl/d1_srvr.c @@ -151,6 +151,10 @@ int dtls1_accept(SSL *s) int ret= -1; int new_state,state,skip=0; int listen; +#ifndef OPENSSL_NO_SCTP + unsigned char sctpauthkey[64]; + char labelbuffer[sizeof(DTLS1_SCTP_AUTH_LABEL)]; +#endif RAND_add(&Time,sizeof(Time),0); ERR_clear_error(); @@ -168,6 +172,13 @@ int dtls1_accept(SSL *s) if (!SSL_in_init(s) || SSL_in_before(s)) SSL_clear(s); s->d1->listen = listen; +#ifndef OPENSSL_NO_SCTP + /* Notify SCTP BIO socket to enter handshake + * mode and prevent stream identifier other + * than 0. Will be ignored if no SCTP is used. + */ + BIO_ctrl(SSL_get_wbio(s), BIO_CTRL_DGRAM_SCTP_SET_IN_HANDSHAKE, s->in_handshake, NULL); +#endif if (s->cert == NULL) { @@ -175,6 +186,19 @@ int dtls1_accept(SSL *s) return(-1); } +#ifndef OPENSSL_NO_HEARTBEATS + /* If we're awaiting a HeartbeatResponse, pretend we + * already got and don't await it anymore, because + * Heartbeats don't make sense during handshakes anyway. + */ + if (s->tlsext_hb_pending) + { + dtls1_stop_timer(s); + s->tlsext_hb_pending = 0; + s->tlsext_hb_seq++; + } +#endif + for (;;) { state=s->state; @@ -182,7 +206,7 @@ int dtls1_accept(SSL *s) switch (s->state) { case SSL_ST_RENEGOTIATE: - s->new_session=1; + s->renegotiate=1; /* s->state=SSL_ST_ACCEPT; */ case SSL_ST_BEFORE: @@ -227,8 +251,12 @@ int dtls1_accept(SSL *s) { /* Ok, we now need to push on a buffering BIO so that * the output is sent in a way that TCP likes :-) + * ...but not with SCTP :-) */ - if (!ssl_init_wbio_buffer(s,1)) { ret= -1; goto end; } +#ifndef OPENSSL_NO_SCTP + if (!BIO_dgram_is_sctp(SSL_get_wbio(s))) +#endif + if (!ssl_init_wbio_buffer(s,1)) { ret= -1; goto end; } ssl3_init_finished_mac(s); s->state=SSL3_ST_SR_CLNT_HELLO_A; @@ -313,25 +341,75 @@ int dtls1_accept(SSL *s) ssl3_init_finished_mac(s); break; +#ifndef OPENSSL_NO_SCTP + case DTLS1_SCTP_ST_SR_READ_SOCK: + + if (BIO_dgram_sctp_msg_waiting(SSL_get_rbio(s))) + { + s->s3->in_read_app_data=2; + s->rwstate=SSL_READING; + BIO_clear_retry_flags(SSL_get_rbio(s)); + BIO_set_retry_read(SSL_get_rbio(s)); + ret = -1; + goto end; + } + + s->state=SSL3_ST_SR_FINISHED_A; + break; + + case DTLS1_SCTP_ST_SW_WRITE_SOCK: + ret = BIO_dgram_sctp_wait_for_dry(SSL_get_wbio(s)); + if (ret < 0) goto end; + + if (ret == 0) + { + if (s->d1->next_state != SSL_ST_OK) + { + s->s3->in_read_app_data=2; + s->rwstate=SSL_READING; + BIO_clear_retry_flags(SSL_get_rbio(s)); + BIO_set_retry_read(SSL_get_rbio(s)); + ret = -1; + goto end; + } + } + + s->state=s->d1->next_state; + break; +#endif + case SSL3_ST_SW_SRVR_HELLO_A: case SSL3_ST_SW_SRVR_HELLO_B: - s->new_session = 2; + s->renegotiate = 2; dtls1_start_timer(s); ret=dtls1_send_server_hello(s); if (ret <= 0) goto end; -#ifndef OPENSSL_NO_TLSEXT if (s->hit) { +#ifndef OPENSSL_NO_SCTP + /* Add new shared key for SCTP-Auth, + * will be ignored if no SCTP used. + */ + snprintf((char*) labelbuffer, sizeof(DTLS1_SCTP_AUTH_LABEL), + DTLS1_SCTP_AUTH_LABEL); + + SSL_export_keying_material(s, sctpauthkey, + sizeof(sctpauthkey), labelbuffer, + sizeof(labelbuffer), NULL, 0, 0); + + BIO_ctrl(SSL_get_wbio(s), BIO_CTRL_DGRAM_SCTP_ADD_AUTH_KEY, + sizeof(sctpauthkey), sctpauthkey); +#endif +#ifndef OPENSSL_NO_TLSEXT if (s->tlsext_ticket_expected) s->state=SSL3_ST_SW_SESSION_TICKET_A; else s->state=SSL3_ST_SW_CHANGE_A; - } #else - if (s->hit) - s->state=SSL3_ST_SW_CHANGE_A; + s->state=SSL3_ST_SW_CHANGE_A; #endif + } else s->state=SSL3_ST_SW_CERT_A; s->init_num=0; @@ -441,6 +519,13 @@ int dtls1_accept(SSL *s) skip=1; s->s3->tmp.cert_request=0; s->state=SSL3_ST_SW_SRVR_DONE_A; +#ifndef OPENSSL_NO_SCTP + if (BIO_dgram_is_sctp(SSL_get_wbio(s))) + { + s->d1->next_state = SSL3_ST_SW_SRVR_DONE_A; + s->state = DTLS1_SCTP_ST_SW_WRITE_SOCK; + } +#endif } else { @@ -450,9 +535,23 @@ int dtls1_accept(SSL *s) if (ret <= 0) goto end; #ifndef NETSCAPE_HANG_BUG s->state=SSL3_ST_SW_SRVR_DONE_A; +#ifndef OPENSSL_NO_SCTP + if (BIO_dgram_is_sctp(SSL_get_wbio(s))) + { + s->d1->next_state = SSL3_ST_SW_SRVR_DONE_A; + s->state = DTLS1_SCTP_ST_SW_WRITE_SOCK; + } +#endif #else s->state=SSL3_ST_SW_FLUSH; s->s3->tmp.next_state=SSL3_ST_SR_CERT_A; +#ifndef OPENSSL_NO_SCTP + if (BIO_dgram_is_sctp(SSL_get_wbio(s))) + { + s->d1->next_state = s->s3->tmp.next_state; + s->s3->tmp.next_state=DTLS1_SCTP_ST_SW_WRITE_SOCK; + } +#endif #endif s->init_num=0; } @@ -472,6 +571,13 @@ int dtls1_accept(SSL *s) s->rwstate=SSL_WRITING; if (BIO_flush(s->wbio) <= 0) { + /* If the write error was fatal, stop trying */ + if (!BIO_should_retry(s->wbio)) + { + s->rwstate=SSL_NOTHING; + s->state=s->s3->tmp.next_state; + } + ret= -1; goto end; } @@ -485,15 +591,16 @@ int dtls1_accept(SSL *s) ret = ssl3_check_client_hello(s); if (ret <= 0) goto end; - dtls1_stop_timer(s); if (ret == 2) + { + dtls1_stop_timer(s); s->state = SSL3_ST_SR_CLNT_HELLO_C; + } else { /* could be sent for a DH cert, even if we * have not asked for it :-) */ ret=ssl3_get_client_certificate(s); if (ret <= 0) goto end; - dtls1_stop_timer(s); s->init_num=0; s->state=SSL3_ST_SR_KEY_EXCH_A; } @@ -503,7 +610,21 @@ int dtls1_accept(SSL *s) case SSL3_ST_SR_KEY_EXCH_B: ret=ssl3_get_client_key_exchange(s); if (ret <= 0) goto end; - dtls1_stop_timer(s); +#ifndef OPENSSL_NO_SCTP + /* Add new shared key for SCTP-Auth, + * will be ignored if no SCTP used. + */ + snprintf((char *) labelbuffer, sizeof(DTLS1_SCTP_AUTH_LABEL), + DTLS1_SCTP_AUTH_LABEL); + + SSL_export_keying_material(s, sctpauthkey, + sizeof(sctpauthkey), labelbuffer, + sizeof(labelbuffer), NULL, 0, 0); + + BIO_ctrl(SSL_get_wbio(s), BIO_CTRL_DGRAM_SCTP_ADD_AUTH_KEY, + sizeof(sctpauthkey), sctpauthkey); +#endif + s->state=SSL3_ST_SR_CERT_VRFY_A; s->init_num=0; @@ -540,9 +661,13 @@ int dtls1_accept(SSL *s) /* we should decide if we expected this one */ ret=ssl3_get_cert_verify(s); if (ret <= 0) goto end; - dtls1_stop_timer(s); - - s->state=SSL3_ST_SR_FINISHED_A; +#ifndef OPENSSL_NO_SCTP + if (BIO_dgram_is_sctp(SSL_get_wbio(s)) && + state == SSL_ST_RENEGOTIATE) + s->state=DTLS1_SCTP_ST_SR_READ_SOCK; + else +#endif + s->state=SSL3_ST_SR_FINISHED_A; s->init_num=0; break; @@ -594,6 +719,14 @@ int dtls1_accept(SSL *s) SSL3_ST_SW_CHANGE_A,SSL3_ST_SW_CHANGE_B); if (ret <= 0) goto end; + +#ifndef OPENSSL_NO_SCTP + /* Change to new shared key of SCTP-Auth, + * will be ignored if no SCTP used. + */ + BIO_ctrl(SSL_get_wbio(s), BIO_CTRL_DGRAM_SCTP_NEXT_AUTH_KEY, 0, NULL); +#endif + s->state=SSL3_ST_SW_FINISHED_A; s->init_num=0; @@ -618,7 +751,16 @@ int dtls1_accept(SSL *s) if (s->hit) s->s3->tmp.next_state=SSL3_ST_SR_FINISHED_A; else + { s->s3->tmp.next_state=SSL_ST_OK; +#ifndef OPENSSL_NO_SCTP + if (BIO_dgram_is_sctp(SSL_get_wbio(s))) + { + s->d1->next_state = s->s3->tmp.next_state; + s->s3->tmp.next_state=DTLS1_SCTP_ST_SW_WRITE_SOCK; + } +#endif + } s->init_num=0; break; @@ -636,11 +778,9 @@ int dtls1_accept(SSL *s) s->init_num=0; - if (s->new_session == 2) /* skipped if we just sent a HelloRequest */ + if (s->renegotiate == 2) /* skipped if we just sent a HelloRequest */ { - /* actually not necessarily a 'new' session unless - * SSL_OP_NO_SESSION_RESUMPTION_ON_RENEGOTIATION is set */ - + s->renegotiate=0; s->new_session=0; ssl_update_cache(s,SSL_SESS_CACHE_SERVER); @@ -692,6 +832,14 @@ end: /* BIO_flush(s->wbio); */ s->in_handshake--; +#ifndef OPENSSL_NO_SCTP + /* Notify SCTP BIO socket to leave handshake + * mode and prevent stream identifier other + * than 0. Will be ignored if no SCTP is used. + */ + BIO_ctrl(SSL_get_wbio(s), BIO_CTRL_DGRAM_SCTP_SET_IN_HANDSHAKE, s->in_handshake, NULL); +#endif + if (cb != NULL) cb(s,SSL_CB_ACCEPT_EXIT,ret); return(ret); @@ -1147,7 +1295,7 @@ int dtls1_send_server_key_exchange(SSL *s) if (!(s->s3->tmp.new_cipher->algorithm_auth & SSL_aNULL) && !(s->s3->tmp.new_cipher->algorithm_mkey & SSL_kPSK)) { - if ((pkey=ssl_get_sign_pkey(s,s->s3->tmp.new_cipher)) + if ((pkey=ssl_get_sign_pkey(s,s->s3->tmp.new_cipher, NULL)) == NULL) { al=SSL_AD_DECODE_ERROR; diff --git a/crypto/openssl/ssl/dtls1.h b/crypto/openssl/ssl/dtls1.h index 2900d1d8ae..5008bf6081 100644 --- a/crypto/openssl/ssl/dtls1.h +++ b/crypto/openssl/ssl/dtls1.h @@ -105,6 +105,11 @@ extern "C" { #define DTLS1_AL_HEADER_LENGTH 2 #endif +#ifndef OPENSSL_NO_SSL_INTERN + +#ifndef OPENSSL_NO_SCTP +#define DTLS1_SCTP_AUTH_LABEL "EXPORTER_DTLS_OVER_SCTP" +#endif typedef struct dtls1_bitmap_st { @@ -227,7 +232,7 @@ typedef struct dtls1_state_st struct dtls1_timeout_st timeout; - /* Indicates when the last handshake msg sent will timeout */ + /* Indicates when the last handshake msg or heartbeat sent will timeout */ struct timeval next_timeout; /* Timeout duration */ @@ -243,6 +248,13 @@ typedef struct dtls1_state_st unsigned int retransmitting; unsigned int change_cipher_spec_ok; +#ifndef OPENSSL_NO_SCTP + /* used when SSL_ST_XX_FLUSH is entered */ + int next_state; + + int shutdown_received; +#endif + } DTLS1_STATE; typedef struct dtls1_record_data_st @@ -251,8 +263,12 @@ typedef struct dtls1_record_data_st unsigned int packet_length; SSL3_BUFFER rbuf; SSL3_RECORD rrec; +#ifndef OPENSSL_NO_SCTP + struct bio_dgram_sctp_rcvinfo recordinfo; +#endif } DTLS1_RECORD_DATA; +#endif /* Timeout multipliers (timeout slice is defined in apps/timeouts.h */ #define DTLS1_TMO_READ_COUNT 2 diff --git a/crypto/openssl/ssl/kssl.c b/crypto/openssl/ssl/kssl.c index b820e37464..fd7c67bb1f 100644 --- a/crypto/openssl/ssl/kssl.c +++ b/crypto/openssl/ssl/kssl.c @@ -2194,6 +2194,22 @@ krb5_error_code kssl_build_principal_2( return ENOMEM; } +void SSL_set0_kssl_ctx(SSL *s, KSSL_CTX *kctx) + { + s->kssl_ctx = kctx; + } + +KSSL_CTX * SSL_get0_kssl_ctx(SSL *s) + { + return s->kssl_ctx; + } + +char *kssl_ctx_get0_client_princ(KSSL_CTX *kctx) + { + if (kctx) + return kctx->client_princ; + return NULL; + } #else /* !OPENSSL_NO_KRB5 */ diff --git a/crypto/openssl/ssl/kssl.h b/crypto/openssl/ssl/kssl.h index a3d20e1ccb..8242fd5eeb 100644 --- a/crypto/openssl/ssl/kssl.h +++ b/crypto/openssl/ssl/kssl.h @@ -172,6 +172,10 @@ krb5_error_code kssl_check_authent(KSSL_CTX *kssl_ctx, krb5_data *authentp, krb5_timestamp *atimep, KSSL_ERR *kssl_err); unsigned char *kssl_skip_confound(krb5_enctype enctype, unsigned char *authn); +void SSL_set0_kssl_ctx(SSL *s, KSSL_CTX *kctx); +KSSL_CTX * SSL_get0_kssl_ctx(SSL *s); +char *kssl_ctx_get0_client_princ(KSSL_CTX *kctx); + #ifdef __cplusplus } #endif diff --git a/crypto/openssl/ssl/s23_clnt.c b/crypto/openssl/ssl/s23_clnt.c index c4d8bf2eb3..b3c48232d7 100644 --- a/crypto/openssl/ssl/s23_clnt.c +++ b/crypto/openssl/ssl/s23_clnt.c @@ -129,6 +129,10 @@ static const SSL_METHOD *ssl23_get_client_method(int ver) return(SSLv3_client_method()); else if (ver == TLS1_VERSION) return(TLSv1_client_method()); + else if (ver == TLS1_1_VERSION) + return(TLSv1_1_client_method()); + else if (ver == TLS1_2_VERSION) + return(TLSv1_2_client_method()); else return(NULL); } @@ -284,7 +288,15 @@ static int ssl23_client_hello(SSL *s) if (ssl2_compat && ssl23_no_ssl2_ciphers(s)) ssl2_compat = 0; - if (!(s->options & SSL_OP_NO_TLSv1)) + if (!(s->options & SSL_OP_NO_TLSv1_2)) + { + version = TLS1_2_VERSION; + } + else if (!(s->options & SSL_OP_NO_TLSv1_1)) + { + version = TLS1_1_VERSION; + } + else if (!(s->options & SSL_OP_NO_TLSv1)) { version = TLS1_VERSION; } @@ -329,11 +341,29 @@ static int ssl23_client_hello(SSL *s) if (RAND_pseudo_bytes(p,SSL3_RANDOM_SIZE-4) <= 0) return -1; - if (version == TLS1_VERSION) + if (version == TLS1_2_VERSION) + { + version_major = TLS1_2_VERSION_MAJOR; + version_minor = TLS1_2_VERSION_MINOR; + } + else if (version == TLS1_1_VERSION) + { + version_major = TLS1_1_VERSION_MAJOR; + version_minor = TLS1_1_VERSION_MINOR; + } + else if (version == TLS1_VERSION) { version_major = TLS1_VERSION_MAJOR; version_minor = TLS1_VERSION_MINOR; } +#ifdef OPENSSL_FIPS + else if(FIPS_mode()) + { + SSLerr(SSL_F_SSL23_CLIENT_HELLO, + SSL_R_ONLY_TLS_ALLOWED_IN_FIPS_MODE); + return -1; + } +#endif else if (version == SSL3_VERSION) { version_major = SSL3_VERSION_MAJOR; @@ -608,7 +638,7 @@ static int ssl23_get_server_hello(SSL *s) #endif } else if (p[1] == SSL3_VERSION_MAJOR && - (p[2] == SSL3_VERSION_MINOR || p[2] == TLS1_VERSION_MINOR) && + p[2] <= TLS1_2_VERSION_MINOR && ((p[0] == SSL3_RT_HANDSHAKE && p[5] == SSL3_MT_SERVER_HELLO) || (p[0] == SSL3_RT_ALERT && p[3] == 0 && p[4] == 2))) { @@ -617,6 +647,14 @@ static int ssl23_get_server_hello(SSL *s) if ((p[2] == SSL3_VERSION_MINOR) && !(s->options & SSL_OP_NO_SSLv3)) { +#ifdef OPENSSL_FIPS + if(FIPS_mode()) + { + SSLerr(SSL_F_SSL23_GET_SERVER_HELLO, + SSL_R_ONLY_TLS_ALLOWED_IN_FIPS_MODE); + goto err; + } +#endif s->version=SSL3_VERSION; s->method=SSLv3_client_method(); } @@ -626,6 +664,18 @@ static int ssl23_get_server_hello(SSL *s) s->version=TLS1_VERSION; s->method=TLSv1_client_method(); } + else if ((p[2] == TLS1_1_VERSION_MINOR) && + !(s->options & SSL_OP_NO_TLSv1_1)) + { + s->version=TLS1_1_VERSION; + s->method=TLSv1_1_client_method(); + } + else if ((p[2] == TLS1_2_VERSION_MINOR) && + !(s->options & SSL_OP_NO_TLSv1_2)) + { + s->version=TLS1_2_VERSION; + s->method=TLSv1_2_client_method(); + } else { SSLerr(SSL_F_SSL23_GET_SERVER_HELLO,SSL_R_UNSUPPORTED_PROTOCOL); diff --git a/crypto/openssl/ssl/s23_meth.c b/crypto/openssl/ssl/s23_meth.c index c6099efcf7..40eae0f0be 100644 --- a/crypto/openssl/ssl/s23_meth.c +++ b/crypto/openssl/ssl/s23_meth.c @@ -76,6 +76,10 @@ static const SSL_METHOD *ssl23_get_method(int ver) #ifndef OPENSSL_NO_TLS1 if (ver == TLS1_VERSION) return(TLSv1_method()); + else if (ver == TLS1_1_VERSION) + return(TLSv1_1_method()); + else if (ver == TLS1_2_VERSION) + return(TLSv1_2_method()); else #endif return(NULL); diff --git a/crypto/openssl/ssl/s23_srvr.c b/crypto/openssl/ssl/s23_srvr.c index 836dd1f1cf..4877849013 100644 --- a/crypto/openssl/ssl/s23_srvr.c +++ b/crypto/openssl/ssl/s23_srvr.c @@ -115,6 +115,9 @@ #include #include #include +#ifdef OPENSSL_FIPS +#include +#endif static const SSL_METHOD *ssl23_get_server_method(int ver); int ssl23_get_client_hello(SSL *s); @@ -128,6 +131,10 @@ static const SSL_METHOD *ssl23_get_server_method(int ver) return(SSLv3_server_method()); else if (ver == TLS1_VERSION) return(TLSv1_server_method()); + else if (ver == TLS1_1_VERSION) + return(TLSv1_1_server_method()); + else if (ver == TLS1_2_VERSION) + return(TLSv1_2_server_method()); else return(NULL); } @@ -283,7 +290,20 @@ int ssl23_get_client_hello(SSL *s) /* SSLv3/TLSv1 */ if (p[4] >= TLS1_VERSION_MINOR) { - if (!(s->options & SSL_OP_NO_TLSv1)) + if (p[4] >= TLS1_2_VERSION_MINOR && + !(s->options & SSL_OP_NO_TLSv1_2)) + { + s->version=TLS1_2_VERSION; + s->state=SSL23_ST_SR_CLNT_HELLO_B; + } + else if (p[4] >= TLS1_1_VERSION_MINOR && + !(s->options & SSL_OP_NO_TLSv1_1)) + { + s->version=TLS1_1_VERSION; + /* type=2; */ /* done later to survive restarts */ + s->state=SSL23_ST_SR_CLNT_HELLO_B; + } + else if (!(s->options & SSL_OP_NO_TLSv1)) { s->version=TLS1_VERSION; /* type=2; */ /* done later to survive restarts */ @@ -350,7 +370,19 @@ int ssl23_get_client_hello(SSL *s) v[1]=p[10]; /* minor version according to client_version */ if (v[1] >= TLS1_VERSION_MINOR) { - if (!(s->options & SSL_OP_NO_TLSv1)) + if (v[1] >= TLS1_2_VERSION_MINOR && + !(s->options & SSL_OP_NO_TLSv1_2)) + { + s->version=TLS1_2_VERSION; + type=3; + } + else if (v[1] >= TLS1_1_VERSION_MINOR && + !(s->options & SSL_OP_NO_TLSv1_1)) + { + s->version=TLS1_1_VERSION; + type=3; + } + else if (!(s->options & SSL_OP_NO_TLSv1)) { s->version=TLS1_VERSION; type=3; @@ -393,6 +425,15 @@ int ssl23_get_client_hello(SSL *s) } } +#ifdef OPENSSL_FIPS + if (FIPS_mode() && (s->version < TLS1_VERSION)) + { + SSLerr(SSL_F_SSL23_GET_CLIENT_HELLO, + SSL_R_ONLY_TLS_ALLOWED_IN_FIPS_MODE); + goto err; + } +#endif + if (s->state == SSL23_ST_SR_CLNT_HELLO_B) { /* we have SSLv3/TLSv1 in an SSLv2 header @@ -567,8 +608,11 @@ int ssl23_get_client_hello(SSL *s) s->s3->rbuf.left=0; s->s3->rbuf.offset=0; } - - if (s->version == TLS1_VERSION) + if (s->version == TLS1_2_VERSION) + s->method = TLSv1_2_server_method(); + else if (s->version == TLS1_1_VERSION) + s->method = TLSv1_1_server_method(); + else if (s->version == TLS1_VERSION) s->method = TLSv1_server_method(); else s->method = SSLv3_server_method(); diff --git a/crypto/openssl/ssl/s3_both.c b/crypto/openssl/ssl/s3_both.c index a6d869df59..b63460a56d 100644 --- a/crypto/openssl/ssl/s3_both.c +++ b/crypto/openssl/ssl/s3_both.c @@ -202,15 +202,38 @@ int ssl3_send_finished(SSL *s, int a, int b, const char *sender, int slen) return(ssl3_do_write(s,SSL3_RT_HANDSHAKE)); } +#ifndef OPENSSL_NO_NEXTPROTONEG +/* ssl3_take_mac calculates the Finished MAC for the handshakes messages seen to far. */ +static void ssl3_take_mac(SSL *s) { + const char *sender; + int slen; + + if (s->state & SSL_ST_CONNECT) + { + sender=s->method->ssl3_enc->server_finished_label; + slen=s->method->ssl3_enc->server_finished_label_len; + } + else + { + sender=s->method->ssl3_enc->client_finished_label; + slen=s->method->ssl3_enc->client_finished_label_len; + } + + s->s3->tmp.peer_finish_md_len = s->method->ssl3_enc->final_finish_mac(s, + sender,slen,s->s3->tmp.peer_finish_md); +} +#endif + int ssl3_get_finished(SSL *s, int a, int b) { int al,i,ok; long n; unsigned char *p; - /* the mac has already been generated when we received the - * change cipher spec message and is in s->s3->tmp.peer_finish_md - */ +#ifdef OPENSSL_NO_NEXTPROTONEG + /* the mac has already been generated when we received the change + * cipher spec message and is in s->s3->tmp.peer_finish_md. */ +#endif n=s->method->ssl_get_message(s, a, @@ -514,6 +537,13 @@ long ssl3_get_message(SSL *s, int st1, int stn, int mt, long max, int *ok) s->init_num += i; n -= i; } +#ifndef OPENSSL_NO_NEXTPROTONEG + /* If receiving Finished, record MAC of prior handshake messages for + * Finished verification. */ + if (*s->init_buf->data == SSL3_MT_FINISHED) + ssl3_take_mac(s); +#endif + /* Feed this message into MAC computation. */ ssl3_finish_mac(s, (unsigned char *)s->init_buf->data, s->init_num + 4); if (s->msg_callback) s->msg_callback(0, s->version, SSL3_RT_HANDSHAKE, s->init_buf->data, (size_t)s->init_num + 4, s, s->msg_callback_arg); diff --git a/crypto/openssl/ssl/s3_clnt.c b/crypto/openssl/ssl/s3_clnt.c index 53223bd38d..4511a914a4 100644 --- a/crypto/openssl/ssl/s3_clnt.c +++ b/crypto/openssl/ssl/s3_clnt.c @@ -156,6 +156,9 @@ #include #include #include +#ifdef OPENSSL_FIPS +#include +#endif #ifndef OPENSSL_NO_DH #include #endif @@ -200,6 +203,18 @@ int ssl3_connect(SSL *s) s->in_handshake++; if (!SSL_in_init(s) || SSL_in_before(s)) SSL_clear(s); +#ifndef OPENSSL_NO_HEARTBEATS + /* If we're awaiting a HeartbeatResponse, pretend we + * already got and don't await it anymore, because + * Heartbeats don't make sense during handshakes anyway. + */ + if (s->tlsext_hb_pending) + { + s->tlsext_hb_pending = 0; + s->tlsext_hb_seq++; + } +#endif + for (;;) { state=s->state; @@ -207,7 +222,7 @@ int ssl3_connect(SSL *s) switch(s->state) { case SSL_ST_RENEGOTIATE: - s->new_session=1; + s->renegotiate=1; s->state=SSL_ST_CONNECT; s->ctx->stats.sess_connect_renegotiate++; /* break */ @@ -280,7 +295,16 @@ int ssl3_connect(SSL *s) if (ret <= 0) goto end; if (s->hit) + { s->state=SSL3_ST_CR_FINISHED_A; +#ifndef OPENSSL_NO_TLSEXT + if (s->tlsext_ticket_expected) + { + /* receive renewed session ticket */ + s->state=SSL3_ST_CR_SESSION_TICKET_A; + } +#endif + } else s->state=SSL3_ST_CR_CERT_A; s->init_num=0; @@ -358,6 +382,17 @@ int ssl3_connect(SSL *s) case SSL3_ST_CR_SRVR_DONE_B: ret=ssl3_get_server_done(s); if (ret <= 0) goto end; +#ifndef OPENSSL_NO_SRP + if (s->s3->tmp.new_cipher->algorithm_mkey & SSL_kSRP) + { + if ((ret = SRP_Calc_A_param(s))<=0) + { + SSLerr(SSL_F_SSL3_CONNECT,SSL_R_SRP_A_CALC); + ssl3_send_alert(s,SSL3_AL_FATAL,SSL_AD_INTERNAL_ERROR); + goto end; + } + } +#endif if (s->s3->tmp.cert_req) s->state=SSL3_ST_CW_CERT_A; else @@ -423,7 +458,16 @@ int ssl3_connect(SSL *s) ret=ssl3_send_change_cipher_spec(s, SSL3_ST_CW_CHANGE_A,SSL3_ST_CW_CHANGE_B); if (ret <= 0) goto end; + + +#if defined(OPENSSL_NO_TLSEXT) || defined(OPENSSL_NO_NEXTPROTONEG) s->state=SSL3_ST_CW_FINISHED_A; +#else + if (s->s3->next_proto_neg_seen) + s->state=SSL3_ST_CW_NEXT_PROTO_A; + else + s->state=SSL3_ST_CW_FINISHED_A; +#endif s->init_num=0; s->session->cipher=s->s3->tmp.new_cipher; @@ -451,6 +495,15 @@ int ssl3_connect(SSL *s) break; +#if !defined(OPENSSL_NO_TLSEXT) && !defined(OPENSSL_NO_NEXTPROTONEG) + case SSL3_ST_CW_NEXT_PROTO_A: + case SSL3_ST_CW_NEXT_PROTO_B: + ret=ssl3_send_next_proto(s); + if (ret <= 0) goto end; + s->state=SSL3_ST_CW_FINISHED_A; + break; +#endif + case SSL3_ST_CW_FINISHED_A: case SSL3_ST_CW_FINISHED_B: ret=ssl3_send_finished(s, @@ -546,6 +599,7 @@ int ssl3_connect(SSL *s) /* else do it later in ssl3_write */ s->init_num=0; + s->renegotiate=0; s->new_session=0; ssl_update_cache(s,SSL_SESS_CACHE_CLIENT); @@ -635,9 +689,43 @@ int ssl3_client_hello(SSL *s) /* Do the message type and length last */ d=p= &(buf[4]); + /* version indicates the negotiated version: for example from + * an SSLv2/v3 compatible client hello). The client_version + * field is the maximum version we permit and it is also + * used in RSA encrypted premaster secrets. Some servers can + * choke if we initially report a higher version then + * renegotiate to a lower one in the premaster secret. This + * didn't happen with TLS 1.0 as most servers supported it + * but it can with TLS 1.1 or later if the server only supports + * 1.0. + * + * Possible scenario with previous logic: + * 1. Client hello indicates TLS 1.2 + * 2. Server hello says TLS 1.0 + * 3. RSA encrypted premaster secret uses 1.2. + * 4. Handhaked proceeds using TLS 1.0. + * 5. Server sends hello request to renegotiate. + * 6. Client hello indicates TLS v1.0 as we now + * know that is maximum server supports. + * 7. Server chokes on RSA encrypted premaster secret + * containing version 1.0. + * + * For interoperability it should be OK to always use the + * maximum version we support in client hello and then rely + * on the checking of version to ensure the servers isn't + * being inconsistent: for example initially negotiating with + * TLS 1.0 and renegotiating with TLS 1.2. We do this by using + * client_version in client hello and not resetting it to + * the negotiated version. + */ +#if 0 *(p++)=s->version>>8; *(p++)=s->version&0xff; s->client_version=s->version; +#else + *(p++)=s->client_version>>8; + *(p++)=s->client_version&0xff; +#endif /* Random stuff */ memcpy(p,s->s3->client_random,SSL3_RANDOM_SIZE); @@ -847,6 +935,14 @@ int ssl3_get_server_hello(SSL *s) SSLerr(SSL_F_SSL3_GET_SERVER_HELLO,SSL_R_UNKNOWN_CIPHER_RETURNED); goto f_err; } + /* TLS v1.2 only ciphersuites require v1.2 or later */ + if ((c->algorithm_ssl & SSL_TLSV1_2) && + (TLS1_get_version(s) < TLS1_2_VERSION)) + { + al=SSL_AD_ILLEGAL_PARAMETER; + SSLerr(SSL_F_SSL3_GET_SERVER_HELLO,SSL_R_WRONG_CIPHER_RETURNED); + goto f_err; + } p+=ssl_put_cipher_by_char(s,NULL,NULL); sk=ssl_get_ciphers_by_id(s); @@ -878,9 +974,11 @@ int ssl3_get_server_hello(SSL *s) } } s->s3->tmp.new_cipher=c; - if (!ssl3_digest_cached_records(s)) + /* Don't digest cached records if TLS v1.2: we may need them for + * client authentication. + */ + if (TLS1_get_version(s) < TLS1_2_VERSION && !ssl3_digest_cached_records(s)) goto f_err; - /* lets get the compression algorithm */ /* COMPRESSION */ #ifdef OPENSSL_NO_COMP @@ -1159,6 +1257,7 @@ int ssl3_get_key_exchange(SSL *s) int al,i,j,param_len,ok; long n,alg_k,alg_a; EVP_PKEY *pkey=NULL; + const EVP_MD *md = NULL; #ifndef OPENSSL_NO_RSA RSA *rsa=NULL; #endif @@ -1282,6 +1381,86 @@ int ssl3_get_key_exchange(SSL *s) } else #endif /* !OPENSSL_NO_PSK */ +#ifndef OPENSSL_NO_SRP + if (alg_k & SSL_kSRP) + { + n2s(p,i); + param_len=i+2; + if (param_len > n) + { + al=SSL_AD_DECODE_ERROR; + SSLerr(SSL_F_SSL3_GET_KEY_EXCHANGE,SSL_R_BAD_SRP_N_LENGTH); + goto f_err; + } + if (!(s->srp_ctx.N=BN_bin2bn(p,i,NULL))) + { + SSLerr(SSL_F_SSL3_GET_KEY_EXCHANGE,ERR_R_BN_LIB); + goto err; + } + p+=i; + + n2s(p,i); + param_len+=i+2; + if (param_len > n) + { + al=SSL_AD_DECODE_ERROR; + SSLerr(SSL_F_SSL3_GET_KEY_EXCHANGE,SSL_R_BAD_SRP_G_LENGTH); + goto f_err; + } + if (!(s->srp_ctx.g=BN_bin2bn(p,i,NULL))) + { + SSLerr(SSL_F_SSL3_GET_KEY_EXCHANGE,ERR_R_BN_LIB); + goto err; + } + p+=i; + + i = (unsigned int)(p[0]); + p++; + param_len+=i+1; + if (param_len > n) + { + al=SSL_AD_DECODE_ERROR; + SSLerr(SSL_F_SSL3_GET_KEY_EXCHANGE,SSL_R_BAD_SRP_S_LENGTH); + goto f_err; + } + if (!(s->srp_ctx.s=BN_bin2bn(p,i,NULL))) + { + SSLerr(SSL_F_SSL3_GET_KEY_EXCHANGE,ERR_R_BN_LIB); + goto err; + } + p+=i; + + n2s(p,i); + param_len+=i+2; + if (param_len > n) + { + al=SSL_AD_DECODE_ERROR; + SSLerr(SSL_F_SSL3_GET_KEY_EXCHANGE,SSL_R_BAD_SRP_B_LENGTH); + goto f_err; + } + if (!(s->srp_ctx.B=BN_bin2bn(p,i,NULL))) + { + SSLerr(SSL_F_SSL3_GET_KEY_EXCHANGE,ERR_R_BN_LIB); + goto err; + } + p+=i; + n-=param_len; + +/* We must check if there is a certificate */ +#ifndef OPENSSL_NO_RSA + if (alg_a & SSL_aRSA) + pkey=X509_get_pubkey(s->session->sess_cert->peer_pkeys[SSL_PKEY_RSA_ENC].x509); +#else + if (0) + ; +#endif +#ifndef OPENSSL_NO_DSA + else if (alg_a & SSL_aDSS) + pkey=X509_get_pubkey(s->session->sess_cert->peer_pkeys[SSL_PKEY_DSA_SIGN].x509); +#endif + } + else +#endif /* !OPENSSL_NO_SRP */ #ifndef OPENSSL_NO_RSA if (alg_k & SSL_kRSA) { @@ -1529,6 +1708,38 @@ int ssl3_get_key_exchange(SSL *s) /* if it was signed, check the signature */ if (pkey != NULL) { + if (TLS1_get_version(s) >= TLS1_2_VERSION) + { + int sigalg = tls12_get_sigid(pkey); + /* Should never happen */ + if (sigalg == -1) + { + SSLerr(SSL_F_SSL3_GET_KEY_EXCHANGE,ERR_R_INTERNAL_ERROR); + goto err; + } + /* Check key type is consistent with signature */ + if (sigalg != (int)p[1]) + { + SSLerr(SSL_F_SSL3_GET_KEY_EXCHANGE,SSL_R_WRONG_SIGNATURE_TYPE); + al=SSL_AD_DECODE_ERROR; + goto f_err; + } + md = tls12_get_hash(p[0]); + if (md == NULL) + { + SSLerr(SSL_F_SSL3_GET_KEY_EXCHANGE,SSL_R_UNKNOWN_DIGEST); + al=SSL_AD_DECODE_ERROR; + goto f_err; + } +#ifdef SSL_DEBUG +fprintf(stderr, "USING TLSv1.2 HASH %s\n", EVP_MD_name(md)); +#endif + p += 2; + n -= 2; + } + else + md = EVP_sha1(); + n2s(p,i); n-=2; j=EVP_PKEY_size(pkey); @@ -1542,7 +1753,7 @@ int ssl3_get_key_exchange(SSL *s) } #ifndef OPENSSL_NO_RSA - if (pkey->type == EVP_PKEY_RSA) + if (pkey->type == EVP_PKEY_RSA && TLS1_get_version(s) < TLS1_2_VERSION) { int num; @@ -1550,6 +1761,8 @@ int ssl3_get_key_exchange(SSL *s) q=md_buf; for (num=2; num > 0; num--) { + EVP_MD_CTX_set_flags(&md_ctx, + EVP_MD_CTX_FLAG_NON_FIPS_ALLOW); EVP_DigestInit_ex(&md_ctx,(num == 2) ?s->ctx->md5:s->ctx->sha1, NULL); EVP_DigestUpdate(&md_ctx,&(s->s3->client_random[0]),SSL3_RANDOM_SIZE); @@ -1577,29 +1790,8 @@ int ssl3_get_key_exchange(SSL *s) } else #endif -#ifndef OPENSSL_NO_DSA - if (pkey->type == EVP_PKEY_DSA) - { - /* lets do DSS */ - EVP_VerifyInit_ex(&md_ctx,EVP_dss1(), NULL); - EVP_VerifyUpdate(&md_ctx,&(s->s3->client_random[0]),SSL3_RANDOM_SIZE); - EVP_VerifyUpdate(&md_ctx,&(s->s3->server_random[0]),SSL3_RANDOM_SIZE); - EVP_VerifyUpdate(&md_ctx,param,param_len); - if (EVP_VerifyFinal(&md_ctx,p,(int)n,pkey) <= 0) - { - /* bad signature */ - al=SSL_AD_DECRYPT_ERROR; - SSLerr(SSL_F_SSL3_GET_KEY_EXCHANGE,SSL_R_BAD_SIGNATURE); - goto f_err; - } - } - else -#endif -#ifndef OPENSSL_NO_ECDSA - if (pkey->type == EVP_PKEY_EC) { - /* let's do ECDSA */ - EVP_VerifyInit_ex(&md_ctx,EVP_ecdsa(), NULL); + EVP_VerifyInit_ex(&md_ctx, md, NULL); EVP_VerifyUpdate(&md_ctx,&(s->s3->client_random[0]),SSL3_RANDOM_SIZE); EVP_VerifyUpdate(&md_ctx,&(s->s3->server_random[0]),SSL3_RANDOM_SIZE); EVP_VerifyUpdate(&md_ctx,param,param_len); @@ -1611,12 +1803,6 @@ int ssl3_get_key_exchange(SSL *s) goto f_err; } } - else -#endif - { - SSLerr(SSL_F_SSL3_GET_KEY_EXCHANGE,ERR_R_INTERNAL_ERROR); - goto err; - } } else { @@ -1663,7 +1849,7 @@ int ssl3_get_certificate_request(SSL *s) { int ok,ret=0; unsigned long n,nc,l; - unsigned int llen,ctype_num,i; + unsigned int llen, ctype_num,i; X509_NAME *xn=NULL; const unsigned char *p,*q; unsigned char *d; @@ -1683,6 +1869,14 @@ int ssl3_get_certificate_request(SSL *s) if (s->s3->tmp.message_type == SSL3_MT_SERVER_DONE) { s->s3->tmp.reuse_message=1; + /* If we get here we don't need any cached handshake records + * as we wont be doing client auth. + */ + if (s->s3->handshake_buffer) + { + if (!ssl3_digest_cached_records(s)) + goto err; + } return(1); } @@ -1719,6 +1913,26 @@ int ssl3_get_certificate_request(SSL *s) for (i=0; is3->tmp.ctype[i]= p[i]; p+=ctype_num; + if (TLS1_get_version(s) >= TLS1_2_VERSION) + { + n2s(p, llen); + /* Check we have enough room for signature algorithms and + * following length value. + */ + if ((unsigned long)(p - d + llen + 2) > n) + { + ssl3_send_alert(s,SSL3_AL_FATAL,SSL_AD_DECODE_ERROR); + SSLerr(SSL_F_SSL3_GET_CERTIFICATE_REQUEST,SSL_R_DATA_LENGTH_TOO_LONG); + goto err; + } + if ((llen & 1) || !tls1_process_sigalgs(s, p, llen)) + { + ssl3_send_alert(s,SSL3_AL_FATAL,SSL_AD_DECODE_ERROR); + SSLerr(SSL_F_SSL3_GET_CERTIFICATE_REQUEST,SSL_R_SIGNATURE_ALGORITHMS_ERROR); + goto err; + } + p += llen; + } /* get the CA RDNs */ n2s(p,llen); @@ -1731,7 +1945,7 @@ fclose(out); } #endif - if ((llen+ctype_num+2+1) != n) + if ((unsigned long)(p - d + llen) != n) { ssl3_send_alert(s,SSL3_AL_FATAL,SSL_AD_DECODE_ERROR); SSLerr(SSL_F_SSL3_GET_CERTIFICATE_REQUEST,SSL_R_LENGTH_MISMATCH); @@ -2553,6 +2767,39 @@ int ssl3_send_client_key_exchange(SSL *s) EVP_PKEY_free(pub_key); } +#ifndef OPENSSL_NO_SRP + else if (alg_k & SSL_kSRP) + { + if (s->srp_ctx.A != NULL) + { + /* send off the data */ + n=BN_num_bytes(s->srp_ctx.A); + s2n(n,p); + BN_bn2bin(s->srp_ctx.A,p); + n+=2; + } + else + { + SSLerr(SSL_F_SSL3_SEND_CLIENT_KEY_EXCHANGE,ERR_R_INTERNAL_ERROR); + goto err; + } + if (s->session->srp_username != NULL) + OPENSSL_free(s->session->srp_username); + s->session->srp_username = BUF_strdup(s->srp_ctx.login); + if (s->session->srp_username == NULL) + { + SSLerr(SSL_F_SSL3_SEND_CLIENT_KEY_EXCHANGE, + ERR_R_MALLOC_FAILURE); + goto err; + } + + if ((s->session->master_key_length = SRP_generate_client_master_secret(s,s->session->master_key))<0) + { + SSLerr(SSL_F_SSL3_SEND_CLIENT_KEY_EXCHANGE,ERR_R_INTERNAL_ERROR); + goto err; + } + } +#endif #ifndef OPENSSL_NO_PSK else if (alg_k & SSL_kPSK) { @@ -2672,12 +2919,13 @@ int ssl3_send_client_verify(SSL *s) unsigned char data[MD5_DIGEST_LENGTH+SHA_DIGEST_LENGTH]; EVP_PKEY *pkey; EVP_PKEY_CTX *pctx=NULL; -#ifndef OPENSSL_NO_RSA + EVP_MD_CTX mctx; unsigned u=0; -#endif unsigned long n; int j; + EVP_MD_CTX_init(&mctx); + if (s->state == SSL3_ST_CW_CERT_VRFY_A) { d=(unsigned char *)s->init_buf->data; @@ -2688,7 +2936,8 @@ int ssl3_send_client_verify(SSL *s) EVP_PKEY_sign_init(pctx); if (EVP_PKEY_CTX_set_signature_md(pctx, EVP_sha1())>0) { - s->method->ssl3_enc->cert_verify_mac(s, + if (TLS1_get_version(s) < TLS1_2_VERSION) + s->method->ssl3_enc->cert_verify_mac(s, NID_sha1, &(data[MD5_DIGEST_LENGTH])); } @@ -2696,6 +2945,41 @@ int ssl3_send_client_verify(SSL *s) { ERR_clear_error(); } + /* For TLS v1.2 send signature algorithm and signature + * using agreed digest and cached handshake records. + */ + if (TLS1_get_version(s) >= TLS1_2_VERSION) + { + long hdatalen = 0; + void *hdata; + const EVP_MD *md = s->cert->key->digest; + hdatalen = BIO_get_mem_data(s->s3->handshake_buffer, + &hdata); + if (hdatalen <= 0 || !tls12_get_sigandhash(p, pkey, md)) + { + SSLerr(SSL_F_SSL3_SEND_CLIENT_VERIFY, + ERR_R_INTERNAL_ERROR); + goto err; + } + p += 2; +#ifdef SSL_DEBUG + fprintf(stderr, "Using TLS 1.2 with client alg %s\n", + EVP_MD_name(md)); +#endif + if (!EVP_SignInit_ex(&mctx, md, NULL) + || !EVP_SignUpdate(&mctx, hdata, hdatalen) + || !EVP_SignFinal(&mctx, p + 2, &u, pkey)) + { + SSLerr(SSL_F_SSL3_SEND_CLIENT_VERIFY, + ERR_R_EVP_LIB); + goto err; + } + s2n(u,p); + n = u + 4; + if (!ssl3_digest_cached_records(s)) + goto err; + } + else #ifndef OPENSSL_NO_RSA if (pkey->type == EVP_PKEY_RSA) { @@ -2778,9 +3062,11 @@ int ssl3_send_client_verify(SSL *s) s->init_num=(int)n+4; s->init_off=0; } + EVP_MD_CTX_cleanup(&mctx); EVP_PKEY_CTX_free(pctx); return(ssl3_do_write(s,SSL3_RT_HANDSHAKE)); err: + EVP_MD_CTX_cleanup(&mctx); EVP_PKEY_CTX_free(pctx); return(-1); } @@ -2904,7 +3190,7 @@ int ssl3_check_cert_and_algorithm(SSL *s) if (idx == SSL_PKEY_ECC) { if (ssl_check_srvr_ecc_cert_and_alg(sc->peer_pkeys[idx].x509, - s->s3->tmp.new_cipher) == 0) + s) == 0) { /* check failed */ SSLerr(SSL_F_SSL3_CHECK_CERT_AND_ALGORITHM,SSL_R_BAD_ECC_CERT); goto f_err; @@ -3000,6 +3286,32 @@ err: return(0); } +#if !defined(OPENSSL_NO_TLSEXT) && !defined(OPENSSL_NO_NEXTPROTONEG) +int ssl3_send_next_proto(SSL *s) + { + unsigned int len, padding_len; + unsigned char *d; + + if (s->state == SSL3_ST_CW_NEXT_PROTO_A) + { + len = s->next_proto_negotiated_len; + padding_len = 32 - ((len + 2) % 32); + d = (unsigned char *)s->init_buf->data; + d[4] = len; + memcpy(d + 5, s->next_proto_negotiated, len); + d[5 + len] = padding_len; + memset(d + 6 + len, 0, padding_len); + *(d++)=SSL3_MT_NEXT_PROTO; + l2n3(2 + len + padding_len, d); + s->state = SSL3_ST_CW_NEXT_PROTO_B; + s->init_num = 4 + 2 + len + padding_len; + s->init_off = 0; + } + + return ssl3_do_write(s, SSL3_RT_HANDSHAKE); +} +#endif /* !OPENSSL_NO_TLSEXT && !OPENSSL_NO_NEXTPROTONEG */ + /* Check to see if handshake is full or resumed. Usually this is just a * case of checking to see if a cache hit has occurred. In the case of * session tickets we have to check the next message to be sure. diff --git a/crypto/openssl/ssl/s3_enc.c b/crypto/openssl/ssl/s3_enc.c index b14597076d..c5df2cb90a 100644 --- a/crypto/openssl/ssl/s3_enc.c +++ b/crypto/openssl/ssl/s3_enc.c @@ -170,6 +170,7 @@ static int ssl3_generate_key_block(SSL *s, unsigned char *km, int num) #endif k=0; EVP_MD_CTX_init(&m5); + EVP_MD_CTX_set_flags(&m5, EVP_MD_CTX_FLAG_NON_FIPS_ALLOW); EVP_MD_CTX_init(&s1); for (i=0; (int)is3->handshake_dgst); s->s3->handshake_dgst=NULL; } - + void ssl3_finish_mac(SSL *s, const unsigned char *buf, int len) { - if (s->s3->handshake_buffer) + if (s->s3->handshake_buffer && !(s->s3->flags & TLS1_FLAGS_KEEP_HANDSHAKE)) { BIO_write (s->s3->handshake_buffer,(void *)buf,len); } @@ -613,9 +614,16 @@ int ssl3_digest_cached_records(SSL *s) /* Loop through bitso of algorithm2 field and create MD_CTX-es */ for (i=0;ssl_get_handshake_digest(i,&mask,&md); i++) { - if ((mask & s->s3->tmp.new_cipher->algorithm2) && md) + if ((mask & ssl_get_algorithm2(s)) && md) { s->s3->handshake_dgst[i]=EVP_MD_CTX_create(); +#ifdef OPENSSL_FIPS + if (EVP_MD_nid(md) == NID_md5) + { + EVP_MD_CTX_set_flags(s->s3->handshake_dgst[i], + EVP_MD_CTX_FLAG_NON_FIPS_ALLOW); + } +#endif EVP_DigestInit_ex(s->s3->handshake_dgst[i],md,NULL); EVP_DigestUpdate(s->s3->handshake_dgst[i],hdata,hdatalen); } @@ -624,9 +632,12 @@ int ssl3_digest_cached_records(SSL *s) s->s3->handshake_dgst[i]=NULL; } } - /* Free handshake_buffer BIO */ - BIO_free(s->s3->handshake_buffer); - s->s3->handshake_buffer = NULL; + if (!(s->s3->flags & TLS1_FLAGS_KEEP_HANDSHAKE)) + { + /* Free handshake_buffer BIO */ + BIO_free(s->s3->handshake_buffer); + s->s3->handshake_buffer = NULL; + } return 1; } @@ -672,6 +683,7 @@ static int ssl3_handshake_mac(SSL *s, int md_nid, return 0; } EVP_MD_CTX_init(&ctx); + EVP_MD_CTX_set_flags(&ctx, EVP_MD_CTX_FLAG_NON_FIPS_ALLOW); EVP_MD_CTX_copy_ex(&ctx,d); n=EVP_MD_CTX_size(&ctx); if (n < 0) diff --git a/crypto/openssl/ssl/s3_lib.c b/crypto/openssl/ssl/s3_lib.c index 1130244aeb..db75479c38 100644 --- a/crypto/openssl/ssl/s3_lib.c +++ b/crypto/openssl/ssl/s3_lib.c @@ -1071,6 +1071,103 @@ OPENSSL_GLOBAL SSL_CIPHER ssl3_ciphers[]={ 256, }, + /* TLS v1.2 ciphersuites */ + /* Cipher 3B */ + { + 1, + TLS1_TXT_RSA_WITH_NULL_SHA256, + TLS1_CK_RSA_WITH_NULL_SHA256, + SSL_kRSA, + SSL_aRSA, + SSL_eNULL, + SSL_SHA256, + SSL_SSLV3, + SSL_NOT_EXP|SSL_STRONG_NONE|SSL_FIPS, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 0, + 0, + }, + + /* Cipher 3C */ + { + 1, + TLS1_TXT_RSA_WITH_AES_128_SHA256, + TLS1_CK_RSA_WITH_AES_128_SHA256, + SSL_kRSA, + SSL_aRSA, + SSL_AES128, + SSL_SHA256, + SSL_TLSV1, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 128, + 128, + }, + + /* Cipher 3D */ + { + 1, + TLS1_TXT_RSA_WITH_AES_256_SHA256, + TLS1_CK_RSA_WITH_AES_256_SHA256, + SSL_kRSA, + SSL_aRSA, + SSL_AES256, + SSL_SHA256, + SSL_TLSV1, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 256, + 256, + }, + + /* Cipher 3E */ + { + 0, /* not implemented (non-ephemeral DH) */ + TLS1_TXT_DH_DSS_WITH_AES_128_SHA256, + TLS1_CK_DH_DSS_WITH_AES_128_SHA256, + SSL_kDHr, + SSL_aDH, + SSL_AES128, + SSL_SHA256, + SSL_TLSV1, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 128, + 128, + }, + + /* Cipher 3F */ + { + 0, /* not implemented (non-ephemeral DH) */ + TLS1_TXT_DH_RSA_WITH_AES_128_SHA256, + TLS1_CK_DH_RSA_WITH_AES_128_SHA256, + SSL_kDHr, + SSL_aDH, + SSL_AES128, + SSL_SHA256, + SSL_TLSV1, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 128, + 128, + }, + + /* Cipher 40 */ + { + 1, + TLS1_TXT_DHE_DSS_WITH_AES_128_SHA256, + TLS1_CK_DHE_DSS_WITH_AES_128_SHA256, + SSL_kEDH, + SSL_aDSS, + SSL_AES128, + SSL_SHA256, + SSL_TLSV1, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 128, + 128, + }, + #ifndef OPENSSL_NO_CAMELLIA /* Camellia ciphersuites from RFC4132 (128-bit portion) */ @@ -1287,6 +1384,122 @@ OPENSSL_GLOBAL SSL_CIPHER ssl3_ciphers[]={ 128, }, #endif + + /* TLS v1.2 ciphersuites */ + /* Cipher 67 */ + { + 1, + TLS1_TXT_DHE_RSA_WITH_AES_128_SHA256, + TLS1_CK_DHE_RSA_WITH_AES_128_SHA256, + SSL_kEDH, + SSL_aRSA, + SSL_AES128, + SSL_SHA256, + SSL_TLSV1, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 128, + 128, + }, + + /* Cipher 68 */ + { + 0, /* not implemented (non-ephemeral DH) */ + TLS1_TXT_DH_DSS_WITH_AES_256_SHA256, + TLS1_CK_DH_DSS_WITH_AES_256_SHA256, + SSL_kDHr, + SSL_aDH, + SSL_AES256, + SSL_SHA256, + SSL_TLSV1, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 256, + 256, + }, + + /* Cipher 69 */ + { + 0, /* not implemented (non-ephemeral DH) */ + TLS1_TXT_DH_RSA_WITH_AES_256_SHA256, + TLS1_CK_DH_RSA_WITH_AES_256_SHA256, + SSL_kDHr, + SSL_aDH, + SSL_AES256, + SSL_SHA256, + SSL_TLSV1, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 256, + 256, + }, + + /* Cipher 6A */ + { + 1, + TLS1_TXT_DHE_DSS_WITH_AES_256_SHA256, + TLS1_CK_DHE_DSS_WITH_AES_256_SHA256, + SSL_kEDH, + SSL_aDSS, + SSL_AES256, + SSL_SHA256, + SSL_TLSV1, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 256, + 256, + }, + + /* Cipher 6B */ + { + 1, + TLS1_TXT_DHE_RSA_WITH_AES_256_SHA256, + TLS1_CK_DHE_RSA_WITH_AES_256_SHA256, + SSL_kEDH, + SSL_aRSA, + SSL_AES256, + SSL_SHA256, + SSL_TLSV1, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 256, + 256, + }, + + /* Cipher 6C */ + { + 1, + TLS1_TXT_ADH_WITH_AES_128_SHA256, + TLS1_CK_ADH_WITH_AES_128_SHA256, + SSL_kEDH, + SSL_aNULL, + SSL_AES128, + SSL_SHA256, + SSL_TLSV1, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 128, + 128, + }, + + /* Cipher 6D */ + { + 1, + TLS1_TXT_ADH_WITH_AES_256_SHA256, + TLS1_CK_ADH_WITH_AES_256_SHA256, + SSL_kEDH, + SSL_aNULL, + SSL_AES256, + SSL_SHA256, + SSL_TLSV1, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 256, + 256, + }, + + /* GOST Ciphersuites */ + { 1, "GOST94-GOST89-GOST89", @@ -1610,107 +1823,301 @@ OPENSSL_GLOBAL SSL_CIPHER ssl3_ciphers[]={ #endif /* OPENSSL_NO_SEED */ -#ifndef OPENSSL_NO_ECDH - /* Cipher C001 */ - { - 1, - TLS1_TXT_ECDH_ECDSA_WITH_NULL_SHA, - TLS1_CK_ECDH_ECDSA_WITH_NULL_SHA, - SSL_kECDHe, - SSL_aECDH, - SSL_eNULL, - SSL_SHA1, - SSL_TLSV1, - SSL_NOT_EXP|SSL_STRONG_NONE, - SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, - 0, - 0, - }, + /* GCM ciphersuites from RFC5288 */ - /* Cipher C002 */ + /* Cipher 9C */ { 1, - TLS1_TXT_ECDH_ECDSA_WITH_RC4_128_SHA, - TLS1_CK_ECDH_ECDSA_WITH_RC4_128_SHA, - SSL_kECDHe, - SSL_aECDH, - SSL_RC4, - SSL_SHA1, - SSL_TLSV1, - SSL_NOT_EXP|SSL_MEDIUM, - SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + TLS1_TXT_RSA_WITH_AES_128_GCM_SHA256, + TLS1_CK_RSA_WITH_AES_128_GCM_SHA256, + SSL_kRSA, + SSL_aRSA, + SSL_AES128GCM, + SSL_AEAD, + SSL_TLSV1_2, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_SHA256|TLS1_PRF_SHA256, 128, 128, }, - /* Cipher C003 */ + /* Cipher 9D */ { 1, - TLS1_TXT_ECDH_ECDSA_WITH_DES_192_CBC3_SHA, - TLS1_CK_ECDH_ECDSA_WITH_DES_192_CBC3_SHA, - SSL_kECDHe, - SSL_aECDH, - SSL_3DES, - SSL_SHA1, - SSL_TLSV1, - SSL_NOT_EXP|SSL_HIGH, - SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, - 168, - 168, + TLS1_TXT_RSA_WITH_AES_256_GCM_SHA384, + TLS1_CK_RSA_WITH_AES_256_GCM_SHA384, + SSL_kRSA, + SSL_aRSA, + SSL_AES256GCM, + SSL_AEAD, + SSL_TLSV1_2, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_SHA384|TLS1_PRF_SHA384, + 256, + 256, }, - /* Cipher C004 */ + /* Cipher 9E */ { 1, - TLS1_TXT_ECDH_ECDSA_WITH_AES_128_CBC_SHA, - TLS1_CK_ECDH_ECDSA_WITH_AES_128_CBC_SHA, - SSL_kECDHe, - SSL_aECDH, - SSL_AES128, - SSL_SHA1, - SSL_TLSV1, - SSL_NOT_EXP|SSL_HIGH, - SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + TLS1_TXT_DHE_RSA_WITH_AES_128_GCM_SHA256, + TLS1_CK_DHE_RSA_WITH_AES_128_GCM_SHA256, + SSL_kEDH, + SSL_aRSA, + SSL_AES128GCM, + SSL_AEAD, + SSL_TLSV1_2, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_SHA256|TLS1_PRF_SHA256, 128, 128, }, - /* Cipher C005 */ + /* Cipher 9F */ { 1, - TLS1_TXT_ECDH_ECDSA_WITH_AES_256_CBC_SHA, - TLS1_CK_ECDH_ECDSA_WITH_AES_256_CBC_SHA, - SSL_kECDHe, - SSL_aECDH, - SSL_AES256, - SSL_SHA1, - SSL_TLSV1, - SSL_NOT_EXP|SSL_HIGH, - SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + TLS1_TXT_DHE_RSA_WITH_AES_256_GCM_SHA384, + TLS1_CK_DHE_RSA_WITH_AES_256_GCM_SHA384, + SSL_kEDH, + SSL_aRSA, + SSL_AES256GCM, + SSL_AEAD, + SSL_TLSV1_2, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_SHA384|TLS1_PRF_SHA384, 256, 256, }, - /* Cipher C006 */ + /* Cipher A0 */ { - 1, - TLS1_TXT_ECDHE_ECDSA_WITH_NULL_SHA, - TLS1_CK_ECDHE_ECDSA_WITH_NULL_SHA, - SSL_kEECDH, - SSL_aECDSA, - SSL_eNULL, - SSL_SHA1, - SSL_TLSV1, - SSL_NOT_EXP|SSL_STRONG_NONE, - SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, 0, + TLS1_TXT_DH_RSA_WITH_AES_128_GCM_SHA256, + TLS1_CK_DH_RSA_WITH_AES_128_GCM_SHA256, + SSL_kDHr, + SSL_aDH, + SSL_AES128GCM, + SSL_AEAD, + SSL_TLSV1_2, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_SHA256|TLS1_PRF_SHA256, + 128, + 128, + }, + + /* Cipher A1 */ + { 0, + TLS1_TXT_DH_RSA_WITH_AES_256_GCM_SHA384, + TLS1_CK_DH_RSA_WITH_AES_256_GCM_SHA384, + SSL_kDHr, + SSL_aDH, + SSL_AES256GCM, + SSL_AEAD, + SSL_TLSV1_2, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_SHA384|TLS1_PRF_SHA384, + 256, + 256, }, - /* Cipher C007 */ + /* Cipher A2 */ { 1, - TLS1_TXT_ECDHE_ECDSA_WITH_RC4_128_SHA, + TLS1_TXT_DHE_DSS_WITH_AES_128_GCM_SHA256, + TLS1_CK_DHE_DSS_WITH_AES_128_GCM_SHA256, + SSL_kEDH, + SSL_aDSS, + SSL_AES128GCM, + SSL_AEAD, + SSL_TLSV1_2, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_SHA256|TLS1_PRF_SHA256, + 128, + 128, + }, + + /* Cipher A3 */ + { + 1, + TLS1_TXT_DHE_DSS_WITH_AES_256_GCM_SHA384, + TLS1_CK_DHE_DSS_WITH_AES_256_GCM_SHA384, + SSL_kEDH, + SSL_aDSS, + SSL_AES256GCM, + SSL_AEAD, + SSL_TLSV1_2, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_SHA384|TLS1_PRF_SHA384, + 256, + 256, + }, + + /* Cipher A4 */ + { + 0, + TLS1_TXT_DH_DSS_WITH_AES_128_GCM_SHA256, + TLS1_CK_DH_DSS_WITH_AES_128_GCM_SHA256, + SSL_kDHr, + SSL_aDH, + SSL_AES128GCM, + SSL_AEAD, + SSL_TLSV1_2, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_SHA256|TLS1_PRF_SHA256, + 128, + 128, + }, + + /* Cipher A5 */ + { + 0, + TLS1_TXT_DH_DSS_WITH_AES_256_GCM_SHA384, + TLS1_CK_DH_DSS_WITH_AES_256_GCM_SHA384, + SSL_kDHr, + SSL_aDH, + SSL_AES256GCM, + SSL_AEAD, + SSL_TLSV1_2, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_SHA384|TLS1_PRF_SHA384, + 256, + 256, + }, + + /* Cipher A6 */ + { + 1, + TLS1_TXT_ADH_WITH_AES_128_GCM_SHA256, + TLS1_CK_ADH_WITH_AES_128_GCM_SHA256, + SSL_kEDH, + SSL_aNULL, + SSL_AES128GCM, + SSL_AEAD, + SSL_TLSV1_2, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_SHA256|TLS1_PRF_SHA256, + 128, + 128, + }, + + /* Cipher A7 */ + { + 1, + TLS1_TXT_ADH_WITH_AES_256_GCM_SHA384, + TLS1_CK_ADH_WITH_AES_256_GCM_SHA384, + SSL_kEDH, + SSL_aNULL, + SSL_AES256GCM, + SSL_AEAD, + SSL_TLSV1_2, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_SHA384|TLS1_PRF_SHA384, + 256, + 256, + }, + +#ifndef OPENSSL_NO_ECDH + /* Cipher C001 */ + { + 1, + TLS1_TXT_ECDH_ECDSA_WITH_NULL_SHA, + TLS1_CK_ECDH_ECDSA_WITH_NULL_SHA, + SSL_kECDHe, + SSL_aECDH, + SSL_eNULL, + SSL_SHA1, + SSL_TLSV1, + SSL_NOT_EXP|SSL_STRONG_NONE|SSL_FIPS, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 0, + 0, + }, + + /* Cipher C002 */ + { + 1, + TLS1_TXT_ECDH_ECDSA_WITH_RC4_128_SHA, + TLS1_CK_ECDH_ECDSA_WITH_RC4_128_SHA, + SSL_kECDHe, + SSL_aECDH, + SSL_RC4, + SSL_SHA1, + SSL_TLSV1, + SSL_NOT_EXP|SSL_MEDIUM, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 128, + 128, + }, + + /* Cipher C003 */ + { + 1, + TLS1_TXT_ECDH_ECDSA_WITH_DES_192_CBC3_SHA, + TLS1_CK_ECDH_ECDSA_WITH_DES_192_CBC3_SHA, + SSL_kECDHe, + SSL_aECDH, + SSL_3DES, + SSL_SHA1, + SSL_TLSV1, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 168, + 168, + }, + + /* Cipher C004 */ + { + 1, + TLS1_TXT_ECDH_ECDSA_WITH_AES_128_CBC_SHA, + TLS1_CK_ECDH_ECDSA_WITH_AES_128_CBC_SHA, + SSL_kECDHe, + SSL_aECDH, + SSL_AES128, + SSL_SHA1, + SSL_TLSV1, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 128, + 128, + }, + + /* Cipher C005 */ + { + 1, + TLS1_TXT_ECDH_ECDSA_WITH_AES_256_CBC_SHA, + TLS1_CK_ECDH_ECDSA_WITH_AES_256_CBC_SHA, + SSL_kECDHe, + SSL_aECDH, + SSL_AES256, + SSL_SHA1, + SSL_TLSV1, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 256, + 256, + }, + + /* Cipher C006 */ + { + 1, + TLS1_TXT_ECDHE_ECDSA_WITH_NULL_SHA, + TLS1_CK_ECDHE_ECDSA_WITH_NULL_SHA, + SSL_kEECDH, + SSL_aECDSA, + SSL_eNULL, + SSL_SHA1, + SSL_TLSV1, + SSL_NOT_EXP|SSL_STRONG_NONE|SSL_FIPS, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 0, + 0, + }, + + /* Cipher C007 */ + { + 1, + TLS1_TXT_ECDHE_ECDSA_WITH_RC4_128_SHA, TLS1_CK_ECDHE_ECDSA_WITH_RC4_128_SHA, SSL_kEECDH, SSL_aECDSA, @@ -1733,7 +2140,7 @@ OPENSSL_GLOBAL SSL_CIPHER ssl3_ciphers[]={ SSL_3DES, SSL_SHA1, SSL_TLSV1, - SSL_NOT_EXP|SSL_HIGH, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, 168, 168, @@ -1749,7 +2156,7 @@ OPENSSL_GLOBAL SSL_CIPHER ssl3_ciphers[]={ SSL_AES128, SSL_SHA1, SSL_TLSV1, - SSL_NOT_EXP|SSL_HIGH, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, 128, 128, @@ -1765,7 +2172,7 @@ OPENSSL_GLOBAL SSL_CIPHER ssl3_ciphers[]={ SSL_AES256, SSL_SHA1, SSL_TLSV1, - SSL_NOT_EXP|SSL_HIGH, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, 256, 256, @@ -1781,7 +2188,7 @@ OPENSSL_GLOBAL SSL_CIPHER ssl3_ciphers[]={ SSL_eNULL, SSL_SHA1, SSL_TLSV1, - SSL_NOT_EXP|SSL_STRONG_NONE, + SSL_NOT_EXP|SSL_STRONG_NONE|SSL_FIPS, SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, 0, 0, @@ -1803,214 +2210,624 @@ OPENSSL_GLOBAL SSL_CIPHER ssl3_ciphers[]={ 128, }, - /* Cipher C00D */ + /* Cipher C00D */ + { + 1, + TLS1_TXT_ECDH_RSA_WITH_DES_192_CBC3_SHA, + TLS1_CK_ECDH_RSA_WITH_DES_192_CBC3_SHA, + SSL_kECDHr, + SSL_aECDH, + SSL_3DES, + SSL_SHA1, + SSL_TLSV1, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 168, + 168, + }, + + /* Cipher C00E */ + { + 1, + TLS1_TXT_ECDH_RSA_WITH_AES_128_CBC_SHA, + TLS1_CK_ECDH_RSA_WITH_AES_128_CBC_SHA, + SSL_kECDHr, + SSL_aECDH, + SSL_AES128, + SSL_SHA1, + SSL_TLSV1, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 128, + 128, + }, + + /* Cipher C00F */ + { + 1, + TLS1_TXT_ECDH_RSA_WITH_AES_256_CBC_SHA, + TLS1_CK_ECDH_RSA_WITH_AES_256_CBC_SHA, + SSL_kECDHr, + SSL_aECDH, + SSL_AES256, + SSL_SHA1, + SSL_TLSV1, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 256, + 256, + }, + + /* Cipher C010 */ + { + 1, + TLS1_TXT_ECDHE_RSA_WITH_NULL_SHA, + TLS1_CK_ECDHE_RSA_WITH_NULL_SHA, + SSL_kEECDH, + SSL_aRSA, + SSL_eNULL, + SSL_SHA1, + SSL_TLSV1, + SSL_NOT_EXP|SSL_STRONG_NONE|SSL_FIPS, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 0, + 0, + }, + + /* Cipher C011 */ + { + 1, + TLS1_TXT_ECDHE_RSA_WITH_RC4_128_SHA, + TLS1_CK_ECDHE_RSA_WITH_RC4_128_SHA, + SSL_kEECDH, + SSL_aRSA, + SSL_RC4, + SSL_SHA1, + SSL_TLSV1, + SSL_NOT_EXP|SSL_MEDIUM, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 128, + 128, + }, + + /* Cipher C012 */ + { + 1, + TLS1_TXT_ECDHE_RSA_WITH_DES_192_CBC3_SHA, + TLS1_CK_ECDHE_RSA_WITH_DES_192_CBC3_SHA, + SSL_kEECDH, + SSL_aRSA, + SSL_3DES, + SSL_SHA1, + SSL_TLSV1, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 168, + 168, + }, + + /* Cipher C013 */ + { + 1, + TLS1_TXT_ECDHE_RSA_WITH_AES_128_CBC_SHA, + TLS1_CK_ECDHE_RSA_WITH_AES_128_CBC_SHA, + SSL_kEECDH, + SSL_aRSA, + SSL_AES128, + SSL_SHA1, + SSL_TLSV1, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 128, + 128, + }, + + /* Cipher C014 */ + { + 1, + TLS1_TXT_ECDHE_RSA_WITH_AES_256_CBC_SHA, + TLS1_CK_ECDHE_RSA_WITH_AES_256_CBC_SHA, + SSL_kEECDH, + SSL_aRSA, + SSL_AES256, + SSL_SHA1, + SSL_TLSV1, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 256, + 256, + }, + + /* Cipher C015 */ + { + 1, + TLS1_TXT_ECDH_anon_WITH_NULL_SHA, + TLS1_CK_ECDH_anon_WITH_NULL_SHA, + SSL_kEECDH, + SSL_aNULL, + SSL_eNULL, + SSL_SHA1, + SSL_TLSV1, + SSL_NOT_EXP|SSL_STRONG_NONE|SSL_FIPS, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 0, + 0, + }, + + /* Cipher C016 */ + { + 1, + TLS1_TXT_ECDH_anon_WITH_RC4_128_SHA, + TLS1_CK_ECDH_anon_WITH_RC4_128_SHA, + SSL_kEECDH, + SSL_aNULL, + SSL_RC4, + SSL_SHA1, + SSL_TLSV1, + SSL_NOT_EXP|SSL_MEDIUM, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 128, + 128, + }, + + /* Cipher C017 */ + { + 1, + TLS1_TXT_ECDH_anon_WITH_DES_192_CBC3_SHA, + TLS1_CK_ECDH_anon_WITH_DES_192_CBC3_SHA, + SSL_kEECDH, + SSL_aNULL, + SSL_3DES, + SSL_SHA1, + SSL_TLSV1, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 168, + 168, + }, + + /* Cipher C018 */ + { + 1, + TLS1_TXT_ECDH_anon_WITH_AES_128_CBC_SHA, + TLS1_CK_ECDH_anon_WITH_AES_128_CBC_SHA, + SSL_kEECDH, + SSL_aNULL, + SSL_AES128, + SSL_SHA1, + SSL_TLSV1, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 128, + 128, + }, + + /* Cipher C019 */ + { + 1, + TLS1_TXT_ECDH_anon_WITH_AES_256_CBC_SHA, + TLS1_CK_ECDH_anon_WITH_AES_256_CBC_SHA, + SSL_kEECDH, + SSL_aNULL, + SSL_AES256, + SSL_SHA1, + SSL_TLSV1, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 256, + 256, + }, +#endif /* OPENSSL_NO_ECDH */ + +#ifndef OPENSSL_NO_SRP + /* Cipher C01A */ + { + 1, + TLS1_TXT_SRP_SHA_WITH_3DES_EDE_CBC_SHA, + TLS1_CK_SRP_SHA_WITH_3DES_EDE_CBC_SHA, + SSL_kSRP, + SSL_aNULL, + SSL_3DES, + SSL_SHA1, + SSL_TLSV1, + SSL_NOT_EXP|SSL_HIGH, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 168, + 168, + }, + + /* Cipher C01B */ + { + 1, + TLS1_TXT_SRP_SHA_RSA_WITH_3DES_EDE_CBC_SHA, + TLS1_CK_SRP_SHA_RSA_WITH_3DES_EDE_CBC_SHA, + SSL_kSRP, + SSL_aRSA, + SSL_3DES, + SSL_SHA1, + SSL_TLSV1, + SSL_NOT_EXP|SSL_HIGH, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 168, + 168, + }, + + /* Cipher C01C */ + { + 1, + TLS1_TXT_SRP_SHA_DSS_WITH_3DES_EDE_CBC_SHA, + TLS1_CK_SRP_SHA_DSS_WITH_3DES_EDE_CBC_SHA, + SSL_kSRP, + SSL_aDSS, + SSL_3DES, + SSL_SHA1, + SSL_TLSV1, + SSL_NOT_EXP|SSL_HIGH, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 168, + 168, + }, + + /* Cipher C01D */ + { + 1, + TLS1_TXT_SRP_SHA_WITH_AES_128_CBC_SHA, + TLS1_CK_SRP_SHA_WITH_AES_128_CBC_SHA, + SSL_kSRP, + SSL_aNULL, + SSL_AES128, + SSL_SHA1, + SSL_TLSV1, + SSL_NOT_EXP|SSL_HIGH, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 128, + 128, + }, + + /* Cipher C01E */ + { + 1, + TLS1_TXT_SRP_SHA_RSA_WITH_AES_128_CBC_SHA, + TLS1_CK_SRP_SHA_RSA_WITH_AES_128_CBC_SHA, + SSL_kSRP, + SSL_aRSA, + SSL_AES128, + SSL_SHA1, + SSL_TLSV1, + SSL_NOT_EXP|SSL_HIGH, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 128, + 128, + }, + + /* Cipher C01F */ + { + 1, + TLS1_TXT_SRP_SHA_DSS_WITH_AES_128_CBC_SHA, + TLS1_CK_SRP_SHA_DSS_WITH_AES_128_CBC_SHA, + SSL_kSRP, + SSL_aDSS, + SSL_AES128, + SSL_SHA1, + SSL_TLSV1, + SSL_NOT_EXP|SSL_HIGH, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 128, + 128, + }, + + /* Cipher C020 */ + { + 1, + TLS1_TXT_SRP_SHA_WITH_AES_256_CBC_SHA, + TLS1_CK_SRP_SHA_WITH_AES_256_CBC_SHA, + SSL_kSRP, + SSL_aNULL, + SSL_AES256, + SSL_SHA1, + SSL_TLSV1, + SSL_NOT_EXP|SSL_HIGH, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 256, + 256, + }, + + /* Cipher C021 */ + { + 1, + TLS1_TXT_SRP_SHA_RSA_WITH_AES_256_CBC_SHA, + TLS1_CK_SRP_SHA_RSA_WITH_AES_256_CBC_SHA, + SSL_kSRP, + SSL_aRSA, + SSL_AES256, + SSL_SHA1, + SSL_TLSV1, + SSL_NOT_EXP|SSL_HIGH, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 256, + 256, + }, + + /* Cipher C022 */ + { + 1, + TLS1_TXT_SRP_SHA_DSS_WITH_AES_256_CBC_SHA, + TLS1_CK_SRP_SHA_DSS_WITH_AES_256_CBC_SHA, + SSL_kSRP, + SSL_aDSS, + SSL_AES256, + SSL_SHA1, + SSL_TLSV1, + SSL_NOT_EXP|SSL_HIGH, + SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + 256, + 256, + }, +#endif /* OPENSSL_NO_SRP */ +#ifndef OPENSSL_NO_ECDH + + /* HMAC based TLS v1.2 ciphersuites from RFC5289 */ + + /* Cipher C023 */ + { + 1, + TLS1_TXT_ECDHE_ECDSA_WITH_AES_128_SHA256, + TLS1_CK_ECDHE_ECDSA_WITH_AES_128_SHA256, + SSL_kEECDH, + SSL_aECDSA, + SSL_AES128, + SSL_SHA256, + SSL_TLSV1_2, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_SHA256|TLS1_PRF_SHA256, + 128, + 128, + }, + + /* Cipher C024 */ { 1, - TLS1_TXT_ECDH_RSA_WITH_DES_192_CBC3_SHA, - TLS1_CK_ECDH_RSA_WITH_DES_192_CBC3_SHA, - SSL_kECDHr, - SSL_aECDH, - SSL_3DES, - SSL_SHA1, - SSL_TLSV1, - SSL_NOT_EXP|SSL_HIGH, - SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, - 168, - 168, + TLS1_TXT_ECDHE_ECDSA_WITH_AES_256_SHA384, + TLS1_CK_ECDHE_ECDSA_WITH_AES_256_SHA384, + SSL_kEECDH, + SSL_aECDSA, + SSL_AES256, + SSL_SHA384, + SSL_TLSV1_2, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_SHA384|TLS1_PRF_SHA384, + 256, + 256, }, - /* Cipher C00E */ + /* Cipher C025 */ { 1, - TLS1_TXT_ECDH_RSA_WITH_AES_128_CBC_SHA, - TLS1_CK_ECDH_RSA_WITH_AES_128_CBC_SHA, - SSL_kECDHr, + TLS1_TXT_ECDH_ECDSA_WITH_AES_128_SHA256, + TLS1_CK_ECDH_ECDSA_WITH_AES_128_SHA256, + SSL_kECDHe, SSL_aECDH, SSL_AES128, - SSL_SHA1, - SSL_TLSV1, - SSL_NOT_EXP|SSL_HIGH, - SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + SSL_SHA256, + SSL_TLSV1_2, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_SHA256|TLS1_PRF_SHA256, 128, 128, }, - /* Cipher C00F */ + /* Cipher C026 */ { 1, - TLS1_TXT_ECDH_RSA_WITH_AES_256_CBC_SHA, - TLS1_CK_ECDH_RSA_WITH_AES_256_CBC_SHA, - SSL_kECDHr, + TLS1_TXT_ECDH_ECDSA_WITH_AES_256_SHA384, + TLS1_CK_ECDH_ECDSA_WITH_AES_256_SHA384, + SSL_kECDHe, SSL_aECDH, SSL_AES256, - SSL_SHA1, - SSL_TLSV1, - SSL_NOT_EXP|SSL_HIGH, - SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + SSL_SHA384, + SSL_TLSV1_2, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_SHA384|TLS1_PRF_SHA384, 256, 256, }, - /* Cipher C010 */ + /* Cipher C027 */ { 1, - TLS1_TXT_ECDHE_RSA_WITH_NULL_SHA, - TLS1_CK_ECDHE_RSA_WITH_NULL_SHA, + TLS1_TXT_ECDHE_RSA_WITH_AES_128_SHA256, + TLS1_CK_ECDHE_RSA_WITH_AES_128_SHA256, SSL_kEECDH, SSL_aRSA, - SSL_eNULL, - SSL_SHA1, - SSL_TLSV1, - SSL_NOT_EXP|SSL_STRONG_NONE, - SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, - 0, - 0, + SSL_AES128, + SSL_SHA256, + SSL_TLSV1_2, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_SHA256|TLS1_PRF_SHA256, + 128, + 128, }, - /* Cipher C011 */ + /* Cipher C028 */ { 1, - TLS1_TXT_ECDHE_RSA_WITH_RC4_128_SHA, - TLS1_CK_ECDHE_RSA_WITH_RC4_128_SHA, + TLS1_TXT_ECDHE_RSA_WITH_AES_256_SHA384, + TLS1_CK_ECDHE_RSA_WITH_AES_256_SHA384, SSL_kEECDH, SSL_aRSA, - SSL_RC4, - SSL_SHA1, - SSL_TLSV1, - SSL_NOT_EXP|SSL_MEDIUM, - SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + SSL_AES256, + SSL_SHA384, + SSL_TLSV1_2, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_SHA384|TLS1_PRF_SHA384, + 256, + 256, + }, + + /* Cipher C029 */ + { + 1, + TLS1_TXT_ECDH_RSA_WITH_AES_128_SHA256, + TLS1_CK_ECDH_RSA_WITH_AES_128_SHA256, + SSL_kECDHe, + SSL_aECDH, + SSL_AES128, + SSL_SHA256, + SSL_TLSV1_2, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_SHA256|TLS1_PRF_SHA256, 128, 128, }, - /* Cipher C012 */ + /* Cipher C02A */ { 1, - TLS1_TXT_ECDHE_RSA_WITH_DES_192_CBC3_SHA, - TLS1_CK_ECDHE_RSA_WITH_DES_192_CBC3_SHA, - SSL_kEECDH, - SSL_aRSA, - SSL_3DES, - SSL_SHA1, - SSL_TLSV1, - SSL_NOT_EXP|SSL_HIGH, - SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, - 168, - 168, + TLS1_TXT_ECDH_RSA_WITH_AES_256_SHA384, + TLS1_CK_ECDH_RSA_WITH_AES_256_SHA384, + SSL_kECDHe, + SSL_aECDH, + SSL_AES256, + SSL_SHA384, + SSL_TLSV1_2, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_SHA384|TLS1_PRF_SHA384, + 256, + 256, }, - /* Cipher C013 */ + /* GCM based TLS v1.2 ciphersuites from RFC5289 */ + + /* Cipher C02B */ { 1, - TLS1_TXT_ECDHE_RSA_WITH_AES_128_CBC_SHA, - TLS1_CK_ECDHE_RSA_WITH_AES_128_CBC_SHA, + TLS1_TXT_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256, + TLS1_CK_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256, SSL_kEECDH, - SSL_aRSA, - SSL_AES128, - SSL_SHA1, - SSL_TLSV1, - SSL_NOT_EXP|SSL_HIGH, - SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + SSL_aECDSA, + SSL_AES128GCM, + SSL_AEAD, + SSL_TLSV1_2, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_SHA256|TLS1_PRF_SHA256, 128, 128, }, - /* Cipher C014 */ + /* Cipher C02C */ { 1, - TLS1_TXT_ECDHE_RSA_WITH_AES_256_CBC_SHA, - TLS1_CK_ECDHE_RSA_WITH_AES_256_CBC_SHA, + TLS1_TXT_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384, + TLS1_CK_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384, SSL_kEECDH, - SSL_aRSA, - SSL_AES256, - SSL_SHA1, - SSL_TLSV1, - SSL_NOT_EXP|SSL_HIGH, - SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + SSL_aECDSA, + SSL_AES256GCM, + SSL_AEAD, + SSL_TLSV1_2, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_SHA384|TLS1_PRF_SHA384, 256, 256, }, - /* Cipher C015 */ + /* Cipher C02D */ { 1, - TLS1_TXT_ECDH_anon_WITH_NULL_SHA, - TLS1_CK_ECDH_anon_WITH_NULL_SHA, - SSL_kEECDH, - SSL_aNULL, - SSL_eNULL, - SSL_SHA1, - SSL_TLSV1, - SSL_NOT_EXP|SSL_STRONG_NONE, - SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, - 0, - 0, + TLS1_TXT_ECDH_ECDSA_WITH_AES_128_GCM_SHA256, + TLS1_CK_ECDH_ECDSA_WITH_AES_128_GCM_SHA256, + SSL_kECDHe, + SSL_aECDH, + SSL_AES128GCM, + SSL_AEAD, + SSL_TLSV1_2, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_SHA256|TLS1_PRF_SHA256, + 128, + 128, }, - /* Cipher C016 */ + /* Cipher C02E */ { 1, - TLS1_TXT_ECDH_anon_WITH_RC4_128_SHA, - TLS1_CK_ECDH_anon_WITH_RC4_128_SHA, + TLS1_TXT_ECDH_ECDSA_WITH_AES_256_GCM_SHA384, + TLS1_CK_ECDH_ECDSA_WITH_AES_256_GCM_SHA384, + SSL_kECDHe, + SSL_aECDH, + SSL_AES256GCM, + SSL_AEAD, + SSL_TLSV1_2, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_SHA384|TLS1_PRF_SHA384, + 256, + 256, + }, + + /* Cipher C02F */ + { + 1, + TLS1_TXT_ECDHE_RSA_WITH_AES_128_GCM_SHA256, + TLS1_CK_ECDHE_RSA_WITH_AES_128_GCM_SHA256, SSL_kEECDH, - SSL_aNULL, - SSL_RC4, - SSL_SHA1, - SSL_TLSV1, - SSL_NOT_EXP|SSL_MEDIUM, - SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + SSL_aRSA, + SSL_AES128GCM, + SSL_AEAD, + SSL_TLSV1_2, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_SHA256|TLS1_PRF_SHA256, 128, 128, }, - /* Cipher C017 */ + /* Cipher C030 */ { 1, - TLS1_TXT_ECDH_anon_WITH_DES_192_CBC3_SHA, - TLS1_CK_ECDH_anon_WITH_DES_192_CBC3_SHA, + TLS1_TXT_ECDHE_RSA_WITH_AES_256_GCM_SHA384, + TLS1_CK_ECDHE_RSA_WITH_AES_256_GCM_SHA384, SSL_kEECDH, - SSL_aNULL, - SSL_3DES, - SSL_SHA1, - SSL_TLSV1, - SSL_NOT_EXP|SSL_HIGH, - SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, - 168, - 168, + SSL_aRSA, + SSL_AES256GCM, + SSL_AEAD, + SSL_TLSV1_2, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_SHA384|TLS1_PRF_SHA384, + 256, + 256, }, - /* Cipher C018 */ + /* Cipher C031 */ { 1, - TLS1_TXT_ECDH_anon_WITH_AES_128_CBC_SHA, - TLS1_CK_ECDH_anon_WITH_AES_128_CBC_SHA, - SSL_kEECDH, - SSL_aNULL, - SSL_AES128, - SSL_SHA1, - SSL_TLSV1, - SSL_NOT_EXP|SSL_HIGH, - SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + TLS1_TXT_ECDH_RSA_WITH_AES_128_GCM_SHA256, + TLS1_CK_ECDH_RSA_WITH_AES_128_GCM_SHA256, + SSL_kECDHe, + SSL_aECDH, + SSL_AES128GCM, + SSL_AEAD, + SSL_TLSV1_2, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_SHA256|TLS1_PRF_SHA256, 128, 128, }, - /* Cipher C019 */ + /* Cipher C032 */ { 1, - TLS1_TXT_ECDH_anon_WITH_AES_256_CBC_SHA, - TLS1_CK_ECDH_anon_WITH_AES_256_CBC_SHA, - SSL_kEECDH, - SSL_aNULL, - SSL_AES256, - SSL_SHA1, - SSL_TLSV1, - SSL_NOT_EXP|SSL_HIGH, - SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF, + TLS1_TXT_ECDH_RSA_WITH_AES_256_GCM_SHA384, + TLS1_CK_ECDH_RSA_WITH_AES_256_GCM_SHA384, + SSL_kECDHe, + SSL_aECDH, + SSL_AES256GCM, + SSL_AEAD, + SSL_TLSV1_2, + SSL_NOT_EXP|SSL_HIGH|SSL_FIPS, + SSL_HANDSHAKE_MAC_SHA384|TLS1_PRF_SHA384, 256, 256, }, -#endif /* OPENSSL_NO_ECDH */ + +#endif /* OPENSSL_NO_ECDH */ + #ifdef TEMP_GOST_TLS /* Cipher FF00 */ @@ -2087,6 +2904,9 @@ SSL3_ENC_METHOD SSLv3_enc_data={ SSL3_MD_CLIENT_FINISHED_CONST,4, SSL3_MD_SERVER_FINISHED_CONST,4, ssl3_alert_code, + (int (*)(SSL *, unsigned char *, size_t, const char *, + size_t, const unsigned char *, size_t, + int use_context))ssl_undefined_function, }; long ssl3_default_timeout(void) @@ -2128,6 +2948,9 @@ int ssl3_new(SSL *s) s->s3=s3; +#ifndef OPENSSL_NO_SRP + SSL_SRP_CTX_init(s); +#endif s->method->ssl_clear(s); return(1); err: @@ -2168,6 +2991,9 @@ void ssl3_free(SSL *s) BIO_free(s->s3->handshake_buffer); } if (s->s3->handshake_dgst) ssl3_free_digest_list(s); +#ifndef OPENSSL_NO_SRP + SSL_SRP_CTX_free(s); +#endif OPENSSL_cleanse(s->s3,sizeof *s->s3); OPENSSL_free(s->s3); s->s3=NULL; @@ -2239,7 +3065,23 @@ void ssl3_clear(SSL *s) s->s3->num_renegotiations=0; s->s3->in_read_app_data=0; s->version=SSL3_VERSION; + +#if !defined(OPENSSL_NO_TLSEXT) && !defined(OPENSSL_NO_NEXTPROTONEG) + if (s->next_proto_negotiated) + { + OPENSSL_free(s->next_proto_negotiated); + s->next_proto_negotiated = NULL; + s->next_proto_negotiated_len = 0; + } +#endif + } + +#ifndef OPENSSL_NO_SRP +static char * MS_CALLBACK srp_password_from_info_cb(SSL *s, void *arg) + { + return BUF_strdup(s->srp_ctx.info) ; } +#endif long ssl3_ctrl(SSL *s, int cmd, long larg, void *parg) { @@ -2486,6 +3328,27 @@ long ssl3_ctrl(SSL *s, int cmd, long larg, void *parg) ret = 1; break; +#ifndef OPENSSL_NO_HEARTBEATS + case SSL_CTRL_TLS_EXT_SEND_HEARTBEAT: + if (SSL_version(s) == DTLS1_VERSION || SSL_version(s) == DTLS1_BAD_VER) + ret = dtls1_heartbeat(s); + else + ret = tls1_heartbeat(s); + break; + + case SSL_CTRL_GET_TLS_EXT_HEARTBEAT_PENDING: + ret = s->tlsext_hb_pending; + break; + + case SSL_CTRL_SET_TLS_EXT_HEARTBEAT_NO_REQUESTS: + if (larg) + s->tlsext_heartbeat |= SSL_TLSEXT_HB_DONT_RECV_REQUESTS; + else + s->tlsext_heartbeat &= ~SSL_TLSEXT_HB_DONT_RECV_REQUESTS; + ret = 1; + break; +#endif + #endif /* !OPENSSL_NO_TLSEXT */ default: break; @@ -2718,6 +3581,38 @@ long ssl3_ctx_ctrl(SSL_CTX *ctx, int cmd, long larg, void *parg) return 1; break; +#ifndef OPENSSL_NO_SRP + case SSL_CTRL_SET_TLS_EXT_SRP_USERNAME: + ctx->srp_ctx.srp_Mask|=SSL_kSRP; + if (ctx->srp_ctx.login != NULL) + OPENSSL_free(ctx->srp_ctx.login); + ctx->srp_ctx.login = NULL; + if (parg == NULL) + break; + if (strlen((const char *)parg) > 255 || strlen((const char *)parg) < 1) + { + SSLerr(SSL_F_SSL3_CTX_CTRL, SSL_R_INVALID_SRP_USERNAME); + return 0; + } + if ((ctx->srp_ctx.login = BUF_strdup((char *)parg)) == NULL) + { + SSLerr(SSL_F_SSL3_CTX_CTRL, ERR_R_INTERNAL_ERROR); + return 0; + } + break; + case SSL_CTRL_SET_TLS_EXT_SRP_PASSWORD: + ctx->srp_ctx.SRP_give_srp_client_pwd_callback=srp_password_from_info_cb; + ctx->srp_ctx.info=parg; + break; + case SSL_CTRL_SET_SRP_ARG: + ctx->srp_ctx.srp_Mask|=SSL_kSRP; + ctx->srp_ctx.SRP_cb_arg=parg; + break; + + case SSL_CTRL_SET_TLS_EXT_SRP_STRENGTH: + ctx->srp_ctx.strength=larg; + break; +#endif #endif /* !OPENSSL_NO_TLSEXT */ /* A Thawte special :-) */ @@ -2730,6 +3625,18 @@ long ssl3_ctx_ctrl(SSL_CTX *ctx, int cmd, long larg, void *parg) sk_X509_push(ctx->extra_certs,(X509 *)parg); break; + case SSL_CTRL_GET_EXTRA_CHAIN_CERTS: + *(STACK_OF(X509) **)parg = ctx->extra_certs; + break; + + case SSL_CTRL_CLEAR_EXTRA_CHAIN_CERTS: + if (ctx->extra_certs) + { + sk_X509_pop_free(ctx->extra_certs, X509_free); + ctx->extra_certs = NULL; + } + break; + default: return(0); } @@ -2787,6 +3694,20 @@ long ssl3_ctx_callback_ctrl(SSL_CTX *ctx, int cmd, void (*fp)(void)) HMAC_CTX *, int))fp; break; +#ifndef OPENSSL_NO_SRP + case SSL_CTRL_SET_SRP_VERIFY_PARAM_CB: + ctx->srp_ctx.srp_Mask|=SSL_kSRP; + ctx->srp_ctx.SRP_verify_param_callback=(int (*)(SSL *,void *))fp; + break; + case SSL_CTRL_SET_TLS_EXT_SRP_USERNAME_CB: + ctx->srp_ctx.srp_Mask|=SSL_kSRP; + ctx->srp_ctx.TLS_ext_srp_username_callback=(int (*)(SSL *,int *,void *))fp; + break; + case SSL_CTRL_SET_SRP_GIVE_CLIENT_PWD_CB: + ctx->srp_ctx.srp_Mask|=SSL_kSRP; + ctx->srp_ctx.SRP_give_srp_client_pwd_callback=(char *(*)(SSL *,void *))fp; + break; +#endif #endif default: return(0); @@ -2805,6 +3726,9 @@ const SSL_CIPHER *ssl3_get_cipher_by_char(const unsigned char *p) id=0x03000000L|((unsigned long)p[0]<<8L)|(unsigned long)p[1]; c.id=id; cp = OBJ_bsearch_ssl_cipher_id(&c, ssl3_ciphers, SSL3_NUM_CIPHERS); +#ifdef DEBUG_PRINT_UNKNOWN_CIPHERSUITES +if (cp == NULL) fprintf(stderr, "Unknown cipher ID %x\n", (p[0] << 8) | p[1]); +#endif if (cp == NULL || cp->valid == 0) return NULL; else @@ -2882,11 +3806,20 @@ SSL_CIPHER *ssl3_choose_cipher(SSL *s, STACK_OF(SSL_CIPHER) *clnt, { c=sk_SSL_CIPHER_value(prio,i); + /* Skip TLS v1.2 only ciphersuites if lower than v1.2 */ + if ((c->algorithm_ssl & SSL_TLSV1_2) && + (TLS1_get_version(s) < TLS1_2_VERSION)) + continue; + ssl_set_cert_masks(cert,c); mask_k = cert->mask_k; mask_a = cert->mask_a; emask_k = cert->export_mask_k; emask_a = cert->export_mask_a; +#ifndef OPENSSL_NO_SRP + mask_k=cert->mask_k | s->srp_ctx.srp_Mask; + emask_k=cert->export_mask_k | s->srp_ctx.srp_Mask; +#endif #ifdef KSSL_DEBUG /* printf("ssl3_choose_cipher %d alg= %lx\n", i,c->algorithms);*/ @@ -3335,4 +4268,15 @@ need to go to SSL_ST_ACCEPT. } return(ret); } - +/* If we are using TLS v1.2 or later and default SHA1+MD5 algorithms switch + * to new SHA256 PRF and handshake macs + */ +long ssl_get_algorithm2(SSL *s) + { + long alg2 = s->s3->tmp.new_cipher->algorithm2; + if (TLS1_get_version(s) >= TLS1_2_VERSION && + alg2 == (SSL_HANDSHAKE_MAC_DEFAULT|TLS1_PRF)) + return SSL_HANDSHAKE_MAC_SHA256 | TLS1_PRF_SHA256; + return alg2; + } + diff --git a/crypto/openssl/ssl/s3_pkt.c b/crypto/openssl/ssl/s3_pkt.c index f9b3629cf7..3c56a86933 100644 --- a/crypto/openssl/ssl/s3_pkt.c +++ b/crypto/openssl/ssl/s3_pkt.c @@ -115,6 +115,7 @@ #include "ssl_locl.h" #include #include +#include static int do_ssl3_write(SSL *s, int type, const unsigned char *buf, unsigned int len, int create_empty_fragment); @@ -630,6 +631,7 @@ static int do_ssl3_write(SSL *s, int type, const unsigned char *buf, unsigned char *p,*plen; int i,mac_size,clear=0; int prefix_len=0; + int eivlen; long align=0; SSL3_RECORD *wr; SSL3_BUFFER *wb=&(s->s3->wbuf); @@ -739,9 +741,27 @@ static int do_ssl3_write(SSL *s, int type, const unsigned char *buf, /* field where we are to write out packet length */ plen=p; p+=2; + /* Explicit IV length, block ciphers and TLS version 1.1 or later */ + if (s->enc_write_ctx && s->version >= TLS1_1_VERSION) + { + int mode = EVP_CIPHER_CTX_mode(s->enc_write_ctx); + if (mode == EVP_CIPH_CBC_MODE) + { + eivlen = EVP_CIPHER_CTX_iv_length(s->enc_write_ctx); + if (eivlen <= 1) + eivlen = 0; + } + /* Need explicit part of IV for GCM mode */ + else if (mode == EVP_CIPH_GCM_MODE) + eivlen = EVP_GCM_TLS_EXPLICIT_IV_LEN; + else + eivlen = 0; + } + else + eivlen = 0; /* lets setup the record stuff. */ - wr->data=p; + wr->data=p + eivlen; wr->length=(int)len; wr->input=(unsigned char *)buf; @@ -769,11 +789,19 @@ static int do_ssl3_write(SSL *s, int type, const unsigned char *buf, if (mac_size != 0) { - if (s->method->ssl3_enc->mac(s,&(p[wr->length]),1) < 0) + if (s->method->ssl3_enc->mac(s,&(p[wr->length + eivlen]),1) < 0) goto err; wr->length+=mac_size; - wr->input=p; - wr->data=p; + } + + wr->input=p; + wr->data=p; + + if (eivlen) + { + /* if (RAND_pseudo_bytes(p, eivlen) <= 0) + goto err; */ + wr->length += eivlen; } /* ssl3_enc can only have an error on read */ @@ -1042,6 +1070,19 @@ start: dest = s->s3->alert_fragment; dest_len = &s->s3->alert_fragment_len; } +#ifndef OPENSSL_NO_HEARTBEATS + else if (rr->type == TLS1_RT_HEARTBEAT) + { + tls1_process_heartbeat(s); + + /* Exit and notify application to read again */ + rr->length = 0; + s->rwstate=SSL_READING; + BIO_clear_retry_flags(SSL_get_rbio(s)); + BIO_set_retry_read(SSL_get_rbio(s)); + return(-1); + } +#endif if (dest_maxlen > 0) { @@ -1185,6 +1226,10 @@ start: SSLerr(SSL_F_SSL3_READ_BYTES,SSL_R_NO_RENEGOTIATION); goto f_err; } +#ifdef SSL_AD_MISSING_SRP_USERNAME + if (alert_descr == SSL_AD_MISSING_SRP_USERNAME) + return(0); +#endif } else if (alert_level == 2) /* fatal */ { @@ -1263,6 +1308,7 @@ start: #else s->state = s->server ? SSL_ST_ACCEPT : SSL_ST_CONNECT; #endif + s->renegotiate=1; s->new_session=1; } i=s->handshake_func(s); @@ -1296,8 +1342,10 @@ start: { default: #ifndef OPENSSL_NO_TLS - /* TLS just ignores unknown message types */ - if (s->version == TLS1_VERSION) + /* TLS up to v1.1 just ignores unknown message types: + * TLS v1.2 give an unexpected message alert. + */ + if (s->version >= TLS1_VERSION && s->version <= TLS1_1_VERSION) { rr->length = 0; goto start; diff --git a/crypto/openssl/ssl/s3_srvr.c b/crypto/openssl/ssl/s3_srvr.c index d734c359fb..5944d8c484 100644 --- a/crypto/openssl/ssl/s3_srvr.c +++ b/crypto/openssl/ssl/s3_srvr.c @@ -179,6 +179,31 @@ static const SSL_METHOD *ssl3_get_server_method(int ver) return(NULL); } +#ifndef OPENSSL_NO_SRP +static int ssl_check_srp_ext_ClientHello(SSL *s, int *al) + { + int ret = SSL_ERROR_NONE; + + *al = SSL_AD_UNRECOGNIZED_NAME; + + if ((s->s3->tmp.new_cipher->algorithm_mkey & SSL_kSRP) && + (s->srp_ctx.TLS_ext_srp_username_callback != NULL)) + { + if(s->srp_ctx.login == NULL) + { + /* There isn't any srp login extension !!! */ + ret = SSL3_AL_FATAL; + *al = SSL_AD_UNKNOWN_PSK_IDENTITY; + } + else + { + ret = SSL_srp_server_param_with_username(s,al); + } + } + return ret; + } +#endif + IMPLEMENT_ssl3_meth_func(SSLv3_server_method, ssl3_accept, ssl_undefined_function, @@ -211,6 +236,18 @@ int ssl3_accept(SSL *s) return(-1); } +#ifndef OPENSSL_NO_HEARTBEATS + /* If we're awaiting a HeartbeatResponse, pretend we + * already got and don't await it anymore, because + * Heartbeats don't make sense during handshakes anyway. + */ + if (s->tlsext_hb_pending) + { + s->tlsext_hb_pending = 0; + s->tlsext_hb_seq++; + } +#endif + for (;;) { state=s->state; @@ -218,7 +255,7 @@ int ssl3_accept(SSL *s) switch (s->state) { case SSL_ST_RENEGOTIATE: - s->new_session=1; + s->renegotiate=1; /* s->state=SSL_ST_ACCEPT; */ case SSL_ST_BEFORE: @@ -314,10 +351,34 @@ int ssl3_accept(SSL *s) case SSL3_ST_SR_CLNT_HELLO_C: s->shutdown=0; - ret=ssl3_get_client_hello(s); - if (ret <= 0) goto end; - - s->new_session = 2; + if (s->rwstate != SSL_X509_LOOKUP) + { + ret=ssl3_get_client_hello(s); + if (ret <= 0) goto end; + } +#ifndef OPENSSL_NO_SRP + { + int al; + if ((ret = ssl_check_srp_ext_ClientHello(s,&al)) < 0) + { + /* callback indicates firther work to be done */ + s->rwstate=SSL_X509_LOOKUP; + goto end; + } + if (ret != SSL_ERROR_NONE) + { + ssl3_send_alert(s,SSL3_AL_FATAL,al); + /* This is not really an error but the only means to + for a client to detect whether srp is supported. */ + if (al != TLS1_AD_UNKNOWN_PSK_IDENTITY) + SSLerr(SSL_F_SSL3_ACCEPT,SSL_R_CLIENTHELLO_TLSEXT); + ret = SSL_TLSEXT_ERR_ALERT_FATAL; + ret= -1; + goto end; + } + } +#endif + s->renegotiate = 2; s->state=SSL3_ST_SW_SRVR_HELLO_A; s->init_num=0; break; @@ -346,7 +407,7 @@ int ssl3_accept(SSL *s) case SSL3_ST_SW_CERT_A: case SSL3_ST_SW_CERT_B: /* Check if it is anon DH or anon ECDH, */ - /* normal PSK or KRB5 */ + /* normal PSK or KRB5 or SRP */ if (!(s->s3->tmp.new_cipher->algorithm_auth & SSL_aNULL) && !(s->s3->tmp.new_cipher->algorithm_mkey & SSL_kPSK) && !(s->s3->tmp.new_cipher->algorithm_auth & SSL_aKRB5)) @@ -410,6 +471,10 @@ int ssl3_accept(SSL *s) * hint if provided */ #ifndef OPENSSL_NO_PSK || ((alg_k & SSL_kPSK) && s->ctx->psk_identity_hint) +#endif +#ifndef OPENSSL_NO_SRP + /* SRP: send ServerKeyExchange */ + || (alg_k & SSL_kSRP) #endif || (alg_k & (SSL_kDHr|SSL_kDHd|SSL_kEDH)) || (alg_k & SSL_kEECDH) @@ -457,6 +522,9 @@ int ssl3_accept(SSL *s) skip=1; s->s3->tmp.cert_request=0; s->state=SSL3_ST_SW_SRVR_DONE_A; + if (s->s3->handshake_buffer) + if (!ssl3_digest_cached_records(s)) + return -1; } else { @@ -539,9 +607,34 @@ int ssl3_accept(SSL *s) * the client uses its key from the certificate * for key exchange. */ +#if defined(OPENSSL_NO_TLSEXT) || defined(OPENSSL_NO_NEXTPROTONEG) s->state=SSL3_ST_SR_FINISHED_A; +#else + if (s->s3->next_proto_neg_seen) + s->state=SSL3_ST_SR_NEXT_PROTO_A; + else + s->state=SSL3_ST_SR_FINISHED_A; +#endif s->init_num = 0; } + else if (TLS1_get_version(s) >= TLS1_2_VERSION) + { + s->state=SSL3_ST_SR_CERT_VRFY_A; + s->init_num=0; + if (!s->session->peer) + break; + /* For TLS v1.2 freeze the handshake buffer + * at this point and digest cached records. + */ + if (!s->s3->handshake_buffer) + { + SSLerr(SSL_F_SSL3_ACCEPT,ERR_R_INTERNAL_ERROR); + return -1; + } + s->s3->flags |= TLS1_FLAGS_KEEP_HANDSHAKE; + if (!ssl3_digest_cached_records(s)) + return -1; + } else { int offset=0; @@ -582,23 +675,37 @@ int ssl3_accept(SSL *s) ret=ssl3_get_cert_verify(s); if (ret <= 0) goto end; +#if defined(OPENSSL_NO_TLSEXT) || defined(OPENSSL_NO_NEXTPROTONEG) s->state=SSL3_ST_SR_FINISHED_A; +#else + if (s->s3->next_proto_neg_seen) + s->state=SSL3_ST_SR_NEXT_PROTO_A; + else + s->state=SSL3_ST_SR_FINISHED_A; +#endif s->init_num=0; break; +#if !defined(OPENSSL_NO_TLSEXT) && !defined(OPENSSL_NO_NEXTPROTONEG) + case SSL3_ST_SR_NEXT_PROTO_A: + case SSL3_ST_SR_NEXT_PROTO_B: + ret=ssl3_get_next_proto(s); + if (ret <= 0) goto end; + s->init_num = 0; + s->state=SSL3_ST_SR_FINISHED_A; + break; +#endif + case SSL3_ST_SR_FINISHED_A: case SSL3_ST_SR_FINISHED_B: ret=ssl3_get_finished(s,SSL3_ST_SR_FINISHED_A, SSL3_ST_SR_FINISHED_B); if (ret <= 0) goto end; -#ifndef OPENSSL_NO_TLSEXT - if (s->tlsext_ticket_expected) - s->state=SSL3_ST_SW_SESSION_TICKET_A; - else if (s->hit) - s->state=SSL_ST_OK; -#else if (s->hit) s->state=SSL_ST_OK; +#ifndef OPENSSL_NO_TLSEXT + else if (s->tlsext_ticket_expected) + s->state=SSL3_ST_SW_SESSION_TICKET_A; #endif else s->state=SSL3_ST_SW_CHANGE_A; @@ -656,7 +763,16 @@ int ssl3_accept(SSL *s) if (ret <= 0) goto end; s->state=SSL3_ST_SW_FLUSH; if (s->hit) + { +#if defined(OPENSSL_NO_TLSEXT) || defined(OPENSSL_NO_NEXTPROTONEG) s->s3->tmp.next_state=SSL3_ST_SR_FINISHED_A; +#else + if (s->s3->next_proto_neg_seen) + s->s3->tmp.next_state=SSL3_ST_SR_NEXT_PROTO_A; + else + s->s3->tmp.next_state=SSL3_ST_SR_FINISHED_A; +#endif + } else s->s3->tmp.next_state=SSL_ST_OK; s->init_num=0; @@ -674,11 +790,9 @@ int ssl3_accept(SSL *s) s->init_num=0; - if (s->new_session == 2) /* skipped if we just sent a HelloRequest */ + if (s->renegotiate == 2) /* skipped if we just sent a HelloRequest */ { - /* actually not necessarily a 'new' session unless - * SSL_OP_NO_SESSION_RESUMPTION_ON_RENEGOTIATION is set */ - + s->renegotiate=0; s->new_session=0; ssl_update_cache(s,SSL_SESS_CACHE_SERVER); @@ -756,14 +870,6 @@ int ssl3_check_client_hello(SSL *s) int ok; long n; - /* We only allow the client to restart the handshake once per - * negotiation. */ - if (s->s3->flags & SSL3_FLAGS_SGC_RESTART_DONE) - { - SSLerr(SSL_F_SSL3_CHECK_CLIENT_HELLO, SSL_R_MULTIPLE_SGC_RESTARTS); - return -1; - } - /* this function is called when we really expect a Certificate message, * so permit appropriate message length */ n=s->method->ssl_get_message(s, @@ -776,6 +882,13 @@ int ssl3_check_client_hello(SSL *s) s->s3->tmp.reuse_message = 1; if (s->s3->tmp.message_type == SSL3_MT_CLIENT_HELLO) { + /* We only allow the client to restart the handshake once per + * negotiation. */ + if (s->s3->flags & SSL3_FLAGS_SGC_RESTART_DONE) + { + SSLerr(SSL_F_SSL3_CHECK_CLIENT_HELLO, SSL_R_MULTIPLE_SGC_RESTARTS); + return -1; + } /* Throw away what we have done so far in the current handshake, * which will now be aborted. (A full SSL_clear would be too much.) */ #ifndef OPENSSL_NO_DH @@ -817,7 +930,8 @@ int ssl3_get_client_hello(SSL *s) * If we are SSLv3, we will respond with SSLv3, even if prompted with * TLSv1. */ - if (s->state == SSL3_ST_SR_CLNT_HELLO_A) + if (s->state == SSL3_ST_SR_CLNT_HELLO_A + ) { s->state=SSL3_ST_SR_CLNT_HELLO_B; } @@ -874,13 +988,16 @@ int ssl3_get_client_hello(SSL *s) j= *(p++); s->hit=0; - /* Versions before 0.9.7 always allow session reuse during renegotiation - * (i.e. when s->new_session is true), option - * SSL_OP_NO_SESSION_RESUMPTION_ON_RENEGOTIATION is new with 0.9.7. - * Maybe this optional behaviour should always have been the default, - * but we cannot safely change the default behaviour (or new applications - * might be written that become totally unsecure when compiled with - * an earlier library version) + /* Versions before 0.9.7 always allow clients to resume sessions in renegotiation. + * 0.9.7 and later allow this by default, but optionally ignore resumption requests + * with flag SSL_OP_NO_SESSION_RESUMPTION_ON_RENEGOTIATION (it's a new flag rather + * than a change to default behavior so that applications relying on this for security + * won't even compile against older library versions). + * + * 1.0.1 and later also have a function SSL_renegotiate_abbreviated() to request + * renegotiation but not a new session (s->new_session remains unset): for servers, + * this essentially just means that the SSL_OP_NO_SESSION_RESUMPTION_ON_RENEGOTIATION + * setting will be ignored. */ if ((s->new_session && (s->options & SSL_OP_NO_SESSION_RESUMPTION_ON_RENEGOTIATION))) { @@ -1269,8 +1386,11 @@ int ssl3_get_client_hello(SSL *s) s->s3->tmp.new_cipher=s->session->cipher; } - if (!ssl3_digest_cached_records(s)) - goto f_err; + if (TLS1_get_version(s) < TLS1_2_VERSION || !(s->verify_mode & SSL_VERIFY_PEER)) + { + if (!ssl3_digest_cached_records(s)) + goto f_err; + } /* we now have the following setup. * client_random @@ -1325,20 +1445,20 @@ int ssl3_send_server_hello(SSL *s) memcpy(p,s->s3->server_random,SSL3_RANDOM_SIZE); p+=SSL3_RANDOM_SIZE; - /* now in theory we have 3 options to sending back the - * session id. If it is a re-use, we send back the - * old session-id, if it is a new session, we send - * back the new session-id or we send back a 0 length - * session-id if we want it to be single use. - * Currently I will not implement the '0' length session-id - * 12-Jan-98 - I'll now support the '0' length stuff. - * - * We also have an additional case where stateless session - * resumption is successful: we always send back the old - * session id. In this case s->hit is non zero: this can - * only happen if stateless session resumption is succesful - * if session caching is disabled so existing functionality - * is unaffected. + /* There are several cases for the session ID to send + * back in the server hello: + * - For session reuse from the session cache, + * we send back the old session ID. + * - If stateless session reuse (using a session ticket) + * is successful, we send back the client's "session ID" + * (which doesn't actually identify the session). + * - If it is a new session, we send back the new + * session ID. + * - However, if we want the new session to be single-use, + * we send back a 0-length session ID. + * s->hit is non-zero in either case of session reuse, + * so the following won't overwrite an ID that we're supposed + * to send back. */ if (!(s->ctx->session_cache_mode & SSL_SESS_CACHE_SERVER) && !s->hit) @@ -1439,6 +1559,7 @@ int ssl3_send_server_key_exchange(SSL *s) BN_CTX *bn_ctx = NULL; #endif EVP_PKEY *pkey; + const EVP_MD *md = NULL; unsigned char *p,*d; int al,i; unsigned long type; @@ -1679,21 +1800,44 @@ int ssl3_send_server_key_exchange(SSL *s) } else #endif /* !OPENSSL_NO_PSK */ +#ifndef OPENSSL_NO_SRP + if (type & SSL_kSRP) + { + if ((s->srp_ctx.N == NULL) || + (s->srp_ctx.g == NULL) || + (s->srp_ctx.s == NULL) || + (s->srp_ctx.B == NULL)) + { + SSLerr(SSL_F_SSL3_SEND_SERVER_KEY_EXCHANGE,SSL_R_MISSING_SRP_PARAM); + goto err; + } + r[0]=s->srp_ctx.N; + r[1]=s->srp_ctx.g; + r[2]=s->srp_ctx.s; + r[3]=s->srp_ctx.B; + } + else +#endif { al=SSL_AD_HANDSHAKE_FAILURE; SSLerr(SSL_F_SSL3_SEND_SERVER_KEY_EXCHANGE,SSL_R_UNKNOWN_KEY_EXCHANGE_TYPE); goto f_err; } - for (i=0; r[i] != NULL; i++) + for (i=0; r[i] != NULL && i<4; i++) { nr[i]=BN_num_bytes(r[i]); +#ifndef OPENSSL_NO_SRP + if ((i == 2) && (type & SSL_kSRP)) + n+=1+nr[i]; + else +#endif n+=2+nr[i]; } if (!(s->s3->tmp.new_cipher->algorithm_auth & SSL_aNULL) && !(s->s3->tmp.new_cipher->algorithm_mkey & SSL_kPSK)) { - if ((pkey=ssl_get_sign_pkey(s,s->s3->tmp.new_cipher)) + if ((pkey=ssl_get_sign_pkey(s,s->s3->tmp.new_cipher,&md)) == NULL) { al=SSL_AD_DECODE_ERROR; @@ -1715,8 +1859,16 @@ int ssl3_send_server_key_exchange(SSL *s) d=(unsigned char *)s->init_buf->data; p= &(d[4]); - for (i=0; r[i] != NULL; i++) + for (i=0; r[i] != NULL && i<4; i++) { +#ifndef OPENSSL_NO_SRP + if ((i == 2) && (type & SSL_kSRP)) + { + *p = nr[i]; + p++; + } + else +#endif s2n(nr[i],p); BN_bn2bin(r[i],p); p+=nr[i]; @@ -1764,12 +1916,15 @@ int ssl3_send_server_key_exchange(SSL *s) /* n is the length of the params, they start at &(d[4]) * and p points to the space at the end. */ #ifndef OPENSSL_NO_RSA - if (pkey->type == EVP_PKEY_RSA) + if (pkey->type == EVP_PKEY_RSA + && TLS1_get_version(s) < TLS1_2_VERSION) { q=md_buf; j=0; for (num=2; num > 0; num--) { + EVP_MD_CTX_set_flags(&md_ctx, + EVP_MD_CTX_FLAG_NON_FIPS_ALLOW); EVP_DigestInit_ex(&md_ctx,(num == 2) ?s->ctx->md5:s->ctx->sha1, NULL); EVP_DigestUpdate(&md_ctx,&(s->s3->client_random[0]),SSL3_RANDOM_SIZE); @@ -1791,44 +1946,41 @@ int ssl3_send_server_key_exchange(SSL *s) } else #endif -#if !defined(OPENSSL_NO_DSA) - if (pkey->type == EVP_PKEY_DSA) + if (md) { - /* lets do DSS */ - EVP_SignInit_ex(&md_ctx,EVP_dss1(), NULL); - EVP_SignUpdate(&md_ctx,&(s->s3->client_random[0]),SSL3_RANDOM_SIZE); - EVP_SignUpdate(&md_ctx,&(s->s3->server_random[0]),SSL3_RANDOM_SIZE); - EVP_SignUpdate(&md_ctx,&(d[4]),n); - if (!EVP_SignFinal(&md_ctx,&(p[2]), - (unsigned int *)&i,pkey)) + /* For TLS1.2 and later send signature + * algorithm */ + if (TLS1_get_version(s) >= TLS1_2_VERSION) { - SSLerr(SSL_F_SSL3_SEND_SERVER_KEY_EXCHANGE,ERR_LIB_DSA); - goto err; + if (!tls12_get_sigandhash(p, pkey, md)) + { + /* Should never happen */ + al=SSL_AD_INTERNAL_ERROR; + SSLerr(SSL_F_SSL3_SEND_SERVER_KEY_EXCHANGE,ERR_R_INTERNAL_ERROR); + goto f_err; + } + p+=2; } - s2n(i,p); - n+=i+2; - } - else +#ifdef SSL_DEBUG + fprintf(stderr, "Using hash %s\n", + EVP_MD_name(md)); #endif -#if !defined(OPENSSL_NO_ECDSA) - if (pkey->type == EVP_PKEY_EC) - { - /* let's do ECDSA */ - EVP_SignInit_ex(&md_ctx,EVP_ecdsa(), NULL); + EVP_SignInit_ex(&md_ctx, md, NULL); EVP_SignUpdate(&md_ctx,&(s->s3->client_random[0]),SSL3_RANDOM_SIZE); EVP_SignUpdate(&md_ctx,&(s->s3->server_random[0]),SSL3_RANDOM_SIZE); EVP_SignUpdate(&md_ctx,&(d[4]),n); if (!EVP_SignFinal(&md_ctx,&(p[2]), (unsigned int *)&i,pkey)) { - SSLerr(SSL_F_SSL3_SEND_SERVER_KEY_EXCHANGE,ERR_LIB_ECDSA); + SSLerr(SSL_F_SSL3_SEND_SERVER_KEY_EXCHANGE,ERR_LIB_EVP); goto err; } s2n(i,p); n+=i+2; + if (TLS1_get_version(s) >= TLS1_2_VERSION) + n+= 2; } else -#endif { /* Is this error check actually needed? */ al=SSL_AD_HANDSHAKE_FAILURE; @@ -1881,6 +2033,14 @@ int ssl3_send_certificate_request(SSL *s) p+=n; n++; + if (TLS1_get_version(s) >= TLS1_2_VERSION) + { + nl = tls12_get_req_sig_algs(s, p + 2); + s2n(nl, p); + p += nl + 2; + n += nl + 2; + } + off=n; p+=2; n+=2; @@ -2600,6 +2760,44 @@ int ssl3_get_client_key_exchange(SSL *s) } else #endif +#ifndef OPENSSL_NO_SRP + if (alg_k & SSL_kSRP) + { + int param_len; + + n2s(p,i); + param_len=i+2; + if (param_len > n) + { + al=SSL_AD_DECODE_ERROR; + SSLerr(SSL_F_SSL3_GET_CLIENT_KEY_EXCHANGE,SSL_R_BAD_SRP_A_LENGTH); + goto f_err; + } + if (!(s->srp_ctx.A=BN_bin2bn(p,i,NULL))) + { + SSLerr(SSL_F_SSL3_GET_CLIENT_KEY_EXCHANGE,ERR_R_BN_LIB); + goto err; + } + if (s->session->srp_username != NULL) + OPENSSL_free(s->session->srp_username); + s->session->srp_username = BUF_strdup(s->srp_ctx.login); + if (s->session->srp_username == NULL) + { + SSLerr(SSL_F_SSL3_GET_CLIENT_KEY_EXCHANGE, + ERR_R_MALLOC_FAILURE); + goto err; + } + + if ((s->session->master_key_length = SRP_generate_server_master_secret(s,s->session->master_key))<0) + { + SSLerr(SSL_F_SSL3_GET_CLIENT_KEY_EXCHANGE,ERR_R_INTERNAL_ERROR); + goto err; + } + + p+=i; + } + else +#endif /* OPENSSL_NO_SRP */ if (alg_k & SSL_kGOST) { int ret = 0; @@ -2683,7 +2881,7 @@ int ssl3_get_client_key_exchange(SSL *s) return(1); f_err: ssl3_send_alert(s,SSL3_AL_FATAL,al); -#if !defined(OPENSSL_NO_DH) || !defined(OPENSSL_NO_RSA) || !defined(OPENSSL_NO_ECDH) +#if !defined(OPENSSL_NO_DH) || !defined(OPENSSL_NO_RSA) || !defined(OPENSSL_NO_ECDH) || defined(OPENSSL_NO_SRP) err: #endif #ifndef OPENSSL_NO_ECDH @@ -2704,12 +2902,15 @@ int ssl3_get_cert_verify(SSL *s) long n; int type=0,i,j; X509 *peer; + const EVP_MD *md = NULL; + EVP_MD_CTX mctx; + EVP_MD_CTX_init(&mctx); n=s->method->ssl_get_message(s, SSL3_ST_SR_CERT_VRFY_A, SSL3_ST_SR_CERT_VRFY_B, -1, - 514, /* 514? */ + 516, /* Enough for 4096 bit RSA key with TLS v1.2 */ &ok); if (!ok) return((int)n); @@ -2772,6 +2973,36 @@ int ssl3_get_cert_verify(SSL *s) } else { + if (TLS1_get_version(s) >= TLS1_2_VERSION) + { + int sigalg = tls12_get_sigid(pkey); + /* Should never happen */ + if (sigalg == -1) + { + SSLerr(SSL_F_SSL3_GET_CERT_VERIFY,ERR_R_INTERNAL_ERROR); + al=SSL_AD_INTERNAL_ERROR; + goto f_err; + } + /* Check key type is consistent with signature */ + if (sigalg != (int)p[1]) + { + SSLerr(SSL_F_SSL3_GET_CERT_VERIFY,SSL_R_WRONG_SIGNATURE_TYPE); + al=SSL_AD_DECODE_ERROR; + goto f_err; + } + md = tls12_get_hash(p[0]); + if (md == NULL) + { + SSLerr(SSL_F_SSL3_GET_CERT_VERIFY,SSL_R_UNKNOWN_DIGEST); + al=SSL_AD_DECODE_ERROR; + goto f_err; + } +#ifdef SSL_DEBUG +fprintf(stderr, "USING TLSv1.2 HASH %s\n", EVP_MD_name(md)); +#endif + p += 2; + n -= 2; + } n2s(p,i); n-=2; if (i > n) @@ -2789,6 +3020,37 @@ int ssl3_get_cert_verify(SSL *s) goto f_err; } + if (TLS1_get_version(s) >= TLS1_2_VERSION) + { + long hdatalen = 0; + void *hdata; + hdatalen = BIO_get_mem_data(s->s3->handshake_buffer, &hdata); + if (hdatalen <= 0) + { + SSLerr(SSL_F_SSL3_GET_CERT_VERIFY, ERR_R_INTERNAL_ERROR); + al=SSL_AD_INTERNAL_ERROR; + goto f_err; + } +#ifdef SSL_DEBUG + fprintf(stderr, "Using TLS 1.2 with client verify alg %s\n", + EVP_MD_name(md)); +#endif + if (!EVP_VerifyInit_ex(&mctx, md, NULL) + || !EVP_VerifyUpdate(&mctx, hdata, hdatalen)) + { + SSLerr(SSL_F_SSL3_GET_CERT_VERIFY, ERR_R_EVP_LIB); + al=SSL_AD_INTERNAL_ERROR; + goto f_err; + } + + if (EVP_VerifyFinal(&mctx, p , i, pkey) <= 0) + { + al=SSL_AD_DECRYPT_ERROR; + SSLerr(SSL_F_SSL3_GET_CERT_VERIFY,SSL_R_BAD_SIGNATURE); + goto f_err; + } + } + else #ifndef OPENSSL_NO_RSA if (pkey->type == EVP_PKEY_RSA) { @@ -2879,6 +3141,13 @@ f_err: ssl3_send_alert(s,SSL3_AL_FATAL,al); } end: + if (s->s3->handshake_buffer) + { + BIO_free(s->s3->handshake_buffer); + s->s3->handshake_buffer = NULL; + s->s3->flags &= ~TLS1_FLAGS_KEEP_HANDSHAKE; + } + EVP_MD_CTX_cleanup(&mctx); EVP_PKEY_free(pkey); return(ret); } @@ -2991,6 +3260,12 @@ int ssl3_get_client_certificate(SSL *s) al=SSL_AD_HANDSHAKE_FAILURE; goto f_err; } + /* No client certificate so digest cached records */ + if (s->s3->handshake_buffer && !ssl3_digest_cached_records(s)) + { + al=SSL_AD_INTERNAL_ERROR; + goto f_err; + } } else { @@ -3067,13 +3342,17 @@ int ssl3_send_server_certificate(SSL *s) /* SSL3_ST_SW_CERT_B */ return(ssl3_do_write(s,SSL3_RT_HANDSHAKE)); } + #ifndef OPENSSL_NO_TLSEXT +/* send a new session ticket (not necessarily for a new session) */ int ssl3_send_newsession_ticket(SSL *s) { if (s->state == SSL3_ST_SW_SESSION_TICKET_A) { unsigned char *p, *senc, *macstart; - int len, slen; + const unsigned char *const_p; + int len, slen_full, slen; + SSL_SESSION *sess; unsigned int hlen; EVP_CIPHER_CTX ctx; HMAC_CTX hctx; @@ -3082,12 +3361,38 @@ int ssl3_send_newsession_ticket(SSL *s) unsigned char key_name[16]; /* get session encoding length */ - slen = i2d_SSL_SESSION(s->session, NULL); + slen_full = i2d_SSL_SESSION(s->session, NULL); /* Some length values are 16 bits, so forget it if session is * too long */ - if (slen > 0xFF00) + if (slen_full > 0xFF00) + return -1; + senc = OPENSSL_malloc(slen_full); + if (!senc) + return -1; + p = senc; + i2d_SSL_SESSION(s->session, &p); + + /* create a fresh copy (not shared with other threads) to clean up */ + const_p = senc; + sess = d2i_SSL_SESSION(NULL, &const_p, slen_full); + if (sess == NULL) + { + OPENSSL_free(senc); return -1; + } + sess->session_id_length = 0; /* ID is irrelevant for the ticket */ + + slen = i2d_SSL_SESSION(sess, NULL); + if (slen > slen_full) /* shouldn't ever happen */ + { + OPENSSL_free(senc); + return -1; + } + p = senc; + i2d_SSL_SESSION(sess, &p); + SSL_SESSION_free(sess); + /* Grow buffer if need be: the length calculation is as * follows 1 (size of message name) + 3 (message length * bytes) + 4 (ticket lifetime hint) + 2 (ticket length) + @@ -3099,11 +3404,6 @@ int ssl3_send_newsession_ticket(SSL *s) 26 + EVP_MAX_IV_LENGTH + EVP_MAX_BLOCK_LENGTH + EVP_MAX_MD_SIZE + slen)) return -1; - senc = OPENSSL_malloc(slen); - if (!senc) - return -1; - p = senc; - i2d_SSL_SESSION(s->session, &p); p=(unsigned char *)s->init_buf->data; /* do the header */ @@ -3134,7 +3434,13 @@ int ssl3_send_newsession_ticket(SSL *s) tlsext_tick_md(), NULL); memcpy(key_name, tctx->tlsext_tick_key_name, 16); } - l2n(s->session->tlsext_tick_lifetime_hint, p); + + /* Ticket lifetime hint (advisory only): + * We leave this unspecified for resumed session (for simplicity), + * and guess that tickets for new sessions will live as long + * as their sessions. */ + l2n(s->hit ? 0 : s->session->timeout, p); + /* Skip ticket length for now */ p += 2; /* Output key name */ @@ -3209,4 +3515,72 @@ int ssl3_send_cert_status(SSL *s) /* SSL3_ST_SW_CERT_STATUS_B */ return(ssl3_do_write(s,SSL3_RT_HANDSHAKE)); } + +# ifndef OPENSSL_NO_NEXTPROTONEG +/* ssl3_get_next_proto reads a Next Protocol Negotiation handshake message. It + * sets the next_proto member in s if found */ +int ssl3_get_next_proto(SSL *s) + { + int ok; + int proto_len, padding_len; + long n; + const unsigned char *p; + + /* Clients cannot send a NextProtocol message if we didn't see the + * extension in their ClientHello */ + if (!s->s3->next_proto_neg_seen) + { + SSLerr(SSL_F_SSL3_GET_NEXT_PROTO,SSL_R_GOT_NEXT_PROTO_WITHOUT_EXTENSION); + return -1; + } + + n=s->method->ssl_get_message(s, + SSL3_ST_SR_NEXT_PROTO_A, + SSL3_ST_SR_NEXT_PROTO_B, + SSL3_MT_NEXT_PROTO, + 514, /* See the payload format below */ + &ok); + + if (!ok) + return((int)n); + + /* s->state doesn't reflect whether ChangeCipherSpec has been received + * in this handshake, but s->s3->change_cipher_spec does (will be reset + * by ssl3_get_finished). */ + if (!s->s3->change_cipher_spec) + { + SSLerr(SSL_F_SSL3_GET_NEXT_PROTO,SSL_R_GOT_NEXT_PROTO_BEFORE_A_CCS); + return -1; + } + + if (n < 2) + return 0; /* The body must be > 1 bytes long */ + + p=(unsigned char *)s->init_msg; + + /* The payload looks like: + * uint8 proto_len; + * uint8 proto[proto_len]; + * uint8 padding_len; + * uint8 padding[padding_len]; + */ + proto_len = p[0]; + if (proto_len + 2 > s->init_num) + return 0; + padding_len = p[proto_len + 1]; + if (proto_len + padding_len + 2 != s->init_num) + return 0; + + s->next_proto_negotiated = OPENSSL_malloc(proto_len); + if (!s->next_proto_negotiated) + { + SSLerr(SSL_F_SSL3_GET_NEXT_PROTO,ERR_R_MALLOC_FAILURE); + return 0; + } + memcpy(s->next_proto_negotiated, p + 1, proto_len); + s->next_proto_negotiated_len = proto_len; + + return 1; + } +# endif #endif diff --git a/crypto/openssl/crypto/evp/m_ecdsa.c b/crypto/openssl/ssl/srtp.h similarity index 86% copy from crypto/openssl/crypto/evp/m_ecdsa.c copy to crypto/openssl/ssl/srtp.h index 8d87a49ebe..c0cf33ef28 100644 --- a/crypto/openssl/crypto/evp/m_ecdsa.c +++ b/crypto/openssl/ssl/srtp.h @@ -1,57 +1,4 @@ -/* crypto/evp/m_ecdsa.c */ -/* ==================================================================== - * Copyright (c) 1998-2002 The OpenSSL Project. All rights reserved. - * - * Redistribution and use in source and binary forms, with or without - * modification, are permitted provided that the following conditions - * are met: - * - * 1. Redistributions of source code must retain the above copyright - * notice, this list of conditions and the following disclaimer. - * - * 2. Redistributions in binary form must reproduce the above copyright - * notice, this list of conditions and the following disclaimer in - * the documentation and/or other materials provided with the - * distribution. - * - * 3. All advertising materials mentioning features or use of this - * software must display the following acknowledgment: - * "This product includes software developed by the OpenSSL Project - * for use in the OpenSSL Toolkit. (http://www.openssl.org/)" - * - * 4. The names "OpenSSL Toolkit" and "OpenSSL Project" must not be used to - * endorse or promote products derived from this software without - * prior written permission. For written permission, please contact - * openssl-core@openssl.org. - * - * 5. Products derived from this software may not be called "OpenSSL" - * nor may "OpenSSL" appear in their names without prior written - * permission of the OpenSSL Project. - * - * 6. Redistributions of any form whatsoever must retain the following - * acknowledgment: - * "This product includes software developed by the OpenSSL Project - * for use in the OpenSSL Toolkit (http://www.openssl.org/)" - * - * THIS SOFTWARE IS PROVIDED BY THE OpenSSL PROJECT ``AS IS'' AND ANY - * EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE - * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR - * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE OpenSSL PROJECT OR - * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, - * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT - * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; - * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) - * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, - * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) - * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED - * OF THE POSSIBILITY OF SUCH DAMAGE. - * ==================================================================== - * - * This product includes cryptographic software written by Eric Young - * (eay@cryptsoft.com). This product includes software written by Tim - * Hudson (tjh@cryptsoft.com). - * - */ +/* ssl/tls1.h */ /* Copyright (C) 1995-1998 Eric Young (eay@cryptsoft.com) * All rights reserved. * @@ -108,41 +55,91 @@ * copied and put under another distribution licence * [including the GNU Public Licence.] */ +/* ==================================================================== + * Copyright (c) 1998-2006 The OpenSSL Project. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * + * 3. All advertising materials mentioning features or use of this + * software must display the following acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit. (http://www.openssl.org/)" + * + * 4. The names "OpenSSL Toolkit" and "OpenSSL Project" must not be used to + * endorse or promote products derived from this software without + * prior written permission. For written permission, please contact + * openssl-core@openssl.org. + * + * 5. Products derived from this software may not be called "OpenSSL" + * nor may "OpenSSL" appear in their names without prior written + * permission of the OpenSSL Project. + * + * 6. Redistributions of any form whatsoever must retain the following + * acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit (http://www.openssl.org/)" + * + * THIS SOFTWARE IS PROVIDED BY THE OpenSSL PROJECT ``AS IS'' AND ANY + * EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR + * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE OpenSSL PROJECT OR + * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED + * OF THE POSSIBILITY OF SUCH DAMAGE. + * ==================================================================== + * + * This product includes cryptographic software written by Eric Young + * (eay@cryptsoft.com). This product includes software written by Tim + * Hudson (tjh@cryptsoft.com). + * + */ +/* + DTLS code by Eric Rescorla + + Copyright (C) 2006, Network Resonance, Inc. + Copyright (C) 2011, RTFM, Inc. +*/ + +#ifndef HEADER_D1_SRTP_H +#define HEADER_D1_SRTP_H -#include -#include "cryptlib.h" -#include -#include -#include +#ifdef __cplusplus +extern "C" { +#endif -#ifndef OPENSSL_NO_SHA -static int init(EVP_MD_CTX *ctx) - { return SHA1_Init(ctx->md_data); } + +#define SRTP_AES128_CM_SHA1_80 0x0001 +#define SRTP_AES128_CM_SHA1_32 0x0002 +#define SRTP_AES128_F8_SHA1_80 0x0003 +#define SRTP_AES128_F8_SHA1_32 0x0004 +#define SRTP_NULL_SHA1_80 0x0005 +#define SRTP_NULL_SHA1_32 0x0006 -static int update(EVP_MD_CTX *ctx,const void *data,size_t count) - { return SHA1_Update(ctx->md_data,data,count); } +int SSL_CTX_set_tlsext_use_srtp(SSL_CTX *ctx, const char *profiles); +int SSL_set_tlsext_use_srtp(SSL *ctx, const char *profiles); +SRTP_PROTECTION_PROFILE *SSL_get_selected_srtp_profile(SSL *s); -static int final(EVP_MD_CTX *ctx,unsigned char *md) - { return SHA1_Final(md,ctx->md_data); } +STACK_OF(SRTP_PROTECTION_PROFILE) *SSL_get_srtp_profiles(SSL *ssl); +SRTP_PROTECTION_PROFILE *SSL_get_selected_srtp_profile(SSL *s); -static const EVP_MD ecdsa_md= - { - NID_ecdsa_with_SHA1, - NID_ecdsa_with_SHA1, - SHA_DIGEST_LENGTH, - EVP_MD_FLAG_PKEY_DIGEST, - init, - update, - final, - NULL, - NULL, - EVP_PKEY_ECDSA_method, - SHA_CBLOCK, - sizeof(EVP_MD *)+sizeof(SHA_CTX), - }; +#ifdef __cplusplus +} +#endif -const EVP_MD *EVP_ecdsa(void) - { - return(&ecdsa_md); - } #endif + diff --git a/crypto/openssl/ssl/ssl.h b/crypto/openssl/ssl/ssl.h index 8f922eea72..525602e4c2 100644 --- a/crypto/openssl/ssl/ssl.h +++ b/crypto/openssl/ssl/ssl.h @@ -252,6 +252,7 @@ extern "C" { #define SSL_TXT_kEECDH "kEECDH" #define SSL_TXT_kPSK "kPSK" #define SSL_TXT_kGOST "kGOST" +#define SSL_TXT_kSRP "kSRP" #define SSL_TXT_aRSA "aRSA" #define SSL_TXT_aDSS "aDSS" @@ -275,6 +276,7 @@ extern "C" { #define SSL_TXT_ECDSA "ECDSA" #define SSL_TXT_KRB5 "KRB5" #define SSL_TXT_PSK "PSK" +#define SSL_TXT_SRP "SRP" #define SSL_TXT_DES "DES" #define SSL_TXT_3DES "3DES" @@ -285,6 +287,7 @@ extern "C" { #define SSL_TXT_AES128 "AES128" #define SSL_TXT_AES256 "AES256" #define SSL_TXT_AES "AES" +#define SSL_TXT_AES_GCM "AESGCM" #define SSL_TXT_CAMELLIA128 "CAMELLIA128" #define SSL_TXT_CAMELLIA256 "CAMELLIA256" #define SSL_TXT_CAMELLIA "CAMELLIA" @@ -294,10 +297,14 @@ extern "C" { #define SSL_TXT_SHA "SHA" /* same as "SHA1" */ #define SSL_TXT_GOST94 "GOST94" #define SSL_TXT_GOST89MAC "GOST89MAC" +#define SSL_TXT_SHA256 "SHA256" +#define SSL_TXT_SHA384 "SHA384" #define SSL_TXT_SSLV2 "SSLv2" #define SSL_TXT_SSLV3 "SSLv3" #define SSL_TXT_TLSV1 "TLSv1" +#define SSL_TXT_TLSV1_1 "TLSv1.1" +#define SSL_TXT_TLSV1_2 "TLSv1.2" #define SSL_TXT_EXP "EXP" #define SSL_TXT_EXPORT "EXPORT" @@ -356,9 +363,29 @@ extern "C" { * in SSL_CTX. */ typedef struct ssl_st *ssl_crock_st; typedef struct tls_session_ticket_ext_st TLS_SESSION_TICKET_EXT; +typedef struct ssl_method_st SSL_METHOD; +typedef struct ssl_cipher_st SSL_CIPHER; +typedef struct ssl_session_st SSL_SESSION; + +DECLARE_STACK_OF(SSL_CIPHER) + +/* SRTP protection profiles for use with the use_srtp extension (RFC 5764)*/ +typedef struct srtp_protection_profile_st + { + const char *name; + unsigned long id; + } SRTP_PROTECTION_PROFILE; + +DECLARE_STACK_OF(SRTP_PROTECTION_PROFILE) + +typedef int (*tls_session_ticket_ext_cb_fn)(SSL *s, const unsigned char *data, int len, void *arg); +typedef int (*tls_session_secret_cb_fn)(SSL *s, void *secret, int *secret_len, STACK_OF(SSL_CIPHER) *peer_ciphers, SSL_CIPHER **cipher, void *arg); + + +#ifndef OPENSSL_NO_SSL_INTERN /* used to hold info on the particular ciphers used */ -typedef struct ssl_cipher_st +struct ssl_cipher_st { int valid; const char *name; /* text name */ @@ -375,15 +402,11 @@ typedef struct ssl_cipher_st unsigned long algorithm2; /* Extra flags */ int strength_bits; /* Number of bits really used */ int alg_bits; /* Number of bits for algorithm */ - } SSL_CIPHER; - -DECLARE_STACK_OF(SSL_CIPHER) + }; -typedef int (*tls_session_ticket_ext_cb_fn)(SSL *s, const unsigned char *data, int len, void *arg); -typedef int (*tls_session_secret_cb_fn)(SSL *s, void *secret, int *secret_len, STACK_OF(SSL_CIPHER) *peer_ciphers, SSL_CIPHER **cipher, void *arg); /* Used to hold functions for SSLv2 or SSLv3/TLSv1 functions */ -typedef struct ssl_method_st +struct ssl_method_st { int version; int (*ssl_new)(SSL *s); @@ -416,7 +439,7 @@ typedef struct ssl_method_st int (*ssl_version)(void); long (*ssl_callback_ctrl)(SSL *s, int cb_id, void (*fp)(void)); long (*ssl_ctx_callback_ctrl)(SSL_CTX *s, int cb_id, void (*fp)(void)); - } SSL_METHOD; + }; /* Lets make this into an ASN.1 type structure as follows * SSL_SESSION_ID ::= SEQUENCE { @@ -433,14 +456,17 @@ typedef struct ssl_method_st * Session_ID_context [ 4 ] EXPLICIT OCTET STRING, -- the Session ID context * Verify_result [ 5 ] EXPLICIT INTEGER, -- X509_V_... code for `Peer' * HostName [ 6 ] EXPLICIT OCTET STRING, -- optional HostName from servername TLS extension - * ECPointFormatList [ 7 ] OCTET STRING, -- optional EC point format list from TLS extension - * PSK_identity_hint [ 8 ] EXPLICIT OCTET STRING, -- optional PSK identity hint - * PSK_identity [ 9 ] EXPLICIT OCTET STRING -- optional PSK identity + * PSK_identity_hint [ 7 ] EXPLICIT OCTET STRING, -- optional PSK identity hint + * PSK_identity [ 8 ] EXPLICIT OCTET STRING, -- optional PSK identity + * Ticket_lifetime_hint [9] EXPLICIT INTEGER, -- server's lifetime hint for session ticket + * Ticket [10] EXPLICIT OCTET STRING, -- session ticket (clients only) + * Compression_meth [11] EXPLICIT OCTET STRING, -- optional compression method + * SRP_username [ 12 ] EXPLICIT OCTET STRING -- optional SRP username * } * Look in ssl/ssl_asn1.c for more details * I'm using EXPLICIT tags so I can read the damn things using asn1parse :-). */ -typedef struct ssl_session_st +struct ssl_session_st { int ssl_version; /* what ssl version session info is * being kept in here? */ @@ -512,8 +538,12 @@ typedef struct ssl_session_st size_t tlsext_ticklen; /* Session ticket length */ long tlsext_tick_lifetime_hint; /* Session lifetime hint in seconds */ #endif - } SSL_SESSION; +#ifndef OPENSSL_NO_SRP + char *srp_username; +#endif + }; +#endif #define SSL_OP_MICROSOFT_SESS_ID_BUG 0x00000001L #define SSL_OP_NETSCAPE_CHALLENGE_BUG 0x00000002L @@ -526,6 +556,7 @@ typedef struct ssl_session_st #define SSL_OP_SSLEAY_080_CLIENT_DH_BUG 0x00000080L #define SSL_OP_TLS_D5_BUG 0x00000100L #define SSL_OP_TLS_BLOCK_PADDING_BUG 0x00000200L +#define SSL_OP_NO_TLSv1_1 0x00000400L /* Disable SSL 3.0/TLS 1.0 CBC vulnerability workaround that was added * in OpenSSL 0.9.6d. Usually (depending on the application protocol) @@ -536,7 +567,7 @@ typedef struct ssl_session_st /* SSL_OP_ALL: various bug workarounds that should be rather harmless. * This used to be 0x000FFFFFL before 0.9.7. */ -#define SSL_OP_ALL 0x80000FFFL +#define SSL_OP_ALL 0x80000BFFL /* DTLS options */ #define SSL_OP_NO_QUERY_MTU 0x00001000L @@ -572,11 +603,16 @@ typedef struct ssl_session_st #define SSL_OP_NO_SSLv2 0x01000000L #define SSL_OP_NO_SSLv3 0x02000000L #define SSL_OP_NO_TLSv1 0x04000000L +#define SSL_OP_NO_TLSv1_2 0x08000000L +/* These next two were never actually used for anything since SSLeay + * zap so we have some more flags. + */ /* The next flag deliberately changes the ciphertest, this is a check * for the PKCS#1 attack */ -#define SSL_OP_PKCS1_CHECK_1 0x08000000L -#define SSL_OP_PKCS1_CHECK_2 0x10000000L +#define SSL_OP_PKCS1_CHECK_1 0x0 +#define SSL_OP_PKCS1_CHECK_2 0x0 + #define SSL_OP_NETSCAPE_CA_DN_BUG 0x20000000L #define SSL_OP_NETSCAPE_DEMO_CIPHER_CHANGE_BUG 0x40000000L /* Make server add server-hello extension from early version of @@ -637,12 +673,53 @@ typedef struct ssl_session_st #define SSL_get_secure_renegotiation_support(ssl) \ SSL_ctrl((ssl), SSL_CTRL_GET_RI_SUPPORT, 0, NULL) +#ifndef OPENSSL_NO_HEARTBEATS +#define SSL_heartbeat(ssl) \ + SSL_ctrl((ssl),SSL_CTRL_TLS_EXT_SEND_HEARTBEAT,0,NULL) +#endif + void SSL_CTX_set_msg_callback(SSL_CTX *ctx, void (*cb)(int write_p, int version, int content_type, const void *buf, size_t len, SSL *ssl, void *arg)); void SSL_set_msg_callback(SSL *ssl, void (*cb)(int write_p, int version, int content_type, const void *buf, size_t len, SSL *ssl, void *arg)); #define SSL_CTX_set_msg_callback_arg(ctx, arg) SSL_CTX_ctrl((ctx), SSL_CTRL_SET_MSG_CALLBACK_ARG, 0, (arg)) #define SSL_set_msg_callback_arg(ssl, arg) SSL_ctrl((ssl), SSL_CTRL_SET_MSG_CALLBACK_ARG, 0, (arg)) +#ifndef OPENSSL_NO_SRP +#ifndef OPENSSL_NO_SSL_INTERN + +typedef struct srp_ctx_st + { + /* param for all the callbacks */ + void *SRP_cb_arg; + /* set client Hello login callback */ + int (*TLS_ext_srp_username_callback)(SSL *, int *, void *); + /* set SRP N/g param callback for verification */ + int (*SRP_verify_param_callback)(SSL *, void *); + /* set SRP client passwd callback */ + char *(*SRP_give_srp_client_pwd_callback)(SSL *, void *); + + char *login; + BIGNUM *N,*g,*s,*B,*A; + BIGNUM *a,*b,*v; + char *info; + int strength; + + unsigned long srp_Mask; + } SRP_CTX; + +#endif + +/* see tls_srp.c */ +int SSL_SRP_CTX_init(SSL *s); +int SSL_CTX_SRP_CTX_init(SSL_CTX *ctx); +int SSL_SRP_CTX_free(SSL *ctx); +int SSL_CTX_SRP_CTX_free(SSL_CTX *ctx); +int SSL_srp_server_param_with_username(SSL *s, int *ad); +int SRP_generate_server_master_secret(SSL *s,unsigned char *master_key); +int SRP_Calc_A_param(SSL *s); +int SRP_generate_client_master_secret(SSL *s,unsigned char *master_key); + +#endif #if defined(OPENSSL_SYS_MSDOS) && !defined(OPENSSL_SYS_WIN32) #define SSL_MAX_CERT_LIST_DEFAULT 1024*30 /* 30k max cert list :-) */ @@ -668,7 +745,11 @@ void SSL_set_msg_callback(SSL *ssl, void (*cb)(int write_p, int version, int con typedef int (*GEN_SESSION_CB)(const SSL *ssl, unsigned char *id, unsigned int *id_len); -typedef struct ssl_comp_st +typedef struct ssl_comp_st SSL_COMP; + +#ifndef OPENSSL_NO_SSL_INTERN + +struct ssl_comp_st { int id; const char *name; @@ -677,7 +758,7 @@ typedef struct ssl_comp_st #else char *method; #endif - } SSL_COMP; + }; DECLARE_STACK_OF(SSL_COMP) DECLARE_LHASH_OF(SSL_SESSION); @@ -846,7 +927,6 @@ struct ssl_ctx_st /* Callback for status request */ int (*tlsext_status_cb)(SSL *ssl, void *arg); void *tlsext_status_arg; - /* draft-rescorla-tls-opaque-prf-input-00.txt information */ int (*tlsext_opaque_prf_input_callback)(SSL *, void *peerinput, size_t len, void *arg); void *tlsext_opaque_prf_input_callback_arg; @@ -866,9 +946,37 @@ struct ssl_ctx_st unsigned int freelist_max_len; struct ssl3_buf_freelist_st *wbuf_freelist; struct ssl3_buf_freelist_st *rbuf_freelist; +#endif +#ifndef OPENSSL_NO_SRP + SRP_CTX srp_ctx; /* ctx for SRP authentication */ +#endif + +#ifndef OPENSSL_NO_TLSEXT +# ifndef OPENSSL_NO_NEXTPROTONEG + /* Next protocol negotiation information */ + /* (for experimental NPN extension). */ + + /* For a server, this contains a callback function by which the set of + * advertised protocols can be provided. */ + int (*next_protos_advertised_cb)(SSL *s, const unsigned char **buf, + unsigned int *len, void *arg); + void *next_protos_advertised_cb_arg; + /* For a client, this contains a callback function that selects the + * next protocol from the list provided by the server. */ + int (*next_proto_select_cb)(SSL *s, unsigned char **out, + unsigned char *outlen, + const unsigned char *in, + unsigned int inlen, + void *arg); + void *next_proto_select_cb_arg; +# endif + /* SRTP profiles we are willing to do from RFC 5764 */ + STACK_OF(SRTP_PROTECTION_PROFILE) *srtp_profiles; #endif }; +#endif + #define SSL_SESS_CACHE_OFF 0x0000 #define SSL_SESS_CACHE_CLIENT 0x0001 #define SSL_SESS_CACHE_SERVER 0x0002 @@ -921,6 +1029,32 @@ int SSL_CTX_set_client_cert_engine(SSL_CTX *ctx, ENGINE *e); #endif void SSL_CTX_set_cookie_generate_cb(SSL_CTX *ctx, int (*app_gen_cookie_cb)(SSL *ssl, unsigned char *cookie, unsigned int *cookie_len)); void SSL_CTX_set_cookie_verify_cb(SSL_CTX *ctx, int (*app_verify_cookie_cb)(SSL *ssl, unsigned char *cookie, unsigned int cookie_len)); +#ifndef OPENSSL_NO_NEXTPROTONEG +void SSL_CTX_set_next_protos_advertised_cb(SSL_CTX *s, + int (*cb) (SSL *ssl, + const unsigned char **out, + unsigned int *outlen, + void *arg), + void *arg); +void SSL_CTX_set_next_proto_select_cb(SSL_CTX *s, + int (*cb) (SSL *ssl, + unsigned char **out, + unsigned char *outlen, + const unsigned char *in, + unsigned int inlen, + void *arg), + void *arg); + +int SSL_select_next_proto(unsigned char **out, unsigned char *outlen, + const unsigned char *in, unsigned int inlen, + const unsigned char *client, unsigned int client_len); +void SSL_get0_next_proto_negotiated(const SSL *s, + const unsigned char **data, unsigned *len); + +#define OPENSSL_NPN_UNSUPPORTED 0 +#define OPENSSL_NPN_NEGOTIATED 1 +#define OPENSSL_NPN_NO_OVERLAP 2 +#endif #ifndef OPENSSL_NO_PSK /* the maximum length of the buffer given to callbacks containing the @@ -961,6 +1095,8 @@ const char *SSL_get_psk_identity(const SSL *s); #define SSL_MAC_FLAG_READ_MAC_STREAM 1 #define SSL_MAC_FLAG_WRITE_MAC_STREAM 2 +#ifndef OPENSSL_NO_SSL_INTERN + struct ssl_st { /* protocol version @@ -1005,9 +1141,7 @@ struct ssl_st int server; /* are we the server side? - mostly used by SSL_clear*/ - int new_session;/* 1 if we are to use a new session. - * 2 if we are a server and are inside a handshake - * (i.e. not just sending a HelloRequest) + int new_session;/* Generate a new session or reuse an old one. * NB: For servers, the 'new' session may actually be a previously * cached session or even the previous session unless * SSL_OP_NO_SESSION_RESUMPTION_ON_RENEGOTIATION is set */ @@ -1177,12 +1311,46 @@ struct ssl_st void *tls_session_secret_cb_arg; SSL_CTX * initial_ctx; /* initial ctx, used to store sessions */ + +#ifndef OPENSSL_NO_NEXTPROTONEG + /* Next protocol negotiation. For the client, this is the protocol that + * we sent in NextProtocol and is set when handling ServerHello + * extensions. + * + * For a server, this is the client's selected_protocol from + * NextProtocol and is set when handling the NextProtocol message, + * before the Finished message. */ + unsigned char *next_proto_negotiated; + unsigned char next_proto_negotiated_len; +#endif + #define session_ctx initial_ctx + + STACK_OF(SRTP_PROTECTION_PROFILE) *srtp_profiles; /* What we'll do */ + SRTP_PROTECTION_PROFILE *srtp_profile; /* What's been chosen */ + + unsigned int tlsext_heartbeat; /* Is use of the Heartbeat extension negotiated? + 0: disabled + 1: enabled + 2: enabled, but not allowed to send Requests + */ + unsigned int tlsext_hb_pending; /* Indicates if a HeartbeatRequest is in flight */ + unsigned int tlsext_hb_seq; /* HeartbeatRequest sequence number */ #else #define session_ctx ctx #endif /* OPENSSL_NO_TLSEXT */ + + int renegotiate;/* 1 if we are renegotiating. + * 2 if we are a server and are inside a handshake + * (i.e. not just sending a HelloRequest) */ + +#ifndef OPENSSL_NO_SRP + SRP_CTX srp_ctx; /* ctx for SRP authentication */ +#endif }; +#endif + #ifdef __cplusplus } #endif @@ -1192,6 +1360,7 @@ struct ssl_st #include /* This is mostly sslv3 with a few tweaks */ #include /* Datagram TLS */ #include +#include /* Support for the use_srtp extension */ #ifdef __cplusplus extern "C" { @@ -1408,6 +1577,20 @@ DECLARE_PEM_rw(SSL_SESSION, SSL_SESSION) #define SSL_CTRL_SET_TLSEXT_STATUS_REQ_OCSP_RESP 71 #define SSL_CTRL_SET_TLSEXT_TICKET_KEY_CB 72 + +#define SSL_CTRL_SET_TLS_EXT_SRP_USERNAME_CB 75 +#define SSL_CTRL_SET_SRP_VERIFY_PARAM_CB 76 +#define SSL_CTRL_SET_SRP_GIVE_CLIENT_PWD_CB 77 + +#define SSL_CTRL_SET_SRP_ARG 78 +#define SSL_CTRL_SET_TLS_EXT_SRP_USERNAME 79 +#define SSL_CTRL_SET_TLS_EXT_SRP_STRENGTH 80 +#define SSL_CTRL_SET_TLS_EXT_SRP_PASSWORD 81 +#ifndef OPENSSL_NO_HEARTBEATS +#define SSL_CTRL_TLS_EXT_SEND_HEARTBEAT 85 +#define SSL_CTRL_GET_TLS_EXT_HEARTBEAT_PENDING 86 +#define SSL_CTRL_SET_TLS_EXT_HEARTBEAT_NO_REQUESTS 87 +#endif #endif #define DTLS_CTRL_GET_TIMEOUT 73 @@ -1418,6 +1601,9 @@ DECLARE_PEM_rw(SSL_SESSION, SSL_SESSION) #define SSL_CTRL_CLEAR_OPTIONS 77 #define SSL_CTRL_CLEAR_MODE 78 +#define SSL_CTRL_GET_EXTRA_CHAIN_CERTS 82 +#define SSL_CTRL_CLEAR_EXTRA_CHAIN_CERTS 83 + #define DTLSv1_get_timeout(ssl, arg) \ SSL_ctrl(ssl,DTLS_CTRL_GET_TIMEOUT,0, (void *)arg) #define DTLSv1_handle_timeout(ssl) \ @@ -1454,6 +1640,10 @@ DECLARE_PEM_rw(SSL_SESSION, SSL_SESSION) #define SSL_CTX_add_extra_chain_cert(ctx,x509) \ SSL_CTX_ctrl(ctx,SSL_CTRL_EXTRA_CHAIN_CERT,0,(char *)x509) +#define SSL_CTX_get_extra_chain_certs(ctx,px509) \ + SSL_CTX_ctrl(ctx,SSL_CTRL_GET_EXTRA_CHAIN_CERTS,0,px509) +#define SSL_CTX_clear_extra_chain_certs(ctx) \ + SSL_CTX_ctrl(ctx,SSL_CTRL_CLEAR_EXTRA_CHAIN_CERTS,0,NULL) #ifndef OPENSSL_NO_BIO BIO_METHOD *BIO_f_ssl(void); @@ -1481,6 +1671,7 @@ const SSL_CIPHER *SSL_get_current_cipher(const SSL *s); int SSL_CIPHER_get_bits(const SSL_CIPHER *c,int *alg_bits); char * SSL_CIPHER_get_version(const SSL_CIPHER *c); const char * SSL_CIPHER_get_name(const SSL_CIPHER *c); +unsigned long SSL_CIPHER_get_id(const SSL_CIPHER *c); int SSL_get_fd(const SSL *s); int SSL_get_rfd(const SSL *s); @@ -1546,10 +1737,14 @@ long SSL_SESSION_set_time(SSL_SESSION *s, long t); long SSL_SESSION_get_timeout(const SSL_SESSION *s); long SSL_SESSION_set_timeout(SSL_SESSION *s, long t); void SSL_copy_session_id(SSL *to,const SSL *from); +X509 *SSL_SESSION_get0_peer(SSL_SESSION *s); +int SSL_SESSION_set1_id_context(SSL_SESSION *s,const unsigned char *sid_ctx, + unsigned int sid_ctx_len); SSL_SESSION *SSL_SESSION_new(void); const unsigned char *SSL_SESSION_get_id(const SSL_SESSION *s, unsigned int *len); +unsigned int SSL_SESSION_get_compress_id(const SSL_SESSION *s); #ifndef OPENSSL_NO_FP_API int SSL_SESSION_print_fp(FILE *fp,const SSL_SESSION *ses); #endif @@ -1612,6 +1807,30 @@ int SSL_set_trust(SSL *s, int trust); int SSL_CTX_set1_param(SSL_CTX *ctx, X509_VERIFY_PARAM *vpm); int SSL_set1_param(SSL *ssl, X509_VERIFY_PARAM *vpm); +#ifndef OPENSSL_NO_SRP +int SSL_CTX_set_srp_username(SSL_CTX *ctx,char *name); +int SSL_CTX_set_srp_password(SSL_CTX *ctx,char *password); +int SSL_CTX_set_srp_strength(SSL_CTX *ctx, int strength); +int SSL_CTX_set_srp_client_pwd_callback(SSL_CTX *ctx, + char *(*cb)(SSL *,void *)); +int SSL_CTX_set_srp_verify_param_callback(SSL_CTX *ctx, + int (*cb)(SSL *,void *)); +int SSL_CTX_set_srp_username_callback(SSL_CTX *ctx, + int (*cb)(SSL *,int *,void *)); +int SSL_CTX_set_srp_cb_arg(SSL_CTX *ctx, void *arg); + +int SSL_set_srp_server_param(SSL *s, const BIGNUM *N, const BIGNUM *g, + BIGNUM *sa, BIGNUM *v, char *info); +int SSL_set_srp_server_param_pw(SSL *s, const char *user, const char *pass, + const char *grp); + +BIGNUM *SSL_get_srp_g(SSL *s); +BIGNUM *SSL_get_srp_N(SSL *s); + +char *SSL_get_srp_username(SSL *s); +char *SSL_get_srp_userinfo(SSL *s); +#endif + void SSL_free(SSL *ssl); int SSL_accept(SSL *ssl); int SSL_connect(SSL *ssl); @@ -1647,6 +1866,15 @@ const SSL_METHOD *TLSv1_method(void); /* TLSv1.0 */ const SSL_METHOD *TLSv1_server_method(void); /* TLSv1.0 */ const SSL_METHOD *TLSv1_client_method(void); /* TLSv1.0 */ +const SSL_METHOD *TLSv1_1_method(void); /* TLSv1.1 */ +const SSL_METHOD *TLSv1_1_server_method(void); /* TLSv1.1 */ +const SSL_METHOD *TLSv1_1_client_method(void); /* TLSv1.1 */ + +const SSL_METHOD *TLSv1_2_method(void); /* TLSv1.2 */ +const SSL_METHOD *TLSv1_2_server_method(void); /* TLSv1.2 */ +const SSL_METHOD *TLSv1_2_client_method(void); /* TLSv1.2 */ + + const SSL_METHOD *DTLSv1_method(void); /* DTLSv1.0 */ const SSL_METHOD *DTLSv1_server_method(void); /* DTLSv1.0 */ const SSL_METHOD *DTLSv1_client_method(void); /* DTLSv1.0 */ @@ -1655,6 +1883,7 @@ STACK_OF(SSL_CIPHER) *SSL_get_ciphers(const SSL *s); int SSL_do_handshake(SSL *s); int SSL_renegotiate(SSL *s); +int SSL_renegotiate_abbreviated(SSL *s); int SSL_renegotiate_pending(SSL *s); int SSL_shutdown(SSL *s); @@ -1706,6 +1935,7 @@ void SSL_set_info_callback(SSL *ssl, void (*cb)(const SSL *ssl,int type,int val)); void (*SSL_get_info_callback(const SSL *ssl))(const SSL *ssl,int type,int val); int SSL_state(const SSL *ssl); +void SSL_set_state(SSL *ssl, int state); void SSL_set_verify_result(SSL *ssl,long v); long SSL_get_verify_result(const SSL *ssl); @@ -1806,6 +2036,9 @@ int SSL_set_session_ticket_ext_cb(SSL *s, tls_session_ticket_ext_cb_fn cb, /* Pre-shared secret session resumption functions */ int SSL_set_session_secret_cb(SSL *s, tls_session_secret_cb_fn tls_session_secret_cb, void *arg); +void SSL_set_debug(SSL *s, int debug); +int SSL_cache_hit(SSL *s); + /* BEGIN ERROR CODES */ /* The following lines are auto generated by the script mkerr.pl. Any changes * made after this point may be overwritten when the script is next run. @@ -1833,6 +2066,7 @@ void ERR_load_SSL_strings(void); #define SSL_F_DTLS1_GET_MESSAGE_FRAGMENT 253 #define SSL_F_DTLS1_GET_RECORD 254 #define SSL_F_DTLS1_HANDLE_TIMEOUT 297 +#define SSL_F_DTLS1_HEARTBEAT 305 #define SSL_F_DTLS1_OUTPUT_CERT_CHAIN 255 #define SSL_F_DTLS1_PREPROCESS_FRAGMENT 288 #define SSL_F_DTLS1_PROCESS_OUT_OF_SEQ_MESSAGE 256 @@ -1901,6 +2135,7 @@ void ERR_load_SSL_strings(void); #define SSL_F_SSL3_GET_KEY_EXCHANGE 141 #define SSL_F_SSL3_GET_MESSAGE 142 #define SSL_F_SSL3_GET_NEW_SESSION_TICKET 283 +#define SSL_F_SSL3_GET_NEXT_PROTO 306 #define SSL_F_SSL3_GET_RECORD 143 #define SSL_F_SSL3_GET_SERVER_CERTIFICATE 144 #define SSL_F_SSL3_GET_SERVER_DONE 145 @@ -1925,10 +2160,12 @@ void ERR_load_SSL_strings(void); #define SSL_F_SSL3_WRITE_PENDING 159 #define SSL_F_SSL_ADD_CLIENTHELLO_RENEGOTIATE_EXT 298 #define SSL_F_SSL_ADD_CLIENTHELLO_TLSEXT 277 +#define SSL_F_SSL_ADD_CLIENTHELLO_USE_SRTP_EXT 307 #define SSL_F_SSL_ADD_DIR_CERT_SUBJECTS_TO_STACK 215 #define SSL_F_SSL_ADD_FILE_CERT_SUBJECTS_TO_STACK 216 #define SSL_F_SSL_ADD_SERVERHELLO_RENEGOTIATE_EXT 299 #define SSL_F_SSL_ADD_SERVERHELLO_TLSEXT 278 +#define SSL_F_SSL_ADD_SERVERHELLO_USE_SRTP_EXT 308 #define SSL_F_SSL_BAD_METHOD 160 #define SSL_F_SSL_BYTES_TO_CIPHER_LIST 161 #define SSL_F_SSL_CERT_DUP 221 @@ -1945,6 +2182,7 @@ void ERR_load_SSL_strings(void); #define SSL_F_SSL_CREATE_CIPHER_LIST 166 #define SSL_F_SSL_CTRL 232 #define SSL_F_SSL_CTX_CHECK_PRIVATE_KEY 168 +#define SSL_F_SSL_CTX_MAKE_PROFILES 309 #define SSL_F_SSL_CTX_NEW 169 #define SSL_F_SSL_CTX_SET_CIPHER_LIST 269 #define SSL_F_SSL_CTX_SET_CLIENT_CERT_ENGINE 290 @@ -1973,8 +2211,10 @@ void ERR_load_SSL_strings(void); #define SSL_F_SSL_NEW 186 #define SSL_F_SSL_PARSE_CLIENTHELLO_RENEGOTIATE_EXT 300 #define SSL_F_SSL_PARSE_CLIENTHELLO_TLSEXT 302 +#define SSL_F_SSL_PARSE_CLIENTHELLO_USE_SRTP_EXT 310 #define SSL_F_SSL_PARSE_SERVERHELLO_RENEGOTIATE_EXT 301 #define SSL_F_SSL_PARSE_SERVERHELLO_TLSEXT 303 +#define SSL_F_SSL_PARSE_SERVERHELLO_USE_SRTP_EXT 311 #define SSL_F_SSL_PEEK 270 #define SSL_F_SSL_PREPARE_CLIENTHELLO_TLSEXT 281 #define SSL_F_SSL_PREPARE_SERVERHELLO_TLSEXT 282 @@ -1983,6 +2223,7 @@ void ERR_load_SSL_strings(void); #define SSL_F_SSL_RSA_PUBLIC_ENCRYPT 188 #define SSL_F_SSL_SESSION_NEW 189 #define SSL_F_SSL_SESSION_PRINT_FP 190 +#define SSL_F_SSL_SESSION_SET1_ID_CONTEXT 312 #define SSL_F_SSL_SESS_CERT_NEW 225 #define SSL_F_SSL_SET_CERT 191 #define SSL_F_SSL_SET_CIPHER_LIST 271 @@ -1996,6 +2237,7 @@ void ERR_load_SSL_strings(void); #define SSL_F_SSL_SET_TRUST 228 #define SSL_F_SSL_SET_WFD 196 #define SSL_F_SSL_SHUTDOWN 224 +#define SSL_F_SSL_SRP_CTX_INIT 313 #define SSL_F_SSL_UNDEFINED_CONST_FUNCTION 243 #define SSL_F_SSL_UNDEFINED_FUNCTION 197 #define SSL_F_SSL_UNDEFINED_VOID_FUNCTION 244 @@ -2015,6 +2257,8 @@ void ERR_load_SSL_strings(void); #define SSL_F_TLS1_CHANGE_CIPHER_STATE 209 #define SSL_F_TLS1_CHECK_SERVERHELLO_TLSEXT 274 #define SSL_F_TLS1_ENC 210 +#define SSL_F_TLS1_EXPORT_KEYING_MATERIAL 314 +#define SSL_F_TLS1_HEARTBEAT 315 #define SSL_F_TLS1_PREPARE_CLIENTHELLO_TLSEXT 275 #define SSL_F_TLS1_PREPARE_SERVERHELLO_TLSEXT 276 #define SSL_F_TLS1_PRF 284 @@ -2054,6 +2298,13 @@ void ERR_load_SSL_strings(void); #define SSL_R_BAD_RSA_MODULUS_LENGTH 121 #define SSL_R_BAD_RSA_SIGNATURE 122 #define SSL_R_BAD_SIGNATURE 123 +#define SSL_R_BAD_SRP_A_LENGTH 347 +#define SSL_R_BAD_SRP_B_LENGTH 348 +#define SSL_R_BAD_SRP_G_LENGTH 349 +#define SSL_R_BAD_SRP_N_LENGTH 350 +#define SSL_R_BAD_SRP_S_LENGTH 351 +#define SSL_R_BAD_SRTP_MKI_VALUE 352 +#define SSL_R_BAD_SRTP_PROTECTION_PROFILE_LIST 353 #define SSL_R_BAD_SSL_FILETYPE 124 #define SSL_R_BAD_SSL_SESSION_ID_LENGTH 125 #define SSL_R_BAD_STATE 126 @@ -2092,12 +2343,15 @@ void ERR_load_SSL_strings(void); #define SSL_R_ECC_CERT_SHOULD_HAVE_RSA_SIGNATURE 322 #define SSL_R_ECC_CERT_SHOULD_HAVE_SHA1_SIGNATURE 323 #define SSL_R_ECGROUP_TOO_LARGE_FOR_CIPHER 310 +#define SSL_R_EMPTY_SRTP_PROTECTION_PROFILE_LIST 354 #define SSL_R_ENCRYPTED_LENGTH_TOO_LONG 150 #define SSL_R_ERROR_GENERATING_TMP_RSA_KEY 282 #define SSL_R_ERROR_IN_RECEIVED_CIPHER_LIST 151 #define SSL_R_EXCESSIVE_MESSAGE_SIZE 152 #define SSL_R_EXTRA_DATA_IN_MESSAGE 153 #define SSL_R_GOT_A_FIN_BEFORE_A_CCS 154 +#define SSL_R_GOT_NEXT_PROTO_BEFORE_A_CCS 355 +#define SSL_R_GOT_NEXT_PROTO_WITHOUT_EXTENSION 356 #define SSL_R_HTTPS_PROXY_REQUEST 155 #define SSL_R_HTTP_REQUEST 156 #define SSL_R_ILLEGAL_PADDING 283 @@ -2106,6 +2360,7 @@ void ERR_load_SSL_strings(void); #define SSL_R_INVALID_COMMAND 280 #define SSL_R_INVALID_COMPRESSION_ALGORITHM 341 #define SSL_R_INVALID_PURPOSE 278 +#define SSL_R_INVALID_SRP_USERNAME 357 #define SSL_R_INVALID_STATUS_RESPONSE 328 #define SSL_R_INVALID_TICKET_KEYS_LENGTH 325 #define SSL_R_INVALID_TRUST 279 @@ -2135,6 +2390,7 @@ void ERR_load_SSL_strings(void); #define SSL_R_MISSING_RSA_CERTIFICATE 168 #define SSL_R_MISSING_RSA_ENCRYPTING_CERT 169 #define SSL_R_MISSING_RSA_SIGNING_CERT 170 +#define SSL_R_MISSING_SRP_PARAM 358 #define SSL_R_MISSING_TMP_DH_KEY 171 #define SSL_R_MISSING_TMP_ECDH_KEY 311 #define SSL_R_MISSING_TMP_RSA_KEY 172 @@ -2164,6 +2420,7 @@ void ERR_load_SSL_strings(void); #define SSL_R_NO_RENEGOTIATION 339 #define SSL_R_NO_REQUIRED_DIGEST 324 #define SSL_R_NO_SHARED_CIPHER 193 +#define SSL_R_NO_SRTP_PROFILES 359 #define SSL_R_NO_VERIFY_CALLBACK 194 #define SSL_R_NULL_SSL_CTX 195 #define SSL_R_NULL_SSL_METHOD_PASSED 196 @@ -2207,7 +2464,12 @@ void ERR_load_SSL_strings(void); #define SSL_R_SERVERHELLO_TLSEXT 275 #define SSL_R_SESSION_ID_CONTEXT_UNINITIALIZED 277 #define SSL_R_SHORT_READ 219 +#define SSL_R_SIGNATURE_ALGORITHMS_ERROR 360 #define SSL_R_SIGNATURE_FOR_NON_SIGNING_CERTIFICATE 220 +#define SSL_R_SRP_A_CALC 361 +#define SSL_R_SRTP_COULD_NOT_ALLOCATE_PROFILES 362 +#define SSL_R_SRTP_PROTECTION_PROFILE_LIST_TOO_LONG 363 +#define SSL_R_SRTP_UNKNOWN_PROTECTION_PROFILE 364 #define SSL_R_SSL23_DOING_SESSION_ID_REUSE 221 #define SSL_R_SSL2_CONNECTION_ID_TOO_LONG 299 #define SSL_R_SSL3_EXT_INVALID_ECPOINTFORMAT 321 @@ -2252,6 +2514,9 @@ void ERR_load_SSL_strings(void); #define SSL_R_TLSV1_UNRECOGNIZED_NAME 1112 #define SSL_R_TLSV1_UNSUPPORTED_EXTENSION 1110 #define SSL_R_TLS_CLIENT_CERT_REQ_WITH_ANON_CIPHER 232 +#define SSL_R_TLS_HEARTBEAT_PEER_DOESNT_ACCEPT 365 +#define SSL_R_TLS_HEARTBEAT_PENDING 366 +#define SSL_R_TLS_ILLEGAL_EXPORTER_LABEL 367 #define SSL_R_TLS_INVALID_ECPOINTFORMAT_LIST 157 #define SSL_R_TLS_PEER_DID_NOT_RESPOND_WITH_CERTIFICATE_LIST 233 #define SSL_R_TLS_RSA_ENCRYPTED_VALUE_LENGTH_IS_WRONG 234 @@ -2273,6 +2538,7 @@ void ERR_load_SSL_strings(void); #define SSL_R_UNKNOWN_CERTIFICATE_TYPE 247 #define SSL_R_UNKNOWN_CIPHER_RETURNED 248 #define SSL_R_UNKNOWN_CIPHER_TYPE 249 +#define SSL_R_UNKNOWN_DIGEST 368 #define SSL_R_UNKNOWN_KEY_EXCHANGE_TYPE 250 #define SSL_R_UNKNOWN_PKEY_TYPE 251 #define SSL_R_UNKNOWN_PROTOCOL 252 @@ -2287,12 +2553,14 @@ void ERR_load_SSL_strings(void); #define SSL_R_UNSUPPORTED_PROTOCOL 258 #define SSL_R_UNSUPPORTED_SSL_VERSION 259 #define SSL_R_UNSUPPORTED_STATUS_TYPE 329 +#define SSL_R_USE_SRTP_NOT_NEGOTIATED 369 #define SSL_R_WRITE_BIO_NOT_SET 260 #define SSL_R_WRONG_CIPHER_RETURNED 261 #define SSL_R_WRONG_MESSAGE_TYPE 262 #define SSL_R_WRONG_NUMBER_OF_KEY_BITS 263 #define SSL_R_WRONG_SIGNATURE_LENGTH 264 #define SSL_R_WRONG_SIGNATURE_SIZE 265 +#define SSL_R_WRONG_SIGNATURE_TYPE 370 #define SSL_R_WRONG_SSL_VERSION 266 #define SSL_R_WRONG_VERSION_NUMBER 267 #define SSL_R_X509_LIB 268 diff --git a/crypto/openssl/ssl/ssl2.h b/crypto/openssl/ssl/ssl2.h index 99a52ea0dd..eb25dcb0bf 100644 --- a/crypto/openssl/ssl/ssl2.h +++ b/crypto/openssl/ssl/ssl2.h @@ -155,6 +155,8 @@ extern "C" { #define CERT char #endif +#ifndef OPENSSL_NO_SSL_INTERN + typedef struct ssl2_state_st { int three_byte_header; @@ -219,6 +221,8 @@ typedef struct ssl2_state_st } tmp; } SSL2_STATE; +#endif + /* SSLv2 */ /* client */ #define SSL2_ST_SEND_CLIENT_HELLO_A (0x10|SSL_ST_CONNECT) diff --git a/crypto/openssl/ssl/ssl3.h b/crypto/openssl/ssl/ssl3.h index 9c2c41287a..112e627de0 100644 --- a/crypto/openssl/ssl/ssl3.h +++ b/crypto/openssl/ssl/ssl3.h @@ -322,6 +322,7 @@ extern "C" { #define SSL3_RT_ALERT 21 #define SSL3_RT_HANDSHAKE 22 #define SSL3_RT_APPLICATION_DATA 23 +#define TLS1_RT_HEARTBEAT 24 #define SSL3_AL_WARNING 1 #define SSL3_AL_FATAL 2 @@ -339,6 +340,11 @@ extern "C" { #define SSL3_AD_CERTIFICATE_UNKNOWN 46 #define SSL3_AD_ILLEGAL_PARAMETER 47 /* fatal */ +#define TLS1_HB_REQUEST 1 +#define TLS1_HB_RESPONSE 2 + +#ifndef OPENSSL_NO_SSL_INTERN + typedef struct ssl3_record_st { /*r */ int type; /* type of record */ @@ -360,6 +366,8 @@ typedef struct ssl3_buffer_st int left; /* how many bytes left */ } SSL3_BUFFER; +#endif + #define SSL3_CT_RSA_SIGN 1 #define SSL3_CT_DSS_SIGN 2 #define SSL3_CT_RSA_FIXED_DH 3 @@ -379,6 +387,7 @@ typedef struct ssl3_buffer_st #define SSL3_FLAGS_POP_BUFFER 0x0004 #define TLS1_FLAGS_TLS_PADDING_BUG 0x0008 #define TLS1_FLAGS_SKIP_CERT_VERIFY 0x0010 +#define TLS1_FLAGS_KEEP_HANDSHAKE 0x0020 /* SSL3_FLAGS_SGC_RESTART_DONE is set when we * restart a handshake because of MS SGC and so prevents us @@ -391,6 +400,8 @@ typedef struct ssl3_buffer_st */ #define SSL3_FLAGS_SGC_RESTART_DONE 0x0040 +#ifndef OPENSSL_NO_SSL_INTERN + typedef struct ssl3_state_st { long flags; @@ -475,7 +486,7 @@ typedef struct ssl3_state_st int finish_md_len; unsigned char peer_finish_md[EVP_MAX_MD_SIZE*2]; int peer_finish_md_len; - + unsigned long message_size; int message_type; @@ -523,13 +534,23 @@ typedef struct ssl3_state_st unsigned char previous_server_finished[EVP_MAX_MD_SIZE]; unsigned char previous_server_finished_len; int send_connection_binding; /* TODOEKR */ + +#ifndef OPENSSL_NO_NEXTPROTONEG + /* Set if we saw the Next Protocol Negotiation extension from our peer. */ + int next_proto_neg_seen; +#endif } SSL3_STATE; +#endif /* SSLv3 */ /*client */ /* extra state */ #define SSL3_ST_CW_FLUSH (0x100|SSL_ST_CONNECT) +#ifndef OPENSSL_NO_SCTP +#define DTLS1_SCTP_ST_CW_WRITE_SOCK (0x310|SSL_ST_CONNECT) +#define DTLS1_SCTP_ST_CR_READ_SOCK (0x320|SSL_ST_CONNECT) +#endif /* write to server */ #define SSL3_ST_CW_CLNT_HELLO_A (0x110|SSL_ST_CONNECT) #define SSL3_ST_CW_CLNT_HELLO_B (0x111|SSL_ST_CONNECT) @@ -557,6 +578,8 @@ typedef struct ssl3_state_st #define SSL3_ST_CW_CERT_VRFY_B (0x191|SSL_ST_CONNECT) #define SSL3_ST_CW_CHANGE_A (0x1A0|SSL_ST_CONNECT) #define SSL3_ST_CW_CHANGE_B (0x1A1|SSL_ST_CONNECT) +#define SSL3_ST_CW_NEXT_PROTO_A (0x200|SSL_ST_CONNECT) +#define SSL3_ST_CW_NEXT_PROTO_B (0x201|SSL_ST_CONNECT) #define SSL3_ST_CW_FINISHED_A (0x1B0|SSL_ST_CONNECT) #define SSL3_ST_CW_FINISHED_B (0x1B1|SSL_ST_CONNECT) /* read from server */ @@ -572,6 +595,10 @@ typedef struct ssl3_state_st /* server */ /* extra state */ #define SSL3_ST_SW_FLUSH (0x100|SSL_ST_ACCEPT) +#ifndef OPENSSL_NO_SCTP +#define DTLS1_SCTP_ST_SW_WRITE_SOCK (0x310|SSL_ST_ACCEPT) +#define DTLS1_SCTP_ST_SR_READ_SOCK (0x320|SSL_ST_ACCEPT) +#endif /* read from client */ /* Do not change the number values, they do matter */ #define SSL3_ST_SR_CLNT_HELLO_A (0x110|SSL_ST_ACCEPT) @@ -602,6 +629,8 @@ typedef struct ssl3_state_st #define SSL3_ST_SR_CERT_VRFY_B (0x1A1|SSL_ST_ACCEPT) #define SSL3_ST_SR_CHANGE_A (0x1B0|SSL_ST_ACCEPT) #define SSL3_ST_SR_CHANGE_B (0x1B1|SSL_ST_ACCEPT) +#define SSL3_ST_SR_NEXT_PROTO_A (0x210|SSL_ST_ACCEPT) +#define SSL3_ST_SR_NEXT_PROTO_B (0x211|SSL_ST_ACCEPT) #define SSL3_ST_SR_FINISHED_A (0x1C0|SSL_ST_ACCEPT) #define SSL3_ST_SR_FINISHED_B (0x1C1|SSL_ST_ACCEPT) /* write to client */ @@ -626,6 +655,7 @@ typedef struct ssl3_state_st #define SSL3_MT_CLIENT_KEY_EXCHANGE 16 #define SSL3_MT_FINISHED 20 #define SSL3_MT_CERTIFICATE_STATUS 22 +#define SSL3_MT_NEXT_PROTO 67 #define DTLS1_MT_HELLO_VERIFY_REQUEST 3 diff --git a/crypto/openssl/ssl/ssl_algs.c b/crypto/openssl/ssl/ssl_algs.c index 0967b2dfe4..d443143c59 100644 --- a/crypto/openssl/ssl/ssl_algs.c +++ b/crypto/openssl/ssl/ssl_algs.c @@ -73,6 +73,9 @@ int SSL_library_init(void) #endif #ifndef OPENSSL_NO_RC4 EVP_add_cipher(EVP_rc4()); +#if !defined(OPENSSL_NO_MD5) && (defined(__x86_64) || defined(__x86_64__)) + EVP_add_cipher(EVP_rc4_hmac_md5()); +#endif #endif #ifndef OPENSSL_NO_RC2 EVP_add_cipher(EVP_rc2_cbc()); @@ -85,6 +88,12 @@ int SSL_library_init(void) EVP_add_cipher(EVP_aes_128_cbc()); EVP_add_cipher(EVP_aes_192_cbc()); EVP_add_cipher(EVP_aes_256_cbc()); + EVP_add_cipher(EVP_aes_128_gcm()); + EVP_add_cipher(EVP_aes_256_gcm()); +#if !defined(OPENSSL_NO_SHA) && !defined(OPENSSL_NO_SHA1) + EVP_add_cipher(EVP_aes_128_cbc_hmac_sha1()); + EVP_add_cipher(EVP_aes_256_cbc_hmac_sha1()); +#endif #endif #ifndef OPENSSL_NO_CAMELLIA EVP_add_cipher(EVP_camellia_128_cbc()); diff --git a/crypto/openssl/ssl/ssl_asn1.c b/crypto/openssl/ssl/ssl_asn1.c index d7f4c6087e..38540be1e5 100644 --- a/crypto/openssl/ssl/ssl_asn1.c +++ b/crypto/openssl/ssl/ssl_asn1.c @@ -114,6 +114,9 @@ typedef struct ssl_session_asn1_st ASN1_OCTET_STRING psk_identity_hint; ASN1_OCTET_STRING psk_identity; #endif /* OPENSSL_NO_PSK */ +#ifndef OPENSSL_NO_SRP + ASN1_OCTET_STRING srp_username; +#endif /* OPENSSL_NO_SRP */ } SSL_SESSION_ASN1; int i2d_SSL_SESSION(SSL_SESSION *in, unsigned char **pp) @@ -129,6 +132,9 @@ int i2d_SSL_SESSION(SSL_SESSION *in, unsigned char **pp) #ifndef OPENSSL_NO_COMP unsigned char cbuf; int v11=0; +#endif +#ifndef OPENSSL_NO_SRP + int v12=0; #endif long l; SSL_SESSION_ASN1 a; @@ -267,6 +273,14 @@ int i2d_SSL_SESSION(SSL_SESSION *in, unsigned char **pp) a.psk_identity.data=(unsigned char *)(in->psk_identity); } #endif /* OPENSSL_NO_PSK */ +#ifndef OPENSSL_NO_SRP + if (in->srp_username) + { + a.srp_username.length=strlen(in->srp_username); + a.srp_username.type=V_ASN1_OCTET_STRING; + a.srp_username.data=(unsigned char *)(in->srp_username); + } +#endif /* OPENSSL_NO_SRP */ M_ASN1_I2D_len(&(a.version), i2d_ASN1_INTEGER); M_ASN1_I2D_len(&(a.ssl_version), i2d_ASN1_INTEGER); @@ -307,6 +321,10 @@ int i2d_SSL_SESSION(SSL_SESSION *in, unsigned char **pp) if (in->psk_identity) M_ASN1_I2D_len_EXP_opt(&(a.psk_identity), i2d_ASN1_OCTET_STRING,8,v8); #endif /* OPENSSL_NO_PSK */ +#ifndef OPENSSL_NO_SRP + if (in->srp_username) + M_ASN1_I2D_len_EXP_opt(&(a.srp_username), i2d_ASN1_OCTET_STRING,12,v12); +#endif /* OPENSSL_NO_SRP */ M_ASN1_I2D_seq_total(); @@ -351,6 +369,10 @@ int i2d_SSL_SESSION(SSL_SESSION *in, unsigned char **pp) if (in->compress_meth) M_ASN1_I2D_put_EXP_opt(&(a.comp_id), i2d_ASN1_OCTET_STRING,11,v11); #endif +#ifndef OPENSSL_NO_SRP + if (in->srp_username) + M_ASN1_I2D_put_EXP_opt(&(a.srp_username), i2d_ASN1_OCTET_STRING,12,v12); +#endif /* OPENSSL_NO_SRP */ M_ASN1_I2D_finish(); } @@ -549,6 +571,19 @@ SSL_SESSION *d2i_SSL_SESSION(SSL_SESSION **a, const unsigned char **pp, } else ret->psk_identity_hint=NULL; + + os.length=0; + os.data=NULL; + M_ASN1_D2I_get_EXP_opt(osp,d2i_ASN1_OCTET_STRING,8); + if (os.data) + { + ret->psk_identity = BUF_strndup((char *)os.data, os.length); + OPENSSL_free(os.data); + os.data = NULL; + os.length = 0; + } + else + ret->psk_identity=NULL; #endif /* OPENSSL_NO_PSK */ #ifndef OPENSSL_NO_TLSEXT @@ -588,5 +623,20 @@ SSL_SESSION *d2i_SSL_SESSION(SSL_SESSION **a, const unsigned char **pp, } #endif +#ifndef OPENSSL_NO_SRP + os.length=0; + os.data=NULL; + M_ASN1_D2I_get_EXP_opt(osp,d2i_ASN1_OCTET_STRING,12); + if (os.data) + { + ret->srp_username = BUF_strndup((char *)os.data, os.length); + OPENSSL_free(os.data); + os.data = NULL; + os.length = 0; + } + else + ret->srp_username=NULL; +#endif /* OPENSSL_NO_SRP */ + M_ASN1_D2I_Finish(a,SSL_SESSION_free,SSL_F_D2I_SSL_SESSION); } diff --git a/crypto/openssl/ssl/ssl_cert.c b/crypto/openssl/ssl/ssl_cert.c index 27256eea81..917be31876 100644 --- a/crypto/openssl/ssl/ssl_cert.c +++ b/crypto/openssl/ssl/ssl_cert.c @@ -160,6 +160,21 @@ int SSL_get_ex_data_X509_STORE_CTX_idx(void) return ssl_x509_store_ctx_idx; } +static void ssl_cert_set_default_md(CERT *cert) + { + /* Set digest values to defaults */ +#ifndef OPENSSL_NO_DSA + cert->pkeys[SSL_PKEY_DSA_SIGN].digest = EVP_dss1(); +#endif +#ifndef OPENSSL_NO_RSA + cert->pkeys[SSL_PKEY_RSA_SIGN].digest = EVP_sha1(); + cert->pkeys[SSL_PKEY_RSA_ENC].digest = EVP_sha1(); +#endif +#ifndef OPENSSL_NO_ECDSA + cert->pkeys[SSL_PKEY_ECC].digest = EVP_ecdsa(); +#endif + } + CERT *ssl_cert_new(void) { CERT *ret; @@ -174,7 +189,7 @@ CERT *ssl_cert_new(void) ret->key= &(ret->pkeys[SSL_PKEY_RSA_ENC]); ret->references=1; - + ssl_cert_set_default_md(ret); return(ret); } @@ -307,6 +322,10 @@ CERT *ssl_cert_dup(CERT *cert) * chain is held inside SSL_CTX */ ret->references=1; + /* Set digests to defaults. NB: we don't copy existing values as they + * will be set during handshake. + */ + ssl_cert_set_default_md(ret); return(ret); diff --git a/crypto/openssl/ssl/ssl_ciph.c b/crypto/openssl/ssl/ssl_ciph.c index 54ba7ef5b4..ac643c928c 100644 --- a/crypto/openssl/ssl/ssl_ciph.c +++ b/crypto/openssl/ssl/ssl_ciph.c @@ -162,11 +162,13 @@ #define SSL_ENC_CAMELLIA256_IDX 9 #define SSL_ENC_GOST89_IDX 10 #define SSL_ENC_SEED_IDX 11 -#define SSL_ENC_NUM_IDX 12 +#define SSL_ENC_AES128GCM_IDX 12 +#define SSL_ENC_AES256GCM_IDX 13 +#define SSL_ENC_NUM_IDX 14 static const EVP_CIPHER *ssl_cipher_methods[SSL_ENC_NUM_IDX]={ - NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL, + NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL }; #define SSL_COMP_NULL_IDX 0 @@ -179,28 +181,32 @@ static STACK_OF(SSL_COMP) *ssl_comp_methods=NULL; #define SSL_MD_SHA1_IDX 1 #define SSL_MD_GOST94_IDX 2 #define SSL_MD_GOST89MAC_IDX 3 +#define SSL_MD_SHA256_IDX 4 +#define SSL_MD_SHA384_IDX 5 /*Constant SSL_MAX_DIGEST equal to size of digests array should be * defined in the * ssl_locl.h */ #define SSL_MD_NUM_IDX SSL_MAX_DIGEST static const EVP_MD *ssl_digest_methods[SSL_MD_NUM_IDX]={ - NULL,NULL,NULL,NULL + NULL,NULL,NULL,NULL,NULL,NULL }; /* PKEY_TYPE for GOST89MAC is known in advance, but, because * implementation is engine-provided, we'll fill it only if * corresponding EVP_PKEY_METHOD is found */ static int ssl_mac_pkey_id[SSL_MD_NUM_IDX]={ - EVP_PKEY_HMAC,EVP_PKEY_HMAC,EVP_PKEY_HMAC,NID_undef + EVP_PKEY_HMAC,EVP_PKEY_HMAC,EVP_PKEY_HMAC,NID_undef, + EVP_PKEY_HMAC,EVP_PKEY_HMAC }; static int ssl_mac_secret_size[SSL_MD_NUM_IDX]={ - 0,0,0,0 + 0,0,0,0,0,0 }; static int ssl_handshake_digest_flag[SSL_MD_NUM_IDX]={ SSL_HANDSHAKE_MAC_MD5,SSL_HANDSHAKE_MAC_SHA, - SSL_HANDSHAKE_MAC_GOST94,0 + SSL_HANDSHAKE_MAC_GOST94, 0, SSL_HANDSHAKE_MAC_SHA256, + SSL_HANDSHAKE_MAC_SHA384 }; #define CIPHER_ADD 1 @@ -247,6 +253,7 @@ static const SSL_CIPHER cipher_aliases[]={ {0,SSL_TXT_ECDH,0, SSL_kECDHr|SSL_kECDHe|SSL_kEECDH,0,0,0,0,0,0,0,0}, {0,SSL_TXT_kPSK,0, SSL_kPSK, 0,0,0,0,0,0,0,0}, + {0,SSL_TXT_kSRP,0, SSL_kSRP, 0,0,0,0,0,0,0,0}, {0,SSL_TXT_kGOST,0, SSL_kGOST,0,0,0,0,0,0,0,0}, /* server authentication aliases */ @@ -273,6 +280,7 @@ static const SSL_CIPHER cipher_aliases[]={ {0,SSL_TXT_ADH,0, SSL_kEDH,SSL_aNULL,0,0,0,0,0,0,0}, {0,SSL_TXT_AECDH,0, SSL_kEECDH,SSL_aNULL,0,0,0,0,0,0,0}, {0,SSL_TXT_PSK,0, SSL_kPSK,SSL_aPSK,0,0,0,0,0,0,0}, + {0,SSL_TXT_SRP,0, SSL_kSRP,0,0,0,0,0,0,0,0}, /* symmetric encryption aliases */ @@ -283,9 +291,10 @@ static const SSL_CIPHER cipher_aliases[]={ {0,SSL_TXT_IDEA,0, 0,0,SSL_IDEA, 0,0,0,0,0,0}, {0,SSL_TXT_SEED,0, 0,0,SSL_SEED, 0,0,0,0,0,0}, {0,SSL_TXT_eNULL,0, 0,0,SSL_eNULL, 0,0,0,0,0,0}, - {0,SSL_TXT_AES128,0, 0,0,SSL_AES128,0,0,0,0,0,0}, - {0,SSL_TXT_AES256,0, 0,0,SSL_AES256,0,0,0,0,0,0}, - {0,SSL_TXT_AES,0, 0,0,SSL_AES128|SSL_AES256,0,0,0,0,0,0}, + {0,SSL_TXT_AES128,0, 0,0,SSL_AES128|SSL_AES128GCM,0,0,0,0,0,0}, + {0,SSL_TXT_AES256,0, 0,0,SSL_AES256|SSL_AES256GCM,0,0,0,0,0,0}, + {0,SSL_TXT_AES,0, 0,0,SSL_AES,0,0,0,0,0,0}, + {0,SSL_TXT_AES_GCM,0, 0,0,SSL_AES128GCM|SSL_AES256GCM,0,0,0,0,0,0}, {0,SSL_TXT_CAMELLIA128,0,0,0,SSL_CAMELLIA128,0,0,0,0,0,0}, {0,SSL_TXT_CAMELLIA256,0,0,0,SSL_CAMELLIA256,0,0,0,0,0,0}, {0,SSL_TXT_CAMELLIA ,0,0,0,SSL_CAMELLIA128|SSL_CAMELLIA256,0,0,0,0,0,0}, @@ -296,6 +305,8 @@ static const SSL_CIPHER cipher_aliases[]={ {0,SSL_TXT_SHA,0, 0,0,0,SSL_SHA1, 0,0,0,0,0}, {0,SSL_TXT_GOST94,0, 0,0,0,SSL_GOST94, 0,0,0,0,0}, {0,SSL_TXT_GOST89MAC,0, 0,0,0,SSL_GOST89MAC, 0,0,0,0,0}, + {0,SSL_TXT_SHA256,0, 0,0,0,SSL_SHA256, 0,0,0,0,0}, + {0,SSL_TXT_SHA384,0, 0,0,0,SSL_SHA384, 0,0,0,0,0}, /* protocol version aliases */ {0,SSL_TXT_SSLV2,0, 0,0,0,0,SSL_SSLV2, 0,0,0,0}, @@ -379,6 +390,11 @@ void ssl_load_ciphers(void) ssl_cipher_methods[SSL_ENC_SEED_IDX]= EVP_get_cipherbyname(SN_seed_cbc); + ssl_cipher_methods[SSL_ENC_AES128GCM_IDX]= + EVP_get_cipherbyname(SN_aes_128_gcm); + ssl_cipher_methods[SSL_ENC_AES256GCM_IDX]= + EVP_get_cipherbyname(SN_aes_256_gcm); + ssl_digest_methods[SSL_MD_MD5_IDX]= EVP_get_digestbyname(SN_md5); ssl_mac_secret_size[SSL_MD_MD5_IDX]= @@ -404,6 +420,14 @@ void ssl_load_ciphers(void) ssl_mac_secret_size[SSL_MD_GOST89MAC_IDX]=32; } + ssl_digest_methods[SSL_MD_SHA256_IDX]= + EVP_get_digestbyname(SN_sha256); + ssl_mac_secret_size[SSL_MD_SHA256_IDX]= + EVP_MD_size(ssl_digest_methods[SSL_MD_SHA256_IDX]); + ssl_digest_methods[SSL_MD_SHA384_IDX]= + EVP_get_digestbyname(SN_sha384); + ssl_mac_secret_size[SSL_MD_SHA384_IDX]= + EVP_MD_size(ssl_digest_methods[SSL_MD_SHA384_IDX]); } #ifndef OPENSSL_NO_COMP @@ -526,6 +550,12 @@ int ssl_cipher_get_evp(const SSL_SESSION *s, const EVP_CIPHER **enc, case SSL_SEED: i=SSL_ENC_SEED_IDX; break; + case SSL_AES128GCM: + i=SSL_ENC_AES128GCM_IDX; + break; + case SSL_AES256GCM: + i=SSL_ENC_AES256GCM_IDX; + break; default: i= -1; break; @@ -549,6 +579,12 @@ int ssl_cipher_get_evp(const SSL_SESSION *s, const EVP_CIPHER **enc, case SSL_SHA1: i=SSL_MD_SHA1_IDX; break; + case SSL_SHA256: + i=SSL_MD_SHA256_IDX; + break; + case SSL_SHA384: + i=SSL_MD_SHA384_IDX; + break; case SSL_GOST94: i = SSL_MD_GOST94_IDX; break; @@ -564,17 +600,39 @@ int ssl_cipher_get_evp(const SSL_SESSION *s, const EVP_CIPHER **enc, *md=NULL; if (mac_pkey_type!=NULL) *mac_pkey_type = NID_undef; if (mac_secret_size!=NULL) *mac_secret_size = 0; - + if (c->algorithm_mac == SSL_AEAD) + mac_pkey_type = NULL; } else { *md=ssl_digest_methods[i]; if (mac_pkey_type!=NULL) *mac_pkey_type = ssl_mac_pkey_id[i]; if (mac_secret_size!=NULL) *mac_secret_size = ssl_mac_secret_size[i]; - } + } - if ((*enc != NULL) && (*md != NULL) && (!mac_pkey_type||*mac_pkey_type != NID_undef)) + if ((*enc != NULL) && + (*md != NULL || (EVP_CIPHER_flags(*enc)&EVP_CIPH_FLAG_AEAD_CIPHER)) && + (!mac_pkey_type||*mac_pkey_type != NID_undef)) + { + const EVP_CIPHER *evp; + + if (s->ssl_version >= TLS1_VERSION && + c->algorithm_enc == SSL_RC4 && + c->algorithm_mac == SSL_MD5 && + (evp=EVP_get_cipherbyname("RC4-HMAC-MD5"))) + *enc = evp, *md = NULL; + else if (s->ssl_version >= TLS1_VERSION && + c->algorithm_enc == SSL_AES128 && + c->algorithm_mac == SSL_SHA1 && + (evp=EVP_get_cipherbyname("AES-128-CBC-HMAC-SHA1"))) + *enc = evp, *md = NULL; + else if (s->ssl_version >= TLS1_VERSION && + c->algorithm_enc == SSL_AES256 && + c->algorithm_mac == SSL_SHA1 && + (evp=EVP_get_cipherbyname("AES-256-CBC-HMAC-SHA1"))) + *enc = evp, *md = NULL; return(1); + } else return(0); } @@ -585,9 +643,11 @@ int ssl_get_handshake_digest(int idx, long *mask, const EVP_MD **md) { return 0; } - if (ssl_handshake_digest_flag[idx]==0) return 0; *mask = ssl_handshake_digest_flag[idx]; - *md = ssl_digest_methods[idx]; + if (*mask) + *md = ssl_digest_methods[idx]; + else + *md = NULL; return 1; } @@ -661,6 +721,9 @@ static void ssl_cipher_get_disabled(unsigned long *mkey, unsigned long *auth, un #ifdef OPENSSL_NO_PSK *mkey |= SSL_kPSK; *auth |= SSL_aPSK; +#endif +#ifdef OPENSSL_NO_SRP + *mkey |= SSL_kSRP; #endif /* Check for presence of GOST 34.10 algorithms, and if they * do not present, disable appropriate auth and key exchange */ @@ -687,6 +750,8 @@ static void ssl_cipher_get_disabled(unsigned long *mkey, unsigned long *auth, un *enc |= (ssl_cipher_methods[SSL_ENC_IDEA_IDX] == NULL) ? SSL_IDEA:0; *enc |= (ssl_cipher_methods[SSL_ENC_AES128_IDX] == NULL) ? SSL_AES128:0; *enc |= (ssl_cipher_methods[SSL_ENC_AES256_IDX] == NULL) ? SSL_AES256:0; + *enc |= (ssl_cipher_methods[SSL_ENC_AES128GCM_IDX] == NULL) ? SSL_AES128GCM:0; + *enc |= (ssl_cipher_methods[SSL_ENC_AES256GCM_IDX] == NULL) ? SSL_AES256GCM:0; *enc |= (ssl_cipher_methods[SSL_ENC_CAMELLIA128_IDX] == NULL) ? SSL_CAMELLIA128:0; *enc |= (ssl_cipher_methods[SSL_ENC_CAMELLIA256_IDX] == NULL) ? SSL_CAMELLIA256:0; *enc |= (ssl_cipher_methods[SSL_ENC_GOST89_IDX] == NULL) ? SSL_eGOST2814789CNT:0; @@ -694,6 +759,8 @@ static void ssl_cipher_get_disabled(unsigned long *mkey, unsigned long *auth, un *mac |= (ssl_digest_methods[SSL_MD_MD5_IDX ] == NULL) ? SSL_MD5 :0; *mac |= (ssl_digest_methods[SSL_MD_SHA1_IDX] == NULL) ? SSL_SHA1:0; + *mac |= (ssl_digest_methods[SSL_MD_SHA256_IDX] == NULL) ? SSL_SHA256:0; + *mac |= (ssl_digest_methods[SSL_MD_SHA384_IDX] == NULL) ? SSL_SHA384:0; *mac |= (ssl_digest_methods[SSL_MD_GOST94_IDX] == NULL) ? SSL_GOST94:0; *mac |= (ssl_digest_methods[SSL_MD_GOST89MAC_IDX] == NULL || ssl_mac_pkey_id[SSL_MD_GOST89MAC_IDX]==NID_undef)? SSL_GOST89MAC:0; @@ -724,6 +791,9 @@ static void ssl_cipher_collect_ciphers(const SSL_METHOD *ssl_method, c = ssl_method->get_cipher(i); /* drop those that use any of that is not available */ if ((c != NULL) && c->valid && +#ifdef OPENSSL_FIPS + (!FIPS_mode() || (c->algo_strength & SSL_FIPS)) && +#endif !(c->algorithm_mkey & disabled_mkey) && !(c->algorithm_auth & disabled_auth) && !(c->algorithm_enc & disabled_enc) && @@ -1423,7 +1493,11 @@ STACK_OF(SSL_CIPHER) *ssl_create_cipher_list(const SSL_METHOD *ssl_method, */ for (curr = head; curr != NULL; curr = curr->next) { +#ifdef OPENSSL_FIPS + if (curr->active && (!FIPS_mode() || curr->cipher->algo_strength & SSL_FIPS)) +#else if (curr->active) +#endif { sk_SSL_CIPHER_push(cipherstack, curr->cipher); #ifdef CIPHER_DEBUG @@ -1480,6 +1554,8 @@ char *SSL_CIPHER_description(const SSL_CIPHER *cipher, char *buf, int len) ver="SSLv2"; else if (alg_ssl & SSL_SSLV3) ver="SSLv3"; + else if (alg_ssl & SSL_TLSV1_2) + ver="TLSv1.2"; else ver="unknown"; @@ -1512,6 +1588,9 @@ char *SSL_CIPHER_description(const SSL_CIPHER *cipher, char *buf, int len) case SSL_kPSK: kx="PSK"; break; + case SSL_kSRP: + kx="SRP"; + break; default: kx="unknown"; } @@ -1574,6 +1653,12 @@ char *SSL_CIPHER_description(const SSL_CIPHER *cipher, char *buf, int len) case SSL_AES256: enc="AES(256)"; break; + case SSL_AES128GCM: + enc="AESGCM(128)"; + break; + case SSL_AES256GCM: + enc="AESGCM(256)"; + break; case SSL_CAMELLIA128: enc="Camellia(128)"; break; @@ -1596,6 +1681,15 @@ char *SSL_CIPHER_description(const SSL_CIPHER *cipher, char *buf, int len) case SSL_SHA1: mac="SHA1"; break; + case SSL_SHA256: + mac="SHA256"; + break; + case SSL_SHA384: + mac="SHA384"; + break; + case SSL_AEAD: + mac="AEAD"; + break; default: mac="unknown"; break; @@ -1653,6 +1747,11 @@ int SSL_CIPHER_get_bits(const SSL_CIPHER *c, int *alg_bits) return(ret); } +unsigned long SSL_CIPHER_get_id(const SSL_CIPHER *c) + { + return c->id; + } + SSL_COMP *ssl3_comp_find(STACK_OF(SSL_COMP) *sk, int n) { SSL_COMP *ctmp; diff --git a/crypto/openssl/ssl/ssl_err.c b/crypto/openssl/ssl/ssl_err.c index e9be77109f..ccb93d2689 100644 --- a/crypto/openssl/ssl/ssl_err.c +++ b/crypto/openssl/ssl/ssl_err.c @@ -88,6 +88,7 @@ static ERR_STRING_DATA SSL_str_functs[]= {ERR_FUNC(SSL_F_DTLS1_GET_MESSAGE_FRAGMENT), "DTLS1_GET_MESSAGE_FRAGMENT"}, {ERR_FUNC(SSL_F_DTLS1_GET_RECORD), "DTLS1_GET_RECORD"}, {ERR_FUNC(SSL_F_DTLS1_HANDLE_TIMEOUT), "DTLS1_HANDLE_TIMEOUT"}, +{ERR_FUNC(SSL_F_DTLS1_HEARTBEAT), "DTLS1_HEARTBEAT"}, {ERR_FUNC(SSL_F_DTLS1_OUTPUT_CERT_CHAIN), "DTLS1_OUTPUT_CERT_CHAIN"}, {ERR_FUNC(SSL_F_DTLS1_PREPROCESS_FRAGMENT), "DTLS1_PREPROCESS_FRAGMENT"}, {ERR_FUNC(SSL_F_DTLS1_PROCESS_OUT_OF_SEQ_MESSAGE), "DTLS1_PROCESS_OUT_OF_SEQ_MESSAGE"}, @@ -156,6 +157,7 @@ static ERR_STRING_DATA SSL_str_functs[]= {ERR_FUNC(SSL_F_SSL3_GET_KEY_EXCHANGE), "SSL3_GET_KEY_EXCHANGE"}, {ERR_FUNC(SSL_F_SSL3_GET_MESSAGE), "SSL3_GET_MESSAGE"}, {ERR_FUNC(SSL_F_SSL3_GET_NEW_SESSION_TICKET), "SSL3_GET_NEW_SESSION_TICKET"}, +{ERR_FUNC(SSL_F_SSL3_GET_NEXT_PROTO), "SSL3_GET_NEXT_PROTO"}, {ERR_FUNC(SSL_F_SSL3_GET_RECORD), "SSL3_GET_RECORD"}, {ERR_FUNC(SSL_F_SSL3_GET_SERVER_CERTIFICATE), "SSL3_GET_SERVER_CERTIFICATE"}, {ERR_FUNC(SSL_F_SSL3_GET_SERVER_DONE), "SSL3_GET_SERVER_DONE"}, @@ -180,10 +182,12 @@ static ERR_STRING_DATA SSL_str_functs[]= {ERR_FUNC(SSL_F_SSL3_WRITE_PENDING), "SSL3_WRITE_PENDING"}, {ERR_FUNC(SSL_F_SSL_ADD_CLIENTHELLO_RENEGOTIATE_EXT), "SSL_ADD_CLIENTHELLO_RENEGOTIATE_EXT"}, {ERR_FUNC(SSL_F_SSL_ADD_CLIENTHELLO_TLSEXT), "SSL_ADD_CLIENTHELLO_TLSEXT"}, +{ERR_FUNC(SSL_F_SSL_ADD_CLIENTHELLO_USE_SRTP_EXT), "SSL_ADD_CLIENTHELLO_USE_SRTP_EXT"}, {ERR_FUNC(SSL_F_SSL_ADD_DIR_CERT_SUBJECTS_TO_STACK), "SSL_add_dir_cert_subjects_to_stack"}, {ERR_FUNC(SSL_F_SSL_ADD_FILE_CERT_SUBJECTS_TO_STACK), "SSL_add_file_cert_subjects_to_stack"}, {ERR_FUNC(SSL_F_SSL_ADD_SERVERHELLO_RENEGOTIATE_EXT), "SSL_ADD_SERVERHELLO_RENEGOTIATE_EXT"}, {ERR_FUNC(SSL_F_SSL_ADD_SERVERHELLO_TLSEXT), "SSL_ADD_SERVERHELLO_TLSEXT"}, +{ERR_FUNC(SSL_F_SSL_ADD_SERVERHELLO_USE_SRTP_EXT), "SSL_ADD_SERVERHELLO_USE_SRTP_EXT"}, {ERR_FUNC(SSL_F_SSL_BAD_METHOD), "SSL_BAD_METHOD"}, {ERR_FUNC(SSL_F_SSL_BYTES_TO_CIPHER_LIST), "SSL_BYTES_TO_CIPHER_LIST"}, {ERR_FUNC(SSL_F_SSL_CERT_DUP), "SSL_CERT_DUP"}, @@ -200,6 +204,7 @@ static ERR_STRING_DATA SSL_str_functs[]= {ERR_FUNC(SSL_F_SSL_CREATE_CIPHER_LIST), "SSL_CREATE_CIPHER_LIST"}, {ERR_FUNC(SSL_F_SSL_CTRL), "SSL_ctrl"}, {ERR_FUNC(SSL_F_SSL_CTX_CHECK_PRIVATE_KEY), "SSL_CTX_check_private_key"}, +{ERR_FUNC(SSL_F_SSL_CTX_MAKE_PROFILES), "SSL_CTX_MAKE_PROFILES"}, {ERR_FUNC(SSL_F_SSL_CTX_NEW), "SSL_CTX_new"}, {ERR_FUNC(SSL_F_SSL_CTX_SET_CIPHER_LIST), "SSL_CTX_set_cipher_list"}, {ERR_FUNC(SSL_F_SSL_CTX_SET_CLIENT_CERT_ENGINE), "SSL_CTX_set_client_cert_engine"}, @@ -228,8 +233,10 @@ static ERR_STRING_DATA SSL_str_functs[]= {ERR_FUNC(SSL_F_SSL_NEW), "SSL_new"}, {ERR_FUNC(SSL_F_SSL_PARSE_CLIENTHELLO_RENEGOTIATE_EXT), "SSL_PARSE_CLIENTHELLO_RENEGOTIATE_EXT"}, {ERR_FUNC(SSL_F_SSL_PARSE_CLIENTHELLO_TLSEXT), "SSL_PARSE_CLIENTHELLO_TLSEXT"}, +{ERR_FUNC(SSL_F_SSL_PARSE_CLIENTHELLO_USE_SRTP_EXT), "SSL_PARSE_CLIENTHELLO_USE_SRTP_EXT"}, {ERR_FUNC(SSL_F_SSL_PARSE_SERVERHELLO_RENEGOTIATE_EXT), "SSL_PARSE_SERVERHELLO_RENEGOTIATE_EXT"}, {ERR_FUNC(SSL_F_SSL_PARSE_SERVERHELLO_TLSEXT), "SSL_PARSE_SERVERHELLO_TLSEXT"}, +{ERR_FUNC(SSL_F_SSL_PARSE_SERVERHELLO_USE_SRTP_EXT), "SSL_PARSE_SERVERHELLO_USE_SRTP_EXT"}, {ERR_FUNC(SSL_F_SSL_PEEK), "SSL_peek"}, {ERR_FUNC(SSL_F_SSL_PREPARE_CLIENTHELLO_TLSEXT), "SSL_PREPARE_CLIENTHELLO_TLSEXT"}, {ERR_FUNC(SSL_F_SSL_PREPARE_SERVERHELLO_TLSEXT), "SSL_PREPARE_SERVERHELLO_TLSEXT"}, @@ -238,6 +245,7 @@ static ERR_STRING_DATA SSL_str_functs[]= {ERR_FUNC(SSL_F_SSL_RSA_PUBLIC_ENCRYPT), "SSL_RSA_PUBLIC_ENCRYPT"}, {ERR_FUNC(SSL_F_SSL_SESSION_NEW), "SSL_SESSION_new"}, {ERR_FUNC(SSL_F_SSL_SESSION_PRINT_FP), "SSL_SESSION_print_fp"}, +{ERR_FUNC(SSL_F_SSL_SESSION_SET1_ID_CONTEXT), "SSL_SESSION_set1_id_context"}, {ERR_FUNC(SSL_F_SSL_SESS_CERT_NEW), "SSL_SESS_CERT_NEW"}, {ERR_FUNC(SSL_F_SSL_SET_CERT), "SSL_SET_CERT"}, {ERR_FUNC(SSL_F_SSL_SET_CIPHER_LIST), "SSL_set_cipher_list"}, @@ -251,6 +259,7 @@ static ERR_STRING_DATA SSL_str_functs[]= {ERR_FUNC(SSL_F_SSL_SET_TRUST), "SSL_set_trust"}, {ERR_FUNC(SSL_F_SSL_SET_WFD), "SSL_set_wfd"}, {ERR_FUNC(SSL_F_SSL_SHUTDOWN), "SSL_shutdown"}, +{ERR_FUNC(SSL_F_SSL_SRP_CTX_INIT), "SSL_SRP_CTX_init"}, {ERR_FUNC(SSL_F_SSL_UNDEFINED_CONST_FUNCTION), "SSL_UNDEFINED_CONST_FUNCTION"}, {ERR_FUNC(SSL_F_SSL_UNDEFINED_FUNCTION), "SSL_UNDEFINED_FUNCTION"}, {ERR_FUNC(SSL_F_SSL_UNDEFINED_VOID_FUNCTION), "SSL_UNDEFINED_VOID_FUNCTION"}, @@ -270,6 +279,8 @@ static ERR_STRING_DATA SSL_str_functs[]= {ERR_FUNC(SSL_F_TLS1_CHANGE_CIPHER_STATE), "TLS1_CHANGE_CIPHER_STATE"}, {ERR_FUNC(SSL_F_TLS1_CHECK_SERVERHELLO_TLSEXT), "TLS1_CHECK_SERVERHELLO_TLSEXT"}, {ERR_FUNC(SSL_F_TLS1_ENC), "TLS1_ENC"}, +{ERR_FUNC(SSL_F_TLS1_EXPORT_KEYING_MATERIAL), "TLS1_EXPORT_KEYING_MATERIAL"}, +{ERR_FUNC(SSL_F_TLS1_HEARTBEAT), "SSL_F_TLS1_HEARTBEAT"}, {ERR_FUNC(SSL_F_TLS1_PREPARE_CLIENTHELLO_TLSEXT), "TLS1_PREPARE_CLIENTHELLO_TLSEXT"}, {ERR_FUNC(SSL_F_TLS1_PREPARE_SERVERHELLO_TLSEXT), "TLS1_PREPARE_SERVERHELLO_TLSEXT"}, {ERR_FUNC(SSL_F_TLS1_PRF), "tls1_prf"}, @@ -312,6 +323,13 @@ static ERR_STRING_DATA SSL_str_reasons[]= {ERR_REASON(SSL_R_BAD_RSA_MODULUS_LENGTH),"bad rsa modulus length"}, {ERR_REASON(SSL_R_BAD_RSA_SIGNATURE) ,"bad rsa signature"}, {ERR_REASON(SSL_R_BAD_SIGNATURE) ,"bad signature"}, +{ERR_REASON(SSL_R_BAD_SRP_A_LENGTH) ,"bad srp a length"}, +{ERR_REASON(SSL_R_BAD_SRP_B_LENGTH) ,"bad srp b length"}, +{ERR_REASON(SSL_R_BAD_SRP_G_LENGTH) ,"bad srp g length"}, +{ERR_REASON(SSL_R_BAD_SRP_N_LENGTH) ,"bad srp n length"}, +{ERR_REASON(SSL_R_BAD_SRP_S_LENGTH) ,"bad srp s length"}, +{ERR_REASON(SSL_R_BAD_SRTP_MKI_VALUE) ,"bad srtp mki value"}, +{ERR_REASON(SSL_R_BAD_SRTP_PROTECTION_PROFILE_LIST),"bad srtp protection profile list"}, {ERR_REASON(SSL_R_BAD_SSL_FILETYPE) ,"bad ssl filetype"}, {ERR_REASON(SSL_R_BAD_SSL_SESSION_ID_LENGTH),"bad ssl session id length"}, {ERR_REASON(SSL_R_BAD_STATE) ,"bad state"}, @@ -350,12 +368,15 @@ static ERR_STRING_DATA SSL_str_reasons[]= {ERR_REASON(SSL_R_ECC_CERT_SHOULD_HAVE_RSA_SIGNATURE),"ecc cert should have rsa signature"}, {ERR_REASON(SSL_R_ECC_CERT_SHOULD_HAVE_SHA1_SIGNATURE),"ecc cert should have sha1 signature"}, {ERR_REASON(SSL_R_ECGROUP_TOO_LARGE_FOR_CIPHER),"ecgroup too large for cipher"}, +{ERR_REASON(SSL_R_EMPTY_SRTP_PROTECTION_PROFILE_LIST),"empty srtp protection profile list"}, {ERR_REASON(SSL_R_ENCRYPTED_LENGTH_TOO_LONG),"encrypted length too long"}, {ERR_REASON(SSL_R_ERROR_GENERATING_TMP_RSA_KEY),"error generating tmp rsa key"}, {ERR_REASON(SSL_R_ERROR_IN_RECEIVED_CIPHER_LIST),"error in received cipher list"}, {ERR_REASON(SSL_R_EXCESSIVE_MESSAGE_SIZE),"excessive message size"}, {ERR_REASON(SSL_R_EXTRA_DATA_IN_MESSAGE) ,"extra data in message"}, {ERR_REASON(SSL_R_GOT_A_FIN_BEFORE_A_CCS),"got a fin before a ccs"}, +{ERR_REASON(SSL_R_GOT_NEXT_PROTO_BEFORE_A_CCS),"got next proto before a ccs"}, +{ERR_REASON(SSL_R_GOT_NEXT_PROTO_WITHOUT_EXTENSION),"got next proto without seeing extension"}, {ERR_REASON(SSL_R_HTTPS_PROXY_REQUEST) ,"https proxy request"}, {ERR_REASON(SSL_R_HTTP_REQUEST) ,"http request"}, {ERR_REASON(SSL_R_ILLEGAL_PADDING) ,"illegal padding"}, @@ -364,6 +385,7 @@ static ERR_STRING_DATA SSL_str_reasons[]= {ERR_REASON(SSL_R_INVALID_COMMAND) ,"invalid command"}, {ERR_REASON(SSL_R_INVALID_COMPRESSION_ALGORITHM),"invalid compression algorithm"}, {ERR_REASON(SSL_R_INVALID_PURPOSE) ,"invalid purpose"}, +{ERR_REASON(SSL_R_INVALID_SRP_USERNAME) ,"invalid srp username"}, {ERR_REASON(SSL_R_INVALID_STATUS_RESPONSE),"invalid status response"}, {ERR_REASON(SSL_R_INVALID_TICKET_KEYS_LENGTH),"invalid ticket keys length"}, {ERR_REASON(SSL_R_INVALID_TRUST) ,"invalid trust"}, @@ -393,6 +415,7 @@ static ERR_STRING_DATA SSL_str_reasons[]= {ERR_REASON(SSL_R_MISSING_RSA_CERTIFICATE),"missing rsa certificate"}, {ERR_REASON(SSL_R_MISSING_RSA_ENCRYPTING_CERT),"missing rsa encrypting cert"}, {ERR_REASON(SSL_R_MISSING_RSA_SIGNING_CERT),"missing rsa signing cert"}, +{ERR_REASON(SSL_R_MISSING_SRP_PARAM) ,"can't find SRP server param"}, {ERR_REASON(SSL_R_MISSING_TMP_DH_KEY) ,"missing tmp dh key"}, {ERR_REASON(SSL_R_MISSING_TMP_ECDH_KEY) ,"missing tmp ecdh key"}, {ERR_REASON(SSL_R_MISSING_TMP_RSA_KEY) ,"missing tmp rsa key"}, @@ -422,6 +445,7 @@ static ERR_STRING_DATA SSL_str_reasons[]= {ERR_REASON(SSL_R_NO_RENEGOTIATION) ,"no renegotiation"}, {ERR_REASON(SSL_R_NO_REQUIRED_DIGEST) ,"digest requred for handshake isn't computed"}, {ERR_REASON(SSL_R_NO_SHARED_CIPHER) ,"no shared cipher"}, +{ERR_REASON(SSL_R_NO_SRTP_PROFILES) ,"no srtp profiles"}, {ERR_REASON(SSL_R_NO_VERIFY_CALLBACK) ,"no verify callback"}, {ERR_REASON(SSL_R_NULL_SSL_CTX) ,"null ssl ctx"}, {ERR_REASON(SSL_R_NULL_SSL_METHOD_PASSED),"null ssl method passed"}, @@ -465,7 +489,12 @@ static ERR_STRING_DATA SSL_str_reasons[]= {ERR_REASON(SSL_R_SERVERHELLO_TLSEXT) ,"serverhello tlsext"}, {ERR_REASON(SSL_R_SESSION_ID_CONTEXT_UNINITIALIZED),"session id context uninitialized"}, {ERR_REASON(SSL_R_SHORT_READ) ,"short read"}, +{ERR_REASON(SSL_R_SIGNATURE_ALGORITHMS_ERROR),"signature algorithms error"}, {ERR_REASON(SSL_R_SIGNATURE_FOR_NON_SIGNING_CERTIFICATE),"signature for non signing certificate"}, +{ERR_REASON(SSL_R_SRP_A_CALC) ,"error with the srp params"}, +{ERR_REASON(SSL_R_SRTP_COULD_NOT_ALLOCATE_PROFILES),"srtp could not allocate profiles"}, +{ERR_REASON(SSL_R_SRTP_PROTECTION_PROFILE_LIST_TOO_LONG),"srtp protection profile list too long"}, +{ERR_REASON(SSL_R_SRTP_UNKNOWN_PROTECTION_PROFILE),"srtp unknown protection profile"}, {ERR_REASON(SSL_R_SSL23_DOING_SESSION_ID_REUSE),"ssl23 doing session id reuse"}, {ERR_REASON(SSL_R_SSL2_CONNECTION_ID_TOO_LONG),"ssl2 connection id too long"}, {ERR_REASON(SSL_R_SSL3_EXT_INVALID_ECPOINTFORMAT),"ssl3 ext invalid ecpointformat"}, @@ -510,6 +539,9 @@ static ERR_STRING_DATA SSL_str_reasons[]= {ERR_REASON(SSL_R_TLSV1_UNRECOGNIZED_NAME),"tlsv1 unrecognized name"}, {ERR_REASON(SSL_R_TLSV1_UNSUPPORTED_EXTENSION),"tlsv1 unsupported extension"}, {ERR_REASON(SSL_R_TLS_CLIENT_CERT_REQ_WITH_ANON_CIPHER),"tls client cert req with anon cipher"}, +{ERR_REASON(SSL_R_TLS_HEARTBEAT_PEER_DOESNT_ACCEPT),"peer does not accept heartbearts"}, +{ERR_REASON(SSL_R_TLS_HEARTBEAT_PENDING) ,"heartbeat request already pending"}, +{ERR_REASON(SSL_R_TLS_ILLEGAL_EXPORTER_LABEL),"tls illegal exporter label"}, {ERR_REASON(SSL_R_TLS_INVALID_ECPOINTFORMAT_LIST),"tls invalid ecpointformat list"}, {ERR_REASON(SSL_R_TLS_PEER_DID_NOT_RESPOND_WITH_CERTIFICATE_LIST),"tls peer did not respond with certificate list"}, {ERR_REASON(SSL_R_TLS_RSA_ENCRYPTED_VALUE_LENGTH_IS_WRONG),"tls rsa encrypted value length is wrong"}, @@ -531,6 +563,7 @@ static ERR_STRING_DATA SSL_str_reasons[]= {ERR_REASON(SSL_R_UNKNOWN_CERTIFICATE_TYPE),"unknown certificate type"}, {ERR_REASON(SSL_R_UNKNOWN_CIPHER_RETURNED),"unknown cipher returned"}, {ERR_REASON(SSL_R_UNKNOWN_CIPHER_TYPE) ,"unknown cipher type"}, +{ERR_REASON(SSL_R_UNKNOWN_DIGEST) ,"unknown digest"}, {ERR_REASON(SSL_R_UNKNOWN_KEY_EXCHANGE_TYPE),"unknown key exchange type"}, {ERR_REASON(SSL_R_UNKNOWN_PKEY_TYPE) ,"unknown pkey type"}, {ERR_REASON(SSL_R_UNKNOWN_PROTOCOL) ,"unknown protocol"}, @@ -545,12 +578,14 @@ static ERR_STRING_DATA SSL_str_reasons[]= {ERR_REASON(SSL_R_UNSUPPORTED_PROTOCOL) ,"unsupported protocol"}, {ERR_REASON(SSL_R_UNSUPPORTED_SSL_VERSION),"unsupported ssl version"}, {ERR_REASON(SSL_R_UNSUPPORTED_STATUS_TYPE),"unsupported status type"}, +{ERR_REASON(SSL_R_USE_SRTP_NOT_NEGOTIATED),"use srtp not negotiated"}, {ERR_REASON(SSL_R_WRITE_BIO_NOT_SET) ,"write bio not set"}, {ERR_REASON(SSL_R_WRONG_CIPHER_RETURNED) ,"wrong cipher returned"}, {ERR_REASON(SSL_R_WRONG_MESSAGE_TYPE) ,"wrong message type"}, {ERR_REASON(SSL_R_WRONG_NUMBER_OF_KEY_BITS),"wrong number of key bits"}, {ERR_REASON(SSL_R_WRONG_SIGNATURE_LENGTH),"wrong signature length"}, {ERR_REASON(SSL_R_WRONG_SIGNATURE_SIZE) ,"wrong signature size"}, +{ERR_REASON(SSL_R_WRONG_SIGNATURE_TYPE) ,"wrong signature type"}, {ERR_REASON(SSL_R_WRONG_SSL_VERSION) ,"wrong ssl version"}, {ERR_REASON(SSL_R_WRONG_VERSION_NUMBER) ,"wrong version number"}, {ERR_REASON(SSL_R_X509_LIB) ,"x509 lib"}, diff --git a/crypto/openssl/ssl/ssl_lib.c b/crypto/openssl/ssl/ssl_lib.c index 6688f19a64..f82d071d6e 100644 --- a/crypto/openssl/ssl/ssl_lib.c +++ b/crypto/openssl/ssl/ssl_lib.c @@ -176,7 +176,10 @@ SSL3_ENC_METHOD ssl3_undef_enc_method={ 0, /* client_finished_label_len */ NULL, /* server_finished_label */ 0, /* server_finished_label_len */ - (int (*)(int))ssl_undefined_function + (int (*)(int))ssl_undefined_function, + (int (*)(SSL *, unsigned char *, size_t, const char *, + size_t, const unsigned char *, size_t, + int use_context)) ssl_undefined_function, }; int SSL_clear(SSL *s) @@ -202,9 +205,9 @@ int SSL_clear(SSL *s) * needed because SSL_clear is not called when doing renegotiation) */ /* This is set if we are doing dynamic renegotiation so keep * the old cipher. It is sort of a SSL_clear_lite :-) */ - if (s->new_session) return(1); + if (s->renegotiate) return(1); #else - if (s->new_session) + if (s->renegotiate) { SSLerr(SSL_F_SSL_CLEAR,ERR_R_INTERNAL_ERROR); return 0; @@ -353,6 +356,9 @@ SSL *SSL_new(SSL_CTX *ctx) s->tlsext_ocsp_resplen = -1; CRYPTO_add(&ctx->references,1,CRYPTO_LOCK_SSL_CTX); s->initial_ctx=ctx; +# ifndef OPENSSL_NO_NEXTPROTONEG + s->next_proto_negotiated = NULL; +# endif #endif s->verify_result=X509_V_OK; @@ -586,6 +592,14 @@ void SSL_free(SSL *s) kssl_ctx_free(s->kssl_ctx); #endif /* OPENSSL_NO_KRB5 */ +#if !defined(OPENSSL_NO_TLSEXT) && !defined(OPENSSL_NO_NEXTPROTONEG) + if (s->next_proto_negotiated) + OPENSSL_free(s->next_proto_negotiated); +#endif + + if (s->srtp_profiles) + sk_SRTP_PROTECTION_PROFILE_free(s->srtp_profiles); + OPENSSL_free(s); } @@ -1008,10 +1022,21 @@ int SSL_shutdown(SSL *s) int SSL_renegotiate(SSL *s) { - if (s->new_session == 0) - { - s->new_session=1; - } + if (s->renegotiate == 0) + s->renegotiate=1; + + s->new_session=1; + + return(s->method->ssl_renegotiate(s)); + } + +int SSL_renegotiate_abbreviated(SSL *s) + { + if (s->renegotiate == 0) + s->renegotiate=1; + + s->new_session=0; + return(s->method->ssl_renegotiate(s)); } @@ -1019,7 +1044,7 @@ int SSL_renegotiate_pending(SSL *s) { /* becomes true when negotiation is requested; * false again once a handshake has finished */ - return (s->new_session != 0); + return (s->renegotiate != 0); } long SSL_ctrl(SSL *s,int cmd,long larg,void *parg) @@ -1360,6 +1385,10 @@ int ssl_cipher_list_to_bytes(SSL *s,STACK_OF(SSL_CIPHER) *sk,unsigned char *p, for (i=0; ialgorithm_ssl & SSL_TLSV1_2) && + (TLS1_get_client_version(s) < TLS1_2_VERSION)) + continue; #ifndef OPENSSL_NO_KRB5 if (((c->algorithm_mkey & SSL_kKRB5) || (c->algorithm_auth & SSL_aKRB5)) && nokrb5) @@ -1377,7 +1406,7 @@ int ssl_cipher_list_to_bytes(SSL *s,STACK_OF(SSL_CIPHER) *sk,unsigned char *p, /* If p == q, no ciphers and caller indicates an error. Otherwise * add SCSV if not renegotiating. */ - if (p != q && !s->new_session) + if (p != q && !s->renegotiate) { static SSL_CIPHER scsv = { @@ -1424,7 +1453,7 @@ STACK_OF(SSL_CIPHER) *ssl_bytes_to_cipher_list(SSL *s,unsigned char *p,int num, (p[n-1] == (SSL3_CK_SCSV & 0xff))) { /* SCSV fatal if renegotiating */ - if (s->new_session) + if (s->renegotiate) { SSLerr(SSL_F_SSL_BYTES_TO_CIPHER_LIST,SSL_R_SCSV_RECEIVED_WHEN_RENEGOTIATING); ssl3_send_alert(s,SSL3_AL_FATAL,SSL_AD_HANDSHAKE_FAILURE); @@ -1481,8 +1510,137 @@ int SSL_get_servername_type(const SSL *s) return TLSEXT_NAMETYPE_host_name; return -1; } + +# ifndef OPENSSL_NO_NEXTPROTONEG +/* SSL_select_next_proto implements the standard protocol selection. It is + * expected that this function is called from the callback set by + * SSL_CTX_set_next_proto_select_cb. + * + * The protocol data is assumed to be a vector of 8-bit, length prefixed byte + * strings. The length byte itself is not included in the length. A byte + * string of length 0 is invalid. No byte string may be truncated. + * + * The current, but experimental algorithm for selecting the protocol is: + * + * 1) If the server doesn't support NPN then this is indicated to the + * callback. In this case, the client application has to abort the connection + * or have a default application level protocol. + * + * 2) If the server supports NPN, but advertises an empty list then the + * client selects the first protcol in its list, but indicates via the + * API that this fallback case was enacted. + * + * 3) Otherwise, the client finds the first protocol in the server's list + * that it supports and selects this protocol. This is because it's + * assumed that the server has better information about which protocol + * a client should use. + * + * 4) If the client doesn't support any of the server's advertised + * protocols, then this is treated the same as case 2. + * + * It returns either + * OPENSSL_NPN_NEGOTIATED if a common protocol was found, or + * OPENSSL_NPN_NO_OVERLAP if the fallback case was reached. + */ +int SSL_select_next_proto(unsigned char **out, unsigned char *outlen, const unsigned char *server, unsigned int server_len, const unsigned char *client, unsigned int client_len) + { + unsigned int i, j; + const unsigned char *result; + int status = OPENSSL_NPN_UNSUPPORTED; + + /* For each protocol in server preference order, see if we support it. */ + for (i = 0; i < server_len; ) + { + for (j = 0; j < client_len; ) + { + if (server[i] == client[j] && + memcmp(&server[i+1], &client[j+1], server[i]) == 0) + { + /* We found a match */ + result = &server[i]; + status = OPENSSL_NPN_NEGOTIATED; + goto found; + } + j += client[j]; + j++; + } + i += server[i]; + i++; + } + + /* There's no overlap between our protocols and the server's list. */ + result = client; + status = OPENSSL_NPN_NO_OVERLAP; + + found: + *out = (unsigned char *) result + 1; + *outlen = result[0]; + return status; + } + +/* SSL_get0_next_proto_negotiated sets *data and *len to point to the client's + * requested protocol for this connection and returns 0. If the client didn't + * request any protocol, then *data is set to NULL. + * + * Note that the client can request any protocol it chooses. The value returned + * from this function need not be a member of the list of supported protocols + * provided by the callback. + */ +void SSL_get0_next_proto_negotiated(const SSL *s, const unsigned char **data, unsigned *len) + { + *data = s->next_proto_negotiated; + if (!*data) { + *len = 0; + } else { + *len = s->next_proto_negotiated_len; + } +} + +/* SSL_CTX_set_next_protos_advertised_cb sets a callback that is called when a + * TLS server needs a list of supported protocols for Next Protocol + * Negotiation. The returned list must be in wire format. The list is returned + * by setting |out| to point to it and |outlen| to its length. This memory will + * not be modified, but one should assume that the SSL* keeps a reference to + * it. + * + * The callback should return SSL_TLSEXT_ERR_OK if it wishes to advertise. Otherwise, no + * such extension will be included in the ServerHello. */ +void SSL_CTX_set_next_protos_advertised_cb(SSL_CTX *ctx, int (*cb) (SSL *ssl, const unsigned char **out, unsigned int *outlen, void *arg), void *arg) + { + ctx->next_protos_advertised_cb = cb; + ctx->next_protos_advertised_cb_arg = arg; + } + +/* SSL_CTX_set_next_proto_select_cb sets a callback that is called when a + * client needs to select a protocol from the server's provided list. |out| + * must be set to point to the selected protocol (which may be within |in|). + * The length of the protocol name must be written into |outlen|. The server's + * advertised protocols are provided in |in| and |inlen|. The callback can + * assume that |in| is syntactically valid. + * + * The client must select a protocol. It is fatal to the connection if this + * callback returns a value other than SSL_TLSEXT_ERR_OK. + */ +void SSL_CTX_set_next_proto_select_cb(SSL_CTX *ctx, int (*cb) (SSL *s, unsigned char **out, unsigned char *outlen, const unsigned char *in, unsigned int inlen, void *arg), void *arg) + { + ctx->next_proto_select_cb = cb; + ctx->next_proto_select_cb_arg = arg; + } +# endif #endif +int SSL_export_keying_material(SSL *s, unsigned char *out, size_t olen, + const char *label, size_t llen, const unsigned char *p, size_t plen, + int use_context) + { + if (s->version < TLS1_VERSION) + return -1; + + return s->method->ssl3_enc->export_keying_material(s, out, olen, label, + llen, p, plen, + use_context); + } + static unsigned long ssl_session_hash(const SSL_SESSION *a) { unsigned long l; @@ -1526,6 +1684,14 @@ SSL_CTX *SSL_CTX_new(const SSL_METHOD *meth) return(NULL); } +#ifdef OPENSSL_FIPS + if (FIPS_mode() && (meth->version < TLS1_VERSION)) + { + SSLerr(SSL_F_SSL_CTX_NEW, SSL_R_ONLY_TLS_ALLOWED_IN_FIPS_MODE); + return NULL; + } +#endif + if (SSL_get_ex_data_X509_STORE_CTX_idx() < 0) { SSLerr(SSL_F_SSL_CTX_NEW,SSL_R_X509_VERIFICATION_SETUP_PROBLEMS); @@ -1645,12 +1811,19 @@ SSL_CTX *SSL_CTX_new(const SSL_METHOD *meth) ret->tlsext_status_cb = 0; ret->tlsext_status_arg = NULL; +# ifndef OPENSSL_NO_NEXTPROTONEG + ret->next_protos_advertised_cb = 0; + ret->next_proto_select_cb = 0; +# endif #endif #ifndef OPENSSL_NO_PSK ret->psk_identity_hint=NULL; ret->psk_client_callback=NULL; ret->psk_server_callback=NULL; #endif +#ifndef OPENSSL_NO_SRP + SSL_CTX_SRP_CTX_init(ret); +#endif #ifndef OPENSSL_NO_BUF_FREELISTS ret->freelist_max_len = SSL_MAX_BUF_FREELIST_LEN_DEFAULT; ret->rbuf_freelist = OPENSSL_malloc(sizeof(SSL3_BUF_FREELIST)); @@ -1779,10 +1952,16 @@ void SSL_CTX_free(SSL_CTX *a) a->comp_methods = NULL; #endif + if (a->srtp_profiles) + sk_SRTP_PROTECTION_PROFILE_free(a->srtp_profiles); + #ifndef OPENSSL_NO_PSK if (a->psk_identity_hint) OPENSSL_free(a->psk_identity_hint); #endif +#ifndef OPENSSL_NO_SRP + SSL_CTX_SRP_CTX_free(a); +#endif #ifndef OPENSSL_NO_ENGINE if (a->client_cert_engine) ENGINE_finish(a->client_cert_engine); @@ -2036,12 +2215,13 @@ void ssl_set_cert_masks(CERT *c, const SSL_CIPHER *cipher) #ifndef OPENSSL_NO_EC -int ssl_check_srvr_ecc_cert_and_alg(X509 *x, const SSL_CIPHER *cs) +int ssl_check_srvr_ecc_cert_and_alg(X509 *x, SSL *s) { unsigned long alg_k, alg_a; EVP_PKEY *pkey = NULL; int keysize = 0; int signature_nid = 0, md_nid = 0, pk_nid = 0; + const SSL_CIPHER *cs = s->s3->tmp.new_cipher; alg_k = cs->algorithm_mkey; alg_a = cs->algorithm_auth; @@ -2071,7 +2251,7 @@ int ssl_check_srvr_ecc_cert_and_alg(X509 *x, const SSL_CIPHER *cs) SSLerr(SSL_F_SSL_CHECK_SRVR_ECC_CERT_AND_ALG, SSL_R_ECC_CERT_NOT_FOR_KEY_AGREEMENT); return 0; } - if (alg_k & SSL_kECDHe) + if ((alg_k & SSL_kECDHe) && TLS1_get_version(s) < TLS1_2_VERSION) { /* signature alg must be ECDSA */ if (pk_nid != NID_X9_62_id_ecPublicKey) @@ -2080,7 +2260,7 @@ int ssl_check_srvr_ecc_cert_and_alg(X509 *x, const SSL_CIPHER *cs) return 0; } } - if (alg_k & SSL_kECDHr) + if ((alg_k & SSL_kECDHr) && TLS1_get_version(s) < TLS1_2_VERSION) { /* signature alg must be RSA */ @@ -2170,34 +2350,36 @@ X509 *ssl_get_server_send_cert(SSL *s) return(c->pkeys[i].x509); } -EVP_PKEY *ssl_get_sign_pkey(SSL *s,const SSL_CIPHER *cipher) +EVP_PKEY *ssl_get_sign_pkey(SSL *s,const SSL_CIPHER *cipher, const EVP_MD **pmd) { unsigned long alg_a; CERT *c; + int idx = -1; alg_a = cipher->algorithm_auth; c=s->cert; if ((alg_a & SSL_aDSS) && (c->pkeys[SSL_PKEY_DSA_SIGN].privatekey != NULL)) - return(c->pkeys[SSL_PKEY_DSA_SIGN].privatekey); + idx = SSL_PKEY_DSA_SIGN; else if (alg_a & SSL_aRSA) { if (c->pkeys[SSL_PKEY_RSA_SIGN].privatekey != NULL) - return(c->pkeys[SSL_PKEY_RSA_SIGN].privatekey); + idx = SSL_PKEY_RSA_SIGN; else if (c->pkeys[SSL_PKEY_RSA_ENC].privatekey != NULL) - return(c->pkeys[SSL_PKEY_RSA_ENC].privatekey); - else - return(NULL); + idx = SSL_PKEY_RSA_ENC; } else if ((alg_a & SSL_aECDSA) && (c->pkeys[SSL_PKEY_ECC].privatekey != NULL)) - return(c->pkeys[SSL_PKEY_ECC].privatekey); - else /* if (alg_a & SSL_aNULL) */ + idx = SSL_PKEY_ECC; + if (idx == -1) { SSLerr(SSL_F_SSL_GET_SIGN_PKEY,ERR_R_INTERNAL_ERROR); return(NULL); } + if (pmd) + *pmd = c->pkeys[idx].digest; + return c->pkeys[idx].privatekey; } void ssl_update_cache(SSL *s,int mode) @@ -2422,6 +2604,10 @@ SSL_METHOD *ssl_bad_method(int ver) const char *SSL_get_version(const SSL *s) { + if (s->version == TLS1_2_VERSION) + return("TLSv1.2"); + else if (s->version == TLS1_1_VERSION) + return("TLSv1.1"); if (s->version == TLS1_VERSION) return("TLSv1"); else if (s->version == SSL3_VERSION) @@ -2516,6 +2702,7 @@ SSL *SSL_dup(SSL *s) ret->in_handshake = s->in_handshake; ret->handshake_func = s->handshake_func; ret->server = s->server; + ret->renegotiate = s->renegotiate; ret->new_session = s->new_session; ret->quiet_shutdown = s->quiet_shutdown; ret->shutdown=s->shutdown; @@ -2781,6 +2968,11 @@ int SSL_state(const SSL *ssl) return(ssl->state); } +void SSL_set_state(SSL *ssl, int state) + { + ssl->state = state; + } + void SSL_set_verify_result(SSL *ssl,long arg) { ssl->verify_result=arg; @@ -3039,6 +3231,16 @@ void ssl_clear_hash_ctx(EVP_MD_CTX **hash) *hash=NULL; } +void SSL_set_debug(SSL *s, int debug) + { + s->debug = debug; + } + +int SSL_cache_hit(SSL *s) + { + return s->hit; + } + #if defined(_WINDLL) && defined(OPENSSL_SYS_WIN16) #include "../crypto/bio/bss_file.c" #endif diff --git a/crypto/openssl/ssl/ssl_locl.h b/crypto/openssl/ssl/ssl_locl.h index cea622a2a6..d87fd51cfa 100644 --- a/crypto/openssl/ssl/ssl_locl.h +++ b/crypto/openssl/ssl/ssl_locl.h @@ -170,7 +170,7 @@ # define OPENSSL_EXTERN OPENSSL_EXPORT #endif -#define PKCS1_CHECK +#undef PKCS1_CHECK #define c2l(c,l) (l = ((unsigned long)(*((c)++))) , \ l|=(((unsigned long)(*((c)++)))<< 8), \ @@ -289,6 +289,7 @@ #define SSL_kEECDH 0x00000080L /* ephemeral ECDH */ #define SSL_kPSK 0x00000100L /* PSK */ #define SSL_kGOST 0x00000200L /* GOST key exchange */ +#define SSL_kSRP 0x00000400L /* SRP */ /* Bits for algorithm_auth (server authentication) */ #define SSL_aRSA 0x00000001L /* RSA auth */ @@ -316,21 +317,29 @@ #define SSL_CAMELLIA256 0x00000200L #define SSL_eGOST2814789CNT 0x00000400L #define SSL_SEED 0x00000800L +#define SSL_AES128GCM 0x00001000L +#define SSL_AES256GCM 0x00002000L -#define SSL_AES (SSL_AES128|SSL_AES256) +#define SSL_AES (SSL_AES128|SSL_AES256|SSL_AES128GCM|SSL_AES256GCM) #define SSL_CAMELLIA (SSL_CAMELLIA128|SSL_CAMELLIA256) /* Bits for algorithm_mac (symmetric authentication) */ + #define SSL_MD5 0x00000001L #define SSL_SHA1 0x00000002L #define SSL_GOST94 0x00000004L #define SSL_GOST89MAC 0x00000008L +#define SSL_SHA256 0x00000010L +#define SSL_SHA384 0x00000020L +/* Not a real MAC, just an indication it is part of cipher */ +#define SSL_AEAD 0x00000040L /* Bits for algorithm_ssl (protocol version) */ #define SSL_SSLV2 0x00000001L #define SSL_SSLV3 0x00000002L #define SSL_TLSV1 SSL_SSLV3 /* for now */ +#define SSL_TLSV1_2 0x00000004L /* Bits for algorithm2 (handshake digests and other extra flags) */ @@ -338,15 +347,21 @@ #define SSL_HANDSHAKE_MAC_MD5 0x10 #define SSL_HANDSHAKE_MAC_SHA 0x20 #define SSL_HANDSHAKE_MAC_GOST94 0x40 +#define SSL_HANDSHAKE_MAC_SHA256 0x80 +#define SSL_HANDSHAKE_MAC_SHA384 0x100 #define SSL_HANDSHAKE_MAC_DEFAULT (SSL_HANDSHAKE_MAC_MD5 | SSL_HANDSHAKE_MAC_SHA) /* When adding new digest in the ssl_ciph.c and increment SSM_MD_NUM_IDX * make sure to update this constant too */ -#define SSL_MAX_DIGEST 4 +#define SSL_MAX_DIGEST 6 + +#define TLS1_PRF_DGST_MASK (0xff << TLS1_PRF_DGST_SHIFT) -#define TLS1_PRF_DGST_SHIFT 8 +#define TLS1_PRF_DGST_SHIFT 10 #define TLS1_PRF_MD5 (SSL_HANDSHAKE_MAC_MD5 << TLS1_PRF_DGST_SHIFT) #define TLS1_PRF_SHA1 (SSL_HANDSHAKE_MAC_SHA << TLS1_PRF_DGST_SHIFT) +#define TLS1_PRF_SHA256 (SSL_HANDSHAKE_MAC_SHA256 << TLS1_PRF_DGST_SHIFT) +#define TLS1_PRF_SHA384 (SSL_HANDSHAKE_MAC_SHA384 << TLS1_PRF_DGST_SHIFT) #define TLS1_PRF_GOST94 (SSL_HANDSHAKE_MAC_GOST94 << TLS1_PRF_DGST_SHIFT) #define TLS1_PRF (TLS1_PRF_MD5 | TLS1_PRF_SHA1) @@ -457,6 +472,8 @@ typedef struct cert_pkey_st { X509 *x509; EVP_PKEY *privatekey; + /* Digest to use when signing */ + const EVP_MD *digest; } CERT_PKEY; typedef struct cert_st @@ -554,6 +571,10 @@ typedef struct ssl3_enc_method const char *server_finished_label; int server_finished_label_len; int (*alert_value)(int); + int (*export_keying_material)(SSL *, unsigned char *, size_t, + const char *, size_t, + const unsigned char *, size_t, + int use_context); } SSL3_ENC_METHOD; #ifndef OPENSSL_NO_COMP @@ -591,11 +612,12 @@ extern SSL3_ENC_METHOD TLSv1_enc_data; extern SSL3_ENC_METHOD SSLv3_enc_data; extern SSL3_ENC_METHOD DTLSv1_enc_data; -#define IMPLEMENT_tls1_meth_func(func_name, s_accept, s_connect, s_get_meth) \ +#define IMPLEMENT_tls_meth_func(version, func_name, s_accept, s_connect, \ + s_get_meth) \ const SSL_METHOD *func_name(void) \ { \ static const SSL_METHOD func_name##_data= { \ - TLS1_VERSION, \ + version, \ tls1_new, \ tls1_clear, \ tls1_free, \ @@ -669,7 +691,7 @@ const SSL_METHOD *func_name(void) \ const SSL_METHOD *func_name(void) \ { \ static const SSL_METHOD func_name##_data= { \ - TLS1_VERSION, \ + TLS1_2_VERSION, \ tls1_new, \ tls1_clear, \ tls1_free, \ @@ -752,7 +774,7 @@ const SSL_METHOD *func_name(void) \ ssl3_read, \ ssl3_peek, \ ssl3_write, \ - ssl3_shutdown, \ + dtls1_shutdown, \ ssl3_renegotiate, \ ssl3_renegotiate_check, \ dtls1_get_message, \ @@ -809,7 +831,7 @@ int ssl_undefined_function(SSL *s); int ssl_undefined_void_function(void); int ssl_undefined_const_function(const SSL *s); X509 *ssl_get_server_send_cert(SSL *); -EVP_PKEY *ssl_get_sign_pkey(SSL *,const SSL_CIPHER *); +EVP_PKEY *ssl_get_sign_pkey(SSL *s,const SSL_CIPHER *c, const EVP_MD **pmd); int ssl_cert_type(X509 *x,EVP_PKEY *pkey); void ssl_set_cert_masks(CERT *c, const SSL_CIPHER *cipher); STACK_OF(SSL_CIPHER) *ssl_get_ciphers_by_id(SSL *s); @@ -943,6 +965,7 @@ void dtls1_get_ccs_header(unsigned char *data, struct ccs_header_st *ccs_hdr); void dtls1_reset_seq_numbers(SSL *s, int rw); long dtls1_default_timeout(void); struct timeval* dtls1_get_timeout(SSL *s, struct timeval* timeleft); +int dtls1_check_timeout_num(SSL *s); int dtls1_handle_timeout(SSL *s); const SSL_CIPHER *dtls1_get_cipher(unsigned int u); void dtls1_start_timer(SSL *s); @@ -968,6 +991,9 @@ int ssl3_get_server_certificate(SSL *s); int ssl3_check_cert_and_algorithm(SSL *s); #ifndef OPENSSL_NO_TLSEXT int ssl3_check_finished(SSL *s); +# ifndef OPENSSL_NO_NEXTPROTONEG +int ssl3_send_next_proto(SSL *s); +# endif #endif int dtls1_client_hello(SSL *s); @@ -986,6 +1012,9 @@ int ssl3_check_client_hello(SSL *s); int ssl3_get_client_certificate(SSL *s); int ssl3_get_client_key_exchange(SSL *s); int ssl3_get_cert_verify(SSL *s); +#ifndef OPENSSL_NO_NEXTPROTONEG +int ssl3_get_next_proto(SSL *s); +#endif int dtls1_send_hello_request(SSL *s); int dtls1_send_server_hello(SSL *s); @@ -1013,6 +1042,7 @@ int dtls1_connect(SSL *s); void dtls1_free(SSL *s); void dtls1_clear(SSL *s); long dtls1_ctrl(SSL *s,int cmd, long larg, void *parg); +int dtls1_shutdown(SSL *s); long dtls1_get_message(SSL *s, int st1, int stn, int mt, long max, int *ok); int dtls1_get_record(SSL *s); @@ -1033,12 +1063,15 @@ int tls1_cert_verify_mac(SSL *s, int md_nid, unsigned char *p); int tls1_mac(SSL *ssl, unsigned char *md, int snd); int tls1_generate_master_secret(SSL *s, unsigned char *out, unsigned char *p, int len); +int tls1_export_keying_material(SSL *s, unsigned char *out, size_t olen, + const char *label, size_t llen, + const unsigned char *p, size_t plen, int use_context); int tls1_alert_code(int code); int ssl3_alert_code(int code); int ssl_ok(SSL *s); #ifndef OPENSSL_NO_ECDH -int ssl_check_srvr_ecc_cert_and_alg(X509 *x, const SSL_CIPHER *cs); +int ssl_check_srvr_ecc_cert_and_alg(X509 *x, SSL *s); #endif SSL_COMP *ssl3_comp_find(STACK_OF(SSL_COMP) *sk, int n); @@ -1058,6 +1091,13 @@ int ssl_prepare_serverhello_tlsext(SSL *s); int ssl_check_clienthello_tlsext(SSL *s); int ssl_check_serverhello_tlsext(SSL *s); +#ifndef OPENSSL_NO_HEARTBEATS +int tls1_heartbeat(SSL *s); +int dtls1_heartbeat(SSL *s); +int tls1_process_heartbeat(SSL *s); +int dtls1_process_heartbeat(SSL *s); +#endif + #ifdef OPENSSL_NO_SHA256 #define tlsext_tick_md EVP_sha1 #else @@ -1065,6 +1105,12 @@ int ssl_check_serverhello_tlsext(SSL *s); #endif int tls1_process_ticket(SSL *s, unsigned char *session_id, int len, const unsigned char *limit, SSL_SESSION **ret); + +int tls12_get_sigandhash(unsigned char *p, const EVP_PKEY *pk, + const EVP_MD *md); +int tls12_get_sigid(const EVP_PKEY *pk); +const EVP_MD *tls12_get_hash(unsigned char hash_alg); + #endif EVP_MD_CTX* ssl_replace_hash(EVP_MD_CTX **hash,const EVP_MD *md) ; void ssl_clear_hash_ctx(EVP_MD_CTX **hash); @@ -1076,4 +1122,13 @@ int ssl_add_clienthello_renegotiate_ext(SSL *s, unsigned char *p, int *len, int maxlen); int ssl_parse_clienthello_renegotiate_ext(SSL *s, unsigned char *d, int len, int *al); +long ssl_get_algorithm2(SSL *s); +int tls1_process_sigalgs(SSL *s, const unsigned char *data, int dsize); +int tls12_get_req_sig_algs(SSL *s, unsigned char *p); + +int ssl_add_clienthello_use_srtp_ext(SSL *s, unsigned char *p, int *len, int maxlen); +int ssl_parse_clienthello_use_srtp_ext(SSL *s, unsigned char *d, int len,int *al); +int ssl_add_serverhello_use_srtp_ext(SSL *s, unsigned char *p, int *len, int maxlen); +int ssl_parse_serverhello_use_srtp_ext(SSL *s, unsigned char *d, int len,int *al); + #endif diff --git a/crypto/openssl/ssl/ssl_sess.c b/crypto/openssl/ssl/ssl_sess.c index 8e5d8a0972..ad40fadd02 100644 --- a/crypto/openssl/ssl/ssl_sess.c +++ b/crypto/openssl/ssl/ssl_sess.c @@ -217,6 +217,9 @@ SSL_SESSION *SSL_SESSION_new(void) #ifndef OPENSSL_NO_PSK ss->psk_identity_hint=NULL; ss->psk_identity=NULL; +#endif +#ifndef OPENSSL_NO_SRP + ss->srp_username=NULL; #endif return(ss); } @@ -228,6 +231,11 @@ const unsigned char *SSL_SESSION_get_id(const SSL_SESSION *s, unsigned int *len) return s->session_id; } +unsigned int SSL_SESSION_get_compress_id(const SSL_SESSION *s) + { + return s->compress_meth; + } + /* Even with SSLv2, we have 16 bytes (128 bits) of session ID space. SSLv3/TLSv1 * has 32 bytes (256 bits). As such, filling the ID with random gunk repeatedly * until we have no conflict is going to complete in one iteration pretty much @@ -300,6 +308,16 @@ int ssl_get_new_session(SSL *s, int session) ss->ssl_version=TLS1_VERSION; ss->session_id_length=SSL3_SSL_SESSION_ID_LENGTH; } + else if (s->version == TLS1_1_VERSION) + { + ss->ssl_version=TLS1_1_VERSION; + ss->session_id_length=SSL3_SSL_SESSION_ID_LENGTH; + } + else if (s->version == TLS1_2_VERSION) + { + ss->ssl_version=TLS1_2_VERSION; + ss->session_id_length=SSL3_SSL_SESSION_ID_LENGTH; + } else if (s->version == DTLS1_BAD_VER) { ss->ssl_version=DTLS1_BAD_VER; @@ -423,6 +441,25 @@ int ssl_get_new_session(SSL *s, int session) return(1); } +/* ssl_get_prev attempts to find an SSL_SESSION to be used to resume this + * connection. It is only called by servers. + * + * session_id: points at the session ID in the ClientHello. This code will + * read past the end of this in order to parse out the session ticket + * extension, if any. + * len: the length of the session ID. + * limit: a pointer to the first byte after the ClientHello. + * + * Returns: + * -1: error + * 0: a session may have been found. + * + * Side effects: + * - If a session is found then s->session is pointed at it (after freeing an + * existing session if need be) and s->verify_result is set from the session. + * - Both for new and resumed sessions, s->tlsext_ticket_expected is set to 1 + * if the server should issue a new session ticket (to 0 otherwise). + */ int ssl_get_prev_session(SSL *s, unsigned char *session_id, int len, const unsigned char *limit) { @@ -430,27 +467,39 @@ int ssl_get_prev_session(SSL *s, unsigned char *session_id, int len, SSL_SESSION *ret=NULL; int fatal = 0; + int try_session_cache = 1; #ifndef OPENSSL_NO_TLSEXT int r; #endif if (len > SSL_MAX_SSL_SESSION_ID_LENGTH) goto err; + + if (len == 0) + try_session_cache = 0; + #ifndef OPENSSL_NO_TLSEXT - r = tls1_process_ticket(s, session_id, len, limit, &ret); - if (r == -1) + r = tls1_process_ticket(s, session_id, len, limit, &ret); /* sets s->tlsext_ticket_expected */ + switch (r) { + case -1: /* Error during processing */ fatal = 1; goto err; + case 0: /* No ticket found */ + case 1: /* Zero length ticket found */ + break; /* Ok to carry on processing session id. */ + case 2: /* Ticket found but not decrypted. */ + case 3: /* Ticket decrypted, *ret has been set. */ + try_session_cache = 0; + break; + default: + abort(); } - else if (r == 0 || (!ret && !len)) - goto err; - else if (!ret && !(s->session_ctx->session_cache_mode & SSL_SESS_CACHE_NO_INTERNAL_LOOKUP)) -#else - if (len == 0) - goto err; - if (!(s->session_ctx->session_cache_mode & SSL_SESS_CACHE_NO_INTERNAL_LOOKUP)) #endif + + if (try_session_cache && + ret == NULL && + !(s->session_ctx->session_cache_mode & SSL_SESS_CACHE_NO_INTERNAL_LOOKUP)) { SSL_SESSION data; data.ssl_version=s->version; @@ -461,20 +510,22 @@ int ssl_get_prev_session(SSL *s, unsigned char *session_id, int len, CRYPTO_r_lock(CRYPTO_LOCK_SSL_CTX); ret=lh_SSL_SESSION_retrieve(s->session_ctx->sessions,&data); if (ret != NULL) - /* don't allow other threads to steal it: */ - CRYPTO_add(&ret->references,1,CRYPTO_LOCK_SSL_SESSION); + { + /* don't allow other threads to steal it: */ + CRYPTO_add(&ret->references,1,CRYPTO_LOCK_SSL_SESSION); + } CRYPTO_r_unlock(CRYPTO_LOCK_SSL_CTX); + if (ret == NULL) + s->session_ctx->stats.sess_miss++; } - if (ret == NULL) + if (try_session_cache && + ret == NULL && + s->session_ctx->get_session_cb != NULL) { int copy=1; - s->session_ctx->stats.sess_miss++; - ret=NULL; - if (s->session_ctx->get_session_cb != NULL - && (ret=s->session_ctx->get_session_cb(s,session_id,len,©)) - != NULL) + if ((ret=s->session_ctx->get_session_cb(s,session_id,len,©))) { s->session_ctx->stats.sess_cb_hit++; @@ -493,23 +544,18 @@ int ssl_get_prev_session(SSL *s, unsigned char *session_id, int len, * things are very strange */ SSL_CTX_add_session(s->session_ctx,ret); } - if (ret == NULL) - goto err; } - /* Now ret is non-NULL, and we own one of its reference counts. */ + if (ret == NULL) + goto err; + + /* Now ret is non-NULL and we own one of its reference counts. */ if (ret->sid_ctx_length != s->sid_ctx_length || memcmp(ret->sid_ctx,s->sid_ctx,ret->sid_ctx_length)) { - /* We've found the session named by the client, but we don't + /* We have the session requested by the client, but we don't * want to use it in this context. */ - -#if 0 /* The client cannot always know when a session is not appropriate, - * so we shouldn't generate an error message. */ - - SSLerr(SSL_F_SSL_GET_PREV_SESSION,SSL_R_ATTEMPT_TO_REUSE_SESSION_IN_DIFFERENT_CONTEXT); -#endif goto err; /* treat like cache miss */ } @@ -546,39 +592,38 @@ int ssl_get_prev_session(SSL *s, unsigned char *session_id, int len, goto err; } - -#if 0 /* This is way too late. */ - - /* If a thread got the session, then 'swaped', and another got - * it and then due to a time-out decided to 'OPENSSL_free' it we could - * be in trouble. So I'll increment it now, then double decrement - * later - am I speaking rubbish?. */ - CRYPTO_add(&ret->references,1,CRYPTO_LOCK_SSL_SESSION); -#endif - if (ret->timeout < (long)(time(NULL) - ret->time)) /* timeout */ { s->session_ctx->stats.sess_timeout++; - /* remove it from the cache */ - SSL_CTX_remove_session(s->session_ctx,ret); + if (try_session_cache) + { + /* session was from the cache, so remove it */ + SSL_CTX_remove_session(s->session_ctx,ret); + } goto err; } s->session_ctx->stats.sess_hit++; - /* ret->time=time(NULL); */ /* rezero timeout? */ - /* again, just leave the session - * if it is the same session, we have just incremented and - * then decremented the reference count :-) */ if (s->session != NULL) SSL_SESSION_free(s->session); s->session=ret; s->verify_result = s->session->verify_result; - return(1); + return 1; err: if (ret != NULL) + { SSL_SESSION_free(ret); +#ifndef OPENSSL_NO_TLSEXT + if (!try_session_cache) + { + /* The session was from a ticket, so we should + * issue a ticket for the new session */ + s->tlsext_ticket_expected = 1; + } +#endif + } if (fatal) return -1; else @@ -728,6 +773,10 @@ void SSL_SESSION_free(SSL_SESSION *ss) OPENSSL_free(ss->psk_identity_hint); if (ss->psk_identity != NULL) OPENSSL_free(ss->psk_identity); +#endif +#ifndef OPENSSL_NO_SRP + if (ss->srp_username != NULL) + OPENSSL_free(ss->srp_username); #endif OPENSSL_cleanse(ss,sizeof(*ss)); OPENSSL_free(ss); @@ -753,10 +802,6 @@ int SSL_set_session(SSL *s, SSL_SESSION *session) { if (!SSL_set_ssl_method(s,meth)) return(0); - if (s->ctx->session_timeout == 0) - session->timeout=SSL_get_default_timeout(s); - else - session->timeout=s->ctx->session_timeout; } #ifndef OPENSSL_NO_KRB5 @@ -824,6 +869,25 @@ long SSL_SESSION_set_time(SSL_SESSION *s, long t) return(t); } +X509 *SSL_SESSION_get0_peer(SSL_SESSION *s) + { + return s->peer; + } + +int SSL_SESSION_set1_id_context(SSL_SESSION *s,const unsigned char *sid_ctx, + unsigned int sid_ctx_len) + { + if(sid_ctx_len > SSL_MAX_SID_CTX_LENGTH) + { + SSLerr(SSL_F_SSL_SESSION_SET1_ID_CONTEXT,SSL_R_SSL_SESSION_ID_CONTEXT_TOO_LONG); + return 0; + } + s->sid_ctx_length=sid_ctx_len; + memcpy(s->sid_ctx,sid_ctx,sid_ctx_len); + + return 1; + } + long SSL_CTX_set_timeout(SSL_CTX *s, long t) { long l; diff --git a/crypto/openssl/ssl/ssl_txt.c b/crypto/openssl/ssl/ssl_txt.c index 3122440e26..6479d52c0c 100644 --- a/crypto/openssl/ssl/ssl_txt.c +++ b/crypto/openssl/ssl/ssl_txt.c @@ -115,6 +115,10 @@ int SSL_SESSION_print(BIO *bp, const SSL_SESSION *x) s="SSLv2"; else if (x->ssl_version == SSL3_VERSION) s="SSLv3"; + else if (x->ssl_version == TLS1_2_VERSION) + s="TLSv1.2"; + else if (x->ssl_version == TLS1_1_VERSION) + s="TLSv1.1"; else if (x->ssl_version == TLS1_VERSION) s="TLSv1"; else if (x->ssl_version == DTLS1_VERSION) @@ -187,6 +191,10 @@ int SSL_SESSION_print(BIO *bp, const SSL_SESSION *x) if (BIO_puts(bp,"\n PSK identity hint: ") <= 0) goto err; if (BIO_printf(bp, "%s", x->psk_identity_hint ? x->psk_identity_hint : "None") <= 0) goto err; #endif +#ifndef OPENSSL_NO_SRP + if (BIO_puts(bp,"\n SRP username: ") <= 0) goto err; + if (BIO_printf(bp, "%s", x->srp_username ? x->srp_username : "None") <= 0) goto err; +#endif #ifndef OPENSSL_NO_TLSEXT if (x->tlsext_tick_lifetime_hint) { diff --git a/crypto/openssl/ssl/t1_clnt.c b/crypto/openssl/ssl/t1_clnt.c index c87af17712..578617ed84 100644 --- a/crypto/openssl/ssl/t1_clnt.c +++ b/crypto/openssl/ssl/t1_clnt.c @@ -66,13 +66,26 @@ static const SSL_METHOD *tls1_get_client_method(int ver); static const SSL_METHOD *tls1_get_client_method(int ver) { + if (ver == TLS1_2_VERSION) + return TLSv1_2_client_method(); + if (ver == TLS1_1_VERSION) + return TLSv1_1_client_method(); if (ver == TLS1_VERSION) - return(TLSv1_client_method()); - else - return(NULL); + return TLSv1_client_method(); + return NULL; } -IMPLEMENT_tls1_meth_func(TLSv1_client_method, +IMPLEMENT_tls_meth_func(TLS1_2_VERSION, TLSv1_2_client_method, + ssl_undefined_function, + ssl3_connect, + tls1_get_client_method) + +IMPLEMENT_tls_meth_func(TLS1_1_VERSION, TLSv1_1_client_method, + ssl_undefined_function, + ssl3_connect, + tls1_get_client_method) + +IMPLEMENT_tls_meth_func(TLS1_VERSION, TLSv1_client_method, ssl_undefined_function, ssl3_connect, tls1_get_client_method) diff --git a/crypto/openssl/ssl/t1_enc.c b/crypto/openssl/ssl/t1_enc.c index 793ea43e90..201ca9ad6d 100644 --- a/crypto/openssl/ssl/t1_enc.c +++ b/crypto/openssl/ssl/t1_enc.c @@ -143,6 +143,7 @@ #include #include #include +#include #ifdef KSSL_DEBUG #include #endif @@ -158,68 +159,75 @@ static int tls1_P_hash(const EVP_MD *md, const unsigned char *sec, unsigned char *out, int olen) { int chunk; - unsigned int j; - HMAC_CTX ctx; - HMAC_CTX ctx_tmp; + size_t j; + EVP_MD_CTX ctx, ctx_tmp; + EVP_PKEY *mac_key; unsigned char A1[EVP_MAX_MD_SIZE]; - unsigned int A1_len; + size_t A1_len; int ret = 0; chunk=EVP_MD_size(md); OPENSSL_assert(chunk >= 0); - HMAC_CTX_init(&ctx); - HMAC_CTX_init(&ctx_tmp); - if (!HMAC_Init_ex(&ctx,sec,sec_len,md, NULL)) + EVP_MD_CTX_init(&ctx); + EVP_MD_CTX_init(&ctx_tmp); + EVP_MD_CTX_set_flags(&ctx, EVP_MD_CTX_FLAG_NON_FIPS_ALLOW); + EVP_MD_CTX_set_flags(&ctx_tmp, EVP_MD_CTX_FLAG_NON_FIPS_ALLOW); + mac_key = EVP_PKEY_new_mac_key(EVP_PKEY_HMAC, NULL, sec, sec_len); + if (!mac_key) + goto err; + if (!EVP_DigestSignInit(&ctx,NULL,md, NULL, mac_key)) goto err; - if (!HMAC_Init_ex(&ctx_tmp,sec,sec_len,md, NULL)) + if (!EVP_DigestSignInit(&ctx_tmp,NULL,md, NULL, mac_key)) goto err; - if (seed1 != NULL && !HMAC_Update(&ctx,seed1,seed1_len)) + if (seed1 && !EVP_DigestSignUpdate(&ctx,seed1,seed1_len)) goto err; - if (seed2 != NULL && !HMAC_Update(&ctx,seed2,seed2_len)) + if (seed2 && !EVP_DigestSignUpdate(&ctx,seed2,seed2_len)) goto err; - if (seed3 != NULL && !HMAC_Update(&ctx,seed3,seed3_len)) + if (seed3 && !EVP_DigestSignUpdate(&ctx,seed3,seed3_len)) goto err; - if (seed4 != NULL && !HMAC_Update(&ctx,seed4,seed4_len)) + if (seed4 && !EVP_DigestSignUpdate(&ctx,seed4,seed4_len)) goto err; - if (seed5 != NULL && !HMAC_Update(&ctx,seed5,seed5_len)) + if (seed5 && !EVP_DigestSignUpdate(&ctx,seed5,seed5_len)) goto err; - if (!HMAC_Final(&ctx,A1,&A1_len)) + if (!EVP_DigestSignFinal(&ctx,A1,&A1_len)) goto err; for (;;) { - if (!HMAC_Init_ex(&ctx,NULL,0,NULL,NULL)) /* re-init */ + /* Reinit mac contexts */ + if (!EVP_DigestSignInit(&ctx,NULL,md, NULL, mac_key)) goto err; - if (!HMAC_Init_ex(&ctx_tmp,NULL,0,NULL,NULL)) /* re-init */ + if (!EVP_DigestSignInit(&ctx_tmp,NULL,md, NULL, mac_key)) goto err; - if (!HMAC_Update(&ctx,A1,A1_len)) + if (!EVP_DigestSignUpdate(&ctx,A1,A1_len)) goto err; - if (!HMAC_Update(&ctx_tmp,A1,A1_len)) + if (!EVP_DigestSignUpdate(&ctx_tmp,A1,A1_len)) goto err; - if (seed1 != NULL && !HMAC_Update(&ctx,seed1,seed1_len)) + if (seed1 && !EVP_DigestSignUpdate(&ctx,seed1,seed1_len)) goto err; - if (seed2 != NULL && !HMAC_Update(&ctx,seed2,seed2_len)) + if (seed2 && !EVP_DigestSignUpdate(&ctx,seed2,seed2_len)) goto err; - if (seed3 != NULL && !HMAC_Update(&ctx,seed3,seed3_len)) + if (seed3 && !EVP_DigestSignUpdate(&ctx,seed3,seed3_len)) goto err; - if (seed4 != NULL && !HMAC_Update(&ctx,seed4,seed4_len)) + if (seed4 && !EVP_DigestSignUpdate(&ctx,seed4,seed4_len)) goto err; - if (seed5 != NULL && !HMAC_Update(&ctx,seed5,seed5_len)) + if (seed5 && !EVP_DigestSignUpdate(&ctx,seed5,seed5_len)) goto err; if (olen > chunk) { - if (!HMAC_Final(&ctx,out,&j)) + if (!EVP_DigestSignFinal(&ctx,out,&j)) goto err; out+=j; olen-=j; - if (!HMAC_Final(&ctx_tmp,A1,&A1_len)) /* calc the next A1 value */ + /* calc the next A1 value */ + if (!EVP_DigestSignFinal(&ctx_tmp,A1,&A1_len)) goto err; } else /* last one */ { - if (!HMAC_Final(&ctx,A1,&A1_len)) + if (!EVP_DigestSignFinal(&ctx,A1,&A1_len)) goto err; memcpy(out,A1,olen); break; @@ -227,8 +235,9 @@ static int tls1_P_hash(const EVP_MD *md, const unsigned char *sec, } ret = 1; err: - HMAC_CTX_cleanup(&ctx); - HMAC_CTX_cleanup(&ctx_tmp); + EVP_PKEY_free(mac_key); + EVP_MD_CTX_cleanup(&ctx); + EVP_MD_CTX_cleanup(&ctx_tmp); OPENSSL_cleanse(A1,sizeof(A1)); return ret; } @@ -256,6 +265,8 @@ static int tls1_PRF(long digest_mask, if ((m<s3->tmp.new_cipher->algorithm2, + ret = tls1_PRF(ssl_get_algorithm2(s), TLS_MD_KEY_EXPANSION_CONST,TLS_MD_KEY_EXPANSION_CONST_SIZE, s->s3->server_random,SSL3_RANDOM_SIZE, s->s3->client_random,SSL3_RANDOM_SIZE, @@ -358,7 +369,7 @@ int tls1_change_cipher_state(SSL *s, int which) { if (s->s3->tmp.new_cipher->algorithm2 & TLS1_STREAM_MAC) s->mac_flags |= SSL_MAC_FLAG_READ_MAC_STREAM; - else + else s->mac_flags &= ~SSL_MAC_FLAG_READ_MAC_STREAM; if (s->enc_read_ctx != NULL) @@ -445,7 +456,11 @@ int tls1_change_cipher_state(SSL *s, int which) j=is_export ? (cl < SSL_C_EXPORT_KEYLENGTH(s->s3->tmp.new_cipher) ? cl : SSL_C_EXPORT_KEYLENGTH(s->s3->tmp.new_cipher)) : cl; /* Was j=(exp)?5:EVP_CIPHER_key_length(c); */ - k=EVP_CIPHER_iv_length(c); + /* If GCM mode only part of IV comes from PRF */ + if (EVP_CIPHER_mode(c) == EVP_CIPH_GCM_MODE) + k = EVP_GCM_TLS_FIXED_IV_LEN; + else + k=EVP_CIPHER_iv_length(c); if ( (which == SSL3_CHANGE_CIPHER_CLIENT_WRITE) || (which == SSL3_CHANGE_CIPHER_SERVER_READ)) { @@ -474,10 +489,14 @@ int tls1_change_cipher_state(SSL *s, int which) } memcpy(mac_secret,ms,i); - mac_key = EVP_PKEY_new_mac_key(mac_type, NULL, - mac_secret,*mac_secret_size); - EVP_DigestSignInit(mac_ctx,NULL,m,NULL,mac_key); - EVP_PKEY_free(mac_key); + + if (!(EVP_CIPHER_flags(c)&EVP_CIPH_FLAG_AEAD_CIPHER)) + { + mac_key = EVP_PKEY_new_mac_key(mac_type, NULL, + mac_secret,*mac_secret_size); + EVP_DigestSignInit(mac_ctx,NULL,m,NULL,mac_key); + EVP_PKEY_free(mac_key); + } #ifdef TLS_DEBUG printf("which = %04X\nmac key=",which); { int z; for (z=0; zoptions & SSL_OP_DONT_INSERT_EMPTY_FRAGMENTS)) + if (!(s->options & SSL_OP_DONT_INSERT_EMPTY_FRAGMENTS) + && s->method->version <= TLS1_VERSION) { /* enable vulnerability countermeasure for CBC ciphers with * known-IV problem (http://www.openssl.org/~bodo/tls-cbc.txt) @@ -640,14 +672,14 @@ int tls1_enc(SSL *s, int send) SSL3_RECORD *rec; EVP_CIPHER_CTX *ds; unsigned long l; - int bs,i,ii,j,k,n=0; + int bs,i,ii,j,k,pad=0; const EVP_CIPHER *enc; if (send) { if (EVP_MD_CTX_md(s->write_hash)) { - n=EVP_MD_CTX_size(s->write_hash); + int n=EVP_MD_CTX_size(s->write_hash); OPENSSL_assert(n >= 0); } ds=s->enc_write_ctx; @@ -655,13 +687,34 @@ int tls1_enc(SSL *s, int send) if (s->enc_write_ctx == NULL) enc=NULL; else + { + int ivlen; enc=EVP_CIPHER_CTX_cipher(s->enc_write_ctx); + /* For TLSv1.1 and later explicit IV */ + if (s->version >= TLS1_1_VERSION + && EVP_CIPHER_mode(enc) == EVP_CIPH_CBC_MODE) + ivlen = EVP_CIPHER_iv_length(enc); + else + ivlen = 0; + if (ivlen > 1) + { + if ( rec->data != rec->input) + /* we can't write into the input stream: + * Can this ever happen?? (steve) + */ + fprintf(stderr, + "%s:%d: rec->data != rec->input\n", + __FILE__, __LINE__); + else if (RAND_bytes(rec->input, ivlen) <= 0) + return -1; + } + } } else { if (EVP_MD_CTX_md(s->read_hash)) { - n=EVP_MD_CTX_size(s->read_hash); + int n=EVP_MD_CTX_size(s->read_hash); OPENSSL_assert(n >= 0); } ds=s->enc_read_ctx; @@ -687,7 +740,43 @@ int tls1_enc(SSL *s, int send) l=rec->length; bs=EVP_CIPHER_block_size(ds->cipher); - if ((bs != 1) && send) + if (EVP_CIPHER_flags(ds->cipher)&EVP_CIPH_FLAG_AEAD_CIPHER) + { + unsigned char buf[13],*seq; + + seq = send?s->s3->write_sequence:s->s3->read_sequence; + + if (s->version == DTLS1_VERSION || s->version == DTLS1_BAD_VER) + { + unsigned char dtlsseq[9],*p=dtlsseq; + + s2n(send?s->d1->w_epoch:s->d1->r_epoch,p); + memcpy(p,&seq[2],6); + memcpy(buf,dtlsseq,8); + } + else + { + memcpy(buf,seq,8); + for (i=7; i>=0; i--) /* increment */ + { + ++seq[i]; + if (seq[i] != 0) break; + } + } + + buf[8]=rec->type; + buf[9]=(unsigned char)(s->version>>8); + buf[10]=(unsigned char)(s->version); + buf[11]=rec->length>>8; + buf[12]=rec->length&0xff; + pad=EVP_CIPHER_CTX_ctrl(ds,EVP_CTRL_AEAD_TLS1_AAD,13,buf); + if (send) + { + l+=pad; + rec->length+=pad; + } + } + else if ((bs != 1) && send) { i=bs-((int)l%bs); @@ -728,13 +817,25 @@ int tls1_enc(SSL *s, int send) { if (l == 0 || l%bs != 0) { + if (s->version >= TLS1_1_VERSION) + return -1; SSLerr(SSL_F_TLS1_ENC,SSL_R_BLOCK_CIPHER_PAD_IS_WRONG); ssl3_send_alert(s,SSL3_AL_FATAL,SSL_AD_DECRYPTION_FAILED); return 0; } } - EVP_Cipher(ds,rec->data,rec->input,l); + i = EVP_Cipher(ds,rec->data,rec->input,l); + if ((EVP_CIPHER_flags(ds->cipher)&EVP_CIPH_FLAG_CUSTOM_CIPHER) + ?(i<0) + :(i==0)) + return -1; /* AEAD can fail to verify MAC */ + if (EVP_CIPHER_mode(enc) == EVP_CIPH_GCM_MODE && !send) + { + rec->data += EVP_GCM_TLS_EXPLICIT_IV_LEN; + rec->input += EVP_GCM_TLS_EXPLICIT_IV_LEN; + rec->length -= EVP_GCM_TLS_EXPLICIT_IV_LEN; + } #ifdef KSSL_DEBUG { @@ -784,8 +885,17 @@ int tls1_enc(SSL *s, int send) return -1; } } - rec->length-=i; + rec->length -=i; + if (s->version >= TLS1_1_VERSION + && EVP_CIPHER_CTX_mode(ds) == EVP_CIPH_CBC_MODE) + { + rec->data += bs; /* skip the explicit IV */ + rec->input += bs; + rec->length -= bs; + } } + if (pad && !send) + rec->length -= pad; } return(1); } @@ -841,7 +951,7 @@ int tls1_final_finish_mac(SSL *s, for (idx=0;ssl_get_handshake_digest(idx,&mask,&md);idx++) { - if (mask & s->s3->tmp.new_cipher->algorithm2) + if (mask & ssl_get_algorithm2(s)) { int hashsize = EVP_MD_size(md); if (hashsize < 0 || hashsize > (int)(sizeof buf - (size_t)(q-buf))) @@ -860,7 +970,7 @@ int tls1_final_finish_mac(SSL *s, } } - if (!tls1_PRF(s->s3->tmp.new_cipher->algorithm2, + if (!tls1_PRF(ssl_get_algorithm2(s), str,slen, buf,(int)(q-buf), NULL,0, NULL,0, NULL,0, s->session->master_key,s->session->master_key_length, out,buf2,sizeof buf2)) @@ -970,6 +1080,7 @@ int tls1_generate_master_secret(SSL *s, unsigned char *out, unsigned char *p, const void *co = NULL, *so = NULL; int col = 0, sol = 0; + #ifdef KSSL_DEBUG printf ("tls1_generate_master_secret(%p,%p, %p, %d)\n", s,out, p,len); #endif /* KSSL_DEBUG */ @@ -986,7 +1097,7 @@ int tls1_generate_master_secret(SSL *s, unsigned char *out, unsigned char *p, } #endif - tls1_PRF(s->s3->tmp.new_cipher->algorithm2, + tls1_PRF(ssl_get_algorithm2(s), TLS_MD_MASTER_SECRET_CONST,TLS_MD_MASTER_SECRET_CONST_SIZE, s->s3->client_random,SSL3_RANDOM_SIZE, co, col, @@ -994,6 +1105,16 @@ int tls1_generate_master_secret(SSL *s, unsigned char *out, unsigned char *p, so, sol, p,len, s->session->master_key,buff,sizeof buff); +#ifdef SSL_DEBUG + fprintf(stderr, "Premaster Secret:\n"); + BIO_dump_fp(stderr, (char *)p, len); + fprintf(stderr, "Client Random:\n"); + BIO_dump_fp(stderr, (char *)s->s3->client_random, SSL3_RANDOM_SIZE); + fprintf(stderr, "Server Random:\n"); + BIO_dump_fp(stderr, (char *)s->s3->server_random, SSL3_RANDOM_SIZE); + fprintf(stderr, "Master Secret:\n"); + BIO_dump_fp(stderr, (char *)s->session->master_key, SSL3_MASTER_SECRET_SIZE); +#endif #ifdef KSSL_DEBUG printf ("tls1_generate_master_secret() complete\n"); @@ -1001,6 +1122,95 @@ int tls1_generate_master_secret(SSL *s, unsigned char *out, unsigned char *p, return(SSL3_MASTER_SECRET_SIZE); } +int tls1_export_keying_material(SSL *s, unsigned char *out, size_t olen, + const char *label, size_t llen, const unsigned char *context, + size_t contextlen, int use_context) + { + unsigned char *buff; + unsigned char *val = NULL; + size_t vallen, currentvalpos; + int rv; + +#ifdef KSSL_DEBUG + printf ("tls1_export_keying_material(%p,%p,%d,%s,%d,%p,%d)\n", s, out, olen, label, llen, p, plen); +#endif /* KSSL_DEBUG */ + + buff = OPENSSL_malloc(olen); + if (buff == NULL) goto err2; + + /* construct PRF arguments + * we construct the PRF argument ourself rather than passing separate + * values into the TLS PRF to ensure that the concatenation of values + * does not create a prohibited label. + */ + vallen = llen + SSL3_RANDOM_SIZE * 2; + if (use_context) + { + vallen += 2 + contextlen; + } + + val = OPENSSL_malloc(vallen); + if (val == NULL) goto err2; + currentvalpos = 0; + memcpy(val + currentvalpos, (unsigned char *) label, llen); + currentvalpos += llen; + memcpy(val + currentvalpos, s->s3->client_random, SSL3_RANDOM_SIZE); + currentvalpos += SSL3_RANDOM_SIZE; + memcpy(val + currentvalpos, s->s3->server_random, SSL3_RANDOM_SIZE); + currentvalpos += SSL3_RANDOM_SIZE; + + if (use_context) + { + val[currentvalpos] = (contextlen >> 8) & 0xff; + currentvalpos++; + val[currentvalpos] = contextlen & 0xff; + currentvalpos++; + if ((contextlen > 0) || (context != NULL)) + { + memcpy(val + currentvalpos, context, contextlen); + } + } + + /* disallow prohibited labels + * note that SSL3_RANDOM_SIZE > max(prohibited label len) = + * 15, so size of val > max(prohibited label len) = 15 and the + * comparisons won't have buffer overflow + */ + if (memcmp(val, TLS_MD_CLIENT_FINISH_CONST, + TLS_MD_CLIENT_FINISH_CONST_SIZE) == 0) goto err1; + if (memcmp(val, TLS_MD_SERVER_FINISH_CONST, + TLS_MD_SERVER_FINISH_CONST_SIZE) == 0) goto err1; + if (memcmp(val, TLS_MD_MASTER_SECRET_CONST, + TLS_MD_MASTER_SECRET_CONST_SIZE) == 0) goto err1; + if (memcmp(val, TLS_MD_KEY_EXPANSION_CONST, + TLS_MD_KEY_EXPANSION_CONST_SIZE) == 0) goto err1; + + rv = tls1_PRF(s->s3->tmp.new_cipher->algorithm2, + val, vallen, + NULL, 0, + NULL, 0, + NULL, 0, + NULL, 0, + s->session->master_key,s->session->master_key_length, + out,buff,olen); + +#ifdef KSSL_DEBUG + printf ("tls1_export_keying_material() complete\n"); +#endif /* KSSL_DEBUG */ + goto ret; +err1: + SSLerr(SSL_F_TLS1_EXPORT_KEYING_MATERIAL, SSL_R_TLS_ILLEGAL_EXPORTER_LABEL); + rv = 0; + goto ret; +err2: + SSLerr(SSL_F_TLS1_EXPORT_KEYING_MATERIAL, ERR_R_MALLOC_FAILURE); + rv = 0; +ret: + if (buff != NULL) OPENSSL_free(buff); + if (val != NULL) OPENSSL_free(val); + return(rv); + } + int tls1_alert_code(int code) { switch (code) @@ -1042,4 +1252,3 @@ int tls1_alert_code(int code) default: return(-1); } } - diff --git a/crypto/openssl/ssl/t1_lib.c b/crypto/openssl/ssl/t1_lib.c index 26cbae449e..57d1107e40 100644 --- a/crypto/openssl/ssl/t1_lib.c +++ b/crypto/openssl/ssl/t1_lib.c @@ -114,6 +114,7 @@ #include #include #include +#include #include "ssl_locl.h" const char tls1_version_str[]="TLSv1" OPENSSL_VERSION_PTEXT; @@ -136,6 +137,7 @@ SSL3_ENC_METHOD TLSv1_enc_data={ TLS_MD_CLIENT_FINISH_CONST,TLS_MD_CLIENT_FINISH_CONST_SIZE, TLS_MD_SERVER_FINISH_CONST,TLS_MD_SERVER_FINISH_CONST_SIZE, tls1_alert_code, + tls1_export_keying_material, }; long tls1_default_timeout(void) @@ -166,10 +168,11 @@ void tls1_free(SSL *s) void tls1_clear(SSL *s) { ssl3_clear(s); - s->version=TLS1_VERSION; + s->version = s->method->version; } #ifndef OPENSSL_NO_EC + static int nid_list[] = { NID_sect163k1, /* sect163k1 (1) */ @@ -198,7 +201,36 @@ static int nid_list[] = NID_secp384r1, /* secp384r1 (24) */ NID_secp521r1 /* secp521r1 (25) */ }; - + +static int pref_list[] = + { + NID_sect571r1, /* sect571r1 (14) */ + NID_sect571k1, /* sect571k1 (13) */ + NID_secp521r1, /* secp521r1 (25) */ + NID_sect409k1, /* sect409k1 (11) */ + NID_sect409r1, /* sect409r1 (12) */ + NID_secp384r1, /* secp384r1 (24) */ + NID_sect283k1, /* sect283k1 (9) */ + NID_sect283r1, /* sect283r1 (10) */ + NID_secp256k1, /* secp256k1 (22) */ + NID_X9_62_prime256v1, /* secp256r1 (23) */ + NID_sect239k1, /* sect239k1 (8) */ + NID_sect233k1, /* sect233k1 (6) */ + NID_sect233r1, /* sect233r1 (7) */ + NID_secp224k1, /* secp224k1 (20) */ + NID_secp224r1, /* secp224r1 (21) */ + NID_sect193r1, /* sect193r1 (4) */ + NID_sect193r2, /* sect193r2 (5) */ + NID_secp192k1, /* secp192k1 (18) */ + NID_X9_62_prime192v1, /* secp192r1 (19) */ + NID_sect163k1, /* sect163k1 (1) */ + NID_sect163r1, /* sect163r1 (2) */ + NID_sect163r2, /* sect163r2 (3) */ + NID_secp160k1, /* secp160k1 (15) */ + NID_secp160r1, /* secp160r1 (16) */ + NID_secp160r2, /* secp160r2 (17) */ + }; + int tls1_ec_curve_id2nid(int curve_id) { /* ECC curves from draft-ietf-tls-ecc-12.txt (Oct. 17, 2005) */ @@ -270,6 +302,64 @@ int tls1_ec_nid2curve_id(int nid) #endif /* OPENSSL_NO_EC */ #ifndef OPENSSL_NO_TLSEXT + +/* List of supported signature algorithms and hashes. Should make this + * customisable at some point, for now include everything we support. + */ + +#ifdef OPENSSL_NO_RSA +#define tlsext_sigalg_rsa(md) /* */ +#else +#define tlsext_sigalg_rsa(md) md, TLSEXT_signature_rsa, +#endif + +#ifdef OPENSSL_NO_DSA +#define tlsext_sigalg_dsa(md) /* */ +#else +#define tlsext_sigalg_dsa(md) md, TLSEXT_signature_dsa, +#endif + +#ifdef OPENSSL_NO_ECDSA +#define tlsext_sigalg_ecdsa(md) /* */ +#else +#define tlsext_sigalg_ecdsa(md) md, TLSEXT_signature_ecdsa, +#endif + +#define tlsext_sigalg(md) \ + tlsext_sigalg_rsa(md) \ + tlsext_sigalg_dsa(md) \ + tlsext_sigalg_ecdsa(md) + +static unsigned char tls12_sigalgs[] = { +#ifndef OPENSSL_NO_SHA512 + tlsext_sigalg(TLSEXT_hash_sha512) + tlsext_sigalg(TLSEXT_hash_sha384) +#endif +#ifndef OPENSSL_NO_SHA256 + tlsext_sigalg(TLSEXT_hash_sha256) + tlsext_sigalg(TLSEXT_hash_sha224) +#endif +#ifndef OPENSSL_NO_SHA + tlsext_sigalg(TLSEXT_hash_sha1) +#endif +#ifndef OPENSSL_NO_MD5 + tlsext_sigalg_rsa(TLSEXT_hash_md5) +#endif +}; + +int tls12_get_req_sig_algs(SSL *s, unsigned char *p) + { + size_t slen = sizeof(tls12_sigalgs); +#ifdef OPENSSL_FIPS + /* If FIPS mode don't include MD5 which is last */ + if (FIPS_mode()) + slen -= 2; +#endif + if (p) + memcpy(p, tls12_sigalgs, slen); + return (int)slen; + } + unsigned char *ssl_add_clienthello_tlsext(SSL *s, unsigned char *p, unsigned char *limit) { int extdatalen=0; @@ -317,7 +407,7 @@ unsigned char *ssl_add_clienthello_tlsext(SSL *s, unsigned char *p, unsigned cha } /* Add RI if renegotiating */ - if (s->new_session) + if (s->renegotiate) { int el; @@ -341,6 +431,34 @@ unsigned char *ssl_add_clienthello_tlsext(SSL *s, unsigned char *p, unsigned cha ret += el; } +#ifndef OPENSSL_NO_SRP + /* Add SRP username if there is one */ + if (s->srp_ctx.login != NULL) + { /* Add TLS extension SRP username to the Client Hello message */ + + int login_len = strlen(s->srp_ctx.login); + if (login_len > 255 || login_len == 0) + { + SSLerr(SSL_F_SSL_ADD_CLIENTHELLO_TLSEXT, ERR_R_INTERNAL_ERROR); + return NULL; + } + + /* check for enough space. + 4 for the srp type type and entension length + 1 for the srp user identity + + srp user identity length + */ + if ((limit - ret - 5 - login_len) < 0) return NULL; + + /* fill in the extension */ + s2n(TLSEXT_TYPE_srp,ret); + s2n(login_len+1,ret); + (*ret++) = (unsigned char) login_len; + memcpy(ret, s->srp_ctx.login, login_len); + ret+=login_len; + } +#endif + #ifndef OPENSSL_NO_EC if (s->tlsext_ecpointformatlist != NULL && s->version != DTLS1_VERSION) @@ -426,6 +544,17 @@ unsigned char *ssl_add_clienthello_tlsext(SSL *s, unsigned char *p, unsigned cha } skip_ext: + if (TLS1_get_version(s) >= TLS1_2_VERSION) + { + if ((size_t)(limit - ret) < sizeof(tls12_sigalgs) + 6) + return NULL; + s2n(TLSEXT_TYPE_signature_algorithms,ret); + s2n(sizeof(tls12_sigalgs) + 2, ret); + s2n(sizeof(tls12_sigalgs), ret); + memcpy(ret, tls12_sigalgs, sizeof(tls12_sigalgs)); + ret += sizeof(tls12_sigalgs); + } + #ifdef TLSEXT_TYPE_opaque_prf_input if (s->s3->client_opaque_prf_input != NULL && s->version != DTLS1_VERSION) @@ -494,6 +623,51 @@ unsigned char *ssl_add_clienthello_tlsext(SSL *s, unsigned char *p, unsigned cha i2d_X509_EXTENSIONS(s->tlsext_ocsp_exts, &ret); } +#ifndef OPENSSL_NO_HEARTBEATS + /* Add Heartbeat extension */ + s2n(TLSEXT_TYPE_heartbeat,ret); + s2n(1,ret); + /* Set mode: + * 1: peer may send requests + * 2: peer not allowed to send requests + */ + if (s->tlsext_heartbeat & SSL_TLSEXT_HB_DONT_RECV_REQUESTS) + *(ret++) = SSL_TLSEXT_HB_DONT_SEND_REQUESTS; + else + *(ret++) = SSL_TLSEXT_HB_ENABLED; +#endif + +#ifndef OPENSSL_NO_NEXTPROTONEG + if (s->ctx->next_proto_select_cb && !s->s3->tmp.finish_md_len) + { + /* The client advertises an emtpy extension to indicate its + * support for Next Protocol Negotiation */ + if (limit - ret - 4 < 0) + return NULL; + s2n(TLSEXT_TYPE_next_proto_neg,ret); + s2n(0,ret); + } +#endif + + if(SSL_get_srtp_profiles(s)) + { + int el; + + ssl_add_clienthello_use_srtp_ext(s, 0, &el, 0); + + if((limit - p - 4 - el) < 0) return NULL; + + s2n(TLSEXT_TYPE_use_srtp,ret); + s2n(el,ret); + + if(ssl_add_clienthello_use_srtp_ext(s, ret, &el, el)) + { + SSLerr(SSL_F_SSL_ADD_CLIENTHELLO_TLSEXT, ERR_R_INTERNAL_ERROR); + return NULL; + } + ret += el; + } + if ((extdatalen = ret-p-2)== 0) return p; @@ -505,6 +679,9 @@ unsigned char *ssl_add_serverhello_tlsext(SSL *s, unsigned char *p, unsigned cha { int extdatalen=0; unsigned char *ret = p; +#ifndef OPENSSL_NO_NEXTPROTONEG + int next_proto_neg_seen; +#endif /* don't add extensions for SSLv3, unless doing secure renegotiation */ if (s->version == SSL3_VERSION && !s->s3->send_connection_binding) @@ -603,6 +780,26 @@ unsigned char *ssl_add_serverhello_tlsext(SSL *s, unsigned char *p, unsigned cha ret += sol; } #endif + + if(s->srtp_profile) + { + int el; + + ssl_add_serverhello_use_srtp_ext(s, 0, &el, 0); + + if((limit - p - 4 - el) < 0) return NULL; + + s2n(TLSEXT_TYPE_use_srtp,ret); + s2n(el,ret); + + if(ssl_add_serverhello_use_srtp_ext(s, ret, &el, el)) + { + SSLerr(SSL_F_SSL_ADD_SERVERHELLO_TLSEXT, ERR_R_INTERNAL_ERROR); + return NULL; + } + ret+=el; + } + if (((s->s3->tmp.new_cipher->id & 0xFFFF)==0x80 || (s->s3->tmp.new_cipher->id & 0xFFFF)==0x81) && (SSL_get_options(s) & SSL_OP_CRYPTOPRO_TLSEXT_BUG)) { const unsigned char cryptopro_ext[36] = { @@ -618,6 +815,46 @@ unsigned char *ssl_add_serverhello_tlsext(SSL *s, unsigned char *p, unsigned cha } +#ifndef OPENSSL_NO_HEARTBEATS + /* Add Heartbeat extension if we've received one */ + if (s->tlsext_heartbeat & SSL_TLSEXT_HB_ENABLED) + { + s2n(TLSEXT_TYPE_heartbeat,ret); + s2n(1,ret); + /* Set mode: + * 1: peer may send requests + * 2: peer not allowed to send requests + */ + if (s->tlsext_heartbeat & SSL_TLSEXT_HB_DONT_RECV_REQUESTS) + *(ret++) = SSL_TLSEXT_HB_DONT_SEND_REQUESTS; + else + *(ret++) = SSL_TLSEXT_HB_ENABLED; + + } +#endif + +#ifndef OPENSSL_NO_NEXTPROTONEG + next_proto_neg_seen = s->s3->next_proto_neg_seen; + s->s3->next_proto_neg_seen = 0; + if (next_proto_neg_seen && s->ctx->next_protos_advertised_cb) + { + const unsigned char *npa; + unsigned int npalen; + int r; + + r = s->ctx->next_protos_advertised_cb(s, &npa, &npalen, s->ctx->next_protos_advertised_cb_arg); + if (r == SSL_TLSEXT_ERR_OK) + { + if ((long)(limit - ret - 4 - npalen) < 0) return NULL; + s2n(TLSEXT_TYPE_next_proto_neg,ret); + s2n(npalen,ret); + memcpy(ret, npa, npalen); + ret += npalen; + s->s3->next_proto_neg_seen = 1; + } + } +#endif + if ((extdatalen = ret-p-2)== 0) return p; @@ -632,9 +869,18 @@ int ssl_parse_clienthello_tlsext(SSL *s, unsigned char **p, unsigned char *d, in unsigned short len; unsigned char *data = *p; int renegotiate_seen = 0; + int sigalg_seen = 0; s->servername_done = 0; s->tlsext_status_type = -1; +#ifndef OPENSSL_NO_NEXTPROTONEG + s->s3->next_proto_neg_seen = 0; +#endif + +#ifndef OPENSSL_NO_HEARTBEATS + s->tlsext_heartbeat &= ~(SSL_TLSEXT_HB_ENABLED | + SSL_TLSEXT_HB_DONT_SEND_REQUESTS); +#endif if (data >= (d+n-2)) goto ri_check; @@ -762,6 +1008,31 @@ int ssl_parse_clienthello_tlsext(SSL *s, unsigned char **p, unsigned char *d, in } } +#ifndef OPENSSL_NO_SRP + else if (type == TLSEXT_TYPE_srp) + { + if (size <= 0 || ((len = data[0])) != (size -1)) + { + *al = SSL_AD_DECODE_ERROR; + return 0; + } + if (s->srp_ctx.login != NULL) + { + *al = SSL_AD_DECODE_ERROR; + return 0; + } + if ((s->srp_ctx.login = OPENSSL_malloc(len+1)) == NULL) + return -1; + memcpy(s->srp_ctx.login, &data[1], len); + s->srp_ctx.login[len]='\0'; + + if (strlen(s->srp_ctx.login) != len) + { + *al = SSL_AD_DECODE_ERROR; + return 0; + } + } +#endif #ifndef OPENSSL_NO_EC else if (type == TLSEXT_TYPE_ec_point_formats && @@ -882,6 +1153,28 @@ int ssl_parse_clienthello_tlsext(SSL *s, unsigned char **p, unsigned char *d, in return 0; renegotiate_seen = 1; } + else if (type == TLSEXT_TYPE_signature_algorithms) + { + int dsize; + if (sigalg_seen || size < 2) + { + *al = SSL_AD_DECODE_ERROR; + return 0; + } + sigalg_seen = 1; + n2s(data,dsize); + size -= 2; + if (dsize != size || dsize & 1) + { + *al = SSL_AD_DECODE_ERROR; + return 0; + } + if (!tls1_process_sigalgs(s, data, dsize)) + { + *al = SSL_AD_DECODE_ERROR; + return 0; + } + } else if (type == TLSEXT_TYPE_status_request && s->version != DTLS1_VERSION && s->ctx->tlsext_status_cb) { @@ -994,8 +1287,54 @@ int ssl_parse_clienthello_tlsext(SSL *s, unsigned char **p, unsigned char *d, in else s->tlsext_status_type = -1; } +#ifndef OPENSSL_NO_HEARTBEATS + else if (type == TLSEXT_TYPE_heartbeat) + { + switch(data[0]) + { + case 0x01: /* Client allows us to send HB requests */ + s->tlsext_heartbeat |= SSL_TLSEXT_HB_ENABLED; + break; + case 0x02: /* Client doesn't accept HB requests */ + s->tlsext_heartbeat |= SSL_TLSEXT_HB_ENABLED; + s->tlsext_heartbeat |= SSL_TLSEXT_HB_DONT_SEND_REQUESTS; + break; + default: *al = SSL_AD_ILLEGAL_PARAMETER; + return 0; + } + } +#endif +#ifndef OPENSSL_NO_NEXTPROTONEG + else if (type == TLSEXT_TYPE_next_proto_neg && + s->s3->tmp.finish_md_len == 0) + { + /* We shouldn't accept this extension on a + * renegotiation. + * + * s->new_session will be set on renegotiation, but we + * probably shouldn't rely that it couldn't be set on + * the initial renegotation too in certain cases (when + * there's some other reason to disallow resuming an + * earlier session -- the current code won't be doing + * anything like that, but this might change). + + * A valid sign that there's been a previous handshake + * in this connection is if s->s3->tmp.finish_md_len > + * 0. (We are talking about a check that will happen + * in the Hello protocol round, well before a new + * Finished message could have been computed.) */ + s->s3->next_proto_neg_seen = 1; + } +#endif /* session ticket processed earlier */ + else if (type == TLSEXT_TYPE_use_srtp) + { + if(ssl_parse_clienthello_use_srtp_ext(s, data, size, + al)) + return 0; + } + data+=size; } @@ -1005,7 +1344,7 @@ int ssl_parse_clienthello_tlsext(SSL *s, unsigned char **p, unsigned char *d, in /* Need RI if renegotiating */ - if (!renegotiate_seen && s->new_session && + if (!renegotiate_seen && s->renegotiate && !(s->options & SSL_OP_ALLOW_UNSAFE_LEGACY_RENEGOTIATION)) { *al = SSL_AD_HANDSHAKE_FAILURE; @@ -1017,6 +1356,26 @@ int ssl_parse_clienthello_tlsext(SSL *s, unsigned char **p, unsigned char *d, in return 1; } +#ifndef OPENSSL_NO_NEXTPROTONEG +/* ssl_next_proto_validate validates a Next Protocol Negotiation block. No + * elements of zero length are allowed and the set of elements must exactly fill + * the length of the block. */ +static char ssl_next_proto_validate(unsigned char *d, unsigned len) + { + unsigned int off = 0; + + while (off < len) + { + if (d[off] == 0) + return 0; + off += d[off]; + off++; + } + + return off == len; + } +#endif + int ssl_parse_serverhello_tlsext(SSL *s, unsigned char **p, unsigned char *d, int n, int *al) { unsigned short length; @@ -1026,6 +1385,15 @@ int ssl_parse_serverhello_tlsext(SSL *s, unsigned char **p, unsigned char *d, in int tlsext_servername = 0; int renegotiate_seen = 0; +#ifndef OPENSSL_NO_NEXTPROTONEG + s->s3->next_proto_neg_seen = 0; +#endif + +#ifndef OPENSSL_NO_HEARTBEATS + s->tlsext_heartbeat &= ~(SSL_TLSEXT_HB_ENABLED | + SSL_TLSEXT_HB_DONT_SEND_REQUESTS); +#endif + if (data >= (d+n-2)) goto ri_check; @@ -1151,12 +1519,71 @@ int ssl_parse_serverhello_tlsext(SSL *s, unsigned char **p, unsigned char *d, in /* Set flag to expect CertificateStatus message */ s->tlsext_status_expected = 1; } +#ifndef OPENSSL_NO_NEXTPROTONEG + else if (type == TLSEXT_TYPE_next_proto_neg && + s->s3->tmp.finish_md_len == 0) + { + unsigned char *selected; + unsigned char selected_len; + + /* We must have requested it. */ + if ((s->ctx->next_proto_select_cb == NULL)) + { + *al = TLS1_AD_UNSUPPORTED_EXTENSION; + return 0; + } + /* The data must be valid */ + if (!ssl_next_proto_validate(data, size)) + { + *al = TLS1_AD_DECODE_ERROR; + return 0; + } + if (s->ctx->next_proto_select_cb(s, &selected, &selected_len, data, size, s->ctx->next_proto_select_cb_arg) != SSL_TLSEXT_ERR_OK) + { + *al = TLS1_AD_INTERNAL_ERROR; + return 0; + } + s->next_proto_negotiated = OPENSSL_malloc(selected_len); + if (!s->next_proto_negotiated) + { + *al = TLS1_AD_INTERNAL_ERROR; + return 0; + } + memcpy(s->next_proto_negotiated, selected, selected_len); + s->next_proto_negotiated_len = selected_len; + s->s3->next_proto_neg_seen = 1; + } +#endif else if (type == TLSEXT_TYPE_renegotiate) { if(!ssl_parse_serverhello_renegotiate_ext(s, data, size, al)) return 0; renegotiate_seen = 1; } +#ifndef OPENSSL_NO_HEARTBEATS + else if (type == TLSEXT_TYPE_heartbeat) + { + switch(data[0]) + { + case 0x01: /* Server allows us to send HB requests */ + s->tlsext_heartbeat |= SSL_TLSEXT_HB_ENABLED; + break; + case 0x02: /* Server doesn't accept HB requests */ + s->tlsext_heartbeat |= SSL_TLSEXT_HB_ENABLED; + s->tlsext_heartbeat |= SSL_TLSEXT_HB_DONT_SEND_REQUESTS; + break; + default: *al = SSL_AD_ILLEGAL_PARAMETER; + return 0; + } + } +#endif + else if (type == TLSEXT_TYPE_use_srtp) + { + if(ssl_parse_serverhello_use_srtp_ext(s, data, size, + al)) + return 0; + } + data+=size; } @@ -1236,7 +1663,7 @@ int ssl_prepare_clienthello_tlsext(SSL *s) break; } } - using_ecc = using_ecc && (s->version == TLS1_VERSION); + using_ecc = using_ecc && (s->version >= TLS1_VERSION); if (using_ecc) { if (s->tlsext_ecpointformatlist != NULL) OPENSSL_free(s->tlsext_ecpointformatlist); @@ -1252,16 +1679,19 @@ int ssl_prepare_clienthello_tlsext(SSL *s) /* we support all named elliptic curves in draft-ietf-tls-ecc-12 */ if (s->tlsext_ellipticcurvelist != NULL) OPENSSL_free(s->tlsext_ellipticcurvelist); - s->tlsext_ellipticcurvelist_length = sizeof(nid_list)/sizeof(nid_list[0]) * 2; + s->tlsext_ellipticcurvelist_length = sizeof(pref_list)/sizeof(pref_list[0]) * 2; if ((s->tlsext_ellipticcurvelist = OPENSSL_malloc(s->tlsext_ellipticcurvelist_length)) == NULL) { s->tlsext_ellipticcurvelist_length = 0; SSLerr(SSL_F_SSL_PREPARE_CLIENTHELLO_TLSEXT,ERR_R_MALLOC_FAILURE); return -1; } - for (i = 1, j = s->tlsext_ellipticcurvelist; (unsigned int)i <= - sizeof(nid_list)/sizeof(nid_list[0]); i++) - s2n(i,j); + for (i = 0, j = s->tlsext_ellipticcurvelist; (unsigned int)i < + sizeof(pref_list)/sizeof(pref_list[0]); i++) + { + int id = tls1_ec_nid2curve_id(pref_list[i]); + s2n(id,j); + } } #endif /* OPENSSL_NO_EC */ @@ -1570,26 +2000,56 @@ int ssl_check_serverhello_tlsext(SSL *s) } } -/* Since the server cache lookup is done early on in the processing of client - * hello and other operations depend on the result we need to handle any TLS - * session ticket extension at the same time. +/* Since the server cache lookup is done early on in the processing of the + * ClientHello, and other operations depend on the result, we need to handle + * any TLS session ticket extension at the same time. + * + * session_id: points at the session ID in the ClientHello. This code will + * read past the end of this in order to parse out the session ticket + * extension, if any. + * len: the length of the session ID. + * limit: a pointer to the first byte after the ClientHello. + * ret: (output) on return, if a ticket was decrypted, then this is set to + * point to the resulting session. + * + * If s->tls_session_secret_cb is set then we are expecting a pre-shared key + * ciphersuite, in which case we have no use for session tickets and one will + * never be decrypted, nor will s->tlsext_ticket_expected be set to 1. + * + * Returns: + * -1: fatal error, either from parsing or decrypting the ticket. + * 0: no ticket was found (or was ignored, based on settings). + * 1: a zero length extension was found, indicating that the client supports + * session tickets but doesn't currently have one to offer. + * 2: either s->tls_session_secret_cb was set, or a ticket was offered but + * couldn't be decrypted because of a non-fatal error. + * 3: a ticket was successfully decrypted and *ret was set. + * + * Side effects: + * Sets s->tlsext_ticket_expected to 1 if the server will have to issue + * a new session ticket to the client because the client indicated support + * (and s->tls_session_secret_cb is NULL) but the client either doesn't have + * a session ticket or we couldn't use the one it gave us, or if + * s->ctx->tlsext_ticket_key_cb asked to renew the client's ticket. + * Otherwise, s->tlsext_ticket_expected is set to 0. */ - int tls1_process_ticket(SSL *s, unsigned char *session_id, int len, - const unsigned char *limit, SSL_SESSION **ret) + const unsigned char *limit, SSL_SESSION **ret) { /* Point after session ID in client hello */ const unsigned char *p = session_id + len; unsigned short i; + *ret = NULL; + s->tlsext_ticket_expected = 0; + /* If tickets disabled behave as if no ticket present - * to permit stateful resumption. - */ + * to permit stateful resumption. + */ if (SSL_get_options(s) & SSL_OP_NO_TICKET) - return 1; - + return 0; if ((s->version <= SSL3_VERSION) || !limit) - return 1; + return 0; if (p >= limit) return -1; /* Skip past DTLS cookie */ @@ -1612,7 +2072,7 @@ int tls1_process_ticket(SSL *s, unsigned char *session_id, int len, return -1; /* Now at start of extensions */ if ((p + 2) >= limit) - return 1; + return 0; n2s(p, i); while ((p + 4) <= limit) { @@ -1620,39 +2080,61 @@ int tls1_process_ticket(SSL *s, unsigned char *session_id, int len, n2s(p, type); n2s(p, size); if (p + size > limit) - return 1; + return 0; if (type == TLSEXT_TYPE_session_ticket) { - /* If tickets disabled indicate cache miss which will - * trigger a full handshake - */ - if (SSL_get_options(s) & SSL_OP_NO_TICKET) - return 1; - /* If zero length note client will accept a ticket - * and indicate cache miss to trigger full handshake - */ + int r; if (size == 0) { + /* The client will accept a ticket but doesn't + * currently have one. */ s->tlsext_ticket_expected = 1; - return 0; /* Cache miss */ + return 1; } if (s->tls_session_secret_cb) { - /* Indicate cache miss here and instead of - * generating the session from ticket now, - * trigger abbreviated handshake based on - * external mechanism to calculate the master - * secret later. */ - return 0; + /* Indicate that the ticket couldn't be + * decrypted rather than generating the session + * from ticket now, trigger abbreviated + * handshake based on external mechanism to + * calculate the master secret later. */ + return 2; + } + r = tls_decrypt_ticket(s, p, size, session_id, len, ret); + switch (r) + { + case 2: /* ticket couldn't be decrypted */ + s->tlsext_ticket_expected = 1; + return 2; + case 3: /* ticket was decrypted */ + return r; + case 4: /* ticket decrypted but need to renew */ + s->tlsext_ticket_expected = 1; + return 3; + default: /* fatal error */ + return -1; } - return tls_decrypt_ticket(s, p, size, session_id, len, - ret); } p += size; } - return 1; + return 0; } +/* tls_decrypt_ticket attempts to decrypt a session ticket. + * + * etick: points to the body of the session ticket extension. + * eticklen: the length of the session tickets extenion. + * sess_id: points at the session ID. + * sesslen: the length of the session ID. + * psess: (output) on return, if a ticket was decrypted, then this is set to + * point to the resulting session. + * + * Returns: + * -1: fatal error, either from parsing or decrypting the ticket. + * 2: the ticket couldn't be decrypted. + * 3: a ticket was successfully decrypted and *psess was set. + * 4: same as 3, but the ticket needs to be renewed. + */ static int tls_decrypt_ticket(SSL *s, const unsigned char *etick, int eticklen, const unsigned char *sess_id, int sesslen, SSL_SESSION **psess) @@ -1667,7 +2149,7 @@ static int tls_decrypt_ticket(SSL *s, const unsigned char *etick, int eticklen, SSL_CTX *tctx = s->initial_ctx; /* Need at least keyname + iv + some encrypted data */ if (eticklen < 48) - goto tickerr; + return 2; /* Initialize session ticket encryption and HMAC contexts */ HMAC_CTX_init(&hctx); EVP_CIPHER_CTX_init(&ctx); @@ -1679,7 +2161,7 @@ static int tls_decrypt_ticket(SSL *s, const unsigned char *etick, int eticklen, if (rv < 0) return -1; if (rv == 0) - goto tickerr; + return 2; if (rv == 2) renew_ticket = 1; } @@ -1687,15 +2169,15 @@ static int tls_decrypt_ticket(SSL *s, const unsigned char *etick, int eticklen, { /* Check key name matches */ if (memcmp(etick, tctx->tlsext_tick_key_name, 16)) - goto tickerr; + return 2; HMAC_Init_ex(&hctx, tctx->tlsext_tick_hmac_key, 16, tlsext_tick_md(), NULL); EVP_DecryptInit_ex(&ctx, EVP_aes_128_cbc(), NULL, tctx->tlsext_tick_aes_key, etick + 16); } /* Attempt to process session ticket, first conduct sanity and - * integrity checks on ticket. - */ + * integrity checks on ticket. + */ mlen = HMAC_size(&hctx); if (mlen < 0) { @@ -1708,7 +2190,7 @@ static int tls_decrypt_ticket(SSL *s, const unsigned char *etick, int eticklen, HMAC_Final(&hctx, tick_hmac, NULL); HMAC_CTX_cleanup(&hctx); if (memcmp(tick_hmac, etick + eticklen, mlen)) - goto tickerr; + return 2; /* Attempt to decrypt session data */ /* Move p after IV to start of encrypted ticket, update length */ p = etick + 16 + EVP_CIPHER_CTX_iv_length(&ctx); @@ -1721,33 +2203,376 @@ static int tls_decrypt_ticket(SSL *s, const unsigned char *etick, int eticklen, } EVP_DecryptUpdate(&ctx, sdec, &slen, p, eticklen); if (EVP_DecryptFinal(&ctx, sdec + slen, &mlen) <= 0) - goto tickerr; + return 2; slen += mlen; EVP_CIPHER_CTX_cleanup(&ctx); p = sdec; - + sess = d2i_SSL_SESSION(NULL, &p, slen); OPENSSL_free(sdec); if (sess) { - /* The session ID if non-empty is used by some clients to - * detect that the ticket has been accepted. So we copy it to - * the session structure. If it is empty set length to zero - * as required by standard. - */ + /* The session ID, if non-empty, is used by some clients to + * detect that the ticket has been accepted. So we copy it to + * the session structure. If it is empty set length to zero + * as required by standard. + */ if (sesslen) memcpy(sess->session_id, sess_id, sesslen); sess->session_id_length = sesslen; *psess = sess; - s->tlsext_ticket_expected = renew_ticket; + if (renew_ticket) + return 4; + else + return 3; + } + ERR_clear_error(); + /* For session parse failure, indicate that we need to send a new + * ticket. */ + return 2; + } + +/* Tables to translate from NIDs to TLS v1.2 ids */ + +typedef struct + { + int nid; + int id; + } tls12_lookup; + +static tls12_lookup tls12_md[] = { +#ifndef OPENSSL_NO_MD5 + {NID_md5, TLSEXT_hash_md5}, +#endif +#ifndef OPENSSL_NO_SHA + {NID_sha1, TLSEXT_hash_sha1}, +#endif +#ifndef OPENSSL_NO_SHA256 + {NID_sha224, TLSEXT_hash_sha224}, + {NID_sha256, TLSEXT_hash_sha256}, +#endif +#ifndef OPENSSL_NO_SHA512 + {NID_sha384, TLSEXT_hash_sha384}, + {NID_sha512, TLSEXT_hash_sha512} +#endif +}; + +static tls12_lookup tls12_sig[] = { +#ifndef OPENSSL_NO_RSA + {EVP_PKEY_RSA, TLSEXT_signature_rsa}, +#endif +#ifndef OPENSSL_NO_DSA + {EVP_PKEY_DSA, TLSEXT_signature_dsa}, +#endif +#ifndef OPENSSL_NO_ECDSA + {EVP_PKEY_EC, TLSEXT_signature_ecdsa} +#endif +}; + +static int tls12_find_id(int nid, tls12_lookup *table, size_t tlen) + { + size_t i; + for (i = 0; i < tlen; i++) + { + if (table[i].nid == nid) + return table[i].id; + } + return -1; + } +#if 0 +static int tls12_find_nid(int id, tls12_lookup *table, size_t tlen) + { + size_t i; + for (i = 0; i < tlen; i++) + { + if (table[i].id == id) + return table[i].nid; + } + return -1; + } +#endif + +int tls12_get_sigandhash(unsigned char *p, const EVP_PKEY *pk, const EVP_MD *md) + { + int sig_id, md_id; + if (!md) + return 0; + md_id = tls12_find_id(EVP_MD_type(md), tls12_md, + sizeof(tls12_md)/sizeof(tls12_lookup)); + if (md_id == -1) + return 0; + sig_id = tls12_get_sigid(pk); + if (sig_id == -1) + return 0; + p[0] = (unsigned char)md_id; + p[1] = (unsigned char)sig_id; + return 1; + } + +int tls12_get_sigid(const EVP_PKEY *pk) + { + return tls12_find_id(pk->type, tls12_sig, + sizeof(tls12_sig)/sizeof(tls12_lookup)); + } + +const EVP_MD *tls12_get_hash(unsigned char hash_alg) + { + switch(hash_alg) + { +#ifndef OPENSSL_NO_MD5 + case TLSEXT_hash_md5: +#ifdef OPENSSL_FIPS + if (FIPS_mode()) + return NULL; +#endif + return EVP_md5(); +#endif +#ifndef OPENSSL_NO_SHA + case TLSEXT_hash_sha1: + return EVP_sha1(); +#endif +#ifndef OPENSSL_NO_SHA256 + case TLSEXT_hash_sha224: + return EVP_sha224(); + + case TLSEXT_hash_sha256: + return EVP_sha256(); +#endif +#ifndef OPENSSL_NO_SHA512 + case TLSEXT_hash_sha384: + return EVP_sha384(); + + case TLSEXT_hash_sha512: + return EVP_sha512(); +#endif + default: + return NULL; + + } + } + +/* Set preferred digest for each key type */ + +int tls1_process_sigalgs(SSL *s, const unsigned char *data, int dsize) + { + int i, idx; + const EVP_MD *md; + CERT *c = s->cert; + /* Extension ignored for TLS versions below 1.2 */ + if (TLS1_get_version(s) < TLS1_2_VERSION) return 1; + /* Should never happen */ + if (!c) + return 0; + + c->pkeys[SSL_PKEY_DSA_SIGN].digest = NULL; + c->pkeys[SSL_PKEY_RSA_SIGN].digest = NULL; + c->pkeys[SSL_PKEY_RSA_ENC].digest = NULL; + c->pkeys[SSL_PKEY_ECC].digest = NULL; + + for (i = 0; i < dsize; i += 2) + { + unsigned char hash_alg = data[i], sig_alg = data[i+1]; + + switch(sig_alg) + { +#ifndef OPENSSL_NO_RSA + case TLSEXT_signature_rsa: + idx = SSL_PKEY_RSA_SIGN; + break; +#endif +#ifndef OPENSSL_NO_DSA + case TLSEXT_signature_dsa: + idx = SSL_PKEY_DSA_SIGN; + break; +#endif +#ifndef OPENSSL_NO_ECDSA + case TLSEXT_signature_ecdsa: + idx = SSL_PKEY_ECC; + break; +#endif + default: + continue; + } + + if (c->pkeys[idx].digest == NULL) + { + md = tls12_get_hash(hash_alg); + if (md) + { + c->pkeys[idx].digest = md; + if (idx == SSL_PKEY_RSA_SIGN) + c->pkeys[SSL_PKEY_RSA_ENC].digest = md; + } + } + } - /* If session decrypt failure indicate a cache miss and set state to - * send a new ticket - */ - tickerr: - s->tlsext_ticket_expected = 1; + + + /* Set any remaining keys to default values. NOTE: if alg is not + * supported it stays as NULL. + */ +#ifndef OPENSSL_NO_DSA + if (!c->pkeys[SSL_PKEY_DSA_SIGN].digest) + c->pkeys[SSL_PKEY_DSA_SIGN].digest = EVP_dss1(); +#endif +#ifndef OPENSSL_NO_RSA + if (!c->pkeys[SSL_PKEY_RSA_SIGN].digest) + { + c->pkeys[SSL_PKEY_RSA_SIGN].digest = EVP_sha1(); + c->pkeys[SSL_PKEY_RSA_ENC].digest = EVP_sha1(); + } +#endif +#ifndef OPENSSL_NO_ECDSA + if (!c->pkeys[SSL_PKEY_ECC].digest) + c->pkeys[SSL_PKEY_ECC].digest = EVP_ecdsa(); +#endif + return 1; + } + +#endif + +#ifndef OPENSSL_NO_HEARTBEATS +int +tls1_process_heartbeat(SSL *s) + { + unsigned char *p = &s->s3->rrec.data[0], *pl; + unsigned short hbtype; + unsigned int payload; + unsigned int padding = 16; /* Use minimum padding */ + + /* Read type and payload length first */ + hbtype = *p++; + n2s(p, payload); + pl = p; + + if (s->msg_callback) + s->msg_callback(0, s->version, TLS1_RT_HEARTBEAT, + &s->s3->rrec.data[0], s->s3->rrec.length, + s, s->msg_callback_arg); + + if (hbtype == TLS1_HB_REQUEST) + { + unsigned char *buffer, *bp; + int r; + + /* Allocate memory for the response, size is 1 bytes + * message type, plus 2 bytes payload length, plus + * payload, plus padding + */ + buffer = OPENSSL_malloc(1 + 2 + payload + padding); + bp = buffer; + + /* Enter response type, length and copy payload */ + *bp++ = TLS1_HB_RESPONSE; + s2n(payload, bp); + memcpy(bp, pl, payload); + bp += payload; + /* Random padding */ + RAND_pseudo_bytes(bp, padding); + + r = ssl3_write_bytes(s, TLS1_RT_HEARTBEAT, buffer, 3 + payload + padding); + + if (r >= 0 && s->msg_callback) + s->msg_callback(1, s->version, TLS1_RT_HEARTBEAT, + buffer, 3 + payload + padding, + s, s->msg_callback_arg); + + OPENSSL_free(buffer); + + if (r < 0) + return r; + } + else if (hbtype == TLS1_HB_RESPONSE) + { + unsigned int seq; + + /* We only send sequence numbers (2 bytes unsigned int), + * and 16 random bytes, so we just try to read the + * sequence number */ + n2s(pl, seq); + + if (payload == 18 && seq == s->tlsext_hb_seq) + { + s->tlsext_hb_seq++; + s->tlsext_hb_pending = 0; + } + } + return 0; } +int +tls1_heartbeat(SSL *s) + { + unsigned char *buf, *p; + int ret; + unsigned int payload = 18; /* Sequence number + random bytes */ + unsigned int padding = 16; /* Use minimum padding */ + + /* Only send if peer supports and accepts HB requests... */ + if (!(s->tlsext_heartbeat & SSL_TLSEXT_HB_ENABLED) || + s->tlsext_heartbeat & SSL_TLSEXT_HB_DONT_SEND_REQUESTS) + { + SSLerr(SSL_F_TLS1_HEARTBEAT,SSL_R_TLS_HEARTBEAT_PEER_DOESNT_ACCEPT); + return -1; + } + + /* ...and there is none in flight yet... */ + if (s->tlsext_hb_pending) + { + SSLerr(SSL_F_TLS1_HEARTBEAT,SSL_R_TLS_HEARTBEAT_PENDING); + return -1; + } + + /* ...and no handshake in progress. */ + if (SSL_in_init(s) || s->in_handshake) + { + SSLerr(SSL_F_TLS1_HEARTBEAT,SSL_R_UNEXPECTED_MESSAGE); + return -1; + } + + /* Check if padding is too long, payload and padding + * must not exceed 2^14 - 3 = 16381 bytes in total. + */ + OPENSSL_assert(payload + padding <= 16381); + + /* Create HeartBeat message, we just use a sequence number + * as payload to distuingish different messages and add + * some random stuff. + * - Message Type, 1 byte + * - Payload Length, 2 bytes (unsigned int) + * - Payload, the sequence number (2 bytes uint) + * - Payload, random bytes (16 bytes uint) + * - Padding + */ + buf = OPENSSL_malloc(1 + 2 + payload + padding); + p = buf; + /* Message Type */ + *p++ = TLS1_HB_REQUEST; + /* Payload length (18 bytes here) */ + s2n(payload, p); + /* Sequence number */ + s2n(s->tlsext_hb_seq, p); + /* 16 random bytes */ + RAND_pseudo_bytes(p, 16); + p += 16; + /* Random padding */ + RAND_pseudo_bytes(p, padding); + + ret = ssl3_write_bytes(s, TLS1_RT_HEARTBEAT, buf, 3 + payload + padding); + if (ret >= 0) + { + if (s->msg_callback) + s->msg_callback(1, s->version, TLS1_RT_HEARTBEAT, + buf, 3 + payload + padding, + s, s->msg_callback_arg); + + s->tlsext_hb_pending = 1; + } + + OPENSSL_free(buf); + + return ret; + } #endif diff --git a/crypto/openssl/ssl/t1_meth.c b/crypto/openssl/ssl/t1_meth.c index 6ce7c0bbf5..53c807de28 100644 --- a/crypto/openssl/ssl/t1_meth.c +++ b/crypto/openssl/ssl/t1_meth.c @@ -60,16 +60,28 @@ #include #include "ssl_locl.h" -static const SSL_METHOD *tls1_get_method(int ver); static const SSL_METHOD *tls1_get_method(int ver) { + if (ver == TLS1_2_VERSION) + return TLSv1_2_method(); + if (ver == TLS1_1_VERSION) + return TLSv1_1_method(); if (ver == TLS1_VERSION) - return(TLSv1_method()); - else - return(NULL); + return TLSv1_method(); + return NULL; } -IMPLEMENT_tls1_meth_func(TLSv1_method, +IMPLEMENT_tls_meth_func(TLS1_2_VERSION, TLSv1_2_method, + ssl3_accept, + ssl3_connect, + tls1_get_method) + +IMPLEMENT_tls_meth_func(TLS1_1_VERSION, TLSv1_1_method, + ssl3_accept, + ssl3_connect, + tls1_get_method) + +IMPLEMENT_tls_meth_func(TLS1_VERSION, TLSv1_method, ssl3_accept, ssl3_connect, tls1_get_method) diff --git a/crypto/openssl/ssl/t1_srvr.c b/crypto/openssl/ssl/t1_srvr.c index 42525e9e89..f1d1565769 100644 --- a/crypto/openssl/ssl/t1_srvr.c +++ b/crypto/openssl/ssl/t1_srvr.c @@ -67,13 +67,26 @@ static const SSL_METHOD *tls1_get_server_method(int ver); static const SSL_METHOD *tls1_get_server_method(int ver) { + if (ver == TLS1_2_VERSION) + return TLSv1_2_server_method(); + if (ver == TLS1_1_VERSION) + return TLSv1_1_server_method(); if (ver == TLS1_VERSION) - return(TLSv1_server_method()); - else - return(NULL); + return TLSv1_server_method(); + return NULL; } -IMPLEMENT_tls1_meth_func(TLSv1_server_method, +IMPLEMENT_tls_meth_func(TLS1_2_VERSION, TLSv1_2_server_method, + ssl3_accept, + ssl_undefined_function, + tls1_get_server_method) + +IMPLEMENT_tls_meth_func(TLS1_1_VERSION, TLSv1_1_server_method, + ssl3_accept, + ssl_undefined_function, + tls1_get_server_method) + +IMPLEMENT_tls_meth_func(TLS1_VERSION, TLSv1_server_method, ssl3_accept, ssl_undefined_function, tls1_get_server_method) diff --git a/crypto/openssl/ssl/tls1.h b/crypto/openssl/ssl/tls1.h index b3cc8f098b..c39c267f0b 100644 --- a/crypto/openssl/ssl/tls1.h +++ b/crypto/openssl/ssl/tls1.h @@ -159,10 +159,24 @@ extern "C" { #define TLS1_ALLOW_EXPERIMENTAL_CIPHERSUITES 0 +#define TLS1_2_VERSION 0x0303 +#define TLS1_2_VERSION_MAJOR 0x03 +#define TLS1_2_VERSION_MINOR 0x03 + +#define TLS1_1_VERSION 0x0302 +#define TLS1_1_VERSION_MAJOR 0x03 +#define TLS1_1_VERSION_MINOR 0x02 + #define TLS1_VERSION 0x0301 #define TLS1_VERSION_MAJOR 0x03 #define TLS1_VERSION_MINOR 0x01 +#define TLS1_get_version(s) \ + ((s->version >> 8) == TLS1_VERSION_MAJOR ? s->version : 0) + +#define TLS1_get_client_version(s) \ + ((s->client_version >> 8) == TLS1_VERSION_MAJOR ? s->client_version : 0) + #define TLS1_AD_DECRYPTION_FAILED 21 #define TLS1_AD_RECORD_OVERFLOW 22 #define TLS1_AD_UNKNOWN_CA 48 /* fatal */ @@ -183,17 +197,42 @@ extern "C" { #define TLS1_AD_BAD_CERTIFICATE_HASH_VALUE 114 #define TLS1_AD_UNKNOWN_PSK_IDENTITY 115 /* fatal */ -/* ExtensionType values from RFC3546 / RFC4366 */ +/* ExtensionType values from RFC3546 / RFC4366 / RFC6066 */ #define TLSEXT_TYPE_server_name 0 #define TLSEXT_TYPE_max_fragment_length 1 #define TLSEXT_TYPE_client_certificate_url 2 #define TLSEXT_TYPE_trusted_ca_keys 3 #define TLSEXT_TYPE_truncated_hmac 4 #define TLSEXT_TYPE_status_request 5 +/* ExtensionType values from RFC4681 */ +#define TLSEXT_TYPE_user_mapping 6 + +/* ExtensionType values from RFC5878 */ +#define TLSEXT_TYPE_client_authz 7 +#define TLSEXT_TYPE_server_authz 8 + +/* ExtensionType values from RFC6091 */ +#define TLSEXT_TYPE_cert_type 9 + /* ExtensionType values from RFC4492 */ #define TLSEXT_TYPE_elliptic_curves 10 #define TLSEXT_TYPE_ec_point_formats 11 + +/* ExtensionType value from RFC5054 */ +#define TLSEXT_TYPE_srp 12 + +/* ExtensionType values from RFC5246 */ +#define TLSEXT_TYPE_signature_algorithms 13 + +/* ExtensionType value from RFC5764 */ +#define TLSEXT_TYPE_use_srtp 14 + +/* ExtensionType value from RFC5620 */ +#define TLSEXT_TYPE_heartbeat 15 + +/* ExtensionType value from RFC4507 */ #define TLSEXT_TYPE_session_ticket 35 + /* ExtensionType value from draft-rescorla-tls-opaque-prf-input-00.txt */ #if 0 /* will have to be provided externally for now , * i.e. build with -DTLSEXT_TYPE_opaque_prf_input=38183 @@ -204,6 +243,11 @@ extern "C" { /* Temporary extension type */ #define TLSEXT_TYPE_renegotiate 0xff01 +#ifndef OPENSSL_NO_NEXTPROTONEG +/* This is not an IANA defined extension number */ +#define TLSEXT_TYPE_next_proto_neg 13172 +#endif + /* NameType value from RFC 3546 */ #define TLSEXT_NAMETYPE_host_name 0 /* status request value from RFC 3546 */ @@ -216,12 +260,37 @@ extern "C" { #define TLSEXT_ECPOINTFORMAT_ansiX962_compressed_char2 2 #define TLSEXT_ECPOINTFORMAT_last 2 +/* Signature and hash algorithms from RFC 5246 */ + +#define TLSEXT_signature_anonymous 0 +#define TLSEXT_signature_rsa 1 +#define TLSEXT_signature_dsa 2 +#define TLSEXT_signature_ecdsa 3 + +#define TLSEXT_hash_none 0 +#define TLSEXT_hash_md5 1 +#define TLSEXT_hash_sha1 2 +#define TLSEXT_hash_sha224 3 +#define TLSEXT_hash_sha256 4 +#define TLSEXT_hash_sha384 5 +#define TLSEXT_hash_sha512 6 + #ifndef OPENSSL_NO_TLSEXT #define TLSEXT_MAXLEN_host_name 255 -const char *SSL_get_servername(const SSL *s, const int type) ; -int SSL_get_servername_type(const SSL *s) ; +const char *SSL_get_servername(const SSL *s, const int type); +int SSL_get_servername_type(const SSL *s); +/* SSL_export_keying_material exports a value derived from the master secret, + * as specified in RFC 5705. It writes |olen| bytes to |out| given a label and + * optional context. (Since a zero length context is allowed, the |use_context| + * flag controls whether a context is included.) + * + * It returns 1 on success and zero otherwise. + */ +int SSL_export_keying_material(SSL *s, unsigned char *out, size_t olen, + const char *label, size_t llen, const unsigned char *p, size_t plen, + int use_context); #define SSL_set_tlsext_host_name(s,name) \ SSL_ctrl(s,SSL_CTRL_SET_TLSEXT_HOSTNAME,TLSEXT_NAMETYPE_host_name,(char *)name) @@ -285,6 +354,16 @@ SSL_CTX_ctrl(ctx,SSL_CTRL_SET_TLSEXT_OPAQUE_PRF_INPUT_CB_ARG, 0, arg) #define SSL_CTX_set_tlsext_ticket_key_cb(ssl, cb) \ SSL_CTX_callback_ctrl(ssl,SSL_CTRL_SET_TLSEXT_TICKET_KEY_CB,(void (*)(void))cb) +#ifndef OPENSSL_NO_HEARTBEATS +#define SSL_TLSEXT_HB_ENABLED 0x01 +#define SSL_TLSEXT_HB_DONT_SEND_REQUESTS 0x02 +#define SSL_TLSEXT_HB_DONT_RECV_REQUESTS 0x04 + +#define SSL_get_tlsext_heartbeat_pending(ssl) \ + SSL_ctrl((ssl),SSL_CTRL_GET_TLS_EXT_HEARTBEAT_PENDING,0,NULL) +#define SSL_set_tlsext_heartbeat_no_requests(ssl, arg) \ + SSL_ctrl((ssl),SSL_CTRL_SET_TLS_EXT_HEARTBEAT_NO_REQUESTS,arg,NULL) +#endif #endif /* PSK ciphersuites from 4279 */ @@ -322,6 +401,14 @@ SSL_CTX_callback_ctrl(ssl,SSL_CTRL_SET_TLSEXT_TICKET_KEY_CB,(void (*)(void))cb) #define TLS1_CK_DHE_RSA_WITH_AES_256_SHA 0x03000039 #define TLS1_CK_ADH_WITH_AES_256_SHA 0x0300003A +/* TLS v1.2 ciphersuites */ +#define TLS1_CK_RSA_WITH_NULL_SHA256 0x0300003B +#define TLS1_CK_RSA_WITH_AES_128_SHA256 0x0300003C +#define TLS1_CK_RSA_WITH_AES_256_SHA256 0x0300003D +#define TLS1_CK_DH_DSS_WITH_AES_128_SHA256 0x0300003E +#define TLS1_CK_DH_RSA_WITH_AES_128_SHA256 0x0300003F +#define TLS1_CK_DHE_DSS_WITH_AES_128_SHA256 0x03000040 + /* Camellia ciphersuites from RFC4132 */ #define TLS1_CK_RSA_WITH_CAMELLIA_128_CBC_SHA 0x03000041 #define TLS1_CK_DH_DSS_WITH_CAMELLIA_128_CBC_SHA 0x03000042 @@ -330,6 +417,16 @@ SSL_CTX_callback_ctrl(ssl,SSL_CTRL_SET_TLSEXT_TICKET_KEY_CB,(void (*)(void))cb) #define TLS1_CK_DHE_RSA_WITH_CAMELLIA_128_CBC_SHA 0x03000045 #define TLS1_CK_ADH_WITH_CAMELLIA_128_CBC_SHA 0x03000046 +/* TLS v1.2 ciphersuites */ +#define TLS1_CK_DHE_RSA_WITH_AES_128_SHA256 0x03000067 +#define TLS1_CK_DH_DSS_WITH_AES_256_SHA256 0x03000068 +#define TLS1_CK_DH_RSA_WITH_AES_256_SHA256 0x03000069 +#define TLS1_CK_DHE_DSS_WITH_AES_256_SHA256 0x0300006A +#define TLS1_CK_DHE_RSA_WITH_AES_256_SHA256 0x0300006B +#define TLS1_CK_ADH_WITH_AES_128_SHA256 0x0300006C +#define TLS1_CK_ADH_WITH_AES_256_SHA256 0x0300006D + +/* Camellia ciphersuites from RFC4132 */ #define TLS1_CK_RSA_WITH_CAMELLIA_256_CBC_SHA 0x03000084 #define TLS1_CK_DH_DSS_WITH_CAMELLIA_256_CBC_SHA 0x03000085 #define TLS1_CK_DH_RSA_WITH_CAMELLIA_256_CBC_SHA 0x03000086 @@ -345,6 +442,20 @@ SSL_CTX_callback_ctrl(ssl,SSL_CTRL_SET_TLSEXT_TICKET_KEY_CB,(void (*)(void))cb) #define TLS1_CK_DHE_RSA_WITH_SEED_SHA 0x0300009A #define TLS1_CK_ADH_WITH_SEED_SHA 0x0300009B +/* TLS v1.2 GCM ciphersuites from RFC5288 */ +#define TLS1_CK_RSA_WITH_AES_128_GCM_SHA256 0x0300009C +#define TLS1_CK_RSA_WITH_AES_256_GCM_SHA384 0x0300009D +#define TLS1_CK_DHE_RSA_WITH_AES_128_GCM_SHA256 0x0300009E +#define TLS1_CK_DHE_RSA_WITH_AES_256_GCM_SHA384 0x0300009F +#define TLS1_CK_DH_RSA_WITH_AES_128_GCM_SHA256 0x030000A0 +#define TLS1_CK_DH_RSA_WITH_AES_256_GCM_SHA384 0x030000A1 +#define TLS1_CK_DHE_DSS_WITH_AES_128_GCM_SHA256 0x030000A2 +#define TLS1_CK_DHE_DSS_WITH_AES_256_GCM_SHA384 0x030000A3 +#define TLS1_CK_DH_DSS_WITH_AES_128_GCM_SHA256 0x030000A4 +#define TLS1_CK_DH_DSS_WITH_AES_256_GCM_SHA384 0x030000A5 +#define TLS1_CK_ADH_WITH_AES_128_GCM_SHA256 0x030000A6 +#define TLS1_CK_ADH_WITH_AES_256_GCM_SHA384 0x030000A7 + /* ECC ciphersuites from draft-ietf-tls-ecc-12.txt with changes soon to be in draft 13 */ #define TLS1_CK_ECDH_ECDSA_WITH_NULL_SHA 0x0300C001 #define TLS1_CK_ECDH_ECDSA_WITH_RC4_128_SHA 0x0300C002 @@ -376,6 +487,38 @@ SSL_CTX_callback_ctrl(ssl,SSL_CTRL_SET_TLSEXT_TICKET_KEY_CB,(void (*)(void))cb) #define TLS1_CK_ECDH_anon_WITH_AES_128_CBC_SHA 0x0300C018 #define TLS1_CK_ECDH_anon_WITH_AES_256_CBC_SHA 0x0300C019 +/* SRP ciphersuites from RFC 5054 */ +#define TLS1_CK_SRP_SHA_WITH_3DES_EDE_CBC_SHA 0x0300C01A +#define TLS1_CK_SRP_SHA_RSA_WITH_3DES_EDE_CBC_SHA 0x0300C01B +#define TLS1_CK_SRP_SHA_DSS_WITH_3DES_EDE_CBC_SHA 0x0300C01C +#define TLS1_CK_SRP_SHA_WITH_AES_128_CBC_SHA 0x0300C01D +#define TLS1_CK_SRP_SHA_RSA_WITH_AES_128_CBC_SHA 0x0300C01E +#define TLS1_CK_SRP_SHA_DSS_WITH_AES_128_CBC_SHA 0x0300C01F +#define TLS1_CK_SRP_SHA_WITH_AES_256_CBC_SHA 0x0300C020 +#define TLS1_CK_SRP_SHA_RSA_WITH_AES_256_CBC_SHA 0x0300C021 +#define TLS1_CK_SRP_SHA_DSS_WITH_AES_256_CBC_SHA 0x0300C022 + +/* ECDH HMAC based ciphersuites from RFC5289 */ + +#define TLS1_CK_ECDHE_ECDSA_WITH_AES_128_SHA256 0x0300C023 +#define TLS1_CK_ECDHE_ECDSA_WITH_AES_256_SHA384 0x0300C024 +#define TLS1_CK_ECDH_ECDSA_WITH_AES_128_SHA256 0x0300C025 +#define TLS1_CK_ECDH_ECDSA_WITH_AES_256_SHA384 0x0300C026 +#define TLS1_CK_ECDHE_RSA_WITH_AES_128_SHA256 0x0300C027 +#define TLS1_CK_ECDHE_RSA_WITH_AES_256_SHA384 0x0300C028 +#define TLS1_CK_ECDH_RSA_WITH_AES_128_SHA256 0x0300C029 +#define TLS1_CK_ECDH_RSA_WITH_AES_256_SHA384 0x0300C02A + +/* ECDH GCM based ciphersuites from RFC5289 */ +#define TLS1_CK_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256 0x0300C02B +#define TLS1_CK_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 0x0300C02C +#define TLS1_CK_ECDH_ECDSA_WITH_AES_128_GCM_SHA256 0x0300C02D +#define TLS1_CK_ECDH_ECDSA_WITH_AES_256_GCM_SHA384 0x0300C02E +#define TLS1_CK_ECDHE_RSA_WITH_AES_128_GCM_SHA256 0x0300C02F +#define TLS1_CK_ECDHE_RSA_WITH_AES_256_GCM_SHA384 0x0300C030 +#define TLS1_CK_ECDH_RSA_WITH_AES_128_GCM_SHA256 0x0300C031 +#define TLS1_CK_ECDH_RSA_WITH_AES_256_GCM_SHA384 0x0300C032 + /* XXX * Inconsistency alert: * The OpenSSL names of ciphers with ephemeral DH here include the string @@ -443,6 +586,17 @@ SSL_CTX_callback_ctrl(ssl,SSL_CTRL_SET_TLSEXT_TICKET_KEY_CB,(void (*)(void))cb) #define TLS1_TXT_PSK_WITH_AES_128_CBC_SHA "PSK-AES128-CBC-SHA" #define TLS1_TXT_PSK_WITH_AES_256_CBC_SHA "PSK-AES256-CBC-SHA" +/* SRP ciphersuite from RFC 5054 */ +#define TLS1_TXT_SRP_SHA_WITH_3DES_EDE_CBC_SHA "SRP-3DES-EDE-CBC-SHA" +#define TLS1_TXT_SRP_SHA_RSA_WITH_3DES_EDE_CBC_SHA "SRP-RSA-3DES-EDE-CBC-SHA" +#define TLS1_TXT_SRP_SHA_DSS_WITH_3DES_EDE_CBC_SHA "SRP-DSS-3DES-EDE-CBC-SHA" +#define TLS1_TXT_SRP_SHA_WITH_AES_128_CBC_SHA "SRP-AES-128-CBC-SHA" +#define TLS1_TXT_SRP_SHA_RSA_WITH_AES_128_CBC_SHA "SRP-RSA-AES-128-CBC-SHA" +#define TLS1_TXT_SRP_SHA_DSS_WITH_AES_128_CBC_SHA "SRP-DSS-AES-128-CBC-SHA" +#define TLS1_TXT_SRP_SHA_WITH_AES_256_CBC_SHA "SRP-AES-256-CBC-SHA" +#define TLS1_TXT_SRP_SHA_RSA_WITH_AES_256_CBC_SHA "SRP-RSA-AES-256-CBC-SHA" +#define TLS1_TXT_SRP_SHA_DSS_WITH_AES_256_CBC_SHA "SRP-DSS-AES-256-CBC-SHA" + /* Camellia ciphersuites from RFC4132 */ #define TLS1_TXT_RSA_WITH_CAMELLIA_128_CBC_SHA "CAMELLIA128-SHA" #define TLS1_TXT_DH_DSS_WITH_CAMELLIA_128_CBC_SHA "DH-DSS-CAMELLIA128-SHA" @@ -466,6 +620,55 @@ SSL_CTX_callback_ctrl(ssl,SSL_CTRL_SET_TLSEXT_TICKET_KEY_CB,(void (*)(void))cb) #define TLS1_TXT_DHE_RSA_WITH_SEED_SHA "DHE-RSA-SEED-SHA" #define TLS1_TXT_ADH_WITH_SEED_SHA "ADH-SEED-SHA" +/* TLS v1.2 ciphersuites */ +#define TLS1_TXT_RSA_WITH_NULL_SHA256 "NULL-SHA256" +#define TLS1_TXT_RSA_WITH_AES_128_SHA256 "AES128-SHA256" +#define TLS1_TXT_RSA_WITH_AES_256_SHA256 "AES256-SHA256" +#define TLS1_TXT_DH_DSS_WITH_AES_128_SHA256 "DH-DSS-AES128-SHA256" +#define TLS1_TXT_DH_RSA_WITH_AES_128_SHA256 "DH-RSA-AES128-SHA256" +#define TLS1_TXT_DHE_DSS_WITH_AES_128_SHA256 "DHE-DSS-AES128-SHA256" +#define TLS1_TXT_DHE_RSA_WITH_AES_128_SHA256 "DHE-RSA-AES128-SHA256" +#define TLS1_TXT_DH_DSS_WITH_AES_256_SHA256 "DH-DSS-AES256-SHA256" +#define TLS1_TXT_DH_RSA_WITH_AES_256_SHA256 "DH-RSA-AES256-SHA256" +#define TLS1_TXT_DHE_DSS_WITH_AES_256_SHA256 "DHE-DSS-AES256-SHA256" +#define TLS1_TXT_DHE_RSA_WITH_AES_256_SHA256 "DHE-RSA-AES256-SHA256" +#define TLS1_TXT_ADH_WITH_AES_128_SHA256 "ADH-AES128-SHA256" +#define TLS1_TXT_ADH_WITH_AES_256_SHA256 "ADH-AES256-SHA256" + +/* TLS v1.2 GCM ciphersuites from RFC5288 */ +#define TLS1_TXT_RSA_WITH_AES_128_GCM_SHA256 "AES128-GCM-SHA256" +#define TLS1_TXT_RSA_WITH_AES_256_GCM_SHA384 "AES256-GCM-SHA384" +#define TLS1_TXT_DHE_RSA_WITH_AES_128_GCM_SHA256 "DHE-RSA-AES128-GCM-SHA256" +#define TLS1_TXT_DHE_RSA_WITH_AES_256_GCM_SHA384 "DHE-RSA-AES256-GCM-SHA384" +#define TLS1_TXT_DH_RSA_WITH_AES_128_GCM_SHA256 "DH-RSA-AES128-GCM-SHA256" +#define TLS1_TXT_DH_RSA_WITH_AES_256_GCM_SHA384 "DH-RSA-AES256-GCM-SHA384" +#define TLS1_TXT_DHE_DSS_WITH_AES_128_GCM_SHA256 "DHE-DSS-AES128-GCM-SHA256" +#define TLS1_TXT_DHE_DSS_WITH_AES_256_GCM_SHA384 "DHE-DSS-AES256-GCM-SHA384" +#define TLS1_TXT_DH_DSS_WITH_AES_128_GCM_SHA256 "DH-DSS-AES128-GCM-SHA256" +#define TLS1_TXT_DH_DSS_WITH_AES_256_GCM_SHA384 "DH-DSS-AES256-GCM-SHA384" +#define TLS1_TXT_ADH_WITH_AES_128_GCM_SHA256 "ADH-AES128-GCM-SHA256" +#define TLS1_TXT_ADH_WITH_AES_256_GCM_SHA384 "ADH-AES256-GCM-SHA384" + +/* ECDH HMAC based ciphersuites from RFC5289 */ + +#define TLS1_TXT_ECDHE_ECDSA_WITH_AES_128_SHA256 "ECDHE-ECDSA-AES128-SHA256" +#define TLS1_TXT_ECDHE_ECDSA_WITH_AES_256_SHA384 "ECDHE-ECDSA-AES256-SHA384" +#define TLS1_TXT_ECDH_ECDSA_WITH_AES_128_SHA256 "ECDH-ECDSA-AES128-SHA256" +#define TLS1_TXT_ECDH_ECDSA_WITH_AES_256_SHA384 "ECDH-ECDSA-AES256-SHA384" +#define TLS1_TXT_ECDHE_RSA_WITH_AES_128_SHA256 "ECDHE-RSA-AES128-SHA256" +#define TLS1_TXT_ECDHE_RSA_WITH_AES_256_SHA384 "ECDHE-RSA-AES256-SHA384" +#define TLS1_TXT_ECDH_RSA_WITH_AES_128_SHA256 "ECDH-RSA-AES128-SHA256" +#define TLS1_TXT_ECDH_RSA_WITH_AES_256_SHA384 "ECDH-RSA-AES256-SHA384" + +/* ECDH GCM based ciphersuites from RFC5289 */ +#define TLS1_TXT_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256 "ECDHE-ECDSA-AES128-GCM-SHA256" +#define TLS1_TXT_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 "ECDHE-ECDSA-AES256-GCM-SHA384" +#define TLS1_TXT_ECDH_ECDSA_WITH_AES_128_GCM_SHA256 "ECDH-ECDSA-AES128-GCM-SHA256" +#define TLS1_TXT_ECDH_ECDSA_WITH_AES_256_GCM_SHA384 "ECDH-ECDSA-AES256-GCM-SHA384" +#define TLS1_TXT_ECDHE_RSA_WITH_AES_128_GCM_SHA256 "ECDHE-RSA-AES128-GCM-SHA256" +#define TLS1_TXT_ECDHE_RSA_WITH_AES_256_GCM_SHA384 "ECDHE-RSA-AES256-GCM-SHA384" +#define TLS1_TXT_ECDH_RSA_WITH_AES_128_GCM_SHA256 "ECDH-RSA-AES128-GCM-SHA256" +#define TLS1_TXT_ECDH_RSA_WITH_AES_256_GCM_SHA384 "ECDH-RSA-AES256-GCM-SHA384" #define TLS_CT_RSA_SIGN 1 #define TLS_CT_DSS_SIGN 2 diff --git a/crypto/openssl/ssl/tls_srp.c b/crypto/openssl/ssl/tls_srp.c new file mode 100644 index 0000000000..8512c4daf6 --- /dev/null +++ b/crypto/openssl/ssl/tls_srp.c @@ -0,0 +1,506 @@ +/* ssl/tls_srp.c */ +/* Written by Christophe Renou (christophe.renou@edelweb.fr) with + * the precious help of Peter Sylvester (peter.sylvester@edelweb.fr) + * for the EdelKey project and contributed to the OpenSSL project 2004. + */ +/* ==================================================================== + * Copyright (c) 2004-2011 The OpenSSL Project. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * + * 3. All advertising materials mentioning features or use of this + * software must display the following acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit. (http://www.OpenSSL.org/)" + * + * 4. The names "OpenSSL Toolkit" and "OpenSSL Project" must not be used to + * endorse or promote products derived from this software without + * prior written permission. For written permission, please contact + * licensing@OpenSSL.org. + * + * 5. Products derived from this software may not be called "OpenSSL" + * nor may "OpenSSL" appear in their names without prior written + * permission of the OpenSSL Project. + * + * 6. Redistributions of any form whatsoever must retain the following + * acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit (http://www.OpenSSL.org/)" + * + * THIS SOFTWARE IS PROVIDED BY THE OpenSSL PROJECT ``AS IS'' AND ANY + * EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR + * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE OpenSSL PROJECT OR + * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED + * OF THE POSSIBILITY OF SUCH DAMAGE. + * ==================================================================== + * + * This product includes cryptographic software written by Eric Young + * (eay@cryptsoft.com). This product includes software written by Tim + * Hudson (tjh@cryptsoft.com). + * + */ +#include "ssl_locl.h" +#ifndef OPENSSL_NO_SRP + +#include +#include +#include + +int SSL_CTX_SRP_CTX_free(struct ssl_ctx_st *ctx) + { + if (ctx == NULL) + return 0; + OPENSSL_free(ctx->srp_ctx.login); + BN_free(ctx->srp_ctx.N); + BN_free(ctx->srp_ctx.g); + BN_free(ctx->srp_ctx.s); + BN_free(ctx->srp_ctx.B); + BN_free(ctx->srp_ctx.A); + BN_free(ctx->srp_ctx.a); + BN_free(ctx->srp_ctx.b); + BN_free(ctx->srp_ctx.v); + ctx->srp_ctx.TLS_ext_srp_username_callback = NULL; + ctx->srp_ctx.SRP_cb_arg = NULL; + ctx->srp_ctx.SRP_verify_param_callback = NULL; + ctx->srp_ctx.SRP_give_srp_client_pwd_callback = NULL; + ctx->srp_ctx.N = NULL; + ctx->srp_ctx.g = NULL; + ctx->srp_ctx.s = NULL; + ctx->srp_ctx.B = NULL; + ctx->srp_ctx.A = NULL; + ctx->srp_ctx.a = NULL; + ctx->srp_ctx.b = NULL; + ctx->srp_ctx.v = NULL; + ctx->srp_ctx.login = NULL; + ctx->srp_ctx.info = NULL; + ctx->srp_ctx.strength = SRP_MINIMAL_N; + ctx->srp_ctx.srp_Mask = 0; + return (1); + } + +int SSL_SRP_CTX_free(struct ssl_st *s) + { + if (s == NULL) + return 0; + OPENSSL_free(s->srp_ctx.login); + BN_free(s->srp_ctx.N); + BN_free(s->srp_ctx.g); + BN_free(s->srp_ctx.s); + BN_free(s->srp_ctx.B); + BN_free(s->srp_ctx.A); + BN_free(s->srp_ctx.a); + BN_free(s->srp_ctx.b); + BN_free(s->srp_ctx.v); + s->srp_ctx.TLS_ext_srp_username_callback = NULL; + s->srp_ctx.SRP_cb_arg = NULL; + s->srp_ctx.SRP_verify_param_callback = NULL; + s->srp_ctx.SRP_give_srp_client_pwd_callback = NULL; + s->srp_ctx.N = NULL; + s->srp_ctx.g = NULL; + s->srp_ctx.s = NULL; + s->srp_ctx.B = NULL; + s->srp_ctx.A = NULL; + s->srp_ctx.a = NULL; + s->srp_ctx.b = NULL; + s->srp_ctx.v = NULL; + s->srp_ctx.login = NULL; + s->srp_ctx.info = NULL; + s->srp_ctx.strength = SRP_MINIMAL_N; + s->srp_ctx.srp_Mask = 0; + return (1); + } + +int SSL_SRP_CTX_init(struct ssl_st *s) + { + SSL_CTX *ctx; + + if ((s == NULL) || ((ctx = s->ctx) == NULL)) + return 0; + s->srp_ctx.SRP_cb_arg = ctx->srp_ctx.SRP_cb_arg; + /* set client Hello login callback */ + s->srp_ctx.TLS_ext_srp_username_callback = ctx->srp_ctx.TLS_ext_srp_username_callback; + /* set SRP N/g param callback for verification */ + s->srp_ctx.SRP_verify_param_callback = ctx->srp_ctx.SRP_verify_param_callback; + /* set SRP client passwd callback */ + s->srp_ctx.SRP_give_srp_client_pwd_callback = ctx->srp_ctx.SRP_give_srp_client_pwd_callback; + + s->srp_ctx.N = NULL; + s->srp_ctx.g = NULL; + s->srp_ctx.s = NULL; + s->srp_ctx.B = NULL; + s->srp_ctx.A = NULL; + s->srp_ctx.a = NULL; + s->srp_ctx.b = NULL; + s->srp_ctx.v = NULL; + s->srp_ctx.login = NULL; + s->srp_ctx.info = ctx->srp_ctx.info; + s->srp_ctx.strength = ctx->srp_ctx.strength; + + if (((ctx->srp_ctx.N != NULL) && + ((s->srp_ctx.N = BN_dup(ctx->srp_ctx.N)) == NULL)) || + ((ctx->srp_ctx.g != NULL) && + ((s->srp_ctx.g = BN_dup(ctx->srp_ctx.g)) == NULL)) || + ((ctx->srp_ctx.s != NULL) && + ((s->srp_ctx.s = BN_dup(ctx->srp_ctx.s)) == NULL)) || + ((ctx->srp_ctx.B != NULL) && + ((s->srp_ctx.B = BN_dup(ctx->srp_ctx.B)) == NULL)) || + ((ctx->srp_ctx.A != NULL) && + ((s->srp_ctx.A = BN_dup(ctx->srp_ctx.A)) == NULL)) || + ((ctx->srp_ctx.a != NULL) && + ((s->srp_ctx.a = BN_dup(ctx->srp_ctx.a)) == NULL)) || + ((ctx->srp_ctx.v != NULL) && + ((s->srp_ctx.v = BN_dup(ctx->srp_ctx.v)) == NULL)) || + ((ctx->srp_ctx.b != NULL) && + ((s->srp_ctx.b = BN_dup(ctx->srp_ctx.b)) == NULL))) + { + SSLerr(SSL_F_SSL_SRP_CTX_INIT,ERR_R_BN_LIB); + goto err; + } + if ((ctx->srp_ctx.login != NULL) && + ((s->srp_ctx.login = BUF_strdup(ctx->srp_ctx.login)) == NULL)) + { + SSLerr(SSL_F_SSL_SRP_CTX_INIT,ERR_R_INTERNAL_ERROR); + goto err; + } + s->srp_ctx.srp_Mask = ctx->srp_ctx.srp_Mask; + + return (1); +err: + OPENSSL_free(s->srp_ctx.login); + BN_free(s->srp_ctx.N); + BN_free(s->srp_ctx.g); + BN_free(s->srp_ctx.s); + BN_free(s->srp_ctx.B); + BN_free(s->srp_ctx.A); + BN_free(s->srp_ctx.a); + BN_free(s->srp_ctx.b); + BN_free(s->srp_ctx.v); + return (0); + } + +int SSL_CTX_SRP_CTX_init(struct ssl_ctx_st *ctx) + { + if (ctx == NULL) + return 0; + + ctx->srp_ctx.SRP_cb_arg = NULL; + /* set client Hello login callback */ + ctx->srp_ctx.TLS_ext_srp_username_callback = NULL; + /* set SRP N/g param callback for verification */ + ctx->srp_ctx.SRP_verify_param_callback = NULL; + /* set SRP client passwd callback */ + ctx->srp_ctx.SRP_give_srp_client_pwd_callback = NULL; + + ctx->srp_ctx.N = NULL; + ctx->srp_ctx.g = NULL; + ctx->srp_ctx.s = NULL; + ctx->srp_ctx.B = NULL; + ctx->srp_ctx.A = NULL; + ctx->srp_ctx.a = NULL; + ctx->srp_ctx.b = NULL; + ctx->srp_ctx.v = NULL; + ctx->srp_ctx.login = NULL; + ctx->srp_ctx.srp_Mask = 0; + ctx->srp_ctx.info = NULL; + ctx->srp_ctx.strength = SRP_MINIMAL_N; + + return (1); + } + +/* server side */ +int SSL_srp_server_param_with_username(SSL *s, int *ad) + { + unsigned char b[SSL_MAX_MASTER_KEY_LENGTH]; + int al; + + *ad = SSL_AD_UNKNOWN_PSK_IDENTITY; + if ((s->srp_ctx.TLS_ext_srp_username_callback !=NULL) && + ((al = s->srp_ctx.TLS_ext_srp_username_callback(s, ad, s->srp_ctx.SRP_cb_arg))!=SSL_ERROR_NONE)) + return al; + + *ad = SSL_AD_INTERNAL_ERROR; + if ((s->srp_ctx.N == NULL) || + (s->srp_ctx.g == NULL) || + (s->srp_ctx.s == NULL) || + (s->srp_ctx.v == NULL)) + return SSL3_AL_FATAL; + + RAND_bytes(b, sizeof(b)); + s->srp_ctx.b = BN_bin2bn(b,sizeof(b),NULL); + OPENSSL_cleanse(b,sizeof(b)); + + /* Calculate: B = (kv + g^b) % N */ + + return ((s->srp_ctx.B = SRP_Calc_B(s->srp_ctx.b, s->srp_ctx.N, s->srp_ctx.g, s->srp_ctx.v)) != NULL)? + SSL_ERROR_NONE:SSL3_AL_FATAL; + } + +/* If the server just has the raw password, make up a verifier entry on the fly */ +int SSL_set_srp_server_param_pw(SSL *s, const char *user, const char *pass, const char *grp) + { + SRP_gN *GN = SRP_get_default_gN(grp); + if(GN == NULL) return -1; + s->srp_ctx.N = BN_dup(GN->N); + s->srp_ctx.g = BN_dup(GN->g); + if(s->srp_ctx.v != NULL) + { + BN_clear_free(s->srp_ctx.v); + s->srp_ctx.v = NULL; + } + if(s->srp_ctx.s != NULL) + { + BN_clear_free(s->srp_ctx.s); + s->srp_ctx.s = NULL; + } + if(!SRP_create_verifier_BN(user, pass, &s->srp_ctx.s, &s->srp_ctx.v, GN->N, GN->g)) return -1; + + return 1; + } + +int SSL_set_srp_server_param(SSL *s, const BIGNUM *N, const BIGNUM *g, + BIGNUM *sa, BIGNUM *v, char *info) + { + if (N!= NULL) + { + if (s->srp_ctx.N != NULL) + { + if (!BN_copy(s->srp_ctx.N,N)) + { + BN_free(s->srp_ctx.N); + s->srp_ctx.N = NULL; + } + } + else + s->srp_ctx.N = BN_dup(N); + } + if (g!= NULL) + { + if (s->srp_ctx.g != NULL) + { + if (!BN_copy(s->srp_ctx.g,g)) + { + BN_free(s->srp_ctx.g); + s->srp_ctx.g = NULL; + } + } + else + s->srp_ctx.g = BN_dup(g); + } + if (sa!= NULL) + { + if (s->srp_ctx.s != NULL) + { + if (!BN_copy(s->srp_ctx.s,sa)) + { + BN_free(s->srp_ctx.s); + s->srp_ctx.s = NULL; + } + } + else + s->srp_ctx.s = BN_dup(sa); + } + if (v!= NULL) + { + if (s->srp_ctx.v != NULL) + { + if (!BN_copy(s->srp_ctx.v,v)) + { + BN_free(s->srp_ctx.v); + s->srp_ctx.v = NULL; + } + } + else + s->srp_ctx.v = BN_dup(v); + } + s->srp_ctx.info = info; + + if (!(s->srp_ctx.N) || + !(s->srp_ctx.g) || + !(s->srp_ctx.s) || + !(s->srp_ctx.v)) + return -1; + + return 1; + } + +int SRP_generate_server_master_secret(SSL *s,unsigned char *master_key) + { + BIGNUM *K = NULL, *u = NULL; + int ret = -1, tmp_len; + unsigned char *tmp = NULL; + + if (!SRP_Verify_A_mod_N(s->srp_ctx.A,s->srp_ctx.N)) + goto err; + if (!(u = SRP_Calc_u(s->srp_ctx.A,s->srp_ctx.B,s->srp_ctx.N))) + goto err; + if (!(K = SRP_Calc_server_key(s->srp_ctx.A, s->srp_ctx.v, u, s->srp_ctx.b, s->srp_ctx.N))) + goto err; + + tmp_len = BN_num_bytes(K); + if ((tmp = OPENSSL_malloc(tmp_len)) == NULL) + goto err; + BN_bn2bin(K, tmp); + ret = s->method->ssl3_enc->generate_master_secret(s,master_key,tmp,tmp_len); +err: + if (tmp) + { + OPENSSL_cleanse(tmp,tmp_len) ; + OPENSSL_free(tmp); + } + BN_clear_free(K); + BN_clear_free(u); + return ret; + } + +/* client side */ +int SRP_generate_client_master_secret(SSL *s,unsigned char *master_key) + { + BIGNUM *x = NULL, *u = NULL, *K = NULL; + int ret = -1, tmp_len; + char *passwd = NULL; + unsigned char *tmp = NULL; + + /* Checks if b % n == 0 + */ + if (SRP_Verify_B_mod_N(s->srp_ctx.B,s->srp_ctx.N)==0) goto err; + if (!(u = SRP_Calc_u(s->srp_ctx.A,s->srp_ctx.B,s->srp_ctx.N))) goto err; + if (s->srp_ctx.SRP_give_srp_client_pwd_callback == NULL) goto err; + if (!(passwd = s->srp_ctx.SRP_give_srp_client_pwd_callback(s, s->srp_ctx.SRP_cb_arg))) goto err; + if (!(x = SRP_Calc_x(s->srp_ctx.s,s->srp_ctx.login,passwd))) goto err; + if (!(K = SRP_Calc_client_key(s->srp_ctx.N, s->srp_ctx.B, s->srp_ctx.g, x, s->srp_ctx.a, u))) goto err; + + tmp_len = BN_num_bytes(K); + if ((tmp = OPENSSL_malloc(tmp_len)) == NULL) goto err; + BN_bn2bin(K, tmp); + ret = s->method->ssl3_enc->generate_master_secret(s,master_key,tmp,tmp_len); +err: + if (tmp) + { + OPENSSL_cleanse(tmp,tmp_len) ; + OPENSSL_free(tmp); + } + BN_clear_free(K); + BN_clear_free(x); + if (passwd) + { + OPENSSL_cleanse(passwd,strlen(passwd)) ; + OPENSSL_free(passwd); + } + BN_clear_free(u); + return ret; + } + +int SRP_Calc_A_param(SSL *s) + { + unsigned char rnd[SSL_MAX_MASTER_KEY_LENGTH]; + + if (BN_num_bits(s->srp_ctx.N) < s->srp_ctx.strength) + return -1; + + if (s->srp_ctx.SRP_verify_param_callback ==NULL && + !SRP_check_known_gN_param(s->srp_ctx.g,s->srp_ctx.N)) + return -1 ; + + RAND_bytes(rnd, sizeof(rnd)); + s->srp_ctx.a = BN_bin2bn(rnd, sizeof(rnd), s->srp_ctx.a); + OPENSSL_cleanse(rnd, sizeof(rnd)); + + if (!(s->srp_ctx.A = SRP_Calc_A(s->srp_ctx.a,s->srp_ctx.N,s->srp_ctx.g))) + return -1; + + /* We can have a callback to verify SRP param!! */ + if (s->srp_ctx.SRP_verify_param_callback !=NULL) + return s->srp_ctx.SRP_verify_param_callback(s,s->srp_ctx.SRP_cb_arg); + + return 1; + } + +BIGNUM *SSL_get_srp_g(SSL *s) + { + if (s->srp_ctx.g != NULL) + return s->srp_ctx.g; + return s->ctx->srp_ctx.g; + } + +BIGNUM *SSL_get_srp_N(SSL *s) + { + if (s->srp_ctx.N != NULL) + return s->srp_ctx.N; + return s->ctx->srp_ctx.N; + } + +char *SSL_get_srp_username(SSL *s) + { + if (s->srp_ctx.login != NULL) + return s->srp_ctx.login; + return s->ctx->srp_ctx.login; + } + +char *SSL_get_srp_userinfo(SSL *s) + { + if (s->srp_ctx.info != NULL) + return s->srp_ctx.info; + return s->ctx->srp_ctx.info; + } + +#define tls1_ctx_ctrl ssl3_ctx_ctrl +#define tls1_ctx_callback_ctrl ssl3_ctx_callback_ctrl + +int SSL_CTX_set_srp_username(SSL_CTX *ctx,char *name) + { + return tls1_ctx_ctrl(ctx,SSL_CTRL_SET_TLS_EXT_SRP_USERNAME,0,name); + } + +int SSL_CTX_set_srp_password(SSL_CTX *ctx,char *password) + { + return tls1_ctx_ctrl(ctx,SSL_CTRL_SET_TLS_EXT_SRP_PASSWORD,0,password); + } + +int SSL_CTX_set_srp_strength(SSL_CTX *ctx, int strength) + { + return tls1_ctx_ctrl(ctx, SSL_CTRL_SET_TLS_EXT_SRP_STRENGTH, strength, + NULL); + } + +int SSL_CTX_set_srp_verify_param_callback(SSL_CTX *ctx, int (*cb)(SSL *,void *)) + { + return tls1_ctx_callback_ctrl(ctx,SSL_CTRL_SET_SRP_VERIFY_PARAM_CB, + (void (*)(void))cb); + } + +int SSL_CTX_set_srp_cb_arg(SSL_CTX *ctx, void *arg) + { + return tls1_ctx_ctrl(ctx,SSL_CTRL_SET_SRP_ARG,0,arg); + } + +int SSL_CTX_set_srp_username_callback(SSL_CTX *ctx, + int (*cb)(SSL *,int *,void *)) + { + return tls1_ctx_callback_ctrl(ctx,SSL_CTRL_SET_TLS_EXT_SRP_USERNAME_CB, + (void (*)(void))cb); + } + +int SSL_CTX_set_srp_client_pwd_callback(SSL_CTX *ctx, char *(*cb)(SSL *,void *)) + { + return tls1_ctx_callback_ctrl(ctx,SSL_CTRL_SET_SRP_GIVE_CLIENT_PWD_CB, + (void (*)(void))cb); + } + +#endif