The exported sha256_sse() function fails to properly hash blocks that are larger than 65536 bytes in size #20

Closed
SteelBlueVision opened this issue Aug 10, 2018 · 6 comments
@SteelBlueVision

SteelBlueVision commented Aug 10, 2018

Note that this is true of all of the exported sha256_*() functions, not just the SSE version; I am only using the SSE version to demonstrate. I was hoping to use these functions to efficiently hash large files on an Intel-based system, but if they only support hashing blocks of up to 64 KB, they are far less useful!

Here is a sample C++ program. You can try it on files that are <=65536 bytes in size and then on files that are >65536 bytes in size. It will fail to produce a correct hash for files that are >65536 bytes in size.
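Because the sample program takes the file size as a separate command-line argument, a typo in that argument can mean a different buffer gets hashed than intended. A minimal sketch of one way to rule that out, by reading the whole file in binary mode and taking the size from what was actually read. read_file is a hypothetical helper, not part of intel-ipsec-mb:

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not from the original report): returns the complete
// contents of `path`. Binary mode avoids any newline translation, so the
// vector's size is the exact number of bytes in the file.
std::vector<char> read_file(const std::string &path)
{
    std::ifstream in(path, std::ios::binary);
    if (!in)
        throw std::runtime_error("cannot open " + path);

    // Read until end-of-file; no size needs to be known in advance.
    return std::vector<char>(std::istreambuf_iterator<char>(in),
                             std::istreambuf_iterator<char>());
}
```

With this, the hashing call would become sha256_sse(buf.data(), buf.size(), digest), where buf is the returned vector.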

Examples:

Generate data files:

~/dev/sha256$ # Generate two files containing the character '0' repeated 65536 times
~/dev/sha256$ # and 65537 times, respectively
~/dev/sha256$ printf '0%.0s' {1..65536} >test-file.65536-bytes
~/dev/sha256$ printf '0%.0s' {1..65537} >test-file.65537-bytes
~/dev/sha256$ ls -l test-file*
-rw-rw-r-- 1 michael michael 65536 Aug 10 16:21 test-file.65536-bytes
-rw-rw-r-- 1 michael michael 65537 Aug 10 16:21 test-file.65537-bytes
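
The same test files can also be produced without relying on shell brace expansion. A minimal sketch, assuming a hypothetical helper named write_repeated (not part of the original report), that writes a given character a given number of times:

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: writes `count` copies of `ch` to `path` in binary
// mode, replacing any existing file. Returns true if every write succeeded.
bool write_repeated(const std::string &path, char ch, std::size_t count)
{
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out)
        return false;

    // Write in fixed-size pieces so memory use does not grow with `count`.
    const std::string chunk(4096, ch);
    while (count > 0) {
        const std::size_t n = count < chunk.size() ? count : chunk.size();
        out.write(chunk.data(), static_cast<std::streamsize>(n));
        count -= n;
    }
    out.flush();
    return static_cast<bool>(out);
}
```

For example, write_repeated("test-file.65537-bytes", '0', 65537) would create the second test file.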

65536 byte example - SUCCESS

Output of application using sha256_sse() to compute the SHA256 hash:

~/dev/sha256$ ./sha256_test test-file.65536-bytes 65536
Processing test-file.65536-bytes of size: 65536

00000000: 34 dd 67 58 c2 90 8c f5 be 0e 41 11 73 f6 18 77 | 4.gX......A.s..w
00000010: f8 fa f2 dc dc fe d1 35 b5 a5 fa 90 57 74 47 bc | .......5....WtG.

Correct sha256sum output for comparison (matches above output at the hex level):

~/dev/sha256$ sha256sum test-file.65536-bytes
34dd6758c2908cf5be0e411173f61877f8faf2dcdcfed135b5a5fa90577447bc  test-file.65536-bytes
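
Comparing the hexdump against the sha256sum line by eye is error-prone. A minimal sketch of a formatter that renders a digest as a single lowercase hex string, the same form sha256sum prints. to_hex is a hypothetical helper, not part of intel-ipsec-mb:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: converts `len` bytes at `data` to a lowercase hex
// string, two characters per byte, with no separators.
std::string to_hex(const std::uint8_t *data, std::size_t len)
{
    static const char digits[] = "0123456789abcdef";
    std::string out;
    out.reserve(len * 2);
    for (std::size_t i = 0; i < len; ++i) {
        out.push_back(digits[data[i] >> 4]);   // high nibble
        out.push_back(digits[data[i] & 0x0f]); // low nibble
    }
    return out;
}
```

Printing to_hex(digest, sizeof(digest)) gives a string that can be compared directly with the first field of the sha256sum output.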

65537 byte example - FAILURE

Output of application using sha256_sse() to compute the SHA256 hash:

~/dev/sha256$ ./sha256_test test-file.65537-bytes 65537
Processing test-file.65537-bytes of size: 65537

00000000: 59 73 28 91 be bc c7 c6 74 d5 a0 e9 2e 0a 4e 04 | Ys(.....t.....N.
00000010: 08 f2 1d 9e 6c ce ca 1f ac e0 3c c1 ab 1c b6 7a | ....l.....<....z

Correct sha256sum output for comparison (does not match above output at the hex level):

~/dev/sha256$ sha256sum test-file.65537-bytes
5feceb66ffc86f38d952786c6d696c79c2dbc239dd4e91b46729d73a27fb57e9  test-file.65537-bytes

Source code:

// sha256_test.cpp

#include <cstdint>
#include <cstdio>   // FILE, fprintf, snprintf
#include <fstream>
#include <intel-ipsec-mb.h>
#include <iomanip>
#include <iostream>
#include <string>

void
hexdump(FILE *fp,
        const char *msg,
        const void *p,
        size_t len)
{
        unsigned int i, out, ofs;
        const unsigned char *data = (const unsigned char *) p;

        fprintf(fp, "%s\n", msg);

        ofs = 0;
        while (ofs < len) {
                char line[120];

                out = snprintf(line, sizeof(line), "%08x:", ofs);
                for (i = 0; ((ofs + i) < len) && (i < 16); i++)
                        out += snprintf(line + out, sizeof(line) - out,
                                        " %02x", (data[ofs + i] & 0xff));
                for (; i <= 16; i++)
                        out += snprintf(line + out, sizeof(line) - out, " | ");
                for (i = 0; (ofs < len) && (i < 16); i++, ofs++) {
                        unsigned char c = data[ofs];

                        if ((c < ' ') || (c > '~'))
                                c = '.';
                        out += snprintf(line + out,
                                        sizeof(line) - out, "%c", c);
                }
                fprintf(fp, "%s\n", line);
        }
}

int main(int argc, char *argv[])
{
    using namespace std;
    std::cout << std::fixed;

    if (argc!=3)
    {
        std::cerr << "Usage: file_to_be_hashed size_of_file_in_bytes\n";
        return 1;
    }
    ifstream infile(argv[1]);

    size_t BUF_SIZE=stoul(argv[2]);
    char *buf=new char[BUF_SIZE];

    std::clog << "Processing " << argv[1] << " of size: " << BUF_SIZE << '\n';
    if (!infile.read(buf,BUF_SIZE))
    {
        std::cerr << "Could not read input file\n";
        return 1;
    }

    infile.close();

    uint8_t digest[SHA256_DIGEST_SIZE_IN_BYTES]={0}; // 32-byte buffer

    sha256_sse(buf,BUF_SIZE,digest);
     
    hexdump(stdout,"",digest,sizeof(digest));

    delete [] buf;
}

@tkanteck
Contributor

Many thanks for raising the issue. These APIs are exported only for IPsec usage, when HMAC-SHAx is used for authentication and the key size is larger than the SHAx block size.

These APIs are not really optimized for performance, because these potential HMAC key reductions are not frequent operations in IPsec.

For the best SHA hash performance and handling of large files, I would recommend the ISA-L crypto library:
https://github.com/01org/isa-l_crypto

Having said that, we will look into this issue, as the exported API should work for larger data sizes.

Thanks,
Tomasz

@SteelBlueVision
Author

Hi Tomasz, I believe you mean isa-l_crypto, since isa-l only has CRC-type hashing. Unfortunately, I am having trouble getting the correct SHA256 hash out of that library as well. Please see the following issue report: intel/isa-l_crypto#14

@tkanteck
Contributor

tkanteck commented Aug 20, 2018

Correct, ISA-L crypto is the one that I pointed to.
I tried to reproduce the problem using the code snippet above (thanks!) and couldn't find any mismatch between the hashes produced by OpenSSL and ipsec-mb.
Could you try running it again and help identify the files that cause problems?


// sha256_test.cpp

#include <cstdint>
#include <cstdio>   // FILE, fprintf, snprintf
#include <fstream>
#include <intel-ipsec-mb.h>
#include <iomanip>
#include <iostream>
#include <string>

#define OPENSSL_NO_SHA1
#define OPENSSL_NO_SHA512
#include <openssl/sha.h>

static void
hexdump(FILE *fp,
        const char *msg,
        const void *p,
        size_t len)
{
        unsigned int i, out, ofs;
        const unsigned char *data = (const unsigned char *) p;

        fprintf(fp, "%s\n", msg);

        ofs = 0;
        while (ofs < len) {
                char line[120];

                out = snprintf(line, sizeof(line), "%08x:", ofs);
                for (i = 0; ((ofs + i) < len) && (i < 16); i++)
                        out += snprintf(line + out, sizeof(line) - out,
                                        " %02x", (data[ofs + i] & 0xff));
                for (; i <= 16; i++)
                        out += snprintf(line + out, sizeof(line) - out, " | ");
                for (i = 0; (ofs < len) && (i < 16); i++, ofs++) {
                        unsigned char c = data[ofs];

                        if ((c < ' ') || (c > '~'))
                                c = '.';
                        out += snprintf(line + out,
                                        sizeof(line) - out, "%c", c);
                }
                fprintf(fp, "%s\n", line);
        }
}

int main(int argc, char *argv[])
{
        using namespace std;
        std::cout << std::fixed;

        if (argc!=3)
        {
                std::cerr << "Usage: file_to_be_hashed size_of_file_in_bytes\n";
                return 1;
        }
        ifstream infile(argv[1]);

        size_t BUF_SIZE=stoul(argv[2]);
        char *buf=new char[BUF_SIZE];

        std::clog << "Processing " << argv[1] << " of size: " << BUF_SIZE << '\n';
        if (!infile.read(buf,BUF_SIZE))
        {
                std::cerr << "Could not read input file\n";
                return 1;
        }

        infile.close();

        uint8_t digest1[SHA256_DIGEST_SIZE_IN_BYTES]={0}; // 32-byte buffer

        sha256_sse(buf,BUF_SIZE,digest1);

        unsigned char digest2[SHA256_DIGEST_LENGTH];
        SHA256_CTX sha256;
        SHA256_Init(&sha256);
        SHA256_Update(&sha256, buf, BUF_SIZE);
        SHA256_Final(digest2, &sha256);
        
        hexdump(stdout,"IPSEC-MB: ",digest1,sizeof(digest1));
        hexdump(stdout,"OpenSSL: ",digest2,sizeof(digest2));

        delete [] buf;

        return 0;
}
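
The two digests can also be compared in code rather than by reading the hexdumps. A minimal sketch, assuming a hypothetical helper named first_mismatch that is not part of either library:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: returns the index of the first byte at which `a` and
// `b` differ, or `len` if the two buffers are identical over `len` bytes.
std::size_t first_mismatch(const std::uint8_t *a, const std::uint8_t *b,
                           std::size_t len)
{
    // std::mismatch returns iterators to the first differing pair.
    return static_cast<std::size_t>(std::mismatch(a, a + len, b).first - a);
}
```

In the program above, first_mismatch(digest1, digest2, sizeof(digest1)) == sizeof(digest1) would indicate that both libraries produced the same hash.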

@tkanteck
Contributor

tkanteck commented Aug 20, 2018

Example output:

>./a.out  ipsec_perf 75456
Processing ipsec_perf of size: 75456
IPSEC-MB: 
00000000: db 8f 66 f8 72 8e d5 c6 62 a4 e7 97 2e d3 9a 7b | ..f.r...b......{
00000010: 3c 79 d0 04 03 9c b7 b9 12 4a a5 af a3 07 d2 c1 | <y.......J......
OpenSSL: 
00000000: db 8f 66 f8 72 8e d5 c6 62 a4 e7 97 2e d3 9a 7b | ..f.r...b......{
00000010: 3c 79 d0 04 03 9c b7 b9 12 4a a5 af a3 07 d2 c1 | <y.......J......

@tkanteck
Contributor

Here is the output for the test files mentioned above as well. Maybe this is something specific to the sha256sum tool.

>./a.out test-file.65536-bytes 65536
Processing test-file.65536-bytes of size: 65536
IPSEC-MB: 
00000000: 34 dd 67 58 c2 90 8c f5 be 0e 41 11 73 f6 18 77 | 4.gX......A.s..w
00000010: f8 fa f2 dc dc fe d1 35 b5 a5 fa 90 57 74 47 bc | .......5....WtG.
OpenSSL: 
00000000: 34 dd 67 58 c2 90 8c f5 be 0e 41 11 73 f6 18 77 | 4.gX......A.s..w
00000010: f8 fa f2 dc dc fe d1 35 b5 a5 fa 90 57 74 47 bc | .......5....WtG.
>./a.out test-file.65537-bytes 65537
Processing test-file.65537-bytes of size: 65537
IPSEC-MB: 
00000000: 59 73 28 91 be bc c7 c6 74 d5 a0 e9 2e 0a 4e 04 | Ys(.....t.....N.
00000010: 08 f2 1d 9e 6c ce ca 1f ac e0 3c c1 ab 1c b6 7a | ....l.....<....z
OpenSSL: 
00000000: 59 73 28 91 be bc c7 c6 74 d5 a0 e9 2e 0a 4e 04 | Ys(.....t.....N.
00000010: 08 f2 1d 9e 6c ce ca 1f ac e0 3c c1 ab 1c b6 7a | ....l.....<....z

@tkanteck
Contributor

I am closing the issue, as the ipsec-mb library and OpenSSL produce the same SHA256 hash values for the same buffers.
Feel free to re-open it if new or additional evidence becomes available.
Thanks,
Tomasz
