As part of my project, I have a binary data file consisting of a large series of 32-bit integers that one of my classes reads in on initialization. In my C++ library, I read it in with the following initializer:

Evaluator::Evaluator() {
    m_HandNumbers.resize(32487834);
    ifstream inputReader;

    inputReader.open("/path/to/file/7CHands.dat", ios::binary);

    int inputValue;
    for (int x = 0; x < 32487834; ++x) {
        inputReader.read((char *) &inputValue, sizeof (inputValue));
        m_HandNumbers[x] = inputValue;
    }
    inputReader.close();
}

and in porting to Swift, I decided to read the entire file into one buffer (it's only about 130 MB) and then copy the bytes out of the buffer.

So, I've done the following:

public init() {
    var inputStream = NSInputStream(fileAtPath: "/path/to/file/7CHands.dat")!
    var inputBuffer = [UInt8](count: 32487834 * 4, repeatedValue: 0)
    inputStream.open()
    inputStream.read(&inputBuffer, maxLength: inputBuffer.count)
    inputStream.close()
}

and it works fine in that when I debug it, I can see inputBuffer contains the same array of bytes that my hex editor says it should. Now, I'd like to get that data out of there efficiently. I know it's stored little-endian, i.e. with the least significant byte first (the number 0x00011D4A is represented as '4A1D 0100' in the file). I'm tempted to just iterate through it manually and calculate the values by hand, but I'm wondering if there's a quick way to pass in an array of [Int32] and have it read those bytes in. I tried using NSData, such as with:

    let data = NSData(bytes: handNumbers, length: handNumbers.count * sizeof(Int32))
    data.getBytes(&inputBuffer, length: inputBuffer.count)

but that didn't seem to load the values (all the values were still zero). Can anyone please help me convert this byte array into Int32 values? Better yet would be to convert them to Int (i.e. 64-bit integers) just to keep my variable sizes the same across the project.
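
The by-hand approach mentioned above can be sketched as follows. This uses current Swift syntax rather than the Swift 1.x syntax in the question, and `decodeLittleEndianInt32` is a hypothetical helper name, not something from the original code. It assumes the buffer holds little-endian 32-bit values and ignores any trailing bytes that do not form a full word.

```swift
// Hypothetical helper: decode little-endian bytes into Int32 values by hand.
func decodeLittleEndianInt32(_ bytes: [UInt8]) -> [Int32] {
    let count = bytes.count / 4   // trailing partial words are ignored
    var values = [Int32]()
    values.reserveCapacity(count)
    for i in 0..<count {
        let base = i * 4
        // The least significant byte comes first in the buffer.
        let b0 = UInt32(bytes[base])
        let b1 = UInt32(bytes[base + 1]) << 8
        let b2 = UInt32(bytes[base + 2]) << 16
        let b3 = UInt32(bytes[base + 3]) << 24
        // Reinterpret the assembled bits as a signed value.
        values.append(Int32(bitPattern: b0 | b1 | b2 | b3))
    }
    return values
}

// The example from the question: bytes 4A 1D 01 00 decode to 0x00011D4A.
let sample: [UInt8] = [0x4A, 0x1D, 0x01, 0x00]
let decoded = decodeLittleEndianInt32(sample)   // [0x00011D4A]
```

This is simple and portable, but it touches every byte individually, so a bulk copy is likely to be faster for a file of this size.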

Charles

2 Answers

Not sure about your endianness, but I use the following function. The difference from your code is that it reads an NSRange of the data into a buffer of the actual required type, rather than passing a length in bytes. This routine reads one value at a time (it's for ESRI files whose contents vary field by field), but should be easily adaptable.

func getBigIntFromData(data: NSData, offset: Int) -> Int {
    var rng = NSRange(location: offset, length: 4)
    var i = [UInt32](count: 1, repeatedValue:0)

    data.getBytes(&i, range: rng)
    return Int(i[0].bigEndian) // for little-endian data, return Int(i[0].littleEndian) instead
}
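
For reference, a rough equivalent in current Swift, where `Data` replaces `NSData`. The name `readLittleEndianUInt32` and the bounds check are assumptions added for illustration. The byte order is fixed to little-endian, which is what the file in the question turned out to use.

```swift
import Foundation

// Hypothetical helper: read one little-endian 32-bit unsigned integer at a byte offset.
// Returns nil when fewer than 4 bytes are available at that offset.
func readLittleEndianUInt32(from data: Data, at offset: Int) -> Int? {
    guard offset >= 0, offset + 4 <= data.count else { return nil }
    // Data slices keep their parent's indices, so offset from startIndex.
    let start = data.startIndex + offset
    var raw: UInt32 = 0
    withUnsafeMutableBytes(of: &raw) { destination in
        destination.copyBytes(from: data[start..<(start + 4)])
    }
    // Interpret the copied bytes as little-endian regardless of host byte order.
    return Int(UInt32(littleEndian: raw))
}

let sampleData = Data([0x4A, 0x1D, 0x01, 0x00, 0xFF, 0xFF, 0xFF, 0xFF])
let first = readLittleEndianUInt32(from: sampleData, at: 0)       // 0x00011D4A
let second = readLittleEndianUInt32(from: sampleData, at: 4)      // 4294967295 (unsigned)
let outOfRange = readLittleEndianUInt32(from: sampleData, at: 6)  // nil
```

Copying into a local variable avoids any alignment requirement on the source bytes, at the cost of one small copy per call.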
Grimxn
  • I tried using that function and I get nothing but zeros. I don't think the data.getBytes statement is correctly loading the data from the input buffer. Your function may well work, but I can't tell at this point. Given that I know inputBuffer contains non-zeros, do you see any reason why my data.getBytes statement is not working? – Charles Mar 11 '15 at 17:16
  • `data.getBytes(&inputBuffer...)` reads *from* `data` and puts it *into* `inputBuffer`, not the other way round. You need to read the file into the `NSData`, then read it out to an array of integers of the correct size. – Grimxn Mar 11 '15 at 17:26
  • Based on this, I actually simplified things substantially by simply initializing the data with the path to the data file, namely: `let data = NSData(contentsOfFile: "/path/to/file/7CHands.dat")!` and it seems to work like a charm as far as loading the data, then a simple call to your function (in littleEndian, as it turns out) and everything is ship-shape. This DOES take a long time, however: even in a Release build, it takes 90s or so to load, whereas C++ was instantaneous. Is the difference between the two languages really that substantial, or am I missing a build setting somewhere? – Charles Mar 11 '15 at 17:44
  • Are you doing it one value at a time? Change my routine to return `[UInt32]` rather than `Int`, then pass in the count of values, modify `length:4` to `length:4*count` and `count:1` to `count:count`, then return `i` rather than `i[0]`... that should speed it up a great deal... – Grimxn Mar 11 '15 at 20:15
  • Yeah, that made a great difference. Instantaneous in Release mode and under 1s in Debug mode. I was even able to just read and store the entire array as '[Int32]' and cast it as necessary to an Int when doing lookup values. Thanks so much! – Charles Mar 11 '15 at 22:38
  • You should upload your code as your own answer, or edit my answer (if you can), to show the final answer... – Grimxn Mar 11 '15 at 22:41
  • Will do. Thanks for being patient, I'm still kind of new at the proper etiquette for using this resource. – Charles Mar 12 '15 at 00:23
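
The bulk version described in the comments above might look like this in current Swift. This is a sketch under assumptions: `readLittleEndianInt32Array` is a hypothetical name, `Data` stands in for `NSData`, and the function returns nil instead of crashing when the data is too short.

```swift
import Foundation

// Hypothetical helper: copy `count` little-endian 32-bit values out of `data` in one pass.
func readLittleEndianInt32Array(from data: Data, count: Int) -> [Int32]? {
    guard count >= 0 else { return nil }
    guard count > 0 else { return [] }
    let byteCount = count * MemoryLayout<Int32>.size
    guard data.count >= byteCount else { return nil }

    var values = [Int32](repeating: 0, count: count)
    // One bulk copy into the array's storage instead of one call per value.
    values.withUnsafeMutableBytes { destination in
        destination.copyBytes(from: data.prefix(byteCount))
    }
    // No-op on little-endian hosts; swaps bytes on big-endian ones.
    return values.map { Int32(littleEndian: $0) }
}

let bytes: [UInt8] = [0x4A, 0x1D, 0x01, 0x00, 0xFE, 0xFF, 0xFF, 0xFF]
let numbers = readLittleEndianInt32Array(from: Data(bytes), count: 2)
// numbers == [73034, -2]
```

The speed-up reported in the thread comes from replacing millions of small reads with a single copy. The final `map` is only there for portability and could be dropped when the host is known to be little-endian.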

Grimxn provided the backbone of the solution to my problem: he showed me how to read sections of the buffer into an array, and then a way to read the entire buffer in all at once. Rather than needlessly convert every item of the array to Int, I simply read the buffer into the array as Int32 and do the cast to Int in the function that accesses that array.

For now, since I don't have my utility class defined yet, I integrated Grimxn's code directly into my initializer. The class initializer now looks like this:

public class Evaluator {
    let HandNumberArraySize = 32487834

    var handNumbers: [Int32]

    public init() {
        let data = NSData(contentsOfFile: "/path/to/file/7CHands.dat")!
        var dataRange = NSRange(location: 0, length: HandNumberArraySize * 4)
        handNumbers = [Int32](count: HandNumberArraySize, repeatedValue: 0)
        data.getBytes(&handNumbers, range: dataRange)

        println("Evaluator loaded successfully")
    }

...

}

... and the function that references them is now:

public func cardVectorToHandNumber(#cards: [Int], numberToUse: Int) -> Int {
    var output: Int

    output = Int(handNumbers[53 + cards[0] + 1])

    for i in 1 ..< numberToUse {
        output = Int(handNumbers[output + cards[i] + 1])
    }

    return Int(handNumbers[output])
}

Thanks to Grimxn and thanks once again to StackOverflow for helping me in a very real way!

Charles
  • Great! Check out the Subscript protocol for your reference function. It's not necessary, but it might make things tidier... https://developer.apple.com/library/ios/documentation/Swift/Conceptual/Swift_Programming_Language/Subscripts.html – Grimxn Mar 12 '15 at 09:28
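
The subscript suggestion could look roughly like the sketch below, written in current Swift. `HandTable` and `handNumber` are hypothetical names, and the table contents here are synthetic values chosen only to exercise the lookup chain, not real hand data.

```swift
// Hypothetical wrapper: stores the table compactly as Int32 and exposes Int through a subscript.
struct HandTable {
    private let values: [Int32]

    init(values: [Int32]) {
        self.values = values
    }

    // Widen to Int at the point of lookup, as the answer's function does with Int(...).
    subscript(index: Int) -> Int {
        return Int(values[index])
    }

    // Same table walk as cardVectorToHandNumber, written against the subscript.
    // Assumes cards is non-empty and numberToUse is at least 1.
    func handNumber(cards: [Int], numberToUse: Int) -> Int {
        var output = self[53 + cards[0] + 1]
        for card in cards.prefix(numberToUse).dropFirst() {
            output = self[output + card + 1]
        }
        return self[output]
    }
}

// Synthetic table: three entries form a chain 54 -> 102 -> 150.
var synthetic = [Int32](repeating: 0, count: 200)
synthetic[54] = 100    // entry for the first card (53 + 0 + 1)
synthetic[102] = 150   // entry for the second card (100 + 1 + 1)
synthetic[150] = 7     // final value
let table = HandTable(values: synthetic)
let result = table.handNumber(cards: [0, 1], numberToUse: 2)   // 7
```

Keeping the cast inside the subscript means callers work with `Int` throughout while the stored array stays at half the size of an `[Int]` on 64-bit platforms.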