I am reading data from a SQLite table via JDBC. The table has the following columns:

[Id:Integer], [parentId:Integer], [Name:String], [Type:Integer], [Data:BLOB]

From the BLOB data I need to create a unique identifier, such that the same BLOB generates the same identifier every time. At the moment I create a byte array from the BLOB and call toString() on it. Does that guarantee uniqueness? And is it efficient in terms of CPU cycles? I have a lot of such records to process. Please suggest. My code follows.

public static void scanData(String dbName) {
    String url = "jdbc:sqlite:C:/dbfolder/" + dbName;
    try (Connection con = DriverManager.getConnection(url);
            Statement st = con.createStatement();
            ResultSet rs = st.executeQuery("select * from someTable");) {

        while (rs.next()) {
            Integer type = rs.getInt("Type");
            if (type != null && type.equals(3)) {
                Integer rowId = rs.getInt("Id");
                Integer parentId = rs.getInt("ParentID");
                String name = rs.getString("Name");
                System.out.println("rowId = " + rowId);
                System.out.println("parentId = " + parentId);
                System.out.println("name = " + name);
                System.out.println("type = " + type);

                InputStream is = rs.getBinaryStream("Data");
                if (is != null) {
                    byte[] arr = IOUtils.toByteArray(is);
                    if (arr != null) {
                        System.out.println("Data = " + arr.toString());
                    }
                }
                System.out.println("---------------------------");
            }

        }
    } catch (SQLException e) {
        System.out.println(e.getMessage());
    } catch (IOException ioe) {
        System.out.println(ioe.getMessage());
    }
}

** IOUtils here is org.apache.commons.io.IOUtils

Arpan Das
  • If you need a unique string, you need to look at hashing and salting. What you have is probably going to make collisions possible, but I didn't look hard because it looks like you need to use hashing. SQLite might even have it built in. –  May 04 '18 at 12:29
  • A comment below mentions GUIDs, which may be more appropriate in this case. See https://stackoverflow.com/questions/2982748/create-a-guid-in-java –  May 04 '18 at 16:58

1 Answer

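To generate a content-based identifier you can use org.apache.commons.codec.digest.DigestUtils to compute a SHA-256 digest: identical BLOB content always yields the same identifier, and different content yields a different one with overwhelming probability.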

DigestUtils.sha256Hex(new FileInputStream(file));

To hash a Blob instead of a file, obtain an InputStream from it:

blob.getBinaryStream();
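
If adding commons-codec is not an option, the same idea can be sketched with the JDK's own java.security.MessageDigest. This is only a minimal sketch: the class name BlobIdSketch and the method name contentId are hypothetical, not part of the question's code. It also contrasts the digest with calling toString() on a byte array, which does not depend on the array's content.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class; the names are illustrative only.
class BlobIdSketch {

    private static MessageDigest sha256() {
        try {
            return MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform implementation is required to support SHA-256.
            throw new IllegalStateException(e);
        }
    }

    private static String toHex(byte[] digest) {
        StringBuilder hex = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    /** Content-based identifier for bytes that are already in memory. */
    static String contentId(byte[] data) {
        return toHex(sha256().digest(data));
    }

    /** Same identifier, computed by streaming so the BLOB is never fully buffered. */
    static String contentId(InputStream in) throws IOException {
        MessageDigest md = sha256();
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            md.update(buf, 0, n);
        }
        return toHex(md.digest());
    }

    public static void main(String[] args) {
        byte[] a = "same bytes".getBytes(StandardCharsets.UTF_8);
        byte[] b = a.clone();
        // Array toString() is based on the object's identity, not its content,
        // so two arrays holding equal bytes normally print different strings.
        System.out.println(a.toString() + " vs " + b.toString());
        // The digest depends only on the bytes, so this prints true.
        System.out.println(contentId(a).equals(contentId(b)));
    }
}
```

In the loop from the question, the streaming overload could be called as contentId(rs.getBinaryStream("Data")), which avoids copying each BLOB into a byte array first. The cost of hashing grows linearly with the BLOB size.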
Brajesh Pant
  • Note that SHA-256 is a hashing algorithm, which by definition can output the same value for different BLOBs (hash collision). How likely this is to happen depends on the amount of data (and maybe the type of data). If you want a guaranteed unique value, you need a GUID. – Mick Mnemonic May 04 '18 at 16:20