3

Use case:
Process a list of strings via a method that returns an ImmutableTable<R, C, V>. For instance: ImmutableTable<Integer, String, Boolean> process(String item) {...}

Collect the results, i.e. merge all of them and return a single ImmutableTable. Is there a way to achieve this?

Current implementation (as suggested by Bohemian):

How about using a parallel stream? Are there any concurrency issues in the code below? With a parallel stream I am getting "NullPointerException at index 1800" on tableBuilder.build(), but it works fine with a sequential stream.

ImmutableTable<Integer, String, Boolean> buildData() {
    // list of 4 AwsS3KeyName
    listToProcess.parallelStream()
            // Create new instance via Guice dependency injection
            .map(s3KeyName -> ProcessorInstanceProvider.get()
                    .fetchAndBuild(s3KeyName))
            .forEach(tableBuilder::putAll);

    return tableBuilder.build();
}

The code below works great with a sequential stream as well as a parallel stream, but ImmutableTable.Builder.build() fails when there is a duplicate entry for the same row and column. What would be the best way to prevent duplicates while merging tables?

public static <R, C, V> Collector<ImmutableTable<R, C, V>,
        ImmutableTable.Builder<R, C, V>, ImmutableTable<R, C, V>> toImmutableTable() {
    return Collector.of(
            ImmutableTable.Builder::new,
            ImmutableTable.Builder::putAll,
            (builder1, builder2) -> builder1.putAll(builder2.build()),
            ImmutableTable.Builder::build);
}

Edit: If there is any duplicate entry in ImmutableTable.Builder while merging different tables, the build fails.

I am trying to avoid the failure by putting the ImmutableTables into a HashBasedTable:

ImmutableTable.copyOf(itemListToProcess.parallelStream()
        .map(itemString -> ProcessorInstanceProvider.get()
                .buildImmutableTable(itemString))
        .collect(Collector.of(
                HashBasedTable::create,
                HashBasedTable::putAll,
                (a, b) -> {
                    a.putAll(b);
                    return a;
                })));

But I am getting a runtime exception: "Caused by: java.lang.IllegalAccessError: tried to access class com.google.common.collect.AbstractTable".

How can we use HashBasedTable as the accumulator to collect ImmutableTables and return the aggregated ImmutableTable? HashBasedTable overwrites an existing entry with the latest one and does not fail when a duplicate entry is put.

sidss

3 Answers

8

Since Guava 21 you can use the ImmutableTable.toImmutableTable collector.

public ImmutableTable<Integer, String, Boolean> processList(List<String> strings) {
    return strings.stream()
            .map(this::processText)
            .flatMap(table -> table.cellSet().stream())
            .collect(ImmutableTable.toImmutableTable(
                    Table.Cell::getRowKey,
                    Table.Cell::getColumnKey,
                    Table.Cell::getValue,
                    (b1, b2) -> b1 && b2 // merges duplicate cells; can be omitted if no (row, column) pair repeats
            ));
}

private ImmutableTable<Integer, String, Boolean> processText(String text) {
    return ImmutableTable.of(); // Whatever
}
Václav Kužel
3

This should work:

List<String> list; // given a list of String

ImmutableTable result = list.parallelStream()
    .map(processor::process) // converts String to ImmutableTable
    .collect(ImmutableTable.Builder::new, ImmutableTable.Builder::putAll,
        (a, b) -> a.putAll(b.build()))
    .build();

This reduction is threadsafe: collect() gives each worker thread its own builder and merges the partial results in the combiner.
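To make the difference from the forEach version in the question concrete, here is a minimal JDK-only sketch. It is an illustration under assumptions, not code from this thread: a plain ArrayList stands in for ImmutableTable.Builder (both are non-thread-safe accumulators), and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.IntStream;

public class CollectVsForEach {

    // Unsafe: every worker thread mutates the same ArrayList, which is not
    // thread-safe. Elements may be lost, slots may be left null, or an
    // exception may be thrown; the outcome varies from run to run.
    static List<Integer> viaSharedForEach(int n) {
        List<Integer> shared = new ArrayList<>();
        IntStream.range(0, n).parallel().boxed().forEach(shared::add);
        return shared;
    }

    // Safe: collect() creates a separate ArrayList per worker (supplier),
    // fills it (accumulator), and merges the partial lists (combiner).
    static List<Integer> viaCollect(int n) {
        return IntStream.range(0, n).parallel().boxed()
                .collect(ArrayList::new, ArrayList::add, ArrayList::addAll);
    }
}
```

Null slots left behind by a race like the one in viaSharedForEach would be consistent with the NullPointerException seen on tableBuilder.build(), although that cause is an inference and is not confirmed in the thread.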


Or using HashBasedTable as the intermediate data structure:

ImmutableTable result = ImmutableTable.copyOf(list.parallelStream()
    .map(processor::process) // converts String to ImmutableTable
    .collect(HashBasedTable::create, HashBasedTable::putAll, HashBasedTable::putAll));
Bohemian
  • How about using parallel stream ? Do you see any concurrency issues here? public ImmutableTable fetch() { listToProcess.parallelStream() // list of 4 AwsS3KeyName .map(s3KeyName -> ProcessorInstanceProvider.get() //Create new instance via Guice dependency injection .build(s3KeyName)) .forEach(tableBuilder::putAll); return tableBuilder.build(); } – sidss Aug 21 '16 at 23:46
  • The doc doesn't say `ImmutableTable` is threadsafe, but see altered code that is threadsafe (and also now only one line :) ) – Bohemian Aug 22 '16 at 09:27
  • Thank you so much for providing this solution. The build is failing due to duplicates; what would be the best way to prevent them? It seems I will have to use HashBasedTable. – sidss Aug 22 '16 at 23:21
  • @sidss duplicate what exactly? – Bohemian Aug 22 '16 at 23:26
  • When the same row-column entry exists in two different tables. When ImmutableTable.Builder.build() is called and there is any duplicate, it fails. So it seems that instead of ImmutableTable.Builder I will have to use HashBasedTable. – sidss Aug 23 '16 at 02:54
  • My approach: instead of using the ImmutableTable builder, I will call putAll on a HashBasedTable and then return ImmutableTable.copyOf(hashBasedTable). – sidss Aug 23 '16 at 03:25
3

You should be able to do this by creating an appropriate Collector, using the Collector.of static factory method:

ImmutableTable<R, C, V> table =
    list.stream()
        .map(processor::process)
        .collect(
            Collector.of(
                () -> new ImmutableTable.Builder<R, C, V>(),
                (builder, table1) -> builder.putAll(table1),
                (builder1, builder2) ->
                    new ImmutableTable.Builder<R, C, V>()
                        .putAll(builder1.build())
                        .putAll(builder2.build()),
                ImmutableTable.Builder::build));
Andy Turner
  • I think you could use method references for the Supplier (ImmutableTable.Builder::new) and the BiConsumer (ImmutableTable.Builder::putAll). – srborlongan Aug 17 '16 at 12:14
  • I'm not convinced you can: I tried `ImmutableTable.Builder::new`, and it couldn't infer the types. – Andy Turner Aug 17 '16 at 12:17
  • The combiner can be optimized by reusing one of the builders. e.g.: `builder1.putAll(builder2.build())` – mfulton26 Aug 17 '16 at 14:19
  • Also, you can use method references for the supplier and the accumulator if you wrap it all into a function. e.g.: `public static <R, C, V> Collector<ImmutableTable<R, C, V>, ImmutableTable.Builder<R, C, V>, ImmutableTable<R, C, V>> toImmutableTable() { return Collector.of(ImmutableTable.Builder::new, ImmutableTable.Builder::putAll, (builder1, builder2) -> builder1.putAll(builder2.build()), ImmutableTable.Builder::build); }` – mfulton26 Aug 17 '16 at 14:20
  • Works great with stream as well as parallel stream. But ImmutableTable.Builder.build() is failing due to a duplicate entry for row and column. It seems I will have to use HashBasedTable to prevent duplicates. Is there any other way that avoids ImmutableTable.copyOf(hashBasedTable)? – sidss Aug 22 '16 at 05:39
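
The duplicate-tolerant merge asked about in the comments can be sketched without Guava. This is only an illustration of the collector shape (supplier, overwriting accumulator, combiner), under assumptions: a nested Map<R, Map<C, V>> stands in for a mutable table such as HashBasedTable, and the class and method names are hypothetical. The combiner uses a lambda that returns the left container, which is what Collector.of expects.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;

public class LastWriteWinsMerge {

    // Collector that merges nested maps (row -> column -> value).
    // A later cell overwrites an earlier one at the same (row, column).
    static <R, C, V> Collector<Map<R, Map<C, V>>, Map<R, Map<C, V>>, Map<R, Map<C, V>>>
            mergingTables() {
        return Collector.of(
                HashMap::new,
                LastWriteWinsMerge::putAll,
                (left, right) -> {
                    putAll(left, right);
                    return left;
                });
    }

    // Copies every cell of source into target, overwriting existing cells.
    static <R, C, V> void putAll(Map<R, Map<C, V>> target, Map<R, Map<C, V>> source) {
        source.forEach((row, columns) ->
                target.computeIfAbsent(row, r -> new HashMap<>()).putAll(columns));
    }

    // Merges all tables in encounter order, also when run in parallel.
    static <R, C, V> Map<R, Map<C, V>> mergeAll(List<Map<R, Map<C, V>>> tables) {
        return tables.parallelStream().collect(mergingTables());
    }
}
```

Because the collector is not declared UNORDERED or CONCURRENT, partial results are combined in encounter order, so which value wins for a duplicated cell does not depend on thread scheduling.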