I'm reading multiple files from an S3 bucket using MultiResourceItemReader, and I get a ClassCastException before the myReader() method is executed. Something is wrong with how I set up the MultiResourceItemReader, but I'm not sure what.

Please find my code below:

    @Bean
    public MultiResourceItemReader<String> multiResourceReader()
    {
        String bucket = "mybucket";
        String key = "/myfiles";

        List<InputStream> resourceList = s3Client.getFiles(bucket, key);
        List<InputStreamResource> inputStreamResourceList = new ArrayList<>();
        for (InputStream s: resourceList) {
            inputStreamResourceList.add(new InputStreamResource(s));
        }

        Resource[] resources = inputStreamResourceList.toArray(new InputStreamResource[inputStreamResourceList.size()]);
        //InputStreamResource[] resources = inputStreamResourceList.toArray(new InputStreamResource[inputStreamResourceList.size()]);

        // I'm getting all the stream content - I verified my stream is not null
        for (int i = 0; i < resources.length; i++) {
            try {
                InputStream s  = resources[i].getInputStream();
                String result = IOUtils.toString(s, StandardCharsets.UTF_8);
                System.out.println(result);
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        MultiResourceItemReader<String> resourceItemReader = new MultiResourceItemReader<>();
        resourceItemReader.setResources(resources);
        resourceItemReader.setDelegate(myReader());

        resourceItemReader.setDelegate((ResourceAwareItemReaderItemStream<? extends String>) new CustomComparator());
        return resourceItemReader;
    }
        
        
          

Exception:

Caused by: java.lang.ClassCastException: class CustomComparator cannot be cast to class org.springframework.batch.item.file.ResourceAwareItemReaderItemStream (CustomComparator and org.springframework.batch.item.file.ResourceAwareItemReaderItemStream are in unnamed module of loader org.springframework.boot.loader.LaunchedURLClassLoader @cc285f4)
        at org.springframework.cglib.proxy.MethodProxy.invokeSuper(MethodProxy.java:244)
        at org.springframework.context.annotation.ConfigurationClassEnhancer$BeanMethodInterceptor.intercept(ConfigurationClassEnhancer.java:331)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:154)
        ... 65 common frames omitted

Can someone please help me resolve this issue? I appreciate your help in advance. Thanks.

Mike Marsh

1 Answer

The NullPointerException is caused by the default comparator that MultiResourceItemReader uses to sort the resources after loading them.

The default compare behavior calls the getFilename() method of the InputStreamResource.

See https://github.com/spring-projects/spring-batch/blob/115c3022147692155d45e23cdd5cef84895bf9f5/spring-batch-infrastructure/src/main/java/org/springframework/batch/item/file/MultiResourceItemReader.java#L82

But InputStreamResource simply inherits the getFilename() method from its parent AbstractResource, which returns null, so comparing the filenames throws the NullPointerException. See https://github.com/spring-projects/spring-framework/blob/316e84f04f3dbec3ea5ab8563cc920fb21f49749/spring-core/src/main/java/org/springframework/core/io/AbstractResource.java#L220
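To make the failure mode concrete, here is a minimal sketch in plain Java, with no Spring on the classpath. `NamedResource` is a hypothetical stand-in for Spring's `Resource`, reduced to `getFilename()`, and `BY_FILENAME` only mirrors the shape of the default comparator linked above; it is not the library's code.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical stand-in for Spring's Resource, reduced to the one method that matters here.
interface NamedResource {
    String getFilename();
}

class FilenameSortDemo {

    // Same shape as a filename-based comparator: calls compareTo on the first filename.
    static final Comparator<NamedResource> BY_FILENAME =
            (r1, r2) -> r1.getFilename().compareTo(r2.getFilename());

    // Returns true if sorting with BY_FILENAME throws a NullPointerException.
    static boolean sortFails(NamedResource... resources) {
        try {
            Arrays.sort(resources, BY_FILENAME);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        NamedResource named1 = () -> "a.csv";
        NamedResource named2 = () -> "b.csv";
        NamedResource stream1 = () -> null; // like InputStreamResource: no filename
        NamedResource stream2 = () -> null;

        System.out.println(sortFails(named2, named1));   // resources with filenames
        System.out.println(sortFails(stream1, stream2)); // resources with null filenames
    }
}
```

Any resource type whose getFilename() returns null would hit the same problem as soon as two of them are sorted by filename.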

The solution is to provide a custom comparator for the MultiResourceItemReader. Here is a simple example, assuming you do not want to sort the resources in a specific way before processing:

    import java.util.Comparator;
    import org.springframework.core.io.Resource;

    // MultiResourceItemReader.setComparator expects a Comparator<Resource>
    public class CustomComparator implements Comparator<Resource> {

        @Override
        public int compare(Resource r1, Resource r2) {
            // no meaningful order: compares hash codes only, never touches getFilename()
            return Integer.compare(r1.hashCode(), r2.hashCode());
        }
    }

    MultiResourceItemReader<String> resourceItemReader = new MultiResourceItemReader<>();
    resourceItemReader.setResources(resources);
    resourceItemReader.setDelegate(myReader());
    //UPDATED with correction - set custom Comparator
    resourceItemReader.setComparator(new CustomComparator());

See this answer for how a Comparator is used by the Spring Batch MultiResourceItemReader: File processing order with Spring Batch

Shankar
  • Thanks @Shankar P S. I added the CustomComparator class and added `resourceItemReader.setDelegate((ResourceAwareItemReaderItemStream<? extends String>) new CustomComparator());`, but I'm getting a ClassCastException: `Caused by: java.lang.ClassCastException: class CustomComparator cannot be cast to class org.springframework.batch.item.file.ResourceAwareItemReaderItemStream (CustomComparator and org.springframework.batch.item.file.ResourceAwareItemReaderItemStream`. Please help me fix this. I've updated my original post with the new code and exception. – Mike Marsh Jan 17 '22 at 15:47
  • @MikeMarsh I had a typo in my answer. It should be resourceItemReader.setComparator(). I have corrected it. Please try now. – Shankar Jan 17 '22 at 18:06
  • Hi @Shankar P S, I added the line `resourceItemReader.setComparator(new CustomComparator());` but I get the compilation error `Required type: Comparator<Resource>`. The CustomComparator class from the answer was declared as `Comparator<InputStream>`; if I change it to `Comparator<Resource>` the compilation error goes away, but I get this runtime exception: `IllegalStateException: InputStream has already been read - do not use InputStreamResource if a stream needs to be read multiple times`. How do I fix this? Any help would be much appreciated. Thanks! – Mike Marsh Jan 18 '22 at 02:18
  • This is because of your debug code, where you read the list of InputStreams to verify that they are valid. Remove the for loop you added after your comment "I verified my stream is not null". Spring Batch cannot read the streams again, because InputStreamResource treats a stream as already read after the first access. – Shankar Jan 18 '22 at 04:10
  • Thank you so much - it worked. I would like to clarify one more thing: if a directory contains 100 files, will MultiResourceItemReader read all of them and process them sequentially? If so, how can we configure it to read the first 50 files and then the remaining 50? If we read and process all 100 files together, I'm not sure whether we might run into an out-of-memory issue. Could you please suggest an approach? – Mike Marsh Jan 19 '22 at 03:37
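
On the "InputStream has already been read" error from the comments above: the underlying reason is that a plain InputStream can only be consumed once. The following is a minimal sketch in plain Java (Java 9+ for `readAllBytes`), where a `ByteArrayInputStream` stands in for an S3 object stream and the class name `ReadOnceDemo` is made up for illustration. Note that a bare stream just comes back empty on the second read, whereas InputStreamResource refuses the second access with the IllegalStateException quoted above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class ReadOnceDemo {

    // Drains whatever is left in the stream and returns it as a UTF-8 string.
    static String drain(InputStream in) {
        try {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for one S3 object stream (assumption for this sketch).
        InputStream s3Stream =
                new ByteArrayInputStream("id,name\n1,foo\n".getBytes(StandardCharsets.UTF_8));

        String firstRead = drain(s3Stream);  // a debug loop like the one in the question
        String secondRead = drain(s3Stream); // what a later reader would find

        System.out.println("first read length:  " + firstRead.length());
        System.out.println("second read length: " + secondRead.length());

        // If the content really must be inspected, buffer it once and
        // hand out a fresh stream over the buffered bytes for each read.
        byte[] buffered = firstRead.getBytes(StandardCharsets.UTF_8);
        System.out.println(drain(new ByteArrayInputStream(buffered)).equals(firstRead));
    }
}
```

Buffering like this holds each file's content in memory, so it only suits files that are small enough; for the setup in the question, simply removing the debug loop is the lighter fix.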