I have a Spring Cloud Task that loads data from SQL Server into Cassandra, and it will run on Spring Cloud Data Flow.

One of the requirements of Spring Cloud Task is a relational database to persist metadata such as task execution state. I don't want to use either of the databases above for that; I need to specify a third database for persistence. However, Spring Cloud Task automatically picks up the SQL Server data source properties from application.properties. How can I specify another database for task state persistence?

My current properties:

spring.datasource.url=jdbc:sqlserver://iphost;databaseName=dbname
spring.datasource.username=user
spring.datasource.password=password
spring.datasource.driverClassName=com.microsoft.sqlserver.jdbc.SQLServerDriver
spring.jpa.show-sql=false
#spring.jpa.hibernate.dialect=org.hibernate.dialect.SQLServer2012Dialect
spring.jpa.hibernate.naming.physical-strategy=org.hibernate.boot.model.naming.PhysicalNamingStrategyStandardImpl
spring.jpa.hibernate.ddl-auto=none

spring.data.cassandra.contact-points=ip
spring.data.cassandra.port=9042
spring.data.cassandra.username=username
spring.data.cassandra.password=password
spring.data.cassandra.keyspace-name=mykeyspace
spring.data.cassandra.schema-action=CREATE_IF_NOT_EXISTS

Update #1: I added the code below to point to a third database, as suggested by Michael Minella. Spring Cloud Task can now connect to this database and persist its state. However, my batch job's source queries now also go to this database. The only thing I changed was adding the data source for the task.

spring.task.datasource.url=jdbc:postgresql://host:5432/testdb?stringtype=unspecified
spring.task.datasource.username=user
spring.task.datasource.password=password
spring.task.datasource.driverClassName=org.postgresql.Driver

@Configuration
public class DataSourceConfigs {

    @Bean(name = "taskDataSource")
    @ConfigurationProperties(prefix="spring.task.datasource")
    public DataSource getDataSource() {
        return DataSourceBuilder.create().build();
    }   
}


@Configuration
public class DDTaskConfigurer extends DefaultTaskConfigurer {

    @Autowired
    public DDTaskConfigurer(@Qualifier("taskDataSource") DataSource dataSource) {
        super(dataSource);
    }
}

Update #2: My ItemReader and ScanRepository, as requested in the comments:

@Component
@StepScope
public class MyItemReader extends RepositoryItemReader<Scan> implements InitializingBean{

    @Autowired
    private ScanRepository repository;
    private Integer lastScanIdPulled = null;

    public MyItemReader(Integer _lastIdPulled) {
        super();        
        if(_lastIdPulled == null || _lastIdPulled <=0 ){
            lastScanIdPulled = 0;
        } else {
            lastScanIdPulled = _lastIdPulled;
        }
    }



    @PostConstruct
    protected void setUpRepo() {
        final Map<String, Sort.Direction> sorts = new HashMap<>();
        sorts.put("id", Direction.ASC);
        this.setRepository(this.repository);
        this.setSort(sorts);
        this.setMethodName("findByScanGreaterThanId"); 
        List<Object> methodArgs = new ArrayList<Object>();
        System.out.println("lastScanIdpulled >>> " + lastScanIdPulled);
        if(lastScanIdPulled == null || lastScanIdPulled <=0 ){
            lastScanIdPulled = 0;
        }
        methodArgs.add(lastScanIdPulled);
        this.setArguments(methodArgs);
    }


}



@Repository
public interface ScanRepository extends JpaRepository<Scan, Integer> {


    @Query("...")
    Page<Scan> findAllScan(final Pageable pageable);

    @Query("...")
    Page<Scan> findByScanGreaterThanId(int id, final Pageable pageable);

}

Update #3: If I add a data source configuration for the repository, I get the exception below. Before you mention that one of the data sources needs to be declared primary: I already tried that.

Caused by: java.lang.IllegalStateException: Expected one datasource and found 2
at org.springframework.cloud.task.batch.configuration.TaskBatchAutoConfiguration$TaskBatchExecutionListenerAutoconfiguration.taskBatchExecutionListener(TaskBatchAutoConfiguration.java:65) ~[spring-cloud-task-batch-1.0.3.RELEASE.jar:1.0.3.RELEASE]
at org.springframework.cloud.task.batch.configuration.TaskBatchAutoConfiguration$TaskBatchExecutionListenerAutoconfiguration$$EnhancerBySpringCGLIB$$baeae6b9.CGLIB$taskBatchExecutionListener$0(<generated>) ~[spring-cloud-task-batch-1.0.3.RELEASE.jar:1.0.3.RELEASE]
at org.springframework.cloud.task.batch.configuration.TaskBatchAutoConfiguration$TaskBatchExecutionListenerAutoconfiguration$$EnhancerBySpringCGLIB$$baeae6b9$$FastClassBySpringCGLIB$$5a898c9.invoke(<generated>) ~[spring-cloud-task-batch-1.0.3.RELEASE.jar:1.0.3.RELEASE]
at org.springframework.cglib.proxy.MethodProxy.invokeSuper(MethodProxy.java:228) ~[spring-core-4.3.14.RELEASE.jar:4.3.14.RELEASE]
at org.springframework.context.annotation.ConfigurationClassEnhancer$BeanMethodInterceptor.intercept(ConfigurationClassEnhancer.java:358) ~[spring-context-4.3.14.RELEASE.jar:4.3.14.RELEASE]
at org.springframework.cloud.task.batch.configuration.TaskBatchAutoConfigu


@Configuration
@EnableTransactionManagement
@EnableJpaRepositories(
  entityManagerFactoryRef = "myEntityManagerFactory",
  basePackages = { "com.company.dd.collector.tool" },
  // Must match the name of the PlatformTransactionManager bean declared below
  transactionManagerRef = "myTransactionManager"
)
public class ToolDbConfig {

      @Bean(name = "myEntityManagerFactory")
      public LocalContainerEntityManagerFactoryBean 
      myEntityManagerFactory(
        EntityManagerFactoryBuilder builder,
        @Qualifier("ToolDataSource") DataSource dataSource
      ) {
        return builder
          .dataSource(dataSource)
          .packages("com.company.dd.collector.tool")
          .persistenceUnit("tooldatasource")
          .build();
      }


      @Bean(name = "myTransactionManager")
      public PlatformTransactionManager transactionManager(
        @Qualifier("myEntityManagerFactory") EntityManagerFactory 
        entityManagerFactory
      ) {
        return new JpaTransactionManager(entityManagerFactory);
      }
}

@Configuration
public class DataSourceConfigs {

    @Bean(name = "taskDataSource")
    @ConfigurationProperties(prefix = "spring.task.datasource")
    public DataSource getDataSource() {
        return DataSourceBuilder.create().build();
    }

    @Primary
    @Bean(name = "ToolDataSource")
    @ConfigurationProperties(prefix = "tool.datasource")
    public DataSource dataSource() {
        return DataSourceBuilder.create().build();
    }
}
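
The ToolDataSource bean above binds properties under the tool.datasource prefix, which the listings so far do not show. A minimal sketch of what they might look like, assuming the SQL Server settings from spring.datasource.* are simply moved under the new prefix. The values are the same placeholders used earlier, and the key names are an assumption that mirrors the spring.task.datasource.* keys already shown; on Spring Boot 2.x with HikariCP, a DataSourceBuilder-built bean would expect jdbc-url instead of url.

```properties
# Hypothetical layout: source database (SQL Server) read by the JPA repositories
tool.datasource.url=jdbc:sqlserver://iphost;databaseName=dbname
tool.datasource.username=user
tool.datasource.password=password
tool.datasource.driverClassName=com.microsoft.sqlserver.jdbc.SQLServerDriver

# Task metadata database (PostgreSQL), as in Update #1
spring.task.datasource.url=jdbc:postgresql://host:5432/testdb?stringtype=unspecified
spring.task.datasource.username=user
spring.task.datasource.password=password
spring.task.datasource.driverClassName=org.postgresql.Driver
```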
indusBull

1 Answer

You need to create a TaskConfigurer to specify the DataSource to be used. You can read about this interface in the documentation here: https://docs.spring.io/spring-cloud-task/1.1.1.RELEASE/reference/htmlsingle/#features-task-configurer

The javadoc can be found here: https://docs.spring.io/spring-cloud-task/docs/current/apidocs/org/springframework/cloud/task/configuration/TaskConfigurer.html

UPDATE 1:
When using more than one DataSource, both Spring Batch and Spring Cloud Task follow the same paradigm in that they both have *Configurer interfaces that need to be used to specify what DataSource to use. For Spring Batch, you use the BatchConfigurer (typically by just extending the DefaultBatchConfigurer) and as noted above, the TaskConfigurer is used in Spring Cloud Task. This is because when there is more than one DataSource, the framework has no way of knowing which one to use.
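
To illustrate why an explicit choice is needed, here is a small plain-Java model of the selection rule. It is not Spring code: DataSourceChooser and choose are hypothetical names, the map of names to JDBC URLs stands in for the DataSource beans in the context, and the Optional argument stands in for what a TaskConfigurer or BatchConfigurer supplies. The error message is copied from the stack trace in the question; the framework's real check may differ in detail.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Plain-Java model (hypothetical, not Spring's implementation) of the *Configurer idea. */
final class DataSourceChooser {

    /**
     * Returns the name of the data source to use for metadata.
     * An explicit choice wins; without one, a single candidate is used,
     * and any other number of candidates is rejected as ambiguous.
     */
    static String choose(Map<String, String> candidates, Optional<String> explicitChoice) {
        if (explicitChoice.isPresent()) {
            String name = explicitChoice.get();
            if (!candidates.containsKey(name)) {
                throw new IllegalArgumentException("Unknown data source: " + name);
            }
            return name;
        }
        if (candidates.size() == 1) {
            return candidates.keySet().iterator().next();
        }
        throw new IllegalStateException(
                "Expected one datasource and found " + candidates.size());
    }

    public static void main(String[] args) {
        Map<String, String> candidates = new LinkedHashMap<>();
        candidates.put("ToolDataSource", "jdbc:sqlserver://iphost;databaseName=dbname");
        candidates.put("taskDataSource", "jdbc:postgresql://host:5432/testdb");

        // Explicit choice, as a configurer would provide: unambiguous.
        System.out.println(choose(candidates, Optional.of("taskDataSource")));

        // No explicit choice with two candidates: ambiguous, so it fails.
        try {
            choose(candidates, Optional.empty());
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In the real application the explicit choice corresponds to passing the qualified DataSource to the constructor of a DefaultTaskConfigurer subclass, as the question's DDTaskConfigurer does, and doing the equivalent for Spring Batch through a BatchConfigurer.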

Michael Minella
  • I added the custom TaskConfigurer as you said, but it's creating another problem. Please see my update. – indusBull Feb 08 '18 at 18:11
  • Pardon but I'm not following. I want to clarify that in my Spring Batch job, I have a step which reads data from DB1 and writes to DB2. All was good and working until I added above TaskConfiguration change to store task state in DB3. Now that batch job attempts to read from DB3 instead of DB1. No idea why. – indusBull Feb 08 '18 at 19:22
  • Please share your `ItemReader` configuration – Michael Minella Feb 08 '18 at 19:23
  • ...And your `ScanRepository` since that's what's using the `DataSource`... – Michael Minella Feb 08 '18 at 19:48
  • Sorry about that. Added. – indusBull Feb 08 '18 at 19:52
  • SO is going to kick this over to a chat soon, at which time you can also use the Gitter for Spring Cloud Task (https://gitter.im/spring-cloud/spring-cloud-task) or Spring Batch (https://gitter.im/spring-batch/Lobby). However, it looks like you're not explicitly configuring the `DataSource` for the `Repository` in question so you may have been getting lucky... – Michael Minella Feb 08 '18 at 19:56
  • I know but unfortunately gitter.im is blocked in office. And believe me, I tried configuring datasource for Repository which throws another exception. It has been frustrating so far. – indusBull Feb 08 '18 at 20:13
  • I added one last thing. Will appreciate if you direct me to some doc or example otherwise I will have to try something else. – indusBull Feb 08 '18 at 20:52