
I am trying to write an AWS Glue job script in Java. In the AWS Glue console, the only options I can find are "Python, Spark". Does that mean we can't write the script in Java at all? If so, what is this API used for: aws-java-sdk-glue

I even found an example: https://stackoverflow.com/questions/48256281/how-to-read-aws-glue-data-catalog-table-schemas-programmatically

From the above, it seems we can write an AWS Glue script in Java too. Can anyone please confirm this?

EDIT: In Scala, we write: glueContext.getCatalogSource(database = "my_data_base", tableName = "my_table")

In Java, I found the class below, which has methods named withDatabaseName and withTableName: https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/glue/model/CatalogEntry.html

What, then, is the purpose of that class?

john

2 Answers


Java is not supported for the actual script definition of AWS Glue jobs.

The API you are referring to is the AWS SDK, which lets you create and manage AWS Glue resources: creating and running crawlers, viewing and managing the Glue Data Catalog, creating job definitions, and so on.

So you can manage resources in the Glue service with the AWS SDK for Java, similar to how you manage resources in EC2, S3, or RDS with the AWS SDK for Java.
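As a rough illustration of that management role, here is a minimal sketch that reads a table's column schema from the Glue Data Catalog with the AWS SDK for Java (v1, the aws-java-sdk-glue artifact). The class name GlueCatalogSketch and the helper tableRequest are made up for this example, the database and table names are the placeholders from the question, and the code assumes credentials and a region are available through the default provider chains. It runs as an ordinary Java program outside Glue; it is not a Glue job script.

```java
import com.amazonaws.services.glue.AWSGlue;
import com.amazonaws.services.glue.AWSGlueClientBuilder;
import com.amazonaws.services.glue.model.Column;
import com.amazonaws.services.glue.model.GetTableRequest;
import com.amazonaws.services.glue.model.Table;

public class GlueCatalogSketch {

    // Hypothetical helper: builds the request only, no network call happens here.
    public static GetTableRequest tableRequest(String database, String table) {
        return new GetTableRequest()
                .withDatabaseName(database)
                .withName(table);
    }

    public static void main(String[] args) {
        // Assumes credentials and region come from the default provider chains.
        AWSGlue glue = AWSGlueClientBuilder.defaultClient();

        // Placeholder names taken from the question; replace with real ones.
        Table table = glue.getTable(tableRequest("my_data_base", "my_table")).getTable();

        // Print the catalog metadata (column name and type), not the table's data.
        for (Column column : table.getStorageDescriptor().getColumns()) {
            System.out.println(column.getName() + ": " + column.getType());
        }
    }
}
```

Note that this only returns metadata stored in the catalog. Reading and transforming the underlying data is what the Python or Scala job script does.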

JD D
  • Thank you :) As I can accept only one answer, I am accepting above one, as he explained bit briefly.. nothing else.. Hope you dont mind... but I gave thumbsup for your answer.. – john Aug 21 '20 at 14:18
  • no problem, they did a better job explaining anyway :) – JD D Aug 21 '20 at 14:19
  • I just edited my query, can you pls answer it, if you are aware of.. – john Aug 21 '20 at 14:31
  • Can you pls answer this query too: https://stackoverflow.com/questions/63524905/how-data-retreived-from-metadata-created-tables-in-glue-script – john Aug 21 '20 at 14:37

The language option you see in the Glue console applies to the script you write to extract, transform, and load the actual data that needs to be processed. The source can be a database or an S3 bucket, and the destination can be anything, depending on your use case.

Normally you can create a Glue job or an S3 bucket from the AWS Management Console. When you don't want to do this manually, you need an SDK, which provides the API calls used to create AWS resources.

So the script inside a Glue job can be written only in Python or Scala, but the Glue job itself can be created with SDKs for many different languages:

Java - https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/glue/AWSGlueClient.html

Python - https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/glue.html

JavaScript - https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/Glue.html

Ruby - https://docs.aws.amazon.com/sdk-for-ruby/v3/api/Aws/Glue/Client.html

All of the above are SDKs used to define resources in AWS, whereas the repository linked below contains the actual code used inside a Glue job.

https://github.com/aws-samples/aws-glue-samples
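To make the split between "creating a job" and "the script inside a job" concrete, here is a minimal sketch using the AWS SDK for Java (v1). The Java code only registers a job definition and starts a run; the script it points to is still a Python (or Scala) file stored in S3. The class name GlueJobSketch, the helper jobDefinition, the job name, the role ARN, and the S3 path are all hypothetical placeholders, and the code assumes credentials and a region come from the default provider chains.

```java
import com.amazonaws.services.glue.AWSGlue;
import com.amazonaws.services.glue.AWSGlueClientBuilder;
import com.amazonaws.services.glue.model.CreateJobRequest;
import com.amazonaws.services.glue.model.JobCommand;
import com.amazonaws.services.glue.model.StartJobRunRequest;

public class GlueJobSketch {

    // Hypothetical helper: describes the job but does not call AWS.
    public static CreateJobRequest jobDefinition(String jobName, String roleArn, String scriptLocation) {
        return new CreateJobRequest()
                .withName(jobName)
                .withRole(roleArn)
                .withCommand(new JobCommand()
                        .withName("glueetl")                   // Spark ETL job type
                        .withScriptLocation(scriptLocation));  // the Python or Scala script in S3
    }

    public static void main(String[] args) {
        // Assumes credentials and region come from the default provider chains.
        AWSGlue glue = AWSGlueClientBuilder.defaultClient();

        // All three values below are placeholders, not real resources.
        glue.createJob(jobDefinition(
                "my_job",
                "arn:aws:iam::123456789012:role/MyGlueRole",
                "s3://my-bucket/scripts/my_job.py"));

        // Start a run of the job that was just defined and print its run ID.
        String runId = glue.startJobRun(new StartJobRunRequest().withJobName("my_job")).getJobRunId();
        System.out.println("Started job run: " + runId);
    }
}
```

The same calls exist in the other SDKs listed above (for example create_job and start_job_run in boto3), so the choice of SDK language is independent of the language of the job script.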

Prabhakar Reddy
  • Thanks Prabhakar. I just edited my query, can you pls answer it, if you are aware of.. – john Aug 21 '20 at 14:31
  • Can you pls answer this query too: https://stackoverflow.com/questions/63524905/how-data-retreived-from-metadata-created-tables-in-glue-script – john Aug 21 '20 at 14:37
  • Thanks for your reply for above link, May i know are you aware of my above edited query too above in my actual post? So does Java supports scripting for Job too? (based on CatalogEntry class) – john Aug 21 '20 at 16:52