I'm following an example to run an AWS Glue job from: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-python-samples-legislators.html
However, I'm currently facing an issue with the 3rd step:
persons = glueContext.create_dynamic_frame.from_catalog(
    database="legislators",
    table_name="persons_json")
print "Count: ", persons.count()
persons.printSchema()
Every time I run this command I get the following output.
Count: 0
root
After running the Glue crawler, I managed to generate the Glue metadata tables as described in Step 1. I am sure the code is finding the Glue metadata table, because if I change the database parameter to something else, like "legislator1", it fails immediately. That also confirms the metadata tables exist.
The above Python code was run on an AWS SageMaker notebook created under the AWS Glue service.
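One way to double-check what the crawler actually registered is to read the table entry back from the Data Catalog and look at its S3 location and columns. The following is only a minimal sketch, assuming boto3 is available in the notebook and the role may call `glue:GetTable`. The helper names `summarize_table` and `fetch_table_summary` are made up for illustration, and the `classification` and `recordCount` parameters are assumed to be present because crawlers usually write them; they may be missing on other tables.

```python
def summarize_table(response):
    """Extract the fields relevant to an empty DynamicFrame from a Glue get_table response."""
    table = response["Table"]
    storage = table.get("StorageDescriptor", {})
    params = table.get("Parameters", {})
    return {
        # S3 path the table points at; the job reads its data from here.
        "location": storage.get("Location"),
        # Column names the crawler inferred; an empty list matches a bare "root" schema.
        "columns": [col["Name"] for col in storage.get("Columns", [])],
        # Assumed crawler-written parameters; None if the table does not have them.
        "classification": params.get("classification"),
        "record_count": params.get("recordCount"),
    }


def fetch_table_summary(database="legislators", table_name="persons_json"):
    """Hypothetical convenience wrapper around the Glue GetTable API."""
    # Imported here so summarize_table can be used without AWS access.
    import boto3

    glue = boto3.client("glue")
    return summarize_table(glue.get_table(DatabaseName=database, Name=table_name))
```

Calling `fetch_table_summary()` in the notebook and comparing the returned location with the bucket used in Edit 1 below would show whether the catalog entry and the data source line up.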
Edit 1: I tried to get the JSON file directly from the public S3 bucket.
import boto3
import botocore
BUCKET_NAME = 'awsglue-datasets'
KEY = 'examples/us-legislators/all/areas.json'
s3 = boto3.resource('s3')
try:
    obj = s3.Object(BUCKET_NAME, KEY)
    body = obj.get()['Body'].read()
    print(body)
except botocore.exceptions.ClientError as e:
    if e.response['Error']['Code'] == "404":
        print("The object does not exist.")
    else:
        raise
But I get
botocore.exceptions.ClientError: An error occurred (AccessDenied) when calling the GetObject operation: Access Denied
which is weird, because it's supposed to be a public bucket. I also tried another way to get the object (https://stackoverflow.com/a/56060184/13126175), but that gives
ImportError: The s3fs library is required to handle s3 files
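The `except` branch in Edit 1 only special-cases the code "404" and re-raises everything else, which is why `AccessDenied` surfaces as a traceback. A minimal sketch of a slightly more descriptive handler follows. The function name `explain_s3_error` and the hint texts are illustrative only; the mapping assumes the usual S3 behaviour where `GetObject` reports a missing key as `NoSuchKey`, or as a 403 when the caller lacks `s3:ListBucket` on the bucket.

```python
def explain_s3_error(error_response):
    """Turn the response dict of a botocore ClientError into a short hint."""
    code = error_response.get("Error", {}).get("Code", "")
    if code in ("404", "NoSuchKey"):
        return "The object does not exist."
    if code in ("403", "AccessDenied"):
        # A 403 can mean the request itself is not allowed, or that the key
        # is missing and the caller is not allowed to list the bucket.
        return "Access denied: check the caller's credentials, policies and the key name."
    return "Unhandled S3 error code: %r" % code


# Usage inside the except branch from Edit 1:
#     except botocore.exceptions.ClientError as e:
#         print(explain_s3_error(e.response))
```

This does not explain why the notebook is denied, but it separates "missing object" from "not allowed" without a traceback.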
Edit 2: I took the same code from Edit 1 and ran it in a Glue job to get the JSON file. It actually worked. Here's a snippet from the CloudWatch logs:
627\u06c1\u0648\u0645\u0627"}, {"lang": "uz", "note": "multilingual", "name": "Oklaxoma"}, {"lang": "vi", "note": "multilingual", "name": "Oklahoma"}, {"lang": "vo", "note": "multilingual", "name": "Oklahoma"}, {"lang": "war", "note": "multilingual", "name": "Oklahoma"}, {"lang": "xal", "note": "multilingual", "name": "\u041e\u043a\u043b\u0430\u0445\u043e\u043c"}, {"lang": "yi", "note": "multilingual", "name": "\u05d0\u05e7\u05dc\u05e2\u05d4\u05d0\u05de\u05e2"}, {"lang": "yo", "note": "multilingual", "name": "Oklahoma"}, {"lang": "zh", "note": "multilingual", "name": "\u5967\u514b\u62c9\u8377\u99ac\u5dde"}, {"lang": "zh_hans", "note": "multilingual", "name": "\u5965\u514b\u62c9\u8377\u9a6c\u5dde"}, {"lang": "ay", "note": "multilingual", "name": "Oklahoma suyu"}, {"lang": "be_tarask", "note": "multilingual", "name": "\u0410\u043a\u043b\u0430\u0433\u043e\u043c\u0430"}, {"lang": "nb", "note": "multilingual", "name": "Oklahoma"}, {"lang": "pt_br", "note": "multilingual", "name": "Oklahoma"}, {"lang": "pa", "note": "multilingual", "name": "\u0a13\u0a15\u0a32\u0a3e\u0a39\u0a4b\u0a2e\u0a3e"}, {"lang": "am", "note": "multilingual", "name": "\u12a6\u12ad\u120b\u1206\u121b"}, {"lang": "ne", "note": "multilingual", "name": "\u0913\u0915\u094d\u0932\u093e\u0939\u094b\u092e\u093e"}, {"lang": "sgs", "note": "multilingual", "name": "Oklahoma"}, {"lang": "yue", "note": "multilingual", "name": "\u5967\u514b\u62c9\u4f55\u99ac\u5dde"}, {"lang": "nan", "note": "multilingual", "name": "Oklahoma"}, {"lang": "kab", "note": "multilingual", "name": "Oklahoma"}, {"lang": "mhr", "note": "multilingual", "name": "\u041e\u043a\u043b\u0430\u0445\u043e\u043c\u043e"}, {"lang": "ce", "note": "multilingual", "name": "\u041e\u043a\u043b\u0430\u0445\u043e\u043c\u0430"}, {"lang": "na", "note": "multilingual", "name": "Oklahoma"}, {"lang": "sc", "note": "multilingual", "name": "Oklahoma"}, {"lang": "te", "note": "multilingual", "name": "\u0c13\u0c15\u0c4d\u0c32\u0c39\u0c4b\u0c2e\u0c3e"}, {"lang": "sah", 
"note": "multilingual", "name": "\u041e\u043a\u043b\u0430h\u043e\u043c\u0430"}, {"lang": "mzn", "note": "multilingual", "name": "\u0627\u0648\u06a9\u0644\u0627\u0647\u0627\u0645\u0627"}, {"lang": "pi", "note": "multilingual", "name": "\u0913\u0915\u094d\u0932\u093e\u0939\u094b\u092e\u093e"}, {"lang": "cdo", "note": "multilingual", "name": "Oklahoma"}, {"lang": "bxr", "note": "multilingual", "name": "\u041e\u043a\u043b\u0430\u0445\u043e\u043c\u0430"}, {"lang": "azb", "note": "multilingual", "name": "\u0627\u0648\u06a9\u0644\u0627\u0647\u0645\u0627 \u0627\u06cc\u0627\u0644\u062a\u06cc"}, {"lang": "bho", "note": "multilingual", "name": "\u0913\u0915\u094d\u0932\u093e\u0939\u094b\u092e\u093e"}, {"lang": "xmf", "note": "multilingual", "name": "\u10dd\u10d9\u10da\u10d0\u10f0\u10dd\u10db\u10d0"}, {"lang": "de_ch", "note": "multilingual", "name": "Oklahoma"}, {"la
I don't think it's a permissions issue, because I already gave my notebook Admin access for testing purposes.