
This is probably an easy fix, but I cannot get this code to run. I have been using AWS Secrets Manager with no issues in PyCharm 2020.2.3. The problems I am hitting with AWS Data Wrangler are listed below:

Read in the DataFrame

import pandas as pd

# 'source' is the path to the input CSV file
test_df = pd.read_csv(source, encoding='latin-1')

Check df data types

data_types_df = test_df.dtypes
print('Data type of each column of Dataframe:')
print(data_types_df)

Convert columns to correct data types

test_df['C'] = pd.to_datetime(test_df['C'])

test_df['E'] = pd.to_datetime(test_df['E'])

Check df data types

df_new = test_df.dtypes
print('Data type of each column of Dataframe:')
print(df_new)

I have tried both snippets below and I get the same error:

engine = wr.catalog.get_engine("aws-data-wrangler-redshift", region_name=region_name)

engine = wr.catalog.get_engine('redshift+psycopg2://' + Username + ":" + Password + ClusterURL)

Error:

botocore.exceptions.NoRegionError: You must specify a region.

Then I was going to try to write the pandas DataFrame to a custom table in Redshift using one of the two methods below:

path = f"s3://{bucket}/stage/"
iam_role = 'ARN'

Copy df to Redshift custom table

wr.db.copy_to_redshift(
    df=df_new,
    path=path,
    con=engine,
    schema="custom",
    table="test_df",
    mode="overwrite",
    iam_role=iam_role,
    primary_keys=["c"]
)

pandas df to Redshift

wr.pandas.to_redshift(
    dataframe=df_new,
    path=path,
    schema="custom",
    table="test_df",
    connection=con,
    iam_role="YOUR_ROLE_ARN",
    mode="overwrite",
    preserve_index=False
)

Any help would be much appreciated :)

1 Answer


Data Wrangler uses Boto3 under the hood, and Boto3 will look for the AWS_DEFAULT_REGION environment variable (falling back to the region in your AWS config file) when no region is passed explicitly. So you have two options:

Set this in your ~/.aws/config file:

[default]  
region=us-east-1
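
You can quickly confirm that Boto3 picks this up; a minimal check using the standard session API:

import boto3

# Resolves the region from the config file / environment; prints None if unset
print(boto3.Session().region_name)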

Or set it as an environment variable on your machine:

export AWS_DEFAULT_REGION=us-east-1
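
The variable can also be set from Python itself, as long as it happens before awswrangler makes any AWS calls; a minimal sketch (us-east-1 is a placeholder):

import os

# Must run before any Boto3 session/client is created without a region
os.environ["AWS_DEFAULT_REGION"] = "us-east-1"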

In PyCharm specifically, you can set environment variables in the run configuration (Run > Edit Configurations... > Environment variables).
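
Alternatively, since Data Wrangler rides on Boto3's default session, you can pin the region in code and skip environment variables entirely; a sketch reusing the Glue connection name from the question:

import boto3
import awswrangler as wr

# Pin the region on Boto3's default session (placeholder region)
boto3.setup_default_session(region_name="us-east-1")

# With a region resolved, the catalog lookup should no longer raise NoRegionError
engine = wr.catalog.get_engine("aws-data-wrangler-redshift")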

rodrigombs