3

Interactive Glue Sessions in a Jupyter Notebook worked correctly with version 0.32 of the aws-glue-sessions package installed. After upgrading to version 0.35 with pip3 install --upgrade jupyter boto3 aws-glue-sessions, the kernel no longer starts. It fails with an error in GlueKernel.py, line 443, in set_glue_version: Exception: Valid Glue versions are {'3.0', '2.0'}

Reverting to version 0.32 resolves the issue. I tried installing 0.35, 0.34, and 0.33 and got the same error each time, which makes me think I'm doing something wrong or misunderstanding something, rather than hitting a problem in the product. Is there anything additional I need to do when upgrading aws-glue-sessions?

  • I have the same issue. Including the Glue version in the config file does not seem to resolve it either. – Ankit Sep 20 '22 at 19:07
  • Had the same problem and opened an issue for it on GitHub: https://github.com/awslabs/aws-glue-libs/issues/160 – DSC Dec 19 '22 at 21:37

5 Answers

2

Obviously this is not a good workaround, but it worked for me.

I'm on Windows. I went into the file GlueKernel.py in the directory \site-packages\aws_glue_interactive_sessions_kernel\glue_pyspark

and hard-coded the second line of this function to set the version to "3.0":

def set_glue_version(self, glue_version):
    # Workaround: ignore the requested value and always use Glue 3.0.
    glue_version = "3.0"
    if glue_version not in VALID_GLUE_VERSIONS:
        raise Exception(f"Valid Glue versions are {VALID_GLUE_VERSIONS}")
    self.glue_version = glue_version
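
A slightly less blunt variant of the same idea, as a sketch only: keep whatever value is passed in when it is valid, and fall back to a default otherwise. The helper name resolve_glue_version and the constant DEFAULT_GLUE_VERSION are hypothetical and not part of the package; the set of valid versions is assumed from the error message in the question.

```python
# Assumption: the valid set is taken from the error message in the question.
VALID_GLUE_VERSIONS = {"2.0", "3.0"}
# Assumption: "3.0" as the fallback mirrors the hard-coded value in the workaround.
DEFAULT_GLUE_VERSION = "3.0"


def resolve_glue_version(requested, default=DEFAULT_GLUE_VERSION):
    """Return `requested` if it is a valid Glue version, otherwise `default`."""
    candidate = str(requested).strip() if requested is not None else ""
    if candidate in VALID_GLUE_VERSIONS:
        return candidate
    return default


print(resolve_glue_version("2.0"))            # 2.0
print(resolve_glue_version("$GLUE_VERSION"))  # 3.0
print(resolve_glue_version(None))             # 3.0
```

Inside set_glue_version, the hard-coded line could then become glue_version = resolve_glue_version(glue_version), so an explicitly requested valid version is still honored. This is still an edit to an installed package file and would be lost on the next upgrade.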
sqlpro
  • 21
  • 3
1

I am a bit lost here as well, and confused. I will add that I am a Python newbie. I am running the whole thing on Windows. AWS has an article that describes the installation, so I am assuming it's supported. I get the same error as @the0ther0ne: line 443 in set_glue_version Exception: Valid Glue versions are {'3.0', '2.0'}

I checked GlueKernel.py of glue_pyspark, and found this code:

def _retrieve_os_env_variable(self, key):
  _, output = subprocess.getstatusoutput(f"echo ${key}")
  return output or os.environ.get(key)

When I run the code below manually, I get $GLUE_VERSION as the final result. That obviously doesn't match '2.0' or '3.0'. The syntax for reading environment variables on Windows is different (cmd.exe expects %GLUE_VERSION%, not $GLUE_VERSION), so echo just prints the literal text back. Because that literal is a non-empty string, the "output or os.environ.get(key)" expression never falls through to os.environ. If my understanding is correct, then this whole thing will never work on Windows. Maybe I am the only one who wants to run it on Windows and no one else cares? I got it to work on WSL, but still, I lost quite some time trying to fix something that cannot be fixed (or can it?).

import subprocess
import os

# Same lookup the kernel performs, run by hand with GLUE_VERSION set to 3.0.
_, output = subprocess.getstatusoutput("echo $GLUE_VERSION")
osoutput = os.environ.get("GLUE_VERSION")
print(output)              # on Windows: $GLUE_VERSION
print(osoutput)            # 3.0
print(output or osoutput)  # on Windows: $GLUE_VERSION
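
For what it's worth, here is a minimal sketch of how such a lookup could be made to behave the same on Windows and Linux. It is not the package's code: the function name retrieve_env_variable and the optional environ parameter are my own. It checks the process environment first and treats an unexpanded "$NAME" echoed back by the shell as "not set".

```python
import os
import subprocess


def retrieve_env_variable(key, environ=None):
    """Look up `key`, preferring the process environment over a shell echo."""
    environ = os.environ if environ is None else environ
    value = environ.get(key)
    if value:
        return value
    # Fallback that mirrors the shell-based lookup shown above.
    _, output = subprocess.getstatusoutput(f"echo ${key}")
    output = output.strip()
    # cmd.exe does not expand "$NAME" and echoes it back unchanged,
    # so an unexpanded placeholder is treated as "not set".
    if not output or output == f"${key}":
        return None
    return output


print(retrieve_env_variable("GLUE_VERSION"))
```

With this ordering, a GLUE_VERSION that is set in the environment is returned directly, and the literal "$GLUE_VERSION" can no longer reach the version check.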
atlan
  • 176
  • 1
  • 13
  • 1
    Digging into the code, the "$" is being used in many places, and this syntax is invalid on Windows. So there is indeed no way this will work on Windows as-is. I wonder where else this package is broken for Windows users. – Ahsin Shabbir Dec 20 '22 at 17:30
0

So the issue seems to be that GLUE_VERSION is not set in the environment variables. Once it is set, it works.
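
One way to make the variable available to the kernel regardless of which shell starts Jupyter is the optional "env" mapping that Jupyter supports in a kernel's kernel.json. The sketch below only shows the shape of that change on an in-memory dict; the function name with_kernel_env and the argv and display_name values are placeholders, not the real glue_pyspark kernel spec. Note that this only helps where the kernel's lookup actually honors the environment; it does not help if the shell-based lookup returns the literal "$GLUE_VERSION".

```python
import copy
import json


def with_kernel_env(spec, name, value):
    """Return a copy of a kernel spec dict with one variable added to "env"."""
    updated = copy.deepcopy(spec)
    updated.setdefault("env", {})[name] = value
    return updated


# Placeholder spec: argv and display_name are made up for illustration.
example_spec = {
    "argv": ["python", "-m", "some_kernel_module", "-f", "{connection_file}"],
    "display_name": "Example kernel",
    "language": "python",
}

print(json.dumps(with_kernel_env(example_spec, "GLUE_VERSION", "3.0"), indent=1))
```

The real kernel.json can be located with jupyter kernelspec list, which prints the directory of each installed kernel.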

A Johnson
  • 1
  • 1
  • Thank you for the suggestion. I am running Jupyter from a Git Bash Prompt on a Windows workstation. I tried running `EXPORT GLUE_VERSION='3.0'` before running `jupyter notebook` and also tried it without the quotes around 3.0. Neither seems to change the error I get when starting a glue_pyspark kernel. Is there a different way to set the environment variable and make it available to the kernel when it tries to start? – the0ther0ne Sep 26 '22 at 12:20
  • Same problem as @the0ther0ne unfortunately. Cannot get it to work with GLUE_VERSION="3.0" – atlan Oct 03 '22 at 15:31
  • @atlan, I am facing the same issue. Please let me know if you managed to get it to work. Thanks – Trini Dec 06 '22 at 09:20
  • @Trini, I moved to using WSL and ultimately ended up avoiding the service wherever I can. Coming from a "normal" development background, this whole setup is a bit too messy for me to enjoy. If you are looking for a solution, check out sqlpro's answer below. – atlan Dec 07 '22 at 10:15
0

This is a known bug. From the command line, run:

pip install aws-glue-sessions==0.32

DaveP
  • 259
  • 4
  • 16
0

I managed to get this running last year on my local Windows desktop, but it was very tricky. The easiest way to get this working on Windows is to install Windows Subsystem for Linux and run the notebooks inside it.

joobee
  • 1