
A common approach for connecting to third-party systems from Spark is to pass the systems' credentials as arguments to the Spark script. However, this raises some security questions; see, for example, this question: Bluemix spark-submit -- How to secure credentials needed by my Scala jar.
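
To make the pattern concrete, here is a minimal sketch of a job that receives credentials as program arguments (the class name, argument layout, and service are hypothetical):

```scala
object IngestJob {
  def main(args: Array[String]): Unit = {
    // Invoked as, e.g.:
    //   spark-submit --class IngestJob app.jar <user> <password>
    // Anything passed this way becomes part of the process's
    // command line, and so is a candidate for exposure via a
    // process listing.
    val Array(user, password) = args
    // ... connect to the third-party system with user/password ...
  }
}
```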

Is it possible for a Spark job running on Bluemix to see a list of the other processes on the operating system? That is, can a job run the equivalent of `ps -awx` to inspect the processes running on the Spark cluster and the arguments that were passed to those processes? I'm guessing it was a design goal that this must not be possible, but it would be good to verify.
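
For reference, this is the kind of probe the question has in mind; a minimal, driver-side-only sketch that shells out to `ps` from inside a job:

```scala
import sys.process._

object PsProbe {
  def main(args: Array[String]): Unit = {
    // On an ordinary multiuser UNIX host this prints every
    // process's command line, including arguments supplied by
    // other users.
    val listing = Seq("ps", "-awx").!!
    println(listing)
  }
}
```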

Chris Snow

1 Answer


For the Bluemix Apache Spark service, each provisioned Spark service instance is a tenant, and each tenant is isolated from all other tenants. Spark jobs of a given tenant cannot access the files or memory of any other tenant. So even if you could ascertain, say, the ID of another tenant through a process list, you could not exploit that; and nothing truly private should be in any such argument anyway. A relevant analogy: `/etc/passwd` is world-readable, but knowledge of a user ID does not, in and of itself, open any doors. In other words, it is not security by obscurity; the actual resources are locked down.

Given all this, I understand that the service will further isolate tenants through containerization in the near future.

Randy Horman
  • Thanks @Randy. I was thinking more along the lines of a multiuser UNIX system, where users can run something like `ps -awx` and see the commands submitted by other users, along with the arguments to those commands. In that case, if the arguments are credentials for another service, they will be visible to other users. This problem exists with any application run on UNIX, and the solution is to pass things like credentials in environment variables or in files. – Chris Snow May 29 '16 at 01:59
  • Yes, if you are passing credentials to your Spark job via spark-submit, then you should provide your credentials in a file passed in with the program, or otherwise uploaded to your tenant's account beforehand (see the sketch below). – Randy Horman May 29 '16 at 14:33
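
Following up on the comments, here is a minimal sketch of the file-based approach (the file name and property keys are hypothetical). The credentials file is shipped with the application, e.g. via `spark-submit --files creds.properties`, so nothing sensitive appears on the command line:

```scala
import java.io.FileInputStream
import java.util.Properties

object SecureCredsJob {
  def main(args: Array[String]): Unit = {
    // creds.properties was distributed with the job rather than
    // passed as an argument, so it never shows up in `ps` output.
    val props = new Properties()
    val in = new FileInputStream("creds.properties")
    try props.load(in) finally in.close()

    val user     = props.getProperty("service.user")
    val password = props.getProperty("service.password")
    // ... authenticate to the third-party service with user/password ...
  }
}
```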