What is the proper way to share a program without sharing personal information?

Question

I am writing a Python program that polls various sources and tracks the metrics they output. This clearly has lots of personal credentials, including usernames, passwords and API keys. I want to be able to open source the program, but keep the credentials secret.

At the moment, I have simply got a file called config.py which contains all the sensitive credentials. I have copied this to a file called EXAMPLE_config.py and removed all the sensitive information. The first line is

# Add your information to the below then rename this file to config.py

I was planning to add config.py to .gitignore and commit the EXAMPLE_config.py file. While this works, it seems a bit inefficient: every time I add a new credential to config.py, I also need to add the variable name to EXAMPLE_config.py.

What is the best way to share this program via git, on GitHub for example, without sharing the sensitive information? I have seen Configuration files in Python, but it suggests many options. I have also seen Remove sensitive files and their commits from Git history, but I want to prevent the data from ever being shared in the first place. Is there an accepted Pythonic or general standard?

We keep config files in Git without sensitive information. When we deploy the app to Kubernetes, we update secret files, which are the same as the original config files but WITH the sensitive information. The solution depends on the deployment: Kubernetes and docker compose support secret files, and Jenkins supports injecting environment variables with sensitive information. — RandomB, Dec 04 '19 at 14:10
I typically have a config_local.py with all information, included in .gitignore. Then all relevant information is read by an interface module config.py, which _is_ committed to GitHub and throws a detailed error message including a config_local template for copy and paste if any of the relevant information is missing. — mcsoini, Dec 04 '19 at 14:10
In the script, I get these sensitive values from environment variables. The file where the variables and values are defined is kept out of the repository. If these values are not available, raise an error and print the reason. — ElpieKay, Dec 04 '19 at 14:15

wotanii · Answer 1 · 2019-12-05T18:21:55.437

One of the most common ways to deal with secrets in open source applications is to use environment variables and .env files.

Example: https://medium.com/codait/environment-variables-or-keeping-your-secrets-secret-in-a-node-js-app-99019dfff716

Something I like to do is to have a secret.py, which just reads all secrets and makes them available to the rest of the program as a Python module, like this:

import os
reddit_client_id = os.environ['reddit_client_id']
reddit_client_secret = os.environ['reddit_client_secret']
reddit_user = os.environ['reddit_user']
reddit_password = os.environ['reddit_password']
streamable_pass = os.environ['streamable_pass']
streamable_user = os.environ['streamable_user']

source

The actual secrets are then stored in a .env file, which is ignored via .gitignore and read with docker-compose (in this particular case).
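
A minimal sketch of the same idea that fails with one readable message when variables are missing, as suggested in the comments on the question. The helper name `load_secrets` and its `environ` parameter are assumptions made for illustration; the variable names are the ones from the snippet above.

```python
import os


def load_secrets(names, environ=None):
    """Return a dict of the requested variables, or raise listing all missing ones."""
    if environ is None:
        environ = os.environ  # default to the real process environment
    missing = [name for name in names if name not in environ]
    if missing:
        # Report every missing variable at once instead of failing on the first
        raise RuntimeError(
            "Missing environment variables: " + ", ".join(missing)
        )
    return {name: environ[name] for name in names}
```

It could be called as `load_secrets(['reddit_client_id', 'reddit_user'])` at startup. Passing a plain dict as `environ` makes it easy to try without touching the real environment.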

score 1 · Answer 2 · answered Dec 04 '19 at 17:52

To expand on my comment: I typically have a config_local.py with all information, included in .gitignore. Then all relevant information is read by an interface module config.py, which is committed to GitHub and throws a detailed error message including a config_local template for copy and paste if any of the relevant information is missing. The advantage is obviously that this is self-explanatory if I pass the repository to someone else. Works for me so far.

Example:

repo_name/config.py (committed):

import logging

logger = logging.getLogger(__name__)

try:
    import config_local
    PASSWORD = config_local.PASSWORD
    MYPATH = config_local.MYPATH

    # could also make this more fine-grained or define default values

except Exception as e:
    # ImportError means config_local.py was not found,
    # AttributeError means a variable is missing from it
    print(e)
    logger.error('''
Please set configuration parameters in
repo_name/config_local.py, e.g.

PASSWORD = 'password1234'
MYPATH = '/path'
''')
    # could also raise something here, if variables are strictly required

repo_name/config_local.py (in .gitignore):

PASSWORD = 'password1234'
MYPATH = '/path'
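
The note in config.py about making the lookup more fine-grained or defining default values could look roughly like this. It is only a sketch: `read_settings`, `REQUIRED` and `DEFAULTS` are hypothetical names, and the choice of which setting is required and which has a default is an assumption. Settings are read from any object with attributes, so the same function works on the imported config_local module.

```python
from types import SimpleNamespace

REQUIRED = ("PASSWORD",)        # hypothetical: must be set in config_local.py
DEFAULTS = {"MYPATH": "/path"}  # hypothetical: used when config_local.py omits it


def read_settings(source, required=REQUIRED, defaults=DEFAULTS):
    """Collect settings from `source`, filling in defaults and reporting gaps."""
    missing = [name for name in required if not hasattr(source, name)]
    if missing:
        raise RuntimeError(
            "Please set in repo_name/config_local.py: " + ", ".join(missing)
        )
    settings = {name: getattr(source, name) for name in required}
    for name, default in defaults.items():
        # Fall back to the default only when the attribute is absent
        settings[name] = getattr(source, name, default)
    return settings


# Stand-in for `import config_local`; a real module object works the same way.
fake_local = SimpleNamespace(PASSWORD="password1234")
settings = read_settings(fake_local)
```

Here only PASSWORD is strictly required, so a missing MYPATH quietly falls back to its default while a missing PASSWORD stops the program with a message naming it.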

In this type of context I'd prefer this over environment variables. Setting 6-10 env vars on a dedicated Docker/VM is one thing, and I do precisely that. Needing potentially lots of specifically named env variables on someone's workstation is another. In that case, at least prefix them with your app name, e.g. `EXAMPLE_API1_KEY`. — JL Peyret, Dec 05 '19 at 04:29
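
The prefixing suggested in this comment can be sketched as a small helper that picks out only the variables belonging to the app. The function name `prefixed_env` and its `environ` parameter are assumptions for illustration, and `EXAMPLE_` stands in for the app name as in the comment.

```python
import os


def prefixed_env(prefix, environ=None):
    """Return the variables starting with `prefix`, keyed without the prefix."""
    if environ is None:
        environ = os.environ  # default to the real process environment
    return {
        name[len(prefix):]: value
        for name, value in environ.items()
        if name.startswith(prefix)
    }
```

With `EXAMPLE_API1_KEY` set in the environment, `prefixed_env('EXAMPLE_')` would return it under the key `API1_KEY`, and unrelated variables on the workstation are ignored.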
