The goal of this script is to delete all node_modules folders that haven't been touched in the last 15 days.

It currently works, but because os.walk goes inside every folder, I lose efficiency: I don't need to go inside a node_modules folder, since that folder is exactly what I want to delete.

import os
import time
import shutil

PATH = "/Users/wagnermattei/www"

now = time.time()
old = now - 1296000
for root, dirs, files in os.walk(PATH, topdown=False):
    for _dir in dirs:
        if _dir == 'node_modules' and os.path.getmtime(os.path.join(root, _dir)) < old:
            print('Deleting: '+os.path.join(root, _dir))
            shutil.rmtree(os.path.join(root, _dir))
wmattei

2 Answers

2

If you're using Python 3, you can use Path from the pathlib module with its rglob method to find only the node_modules directories. That way the for loop iterates only over node_modules paths and skips all other files and directories.

import os
import time
import shutil
from pathlib import Path

PATH = "/Users/wagnermattei/www"
now = time.time()
old = now - 1296000

for path in Path(PATH).rglob('node_modules'):
    abs_path = str(path.absolute())
    if os.path.getmtime(abs_path) < old:
        print('Deleting: ' + abs_path)
        shutil.rmtree(abs_path)

Update: If you don't want to check a node_modules directory whose parent node_modules has already been deleted, you can use os.listdir instead. It lists the entries of a single directory non-recursively, so combined with a recursive function it walks down the directory tree and always checks a parent directory before its subdirectories. If a directory is an unused node_modules, you delete it and don't traverse any further into it.

import os
import time
import shutil

PATH = "/Users/wagnermattei/www"
now = time.time()
old = now - 1296000

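def traverse(path):
    for d in os.listdir(path):
        abs_path = os.path.join(path, d)
        # Skip regular files (os.listdir would raise on them) and symlinks
        if not os.path.isdir(abs_path) or os.path.islink(abs_path):
            continue
        if d == 'node_modules' and os.path.getmtime(abs_path) < old:
            print('Deleting: ' + abs_path)
            shutil.rmtree(abs_path)
        else:
            # Only recurse into directories that were not deleted
            traverse(abs_path)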

traverse(PATH)
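
The same pruning idea can also be sketched with os.walk itself. With topdown=True, os.walk lets you remove entries from dirs in place, and it then won't descend into them. The sketch below is only an illustration under some assumptions: the function name delete_stale_node_modules and the dry_run flag are hypothetical, and unlike the recursive version above it never descends into a node_modules folder, even one that is too recent to delete.

```python
import os
import shutil
import time

MAX_AGE_SECONDS = 15 * 24 * 60 * 60  # 15 days, the same 1296000 used above


def delete_stale_node_modules(base, max_age=MAX_AGE_SECONDS, dry_run=False):
    """Find node_modules folders under base not modified within max_age seconds.

    Hypothetical helper. Returns the matching paths and deletes them
    unless dry_run is True.
    """
    cutoff = time.time() - max_age
    matched = []
    # topdown=True lets us edit dirs in place to control where os.walk descends.
    for root, dirs, _files in os.walk(base, topdown=True):
        if 'node_modules' not in dirs:
            continue
        # Never descend into node_modules, whether or not it gets deleted.
        dirs.remove('node_modules')
        target = os.path.join(root, 'node_modules')
        # shutil.rmtree refuses symlinks, so leave symlinked folders alone.
        if os.path.islink(target):
            continue
        if os.path.getmtime(target) < cutoff:
            matched.append(target)
            if not dry_run:
                shutil.rmtree(target)
    return matched
```

Calling delete_stale_node_modules(PATH, dry_run=True) first returns the folders that would be removed without touching anything, which may be a safer way to try it out.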
VietHTran
  • Thanks, but the problem is that I don't want to go into folders like `my-project/node_modules/a-lib/node_modules`, because the first `node_modules` will be deleted anyway – wmattei Oct 02 '20 at 12:00
  • @WagnerMattei Ok I've updated my answer so that if the first `node_modules` in `my-project/node_modules/a-lib/node_modules` is deleted, it won't look for the second `node_modules` inside it – VietHTran Oct 03 '20 at 01:18
1

List comprehensions are often faster than for loops in Python, but I'm not sure they are better for debugging.

You should try this:

[shutil.rmtree(os.path.join(root, _dir)) \
for root, dirs, files in os.walk(PATH, topdown=False) \
    for _dir in dirs \
        if _dir == 'node_modules' and os.path.getmtime(os.path.join(root, _dir)) < old ]

But I think you should use npm to manage the old packages. Maybe this post can help :)

NadTraps
  • Hey, yeah, the post is awesome, but I'm not looking to delete unused packages. I want to delete all of them, even if they are declared in package.json; my only rule is that they are in a project that I have not worked on in the last 15 days – wmattei Oct 02 '20 at 11:57
  • Oh okok, I hope the list comprehension helps you. The performance will increase but I'm not sure how much. – NadTraps Oct 02 '20 at 18:03