2

I'm working on automating removing some rows of data from multiple files. One of the criteria is to exclude all products except a specific list. Currently I'm doing that by iterating through the rows and deleting rows that aren't for the right product.

I'm thinking that sorting the rows so that the products to be kept are first, and the ones to be deleted are clumped together lower down in the data would be faster, as I could then just find the first bad row and clear contents (as opposed to Range("<XX:XX>").Delete Shift:=xlUp).

The trouble I'm having is that the products actually present in any of the files varies, as does the list of products to be kept. It's pretty much a unique list of products to keep for each of over a dozen files.

So what I'm hoping is that there is a way when I specify the custom sort list such that I can have a single item for all other products I didn't explicitly list.

For example, if I wanted sort the alphabet "V, C, R, ", would there be a way to do so without specifically listing every other letter in the alphabet?

I'm avoiding specifying the remainder of the list for two reasons. One, the list is fairly long and retyping the full list for each of the files I need to sort would be pretty error prone and hard to maintain. Two, the product list is not necessarily static going forward, so I don't want to have to update the macro each time a new product is added.

pnuts
  • 58,317
  • 11
  • 87
  • 139
JMichael
  • 569
  • 11
  • 28
  • Why not filter with wildcards as you mention and use 'SpecialCells(xlCellTypeVisisble)' to delete all visible rows? – Kyle Oct 16 '15 at 14:39

1 Answers1

2

I'm not sure I fully understood your question, but perhaps using a column with a formula might help?

enter image description here

It can be automated, by coding to add a column, set the formulas in the newly created range, and clear or sort based on a filter on the new range.

Demetri
  • 869
  • 1
  • 6
  • 12
  • A formula would in fact create a way to sort the right products to the top, but unfortunately applying that formula to the entire range of data and waiting for it to calculate is taking pretty much just as long as the row deletion. So I tweaked your idea a little. Rather than adding a formula to determine "Keep" or "Clear" I simply figured out if the row should be retained in code and then added a value to a dummy column ("A" to keep, "B" to clear"). Then I sorted and removed all the B's. That trimmed 1300+ second run-time to <100 seconds. – JMichael Oct 19 '15 at 18:55
  • I wonder what was slowing it down - maybe if the match range was more specific, like A2:A4, that might have sped things up a bit. But I'm glad to hear you figured it out! – Demetri Oct 19 '15 at 22:18
  • I'm coming to learn that there seems to be something on the environment at my employer that causes Excel calculations to run slowly. It also doesn't help that the virtual machine I use for my day-to-day is (for reasons beyond comprehension outside our IT dept.) setup to only have 1 CPU. – JMichael Dec 14 '15 at 14:57
  • If you have some knowledge of vba the formula could be replaced with code. Depending on how your file is set up the code could be written one of several ways. Just to give you a start:[link](http://stackoverflow.com/questions/30959315/excel-vba-performance-1-million-rows-delete-rows-containing-a-value-in-less) but there are many others if you do a bit of searching. – Demetri Dec 14 '15 at 16:32