A colleague has inserted duplicates for ~1200 entries into our database. They have sent me a text file containing both the originals and the copies as alternating lines of CSV text. I've opened it in VS Code with the goal of converting the lines representing duplicates into DELETE statements targeting our database. No line is truly identical to another: each pair of lines holds the same data apart from the row ID.

I have found Stack Overflow entries for removing every other line when that line is empty, or when every other line is an exact copy of the previous line. I have not found an entry for this scenario, in which the paired lines differ. For example, I tried replacing `(.*)\n\1` with `$1\n`, taken from another SO entry, but that seems to target truly duplicate lines.

So how do I use VS Code to delete every other line regardless of content?

james_womack
    Do you need to remove every duplicate, excluding the ID column from the comparison? Could you add some sample input rows? – aborruso Dec 11 '22 at 08:42

1 Answer

You can achieve this using the Replace All UI in regex mode.

  • Press command-F or control-F
  • Expand the arrow on the left of the Find display
  • Press the ".*" so that it's highlighted
  • Enter this for Find (top text field in the Find UI): (.*\n)(.*)\n (this matches two lines at a time and captures the first line, including its newline, in group 1)
  • Enter this for Replace (second text field in the Find UI): $1 (this re-inserts the captured first line, so the second line of each pair is dropped)
  • Hit the Replace All button
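
The same find/replace can be sanity-checked outside the editor before running it on the real file. Below is a minimal sketch in Python, assuming Python's `re` module handles this simple pattern the same way VS Code's regex engine does. The sample rows are invented for illustration.

```python
import re

# Hypothetical sample: originals and duplicates on alternating lines,
# identical except for the leading row ID.
text = "1,alice,NY\n2,alice,NY\n3,bob,LA\n4,bob,LA\n"

# Same pattern as the Find field; \1 is Python's spelling of VS Code's $1.
kept = re.sub(r"(.*\n)(.*)\n", r"\1", text)

print(kept)
```

Only the first line of each pair should remain, which is an easy way to confirm the pattern keeps the lines you expect.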

Here's a similar SO question

EDIT: as Mark said, (.*\n).*\n? should work as well
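
To see where the trailing `?` makes a difference, here is a small Python sketch comparing the two patterns on text whose last line has no newline. The sample data is invented, and Python's `re` module is assumed to behave like VS Code's engine for these patterns.

```python
import re

# Hypothetical sample whose final line has no trailing newline.
text = "1,alice,NY\n2,alice,NY\n3,bob,LA\n4,bob,LA"

# Pattern that requires a newline after the second line of each pair.
strict = re.sub(r"(.*\n)(.*)\n", r"\1", text)

# Variant where the final newline is optional.
lenient = re.sub(r"(.*\n).*\n?", r"\1", text)

print(repr(strict))
print(repr(lenient))
```

With the stricter pattern the last pair is left untouched because its second line never ends in a newline. The variant with `\n?` still removes it.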

VS Code Find and Replace

james_womack
    Just `(.*\n).*\n?` works without so many parentheses. And the `?` at the end might be necessary depending on whether your file has a newline at the end or not. – Mark Dec 10 '22 at 19:05