I've got two files:

1st: Entries.txt

confirmation.resend
send
confirmation.showResendForm
login.header
login.loginBtn

2nd: Used_Entries.txt

confirmation.showResendForm = some value
login.header = some other value

I want to find all entries from the first file (Entries.txt) that have not been assigned a value in the second file (Used_Entries.txt).

In this example I'd like the following result:

confirmation.resend
send
login.loginBtn

confirmation.showResendForm and login.header do not show up in the result because they exist in Used_Entries.txt.

How do I do this? I've been playing around with regular expressions but haven't been able to solve it. A bash script or something similar would be much appreciated!

user3346601
  • First of all... What flavour is your regex engine? "_I've been playing around with regular expressions but haven't been able to solve it._" Show your attempts? – Unihedron Jul 30 '14 at 10:51

3 Answers

You can do this with regex, but be ready to write a little code: a regex cannot match against two files at once, and we need it to see both contents at once. So, in whatever language you are using, concatenate the contents of the two files with at least one newline in between.

This regex solution expects your string to be matched to be in this format:

text (no equals sign)
text
text
...
key (no equals sign) ␣ (optional whitespace) = (literal equal) whatever (our regex will skip this part.)
key=whatever
key=whatever
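
A minimal Python sketch of that concatenation step, in case it helps. The helper name `combine` is made up for illustration, and it assumes both files are small enough to read into memory:

```python
from pathlib import Path


def combine(entries_path, used_path):
    """Return the entries text followed by the used-entries text."""
    entries = Path(entries_path).read_text()
    used = Path(used_path).read_text()
    # Trim newlines at the seam so exactly one newline separates the two
    # parts, whether or not the first file ends with a trailing newline.
    return entries.rstrip("\n") + "\n" + used.lstrip("\n")
```

Calling `combine("Entries.txt", "Used_Entries.txt")` should give one string in the layout shown above: plain entries first, then the key = value lines.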

Do I have your attention? Yes? Please see the following regex (using techniques accessible to most regex engines):

/(^[^=\n]+$)(?!(?s).*^\1\s*=)/m

Inspired by a recent answer from zx81: you can place a (?s) flag in the middle of the pattern to switch to DOTALL mode from that point on, so that . also matches newlines for the rest of the regex. Using this technique and the input format above, here is what the regex does:

  • (^[^=\n]+$) Goes through all the text (no equals sign) elements. Enforces no equals signs or newlines in the capture. This means our regex hits every text element as a line, and tries to match it appropriately.
  • (?! Opens a negative lookahead group. Asserts that this match will not locate the following:
  •   (?s).* Any number of characters, including newlines. Because this match is greedy, the matcher jumps to the very end of the string and backtracks from there, so it reaches the key=value lines at the end of the document quickly.
  •   ^\1\s*= The captured key, followed by an equals sign after some optional whitespaces, in its own line.
  • ) Ends our group.

View a Regex Demo!

A regex demo with more test cases


I'm stupid. I could have just put this:

/(^[^=\n]+$)(?!.*^\1\s*=)/sm
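
Here is a rough sketch of running this simplified pattern from Python, using the sample data from the question. The function name `unused_entries` is my own, and `re.M | re.S` stand in for the /sm flags. As far as I know, recent Python versions reject a (?s) placed in the middle of a pattern, which is why the sketch uses the flag-based form rather than the first one:

```python
import re

# Simplified pattern from above; re.M and re.S replace the /sm flags.
UNUSED = re.compile(r"(^[^=\n]+$)(?!.*^\1\s*=)", re.M | re.S)


def unused_entries(combined):
    """Return plain entries that have no later 'key = value' line."""
    # With a single capture group, findall returns the captured keys.
    return UNUSED.findall(combined)


# Contents of Entries.txt, a newline, then the contents of Used_Entries.txt.
sample = "\n".join([
    "confirmation.resend",
    "send",
    "confirmation.showResendForm",
    "login.header",
    "login.loginBtn",
    "confirmation.showResendForm = some value",
    "login.header = some other value",
])

print(unused_entries(sample))
# prints ['confirmation.resend', 'send', 'login.loginBtn']
```

Note that the lookahead rescans the rest of the text for every entry, so this is fine for small files but may get slow on very large ones.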
Unihedron
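    import re

    # Keys that already have a value: everything before the "=" on each line.
    with open("Used_Entries.txt") as u:
        used = {re.sub(r"\s*=.*", "", line).strip() for line in u}

    # Print every entry that is not among the used keys.
    with open("Entries.txt") as e:
        for line in e:
            entry = line.strip()
            if entry and entry not in used:
                print(entry)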
vks
  • Thx for your help! One (newbie) question though: How do i run this? – user3346601 Jul 30 '14 at 10:24
  • In Python you can run this straight away. Please keep the indentation. – vks Jul 30 '14 at 10:30
  • This gives me the following output: ['confirmation.resend\n', 'send\n', 'confirmation.showResendForm\n', 'login.header\n', 'login.loginBtn']. This does not seem to be correct. Have you considered that in the Used_Entries.txt the entries are assigned a value (for example " = some value")? – user3346601 Jul 30 '14 at 10:34
  • The output looks nicer now (every entry is printed on a new line without '\n' etc.) but it still prints every entry from Entries.txt. – user3346601 Jul 30 '14 at 10:39

I was overcomplicating this and ended up solving it with a small script in Scala:

import scala.io.Source

object HelloWorld {
  def main(args: Array[String]): Unit = {

    val entries = Source.fromFile("Entries.txt").getLines().toList

    // Keep only the key: everything before the '=' sign, trimmed.
    // This also handles "key=value" lines that contain no spaces.
    val usedEntries = Source.fromFile("Used_Entries.txt").getLines()
      .map(_.takeWhile(_ != '=').trim)
      .toList

    println(entries)
    println(usedEntries)

    // Entries that do not appear among the used keys.
    val missingEntries = entries.filterNot(usedEntries.contains)

    println(missingEntries)

    println("Missing Entries: ")
    println()
    missingEntries.foreach(println)

  }
}
user3346601