0

I have a big XML file where the whole file is just one line. This is very impractical, since I need to search for certain occurrences in the file and grep can't help in this case. I tried to open the file in several editors, such as Notepad++ and Sublime Text, but the file is too big. Is there any clever way to search for occurrences of a string or pattern on Linux or Windows? The problem with grep is of course that it returns the whole line that the match occurs on, which here is the entire file.

The size of the file is 4GB.

Suvarna Pattayil
user16655
  • Yes, there is a very simple way to do it and it does NOT involve editing the existing file to introduce linebreaks! If you post a [mcve] with sample input and expected output then we can help you. – Ed Morton Apr 21 '16 at 02:52

4 Answers

1

If you can edit the file, or at least edit a copy of the file, I suggest that you split it up into separate lines and then use grep or Notepad++ etc. to search.

Try changing >< to >\n< - this will put each XML element on its own line.

If you need help with the substitution, there's an SO question on doing string substitution in bash

Community
Bob Salmon
1

If you are trying to use grep, you can use --color=always to highlight the part of the line where the match is found:

grep --color=always 'pattern' Issues.txt
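On a one-line file the highlighted match is still buried in the full line. A possible variation, sketched below, uses grep -o so that only the match plus a bounded amount of surrounding text is printed. The helper name grep_context and the default width of 40 characters are made up for illustration. grep may still need to buffer the whole line, so this might require a lot of memory on a 4GB line.

```shell
# grep_context PATTERN FILE [WIDTH]
# Hypothetical helper: prints each match of the extended regex PATTERN
# together with up to WIDTH characters (default 40) on either side,
# instead of the whole line.
grep_context() {
    grep -o -E ".{0,${3:-40}}(${1}).{0,${3:-40}}" "$2"
}

# Example (placeholder file name):
#   grep_context 'needle' big.xml 60
```

Matches that lie closer together than the context width can end up in the same output line, since grep -o reports non-overlapping matches.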


Alternatively, try using the vim editor for such files.

Also, if you really want to format the XML, i.e. split it into multiple lines with indentation, you can use xmllint:

xmllint --format theXMLFile

This writes the formatted XML to stdout, which you can redirect to another file.

If you search a bit, you may also figure out how to use this from within your favorite editor (in Kate, I use the command option).

Suvarna Pattayil
  • I didn't know that this was possible actually, but since the file is 4GB this won't really help as much as I want. Still, thank you for the suggestion :) – user16655 Apr 20 '16 at 13:16
0

Most XML editors can cope with this. It's well worth investing in an IDE such as oXygen or Stylus Studio, but there are probably free XML editors that do a good job too. An XML editor will generally allow you to open a single-line XML file and display it nicely indented on multiple lines, taking account of its knowledge of the XML syntax.

Unfortunately you don't say what you mean by "big". It could be 1MB, 1GB, or 1TB - there's a big difference between those numbers! All editors are going to struggle above 50MB or so.

Michael Kay
0

When I need to open a big file I use less. It is really fast:

 less -n filename 

-n disables line numbers (they take a while to calculate and you do not need them).

Once the file is open, you can search by typing /pattern.
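If the matches need to go into a script rather than an interactive pager, a streaming pipeline is one option. In the sketch below, tr breaks the input at every < on the fly, so grep sees short lines and the file on disk is left untouched. The function name stream_grep is a placeholder, and the reported line numbers count tags from the start of the file, not lines in the original.

```shell
# stream_grep PATTERN FILE
# Hypothetical helper: replaces every "<" with a newline while
# streaming, then greps the resulting short lines. Each output line is
# "N:text", where N is the index of the tag in which the match occurs.
stream_grep() {
    tr '<' '\n' < "$2" | grep -n -- "$1"
}

# Example (placeholder file name):
#   stream_grep 'needle' big.xml
```

Because tr works on fixed-size buffers, memory use should stay constant regardless of file size. The trade-off is that a pattern containing < can no longer match.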

BeniBela