
I am using regular expressions in Notepad++ to transform lines from a .txt file into rows that I can easily paste into my spreadsheet. For example, I have these lines:

6:
1    my 1st log
-2    my 2nd log

7:
-3    my 3rd log
4    my 4th log
-5    my 5th log

What I want to show in my spreadsheet is this:

+-+-----+-------+------------+
|1| DAY | VALUE |     LOG    |
|2|  6  |    1  | my 1st log |
|3|     |   -2  | my 2nd log |
|4|     |       |            |
|5|  7  |   -3  | my 3rd log |
|6|     |    4  | my 4th log |
|7|     |   -5  | my 5th log |
+-+-----+-------+------------+

The spreadsheet's row numbers are shown in the first column. I can match the first and second lines with ^(\d+)+\:$\r\n(.*+), which is fine when a day has 2 lines, but it fails when there are 3 logs, as on the 7th day. How do I match all characters up to an empty line? Thanks

Norseback
  • you can match empty lines using \n\r as suggested here: https://stackoverflow.com/questions/3866034/removing-empty-lines-in-notepad?rq=1 – ChrisB Aug 20 '21 at 20:15

4 Answers


This should work:

\d+:\r\n(-\d+\t.*\r\n)+
ronpi
  • It does select what I want, though I had to add a ``?`` after the minus because I forgot to mention that my values can also be positive. How do I select the second line and group it? My replace string is basically going to look like ``\1\t\2\r\n\t\3`` – Norseback Aug 20 '21 at 21:09
  • Because each selection may contain a different number of lines, I think it is too complicated to do with a single regex. I would recommend doing it in 2 steps. First, capture the head number together with the first log line and put them on the same line, separated by a tab, as in your replace pattern. Then capture all the other log lines and add only a tab before them (because they already have a tab between the value and the log text) – ronpi Aug 20 '21 at 22:26
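
The two-step idea from the comment above can be sketched outside Notepad++ with Python's `re` module. The two patterns here are assumptions adapted from this thread, not ronpi's exact expressions, and the sample uses `\n` line endings (a Windows file in Notepad++ would need `\r\n`). Step 2 also assumes that the log text itself never contains a tab.

```python
import re

# Sample input with "\n" line endings (a Windows file would use "\r\n").
text = (
    "6:\n1\tmy 1st log\n-2\tmy 2nd log\n\n"
    "7:\n-3\tmy 3rd log\n4\tmy 4th log\n-5\tmy 5th log\n"
)

# Step 1: pull the line after each "day:" header up onto the header line,
# replacing the colon and the line break with a single tab.
step1 = re.sub(r"(?m)^(\d+):\n", r"\1\t", text)

# Step 2: indent the remaining log lines with a tab (empty DAY column).
# Assumption: log text contains no tab, so a line with exactly one tab is a
# log line that step 1 did not already join to its header.
step2 = re.sub(r"(?m)^([-+]?\d+\t[^\t\n]*)$", r"\t\1", step1)

print(step2)
```

Each resulting line has the three tab-separated fields DAY, VALUE and LOG, with the blank separator line left untouched.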

You need to include a lookahead that looks for 2 consecutive line breaks.

/\d:[\s\S]+?(?=\r\n\r\n)/g

This produces 2 matches with the following contents:

[
  [
    {
      "content": "6:\n-1    my 1st log\n-2    my 2nd log",
      "isParticipating": true,
      "groupNum": 0,
      "groupName": null,
      "startPos": 0,
      "endPos": 36
    }
  ],
  [
    {
      "content": "7:\n-3    my 3rd log\n-4    my 4th log\n-5    my 5th log",
      "isParticipating": true,
      "groupNum": 0,
      "groupName": null,
      "startPos": 38,
      "endPos": 91
    }
  ]
]
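
As a rough check of the lookahead idea, the following Python sketch runs a close variant of the pattern (`\d+:` instead of `\d:`, so multi-digit days also match) over sample text with `\r\n` line endings. The sample string and variable names are assumptions made for illustration. It also shows a limitation: a block is only matched when a blank line follows it.

```python
import re

# Sample input with Windows line endings and a trailing blank line.
text = (
    "6:\r\n1\tmy 1st log\r\n-2\tmy 2nd log\r\n\r\n"
    "7:\r\n-3\tmy 3rd log\r\n4\tmy 4th log\r\n-5\tmy 5th log\r\n\r\n"
)

# Lazily consume everything after "day:" until two line breaks lie ahead.
pattern = re.compile(r"\d+:[\s\S]+?(?=\r\n\r\n)")

blocks = pattern.findall(text)
print(len(blocks))  # one entry per day block

# Without a blank line after the last block, the lookahead never succeeds
# for it, so only the first block is found.
blocks_no_blank = pattern.findall(text.rstrip())
print(len(blocks_no_blank))
```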
jamoreiras

Check the ". matches newline" box and use this pattern:

^\d+:.*?^$

The quantifier *? is non-greedy: it consumes characters only until the rest of the pattern succeeds.

The consecutive ^ and $ anchors are enough to match at an empty line or at the end of the string.

Put the capture groups where you want.
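
A minimal Python sketch of the same idea, assuming `re.DOTALL` as the counterpart of the ". matches newline" checkbox and `re.MULTILINE` so that ^ and $ match at line boundaries. The placement of the two capture groups and the sample text are illustrative choices, not part of the answer.

```python
import re

# Sample input with "\n" line endings; the last block ends the string.
text = (
    "6:\n1\tmy 1st log\n-2\tmy 2nd log\n\n"
    "7:\n-3\tmy 3rd log\n4\tmy 4th log\n-5\tmy 5th log\n"
)

# Group 1: the day number. Group 2: every log line up to the first position
# where "^$" succeeds, i.e. an empty line or the end of the string.
pattern = re.compile(r"^(\d+):\n(.*?)^$", re.MULTILINE | re.DOTALL)

days = [(m.group(1), m.group(2).splitlines()) for m in pattern.finditer(text)]

for day, logs in days:
    print(day, logs)
```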

Casimir et Hippolyte

The 3 types of items can probably be handled within a branch reset group.
This allows a single replacement with consistent capture group numbers.
The 3 types are day, log entry, and separator.

(?m)(?|^(\d+):()()|\G^()([-+]?\d+)(?![:\d])[ \t]+(.*)|\G^()()()[ \t]*$\R*)\R?

Replace with $1\t$2\t$3\r\n

https://regex101.com/r/QMOk4c/1

 (?m)                          # Multi-line mode
 (?|                           # Branch reset
    ^                             # BOL day
    ( \d+ )                       # (1)
    :                             # colon
    ( )                           # (2)
    ( )                           # (3)
  |                              # or
    \G                            # Start where last match left off
    ^                             # BOL log entry line
    ( )                           # (1)
    ( [-+]? \d+ )                 # (2)
    (?! [:\d] )                   # not the colon here
    [ \t]+                        # trim
    ( .* )                        # (3)
  | 
    \G                            # Start where last match left off
    ^                             # BOL record separator
    ( )                           # (1)
    ( )                           # (2)
    ( )                           # (3)
    [ \t]* $ \R* 
 )
 \R?                           # Any line break
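
Python's `re` module does not support branch reset groups, `\G`, or `\R`, so the regex above cannot be run there directly. As a swapped-in technique (not the regex from this answer), the sketch below classifies each line as day, log entry, or separator and emits the same three tab-separated fields that `$1\t$2\t$3` would produce. The function name `to_tsv` and the two helper patterns are hypothetical, and the `\G` continuity check is not reproduced.

```python
import re

DAY = re.compile(r"^(\d+):$")                 # e.g. "6:"
LOG = re.compile(r"^([-+]?\d+)[ \t]+(.*)$")   # e.g. "-2<TAB>my 2nd log"


def to_tsv(text):
    """Rewrite each line as three tab-separated fields: day, value, log."""
    out = []
    for line in text.splitlines():
        day = DAY.match(line)
        log = None if day else LOG.match(line)
        if day:
            fields = (day.group(1), "", "")
        elif log:
            fields = ("", log.group(1), log.group(2))
        elif not line.strip():
            fields = ("", "", "")             # separator line
        else:
            out.append(line)                  # leave unrecognised lines as-is
            continue
        out.append("\t".join(fields))
    return "\n".join(out)


sample = "6:\n1\tmy 1st log\n-2\tmy 2nd log\n\n7:\n-3\tmy 3rd log"
print(to_tsv(sample))
```

As with the replacement string above, the day number lands on its own row, with the log entries on the rows that follow.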
sln