
Below is the raw text I need to run re.search on (stored in a variable named 'table_t'):

' Table of Contents I. INTRODUCTION .................................... 1 II. FACTUAL ASPECTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 A. The Clean Air Act . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 B. EPA\'s Gasoline Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Establishment of Baselines . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Reformulated Gasoline . . . . . . . . . . . . . . . . . . . . . . . . . . 4 3. Conventional Gasoline (or "Anti-Dumping Rules") . . . . . . . . 4 C. The May 1994 Proposal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 III. MAIN ARGUMENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 A. General .................................... 5 B. The General Agreement on Tariffs and Trade . . . . . . . . . . . . . . . . 6 1. Article I - General Most-Favoured-Nation Treatment . . . . . . . 6 2. Article III - National Treatment on Internal Taxation and Regulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 a) Article III:4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 b) Article III:1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 3. Article XX - General Exceptions . . . . . . . . . . . . . . . . . . . . 15 4. Article XX(b) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 a) "Protection of Human, Animal and Plant Life or Health" . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 b) "Necessary" . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 5. Article XX(d) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 6. Article XX(g) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 a) "Related to the conservation of exhaustible natural resources..." . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 b) "... made effective in conjunction with restrictions on domestic production or consumption" . . . . 
. . . . . . 23 7. Preamble to Article XX . . . . . . . . . . . . . . . . . . . . . . . . . . 23 8. Article XXIII - Nullification and Impairment . . . . . . . . . . . . 25 '

I'd like to match ['I. INTRODUCTION ... 1', 'II. FACTUAL ASPECTS ... 2', 'III. MAIN ARGUMENTS ... 5'], i.e. the entries of the form 'ROMAN NUMERAL + TITLE + DOTS + PAGE NUMBER'.

So I wrote the following code:

romans = ["I.", "II.", "III.", "IV.", "V.", "VI.", "VII.", "VIII.", "IX.", "X."]
for i in range(0,len(romans)):
    try:
        print(i)
        print(re.search(r"((?<={})(.*)(\d))(?!{})".format(romans[i], romans[i+1]),table_t).group())
    except:
        pass

But it keeps returning this:

0 INTRODUCTION .................................... 1 II. FACTUAL ASPECTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 A. The Clean Air Act . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 B. EPA's Gasoline Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Establishment of Baselines . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Reformulated Gasoline . . . . . . . . . . . . . . . . . . . . . . . . . . 4 3. Conventional Gasoline (or "Anti-Dumping Rules") . . . . . . . . 4 C. The May 1994 Proposal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 III. MAIN ARGUMENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 A. General .................................... 5 B. The General Agreement on Tariffs and Trade . . . . . . . . . . . . . . . . 6 1. Article I - General Most-Favoured-Nation Treatment . . . . . . . 6 2. Article III - National Treatment on Internal Taxation and Regulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 a) Article III:4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 b) Article III:1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 3. Article XX - General Exceptions . . . . . . . . . . . . . . . . . . . . 15 4. Article XX(b) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 a) "Protection of Human, Animal and Plant Life or Health" . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 b) "Necessary" . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 5. Article XX(d) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 6. Article XX(g) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 a) "Related to the conservation of exhaustible natural resources..." . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 b) "... made effective in conjunction with restrictions on domestic production or consumption" . . . . . . . . . . 23 7. 
Preamble to Article XX . . . . . . . . . . . . . . . . . . . . . . . . . . 23 8. Article XXIII - Nullification and Impairment . . . . . . . . . . . . 25 1 FACTUAL ASPECTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 A. The Clean Air Act . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 B. EPA's Gasoline Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Establishment of Baselines . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Reformulated Gasoline . . . . . . . . . . . . . . . . . . . . . . . . . . 4 3. Conventional Gasoline (or "Anti-Dumping Rules") . . . . . . . . 4 C. The May 1994 Proposal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 III. MAIN ARGUMENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 A. General .................................... 5 B. The General Agreement on Tariffs and Trade . . . . . . . . . . . . . . . . 6 1. Article I - General Most-Favoured-Nation Treatment . . . . . . . 6 2. Article III - National Treatment on Internal Taxation and Regulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 a) Article III:4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 b) Article III:1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 3. Article XX - General Exceptions . . . . . . . . . . . . . . . . . . . . 15 4. Article XX(b) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 a) "Protection of Human, Animal and Plant Life or Health" . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 b) "Necessary" . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 5. Article XX(d) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 6. Article XX(g) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 a) "Related to the conservation of exhaustible natural resources..." . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 b) "... 
made effective in conjunction with restrictions on domestic production or consumption" . . . . . . . . . . 23 7. Preamble to Article XX . . . . . . . . . . . . . . . . . . . . . . . . . . 23 8. Article XXIII - Nullification and Impairment . . . . . . . . . . . . 25 2 MAIN ARGUMENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 A. General .................................... 5 B. The General Agreement on Tariffs and Trade . . . . . . . . . . . . . . . . 6 1. Article I - General Most-Favoured-Nation Treatment . . . . . . . 6 2. Article III - National Treatment on Internal Taxation and Regulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 a) Article III:4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 b) Article III:1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 3. Article XX - General Exceptions . . . . . . . . . . . . . . . . . . . . 15 4. Article XX(b) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 a) "Protection of Human, Animal and Plant Life or Health" . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 b) "Necessary" . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 5. Article XX(d) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 6. Article XX(g) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 a) "Related to the conservation of exhaustible natural resources..." . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 b) "... made effective in conjunction with restrictions on domestic production or consumption" . . . . . . . . . . 23 7. Preamble to Article XX . . . . . . . . . . . . . . . . . . . . . . . . . . 23 8. Article XXIII - Nullification and Impairment . . . . . . . . . . . . 25 3 4 5 6 7 8 9

Each match carries a long tail: instead of stopping at the page number that belongs to the given Roman numeral, it runs on to the end of the table of contents.

What did I get wrong?


1 Answer


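Use a non-greedy (lazy) quantifier for the TITLE + DOT(S) part. In your pattern, (.*) is greedy, so it runs to the last digit in the string instead of stopping at the first page number. The (?!...) lookahead only checks the position right after the match, so it does not stop the match at the next numeral.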
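To see the difference in isolation, here is a minimal sketch on a shortened string. The `sample` text is made up for illustration and is not your full `table_t`; the only change between the two patterns is `.*` versus `.*?`.

```python
import re

# Shortened, hypothetical stand-in for the real table of contents
sample = "I. INTRODUCTION .... 1 II. FACTUAL ASPECTS . . . . 2 A. The Clean Air Act . . 3"

# Greedy: .* grabs as much as possible, then backtracks to the LAST digit
greedy = re.search(r"(?<=I\.)(.*)(\d)", sample).group()

# Lazy: .*? grabs as little as possible, so it stops at the FIRST digit
lazy = re.search(r"(?<=I\.)(.*?)(\d)", sample).group()

print(repr(greedy))
print(repr(lazy))
```

The greedy match runs to the final `3`, while the lazy one ends at the `1` after INTRODUCTION. The same idea applied to your full text follows.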

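import re

# Same list of Roman-numeral prefixes as in the question
romans = ["I.", "II.", "III.", "IV.", "V.", "VI.", "VII.", "VIII.", "IX.", "X."]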

table_t = """' Table of Contents I. INTRODUCTION .................................... 1 II. FACTUAL ASPECTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 A. The Clean Air Act . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 B. EPA\'s Gasoline Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Establishment of Baselines . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Reformulated Gasoline . . . . . . . . . . . . . . . . . . . . . . . . . . 4 3. Conventional Gasoline (or "Anti-Dumping Rules") . . . . . . . . 4 C. The May 1994 Proposal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 III. MAIN ARGUMENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 A. General .................................... 5 B. The General Agreement on Tariffs and Trade . . . . . . . . . . . . . . . . 6 1. Article I - General Most-Favoured-Nation Treatment . . . . . . . 6 2. Article III - National Treatment on Internal Taxation and Regulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 a) Article III:4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 b) Article III:1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 3. Article XX - General Exceptions . . . . . . . . . . . . . . . . . . . . 15 4. Article XX(b) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 a) "Protection of Human, Animal and Plant Life or Health" . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 b) "Necessary" . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 5. Article XX(d) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 6. Article XX(g) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 a) "Related to the conservation of exhaustible natural resources..." . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 b) "... 
made effective in conjunction with restrictions on domestic production or consumption" . . . . . . . . . . 23 7. Preamble to Article XX . . . . . . . . . . . . . . . . . . . . . . . . . . 23 8. Article XXIII - Nullification and Impairment . . . . . . . . . . . . 25 '"""
for i in range(len(romans)):
    # (?<=...)    lookbehind: the match must start right after the Roman numeral
    # \s+         one or more whitespace characters
    # [A-Z \.]*?  lazy run of capitals, spaces and literal dots (title + leader dots)
    # (\d)        stops at the first digit that follows
    print("{}th trial - for roman {}".format(i, romans[i]))
    match = re.search(r"((?<={})\s+(?P<name>[A-Z \.]*?)(\d))".format(romans[i]), table_t)
    if match:  # re.search returns None when the numeral is not found
        print(match.group())

Output:

0th trial - for roman I.
   INTRODUCTION          ....................................                                  1
1th trial - for roman II.
  FACTUAL ASPECTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .     2
2th trial - for roman III.
 MAIN ARGUMENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .      5
3th trial - for roman IV.
4th trial - for roman V.
5th trial - for roman VI.
6th trial - for roman VII.
7th trial - for roman VIII.
8th trial - for roman IX.
9th trial - for roman X.
 . . . . . . . . . . . . . . . . . . . . . . . . . .     2
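Note the stray result for "X.": the dot in each numeral is not escaped, so it matches any character, and the hit appears to come from "Article XX . . . 23". Because `(\d)` takes a single digit, only the "2" of "23" is kept. A slightly stricter variant is sketched below; the function name `top_level_entries`, the shortened `sample` string and the named groups are my own choices for illustration, not something from the question.

```python
import re

def top_level_entries(toc, numerals):
    """Return (numeral, title, page) for each Roman-numeral heading found in toc."""
    entries = []
    for numeral in numerals:
        pattern = (
            r"(?<![A-Za-z])"                  # numeral must not be the tail of a longer word
            + re.escape(numeral)              # escape the dot so it matches a literal '.'
            + r"\s+(?P<title>[A-Z][A-Z ]*?)"  # upper-case title, matched lazily
            + r"[ .]*?"                       # leader dots and spaces, matched lazily
            + r"(?P<page>\d+)"                # the whole page number, not just one digit
        )
        m = re.search(pattern, toc)
        if m:
            entries.append((numeral, m.group("title").strip(), int(m.group("page"))))
    return entries

# Shortened, hypothetical stand-in for the real table of contents
sample = ("Table of Contents I. INTRODUCTION .......... 1 "
          "II. FACTUAL ASPECTS . . . . . 2 A. The Clean Air Act . . . 3 "
          "III. MAIN ARGUMENTS . . . . 5 7. Preamble to Article XX . . . . 23")
romans = ["I.", "II.", "III.", "IV.", "X."]

print(top_level_entries(sample, romans))
# [('I.', 'INTRODUCTION', 1), ('II.', 'FACTUAL ASPECTS', 2), ('III.', 'MAIN ARGUMENTS', 5)]
```

With the dot escaped, "X." no longer matches inside "XX", and `\d+` keeps multi-digit page numbers intact. This assumes top-level titles are written entirely in capitals, as they are in your sample.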

