I have a file with 1 million lines. Read with readLines,
a condensed sample of it looks like this:
prob <- readLines("offendingFile.txt")
dput(prob)
c("000005928484|Name Nmee Leonel |YUMBO |El Placer de El Cerrito ALG 76248 |114|80041725|20140424|4132638|20140425|P|PED.ELE/100098-114 |Corregimiento de amaime",
"", " ||90300105 |V-1 MUIMERP NALBOC |6.0000|30.820000|.0000|.00000000000000|6.0000|458114.67",
"000005928484|Name Nmee Leonel |YUMBO |El Placer de El Cerrito ALG 76248 |114|80041725|20140424|4132638|20140425|P|PED.ELE/100098-114 |Corregimiento de amaime",
"", " ||90400105 |V-2 MUIMERP NALBOC |3.0000|29.170000|.0000|.00000000000000|3.0000|169750.62",
"000005928484|Name Nmee Leonel |YUMBO |El Placer de El Cerrito ALG 76248 |114|80041725|20140424|4132638|20140425|P|PED.ELE/100098-114 |Corregimiento de amaime",
"", " ||90700101 |V-OCIMONOCE LOREMIPSUM |12.0000|5.980000|.0000|.00000000000000|12.0000|107118.18",
"000815004980|Odrareg Oinotna Namzug S. En C.S. |YUMBO |Rozo (Palmira) ALG 76520 |114|80041726|20140424|4132636|20140425|P|PED.ELE/100099-114 |Corregimiento de palmira"
)
I want to remove the LF LF sequences followed by spaces that occur in the file. In terms of the original row numbering, that means removing rows 2, 5 and 8 and appending row 3 to row 1, row 6 to row 4, and row 9 to row 7. So I tried:
prob2 <- gsub("\n {2,}", "", prob) # didn't do anything
gsub("[\r\n] {2,}", "", prob)
gsub("\r?\n {2,}|\r {2,}", "", prob)
The last two lines are borrowed from this SO post.
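A quick check that may explain why none of these patterns match: readLines returns one element per line with the line terminators already stripped, so no element of prob contains "\n" or "\r" for gsub to find. The sketch below uses a temporary file with made-up contents (`tmp` and the sample strings are hypothetical stand-ins for offendingFile.txt), not the real data.

```r
# Hypothetical stand-in for offendingFile.txt: one record split by LF LF + space
tmp <- tempfile(fileext = ".txt")
writeLines(c("A|one", "", " ||x1"), tmp)

prob <- readLines(tmp)

# readLines() splits on the line terminators and drops them,
# so no element contains a newline character
has_newline <- any(grepl("\n", prob, fixed = TRUE))
print(has_newline)

# With nothing to match, gsub() returns its input unchanged
unchanged <- identical(gsub("\n {2,}", "", prob), prob)
print(unchanged)
```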
How should I proceed?
Expected output:
dput(prob2)
c("000005928484|Name Nmee Leonel |YUMBO |El Placer de El Cerrito ALG 76248 |114|80041725|20140424|4132638|20140425|P|PED.ELE/100098-114 |Corregimiento de amaime ||90300105 |V-1 MUIMERP NALBOC |6.0000|30.820000|.0000|.00000000000000|6.0000|458114.67",
"000005928484|Name Nmee Leonel |YUMBO |El Placer de El Cerrito ALG 76248 |114|80041725|20140424|4132638|20140425|P|PED.ELE/100098-114 |Corregimiento de amaime ||90400105 |V-2 MUIMERP NALBOC |3.0000|29.170000|.0000|.00000000000000|3.0000|169750.62",
"000005928484|Name Nmee Leonel |YUMBO |El Placer de El Cerrito ALG 76248 |114|80041725|20140424|4132638|20140425|P|PED.ELE/100098-114 |Corregimiento de amaime ||90700101 |V-OCIMONOCE LOREMIPSUM |12.0000|5.980000|.0000|.00000000000000|12.0000|107118.18",
"000815004980|Odrareg Oinotna Namzug S. En C.S. |YUMBO |Rozo (Palmira) ALG 76520 |114|80041726|20140424|4132636|20140425|P|PED.ELE/100099-114 |Corregimiento de palmira"
)
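One possible way to get from prob to that output, as a sketch rather than a definitive answer: glue the lines back into a single string, delete every run of newlines that is immediately followed by a space, and split again. It assumes that continuation rows always start with a space and that genuine record rows never do. The vector below is a shortened, made-up stand-in for the real data, and `txt` is a name introduced only for this illustration.

```r
# Shortened, hypothetical stand-in for the real `prob` vector
prob <- c("A|one", "", " ||x1",
          "A|one", "", " ||x2",
          "B|two")

# Rebuild a single string so the line breaks exist again as "\n"
txt <- paste(prob, collapse = "\n")

# Remove any run of newlines that is directly followed by a space
# (the lookahead keeps the space itself, so fields stay separated)
txt <- gsub("\n+(?= )", "", txt, perl = TRUE)

# Split back into one element per logical record
prob2 <- strsplit(txt, "\n", fixed = TRUE)[[1]]
print(prob2)
```

On the real file the same three steps would run on the result of readLines. If leading spaces can also start a legitimate record, the pattern would need a stricter condition, for example requiring the "||" that follows the space in the sample.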