Questions about manipulating or examining textual data using common UNIX/Linux utilities.
Questions tagged [unix-text-processing]
32 questions
3
votes
5 answers
Extract substrings between strings
I have a file with text as follows:
###interest1 moreinterest1### sometext ###interest2###
not-interesting-line
sometext ###interest3###
sometext ###interest4### sometext othertext ###interest5### sometext ###interest6###
I want to extract all…

Digsby
- 151
- 10
2
votes
3 answers
Multiple input files - loop through one and check if string contained in second file - output paragraph
I am trying to filter a text file based on a second file. The first file contains paragraphs like:
$ cat paragraphs.txt
# ::id 1
# ::snt what is an example of a 2-step garage album
(e / exemplify-01
:arg0 (a / amr-unknown)
:arg1 (a2 / album
…

niccip
- 23
- 5
2
votes
2 answers
Split Markdown text file by regular expression that defines headings
I am trying to use a command-line program to split a larger text file into chunks with:
split on defined regex pattern
filenames defined by a capturing group in that regex pattern
The text file is of the format:
# Title
# 2020-01-01
Multi-line…

Leeroy
- 2,003
- 1
- 15
- 21
2
votes
2 answers
gsub: remove till first occurrence instead of last occurrence of a given character in a line
I have an HTML file from which I am trying to remove the first occurrences of <...> using awk's sub/gsub functions.
I used the awk regex operators . * + to match anything between < and >. However, the first occurrence of > is being escaped (?). I don't know if there is…

Ahmet Said Akbulut
- 426
- 4
- 17
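The behaviour described in this excerpt is consistent with how awk regexes match: the longest possible text is taken, so `<.*>` runs from the first `<` to the last `>` on the line. A minimal sketch of the difference, assuming a made-up sample line (the `line` variable and its content are illustrative, not taken from the question):

```shell
# Hypothetical sample line, used only for illustration.
line='<b>bold</b> text'

# Greedy pattern: .* extends to the LAST '>' on the line.
greedy=$(printf '%s\n' "$line" | awk '{ sub(/<.*>/, ""); print }')

# Bracket expression: [^>]* cannot cross a '>', so the match stops at the FIRST one.
first_only=$(printf '%s\n' "$line" | awk '{ sub(/<[^>]*>/, ""); print }')

printf 'greedy:     [%s]\n' "$greedy"
printf 'first only: [%s]\n' "$first_only"
```

Here `sub` replaces only the first match on each line; `gsub` with the same bracket expression would strip every tag instead.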
2
votes
2 answers
AWK: Concatenate and process three or more files with a method similar to FNR==NR approach
While learning awk, I found that the FNR==NR approach is a very common method for processing two files. If FNR==NR, then awk is reading the first file; when FNR resets to 1 while reading every line from the concatenated files, it means !(FNR==NR), and it is obviously…

Ahmet Said Akbulut
- 426
- 4
- 17
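For readers unfamiliar with the idiom in this excerpt: FNR restarts at 1 for each input file while NR keeps counting, so FNR==NR holds only for the first file. One common way to tell three or more files apart is to bump a counter whenever FNR is 1. The sketch below is not taken from the question's answers; the file names `f1`, `f2`, `f3` and the `fileno` variable are hypothetical.

```shell
# Hypothetical input files, created in a temporary directory for illustration.
tmp=$(mktemp -d)
printf 'a\nb\n' > "$tmp/f1"
printf 'c\n'    > "$tmp/f2"
printf 'd\ne\n' > "$tmp/f3"

# FNR is 1 on the first line of every file, so fileno counts the files seen so far.
result=$(awk 'FNR == 1 { fileno++ } { print fileno ": " $0 }' \
    "$tmp/f1" "$tmp/f2" "$tmp/f3")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Each line is printed with the index of the file it came from. Note that an empty input file never reaches FNR == 1, so this counter skips it.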
2
votes
3 answers
How to extract text from access log?
I am very new to this. I am trying to extract some text from my access log into a new file.
My log file is like this:
111.111.111.111 - - [02/Jul/2021:18:35:19 +0000] "GET /api/items HTTP/2.0" 304 0 "https://example.com/some/text/call-log?roomNo=5003"…

Zaman
- 103
- 7
2
votes
2 answers
replace new line with a space if next line starts with a word character
I have a large text file that looks like
some random : demo text for
illustration, can be long
and : some more
here is : another
one
I want an output like
some random : demo text for illustration, can be long
and : some more
here is : another one
I…

nlper
- 23
- 3
2
votes
1 answer
Regex to match nginx location block?
I am working on a bash script that takes a URL and adds nginx location blocks to a file. To prevent duplicates, the script will also remove a block if it already exists.
For removing a block if it already exists I made the regex…

azy141
- 39
- 3
1
vote
0 answers
Formatting wide output via 'column' (or similar) command(s)
This question asks for the 'inverse' of the solution here: I would like to wrap the long column (column 4) across multiple lines. In effect, the output should look like:
cat test.csv | column -s"," -t -c5
col1 col2 col3 col4 …

cg79
- 63
- 5
1
vote
2 answers
Convert field names to lower case using miller
I would like to use miller (mlr) to convert column names to lower case. The closest I have come is using the rename verb with a regular expression. \L should change the case, but instead the column names are getting prefixed by "\L".
I'm using macOS…

Ben Carlson
- 1,053
- 2
- 10
- 18
1
vote
2 answers
awk array is created but elements are missing
I have this sample file
userX 2020 start id1
userY 2005 stop id2
userZ 2006 start id3
userT 2014 stop id1
userX 2010 stop id1
I want to create an array where the year value $2 is an element for every unique user-id…

Ahmet Said Akbulut
- 426
- 4
- 17
1
vote
3 answers
remove enclosing brackets from a file
How can I efficiently remove enclosing brackets from a file with bash scripting (first occurrence of [ and last occurrence of ] in file)?
All brackets that are nested within the outer brackets and may extend over several lines should be…

jpseng
- 1,618
- 6
- 18
1
vote
5 answers
How to format a TXT file into a structured CSV file in bash?
I wanted to get some information about the CPU temperatures on my Linux server (openSUSE Leap 15.2). So I wrote a script which collects data every 20 seconds and writes it into a text file. Now I have removed all garbage data (like "CPU Temp" etc.) I…

Planetdragon
- 13
- 3
1
vote
1 answer
Shell script for cleaning up listener.ora file
We have a listener.ora file as below
[oracle@orahow admin]$ more listener.ora
LISTENER =
(DESCRIPTION_LIST =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = host-ip-address)(PORT = 1521))
(ADDRESS = (PROTOCOL = IPC)(KEY =…

dba
- 11
- 1
1
vote
1 answer
Remove section from file based on its content
How can I remove the config section that contains config B2 in the following file using bash? Any quick solution using sed or awk or similar? The different sections are separated by an empty line if that helps.
Input file:
section X
config A1
…

Hommous
- 23
- 3