Questions tagged [unix-text-processing]

Questions about manipulating or examining textual data using common UNIX/Linux utilities.

32 questions
3
votes
5 answers

Extract substrings between strings

I have a file with text as follows: ###interest1 moreinterest1### sometext ###interest2### not-interesting-line sometext ###interest3### sometext ###interest4### sometext othertext ###interest5### sometext ###interest6### I want to extract all…
Digsby
  • 151
  • 10
2
votes
3 answers

Multiple input files - loop through one and check if string contained in second file - output paragraph

I am trying to filter a text file based on a second file. The first file contains paragraphs like: $ cat paragraphs.txt # ::id 1 # ::snt what is an example of a 2-step garage album (e / exemplify-01 :arg0 (a / amr-unknown) :arg1 (a2 / album …
niccip
  • 23
  • 5
2
votes
2 answers

Split Markdown text file by regular expression that defines headings

I am trying to use a commandline program to split a larger text file into chunks with: split on defined regex pattern filenames defined by a capturing group in that regex pattern The text file is of the format: # Title # 2020-01-01 Multi-line…
Leeroy
  • 2,003
  • 1
  • 15
  • 21
2
votes
2 answers

gsub: remove till first occurrence instead of last occurrence of a given character in a line

I have an HTML file from which I am trying to remove the first occurrences of <...> using the sub/gsub functions. I used the awk regex operators . * + to match anything between < >. However, the first occurrence of > is being escaped (?). I don't know if there is…
2
votes
2 answers

AWK: Concatenate and process three or more files with a method similar to FNR==NR approach

Since I am learning awk, I found out that the FNR==NR approach is a very common method to process two files. If FNR==NR, then it is the first file; when FNR resets to 1 while reading lines from the concatenated files, it means !(FNR==NR) and it is obviously…
2
votes
3 answers

How to extract text from access log?

I am very new to this. I am trying to extract some text from my access log into a new file. My log file is like this: 111.111.111.111 - - [02/Jul/2021:18:35:19 +0000] "GET /api/items HTTP/2.0" 304 0 "https://example.com/some/text/call-log?roomNo=5003"…
Zaman
  • 103
  • 7
2
votes
2 answers

replace new line with a space if next line starts with a word character

I have a large text file that looks like some random : demo text for illustration, can be long and : some more here is : another one I want an output like some random : demo text for illustration, can be long and : some more here is : another one I…
nlper
  • 23
  • 3
2
votes
1 answer

Regex to match nginx location block?

I am working on a bash script that can add nginx location blocks to a file, taking in a URL. To prevent duplicates, this script will also remove a block if it already exists. For removing a block if it already exists I made the regex…
azy141
  • 39
  • 3
1
vote
0 answers

Formatting wide output via 'column' (or similar) command(s)

This question actually asks for the 'inverse' solution to the one here, namely I would like to wrap the long column (column 4) on multiple lines. In effect, the output should look like: cat test.csv | column -s"," -t -c5 col1 col2 col3 col4 …
cg79
  • 63
  • 5
1
vote
2 answers

Convert field names to lower case using miller

I would like to use miller (mlr) to convert column names to lower case. The closest I get is using the rename verb with a regular expression. \L should change the case, but instead the column names are getting prefixed by "\L". I'm using macOS…
Ben Carlson
  • 1,053
  • 2
  • 10
  • 18
1
vote
2 answers

awk array is created but elements are missing

I have this sample file userX 2020 start id1 userY 2005 stop id2 userZ 2006 start id3 userT 2014 stop id1 userX 2010 stop id1 I want to create an array where year value $2 is element for every unique user-id…
1
vote
3 answers

remove enclosing brackets from a file

How can I efficiently remove enclosing brackets from a file with bash scripting (first occurrence of [ and last occurrence of ] in file)? All brackets that are nested within the outer brackets and may extend over several lines should be…
jpseng
  • 1,618
  • 6
  • 18
1
vote
5 answers

How to format a TXT file into a structured CSV file in bash?

I wanted to get some information about my CPU temperatures on my Linux server (openSUSE Leap 15.2). So I wrote a script which collects data every 20 seconds and writes it into a text file. Now I have removed all garbage data (like "CPU Temp" etc.) I…
1
vote
1 answer

Shell script for cleaning up listener.ora file

We have a listener.ora file as below [oracle@orahow admin]$ more listener.ora LISTENER = (DESCRIPTION_LIST = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = host-ip-address)(PORT = 1521)) (ADDRESS = (PROTOCOL = IPC)(KEY =…
dba
  • 11
  • 1
1
vote
1 answer

Remove section from file based on its content

How can I remove the config section that contains config B2 in the following file using bash? Any quick solution using sed or awk or similar? The different sections are separated by an empty line if that helps. Input file: section X config A1 …
Hommous
  • 23
  • 3