Questions tagged [text-processing]

Mechanizing the creation or manipulation of electronic text.

Text processing covers basic jobs such as filtering, tokenization, and normalization of text. It can also serve as a pre-processing step.

1959 questions
655
votes
28 answers

How can I extract a predetermined range of lines from a text file on Unix?

I have a ~23000 line SQL dump containing several databases worth of data. I need to extract a certain section of this file (i.e. the data for a single database) and place it in a new file. I know both the start and end line numbers of the data that…
Adam J. Forster
  • 17,789
  • 9
  • 25
  • 20
452
votes
18 answers

Add a prefix string to beginning of each line

I have a file as below: line1 line2 line3. I want to get: prefixline1 prefixline2 prefixline3. I could write a Ruby script, but I would rather not need to. The prefix will contain /, since it is a path (/opt/workdir/, for example).
pierrotlefou
  • 39,805
  • 37
  • 135
  • 175
360
votes
8 answers

Select random lines from a file

In a Bash script, I want to pick out N random lines from input file and output to another file. How can this be done?
user121196
  • 30,032
  • 57
  • 148
  • 198
323
votes
25 answers

How to use sed to replace only the first occurrence in a file?

I would like to update a large number of C++ source files with an extra include directive before any existing #includes. For this sort of task, I normally use a small bash script with sed to re-write the file. How do I get sed to replace just the…
David Dibben
  • 18,460
  • 6
  • 41
  • 41
279
votes
8 answers

Using multiple delimiters in awk

I have a file which contain following lines: /logs/tc0001/tomcat/tomcat7.1/conf/catalina.properties:app.env.server.name = demo.example.com /logs/tc0001/tomcat/tomcat7.2/conf/catalina.properties:app.env.server.name =…
Satish
  • 16,544
  • 29
  • 93
  • 149
249
votes
10 answers

How to convert all text to lowercase in Vim

How do you convert all text in Vim to lowercase? Is it even possible?
ksuralta
  • 16,276
  • 16
  • 38
  • 36
224
votes
19 answers

How to replace ${} placeholders in a text file?

I want to pipe the output of a "template" file into MySQL, the file having variables like ${dbName} interspersed. What is the command line utility to replace these instances and dump the output to standard output? The input file is considered to be…
Dana the Sane
  • 14,762
  • 8
  • 58
  • 80
126
votes
26 answers

Is there still any reason to learn AWK?

I am constantly learning new tools, even old fashioned ones, because I like to use the right solution for the problem. Nevertheless, I wonder if there is still any reason to learn some of them. awk for example is interesting to me, but for simple…
Bite code
  • 578,959
  • 113
  • 301
  • 329
109
votes
7 answers

How to obtain the first letter in a Bash variable?

I have a Bash variable, $word, which is sometimes a word or sentence, e.g.: word="tiger" Or: word="This is a sentence." How can I make a new Bash variable which is equal to only the first letter found in the variable? E.g., the above would…
Village
  • 22,513
  • 46
  • 122
  • 163
103
votes
7 answers

How to add a new line of text to an existing file in Java?

I would like to append a new line to an existing file without erasing the current information of that file. In short, here is the approach I am currently using: import java.io.BufferedWriter; import java.io.FileWriter; import…
CompilingCyborg
  • 4,760
  • 13
  • 44
  • 61
99
votes
11 answers

Remove empty lines in a text file via grep

FILE: hello world foo bar How can I remove all the empty new lines in this FILE? Output of command: FILE: hello world foo bar
user191960
  • 1,941
  • 5
  • 20
  • 24
79
votes
8 answers

How to count the number of words in a sentence, ignoring numbers, punctuation and whitespace?

How would I go about counting the words in a sentence? I'm using Python. For example, I might have the string: string = "I am having a very nice 23!@$ day. " That would be 7 words. I'm having trouble with the random amount of spaces…
HossBender
  • 1,019
  • 2
  • 10
  • 23
79
votes
5 answers

Text processing - Python vs Perl performance

Here is my Perl and Python script to do some simple text processing from about 21 log files, each about 300 KB to 1 MB (maximum) x 5 times repeated (total of 125 files, due to the log repeated 5 times). Python Code (code modified to use compiled re…
ihightower
  • 3,093
  • 6
  • 34
  • 49
71
votes
9 answers

shell replace cr\lf by comma

I have input.txt containing: 1 2 3 4 5. I need to get output.txt containing: 1,2,3,4,5. How can I do it?
vinnitu
  • 4,234
  • 10
  • 41
  • 59
70
votes
6 answers

Remove first directory components from path of file

I need to remove one directory (the leftmost) from variables in Bash. I found ways to remove the whole path, or to use dirname and others, but those removed either everything or one path component on the right side, which doesn't help me. So you have a better…
Libor Zapletal
  • 13,752
  • 20
  • 95
  • 182