I am having trouble figuring out how to grep the characters between two single quotes.

I have this line in a file: version: '8.x-1.0-alpha1'

and I would like the output to look like this (the version number can vary):

8.x-1.0-alpha1

I wrote the following but it does not work:

cat myfile.txt | grep -e 'version' | sed 's/.*\?'\(.*?\)'.*//g'

Thank you for your help.

Addition: I used the sed command sed -n "s#version:\s*'\(.*\)'#\1#p". I would also like to remove 8.x-, so I changed it to sed -n "s#version:\s*'8.x-\(.*\)'#\1#p".

This command works on Linux but not on Mac. How can I change it so that it also works on Mac?

sed -n "s#version:\s*'8.x-\(.*\)'#\1#p"

Amin Z
  • Note that `cat | grep | sed` is largely pointless and can just be `sed pattern myfile.txt`. What format is this file in? My guess is sed is the wrong tool for this. – DTSCode Oct 09 '18 at 14:22
  • Your shell is ‘eating’ your quotes; `sed` doesn’t get to see them. – Biffen Oct 09 '18 at 14:22
  • See [Extract version number from a string](https://stackoverflow.com/questions/16817646/extract-version-number-from-a-string). – Wiktor Stribiżew Oct 09 '18 at 14:23
  • Possible duplicate of [How to escape single quotes within single quoted strings?](https://stackoverflow.com/questions/1250079/how-to-escape-single-quotes-within-single-quoted-strings) – Biffen Oct 09 '18 at 14:24
  • @Kwright02: that "regex" only quotes the parens. The `.*` is unquoted and will be handled/expanded by the shell. – DTSCode Oct 09 '18 at 14:29
  • @Kwright02 You can’t escape single quotes within single quotes like that. – Biffen Oct 09 '18 at 14:33
  • I would like to use the sed command `sed -n "s#version:\s*'\(.*\)'#\1#p" myfile.txt`. How can I remove the `8.x-` from my output? – Amin Z Oct 10 '18 at 13:26
  • Don't add requirements in comments. If your question doesn't state what you really want, then fix your question. You just wasted several people's time trying to help you with a question that isn't really what you want, so please make sure to get it right this time. – Ed Morton Oct 10 '18 at 13:28

5 Answers

If you only need that one piece of information from the file, you can quickly do:

awk -F"'" '/version/{print $2}' file

Example:

$ echo "version: '8.x-1.0-alpha1'" | awk -F"'" '/version/{print $2}'
8.x-1.0-alpha1

How does this work?

An awk program is a series of pattern-action pairs, written as:

condition { action }
condition { action }
...

where condition is typically an expression and action a series of commands.

  1. -F"'": This sets the field separator FS to a single quote ('). Every line is then split into fields $1, $2, ..., $NF, with a ' between consecutive fields. We can reference these fields as $1 for the first, $2 for the second, and so on up to $NF, where NF is the total number of fields in the line.

  2. /version/{print $2}: This is the condition-action pair.

    • condition: /version/: If a substring of the current record/line matches the regular expression /version/, then perform the action. Here that simply means: if the current line contains the substring version.

    • action: {print $2}: If the condition is satisfied, print the second field. In this case, the second field is what the OP asked for.

There are now several things that can be done.

  1. Improve the condition to /^version:/ && NF==3, which reads: if the current line starts with the substring version: and has exactly 3 fields, then perform the action.

  2. If you only want the first occurrence, you can make awk exit immediately after the match by changing the action to {print $2; exit}.
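
The two refinements above can be combined into one command. This is only a sketch: extract_version is a hypothetical helper name, the sample input is made up, and it assumes the line reads version: with no space before the colon, as in the question's sample.

```shell
# extract_version is a hypothetical helper name, not part of the original answer.
extract_version() {
  # Split each line on single quotes. Act only on lines that start with
  # "version:" and have exactly 3 fields, print the quoted value, then stop.
  awk -F"'" '/^version:/ && NF==3 {print $2; exit}' "$@"
}

# Made-up sample input: only the first matching version line is printed.
printf "%s\n" "name: 'demo'" "version: '8.x-1.0-alpha1'" "version: '8.x-2.0'" | extract_version
# prints 8.x-1.0-alpha1
```

Called with a file name (extract_version myfile.txt), it reads that file instead of standard input.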

kvantour

I'd use GNU grep with PCRE regexes:

grep -oP "version: '\\K.*(?=')" file

where we are looking for "version: '" and then the \K directive will forget what it just saw, leaving .*(?=') to match up to the last single quote.
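
Because .* is greedy, a line with further single quotes after the version would yield more than the version string. A possible variation, assuming a GNU grep built with PCRE support (-P), replaces .*(?=') with [^']* so the match stops at the first closing quote. The function name and the sample line are made up for illustration.

```shell
# quoted_version_pcre is a hypothetical helper name used for illustration.
quoted_version_pcre() {
  # \K discards the "version: '" prefix from the reported match;
  # [^']* then matches only up to (not including) the next single quote.
  grep -oP "version: '\\K[^']*" "$@"
}

# Made-up sample line with extra quotes after the version string.
printf "%s\n" "version: '8.x-1.0-alpha1' # see 'notes'" | quoted_version_pcre
# prints 8.x-1.0-alpha1
```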

glenn jackman

Try something like this: sed -n "s#version:\s*'\(.*\)'#\1#p" myfile.txt. This avoids the redundant cat and grep by finding the "version" line and extracting the contents between the single quotes.

Explanation:

The -n flag tells sed not to print lines automatically. We then use the p command at the end of our sed pattern to explicitly print when we've found the version line.

Search for pattern: version:\s*'\(.*\)'

  • version:\s* Match "version:" followed by any amount of whitespace
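  • '\(.*\)' Match a single ', then capture everything up to the last ' on the line (.* is greedy). Text after that closing quote is not part of the match, so it stays in the output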

Replace with: \1. This is the first (and only) capture group above, containing the contents between the single quotes.
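
\s is a GNU extension, so other sed implementations (such as the one shipped on a Mac) may not accept it. A possibly more portable sketch uses the POSIX class [[:space:]], stops the capture at the next quote with [^']*, and consumes the rest of the line with a trailing .* so that nothing after the closing quote is printed. Both function names are hypothetical, and the second variant assumes the prefix to drop is literally 8.x-.

```shell
# Both helper names are hypothetical, used only for illustration.
version_between_quotes() {
  # [[:space:]] replaces \s; [^']* captures up to the next quote;
  # the trailing .* swallows whatever follows the closing quote.
  sed -n "s/^version:[[:space:]]*'\([^']*\)'.*/\1/p" "$@"
}

version_without_prefix() {
  # Same idea, but the literal prefix 8.x- is matched outside the capture group.
  sed -n "s/^version:[[:space:]]*'8\.x-\([^']*\)'.*/\1/p" "$@"
}

printf "%s\n" "version: '8.x-1.0-alpha1'," | version_between_quotes
# prints 8.x-1.0-alpha1
printf "%s\n" "version: '8.x-1.0-alpha1'," | version_without_prefix
# prints 1.0-alpha1
```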

John
  • I tried this command (replacing the `:` with `=`) with a line as follows: `version='0.1.2',`. The output was: `0.1.2,`, meaning that your suggested command does not precisely keep only what's between single quotes – Ezequiel Berto Nov 12 '20 at 20:56
  • It prints everything in the line of `version=`, after that and without the `'` – Ezequiel Berto Nov 12 '20 at 22:28
  • Thank you, this works nicely to get a __version__ variable from a python file, e.g. to be used in CI: `sed -n "s#__version__ =\s*'\(.*\)'#\1#p" version.py` will extract `0.20.9` from version.py with a line `__version__ = '0.20.9'` – Alex Jan 04 '21 at 03:54

When you only want to look at the quotes, you can use cut.

grep -e 'version' myfile.txt | cut -d "'" -f2
Walter A

grep can almost do this alone:

grep -o "'.*'" file.txt

But this may also print text you don't want: it prints the quoted part of every line that contains two single quotes ('). And the output still has the single quotes (') around it:

'8.x-1.0-alpha1'

But sed alone can do it properly:

sed -rn "s/^version: +'([^']+)'.*/\1/p" file.txt
Hkoof