0

Suppose I have a string containing several file paths that looks something like this:

"mydrive/mypath/myapp/first_app.java, mydrive/mypath/myapp/second_app.java, mydrive/mypath/myapp/third_app.java". From this string I'd like to extract only the file names, without the file extension, and build a new list of strings that looks like this:

"first_app, second_app, third_app" and so on.

My current implementation is in Python and it looks like this:

from sys import argv

incoming_strings = argv
clean_strings_list = []
if isinstance(incoming_strings, list):
    for string_to_cut in incoming_strings:
        if "app" in string_to_cut:
            string_to_cut_ = string_to_cut.split('/')
            string_to_cut__ = string_to_cut_[len(string_to_cut_) - 1]
            string_to_cut = string_to_cut__.split('.')[0]
            clean_strings_list.append(string_to_cut)
    print(clean_strings_list)
else:
    string_to_cut_ = incoming_strings.split('/')
    string_to_cut__ = string_to_cut_[len(string_to_cut_)-1]
    string_to_cut = string_to_cut__.split('.')[0]
    print(string_to_cut)

I need to implement the code above as a Bash script. What would be the proper way to do that? Thanks!

oguz ismail
Pavel Zagalsky
  • To split the string into pieces, you can use `echo $yourstring | tr , "\n"` . For removing the directory part and the _.java_ extension, the easiest approach (from the viewpoint of programming effort) is to use the `basename` command, though other possibilities exist too. – user1934428 Nov 06 '19 at 07:42
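
The `tr` plus `basename` idea from the comment above could be sketched roughly as follows. The function name `extract_names` and the variable `yourstring` are placeholders chosen for illustration, and the sketch assumes every file ends in `.java` and that no path contains a comma or a newline.

```bash
#!/usr/bin/env bash
# Input string copied from the question.
yourstring='mydrive/mypath/myapp/first_app.java, mydrive/mypath/myapp/second_app.java, mydrive/mypath/myapp/third_app.java'

# extract_names is a hypothetical helper name, not part of the original comment.
extract_names() {
    # Put each path on its own line, then process the lines one by one.
    printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r path; do
        path=${path# }                 # drop the space that follows each comma
        basename -- "$path" .java      # strip the directory and the .java suffix
    done
}

extract_names "$yourstring"
# prints:
# first_app
# second_app
# third_app
```

Quoting `"$1"` and `"$path"` keeps paths with spaces intact, which the unquoted `echo $yourstring` form would not.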

3 Answers

2

There are many ways to solve this. Here is one suggestion:

Python

>>> st = 'mydrive/mypath/myapp/first_app.java, mydrive/mypath/myapp/second_app.java, mydrive/mypath/myapp/third_app.java'
>>> import os
>>> for s in st.strip().split(","):
...    fname = os.path.basename(s).split(".")[0]
...    print(fname)

first_app
second_app
third_app

Bash

st='mydrive/mypath/myapp/first_app.java, mydrive/mypath/myapp/second_app.java, mydrive/mypath/myapp/third_app.java'
OLDIFS=$IFS                         # save the current IFS
IFS=","                             # use a comma as the delimiter
read -ra ADDR <<< "$st"             # split st into an array
for i in "${ADDR[@]}"; do
    filename=$(basename -- "$i")    # get the file name from the path
    echo "${filename%.*}"           # drop the extension
done
IFS=$OLDIFS                         # restore the saved IFS

Output:

first_app
second_app
third_app


Lê Tư Thành
  • Your suggestion uses _Python_. The OP was looking for a _bash_ program. – user1934428 Nov 06 '19 at 07:43
  • @user1934428, thanks for the reminder; I had focused only on the tag. I have updated the answer with a Bash script. – Lê Tư Thành Nov 06 '19 at 08:19
  • Since you solved this by fiddling with IFS, it might be better to save the original IFS into a variable and restore it later from there. We don't know in which context your code snippet will be used, and maybe the IFS has been modified already. This is one reason why I'm not fond of changing IFS in the middle of a script, although I'm well aware that this is considered common and good practice. – user1934428 Nov 06 '19 at 09:26
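
One possible way to address the concern in the last comment is to confine the IFS change to a function with `local`, so nothing has to be saved or restored by hand. This is only a sketch: `split_names` and `before` are names made up for the example, and it assumes Bash (not plain `sh`), since it relies on arrays and here-strings.

```bash
#!/usr/bin/env bash
# split_names is a hypothetical helper; it prints one bare file name per line.
split_names() {
    local IFS=','                   # local: the caller's IFS is untouched after return
    local -a parts
    local part
    read -ra parts <<< "$1"         # split the argument on commas
    for part in "${parts[@]}"; do
        part=${part##*/}            # drop everything up to the last /
        printf '%s\n' "${part%.*}"  # drop the last extension
    done
}

before=$IFS
split_names 'mydrive/mypath/myapp/first_app.java, mydrive/mypath/myapp/second_app.java'
[ "$IFS" = "$before" ] && echo "IFS unchanged"
# prints:
# first_app
# second_app
# IFS unchanged
```
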
0

Using bash, you could split the string on ',' by setting IFS for the read command only, like this:

$ path="mydrive/mypath/myapp/first_app.java, mydrive/mypath/myapp/second_app.java, mydrive/mypath/myapp/third_app.java"
$ IFS=, read -ra paths <<<"$path" # read into an array, by splitting the path on `,`
$ for path in "${paths[@]}"
> do
>   filename="${path##*/}" # strip whatever is before final /
>   filename="${filename%.*}" # strip the extension
>   echo "$filename"
> done
first_app
second_app
third_app
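
The question asks for a single string such as "first_app, second_app, third_app" rather than one name per line. A minimal sketch of collecting the names and joining them, assuming Bash arrays are available; the names `names` and `joined` are placeholders introduced here.

```bash
#!/usr/bin/env bash
path="mydrive/mypath/myapp/first_app.java, mydrive/mypath/myapp/second_app.java, mydrive/mypath/myapp/third_app.java"
IFS=, read -ra paths <<< "$path"   # IFS is changed for this read only

names=()                           # hypothetical array collecting the bare names
for p in "${paths[@]}"; do
    p="${p##*/}"                   # strip whatever is before the final /
    names+=("${p%.*}")             # strip the extension
done

# printf repeats its format for every element; the leading ", " is then trimmed.
joined=$(printf ', %s' "${names[@]}")
joined=${joined#, }
echo "$joined"
# prints: first_app, second_app, third_app
```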

In Python, you could just use os.path or pathlib (which I recommend if you are on Python 3):

>>> path = "mydrive/mypath/myapp/first_app.java, mydrive/mypath/myapp/second_app.java, mydrive/mypath/myapp/third_app.java"
>>> import os
>>> [os.path.splitext(os.path.basename(p)) for p in path.split(',')]
[('first_app', '.java'), ('second_app', '.java'), ('third_app', '.java')]
>>> import pathlib
>>> details = [(pathlib.Path(p).stem, pathlib.Path(p).suffix) for p in path.split(',')]
>>> details
[('first_app', '.java'), ('second_app', '.java'), ('third_app', '.java')]

And if you just want names,

>>> names_without_exts, _ = zip(*details)
>>> names_without_exts
('first_app', 'second_app', 'third_app')

or directly use,

>>> [os.path.splitext(os.path.basename(p))[0] for p in path.split(',')]
['first_app', 'second_app', 'third_app']
>>> [pathlib.Path(p).stem for p in path.split(',')]
['first_app', 'second_app', 'third_app']
han solo
  • 6,390
  • 1
  • 15
  • 19
0

You can use regex to extract only the part you need by using re.search.

import re

strings = ["mydrive/mypath/myapp/first_app.java", "mydrive/mypath/myapp/second_app.java", "mydrive/mypath/myapp/third_app.java"]
for string in strings:
    # Capture the text between the last "/" and the last "."
    match = re.search(r"/([^/]*)\.[^./]*$", string)
    print(match.group(1))

This will produce:

first_app
second_app
third_app

You can test this on regex101.
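
Since the question asks for Bash, a pattern equivalent to the one above can also be tried with Bash's own `[[ ... =~ ... ]]` operator, which exposes capture groups through the `BASH_REMATCH` array. This is a sketch under that assumption; `stem_by_regex` is a made-up function name, and like the Python version it only matches paths that contain at least one "/".

```bash
#!/usr/bin/env bash
# stem_by_regex is a hypothetical helper: prints the file name without
# directory and last extension, or returns 1 when the pattern does not match.
stem_by_regex() {
    # Keeping the pattern in a variable avoids quoting surprises inside [[ ]].
    local re='/([^/]*)\.[^./]*$'
    if [[ $1 =~ $re ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"
    else
        return 1
    fi
}

for s in mydrive/mypath/myapp/first_app.java \
         mydrive/mypath/myapp/second_app.java \
         mydrive/mypath/myapp/third_app.java; do
    stem_by_regex "$s"
done
# prints:
# first_app
# second_app
# third_app
```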

Bayou