Remove the directory path from relative paths, leaving only the file name

Question

The following strings appear in an HTML file; they are a subset of the strings I have to work with:

content/css/dashboard.css
content/pages/icon-apache.png
content/js/dashboard-commons.js
sbadmin2-1.0.7/bower_components/jquery/dist/jquery.min.js

I'm trying to remove the whole directory path and leave only the file name, like this:

dashboard.css
icon-apache.png
dashboard-commons.js
jquery.min.js

I'm looking for a generic way to do this, not an approach that handles each case one by one with its own sed replacement.

In short:

A regex to find the pattern (a multi-level directory path) in the HTML file and remove it.

Edit: I'm looking for a solution that works on Linux, preferably one that doesn't involve scripting or installing tools.

Edit 2: This question partially answers mine. With the answer provided there, I can now get the last part of the path. But I'm still looking for a regex pattern to extract the list of strings from the HTML file.

Edit 3: As requested, here are a few examples:

<link href="sbadmin2-1.0.7/dist/css/sb-admin-2.css" rel="stylesheet">
<link href="content/css/dashboard.css" rel="stylesheet">
<link href="content/css/theme.blue.css" rel="stylesheet">
<script src="sbadmin2-1.0.7/bower_components/bootstrap/dist/js/bootstrap.min.js"></script>
<script src="sbadmin2-1.0.7/bower_components/flot/excanvas.min.js"></script>
<script src="sbadmin2-1.0.7/bower_components/flot/jquery.flot.js"></script>

Possible duplicate of [Get last field using awk substr](https://stackoverflow.com/questions/17921544/get-last-field-using-awk-substr) — kvantour, Sep 28 '18 at 14:21
For the HTML question, you have to provide us with an example so we know where these strings come from. Are they part of or where do they come from. — kvantour, Sep 28 '18 at 14:27
Why not think about removing what is not needed with an RE? For example with sed: `sed 's:.*/::'` — Thor, Sep 28 '18 at 14:27
Also, you ask for a regex to parse your HTML. [**Never** parse HTML or XML with a regex](https://stackoverflow.com/a/1732454/8344060) you might meet the pony. — kvantour, Sep 28 '18 at 14:29
@Thor that was my intention when asking the question. But I'm not familiar with sed/awk/grep to come up with the most appropriate regex for the job. — luizfzs, Sep 28 '18 at 14:33
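Applied to a whole HTML line, `sed 's:.*/::'` is greedy: on a `<script>` line the last `/` belongs to `</script>`, so the tag itself would be eaten. A minimal sketch that restricts the substitution to `href` and `src` values, assuming double-quoted attributes and a sed that accepts `-E` (GNU and BSD sed do). The function name `strip_attr_paths` is only illustrative:

```shell
# Hypothetical helper: drop the directory part of every href="..." / src="..." value.
# Reads HTML on stdin and writes the rewritten HTML to stdout.
strip_attr_paths() {
    # [^"]*/ is greedy but cannot cross a quote, so it stops at the
    # last slash inside the attribute value.
    sed -E 's#((href|src)=")[^"]*/#\1#g'
}

printf '%s\n' \
    '<link href="content/css/dashboard.css" rel="stylesheet">' \
    '<script src="sbadmin2-1.0.7/bower_components/flot/excanvas.min.js"></script>' |
    strip_attr_paths
# prints:
# <link href="dashboard.css" rel="stylesheet">
# <script src="excanvas.min.js"></script>
```

Something like `strip_attr_paths < file.html > stripped.html` would rewrite a whole file. Being a regex, it will also touch any other attribute whose name ends in `href` or `src`.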

score 1 · Answer 1 · answered Sep 28 '18 at 15:35

From the full path:

$ awk -F/ '{print $NF}' file

dashboard.css
icon-apache.png
dashboard-commons.js
jquery.min.js

From the HTML:

$ awk -F'"' '/<link|script/{n=split($2,a,"/"); print a[n]}' file.html

sb-admin-2.css
dashboard.css
theme.blue.css
bootstrap.min.js
excanvas.min.js
jquery.flot.js

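This assumes one link/script tag per line, with the path as the first double-quoted attribute value in the tag (that is what `$2` picks up).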
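If several tags can share a line, one possible variant is to pull out each attribute with `grep -o` first and strip the path afterwards. This is only a sketch: it assumes double-quoted `href`/`src` values and a grep that supports `-o` (GNU and BSD grep do), and the function name `list_asset_names` is made up for the example:

```shell
# Hypothetical helper: print the file name of every href/src value on stdin,
# one per line, even when several tags share a line.
list_asset_names() {
    grep -oE '(href|src)="[^"]+"' |
        sed -e 's/^[^"]*"//' -e 's/"$//' -e 's:.*/::'
}

printf '%s\n' \
    '<link href="content/css/dashboard.css" rel="stylesheet"><script src="sbadmin2-1.0.7/bower_components/flot/jquery.flot.js"></script>' |
    list_asset_names
# prints:
# dashboard.css
# jquery.flot.js
```

With a real file this would be `list_asset_names < file.html`.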

J.F. · Answer 2 · 2018-09-28T14:19:27.183

score -2

You should use `basename` for that:

basename content/css/dashboard.css

gives

dashboard.css
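For a whole list of paths, `basename` can be run once per line in a loop instead of typing one command per path. A minimal sketch, assuming one path per line on stdin and a `basename` that accepts `--` (GNU coreutils does). The function name `basenames` is just for illustration:

```shell
# Hypothetical helper: print the basename of each path read from stdin.
basenames() {
    while IFS= read -r p; do
        basename -- "$p"
    done
}

printf '%s\n' \
    content/css/dashboard.css \
    content/pages/icon-apache.png \
    sbadmin2-1.0.7/bower_components/jquery/dist/jquery.min.js |
    basenames
# prints:
# dashboard.css
# icon-apache.png
# jquery.min.js
```

This starts one process per path, which is fine for a few hundred lines but slower than a single awk or sed pass.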

edited Sep 28 '18 at 14:19

answered Sep 28 '18 at 14:12

Sorry, but I cannot see how that answers my question. – luizfzs Sep 28 '18 at 14:14
`basename content/css/dashboard.css` gives you what you want: `dashboard.css` – J.F. Sep 28 '18 at 14:15
Suppose I have a list of 100 strings like this, and the base names do not repeat. Your suggestion is to have 100 replace commands, one for each base name, right? If so, I stated that this is not what I'm looking for. – luizfzs Sep 28 '18 at 14:18
You can also pipe the data through `rev | cut -d/ -f1 | rev`. – Florian Weimer Sep 28 '18 at 16:06
