2

I have the following csv file:

"Function","Source","Line","FnCov","C/D Coverage","out of","%"
"sharp_coll_env2bool","../../src/coll/util.c",176,1,1,34,2%
"TreeManager::SortTreeRootsByGroup","../../src/am/tree_manager.cpp",1467,1,1,26,3%
"FabricGraph::MadSendRetry","../../src/am/fabric_graph.cpp",2170,1,1,16,6%
"ibis_log_mad_function","../../src/am/fabric_provider.cpp",93,1,1,12,8%
"__free_context","../../src/external/mellanox/alog/src/core/media/alog_media.c",415,1,2,13,15%

I need to remove all the content that comes before the first "/src" in the second cell of each row:

"Function","Source","Line","FnCov","C/D Coverage","out of","%"
"sharp_coll_env2bool","/src/coll/util.c",176,1,1,34,2%
"TreeManager::SortTreeRootsByGroup","/src/am/tree_manager.cpp",1467,1,1,26,3%
"FabricGraph::MadSendRetry","/src/am/fabric_graph.cpp",2170,1,1,16,6%
"ibis_log_mad_function","/src/am/fabric_provider.cpp",93,1,1,12,8%
"__free_context","/src/external/mellanox/alog/src/core/media/alog_media.c",415,1,2,13,15%

So far, I tried the following:

sed -i -r 's|(["'\''],.*\/src)|"src"|g'

which does not handle the quotation marks.

awk -F, '{gsub(/\.*src/,"",$2); print}

which replaces all the content and ruins the file.

Ravi Saroch
  • 1
    On SO we encourage users to post the efforts they have made to solve their own problem, so kindly add them and let us know. You could also use the search functionality on SO to look for guidance on similar questions. – RavinderSingh13 Oct 28 '19 at 09:02
  • 1
    I have tried using the following commands: sed -i -r 's|(["'\''],.*\/src)|"src"|g' - which does not handle the quotation marks. awk -F, '{gsub(/\.*src/,"",$2); print} - which replaced all content and ruins the file' – Ofir Michael Oct 28 '19 at 09:05
  • 1
    Please add these commands as your efforts in your post, comments are not meant for posting codes – RavinderSingh13 Oct 28 '19 at 09:10
  • Try directly without any comma and quotes as you did on `sed` with same `|` separator. – sungtm Oct 28 '19 at 09:20

3 Answers

2

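EDIT: Adding a more generic solution here, which removes the text up to the very first src string only.

awk -v s1="\"" -v var="src" 'BEGIN{FS=OFS=","}
{
  for(j=1;j<=NF;j++){
     val=index($j,var)        # position of the first "src" in this field, 0 if absent
     if(val){
       # keep the opening quote, then "/src" plus everything after the first "src"
       $j=s1 "/" var substr($j,val+length(var))
     }
  }
}
1
'  Input_file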


Original answer: could you please try the following? I went for a field-by-field solution so that src appearing in more than one field is handled. Note that .*src is greedy, so within a field it strips everything up to the last src.

awk -v s1="\"" 'BEGIN{FS=OFS=","} {for(j=1;j<=NF;j++){if($j~/src/){sub(/.*src/,s1 "/src",$j)}}} 1'  Input_file
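
To see the difference between the two commands, here is a small sketch that runs both on the __free_context row from the question, whose path contains src twice. The variable names line, greedy and first are only for illustration, and the index-based variant uses length(var) rather than a literal 3.

```shell
line='"__free_context","../../src/external/mellanox/alog/src/core/media/alog_media.c",415,1,2,13,15%'

# Greedy: .*src consumes everything up to the LAST "src" in the field.
greedy=$(printf '%s\n' "$line" |
  awk -v s1="\"" 'BEGIN{FS=OFS=","} {for(j=1;j<=NF;j++) if($j~/src/) sub(/.*src/, s1 "/src", $j)} 1')

# index() returns the position of the FIRST "src", so only the prefix before it is dropped.
first=$(printf '%s\n' "$line" |
  awk -v s1="\"" -v var="src" 'BEGIN{FS=OFS=","} {for(j=1;j<=NF;j++){val=index($j,var); if(val) $j=s1 "/" var substr($j,val+length(var))}} 1')

printf '%s\n%s\n' "$greedy" "$first"
```

The first printed line should keep only /src/core/media/alog_media.c, while the second keeps the full /src/external/... path.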
RavinderSingh13
  • 1
    @Ofir Michael, Thank you for selecting the answer as the correct one, please do let me know in case of any queries – RavinderSingh13 Oct 28 '19 at 11:11
  • Your solution is very elegant, but it removes more than the first "/src/". Is there any adjustment I can make to remove only the first one? – Ofir Michael Oct 28 '19 at 11:16
  • awk: BEGIN{FS=OFS=","}{for(j=1;j<=NF;j++){val=index($j,var) if(val){$j=s1 "/src" substr($j,val+3) val=""}} val=""}1 awk: ^ syntax error – Ofir Michael Oct 28 '19 at 12:56
  • 2
    @OfirMichael, Not clear since back ticks were not added, could you please open a chat room now? We both can talk on this one then. – RavinderSingh13 Oct 28 '19 at 12:59
1

Simply :

sed 's,[^"]*src,/src,' <file>

(use the -i option to edit the file in place, and add g at the end if you need to replace more than one ../src per line). Note that [^"]* is greedy, so when a field contains src more than once, everything up to the last src in that field is removed, as in the last row below.

Output :

"Function","Source","Line","FnCov","C/D Coverage","out of","%"
"sharp_coll_env2bool","/src/coll/util.c",176,1,1,34,2%
"TreeManager::SortTreeRootsByGroup","/src/am/tree_manager.cpp",1467,1,1,26,3%
"FabricGraph::MadSendRetry","/src/am/fabric_graph.cpp",2170,1,1,16,6%
"ibis_log_mad_function","/src/am/fabric_provider.cpp",93,1,1,12,8%
"__free_context","/src/core/media/alog_media.c",415,1,2,13,15%

For something more robust, you can read this.
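
If you do edit in place, keeping a backup is cheap. A minimal sketch, assuming a throwaway file called cov.csv (a made-up name) inside a temporary directory; the -i.bak spelling should be accepted by both GNU and BSD sed:

```shell
# Work on a scratch copy so nothing real is touched.
tmp=$(mktemp -d)
cat > "$tmp/cov.csv" <<'EOF'
"Function","Source","Line","FnCov","C/D Coverage","out of","%"
"sharp_coll_env2bool","../../src/coll/util.c",176,1,1,34,2%
EOF

# -i.bak rewrites cov.csv in place and keeps the original as cov.csv.bak.
sed -i.bak 's,[^"]*src,/src,' "$tmp/cov.csv"

edited_row=$(sed -n 2p "$tmp/cov.csv")
backup_row=$(sed -n 2p "$tmp/cov.csv.bak")
printf 'edited: %s\nbackup: %s\n' "$edited_row" "$backup_row"
```

Once the edited file looks right, the .bak copy can simply be deleted.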


Edit :

Now it removes more than the first "/src/". Is there any adjustment I can do to remove only the first one? – Ofir Michael 1 hour ago

You can use perl for non-greedy regex :

perl -pe 's,[^"]*?src,/src,' <file>
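
One caveat with matching on the whole line: if a function name itself contains src, the first match lands in the first column instead of the path. A possible awk sketch that only touches the second column is shown here. The src_init row is a made-up example, and the approach assumes there are no commas inside the quoted fields (a templated C++ name such as f<a,b> would break the comma split).

```shell
# Hypothetical row: the function name in column 1 already contains "src".
row='"src_init","../../src/init.c",10,1,1,4,25%'

# Only column 2 is rewritten: keep the opening quote, add "/",
# then everything from the first "src" in that column onwards.
fixed=$(printf '%s\n' "$row" |
  awk 'BEGIN{FS=OFS=","} {p=index($2,"src"); if(p) $2="\"/" substr($2,p)} 1')

printf '%s\n' "$fixed"
```

With this variant the function name in column 1 is left alone and only the path is shortened.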

Input :

"Function","Source","Line","FnCov","C/D Coverage","out of","%"
"sharp_coll_env2bool","../../src/coll/util.c",176,1,1,34,2%
"TreeManager::SortTreeRootsByGroup","../../src/am/tree_manager.cpp",1467,1,1,26,3%
"FabricGraph::MadSendRetry","../../src/am/fabric_graph.cpp",2170,1,1,16,6%
"ibis_log_mad_function","../../src/am/fabric_provider.cpp",93,1,1,12,8%
"__free_context","../../src/external/mellanox/alog/src/core/media/alog_media.c",415,1,2,13,15%
"FabricUpdateType2Char","../mtrswgwork/dmitriyu/sharp_ws3/auto/mtrswgwork/dmitriyu/sharp_ws3/sharp/src/am/fabric_update.h",60,1,1,5,20% 
"CaPortType2Char","../auto/mtrswgwork/dmitriyu/sharp_ws3/auto/mtrswgwork/dmitriyu/sharp_ws3/sharp/src/am/port_data.h",56,1,1,5,20% 
"smx_init","../auto/mtrswgwork/dmitriyu/sharp_ws3/auto/mtrswgwork/dmitriyu/sharp_ws3/sharp/src/smx/smx.c",130,1,4,18,22% 
"dev_sa_response_method","../auto/dmitriyu/sharp/to/mtrswgwork/dmitriyu/sharp_ws3/sharp/src/sr/sr.c",30,1,2,9,22%

Output :

"Function","Source","Line","FnCov","C/D Coverage","out of","%"
"sharp_coll_env2bool","/src/coll/util.c",176,1,1,34,2%
"TreeManager::SortTreeRootsByGroup","/src/am/tree_manager.cpp",1467,1,1,26,3%
"FabricGraph::MadSendRetry","/src/am/fabric_graph.cpp",2170,1,1,16,6%
"ibis_log_mad_function","/src/am/fabric_provider.cpp",93,1,1,12,8%
"__free_context","/src/external/mellanox/alog/src/core/media/alog_media.c",415,1,2,13,15%
"FabricUpdateType2Char","/src/am/fabric_update.h",60,1,1,5,20% 
"CaPortType2Char","/src/am/port_data.h",56,1,1,5,20% 
"smx_init","/src/smx/smx.c",130,1,4,18,22% 
"dev_sa_response_method","/src/sr/sr.c",30,1,2,9,22%
Corentin Limier
  • 1
    does this guarantee 'remove all the content that comes before the **first** "/src"' ? – Kent Oct 28 '19 at 10:38
  • 1
    @Kent clearly, no. RavinderSingh13's solution may be more convenient if you have other characters than `.` and `/`. Link I provided on my solution may guide you to the most robust solution. – Corentin Limier Oct 28 '19 at 10:43
  • Anyway, please add some extra-examples if you have specific cases that should be handled on your question. – Corentin Limier Oct 28 '19 at 10:45
  • Your solution worked for some cases, but not for this one, for some reason: "FabricUpdateType2Char","../mtrswgwork/dmitriyu/sharp_ws3/auto/mtrswgwork/dmitriyu/sharp_ws3/sharp/src/am/fabric_update.h",60,1,1,5,20% "CaPortType2Char","../auto/mtrswgwork/dmitriyu/sharp_ws3/auto/mtrswgwork/dmitriyu/sharp_ws3/sharp/src/am/port_data.h",56,1,1,5,20% "smx_init","../auto/mtrswgwork/dmitriyu/sharp_ws3/auto/mtrswgwork/dmitriyu/sharp_ws3/sharp/src/smx/smx.c",130,1,4,18,22% "dev_sa_response_method","../auto/dmitriyu/sharp/to/mtrswgwork/dmitriyu/sharp_ws3/sharp/src/sr/sr.c",30,1,2,9,22% Any ideas? – Ofir Michael Oct 28 '19 at 11:02
  • Now it removes more than the first "/src/". Is there any adjustment I can do to remove only the first one? – Ofir Michael Oct 28 '19 at 11:40
  • @OfirMichael I edited using `perl` and non-greedy regexes. – Corentin Limier Oct 28 '19 at 13:21
0
sed 's|\.\./\.\./src|/src|g' Inputfile
Ravi Saroch