0

I'd like to extract the folder structure of an URL in excel. Lets say we have the following URL:

https://stackoverflow.com/questions/51632630/count-of-many-to-many-and-optimize-query

I'd like to extract the 2nd subfolder (51632630) with an excel formula. How do I realize this? Take in account:

  • URLs can include https://, but also be added without https://

Thnx!

Darren Bartrup-Cook
  • 18,362
  • 1
  • 23
  • 45
mh3982
  • 97
  • 6

3 Answers3

1

If you are okay with it, this should work for both mentioned cases and potential http.

If data is in A1 - formula in B1 would be:

=IF(LEFT(A1,4)="http",MID(A1, FIND("/", A1, FIND("/",A1, FIND("/", A1, FIND("//",A1)+1)+1)+1)+1,256),MID(A1, FIND("/", A1, FIND("/",A1, FIND("/", A1, FIND("/",A1)+1)+1)+1)+1,256))

And in C1 would be:

=LEFT(B1,FIND("/",B1)-1)

It produces a following result:

| A                                                                     | B                                 | C         |
|---------------------------------------------------------------------  |---------------------------------  |---------- |
| https://stackoverflow.com/questions/51633071/excel-extract-function   | 51633071/excel-extract-function   | 51633071  |
| stackoverflow.com/questions/51633071/excel-extract-function           | 51633071/excel-extract-function   | 51633071  |
zipa
  • 27,316
  • 6
  • 40
  • 58
  • What i'd like to see is: A - https://stackoverflow.com/questions/51633071/excel-extract-function B - question C - 51633071 D - excel-extract-function And for some URLs there are 6 levels; how can I do this? – mh3982 Aug 01 '18 at 12:58
  • Well, the nested `FIND` is a way to go, and this creates a basic solution that you can edit per your need. – zipa Aug 01 '18 at 13:25
0

You can use string manipulation functions SEARCH,MID and so on to archieve that. See this nice blog about string splitting https://www.ablebits.com/office-addins-blog/2016/06/01/split-text-string-excel/

EDIT: With the help of How can I perform a reverse string search in Excel without using VBA?, You can strip trailing words like this

  • A1="https://stackoverflow.com/questions/51632630/count-of-many-to-many-and-optimize-query
  • A2=SUBSTITUTE(A1,"/","|",LEN(A1)-LEN(SUBSTITUTE(A1,"/","")))
  • A3=RIGHT(A2, LEN(A2) - FIND("|",A2))
  • A4=LEFT(A2, FIND("|",A2)-1)
  • Nice post, but I aint get it working. I'd like to have formulas for every level: - questions - 51632630 unfortunately it doesn't work :( – mh3982 Aug 01 '18 at 12:53
0

With a udf?

Option Explicit
Public Sub test()
    MsgBox GetPathItem("https://stackoverflow.com/questions/51632630/count-of-many-to-many-and-optimize-query", 2)
End Sub
Public Function GetPathItem(ByVal path As String, ByVal position As Long) As String
    path = Replace$(path, "//", "##")
    GetPathItem = Replace$(Split(path, "/")(position), "##", "//")
End Function
QHarr
  • 83,427
  • 12
  • 54
  • 101