0

I have a spreasheet from an csv formatted exported report which contains an address field with multiple lines.

Here is an example of the cell contents (I have used an image to show the question mark at the end of each line):

enter image description here

I need to extract the suburb (in this case Brooklyn) from the address however the challeneg I have is there are some cases where there is an extra line in the address, for example:

enter image description here

In all cases the Suburb will always be the second to last line so my though was to somehow work backwards but I have no idea how to do it as a formula.

ANy help would be very much appreciated.

Spionred
  • 797
  • 2
  • 10
  • 27

1 Answers1

2

Let us say that the address is located in cell A1. Then, the following value would give you the last but one line.

It is derived by using the following pseudo-code:

MID
(A1, 
(the position of the [n-1]th new-line character) + 1, 
(the position of the [n]th new-line characters) - (the position of the n-1 new-line characters) - 1)

which is implemented using the formula below:

The # of occurrences of the new-line character is given by:

LEN(A1)-LEN(SUBSTITUTE(A1,"
",""))

The [n]th occurrence of the new-line character is given by:

FIND(CHAR(1),SUBSTITUTE(A1,"
",CHAR(1),(*n*-1)))

Thus, the final formula is given by:

">"&MID(A1,FIND("
",A1,FIND("
",A1)+((LEN(A1)-LEN(SUBSTITUTE(A1,"
","")))-2))+1,(FIND(CHAR(1),SUBSTITUTE(A1,"
",CHAR(1),LEN(A1)-LEN(SUBSTITUTE(A1,"
","")))))-(FIND("
",A1,FIND("
",A1)+((LEN(A1)-LEN(SUBSTITUTE(A1,"
","")))-2)))-1)&"<"

As you can see, > and < characters are concatenated in the final formula to ensure that there are no extra characters in the final string. You may omit these concatenations.

Reference:

How to find nth occurrence (position) of a character in text string in Excel? on ExtendOffice

Joseph B
  • 5,519
  • 1
  • 15
  • 19
  • Holly moley :-) Thanks for taking the time to answer. I will give it a go and report back. CHeers, Kevin. – Spionred Jul 14 '14 at 09:49
  • The final formula is not quite working as it is giving me the last line each time. There does seem to be a CR/LF at the end of the last line and a few spaces after that. I have tried playing with the numbers but can't seem to figure out what I need to change to get that second to last line. – Spionred Jul 14 '14 at 11:59
  • If I left() the cell to strip the last 5 chars the formula works great. I just need to firgure out how the work in the right funtion to your formula.....maybe :-) – Spionred Jul 14 '14 at 12:13
  • Nice answer, works as it should when I try it. Similar functionality can be found in [Excel: last character/string match in a string](http://stackoverflow.com/q/18617349/1380680) – Reinier Torenbeek Jul 14 '14 at 20:32
  • UPDATE: I could not find a way to trim the trailing CR/LF or spaces as it was not consistent across all records however I exported the report into XML then opened it up in Excel and this removed the trailing character meaning the above formula worked a treat. Thank you for your help! – Spionred Jul 15 '14 at 04:28
  • @ReinierTorenbeek Thank you for the feedback and the link to a related question! – Joseph B Jul 15 '14 at 07:50