1

I have listings for cars that I want to extract just the year from. Was going to use excel to do this but I think Im hitting the edge of its capabilities (or mine).

An example:

2000 lotus elise 
1996 Porsche 911 
Nissan Z280 from 1984

Id like to find a way to extract and print

2000
1996
1984 

Into a separate column

=TEXTJOIN("",TRUE,IFERROR((MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1)*1),""))

Works but the other numbers in the listings could throw it off. In these examples I would get

2000
1996911
1984280

Seems like there should be a way to do this in SQL, but I can't figure it out either. Would love any tips or pointers to get me going in the right direction

JvdV
  • 70,606
  • 8
  • 39
  • 70
CONTRA
  • 11
  • 1

4 Answers4

1

You may use:

=FILTERXML("<t><s>"&SUBSTITUTE(A1," ","</s><s>")&"</s></t>","//s[.*0=0][string-length(.)=4]")

enter image description here

If more 4 character digits may appear, it could be safer to check what range these numbers fall in:

=FILTERXML("<t><s>"&SUBSTITUTE(A1," ","</s><s>")&"</s></t>","//s[.*1>=1885][.*1<=2020]")

To be even more safe, add the previously used [string-length(.)=4] in there to prevent '0002000' from being accepted as a year.

The resulting array will spill down in Excel O365. In older vesions of Excel it will only return the first result. Which in your case is what you'll need. If for whatever reason you need another value from the array, use INDEX().

If this sparks your enthusiasm about FILTERXML, you may find this interesting.

JvdV
  • 70,606
  • 8
  • 39
  • 70
0

In Excel you could a custom function with RegEx, something like:

Function FindYear(str As String)

Dim regEx As Object, obj As Object
Dim strPattern As String

Set regEx = CreateObject("vbscript.regexp")

strPattern = "\b[0-9]{4}\b" 'looks for (4 consecutive numbers)

With regEx
    
       .Pattern = strPattern
       
        If .test(str) Then

            FindYear = .Execute(str)(0)
        
        Else
            FindYear = ""
        End If
        


End With
End Function

If you are not sure how to create custom function you can read about it here

Michal Rosa
  • 2,459
  • 2
  • 15
  • 28
0

Below is query tested in Oracle:

DDL:

create table tab1( str varchar2(25));

INSERT ALL
into tab1(str) values('2000 lotus elise')
into tab1(str) values('1996 Porsche 911')
into tab1(str) values('Nissan Z280 from 1984')
select * from dual;

Something like this if you table like above:

Select str, regexp_substr(str,'^\d{4}|\d{4}$') from tab1;

or

Select str, regexp_substr(str,'^[[:digit:]]{4}|[[:digit:]]{4}$') from tab1;

Output:

+-----------------------+------------------------------------+
|          STR          | REGEXP_SUBSTR(STR,'^\D{4}|\D{4}$') |
+-----------------------+------------------------------------+
| 2000 lotus elise      | 2000                               |
+-----------------------+------------------------------------+
| 1996 Porsche 911      | 1996                               |
+-----------------------+------------------------------------+
| Nissan Z280 from 1984 | 1984                               |
+-----------------------+------------------------------------+
The AG
  • 672
  • 9
  • 18
0

I don't think that any Excel formula can solve this perfectly therefore I would like to suggest VBA macro a UDF (User Define Function), help you to fix it.

enter image description here

Function GetYear(s As String)

With CreateObject("vbscript.regexp")
  .Pattern = "\b\d{4}\b"
  If .Test(s) Then GetYear = .Execute(s)(0)
End With

End Function

N.B.

  • Use this code as Module.

  • You need to insure that Microsoft VBscript Regular Expression 5.5 is available as references.

  • Follow these steps to check it:

    1.In VB editor hit TOOL Tab then References.

    2.Find VBscript Regular Expression 5.5 and check it.

  • Save the Workbook as Macro Enabled.

  • In cell D66 enter this formula.

    =GetYear(C66)

You may adjust cell references in the formula as needed.

Rajesh Sinha
  • 197
  • 3
  • 8
  • Rajesh, thanks so much for you answer. Im running excel 365 for mac. When I hit the TOOL tab and then go to references I dont see VBscript Regular Expression 5.5. It is saved as a xlsm. When Im entering the formula in what would be column D in your example its returning just 0. Any other tips? – CONTRA Jul 27 '20 at 17:13
  • Hi @CONTRA,,, if the DLL file is out of References then you need do download `vbscript.dll`, from any reliable site and store in the location `C:\Windows\System32\vbscript.dll`,,, for that VBA macro this is mandatory ☺ – Rajesh Sinha Jul 28 '20 at 06:16
  • @CONTRA,,, All versions of Windows desktop Excel from Excel 5 through Excel 2019 support VBA. Also this includes the version of Excel installed by Office 365. The version installed by Office 365 known as Excel 2016—but it has more recent features than Excel 2019. – Rajesh Sinha Jul 28 '20 at 08:58