0

Here is the Sample Google sheet file https://docs.google.com/spreadsheets/d/1B0CQyFeqxg2wgYHJpFxLIzw_8Pv067p0cwacWk0Nc4o/edit?usp=sharing

I have an Excel Sheet where I need to find Arabic Words and separate them.

For example, I have data like this:

//olyservice/GIS-TANSIQ01/Storage/46-أمانة منطقة عسير -بلدية بللحمر/حدود القري المطلوب اعتمادهاالمعتمد مسمايتها بالوزارة.rar

I'm looking for:

  • 1st column: أمانة منطقة عسير
  • 2nd column: بلدية بللحمر
  • 3rd column: RAR

If there is no أمانة and بلدية words, the columns should be blank.

I tried these methods, without success:

=RIGHT(MID(A2,FIND("-",A2,20)+1,255),25)

and

=TRIM(MID(SUBSTITUTE(A2,"",REPT(" ",99)),MAX(1,FIND("-",SUBSTITUTE(A2,"",REPT(" ",99)))+21),99))
Skorpion
  • 63
  • 8
  • 1
    This website requires you to include your own attempt too. Have a look at [ask] a question with a [mcve]. – JvdV Nov 07 '19 at 12:18
  • Your question is not clear to me since there seem to be ten Arabic language words. Are you looking for specific words? Words in a specific location? – Ron Rosenfeld Nov 07 '19 at 12:44
  • أمانة منطقة عسير -بلدية بللحمر looking thse two before the - its will come in 1 column and aftr the columns - word will come in another column – Skorpion Nov 07 '19 at 12:47
  • What version of Excel are you working with? – Ron Rosenfeld Nov 07 '19 at 14:20

1 Answers1

0

Since you specify certain key words to be found, we can look for those key words and then the relevant delimiter, based on your example.

In your example, أمانة is followed by the dash, and بلدية by the slash. (followed by is in terms of the right-to-left orientation of Arabic words).

Try this:

Col1: =MID(A1,FIND("أمانة",A1),FIND(CHAR(1),SUBSTITUTE(A1,"-",CHAR(1),LEN(A1) - LEN(SUBSTITUTE(A1,"-",""))))-FIND("أمانة",A1))

Col2:  =MID(A1,FIND("بلدية",A1),FIND(CHAR(1),SUBSTITUTE(A1,"/",CHAR(1),LEN(A1)-LEN(SUBSTITUTE(A1,"/",""))))-FIND("بلدية",A1))

Col3:  =TRIM(RIGHT(SUBSTITUTE(A1,".",REPT(" ",99)),99))

If the keywords are not found, the formula will return an Error. So you can just "wrap" the formula in IFERROR to have it return a blank if the key words are not present.

Edit:

The actual workbook does not have the same pattern as the sample you posted. In particular. Try this for column 2 data:

=MID(A2,FIND("بلدية",A2),99)

or with error suppression:

Col1:  =IFERROR(MID(A2,FIND("أمانة",A2),FIND("-",A2,FIND("أمانة",A2))-FIND("أمانة",A2)),"")

Col2:  =IFERROR(MID(A2,FIND("بلدية",A2),99),"")

And, the cells that are still returning the #VALUE! error do not have that keyword in the line.

For example:

A6: //olyservice/GIS-TANSIQ01/Storage/103-أمانة منطقة عسير -أحد رفيدة

does not contain بلدية

BTW, those formulas seem to both work on Sheets also.

Edit2:

Since you also posted an example in Sheets, if you can implement this in Sheets, you can use Regular Expressions to account for multiple terminations.

In that case, you would use:

=iferror(REGEXEXTRACT(A2,"(أمانة.*?)\s*(?:[-/\\.]|$)"),"")

or

iferror(REGEXEXTRACT(A2,"(بلدية.*?)\s*(?:[-/\\.\w]|$)"),"")

for the columns.

The regex extracts the pattern that begins with the key phrase, up to the terminator which can be any character in the set of -/\.A-Za-z0-9 or the end of the line. That seems to cover the examples in your sample worksheet, but if there are other terminators, you can add them to the sequence.

In Excel, this would require a VBA UDF to implement the Regex engine.

enter image description here

Ron Rosenfeld
  • 53,870
  • 7
  • 28
  • 60
  • Its worked but most getting Error #Value they have word which im looking can i share the file and see where im doing wrong?? – Skorpion Nov 07 '19 at 17:23
  • @Skorpion Is it an Excel file? The file reference you posted is, I think, a Google Sheets file, and that seems to handle right-to-left differently. – Ron Rosenfeld Nov 07 '19 at 17:41
  • Yes Its Excel file can i give you Gdrive link ?? – Skorpion Nov 07 '19 at 20:14
  • @Skorpion Yes, go ahead – Ron Rosenfeld Nov 07 '19 at 20:16
  • Okay give me too secs – Skorpion Nov 07 '19 at 20:17
  • Please find the link https://drive.google.com/open?id=11G7jiNaRYb3Q0m8_Ej0DIV0IQUCejWPcPE1EVvpgDxI – Skorpion Nov 07 '19 at 20:19
  • You can download fro google sheet as well i gve you permission ... – Skorpion Nov 07 '19 at 20:20
  • @Skorpion See my Edit – Ron Rosenfeld Nov 07 '19 at 21:05
  • till row 155 is okay aftr that have issue in Column R and In colum P most of have أمانة منطقة عسير -ب - and thn one word – Skorpion Nov 07 '19 at 21:12
  • @Skorpion You have multiple patterns in your data that are difficult to sort out. Once you specify all the possible patterns, a solution should be simpler. So far the only definitive pattern you've presented is that the phrase should start with `بلدية` or `أمانة`. But it seems the phrase may end with a variety of characters. You need to closely examine your data to ensure you have accounted for all of the possibilities. Then it should be relatively simple to devise a more reliable formula. Or, if there are non-unique endings,decide that a different algorithm is needed. – Ron Rosenfeld Nov 08 '19 at 01:38
  • @Skorpion Take a look at an implementation that works in **Sheets**. For the equivalent in Excel, you would require VBA. – Ron Rosenfeld Nov 08 '19 at 02:24
  • Thanks Ron Rosefeld Thanks u – Skorpion Nov 08 '19 at 09:31
  • @Skorpion Take a look at [How to use Regular Expressions (Regex) in Microsoft Excel both in-cell and loops](https://stackoverflow.com/questions/22542834/how-to-use-regular-expressions-regex-in-microsoft-excel-both-in-cell-and-loops) – Ron Rosenfeld Nov 08 '19 at 12:41
  • @Skorpion Also, have you gone through your **real** data to ensure that the Sheets formula will work? Do you know how to add a user defined function to Excel? (*you had not used the VBA tag on your question*). Those are things you need to do first. – Ron Rosenfeld Nov 08 '19 at 12:56