0

I want to display a short description of each article on the home page. The descriptions are a mix of Thai and English.

I am using this function to get the string length:

mb_strlen($str, 'UTF-8');

but this is not accurate: some descriptions end up on just one line and some run to three lines, and I want every description to take up two lines.

If the length is greater than 155, I do:

$descr = mb_strlen($descr, 'UTF-8') > 155 ? substr($descr, 0, 152) . '...' : $descr;

Thank You.

hippietrail
Shishant
  • 1
    I understand neither the problem (I don't see what the business with the lines is all about) nor your question. Can you clarify? – Pekka Feb 25 '10 at 16:00
  • I don't understand either. :/ – Teekin Feb 25 '10 at 16:02
  • He probably wants to do this: http://stackoverflow.com/questions/2154220/truncate-a-multibyte-string-to-n-chars and is having the issue that `mb_strlen` and `strlen` count some chars twice, due to them being multibyte. – Gordon Feb 25 '10 at 16:02
  • 2
    By the way, as far as I remember, you ought to use mb_substr instead of substr in this case – shuvalov Feb 25 '10 at 16:03
  • The reason I am using strlen is to show only short descriptions on the home page of the site. Because the strlen function is not accurate for non-English text, the descriptions break the design of the site: some are too short and some too long – Shishant Feb 25 '10 at 16:03
  • @Gordon the problem with the function in the link you mentioned is that it adds `...` even if the description is shorter than 150 chars – Shishant Feb 25 '10 at 16:17
  • @Shishant The function will truncate to any length you specify for `$chars`. So your question really is "how can I truncate a multibyte string to 150 chars?" Correct? We are still having problems understanding what you actually want. – Gordon Feb 25 '10 at 17:10

3 Answers

5

Glyphs, the graphical representations of characters, have different widths in different fonts. Just compare the m with i:

mmmmmmmmmm
iiiiiiiiii

Both characters are repeated ten times. But the glyph of the m is much broader than the glyph of the i.

So you cannot conclude the width of a string's graphical representation from its number of characters (except for monospaced fonts).

Gumbo
  • So... how would you do it? :) – Adam Kiss Feb 25 '10 at 16:09
  • +1 for correctly (I think) interpreting an unclear question :) – Nick Meyer Feb 25 '10 at 16:10
  • That's not a big deal, I understand this, but the difference is caused not by widths but by the strlen differences between languages – Shishant Feb 25 '10 at 16:12
  • @Shishant: Can you give an example? – Gumbo Feb 25 '10 at 16:19
  • @Gumbo: 3 Lines `Description:2NE1 - Fire & I Don't Care on Music Core (2010-02-20) SS: http://img.fakrub.com/fakRubDownload.php?id=3756_4B80DCB6 ตัวเล็ก คุณภาพใหญ่ Enjoy The Show !` 1 line `บางครั้งเราก็มองข้ามสิ่งเล็กๆ น้อยๆ ไป เพียงเพราะใช้�...` – Shishant Feb 25 '10 at 16:24
  • @Shishant: So the first example is just *displayed* in three lines? – Gumbo Feb 25 '10 at 16:43
  • @gumbo/@shrishant: My understanding is that we are talking about a web application. So if it's a web page, would it be better to push the entire text to the page and have JavaScript and CSS do the rest of the work? – Anil Namde Feb 25 '10 at 16:45
  • @Gumbo Yes, it displays in 3 lines, and in one line after truncating. – Shishant Feb 25 '10 at 17:07
  • @Shishant: Well, as I said: Different glyphs have different dimensions. You would need to know the actual glyphs and their dimensions to get exactly one line. The best is to do this on the client side with JavaScript. – Gumbo Feb 25 '10 at 17:17
  • @Shishant: Take a look at this: http://adamhooper.com/bodacity/playground/jquery.excerpt.html – Gumbo Feb 25 '10 at 17:56
  • You cannot conclude the width of its graphical representation from the number of characters for monospaced fonts either. For example, the text may contain combining characters. – Mechanical snail Aug 23 '11 at 18:59
2

substr is unsafe to use on UTF-8 data because it cuts by bytes and can split a multibyte character in half. Use mb_substr instead.
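
As a rough sketch of what that could look like for the truncation in the question: the helper name `truncate_description` and its parameters are made up for illustration, and it assumes the mbstring extension is loaded. It counts and cuts in characters rather than bytes, and only appends `...` when something was actually cut off.

```php
<?php
// Hypothetical helper (not from the question or the answers): truncate a
// UTF-8 string to at most $limit characters, including the ellipsis.
function truncate_description(string $text, int $limit = 155, string $ellipsis = '...'): string
{
    // Count characters, not bytes, so Thai text is measured correctly.
    if (mb_strlen($text, 'UTF-8') <= $limit) {
        return $text; // already short enough, leave it untouched
    }

    // Reserve room for the ellipsis so the result never exceeds $limit.
    $keep = max(0, $limit - mb_strlen($ellipsis, 'UTF-8'));

    // mb_substr cuts on character boundaries, so no character is split.
    return mb_substr($text, 0, $keep, 'UTF-8') . $ellipsis;
}

// Sample text taken from the comments above; the limit of 10 is arbitrary.
$descr = 'ตัวเล็ก คุณภาพใหญ่ Enjoy The Show !';
echo truncate_description($descr, 10), "\n";
```

This keeps the result valid UTF-8, but it still only limits the number of characters. It does not guarantee how many lines the text takes up when rendered.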

troelskn
0

If you want to prevent entries with 3 or more lines, first split the string by "\n" and then trim it with mb_substr.
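
One possible reading of this suggestion, as a sketch only: keep the first two newline-separated lines, then cut the result by characters. The function name `first_lines` and its defaults are assumptions for illustration, and the mbstring extension is assumed to be available.

```php
<?php
// Hypothetical helper: keep at most $maxLines explicit lines of $text,
// then limit the result to $limit characters.
function first_lines(string $text, int $maxLines = 2, int $limit = 155): string
{
    // Normalise Windows and old Mac line endings to "\n" before splitting.
    // Double quotes matter here: '\n' in single quotes is a literal backslash-n.
    $normalised = str_replace(["\r\n", "\r"], "\n", $text);

    // Split on newlines and keep only the first $maxLines pieces.
    $lines = array_slice(explode("\n", $normalised), 0, $maxLines);

    // Re-join and cut by characters (not bytes) so UTF-8 stays intact.
    return mb_substr(implode("\n", $lines), 0, $limit, 'UTF-8');
}

echo first_lines("บรรทัดแรก\nsecond line\nthird line"), "\n";
```

This only handles line breaks that are actually present in the string. Lines created by the browser wrapping long text are not affected by it.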

shuvalov