I receive data from a third-party API.

Sometimes the data is messy and contains a lot of whitespace.

I've tried using ltrim() and preg_replace(), without any luck:

$job['description'] = ltrim($job['description']);
$job['description'] = preg_replace('/^\s+/S', " ", $job['description']);

The 'whitespace' is always at the start of the string. The following is directly copied from the response:

                                    Cover Supervisors 

Any ideas on what else to try?

EDIT:

var_dump shows:

string(1664) "Cover Supervisors 
Leicester & Surrounding areas   
£65 - £110 per day
Required for the New Term 
Immediate Interviews 

Clarence Recruitment is currently looking for Cover Supervisors to work in Secondary schools across the Leicester area.  

• Do you have any experience of working with children 11-16yrs and have a good knowledge of the National Curriculum? 

• Are you looking to have a future within the education sector? 

• Would you like the freedom to choose the days you work? 

We are currently experiencing a high demand for flexible, part time and full time secondary cover supervisors. 

We are therefore looking to recruit temporary staff, ideally those who are looking to build a long term career in teaching and would like to gain further school based experience before starting a PGCE/SCITT or SD program.

We are currently working with numerous vacancies so if this position is not quite for you please do send us your CV to discuss other opportunities more suitable to your requirements.

Alan (your consultant) has been working the East Midlands Desk for over 5 years so his knowledge of the local schools is exceptional. He has also gained QTS so he knows the education system extremely well. Feel free to contact him for a more detailed chat about the roles we can help you fill. 

Clarence Recruitment is committed to safeguarding and promoting the welfare of children and young people and expects all staff and volunteers to share this commitment. The successful candidate will be required to have an enhanced DBS check in line with the Government’s Keeping Children Safe in Education guidelines (2014)"
rpsep2
  • `var_dump($job['description']);` – AbraCadaver Sep 28 '16 at 17:49
  • Couldn't the whitespace be a tab rather than a space? You're trying to remove spaces, but it may actually be a tab. – masterFly Sep 28 '16 at 17:52
  • Not reproducible, https://eval.in/652084; https://eval.in/652085. Maybe the whitespaces are entities? – chris85 Sep 28 '16 at 17:54
  • We need to see what is in the initial $job['description']. Printing it into a textarea might reveal characters that your browser normally hides. – Duane Lortie Sep 28 '16 at 17:58
  • You might have some invisible characters (for example Unicode non-breaking spaces) somewhere. – Funk Forty Niner Sep 28 '16 at 17:58
  • Can you supply a var_dump? – Funk Forty Niner Sep 28 '16 at 18:02
  • var_dump() doesn't seem to show this random whitespace? – rpsep2 Sep 28 '16 at 18:03
  • Post the results of it. How many characters does it show? – chris85 Sep 28 '16 at 18:04
  • see this also http://php.net/manual/en/function.ctype-space.php there's info on unicode. You need to detect it if it's present and this Q&A http://stackoverflow.com/questions/12837682/non-breaking-utf-8-0xc2a0-space-and-preg-replace-strange-behaviour and http://stackoverflow.com/questions/7085204/php-utf8-encode-converts-spaces-to-non-breaking-spaces and https://gist.github.com/tistre/40277b47155b7f5a678c – Funk Forty Niner Sep 28 '16 at 18:08
  • Okay, so you have `1627` characters in that string and `1664` characters total, so you have 37 unaccounted-for characters. View the page source: do you have `&nbsp;` or some other space entity? – chris85 Sep 28 '16 at 18:09
  • there are some '•' , '&' but nothing before the first word suggesting any whitespace – rpsep2 Sep 28 '16 at 18:13
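
Several comments above suggest checking for invisible characters. A minimal sketch of one way to do that is to look at the raw bytes at the start of the string and compare the byte count with the character count. The helper names `leadingHex()` and `charCount()` and the sample string are made up for illustration; the real API data may contain different characters.

```php
<?php
// Hypothetical helpers for inspecting a string; neither is part of PHP itself.

// Hex dump of the first $n bytes. "c2a0" is a UTF-8 non-breaking space,
// "20" a plain space, "09" a tab, "26" the "&" that starts an HTML entity.
function leadingHex(string $s, int $n = 8): string
{
    return bin2hex(substr($s, 0, $n));
}

// Number of UTF-8 characters (code points), as opposed to strlen()'s bytes.
// Assumes $s is valid UTF-8.
function charCount(string $s): int
{
    return (int) preg_match_all('/./su', $s);
}

// Made-up sample: two non-breaking spaces (U+00A0) before the text.
$description = "\u{00A0}\u{00A0}Cover Supervisors";

printf(
    "bytes: %d, chars: %d, leading hex: %s\n",
    strlen($description),
    charCount($description),
    leadingHex($description, 4)
);
```

If the byte count is larger than the character count, the string contains multi-byte characters, and the hex dump shows whether any of them sit in front of the first visible word.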

1 Answer

It's probably non-breaking spaces. Try removing them explicitly:

$job['description'] = str_replace('&nbsp;', '', preg_replace('/^\s+/S', ' ', $job['description']));

You can also apply htmlentities() in the middle, between the two calls, so that literal non-breaking space characters are turned into &nbsp; first; whether that's acceptable depends on the data you're working with.
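
As a rough sketch of another way to approach it, assuming the leading characters are either Unicode spaces (such as U+00A0) or HTML space entities, a single UTF-8-aware regex can strip both without touching the rest of the text. `trimLeadingSpace()` is a hypothetical helper name, and the list of entities is an assumption about what the API might send.

```php
<?php
/**
 * Strip leading whitespace, including Unicode separators and a few HTML
 * space entities. Hypothetical helper, not part of PHP.
 */
function trimLeadingSpace(string $text): string
{
    // /u treats pattern and subject as UTF-8, so \p{Z} can match Unicode
    // separators such as U+00A0; \s covers tabs and newlines. The entity
    // alternatives are an assumed list and can be extended as needed.
    $pattern = '/^(?:[\s\p{Z}]|&nbsp;|&#160;|&#xA0;)+/iu';

    // preg_replace() returns null on failure (for example invalid UTF-8);
    // fall back to the untouched input in that case.
    return preg_replace($pattern, '', $text) ?? $text;
}

// Made-up inputs for illustration.
echo trimLeadingSpace("\u{00A0}\u{00A0} \tCover Supervisors"), "\n";
echo trimLeadingSpace("&nbsp;&nbsp; Cover Supervisors"), "\n";
```

ltrim() only strips a fixed set of single-byte characters by default, so it leaves the two bytes of a UTF-8 non-breaking space alone, which would explain the behaviour described in the question if that is what the data contains.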

BVengerov