
I am wondering if it is possible to create a regular expression which will allow digits, letters and spaces but no punctuation.

What's happening is that I have an online loan application, and in the street address field some users are adding commas (,) to their input. When the applications are exported to CSV, the file has commas in the wrong places because the address field contains commas.

I have been playing around with a regex for a few hours now and it's coming along well, but it isn't perfect. Every now and then I get an address that fails validation even though it has no punctuation. It would otherwise be an acceptable input, but the regex rejects it.

So instead of trying to stipulate what is allowed (numbers, spaces, letters), can I not just stipulate what is not allowed (punctuation)? The latter makes a whole lot more sense than the former.

For informational purposes I have included the list of submitted addresses that I use to test my regex. I have also included the regex I have been working on. Any guidance or assistance is greatly appreciated.

Note: I have added comments where an invalid input is. Below the invalid input I have placed an edited copy of the input which is now valid.

Addresses:

3 TINDERWOOD CRESCENT



117HEIDERAND FLATS EESSENTHOUT STREET HEIDERAND 6511 -- no space after number

117 HEIDERAND FLATS EESSENTHOUT STREET HEIDERAND 6511



3289 ext5 edenpark

120411 mpande loation

1433b Moulton Avenue

1433 b Moulton Avenue

4 diesel rd

10821 Morubisi Street



Unit 44 Charis Place, Prosperity Road, Groblerspark -- punctuation

Unit 44 Charis Place Prosperity Road Groblerspark -- starts with word

44 Charis Place Prosperity Road Groblerspark



p box 3581      -- invalid street address



82 Akasia laan

987 leruleng sectionsaulspoort

1 lenton drive

1179 gugulethu street

1 lenton drive



10269verdwaal2  -- no spaces

10269 verdwaal 2



15 Prinsloo Street

1179 gugulethu street



13 Adler str, Eden Park -- has punctuation

13 Adler str Eden Park



410 wonderzicht, 538 de beerstr -- has punctuation

410 wonderzicht 538 de beerstr



1551        -- invalid street address



52 Koedoe Ave

26a high road orchards



Musina ext9 4363    -- number in wrong place - See below

4363 Musina ext9



18 Replubliek street

753 steve biko avenue unit2

221 Buitekant straat

54 Zone A

B1287 UMSOLWA ROAD

54 Zone A

6574 LETLHODI STREET UNIT 14

27 k street

9635 Zanemvula street

2667 nthatisi Street

498 CESSNA AVENUE



maromene 17         -- number in incorrect place - see below

17 maromene



F1 Ngoje cres MarainnhillM

6574 letlhodi street unit 14

21 tecomastreet

21 tecomastreet

07 Frara Drive

6440 MAYANA STREET THEMBALETHU GEORGE 6529



no.87 Vista Villas      -- has word before number and has punctuation

87 Vista Villas



12 jama street z section    --- !!!!!!!!!!!!!!!!!!!!!!!!



B1287 Umsolwa road

B 1287 Umsolwa road     -- space after B



6574 letlhodi street

7658 Itsomo Steet Ext6



5dharrisson str     -- no space after number

5d harrisson str



322 Lenham Drive, Lenham    -- has punctuation

322 Lenham Drive Lenham



schoolstreet 3      -- name before number

3 schoolstreet



50 Hercules Court

1546 sefatsa stand

61 sixwila street

20800 mamelodi east



colchester crescent



12 Iraq Street

12 moshoeshoe str

21 vansoelen street

12 moshoeshoe str

4102 Geelhout Street



ward 16, umzumbe, hibberdene        -- has punctuation

ward 16 umzumbe hibberdene



839 Maokeng ext



28caledon drive



34 Leo ave



6423 Bandura Street, Willow Manor Ext. 1



8541 Snake Park

7 INSISWA WAY EXT 12 SHERWOOD PARK 7349



plot 50 kareebos



404 Windermere street

404 Windermere street

404 Windermere street

68 Flamink Street

68 Flamink Street

71 Eagle Dawn Zeiss road



block r3 room 105 n2 gateway



50 plein str

B206 Chapters

225 Buitenkant Street

9 Chestnut street Bonteheuwel

1740a Ben Naude Drive Zone 2



Droedam Farm



16 Aberfeldy Road The Hill



11th road 25 simonsig noordwyk



49 Soetdoring



Allensnek


12935


29 likhonda street

6 Matume street



Plot 88



22 Community Road


1100 kwamakhutha township, po amanzimtoti



9 balmoral heights, balmoral road



3 makubalo streets

3951 umthungwa street birch acres ext 23

12986 Walter Sisulu Street

11138 nongoma str

51 rudloff rd



10 Villa Palazzo, Belami Drive



321 Francis Baard Street

65 9 th street

35 Fiskaal street

14 olive street

9 manho street

38 victoria road

354 Ethafeni Section

16 Arkeldien Street



6.5 Boundary Road



100 Vlamboom Road

902 Mquma Street Bophelong

35 amsterdam street

55 fanplam gardens palmview

6213 Zone 12 Extion Sebokeng



stand no.043 ooslope



3 MURISON STREET

177 govan mbeki township

9 Chestnut street Bonteheuwel

4347 BLOCK B

1a roy cambell street

146 poole avenue

116 Woburn Street

545 Phase 5B Buhlepark

33 vos street



37 fifth street , rusthof



18 Marinda Crescent Marinda

16 chromite ave



13 kanti street 35263,Harare



67 Third Street

270 queen elizabeth ave

30 Victoria

9 manho str

26 Liebenhof Flats Young Str

1 Carter Street



10 Villa Palazzo, Belami Drive



11 vlasblomsingel progress



block 44, 513 sunset avenue



1073 mphafa road

1344 Pickerel place

1073 mphafa road



sweatpealaan 2751



44 Brummerstreet

151 Ext 3

433 10 kodi street

617 rondomstraat Louterwater

10 GREEN STREET

149 tamsanqa street

3 Newfeld Street

76 Dorp Street

5 Mabille Street

28 heathbury place

2356 joyce ndinisa road

A13 Lekoane street

Regex:

/^(\w*)(\d+)(\w)?\s(\w+(\s?\w+?))+$/
Kbam7
  • You mean like this one from which you can add a space and remove the underscore? http://stackoverflow.com/questions/336210/regular-expression-for-alphanumeric-and-underscores Or any number of other examples you could search for "alphanumeric regex" – Marc Nov 12 '15 at 15:52
  • asked before : http://stackoverflow.com/questions/4328500/how-can-i-strip-all-punctuation-from-a-string-in-javascript-using-regex – sudo_dudo Nov 12 '15 at 15:54
  • If you take the route Marcos Pérez Gude suggested, you can let users enter almost anything ... which might be of interest. – Asons Nov 12 '15 at 16:28
  • To allow A-Z, digits and spaces: `/^[A-Z0-9\s]+$/i` but sounds like you want to validate format too. – bobble bubble Nov 12 '15 at 16:44
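For reference, the two approaches raised in the question (say what is allowed, or say what is not allowed) can be sketched as follows. This is a minimal sketch rather than a full address validator: the function names are hypothetical, and the second pattern assumes a JavaScript engine that supports Unicode property escapes (`\p{...}` with the `u` flag, ES2018).

```javascript
// Whitelist: only ASCII letters, digits and whitespace are accepted.
function isAlphanumericWithSpaces(input) {
    return /^[A-Z0-9\s]+$/i.test(input);
}

// Blacklist: anything is accepted except Unicode punctuation (\p{P})
// and symbols (\p{S}). Assumes ES2018 property-escape support.
function hasNoPunctuation(input) {
    return /^[^\p{P}\p{S}]+$/u.test(input);
}

console.log(isAlphanumericWithSpaces('13 Adler str, Eden Park')); // false (comma)
console.log(isAlphanumericWithSpaces('13 Adler str Eden Park'));  // true
console.log(hasNoPunctuation('6.5 Boundary Road'));               // false (period)
console.log(hasNoPunctuation('82 Akasia laan'));                  // true
```

The blacklist version also accepts non-ASCII letters such as é, which the whitelist version rejects. Neither pattern checks the layout of the address (number first, then street name).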

2 Answers


Why do you need a regular expression for this? You can do something as simple as this:

 var address = "My address, with commas not allowed";
 // Note: with a string pattern, replace() removes only the first comma.
 var newAddress = address.replace(",","");

As I said in the comments, this method lets your users write the address however they want without breaking your CSV separators. They will be able to write an address like this:

Street Nowhere Nº 10 - 2 (12000)

And it doesn't break your CSV. With the regular expression, this address would become:

Street Nowhere N 10 2 12000

And this is less readable.
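To make the comparison concrete, here is a minimal sketch of both options. The function names `removeCommas` and `stripPunctuation` are hypothetical, and the stricter variant assumes that only ASCII letters, digits and whitespace should survive.

```javascript
// Remove every comma. The g flag makes replace() act on all matches,
// whereas a string pattern such as "," only replaces the first one.
function removeCommas(address) {
    return address.replace(/,/g, '');
}

// Stricter variant: drop everything except ASCII letters, digits and
// whitespace, then collapse the runs of spaces left behind.
function stripPunctuation(address) {
    return address
        .replace(/[^A-Za-z0-9\s]/g, '')
        .replace(/\s+/g, ' ')
        .trim();
}

console.log(removeCommas('ward 16, umzumbe, hibberdene'));
// ward 16 umzumbe hibberdene
console.log(stripPunctuation('Street Nowhere Nº 10 - 2 (12000)'));
// Street Nowhere N 10 2 12000
```

Removing only the commas keeps the rest of the address as the user typed it, while the stricter variant produces the less readable form shown above.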

Marcos Pérez Gude
  • I guess _...regular expression which will allow digits, letters and spaces but no punctuation._ will make it difficult with this method :) – Asons Nov 12 '15 at 15:58
  • OP's problem is the comma separator when the CSV is read. I propose a simple method that lets users write what they need without breaking the CSV separators. But if downvoting makes you feel better, you are welcome to your downvotes. – Marcos Pérez Gude Nov 12 '15 at 16:08
  • I didn't down vote, I commented instead. And I think you could be clearer about why you suggest the method you did (as you were in your comment), which might save you from further down votes. – Asons Nov 12 '15 at 16:15
  • OK, I accept the recommendation and will edit my answer. Thank you :) – Marcos Pérez Gude Nov 12 '15 at 16:20
  • This won't work if there are multiple commas, or if there's a double quote in a value. See http://stackoverflow.com/a/33676623/227299 – Ruan Mendes Nov 12 '15 at 16:37
  • Yeah, depending on how the CSV is constructed. However, the OP isn't showing any signs of life... – Marcos Pérez Gude Nov 12 '15 at 16:50
1

A better solution is to wrap any field that contains a comma in double quotes. See https://stackoverflow.com/a/769675/227299. You will also need to escape any double quotes inside the field by doubling them.

If you are doing this in JavaScript, you could do:

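// Quote a value only when it contains a comma or a double quote.
function escapeCsvValue(val) {
    if (val.indexOf(',') > -1 || val.indexOf('"') > -1) {
        // Double any embedded quotes, then wrap the whole value in quotes.
        return '"' + val.replace(/"/g, '""') + '"';
    }
    return val;
}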
var csv = [
    ['Name, CommaForFun', 'Address'], 
    ['Jack','Calle Ocidental, 2'], 
    ['Jill O"quote"','125, The "cool" place'], 
    ['John','123 Main st'], 
];
   
var escapedCsv = csv.map(function(row){
    return row.map(escapeCsvValue).join(',');
});

var csvString = escapedCsv.join('\n');
console.log(csvString);
/*
"Name, CommaForFun",Address
Jack,"Calle Ocidental, 2"
"Jill O""quote""","125, The ""cool"" place"
John,123 Main st
*/
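The function above quotes only on commas and double quotes. RFC 4180 also says that fields containing line breaks should be enclosed in double quotes, so a slightly extended sketch might look like the following. The names `toCsvField` and `toCsvLine` are hypothetical, and coercing every value with `String()` is an assumption about how non-string values should be written.

```javascript
// Quote a field if it contains a comma, a double quote or a line break.
function toCsvField(value) {
    var text = String(value); // assumption: numbers etc. are written as-is
    if (/[",\r\n]/.test(text)) {
        // Double embedded quotes, then wrap the field in quotes.
        return '"' + text.replace(/"/g, '""') + '"';
    }
    return text;
}

// Join the escaped fields of one row with commas.
function toCsvLine(fields) {
    return fields.map(toCsvField).join(',');
}

console.log(toCsvLine(['Jack', '10 Villa Palazzo, Belami Drive', 6511]));
// Jack,"10 Villa Palazzo, Belami Drive",6511
```

With this approach the address field can keep its commas, so no validation regex is needed just to protect the CSV export.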
Ruan Mendes