I'm building a French verb conjugation website in Rails where users can enter conjugated verb forms such as:

     se abstenir
     m'appelle
     êtes
     achète

I need to validate those entries with validates_format_of. The apostrophes are easy enough, but what about accented characters such as ê, è and ã?

So far I have:

    word_format = /\A[\w]+[' ]?[\w]*\z/
    validates_format_of (...), :with => word_format

This clearly doesn't work, since \w doesn't match them. Adding áêĩ(...) to the regexp also gives me an "invalid multibyte char (US-ASCII)" error.

I also need to upcase or downcase those strings, but Ruby ignores the accented characters, producing 'VOUS êTES', for example. The obvious fallback is to do it by hand, but I'm hoping Ruby/Rails will surprise me again.

It seems to be a hard problem, which I wasn't expecting given the power of Ruby/Rails.

Could anybody give me a clue?

alexandrecosta

2 Answers

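It looks like you need the POSIX bracket expression [[:alpha:]] instead of \w. Note the double brackets: a bare [:alpha] is an ordinary character class that matches only the characters :, a, l, p and h.

    word_format = /\A[[:alpha:]]+[' ]?[[:alpha:]]*\z/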
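
As a rough check, the pattern can be tried against the sample words from the question. This is only a sketch: `WORD_FORMAT` and `valid_word?` are hypothetical names chosen for illustration, and the POSIX class is written with double brackets, `[[:alpha:]]`, which matches Unicode letters when the string is UTF-8.

```ruby
# encoding: utf-8

# Hypothetical constant: the question's pattern with \w replaced by [[:alpha:]].
WORD_FORMAT = /\A[[:alpha:]]+[' ]?[[:alpha:]]*\z/

# Hypothetical helper: true when the whole string fits the pattern.
def valid_word?(word)
  !(word =~ WORD_FORMAT).nil?
end

# The first four samples come from the question; the last two should be rejected
# because digits and punctuation are not letters.
["se abstenir", "m'appelle", "êtes", "achète", "abc123", "vous êtes!"].each do |word|
  puts "#{word.inspect} valid? #{valid_word?(word)}"
end
```

In a model, the same regular expression would be passed through the `:with` option, as in the question.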
Community
ScottJShea

You'll need to install the UnicodeUtils gem for the upcasing.

    # encoding: utf-8
    require "unicode_utils/upcase"
    puts UnicodeUtils.upcase("êtes Niño") #=> ÊTES NIÑO
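
On Ruby 2.4 and later, the built-in `String#upcase` and `String#downcase` apply Unicode case mapping by default, so the gem may not be needed there. The sketch below assumes such a Ruby version; `shout` and `whisper` are hypothetical helper names used only for illustration.

```ruby
# encoding: utf-8

# Assumes Ruby >= 2.4, where the built-in case methods handle non-ASCII letters.
def shout(text)
  text.upcase
end

def whisper(text)
  text.downcase
end

puts shout("vous êtes")     # prints "VOUS ÊTES" on Ruby >= 2.4
puts whisper("ÊTES NIÑO")   # prints "êtes niño" on Ruby >= 2.4
```

On older Rubies the result stays 'VOUS êTES', as described in the question, which is where UnicodeUtils (or ActiveSupport's `mb_chars` in a Rails app) comes in.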

The regex could look like this:

    word_format = /\A[[:word:]]+[' ]?[[:word:]]*\z/

/[[:word:]]/ matches a character in one of the following Unicode general categories: Letter, Mark, Number, Connector_Punctuation. Because of the last two, it also accepts digits and underscores, not just letters.
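
To see how the categories listed above play out, here is a small comparison of `[[:word:]]` with the stricter `[[:alpha:]]`. It is a sketch only; `WORD_CLASS`, `ALPHA_CLASS` and `classify` are hypothetical names, and the sample strings are made up for illustration.

```ruby
# encoding: utf-8

WORD_CLASS  = /\A[[:word:]]+\z/   # letters, marks, numbers, connector punctuation
ALPHA_CLASS = /\A[[:alpha:]]+\z/  # letters (and combining marks) only

# Hypothetical helper: reports which of the two classes accepts the whole string.
def classify(text)
  {
    word:  !(text =~ WORD_CLASS).nil?,
    alpha: !(text =~ ALPHA_CLASS).nil?
  }
end

# "verbe2" contains a digit and "se_lever" an underscore, so only
# [[:word:]] accepts them; "êtes" passes both.
["êtes", "verbe2", "se_lever"].each do |text|
  puts "#{text}: #{classify(text)}"
end
```

If verbs should never contain digits or underscores, `[[:alpha:]]` may be the closer fit for the validation.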

Cœur
steenslag