I am trying to write a function to parse the string representation of a musical chord.

Example: C major chord -> Cmaj (this is what I want to parse)

Just to make it clear, a chord is made of three different parts:

  • the note (C, D, E, F, G, A, B)
  • the accidentals for that note (#, ##, b, bb)
  • the chord name

For the music-savvy: I am deliberately not considering slash chords.

The function below almost works, but it still gets the last of these cases wrong:

  • "C#maj" # matches and should
  • "C#maj7" # matches and should
  • "C#maj2" # matches and shouldn't

I suppose that forcing the chords part of the regex to match at the very end of the input would do the trick. I have tried putting $ both before and after that String, but it didn't work.

Any idea? Thanks.

public static void regex(String chord) {                
    String notes = "^[CDEFGAB]";
    String accidentals = "[#|##|b|bb]";
    String chords = "[maj7|maj|min7|min|sus2]";
    String regex = notes + accidentals + chords; 
    Pattern pattern = Pattern.compile(regex);
    Matcher matcher = pattern.matcher(chord);
    System.out.println("regex is " + regex);
    if (matcher.find()) {
        int i = matcher.start();
        int j = matcher.end();
        System.out.println("i:" + i + " j:" + j);           
    }
    else {
        System.out.println("no match!");
    }
}
Matthieu Brucher
nunos
  • The pattern between `C#maj2` and `C#maj7` is identical (C#maj\d), so differentiating between them is not really a job for regex. I would grab all instances of that pattern and then use some more string patterns to validate. You could however build a regex that includes all accepted digits as literals. – David B Jun 27 '12 at 14:57
  • I would make static collections containing all possible chords. Then see if the string representation is found in the collections. – hovanessyan Jun 27 '12 at 15:00
  • "...For those, music savvy, I am not considering slash chords..." Just for the record: You are ignoring *most* chords, not just slash chords. – Willem van Rumpt Jun 27 '12 at 15:34
  • Yes, you're right. What I meant was that I have no intention of adding matching for slash chords, since that would mean changing the structure of the regex itself. I do intend to add more chords to the `chords` String. I just didn't do that here for the sake of clarity. – nunos Jun 27 '12 at 15:41
  • No worries, I just had my little weekly pedantic moment :) There are various "dialects" for chord notation (and dialects within them) , and you can even more or less just mix'n'match parts between dialects. Musicians (with a theoretical background at least) across the world will understand any combination, but it's almost impossible to come up with a *simple* computerized matching algorithm that can do the same. (I've been there ;) ) – Willem van Rumpt Jun 27 '12 at 15:50

4 Answers

Change [ and ] to ( and ) in the following lines:

String accidentals = "(#|##|b|bb)";
String chords = "(maj7|maj|min7|min|sus2)";

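Otherwise you're just making character classes, so [maj7|maj|min7|min|sus2] simply matches one character from the set m, a, j, 7, |, i, n, s, u, 2. In C#maj2 it matches only the letter m, and nothing after that is checked.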
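To make the character-class behaviour concrete, here is a small sketch that runs the question's original pattern unchanged and prints what it actually consumes. The class and method names (`CharClassDemo`, `consumed`, `originalFinds`) are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CharClassDemo {
    // The question's original pattern: each bracketed part is a character class,
    // so each one matches exactly ONE character from the listed set.
    static final Pattern ORIGINAL =
        Pattern.compile("^[CDEFGAB][#|##|b|bb][maj7|maj|min7|min|sus2]");

    // Returns the text the original pattern consumed, or null if it found nothing.
    static String consumed(String chord) {
        Matcher m = ORIGINAL.matcher(chord);
        return m.find() ? m.group() : null;
    }

    // Returns true if the original pattern finds a match at the start of the input.
    static boolean originalFinds(String chord) {
        return ORIGINAL.matcher(chord).find();
    }

    public static void main(String[] args) {
        System.out.println(consumed("C#maj2"));    // C#m  (only three characters are checked)
        System.out.println(consumed("C|7"));       // C|7  ('|' is a literal member of the class)
        System.out.println(originalFinds("Cmaj")); // false (the accidental class is not optional)
    }
}
```

This is why C#maj2 "matches": the pattern stops after C#m and never looks at the rest of the string.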

I'm guessing you also want to add an ending anchor $? I see you had problems with that before, but that's probably because of the aforementioned issue.


Also, you might want (#|##|b|bb) to be optional, i.e., followed by a ?, which gives (#|##|b|bb)?.
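Putting those three suggestions together (groups instead of character classes, an ending anchor, and an optional accidental), a minimal Java sketch could look like the following. The class name `ChordRegex` and the method `isChord` are assumptions for illustration, and the chord list is just the one from the question.

```java
import java.util.regex.Pattern;

public class ChordRegex {
    private static final String NOTES = "[CDEFGAB]";
    // A group with alternatives; the trailing "?" makes the accidental optional.
    private static final String ACCIDENTALS = "(##|#|bb|b)?";
    // A group, not a character class, so whole chord names are matched.
    private static final String CHORDS = "(maj7|maj|min7|min|sus2)";

    // "^" and "$" anchor the pattern to the start and end of the input.
    private static final Pattern CHORD =
        Pattern.compile("^" + NOTES + ACCIDENTALS + CHORDS + "$");

    static boolean isChord(String chord) {
        return CHORD.matcher(chord).find();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"C#maj", "C#maj7", "C#maj2", "Cmaj", "Bbmin7", "Hmaj"}) {
            System.out.println(s + " -> " + isChord(s));
        }
    }
}
```

With the anchors in place, C#maj and C#maj7 are accepted and C#maj2 is rejected. Calling `matcher.matches()` without the explicit anchors would be another way to require that the whole string is a chord.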

Wiseguy

Forgive the JavaScript, but on a purely regex point, this pattern seems to work. You didn't stipulate which numbers are allowed after which chord names, so I've assumed 2 is allowed only after 'sus' and 7 only after 'min' and 'maj'.

var chords = "C#maj7 C##maj Bbmaj7 Abmin2 Cbmin Dsus";
var valid_chords = chords.match(/\b[CDEFGAB](?:#{1,2}|b{1,2})?(?:maj7?|min7?|sus2?)\b/g);
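Since the question is in Java, here is a rough port of the same pattern, as a sketch only. The names `ChordScanner` and `findChords` are hypothetical; the regex itself is the one above with the backslashes escaped for a Java string literal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ChordScanner {
    // Same pattern as the JavaScript version; "\\b" is a word boundary.
    private static final Pattern CHORD = Pattern.compile(
        "\\b[CDEFGAB](?:#{1,2}|b{1,2})?(?:maj7?|min7?|sus2?)\\b");

    // Collects every chord found in the text, like String.match with the "g" flag.
    static List<String> findChords(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = CHORD.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // Abmin2 is skipped: there is no word boundary between "min" and "2".
        System.out.println(findChords("C#maj7 C##maj Bbmaj7 Abmin2 Cbmin Dsus"));
        // [C#maj7, C##maj, Bbmaj7, Cbmin, Dsus]
    }
}
```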
Mitya

Building on Wiseguy's answer, I improved your regex matching. I had to move # out of the accidentals variable and place it after the closing \b: # is not a word character, so a \b directly after a trailing # fails to match.

Bonus: It even matches chords like Dsus9, D7 etc.

Forgive the JavaScript, but this is the code I ended up using:

var notes = "[CDEFGAB]",
  accidentals = "(b|bb)?",
  chords = "(m|maj7|maj|min7|min|sus)?",
  suspends = "(1|2|3|4|5|6|7|8|9)?",
  sharp = "(#)?",
  regex = new RegExp("\\b" + notes + accidentals + chords + suspends + "\\b" + sharp, "g");

var matched_chords = "A# is a chord, Bb is a chord. But H isn't".match(regex);

console.log(matched_chords);
Amit

I don't have enough reputation to comment on Amit's post.

This is what I used for this. The important thing is to check for 'maj' and 'min' before 'm'; otherwise 'm' wins first and a chord like C#maj7 is falsely matched as just C#m. Technically this would allow for chords like C#9000, but I'm guessing that won't be much of a problem in your case.

[A-G](b|#)?(maj|min|m|M|\+|-|dim|aug)?[0-9]*(sus)?[0-9]*(\/[A-G](b|#)?)?
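The ordering point can be checked with a short Java sketch. It uses a trimmed-down version of the pattern above, so the two patterns and the names `AlternationOrder` and `firstMatch` are illustrative assumptions, not the full regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AlternationOrder {
    // 'm' listed first: the alternation takes it and never tries 'maj'.
    static final Pattern SHORT_FIRST = Pattern.compile("[A-G](b|#)?(m|maj|min)?[0-9]*");
    // 'maj' and 'min' listed before 'm', as in the answer's pattern.
    static final Pattern LONG_FIRST = Pattern.compile("[A-G](b|#)?(maj|min|m)?[0-9]*");

    // Returns the first match of the pattern in the text, or null if there is none.
    static String firstMatch(Pattern p, String text) {
        Matcher m = p.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch(SHORT_FIRST, "C#maj7")); // C#m
        System.out.println(firstMatch(LONG_FIRST, "C#maj7"));  // C#maj7
        System.out.println(firstMatch(LONG_FIRST, "C#m"));     // C#m
    }
}
```

Alternation in Java regexes is ordered: the first alternative that lets the overall match succeed is used, not the longest one. Anchoring the pattern at the end of the input would also force the engine to backtrack into the longer alternatives.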
Scott Jodoin