Goal
My goal is to programmatically determine which chord is played at which point in time in any song available on Spotify with reasonable accuracy.
What I have so far
I already wrote a script that basically does this. The issue is that it is not accurate enough.
What data Spotify is providing
Spotify's API has a route that provides very detailed audio-analysis data of any track you want. It's documented right here: get-audio-analysis
The route is a simple GET route:
GET https://api.spotify.com/v1/audio-analysis/1hwJKpe0BPUsq6UUrwBWTw
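For reference, here is a minimal sketch of how that route could be called from Node.js. It assumes a valid OAuth access token is already available and that a global `fetch` exists (Node 18+). The function names `buildAudioAnalysisRequest` and `fetchAudioAnalysis` are made up for illustration and are not part of any Spotify SDK.

```javascript
// Hypothetical helper: builds the URL and headers for the audio-analysis route.
// The access token is assumed to come from Spotify's normal OAuth flow.
function buildAudioAnalysisRequest(trackId, accessToken) {
  return {
    url: `https://api.spotify.com/v1/audio-analysis/${encodeURIComponent(trackId)}`,
    options: {
      method: 'GET',
      headers: { Authorization: `Bearer ${accessToken}` },
    },
  };
}

// Hypothetical helper: sends the request and returns the parsed JSON body.
async function fetchAudioAnalysis(trackId, accessToken) {
  const { url, options } = buildAudioAnalysisRequest(trackId, accessToken);
  const res = await fetch(url, options); // global fetch, assumed Node 18+
  if (!res.ok) {
    throw new Error(`Audio analysis request failed with HTTP ${res.status}`);
  }
  return res.json();
}
```

Splitting the request construction from the network call keeps the URL and header logic easy to check without a token.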
And here is the JSON response for that example: https://gist.github.com/T-vK/bcac7f5c0885115a9b76a3e14896e399
As you can see, there is a lot of interesting information like
- Starting point and duration of every beat and bar including a confidence score
- Sections of a song (for each section the start point, duration, loudness, key, tempo, mode (major/minor) and time signature are provided including a confidence score)
- Segments of a song (each segment contains a roughly consistent sound throughout its duration). For each segment we get the starting point, duration, loudness information, pitches (a chroma vector, see image below), timbre information (again as a 12-value vector) and, of course, a confidence score.
This is what the docs say about the segment pitches:
Pitch content is given by a “chroma” vector, corresponding to the 12 pitch classes C, C#, D to B, with values ranging from 0 to 1 that describe the relative dominance of every pitch in the chromatic scale. For example a C Major chord would likely be represented by large values of C, E and G (i.e. classes 0, 4, and 7).
Vectors are normalized to 1 by their strongest dimension, therefore noisy sounds are likely represented by values that are all close to 1, while pure tones are described by one value at 1 (the pitch) and others near 0. As can be seen below, the 12 vector indices are a combination of low-power spectrum values at their respective pitch frequencies.
And here is what the docs say about the segment timbre:
Timbre is the quality of a musical note or sound that distinguishes different types of musical instruments, or voices. It is a complex notion also referred to as sound color, texture, or tone quality, and is derived from the shape of a segment’s spectro-temporal surface, independently of pitch and loudness. The timbre feature is a vector that includes 12 unbounded values roughly centered around 0. Those values are high level abstractions of the spectral surface, ordered by degree of importance.
For completeness however, the first dimension represents the average loudness of the segment; second emphasizes brightness; third is more closely correlated to the flatness of a sound; fourth to sounds with a stronger attack; etc. See an image below representing the 12 basis functions (i.e. template segments).
The actual timbre of the segment is best described as a linear combination of these 12 basis functions weighted by the coefficient values: timbre = c1 x b1 + c2 x b2 + ... + c12 x b12, where c1 to c12 represent the 12 coefficients and b1 to b12 the 12 basis functions as displayed below. Timbre vectors are best used in comparison with each other.
Here is an example segment:
"segments": [
{
"start": 0,
"duration": 0.52816,
"confidence": 0,
"loudness_start": -60,
"loudness_max_time": 0,
"loudness_max": -60,
"loudness_end": 0,
"pitches": [0.905,1,0.987,0.932,0.827,0.736,0.636,0.597,0.592,0.627,0.672,0.707],
"timbre": [0,171.13,9.469,-28.48,57.491,-50.067,14.833,5.359,-27.228,0.973,-10.64,-7.228]
},
...
]
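To make the chroma description concrete, here is a small sketch that ranks the pitch classes of the example segment above and computes the mean of the vector. The helper name `describeChroma` and the use of the mean as a rough "noisiness" indicator are my own assumptions, not something the Spotify docs define.

```javascript
const PITCH_CLASSES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B'];

// Hypothetical helper: returns the strongest pitch classes and the mean chroma value.
// A mean close to 1 suggests a noisy segment, a mean close to 1/12 a pure tone.
function describeChroma(pitches, topN = 3) {
  const ranked = pitches
    .map((value, index) => ({ note: PITCH_CLASSES[index], value }))
    .sort((a, b) => b.value - a.value);
  const mean = pitches.reduce((sum, v) => sum + v, 0) / pitches.length;
  return { strongest: ranked.slice(0, topN).map((p) => p.note), mean };
}

// Chroma vector of the example segment shown above.
const examplePitches = [0.905, 1, 0.987, 0.932, 0.827, 0.736, 0.636, 0.597, 0.592, 0.627, 0.672, 0.707];
console.log(describeChroma(examplePitches));
```

For this segment every value is fairly high (the mean is roughly 0.77), which fits the docs' description of noisy sounds. Its loudness of -60 also suggests it is probably near-silence, so it is unlikely to carry a meaningful chord.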
My code
Now my naive approach was to iterate over all the segments[].pitches and use these chroma arrays to estimate which chords they most likely represent:
const axios = require('axios');
const ChordDetector = require('chord-recognition').ChordDetector;

const chordDetector = new ChordDetector();

// Mock for https://api.spotify.com/v1/audio-analysis/1hwJKpe0BPUsq6UUrwBWTw which would require authentication
const SPOTIFY_MOCK_URL = "https://gist.githubusercontent.com/T-vK/bcac7f5c0885115a9b76a3e14896e399/raw/9131e994f1271e09c864d847f7f7eefa3abd3e88/response.json";

const chromaticScaleSharp = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"];
const chromaticScaleFlat = ["C", "Db", "D", "Eb", "E", "F", "Gb", "G", "Ab", "A", "Bb", "B"];
const chordQuality = ["m", "", "sus", "7", "dim", "aug"];

function getChordFromChroma(chroma) {
  chordDetector.detectChord(chroma);
  const rootNote = chromaticScaleFlat[chordDetector.rootNote];
  const quality = chordQuality[chordDetector.quality];
  let intervals = chordDetector.intervals;
  intervals = (intervals == quality ? '' : intervals) || '';
  return `${rootNote}${quality}${intervals}`;
}

async function main() {
  const res = await axios.get(SPOTIFY_MOCK_URL);
  const data = res.data;
  const key = chromaticScaleFlat[data.track.key];
  const keyConfidence = data.track.key_confidence;
  const mode = data.track.mode === 1 ? 'Major' : 'Minor';
  const mode_confidence = data.track.mode_confidence;
  const chordsMap = {};
  let previousChordName = null;
  for (const segment of data.segments) {
    const startTime = segment.start; // in seconds (float)
    const chroma = segment.pitches; // Array with 12 entries representing the dominance(?) of each note in the chromatic scale for the given segment.
    const chordName = getChordFromChroma(chroma);
    if (previousChordName !== chordName) {
      chordsMap[startTime] = chordName;
      previousChordName = chordName;
    }
  }
  console.log(chordsMap);
}

main().catch(console.error);
My results
Unfortunately the results seem very inaccurate:
{
'0': 'Bbdim',
'0.52816': 'Gb7',
'0.72644': 'F7',
'0.87678': 'Gb7',
'1.07524': 'F',
'1.2361': 'Gb7',
'1.64562': 'F7',
'1.78104': 'Gb7',
'1.96721': 'F',
'2.16998': 'F7',
'2.53646': 'Fm7',
'2.71973': 'F',
'2.86576': 'Fm7',
'3.04943': 'Gb7',
'3.26236': 'F',
'3.57995': 'Gb7',
'3.77914': 'F',
'3.97651': 'F7',
'4.15601': 'F',
'4.5141': 'F7',
'4.87546': 'F',
'5.41537': 'F7',
'5.74798': 'Gb7',
'5.95465': 'F',
'6.27601': 'F7',
'6.48862': 'F',
'7.02807': 'Fm',
'7.2112': 'F',
'7.56209': 'Gb7',
'7.75147': 'Fm7',
'8.10054': 'Fm',
'8.29034': 'Fm7',
'8.4439': 'Fm',
'8.63465': 'Ab',
'8.82245': 'Fm7',
'8.97057': 'Fm',
'9.15383': 'F',
'9.3605': 'F7',
'9.54363': 'Ebsus2',
'10.62531': 'F',
'10.80522': 'Bdim',
'10.98322': 'Bb',
'11.16395': 'Bbsus2',
'11.34431': 'Bbm',
'11.51669': 'Bbsus2',
'11.69401': 'F',
'11.87719': 'Dm',
'12.06141': 'Fm',
'13.1073': 'Gm7',
'13.47438': 'Ab7',
'13.7746': 'Dbsus2',
'14.21138': 'Ab',
'14.94757': 'Ebsus2',
'15.22109': 'Absus2',
'15.59868': 'Bb',
'16.00367': 'Absus2',
'16.20163': 'Bb7',
'16.38181': 'Gb7',
'16.52073': 'Bb7',
'16.73587': 'Gm7',
'17.03238': 'Ebsus2',
'17.42812': 'Ab7',
'17.65361': 'Fm7',
'17.81034': 'Fm',
'18.88993': 'Gb7',
'19.22884': 'Dbsus2',
'19.9161': 'Ab7',
'20.29946': 'Absus2',
'20.66676': 'Ebsus2',
'21.04286': 'E7',
'21.23646': 'Eb7',
'21.76172': 'Abdim',
'21.95247': 'Bbsus2',
'22.05297': 'Bb',
'22.4766': 'Bbm',
'22.84268': 'Bb7',
'23.20254': 'Ab7',
'23.38821': 'Db',
'23.4932': 'Db7',
'23.74744': 'F',
'24.05596': 'Fm',
'24.25438': 'Gb7',
'24.98481': 'Db',
'25.30989': 'Dbsus2',
'25.66481': 'Ab7',
'25.97868': 'Gbdim',
'26.07029': 'Absus2',
'26.43673': 'E7',
'27.49864': 'D7',
'27.86961': 'Bbm',
'28.0663': 'Bb7',
'28.2176': 'Bbm',
'28.55497': 'Gm7',
'28.95057': 'Absus2',
'29.15447': 'Ebdim',
'29.28902': 'Db7',
'29.62658': 'Fm',
'30.35478': 'Gb7',
'30.7195': 'Dbsus2',
'31.41664': 'Ab7',
'31.82889': 'Ebdim',
'32.15533': 'Eb7',
'32.55197': 'E7',
'32.8961': 'Eb',
'33.23723': 'Eb7',
'33.44077': 'Db7',
'33.61687': 'Bbm',
'34.31927': 'Ebsus2',
'34.66689': 'Ab7',
'34.87075': 'Fm',
'35.04463': 'Db7',
'35.32771': 'Fm',
'35.59841': 'F7',
'35.75927': 'Fm',
'36.13397': 'G7',
'36.49592': 'Dbm',
'36.86785': 'Dbsus2',
'37.21578': 'Ab7',
'37.58707': 'Ebdim',
'37.75565': 'Gbdim',
'37.93546': 'Eb',
'38.2902': 'E7',
'38.9205': 'Eb7',
'39.20018': 'Db7',
'39.32236': 'Db',
'39.56757': 'Bbm',
'39.74671': 'Gm7',
'40.08372': 'Bb7',
'40.42381': 'Ab7',
'40.61846': 'Fm',
'40.81941': 'Db',
'40.99447': 'Fm',
'41.24331': 'F',
'41.50263': 'Fm',
'41.84794': 'Gb7',
'42.20227': 'Db7',
'42.42971': 'Ab',
'42.60286': 'Ab7',
'42.78231': 'Gbsus2',
'42.90617': 'Ab7',
'43.31769': 'Ab',
'43.6585': 'Eb7',
'43.97615': 'E7',
'44.75764': 'Dbsus2',
'44.93265': 'Db7',
'45.12268': 'Bbm7',
'45.44798': 'Bbm',
'45.79079': 'Gm7',
'46.19034': 'Ab',
'46.34041': 'Fm',
'46.55574': 'Db7',
'46.90395': 'Fm',
'47.01982': 'Gb7',
'47.97891': 'Dbsus2',
'48.34485': 'Gbm',
'48.50762': 'A7',
'48.69995': 'Ab7',
'49.42195': 'E7',
'50.01551': 'Eb7',
'50.13202': 'E7',
'50.50336': 'Db7',
'50.67206': 'Bbsus2',
'50.81737': 'Bb',
'51.5707': 'Gm7',
'51.89719': 'Db7',
'52.04748': 'Ab7',
'52.25587': 'Bbm',
'52.41909': 'Db',
'52.84939': 'Db7',
'52.97084': 'Db',
'54.44454': 'Ab7',
'54.73528': 'C',
'54.91551': 'Caug',
'55.16431': 'Eb',
'55.49587': 'Ab',
'55.88984': 'Eb',
'56.20404': 'Abm',
'56.60422': 'Db7',
'57.10984': 'F',
'57.318': 'Fm',
'57.69034': 'Db',
'59.07152': 'Bbm7',
'59.46671': 'Dbsus2',
'59.78041': 'Ab',
'60.18617': 'Ddim',
'61.25424': 'Ab',
'61.65488': 'Ebsus2',
'61.97424': 'Ab7',
'62.18943': 'Bbm',
'63.04222': 'Fm',
'63.41297': 'Db',
'65.5268': 'Dbsus2',
'65.84639': 'Ab7',
'65.95029': 'Ebdim',
'66.03687': 'Ab7',
'66.29796': 'Absus2',
'67.00132': 'Dbsus2',
'67.34249': 'Eb7',
'67.90644': 'Bbm',
'68.76744': 'Fm',
'69.13741': 'Db7',
'69.49696': 'Gbsus2',
'69.84476': 'Dbm',
'70.17624': 'Db',
'70.6054': 'Dbsus2',
'71.3254': 'Ab',
'71.63941': 'Ab7',
'72.01197': 'Ab',
'72.39986': 'C7',
'72.75342': 'Gbdim',
'73.39814': 'Cm7',
'73.46181': 'C',
'73.83968': 'C7',
'74.39669': 'G7',
'74.7278': 'Gbm',
'74.90862': 'D7',
'75.23755': 'Gb7',
'75.80399': 'F7',
'75.94063': 'Fsus2',
'76.27719': 'Gb7',
'76.71873': 'Dbsus2',
'77.40562': 'Ab7',
'78.14571': 'Gdim',
'78.31746': 'E7',
'78.55415': 'Eb7',
'79.62113': 'Bbm',
'80.27542': 'Gm7',
'80.63125': 'Ab7',
'80.82295': 'A7',
'81.06154': 'F',
'81.58349': 'Fm',
'82.08399': 'Gbsus2',
'82.64644': 'Ab',
'82.83156': 'A7',
'83.0117': 'Ab7',
'83.92667': 'E7',
'84.29483': 'Eb7',
'85.14698': 'Db7',
'85.35075': 'Bbm',
'86.01973': 'Gm7',
'86.28059': 'Absus2',
'86.43079': 'Ab7',
'86.62195': 'Fdim',
'86.79025': 'Db7',
'87.12703': 'Gb7',
'87.34653': 'F7',
'87.51642': 'Fm',
'87.81147': 'Gb7',
'88.20172': 'Db7',
'88.32807': 'Ab7',
'89.64109': 'Eb7',
'89.98041': 'E7',
'90.32286': 'Eb',
'90.73923': 'Eb7',
'90.92862': 'Db7',
'91.08023': 'Bb',
'91.46249': 'Bbm',
'91.82181': 'Bb7',
'92.54839': 'F',
'92.99673': 'Fm',
'93.62794': 'G7',
'93.96481': 'Dbsus2',
'94.15764': 'Ab7',
'94.34857': 'Dbsus2',
'94.69755': 'Ab7',
'95.4059': 'Ebm7',
'95.75624': 'E7',
'96.13927': 'Eb',
'96.42073': 'Abdim',
'96.8288': 'Bbm',
'97.33773': 'Bbsus2',
'97.57034': 'Bb7',
'97.93878': 'Ab',
'98.08993': 'Fm',
'98.29764': 'F',
'98.67361': 'F7',
'98.99918': 'Fm',
'99.72426': 'Dbsus2',
'99.90467': 'Ab7',
'101.16308': 'Eb7',
'101.50408': 'E7',
'101.71016': 'Eb7',
'102.56812': 'Dbdim',
'102.7658': 'Bb7',
'102.94576': 'Bbm',
'103.10181': 'Bb7',
'103.3107': 'Gm7',
'103.65891': 'Fm7',
'104.01859': 'Gb7',
'104.39855': 'F7',
'104.73111': 'Fsus2',
'105.08263': 'Gb7',
'105.33156': 'G7',
'105.42331': 'Db7',
'105.59868': 'Ab7',
'106.73116': 'Gbsus2',
'106.86998': 'Eb7',
'107.26834': 'E7',
'107.55796': 'Eb',
'107.9961': 'D7',
'108.34812': 'Bbm',
'108.98871': 'Gm7',
'109.33619': 'Absus2',
'109.42417': 'Ab7',
'109.5756': 'Ab',
'109.76068': 'Fm',
'110.7937': 'Gb7',
'111.21864': 'Db7',
'111.36331': 'Ab',
'111.57224': 'Dbsus2',
'111.93166': 'Ab7',
'112.66132': 'Ebm7',
'113.01701': 'E7',
'113.21156': 'Eb7',
'113.38236': 'Eb',
'113.93179': 'Db7',
'114.1029': 'Gb7',
'114.31773': 'Bb',
'114.46254': 'Bbm',
'114.60839': 'Bb',
'114.801': 'Bb7',
'115.18785': 'Ebdim',
'115.33923': 'A7',
'115.51778': 'Db7',
'115.69342': 'F7',
'115.86277': 'Fm',
'116.24925': 'Gb7',
'116.98177': 'Db',
'117.31265': 'Dbsus2',
'117.67166': 'Ab7',
'118.41043': 'Ebm7',
'118.71424': 'Eb7',
'119.12413': 'E7',
'119.4956': 'Gdim',
'119.83202': 'Bbm7',
'120.18095': 'Bb',
'120.56304': 'Bb7',
'120.90177': 'Ab7',
'121.118': 'Fm7',
'121.27837': 'Db7',
'121.63587': 'F',
'121.99456': 'Fm',
'122.57841': 'Fm7',
'122.72444': 'Dbsus2',
'122.89138': 'Ab7',
'123.07751': 'Dbsus2',
'123.41533': 'Ab7',
'124.12585': 'Bbdim',
'124.33229': 'Eb7',
'124.87633': 'E7',
'125.21379': 'A',
'125.59061': 'Bbm7',
'125.95642': 'Bbm',
'126.31651': 'Gm',
'126.64159': 'Cm',
'126.839': 'C7',
'126.98862': 'Db',
'127.39683': 'Dbm',
'127.71787': 'Db',
'128.81224': 'Ab7',
'129.51537': 'Ebdim',
'129.66576': 'Ddim',
'130.22939': 'Dbsus2',
'130.59537': 'Eb',
'130.94322': 'Dbsus2',
'131.12236': 'Db7',
'132.0342': 'Fm',
'132.40018': 'Fsus2',
'133.13737': 'Dbm',
'134.76317': 'Ab7',
'135.61075': 'Ebm',
'135.98807': 'Dbsus2',
'136.31879': 'Dbdim',
'136.44608': 'Eb7',
'136.69089': 'Dbsus2',
'136.8468': 'F',
'137.78844': 'Fm',
'138.15764': 'Db',
'139.05841': 'Db7',
'139.2152': 'Db',
'139.5517': 'Dbm',
'140.26599': 'Edim',
'140.3574': 'Caug',
'140.48059': 'Ab7',
'141.03206': 'Bb7',
'141.7171': 'Absus2',
'142.09003': 'Eb7',
'142.47176': 'Eb',
'142.6458': 'Bb',
'143.14531': 'Fm',
'144.3122': 'Db',
'145.33392': 'Dbsus2',
'145.51283': 'Ddim',
'146.05905': 'Ab',
'146.44221': 'Ab7',
'146.77963': 'Dbsus2',
'146.97652': 'Dbm',
'147.31909': 'C7',
'147.48747': 'Cm',
'147.67923': 'C7',
'148.19627': 'Cm',
'148.46259': 'C',
'148.58495': 'Db7',
'149.97891': 'Fm',
'150.379': 'F7',
'150.537': 'Gb7',
'150.75604': 'Fm',
'151.08531': 'Fsus2',
'151.2083': 'G7',
'151.48639': 'Db',
'151.81288': 'Dbsus2',
'152.1079': 'Ab7',
'152.49161': 'Ab',
'153.2551': 'E7',
'153.44331': 'Eb7',
'154.337': 'Db',
'154.52313': 'Bb7',
'154.6854': 'Bbm',
'155.03134': 'Bb7',
'155.4297': 'Ab',
'155.74141': 'Fm',
'156.141': 'F7',
'156.3': 'F',
'156.49669': 'Fm',
'156.84485': 'G7',
'157.23357': 'Db',
'157.5463': 'Dbsus2',
'157.90036': 'Ab7',
'158.65778': 'Ebm7',
'158.99832': 'E7',
'159.21324': 'Eb7',
'159.3673': 'Eb',
'159.70988': 'Ab',
'160.0317': 'Bb',
'160.7923': 'Bbm',
'160.96114': 'Gdim',
'161.11642': 'Adim',
'161.3381': 'Cm',
'161.47238': 'Db',
'163.66603': 'Ab7',
'164.33878': 'Ebm7',
'164.54861': 'Eb7',
'165.11107': 'Dbsus2',
'166.15619': 'Fm',
'167.26494': 'Db',
'167.62508': 'Db7',
'167.9673': 'Db',
'168.36214': 'Dbm',
'169.0356': 'Gbm',
'169.14561': 'Ab7',
'169.94695': 'Eb',
'170.10948': 'Ebm',
'170.82484': 'Eb7',
'171.20653': 'Dbaug',
'172.26282': 'F',
'172.50095': 'Fm',
'173.00398': 'Db',
'173.36612': 'D7',
'174.08018': 'Ab',
'174.39877': 'Ddim',
'174.8039': 'Fm',
'175.62956': 'Bb',
'175.83896': 'Eb',
'177.13918': 'Cm',
'177.29007': 'Fm',
'178.71823': 'Db',
'179.1009': 'D7',
'179.57732': 'C7',
'179.81546': 'Fm7',
'180.00685': 'Ab7',
'180.15198': 'Dbsus2',
'180.5528': 'A7',
'181.23746': 'Gdim',
'181.61597': 'C7',
'181.9639': 'Gdim',
'182.14317': 'C',
'182.28218': 'C7',
'182.6654': 'C',
'183.05457': 'Csus2',
'183.72794': 'C',
'185.90494': 'Db7',
'187.37941': 'Fm',
'187.77982': 'Db',
'187.84933': 'Fm',
'188.4478': 'Ab7',
'189.1435': 'D7',
'189.22476': 'Ab',
'189.49796': 'Absus2',
'189.68349': 'Ab',
'190.06082': 'Fm7',
'190.23456': 'Ebsus2',
'190.62961': 'Ab7',
'190.96072': 'Absus2',
'191.4878': 'Fdim',
'191.65234': 'Bb7',
'191.97093': 'Fsus2',
'192.38481': 'Fm7',
'192.76027': 'Ab7',
'192.9629': 'F7',
'193.03474': 'Fm',
'194.18893': 'Ab7',
'194.90349': 'Ab',
'195.65692': 'G7',
'195.7849': 'Ab',
'196.02295': 'Ebsus2',
'196.41243': 'Ab7',
'196.74245': 'Ebsus2',
'197.05501': 'Ab7',
'197.28322': 'Fdim',
'197.4156': 'Gb7',
'197.6316': 'Bb',
'197.77061': 'Db7',
'197.93773': 'Gm7',
'198.154': 'Bb',
'198.47238': 'Cm',
'198.84993': 'Db',
'199.98753': 'Bbm7',
'200.4166': 'Ab',
'201.02686': 'Ab7',
'201.34608': 'Ddim',
'202.04268': 'Dbsus2',
'202.44807': 'Eb',
'202.80807': 'Ebsus2',
'202.95985': 'Gdim',
'203.14607': 'Fm',
'203.86562': 'Gb7',
'204.23732': 'Db',
'204.9683': 'Db7',
'205.32785': 'Db',
'205.85655': 'Db7',
'205.98386': 'Ab7',
'206.37932': 'Dbsus2',
'206.73306': 'Ab7',
'207.10358': 'Ab',
'208.17256': 'Eb',
'208.58372': 'E7',
'208.7473': 'Gdim',
'208.92172': 'Db7',
'209.60822': 'Fm',
'210.7161': 'Db7',
'211.05779': 'Db',
'211.40594': 'Dbm',
'211.56871': 'Db7',
'211.7313': 'Dbsus2',
'212.06218': 'Dbm',
'212.51483': 'Ab',
'212.8689': 'Caug',
'213.46095': 'Eb7',
'213.55968': 'Dbsus2',
'213.91968': 'Eb7',
'214.33179': 'Absus2',
'214.5057': 'Fm',
'215.77696': 'Db',
'216.48508': 'D7',
'216.77634': 'Db',
'217.1638': 'Ab',
'217.3735': 'Dbsus2',
'217.90785': 'Ab',
'218.63324': 'Dbsus2',
'218.81306': 'Ab7',
'218.99333': 'Cdim',
'219.29543': 'Csus2',
'219.67188': 'Cm',
'220.04993': 'Db7',
'220.39256': 'C',
'220.7065': 'Cdim',
'221.12956': 'Csus2',
'221.80911': 'Cm',
'222.01778': 'Ab7',
'222.19167': 'C7',
'222.56915': 'Csus2',
'223.30035': 'C',
'223.66599': 'Cm',
'223.91533': 'Ab7',
'224.35133': 'Cm',
'224.63719': 'Gb7',
'225.44826': 'Fm',
'226.16853': 'Db',
'226.88762': 'Dbdim',
'227.638': 'Ebdim',
'228.27547': 'Eb7',
'229.04141': 'D',
'229.77293': 'Ddim',
'231.14899': 'A7',
'231.52036': 'Dbsus2',
'231.87451': 'E',
'232.63025': 'Em',
'232.98894': 'Dbm7',
'233.33134': 'Gb7',
'233.88371': 'F7',
'234.06126': 'F',
'234.79442': 'Db7',
'235.5141': 'Dbsus2',
'236.95955': 'Eb',
'237.26834': 'Eb7',
'237.6151': 'D',
'238.35075': 'D7',
'239.07224': 'Dbm',
'239.8098': 'Ab',
'240.46571': 'E7',
'241.2434': 'E',
'241.91138': 'F',
'243.02145': 'Gb7',
'243.38544': 'Db',
'244.14082': 'Db7',
'244.47787': 'Ebm7',
'245.60327': 'Eb',
'245.9458': 'E7',
'246.29982': 'D',
'247.79442': 'Dbm',
'248.41333': 'Ab7',
'249.14413': 'E7',
'249.81175': 'Dbdim',
'249.88164': 'Dbm',
'250.56612': 'Gb7',
'251.29796': 'Dbaug',
'252.71442': 'Db7',
'253.10912': 'Db',
'253.46866': 'Ebaug',
'254.17134': 'Eb',
'254.50313': 'Eb7',
'254.86237': 'Dm',
'255.65228': 'D7',
'256.30765': 'Ab',
'256.66757': 'Db',
'257.02362': 'Em',
'257.4117': 'Db'
}
What I actually need
The correct chords would have been more like this: https://tabs.ultimate-guitar.com/tab/green-day/boulevard-of-broken-dreams-chords-971400
So basically just a progression of Fm, Ab, Eb, Bb for the verses and Db, Ab, Eb, Fm for the chorus.
What can I do to determine the chords with more reasonable accuracy?
Thoughts
- I could filter out all chords that are not in the key of the given section. The disadvantage is that I'd potentially be removing borrowed chords, or even chords that are in key in cases where Spotify didn't estimate the key correctly.
- I could manipulate the chroma vectors, reducing the weight of accidentals (notes that are out of key), but again this would make it more difficult to detect borrowed chords and relies on the key estimation being accurate.
- Maybe the chroma vectors could be interpreted in a smarter way or the timbre vectors could be incorporated into the algorithm. And maybe similar segments could be merged. But I'm not sure if that really makes any sense.
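To illustrate the "merge similar segments" idea, here is a rough sketch that averages the segment chroma vectors within each bar (weighted by how much of the segment falls inside the bar) and then picks the best-matching triad by cosine similarity. Note that this swaps the chord-recognition library for plain template matching against the 24 major and minor triads, so it cannot produce 7th, sus, dim or aug chords. All function names here are hypothetical, it assumes the analysis object has the `bars` and `segments` arrays described above, and I have not verified that it improves the results for this track.

```javascript
const NOTE_NAMES = ['C', 'Db', 'D', 'Eb', 'E', 'F', 'Gb', 'G', 'Ab', 'A', 'Bb', 'B'];

// Binary chroma templates for the 24 major and minor triads.
function buildTriadTemplates() {
  const templates = [];
  for (let root = 0; root < 12; root++) {
    for (const [suffix, third] of [['', 4], ['m', 3]]) {
      const vector = new Array(12).fill(0);
      for (const interval of [0, third, 7]) vector[(root + interval) % 12] = 1;
      templates.push({ name: NOTE_NAMES[root] + suffix, vector });
    }
  }
  return templates;
}

function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA && normB ? dot / Math.sqrt(normA * normB) : 0;
}

// Average chroma of all segments overlapping [start, start + duration),
// weighted by the length of the overlap in seconds.
function averageChroma(segments, start, duration) {
  const end = start + duration;
  const sum = new Array(12).fill(0);
  let total = 0;
  for (const seg of segments) {
    const overlap = Math.min(end, seg.start + seg.duration) - Math.max(start, seg.start);
    if (overlap <= 0) continue;
    seg.pitches.forEach((v, i) => { sum[i] += v * overlap; });
    total += overlap;
  }
  return total > 0 ? sum.map((v) => v / total) : sum;
}

// One chord estimate per bar; chord is null if no segment overlaps the bar.
function chordsPerBar(analysis) {
  const templates = buildTriadTemplates();
  return analysis.bars.map((bar) => {
    const chroma = averageChroma(analysis.segments, bar.start, bar.duration);
    let best = { name: null, score: 0 };
    for (const t of templates) {
      const score = cosineSimilarity(chroma, t.vector);
      if (score > best.score) best = { name: t.name, score };
    }
    return { start: bar.start, chord: best.name, score: best.score };
  });
}
```

Calling `chordsPerBar(data)` in place of the per-segment loop would give at most one chord per bar. The same averaging could be done per beat (using `data.beats`) if chords change more often than once per bar.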