0

BACKGROUND
I was given a bunch of data that looks like this, a little over 200 columns wide: Name|Address|etc...|Value 1|Value 2|Code 1|Code 2|repeat 7 times for codes|Value 3|repeat 200 times for values....|Value 200

They included definition lists that are used to decipher the Coded Values, for example: U6 = Local Limit and U7 = More than 100 Times

So I loaded it up into mysql because they wanted reports that swapped in the value from the definition lists for the Coded Values. However not all cells in main table have data, some are blank.

PROBLEM
So, when build my select statement, I would usually use a left join and be fine, but I need multiple left joins to get the 8 definition lists swapped in when needed, multiple left joins give me a lot of extra fields, having trouble with this.

Main Table is called
RAW_DATA

and the tables that hold all of the definition lists are named:
COUNTRY
ORIGIN
LANGUAGE
PREFERENCE
HAS_VEHICLE
EDUCATION
MARITAL_STATUS
OCCUPATION
TECHCODE
TYPE
INCOME

These tables above are just the ones that have definitions. All of the other fields in the 225 table are static and often unique. It could be normalized I am sure, but it would be tons of effort for converting one report, one time. That is why I just used the ones that had codes that were not human recognizable via the definition lists.

MY QUERY  

SELECT `raw_data`.`id_raw_data`,
    `raw_data`.`id`,
    `raw_data`.`first_name`,
    `raw_data`.`last_name`,
    `raw_data`.`OTHER_COLUMNS_AS_NEEDED`,
    `country`.`longname` as `country`,
    `origin`.`longname` as `origin`,
    `language`.`longname` as `language`,
    `preference`.`longname` as `preference`,
    `has_vehicle`.`longname` as `vehichle_type`,
    `education`.`longname` as `education`,
    `marital_status`.`longname` as `marital_status`,
    `occupation`.`longname` as `occupation`,
    `techcode`.`longname` as `tech_group`,
    `typestat`.`longname` as `typecode`,
    `income`.`longname` as `income`,
FROM `raw_data` 
left join `country`
    on `raw_data`.`countrycode` = `country`.`shortname`
left join `origin`
    on `raw_data`.`origincode` = `origin`.`shortname`
left join `language`
    on `raw_data`.`languagecode` = `origin`.`language`
left join `preference`
    on `raw_data`.`preferencecode` = `preference`.`shortname`
left join `has_vehicle`
    on `raw_data`.`has_vehiclecode` = `has_vehicle`.`shortname`
left join `education`
    on `raw_data`.`educationcode` = `education`.`shortname`
left join `marital_status`
    on `raw_data`.`marital_statuscode` = `marital_status`.`shortname`
left join `occupation`
    on `raw_data`.`occupationcode` = `occupation`.`shortname`
left join `techcode`
    on `raw_data`.`techcodecode` = `techcode`.`shortname`
left join `typecode`
    on `raw_data`.`typestatcode` = `typestat`.`shortname`
left join `income`
    on `raw_data`.`incomecode` = `income`.`shortname`

I have done some searching, all seem to use some form of sub-query or question involved joining back to itself. I am pretty sure it has something to do with the columns in the massive raw_data table that do not have values so there is no match, but need help.

This seemed close, but my query times out if too many joins already and this seems like even more work for all my lookups: Removing duplicates from result of multiple join on tables with different columns in MySQL

Thanks for the help,

David

Community
  • 1
  • 1
david
  • 3
  • 1

1 Answers1

0

In case anyone else wants to know, I found the issue was not with the sql at all, which worked fine for my purpose.

Rather, the data in a definition table had some values that were not unique, so the result returned an extra row in those cases where there was a duplicate definition defined.

david
  • 3
  • 1