0

I have 2 tables, Resources and ChapterMap.

The Resources table contains records of educational resources. This table has a set of columns relating to Topics (RES_Topic_Sports, RES_Topic_Politics, RES_Topic_CurrentEvents, etc.). If the value of one of these fields is 1, the resource relates to that topic.

The ChapterMap table links topics to chapters in a book. Each record in this table has a Topic field with values like (Sports, Politics, Current Events, etc.). It also has links to the book chapter and sections (FR_Ch1_Sec1, FR_Ch1_Sec2, FR_Ch2_Sec3, etc.)

I'm trying to JOIN these two tables on topics. For example, if Resources.RES_Topic_Sports = 1, I want to join ChapterMap where ChapterMap.Topic = 'Sports'. IF Resources.RES_Topic_Sports = 1 AND Resources.RES_Topic_Politics = 1, I want to join ChapterMap where ChapterMap.Topic = 'Sports' AND where ChapterMap.Topic = 'Politics', etc., etc.

I'm also open to restructuring the tables if that's optimal.

I would be VERY grateful for any advice. If it matters, this is for a non-profit organization that works on global human rights issues.

Thanks in advance!!

******************** SAMPLE DATA **********************

Table: RESOURCES
Fields: RES_ID, RES_Name, RES_Topic_Sports, RES_Topic_Politics, RES_Topic_CurrentEvents
Record: (101, 'Sports Lesson Plan', 1, NULL, 1)
Explanation: A resource called "Sports Lesson Plan" that relates to the Sports and Current Events topics.

Table: CHAPTERMAP
Fields: ID, Topic, Ch1_Sec1, Ch1_Sec2, Ch1_Sec3, Ch1_Sec4, Ch2_Sec1, Ch2_Sec2, Ch3_Sec1, Ch3_Sec2, Ch3_Sec3
Record: (4, 'Sports', NULL, NULL, NULL, NULL, 1, NULL, NULL, 1, 1)
Explanation: A mapping that links the topic "Sports" to Chapter 2, Section 1 and to Chapter 3, Section 2 and to Chapter 3, Section 3.

GOAL: To link the resource "Sports Lesson Plan" to Chapter 2, Section 1 and to Chapter 3, Section 2 and to Chapter 3, Section 3.

nbardach
  • 123
  • 1
  • 10
  • 2
    Sample data and desired results would really help. – Gordon Linoff Oct 04 '17 at 01:33
  • I would but the table structure is very big. I guess I could summarize it by saying that: IF Resources.RES_Topic_Politics = 1 THEN JOIN ChapterMap ON ChapterMap.Topic = 'Politics' AND IF Resources.RES_Topic_Sports = 1 THEN JOIN ChapterMap ON ChapterMap.Topic = 'Sports' AND IF Resources.RES_Topic_CurrentEvents = 1 THEN JOIN ChapterMap ON ChapterMap.Topic = 'Current Events' Hope that helps explain. – nbardach Oct 04 '17 at 01:38
  • . . Then simplify it so you can ask your question on the simpler version. – Gordon Linoff Oct 04 '17 at 01:46
  • Added sample data. Not sure if that's the best way to input it but I hope it helps clarify. – nbardach Oct 04 '17 at 02:01
  • Added an explanation for each of the records – nbardach Oct 04 '17 at 02:06

3 Answers3

2

You first need to refactor the tables to normalize your database. Instead of having a column for each topic, you should have a table of topics, a table of resources, and a join table that stores which topics are assigned to which resources. You should do the same for assigning topics to chapters.

For example, consider the following tables:

         TOPIC:  ID     NAME
                 ----------------------------------------------------------
                 1      Sports
                 2      Politics


      RESOURCE:  ID     (any other fields you need, i.e. description, etc.)
                 ----------------------------------------------------------
                 1      blah  blah
                 2      blah  blah


RESOURCE_TOPIC:  RESOURCE_ID    TOPIC_ID
                 ----------------------------------------------------------
                 1              2
                 2              1
                 2              2


       CHAPTER:  ID   (any other fields you need)
                 ----------------------------------------------------------
                 Ch1_Sec1      blah blah
                 Ch1_Sec2      blah blah
                 Ch1_Sec1      blah blah


 CHAPTER_TOPIC:  CHAPTER_ID     TOPIC_ID
                 ----------------------------------------------------------
                 Ch1_Sec1       1
                 Ch1_Sec2       2
                 Ch1_Sec1       1

Now you can easily write a JOIN statement to match resources to a chapter that share ANY common topic:

SELECT
    r.ID AS RESOURCE_ID,
    c.ID AS CHAPTER_ID
    -- any other fields you need
FROM RESOURCE r
JOIN RESOURCE_TOPIC rt ON rt.RESOURCE_ID = r.ID
JOIN CHAPTER_TOPIC ct ON ct.TOPIC_ID = rt.TOPIC_ID
JOIN CHAPTER c ON c.ID = ct.CHAPTER_ID
GROUP BY
    r.ID AS RESOURCE_ID,
    ct.CHAPTER_ID
    -- any other fields you need

If you want to match resources to chapters where ALL topics match, it's a little more complicated but doable. I'm not completely sure from your question if this is what you need, but if so, let me know and I'll provide that query as well.

A little bit about normalizing your database

There are many good reason to normalize your database, but here are some high points:

  • Given your current set up, you have to keep adding new columns to your table every time there's a new topic and/or chapter. Obviously, the tables will explode to a ridiculous number of columns pretty quickly (as you've mentioned)

  • This method allows you to easily add or remove topics and chapters without altering your table structures

  • It's is practically impossible to join resources and chapters based on common topics given your current set up. Normalizing the tables makes the queries simpler.

  • If you want to rename a topic, you should only have to update one row in the topics table, instead of updating it everywhere it is used.

patrick3853
  • 1,100
  • 9
  • 17
  • Patrick, thanks SO MUCH for your answer. I went ahead and normalized the DB. Now, I have the following tables (Resources, Topics, Chapters, Topic-to-Resource, Topic-to-Chapter). Here's the SQL Fiddle (http://sqlfiddle.com/#!9/af2462/2). The point of this is to return a list of individual Resources with their corresponding topics and chapters. Using the INNER JOIN has given me duplicates (55 records). How can I combine the data from the tables and just return the 5 Resource Records? – nbardach Oct 06 '17 at 21:40
  • What's weird about it is that no matter whether I use a RIGHT, LEFT OR INNER JOIN, the record count is still 55. Beginning to fear I can't use a JOIN to get the 5 resources with their related topics and chapters. – nbardach Oct 06 '17 at 21:59
  • I broke this out into another question. Would be very grateful if you could take a look: https://stackoverflow.com/questions/46615130/join-with-group-by-in-a-normalized-db-ie-owww-my-head – nbardach Oct 06 '17 at 23:20
1

When preparing "sample data" you need to consider "Does this represent all the conditions needed?", for example do I have all topics included? i.e. One row for each table isn't sufficient.

Once you do have some sample data identified, you also need to take the effort to display the "expected result" - which should be achievable from the sample data provided. It is quite normal for expected results to be manually prepared of course because you may not have the needed query. (But, you should also demonstrate that you have tied a query.)

Anyway, below is my guess about what your question represents and have supplemented the sample data to cover more situations. SQL Fiddle

MySQL 5.6 Schema Setup:

CREATE TABLE RESOURCES
    (`RES_ID` int, `RES_Name` varchar(18), `RES_Topic_Sports` int, `RES_Topic_Politics` varchar(4), `RES_Topic_CurrentEvents` int)
;

INSERT INTO RESOURCES
    (`RES_ID`, `RES_Name`, `RES_Topic_Sports`, `RES_Topic_Politics`, `RES_Topic_CurrentEvents`)
VALUES
    (101, 'Sports Lesson Plan', 1, 1, 1),
    (201, 'Sports Lesson Plan', 1, NULL, NULL),
    (301, 'Political treatse', NULL, 1, NULL),
    (401, 'Event unfolding', NULL, NULL, 1)
;

CREATE TABLE CHAPTERMAP
    (`ID` int, `Topic` varchar(60), `Ch1_Sec1` varchar(4), `Ch1_Sec2` varchar(4), `Ch1_Sec3` varchar(4), `Ch1_Sec4` varchar(4), `Ch2_Sec1` int, `Ch2_Sec2` varchar(4), `Ch3_Sec1` varchar(4), `Ch3_Sec2` int, `Ch3_Sec3` int)
;

INSERT INTO CHAPTERMAP
    (`ID`, `Topic`, `Ch1_Sec1`, `Ch1_Sec2`, `Ch1_Sec3`, `Ch1_Sec4`, `Ch2_Sec1`, `Ch2_Sec2`, `Ch3_Sec1`, `Ch3_Sec2`, `Ch3_Sec3`)
VALUES
    (4, 'Sports', NULL, NULL, NULL, NULL, 1, NULL, NULL, 1, 1),
    (104, 'Politics', NULL, NULL, NULL, NULL, 1, NULL, NULL, 1, 1),
    (401, 'Current Events', NULL, NULL, NULL, NULL, 1, NULL, NULL, 1, 1)
;

Query 1:

select
*
from RESOURCES r
left join CHAPTERMAP c on (r.RES_Topic_Politics = 1      and c.Topic = 'Politics')
                       or (r.RES_Topic_Sports = 1        and c.Topic = 'Sports')
                       or (r.RES_Topic_CurrentEvents = 1 and c.Topic = 'Current Events') 
order by r.res_id, c.id

Results:

| RES_ID |           RES_Name | RES_Topic_Sports | RES_Topic_Politics | RES_Topic_CurrentEvents |  ID |          Topic | Ch1_Sec1 | Ch1_Sec2 | Ch1_Sec3 | Ch1_Sec4 | Ch2_Sec1 | Ch2_Sec2 | Ch3_Sec1 | Ch3_Sec2 | Ch3_Sec3 |
|--------|--------------------|------------------|--------------------|-------------------------|-----|----------------|----------|----------|----------|----------|----------|----------|----------|----------|----------|
|    101 | Sports Lesson Plan |                1 |                  1 |                       1 |   4 |         Sports |   (null) |   (null) |   (null) |   (null) |        1 |   (null) |   (null) |        1 |        1 |
|    101 | Sports Lesson Plan |                1 |                  1 |                       1 | 104 |       Politics |   (null) |   (null) |   (null) |   (null) |        1 |   (null) |   (null) |        1 |        1 |
|    101 | Sports Lesson Plan |                1 |                  1 |                       1 | 401 | Current Events |   (null) |   (null) |   (null) |   (null) |        1 |   (null) |   (null) |        1 |        1 |
|    201 | Sports Lesson Plan |                1 |             (null) |                  (null) |   4 |         Sports |   (null) |   (null) |   (null) |   (null) |        1 |   (null) |   (null) |        1 |        1 |
|    301 |  Political treatse |           (null) |                  1 |                  (null) | 104 |       Politics |   (null) |   (null) |   (null) |   (null) |        1 |   (null) |   (null) |        1 |        1 |
|    401 |    Event unfolding |           (null) |             (null) |                       1 | 401 | Current Events |   (null) |   (null) |   (null) |   (null) |        1 |   (null) |   (null) |        1 |        1 |
Paul Maxwell
  • 33,002
  • 3
  • 32
  • 51
  • While this does work, it doesn't normalize the data and is extremely inefficient. If there are 100 topics, the `JOIN` condition become insane and 100 `OR` expressions in a `JOIN` condition is a performance killer. Also, this query would have to be updated every time a topic is added or removed. – patrick3853 Oct 04 '17 at 03:25
  • 1
    @patrick3853 I agree absolutely that the current data model is not extensible or optimally efficient. However I don't know the intentions behind this question and have made no assumptions about it. A normalized data structure would be a vastly better solution. – Paul Maxwell Oct 04 '17 at 03:45
  • Super grateful to have a solution I can use right now but TOTALLY agree that normalization will have to be the long-term plan. Used_By_Already's query is really what I was trying to nail down. Patrick's solution is the path forward. Many thanks to you both!! – nbardach Oct 04 '17 at 04:29
  • How can I change the join to only return each record (ie. 101, 201, 301, 401) from Resources once? – nbardach Oct 04 '17 at 16:52
1

The subject you're interested in is database normalization. To understand, consider that you'll need a column in your chaptermap table for every section of every chapter in every book you want the database to cover.

Consider instead a third table called "Topics". In that table are two columns: topicID and topicName. A typical entry is '1', 'sports' ('2', 'politics' etc...).

Now, get rid of all of the topic related columns in the "Resources" table. Now your row in Resources simply looks like looks like '101', 'Sports Lesson Plan'.

One more table to normalize: let's change your "Chaptermap" table to just have the following columns: ID, chapter, and section. A typical row looks like '5', '2', '1', giving ID 5 to Chapter 2, Section 1.

We're going to normalize the topic stuff out to another couple of tables, just in case a resource or a chapter section relates to more than one topic (a many-to-many relationship). To do that, you'll create two more tables: resourceTopics and sectionTopics. resourceTopics will have two columns - resourceID and topicID. sectionTopics is the same deal - sectionID and topicID.

Your application will add a row to the resourceTopics table '101', '1' to indicate that resource 101 (the sports lesson plan) is associated with the topic 1 (sports). Likewise, you'll also add a row to the sectionTopics table '5', '1' to show a connection between section 5 (chapter 2, section 1) and topic 1 (sports).

Now if you query:

SELECT
r.RES_NAME, 
c.chapter,
c.section
FROM CHAPTERMAP c
INNER JOIN sectionTopics st
ON st.sectionID = c.id
INNER JOIN resourceTopics rt
ON rt.topicID = st.topicID
INNER JOIN Resources r
ON r.RES_ID = rt.resourceID
INNER JOIN Topics t
ON t.topicID = st.topicID
WHERE st.topicName = 'sports'

You should get the result you're looking for.

SQLCliff
  • 181
  • 6
  • Really grateful for this excellent response!! Normalization is undoubtedly the path forward... – nbardach Oct 04 '17 at 04:30
  • Thanks, Cliff! I normalized my database per your suggestion and now I'm struggling to get the query dialed in. I've broken it out into another question and would be very grateful if you would weigh in: https://stackoverflow.com/questions/46615130/join-with-group-by-in-a-normalized-db-ie-owww-my-head Thanks! Noah – nbardach Oct 06 '17 at 23:22