Running Mysql Server version: 5.7.27-0ubuntu0.18.04.1
I'm creating a site/app where a user "submission" can be one of:
- Text Comments
- picture/file upload
- video/file upload (more or less technically same as #2, just with different mime type)
I'm having trouble deciding between the two designs (shortened for brevity)...
CREATE TABLE submissions
(
submissionID INT,
userID INT,
submissionComments TEXT,
fileDirectory VARCHAR2(32), -- starting here these are only used 20% of time
fileName VARCHAR2(128)
fileMimeType VARCHAR2(128),
fileSize INT,
originalFileName VARCHAR2(64)
)
-OR-
CREATE TABLE submissions
(
submissionID INT,
userID INT,
submissionComments TEXT
)
CREATE TABLE submissionFiles
(
submissionFileID INT,
submissionID INT, -- FK to submissions table
fileDirectory VARCHAR2(32),
fileName VARCHAR2(128),
fileMimeType VARCHAR2(128),
fileSize INT,
originalFileName VARCHAR2(64)
)
I'm assuming text comments will prob be 70-80% of submissions.
So, the question becomes, is it better to use a single table and have a bunch of NULL values in fileDirectory/fileName/fileMimeType/fileSize/originalFileName? Or, is it better to have a 1:1 relationship to support when files are uploaded. In that case, I'd be creating both a submissions and submissionFiles record. Obviously most queries would then require joining the two tables.
This essentially comes down to not having a good understanding of the impacts of VARCHAR (and 1 INT) columns in tables where they are majority NULL. I'm probably pre-optimizing a bit here considering this is a brand new site/app, but i'm trying to plan ahead.
Late addition 2nd question (as I type this out), i see that TEXT is capable of handling: 65,535 characters or 64 KB. That seems like a lot for what a typical user would be submitting (probably less than 500 characters). It would eat up storage pretty quick. Would would be the impacts of making submissionComments into VARCHAR(500) instead of TEXT? I'm assuming if anything, there are no negative trade-offs besides being able to store "less".
Thanks!
Edit: as madhur pointed out, there are similar questions/good answers about "design patterns". i'm more concerned about performance. does the presence of large number of varchar's negatively impact data storage/retrieval (by messing up the way mysql implements pages/extents/etc)?