
I'm new to MySQL - I'm looking at using it as a back-end for some work I have to do in Access. Unfortunately I'm having to start at the deep end, as I have just over 26 million records in a 5 GB text file to analyse.

I've tried doing this in Access by splitting the text file up into 280 or so smaller text files via a Git Bash process, then writing some VBA code to create 280 different back-end databases and import one text file into each db. I then have a front-end db with 280 linked tables to get at the data.

This is obviously a very unwieldy process and makes any analysis take hours and hours.

I've just imported a file of 26,955 records into MySQL as a test, and it ran at about 1,113 records imported per minute. At that rate, the main 26-million-record file would take over 389 hours to load - assuming it didn't fall over before then, of course.

So the question is: what's the best way to go about getting the data into MySQL? Or am I going about this completely the wrong way? I don't even know whether the end result of one MySQL table will be more efficient than the Access route described above.

Here's the table I created in my MySQL database. The fields are mostly single characters, so I made them varchar(1). Unfortunately there are 63 of them and I do need them all:

CREATE TABLE Mydata (
Address_URN int,
URN int,
Address_Line_1 varchar(40),
Address_Line_2 varchar(40),
Address_Line_3 varchar(40),
Address_Line_4 varchar(40),
Address_Line_5 varchar(40),
Address_Line_6 varchar(40),
Town varchar(40),
Postcode varchar(10),
Male_Head_of_Household_Flag varchar(1),
Length_of_Residence_0_2_Years varchar(1),
Length_of_Residence_3_5_Years varchar(1),
Length_of_Residence_6_9_Years varchar(1),
Length_of_Residence_10Plus_Years varchar(1),
Aged_18_24 varchar(1),
Aged_25_29 varchar(1),
Aged_30_34 varchar(1),
Aged_35_39 varchar(1),
Aged_40_44 varchar(1),
Aged_45_49 varchar(1),
Aged_50_54 varchar(1),
Aged_55_59 varchar(1),
Aged_60_64 varchar(1),
Aged_65_69 varchar(1),
Aged_70_74 varchar(1),
Aged_75_79 varchar(1),
Aged_80_84 varchar(1),
Aged_85Plus varchar(1),
Income_0_9999 varchar(1),
Income_10000_19999 varchar(1),
Income_20000_29999 varchar(1),
Income_30000_39999 varchar(1),
Income_40000_49999 varchar(1),
Income_50000_74999 varchar(1),
Income_75000_99999 varchar(1),
Income_100000Plus varchar(1),
Is_employed varchar(1),
Is_not_employed varchar(1),
Social_renting varchar(1),
Private_renting varchar(1),
House_owned_outright varchar(1),
Home_owned_with_mortgage varchar(1),
Has_mortgage varchar(1),
Have_endowment_mortgage varchar(1),
Have_repayment_mortgage varchar(1),
Have_mortgage_more_than_10_years_old varchar(1),
Have_discounted_rate_mortgage varchar(1),
Have_offset_mortgage varchar(1),
Social_grade_A varchar(1),
Social_grade_B varchar(1),
Social_grade_C1 varchar(1),
Social_grade_C2 varchar(1),
Social_grade_D varchar(1),
Social_grade_E varchar(1),
Latitude decimal(8,6),
Longitude decimal(8,6),
Coordinate_Level varchar(10),
Suppressed varchar(1),
LocID varchar(15),
LocDistance decimal(10,6),
Student varchar(1),
PRIMARY KEY (URN)
);
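
In case LOAD DATA INFILE turns out to be the answer, I assume the statement would look roughly like this - the file path, field terminator and line endings below are guesses based on how my export looks, not something I've actually run:

-- Sketch only: the path, delimiters and line endings are assumptions
LOAD DATA INFILE 'C:/ProgramData/MySQL/Uploads/mydata.txt'  -- server-side load, so secure_file_priv applies
INTO TABLE Mydata
FIELDS TERMINATED BY '\t'    -- assuming a tab-delimited export; use ',' for CSV
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'   -- assuming Windows line endings
IGNORE 1 LINES;              -- skip the header row, if the file has one

If the columns in the file aren't in the same order as the table, a column list can be added in parentheses after the IGNORE clause.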

Any advice gratefully received!

Ash_stack
  • Where is this data to be analyzed - in an MS Access table? If so, then simply create a linked table in MS Access which is linked to the MySQL table via ODBC, and copy your data with a simple MS Access query. – Akina Jun 24 '22 at 11:39
  • Hi - it's in a text file. I split it up into smaller files to get it into separate Access dbs, but the original text file has all the data together. I'm trying to import it as one table into a MySQL db. – Ash_stack Jun 24 '22 at 12:10
  • A plain LOAD DATA should import this file if it has no structural errors. Otherwise it can be loaded into a temp table and then parsed with another query. – Akina Jun 24 '22 at 12:22
  • I'm trying to do LOAD DATA but I'm having difficulties turning off the secure-file-priv option. I've put secure_file_priv = "" in the my.ini file but Workbench keeps saying the option is still set. Can you tell me how to turn off secure-file-priv? – Ash_stack Jun 24 '22 at 14:59
  • Sorry - ignore the above - didn't realise I had to restart the service. – Ash_stack Jun 24 '22 at 15:19
  • OK, I think I've got the table structure set up properly. I've had to adjust some timeout options, but what's holding me up now is the error 'Error Code: 1205. Lock wait timeout exceeded; try restarting transaction' after 50 seconds - is there a way I can handle this? – Ash_stack Jun 27 '22 at 10:48
  • Try to adjust [innodb_lock_wait_timeout](https://dev.mysql.com/doc/refman/8.0/en/innodb-parameters.html#sysvar_innodb_lock_wait_timeout) – Akina Jun 27 '22 at 11:51
  • OK thanks - had to force-stop the service when trying to restart, and it now shuts down immediately when I start it again, so it looks like I may need a reboot. Unfortunately I can't do this for a few hours as I'm running something I don't want to interrupt, so this will take a while..... – Ash_stack Jun 27 '22 at 12:18
  • Well, the data is in at last - it took under 3 hours, which I think is pretty good for 26 million recs. Thanks for your help, Akina! (The two settings I ended up touching are sketched below.) – Ash_stack Jun 28 '22 at 07:52
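
In case it helps anyone else hitting the same two snags, the settings discussed in the comments above boil down to something like the following - the timeout value is only an example, not necessarily what was used:

-- secure_file_priv is read-only at runtime: it has to be set in my.ini
-- (secure_file_priv="") and the MySQL service restarted before a server-side
-- LOAD DATA INFILE will read from an arbitrary folder.
SHOW VARIABLES LIKE 'secure_file_priv';

-- innodb_lock_wait_timeout is dynamic, so it can be raised without a restart
-- while the big import runs (value in seconds; new connections pick up the
-- global value).
SET GLOBAL innodb_lock_wait_timeout = 600;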
