I'm trying to import a database into R to transform and load back into a different database. I am pulling my set from an RDS Mysql with:
con <- dbConnect(MySQL(),
user = 'user',
password = 'password',
host = 'url',
dbname='dbName')
sqlcmd = paste("SELECT * FROM dbName.`users`");
contentTable = dbGetQuery(con,sqlcmd);
contentTable["first_name"]
which nets me this unfortunate output
first_name
1 Sergio
2 Sara
3 J\xfalia
4 Tatiana
5 Paula
My problem is that the third name should be coming back as Júlia. This problem occurred in other rows as well.
My locale is set as follows.
> Sys.getlocale()
[1] "pt_PT.UTF-8/pt_PT.UTF-8/pt_PT.UTF-8/C/pt_PT.UTF-8/en_US.UTF-8"
and the default character for the server is
# Variable_name, Value
'character_set_client', 'utf8'
'character_set_connection', 'utf8'
'character_set_database', 'utf8'
'character_set_filesystem', 'binary'
'character_set_results', 'utf8'
'character_set_server', 'latin1'
'character_set_system', 'utf8'
'character_sets_dir', '/rdsdbbin/oscar-5.6.10a.14.15/share/charsets/'
I am a bit lost and about to switch to Python/Panda (which gave me the correct characters but I have a bit of a learning curve to face for what I want to do next). Any idea what to do now?
Update1:
SHOW CREATE TABLE users;
CREATE TABLE `users` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`first_name` varchar(255) NOT NULL,
`last_name` varchar(255) NOT NULL,
`email` varchar(255) NOT NULL,
`birthday` date DEFAULT NULL,
`age` int(11) DEFAULT NULL,
`city` varchar(255) DEFAULT NULL,
`state` varchar(255) DEFAULT NULL,
`country` varchar(255) DEFAULT NULL,
`beer_points` int(11) DEFAULT NULL,
`access_token` text,
`created_at` datetime NOT NULL,
`updated_at` datetime NOT NULL,
`profile_picture_file_name` varchar(255) DEFAULT NULL,
`profile_picture_content_type` varchar(255) DEFAULT NULL,
`profile_picture_file_size` int(11) DEFAULT NULL,
`profile_picture_updated_at` datetime DEFAULT NULL,
`role` varchar(255) DEFAULT NULL,
`password_digest` varchar(255) DEFAULT NULL,
`gender` varchar(255) DEFAULT NULL,
`share_code` varchar(255) DEFAULT NULL,
`privacy_enabled` tinyint(1) DEFAULT '0',
PRIMARY KEY (`id`),
UNIQUE KEY `index_users_on_email` (`email`),
KEY `index_users_on_role` (`role`)
) ENGINE=InnoDB AUTO_INCREMENT=46 DEFAULT CHARSET=latin1