I am trying to make a web crawler for a school project. When I try to scrape some websites, I get the following error:
Incorrect string value: '\xC4\x82\xC5\xA4 \xC3...' for column 'content' at row 1
The definition of the content table looks like this:
CREATE TABLE IF NOT EXISTS scotchbox.content (
  id INT(11) NOT NULL AUTO_INCREMENT,
  url INT(11) NOT NULL,
  content LONGTEXT CHARACTER SET 'utf8' NOT NULL,
  content_raw LONGTEXT CHARACTER SET 'utf8' NOT NULL,
  content_raw_hash VARCHAR(255) CHARACTER SET 'utf8' NOT NULL,
  PRIMARY KEY (id),
  INDEX idx_content__url (url ASC),
  CONSTRAINT fk_content__url
    FOREIGN KEY (url)
    REFERENCES scotchbox.url (id))
ENGINE = InnoDB
AUTO_INCREMENT = 4
DEFAULT CHARACTER SET = utf8mb4;
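A possible direction, not yet confirmed as the cause: the three text columns are declared as utf8 (MySQL's 3-byte variant) while the table default is utf8mb4, and the character set of the crawler's connection is not shown here. The statements below are only a sketch under those assumptions. They inspect the session character sets, switch the session to utf8mb4, and convert the table's text columns to utf8mb4. The collation utf8mb4_unicode_ci is an assumed choice, not something taken from the original schema.

```sql
-- Inspect which character sets the current session and server are using.
SHOW VARIABLES LIKE 'character_set_%';

-- Assumption: the crawler sends UTF-8 bytes, so the session should say so.
SET NAMES utf8mb4;

-- Convert every text column of the table to utf8mb4.
-- The collation here is an assumed choice; pick whichever fits the project.
ALTER TABLE scotchbox.content
  CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
```

The bytes quoted in the error (\xC4\x82, \xC5\xA4) are two-byte UTF-8 sequences, which a utf8 column can normally store, so the connection character set may matter at least as much as the column definitions.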
Can anyone tell me what I need to change to get the page into the database?