I am trying to use the scikit-learn Python library to classify a set of URLs according to whether they contain certain keywords matching a user profile. Each user has a name, email address ... and a URL assigned to them. I have created a text file with the result of matching each profile field against each link, so it is in this format:
Name Email Address
0 1 0 =>Relevant
1 1 0 =>Relevant
0 1 1 =>Relevant
0 0 0 =>Not Relevant
Here each 0 or 1 signifies whether the attribute was found on the page (each row is a web page). How do I give this data to scikit-learn so that it can train a classifier? The examples I have seen either load their data from a predefined scikit-learn dataset such as digits or iris, or generate it in the format I already have. I just don't know how to provide the data format I have to the library.
The above is a toy example; I have many more than 3 features.
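
To make the question concrete, here is a minimal sketch of the kind of loading step I have in mind. It assumes every data row is a run of space-separated 0/1 values followed by `=>` and a label, and that any label starting with "Not" means not relevant. The helper name `parse_rows`, the inline sample text, and the choice of `DecisionTreeClassifier` are all my own placeholders, not something the data or the library requires; any scikit-learn classifier with `fit` and `predict` should accept the same inputs.

```python
from sklearn.tree import DecisionTreeClassifier


def parse_rows(lines):
    """Turn lines like '0 1 0 =>Relevant' into (X, y).

    X is a list of rows (one per web page), each a list of 0/1 ints.
    y is a list of class labels: 1 = relevant, 0 = not relevant.
    Works for any number of feature columns.
    """
    X, y = [], []
    for line in lines:
        line = line.strip()
        if "=>" not in line:
            continue  # skip the header row and blank lines
        features, label = line.split("=>", 1)
        X.append([int(v) for v in features.split()])
        # Assumption: labels beginning with "Not" are the negative class.
        y.append(0 if label.strip().lower().startswith("not") else 1)
    return X, y


# Inline stand-in for the contents of the text file.
sample = """Name Email Address
0 1 0 =>Relevant
1 1 0 =>Relevant
0 1 1 =>Relevant
0 0 0 =>Not Relevant
"""

X, y = parse_rows(sample.splitlines())

# fit() takes X with shape (n_samples, n_features) and y with one label per row.
clf = DecisionTreeClassifier(random_state=0).fit(X, y)

# Predict for two pages: nothing matched, and name + email matched.
pred = clf.predict([[0, 0, 0], [1, 1, 0]])
```

For a real file, the same helper could be fed an open file handle instead of `sample.splitlines()`, for example `with open("matches.txt") as f: X, y = parse_rows(f)`, where `matches.txt` is a hypothetical file name.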