
I am streaming data from Twitter, and each record arrives as a Map in the following format:

Map(UserLang -> hi, 
    UserName -> CarterWyatt,  
    UserScreenName -> CarterWyatt1,  
    HashTags -> ,  
    UserVerification -> false,  
    Spam -> true,  
    UserFollowersCount -> 121,  
    UserLocation -> null,  
    UserStatusCount -> 146405,  
    UserCreated -> 2013-03-04T16:44:27.000+0530,  
    UserDescription -> null,  
    TextLength -> 113,  
    Text -> abcd.,  
    UserFollowersRatio -> 121.0,  
    UserFavouritesCount -> 0,  
    UserFriendsCount -> 1,  
    StatusCreatedAt -> 2016-07-14T20:52:52.000+0530,  
    UserID -> 1241101146)

I want to use a case class like the one below:

  case class Foo(UserLang: String, UserName: String, UserScreenName: String, HashTags: String,
                 UserVerification: String, Spam: String, UserFollowersCount: String,
                 UserLocation: String, UserStatusCount: String, UserCreated: String, UserDescription: String,
                 TextLength: String, Text: String, UserFollowersRatio: String, UserFavouritesCount: String,
                 UserFriendsCount: String, StatusCreatedAt: String, UserID: String)

Now I want to use the fields of this case class as the column names of a Spark SQL table and fill each row with the values from a streamed map. In short, I want to populate the table from the streaming values.

I am not sure exactly how to do this. Any pointers would be appreciated.

Anand
  • The first two ideas that come to mind ... you could convert the map to the case class (http://stackoverflow.com/questions/20684572/scala-convert-map-to-case-class) and then create a dataset of those instances. Or you could convert a map to rdd (http://stackoverflow.com/questions/32080708/how-to-convert-a-map-to-sparks-rdd) and then convert to a dataset of that type. – Robert Horvick Jul 14 '16 at 17:27
  • I tried this: topCounts60.foreachRDD( rdd => {var item= for( item <- rdd.keys.collect().toArray) { item.foreach(item =>Foo(item ,item(1),item(2),item(3),item(4),item(5),item(6),item(7),item(8), item(9),item(10),item(11),item(12),item(13),item(14),item(15),item(16),item(17))) // println(item); } }) but I am not sure how to implement the first approach. Please share any further pointers you have. – Anand Jul 16 '16 at 10:19
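
The first approach from the comments (convert each map to the case class) could be sketched roughly as below. This is only a minimal sketch under some assumptions: the record is taken to be a `Map[String, Any]` whose keys match the field names, missing keys and null values are mapped to `null`, and `fromMap` is a hypothetical helper name, not something Spark provides. Values are looked up by key rather than by position, because a Map has no guaranteed ordering.

```scala
// Field names are taken from the question; every column is kept as a String.
case class Foo(UserLang: String, UserName: String, UserScreenName: String, HashTags: String,
               UserVerification: String, Spam: String, UserFollowersCount: String,
               UserLocation: String, UserStatusCount: String, UserCreated: String,
               UserDescription: String, TextLength: String, Text: String,
               UserFollowersRatio: String, UserFavouritesCount: String,
               UserFriendsCount: String, StatusCreatedAt: String, UserID: String)

object Foo {
  // Hypothetical helper: builds a Foo from one streamed record.
  // Assumption: keys equal the field names; a missing key or a null value yields null.
  def fromMap(m: Map[String, Any]): Foo = {
    // Look a value up by key and render it as a String (or null if absent).
    def s(key: String): String =
      m.get(key).flatMap(v => Option(v)).map(_.toString).orNull

    Foo(
      s("UserLang"), s("UserName"), s("UserScreenName"), s("HashTags"),
      s("UserVerification"), s("Spam"), s("UserFollowersCount"),
      s("UserLocation"), s("UserStatusCount"), s("UserCreated"),
      s("UserDescription"), s("TextLength"), s("Text"),
      s("UserFollowersRatio"), s("UserFavouritesCount"),
      s("UserFriendsCount"), s("StatusCreatedAt"), s("UserID")
    )
  }
}
```

With Spark 1.x, and assuming the keys of each RDD are such maps (as the `rdd.keys` call in the comment suggests), the rows could then probably be registered as a table inside `foreachRDD` with something like `rdd.keys.map(Foo.fromMap).toDF().registerTempTable("tweets")` after `import sqlContext.implicits._`. The table name `tweets` is only a placeholder.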
