Using seaborn in Apache Spark

Asked Jan 06 '20 at 13:39

Active Jan 06 '20 at 14:12

Viewed 525 times

I am using pandas and seaborn on a DataFrame loaded from a CSV file with 50 million rows to draw scatter matrices, and the processing times are very long. As a workaround I called df.sample() to work on a subset of the data, which reduced the processing time. Given what Apache Spark can do, is it possible to use its speed to process all 50 million rows and create a scatter matrix, scatter plot, PairGrid, etc. in seaborn? From what I have read on this topic, it seems quite difficult to do.
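For reference, seaborn draws from pandas data held in memory on the driver and does not operate on Spark DataFrames directly, so one common pattern is to let Spark read and sample the data in parallel and hand only the reduced result to seaborn. Below is a minimal sketch of that idea, not a definitive solution. The function names, the `target_rows` default, the file path and the column names are all assumptions made for illustration.

```python
def sampling_fraction(total_rows: int, target_rows: int) -> float:
    """Fraction of rows to keep so that roughly target_rows survive sampling."""
    if total_rows <= 0:
        raise ValueError("total_rows must be positive")
    return min(1.0, target_rows / total_rows)


def plot_pairs_from_spark(csv_path, columns, target_rows=100_000):
    """Hypothetical helper: sample in Spark, plot the sample with seaborn."""
    # Imported here so that sampling_fraction works without Spark installed.
    from pyspark.sql import SparkSession
    import seaborn as sns

    spark = SparkSession.builder.appName("seaborn-sample").getOrCreate()

    # Spark reads the CSV in parallel; keep only the columns to be plotted.
    sdf = spark.read.csv(csv_path, header=True, inferSchema=True).select(*columns)

    # Sampling runs on the executors; only the sample is sent to the driver.
    fraction = sampling_fraction(sdf.count(), target_rows)
    pdf = sdf.sample(withReplacement=False, fraction=fraction, seed=42).toPandas()

    # seaborn works on the resulting pandas DataFrame as usual.
    return sns.pairplot(pdf)


if __name__ == "__main__":
    # "data.csv" and the column names are placeholders, not from the question.
    grid = plot_pairs_from_spark("data.csv", ["a", "b", "c"])
    grid.savefig("pairs.png")
```

This still plots a sample rather than all 50 million points. Spark only speeds up reading and reducing the data, while the drawing itself stays single-machine.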

edited Jan 06 '20 at 14:12

krishna Prasad

asked Jan 06 '20 at 13:39

vins_26

0 Answers