I work with R, but I had never needed to apply a single comparison operator to an entire data frame. While comparing a pandas DataFrame with an R data frame, I noticed that df[df > 0] gives different results in Python and R.
In Python, df[df > 0] returns another DataFrame of the same shape, with NaN wherever the condition is false. In R, the same expression returns a plain vector containing only the values that satisfy the condition.
Python Code:
import numpy as np
import pandas as pd
from numpy.random import randn
np.random.seed(101)
df = pd.DataFrame(randn(5,5), ['A', 'B', 'C', 'D', 'E'], ['V', 'W', 'X', 'Y', 'Z'])
df[df > 0]
             V            W            X            Y            Z
A  2.706849839  0.628132709  0.907969446  0.503825754  0.651117948
B          NaN          NaN  0.605965349          NaN  0.740122057
C  0.528813494          NaN  0.188695309          NaN          NaN
D  0.955056509  0.190794322  1.978757324   2.60596728  0.683508886
E  0.302665449  1.693722925          NaN          NaN          NaN
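As far as I can tell, indexing a DataFrame with a boolean DataFrame behaves like DataFrame.where: the shape is kept and the entries that fail the condition become NaN. A minimal sketch to check that, reusing the same seed and labels as above (the names masked and same are just mine for illustration):

```python
import numpy as np
import pandas as pd

np.random.seed(101)
df = pd.DataFrame(np.random.randn(5, 5),
                  index=['A', 'B', 'C', 'D', 'E'],
                  columns=['V', 'W', 'X', 'Y', 'Z'])

# Boolean-DataFrame indexing keeps the 5x5 shape and fills misses with NaN
masked = df[df > 0]

# DataFrame.where with the same condition
same = df.where(df > 0)

# equals() treats NaN values in matching positions as equal
print(masked.equals(same))
print(masked.shape)
```

So on the pandas side the result stays rectangular, and nothing is dropped.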
R Code:
> set.seed(101)
> df = data.frame(matrix(rnorm(25), 5, 5))
> df
          X1         X2         X3         X4         X5
1 -0.3260365  1.1739663  0.5264481 -0.1933380 -0.1637557
2  0.5524619  0.6187899 -0.7948444 -0.8497547  0.7085221
3 -0.6749438 -0.1127343  1.4277555  0.0584655 -0.2679805
4  0.2143595  0.9170283 -1.4668197 -0.8176704 -1.4639218
5  0.3107692 -0.2232594 -0.2366834 -2.0503078  0.7444358
> df[df > 0]
[1] 0.5524619 0.2143595 0.3107692 1.1739663 0.6187899 0.9170283 0.5264481 1.4277555 0.0584655 0.7085221 0.7444358
>
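Judging by the output above, R walks the data frame column by column and keeps only the positive values. For comparison, here is a rough sketch of how I think the same flat result could be reproduced in pandas. The small data frame is made up for illustration, and r_like is just my own variable name:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, not the seeded example above
df = pd.DataFrame({"X1": [-1.0, 2.0, -3.0],
                   "X2": [4.0, -5.0, 6.0]})

values = df.to_numpy()
mask = values > 0

# NumPy boolean indexing scans row by row, so transposing both arrays
# yields the matches column by column, as in the R output
r_like = values.T[mask.T]
print(r_like)  # [2. 4. 6.]
```

Without the transposes, values[mask] would give the same numbers in row order instead (4, 2, 6 here).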
Could someone explain the reasoning behind the different results that R and Python return here? Also, is there a way in R to get a data frame back from df[df > 0]?