Objects of the DataFrame type represent a data table as a series of vectors, each corresponding to a column or variable.
Questions tagged [dataframes.jl]
74 questions
13 votes, 2 answers
Replace specific values in Julia Dataframe column with random value
I'm looking for a way to replace values in a DataFrame column with random numbers. The replacement should be different in every row where a substitution is performed.
For example, replacing "X" with random numbers drawn from the range 100:120.
julia> df =…

Maciej Fender
7 votes, 3 answers
How to do close join in Julia DataFrames?
classA = Dataset(id = ["id1", "id2", "id3", "id4", "id5"],
                 mark = [50, 69.5, 45.5, 88.0, 98.5]);
grades = Dataset(mark = [0, 49.5, 59.5, 69.5, 79.5, 89.5, 95.5],
                 grade = ["F", "P", "C", "B", "A-",…

Warwick Wang
5 votes, 3 answers
Julia DataFrames: Replace entries in a dataframe based on a comparison with another dataframe
I have the following dataframes:
df1 = DataFrame(
    col_A = [1, 2, 3, 4, 5, 6, 7],
    col_B = ["A", "B", "C", "D", "E", "F", "G"],
    col_C = missing,
)
7×3 DataFrame
 Row │ col_A  col_B   col_C
     │ Int64  String  Missing…

jn_br
5 votes, 1 answer
Replace multiple strings with multiple values in Julia
In Python pandas you can pass a dictionary to df.replace in order to replace every matching key with its corresponding value. I use this feature a lot to replace word abbreviations in Spanish that mess up sentence tokenizers.
Is there something…

Dijkie85
5 votes, 2 answers
Return the maximum sum in `DataFrames.jl`?
Suppose my DataFrame has two columns v and g. First, I grouped the DataFrame by column g and calculated the sum of the column v. Second, I used the function maximum to retrieve the maximum sum. I am wondering whether it is possible to retrieve the…

Likan Zhan
5 votes, 2 answers
Plot DataFrames in Julia using Plots
In Julia, is there a way to plot a dataframe similarly to df.plot() in Python's Pandas?
More specifically, I am using Plots, plotlyjs() and the DataFrames package.

Joris Limonier
5 votes, 1 answer
Processing JSON from a .txt file and converting to a DataFrame in Julia
Cross posting from Julia Discourse in case anyone here has any leads.
I’m just looking for some insight into why the below code is returning a dataframe containing just the first line of my json file. If you’d like to try working with the file I’m…

clibassi
4 votes, 2 answers
Extracting Data from .csv File in Julia
I'm quite new to Julia and I have a .csv file, stored inside a gzip archive, from which I want to extract some information for educational purposes and to get to know the language better.
In Python there are many helpful functions from pandas to…

Robin
4 votes, 1 answer
Apply "any" or "all" function row-wise to arbitrary number of Boolean columns in Julia DataFrames.jl
Suppose I have a dataframe with multiple boolean columns representing certain conditions:
df = DataFrame(
    id = ["A", "B", "C", "D"],
    cond1 = [true, false, false, false],
    cond2 = [false, false, false, false],
    …

jsinai
4 votes, 1 answer
Datetimes for Julia dataframes
pandas has a number of very handy utilities for manipulating datetime indices. Is there any similar functionality in Julia? I have not found any tutorials for working with such things, though it obviously must be possible.
Some examples of pandas…

Igor Rivin
4 votes, 2 answers
Vector from Dataframe Column in Julia
I have a DataFrame
df = DataFrame(x = 1:3, y = 4:6)
3×2 DataFrame
 Row │ x      y
     │ Int64  Int64
─────┼──────────────
   1 │     1      4
   2 │     2      5
   3 │     3      6
How can I extract one of the columns as a Vector?
I know I…

Georgery
4 votes, 1 answer
Multithreaded iteration over groups for Julia GroupedDataFrame
I have a GroupedDataFrame in Julia 1.4 (DataFrames 0.22.1). I want to iterate over the groups of rows to compute some statistics. Because there are many groups and the computations are slow, I want to do this multithreaded.
The code
grouped_rows =…

Miklós Koren
3 votes, 1 answer
How to subset rows with an OR condition in Julia DataFrames
I have a DataFrame and I want to filter the rows where column During_Cabg OR column During_Pci has a value of 1. Here's what I'm doing:
pci_or_cabg = @chain df begin
    select([:During_Cabg, :During_Pci] .=> ByRow(x -> coalesce.(x, 0));…

SevenSouls
3 votes, 1 answer
How to define an empty DataFrame with dynamically typed Column Names and Column Types in Julia?
Given column names and column types like these:
col_names = ["A", "B", "C"]
col_types = ["String", "Int64", "Bool"]
I want to create an empty DataFrame like this:
desired_DF = DataFrame(A = String[], B = Int64[], C = Bool[]) #But I cannot specify…

Realife_Brahmin
3 votes, 2 answers
Is there any function in Julia to repeat each row of Julia data frame n times (where n varies across all rows)?
I have a Julia data frame:
df = DataFrame("Category" => ["A", "B", "C"], "n" => [1, 2, 3])
3×2 DataFrame
 Row │ Category  n
     │ String    Int64
─────┼─────────────────
   1 │ A             1
   2 │ B             2
   3 │ C             3
and I…

Marcin Żurek