5.4 Row Selection
We already covered two macros that operate on columns, @select and @transform.
Now let’s cover the only macro we need to operate on rows: @subset.
It follows the same principles we’ve seen so far with DataFramesMeta.jl, except that the operation must return a Boolean value for each row, which determines whether that row is kept.
Let’s filter grades above 7:
@rsubset df :grade > 7
name | grade |
---|---|
Alice | 8.5 |
Bob | 9.5 |
Sally | 9.5 |
As you can see, @subset also has a vectorized variant, @rsubset, which applies the condition to each row. Sometimes we want to mix and match vectorized and non-vectorized function calls. For instance, suppose that we want to keep only the grades above the mean grade:
@subset df :grade .> mean(:grade)
name | grade |
---|---|
Alice | 8.5 |
Bob | 9.5 |
Sally | 9.5 |
For this, we need the @subset macro with the > operator vectorized (written .>), since we want an element-wise comparison, but the mean function needs to operate on the whole column of values.
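The difference between the two forms can be sketched as follows. The DataFrame here is hypothetical: the names and grades are assumptions chosen to be consistent with the outputs shown above, not necessarily the data used earlier in the chapter.

```julia
using DataFrames, DataFramesMeta, Statistics

# Hypothetical data (an assumption), consistent with the tables shown above
df = DataFrame(name=["Alice", "Bob", "Sally", "Hank"],
               grade=[8.5, 9.5, 9.5, 4.0])

# Row-wise form: the condition is evaluated once per row,
# so :grade is a single number and a plain > is enough
rowwise = @rsubset df :grade > 7

# Column-wise form: :grade is the whole column,
# so the comparison must be broadcast with a dot
columnwise = @subset df :grade .> 7

# Both forms keep the same rows
same_rows = rowwise == columnwise

# Mixed case: mean(:grade) reduces the whole column to one number,
# and .> then compares each grade against that number
above_mean = @subset df :grade .> mean(:grade)
```

With this data the mean grade is 7.875, so above_mean keeps Alice, Bob, and Sally.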
@subset also supports multiple operations inside a begin ... end statement; a row is kept only if every condition holds:
@rsubset df begin
    :grade > 7
    startswith(:name, "A")
end
name | grade |
---|---|
Alice | 8.5 |
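As a minimal sketch of how the conditions combine, the block form can be compared with a single condition joined by &&. The DataFrame is again hypothetical: its values are assumptions chosen to match the output above.

```julia
using DataFrames, DataFramesMeta

# Hypothetical data (an assumption), consistent with the tables shown above
df = DataFrame(name=["Alice", "Bob", "Sally", "Hank"],
               grade=[8.5, 9.5, 9.5, 4.0])

# Block form: one condition per line, and a row must satisfy all of them
multi = @rsubset df begin
    :grade > 7
    startswith(:name, "A")
end

# Single-expression form: the same two conditions joined with &&
single = @rsubset df :grade > 7 && startswith(:name, "A")

# Both forms select the same rows
same_rows = multi == single
```

The block form tends to be easier to read once there are more than two conditions, since each one sits on its own line.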
CC BY-NC-SA 4.0 Jose Storopoli, Rik Huijzer, Lazaro Alonso