RDataFrame
September 23, 2024
ROOT is an open-source framework designed for data analysis, particularly in high-energy physics, but it's also used in other fields. While it's a very powerful tool, ROOT can be challenging to work with due to its complexity.
A relatively new feature in ROOT combines the efficiency and power of ROOT with the ease and simplicity of Pandas dataframes: RDataFrame.
RDataFrame provides a simple way to filter, reduce, and process datasets directly from .root files, making large datasets more manageable. It also allows easy conversion to Pandas dataframes, enabling the use of Python's data analysis tools without the burden of massive file sizes. In short, a perfect tool for handling big data.
Since navigating the RDataFrame documentation can be a bit challenging, I thought it would be helpful to provide a few lines of code below to help you get started.
Once ROOT
is installed (Check CERN's official website
on how to do it), the ROOT
library can be used in Python (and C++ as well).
This step is important as ROOT
has to be imported in the Python script.
import ROOT
The initialization of an RDataFrame from a list of .root files can be done in a single line:
rdf = ROOT.RDataFrame(TreeName, ListOfROOTFiles)  # TreeName: name of the TTree stored in the files
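As an illustration of how the list of files might be built, here is a minimal sketch that gathers every .root file in a directory before handing the list to RDataFrame. The helper names collect_root_files and make_rdf, the directory layout, and the tree name "Events" in the usage comment are assumptions made for this example, not part of ROOT.

```python
import glob
import os


def collect_root_files(directory):
    """Return the sorted paths of all .root files found directly in `directory`."""
    return sorted(glob.glob(os.path.join(directory, "*.root")))


def make_rdf(tree_name, directory):
    """Build an RDataFrame over every .root file in `directory` (hypothetical helper)."""
    import ROOT  # imported here so collect_root_files works without ROOT installed

    files = collect_root_files(directory)
    if not files:
        raise FileNotFoundError(f"no .root files found in {directory!r}")
    # tree_name is the name of the TTree stored in each file
    return ROOT.RDataFrame(tree_name, files)


# Example usage (assumes a tree called "Events" in files under ./data):
# rdf = make_rdf("Events", "data")
```

Failing early on an empty file list is a design choice of this sketch: it gives a clearer message than an error raised later, when the dataframe is first used.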
One can print the RDataFrame, get its columns, or count its number of rows:
rdf.Display().Print()
rdf.GetColumnNames()
rdf.Count().GetValue()
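The last two calls can be wrapped in a small helper that returns plain Python objects. This is only a sketch: the function name summarize and the layout of the returned dictionary are assumptions for illustration, and rdf is expected to be an existing RDataFrame.

```python
def summarize(rdf):
    """Return the column names and the number of rows of an RDataFrame."""
    # GetColumnNames returns C++ strings; convert them to plain Python str
    columns = [str(name) for name in rdf.GetColumnNames()]
    # Count() is lazy: the loop over the data only runs when GetValue() is called
    n_rows = rdf.Count().GetValue()
    return {"columns": columns, "n_rows": n_rows}


# Example usage (rdf is an RDataFrame created beforehand):
# info = summarize(rdf)
# print(info["n_rows"], info["columns"])
```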
Just as in Pandas, one can define new columns and apply a selection to the rows of the dataframe:
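rdf = rdf.Define(VariableName, StringOfOperation)  # returns a new dataframe node with the extra column
rdf = rdf.Filter(StringOfExpression)  # returns a new node keeping only the rows that pass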
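Both methods take C++ expressions written as strings, so several cuts can be merged into a single Filter call. The sketch below shows one possible way to do it; the helper names combine_cuts and select, as well as the column names px, py and pt in the usage comment, are assumptions for illustration.

```python
def combine_cuts(cuts):
    """Join several C++ boolean expressions into one Filter string."""
    if not cuts:
        return "true"  # no cut: keep every row
    return " && ".join(f"({cut})" for cut in cuts)


def select(rdf, new_columns, cuts):
    """Define the columns in `new_columns`, then keep rows passing all `cuts`."""
    node = rdf
    for name, expression in new_columns.items():
        # Define returns a new node; the original dataframe is left untouched
        node = node.Define(name, expression)
    return node.Filter(combine_cuts(cuts))


# Example usage (column names px, py are assumptions about the input tree):
# selected = select(rdf, {"pt": "sqrt(px*px + py*py)"}, ["pt > 20"])
```

Wrapping each cut in parentheses keeps operator precedence intact when a cut itself contains an || operator.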
A last and important feature is the conversion between Pandas dataframes and RDataFrame.
In one direction, from a Pandas dataframe df to an RDataFrame, it can be done like this:
data = {key: df[key].values for key in list(df.columns)}
rdf = ROOT.RDF.MakeNumpyDataFrame(data)
And in the other direction, from an RDataFrame to a Pandas dataframe:
import pandas as pd
ListColumns = [str(col) for col in rdf.GetColumnNames()]
df = pd.DataFrame.from_dict(rdf.AsNumpy(ListColumns))
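Converting every column can produce a very large Pandas dataframe, so it may be worth exporting only the columns that are needed. Here is a minimal sketch of that idea; the helper names check_columns and to_pandas and the column names in the usage comment are assumptions for illustration.

```python
def check_columns(rdf, wanted):
    """Return `wanted` as a list, after checking each column exists in `rdf`."""
    available = {str(name) for name in rdf.GetColumnNames()}
    missing = [col for col in wanted if col not in available]
    if missing:
        raise KeyError(f"columns not found in the RDataFrame: {missing}")
    return list(wanted)


def to_pandas(rdf, wanted):
    """Convert only the `wanted` columns of `rdf` to a Pandas dataframe."""
    import pandas as pd  # imported here so check_columns works without Pandas

    # AsNumpy returns a dict mapping column names to NumPy arrays
    return pd.DataFrame(rdf.AsNumpy(check_columns(rdf, wanted)))


# Example usage (column names are assumptions about the input tree):
# df = to_pandas(rdf, ["pt", "eta"])
```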
This is by no means a comprehensive overview of the RDataFrame methods and functions that are available. More information can be found online, for example in the ROOT forums, yet I hope it gives a taste of what RDataFrame can do.