Quick start
NOTE: This whole page is also available as a Jupyter notebook that you can try out yourself: notebooks/Usage_example_of_awkward_zipper.ipynb
Before working with awkward-zipper, we have to load the data with the uproot
library. First, we open our ROOT file:
import uproot

# Open the ROOT file and read its "Events" TTree
tree = uproot.open("nano_dy.root")["Events"]
Then we use the TBranch.arrays function to convert the TTree we got from the ROOT file into an awkward array:
# TTree -> awkward.Array[awkward.Record[str, awkward.Array]]
array = tree.arrays(ak_add_doc=True)
Finally, we can pass the resulting array to awkward-zipper:
from awkward_zipper import NanoAOD
restructure = NanoAOD(version="latest")
result = restructure(array)
The whole process can be pictured as a diagram:
![digraph {
"Root file" -> "TTree" ->
"awkward.Array[awkward.Record[str, awkward.Array]]"
-> "awkward-zipper";
}](_images/graphviz-6308a652806151c6367d185836d2c192ed6afb32.png)
How awkward-zipper works internally
Now let's see how awkward-zipper restructures arrays internally, using the NanoAOD schema (layout) as an example. The whole process can be broken down into two big steps.
Adding new fields
Any branches named n{name} are assumed to be counts branches and can be converted to offsets using this helper function:
[Table: counts2offsets(counts), with rows "Counts" and "Result"]
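To make the counts-to-offsets conversion concrete, here is a minimal pure-Python sketch. The function `counts_to_offsets` is a hypothetical stand-in written for illustration; it is not the `awkward_zipper.kernels` implementation, which operates on arrays rather than Python lists.

```python
from itertools import accumulate


def counts_to_offsets(counts):
    """Illustrative stand-in, not the awkward_zipper.kernels implementation."""
    # Offsets start at 0; every following entry is the running total of counts.
    return [0, *accumulate(counts)]


counts = [2, 0, 3]  # e.g. an n{name} branch for three events
offsets = counts_to_offsets(counts)
print(offsets)  # [0, 2, 2, 5]
```

With offsets in hand, the objects of event `i` occupy the slice `offsets[i]:offsets[i + 1]` of the flattened content.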
Any local index branches with names matching {source}_{target}Idx* are converted to global indexes for the event chunk (postfix G).
All local indices and their corresponding global indices are taken from the awkward_zipper.NanoAOD.all_cross_references dictionary.
[Table: local2globalindex(index, counts), with rows "Index array" and "Result"]
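The idea behind the local-to-global conversion can be sketched in pure Python as follows. The function `local_to_global_index` is a hypothetical stand-in, not the library kernel. It assumes that `counts` holds the number of target objects per event, and that negative or out-of-range local indices are kept as -1 ("no match"); the second point is an assumption about edge-case handling, not something stated on this page.

```python
from itertools import accumulate


def local_to_global_index(index, counts):
    """Illustrative stand-in, not the awkward_zipper.kernels implementation.

    index:  per-event lists of local indices into the target collection
    counts: number of target objects in each event
    """
    # Start position of each event's target objects in the flattened chunk.
    starts = [0, *accumulate(counts)][:-1]
    result = []
    for local, start, count in zip(index, starts, counts):
        # Assumption: negative or out-of-range indices stay "no match" (-1).
        result.append([start + i if 0 <= i < count else -1 for i in local])
    return result


electron_jet_idx = [[0, 1], [], [2, -1]]  # local jet index per electron
n_jet = [2, 0, 3]                         # jets per event
print(local_to_global_index(electron_jet_idx, n_jet))  # [[0, 1], [], [4, -1]]
```

In the third event the local index 2 becomes 4, because the two jets of the first event come before it in the flattened chunk.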
Any nested_items are constructed, if the necessary branches are available.
All the input indices are taken from the awkward_zipper.NanoAOD.nested_items dictionary.
[Table: nestedindex(indices), with rows "First index array" and "Result"]
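A rough pure-Python sketch of what building a nested index could look like is given below. It assumes that `nestedindex` combines several same-shaped index arrays element by element, so that each object ends up with a small list of indices. That reading is an assumption based on the function signature, and `nested_index` is a hypothetical stand-in rather than the library kernel.

```python
def nested_index(indices):
    """Illustrative stand-in, not the awkward_zipper.kernels implementation.

    indices: list of same-shaped jagged index arrays (per-event lists)
    """
    # For every event, pair up the k-th entry of each index array.
    return [
        [list(group) for group in zip(*event_arrays)]
        for event_arrays in zip(*indices)
    ]


first = [[0, 2], [], [5]]    # first index array
second = [[1, 3], [], [-1]]  # second index array, same shape
print(nested_index([first, second]))  # [[[0, 1], [2, 3]], [], [[5, -1]]]
```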
In the same manner, any awkward_zipper.NanoAOD.nested_index_items and awkward_zipper.NanoAOD.special_items are constructed, if the necessary branches are available. You can find all these functions in the documentation for awkward_zipper.kernels.
Grouping (zipping) all the fields together
From those arrays, NanoAOD collections are formed as collections of branches grouped by name, where:
one branch exists named name and no branches start with name_, interpreted as a single flat array;
Example: each event has only one Run Id, which is interpreted as a flat array.
one branch exists named name, one named n{name}, and no branches start with name_, interpreted as a single jagged array;
Example: each event has a flat array of PS Weights, which is interpreted as a single jagged array.
no branch exists named {name} and many branches start with name_*, interpreted as a flat table; or
Example: each event has a single Generator, which consists of a record of Generator parameters. These parameters can be scalars or flat arrays. This is interpreted as a flat table.
one branch exists named n{name} and many branches start with name_*, interpreted as a jagged table.
Example: each event has an array of Jets, and each Jet consists of a record of Jet parameters. These parameters can be scalars or flat arrays. This is interpreted as a jagged table.
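The four naming rules above can be expressed as a small classifier over branch names. This is a pure-Python sketch under the stated rules only; `classify_collections` is a hypothetical helper and says nothing about how awkward-zipper implements the grouping internally.

```python
def classify_collections(branch_names):
    """Hypothetical helper: label each collection with one of the four layouts."""
    names = set(branch_names)
    # "Jet_pt" belongs to the collection "Jet": take the text before the first "_".
    prefixes = {b.split("_", 1)[0] for b in names if "_" in b}
    # "nJet" is a counts branch if "Jet" is a known branch or collection prefix.
    counts = {
        b for b in names
        if b.startswith("n") and (b[1:] in names or b[1:] in prefixes)
    }
    # Branches without "_" that are not counts branches stand on their own.
    standalone = {b for b in names if "_" not in b and b not in counts}

    layouts = {}
    for name in sorted(prefixes | standalone):
        jagged = "n" + name in counts
        kind = "table" if name in prefixes else "array"
        layouts[name] = ("jagged " if jagged else "flat ") + kind
    return layouts


branches = [
    "run",
    "nPSWeight", "PSWeight",
    "Generator_weight", "Generator_x1",
    "nJet", "Jet_pt", "Jet_eta",
]
print(classify_collections(branches))
# {'Generator': 'flat table', 'Jet': 'jagged table',
#  'PSWeight': 'jagged array', 'run': 'flat array'}
```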
Collections are assigned mixin types according to the awkward_zipper.NanoAOD.mixins mapping.
Finally, all collections are zipped into one NanoEvents record and returned.
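To illustrate what zipping means for a jagged table, the following pure-Python sketch turns separate per-field arrays into per-event lists of records. `zip_collection` is a hypothetical helper operating on plain lists, and the field values are made up for the example; the library itself works on awkward arrays.

```python
def zip_collection(fields):
    """Hypothetical helper: {"pt": [[...], ...], "eta": [[...], ...]} becomes
    per-event lists of records, one record per object."""
    keys = list(fields)
    events = []
    for per_event in zip(*(fields[k] for k in keys)):
        # per_event holds one list per field; combine them object by object.
        events.append([dict(zip(keys, values)) for values in zip(*per_event)])
    return events


jets = zip_collection({
    "pt": [[40.0, 25.0], [], [60.0]],   # made-up values for three events
    "eta": [[0.1, -1.2], [], [2.0]],
})
events = [{"Jet": event_jets} for event_jets in jets]
print(events[0]["Jet"][1])  # {'pt': 25.0, 'eta': -1.2}
```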