
My question is about the Vector module in scikit-hep.

https://vector.readthedocs.io/en/latest/index.html

I have an awkward array of vectors and I'd like to set the mass of all of them to be a common value. For example, I can do this with a single vector object.

x = vector.obj(pt=2, eta=1.5, phi=1, energy=10)
y = x.from_rhophietatau(rho=x.rho, eta=x.eta, phi=x.phi, tau=20)

print(f"{x.mass:6.3f}  {x.pt}  {x.eta}  {x.phi}  {x.energy:6.2f}")
print(f"{y.mass:6.3f}  {y.pt}  {y.eta}  {y.phi}  {y.energy:6.2f}")

Output

 8.824  2  1.5  1   10.00
20.000  2  1.5  1   20.55
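The change in energy in the second line follows from the usual relation E = sqrt(|p|^2 + m^2) with |p| = pt * cosh(eta). The sketch below reproduces both printed numbers with the standard library only; it does not use vector, and the two helper function names are made up for illustration.

```python
import math


def energy_from_pt_eta_mass(pt, eta, mass):
    """Energy for a given transverse momentum, pseudorapidity and mass."""
    p = pt * math.cosh(eta)  # magnitude of the 3-momentum
    return math.sqrt(p * p + mass * mass)


def mass_from_pt_eta_energy(pt, eta, energy):
    """Invariant mass for a given transverse momentum, pseudorapidity and energy."""
    p = pt * math.cosh(eta)
    return math.sqrt(energy * energy - p * p)


# Same inputs as the vector.obj example: pt=2, eta=1.5, energy=10, new mass 20.
original_mass = mass_from_pt_eta_energy(2, 1.5, 10)
new_energy = energy_from_pt_eta_mass(2, 1.5, 20)

print(f"{original_mass:6.3f}")  # mass column of the first output row
print(f"{new_energy:6.2f}")     # energy column of the second output row
```

Changing the mass while keeping pt, eta and phi fixed therefore necessarily changes the energy.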

But suppose I want to do this with an awkward array of vectors?

Let me start with some starter code from this previous question:

Using awkward-array with zip/unzip with two different physics objects

First, I'll get an input file

curl http://opendata.cern.ch/record/12361/files/SMHiggsToZZTo4L.root --output SMHiggsToZZTo4L.root

Then I'll make use of the code from the answer to that question:

import numpy as np
import matplotlib.pylab as plt

import uproot
import awkward as ak

import vector
vector.register_awkward()


# Open the file where the curl command above saved it (the current directory).
infile = uproot.open("SMHiggsToZZTo4L.root")

muon_branch_arrays = infile["Events"].arrays(filter_name="Muon_*")
electron_branch_arrays = infile["Events"].arrays(filter_name="Electron_*")

muons = ak.zip({
    "pt": muon_branch_arrays["Muon_pt"],
    "phi": muon_branch_arrays["Muon_phi"],
    "eta": muon_branch_arrays["Muon_eta"],
    "mass": muon_branch_arrays["Muon_mass"],
    "charge": muon_branch_arrays["Muon_charge"],
}, with_name="Momentum4D")

quads = ak.combinations(muons, 4)
mu1, mu2, mu3, mu4 = ak.unzip(quads)

p4 = mu1 + mu2 + mu3 + mu4

The type of p4 is <class 'vector._backends.awkward_.MomentumArray4D'>. Is there a way to set all the masses of the p4 objects to be, for example, 125? While this is not exactly my analysis, I need to do something similar where I will then use p4 to boost the muX objects to the CM frame of p4 and look at some relative angles. But I need to set the mass of p4 to be a constant value.

Is this possible? Thanks!

Matt

Matt Bellis
1 Answer

This is a well-written question, thank you for the effort!

The answer here is yes, you can set a new value for the mass! You can do this by updating the mass field, either with ak.with_field or with the subscript assignment (__setitem__) operator, e.g.

p4['mass'] = 125.0

This internally calls ak.with_field, which you could also call directly, e.g.

p4 = ak.with_field(p4, 125.0, "mass")

Either way, the 125.0 value is broadcast against the rest of the array.

It is sometimes more convenient to use the vector Awkward constructors, as it is slightly less typing:

muons = vector.zip({
    'pt': muon_branch_arrays['Muon_pt'],
    'phi': muon_branch_arrays['Muon_phi'],
    'eta': muon_branch_arrays['Muon_eta'],
    'charge': muon_branch_arrays['Muon_charge'],
    'mass': muon_branch_arrays['Muon_mass'],
})

vector determines what kind of array you are building from the field names. This is a good opportunity to highlight something important: vector supports aliases for fields, e.g. tau and mass name the same coordinate. If you compare the fields of the muons array above with those of the array you built with ak.zip, you'll notice that my muons array has fields ['rho', 'phi', 'eta', 'tau', 'charge'] whilst your muons array has fields ['pt', 'phi', 'eta', 'mass', 'charge']. What's happening here is that vector.zip canonicalises the field names. This means that, were you to build an array in this manner, you'd want to use p4['tau'] = 125.0 instead of p4['mass'] = 125.0.

This would also be apparent if you transformed your muons array in any way, e.g. double_muons = muons + muons. You'd find that the result loses the charge field, and has tau instead of mass and rho instead of pt. So, something to be mindful of if you need to set a field.

The reason that p4['mass'] = 125 works but p4['mass'][:] = 125 does not lies in how Awkward Array is designed. Whilst Awkward Arrays are immutable, this is only half of the story. You can already see that there is some kind of mutability: we can modify a field in-place. This works because, whilst the underlying "layouts" from which Arrays are built do not allow users to modify their values, the high-level ak.Array can be given a new layout. This is what __setitem__ does under the hood, i.e. https://github.com/scikit-hep/awkward/blob/72c9edd55b9c4611ffc46952cda4cf9920a91315/src/awkward/highlevel.py#L1062-L1063

Angus Hollands
  • Ah, thank you for this detailed explanation! I think what I want to do is update the `tau` field, as that causes the `energy` to be recalculated, which is what I want. However this means that I can have a vector with `mass` as one value and `tau` as a different value. Is this the intended behavior? – Matt Bellis Jul 11 '22 at 05:39
  • I edited my answer at one point, so do check the website if you've been following the replies over email. I originally alluded to this more directly, but my revised answer instead explains the cause of the field name discrepancies. You should be able to use `mass` and `tau` interchangeably, as long as you only ever have one of these fields set on your array. If you set `mass` and your vector instead only has the `tau` field (you can check this with `p4.fields`), you'll run into this conflict. – Angus Hollands Jul 11 '22 at 08:43
  • Note that any operation on your vector that returns a new vector object, e.g. addition, may change the fields (by canonicalising the aliases and dropping any unknown fields, e.g. `charge`). – Angus Hollands Jul 11 '22 at 08:44
  • Thanks again for taking the time to explain this! I think I'm getting it now. For my particular case, I want to make sure I `ak.with_field(...)` with `tau`. When I do, I see both the `tau` and `mass` fields update (along with `energy` and `t`, natch). If I use `ak.with_field(...)` with `mass`, I do not see the fields update because (I know this is oversimplifying things, so forgive me) the `tau`, `rho`, etc. are the *real* fields storing the information of the geometric vector and `mass`, `pt`, etc are aliased "helper" fields for standard physics-speak. Close enough? :) – Matt Bellis Jul 11 '22 at 14:49
  • Nearly! I've read the source, and I think I'd advise just using `tau` for mass instead of `mass`. The reason for this is that I believe aliases like `mass` are transformed to their canonical equivalents quite frequently at the moment, so for ease of use it might be better to switch over to `tau`. What you're observing is that `p4` has different fields to `mu1`. This is because the addition operation changes the fields. In future I would assume that vector will be modified to preserve the aliases where possible. – Angus Hollands Jul 12 '22 at 09:37
  • To make my response above more concrete: in theory, `mass` and `tau` are equally legitimate "real" fields. At the moment, however, vector converts `mass` to `tau` in certain contexts. – Angus Hollands Jul 12 '22 at 10:30
  • Rockin'. Thanks Angus! – Matt Bellis Jul 12 '22 at 14:49