gh-143732: Add tier2 specialization for TO_BOOL#148271
gh-143732: Add tier2 specialization for TO_BOOL#148271eendebakpt wants to merge 11 commits intopython:mainfrom
Conversation
| _REPLACE_WITH_TRUE + | ||
| POP_TOP; | ||
|
|
||
| tier2 op(_TO_BOOL_DICT, (value -- res)) { |
There was a problem hiding this comment.
You can merge this with _TO_BOOL_SIZED by using the fact that both do a fixed offset lookup.
In tier2 optimizer you can set the offset for where the size is stored and do size = (Py_ssize_t)((char *)obj + offset) and check that directly.
There was a problem hiding this comment.
I see what you mean. It goes into the internals of PyDict (e.g. not using PyDict_GET_SIZE, but doing manual offset calculations) and we also need to store the offset somewhere. So I think this is too much of a complication to get rid of a tier2 opcode.
There was a problem hiding this comment.
we also need to store the offset somewhere.
You can store it in the instruction operand0.
| REPLACE_OP(this_instr, _TO_BOOL_DICT, 0, 0); | ||
| } | ||
| else if (tp == &PyTuple_Type || | ||
| tp == &PySet_Type || |
There was a problem hiding this comment.
This is incorrect for set as it does not uses PyObject_VAR_HEAD, this works by accident because it has fill at that offset which is incorrect if set has dummy entries.
There was a problem hiding this comment.
Good catch! I updated the PR to handle the set/frozenset separately.
We can also use your suggestion to fold everything into the _TO_BOOL_SIZED. That means we have to load the offset at runtime (minor cost), but it does keep the number of ops lower. I implemented this in main...eendebakpt:to_bool_specialization_v2.
There was a problem hiding this comment.
That means we have to load the offset at runtime (minor cost), but it does keep the number of ops lower.
I don't think so, in the JIT the offset would be burned into the machine code itself so the offset is fixed and not looked up at runtime.
markshannon
left a comment
There was a problem hiding this comment.
The problem with recording uops not being allowed after specializing uops has been fixed, so you can add a recording uop to _TO_BOOL and use the recorded information for better specialization.
#148285
@markshannon Adding the recording uop will help to specialize for several cases. But it will slow down (a tiny bit) the unspecialized cases. Should we make the recording uop change in a followup PR? And do you have an opinion on whether to use the |
See discussion at #148113.
This PR adds two tier2 opcodes for specialization of
TO_BOOL. The*argsand **kwargs` arguments are marked in tier2 as tuple and dict, respectively.In this PR there is no additional type recording or tier1 opcodes, that is left to followup PRs.
Details