This is the second blog entry in the weekly series, with some updates on
what took place last week with respect to yt development. The most
exciting item is the last one, which is the start of what I want to focus
on for the next several months or years: integration of physics modules with
analysis code, and then the ultimate inversion of that relationship.
yt-2.0
This week saw the release of yt 2.0. Its most prominent feature is a
completely reorganized code base that is ready for expansion, has clearer
entry points for modification, and imports faster. The “stable” branch was
updated to this reorganized code base, so all new installations should have
the same basic layout. The new layout looks something like this:
yt/
yt/data_objects
yt/utilities
yt/frontends
yt/gui
yt/visualization
yt/analysis_modules
Each of these contains Python files, possibly submodules, and so on. But in
contrast with the old layout, where things were named lagos and
raven and so on, I think it’s safe to say we’ve traded whimsy for
utility.
Cython
(If you’ll forgive me, this is taken from an email I wrote to yt-users.)
Cython is a Python-like dialect that compiles down to C code. It
understands NumPy arrays and Python objects natively, and can produce code
that operates at speeds competitive with raw, hand-written C code. We use
it in yt to write things like the volume renderer, fast interpolation, level
set identification, PNG writing, and so on. You can see most of the Cython
code in yt/utilities/_amr_utils/*.pyx.
In the past, the Cython code was converted to C before being put into the
repository. This is visible in yt/utilities/amr_utils.c. If you take a
second to open this up, you’ll note a couple things about it. The first is
that it’s generated code: notoriously awful to read, and 100% impossible to
maintain. That’s why it’s never touched directly, and only generated by
Cython. The problem is that even small changes to the Cython source cascade
into ridiculously long changesets and diffs in the generated C, which end up
growing the size of the Mercurial repository by far more than is appropriate.
To get around this, I have added an install-time dependency on the Cython
package. To ensure that this will cause no problems during the transition,
installation of Cython was added to the install script a while ago, and I
have additionally added a check to setup.py. If Cython is not found,
setup.py should install it. The only cases where this should cause a problem
are those where yt was installed with elevated (sudo) permissions. Now, every
time yt is rebuilt, the C code that was previously in
yt/utilities/amr_utils.c is regenerated from the Cython code in
yt/utilities/_amr_utils/*.pyx. This means that we no longer need to
update amr_utils.c in the Mercurial repository.
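As a rough illustration of the kind of check described above, here is a minimal sketch in Python. It is not the code in yt's setup.py: the function names are hypothetical, and the use of pip as the installer is an assumption made for the example.

```python
import importlib
import importlib.util
import subprocess
import sys


def module_available(name):
    """Return True if a top-level module called `name` can be imported."""
    return importlib.util.find_spec(name) is not None


def pip_install(package):
    # Assumption: pip is available. The real install script may use
    # a different tool entirely.
    subprocess.check_call([sys.executable, "-m", "pip", "install", package])


def ensure_module(name, package=None, installer=pip_install):
    """Make sure `name` is importable, installing it if it is missing.

    Returns True if the module was already present, False if it had to
    be installed. Raises RuntimeError if it is still missing afterwards,
    for example when the install went somewhere this interpreter cannot see.
    """
    if module_available(name):
        return True
    installer(package or name)
    # Let the import system notice newly installed files.
    importlib.invalidate_caches()
    if not module_available(name):
        raise RuntimeError("%s is still not importable after installation" % name)
    return False
```

A build script could call ensure_module("Cython") before importing anything from Cython. Failing loudly when the module is still missing is a design choice for this sketch; it is meant to surface the permissions problem mentioned above rather than hide it.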
Adding Cython brings with it a number of interesting things we can do; these
include much faster iteration on improvements to, say, the volume renderer.
Additionally, there are some other very interesting things one can do with
the speed improvements Cython brings, which hopefully we can explore in the
future.
(Here ends the email extraction)
One possible place to explore using Cython is writing faster, single-cell
kernels on the fly. I have been thinking about this for things like
calculating flux across isocontours, where the kernel may need to be
redefined inside a running session. This could be very useful for looking at
the collapse state of star-forming clumps in a calculation. Another would
be the subject of the next section, which is generating interfaces to
physics modules.
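To make the idea of redefining a single-cell kernel inside a running session concrete, here is a pure-Python stand-in. It uses compile and eval rather than Cython, so it shows only the workflow, not the speed; a Cython version would compile the expression to C instead. The helper names and the field names are hypothetical.

```python
def make_kernel(expression, field_names):
    """Turn a per-cell expression string into a function of the named fields."""
    code = compile(expression, "<kernel>", "eval")
    # Evaluate with no builtins so the expression can only see cell values.
    global_names = {"__builtins__": {}}

    def kernel(*values):
        cell = dict(zip(field_names, values))
        return eval(code, global_names, cell)

    return kernel


def apply_kernel(kernel, *columns):
    """Apply a single-cell kernel to every cell of equally long field columns."""
    return [kernel(*cell) for cell in zip(*columns)]


density = [1.0, 2.0]
velocity = [3.0, 4.0]

# First definition: a simple mass flux per cell.
flux = make_kernel("density * velocity", ("density", "velocity"))
mass_flux = apply_kernel(flux, density, velocity)

# Redefined later in the same session without restarting anything.
flux = make_kernel("0.5 * density * velocity ** 2", ("density", "velocity"))
energy_flux = apply_kernel(flux, density, velocity)
```

Here mass_flux is [3.0, 8.0] and energy_flux is [4.5, 16.0].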
Physics Module Wrapping
As a bit of background, one of the goals I am attempting to reach is to be
able to run individual physics modules (chemistry, cooling, possibly even
hydrodynamics) on a given set of cells; this will enable further exploration
of things like the cooling time, the chemical timescales and so on. More
importantly, it’s the first step toward being able to run an actual
simulation inside yt, which is the goal of this project over the next
several years. More relevantly, I am planning to further develop the Enzo
chemistry solvers, and this is necessary to do so in a rapid manner. A
quick dig through the early stages of yt will reveal that this worked, once
upon a time, with the older Enzo solvers; however, in order to drive forward
my near-term solver development I will not be able to use those wrappers.
This last week I spent some time attempting to write a semi-general wrapper
for Enzo’s physics solvers (in this case, the chemistry solver) that could
be called from Python. I ran into several difficulties, which I will
chronicle here, but I believe I have come upon a solution; unfortunately it
is a solution that may take longer to implement than I had hoped.
Before the development of the branch of Enzo that would become 2.0, the
construction of wrappers around Fortran modules was much simpler. The solver
code was slightly more sanitized then than it is now: F77 comment markers
were used more uniformly (“c” instead of “!”), there were few end-of-line
comments, few lines ran over 80 characters, and so on.
Additionally, the transition to Enzo 2.0 brought with it uniform 64-bit
aware code; for Fortran modules, this relies on “-fdefault-real-8” in the
GNU compilers, and “-r8” in the Intel compilers. This promotes all “REAL”
variables to be of higher precision within the Fortran code.
f2py is a module that comes with NumPy, and which can parse and generate
wrappers for Fortran code. However, the default promotion presents a
difficulty, and I think that we cannot use f2py for this. I found the
next-generation f2py to pose a similar problem, and I was unable to get
fwrap, the new Cython-using wrapper, into a working state. I think that
direct wrapping of Fortran is outside the scope of my near-term goals.
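One way the promotion can bite is sketched below. It assumes a wrapper that takes the declared REAL type at face value and hands over 4-byte values, while the compiled routine reads 8-byte values. The example uses only Python's struct module to show the size mismatch; it is an illustration, not a reproduction of what f2py actually does.

```python
import struct


def reinterpret(values, written_as, read_as):
    """Pack values with one struct format code, then read the bytes with another.

    "f" is a 4-byte float and "d" is an 8-byte double when the "="
    (standard size) prefix is used.
    """
    raw = struct.pack("=%d%s" % (len(values), written_as), *values)
    item_size = struct.calcsize("=" + read_as)
    count = len(raw) // item_size
    return struct.unpack("=%d%s" % (count, read_as), raw[: count * item_size])


cells = [1.0, 2.0, 3.0, 4.0]

# Both sides agree on 8-byte reals: the values survive.
same = reinterpret(cells, "d", "d")

# The caller writes 4-byte reals, the callee reads 8-byte reals: half as
# many values come back, and none of them match what was written.
mismatched = reinterpret(cells, "f", "d")
```

With matching sizes, same holds the original four values. With the mismatch, only two values come back and they are unrelated to the input, which is the sort of silent corruption a wrapper has to rule out.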
However, I was able to locate a piece of software called cppheader, based
on PLY, that can parse C++ header files in a general fashion. I believe
that I can use this to construct general wrappers using Cython for the
entire Grid object, thus exposing the Fortran solvers indirectly. I have a
set of hand-constructed wrappers that I wrote last summer, but a generative
method will be easier to maintain. This will remove the complexity that
arises from the precision
handling in Enzo. Constructing wrappers for other simulation platforms will
likely take a similar form.
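The generative approach can be sketched roughly as follows. This is a toy under stated assumptions: a small regular expression stands in for the real header parser, it only handles simple one-line method declarations with plain value types, and the header contents and method names are hypothetical, not taken from Enzo.

```python
import re

# Matches simple declarations such as "int SolveChemistry(float dt);".
# Pointers, templates, default arguments and multi-line declarations are
# deliberately out of scope for this stand-in.
METHOD_RE = re.compile(r"^\s*(\w+(?:\s+\w+)*)\s+(\w+)\s*\(([^)]*)\)\s*;\s*$")


def parse_methods(header_text):
    """Return (return_type, name, [arguments]) for each simple method found."""
    methods = []
    for line in header_text.splitlines():
        match = METHOD_RE.match(line)
        if match:
            return_type, name, raw_args = match.groups()
            args = [a.strip() for a in raw_args.split(",") if a.strip()]
            methods.append((return_type, name, args))
    return methods


def generate_pxd(header_name, header_text):
    """Emit Cython extern declarations for the first class in a header."""
    class_match = re.search(r"\bclass\s+(\w+)", header_text)
    if class_match is None:
        raise ValueError("no class declaration found")
    lines = [
        'cdef extern from "%s":' % header_name,
        "    cdef cppclass %s:" % class_match.group(1),
    ]
    methods = parse_methods(header_text)
    for return_type, name, args in methods:
        lines.append("        %s %s(%s)" % (return_type, name, ", ".join(args)))
    if not methods:
        lines.append("        pass")
    return "\n".join(lines) + "\n"


# Hypothetical header, for illustration only.
HEADER = """\
class grid {
public:
  int SolveChemistry(float dt);
  float ComputeCoolingTime();
};
"""

pxd_text = generate_pxd("Grid.h", HEADER)
```

For the sample header, pxd_text contains a cdef extern block declaring a cppclass named grid with the two methods. Regenerating this text whenever the header changes is what would make such wrappers cheaper to maintain than hand-written ones.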
That wraps it up for this week.