Posterous theme by Cory Watilo

Filed under: yt

Dataset Tracking with yt

In this post I'd like to discuss a bit of work in progress to highlight some exciting new features that we hope to have working in yt sometime soon.

On any machine that runs yt, there is a file created in the users home directory named ~/.yt/parameter_files.csv that yt uses internally to keep track of datasets it has seen. This is just a simple text file containing comma-separated entries with a few pieces of information about datasets, like their location on disk and the last date and time they were 'seen' by yt. To keep this file from exploding, it's kept at some maximum number of entries. But, clearly, text is not the ideal way to store this kind of information for anything over a few hundred entries. Recently Matt has been working on updating this system to use a SQLite database, which should have several advantages over the text file in terms of speed and disk usage.

This got me thinking about what could be done to extend this local listing of datasets into something more useful, globally. What if there was a way to view any and all datasets ever seen by yt in one convenient place? It could be searchable over a number of attributes, including creation date and when it was last seen by yt, and it would list which machine the dataset is stored on. Finally, this functionality should be transparent to the user once it is set up (with minimal effort) - the global listing of datasets should just be updated automatically in the background as part of the normal workflow.

Over a couple days last week I did a quick and dirty implementation of this using Amazon AWS SimpleDB and a simple web-cgi script I wrote in Python. The advantages of SimpleDB are that it is "in the cloud" (sheesh) and very inexpensive. In fact, for small databases with low usage levels, it is free. (As an aside, Amazon is very generous with academic grants, which could be used for this or other yt-related services.) The Python script is very simple and can be cloned off of BitBucket. The script can be run on any computer with a webserver and Python (which includes Macs and Linux machines), and I envision a website (perhaps mydb.yt-project.org, for example) being created where a user can login from anywhere to view their datasets easily.

The entire thing is not finished yet: the updates to SimpleDB are not automatic, nor have we settled on a final list of which attributes to store in the listing. However, in two days I was able to get enough working to show what I think are the key killer features of the system in a screencast which I've linked below. I should note that in the time since I made the screencast, I have made a few improvements. In particular, the numerical columns can now be sorted correctly.

I'm excited about the prospects for a simple system like this!

yt Development: Treecodes, GUIs, IRC and more!

It’s been nearly a month since the last yt development post; in that time, there’s been quite a bit of development in a couple different areas. This is culminating in a 2.1 release, for which Sam Skillman is release manager, sometime in the next few days.

Streamlines and Treecode

SamS has spent some time over the last month developing two types of streamline code. The first integrates a series of streamlines over a selection of the domain, which can then be visualizing using the mplot3d package. The other aspect of this involves selecting one of these integrated streamlines, which is then transformed into an AMR1DData object, which can be queried for values and plotted in other ways. Sam has documented both of these modes of streamline integration and they’ll be included in a new build of the documentation presently.

StephenS has been hard at work on developing a treecode for speeding up the binding energy calculation of clumps or other regions. To that end, he’s implemented not only the treecode itself (using some of the octree functionality which had as-yet been unused in yt) but also a series of tests. He sees substantial speedups, and it has been documented to become part of the next release.

Development Bootstrap and Pasteboards

The process of getting up and running with Mercurial and with development of yt can be a bit tricky. To try to alleviate that, I’ve added a new command, “yt bootstrap_dev” that will handle this. It will set up a bitbucket user, set up a couple handly hg extensions, and also create a ‘pasteboard.’ The pasteboard itself is still a bit in flux, so it’s not being touted just yet as a major feature, but I think it has some promise. The idea behind it is to create a semi-permanent mechanism for sharing scripts and so forth; it’s a versions hg repository which lives on BitBucket’s servers, from which scripts can be programmatically downloaded, embedded, or viewed.

You can see mine at http://matthewturk.bitbucket.org. If you’ve used the bootstrap_dev command, you can play with the pasteboards with “yt pasteboard” and “yt pastegrab”, but be forewarned they might not be completely working yet!

Web GUI and PlotWindow

JeffO has been hard at work rethinking and rebuilding the plotting system in yt. He’s started with the concept of the PlotCollection and thrown it out! The PlotCollection dates from a time when our mechanism for interacting with plots was actually fundamentally different; in the first few months of yt’s existence, it was operated through a GUI called ‘HippoDraw’ which operated with worksheets. The mapping was from a single worksheet to a single plot collection.

Nowadays, however, it seems that the most common use of making a plot collection is to make a bunch of plots that aren’t quite synced up in the same way that they were with HippoDraw. So Jeff has been rethinking the plotting system, sticking close to the idea of ‘conduits’ of data that are present in other systems like the AMRData system. You should be able to take a porthole into the data, toss it down, and then simply receive back an image that is visible through that porthole.

Over the last couple weeks he’s made quite a bit of progress in that, which enabled us to add it as a widget in a forthcoming GUI for yt.

…and, speaking of the GUI, we now have a quite functional protoype of a GUI working. This is the fifth (!) GUI that has been designed for yt. BrittonS, CameronH and JeffO and I took a few days and worked very hard on creating a useful, extensible, and maintainable GUI. (Britton, in fact, had one of the best quotes of the sprint. I said something like, “I’m looking forward to this GUI.” Britton replied, “I’m looking forward to this being the last GUI.” I couldn’t agree with this sentiment more.)

All of the previous versions and implementations of GUIs for yt — HippoDraw, then the original GUI called Reason and written in wxPython, then the subsequent TraitsUI wxPython Reason version 2.0, and finally the Tkinter-based Fisheye — were dependent on a whole stack of dependencies. These included things like wx, GTK, Qt, and on and on and on. These were difficult to install in an automated fashion, but more than that, they added an incredible level of complexity to the installation and to using these GUIs on supercomputer centers.

So we decided to get rid of all of that, and return to the basics. We built a Web GUI, where the widgets, the toolkits, events, and everything are all handled by a web browser, using JavaScript. The decision to do this ultimately came down to a maintainability issue — there’s no compilation necessary, everybody has a web browser, and it can be done trivially over SSH (with far less bandwidth than is required for a comparative X11 session being forwarded.)

The underlying system is just a Python interpreter; when buttons are pressed, they simply call functions that are available in the interpreter. It’s fundamentally a mechanism for issuing commands, displaying results of those commands (including images), and then on top of this we have begun adding widgets.

Cameron summarized some of this in an email to the developer list. It’s still in testing, but is available for testing, despite not being documented. This will be a big part of a 2.2 release of yt.

IRC

There’s now an IRC channel for yt, on FreeNode. There aren’t usually that many people in there (sometimes it’s just me!) but the bot from CIA.vc will echo all pushed commits to the channel. You can use a client like Adium, Irssi, or one of the other Linux or OSX clients, to connect to irc.freenode.net and then to join the channel #yt (“/join #yt”).

We’ll occasionally have development talk here, but it could also be viewed as a faster turnover mechanism for getting help, chatting about oddities, and suggesting feedback. If you’re interested in getting started developing on yt or fixing bugs, this would be the perfect way to get your feet wet. We’ll also likely have some coordinated development sessions here in the future.

See you next week!

yt development: External Analysis and Simulation Code Support

This last week was the first full week on BitBucket and so far I think it has been quite successful.  The new development process is for most of the core developers to maintain personal forks for experimental changes, or longer term changes, and then to commit directly or merge when bug fixes or features are ready to be integrated.  The list of forks is easily visible and each individual fork's divergence from the primary repository can be viewed by clicking on the green arrows.  All of the new mechanisms for developing using BitBucket are included in the "How to Develop yt" section of the documentation.

This last week I spent a few days at KITP's Galaxy Clusters workshop, where I presented on yt.  There were a few major points that came out of my visit, talking to the simulators there, that are germane to the long term development of yt.

  • As time goes on, yt should be increasingly viewed as a mechanism not just for analyzing data with its own, internal analysis routines, but as a mechanism for handling data, transforming it into a uniform interface independent of the underlying simulation code.  This will allow for linking against and utilizing external analysis codes much more easily.  (The three examples that came up while I was at KITP were a new halo finder, a weak lensing code, and a radiation transport code.)  To facilitate the process of calling external codes from within yt, I've written a section in the documentation that covers it.
  • There is a great deal of interest in ensuring yt works equally well with many different simulation platforms.  This is a primary goal of my current fellowship, and I am working toward it.  The next two codes that will be targeted for improvement are Gadget and ART, and I made good contacts at the workshop to this end.
  • The idea of analysis modules, particularly in a block-programming environment, is compelling.  There is quite a bit of interest in an interface where inputs and outputs were handled like pipes.  I am still formulating my ideas on this.  Last Fall I experimented a bit with an introspection system that could handle arguments and could hook up pipes, but it never got very far.  (The code.)
I had a number of scientific takeaways from the meeting, too, but that would all go into a different blog post.

This week I hope to finish up the adaptive ray tracing.  This past week StephenS unveiled a halo serialization mechanism, which I Think many people are excited about, and SamS continued developing his streamline code.

yt development: BitBucket, Task Queues, and Streamlines

The major changes this week came mostly in the form of administrative shifts. However, SamS did some great work I’m going to hint at (he’ll post a blog entry later) and I started laying the ground work for something I’ve been excited about for a while, an MPI-aware task queue.

BitBucket

For the last couple months, yt has been struggling under the constraints of the hg server on its hosting plan. The issue was that particular files checked into the repository (docs_html.zip for one, which is now gone, and amr_utils.c, also gone, for another) took a while to transfer over some connections. During this transfer, the (shared) hosting provider on hg.enzotools.org would kill the server process, resulting in an “abort” message given to the cloning user.

Basically, this was kind of awful, because it meant people couldn’t clone the yt repo reliably, and it also meant that the install script would fail in unpredictable ways (usually indicating a Forthon or setup.py error.) I’m kind of bummed out that I didn’t do something about this sooner; I suspect several people probably have tried to install yt and failed as a result of this. I added some workarounds that staged the download of yt over a couple pulls, which usually fixed it, but there was no reliable solution.

Enter BitBucket. A few of the developers had been using BitBucket for private projects, small repositories, and even (especially) papers that we’d been working on. For a while we’d been talking about moving yt there and trying to leverage the functionality it brings for Distributed Version Control Systems — like forking and pull requests, social coding, and on and on — and last week we hit the breaking point. So we created a new user (yt_analysis) and uploaded the yt repo, the documentation repo, and the cookbook, and we’re going to be conducting our development there. The old addresses should all still work — we have forwarded hg.enzotools.org to the new location.

One of the coolest aspects of this is that anyone can now “fork” the yt repository. What this means is that you can then get your own private version, which you can then make changes to very easily, and then submit them back upstream. I’m really excited about this and I would encourage people to take advantage of it. I’ve rewritten the Developer Documentation to describe how to do this.

All in all, I think this will be a very positive move. BitBucket has a number of value adds, including the forking model, but we should also immediately see a dramatic increase in the reliability of the repository.

Streamlines

SamS has done some work implementing streamlines. Right now they operate by integrating using RK4 any set of vector fields, and then plotting their paths using Matplotlib’s mplot3d command. He’s working on some cool ways to colorize their values, and one of the things I am pushing for is to take any given streamline and convert it to an AMR1DData object. This would enable you to, for instance, follow a stream line in magnetic fields and calculate the density at every point along that streamline.

Once Sam’s comfortable with the feature as-is, he’s going to blog about it, so I won’t steal the thunder for his hard work here.

Task Queues

Building on the ideas behind the time series analysis I started work on the idea of a task queue that’s MPI aware. When this is finished being implemented, it will act as a mechanism for dispatching work, which will be fully integrated with time series analysis. Right now it’s not even close to being done, but a few pieces of the architecture have been implemented.

The idea here is that you will be able to launch a parallel yt job, but have it split itself into sub-parallel tasks. For instance, it you had 100 outputs of a medium-size simulation to analyze, you would write your time series code as usual — you would describe the actions you want taken, how to output them, etc etc. You would then launch yt with a “sub-parallel” option, saying that you wanted to split the total number of MPI jobs into jobs of size N — for instance you could launch a 64 processor yt job, telling it to split into sub-groupings of 4 processors each. Each output would then be distributed in a first come first serve fashion to each of the processor groups. When each group finished its job, it would ask for the next job available, and so on. When completed, the results would be collated and returned.

I’m excited about this, but right now it’s in its infancy. I’ve constructed the mechanisms to do this within a single process space, with no sub-delegation of MPI tasks. The process of implementing this and properly integrating it with time series analysis is going to be a long one, but I am setting it as a task for the next major release of yt. If you’re at all interested in this, drop me a line, and I’m happy to show you how to get started testing it out.

Example HEALpix Contour Rendering Movie

In response to Matt's post on the HEALpix rendering update, I thought it would be worth posting an example movie.  This shows the all-sky rendering of an observer moving from the front face of a simulation through the volume to the back face.  The test simulation is 32 Mpc/h on a side with 64^3 root grid cells and up to 4 levels of refinement.  At the start it looks like a disc because the entire simulation is in front of the camera and by the end it is all around the sides, indicating the simulation is behind the camera.  Enjoy!

(download)

yt development: HEALpix and Contour Tree

This week there was not very much yt development.  However, a few notes may be of interest.  SamS has updated the HEALPix camera to support ordered projections; what this means is that you can now make volume renderings using a standard color transfer function, or even the Planck transfer function, that cover 4pi of the sky.  I am still working on integrating a method for creating images easily, but for now the scripts from last week should work.

I worked a bit on improving the speed of the contour tree, but I am growing to suspect that a full new algorithm will have to be implemented.  I have researching this, and I believe that the best method will require a "union merge" data structure.  Hopefully I will have something to report on this shortly.  As of now, the contouring algorithm should be a factor of 10%-50% faster than it was a week ago, depending on the dataset characteristics.

The idea of selection of astrophysical objects, rather than geometric objects, has been on my mind lately, and I committed a first pass at an API for this sort of selection.  I hope to have more information about that in the near future, but I anticipate those development efforts to ramp up around the end of March.

yt development: All-sky column density calculation

This week I added the ability to calculate all-sky column densities. This functionality uses HEALpix to calculate equally-area regions on the sky and then shoots out rays from a central source to some fixed radius, accumulating values of a field along the way. Although so far I’ve only used it to calculate column densities of “Density” it could be used for other values as well, including all-sky weighted averages of quantities.

This works via a new HEALpixCamera object, which accepts an Nside argument (the number of pixels is 12NsideNside) as well as a center and a radius. It’s still a bit rough around the edges, but currently it will work to generate projections over the entire sky.

It’s important to note that the difficulties inherent in sorting bricks radially prevent it from being used for transfer functions that involve any absorption. This means that currently it only works with “ProjectionTransferFunction” as an input.

To generate the projected pixels, we have to create a camera and ask it to cast rays along each pixel. This script will do that:

from yt.mods import *
import yt.visualization.volume_rendering.camera as camera

Nside = 32
pf = load("DD0008/galaxy0008")
cam = camera.HEALpixCamera([0.5,0.5,0.5], 0.2, Nside, pf = pf, log_fields = [False])
bitmap = cam.snapshot()

The returned bitmap will, as per usual, be an array of integrated values. Because we’re using the projection transfer function, with the HEALpix camera, it will be an ordered pixel list of shape (12NsideNside, 1, 4) where the first channel is ordered in order of pixels as per the HEALpix notation. We now have to convert this to a regularly gridded set of values, between 0 and 2pi and 0 and pi, for the theta and phi coordinates.

yt provides a helper function to go from pixel ID to angle (as well as a few other things). You can access this helper function in this manner:

import yt.utilities.amr_utils as au
from numpy import pi
phi, theta = na.mgrid[0.0:2*pi:800j, 0:pi:800j]
pixi = au.arr_ang2pix_nest(Nside, theta.ravel(), phi.ravel())
img = na.log10(bitmap[:,0,0][pixi]).reshape((800,800))

The call to mgrid creates a regularly-spaced mesh of values. We then ask HEALpix what the pixel IDs are that fall into each of these regularly spaced mesh values, and then we apply those pixels in that order. This transformation will, someday, be implicit in the snapshot() call.

At this point we can plot our regularly spaced mesh using one of several projections. We’ll do the Mollweide projection. To do this, we import the appropriate Matplotlib components and plot using the imshow command:

import matplotlib.figure
import matplotlib.backends.backend_agg

fig = matplotlib.figure.Figure((10, 5))
ax = fig.add_subplot(1,1,1,projection='mollweide')
image = ax.imshow(img, extent=(-pi,pi,-pi/2,pi/2), clip_on=False, aspect=0.5)
cb = fig.colorbar(image, orientation='horizontal')

cb.set_label(r"$\mathrm{Column}\/\mathrm{Density}\/[\mathrm{g}/\mathrm{cm}^2]$")
canvas = matplotlib.backends.backend_agg.FigureCanvasAgg(fig)
canvas.print_figure("allsky.png")

As you can see, it’s currently still a bit rough-around-the-edges (the image is below) but it’s on the way to being very useful. In particular, setting Nside higher above should improve the image quality, and adjusting the axis display in matplotlib can remove the latitude and longitude display. There is also currently a bug (which I have reported but not heard back on) in matplotlib related to changing the size of the axes object.

Hopefully soon enough this will be ready to be wrapped more simply into yt, and it can be made available in a more straightforward fashion. For now, paste 1503 includes a full script that will all-sky render and output an image like the one below.

All-sky projection

yt development: Documentation

As a result of progress in my scientific goals, and the application of recent yt developments to them, I did not make many changes or developments in yt this week. When I did work on yt, I primarily spent time re-organizing the documentation and fixing several errors. I have added an "installation" section, consolidated a few sections, and wrote two new sections on how to make plots and on how to generate derived data products. I think that the documentation is now much easier to navigate and follow.

This week I also fine-tuned the API for the bridge that connects Enzo and Python, relying on feedback from other Enzo developments. This work is the first step in creating a generalized library for constructing initial conditions; it is not meant to be Enzo-specific. I will report more on this, but the ultimate goal is to be able to construct initial conditions using scripts, rather than constructing multiple C routines. This should make scientific experimentation much more accessible.

One final note is that this last week saw the first successful execution of a large-scale Enzo simulation with inline analysis provided by yt, using the `EnzoStaticOutputInMemory` object. BrittonS is driving that simulation forward, and he's also conducting benchmarks to see how expensive inline analysis is.

yt development: Time series, and more

Not much yt development went on in the last week; I spent some time working with Enzo and driving forward simulation goals, which resulted in some development that directly benefited those simulation goals. However, this fortuitously coincided with work I have been eager to return to for quite some time: namely, time series analysis!

Time Series Analysis

The problem with time series analysis in yt has, to this point, been an issue of verbosity and clunkiness. Typically, unless using the EnzoSimulation class (which is only available as of right now for Enzo) one would set up a loop:

for pfi in range(30):
    fn = "DD%04i/DD%04i" % (pfi, pfi)
    pf = load(fn)
    process_output(pf)

This has always been somewhat dissatisfying for me. The idea of a “time series” of data — another data object, moving along a temporal dimension instead of spatial — has been more alluring, but difficult to implement in a consistent way. In August of 2009 I tried my first attempt at this, creating TimeSeries objects that contained multiple parameter files, but it never went anywhere.

This week, overwhelmed with time series data, I spent some time bringing this functionality in line with the current state of yt. The idea behind it is that the underlying data and the operators that act on that data can and should be distinct. The code can be seen in yt/data_objects/time_series.py and yt/data_objects/analyzer_objects.py. I have created a number of analyzer objects which perform common tasks, as well as an easy method for generating your own analyzer objects.

As of the current tip of the repository, the method for generating and adding time series data is somewhat clunky; I can only guarantee it works with Enzo data where the file OutputLog exists. The examples below assume that.

You can now create a TimeSeries object and act on it; in this case we’ll create an EnzoTimeSeries from the OutputLog, and we’ll find the maximum density at all times.

from yt.mods import *
ts = EnzoTimeSeries("MyData", output_log = "OutputLog")
max_rho = ts.tasks["MaximumValue"]("Density")

You can see the (currently) available tasks by calling ts.tasks.keys(). You can also set up chains of tasks to be evaluated by constructing lists of AnalysisTask objects and feeding them to ts.eval and you can create your own tasks as well. For instance, if you wanted to look at the mass in star particles as a function of time, you would write a function that accepts params and pf and then decorate it with analysis_task. Here we have done so:

@analysis_task(('particle_type',))
def MassInParticleType(params, pf):
    dd = pf.h.all_data()
    ptype = (dd["particle_type"] == params.particle_type)
    return (ptype.sum(), dd["ParticleMassMsun"][ptype].sum())

ms = ts.tasks["MassInParticleType"](4)
print ms

I’m very excited about this, and I think it will be a centerpiece of the next major release of yt (and its affiliated documentation.) I also hope to see BrittonS’s EnzoSimulation class act as an initializer for this data object, so that more advanced selections of time series objects can take place when handling Enzo data.

libconfig

I’ve been spending some time looking at alternate configuration methods. This week I spent some time working with Enzo’s problem generation, which is in need of an overhaul, and attempting to expose it to yt to enable rapid construction of new initial conditions. I believe that Enzo may end up moving to libconfig-generatedconfiguration files, but the outcome of that is not certain. In order to future-proof yt I have included a modified version of libconfig and a wrapper around it.

This is in need of testing! If you would like to test it, and you are using the development branch of the code, open up yt/utilities/setup.py and change INSTALL_LIBCONFIG_WRAPPER = 0 to INSTALL_LIBCONFIG_WRAPPER = 1 and try re-building yt. If this works, we’re in the clear — and I’d very much like to hear about it, so drop me a line!

Tab-completion

I have added tab-completion for field names in iyt. This means if you do:

pf = load("DD0040/DD0040")
dd = pf.h.all_data()
dd["

and then hit tab, it will pop up a list of all possible fields that you can access. It only works for data objects as of right now.

Documentation

I’ve rebuilt the documentation a couple times this week, to add a table of contents for the field list, to update the changelog and to add a section on how to load data. The section on loading data is under “Analyzing Data” and is designed to help people get started with yt if they don’t know how to get started with getting their data in. It also includes a list of caveats for the various codes that are supported.

I also realized that the section on “How to Make Plots” was pretty empty, so I filled that in with information and links to the various methods of PlotCollection.

That about covers it. See you next week!

yt development: 2.0, Cython, and physics module wrapping

This is the second blog entry in the weekly series, with some updates on what took place last week with respect to yt development. One of the more exciting things is the final one, which is the start of what I want to focus on for the next couple months or years: integration of physics modules with analysis code, and then the ultimate inversion of that relationship.

yt-2.0

This week saw the release of yt 2.0, which has the most prominent feature of being a completely re-organized code base, ready for expansion, with clearer entry points for modification, and a faster import time. The “stable” branch was updated to this reorganized codebase, and so all new installations should have this same basic layout. The new layout looks something like this:

yt/
yt/data_objects
yt/utilities
yt/frontends
yt/gui
yt/visualization
yt/analysis_modules

Each of these contains python files, possibly submodules, and so on. But in contrast with the old layout, where things were named lagos and raven and so on, I think it’s safe to say we’ve traded whimsy for utility.

Cython

(If you’ll forgive me, this is taken from an email I wrote to yt-users.)

Cython is a Python-like dialect that compiles down to C code. It understands NumPy arrays and Python objects natively, and can produce code that operates at speeds competitive with raw, hand-written C code. We use it in yt to write things like the volume renderer, fast interpolation, level set identification, PNG writing, and so on. You can see most of the Cython code in yt/utilities/_amr_utils/*.pyx .

In the past, the Cython code was converted to C before being put into the repository. This is visible in yt/utilities/amr_utils.c. If you take a second to open this up, you’ll note a couple things about it. The first is that it’s generated code: notoriously awful to read, and 100% impossible to maintain. That’s why it’s never touched directly, and only generated by Cython. But, the problem with that is that if we want to make small changes, they cascade into ridiculously long changesets and diffs, which end up growing the size of the mercurial repository by far more than is appropriate.

To get around this, I have added an install-time dependency on the Cython package. To ensure that this will cause no problems during the transition, installation of Cython was added to the install script a while ago, and I have additionally added a check to setup.py. If Cython is not found, it should install it. The only cases where this should cause a problem are those where yt was installed with elevated (sudo) permissions. Now, every time yt is rebuilt, the C code that was previously in yt/utilities/amr_utils.c is regenerated from the Cython code in yt/utilities/_amr_utils/*.pyx. This means that we no longer need to update `amr_utils.c in the mercurial repository.

Adding Cython brings with it a number of interesting things we can do; these include much faster iteration on improvements to, say, the volume renderer. Additionally, there are some other very interesting things one can do with the speed improvements Cython brings, which hopefully we can explore in the future.

(Here ends the email extraction)

One possible place to explore using Cython is with writing on-the-fly faster, single-cell kernels. I have been thinking about this, for things like calculating flux across isocontours, which may need to be redefined on the fly inside a running session. This could be very useful for looking at the collapse-state of star forming clumps in a calculation. Another would be the subject of the next section, which is generating interfaces to physics modules.

Physics Module Wrapping

As a bit of background, one of the goals I am attempting to reach is to be able to run individual physics modules (chemistry, cooling, possibly even hydrodynamics) on a given set of cells; this will enable further exploration of things like the cooling time, the chemical timescales and so on. More importantly, it’s the first step toward being able to run an actual simulation inside yt, which is the goal of this project over the next several years. More relevantly, I am planning to further develop the Enzo chemistry solvers, and this is necessary to do so in a rapid manner. A quick dig through the early stages of yt will reveal that this worked, once upon a time, with the older Enzo solvers; however, in order to drive forward my near-term solver development I will not be able to use those wrappers.

This last week I spent some time attempting to write a semi-general wrapper for Enzo’s physics solvers (in this case, the chemistry solver) that could be called from Python. I ran into several difficulties, which I will chronicle here, but I believe I have come upon a solution; unfortunately it is a solution that may take longer to implement than I had hoped.

In the past, before the development of the branch of Enzo that would become 2.0, the construction of wrappers around fortran modules was much simpler. In the past, the code had been slightly more sanitized than it is now (comments in F77 were more uniformly used — “c” instead of “!”, few end-of-line, running over 80 characters, and so on) in the solvers. Additionally, the transition to Enzo 2.0 brought with it uniform 64-bit aware code; for Fortran modules, this relies on “—fdefault-real-8” in the GNU compilers, and “-r8” in the Intel compilers. This promotes all “REAL” variables to be of higher precision within the Fortran code.

f2py is a module that comes with NumPy, and which can parse and generate wrappers for Fortran code. However, the default promotion presents a difficulty, and I think that we cannot use f2py for this. I found the next-generation f2py to pose a similar problem, and I was unable to get fwrap, the new Cython-using wrapper into a working state. I think that direct wrapping of Fortran is outside the scope of my near-term goals.

However, what I was able to locate was a piece of software called cppheader, which is based on PLY, that is able to parse in a general-fashion the header files from C++. I believe that I can use this to construct general wrappers using Cython for the entire Grid object, thus exposing the Fortran solvers indirectly. I have a set of hand-constructed wrappers that I wrote last Summer, but a generative method will be easier to maintain. This will remove the complexity that arises from the precision handling in Enzo. Constructing wrappers for other simulation platforms will likely take a similar form.

That wraps it up for this week.