Optimizing yt
@ Matthew Turk | Saturday, Sep 10, 2011 | 3 minute read | Update at Saturday, Sep 10, 2011

This last week, following the release of version 2.2 of yt, I spent a bit of time looking at speed improvements. There were several places where the code was unacceptably slow:

  • 1D profiles (as noted in our method paper, even)
  • Ghost-zone generation
  • RAMSES grid data loading

The first of these was relatively easy to fix. In the past, 1D profiles (unlike 2D profiles) were calculated using pure-Python mechanisms: numpy was used for digitization, then inverse binning was conducted with the numpy ‘where’ command, and these binnings were used to generate the overall histogram. For 2D and 3D profiles, however, we already used specialized C code written expressly for our purposes. This last week I found myself waiting too long for profiles, so I wrote a specialized C function that conducts the binning in one dimension. This sped up my profiling code by a factor of 3-4, depending on the specific field being profiled.
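To make the difference concrete, here is a minimal sketch in Python rather than C. It is not the code that went into yt; the function names and the use of np.bincount are assumptions made purely for illustration. The first function follows the digitize-then-‘where’ pattern described above, which rescans the data once per bin. The second accumulates every value in a single pass, which is the access pattern a dedicated C loop would follow.

```python
import numpy as np


def profile_with_where(bin_field, weight_field, edges):
    """Digitize, then find each bin's members with a separate np.where scan."""
    idx = np.digitize(bin_field, edges)  # values 1..len(edges)-1 are in range
    totals = np.zeros(len(edges) - 1)
    for b in range(1, len(edges)):
        members = np.where(idx == b)[0]  # inverse binning: one full scan per bin
        totals[b - 1] = weight_field[members].sum()
    return totals


def profile_single_pass(bin_field, weight_field, edges):
    """Accumulate every in-range value into its bin in one pass."""
    idx = np.digitize(bin_field, edges)
    in_range = (idx >= 1) & (idx < len(edges))  # drop values outside the edges
    return np.bincount(
        idx[in_range] - 1,
        weights=weight_field[in_range],
        minlength=len(edges) - 1,
    )
```

Both functions return the summed weight per bin. The cost of the first grows with the number of bins times the number of values, while the second touches each value once.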

The second, ghost zone generation, was harder. To generate a ‘smoothed’ grid, interpolation is performed cascading down from the root grid to the final grid, allowing for a buffer region. This helps to avoid dangling nodes. Ideally, filling ghost zones would be easier and require less interpolation; however, as we do not restrict the characteristics of the mesh in such a way as to ease this, we have to handle the most general case. When I spent some time looking over the code, though, I realized that it was also using the most general method of interpolation, one that allows interpolation from a regular grid onto arbitrary shapes. After writing a specialized regular-grid to regular-grid interpolator (and verifying that it gives consistent, identical results), I saw a speedup of a factor of about 2.5-3 in generating ghost zones; this benefits applications ranging from volume rendering to finite differencing.
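As an illustration of why the regular-to-regular case can be cheaper, here is a small Python sketch. It is not yt's interpolator: the function name, the integer refinement factor, and the cell-centred layout are all assumptions chosen to keep the example short. When both source and destination are regular grids, the sample positions along each axis are fixed, so linear interpolation can be applied one axis at a time instead of locating every destination point individually.

```python
import numpy as np


def refine_regular(coarse, factor):
    """Linearly interpolate a cell-centred regular grid onto a finer regular grid.

    `factor` is a hypothetical integer refinement ratio applied to every axis.
    Fine cells lying outside the outermost coarse cell centres take the edge value.
    """
    fine = np.asarray(coarse, dtype=float)
    for axis in range(fine.ndim):
        n = fine.shape[axis]
        src = np.arange(n) + 0.5                      # coarse cell centres
        dst = (np.arange(n * factor) + 0.5) / factor  # fine cell centres, same units
        # The same src/dst positions are reused for every line along this axis.
        fine = np.apply_along_axis(
            lambda line: np.interp(dst, src, line), axis, fine
        )
    return fine
```

For example, refining a (2, 3, 4) array with factor=2 yields a (4, 6, 8) array, and a field that is linear along an axis is reproduced exactly between the outermost coarse cell centres.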

Finally, in the past, RAMSES grids following regridding were allowed to cross domains (i.e., processor files). By rewriting the regridding process so that regrids can only exist within a single domain, I was able to speed up the process of loading data, and yt can now also preload data for things like projections. Next, this will be used as a load balancer, and it will also ease the process of loading particles from disk. I am optimistic that this will also enable faster, more targeted reads, which should bring down peak memory usage.

Hopefully over the next few months more optimization can be conducted. If you want to test out how long something takes, particularly if it’s a long-running task, I recommend using pyprof2html, which you can install with pip install pyprof2html. Then profile your script and convert the output:

$ python2.7 -m cProfile -o my_slow_script.cprof my_slow_script.py
$ pyprof2html my_slow_script.cprof 

This will create a directory called ‘html’, which has a nice presentation of where things are slow. If you send the .cprof file to the mailing list, we can take a look, too, and see if there are some obvious places to speed things up.
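If you would rather get a quick text summary before generating HTML, the .cprof file can also be read with Python's standard pstats module. The sketch below is only a suggestion; the helper name summarize_profile is made up for this example, and it assumes a current Python 3 rather than the python2.7 used in the commands above.

```python
import io
import pstats


def summarize_profile(cprof_path, count=10):
    """Return a text table of the `count` costliest calls by cumulative time."""
    buf = io.StringIO()
    stats = pstats.Stats(cprof_path, stream=buf)
    # strip_dirs() shortens file paths; "cumulative" sorts by time spent in a
    # function including everything it calls.
    stats.strip_dirs().sort_stats("cumulative").print_stats(count)
    return buf.getvalue()


if __name__ == "__main__":
    print(summarize_profile("my_slow_script.cprof"))
```

Sorting by cumulative time usually points at the high-level routine that dominates the run, which is a good place to start looking.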
