Dataset Tracking with yt

In this post I'd like to discuss a bit of work in progress to highlight some exciting new features that we hope to have working in yt sometime soon.

On any machine that runs yt, there is a file created in the user's home directory named ~/.yt/parameter_files.csv that yt uses internally to keep track of datasets it has seen. This is just a simple text file containing comma-separated entries with a few pieces of information about each dataset, like its location on disk and the last date and time it was 'seen' by yt. To keep this file from exploding, it's capped at some maximum number of entries. But, clearly, text is not the ideal way to store this kind of information for anything over a few hundred entries. Recently Matt has been working on updating this system to use a SQLite database, which should have several advantages over the text file in terms of speed and disk usage.
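
As a rough sketch of what a SQLite-backed listing could look like (this is not yt's actual schema; the table name, column names and helper functions below are all hypothetical), Python's built-in sqlite3 module is enough to record a dataset and ask for the most recently seen ones:

```python
import sqlite3
import time

# Hypothetical schema: these columns are illustrative, not yt's real layout.
SCHEMA = """
CREATE TABLE IF NOT EXISTS datasets (
    path      TEXT PRIMARY KEY,
    last_seen REAL NOT NULL
)
"""

def open_store(filename=":memory:"):
    """Open (or create) the listing; the default keeps it in memory for testing."""
    conn = sqlite3.connect(filename)
    conn.execute(SCHEMA)
    return conn

def record_seen(conn, path, when=None):
    """Insert a dataset, or refresh its last-seen time if it is already known."""
    when = time.time() if when is None else when
    conn.execute(
        "INSERT OR REPLACE INTO datasets (path, last_seen) VALUES (?, ?)",
        (path, when),
    )
    conn.commit()

def most_recent(conn, limit=10):
    """Return (path, last_seen) rows, newest first."""
    cur = conn.execute(
        "SELECT path, last_seen FROM datasets ORDER BY last_seen DESC LIMIT ?",
        (limit,),
    )
    return cur.fetchall()
```

Because the path is the primary key, seeing the same dataset twice updates one row instead of appending a second line, which is one reason a database avoids the unbounded growth that the text file has to guard against.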

This got me thinking about what could be done to extend this local listing of datasets into something more useful, globally. What if there was a way to view any and all datasets ever seen by yt in one convenient place? It could be searchable over a number of attributes, including creation date and when it was last seen by yt, and it would list which machine the dataset is stored on. Finally, this functionality should be transparent to the user once it is set up (with minimal effort) - the global listing of datasets should just be updated automatically in the background as part of the normal workflow.

Over a couple days last week I did a quick and dirty implementation of this using Amazon AWS SimpleDB and a simple web CGI script I wrote in Python. The advantages of SimpleDB are that it is "in the cloud" (sheesh) and very inexpensive. In fact, for small databases with low usage levels, it is free. (As an aside, Amazon is very generous with academic grants, which could be used for this or other yt-related services.) The Python script is very simple and can be cloned off of BitBucket. The script can be run on any computer with a webserver and Python (which includes Macs and Linux machines), and I envision a website (perhaps mydb.yt-project.org, for example) being created where a user can log in from anywhere to view their datasets easily.

The entire thing is not finished yet: the updates to SimpleDB are not automatic, nor have we settled on a final list of which attributes to store in the listing. However, in two days I was able to get enough working to show what I think are the key killer features of the system in a screencast which I've linked below. I should note that in the time since I made the screencast, I have made a few improvements. In particular, the numerical columns can now be sorted correctly.

I'm excited about the prospects for a simple system like this!

kD-Tree Rendering Improvements

Hi all,

Just sharing a video here that showcases some improvements I've made to the kD-tree rendering that will be making their way into yt for the 2.0 release.

Just to be clear, this is showing the rendering of a cosmology simulation with a 64^3 root grid + 6 AMR levels in real time on 8 processors. The script is run in parallel, with the root processor displaying the results once each frame is finished. The viewpoint is being randomized, showing the power of a kD-tree homogenization that allows a fast back-to-front sorting algorithm for each brick. The big key here is that each processor keeps the data associated with its volume in memory so that a new viewpoint doesn't require additional file I/O.
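
To illustrate why a kD-tree makes the back-to-front ordering cheap, here is a minimal sketch (not the yt implementation; the Node class and the brick labels are made up for the example). At every split, the child on the viewer's side of the splitting plane is the near one, so visiting the far child first yields the bricks from farthest to nearest without any explicit sort:

```python
class Node:
    """A kD-tree node: either a leaf holding a brick, or a split with two children."""

    def __init__(self, brick=None, axis=None, split=None, left=None, right=None):
        self.brick = brick    # leaf payload (any label); None for interior nodes
        self.axis = axis      # 0, 1 or 2 for interior nodes
        self.split = split    # position of the splitting plane along that axis
        self.left = left      # child covering coordinates below the split
        self.right = right    # child covering coordinates above the split


def back_to_front(node, viewpoint):
    """Yield bricks ordered from farthest to nearest as seen from viewpoint."""
    if node.brick is not None:
        yield node.brick
        return
    # The child on the same side of the plane as the viewer is the near one.
    if viewpoint[node.axis] < node.split:
        near, far = node.left, node.right
    else:
        near, far = node.right, node.left
    yield from back_to_front(far, viewpoint)
    yield from back_to_front(near, viewpoint)
```

A new viewpoint only changes which branch is taken at each node, so the tree itself (and the data held in its bricks) can stay in memory untouched between frames.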

To help get an idea of what the load balancing is doing, I figured out a way to plot the outline of the bricks with a color corresponding to each processor. This is using the breadth-first load balancing where the top N subtrees are distributed across the N processors. Some of the colors overlap in an odd way because of the order in which they are shown, but you can get the general idea.
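
One way to read "the top N subtrees" is sketched below. This is an assumption about the scheme rather than yt's actual code, and the dict-based tree and function names are hypothetical: the tree is expanded breadth-first until there are N subtrees, and each one is handed to a processor.

```python
from collections import deque


def node(name, *children):
    """Build a tree node as a plain dict (a stand-in for a real kD-tree node)."""
    return {"name": name, "children": list(children)}


def top_subtrees(root, n):
    """Expand the tree breadth-first until n subtrees exist or only leaves remain."""
    queue = deque([root])   # subtrees that can still be split
    leaves = []             # subtrees that cannot be split any further
    while queue and len(queue) + len(leaves) < n:
        current = queue.popleft()
        if current["children"]:
            queue.extend(current["children"])
        else:
            leaves.append(current)
    return leaves + list(queue)


def assign(root, n_procs):
    """Map processor rank -> name of the subtree that processor owns."""
    subtrees = top_subtrees(root, n_procs)
    return {rank: sub["name"] for rank, sub in enumerate(subtrees)}
```

Note that this balances by subtree count, not by the amount of data in each subtree, and if the tree has fewer than N leaves some processors get nothing.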

[Figure: 3d_kd_breadth_decomp, brick outlines colored by processor]

There are a few more improvements on the way, such as parallel kD-tree construction, which should lower the overhead for this method by quite a lot, so keep an eye out!


Figuring Out Stereo Volume Rendering

Last week I was approached by a friend and collaborator to prepare some large volume renderings using the software volume renderer in yt. In the past we've successfully made very, very large image renderings using yt -- Sam's even made one at 8192^2, although at extremely high resolutions like that the lack of fidelity in the underlying volume renderer sometimes shows up, and occasionally so do artifacts at the AMR grid boundaries, though that's less common. Making the very large volume renderings isn't too bad -- the cost scales roughly with the number of pixels, but we can dispatch many frames to be rendered at once on a cluster.

There are a couple other, more important things to consider when making the big volume renderings. For starters, the entire structure of volume rendering in yt was not really created to generate a series of images -- only a single image. The idea was that you would prepare a specific image, make it, and move on. However, for this project, I want to do a zoom-in, or possibly a more complicated camera path.

Additionally, one of the first things that we did with the volume rendering was silly: we applied no normalization to the output images. That was a mistake, I see now. Part of the reason for this was uncertainty in the correct normalization -- the bias that the user wanted to apply may not be the natural bias from the image. But more than that, because the rendering algorithm itself was somewhat holistically settled upon (the original implementation, which we used for shell-style renderings, was not a "correct" implementation of alpha blending), a natural mechanism for scaling did not immediately present itself. One likely exists, possibly dependent on the field of view; I simply do not yet know it. This will have to be rectified, because the mechanism used for scaling a set of images will have to be different from the mechanism for scaling an image in isolation, or else frames will jump in brightness during the movie's course.
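
The brightness-jump problem can be seen in a toy example. The sketch below assumes the simplest possible scaling, dividing by the maximum pixel value, which is not the "natural" normalization discussed above, just a stand-in; frames are flat lists of positive pixel values and the function names are hypothetical.

```python
def normalize_each(frames):
    """Scale every frame by its own maximum (fine for a still, bad for a movie)."""
    result = []
    for frame in frames:
        peak = max(frame)
        result.append([value / peak for value in frame])
    return result


def normalize_together(frames):
    """Scale all frames by one shared maximum so relative brightness is preserved."""
    peak = max(max(frame) for frame in frames)
    return [[value / peak for value in frame] for frame in frames]
```

With per-frame scaling, a frame that is genuinely twice as bright as its neighbor comes out looking identical to it, so a slow brightening over a zoom turns into flicker or disappears entirely; the shared scale keeps the ratio between frames intact.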

The final thing that I wanted to change was to add support for stereo rendering. Rather than repeat any of the amazing discussion from Paul Bourke's website, I'll simply direct you there: it has everything you ever wanted to know about stereo rendering. (When I was a first-year grad student, we actually bought a copy of his site to use locally -- it was our way of showing support for him putting it online, and it also came with a bunch of source code for example applications.) I first attempted to apply the correct method for stereo, where the view direction is parallel and the total view frustum is shifted.

This did not work. In fact, it made me realize that all this time, the yt method for volume rendering is in fact ... not really a volume rendering method inasmuch as it is a planar ray-casting method. Typically when doing volume rendering, there's a perspective applied to the image: the rays all emanate from a single place, creating a frustum. But for yt, we actually set up a single plane of vectors at the back of the volume and advance that forward across the image. This is good and bad; it's good in the sense that it's more clear precisely what is going on. But it's bad in the sense that correct stereo is more difficult. (Of course, on Bourke's page he has a workaround that may work for this, but I have not yet attempted it.) Here's a rough depiction of the difference between the two methods.

[Figure: rendering mechanism in yt, planar ray casting versus perspective rays]

The upshot is that stereo doesn't seem to work unless you go with the "toe-in" method that can cause eyestrain after a long time and shows visible parallax at the edges.  I'm not sure if this is going to be a problem, but because I am not right now eager to rewrite the rendering backend, this is the way it is for the moment.
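
The difference between the two ray setups can also be written down in a few lines. This is only a geometric sketch under assumed conventions (a square image plane spanned by hypothetical east and north unit vectors), not yt's ray-casting code: planar rays have many origins and one direction, perspective rays have one origin and many directions.

```python
import math


def pixel_positions(center, east, north, width, nx, ny):
    """Return the 3D position of every pixel center on a square image plane."""
    points = []
    for j in range(ny):
        for i in range(nx):
            u = ((i + 0.5) / nx - 0.5) * width
            v = ((j + 0.5) / ny - 0.5) * width
            points.append(tuple(c + u * e + v * n
                                for c, e, n in zip(center, east, north)))
    return points


def planar_rays(center, normal, east, north, width, nx, ny):
    """Parallel rays: origins spread over the plane, all sharing one direction."""
    return [(p, tuple(normal))
            for p in pixel_positions(center, east, north, width, nx, ny)]


def perspective_rays(eye, center, east, north, width, nx, ny):
    """Diverging rays: one shared origin, one direction through each pixel."""
    rays = []
    for p in pixel_positions(center, east, north, width, nx, ny):
        d = tuple(a - b for a, b in zip(p, eye))
        length = math.sqrt(sum(x * x for x in d))
        rays.append((tuple(eye), tuple(x / length for x in d)))
    return rays
```

Shifting the frustum, as correct stereo requires, only means something in the second setup, where there is a single eye point for the rays to converge on; with parallel rays there is no frustum to shift.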

To set up the stereo rendering, I separated out the rendering mechanism from the objects to be rendered. Previously, there was a single VolumeRendering object that you could create, raycast through, then discard. I created a new camera object that accepted a homogenized volume and would call "traverse" on that volume, feeding a back and a front point. The Volume is then responsible for passing off fixed-resolution grids to the camera, which accumulates an image buffer by calling the ray traversal functions. The front and back points are essentially the only things needed to determine the order in which the grids are handed over, but the camera also stores its three orientation vectors and its position that describe it in 3D space. By separating out these two conceptual objects, we undo some of the "single, carefully constructed image" bias that was in the original volume renderer. (And, we open ourselves up to being able to use the Camera with a hardware volume renderer, should that day ever come.)
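
The split of responsibilities described above might look roughly like the following. The two classes are stand-ins written for this example, not yt's Camera or volume objects; the brick tuples, the one-pixel "image buffer" and the simple emission-plus-attenuation compositing are all assumptions made to keep the sketch short.

```python
class Volume:
    """Stand-in for a homogenized volume: a list of (center, emission, alpha) bricks."""

    def __init__(self, bricks):
        self.bricks = list(bricks)

    def traverse(self, back, front):
        """Return bricks ordered from the back point toward the front point."""
        direction = [f - b for f, b in zip(front, back)]

        def distance_from_back(brick):
            center = brick[0]
            return sum((c - b) * d for c, b, d in zip(center, back, direction))

        return sorted(self.bricks, key=distance_from_back)


class Camera:
    """Stand-in camera: owns the viewing geometry, not the data."""

    def __init__(self, center, normal, depth, volume):
        self.center = center    # point the camera looks at
        self.normal = normal    # unit vector pointing from the viewer into the scene
        self.depth = depth      # distance between the front and back points
        self.volume = volume

    def snapshot(self):
        half = 0.5 * self.depth
        front = [c - half * n for c, n in zip(self.center, self.normal)]
        back = [c + half * n for c, n in zip(self.center, self.normal)]
        pixel = 0.0
        for _, emission, alpha in self.volume.traverse(back, front):
            # Back-to-front compositing: nearer bricks attenuate what is behind them.
            pixel = pixel * (1.0 - alpha) + emission
        return pixel
```

Because the volume knows nothing about the view, the same Volume instance can be shared by any number of cameras, which is what a camera path or a stereo pair needs.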

So now we have a camera, and it makes images like this:

[Figure: C_0001]

It's a little dim, but that's a task for another day. The next step is taking that perspective and turning it into a set of stereo images. To do that, I added a new class called StereoPairCamera. It accepts a Camera object and turns it into two camera objects, where the final interocular distance is calculated relative to the image plane width. As I mentioned above, this only operates via toe-in stereo, so it does this in the simplest manner possible: it moves each of the Left and Right cameras by half of the interocular distance away from the original location, and then recalculates a normal vector to point back at the original center. Now we can generate left and right images (L_0001 and R_0001, shown at the end of this post).

Unfortunately, on my laptop (which is my primary/exclusive work computer) I don't have the ability to view these pairs. To get around that, I wrote a simple stereo pair image viewer in OpenGL and imposed upon my friends that do have a stereo viz wall to test it out -- and after some fiddling with the interocular distance, we got what appeared to be workable stereo pairs.
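
The toe-in construction described above is simple enough to sketch directly. This is not the StereoPairCamera source; the function name, the relative_separation parameter and the convention that east points to the viewer's right are assumptions made for the example.

```python
import math


def unit(v):
    """Return v scaled to unit length."""
    length = math.sqrt(sum(x * x for x in v))
    return tuple(x / length for x in v)


def toe_in_pair(position, focus, east, width, relative_separation):
    """Return ((left_eye, left_normal), (right_eye, right_normal)).

    The interocular distance is relative_separation times the image plane
    width; each eye moves half of it along east, then aims back at focus.
    """
    east = unit(east)
    half = 0.5 * relative_separation * width
    cameras = []
    for sign in (-1.0, 1.0):  # left eye first, then right eye
        eye = tuple(p + sign * half * e for p, e in zip(position, east))
        normal = unit(tuple(f - x for f, x in zip(focus, eye)))
        cameras.append((eye, normal))
    return cameras[0], cameras[1]
```

The two normals are no longer parallel, which is exactly what makes this toe-in rather than parallel-frustum stereo: the image planes are rotated relative to each other, and that rotation is the source of the parallax at the edges mentioned earlier.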

The full code for generating camera paths as well as stereo pairs is already in yt, but the documentation is still being written; I might also clean up the interface a bit.  Additionally, at some point in the future, the issue of toe-in stereo versus correct parallel-frustum stereo will need to be dealt with; the last thing I really want to do is force people to only use a bad method for generating stereo pairs.  Hopefully that is something that can be dealt with at a later time.

Thanks to the wealth of resources out there for making this a relatively easy task: the aforementioned Paul Bourke website on stereo pairs, the PyOpenGL and PIL teams for making the image pair viewer easy, and everyone else whose work I've built on to make things like this.

[Figure: L_0001, left image of the stereo pair]

[Figure: R_0001, right image of the stereo pair]