## Implementing a Parallelized Octree in F#

Wednesday, January 20, 2010 – 10:31 pm

The result of all my F# hacking around over Christmas wasn’t just some notes on what not to do (my previous F# blog post). I actually got down to writing some more code for my n-body model in F#. I got down to reading some of “F# for Scientists” and a couple of really interesting blog posts on building an Asteroids game for the Xbox using F#. This prompted me to stop *thinking about* implementing an octree and actually get started.

Why F#? Well in this case it seems like the best tool for the job. I’d already considered a C# octree implementation on Code Project. Even after some clean up, refactoring (and bug fixing) I was less than happy with the clarity of the resulting code and ditched it.

The book actually includes some examples of n-body modeling and shows lots of examples of building trees with F#, something the language is pretty good at. As noted in my previous post I also discovered how to do TDD in F# which was a real help.

### Implementing the octree in F#

What’s an octree? An octree is a tree structure where each node has up to eight children, one for each of the octants. You can think of them as a three dimensional variation on a binary tree.

The implementation shown here is a sparse tree where empty nodes are culled from the tree. It also supports limiting the tree depth (the **maxDepth** parameter)and limiting the minimum number of bodies per node (the **minBodiesPerNode** parameter).

As we’re dividing volumes of space up into cubes then the first thing you need is a way to define the octant boundaries. The **OctantBoundary** type is defined in terms of two points (e.g. front-bottom-left and back-top-right, depending on your coordinate system). A boundary has methods allowing calling code to establish if an individual **VectorFloat** or an array of **Body** objects fall within the boundary.

1: [<DebuggerDisplay("{Center} with size {Size}")>]

` 2: type OctantBoundary =`

` 3: { Min : VectorFloat;`

` 4: Max : VectorFloat }`

` 5:`

6: static member create (p1:VectorFloat) (p2:VectorFloat) =

` 7: let xMax, xMin = maxMin p1.X p2.X`

` 8: let yMax, yMin = maxMin p1.Y p2.Y`

` 9: let zMax, zMin = maxMin p1.Z p2.Z`

` 10: { Max = VectorFloat(xMax, yMax, zMax);`

` 11: Min = VectorFloat(xMin, yMin, zMin) }`

` 12:`

13: member inline this.Center = 0.5f * (this.Min + this.Max)

` 14:`

15: member inline this.Size = this.Max.X - this.Min.X

` 16:`

17: static member (*) (b : OctantBoundary, k : float32) = { Min = b.Min * k; Max = b.Max * k }

` 18:`

19: member inline this.isContaining (position:VectorFloat) =

20: (position.X > this.Min.X) && (position.X <= this.Max.X) &&

21: (position.Y > this.Min.Y) && (position.Y <= this.Max.Y) &&

22: (position.Z > this.Min.Z) && (position.Z <= this.Max.Z)

` 23:`

24: member inline this.bodiesWithin (bodies:Body[]) : Body[] =

25: Array.filter ( fun b -> this.isContaining b.Position ) bodies

` 26:`

27: member this.Corners =

28: [| VectorFloat(this.Max.X, this.Max.Y, this.Min.Z);

29: VectorFloat(this.Max.X, this.Max.Y, this.Max.Z);

30: VectorFloat(this.Max.X, this.Min.Y, this.Min.Z);

31: VectorFloat(this.Max.X, this.Min.Y, this.Max.Z);

32: VectorFloat(this.Min.X, this.Max.Y, this.Min.Z);

33: VectorFloat(this.Min.X, this.Max.Y, this.Max.Z);

34: VectorFloat(this.Min.X, this.Min.Y, this.Min.Z);

35: VectorFloat(this.Min.X, this.Min.Y, this.Max.Z)

` 36: |]`

You can see what the **VectorFloat** and **Body** classes look like here, they’re pretty obvious implementations. I’m actually hoping to get some of my code up online really soon so you’ll be able to build and run it for yourself if your interested.

Next you need to define types for the actual tree. A **Node** either contains **LeafData**—a boundary, node properties and a list of bodies—or **NodeData**—a boundary, node properties and a list of (sub)nodes.

1: [<DebuggerDisplay("{displayMe()}")>]

` 2: type Node =`

` 3: | Leaf of LeafData`

` 4: | Node of NodeData`

` 5: `

6: member inline this.isEmpty =

7: match this with | Node(_) -> false | Leaf(n) -> n.Bodies.Length = 0

` 8:`

9: [<NoCoverage("Debugging use only")>]

10: member this.displayMe() =

11: match this with

12: | Leaf(n) -> sprintf "Leaf: bodies %d, mass %f @ %s" n.Bodies.Length n.Properties.Mass (n.Properties.Position.ToString())

13: | Node(n) -> sprintf "Node: subnodes %d, mass %f @ %s" n.Nodes.Length n.Properties.Mass (n.Properties.Position.ToString())

` 14: `

` 15: and NodeData =`

` 16: { Boundary : OctantBoundary;`

` 17: Properties : CenterOfMass;`

` 18: Nodes : Node[]; }`

` 19:`

` 20: and LeafData =`

` 21: { Boundary : OctantBoundary;`

` 22: Properties : CenterOfMass;`

` 23: Bodies : Body[]; }`

` 24: `

25: static member inline (==) (a : LeafData, b : LeafData) = LanguagePrimitives.PhysicalEquality a b

Now you need to recursively partition the tree. This is where F# really starts to shine. The **partition** function recursively divides up space placing the appropriate bodies in sub-octants.

` 1: let NoDepthLimit = -1`

` 2: `

3: let partition (minBodiesPerNode:int) (maxDepth:int) (bodies:Body[]) =

` 4: `

` 5: let getOctantBoundary (bodies:Body[]) =`

` 6: let rec getOctantBoundaryForBody (boundary:OctantBoundary) (b:Body) =`

7: if (boundary.isContaining b.Position) then

` 8: boundary`

9: else

` 10: getOctantBoundaryForBody (boundary * 2.0f) b`

` 11:`

` 12: let boundary = OctantBoundary.create (VectorFloat(-1.0, -1.0, -1.0)) (VectorFloat(1.0, 1.0, 1.0))`

` 13: Array.fold (fun boundary b -> getOctantBoundaryForBody boundary b )`

` 14: boundary bodies`

` 15:`

16: let inline getMaxDepth (maxDepth:int) = if (maxDepth <= NoDepthLimit) then Int32.MaxValue else maxDepth

` 17: `

` 18: let calculateLeafProperties (bodies:Body[]) =`

` 19: let props = bodies |> Array.sumBy (fun b -> { Mass = b.Mass; Position = b.Position })`

` 20: { Position = props.Position / props.Mass; Mass = props.Mass }`

` 21: `

` 22: let calculateNodeProperties (nodes:Node[]) =`

` 23: let props = nodes |> Seq.sumBy (fun n ->`

` 24: match n with`

` 25: | Leaf(n) -> n.Properties`

` 26: | Node(n) -> { Mass = n.Properties.Mass; Position = n.Properties.Position })`

` 27: { Position = props.Position / props.Mass; Mass = props.Mass }`

` 28:`

29: let rec doPartition (minBodiesPerNode:int) (maxDepth:int) (bodies:Body[]) (depth:int) (boundary:OctantBoundary) =

30: if (bodies.Length = 0) then

` 31: { Boundary = boundary;`

` 32: Properties = CenterOfMass.Zero;`

` 33: Bodies = [||]`

` 34: }`

` 35: |> Leaf`

` 36: elif ((bodies.Length <= minBodiesPerNode) || (depth >= maxDepth)) then`

` 37: { Boundary = boundary;`

` 38: Properties = calculateLeafProperties bodies;`

` 39: Bodies = bodies }`

` 40: |> Leaf`

41: else

` 42: let nodes =`

43: (if (depth < 1) then

` 44: boundary.Corners`

` 45: |> Array.map (fun c -> OctantBoundary.create c boundary.Center)`

46: |> Array.map (fun b -> async { return doPartition minBodiesPerNode maxDepth (b.bodiesWithin bodies) (depth + 1) b } )

` 47: |> Async.Parallel`

` 48: |> Async.RunSynchronously`

` 49: |> Array.filter (fun n -> not n.isEmpty)`

50: else

` 51: boundary.Corners`

` 52: |> Array.map (fun c -> OctantBoundary.create c boundary.Center)`

` 53: |> Array.map (fun b -> doPartition minBodiesPerNode maxDepth (b.bodiesWithin bodies) (depth + 1) b)`

` 54: |> Array.filter (fun n -> not n.isEmpty)`

` 55: )`

` 56: {`

` 57: Boundary = boundary;`

` 58: Properties = calculateNodeProperties nodes`

` 59: Nodes = nodes; }`

` 60: |> Node`

` 61:`

` 62: doPartition minBodiesPerNode (getMaxDepth maxDepth) bodies 0 (getOctantBoundary bodies)`

The **partition** function also tries to parallelize the problem by using Parallel.Async to run each of the octants at the first level of the tree on a different thread. This isn’t ideal as it assumes that the distribution of bodies across the tree is uniform—so the amount of work in each octant will be equal—which isn’t true in most cases. However it gives a reasonable speedup on my existing i7 processor with eight hardware threads. I’ll probably revisit this later.

One interesting observation is comparing the (serial) C# implementation with the parallel F# one. The octree is roughly 150 lines of F# but around 600 lines of C#.

**Caveat Emptor*: **I’m relatively new to F#. This code works just fine. I wrote most of it test first using xUnit in F# and am pretty satisfied with it’s functionality and performance. However, an F# expert might do considerably better. If that’s you and you have some feedback on improving this code I’d love to hear it.

*****That’s latin for “buyer beware” which is what an expensive English public school education buys you I guess.

**Update: March 26th 2010.** The source code for this is now available here. For further details see this post.

### Further reading

If you want to find out more about the Barnes-Hut divide and conquer approach to the n-body problem then Berkeley’s Computer Science Department has some online lecture notes.

CS267: Fast Hierarchical Methods for the N-body Problem, Part 1Lecture 24, Apr 11 1996

You can also look at the code implemented in C with notes from Prof. Barnes in The Treecode Guide. It’s probably faster than my F# implementation but it’s also a lot more complex.

## 1 Trackback(s)

Sorry, comments for this entry are closed at this time.