Sunday, March 27, 2016

Bidirectional Path Tracing 8 - Combine multiple s,t strategies optimally

Now we have all the s,t combination strategies implemented, which means for a certain path length integration we can have multiple solutions to apply:
path length 1: 3 ways to integrate direct visible light
path length 2: 4 ways to integrate direct lighting
path length 3: 5 ways to integrate 1st bounce lighting
path length 4: 6 ways to integrate 2nd bounce lighting

Simply added all the strategy would result to overbright image (probably 3 times brighter direct visible light, 4 times brighter direct lighting, 5 times brighter 1st bounce lighting....etc) But does simply divide by strategy number work? Something like:

\(C_{s+t=2}(\bar{x}) = 1/3(C_{s=0, t=2}^{*}(\bar{x})+C_{s=1, t=1}^{*}(\bar{x})+C_{s=2, t=0}^{*}(\bar{x}))\)

\(C_{s+t=3}(\bar{x}) = 1/4(C_{s=0, t=3}^{*}(\bar{x})+C_{s=1, t=2}^{*}(\bar{x})+C_{s=2, t=1}^{*}(\bar{x})+C_{s=3, t=0}^{*}(\bar{x}))\)

\(C_{s+t=4}(\bar{x}) = 1/5(C_{s=0, t=4}^{*}(\bar{x})+C_{s=1, t=3}^{*}(\bar{x})+C_{s=2, t=2}^{*}(\bar{x})+C_{s=3, t=1}^{*}(\bar{x})+C_{s=4, t=0}^{*}(\bar{x}))\)

\(...........\)

Uhh...... it might work......when you have a camera with intersectable lens + every light in the scene is intersectable + no specular bsdf in the scene, which means: any Dirac distribution light (point/directional/spot lights), bsdf(specular), camera(pinhole camera, orthogonal camera) can break this naive divide method. The reason is, some of the strategy just can't handle certain Dirac distribution: s=0 strategies can't handle point light since ray will never hit the light, t=0 strategies can't handle pinhole camera since ray will never hit the lens, t=1 strategies can't handle direct visible specular surface since \(f_{s}(\omega_{o}, \omega_{i}) = 0\) for every possible \(\omega_{o}, \omega_{i}\) combinations. (We rely on sample bsdf specifically for specular bsdf to get a \(f_{s}(\omega_{o}, \omega_{i})\) that represent the whole integration result in Dirac distribution case)


The above image show the length 4 strategies for specular bsdf + pinhole camera (from left \((s=0, t=4)\) to right \((s=4, t=0)\) ) Divide by 5 is definitely not enough to serve the purpose of combing length 4 strategies here.

Even we assume we won't encounter Dirac distribution, assign equal weight to each strategy doesn't provide the optimal result anyway. We see that certain strategy work particularly good on certain scenario, it doesn't make sense to add in other inferior strategy in this scenario to downgrade the result. The problem is: how we know which strategy work best on which scenario?

Veach95 introduced multiple importance sampling to solve this problem. The sampling combination techniques were applied to several rendering related topics includes: direct lighting sampling, final gathering, and bidirectional path tracing. Lafortune publish bidirectional path tracing around the same time with Veach, but the paper didn't formalize the way to combine different sampling strategies. My personal feeling is that bdpt become a powerful beast after applying multiple importance sampling. It was cool to come up result with multiple strategies, but determine which strategy is better in run time is mind-blowing cool :P

The multiple importance sampling estimator:
$$F=\sum_{i=1}^{n}\frac{1}{n_{i}}\sum_{j=1}^{n_{i}}\omega_{i}(X_{i,j})\frac{f(X_{i,j})}{pdf_{i}(X_{i,j})}$$

where \(n\) is the number of sampling techniques, and \(n_{i}\) is the number of sample for technique \(i\). With the power heuristic approach Veach propose:
$$\omega_{i} = (n_{i}pdf_{i}(X_{i,j}))^2/\sum_{k=1}^{n}(n_{k}pdf_{k}(X_{k,j}))^2$$

For example, we got a length 3 path come up with \((s=2,t=2)\) strategy, and we would like to have its MIS weight. What we need will be figure out the pdf to form this path with other strategies \(P_{A}(\bar{x}_{s=0, t=4})\), \(P_{A}(\bar{x}_{s=1, t=3})\),  \(P_{A}(\bar{x}_{s=3, t=1})\),  \(P_{A}(\bar{x}_{s=4, t=0})\) The \(n_{i}\) can simply be treated as 1 since we form every strategies for each integration samples. Expand the result:
$$\omega_{s=2,t=2}(\bar{x}) = \frac{P_{A}(\bar{x}_{s=2, t=2})^2}{P_{A}(\bar{x}_{s=0, t=4})^2 + P_{A}(\bar{x}_{s=1, t=3})^2+ P_{A}(\bar{x}_{s=2, t=2})^2+P_{A}(\bar{x}_{s=3, t=1})^2+P_{A}(\bar{x}_{s=4, t=0})^2}$$

Veach97 use the notation \(k=s+t-1\) and \(p_{i} = P_{A}(\bar{x}_{s=i, t=s+t-i})\), rewritten path \(\bar{x}_{s,t}\) as \(\bar{x} = x_{0}...x_{k}\). This simplify the power heuristic equation to \(\omega_{s,t} = \frac{p_{s}^{2}}{\sum_{i}p_{s}^{2}}\) . In above equation:
$$\omega_{s=2,t=2}(\bar{x}) = \frac{p_{2}^2}{p_{0}^2 + p_{1}^2+ p_{2}^2+p_{3}^2+p_{4}^2}$$
where:
\(p_{0} = P_{A}(x_{3})P_{\omega^{\perp}}(x_{3}\rightarrow x_{2})G(x_{2}\leftrightarrow x_{3})P_{\omega^{\perp}}(x_{2}\rightarrow x_{1})G(x_{1}\leftrightarrow x_{2})\)

\(\:\:\:\:\:\:\:\:\:P_{\omega^{\perp}}(x_{1}\rightarrow x_{0})G(x_{0}\leftrightarrow x_{1})\)

\(p_{1} = P_{A}(x_{3})P_{\omega^{\perp}}(x_{3}\rightarrow x_{2})G(x_{2}\leftrightarrow x_{3})P_{\omega^{\perp}}(x_{2}\rightarrow x_{1})G(x_{1}\leftrightarrow x_{2})P_A(x_{0})\)

\(p_{2} = P_{A}(x_{3})P_{\omega^{\perp}}(x_{3}\rightarrow x_{2})G(x_{2}\leftrightarrow x_{3})P_{\omega^{\perp}}(x_{0}\rightarrow x_{1})G(x_{0}\leftrightarrow x_{1})P_A(x_{0})\)

\(p_{3} = P_{A}(x_{3})P_{\omega^{\perp}}(x_{1}\rightarrow x_{2})G(x_{1}\leftrightarrow x_{2})P_{\omega^{\perp}}(x_{0}\rightarrow x_{1})G(x_{0}\leftrightarrow x_{1})P_A(x_{0})\)

\(p_{4} = P_{\omega^{\perp}}(x_{2}\rightarrow x_{3})G(x_{2}\leftrightarrow x_{3})P_{\omega^{\perp}}(x_{1}\rightarrow x_{2})G(x_{1}\leftrightarrow x_{2})\)

\(\:\:\:\:\:\:\:\:\:P_{\omega^{\perp}}(x_{0}\rightarrow x_{1})G(x_{0}\leftrightarrow x_{1})P_A(x_{0})\)

All ingredients (\(P_{\omega^\perp}\),  \(P_{A}\), \(G\)) can be collected during the path building stage. In fact, Veach even did a clever optimization to avoid duplicate computation between different \(p_{i}\):
$$\omega_{s,t} = \frac{p_{s}^{2}}{\sum_{i}p_{s}^{2}}=\frac{1}{\sum_{i}(p_{i}/p_{s})^{2}}$$

By treating the current \(p_{s}=1\), we can come up the other \(p_{i}\) related to \(p_{s}\) using the equation:

\(\frac{p_{1}}{p_{0}} = \frac{P_{A}(x_{0})}{P_{\omega^{\perp}}(x_{1} \rightarrow x_{0} )G(x_{1} \leftrightarrow x_{0})}\)

\(\frac{p_{i+1}}{p_{i}} = \frac{P_{\omega^{\perp}}(x_{i-1} \rightarrow x_{i}) G(x_{i-1} \leftrightarrow x_{i})}{P_{\omega^{\perp}}(x_{i+1} \rightarrow x_{i}) G(x_{i+1} \leftrightarrow x_{i})}\) for \( 0 < i < k \)

\(\frac{p_{k+1}}{p_{k}} = \frac{P_{\omega^{\perp}}(x_{k-1} \rightarrow x_{k}) G(x_{k-1} \leftrightarrow x_{k})}{P_{A}(x_{k})}\)

and on the reverse side:
\(\frac{p_{0}}{p_{1}} = \frac{P_{\omega^{\perp}}(x_{1} \rightarrow x_{0}) G(x_{1} \leftrightarrow x_{0})}{P_{A}(x_{0})}\)

\(\frac{p_{i-1}}{p_{i}} = \frac{P_{\omega^{\perp}}(x_{i} \rightarrow x_{i-1}) G(x_{i-1} \leftrightarrow x_{i})}{P_{\omega^{\perp}}(x_{i-2} \rightarrow x_{i-1}) G(x_{i-2} \leftrightarrow x_{i-1})}\) for \( 1 < i \leq k \)

\(\frac{p_{k}}{p_{k+1}} = \frac{P_{A}(x_{k})}{P_{\omega^{\perp}}(x_{k-1} \rightarrow x_{k}) G(x_{k-1} \leftrightarrow x_{k})}\)

There are......some devils hidden in the details that need special treatments.
First, while reusing the path by connecting different path vertex from light path and eye path, the end point pdf need to be reevaluate since the connection changes the outgoing direction:


Second, \(\frac{p_{k+1}}{p_{k}} = 0 \) when \(p_{k+1}\) is a camera with Dirac distribution (pinhole camera) since there are 0 chance you can hit a pinhole camera.

Third,  \(\frac{p_{0}}{p_{1}} = 0\) when \(p_{0}\) is a light with Dirac distribution (point light, spot light....etc) since there are 0 chance you can hit a point/spot light.

Fourth, when \(x_{j}\) stands for a surface point with specular bsdf, \(p_{j} = 0\) and \(p_{j+1} =0\) Veach97 contains a detail explanation of this exception handling in 10.3.5, the basic idea is, when \(p_{j} = 0\):

\(\frac {p_{j}}{p_{j-1}} = \frac{P_{\omega^{\perp}}(x_{j-2} \rightarrow x_{j-1}) G(x_{j-2} \leftrightarrow x_{j-1})}{P_{\omega^{\perp}}(x_{j} \rightarrow x_{j-1}) G(x_{j} \leftrightarrow x_{j-1})}\) contains a \(P_{\omega^{\perp}}(x_{j} \rightarrow x_{j-1}) \) in denominator which is a number that is not an actual pdf but a hack to get integration working, let's just call it evil alien number for now

\(\frac {p_{j}}{p_{j-1}}\frac {p_{j+1}}{p_{j}} =\frac{P_{\omega^{\perp}}(x_{j-2} \rightarrow x_{j-1}) G(x_{j-2} \leftrightarrow x_{j-1})}{P_{\omega^{\perp}}(x_{j} \rightarrow x_{j-1}) G(x_{j} \leftrightarrow x_{j-1})} \frac{P_{\omega^{\perp}}(x_{j-1} \rightarrow x_{j}) G(x_{j-1} \leftrightarrow x_{j})}{P_{\omega^{\perp}}(x_{j+1} \rightarrow x_{j}) G(x_{j} \leftrightarrow x_{j+1})}\) evil alien number still sticks in denominator :|

\(\frac {p_{j}}{p_{j-1}}\frac {p_{j+1}}{p_{j}}\frac {p_{j+2}}{p_{j+1}} =\frac{P_{\omega^{\perp}}(x_{j-2} \rightarrow x_{j-1}) G(x_{j-2} \leftrightarrow x_{j-1})}{P_{\omega^{\perp}}(x_{j} \rightarrow x_{j-1}) G(x_{j} \leftrightarrow x_{j-1})} \frac{P_{\omega^{\perp}}(x_{j-1} \rightarrow x_{j}) G(x_{j-1} \leftrightarrow x_{j})}{P_{\omega^{\perp}}(x_{j+1} \rightarrow x_{j}) G(x_{j} \leftrightarrow x_{j+1})}\frac{P_{\omega^{\perp}}(x_{j} \rightarrow x_{j+1}) G(x_{j} \leftrightarrow x_{j+1})}{P_{\omega^{\perp}}(x_{j+2} \rightarrow x_{j+1}) G(x_{j+2} \leftrightarrow x_{j+1})}\) 

The interesting thing happen, another evil alien number \(P_{\omega^{\perp}}(x_{j} \rightarrow x_{j+1})\) appears in the nominator. To make thing even more interesting, this two number are actually the same number that can cancel each other. Yes I questioned myself once whether this two number is the same number and the result is true. In pure specular reflection case (mirror) there are both 1. In reflection/transmission combine material like glass, depends on the implementation, it may be 0.5 (giving half half chance to do the transmission or reflection), it may be a \(F(\omega_{o}, \omega_{i})\) Fresnel number so that you can do a bit importance sampling for this bsdf. 0.5 is self explanatory enough that pdf from \(\omega_{o}\) to \(\omega_{i}\) is the same as  \(\omega_{i}\) to \(\omega_{o}\). Fresnel number if you examine the equation carefully, pdf from \(\omega_{o}\) to \(\omega_{i}\) is still the same as  \(\omega_{i}\) to \(\omega_{o}\) :) 

The conclusion: a specular surface point  \(x_{j}\)  will bring an evil alien number to pdf \(p_{j}\) and \(p_{j+1}\) that this two pdf shouldn't be counted in the final weighting calculation, but as long as we move on to  \(p_{j+2}\), everything goes back to normal.

With the above gotcha in mind, here is my implementation of multiple importance sampling weight evaluation:

And this is how the above first bounce strategies looks like after assign appropriate MIS weight:

You can see the high variance noise (estimator with low pdf) got smooth out and different strategies only preserve the estimator result that with high pdf. The most important thing is: the added on result, still converge correctly compare to the Mitsuba reference.


battle royale round 3: bdpt rendering vs path tracing Mitsuba reference

One thing to notice is that even this scene is not a particularly fun one, bdpt already show that it can use much less samples converging to similar result path tracing did (the above comparison, bdpt use 196 spp, Mitsuba reference uses 400 spp) Though the rendered images are kinda boring (cornell box all the way til the end :P ) The bidirectional path tracing is roughly complete now, and I have to say, OMG!......this journey is definitely not a piece of cake for myself. Time to open up a beer for a small celebration! In the next post, I would try to come up some relatively more interesting images and do a postmortem of this round of implementation.



No comments:

Post a Comment