-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Start the sequence at zero and add scrambling #31
Comments
I agree with your points 1) and 3). I am not sure why a lot of package do like that. It feels like the literature (including a lot of paper by A. Owen) do not present things like they are coded. I coded 2) and packaged it RandomizedQuasiMonteCarlo.jl. |
Point 1 makes sense to me. I agree with recommending 2^n points, but I'm not sure about forcing it with the API. Point 2 sounds like it would be a nice feature addition. A PR for (1) would be a good starting point, and a PR for (2) would be welcome if someone can figure out the algorithm. |
Some comments:
|
Maybe point (1) is related to this comment in the source code about skipping |
I opened #35. |
First of all, thank youfor writing Sobol.jl. I have been learning about Quasi Monte Carlo (QMC) recently and I'm so glad that there is a Sobol module in Julia. I want to suggest a fairly significant API change, but I hope to convince you that it's a very good idea.
1) Include the initial$(0,0,...)$ point in the Sobol sequence
The first value in the Sobol sequence is zero. This recent paper by Art Owen shows that the common practice of dropping the$(0,0,...)$ point (as done in Sobol.jl) ruins the digital net property of Sobol and makes its convergence worse. What makes Sobol a "digital net" is the fact that if you take the first $2^n$ Sobol points (in any dimension) you can partition your unit interval / square / hypercube into $2^k (k \le n)$ even slices and each slice will get its fair share $2^(n-k)$ of the points. This is the key to its value in QMC. If you skip the $(0,0,...)$ and draw $2^n$ points, the result is not a digital net. Let me illustrate:
Notice how Sobol.jl leaves the bottom-left square empty and instead puts two points in another square. So that ruins the digital net, and the convergence. You can fix this by skipping the next$2^n - 1$ so that the next $2^n$ points do make a digital net. I suspect that this might be the origin of the common practice of skipping $2^n - 1$ points. This also addresses the concern that $(0,0,...)$ is a corner case that could lead to weird results. But as Owen's paper shows (Figure 1), an even better solution is to scramble the points.
2) Add support for scrambled Sobol points
Back in the 90's Owen proposed scrambling digital nets to create a hybrid method that brings the best features of MC and QMC. In Randomized QMC (RQMC) you scramble the points in a way that preserves the digital net but creates a random instance and removes the corner points like$(0,0,...)$ . Figure 1 of Owen and Rudolf (2021) has a nice illustration of MC vs QMC vs RQMC.
Since Owen's proposal, several authors have proposed various ways to scramble the points and have shown empirically that RQMC has better convergence than QMC. But more importantly, RQMC recovers a feature from MC that is lost in QMC: The ability to measure uncertainty. In MC, if you draw$N$ points to compute some integral, you can repeat the experiment many times to measure the variance of that result. This is lost in QMC because the points are not random. But RQMC lets you do something similar by generating multiple scrambles.
A scrambled Sobol is already implemented in scipy.stats.qmc.Sobol and in Owen's R package (rsobol). Unfortunately I was not able to figure out either method works, so I could not translate them to Julia. It doesn't look complicated at all; I just don't understand SciPy, R, or Sobol direction vectors well enough. They don't seem to be doing exactly the same thing, but they both seem to be doing something short and clever with the direction vectors.
I should mention that the R code from Owen is BSD-licensed, so if you can read R, you can just take his method and convert it to Julia.
Notice that the SciPy and R implementations both default to scrambled points and you have to explicitly ask for unscrambled points.
3) Don't produce Sobol points one at a time, draw$2^n$ at a time
At this point I hope I have convinced you that drawing anything other than a power of 2 worth of Sobol points ruins the digital net, and misses what makes Sobol powerful. In Owen's article he says
The SciPy API strongly encourages you to draw$2^n$ points, and while they allow you to draw a different number, they give you a big warning to not do that. Owen's R code goes even farther --- it doesn't have an option to draw anything but a power of two. The input parameters are the exponent and the dimensionality. So I think Sobol.jl should do the same thing. Ditch the
next!()
method and instead draw a power of two.I realize that this was a long request and that I'm suggesting breaking the API, but I hope I have made a convincing case that this would turn Sobol.jl into a very powerful tool for anyone computing high-dimensional integrals with RQMC methods. Not coincidentally, this is precisely what I am hoping to do with it.
Let me know what you think.
References:
The text was updated successfully, but these errors were encountered: