Separable projection integrals for higher-order correlators of the cosmic microwave sky: Acceleration by factors exceeding 100

Briggs, JP; Jäykkä, J; Fergusson, James Robert; Pennycook, SJ; Shellard, Edward Paul

dc.creator	Briggs, JP
dc.creator	Pennycook, SJ
dc.creator	Fergusson, James Robert
dc.creator	Jäykkä, J
dc.creator	Shellard, Edward Paul
dc.date.accessioned	2018-11-24T23:18:32Z
dc.date.available	2016-01-28T15:41:36Z
dc.date.available	2018-11-24T23:18:32Z
dc.date.issued	2016-01-19
dc.identifier	https://www.repository.cam.ac.uk/handle/1810/253535
dc.identifier.uri	http://repository.aust.edu.ng/xmlui/handle/123456789/3299
dc.description.abstract	We present a case study describing efforts to optimise and modernise “Modal”, the simulation and analysis pipeline used by the Planck satellite experiment for constraining general non-Gaussian models of the early universe via the bispectrum (or three-point correlator) of the cosmic microwave background radiation. We focus on one particular element of the code: the projection of bispectra from the end of inflation to the spherical shell at decoupling, which defines the CMB we observe today. This code involves a three-dimensional inner product between two functions, one of which requires an integral, on a non-rectangular domain containing a sparse grid. We show that by employing separable methods this calculation can be reduced to a one-dimensional summation plus two integrations, reducing the overall dimensionality from four to three. The introduction of separable functions also solves the issue of the non-rectangular sparse grid. The separable method can become unstable in certain scenarios, in which case the slower non-separable integral must be calculated instead. We present a discussion of the optimisation of both approaches. We demonstrate significant speed-ups of ≈100×, arising from a combination of algorithmic improvements and architecture-aware optimisations targeted at improving thread and vectorisation behaviour. The resulting MPI/OpenMP hybrid code is capable of executing on clusters containing processors and/or coprocessors, with strong-scaling efficiency of 98.6% on up to 16 nodes. We find that a single coprocessor outperforms two processor sockets by a factor of 1.3× and that running the same code across a combination of both microarchitectures improves performance-per-node by a factor of 3.38×. By making bispectrum calculations competitive with those for the power spectrum (or two-point correlator), we are now able to consider joint analysis for cosmological science exploitation of new data.
dc.language	en
dc.publisher	Elsevier
dc.publisher	Journal of Computational Physics
dc.rights	http://creativecommons.org/licenses/by/4.0/
dc.rights	Attribution 4.0 International
dc.title	Separable projection integrals for higher-order correlators of the cosmic microwave sky: Acceleration by factors exceeding 100
dc.type	Article
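
The abstract's central idea, that separable functions let a multi-dimensional inner product collapse into lower-dimensional pieces, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the Modal pipeline: the function names, the trapezoidal weights, the test functions, and in particular the rectangular grid (the paper's domain is non-rectangular and sparse). It only shows that when both functions factorise, a triple sum equals a product of three one-dimensional sums.

```python
import numpy as np

def inner_product_direct(f, g, x, w):
    """Brute-force 3D weighted inner product: O(n^3) function evaluations."""
    x1, x2, x3 = np.meshgrid(x, x, x, indexing="ij")
    w3 = w[:, None, None] * w[None, :, None] * w[None, None, :]
    return float(np.sum(w3 * f(x1, x2, x3) * g(x1, x2, x3)))

def inner_product_separable(f_factors, g_factors, x, w):
    """Same quantity when f and g factorise: a product of three 1D sums, O(n) work."""
    result = 1.0
    for f_a, g_a in zip(f_factors, g_factors):
        result *= np.sum(w * f_a(x) * g_a(x))
    return float(result)

def product_of(factors):
    """Build the full 3D function from its three 1D factors (hypothetical helper)."""
    return lambda a, b, c: factors[0](a) * factors[1](b) * factors[2](c)

# Toy separable functions, chosen arbitrarily for the illustration.
f_factors = (np.cos, np.sin, lambda t: np.exp(-t))
g_factors = (lambda t: 1.0 + t, lambda t: t**2, np.ones_like)

# Uniform grid on [0, 1] with trapezoidal quadrature weights.
n = 33
x = np.linspace(0.0, 1.0, n)
h = x[1] - x[0]
w = np.full(n, h)
w[0] = w[-1] = h / 2.0

direct = inner_product_direct(product_of(f_factors), product_of(g_factors), x, w)
separable = inner_product_separable(f_factors, g_factors, x, w)
print(direct, separable)
```

The two values agree to floating-point rounding, while the separable form touches only 3n points instead of n^3. The calculation described in the abstract is more involved (a one-dimensional summation plus two integrations), so this sketch should be read as a picture of the principle rather than of the published method.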

Files in this item

Files	Size	Format	View
1-s2.0-S0021999116000279-main.pdf	712.0Kb	application/pdf	View/Open
Briggs et al 20 ... Computational Physics.pdf	1.525Mb	application/pdf	View/Open

This item appears in the following Collection(s)

Department of Materials Science and Metallurgy (648)
