SAGECal

Sarod Yattawatta

doi:10.5281/zenodo.1051168

SAGECal

Fast, memory efficient and GPU accelerated radio interferometric calibration program

189

mentions

contributors

Get started

795 commitsLast commit ≈ 2 months ago13 stars8 forks

Cite this software

DOI:

10.5281/zenodo.1051168

Description

SAGECAL

Features

Levenberg-Marquardt, (stochastic) LBFGS, Riemannian Trust Region, Nesterov's accelerated gradient descent algorithms
GPU acceleration using CUDA
Fast and accurate interferometric calibration
Gaussian and Student's t noise models
Shapelet source models
CASA MS data format supported
Distributed calibration using MPI - consensus optimization with data multiplexing
Tools to build sky models and restore sky models to images
Adaptive update of ADMM penalty (Barzilai-Borwein a.k.a. Spectral method)
Bandpass calibration and unprecedented RFI mitigation with stochastic LBFGS
Stochastic calibration for handling data at highest resolution (with federated averaging and consensus optimization)
Spectral and spatial regularization of calibration solutions
Large scale diffuse sky models via shapelet decomposition
Lunar frame interferometric simulation via the SPICE toolkit

Please read INSTALL.md for installation instructions, but 'cmake' should work in most cases. We give a brief guide to use SAGECal here but there is extensive documentation in the links at the end.

Contributing

Read the contributing guide

Code documentation

Code documentation can be found here.

Step by Step Introduction:

1) Input Data

Input to sagecal must be in CASA MS format, make sure to create a column in the MS to write output data as well. The data can be in raw or averaged form, also initial calibration using other software can be also applied.

2) Sky Model:

2a) Make an image of your MS (using ExCon/casapy).

Use Duchamp to create a mask for the image. Use buildsky to create a sky model. (see the README file on top level directory). Also create a proper cluster file. Special options to buildsky: -o 1 (NOTE: not -o 2)

Alternatively, create these files by hand according to the following formats.

2b) Cluster file format:

cluster_id chunk_size source1 source2 ... e.g.


0 1 P0C1 P0C2
2 3 P11C2 P11C1 P13C1

Note: putting -ve values for cluster_id will not subtract them from data. chunk_size: find hybrid solutions during one solve run. Eg. if -t 120 is used to select 120 timeslots, cluster 0 will find a solution using the full 120 timeslots while cluster 2 will solve for every 120/3=40 timeslots.

2c) Sky model format:

#name h m s d m s I Q U V spectral_index RM extent_X(rad) extent_Y(rad) pos_angle(rad) freq0

#name h m s d m s I Q U V spectral_index1 spectral_index2 spectral_index3 RM extent_X(rad) extent_Y(rad) pos_angle(rad) freq0

e.g.:

P1C1 0 12 42.996 85 43 21.514 0.030498 0 0 0 -5.713060 0 0 0 0 115039062.0
P5C1 1 18 5.864 85 58 39.755 0.041839 0 0 0 -6.672879 0 0 0 0 115039062.0
#A Gaussian mjor,minor 0.1375,0.0917 deg diameter -> radius(rad), PA 43.4772 deg (-> rad)
#Position Angle: "West from North (counter-clockwise)" (0 deg = North, 90 deg = West). 
#Note: PyBDSF/BBS/DP3 use "North from East (counter-clockwise)" (0 deg = East, 90 deg = North).
#Flux scaling is same for both
G0  5 34 31.75 22 00 52.86 100 0 0 0 0.00 0 0.0012  0.0008 -2.329615801 130.0e6
#A Disk radius=0.041 deg
D01 23 23 25.67 58 48 58 80 0 0 0 0 0 0.000715 0.000715 0 130e6
#A Ring radius=0.031 deg
R01 23 23 25.416 58 48 57 70 0 0 0 0 0 0.00052 0.00052 0 130e6
#A shapelet ('S3C61MD.fits.modes' file must be in the current directory)
S3C61MD 2 22 49.796414 86 18 55.913266 0.135 0 0 0 -6.6 0 1 1 0.0 115000000.0

Note: Comments starting with a '#' are allowed for both sky model and cluster files. Note: 3rd order spectral indices are also supported, use -F 1 option in sagecal. Note: Spectral indices use natural logarithm, exp(ln(I0) + p1 * ln(f/f0) + p2 * ln(f/f0)^2 + ..) so if you have a model with common logarithms like 10^(log(J0) + q1*log(f/f0) + q2*log(f/f0)^2 + ..) then, conversion is

ln(I0)+p1*ln(f/f0)+p2*ln(f/f0)^2+... = ln(10)*(log(J0)+q1*log(f/f0)+q2*log(f/f0))^2)+...)
=ln(10)*(ln(J0)/ln(10)+q1*ln(f/f0)/ln(10)+q2*ln(f/f0)^2/ln(10)^2+...)

I0=J0
p1=q1
p2=q2/ln(10)
p3=q3/ln(10)^2
...

3) Run sagecal

Optionally: Make sure your machine has (1/2 working NVIDIA GPU cards or Intel Xeon Phi MICs) to use sagecal. Recommended usage: (with GPUs)

sagecal_gpu -d my_data.MS -s my_skymodel -c my_clustering -n no.of.threads -t 60 -p my_solutions -e 3 -g 2 -l 10 -m 7 -w 1 -b 1

Replace sagecal_gpu with sagecal if you have a CPU only build.

Use your solution interval (-t 60) so that its big enough to get a decent solution and not too big to make the parameters vary too much. (about 20 minutes per solution is reasonable).

Note: It is also possible to calibrate more than one MS together. See section 4 below. Note: To fully use GPU acceleration use -E 1 option.

Simulations: With -a 1, only a simulation of the sky model is done. With -a 1 and -p 'solutions_file', simulation is done with the sky model corrupted with solutions in 'solutions_file'. With -a 1 and -p 'solutions_file' and -z 'ignore_file', simulation is done with the solutions in the 'solutions_file', but ignoring the cluster ids in the 'ignore_file'. E.g., If you need to ignore cluster ids '-1', '10', '999', create a text file :

-1
10
999

and use it as the 'ignore_file'.

Bandpass correction using stochastic calibration with consensus: Use -N 1 combined with options for -M,-w (see also section 4 below).

4) Distributed calibration

Use mpirun to run sagecal-mpi, (or sagecal-mpi_gpu for GPU version) for example:

 mpirun  -np 11 -hostfile ./machines --map-by node --cpus-per-proc 8 
 --mca yield_when_idle 1 -mca orte_tmpdir_base /scratch/users/sarod 
 /full/path/to/sagecal-mpi -f 'MS*pattern' -A 30 -P 2 -r 5 
 -s sky.txt -c cluster.txt -n 16 -t 1 -e 3 -g 2 -l 10 -m 7 -x 10 -F 1 -j 5

Specific options : -np 11 : 11 processes : starts 10 workers + 1 master

./machines : will list the host names of the 11 (or fewer) nodes used ( 1st name is the master ) : normally the node where you invoke mpirun

-f 'MS*pattern' : Search MS names that match this pattern and calibrate all of them together. The total number of MS being calibrated can be higher than the actual number of slaves (multiplexing).

-A 30 : 30 ADMM iterations.

-P 2 : polynomial in frequency has 2 terms.

-Q : can change the type of polynomial used (-Q 2 gives Bernstein polynomials).

-r 5 : regularization factor is 5.0. See -G option below for regularization per direction.

-G textfile: each cluster can have a different regularization factor, instead of using -r option (and -u option) when the regularization is the same for all clusters. The text file format is as follows: each row should have the cluter_id, hybrid count (= same as in the cluster file), spectral regularization, and (optionally) spatial regularization, e.g.,

# cluster_id hybrid_factor spetral_reg spatial_reg
-1 1 5.0 3.0
2 4 1.0 2.0
...

It is also possible to only use spectral regularization, for example,

# cluster_id hybrid_factor spetral_reg
-1 1 5.0
2 4 1.0
...

Note that lines starting with '#' are comments.

-N 1: enable stochastic calibration (minibatches of data), combined with options -M, -w and -u.

-U 1: use global solution instead of local solution for residual calculation.

MPI specific options:

/scratch/users/sarod : this is where MPI stores temp files (default is probably /tmp).

--mca*: various options to tune the networking and scheduling.

Note: the number of workers (-np option) can be lower than the number of MS calibrated. The program will divide the workload among the number of available workers.

The rest of the options are similar to sagecal.

5) Spatial regularization

Spatial regularization (with distributed multi-directional calibration) enables the use of solutions along all directions in the sky as spatial regularization for calibration along each single direction. In other words, we construct a model of the systematic errors covering the full field of view (not explicitly but implicitly) and use this as an additional constraint. To turn this on, the following options are needed:

-X L2 penalty,L1 penalty,model order,FISTA iterations,update cadence : For example -X 0.01,1e-4,3,40,10 will use 0.01 as the L2 penalty and 1e-4 as the L1 penalty for solving an elastic net regression to build a spatial model. The model will have 3x3=9 spatial basis functions. The elastic net model will be found using 40 FISTA iterations. The spatial model will be updated at every 10 ADMM iterations.

-u alpha : The regularization factor for the spatial constraint while solving the consensus problem. Note that by using the -G option, you can define regularization factors for each cluster.

After each solution, images showing the spatial model (amplitude) will be created as .PPM files.

5a) Diffuse sky models in calibration

When a spatial model is enabled, it is also possible to apply the spatial model onto a model of a diffuse sky background. The diffuse sky model can have any number of shapelet components. They all should belong to one cluster (say cluster 45). With -D option, it is possible to enable the use of diffuse sky model with the spatial model, like -D 45,0.1, where 0.1 is the regularization factor used while applying the spatial model. The larger this regularization is, the more strongly the spatial model will be enforced onto the diffuse sky model.

6) Solution format

All SAGECal solutions are stored as text files. Lines starting with '#' are comments. The first non-comment line includes some general information, i.e. freq(MHz) bandwidth(MHz) time_interval(min) stations clusters effective_clusters

The remaining lines contain solutions for each cluster as a single column, the first column is just a counter. Let's say there are K effective clusters and N directions. Then there will be K+1 columns, the first column will start from 0 and increase to 8N-1, which can be used to count the row number. It will keep repeating this, for each time interval. The rows 0 to 7 belong to the solutions for the 1st station. The rows 8 to 15 for the 2nd station and so on. Each 8 rows of any given column represent the 8 values of a 2x2 Jones matrix. Lets say these are S0,S1,S2,S3,S4,S5,S6 and S7. Then the Jones matrix is [S0+j*S1, S4+j*S5; S2+j*S3, S6+j*S7] (the ';' denotes the 1st row of the 2x2 matrix).

When a cluster has a chunk size > 1, there will be more than 1 solution per given time interval. So for this cluster, there will be more than 1 column in the solution file, the exact number of columns being equal to the chunk size.