correlation

Correlation on the vc cluster & optimisation

The vc cluster consists of 9 flexbuff-style machines, each with 20 CPUs and 12 SAS/SATA slots. There is a shared home area (hosted by vc0) which is used in correlation by the head node. The aim is to bring this up to a “production” standard, e.g. capable of correlating a 3-station VGOS experiment every week with sufficient overhead to handle problems. Refinements would include increased automation, job queuing and archiving, but the main aim at the moment is confirming its viability.

At present, the main bottlenecks appear to be linked to the I/O speeds of the media and/or network interfaces. In correlating VGOS/mixed-mode experiments, we tend to have several telescopes with multiple datastreams to handle. These data can be stored locally on the vc cluster, either as normal files or as “vbs” scattered recordings. Alternatively, they may be available from the central RDSI data store (located on campus) or on other media at Mt. Pleasant accessed via NFS.

If the CPUs are not the limit, the correlation of a scan will proceed at the access rate of the slowest datastream. Note that this is not tied to the recorded data rate - it is possible (and routine) to correlate with a “speed-up” factor > 1, i.e. faster than real time.
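For example, a stream recorded at 8 Gbps but read back from disk at 16 Gbps can in principle support a speed-up factor of ~2, whereas the same stream read through a 1 Gbps bottleneck limits the speed-up to ~0.125 (eight times slower than real time).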

Known bottlenecks

Disk Speed The HDDs in use on the flexbuffs have a sustained read speed of ~1 Gbps per drive. Accessing a file that spans multiple drives (such as a VBS recording) can be slowed to this single-drive rate unless the “read ahead” option is enabled at runtime (e.g. vbs_fs -n 4 -I test* /mnt/vbs/ enables a 4-chunk read-ahead buffer). Note that a VBS recording will naturally tend to scatter across disks in something like a round-robin fashion, which should mean access speeds can be up to the recording rate (~32 Gbps if using the whole set of drives).
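As a sketch only (the include pattern, recording name and mountpoint are placeholders), mounting with read-ahead and then spot-checking the sequential read rate could look something like:

  # mount all scattered recordings matching the pattern, with a 4-chunk read-ahead buffer
  vbs_fs -n 4 -I "vt9999_hb_*" /mnt/vbs/
  # rough sequential-read check of one recording (reads ~8 GiB and reports the rate)
  dd if=/mnt/vbs/vt9999_hb_123-4567 of=/dev/null bs=64M count=128 status=progress
  # unmount the FUSE filesystem when finished
  fusermount -u /mnt/vbs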

VBS The vbs_fs utility is an easy way to access the scattered recordings, treating them as unified files. One potential issue is how to handle multiple datastreams - is it better to mount them all under one vbs mountpoint, or to run separate instances (see the sketch below)? How does this interact with the read-ahead setting?
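A sketch of the two approaches (the experiment/station patterns and the per-stream mountpoints are hypothetical):

  # option 1: a single instance, with all datastreams under one mountpoint
  vbs_fs -n 4 -I "vt9999_*" /mnt/vbs/
  # option 2: one instance per datastream, each with its own mountpoint and read-ahead buffer
  vbs_fs -n 4 -I "vt9999_hb_*" /mnt/vbs_hb/
  vbs_fs -n 4 -I "vt9999_ke_*" /mnt/vbs_ke/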

Network interfaces At present, the vcs and flexbuffs are each connected via a single 10GbE interface to a fibre switch. Would it be sensible to create a “bonded” interface (a sketch is given below)? Is there any kernel tuning required? (NB - the standard flexbuff optimisation has already been applied.)
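As a sketch only (the port names and address are placeholders, 802.3ad/LACP bonding needs matching configuration on the switch, and these commands are not persistent across reboots):

  # create an LACP (802.3ad) bond from the two 10GbE ports
  ip link add bond0 type bond mode 802.3ad
  ip link set enp1s0f0 down; ip link set enp1s0f0 master bond0
  ip link set enp1s0f1 down; ip link set enp1s0f1 master bond0
  ip link set bond0 up
  # move the data-network address onto the bond rather than the individual ports
  ip addr add 10.0.0.11/24 dev bond0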

v2d parameters to consider: dataBufferFactor, nDataSegments, sendLength, sendSize, visBufferLength, strideLength, xmacLength, numBufferedFFTs. NB - all of these may interact and may be media-dependent! A sketch of where they sit in a .v2d file is given below.
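A rough sketch of where these parameters sit in a .v2d file, following the usual vex2difx layout (global parameters at the top, per-correlation parameters in a SETUP block). The values are illustrative only, not tuned recommendations, and strideLength/xmacLength typically need to divide the number of spectral points in the setup:

  # global buffering parameters
  dataBufferFactor = 32      # scales each datastream's read buffer
  nDataSegments    = 8       # number of segments that buffer is split into
  sendLength       = 0.26    # seconds of data sent to the cores at a time
  sendSize         = 0       # size-based alternative to sendLength (assumption - check the vex2difx docs)
  visBufferLength  = 80      # length of the visibility buffer in mpifxcorr

  SETUP default
  {
    strideLength    = 16     # loop blocking for fringe rotation etc.
    xmacLength      = 128    # block length for the cross-multiply/accumulate
    numBufferedFFTs = 10     # FFTs processed together in each core loop
  }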
