The vc cluster consists of 9 flexbuff-style machines, each with 20 CPUs and 12 SAS/SATA slots. There is a shared home area (hosted by vc0) which is used in correlation by the head node. The aim is to bring this up to a “production” standard, e.g. capable of correlating a 3-station VGOS experiment every week with sufficient overhead for problems. Refinements would include increased automation, job queuing and archiving, but the main aim at the moment is to confirm its viability.
At present, the main bottlenecks appear to be linked to the I/O speeds of the media and/or network interfaces. In correlating VGOS/mixed-mode experiments, we tend to have several telescopes with multiple datastreams to handle. These data can be stored locally on the vc cluster, either as normal files or as “vbs” scattered recordings. Alternatively, they may be available from the central RDSI data store (located on campus) or on other media at Mt. Pleasant accessed via NFS.
If there are no CPU limits, the correlation of a scan will proceed at the access rate of the slowest datastream. Note that this is not related to the recorded data rate - it is possible (and routine) to correlate with a “speed-up” factor > 1.
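As a rough illustration of the point above, the I/O-limited speed-up factor can be estimated as the slowest datastream access rate divided by the recorded rate. This is a minimal sketch: the function name and the access rates in the example are hypothetical, and it assumes every datastream in the scan was recorded at the same rate and that CPU is not limiting.

```python
def speedup_factor(access_rates_gbps, recorded_rate_gbps):
    """Estimate the I/O-limited speed-up factor for one scan.

    access_rates_gbps: mapping of datastream label -> sustained read rate (Gbps).
    recorded_rate_gbps: rate at which each datastream was recorded (Gbps).
    Assumes no CPU limit, so the slowest datastream sets the pace.
    """
    if recorded_rate_gbps <= 0:
        raise ValueError("recorded rate must be positive")
    if not access_rates_gbps:
        raise ValueError("at least one datastream is required")
    slowest = min(access_rates_gbps.values())
    return slowest / recorded_rate_gbps


# Hypothetical access rates, for illustration only.
rates = {"local vbs": 4.0, "local file": 2.5, "NFS": 1.5}
factor = speedup_factor(rates, recorded_rate_gbps=1.0)
print(f"I/O-limited speed-up factor: {factor:.2f}")
```

A factor above 1 means the scan correlates faster than real time; improving any datastream other than the slowest one does not change the result.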
The current VGOS recording mode consists of 6 datastreams, each at 1 Gbps, recorded via a flexbuff. The flexbuff acts as a packet capture device, storing the datastreams as “chunks” of VDIF-formatted baseband data on an array of HDDs. Each scan consists of tens to hundreds of time-separated chunks of data (the typical chunk size is ~512 MB, corresponding to ~2s of data). For usability, a utility called “vbs_fs” is used to create a pseudo-file system which presents the recordings as single continuous files that can be read normally by DiFX. Changes made to the recording mode and to the vbs_fs options at run time can radically change the read performance of these pseudo-files.
The HDDs in use on the flexbuffs have a basic speed of ~1 Gbps. Accessing a file that spans multiple drives (such as a VBS recording) can be slowed to this rate unless the “read ahead” option is enabled at runtime (e.g.
vbs_fs -n 4 -I 'test*' /mnt/vbs/ will enable a 4-chunk read-ahead buffer). Note that a VBS recording will naturally tend to scatter across disks in something like a round-robin fashion, which should mean access speeds can be up to the recording rate (~32 Gbps if using the whole set of drives).
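The effect of read ahead can be pictured with a toy model. The sketch below assumes (this is not taken from the vbs_fs documentation) that chunks are scattered round-robin over the disks, that a read-ahead depth of n keeps up to n disks busy at once, and that each disk sustains ~1 Gbps. The function name and the disk count in the example are hypothetical, and real throughput will also depend on caching, the network and competing I/O.

```python
def estimated_read_rate_gbps(read_ahead_chunks, n_disks, disk_rate_gbps=1.0):
    """Toy estimate of pseudo-file read rate under round-robin scattering.

    Assumption: with read ahead of n chunks, up to n disks are read
    concurrently (capped at the number of disks); with no read ahead,
    only one disk is active at a time.
    """
    if n_disks < 1:
        raise ValueError("need at least one disk")
    concurrent = max(1, min(read_ahead_chunks, n_disks))
    return concurrent * disk_rate_gbps


# Hypothetical disk count, for illustration only.
for depth in (0, 4, 16):
    rate = estimated_read_rate_gbps(depth, n_disks=8)
    print(f"read ahead {depth:2d} chunks -> ~{rate:.0f} Gbps")
```

Under these assumptions the gain saturates once the read-ahead depth reaches the number of disks holding the recording, which is one thing worth checking against measured rates.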
The vbs_fs utility is an easy way to access the scattered recordings, treating them as unified files. One potential issue is how to handle multiple datastreams: is it better to mount them all under one vbs mountpoint (e.g.
vbs_fs -n 4 -I 'auv*' /mnt/vbs/), or to run separate instances (
vbs_fs -n 4 -I 'auv*a.vdif' /mnt/vbs1 ; vbs_fs -n 4 -I 'auv*b.vdif' /mnt/vbs2)? How does this interact with the read-ahead setting? Lastly, would it make sense to bypass vbs_fs and let DiFX access the individual chunks? One thing that has been confirmed is that simultaneous access of the same pseudo-file by DiFX gives very poor performance (e.g. when faking another station and pointing it to the same filelist for the baseband data).
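To explore the option of bypassing vbs_fs, a first step would be to assemble the chunks of one recording in time order. The sketch below assumes a layout of the form <disk>/<recording>/<recording>.<sequence number> (this should be checked against the actual recorder configuration); the function name, the disk glob and the recording name are hypothetical. Whether DiFX copes well with a filelist of individual chunks is a separate question that this does not answer.

```python
import glob
import os


def ordered_chunks(disk_glob, recording):
    """Return the chunk paths of one recording, sorted by sequence number.

    Assumed layout (not verified here):
        <disk>/<recording>/<recording>.<sequence number>
    disk_glob may contain wildcards, e.g. "/mnt/disk*".
    """
    name = glob.escape(recording)  # treat the recording name literally
    pattern = os.path.join(disk_glob, name, name + ".*")
    chunks = []
    for path in glob.glob(pattern):
        suffix = path.rsplit(".", 1)[1]
        if suffix.isdigit():  # skip anything that is not a numbered chunk
            chunks.append((int(suffix), path))
    return [path for _, path in sorted(chunks)]


# Hypothetical disk glob and recording name, for illustration only.
paths = ordered_chunks("/mnt/disk*", "example_scan_a.vdif")
print(f"found {len(paths)} chunks")
```

Sorting on the integer value of the suffix (rather than on the path string) keeps the order correct even when consecutive chunks live on different disks.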
Network interfaces
At present, the vc machines and the flexbuff are each connected via a single 10 GbE interface to a fibre switch. Would it be sensible to create a “bonded” interface? Is any kernel tuning required (NB: the standard flexbuff optimisation has already been applied)? Should we arrange a VLAN on the switch?
NFS/RDSI access
The AuScope project has an allocation on the RDSI data storage located on campus. This is a high-capacity, high-rate data store which we regularly (mis-)use as scratch space for data transfers and correlation. Access from Mt. Pleasant is over a 10 GbE link, but there are potential issues with NFS tuning and with simultaneous I/O.
Logistics
The biggest issue tends to be getting data somewhere usable and making sure that no competing or interfering operations affect it. These might include write operations from a flexbuff recording or another transfer running in parallel. In practice, some parallelisation is acceptable (particularly if one operation is rate limited), but it can lead to severe congestion and poor throughput.
DiFX runtime tuning
v2d parameters to consider: dataBufferFactor, nDataSegments, sendLength, sendSize, visBufferLength, strideLength, xmacLength, numBufferedFFTs. NB: all of these may interact and be media-dependent! The DiFX defaults give reasonable performance for a range of media/data types, but some are likely driven by limitations that do not apply to our system (e.g. readSize limits for the Mark5B). Also, what would be the optimal (yet implementable) way to arrange the datastreams around the cluster? Is it worth distributing baseband data around the cluster onto the onboard storage prior to correlation, or would this cost too much time?
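Because the v2d parameters listed above may interact, one way to test them is a small grid sweep over a few candidate values, correlating the same scan for each combination and recording the wall-clock time. The sketch below only enumerates the combinations; the candidate values are hypothetical placeholders rather than recommendations, and writing the v2d file and launching DiFX are deliberately left out.

```python
import itertools

# Hypothetical candidate values, chosen only to illustrate the sweep.
candidates = {
    "dataBufferFactor": [32, 64],
    "nDataSegments": [8, 16],
    "numBufferedFFTs": [1, 10],
}


def sweep(candidates):
    """Yield one dict of parameter settings per combination of candidates."""
    names = sorted(candidates)
    for values in itertools.product(*(candidates[n] for n in names)):
        yield dict(zip(names, values))


for trial, combo in enumerate(sweep(candidates)):
    settings = ", ".join(f"{name}={value}" for name, value in combo.items())
    print(f"trial {trial:02d}: {settings}")
```

The number of trials grows multiplicatively with each added parameter, so in practice it is probably worth sweeping two or three parameters at a time on a single representative scan for each media type.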