====== Correlation ======


  
If there are no CPU limits, the correlation of a scan will proceed at the access rate of the slowest datastream. Note that this is not related to the recorded data rate - it is possible (and routine) to correlate with a "speed-up" factor > 1.
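For a purely illustrative sense of scale: if each datastream was recorded at 1 Gbps but can be read back from disk at 2 Gbps, the scan can in principle be correlated with a speed-up factor of ~2, provided the CPUs can keep pace.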
===== VGOS recording mode =====

The current VGOS recording mode consists of 6 datastreams, each at 1 Gbps, recorded via a flexbuff. The flexbuff acts as a packet capture device, storing the datastreams as "chunks" of VDIF-formatted baseband data on an array of HDDs. Each scan consists of 10s-100s of time-separated chunks of data (the typical chunk size is ~512 MB, corresponding to ~2 s of data). For usability, a utility called "vbs_fs" is used to create a pseudo-file system which presents the recordings as single continuous files which can be read normally by DiFX. Changes made to the recording mode and to vbs_fs at run time can radically change the read performance of these pseudo-files.
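As a rough sketch of what this looks like in practice (the scan name, disk layout and mountpoint below are hypothetical, not taken from a real experiment):

<code bash>
# Chunks as scattered by the flexbuff across its HDD array (hypothetical scan name)
#   /mnt/disk03/auv001_hb_100-0000/auv001_hb_100-0000.00000000
#   /mnt/disk17/auv001_hb_100-0000/auv001_hb_100-0000.00000001
#   ...

# Mount the vbs_fs pseudo-file system so the scan appears as one continuous file
vbs_fs -I 'auv001*' /mnt/vbs/

# DiFX (or any other reader) can now treat the scan as an ordinary file
ls -lh /mnt/vbs/auv001_hb_100-0000
</code>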
  
===== Known bottlenecks =====
  
**VBS**
The vbs_fs utility is an easy way to access the scattered recordings, treating them as unified files. One potential issue is how to handle multiple datastreams - is it better to mount them all under one vbs mountpoint (e.g. ''vbs_fs -n 4 -I 'auv*' /mnt/vbs/''), or run separate instances (''vbs_fs -n 4 -I 'auv*a.vdif' /mnt/vbs1 ; vbs_fs -n 4 -I 'auv*b.vdif' /mnt/vbs2'')? How does this interact with the read-ahead setting? Lastly, would it make sense to bypass vbs_fs and let DiFX access the individual chunks directly? One thing that has been confirmed is that simultaneous access of the same pseudo-file by DiFX gives very poor performance (e.g. faking another station and pointing it to the same filelist for the baseband data).
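Written out as commands (the mountpoints and include patterns follow the inline examples above and are only indicative), the two mounting strategies are:

<code bash>
# Option 1: one vbs_fs instance and a single mountpoint covering all datastreams
vbs_fs -n 4 -I 'auv*' /mnt/vbs/

# Option 2: one vbs_fs instance (and mountpoint) per datastream
vbs_fs -n 4 -I 'auv*a.vdif' /mnt/vbs1
vbs_fs -n 4 -I 'auv*b.vdif' /mnt/vbs2
# ... and so on for the remaining datastreams
</code>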
  
**Network interfaces**
At present, the vcs and flexbuff are each connected via a single 10 GbE interface to a fibre switch. Would it be sensible to create a "bonded" interface? Is there any kernel tuning required (NB - the standard flexbuff optimisation has already been applied)? Should we arrange a VLAN on the switch?
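For reference, the kind of kernel settings touched by the standard flexbuff optimisation can be checked and adjusted as below; the values shown are illustrative only, not a recommendation for our hosts:

<code bash>
# Inspect the current socket buffer and backlog limits
sysctl net.core.rmem_max net.core.wmem_max net.core.netdev_max_backlog

# Example (illustrative) values of the sort used on 10 GbE packet-capture hosts
sysctl -w net.core.rmem_max=536870912
sysctl -w net.core.wmem_max=536870912
sysctl -w net.core.netdev_max_backlog=250000
</code>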

**NFS/RDSI access**
The AuScope project has an allocation on the RDSI data storage located on campus. This is a high-capacity, high-rate data store which we regularly (mis-)use as scratch space for data transfers and correlation. The Mt. Pleasant network access is over a 10 GbE link, but there are potential issues with NFS tuning and with simultaneous I/O.
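If NFS tuning does turn out to matter, the obvious knobs are the read/write transfer sizes and (on newer kernels) the number of TCP connections per mount. The sketch below uses hypothetical server and export names, not our actual configuration:

<code bash>
# Hypothetical server/export names; rsize/wsize/nconnect values are illustrative only
mount -t nfs -o rsize=1048576,wsize=1048576,nconnect=8,tcp \
    rdsi-server:/auscope /mnt/rdsi
</code>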
  
**Logistics**
The biggest issue tends to be getting data somewhere usable, and making sure that there are no competing/interfering operations affecting it. This might include write operations due to a flexbuff recording or another transfer running in parallel. In practice, some parallelisation is acceptable (particularly if one operation is rate limited), but it can lead to severe congestion and poor throughput.
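One pragmatic way of allowing a parallel transfer without starving the correlator of disk reads is to rate-limit it explicitly; the paths, host and limit below are hypothetical:

<code bash>
# Cap a background transfer at ~200 MB/s so correlator reads keep priority
rsync -a --bwlimit=200000 /mnt/vbs/auv001_hb_100-0000 correlator-node:/data/scratch/
</code>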
  
**DiFX runtime tuning**
v2d parameters to consider: dataBufferFactor, nDataSegments, sendLength, sendSize, visBufferLength, strideLength, xmacLength, numBufferedFFTs
NB - all of these may interact and be media-dependent! DiFX defaults give reasonable performance for a range of media/data types, but some of them are likely driven by limitations that do not apply to our system (e.g. readSize limits for the Mark5B, etc.).
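For orientation, the fragment below shows where these parameters sit in a .v2d file; the values are placeholders to illustrate the syntax, not tuned recommendations:

<code>
# Fragment of a vex2difx .v2d file (placeholder values only)
dataBufferFactor = 32
nDataSegments    = 8
sendLength       = 0.25
visBufferLength  = 40

SETUP default
{
  numBufferedFFTs = 10
  strideLength    = 32
  xmacLength      = 128
}
</code>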
Also, what would be the optimal (and still practical) way to arrange the datastreams around the cluster? Is it worth distributing baseband data around the cluster onto the onboard storage prior to correlation, or is this too much time lost?