operations:monitoring_hb

This wiki is not maintained! Do not use this when setting up AuScope experiments!

Differences

This shows you the differences between two versions of the page.

operations:monitoring_hb [2011/07/20 13:02]
cehotan
operations:monitoring_hb [2014/12/17 09:53] (current)
Warren Hankey [FS time is out by several seconds]
Line 24: Line 24:
   * **mk5=mode? correct**\\ Check mode with this command in econtrol: ''mk5=mode?''. The result should be\\ ''/mk5/!mode? 0 : ext : 0x55555555 : 2 : 2 ;'' for R1 and R4 experiments,\\ ''/mk5/!mode? 0 : ext : 0x55555555 : 4 : 2 ;'' for OHIG, APSG and CRF observations.
   * **mk5=dot? response nominal**\\ This is a check of the Mark5 decoder time. Check the time offset in the formatter with this command in econtrol: ''mk5=dot?''. Make sure the final value reports a small offset (less than about 10 ms), that the response includes ''syncerr_eq_0'', and that it shows ''FHG_on'' or ''FHG_off'' depending on whether it is currently recording or not.
-  * **disk_pos OK**\\ The command ''disk_pos'' in econtrol should report three values - the current number of bytes recorded, bytes at start of previous scan and bytes at start of current scan. If not currently recording, the first and third values should agree.
+  * **disk_pos OK**\\ The command ''disk_pos'' in econtrol should report three values - the current number of bytes recorded, bytes at start of previous scan and bytes at start of current scan. If not currently recording, the first and third values should agree. It is normal for Yarragadee ''disk_pos'' to lag its expected value due to regular stows for USN uplinks.
   * **Weather (wth) being logged**\\ Look through recent messages in the field system log for output from the ''wth'' command, which will look like this:\\ /#wx#/16.1,1007.9,58.6\\ Also make a note in the log of present weather conditions (if you're at the observatory).
   * **S-band Tsys OK (~15-17)**\\ Check recent output from a systemp12 command (don't execute it unless the Mark5 is NOT recording) to see if the S-band Tsys is within the expected range: about 15 to 17 cal units. Look for "tsysS" in the log. Make a note in the log if it is outside this range. If it persists, or the values vary wildly, there may be a problem.
   * **X-band Tsys OK (~5-7)**\\ Check recent output from a systemp12 command (don't execute it unless the Mark5 is NOT recording) to see if the X-band Tsys is within the expected range: about 5 to 7 cal units. Look for "tsysS" in the log. Make a note in the log if it is outside this range. If it persists, or the values vary wildly, there may be a problem.
-  * **Any problems or concerns logged**\\ If there are any other issues or unusual behavior, report them in the log by typing a comment preceded by double quotes in econtrol.
+  * **Any problems or concerns logged**\\ If there are any other issues or unusual behaviour, report them in the log by typing a comment preceded by double quotes in econtrol.
   * **Field System time (monit2) agrees with station time**\\ Compare the clock shown in the monit2 display in the PCFS VNC session with the station clock (if you're at the observatory) or with the TAC32 GPS clock in the Tac32Plus display on the Windows PC (to view it, start a VNC session via the Applications menu to timehb). The seconds should tick over together. If they don't, the clocks probably need synchronizing. To run the monit2 status monitor, enter this command at the pcfshb prompt in the VNC session: <code>/usr/bin/xterm -name monit2 -e /usr2/fs/bin/monit2</code>
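
Two of the checks in the list above compare a reply against fixed expected values, so they can be scripted. The following is a minimal Python sketch, not part of the Field System: the function names and the experiment-family labels are hypothetical, and the expected values are copied from the checklist itself.

```python
import re

# Expected ''mk5=mode?'' reply fields per experiment family, copied from the
# checklist. The family labels used as keys are hypothetical.
EXPECTED_MODE_FIELDS = {
    "R1/R4": ["0", "ext", "0x55555555", "2", "2"],
    "OHIG/APSG/CRF": ["0", "ext", "0x55555555", "4", "2"],
}

# Expected Tsys ranges in cal units, copied from the checklist.
TSYS_RANGE = {"S": (15.0, 17.0), "X": (5.0, 7.0)}


def parse_mode_reply(reply):
    """Split a reply like '/mk5/!mode? 0 : ext : 0x55555555 : 2 : 2 ;' into fields."""
    match = re.search(r"!mode\?\s*(.*?)\s*;", reply)
    if match is None:
        raise ValueError("not a mk5 mode reply: %r" % reply)
    return [field.strip() for field in match.group(1).split(":")]


def mode_ok(reply, family):
    """Return True if the reply matches the expected fields for this family."""
    return parse_mode_reply(reply) == EXPECTED_MODE_FIELDS[family]


def tsys_ok(band, value):
    """Return True if a Tsys value lies inside the range quoted for the band."""
    low, high = TSYS_RANGE[band]
    return low <= value <= high
```

For example, `mode_ok("/mk5/!mode? 0 : ext : 0x55555555 : 2 : 2 ;", "R1/R4")` accepts the R1/R4 reply quoted in the checklist, and the same reply is rejected for an OHIG, APSG or CRF session.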
  
Line 46: Line 46:
 ==== FS time is out by several seconds ====
 The origin of this problem is presently unknown but the FS time can get seriously out of step. To fix this, **while not recording** start the ''fmset'' program from an ''oper@pcfshb'' terminal and issue the "+" and "-" commands, then quit from fmset (ESC). Restart fmset and the FS time should now be correct. You may need to resync the mark5B pps after this procedure.
 +
+Be sure to check that FHG=off. If there is a power glitch while the Mark5 is still recording, it can sometimes get 'stuck' in record mode. In that case, stop the recording with ''disk_record=off'', then run fmset again.
  
 ==== clkoff reading is drifting or far from the maser-GPS offset ====
-This usually is caused by the DBBC. First, go around the back of rack 14 and move the cable from the DOTMON output of the Mark5B to the "1 PPS Mon" output of the DBBC (left hand side, sixth SMA from the top). If the same offset is seen on the counter, then the problem is in the DBBC. A temporary fix can be achieved with ''pps_sync'' in DBBC Control but this did not reliably fix the problem on 13/10/10. Instead, try reconfiguring the DBBC with ''reconf'' - this will take ~two minutes in total and you will need to re-issue the ''dbbcifa=...'' commands, and resync the Mark5B with ''fmset''.
+The clkoff command measures the difference between the 1 PPS (pulse per second) signal coming from the GPS and the 1 PPS from the Mark5. The Mark5 1 PPS has travelled through both the DBBC and the Mark5, so it is a good diagnostic of a timing problem in our hardware.
+
+There are occasionally timing glitches (clock jumps) that cause the clkoff value to change. There are several possible causes:
+  - Spurious signals on the 1 PPS signal. For example, at Yarragadee we sometimes see a clock jump when the antenna drives are powered on. We also sometimes see it as a result of poor earthing or a bad connection in the cable between the DBBC and the Mark5.
+  - DBBC problem. Sometimes the DBBC (which uses the 1 PPS from the maser and passes its timing on to the Mark5) can become unstable and the 1 PPS signal will start to drift.
+
+The easiest way to check for clock stability is to compare the clkoff and maserdelay values. The difference between these two should remain stable at around 0.3 us. The Log Monitor software calculates the difference and logs it as the "Delay difference". If the absolute value of this difference exceeds 0.5 us, an alarm is sounded (by default).
 + 
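
The delay difference check described above can be written out as a short Python sketch. This is not the Log Monitor code. The sign convention (clkoff minus maserdelay) is an assumption, as are the function names; only the 0.3 us nominal value and the 0.5 us alarm threshold come from the text.

```python
NOMINAL_DIFF_US = 0.3   # typical stable value quoted above
ALARM_LIMIT_US = 0.5    # default alarm threshold quoted above


def delay_difference(clkoff_us, maserdelay_us):
    """Delay difference in microseconds (assumed sign: clkoff - maserdelay)."""
    return clkoff_us - maserdelay_us


def delay_alarm(diff_us, limit_us=ALARM_LIMIT_US):
    """Return True if the absolute delay difference exceeds the alarm limit."""
    return abs(diff_us) > limit_us
```

Because the alarm test uses the absolute value, the assumed sign convention does not change whether an alarm is raised.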
+=== So what do I do if there's a clock jump? ===
+The first thing to do is not panic. If the new delay remains constant and its absolute value is less than 20 us, the correlator can handle it. Re-setting the delay introduces another clock jump, which makes the correlation more difficult. So the first thing to do is in the Log Monitor:
+  - Press "Acknowledge alarm"
+  - Under the "Configure" menu, select either:
+    - "Delay monitoring -> Audible warning", which will make the monitor software beep every time it sees an offset larger than 0.5 us in absolute value, rather than sound the alarm, or...
+    - "Delay monitoring -> Silent warning", which will log that the offset is large but not beep or ring alarms. This should be used with caution!
+  - Now monitor the Delay difference and see if it has stabilised. You can do this in several ways:
+    - Watch the Delay difference values in the Log Monitor window. You can get more frequent updates by issuing regular ''clkoff'' and ''maserdelay'' commands from e-RemoteCtrl.
+    - Get Log Monitor to extract a history of the delay and delay difference values by pressing the "Export Data" button. When you do this, several ASCII files will be written to /vlbobs/ivs/logs. The file that will be of most interest is (e.g. for Yarragadee) /vlbobs/ivs/logs/yg_ddif.txt. You can open this file and read its contents, or you can use a plotting program like gnuplot to plot the values. This is especially useful if you want to see if the new offset is stable or not:
+      - from a terminal window:<code>cd /vlbobs/ivs/logs
+gnuplot
+plot 'yg_ddif.txt' with linespoints
+</code> This will plot the delay difference against day number. You can use the right mouse button in the plot window to zoom in. Every time you press "Export Data" the output files are refreshed and you can replot the values in gnuplot either by typing 'replot' or by pressing the "Replot" button in the plot window. Other possibly useful files to plot are ''yg_maser2gps.txt'', the difference between the maser and GPS 1 PPS, and ''yg_fmout.txt'', the difference between GPS and Mark5 output 1 PPS.
+      - Separate windows [0|1|2] can be opened for each station by replacing the final command above with:<code>
+set terminal 'wxt' 2; plot 'yg_ddif.txt'
+</code>
 + 
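
As an alternative to judging stability by eye from the plot, the exported history can be reduced to a drift rate. The Python sketch below assumes the exported file has whitespace-separated columns with the day number first and the delay difference in us second; this layout is an assumption inferred from the plot described above, and the function names are hypothetical. It fits a straight line by least squares and returns the slope in us per day.

```python
def parse_ddif(lines):
    """Parse (day_number, delay_difference_us) pairs from exported text lines.

    Assumed layout: two or more whitespace-separated columns, day number
    first, delay difference second. Comment and malformed lines are skipped.
    """
    points = []
    for line in lines:
        parts = line.split()
        if len(parts) < 2 or parts[0].startswith("#"):
            continue
        try:
            points.append((float(parts[0]), float(parts[1])))
        except ValueError:
            continue
    return points


def drift_per_day(points):
    """Least-squares slope of delay difference against day number (us/day)."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points to estimate a drift")
    mean_t = sum(t for t, _ in points) / n
    mean_d = sum(d for _, d in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    if sxx == 0:
        raise ValueError("all points share the same day number")
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in points)
    return sxy / sxx
```

A possible use is `drift_per_day(parse_ddif(open("/vlbobs/ivs/logs/yg_ddif.txt")))` after pressing "Export Data". A slope near zero suggests a stable offset, while a steady non-zero slope suggests a linear drift. What counts as "near zero" is left to the operator, since the page gives no threshold for it.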
+=== So when do I need to reconfigure the DBBC, run fmset etc? ===
+
+If the delay difference is stable you don't need to do anything.
+
+If the delay difference is more than 20 us, or gets so large that the ''clkoff'' or ''maserdelay'' values lose precision, run ''fmset'' to get the delays back to something manageable. //Make sure you are not recording while running fmset! Issuing a ''halt'' command from e-RemoteCtrl followed by ''disk_record=off'' is usually a safe method.//
+
+The first thing to do is try the command
+<code>counter</code>
+in e-RemoteCtrl. Check to see if this worked by typing ''clkoff'' and ''maserdelay''. If this doesn't fix it, proceed with the steps below.
+
+If the delay difference is drifting (usually linearly), the DBBC probably needs reconfiguring. This can be done from e-RemoteCtrl as follows (again, best to halt the schedule and make sure you're not recording):
+<code>dbbc=reconf</code>
+Monitor how things are going in the DBBC VNC session. A reconfig takes about 2 minutes. When it's completed, synchronise the DBBC:
+<code>dbbc=pps_sync</code>
+Then in a terminal window on pcfs[hb|ke|yg], run fmset to get the clocks lined up.
+
+Now resume observations with the ''cont'' or ''schedule='' command.
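
The decision rules in this section can be summarised as a small Python sketch. It reflects one reading of the procedure above, with ''counter'' tried before anything else. The function name and the way the steps are returned are hypothetical, and deciding whether the difference is "drifting" is left to the operator.

```python
LARGE_OFFSET_US = 20.0  # limit quoted above for what the correlator can handle


def recommended_steps(delay_diff_us, drifting):
    """Return the ordered list of operator steps suggested by the text.

    delay_diff_us: latest delay difference in microseconds.
    drifting: True if the operator judges the difference to be drifting.
    An empty list means the offset is stable and small: leave it alone.
    """
    if not drifting and abs(delay_diff_us) <= LARGE_OFFSET_US:
        return []
    # Never adjust clocks while recording: halt and stop the recording first.
    steps = ["halt", "disk_record=off", "counter"]
    if drifting:
        # If ''counter'' does not help, a drifting difference points at the DBBC.
        steps += ["dbbc=reconf", "dbbc=pps_sync"]
    steps += ["fmset", "cont"]
    return steps
```

The list is only a reminder of the order of operations. Each step still needs to be checked by hand (for example with ''clkoff'' and ''maserdelay'' after ''counter'') before moving to the next one.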
 ==== PCFS log window reports problem with ReadPower.sh ====
  
/home/www/auscope/opswiki/data/attic/operations/monitoring_hb.1311166949.txt.gz · Last modified: 2011/10/26 06:37 (external edit)