Operators meeting 18.12.2014

Agenda

1. Current issues:
  * Observing at Mt Pleasant: experience, suggestions.
  * Alarms at Mt Pleasant -- what is wrong? (seemingly nothing)
    * alarm is .ogg, maybe not playing -- convert to another format?
    * try another player?
    * use the system alarm at Mt Pleasant -- mute!
  * Hb recording consistently less than the schedule assumes for certain AUST experiments (e.g., [[handover:aust67|AUST67]], [[handover:aust68|AUST68]])
    * Check scan_checks (Jamie)
  * Ke time setting -- Mark 5 doesn't sync (recent [[handover:aust72|AUST72]] + generator issue)
  * Ke issues during [[handover:aust67|AUST67]] and [[handover:crds74|CRDS74]] //(see below)//
    * MONICA puts load on pcfs, but it may not be to blame
      * java?
      * keep monitoring and gather info
  * Ed's suggestion: add handover comments to the log rather than to the end message.

2. Roster, any questions or comments
  * Roster 2015

4. Other items
  * AuScope Observers Holiday drinks, 5:30pm Monday 22nd of December at The Winston http://www.thewinstonbar.com/

P.S. Below is an email from Ed about recent Katherine issues:

Hi Jim, Jamie, Brett, Rich, Dan, and I have been looking at some recent problems Katherine had. Below is our diagnosis of what happened and some suggestions for handling them. Please share this with the operators if you think it would be helpful.

AUST67: The major problem was not a Mark 5 problem, but it was the tip-off "canary in the coal mine". The FS PC appears to have been loaded (as happened in CONT), slowing the system down. This snowballed into the other errors. What was visible was the Mark 5 rejecting commands ("error m5 -900 not while recording or playing"). When problems like this are encountered, we recommend that the operator check the load with "uptime" and enter the numbers in a log comment. Until this issue is resolved, you might consider having a window open running tload, xload or top, as part of normal operations for a visual indication of the load.
A FS PC reboot may fix this problem when it occurs. Ideally, the problem program should be identified and fixed, or at least the operator given instructions on what program to "kill", which would be less traumatic than a FS PC reboot.

CRDS74: Again the Mark 5 was not the issue. The DBBC had 1 PPS problems, visible as a jump in /dbbc/pps_delay/ output. This snowballed into the other problems. The situation was visible as a Mark 5 time error ("ERROR sc -13 setcl: formatter to FS time difference 0.5 seconds or greater"). We suggest that you have the operator monitor the /dbbc/pps_delay/ value and look for jumps. The DBBC needs to be resync'd/rebooted in this situation, but not the FS or Mark 5, as the operator discovered. There is a learning curve. Of course, after resyncing/rebooting the DBBC, the Mark 5 has to be resync'd and the time set. It seems like there are still DBBC PPS stability issues that need to be addressed.

BTW, we noticed the following text in the crds74ke end message:

Comments from the log:

Disk VSN: WSRT-049

Data volume at beginning: 1.159 GB

UT 17:10 Sched start. First scan 343-1719 (-13.4 GB) (JS)

UT 10:36 - Mark5B drifting wildly. Halted schedule. (vk)

UT 10:54 To fix clock drift, restarted field system, GPIB and Mark 5 with no luck, reset counters and fmset with no luck, had to restart DBBC, finally fixed the problem. (Imogen)

UT 11:50 - Schedule resumed. Missed scans: 344-1027 to 344-1141, corresponding to ~90 GB (vk)

These comments don't seem to actually appear in the log. Did we just not look in the right place? Maybe they are just the operators' handwritten notes entered for the end message. If so, it would be very helpful if they could be injected into the log file (a la the 'msg' program) since these comments contain valuable information that didn't otherwise appear.

Thanks,

Ed
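
A rough sketch of the two checks suggested in the email, for discussion only. It assumes Python 3 on the FS PC, and that a log comment is entered by starting the line with a double quote. The function names, the jump threshold and the sample readings are made up for illustration; none of them come from the Field System or the DBBC software.

```python
import os
from typing import List, Sequence


def load_comment() -> str:
    """Format the 1-, 5- and 15-minute load averages as a log comment."""
    # os.getloadavg() returns the same three numbers that "uptime" prints.
    one, five, fifteen = os.getloadavg()
    # Assumption: a leading double quote marks the line as a log comment.
    return f'"load average: {one:.2f} {five:.2f} {fifteen:.2f}'


def find_jumps(values: Sequence[float], threshold: float) -> List[int]:
    """Return indices where a reading differs from the previous one
    by more than `threshold` (hypothetical; pick a value from experience)."""
    return [
        i
        for i in range(1, len(values))
        if abs(values[i] - values[i - 1]) > threshold
    ]


if __name__ == "__main__":
    print(load_comment())
    # Made-up pps_delay readings, not real Ke data.
    readings = [39, 39, 40, 39, 412, 411]
    print(find_jumps(readings, threshold=10))  # prints [4]
```

The comment string would still have to be typed or pasted into the FS by the operator, and the readings would have to be copied from the /dbbc/pps_delay/ output by hand or by a parser that this sketch does not attempt.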