CORRELATOR SCIENCE OPERATIONS REPORT, EVN MkIV DATA PROCESSOR AT JIVE
EVN TOG MEETING, June 2010, Metsahovi
17 June 2010 (statistics cover 15 Oct 2009 - 16 June 2010)
Bob Campbell

(There are two questions for general station input, prefaced by ">" as the
initial character of the lines: one under NETWORK SUPPORT [disk
distribution] and one under USER SUPPORT [skd-file extension change in
sched v9].)

SCIENCE OPERATIONS

Sessions and their Experiments

The table below summarizes projects correlated, distributed, and released
from 15 October to 16 June. It lists the number of experiments as well as
the network hours and correlator hours for both user and test/NME
experiments. Here, correlator hours are the network hours multiplied by the
number of correlation passes required.

                      User Experiments      Test & Network Monitoring
                    N   Ntwk_hr  Corr_hr       N   Ntwk_hr  Corr_hr
  Correlated       65     568      754        18     66       67
  Distributed      53     463      661        15     63       63
  Released         53     477      684        12     52       52

The following table summarizes by session the user experiments with
activity since the previous TOG meeting, with an additional column for
experiments not yet distributed (entries = remaining to do / total).

                       N_to.corr   Corr.hrs   N_to.dist
  session 2/2009          0/25       0/350       0/25
  Jun-Sep e-VLBI          0/7        0/52        0/7
  session 3/2009          0/17       0/309       0/17
  Nov-Feb e-VLBI          0/14       0/77        0/14
  Feb Disk ToO             0/1        0/24        0/1
  Mar e-ToO                0/1        0/11        0/1
  session 1/2010          7/23     109/257      17/23
  Mar-May e-VLBI          0/9        0/61        0/9
  session 2/2010         16/16     240/240 (anticipated correlator hours)
  session 2/2010(e)       0/6        0/63        6/6

The attentive reader may notice that we're rather behind in clearing the
session 1/2010 experiments. The largest contributor to this has been
abnormally high processing factors in the Gbps experiments. By the time of
the meeting, I'll have a plot illustrating this.

Some landmarks since the previous TOG report:
 *) 264 hours of user e-experiments (not counting initial test time or gaps
    between experiments)
 *) First ftp fringes to Zc, Bd (session 1/2010 NMEs)
 *) First 5cm fringes to Sh (session 1/2010)
 *) First consecutive e-VLBI days, and first use of e-VLBI during a regular
    session (2x 2 days at the end of session 2/2010)
 *) First use of 5B (vice 5A+) units for playback of 5B recordings in user
    experiments (globals from session 2/2010 for Wb, Ys)
 *) 10 Target-of-Opportunity proposals (~3 in the previous decade)

The list of ToO topics follows, with notes about the number of related
proposals and whether some of the proposals led to multiple epochs. All
were observed in e-VLBI unless 'disk' is noted, and occurred in
regularly-scheduled e-VLBI days or in a session unless 'unscheduled' is
noted.
 *) OH outburst in O Cet (unscheduled disk)
 *) KT Eri nova outburst
 *) XTE J1752 X-ray transient (2; 1=disk)
 *) gamma-ray outburst in M86 (2; multi-epoch; 1=unscheduled e-day)
 *) new transient in M82 (disk; multi-epoch; global, corr @ Soc.)
 *) V407 Cyg GeV nova (3; multi-epoch)

Astronomical Features:

The KVASAR stations were included in numerous user EVN experiments for the
first time in session 2/2010, after participating successfully in NMEs in
1/2010. A standard EVN array (once Hh is back) could now quite easily reach
14 stations at C-band. This pushes the envelope in terms of available
Mark5/SU units when the possibility of multiple MERLIN out-stations is also
considered (requiring two extra Mark5 units to simulate the e-VLBI transfer
into the switch/router, plus one SU for each of the stations "in" the Cm
recording).
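As a back-of-the-envelope illustration of this unit counting (a sketch
only; the interpretation of how the Cm out-stations map onto SUs is mine,
and no actual correlator inventory figures are implied):

  # Tally of playback resources for one experiment, following the counting
  # rule above: one Mark5 and one SU per recorded station, two extra Mark5
  # units to simulate the e-VLBI transfer of the Cm recording into the
  # switch/router, and one SU for each MERLIN out-station "in" the Cm
  # recording.  Interpretation and numbers are illustrative only.
  def playback_units(n_recorded_stations, n_merlin_in_cm=0):
      mark5 = n_recorded_stations + (2 if n_merlin_in_cm else 0)
      su = n_recorded_stations + n_merlin_in_cm
      return mark5, su

  # e.g., a 14-station C-band array with two out-stations carried in the
  # Cm recording would need 16 Mark5 units and 16 SUs by this counting:
  print(playback_units(14, 2))   # -> (16, 16)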
For wide-field mapping observations, we usually need to correlate in
multiple passes over subsets of sub-bands in order to cut the bandwidth
smearing. We can now combine these passes at the Measurement Set stage into
a single Measurement Set containing all sub-bands. Included are extra steps
to check for "orphan" sub-bands at any time, and to make sure the data stay
in TB (time-baseline) order. This saves the PI from having to VBGLU
separate sets of FITS files back together, and the pypeline processing can
provide a single ANTAB file applicable to the whole experiment. Two
experiments from session 3/2009 went through this process: GV020A with
230 GB of FITS files and GF015A with 328 GB (the GV020A PI is still
contemplating whether he wants a similar correlation for GV020B from
1/2010, or smaller data-sets from multiple phase-center correlations).

We discovered that recirculation and oversampling do not work together.
This affected one experiment from session 3/2009, which had to drop from 9
to 8 stations. Oversampling is almost exclusively used to get to 0.5 MHz
sub-bands (sampled at the current minimum 4 Msamp/s sampling rate, thus 4
times Nyquist). For 2 MHz sub-bands, recirculation can provide a factor of
8 capacity improvement, but the maximum of 2048 frequency points per
baseline/sub-band/polarization would remain. The principal beneficiaries of
recirculation are spectral-line experiments having 9 or more stations.
Recirculation of 2 MHz sub-bands would provide more net capacity, except
for users wanting 1024 or 2048 frequency points across 0.5 MHz sub-bands;
such observations would still need to be limited to 8 stations. The
correlator capacity pages on the JIVE web site were updated to reflect
these additional considerations.
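To make the incompatibility concrete, here is a minimal sketch using only
the numbers quoted above (the function names are illustrative; this is not
part of the correlator control software):

  # Sketch of the recirculation/oversampling constraint described above.
  # Oversampling factor = sampling rate / Nyquist rate (2 x bandwidth).
  def oversampling_factor(sample_rate_msps, subband_mhz):
      return sample_rate_msps / (2.0 * subband_mhz)

  def check_setup(sample_rate_msps, subband_mhz, use_recirculation):
      ovs = oversampling_factor(sample_rate_msps, subband_mhz)
      if use_recirculation and ovs > 1.0:
          return "not allowed: recirculation on oversampled data (%gx Nyquist)" % ovs
      return "ok (%gx Nyquist)" % ovs

  # 0.5 MHz sub-bands at the minimum 4 Msamp/s are 4x Nyquist, so
  # recirculation is ruled out; 2 MHz sub-bands at 4 Msamp/s are
  # Nyquist-sampled, so the factor-8 recirculation gain remains available.
  print(check_setup(4.0, 0.5, True))   # -> not allowed ...
  print(check_setup(4.0, 2.0, True))   # -> ok ...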
NETWORK SUPPORT

The automatic-ftp feature continues to be exercised in all network
monitoring experiments. Stations send the specified portion of a scan
directly to JIVE for correlation on the SFXC software correlator.
Correlation results go onto a web page available to all the stations within
a couple of hours, and Skype chat sessions during the NME provide the
station friends with even more immediate initial feedback. With 2-3 ftp
fringe-test scans per NME, there is the opportunity to find, feed back,
fix, and verify a problem and its solution within a single NME.

We continue to process all experiments, including NMEs, via the pipeline
(now run via ParselTongue), with results being posted to the EVN web pages.
The pipeline provides feedback on stations' general performance and in
particular on their gain corrections, and identifies stations/frequency
bands with particular problems. Jun will present a more detailed
calibration report in his presentations.

The preferred patching for the KVASAR stations isn't supported yet in
sched, so we provided the PIs of session 2/2010 experiments including them
with set-ups that allowed sched to produce the output skd/vex file, and we
then hand-edited these skd/vex files to conform to the KVASAR preferences
(i.e., in the $FREQ, $BBC, and $LO sections). Once we had a reasonable
understanding of what to do, this process was fairly straightforward.
Working towards a proper sched-based solution is on the agenda to address
with Craig. There also remains an issue with the C-band Gbps total
spanned-frequency range in relation to RFI and/or front-end limitations at
the KVASAR stations. Initial attempts to schedule test observations in an
80-MHz-lowered Gbps set-up have yet to bear fruit, but information about
RFI at some stations has been passed along. We hear unofficially from CRAF
sources that the frequency range above 5000 MHz may become less favorable
going into the future (outside primary or secondary protection), so in the
long term there may be motivation for lowering the frequency range anyway.
We would still aim to carry out such test observations, either in an
upcoming e-test period or in a gap in the next session (in the latter case,
as a scheduled test). Of course, tests to explore the RFI environment at
new frequencies may have UT considerations if there is a class of
sometimes-on interferers.

In recent sessions, there have been experiments going to three different
correlators. This added some complexity to the pre-session
disk-distribution planning. The principal goal is to avoid individual packs
containing data for more than one correlator. Thus in the disk-distribution
plan, the load for each target correlator at each station is computed
separately, drawing on the known available packs on hand at the stations or
free at the correlators. These loads are computed by:
 *) assuming a 100% recording duty-cycle for the duration of each
    experiment
 *) subtracting 0.09 TB from the capacity of each pack (to provide a buffer
    to compensate for unused space at the ends of packs due to the
    (last + 1) Mark5 scan not having fit onto the pack).
A schematic sketch of this computation appears at the end of this section.
The disk-distribution plan is made well before PI schedules are ready; once
they are, the *.sum files provide the actual amount of disk space required
per station (the "Station summaries" section). It is hoped that the two
tactics above provide some insurance against loss of capacity due to bad
packs, but this would likely be insufficient if a very large pack fails.

The disk-distribution plan tabulates the specific make-up of the packs to
use per target correlator in terms of how many of which capacity. We try
preferentially to use the largest available packs for the farthest stations
(to cut down on shipping), but consistent with minimizing unused capacity
(e.g., 4.4 TB = 3.2 TB + 1.48 TB rather than a 6 TB or 8 TB pack). In
session 1/2010, I managed to get stations into trouble by using larger
packs for the 2 globals destined for Socorro, which meant that they needed
to hold on to partially-recorded packs from the first global to use in the
second, delaying receipt in Socorro. In session 2/2010, we adjusted our
tactics to build the VLBA-bound load out of a larger number of smaller
packs, crafted to permit shipping the packs of each global individually. In
the past few sessions, there have remained a few instances of data for
different correlators being recorded onto an individual pack.

> Is there anything further beyond the present disk-distribution plan and
> the time-table of target-correlator blocks that we could do to help in
> this regard?

Of course, any a priori scheduling could be thrown off by pack casualties
encountered during the session.
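The load computation and pack selection can be illustrated schematically as
follows (a sketch only: the data-rate conversion and the largest-first
selection are assumptions made for the example; the actual plan also weighs
shipping distance against unused capacity, and is not produced by this
code):

  # Schematic disk-load computation: 100% recording duty cycle per
  # experiment, and 0.09 TB subtracted from each pack as a buffer for the
  # (last + 1) Mark5 scan that doesn't fit.  Illustrative only.
  GBPS_TO_TB_PER_HR = 0.45   # 1 Gbps recorded continuously ~ 0.45 TB/hour
  PACK_BUFFER_TB = 0.09

  def station_load_tb(experiments):
      # experiments: list of (duration_hr, data_rate_gbps) for one station
      # and one target correlator, assuming a 100% recording duty cycle
      return sum(dur * rate * GBPS_TO_TB_PER_HR for dur, rate in experiments)

  def choose_packs(load_tb, available_packs_tb):
      # largest-first selection shown for simplicity; the real plan also
      # tries to minimize unused capacity and shipping of large packs
      chosen, remaining = [], load_tb
      for cap in sorted(available_packs_tb, reverse=True):
          if remaining <= 0:
              break
          chosen.append(cap)
          remaining -= cap - PACK_BUFFER_TB
      return chosen, max(remaining, 0.0)

  # e.g., 24 hr at 1 Gbps plus 12 hr at 512 Mbps for one station:
  load = station_load_tb([(24, 1.0), (12, 0.512)])
  packs, shortfall = choose_packs(load, [8.0, 6.0, 3.2, 3.2, 1.6])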
USER SUPPORT

The EVN Archive at JIVE continues to provide web access to the station
feedback, standard plots, pipeline results, and FITS files. Access and
public-release policy remain the same. The archive machine continues to
have 12.8 TB of dedicated disk space, with a buffer of another 1.8 TB that
also houses the pipeline work area. In addition to the tape back-up of the
Archive, we now also have a mirror on a machine physically in Westerbork.
On 9 April, we had 8.4 TB of FITS files in the Archive, a gain of 1.4 TB
over the preceding six months (the date range corresponds to the most
recent CBD report).

We continue to contact all PIs once the block schedule is made public, and
to check over schedules posted to VLBEER before stations download them.
This occupies a great deal of time in the fourth to second weeks before the
start of the session, but it helps to prevent avoidable errors in the
observations themselves and thus makes the available observing time more
productive. In session 1/2010, this process was especially fraught:
 *) a couple of schedules were more than 2 weeks late (including a new
    record of 31 days)
 *) one PI managed to revert to a wrong phase-reference source in
    subsequent versions of the schedule after it had been fixed once
 *) the first mixed Gbps/512 Mbps globals went to Socorro for correlation.
(The then pre-release sched v9.1 was able to handle the latter kind of
observation in a single schedule, whereas previous versions required
separate Gbps schedules for the EVN stations and 512 Mbps schedules for the
VLBA, resulting in a vulnerability to the different schedules growing
unsynchronized -- the GC029 disease.)

By session 2/2010, most users had shifted to sched v9. There was one
instance in which a PI changed sched versions between versions of the
schedule; thus there were at one time both an skd file (v2) and a vex file
(v3), and both wound up in .latest/. As PIs stop using the older versions
of sched, this problem should go away.

> What are the stations' views about directly DRUDGing a *.vex file (i.e.,
> is it still necessary to change extensions of *.vex files back to *.skd)?

In both sessions 1/2010 and 2/2010 there were isolated cases in which PIs
initially scheduled non-authorized targets; these were handled in the usual
fashion through the EVN PC. At JIVE, Des Small will take over the
EVN-related sched responsibilities from Friso Olnon, who retires over the
summer.