CORRELATOR REPORT, EVN MkIV DATA PROCESSOR AT JIVE
EVN CBD MEETING, November 2005
14 November 2005 (statistics cover 16 May 2005 - 13 Nov 2005)

Huib van Langevelde
Bob Campbell

SCIENCE OPERATIONS

Sessions and their Experiments

The table below summarizes projects correlated, distributed, and released from 16 May to 13 November. It lists the number of experiments as well as the network hours and correlator hours for both user and test/NME experiments. Here, correlator hours are the network hours multiplied by the number of correlation passes required (e.g., continuum/line, >16 stations, 2 head stacks, different phase centers, etc.).

                   User Experiments         Test & Network Monitoring
                   N   Ntwk_hr  Corr_hr     N   Ntwk_hr  Corr_hr
    Correlated    19     258      368       9     46       43
    Distributed   20     283      444       9     46       43
    Released      12     160      267       7     43       46

Included in the test experiments were 5 e-VLBI tests (an average of one per 4.8 weeks). As a practical matter, release of cleared experiments occurs as needed to meet the disk-distribution requirements of the upcoming session.

The following table summarizes, by session, the user experiments still in the queue, with an additional column for experiments not yet distributed (entries = remaining to do / total). The actual correlator time is typically 1.5-2.5 times these estimates, depending on the number of redos or other problems.

                   N_to.corr   Corr.hrs      N_to.dist
    Feb'04            0/13     0/228 hr        1/13
    May/Jun'04        0/14     0/237 hr        0/14
      ad-hocs         0/6      0/40  hr        0/6    (Huygens, CHAKx)
    Oct/Nov'04        0/14     0/280 hr        0/14
      ad-hocs         0/2      0/32  hr        0/2    (Huygens)
    Feb/Mar'05        0/10     0/239 hr        0/10
    Jun'05            0/12     0/194 hr        1/12
    Oct/Nov'05       12/12     not yet determined
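To make these estimates concrete, here is a minimal sketch of the bookkeeping in Python (the scripting language already in use at JIVE via ParselTongue). The function names and the example experiment are illustrative; only the passes-times-network-hours rule and the 1.5-2.5 redo factor come from the text above.

    # Minimal sketch: estimating correlator time for queued experiments.
    # Correlator hours = network hours x number of correlation passes
    # (continuum/line, >16 stations, 2 head stacks, multiple phase
    # centers, ...); the actual correlator time is typically 1.5-2.5x
    # that, depending on redos and other problems.

    def correlator_hours(network_hours, n_passes=1):
        """Nominal correlator hours for one experiment."""
        return network_hours * n_passes

    def wallclock_estimate(corr_hours, redo_factor=(1.5, 2.5)):
        """Bracket the actual correlator time by the observed redo range."""
        lo, hi = redo_factor
        return corr_hours * lo, corr_hours * hi

    # Example (hypothetical experiment): 12 network hours, 2 passes.
    hours = correlator_hours(12, n_passes=2)   # 24 corr hr
    print(wallclock_estimate(hours))           # (36.0, 60.0)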
There is no backlog for correlation through the June 2005 session. The last part of GG053 from Feb'04 was completed on 11 November, after we received one of its disk-packs back from Conduant, who repaired a never-before-seen problem with the placement of the record-pointer. This experiment still remains to be distributed, as does the last experiment from June 2005.

The Oct/Nov'05 session lasted until 12 November. So far, there is no user experiment for which we have received the media from all stations. The pre-correlation e-mails to PIs, by which we confirm the correlation parameters and archive classification of the individual sources, are being sent out as experiments finish. Nt missed the entire session, due to the same azimuth-drive casualty it suffered early in the Jun'05 session. Engineering work at the Lovell telescope meant that it also missed the C-band part of the session, but it was back in service by L band, much to the relief of the 1 Gb/s-experiment PIs.

To review some landmarks from recent sessions:

    Feb'04:   1st sub-second integration-time user experiments
              Regular Mk5 recording by 3-4 stations per experiment
    May'04:   Regular Mk5 recording by up to 8 stations per experiment
    Oct'04:   1st all-disk experiments (the 5cm sub-session [8 stations]
              was recorded entirely on disk)
              1st overwhelmingly large user datasets (260 GB of FITS
              files from one experiment)
    ad hocs:  Huygens dress-rehearsal and actual observations:
              1st Mk5 recordings from VLBA stations
              1st fringes from Australian and Japanese stations
              most disk-stations correlated at once (15)
    Feb'05:   1st 1 Gb/s user experiment correlated and distributed
    Jun'05:   longest single user experiment (48 network hours)
    Oct'05:   8 VLBA stations recording onto disk during globals

Infrastructure

We currently have 13 working DPUs and 16 Mk5A units (one Mk5A was loaned to Wb for the Oct/Nov'05 session). Ten of the Mk5A units are housed inside temperature-controlled cabinets; the others sit on benches behind the row of DPUs. Three Mk5A units are fully connected to their SU (all 64 tracks); the rest share their SU with a tape playback unit. Through a simple re-connection of 2 cables, we can fully connect all Mk5As to their SUs. With the last of the Feb'04 experiments, which required 13 tapes, now correlated, we can carry out our medium-term plan of an 11-DPU / 16-Mk5A configuration, all in cabinets.

Our disk-shipping requirements are derived from the recording capacity needed for a session (from the EVN scheduler) and the supply on hand at the stations (from the TOG chairman). Under the policy that stations should buy two sessions' worth of disks, our disk flux should balance over the same two-session time frame. The following table charts our net disk flux since Mk5A recording began; each row corresponds to a two-session cycle (the syntax for all entries: N_packs "for" Size[TB]). Ad-hoc experiments are not included in these numbers.

    IN                       OUT                         NET OVER-DISTRIBUTION
    Oct'03  20 for  22.000   -> May'04  28 for  30.520    +8 for  +8.520
    Feb'04  53 for  63.689   -> Oct'04  68 for  86.312   +15 for +22.623
    May'04  78 for 111.516   -> Feb'05  47 for  69.890   -31 for -41.626
    Oct'04  87 for 122.529   -> May'05  71 for 100.665   -16 for -21.864
    Feb'05  88 for 140.583   -> Oct'05 103 for 170.152   +15 for +29.569
    Jun'05 100 for 176.992   -> to recycle in Feb'06

The steady growth of disk receipts per session is obvious, as is the growth in mean disk-pack size. Over the first five two-session cycles, we are within 3 TB of balance (460.317 TB in, 457.539 TB out). To facilitate faster disk turn-around, we have begun an unofficial policy of giving priority to 1 Gb/s experiments: these have the highest ratio of packs freed up per correlator hour. Now that the VLBA appears to be starting disk-recording in earnest (8 VLBA stations planned to use disk in Oct'05), a complication in disk-turnaround operations may ensue: they will need next-session disk turnaround. To support the Oct'05 globals to be correlated at JIVE, we have essentially advanced them 16 packs for 27.84 TB.
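As a concrete illustration of this two-session accounting, here is a minimal sketch that recomputes the NET OVER-DISTRIBUTION column and the cumulative balance from the table above. The data structure and variable names are illustrative, not those of our operational software; the numbers are copied from the table.

    # Minimal sketch: net disk flux per two-session cycle.
    # IN = packs/TB received for a session; OUT = packs/TB
    # distributed/recycled two sessions later.
    flux = [
        # (in_label, in_packs, in_TB, out_label, out_packs, out_TB)
        ("Oct'03", 20,  22.000, "May'04",  28,  30.520),
        ("Feb'04", 53,  63.689, "Oct'04",  68,  86.312),
        ("May'04", 78, 111.516, "Feb'05",  47,  69.890),
        ("Oct'04", 87, 122.529, "May'05",  71, 100.665),
        ("Feb'05", 88, 140.583, "Oct'05", 103, 170.152),
    ]

    total_in = total_out = 0.0
    for s_in, p_in, tb_in, s_out, p_out, tb_out in flux:
        total_in += tb_in
        total_out += tb_out
        print(f"{s_in} -> {s_out}: {p_out - p_in:+d} packs,"
              f" {tb_out - tb_in:+.3f} TB")

    # Reproduces the "within 3 TB of balance" figure quoted above.
    print(f"balance over 5 cycles: {total_in:.3f} TB in,"
          f" {total_out:.3f} TB out, {total_out - total_in:+.3f} TB")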
TECHNICAL DEVELOPMENTS

Disk-servoing is now much faster at the beginning of jobs and when coming out of gaps in the schedule. Attention now turns to the interaction between SCHED, the field-system logs, and the correlator control files, with the aim of starting playback from the beginning of good headers on disk rather than from when SCHED (or the field system) thinks the antenna is on source (the latter condition is more rigorously a job for the post-correlation flagging that already takes place in the pipeline).

The PCInt project reached a major milestone when the full data path was exercised. Previously the maximum output data rate was 6 MB/s, which corresponds to a 0.25 s integration time; 24 and 48 MB/s read-out has now been achieved. Although the data conversion for this mode is not yet operational, it promises to allow 1/8 and 1/16 s integration times, essential for some wide-field imaging experiments. The PCInt cluster-computer back-end can be used to store and format large experiments. The top five experiments in terms of FITS-file size are: GG060 (268.2 GB), EL032 (260.2 GB), GG053C (224.9 GB), GG053B (221.2 GB), and EB026 (94.2 GB).

Further progress has been made on e-VLBI: tests have reached 256 Mb/s fringes and 512 Mb/s data transfer, but the connectivity remains rather erratic. Much effort has gone into preparing the contract with the EU for the EXPReS project, which aims to transform the EVN into a fully functional 16-station real-time e-VLBI network.

Within the scope of the ALBUS project we have made good progress on phase-cal detection. It has been confirmed that the hardware in the EVN processor can obtain sensible phase-cal values, and software developed in house shows that these take out the instrumental phase offsets and single-band delays.

Progress has also been made on the various calibration projects. For the epochs of the test observations, various estimates of the total electron content have been obtained; their internal consistency makes it seem promising to derive an ionospheric calibration from them. Similar progress has been made in processing WVR data from the Effelsberg telescope: these measurements have been demonstrated to reduce the tropospheric phase fluctuations.

The development of ParselTongue, a Python scripting environment for AIPS, has been quite successful, and it is now available for interested users. It offers an environment for running large or complex data-reduction projects; besides this scripting or pipelining functionality, it can be used to implement and test new calibration methods.
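To give a flavour of what ParselTongue scripting looks like, here is a minimal sketch that loads a FITS file into AIPS and inspects it. The user number, file name, and catalogue entries are purely illustrative; the general pattern (AIPSTask objects whose attributes are AIPS adverbs) is the one ParselTongue provides.

    # Minimal ParselTongue sketch: load a FITS file and inspect it.
    # Assumes AIPS and ParselTongue are installed; names are illustrative.
    from AIPS import AIPS
    from AIPSTask import AIPSTask
    from AIPSData import AIPSUVData

    AIPS.userno = 1234                  # illustrative AIPS user number

    fitld = AIPSTask('FITLD')           # task adverbs become attributes
    fitld.datain = 'FITS:example.fits'  # illustrative input file
    fitld.outname = 'EXAMPLE'
    fitld.outclass = 'UVDATA'
    fitld.outdisk = 1
    fitld.go()

    uvdata = AIPSUVData('EXAMPLE', 'UVDATA', 1, 1)
    print(uvdata.sources)               # source names in the loaded data

    # From here one can script an entire reduction, or prototype a new
    # calibration step on the data before committing it to a pipeline.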
USER SUPPORT

Ten users have visited JIVE for data reduction in this period. There are now a total of six computers in the visitors' room: four dual-processor PCs running Linux, one Sun workstation running Solaris, and a Windows-based PC. Increasing disk space all around to support large data-sets will be a priority next year.

The EVN Archive at JIVE is up and running. It provides web access to the station feedback, standard plots, pipeline results, and FITS files. Public access to source-specific information is governed by the EVN Archive Policy: the complete raw FITS files and pipeline results for sources identified by the PI as "private" have a one-year proprietary period, starting from the distribution of the last experiment resulting from a proposal. I coordinate with the EVN scheduler to determine whether specific experiments that have a terminal letter are indeed the "last experiment" of a series (there was a ~3-week period in which the data for EB028A were publicly available in violation of this rule, since EB028B was still in its proprietary period; this was fixed, and internal procedures were adjusted to avoid such occurrences in the future). PIs can access proprietary data via a password they arrange with JIVE. PIs receive a one-month warning prior to the expiration of their proprietary period; I modified the wording of this warning to discourage frivolous or reflex appeals to the PC Chairman for extensions. We also increased the storage available on the archive machine, as it was rapidly becoming too small; such increases will likely prove to be a continuing requirement. Provisions have been made to store a copy of all the user data outside the main Dwingeloo building.

There are two independent ways to search the Archive other than direct entry via a specific experiment. The EVN catalogue of observations (Bologna) can be used to search for observations of particular sources, and provides a link to the relevant experiments on the EVN data archive for experiments correlated at JIVE. In addition, a FITS-finder utility for archived data is operational: a database contains all the meta-data for the projects that are on-line, and searches can be keyed to source names or coordinates, observing frequency, and/or participating telescopes, among other characteristics.

Stations continue to measure their T_cal as a function of time and frequency. Median gain corrections are compiled each session, and the overall situation is improving, especially at C and X bands. L band suffers from RFI (exacerbated by the trend towards wider-bandwidth observing). K band and 5 cm are still a problem, though Onsala is implementing a new high-frequency (>15 GHz) calibration scheme, which uses a hot load and chopper wheel instead of the conventional calibration diode, to be used for the first time in the October 2005 session. The pipeline provides stations with feedback on gain corrections for all experiments correlated, both NMEs/fringe-tests and user experiments. Off-source flagging files generated by the FS at the stations are also available via the pipeline results.

A new release of SCHED was used for the first time in preparing schedules for the May'05 session. It provides support for native Mk5 schedules, as well as for transparent ftp and e-VLBI scheduling. All the main EVN stations (including Robledo) have now upgraded to a recent version of the FS, which can accept native disk, ftp, and e-VLBI schedules. We continue to contact all PIs prior to their scheduling to ensure they know how to obtain help, and to check over schedules posted to VLBEER before stations download them. In the past couple of sessions, the areas of greatest attention have included optimizing schedules in light of the 10-minute cycle-time limit for Jb1 (or replacing it with Jb2) and making Gb/s schedules. The tactics for making dedicated disk-only schedules evolve with the newer versions of SCHED.

NETWORK SUPPORT

We continue to process NMEs via the pipeline, with results posted to the EVN web pages. EVN Reliability Indicators (ERI) are calculated for all experiments. In addition to routine network monitoring, NMEs are used for ftp fringe-tests, which continue to be very successful in identifying station problems early in the session (initial results are reported the same day the experiment is observed) and which have contributed to an overall increase in the ERI in recent sessions (see attached plot). We had our first-ever ftp fringes to Robledo in October 2005. Previous failures to detect Medicina K-band fringes in the ftp fringe-tests were traced to a bug in the software correlator, which has now been remedied.

As noted above, the pipeline gain-correction feedback covers both NMEs/fringe-tests and user experiments; these data are being used to identify stations/frequency bands with particular problems (a sketch of this per-station/band bookkeeping appears at the end of this report). For example, K band was shown to have severe problems, and so an NME in Feb'05 was used to investigate the origin of these problems (e.g., pointing errors, opacity, etc.).

A new version of the field system was released for use in the May'05 session. It provides more transparent control of e-VLBI operations and improves the monitoring of remaining disk capacity.
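As referenced above, here is a minimal sketch of how per-station, per-band amplitude gain corrections can be compiled into session medians. The input tuples and all names are illustrative assumptions, not the actual pipeline code or its output format.

    # Minimal sketch: median amplitude gain corrections per station/band.
    from collections import defaultdict
    from statistics import median

    # Illustrative gain-correction factors from several experiments in
    # one session; the real pipeline output differs in form and detail.
    results = [
        ("Ef", "K", 1.31), ("Ef", "K", 1.24), ("Ef", "C", 1.02),
        ("Mc", "K", 1.58), ("Mc", "K", 1.47), ("On", "C", 0.97),
    ]

    by_station_band = defaultdict(list)
    for station, band, factor in results:
        by_station_band[(station, band)].append(factor)

    # A median far from 1.0 flags a station/band needing attention
    # (cf. the K-band problems investigated with the Feb'05 NME).
    for key in sorted(by_station_band):
        print(key, round(median(by_station_band[key]), 2))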