CORRELATOR REPORT, EVN MkIV DATA PROCESSOR AT JIVE
EVN TOG MEETING, Jodrell Bank, 22 NOVEMBER 2004
11 November 2004 (statistics cover 19 Mar 2004 - 10 Nov 2004)
Huib van Langevelde
Bob Campbell

SCIENCE OPERATIONS

Sessions and their Experiments

The table below summarizes projects correlated, distributed, and released from 19 March to 10 November. It lists the number of experiments as well as the network hours and correlator hours for both user and test/NME experiments. Here, correlator hours are the network hours multiplied by the number of correlation passes required (e.g., continuum/line, >16 stations, 2 head stacks, different phase centers, etc.).

                  User Experiments      Test & Network Monitoring
                N   Ntwk_hr  Corr_hr      N    Ntwk_hr  Corr_hr
  Correlated   34     390      520       26       48       48
  Distributed  36     429      598       33       64       72
  Released     43     534      691       36       72       81

The following table summarizes by session the user experiments still in the queue, with an additional column for experiments not yet distributed (entries = remaining to do / total). The actual correlator time is typically 1.5-2.5 times these estimates, depending on the number of redos or other problems.

                  N_to.corr    Corr.hrs     N_to.dist
  May/Jun'03        1/25       15/542 hr      1/25
  Nov'03            0/9         0/161 hr      0/9
  Feb'04            3/13      168/364 hr      3/13
  May/Jun'04        0/14        0/264 hr      1/14
  Aug'04 ad hocs    0/5         0/27  hr      2/5
  Oct/Nov'04       14/14      280/280 hr     14/14

All user experiments from the May/Jun'04 session have been correlated, including the ad-hoc experiments from August. As of the date of this report, we have not yet received media from all stations for any user experiment from the Oct/Nov'04 session.

The three remaining experiments from the Feb'04 session each require 4 passes to provide the requested 1/8-s integration times, and each of the three would produce ~575 GB of correlator output. The next stage of PCInt development (full correlator read-out at 1/8 s) would cut the remaining correlation time in half, such that the Feb'04 entry in the above table would become 84/280 hr.

The Nov'03 session reflects abandoning GI001A, which was scheduled with an oversampling of x8 (0.5 MHz filters at 8 Ms/s), a mode that had never been advertised and that proved impossible to correlate with the existing architecture of the correlator-chip clocking. We have added a check in sched to prevent such occurrences in the future (a sketch of such a check follows at the end of this subsection).

One user experiment from the May/Jun'03 session remains in the queue to finish its second correlation run; it awaits revised target coordinates from the PIs, based on a first correlation run we made using only short baselines and a short integration time. Parts of one older user experiment used 40-ips recording, which would require speed-up to correlate; those parts also remain on hold.
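Schematically, the new check amounts to comparing a set-up's oversampling factor (sample rate over Nyquist rate) against the largest factor the correlator-chip clocking can handle. The Python fragment below is an illustrative sketch only, not the actual sched code, and the assumed maximum of x4 is our own illustration:

    # Hypothetical sanity check in the spirit of the one added to sched;
    # the supported maximum (x4) is an assumption for this sketch.
    MAX_OVERSAMPLING = 4

    def oversampling_factor(sample_rate_ms_s, bandwidth_mhz):
        """Ratio of the sample rate to the Nyquist rate (2 x bandwidth)."""
        return sample_rate_ms_s / (2.0 * bandwidth_mhz)

    def check_setup(sample_rate_ms_s, bandwidth_mhz):
        factor = oversampling_factor(sample_rate_ms_s, bandwidth_mhz)
        if factor > MAX_OVERSAMPLING:
            raise ValueError("oversampling x%g exceeds the supported x%d"
                             % (factor, MAX_OVERSAMPLING))

    # GI001A's mode: 0.5 MHz filters sampled at 8 Ms/s -> oversampling x8.
    try:
        check_setup(sample_rate_ms_s=8.0, bandwidth_mhz=0.5)
    except ValueError as exc:
        print(exc)    # oversampling x8 exceeds the supported x4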
To review some landmarks from the sessions that are still 'active':

May/Jun'03:
  - 1st user experiments with >16 stations at one time (more in Nov)
  - 1st user experiment with a sub-netted schedule (not yet advertised; required some manual hacks to make a VEX file for correlation)
  - 1st successful correlation of observations having an LO offset

Nov'03:
  - 1st station regularly recording all experiments onto Mk5 disks
  - 1st fully successfully observed 512 Mb/s user experiment

Feb'04:
  - 1st sub-second integration-time user experiments (one of these holds the record for size of distributed FITS files, at 94 GB)
  - Regular Mk5 recording by 3-4 stations per experiment

May/Jun'04:
  - Regular Mk5 recording by up to 8 stations per experiment

Oct/Nov'04:
  - 1st all-disk user experiments

Infrastructure

Shipping tapes has essentially become yesterday's problem: as more stations go over to disk, the demand for tapes is decreasing. In fact, balancing the trans-Atlantic tape flux, which arises from a disk-based EVN and a tape-based VLBA while most global-experiment correlation occurs at JIVE, now involves more tape shipment than does meeting EVN requirements. To minimize costs and avoid the need for a new shipment every session, we sent 120 tapes to NRAO by ship during the summer. The Oct/Nov'04 session turned out to have a more balanced trans-Atlantic tape flux because of the several global P-band pulsar observations whose tapes had to go to Socorro.

Shipping disks is somewhat more complicated than it was for tapes: having multiple experiments per disk-pack means that releasing media is no longer a one-to-one function of releasing experiments, and individual disk-packs are not all identical. The scheduler and TOG chairman determine how much recording capacity each station requires in a session, balancing load against on-hand supply, and we are told how much to ship to meet the shortfall. Under the policy that stations buy two sessions' worth of disks, we should be replenishing stations for a session with the disks received two sessions previously (see the sketch below). The following table shows how this scheme has been followed for the previous two sessions:

                    Rcvd Nov'03   Shppd May'04   Rcvd Feb'04   Shppd Oct'04
  N_pack                 20            28             53            68
  Total size [GB]    22,000        30,520         63,689        86,312

We are distributing more disks than this scheme would nominally require, to allow some stations that don't have enough disk-packs on hand to participate as Mk5 stations during the sessions. This sort of indulgence isn't sustainable: for the May/Jun'04 session we received 78 packs totalling 111,516 GB, but we currently have only 74 packs totalling 111,594 GB in house, including some JIVE-nnn packs (22, for 32,560 GB) that in principle we should be able to keep out of circulation for various sorts of test recordings.
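To make the two-session replenishment cycle concrete, here is a minimal Python sketch of the intended bookkeeping; the pack counts come from the table above, and the helper itself is purely illustrative:

    # Packs received in session N are nominally shipped back for session N+2.
    sessions = ["Nov'03", "Feb'04", "May/Jun'04", "Oct/Nov'04"]
    received = {"Nov'03": 20, "Feb'04": 53}    # packs received (table above)

    def packs_to_ship(session):
        """Nominal shipment for a session: what arrived two sessions earlier."""
        i = sessions.index(session)
        return received.get(sessions[i - 2], 0) if i >= 2 else 0

    print(packs_to_ship("May/Jun'04"))    # 20 (we actually shipped 28)
    print(packs_to_ship("Oct/Nov'04"))    # 53 (we actually shipped 68)

The gap between the nominal and actual shipments in the two comment lines is exactly the extra distribution described above.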
TECHNICAL DEVELOPMENTS

We currently have 13 working DPUs and 10 Mk5A units attached to the station units (SUs) for operations, with another five Mk5 units in house (one currently on loan to Metsahovi for the Oct/Nov'04 session). Three of the Mk5s are fully connected to their SUs (all 64 tracks); the others share an SU with a tape playback unit. When more than 10 stations record on Mk5, we can attach further Mk5 units in a similar shared configuration. We have a medium-term plan to shift to a 12-12 disk/tape mix, in which four of the Mk5s would be fully connected to their SUs and the other eight would share an SU with a tape player, but this is on hold until we complete the remaining Feb'04 experiments that have 13 tapes (one more than a 12-12 split could play back at once). The reduction in the number of tape players will provide the opportunity to cannibalize some of their key components (capstan motors, heads) as the need arises.

Several new observing modes were enabled recently. Prompted by the Huygens project, support for VLBA recordings made on disk was implemented. A special correlator mode for eVLBI allows more straightforward configuration for real-time VLBI; it was used for the first eVLBI science demo on 22 September. The demo also showed the need to focus on the robustness of the correlator software. Longer unattended operation is the highest priority, and is also deemed important for taking full advantage of disk recording. Progress was made by curing the two most limiting problems in the latest correlator software.

The new ALBUS manpower allows us to address specific problems related to the data product. A first milestone was passed by implementing "calibration transfer" for the JIVE processor: system temperatures and telescope-based flagging can now be inserted directly into the data product. This of course relies on timely creation of the ANTAB files at the stations. The next item is to provide phase-cal calibration; successful first measurements have been made. The computationally intensive van Vleck corrections were converted from Glish to compiled aips++ code, which removed a serious bottleneck in the data-distribution process (a sketch of the underlying correction follows at the end of this report).

Progress continues on the PCInt project, which will eventually allow read-out of the whole correlator with integration times as short as 1/64 s. Recently it was shown that all stages in the new data path work correctly. Work remains on the definition of a backwards-compatible data format that can hold the new parallel data streams. A first step was to allow different native byte orders, so that the data can be stored on the large Linux cluster (see the byte-order sketch at the end of this report). The first tests of recirculation modes have been done.

In the context of the FP6 project ALBUS, we have evaluated, in collaboration with Bill Cotton (NRAO), the possibility of adding a Python interface to classic AIPS. This seems to be a promising route for producing and distributing the software that needs to be made available to the user community (a final sketch below illustrates the idea).
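To indicate the flavour of the van Vleck correction mentioned above: for 1-bit (two-level) quantization it has the closed form rho = sin(pi*r/2), where r is the measured correlation of the clipped data streams and rho the true correlation; multi-level quantization has no such closed form and the corresponding relation must be inverted numerically, which is more expensive. The minimal Python sketch below covers the 1-bit case only and is not the aips++ implementation:

    import math

    def van_vleck_1bit(r_measured):
        """Closed-form van Vleck correction for 1-bit quantization:
        recover the true correlation coefficient from the measured
        correlation of hard-clipped (two-level) data streams."""
        return math.sin(0.5 * math.pi * r_measured)

    # A 1-bit correlator measuring r = 0.0637 corresponds to a true
    # correlation of ~0.1; the correction grows with |r|.
    print(van_vleck_1bit(0.0637))    # ~0.0999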
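The byte-order step in PCInt is the familiar portability issue: correlator output must be decodable on the little-endian Linux cluster regardless of the architecture that wrote it. A Python sketch of explicit-endianness record handling follows; the 12-byte record layout is invented for illustration and is not the PCInt format:

    import struct

    # Invented 12-byte record: a 32-bit sample count followed by a
    # 64-bit timestamp.  '>' forces big-endian and '<' little-endian,
    # so a file can be decoded correctly regardless of the native byte
    # order of the machine that writes or reads it.
    RECORD_BE = struct.Struct(">Iq")
    RECORD_LE = struct.Struct("<Iq")

    def read_record(buf, big_endian=True):
        fmt = RECORD_BE if big_endian else RECORD_LE
        return fmt.unpack(buf)

    buf = RECORD_BE.pack(1024, 1100000000)
    print(read_record(buf))                       # (1024, 1100000000)
    print(read_record(buf, big_endian=False))     # byte-swapped garbage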
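Finally, to illustrate the attraction of a Python interface to classic AIPS: calibration steps could be chained in an ordinary script rather than typed verb-by-verb at the AIPS prompt. Everything in the sketch below is invented for illustration; the stand-in wrapper class and file name do not represent the interface actually under evaluation:

    # Hypothetical sketch only: AIPSTask here is a toy stand-in for a
    # wrapper that would run one classic-AIPS task with given adverbs.
    class AIPSTask:
        def __init__(self, name, **adverbs):
            self.name, self.adverbs = name, adverbs

        def go(self):
            # A real wrapper would drive AIPS; this one just reports.
            print("would run %s with %s" % (self.name, self.adverbs))

    # A calibration fragment as a script: load station calibration,
    # then fringe-fit, with parameters recorded reproducibly in code.
    AIPSTask("ANTAB", calin="n04xx.antab").go()
    AIPSTask("FRING", solint=2).go()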