EVN TOG - December 2006
Procedures for Pre-Session Distribution
Bob Campbell, JIVE

1. The first step in determining how many disk-packs to distribute to stations prior to a network session is receipt of the block schedule. From this, the total recording load per station can be calculated by summing, over all experiments in which a station participates, the product (recording-rate * duration) for each experiment.

   a. There is some question whether a net 90% duty-cycle should be assumed in PI scheduling. That would obviously allow ~10% more recorded data to be scheduled, but if PIs exceed this (on net), then the distributed disk-packs may prove insufficient. Even though the 90% rule has empirically been honored on average in the past couple of sessions, disk recording does have fewer intrinsic gaps, and I'm leaning towards using a 100% duty-cycle rule in calculating pack needs, to be sure that enough is distributed.

   b. Another factor to keep in mind is the destination correlator per experiment. Recordings destined for different correlators should not be put onto the same disk-pack, which may lead to some inefficient use of available space given the quantized sizes of packs.

2. The next step in determining how much to distribute is to establish the inventory available at the stations & correlators. To minimize shipping costs and maintain flexibility to respond to unanticipated experiments, we aim to distribute just enough to bring individual stations up to their required load for the session. Therefore, it's important to know how much is already on hand at the station, either as excess from previous sessions or from new purchases. If there are multiple destination-correlators for a station's recordings, then it would also be useful to know the number/size distribution of the available packs. The number and sizes available for release at the correlators (Bonn, JIVE) would form the final piece of the puzzle.

   If the session is expected to be disk-space limited, then it's quite possible that creating the block schedule will be an iterative process, taking into account various scenarios to see whether the resulting loads could be supported by the available packs.

3. At some point, a set of packs to distribute to each station is determined. This takes into account:

   a. The load per destination correlator. We currently try to provide, as separate sub-distributions, the load + ~0.5 TB for each correlator. The rationale for this safety margin is to allow ~2 individual disks to fail. Ideally, a safety margin of an entire pack would be more robust, but this could limit the total number of experiments that could be supported, especially when there are experiments going to JIVE, Socorro, and Haystack from several stations.

   b. I try to provide specific (sub-)shipment break-downs (N_packs, sizes) that should meet the above requirements for each destination correlator.

   c. Some individual stations may have particular needs. Usually, this has meant returning specific disk-packs to the purchasing station to effect repairs. In the future, with the newly agreed tactic of JIVE repairing packs and sending individual disks to the purchasing station for warranty service/replacement, this category would become obsolete.

4. There are some logistical points to keep in mind, some agreed upon at the CBD level and some that evolve as we learn more about optimizing the distribution:

   a. EVN full-member institutes pay for the shipping from JIVE to the station. Hence, we have tried to coordinate with the stations to use a shipper who provides reliable and economical service. Because we don't get charged for these shipments, we don't get feedback about these aspects unless the receiving station brings things to our attention.

   b. JIVE pays for shipments to non-full-member institutes (typically, Ro, Ar, VLBA).
      From these shipments over the past couple of years, we have become aware of some undesirable characteristics of the shipper we had been using (expensive, frequent difficulties clearing customs into the US, etc.), and have switched companies.

   c. We try to ship packs out to arrive at least 2 weeks prior to the session (shipments to China/South Africa leave 4-5 weeks prior; those to continental Europe typically the week after). Of course, a constraint on the earliest possible shipping date is the existence of the block schedule and of an inventory of disk-packs already on hand at the stations, so at least in terms of the latter, you can see a direct benefit from providing this inventory in a timely fashion. In order to minimize shipping costs to the distant stations, we generally try to meet the load for non-continental stations with the fewest possible packs.

5. Tables of disk receipt/distribution: Our disk-shipping requirements are derived from the recording capacity needed by a session (from the EVN scheduler) and the supply on hand at the stations (from the TOG chairman). Now that the VLBA has also shifted to Mark 5 recording, the bookkeeping of disk-flux accounting has become more complicated. We have two sets of rules to follow:

   a. The EVN policy that stations should buy two sessions' worth of disks; hence our disk flux should balance over the same 2-session interval.

   b. The VLBA's need for sub-session turn-around, which essentially requires us to pre-position the difference between what NRAO stations will observe in globals to be correlated at JIVE and what EVN stations will observe in globals to be correlated in Socorro.

   The following tables chart our net disk flux to support both EVN and VLBA stations (all entries in TB), with a positive balance signifying flow away from JIVE to stations, or from EVN to NRAO. Session 3/2006 is very short, owing to the sub-reflector improvements at Effelsberg; we had a further 100.55 TB of packs available to ship that were not required.

EVN only
Tactics: recycle in time for recording in 2nd following session

   IN                    OUT                 BALANCE      NET
   3/2003    22.00  ->   2/2004    30.52      +8.52      +8.52
   1/2004    63.69  ->   3/2004    86.31     +22.62     +31.14
   2/2004   111.52  ->   1/2005    69.89     -41.63     -10.49
   3/2004   122.53  ->   2/2005   100.67     -21.86     -32.35
   1/2005   140.58  ->   3/2005   142.31      +1.73     -30.62
   2/2005   176.99  ->   1/2006   152.93     -24.06     -54.68
   3/2005   211.85  ->   2/2006   303.37     +91.52     +36.84
   1/2006   191.05  ->   3/2006   133.36     -57.69     -20.85
   2/2006   306.63  ->   1/2007

NRAO only
Tactics: JIVE pre-positions packs for globals to be correlated at JIVE (col 1)
         EVN stations send packs from globals to be correlated at SOC (col 2)
         NRAO sends packs from globals to be correlated at JIVE (col 3)
         NRAO returns some packs to stations once correlated (col 4)

             to NRAO                   to EVN
             JIVE pre-pos   Sta->SOC   to JIVE   to Sta    BALANCE      NET
   3/2005       27.84        15.73      33.56     3.31      +6.70      +6.70
   1/2006       31.60         5.06      30.22     0.00      +6.44     +13.14
   2/2006        4.96        16.48      23.86     0.00      -2.42     +10.72
   3/2006       24.00         0.00

Note that in the first two sessions, I was not accounting for the packs sent from EVN stations to Socorro for correlation, thus a net balance was accumulating, as pointed out to me by Craig Walker via the scheduler. Liaison with Socorro continues to make sure we're both on the same page. There are no global experiments going to Socorro for correlation in session 3/2006.
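
The BALANCE and NET columns in the two tables above can be re-derived from the per-session entries. The Python sketch below is only an illustration of that bookkeeping, not the procedure actually used at JIVE. The names (EVN_ROWS, NRAO_ROWS, evn_balances, nrao_balances, running_net) are hypothetical. The formulas are inferred from the tabulated values: BALANCE = OUT - IN for the EVN table, and BALANCE = (col 1 + col 2) - (col 3 + col 4) for the NRAO table, with NET as the running sum of BALANCE. Rows without a BALANCE entry (2/2006 in the EVN table, 3/2006 in the NRAO table) are left out.

```python
# All values in TB, transcribed from the completed rows of the tables above.
# Names and layout are illustrative assumptions, not taken from the report.

# (session received, TB in, session redistributed, TB out)
EVN_ROWS = [
    ("3/2003", 22.00, "2/2004", 30.52),
    ("1/2004", 63.69, "3/2004", 86.31),
    ("2/2004", 111.52, "1/2005", 69.89),
    ("3/2004", 122.53, "2/2005", 100.67),
    ("1/2005", 140.58, "3/2005", 142.31),
    ("2/2005", 176.99, "1/2006", 152.93),
    ("3/2005", 211.85, "2/2006", 303.37),
    ("1/2006", 191.05, "3/2006", 133.36),
]

# (session, JIVE pre-pos, Sta->SOC, NRAO to JIVE, NRAO to Sta)
NRAO_ROWS = [
    ("3/2005", 27.84, 15.73, 33.56, 3.31),
    ("1/2006", 31.60, 5.06, 30.22, 0.00),
    ("2/2006", 4.96, 16.48, 23.86, 0.00),
]


def evn_balances(rows):
    """Per-row balance, inferred as OUT - IN (positive = flow away from JIVE)."""
    return [round(out_tb - in_tb, 2) for _, in_tb, _, out_tb in rows]


def nrao_balances(rows):
    """Per-row balance, inferred as (to NRAO) - (to EVN)."""
    return [
        round(pre_pos + to_soc - to_jive - to_sta, 2)
        for _, pre_pos, to_soc, to_jive, to_sta in rows
    ]


def running_net(balances):
    """Cumulative sum of the balances, rounded to the table's 0.01 TB precision."""
    net, total = [], 0.0
    for balance in balances:
        total += balance
        net.append(round(total, 2))
    return net


if __name__ == "__main__":
    for label, balances in (("EVN", evn_balances(EVN_ROWS)),
                            ("NRAO", nrao_balances(NRAO_ROWS))):
        print(label)
        for balance, net in zip(balances, running_net(balances)):
            print(f"  balance {balance:+8.2f}   net {net:+8.2f}")
```

Rounding to two decimals at each step keeps ordinary floating-point noise out of the comparison with the tabulated figures; a stricter version could use decimal arithmetic instead.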