Produit Nicolas
27 March 1997
This memo mainly explains the mathematics behind the data compression code in the file treat.pas.
The total downlink bandwidth is 1 Mbit/s. Let's assume that half of it is dedicated to the tracker. One good event is 6 planes * 2 sides * 1 cluster. Let's assume 1 cluster is 10 bytes (very optimistic) and that we run at the maximum 500 Hz trigger rate.
So if we do a perfect job of clustering we use
500 Hz * 8 bit/byte * 10 byte/cluster * 12 clusters = 480 kbit/s
How many fake clusters do we expect (with a seed cut at 3)?
200 000 strips * probability of a 3 sigma positive fluctuation = 200 000 * (1/2)*erfc(3/sqrt(2)) = 200 000 * 1.35e-3 = about 270 clusters,
so this is not negligible compared to 12! Besides, we know that the noise has some non-Gaussian tails...
Another guess is that the number of noise clusters is three orders of magnitude below the number of strips; with 200 000 strips we can then expect 200 noise clusters per event...
So to stay inside our bandwidth budget we are forced to do a perfect data compression job.
One DSP in a TDRK has to compress the data coming from 6*4*64 = 1536 strips (6 VAs per ADC, 4 ADCs, 64 strips per VA) per event. With a (very optimistic) 10 instructions per strip algorithm we need around 15 MIPS per kilohertz of LV1 trigger rate. The DSP has around 10 MIPS of data crunching power.
For the designed 500 Hz LV1 trigger rate (about 7.5 MIPS needed) this figure seems adequate.
(See also the AMS Geneva memo "The calib program".)
We consider that the ADC value read from a good strip i in VA k in event n is equal to:
ADC(i,n) = ped(i) + CN(k,n) + noi(i,n) + gain(i) * signal(i,n)
where ped(i) is the pedestal of strip i,
CN(k,n) is the common noise of VA k in event n. CN is a single random number shared by all good strips belonging to the same VA, Gaussian distributed around 0; the width SCN(k) of this distribution depends on the readout system and not on the ladder itself,
noi(i,n) is a Gaussian random variable with mean 0 and sigma = noise(i),
signal(i,n) is the signal in MIPs on strip i for event n,
gain(i) is the MIP to ADC conversion factor for strip i.
status(i) is 0 for a good strip. For a bad strip each bit of status encodes a discovered pathology (see table 1 in the AMS Geneva memo "The calib program").
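As an illustration, here is a minimal Pascal sketch of this strip model (the names TStrip, GaussRnd and AdcValue are ours for illustration, not taken from treat.pas):

  { One strip of the model; the calibration constants come from the calib program. }
  type
    TStrip = record
      ped    : Double;   { pedestal ped(i) }
      noise  : Double;   { sigma of noi(i,n) }
      gain   : Double;   { MIP to ADC conversion factor gain(i) }
      status : Byte;     { 0 = good, each bit flags a pathology }
    end;

  { Unit Gaussian random number (Box-Muller). }
  function GaussRnd: Double;
  var u: Double;
  begin
    repeat u := Random until u > 0.0;   { avoid Ln(0) }
    GaussRnd := Sqrt(-2.0 * Ln(u)) * Cos(2.0 * Pi * Random)
  end;

  { ADC(i,n) = ped(i) + CN(k,n) + noi(i,n) + gain(i)*signal(i,n) }
  function AdcValue(const s: TStrip; cn, signalMip: Double): Double;
  begin
    AdcValue := s.ped + cn + s.noise * GaussRnd + s.gain * signalMip
  end;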
Common noise turns out to be surprisingly hard to define.
We know that there are two CNs: one is ladder-wide (or even power-supply-wide), the other is VA-wide. We will consider only a VA-wide CN, which includes both.
First naive definition: after pedestal subtraction, the mean of the signal over one VA. This is OK as long as there is no signal and no bad strips.
First refinement: after pedestal subtraction, the mean of the signal over the good strips in a VA. This is OK only if there is no signal, which is not very useful. The definition is also made so that a VA with all bad strips gives a CN of 0.
Second refinement: to get rid of the signal we can try to exclude strips that show big non-zero values, say more than 3 sigma. This worked perfectly with small CN and MIPs in the CERN test beam, but it was rather bad with heavy ions at GSI. The 3 sigma cut works only if the CN is really around 0: in an event with a big CN value it can exclude all the strips or, on the contrary, select the signal region.
The ultimate cure for this last problem is to make a histogram of all the values after pedestal subtraction; the CN is then the most populated bin. But this definition is enormously CPU and memory intensive, and rather stupid because, for example, clearing the histogram takes much more time than filling it. The actual solution is to take a good strip as representative of the CN and define the 3 sigma cut around its value (we start with strip number 10 in the VA, not 1, because we have reason to suspect that boundary strips are less representative than middle strips). If too many strips are excluded we conclude that this strip had a signal, so we skip to the next good strip and start again (wrapping around if we reach strip 64). If we exhaust all strips then the CN is set to 0. In most cases this algorithm converges with the first strip, but sometimes it can take much longer. You will have to look in the code yourself to understand the fine details, because the presence of bad strips and the avoidance of divisions by 0 add many small details that are much more obvious in the code itself than in writing.
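A minimal Pascal sketch of this representative-strip algorithm follows; it omits the bad-strip and division-by-zero details mentioned above, and the "too many strips excluded" threshold MaxLost is our assumption (treat.pas may use a different criterion):

  const
    VASize   = 64;
    StartRep = 10;     { start the search at strip 10, a middle strip }
    MaxLost  = 20;     { "too many strips excluded" threshold -- an assumption }
  type
    TVAArr  = array[0..VASize - 1] of Double;
    TVABool = array[0..VASize - 1] of Boolean;

  { val[i] is the pedestal-subtracted ADC value, sigma[i] the strip noise. }
  function CommonNoise(const val, sigma: TVAArr; const good: TVABool): Double;
  var
    rep, i, tried, nGood, nUsed: Integer;
    sum: Double;
  begin
    nGood := 0;
    for i := 0 to VASize - 1 do
      if good[i] then Inc(nGood);
    if nGood = 0 then begin CommonNoise := 0.0; Exit end;  { all-bad VA: CN = 0 }
    rep := StartRep;
    for tried := 1 to VASize do
    begin
      if good[rep] then
      begin
        sum := 0.0;
        nUsed := 0;
        for i := 0 to VASize - 1 do
          if good[i] and (Abs(val[i] - val[rep]) < 3.0 * sigma[i]) then
          begin
            sum := sum + val[i];
            Inc(nUsed)
          end;
        if nGood - nUsed <= MaxLost then   { representative accepted }
        begin
          CommonNoise := sum / nUsed;      { nUsed >= 1: rep always counts itself }
          Exit
        end
      end;
      rep := (rep + 1) mod VASize          { try the next strip, wrap at strip 64 }
    end;
    CommonNoise := 0.0                     { all representatives exhausted }
  end;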
I don't consider this part of the code as final.
For example, we will have to refine what we call "good" for the CN, which is perhaps different from what we call "good" for clustering (for example, underflows are bad for the CN but good for clustering, while bad-gain strips are OK for the CN).
To find a cluster we subtract the pedestal from the raw value, then compute and subtract the common noise.
We then scan all the good strips for an n sigma positive fluctuation (n is read from the file default.set under the name "seed cut"). We then associate to this seed strip all the neighbouring strips (irrespective of their status) as long as they exhibit a more than m sigma positive fluctuation (m is read from default.set under the name "neighbour cut"). From experience with MIPs (a fortiori with heavy ions) n=5 and m=2 are good values, but you can lower them to gain efficiency at the price of more noise.
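A minimal Pascal sketch of this scan, reusing the TVAArr and TVABool types of the previous sketch (the WriteLn stands in for the linked list described below):

  { seedCut (n, default 5) and neighbourCut (m, default 2) come from default.set. }
  procedure FindClusters(const val, sigma: TVAArr; const good: TVABool;
                         seedCut, neighbourCut: Double);
  var
    i, first, last: Integer;
  begin
    i := 0;
    while i < VASize do
    begin
      if good[i] and (val[i] > seedCut * sigma[i]) then
      begin
        first := i;      { extend left, irrespective of strip status }
        while (first > 0) and (val[first - 1] > neighbourCut * sigma[first - 1]) do
          Dec(first);
        last := i;       { extend right }
        while (last < VASize - 1) and (val[last + 1] > neighbourCut * sigma[last + 1]) do
          Inc(last);
        WriteLn('cluster: strips ', first, ' to ', last);
        i := last + 1    { resume the scan after this cluster }
      end
      else
        Inc(i)
    end
  end;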
This scan defines the boundary of the cluster. Some values are then computed on the cluster. We will have to make detailed studies to see what the best choices are, remembering that the bandwidth will limit us to something like 10 bytes per cluster:
starting strip
length
center of gravity
integral
extrapolated center using 2 strips (Shoutko proposal, seems very good for MIPs but conceptually bad for heavy ions): see below.
extrapolated center using n strips
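As an illustration, minimal Pascal sketches of two of these quantities, the integral and the center of gravity, again with the TVAArr type from above (names are ours):

  { Integral and center of gravity of a cluster spanning strips first..last. }
  function ClusterIntegral(const val: TVAArr; first, last: Integer): Double;
  var i: Integer; s: Double;
  begin
    s := 0.0;
    for i := first to last do s := s + val[i];
    ClusterIntegral := s
  end;

  function CentreOfGravity(const val: TVAArr; first, last: Integer): Double;
  var i: Integer; s, m: Double;
  begin
    s := 0.0; m := 0.0;
    for i := first to last do
    begin
      s := s + val[i];
      m := m + i * val[i]
    end;
    CentreOfGravity := m / s   { in strip units; s > 0 since all cluster strips passed a positive cut }
  end;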
All the clusters are then put in a linked list, which can be saved to disk in ASCII. The scope program is also able to display clusters.
Here we see that 10 bytes is a really small number: we need 3 bytes to code the position (200 000 strips plus some bits for the sub-strip position), and dE/dx will need 2 bytes to achieve a 10^2 dynamic range with a S/N of 20, leaving only 5 bytes for the shape of the cluster.
The extrapolated center using 2 strips is computed as the weighted mean of the highest strip and the bigger of its two adjacent strips. This means that in the case of a 1-strip cluster we use a strip which does not belong to the cluster. This seems to recover a little bit of resolution for MIPs because in this case we have a significant fraction of single-strip clusters. The definition has to specify what to do in case of bad strips and of strips on the boundary; you will have to look inside the code to learn about that (people never explain those details in their talks, but most of the devil lies there).
This definition is bad for heavy ions because in that case the signal always spreads over many strips, and with this definition the eta function (see below) can, by definition, never be near an integer value, which is obviously stupid.
For example, if we have the sequence of values 0 0 0 10 20 11 0 0 0, this algorithm pulls the predicted center about a third of a strip away from the strip with value 20, towards the strip with value 11, whereas the true answer is very close to the strip with value 20 (a little to the right, but not significantly, since the noise is around 2).
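A minimal Pascal sketch of this two-strip extrapolation (boundary and bad-strip handling omitted, as discussed above):

  { Weighted mean of the highest strip and the bigger of its two neighbours. }
  function TwoStripCentre(const val: TVAArr; peak: Integer): Double;
  var nb: Integer;
  begin
    if val[peak + 1] >= val[peak - 1] then nb := peak + 1 else nb := peak - 1;
    TwoStripCentre := (peak * val[peak] + nb * val[nb]) / (val[peak] + val[nb])
  end;

On the sequence above, with the peak value 20 at strip 4, this gives (4*20 + 5*11)/31 = 4.35, while the three-strip center of gravity gives (3*10 + 4*20 + 5*11)/41 = 4.02, almost exactly on the peak strip.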
Definitions of the extrapolated center using n strips exist in the literature.
The eta function can be unambiguously defined for 2-strip clusters (strips i and i+1):
eta = abs(adc(i) - adc(i+1)) / (adc(i) + adc(i+1))
With this definition eta is always between 0 and 1. We can consider single-strip clusters as having their eta equal to 0 (or to 1 half of the time, to symmetrise the definition).
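In Pascal this reads (function names are ours):

  { Eta for a 2-strip cluster, strips i and i+1; values are pedestal- and CN-subtracted. }
  function Eta(adcI, adcIplus1: Double): Double;
  begin
    Eta := Abs(adcI - adcIplus1) / (adcI + adcIplus1)
  end;

  { Single-strip clusters: 0, or 1 half of the time to symmetrise the definition. }
  function EtaSingle: Double;
  begin
    if Random(2) = 0 then EtaSingle := 0.0 else EtaSingle := 1.0
  end;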
This eta function is expected to be flat if the source is randomly distributed in space. Experimentally the eta function will not be flat because of floating strips, charge collection dynamics, capacitive coupling of the strips, and the magnetic field (we can in fact measure the Lorentz angle using this distribution). Using a lot of statistics and cuts that enhance the 2-strip clusters, we can measure the eta distribution and then deconvolute it by the standard method (inverting the cumulative function y(x) = integral from 0 to x of f(t) dt, where f is the measured eta distribution). This can improve the resolution by one or two microns and take out the magnetic field systematics.
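A possible sketch of this deconvolution, under the assumption that the measured eta distribution is stored in a simple 100-bin histogram (binning and names are ours):

  const NBins = 100;
  type  THist = array[0..NBins - 1] of Double;

  { From the measured eta histogram h, build the normalised cumulative f;
    the corrected sub-strip position for a measured eta is then f[bin(eta)].
    Assumes a non-empty histogram (tot > 0). }
  procedure BuildEtaCorrection(const h: THist; var f: THist);
  var i: Integer; tot, run: Double;
  begin
    tot := 0.0;
    for i := 0 to NBins - 1 do tot := tot + h[i];
    run := 0.0;
    for i := 0 to NBins - 1 do
    begin
      run := run + h[i];
      f[i] := run / tot
    end
  end;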
This correction will be applied on earth and not in space, but we have to ensure that the eta function can be computed on earth using the output of the standard cluster algorithm.
The memory effect occurs when the readout value adc(i+1) of strip i+1 depends on the difference adc(i) - adc(i+1), due to the slew rates of the VA, the transmission line and the receiver.
The slew rate of our input stage is rather good, so we don't really expect to suffer from a big memory effect, but we also want to run at a 5 MHz readout speed, which is close to the maximum.
Studies using the GSI test beam data are so far inconclusive on this subject.
This effect can be measured on earth, but it normally requires raw data: the quantity
sigma(adc(i) + adc(i+1)) / sqrt(2) / sigma(adc(i))
must be 1 for statistically independent strips of equal noise (the variances then simply add), which is the case in the absence of signal and if the VA pedestals are uncorrelated.
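A minimal Pascal sketch of this check on n raw events (array sizes and names are ours):

  const MaxEv = 1000;
  type  TEvArr = array[1..MaxEv] of Double;

  function Sigma(const x: TEvArr; n: Integer): Double;
  var i: Integer; s, s2: Double;
  begin
    s := 0.0; s2 := 0.0;
    for i := 1 to n do
    begin
      s := s + x[i];
      s2 := s2 + x[i] * x[i]
    end;
    Sigma := Sqrt(s2 / n - Sqr(s / n))
  end;

  { adcI[e] and adcJ[e] are the raw values of strips i and i+1 in event e;
    the result should be 1 for uncorrelated strips of equal noise. }
  function CorrelationRatio(const adcI, adcJ: TEvArr; n: Integer): Double;
  var e: Integer; sum: TEvArr;
  begin
    for e := 1 to n do sum[e] := adcI[e] + adcJ[e];
    CorrelationRatio := Sigma(sum, n) / Sqrt(2.0) / Sigma(adcI, n)
  end;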
We know that the pedestals of the VA strips are strongly correlated, so this effect is partially cancelled in the pedestal subtraction. Nevertheless this is an effect that can systematically move the position by one or two microns; we will have to investigate it with the full system.
If the effect is measured, it is easy to correct for it on earth or, better, in space if we are confident enough.