HERA File Utilities - Data Statistics

Data Statistics

It is sometimes useful to know some basic statistics of the data, such as the minimum and maximum observed values, or the mean (average) value. Such information can provide a valuable general sense of the data without much analysis (is the source stronger than a certain benchmark, for example?). The File Utilities tool Data Statistics looks through the FITS file to extract just such information. Select the data set and run Data Statistics. The only parameters you need to fill in are the name of the column whose statistics you want and the range of rows to include in compiling the statistics. The Name of Column in the FITS file parameter can be any of the data columns (TIME, RATE, ERROR, etc.), though it is usually the dependent variable (in this case RATE, the number of photons received per second) that you are interested in. The Range of Rows to Include parameter specifies what part of the data set to use in compiling the statistics. Selecting a large range gives an overall sense of the trends, whereas a smaller range gives more localized results. You need to specify a range of rows (e.g. 3000-5000), not just a number of rows.
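To make the calculation concrete, here is a minimal Python sketch of the kind of summary described above, computed over a range of rows. It is not the Data Statistics tool itself: the function name column_statistics, the synthetic rate array, the 1-based inclusive row numbering, and the use of the sample standard deviation are all assumptions made for illustration.

```python
import numpy as np

def column_statistics(values, first_row, last_row):
    """Summarize rows first_row..last_row of a column.

    Rows are assumed to be numbered from 1 and the range is inclusive;
    the real tool's conventions may differ.
    """
    subset = np.asarray(values, dtype=float)[first_row - 1:last_row]
    return {
        "min": subset.min(),
        "max": subset.max(),
        "mean": subset.mean(),
        "std": subset.std(ddof=1),  # sample standard deviation (assumption)
    }

# Synthetic stand-in for a RATE column; not real observation data.
rng = np.random.default_rng(0)
rate = rng.poisson(lam=50, size=20000).astype(float)

stats = column_statistics(rate, 3000, 5000)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

Changing the two row arguments plays the same role as editing the Range of Rows to Include parameter.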

Exercise 5

Find the minimum, maximum, mean and standard deviation for the "rate" data variable over a large range of rows, such as 1-20000. (You set the range of rows by changing the Range of Rows to Include parameter in the parameter box.) Copy the results. Now find the results for a smaller subset of rows (try 5000-8000, then 5000-5100). Are there significant variations in the values? Do you expect to see more or less variation with a smaller sample size? Explain the results.
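If you would like to tabulate the three runs side by side outside the tool, the comparison can be sketched in Python as follows. The summarize helper and the synthetic rate values are hypothetical stand-ins, so the numbers printed here say nothing about the real data set; substitute the values you copied from Data Statistics.

```python
import numpy as np

def summarize(values, first_row, last_row):
    """Return (min, max, mean, std) for rows first_row..last_row (1-based, inclusive)."""
    subset = np.asarray(values, dtype=float)[first_row - 1:last_row]
    return subset.min(), subset.max(), subset.mean(), subset.std(ddof=1)

# Synthetic stand-in for the "rate" column; not real observation data.
rng = np.random.default_rng(1)
rate = rng.normal(loc=100.0, scale=10.0, size=20000)

# The three row ranges suggested in the exercise.
ranges = [(1, 20000), (5000, 8000), (5000, 5100)]
results = {r: summarize(rate, *r) for r in ranges}

for (first, last), (lo, hi, mean, std) in results.items():
    print(f"rows {first}-{last}: min={lo:.2f} max={hi:.2f} "
          f"mean={mean:.2f} std={std:.2f}")
```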


