
# Finding a Source

How do scientists know when they have truly determined the location of a source emitting high-energy photons? They test whether the detection of the source is statistically significant, and one way to do this is a "shortcut" method based on the standard deviation.

To understand what a standard deviation (also referred to as sigma) is, and how it helps you decide whether or not you have detected a source in your data, we first need to learn about what statisticians call a "normal distribution" of data.

A normal distribution of data means that most of the samples in a set of data are close to the "average," while relatively few samples tend to one extreme or the other. If you looked at normally distributed data on a graph, it would look something like this:

One standard deviation away from the mean in either direction on the horizontal axis (the red area on the above graph) accounts for about 68 percent of the data in this set. Two standard deviations away from the mean (the red and green areas) account for roughly 95 percent of the data. And three standard deviations (the red, green and blue areas) account for about 99.7 percent of the data. If a datum lies more than 3 sigma away from the mean, it is truly an exceptional sample compared to all the rest of the data. This is what you would expect a source to look like compared to the background noise! In other words, if the difference between the two numbers you are testing is "greater than three times sigma", then you can be highly confident (though never absolutely certain) that you have located a source emitting high-energy photons.
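The percentages quoted above can be checked numerically. The following is a minimal sketch, not part of the original activity: it uses the standard result that the fraction of a normal distribution within k sigma of the mean is erf(k / sqrt(2)). The function name `fraction_within` is made up for this illustration.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution lying within k sigma of the mean.

    Uses the standard identity P(|x - mean| < k*sigma) = erf(k / sqrt(2)).
    (Illustrative helper; the name is not from the lesson plan.)
    """
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} sigma: {fraction_within(k) * 100:.1f}%")
# within 1 sigma: 68.3%
# within 2 sigma: 95.4%
# within 3 sigma: 99.7%
```

Students can extend the loop to 4 or 5 sigma to see how quickly the excluded fraction shrinks.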

In this activity, we can assume that one sigma is well approximated by the square root of the value of the pixel count that we are testing. You may ask your more advanced students to research why this would be the case, and when this approximation would break down.
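As a rough illustration of this approximation, the sketch below computes sigma as the square root of a few pixel counts, along with the relative uncertainty sigma / count. The helper name `sigma_from_count` and the sample counts are assumptions chosen for this example only. The square-root rule is the usual counting-statistics (Poisson) assumption, and it becomes less trustworthy for very small counts.

```python
import math

def sigma_from_count(count):
    """Approximate one sigma for a pixel count as sqrt(count).

    Illustrative helper (name is hypothetical); assumes simple counting
    statistics, as the activity does.
    """
    return math.sqrt(count)

# Sample counts chosen only for illustration.
for count in (4, 20, 60, 100):
    sigma = sigma_from_count(count)
    # The relative uncertainty shrinks as the count grows.
    print(f"count={count:3d}  sigma={sigma:5.2f}  relative={sigma / count:.2f}")
```

Note how the relative uncertainty falls as the count rises: a pixel with 100 counts is known to about 10 percent, while a pixel with 4 counts is known only to about 50 percent.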

Now consider the following:

Find the pixel with the highest number in it. Then exclude all of the pixels immediately surrounding this pixel and look for the highest number in any of the pixels directly outside the excluded area. In the example below, the maximum pixel count is 60 and the highest pixel count outside the excluded area is 20. Note that we exclude a box of pixels around the highest pixel because a real source will be imaged onto more than a single pixel; thus we exclude the nearest-neighbor pixels from consideration when looking for a statistical significance of a source (indicated by the maximum pixel count over the whole array) above the background, or noise.
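The search described above can be sketched in a few lines. This is only one possible reading of the procedure: it assumes the excluded area is the 3 x 3 box centered on the brightest pixel, and that "outside" means any pixel in the array not in that box. The 5 x 5 array `pixels` is hypothetical data invented to match the numbers in the text (peak of 60, highest outside count of 20); it is not the array from the original figure.

```python
def find_source_and_background(grid):
    """Return ((row, col), peak, background_max) for a 2-D list of counts.

    Assumption: the excluded area is the 3x3 box around the brightest pixel.
    Raises ValueError if no pixels remain outside that box.
    """
    rows, cols = len(grid), len(grid[0])

    # Locate the pixel with the highest count.
    peak_row, peak_col = max(
        ((r, c) for r in range(rows) for c in range(cols)),
        key=lambda rc: grid[rc[0]][rc[1]],
    )
    peak = grid[peak_row][peak_col]

    # Highest count among pixels more than one step away from the peak.
    background_max = max(
        grid[r][c]
        for r in range(rows)
        for c in range(cols)
        if abs(r - peak_row) > 1 or abs(c - peak_col) > 1
    )
    return (peak_row, peak_col), peak, background_max

# Hypothetical 5x5 pixel array (made up for this sketch).
pixels = [
    [12, 15, 18, 14, 11],
    [16, 35, 42, 33, 20],
    [13, 40, 60, 38, 17],
    [15, 31, 44, 36, 19],
    [10, 14, 16, 13, 12],
]

position, peak, background_max = find_source_and_background(pixels)
print(position, peak, background_max)  # (2, 2) 60 20
```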

So our source is 60, our sigma is sqrt(60) ≈ 7.7, and the highest pixel outside our excluded area is 20. We see that 60 - 3 x 7.7 = 36.9 > 20, so you can be more than 99% sure that you have located a source at the pixel position with 60 counts in it.
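The three-sigma test in this worked example can be written as a short check. This is a sketch under the activity's own assumption that sigma is the square root of the peak count; the function name `is_source` and the `n_sigma` parameter are made up for illustration.

```python
import math

def is_source(peak, background_max, n_sigma=3):
    """Return True if the peak stands more than n_sigma above the background.

    Assumes sigma = sqrt(peak), as in the activity. Hypothetical helper.
    """
    sigma = math.sqrt(peak)
    return peak - n_sigma * sigma > background_max

print(is_source(60, 20))  # True: 60 - 3 * sqrt(60) is well above 20
print(is_source(25, 20))  # False: 25 - 3 * 5 = 10 is below 20
```

The second call shows a case where the brightest pixel is not significant: a peak of 25 is too close to a background of 20 to count as a detection by this rule.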