Selecting a random sample of data

If you have a database with many records and you want to take a random sample of that data, here are a few techniques you can use.
One way to get a random sample is to use a computed criterion with an advanced filter.
Suppose you want to take a random 10% of the data. Enter the formula =RAND()<0.1 in C2, as shown (keep C1 blank). Every time the worksheet calculates, RAND() returns a new random number, so RAND()<0.1 returns TRUE about 10% of the time.
(RAND() returns a random value greater than or equal to 0 and less than 1.)
Book image
Click the Advanced button on the Data tab [Excel 2003: Data|Filter|Advanced Filter]:
Book image
You can filter like this:
Book image
and that will create a random selection:
Book image
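The selection will be different each time. You may notice that only 9 items are shown, not 10. That's because each row is tested against its own random number, so the number of rows that pass varies around 10% rather than hitting it exactly. It's best to use this technique on larger databases!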
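For readers who want to see the same idea outside Excel, here is a minimal Python sketch of the filter technique. It is only an illustration, not part of the worksheet: the function name bernoulli_sample, the fraction and seed parameters, and the 100-item records list are all assumptions made for the example. Each record gets its own random draw, just as each row gets its own RAND() value.

```python
import random

def bernoulli_sample(records, fraction=0.1, seed=None):
    """Keep each record independently with probability `fraction`.

    Hypothetical helper that mirrors the =RAND()<0.1 criterion:
    random() returns a value in [0, 1), like RAND().
    """
    rng = random.Random(seed)  # seed is optional, only for repeatable runs
    return [r for r in records if rng.random() < fraction]

# Assumed stand-in for a 100-row database.
records = list(range(1, 101))

sample = bernoulli_sample(records, fraction=0.1)
# The count is usually near 10 but is not guaranteed to equal 10.
print(len(sample), sample)
```

Running this several times shows the sample size drifting above and below 10, which is the same effect as seeing 9 rows in the filtered list.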
A second way to select a random 10% of your data still uses the RAND function, but without any filtering. Look at this:
Book image
Cells B2 through B101 contain =RAND(). All you need to do is select A2:B101 and sort by column B. Take just the first 10 items, and you have your random 10% of the database!
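The sort technique can be sketched in Python as well. Again this is only an illustration under stated assumptions: the function name sort_sample, its k and seed parameters, and the 100-item records list are made up for the example. Each record is paired with a random key (the column B value), the pairs are sorted by that key, and the first k records are taken.

```python
import random

def sort_sample(records, k, seed=None):
    """Return k records chosen by sorting on a random key.

    Hypothetical helper that mirrors filling column B with =RAND(),
    sorting by column B, and taking the first k rows.
    """
    rng = random.Random(seed)  # seed is optional, only for repeatable runs
    keyed = [(rng.random(), r) for r in records]  # (column B, column A)
    keyed.sort(key=lambda pair: pair[0])          # sort by the random key
    return [r for _, r in keyed[:k]]

# Assumed stand-in for the 100 rows in A2:A101.
records = list(range(1, 101))

sample = sort_sample(records, k=10)
print(len(sample), sample)
```

Unlike the filter technique, this always returns exactly k items. Python's standard library offers random.sample(records, k) for the same purpose; the version above is written out only to follow the worksheet steps.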