Tuesday, June 07, 2005

I heard it's your birthday!

No, not really, but today's the birthday of what may become an endless journey into probabilistic modeling. Welcome ablog!

So many of us just want the answer --- in a way we've almost lost our desire to enjoy the journey --- but I remain forever hopeful that your interests are more in line with doing something interesting, and that the answer is really just a punctuation mark between these journeys. Personally I think the answer-mentality may be a part of our collective educational experience, but please let today be the birth of an idea: that you can only learn by taking the journey, and that you need to find something that interests you along the way.

First let's relax and just sign on board by posting a comment --- just click on the comment link below to start. Don't worry, nothing will break! You should also poke around this blog, just to familiarize yourself with the layout and the interactive nature of blogging in general.

Monday, June 06, 2005

Online polls?

Go ahead now and take a simple online poll. It will only take a moment, and you'll be able to see real-time results. There are three polls in the sidebar, and I hope you take all three. If you don't know the day you were born on, just click the birth day link in the question to find out.

Yes, it's almost a given that people love to take online polls. However, you should always be wary of their outcomes, especially when the questions provoke a reactive response. Anyway, I hope you took these short polls, but please be aware that whatever online poll I design will be biased, because the people who answer choose themselves and I'm not taking a random sample. As you're already aware, someone might try to vote early and often. (I try to prevent this; just try to vote twice in the same day and you'll hopefully be blocked from doing so.) Of course, if I could get the complete sample, that is, get everyone to fork over the requested information, I'd get an unbiased picture. So let's ask everyone in the USA with a birth-year of 1978 what date they were born on.

Now of course we'll need some real data on actual birthday distributions. What I'd like to know (and I hope you do too) is whether each day has a similar number of births. So please download the actual distribution of birthdays for births in the U.S. in 1978. The data was obtained from an article written by Geoffrey Berresford ("The uniformity assumption in the birthday problem," Math. Mag. 53 (1980), no. 5, 286-288).

What are you going to do with the data? Do Worksheet #1; it's in the sidebar. In this worksheet you'll create a time series graph to see how births fluctuate throughout the year. Please post your initial feelings or questions about this data.

Sunday, June 05, 2005

Analyzing the data . . .

People were doing statistics long before computers. In fact, probably the best example of a statistical graph was done in the 1800s by Charles Joseph Minard. Computers have made the collection and analysis of numerical data a whole lot easier, but they have not made us any smarter.

[Charles Joseph Minard Anti-War Poster]
Charles Joseph Minard's Anti-war poster [Larger English version].

So what I'd like you to do is assign each person born in 1978 a unique number. We'll eventually place all these numbers in a big lottery-type machine, mix well, and then sample twenty-three people at random. The easiest way to do this is as follows.

  • 7701 people were born on 1/1/78, so just assign them a range from 1 . . 7701;
  • 7527 people were born on 1/2/78, so just assign them a range from 7701 + 1 = 7702 . . 7701 + 7527 = 15228;
  • 8825 people were born on 1/3/78, so just assign them a range from 15229 . . 24053;
  • 8859 people were born on 1/4/78, so just assign them a range from 24054 . . 32912;
  • 9043 people were born on 1/5/78, so just assign them a range from 32913 . . 41955;
  • 9208 people were born on 1/6/78, so just assign them a range from 41956 . . 51163;
  • etc., until you've reached December 31, 1978

Now, if I select a person who was assigned the number 589, he/she will be in the range of people with birth dates of January 1. You should check that the person whose number is 3,043,614 will have a birth date of November 11. Go ahead and prepare the file (a spreadsheet might be helpful, or at least a calculator) and then download Worksheet #2 from the sidebar and answer the questions. Submit them to your instructor to see if you've done it right!
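If you'd rather check your ranges with a short program than a spreadsheet, here is a minimal sketch in Python. It uses only the six January counts listed above; the names counts, upper, number_range and day_of are made up for illustration, and the full data file would supply one count for every day of the year.

```python
from bisect import bisect_left
from itertools import accumulate

# Daily birth counts for January 1-6, 1978, copied from the list above.
# The real file has one count per day of the year; only six are used here.
counts = [7701, 7527, 8825, 8859, 9043, 9208]

# Running totals: the last person number assigned to each day.
upper = list(accumulate(counts))

def number_range(day_index):
    """Return (first, last) person number for the day at 0-based day_index."""
    first = upper[day_index] - counts[day_index] + 1
    return first, upper[day_index]

def day_of(person):
    """Return the 0-based day index whose range contains this person number."""
    if not 1 <= person <= upper[-1]:
        raise ValueError("person number is outside the assigned ranges")
    # The first running total that is >= person marks that person's day.
    return bisect_left(upper, person)

print(upper)             # [7701, 15228, 24053, 32912, 41955, 51163]
print(number_range(1))   # (7702, 15228)
print(day_of(589))       # 0, i.e. January 1
```

With all the daily counts in place, the same day_of lookup is one way to check the November 11 claim for person 3,043,614.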

Saturday, June 04, 2005

Generate Samples.

A very famous birthday problem makes its rounds in almost every introductory probability class. The question posed is: “What is the fewest number of people that can be assembled in a room so there is a probability greater than 0.5 of at least one duplicate birthday, given that the year of birth is not considered?” Many neglect February 29th (leap year) as a birthday, because it complicates the initial analysis. If we neglect February 29th and assume that all birthdays are equally likely, then the probability for any birthday is 1/365.

We will also assume that the people assembled have birthdays that are unrelated, i.e. independent. Independence means that knowing one birthday does not give you any insight into anyone else’s birthday. Independence can easily be ruined if we are at a gathering of twins or a gathering of people who were born after an historical event. An historical event that can create a boom in births may be something as simple as an electrical blackout that hits a large metropolitan area like New York City. I guess any event that affects people’s normal nighttime activities might lead to a situation where procreative activities are changed.

Anyway, our assumptions are that each of the 365 birthdays is equally likely (our data from 1978 does not really support this assumption) and that the people gathered have unrelated birthdays. You could simulate this experiment by selecting n people at random and then seeing whether any two people in the gathering share a birthday. You would have to repeat this many times to get an idea of what is happening.
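Under these two assumptions the classroom answer can be worked out directly: the chance that n people all have different birthdays is (365/365) × (364/365) × ... × ((365 − n + 1)/365), and the chance of at least one duplicate is one minus that product. Here is a minimal sketch in Python of that calculation, together with the repeat-it-many-times simulation. The function names, the number of trials and the seed are illustrative choices, not part of the worksheets.

```python
import random

DAYS = 365  # February 29 is ignored, as in the text

def prob_duplicate(n, days=DAYS):
    """Exact probability that at least two of n people share a birthday."""
    p_all_different = 1.0
    for k in range(n):
        p_all_different *= (days - k) / days
    return 1.0 - p_all_different

def smallest_group(threshold=0.5, days=DAYS):
    """Smallest n whose duplicate probability exceeds the threshold."""
    n = 1
    while prob_duplicate(n, days) <= threshold:
        n += 1
    return n

def simulate(n, trials=10_000, days=DAYS, seed=None):
    """Estimate the same probability by repeating the experiment many times."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        birthdays = [rng.randrange(days) for _ in range(n)]
        if len(set(birthdays)) < n:   # fewer distinct days than people
            hits += 1
    return hits / trials

print(smallest_group())              # 23
print(round(prob_duplicate(23), 4))  # 0.5073
print(simulate(23, seed=1))          # an estimate near 0.507; varies with the seed
```

The exact answer and the simulation both lean on the equally-likely assumption, which is the very thing the 1978 data lets you question.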

Here's where I want to create a distributed Internet project, with people from around the world sampling this population of data. Now go ahead and start sampling, at random, by downloading Worksheet #3 from the sidebar and answering the questions. Submit them to your instructor to see if you've done it right! Your instructor will then submit them to me for tabulation and publication. This may take time.
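For anyone who wants to let a computer play the lottery machine, here is a minimal sketch in Python of one way the sampling could go. It assumes the numbering scheme from the "Analyzing the data" post: draw twenty-three distinct person numbers, look up each person's day, and check for a shared birthday. The function names are hypothetical, and placeholder_counts is made-up filler (365 equal counts), not the 1978 figures; swap in the real daily counts from the data file.

```python
import random
from bisect import bisect_left
from itertools import accumulate

def sample_birthdays(upper, n, rng):
    """Draw n distinct person numbers; return each person's 0-based day index."""
    people = rng.sample(range(1, upper[-1] + 1), n)   # sampling without replacement
    return [bisect_left(upper, p) for p in people]

def has_duplicate(days):
    """True if at least two sampled people share a day."""
    return len(set(days)) < len(days)

def estimate(counts, n=23, trials=10_000, seed=None):
    """Fraction of trials in which a group of n people contains a shared birthday."""
    rng = random.Random(seed)
    upper = list(accumulate(counts))   # last person number assigned to each day
    hits = 0
    for _ in range(trials):
        if has_duplicate(sample_birthdays(upper, n, rng)):
            hits += 1
    return hits / trials

# Placeholder data ONLY (an assumption for illustration): 365 equal counts.
placeholder_counts = [9000] * 365
print(estimate(placeholder_counts, seed=2))
```

Because people are drawn by number, days with more births are automatically more likely to turn up, so the unequal daily counts in the real data are reflected in the estimate without any extra work.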

Friday, June 03, 2005

For the motivated learners.

Okay, if I caught your interest you need to start the process of real work. Let's begin.

The following files will be a very mild introduction to stochastic modeling. The textbook and supplement are typeset in LaTeX and then output to PDF. As always, if you have problems with any of these files, you can email me a short description of your troubles and I will try to help. Code is in ASCII format, which will allow you to cut-and-paste, but you may need to 'tweak' the code in order for it to compile on your system.

This looks like a Princeton University Press book only because they were considering it for publication. I spent more than a year trimming it down, but alas, they declined to publish because the peer reviews varied too much for their editorial board. In any case, I used their book class.

I'm hopeful that others will be interested in creating a dynamic blog community that will be useful to the readers of "Deterministic Uncertainties." I welcome your thoughtful comments, especially if they relate to making the material more understandable.

If the interest is there, I will expand this or create a new blog. Just email me and we'll start!