**fully Bayesian model** to estimate how long it will take me to get a job!

## Model

I am estimating how many applications it will take before I get a job using a Bayesian model. With some simplification, there are **two major steps to the application process**: (1) getting a first-round interview, and (2) getting a job offer after the interview. There may actually be many interviews between the first-round interview and landing an offer, but I will likely not get enough data in those intervening steps to build a worthwhile model.

For this model, I am considering each application submitted to be a Bernoulli trial where a success is getting a first-round interview. I'm going to assume this is a fixed probability \(P(\text{First-round Interview})= p_1\). Then, if there is a success in this first trial, there is a second Bernoulli trial where a success is getting a job offer. Again I'm going to assume a fixed, but different parameter for this conditional probability of success, which I will call \(P(\text{Job Offer}|\text{First-round Interview}) = p_2\).

Note that I have data for both Bernoulli steps. For the first step, I have \(n_1=9\) trials and \(y_1=1\) success. For the second step, I have \(y_1=1\) trial (this is the number of first-round interviews I've had - we could also call this \(n_2\)) and \(y_2=0\) successes.

Honestly, I have no idea what my success probability is. So I chose **a uniform prior** for \(p_1\) on [0,1], which leads to a neat posterior. The posterior distribution of \(p_1|y_1, n_1\) is \(Beta(y_1+1,n_1-y_1+1)\) where \(n_1\) is the number of applications submitted. The same holds for \(p_2\) with a uniform prior; the posterior for \(p_2|y_1, y_2\) is \(Beta(y_2+1,y_1-y_2+1)\).
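As a minimal sketch of that conjugate update (the helper name `beta_posterior` is my own, not from any library), the posterior parameters follow directly from the counts. A uniform prior is the same thing as a Beta(1, 1) prior, so the update just adds successes and failures to 1 and 1:

```python
def beta_posterior(successes, trials, prior_a=1.0, prior_b=1.0):
    """Return (a, b) of the Beta posterior for a Bernoulli success probability.

    A Beta(prior_a, prior_b) prior combined with `successes` out of `trials`
    gives a Beta(prior_a + successes, prior_b + failures) posterior.
    The default Beta(1, 1) prior is the uniform prior on [0, 1].
    """
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    failures = trials - successes
    return prior_a + successes, prior_b + failures


# Data from the post: 9 applications, 1 first-round interview, 0 offers.
n1, y1, y2 = 9, 1, 0

p1_posterior = beta_posterior(y1, n1)  # first-round interview given application
p2_posterior = beta_posterior(y2, y1)  # job offer given first-round interview

print(p1_posterior)
print(p2_posterior)
```

With my data this gives Beta(2, 9) for \(p_1\) and Beta(1, 2) for \(p_2\).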

## Posterior Distributions

Let's visualize these densities.

Nothing crazy here. The median of the posterior for \(p_1\) (chance of getting a first-round interview) is **0.16**, with a 90% credible interval of **0.04 to 0.39**. So the uniform prior is nudging that upwards from the \(\hat{p} = \frac{y_1}{n_1} = \frac{1}{9} \approx 0.11\) we would get from the data alone.

The median of the posterior for \(p_2\) (chance of getting a job offer after first-round interview) is **0.29**, which is not really saying much since our 90% credible interval is **0.03 to 0.78**. We just don't have much data, which is doubly sad :( The prior in this case is giving me the *benefit of the doubt,* saying that I have a roughly 30% chance of getting a job offer even though I've received no job offers!
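If you want to check those summaries without a stats package, here is a rough sketch using only the Python standard library. It leans on a shortcut that works only because both posteriors have integer parameters: the Beta(a, b) CDF then equals a binomial tail sum. The function names are my own, and the quantiles come from plain bisection rather than a library routine:

```python
from math import comb


def beta_cdf(x, a, b):
    """CDF of Beta(a, b) at x, valid for positive integer a and b only."""
    n = a + b - 1
    # Identity: P(Beta(a, b) <= x) = P(Binomial(n, x) >= a) for integer a, b.
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(a, n + 1))


def beta_quantile(prob, a, b, tol=1e-10):
    """Invert the CDF by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if beta_cdf(mid, a, b) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def summarize(a, b):
    """Return (median, 5% quantile, 95% quantile) of Beta(a, b)."""
    return tuple(beta_quantile(p, a, b) for p in (0.5, 0.05, 0.95))


# Posterior parameters from the post's data under uniform priors.
p1_summary = summarize(2, 9)  # p_1 | y_1 = 1, n_1 = 9
p2_summary = summarize(1, 2)  # p_2 | y_2 = 0, y_1 = 1

for name, (median, low, high) in (("p1", p1_summary), ("p2", p2_summary)):
    print(f"{name}: median {median:.2f}, 90% interval {low:.2f} to {high:.2f}")
```

The printed values should land close to the medians and intervals quoted above. For non-integer parameters this shortcut does not apply, and a proper Beta quantile function would be the better choice.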

## So how many jobs do I need to apply for?

Let's return to the original question. We have \(P(\text{First-round Interview})= p_1\) and \(P(\text{Job Offer}|\text{First-round Interview}) = p_2\). Then \(P(\text{First-round Interview and Job Offer}) = p_1 p_2\). Let's call this joint probability \(q\). Now we can think of each application as a single Bernoulli trial with success probability \(q\). The expected number of trials up to and including the first success comes from the geometric distribution and is \(\frac{1}{q}\).
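For anyone who wants the missing step, here is a quick sketch of where \(\frac{1}{q}\) comes from. Let \(N\) be the number of applications up to and including the first offer (the symbol \(N\) is my own notation). Then \(P(N = k) = (1-q)^{k-1} q\), and

\[
E[N] = \sum_{k=1}^{\infty} k\, q\, (1-q)^{k-1} = q \cdot \frac{1}{\left(1-(1-q)\right)^2} = \frac{1}{q},
\]

using \(\sum_{k=1}^{\infty} k x^{k-1} = \frac{1}{(1-x)^2}\) for \(|x| < 1\). As a purely illustrative number, a joint probability of \(q = 0.04\) would mean \(E[N] = 25\) applications.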

Okay… so how does this help me? Well, I can simply simulate draws from the two posteriors for \(p_1\) and \(p_2\) and then combine the draws to estimate the posterior distribution of \(\frac{1}{q}\), which is the expected number of applications required to land a job! Here is the posterior for \(\frac{1}{q}|y_1,n_1,y_2\):
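Here is a minimal sketch of that simulation using Python's standard library. The function names, the seed, and the number of draws are my own choices for illustration. The Beta parameters are the posteriors implied by the data above, Beta(2, 9) for \(p_1\) and Beta(1, 2) for \(p_2\), and the two are drawn independently:

```python
import random


def simulate_applications_needed(n_draws=100_000, seed=42):
    """Return sorted posterior draws of 1/q, where q = p1 * p2."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    draws = []
    for _ in range(n_draws):
        p1 = rng.betavariate(2, 9)  # draw from the posterior of p_1
        p2 = rng.betavariate(1, 2)  # draw from the posterior of p_2
        draws.append(1.0 / (p1 * p2))
    return sorted(draws)


def empirical_quantile(sorted_draws, prob):
    """Pick the draw at the given quantile of an already sorted list."""
    index = min(int(prob * len(sorted_draws)), len(sorted_draws) - 1)
    return sorted_draws[index]


draws = simulate_applications_needed()
median = empirical_quantile(draws, 0.50)
low = empirical_quantile(draws, 0.05)
high = empirical_quantile(draws, 0.95)

print(f"median {median:.1f}, 90% interval {low:.1f} to {high:.1f}")
```

Because this is Monte Carlo, the exact numbers shift a little with the seed and the number of draws, and the upper quantile is the noisiest of the three since the distribution of \(\frac{1}{q}\) has a very long right tail.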

The median of the posterior for \(\frac{1}{q}\) is **25.11** with a 90% credible interval of **5.22 to 383.9**. Note that due to the memoryless property of the geometric distribution, we should interpret those as **additional applications**, i.e., my best estimate is **25 additional applications**. But take a look at that upper bound. Yikes! So the job search may take me another week to ... another year. Better get back to the job search!