What is censoring?
The notion of censoring is fundamental to survival analysis and is used when computing survival functions (more on that in the next part of the series). But what do I mean by censoring? Strictly speaking, censoring is the condition in which only part of an observation or measurement is known. In survival analysis, it means we know a subject was followed for some length of time, but the exact time to the event was not observed. Taking censoring into account lets us use these incomplete observations instead of treating them as missing data.
Examples include a president dying in office, or someone leaving a medical study before the study formally concludes. The latter shows why censoring is so important for the analysis of medical trials, but in both cases the underlying principle is the same: we made observations up to a given time, but we could not measure the event itself. If a president dies after one year in office, how can we possibly know whether they would have served two terms?
Left and right censoring in Survival Analysis
There are different types of censoring. The two most commonly discussed are left and right censoring (two others that come to mind are interval censoring and random censoring, but they are not discussed here).
- Left censoring is when the event has already occurred before data collection (or the study) starts, so we only know an upper bound on the time. For example, in a medical study someone dies before the drug trial begins (a case that is normally not considered).
- Right censoring is when only a lower bound on the time is known, for example if a subject leaves a study before the end, or the study ends before the event occurs.
One way to remember this: events that happened to the left on the timeline (in the past, before observation began) are left censored, and events that may happen to the right (in the future, after observation ends) are right censored.
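To make right censoring concrete, here is a minimal sketch of how such data is commonly encoded: each subject becomes a pair of (duration, event observed). The `Subject` class, the `to_observation` helper, the dates and the study end date are all hypothetical names and values made up for illustration. They are not taken from any particular library or dataset.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple

# Hypothetical study end date, chosen only for this illustration.
STUDY_END = date(2021, 12, 31)


@dataclass
class Subject:
    name: str
    start: date                     # when we began observing the subject
    event: Optional[date] = None    # when the event occurred, if it ever did
    dropout: Optional[date] = None  # when the subject left the study early, if at all


def to_observation(s: Subject, study_end: date = STUDY_END) -> Tuple[int, bool]:
    """Return (duration_in_days, event_observed) for one subject."""
    # We stop observing at the earlier of dropout and study end.
    last_seen = study_end if s.dropout is None else min(s.dropout, study_end)

    # The event counts as observed only if it happened while we were watching.
    if s.event is not None and s.event <= last_seen:
        return (s.event - s.start).days, True

    # Otherwise the subject is right censored: the duration is a lower bound.
    return (last_seen - s.start).days, False


subjects = [
    Subject("A", start=date(2021, 1, 1), event=date(2021, 3, 1)),    # event observed
    Subject("B", start=date(2021, 1, 1), dropout=date(2021, 6, 1)),  # left the study early
    Subject("C", start=date(2021, 7, 1)),                            # study ended first
]

observations = {s.name: to_observation(s) for s in subjects}

for name, (duration, observed) in observations.items():
    status = "event observed" if observed else "right censored"
    print(f"{name}: {duration} days ({status})")
```

For subjects B and C the recorded duration is only a lower bound on the true time to event, which is exactly what the `False` flag signals to whatever survival model consumes the data.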
In the case of turnover, we only consider right censoring: a person may leave at some point in the future, but we don’t know when (if at all). Hopefully, the diagram below will help demonstrate this.