University campuses represent an opportunity to advance the understanding of how the built environment influences health. We used de-identified billing codes from a private university clinic serving undergraduate students for academic years 2008 through 2012, linked to students’ residential history and demographic information. We used a two-stage hierarchical regression model to study differences in the reported prevalence of diagnostic groups by dorm and the association between building characteristics and disease incidence rates. We found significant differences in the prevalence of mental health (MH) diagnoses, upper respiratory infections (URI) and substance abuse between freshmen and upperclassmen. Additionally, we found systematic differences in the relative rates of URI and MH diagnoses across dorms. Among upperclassmen dorms, the only mechanically ventilated building had a lower rate of allergy cases. An increase in available dorm space of 100 ft2 per student was associated with a decrease of 10.8 URI cases per 100 students per academic year (p<0.01). Construction age was also associated with a lower incidence rate of MH diagnoses (1.1 fewer diagnoses per 100 students per academic year for every 25-year increment in building age, p=0.04). These results suggest the potential of electronic health records (EHR) to identify differential health issues faced by students depending on their housing characteristics and the stage of their academic career.
Undergraduate student health is a topic of significant public health concern. With an enrollment of 19.3 million, undergraduates represent a sizable proportion of the United States population. Due to their age distribution, undergraduate students are at a critical period of susceptibility to environmental and social stressors on college campuses that could set them on different life-course health trajectories. Mental health is particularly time-sensitive during college years: three-fourths of lifetime mental health conditions have their onset before age 24 years. Endocrine and reproductive systems develop through adolescence, making them susceptible to endocrine-disruptive chemical exposures. Nutrition, physical activity, and sleep are additional examples of health-related factors influenced by the campus environment.
Dorms are one of the most important places in the campus environment. They are among the spaces where students spend most of their time, and approximately 40 percent of full-time college students live on campus. Existing research on undergraduate dorms has focused on the peer effects of roommates on educational attainment, student satisfaction, and propensity to engage in risky behaviors, but little attention has been paid to the role of the physical residential environment in undergraduates' health. The influence of indoor environmental quality on health has been studied for the past four decades. Authoritative reviews have documented robust associations between environmental factors and human health. Improvements in indoor air quality through higher ventilation rates have been associated with reduced prevalence of sick building syndrome symptoms, respiratory infections, asthma symptoms and absenteeism. Signs of water damage, dampness, and mold have been linked to increased risk of allergic rhinitis and rhinoconjunctivitis symptoms, and increased lower respiratory symptoms (e.g., asthma, wheeze, cough, bronchitis). Visual and non-visual (i.e., via photoreceptors sensitive to light at short wavelengths) properties of light have been associated with lower absenteeism and with work performance and safety. Effects of auditory and non-auditory noise exposures include hearing loss, increased prevalence of cardiovascular disease, cognitive impairment, sleep disruption and effects on daytime alertness. Temperatures outside a narrow range of thermal comfort (70-75°F) have been associated with decreases in cognitive function and dexterity. These findings, however, stem mostly from laboratory experiments or field studies in office buildings or single-family homes. Other limitations include the use of self-reported health data from surveys, which makes findings susceptible to selection and recall bias, and small sample sizes, due to the intensive labor and costs of environmental sample collection.
In recent years, colleges and universities offering health services to their students and staff have incorporated electronic health records in their clinical practices. The use of electronic health records (EHR) might offer a new opportunity to understand the relationship between undergraduate student health and the dorm environment. A recent analysis of EHR data from 23 colleges has enabled the estimation of epidemiologic data for the undergraduate student population. Using clinic visit data from a higher education institution may have advantages because of 1) the largely homogeneous population with regard to age and geographic location, followed over up to four consecutive years, and 2) the large percentage of students who may live in campus buildings under administrative control. Because of the many other services offered on campus (e.g., housing), additional information collected for other purposes (e.g., energy consumption, building occupancy data) could be leveraged to study the associations between the campus environment and health.
The objectives of this study were to analyze a university’s EHR to: a) estimate the prevalence of different diseases by gender, grade year and academic year; and b) estimate the association between disease incidence and dorm characteristics plausibly related to it (e.g., building area, occupancy density, building age, energy consumption and distance to the central health clinic site). For example, occupancy density is tied to the supplied ventilation rate per person. Similarly, energy use intensity (i.e., energy consumption per unit area) is a surrogate for building envelope performance, an influential aspect of infiltration, moisture and temperature control, and outdoor pollutant penetration. To our knowledge, this is the first study to examine the impact of college dorms on students’ health using available information on building performance in conjunction with clinically diagnosed health outcomes from the university’s EHR.
The study population consisted of undergraduate students enrolled at a private college in the greater Boston area during five academic years (2008-2012). Ninety-seven percent of the students live in campus dorms during the four years of the college program. Freshmen are assigned to one of 16 dorms. After their freshman year, students are relocated, following a lottery process, to one of 12 upperclass dorms (some with multiple buildings), where they live for the remaining three years. These upperclass dorms serve as the basis of much of the social and extramural activity and are communities with a unique set of traditions and, in the past, a marked clustering of stereotypes. To promote diversity in campus life, several methods of randomization for dorm assignment have been implemented. A mathematical algorithm, controlling for gender balance and a preference for a specific dorm, assigns individuals or groups of two to eight students to one of the 12 upperclass dorms. Individuals within the groups are not necessarily assigned to the same dorm, but they are guaranteed to be housed in one of four clusters of geographically proximate dorms. For privacy, freshmen dorms are coded numerically from 1 to 16, and upperclassmen dorms use letters A to M. This randomization of living arrangements could thus present a de facto experimental condition for a clinical trial of the health effects of the built environment.
We received students’ residential information as well as age, gender and race for most students living on campus during the study period: data for 13,491 students over five years, yielding 32,323 student-years; on average, for each class year there were 1,436 freshmen and 3,869 upperclassmen. The dataset contained four years of records for two full graduation classes (2011 and 2012), representing 26% of the total number of students. Subjects from graduation classes other than 2011 and 2012 were missing at least one of the four grade years (freshman, sophomore, junior or senior).
We analyzed EHR data from a private college in the northeastern United States for five years (178,775 records) and building information for 28 dorms. Billing records from the university’s health clinic contained one or more coded reasons for every diagnosis using the standard International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM), which is used to classify and assign codes to health conditions for administrative purposes. The ICD codes for every diagnosis on record were linked through coded student IDs to the student's dorm. The linked data allowed each student's dorm to be identified by year but did not provide further personal information. Demographics, such as age, gender and ethnicity, of the students living on campus for the academic years 2008 to 2012 were collected separately by the university's housing office. All students are required to pay the health fee to use the health services. Only in very limited cases do students waive this service to receive health care in other settings.
The university’s division of medical records prepared a dataset with billing codes including all diagnoses of undergraduate students living on campus for five academic years (2008-2012), from August 2008 to July 2013. Of the 178,775 records available in the dataset, 125,581 corresponded to diagnoses and 53,194 corresponded to general medical examinations, screenings, counseling, surveillance, orders for vaccination, and details of personal and family clinical history. Due to uncertainty in the baseline population during non-active academic periods (i.e., a portion of students stayed in the dorms during non-academic periods for winter sessions or summer research internships), diagnoses during January, June, July, and August were not included.
Each record consisted of a de-identified ID number for the student requesting services from the health clinic, the date and time of the diagnosis, and the ICD-9 code corresponding to the condition observed during the diagnosis. The de-identified residential information was linked to clinical files by authorized university personnel. Only this de-identified version of the files was provided to the researchers. The study protocols to access de-identified clinical records with dorm assignments were reviewed and approved by the university Institutional Review Board.
ICD-9 codes are classified in a tabular list of 19 chapters representing different categories of related health conditions. An inspection of the ICD-9 codes in the diagnoses file led to a secondary classification based on a plausible aetiology related to buildings' characteristics. For example, diseases of the respiratory system (i.e., ICD-9-CM Chapter 8) were sub-classified into lower respiratory infections, upper respiratory infections, asthma and allergies. We refer to these sub-categories as diagnostic groups. The list of diagnostic groups created for the purposes of this study, with their corresponding ICD-9 codes, is presented in Supplemental Information (S1 Table). Each diagnostic group comprised unique conditions; therefore, each registered diagnosis was recoded into a single diagnostic group. We focused our analysis on the following diagnostic groups: allergies, asthma, mental health (MH) symptoms (i.e., anxiety, depression, acute reaction to stress, etc.), and upper respiratory infections (URI).
Information on the dorm characteristics, such as gross building area (i.e., all indoor floor area inside the building envelope), total suite area (i.e., bedroom area), year of construction, monthly heating degree days (HDD; i.e., sum of degrees that the 24-hr temperature mean is below 65° Fahrenheit), and monthly heating energy consumption in kBTU (i.e., steam, natural gas and fuel oil use), were extracted from the university’s energy monitoring system. All dorms except for one upperclass building (Dorm E) have naturally ventilated rooms, with only mechanically-ventilated bathrooms. Total occupancy per year was estimated from the dataset provided by the housing office. Distances from each dorm to the health clinic site were calculated using street trajectories in Google Earth.
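The heating degree day definition above, and the energy use intensity normalized by HDD that is used later as a covariate, can be illustrated with a minimal sketch. The function names, the use of gross building area as the denominator, and the example readings are assumptions made for illustration; they are not taken from the university's energy monitoring system.

```python
def heating_degree_days(daily_mean_temps_f, base_f=65.0):
    """Sum of degrees by which each 24-hr mean temperature falls below the base."""
    return sum(max(0.0, base_f - t) for t in daily_mean_temps_f)


def energy_use_intensity_per_hdd(heating_kbtu, area_ft2, hdd):
    """Heating energy per unit floor area, normalized by heating degree days.

    Assumption: the unit kBTU/ft2*HDD is read as kBTU per ft2 per HDD.
    """
    if area_ft2 <= 0 or hdd <= 0:
        raise ValueError("area and HDD must be positive")
    return heating_kbtu / (area_ft2 * hdd)


# Hypothetical data: three daily means (deg F) and made-up meter readings.
example_hdd = heating_degree_days([60.0, 70.0, 50.0])  # 5 + 0 + 15 degree days
example_eui = energy_use_intensity_per_hdd(40000.0, 100000.0, example_hdd)
```

Days warmer than the 65°F base contribute zero rather than a negative value, which is why the 70°F day adds nothing to the total.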
We estimated prevalence by dividing the total number of students in each diagnostic group by the total number of students in the corresponding grade year (i.e., freshman, sophomore, junior and senior). Estimated prevalence was stratified by age, gender, class year, academic year and dorm. We also estimated the number of clinical diagnoses per student to understand the level of health service utilization per diagnostic group.
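The prevalence calculation described above can be sketched as follows. The record layout (student ID, grade year, diagnostic group) and all names are hypothetical; the point is only that each student is counted once per diagnostic group, regardless of the number of visits.

```python
from collections import defaultdict


def prevalence_by_grade(diagnoses, enrollment):
    """Share of enrolled students with at least one diagnosis, per (group, grade).

    diagnoses: iterable of (student_id, grade_year, diagnostic_group) records.
    enrollment: dict mapping grade_year -> number of enrolled students.
    """
    students = defaultdict(set)
    for student_id, grade_year, group in diagnoses:
        # A set counts each student once, however many visits they had.
        students[(group, grade_year)].add(student_id)
    return {key: len(ids) / enrollment[key[1]] for key, ids in students.items()}


# Hypothetical records, not study data.
records = [
    ("s1", "freshman", "URI"),
    ("s1", "freshman", "URI"),  # repeat visit by the same student
    ("s2", "freshman", "URI"),
    ("s3", "senior", "MH"),
]
example_prevalence = prevalence_by_grade(records, {"freshman": 4, "senior": 2})
```

Stratifying by gender, academic year or dorm would follow the same pattern with a longer key.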
We estimated the diagnostic rates for each diagnostic group to test differences in the influence of dorms on students' health across dorms and class years. We calculated the total number of students from each dorm visiting the health services for each diagnostic group each month within the academic year (September-December; February-May). January, June, July and August were excluded from the analysis because the baseline population of the dorms during those months is unknown. The monthly counts were analyzed using a two-stage, hierarchical regression model, separately for each diagnostic group.
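A minimal sketch of how the monthly counts might be assembled for one diagnostic group, assuming each visit is available as a (student ID, dorm, date) tuple. The layout and names are illustrative assumptions; only the excluded months come from the text.

```python
from collections import defaultdict
from datetime import date

EXCLUDED_MONTHS = {1, 6, 7, 8}  # January, June, July, August


def monthly_student_counts(visits):
    """Count distinct students with at least one diagnosis per (dorm, year, month).

    visits: iterable of (student_id, dorm, visit_date) for one diagnostic group.
    """
    seen = defaultdict(set)
    for student_id, dorm, visit_date in visits:
        if visit_date.month in EXCLUDED_MONTHS:
            continue  # baseline dorm population unknown in these months
        seen[(dorm, visit_date.year, visit_date.month)].add(student_id)
    return {key: len(ids) for key, ids in seen.items()}


# Hypothetical visits, not study data.
example_visits = [
    ("s1", "A", date(2010, 9, 3)),
    ("s1", "A", date(2010, 9, 20)),  # same student, same month: counted once
    ("s2", "A", date(2010, 9, 5)),
    ("s3", "B", date(2011, 1, 15)),  # January: excluded
    ("s3", "B", date(2011, 2, 1)),
]
example_counts = monthly_student_counts(example_visits)
```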
In the first stage we fitted a log-linear model to estimate log relative rates of at least one diagnosis in that diagnostic group, for each dorm (denoted by β0k):
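The displayed equation is not present in this copy of the text. A minimal form consistent with the definitions that follow is sketched here, assuming Poisson-distributed counts and no terms other than the dorm-specific coefficient and the occupancy offset; the published model may include additional terms.

```latex
Y_{tk} \sim \mathrm{Poisson}(\mu_{tk}), \qquad
\log \mu_{tk} = \log N_{tk} + \beta_{0k} \tag{1}
```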
In Equation 1, Ytk denotes the number of students who received at least one diagnosis in month t in dorm k. An offset term, log Ntk, is included to account for the different occupancy levels in the dorms, where Ntk denotes the number of students living in dorm k in month t. The regression coefficient, β0k, is the estimated log relative rate (log(RR)) for each diagnostic group among students in dorm k. We exponentiated the regression coefficients to obtain the monthly incidence rate (IR) of diagnosis; in the models, the reference dorm (Dorm A) was automatically assigned based on alphabetical order. The IR estimates are used as the primary outcome of the second-stage model. In the second stage, the following ordinary linear regression model is used:
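The displayed equation is not present in this copy of the text. A form consistent with the description that follows is sketched here; the intercept \alpha_0, the grade-year indicator G_k with coefficient \gamma, and the error term \varepsilon_k are notation assumed for illustration.

```latex
\widehat{IR}_k = \alpha_0 + \sum_{l} \alpha_l \left( Z_{lk} - \bar{Z}_l \right)
  + \gamma\, G_k + \varepsilon_k \tag{2}
```

Here Z_{lk} is the value of building covariate l for dorm k and \bar{Z}_l is its mean over all dorms, so \alpha_0 is the expected IR for a freshman dorm (G_k = 0) with average building characteristics.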
The summation in the second term of Equation 2 corresponds to a set of building characteristics for each dorm, each centered on the mean value of the respective covariate Z across all dorms. These covariates include age of construction (years), occupancy density (square feet per student), monthly heating energy use intensity normalized by HDD (kBTU/ft2*HDD), and distance to the health clinic (ft). The resulting coefficient αl indicates the effect of covariate l on the estimated IR for each diagnostic group. The model includes a categorical variable for grade year (freshmen=0, upperclassmen=1) to account for the unexplained differences between the two student populations. To understand the relevance of the effects of building covariates on the utilization of health services, we estimated the changes in incidence rates (IR) as predicted cases per 100 students per academic year for the significant associations found in the second-stage model. Statistical analyses were performed using the open-source statistical package R version 3.5.0 (R Project for Statistical Computing, Vienna, Austria).
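The two-stage procedure can be illustrated with a small, self-contained sketch. It is written in Python rather than R and is not the authors' code. It assumes a first stage with only a dorm term and an occupancy offset, for which the Poisson maximum-likelihood rate has a closed form, and a second stage with a single mean-centered covariate plus the grade-year indicator. All names and data are hypothetical.

```python
import math


def first_stage_rates(counts, occupancy):
    """Closed-form Poisson estimate of each dorm's monthly rate.

    With only a dorm term and a log-occupancy offset, the maximum-likelihood
    rate for dorm k is sum_t Y_tk / sum_t N_tk.
    """
    return {k: sum(counts[k]) / sum(occupancy[k]) for k in counts}


def log_relative_rates(rates, reference):
    """Log of each dorm's rate relative to the reference dorm."""
    return {k: math.log(r / rates[reference]) for k, r in rates.items()}


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    p = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(p):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * z for x, z in zip(a[r], a[col])]
    return [a[i][p] / a[i][i] for i in range(p)]


def second_stage(rates, covariate, upperclass):
    """Regress dorm rates on a mean-centered covariate and a grade indicator."""
    dorms = sorted(rates)
    mean_z = sum(covariate[k] for k in dorms) / len(dorms)
    rows = [[1.0, covariate[k] - mean_z, float(upperclass[k])] for k in dorms]
    return ols(rows, [rates[k] for k in dorms])


# Hypothetical data: two months of counts and occupancy for four dorms.
counts = {"A": [60, 70], "B": [40, 50], "C": [60, 70], "D": [40, 50]}
occupancy = {k: [1000, 1000] for k in counts}
space_ft2 = {"A": 80.0, "B": 120.0, "C": 100.0, "D": 140.0}
upperclass = {"A": 0, "B": 0, "C": 1, "D": 1}

rates = first_stage_rates(counts, occupancy)
log_rr = log_relative_rates(rates, "A")
intercept, slope_space, effect_upperclass = second_stage(rates, space_ft2, upperclass)
```

In this made-up example the slope is negative, meaning more space per student goes with a lower monthly rate. Multiplying a monthly rate by 100 and by the number of academic months would express it per 100 students per academic year, assuming the rate is constant across months.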
Female students used clinical services more than male students (58.9% female; 41.1% male). On average, students had 5.5 diagnoses per academic year (median=3). No significant differences in the mean number of diagnoses per student were observed between academic years. Students who required five years or more to graduate had on average a higher number of diagnoses than their peers who finished school within four years (mean=9.3; median=6), although this group represents only 2.8% of the student population. The most common diagnoses were due to URI; 58.8% of the study population registered at least one diagnosis for this health issue. The percentage of students treated at least once was lower for other diagnostic groups: skin symptoms (28.5%), injuries (23.6%), MH issues (23.1%), gastrointestinal symptoms (19.3%), eye-related symptoms (13.1%), asthma symptoms (4.75%), and eating disorders (4.1%).
The prevalence by grade year for the nine most prevalent diagnostic groups is shown in Figure 1. Reporting of allergies, eating disorders, injuries and gastrointestinal symptoms varied little across academic years and between grade years. In contrast, diagnoses of URIs, substance abuse and MH symptoms depended more on the grade year and temporal trends. Figure 1 also shows a difference in the temporal trend of URI prevalence by grade year. While upperclassmen followed the same annual trend as the laboratory-confirmed influenza hospitalizations reported by the U.S. Centers for Disease Control and Prevention, URI prevalence among freshmen exhibited a different trend, registering an increase in academic year 2011, contrary to the decrease in prevalence among upperclassmen during the same period. Regarding MH, we found an increase in prevalence with grade year, with the transition from freshman to sophomore year registering the largest increase.
Results from the first stage hierarchical model (Figure 2) showed significant differences across dorms in RR of allergies, asthma (not shown), MH and URI diagnoses, suggesting the effect of dorm-specific and grade year factors. Freshmen dorms had significantly higher RR of URI diagnoses in 14 out of 16 buildings compared to upperclass dorms. Asthma and allergy RR were significantly higher in 5 out of 16 freshmen dorms; only freshmen Dorm 12 had significantly lower RR of allergy diagnoses among the freshmen dorms. A contrasting result was observed for the MH diagnostic group where 9 out of 16 freshmen dorms had significantly lower RR than the reference dorm and 10 out of 11 upperclass dorms had higher RR, indicating more diagnoses for MH issues. A subsequent analysis of variance of the RR for each diagnostic group shows variation between dorms was larger than within dorms, supporting the idea that there is a significant relationship between dorms and RR of disease.
Results for the second-stage model are shown in Table 1. Higher occupancy density was significantly associated with an increase in IR for URI. An increase in available space of 100 ft2 per student was associated with 12.6 (p<0.01) fewer cases per 100 students per academic year. On average, freshmen dorms had an occupancy density (97.5 ft2 per student, range: 77.2-178.6 ft2 per student) 40% higher than upperclass dorms (136.7 ft2 per student, range: 106-168.1 ft2 per student). Additional information on building age, size and distance to the central health clinic site is shown in S2 Table. Older construction age was associated with lower IR of MH diagnoses. The predicted change in IR associated with an increase of 25 years in construction age is 1.1 (p=0.04) fewer MH cases per 100 students per academic year. For allergies and asthma, none of the building covariates were significantly associated with the estimated IR. Since the upperclass dorms were on average newer and less crowded than their freshmen counterparts, we included in the model a categorical covariate indicating the class status of the dorm (freshman vs. upperclass). Upperclass dorms had significantly lower IR for URI (IR=10.8 fewer diagnoses per 100 students per academic year, p<0.01) and higher IR for MH diagnoses (IR=8.6 new diagnoses per 100 students per academic year, p=0.02).
In our analysis of clinical records from 13,491 students, we observed important associations with regard to class year and dormitory on asthma, allergy, mental health, and upper respiratory infection. Notably, higher incidence of mental health reporting was observed for upperclass students relative to freshmen, and dormitory occupant density was associated with higher incidence of upper respiratory infection. Overall, this study highlights the potential for using large-scale medical health records to ascertain trends in student health outcomes that may be linked with their housing and other social factors.
Our study was conducted at one U.S. university, which may limit generalizability. Yet our results are in general agreement with trends observed in larger studies of comparable populations. Turner et al. (2015) reported the prevalence of the most common diagnostic groups at 23 U.S. universities, closely matching our findings. There is a 0.6% difference between the values they reported and the mean prevalence per diagnostic group shown in Figure 1. Since both studies analyzed billing EHR, this concordance across diagnostic groups mitigates the usual concerns associated with misclassification of diagnoses in administrative health data.
Academic years 2008 and 2009 had the highest prevalence of MH cases, especially among senior students closer to graduation. Others have observed a deterioration in mental health following the 2008 economic crisis, with stronger effects among populations with higher employment vulnerability. In the subsequent academic years, from 2010 to 2012, there is an increasing trend in the proportion of students receiving attention for MH outcomes, as reported by others for the 2008-2017 period. While this trend could be explained by the increased number of programs offering attention to students on campus or by reduced stigma towards MH problems, there is also the concern that the severity of the conditions contributes to the increased levels of health services utilization. The prevalence of substance abuse diagnoses, another important issue in the student population, was about twice as high among freshmen as among upperclassmen, as documented by others [19, 20].
We found significant differences in the prevalence of certain symptom groups by grade year: the proportion of students diagnosed with URI was consistently higher among freshmen. Turner et al. (2015) also report a negative association between student age and prevalence of respiratory diagnoses. Adaptation to previously unexperienced weather conditions in the first undergraduate year could be one of the drivers of this difference. Another plausible explanation would be a difference in influenza immunization rates between freshmen and upperclassmen. However, the proportion of students who received the flu vaccine was higher among freshmen than among upperclassmen (40% vs. 24%).
Results from the first-stage analysis show significant differences in RR between dorms for allergies, MH, and URI. Diagnoses from these categories represented 42.6% of the total diagnoses in the dataset. Our results show significant differences in diagnostic groups by dorm, even within the same grade year. Among upperclassmen, four buildings had significantly lower RR of URI diagnoses, and all except two freshmen dorms had higher RR with respect to Dorm A. In our attempt to understand which physical characteristics of the dorms explain some of the variance in reported respiratory and MH symptom diagnoses, we found occupancy density and age of the buildings to be relevant factors. The mean occupant density in these dorms was 118 ft2 per student (range: 78.2-184.1 ft2 per student). As a reference, results from the American Housing Survey performed by the United States Census Bureau categorize overcrowding as an occupancy density lower than 166 ft2 per person. Higher occupancy density values resulted in increased relative rates of URI. A plausible explanation might be an increase in the air's rebreathed fraction (i.e., the amount of inhaled air previously exhaled by another person in the same indoor space). Its value is a function of a building's ventilation and the number of occupants per unit area. Our rationale was to include the heating energy intensity by HDD as a surrogate for the efficacy of the building envelope at reducing uncontrolled ventilation, but this variable was not significantly associated with any IR.
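The dependence of the rebreathed fraction on ventilation and occupancy mentioned above can be written compactly under the common simplifying assumption of a well-mixed space at steady state. The symbols are introduced here for illustration and do not appear in the original text: f is the rebreathed fraction, n the number of occupants, q the average breathing rate per person, and Q the outdoor air supply rate to the space, with q and Q in the same units.

```latex
f = \frac{n\, q}{Q}
```

For a fixed ventilation rate Q, doubling the number of occupants in a room doubles f, which is the mechanism by which higher occupancy density could raise exposure to air exhaled by others.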
Other possible explanations of the role of occupancy density in higher URI incidence rates include the increased probability of person-to-person contact, exposure to bacteria-contaminated objects (fomites) [23, 24] and airborne transmission via infectious aerosols. In Chinese dorms, a monotonic relationship was found between the self-reported incidence and duration of common cold infections and the number of students per dorm room, at much higher occupancy densities (26.9 to 71.8 ft2/student) than the ones in our study. Spengler et al. found an increased odds ratio of cough and phlegm in Russian residences with higher occupancy densities. Although the occupancy density for that study is unknown, the average residential space available in the Russian Federation at the time of the study was only slightly higher (212 ft2/person) than in our analysis. Our study contributes additional knowledge on the transmission of respiratory infections and occupancy levels by using validated medical data from a university's health services in the U.S. This finding can have further implications for places that are at higher risk of overcrowding or are underventilated (e.g., schools).
With regard to MH, all upperclass dorms except one (Dorm I) had significantly higher RR with respect to Dorm A. We consider this finding worth further investigation, since the rates of reporting MH symptoms in Dorm A were consistently lower during the five years of the study, even though the population mix of the dorms changes every year. Our study had neither the spatial resolution nor the covariates needed to test for direct associations with environmental exposures such as light or noise, or for associations mediated through building conditions that could diminish quality of sleep, increase stress, or increase social isolation. Baum and Valins found that students living in long corridor-type dorms had less social interaction and a higher perception of crowdedness compared to those living in suite-type dorms with similar occupancy densities. Despite the lack of suite configuration data for each student, it is known that in the study dorms occupancy density typically decreases with grade year (i.e., senior students have a higher probability of being assigned to single suites). Another possibility is that, despite the efforts to randomize the student population at the upperclassmen dorm assignment, there is a residual selection bias during student group formation that reinforces enduring dorm stereotypes.
In future studies, we suggest a more detailed analysis of design features previously associated with students' health. Social and physical determinants of MH, compounded by anxieties about impending life changes near graduation, may offer an explanation for the increased prevalence of MH diagnoses with academic life progression. A better understanding of these physical features of the built environment along with community dynamics that avert students from seclusion will allow targeted interventions to reduce the burden of MH outcomes.
Results of this study highlight the differential health issues faced by students at different stages of their academic career, and the impact of the dorms on health. Our results indicate that an evaluation of EHR by symptom group, grade year and dorm may provide valuable insights into how student health varies over time. The methodology described in this paper, however, is replicable only in cases where dorm assignment follows a randomization process. Otherwise, unknown biases might hinder the ability to interpret the differential reporting of disease rates between buildings. We foresee the use of this approach to proactively identify “hotspots” of certain health outcomes and conduct more targeted on-site indoor environmental quality assessments.
We thank the division of medical records at the university health services that compiled the clinical datasets for our study. This study was sponsored by the National Science Foundation EFRI-1038264 award. Partial sponsorship was also received from the Akira Yamaguchi Endowment and the Mexican National Council of Science and Technology (CONACYT). Dr. Dominici was supported by grant R01 ES024332.
[Table 1. Second-stage model results. Surviving row labels: Distance to health clinic; Energy use intensity by HDD.]