Weather & the Stock Market

Read Time: 5-10 mins
Equities
Author: Max Sands
Published: December 22, 2022

Intro

In recent years, behavioral finance, the field of study that combines psychology and economics to better understand financial decision-making, has grown in popularity. Many studies suggest that psychological and emotional factors can make human decision-making irrational. With this in mind, we will investigate whether weather conditions in New York have any noticeable impact on daily stock market returns.

Let’s load the data…

Code
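library(tidyverse)   # dplyr, ggplot2, stringr, purrr (set_names), readr (read_rds)
library(here)        # project-relative file paths
library(tidyquant)   # tq_get()
library(lubridate)   # wday()
library(gt)          # static tables
library(DT)          # interactive tables via datatable()

weather_data <- read_rds(here("raw_data", "Weather and Markets", "new_york_weather_data_clean.rds"))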

weather_data %>%
    # head() %>%
    set_names(names(.) %>% str_replace_all(., "_", " ") %>% str_to_title()) %>%
    mutate(across(.cols = where(~is.numeric(.x)), .fns = round)) %>%
    datatable()

We will use the S&P 500 index as a proxy for the stock market:

Code
stock_data <- tq_get("^GSPC", from = "1978-12-29")

stock_data <- stock_data %>% 
    mutate(pct_ret = (adjusted / lag(adjusted)) - 1) %>% 
    slice(-1) %>% 
    select(date, pct_ret)

stock_data %>% 
    head() %>% 
    set_names(c("Date", "Return (%)")) %>% 
    gt() %>% 
    gt::fmt_percent(columns = 2)
Date        Return (%)
1979-01-02  0.65%
1979-01-03  1.11%
1979-01-04  0.80%
1979-01-05  0.56%
1979-01-08  −0.33%
1979-01-09  0.54%
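
To make the return calculation concrete, here is a minimal sketch on made-up prices. The daily return is the adjusted close divided by the previous day's adjusted close, minus one. The prices tibble and its values are assumptions for illustration only, not real S&P 500 data.

```r
library(dplyr)

# Toy adjusted closing prices (hypothetical values, for illustration only)
prices <- tibble(
    date     = as.Date("2022-01-03") + 0:3,
    adjusted = c(100, 102, 99.96, 99.96)
)

returns <- prices %>%
    # Simple return: today's price divided by yesterday's price, minus one
    mutate(pct_ret = adjusted / lag(adjusted) - 1) %>%
    # The first row has no previous price, so its return is NA and is dropped
    slice(-1) %>%
    select(date, pct_ret)

returns
```

Dropping the first row mirrors the slice(-1) step above: the first observation only serves as the base price for the second day's return.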

The weather data contains the following variables:

Variable             Definition
Tod                  The time of day (Morning, Midday, Afternoon).
Temp                 The temperature in degrees Fahrenheit.
Visibility           The maximum distance at which an object can clearly be discerned.
Dew Point            The temperature to which the air must be cooled to reach a relative humidity of 100%.
Feels Like           A measure of how hot or cold it feels outside when accounting for other variables such as wind chill and humidity.
Temp Min             The minimum temperature during the associated time stamp.
Temp Max             The maximum temperature during the associated time stamp.
Pressure             The weight of the air. High air pressure (heavy air) is associated with calm weather conditions, whereas low air pressure (light air) is associated with active weather conditions.
Humidity             The amount of water vapor in the air.
Wind Speed           The speed of the wind in miles per hour.
Wind Deg             The direction of the wind in degrees.
Clouds All           The cloudiness of the sky in percent.
Weather Id           The ID code associated with the weather.
Weather Main         The primary weather category.
Weather Description  The secondary weather category.
Weather Icon         The ID code of the icon displayed on weather apps.

Exploring the Data

Let’s start by taking a look at the daily S&P 500 returns below:

S&P 500 Returns

Code
avg_ret <- stock_data %>% 
    summarize(avg_ret = mean(pct_ret)) %>% 
    pull(avg_ret)

stock_data %>% 
    ggplot(aes(date, pct_ret)) +
    geom_point(alpha = .5) +
    geom_hline(yintercept = avg_ret, color = "red") +
    theme_bw() +
    labs(
        y = "", x = "",
        title = "SP500 Daily Return (%)",
        subtitle = str_glue("Average: {scales::percent(avg_ret, accuracy = .0001)}")
    ) +
    scale_y_continuous(labels = scales::percent_format()) +
    theme(text = element_text(size=15))

As the plot shows, there are several days with extreme returns: on October 19, 1987 (‘Black Monday’) the S&P 500 fell by roughly 20%, and in March 2020 the market dropped sharply as news of the Covid-19 pandemic spread. While these events are important from a historical perspective, it is unlikely that the weather contributed significantly to them. We will therefore treat any daily return more than two standard deviations from the mean as an outlier and remove it from our data. The plot below flags the outliers, which we then remove:

Code
# Mean and standard deviation of daily returns, used for the outlier threshold
stock_summary <- stock_data %>% 
    summarize(
        mean = mean(pct_ret, na.rm = TRUE),
        st_dev = sd(pct_ret, na.rm = TRUE)
    )

stock_data %>% 
    # Flag returns more than two standard deviations from the mean
    mutate(is_outlier = case_when(
        pct_ret > stock_summary$mean + 2*stock_summary$st_dev ~ "Outlier",
        pct_ret < stock_summary$mean - 2*stock_summary$st_dev ~ "Outlier",
        TRUE ~ "Not Outlier"
    )) %>% 
    ggplot(aes(date, pct_ret, color = is_outlier)) +
    geom_point() +
    theme_bw() +
    labs(
        y = "", x = "",
        title = "SP500 Daily Return (%)",
        color = ""
    ) +
    scale_color_hue(direction = -1) +
    scale_y_continuous(labels = scales::percent_format()) +
    theme(text = element_text(size=15), legend.position = "top")

Code
stock_data <- stock_data %>% 
    # Apply the same two-standard-deviation rule, then drop the flagged days
    mutate(is_outlier = case_when(
        pct_ret > stock_summary$mean + 2*stock_summary$st_dev ~ "Outlier",
        pct_ret < stock_summary$mean - 2*stock_summary$st_dev ~ "Outlier",
        TRUE ~ "Not Outlier"
    )) %>% 
    filter(is_outlier == "Not Outlier") %>% 
    select(-is_outlier)

Going forward, we will use only the blue (non-outlier) data points…
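
Since the two-standard-deviation rule is applied twice above, it could be wrapped in a small helper. The following is only a sketch: flag_outliers is a hypothetical function name, and the toy returns are made up to show the rule in action.

```r
library(dplyr)

# Hypothetical helper: TRUE when a value lies more than `k` standard
# deviations away from the mean of `x`
flag_outliers <- function(x, k = 2) {
    abs(x - mean(x, na.rm = TRUE)) > k * sd(x, na.rm = TRUE)
}

# Toy returns (made-up values): nine quiet days and one crash
toy <- tibble(pct_ret = c(rep(c(0.001, -0.001), 4), 0.002, -0.20))

# Keep only the days that are not flagged
toy_clean <- toy %>%
    filter(!flag_outliers(pct_ret))

nrow(toy_clean)
```

Note that the mean and standard deviation are computed on the full sample, including the extreme days themselves, which is also how the threshold is computed above.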

The Impact of the Weather

Let’s examine the returns on days with different morning weather conditions for each month:

In each plot, the black lines represent the group averages, whereas the red line represents the average across all groups.

As we can see from the above plots, the distributions are relatively similar across the different morning weather conditions. Therefore, morning weather seems to have little effect on the distribution of daily stock market returns.
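
The group averages and the overall average drawn in these plots could be computed along the following lines. This is a sketch under assumptions: the toy tibbles stand in for the real weather and return data, all values are made up, and only the column names (date, tod, weather_main, pct_ret) follow the code used elsewhere in this post.

```r
library(dplyr)

# Toy stand-ins for the two data sets (all values are made up)
weather_toy <- tibble(
    date         = rep(as.Date("2022-01-03") + 0:3, each = 2),
    tod          = rep(c("Morning", "Afternoon"), times = 4),
    weather_main = c("Clear", "Clear", "Rain", "Clouds",
                     "Clear", "Rain", "Rain", "Rain")
)
stock_toy <- tibble(
    date    = as.Date("2022-01-03") + 0:3,
    pct_ret = c(0.01, -0.02, 0.03, 0.00)
)

# One morning weather reading per day, with that day's return attached
morning <- weather_toy %>%
    filter(tod == "Morning") %>%
    inner_join(stock_toy, by = "date")

# Group averages (the black lines) ...
morning_summary <- morning %>%
    group_by(weather_main) %>%
    summarize(avg_ret = mean(pct_ret), n_days = n(), .groups = "drop")

# ... and the average across all groups (the red line)
overall_avg <- mean(morning$pct_ret)

morning_summary
```

Reporting the number of days per group alongside the average is a useful check, since rare weather conditions can produce averages based on very few observations.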

Let’s investigate whether temperature differences have any impact on market returns…

The Impact of Temperature Differences

Let’s hypothesize that returns are worse on days that are colder than usual than on days that are warmer than usual. To test this hypothesis, let’s see whether the deviation of Feels Like from that month’s average Feels Like is related to stock market returns:

Again, there is no evidence that deviations in temperature help explain daily stock returns.
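
The deviation from the monthly average described above could be computed roughly as follows. This is a sketch, not the original code: the toy readings are made up, and it assumes the average is taken per calendar month. The names feels_like_summary, avg_feels_like and feels_like_difference follow the regression code further down.

```r
library(dplyr)

# Toy morning readings (made-up values)
weather_toy <- tibble(
    month      = c("Jan", "Jan", "Jul", "Jul"),
    feels_like = c(30, 40, 80, 90)
)

# Assumed construction of the summary table: one average per month
feels_like_summary <- weather_toy %>%
    group_by(month) %>%
    summarize(avg_feels_like = mean(feels_like), .groups = "drop")

weather_dev <- weather_toy %>%
    left_join(feels_like_summary, by = "month") %>%
    # Negative values are colder than that month's average, positive warmer
    mutate(feels_like_difference = feels_like - avg_feels_like)

weather_dev
```

Measuring temperature relative to the monthly average removes the seasonal cycle, so a cold July day and a cold January day are treated alike.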

Simple Modelling - Linear Regression

From our brief analysis above, it seems unlikely that we will be able to use weather data to model stock market returns accurately, but let’s run through a quick linear regression and examine the results.

Code
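# `data` (weather readings joined with daily returns) and `feels_like_summary`
# (average Feels Like per month) are built in earlier steps not shown here
data_prep <- data %>% 
    left_join(feels_like_summary) %>% 
    mutate(feels_like_difference = feels_like - avg_feels_like) %>% 
    select(-avg_feels_like) %>% 
    filter(tod == "Morning") %>% 
    filter(!is.na(pct_ret)) %>% 
    mutate(wday = wday(date, label = TRUE)) %>% 
    select(date, month, wday, everything(), -tod, -weather_id, -weather_icon)

# Regress returns on month, weather category and temperature;
# `- 1` drops the intercept, so each month gets its own coefficient
lm_output <- data_prep %>% 
    select(pct_ret, month, weather_main, feels_like, feels_like_difference) %>% 
    lm(formula = pct_ret ~ . - 1) %>% 
    summary()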

lm_output %>% 
    broom::glance() %>% 
    set_names(names(.) %>% str_replace_all(., "_", " ") %>% str_to_title()) %>%
    mutate(across(.cols = where(~is.numeric(.x)), .fns = ~round(.x, digits = 4))) %>% 
    gt()
R.squared  Adj.r.squared  Sigma   Statistic  P.value  Df  Df.residual  Nobs
0.0068     0.0048         0.0081  3.298      0        22  10544        10567
Code
lm_output %>% 
    broom::tidy() %>% 
    arrange(p.value) %>% 
    set_names(names(.) %>% str_replace_all(., "_", " ") %>% str_to_title()) %>%
    mutate(across(.cols = where(~is.numeric(.x)), .fns = ~round(.x, digits = 3))) %>%
    DT::datatable()

Once again, we find that the weather explains almost none of the variation in daily stock returns: the model’s R-squared is only about 0.7%. In fact, the month of the year appears to be more significant than the weather variables in explaining daily stock return variation.
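
One caveat when reading the R-squared: because the formula drops the intercept (the "- 1" term), R's summary of the model measures R-squared against a baseline of zero rather than against the mean return. The sketch below illustrates this on simulated data (not the data used in this post), where the month has no real effect by construction.

```r
# Simulated data: returns with a small positive mean and a month factor
# that has no real effect on them
set.seed(1)
sim <- data.frame(
    pct_ret = rnorm(1200, mean = 0.0005, sd = 0.008),
    month   = factor(rep(month.abb, times = 100), levels = month.abb)
)

fit_no_int <- lm(pct_ret ~ month - 1, data = sim)  # no intercept, as above
fit_int    <- lm(pct_ret ~ month, data = sim)      # same model, with intercept

# Both parameterizations produce the same fitted values ...
same_fit <- all.equal(unname(fitted(fit_no_int)), unname(fitted(fit_int)))

# ... but R-squared is measured against zero without an intercept and
# against the mean return with one, so the two numbers differ
r2_no_int <- summary(fit_no_int)$r.squared
r2_int    <- summary(fit_int)$r.squared

c(no_intercept = r2_no_int, with_intercept = r2_int)
```

With identical residuals, the no-intercept R-squared can never be smaller than the with-intercept one, so the 0.7% figure is, if anything, a generous reading of how much variation the model explains. For the same reason, significant month coefficients in a no-intercept model may simply reflect average returns that differ from zero.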

Final Remarks

Evidently, there is no clear relationship between the weather and daily stock returns…

The above is intended as an exploration of historical data, and all statements and opinions are expressly my own; neither should be construed as investment advice.