Linear Trend Analysis using Least Squares Regression and R

Trend analysis helps you estimate whether a metric of interest (daily users, sessions, sales and so on) is growing or declining. Sometimes the trend is obvious:


But what about this example of real-life data:


Can we be sure that the metric grows on average over time? A simple way to check is to fit a least squares regression line:


From the plot above we can see that our metric slowly grows over time.

# R script to plot the data and regression line

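# data is a numeric vector of data points (one value per day)
# days is the vector of observation days
days <- 1:110

# plot the series against days so the fitted line shares the same x axis
plot(days, data, type="l", col="darkblue", lwd=1)
abline(lm(data ~ days), col="darkgreen")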

But can we see how fast the metric grows on average? Let’s investigate the regression slope:

> lm(data ~ days)

(Intercept)         days  
    40601.4         27.6  

From this output we can see that our metric grows by about 27.6 per day on average. What more can we do? So far we have fitted the regression over the entire observation period (110 days in our case). Now let’s look at the regression coefficients for roughly the last 30 days (days 80 to 110):

> lm(data[80:110]~days[80:110])

 (Intercept)  days[80:110]  
     23244.7         210.8  

You can see that if we consider only days 80 to 110, our metric grows by 210.8 per day, so the trend rate itself is increasing over time! This is very useful to know and estimate.
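Comparing two slopes by eye says nothing about how uncertain each one is. The following is a minimal sketch of one way to put a confidence interval around each slope using base R's confint(). The series here is synthetic (generated with a constant slope of 30), and the helper name trend_slope is hypothetical; neither comes from the data used above.

```r
# Synthetic stand-in series (assumption: NOT the data from this article).
set.seed(42)
days <- 1:110
data <- 40000 + 30 * days + rnorm(length(days), sd = 500)

# Hypothetical helper: fit a least squares line on a window of days and
# return the slope together with its 95% confidence interval.
trend_slope <- function(y, x, window = seq_along(x)) {
  df <- data.frame(x = x[window], y = y[window])
  fit <- lm(y ~ x, data = df)
  ci <- confint(fit, "x", level = 0.95)
  c(slope = unname(coef(fit)["x"]),
    lower = unname(ci[1, 1]),
    upper = unname(ci[1, 2]))
}

full   <- trend_slope(data, days)                   # all 110 days
recent <- trend_slope(data, days, window = 81:110)  # last 30 days

print(rbind(full, recent))
```

The interval for the shorter window will typically be much wider, because fewer points pin down the line. If the two intervals overlap heavily, the apparent change in trend rate may be noise rather than a real acceleration.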

The trend analysis we just did is quite simple, but unfortunately it cannot always be applied. You may need to consider other methods, such as Theil-Sen estimation if the data contain many outliers, or autoregressive models if the data are autocorrelated.
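To make the outlier point concrete, here is a small sketch of the Theil-Sen idea in base R: take the median of the slopes between all pairs of points, then the median of the implied intercepts. The function name theil_sen and the toy series are illustrative assumptions, not part of any package or of the data above. The pairwise approach is quadratic in the number of points, so it suits short series like the one here.

```r
# Hypothetical helper: Theil-Sen estimate via the median of pairwise slopes.
theil_sen <- function(x, y) {
  pairs <- combn(length(x), 2)          # every pair i < j, one per column
  dx <- x[pairs[2, ]] - x[pairs[1, ]]
  dy <- y[pairs[2, ]] - y[pairs[1, ]]
  keep <- dx != 0                       # skip pairs with identical x
  slope <- median(dy[keep] / dx[keep])
  intercept <- median(y - slope * x)    # median of per-point intercepts
  c(intercept = intercept, slope = slope)
}

# Toy series (assumption): an exact line with three large outliers at the end.
x <- 1:20
y <- 100 + 2 * x
y[18:20] <- y[18:20] + 500

ols    <- coef(lm(y ~ x))   # least squares is pulled up by the outliers
robust <- theil_sen(x, y)   # the median-based fit stays on the bulk of the data

print(ols)
print(robust)
```

In this toy case most point pairs lie on the undisturbed line, so the median slope ignores the three outliers, while the least squares slope is inflated by them.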