Time Series Forecasting with Machine Learning
New artificial intelligence models appear constantly, and many of them reduce the human effort needed for coding and analysis. In this post, we are going to look at how machine learning strengthens our ability to analyze data, using time series prediction models as the running example.

A time series is a sequence of data points collected at regular time intervals. Here are the most common use cases across different fields:
- Trend Analysis: It helps us understand the long-term movements in data, which can assist brands in making data-based decisions.
- Seasonality Detection: It can highlight patterns that repeat at specific time intervals, such as daily, weekly, monthly, or yearly cycles.
- Forecasting: It predicts future values based on past data, which is essential for planning better strategies and decision-making.
Time series models play a crucial role in many industries when it comes to predicting future values. In short, forecasting uses previous data points to predict future ones, so let's move from the theory to the technical details.
ARIMA
ARIMA is one of the most well-known statistical methods used for time series analysis. As suggested by its name, it combines autoregression (AR), differencing (I), and moving average (MA) to model and predict future data points. Keep in mind that ARIMA is particularly effective for short-term forecasting when the time series exhibits a linear trend; strongly seasonal data is usually better served by its seasonal extension, SARIMA.
- AutoRegressive (AR): The model uses the dependency between an observation and a number of lagged observations. p is the number of lag observations (the lag order).
- Integrated (I): Differencing the raw observations to make the time series stationary. d is the number of times the raw observations are differenced (the degree of differencing).
- Moving Average (MA): The model uses the dependency between an observation and the residual errors from a moving average model applied to lagged observations. q is the size of the moving average window.
These parameter definitions might seem a little confusing at first. Our aim is always to choose the parameters that give us the best prediction model we can build. Here is a step-by-step guide we can follow while selecting the best ARIMA parameters:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
# Load dataset
data = pd.read_csv('time_series_data.csv')
data['Date'] = pd.to_datetime(data['Date'])
data.set_index('Date', inplace=True)
# Visualize the data
plt.figure(figsize=(10, 6))
plt.plot(data)
plt.title('Time Series Data')
plt.show()
Ensure that your time series data is stationary. Stationarity means that the mean, variance, and autocorrelation structure do not change over time. As mentioned above, the differencing parameter d is what makes the series stationary: if d = 1, the model uses the first difference of the data. A common check is the Augmented Dickey-Fuller (ADF) test; its null hypothesis is that the series is non-stationary, so a p-value below 0.05 suggests the series is stationary.
from statsmodels.tsa.stattools import adfuller
result = adfuller(data['value'])
print('ADF Statistic:', result[0])
print('p-value:', result[1])
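If the test suggests the series is not stationary, difference it and test again. Here is a minimal sketch, assuming your value column is named 'value' as in the snippet above; it also creates the data_diff variable that we will reuse for the ACF/PACF plots below.
# Difference the series once (d = 1) and re-run the ADF test
data_diff = data['value'].diff().dropna()
result_diff = adfuller(data_diff)
print('p-value after differencing:', result_diff[1])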
Assuming that our dataset is stationary, it is recommended to use the Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) plots to identify potential values for p and q.
- ACF: Shows the correlation of the time series with its own lagged values. A sharp cutoff after lag q suggests the moving average component q.
- PACF: Shows the correlation of the time series with its own lagged values while controlling for the intermediate lags. A sharp cutoff after lag p suggests the autoregressive component p.
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
# Plot ACF and PACF of the differenced series
fig, axes = plt.subplots(1, 2, figsize=(15, 5))
plot_acf(data_diff, lags=40, ax=axes[0])
plot_pacf(data_diff, lags=40, ax=axes[1])
plt.show()
After defining the values for p and q using ACF and PACF, it is time to fit several ARIMA models with different combinations of these values. We can use information criteria like AIC (Akaike Information Criterion) or BIC (Bayesian Information Criterion) to select the best model. Lower values of AIC or BIC indicate a better ARIMA model for our dataset.
import itertools
from statsmodels.tsa.arima.model import ARIMA
# Define the p, d, q ranges
p = range(0, 5)
d = range(0, 2)
q = range(0, 5)
# Generate all different combinations of p, d and q triplets
pdq = list(itertools.product(p, d, q))
# Find the combination that minimizes the AIC
best_aic = np.inf
best_order = None
best_model = None
for param in pdq:
    try:
        temp_model = ARIMA(data, order=param).fit()
        temp_aic = temp_model.aic
        if temp_aic < best_aic:
            best_aic = temp_aic
            best_order = param
            best_model = temp_model
    except Exception:
        # Some combinations fail to converge; skip them
        continue
print('Best ARIMA order:', best_order)
print('Best AIC:', best_aic)
Once you have chosen the best parameters, you should validate the model by plotting its diagnostics.
best_model.plot_diagnostics(figsize=(15, 12))
plt.show()
Lastly, with the best ARIMA parameters, we can forecast future values with the code below.
n_periods = 30 # Number of periods to forecast
forecast = best_model.forecast(steps=n_periods)
# Plot forecast
plt.figure(figsize=(10, 6))
plt.plot(data, label='Original')
plt.plot(forecast, label='Forecast', color='red')
plt.title('Forecasted Time Series Data')
plt.legend()
plt.show()
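Point forecasts alone hide how uncertain the model is. statsmodels can also return confidence intervals through the get_forecast method; here is a minimal sketch using the fitted best_model and n_periods from above.
# Forecast with confidence intervals
pred = best_model.get_forecast(steps=n_periods)
mean_forecast = pred.predicted_mean
conf_int = pred.conf_int()  # 95% interval by default
plt.figure(figsize=(10, 6))
plt.plot(data, label='Original')
plt.plot(mean_forecast, label='Forecast', color='red')
plt.fill_between(conf_int.index, conf_int.iloc[:, 0], conf_int.iloc[:, 1], color='red', alpha=0.2)
plt.legend()
plt.show()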
Prophet
Prophet is a forecasting model developed by Meta. Compared to ARIMA, it is designed to handle missing data, outliers, and seasonal changes efficiently out of the box. Prophet uses an additive model in which non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. It is useful for business forecasting tasks such as predicting sales, inventory, and other business metrics.
The Prophet formula consists of the following four components:
- Trend: Non-linear growth models, either logistic or linear, to capture the long-term trends. So, it represents g(t) in the formula.
- Seasonality: Captures periodic patterns such as yearly, weekly, and daily seasonality. s(t) corresponds to the seasonality in the models.
- Holidays: Incorporates effects of holidays which can significantly affect the time series data. In the formula, h(t) models the holiday effects.
- Error Term: ϵt represents the error term (noise).
As a result, the mathematical formulation of the model is:
y(t) = g(t) + s(t) + h(t) + ϵt
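To make the additive structure concrete, here is a toy series built from exactly these components. The values are purely illustrative, and the holiday term h(t) is omitted for simplicity.
import numpy as np
import pandas as pd
# Synthetic series: linear trend g(t) + weekly seasonality s(t) + noise eps_t
dates = pd.date_range('2022-01-01', periods=365, freq='D')
trend = np.linspace(100, 150, len(dates))                    # g(t)
seasonality = 10 * np.sin(2 * np.pi * dates.dayofweek / 7)   # s(t)
noise = np.random.normal(0, 2, len(dates))                   # eps_t
toy = pd.DataFrame({'ds': dates, 'y': trend + seasonality + noise})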
Step-by-Step Guide to Implementing Prophet
First, make sure you have Prophet installed along with the necessary libraries. The code below demonstrates both; you can use a Jupyter notebook for this.
pip install prophet
import pandas as pd
import matplotlib.pyplot as plt
from prophet import Prophet
It is time to load the time series data into a pandas DataFrame. When working with Prophet, we only need a DataFrame with two columns: dates and values. Be careful with the column names if you copy the code below: Prophet requires the date column to be named 'ds' and the value column to be named 'y'.
data = pd.read_csv('amazon_stock_prices.csv')
data['Date'] = pd.to_datetime(data['Date'])
data = data.rename(columns={'Date': 'ds', 'Value': 'y'})
plt.figure(figsize=(10, 6))
plt.plot(data['ds'], data['y'])
plt.title('Time Series Data')
plt.xlabel('Date')
plt.ylabel('Value')
plt.show()

Fitting the model takes just the two lines below. It really is that easy.
# Initialize the model
model = Prophet()
# Fit the model
model.fit(data)
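The defaults already work well, but Prophet also lets you configure the components explicitly. Here is a hedged sketch of some common options; the values are illustrative rather than recommendations, and tuned_model is just a name chosen for this example.
# Optional: configure seasonality and holidays explicitly
tuned_model = Prophet(
    yearly_seasonality=True,
    weekly_seasonality=True,
    daily_seasonality=False,
    changepoint_prior_scale=0.05,  # controls trend flexibility
)
tuned_model.add_country_holidays(country_name='US')  # built-in holiday calendar
tuned_model.fit(data)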
When it comes to the prediction phase, we need to create a DataFrame that extends our data into future dates. The make_future_dataframe method creates a DataFrame that includes all dates from the start of your data to a specified number of days into the future. Afterward, we use the predict method to forecast future values. This method returns a DataFrame with the forecasted values and various components such as trend and seasonality. Here’s an example of how to do this using Prophet:
# Create future dates dataframe
future = model.make_future_dataframe(periods=90) # 90 days into the future, depends on your choice
# Make predictions
forecast = model.predict(future)
# View the forecast
forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].tail()
- ds: The dates for which the forecast is made, including future dates.
- yhat: The predicted value of the time series; in other words, an estimate of what we expect future values to be, based on the historical data and the model’s understanding of trends, seasonality, and holidays.
- yhat_lower and yhat_upper: The lower and upper bounds of the prediction interval (80% by default) for each forecasted value. These are important for understanding the uncertainty in your forecasts and help us assess risk and make more informed decisions.
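Prophet also ships built-in plotting helpers that visualize the forecast and break it down into the trend, seasonality, and holiday components described above:
# Plot the forecast and its components
fig1 = model.plot(forecast)
fig2 = model.plot_components(forecast)
plt.show()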

XGBoost
XGBoost, or Extreme Gradient Boosting, is a gradient-boosted decision tree library optimized for speed and performance. It is used not only for regression but also for other machine learning tasks such as classification and ranking. When considering XGBoost, we should take a look at its key features:
- Regularization: Built-in L1 and L2 regularization reduces the overfitting seen in many models.
- Parallel Processing: XGBoost provides an efficient parallel implementation for fast training.
- Handling Missing Values: One of its most important features is automatically detecting and handling missing data.
- Tree Pruning: It stops growing trees when further splits bring no improvement.
Let’s start the same way as with Prophet: make sure you have XGBoost installed along with the necessary libraries. The code below demonstrates both; you can use a Jupyter notebook for this.
pip install xgboost
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from xgboost import XGBRegressor
from sklearn.metrics import mean_squared_error
Load your time series data into a pandas DataFrame and create lag features, which turn forecasting into a supervised regression problem: each row’s target is predicted from the values of the previous time steps.
# Load dataset
data = pd.read_csv('amazon_stock_prices.csv')
data['Date'] = pd.to_datetime(data['Date'])
data.set_index('Date', inplace=True)
# Create lag features
def create_lag_features(df, lags, target_col):
    for lag in range(1, lags + 1):
        df[f'{target_col}_lag_{lag}'] = df[target_col].shift(lag)
    df.dropna(inplace=True)
    return df

data = create_lag_features(data, lags=5, target_col='Value')
# Split data into features and target
X = data.drop(columns=['Value'])
y = data['Value']
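Lag features are not the only option. Calendar features often help tree models pick up weekly or monthly patterns; here is a small sketch with a hypothetical helper, add_date_features, which is not part of the pipeline above (if you use it, rebuild X and y afterwards).
# Optional: calendar features derived from the DatetimeIndex
def add_date_features(df):
    df = df.copy()
    df['day_of_week'] = df.index.dayofweek
    df['month'] = df.index.month
    return df
# data = add_date_features(data)  # then rebuild X and y as above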
At this point, we do something that deserves extra care for time series: the train-test split. While the train-test split is used with many other models to measure accuracy, here we must preserve the chronological order of the data, so we split it without shuffling and always train on the past and test on the future.
# Split data into train and test sets
train_size = int(len(data) * 0.8)
X_train, X_test = X[:train_size], X[train_size:]
y_train, y_test = y[:train_size], y[train_size:]
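A single chronological split can be sensitive to where you cut the data. scikit-learn’s TimeSeriesSplit offers walk-forward validation, where each fold trains on the past and tests on the period right after it; a minimal sketch:
from sklearn.model_selection import TimeSeriesSplit
# Walk-forward validation: each fold trains on the past, tests on the future
tscv = TimeSeriesSplit(n_splits=5)
for train_idx, test_idx in tscv.split(X):
    X_tr, X_te = X.iloc[train_idx], X.iloc[test_idx]
    y_tr, y_te = y.iloc[train_idx], y.iloc[test_idx]
    # fit and evaluate a model per fold here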
Define the XGBoost model and train it on the training data.
# Initialize the model
model = XGBRegressor(objective='reg:squarederror', n_estimators=1000)
# Train the model
model.fit(X_train, y_train, eval_set=[(X_test, y_test)], verbose=False)
Make predictions on the test set using the trained model and then evaluate the model’s performance using the Root Mean Squared Error (RMSE).
# Make predictions
y_pred = model.predict(X_test)
# Calculate RMSE
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
print(f'Root Mean Squared Error: {rmse}')
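RMSE penalizes large errors heavily, so it is often worth reporting the Mean Absolute Error alongside it; MAE is easier to interpret because it stays in the units of the target.
from sklearn.metrics import mean_absolute_error
mae = mean_absolute_error(y_test, y_pred)
print(f'Mean Absolute Error: {mae}')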

The final step is to visualize the actual vs. predicted values to assess the model’s performance.
# Plot actual vs. predicted values
plt.figure(figsize=(10, 6))
plt.plot(y_test.index, y_test, label='Actual')
plt.plot(y_test.index, y_pred, label='Predicted')
plt.title('Actual vs Predicted Values')
plt.xlabel('Date')
plt.ylabel('Value')
plt.legend()
plt.show()
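Since the model was trained on lag features, it is also worth checking which lags it actually relies on. XGBoost’s built-in importance plot makes this a one-liner:
from xgboost import plot_importance
# Show the most influential features (here: the lag columns)
plot_importance(model, max_num_features=10)
plt.show()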

Conclusion
Time series forecasting can be performed with many different models like these. Here, I tried to show how they can be used with a stock-pricing example. Since no dataset is perfect, it is very likely that the models will struggle in some places. Therefore, I recommend working with and comparing several different models for forecasting.