This dissertation was written as a part of the MSc in Data Science at the
International Hellenic University.
Traffic prediction is the task of forecasting the traffic flow at a specific
location. It is the most critical part of any traffic management system in a smart
city. Accurate prediction could reduce accidents and wasted time and even improve
citizens' quality of life, which is why research on this topic is essential.
In this thesis, a dataset containing the traffic flow of 6 different Crosses
(junctions) at an unknown location is used with Machine Learning and Deep Learning
models. To predict the traffic flow, regression models such as Linear Regression,
Random Forest, Multi-Layer Perceptron (MLP) and Gradient Boosting are utilized.
Further techniques applied to the data were adding “time” features and using a
different time interval between the observations of the time series, both of which
led to better results. Furthermore, the regression problem was converted into a
classification problem, and classifiers such as K-Nearest Neighbors (KNN), Support
Vector Classifier (SVC), Adaptive Boosting (AdaBoost), Decision Tree, Random Forest,
Gaussian Naïve Bayes (GaussianNB) and Extra Trees were used for experimentation.
Finally, Long Short-Term Memory (LSTM), which the literature review suggests is one
of the top deep learning models for predicting traffic flow, was utilized and tuned
for our case. Indeed, LSTM outperformed the other models with regard to the RMSE
metric. In each analysis, the corresponding statistical metrics were calculated to
compare the different models and choose the optimal one. In our case, as mentioned,
the LSTM model was the best for regression, and the Extra Trees and Random Forest
classifiers were the best for classification. Cross-Validation and Grid Search were
also used in the search for optimal models.
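The two data-preparation steps mentioned above (adding “time” features and changing
the interval between observations) can be sketched as follows. This is only an
illustration under stated assumptions: the abstract does not say which features or
which interval were used, so the feature set, the 4-hour bin, the helper names
`time_features` and `resample_sum`, and the synthetic counts are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def time_features(ts):
    """Calendar features derived from a timestamp (assumed feature set)."""
    return {
        "hour": ts.hour,
        "weekday": ts.weekday(),               # Monday = 0
        "is_weekend": int(ts.weekday() >= 5),
        "month": ts.month,
    }


def resample_sum(records, hours):
    """Aggregate (timestamp, count) pairs into bins of `hours` hours by summing."""
    bins = defaultdict(int)
    for ts, count in records:
        # Floor the timestamp to the start of its bin.
        bin_start = ts.replace(
            hour=ts.hour - ts.hour % hours, minute=0, second=0, microsecond=0
        )
        bins[bin_start] += count
    return sorted(bins.items())


# Synthetic hourly vehicle counts, for illustration only (not the thesis data).
start = datetime(2024, 1, 6)  # a Saturday
records = [(start + timedelta(hours=h), 10 + h) for h in range(8)]

resampled = resample_sum(records, hours=4)           # coarser time interval
features = [time_features(ts) for ts, _ in resampled]  # one feature row per bin
```

Each row of `features` could then be joined with the resampled count and passed to
any of the regression models listed above.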
For the regression problem, one technique utilized was to train the machine learning
models on the data not only of one Cross but also of another, highly correlated
Cross. This results in better models with regard to the R² metric. Thus, different
kinds of approaches were examined for this univariate problem, and they achieved
better results than the classic regression approach.
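As a rough sketch of the correlated-Cross idea and of the two metrics named in this
abstract, the snippet below picks the series most correlated with a target series and
defines RMSE and R². The abstract does not state how correlation was measured, so the
use of the Pearson coefficient is an assumption, as are the helper names and the
synthetic series.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def most_correlated(target, candidates):
    """Name of the candidate series with the largest |correlation| to `target`."""
    return max(candidates, key=lambda name: abs(pearson(target, candidates[name])))


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Synthetic traffic counts per Cross, for illustration only (not the thesis data).
cross_1 = [10, 12, 15, 20, 18, 14]
others = {
    "cross_2": [21, 25, 29, 41, 37, 27],
    "cross_3": [5, 9, 4, 6, 8, 7],
}
best = most_correlated(cross_1, others)

# Feature rows that pair each target observation with the correlated Cross's value.
rows = list(zip(cross_1, others[best]))
```

The values of the selected Cross can then be supplied to a regression model as an
additional input alongside the target Cross's own history.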