This thesis reports a series of novel approaches that enable knowledge
acquisition by exploiting machine learning capabilities. Interdisciplinary
approaches may expose new possibilities for data analytics. Two theoretical
frameworks are conceptualized, reporting findings in the research
domains of Social Media (SM) and Energy. Common methods, algorithms and tools may
be utilized for knowledge extraction across the specific mining tasks considered.
The first theoretical framework presents and combines three novel approaches
in the Social Media domain, addressing mining tasks related to Social Media Types
(SMTs), Social Media Topic Extraction (SMTE) and Social Media Sentiment Analysis
(SMSA). SMTs are evaluated through a novel hypothesis-based, data-driven
methodology that analyses Social Media Platforms (SMPs) and categorizes them
by the services they offer, proposing new SMTs. The methodology evaluates a
new taxonomy using a mixture of hypothesis-driven and data-driven approaches built on
association rules and clustering algorithms. As a result, three new SMTs emerge,
namely Social, Entertainment and Profiling networks, which update the taxonomy and
capture emerging SMP services.
Regarding SMTE, this study utilizes Twitter data to mine association rules and
extract knowledge about public attitudes. The COVID-19 pandemic acts as the use case,
analysing crawled tweets. The approach incorporates topic extraction and visualization
techniques to form word clusters that point to themes of opinion. Association rule
mining is utilized to refine the extracted topics, producing more accurate
and generic results. For the examined period, out of 50 topics initially retrieved with
common SMTE methods, the proposed approach reduces the set to
just a few.
SMSA relates to the identification and analysis of sentiment polarity in
microblogging data. Such a mining task enables new possibilities for knowledge
extraction and for evaluating public sentiment in response to global events, producing
valuable insights. COVID-19 is the use case, with data gathered from Twitter. The main
objective of this task is to evaluate a possible correlation between public
sentiment and the number of cases and deaths attributed to COVID-19. Findings
iv “Interdisciplinary data science methods using machine learning for enhanced knowledge acquisition”
correlate sentiment polarity with announced deaths over a window starting 41 days and
extending up to three days prior to the count. A strong correlation is also identified
between the polarity of COVID-19 Twitter conversation and reported cases, but only a
weak correlation between polarity and reported deaths.
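The lead-lag relationship described above can be illustrated with a minimal sketch of a lagged Pearson correlation. This is not the procedure used in the thesis: the function names, the equal-length daily series and the toy numbers are assumptions made purely for illustration.

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def lagged_correlation(sentiment, counts, lag):
    """Correlate sentiment on day t with counts on day t + lag (lag >= 0).

    Both inputs are assumed to be daily series of the same length.
    """
    x = sentiment[: len(sentiment) - lag]  # drop the last `lag` days
    y = counts[lag:]                       # drop the first `lag` days
    return pearson(x, y)


# Hypothetical toy data: counts follow sentiment with a two-day delay.
toy_sentiment = [1, 2, 3, 4, 5, 6]
toy_counts = [0, 0, 2, 4, 6, 8]
r_lag2 = lagged_correlation(toy_sentiment, toy_counts, lag=2)
```

Scanning `lag` over a range such as 3 to 41 days and comparing the resulting coefficients is one simple way to locate the window in which polarity and reported counts move together.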
The second theoretical framework presents and combines three novel
approaches in the Energy domain, addressing mining tasks related to Energy
Balancing (EB), Energy Load Forecasting (ELF) and Energy Optimal Day-Ahead
Scheduling (EODS). Energy management may be improved by performing EB at both the
Peer-to-Peer (P2P) and Virtual Microgrid-to-Virtual Microgrid (VMG2VMG) levels.
This task yields an interdisciplinary analytics-based approach for forming
VMGs that achieve EB. Computer Science methods are incorporated to address an
Energy sector problem, utilizing data preprocessing techniques and Machine Learning
concepts. Each prosumer is perceived as a peer, while VMGs are perceived as clusters
of peers. The approach incorporates clustering and binning algorithms to
preprocess Energy data (for 94 prosumers), producing options for generating
VMGs. Then, a customized Exhaustive brute-force Balancing Algorithm (EBA)
balances at the cluster-to-cluster level (VMG2VMG balancing), reporting outcomes
and prospects for scaling up and expanding this work.
A novel approach to the task of ELF yields improvements in forecasting residential
energy requirements. This task is crucial for Energy sector stakeholders (e.g., DSOs,
aggregators) since it lets them plan their Demand
Response (DR) management strategies more efficiently. The experimentation includes the retrieval of
energy readings from a state-of-the-art nearly Zero Energy Building (nZEB). The focus is
on one-step-ahead ELF, producing an approach that works regardless of the time resolution of
the available data while yielding highly accurate results. Ensemble methods and forecasting
algorithms are utilized, and forecasting results are evaluated with
popular accuracy metrics (MAPE, SMAPE and RMSE) and an Execution Time (ET)
metric.
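For readers unfamiliar with the accuracy metrics named above, the following minimal sketch shows their textbook definitions. The exact variants used in the thesis are not stated here, so the SMAPE denominator and the percentage scaling are assumptions, as are the function names and sample values.

```python
import math


def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent.

    Undefined (division by zero) when any actual value is 0.
    """
    n = len(actual)
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / n


def smape(actual, forecast):
    """Symmetric MAPE, in percent.

    Assumes the (|a| + |f|) / 2 denominator; other variants exist.
    """
    n = len(actual)
    return 100.0 * sum(
        abs(f - a) / ((abs(a) + abs(f)) / 2.0) for a, f in zip(actual, forecast)
    ) / n


def rmse(actual, forecast):
    """Root Mean Squared Error, in the units of the data."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)


# Hypothetical load readings and one-step-ahead forecasts.
readings = [100.0, 200.0]
predicted = [110.0, 180.0]
scores = (mape(readings, predicted), smape(readings, predicted), rmse(readings, predicted))
```

An execution-time measurement can be obtained separately by wrapping the forecasting call between two `time.perf_counter()` readings from the standard library.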
Optimal energy management relates to the task of EODS. A novel approach
is proposed in the form of a framework/tool for multi-objective analysis comprising
a decision-making system. Two distinct optimization problems for two actors
(consumers and aggregators) are considered, with each solution completely or partly
interacting with the other in the form of DR signal exchange. The overall optimization
is formulated as a bi-objective optimization problem on the consumer's side, aiming
at cost minimization and discomfort reduction, and a single-objective optimization
problem on the aggregator's side, also aiming at cost minimization. Experimentation is
conducted on a real pilot (Terni Distribution System portfolio). The framework
performs decision making by forecasting the day-ahead energy management
requirements while aiming at optimal management of energy resources, considering
the preferences and goals of both aggregators and consumers.
The achievements of this thesis highlight prospects for enhanced knowledge
acquisition through the conception of two theoretical frameworks in the domains of
Social Media and Energy, while envisioning an interdisciplinary research design. The
theoretical frameworks, “A Multi-Functional Framework for defining Social Media
Types, extracting Topics and Inferences, and discovering Correlations based on Public
Sentiment” and “A Novel Framework for P2P and VMG2VMG Energy Balancing,
Incorporating One Step Ahead Load Forecasting and Optimization for Day-Ahead
Energy Scheduling”, incorporate common data mining methods and algorithms, underlining
the need for novel interdisciplinary approaches in multi-domain data analytics
along with the benefits they might yield.