Peer-to-Peer Energy Trading Case Study Using an AI-Powered Community Energy Management System

Mahmoud, Marwan; Slama, Sami Ben

doi:10.3390/app13137838

Open Access Article


by Marwan Mahmoud ^1,2,* and Sami Ben Slama ^2,*

¹ The Applied College, King Abdulaziz University, Jeddah 21589, Saudi Arabia

² Analysis and Processing of Electrical and Energy Systems Unit, Faculty of Sciences of Tunis El Manar, Belvedere PB 2092, Tunisia

^* Authors to whom correspondence should be addressed.

Appl. Sci. 2023, 13(13), 7838; https://doi.org/10.3390/app13137838

Submission received: 5 May 2023 / Revised: 22 June 2023 / Accepted: 28 June 2023 / Published: 4 July 2023


Abstract: The Internet of Energy (IoE) is a topic that both industry and academia find intriguing and promising, since it can aid in developing technology for smart cities. This study proposes an innovative energy system with peer-to-peer trading and more sophisticated management of residential energy storage systems. It proposes a smart residential community strategy that includes household customers and nearby energy storage installations. Without constructing new energy-producing facilities, users can consume affordable renewable energy by exchanging energy with the community energy pool. The community energy pool can purchase any excess energy from consumers and renewable energy sources and sell it at a price higher than the feed-in tariff but lower than the retail price. The energy pricing of the pool is based on the real-time relationship between supply and demand to stimulate local energy trade. Under this pricing structure, the cost of electricity may vary depending on the retail price, the number of consumers, and the amount of renewable energy. This maximizes the advantages for customers and the utilization of renewable energy. A Markov decision process (MDP) models the energy trading process in order to maximize consumer benefits, increase renewable energy utilization, and identify the optimal trading decision. A reinforcement learning technique determines the best option in the MDP for renewable energy and the energy exchange process. A fuzzy inference system, which takes into account the infinite possibilities of the energy exchange process, enables Q-learning to be used in continuous state space problems (fuzzy Q-learning). The analysis shows that the suggested demand-side management system is successful. Its efficacy is assessed quantitatively by comparing the cost of power before and after the deployment of the proposed energy management system.

Keywords: artificial intelligence; deep reinforcement learning; renewable energy; peer-to-peer energy trading; prosumer

1. Introduction

Decentralization and digitalization have been identified as the main factors in the evolution of the global electricity system [1]. Decarbonization aims to limit the effect of anthropogenic carbon emissions on climate change. An energy supply that does not produce carbon dioxide emissions implies using decarbonized resources and technologies, such as variable renewable energies (VRE), energy storage in batteries, and demand flexibility [2]. Through decentralization, prosumers, or communities of prosumers, emerge as groups of many small energy producers who generate and self-consume electricity from distributed energy resources. The resulting changes in consumption have an impact on energy resource management and cause revenue losses for utilities [3]. According to the European Union’s Horizon 2020 research and innovation program, digitalization is leading to the creation of new technologies and market solutions in the energy sector. These market solutions are known as peer-to-peer (P2P) energy exchange concepts [4].

The peer-to-peer (P2P) energy exchange concept represents an evolution of future transactive energy trends. It involves the creation of a horizontal platform for sharing, transferring, exchanging, and trading energy between peers who are either consumers or prosumers [5]. This poses a challenge to the traditional vertical structure of the energy value chain. From a legal and regulatory perspective, “P2P energy sharing” is considered a broad term, according to the Council of European Energy Regulators. It considers all possible interactions between the individuals making up a self-consumption system (individual, collective, and community self-consumption). In addition, as part of the “Clean Energy” package, the European Commission is working on legalizing private individuals buying and selling electricity [6]. This involves “selling renewable energy between market players utilizing a contract whose terms are predetermined”, according to the P2P concepts. Whether directly between market players or indirectly via a certified third party, such as an aggregator, the contract controls the automated execution and settlement of the transaction [7]. This can be achieved directly or indirectly. The increasing number of large certified prosumers in the electricity market has necessitated a reevaluation. In the current centralized electricity market, prosumers who produce more electricity than they need have two choices for what to do with their extra power: either they may sell it to the primary grid following a feed-in tariff (FIT) regulation, or they can store it in a battery energy storage device [8]. However, feed-in tariffs have decreased in most nations as many governments have lowered subsidies for renewable energy, which calls the feasibility of renewable energy projects into question. In addition, battery energy storage systems for most stand-alone renewable energy sources, such as residences, must be far more efficient. 
In addition, the laws and regulations that support the energy market for renewable energy sources held by residents need to be more effective [9]. As a result, the P2P energy trading mechanism for distributed energy resources (DER) in microgrids has emerged as a candidate strategy for the development of a future energy trading system [10]. The peer-to-peer energy exchange mechanism, which involves a direct energy exchange between prosumers, may also support an equity model that enables prosumers to earn money from their surplus power through net metering [11]. Microgrid technology enables the integration of small modular multi-energy generating technologies, such as renewable energy sources; these multi-energy systems may be connected to new power networks. Microgrids are a type of distributed energy resource (DER) [12]. Developing peer-to-peer energy trading is one of the most critical steps in the growth of microgrids. Small-scale actors, such as consumers and prosumers, can actively exchange energy in (near) real time inside their communities thanks to the microgrid (MG) energy market [13]. However, most of the current research on microgrids is directed at using energy management strategies to improve the system’s resilience and boost the usage of renewable energy. At the same time, the notion of peer-to-peer energy exchange has captured the attention of prosumers on a commercial scale. The net-metered energy produced by DERs may be sold to other customers at a high markup if the installed capacity of the DERs exceeds the load of a potential consumer. This, however, might result in weaker economic incentives for potential market players or reduced flexibility in P2P energy exchanges, which would lead to a less-than-ideal balance between energy supply and demand [14].

In [15], the authors argue that peer-to-peer energy trading allows energy prosumers to focus on self-consumption and near self-sufficiency, although the extent of this benefit remains to be determined. The surplus energy that can be bought and sold on the open market is decreasing, while the number of prosumers in local communities is constantly increasing. In a microgrid, the prosumers involved in P2P energy exchange have diverse interests they wish to satisfy. Creating an energy balance in a microgrid is difficult due to the unpredictability of electricity generation from renewable energy sources [16], which makes it challenging to match energy demand with energy supply. It is likewise difficult to find a compromise between the interests of stakeholders concerning the compensation and distribution of energy benefits [17]. The energy trading system must therefore consider issues related to the equitable distribution of benefits. In the context of a microgrid, it is essential to examine energy sharing as a component of the P2P energy trading system. A model must consider fair energy sharing and flexible commercial pricing, as these are the most critical variables for prosumers and will affect their involvement [18]. Therefore, this study aims to examine a model that increases energy self-consumption to reap the benefits of fair sharing. In addition, this study analyzes and determines flexible pricing for microgrid energy systems with many prosumers.

1.1. Related Works and Contributions

In [19], the authors state that the distribution network connects many small and medium-sized distributed energy resources. These include distributed generators (such as rooftop solar panels and wind turbines), energy storage, and regulated loads (such as electric vehicles, heat pumps, homes, and buildings). Due to the involvement of many market players who can sell or buy energy more flexibly and efficiently, the conventional business model, in which residential customers or prosumers deal only with energy utilities, is undergoing a substantial transformation. In [20], the authors proposed a new economic approach, often known as the “sharing economy”, which uses today’s information and communication technologies to promote the collaborative exchange of “underutilized assets” between “peers”. In addition, the ‘sharing economy’ is characterized by the cooperative use of otherwise underutilized resources. Other peer-to-peer (P2P) asset exchanges, such as Airbnb for peer-to-peer (P2P) house sharing and Task Rabbit for peer-to-peer matching, are not identical to peer-to-peer (P2P) energy trading. It faces significant challenges that distinguish it from other options. In [21], the authors argue that if an energy control system is considered a resource for peer-to-peer (P2P) trading, it is necessary to consider the adequacy of individual demand and the balance of the overall energy system concerning supply capacity. In [22], the authors conclude that the power grid needs to be reliable, secure, and well-balanced over multiple time scales to accommodate the myriad of potential energy inputs and outputs. In [23], the authors mention that to promote peer-to-peer (P2P) energy exchange on a larger and more comprehensive scale, it is essential to coordinate decentralized markets and facilitate the exchange of peak flexibility between different energy sectors and multi-energy communities. 
In [24], the authors report that the participants in peer-to-peer energy exchanges include peers and national aggregators.

In [25], the authors state that the community manager, the system operator, and the market operator are aggregators in a community market. Indeed, the authors mention that only the market operator is considered an aggregator in the hybrid market. In [26], the authors implement a peer-to-peer (P2P) energy exchange framework that requires creating complex energy services and the flexibility of a more comprehensive energy supply chain. This is why it is necessary to study the interoperability of the different markets and the difficulties in promoting P2P energy trading within the currency markets. This should enable trading partners to participate in the new P2P market and work with existing market players to establish P2P, wholesale and retail markets. In [27], the authors argue that establishing a P2P energy market requires additional elements to be put in place due to these significant barriers. These include increased engagement of market players at different levels in digitalization attackers and collaborations on social and scientific elements. Both of these elements are essential. Despite this, the number of prosumers in local communities continues to grow. In [28], the authors state that prosumers involved in peer-to-peer energy trading in a microgrid have individual goals. In [29], the authors state that achieving energy balance in a microgrid is difficult because renewable energy production can be challenging to predict. It is challenging to match the amount of energy to the available energy requirements. In [30], the authors state that finding common ground between the many stakeholders who offset energy costs and distribute benefits can take time and effort. In [31], the authors state that the energy trading technique must consider that the benefits are shared equitably. In the context of microgrids, the issue of energy distribution must be seen as an integral part of the powerful P2P exchange system. 
In [32], the authors mention that a model must consider both the fair sharing of energy and flexible commercial pricing, as these two aspects are essential for consumers and therefore have the most significant influence on their choice. In [33], the authors state that market exchange procedures are essential to peer-to-peer energy exchange. According to them, today’s P2P market techniques mainly focus on centralized, decentralized, and distributed markets (community, integer, and hybrid P2P markets). In [34], the authors argue that the centralized market can maximize the total value accruing for the entire energy trading community by using a centralized management technique. In [35], the authors mention that in a centralized market, a market coordinator can control the operation of the whole microgrid from a macroeconomic point of view. This allows the microgrid to operate more efficiently. The coordinator determines the objective function and maximizes the overall benefits of the group. In [36], the authors show that the decentralized system has no market organizer. Consequently, it allows users to participate in direct transactions and agreements between themselves. This prevents users from exposing their energy statistics to a third party. In [37], the authors consider dispersed marketplaces, which influence consumer behavior in a roundabout way through the price mechanism, another technique that has received much attention and study. This is comparable, but different, to decentralized and centralized marketplaces. When using the distributed system, users also benefit from greater confidentiality. In [38], the authors use the k-means technique to evaluate the historical data of prosumers and achieve a balance between supply and demand in the microgrid. In [39], the authors proposed an improved energy management model for prosumers to help them develop appropriate consumption plans. 
In [40], the authors proposed the concept of energy collectives, which allows the community negotiation process to be defined and modeled based on distributed optimization concepts. In [41], the authors proposed an optimization model to optimize the financial benefits of households equipped with PV and batteries. The idea is to set a fixed price for users to share energy via peer-to-peer (P2P) trading. In [42], the authors advocate for using local energy reserves for blockchain-based P2P energy trading in the industrial Internet of Things (IoT). The objective is to reduce blockchain length and energy transfer losses. In [43], the authors propose an “intelligent energy exchange platform”, a centralized P2P energy exchange market. As a constraint in this system, the price of P2P transactions is fixed in advance. In [44], the authors describe a market mechanism for operating multiple hubs based on P2P transactions occurring within energy hubs. In [45], the authors propose an event-driven local market in which energy retailers choose tariffs in a temporary market through an event-driven bilateral auction mechanism.

A peer-to-peer (P2P) energy-sharing approach for microgrids based on price demand response has been described [46]. In this paradigm, the relationship between supply and demand defines the dynamic internal pricing mechanism. In [47], the authors describe a viable network cost allocation strategy to compensate for peer-to-peer energy trading. By modeling P2P energy trading with auction payments, this strategy can improve the performance of P2P trading. In [48], the authors propose a trading strategy based on a Nash-type non-cooperative game model. Examining blockchain technology in the context of peer-to-peer (P2P) energy trading, the authors propose an Ethereum-based blockchain test bed that illustrates several blockchain principles [49]. In [50], the authors suggest that decentralized marketplaces’ continuous bilateral auction procedure could reduce some users’ production or load. In [51], the authors used an analytical strategy based on multi-bilateral economic distribution to build a decentralized mechanism for the P2P operation of electricity markets. In [52], the authors study an inventive demurrage approach for blockchain-based energy transfers. In [53], the authors describe a MAS-based architecture for energy trading; they consider customers as different and autonomous entities.

In this study, we focus on implementing an optimal AI-based solution. We propose a realistic Q-reinforcement learning approach to help market participants select acceptable trading system alternatives. The proposed transaction mechanism increases pattern decisions among market participants. We propose a peer-to-peer exchange approach for the energy community pool to improve energy sharing and BT performance. Q-learning can solve optimal choice difficulties by using prohibition limits. To maximize output, the proposed PV battery system for single-home and traditional households used an appliance planning model expressed as a MILP problem and analyzed in depth. Demand response is deployed to regulate prices and optimize device scheduling. Unlike previous methodologies, scenarios are used to mimic power system reliability. Scenarios are generated using exact PDF data and then used as constraints in the optimization problem. An intelligent prosumer environment case study is studied to demonstrate the potential of the suggested technique. Table 1 outlines the three types of peer-to-peer energy trading.

The remaining sections are structured as follows: the methodology is discussed in Section 1.2. The mathematical formulation of the optimal scheduling problem is presented and discussed in Section 1.3. The single home-sharing energy model is given in Section 1.4. Section 2 describes the multi-agent reinforcement Q-learning. Section 3 presents and discusses two case studies that illustrate the effectiveness of the proposed design. Section 4 presents a conclusion and recommendations for future work.

1.2. Methodology

Our study explores the potential of a neighborhood residential community that includes both conventional and smart dwellings, as well as an energy storage unit. Smart houses generally have energy storage systems (i.e., batteries (BT)). They allow customers to sell any excess energy to a power station while simultaneously acquiring shares at the market price in real time. Families that are not on the market can purchase electricity from the power pool at lower costs than the retail market [29]. When smart home solar panels are installed, any excess energy may be sold to the energy pool at a price greater than the feed-in tariff (FinT). Decentralized agents negotiate the purchase and sale of energy to the pool. Moreover, instead of selling electricity back to their retailer at the grid’s contracted rate, people can sell the extra electricity they produce (using solar panels, for example) to others, which is another advantage of the peer-to-peer concept. In addition, market players can choose the electricity purchase and sale tariffs they consider reasonable. In this article, “end users”, or “users”, refers to both those who produce and those who consume energy. Figure 1 illustrates the methodology used for this study. The descriptive research examines individuals’ interests and goals in peer-to-peer energy markets. To achieve this, an online survey using an ELD is undertaken to discover respondents’ preferences in various areas of energy. For various reasons, digital communication equipment was considered an acceptable solution for assessing user preferences. As it is an implicit method of calculating preferences, it removes the incentive to provide potentially incorrect but socially acceptable answers.

1.3. Objective Function

This study attempts to minimize the microgrid’s total external electricity exchanges (z₁) and the external network transaction costs (z₂) in order to increase the amount of electricity that the microgrid both produces and consumes itself (see Equation (1)). P_t^purchase,k and P_t^retail,k represent the amounts of electricity that home k purchases and sells, respectively, at time step t. The multi-objective problem in z₁ and z₂ is transformed into a single-objective optimization, whose objective is denoted MinZ. If the dimensions of z₁ and z₂ are too disparate, the weights χ₁ and χ₂ can be used to rescale them in the total (MinZ) as follows:

\begin{cases} z_{1} = \sum_{k=1}^{n} \left[ \sum_{t=1}^{m} \left( P_{t}^{\mathrm{purchase},k} + P_{t}^{\mathrm{retail},k} \right) \right] \\ z_{2} = \sum_{k=1}^{n} \left[ \sum_{t=1}^{m} \left( P_{t}^{\mathrm{purchase},k} C_{t}^{\mathrm{purchase}} + P_{t}^{\mathrm{retail},k} C_{t}^{\mathrm{retail}} \right) \right] \\ \min Z = \chi_{1} z_{1} + \chi_{2} z_{2} \end{cases}

(1)
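As a rough illustration of how the weighted objective in Equation (1) can be evaluated, the following Python sketch sums the exchanged energy and its cost over homes and time steps. The function name, argument layout (one list of per-time-step values per home), and the default weights are assumptions made for this example and are not taken from the study.

```python
from typing import Sequence, Tuple


def weighted_objective(
    purchase: Sequence[Sequence[float]],   # purchase[k][t]: energy bought by home k at step t
    retail: Sequence[Sequence[float]],     # retail[k][t]: energy sold by home k at step t
    c_purchase: Sequence[float],           # purchase price per time step
    c_retail: Sequence[float],             # selling price per time step
    chi1: float = 1.0,                     # weight on z1 (assumed default)
    chi2: float = 1.0,                     # weight on z2 (assumed default)
) -> Tuple[float, float, float]:
    """Return (z1, z2, Z) with the same structure as Equation (1)."""
    z1 = 0.0
    z2 = 0.0
    for bought_row, sold_row in zip(purchase, retail):               # homes k
        for t, (bought, sold) in enumerate(zip(bought_row, sold_row)):  # time steps t
            z1 += bought + sold
            # Both terms are added, exactly as the equation is written.
            z2 += bought * c_purchase[t] + sold * c_retail[t]
    return z1, z2, chi1 * z1 + chi2 * z2
```

Because z₁ is an energy quantity and z₂ a monetary one, the weights would in practice be chosen so that neither term dominates the sum.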

1.4. Single Home-Sharing Energy

1.4.1. PV Supply

The PV system is a controlled energy-generating agent in the presence of a solar module with an MPPT controller and DC/DC converter. The PV power generation is as shown in [40], and it is located on the roof of the house (see Equation (2)).

\begin{cases} I_{\mathrm{Solar}} = I_{ph} - I_{s} \, e^{\left( \frac{N_{S} \cdot V_{PV} + N_{P} \cdot I_{PV} \cdot R_{s}}{V_{T}} \right)} - \frac{N_{S} \cdot V_{PV} + N_{P} \cdot I_{PV} \cdot R_{s}}{R_{sh}} \\ P^{pv}(\forall t) = P^{pv}(s, t) \cdot D(s, t) \\ E(t, n) = E_{n}^{\mathrm{Consumer}}(t) + E_{n}^{\mathrm{Prosumer}}(t) = \sum_{k=1}^{6} E_{n}^{k}(t) \end{cases}

(2)

where E_n^Consumer(t) and E_n^Prosumer(t) represent the cost of installation of consumers and prosumers, respectively, while E(t, n) calculates the overall revenue for PV installation. I_Solar represents the PV power delivered P^PV(t) at time t under typical rated conditions (given by Equation (2)), and D(s, t) is the cost of installation of the rooftop solar PV system.

1.4.2. Households Load Consumptions

Household load consumption is a co-located agent that monitors household energy demand. This demand is determined by appliances (lighting, heating, leisure, cooking, etc.). Equation (3) presents matrices showing the typical amount of energy required by each class of appliance: adjustable appliances (J_a,s), non-adjustable appliances (J_a,n), and fixed appliances (J_a,f) [50]. The system regulates the operation of the appliances in the house to balance the amount of energy used and the amount of energy produced. The following functions are considered (see Figure 2).

J_{a,x} : \begin{cases} J_{a,s} = \begin{bmatrix} J_{a,s,1} & J_{a,s,t} & \dots & J_{a,s,2} \\ J_{a,s,1} & y_{a,s,1} & 0 & J_{a,s,3} \\ \dots & \dots & \dots & 0 \\ J_{a,s,t} & J_{a,s,t-1} & 0 & y_{a,s,1} \end{bmatrix}, & \forall s \in I^{s}, \ \forall a \in A^{s} \\ J_{a,n} = \begin{bmatrix} J_{a,n,1} & J_{a,n,t} & \dots & J_{a,n,2} \\ J_{a,n,1} & y_{a,n,1} & 0 & J_{a,n,3} \\ \dots & \dots & \dots & 0 \\ J_{a,n,t} & J_{a,n,t-1} & 0 & y_{a,n,1} \end{bmatrix}, & \forall n \in I^{N}, \ \forall a \in A^{N} \\ J_{a,f} = \begin{bmatrix} J_{a,f,1} & J_{a,f,t} & \dots & J_{a,f,2} \\ J_{a,f,1} & y_{a,f,1} & 0 & J_{a,f,3} \\ \dots & \dots & \dots & 0 \\ J_{a,f,t} & J_{a,f,t-1} & 0 & y_{a,f,1} \end{bmatrix}, & \forall f \in I^{F}, \ \forall a \in A^{F} \end{cases}

(3)

1.4.3. Storage Unit

Equation (4) calculates the maximum permitted output power of the BT. Charging and discharging are two completely different processes that can never occur simultaneously. The same equation also represents the state of charge of the BT, which depends on the battery’s capacity and the discharge level. It reveals the conditions of the battery system’s first charge and the charge levels it will reach in its final state. The following equation can be used to estimate the additional energy stored in the storage unit over an average period and under typical operating conditions [45].

\begin{array}{l} \left| \begin{array}{l} \mathrm{SoC}^{BT}(t) = \mathrm{SoC}^{BT}(t-1) + \Delta\nu \left( \eta^{c} \, p^{c}(t) - \frac{p^{d}(t)}{\eta^{d}} \right) \\ \mathrm{SoC}^{BT}(t) = \mathrm{SoC}_{0}^{BT} + \mathrm{SoC}_{T}^{BT} (1 - D^{BT}) \\ \mathrm{SoC}_{0}^{BT} = P^{BT}(1) = P^{BT}(T) \\ C^{BT} (1 - D^{BT}) \leq P^{BT}(t) \leq C_{0}^{BT}(t) \\ \mathrm{SoC}_{1}^{BT}(t, s) = C_{0}^{BT}(0, t) = \mathrm{SoC}_{T}^{BT}(t) \end{array} \right. \\ \overbrace{\begin{cases} 0 \leq p^{d}(t) \leq \sum_{\forall t, s} \Delta\tau^{1} \cdot \bar{u}^{BT}, & \forall t \\ 0 \leq p^{c}(t) \leq \sum_{\forall t, s} \Delta\tau^{2} \cdot \bar{u}^{BT}, & \forall t \\ \Delta\tau^{1} + \Delta\tau^{2} \leq 1, & \forall t \end{cases}}^{\text{BT constraints}} \end{array}

(4)

where SoC^BT(t) denotes the battery state of charge at the time t, SoC₀^BT(t) denotes the battery state of charge at the time 0, η^c represents the BT charging efficiency, C^BT_max denotes the battery maximum capacity at the time t, η^d denotes the battery discharging efficiency at the time t, C^BT₀ denotes the battery capacity at the time t, C^BT₀ denotes the initial budget (USD) at time t, Δν^BT T is the BT’s capital cost (USD/kw) at time t, and p^d(t, s) is the power delivered by the BT at the time t.
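The state-of-charge recursion in the first row of Equation (4) can be sketched in Python as follows. This is only an illustration under stated assumptions: Δν is treated here as the time-step length, the bounds follow the pattern C^BT(1 − D^BT) ≤ SoC ≤ C^BT with a hypothetical `capacity` and `depth_of_discharge`, and the efficiency defaults are placeholder values rather than figures from the study.

```python
def soc_step(
    soc_prev: float,
    p_charge: float,
    p_discharge: float,
    eta_c: float = 0.95,               # charging efficiency (placeholder value)
    eta_d: float = 0.95,               # discharging efficiency (placeholder value)
    dt: float = 1.0,                   # time-step length, standing in for the step factor
    capacity: float = 10.0,            # hypothetical battery capacity
    depth_of_discharge: float = 0.8,   # hypothetical D^BT
) -> float:
    """Advance the battery state of charge by one time step."""
    if p_charge < 0 or p_discharge < 0:
        raise ValueError("charge and discharge power must be non-negative")
    # Charging and discharging may not happen in the same time step.
    if p_charge > 0 and p_discharge > 0:
        raise ValueError("battery cannot charge and discharge simultaneously")

    soc = soc_prev + dt * (eta_c * p_charge - p_discharge / eta_d)

    # Keep the result within the admissible band (an assumption of this sketch;
    # an optimizer would instead enforce these bounds as constraints).
    soc_min = capacity * (1.0 - depth_of_discharge)
    return min(max(soc, soc_min), capacity)
```

Raising an error on simultaneous charging and discharging mirrors the exclusivity constraint Δτ¹ + Δτ² ≤ 1; a scheduler would normally encode it with binary variables instead.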

1.4.4. Time-of-Use Tariff

Electricity companies are introducing new tariff structures for their residential customers. Time-of-use rates, demand charges, and real-time pricing encourage consumers to use energy when electricity is cheap and discourage them from doing so when electricity is expensive [39]. Understanding how time-of-use tariffs work can help to reduce users’ monthly energy bills. The conditions of the tariff policy vary according to location. Taking the province of Jeddah in Saudi Arabia as an example, the electricity purchase prices C^purchase(t) and electricity selling prices C^retail(t) in 2021 for prosumers and for commercial and industrial consumers are given in Equations (5) and (6). The price of fixed-term contracts is described below.

C_{t}^{\mathrm{purchase}} = \begin{cases} 1.254 \ (\$), & t \in [11{:}00, 12{:}00] \cup [14{:}00, 15{:}00] \ \text{(sunny days)} \\ 1.04 \ (\$), & t \in [10{:}00, 12{:}00] \cup [14{:}00, 19{:}00] \\ 0.904 \ (\$), & t \in [00{:}00, 08{:}00] \\ 0.204 \ (\$), & \text{other periods} \end{cases}

(5)

C_{t}^{\mathrm{retail}} = \begin{cases} 1.16 \ (\$), & t \in [10{:}00, 12{:}00] \cup [14{:}00, 19{:}00] \\ 0.804 \ (\$), & t \in [00{:}00, 08{:}00] \\ 0.304 \ (\$), & \text{other periods} \end{cases}

(6)
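The piecewise tariffs of Equations (5) and (6) translate naturally into a lookup function. The sketch below is one possible reading: hours are given as decimal values (11.5 means 11:30), intervals are treated as half-open so that adjacent bands do not overlap, and the sunny-day band takes precedence where it overlaps the general peak band. These conventions and the function names are assumptions of this example.

```python
def purchase_price(hour: float, sunny: bool = False) -> float:
    """Electricity purchase price following the bands of Equation (5)."""
    if not 0 <= hour < 24:
        raise ValueError("hour must be in [0, 24)")
    # Sunny-day band is checked first because it overlaps the peak band.
    if sunny and (11 <= hour < 12 or 14 <= hour < 15):
        return 1.254
    if 10 <= hour < 12 or 14 <= hour < 19:
        return 1.04
    if 0 <= hour < 8:
        return 0.904
    return 0.204


def retail_price(hour: float) -> float:
    """Electricity selling price following the bands of Equation (6)."""
    if not 0 <= hour < 24:
        raise ValueError("hour must be in [0, 24)")
    if 10 <= hour < 12 or 14 <= hour < 19:
        return 1.16
    if 0 <= hour < 8:
        return 0.804
    return 0.304
```

A scheduler could call these functions once per time step to build the price vectors C_t^purchase and C_t^retail used in the objective of Equation (1).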

Equations (7) and (8) describe the upper limit on the instantaneous electricity that can flow between the distribution network and the home system in either direction. In addition, buying and selling energy are mutually exclusive, as the constraint δ₁ + δ₂ ≤ 1 in Equation (7) shows. The corresponding entry of the outage matrix becomes zero if a network failure occurs at time t in a given scenario. Therefore, due to the requirements in (9), the binary variables δ₁ and δ₂ must also have a value of zero. As a result, electricity flowing to/from the power grid is constrained to zero, which simulates the inaccessibility of the power grid during an outage.

\begin{cases} P_{t}^{\mathrm{grid,home}} = \delta_{1} P_{t,\max}^{\mathrm{grid,home}} \\ P_{t}^{\mathrm{home,grid}} = \delta_{2} P_{t,\max}^{\mathrm{home,grid}} \\ P_{t}^{\mathrm{grid,home}} \leq \delta_{1} P_{t,\max}^{\mathrm{grid,home}}, & \forall t \\ P_{t}^{\mathrm{home,grid}} \leq \delta_{2} P_{t,\max}^{\mathrm{home,grid}}, & \forall t \\ \delta_{1} + \delta_{2} \leq 1, & \forall t \end{cases}

(7)

If there is a problem with the electricity network, electricity cannot flow out or into it. Equation (8) represents this limitation in its modelled form. Blackouts are the most severe form of power supply interruption. They result from an imbalance between the production and use of electricity. Power cuts can last a few minutes to several days or weeks [42]. In the context of our work, simultaneous over-power, also known as over-power with or without outages, is represented by Equation (9).

\begin{cases} \bar{P}_{t,\max}^{\mathrm{grid,home}} \leq O_{t}^{T}, & \forall t \\ \bar{P}_{t,\max}^{\mathrm{home,grid}} \leq O_{t}^{T}, & \forall t \end{cases}

(8)
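The interplay between the binary direction variables of Equation (7) and the outage limit of Equation (8) can be illustrated with a small Python helper. This is a sketch under simplifying assumptions: the outage indicator is reduced to a boolean `grid_available` flag, and the function name and arguments are hypothetical.

```python
from typing import Tuple


def grid_limits(
    delta1: int,            # 1 if the home imports from the grid, else 0
    delta2: int,            # 1 if the home exports to the grid, else 0
    p_import_max: float,    # maximum grid-to-home power
    p_export_max: float,    # maximum home-to-grid power
    grid_available: bool,   # False during an outage (simplified outage indicator)
) -> Tuple[float, float]:
    """Return the admissible (import, export) power limits for one time step."""
    if delta1 not in (0, 1) or delta2 not in (0, 1):
        raise ValueError("delta1 and delta2 must be binary")
    # Importing and exporting are mutually exclusive: delta1 + delta2 <= 1.
    if delta1 + delta2 > 1:
        raise ValueError("cannot import and export in the same time step")
    # During an outage both directions are forced to zero.
    if not grid_available:
        delta1 = delta2 = 0
    return delta1 * p_import_max, delta2 * p_export_max
```

In a MILP formulation the same logic would be expressed through constraints on δ₁ and δ₂ rather than through explicit branching.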

\left| \begin{array}{ll} P_{t}^{\mathrm{Utility-Grid}} = \max \left( 0, \ P_{t}^{PV} - \sum_{k=1}^{m} J_{a,k} \right) & \text{(without blackouts)} \\ P^{OT}(t, s) = \max \left( 0, \ P_{t}^{PV} + P_{t}^{BT,ch} - \sum_{k=1}^{m} J_{a,k} - P_{t}^{BT,dis} \right) & \text{(with blackouts)} \end{array} \right.

(9)

where O_t^T and P^OT(t, s) are outages and energy consumption in the grid, and P_t is the utility-grid power demand. P^OT(t, s) is the expected number of outages per year.
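For concreteness, the two rows of Equation (9) can be evaluated as in the Python sketch below. It reproduces the expressions as they are written, including the signs of the battery terms; the function names are hypothetical and the appliance loads J_a,k are passed as a plain list.

```python
from typing import Sequence


def utility_grid_power(p_pv: float, loads: Sequence[float]) -> float:
    """First row of Equation (9): PV output in excess of the appliance loads."""
    return max(0.0, p_pv - sum(loads))


def over_power(
    p_pv: float,
    loads: Sequence[float],
    p_bt_charge: float,
    p_bt_discharge: float,
) -> float:
    """Second row of Equation (9), with the battery terms signed as in the equation."""
    return max(0.0, p_pv + p_bt_charge - sum(loads) - p_bt_discharge)
```

The `max(0, ...)` guard means that a deficit (loads exceeding generation) is reported as zero over-power rather than as a negative value.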

2. Multi-Agent Reinforcement Q-Learning

Recent advances in deep neural networks have propelled reinforcement learning (RL) to the forefront of artificial intelligence achievements, including victories over human opponents. Despite these advances, a single RL agent can still fail to complete several critical tasks. RL agents must work together to achieve a common goal, speed up the learning process, protect user privacy, demonstrate resilience in the face of failures and hostile attacks, and overcome the physical limitations that arise when an agent operates alone [44]. These challenges are studied using multi-agent reinforcement learning, or multi-agent cooperative RL. In this framework, agents collaborate to develop policies that maximize the rewards obtained by the team while interacting with each other in a random and unpredictable environment. All agents receive the same reward for performing their tasks. The multi-agent MDP in this framework can be characterized by the tuple (S, {A_i}_{i=1}^{N}, P, r, γ). Each agent i observes the common state and selects an action from its own action set A_i according to its local policy (π_i) [45]. This optimization problem requires continuous interaction between several different parties. In this setting, each consumer or prosumer chooses its own energy exchange and load scheduling strategies in an unpredictable and constantly changing environment. P2P and DSM exchanges establish a market where public knowledge of energy is only sometimes available, which is why it can be challenging to draw correct conclusions. Model-free deep reinforcement learning (DRL) is one of the most sophisticated machine learning approaches [51]; it makes it possible to choose the most effective strategy based on previous experience. First, we adapt the structure of the DSM model to accommodate the P2P energy transactions and DSM problems in the DRL design. Second, we apply a multi-agent DRL algorithm in a decentralized training environment to find the most efficient home management strategies (Figure 3).

Markov Decision Process Formulation

The Markov decision process (MDP) models sequential, discrete, and stochastic decision making. In this paradigm, a decision-maker, or agent, resides in an environment that changes randomly as a function of the agent's actions. The MDP consists of the state (S_M), the action (a_M), and the reward (R_M) (see Equations (10)–(12)).

State: Sⁿ_M(t)

The elements that make up the state vector Sⁿ_M(t) of household n at time t are the photovoltaic production, the state of charge of the BT, and the purchase and sale prices provided by the regional electricity company. The following is the equation form of this vector:

S^{n}_{M}(t) = [E_{t}^{n}, SoC_{n}^{BT}, B_{ID=n}^{s}(t), Q_{ID=n}^{s}(t), Y_{t}]

(10)

Action: aⁿ_M

Equation (11) gives the action vector aⁿ_M of household n at time t and in scenario s. It comprises home energy trading (K₁), the load scheduling of the six houses (K₂), and BT energy storage (K₃). Figure 3 illustrates the action vectors and ANN for six homes at a given time t and in a given situation.

a^{n}_{M} (t, s) = [K_{1}^{n}, K_{2}^{n}, K_{3}^{n}]

(11)
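The state and action vectors of Equations (10) and (11) can be held in two small records. The sketch below is only illustrative: the field names are hypothetical, the mapping of B and Q to the purchase and sale prices is an assumption, and Y_t is kept as an opaque feature because the text does not define it.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class HouseholdState:
    """State vector of Equation (10) for household n at time t."""
    pv_energy: float   # E_t^n, photovoltaic production
    soc_bt: float      # SoC_n^BT, battery state of charge
    buy_price: float   # B^s_{ID=n}(t); assumed here to be the purchase price
    sell_price: float  # Q^s_{ID=n}(t); assumed here to be the sale price
    y_t: float         # Y_t, not defined in the text, kept as an opaque feature


@dataclass(frozen=True)
class HouseholdAction:
    """Action vector of Equation (11) for household n."""
    trade: float     # K1, home energy trading
    schedule: float  # K2, load scheduling
    storage: float   # K3, BT energy storage


def to_vector(record):
    """Flatten a state or action record into a plain list of floats."""
    return list(astuple(record))


state = HouseholdState(pv_energy=1.2, soc_bt=0.5, buy_price=0.20, sell_price=0.08, y_t=0.0)
action = HouseholdAction(trade=0.3, schedule=1.0, storage=-0.5)
```

The state has five entries and the action has three, matching the dimensions of the two vectors as written in the equations.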

Reward: Rⁿ_M(t)

The reward function Rⁿ_M(t) is the immediate benefit earned by household n at time t for executing the action aⁿ_M in the state Sⁿ_M. It is expressed as a function of the state and the action:

R^{n}_{M}(t, s) = [S^{n}_{M}(t), a^{n}_{M}(t)]

(12)
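Equation (12) states only that the reward depends on the state and the action; the text gives no closed-form expression. As a purely hypothetical illustration, a common choice in energy trading problems is the net revenue of one time step. The function name and its arguments below are assumptions, not the authors' definition.

```python
def step_reward(bought_kwh, sold_kwh, buy_price, sell_price):
    """Hypothetical immediate reward: revenue from sold energy minus cost of bought energy."""
    return sold_kwh * sell_price - bought_kwh * buy_price


# Buying 2 kWh at 0.25 per kWh and selling nothing yields a negative reward.
reward_when_buying = step_reward(bought_kwh=2.0, sold_kwh=0.0, buy_price=0.25, sell_price=0.10)
```

With such a reward, maximizing the cumulative reward is the same as minimizing the household's net electricity cost.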

Figure 4 illustrates the proposed artificial neural network (ANN) model, which has an input layer containing five neurons, three hidden layers with seventeen neurons each, and an output layer containing one neuron. Each layer calculates the weighted sum of its input vector by applying the weights (w_i) to the inputs and adding the bias. The transfer function takes this weighted sum and passes the result to the next layer. This research work used the ReLU (rectified linear unit) transfer function. A value of 0.005 was chosen for the coefficient in the optimization procedure. RL is the ML technique used here to reach decisions in a non-deterministic setting. The agent interacts with the environment, as shown in Figure 4, and the environment in turn influences the agent's actions. This process continues until no further reward can be obtained from the environment. Agents look for the most efficient way to increase their reward. This investigation used experimental control as a source of information for the MDP. An agent's current state and the action it takes in that state determine its next state. In the field of RL decision making, Q-learning is a proven technique. Q-learning computes a Q-value for each pair of state Sⁿ_M(t) and action aⁿ_M (see Equation (13)). The Bellman equation provides the most accurate approximation for rewards and updates [50].

Q^{*}_{k^{*}}(S^{n}_{M}, a^{n}_{M}) = R^{n}_{M}(S^{n}_{M}, a^{n}_{M}) + g \max Q(S^{T+1}_{M}, a^{T+1}_{M})

(13)

Equation (13) defines the optimal Q-value Q(Sⁿ_M(t), aⁿ_M) as the sum of the immediate reward R(Sⁿ_M(t), aⁿ_M) and the discounted maximum future reward g max Q(S^{T+1}_M, a^{T+1}_M), where g is the discount coefficient in the range [0, 1]. The state-action table, also known as the Q-value table, records all Q(Sⁿ_M(t), aⁿ_M) values and any changes to these values. At time t, the agent chooses an action from the table of Q-values and then uses the Bellman equation to update the corresponding entry (Q-value), as given in Equation (14).

\begin{array}{l} Q(S^{n}_{M}, a^{n}_{M}) \leftarrow (1 - \Delta\zeta) Q(S^{n}_{M}, a^{n}_{M}) + \Delta\zeta [R^{n}_{M} + g \max Q(S^{T+1}_{M}, a^{T+1}_{M})] \\ k^{*} = \arg\max Q(S^{T+1}_{M}, a^{T+1}_{M}) \end{array}

(14)
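The update of Equation (14) can be sketched for a tabular Q-function as follows. This is a minimal illustration, assuming hashable state labels and a small discrete action set; the names `q_update`, `lr` (for Δζ), and the example states and actions are hypothetical.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, lr=0.1, g=0.9):
    """Apply Q(s, a) <- (1 - lr) * Q(s, a) + lr * (reward + g * max_a' Q(s', a'))."""
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] = (1 - lr) * q[(state, action)] + lr * (reward + g * best_next)
    return q[(state, action)]


actions = ("buy", "sell", "store")
q = defaultdict(float)          # every unseen (state, action) pair starts at 0.0
q[("s1", "sell")] = 2.0         # pretend the agent already values selling in state s1

# One update for taking "store" in s0, receiving reward 1.0 and landing in s1.
new_value = q_update(q, "s0", "store", reward=1.0, next_state="s1", actions=actions)
```

Here the target is 1.0 + 0.9 × 2.0 = 2.8, and with Δζ = 0.1 the entry moves one tenth of the way from 0 towards it, giving about 0.28.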

In Equation (14), Δζ (∈ [0, 1]) is the learning rate. When Δζ = 1, the agent discards the previous Q-value and replaces it with the current reward plus the maximum discounted future reward. When Δζ = 0, the Q-value is left unchanged, so the agent relies only on what it has already learned; values between 0 and 1 balance the old estimate against the new information. By repeatedly updating Q(Sⁿ_M(t), aⁿ_M), the agent learns the optimal action k*. When the actor network (ANe) forms the deterministic policy, the loss L(θ) (θ: soft update) measures the gap between the two sides of the Bellman equation (see Equation (15)). The larger this gap, the larger the loss. The magnitude of the loss function can be reduced by training [53]. Algorithm 1 illustrates the DRL learning approach employed by the proposed multi-agent DRL algorithm.

L^{n}(\theta_{Q}^{n}) = [(g_{t}^{n} - Q(S^{n}_{M}, a^{n}_{M}, \theta_{Q}^{n}))^{2}]

(15)
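To make the network shape and the loss of Equation (15) concrete, the following sketch builds a 5-17-17-17-1 fully connected network with ReLU hidden layers in plain Python and evaluates the squared difference between a target and the network output. Several points are assumptions: the weights are random, the output layer is taken to be linear, only the five state features are fed to the network, and the target value is arbitrary. No training step is shown.

```python
import random

LAYER_SIZES = [5, 17, 17, 17, 1]  # input, three hidden layers, output


def init_network(sizes, seed=0):
    """Create one (weights, biases) pair per layer with small random weights."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def relu(x):
    """Rectified linear unit."""
    return x if x > 0.0 else 0.0


def forward(layers, inputs):
    """Weighted sum plus bias in every layer; ReLU on hidden layers, linear output."""
    activations = list(inputs)
    for index, (weights, biases) in enumerate(layers):
        sums = [sum(w * a for w, a in zip(row, activations)) + b
                for row, b in zip(weights, biases)]
        is_output = index == len(layers) - 1
        activations = sums if is_output else [relu(z) for z in sums]
    return activations[0]


def squared_loss(target, estimate):
    """Squared gap between the target and the current Q estimate, as in Equation (15)."""
    return (target - estimate) ** 2


network = init_network(LAYER_SIZES)
state = [1.2, 0.5, 0.20, 0.08, 0.0]   # five hypothetical state features
q_estimate = forward(network, state)
loss = squared_loss(target=1.0, estimate=q_estimate)
```

Training would adjust the weights to shrink this loss; an optimizer is deliberately left out to keep the sketch short.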

Algorithm 1: Energy Trading Community Approach

Input: Solar output, household power demand, temperature, electricity price (RTP, TOU), agent ID status, SoCBS
Output: The maximum function H(s’, a), the new case’s Q-value s’(V(s, a, s’)), output group N (s), Q-value records, optimal action a
Initialize memory G of size N;
Initialize preprocess function Q(s)
Initialize target networks Q(s’, a)
For iteration in [1, Max + 1]:
Episode: s = 1, 2, ……, M (s = ΣMi)
Get the initial state s0
Compute output groups N(s)
End
For q-value in N(s) do:
Convolute N(s)→w [i, a]
$q(i, a) \overset{step 5}{\leftarrow} q(i, a) + β \cdot ΔQ(a) + w(i, a) \overset{step 5}{\to} (14.3)$
End
For i = 1, 2, 3, …, n − 1 do:
$Max(a_i, a_j) + q(i * a_i * a_j) * Ø * max[Qs(r_i, a)] + V \to V$;
$Ø max[Qs(r_i, a)] + a V(s, a) + Q(s, a) \to Q$
$Q \overset{from i to n}{\leftarrow} Q / \sum w(i, a)$
$V \overset{from i to n}{\leftarrow} V / \sum w(i, a)$
//Execute operation in smart home environments and observe st + 1 (s’)
$ΔQ(s', a) = V(s, a, s') + ϕ H(s', a(s, a))$
$q(i*, a_i, a_j) \overset{step 5}{\leftarrow} q[i*][a_i][a_j] + β \cdot ΔQ(a) + w(i)$
//Select a limited set of K occurrences, 1 ≤ ε ≤ K;
for i = 1, 2, 3, …, n − 1 do:
If M (s = ΣMi) < ε
//Update target
$A[a_i, a_j] = max A[a_i, a_j] * q(i * a_i * a_j)$
S’→S
End
Else
$a_i, a_j = ε$
End
End
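Algorithm 1 is given only as pseudocode. The sketch below is not a reimplementation of it: it is a much simplified, single-agent tabular Q-learning loop with ε-greedy action selection on a toy battery environment, meant only to show the explore, act, observe, and update cycle. The tariff values, the three-level battery, and all function names are hypothetical, and the replay memory, target networks, and grouping steps of Algorithm 1 are omitted.

```python
import random

PRICES = (0.10, 0.30)                      # hypothetical cheap and expensive tariffs
ACTIONS = ("charge", "idle", "discharge")
MAX_SOC = 2                                # battery holds 0, 1 or 2 energy units


def step(state, action):
    """Toy environment: move one energy unit and alternate the tariff period."""
    soc, period = state
    price = PRICES[period]
    reward = 0.0
    if action == "charge" and soc < MAX_SOC:
        soc, reward = soc + 1, -price      # buying energy costs money
    elif action == "discharge" and soc > 0:
        soc, reward = soc - 1, price       # selling energy earns money
    return (soc, 1 - period), reward


def train(steps=20000, lr=0.1, g=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {((soc, p), a): 0.0
         for soc in range(MAX_SOC + 1) for p in (0, 1) for a in ACTIONS}
    state = (0, 0)
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)                         # explore
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])   # exploit
        next_state, reward = step(state, action)
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        q[(state, action)] = (1 - lr) * q[(state, action)] + lr * (reward + g * best_next)
        state = next_state
    return q


def greedy(q, state):
    """Best known action in a state."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


q_table = train()
```

In this toy setting the learned policy charges an empty battery when the tariff is cheap and discharges when it is expensive, which is the qualitative behavior the case study expects from the storage action.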

3. Case Study

3.1. System Initialization

The benefits of the Q-learning algorithm for a smart home user and its ability to utilize surplus renewable energy were examined statistically using a community of six users and a power pool. The smart home users and the power pool exchanged energy every 12 h. The technique was found to be accurate at 12 h and 24 h. The suggested control approach was tested in MATLAB/Simulink and found to meet the load demands. The system simulated solar photovoltaic data (irradiance and temperature) using Tunisian weather forecasts (Figure 5e,f). Regional weather data were updated every two days. The simulation results were used to evaluate this strategy. The off-grid supply was one-way. Communal photovoltaic solar power was sufficient. Solar energy and the grid charged the batteries of each residential community with electricity. The optimizer used the batteries to reduce electricity costs during peak hours (Figure 5d). Table 2 shows the system parameters. Solar PV and the grid recharged the batteries during peak hours, producing bidirectional power from the grid. The solar PV responded to the load, while the batteries sold electricity to the grid (a house). Solar energy and grid energy charged the off-grid batteries. Figure 5a,c show the PDFs fitted to the data with red lines. These distributions used 100 standard basis estimators evenly spaced between the two random variables. Figure 2a shows the two measures of network outage incidence from [49]. Figure 2b shows that the clusters had the lowest Davies–Bouldin values, but were too far apart. It also shows the typical network outages associated with the proposed strategy. The non-deferrable appliance demand comes from Ref. [50], which presents exact statistics of a smart home over one year. System losses are not modeled separately because the raw data already include them. Load averaging converted the data to a 12 h resolution. The Tunisian grid locations provided solar irradiation and ambient temperature in this scenario.
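The load averaging mentioned above can be illustrated with a short helper that averages consecutive samples into 12 h blocks. The sketch assumes hourly input data, which the text does not state, and the function and variable names are hypothetical.

```python
def average_blocks(values, block_size=12):
    """Average consecutive samples into blocks, e.g. hourly data into 12 h means."""
    if len(values) % block_size != 0:
        raise ValueError("number of samples must be a multiple of the block size")
    return [sum(values[i:i + block_size]) / block_size
            for i in range(0, len(values), block_size)]


# One hypothetical day: 0.5 kW during the first 12 h, 1.5 kW during the last 12 h.
hourly_load_kw = [0.5] * 12 + [1.5] * 12
half_day_load_kw = average_blocks(hourly_load_kw)
```

For this example day, the two 12 h averages are 0.5 kW and 1.5 kW, one value per trading interval.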

3.2. Outcomes Considering a Grid and with Blackouts

Depending on the strategy, integrating the storage system will only be helpful if the current electricity system can operate normally without any breakdowns or problems. Installing a solar panel can save money. As the electricity system is not unstable, the optimization problem revealed that the storage system was not necessary. Solar panels offer users two distinct ways of reducing costs. Smart homes can operate autonomously when demand is exceptionally high. Selling the extra energy back to the grid reduces the overall energy bill. Figure 6a,b illustrate the order in which the PV system powers appliances and the average number of appliances powered by the SSE batteries. At lunchtime, the PV and energy storage systems do their utmost to ensure that households have enough electricity. When electricity demand is high, the system can make up for any shortfall in supply (see Figure 6c,d). The following section examines how renewable energy storage can help reduce operating costs. The potential for renewable energy storage in Saudi Arabia is highly variable, ranging from 0% to 100%. We assess three possible degrees of cost reduction: insignificant, moderate, and considerable (up to 50%). Negligible cost reduction represents less than 50% of the total cost. The individual house model calculates the total cost of each system, assuming there is no energy sharing between houses. Similarly, the proposed DL algorithm calculates the costs associated with the exchange of resources between households and utilities. Figure 6e,f illustrate how the significant cost difference between these two types of systems translates into considerable energy savings for the grid, demonstrating the link between the savings for the grid and the savings for households in the event of high energy demand (non-renewable energy storage). Each house stores a different amount of energy (See Table 2). 
The lifespan of the storage device can be reduced if this minimum power level is only occasionally maintained. It is in the home’s interest to charge the energy storage device during the earliest periods, even if this is unnecessary, as this saves money and ensures that energy levels remain above this minimum level. The cost of inefficiently charging storage devices drives up overall energy prices. This is why storage penetration rates above a certain level do not affect savings. Typical reductions in energy consumption rarely reach 5%.

The initial integration of energy storage into the grid may result in lower overall energy costs; however, this benefit diminishes as storage capacity increases. It is improbable that grid storage will result in significant energy savings. Studies are being carried out both on exchanges with the grid and on the influence of solar panel penetration on costs and savings. Possible prices and protection will be directly proportional to the amount of energy that can be stored. Figure 6e,f illustrate the impact of the percentage of households equipped with storage on the overall price of the microgrid system. They show that load inefficiencies increase energy costs for families using storage (the cost curves rise as storage usage levels rise). An increase in storage requirements leads to an increase in expenditure, whereas smaller energy storage minimizes overheads. Therefore, energy storage only makes financial sense if it is connected to renewable sources or mini-grids, in which case the use of storage devices reduces costs.

3.3. Low/High Solar Penetration

Figure 7a–d illustrate two innovative customer management and storage procedures from T = 0 to Td. Figure 7e,f compare retail and community costs when using solar energy at minimal levels. The local pool has less solar power, so the community price is more critical than in Figure 6. It will participate in the social exchange of energy and reduce the benefits of battery storage devices used by sophisticated consumers. The assumption that everyone will use the energy pool to obtain electricity is wrong. The intelligent user also incurs higher electricity costs than in the previous scenario. The proposed method can help consumers, as shown in Figure 7b. The trading behavior of the two agents is equivalent, suggesting that the algorithm is efficient. The proposed method has the potential to reduce costs for the energy community. Overall, the recommended approach is robust and comprehensive. Fuzzy Q-learning aims to stimulate the use of green energy sources. For example, when solar energy is abundant in the middle of the day, the surplus-to-demand ratio increases. Adding renewable energy penetration to the first scenario shows that the proposed algorithm can improve the share of renewables in the energy mix. Given that it only captures 30% of solar energy, the local energy pool may have to raise prices and discourage residents from participating in electricity trading (Figure 7b,c).

With minimal solar penetration, intelligent home agents exchange energy with the community energy pool (Figure 7c). There is no energy transfer between the energy pool and the intelligent user between Tc (4, 6) and Td (9). As the use of solar energy grows slowly, costs in the community increase. Consumers then trade or store electricity. These observations show that the algorithm works as expected. After accumulating empirical data, the agent optimizes its energy management based on the Q-values it has learned. Continuous online model-free procedures are conceivable but less efficient than global optimization. The recommended method can help smart customers reduce their electricity costs and simplify their use of solar energy. With the suggested fuzzy Q-learning algorithm, the use of renewable energy can grow if consumers earn more money by using renewable energy sources (Figure 7f).
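The dependence of the community price on the surplus-to-demand ratio can be sketched with a simple rule. The paper states only that the pool price lies between the feed-in tariff and the retail price and follows supply and demand; the linear interpolation below, the function name, and the numerical tariffs are assumptions made for illustration.

```python
def community_price(surplus_kwh, demand_kwh, feed_in_tariff, retail_price):
    """Hypothetical pool price: falls linearly from retail towards the feed-in tariff
    as the surplus-to-demand ratio rises from 0 to 1."""
    if demand_kwh <= 0:
        return feed_in_tariff
    ratio = min(max(surplus_kwh / demand_kwh, 0.0), 1.0)
    return retail_price - ratio * (retail_price - feed_in_tariff)


# Low solar penetration: little surplus, so the price stays close to retail.
low_pv_price = community_price(surplus_kwh=1.0, demand_kwh=10.0,
                               feed_in_tariff=0.08, retail_price=0.30)
# High solar penetration: large surplus, so the price approaches the feed-in tariff.
high_pv_price = community_price(surplus_kwh=9.0, demand_kwh=10.0,
                                feed_in_tariff=0.08, retail_price=0.30)
```

Under this rule a scarce solar surplus raises the community price, which is consistent with the low-penetration behavior described above.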

3.4. Discussion

There are several ways in which communities can reduce their monthly energy costs. This study presents an innovative, energy-efficient neighborhood with a shared electricity supply. Thanks to the proposed pricing strategy, the price will stabilize at wholesale and retail levels. According to the data, this pricing structure can increase revenues for energy consumers while reducing their monthly bills. SG 2.0 applications, such as community energy trading, can benefit from reinforcement learning. With this capability, SG 2.0 can refine its energy minimization strategies. Consumers and prosumers have benefited from the improvements brought about by the proposed SG 2.0 case study and the production of renewable domestic energy. Using the community energy pool reduces electricity costs, and savvy consumers can even benefit from selling surplus energy back to the grid. In the SG 2.0 vision, users can transform from passive consumers into active participants. Users can have a say in the cost of electricity and generate more revenue through sophisticated storage and control technologies. The results demonstrate the cost-effectiveness of reinforcement Q-learning in meeting energy demand under both high and low penetration of photovoltaics. Thus, the Q-learning method can solve the persistent problems associated with SG 2.0.

4. Conclusions and Future Works

This article offers an intelligent peer-to-peer energy trading approach with smart house prosumers, non-smart consumers, and a local energy pool to enhance neighborhood energy sharing. The community model under consideration enables the sale of surplus energy to emphasize the advantages of adopting renewable energy. There was also a presentation of the demand- and surplus-based pricing approach. The intelligent power community suggests a Q-learning-based boosting algorithm. The AI algorithm aims to assist societies in making quick commercial decisions. To assess the efficacy of the suggested energy community, numerical analyses were carried out under low and high PV penetration. A single-home PV-BS with DR and a grid loss have also been modeled. This work presents an intelligent community energy system, considering traditional and innovative users and the local energy reserve. The planned communal energy unit would exchange the production and consumption of renewable energy. We also show and discuss P2P energy demand and surplus pricing for energy consumption and peak hours. The proposed strategy uses a reinforcement technique based on demand and surplus pricing for energy consumption and peak hours. To meet the energy community’s needs, it was necessary to perform numerical evaluations for both low and high PV penetration. Through modelling of the PV-BT optimization problem for a connected home under grid interruptions, DR limitations and scenarios, the results show that the proposed technique can develop the PV-BT system while considering reliability, DR, and cost to ensure consumer and customer reliability.

Future development in this area will focus on developing V2H systems for renewable energy capable of responding to peak demand and falling prices. V2H is capable of recovering any surplus energy. Power cuts are less likely, thanks to the electricity produced by V2H systems during peak hours and the deployment of HEMS. The proposed technology paves the way for intelligent system management. There are differences in domestic energy production and net electricity consumption between V2H systems with and without photovoltaics. We can better manage energy consumption and costs by integrating real-time systems without photovoltaics.

Author Contributions

Conceptualization, S.B.S.; methodology, S.B.S.; validation, M.M.; investigation, M.M.; resources, M.M.; writing—original draft preparation, M.M.; writing—review and editing, S.B.S.; visualization, S.B.S.; project administration, M.M.; funding acquisition, M.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research work was funded by Institutional Fund Projects under Grant no. (IFPIP-1300-156-1443). Therefore, the authors gratefully acknowledge technical and financial support from the Ministry of Education and King Abdulaziz University, DSR, Jeddah, Saudi Arabia.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Ben Slama, S. Design and implementation of home energy management system using vehicle to home (H2V) approach. J. Clean. Prod. 2021, 312, 127792.
2. Sami, B.S.; Sihem, N.; Bassam, Z. Design and implementation of an intelligent home energy management system: A realistic autonomous hybrid system using energy storage. Int. J. Hydrogen Energy 2018, 43, 19352–19365.
3. Nasri, S.; Ben Slama, S.; Yahyaoui, I.; Zafar, B.; Cherif, A. Autonomous hybrid system and coordinated intelligent management approach in power system operation and control using hydrogen storage. Int. J. Hydrogen Energy 2017, 42, 9511–9523.
4. Padghan, P.R.; Daniel, S.A.; Pitchaimuthu, R. Grid-tied energy cooperative trading framework between Prosumer to Prosumer based on Ethereum smart contracts. Sustain. Energy Grids Netw. 2022, 32, 100860.
5. Cosic, A.; Stadler, M.; Mansoor, M.; Zellinger, M. Mixed-integer linear programming based optimization strategies for renewable energy communities. Energy 2021, 237, 121559.
6. Marocco, P.; Ferrero, D.; Martelli, E.; Santarelli, M.; Lanzini, A. An MILP approach for the optimal design of renewable battery-hydrogen energy systems for off-grid insular communities. Energy Convers. Manag. 2021, 245, 114564.
7. Molina, J.D.; Buitrago, L.F.; Téllez, S.M.; Giraldo, S.; Uribe, J.A. Demand Response Program Implementation Methodology: A Colombian Study Case. Trans. Energy Syst. Eng. Appl. 2022, 3, 13–19.
8. Mensin, Y.; Ketjoy, N.; Chamsa-Ard, W.; Kaewpanha, M.; Mensin, P. The P2P energy trading using maximized self-consumption priorities strategies for sustainable microgrid community. Energy Rep. 2022, 8, 14289–14303.
9. Guo, J.; Tan, J.; Li, Y.; Gu, H.; Liu, X.; Cao, Y.; Yan, Q.; Xu, D. Decentralized Incentive-based multi-energy trading mechanism for CCHP-based MG cluster. Int. J. Electr. Power Energy Syst. 2021, 133, 107138.
10. Spiliopoulos, N.; Sarantakos, I.; Nikkhah, S.; Gkizas, G.; Giaouris, D.; Taylor, P.; Rajarathnam, U.; Wade, N. Peer-to-peer energy trading for improving economic and resilient operation of microgrids. Renew. Energy 2022, 199, 517–535.
11. Guo, Z.; Qin, B.; Guan, Z.; Wang, Y.; Zheng, H.; Wu, Q. A High-Efficiency and Incentive-Compatible Peer-to-peer Energy Trading Mechanism. IEEE Trans. Smart Grid 2023, 2023, 326680.
12. Liu, J.; Long, Q.; Liu, R.-P.; Liu, W.; Hou, Y. Online distributed optimization for spatio-temporally constrained real-time peer-to-peer energy trading. Appl. Energy 2023, 331, 120216.
13. Munahar, S.; Purnomo, B.C.; Köten, H. Fuel Control Systems for Planetary Transmission Vehicles: A Contribution to the LPG-fueled Vehicles Community. Mech. Eng. Soc. Ind. 2021, 1, 14–21.
14. Akter, M.; Mahmud, M.; Haque, M.; Oo, A.M. An optimal distributed energy management scheme for solving transactive energy sharing problems in residential microgrids. Appl. Energy 2020, 270, 115133.
15. Suthar, S.; Cherukuri, S.H.C.; Pindoriya, N.M. Peer-to-peer energy trading in smart grid: Frameworks, implementation methodologies, and demonstration projects. Electr. Power Syst. Res. 2023, 214, 108907.
16. Thomas, H.; Sun, H.; Kazemtabrizi, B. Closest Energy Matching: Improving peer-to-peer energy trading auctions for EV owners. IET Smart Grid 2021, 4, 445–460.
17. Dileepan, V.; Jayakumar, J. Analysis of performance improvement in energy storage system for electric vehicles: A review. Int. J. Electr. Hybrid Veh. 2020, 12, 315.
18. Rodriguez, R.; Osma, G.; Bouquain, D.; Solano, J.; Ordoñez, G.; Roche, R.; Paire, D.; Hissel, D. Sizing of a fuel cell–battery backup system for a university building based on the probability of the power outages length. Energy Rep. 2022, 8, 708–722.
19. Zeng, B.; Liu, Y.; Xu, F.; Liu, Y.; Sun, X.; Ye, X. Optimal demand response resource exploitation for efficient accommodation of renewable energy sources in multi-energy systems considering correlated uncertainties. J. Clean. Prod. 2021, 288, 125666.
20. Wu, X.; Jiao, D.; Du, Y. Automatic Implementation of a Self-Adaption Non-Intrusive Load Monitoring Method Based on the Convolutional Neural Network. Processes 2020, 8, 704.
21. Ayotunde, A.; Adeyemo, A.; Amusan, O. Demand side management in future Smart Grid: A review of current state-of the-art. In Proceedings of the 13th International Conference on Applied Energy, Bangkok, Thailand, 29 November–2 December 2021.
22. Liu, Y.; Liu, C.; Shen, Y.; Zhao, X.; Gao, S.; Huang, X. Non-intrusive energy estimation using random forest based multi-label classification and integer linear programming. Energy Rep. 2021, 7, 283–291.
23. Sheffrin, A. Empirical Evidence of Strategic Bidding in the California ISO Real-time Market. Electr. Pricing Transit. 2002, 267–281.
24. Yu, B.; Sun, F.; Chen, C.; Fu, G.; Hu, L. Power demand response in the context of smart home application. Energy 2021, 240, 122774.
25. Qiu, D.; Ye, Y.; Papadaskalopoulos, D.; Strbac, G. Scalable coordinated management of peer-to-peer energy trading: A multi-cluster deep reinforcement learning approach. Appl. Energy 2021, 292, 116940.
26. Tushar, W.; Yuen, C.; Mohsenian-Rad, H.; Saha, T.; Poor, H.V.; Wood, K.L. Transforming Energy Networks via peer to peer energy trading: Potential of game theoretic approaches. IEEE Signal Process. Mag. 2020, 35, 90–111.
27. Mohammadi, S.; Eliassen, F.; Zhang, Y. Effects of false data injection attacks on a local P2P energy trading market with prosumers. In Proceedings of the 2020 IEEE PES Innovative Smart Grid Technologies Europe (ISGT-Europe), The Hague, The Netherlands, 26–28 October 2020; pp. 31–35.
28. Shin, H.; Baldick, R. Plug-in electric vehicle to home (V2H) operation under a grid outage. IEEE Trans. Smart Grid 2017, 8, 2032–2041.
29. Sami, B.S. Intelligent Energy Management for Off-Grid Renewable Hybrid System Using Multi-Agent Approach. IEEE Access 2020, 8, 8681–8696.
30. Slimani, S.; Zhang, K. Selective auctioning using publish/subscribe for real-time bidding. In Proceedings of the 16th International Conference on Web Information Systems and Technologies, Online, 3–5 November 2020.
31. Al-Sorour, A.; Fazeli, M.; Monfared, M.; Fahmy, A.; Searle, J.R.; Lewis, R.P. Enhancing PV Self-Consumption Within an Energy Community Using MILP-Based P2P Trading. IEEE Access 2022, 10, 93760–93772.
32. Azim, M.I.; Tushar, W.; Saha, T.K.; Yuen, C.; Smith, D. Peer-to-peer kilowatt and negawatt trading: A review of challenges and recent advances in distribution networks. Renew. Sustain. Energy Rev. 2022, 169, 112908.
33. Wu, Y.; Liu, Z.; Li, B.; Liu, J.; Zhang, L. Energy management strategy and optimal battery capacity for flexible PV-battery system under time-of-use tariff. Renew. Energy 2022, 200, 558–570.
34. Roberts, M.B.; Bruce, A.; MacGill, I. Impact of shared battery energy storage systems on photovoltaic self-consumption and electricity bills in apartment buildings. Appl. Energy 2019, 245, 78–95.
35. Raj, B.D.; Sarkar, A.; Goswami, D. An efficient framework for brownout-based appliance scheduling in microgrids. Sustain. Cities Soc. 2022, 83, 103936.
36. Wang, J.; Zhong, H.; Xia, Q.; Li, G.; Zhou, M. Sharing Economy for Renewable Energy Aggregation. In Sharing Economy in Energy Markets; Springer: Singapore, 2022; pp. 107–142.
37. He, Z.; Tran, K.P.; Thomassey, S.; Zeng, X.; Xu, J.; Yi, C. Multi-objective optimization of the textile manufacturing process using deep-Q-network based multi-agent reinforcement learning. J. Manuf. Syst. 2021, 62, 939–949.
38. Samende, C.; Cao, J.; Fan, Z. Multi-agent deep deterministic policy gradient algorithm for peer-to-peer energy trading considering distribution network constraints. Appl. Energy 2022, 317, 119123.
39. Magrini, A.; Marenco, L.; Bodrato, A. Energy smart management and performance monitoring of a NZEB: Analysis of an application. Energy Rep. 2022, 8, 8896–8906.
40. Wu, Q.; Wang, F. Concatenate convolutional neural networks for non-intrusive load monitoring across the complex background. Energies 2019, 12, 1572.
41. Jaramillo, A.F.M.; Laverty, D.M.; Morrow, D.J.; del Rincon, J.M.; Foley, A.M. Load modeling and non-intrusive load monitoring to integrate distributed energy resources in low and medium voltage networks. Renew. Energy 2021, 179, 445–466.
42. Ali, I.H.O.; Ouassaid, M.; Maaroufi, M. Optimal appliance management system with renewable energy integration for smart homes. In Renewable Energy Systems Modelling, Optimization and Control; Springer: Berlin/Heidelberg, Germany, 2021; pp. 533–552.
43. Grover, H.; Panwar, L.; Verma, A.; Panigrahi, B.; Bhatti, T. A multi-head Convolutional Neural Network based non-intrusive load monitoring algorithm under dynamic grid voltage conditions. Sustain. Energy Grids Netw. 2022, 32, 100938.
44. Sisodiya, S.; Shejul, K.; Kumbhar, G.B. Scheduling of demand-side resources for a building energy management system. Int. Trans. Electr. Energy Syst. 2017, 27, e2369.
45. Chamandoust, H.; Hashemi, A.; Derakhshan, G.; Hakimi, M. Scheduling of Smart Micro Grid Considering Reserve and demand side management. In Proceedings of the 2018 Smart Grid Conference (SGC), Sanandaj, Iran, 28–29 November 2018.
46. Valdes, J.; Macia, Y.M.; Dorner, W.; Camargo, L.R. Unsupervised grouping of industrial electricity demand profiles: Synthetic profiles for demand-side management applications. Energy 2020, 215, 118962.
47. Vashishtha, S.; Ramachandran, M. Multicriteria evaluation of demand side management (DSM) implementation strategies in the Indian power sector. Energy 2006, 31, 2210–2225.
48. Panda, D.K.; Das, S. Smart grid architecture model for control, optimization and data analytics of future power networks with more renewable energy. J. Clean. Prod. 2021, 301, 126877.
49. Cin, E.D.; Carraro, G.; Volpato, G.; Lazzaretto, A.; Danieli, P. A multi-criteria approach to optimize the design-operation of Energy Communities considering economic-environmental objectives and demand side management. Energy Convers. Manag. 2022, 263.
50. Ahammed, T.; Khan, I. Ensuring power quality and demand-side management through IoT-based smart meters in a developing country. Energy 2022, 250, 123747.
51. Tai, C.-S.; Hong, J.-H.; Hong, D.-Y.; Fu, L.-C. A real-time demand-side management system considering user preference with adaptive deep Q learning in home area network. Sustain. Energy Grids Netw. 2021, 29, 100572.
52. Shah, Y.T. Simulation and optimization of Hybrid Renewable Energy Systems. In Hybrid Power; CRC Press: Boca Raton, FL, USA, 2021; pp. 535–614.
53. Hsu, C.-H.; Eshwarappa, N.M.; Chang, W.-T.; Rong, C.; Zhang, W.-Z.; Huang, J. Green communication approach for the smart city using renewable energy systems. Energy Rep. 2022, 8, 9528–9540.

Figure 1. Diagram of the P2P energy trading model.

Figure 2. Monitoring home appliances.

Figure 3. Agent and environment interactions.

Figure 4. The proposed interactions in the multi-agent approach.

Figure 5. (a) Outage events; (b) home energy demand; (c) outage probability; (d) TOU and RTP tariffs; (e) solar radiation (W/m²); (f) outdoor temperature (°C).

Figure 6. (a) Energy demand from the grid to a home: typical day with TOU/blackouts; (b) battery charging and discharging for a typical day with TOU/blackouts; (c) excess power for a typical day with RTP/blackouts; (d) deferrable appliances during a typical day with RTP/Blackouts; (e) shiftable home appliances during a typical day with RTP/blackouts; (f) deferrable appliances during a typical day with RTP/blackouts; (g) battery state of charge; (h) energy demand during TOU.

Figure 7. (a) High-solar-penetration prosumers: SoC_BT comparison of two prosumers; (b) Prosumer trade action and storage action; (c) Retail price comparison between retailers and the community; (d) Retail price comparison between retailers and the community; (e) Trade and storage action 1; (f) Trade and storage action 2.

Table 1. The three P2P market approaches described above.

P2P Market	Advantages	Limitations	References
Decentralized market (DeMark)	Individual customers interact and communicate with each other directly. There is no obligation to exchange data with other parties, which enhances user privacy. Scalability is increased, since customers can join the P2P marketplace whenever they wish.	Achieving the highest possible overall revenue is more difficult. It may be difficult to keep track of decentralized users.	[40,42,43,44,45,46,47,48,49,50,51,52,53]
Centralized market (Ce-Mark)	A centralized market revolves around the market coordinator, who directly determines the number of inputs and products and distributes the benefits among the many users, achieving the highest possible level of social welfare through the microgrid. Management is simple. The use of available energy sources is fully democratized.	Because of data sharing, customer privacy may be compromised. Optimization aims to maximize overall benefits, which may mean ignoring specific user requirements to achieve this objective.	[19,20,21,22,23,24]
Distributed market (DiMark)	In a distributed market, the market coordinator exercises indirect control over user energy exchanges and regulates user behavior through price signals. Distributed markets are halfway between centralized and decentralized markets. Such a system can indirectly influence the behavior of users while preserving their privacy and individuality.	In determining market pricing signals, account must be taken of user behavior, actual market processes, and the need to minimize the negative consequences of market dispersion.	[25,26,27,28,29,30,31,32,33,34,35,36,37,38,39]
Our approach (DeMark)	Customers/prosumers can enter the P2P market at any time, improving adaptability.	Discussions	-

Table 2. Input parameters.

Items	Parameters	Items	Parameters
System technical parameters		PV array specification costs
PV rated power	1.0 kW	Whole capital	1130 $/kW
Interest rate	4.80%	Total maintenance per year	5.001 $/kW
PV system lifetime (years)	25.0	Replacement	398.31 $/kWh
Rated capacity g^rated_SPV (kW)	8.02 kW	Expected lifetime (years)	21
Investment cost (δ_PV) ($/kW)	769.0 $/kW	BT array specification costs
PV cell numbers	Ns 3; Np 6	Whole capital	280 $/kW
P_Grid,max (kW)	9725 kW	Total maintenance per year	14.2 $/kW
Maximum G₂H/H₂G (P^HG, P^HG)	10 kW	Replacement	305 $/kWh
PV efficiency (ηPV) (pu)	0.13	Expected lifetime (years)	11
Max rated PV array power (kW)	4.2 kW	Whole capital	1130 $/kW
BS depth of discharge (DBS) (pu)	0.6	BS charge efficiency (ηBS) (pu)	0.97
		BS discharge efficiency (ηBS) (pu)	0.98
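
As a rough illustration of how the battery storage entries in Table 2 combine, the following minimal Python sketch computes the round-trip efficiency and the usable energy window. The class and function names, and the 10 kWh nominal capacity in the usage lines, are hypothetical and are not taken from the paper. Only the depth of discharge (0.6), charge efficiency (0.97), and discharge efficiency (0.98) come from Table 2. The sketch also assumes the usual reading of depth of discharge as the fraction of nominal capacity that may be cycled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BatteryParams:
    """Battery storage (BS) values listed in Table 2 (per unit)."""
    depth_of_discharge: float = 0.6     # DBS
    charge_efficiency: float = 0.97     # BS charge efficiency
    discharge_efficiency: float = 0.98  # BS discharge efficiency


def round_trip_efficiency(p: BatteryParams) -> float:
    """Fraction of charging energy recovered after one charge/discharge cycle."""
    return p.charge_efficiency * p.discharge_efficiency


def usable_energy_kwh(p: BatteryParams, nominal_capacity_kwh: float) -> float:
    """Stored energy that may be cycled, limited by the depth of discharge."""
    return p.depth_of_discharge * nominal_capacity_kwh


def charging_energy_kwh(p: BatteryParams, nominal_capacity_kwh: float) -> float:
    """Input energy needed to refill the usable window, including charge losses."""
    return usable_energy_kwh(p, nominal_capacity_kwh) / p.charge_efficiency


def delivered_energy_kwh(p: BatteryParams, nominal_capacity_kwh: float) -> float:
    """Energy reaching the load when the usable window is fully discharged."""
    return usable_energy_kwh(p, nominal_capacity_kwh) * p.discharge_efficiency


if __name__ == "__main__":
    params = BatteryParams()
    capacity_kwh = 10.0  # hypothetical nominal capacity, not a value from Table 2
    print("round-trip efficiency:", round_trip_efficiency(params))
    print("usable energy (kWh):", usable_energy_kwh(params, capacity_kwh))
    print("charging energy (kWh):", charging_energy_kwh(params, capacity_kwh))
    print("delivered energy (kWh):", delivered_energy_kwh(params, capacity_kwh))
```

With the Table 2 values, the round-trip efficiency is 0.97 × 0.98 ≈ 0.95, so roughly 5% of the energy routed through the battery is lost in each full cycle.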

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

