Welcome back to our series on self-organizing wireless networks! In part 1 of this series, we introduced the concept and concerns of wireless networks, and discussed the importance of distributed solutions. In this post, we’ll be delving into the topic of distributed link scheduling, a key component in the organization of wireless multihop networks. We’ll first explore how classic distributed schedulers work, and then how they can be enhanced with machine learning and graph neural networks. Plus, we’ll discuss the implications of this approach for the future of wireless networking and distributed systems. So sit back, relax, and get ready to dive into the world of distributed link scheduling in wireless networks!

In wireless multihop networks, the responsibility of networking falls on user devices themselves. They not only transmit and receive their own data packets, but also help relay traffic for other devices. This self-organizing capability is achieved through two core functionalities: distributed routing and distributed scheduling.

Routing involves finding the optimal path for packets to travel from their source to destination, such as the solid green path in Fig. 5 (left), while link scheduling determines which links should be activated in each time slot, as illustrated by the solid edges in Fig. 5 (right).

However, the unpredictable and dynamic nature of wireless networks makes routing and scheduling more challenging than in wired networks. Link scheduling, in particular, is limited by the fact that a node with a single radio transceiver can only communicate with one of its neighbors in a given time slot.

In the following section, we will discuss a simple form of distributed link scheduling.

In wireless multihop networks, links share the same wireless medium, which means that if two nearby links transmit simultaneously, they will interfere with each other and fail to communicate successfully. Therefore, it is crucial to avoid scheduling conflicting links to transmit at the same time.

To represent these interference relationships between links in the network, we use a conflict graph. The conflict graph is a graph where each vertex represents a link in the network, and an edge exists between two vertices if the corresponding links conflict with each other. The edge can represent either physical or logical interference. Figure 6 shows an example of mapping a wireless multihop network to its conflict graph.
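As a rough illustration of this mapping, the sketch below builds a conflict graph under one simple assumption: two links conflict whenever they share a device, which reflects the single-transceiver constraint. Real conflict graphs may also contain edges for physical interference between nearby links. The function name `build_conflict_graph` and the toy topology are hypothetical.

```python
from itertools import combinations

def build_conflict_graph(links):
    """Map a list of links (pairs of device names) to a conflict graph.

    Assumption: two links conflict only if they share a device.
    Returns an adjacency dict {link: set of conflicting links}.
    """
    adj = {link: set() for link in links}
    for a, b in combinations(links, 2):
        if set(a) & set(b):  # the two links have a device in common
            adj[a].add(b)
            adj[b].add(a)
    return adj

# Toy network: a chain of five devices A-B-C-D-E with four links.
links = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
conflict_graph = build_conflict_graph(links)
for link, neighbors in conflict_graph.items():
    print(link, "conflicts with", sorted(neighbors))
```

Each vertex of the resulting graph is a link of the original network, so links A-B and C-D are not adjacent and could be activated in the same time slot under this model.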

Using a conflict graph allows us to formulate the link scheduling problem as an optimization problem on the graph, where we aim to find the maximum-weight independent set (MWIS). An independent set is a set of vertices in which no two vertices are adjacent (i.e., no two links conflict with each other), and the maximum-weight independent set is the independent set with the highest sum of vertex weights.

The vertex weight, which is defined as the utility of scheduling that link, represents the importance of that link in the network. By solving the MWIS problem, a valid schedule can be found that not only avoids conflicting links but also maximizes the total utility.

We can formally define the link scheduling problem as follows: given a conflict graph $\mathcal{G}(\mathcal{V}, \mathcal{E}, \mathbb{u})$, an optimal schedule $\boldsymbol{v}^*$ is found by solving the following MWIS problem:

\begin{align} & \boldsymbol{v}^* = \mathop{\operatorname{arg\,max}}\limits_{\boldsymbol{v}\subseteq \mathcal{V}} \,\, \sum_{v\in\boldsymbol{v}} u(v) \label{eq:mwis} \\ \text{s.t. } & (v_i, v_j)\notin \mathcal{E}\;, \forall \, v_i, v_j \in \boldsymbol{v}. \label{eq:mwis:constraint} \end{align}
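For a very small conflict graph, the problem above can be solved by exhaustive search. The following is a minimal sketch, assuming positive utilities and an adjacency-dict representation of the conflict graph; the function names and the three-vertex example are hypothetical.

```python
from itertools import combinations

def is_independent(subset, adj):
    """True if no two vertices in `subset` are adjacent in the conflict graph."""
    return all(b not in adj[a] for a, b in combinations(subset, 2))

def mwis_brute_force(adj, utility):
    """Exhaustive MWIS search; the running time is exponential in |V|."""
    best_set, best_weight = set(), 0.0
    vertices = list(adj)
    for size in range(1, len(vertices) + 1):
        for subset in combinations(vertices, size):
            weight = sum(utility[v] for v in subset)
            if weight > best_weight and is_independent(subset, adj):
                best_set, best_weight = set(subset), weight
    return best_set, best_weight

# Hypothetical conflict graph: a path a - b - c.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
utility = {"a": 2.0, "b": 3.0, "c": 2.0}
schedule, total = mwis_brute_force(adj, utility)
print(schedule, total)
```

Here the optimal schedule is the pair of end vertices `a` and `c` with total utility 4, which beats scheduling the single heaviest vertex `b`.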

However, finding the optimal solution to the MWIS problem is known to be NP-hard, which makes it computationally infeasible for large-scale networks. To address this challenge, researchers have proposed various heuristics to quickly find an independent set whose total weight is close to the maximum value. One promising approach is the Local Greedy Solver/Scheduler (LGS) (Joo 2012 TMC), a distributed method that iteratively selects the vertices with the largest weight in their local neighborhood and then removes the selected vertices and their interfering neighbors, until no vertices are left.

The LGS is particularly promising as it has been shown to achieve about $86\%-90\%$ of the optimal performance on some random graphs (Zhao 2022 TWC) while only requiring local information exchange among neighboring nodes. Figure 7 illustrates how LGS solves an MWIS problem step by step in a distributed manner. For a more in-depth understanding of this algorithm, readers may refer to the original paper (Joo 2012 TMC).
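The iterative procedure can be sketched as follows. This is a centralized simulation of the distributed rounds, not the message-passing protocol of (Joo 2012 TMC); breaking ties by vertex identifier is an assumption, and the function name and example graph are hypothetical.

```python
def local_greedy(adj, weight):
    """Simulate a local greedy scheduler on a conflict graph.

    adj:    {vertex: set of conflicting vertices}
    weight: {vertex: utility}
    In each round, a vertex joins the schedule if its (weight, id) pair
    is larger than that of every remaining neighbor.
    """
    remaining = set(adj)
    selected = set()
    while remaining:
        winners = {
            v for v in remaining
            if all((weight[v], v) > (weight[n], n)
                   for n in adj[v] if n in remaining)
        }
        selected |= winners
        # Remove the winners and all of their conflicting neighbors.
        removed = set(winners)
        for v in winners:
            removed |= adj[v]
        remaining -= removed
    return selected

# Hypothetical conflict graph: a path 0 - 1 - 2 - 3 - 4.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
weight = {0: 5.0, 1: 4.0, 2: 3.0, 3: 4.0, 4: 5.0}
print(local_greedy(adj, weight))
```

In this example, vertices 0 and 4 win the first round and silence vertices 1 and 3; vertex 2 then has no remaining neighbors and joins the schedule in the second round. Each vertex only compares itself with its direct neighbors, which is what makes the procedure suitable for distributed execution.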

Next, let’s find out how to use machine learning and graph neural networks to enhance LGS to produce even closer approximations to the optimal solution.

Graph Neural Networks (GNNs) are a powerful class of machine learning models specifically designed to operate on graph-structured data, such as social networks, transportation networks, and, most importantly, wireless networks. GNNs utilize the network structure to learn features of nodes and edges, as well as the global structure of the entire graph, making them a powerful tool for modeling complex networks.

At the heart of Graph Neural Networks (GNNs) is an iterative process of aggregating and updating the features of nodes and their neighbors. This process uses a set of learnable parameters shared across all nodes in the network, allowing for efficient and scalable learning. One of the key advantages of GNNs is that they can be executed in a distributed manner, where each node performs lightweight local computations based on its own information and that of its neighbors. This distributed execution is particularly important in wireless networks, where individual nodes often have limited computing resources and memory. Despite these limitations, the network as a whole can still perform powerful tasks, making GNNs an ideal solution for large-scale wireless networks.

In our approach, we utilize a single-layer graph convolutional neural network (GCN) to perform the link scheduling task. The local computation at a vertex (link) $v\in\mathcal{V}$ is shown in (3), where $u(v)$ is the utility of link $v$, $d(v)$ is the degree of $v$ on the conflict graph, $\mathcal{N}(v)$ represents the set of neighbors of $v$ on the conflict graph, and $\sigma(\cdot)$ is an element-wise activation function.

\begin{equation} z(v) = \sigma\Bigg( u(v) \, \theta_{0} + \Bigg[ u(v) - \sum_{v_i \in \mathcal{N}(v)}\frac{u(v_i)}{\sqrt{d({v})d({v_i})}} \Bigg]\theta_{1} \Bigg), \end{equation}

Notably, the GCN model only contains two learnable parameters $\theta_{0}$, $\theta_{1}$, allowing it to be executed on resource-constrained wireless devices.
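To make the local computation concrete, here is a minimal sketch of the equation above. The activation is not specified in this post, so a sigmoid is assumed, and the parameter values are hand-picked for illustration rather than trained; the function names and the star-shaped example are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gcn_embedding(adj, utility, theta0, theta1, activation=sigmoid):
    """Evaluate z(v) for every vertex using only neighbor information."""
    z = {}
    for v, neighbors in adj.items():
        d_v = len(neighbors)
        # Degree-normalized sum of the neighbors' utilities.
        aggregated = sum(
            utility[n] / math.sqrt(d_v * len(adj[n])) for n in neighbors
        )
        z[v] = activation(utility[v] * theta0 + (utility[v] - aggregated) * theta1)
    return z

# Hypothetical conflict graph: a star with center "c" and three leaves.
adj = {"c": {"l1", "l2", "l3"}, "l1": {"c"}, "l2": {"c"}, "l3": {"c"}}
utility = {v: 1.0 for v in adj}
z = gcn_embedding(adj, utility, theta0=0.0, theta1=1.0)  # untrained values
for v in sorted(z):
    print(v, round(z[v], 3))
```

With these illustrative parameters, the center of the star receives a smaller value than the leaves even though all utilities are equal, because it has more conflicting neighbors.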

In the context of distributed link scheduling, one straightforward approach is to formulate the problem as a binary classification task, where a link is classified as either transmitting or remaining silent. However, this approach often performs poorly, as it fails to consider the potential conflict between neighboring links.

For example, in the conflict graph in Figure 8, each vertex (link) has the same utility and neighborhood topology. The GNN would predict similar probabilities for all four links. If the predicted probabilities exceed 0.5, the scheduler would schedule all links to transmit, leading to collision and communication failure.
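The failure mode can be reproduced in a few lines. The sketch below assumes a ring of four mutually symmetric links as the conflict graph (the exact topology of Figure 8 may differ) and an assumed predicted probability of 0.6 for every link.

```python
# Hypothetical symmetric conflict graph: a ring of four links, 0-1-2-3-0.
adj = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}

# A model that sees identical inputs at every vertex outputs identical
# probabilities (the value 0.6 is an assumption for illustration).
prob = {v: 0.6 for v in adj}

# Thresholding at 0.5 schedules every link ...
scheduled = {v for v in adj if prob[v] > 0.5}

# ... so every edge of the conflict graph becomes a collision.
collisions = sorted(
    (a, b) for a in scheduled for b in adj[a] if b in scheduled and a < b
)
print(scheduled, collisions)
```

Because the classifier treats each vertex independently, nothing prevents two adjacent vertices from both being selected.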

To further elaborate on the challenges of formulating link scheduling as a binary classification problem, it’s worth noting that in addition to poor performance, such an approach requires a labeled training dataset, which is difficult to obtain due to the computational complexity of finding the optimal label of NP-hard MWIS problems.

To avoid the pitfalls of a binary classifier, we took a different approach and proposed an innovative method called GCN-LGS. This technique incorporates Graph Convolutional Networks (GCNs) into the algorithmic framework of the distributed greedy scheduler LGS, as shown in Figure 9.

The GCN generates node embeddings, denoted as $\mathbb{z}$, which collect $z(v)$ for all $v\in\mathcal{V}$. These embeddings are used as topological multipliers for the utility $\mathbb{u}$, resulting in a topology-aware utility of $w(v) = z(v)u(v)$. The cool thing about this topological multiplier $z(v)$ is that it can be re-used for multiple time slots until the topology changes. In this case, $z(v)$ can be viewed as the predicted probability of link $v$ being scheduled for the whole sampling space of its link utility $u(v)$. In Figure 10, we visualize the topological multipliers of the conflict graph in Figure 6.

By running LGS based on this topology-aware utility, we are able to leverage the topological importance of a vertex to improve the approximation ratio of the scheduler. For example, if a vertex is peripheral, we increase its original utility $u(v)$, whereas if it has many conflicting neighbors, we discourage it from being scheduled. Overall, this approach allows us to take into account the topology of the conflict graph to improve the performance of the LGS.
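The effect of the multiplier on the greedy choice can be seen on a three-vertex path. In the sketch below, the multipliers `z` are made-up values chosen to favor peripheral vertices, not the output of a trained GCN, and `local_maxima` is a hypothetical helper that returns only the winners of the first greedy round.

```python
def local_maxima(adj, weight):
    """Vertices whose weight exceeds that of every conflicting neighbor."""
    return {v for v in adj if all(weight[v] > weight[n] for n in adj[v])}

# Hypothetical conflict graph: a path a - b - c.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
u = {"a": 2.0, "b": 3.0, "c": 2.0}   # original utilities
z = {"a": 0.9, "b": 0.5, "c": 0.9}   # assumed topological multipliers
w = {v: z[v] * u[v] for v in adj}    # topology-aware utilities w(v) = z(v) u(v)

print(local_maxima(adj, u))  # greedy winners based on u
print(local_maxima(adj, w))  # greedy winners based on w
```

Based on $u$ alone, the middle vertex `b` wins and blocks both neighbors, for a total utility of 3. Based on $w$, the two end vertices win instead, and the realized total utility $u(a) + u(c) = 4$ matches the optimum for this graph.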

Training GCN-LGS is a challenging task that requires the development of effective reinforcement learning schemes. Several approaches have been proposed to address this challenge, ranging from ad-hoc reinforcement learning (Zhao 2021 ICASSP) to a dedicated deterministic policy gradient (Zhao 2022 TWC). Our recent ICLR paper (Zhao 2023 ICLR) presents a more general framework that offers a promising direction for future research. Interested readers can refer to our publications for more detailed information on these approaches.

In this demonstration, we compare the performance of two different distributed link schedulers in simulated wireless multihop networks: the Local Greedy Solver/Scheduler (LGS) and the GCN-enhanced LGS. The two simulations, shown side by side in Figure 11, are identical in every aspect except for the scheduling algorithm. The pink circle around each user device represents its interference zone, a thick gray line represents a potential link between two user devices, and a green line represents an activated/scheduled link.

The curves in the right window of Figure 11 are the throughputs achieved by these two schedulers across time. It is evident that the GCN-enhanced LGS consistently outperforms the LGS in terms of total throughput, even in the face of changing network topologies due to the random walks of individual nodes. It is worth noting that the GCN does not need to be retrained for each new topology, as it can automatically generalize to different topologies.

For a more comprehensive comparison of these distributed schedulers, we invite readers to explore our paper (Zhao 2022 TWC).

Let’s take a moment to recap what we’ve covered so far. We began by introducing routing and scheduling and then explored how to formulate the problem of link scheduling in wireless multihop networks as a maximum-weight independent set (MWIS) problem. We then introduced the Local Greedy Solver/Scheduler (LGS), a distributed heuristic that is simple and easy to implement.

Next, we delved into how to use Graph Convolutional Networks (GCNs) to enhance LGS, leveraging the advantages of GNNs in learning graph structure and distributed execution. We also discussed the drawbacks of using GCNs as a binary classifier in solving MWIS and how that led us to propose the GCN-LGS approach.

Our simulations revealed that GCN-enhanced LGS consistently outperforms LGS in terms of total throughput, even as the network topology changes. It’s worth noting that our GCN model is tiny, with only two learnable parameters, making it well-suited for low-end wireless devices like sensors and IoT devices. Yet, when incorporated into our GCN-LGS framework, it closes the optimality gap of LGS by a third to a half (see Zhao 2022 TWC).

Overall, we have demonstrated that GCN-LGS is a promising approach for solving the distributed link scheduling problem in wireless multihop networks. By combining the power of graph neural networks and distributed heuristics, we can achieve improved performance and more efficient use of network resources.

The GCN-LGS approach we introduced in this post has far-reaching implications for the design and optimization of self-organizing wireless networks and other distributed systems.

Essentially, we train an $L$-layer Graph Neural Network to adjust the importance of each wireless link according to the topology of its $L$-hop interfering neighborhood. By doing so, our prescribed scheduler (LGS) can make better decisions and increase the overall throughput of the wireless network. It’s like individually adjusting the cycling period of each traffic light in a city based on its nearby road network to improve the overall traffic flow.

By leveraging the power of Graph Neural Networks to learn the structure of the network, we are able to enhance traditional distributed algorithms and bring intelligence to the network edge. This opens up a range of possibilities for improving other tasks in self-organizing wireless networks, such as routing, resource allocation, and energy management. For example, we have shown its applications in improving the latency and overhead of scheduling and routing in a series of papers (Zhao 2021 ICASSP, Zhao 2022a ICASSP, Zhao 2022b ICASSP, Zhao 2023 ICASSP).

Moreover, this approach can be extended to other distributed systems, including robot swarms, vehicle/drone fleets, edge computing, and the internet of things, where each node has limited computing resources and the network topology can be dynamic. By reducing the dependence on centralized coordination and control, our learnable distributed algorithms can improve the efficiency, scalability, and robustness of large-scale networked systems.

In our recent ICLR paper (Zhao 2023 ICLR), we took the GCN-LGS approach to the next level with our new GDPG-Twin framework. This generalization of the GCN-LGS approach allows us to tackle an even wider range of problems in routing and scheduling, as well as other network-related tasks. With the GDPG-Twin, we have opened up exciting new opportunities to learn and achieve long-term strategic goals through distributed and parallel decisions, leading to potentially intelligent swarms and networks.

In summary, the GCN-LGS approach represents a promising direction for the development of intelligent and self-organizing distributed systems, with potential applications in a wide range of domains, from wireless networks to edge computing and beyond. We invite researchers and practitioners to explore the potential of this approach and join us in shaping the future of distributed intelligence.

The content of this article is created based on published research sponsored by the Army Research Office under Cooperative Agreement Number W911NF-19-2-0269. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Office or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation herein.

The presentation of this article is refined and improved with the help of ChatGPT.

- (Joo 2012 TMC): Changhee Joo and Ness B. Shroff, "Local Greedy Approximation for Scheduling in Multihop Wireless Networks," in *IEEE Transactions on Mobile Computing*, vol. 11, no. 3, pp. 414-426, March 2012, doi: 10.1109/TMC.2011.33. https://doi.org/10.1109/TMC.2011.33
- (Zhao 2021 ICASSP): Zhongyuan Zhao, Gunjan Verma, Chirag Rao, Ananthram Swami, Santiago Segarra, "Distributed Scheduling Using Graph Neural Networks," *IEEE ICASSP 2021*, pp. 4720-4724, https://doi.org/10.1109/ICASSP39728.2021.9414098
- (Zhao 2022a ICASSP): Zhongyuan Zhao, Gunjan Verma, Ananthram Swami, Santiago Segarra, "Delay-Oriented Distributed Scheduling Using Graph Neural Networks," *IEEE ICASSP 2022*, pp. 8902-8906, https://doi.org/10.1109/ICASSP43922.2022.9746926
- (Zhao 2022b ICASSP): Zhongyuan Zhao, Ananthram Swami, Santiago Segarra, "Distributed Link Sparsification for Scalable Scheduling using Graph Neural Networks," *IEEE ICASSP 2022*, pp. 5308-5312, https://doi.org/10.1109/ICASSP43922.2022.9747437
- (Zhao 2022 TWC): Zhongyuan Zhao, Gunjan Verma, Chirag Rao, Ananthram Swami, Santiago Segarra, "Link Scheduling Using Graph Neural Networks," accepted to *IEEE Transactions on Wireless Communications*, https://doi.org/10.1109/TWC.2022.3222781
- (Zhao 2023 ICLR): Zhongyuan Zhao, Ananthram Swami, Santiago Segarra, "Graph-based Deterministic Policy Gradient for Repetitive Combinatorial Optimization Problems," *The 11th International Conference on Learning Representations (ICLR) 2023*, pp. 1-21, https://openreview.net/forum?id=yHIIM9BgOo
- (Zhao 2023 ICASSP): Zhongyuan Zhao, Bojan Radojičić, Gunjan Verma, Ananthram Swami, Santiago Segarra, "Delay-aware Backpressure Routing Using Graph Neural Networks," accepted to *IEEE ICASSP 2023*, arXiv 2211.10748. https://arxiv.org/pdf/2211.10748.pdf

Welcome to the world of wireless networks, where machines talk to each other through radio waves. It’s a fascinating domain, but it can also be pretty darn complicated. Now, imagine trying to get all of these devices to work together without a central controller or any set rules. Sounds like a recipe for disaster, right? Well, not necessarily.

In this post, we’ll be exploring how these distributed systems work and how machine learning can help improve their performance. But before we dive into the technical details, let’s take a step back and think about what makes humans such social creatures. How do we manage to navigate crowded spaces, like a busy cafe or a chaotic street, without bumping into each other or getting lost in the crowd? The answer lies in our ability to self-organize. We can form and maintain order in public without an organizer telling us what to do at every moment. And, as it turns out, wireless networks can learn a thing or two from us.

Imagine a bustling city, with cars speeding down the streets and people engrossed in their phones while walking. At first glance, it may seem like complete chaos, but upon closer inspection, there’s an unspoken order that everyone follows. In conversation, we take turns speaking, listen attentively, and avoid interrupting. On the road, we stay in our lanes, stop at red lights, and signal before turning. These social norms keep us organized and functioning as a society.

It turns out that the concept of self-organization and following rules also applies to wireless networks. Wireless devices are constantly communicating with each other amid background noise and interference, similar to people talking in a crowded coffee shop (Fig. 1). When organizing a wireless network, there are two options: either we program the devices to follow a central authority, like a cellular base-station directing every move of your phone, or we can design a set of norms that enable wireless devices to self-organize.

But what if the communication infrastructure is destroyed in a natural disaster or you find yourself in a remote area with no signal? In such situations, you can no longer contact your friends with your cell phone. However, with self-organizing wireless networks, communication nodes can quickly form new networks, establish connections, and reroute messages without a centralized controller.

In the upcoming sections, we’ll delve into the world of self-organizing wireless networks and explore their differences from other types of wireless networks.

Consider the vast difference between top-down engineered systems and bottom-up self-organizing systems. For example, railway and cellular networks are pre-planned networks that use dedicated schedulers to guide physical and data traffic. These networks rely on properly planned infrastructure to function and can be incredibly efficient in resource utilization. Just think about the energy and labor costs of transporting a passenger 1 kilometer by train versus by car.

The infrastructure for 5G cellular networks, called cloud radio access network (Cloud-RAN), is a perfect example of top-down design (see Fig. 2). It consists of base-stations that connect to the telecommunication network and the Internet through wired/wireless fronthaul and backhaul networks. These base-stations are carefully planned out according to the population density and terrain, providing network connectivity to mobile devices in a macro or small cell. The cellular base-station manages the transmission of mobile devices attached to it, as well as some of their radio parameters, allowing for efficient resource utilization. These functionalities require decent computing power, dedicated backhaul, and expensive radio transceivers. As a result, running a cellular network requires a significant upfront investment and years of construction, maintenance, and upgrades.

In contrast, Wi-Fi networks are organized bottom-up, with anyone able to install a Wi-Fi router (access point) at home. The medium access control (MAC) of Wi-Fi is self-organized, which means that the access point does not schedule transmissions or manage the parameters of devices attached to it. Instead, when a Wi-Fi device has data packets to transmit, it first listens to the medium for a brief time, and if there is no activity, it begins transmission. This self-organizing feature makes Wi-Fi access points 1000 times cheaper than cellular base-stations, and explains why Wi-Fi is so successful. In fact, as of 2022, 51% of internet traffic goes through Wi-Fi, while only 19.6% goes through cellular networks [Cisco VNI report, 2017-2022, Fig. 22].

However, the listen-before-talk approach used in Wi-Fi MAC only works for small networks. When there are many Wi-Fi devices in a hotspot, there may be too many collisions, causing congestion even if the total demand for bandwidth does not exceed the capacity of the Wi-Fi access point. You may have experienced such congestion in a café, hotel, or library. In other words, the Wi-Fi protocol cannot always allocate resources efficiently when the network is crowded.
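A back-of-the-envelope calculation shows why crowding hurts. The sketch below uses a deliberately simplified slotted model, not the actual Wi-Fi backoff procedure: every device is assumed to transmit independently with the same probability in each slot, and a slot succeeds only if exactly one device transmits. The function name and the numbers are hypothetical.

```python
def slot_outcomes(n_devices, p_transmit):
    """Probabilities that a time slot is idle, successful, or collided.

    Simplified model (an assumption): each device transmits independently
    with probability `p_transmit`; exactly one transmitter means success.
    """
    idle = (1 - p_transmit) ** n_devices
    success = n_devices * p_transmit * (1 - p_transmit) ** (n_devices - 1)
    collision = 1 - idle - success
    return idle, success, collision

for n in (2, 10, 50):
    idle, success, collision = slot_outcomes(n, 0.1)
    print(f"{n:3d} devices: idle={idle:.2f} "
          f"success={success:.2f} collision={collision:.2f}")
```

Under this model, the share of slots lost to collisions grows quickly with the number of contending devices, even though each individual device is no more aggressive than before.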

Let’s take a closer look at why self-organizing wireless networks are so important.

Despite their high efficiency, infrastructure-based networks can be inflexible. The availability of cellular and Wi-Fi network services is limited to areas where cables, fibers, base-stations, and access points have already been installed. Self-organizing networks, on the other hand, are much more adaptable, making them an attractive option for applications where flexibility is key. Take wireless ad-hoc and sensor networks for example. They are often used in harsh or hostile environments, such as disaster areas, remote locations, and even battlefields (see Fig. 3), where traditional infrastructure is impractical. In these cases, self-organizing wireless networks provide critical communication capabilities, enabling users and sensors to establish connections quickly and efficiently without relying on an existing infrastructure.

The self-organizing nature of wireless ad-hoc networks gives them a unique topology compared to cellular and Wi-Fi networks, which rely on direct links between user devices and nearby base-stations or access points. In contrast, wireless ad-hoc networks allow data packets to travel over multiple hops through intermediate user devices to reach their destination, as depicted in Figure 3. This makes them part of a broader class of networks called *wireless multihop networks*, where each user device not only sends and receives its own data packets, but also helps relay traffic for other devices.

Wireless multihop networks are playing an increasingly important role in many emerging civilian applications, such as smart vehicles, drone fleets, the Internet of Things (IoT), and 5G and beyond (wireless backhaul networks that link base-stations in small cells, drones, and CubeSats [Akyildiz 2022]). As illustrated in Fig. 4 (left), the exchange of safety-critical information (or control messages) among traveling vehicles (drones), for example, requires ultra-low latency and seamless network coverage. To meet such requirements, it is better to establish a direct connection between neighboring vehicles (drones) rather than placing a base-station in the middle.

In 5G and beyond, wireless multihop networks can also serve as wireless backhaul networks that connect base-stations in small cells or those carried by drones and CubeSats to the Internet (Fig. 4 right). Small cells are for highly populated hotspots, such as central business districts, schools, and residential areas (see Fig. 2), whereas drone and CubeSat-assisted wireless networks are for temporary hotspots and non-terrestrial communications, such as oceans, airplanes, mountains, and deserts. For small cells, wireless backhaul can avoid the high cost and inconvenience of installing wired backhaul, while for drone and CubeSat-assisted networks, wireless backhaul is the only option [Akyildiz 2022].

Wireless multihop networks might also help address the challenge of massive access in the Internet of Things (IoT) [Chen 2021]. The popularity of IoT devices could lead to very high connectivity density, with projections of up to 10 million wireless connections per $km^2$ by 2030 [Chen 2021]. Although the real bandwidth required by their payloads is not high, the overheads of handshaking and authorization generated by these IoT devices under existing cellular architecture could consume all the bandwidth, jamming the entire network. If these IoT devices can form self-organizing networks, the situation could be significantly improved.

Another dimension of the organization of wireless networks is synchronization. Synchronization is essential in wireless communication to ensure that the transmitter and receiver have a common frame of reference for timing and frequency. This is due to the fact that each wireless device has its own independent clock source, typically a crystal oscillator, which can cause timing drift and frequency offset between devices. If left unsynchronized, this can result in data transmission errors, collisions, and other performance issues.

Like infrastructure, synchronization can improve the resource efficiency and performance, at the cost of reduced flexibility. Based on infrastructure and synchronization, wireless networks can be categorized as follows:

| | Synchronized | Random access |
|---|---|---|
| Infrastructure | Cellular, 5G networks | Wi-Fi |
| Ad hoc | Vehicular/Flying/Tactical Ad-hoc Networks | Wireless Ad hoc / Sensor Networks |

In synchronized networks, wireless devices are synchronized to the same clock, usually by GPS signal, enabling them to communicate on accurately defined time slots without collisions. Synchronized networks offer improved performance in throughput, mobility, and resource efficiency.

In random access networks, devices are not synchronized, which leads to overhead in establishing a link between devices. This limits the performance and resource efficiency of random access networks. However, the lack of synchronization makes it easier for devices to self-organize, resulting in greater flexibility, lower energy consumption, and lower cost of devices.

It is important to note that both random access and synchronization are necessary for wireless communication. Cellular protocols, for example, have a dedicated random access channel for new devices to join the network. In Wi-Fi and other random access networks, the receiver needs to synchronize itself to the transmitter to decode the data. The degree of synchronization and random access required in a network depends on the specific application and the trade-offs between performance, resource efficiency, and flexibility.

So far, we’ve discussed three key aspects of wireless network organization: scheduling, topology, and synchronization. We’ve learned that self-organizing wireless networks prioritize flexibility and reliability over performance and resource efficiency, and are designed to function without the help of infrastructure. Using principles such as listen-before-talk, multihop topology, and random access, these networks can operate even in situations where traditional network infrastructure is not feasible or able to meet the requirements.

In part 2 of this blog post, we’ll delve into link scheduling, a key component in organizing wireless multihop networks. We’ll explore how distributed link scheduling can be formulated and solved, and show how it can be improved through the use of machine learning and graph neural networks to achieve maximum efficiency. We’ll also discuss the broad implications of this new approach for the future of networking. So, stay tuned for a more in-depth look at self-organizing wireless networks!

The author acknowledges the valuable assistance of ChatGPT in refining and improving the presentation of this article (see earlier version). The author would also like to express special gratitude to Derek Muller, popularly known as Veritasium, whose unique linguistic style was incorporated into this post. Derek Muller’s engaging and accessible presentation of complex scientific concepts has inspired the author’s writing and helped make this article more engaging and accessible to a wider audience.

- [Chen 2021] X. Chen, D. W. K. Ng, W. Yu, E. G. Larsson, N. Al-Dhahir, and R. Schober, “Massive access for 5G and beyond,” IEEE Journal on Selected Areas in Communications, vol. 39, no. 3, pp. 615–637, 2021.
- [Akyildiz 2022] I. F. Akyildiz, A. Kak, and S. Nie, “6G and beyond: The future of wireless communications systems,” IEEE Access, vol. 8, pp. 133995–134030, 2020.

Barefoot shoes are all the rage these days, but do they really live up to the hype? In this post, you’ll find a conversation on the topic, along with my thoughts on the answers provided by ChatGPT. While ChatGPT certainly has a good grasp of barefoot shoes, there were instances where its answers relied too heavily on manufacturers’ marketing materials and lacked in-depth understanding and insights. Despite this, ChatGPT’s language ability is impressive, as it was able to understand my intention even when my English was broken. However, there was still one topic that was missed when ChatGPT was asked to summarize our conversation. So, without further ado, let’s jump into the transcripts and take a closer look at the pros and cons of barefoot shoes.

- **Definition**: Barefoot shoes are designed to mimic the feel of walking or running barefoot, while providing some protection for the feet.
- **Supporters’ viewpoint**: Proponents of barefoot shoes argue that they can improve balance, posture, and strengthen the feet and lower leg muscles.
- **Critics’ viewpoint**: Critics of barefoot shoes point out that they may not provide enough support for the feet, leading to injury, and that they may not be suitable for all types of physical activity.
- **Feedback from consumers**: Consumers who have tried barefoot shoes have reported a range of experiences, with some finding them comfortable and beneficial, while others found them uncomfortable or difficult to adjust to.
- **Professional athletes’ opinions**: Professional athletes have had mixed opinions on barefoot shoes, with some using them for limited purposes, while others have opted for traditional sports shoes.
- **Average person’s use**: Barefoot shoes are used by a variety of people, from fitness enthusiasts to individuals looking for a more natural walking or running experience.
- **Cost**: Despite using fewer materials and being less complex than traditional sports shoes, barefoot shoes can still be expensive due to factors such as the use of quality materials, the manufacturing process, brand reputation, investment in research and development, and marketing spend.
- **Conclusion**: The use of barefoot shoes should be approached with caution and considered as just one tool in a larger fitness and wellness toolkit. It’s important to consult with a medical professional or fitness trainer to determine what’s best for you, and to start slowly and gradually increase intensity and duration as your body adapts. Additionally, it’s important to be mindful of the surface you’re on and the impact of your activity when using barefoot shoes, and to listen to your body and adjust as needed to avoid injury.

Overall, barefoot shoes may be appropriate for some individuals for limited purposes and low-impact activities, but they are not a panacea and should not be marketed as such.

**ZZ:** Have you heard about barefoot shoes?

**ChatGPT:**

Yes, I’m familiar with barefoot shoes. Barefoot shoes are a type of footwear that are designed to mimic the experience of walking or running barefoot, while still providing some protection for the feet. The goal of barefoot shoes is to allow the wearer to feel more connected to the ground and to promote a more natural, healthy gait. They usually have minimal padding and support, and are made with flexible, thin soles that allow the feet to move and flex in a more natural way. Some popular examples of barefoot shoes include Vibram FiveFingers and Xero Shoes.

**ZZ:** What are the main points of barefoot shoes supporters?

**ChatGPT:**

Barefoot shoe supporters argue that walking or running in traditional shoes with thick, cushioned soles can lead to a variety of foot, knee, and back problems, as well as cause the feet to become weak and dependent on the support provided by the shoes. They believe that by walking or running barefoot, or in barefoot-style shoes, the feet can become stronger and healthier, and the body can adopt a more natural and efficient gait.

Here are some of the key points that barefoot shoe supporters make:

Natural movement: Barefoot shoes allow the feet to move in a more natural way, promoting better balance, stability, and agility.

Strengthening of feet muscles: By removing the support provided by traditional shoes, barefoot shoes force the feet and lower legs to work harder, leading to stronger and more flexible muscles.

Improved posture: By promoting a more natural gait, barefoot shoes can help improve posture and reduce the risk of injury.

Increased sensory feedback: Barefoot shoes provide less cushioning and support, allowing the wearer to feel the ground more, which can lead to improved proprioception (awareness of one’s body in space) and balance.

Reduced impact: By promoting a more natural and efficient gait, barefoot shoes can reduce the impact on the feet, knees, and back, reducing the risk of injury and strain.

These are some of the main arguments made by barefoot shoe supporters, although it’s worth noting that not all health and fitness professionals agree with these claims.

**ZZ:** What are the main critics to barefoot shoes from health and fitness professionals?

**ChatGPT:**

While some health and fitness professionals support the use of barefoot shoes, others are more skeptical and critical of the claims made by barefoot shoe advocates.

Here are some of the key criticisms of barefoot shoes from health and fitness professionals:

Lack of scientific evidence: Some professionals argue that there is limited scientific evidence to support the claims made by barefoot shoe advocates, and that more research is needed to fully understand the effects of barefoot shoes on the body.

Risk of injury: Critics argue that transitioning to barefoot shoes too quickly or without proper training can lead to an increased risk of injury, especially to the feet and lower legs.

Foot protection: Barefoot shoes provide less protection for the feet compared to traditional shoes, and may not be suitable for all surfaces or environments, such as rough or sharp terrain.

Comfort: Some people find barefoot shoes to be less comfortable than traditional shoes, and may experience discomfort or blisters as they adjust to the new style of footwear.

Difficulty transitioning: Some people may find it difficult to adjust to the reduced cushioning and support provided by barefoot shoes, especially if they have been wearing traditional shoes for a long time.

These are some of the main criticisms of barefoot shoes from health and fitness professionals, and it’s important to keep in mind that not all professionals agree on the benefits or drawbacks of barefoot shoes. Ultimately, the best approach is to consult with a health or fitness professional and to listen to your own body when deciding whether barefoot shoes are right for you.

**ZZ:** What are the feedback of consumers who tried barefoot shoes?

**ChatGPT:**

The feedback from consumers who have tried barefoot shoes can be quite varied. Some people love the feeling of being more connected to the ground and the sensation of a more natural gait, while others find barefoot shoes to be uncomfortable or not suitable for their needs.

Here are some common comments from consumers who have tried barefoot shoes:

Positive feedback: Many people report that they feel more balanced and stable when wearing barefoot shoes, and that they have noticed improvements in their posture and gait. Some also say that they have experienced reduced foot, knee, and back pain after switching to barefoot shoes.

Negative feedback: Some people find barefoot shoes to be uncomfortable or painful, especially when they make the transition from traditional shoes. Some also report that they miss the cushioning and support provided by traditional shoes, and find barefoot shoes to be unsuitable for certain activities or environments.

Adjustment period: Many people report that they needed time to adjust to barefoot shoes, and that they experienced some discomfort or blisters during the transition period.

These are some of the common comments from consumers who have tried barefoot shoes, and it’s important to keep in mind that everyone’s experience with barefoot shoes will be different, and will depend on factors such as the type of barefoot shoe, the individual’s foot shape, and the activities they engage in while wearing the shoes.

**ZZ:** What are the opinions of professional athletes on barefoot shoes?

**ChatGPT:**

The opinions of professional athletes on barefoot shoes can also be quite varied. Some athletes swear by barefoot shoes and use them to train and compete, while others prefer traditional shoes and do not see the benefits of barefoot shoes.

Here are some common comments from professional athletes about barefoot shoes:

Positive feedback: Some athletes report that they feel more agile and stable when wearing barefoot shoes, and that they have noticed improvements in their posture and gait. Some also say that barefoot shoes have helped reduce their risk of injury.

Negative feedback: Some athletes find barefoot shoes to be uncomfortable or painful, and prefer the cushioning and support provided by traditional shoes. Some also report that they have had difficulty adjusting to barefoot shoes, especially during high-impact activities.

Limited use: Some athletes use barefoot shoes for specific activities or training sessions, but prefer traditional shoes for competition.

These are some of the common comments from professional athletes about barefoot shoes, and it’s important to keep in mind that everyone’s experience with barefoot shoes will be different, and will depend on factors such as the individual’s sport, training regimen, and personal preferences. Some professional athletes may also be sponsored by shoe companies, which could influence their opinions on barefoot shoes.

**ZZ:** Tell me more about the third point

**ChatGPT:**

The third point that I mentioned earlier refers to the limited use of barefoot shoes by some professional athletes. This means that some athletes choose to use barefoot shoes for specific activities or training sessions, but prefer traditional shoes for competition.

The reason for this limited use of barefoot shoes by some professional athletes is that they may feel that traditional shoes provide better cushioning and support for high-impact activities, such as running, jumping, and landing. This can help reduce the risk of injury and provide a more comfortable experience during competition.

Additionally, some athletes may prefer the feel and fit of traditional shoes, as well as the support and protection they provide for their feet. This can be especially important for athletes who engage in activities that place high demands on their feet and legs.

It’s worth noting that the use of barefoot shoes by professional athletes is a personal choice, and that what works for one athlete may not work for another. Some athletes may find that barefoot shoes help improve their performance, while others may prefer traditional shoes. The key is to listen to your body and choose the type of shoe that works best for you, based on your individual needs and preferences.

**ZZ:** What are the examples of limited use of barefoot shoes by professional athletes?

**ChatGPT:**

Here are some examples of limited use of barefoot shoes by professional athletes:

Runners: Some runners may use barefoot shoes for training runs to improve their form and build strength in their feet and legs, but switch to traditional shoes for races where they need more cushioning and support.

Gymnasts [ZZ: gymnastic shoes are very similar to barefoot shoes, so the whole point on gymnasts is problematic.]: Some gymnasts may use barefoot shoes for certain activities, such as floor exercises, where they need to feel connected to the ground for improved balance and stability. But for other activities, such as vault and beam [ZZ: this is wrong because gymnasts are usually barefoot in these activities], they may prefer traditional shoes for added protection.

Soccer players: Some soccer players may use barefoot shoes for training sessions to improve their agility and balance, but switch to cleats for games to provide better traction on the field.

CrossFit athletes: Some CrossFit athletes may use barefoot shoes for certain exercises, such as weightlifting and gymnastics movements, but switch to traditional shoes for high-impact activities, such as box jumps and running.

These are just a few examples of limited use of barefoot shoes by professional athletes. It’s important to remember that each athlete’s use of barefoot shoes will be unique, based on their individual needs and preferences.

**ZZ:** How average person use barefoot shoes?

**ChatGPT:**

The average person may use barefoot shoes for a variety of activities, including running, walking, hiking, and working out. Some people may choose to wear barefoot shoes as their primary footwear, while others may use them for specific activities or as a supplement to their traditional shoes.

Here are some ways that the average person might use barefoot shoes:

For running: Some people use barefoot shoes as a more natural alternative to traditional running shoes. They may find that barefoot shoes help improve their form and reduce the risk of injury, while still providing enough protection for their feet.

For walking and hiking: Some people use barefoot shoes for casual walks and hikes, finding them to be more comfortable and natural than traditional shoes.

For working out: Some people use barefoot shoes for activities such as yoga, Pilates, and other types of low-impact exercise, where they need to feel connected to the ground for improved balance and stability.

As a supplement to traditional shoes: Some people use barefoot shoes in addition to their traditional shoes, switching to barefoot shoes for certain activities or to give their feet a break from traditional shoes.

It’s important to remember that the use of barefoot shoes will vary from person to person, depending on factors such as the individual’s lifestyle, activity level, and personal preferences. Some people may find that barefoot shoes work well for them, while others may prefer traditional shoes. The key is to choose the type of shoe that works best for you, based on your individual needs and preferences.

**ZZ:** Are the sales of barefoot shoes going downwards recently?

**ChatGPT:**

I’m a language model created by OpenAI and do not have access to the most up-to-date information on sales trends. However, the popularity of barefoot shoes has fluctuated over the years, and there have been periods where sales have gone up and down.

It’s worth noting that the popularity of barefoot shoes is influenced by various factors, such as changes in fashion, advancements in shoe technology, and shifts in consumer preferences. Additionally, the current global health crisis has also had an impact on the footwear industry, including the sales of barefoot shoes.

As with any product, the sales of barefoot shoes will continue to be influenced by a variety of factors, and it’s difficult to predict whether sales will go up or down in the future. The best way to gauge the current popularity of barefoot shoes is to look at sales data and consumer trends. However, I do not have access to that information.

**ZZ:** Why barefoot shoes are so expensive even if they use fewer materials and are less complex than traditional sports shoes?

**ChatGPT:**

Barefoot shoes can be expensive for several reasons, despite using fewer materials and being less complex than traditional sports shoes.

Quality materials: Although barefoot shoes may use fewer materials, the materials used are often of high quality and may be more expensive than those used in traditional sports shoes. For example, some barefoot shoes use premium-quality leather or other materials that are durable, breathable, and comfortable.

Manufacturing process: The manufacturing process for barefoot shoes can be more involved and time-consuming than that for traditional sports shoes. This can add to the cost of producing barefoot shoes.

Brand reputation: Barefoot shoes are often marketed as a premium product, and the brand reputation can impact the price of the shoes. Some brands may charge more for their barefoot shoes because they have a strong reputation for quality and innovation.

Research and development: Developing a high-quality barefoot shoe requires significant investment in research and development. This investment can be reflected in the price of the final product.

In summary, despite using fewer materials and being less complex than traditional sports shoes, barefoot shoes can be more expensive due to the use of quality materials, the manufacturing process [ZZ: I doubt these two points, which may be just the manufacturers’ claims.], brand reputation, and investment in research and development. However, it’s worth noting that not all barefoot shoes are expensive, and there are options available at a variety of price points.

**ZZ:** You did not mention how much money barefoot shoes company spend on marketing

[ZZ: A human being can infer that barefoot shoes companies heavily rely on marketing and advertising, given the overwhelmingly positive yet similar comments from YouTubers. Understandably, a language model would never know the deals made under the table between those YouTubers and barefoot shoe companies. Based on my insights on advertising and my sparse knowledge of the historical prices of barefoot shoes, I think marketing cost is the major reason why barefoot shoes are so expensive. This answer is really really hard for AI because it requires insights, experiences, and understanding.]

**ChatGPT:**

You’re correct, I apologize for not mentioning marketing costs in my previous response. Marketing is also an important factor that can contribute to the high cost of barefoot shoes. Companies may spend a significant amount of money on advertising, promotions, and other marketing efforts to build brand awareness and increase sales. This marketing spend can add to the overall cost of barefoot shoes, which is passed on to consumers in the form of higher prices.

In summary, the high cost of barefoot shoes can be due to a variety of factors, including the use of quality materials, the manufacturing process, brand reputation, investment in research and development, and marketing spend.

**ZZ:** Can you summarize our conversation on barefoot shoes, list the main points we discussed?

**ChatGPT:**

Sure! Here’s a summary [ZZ: the discussion on sales is missing, but admittedly that discussion is not informative.] of our conversation on barefoot shoes:

Definition: Barefoot shoes are designed to mimic the feel of walking or running barefoot, while providing some protection for the feet.

Supporters’ viewpoint: Proponents of barefoot shoes argue that they can improve balance, posture, and strengthen the feet and lower leg muscles.

Critics’ viewpoint: Critics of barefoot shoes point out that they may not provide enough support for the feet, leading to injury, and that they may not be suitable for all types of physical activity.

Feedback from consumers: Consumers who have tried barefoot shoes have reported a range of experiences, with some finding them comfortable and beneficial, while others found them uncomfortable or difficult to adjust to.

Professional athletes’ opinions: Professional athletes have had mixed opinions on barefoot shoes, with some using them for limited purposes, while others have opted for traditional sports shoes.

Average person’s use: Barefoot shoes are used by a variety of people, from fitness enthusiasts to individuals looking for a more natural walking or running experience.

Cost: Despite using fewer materials and being less complex than traditional sports shoes, barefoot shoes can still be expensive due to factors such as the use of quality materials, the manufacturing process, brand reputation, investment in research and development, and marketing spend.

**ZZ:** I think the most reasonable ways to use barefoot shoes is limited purposes, especially as an auxiliary tool to complement regular training, to target some small muscle groups for strength and conditioning. It should be used for low impact activities, and mostly on a safe surface. It should not be marketed as a product that can solve all the problems, and the false claims that all those problems are caused by the conventional shoes.

**ChatGPT:**

Yes, that’s a well-reasoned perspective. It’s important to remember that barefoot shoes are just one tool in a larger fitness and wellness toolkit, and they may not be suitable for everyone or for all types of physical activity. As with any new workout routine or gear, it’s always a good idea to consult with a medical professional or fitness trainer to determine what’s best for you, and to start slowly and gradually increase intensity and duration as your body adapts. Additionally, it’s important to be mindful of the surface you’re on and the impact of your activity when using barefoot shoes, and to listen to your body and adjust as needed to avoid injury.

**ZZ:** Can you create a title for this conversation?

**ChatGPT:**

Sure! How about: “Exploring the Pros and Cons of Barefoot Shoes: A Comprehensive Discussion”

**ZZ:** “Exploring the Pros and Cons of Barefoot Shoes: A Comprehensive Discussion with ChatGPT”

**ChatGPT:**

That works too! “Exploring the Pros and Cons of Barefoot Shoes: A Comprehensive Discussion with ChatGPT” accurately summarizes the content of our conversation and acknowledges the role of the AI in providing the information.

**ZZ:** Can you summarize the last part of our conversation on the conclusion of barefoot shoes?

**ChatGPT:**

Yes, certainly. The last part of our conversation concluded that the use of barefoot shoes should be approached with caution and considered as just one tool in a larger fitness and wellness toolkit. It’s important to consult with a medical professional or fitness trainer to determine what’s best for you, and to start slowly and gradually increase intensity and duration as your body adapts. Additionally, it’s important to be mindful of the surface you’re on and the impact of your activity when using barefoot shoes, and to listen to your body and adjust as needed to avoid injury. Overall, barefoot shoes may be appropriate for some individuals for limited purposes and low-impact activities, but they are not a panacea and should not be marketed as such.

Reference book:

Lambiotte, R. and Masuda, N., 2021. A guide to temporal networks. World Scientific.

Course archive C5.4 Networks, materials download:

Course Synopsis:

- Introduction and short overview of useful mathematical concepts (2 lectures): Networks as abstractions; Renewal processes; Random walks and diffusion; Power-law distributions; Matrix algebra; Markov chains; Branching processes.
- Basic structural properties of networks (2 lectures): Definition; Degree distribution; Measures derived from walks and paths; Clustering coefficient; Centrality Measures; Spectral properties.
- Models of networks (2 lectures): Erdős–Rényi random graph; Configuration model; Network motifs; Growing network with preferential attachment.
- Community detection (2 lectures): Newman-Girvan Modularity; Spectral optimization of modularity; Greedy optimization of modularity.
- Dynamics, time-scales and Communities (2 lectures): Consensus dynamics; Timescale separation in dynamical systems and networks; Dynamically invariant subspaces and externally equitable partitions
- Dynamics I: Random walks (2 lectures): Discrete-time random walks on networks; PageRank; Mean first-passage and recurrence times; Respondent-driven sampling; Continuous-time random walks
- Random walks to reveal network structure (2 lectures): Markov stability; Infomap; Walktrap; Core–periphery structure; Similarity measures and kernels
- Dynamics II: Epidemic processes (2 lectures): Models of epidemic processes; Mean-Field Theories and Pair Approximations

Another related lecture is on Riemannian geometry by Jason Lotay from Oxford Mathematics

Course archive C3.11 Riemannian Geometry

The lectures are not https://www.youtube.com/watch?v=wZgM3u8UkNs&list=PL4d5ZtfQonW17IBjdLKcfQVBuuKaWnxbx

- Riemannian manifolds: basic examples of Riemannian metrics, Levi-Civita connection.
- Geodesics: definition, first variation formula, exponential map, minimizing properties of geodesics.
- Curvature: Riemann curvature tensor, sectional curvature, Ricci curvature, scalar curvature.
- Riemannian submanifolds: examples, second fundamental form, Gauss–Codazzi equations.
- Jacobi fields: Jacobi equation, conjugate points.
- Completeness: Hopf–Rinow and Cartan–Hadamard theorems
- Constant curvature: classification of complete manifolds with constant curvature.
- Second variation and applications: second variation formula, Bonnet–Myers and Synge’s theorems.

In some machine learning (ML) problems, such as policy gradient reinforcement learning algorithms, you may have a loss/objective function that is non-differentiable with regard to (w.r.t.) the final or intermediate output $\mathbf{y}$ of the downstream ML pipeline.
For example, a custom gradient, $\mathbf{\delta} = \nabla_{\mathbf{y}}l_{b}(\mathbf{y}, r)$, may depend on the feedback $r$ from the environment or some blackbox process, and therefore cannot be implemented via the automatic differentiation in *Tensorflow* and *PyTorch*.
A quick trick to apply an arbitrary gradient $\mathbf{\delta}$ in backpropagation is to use a squared loss:

\begin{equation*} l(\mathbf{y},\tilde{\mathbf{y}},\mathbf{\delta}) = \frac{1}{2} \lVert\mathbf{y} - (\tilde{\mathbf{y}} + \mathbf{\delta})\rVert_{2}^{2} = \frac{1}{2} \sum_{k=1}^{n}\left[\mathbf{y}_{k} - (\tilde{\mathbf{y}}_{k} + \mathbf{\delta}_{k})\right]^2\;. \end{equation*}

If your code is developed in *Tensorflow 1* (with sessions and a computational graph), this trick can save you the trouble of migrating to *Tensorflow 2* with its slower eager execution, or to *PyTorch* for customized autograd.

Given the input data $\mathbf{X}$ and label $\mathbf{y}^*$, the ML algorithm or artificial neural network (ANN) outputs the prediction $\mathbf{y}=f(\mathbf{X};\mathbb{\Theta})$. Here, we denote a matrix by a bold upper case letter, such as $\mathbf{X}$, a vector by a bold lower case letter, such as $\mathbf{y}$, and the $k$th element of vector $\mathbf{y}$ by subscript $k$, as $\mathbf{y}_{k}$. The ML algorithm or ANN is represented as a parameterized function $f(\mathbf{X};\mathbb{\Theta})$, where $\mathbb{\Theta}$ is the set of parameters. The training of the parameters is carried out by an optimizer, which iteratively updates the parameters through gradient descent in the direction that minimizes a loss function.

The loss function of the prediction, the label, and/or the set of parameters, denoted as $l(\mathbf{y}, \mathbf{y}^*, \mathbb{\Theta})$, is the objective function to be minimized in training (optimization). You have probably already learned several commonly used loss functions, such as cross entropy for classification, mean-squared-error for regression, and the $L^1$ and $L^2$ norms for regularization. The gradient is typically generated as the derivative of the loss function w.r.t. the parameters. Following the chain rule, the gradient can be denoted as:

\begin{equation} \nabla_{\mathbb{\Theta}} l(\mathbf{y}, \mathbf{y}^*, \mathbb{\Theta}) = \frac{\partial l(\mathbf{y}, \mathbf{y}^*, \mathbb{\Theta})}{\partial \mathbf{y}} \frac{\partial \mathbf{y}}{\partial \mathbb{\Theta}}. \label{eq:gradient} \end{equation}

During training, the parameters are updated as:

\begin{equation} \mathbb{\Theta} \leftarrow \mathbb{\Theta} - \alpha\nabla_{\mathbb{\Theta}} l(\mathbf{y}, \mathbf{y}^*, \mathbb{\Theta}), \label{eq:gd} \end{equation}

where $0<\alpha<1$ is the learning rate.
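As a minimal sketch of the chain rule and the update rule above, the snippet below trains a tiny linear model in plain Python. The linear model, the squared-error loss, and the names `predict`, `gradient`, and `train` are illustrative assumptions, not part of any framework.

```python
# Minimal gradient-descent sketch for a linear model y = theta . x.
# The model and the loss 0.5 * (y - y_star)**2 are illustrative assumptions.

def predict(theta, x):
    """Model output y = f(x; theta) for a linear model."""
    return sum(t * xi for t, xi in zip(theta, x))

def gradient(theta, x, y_star):
    """Chain rule: dl/dtheta = (dl/dy) * (dy/dtheta)."""
    y = predict(theta, x)
    dl_dy = y - y_star        # derivative of 0.5 * (y - y_star)**2 w.r.t. y
    dy_dtheta = x             # derivative of theta . x w.r.t. theta
    return [dl_dy * g for g in dy_dtheta]

def train(theta, x, y_star, alpha=0.1, steps=100):
    """Repeat the update theta <- theta - alpha * gradient."""
    for _ in range(steps):
        grad = gradient(theta, x, y_star)
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta

theta = train([0.0, 0.0], x=[1.0, 2.0], y_star=3.0)
print(predict(theta, [1.0, 2.0]))
```

The two factors in `gradient` mirror the two components of the gradient: only `dl_dy` depends on the loss, which is why a custom signal can be substituted for it without touching the rest of the pipeline.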

The gradient in \eqref{eq:gradient} has two components: the derivative of the loss function w.r.t. the output $\mathbf{y}$, $\frac{\partial l(\mathbf{y}, \mathbf{y}^{*}, \mathbb{\Theta})}{\partial \mathbf{y}}$, and the derivative of the output $\mathbf{y}$ w.r.t. the parameters $\mathbb{\Theta}$, $\frac{\partial \mathbf{y}}{\partial \mathbb{\Theta}}$.
In most supervised and unsupervised learning, the loss function $l(\cdot)$ and the machine learning pipeline $f(\cdot;\mathbb{\Theta})$ are both differentiable, which allows the backpropagation of the gradient to be carried out by the automatic differentiation mechanism built into *Tensorflow* and *PyTorch*.

However, in reinforcement learning, especially in the development of new approaches, you may end up with a differentiable ML pipeline $f(\cdot;\mathbb{\Theta})$ and a non-differentiable loss/objective function, denoted as $l_{b}(\mathbf{y}, r)$, where $r$ is the observed feedback from the environment. This is because in reinforcement learning, the objective is often to maximize or minimize a performance metric that does not have an analytical expression and can only be observed from the interaction between the actions of the agent (prediction $\mathbf{y}$) and the environment.

If your ML problem requires a customized or non-differentiable loss/objective function, it is quite burdensome to go beyond the set of commonly used loss functions built in *Tensorflow* and *PyTorch*.
In *Tensorflow 2*, you need to learn the topic of advanced automatic differentiation and work with the `tf.GradientTape` API and the `apply_gradients` function.
In *PyTorch*, you need to define new autograd functions.
You first need to convert your data to tensors and then perform operations on the tensors using the built-in functions of *Tensorflow* or *PyTorch*.

Let’s say you have worked out a formula to approximate (or guess) the gradient of a blackbox loss or objective function w.r.t. the prediction, $\nabla_{\mathbf{y}} l_{b}(\mathbf{y}, r)=\frac{\partial l_{b}(\mathbf{y}, r)}{\partial \mathbf{y}}$, which, by the way, is the major effort in a policy gradient reinforcement learning algorithm.
You may prefer to implement that formula with the numerical packages like *numpy* and *scipy* rather than the built-in functions of *Tensorflow* or *PyTorch*, since the former may have better performance and/or functionality than the latter, or the former makes debugging much easier.
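To make the idea of computing the gradient outside the framework concrete, here is a rough sketch that estimates $\nabla_{\mathbf{y}} l_{b}$ by central finite differences in plain Python. It assumes the blackbox can be queried repeatedly at nearby points, which is often not true in reinforcement learning, where policy gradient estimators are used instead. `blackbox_loss` is a hypothetical stand-in, and `estimate_gradient` is an illustrative name.

```python
def blackbox_loss(y, r):
    """Hypothetical stand-in for l_b(y, r); a real blackbox would be opaque."""
    return sum((yk - r) ** 2 for yk in y)

def estimate_gradient(loss, y, r, eps=1e-5):
    """Central finite differences: two loss evaluations per element of y."""
    delta = []
    for k in range(len(y)):
        y_plus = list(y)
        y_plus[k] += eps
        y_minus = list(y)
        y_minus[k] -= eps
        delta.append((loss(y_plus, r) - loss(y_minus, r)) / (2 * eps))
    return delta

# delta is an ordinary Python list that can later be fed to the framework.
delta = estimate_gradient(blackbox_loss, [1.0, 2.0, 3.0], r=2.0)
```

Whatever estimator is used, the result is just an array of numbers with the same shape as $\mathbf{y}$, so it never has to be expressed as differentiable tensor operations.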

In a reinforcement learning or customized learning setting, you first collect the experience tuples of state (input data), action (prediction), and reward, $\langle\mathbf{X}^{(t)}, \tilde{\mathbf{y}}^{(t)}, r^{(t)}\rangle$ for $t=0,\dots,T$, and then compute (or guess) the derivative of your loss/objective function w.r.t. the action (prediction) as $\mathbf{\delta} = \nabla_{\tilde{\mathbf{y}}} l_{b}(\tilde{\mathbf{y}}, r)$.
Note that with exploration, the actual prediction $\tilde{\mathbf{y}}^{(t)}$ does not necessarily equal the output $\mathbf{y}^{(t)}=f(\mathbf{X}^{(t)};\mathbb{\Theta})$.
Instead of implementing your gradient estimation entirely in *Tensorflow* or *PyTorch*, you can first compute the gradient $\mathbf{\delta}^{(t)}$ with whatever packages you like, then plug it into a `placeholder` and apply it to the backpropagation through an off-the-shelf optimizer and the built-in mean squared loss or the following squared loss:

\begin{equation} l_{s}(\mathbf{y},\tilde{\mathbf{y}},\mathbf{\delta}) = \frac{1}{2} \lVert\mathbf{y} - (\tilde{\mathbf{y}} + \mathbf{\delta})\rVert_{2}^{2} = \frac{1}{2} \sum_{k=1}^{n}\left[\mathbf{y}_{k} - (\tilde{\mathbf{y}}_{k} + \mathbf{\delta}_{k})\right]^2\;. \label{eq:loss} \end{equation}

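This is because in the case of exploitation, where $\tilde{\mathbf{y}}^{(t)}=\mathbf{y}^{(t)}=f(\mathbf{X}^{(t)};\mathbb{\Theta})$ and the target $\tilde{\mathbf{y}} + \mathbf{\delta}$ is fed in as a constant, we have

\begin{equation} \frac{\partial l_{s}(\mathbf{y},\tilde{\mathbf{y}},\mathbf{\delta})}{\partial \mathbf{y}_k} = \mathbf{y}_{k} - (\tilde{\mathbf{y}}_{k} + \mathbf{\delta}_{k})=-\mathbf{\delta}_{k}\;. \label{eq:proof} \end{equation}

The backpropagated signal therefore has exactly the magnitude of $\mathbf{\delta}$, and a gradient-descent step on $l_{s}$ moves $\mathbf{y}$ along $+\mathbf{\delta}$. If $\mathbf{\delta}$ is the gradient of a loss to be minimized rather than of an objective to be maximized, negate it before plugging it in.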

The difference between squared loss and mean-squared-error loss is just a constant factor of $2/n$, which can be compensated by setting a larger or smaller learning rate $\alpha$.
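The squared-loss trick can be checked numerically with a small sketch in plain Python. A linear model stands in for the differentiable pipeline, and all names (`forward`, `surrogate_step`) are illustrative assumptions. Note the sign: with the target $\tilde{\mathbf{y}} + \mathbf{\delta}$ held constant, the derivative w.r.t. $\mathbf{y}$ evaluates to $-\mathbf{\delta}$, so one descent step moves the output along $+\mathbf{\delta}$.

```python
def forward(theta, x):
    """Linear stand-in for the pipeline: y_k = sum_j theta[k][j] * x[j]."""
    return [sum(t * xj for t, xj in zip(row, x)) for row in theta]

def surrogate_step(theta, x, y_tilde, delta, alpha=0.1):
    """One gradient-descent step on l_s = 0.5 * ||y - (y_tilde + delta)||^2."""
    y = forward(theta, x)
    # The target is a constant (fed in like a placeholder); no gradient flows through it.
    target = [yt + d for yt, d in zip(y_tilde, delta)]
    dl_dy = [yk - tk for yk, tk in zip(y, target)]   # equals -delta when y == y_tilde
    # Chain rule for the linear model: dl/dtheta[k][j] = dl_dy[k] * x[j]
    new_theta = [[t - alpha * g * xj for t, xj in zip(row, x)]
                 for row, g in zip(theta, dl_dy)]
    return new_theta, dl_dy

theta = [[0.5, -1.0], [2.0, 0.0]]
x = [1.0, 2.0]
y_tilde = forward(theta, x)     # exploitation: the action is the model output
delta = [0.3, -0.2]             # custom signal computed outside the framework
theta, dl_dy = surrogate_step(theta, x, y_tilde, delta)
```

In a real framework the optimizer and the built-in squared loss would do the work of `surrogate_step`; the only custom part is supplying `y_tilde` and `delta` as constant inputs.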

- Would \eqref{eq:loss} work in the case of exploration, where $\tilde{\mathbf{y}}^{(t)}\neq\mathbf{y}^{(t)}=f(\mathbf{X}^{(t)};\mathbb{\Theta})$?

Honestly, I don’t know. Maybe just try \eqref{eq:loss} directly, or scale $\mathbf{\delta}$ in \eqref{eq:loss} by a small constant $0<\varepsilon<1$. We could also first run the forward pass to compute $\mathbf{y}^{(t)}=f(\mathbf{X}^{(t)};\mathbb{\Theta})$, then replace $\tilde{\mathbf{y}}$ in \eqref{eq:loss} with $\mathbf{y}^{(t)}$. In this case, we apply the gradient estimated at another point $\tilde{\mathbf{y}}^{(t)}$ to the current point $\mathbf{y}^{(t)}$. In stochastic gradient descent, the estimated gradient is quite noisy anyway.
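One way to see the difference between these options is to compare the derivative w.r.t. $\mathbf{y}$ that each one backpropagates. The sketch below is only an illustration under assumed values; `surrogate_grad` and its `recenter` flag are hypothetical names. Using \eqref{eq:loss} directly yields $(\mathbf{y} - \tilde{\mathbf{y}}) - \varepsilon\mathbf{\delta}$, which adds a pull toward the explored action, while re-centering the target on the fresh forward pass yields exactly $-\varepsilon\mathbf{\delta}$.

```python
def surrogate_grad(y, y_tilde, delta, eps=1.0, recenter=False):
    """Derivative of 0.5 * ||y - target||^2 w.r.t. y for the two target choices."""
    base = y if recenter else y_tilde    # recenter: build the target from the forward pass
    target = [b + eps * d for b, d in zip(base, delta)]
    return [yk - tk for yk, tk in zip(y, target)]

y = [1.0, 2.0]          # model output f(X; Theta)
y_tilde = [1.5, 1.0]    # explored action that was actually taken
delta = [0.25, -0.5]    # custom signal estimated at y_tilde

direct = surrogate_grad(y, y_tilde, delta)                      # (y - y_tilde) - delta
recentered = surrogate_grad(y, y_tilde, delta, recenter=True)   # -delta
```

Which variant trains better is an empirical question that this sketch does not answer.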

What makes HF (3-30 MHz) radio extremely interesting is ionospheric radio propagation, a.k.a. skywave.

“In radio communication, skywave or skip refers to the propagation of radio waves reflected or refracted back toward Earth from the ionosphere, an electrically charged layer of the upper atmosphere. Since it is not limited by the curvature of the Earth, skywave propagation can be used to communicate beyond the horizon, at intercontinental distances. It is mostly used in the shortwave frequency bands.”

Source: Wikipedia

Specifically, there are two modes of skywave propagation, as illustrated in the following figures. Skywave with lower incident angle can reach longer distances, leaving a skip zone around the transmitter. The skip zone can be covered by near vertical incidence skywave (NVIS). In theory, a 20 Watt handheld SSB HF radio can talk to receivers over thousands of kilometers through oblique incidence propagation (lower incident angle), or receivers within hundreds of kilometers via NVIS.

The ionosphere is greatly affected by the time of day and the space weather, which in turn determine the workable frequency bands for skywave. Therefore, users of AM broadcast services and single-sideband (SSB) communications in HF bands need to schedule their radio frequencies accordingly: typically, lower frequencies at night and higher frequencies during the day. This is a unique challenge of HF radio, which can be addressed by automatic link establishment (ALE).

The following educational materials can help you better understand HF radio:

- Intro
- Episode 1 - RF Theory
- Episode 2 - Military HF History
- Episode 3 - HF NVIS
- Episode 4 - VOACAP Analysis
- Episode 5 - HF Antennas
- Episode 6 - 2G and 3G ALE
- Episode 7 - Digital Communications. (AES 256)
- Episode 8 - US Army MARS. (wikipedia, official site)
- Episode 9 - Lessons from the Field

*Note that episodes 7-9 in this series are missing for obvious reasons. I added some links which I think are useful. This is the best introductory lecture for HF radios I have seen so far.*

A shorter (45 min) introduction to this topic is “The HF Renaissance in the US Army” presented by Prof. Col. Stephen Hamilton from West Point. This presentation covers some of the missing episodes of Matthew Sherburne’s lecture, including some cool field tests.

https://www.rohde-schwarz.com/us/campaigns/rsa/adt/hf-learning-center_253628.html

Beginners and hobbyists can use a web-based software-defined radio (SDR) (http://websdr.org) to get first-hand experience of MF and HF radio worldwide, for free, in a browser. Here is a screenshot of one of the earliest WebSDRs (http://websdr.ewi.utwente.nl:8901/), provided by the amateur radio club at the University of Twente in the Netherlands. Each bright line in the waterfall plot of the spectrum represents an active channel, and we can see that the HF band is very busy in the Netherlands.

- RTL-SDR https://www.rtl-sdr.com/
- GNU Radio https://www.gnuradio.org/
- The National Association for Amateur Radio https://www.arrl.org/
- HAM Radio School https://www.hamradioschool.com/

You would probably see something like this, **“AM/FM/LW/VHF/Shortwave SSB Radio”**, on a commercial radio receiver. To choose a proper one, you need to know this jargon:

- AM: Amplitude modulation; AM broadcasting is mostly in 525-1705 kHz (medium wave, or medium frequency (MF), 300 kHz-3 MHz), with 9 kHz or 10 kHz spacing. Range: 400 km during daytime and 2000 km at night.
- FM: Frequency modulation; usually refers to FM broadcasting in 88.0-108.0 MHz. Range: mostly local stations.
- SSB: Single-sideband modulation.
- LW: Longwave, or low frequency (LF), 148.5-283.5 kHz, generally with 9 kHz spacing; has a limited number of services.
- SW: Shortwave, or high frequency (HF), 3-30 MHz, mostly AM or SSB models. Range: intercontinental.
- VHF: Very high frequency, 30-300 MHz. VHF band 1: 54-88 MHz (TV), VHF band 2: 87.5-108 MHz (FM), VHF band 3: 174-216 MHz (TV).
- Airband/Aircraft band: 108-136 MHz, https://en.wikipedia.org/wiki/Airband
- UHF: Ultra high frequency, 300-3000 MHz.

see Broadcast band
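As a quick way to check which band a dial frequency falls into, the list above can be turned into a small lookup. This is only an illustrative sketch: the helper names `itu_band` and `wavelength_m` are made up for this post, and the LF range of 30-300 kHz is the general ITU band, which is wider than the longwave broadcast segment listed above.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0

# (band name, lower edge in Hz, upper edge in Hz), using ITU band edges
ITU_BANDS = [
    ("LF", 30e3, 300e3),    # longwave broadcasting sits inside this band
    ("MF", 300e3, 3e6),     # medium wave, AM broadcasting
    ("HF", 3e6, 30e6),      # shortwave
    ("VHF", 30e6, 300e6),   # FM broadcasting, airband, TV
    ("UHF", 300e6, 3e9),
]


def itu_band(freq_hz):
    """Return the band name for a frequency in Hz, or None if outside the table."""
    for name, lower, upper in ITU_BANDS:
        if lower <= freq_hz < upper:
            return name
    return None


def wavelength_m(freq_hz):
    """Free-space wavelength in meters."""
    return SPEED_OF_LIGHT_M_S / freq_hz


if __name__ == "__main__":
    for label, freq in [("AM station", 1.0e6), ("40 m ham band", 7.1e6),
                        ("FM station", 98.5e6), ("airband", 121.5e6)]:
        print(f"{label}: {itu_band(freq)}, about {wavelength_m(freq):.1f} m")
```

The wavelength helper also explains names such as the "40 m band": a signal near 7.1 MHz has a wavelength of roughly 42 meters.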

- Eric E. Johnson, Erik Koski, and William N. Furman. “Third-generation and wideband HF radio communications.” Artech House, 2013. Google Books
- Jinlong Wang, Guoru Ding, and Haichao Wang, “HF communications: Past, present, and future,” in China Communications, vol. 15, no. 9, pp. 1-9, Sept. 2018, doi: 10.1109/CC.2018.8456447.
- Hervás M, Bergadà P, Alsina-Pagès RM. “Ionospheric Narrowband and Wideband HF Soundings for Communications Purposes: A Review.” Sensors (Basel). 2020;20(9):2486. Published 2020 Apr 28. doi:10.3390/s20092486
- Witvliet, B.A., Alsina-Pagès, R.M. “Radio communication via Near Vertical Incidence Skywave propagation: an overview.” Telecommun Syst 66, 295–309 (2017). https://doi.org/10.1007/s11235-017-0287-2
- Jian Wang, Yafei Shi, Cheng Yang, Feng Feng, “A review and prospects of operational frequency selecting techniques for HF radio communication,” Advances in Space Research, 2022, ISSN 0273-1177, https://doi.org/10.1016/j.asr.2022.01.026.
- Xianglong Yu, An-An Lu, Xiqi Gao, Geoffrey Ye Li, Guoru Ding, and Cheng-Xiang Wang, “HF Skywave Massive MIMO Communication,” in IEEE Transactions on Wireless Communications, doi: 10.1109/TWC.2021.3115820.
- Toros Arikan and Andrew C. Singer, “Receiver Designs for Low-Latency HF Communications,” in IEEE Transactions on Wireless Communications, vol. 20, no. 5, pp. 3005-3015, May 2021, doi: 10.1109/TWC.2020.3046475.
- Zhiqiang Qin, Jinlong Wang, Jin Chen, Guoru Ding, Yu-Dong Yao, Xinsheng Ji, and Xiangming Chen, “Link Quality Analysis Based Channel Selection in High-Frequency Asynchronous Automatic Link Establishment: A Matrix Completion Approach,” in IEEE Systems Journal, vol. 12, no. 2, pp. 1957-1968, June 2018, doi: 10.1109/JSYST.2017.2717702.
- Jian Wang, Cheng Yang and Wenxing An, “Regional Refined Long-term Predictions Method of Usable Frequency for HF Communication Based on Machine Learning over Asia,” in IEEE Transactions on Antennas and Propagation, doi: 10.1109/TAP.2021.3111634.

Here is a playlist of lectures from the archive of CS224W fall 2019, where slides are available.

- Lecture 1 Introduction; Structure of Graphs
- Lecture 2 Properties of Networks And Random Graph Models
- Lecture 3 Motifs and Structural Roles in Networks
- Lecture 4 Community Structure in Networks
- Lecture 5 Spectral Clustering
- Lecture 6 Message Passing and Node Classification
- Lecture 7 Graph Representation Learning
- Lecture 8 Graph Neural Networks
- Lecture 9 Graph Neural Networks Implementation with Pytorch Geometric
- Lecture 10 Deep Generative Models for Graphs
- Lecture 11 Link Analysis - PageRank
- Lecture 12 Network Effects and Cascading Behavior
- Lecture 13 Probabilistic Contagion and Models of Influence
- Lecture 14 Influence Maximization in Networks
- Lecture 15 Outbreak Detection in Networks
- Lecture 16 Network Evolution
- Lecture 17 Reasoning over Knowledge Graphs
- Lecture 18 Limitations of Graph Neural Networks
- Lecture 19 Applications of Graph Neural Networks

Recommended reading from the course page

- Graph Representation Learning by William L. Hamilton
- Networks, Crowds, and Markets: Reasoning About a Highly Connected World by David Easley and Jon Kleinberg
- Network Science by Albert-László Barabási