dc.description |
Machine Learning (ML) is transforming the way that computers are used to solve problems in computer vision, natural language processing, scientific modelling, and much more. The rising number of devices connected to the Internet generate huge quantities of data that can be used for ML purposes.
Traditionally, organisations require user data to be uploaded to a single location (i.e., cloud datacentre) for centralised ML. However, public concerns regarding data-privacy are growing, and in some domains such as healthcare, there exist strict laws governing the access of data. The computational power and connectivity of devices at the network edge is also increasing: edge computing is a paradigm designed to move computation from the cloud to the edge to reduce latency and traffic.
Federated Learning (FL) is a new and swiftly-developing field that has huge potential for privacy-preserving ML. In FL, edge devices collaboratively train a model without users sharing their personal data with any other party. However, there exist multiple challenges for designing useful FL algorithms, including: the heterogeneity of data across participating clients; the low computing power, intermittent connectivity and unreliability of clients at the network edge compared to the datacentre; and the difficulty of limiting information leakage whilst still training high-performance models.
This thesis proposes new methods for improving the process of FL in edge computing and hence making it more practical for real-world deployments. First, a novel approach is designed that accelerates the convergence of the FL model through adaptive optimisation, reducing the time taken to train a model, whilst lowering the total quantity of information uploaded from edge clients to the coordinating server through two new compression strategies. Next, a Multi-Task FL framework is proposed that allows participating clients to train unique models that are tailored to their own heterogeneous datasets whilst still benefiting from FL, improving model convergence speed and generalisation performance across clients. Then, the principle of decreasing the total work that clients perform during the FL process is explored. A theoretical analysis (and subsequent experimental evaluation) suggests that this approach can reduce the time taken to reach a desired training error whilst lowering the total computational cost of FL and improving communication-efficiency. Lastly, an algorithm is designed that applies adaptive optimisation to FL in a novel way, through the use of a statistically-biased optimiser whose values are kept fixed on clients. This algorithm can leverage the convergence guarantees of centralised algorithms, with the addition of FL-related error-terms. Furthermore, it shows excellent performance on benchmark FL datasets whilst possessing lower computation and upload costs compared to competing adaptive-FL algorithms. |
|