We consider the problem of learning over graphs. A standard problem in machine learning is that of supervised learning in which we are given data consisting of a set of patterns paired with associated labels and the aim is to find a function that will predict well on future data. In the semi-supervised learning framework we consider that we may have a dataset of both pattern-label pairs and pattern singletons. For example, a dataset for optical character recognition may consist of a number of images of characters and their labels, or some mixture of labelled images and unlabelled images. We assume that the data may be represented as a graph. this graph may be naturally associated to the data for example a protein interaction network. Otherwise a graph may be "built" via a similarity metric. We will give a method to predict the labelling of the graph given a partial vertex labelling. To this effect we will give simple algorithms. The algorithm(s) may be distributed over a network of simple computational nodes whose toplogy is identical to the graph's topology. Each node's computation is simple, local and asynchronous. We will consider extensions to active learning and a method to combine multiple graph representations of the data. We will give two types of performance guarantees: first, on the convergence rate of these algorithms; second, on the quality of the predictions of these algorithms, which we will interpret in terms of the structural properties of the graph. Experimental results will also be given.