I have research interest in the following areas:
- network/graph analysis and visualization
- machine learning
- numerical linear algebra
- dynamic load balancing
- static load balancing
- sparse matrices ordering
- scheduling of message passing
- numerical optimization
- CFD solvers
- parallelising industrial applications
- chemical process simulation (Parallel DAE solvers)
network/graph analysis and visualizationNetworks or graphs encapsulate complex relations in many application areas. Algorithms to analysis these networks, for example, in finding out community structures/clusters, calculating PageRank, and visualizing the networks, play an important role in discovering structures and gaining knowledge from complex data. A novel multilevel force directed algorithm with fast force calculation using octree data structure and supernode force approximation was proposed and implemented, resulted in an efficient graph drawing algorithm with a complexity of O(|V| log(|V|) + |E|). This algorithm can handle graphs of millions of nodes, as well as large tree like graphs, such as the Mathematics Genealogy Trees.
One interesting problem is node overlap removal. Often nodes contain information that need to be displayed. Traditional graph drawing algorithm ignore the node size. An efficient and effective overlap removal algorithm, PRISM, has been developed.
Edge bundling is a technique that can help in reduce clutters of large graph visualization.
machine learningA common task of recommender systems is to improve customer experience through personalized recommendations based on prior implicit feedback. These systems passively track different sorts of user behavior, such as purchase history, watching habits and browsing activity, in order to model user preferences. Unlike the much more extensively researched explicit feedback, we do not have any direct input from the users regarding their preferences. In particular, we lack substantial evidence on which products consumer dislike. In this work we identify unique properties of implicit feedback datasets. We propose treating the data as indication of positive and negative preference associated with vastly varying confidence levels. This leads to a factor model which is especially tailored for implicit feedback recommenders. We also suggest a scalable optimization procedure, which scales linearly with the data size. The algorithm is used successfully within a recommender system for television shows. It compares favorably with well tuned implementations of other known methods. In addition, we offer a novel way to give explanations to recommendations given by this factor model. A paper describing this work is here .
numerical linear algebraI have been doing research and developing software in AMG, iterative algorithms and geometric multigrid. A parallel AMG code SAM (Scalable Algebraic Multigrid) has been developed.
dynamic load balancingFor many practical applications, such as adaptive mesh based calculations, the load changes during the course of parallel computation, making it necessary to balance the load dynamicly, and in parallel. Most of the existing parallel dynamic load balancing algorithms involve two steps: flow calculation and migration. The traditional algorithm for flow calculation is the diffusion algorithms. We have proved that this type of algorithms are "optimal" in terms of the amount of load migration. However we have proposed a new algorithm which is also "optimal", but is much more efficient.
static load balancing (grid partitioning)the numerical solution of partial differential equations usually involves dividing up the physical domain into small elements or volumes to form a mesh. To solve the problem on a distributed memory parallel computer, the mesh should be decomposed into subdomains, the number of which usually equals the number of processors, with each subdomain assigned to a unique processor. The static load balancing problem is that of how to decompose the mesh into subdomains so that each processor has about the same amount of computation and so that the communication cost between processors is minimized, with the overall aim of minimizing the runtime of the calculation.
sparse matrices ordering Isolving general nonsymmetric sparse systems in parallel efficiently is a challenging problem. One way of achieve that is to order the system into bordered block-diagonal form, and factorize the diagonal blocks in parallel. A multilevel algorithm MONET has been develop for this purpose. Other ordering algorithms are also being investigated, for example, for minimizing frontal size/bandwidth in the context of frontal/multifrontal solvers.
sparse matrices ordering IIThe problem of reordering a matrix to minimize its front size has many applications. A good reordering helps to reduce the memory usage as well as floating point operations of a frontal solver, it also helps when forming incomplete a LU factorization as a preconditioner for Krylov subspace iterative algorithms. We have proposed a multilevel algorithm for reordering sparse symmetric matrices to reduce the wavefront and profile. The algorithm uses a maximal independent vertex set for coarsening and the Sloan algorithm on the coarsest graph. It is shown to produce orderings that are of a similar quality to those obtained using the best existing combinatorial algorithm (the hybrid Sloan algorithm), yet does not require any spectral information and is significantly faster, requiring on average half the CPU time. A paper entitled Multilevel algorithms for wavefront reduction is available for download.
scheduling of message passingunblocked (asynchronous) communication is usually the most efficient way of message passing in unstructured applications. However when this is not possible due to reasons such as memory consideration, it is necessary to organize the blocked communication to minimize the communication cost. This turns out to be an interesting graph coloring problem.
numerical optimizationI have been working on interior point algorithms for linear and nonlinear optimization. I am also interested in global optimization algorithms (Genetic Algorithms, a parallel controlled random search algorithm). I implemented
CFD solversresearch and development in solvers for CFD, particularly for finite volume/finite element based calculations. These include FAS nonlinear multigrid for SIMPLE based finite volume imcompressible calculations, block correction based parallel multigrid linear solver and AMG solvers for DNS.
parallelising industrial finite element applicationsFLITE3D is a multigrid Euler code for external aero-dynamics. It is used extensively by British Aerospace in aircraft design and simulation. This code has been efficiently parallelised in the FLITE3D project. It has been used for production since and has been proved highly robust and efficient.
chemical process simulation (Parallel DAE solvers)chemical processes are frequently simulated using a large set of differential algebraic equations (DAEs). Computer simulation of such chemical processes involves integration of DAEs over the time interval of interest. Implicit integration schemes are normally used for stability. The nonlinear equations resulted from the implicit integration scheme can be solved using Newton's method, with the sparse Jacobian systems solved using a (direct) sparse solver, such as the MA48 in the Harwell subroutine library. This can be extremely computing intensive. Increasingly, efforts are being made to utilise the vector and parallel computers. Effective use of parallel computers for the solution of DAEs presents a great challenge, because the initial value problems are intrinsically sequential, and the components of the solution process, such as the direct sparse solver, are not easily parallelised either. One way of introducing parallelism into the sparse solver is through reordering the matrices into special for for parallel factorization. To this end we have developed the MONET software . Alternative approaches have also been explored, including iterative algorithms with suitable preconditioners. On a more coarse grain level, parallel extrapolation method has been investigated. It has been demonstrated that some speedups can be achieved.