**A social network caught in the Web**
Lada Adamic, Orkut Buyukkokten and Eytan Adar

We present an analysis of Club Nexus, an online community at Stanford University. Through the Nexus site we were able to study a reflection of the real-world community structure within the student body. We observed and measured social network phenomena such as the small world effect, clustering, and the strength of weak ties. Using the rich profile data provided by the users, we were able to deduce the attributes contributing to the formation of friendships, and to determine how the similarity of users decays as the distance between them in the network increases. In addition, we found correlations between a user's personality and their other attributes, as well as interesting correspondences between how users perceive themselves and how they are perceived by others.
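
As a rough illustration of two of the quantities named above, clustering and the short path lengths behind the small world effect, the following sketch computes an average clustering coefficient and an average shortest-path length on a tiny undirected graph. The toy graph `friends` and all function names are hypothetical; this is not the Club Nexus data or the authors' code.

```python
from collections import deque


def local_clustering(adj, node):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in adj[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))


def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)


def bfs_distances(adj, source):
    """Hop distance from source to every reachable node (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def average_path_length(adj):
    """Mean shortest-path length over all reachable ordered pairs."""
    total = 0
    pairs = 0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0


# Hypothetical toy friendship graph (undirected, stored as adjacency sets).
friends = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "cat"},
    "cat": {"ann", "bob", "dan"},
    "dan": {"cat"},
}

clustering = average_clustering(friends)
path_length = average_path_length(friends)
```

A network is usually called a small world when its clustering is much higher than that of a random graph of the same size and density while its average path length stays comparably short.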

Deepak Agarwal

Inference for edge data in large directed graphs is computationally challenging since one has to account for dependencies that exist between edges. We describe a framework for scaling the computations by using "Communities of Interest" recently introduced in the literature (Cortes, Pregibon and Volinsky). A Community of Interest is a small subgraph centered around a node and is assumed to capture the influence the node has on the entire graph. This approximation enables us to work locally with a large number of small subgraphs whose union constitutes the entire graph. Inference for each subgraph is done using Bayesian Stochastic Blockmodels.
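
The local decomposition can be sketched as follows, under a simplifying assumption: a Community of Interest is approximated here as the set of nodes reachable from a center in at most `radius` directed hops. The definition in Cortes, Pregibon and Volinsky may differ (for example, by using edge weights), and the graph `calls` and the function names are hypothetical.

```python
def community_of_interest(out_edges, center, radius=1):
    """Nodes reachable from `center` in at most `radius` directed hops."""
    members = {center}
    frontier = {center}
    for _ in range(radius):
        # Expand one hop along outgoing edges, keeping only new nodes.
        frontier = {v for u in frontier for v in out_edges.get(u, ())} - members
        members |= frontier
    return members


def induced_edges(out_edges, nodes):
    """Directed edges whose endpoints both lie in `nodes`."""
    return {(u, v) for u in nodes for v in out_edges.get(u, ()) if v in nodes}


# Hypothetical directed graph as an adjacency list.
calls = {"a": ["b", "c"], "b": ["c"], "c": ["d"], "d": []}

all_edges = {(u, v) for u, vs in calls.items() for v in vs}

# The union of the small subgraphs, one per node, recovers every edge.
covered = set()
for node in calls:
    covered |= induced_edges(calls, community_of_interest(calls, node))
```

With `radius=1`, every edge appears in the subgraph centered at its source node, so the union of subgraphs covers the whole graph, which is the property the abstract relies on.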

Mark S. Handcock

We consider statistical and stochastic models for random networks that can be used to represent the structural characteristics of the networks. In our applications, the nodes usually represent people, and the edges represent a specified relationship between the people. To date, the use of complex graph models has been limited by three interrelated factors: the complexity of realistic models, paucity of empirically relevant simulation studies, and a poor understanding of the properties of inferential methods. In this talk we discuss solutions to these limitations. We emphasize the importance of likelihood-based inferential procedures and the role of Markov Chain Monte Carlo (MCMC) algorithms for simulation and inference.
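
To make the role of MCMC in simulation concrete, here is a minimal sketch of a Metropolis sampler for the simplest exponential-family graph model, in which the probability of a graph is proportional to exp(theta × number of edges). This edge-only model is an assumption chosen for illustration, not one of the models from the talk; the function name and parameters are hypothetical.

```python
import math
import random


def sample_edge_model(n_nodes, theta, n_steps, burn_in=1000, seed=0):
    """Metropolis sampler for P(G) proportional to exp(theta * edges(G)).

    Returns the mean edge density observed after the burn-in period.
    """
    rng = random.Random(seed)
    dyads = [(i, j) for i in range(n_nodes) for j in range(i + 1, n_nodes)]
    edges = set()  # start from the empty graph
    densities = []
    for step in range(burn_in + n_steps):
        dyad = rng.choice(dyads)  # symmetric proposal: toggle one dyad
        delta = -1 if dyad in edges else 1  # change in edge count if toggled
        # Accept with probability min(1, exp(theta * delta)).
        if rng.random() < math.exp(theta * delta):
            edges.symmetric_difference_update({dyad})
        if step >= burn_in:
            densities.append(len(edges) / len(dyads))
    return sum(densities) / len(densities)


theta = -1.0
mean_density = sample_edge_model(n_nodes=6, theta=theta, n_steps=20000)
# For this edge-only model the dyads are independent, so the density
# should be close to the logistic function of theta.
expected_density = 1.0 / (1.0 + math.exp(-theta))
```

Because the edge-only model has a known closed form, it makes a convenient check that a sampler is behaving before moving to models with dependence between edges, where no such closed form is available.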

Denise Scholtens

Coordination of gene expression data, protein-protein interactions, and other high-throughput information about the properties of the cell is fundamental to bioinformatics research. We pose several recent bioinformatics problems in terms of graphs and emphasize some statistical principles that should be considered when doing inference on multiple graphs. We also demonstrate the use of the graph and Rgraphviz packages in Bioconductor as visualization and analysis tools for bioinformatics graph data types.

Cliff Behrens

We are currently exploring the use of graphs to visualize knowledge and its distribution among subject matter experts (SMEs) as a means of improving collaborative modeling and information discovery. This talk will report on two related applications. The first describes research for DARPA that builds knowledge "contour" maps from similarities among SME responses and estimates of their knowledge in a domain. This map motivates "knowledge-based" collaboration and the use of new collaboration tools by revealing potential advice-giving relationships among SMEs. The second application involves creating visualizations of semantic neighborhoods for a user's query in vector spaces computed independently with Latent Semantic Indexing (LSI). In this case, graphics are provided to help a user decide whether a particular vector space contains appropriate context for a query. Graphics can also yield new insights about terms that contribute meaning to core concepts in a knowledge domain. Some background and example visualizations will be presented for each application.

Friedrich Leisch

Motivated by our current research on visualizing consumer market structure using graphs, we are in the process of implementing R functions that allow for trellis-style displays with a graph layout. While standard trellis graphics use a rectangular grid for the panels, we use the nodes of a graph (projected into 2-dimensional space) as positions for the panels. In addition, each edge of the graph is also drawn using a panel function.

In the simplest case, the node panels draw a circle with the name of the node, and the edge panels draw lines or arrows to give the standard picture of a graph. However, arbitrary panels from the lattice package can be used, e.g., to draw a barplot or scatterplot in each node of the graph. While interactive tools like ggobi are much better suited for exploratory analysis, we aim at publication-quality figures capable of visualizing data with an underlying graph structure of moderate size.

As an example we use marketing data on product perceptions. Each node of the graph represents a cluster of perceptions; edges connect neighboring clusters. For each node we have several background variables like distribution of brands, sales, time or sociodemographic data of customers. Differences between the market segments are tested using permutation tests and visualized using the framework described above.
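
A permutation test of the kind mentioned above can be sketched as follows. This is a generic two-sample test on the difference in means, assumed here for illustration; the test statistic used in the actual analysis is not stated in the abstract, and the sample values are hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_test(group_a, group_b, n_perm=10000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign observations to the two groups
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return (extreme + 1) / (n_perm + 1)


# Hypothetical values of one background variable in two market segments.
segment_1 = [1, 2, 3, 4, 5]
segment_2 = [11, 12, 13, 14, 15]

p_different = permutation_test(segment_1, segment_2)
p_same = permutation_test(segment_1, segment_1)
```

The appeal of the approach is that it makes no distributional assumption: the reference distribution is built from the data by shuffling segment labels.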

James Moody,

Increased interest in longitudinal social networks and the recognition that visualization fosters theoretical insight create a need for dynamic network visualizations, or network "movies." The successful development of network movies requires confronting a number of theoretical questions surrounding the temporal representations of social networks and technical questions about how best to link network change to changes in the graphical representation. We divide network movies into two major classes: static "flip books," where node position remains constant but edges cumulate over time, and dynamic "movies," where nodes move as a function of changes in relations. Static graphs are particularly useful in contexts where relations are sparse. When the network is more connected, movies are often more appropriate, and the bulk of our discussion focuses on techniques and challenges associated with developing meaningful dynamic network movies. We explore the returns to different movie styles using three empirical examples. A new software program for creating network movies is discussed in the appendix.
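
The flip-book idea, with fixed node positions and edges that cumulate over time, can be sketched as a sequence of growing edge sets, one per time point. The event format `(time, source, target)` and the function name are assumptions made for this illustration, not the interface of the software mentioned above.

```python
def flip_book_frames(timed_edges):
    """Return a list of (time, cumulative edge set) pairs.

    Each frame contains every edge observed at or before its time point,
    so edges accumulate from one frame to the next.
    """
    frames = []
    seen = set()
    for t in sorted({time for time, _, _ in timed_edges}):
        seen |= {(u, v) for time, u, v in timed_edges if time == t}
        frames.append((t, frozenset(seen)))
    return frames


# Hypothetical relational events: (time, source, target).
events = [(1, "a", "b"), (2, "b", "c"), (2, "a", "c"), (3, "c", "d")]

frames = flip_book_frames(events)
frame_sizes = [len(edge_set) for _, edge_set in frames]
```

Drawing each frame with the same node coordinates gives the static flip book; a dynamic movie would instead recompute the layout as the edge set changes.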

Scott White

Very large networks with arbitrarily rich attribute structure have become increasingly common in recent years. Examples include collaboration networks, protein interaction networks, and telecommunication networks. The Java Universal Network/Graph Framework (JUNG, sourceforge.net/projects/jung/) provides a general Java-based software development framework designed to support the modeling, analysis, and visualization of data that can be represented as graphs. In this brief talk, I will give an overview of JUNG and show how it can be used to support various tasks and queries (e.g., clustering, ranking, filtering, and visualization) that are common to large-scale network analysis.