Senior member, technical staff
Statistics Research Department
AT&T Labs Research
I am a member of the Statistics
Research Department at AT&T Labs in Florham Park, NJ, where I work on
hierarchical Bayesian modeling, MCMC methods, visualization of hierarchies,
text mining, and other topics related to applied statistics.
My paper with Amy Reibman and Chao Tian was recently accepted to the IEEE International
Conference on Image Processing (aka ICIP). The paper is called
"A Probabilistic Pairwise-Preference Predictor For Image Quality", and in it we describe a multilevel Bayesian model that can be used
as an objective image quality estimator. To gather data, we used Mechanical Turk to run an experiment in which subjects viewed pairs of images online
and indicated which image from each pair they felt was of higher quality. The images were systematically degraded with various types of
distortions at various levels of severity. From our model we inferred the effects of the distortion types, their severities, and existing
objective quality estimators, while
controlling for the effects of subject-specific bias (where subjects systematically tend to prefer either the left or right image, all else being equal)
and reference-image bias (where subjects tended to prefer, for example, the image of the elephant to the image of the barn). For all the
details, here's the PDF; Amy will be presenting it at the conference in Sydney next month.
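The paper's actual model is a multilevel Bayesian model fit to the full experiment, but the core pairwise-preference idea can be illustrated with a minimal Bradley-Terry-style sketch. Everything here (the function name, the simple logistic link, and the single scalar bias term) is my own illustration under stated assumptions, not the paper's specification:

```python
import math

def prefer_left_prob(q_left, q_right, subject_bias=0.0):
    """Probability that a subject prefers the left image of a pair.

    q_left, q_right: latent quality scores of the two images
    (in the paper, these would be informed by distortion type,
    severity, and objective quality estimators).
    subject_bias: a per-subject left/right position bias; positive
    values favor the left image regardless of quality.
    """
    # Logistic link on the quality difference plus position bias.
    return 1.0 / (1.0 + math.exp(-(q_left - q_right + subject_bias)))
```

With equal qualities and no bias, the left image is preferred with probability 0.5; a positive subject bias shifts that probability toward the left image even when the qualities are identical, which is the effect the model controls for.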
Two weeks ago at EuroVis 2013, my paper with Howard Karloff was awarded an Honorable Mention for
the Best Paper Award! Congrats also go to my
colleagues Jim Klosowski and Carlos Scheidegger (and their co-authors) for earning the other Honorable Mention
award. It was a solid showing for AT&T Labs Research!
My paper with Howard Karloff, "Maximum Entropy Summary Trees", has been accepted for publication
at EuroVis 2013. I'm really excited about this work: it is an
algorithm for summarizing the structure of a large, rooted, node-weighted tree that leads to nice
visualizations. We define a "summary tree" as an aggregation of the nodes of the original tree subject to certain constraints. Then,
our algorithm computes the maximum entropy summary tree, where we define the entropy of a node-weighted tree as the
entropy of the discrete probability distribution whose probabilities are the normalized node weights. The result is a way
to visualize a 100-node summary, for example, of a really huge tree (which might have had 500,000 nodes to begin with), where
this particular 100-node summary is by definition the most informative such summary (according to entropy)
among all possible summaries of the same size. Sequentially viewing the maximum entropy k-node summary trees for k = 2, 3, 4, ..., 100 is a
really nice way to do visual exploratory data analysis (EDA) on large, hierarchical data.
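The entropy objective itself is simple to state in code. Here is a small sketch (the function name is mine) of the quantity being maximized: the Shannon entropy of the discrete distribution obtained by normalizing the node weights of a summary tree:

```python
import math

def tree_entropy(weights):
    """Entropy of a node-weighted tree: the Shannon entropy (in bits)
    of the discrete distribution whose probabilities are the
    normalized node weights."""
    total = sum(weights)
    # Zero-weight nodes contribute nothing to the entropy.
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log2(p) for p in probs)
```

For example, a 100-node summary with perfectly balanced weights attains the maximum possible entropy of log2(100) ≈ 6.64 bits, while a summary that lumps nearly all the weight into one node has entropy near 0; the algorithm searches over valid aggregations for the summary tree scoring highest on this measure.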
Here is a link to the paper and to the webpage for summary trees,
which includes more discussion and the supplementary material for the paper (an appendix + some examples). My plans for the next steps include
an R package and a d3 implementation.
Below is the 56-node maximum entropy summary tree of the Mathematics Genealogy tree rooted at Carl Gauss (forced to be a tree by removing all but
the primary advisor of each student), which has over 43,000 nodes in its original form.
For older news, click here.