AT&T Labs - Research


Professional Life:

Xin (Luna) Dong
lunadong@research.att.com
Data Management Dept
AT&T Labs--Research
Bld 103, Rm B281
180 Park Ave.
Florham Park, NJ 07932
Tel: (973)360-8508
Fax: (973)360-8421

 

 

Xin (Luna) Dong

I am currently a researcher in the Data Management Department at AT&T Labs - Research. I received my Ph.D. in Computer Science and Engineering from the University of Washington. Before coming to the United States, I obtained an M.S. in Computer Science from Peking University and a B.S. in Computer Science from Nankai University in China.

You can find my C.V. here.
 


Research Interests

The goal of my research is to help people organize, access, and search information effectively and efficiently. My research aims at answering the following questions:

  • The amount of information produced in the world increases by 30% every year and this rate will only go up. On the one hand, how can we take full advantage of the abundant information to improve the quality of data? On the other hand, how can we help users find the answers that address their questions without overwhelming them with the huge amount of relevant information?
  • Nowadays more and more data-management applications need to integrate a large volume of heterogeneous data, but data integration often requires significant upfront effort and technical expertise. How can we enable data sharing at minimal cost while still guaranteeing the quality of the shared data?
  • Data exist in various forms: structured, semi-structured, and unstructured. How can we realize the full potential of the structure that exists explicitly or implicitly in data to best fulfill people's information needs?
  • The world contains rich associations: between real-world entities, between data sources, and between data items. How can we discover and leverage such associations to facilitate search?
  • How can we help non-technical users best organize, access, and understand information?

In particular, my current research interests include

  • Data integration
  • Data cleaning
  • Web search
  • Record linkage
  • Personal information management, community information management, enterprise data management.

Projects

Currently my research focuses on two directions: source-dependence discovery for data cleaning and data integration, and data integration with uncertainty.

Solomon---Detecting dependence between data sources

The Internet accelerates the rate of information being produced and eases duplication and transmission of data across data sources. Solomon aims at discovering dependencies between data sources and leveraging such knowledge for deciding truth from conflicting information and for efficiently answering queries over a set of data sources.  [Vision paper][Talk]

 

UDI---Data Integration with Uncertainty

As the scope of data integration applications broadens, data integration systems need to handle uncertainty at various levels and do so in a principled fashion. Our goal in this project is to build a system that can quickly provide search services over a set of heterogeneous data sources without requiring strenuous work to set up precise mappings between the sources. [Talk][DBClip]

My recent projects include

  • Semex---Personal information management system
  • Woogle---Web-service search engine
  • Piazza---Peer data management system

Selected Publications

You can find the full list of my publications here and my DBLP entry here. Below is a list of selected papers categorized by research area.

  1. Data fusion and source dependence
    • Xin Luna Dong and Felix Naumann. Data fusion--Resolving data conflicts for integration. In VLDB, 2009. [PDF][Presentation]
    • Xin Luna Dong, Laure Berti-Equille, and Divesh Srivastava. Integrating conflicting data: the role of source dependence. In VLDB, 2009. [PDF][Presentation]
    • Xin Luna Dong, Laure Berti-Equille, and Divesh Srivastava. Truth discovery and copying detection in a dynamic world. In VLDB, 2009. [PDF][Presentation]
    • Laure Berti-Equille, Anish Das Sarma, Xin Luna Dong, Amelie Marian, and Divesh Srivastava. Sailing the information ocean with awareness of currents: discovery and application of source dependence. In CIDR, 2009. [PDF][Presentation]
  2. Dataspaces
    • Daisy Zhe Wang, Xin Luna Dong, Anish Das Sarma, Michael Franklin, Alon Halevy. Functional Dependency Generation and Applications in Pay-As-You-Go Data Integration Systems. In WebDB, 2009. [PDF]
    • Anish Das Sarma, Xin Dong, and Alon Y. Halevy: Bootstrapping Pay-as-you-go Data Integration Systems. In SIGMOD, 2008. [PDF]
    • Xin Dong, Alon Y. Halevy and Cong Yu: Data Integration with Uncertainties. In VLDB, 2007. [PDF][Presentation][DBClip][JournalVersion]
    • Xin Dong and Alon Y. Halevy: Indexing Dataspaces. In SIGMOD, 2007. [PDF][Presentation]
    • Jing Liu, Xin Dong and Alon Y. Halevy: Answering Structured Queries on Unstructured Data. In WebDB 2006. [PDF][Presentation]
    • Xin Dong, Alon Y. Halevy and Jayant Madhavan: Reference Reconciliation in Complex Information Spaces. In SIGMOD 2005. [PDF][Presentation]
  3. Personal information management
    • Yuhan Cai, Xin Dong, Alon Y. Halevy, Jing Liu and Jayant Madhavan: Personal Information Management with SEMEX. SIGMOD Demo 2005. (BEST DEMO, one of three top demos)[PDF][Presentation]
    • Xin Dong and Alon Y. Halevy: A Platform for Personal Information Management and Integration. In CIDR 2005. [PDF][Presentation]
  4. Misc
    • Xin Dong, Alon Y. Halevy, Jayant Madhavan, Ema Nemes and Jun Zhang: Similarity Search for Web Services. In VLDB 2004. [PDF][Presentation]
    • Xin Dong, Alon Y. Halevy and Igor Tatarinov: Containment of Nested XML Queries. In VLDB 2004. [PDF][Presentation][Tech-report]

Talks

  • Data fusion--Resolving data conflicts for integration. [PPT]
    • NDBC tutorial, Nanchang, China, October 2009. [PPT]
    • VLDB tutorial, Lyon, France, August 2009.
  • Sailing the information ocean with awareness of currents: discovery and application of source dependence. [PPT]
    • NDBC invited talk, Nanchang, China, October 2009.
    • SKG tutorial, Zhuhai, China, October 2009.
    • AT&T, Florham Park NJ, July 2009.
  • Data integration with uncertainty. [PPT]
    • SKG panel talk, Zhuhai, China, October 2009.
    • Database group, Renmin University, Beijing, China, September, 2009.
    • Computer Science Dept, New Jersey's Science & Technology University, Newark, NJ, Feb 2009.
    • Database Group, UPenn, Philadelphia, PA, Nov 2008.
    • Computer Science Dept, Stevens Institute of Technology, Hoboken NJ, Sep 2008.
    • AT&T, Florham Park NJ, May 2008.
  • Managing a space of heterogeneous data. [PPT]
    • Database group, Renmin University, Beijing, China, June 2007.
    • University of Wisconsin, Madison WI, February 2007.
  • Semex: A platform for personal information management and integration. [PPT]
    • Microsoft Research, Adaptive System & Interaction Core Group, Seattle, November 2005.
    • Microsoft Research Asia, Beijing, China, June 2005.

Patents

  • Detecting Dependence Between Sources in Truth Discovery. Xin Dong, Laure Berti-Equille, Divesh Srivastava. United States Patent, filed 5/2009, to be issued.
  • Minimal difference query and view matching. Raghav Kaushik, Venkatesh Ganti and Xin Dong. United States Patent 7251646, issued 7/31/2007.
  • Method and apparatus for updating XML views of relational data. Philip L. Bohannon, Xin Dong, Henry F. Korth, Suryanarayan Perinkulam. United States Patent 20050165866, filed Jan 28, 2004, to be issued.

Recent Professional Activities

  • PC chair in SKG'09.
  • PC member in VLDB Demo'10, WWW'10, NTII'10, ICDE'10, CIKM'09, WebDB'09, VLDB'09, CIKM'08, WebDB'08, VLDB Demo'08, WWW'08.
  • NIH contract reviewer, 2008.
  • Referee for PVLDB, VLDB Journal, TODS, TCS, TOIT, TOIS, TKDE, IS.

Resources

  • Here is a long and growing list of papers in database, IR and AI that I have collected during my research and my readings.
  • Here is a collection of wisdom on career, research, life, etc.

Personal Life:

Xin (Luna) Dong 董欣
lunadong@gmail.com
Morristown, NJ 07960
Tel: (201)650-3494


In my personal life, I am