CIKM 2001: Technical Program


TUESDAY, NOV 6, 2001

9:00 - 9:15

Opening by General Chair and PC-Co-Chairs

9:15 - 10:15 

Keynote Address One
An Integrated Approach to Knowledge Management

Alfred Spector, IBM
Session Chair: Calton Pu, Georgia Institute of Technology (USA)

10:15 - 10:45 


10:45 - 12:00 

Research Session 1: Similarity Search and Query Optimization
Session Chair: Ellen Voorhees, NIST (USA)

Efficient Processing of Conical Queries
Hakan Ferhatosmanoglu (Ohio State University), Divyakant Agrawal, Amr El Abbadi (University of California, Santa Barbara)

Joint Optimization of Cost and Coverage of Query Plans in Data Integration
Zaiqing Nie and Subbarao Kambhampati (Arizona State University)

Effective Nearest Neighbor Indexing with the Euclidean Metric
Sang-Wook Kim (Kangwon National University), Charu Aggarwal, and Philip Yu (IBM T.J. Watson Research Center) *

Query-Sensitive Similarity Measures for the Calculation of Interdocument Relationships
Anastasios Tombros and C.J. van Rijsbergen (University of Glasgow) *

Research Session 2: Clustering
Session Chair: Ke Wang, Simon Fraser University (Canada)

Bipartite graph partitioning and data clustering
Hongyuan Zha (Penn State University), Xiaofeng He (Penn State University), Chris Ding (Berkeley National Lab.), Ming Gu (U.C. Berkeley), and Horst Simon (Berkeley National Lab.)

Evaluating Document Clustering for Interactive Information Retrieval
Anton Leuski (Center for Intelligent Information Retrieval, UMASS)

Extracting Meaningful Labels for WEBSOM Text Archives
Arnulfo P. Azcarraga and Teddy N. Yap Jr. (National University of Singapore) *

Research Session 3: Query Processing
Session Chair: Sham Navathe, Georgia Institute of Technology (USA)

Exposing the Vagueness of Query Results on Partly Inaccessible Databases
Oliver Haase (Bell Labs Research, Lucent Technologies) and Andreas Henrich (Otto-Friedrich-Universität Bamberg)

Towards a Visual Query Interface for Phylogenetic Databases
Hasan Jamil, Giovanni Modica, and Maria Teran (Mississippi State University)

A Relational Algebra for Data/Metadata Integration in a Federated Database System
Catharine Wyss, Dirk Van Gucht (Indiana University)

12:00 - 1:30 


1:30 - 2:45 

Industry Session 1: Knowledge Management: Organizing What You Know
Session Chair: Aris Ouksel, University of Illinois at Chicago (USA)

Tempus Fugit: A System for Making Semantic Connections
D. Ford, J. Ruvolo, S. Edlund, J. Myllymaki, J. Kaufman, J. Jackson, M. Gerlach (IBM)

FOCI: Flexible Organizer for Competitive Intelligence
Hwee-Leng Ong, Ah-Hwee Tan, Jamie Ng, Hong Pan, and Qiu-Xiang Li (Kent Ridge Digital Labs)

Towards Speech as a Knowledge Resource
E. Brown, S. Srinivasan, A. Coden, D. Ponceleon, J. Cooper, A. Amir, J. Pieper (IBM)

Research Session 4: Pattern Mining
Session Chair: Hakan Ferhatosmanoglu, Ohio State University (USA)

Approximately Common Patterns in Shared-Forests
Manuel Vilares (University of A Coruna), Francisco J. Ribadas (University of Vigo), and Jorge Grana (University of A Coruna)

Multi-Dimensional Sequential Pattern Mining
Helen Pinto, Jiawei Han, Jian Pei, Ke Wang (Simon Fraser University), and Qiming Chen, Umeshwar Dayal (Hewlett-Packard Labs)

Mining Confident Rules Without Support Requirement
Ke Wang (Simon Fraser University), Yu He (Hewlett-Packard Singapore), David Cheung (University of Hong Kong), and Francis Chin (University of Hong Kong)

Research Session 5: Text Extraction and Summarization
Session Chair: Anton Leuski, University of Massachusetts (USA)

Combining Multiple Classifiers for Text Categorization
Khalid Al-Kofahi, Alex Tyrrell, Arun Vachher, Tim Travers, and Peter Jackson (Thomson Legal & Regulatory - R&D)

Text Classification in a Hierarchical Mixture Model for Small Training Sets
Kristina Toutanova (Xerox PARC and Stanford University), Francine Chen, Kris Popat (Xerox Palo Alto Research Center), and Thomas Hofmann (Brown University)

Using LSI for Text Classification in the Presence of Background Text
Sarah Zelikovitz and Haym Hirsh (Rutgers University)

2:45 - 3:15 


3:15 - 4:55 

Research Session 6: World Wide Web
Session Chair: Ronen Feldman, ClearForest (USA)

Keeping Found Things Found on the Web
William Jones, Harry Bruce (University of Washington), and Susan Dumais (Microsoft Research)

Merging Techniques for Performing Data Fusion on the Web
Theodora Tsikrika and Mounia Lalmas (University of London, UK) *

Using navigation data to improve IR functions in the context of Web search
Mark Hansen and Elizabeth Shriver (Bell Labs)

Mining the Web for Answers to Natural Language Questions
Dragomir Radev, Hong Qi, Zhiping Zheng, Sasha Blair Goldensohn, Zhu Zhang, Weiguo Fan (University of Michigan), and John Prager (IBM TJ Watson Research Center)

Research Session 7: Semistructured Data
Session Chair: Alberto Laender, Federal University of Minas Gerais (Brazil)

Induction of Integrated View for XML Data with Heterogeneous DTDs
Euna Jeong (National Taiwan University) and Chun-Nan Hsu (Institute of Information Science, Academia Sinica)

Structural Inference for Semistructured Data
Jason Sankey (University of Sydney) and Raymond Wong (University of New South Wales)

XOO7: Applying OO7 Benchmark to XML Query Processing Tool
Stephane Bressan (National University of Singapore), Gillian Dobbie (University of Auckland), Zoe Lacroix (Arizona State University), Mong Li Lee, Ying Guang Li (National University of Singapore), Ullas Nambiar, and Bimlesh Wadhwa (Arizona State University)

Structural Proximity Searching for Large Collections of Semi-Structured Data
Michael Barg and Raymond Wong (University of New South Wales) *

Panel Session 1: XML, the WEB and Database Functionality?
: Erich Neuhold (GMD IPSI)

6:00 p.m. - 8:00 p.m.




WEDNESDAY, NOV 7, 2001

9:00 - 10:00

Keynote Address Two
Avoiding Irrelevance in Information Systems Research?
Mike Stonebraker, MIT
Session Chair: Calton Pu, Georgia Institute of Technology (USA)

10:00 - 10:30 


10:30 - 11:45 

Research Session 8: Distributed Information Retrieval
Session Chair: Kevyn Collins-Thompson, Microsoft (USA)

The Effectiveness of Query Expansion for Distributed Information Retrieval
Paul Ogilvie and Jamie Callan (Carnegie Mellon University)

Approaches to Collection Selection and Results Merging for Distributed Information Retrieval
Yves Rasolofo (Institut Interfacultaire d'Informatique), Faïza Abbaci (Ecole Nationale Supérieure des Mines de Saint-Etienne), and Jacques Savoy (Institut Interfacultaire d'Informatique)

Exploiting A Controlled Vocabulary to Improve Collection Selection and Retrieval Effectiveness
James French, Allison Powell (University of Virginia), Fred Gey, and Natalia Perelman (U.C. Berkeley)

Research Session 9: Query Optimization (Cancelled)

Predicting the cost-quality trade-off for information retrieval queries: Facilitating database design and query optimization
Henk Ernst Blok, Djoerd Hiemstra (University of Twente), Sunil Choenni (University of Twente / University Nyenrode), Franciska de Jong, Henk M. Blanken, and Peter M.G. Apers (University of Twente) *

How Foreign Function Integration Conquers Heterogeneous Query Processing
Klaudia Hergula (DaimlerChrysler AG) and Theo Haerder (University of Kaiserslautern) (moved)

Joint Optimization of Cost and Coverage of Query Plans in Data Integration
Zaiqing Nie and Subbarao Kambhampati (Arizona State University)
Note: Moved to Session 1

Research Session 10: Collaborative Filtering and Algorithms
Session Chair: David Grossman, IIT (USA)

A music recommendation system based on music data grouping and user interests
Hung-Chen Chen and Arbee L.P. Chen (National Tsing Hua University)

Selecting Relevant Instances for Efficient and Accurate Collaborative Filtering
Kai Yu (University of Munich), Xiaowei Xu (Siemens AG), Martin Ester, and Hans-Peter Kriegel (University of Munich)

Evaluation of Item-Based Top-N Recommendation Algorithms
George Karypis (University of Minnesota)

12:00 - 1:30 


1:30 - 2:45 

Industry Session 2: Text Summarization and Question Answering
Session Chair: Duncan Ruiz, Pontifical Catholic University of RS (Brazil)

Recent Developments in Text Summarization
Inderjeet Mani (MITRE and Georgetown)

Summarization of Discussion Groups
R. Farrell, P. Fairweather, K. Snyder (IBM)

Question Answering in TREC
Ellen Voorhees (NIST)

Research Session 11: Sequence Mining
Session Chair: Leo Mark, Georgia Tech (USA)

Prefix-Querying: An Approach for Effective Subsequence Matching Under Time Warping in Sequence Databases
Sanghyun Park (IBM), Sang-Wook Kim (Kangwon National University, Korea), June-Suh Cho, and Sriram Padmanabhan (IBM)

Sliding-Window Filtering: An Efficient Algorithm for Incremental Mining
Chang-Hung Lee, Cheng-Ru Lin, and Ming-Syan Chen (National Taiwan University)

Efficient and Robust Feature Extraction and Pattern Matching of Time Series by a Lattice Structure
Wan Po Man Polly and Man Hon Wong (The Chinese University of Hong Kong) *

Research Session 12: Corpus Linguistics
Session Chair: Ian Soboroff, NIST (USA)

Mining the Web to Create Minority Language Corpora
Rayid Ghani, Rosie Jones, and Dunja Mladenic (Carnegie Mellon University)

Automatic Recognition of Distinguishing Negative Indirect History Language in Judicial Opinions
Jack Conrad (Thomson Legal & Regulatory - R&D) and Daniel Dabney (West Group)

Effective Arabic-English Cross Language Information Retrieval via Machine-Readable Dictionaries and Machine Translation
Mohammed Aljlayl and Ophir Frieder (Illinois Institute of Technology)

2:45 - 3:15 


3:15 - 5:15 

Research Session 13: Potpourri
Session Chair: David Grossman, IIT (USA)

A Near Optimal Algorithm for Generating Broadcast Programs on Multiple Channels
Chih-Hao Hsu, Guanling Lee and Arbee L.P. Chen (National Tsing Hua University)

Managing Trust in a Peer-2-Peer Information System
Karl Aberer and Zoran Despotovic (EPFL)

Alternative Representations and Abstractions for Moving Sensors Databases
Jacob Eisenstein, Shahram Ghandeharizadeh, Cyrus Shahabi, Gautum Shanbhag, and Roger Zimmermann (USC)

Termination Analysis of Active Rules Modular Sets
Alain Couchot (Université Paris Val de Marne)

Poster Session 1: Data Access and Knowledge Management
Session Chair: Terence Critchlow, Lawrence Livermore National Laboratory (USA)

Advanced Grouping and Aggregation for Data Integration
Eike Schallehn, Kai-Uwe Sattler, and Gunter Saake (University of Magdeburg)

Dynamic Versioning Concurrency Control for Index-Based Data Access in Main Memory Database Systems
Ying Xia, Sung-Hee Kim, Sook-Kyoung Cho, Kee-Wook Rim, and Hae Young Bae (Inha University, Korea) *

O-PreH: Optimistic Transaction Processing Algorithm based on Pre-Reordering in Hybrid Broadcast Environments
SungSuk Kim, SangKeun Lee, SoonYoung Jung, and Chong-Sun Hwang (Korea University)

Algorithm for Discovering Multivalued Dependencies
Men Hin Yan and Ada Wai-chee Fu (Chinese University of Hong Kong)

A Performance Comparison of bitmap indexes
Kesheng Wu, Ekow Otoo, and Arie Shoshani (Lawrence Berkeley National Laboratory)

Facilitating Knowledge Flow through the Enterprise
Jeanette Bruno (GE Corporate Research and Development)

Information access in Implicit Culture framework
Enrico Blanzieri (Universita' di Torino and ITC-irst, Trento), Paolo Giorgini (Universita' di Trento), Paolo Massa, and Sabrina Recla (ITC-irst, Trento)

Advances in Phonetic Word Spotting
Arnon Amir (IBM), Alon Efrat (University of Arizona), and Savitha Srinivasan (IBM)

Panel Session 2: What Can Researchers Do to Improve Security of Data and Documents?
: Arnon Rosenthal (MITRE)

6:30 p.m. - 9:30 p.m.




THURSDAY, NOV 8, 2001

9:00 - 10:15

Research Session 14: Data Warehouse
Session Chair: Il-Yeol Song, Drexel University (USA)

Index Filtering and View Materialization in ROLAP Environment
Shi Guang Qiu and Tok Wang Ling (National University of Singapore)

Dynamic and Hierarchical Spatial Access Method using Integer Searching
Kyoosang Cho (Sprint), Yijie Han, Yugyung Lee, and E.K. Park (University of Missouri at Kansas City)

Efficient Incremental View Maintenance in Data Warehouses
Ki Yong Lee, Jin Hyun Son, and Myoung Ho Kim (KAIST)

Research Session 15: String Match and Text Extraction
Session Chair: Kristina Toutanova, Stanford (USA)

Improved String Matching Under Noisy Channel Conditions
Kevyn Collins-Thompson (Microsoft), Charles Schweizer (Duke University), and Susan Dumais (Microsoft)

Summarization as Feature Selection for Text Categorization
Aleksander Kolcz (Personalogy, Inc.), Vidya Prabakarmurthi (University of Colorado at Colorado Springs), and Jugal Kalita (University of Colorado at Colorado Springs)

Bootstrapping for Example-Based Data Extraction
Paulo Golgher, Altigran Soares, Alberto Laender and Berthier Ribeiro-Neto (Federal University of Minas Gerais, Brazil)

Research Session 16: Classification
Session Chair: Hans Schek, ETH (Switzerland)

SQL Database Primitives for Decision Tree Classifiers
Kai-Uwe Sattler and Oliver Dunemann (University of Magdeburg)

Learning probabilistic Datalog rules for information classification and transformation
Henrik Nottelmann and Norbert Fuhr (University of Dortmund)

SVM Binary Classifier Ensembles for Image Classification
Kingshy Goh, Edward Chang, and Kwang-Ting Cheng (University of California, Santa Barbara)

10:15 - 10:30


10:30 - 12:15 

Industry Session 3: Data Management: Beyond the Traditional
Session Chair: Len Seligman, MITRE (USA)

The Enosys Markets Data Integration Platform: Lessons from the Trenches
Y. Papakonstantinou & V. Vassalos (Enosys Markets Inc.) *

Self-Managing Technology in IBM DB2 Universal Database
Daniel Zilio, Sam Lightstone, Kelly Lyons, and Guy Lohman (IBM)

Document Release versus Data Access Controls: Two Sides of a Coin?
Arnon Rosenthal (MITRE), Gio Wiederhold (Stanford)

XQuery: Use Cases and Language Features
Jonathan Robie (Software AG)

Research Session 17: Similarity Measures
Session Chair: Charles Nicholas, UMBC (USA)

Model-based Feedback in the Language Modeling Approach to Information Retrieval
Chengxiang Zhai and John Lafferty (Carnegie Mellon University)

PowerDB-IR - Information Retrieval on Top of a Database Cluster
Torsten Grabs, Klemens Böhm, and Hans Schek (ETH Zurich - Institute of Information Systems)

Automatic Query Expansion based on Divergence
D. Cai, C. J. Van Rijsbergen, and J. M. Jose (University of Glasgow) *

Relevance Score Normalization for Metasearch
Mark Montague and Javed Aslam (Dartmouth College)

Research Session 18: Mobile Computing
Session Chair: Alvin Lim, Auburn University (USA)

Binary Interpolation Search for Solution Mapping on Broadcast and On-demand Channels in a Mobile Computing Environment
Jiun-Long Huang, Wen-Chih Peng, and Ming-Syan Chen (National Taiwan University)

Caching Constrained Mobile Data
Subhasish Mazumdar, Mateusz Pietrzyk (New Mexico Tech), and Panos Chrysanthis (University of Pittsburgh)

Scaling Replica Maintenance In Intermittently Synchronized Mobile Databases
Wai Gen Yee, Edward Omiecinski (Georgia Institute of Technology), Michael J. Donahoo (Baylor University), and Shamkant B. Navathe (Georgia Institute of Technology)

An Optimal Construction of Invalidation Reports for Mobile Databases
Wen-Chi Hou (Southern Illinois University at Carbondale), Meng Su (Penn State Erie), Hongyan Zhang (Southern Illinois University at Carbondale), and Hong Wang (CoManage Corporation)

12:15 - 1:45


1:45 - 3:30 

Research Session 19: Association Rule Mining
Session Chair: Edward Omiecinski, Georgia Institute of Technology (USA)

Rapid Association Rule Mining
Amitabha Das, Wee Keong Ng, and Yew Kwong Woon (Nanyang Technological University)

Mining Generalised Disjunctive Association Rules
Amit Anil Nanavati, Krishna Prasad Chitrapura, Sachindra Joshi, and Raghu Krishnapuram (IBM India Research Lab)

Efficient Runtime Generation of Association Rules
Richard Relue, Xindong Wu, and Hao Huang (Colorado School of Mines)

Research Session 20: Multimedia Information Processing
Session Chair: Arbee Chen, National Tsing Hua University (Taiwan)

Automatic Discovery of Salient Segments in Imperfect Speech Transcripts
Dulce Ponceleon and Savitha Srinivasan (IBM)

Finding Similar Images Quickly Using Object Shapes
Hengtao Shen (National University of Singapore) *

Content-Based Retrieval of MP3 Music Objects
Chih-Chin Liu and Po-Jun Tsai (Chung Hua University)

Irregularity in Multi-dimensional Space-Filling Curves with Applications in Multimedia Databases
Mohamed Mokbel and Walid Aref (Purdue University)

Poster Session 2: Information Retrieval and Text Mining
Session Chair: Ian Soboroff, NIST (USA)

Real Time User Context Modeling for Information Retrieval Agents
Travis Bauer and David Leake (Indiana University)

A Clustering Algorithm for Asymmetrically Related Data with Applications to Text Mining
K. Krishna and Raghu Krishnapuram (IBM India Research Lab)

A Statistical Model for Scientific Readability
Luo Si and Jamie Callan (Carnegie Mellon University)

Discovering the Representative of a Search Engine
King Lup Liu (DePaul University), Clement Yu (University of Illinois at Chicago), Weiyi Meng (SUNY at Binghamton), and Adrian Santos (DePaul University)

Reorganizing Web Sites Based on User Access Patterns
Yongjian Fu, Mario Creado, and Chunhua Ju (University of Missouri-Rolla)

A Domain Independent Environment for Creating Information Extraction Modules
Ronen Feldman, Yonatan Aumann, Yair Liberzon, Kfir Ankori, Jonathan Schler, and Benjamin Rosenfeld (ClearForest Corporation)

Ordinal Association Rules for Error Identification in Data Sets
Andrian Marcus, Jonathan Maletic, and King-Ip Lin (University of Memphis)

3:30 - 3:45

Conference Closing Remarks

* The papers marked with an * were cancelled by the authors due to the September 11th tragedy and travel complications arising from this event. As a consequence, some papers had to be rescheduled, and Research Session 9 has been cancelled.