Welcome to SPIRE 2021

SPIRE 2021 is the 28th edition of the annual Symposium on String Processing and Information Retrieval. SPIRE has its origins in the South American Workshop on String Processing, which was first held in Belo Horizonte, Brazil, in 1993. Since 1998 the focus of the workshop has also included information retrieval, due to its increasing relevance to and inter-relationship with string processing.

Covid-19: SPIRE 2021 was originally planned to be held in Lille, France. After consulting the SPIRE community about the Covid-19 situation, we decided that SPIRE 2021 will be held entirely online.

Important Dates

Program (Paris Time, CEST)

October 4th

Morning session

Afternoon session

October 5th

Morning session

Afternoon session

October 6th

Morning session

Afternoon session


Conference proceedings are published in the LNCS series of Springer, volume 12944.


Two Best Student Paper Awards, sponsored by Web4Good, have been given to the author(s) of the most outstanding works whose primary authors are students:
  • Lower Bounds for the Number of Repetitions in 2D Strings, Pawel Gawrychowski, Samah Ghazawi and Gad M. Landau
  • On Stricter Reachable Repetitiveness Measures, Gonzalo Navarro and Cristian Urbina

Invited Speakers

Christina Boucher

Computer & Information Science & Engineering Department, Herbert Wertheim College of Engineering, University of Florida

Christina Boucher is an Associate Professor at the Department of Computer and Information Science and Engineering at the University of Florida. Her lab focuses on developing novel methods and data structures that make efficient use of current sequencing technologies, such as Nanopore sequencing and optical maps, and has deep expertise in antimicrobial resistance. She has received several grants from NSF, NIH, and USDA.

Talk: Indexing genomes in a scalable manner

Daniel Lemire

University of Quebec

Daniel Lemire is a computer science professor at the University of Quebec (TELUQ) in Montreal. His research focuses on software performance, data engineering, and data indexing, with a particular emphasis on SIMD vectorization. He is one of the most popular developers on GitHub, and his libraries are widely used in industry. He received the 2020 Award of Excellence for Achievement in Research.

Talk: Unicode at gigabytes per second

We often represent text using Unicode formats (UTF-8 and UTF-16). UTF-8 is increasingly popular (XML, HTML, JSON, Rust, Go, Swift, Ruby). UTF-16 is most common in Java, .NET, and inside operating systems such as Windows. Software systems frequently have to validate text or convert text from one encoding to the other. While recent disks have bandwidths of 5 GB/s or more, conventional approaches transcode non-ASCII text at a fraction of a gigabyte per second. We show that we can transcode (UTF-8, UTF-16) at gigabytes per second on current systems (x64 and ARM) without sacrificing safety. Our open-source library can be ten times faster than the popular ICU library on non-ASCII strings and even faster on ASCII strings.
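To make the two tasks named in the abstract concrete, here is a minimal sketch of UTF-8 validation and UTF-8/UTF-16 transcoding using only Python's built-in codecs. It illustrates what validation and transcoding mean; it is not the SIMD approach of the talk or the API of the speaker's library, and the function names are hypothetical.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if `data` is well-formed UTF-8 (strict decoding)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def utf8_to_utf16le(data: bytes) -> bytes:
    """Transcode UTF-8 bytes to little-endian UTF-16 bytes (no byte-order mark).

    Raises UnicodeDecodeError on invalid input, so validation is not skipped.
    """
    return data.decode("utf-8").encode("utf-16-le")


def utf16le_to_utf8(data: bytes) -> bytes:
    """Transcode little-endian UTF-16 bytes back to UTF-8 bytes."""
    return data.decode("utf-16-le").encode("utf-8")


if __name__ == "__main__":
    sample = "SPIRE é 😀".encode("utf-8")
    print(is_valid_utf8(sample))         # well-formed input
    print(is_valid_utf8(b"\xc3"))        # truncated two-byte sequence
    print(utf16le_to_utf8(utf8_to_utf16le(sample)) == sample)
```

Going through an intermediate string keeps the sketch short and safe, but it is the kind of conventional route the abstract contrasts with direct, vectorized transcoding.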

Nicola Prezza

Ca' Foscari University of Venice, Italy

Nicola Prezza is an Assistant Professor at Ca' Foscari University of Venice, Italy. He mainly works on algorithms and data structures for the manipulation and analysis of compressed strings and graphs. His thesis focused on dynamic compressed data structures, which led to the DYNAMIC library, a collection of such structures. In 2018 he received the Best Italian Young Researcher in Theoretical Computer Science award.

Talk: Ordering infinity: indexing and compressing regular languages

Long papers

The list of accepted papers is the following:

  • A separation of $\gamma$ and $b$ via Thue--Morse Words, Hideo Bannai, Mitsuru Funakoshi, Takuya Mieno, Tomohiro I, Dominik Köppl and Takaaki Nishimoto
  • Longest Common Rollercoasters, Kosuke Fujita, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai and Masayuki Takeda
  • Permutation-constrained Common String Partitions with Applications, Manuel Lafond and Binhai Zhu
  • Minimal unique palindromic substrings after single-character substitution, Mitsuru Funakoshi and Takuya Mieno
  • Exploiting Pseudo-Locality of Interchange Distance, Avivit Levy
  • On Stricter Reachable Repetitiveness Measures, Gonzalo Navarro and Cristian Urbina
  • Grammar Index By Induced Suffix Sorting, Tooru Akagi, Dominik Köppl, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai and Masayuki Takeda
  • Lower Bounds for the Number of Repetitions in 2D Strings, Pawel Gawrychowski, Samah Ghazawi and Gad M. Landau
  • String Covers of a Tree, Jakub Radoszewski, Wojciech Rytter, Juliusz Straszyński, Tomasz Waleń and Wiktor Zuba
  • findere: fast and precise approximate membership query, Lucas Robidou and Pierre Peterlongo
  • Computing the original eBWT faster, simpler, and with less memory, Christina Boucher, Davide Cenzato, Zsuzsanna Liptak, Massimiliano Rossi and Marinella Sciortino
  • An LMS-based Grammar Self-index with Local Consistency Properties, Diego Diaz, Gonzalo Navarro and Alejandro Pacheco
  • Position Heaps for Cartesian-tree Matching on Strings and Tries, Akio Nishimoto, Noriki Fujisato, Yuto Nakashima and Shunsuke Inenaga
  • On the approximation ratio of LZ-End to LZ77, Takumi Ideue, Takuya Mieno, Mitsuru Funakoshi, Yuto Nakashima, Shunsuke Inenaga and Masayuki Takeda

Short papers

The list of accepted papers is the following:

  • Extracting the Sparse Longest Common Prefix Array from the Suffix Binary Search Tree, Tomohiro I, Robert Irving, Dominik Köppl and Lorna Love
  • All instantiations of the greedy algorithm for the shortest superstring problem are equivalent, Maksim Nikolaev
  • TSXor: A Simple Time Series Compression Algorithm, Andrea Bruno, Franco Maria Nardini, Giulio Ermanno Pibiri, Roberto Trani and Rossano Venturini
  • Community pooling: LDA topic modelling in Twitter, Federico Albanese and Esteban Feuerstein


Program Committee

  • Lorraine Ayad, Brunel University London, UK
  • Golnaz Badkobeh, Goldsmiths, University of London, UK
  • Ricardo Baeza-Yates, Northeastern University, USA, Universitat Pompeu Fabra, Spain and University of Chile
  • Djamal Belazzougui, CERIST, Algeria
  • Philip Bille, Technical University of Denmark, Denmark
  • Iovka Boneva, University of Lille, France
  • Broňa Brejová, Comenius University in Bratislava, Slovakia
  • Nieves R. Brisaboa, Universidade da Coruña, Spain
  • Ayelet Butman, Holon Institute of Technology, Israel
  • Nadia El-Mabrouk, University of Montreal, Canada
  • Simone Faro, University of Catania, Italy
  • Gabriele Fici, University of Palermo, Italy
  • Travis Gagie, Dalhousie University, Canada
  • Arnab Ganguly, University of Wisconsin-Whitewater, USA
  • Cecilia Hernandez, University of Concepcion, Chile
  • Tomohiro I, Kyushu Institute of Technology, Japan
  • Shunsuke Inenaga, Kyushu University, Japan
  • Giuseppe F. Italiano, LUISS University, Italy
  • Jaap Kamps, University of Amsterdam, The Netherlands
  • Dominik Kempa, Johns Hopkins University, USA
  • Tomasz Kociumaka, University of California, Berkeley, USA
  • Tsvi Kopelowitz, University of Michigan, USA
  • M. Oguzhan Kulekci, Istanbul Technical University, Turkey
  • Susana Ladra, Universidade da Coruña, Spain
  • Thierry Lecroq, University of Rouen Normandy, France (co-chair)
  • Moshe Lewenstein, Bar-Ilan University, Israel
  • Zsuzsanna Liptak, University of Verona, Italy
  • Giovanni Manzini, University of Pisa, Italy
  • Juan Mendivelso, Universidad Nacional de Colombia, Colombia
  • Laurent Mouchard, University of Rouen Normandy, France
  • Veli Mäkinen, University of Helsinki, Finland
  • Gonzalo Navarro, University of Chile, Chile
  • Yakov Nekrich, Michigan Technological University, USA
  • Kunsoo Park, Seoul National University, Korea
  • Nadia Pisanti, University of Pisa, Italy
  • Solon Pissis, CWI, The Netherlands
  • Cinzia Pizzi, University of Padova, Italy
  • Jakub Radoszewski, University of Warsaw, Poland
  • Giovanna Rosone, University of Pisa, Italy
  • Leena Salmela, University of Helsinki, Finland
  • Srinivasa Rao Satti, Norwegian University of Science and Technology, Norway
  • Marinella Sciortino, University of Palermo, Italy
  • Blerina Sinaimeri, INRIA, France
  • Jouni Sirén, University of California, Santa Cruz, USA
  • Jens Stoye, Bielefeld University, Germany
  • Yasuo Tabei, RIKEN Center for Advanced Intelligence Project, Japan
  • Lynda Tamine, IRIT, France
  • Hélène Touzet, CNRS, Lille, France (co-chair)
  • Bojian Xu, Eastern Washington University, USA
  • Binhai Zhu, Montana State University, USA

Local organizing committee

  • Stéphane Janot, Université de Lille, France (co-chair)
  • Antoine Limasset, CNRS, Lille, France (co-chair)
  • Camille Marchet, CNRS, Lille, France
  • Rayan Chikhi, Institut Pasteur, Paris, France
  • Areski Flissi, CNRS, Lille, France
  • Mikaël Salson, Université de Lille, France
  • Bastien Cazaux, Université de Lille, France

Steering committee

  • Ricardo Baeza-Yates, Northeastern University, USA, Universitat Pompeu Fabra, Spain and University of Chile
  • Christina Boucher, University of Florida, USA
  • Nieves R. Brisaboa, University of A Coruña, Spain
  • Travis Gagie, Dalhousie University, Canada
  • Alistair Moffat, University of Melbourne, Australia
  • Gonzalo Navarro, University of Chile, Chile
  • Berthier Ribeiro-Neto, Federal University of Minas Gerais, Brazil
  • Simon J. Puglisi, University of Helsinki, Finland
  • Sharma Thankachan, University of Central Florida, USA
  • Nivio Ziviani, Universidade Federal de Minas Gerais, Brazil


SPIRE 2021 covers research in all aspects of string processing, information retrieval, computational biology, and related applications. Typical topics of interest include (but are not limited to):

  • String Processing: string pattern matching, text indexing, data structures for string processing, text compression, compressed data structures, compressed string processing, text mining, 2D pattern matching, automata based string processing.
  • Information Retrieval (IR): retrieval models, indexing, evaluation, algorithms and data structures for IR, efficient implementation of IR systems, interface design, text classification and clustering, text analysis and mining, collaborative and content-based filtering, topic modeling for IR, search tasks (Web search, enterprise search, desktop search, legal search, cross-lingual retrieval, federated search, (micro) blog search, XML retrieval, multimedia retrieval), digital libraries.
  • Computational Biology: high-throughput DNA sequencing (assembly, read alignment, read error correction, metagenomics, transcriptomics, proteomics), evolution and phylogenetics, gene and regulatory element recognition, motif finding, protein structure prediction.


The list of past venues is the following:

  • 27th SPIRE: Orlando, USA (online), October 2020
  • 26th SPIRE: Segovia, Spain, October 2019
  • 25th SPIRE: Lima, Peru, October 2018
  • 24th SPIRE: Palermo, Italy, September 2017
  • 23rd SPIRE: Beppu, Japan, October 2016
  • 22nd SPIRE: London, UK, September 2015
  • 21st SPIRE: Ouro Preto, Brazil, October 2014
  • 20th SPIRE: Jerusalem, Israel, October 2013
  • 19th SPIRE: Cartagena, Colombia, October 2012
  • 18th SPIRE: Pisa, Italy, October 2011
  • 17th SPIRE: Los Cabos, Mexico, October 2010
  • 16th SPIRE: Saariselkä, Finland, August 2009
  • 15th SPIRE: Melbourne, Australia, November 2008
  • 14th SPIRE: Santiago, Chile, October 2007
  • 13th SPIRE: Glasgow, Scotland, October 2006
  • 12th SPIRE: Buenos Aires, Argentina, October 2005
  • 11th SPIRE: Padova, Italy, October 2004
  • 10th SPIRE: Manaus, Brazil, October 2003
  • 9th SPIRE: Lisbon, Portugal, September 2002
  • 8th SPIRE: Laguna de San Rafael, Chile, November 2001
  • 7th SPIRE: A Coruña, Spain, September 2000
  • 6th SPIRE: Cancun, Mexico, September 1999
  • 5th SPIRE: Santa Cruz, Bolivia, September 1998
  • WSP 1997: Valparaiso, Chile
  • WSP 1996: Recife, Brazil
  • WSP 1995: Valparaiso, Chile
  • WSP 1993: Belo Horizonte, Brazil




For more information, please contact spire2021@univ-lille.fr.
