Data Driven Approach to Networks and Language

Lyon MAY 11-13, 2016

ENS de Lyon, Descartes Site, Room F106


Day 1: May 11th, 2016 — ENS de Lyon, Descartes Site, Room F106
9:00 — 10:00 Registration & Coffee — ENS de Lyon, Descartes Site, Room Seneque
10:00 — 11:00 Alfred Hero. Foundational principles for large scale graph mining
11:00 — 11:30 Nicolas Tremblay, Gilles Puy, Rémi Gribonval and Pierre Vandergheynst. Compressive spectral clustering
11:30 — 12:00 Romain Couillet, Hafiz Tiomoko Ali. Random Matrices in Machine Learning
12:00 — 14:00 Lunch — ENS de Lyon, Descartes Site, Room Seneque
14:00 — 15:00 Jacob Eisenstein. Modeling Language Change in Online Social Networks
15:00 — 15:30 Clément Thibert, Jean-Philippe Magué, Eric Fleury, Màrton Karsai, Matthieu Quignard. Dialectal characterization of linguistics variability on Twitter
15:30 — 16:15 Coffee Break — ENS de Lyon, Descartes Site, Room Seneque
16:15 — 16:45 Fabio Lamanna, Maxime Lenormand, María-Henar Salas-Olmedo, Gustavo Romanillos, Antonia Tugores, Bruno Goncalves and José Javier Ramasco. Strangers’ Tweets in Strange Lands
16:45 — 17:45 Martine Hausberger and Laurence Henry. Social network in ethological studies
18:00 — 19:00 Cocktail
Day 2: May 12th, 2016 — ENS de Lyon, Descartes Site, Room F106
9:30 — 10:30 Coffee Break — ENS de Lyon, Descartes Site, Room Seneque
10:30 — 11:00 Pierre Borgnat, Paulo Gonçalves, Nicolas Tremblay, Nathanaël Willaime-Angonin. Community Mining with Graph Filters for Correlation Matrices
11:00 — 11:30 Eitan Altman and Yonathan Portilla. Spatio-temporal evolution of languages through twitter
11:30 — 12:00 Shiri Lev-Ari. The influence of people’s social network size on their linguistic abilities
12:00 — 12:30 Rozenn Gautier. Personal networks and the acquisition of sociolinguistic competence in French as a second language
12:30 — 14:30 Lunch — ENS de Lyon, Descartes Site, Room Seneque
14:30 — 15:30 José Moura. Signal Processing on Graphs
15:30 — 16:00 Luc Le Magoarou and Rémi Gribonval. Approximate Fast Fourier Transforms on graphs via multi-layer sparse matrix approximation
16:00 — 16:45 Coffee Break — ENS de Lyon, Descartes Site, Room Seneque
16:45 — 17:15 Bastien Pasdeloup, Vincent Gripon, Dominique Pastor, Grégoire Mercier and Michael Rabbat. Characterizing graphs from diffused signals
17:15 — 17:45 Sébastien Lerique and Camille Roth. Cultural attractors by iterated sentence reformulation: elements of the cognitive story in complex contagion
Day 3: May 13th, 2016 — ENS de Lyon, Descartes Site, Room F106
9:15 — 10:15 Richard Benton. Social Network Cohesion and Changes in Spoken Dialects
10:15 — 11:00 Coffee Break — ENS de Lyon, Descartes Site, Room Seneque
11:00 — 11:30 Enrique Burgos, Laura Hernandez, Horacio Ceva and Roberto Perazzo. Opinion dynamics in a coevolving social network model
11:30 — 12:00 Rocco Tripodi and Marcello Pelillo. Evolutionary Aspects of Network and Opinions
12:00 — 12:30 Michal Valko. Sequential learning on graphs with limited feedback
12:30 — 14:30 Lunch — ENS de Lyon, Descartes Site, Room Seneque
14:30 — 15:00 Clément Renaud. Topogram: a web-based toolkit for spatiotemporal network analysis of online activities
15:00 — 16:00 Djamé Seddah. Coping with the Jabberwocky Syndrome: Morpho-Syntactic Analysis in Hostile Environment
16:30 — 17:30 Ceremony: E. Fleury will be awarded the National Order of Merit by M. Cosnard, president of HCERES.
17:30 — 18:30 Cocktail


Registration is now closed.

Registration for the workshop is free but mandatory. It includes lunch, breaks, and one social event (cocktail).

Motivation and Aims

Social media and their digital footprint give us access to amounts of data that were inconceivable just a few years ago. We can now access the linguistic and social interactions of millions of users. The nature and volume of the data available to study online varieties of language challenge traditional linguistic methodologies. On the other hand, these data open new opportunities and allow us to build fine-grained descriptions of large-scale phenomena in large populations observed over several years. Yet the computational and statistical methods needed to make these data meaningful are still to be developed and validated.

These methods are expected to rely extensively on natural language processing (NLP), network science tools, and data analytics. While NLP has become a mature industry with applications in, among others, data retrieval (search engines) and marketing, the available tools were designed for standard forms of language and are thus not suitable for the varieties observed in social media. One reason is precisely that these varieties exhibit high variability, evolve quickly, and have not been well described linguistically. A scientific account of online language varieties has to resolve this circularity. For their part, network science and data analytics provide us with tools for studying massive data from complex networks, through mathematical theory (graph theory), computational modeling, and complex data processing.

Looking at social media to address language variation and change is relevant for two reasons. First, high variability and fast innovation rates make them a good laboratory. Second, they provide amounts of data, both linguistic and social, that were not previously accessible to sociolinguistic enquiry. Yet a second barrier is that traditional sociolinguistic methods are not suited to these large amounts of digital data.

Under the names of machine learning, data mining, and data analytics, the community has developed methods and algorithms for the efficient processing of large-scale, graph-structured, high-dimensional data, notably by deriving low-dimensional representations through dimensionality reduction. This makes it possible to address specific tasks such as learning or classification. A second objective of the workshop is to discuss these methods, which often follow from a data-driven approach, and especially their application to social media datasets.


Our aim with this workshop is to enhance the understanding of the links between individuals, social structure, and language usage. These questions can be addressed through the detailed analysis of recently available large digital datasets, such as those collected from Twitter and other systems. These datasets describe the social interactions and written posts of a large number of individuals, which allows for the coupled analysis of the social network and language evolution as a function of time. Our goal is to better understand computational sociolinguistics, a fundamentally interdisciplinary field based on data-driven approaches. We will bring together researchers working on data-driven approaches to social networks, especially those focusing on network linguistics, from the fields of machine learning, data analytics, data mining, and computational linguistics.

Invited speakers

Richard Benton (University of Illinois)
Richard A. Benton is Assistant Professor of Labor and Employment Relations at the University of Illinois. His primary line of research examines social networks in corporate governance and focuses on how network cohesion among corporate elites affects corporate control. His other work examines network models for dialect acquisition and change. He is collaborating on research that explores how proximity in a community social network explains variation in spoken dialect change and similarity.
Jacob Eisenstein (Georgia Tech)
Jacob Eisenstein is Assistant Professor in the School of Interactive Computing at Georgia Tech, where he leads the Computational Linguistics Laboratory. He works on machine learning approaches to understanding human language. He is especially interested in non-standard language, discourse, computational social science, and statistical machine learning.
Martine Hausberger (UMR 6552 EthoS – Éthologie animale et humaine)
Laurence Henry (UMR 6552 EthoS – Éthologie animale et humaine)
Laurence Henry's research is organized around four main themes: (i) social influence on the development and learning of communication, (ii) the use and functions of social signals, (iii) the sensory, perceptual, and cognitive processing of social information, and (iv) the evolution of communication and sociality. This work has demonstrated the extreme importance of social bonds for the development, use, and processing of information. The major interest of the biological model (the European starling) lies in the parallel between song development in songbirds and language development in humans.
Alfred Hero (University of Michigan)
Alfred O. Hero III is the R. Jamison and Betty Williams Professor of Engineering and co-director of the Michigan Institute for Data Science (MIDAS) at the University of Michigan, Ann Arbor. His primary appointment is in the Department of Electrical Engineering and Computer Science and he also has appointments, by courtesy, in the Department of Biomedical Engineering and the Department of Statistics. He received the B.S. (summa cum laude) from Boston University (1980) and the Ph.D from Princeton University (1984), both in Electrical Engineering. He is a Fellow of the Institute of Electrical and Electronics Engineers (IEEE). He has served as President of the IEEE Signal Processing Society and as a member of the IEEE Board of Directors. He has received numerous awards for his scientific research and service to the profession including the IEEE Signal Processing Society Technical Achievement Award in 2013 and the 2015 Society Award, which is the highest career award bestowed by the IEEE Signal Processing Society. Alfred Hero’s recent research interests are in the data science of high dimensional spatio-temporal data, statistical signal processing, and machine learning. Of particular interest are applications to networks, including social networks, multi-modal sensing and tracking, database indexing and retrieval, imaging, biomedical signal processing, and biomolecular signal processing.
José M. F. Moura (Carnegie Mellon University)
José M. F. Moura is the Philip and Marsha Dowd University Professor at Carnegie Mellon University in the Departments of Electrical and Computer Engineering and Biomedical Engineering (by courtesy). In the academic year 2013–14, he was a Visiting Professor at New York University (NYU) with the Center for Urban Science and Progress (CUSP). He holds an EE degree from Instituto Superior Técnico (IST, Portugal) and MSc, EE, and DSc degrees from the Massachusetts Institute of Technology. He has held visiting professor appointments with MIT and NYU and was on the faculty of IST (Portugal). He is currently ECE Associate Department Head for Research and Strategic Initiatives. He founded and directs the CMU Information and Communication Technologies Institute (ICTI) that manages the CMU/Portugal Program, and co-founded the Center for Sensed Critical Infrastructure Research (CenSIR). With CMU colleagues, he co-founded SpiralGen, which commercializes the Spiral technology under a CMU license. His interests are in statistical signal and image processing and data science. He holds eleven issued patents and has published over 450 papers. Moura has been an IEEE Board Director, IEEE Division IX Director, member of several IEEE Boards, President of the IEEE Signal Processing Society (SPS), and Editor-in-Chief for the IEEE TRANSACTIONS ON SIGNAL PROCESSING. He has been on the Editorial Boards of several IEEE and ACM Journals. He received several awards including the IEEE Signal Processing Society Technical Achievement Award and the IEEE Signal Processing Society Award for outstanding technical contributions and leadership in signal processing. Moura is a Fellow of the IEEE, a Fellow of AAAS, a corresponding member of the Academy of Sciences of Portugal, and a member of the US National Academy of Engineering.
Djamé Seddah (University Paris-Sorbonne / Inria)
Djamé Seddah is a tenured associate professor in linguistics and informatics at the University Paris Sorbonne (Paris 4) and a member of Inria's Alpage project. His current research explores the impact of annotation schemes of non-canonical languages for French as well as out-of-domain parsing, noisy user-generated content parsing, and the syntax-semantic interface. Djamé is one of the founders of the statistical parsing of morphologically-rich languages community and has instigated the creation of many annotated data sets for French (the Sequoia Treebank, French Social Media Bank and the French QuestionBank).

Submission of abstracts

We invite you to submit a one- or two-page abstract. Submissions are made via our EasyChair submission link:

It is required that at least one author of each accepted paper register and attend the Workshop on Data Driven Approach to Networks and Language to present their work.

  • Abstract submission deadline: FEBRUARY 12th, 2016
  • Notification to authors: MARCH 4th, 2016 FEBRUARY 27th, 2016
  • Conference date: MAY 11-13, 2016

Note: Find us also on the Sociolinguistic Events Calendar:
