It was a terribly good experience! Understanding sentences like that remains one of the challenges in Natural Language Processing (NLP) in general, at least based on my impression after attending the International Conference on Computational Linguistics (COLING) held in Osaka, Japan, from Dec 10-16, 2016. In this post, I aim to summarise my learning experiences at COLING.

My proposed topic, semantic similarity, is a core problem underlying many high-level semantic applications such as question answering, summarisation, entailment, and translation. This observation was in fact confirmed by Omer Levy (University of Washington and Bar-Ilan University, Israel), whom I also met during the conference.

I also had a short discussion with Omer Levy, Pontus Stenetorp (UCL) and Dimitrios Kartsaklis (Queen Mary University of London), among others, about sentence compositionality and what they think of it. They know the problem well, and their response was essentially to ask what the alternative would be. I proposed that composition should stop at semantic units (which could be at the sentence level) and that alignment should be done from there; a rough sketch of this intuition is below. Omer and Dimitrios seemed to share it, at least in my impression.
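To make that intuition concrete, here is a minimal, self-contained sketch of treating pre-chunked semantic units as the atoms, aligning them between two sentences, and only then aggregating a score. The chunking, the toy token-overlap similarity, and the greedy matching are all my own illustrative assumptions, not anything presented at the conference.

```python
# A minimal sketch of the "stop at semantic units, then align" intuition.
# Chunks, the similarity function, and the greedy alignment are toy assumptions.

def chunk_similarity(a, b):
    """Toy similarity between two chunks: Jaccard overlap of lowercased tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def align_chunks(chunks1, chunks2):
    """Greedily align each chunk of sentence 1 to its best match in sentence 2."""
    alignments = []
    for c1 in chunks1:
        best = max(chunks2, key=lambda c2: chunk_similarity(c1, c2))
        alignments.append((c1, best, chunk_similarity(c1, best)))
    return alignments

def sentence_similarity(chunks1, chunks2):
    """Aggregate the aligned chunk scores into one sentence-level score."""
    alignments = align_chunks(chunks1, chunks2)
    return sum(score for _, _, score in alignments) / len(alignments)

# Example: the semantic units are pre-chunked phrases rather than single words.
s1 = ["the old man", "bought", "a small house"]
s2 = ["an elderly man", "purchased", "a tiny cottage"]
print(align_chunks(s1, s2))
print(sentence_similarity(s1, s2))
```

In practice the token-overlap similarity would be replaced by something that actually captures meaning (e.g. embedding similarity); the point of the sketch is only the shape of the pipeline: chunk, align, then aggregate.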

I also discovered interesting and intuitive deep learning architectures for capturing linguistic intuitions. I'm listing below the ones I found especially interesting:

I also found interesting studies that try to capture linguistic notions from human behaviour.

Related to this, I was thinking of doing an experiment that simulates alignment. It should simulate aspects of interpretable Semantic Textual Similarity (iSTS), such as the alignment type. Further, a timestamp should be recorded to indicate the order of importance when doing alignment; see the sketch after this paragraph. In addition, I also wonder whether the work of Barbara could be applied to speech for the same task of shallow syntactic parsing.
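As a rough sketch of what the output of such an alignment-simulation experiment could look like, the snippet below records each alignment with an iSTS-inspired type, a score, and a timestamp marking the order in which the alignment was made (a stand-in for order of importance). The simplified type inventory and the field names are assumptions for illustration only.

```python
# Sketch of an alignment record carrying an iSTS-style type, a score, and a
# timestamp for the order in which the alignment was made. Illustrative only.
from dataclasses import dataclass, field
from itertools import count

# iSTS-inspired alignment types (EQUI = equivalent, SPE = one chunk is more
# specific, SIMI = similar, OPPO = opposite, NOALI = not aligned). Simplified.
ALIGNMENT_TYPES = {"EQUI", "SPE", "SIMI", "OPPO", "NOALI"}

_order = count(1)  # monotonically increasing "timestamp" of alignment order

@dataclass
class Alignment:
    chunk1: str
    chunk2: str
    atype: str    # one of ALIGNMENT_TYPES
    score: float  # similarity score, e.g. on a 0-5 scale as in iSTS
    timestamp: int = field(default_factory=lambda: next(_order))

    def __post_init__(self):
        if self.atype not in ALIGNMENT_TYPES:
            raise ValueError(f"unknown alignment type: {self.atype}")

# Example: the first alignment made is taken to be the most important one.
a1 = Alignment("an elderly man", "the old man", "EQUI", 5.0)
a2 = Alignment("a tiny cottage", "a small house", "SIMI", 4.0)
print(sorted([a2, a1], key=lambda a: a.timestamp))
```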

Further, there were some studies focusing more on machine learning.

And some datasets:

  • Leon Derczynski (University of Sheffield, UK) presented the Broad Twitter Corpus: A Diverse Named Entity Recognition Resource, a Twitter corpus that is more representative spatially and temporally. The dataset is free and spans five years beginning in 2009.

  • I had a short chat with Thomas Francois (UCL, Belgium) about the possibility of teaching machines the way we teach kids to read, i.e., starting from simple sentences, and he referred me to two datasets: Weekly Reader and BBC Bitesize. This would be an interesting line of research.

I also met some fantastic people from industry, such as Kelly Cherniwchan, CEO and co-founder of Chata.ai (based in Calgary, Canada). Imagine a question-answering system that interacts with your database.

Here are the COLING 2016 proceedings.

Lastly, the list of upcoming conferences was announced at the closing event as follows:

EACL 2017. April 3-7, Valencia, Spain. Deadline passed.

ACL 2017. July 30-Aug 4, Vancouver. Deadline: Feb 6, 2017.

EMNLP 2017. Sep 9-11, Copenhagen. Deadline: April 14, 2017.

COLING 2018. Santa Fe, USA. TBA.

Attending conferences is indeed very expensive, but I think it is the most efficient way to get new ideas and to learn the state of the art in my field through face-to-face interaction with the experts themselves.