Wednesday, 10 June 2020

Sarcasm Detection


Introduction

Sarcasm detection has long been a difficult problem in Natural Language Processing. Failing to recognize sarcasm causes misunderstandings in everyday communication and poses problems for many NLP systems. Many approaches to this problem have been tried, including rule-based, statistical, and machine-learning-based AI. Rule-based systems are onerous to program, and they cannot capture the context or meaning of words.

The Problem Statement

Sarcasm detection has received considerable attention in the NLP community in recent years. Many computational approaches model the utterance either in isolation or together with contextual information such as conversation context, author context, visual context, or cognitive features. Here, I present a deep neural network-based sarcasm detection technique applied to datasets with and without conversational context. Two kinds of datasets were used: conversations from Twitter and Reddit threads, and news headlines. The goal is to understand how much context matters in detecting sarcasm.

The Dataset

Dataset with Conversational Context


Twitter and Reddit conversations were used to create this dataset. It has 3 columns:
1. label: 0 indicates not sarcastic and 1 indicates sarcastic.
2. context: A list of 2 elements. The first element is the first statement in the conversation and the second element is the third (final) statement. This final statement is the one classified as sarcastic or not.
3. response: The second statement in the conversation, that is, the response to the initial comment.
The dataset has 9,400 entries in total. Even though it is small, we’ll soon see that the model trained on it gives the better output.
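To make the column layout concrete, here is a minimal sketch of how one record could be flattened into a single text string for a model. It follows the column description above (context holds the first and final statements, response holds the second). The separator token, the function name build_example, and the sample record are all assumptions made for illustration; the post does not say how the fields were actually joined.

```python
from typing import Dict, List, Tuple

# Hypothetical separator; the post does not say how the fields were joined.
SEP = " [SEP] "


def build_example(record: Dict) -> Tuple[str, int]:
    """Flatten one conversational record into (text, label)."""
    # Per the column description: context = [first statement, final statement].
    first, final = record["context"]
    # The response is the second statement, so it goes between the two.
    turns: List[str] = [first, record["response"], final]
    return SEP.join(turns), int(record["label"])


# Made-up record for illustration only; it is not taken from the dataset.
record = {
    "label": 1,
    "context": ["I waited two hours for the bus.", "Best commute of my life."],
    "response": "That sounds rough.",
}

text, label = build_example(record)
print(text)
print(label)
```

Keeping the turns in conversation order means the statement being classified is always the last segment of the string, with its context to the left.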

Dataset without Conversational Context


This dataset consists of news headlines. It has 2 columns:
1. is_sarcastic: 0 indicates not sarcastic and 1 indicates sarcastic.
2. headline: The headline to be classified as sarcastic or non-sarcastic.
The dataset has 55.3k entries in total.
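Before text like a headline can reach an embedding layer, it has to become a fixed-length sequence of integer ids. The sketch below shows one common way to do that in plain Python. It assumes that the 100 in the 100x32 embedding output described later is the sequence length, and it assumes lowercased whitespace tokenization with id 0 for padding and id 1 for unknown words. None of these preprocessing details are stated in the post, and the sample headlines are made up.

```python
from collections import Counter
from typing import Dict, List

MAX_LEN = 100  # assumption: sequence length implied by the 100x32 embedding output
PAD_ID = 0     # assumption: id reserved for padding
OOV_ID = 1     # assumption: id reserved for out-of-vocabulary words


def build_vocab(texts: List[str], max_words: int = 10000) -> Dict[str, int]:
    """Map the most frequent words to integer ids, starting at 2."""
    counts = Counter(word for t in texts for word in t.lower().split())
    return {w: i + 2 for i, (w, _) in enumerate(counts.most_common(max_words))}


def encode(text: str, vocab: Dict[str, int], max_len: int = MAX_LEN) -> List[int]:
    """Convert text to ids, truncate to max_len, then right-pad with PAD_ID."""
    ids = [vocab.get(w, OOV_ID) for w in text.lower().split()][:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))


# Made-up headlines for illustration only.
headlines = [
    "local man wins award for doing nothing",
    "city council approves new budget",
]
vocab = build_vocab(headlines)
encoded = encode("local man approves nothing again", vocab)
print(len(encoded))
print(encoded[:6])
```

The word "again" never appears in the sample vocabulary, so it is mapped to the unknown-word id, and every position after the fifth token is padding.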

Proposed Solution


Two models were built, one for each dataset, but the basic architecture remained more or less the same.

The Architecture

1. Embedding: Accepts comments and encodes each token as a vector of size e, producing a matrix of size 100x32 (100 tokens, e = 32).
2. Convolution: The data then passes through a 1-dimensional convolution. This layer builds 14-word combinations.
3. Max Pooling: Downsamples the convolution output. This helps reduce overfitting and keeps the network small enough to stack additional layers.
4. Convolution: A second 1-dimensional convolution groups those 14-word combinations into groups of 7. In effect, this builds groups of phrases, each 14 words long.
5. Bidirectional LSTM: Reads the sequence in both chronological and reverse order.
6. Output, Loss Function and Hyperparameters: The output layer is a single sigmoid neuron, trained with the binary_crossentropy loss. The model trained on conversational context (Model1) used the LeakyReLU activation to avoid vanishing gradients, while the other model (Model2) used ReLU. Both models were trained with the Adam optimizer; Model1 used a learning rate of 0.0005.
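The activation and loss choices in the last step can be illustrated with a few lines of plain Python. This is a sketch of the standard textbook definitions, not the code used for Model1 or Model2. The LeakyReLU slope alpha = 0.1 is an assumed value, since the post does not give one. The usual argument for LeakyReLU is that the ReLU gradient is exactly zero for negative inputs, while LeakyReLU keeps a small non-zero slope there.

```python
import math


def relu_grad(x: float) -> float:
    """Derivative of ReLU: zero for every non-positive input."""
    return 1.0 if x > 0 else 0.0


def leaky_relu(x: float, alpha: float = 0.1) -> float:
    """LeakyReLU; alpha is an assumed slope, not a value from the post."""
    return x if x > 0 else alpha * x


def leaky_relu_grad(x: float, alpha: float = 0.1) -> float:
    """Derivative of LeakyReLU: stays at alpha for non-positive inputs."""
    return 1.0 if x > 0 else alpha


def sigmoid(z: float) -> float:
    """Numerically stable logistic function, as used by the output neuron."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def binary_crossentropy(y_true: int, p: float, eps: float = 1e-7) -> float:
    """Binary cross-entropy for one example with predicted probability p."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


# Gradient for a negative pre-activation under each activation.
print(relu_grad(-2.0), leaky_relu_grad(-2.0))

# Loss when the output neuron leans towards "sarcastic" (label 1) or not (label 0).
p = sigmoid(2.0)
print(binary_crossentropy(1, p), binary_crossentropy(0, p))
```

For the same prediction, the loss is small when the true label is 1 and large when it is 0, which is the signal the Adam optimizer uses to update the weights.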

Result

Since the main goal of this project was to understand the importance of context, the accuracies of the two models were not compared. Instead, a few custom inputs were entered to check the output. Model1 could easily tell a sarcastic comment from a non-sarcastic one, while Model2 could not. Model2 recognized a sarcastic comment only when it was phrased like a news headline; other ordinary sarcastic statements went unidentified.

Conclusion

It can be concluded that, even though the non-conversational dataset was larger than the conversational one, the latter produced better output because it carries information about the scenario in which the sarcastic statement was used.
