Quantum Information Processing for Computational Linguistics on Small Ethical Texts (SETS) and Tamil Adjectives and Proverbs (TAP)

Session Number

CMPS(ai) 26

Advisor(s)

Dr. Chandrasekaran Subbramaniam, Bharathiar University, Coimbatore, India

Discipline

Computer Science

Start Date

17-4-2025 10:15 AM

End Date

17-4-2025 10:30 AM

Abstract

The objective is to propose an information processing model that uses quantum computing for computational linguistics on Small Ethical Texts (SETS) with Tamil Adjectives and Proverbs (TAP). The model uses the Bobcat parser, part of the lambeq library, to parse the given SETS and map them into Combinatory Categorial Grammar (CCG) structures. Quantum Natural Language Processing (QNLP) is then applied to convert the CCG structures into quantum circuits. The challenge in applying QNLP to Tamil is that the Bobcat parser handles only English text, since it was pretrained on an English corpus, and it therefore misclassifies Tamil grammar. It is thus necessary to tokenize and lemmatize the Tamil SETS with NLP tools such as spaCy (an MIT-licensed library from Explosion) and Stanza (from Stanford). Stanza's models are pretrained on Universal Dependencies corpora, which include Tamil syntactic annotations. Because Stanza produces dependency parses whereas the Bobcat parser produces CCG parses, a Python class must be created for the conversion from one to the other. As a first-level processing output, the program creates string diagrams of the syntactic relationships in the SETS. The next step is to convert each string diagram into an accurate quantum circuit in order to analyze the semantic relationships between the Tamil linguistic entities.
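The dependency-to-CCG conversion step described above can be illustrated with a minimal sketch. It is not the authors' class: the names `DepToken` and `DependencyToCCG`, the three mapping rules, and the transliterated example sentence with its hand-written annotations are all assumptions made for illustration. The token fields only mirror the kind of information a Universal Dependencies parse provides (position, text, part of speech, head, relation), and no Stanza or lambeq call is made here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DepToken:
    """One word of a dependency parse (hypothetical container for this sketch)."""
    id: int       # 1-based position in the sentence
    text: str     # surface form (transliterated here)
    upos: str     # universal part-of-speech tag
    head: int     # id of the governing word, 0 for the root
    deprel: str   # Universal Dependencies relation to the head


class DependencyToCCG:
    """Hypothetical converter: assigns a CCG category to each word.

    Assumed rules, chosen only for illustration:
      * the root verb takes each of its arguments from the left,
        reflecting Tamil's verb-final word order;
      * an adjectival modifier (amod) looks right for its noun;
      * nouns, proper nouns, and pronouns are NP.
    """

    ARGUMENT_RELS = {"nsubj", "obj", "iobj"}
    NOMINAL_TAGS = {"NOUN", "PROPN", "PRON"}

    def categorize(self, tokens: List[DepToken]) -> List[Tuple[str, str]]:
        result = []
        for tok in tokens:
            if tok.deprel == "root" and tok.upos == "VERB":
                # Count the arguments that depend directly on this verb.
                n_args = sum(
                    1 for t in tokens
                    if t.head == tok.id and t.deprel in self.ARGUMENT_RELS
                )
                category = self._verb_category(n_args)
            elif tok.deprel == "amod":
                category = "NP/NP"
            elif tok.upos in self.NOMINAL_TAGS:
                category = "NP"
            else:
                category = "UNKNOWN"  # not covered by this sketch
            result.append((tok.text, category))
        return result

    @staticmethod
    def _verb_category(n_args: int) -> str:
        """Build S, S\\NP, (S\\NP)\\NP, ... for 0, 1, 2, ... arguments."""
        category = "S"
        for _ in range(n_args):
            if "\\" in category:
                category = f"({category})\\NP"
            else:
                category = f"{category}\\NP"
        return category


# Hand-annotated, transliterated example ("the good boy read a book");
# these annotations are illustrative, not actual parser output.
tokens = [
    DepToken(1, "nalla", "ADJ", 2, "amod"),
    DepToken(2, "paiyan", "NOUN", 4, "nsubj"),
    DepToken(3, "puttakam", "NOUN", 4, "obj"),
    DepToken(4, "padittan", "VERB", 0, "root"),
]
categories = DependencyToCCG().categorize(tokens)
for word, category in categories:
    print(f"{word}\t{category}")
```

In this sketch the adjective receives `NP/NP`, the two nouns receive `NP`, and the sentence-final verb receives `(S\NP)\NP`, so the categories reduce to `S` by forward and backward application. A real converter would need many more relations (case markers, adverbials, coordination) and a way to hand the resulting derivation to the diagram-building stage.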
