Quantum Information Processing for Computational Linguistics on Small Ethical Texts (SETS) and Tamil Adjectives and Proverbs (TAP)
Session Number
CMPS(ai) 26
Advisor(s)
Dr. Chandrasekaran Subbramaniam, Bharathiar University, Coimbatore, India
Discipline
Computer Science
Start Date
17-4-2025 10:15 AM
End Date
17-4-2025 10:30 AM
Abstract
The objective is to propose an information processing model that uses quantum computing for computational linguistics on Small Ethical Texts (SETS) with Tamil Adjectives and Proverbs (TAP). The model uses the Bobcat parser, part of the lambeq library, to parse the given SETS and map them into Combinatory Categorial Grammar (CCG) structures. Quantum Natural Language Processing (QNLP) is then applied to convert the CCG structures into quantum circuits. The challenge in applying QNLP to Tamil is that the Bobcat parser handles only English text, since it was pretrained on an English corpus. Because Bobcat misclassifies Tamil grammatical categories, NLP tools such as spaCy (an MIT-licensed library) and Stanza (from Stanford) are needed to tokenize and lemmatize the Tamil SETS. Stanza's models are trained on Universal Dependencies treebanks, which include Tamil syntactic annotations. Stanza produces dependency parses whereas the Bobcat parser produces CCG parses, so a Python class must be created to convert between the two. As a first-level processing output, the program creates string diagrams of the syntactic relationships in the SETS. The next step is to convert each string diagram into an accurate quantum circuit in order to analyze the semantic relationships between the Tamil linguistic entities.
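The conversion step described above, from a dependency parse to CCG-style categories, can be illustrated with a minimal sketch. It is not the authors' class: the names Token and DependencyToCCG, the category rules, and the hand-written annotations for the example phrase are all assumptions made for illustration, and the sketch uses plain Python rather than calling Stanza or lambeq. It assumes Tamil's verb-final order, so a verb looks for its arguments on its left.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Token:
    """One word of a dependency parse (hypothetical, hand-annotated here)."""
    text: str
    upos: str    # Universal Dependencies part-of-speech tag
    head: int    # 1-based index of the head word; 0 marks the root
    deprel: str  # Universal Dependencies relation to the head


class DependencyToCCG:
    """Hypothetical converter: assigns a CCG-style category to each token.

    Simplifying assumptions: nouns are 'n', attributive adjectives are
    'n/n', and a verb takes each core argument from its left ('\\').
    Anything else is left as '?' for manual treatment.
    """

    ARG_RELATIONS = {"nsubj", "obj", "iobj"}

    def categories(self, tokens: List[Token]) -> List[str]:
        cats = []
        for index, tok in enumerate(tokens, start=1):
            if tok.upos in {"NOUN", "PROPN", "PRON"}:
                cats.append("n")
            elif tok.upos == "ADJ" and tok.deprel == "amod":
                cats.append("n/n")
            elif tok.upos == "VERB":
                # Count the core arguments that depend on this verb.
                n_args = sum(
                    1 for t in tokens
                    if t.head == index and t.deprel in self.ARG_RELATIONS
                )
                cat = "s"
                for _ in range(n_args):
                    # Parenthesize an already-complex category before extending it.
                    cat = f"({cat})\\n" if "\\" in cat else f"{cat}\\n"
                cats.append(cat)
            else:
                cats.append("?")
        return cats


# Illustrative phrase: "நல்ல பையன் படிக்கிறான்" (roughly "the good boy studies").
# The annotations below are written by hand, not produced by Stanza.
tokens = [
    Token("நல்ல", "ADJ", 2, "amod"),
    Token("பையன்", "NOUN", 3, "nsubj"),
    Token("படிக்கிறான்", "VERB", 0, "root"),
]
cats = DependencyToCCG().categories(tokens)
for tok, cat in zip(tokens, cats):
    print(tok.text, cat)
```

In a fuller pipeline, the tokens would come from a Stanza Tamil parse and the resulting categories would be used to build the string diagram; both of those steps are outside this sketch.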