Research
Research Interests
- Computer Science: Large language models, Machine learning, Video LLMs, SLMs, Artificial intelligence, and Computer vision.
- Software Engineering: Data structures and algorithms, Object-oriented programming, Design patterns, Distributed systems, Databases, and Cloud computing.
- Mathematics: Topology, Game theory, Linear algebra, and Discrete mathematics.
- Interdisciplinary: Robotics, Human-Computer Interface, AI in drug discovery, Computational social science, Computational biology, Bioinformatics, Personalized medicine, and Algorithmic game theory.
Current Projects
- Reducing hallucination in Terraform template generation: LLMs often hallucinate, which poses risks when near-deterministic outputs are required. This project applies evolutionary computation, with custom fitness scores, mutation rates, and mortality rates, to minimize hallucination in Terraform template generation. We compare existing models using a custom dataset and highlight the pros and cons of this approach.
- Reducing hallucination in structured outputs via Retrieval-Augmented Generation with higher synergy: Generative AI for text-to-low-code flows faces trade-offs between model size, performance, and hallucination risk. This project explores tighter integration of the retriever and the LLM, through joint training or architectural enhancements, to improve their synergy. We also investigate optimal model sizing and performance for enterprise-level adoption.
- Enhancing LLMs with Synthetic Data: Case Studies in Data Augmentation, Bias Mitigation, and Model Robustness: LLMs thrive on large datasets but often lack domain-specific data due to regulations or scarcity. This work uses synthetic data to strengthen LLMs in low-resource scenarios, mitigate bias, and improve robustness. Our case studies underscore synthetic data’s dual role as both a stopgap for scarce datasets and a way to build less biased, more reliable models.
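The evolutionary loop described in the Terraform project (fitness score, mutation rate, mortality rate) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the project: the resource-type vocabulary, the `fitness` function that scores a candidate by the fraction of recognized resource types, and the default parameter values. A real fitness function would more plausibly draw on a validator such as `terraform validate`.

```python
import random

# Hypothetical vocabulary: resource types a generator might emit, valid or not.
VALID_TYPES = {"aws_instance", "aws_s3_bucket", "aws_vpc", "aws_subnet"}
VOCABULARY = sorted(VALID_TYPES) + ["aws_magic_box", "aws_s3_bukkit", "aws_vpc_v9"]


def fitness(candidate):
    """Fraction of resource types in the candidate that are known to be valid."""
    return sum(t in VALID_TYPES for t in candidate) / len(candidate)


def mutate(candidate, mutation_rate, rng):
    """Resample each position from the vocabulary with probability mutation_rate."""
    return [rng.choice(VOCABULARY) if rng.random() < mutation_rate else t
            for t in candidate]


def evolve(population, generations=30, mutation_rate=0.2, mortality_rate=0.5, seed=0):
    """Evolve candidates; the least-fit mortality_rate share dies each generation."""
    rng = random.Random(seed)
    population = [list(c) for c in population]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        n_survivors = max(1, int(len(population) * (1 - mortality_rate)))
        survivors = population[:n_survivors]
        # Dead candidates are replaced by mutated copies of random survivors.
        offspring = [mutate(rng.choice(survivors), mutation_rate, rng)
                     for _ in range(len(population) - n_survivors)]
        population = survivors + offspring
    return max(population, key=fitness)


# Usage: start from random candidates of five resource types each.
init_rng = random.Random(1)
initial = [[init_rng.choice(VOCABULARY) for _ in range(5)] for _ in range(8)]
best = evolve(initial)
print(best, fitness(best))
```

Because survivors are carried over unchanged, the best fitness in the population never decreases from one generation to the next. The mortality rate controls how aggressively weak candidates are replaced, and the mutation rate controls how far offspring stray from their parents.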
Past Projects
- SNP Detection in Liquid Biopsy: In this work, we develop a machine learning model (specifically, a Random Forest classifier) to distinguish genuine cancer-related variations in genomic reads from noise introduced by sequencing, alignment, or PCR errors. Traditional manual methods of refining variant calls are labor-intensive, prone to error, and rely heavily on read-level features alone. By incorporating base-specific attributes, our model assigns a probability score to each base at a given position (pileup column), enabling more accurate SNP calls with high specificity and sensitivity. This approach significantly reduces false positives and streamlines the variant-calling process compared to manual filters. Here’s a link to the thesis report: Undergraduate_thesis_report
- Resolution of river water allocation conflicts using game theory: Amid looming global water shortages, with demand projected to outstrip supply by 40% by 2030, this project proposes a two-level game-theoretic model for sustainable distribution, applied to Bengaluru’s water crisis. In Level 1, central authorities allocate water among states, reaching a Nash equilibrium in cooperative or non-cooperative modes. In Level 2, environmental bodies, industries, citizens, and states react to these allocations, resulting in a system of partial differential equations whose solutions identify the final equilibria. By employing both Nash equilibrium and Shapley value approaches, this framework balances stakeholder interests and guides more equitable, science-based water policies. Here’s a link to the project report: Game_Theory_Project_Report
- Optimal coalition structure problem with voting indices computation using an influence matrix: This project tackles the optimal coalition structure problem, focusing on the critical coalitions needed to pass key bills in India’s parliament. Using a subset generator function and an influence matrix, we account for relationships among political agents and compute their voting indices under different quotas. To scale to larger sets of agents, we introduce evolutionary algorithms, including genetic algorithms, to determine coalition configurations with minimal error. Balancing factors such as the mutation rate proves pivotal for convergence. Ultimately, this framework provides deeper insight into how coalitions might form or dissolve to pass (or block) legislation. Here’s a link to the project report: LOP_Project_report
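As an illustration of computing voting indices by enumerating subsets under a quota, here is a minimal sketch of the normalized Banzhaf index for a weighted voting game. The project summary does not say which indices were used, so the choice of Banzhaf is an assumption, as are the agent names and seat weights. The influence matrix mentioned above is not modeled here.

```python
from itertools import combinations


def banzhaf_indices(weights, quota):
    """Normalized Banzhaf index for each agent in a weighted voting game.

    weights: dict mapping agent name -> voting weight (e.g. seats).
    quota:   total weight a coalition needs in order to win.
    """
    agents = list(weights)
    swings = {a: 0 for a in agents}
    # Enumerate every non-empty coalition (exponential in the number of agents).
    for size in range(1, len(agents) + 1):
        for coalition in combinations(agents, size):
            total = sum(weights[a] for a in coalition)
            if total < quota:
                continue  # losing coalition: nobody is critical
            for a in coalition:
                # An agent is critical if the coalition loses without it.
                if total - weights[a] < quota:
                    swings[a] += 1
    total_swings = sum(swings.values())
    return {a: (swings[a] / total_swings if total_swings else 0.0)
            for a in agents}


# Usage with hypothetical parties and seat counts.
seats = {"party_a": 4, "party_b": 3, "party_c": 2, "party_d": 1}
for quota in (6, 8):
    print(quota, banzhaf_indices(seats, quota))
```

Exhaustive enumeration visits all 2^n - 1 coalitions, which is one reason a heuristic search such as a genetic algorithm becomes attractive as the number of agents grows.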