<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en-US"><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://www.adityachinchure.com/feed.xml" rel="self" type="application/atom+xml" /><link href="https://www.adityachinchure.com/" rel="alternate" type="text/html" hreflang="en-US" /><updated>2025-12-01T12:34:53-08:00</updated><id>https://www.adityachinchure.com/feed.xml</id><title type="html">Aditya Chinchure</title><subtitle>Graduate Student at University of British Columbia</subtitle><author><name>Aditya Chinchure</name><email>aditya10@cs.ubc.ca</email></author><entry><title type="html">TIBET: Identifying and Evaluating Biases in Text-to-Image Generative Models</title><link href="https://www.adityachinchure.com/tibet/" rel="alternate" type="text/html" title="TIBET: Identifying and Evaluating Biases in Text-to-Image Generative Models" /><published>2024-07-09T13:29:20-07:00</published><updated>2024-07-09T13:29:20-07:00</updated><id>https://www.adityachinchure.com/tibet</id><content type="html" xml:base="https://www.adityachinchure.com/tibet/"><![CDATA[<p>Text-to-Image (TTI) generative models have shown great progress in the past few years in terms of their ability to generate complex and high-quality imagery. At the same time, these models have been shown to suffer from harmful biases, including exaggerated societal biases (e.g., gender, ethnicity), as well as incidental correlations that limit such models’ ability to generate more diverse imagery. In this paper, we propose a general approach to study and quantify a broad spectrum of biases, for any TTI model and for any prompt, using counterfactual reasoning. Unlike other works that evaluate generated images on a predefined set of bias axes, our approach automatically identifies potential biases that might be relevant to the given prompt, and measures those biases. 
In addition, our paper extends quantitative scores with post-hoc explanations in terms of semantic concepts in the images generated. We show that our method is uniquely capable of explaining complex multi-dimensional biases through semantic concepts, as well as the intersectionality between different biases for any given prompt. We perform extensive user studies to illustrate that the results of our method and analysis are consistent with human judgements.<br />
<a href="https://tibet-ai.github.io">Website and Data</a></p>]]></content><author><name>Aditya Chinchure</name><email>aditya10@cs.ubc.ca</email></author><category term="publication" /><summary type="html"><![CDATA[We propose a general approach to study and quantify a broad spectrum of biases, for any TTI model and for any prompt, using counterfactual reasoning. Unlike other works that evaluate generated images on a predefined set of bias axes, our approach automatically identifies potential biases that might be relevant to the given prompt, and measures those biases.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.adityachinchure.com/assets/posts/tibet-bias/tibet.png" /><media:content medium="image" url="https://www.adityachinchure.com/assets/posts/tibet-bias/tibet.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">From Local Concepts to Universals: Evaluating the Multicultural Understanding of Vision-Language Models</title><link href="https://www.adityachinchure.com/multicultural/" rel="alternate" type="text/html" title="From Local Concepts to Universals: Evaluating the Multicultural Understanding of Vision-Language Models" /><published>2024-06-28T13:29:20-07:00</published><updated>2024-06-28T13:29:20-07:00</updated><id>https://www.adityachinchure.com/multicultural</id><content type="html" xml:base="https://www.adityachinchure.com/multicultural/"><![CDATA[<p>Despite recent advancements in vision-language models, their performance remains suboptimal on images from non-western cultures due to underrepresentation in training datasets. Various benchmarks have been proposed to test models’ cultural inclusivity, but they have limited coverage of cultures and do not adequately assess cultural diversity across universal as well as culture-specific local concepts. 
To address these limitations, we introduce the GlobalRG benchmark, comprising two challenging tasks: retrieval across universals and cultural visual grounding. The former task entails retrieving culturally diverse images for universal concepts from 50 countries, while the latter aims at grounding culture-specific concepts within images from 15 countries. Our evaluation across a wide range of models reveals that the performance varies significantly across cultures – underscoring the necessity for enhancing multicultural understanding in vision-language models.<br />
<a href="https://arxiv.org/abs/2407.00263">ArXiv</a></p>]]></content><author><name>Aditya Chinchure</name><email>aditya10@cs.ubc.ca</email></author><category term="publication" /><summary type="html"><![CDATA[We introduce the GlobalRG benchmark, comprising two challenging tasks, retrieval across universals and cultural visual grounding, to test VL models' cultural inclusivity.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.adityachinchure.com/assets/posts/multicultural-vl-bench/multicultural.png" /><media:content medium="image" url="https://www.adityachinchure.com/assets/posts/multicultural-vl-bench/multicultural.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">DE-TensoRF: Data-efficient and fast NeRFs</title><link href="https://www.adityachinchure.com/de-tensorf/" rel="alternate" type="text/html" title="DE-TensoRF: Data-efficient and fast NeRFs" /><published>2023-04-28T19:29:20-07:00</published><updated>2023-04-28T19:29:20-07:00</updated><id>https://www.adityachinchure.com/de-tensorf</id><content type="html" xml:base="https://www.adityachinchure.com/de-tensorf/"><![CDATA[<p>Developed DE-TensoRF, a model that can render 3D objects with as few as 3 images, in under 15 minutes on a single GPU. We achieved the highest grade in our class, and the project led to collaborations with Dr. Helge Rhodin’s research group.</p>]]></content><author><name>Aditya Chinchure</name><email>aditya10@cs.ubc.ca</email></author><category term="academic" /><summary type="html"><![CDATA[Developed DE-TensoRF, a model that can render 3D objects with as few as 3 images, in under 15 minutes on a single GPU. We achieved the highest grade in our class, and the project led to collaborations with Dr. 
Helge Rhodin’s research group.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.adityachinchure.com/assets/posts/de-tensorf/detensorf.gif" /><media:content medium="image" url="https://www.adityachinchure.com/assets/posts/de-tensorf/detensorf.gif" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">VisualCOMET+: Visual Commonsense Generation &amp;amp; its incorporation into a Multimodal Topic Modeling algorithm</title><link href="https://www.adityachinchure.com/viscomet+/" rel="alternate" type="text/html" title="VisualCOMET+: Visual Commonsense Generation &amp;amp; its incorporation into a Multimodal Topic Modeling algorithm" /><published>2022-12-09T18:29:20-08:00</published><updated>2022-12-09T18:29:20-08:00</updated><id>https://www.adityachinchure.com/viscomet+</id><content type="html" xml:base="https://www.adityachinchure.com/viscomet+/"><![CDATA[<p>The task of commonsense knowledge generation is largely limited to the language domain, with models such as COMET (for explicit knowledge) and GPT-3 (for implicit knowledge). Moreover, VisualCOMET, a commonsense generation model that utilizes the visual context, is limited to three people-centric relations. Since commonsense generation on entire scenes, or parts of a scene, can be helpful in several downstream multimodal tasks, including VQA and topic modeling, we propose a general-purpose visual commonsense generation model, VisualCOMET+, by extending VisualCOMET with four diverse inference relations. Using the clue-rationale pairs from a visual abductive reasoning dataset, we train our commonsense generation model by creating groundtruth structured commonsense triplets. Then, we show that we can get coherent and more diverse topics by incorporating generated commonsense inferences and visual features into a novel multimodal topic modeling algorithm, Multimodal CTM. <br />
<a href="https://drive.google.com/file/d/1_HxrSJzDZKj1uDm7irCx3_9KnXlT7My-/view?usp=sharing">Report</a></p>]]></content><author><name>Aditya Chinchure</name><email>aditya10@cs.ubc.ca</email></author><category term="academic" /><summary type="html"><![CDATA[Developed an extension to VisualCOMET to generate general-purpose commonsense knowledge from images. Showed improvements on coherence and diversity scores of a novel topic modelling algorithm using the generated knowledge]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.adityachinchure.com/assets/posts/commonsense-gen/viscomet+.png" /><media:content medium="image" url="https://www.adityachinchure.com/assets/posts/commonsense-gen/viscomet+.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">VLC-BERT: Visual Question Answering with Contextualized Commonsense Knowledge</title><link href="https://www.adityachinchure.com/vlc-bert/" rel="alternate" type="text/html" title="VLC-BERT: Visual Question Answering with Contextualized Commonsense Knowledge" /><published>2022-10-24T10:29:20-07:00</published><updated>2022-10-24T10:29:20-07:00</updated><id>https://www.adityachinchure.com/vlc-bert</id><content type="html" xml:base="https://www.adityachinchure.com/vlc-bert/"><![CDATA[<p>We present a new Vision-Language-Commonsense transformer model, VLC-BERT, that incorporates contextualized knowledge using Commonsense Transformer (COMET) to solve Visual Question Answering (VQA) tasks that require commonsense reasoning. VLC-BERT outperforms existing models that utilize static knowledge bases, and the article provides a detailed analysis of which questions benefit from the contextualized commonsense knowledge from COMET.<br />
<a href="https://openaccess.thecvf.com/content/WACV2023/papers/Ravi_VLC-BERT_Visual_Question_Answering_With_Contextualized_Commonsense_Knowledge_WACV_2023_paper.pdf">Paper</a> | <a href="https://github.com/aditya10/VLC-BERT">Github</a> | <a href="https://arxiv.org/abs/2210.13626">ArXiv</a></p>]]></content><author><name>Aditya Chinchure</name><email>aditya10@cs.ubc.ca</email></author><category term="publication" /><summary type="html"><![CDATA[We present a new Vision-Language-Commonsense transformer model, VLC-BERT, that incorporates contextualized knowledge using Commonsense Transformer (COMET) to solve Visual Question Answering (VQA) tasks that require commonsense reasoning.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.adityachinchure.com/assets/posts/vlc-bert/vlc-bert.png" /><media:content medium="image" url="https://www.adityachinchure.com/assets/posts/vlc-bert/vlc-bert.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Universal Machine Learning API</title><link href="https://www.adityachinchure.com/universal-ml-api/" rel="alternate" type="text/html" title="Universal Machine Learning API" /><published>2022-04-28T19:29:20-07:00</published><updated>2022-04-28T19:29:20-07:00</updated><id>https://www.adityachinchure.com/universal-ml-api</id><content type="html" xml:base="https://www.adityachinchure.com/universal-ml-api/"><![CDATA[<p><strong>Universal Machine Learning API</strong> <br />
A powerful Python API template, built on Flask, for plug-and-play use with machine learning models. <br />
<em>Technologies used: Python, with the Flask API package</em> <br />
<a href="https://medium.com/technonerds/a-production-grade-machine-learning-api-using-flask-gunicorn-nginx-and-docker-part-1-49927238befb">Blog</a> | <a href="https://github.com/aditya10/flask-ml-api">Github</a></p>]]></content><author><name>Aditya Chinchure</name><email>aditya10@cs.ubc.ca</email></author><category term="other" /><summary type="html"><![CDATA[A powerful Python API template, built on Flask, for plug-and-play use with machine learning models.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.adityachinchure.com/assets/posts/vl-bert-graph/vl-bert-graph.png" /><media:content medium="image" url="https://www.adityachinchure.com/assets/posts/vl-bert-graph/vl-bert-graph.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">VL-BERT-Graph: Graph-enhanced Transformers for Referring Expressions Comprehension</title><link href="https://www.adityachinchure.com/vl-bert-graph/" rel="alternate" type="text/html" title="VL-BERT-Graph: Graph-enhanced Transformers for Referring Expressions Comprehension" /><published>2022-04-28T19:29:20-07:00</published><updated>2022-04-28T19:29:20-07:00</updated><id>https://www.adityachinchure.com/vl-bert-graph</id><content type="html" xml:base="https://www.adityachinchure.com/vl-bert-graph/"><![CDATA[<p>We explore a simple method to incorporate inter-token relationships in a Transformer before performing any training, using graphs with edge features. In VL-BERT-Graph, we generate a fully-connected graph of input tokens where the edges represent similarity between the tokens, obtained using GloVE and CLIP. We then use a message-passing GNN to incorporate these features into the input tokens or the output encoding of the model, and train the Transformer with edge-feature attention masks. <br />
<a href="https://lrjconan.github.io/UBC-EECE571F-DL-Structures/assets/sample_reports_2021_W2/report_05.pdf">Report</a></p>]]></content><author><name>Aditya Chinchure</name><email>aditya10@cs.ubc.ca</email></author><category term="academic" /><summary type="html"><![CDATA[Incorporated Graph Neural Networks in a visual-linguistic Transformer]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.adityachinchure.com/assets/posts/vl-bert-graph/vl-bert-graph.png" /><media:content medium="image" url="https://www.adityachinchure.com/assets/posts/vl-bert-graph/vl-bert-graph.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Investigating extensions to VLC-BERT and comparing it with GPT-3</title><link href="https://www.adityachinchure.com/vlc-bert-ext/" rel="alternate" type="text/html" title="Investigating extensions to VLC-BERT and comparing it with GPT-3" /><published>2022-04-20T19:29:20-07:00</published><updated>2022-04-20T19:29:20-07:00</updated><id>https://www.adityachinchure.com/vlc-bert-ext</id><content type="html" xml:base="https://www.adityachinchure.com/vlc-bert-ext/"><![CDATA[<p>Visual Question Answering with commonsense reasoning is a challenging task that requires models to understand the image, the question, and contextualized commonsense knowledge to assist with the reasoning required to arrive at an answer. In our work, we propose extensions to VLC-BERT, aimed at solving two drawbacks of the model by identifying potential words in the input sequence that may answer the question using a Pointer Generator, and incorporating additional image information in the form of object tags from an object detection model. Our evaluation shows that the Pointer Generator and object detection models help achieve higher scores on the OK-VQA dataset. Furthermore, we generate answers using GPT-3 and incorporate them into VLC-BERT. 
Our error analysis on GPT-3 and VLC-BERT models highlights that GPT-3 contains valuable implicit commonsense and factual knowledge that is beneficial to our model. <br />
<a href="https://drive.google.com/file/d/1eH1TtFI5QLS78mf7wWRcrUyZO6T3_N0a/view?usp=sharing">Report</a></p>]]></content><author><name>Aditya Chinchure</name><email>aditya10@cs.ubc.ca</email></author><category term="academic" /><summary type="html"><![CDATA[This project extends VLC-BERT with pointer generator networks and object detection models. Furthermore, we compare the performance of VLC-BERT with GPT-3 on the OK-VQA dataset.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.adityachinchure.com/assets/posts/vlc-bert-extended/vlc-bert-extended.png" /><media:content medium="image" url="https://www.adityachinchure.com/assets/posts/vlc-bert-extended/vlc-bert-extended.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Learning faster Genetic Algorithms with dynamic mutation power</title><link href="https://www.adityachinchure.com/growing-agents/" rel="alternate" type="text/html" title="Learning faster Genetic Algorithms with dynamic mutation power" /><published>2021-12-03T18:29:20-08:00</published><updated>2021-12-03T18:29:20-08:00</updated><id>https://www.adityachinchure.com/growing-agents</id><content type="html" xml:base="https://www.adityachinchure.com/growing-agents/"><![CDATA[<p>Policy Gradient (PG) methods and Genetic Algorithms (GA) are used to train Reinforcement Learning agents to perform a particular task in an environment by maximizing the received reward. In the context of this assignment, both techniques aim to approximate a policy function that, given a state, produces a policy to pick the best action to maximize reward. Here, the policy function used is a deep neural network. In this project, I implement a PG method, REINFORCE, and a simple GA method to solve the Lunar Lander (LunarLander-v2) environment in OpenAI Gym. 
I propose two modifications to the GA method: an improved fitness function with which the GA can solve the task in about 50 generations, and a novel dynamic mutation power technique that helps the model solve the task in 30 generations. <br />
<a href="https://youtu.be/BmMubRYbuQM">Video</a> | <a href="https://drive.google.com/file/d/1bsnn7sfDrHZSZhAxYsIKkBmPxyytd4Xk/view?usp=sharing">Report</a></p>]]></content><author><name>Aditya Chinchure</name><email>aditya10@cs.ubc.ca</email></author><category term="academic" /><summary type="html"><![CDATA[This project introduces a dynamic mutation power modification to the GA method, to solve the Lunar Lander environment on OpenAI Gym in 30 generations.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.adityachinchure.com/assets/posts/growing-agents/ga.png" /><media:content medium="image" url="https://www.adityachinchure.com/assets/posts/growing-agents/ga.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">A Summary of Recent Text Summarization Techniques</title><link href="https://www.adityachinchure.com/text-summarization/" rel="alternate" type="text/html" title="A Summary of Recent Text Summarization Techniques" /><published>2020-12-03T18:29:20-08:00</published><updated>2020-12-03T18:29:20-08:00</updated><id>https://www.adityachinchure.com/text-summarization</id><content type="html" xml:base="https://www.adityachinchure.com/text-summarization/"><![CDATA[<p>In this project paper, we surveyed recent text summarization techniques, evaluating existing extractive and abstractive models. We studied the metrics and datasets used to evaluate the latest models and examined upcoming abstractive techniques. Finally, we highlighted future pathways for text summarization and suggested areas for improvement. <br />
<a href="https://drive.google.com/file/d/1ayX-OSNrvvJsNsnVA_16JzFIVmI0NnoB/view">Report</a></p>]]></content><author><name>Aditya Chinchure</name><email>aditya10@cs.ubc.ca</email></author><category term="academic" /><summary type="html"><![CDATA[In this project paper, we surveyed recent text summarization techniques, evaluating existing extractive and abstractive models. We studied the metrics and datasets used to evaluate the latest models and examined upcoming abstractive techniques. Finally, we highlighted future pathways for text summarization and suggested areas for improvement.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.adityachinchure.com/assets/posts/text-summarization/text-summarization.png" /><media:content medium="image" url="https://www.adityachinchure.com/assets/posts/text-summarization/text-summarization.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry></feed>