Hi!
I am Anooshka Bajaj

Data Scientist, Machine Learning & AI Researcher


about me

I am a Data Scientist II at TGS and hold an M.S. in Data Science from Indiana University Bloomington and a B.Tech. in Bio-Engineering with a Minor in Computer Science Engineering from IIT Mandi.

During my graduate studies I worked in the Cognitive AI Lab at IU, with research focused on mechanistic interpretability, in-context learning, and AI safety and alignment in large language models. Prior to TGS, I worked as an Engineer at Radisys.

Talk to me about AI, Rationality and EA, and I'd be happy!

Name

Anooshka Bajaj

Address

Houston, TX
United States

Email

anooshkabajaj@gmail.com


education

2015 - 2017

Secondary School

Chandigarh, India

CGPA: 10.00

2017 - 2019

Senior Secondary School

Chandigarh, India

Non-Medical: Physics, Chemistry, Maths
Percentage: 90.20%

2019 - 2023

Bachelor of Technology (B.Tech.)

Himachal Pradesh, India

B.Tech. in Bio-Engineering with a Minor in Computer Science Engineering
CGPA: 8.50 / 10.00

2024 - 2026

Master of Science (M.S.)


Bloomington, IN, United States

M.S. in Data Science
GPA: 3.94 / 4.00

experience

  • May 2026 - Present

    Data Scientist II


    May 2025 - Aug 2025

    Data Science Intern

    Houston, TX, United States

    Developed and optimized a large-scale machine learning foundation model for geological data, refining the architecture and fine-tuning on industry datasets to enable more accurate predictions for subsurface formation analysis.

  • Dec 2024 - May 2026

    AI Engineer


    Sep 2024 - May 2026

    Research Assistant

    Bloomington, IN, United States

    Member of the IU Luddy Autonomous Racing Team, competing in the Indy Autonomous Challenge. Led the Data Analysis sub-team, built an AI agent platform for natural language querying of telemetry and performance data, and developed visualization tools to surface actionable race insights.

    Conducted research in the Cognitive AI Lab on in-context learning mechanisms in Transformer-based language models. Investigated the temporal dynamics of attention heads in LLMs, drawing parallels to human memory systems to better understand emergent learning behavior.

  • Dec 2023 - June 2024

    Engineer


    Jul 2023 - Nov 2023

    Intern

    Bangalore, Karnataka, India

    Worked with the OAM (Operations, Administration, and Maintenance) team on 5G network infrastructure. Contributed to Configuration Management (CM) by extending the YANG model with new configurations per 3GPP specifications, and to Performance Management (PM) by developing and modifying performance counters in C++ and Python.

  • July 2022 - Sep 2022

    Data Engineer Intern

    Indore, Madhya Pradesh, India (remote)

    Developed machine learning models to predict used car prices and assess fair market value. Preprocessed and performed exploratory data analysis on large datasets in Python, leveraged BigQuery for data warehousing, and orchestrated ETL workflows using Google Cloud Composer.

  • June 2022 - July 2022

    Research Intern

    Bhopal, Madhya Pradesh, India

    Research internship investigating target genes of the YAP transcription factor in the Hippo signaling pathway to identify potential tumor growth inhibition mechanisms. Implemented Python-based web scraping and data visualization to analyze target gene profiles, developed a cell migration analysis script, and deployed the results as a web application using Flask.

  • July 2021 - Nov 2021

    Research Project

    Mandi, Himachal Pradesh, India

    Research project in computational biology. Designed custom index sequences to increase throughput of Illumina next-generation sequencing, and developed a Perl pipeline for barcode generation and quality control analysis to minimize sequencing errors.

Publications

Emergence of Episodic Memory in Transformers: Characterizing Changes in Temporal Structure of Attention Scores During Training

Deven Mahesh Mistry, Anooshka Bajaj, Yash Aggarwal, Sahaj Singh Maini, Zoran Tiganj

Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL 2025)

Pages: 8893–8911 | Albuquerque, New Mexico

Abstract:
We investigate in-context temporal biases in attention heads and transformer outputs. Using cognitive science methodologies, we analyze attention scores and outputs of the GPT-2 models of varying sizes. Across attention heads, we observe effects characteristic of human episodic memory, including temporal contiguity, primacy and recency. Transformer outputs demonstrate a tendency toward in-context serial recall. Importantly, this effect is eliminated after the ablation of the induction heads, which are the driving force behind the contiguity effect. Our findings offer insights into how transformers organize information temporally during in-context learning, shedding light on their similarities and differences with human memory and learning.

Citation (APA):
Mistry, D. M., Bajaj, A., Aggarwal, Y., Maini, S. S., & Tiganj, Z. (2025). Emergence of Episodic Memory in Transformers: Characterizing Changes in Temporal Structure of Attention Scores During Training. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL 2025) (pp. 8893–8911). https://aclanthology.org/2025.naacl-long.448/

Beyond Semantics: How Temporal Biases Shape Retrieval in Transformer and State-Space Models

Anooshka Bajaj, Deven Mahesh Mistry, Sahaj Singh Maini, Yash Aggarwal, Zoran Tiganj

Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2026)

Pages: 7562–7581 | Rabat, Morocco

Abstract:
In-context learning depends not only on what appears in the prompt but also on when it appears. To isolate this temporal component from semantic confounds, we construct prompts with repeated anchor tokens and average the model’s predictions over hundreds of random permutations of the intervening context. This approach ensures that any observed position-dependent effects are driven purely by temporal structure rather than token identity or local semantics. Across four transformer LLMs and three state-space/recurrent models, we observe a robust serial recall signature: models allocate disproportionate probability mass to the tokens that previously followed the anchor, but the strength of this signal is modulated by serial position, yielding model-specific primacy/recency profiles. We then introduce an overlapping-episode probe in which only a short cue from one episode is re-presented; retrieval is reliably weakest for episodes embedded in the middle of the prompt, consistent with "lost-in-the-middle" behavior. Mechanistically, ablating high-induction-score attention heads in transformers reduces serial recall and episodic separation. For state-space models, ablating a small fraction of high-attribution channels produces analogous degradations, suggesting a sparse subspace supporting induction-style copying. Together, these results clarify how temporal biases shape retrieval across architectures and provide controlled probes for studying long-context behavior.

Citation (APA):
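Bajaj, A., Mistry, D. M., Maini, S. S., Aggarwal, Y., & Tiganj, Z. (2026). Beyond Semantics: How Temporal Biases Shape Retrieval in Transformer and State-Space Models. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2026) (pp. 7562–7581). https://aclanthology.org/2026.eacl-long.355/ (also available as arXiv preprint arXiv:2510.22752)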

Temporal Dependencies in In-Context Learning: The Role of Induction Heads

Anooshka Bajaj, Deven Mahesh Mistry, Sahaj Singh Maini, Yash Aggarwal, Billy Dickson, Zoran Tiganj

arXiv preprint

Abstract:
Large language models (LLMs) exhibit strong in-context learning capabilities, but how they track and retrieve information from context remains underexplored. Drawing on the free recall paradigm in cognitive science (where participants recall list items in any order), we show that several open-source LLMs consistently display a serial-recall-like pattern, assigning peak probability to tokens that immediately follow a repeated token in the input sequence. Through systematic ablation experiments, we show that induction heads, specialized attention heads that attend to the token following a previous occurrence of the current token, play an important role in this phenomenon. Removing heads with a high induction score substantially reduces the +1 lag bias, whereas ablating random heads does not reproduce the same reduction. We also show that removing heads with high induction scores impairs the performance of models prompted to do serial recall using few-shot learning to a larger extent than removing random heads. Our findings highlight a mechanistically specific connection between induction heads and temporal context processing in transformers, suggesting that these heads are especially important for ordered retrieval and serial-recall-like behavior during in-context learning.

Citation (APA):
Bajaj, A., Mistry, D. M., Maini, S. S., Aggarwal, Y., Dickson, B., & Tiganj, Z. (2026). Temporal Dependencies in In-Context Learning: The Role of Induction Heads. arXiv preprint arXiv:2604.01094. https://arxiv.org/abs/2604.01094

Who Do LLMs Trust? Human Experts Matter More Than Other LLMs

Anooshka Bajaj, Zoran Tiganj

arXiv preprint

Abstract:
Large language models (LLMs) increasingly operate in environments where they encounter social information such as other agents' answers, tool outputs, or human recommendations. In humans, such inputs influence judgments in ways that depend on the source's credibility and the strength of consensus. This paper investigates whether LLMs exhibit analogous patterns of influence and whether they privilege feedback from humans over feedback from other LLMs. Across three binary decision-making tasks, reading comprehension, multi-step reasoning, and moral judgment, we present four instruction-tuned LLMs with prior responses attributed either to friends, to human experts, or to other LLMs. We manipulate whether the group is correct and vary the group size. In a second experiment, we introduce direct disagreement between a single human and a single LLM. Across tasks, models conform significantly more to responses labeled as coming from human experts, including when that signal is incorrect, and revise their answers toward experts more readily than toward other LLMs. These results reveal that expert framing acts as a strong prior for contemporary LLMs, suggesting a form of credibility-sensitive social influence that generalizes across decision domains.

Citation (APA):
Bajaj, A., & Tiganj, Z. (2026). Who Do LLMs Trust? Human Experts Matter More Than Other LLMs. arXiv preprint arXiv:2602.13568. https://arxiv.org/abs/2602.13568

Media Coverage

New Cohort of Advanced Cyberinfrastructure Student Fellows

Indiana University IT News • Nov 12, 2025

Selected as one of six students across Indiana University for the Advanced Cyberinfrastructure Student Fellowship, supporting my research on interpretability, in-context learning, and confidence alignment in large language models, focusing on analyzing and understanding internal model behavior.

Data Science, Research, Cyberinfrastructure

IU Luddy Autonomous Race Team Thrives in Latest IAC Test

IU Luddy School of Informatics News • Aug 14, 2025

Highlighted for contributions to the IU Luddy Autonomous Race Team, which placed fourth in the Indy Autonomous Challenge at WeatherTech Raceway Laguna Seca. Contributed to race analysis by developing a tool to integrate and interpret high-frequency, multi-sensor autonomous vehicle data on racetrack maps.

Autonomous Racing, Data Analysis, Robotics

Challenges, Fun Highlight Computer Vision Course Final Project Poster Session

IU Luddy School of Informatics News • May 16, 2025

Featured for involvement in the Luddy computer vision course through a research-driven project on automated detection of bacterial flagellar motors in electron tomography data. The project focused on applying computer vision methods to accelerate biological discovery by automating a traditionally manual annotation process.

Computer Vision, Education, ML

Grace Hopper Celebration delivers 'magical' experience for Luddy students

IU Luddy School of Informatics News • Nov 17, 2025

Selected to attend the Grace Hopper Celebration (GHC) in Chicago, the world's largest gathering of women in computing and engineering. Participated in the four-day conference alongside IU Luddy students, connecting with thousands of women in technology, attending sessions and workshops, and networking with leaders shaping the future of tech.

Grace Hopper, Conference, Networking

Positions of Responsibility

Sports Secretary

Aug '19 - June '20
Gauri Kund Hostel, IIT Mandi

Coordinator - Dance Club

Aug '20 - June '21
Uhl Dance Crew, IIT Mandi

Coordinator - Table Tennis Team

Aug '20 - June '21
IIT Mandi

Fest Team Member
Planning and Publicity Team in Rann Neeti and Exodia

Sep '19 - Apr '20
IIT Mandi

Projects

Google ADK, Vertex AI, FastAPI

  • Built an AI agent platform for autonomous racing telemetry analysis using Google ADK and Vertex AI on GCP, developing agents that answer natural language queries over high-frequency sensor data with a FastAPI backend and React dashboard.
  • Designed a multi-agent system with three specialized agents for telemetry data discovery, natural language race analysis, and automated visualization, containerized with Docker and deployed on Google Cloud Run.

RAG, LLMs, LangChain, FastAPI

  • Built an LLM-powered Retrieval-Augmented Generation (RAG) system for enterprise operations, enabling citation-backed question answering over incident response and customer support documents.
  • Ingested and indexed multiple enterprise documents using LangChain and FAISS for semantic search and context-aware retrieval.
  • Built a FastAPI service supporting document ingestion, vector indexing, retrieval, and evaluation, with Dockerized deployment for scalable enterprise use.

Mamba, RoBERTa, DeBERTa, HPC

  • Investigated whether Mamba-2 state-space models encode mathematical operation structure better than Transformer encoders as dense retrievers, fine-tuning models on 251,558 contrastive triplets from the ARQMath-3 dataset.
  • Evaluated retrieval quality (NDCG@10, MRR) and geometric organization of embedding spaces across Mamba-2 (130M–1.3B) and Transformer baselines (RoBERTa, DeBERTa); RoBERTa-base outperformed Mamba-2 130M at matched parameter counts.

Claude AI, MCP, macOS

  • Built a macOS ambient memory engine that captures work context at regular intervals using screen monitoring and macOS accessibility APIs, helping individuals with ADHD maintain cognitive continuity across task switches.
  • Integrated Claude AI via MCP Bridge for semantic search across work history, AI-powered conversation, morning goal prioritization, and automatic 5-minute activity summaries with a local-first, encrypted SQLite storage model.
  • Won the Claude Builder Club Hackathon (March 2026); developed collaboratively in a team of five using Swift and Python.

Computer Vision, Object Detection, YOLO

  • Conducted a computer vision benchmarking study on detecting small biological structures in cryo-electron tomographic data using YOLO-based object detection models.
  • Benchmarked multiple YOLOv8 and YOLOv9 variants on cryo-electron tomograms to analyze performance trade-offs across model size, efficiency, precision, and recall, identifying the most effective architecture for small-object detection.
  • Mapped detections across slice depths to visualize spatial consistency of predictions, enabling reliable 3D localization of flagellar motors and reducing manual annotation effort in cryo-ET data.

ROS2, Vision-Language Models, RViz, Gazebo

  • Built a semantic navigation system for TurtleBot3 that interprets natural language commands (e.g., "go to the table") using a vision-language model (VLM) to infer navigation targets.
  • Integrated ROS2's Nav2 stack and SLAM Toolbox for real-time mapping, path planning, and autonomous motion control in a Gazebo simulation environment.
  • Designed a modular six-node architecture connecting the VLM semantic reasoning layer to the existing ROS2 navigation ecosystem, visualized via RViz.

Achievements

Claude Hackathon
Winner

Won first place at the Indiana University-wide Claude Hackathon 2026, receiving $1,000 in Claude API credits and a first place certificate.

Inter IIT Sports Meet
1st Runner Up

Won a silver medal at the Inter IIT Sports Meet, Delhi, 2023.

Dance Competitions
Winner

Won several intra-college dance competitions, including Breakout and Exuberance.

Table Tennis
Nationals

Represented Chandigarh in the Sub-Junior Table Tennis Nationals at Cochin and Gandhidham. Represented the Punjab, Haryana, Himachal Pradesh and Chandigarh Directorate at the NCC National Games in Delhi.

contact me

Anooshka Bajaj

Address:

Houston, TX, United States

LinkedIn:

anooshkabajaj