cv

You can find my professional and academic experiences detailed here! For more information on my publications and teaching, please visit the relevant pages linked above.

Basics

Name Samar Khanna
Email samarkhanna [at] cs.stanford.edu
Url https://samar-khanna.github.io
Summary I'm passionate about solving impactful, real-world problems. To that end, I believe in using the amazingly flexible tools offered by AI, and more broadly, computer science, to craft beneficial, world-changing technologies.
Location Bay Area, California, USA

Education

  • 2021.09 - 2023.06

    Stanford, CA

    Stanford University
    Master of Science in Computer Science
    Specialization: Artificial Intelligence
    Distinction in Research
  • 2021.01 - 2021.05

    Ithaca, NY

    Cornell University
    Master of Engineering in Computer Science
  • 2017.08 - 2020.12

    Ithaca, NY

    Cornell University
    Bachelor of Science in Computer Science
    Minor: Electrical and Computer Engineering
    Summa Cum Laude (Highest Honors)

Professional experience

  • 2023.08 - Present

    Mountain View, CA

    Aurora Innovation
    Machine Learning Engineer (Perception)
    Working on long-range perception for self-driving trucks
  • 2022.06 - 2022.09

    Santa Clara, CA

    NVIDIA
    Machine Learning Intern
    Interned with the Perception team for autonomous vehicles
    • Improved model training time for 3D detection models by 40%
    • Devised methods to incorporate predictive uncertainty estimation for neural network classification and regression tasks
  • 2020.05 - 2020.08

    Remote

    Uber ATG
    Machine Learning Intern
    Interned with the Perception team on 3D object detection
    • Developed a new anchor-free 2D/3D image-centric object detection approach using camera + LiDAR sensor fusion
    • Improved 3D object AP by ~3% and 3D object F1 Score by ~2% for pedestrian detection
  • 2019.05 - 2019.08

    Pittsburgh, PA

    Uber ATG
    Software Engineering Intern
    Interned with the Machine Teaching team on large-scale auto-labelling efforts
    • Developed a CNN to speed up 2D bounding box and segmentation labelling using 3D LiDAR + camera labelled features
    • Achieved ~15-20% improvement in labelling speed using model's pre-labels (based on preliminary experiments)

Research

  • 2022.01 - 2023.08

    Stanford, CA

    Stanford Artificial Intelligence Laboratory (SAIL)
    Graduate Research Assistant
    Worked with Prof. Stefano Ermon (in CS) and also with Prof. David B. Lobell and Prof. Marshall Burke as part of the SustainLab research group. Topics I researched: self-supervised learning, generative (diffusion) models, foundation models for remote sensing data, parameter-efficient training.
    • Published multiple works on generative models, self-supervised learning and foundation models, including DiffusionSat, SatMAE, GeoLLM, Denoising Diffusion Bridge Models
    • Awarded $100,000 Google Cloud grant by Stanford HAI for our research
    • Mentored undergraduate students on research projects including data compression, parameter-efficient training, joint generative-discriminative models
    • Gave a research talk at Stanford HAI Climate-Centered group
  • 2020.02 - 2021.05

    Ithaca, NY

    Cornell Computer and Information Science (CIS)
    Undergraduate Research Assistant
    Researched under Dean Kavita Bala and Prof. Bharath Hariharan on domain generalization in satellite imagery
    • Devised an algorithm to tackle geographic generalization (i.e. robustness to variance in location of imagery)
    • Implemented MAML, self-supervised learning algorithms and time-series vision models for multi-class crop classification

Projects

  • 2018.01 - 2020.12

    Ithaca, NY

    Cornell Data Science
    Intelligent Systems Team Lead
    Served as a team lead for Cornell Data Science, an undergraduate student-run project team. Was responsible for multi-semester projects, collaborations with industry, presentations to professors, and participation in competitions. Other team-management responsibilities included recruitment, onboarding, weekly team check-ins, and planning social activities.
    • Led a team to improve automatic mapping tools that can be used during humanitarian emergencies (project link)
    • Implemented and validated object detection and semantic segmentation models to map houses in satellite imagery (precision: 85%, recall: 81%, baseline precision: 45%)
    • Conducted interviews with 9 NGOs/organizations (Red Cross, Mapillary etc.) involved in humanitarian mapping
    • Collaborated with IBM Research to access PAIRS satellite data and automate dataset generation
    • Created a curriculum (assignments) spanning techniques in CV, NLP, RL and started a team blog to discuss research
  • 2022.01 - 2022.03

    Stanford, CA

    CS 224W Class Final Project
    Graduate Student
    Final project for CS 224W: Machine Learning with Graphs
    • Demonstrated how graph neural networks (GNN) can be used for online link prediction for drug discovery
    • Featured as one of the best projects by Stanford CS 224W blog page (link)
  • 2021.01 - 2021.05

    Ithaca, NY

    CS 4300 Class Final Project
    Undergraduate Student
    Final project for CS 4300: Language and Information
    • Developed a code search engine using embedding similarity search
    • Runner up for best project

Leadership

  • 2023.03 - 2023.06

    Stanford, CA

    CS 221: Artificial Intelligence: Principles and Techniques
    Head Teaching Assistant
    Led a team of 15 TAs to manage CS 221, a class with 250+ students (required for the AI track)
    • Led weekly meetings to organize TA duties, plan office hours and grading, and prepare homework assignments, exams, and section material
  • 2021.09 - 2023.06

    Stanford, CA

    Stanford CS
    Teaching Assistant
    Served as a TA for CS 221 (Artificial Intelligence) and CS 161 (Algorithms) over the Spring 2022, Winter 2022, and Fall 2021 quarters
    • Held office hours, led sections, prepared and graded exams & assignments, answered student questions on Ed
  • 2020.08 - 2021.05

    Ithaca, NY

    CS 4780: Introduction to Machine Learning
    Co-Head Teaching Assistant
    Served as one of the head teaching assistants for CS 4780 at Cornell over the Fall 2020 and Spring 2021 semesters
    • Led sections, created demos, held office hours, prepared and graded exams, and answered student questions on Ed

Volunteer

  • 2023.10 - 2023.12
    NeurIPS 2023 Computational Sustainability Workshop
    Reviewer
    Reviewed paper submissions and recommended acceptance/rejection for presentation at the workshop
  • 2022.11 - 2022.12
    NeurIPS 2022
    Group Leader, Education Outreach Volunteer
    Raised awareness of AI among high-school students from New Orleans at NeurIPS 2022. Discussed ways to begin a career in CS and AI, as well as some of the novel ways AI can change fields such as law, sustainability, and science. While the students were curious to know what a day in the life of an AI researcher looks like, they were most excited about the free tech company swag :D
  • 2015.06 - 2017.05
    The Akanksha Foundation
    Teaching Volunteer
    Conducted educational activities and created worksheets to teach English and Maths to underprivileged kindergarten children

Publications

  • 2024.01.16
    DiffusionSat: A Generative Foundation Model for Satellite Imagery
    The Twelfth International Conference on Learning Representations
    A novel diffusion-model-based generative foundation model for satellite image datasets that can condition on text and metadata. DiffusionSat can also solve inverse problems such as super-resolution, in-painting, and temporal prediction, surpassing the previous state-of-the-art
  • 2024.01.16
    GeoLLM: Extracting Geospatial Knowledge from Large Language Models
    The Twelfth International Conference on Learning Representations
    A novel method to extract geospatial knowledge from LLMs using auxiliary map data from OpenStreetMap. GeoLLM achieves a 70% improvement in performance relative to baselines
  • 2024.01.16
    Denoising Diffusion Bridge Models
    The Twelfth International Conference on Learning Representations
    A more general framework to solve image-to-image translation tasks such as image editing. DDBMs outperform baseline methods by solving a stochastic differential equation based on the learned score of the diffusion bridge from data
  • 2023.08.26
    Differentiable Weight Masks for Domain Transfer
    ICCV 2023 Workshop on Out of Distribution Generalization in Computer Vision
    Can gradient-based weight masking methods achieve domain transfer in computer vision models? We compare different weight masking methods to analyze their effect on domain transfer in commonly used vision models
  • 2023.07.20
    Invalid Logic, Equivalent Gains: The Bizarreness of Reasoning in Language Model Prompting
    ICML 2023 Workshop on Knowledge and Logical Reasoning in the Era of Data-driven Learning
    Chain-of-thought (CoT) prompting has previously been shown to elicit strong reasoning performance in LLMs. But does making the prompts logically invalid hurt LLM performance? Turns out, not so much!
  • 2022.10.31
    SatMAE: Pre-training Transformers for Temporal and Multi-Spectral Satellite Imagery
    Advances in Neural Information Processing Systems
    A novel pre-training method for transformers for temporal and multi-spectral satellite imagery. SatMAE outperforms multiple prior state-of-the-art methods on a collection of datasets, demonstrating its value as a foundational model with strong downstream task performance

Awards

  • 2022
    NeurIPS 2022 Scholar Award
    Neural Information Processing Systems
    Fully-funded attendance at NeurIPS 2022 conference, presented to a select group of student authors
  • 2021
    Dean's List
    Cornell University
    Awarded in all semesters for strong academic performance
  • 2017
    IBDP World Topper
    International Baccalaureate
    Achieved perfect 45/45 score (top 0.3% worldwide) in the IBDP grade 11-12 curriculum

Skills

Artificial Intelligence
Generative modelling
Diffusion models
Self-supervised learning
Parameter-efficient training
Meta-Learning
Time-series modelling
Machine Learning
Python
PyTorch
TensorFlow
C++
Distributed model training
Computer Vision
Object detection
Semantic segmentation
Image classification
Natural Language Processing
Large language models (LLMs)
Embeddings

Selected coursework

EE 364a: Convex Optimization
Stanford University 2023
CS 324: Foundation Models
Stanford University 2023
CS 265: Randomized Algorithms
Stanford University 2022
CS 231n: Computer Vision
Stanford University 2022
CS 228: Prob. Graphical Models
Stanford University 2022
CS 224n: NLP
Stanford University 2022
CS 330: Meta Learning
Stanford University 2021
CS 224w: ML with Graphs
Stanford University 2021
CS 6787: Advanced ML Systems
Cornell University 2020
CS 4120: Compilers
Cornell University 2020
MATH 2930: Differential Equations
Cornell University 2018

Languages

English
Native speaker
Hindi
Fluent
French
Intermediate

Interests

Outdoor activities
Swimming
Tennis
Hiking
Indoor activities 🤓
Yoga
Reading
Movies
Guitar