Sri Satya Sai Immani
AI Tooling - Systems - Performance - Data Engineering
MS CS @ Case Western Reserve University
I build practical AI systems and developer-friendly tooling: LLM integrations, retrieval workflows, reliable APIs, and performance-aware pipelines.
Library chatbot impact: 30% workload drop
MCQ agent impact: ~60% time saved
GPU sorting throughput: 344.6M elems/s
Tech stack
Python
FastAPI
C
C++
Linux
Docker
PyTorch
Google Cloud
AWS
PostgreSQL
Kubernetes
Git
TypeScript
Next.js
React
Tailwind CSS
OpenAI
HTML5
CSS3
About

Graduate CS engineer focused on AI systems and performance work: LLM integration, RAG-style workflows, and GPU/CPU benchmarking, built to be measurable and reliable.

LLM Integration - Data + APIs - CUDA / Benchmarking

Featured Projects

NovelTease
Multimodal generation + FFmpeg video synthesis behind a production-grade FastAPI backend.
AI App
Python
FastAPI
TypeScript
OpenAI
Hybrid Bucket-Radix GPU Sort
Warp-aggregated scatter reduces atomics and hits 344.6M elements/sec on large benchmarks.
CUDA
C
C++
CUDA
Linux
CMake
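
To illustrate why warp-level aggregation cuts atomic traffic, here is a minimal CPU-side simulation in Python. It is a sketch of the general technique, not the project's CUDA kernel: the names `bucket_of`, `scatter_naive`, and `scatter_warp_aggregated` are hypothetical, the top-bits bucketing rule is an assumption, and `num_buckets` is assumed to be a power of two. Each "atomic" is just a counter increment, so the code only shows how many atomic operations each strategy would issue.

```python
from collections import Counter

WARP_SIZE = 32  # lanes per warp on NVIDIA GPUs


def bucket_of(key, num_buckets, key_bits=32):
    """Map a key to a bucket using its top bits (assumed bucketing rule)."""
    shift = key_bits - (num_buckets.bit_length() - 1)
    return key >> shift


def scatter_naive(keys, num_buckets):
    """Simulate one atomicAdd per element to reserve an output slot."""
    counters = [0] * num_buckets
    slots, atomics = [], 0
    for k in keys:
        b = bucket_of(k, num_buckets)
        slots.append((b, counters[b]))
        counters[b] += 1
        atomics += 1
    return slots, atomics


def scatter_warp_aggregated(keys, num_buckets):
    """Simulate one atomicAdd per distinct bucket per warp."""
    counters = [0] * num_buckets
    slots, atomics = [], 0
    for start in range(0, len(keys), WARP_SIZE):
        buckets = [bucket_of(k, num_buckets) for k in keys[start:start + WARP_SIZE]]
        # A leader lane reserves a contiguous range for each distinct bucket.
        base = {}
        for b, n in Counter(buckets).items():
            base[b] = counters[b]
            counters[b] += n
            atomics += 1
        # Every lane then writes at base + its rank among same-bucket lanes.
        rank = Counter()
        for b in buckets:
            slots.append((b, base[b] + rank[b]))
            rank[b] += 1
    return slots, atomics


if __name__ == "__main__":
    keys = [(i * 2654435761) & 0xFFFFFFFF for i in range(1000)]
    _, naive = scatter_naive(keys, 16)
    _, aggregated = scatter_warp_aggregated(keys, 16)
    print(f"naive atomics: {naive}, warp-aggregated atomics: {aggregated}")
```

Both functions assign identical output slots; only the number of simulated atomics differs. The saving grows as more lanes in a warp land in the same bucket.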
Multiclass Vulnerability Detection (GGNN)
Trained GGNNs on interprocedural program graphs (150K+ functions); improved macro-F1 from 0.72 to 0.84.
GNN
Python
PyTorch
C
C++
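
For readers unfamiliar with the metric, macro-F1 is the unweighted mean of per-class F1 scores, so rare vulnerability classes count as much as common ones. The sketch below is a plain-Python illustration of that definition; the function name `macro_f1` and the toy labels are hypothetical and are not taken from the project's evaluation code.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


if __name__ == "__main__":
    # Toy three-class example (labels are illustrative only).
    y_true = [0, 0, 1, 1, 2, 2]
    y_pred = [0, 0, 1, 2, 2, 2]
    print(f"macro-F1 = {macro_f1(y_true, y_pred):.3f}")
```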

Work Experience

Generative AI Research Assistant - [U]Tech
Apr 2025 - Jan 2026
Impact-first AI systems work.
  • Developed a memory-retaining retrieval-augmented generation agent that simulates a patient with predefined medical conditions.
  • Built a library-query chatbot that reduced front-desk workload by 30%.
  • Engineered an agent that retrieves NLM images and auto-generates MCQs, reducing professor workload by ~60%.
  • Curated and evaluated 5-10 AI tools monthly; authored adoption reports reviewed by leadership.
Copilot Studio
Dataverse
PowerApps
SharePoint
Digital Accessibility Assistant - [U]Tech
Jan 2025 - Apr 2025
Quality assurance and structured outputs at scale.
  • Processed and validated digital content against QA standards to ensure clean, structured outputs.
  • Corrected 100+ hours of content to pass accessibility rubrics and metadata accuracy checks.
Markdown
Echo360

Open to AI tooling, backend, and performance-focused roles.

Clear signal: measurable impact, real systems, and clean UX.