Open to Work: Computer Vision Engineer

01 // Introduction

Aaryan Kurade

Building Production CV Systems

02 // Profile

About Me

Computer Vision Engineer with production experience across detection, segmentation, tracking, and geospatial applications. Built and deployed CV systems using YOLO, RT-DETR, SAM, and tracking algorithms (ByteTrack, DeepSORT) for sports analytics, satellite imagery, and industrial inspection. Extensive data annotation expertise (Roboflow/CVAT/QGIS) across detection, segmentation, and pose estimation tasks. Specialized in model optimization (ONNX, quantization) and scalable deployment (FastAPI + Docker).

03 // Career

Experience

Utopia Optovision Pvt. Ltd.

Machine Learning Intern | Pune, India

Jan 2024 - Jan 2025
  • Developed real-time industrial inspection system using YOLOv8 + PaddleOCR pipeline for conveyor belt code extraction, achieving 15% accuracy improvement and 61% relative CER reduction (18% → 7%)
  • Benchmarked custom CNNs, R-CNN, and VLMs (Qwen), selecting the YOLO + OCR hybrid to meet <100ms latency requirements for manufacturing lines
  • Optimized inference pipelines for resource-constrained CCTV hardware via ONNX export and async processing
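
For readers unfamiliar with the metric, here is a minimal sketch of how a character error rate (CER) and a relative reduction like the 18% → 7% figure above can be computed. The function names are hypothetical and chosen for illustration; this is not the evaluation code used in the project.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference string must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_reduction(before: float, after: float) -> float:
    """Fraction by which an error rate dropped, relative to its starting value."""
    return (before - after) / before


# One wrong character out of six gives a CER of 1/6.
print(cer("ABC123", "ABC128"))
# Going from 18% to 7% CER is a relative reduction of roughly 61%.
print(round(relative_reduction(0.18, 0.07) * 100))
```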

Arakoo.ai

Software Engineer Intern | US - Remote

Aug 2025 - Nov 2025
  • Built real-time ASR pipeline with Voice Activity Detection, reducing false transcription triggers by 40% via adaptive thresholding and signal preprocessing
  • Deployed FastAPI async speaker diarization module handling 50+ concurrent audio streams with non-blocking I/O
  • Implemented prompt caching strategies reducing LLM inference costs by $0.02/minute through context reuse
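
As a rough illustration of adaptive thresholding in Voice Activity Detection, the sketch below flags a frame as speech when its energy exceeds a multiple of a running noise-floor estimate. The function name, the `ratio` and `alpha` parameters, and the energy-based approach itself are assumptions made for illustration; the actual pipeline may work differently.

```python
from typing import List


def adaptive_vad(frame_energies: List[float],
                 ratio: float = 3.0,
                 alpha: float = 0.1) -> List[bool]:
    """Return one speech/non-speech flag per frame.

    Assumes the stream starts with background noise, so the first frame
    seeds the noise-floor estimate (hypothetical design choice).
    """
    if not frame_energies:
        return []
    noise_floor = frame_energies[0]
    flags = []
    for energy in frame_energies:
        is_speech = energy > ratio * noise_floor
        flags.append(is_speech)
        if not is_speech:
            # Track the noise floor with an exponential moving average,
            # updated only on frames judged to be non-speech.
            noise_floor = (1 - alpha) * noise_floor + alpha * energy
    return flags


energies = [1.0, 1.1, 0.9, 8.0, 9.0, 1.0]
print(adaptive_vad(energies))
```

Because the threshold follows the background level, a louder room raises the bar for triggering transcription instead of producing a stream of false starts.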

04 // Portfolio

Featured Projects

In Progress

Multi-Sport CV Analytics

Tracking and analytics pipelines across volleyball, football, and basketball using ByteTrack, RT-DETR, and custom ONNX models with zero-shot team classification.

100 FPS Ball Detection (CPU)
87.3% MOTA Player Tracking
30-100 FPS Real-Time Inference
YOLO, ByteTrack, SigLIP, ONNX

Production

AI Visual Search Engine

Deployed semantic search for 100K fashion images using CLIP embeddings + FAISS. Optimized for production with GPU acceleration and hardware detection.

< 100ms P95 Latency
60%+ Memory Efficiency
99%+ Uptime
PyTorch, CLIP, FAISS, Docker

In Progress

Geo-Insight Analyzer

Multi-agent GeoAI system for natural language satellite imagery analysis using LangGraph, Moondream VLM, and SAM3 with Google Earth Engine integration.

Multi-Agent LangGraph
Target: <5s Latency
ChromaDB Vector Search
LangGraph, VLM, SAM3, GEE

Production

Video Anomaly Detection

Non-blocking video processing pipeline using Convolutional Autoencoders. Engineered to handle high-bitrate streams without timeout errors.

0.92 Precision
Non-blocking Async
Prometheus Monitoring
PyTorch, Autoencoder, MLOps

05 // Stack

Technical Arsenal

Languages & Frameworks

  • Python, MySQL
  • TensorFlow, PyTorch
  • NumPy, Pandas, Matplotlib, Scikit-learn

Computer Vision

  • YOLO, RT-DETR, SAM, VLMs
  • Object Detection, Segmentation, Classification
  • Pose/Keypoint Estimation, 3D Vision
  • Geospatial AI, SigLIP
  • Multi-Object Tracking (ByteTrack, DeepSORT, Kalman Filter)
  • OCR (PaddleOCR, Tesseract)
  • Homography, Feature Matching
  • Roboflow, CVAT, Supervisely, QGIS, ArcGIS (Geospatial)

AI Agents & LLMs

  • LangChain, LangGraph
  • Agentic RAG
  • Hugging Face Transformers
  • Vector DBs (Pinecone/Chroma)

Optimization

  • ONNX, TensorRT, vLLM
  • INT8/INT4 Quantization
  • Pruning, LoRA/PEFT
  • Mixed-precision (AMP)

MLOps & Backend

  • Docker, FastAPI
  • Git, Streamlit
  • Prometheus Monitoring
  • Async Processing

06 // Background

Education & Certifications

MIT World Peace University

B.Tech, Electronics & Communication Engineering - AI/ML

Jun 2021 - Jun 2025

Pune, India

Certifications

  • AI Agents Fundamentals - Hugging Face
  • Google Cloud Computing Foundations - NPTEL
  • Computer Vision Bootcamp - OpenCV

07 // Community

Open Source

Active contributor to Ultralytics, Hugging Face, and other computer vision libraries, with a focus on open-source tools that democratize AI and make CV accessible to developers worldwide.