I am a computer vision engineer at Bigvision.ai. I am interested in doing research on computer vision. Currently I am working on Generative AI and its real-world applications.
☎️ Contact information
📧 Email: [email protected]
🐦 Twitter: **https://twitter.com/subhrajit608**
🔗 LinkedIn: **https://www.linkedin.com/in/subhrajit-dey-7a2784166/**
Google Scholar: https://scholar.google.com/citations?user=qF5U1hIAAAAJ&hl=en
👩🏻‍💻 Research experience
Computer Vision Engineer at Big Vision
Location: Bengaluru, India, Date: June 2022 - Present
- Served as a key contributor to a tooth pathology detection project, implementing MobileNetV2 and InceptionV3 models to classify tooth types (mandibular, maxillary) with an accuracy of 95.6%. Obtained an F1-score of 98.42% in classifying tooth pathologies such as dental caries and pulpal involvement, and 74.76% in localization using the U2Net segmentation model. Used Mask R-CNN for tooth segmentation and YOLOv5 for detecting fixed prostheses, reaching a precision of 92% and a recall of 95%.
- Trained card classifiers and rotation-type classifiers with MobileNetV2 on the AWS SageMaker platform for a card trading company, using S3 for data downloads and model weight storage.
- Worked on a Stable Diffusion inpainting project for a car rental company to replace the floor beneath cars photographed in white booths. Further refined the model to generate three distinct floor types. Used computer vision techniques such as image harmonization and homography.
- Created a course on AI art generators, such as Stable Diffusion, in partnership with OpenCV under the mentorship of Satya Mallick, CEO of OpenCV and Bigvision.ai. The course material had to be adapted to run on Google Colab, which was done by reducing VRAM usage through quantization and an 8-bit Adam optimizer, cutting consumption from 22 GB to 10 GB. The course raised approximately $143,000 on Kickstarter.
- Currently focusing on applications of large language models (LLMs), such as data extraction and image question answering on sports cards, for a sports card trading company.
Undergraduate Student Researcher, Mitacs Globalink Research
Location: Montreal, Canada, Date: Jun. 2021 - Sept. 2021
- Used a PSMNet-based stereo matching model (built on spatial pyramid pooling) for 360° depth estimation from 360° stereoscopic image pairs. The model is called 360StereoNet. The code is available at https://github.com/subro608/360StereoNet