Projects

AcharyaGPT

A user-friendly chatbot app powered by a fine-tuned OpenAI GPT-3.5 Turbo model that suggests tailored Ayurvedic formulations based on an individual's symptoms, patient attributes, and context. It uses a retrieval-based Ayurvedic knowledge system to ground its recommended herbs and formulations in authentic Ayurvedic texts and trusted sources, and it provides personalized instructions for safe usage.

GyanSrota

GyanSrota is a web-based chatbot that answers questions about academic matters, institutional handbooks, campus facilities, and other aspects of Manipal University Jaipur. A retrieval pipeline extracts relevant information from a corpus of documents based on the user's prompt. The RAG pipeline is built with LangChain, and the Gemini 1.5 Pro language model generates the responses. Buffer memory maintains contextual awareness throughout the conversation.
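The interplay of retrieval and buffer memory can be illustrated with a small, dependency-free sketch. This is not the project's code and does not use LangChain or Gemini: the `BufferMemory` class, the keyword-overlap `retrieve` function, and `build_prompt` are hypothetical stand-ins, assuming that recent turns and retrieved passages are combined into a single prompt for the language model.

```python
from collections import deque


class BufferMemory:
    """Keeps only the most recent turns (hypothetical helper, not a LangChain class)."""

    def __init__(self, max_turns=3):
        # deque with maxlen silently drops the oldest turn when full
        self.turns = deque(maxlen=max_turns)

    def add(self, user, assistant):
        self.turns.append((user, assistant))

    def as_text(self):
        return "\n".join(f"User: {u}\nAssistant: {a}" for u, a in self.turns)


def retrieve(query, documents, k=2):
    """Rank documents by shared words with the query (toy scoring, assumed for illustration)."""
    query_words = set(query.lower().split())
    scored = [(len(query_words & set(doc.lower().split())), doc) for doc in documents]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


def build_prompt(query, documents, memory):
    """Combine retrieved context, conversation history, and the new question."""
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nHistory:\n{memory.as_text()}\n\nQuestion: {query}"
```

A real deployment would replace the keyword overlap with embedding-based similarity search and pass the resulting prompt to the language model.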

AR Model Placer

AR Model Placer is an augmented reality iOS application that lets users place 3D models into real-world environments with a single touch. The app uses ARKit for accurate motion tracking and environmental understanding, RealityKit for rendering and animating 3D models, and SwiftUI for a responsive, modern user interface. ARKit features such as plane detection and light estimation help place 3D models accurately and make them look realistic within the physical space. SwiftUI simplifies development with its declarative syntax and animations, making the app intuitive and efficient to use.

Crowd Anomaly Detection

This crowd behavior analysis framework detects and classifies anomalies across eight categories: seven crime-related classes and one normal-behavior class. The process begins by converting the input video into individual frames. These frames are then processed by a custom CNN architecture to extract high-dimensional spatial features. The extracted features are fed into an LSTM network, which captures the temporal dependencies and sequential patterns essential for understanding dynamic scenes. Finally, the LSTM output is used to classify the sequence as either one of the criminal activities or normal behavior.
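One step the description leaves implicit is how individual frames become sequences for the LSTM. The sketch below assumes frames are grouped into fixed-length, overlapping clips; the `make_clips` helper and its `clip_len` and `stride` values are hypothetical and are not taken from the project.

```python
def make_clips(frames, clip_len=16, stride=8):
    """Split a frame sequence into fixed-length clips.

    clip_len and stride are illustrative assumptions; a stride smaller
    than clip_len makes consecutive clips overlap.
    """
    if clip_len <= 0 or stride <= 0:
        raise ValueError("clip_len and stride must be positive")
    # Only full-length clips are kept; a trailing partial clip is dropped
    return [
        frames[start:start + clip_len]
        for start in range(0, len(frames) - clip_len + 1, stride)
    ]


# Integers stand in for decoded video frames
clips = make_clips(list(range(40)))
```

Each clip would then be passed frame by frame through the CNN, and the resulting feature sequence through the LSTM, to produce one label per clip.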

Multimodal Retrieval Framework

A multimodal Retrieval-Augmented Generation framework that uses Gemini 1.5 Pro as its large language model and supports multiple modalities, such as image-to-text, text-to-image, and simultaneous text and image generation. It integrates the YOLO framework for object detection and OCR methods for accurate text extraction from PDFs and diverse image data. The framework enables efficient data retrieval and analysis, letting users interact with complex datasets.

Movie Recommender System

The movie recommender system, trained on the IMDb 5000 movie dataset, employs a content-based approach for generating recommendations. It processes movie details such as genres, keywords, and cast, merging them into a unified text description for each movie. NLP libraries such as NLTK are used for tokenization, stemming, and stopword removal to prepare the text data. The system then employs the TF-IDF vectorizer from scikit-learn to convert these textual descriptions into numerical feature vectors. By calculating cosine similarity between these vectors, the system identifies and recommends movies with the highest similarity to a user’s selected movie, ensuring relevant suggestions.
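The TF-IDF and cosine-similarity steps can be sketched without any dependencies. This is an illustration only: the project uses scikit-learn's TF-IDF vectorizer and NLTK preprocessing, whereas the functions below are hand-written stand-ins with plain whitespace tokenization, and the four movies are made-up toy data rather than entries from the dataset.

```python
import math
from collections import Counter


def tf_idf_vectors(texts):
    """Return one {term: weight} dict per text, using a smoothed IDF."""
    tokenized = [text.lower().split() for text in texts]
    n_docs = len(tokenized)
    # Number of documents in which each term appears
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(title, movies, top_n=2):
    """Return the top_n titles most similar to the selected title."""
    titles = list(movies)
    vectors = tf_idf_vectors([movies[t] for t in titles])
    target = vectors[titles.index(title)]
    scores = [(cosine(target, v), t) for t, v in zip(titles, vectors) if t != title]
    scores.sort(reverse=True)
    return [t for _, t in scores[:top_n]]


# Toy data: each value stands for the merged genres, keywords, and cast text
movies = {
    "Space Quest": "scifi adventure space astronaut",
    "Star Voyage": "scifi space alien astronaut",
    "Love in Paris": "romance drama paris",
    "Galaxy Wars": "scifi action space battle",
}
print(recommend("Space Quest", movies))
```

Rare terms receive higher weights than terms shared by many movies, so a match on a distinctive keyword counts for more than a match on a common genre.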

… more projects on GitHub