Projects

Distributed Heterogeneous Training for Large Language Models using Ray and DeepSpeed

Conducted ablation studies exploring efficient heterogeneous (CPU + GPU) distributed training for language models such as BERT and RoBERTa, varying factors such as batch size and the number of parallel CPU/GPU workers. Ray was used to parallelize CPU-side processing, and DeepSpeed's ZeRO optimization provided data parallelism, together with mixed-precision training, on a sentiment analysis task.
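A minimal sketch of how the two frameworks can be combined in this kind of setup (the model name, config values, and toy data below are illustrative assumptions, not the exact project code): Ray remote tasks parallelize CPU-side tokenization, while a DeepSpeed engine configured with ZeRO data parallelism and fp16 mixed precision runs the GPU training step. A script like this would normally be started with the `deepspeed` launcher.

```python
# Hedged sketch: Ray actors do CPU tokenization in parallel; DeepSpeed (ZeRO + fp16)
# handles the GPU training step. Model and hyperparameters are placeholders.
import ray
import torch
import deepspeed
from transformers import AutoTokenizer, AutoModelForSequenceClassification

ray.init()

@ray.remote(num_cpus=1)
def tokenize_batch(texts):
    # CPU worker: tokenize one shard of raw sentiment-analysis text.
    tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    return tok(texts, padding=True, truncation=True, return_tensors="pt")

ds_config = {
    "train_micro_batch_size_per_gpu": 2,        # per-GPU micro-batch size (one of the ablated factors)
    "fp16": {"enabled": True},                  # mixed-precision training
    "zero_optimization": {"stage": 2},          # ZeRO data parallelism
    "optimizer": {"type": "AdamW", "params": {"lr": 2e-5}},
}

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)
engine, _, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config)

# Fan raw text shards out to CPU workers, then feed results to the GPU engine.
shards = [["great movie", "terrible plot"], ["loved it", "would not watch again"]]
labels_per_shard = [torch.tensor([1, 0]), torch.tensor([1, 0])]
for batch, labels in zip(ray.get([tokenize_batch.remote(s) for s in shards]),
                         labels_per_shard):
    batch = {k: v.to(engine.device) for k, v in batch.items()}
    loss = engine(**batch, labels=labels.to(engine.device)).loss
    engine.backward(loss)
    engine.step()
```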

Seeing is not believing: Privacy-preserving facial manipulation using adversarial mask generation and diffusion models

Identified salient features in input images and generated adversarial masks using techniques such as saliency gradient maps, Grad-CAM, and random patch masking. Used latent diffusion models to create anonymized representations of the input images so that private facial information is not transmitted to downstream tasks such as FaceNet's recognition model.
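As an illustration of the saliency-based masking step, the sketch below (model choice, threshold, and helper name are assumptions, not the project's exact pipeline) computes an input-gradient saliency map with a pretrained FaceNet-style recognizer from `facenet_pytorch` and occludes the most salient pixels; the masked image would then be handed to the latent diffusion stage for re-synthesis.

```python
# Hedged sketch of saliency-gradient mask generation; values and names are illustrative.
import torch
from facenet_pytorch import InceptionResnetV1

recognizer = InceptionResnetV1(pretrained="vggface2").eval()

def saliency_mask(image, top_fraction=0.1):
    """Return a binary mask covering the top_fraction most salient pixels."""
    image = image.clone().requires_grad_(True)            # (1, 3, 160, 160) face crop
    embedding = recognizer(image)                          # face embedding as the saliency target
    embedding.norm().backward()                            # gradient of embedding norm w.r.t. pixels
    saliency = image.grad.abs().max(dim=1, keepdim=True).values
    threshold = torch.quantile(saliency.flatten(), 1 - top_fraction)
    return (saliency >= threshold).float()                 # 1 = salient pixel to occlude

face = torch.randn(1, 3, 160, 160)                         # stand-in for a preprocessed face crop
mask = saliency_mask(face)
masked_face = face * (1 - mask)                            # adversarially masked input for diffusion
```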