Available for new opportunities

Fatima Tu Zahra

Data Scientist & Developer

Crafting elegant solutions to complex problems. Specializing in building scalable web applications with modern technologies and best practices.


Projects

Check out some of my recent work

Nexa - AI-Powered Interview Preparation Platform

An intelligent interview preparation platform that combines job search, resume parsing, automated resource curation, and AI-powered coaching to help candidates prepare effectively for their dream jobs.

  • Next.js
  • TypeScript
  • Tailwind CSS
  • Node.js
  • Express
  • OpenAI API
  • Groq API
  • Serp API
  • RAG

ZeroIdle-NLP: Efficient Training Pipeline for Low-Resource NLP Systems

Optimized transformer training for low-resource NLP under GPU constraints by removing preprocessing bottlenecks and improving GPU utilization through asynchronous CPU-GPU pipelining, dynamic batching, and multi-worker tokenization. Reduced training runtime from about 300 hours to about 55 hours while maintaining BLEU and chrF scores on a held-out validation set.

  • PyTorch
  • M2M-100
  • MinatoLoader
  • NLP Systems
  • Dynamic Batching
  • Asynchronous CPU-GPU Pipeline
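
For readers unfamiliar with dynamic batching, here is a minimal sketch of one common form of it: packing samples under a token budget instead of using a fixed batch size. It is an illustration only, not the project's actual code; the function name `dynamic_batches` and the `max_tokens` parameter are hypothetical.

```python
from typing import List, Sequence


def dynamic_batches(token_lengths: Sequence[int], max_tokens: int) -> List[List[int]]:
    """Group sample indices so each padded batch stays within max_tokens.

    Illustrative assumption: the cost of a batch is
    (number of samples) * (longest sample), i.e. its size after padding.
    """
    # Sort indices by length so similar-length samples share a batch,
    # which keeps padding (wasted GPU work) small.
    order = sorted(range(len(token_lengths)), key=lambda i: token_lengths[i])

    batches: List[List[int]] = []
    current: List[int] = []
    for idx in order:
        length = token_lengths[idx]
        if length > max_tokens:
            raise ValueError(f"sample {idx} has {length} tokens, over the budget")
        # Lengths arrive in ascending order, so the new sample is the longest
        # one in the batch and sets the padded width.
        if current and (len(current) + 1) * length > max_tokens:
            batches.append(current)
            current = []
        current.append(idx)
    if current:
        batches.append(current)
    return batches
```

Short samples end up in large batches and long samples in small ones, so each step does a similar amount of work regardless of sequence length.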

Global Energy & Development - Data Analysis and Visualization

A data analysis and visualization project exploring global energy production, consumption, and development trends over the years using interactive charts and maps.

  • D3.js
  • React

Document Summarization with Retrieval-Augmented Generation (RAG) Q&A App

This project implements a Retrieval-Augmented Generation (RAG) pipeline to summarize long documents and answer user queries based on document content.

  • Python
  • Streamlit
  • Chroma
  • SentenceTransformers
  • HuggingFace LLMs
  • LangChain

PixelWonders - .NET Desktop Game

PixelWonders is a Windows Forms application that provides a creative platform for users to create, explore, and interact with pixel art designs and puzzles.

  • C#
  • .NET Framework
  • Windows Forms

Optimized Real-Time Data Warehousing System

A real-time data warehousing system using a hybrid join approach for ETL operations. Implements a star schema data warehouse with streaming data processing capabilities, designed to handle customer transactions, products, stores, and suppliers with concurrent data extraction and in-memory hash table joins combined with disk buffering.

  • Python
  • MySQL
  • Streamlit
  • ETL
  • Hybrid Join Algorithm
  • Threading
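
The idea of joining a stream against disk-resident master data can be sketched as follows. This is a simplified, single-threaded illustration under stated assumptions, not the project's implementation: the function name `hybrid_join`, the `hash_slots` and `partition_size` parameters, and the in-memory list standing in for disk partitions are all hypothetical.

```python
from collections import deque
from itertools import islice
from typing import Dict, Iterable, Iterator, List


def hybrid_join(stream: Iterable[dict], master_rows: List[dict], key: str,
                hash_slots: int = 4, partition_size: int = 2) -> Iterator[dict]:
    """Join stream rows to master rows with a bounded hash table.

    Master data is read one partition at a time (the "disk buffer"),
    while waiting stream rows sit in an in-memory hash table.
    """
    # Stand-in for disk: master rows sorted by key, cut into partitions.
    ordered = sorted(master_rows, key=lambda r: r[key])
    partitions = [ordered[i:i + partition_size]
                  for i in range(0, len(ordered), partition_size)]
    # Index from join key to partition number, so one lookup loads one partition.
    index = {row[key]: p for p, part in enumerate(partitions) for row in part}

    pending: Dict[object, List[dict]] = {}  # hash table: key -> waiting stream rows
    arrival: deque = deque()                # join keys in arrival order
    stream_iter = iter(stream)
    stored = 0                              # stream rows currently in the hash table

    while True:
        # Refill the free hash-table slots from the stream.
        for row in islice(stream_iter, hash_slots - stored):
            pending.setdefault(row[key], []).append(row)
            arrival.append(row[key])
            stored += 1
        if not arrival:
            break  # stream exhausted and nothing left waiting

        oldest = arrival.popleft()
        if oldest not in pending:
            continue  # already joined when an earlier partition was loaded
        part_no = index.get(oldest)
        if part_no is None:
            # No matching master row: drop the unmatched stream rows.
            stored -= len(pending.pop(oldest))
            continue
        # Probe the hash table with every master row in the loaded partition.
        for master in partitions[part_no]:
            for row in pending.pop(master[key], []):
                stored -= 1
                yield {**row, **master}
```

Loading the partition for the oldest waiting key bounds how long any stream row stays in memory, and every other row in that partition gets a chance to match for free.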

RAG-Powered Multimodal Voicebot

An AI-powered query system enabling real-time voice interactions and document-augmented question answering with multi-tenant support. Built during an internship, the system combines speech-to-text, retrieval-augmented generation (RAG), and text-to-speech capabilities for seamless voice-to-voice conversations with WebSocket and REST APIs.

  • Python
  • FastAPI
  • Redis
  • PostgreSQL
  • Celery
  • WebSocket
  • LangChain
  • HuggingFace
  • FAISS
  • PyTorch
  • Faster-Whisper
  • VAD

Global Disaster Resilience Analytics Dashboard

Interactive dashboard analyzing global disaster resilience trends using multi-source datasets and engineered resilience metrics. Built data pipelines and visualizations (choropleths, Sankey diagrams, heatmaps) to explore resilience, impact, and recovery patterns across countries.

  • Tableau
  • Python
  • Pandas
  • Geospatial
  • Data Visualization
  • Feature Engineering

Skills & Expertise

Technologies and tools I use to bring ideas to life

Data Science & Analytics

  • Python
  • Statistics
  • Data Analysis
  • Feature Engineering
  • Data Visualization
  • Pandas
  • NumPy

Backend & Data Systems

  • SQL (PostgreSQL/MySQL)
  • Data Warehousing
  • ETL Pipelines
  • FastAPI
  • REST APIs
  • Redis
  • WebSockets

Programming & Systems

  • Python
  • C++
  • C# (.NET)
  • Data Structures & Algorithms
  • Operating Systems
  • OOP

Visualization, Design & Creative Tech

  • Tableau
  • D3.js
  • Figma
  • Unity
  • Godot
  • UI/UX
  • Data Storytelling
  • Krita

Tools & Workflow

  • Git
  • AWS SageMaker
  • Anaconda
  • RStudio
  • Agile
  • Technical Writing

Machine Learning & AI

  • Machine Learning
  • Deep Learning
  • NLP
  • Agentic AI
  • PyTorch
  • RAG Systems
  • Data Mining