
Research Overview

Welcome to LDF Research. This site documents the research and development projects of the Lanka Data Foundation (LDF).

Our mission is to leverage data and advanced technology to provide insights and tools relevant to Sri Lanka.

Current Projects

Legislative Analysis

A system for analyzing Sri Lankan legislative acts using Large Language Models (LLMs). This project automates the extraction of summaries, entities, and structural details from PDF documents.

DeepSeek OCR

Research into using DeepSeek models for Optical Character Recognition (OCR), aiming to improve text extraction from scanned documents.

Gazette Analysis

Tools for extracting and processing Sri Lankan government gazette data. Includes LLM-based extraction of ministry structures, amendments, and personnel appointments, along with a versioning system to track structural changes over time.
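As a rough illustration of what tracking structural changes over time could look like, the sketch below compares two snapshots of a ministry structure and lists what was added or removed. The `Snapshot` type, the `diff_snapshots` function, and the placeholder ministry and department names are all hypothetical and are not taken from the project's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """One version of the ministry structure (hypothetical shape)."""
    gazette_id: str                        # placeholder identifier for the source gazette
    structure: dict[str, frozenset[str]]   # ministry name -> departments under it


def diff_snapshots(old: Snapshot, new: Snapshot) -> list[tuple[str, str, str]]:
    """Return (change, ministry, department) tuples describing old -> new."""
    changes = []
    # Walk every ministry that appears in either version, in a stable order.
    for ministry in sorted(old.structure.keys() | new.structure.keys()):
        before = old.structure.get(ministry, frozenset())
        after = new.structure.get(ministry, frozenset())
        for dept in sorted(after - before):
            changes.append(("added", ministry, dept))
        for dept in sorted(before - after):
            changes.append(("removed", ministry, dept))
    return changes


# Placeholder data, not real gazette content.
v1 = Snapshot("gazette-1", {"Ministry A": frozenset({"Dept X", "Dept Y"})})
v2 = Snapshot("gazette-2", {
    "Ministry A": frozenset({"Dept X"}),
    "Ministry B": frozenset({"Dept Y"}),
})
changes = diff_snapshots(v1, v2)
```

Here a department moving between ministries shows up as one removal and one addition. A real versioning system would likely also need to handle renames and effective dates, which this sketch ignores.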

AuthData Audit

A reproducible, observable audit framework for validating datasets across the OpenGINXplore open data platform. Includes three-phase auditing (Source Discovery, Data Integrity, App Visibility), full action traceability, and a Next.js visualization dashboard.
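The three phases and the traceability requirement can be pictured with a minimal sketch like the one below. It assumes each phase is a function returning a pass/fail flag plus a detail string, and that the audit stops at the first failing phase. The names `AuditTrace`, `run_audit`, and the check functions are illustrative assumptions, not the framework's actual API.

```python
from dataclasses import dataclass, field
from typing import Callable

# Phase names follow the three phases described above.
PHASES = ("source_discovery", "data_integrity", "app_visibility")

# Hypothetical check signature: dataset name -> (passed, detail).
Check = Callable[[str], tuple[bool, str]]


@dataclass
class AuditTrace:
    """Ordered log of every audit action, so a run can be inspected later."""
    entries: list[dict] = field(default_factory=list)

    def record(self, phase: str, dataset: str, passed: bool, detail: str) -> None:
        self.entries.append(
            {"phase": phase, "dataset": dataset, "passed": passed, "detail": detail}
        )


def run_audit(dataset: str, checks: dict[str, Check], trace: AuditTrace) -> bool:
    """Run the phases in order, recording each result; stop at the first failure."""
    for phase in PHASES:
        passed, detail = checks[phase](dataset)
        trace.record(phase, dataset, passed, detail)
        if not passed:
            return False
    return True


# Stub checks standing in for real source, integrity, and visibility checks.
demo_checks = {
    "source_discovery": lambda name: (True, "source located"),
    "data_integrity": lambda name: (False, "row count mismatch"),
    "app_visibility": lambda name: (True, "visible in app"),
}
demo_trace = AuditTrace()
demo_result = run_audit("demo-dataset", demo_checks, demo_trace)
```

Stopping early is a design choice made for this sketch: a dataset that fails Data Integrity is not checked for App Visibility, and the trace shows exactly which phases ran.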

OpenGIN-X

An interactive query UI for exploring the OpenGIN Data Catalog. OpenGIN-X provides a unified interface to browse, query, and visualize polyglot data in multiple formats (JSON, tabular, or graph), depending on the underlying data structure. Designed for developers and data analysts who need to explore raw data, debug issues, and navigate entity relationships in the OpenGIN knowledge graph.