Event Schedule

9:30am - 10am

Registration

10am - 10:15am

Opening Remarks

10:15am - 11:30am

What is this thing called AI?

Balaraman Ravindran (Department of Data Science & AI, IIT Madras)

AI has captured the popular imagination for several decades, and recently it appears to be fulfilling its promise. Over that time, multiple useful techniques have been developed in this field for solving hard problems. In this talk I will give a very quick overview of the roots of AI and the intuition behind its recent successes.

Talk slides

11:30am - 12pm

Tea Break

12pm - 1:00pm

What is pretraining? How does it help to learn well on less data?

Niloy Ganguly (Department of Computer Science and Engineering, IIT Kharagpur)

I will discuss various ways we can enhance the performance of language generation through pretraining techniques.

Talk slides

1:00pm - 2:00pm

Lunch Break

2:00pm - 3:30pm

Introduction to Large Language Models for the Digital Humanities

Animesh Mukherjee (Department of Computer Science and Engineering, IIT Kharagpur)

This talk will be a very brief primer on the architecture of present-day LLMs, followed by two digital humanities applications, viz., the efficacy of LLMs in (a) online hate speech identification and (b) native language identification.

Talk slides

3:30pm - 4:00pm

Tea Break

4:00pm - 4:45pm

Language models, Markov chains, hidden Markov models and profiles

Rahul Siddharthan (IMSc, Chennai)

We will introduce Markov chains and hidden Markov models (HMMs) which are widely used in computational linguistics as well as bioinformatics. We will briefly discuss "profile HMMs" which are used to describe protein families but can perhaps have applications in linguistics and epigraphy as well.
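As a rough illustration of the Markov chain idea named in this abstract (not taken from the talk itself), the sketch below estimates first-order transition probabilities from a single symbol sequence. The function name transition_probabilities and the toy sequence are assumptions made for this example only.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequence):
    """Estimate first-order Markov transition probabilities from one sequence.

    Hypothetical helper for illustration: counts how often each symbol is
    followed by each other symbol, then normalises the counts per symbol.
    """
    counts = defaultdict(Counter)
    for current, following in zip(sequence, sequence[1:]):
        counts[current][following] += 1
    return {
        state: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for state, followers in counts.items()
    }


# Toy sequence (made up): 'a' is followed by 'b' three times and by 'c' once.
probs = transition_probabilities("abababac")
print(probs["a"])
print(probs["b"])
```

A hidden Markov model adds unobserved states on top of such a chain; that extension is not shown here.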

Talk slides

5:00pm - 5:45pm

Tutorial 1: Encountering writing

Sitabhra Sinha

Using a few examples of real and fictional inscriptions, I will discuss how one can approach the problem of deciphering unknown writing.

Talk slides Lecture handout 1 Lecture handout 2

10am - 11:30am

The Development of AI: From Classical Approaches to Neural Networks

Farhat Habib (Ikigai, Boston)

This talk traces the historical development of artificial intelligence, beginning with the foundational concepts and early milestones. It briefly explores techniques such as linear models, decision trees, and random forests, illustrating their applications in classification, regression, and pattern recognition. The discussion then shifts to the emergence of neural networks, delving into their architecture, training methods, and their role in revolutionizing various AI domains.

Talk slides

11:30am - 12pm

Tea Break

12pm - 1:00pm

The story of writing

Sitabhra Sinha

An introduction to the possible origins of writing and the different types of writing systems seen across history, viz., ideographic, syllabic, logosyllabic, alphabetic, etc.

Talk slides

1:00pm - 2:00pm

Lunch Break

2:00pm - 2:45pm

Indus Valley Civilization: A Brief Introduction

Md. Izhar Ashraf (iCEL & IMSc, Chennai)

This talk will provide context for the inscriptions of the Indus Civilization (2500-1900 BCE), possibly the only remaining major undeciphered writing system.

Talk slides

2:45pm - 3:30pm

Tutorial 2: Web Scraping for Epigraphy

Md. Izhar Ashraf (iCEL & IMSc, Chennai)

I will show how to use simple Python programs to extract inscription data from various sites on the web, which can be useful for building and analysing epigraphic databases.
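The tutorial's own programs are not reproduced on this page. As a minimal sketch of the general idea, assuming a page that lists inscriptions in table cells marked class="inscription" (a hypothetical layout, with made-up placeholder text), Python's standard-library HTMLParser can collect the cell contents:

```python
from html.parser import HTMLParser


class InscriptionParser(HTMLParser):
    """Collect the text of every <td class="inscription"> cell.

    The tag name and class value are assumptions for this example; a real
    site would need its own markup inspected first.
    """

    def __init__(self):
        super().__init__()
        self._inside = False
        self.inscriptions = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "td" and ("class", "inscription") in attrs:
            self._inside = True
            self.inscriptions.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.inscriptions[-1] += data


# Inline placeholder HTML stands in for a downloaded page.
SAMPLE_HTML = """
<table>
  <tr><td class="id">1</td><td class="inscription">sample text A</td></tr>
  <tr><td class="id">2</td><td class="inscription">sample text B</td></tr>
</table>
"""

parser = InscriptionParser()
parser.feed(SAMPLE_HTML)
texts = [t.strip() for t in parser.inscriptions]
print(texts)
```

In practice the HTML would be fetched from a site first, subject to that site's terms of use; the parsing step stays the same.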

Lecture Handout

3:30pm - 4:00pm

Tea Break

4:00pm - 4:45pm

Tutorial 3: Deciphering Brahmi (part 1)

Nandini Mitra

Talk slides

5:00pm - 5:45pm

Tutorial 3: Deciphering Brahmi (part 2)

Nandini Mitra

Talk slides

10am - 11:30am

The Transformer Revolution: Large Language Models and Generative AI

Farhat Habib (Ikigai, Boston)

Building upon the foundations laid in the previous talk, this presentation goes deeper into generative models such as GANs and diffusion models. Then we will take a deeper look at transformer models and large language models. We will examine the self-attention mechanism that underpins transformers, their applications in natural language processing, and the capabilities of models like GPT and BERT. The talk also explores the rapidly evolving field of generative AI, including text, image, and audio generation, and its potential implications.

Talk slides

11:30am - 12pm

Tea Break

12pm - 1:00pm

Markov, Zipf, Shannon: Statistical approaches to analysing inscriptions - Part I

Sitabhra Sinha (iCEL & IMSc, Chennai)

In this talk we shall explore how probabilistic models have shaped our understanding of linguistic sequences, focusing in particular on Markov's work that prefigures that of Claude Shannon on the entropy of written English, as well as the work of Zipf on the frequency of words and the principle of least effort.

Talk slides

1:00pm - 2:00pm

Lunch Break

2:00pm - 3:30pm

Tutorial 4: An introduction to Network Science for the Digital Humanities

Shakti N Menon (IMSc, Chennai)

If we look closely enough, it becomes apparent that networks are everywhere, and it turns out that a wide range of physical, biological and social interactions can be elegantly encapsulated in terms of network descriptions. I will provide a general overview of network science - a field of study focused on extracting information encoded by such networks - and will demonstrate some associated concepts in an interactive tutorial session using the network visualization software Gephi.

Talk slides

3:30pm - 4:00pm

Tea Break

4:00pm - 4:45pm

Decipherment as a Network Enterprise: The Case of Brahmi

Nandini Mitra (iCEL & IMSc, Chennai)

Talk slides

5:00pm - 5:45pm

The Contribution of Corpora and Concordances to the Field of Indus Script Research

C Subramanian (Independent Scholar, Chennai)

The contributions of four major Indus text corpora and concordances, Hunter (1934), Mahadevan (1977), Parpola (1973-2010), and Wells (1998-present), towards the textual and contextual analysis of the Indus script.

Talk slides

10am - 11:30am

Unraveling the Structure of the Indus Script: A Computational Perspective

Nisha Yadav (TIFR, Mumbai)

The Indus script has defied decipherment. The absence of a concrete understanding regarding its structure poses challenges in objectively assessing any purported decipherments. To address this gap, we have employed diverse computational techniques to analyze the structure of the Indus script. Our research aims to uncover patterns within Indus writing and investigate its fundamental principles without presupposing its content. In this presentation, I will provide an overview of our computational investigations into the Indus script.

Talk slides

11:30am - 12pm

Tea Break

12pm - 1:00pm

String alignment, evolution, phylogenetics and how languages evolve

Rahul Siddharthan (IMSc, Chennai)

We will describe some basic string-matching algorithms, and introduce evolution of DNA sequence and phylogenetics, then look at applications of the same ideas in linguistics.
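The abstract does not say which string-matching algorithms are covered. As one common example of the family, the sketch below computes the Levenshtein edit distance by dynamic programming; the function name edit_distance and the unit costs for insertion, deletion and substitution are assumptions of this example.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b, with unit costs.

    Keeps only the previous row of the dynamic-programming table, so
    memory use is proportional to len(b).
    """
    previous = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            substitution = previous[j - 1] + (ca != cb)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


print(edit_distance("kitten", "sitting"))
```

The same table, with costs tuned to the data, underlies sequence alignment for DNA and can be applied to comparing word forms across languages.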

Talk slides

1:00pm - 2:00pm

Lunch Break

2:00pm - 3:30pm

Understanding the Indus Script in the context of its culture

Mayank Vahia (Ex-TIFR, Mumbai)

The grammar of the Harappan Script is now fairly well understood. However, all interpretative models about the Harappan Script have fallen short in their consistency with its grammar. The cultural context of the Indus writing is also well understood. In the present talk, we will discuss the miniatures that were used by the Harappans to express themselves in a variety of ways. We will then discuss the larger issues of the time evolution of the Harappan Civilisation and look at the possible scenarios of how it evolved and changed. We will then summarise by discussing the possible new avenues about the Harappan Script that can be pursued to gain better insights.

Talk slides

3:30pm - 4:00pm

Tea Break

4:00pm - 4:45pm

Markov, Zipf, Shannon: Statistical approaches to analysing inscriptions - Part II

Sitabhra Sinha (iCEL & IMSc, Chennai)

This is the concluding part of the talk whose first part was yesterday, focusing on how Shannon quantified information (measured by the unit of a "bit") using the concept of entropy and how it applies to language. We will also look at Zipf's law of abbreviation and the principle of least effort that he proposed to explain it.
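For readers unfamiliar with the quantity mentioned here, Shannon entropy in bits is H = -sum over symbols of p(x) * log2 p(x). The sketch below, with a hypothetical function name and toy inputs, estimates it from the symbol frequencies of a single string; it is an illustration and not the speaker's analysis.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Entropy in bits per symbol, using observed relative frequencies."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Four equally likely symbols need 2 bits each; two equally likely symbols need 1.
print(shannon_entropy("abcd"))
print(shannon_entropy("aabb"))
```

This single-symbol estimate ignores dependence between neighbouring symbols, which is what Markov-style conditional entropies capture.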

Talk slides

5:00pm - 6:00pm

Figuring out the direction of writing in ancient texts: Directional asymmetry in language

Md. Izhar Ashraf (iCEL & IMSc, Chennai)

This study uncovers a universal pattern in language, revealing asymmetric sign distribution at word boundaries, and applies this insight to deduce the writing direction of undeciphered Indus inscriptions, showing its utility in archaeological decipherment.

Talk slides Lecture Handout

10am - 11:00am

Tracking the Evolutionary Trail: Early Brāhmī to Proto-regional Scripts

Rajat Sanyal (University of Calcutta, Kolkata)

Writing in historical India, almost coterminous with the use of Brāhmī script throughout the subcontinent, witnessed an essentially chequered trajectory of evolution from the third century BCE till the advent of proto-regional scripts in the seventh-eighth centuries. This talk is aimed at taking a tour through this route, focusing on factors behind and processes leading to these varying lines of development.

Talk slides

11:00am - 11:30am

Tutorial 5: Reading Siddhamātr̥kā

Rajat Sanyal (University of Calcutta, Kolkata)

This talk will be a reading session on the most widely used proto-regional script of northern India, called Siddhamātr̥kā.

Talk slides

11:30am - 12pm

Tea Break

12pm - 1:00pm

One 'Sanskrit-Mad' scholar and the Beginning of the Study of Epigraphy in India

Sayantani Pal (University of Calcutta, Kolkata)

The study of Indian epigraphy was in its infancy in the late 18th century. The initial phase began with the discovery of inscriptions and their publication by European scholars, apparently with the help of Indian pundits. They appeared in journals like Asiatic Researches and Journal of the Asiatic Society (both published by the Asiatic Society, Calcutta, established in 1784).
The foundation of research in Indian epigraphy was thus laid, and in this process of building interest in Indian epigraphy, the name of Charles Wilkins (1749-1836) stands out most prominently. Wilkins was among the earliest scholars to decipher inscriptions at a time when those epigraphic records were unintelligible to others. When Wilkins began his study, there was no idea of epigraphs as sources of history, and none could read the script. How Wilkins managed to decipher inscriptions of the sixth, ninth and tenth centuries is still a wonder. This talk will focus on the method of handling epigraphic data in the initial phase of Indian epigraphic studies (18th-19th centuries).

Talk slides

1:00pm - 2:00pm

Lunch Break

2:00pm - 3:30pm

Deciphering Linguistic Patterns: A Universal Segmentation Approach Across Languages

Md. Izhar Ashraf (iCEL & IMSc, Chennai)

The study introduces a segmentation method to analyze linguistic sequences, revealing a universal pattern of recurring word segments across languages. It suggests inherent cognitive or phonotactic constraints in language processing and word composition.

Talk slides

3:30pm - 4:00pm

Tea Break

4:00pm - 4:30pm

Valedictory Session

08:30am - 02:30pm

About The Workshop

Organizers

Md Izhar Ashraf (IMSc Computational Epigraphy Lab)

Sitabhra Sinha (IMSc Computational Epigraphy Lab)

Event Schedule

Monday

25 March

Tuesday

26 March

Wednesday

27 March

Thursday

28 March

Friday

29 March

Saturday

30 March
