Indic Vision and Language Datasets

Improve model performance with our off-the-shelf datasets.

Language: All (14), Hindi (1), English (1), Bengali (1), Telugu (1), Haryanvi (1), Bodo (1), Bhojpuri (1), Malayalam (1), Punjabi (1), Maithili (1)

Data Type: All (14), Conversational (10), Vision (4)

Sample Rate: All (14), 48 kHz (14)

1,351 Hours - Telugu - Conversational Audio Dataset

Telugu

Conversational Audio

Off-The-Shelf

Our Conversational Data in Telugu offers comprehensive and authentic dialogues of Indians conversing in Telugu. This dataset features conversations that span a wide range of topics, including daily life, business, education, and more. It includes diverse speakers from different regions of India, capturing various accents and dialects to provide a rich linguistic resource.

The data is collected from natural, spontaneous conversations to ensure authenticity, and each conversation is accurately transcribed with annotations for contextual understanding. Additionally, we offer the flexibility to tailor the topics, conversations, and scenarios according to the specific needs of your company, ensuring that the dataset aligns perfectly with your requirements.

10,000 Hours - Punjabi - Conversational Audio Dataset

Punjabi

Conversational Audio

Off-The-Shelf

Our Conversational Audio in Punjabi offers comprehensive and authentic dialogues of individuals conversing in Punjabi. This dataset features conversations that span a wide range of topics, including daily life, business, education, and more. It includes diverse speakers from different regions, capturing various accents and dialects to provide a rich linguistic resource.

The data is collected from natural, spontaneous conversations to ensure authenticity, and each conversation is accurately transcribed with annotations for contextual understanding. Additionally, we offer the flexibility to tailor the topics, conversations, and scenarios according to the specific needs of your company, ensuring that the dataset aligns perfectly with your requirements.

10,392 Hours - Indian English - Conversational Audio Dataset

English

Conversational Audio

Off-The-Shelf

Our Conversational Audio in Indian English offers comprehensive and authentic dialogues of individuals conversing in Indian English. This dataset features conversations that span a wide range of topics, including daily life, business, education, and more. It includes diverse speakers from different regions of India, capturing various accents and dialects to provide a rich linguistic resource.

The data is collected from natural, spontaneous conversations to ensure authenticity, and each conversation is accurately transcribed with annotations for contextual understanding. Additionally, we offer the flexibility to tailor the topics, conversations, and scenarios according to the specific needs of your company, ensuring that the dataset aligns perfectly with your requirements.

10,885 Hours - Hindi - Conversational Audio Dataset

Hindi

Conversational Audio

Off-The-Shelf

Our Conversational Data in Hindi offers comprehensive and authentic dialogues of Indians conversing in Hindi. This dataset features conversations that span a wide range of topics, including daily life, business, education, and more. It includes diverse speakers from different regions of India, capturing various accents and dialects to provide a rich linguistic resource.

The data is collected from natural, spontaneous conversations to ensure authenticity, and each conversation is accurately transcribed with annotations for contextual understanding. Additionally, we offer the flexibility to tailor the topics, conversations, and scenarios according to the specific needs of your company, ensuring that the dataset aligns perfectly with your requirements.
