Hadithi
Convert YouTube, Torrent, Enterprise Videos to LLM Datasets
Listed in categories:
Artificial Intelligence, Open Source, Developer Tools
Description
Hadithi is an open-source command-line tool designed for AI and ML developers to generate high-quality video datasets for fine-tuning and training large language models (LLMs). It streamlines the video processing workflow by organizing, renaming, segmenting, and validating video files, making it easier to prepare datasets for machine learning tasks.
How to use Hadithi?
To set up and run Hadithi, ensure you have Ubuntu 18.04 LTS, Bash, ExifTool, FFmpeg, and Visual Studio Code installed. Then follow the instructions in the README file to run the commands that organize and process your video files.
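Before working through the README, it can help to confirm that the command-line prerequisites are on your PATH. The following is a minimal sketch, not part of Hadithi itself: the executable names `bash`, `exiftool`, and `ffmpeg` are assumptions based on the tools listed above, and `missing_tools` is a hypothetical helper. It does not check the Ubuntu version or the editor.

```python
import shutil

# Assumed executable names for the listed prerequisites (not taken from the README).
REQUIRED_TOOLS = ("bash", "exiftool", "ffmpeg")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the entries of `tools` that cannot be located on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing prerequisites: " + ", ".join(missing))
    else:
        print("All command-line prerequisites found.")
```

The `which` parameter exists only so the lookup can be swapped out when testing; in normal use the default `shutil.which` is enough.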
Core features of Hadithi:
1️⃣ Organize videos into folders
2️⃣ Rename videos with timestamps
3️⃣ Segment videos into clips
4️⃣ Detect scenes in videos
5️⃣ Batch process videos
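To give a sense of what segmenting a video into clips involves, here is a rough sketch that builds an FFmpeg command using its segment muxer. It is an illustration under stated assumptions, not Hadithi's actual implementation: the function names, the 10-second default, and the output naming pattern are all hypothetical, and the README remains the reference for the real commands.

```python
import subprocess
from pathlib import Path


def build_segment_command(source, out_dir, clip_seconds=10):
    """Build an ffmpeg command that splits `source` into fixed-length clips.

    Hypothetical helper: the naming pattern and default length are assumptions.
    """
    source = Path(source)
    # %03d is expanded by ffmpeg into a zero-padded clip index (000, 001, ...).
    pattern = Path(out_dir) / f"{source.stem}_%03d{source.suffix}"
    return [
        "ffmpeg",
        "-i", str(source),
        "-map", "0",                 # keep all streams from the input
        "-c", "copy",                # copy streams without re-encoding
        "-f", "segment",             # use ffmpeg's segment muxer
        "-segment_time", str(clip_seconds),
        "-reset_timestamps", "1",    # start each clip's timestamps at zero
        str(pattern),
    ]


def segment_video(source, out_dir, clip_seconds=10):
    """Create `out_dir` if needed and run the segmenting command."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_segment_command(source, out_dir, clip_seconds), check=True)
```

For example, `segment_video("talk.mp4", "clips", clip_seconds=30)` would write `clips/talk_000.mp4`, `clips/talk_001.mp4`, and so on. Because the streams are copied rather than re-encoded, cuts fall on keyframes, so clip lengths are approximate.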
Why use Hadithi?
# | Use case | Status
---|---|---
1 | Preparing video datasets for training large language models | ✅
2 | Automating video processing tasks for machine learning projects | ✅
3 | Enhancing video data quality for AI applications | ✅
Who developed Hadithi?
Hadithi is developed by QetLab, a team focused on creating open-source tools for data processing and machine learning applications.