Working env
- create conda env
- install Python, version > 3.7 and < 3.11 (3.10 used here)
- install Chocolatey (https://chocolatey.org/install), Windows only
- install ffmpeg (on Ubuntu or Debian):
  sudo apt update && sudo apt install ffmpeg
- install Rust
  pip install setuptools-rust
- install youtube-dl using:
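Once the steps in this list are done, a quick sanity check of the environment can save some debugging later. The following is a minimal sketch, not part of the original setup: the helper names python_version_ok and missing_tools are hypothetical, and we assume the version range is meant strictly (3.8, 3.9 and 3.10 pass, 3.7 and 3.11 do not).

```python
import shutil
import sys


def python_version_ok(version=None):
    """True if the version is above 3.7 and below 3.11 (assumed strict bounds)."""
    # Compare only (major, minor); default to the running interpreter.
    major_minor = tuple((version or sys.version_info)[:2])
    return (3, 7) < major_minor < (3, 11)


def missing_tools(tools=("ffmpeg",)):
    """Return the command-line tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    print("Python version supported:", python_version_ok())
    print("Missing tools:", missing_tools() or "none")
```

Running this inside the activated conda environment should report a supported Python version and no missing tools. Other executables can be checked the same way by passing their names to missing_tools.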
· 🤗 Repositories
In order to create and manage 🤗 Repositories (datasets, for example), we need to install the huggingface_hub CLI and run the login command. Following the 🤗 docs, this can be done by running these commands in our conda environment:
The huggingface-cli login command will ask us for a token, which is generated from our Hugging Face account settings. We only have to follow the instructions that appear in the terminal.
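To confirm that the login step left a token behind, we can look for it from Python. This is only a sketch under stated assumptions: the helper find_hf_token is hypothetical, and it assumes the token is available either in the HF_TOKEN environment variable or in the file .cache/huggingface/token under the home directory. Other huggingface_hub versions or a custom HF_HOME setting may store it elsewhere.

```python
import os
from pathlib import Path


def find_hf_token(env=None, home=None):
    """Return a stored Hugging Face token, or None if none is found.

    Assumption: the token is in the HF_TOKEN environment variable or in
    <home>/.cache/huggingface/token; a custom HF_HOME is not handled here.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # An environment variable, if set, is checked first.
    token = env.get("HF_TOKEN", "").strip()
    if token:
        return token

    # Otherwise fall back to the assumed default token file.
    token_file = home / ".cache" / "huggingface" / "token"
    if token_file.is_file():
        return token_file.read_text(encoding="utf-8").strip() or None
    return None
```

If find_hf_token() returns None after logging in, the token is probably stored in a different location on that machine, and running the login command again will show where it is saved.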
· 🤗 Datasets library
Before we start creating our flamenco audio dataset, we need to set up the environment and install the appropriate packages. Hugging Face (🤗) Datasets is a library for easily accessing and sharing datasets. It works on Python 3.7+. We will follow the installation instructions provided in the 🤗 docs.
We go for the conda option and install it in our whisper environment using:
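After the installation has finished, it is worth checking that the packages are visible from the whisper environment. The sketch below is not the install command itself. It is a hypothetical helper, missing_packages, that uses only the standard library and assumes the import names are datasets and huggingface_hub.

```python
import importlib.util


def missing_packages(names=("datasets", "huggingface_hub")):
    """Return the top-level package names that cannot be found in this environment."""
    # find_spec returns None when a top-level package is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("All packages found" if not missing else f"Missing: {missing}")
```

An empty list means both packages can be imported. Any name that is returned still needs to be installed in the active environment.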