Or by cloning the repository with:

```
git clone https://github.com/soderstromkr/transcribe.git
```
### Python Version **(any platform including Mac users)**
This is recommended if you don't have Windows, if you have Windows and prefer Python, or if you want to use GPU acceleration (PyTorch and CUDA) for faster transcriptions. I would generally recommend this method anyway, but I understand that not everyone wants to go through the installation process for Python, Anaconda, and the other required packages.
1. This script was made and tested in an Anaconda environment with Python 3.10. I recommend miniconda for a smaller installation, especially if you're not familiar with Python.

   See [here](https://docs.anaconda.com/free/miniconda/miniconda-install/) for instructions. You will **need administrator rights**.
2. Whisper also requires some additional libraries. The [setup](https://github.com/openai/whisper#setup) page states: "The codebase also depends on a few Python packages, most notably HuggingFace Transformers for their fast tokenizer implementation and ffmpeg-python for reading audio files."

   Users might not need to install Transformers explicitly. However, a conda installation might be needed for ffmpeg[^1], which takes care of setting up the PATH variables.
   From the Anaconda Prompt (which should now be installed on your system; find it with the search function), type or copy the following:

   ```
   conda install -c conda-forge ffmpeg-python
   ```
   You can also choose not to use Anaconda (or miniconda) and use plain Python. In that case, you need to [download and install FFmpeg](https://ffmpeg.org/download.html) (and potentially add it to your PATH). See these [WikiHow instructions](https://www.wikihow.com/Install-FFmpeg-on-Windows) for a walkthrough.
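If you go the plain-Python route, a quick standard-library check confirms that FFmpeg is actually visible on your PATH (a minimal sketch; the printed path will differ per system):

```python
import shutil

# shutil.which returns the executable's full path, or None if it is not on PATH
ffmpeg_path = shutil.which("ffmpeg")

if ffmpeg_path is None:
    print("ffmpeg not found on PATH - install it or add it to PATH")
else:
    print("ffmpeg found at:", ffmpeg_path)
```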
3. The main functionality comes from openai-whisper; see their [page](https://github.com/openai/whisper) for details. The app also uses some additional packages (colorama and customtkinter); install everything with the following commands:

   ```
   pip install -U openai-whisper
   pip install -r requirements.txt
   ```
4. Run the app:

   1. For **Windows**: In the same folder as the *app.py* file, run the app from the Anaconda Prompt with ```python app.py```, or with the batch file called *run_Windows.bat*, which assumes you have conda installed and are in the base environment. (This is for simplicity; users are usually advised to create a dedicated environment, see [here](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#creating-an-environment-with-commands) for more info.) Just make sure you have the correct environment (right-click on the file and select edit to make any changes).

   2. For **Mac**: I haven't figured out a better way to do this; see [the instructions here](Mac_instructions.md).
**Note:** If you want to download a model first and then go offline for transcription, I recommend running the model once on the default sample folder, which will download the model locally.
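To see whether a model has already been cached for offline use, you can peek at the library's cache directory (a small sketch; `~/.cache/whisper` is openai-whisper's default download location, which is an assumption if you've overridden it):

```python
from pathlib import Path

# openai-whisper stores downloaded models under ~/.cache/whisper by default
cache_dir = Path.home() / ".cache" / "whisper"

if cache_dir.exists():
    print("cached models:", [p.name for p in cache_dir.glob("*.pt")])
else:
    print("no models cached yet at", cache_dir)
```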
## GPU Support

This program **does support running on NVIDIA GPUs**, which can significantly speed up transcription times. To use GPU acceleration, you need to have the correct version of PyTorch installed with CUDA support.
### Installing PyTorch with CUDA Support

If you have an NVIDIA GPU and want to take advantage of GPU acceleration, you can install a CUDA-enabled version of PyTorch using:
```
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
```
**Note:** The command above installs PyTorch with CUDA 12.1 support. Make sure your NVIDIA GPU drivers are compatible with CUDA 12.1. You can check your CUDA version by running `nvidia-smi` in your terminal.

If you need a different CUDA version, visit the [PyTorch installation page](https://pytorch.org/get-started/locally/) to generate the appropriate installation command for your system.
### Verifying GPU Support

After installation, you can verify that PyTorch can detect your GPU by running:
```python
import torch

print(torch.cuda.is_available())      # Should print True if GPU is available
print(torch.cuda.get_device_name(0))  # Should print your GPU name
```
If GPU is not detected, the program will automatically fall back to CPU processing, though this will be slower.
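That fallback can be sketched in a few lines (assuming PyTorch; the try/except keeps the sketch runnable even when PyTorch is absent, and the app's real logic may differ):

```python
# Device-selection sketch: prefer an NVIDIA GPU, fall back to CPU.
try:
    import torch
    device = "cuda" if torch.cuda.is_available() else "cpu"
except ImportError:
    device = "cpu"  # PyTorch missing entirely: CPU is the only option

print("transcribing on:", device)
```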
## Usage

1. When launched, the app will also open a terminal that shows some additional information.

2. Select the folder containing the audio or video files you want to transcribe by clicking the "Browse" button next to the "Folder" label. This will open a file dialog where you can navigate to the desired folder. Remember, you won't be choosing individual files but whole folders!
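Under the hood, folder-based selection amounts to scanning a directory for media files. A sketch of that idea (the extension list here is illustrative, not necessarily the exact set the app accepts):

```python
from pathlib import Path

# Illustrative media extensions; the app's actual list may differ
MEDIA_EXTS = {".mp3", ".wav", ".m4a", ".mp4", ".mkv"}

def list_media_files(folder: str) -> list[Path]:
    """Return media files directly inside `folder`, sorted by name."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in MEDIA_EXTS
    )

print(list_media_files("."))
```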