Compare commits
19 Commits
| SHA1 |
|---|
| 7d3fe1ba26 |
| da42a6e4cc |
| 0dab0d9bea |
| 953c71ab28 |
| 5522bdd575 |
| 861c470330 |
| 6de6d4b2ff |
| 01552cc7cb |
| 049a168c81 |
| 56a925463f |
| fe60b04020 |
| ff06a257f2 |
| 5e31129ea2 |
| 3f0bca02b7 |
| 488e78a5ae |
| 829a054300 |
| 462aae12ca |
| fec9190ba1 |
| 0dde25204d |
+25
@@ -0,0 +1,25 @@
+# Python cache
+__pycache__/
+*.py[cod]
+*$py.class
+
+# Virtual environments
+venv/
+env/
+ENV/
+
+# IDE
+.vscode/
+.idea/
+*.swp
+*.swo
+*~
+
+# OS
+.DS_Store
+Thumbs.db
+
+# Build artifacts
+dist/
+build/
+*.egg-info/
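To see what the glob-style entries in the ignore list above would catch, here is a minimal sketch using Python's standard `fnmatch` module. This is only an approximation: `fnmatch` does not implement Git's full ignore rules (directory entries such as `venv/`, negation, and path anchoring are not modelled), and the helper name `ignored_by` is hypothetical.

```python
from fnmatch import fnmatchcase

# File-name patterns copied from the ignore list; directory entries are left
# out because fnmatch cannot express Git's trailing-slash semantics.
PATTERNS = ["*.py[cod]", "*$py.class", "*.swp", "*.swo", "*~"]


def ignored_by(name, patterns=PATTERNS):
    """Return the patterns that match a bare file name (case-sensitive)."""
    return [pattern for pattern in patterns if fnmatchcase(name, pattern)]


if __name__ == "__main__":
    for candidate in ["app.pyc", "app.py", "Module$py.class", "notes.txt~"]:
        print(candidate, "->", ignored_by(candidate))
```

The character class in `*.py[cod]` covers `.pyc`, `.pyo` and `.pyd` files in a single entry, while plain `.py` sources are left alone.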
@@ -27,29 +27,52 @@ Or by cloning the repository with:
 git clone https://github.com/soderstromkr/transcribe.git
 ```
 ### Python Version **(any platform including Mac users)**
 This is recommended if you don't have Windows, if you have Windows and use Python, or if you want to use GPU acceleration (PyTorch and CUDA) for faster transcriptions. I would generally recommend this method anyway, but I can understand not everyone wants to go through the installation process for Python, Anaconda and the other required packages.
-1. This script was made and tested in an Anaconda environment with Python 3.10. I recommend this method if you're not familiar with Python.
-See [here](https://docs.anaconda.com/anaconda/install/index.html) for instructions. You might need administrator rights.
-2. Whisper requires some additional libraries. The [setup](https://github.com/openai/whisper#setup) page states: "The codebase also depends on a few Python packages, most notably HuggingFace Transformers for their fast tokenizer implementation and ffmpeg-python for reading audio files."
-Users might not need to specifically install Transformers. However, a conda installation might be needed for ffmpeg[^1], which takes care of setting up PATH variables. From the anaconda prompt, type or copy the following:
+1. This script was made and tested in an Anaconda environment with Python 3.10. I recommend Miniconda, which is a smaller installation, especially if you're not familiar with Python.
+See [here](https://docs.anaconda.com/free/miniconda/miniconda-install/) for instructions. You will **need administrator rights**.
+2. Whisper also requires some additional libraries. The [setup](https://github.com/openai/whisper#setup) page states: "The codebase also depends on a few Python packages, most notably HuggingFace Transformers for their fast tokenizer implementation and ffmpeg-python for reading audio files."
+Users might not need to specifically install Transformers. However, a conda installation might be needed for ffmpeg[^1], which takes care of setting up PATH variables.
+
+From the Anaconda Prompt (which should now be installed on your system; find it with the search function), type or copy the following:
 ```
 conda install -c conda-forge ffmpeg-python
 ```
-3. The main functionality comes from openai-whisper. See their [page](https://github.com/openai/whisper) for details. As of 2023-03-22 you can install via:
+You can also choose not to use Anaconda (or Miniconda) and use Python directly. In that case, you need to [download and install FFMPEG](https://ffmpeg.org/download.html) (and potentially add it to your PATH). See here for [WikiHow instructions](https://www.wikihow.com/Install-FFmpeg-on-Windows).
+
+3. The main functionality comes from openai-whisper. See their [page](https://github.com/openai/whisper) for details. It also uses some additional packages (colorama and customtkinter); install them with the following command:
 ```
-pip install -U openai-whisper
+pip install -r requirements.txt
 ```
-4. To run the app built on TKinter and TTKthemes. If using these options, make sure they are installed in your Python build. You can install them and colorama via pip.
+4. Run the app:
+1. For **Windows**: In the same folder as the *app.py* file, run the app from Anaconda prompt by running
+```python app.py```
+or with the batch file called run_Windows.bat, which assumes you have conda installed and are using the base environment. (This is for simplicity; users are usually advised to create an environment, see [here](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#creating-an-environment-with-commands) for more info.) Just make sure the file points to the correct environment (right-click the file and press Edit to make any changes).
+2. For **Mac**: Haven't figured out a better way to do this, see [the instructions here](Mac_instructions.md)
+
+**Note** If you want to download a model first, and then go offline for transcription, I recommend running the model with the default sample folder, which will download the model locally.
+
+## GPU Support
+This program **does support running on NVIDIA GPUs**, which can significantly speed up transcription times. To use GPU acceleration, you need to have the correct version of PyTorch installed with CUDA support.
+
+### Installing PyTorch with CUDA Support
+If you have an NVIDIA GPU and want to take advantage of GPU acceleration, you can install a CUDA-enabled version of PyTorch using:
 ```
-pip install colorama
+pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
 ```
-and
+
+**Note:** The command above installs PyTorch with CUDA 12.1 support. Make sure your NVIDIA GPU drivers are compatible with CUDA 12.1. You can check your CUDA version by running `nvidia-smi` in your terminal.
+
+If you need a different CUDA version, visit the [PyTorch installation page](https://pytorch.org/get-started/locally/) to generate the appropriate installation command for your system.
+
+### Verifying GPU Support
+After installation, you can verify that PyTorch can detect your GPU by running:
+```python
+import torch
+print(torch.cuda.is_available()) # Should print True if GPU is available
+print(torch.cuda.get_device_name(0)) # Should print your GPU name
 ```
-pip install customtkinter
-```
-5. Run the app:
-1. For **Windows**: In the same folder as the *app.py* file, run the app from terminal by running ```python app.py``` or with the batch file called run_Windows.bat (for Windows users), which assumes you have conda installed and in the base environment (This is for simplicity, but users are usually advised to create an environment, see [here](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#creating-an-environment-with-commands) for more info) just make sure you have the correct environment (right click on the file and press edit to make any changes). If you want to download a model first, and then go offline for transcription, I recommend running the model with the default sample folder, which will download the model locally.
-2. For **Mac**: Haven't figured out a better way to do this, see [the instructions here](Mac_instructions.md)
+
+If a GPU is not detected, the program will automatically fall back to CPU processing, though this will be slower.
 
 ## Usage
 1. When launched, the app will also open a terminal that shows some additional information.
 2. Select the folder containing the audio or video files you want to transcribe by clicking the "Browse" button next to the "Folder" label. This will open a file dialog where you can navigate to the desired folder. Remember, you won't be choosing individual files but whole folders!
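The GPU verification snippet in the README change calls `torch.cuda.get_device_name(0)` unconditionally, which can raise an error on a machine without a usable CUDA device. A guarded variant might look like the sketch below. The function name `describe_accelerator` is hypothetical and not part of the repository; the code assumes only that PyTorch may or may not be installed.

```python
import importlib.util


def describe_accelerator() -> str:
    """Return a one-line summary of the CUDA device PyTorch can see, if any."""
    # Check for PyTorch without importing it, so the function also works
    # in environments where the package is missing.
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed"

    import torch

    if torch.cuda.is_available():
        # Only query the device name once CUDA is known to be usable.
        return f"CUDA GPU: {torch.cuda.get_device_name(0)}"
    return "No CUDA GPU detected; transcription will run on the CPU"


if __name__ == "__main__":
    print(describe_accelerator())
```

Returning a string instead of printing inside the function keeps the check easy to reuse, for example in a start-up log message.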
@@ -0,0 +1,3 @@
+openai-whisper
+customtkinter
+colorama
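After `pip install -r requirements.txt`, a quick way to confirm the three packages are importable is sketched below. The mapping from distribution name to import name (`openai-whisper` is imported as `whisper`) is stated here as an assumption of this example, and the helper `missing_packages` is hypothetical.

```python
import importlib.util

# Distribution name (as listed in requirements.txt) -> top-level module name.
REQUIRED = {
    "openai-whisper": "whisper",
    "customtkinter": "customtkinter",
    "colorama": "colorama",
}


def missing_packages(required: dict[str, str]) -> list[str]:
    """Return the distribution names whose module cannot be found."""
    return [
        dist
        for dist, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

`find_spec` only locates the module and does not import it, so the check stays fast even though importing `whisper` itself pulls in PyTorch.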
+12
-6
@@ -2,7 +2,7 @@ import os
 import datetime
 from glob import glob
 import whisper
-from torch import cuda, Generator
+from torch import backends, cuda, Generator
 import colorama
 from colorama import Back,Fore
 colorama.init(autoreset=True)
@@ -39,14 +39,20 @@ def transcribe(path, glob_file, model=None, language=None, verbose=False):
     - The transcribed text files will be saved in a "transcriptions" folder
       within the specified path.
 
-    """
-    # Check for GPU acceleration
-    if cuda.is_available():
+    """
+    # Check for GPU acceleration and set device
+    if backends.mps.is_available():
+        device = 'mps'
+        Generator('mps').manual_seed(42)
+    elif cuda.is_available():
+        device = 'cuda'
         Generator('cuda').manual_seed(42)
     else:
+        device = 'cpu'
         Generator().manual_seed(42)
-    # Load model
-    model = whisper.load_model(model)
+
+    # Load model on the correct device
+    model = whisper.load_model(model, device=device)
     # Start main loop
     files_transcripted=[]
     for file in glob_file:
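The device selection introduced in this hunk follows a fixed priority: Apple's MPS backend first, then CUDA, then CPU. A minimal standalone sketch of that ordering is shown below. The function names `pick_device` and `detect_device` are hypothetical and do not exist in the repository, and the `getattr` guard is an added assumption to cover PyTorch builds that lack `torch.backends.mps`.

```python
def pick_device(mps_available: bool, cuda_available: bool) -> str:
    """Apply the priority used in the hunk: MPS, then CUDA, then CPU."""
    if mps_available:
        return "mps"
    if cuda_available:
        return "cuda"
    return "cpu"


def detect_device() -> str:
    """Probe PyTorch (if installed) and return the preferred device string."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to accelerate with.
        return "cpu"

    # Older PyTorch builds may not expose torch.backends.mps at all.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_ok = mps_backend is not None and mps_backend.is_available()
    return pick_device(mps_ok, torch.cuda.is_available())


if __name__ == "__main__":
    print("Selected device:", detect_device())
```

Separating the pure priority logic from the hardware probe keeps the ordering testable on machines without a GPU. The returned string can be passed as the `device` argument of `whisper.load_model`, as the hunk does.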
Binary file not shown.