Compare commits
22 Commits
| SHA1 |
|---|
| 7d3fe1ba26 |
| da42a6e4cc |
| 0dab0d9bea |
| 953c71ab28 |
| 5522bdd575 |
| 861c470330 |
| 6de6d4b2ff |
| 01552cc7cb |
| 049a168c81 |
| 56a925463f |
| fe60b04020 |
| ff06a257f2 |
| 5e31129ea2 |
| 3f0bca02b7 |
| 488e78a5ae |
| 829a054300 |
| 462aae12ca |
| fec9190ba1 |
| 0dde25204d |
| b611aa6b8c |
| 7d50d5f4cf |
| 7799d03960 |
**.gitignore** (new file)

```diff
@@ -0,0 +1,25 @@
+# Python cache
+__pycache__/
+*.py[cod]
+*$py.class
+
+# Virtual environments
+venv/
+env/
+ENV/
+
+# IDE
+.vscode/
+.idea/
+*.swp
+*.swo
+*~
+
+# OS
+.DS_Store
+Thumbs.db
+
+# Build artifacts
+dist/
+build/
+*.egg-info/
```
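The bracket pattern `*.py[cod]` in the new `.gitignore` covers the `.pyc`, `.pyo`, and `.pyd` compiled-Python suffixes in a single rule. As an illustration (not part of the repository), Python's stdlib `fnmatch` uses the same glob syntax:

```python
import fnmatch

# '*.py[cod]' matches files ending in .pyc, .pyo, or .pyd, but not .py itself
pattern = '*.py[cod]'
print(fnmatch.fnmatch('module.pyc', pattern))  # True
print(fnmatch.fnmatch('module.pyd', pattern))  # True
print(fnmatch.fnmatch('module.py', pattern))   # False
```

Git's own glob matching for `.gitignore` behaves the same way for this pattern.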
**README.md**

````diff
@@ -27,29 +27,52 @@ Or by cloning the repository with:
 git clone https://github.com/soderstromkr/transcribe.git
 ```
 ### Python Version **(any platform including Mac users)**
 This is recommended if you don't have Windows, have Windows and use Python, or want to use GPU acceleration (PyTorch and CUDA) for faster transcriptions. I would generally recommend this method anyway, but I can understand that not everyone wants to go through the installation process for Python, Anaconda and the other required packages.
-1. This script was made and tested in an Anaconda environment with Python 3.10. I recommend this method if you're not familiar with Python.
-   See [here](https://docs.anaconda.com/anaconda/install/index.html) for instructions. You might need administrator rights.
-2. Whisper requires some additional libraries. The [setup](https://github.com/openai/whisper#setup) page states: "The codebase also depends on a few Python packages, most notably HuggingFace Transformers for their fast tokenizer implementation and ffmpeg-python for reading audio files."
-   Users might not need to specifically install Transfomers. However, a conda installation might be needed for ffmpeg[^1], which takes care of setting up PATH variables. From the anaconda prompt, type or copy the following:
+1. This script was made and tested in an Anaconda environment with Python 3.10. I recommend miniconda for a smaller installation, and if you're not familiar with Python.
+   See [here](https://docs.anaconda.com/free/miniconda/miniconda-install/) for instructions. You will **need administrator rights**.
+2. Whisper also requires some additional libraries. The [setup](https://github.com/openai/whisper#setup) page states: "The codebase also depends on a few Python packages, most notably HuggingFace Transformers for their fast tokenizer implementation and ffmpeg-python for reading audio files."
+   Users might not need to specifically install Transformers. However, a conda installation might be needed for ffmpeg[^1], which takes care of setting up PATH variables.
+   From the Anaconda Prompt (which should now be installed on your system; find it with the search function), type or copy the following:
 ```
 conda install -c conda-forge ffmpeg-python
 ```
-3. The main functionality comes from openai-whisper. See their [page](https://github.com/openai/whisper) for details. As of 2023-03-22 you can install via:
+   You can also choose not to use Anaconda (or miniconda) and use plain Python. In that case, you need to [download and install FFMPEG](https://ffmpeg.org/download.html) (and potentially add it to your PATH). See here for [WikiHow instructions](https://www.wikihow.com/Install-FFmpeg-on-Windows).
+3. The main functionality comes from openai-whisper. See their [page](https://github.com/openai/whisper) for details. It also uses some additional packages (colorama and customtkinter); install them with the following command:
 ```
-pip install -U openai-whisper
+pip install -r requirements.txt
 ```
-4. To run the app built on TKinter and TTKthemes. If using these options, make sure they are installed in your Python build. You can install them and colorama via pip.
-```
-pip install colorama
-```
-and
-```
-pip install customtkinter
-```
-5. Run the app:
-   1. For **Windows**: In the same folder as the *app.py* file, run the app from terminal by running ```python app.py``` or with the batch file called run_Windows.bat (for Windows users), which assumes you have conda installed and in the base environment (This is for simplicity, but users are usually adviced to create an environment, see [here](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#creating-an-environment-with-commands) for more info) just make sure you have the correct environment (right click on the file and press edit to make any changes). If you want to download a model first, and then go offline for transcription, I recommend running the model with the default sample folder, which will download the model locally.
-   2. For **Mac**: Haven't figured out a better way to do this, see [the instructions here](Mac_instructions.md)
+4. Run the app:
+   1. For **Windows**: In the same folder as the *app.py* file, run the app from the Anaconda Prompt by running
+      ```python app.py```
+      or with the batch file called run_Windows.bat (for Windows users), which assumes you have conda installed and in the base environment (this is for simplicity, but users are usually advised to create an environment; see [here](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#creating-an-environment-with-commands) for more info). Just make sure you have the correct environment (right-click on the file and press edit to make any changes).
+   2. For **Mac**: Haven't figured out a better way to do this; see [the instructions here](Mac_instructions.md)
+
+**Note:** If you want to download a model first and then go offline for transcription, I recommend running the model with the default sample folder, which will download the model locally.
+
+## GPU Support
+This program **does support running on NVIDIA GPUs**, which can significantly speed up transcription times. To use GPU acceleration, you need to have the correct version of PyTorch installed with CUDA support.
+
+### Installing PyTorch with CUDA Support
+If you have an NVIDIA GPU and want to take advantage of GPU acceleration, you can install a CUDA-enabled version of PyTorch using:
+```
+pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
+```
+**Note:** The command above installs PyTorch with CUDA 12.1 support. Make sure your NVIDIA GPU drivers are compatible with CUDA 12.1. You can check your CUDA version by running `nvidia-smi` in your terminal.
+If you need a different CUDA version, visit the [PyTorch installation page](https://pytorch.org/get-started/locally/) to generate the appropriate installation command for your system.
+
+### Verifying GPU Support
+After installation, you can verify that PyTorch can detect your GPU by running:
+```python
+import torch
+print(torch.cuda.is_available())  # Should print True if GPU is available
+print(torch.cuda.get_device_name(0))  # Should print your GPU name
+```
+If GPU is not detected, the program will automatically fall back to CPU processing, though this will be slower.
 ## Usage
 1. When launched, the app will also open a terminal that shows some additional information.
 2. Select the folder containing the audio or video files you want to transcribe by clicking the "Browse" button next to the "Folder" label. This will open a file dialog where you can navigate to the desired folder. Remember, you won't be choosing individual files but whole folders!
````
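Before running the CUDA install command from the README hunk above, it can help to confirm that the NVIDIA driver tooling is actually present on the machine. A minimal sketch, assuming only the stdlib (`has_nvidia_smi` is a hypothetical helper, not part of the repository):

```python
import shutil

def has_nvidia_smi():
    # `nvidia-smi` is the driver utility the README mentions; if it is not on
    # PATH, torch.cuda.is_available() will report False and the app falls
    # back to CPU transcription.
    return shutil.which('nvidia-smi') is not None

print(has_nvidia_smi())
```

This only checks for the CLI, not for driver/CUDA version compatibility, which `nvidia-smi` itself reports.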
**app.py**

```diff
@@ -5,9 +5,10 @@ from tkinter import messagebox
 from src._LocalTranscribe import transcribe, get_path
 import customtkinter
 import threading
-from colorama import Back, Fore
+from colorama import Back
 import colorama
 colorama.init(autoreset=True)
+import os
```
```diff
@@ -41,7 +42,8 @@ class App:
         language_frame.pack(fill=tk.BOTH, padx=10, pady=10)
         customtkinter.CTkLabel(language_frame, text="Language:", font=font).pack(side=tk.LEFT, padx=5)
         self.language_entry = customtkinter.CTkEntry(language_frame, width=50, font=('Roboto', 12, 'italic'))
-        self.language_entry.insert(0, 'Select language or clear to detect automatically')
+        self.default_language_text = "Enter language (or ignore to auto-detect)"
+        self.language_entry.insert(0, self.default_language_text)
         self.language_entry.bind('<FocusIn>', on_entry_click)
         self.language_entry.pack(side=tk.LEFT, fill=tk.X, expand=True)
         # Model frame
```
```diff
@@ -72,7 +74,8 @@ class App:
     # Helper functions
     # Browsing
     def browse(self):
-        folder_path = filedialog.askdirectory()
+        initial_dir = os.getcwd()
+        folder_path = filedialog.askdirectory(initialdir=initial_dir)
         self.path_entry.delete(0, tk.END)
         self.path_entry.insert(0, folder_path)
     # Start transcription
```
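The `browse` change above seeds the file dialog with the current working directory. The same fallback logic can be sketched without the GUI (the helper name `starting_directory` is illustrative, not from the repository):

```python
import os

def starting_directory(preferred=None):
    # Mirrors the browse() change: when no preferred directory is given,
    # fall back to os.getcwd() so the dialog opens somewhere sensible.
    return preferred if preferred else os.getcwd()

print(starting_directory('/tmp'))            # '/tmp'
print(starting_directory() == os.getcwd())   # True
```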
```diff
@@ -85,22 +88,18 @@ class App:
     def transcribe_thread(self):
         path = self.path_entry.get()
         model = self.model_combobox.get()
-        language = self.language_entry.get() or None
+        language = self.language_entry.get()
+        # Check if the language field has the default text or is empty
+        if language == self.default_language_text or not language.strip():
+            language = None  # This is the same as passing nothing
         verbose = self.verbose_var.get()
         # Show progress bar
         self.progress_bar.pack(fill=tk.X, padx=5, pady=5)
         self.progress_bar.start()
         # Setting path and files
         glob_file = get_path(path)
         info_path = 'Continue?'
         answer = messagebox.askyesno("Confirmation", info_path)
         if not answer:
             self.progress_bar.stop()
             self.progress_bar.pack_forget()
             self.transcribe_button.configure(state=tk.NORMAL)
             return
         #messagebox.showinfo("Message", "Starting transcription!")
         # Start transcription
         error_language = 'https://github.com/openai/whisper#available-models-and-languages'
         try:
             output_text = transcribe(path, glob_file, model, language, verbose)
         except UnboundLocalError:
```
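The placeholder handling added in the hunk above can be isolated as a pure function, which makes the behavior easy to test without the GUI. This is a sketch mirroring the diff, not code from the repository:

```python
DEFAULT_LANGUAGE_TEXT = "Enter language (or ignore to auto-detect)"

def normalize_language(raw):
    # Treat the placeholder text, or a blank/whitespace-only entry, as "no
    # language selected", which tells Whisper to auto-detect — the same
    # check transcribe_thread performs before calling transcribe().
    if raw == DEFAULT_LANGUAGE_TEXT or not raw.strip():
        return None
    return raw

print(normalize_language("Swedish"))  # Swedish
print(normalize_language("   "))      # None
```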
**requirements.txt** (new file)

```diff
@@ -0,0 +1,3 @@
+openai-whisper
+customtkinter
+colorama
```
**Sample transcription (Armstrong_Small_Step)**

```diff
@@ -1,4 +1,2 @@
 Armstrong_Small_Step
-[0:00:00 --> 0:00:07]: And they're still brought to land now.
-[0:00:07 --> 0:00:18]: It's one small step for man.
-[0:00:18 --> 0:00:23]: One by a fleet for man time.
+[0:00:00 --> 0:00:29.360000]: alumnfeldaguyrjarna om det nya skirprå kızım om det där föddarna hatt splittar, do nackrott,
```
**src/_LocalTranscribe.py**

```diff
@@ -2,7 +2,7 @@ import os
 import datetime
 from glob import glob
 import whisper
-from torch import cuda, Generator
+from torch import backends, cuda, Generator
 import colorama
 from colorama import Back, Fore
 colorama.init(autoreset=True)
```
```diff
@@ -39,14 +39,20 @@ def transcribe(path, glob_file, model=None, language=None, verbose=False):
     - The transcribed text files will be saved in a "transcriptions" folder
       within the specified path.
     """
-    # Check for GPU acceleration
-    if cuda.is_available():
+    # Check for GPU acceleration and set device
+    if backends.mps.is_available():
+        device = 'mps'
+        Generator('mps').manual_seed(42)
+    elif cuda.is_available():
         device = 'cuda'
         Generator('cuda').manual_seed(42)
     else:
         device = 'cpu'
         Generator().manual_seed(42)
-    # Load model
-    model = whisper.load_model(model)
+    # Load model on the correct device
+    model = whisper.load_model(model, device=device)
     # Start main loop
     files_transcripted=[]
     for file in glob_file:
```
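The branch order in the hunk above (Apple Metal first, then CUDA, then CPU) can be exercised without importing torch by injecting the availability flags. A minimal sketch, with `pick_device` as a hypothetical name for the inlined logic:

```python
def pick_device(mps_available: bool, cuda_available: bool) -> str:
    # Same precedence as the diff: prefer Apple's Metal backend (mps),
    # then NVIDIA CUDA, and finally fall back to plain CPU.
    if mps_available:
        return 'mps'
    if cuda_available:
        return 'cuda'
    return 'cpu'

print(pick_device(False, True))   # cuda
print(pick_device(True, True))    # mps
print(pick_device(False, False))  # cpu
```

Note that `mps` wins even when CUDA is also reported available, which matches the `if`/`elif` ordering in the diff.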