Compare commits

31 Commits

| Author | SHA1 | Date |
|---|---|---|
| | 7d3fe1ba26 | |
| | da42a6e4cc | |
| | 0dab0d9bea | |
| | 953c71ab28 | |
| | 5522bdd575 | |
| | 861c470330 | |
| | 6de6d4b2ff | |
| | 01552cc7cb | |
| | 049a168c81 | |
| | 56a925463f | |
| | fe60b04020 | |
| | ff06a257f2 | |
| | 5e31129ea2 | |
| | 3f0bca02b7 | |
| | 488e78a5ae | |
| | 829a054300 | |
| | 462aae12ca | |
| | fec9190ba1 | |
| | 0dde25204d | |
| | b611aa6b8c | |
| | 7d50d5f4cf | |
| | 7799d03960 | |
| | f88186dacc | |
| | 3f5c1491ac | |
| | c83e15bdba | |
| | ff16ad30e1 | |
| | 622165b3e6 | |
| | 0e9cbdca58 | |
| | 87cb509b14 | |
| | ba935cafb7 | |
| | 6497508b7a | |
@@ -0,0 +1 @@
+*.zip filter=lfs diff=lfs merge=lfs -text
+25
@@ -0,0 +1,25 @@
+# Python cache
+__pycache__/
+*.py[cod]
+*$py.class
+
+# Virtual environments
+venv/
+env/
+ENV/
+
+# IDE
+.vscode/
+.idea/
+*.swp
+*.swo
+*~
+
+# OS
+.DS_Store
+Thumbs.db
+
+# Build artifacts
+dist/
+build/
+*.egg-info/
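The bracket pattern `*.py[cod]` above is ordinary glob syntax: the character class matches exactly one of `c`, `o`, or `d`, so compiled `.pyc`/`.pyo`/`.pyd` files are ignored while plain `.py` sources are not. Python's `fnmatch` module can demonstrate this:

```python
import fnmatch

# the character class [cod] matches exactly one of c, o, or d
print(fnmatch.fnmatch("module.pyc", "*.py[cod]"))  # True
print(fnmatch.fnmatch("module.py", "*.py[cod]"))   # False
```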
+2
-2
@@ -5,5 +5,5 @@ Unfortunately, I have not found a permanent solution for this, not being a Mac u
1. You can also right-click (or equivalent) on the root folder to open a Terminal within the folder.
2. Run the following command:
```
-python main.py
+python app.py
```
@@ -6,9 +6,8 @@ Local Transcribe with Whisper is a user-friendly desktop application that allows
1. File type: You no longer need to specify the file type. The program will only transcribe eligible files.
2. Language: Added an option to specify the language, which might help in some cases. Clear the default text to run automatic language recognition.
3. Model selection: Now a dropdown option that includes most models for typical use.
4. New and improved GUI.

5. Executable: On Windows and don't want to install Python? Try the exe file! See below for instructions (experimental).
@@ -21,42 +20,59 @@ Local Transcribe with Whisper is a user-friendly desktop application that allows

## Installation
### Get the files
Download the zip folder and extract it to your preferred working folder.

Or clone the repository with:
```
git clone https://github.com/soderstromkr/transcribe.git
```
### Executable Version **(Experimental. Windows only)**
The executable version of Local Transcribe with Whisper is a standalone program and should work out of the box. This experimental version is available if you have Windows and do not have (or don't want to install) Python and additional dependencies. However, it requires more disk space (around 1 GB), has no GPU acceleration, and has only been lightly tested for bugs. Let me know if you run into any issues!
1. Download the project folder, as the image above shows.
2. Navigate to build.
3. Unzip the folder (get a coffee or a tea, this might take a while depending on your computer).
4. Run the executable (app.exe) file.
### Python Version **(any platform, including Mac users)**
This is recommended if you don't have Windows, have Windows and use Python, or want to use GPU acceleration (PyTorch and CUDA) for faster transcriptions. I would generally recommend this method anyway, but I can understand that not everyone wants to go through the installation process for Python, Anaconda, and the other required packages.
-1. This script was made and tested in an Anaconda environment with Python 3.10. I recommend this method if you're not familiar with Python.
-   See [here](https://docs.anaconda.com/anaconda/install/index.html) for instructions. You might need administrator rights.
-2. Whisper requires some additional libraries. The [setup](https://github.com/openai/whisper#setup) page states: "The codebase also depends on a few Python packages, most notably HuggingFace Transformers for their fast tokenizer implementation and ffmpeg-python for reading audio files."
-   Users might not need to specifically install Transformers. However, a conda installation might be needed for ffmpeg[^1], which takes care of setting up PATH variables. From the Anaconda Prompt, type or copy the following:
+1. This script was made and tested in an Anaconda environment with Python 3.10. I recommend miniconda for a smaller installation, and if you're not familiar with Python.
+   See [here](https://docs.anaconda.com/free/miniconda/miniconda-install/) for instructions. You will **need administrator rights**.
+2. Whisper also requires some additional libraries. The [setup](https://github.com/openai/whisper#setup) page states: "The codebase also depends on a few Python packages, most notably HuggingFace Transformers for their fast tokenizer implementation and ffmpeg-python for reading audio files."
+   Users might not need to specifically install Transformers. However, a conda installation might be needed for ffmpeg[^1], which takes care of setting up PATH variables.
+
+   From the Anaconda Prompt (which should now be installed on your system; find it with the search function), type or copy the following:
```
conda install -c conda-forge ffmpeg-python
```
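Whichever route you take to install ffmpeg, you can confirm it worked by checking that the binary is reachable on your PATH. A minimal sketch (the helper name is mine, not from the project):

```python
import shutil

def ffmpeg_on_path() -> bool:
    # shutil.which resolves the command the same way a shell would
    return shutil.which("ffmpeg") is not None

print("ffmpeg available:", ffmpeg_on_path())
```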
-3. The main functionality comes from openai-whisper. See their [page](https://github.com/openai/whisper) for details. As of 2023-03-22 you can install via:
+   You can also choose not to use Anaconda (or miniconda) and use plain Python. In that case, you need to [download and install FFmpeg](https://ffmpeg.org/download.html) (and potentially add it to your PATH). See here for [WikiHow instructions](https://www.wikihow.com/Install-FFmpeg-on-Windows).
+
+3. The main functionality comes from openai-whisper. See their [page](https://github.com/openai/whisper) for details. It also uses some additional packages (colorama and customtkinter); install them with the following command:
```
-pip install -U openai-whisper
+pip install -r requirements.txt
```
-4. The app is built on Tkinter and TTKthemes. If using these options, make sure they are installed in your Python build. You can install them via pip.
-4. Run the app:
-   1. For **Windows**: In the same folder as the *app.py* file, run the app from the Anaconda Prompt by running ```python app.py``` or with the batch file called run_Windows.bat (for Windows users), which assumes you have conda installed and in the base environment. (This is for simplicity, but users are usually advised to create an environment; see [here](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#creating-an-environment-with-commands) for more info.) Just make sure you have the correct environment (right-click on the file and press edit to make any changes).
-   2. For **Mac**: Haven't figured out a better way to do this; see [the instructions here](Mac_instructions.md)
-
-**Note** If you want to download a model first, and then go offline for transcription, I recommend running the model with the default sample folder, which will download the model locally.
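The offline tip above can also be done programmatically: Whisper caches model weights locally on first load, so a single online run is enough. A hedged sketch (the model name "base" and the helper are examples of mine; the guard keeps the snippet runnable when openai-whisper is not installed):

```python
import importlib.util

has_whisper = importlib.util.find_spec("whisper") is not None

def predownload(model_name: str = "base"):
    # first call downloads the weights to ~/.cache/whisper; later runs work offline
    import whisper
    return whisper.load_model(model_name)

if not has_whisper:
    print("openai-whisper is not installed; run: pip install -r requirements.txt")
```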
## GPU Support
This program **does support running on NVIDIA GPUs**, which can significantly speed up transcription times. To use GPU acceleration, you need to have the correct version of PyTorch installed with CUDA support.

### Installing PyTorch with CUDA Support
If you have an NVIDIA GPU and want to take advantage of GPU acceleration, you can install a CUDA-enabled version of PyTorch using:
```
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
```

**Note:** The command above installs PyTorch with CUDA 12.1 support. Make sure your NVIDIA GPU drivers are compatible with CUDA 12.1. You can check your CUDA version by running `nvidia-smi` in your terminal.

If you need a different CUDA version, visit the [PyTorch installation page](https://pytorch.org/get-started/locally/) to generate the appropriate installation command for your system.
### Verifying GPU Support
After installation, you can verify that PyTorch can detect your GPU by running:
```python
import torch

print(torch.cuda.is_available())      # Should print True if GPU is available
print(torch.cuda.get_device_name(0))  # Should print your GPU name
```
+5. Run the app:
+   1. For **Windows**: In the same folder as the *app.py* file, run the app from a terminal by running ```python app.py``` or with the batch file called run_Windows.bat (for Windows users), which assumes you have conda installed and in the base environment. (This is for simplicity, but users are usually advised to create an environment; see [here](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#creating-an-environment-with-commands) for more info.) Just make sure you have the correct environment (right-click on the file and press edit to make any changes). If you want to download a model first, and then go offline for transcription, I recommend running the model with the default sample folder, which will download the model locally.
+   2. For **Mac**: Haven't figured out a better way to do this; see [the instructions here](Mac_instructions.txt)

If GPU is not detected, the program will automatically fall back to CPU processing, though this will be slower.
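The CPU fallback described above can be probed without assuming PyTorch is even installed. A minimal sketch (the function name is mine):

```python
import importlib.util

def pick_torch_device() -> str:
    # fall back to CPU when torch is missing or reports no CUDA device
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch
    return "cuda" if torch.cuda.is_available() else "cpu"

print(pick_torch_device())
```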
## Usage
1. When launched, the app will also open a terminal that shows some additional information.
2. Select the folder containing the audio or video files you want to transcribe by clicking the "Browse" button next to the "Folder" label. This will open a file dialog where you can navigate to the desired folder. Remember, you won't be choosing individual files but whole folders!
@@ -5,9 +5,10 @@ from tkinter import messagebox
from src._LocalTranscribe import transcribe, get_path
import customtkinter
import threading
-from colorama import Back, Fore
+from colorama import Back
import colorama
colorama.init(autoreset=True)
+import os
@@ -41,7 +42,8 @@ class App:
language_frame.pack(fill=tk.BOTH, padx=10, pady=10)
customtkinter.CTkLabel(language_frame, text="Language:", font=font).pack(side=tk.LEFT, padx=5)
self.language_entry = customtkinter.CTkEntry(language_frame, width=50, font=('Roboto', 12, 'italic'))
-self.language_entry.insert(0, 'Select language or clear to detect automatically')
+self.default_language_text = "Enter language (or ignore to auto-detect)"
+self.language_entry.insert(0, self.default_language_text)
self.language_entry.bind('<FocusIn>', on_entry_click)
self.language_entry.pack(side=tk.LEFT, fill=tk.X, expand=True)
# Model frame
@@ -72,7 +74,8 @@ class App:
# Helper functions
# Browsing
def browse(self):
-    folder_path = filedialog.askdirectory()
+    initial_dir = os.getcwd()
+    folder_path = filedialog.askdirectory(initialdir=initial_dir)
    self.path_entry.delete(0, tk.END)
    self.path_entry.insert(0, folder_path)
# Start transcription
@@ -85,29 +88,25 @@ class App:
def transcribe_thread(self):
    path = self.path_entry.get()
    model = self.model_combobox.get()
-   language = self.language_entry.get() or None
+   language = self.language_entry.get()
+   # Check if the language field has the default text or is empty
+   if language == self.default_language_text or not language.strip():
+       language = None  # This is the same as passing nothing
    verbose = self.verbose_var.get()
    # Show progress bar
    self.progress_bar.pack(fill=tk.X, padx=5, pady=5)
    self.progress_bar.start()
    # Setting path and files
    glob_file = get_path(path)
    info_path = 'I will transcribe all eligible audio/video files in the path: {}\n\nContinue?'.format(path)
    answer = messagebox.askyesno("Confirmation", info_path)
    if not answer:
        self.progress_bar.stop()
        self.progress_bar.pack_forget()
        self.transcribe_button.configure(state=tk.NORMAL)
        return
    #messagebox.showinfo("Message", "Starting transcription!")
    # Start transcription
    error_language = 'https://github.com/openai/whisper#available-models-and-languages'
    try:
        output_text = transcribe(path, glob_file, model, language, verbose)
    except UnboundLocalError:
        messagebox.showinfo("Files not found error!", 'Nothing found, choose another folder.')
    except ValueError:
-       messagebox.showinfo("Language error!", 'See {} for supported languages'.format(error_language))
+       messagebox.showinfo("Language error!", 'Invalid language name, you might have to clear the default text to continue! See {} for supported languages'.format(error_language))
    # Hide progress bar
    self.progress_bar.stop()
    self.progress_bar.pack_forget()
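The placeholder check in transcribe_thread can be isolated into a small pure function, which makes the behavior easy to test on its own. A sketch (the helper name is mine, not from the project):

```python
DEFAULT_LANGUAGE_TEXT = "Enter language (or ignore to auto-detect)"

def normalize_language(raw: str):
    """Map the placeholder text or a blank entry to None so Whisper auto-detects."""
    if raw == DEFAULT_LANGUAGE_TEXT or not raw.strip():
        return None
    return raw.strip()

print(normalize_language("Swedish"))  # Swedish
```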
@@ -0,0 +1,3 @@
+openai-whisper
+customtkinter
+colorama
@@ -1,4 +1,2 @@
Armstrong_Small_Step
-[0:00:00 --> 0:00:07]: And they're still brought to land now.
-[0:00:07 --> 0:00:18]: It's one small step for man.
-[0:00:18 --> 0:00:23]: One by a fleet for man time.
+[0:00:00 --> 0:00:29.360000]: alumnfeldaguyrjarna om det nya skirprå kızım om det där föddarna hatt splittar, do nackrott,
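The `[H:MM:SS]` timestamps in the sample output (including the fractional `0:00:29.360000`) are consistent with formatting segment times via `datetime.timedelta`, which the project imports. A quick illustration (the function name is mine):

```python
import datetime

def fmt(seconds: float) -> str:
    # str(timedelta) yields H:MM:SS, keeping microseconds when the value is fractional
    return str(datetime.timedelta(seconds=seconds))

print(fmt(7))      # 0:00:07
print(fmt(29.36))  # 0:00:29.360000
```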
+12
-6
@@ -2,7 +2,7 @@ import os
import datetime
from glob import glob
import whisper
-from torch import cuda, Generator
+from torch import backends, cuda, Generator
import colorama
from colorama import Back,Fore
colorama.init(autoreset=True)
@@ -39,14 +39,20 @@ def transcribe(path, glob_file, model=None, language=None, verbose=False):
    - The transcribed text files will be saved in a "transcriptions" folder
      within the specified path.
    """
-   # Check for GPU acceleration
-   if cuda.is_available():
+   # Check for GPU acceleration and set device
+   if backends.mps.is_available():
+       device = 'mps'
+       Generator('mps').manual_seed(42)
+   elif cuda.is_available():
+       device = 'cuda'
+       Generator('cuda').manual_seed(42)
    else:
        device = 'cpu'
        Generator().manual_seed(42)
-   # Load model
-   model = whisper.load_model(model)
+   # Load model on the correct device
+   model = whisper.load_model(model, device=device)
    # Start main loop
    files_transcripted=[]
    for file in glob_file:
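The device priority in the new code (Apple MPS first, then CUDA, then CPU) can be captured as a tiny pure function. A sketch with the availability flags passed in, so it runs without torch installed:

```python
def pick_device(mps_available: bool, cuda_available: bool) -> str:
    # mirrors the diff: prefer Apple's MPS backend, then CUDA, else CPU
    if mps_available:
        return "mps"
    if cuda_available:
        return "cuda"
    return "cpu"

print(pick_device(False, True))  # cuda
```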