From: MarStr
Date: Sun, 21 Jul 2024 20:47:18 +0000 (+0200)
Subject: Added background noise and adjusted sample rates for improved realism and immersion.
X-Git-Url: http://marstr.online/code/gitweb.cgi?a=commitdiff_plain;h=7752034d2fd452216995059a3bfa063dd1447ad0;p=Pilot2AWS

Added background noise and adjusted sample rates for improved realism and immersion.
---

diff --git a/LICENSE b/LICENSE
new file mode 100644
index 0000000..8aa2645
--- /dev/null
+++ b/LICENSE
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) [year] [fullname]
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
diff --git a/Pilot2AWS.py b/Pilot2AWS.py
index 87cfd92..742dcbb 100644
--- a/Pilot2AWS.py
+++ b/Pilot2AWS.py
@@ -1,6 +1,8 @@
-import boto3 # pip install boto3
-import pygame # pip install pygame
+import boto3 # pip install boto3
+import pygame # pip install pygame
+import numpy as np # pip install numpy
+from scipy.io.wavfile import write # pip install scipy
 import time
 import io
 import random

@@ -39,7 +41,7 @@ atc_aws_voicemodel = 'standard'
 # This file can be anywhere and have any name - just make sure you
 # put in the correct absolute path into this variable.
 # -------------------------------------------------------------------
-atc_pilot2atc_log = "C:\\Users\\Marcus\\Desktop\\ConversationText.txt"
+atc_pilot2atc_log = "M:\\Developer\\Projects\\Pilot2AWS\\test.txt"
 # -------------------------------------------------------------------


@@ -87,6 +89,9 @@ atc_last_line = -1
 pygame.init()
 pygame.mixer.init()

+# The click at the end of a transmission
+click = pygame.mixer.Sound("endclick.wav")
+
 # Open file before main loop
 atc_log = open(atc_pilot2atc_log)

@@ -110,15 +115,41 @@ while True:

     # Generate voice!
     # Let's keep it at OGG - best compromize between data transfer size and quality
-    response = atc_polly_client.synthesize_speech(VoiceId=atc_voices[voice_to_use], OutputFormat='ogg_vorbis', Text = line, Engine = atc_aws_voicemodel)
+    # Also, added 8Khz as sample frequency to make this sound more authentic
+    response = atc_polly_client.synthesize_speech(VoiceId=atc_voices[voice_to_use], OutputFormat='ogg_vorbis', Text = line, Engine = atc_aws_voicemodel, SampleRate="8000")

     # And place that into a binary block
     data = io.BytesIO(response['AudioStream'].read())
+    with open("atc.ogg", 'wb') as f:
+        f.write(data.getbuffer())
+
+    # Get length of spoken audio.
+    t = pygame.mixer.Sound("atc.ogg")
+    l = int(t.get_length()) + 1
+    # OK. Generate white noise:
+    noise = np.random.normal(0, 1, 8000 * l)
+    # Normalize the white noise
+    noise = noise / np.max(np.abs(noise))
+    # Convert the white noise to a 16-bit format
+    noise = (noise * 2**15).astype(np.int16)
+    # Save that file too
+    write('noise.wav', 8000, noise)
+
+    # Place Polly's audio in Channel 0
+    #pygame.mixer.music.load(data)
+    #pygame.mixer.music.play()
+    pygame.mixer.Channel(0).play(t)
+
+    # Set white noise volume to 10%
+    pygame.mixer.Channel(1).set_volume(0.05)
+    # Place white noise in Channel 1
+    pygame.mixer.Channel(1).play(pygame.mixer.Sound('noise.wav'))
+
+    while pygame.mixer.Channel(0).get_busy():
+        time.sleep(0.1)
+
+    pygame.mixer.Channel(2).set_volume(0.3)
+    pygame.mixer.Channel(2).play(click)
-    # Place the data into pygame and play it
-    pygame.mixer.music.load(data)
-    pygame.mixer.music.play()
-    pygame.event.wait()
-
     # Increase loop count
     idx = idx + 1

diff --git a/endclick.wav b/endclick.wav
new file mode 100644
index 0000000..01d9ec8
Binary files /dev/null and b/endclick.wav differ
diff --git a/repoinfo b/repoinfo
index 3836acc..cbe4da9 100644
--- a/repoinfo
+++ b/repoinfo
@@ -2,6 +2,8 @@

 A Python script allowing to use standard or neural voices of Amazon Polly, in Pilot2ATC. It does so by monitoring Pilot2ATC's output and leveraging Amazon's Polly technology, to generate voice responses that sound more natural.

+Now with added immersion as the result sounds very much like an actual radio transmission, including background noise.
+
 The voices that Pilot2ATC can use are those that are available on your system, on Windows this set is usually extremely limited. On top of that, they sound very robotic. Not what you'd want when talking to ATC in a flight.

 While there are solutions available that leverage AI (custom versions of ChatGPT), the costs are - in my opinion - too high. SayIntentions is my prime example. Clocking in at 30 Dollars per month (at the time of this writing), it is deemed not feasible for most. I did have a look at ChatGPT itself for ATC, and other available tools. I came to the conclusion that Pilot2ATC is the best currently available. It does cost 55 Euros - but it is a one-off payment, you get to keep the tool forever.
@@ -43,10 +45,12 @@ If you want to use AWS and its services elsewhere, you are of course free to ins

 [section]Setup[/section]

-You will need two Python modules: boto3 and pygame. Install them like so:
+You will need four Python modules: pygame, boto3, numpy and scipy. Install them like so:

 [code]pip install boto3
-pip install pygame[/code]
+pip install pygame
+pip install numpy
+pip install scipy[/code]

 Next, open up the Pilot2AWS.py script with your favorite editor and make the necessary adjustments as follows:

@@ -131,5 +135,6 @@ The sound format is OGG Vorbis. I would recommend you to leave it at that. It is

 I may need to figure out how to efficiently read X-Plane's dataref values so that I can further enhance realism. For example only pick another voice if you left a certain area or changed the type of contact. I will be looking into this at some time - for now I am happy with how this has turned out.

-[History]
+[section]History[/section]
+v1.02 - Implemented mechanism that generates 8kHz white noise, and change to generate the voice also with 8kHz. Implemented code to mix both sounds together
 v1.01 - Updated loop mechanism for more efficiency and accuracy
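
The white-noise step that v1.02 describes can be sketched on its own, outside the main loop. This is a minimal sketch under stated assumptions, not the code from the commit: the helper name `make_white_noise`, the `SAMPLE_RATE` constant and the `seed` parameter are hypothetical and exist only for illustration. It also scales by 32767 instead of `2**15`, on the reasoning that a normalized sample of exactly +1.0 multiplied by 32768 is not representable in a signed 16-bit integer.

```python
import numpy as np

SAMPLE_RATE = 8000  # Hz; assumed to match the SampleRate passed to Polly


def make_white_noise(seconds, sample_rate=SAMPLE_RATE, seed=None):
    """Return `seconds` of Gaussian white noise as 16-bit PCM samples.

    Hypothetical helper for illustration; not part of Pilot2AWS.py.
    """
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, 1.0, sample_rate * seconds)
    # Normalize so the loudest sample has magnitude 1.0
    noise = noise / np.max(np.abs(noise))
    # Scale into the int16 range; 32767 keeps a +1.0 sample representable
    return (noise * 32767).astype(np.int16)


if __name__ == "__main__":
    samples = make_white_noise(3, seed=0)
    print(samples.dtype, samples.shape)
```

The returned array could then be saved the same way the diff does it, with `write('noise.wav', 8000, samples)` from `scipy.io.wavfile`, and played at low volume on its own pygame mixer channel.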