Ardvrk's

Audio To Avatar

Realtime Prototype


Gallery Images and Videos


3d of txtAll
(png) 474k

FFT nn modes steps
(png) 162k

MNIST image
(png) 2mb

MNIST vs FFT
(png) 226k

Studying background noise spectrums
(png) 337k

adding silent to max end not
(png) 391k

adjusting labels
(png) 523k

annotated data set plus live
(png) 533k

average setup not median
(png) 505k

averages corner view
(png) 438k

averaging phonemes 38
(png) 494k

averaging spectrums into segments
(png) 395k

binary example for developing and testing neural network early on
(png) 124k

binary test
(png) 191k

blendshape patterns
(png) 289k

call the neural network
(png) 171k

checking diffs between rows for spectrums
(png) 457k

checking edge conditions
(png) 237k

checking origin
(png) 414k

code diagram
(png) 987k

code diagram neural network
(png) 454k

color coding silence background noise black
(png) 563k

corner view data set
(png) 400k

curvy slope down for spectrums
(png) 309k

data saved as image for neural network
(png) 260k

data set plus live data
(png) 394k

dataset pair with jawOpen highlighted
(png) 424k

first realtime test
(png) 261k

focusing on jawOpen
(png) 1mb

generating one hot
(png) 588k

green jaw and grey mouth blendshapes
(png) 399k

iconButtonFemale
(png) 55k

iconButtonMale
(png) 91k

jaw green mouth grey
(png) 555k

larger spectrums
(png) 550k

looking down blendshape column
(png) 344k

marker to show current row
(png) 199k

median of each first phoneme
(png) 277k

midi blendshapes
(png) 639k

midi blendshapes image
(png) 39k

midi blendshapes work with one hot
(png) 265k

nn 3d
(png) 39k

nn 3d 2
(png) 106k

nn diagram
(png) 163k

noticing silent includes low signals down
(png) 415k

noticing spikes in background silence
(png) 360k

numOneHot update
(png) 529k

numOneHots working
(png) 477k

one hot and midi lookup blendshapes
(png) 460k

one hot data set numeric sort
(png) 223k

onehot for phoneme 93
(png) 46k

phoneme dividers
(png) 680k

prediction results index of match
(png) 509k

random one hot top view
(png) 117k

realtime neural network flow
(png) 184k

separating spectrums and blendshapes
(png) 561k

showing averages
(png) 328k

showing phonemes
(png) 272k

side view of audio phrases spectrum
(png) 375k

simple job of nn
(png) 28k

sizing one hot to equal number of rows
(png) 657k

sorting spectrums by diff sum
(png) 526k

sorting spectrums numerically not good enough
(png) 453k

spectrum blendshapes different colors
(png) 345k

spectrum data not normalized tall
(png) 454k

spectrum first last skip
(png) 819k

spectrum num 6
(png) 458k

spectrum num 64 blendshapes 52 plus extra
(png) 250k

spectrum size 16
(png) 459k

spectrum slope vs blend shapes
(png) 541k

spectrums and blendshapes
(png) 317k

spectrums and blendshapes labels
(png) 592k

spectrums birds eye
(png) 315k

spectrums on left blendshapes on right
(png) 474k

studying borders between n phoneme spectrum and blendshapes
(png) 772k

studying different phrases with silent gaps
(png) 277k

studying first column of spectrum data
(png) 474k

studying overall patterns
(png) 46k

studying overall patterns inverted
(png) 103k

studying spectrum when not speaking background noise noticing negatives
(png) 549k

sum diff sorted distribution
(png) 407k

testing median algo
(png) 643k

testing spectrum size of 6
(png) 439k

testing spectrums vs blendshapes
(png) 513k

texFFT train
(png) 4k

train and test separate data sets
(png) 279k

train test report example
(png) 125k

training log for 120 neurons and 93 numOneHots outputs
(png) 1mb

two different color schemes
(png) 635k

txtAll before compress one hot random
(png) 382k

txtAll image
(png) 355k

txtAll image before one hot
(png) 49k

txtAll one hot
(png) 233k

txtAll one hot left and midi blendshapes right
(png) 168k

txtAll one hot plus midi lookup
(png) 253k

txtAll one hot randomized
(png) 364k

txtAll one hot with different silent
(png) 405k

txtAll one hot with spectrum num 16
(png) 489k

typical layout one hot then blendshapes then incoming spectrum stream
(png) 401k

updating text labels for different layouts
(png) 162k

using red to map out data set image in 3d
(png) 699k

verifying last row
(png) 304k

battery talk
(mp4) 13mb

battery talk 2
(mp4) 36mb

early test neural network
(mov) 9mb

first light realtime female
(mp4) 85mb

first light realtime male
(mp4) 195mb

first reasonable start averaging phonemes zero spectrum for silent
(mov) 40mb

first reasonable train and predict log
(mov) 73mb

formula one
(mp4) 124mb

marker solid
(mov) 897k

neural network jaw mouth not moving much
(mov) 8mb

predict 93 phonemes 256 neurons reasonable
(mov) 18mb

predicting pre recorded audio
(mov) 30mb

quivering during silent parts
(mov) 12mb

scrolling thru jawOpen
(mov) 207mb

test of neural network from audio file
(mov) 13mb

txtAll one hot and midi blend shapes
(mov) 1mb

txtAll one hot and midi blend shapes and spectrum from audio
(mov) 25mb

visual feedback
(mp4) 12mb

count:120

https://ded5792.inmotionhosting.com/~n726485/transcript2avatar/realtime/index.php