Mobile Platform Acoustic-Frequency Environmental Tomography

From SpeechWiki

(Difference between revisions)
(Robot)
m (typo)
 
(41 intermediate revisions not shown)
Line 1: Line 1:
-
==Where we're at, 2009 Feb 5==
+
==Retired==
-
We have Sarah's speaker-to-mic recordings,
+
* Christopher Co was working with a simulation of robot Illy in March 2011, but had not used the actual Illy.
-
dimensions/positions of room, mics, speakers:
+
* Camille has retrieved his web-controllable power switch for Illy.
-
* raw .wav files and deconvolved .mat files
+
* By 2011 it makes less sense to revive this work than to catch up on the state of the art for sonar SLAM (simultaneous localization and mapping).
-
* MLS and chirp deconvolutions
+
-
* from each of 4 speakers, to each of 40 mic positions
+
-
* from some speaker-pairs, to each of 24 mic positions
+
-
Speaker-pair recordings are incomplete (only 4 of 6 possible pairs).
+
==Dramatis personae==
-
But we could use them as sanity checks on the single-speaker recordings,
+
-
instead of as primary data.
+
-
The plywood cube (actually particleboard with 2x4 framing) has been demolished.
+
  Mark Hasegawa-Johnson
-
The thin-glass parts of the speakers have been demolished.
+
  Camille Goudeseune
 +
  Grads: Sarah Borys, Lae-Hoon Kim, Logan Niehaus
-
ISL still has the amplifiers, speaker drivers, and mics.
+
==Status==
-
One of the two Earthworks omnidirectional mics is malfunctioning and needs replacing,
+
-
if we need stereo recording.
+
-
ISL's multichannel recording PC, fruitfly.isl.uiuc.edu, has moved south with its 8-channel i/o interface.
+
* Mark bought 7 mic/preamp modules from sparkfun.
 +
* Camille mounted 4 of those mics on a 15 cm tetrahedron and soldered them to a DB9 plug.
 +
* Camille repaired 3 broken wires on the tetrahedron.
 +
* Camille soldered an adapter cable, DB9jack to USB-as-power plus 4x 1/4" plugs, so tetrahedron can drive motu828+maclaptop multitrack recorder.
 +
* Illy's IServer code compiles and runs on modern ubuntu.
 +
* Logan put his IServer source code on our svn, 'illycode'.
 +
* Illy moves, with new IServer on sal and old (2.2 kernel) IServerRobot on illy.
 +
* Don't (yet) fix IServer-video's "torch" link errors.
 +
* Illy has a web-controllable power switch at [http://cube1.isl.uiuc.edu cube1.isl.uiuc.edu].
-
If we reconstruct a plywoodcube, prefer flush-with-wall
+
==What's Next==
-
conventional speakers over the original motivation
+
-
of glass-speakers-through-cubewall-slits.
+
-
==What we might publish (how much work still to do)==
+
* Mark will play chirps at the tetrahedron and post the resulting 4-channel recording.
 +
* Sarah will compile the svn'd code on her laptop hungrygerbil.
 +
* Logan will put in svn server/mydemo.cc to move illy.
 +
* Camille will give illy a 120VAC power supply.
 +
* Logan's out of town but on email until August 1.
 +
* To read illy's mics, start from localization/demo.c.
 +
==How to use Illy==
-
===Compute room geometry and mic position from MLS===
+
===Power up===
-
Room, loudspeaker, mic. Mic and speaker unmoving, known distance apart. Play MLS. From recorded sound, estimate room's geometry w.r.t. mic and speaker.
+
* Put battery in tray.  Maybe gently jiggle the eyes up if they're in the way.
 +
* Plug battery into yellow-black power cable at back.
 +
* Turn on power switch (red rocker), side top.
 +
* Beside the motherboard's ethernet jack are 3 LEDs.  If the red one isn't lit, use a pointy steel something to jam the red and black wires deeper into the plug of the motherboard's power cable.
 +
* Power lasts 30 minutes while motoring, 60 sitting.
 +
 
 +
If illy's software crashes while she's plugged into wall power and you're not in BI 1510,
 +
powercycle her from [http://cube1.isl.uiuc.edu cube1.isl.uiuc.edu].
 +
 
 +
===Use===
 +
 
 +
* ssh mrmcclai@illy.ifp.uiuc.edu
 +
* illy% Iserver
 +
* On e.g. Sarah's laptop hungrygerbil, run Iserver and run apps that communicate with illy through those 2 Iservers.
 +
 
 +
===Power down===
 +
 
 +
* illy% shutdown
 +
 
 +
==Compute room geometry and mic position==
 +
 
 +
Room, loudspeaker, mic.
 +
 
 +
Mic and speaker unmoving, known distance apart.
 +
 
 +
Play chirp or sine sweep. Not MLS: too slow.
 +
 
 +
From recorded sound, estimate room's geometry w.r.t. mic and speaker.
* Lae-Hoon's master's thesis has an algorithm for this.
-
* Verify this algorithm against plywood cube MLS recordings.
+
* Verify this algorithm against plywood-cube chirp recordings.
* Generalize to non-shoebox rooms.
* Generalize to a dynamic algorithm for a moving mic and speaker (robot).
* Generalize to a changing room shape.
-
===Robot===
+
==Space-mapping robot==
-
Put a speaker or MLS (Chirp) generator (espresso machine?) next to a microphone, to map a space.
+
EMAR, Expendable Mapping Acoustic Robot
 +
 
 +
Put a speaker near a microphone, to map a space.
 +
 
 +
Sine sweep or chirp, not MLS. We need speed and reflector attributes, not precision.
* Fast mode. Catch the first two or three echoes, find out where the two nearest surfaces are, and match those against things on the video camera in order to determine space geometry.
* Slow mode. Measure the detailed room response at a few different locations (by moving the microphone), use this information together with video (hybrid, like AVSR, increases accuracy) to build up and test hypotheses for the room geometry.
-
Application: work with [[https://www.fsi.uiuc.edu/ IFSI]](invalid security certificate on 2009 Feb 13, btw) to test this in their collapsed building simulator: small robot rolls its way through the collapsed building and maps it, before the firefighters go through, to reduce the risk they are exposed to.
+
[http://zx81.isl.uiuc.edu/mappingbot/toy Simulator] (ruby, glut).
 +
===Vehicle===
-
Prototype, for carpet or outdoor pavement, but not yet for off-road:
+
Prototype: illy.ifp.uiuc.edu, an Arrick Trilobot.  In BI 1510 (locked).
-
* http://dashtray.net/linksys_router_bot.htm
+
Contacts: Logan Niehaus <niehaus4@illinois.edu>, Stephen Levinson <sel@ifp.uiuc.edu>.
-
* http://andrey.mikhalchuk.com/2008/02/23/how-to-build-an-inexpensive-yet-powerful-robot-how-to-turn-your-router-into-a-routerbot.html
+
-
Duplicating this would cost us about $350 and 15 hours. It's sturdy enough to survive collisions with walls, and strong enough to carry the audio gear.
+
-
Per Sarah's request, this new Version 2 now supports blinking blue LEDs.
+
-
(Since this ''is'' a speech rec group, could we justify $150 for [[http://cardrushstore.amazonwebstore.com/Doctor-Who-18Inch-Voice-Interactive-Supreme/M/B001H50C2I.htm this]]??)
+
Summer, don't add hardware. Use its two mics and ADCs, 802.11b, low-level API.
 +
Replace its speech-synth loudspeaker with a more linear piezo?
 +
It has 6 drive motors.
 +
Its 14 sensors include:
 +
* whiskers for collision detection
 +
* temperature+humidity to compute local speed of sound
 +
* position: laser rangefinder, ultrasonic rangefinder (untested), odometry (inaccurate).
-
What are the payload's weight, size, and power requirements?
+
Camille and Logan have got Illy's IServer software compiled and running on modern Ubuntu.
 +
 
 +
Fall, use more of its 8 channels of ADC for a tetrahedral mic array.
 +
Mount mics on pan-tilt head, so head-rotation verifies the array's angular accuracy.
 +
Loosely couple the software connecting payload to vehicle.
 +
 
 +
Outdoors: Conventional two-tread "tank."
 +
 
 +
===Payload===
 +
 
 +
What are the weight, size, power, and cooling requirements of the payload components?
* 2 mics
* 1 speaker
Line 64: Line 115:
How much computation happens on the robot, and how much on its base-station laptop?
 +
* How robust and wide is the data path between them?
 +
* How short a battery life can we tolerate?
 +
* Can a fast onboard CPU run cool enough?  (ammonium nitrate + water first aid "instant ice pack", shaken by robot when it feels hot)
-
===Corpus===
+
===Application===
-
Like AVICAR, but to validate room response models. No room-rebuilding, no more "research." Mention image-source, as well as several other algorithms.
+
Small robot rolls ahead of firefighters into a collapsing building, and maps it to reduce the risk they are exposed to.
 +
Small lets it reach places inaccessible to humans.
 +
[http://crasar.cse.tamu.edu/MainFiles/ CRASAR] recommends only high-level commands given by human operator.
-
===Refine image-source===
+
Related [http://www.nsf.gov/events/event_summ.jsp?cntn_id=100518&org=CISE NSF award].
-
Add frequency dependence to wall reflection and/or air transmission, and other subtle refinements as the data suggests. Have to look at CATT and other commercial packages for architectural acoustics; they include, e.g., hybrid image source/ray-tracing room responses, with frequency response of different materials implemented at each reflection.
+
Training venue: [http://www.teex.com/teex.cfm?pageid=USARprog&area=USAR&templateid=1117 Disaster City].
-
When we discussed this in early 2008, Mark guessed at least 12 months until "good-sounding" room inverse (40 dB, not just Bowon's 10 dB) in simulation, warranted before sawing particleboard.
 
-
Mask the reverberant tail by adding 10 dB SNR noise, since later echos
+
====IFSI====
-
may overlap too much to cancel rigorously.
+
Test in "Collapse Street" collapsed building simulators of [https://www.fsi.uiuc.edu/ IFSI].
 +
After Illi/Norbert in-the-lab study, contact IFSI's [http://www.fsi.illinois.edu/content/information/staffDirectory/detail.cfm?people_id=84458 Gavin Horn] 265-6563 <ghorn@illinois.edu> about collaborating, designing an outdoor prototype, and applying for funding.
-
===Validate room response models===
+
Treads should succeed in IFSI's rubble, because it's not sand or mud or wet leaves.
-
Play sounds convolved by the plywood cube's computed inverse-impulse-response. Compare the recorded results to the original unconvolved sounds.  In simulations, or with a fresh plywoodcube.
+
-
A wood "phonebooth" would fit almost anywhere.
+
Acoustic environment: nonstationary noises, like campfire crackle and water hoses.
-
Camille can imagine a larger phonebooth at ISL, though we'd have to sell Hank on building such a contraption, and we'd want to operate it remotely since it's not walking distance.
+
Many fast chirps tolerate such noise?
-
===Two extensions of Lae-Hoon's Jan 30 paper review===
+
Passive mic could listen to crackles to guess wall locations, if crackle and wall correlate.
 +
 
 +
Robot lightweight enough for a firefighter to throw through a door or over an obstacle.
 +
Ingress faster than exploration.
 +
 
 +
MIRV. Launch like a mortar, perhaps just by dropping into the
 +
path of the water hose.  In flight, compressed springs release by electric wires burning through, to scatter robots (which themselves MIRV, 2 or 3 stages).  Robots chirp in flight, while bouncing, while at rest.
 +
Learn a space's rough geometry within 5 seconds, improving accuracy thereafter.  Battery life of smallest bots need not exceed 60 seconds (capacitors for prototypes).
 +
 
 +
===Two extensions of Lae-Hoon's 2009 Jan 30 paper review===
1. Remove assumption of time invariance of RIR, because listeners' heads and ears move enough to degrade performance at high frequencies.
Line 103: Line 167:
adding mics would degrade rather than improve performance.
-
Sensitivity analysis of these things could be done entirely in simulation, as a quickly publishable result. A second paper could test that with actual experiments.
+
Sensitivity analysis of these things could be done entirely in simulation, as a quickly publishable result. A second paper tests that with experiments.
 +
 
 +
===Later work===
 +
* For more accuracy, estimate nonlocal speed of sound from computed and remembered values.
 +
* Secondary computation: ASR for "help!" and screams.  Tiny vocabulary.  Robust to background noise.
 +
* Flock of robots.  Faster, but tricky crosstalk.
 +
 
 +
==Plywood-cube status==
 +
 
 +
We have Sarah's speaker-to-mic recordings,
 +
dimensions/positions of room, mics, speakers:
 +
* raw .wav files and deconvolved .mat files
 +
* MLS and chirp deconvolutions
 +
* from each of 4 speakers, to each of 40 mic positions
 +
* from some speaker-pairs, to each of 24 mic positions
 +
 
 +
Speaker-pair recordings are incomplete (only 4 of 6 possible pairs).
 +
But we could use them as sanity checks on the single-speaker recordings,
 +
instead of as primary data.
 +
 
 +
We still have the amplifiers and speaker drivers.

Latest revision as of 22:33, 10 July 2012


Retired

  • Christopher Co was working with a simulation of robot Illy in March 2011, but had not used the actual Illy.
  • Camille has retrieved his web-controllable power switch for Illy.
  • By 2011 it makes less sense to revive this work than to catch up on the state of the art for sonar SLAM (simultaneous localization and mapping).

Dramatis personae

 Mark Hasegawa-Johnson
 Camille Goudeseune
 Grads: Sarah Borys, Lae-Hoon Kim, Logan Niehaus

Status

  • Mark bought 7 mic/preamp modules from sparkfun.
  • Camille mounted 4 of those mics on a 15 cm tetrahedron and soldered them to a DB9 plug.
  • Camille repaired 3 broken wires on the tetrahedron.
  • Camille soldered an adapter cable, DB9jack to USB-as-power plus 4x 1/4" plugs, so tetrahedron can drive motu828+maclaptop multitrack recorder.
  • Illy's IServer code compiles and runs on modern ubuntu.
  • Logan put his IServer source code on our svn, 'illycode'.
  • Illy moves, with new IServer on sal and old (2.2 kernel) IServerRobot on illy.
  • Don't (yet) fix IServer-video's "torch" link errors.
  • Illy has a web-controllable power switch at cube1.isl.uiuc.edu.

What's Next

  • Mark will play chirps at the tetrahedron and post the resulting 4-channel recording.
  • Sarah will compile the svn'd code on her laptop hungrygerbil.
  • Logan will put in svn server/mydemo.cc to move illy.
  • Camille will give illy a 120VAC power supply.
  • Logan's out of town but on email until August 1.
  • To read illy's mics, start from localization/demo.c.

How to use Illy

Power up

  • Put battery in tray. Maybe gently jiggle the eyes up if they're in the way.
  • Plug battery into yellow-black power cable at back.
  • Turn on power switch (red rocker), side top.
  • Beside the motherboard's ethernet jack are 3 LEDs. If the red one isn't lit, use a pointy steel something to jam the red and black wires deeper into the plug of the motherboard's power cable.
  • Power lasts 30 minutes while motoring, 60 sitting.

If illy's software crashes while she's plugged into wall power and you're not in BI 1510, powercycle her from cube1.isl.uiuc.edu.

Use

  • ssh mrmcclai@illy.ifp.uiuc.edu
  • illy% Iserver
  • On e.g. Sarah's laptop hungrygerbil, run Iserver and run apps that communicate with illy through those 2 Iservers.

Power down

  • illy% shutdown

Compute room geometry and mic position

Room, loudspeaker, mic.

Mic and speaker unmoving, known distance apart.

Play chirp or sine sweep. Not MLS: too slow.

From recorded sound, estimate room's geometry w.r.t. mic and speaker.

  • Lae-Hoon's master's thesis has an algorithm for this.
  • Verify this algorithm against plywood-cube chirp recordings.
  • Generalize to non-shoebox rooms.
  • Generalize to a dynamic algorithm for a moving mic and speaker (robot).
  • Generalize to a changing room shape.
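The first step of the plan above, recovering echo arrival times from a recorded chirp, can be sketched with a matched filter. This is only an illustration under stated assumptions: it is not the algorithm from Lae-Hoon's thesis, and the function names, the 500 Hz to 8 kHz sweep, and the 48 kHz sample rate are all hypothetical choices, not values from these notes.

```python
import numpy as np

def linear_chirp(f0, f1, duration, fs):
    """Linear sine sweep from f0 to f1 Hz, lasting `duration` seconds at sample rate fs."""
    t = np.arange(int(round(duration * fs))) / fs
    sweep_rate = (f1 - f0) / duration
    return np.sin(2 * np.pi * (f0 * t + 0.5 * sweep_rate * t ** 2))

def matched_filter(recording, chirp):
    """Cross-correlate recording with chirp via FFT; index k is a delay of k samples."""
    nfft = 1 << (len(recording) + len(chirp) - 1).bit_length()
    spectrum = np.fft.rfft(recording, nfft) * np.conj(np.fft.rfft(chirp, nfft))
    return np.fft.irfft(spectrum, nfft)[:len(recording)]

def strongest_arrivals(response, count, guard):
    """Pick the `count` largest peaks, blanking `guard` samples around each one found."""
    magnitude = np.abs(response)
    lags = []
    for _ in range(count):
        lag = int(np.argmax(magnitude))
        lags.append(lag)
        magnitude[max(0, lag - guard):lag + guard + 1] = 0.0
    return sorted(lags)

# Simulated recording: direct sound at 100 samples, one half-strength echo at 300.
fs = 48000
chirp = linear_chirp(500.0, 8000.0, 0.05, fs)
recording = np.zeros(8000)
recording[100:100 + len(chirp)] += chirp
recording[300:300 + len(chirp)] += 0.5 * chirp
print(strongest_arrivals(matched_filter(recording, chirp), count=2, guard=50))
```

Real recordings would add noise, loudspeaker coloration, and overlapping late echoes, so the peak picking here is far cruder than what estimating room geometry needs.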

Space-mapping robot

EMAR, Expendable Mapping Acoustic Robot

Put a speaker near a microphone, to map a space.

Sine sweep or chirp, not MLS. We need speed and reflector attributes, not precision.

  • Fast mode. Catch the first two or three echoes, find out where the two nearest surfaces are, and match those against things on the video camera in order to determine space geometry.
  • Slow mode. Measure the detailed room response at a few different locations (by moving the microphone), use this information together with video (hybrid, like AVSR, increases accuracy) to build up and test hypotheses for the room geometry.
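For fast mode, the arithmetic from echo delay to surface distance is short enough to sketch. The assumptions are loud ones: speaker and mic are treated as co-located, each echo is a single bounce off a flat surface, delays are measured from the moment of emission, and 343 m/s is a placeholder speed of sound. The function names are hypothetical.

```python
def surface_distance(echo_delay_s, speed_of_sound=343.0):
    """Distance in meters to a reflector, given round-trip echo delay in seconds."""
    # Sound travels out and back, so the one-way distance is half the path.
    return 0.5 * speed_of_sound * echo_delay_s

def nearest_surfaces(echo_delays_s, count=2, speed_of_sound=343.0):
    """Distances to the `count` nearest reflectors, nearest first."""
    distances = sorted(surface_distance(t, speed_of_sound) for t in echo_delays_s)
    return distances[:count]

# Echoes at 10, 20, and 30 ms put the two nearest surfaces about 1.7 m and 3.4 m away.
print(nearest_surfaces([0.020, 0.010, 0.030]))
```

Distances alone say nothing about direction, which is why the notes pair the echoes with video.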

Simulator (ruby, glut).

Vehicle

Prototype: illy.ifp.uiuc.edu, an Arrick Trilobot. In BI 1510 (locked). Contacts: Logan Niehaus <niehaus4@illinois.edu>, Stephen Levinson <sel@ifp.uiuc.edu>.

Summer, don't add hardware. Use its two mics and ADCs, 802.11b, low-level API. Replace its speech-synth loudspeaker with a more linear piezo? It has 6 drive motors. Its 14 sensors include:

  • whiskers for collision detection
  • temperature+humidity to compute local speed of sound
  • position: laser rangefinder, ultrasonic rangefinder (untested), odometry (inaccurate).
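How temperature and humidity readings might turn into a local speed of sound can be sketched with ideal-gas approximations. Everything here is an assumption rather than something taken from Illy's software: the function names, the Magnus approximation for saturation vapor pressure, the rigid-molecule heat capacities behind the (7 + x)/(5 + x) ratio, and the default sea-level pressure.

```python
import math

GAS_CONSTANT = 8.314462618   # J/(mol K)
MOLAR_MASS_DRY_AIR = 0.0289647   # kg/mol
MOLAR_MASS_WATER = 0.01801528    # kg/mol

def saturation_vapor_pressure_pa(temp_c):
    """Magnus approximation for saturation vapor pressure over water, in pascals."""
    return 610.94 * math.exp(17.625 * temp_c / (temp_c + 243.04))

def speed_of_sound(temp_c, rel_humidity=0.0, pressure_pa=101325.0):
    """Approximate speed of sound in m/s; rel_humidity is a fraction from 0 to 1."""
    # Mole fraction of water vapor in the air.
    x = rel_humidity * saturation_vapor_pressure_pa(temp_c) / pressure_pa
    molar_mass = (1.0 - x) * MOLAR_MASS_DRY_AIR + x * MOLAR_MASS_WATER
    # Heat capacity ratio of the mixture: 7/5 for dry air, approaching 4/3 for pure vapor.
    gamma = (7.0 + x) / (5.0 + x)
    return math.sqrt(gamma * GAS_CONSTANT * (temp_c + 273.15) / molar_mass)

print(speed_of_sound(20.0))        # dry air at 20 C
print(speed_of_sound(20.0, 1.0))   # saturated air at 20 C, slightly faster
```

Temperature dominates: a few degrees shift the result more than the whole humidity range does at room temperature, so a coarse humidity sensor would probably suffice.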

Camille and Logan have got Illy's IServer software compiled and running on modern Ubuntu.

Fall, use more of its 8 channels of ADC for a tetrahedral mic array. Mount mics on pan-tilt head, so head-rotation verifies the array's angular accuracy. Loosely couple the software connecting payload to vehicle.

Outdoors: Conventional two-tread "tank."

Payload

What are the weight, size, power, and cooling requirements of the payload components?

  • 2 mics
  • 1 speaker
  • power amplifier
  • computer handling mics + speakers
  • computer running Lae-Hoon's algorithm

How much computation happens on the robot, and how much on its base-station laptop?

  • How robust and wide is the data path between them?
  • How short a battery life can we tolerate?
  • Can a fast onboard CPU run cool enough? (ammonium nitrate + water first aid "instant ice pack", shaken by robot when it feels hot)

Application

Small robot rolls ahead of firefighters into a collapsing building, and maps it to reduce the risk they are exposed to. Small lets it reach places inaccessible to humans. CRASAR recommends only high-level commands given by human operator.

Related NSF award.

Training venue: Disaster City.


IFSI

Test in "Collapse Street" collapsed building simulators of IFSI. After Illi/Norbert in-the-lab study, contact IFSI's Gavin Horn 265-6563 <ghorn@illinois.edu> about collaborating, designing an outdoor prototype, and applying for funding.

Treads should succeed in IFSI's rubble, because it's not sand or mud or wet leaves.

Acoustic environment: nonstationary noises, like campfire crackle and water hoses. Many fast chirps tolerate such noise?

Passive mic could listen to crackles to guess wall locations, if crackle and wall correlate.

Robot lightweight enough for a firefighter to throw through a door or over an obstacle. Ingress faster than exploration.

MIRV. Launch like a mortar, perhaps just by dropping into the path of the water hose. In flight, compressed springs release by electric wires burning through, to scatter robots (which themselves MIRV, 2 or 3 stages). Robots chirp in flight, while bouncing, while at rest. Learn a space's rough geometry within 5 seconds, improving accuracy thereafter. Battery life of smallest bots need not exceed 60 seconds (capacitors for prototypes).

Two extensions of Lae-Hoon's 2009 Jan 30 paper review

1. Remove assumption of time invariance of RIR, because listeners' heads and ears move enough to degrade performance at high frequencies.

2. Extend their simulation to experiment with real microphones.

Of each mic in an array:

  • nonuniform frequency response
  • nonuniform spatial ("off-axis") response
  • nonuniform accuracy of measurement of spatial position
  • nonuniform accuracy of measurement of orientation, if mic isn't "omnidirectional"
  • nonuniform SNR
  • correlated inter-mic noise (not independent Gaussians) from multichannel preamplifier
  • actual crosstalk between channels, again from preamp
  • noises in domains other than amplitude-vs-time

At some point, even if mics cost no money, these inaccuracies suggest that adding mics would degrade rather than improve performance.

Sensitivity analysis of these things could be done entirely in simulation, as a quickly publishable result. A second paper tests that with experiments.
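A simulation-only sensitivity analysis for one item on the list above, inaccurate measurement of mic position, might look like the following Monte Carlo sketch. It is an illustration, not the method of the reviewed paper: the far-field plane-wave model, the 15 cm tetrahedron, the Gaussian position error, the 343 m/s speed of sound, and all function names are assumptions.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed constant

def tetrahedron(edge):
    """Vertices of a regular tetrahedron with the given edge length, centered at the origin."""
    base = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]], dtype=float)
    return base * edge / (2.0 * np.sqrt(2.0))

def tdoas(mics, direction, c=SPEED_OF_SOUND):
    """Arrival-time differences for every mic pair, for a plane wave from unit vector `direction`."""
    arrival = -(mics @ direction) / c
    i, j = np.triu_indices(len(mics), k=1)
    return arrival[j] - arrival[i]

def rms_tdoa_error(sigma, trials=20000, edge=0.15, seed=0):
    """RMS error in predicted time differences when mic positions are mismeasured by sigma meters."""
    rng = np.random.default_rng(seed)
    true_mics = tetrahedron(edge)
    errors = []
    for _ in range(trials):
        direction = rng.normal(size=3)
        direction /= np.linalg.norm(direction)
        believed_mics = true_mics + rng.normal(scale=sigma, size=true_mics.shape)
        errors.append(tdoas(believed_mics, direction) - tdoas(true_mics, direction))
    return float(np.sqrt(np.mean(np.square(errors))))

# A 5 mm position uncertainty, compared with the closed form sqrt(2) * sigma / c.
print(rms_tdoa_error(0.005), np.sqrt(2.0) * 0.005 / SPEED_OF_SOUND)
```

Under these assumptions the error per pair does not shrink as mics are added, which is one way to probe the claim that more mics eventually stop helping. The other listed imperfections (frequency response, crosstalk, correlated preamp noise) would need their own models.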

Later work

  • For more accuracy, estimate nonlocal speed of sound from computed and remembered values.
  • Secondary computation: ASR for "help!" and screams. Tiny vocabulary. Robust to background noise.
  • Flock of robots. Faster, but tricky crosstalk.

Plywood-cube status

We have Sarah's speaker-to-mic recordings, dimensions/positions of room, mics, speakers:

  • raw .wav files and deconvolved .mat files
  • MLS and chirp deconvolutions
  • from each of 4 speakers, to each of 40 mic positions
  • from some speaker-pairs, to each of 24 mic positions

Speaker-pair recordings are incomplete (only 4 of 6 possible pairs). But we could use them as sanity checks on the single-speaker recordings, instead of as primary data.

We still have the amplifiers and speaker drivers.
