M.Des. XR Design · IIT Jodhpur · April 2026

Proximity-Based Presence

Procedural environments that respond to user proximity, gaze, and movement. They sustain engagement more effectively than static, high-fidelity alternatives without exceeding standalone VR hardware limits.

Role: Solo Researcher

Scope: M.Des. Research

Platform: Standalone VR

Year: 2026

Session Duration: +35% · Reactive condition (RC) mean 8.4 min vs. static condition (SC) mean 6.2 min

Interaction Events: +77% · RC mean 15.75 triggers vs. SC mean 8.88

Re-visits: 3.75 · Mean voluntary return visits per session in the reactive condition (none in the static condition)

The Problem

The Immersion Gap in Standalone VR

Standalone VR has become affordable, but that accessibility hasn't resolved a fundamental problem. Users report shorter sessions, more cognitive friction, and “headset awareness” far more frequently than in PC-based environments.

A standalone headset must drive two independent stereo displays simultaneously at 72–120 FPS within a combined 220° field of view, all from a mobile chip with strict power and thermal limits.

Hardware Architecture — PC vs. Standalone Headset

“Low-fidelity PC games like Minecraft and Roblox consistently outperform high-fidelity standalone VR in session duration. If graphical power determined immersion, this would be impossible.”

Research premise, informed by Jin (2024) and Slater & Wilbur (1997)

Reference: Silent Hill

Engagement via atmosphere, not graphical fidelity.

Reference: Minecraft

Record-setting engagement on a minimal graphical budget.

220°: Combined horizontal FoV. Every animation loop is exposed simultaneously.

72+ FPS: Required per eye. Drop below this and motion sickness begins.

~7%: Rendering budget consumed by lighting after baked GI, leaving headroom for reactivity.

40 yrs: Of 2D screen animation paradigms inherited by VR without questioning fit.

The Inherited Paradigm

Why 2D Animation Logic Fails in VR

Standalone VR inherited its animation approach from 40 years of 2D screen game development: keyframed loops, repeating cycles, fixed timelines. On a monitor, cameras cut and shift before the patterns become obvious. In VR, there's no controlled camera. Everything is always visible.

Fig. 1.1: Field of View Comparison, 2D Camera vs. VR User · VR's expanded peripheral FoV (~220°) exposes keyframed loops that 2D cameras hide

Key Insights

What the Problem Analysis Revealed

The Hardware Constraint Is Real, But Misattributed

Standalone headsets are genuinely constrained: two stereo displays at 72–120 FPS from a mobile chip doesn't leave much room to work with. But the response from developers has mostly been to push for more graphics power, which isn't the right fix. Low-fidelity games like Minecraft consistently beat high-fidelity standalone VR in session length. The constraint is being framed wrong.

Inherited Animation Paradigms Break in 360° Environments

Keyframed animation loops work on 2D screens because camera cuts and framing keep them hidden. VR's ~220° field of view removes that cover. The same loop that's invisible on a monitor becomes a noticeable, presence-breaking artefact in VR. That's a design problem, not a hardware problem.

Presence Is Software-Addressable Without Touching the Render Budget

The key insight from Slater and Wilbur is that immersion and presence are different things. Immersion is about hardware. Presence, the feeling of actually being in the space, can be improved through environmental responsiveness at very low computational cost. That shifts the whole problem.

Prioritisation Matrix: comparing problems across user impact, feasibility of intervention, and developer control

Research Direction

Reframing the Question

Rather than pursuing graphical fidelity (an arms race standalone hardware cannot win), this research asks whether environments that respond dynamically to user presence can sustain immersion more effectively.

Slater and Wilbur (1997) draw a key distinction: immersion is a technical property (resolution, FoV, tracking latency), while presence is a subjective experience that can be optimised independently of hardware.

Presence vs. Immersion Framework · Slater & Wilbur (1997); Jin (2024); Kelsey et al. (2023)

Objective 01: Build the Pipeline

Design a hardware-efficient procedural animation pipeline for standalone VR, maintaining minimum 72 FPS under all interaction conditions on Meta Quest 2.

Objective 02: Test Empirically

Conduct a controlled A/B study comparing session duration, interaction frequency, and self-reported immersion between static-animation and user-reactive conditions.

Objective 03: Study Onboarding

Measure whether diegetic vs. non-diegetic onboarding affects first-time users' ability to engage with the reactive systems at all.

Methodology

Four-Phase Research Process

The methodology was iterative by necessity. Any technique dropping frame rate below 72 FPS on standalone hardware triggers motion sickness, making the experience unusable as a research instrument.
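As a rough illustration of how tight the 72 FPS floor is, the sketch below converts a refresh rate into a per-frame time budget. The function name is illustrative, and Python is used only for brevity; it is not the project's own code.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available to simulate and render one frame, in milliseconds."""
    return 1000.0 / fps

# The two ends of the 72-120 FPS range quoted in this case study.
for fps in (72, 120):
    print(f"{fps} FPS -> {frame_budget_ms(fps):.2f} ms per frame")
```

At 72 FPS every system in the scene, including rendering for both eyes, has to finish in just under 14 ms.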

Phase 01: Exploration & Architectural Pivot

Audio-Reactive Pipeline: TouchDesigner → Unity

TouchDesigner was linked to Unity via NDI/Spout/OSC protocols. Live audio was decomposed via FFT into spectral frequency bands that drove Unity shader parameters and mesh vertex displacements in real time: bass controlled torch flicker intensity, mid-frequencies drove particle emission, highs drove colour modulation.
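The band-splitting step can be sketched roughly as follows. This is a minimal illustration, not the TouchDesigner network used in the project: it uses a naive DFT from the standard library in place of a real FFT, and the band edges in `BANDS` are assumed values, since the cut-offs used in the study are not stated.

```python
import cmath
import math

def magnitude_spectrum(samples):
    """Naive DFT magnitudes for bins 0..n/2 (an FFT yields the same values faster)."""
    n = len(samples)
    return [
        abs(sum(s * cmath.exp(-2j * math.pi * k * i / n) for i, s in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]

# Hypothetical band edges in Hz; the source does not give the cut-offs used.
BANDS = {"bass": (20, 250), "mid": (250, 4000), "high": (4000, 20000)}

def band_energies(samples, sample_rate):
    """Sum squared magnitudes of the spectrum into bass, mid, and high bands."""
    mags = magnitude_spectrum(samples)
    bin_hz = sample_rate / len(samples)
    energies = {name: 0.0 for name in BANDS}
    for k, mag in enumerate(mags):
        freq = k * bin_hz
        for name, (low, high) in BANDS.items():
            if low <= freq < high:
                energies[name] += mag * mag
    return energies

# A 125 Hz test tone should land almost entirely in the bass band.
sample_rate, n = 8000, 256
tone = [math.sin(2 * math.pi * 125 * i / sample_rate) for i in range(n)]
energies = band_energies(tone, sample_rate)
```

In the pipeline described above, each band's energy would then be mapped to one parameter: bass to torch flicker intensity, mids to particle emission, highs to colour modulation.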

Fig. 4.1: Audio-Reactive Pipeline Architecture (Abandoned) · TouchDesigner → NDI/Spout/OSC → Unity VR → Meta Quest 2: 15–35 FPS

Abandoned. Meta Quest 2 output: 15–35 FPS. Inter-application GPU texture sharing and audio-driven physics consumed the entire processing budget.

✓ Principle retained. Dynamic environmental responsiveness is the correct direction. Scale of computation must be reduced dramatically for standalone deployment.

Phase 02: Environment Design & Render Optimisation

The Prison Room, Native Unity URP

The test environment is a purpose-built “Prison Room” designed in native Unity URP. It has three interconnected zones: an entry cell with warm torch lighting, a central corridor with arched openings, and an interior study room with grabbable props.

Fig. 4.2: Prison Room, Three-Zone Floor Plan · Zone 1: Entry Cell · Zone 2: Central Corridor · Zone 3: Study Room

Fig. 4.3: Entry Zone · Zone 1: Torch intensity responds to player approach

Fig. 4.5: Study Room · Zone 3: Grabbable props trigger proximity effects

Fig. 4.6: Render Optimisation, Three Nested Layers · Baked GI (~0%) + Light Probe Network (~2%) + Real-Time Shadows in 0.8 m player zone (~5%) = ~7% total overhead

✓ Stable 72+ FPS baseline established. The remaining rendering budget is available as headroom for procedural animation systems.

Phase 03: Procedural Animation Systems

Four Primitives. Two Trigger Types.

Four procedural animation primitives were implemented in C# within Unity, each triggered by user proximity (the Euclidean distance from the head transform drops below a threshold) or physical interaction (an XR controller collider intersects the object's trigger).
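The proximity check itself is a single distance comparison. The sketch below shows the idea in Python; the function name, positions, and the 1.5 m threshold are assumptions for illustration and are not taken from the project's C# code.

```python
import math

def within_trigger_range(head_pos, object_pos, threshold_m):
    """True when the head is closer to the object than the trigger threshold."""
    return math.dist(head_pos, object_pos) < threshold_m

head = (0.0, 1.6, 0.0)            # hypothetical head position in metres
near_torch = (1.0, 1.6, 0.0)
far_torch = (3.0, 1.6, 0.0)

near_active = within_trigger_range(head, near_torch, threshold_m=1.5)
far_active = within_trigger_range(head, far_torch, threshold_m=1.5)
```

Because the test costs one square root per object per frame, it adds very little to the frame budget even with many reactive objects.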

Sinusoidal Oscillation · Negligible cost

Trigger: Continuous · Applied to: Torch flames, hanging chains, ceiling moss

Composite sine waves with frequency and amplitude randomised per object, which prevents visible synchronisation between neighbouring objects.
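One way to build such an oscillator is sketched below. The number of components, the frequency and amplitude ranges, and the random phase offset are all assumptions added for illustration; the source only states that frequency and amplitude are randomised per object.

```python
import math
import random

def make_oscillator(rng, components=3):
    """Build one object's oscillator from a few randomised sine components."""
    # Hypothetical ranges: the source does not give the values used.
    waves = [
        (
            rng.uniform(0.5, 3.0),          # frequency in Hz
            rng.uniform(0.1, 1.0),          # amplitude
            rng.uniform(0.0, 2 * math.pi),  # phase offset (an added assumption)
        )
        for _ in range(components)
    ]
    total_amplitude = sum(a for _, a, _ in waves)

    def sample(t):
        """Offset at time t in seconds, normalised to the range [-1, 1]."""
        return sum(a * math.sin(2 * math.pi * f * t + p) for f, a, p in waves) / total_amplitude

    return sample

rng = random.Random(42)
torch_a = make_oscillator(rng)
torch_b = make_oscillator(rng)
```

Each object draws its own parameters once at start-up, so two torches side by side never flicker in lockstep, and the per-frame cost is a handful of sine evaluations.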

Perlin Noise Displacement · Low cost

Trigger: Continuous · Applied to: Dust drift, puddle ripples, fog density

Mathf.PerlinNoise sampled at time-offset coordinates per particle, giving each particle its own non-repeating trajectory.
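The idea of sampling smooth noise at a per-particle offset can be sketched as below. This is a simplified 1D gradient noise written as a stand-in, not Unity's `Mathf.PerlinNoise` (which takes two coordinates and returns values of roughly 0 to 1). The spacing constant and `speed` parameter are hypothetical.

```python
import math
import random

def make_noise_1d(seed=0, table_size=256):
    """Return a smooth 1D gradient-noise function with values within [-0.5, 0.5]."""
    rng = random.Random(seed)
    gradients = [rng.uniform(-1.0, 1.0) for _ in range(table_size)]

    def fade(t):
        # Perlin's quintic smoothing curve: 6t^5 - 15t^4 + 10t^3
        return t * t * t * (t * (t * 6 - 15) + 10)

    def noise(x):
        i0 = math.floor(x)
        t = x - i0
        g0 = gradients[i0 % table_size]
        g1 = gradients[(i0 + 1) % table_size]
        a = g0 * t            # ramp from the left lattice point
        b = g1 * (t - 1.0)    # ramp from the right lattice point
        return a + fade(t) * (b - a)

    return noise

noise = make_noise_1d(seed=7)

def particle_offset(particle_index, time_s, speed=0.5):
    """Drift offset for one particle; each index reads a different stretch of noise."""
    # 17.3 is an arbitrary spacing that keeps particles out of step with each other.
    return noise(particle_index * 17.3 + time_s * speed)
```

Because every particle reads from a different region of the same noise field, motion stays smooth frame to frame while no two particles follow the same path.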

Lerp + Proximity Gating · Low cost

Trigger: User proximity · Applied to: Key glow, door creak, force-field pulse

Lerp with a smoothing coefficient of 0.05–0.15 per frame, which prevents states from snapping.
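Applying Lerp with a fixed per-frame coefficient closes a constant fraction of the remaining gap each frame, which is what produces the eased transition. The sketch below illustrates that behaviour; the function names and the 0-to-1 glow value are illustrative, and only the 0.05–0.15 coefficient range comes from the text above.

```python
import math

def lerp(a, b, t):
    """Linear interpolation from a to b by fraction t."""
    return a + (b - a) * t

def smooth_step_toward(current, target, coefficient):
    """Move a fixed fraction of the remaining gap each frame (no snapping)."""
    return lerp(current, target, coefficient)

def frames_to_cover(fraction, coefficient):
    """Frames needed before the given fraction of the gap has been closed."""
    # After n frames the remaining gap is (1 - coefficient) ** n.
    return math.ceil(math.log(1.0 - fraction) / math.log(1.0 - coefficient))

# Key glow easing from off (0.0) to fully lit (1.0) once the user is in range.
glow = 0.0
for _ in range(22):
    glow = smooth_step_toward(glow, 1.0, 0.1)
```

With a coefficient of 0.1, the glow covers 90% of the transition in 22 frames, about 0.3 s at 72 FPS. Note that a per-frame coefficient makes the timing depend on frame rate, so the same value would ease faster at 120 FPS.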

Parametric Scaling · Negligible cost

Trigger: Head position · Applied to: Light shaft dust motes, volumetric elements

Scale and fade driven by vertical head delta, simulating the disturbance of settled dust.
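One possible reading of this primitive is sketched below: the change in head height between two samples is normalised to a 0-to-1 disturbance level, which then drives mote scale and opacity. The mapping, the function names, and every tuning value are assumptions; the source does not specify how the vertical delta is measured or mapped.

```python
def clamp01(x):
    """Clamp a value to the range [0, 1]."""
    return max(0.0, min(1.0, x))

def dust_disturbance(head_y, previous_head_y, full_effect_delta_m=0.3):
    """Map vertical head movement between two samples to a 0..1 disturbance level."""
    # full_effect_delta_m is a hypothetical tuning value, not taken from the source.
    return clamp01(abs(head_y - previous_head_y) / full_effect_delta_m)

def mote_scale_and_alpha(disturbance, base_scale=1.0, max_extra_scale=0.5):
    """Grow motes and fade them in as the disturbance level rises."""
    scale = base_scale * (1.0 + max_extra_scale * disturbance)
    alpha = disturbance
    return scale, alpha

# Crouching 0.4 m saturates the effect under the assumed 0.3 m threshold.
level = dust_disturbance(head_y=1.2, previous_head_y=1.6)
scale, alpha = mote_scale_and_alpha(level)
```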

✓ All four primitives deployed. Frame rate maintained at 72+ FPS under all interaction conditions on Meta Quest 2.

Phase 04: A/B Testing Framework

8 Participants · Two Conditions · Two Onboarding Types

An A/B testing build was deployed on Meta Quest 2 hardware. Eight participants were recruited from a student population, ranging from first-time headset users to frequent VR gamers. Each completed sessions in both conditions, with the order counterbalanced.

Static Condition (SC)

All animations as pre-rendered keyframe loops. No proximity triggers active.

Reactive Condition (RC)

All four procedural systems fully active, responding to real-time user proximity and controller input.

Participant · Experience · Onboarding · Condition order
P1 · First-time · Guided · Static → Reactive
P2 · First-time · Guided · Reactive → Static
P3 · First-time · Unguided · Static → Reactive
P4 · First-time · Unguided · Reactive → Static
P5 · Experienced · Guided · Static → Reactive
P6 · Experienced · Guided · Reactive → Static
P7 · Experienced · Unguided · Static → Reactive
P8 · Experienced · Unguided · Reactive → Static
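The eight assignments listed above form a full 2 × 2 × 2 factorial design: experience level, onboarding type, and condition order, with one participant per cell. The sketch below regenerates the same list; the variable names are illustrative.

```python
from itertools import product

experience_levels = ["First-time", "Experienced"]
onboarding_types = ["Guided", "Unguided"]
condition_orders = ["Static → Reactive", "Reactive → Static"]

# product() varies the last factor fastest, which matches the P1..P8 ordering.
assignments = [
    (f"P{i}", experience, onboarding, order)
    for i, (experience, onboarding, order) in enumerate(
        product(experience_levels, onboarding_types, condition_orders), start=1
    )
]

for row in assignments:
    print(" · ".join(row))
```

Crossing all three factors means condition order is balanced within each combination of experience and onboarding, so order effects cannot be mistaken for an effect of either.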

User Study Recording

Task-Driven Session: Did the Environment Feel Alive?

Participants were given a single objective: find the key and escape the cell. The session captures navigation pattern, zone dwell time, and whether users voluntarily return to areas already visited.
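A minimal sketch of how re-visits and dwell time could be derived from a logged zone trace is shown below. The log format, zone names, and the definition of a re-visit (any re-entry into a zone already visited) are assumptions; the study's exact criterion for a voluntary return is not given.

```python
def count_revisits(zone_sequence):
    """Count entries into a zone the participant has already visited and left."""
    visited = set()
    revisits = 0
    previous = None
    for zone in zone_sequence:
        if zone == previous:
            continue  # still in the same zone, not a new entry
        if zone in visited:
            revisits += 1
        visited.add(zone)
        previous = zone
    return revisits

def dwell_times(samples):
    """Total seconds per zone from time-ordered (timestamp_s, zone) samples."""
    totals = {}
    for (t0, zone), (t1, _) in zip(samples, samples[1:]):
        totals[zone] = totals.get(zone, 0.0) + (t1 - t0)
    return totals

# Hypothetical trace: the participant walks in, then doubles back twice.
trace = ["entry", "corridor", "study", "corridor", "entry"]
revisits = count_revisits(trace)
```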

Findings

What the Data Shows

+35%: Average session duration. RC mean 8.4 min vs. SC mean 6.2 min

+77%: Trigger activation events. RC mean 15.75 vs. SC mean 8.88

3.75: Average voluntary re-visits per session in RC. Zero in SC

≈ 0: Difference in comfort scores. No nausea or performance degradation
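The two percentage figures follow directly from the reported condition means. The short check below reproduces them; the function name is illustrative.

```python
def percent_change(baseline, treatment):
    """Relative change from baseline to treatment, in percent."""
    return (treatment - baseline) / baseline * 100.0

duration_gain = percent_change(6.2, 8.4)      # session duration in minutes, SC vs. RC
trigger_gain = percent_change(8.88, 15.75)    # trigger activations per session, SC vs. RC

print(f"Session duration: +{duration_gain:.1f}%")
print(f"Trigger events:   +{trigger_gain:.1f}%")
```

The re-visit figure cannot be expressed as a percentage change in the same way, because the static-condition baseline is zero.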

Fig. 5.1: Session Duration by Participant, Static vs. Reactive Condition · RC mean 8.4 min vs. SC mean 6.2 min (+35%)

Re-engagement as the Primary Differentiator

In the static condition, participants visited each zone once and moved on. In the reactive condition, they returned an average of 3.75 times per session to re-trigger effects they'd already discovered. That kind of voluntary re-engagement didn't happen at all in the static version. It's probably the clearest behavioural signal that presence was working.

Reactivity Operates at Two Cognitive Levels

Experienced VR users noticed the responsive systems within the first two minutes and deliberately went back to explore what they'd triggered. First-time users didn't consciously report noticing the reactivity at all. But they still explored more, stayed longer, and triggered more events than they did in the static version.

Agency, Not Complexity, Activates Presence

Reaching toward the brazier and seeing the torch respond felt completely different from watching the same flame on a timer. One feels like the environment is aware of you. The other is just decoration. This lines up with Beacco's framework: presence comes from action-reaction coupling, not from visual complexity.

Environment Gallery

The Prison Room in Unity.

Conclusion

Environmental responsiveness over graphical power.

On constrained standalone hardware, this eight-participant study suggests that making environments responsive to user presence does more to sustain engagement than pushing for better graphics. What's interesting is that the same procedural system works at two different cognitive levels: experienced users consciously noticed and explored the reactions, while first-time users stayed longer and engaged more without ever reporting why.

Anirudh Singh · M24LDX002 · M.Des. in XR Design · IIT Jodhpur, School of Design · Under the supervision of Dr. Sajan Pillai

Study Room — secondary angle with desk props, bookshelf, and wall paintings

anirudhsingh1441@gmail.com
