Publication statistics

Pub. period: 2002-2012
Pub. count: 9
Number of co-authors: 32


Number of publications with 3 favourite co-authors:

Beverly Harrison:
Shwetak Patel:
Hao Du:



Productive colleagues

Dieter Fox's 3 most productive colleagues in number of publications:

Anthony LaMarca: 37
Gaetano Borriello: 37
Michael Cohen: 19

Dieter Fox


Publications by Dieter Fox (bibliography)

Lei, Jinna, Ren, Xiaofeng and Fox, Dieter (2012): Fine-grained kitchen activity recognition using RGB-D. In: Proceedings of the 2012 International Conference on Ubiquitous Computing 2012. pp. 208-211.

We present a first study of using RGB-D (Kinect-style) cameras for fine-grained recognition of kitchen activities. Our prototype system combines depth (shape) and color (appearance) to solve a number of perception problems crucial for smart space applications: locating hands, identifying objects and their functionalities, recognizing actions and tracking object state changes through actions. Our proof-of-concept results demonstrate great potentials of RGB-D perception: without need for instrumentation, our system can robustly track and accurately recognize detailed steps through cooking activities, for instance how many spoons of sugar are in a cake mix, or how long it has been mixing. A robust RGB-D based solution to fine-grained activity recognition in real-world conditions will bring the intelligence of pervasive and interactive systems to the next level.

© All rights reserved Lei et al. and/or ACM Press

Gupta, Ankit, Fox, Dieter, Curless, Brian and Cohen, Michael (2012): DuploTrack: a real-time system for authoring and guiding Duplo block assembly. In: Proceedings of the 2012 ACM Symposium on User Interface Software and Technology 2012. pp. 389-402.

We demonstrate a realtime system which infers and tracks the assembly process of a snap-together block model using a Kinect sensor. The inference enables us to build a virtual replica of the model at every step. Tracking enables us to provide context specific visual feedback on a screen by augmenting the rendered virtual model aligned with the physical model. The system allows users to author a new model and uses the inferred assembly process to guide its recreation by others. We propose a novel way of assembly guidance where the next block to be added is rendered in blinking mode with the tracked virtual model on screen. The system is also able to detect any mistakes made and helps correct them by providing appropriate feedback. We focus on assemblies of Duplo blocks. We discuss the shortcomings of existing methods of guidance -- static figures or recorded videos -- and demonstrate how our method avoids those shortcomings. We also report on a user study to compare our system with standard figure-based guidance methods found in user manuals. The results of the user study suggest that our method is able to aid users' structural perception of the model better, leads to fewer assembly errors, and reduces model construction time.

© All rights reserved Gupta et al. and/or ACM Press

Larson, Eric, Cohn, Gabe, Gupta, Sidhant, Ren, Xiaofeng, Harrison, Beverly, Fox, Dieter and Patel, Shwetak (2011): HeatWave: thermal imaging for surface user interaction. In: Proceedings of ACM CHI 2011 Conference on Human Factors in Computing Systems 2011. pp. 2565-2574.

We present HeatWave, a system that uses digital thermal imaging cameras to detect, track, and support user interaction on arbitrary surfaces. Thermal sensing has had limited examination in the HCI research community and is generally under-explored outside of law enforcement and energy auditing applications. We examine the role of thermal imaging as a new sensing solution for enhancing user surface interaction. In particular, we demonstrate how thermal imaging in combination with existing computer vision techniques can make segmentation and detection of routine interaction techniques possible in real-time, and can be used to complement or simplify algorithms for traditional RGB and depth cameras. Example interactions include (1) distinguishing hovering above a surface from touch events, (2) shape-based gestures similar to ink strokes, (3) pressure based gestures, and (4) multi-finger gestures. We close by discussing the practicality of thermal sensing for naturalistic user interaction and opportunities for future work.

© All rights reserved Larson et al. and/or their publisher

Du, Hao, Henry, Peter, Ren, Xiaofeng, Cheng, Marvin, Goldman, Dan B., Seitz, Steven M. and Fox, Dieter (2011): Interactive 3D modeling of indoor environments with a consumer depth camera. In: Proceedings of the 2011 International Conference on Ubiquitous Computing 2011. pp. 75-84.

Detailed 3D visual models of indoor spaces, from walls and floors to objects and their configurations, can provide extensive knowledge about the environments as well as rich contextual information of people living therein. Vision-based 3D modeling has only seen limited success in applications, as it faces many technical challenges that only a few experts understand, let alone solve. In this work we utilize (Kinect style) consumer depth cameras to enable non-expert users to scan their personal spaces into 3D models. We build a prototype mobile system for 3D modeling that runs in real-time on a laptop, assisting and interacting with the user on-the-fly. Color and depth are jointly used to achieve robust 3D registration. The system offers online feedback and hints, tolerates human errors and alignment failures, and helps to obtain complete scene coverage. We show that our prototype system can both scan large environments (50 meters across) and at the same time preserve fine details (centimeter accuracy). The capability of detailed 3D modeling leads to many promising applications such as accurate 3D localization, measuring dimensions, and interactive visualization.

© All rights reserved Du et al. and/or ACM Press

Matuszek, Cynthia, Fox, Dieter and Koscher, Karl (2010): Following directions using statistical machine translation. In: Proceedings of the 5th ACM/IEEE International Conference on Human Robot Interaction 2010. pp. 251-258.

Mobile robots that interact with humans in an intuitive way must be able to follow directions provided by humans in unconstrained natural language. In this work we investigate how statistical machine translation techniques can be used to bridge the gap between natural language route instructions and a map of an environment built by a robot. Our approach uses training data to learn to translate from natural language instructions to an automatically-labeled map. The complexity of the translation process is controlled by taking advantage of physical constraints imposed by the map. As a result, our technique can efficiently handle uncertainty in both map labeling and parsing. Our experiments demonstrate the promising capabilities achieved by our approach.

© All rights reserved Matuszek et al. and/or their publisher

Patterson, Donald J., Liao, Lin, Gajos, Krzysztof, Collier, Michael, Livic, Nik, Olson, Katherine, Wang, Shiaokai, Fox, Dieter and Kautz, Henry A. (2004): Opportunity Knocks: A System to Provide Cognitive Assistance with Transportation Services. In: Davies, Nigel, Mynatt, Elizabeth D. and Siio, Itiro (eds.) UbiComp 2004 Ubiquitous Computing 6th International Conference September 7-10, 2004, Nottingham, UK. pp. 433-450.

Patterson, Donald J., Liao, Lin, Fox, Dieter and Kautz, Henry A. (2003): Inferring High-Level Behavior from Low-Level Sensors. In: Dey, Anind K., Schmidt, Albrecht and McCarthy, Joseph F. (eds.) UbiComp 2003 Ubiquitous Computing - 5th International Conference October 12-15, 2003, Seattle, WA, USA. pp. 73-89.

LaMarca, Anthony, Brunette, Waylon, Koizumi, David, Lease, Matthew, Sigurdsson, Stefan B., Sikorski, Kevin, Fox, Dieter and Borriello, Gaetano (2002): PlantCare: An Investigation in Practical Ubiquitous Systems. In: Borriello, Gaetano and Holmquist, Lars Erik (eds.) UbiComp 2002 Ubiquitous Computing - 4th International Conference September 29 - October 1, 2002, Göteborg, Sweden. pp. 316-332.

LaMarca, Anthony, Brunette, Waylon, Koizumi, David, Lease, Matthew, Sigurdsson, Stefan B., Sikorski, Kevin, Fox, Dieter and Borriello, Gaetano (2002): Making Sensor Networks Practical with Robots. In: Mattern, Friedemann and Naghshineh, Mahmoud (eds.) Pervasive 2002 - Pervasive Computing, First International Conference August 26-28, 2002, Zürich, Switzerland. pp. 152-166.

