Computer Vision, Speech Communication & Signal Processing Group


		Vassilis Pitsikalis, Research Associate, PhD
	Office:	2.2.19
	Phone:	(+30) 210772-2420
	Fax:	(+30) 210772-3397
	E-mail:	vpitsik@cs (addresses are formatted username@cs.ntua.gr)
	URL:	http://cvsp.cs.ntua.gr/vpitsik

Biosketch

From 2008 to 2012 I was with the group as a post-doctoral researcher, working mainly in the field of continuous sign language processing, modeling and recognition, within the DictaSign EU research project. From April 2012 I was an external associate researcher, and from Feb. 2014 I worked mainly on the MOBOT research project, and partially on i-support and Baby-Robot. Since Apr. 2017 I have been an external research collaborator, while working with DeepLab.

I obtained my PhD (May 2007) from the National Technical University of Athens, Greece, with a thesis entitled "Non-linear dynamical systems and robust speech processing and recognition". My general research interests lie in the fields of non-linear signal processing and pattern recognition. During my PhD, I explored non-linear speech processing methods inspired by dynamical systems and fractal theory, with application to speech analysis and feature extraction; these features are related to fractal dimensions and generalized fractal dimensions. Such measurements have been employed in the context of speech analysis and automatic speech recognition (ASR). Other work during my PhD years involved speech recognition on noisy signals and fusion of multiple information cues that may be of different types (e.g. fractal features) or different modalities (e.g. audio-visual ASR). I have been involved in several national and European research projects, such as HIWIRE and the MUSCLE Network of Excellence.

Publications

D. Dimitriadis, P. Maragos, V. Pitsikalis and A. Potamianos, Modulation and Chaotic Acoustic Features for Speech Recognition, J. Control and Intelligent Systems, Invited Paper, Vol. 30, No. 1, pp. 19-26, 2002.
V. Pitsikalis and P. Maragos, Speech Analysis and Feature Extraction using Chaotic Models, Proc. IEEE Int'l Conference on Acoustics, Speech, and Signal Processing (ICASSP 2002), pp. 533-536, Orlando, USA, May 2002.
P. Maragos, T. Loupas and V. Pitsikalis, On Improving Doppler Ultrasound Spectroscopy With Instantaneous Multiband Energy Separation, IEEE Proc. of 14th Int'l Conf. on Digital Signal Processing (DSP 2002), Santorini, Greece, Jul. 1-3, 2002.
V. Pitsikalis and P. Maragos, Some Advances on Speech Analysis using Generalized Dimensions, Proc. ISCA Tutorial and Research Workshop on Non-Linear Speech Processing (NOLISP 2003), Le Croisic, France, May 20-23, 2003.
V. Pitsikalis, I. Kokkinos and P. Maragos, Nonlinear Analysis of Speech Signals: Generalized Dimensions and Lyapunov Exponents, Proc. 8th European Conf. on Speech Communication & Technology (EuroSpeech 2003), Geneva, Switzerland, Sep. 1-4, 2003.
D. Dimitriadis, N. Katsamanis, P. Maragos, G. Papandreou and V. Pitsikalis, Towards Automatic Speech Recognition in Adverse Environments, Proc. 7th Hellenic-European Conference on Computer Mathematics and its Applications (HERCMA 2005), Athens, Greece, Sep. 2005.
A. Potamianos, G. Bouselmi, D. Dimitriadis, D. Fohr, R. Gemello, I. Illina, F. Mana, P. Maragos, M. Matassoni, V. Pitsikalis, J. Ramírez, E. Sanchez-Soto, J. Segura, and P. Svaizer, Towards Speaker and Environmental Robustness in ASR: The HIWIRE Project, ISCA Tutorial and Research Workshop on Speech Recognition and Intrinsic Variation (SRIV 2006), Toulouse, France, May 2006.
V. Pitsikalis and P. Maragos, Filtered Dynamics and Fractal Dimensions for Noisy Speech Recognition, IEEE Signal Proc. Letters, Vol. 13, No. 11, pp. 711-714, Nov. 2006.
A. Katsamanis, G. Papandreou, V. Pitsikalis, and P. Maragos, Multimodal Fusion by Adaptive Compensation for Feature Uncertainty with Application to Audiovisual Speech Recognition, 14th European Signal Processing Conference (EUSIPCO 2006), Florence, Italy, Sep. 4-8, 2006.
V. Pitsikalis, A. Katsamanis, G. Papandreou, and P. Maragos, Adaptive Multimodal Fusion by Uncertainty Compensation, Int'l Conf. on Spoken Language Processing (ICSLP 2006), Pittsburgh, PA, USA, Sep. 17-21, 2006.
D. Dimitriadis, J. C. Segura, L. Garcia, A. Potamianos, P. Maragos and V. Pitsikalis, Advanced Front-end for Robust Speech Recognition in Extremely Adverse Environments, Proc. Int'l Conf. on Speech Technology and Communication (InterSpeech 2007 - EuroSpeech), pp. 2425-2428, Antwerp, Belgium, Aug. 2007.
G. Papandreou, A. Katsamanis, V. Pitsikalis, and P. Maragos, Multimodal Fusion and Learning with Uncertain Features Applied to Audiovisual Speech Recognition, Proc. IEEE Workshop on Multimedia Signal Processing (MMSP 2007), pp. 264-267, Chania, Greece, Oct. 1-3, 2007.
G. Papandreou, A. Katsamanis, V. Pitsikalis and P. Maragos, Adaptive Multimodal Fusion by Uncertainty Compensation with Application to Audio-Visual Speech Recognition, in Multimodal Processing and Interaction: Audio, Video, Text, edited by P. Maragos, A. Potamianos and P. Gros, Springer-Verlag, 2008.
G. Papandreou, A. Katsamanis, V. Pitsikalis, and P. Maragos, Adaptive Multimodal Fusion by Uncertainty Compensation with Application to Audio-Visual Speech Recognition, IEEE Transactions on Audio, Speech and Language Processing, Vol. 17, No. 3, pp. 423-435, Mar. 2009.
V. Pitsikalis and P. Maragos, Analysis and Classification of Speech Signals by Generalized Fractal Dimension Features, Speech Communication, Vol. 51, No. 12, pp. 1206-1223, Dec. 2009.
S. Theodorakis, V. Pitsikalis and P. Maragos, Model-Level Data-Driven Sub-Units for Signs in videos of Continuous Sign Language, Proc. IEEE Int'l Conference on Acoustics, Speech, and Signal Processing (ICASSP 2010), Dallas, Texas, Mar. 2010.
V. Pitsikalis, S. Theodorakis and P. Maragos, Data-Driven Sub-Units and Modeling Structure for Continuous Sign Language Recognition with Multiple Cues, 4th Workshop on Representation and Processing of Sign Languages: Corpora and Sign Language Technologies, Workshop at the 7th International Conference on Language Resources and Evaluation (LREC 2010), Malta, May 2010.
A. Roussos, S. Theodorakis, V. Pitsikalis and P. Maragos, Hand Tracking and Affine Shape-Appearance Handshape Sub-Units in Continuous Sign Language Recognition, Workshop on Sign, Gesture and Activity (SGA), 11th European Conference on Computer Vision (ECCV 2010), Crete, Greece, Sep. 2010.
A. Roussos, S. Theodorakis, V. Pitsikalis and P. Maragos, Affine-Invariant Modeling of Shape-Appearance Images applied on Sign Language Handshape Classification, Proc. IEEE Int'l Conf. on Image Processing (ICIP 2010), Hong Kong, Sep. 26-29, 2010.
S. Theodorakis, V. Pitsikalis and P. Maragos, Advances in Dynamic-Static Integration of Manual Cues for Sign Language Recognition, Proc. 9th International Gesture Workshop: Gesture in Embodied Communication and Human-Computer Interaction (GW 2011), Athens, Greece, May 25-27, 2011.
I. Rodomagoulakis, S. Theodorakis, V. Pitsikalis and P. Maragos, Experiments on Global and Local Active Appearance Models for Analysis of Sign Language Facial Expressions, Proc. 9th International Gesture Workshop: Gesture in Embodied Communication and Human-Computer Interaction (GW 2011), Athens, Greece, May 25-27, 2011.
V. Pitsikalis, S. Theodorakis, C. Vogler and P. Maragos, Advances in Phonetics-based Sub-Unit Modeling for Transcription Alignment and Sign Language Recognition, IEEE CVPR Workshop on Gesture Recognition, Colorado Springs, USA, June 20, 2011. [Pascal2 Best Paper Award]
S. Theodorakis, V. Pitsikalis, I. Rodomagoulakis and P. Maragos, Recognition with Raw Canonical Phonetic Movement and Handshape Subunits on Videos of Continuous Sign Language, Proc. IEEE Int'l Conf. on Image Processing (ICIP 2012), Orlando, Florida, USA, Sep. 30-Oct. 3, 2012.
E. Antonakos, V. Pitsikalis, I. Rodomagoulakis and P. Maragos, Unsupervised Classification of Extreme Facial Events using Active Appearance Models Tracking for Sign Language Videos, Proc. IEEE Int'l Conf. on Image Processing (ICIP 2012), Orlando, Florida, USA, Sep. 30-Oct. 3, 2012.
A. Roussos, S. Theodorakis, V. Pitsikalis and P. Maragos, Dynamic Affine-Invariant Shape-Appearance Handshape Features and Classification in Sign Language Videos, Journal of Machine Learning Research (JMLR), Vol. 14, pp. 1627-1663, Jun. 2013.
E. Antonakos, V. Pitsikalis and P. Maragos, Classification of Extreme Facial Events in Sign Language Videos, EURASIP Journal on Image and Video Processing, 2014(14):2014.
S. Theodorakis, V. Pitsikalis and P. Maragos, Dynamic–Static Unsupervised Sequentiality, Statistical Subunits and Lexicon for Sign Language Recognition, Image and Vision Computing, Elsevier, Vol. 32, Issue 8, pp. 533-549, Aug. 2014.
G. Pavlakos, S. Theodorakis, V. Pitsikalis, A. Katsamanis, and P. Maragos, Kinect-Based Multimodal Gesture Recognition Using a Two-Pass Fusion Scheme, Proc. IEEE Int'l Conf. on Image Processing (ICIP 2014), Paris, France, Oct. 27-30, 2014.
V. Pitsikalis, A. Katsamanis, S. Theodorakis and P. Maragos, Multimodal Gesture Recognition via Multiple Hypotheses Rescoring, Journal of Machine Learning Research (JMLR), Vol. 16, pp. 285-322, Feb. 2015.
I. Rodomagoulakis, N. Kardaris, V. Pitsikalis, E. Mavroudi, A. Katsamanis, A. Tsiami, and P. Maragos, Multimodal Human Action Recognition in Assistive Human-Robot, Proc. of 41st IEEE Int'l Conf. on Acoustics, Speech and Signal Processing (ICASSP 2016), Shanghai, China, Mar. 2016.
N. Kardaris, V. Pitsikalis, E. Mavroudi, P. Maragos, Introducing Temporal Order of Dominant Visual Word Sub-Sequences for Human Action Recognition, Proc. Int'l Conf. on Image Proc. (ICIP 2016), pp. 3061-3065, Phoenix, AZ, 2016.
I. Rodomagoulakis, N. Kardaris, V. Pitsikalis, A. Arvanitakis, P. Maragos, A Multimedia Gesture Dataset for Human Robot Communication: Acquisition, Tools and Recognition Results, Proc. Int'l Conf. on Image Proc. (ICIP 2016), pp. 3066-3070, Phoenix, AZ, 2016.
N. Kardaris, I. Rodomagoulakis, V. Pitsikalis, A. Arvanitakis, P. Maragos, A Platform for Building New Human-Computer Interface Systems that Support Online Automatic Recognition of Audio-Gestural Commands, Proc. of the 2016 ACM on Multimedia Conference (MM 2016), Open source software competition, pp. 1169-1173, Amsterdam, The Netherlands, Oct. 15-19, 2016.
A. Guler, N. Kardaris, S. Chandra, V. Pitsikalis, C. Werner, K. Hauer, C. Tzafestas, P. Maragos, I. Kokkinos, Human Joint Angle Estimation and Gesture Recognition for Assistive Robotic Vision, Proc. of European Conference on Computer Vision (ECCV), Workshop on Assistive Computer Vision and Robotics (ACRV 2016), pp. 415-431, Springer International Publishing, Oct. 2016.
E. Efthimiou, S.-E. Fotinea, T. Goulas, A.-L. Dimou, M. Koutsombogera, V. Pitsikalis, P. Maragos, C. Tzafestas, The MOBOT Platform–Showcasing Multimodality in Human-Assistive Robot Interaction, Int'l Conf. on Universal Access in Human-Computer Interaction, Vol. 9738, Lecture Notes in Computer Science, pp. 382-391, Springer International Publishing, 2016.
P. Maragos, V. Pitsikalis, A. Katsamanis, G. Pavlakos, S. Theodorakis, On Shape Recognition and Language, in Perspectives in Shape Analysis, Mathematics and Visualization, pp. 321-344, Springer International Publishing, 2016.
A. Zlatintsi, I. Rodomagoulakis, V. Pitsikalis, P. Koutras, N. Kardaris, X. Papageorgiou, C. Tzafestas, P. Maragos, Social Human-Robot Interaction for the Elderly: Two Real-life Use Cases, Proceedings of the Companion of the 2017 ACM/IEEE International Conference on Human-Robot Interaction, pp. 335-336, ACM, 2017.
Athanasios Katsamanis, Vassilis Pitsikalis, Stavros Theodorakis, Petros Maragos, Multimodal gesture recognition, The Handbook of Multimodal-Multisensor Interfaces, Association for Computing Machinery and Morgan & Claypool, New York, NY, USA, pp. 449-487, 2017.
A. Zlatintsi, I. Rodomagoulakis, P. Koutras, A. C. Dometios, V. Pitsikalis, C. S. Tzafestas and P. Maragos, Multimodal Signal Processing and Learning Aspects of Human-Robot Interaction for an Assistive Bathing Robot, in Proc. Int'l Conf. on Acoustics, Speech and Signal Processing (ICASSP-2018), Calgary, Canada, Apr. 2018.

The documents contained in these directories are included by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors and by other copyright holders, notwithstanding that they have offered their works here electronically. Copying this information is allowed only if you adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.

Some of the articles in this web site are copyrighted by IEEE and the following notice applies: © 1980-2008 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
