Jun Ye (叶 俊)
Department of Computer Science,
University of Central Florida,
4000 Central Florida Blvd,
Orlando, FL, 32816
I am currently a Ph.D. candidate advised by Dr. Kien A. Hua in the Department of Computer Science at the University of Central Florida. My research interests include multimedia retrieval, multimodal data analysis, and machine learning. My current focus is on human-centric live video computing and human action recognition.
||Ph.D. student in Computer Science,
School of EECS, Computer Science Dept,
University of Central Florida (UCF), Orlando, FL, USA.
||M.S. in Computer Science,
Department of Pattern Recognition and Artificial Intelligence,
Beihang University (BUAA), Beijing, China.
||B.S. in Automation,
School of Control Science and Engineering,
Huazhong University of Science and Technology (HUST), Wuhan, China.
- Naifan Zhuang, Jun Ye and Kien A. Hua, "DLSTM Approach to Video Modeling with Hashing for Large-Scale Video Retrieval," accepted by the International Conference on Pattern Recognition (ICPR), Cancun, Mexico, December 4-8, 2016.
- Kai Li, Guo-Jun Qi, Jun Ye and Kien A. Hua, "Cross-modal Hashing Through Ranking Subspace Learning," in Proceedings of the IEEE International Conference on Multimedia and Expo (ICME), Seattle, July 11-15, 2016.
- Jun Ye, Kai Li and Kien A. Hua, "WTA Hash-based Multimodal Feature Fusion for 3D Human Action Recognition," in Proceedings of the IEEE International Symposium on Multimedia (ISM, best paper award), Miami, December 14-16, 2015.
- Jun Ye, Hao Hu, Kai Li, Guo-Jun Qi and Kien A. Hua, "First-Take-All: Temporal Order-Preserving Hashing for 3D Action Videos," arXiv preprint arXiv:1506.02184, 2015.
- Jun Ye, Kai Li, Guo-Jun Qi and Kien A. Hua, "Temporal Order-Preserving Dynamic Quantization for Human Action Recognition from Multimodal Sensor Streams," in Proceedings of the ACM International Conference on Multimedia Retrieval (ICMR), Shanghai, June 2015.
- Jun Ye and Kien A. Hua, "Octree-based 3D Logic and Computation of Spatial Relationships in Live Video Query Processing," ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP), Vol. 11, Issue 2, December 2014.
- Kai Li, Jun Ye and Kien A. Hua, "What Is Making That Sound?" in Proceedings of the ACM Multimedia Conference (ACM MM), Orlando, November 3-7, 2014.
- Jun Ye and Kien A. Hua, "Exploiting Depth Camera for 3D Spatial Relationship Interpretation [pdf][slides]," in Proceedings of ACM Multimedia Systems (ACM MMSys), Oslo, March 2013.
- Jun Ye and Kien A. Hua, "Scalability Study of Wireless Mesh Networks with Dynamic Stream Merging Capability [pdf]," in Proceedings of Multimedia Communications, Services & Security (MCSS, best presentation paper award), Krakow, Poland, 2011.
- Jun Ye, Lin-Lin Huang and Xiao-Li Hao, "Neural Network Based Text Detection in Videos Using Local Binary Patterns [pdf]," in Proceedings of the China, Japan and Korea Joint Workshop on Pattern Recognition (CJKPR), Nanjing, November 4-9, 2009.
Temporal Hashing for Human Action Sequence Retrieval,
This project is part of my dissertation. Most current hashing algorithms are designed for fixed-length data such as images and focus on the spatial representation of the data, so hashing videos remains challenging and expensive. I investigate a temporal hashing algorithm that encodes a video by the temporal order in which randomly generated latent patterns appear in the sequence. This is the first work to hash videos by their temporal information. Conference paper submissions are under review.
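The core idea can be sketched as follows. This is a simplified, illustrative take on temporal order-preserving hashing; the function name, parameters, and scoring are my own simplification, not the exact algorithm from the papers.

```python
import numpy as np

def first_take_all_hash(frames, num_codes=8, group_size=4, seed=0):
    """Simplified temporal order-preserving hash (illustrative).

    frames: (T, D) array of per-frame feature vectors.
    For each group of random latent patterns, record which pattern's
    response peaks earliest in time; that winner index becomes one
    symbol of the hash code, so the code reflects temporal order
    rather than spatial appearance.
    """
    rng = np.random.default_rng(seed)
    T, D = frames.shape
    code = []
    for _ in range(num_codes):
        patterns = rng.standard_normal((group_size, D))  # random latent patterns
        responses = frames @ patterns.T                  # (T, group_size) responses
        peak_times = responses.argmax(axis=0)            # when each pattern peaks
        code.append(int(peak_times.argmin()))            # first to peak wins
    return code
```

Because the code depends only on *when* patterns fire relative to each other, two sequences with similar temporal dynamics get similar codes, which is what makes Hamming-style comparison of videos cheap.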
3D Human Action Recognition,
In this project, I investigate the challenge of temporal modeling for human action recognition and propose a dynamic quantization algorithm that models the dynamic patterns of the action sequence for 3D human action recognition. I also propose a novel hash-based multimodal feature fusion algorithm, a generic early-fusion method that is independent of the underlying features. Two conference papers have been published, including a best paper award at IEEE ISM 2015.
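The WTA (Winner-Take-All) hash underlying the fusion method can be sketched as follows; this is a generic illustration of WTA hashing with made-up parameter values, not the exact fusion pipeline from the paper:

```python
import numpy as np

def wta_hash(x, num_permutations=16, k=4, seed=0):
    """Winner-Take-All hash (illustrative sketch).

    For each random permutation, look at the first k permuted
    dimensions of x and emit the index of the largest one. The
    resulting ordinal code depends only on the relative ordering of
    feature values, not their scale, which is what lets codes from
    heterogeneous modalities (e.g. skeleton and depth features) be
    fused by simple concatenation.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    code = []
    for _ in range(num_permutations):
        perm = rng.permutation(len(x))[:k]   # pick k random dimensions
        code.append(int(np.argmax(x[perm]))) # winner index among them
    return code
```

Since argmax is invariant under any strictly increasing transform of the features, the code is robust to per-modality scaling, a key property for feature-independent early fusion.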
Live Video Computing,
This project investigates techniques for interpreting 3D spatial relationships in the LVDBMS (Live Video DataBase Management System). The LVDBMS is a general-purpose framework for managing and processing live video data for surveillance and analytical applications. The system allows automatic event recognition over a network of live cameras: the user specifies a monitoring task by formulating a query that describes a spatiotemporal event as a combination of logical, spatial, and temporal operators, and when the specified event is observed, an action associated with the query is triggered. In other words, the LVDBMS treats a camera as a special class of storage and processes continuous queries against live video feeds, making it analogous to a new category of databases. In this project, I extended the original 2D spatial operators of the LVDBMS to 3D spatial operators using Microsoft Kinect sensors. I proposed an octree-based algorithm for computing the 3D spatial operators and developed a GPU-based implementation. Language: C++, OpenCL; Platform: Windows; Contribution: algorithm design and simulation, prototype implementation, and analysis; a conference paper published at ACM MMSys'13 and a journal paper published in ACM TOMCCAP.
Intelligent Traffic Surveillance System,
I designed and developed an intelligent traffic surveillance system to monitor and analyze traffic flows on the highway. The algorithm module includes components for motion detection, tracking, vehicle classification, and traffic flow analysis. The initial version has been delivered. I designed and developed the entire algorithm module; the team included two Ph.D. and two master's students. The system is built with Qt 5.1, and the algorithm module is written in C++ using OpenCV, OpenCL, and LibSVM.
Dynamic Stream Merging in Wireless Mesh Network,
The project is fully supported by the National Science Foundation (NSF). The focus is to develop a Wireless Mesh Access Network (WMAN) enabling Dynamic Stream Merging (DSM) in order to maximize network capacity. We not only developed a series of cross-layer network protocols but also implemented the algorithms in our prototype. Experimental results demonstrate the effectiveness of the system. Participants: myself and two other research assistants; Language: C++; Platform: Linux; Contribution: algorithm design and simulation, prototype implementation, and analysis; a paper published at MCSS 2011.
Detection of Text Objects in Video Images,
Nov. 2008~Dec. 2009
The project aimed at detecting and locating text in images. The detected results are recognized by OCR and used for image and video indexing and retrieval; this work was also my master's thesis. I focused on the text detection algorithm: I proposed a novel and highly effective feature combining LBP and HOG and trained a neural network on a large number of samples. The trained classifier produced text region candidates with high confidence, and all candidate regions were then integrated and validated to form the final text blocks. The project spanned several modules, including image processing, feature extraction, and classifier design, which required a firm background in pattern recognition theory as well as strong programming and code optimization skills, and it greatly improved my capability in both academic research and practical application. Participant: myself; Language: C++; Platform: Windows (Visual Studio); Contribution: algorithm design and simulation; designed and implemented a real-time demo system; a paper published at CJKPR 2009.
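The LBP half of the feature can be sketched as follows (a basic 3x3 Local Binary Pattern in Python for illustration; the actual system was written in C++ and combined LBP with HOG):

```python
import numpy as np

def lbp_image(gray):
    """Basic 3x3 Local Binary Patterns (illustrative).

    Each interior pixel gets an 8-bit code: one bit per neighbor,
    set when that neighbor is >= the center pixel. Histograms of
    these codes over candidate regions form a texture descriptor
    that is robust to monotonic illumination changes.
    """
    g = np.asarray(gray, dtype=np.int32)
    center = g[1:-1, 1:-1]
    # neighbor offsets in clockwise order starting at the top-left
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(center)
    for bit, (dy, dx) in enumerate(offsets):
        nb = g[1 + dy:g.shape[0] - 1 + dy, 1 + dx:g.shape[1] - 1 + dx]
        codes |= (nb >= center).astype(np.int32) << bit
    return codes
```

A region's LBP histogram (256 bins, or fewer with uniform patterns) is what would be fed, alongside HOG, into the neural network classifier.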
Internship with Microsoft, Data Scientist, Summer 2015
I interned with the Windows Core Quality Team at Microsoft Redmond, where I developed a belief propagation algorithm over large-scale telemetry data for Windows user segmentation. I proposed a solution to the PU (positive-unlabeled) learning problem, where only positive labels are available in the dataset.
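A generic sketch of propagating scores from positive seeds over a similarity graph in the PU setting (illustrative only: this is a standard label-propagation technique with made-up parameters, not the internal Microsoft algorithm or data):

```python
import numpy as np

def propagate_labels(adj, positive, num_iters=50, alpha=0.85):
    """Score unlabeled nodes by propagation from positive seeds.

    adj: (N, N) symmetric nonnegative adjacency (similarity) matrix.
    positive: indices of known-positive nodes; everything else is
    unlabeled, as in the PU setting. Each iteration mixes the
    neighbors' scores with the fixed seed labels, so unlabeled
    nodes close to positives end up with high scores.
    """
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    deg[deg == 0] = 1.0                 # avoid division by zero
    P = adj / deg[:, None]              # row-normalized transition matrix
    seed = np.zeros(adj.shape[0])
    seed[list(positive)] = 1.0
    scores = seed.copy()
    for _ in range(num_iters):
        scores = alpha * P @ scores + (1 - alpha) * seed
    return scores
```

Thresholding the converged scores yields a ranking of unlabeled users by affinity to the positive class, one common way to act on PU data without any negative labels.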
Algorithm Engineer with Athena Eyes, Jan. 2010~Jul. 2010
I worked at Athena Eyes as an R&D engineer, developing the latest face recognition techniques for the company's products. During that time, I developed a human liveness detection system based on blink detection. In addition, I designed and implemented the framework of a face recognition algorithm based on the Active Shape Model (ASM) for the next version of the Athena Eyes face recognition system.
Research Intern at National Lab of Pattern Recognition, CASIA,
The research task focused on text detection and recognition in videos. I extended my previous text detection algorithm by incorporating different feature extraction methods within single frames and exploiting temporal features across consecutive frames. In collaboration with other Ph.D. students in the lab, I developed a demo system composed of a detection module and an OCR module; the demo runs in real time and the results are promising. I was in charge of the algorithm design for text localization and character segmentation as well as system integration. Over the three months, I strengthened my theoretical foundation, broadened my vision, and improved my ability in both research and application through communication with top pattern recognition Ph.D. students in China.
Volunteer of 29th Olympic Games (Beijing 2008),
I welcomed delegations of athletes, media, and officials from different countries during the Olympic Games, and provided language services at Terminal 2 of Beijing Capital International Airport. These experiences strengthened my communication skills and reinforced my spirit of teamwork, and I will always treasure them.
Graduate Teaching Assistant: Introduction to C Programming, Fall 2010.
Graduate Teaching Assistant: Discrete Math, Fall 2013.
Graduate Teaching Assistant: Fundamentals of Database Systems, Spring 2014.
Guest lecture: GPU Processing for Distributed Live Video Database (ppt), Spring 2015.
Last updated on July 18, 2016.